<p class="blog-post-meta">August 14, 2015 by <a href="../../resume.html">Vicky Steeves</a> for the SAA Museum &amp; Archives Section Newsletter. <a href="http://www2.archivists.org/sites/all/files/MAS%20Newsletter%20Summer%202015-new.pdf">See original posting here.</a></p>
<p>As the National Digital Stewardship Resident at the American Museum of Natural History, I was introduced to the very specific problems facing museum librarians and archivists not only by observing the Research Library, but also by speaking individually with some of the most intensive data creators at the Museum. As a part of my larger needs assessment project at the Museum, I created a semi-structured interview guide that I used to enter into a targeted dialogue with scientific staff members, covering all aspects of their digital research and collections data. Topics included the volume of their data, its rate of growth, format types, necessary software and hardware support, management practices, and opinions on preservation of their data (i.e., which data they believe is important in the long term). I interviewed close to 60 staff members in total, including all the curators in the five Science divisions at the Museum: Anthropology, Invertebrate Zoology, Paleontology, Physical Sciences, and Vertebrate Zoology.</p>
<p>During the course of my analysis, I discovered not only the sheer volume of data (with a substantial number of curators generating many terabytes a day!) but also the diversity of said data, for both research purposes and within collections. This is a big data problem that many research museums are facing. Looking at the AMNH, diversity of data is found not only in the macrocosm of the Museum’s five Science divisions, but also with each curator and research methodology.</p>
<div align="center"><img src="../../img/inez.jpg" height="30%" width="30%" alt="Inez the DigiPres Turtle">
<p class="caption">The NDSR mascot, Inez the DigiPres Turtle, looking in on a CT scanner scanning a monkey's skull at AMNH.</p></div>
<p>After gathering this interview data, I was tasked with analyzing it in order to make recommendations in a larger final report on three essential categories: storage, management, and preservation of digital research and collections data. A related deliverable of my project was a report on solutions other museums have developed for curating their in-house research and collections data. This environmental scan showed that few natural history museums in the United States take an institutional approach to solving this challenge, largely due to resource constraints. A popular institutional solution for collections data is Arctos, the community-driven multidisciplinary collection management information system that was developed as a collaboration among multiple institutions and currently holds three million natural history museum records. For research data in the natural sciences, however, fewer such solutions exist, and those that do are still in development. The National Museum of Natural History and the British Natural History Museum are both growing their digital preservation programs by building institutional repositories to house their respective research data.</p>
<p>As I continued to develop my AMNH-specific recommendations for storage, management, and preservation of digital research and collections data, I remained cognizant of the community implications. This final report is still a working document, now totaling over 100 pages. It is my hope that, by at least publicly releasing my semi-structured interview guide (which will be in my public NDSR report to be released in the coming weeks), other natural science museums can pursue the same needs assessment procedure to understand the extent and scope of their own digital data and, in doing so, have the opportunity to advocate for and educate about digital preservation in their own institutions. Only when there is institutional support can larger community-driven resources be developed and the risk of data loss minimized.</p>
<p><a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png"/></a><br/><span xmlns:dct="http://purl.org/dc/terms/" property="dct:title">Data, Science, &amp; Librarians, Oh My!</span> by <a xmlns:cc="http://creativecommons.org/ns#" href="http://victoriaisteeves.com/blog.html" property="cc:attributionName" rel="cc:attributionURL">Vicky Steeves</a> is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a></p>