Information Management, Digital Preservation and Records Management in the Cloud: Challenges and Opportunities (#saa13)

The educational sessions of the Joint Annual Meeting of the Council of State Archivists and the Society of American Archivists continued today at the Hilton Riverside in New Orleans, Louisiana.

The first session, “Digital Preservation and Records Management in the Cloud,” emphasized the host of recordkeeping and preservation challenges that come with software as a service (SaaS). The panelists were:

  • Bonita L Weddle (Chair), Associate Archivist, New York State Archives
  • Mary Beth Herkert, CRM, CA, State Archivist, Oregon State Archives
  • Glen A. McAninch, Technology Analysis and Support Branch Manager, Kentucky Department for Libraries and Archives
  • Rachel E. Trent, Electronic Records Archivist, State Archives of North Carolina

Cloud Computing Benefits, Challenges

While cloud computing promises reasonably priced and sophisticated records management and digital preservation solutions, archivists and records managers moving content to the cloud face multiple issues: myth-breaking; securing buy-in from senior managers, records creators, and information technology leaders; recordkeeping safeguards; security controls; compliance with public records and other laws; unanticipated cost issues; and cloud-based collaboration hiccups.

Just in case anyone had forgotten what the cloud is, Ms. Weddle opened with

Rachel Trent continued. “There are advantages and disadvantages to moving to the cloud.”

  • Advantages include cost savings, accessibility, centralization, flexibility, scalability and stronger interaction with the user community; however,
  • Disadvantages include security, data integrity, ownership and control, availability and performance, legal compliance, and possibly intermittent retention and disposition management.

North Carolina’s digital repository holds 35 TB across more than 3 million objects, including geospatial data, email, images, audio/visual objects and text. The objects are either born digital or accessioned digitally.

Trent admits the original preservation process was labor intensive: her team stored a three-month backup in Raleigh, manually mailed hard drives to an offsite location, and sent a duplicate copy to the "dark" archive at OCLC. She recently turned to DuraCloud. Simply put, Amazon S3 (the online dashboard) maps to Chronopolis at the San Diego Supercomputer Center (the public access point); both in turn map to an API (data and audits are bridged to the API as well).

Tips for Negotiating a Service Level Agreement

When negotiating a service level agreement (SLA), Trent recommends the archivist determine:

  • Guarantees about data loss,
  • Guarantees about security,
  • Guarantees about availability/up-time,
  • Whether fees will be charged to download data,
  • Who has ownership of, and rights to, uploaded data, and
  • What happens if (insert cloud provider here) goes away or ends its agreement with a storage provider.
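
To make the availability item concrete, it can help to translate an uptime percentage into the downtime it actually permits. The following is a minimal sketch, not something presented in the session: the function name `allowed_downtime_hours` and the sample percentages are illustrative assumptions, and a real SLA may measure availability over a different period or exclude scheduled maintenance.

```python
def allowed_downtime_hours(uptime_percent: float, period_hours: float = 365 * 24) -> float:
    """Hours of downtime permitted in a period at a given uptime guarantee.

    Hypothetical helper for illustration; the default period is one
    non-leap year (8,760 hours).
    """
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * (1 - uptime_percent / 100)


if __name__ == "__main__":
    # Sample guarantees chosen for illustration, not taken from any vendor's SLA.
    for pct in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(pct)
        print(f"{pct}% uptime allows about {hours:.2f} hours of downtime per year")
```

Under these assumptions, a 99% guarantee still leaves room for more than three and a half days of downtime a year, which is worth knowing before signing.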

Ultimately, Trent said, she recommends the following best practices for cloud computing:

  • Understand the obligations and guarantees of the SLA,
  • Outline agencies’ legal obligations, and be very specific about them,
  • Plan for unexpected costs,
  • Understand and respect electronic discovery best practices, and
  • Beware of vendor lock-in.

Glen McAninch picked up the thread. “Our architecture is different. As of 2012, we use Archive-It, which offers:

  • Services in the cloud with public access,
  • Web harvesting services that auto-index and group with other institutions, and
  • Download to local storage as WARC files,

and Preservica, a private cloud, currently not using public access, which:

  • Offers preservation services in the cloud,
  • Stores on Amazon in the United States, and
  • Downloads archival package to local storage.”

You can find Kentucky’s recordkeeping guidelines for the cloud here (pdf). McAninch recommends managing risks: if files are not accessioned, improve harvesting; if files are destroyed, control access; if files are corrupted, distribute multiple copies; if files with access restrictions are released, tighten security by only putting open files in the cloud.
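
Distributing multiple copies only protects against corruption if the copies are periodically compared. One common way to do that is a checksum (fixity) comparison. The sketch below is an assumption-laden illustration, not Kentucky's actual procedure: the function names and the majority-vote rule are hypothetical, and it uses SHA-256 from Python's standard library.

```python
import hashlib
from collections import Counter
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_suspect_copies(copies: list[Path]) -> list[Path]:
    """Return the copies whose checksum differs from the most common one.

    Hypothetical majority-vote rule: with three or more copies, the odd
    one out is flagged. With only two differing copies there is no
    majority, so the result says only that they disagree, not which
    copy is the bad one.
    """
    if not copies:
        return []
    checksums = {path: sha256_of(path) for path in copies}
    majority_checksum, _ = Counter(checksums.values()).most_common(1)[0]
    return [path for path, value in checksums.items() if value != majority_checksum]
```

In practice the paths passed in would point at the same object in each storage location (local disk, a mounted offsite copy, a downloaded cloud copy); comparing against a checksum recorded at accession time is a stronger check than majority vote alone.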

Mary Beth Herkert concluded with a description of Oregon’s records management solution.

“The advantages are incredible: cost savings (the Secretary of State implemented HP TRIM in 2007 to the tune of millions of dollars, roughly US$100 per person, whereas today the costs are a little over US$10 each); we have the latest and greatest tech; and this solution frees up our state IT staff. The disadvantages include lack of ownership and control, security, and intermittent retention and disposition.”

To mitigate some of these concerns, her team architected a private cloud. The good news: each agency maintains custody of its information.

The Way Forward 

Of course, the serious impediment today is public records laws. They were written in the 1960s, a time of tangible, owned records. To bridge the gap, Herkert decided to redefine a record as technology independent and content driven. A state employee may create information in one format and retain it in another, and it is STILL a public record. In other words, a Facebook post doesn’t have to be the original record.

The most valuable lesson Herkert has learned: it’s time to market and re-brand archives. Once, we were associated with history. To reach executive attention, archives should embrace transparency, accessibility and accountability.

Title image courtesy of Ladoga (Shutterstock)

Editor's Note: Check out some of Mimi's coverage from the 2012 conference.