DataverseNO Preservation Policy

Purpose of this Policy

This document outlines the DataverseNO plan for responsible and sustainable stewardship of published Datasets. Developing a reliable digital archive that adheres to, and remains compliant with, the standards set out in the DataverseNO policies is the best mechanism to support access to research data published in DataverseNO.

Preservation Objectives

Following DataCite’s policy, DataverseNO commits to preserving, and providing access to, published research data for a minimum of ten years after the date of publication in DataverseNO. The intention of DataverseNO is, however, to facilitate access to archived data over the long term.

To achieve this aim, DataverseNO shall:

  • Provide or participate in a reliable preservation environment for published Datasets in collaboration with other national and international stakeholders.
  • Provide reliable and consistent access to published Datasets available through the service to Depositors, the larger research community, and the public according to the DataverseNO policies.

In case of compelling reasons, DataverseNO may remove public access to (Deaccession) Data Files and/or Metadata Files in a Dataset from its holdings. Details regarding the process for Deaccession can be found in the DataverseNO Deaccession Procedure.

Operating Principles

DataverseNO shall adhere to the following operating principles:

  • Endeavour to model services in accordance with the National Digital Stewardship Alliance (NDSA) Levels of Digital Preservation and the Open Archival Information System (OAIS) Reference Model for the long-term preservation of, and access to, digital collections whenever possible.
  • Strive for compliance with emerging international standards for managing a reliable digital repository infrastructure.
  • Adhere to prevailing community standards for preserving access to Datasets whenever possible.
  • Participate in the development and implementation of standards.
  • Commit to an interoperable, scalable digital archive with appropriate storage management for Datasets.
  • Create policies, procedures, and practices that are clearly documented and consistent.
  • Strive to maintain hardware, software, and storage media containing archival content in keeping with prevailing best practices.
  • Strive to establish procedures that meet archival requirements, such as clearly documenting and maintaining provenance, chain of custody, authenticity, and integrity of Datasets.
  • Comply with intellectual property, copyright, and ownership rights for all content.

Roles and Responsibilities

The Board of DataverseNO has the overall responsibility for the DataverseNO Preservation Policy, and for developing DataverseNO and keeping it abreast of the long-term challenges of digital preservation.

Development and management of preservation and access for DataverseNO are accomplished through the collaborative efforts of units at the University Library and the IT department at UiT The Arctic University of Norway (owner of DataverseNO), together with units at the DataverseNO partner institutions working in accessioning, curation, information technology, and systems administration capacities.

Research Data staff at UiT The Arctic University of Norway (owner of DataverseNO) and at DataverseNO partner institutions regularly attend and present at national and international workshops, ensuring that the institutions are aware of and follow the latest best-practice recommendations in research data repository management and research data management. Research Data staff curating and managing DataverseNO are also in continuous dialogue with users in the Designated Communities of DataverseNO. The administrators of DataverseNO are in dialogue with the Board of DataverseNO and, in agreement with the Board, periodically update the DataverseNO policies and guidelines to meet the needs of the Designated Communities.

Preservation Strategies

Datasets deposited in DataverseNO utilize the centralized back-end storage and management services at UiT (owner). This is a common storage and management infrastructure for digital collections of enduring value to UiT, covering digitized and born-digital books, manuscripts, photographs, audio-visual materials, scholarly publications, and research data. The UiT University Library manages the digital collections together with the UiT IT department.

The DataverseNO Deposit Agreement ensures that DataverseNO may convert the deposited Data Files and/or Metadata Files to any medium or format and make multiple copies of the deposited Dataset for the purposes of security, back-up, and preservation. If one or more Data Files and/or Metadata Files in a Dataset are transformed into more stable formats and/or migrated to successive formats, the conversion/migration will be documented in the DataverseNO version control system.

Preservation Plan

DataverseNO addresses the challenges facing digital repositories as described in the following preservation plan:

Technological and Procedural Suitability

On deposit, the Research Data staff responsible for curating the DataverseNO collection in question check data files and metadata for completeness and integrity and, if necessary, ask depositors for changes and/or additions to the deposited dataset in order to ensure compliance with the DataverseNO Deposit Agreement. Compliant datasets are published by the responsible Research Data staff at the depositor's request. To achieve the Preservation Objectives outlined above, DataverseNO applies the following technological and procedural measures in the management of the DataverseNO repository:

The DataverseNO Deposit Guidelines require preferred file formats for datasets to be published in DataverseNO. The file format requirements follow recommendations from the Library of Congress as well as other data repositories with significant holdings such as UK Data and DANS. All file formats are monitored for obsolescence using the Library of Congress’s Sustainability of File Formats pages as well as the UK National Archives’ PRONOM service. Files in formats threatened by obsolescence are converted to suitable replacement formats. File conversion results in a new version of the published dataset in DataverseNO, and the conversion is documented in the version control system of Dataverse. DataverseNO commits to performing file format monitoring on a regular basis.

DataverseNO performs file integrity checks to protect against bit rot. On ingest, the Dataverse software automatically creates an MD5 checksum for every ingested file. The checksum is stored so that file integrity can be checked manually, including by users and third parties. Files are stored in the centralized back-end storage and management services at UiT The Arctic University of Norway (owner of DataverseNO). All content in DataverseNO is backed up using an enterprise-class backup system with retention policies ensuring that multiple copies are maintained of all data in the system.
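As an illustration of the manual check mentioned above, the following minimal Python sketch recomputes the MD5 checksum of a downloaded file and compares it with a stored value. The function names `md5_of_file` and `matches_stored_checksum` are hypothetical and are not part of the Dataverse software; the sketch assumes the stored checksum is available as a hexadecimal string.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large data files do not have to
    fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_stored_checksum(path, stored_md5):
    """Compare a freshly computed MD5 digest with a stored one.

    `stored_md5` is assumed to be the hexadecimal checksum recorded for
    the file; surrounding whitespace and letter case are ignored.
    """
    return md5_of_file(path) == stored_md5.strip().lower()
```

A mismatch means only that the local copy differs from the file that was checksummed; it does not by itself show where the change occurred.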

DataverseNO curates and stores standard-compliant metadata to ensure that metadata can be mapped easily to standard metadata schemas and be exported into JSON format (XML for tabular file metadata) for preservation and interoperability. For an overview of the metadata standards employed in Dataverse, see the Metadata References in the appendix to the Dataverse User Guide. The Dataverse developer community, headed by the Institute for Quantitative Social Science (IQSS) at Harvard University, the Global Dataverse Community Consortium, as well as UiT The Arctic University of Norway (owner of DataverseNO) and the DataverseNO partner institutions are monitoring the development of the metadata standards employed in Dataverse. Based on advice from the Designated Communities of the repositories using the Dataverse software, the mentioned stakeholders will ensure that necessary future changes to the metadata schemas in Dataverse will be implemented.

Continuity of Access and Preservation

In the unlikely event that DataverseNO is closed down, UiT The Arctic University of Norway (owner of DataverseNO) commits to ensuring that deposited data is retained and transferred to one or more approved repositories, in accordance with the agreement with DataCite for the assignment of DOIs to Datasets in DataverseNO, before the service is discontinued. Datasets in the general collections are transferred to one or more certified general research data repositories. Datasets in subject-based collections are transferred to certified subject-based repositories after consulting the Designated Communities involved. In addition, and in accordance with Norwegian legislation, research data from the governmental sector will be transferred to the National Archives of Norway to secure the long-term availability and accessibility of the data.

The preservation measures described in this preservation plan are communicated to Research Data staff at DataverseNO partner institutions through the DataverseNO Curator Guidelines. New Research Data staff members are trained in these guidelines within a few weeks of joining a DataverseNO partner institution. Those parts of the preservation measures concerning the overall administration of the DataverseNO repository are communicated to the administrators of DataverseNO through the DataverseNO Administrator Guidelines.

Financial Sustainability

DataverseNO is identified by the UiT management (owner) as an essential part of UiT’s strategy to fulfil the requirements for research data management from national and international funding agencies, and is a core service for UiT researchers and their partners. DataverseNO (by owner) commits to ensure the proper management and operation of the archive service in accordance with the responsibilities described in the Steering documents for DataverseNO.

UiT The Arctic University of Norway, as owner of DataverseNO, makes a financial commitment toward the sustainable preservation of published Datasets through funding of the Research Data Service as well as through membership fees from DataverseNO partner institutions.

Technological Sustainability, Security, and Disaster Recovery

UiT (owner of DataverseNO) is committed to sustaining an effective digital preservation infrastructure for its digital collections, which includes the adequate provision of appropriate technologies. Datasets deposited in DataverseNO utilize the centralized back-end storage and management services at UiT. This is a common storage and management infrastructure for digital collections of enduring value to UiT, covering digitized and born-digital books, manuscripts, photographs, audio-visual materials, scholarly publications, and research data.

DataverseNO runs on UiT’s centralized storage and virtualization infrastructure, which also hosts the accounting and payroll systems for the whole institution. Everything is backed up using an enterprise-class backup system with retention policies ensuring that multiple copies are maintained of all data in the system. Data can be recovered from backup as necessary, as with other storage services for researchers and students at UiT. The underlying hardware is mirrored between two datacentres in separate buildings on the UiT campus, where data is replicated to avoid data loss in case of physical threats such as fires and floods. Both datacentres are separated from public areas by at least two layers of key-access doors, and access is restricted to authorised operational staff.

All systems (including DataverseNO) and services delivered by the UiT IT department are subject to risk and vulnerability analysis at implementation, at start-up, and at regular intervals throughout the lifetime of the systems and services.

The infrastructure and services are revised regularly according to the UiT IT department quality control system. The storage systems are renewed every 6-8 years, which minimizes the risk of long-term deterioration of storage media. The transfer of data from old to new storage systems includes checks for bit-correctness of all data.
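The policy does not specify how bit-correctness is checked during a storage migration. As a purely illustrative sketch, one common approach is to compare a cryptographic digest of every file on the old storage with the digest of its copy on the new storage. The Python example below assumes both systems are mounted as ordinary directory trees and uses SHA-256; the function names and the choice of hash are assumptions, not a description of the UiT procedure.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_migration_mismatches(old_root, new_root):
    """List files under `old_root` that are missing or differ under `new_root`.

    Paths are returned relative to `old_root`, in sorted order, using
    forward slashes. An empty list means every file was copied bit for bit.
    """
    old_root = Path(old_root)
    new_root = Path(new_root)
    mismatches = []
    for old_file in sorted(old_root.rglob("*")):
        if not old_file.is_file():
            continue  # skip directories
        relative = old_file.relative_to(old_root)
        new_file = new_root / relative
        if (not new_file.is_file()
                or sha256_of_file(old_file) != sha256_of_file(new_file)):
            mismatches.append(relative.as_posix())
    return mismatches
```

The sketch checks only that every old file arrived intact; it does not report extra files that exist only on the new storage.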

DataverseNO complies with the UiT requirements for good computer use practices. UiT has developed extensive technical and administrative procedures to ensure consistent and systematic information security. Good practice requirements include system security requirements, operational requirements and regular auditing and review.

Acknowledgements

Illinois Data Bank. Illinois Data Bank Policy Framework and Definitions. http://hdl.handle.net/2142/91039

Qualitative Data Repository (QDR). Preservation Policy. https://qdr.syr.edu/policies/digitalpreservation

 

Contact research-data@support.uit.no with questions or to request an addition or revision to this policy.

 

Policy Document History and Version Control Table

Version   Action            Approved By            Action Date
2.0       Policy revised.   Board of DataverseNO   2019-03-06
1.0       Policy issued.    Board of DataverseNO   2018-06-21