Preservation Policy

Policy Statement

KonDATA (hereafter “repository”), the institutional research data repository of the University of Konstanz, is dedicated to long-term archiving and preservation of the data it publishes.

Intended Audience

KonDATA staff, data providers, and data users.

Summary

This document describes KonDATA’s approach to achieving and guaranteeing long-term access to its contents. It covers the scope, mission and goals of the policy, the content of the repository, and the archival and technical requirements for long-term preservation. It also describes the roles and responsibilities of the repository staff, how the repository ensures integrity and security, and how it is financed.

1. Scope and Goals of this policy

1.1. Scope

This Preservation Policy is valid for KonDATA, the institutional research data repository of the University of Konstanz, and all its published datasets, files, and metadata. This does not include the preservation of related websites, documents and other materials connected to the service.

1.2. Mission and Goals

KonDATA is a service for the publication and long-term archiving of research data produced by members of the University of Konstanz, regardless of their scientific discipline. The physical preservation of all uploaded datasets is guaranteed for at least ten years, as stated in KonDATA’s Terms of Use (paragraph 4(5)).

This policy is driven by the “Guidelines for Safeguarding Good Research Practice” of the Deutsche Forschungsgemeinschaft (German Research Foundation, DFG). The preservation strategy follows frameworks such as the OAIS reference model, the FAIR principles and CoreTrustSeal. This policy formalises and communicates long-standing processes and workflows to ensure the long-term preservation of the University of Konstanz’s research data.

KonDATA started its service in 2021. It is hosted and run by the Communication, Information, Media Centre (KIM) of the University of Konstanz, which unites the university’s library and computing centre. Its predecessor, the former university library, started its work with the founding of the University in 1966. The university library is a state-funded memory institution, and one of its main purposes is to provide and guarantee the long-term availability of scientific information in general. The repository uses the software RADAR, a service of FIZ Karlsruhe, Leibniz Institute for Information Infrastructure.

The goal of this preservation policy is to

  • provide researchers with long-term access to research data
  • maintain accessibility of published data
  • ensure authenticity, integrity, and security of published data
  • communicate the trustworthiness of the repository to its users

2. Content and Community of the repository

2.1. Characterisation of the content

KonDATA offers and guarantees the publication of research data for all university members and ensures its availability and long-term preservation. KonDATA’s content is not limited to a scientific discipline. This is reflected in the repository’s mission statement, which was approved by the KIM director Oliver Kohl-Frey on 11.07.2023.

The provided datasets are meticulously described with metadata to improve their findability, accessibility, interoperability and reusability, according to the FAIR principles. So that deposited data can be understood, interpreted and reused, information on the structure and content of the dataset and of each data file is to be included both in the metadata description and in an additional README.txt file. This is checked during curation. A template is provided to facilitate and encourage the provision of metadata in a README.txt file.

It is strongly recommended that data files are stored in open file formats. Publishing harmonised data in open file formats lowers the barrier to using the data, as no proprietary software is needed, and increases the likelihood that the data can be reused in the future. Updates to published datasets can be submitted and published as new datasets. Storage of datasets in the repository is guaranteed for a minimum of 10 years in accordance with DFG guidelines on the handling of research data. After this term expires, the repository will not remove the data from its archive. To ensure future access to and reusability of the data, migration to other file formats will be evaluated. Data providers grant the University of Konstanz, as operator of KonDATA, the right to convert files to different formats if necessary for long-term access, as defined in the Terms of Use for data providers, paragraph 2(1).

2.2. Designated Community of the repository

The target community of the repository comprises all members of the university (lecturers, researchers, students, administrative staff, etc.) who want or need to publish their research data. The public can access and download all published data, unless the data are under embargo. No organisational affiliation or account is needed to access the data.

3. Requirements

3.1. Archival requirements

The repository follows requirements that need to be fulfilled to ensure long-term preservation of published data:

  • data published in the repository are accompanied by adequate documentation to enable their use and re-use;
  • data are checked, validated, and curated following predefined workflows;
  • data are described and enriched with metadata following standards and best practices;
  • data packages and metadata are stored and preserved for the long term;
  • the authenticity, integrity and reliability of datasets preserved for future use are retained; and
  • research data that contain personal data comply with the provisions of the European Union's General Data Protection Regulation.

The repository maintains a further commitment to the FAIR Principles to make data findable, accessible, interoperable and reusable, as described in our Mission Statement.

3.2. Technical requirements

Technical hosting and maintenance of the repository are ensured by the IT department of the KIM. The department makes technical decisions with respect to state-of-the-art technology.

The repository software used for deposit, curation, preservation and access management relies on RADAR, which was developed with DFG funding and is maintained by FIZ Karlsruhe. FIZ Karlsruhe develops, maintains and operates an instance of the software for KIM on local hardware, as arranged in the service contract between both institutions. This also includes regular updates of Rocky Linux, a well-established Linux distribution, as the operating system running on KonDATA’s virtual machines. Interfaces to other services, such as DOI registration or authentication via Shibboleth, enable KIM to integrate KonDATA seamlessly into its system landscape while also outsourcing the IT resources needed. KonDATA relies on a development system, which is accessible only from the university network and where software updates are deployed for testing and evaluation before they are rolled out to the production system that users can access.

To identify features needed by the designated community (see 2.2), repository staff are in constant exchange with its members. Communication with users feeds into requirements management, which may lead to the implementation of new features.

4. Roles and responsibilities

All staff working on the repository help to fulfil the requirements needed to ensure continuity of data access and of the service. Staff members will be guided to respect this policy. The product owner of the service is responsible for maintaining this policy.

5. Integrity and security

All workflows in the repository follow defined processes that are described in an internal manual. A visualisation of the curation workflow can be found in the repository’s Frequently Asked Questions. A conceptual schema of the publication workflow is shown in Figure 1.

The repository is committed to taking all necessary precautions to ensure the physical safety and security of the data it preserves. This includes regular communication with the KIM’s IT security staff and the development and monitoring of an information security concept. The virtual machines the repository runs on, including the repository software and the data collection, are backed up regularly in different locations on campus.

Metadata published with every dataset and file follow the DataCite standard. Checksums are generated so that data integrity can be verified after storage. It is strongly recommended that data files use open formats. Datasets in KonDATA follow the BagIt standard, a hierarchical file layout concept that stores content data files together with the descriptive and technical metadata. This enables datasets to be understood and reused independently, even if the system through which they were published is not available. Each BagIt package contains MD5 checksums for every file (content data and metadata files) that can be used to check data integrity.
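
The integrity check that the BagIt checksums make possible can be sketched as follows. This is a minimal illustration, not KonDATA's or RADAR's actual tooling: it assumes that a downloaded dataset has been unpacked to a local directory and that its manifests use the file names and line layout defined in the BagIt specification (manifest-md5.txt and, optionally, tagmanifest-md5.txt, each line holding a checksum followed by a relative path). The function names md5_of and verify_bag are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bag(bag_dir: Path) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # Assumption: manifests follow the BagIt naming convention for MD5.
    for manifest_name in ("manifest-md5.txt", "tagmanifest-md5.txt"):
        manifest = bag_dir / manifest_name
        if not manifest.is_file():
            # The payload manifest is required; the tag manifest is optional.
            if manifest_name == "manifest-md5.txt":
                problems.append(f"missing: {manifest_name}")
            continue
        for line in manifest.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            # Each line is "<checksum> <path relative to the bag root>".
            expected, rel_path = line.strip().split(maxsplit=1)
            target = bag_dir / rel_path
            if not target.is_file():
                problems.append(f"missing: {rel_path}")
            elif md5_of(target) != expected.lower():
                problems.append(f"checksum mismatch: {rel_path}")
    return problems
```

Calling verify_bag(Path("path/to/unpacked/dataset")) returns an empty list when every file listed in the manifests is present and unchanged. Any other result names the files that are missing or whose content no longer matches the recorded checksum.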

Figure 1: Conceptual schema of the data publication workflow. The field of responsibility of the researcher (data provider) is depicted on the left-hand side and that of the KIM on the right-hand side.

6. Sustainability plans and funding

The repository will be funded indefinitely by the University of Konstanz in order to secure its long-term operation. The University of Konstanz is financed primarily through public funds from the German government and therefore offers stable long-term funding and hosting for the repository. Should the repository cease to exist due to unforeseen circumstances, all published data will be offered to the University Archive of the University of Konstanz for further preservation. In the unlikely event that the university closes, the content of the University Archive has to be transferred to the state archive of Baden-Württemberg. If the KIM or FIZ Karlsruhe decides to end the cooperation, the KIM has the option either to take over the technical operation of KonDATA and continue the service, or to extract the data and migrate it to a different repository system. This is stated in the RADAR Local service contract under paragraph 12(2): “The customer may take over the software in the condition it has at the time of the end of the contract and then use and further develop it under his own direction and at his own cost”.

This preservation policy was originally approved by Oliver Kohl-Frey, Director of the Communication, Information, Media Centre (KIM) of the University of Konstanz, on 28.05.2024.