Collection Policy


ARCHE offers stable hosting of digital research data for the Austrian humanities community. This document describes the policies governing the acquisition, curation, and management of resources in ARCHE. It sets out the scope and kinds of content admissible for deposit in ARCHE.

Scope of Collections

As part of the CLARIAH-AT infrastructure, ARCHE is primarily intended to be a digital data hosting service for the humanities in Austria. Data from all humanities fields, including modern languages, classical languages, linguistics, literature, history, jurisprudence, philosophy, archaeology, comparative religion, ethics, and criticism and theory of the arts, are therefore equally welcome. While ARCHE’s predecessor, the CLARIN Centre Vienna / Language Resources Portal, was dedicated to digital language resources, ARCHE offers services open to a broader range of disciplines.

The service is designed to cover a wide range of humanities research data. We accept (annotated) digital texts, lexical resources, tabular data, databases, images, collections containing GIS, 3D or CAD data, multimedia files (sound and/or video), websites and social media data, etc. We also accept software (applications, source code, etc.).

ARCHE is mainly meant to accommodate resources related to Austria: either having been collected or created in Austria, or from an area or historical period of interest to Austrian scholars. We do not categorically exclude resources without direct relation to Austria.

ARCHE aims to preserve primary research data as well as derived or processed data in different versions. Traditional research publications are not the focus of ARCHE, as numerous alternatives exist for them.

ARCHE is not only meant as a repository for completed and closed datasets from finished projects; it explicitly welcomes intermediate snapshots of data from active projects that are still subject to revision, for example transcripts of historical texts that are gradually being enriched with additional information or annotation layers. Nevertheless, all deposited data is considered immutable, i.e. data with the same persistent identifier will remain unchanged.
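The interplay between evolving snapshots and immutable deposits can be illustrated with a minimal sketch. It assumes that each new snapshot is stored under a new persistent identifier and optionally points back to its predecessor. The names `Deposit` and `SnapshotStore` and the `example-pid/N` identifier format are hypothetical and do not describe ARCHE's actual implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)  # frozen: a stored deposit cannot be modified afterwards
class Deposit:
    pid: str
    content: bytes
    previous_pid: Optional[str] = None  # link to the earlier snapshot, if any


class SnapshotStore:
    """Hypothetical in-memory store: one new PID per deposited snapshot."""

    def __init__(self) -> None:
        self._deposits: Dict[str, Deposit] = {}

    def deposit(self, content: bytes, previous_pid: Optional[str] = None) -> str:
        if previous_pid is not None and previous_pid not in self._deposits:
            raise KeyError(f"unknown previous PID: {previous_pid}")
        # Illustrative identifier format only, not a real PID scheme.
        pid = f"example-pid/{len(self._deposits) + 1}"
        self._deposits[pid] = Deposit(pid, content, previous_pid)
        return pid

    def resolve(self, pid: str) -> Deposit:
        return self._deposits[pid]


store = SnapshotStore()
first = store.deposit(b"raw transcript")
second = store.deposit(b"raw transcript + annotation layer", previous_pid=first)
```

In this sketch a revision never overwrites an earlier deposit: `first` keeps resolving to the original transcript, while `second` carries the enriched version under its own identifier.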

ARCHE will decide on the admission of each submission on a case-by-case basis and may suggest other, more suitable archives or repositories.


ARCHE acknowledges that it is beneficial for depositors and users that data proposed for deposit is effectively and rigorously peer reviewed. Data will be evaluated by curators according to the criteria listed below, including its manageability and the measures that will need to be taken for its preservation and dissemination. In this regard, the intention is to assign curators who are familiar with the discipline from which the data originates.

ARCHE recognises its responsibility to cooperate with depositors to ensure the availability of needed metadata.

The following list contains general criteria for assessing the eligibility of resources for deposit. These criteria only serve as a general guideline; missing one criterion will not necessarily disqualify a resource from acceptance. In all cases, the final decision will be made through discussion between the repository and the depositor.

  • Risk: Is the data in danger of disappearing or becoming otherwise inaccessible?
  • Degree of availability: Can the data be found elsewhere?
  • Demand: To what extent is the data used?
  • Geographical or historical scope: Does the data have a relation to Austria?
  • Curation effort: How much work will be needed to prepare the data for archiving?
  • Stability of data: Does the data represent a preliminary stage or is it the final version?


All resources included in ARCHE are intended to be retained permanently. Data may be removed at the depositor’s request, but the metadata will be retained, together with an indication that the data itself was removed, to show that the data was once there. The assigned PID will be preserved and will point to a tombstone page that displays the metadata.
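The removal behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: the `Record` class, the `withdraw` and `resolve` functions, the `status` metadata key, and the `example-pid/42` identifier are all hypothetical and are not taken from ARCHE's software.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Record:
    pid: str
    metadata: Dict[str, str]
    data: Optional[bytes]
    removed: bool = False


def withdraw(record: Record) -> None:
    """Remove the data at the depositor's request, keeping PID and metadata."""
    record.data = None
    record.removed = True
    # Hypothetical metadata key marking that the data itself was removed.
    record.metadata = {**record.metadata, "status": "removed"}


def resolve(record: Record) -> Dict[str, object]:
    """Return what the PID points to: a landing page or a tombstone page."""
    page = "tombstone" if record.removed else "landing"
    return {"page": page, "pid": record.pid, "metadata": record.metadata}


record = Record(
    pid="example-pid/42",
    metadata={"title": "Sample corpus"},
    data=b"corpus files",
)
withdraw(record)
result = resolve(record)
```

After `withdraw`, the identifier still resolves, but to a tombstone page that shows the retained metadata rather than the data.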


Adapted from and inspired by: