Definitions

Attribution Where data is used in linguistic research, attribution should be given. This can be done through citation, or – where the data only exist in the publication – acknowledged in the document itself.

Data Linguistic data is any content that can be analysed and presented to further linguistic analysis. This can include, but is by no means limited to, naturalistic observations fixed as audio or video recordings or notes, experiment results, instrumental readings, corpora, surveys, intuitions and examples. Some data can be created by enriching other data, e.g. transcripts of recordings.

Data Citation A citation directs the reader from the article or location where the data are being discussed to the location where they are stored. A citation can also ensure that the reader can access more data or metadata, and allows appropriate credit for the data to be given.

Interoperability involves setting up data management so that data can be accessed using different technological platforms.

Metadata Information about data that helps people to find, understand, contextualize, and use research data. In linguistics, examples of metadata include, but are not limited to, language name, ISO code(s), participant details, location, and date.
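
As an illustration only, the kinds of metadata listed above could be recorded in a small machine-readable structure. The sketch below is not a metadata standard: the class name, field names, and all values are hypothetical placeholders chosen for this example ("und" is the ISO 639 code for an undetermined language).

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class RecordingMetadata:
    """Hypothetical metadata record for one linguistic recording."""
    language_name: str
    iso_codes: list          # one or more ISO 639 codes
    location: str
    date: str                # ISO 8601 date string, e.g. "2020-01-31"
    participants: list = field(default_factory=list)  # dicts with name and role


# All values below are placeholders, not real data.
record = RecordingMetadata(
    language_name="Example Language",
    iso_codes=["und"],
    location="Example Village",
    date="2020-01-31",
    participants=[
        {"name": "Participant A", "role": "speaker"},
        {"name": "Participant B", "role": "transcriber"},
    ],
)

# Serialise to JSON so that both humans and other tools can read the record.
metadata_json = json.dumps(asdict(record), indent=2, sort_keys=True)
print(metadata_json)
```

Storing metadata as plain JSON is one possible choice here; an existing metadata standard used by the chosen repository would normally take precedence over an ad hoc layout like this one.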

Metadata standards There is no one single set of metadata standards for all research data. Using existing metadata standards helps data interoperability, making it easier for both humans and computers to compare different datasets.

Participant The production and management of research data are tasks that require both time and expertise, and should be acknowledged. Specific participant roles may vary depending on the data, but can include speaker, transcriber, researcher, curator.

PID A Persistent Identifier (PID) is a long-lasting reference to a document, file, web page, or other object. This could be a standard, like a DOI (digital object identifier), or an archive-specific solution.
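
To make the idea concrete, a DOI can be turned into a resolvable link by prefixing it with the doi.org resolver address. The following is a minimal sketch; the helper name `doi_to_url` and the DOI used in the example are hypothetical and do not refer to a real dataset.

```python
def doi_to_url(doi: str) -> str:
    """Return a resolvable https://doi.org/ link for a DOI string.

    Accepts a bare DOI ("10.1234/example") or one written with a
    leading "doi:" label. This helper is illustrative only.
    """
    cleaned = doi.strip()
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:].strip()
    # Every DOI begins with the directory indicator "10."
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + cleaned


# Hypothetical DOI, used only to show the shape of the result.
print(doi_to_url("doi:10.1234/example"))
```

Archive-specific identifiers follow their own resolution schemes, so a helper like this applies to DOIs only.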

Replicability Replicable research methods are those that can be recreated elsewhere by other scientists, leading to new data.

Repository A data repository is a space where you can store your data, with an accessible structure and metadata that can be discovered by others. A repository should be persistent (have ways to store your data in an ongoing manner).

Reproducibility Reproducible research provides access to the original data for independent analysis, which can allow others to come to the same conclusions as the original researcher based on the given data.