When a metadata section in a Data Management Plan template asks which ontology (if any) is used, the question usually means: is a specific vocabulary or classification system used? Controlled vocabularies are created by domain experts to help translate ontological concepts and to organise knowledge for subsequent (information) retrieval. Controlled vocabularies (CESSDA: "structured controlled vocabularies") are intended to ensure consistency and to reduce the ambiguity inherent in natural languages, where the same concept can be given different names. They are used in subject indexing schemes, subject headings, thesauri, taxonomies and other knowledge organization systems. Some vocabularies are widely accepted internationally and standardized, and may even become an ISO standard or a regional standard or classification. Controlled vocabularies can be broad in scope or limited to a specific field.
Examples are:
Many examples of vocabularies and classification systems, covering multiple disciplines, can be found on the FAIRsharing.org website. If you are working on new concepts or ideas and are using or creating your own ontology or terminology, be sure to include it in the metadata documentation of your dataset (for example as part of your codebook).
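As a minimal sketch of how a controlled vocabulary reduces ambiguity, the Python example below maps free-text keywords onto preferred terms and reports keywords that are not covered. The vocabulary, its terms and the function names are hypothetical and chosen only for illustration; a real project would load an established vocabulary instead.

```python
# Hypothetical mini-vocabulary: preferred term -> accepted synonyms.
VOCABULARY = {
    "adolescent": {"teenager", "teen", "youth"},
    "physician": {"doctor", "medical doctor"},
}


def build_lookup(vocabulary):
    """Map every preferred term and synonym (lower-cased) to its preferred term."""
    lookup = {}
    for preferred, synonyms in vocabulary.items():
        lookup[preferred.lower()] = preferred
        for synonym in synonyms:
            lookup[synonym.lower()] = preferred
    return lookup


def normalise(keywords, vocabulary):
    """Return (controlled terms, unrecognised keywords) for free-text keywords."""
    lookup = build_lookup(vocabulary)
    controlled, unknown = [], []
    for keyword in keywords:
        term = lookup.get(keyword.strip().lower())
        if term is None:
            unknown.append(keyword)      # not in the vocabulary: flag for review
        elif term not in controlled:
            controlled.append(term)      # keep each preferred term only once
    return controlled, unknown


controlled, unknown = normalise(["Teenager", "doctor", "youth", "wellbeing"], VOCABULARY)
print(controlled)  # ['adolescent', 'physician']
print(unknown)     # ['wellbeing']
```

Keywords that end up in the unrecognised list are candidates either for correction or for documentation in your codebook as project-specific terminology.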
Metadata is descriptive information about data. It allows humans and programs to understand and interpret that data more easily. Controlled vocabularies are often used to make searching for and re-using data much easier when they are part of a machine-readable metadata scheme or system.
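To illustrate what "machine-readable" can mean in practice, the sketch below stores a metadata record as structured data, checks that a few required fields are filled in, and serialises the record to JSON. The field names loosely follow Dublin Core elements; the values, the choice of required fields and the helper name are assumptions made for this example only.

```python
import json

# Illustrative record; every value is made up for this example.
record = {
    "title": "Consumer spending survey 2017",
    "creator": "Example Research Group",
    "date": "2017-02-23",
    "subject": ["consumer behaviour", "household expenditure"],
    "format": "text/plain",
}

# Assumed minimum set of fields; a real metadata scheme defines its own.
REQUIRED_FIELDS = ("title", "creator", "date", "subject")


def missing_fields(record, required=REQUIRED_FIELDS):
    """List required fields that are absent or empty in the record."""
    return [field for field in required if not record.get(field)]


# JSON is one common machine-readable serialisation.
serialised = json.dumps(record, indent=2, sort_keys=True)
print(serialised)
print(missing_fields(record))
```

Because the record is structured, a program can search on the subject terms or validate completeness without a human having to read free text.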
CESSDA has created helpful guidance on creating metadata.
There are three main levels of metadata: data assets, dataset documentation and dataset registration.
Data Stage | Dataset description | Type of data | Versioning |
Raw data | Consumer spending data | Text files | 2017-02-23_ConsumerSpending_1.2.txt |
Processed data | Anonymized transcription of patient interviews | Word files, Excel | 2014-11-17_RawTranscription_Checked1.docx |
Analysed data | Photo images with descriptions | TIFF files, Word file | C:\Images\Raw\2016-07-01_Subject1-V2.tiff; C:\Images\Clean\2016-07-01_Subject1-H1c.tiff; C:\Images\Clean\Descript\2016-07-01_Subject1-H1c.Docx |
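The file names in the table share a pattern: an ISO date, an underscore, a descriptive label that may include a version, and an extension. As a rough sketch, assuming that pattern holds for your own files, a short Python function can check names against it and extract the parts. The pattern and the function name are assumptions for illustration, not a prescribed convention.

```python
import re
from datetime import date

# Assumed pattern: ISO date, underscore, free-form label, dot, extension.
FILENAME_PATTERN = re.compile(r"^(\d{4})-(\d{2})-(\d{2})_(.+)\.([A-Za-z0-9]+)$")


def parse_filename(name):
    """Split a file name into date, label and extension, or return None."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        return None
    year, month, day, label, extension = match.groups()
    try:
        created = date(int(year), int(month), int(day))
    except ValueError:
        return None  # looks like a date but is not a valid calendar day
    return {"date": created, "label": label, "extension": extension.lower()}


info = parse_filename("2017-02-23_ConsumerSpending_1.2.txt")
print(info["date"].isoformat(), info["label"], info["extension"])
```

A name that does not follow the pattern, such as `notes.txt`, yields `None`, which makes it easy to list files that need renaming before archiving.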
Many archives implement or make use of specific metadata standards. The UK Digital Curation Centre (DCC) provides an overview of metadata standards for different disciplines, which is a useful resource when establishing and carrying out your research methodology. Further important tips are available at Dataset & Publication.
If you want to archive your dataset in a way that is compatible with the FAIR principles, two documents can help: a practical guide that describes how to implement the FAIR data policy, and a table that matches metadata fields from different systems. Both were written for the Faculty of Behavioural and Movement Sciences.
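Matching metadata fields between systems is often called a crosswalk. The sketch below shows the idea in Python: a mapping from local field names to the field names of a target system, applied to one record. All field names on both sides are hypothetical and are not taken from the table mentioned above.

```python
# Hypothetical crosswalk: local field name -> field name in a target system.
CROSSWALK = {
    "project_title": "title",
    "researcher": "creator",
    "collected_on": "date",
    "keywords": "subject",
}


def translate(record, crosswalk):
    """Rename mapped fields; return the translated record and unmapped field names."""
    translated, unmapped = {}, []
    for field, value in record.items():
        target = crosswalk.get(field)
        if target is None:
            unmapped.append(field)   # no equivalent known in the target system
        else:
            translated[target] = value
    return translated, unmapped


local_record = {
    "project_title": "Patient interview study",
    "researcher": "Example Researcher",
    "collected_on": "2014-11-17",
    "room_number": "B-12",
}

translated, unmapped = translate(local_record, CROSSWALK)
print(translated)
print(unmapped)
```

Fields without a counterpart are reported rather than silently dropped, so you can decide whether they belong in free-text documentation instead.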
The Dutch Techcentre for Life Sciences has developed open source software that helps you make your dataset's metadata FAIR. The software is developed on GitHub, where full details on the FAIR Data Point software are available. The Netherlands eScience Center has also developed FAIR Data Point software, full details of which are likewise available on GitHub.