Meta Data

Research Data Best Practices


The Senckenberg – Leibniz Institution for Biodiversity and Earth System Research (SGN) aims to create and preserve knowledge, data and associated material, to generate new information, and to make it findable, accessible, interoperable and (re)usable for science and society, following the FAIR data principles. Research data and scientific collections are the basis of our research and the result of our scientific work (see here).


To enable scientific work in line with good scientific practice, research information, data and materials must be generated, stored, maintained, and sustainably provided according to recognised international standards. These processes must meet high expectations and take the cultures of the individual disciplines into account. In this context, SGN has a special responsibility for data associated with scientific collections, yet we also hold an increasingly diverse set of other geobiodiversity data. Data protection, copyright, patent and confidentiality issues must be observed, as must ethical obligations such as adherence to good scientific practice in handling digital and non-digital research data, and the responsibility for ensuring their long-term verifiability and reusability.

This guideline is based on the following resources:

●       The “OECD Principles and Guidelines for Access to Research Data from Public Funding” (here),

●       The DFG recommendations for “Securing Good Scientific Practice” (here),

●       The “Rules for Safeguarding Good Scientific Practice at the Senckenberg Research Institutes”, the “DFG Rules of Practice for Digitisation” and the “Guideline for Handling Research Data in the Leibniz Association” (here)

●       The “Provisional Data Management Plan for DiSSCo infrastructure” (here)

 

As a signatory of the Bouchout Declaration, SGN has committed itself to making research data and digital resources freely and openly available to the greatest extent possible, following the FAIR data principles. The following text defines a framework for the Senckenberg data management plans for raw research data and metadata.

Raw research data comprise all data and data types that are generated or collected during the research process. Since these data are available in different formats and media types, e.g. digital databases, digital images, and digital library archives, depending on the research discipline and methodology used, sufficient documentation of the circumstances and methods of their creation is necessary for their effective re-use. The methods increasingly include software for the generation, processing or analysis of data. Data can be findable and accessible, but still not reusable and interoperable if the data formats do not follow open data standards.

Data formats should be open and non-proprietary, so that reading them does not depend on specific non-open software, and should, if possible, be human-readable (i.e. text files rather than binary files). Text files should use UTF-8 encoding. Senckenberg encourages following these guidelines:

  1. The guidelines of the National Archives
  2. The ETH Library, for a concise summary of file formats
  3. “Best File Formats for Archiving”, for a comprehensive article
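
The UTF-8 recommendation above can be checked automatically before a text file is archived. The following is a minimal sketch in Python, not an SGN tool; the function name `is_utf8_text` is hypothetical and chosen only for illustration. It reads the whole file into memory, so it is suited to small and medium-sized text files.

```python
from pathlib import Path


def is_utf8_text(path: Path) -> bool:
    """Return True if the file at `path` decodes cleanly as UTF-8.

    Hypothetical helper for illustration; it only checks the encoding,
    not whether the file format itself is open or non-proprietary.
    """
    try:
        path.read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

A file exported in a legacy encoding such as Latin-1 typically fails this check as soon as it contains a non-ASCII character (for example an umlaut), which is a signal to re-export it as UTF-8 before deposit.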

Encouraged licenses are the Creative Commons copyright waiver (CC0) and Creative Commons Attribution (CC BY), or licenses equivalent to these.

 

Metadata are understood here as descriptive data for any kind of research object, such as physical objects in scientific collections and raw research data, as well as descriptive data for research projects. Metadata should be prepared in English and follow an established metadata scheme. This scheme should include

  1. bibliographic metadata (similar to DataCite) including
    a. ORCID of the expert
    b. ROR of the institution of the expert
  2. DOI of the data the metadata refer to
  3. DOI or equivalent persistent identifier of the external data
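
Identifiers such as the ORCID can be checked for typing errors before they are entered into a metadata record. The sketch below is an illustration only, with hypothetical function names: it tests the usual hyphenated 16-character form and recomputes the final check character, which ORCID derives with the ISO 7064 MOD 11-2 algorithm. It does not confirm that the identifier is actually registered.

```python
import re

# Hyphenated form: four blocks of four; the last character may be "X".
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if `orcid` is well formed and its check character matches."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]
```

For example, `is_valid_orcid("0000-0002-1825-0097")` (the identifier ORCID uses for its fictitious sample researcher) passes the check, while changing the last digit makes it fail.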

Metadata should be as descriptive as possible and comply with the FAIR principles. They should cover the following elements:

  1. Title
  2. DOI
  3. Description/Abstract
  4. Keywords
  5. Data Authors/Data Source
  6. Metadata author
  7. Creation date, version
  8. Geographical coverage
  9. Temporal coverage
  10. Taxonomic coverage
  11. Quality Assurance/Quality Control (QA/QC) procedures
  12. Data files/format
  13. Contact person
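
The elements listed above can be captured in a simple machine-readable record and checked for completeness before submission. The following Python sketch is an assumption-laden illustration, not an SGN or DataCite schema: the field names, the helper `missing_fields`, and all example values (including the DOI) are hypothetical placeholders.

```python
import json

# Hypothetical field names, one per metadata element listed above.
REQUIRED_FIELDS = [
    "title", "doi", "description", "keywords", "data_authors",
    "metadata_author", "creation_date", "version",
    "geographical_coverage", "temporal_coverage", "taxonomic_coverage",
    "qa_qc_procedures", "data_files", "contact_person",
]


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in `record`."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Placeholder record; every value here is invented for illustration.
example_record = {
    "title": "Example occurrence dataset",
    "doi": "10.1234/example-dataset",
    "description": "Short abstract describing the dataset.",
    "keywords": ["biodiversity", "occurrence data"],
    "data_authors": ["Example, A."],
    "metadata_author": "Example, B.",
    "creation_date": "2020-01-01",
    "version": "1.0",
    "geographical_coverage": "Example region",
    "temporal_coverage": "2010/2019",
    "taxonomic_coverage": "Example taxon",
    "qa_qc_procedures": "Coordinates and names checked manually.",
    "data_files": ["occurrences.csv (UTF-8)"],
    "contact_person": "Example, A.",
}

if __name__ == "__main__":
    print("Missing fields:", missing_fields(example_record))
    print(json.dumps(example_record, indent=2, ensure_ascii=False))
```

Keeping the record as plain JSON in UTF-8 is consistent with the recommendation to use open, human-readable formats; a real deposit would map these fields onto the scheme required by the target repository.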

SGN implements and maintains a basic research and collection data infrastructure, thus ensuring adequate long-term preservation and technical availability (findability and interoperability) of metadata and raw digital research data.
The data are stored and archived in SGN’s own information infrastructure or in external, internationally recognised specialist repositories (see re3data.org). Internal databases and external repositories typical for data generated by SGN are listed below:

Internal databases

  1. Collection database
  2. Botany
  3. Soil zoology
    a. Edaphobase
    b. Virmisco (database for the digitisation of soil animals; microscope images)
  4. Zoology
  5. Entomology
    a. ECatSym: Electronic World Catalog of Symphyta
  6. Geology
    a. AQUiLAgeo: geological collections of Freiberg and Dresden
  7. Data domains

External repositories

  1. Open access
    a. GBIF (Global Biodiversity Information Facility)
    b. OBIS (Ocean Biodiversity Information System)
  2. Soil
    a. BonaRes (Soil as a sustainable resource for the bioeconomy)
  3. Molecular ecology
  4. Vegetation
    a. DRYAD: Data from: Global vegetation patterns of the past 140,000 years
    b. GitHub code repository at https://github.com/MagicForrest/DGVMTools
    c. Global Index of Vegetation-Plot Databases (GIVD) – https://www.givd.info/
    d. sPlot – The Global Vegetation Database – https://www.idiv.de/en/splot.html
    e. TRY – plant trait database – https://www.try-db.org/TryWeb/Home.php
  5. General
  6. Registry for DOIs
    a. DataCite

Project leaders and other researchers working independently are usually responsible for research data management in their research projects. In particular, they are obliged to ensure compliance with good scientific practice and professional standards. Research projects that generate raw research data require a data management plan (DMP), which sets out, among other things, the scope of the data to be backed up and the access rights to, and any restrictions on, the research data. SGN advises on research data management in research projects from the planning stage, through implementation, and beyond the end of the project. The German Federation for Biological Data (GFBio) is also available for consultation.

A collection portal has been set up to provide access to SGN’s collection databases. For all other data, DMPs are collected centrally. This means that at the beginning of a project it is determined which data will be collected and how they will be archived in the long term. The basic aim is to make metadata fully visible to the public. Exceptions to this rule are possible, e.g. due to data protection or copyright concerns, to protect species and personal data, and explicitly to protect the scientific interests of the persons who originally generated the data.

Raw research data are secured on a long-term basis and usually linked to the corresponding metadata via database entries. To ensure that raw research data can be unambiguously identified worldwide, the aim is to assign the data a unique persistent identifier wherever possible. Once the project that generated them has completed its evaluation, raw research data should also be made visible to the scientific community. This is to be done as soon as possible and at the latest five years after completion of the project. As a rule, access to the original data requires the consent of the persons in whose projects they were generated. The authors can impose an embargo on the use of the data if they have not yet completed all analyses and the final results of the project have not yet been published.
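
The five-year limit described above can be recorded in a DMP as a concrete date. The following Python sketch shows one way to compute it; the function name `latest_release_date` and the handling of 29 February are assumptions made for illustration, not part of the SGN guideline.

```python
from datetime import date


def latest_release_date(project_end: date, years: int = 5) -> date:
    """Return the latest date by which raw data should become visible.

    Adds `years` calendar years to the project end date. If the project
    ended on 29 February and the target year is not a leap year, the
    result falls back to 28 February (an assumption of this sketch).
    """
    try:
        return project_end.replace(year=project_end.year + years)
    except ValueError:
        return project_end.replace(year=project_end.year + years, day=28)
```

For a project completed on 30 June 2020, this gives 30 June 2025 as the latest date for making the raw data visible; an earlier release remains the stated goal.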

Contact

Dr. Hanieh Saeedi
Biodiversity Information Coordinator

Data Manager Ocean Biogeographic Information System (OBIS), Deep Sea Node, UNESCO

Chair Data Quality Control Task Team, OBIS

Research interests

I am interested in understanding the driving factors (ecological and evolutionary processes) that shape biodiversity patterns and biogeography in marine species (shallow and deep sea) using big data. In addition, I am interested in predicting how these biodiversity patterns and species distribution ranges will shift under future climate change. I am also the OBIS (Ocean Biogeographic Information System) deep-sea node data manager at UNESCO, specialising in managing big datasets, biodiversity data standards, and quality control tasks. To carry out my research, I use a range of skills and apply methods and techniques such as taxonomy (morphology and molecular), phylogeny, biogeography, big-data management, biodiversity informatics, macroecology, species distribution modeling, and ecological modeling.

At the moment, I am leading projects in the digitisation of museum collections, biogeography, and biodiversity informatics using big data at regional (e.g. NW Pacific) and global scales. I also work for intergovernmental science-policy bodies such as IPBES (Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services), providing fundamental information for biodiversity assessment reports that help policy makers better understand the global status of biodiversity in the world’s oceans and, consequently, establish more efficient strategic management plans to maintain ocean biodiversity.

Current research projects

  • Biogeography of the NW Pacific deep-sea fauna and their possible future invasions into the Arctic Ocean (Beneficial Project)
  • Estimating the global future shift patterns of shallow-water and deep-sea Crustacea
  • Biodiversity and biogeography of molluscs along the NW Pacific and the Arctic Ocean
  • Biodiversity and future distributions of corals along the NW Pacific and the Arctic Ocean
  • Biogeography of marine species richness and impact of climate change
  • IPBES thematic assessment of invasive alien species and their control

Student opportunities

Various research projects for postdocs and for PhD, MSc and BSc students, as well as short internships, are available all year round. These opportunities are mostly in the fields of biogeography, ecology, biodiversity informatics, and ecological modeling. Further postdoc and PhD projects can also be discussed and jointly developed. Please contact me for more details.

Teaching

I have more than 15 years of international experience in teaching and supervising students from high school to MSc level.

Selected publications

Saeedi, H., Simoes, M., Brandt, A. (2020). Biodiversity and distribution patterns of deep-sea fauna along the temperate NW Pacific. Progress in Oceanography, 183: 102296. https://doi.org/10.1016/j.pocean.2020.102296.

Saeedi, H., Simoes, M., Brandt, A. (2019). Endemicity and community composition of marine species along the NW Pacific and the adjacent Arctic Ocean. Progress in Oceanography, 178: 102199. https://doi.org/10.1016/j.pocean.2019.102199.

Saeedi, H., Costello, M. J., Warren, D., Brandt, A. (2019). Latitudinal and bathymetrical species richness patterns in the NW Pacific and adjacent Arctic Ocean. Scientific Reports, 9:9303. https://doi.org/10.1038/s41598-019-45813-9.

Saeedi, H., Reimer, D. J., Brandt, J. M., Dumais, P. O., Jażdżewska, M. A., Jeffery, W. N., Thielen, M. P. (2019). Global marine biodiversity and prediction in the context of achieving the Aichi Targets: ways forward and addressing data gaps. PeerJ, 7: e7221. https://doi.org/10.7717/peerj.7221.

Saeedi, H., Bernardino, A. F., Shimabukuro, M., Falchetto, G., & Sumida, P. Y. G. (2019). Macrofaunal community structure and biodiversity patterns based on a wood-fall experiment in the deep South-west Atlantic. Deep Sea Research Part I: Oceanographic Research Papers, 145: 73-82.

Saeedi, H., & Costello, M. J. (2019). A world dataset on the geographic distributions of Solenidae razor clams (Mollusca: Bivalvia). Biodiversity Data Journal, 7: e31375. https://doi.org/10.3897/BDJ.7.e31375.

Saeedi, H., Kamrani, E., Shayesteh, F., Nordhaus, I., Diele, K., Raeisi, H. (2018). Sediment Temperature Impact on Population Structure and Dynamics of the Crab Austruca iranica Pretzmann, 1971 (Crustacea: Ocypodidae) in Subtropical Mangroves of the Persian Gulf. Wetlands, 38(3): 539–549.

Saeedi, H., Costello, M. J. and Dennis, T. (2017). Modelling present and future global distributions of razor clams (Bivalvia: Solenidae). Helgoland Marine Research, 70: 23.

Chaudhary, C., Saeedi, H., & Costello, M. J. (2017). Marine species richness is bimodal with latitude. Trends in Ecology and Evolution, 32(4): 234-237.

Saeedi, H., Costello, M. J. and Dennis, T. (2016). Bimodal latitudinal species richness and high endemicity in razor clams (Mollusca: Bivalvia). Journal of Biogeography. 44(3): 592–604.

Chaudhary, C., Saeedi, H., & Costello, M. J. (2016). Bimodality of latitudinal gradients in marine species richness. Trends in Ecology and Evolution, 31(9): 670-676.