A data set (record / observation) is, essentially, the answer to the questions
“What – Where – When – How – Who?”.
The data that Edaphobase can collect and hold are thus categorized roughly along these major questions.
A taxon is defined by a nomenclaturally complete name (including describing author and year) considered valid within the framework of Edaphobase. Taxa are hierarchically classified within a systematic tree representing Edaphobase’s “taxonomic backbone”. Synonymy relationships between taxonomic names are managed and linked to the valid taxon name, so that data queries also find “older” taxonomic designations.
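The way synonyms are linked to valid names in a hierarchical tree can be illustrated with a minimal Python sketch. It is not Edaphobase's actual implementation: the class `Taxon`, the tables `VALID_TAXA` and `SYNONYMS`, the functions `resolve` and `lineage`, and all taxon names are invented placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Taxon:
    name: str                     # complete name, with author and year where applicable
    parent: Optional[str] = None  # name of the parent taxon in the systematic tree


# Invented placeholder names; these are not real taxa.
VALID_TAXA: Dict[str, Taxon] = {
    "Exemplidae": Taxon("Exemplidae"),
    "Exemplus": Taxon("Exemplus", parent="Exemplidae"),
    "Exemplus validus Smith, 1850": Taxon(
        "Exemplus validus Smith, 1850", parent="Exemplus"
    ),
}

# Hypothetical synonymy table: synonym -> valid taxon name.
SYNONYMS: Dict[str, str] = {
    "Exemplus antiquus Jones, 1900": "Exemplus validus Smith, 1850",
}


def resolve(name: str) -> str:
    """Return the valid taxon name for a valid name or a known synonym."""
    valid = SYNONYMS.get(name, name)
    if valid not in VALID_TAXA:
        raise KeyError(f"unknown taxonomic name: {name}")
    return valid


def lineage(name: str) -> List[str]:
    """List the taxon and its ancestors, from the taxon up to the root."""
    path: List[str] = []
    current: Optional[str] = resolve(name)
    while current is not None:
        path.append(current)
        current = VALID_TAXA[current].parent
    return path
```

In this sketch, a query for the synonym is first mapped to the valid name, so records stored under either name would be found by the same lookup.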
“INDIVIDUALS”: RECORD / OBJECT DESCRIPTION (“WHAT?”)
The record or object descriptions contain the “actual” observations (e.g., number of individuals, abundances, biomass, morphological measurements; possibly separated into males / females / juveniles). Morphometric and similar data can describe either the “taxon” as a general concept (as in traits) or a concrete individual assigned to a taxon.
SAMPLING SITE DESCRIPTION (“WHERE?”)
A sampling site description contains a list of parameters describing the locality (sampling site) in detail. Most important is a geographic reference, related to a taxon’s site of occurrence, given as a descriptive name and its geographic coordinates. This can be hierarchically allocated to, or apportioned by, the country, region, site, plot or even an individual sample. Further site descriptions can include land-use categories, habitat type, climatic data or, especially, soil data (e.g., soil texture, soil pH, soil organic matter). Where possible, vocabularies for environmental categorical data follow existing standards.
SAMPLING EVENT DESCRIPTION (“WHEN?” “HOW?” “WHO?”)
A sampling description contains information about how the observation took place, which can be described as a collection or sampling event. This contains data regarding, e.g., the sampling date or period, the sampling methods, the collector, etc.
Furthermore, information on the species identification can be listed separately (in the object descriptions above). In particular, data are collected on the person who identified the species, the identification literature used, and whether reference material has been deposited in an institutional collection. Such information allows a future data user to assess the taxonomic information related to a data set.
SCOPE (“HOW?”, or better: “WHY?”)
Related to the sampling methodologies, the underlying purpose of the data collection is recorded: the so-called “scope”. Since Edaphobase harmonizes all data into a common data warehouse, allowing all data to be queried together and/or used in internal analysis tools, it is important to know which data can be compiled together. For this, recognizing the scientific background of each data set is vital: Are the data at a species or community level? Are they quantitative (counts, densities) or qualitative (presence/absence)? Can they be generalized for the site of occurrence (sample series per site) or not (individual observation)? This information is collected in the “Scope”.
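As a rough illustration of how scope information could decide which data sets are compiled together, the following Python sketch groups data sets by identical scope. The class `Scope`, its three fields, the function `group_by_scope` and the data set identifiers are assumptions made for this example. The rule that only identical scopes are combined is a simplification, not Edaphobase's documented logic.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Scope:
    level: str           # "species" or "community"
    quantitative: bool   # True: counts/densities; False: presence/absence
    generalizable: bool  # True: sample series per site; False: individual observation


def group_by_scope(datasets: List[Tuple[str, Scope]]) -> Dict[Scope, List[str]]:
    """Group data set identifiers so that each group shares one scope."""
    groups: Dict[Scope, List[str]] = defaultdict(list)
    for dataset_id, scope in datasets:
        groups[scope].append(dataset_id)
    return dict(groups)


# Hypothetical data sets: two community-level sample series and one
# individual presence/absence observation.
community_series = Scope(level="community", quantitative=True, generalizable=True)
single_record = Scope(level="species", quantitative=False, generalizable=False)

datasets = [
    ("dataset-A", community_series),
    ("dataset-B", community_series),
    ("dataset-C", single_record),
]

groups = group_by_scope(datasets)
```

Here `dataset-A` and `dataset-B` end up in one group, while `dataset-C` is kept apart because its scope differs.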
SOURCE DESCRIPTION (“WHO?”)
Edaphobase can collect and harmonize data from various sources, namely:
- “raw” data tables from projects, thesis work, etc.
- museum collections
- literature (scientific publications, books, reports)
The data basis of all sources is, of course, the taxa listed in the source and the taxa’s sites of occurrence. Further information is source-specific and can include the project name and principal investigator (for project sources), the collection name and collection-object number (for museum sources), or the authors, journal, article title, page numbers etc. (for literature sources).
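The split between the common data basis and the source-specific information can be sketched as a small Python data model. All class and field names below (`ProjectSource`, `CollectionSource`, `LiteratureSource`, `Occurrence`, `SourceRecord`, `source_kind`) are hypothetical and chosen only to mirror the text; they are not Edaphobase's actual field names.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class ProjectSource:
    project_name: str
    principal_investigator: str


@dataclass
class CollectionSource:
    collection_name: str
    object_number: str  # collection-object number


@dataclass
class LiteratureSource:
    authors: List[str]
    journal: str
    article_title: str
    pages: str


Source = Union[ProjectSource, CollectionSource, LiteratureSource]


@dataclass
class Occurrence:
    taxon: str  # taxon listed in the source
    site: str   # the taxon's site of occurrence


@dataclass
class SourceRecord:
    source: Source                 # source-specific description
    occurrences: List[Occurrence]  # data basis common to all source types


def source_kind(source: Source) -> str:
    """Name the source category of a source description."""
    if isinstance(source, ProjectSource):
        return "project"
    if isinstance(source, CollectionSource):
        return "collection"
    return "literature"


# Placeholder values only; no real publication or taxon is implied.
record = SourceRecord(
    source=LiteratureSource(
        authors=["A. Author"],
        journal="Example Journal",
        article_title="An example title",
        pages="1-10",
    ),
    occurrences=[Occurrence(taxon="Exemplus validus", site="Example site")],
)
```

Whatever the source category, every record in this sketch carries the same list of taxon occurrences; only the `source` part varies.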
A list of the categories, the specific data fields, as well as their definitions, formats and units can be found here
HOW PERSONAL DATA IS HANDLED
Edaphobase respects personal privacy, does not provide personal data to external parties, and does not participate in the “data economy”!
To be able to credit data providers/owners with data sets (and acknowledge their Intellectual Property Rights in future data re-use), it is necessary to list persons’ names with the corresponding data sets, i.e., as authors of literature, collectors and those responsible for collections, individuals determining taxonomic objects, or data owners or principal investigators of projects. Even institutions (such as a publisher, the institute hosting a collection or running a project) can be collated in Edaphobase as “persons” in this context.
Only these “names” (if desired) are publicly available in Edaphobase. Necessary contact data are linked to the data sets internally (at Edaphobase’s host institution, i.e., Senckenberg) to allow future contact if necessary. They are not stored directly in the data set and are thus not made publicly available, nor can outside parties obtain this information illegitimately.
Edaphobase’s Data Policy can be found here