Draft
4 changes: 4 additions & 0 deletions src/data/invalid/GeoLocationNameValue-no_country_or_sea.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
has_raw_value: "USA: Maryland: Bethesda"
subnational_1_or_ocean_region: Maryland
locality: Bethesda
type: nmdc:GeoLocationNameValue
4 changes: 4 additions & 0 deletions src/data/invalid/GeoLocationNameValue-no_raw_value.yaml
@@ -0,0 +1,4 @@
country_or_sea: USA
subnational_1_or_ocean_region: Maryland
locality: Bethesda
type: nmdc:GeoLocationNameValue
5 changes: 5 additions & 0 deletions src/data/valid/GeoLocationNameValue-1.yaml
@@ -0,0 +1,5 @@
has_raw_value: "USA: Maryland: Bethesda"
country_or_sea: USA
subnational_1_or_ocean_region: Maryland
locality: Bethesda
type: nmdc:GeoLocationNameValue
5 changes: 5 additions & 0 deletions src/data/valid/GeoLocationNameValue-2.yaml
@@ -0,0 +1,5 @@
has_raw_value: "USA: Bethesda, Maryland"
country_or_sea: USA
subnational_1_or_ocean_region: Maryland
locality: Bethesda
type: nmdc:GeoLocationNameValue
4 changes: 4 additions & 0 deletions src/data/valid/GeoLocationNameValue-3.yaml
@@ -0,0 +1,4 @@
has_raw_value: "Atlantic Ocean: Charlie Gibbs Fracture Zone"
country_or_sea: Atlantic Ocean
subnational_1_or_ocean_region: Charlie Gibbs Fracture Zone
type: nmdc:GeoLocationNameValue
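
The five example files above can be read as a small test matrix. Judging from their file names, each invalid example omits one slot that the schema treats as required (`has_raw_value` or `country_or_sea`), while the third valid example omits only `locality`. A minimal Python sketch of that reading follows. The function name `check_geo_location` and the error/warning split are assumptions made for illustration and are not part of this PR; actual validation would be done by the LinkML tooling against the schema.

```python
# Hypothetical checker. Slot names are taken from the example files; the
# required/recommended split is inferred from the invalid/valid file names.
REQUIRED_SLOTS = ("has_raw_value", "country_or_sea")
RECOMMENDED_SLOTS = ("subnational_1_or_ocean_region", "locality")


def check_geo_location(record):
    """Return (errors, warnings) for one GeoLocationNameValue-like dict."""
    errors = [
        f"missing required slot: {slot}"
        for slot in REQUIRED_SLOTS
        if not record.get(slot)
    ]
    warnings = [
        f"missing recommended slot: {slot}"
        for slot in RECOMMENDED_SLOTS
        if not record.get(slot)
    ]
    return errors, warnings


# Mirrors src/data/valid/GeoLocationNameValue-3.yaml
valid_3 = {
    "has_raw_value": "Atlantic Ocean: Charlie Gibbs Fracture Zone",
    "country_or_sea": "Atlantic Ocean",
    "subnational_1_or_ocean_region": "Charlie Gibbs Fracture Zone",
    "type": "nmdc:GeoLocationNameValue",
}

# Mirrors src/data/invalid/GeoLocationNameValue-no_country_or_sea.yaml
no_country_or_sea = {
    "has_raw_value": "USA: Maryland: Bethesda",
    "subnational_1_or_ocean_region": "Maryland",
    "locality": "Bethesda",
    "type": "nmdc:GeoLocationNameValue",
}

print(check_geo_location(valid_3))
print(check_geo_location(no_country_or_sea))
```

Under these assumptions the ocean example passes with a single warning about `locality`, and the record without `country_or_sea` is rejected.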
22 changes: 21 additions & 1 deletion src/schema/attribute_values.yaml
@@ -15,11 +15,31 @@ classes:
class_uri: 'nmdc:AttributeValue'
description: >-
The value for any value of a attribute for a sample. This object can hold both the un-normalized atomic
Collaborator

Unrelated to this PR (I'm commenting on it while it's in my attention): I don't understand this sentence.

The value for any value of a attribute for a sample.

value and the structured value
value and the structured value.
slots:
- has_raw_value
- type

GeoLocationNameValue:
class_uri: nmdc:GeoLocationNameValue
description: >-
A structured address-like record of where something is located, came from, or took place
comments:
- When an instance of this class is populated via an ETL pipeline, the raw value from the source should go into this class's has_raw_value slot.
The ETL pipeline can parse that raw value and insert the results into this class's other slots.
- "This class is more useful than string-only representations when data may be ingested from multiple sources, some of which may use a style like 'Nation: State, Locality' and others might use 'Nation: Locality, State'."
- country_or_sea is required. Instances of this class should have either a subnational_1_or_ocean_region or a locality. Preferably both.
is_a: AttributeValue
narrow_mappings:
- mixs:0000010
slots:
- country_or_sea
- subnational_1_or_ocean_region
- locality
slot_usage:
has_raw_value:
required: true

QuantityValue:
class_uri: nmdc:QuantityValue
is_a: AttributeValue
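
The comments on GeoLocationNameValue describe an ETL step that keeps the source string in `has_raw_value` and parses it into the other slots. The sketch below illustrates one way that step might look in Python. The function `parse_geo_loc_name` and its `comma_order` argument are hypothetical and not part of this PR. The argument exists because, as the class comments note, a parser cannot tell 'Nation: State, Locality' from 'Nation: Locality, State' on its own, so the ordering has to be supplied per data source.

```python
def parse_geo_loc_name(raw, comma_order="locality_first"):
    """Split a raw geo_loc_name string into GeoLocationNameValue slots.

    comma_order is a hypothetical per-source hint for comma-separated
    values: "locality_first" or "region_first".
    """
    # The raw value is always preserved, whatever the parse produces.
    record = {"has_raw_value": raw, "type": "nmdc:GeoLocationNameValue"}

    # Everything before the first colon is treated as the country or sea.
    country, _, rest = raw.partition(":")
    record["country_or_sea"] = country.strip()
    rest = rest.strip()
    if not rest:
        return record

    if ":" in rest:
        # Style 'Nation: State: Locality'
        region, _, locality = rest.partition(":")
    elif "," in rest:
        # Ambiguous style; rely on the per-source hint.
        first, _, second = rest.partition(",")
        if comma_order == "locality_first":
            locality, region = first, second
        else:
            region, locality = first, second
    else:
        # Only one component after the country or sea.
        region, locality = rest, ""

    record["subnational_1_or_ocean_region"] = region.strip()
    if locality.strip():
        record["locality"] = locality.strip()
    return record


print(parse_geo_loc_name("USA: Maryland: Bethesda"))
print(parse_geo_loc_name("USA: Bethesda, Maryland"))
print(parse_geo_loc_name("Atlantic Ocean: Charlie Gibbs Fracture Zone"))
```

With the default hint, the three raw values used in the valid example files parse into the same slot values those files contain. A real pipeline would likely also need to check the parsed country or sea against the INSDC vocabulary, which this sketch does not attempt.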
70 changes: 60 additions & 10 deletions src/schema/basic_slots.yaml
@@ -22,6 +22,56 @@ default_range: string

slots:

country_or_sea:
range: string
description: A maximally coarse-grained, geo-political description of where something is, came from, took place, etc.
comments:
- MIxS states that country or sea names should be chosen from the INSDC country list aka the geo_loc_name-qualifier-vocabulary
- https://www.insdc.org/submitting-standards/geo_loc_name-qualifier-vocabulary/
- https://www.cia.gov/the-world-factbook/ and http://unstats.un.org/unsd/methods/m49/m49.htm
todos:
- "ValueError: https://www.cia.gov/the-world-factbook/ and http://unstats.un.org/unsd/methods/m49/m49.htm is not a valid URI or CURIE"
- when populating see_also
examples:
- value: "Canada"
- value: "Atlantic Ocean"
- value: "USA"
narrow_mappings:
- mixs:0000010
rank: 1
required: true

subnational_1_or_ocean_region:
range: string
description: A geo-political description of a region that is a direct, unambiguous subdivision of a paired country_or_sea value,
OR an unambiguous region or feature within an ocean or "sea".
Comment on lines +46 to +47
Collaborator

Hi @turbomam, it's not clear to me what the two operands of the OR are here.

Interpretation 1 (I added EITHER to mark the first operand):

...a region that is "EITHER" a [...] subdivision [...] OR an unambiguous...

Interpretation 2 (I added "EITHER" to mark the first operand):

...a region that is a [...] subdivision of EITHER a [...] value, OR an unambiguous...

Comment on lines +46 to +47
Collaborator

A few lines above this, we have the word sea without quotes.

MIxS states that country or sea names...

I don't know why it is enclosed in quotes here.

...within an ocean or "sea".

comments:
- "INSDC's guidance could be taken to imply that Vancouver or Cote d’Azur are appropriate values for this slot, but NMDC would place those in the locality slot."
aliases:
- state
examples:
- value: "British Columbia"
- value: "Charlie Gibbs Fracture Zone"
- value: "Maryland"
narrow_mappings:
- mixs:0000010
rank: 2
recommended: true

locality:
range: string
description: A geo-political description of a region or feature that is within or overlaps with a paired subnational_1_or_ocean_region value.
aliases:
- state
examples:
- value: "Vancouver"
- value: "Cote d’Azur"
- value: "Bethesda"
narrow_mappings:
- mixs:0000010
rank: 3
recommended: true

qc_comment:
range: string
description: >-
@@ -598,27 +648,27 @@ enums:

Virus Summary:
description: Tab separated file listing the viruses found by geNomad.
see_also:
see_also:
- https://portal.nersc.gov/genomad/
annotations:
file_name_pattern: '^_virus_summary\.tsv?$'

Plasmid Summary:
description: Tab separated file listing the plasmids found by geNomad.
see_also:
see_also:
- https://portal.nersc.gov/genomad/
annotations:
file_name_pattern: '^_plasmid_summary\.tsv?$'

GeNomad Aggregated Classification:
description: >-
Tab separated file which combines the results from neural network-based classification
and marker-based classification for virus and plasmid detection with geNomad.
see_also:
see_also:
- https://portal.nersc.gov/genomad/
annotations:
file_name_pattern: '^_aggregated_classification\.tsv?$'
file_name_pattern: '^_aggregated_classification\.tsv?$'

Reference Calibration File:
description: A file that contains data used to calibrate a natural organic matter or metabolomics analysis.

@@ -1090,12 +1140,12 @@ enums:
title: Metagenome
metatranscriptome:
aliases:
- metaT
- metaT
title: Metatranscriptome
amplicon_sequencing_assay:
meaning: OBI:0002767
title: Amplicon

MassSpectrometryEnum:
permissible_values:
metaproteome:
@@ -1106,13 +1156,13 @@ enums:
aliases:
- metaB
title: Metabolome
lipidome:
lipidome:
title: Lipidome
nom:
aliases:
- natural organic matter
title: Natural Organic Matter

ExtractionTargetEnum:
permissible_values:
DNA: { }