Resources

Deploying Collections

2010 Leadership Grant

Documentation

Skip to end of metadata
Go to start of metadata

With this iteration, we are targeting the data migration mappings for the migration of SAGE data into a CollectionSpace instance, based on the version 2.0 release of CollectionSpace. This is an interim release Changes from the previous mapping are in blue.

Person

Field mapping

SQL

View: cspace_person_2_0

Notes

  •  Temporarily excluding shortIdentifers 0, 60397 because they are not unique in the view. They are associated with multiple Orgs (which are going into the Groups field). Break into a repeating group and merge in separately.

Talend/ETL

bgimportperson1 0.2: Revising earlier talend job (based on 1.8) to change refname format, remove ampersand handling, and include collectionspace_core:uri element

Organization

Field mapping

SQL

View:

Notes

  • Skipping for now
  • Multiple records per org because many orgs have multiple contacts
  • Probably need to extend Org to have repeating PersonContacts information

Talend/ETL

Taxonomy

Field mapping

SQL

View: cspace_taxonomy

Notes

  • Not getting author shortIDs in view, just names. For now putting them in Taxon Notes. Adjust view or get them separately in SQL.
  • Do we need to adjust the format of scientific names?
  • Lots of other Sage data for Names to add

Talend/ETL

bgtaxonimport2 0.2

Accessions

Field mapping

SQL

View: xdb_accession_qv

Notes

  • Need dropdown values for form
  • Dead Flag Using expression (row1.dead_flag==0) ? "no" : "yes" but should be "alive" "dead"
  • Need to add dead_date, structured
  • Need to add accession date
  • Need hybrid flag
  • Need requested by
  • Is Source from a vocab? Need field (wait for Donor block being added for UCJEPS)
  • Add More Notes to Accession Notes
  • Geog location is often Earth so we know geography needs some work
  • For now, putting Source in Brief Description along with Material type
  • Put Accession Notes in Comments: Rename field
  • What is Plant Status?
  • What is Data Quality?
  • record_status drop down "1"
  • Do we need to fix the form of taxonomic names?

Talend/ETL

bgaccessionimport 0.1

Garden Locations

Field mapping

SQL

View: cspace_location

Notes

  • Type (storage location type) field expects a dynamic controlled list with URN-formatted values. For now just loading the raw data fields from the locationType field. Nothing will display.

Talend/ETL

bgimportgardenlocation 0.1

Plant Locations (Movement and Location procedure)

Field mapping

SQL

View: xdb_plant_location_qv

Notes

  • xdb_plant_location_qv.action_code map to Reason For Move, and rename? Right now it's a numeric code in Sage. What are decode values? Ignoring for now.

Talend/ETL

bgplantlocations 0.1

Relationships between Accessions (Cataloging) and Plant Locations (Movement)

Field mapping

SQL

Text file: plantlocids.txt, produced in Talend bgplantlocations job

Notes

  • Text file has CSID from Plant Location record and CSID from Accession record

Talend/ETL

bgplantlocations 0.1

  • Talend job creates two XML outputs (cataloging-to-movement, movement-to-cataloging)

Propagation

Field mapping

SQL

Control lists:

  • Pot Type: select pot_type from pot_type_code order by pot_type;
  • Pot Size: select pot_size from pot_size_code order by pot_size;
  • Treatment Type: select treatment_type_name from treatment_type order by treatment_type_name;

Notes

Prop Type:

  • lookup table: prop_type_code
  • propagation.prop_type = pro_type_code.prop_type
  • 21 values in prop_type_code

Reason for Prop: propagation.purpose

  • database has a rule that limits the values to: 'class use', 'research', 'garden collection',
    'distribution', 'conservation'
  • 5 distinct values in propagation.purpose

Extra Seeds: propagation.extra_seed_flag

  • boolean (0,1)

Activity Type:

  • lookup table: activity_type_code
  • activity_type_code.activity_type where growth_history.activity_type_id = activity_type_code.activity_type_id
  • 24 values in activity_type_code

Medium:

  • lookup table: medium
  • growth_history.medium_name = medium.medium_name
  • 1302 values in medium

Pot Type:

  • lookup table: pot_type
  • growth_history.pot_type = pot_type_code.pot_type
  • 35 values in pot_type_code

Pot Size:

  • lookup table: pot_size_code
  • pot_size_code.pot_size where growth_history.pot_size_id = pot_size_code.pot_size_id
  • 69 values in pot_size_code

Treatment Type:

  • lookup table: treatment_type
  • treatment_type.treatment_type_name where treatment.treatment_type_id = treatment_type.treatment_type_id
  • 91 values in treatment_type

Concentration: treatment.concentration

  • varchar(10) field
  • no rules or constraints on this field
  • 80 distinct values in treatment.concentration


  • No labels