Data import

Needed by whom and when

UCJEPS and HAVRC currently receive batches of new objects for their collections.  UCJEPS can currently perform batch imports and needs to retain this functionality.

Question: What existing capability does HAVRC have and need?

Overview

Currently, importing data into CollectionSpace requires a very high level of technical sophistication.  There are probably several methods for importing data, and it is not clear which is the most productive.  Importing controlled vocabularies is probably not the same as importing core collections data, which in turn probably differs substantively from importing procedural or transactional data.  There is also a key distinction between importing data as part of an initial migration and subsequently importing large data sets when new acquisitions come into the collection.

User stories for definition

Please feel free to rewrite these or eliminate them completely!  Then move them to the prioritized headings below.

Data import: UCJEPS receives a large set of specimens (e.g., 1000 plants) from a collector along with a spreadsheet of metadata.  They add bar code numbers, accession numbers, and other required data to the spreadsheet, reformat the information into the required format, and send the file to a programmer, who imports the data using data-loading programs.

Data import: UCJEPS receives a large set of specimens (e.g., 1000 plants) from a collector along with a spreadsheet of metadata.  They add bar code numbers, accession numbers, and other required data to the spreadsheet, reformat the information into the format required by the batch importer, and use the batch importer to load the data into CollectionSpace, which creates the object records.

Data import: UCJEPS receives a large set of specimens (e.g., 1000 plants) from a collector along with a spreadsheet of metadata.  They add bar code numbers, accession numbers, and other required data to the spreadsheet.  A CollectionSpace import tool allows them to map columns from the spreadsheet to CollectionSpace's internal schema.  Then they import the data into CollectionSpace, which creates the object records.
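To make the column-mapping story more concrete, here is a minimal sketch of what mapping spreadsheet columns to target fields could look like.  It is an illustration only: the column headers, the target field names, and the map_rows helper are all hypothetical and are not taken from CollectionSpace's actual schema or from any existing import tool.

```python
import csv
import io

# Hypothetical mapping from spreadsheet column headers to target field names.
# Both sides are illustrative; they are not CollectionSpace's actual schema.
COLUMN_MAP = {
    "Barcode": "barcodeNumber",
    "Accession No.": "accessionNumber",
    "Collector": "collector",
}


def map_rows(csv_text, column_map):
    """Yield one dict per spreadsheet row, keyed by target field name.

    Columns that are not in column_map are ignored. A mapped column that is
    absent from the spreadsheet raises ValueError before any row is produced.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []
    missing = [col for col in column_map if col not in headers]
    if missing:
        raise ValueError(f"spreadsheet is missing mapped columns: {missing}")
    for row in reader:
        yield {target: row[source].strip() for source, target in column_map.items()}


# Made-up sample data standing in for an exported spreadsheet.
sample = (
    "Barcode,Accession No.,Collector,Notes\n"
    "UC0001,2011.5,J. Smith,pressed\n"
)
records = list(map_rows(sample, COLUMN_MAP))
```

Each resulting dict would then be handed to whatever step actually creates the object record; that step is deliberately left out here because it depends on the import method chosen.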

Import of researcher data: PAHMA staff member is sent a spreadsheet of taxonomic identifications (or cultural identifications, or material identifications, or ...) by a researcher who made those identifications. PAHMA staff member checks the museum numbers and adds researcher name, date(s) of research visit, and any other relevant contextual information to the objects on the spreadsheet. Then they import the data into CollectionSpace, which adds the (clearly attributed) identifications to the object records.

Import of conservator data: PAHMA conservator compiles a spreadsheet of condition statements (or material identifications, or treatment statements, or ...). PAHMA conservator double-checks the museum numbers and adds their name, date(s) of statements/IDs, and any other contextual information to the objects on the spreadsheet. Then they import the data into CollectionSpace, which adds the attributed statements/IDs to the object records.

Import of PAHMA and/or researcher photographic data (for linking images and other media to object records): PAHMA staff member prepares a spreadsheet of media filenames and filepaths, and the museum numbers to which they should be linked. PAHMA staff member double-checks the museum numbers and adds the name of the photographer (or other creator and/or digitizer), the date(s) of creation/digitization, a caption/name/description, and any other contextual information to the objects on the spreadsheet. Then they import the data into CollectionSpace, which uses that information to create attributed media records and to link those records to the appropriate object records.
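As a rough illustration of the media-linking story, the sketch below groups spreadsheet rows by museum number so that each object ends up with a list of attributed media entries.  The column names (museum_number, filepath, photographer, date, caption), the sample values, and the group_media_by_object helper are assumptions made for this example; the real spreadsheet layout and the way CollectionSpace would create and link media records are not specified here.

```python
import csv
import io
from collections import defaultdict


def group_media_by_object(csv_text):
    """Return {museum_number: [media entry, ...]} from a media spreadsheet.

    The column names used here are hypothetical placeholders.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    grouped = defaultdict(list)
    for row in reader:
        grouped[row["museum_number"].strip()].append(
            {
                "filepath": row["filepath"].strip(),
                "photographer": row["photographer"].strip(),
                "date": row["date"].strip(),
                "caption": row["caption"].strip(),
            }
        )
    return dict(grouped)


# Made-up sample rows: two images of one object, one image of another.
media_sample = (
    "museum_number,filepath,photographer,date,caption\n"
    "1-1000,/media/1-1000_a.jpg,A. Photographer,2011-03-01,Front view\n"
    "1-1000,/media/1-1000_b.jpg,A. Photographer,2011-03-01,Back view\n"
    "1-2000,/media/1-2000.jpg,B. Researcher,2011-04-15,Overall view\n"
)
media_by_object = group_media_by_object(media_sample)
```

Grouping first makes it easy to check museum numbers against existing objects before any media records are created, which matches the double-checking step in the story.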

Prioritization of user stories

As definitions and priorities are clarified, the user stories above should be moved into relative order below.

Must have for 1.x-MUSEUM (when they go live in system)

Placeholder for required functionality.  As a general rule, functionality that you need and use now, or for which you have existing data, should go here.  However, this is up to the museum.  We will have to balance requirements against resources and timelines.

Placeholder

MUSEUM could wait six to twelve months

What could wait?  These will be re-prioritized at a later date.

Placeholder

MUSEUM would like to have this eventually

These are nice to have but not a near-term requirement.

Placeholder