By Andree Ramsey
One of the most important aspects of any project is the requirement to submit data to data repositories such as the National Oceanographic Data Center (NODC), National Climatic Data Center (NCDC), National Geophysical Data Center (NGDC), National Snow & Ice Data Center (NSIDC), Advanced Cooperative Arctic Data and Information Service (ACADIS), Carbon Dioxide Information Analysis Center (CDIAC), Biological & Chemical Oceanography Data Management Office (BCO-DMO), and the list goes on. This can be a daunting task for anyone submitting data for the first time. The instructions given by the agencies can be baffling and overwhelming.
Having recently gone through the process of submitting Physical Oceanographic Data to NODC, successfully I might add, I thought I would share what I have learned. My intention is to make the process less intimidating for other researchers submitting data for the first time. Please keep in mind that my experience has only been with submitting Physical Oceanographic Data to NODC thus far; however, I am confident that these thoughts will be applicable to the other data repositories as well.
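For data submissions, NODC:
- Requires detailed metadata
- Prefers netCDF format (Network Common Data Form)
- Accepts ASCII format provided you contact a data officer prior to submission
- Requests that metadata format follow the guidelines from the NetCDF Climate and Forecast (CF) Metadata Conventions and the Attribute Convention for Data Discovery (ACDD)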
Step 1: Determine Feature Type for data set
The first step is to categorize the data into a Feature Type (summarized below). NODC provides a decision tree to help the submitter determine which Feature Type is appropriate for their dataset.
- Point – single data point having no implied coordinate relationship to other points
- Time Series – a series of data points at the same spatial location with monotonically increasing times
- Trajectory – a series of data points along a path through space with monotonically increasing times
- Profile – an ordered set of data points along a vertical line at a fixed horizontal position and fixed time
- Time Series Profile – a series of profile features at the same horizontal position with monotonically increasing times
- Trajectory Profile – a series of profile features located at points ordered along a trajectory
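The six definitions above can be read as three yes/no questions: does each observation include a vertical profile, do observations repeat in time, and does the platform move? The sketch below encodes that reading as a small Python helper. It is an illustration only and not a substitute for the NODC decision tree; the function name `guess_feature_type` and its three flags are hypothetical, and the returned strings use the CF `featureType` spellings.

```python
def guess_feature_type(has_profile: bool, repeats_in_time: bool, moves: bool) -> str:
    """Map three yes/no answers to a CF featureType name (simplified sketch)."""
    if has_profile:
        if moves:
            return "trajectoryProfile"
        if repeats_in_time:
            return "timeSeriesProfile"
        return "profile"
    if moves:
        return "trajectory"
    if repeats_in_time:
        return "timeSeries"
    return "point"


# A moored thermistor chain: vertical profiles repeated at one fixed position.
print(guess_feature_type(has_profile=True, repeats_in_time=True, moves=False))
# A ship's underway surface sensor: one value per time along the cruise track.
print(guess_feature_type(has_profile=False, repeats_in_time=True, moves=True))
```

Real datasets can be ambiguous (for example, a single CTD cast from a drifting ship), so the decision tree remains the place to settle borderline cases.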
Step 2: Organize data using the Feature Type templates
Once the Feature Type has been determined, the submitter should review the Feature Type Templates and Examples Table on the NODC website. These templates provide required and recommended variable attributes for the type of data being submitted. Below is a list of example variable attributes:
- Time – units, standard name, calendar, axis
- Latitude and Longitude – standard name, units, axis, fill value
- Z (depth, pressure, altitude, etc.) – standard name, units, direction of positive (up or down)
- Geophysical variable – standard name, units, scale factor, offset, fill value, coordinates (time, latitude, longitude, z)
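To make the list above concrete, the sketch below writes out one plausible set of attributes for each kind of variable as plain Python dictionaries and checks them against a short checklist. The attribute names use CF spellings (`standard_name`, `units`, `axis`, `positive`, `scale_factor`, `add_offset`, `_FillValue`, `coordinates`). The variable names, the fill value, and the `CHECKLIST` itself are assumptions made for illustration; the NODC template for the chosen Feature Type is the authority on what is required and what is recommended.

```python
# Illustrative attributes for each kind of variable; all values are placeholders.
VARIABLES = {
    "time": {"standard_name": "time",
             "units": "seconds since 1970-01-01 00:00:00 UTC",
             "calendar": "gregorian", "axis": "T"},
    "lat": {"standard_name": "latitude", "units": "degrees_north",
            "axis": "Y", "_FillValue": -9999.0},
    "lon": {"standard_name": "longitude", "units": "degrees_east",
            "axis": "X", "_FillValue": -9999.0},
    "z": {"standard_name": "depth", "units": "m", "positive": "down"},
    "temp": {"standard_name": "sea_water_temperature",
             "units": "degree_Celsius",
             "scale_factor": 1.0, "add_offset": 0.0,
             "_FillValue": -9999.0,
             "coordinates": "time lat lon z"},
}

# Hypothetical checklist that mirrors the bullet list above.
CHECKLIST = {
    "time": ["units", "standard_name", "calendar", "axis"],
    "lat": ["standard_name", "units", "axis", "_FillValue"],
    "lon": ["standard_name", "units", "axis", "_FillValue"],
    "z": ["standard_name", "units", "positive"],
    "temp": ["standard_name", "units", "scale_factor", "add_offset",
             "_FillValue", "coordinates"],
}


def missing_attributes(variables, checklist):
    """Return {variable: [missing attribute names]} for incomplete variables."""
    report = {}
    for name, wanted in checklist.items():
        present = variables.get(name, {})
        missing = [attr for attr in wanted if attr not in present]
        if missing:
            report[name] = missing
    return report


# An empty dictionary means every listed attribute is present.
print(missing_attributes(VARIABLES, CHECKLIST))
```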
Step 3: Determine Metadata required and include with data set
The most important part of any dataset is the metadata, the data that describes the data. A successful metadata file will be able to answer any questions the user may have about the dataset. The Standard Attributes and Guidance Table lists all metadata attributes, stating which are required and which are recommended. Examples of useful metadata are:
- Summary or abstract
- Instrument and data type
- Sea name
- Time coverage
- Geospatial location
- Principal Investigator
- Feature Type
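In a netCDF file, items like these are stored as global attributes. The sketch below pairs each bullet with an attribute name, using ACDD, CF, and NODC template spellings as best understood here, and reports any that are still empty. Every value is a made-up placeholder, the helper `empty_attributes` is hypothetical, and which names are actually required is not decided by this code; the Standard Attributes and Guidance Table is the authority.

```python
# Placeholder global attributes; replace every value with real information.
GLOBAL_ATTRS = {
    "summary": "Hourly temperature from one mooring (placeholder text).",
    "instrument": "Temperature logger (placeholder)",
    "cdm_data_type": "Station",
    "sea_name": "",  # left empty on purpose to show the check below
    "time_coverage_start": "2012-01-01T00:00:00Z",
    "time_coverage_end": "2012-12-31T23:00:00Z",
    "geospatial_lat_min": 30.0,
    "geospatial_lat_max": 30.0,
    "geospatial_lon_min": -80.0,
    "geospatial_lon_max": -80.0,
    "creator_name": "Principal Investigator name (placeholder)",
    "featureType": "timeSeries",
}


def empty_attributes(attrs):
    """Return the names whose value is None or an empty string."""
    return [name for name, value in attrs.items() if value in (None, "")]


# Lists the attributes that still need to be filled in before submission.
print(empty_attributes(GLOBAL_ATTRS))
```

Keeping the attributes in one dictionary makes it easy to review them with a colleague before they are written into the file.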
Step 4: Create netCDF file
As stated earlier, NODC prefers data files to be in netCDF. netCDF files are useful because they are self-describing (they carry their own metadata), can be read on all platforms, and follow a standardized format. Unidata provides many programs that read and write netCDF files. Newer releases of MATLAB also include a library of netCDF functions that can read and write netCDF files.
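One low-dependency route to a netCDF file is to write the file's text form, CDL, and convert it with Unidata's `ncgen` utility (for example, `ncgen -o mooring.nc mooring.cdl`). The sketch below builds a tiny CDL document for a single temperature time series using only the Python standard library. The dataset name, variable names, and values are hypothetical, and the helper `to_cdl` is written for this illustration rather than taken from any library. It handles one dimension and double-precision variables only, and it omits the station and position variables that a complete time series file would need.

```python
def to_cdl(name, times, temps, fill=-9999.0):
    """Render a one-dimensional time series as CDL text (simplified sketch)."""
    def join(values):
        # repr() of a float always includes a decimal point or exponent,
        # which CDL reads as a double-precision constant.
        return ", ".join(repr(float(v)) for v in values)

    lines = [
        f"netcdf {name} {{",
        "dimensions:",
        f"    time = {len(times)} ;",
        "variables:",
        "    double time(time) ;",
        '        time:standard_name = "time" ;',
        '        time:units = "seconds since 1970-01-01 00:00:00 UTC" ;',
        '        time:axis = "T" ;',
        "    double temp(time) ;",
        '        temp:standard_name = "sea_water_temperature" ;',
        '        temp:units = "degree_Celsius" ;',
        f"        temp:_FillValue = {float(fill)!r} ;",
        "",
        "// global attributes:",
        '        :Conventions = "CF-1.6" ;',
        '        :featureType = "timeSeries" ;',
        "data:",
        f"    time = {join(times)} ;",
        f"    temp = {join(temps)} ;",
        "}",
    ]
    return "\n".join(lines) + "\n"


# Three hourly samples; the last temperature is missing and uses the fill value.
cdl = to_cdl("mooring", [0, 3600, 7200], [10.1, 10.2, -9999.0])
print(cdl)
```

Writing CDL first also gives a human-readable copy of the file layout that can be checked by eye before conversion.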
Step 5: Data Report (optional, but extremely useful)
In addition to the metadata, a data report is extremely useful to include with the dataset being submitted. A data report describes the data collection and processing in much greater detail, including instrument settings, specific issues during deployment, recovery, and downloading the data. A data report also provides a well-organized overview of the data collected including lists of all instruments on a single mooring, all moorings deployed during the experiment, and locations of moorings with respect to other moorings. A link to the data report should be listed in the metadata references attribute.
Step 6: Fill out form and submit data
After the data files have been formatted, the metadata has been added, and the data is in netCDF, the data submission form can be filled out. It is now time to contact NODC and let them know that you would like to submit a dataset!
Submitting data to NODC for the first time can be a bit overwhelming. It takes a bit of patience to understand what CF and ACDD are and how they relate to the data, and the websites that describe them are not very clear-cut. Now that I have been through the process, they make complete sense. The examples provided for the Feature Types are the output from a netCDF file and are difficult to read. For me, it would have been helpful to have a brief list of steps providing an overview of the submission process, a clear, concise description of the CF and ACDD formats, and examples of Feature Types that can be easily read and understood.
The goal of this post was to provide a brief explanation and overview of the steps to submit data. In future posts, I will discuss CF and ACDD formats in greater detail, provide examples of data formats for each Feature Type, and discuss netCDF, which can also be baffling.
The process for submitting data to the National Data Centers is not overly difficult; it is just tedious and takes a bit of patience. I have tried to provide a clearer overview of the data submission process in the hope of making it a little less complicated for other researchers. When questions do arise, the data officers at NODC respond quickly and are very helpful. It is a pleasure to work with them.
Network Common Data Form (netCDF): www.unidata.ucar.edu/software/netcdf/workshops/2012/nc3model/index.html
NetCDF Climate and Forecast (CF) Metadata Conventions: cfconventions.org
Attribute Convention for Data Discovery (ACDD): wiki.esipfed.org/index.php/Attribute_Convention_for_Data_Discovery_1-1
NODC Decision Tree: www.nodc.noaa.gov/data/formats/netcdf/v1.1/decision_tree_high_res.pdf
Feature Type Templates and Examples: www.nodc.noaa.gov/data/formats/netcdf/v1.1/#templatesexamples
Data submit form: www.nodc.noaa.gov/submit/index.html