Describe your data

How can we help?

The UC San Diego Library’s Research Data Curation Program (RDCP) and Metadata Services offer metadata consultation and services for researchers. We can help you understand metadata standards and options so that your data aligns with the practices of others in your discipline, and we can assist with metadata creation. We can also review your metadata to ensure its usability and quality.

Why should you care about metadata?

When thinking about sharing your research data and making it discoverable, ask yourself these questions:

  • Can anyone find your data now?
  • Will anyone find your data in the near future?
  • Will anyone know what to do with your data?
  • Are you satisfying funder mandates for data sharing?

Metadata helps answer these questions. Describing data aids discovery, sharing, reuse, and archiving. Metadata is often the only information a secondary researcher, or even a machine, relies on to locate, contextualize, and understand the data; good metadata practices are therefore highly valuable to a research data project.

Getting started

You might be surprised to learn that a lot of information about your data (that is, metadata) already exists in your notebooks or files. What you need to get started, then, is to record this information at or around the time the data is collected. This ensures that the metadata is connected to and lives with the data. Your initial metadata might not be very structured or organized, but what is needed most at this stage is thoroughness. Metadata cleanup is a natural part of the metadata creation process. The main goal here is to capture information that will make your data easier to find and understand in the future, whether by you, by others, or by computers.

Types of metadata

There are many types of metadata. These categories overlap and shift over time, but in general they include:

  • Descriptive metadata - The most common and familiar kind of metadata, which describes an object. Examples include a title, a date, and a researcher's name.
  • Rights metadata - Provides information about who (if anyone) owns the data, and outlines how and by whom the data can be accessed and used.
  • Structural metadata - Information about how complex objects are assembled, which helps with structuring objects logically and with displaying them in a user interface.
  • Technical metadata - Information about technical aspects of the data such as format-specific technical characteristics, how an object was created, and storage and location information.

Getting organized

There are many ways to organize your metadata, but it is important to remain consistent to avoid gaps and conflicting sources of metadata. You can use one or a combination of the following methods:

  • Paper or digital notebooks, including lab notebooks - Digital notebooks have the advantage of allowing markup and being searchable.
  • Text files that live in folders with the data - You can create data dictionaries, but even basic README files can contain a good deal of technical documentation.
  • Tabular files - You can add an extra sheet or column that spells out variable names, definitions, and exactly what the contents of columns represent.
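As a rough illustration of the data dictionary idea above, the following Python sketch keeps one entry per column and checks that every column in a data file is documented. The column names, definitions, and helper functions are hypothetical placeholders, not part of any RDCP tool or required format.

```python
import csv
import io

# Hypothetical data dictionary: one entry per column in the data file.
# Replace these variables and definitions with those from your own project.
DATA_DICTIONARY = [
    {"variable": "site_id", "definition": "Sampling site identifier",
     "units": "", "allowed_values": "S01-S20"},
    {"variable": "temp_c", "definition": "Water temperature at 1 m depth",
     "units": "degrees Celsius", "allowed_values": "-2 to 40"},
    {"variable": "collected", "definition": "Date the sample was collected",
     "units": "", "allowed_values": "ISO 8601 date (YYYY-MM-DD)"},
]


def dictionary_to_csv(entries):
    """Serialize the data dictionary as CSV text so it can be saved beside the data."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(entries[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def undocumented_columns(data_header, entries):
    """Return the data columns that have no entry in the data dictionary."""
    documented = {entry["variable"] for entry in entries}
    return [column for column in data_header if column not in documented]


# Example: a data file whose header includes a column nobody has described yet.
print(undocumented_columns(["site_id", "temp_c", "salinity"], DATA_DICTIONARY))
```

The CSV text returned by `dictionary_to_csv` could be saved in the same folder as the data, or pasted into an extra sheet of a spreadsheet, so that the definitions travel with the file.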

Putting it all together

Now that you have metadata and ways to organize it, what remains is to structure it so that it is usable (even machine-readable) and to define it, so that it supports reuse and discoverability and can be harvested for data analysis and visualization. As with metadata organization, there are many ways to add structure to metadata. One approach is to create a template: for each data file, you fill out the same set of elements or fields. As with tabular files, though, an exact explanation of the contents of each field should exist so that a person other than yourself can understand how and why the values were recorded.


An example of creating basic metadata


This template would not have to employ a metadata standard or schema, but it could easily accommodate one. You might find that creating a basic metadata template will conform roughly to creating a Simple Dublin Core template. It's also likely you will have fields that only seem to apply to your research, and are not referenced in metadata standards. This is perfectly fine metadata to keep, as long as the content is as specific (or granular) as possible, and as long as you preserve this 'schema' itself.
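One possible way to sketch such a template in Python is shown below. It uses the fifteen element names of Simple Dublin Core and keeps research-specific fields in a separate section. The `local` key, the choice of required fields, and the example values are illustrative assumptions rather than a prescribed format.

```python
import json

# The fifteen elements of Simple Dublin Core.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def new_record(local_fields=()):
    """Return an empty template: every Dublin Core element plus any local fields."""
    record = {element: "" for element in DUBLIN_CORE_ELEMENTS}
    # Fields specific to your research live under "local" (a hypothetical
    # convention) so they stay separate from the standard elements.
    record["local"] = {name: "" for name in local_fields}
    return record


def missing_fields(record, required=("title", "creator", "date")):
    """List required fields that are still empty. The required set is an assumption."""
    return [name for name in required if not record.get(name)]


# Fill out one template per data file (all values here are made up).
record = new_record(local_fields=["instrument_model"])
record["title"] = "Example temperature readings"
record["creator"] = "Example Researcher"
record["local"]["instrument_model"] = "Example Logger 100"

print(missing_fields(record))
print(json.dumps(record, indent=2))
```

Saving each record as a JSON file next to its data file, together with a short note defining every field, is one way to preserve the 'schema' itself along with the values.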

At the end of this process, the metadata will be ready either for self-deposit of the data to a repository, or to be supplied along with your data to metadata professionals, who will then align it with a specific repository's data model. The latter is called a 'mediated' approach to digital collections or repositories. Eventually, both options will be available for research data in the UC San Diego Digital Collections, but currently the mediated approach is the default for metadata creation.

Want to learn more?

If you would like to learn even more about metadata, see our resource on metadata schemas and standards. There is also a resource for those curious about metadata in the UC San Diego Digital Collections.

Contact the Research Data Curation Program with questions about our services or to provide feedback on our new website.