Archive and preserve your data

Archiving and preserving your research data involves more than keeping your data files on your lab server. In addition to capturing information about your data, you should consider the following:


Choosing File Formats for Preservation

The file format in which you keep your data is a primary factor in whether you and others will be able to use the data in the future. Because technology changes continually, researchers should plan for both hardware and software obsolescence. How will your data be read if the software used to produce them becomes unavailable?

Formats more likely to be accessible in the future are:

  • Non-proprietary
  • Open, documented standards 
  • Commonly used by a research community
  • Standard representations (ASCII, Unicode)
  • Unencrypted
  • Uncompressed (if you need to compress files to conserve space, limit compression to your third backup copy)

Consider migrating your data into a format with the above characteristics, in addition to keeping a copy in the original software format.

Examples of preferred format choices:

  • PDF/A, not Word
  • ASCII or CSV, not Excel
  • MPEG-4, not QuickTime
  • TIFF or JPEG2000, not GIF or JPG
  • XML or RDF, not RDBMS
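
As a rough illustration of the examples above, the following Python sketch flags files whose extensions suggest a less-preferred format and names the alternative from the list. The function name `suggest_migrations` and the extension mapping (for example, treating `.mov` as QuickTime) are assumptions made for this example, not part of any policy or tool; adjust the mapping to the formats your research community actually uses.

```python
from pathlib import PurePath

# Extension-to-suggestion map built from the examples in the list above.
# The specific extensions are assumptions chosen for illustration.
PREFERRED_ALTERNATIVES = {
    ".doc": "PDF/A",
    ".docx": "PDF/A",
    ".xls": "ASCII or CSV",
    ".xlsx": "ASCII or CSV",
    ".mov": "MPEG-4",
    ".gif": "TIFF or JPEG2000",
    ".jpg": "TIFF or JPEG2000",
    ".jpeg": "TIFF or JPEG2000",
}


def suggest_migrations(filenames):
    """Return {filename: suggested format} for files in less-preferred formats."""
    suggestions = {}
    for name in filenames:
        # Compare extensions case-insensitively so "PLATE.JPG" is also caught.
        ext = PurePath(name).suffix.lower()
        if ext in PREFERRED_ALTERNATIVES:
            suggestions[name] = PREFERRED_ALTERNATIVES[ext]
    return suggestions


if __name__ == "__main__":
    files = ["results.xlsx", "methods.docx", "readings.csv", "plate_01.JPG"]
    for name, target in suggest_migrations(files).items():
        print(f"{name}: consider keeping a copy as {target}")
```

The sketch only reports candidates; it does not convert anything. Keep the copy in the original software format alongside any migrated copy, as recommended above.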

Research Data Retention Periods

All research data collected or generated as part of a government-sponsored program should be retained for a minimum of 3 years from the end of the project, in order to comply with potential FOIA requests. If you collect data about humans, animals, or agricultural products, you must retain your data in accordance with the Georgia Board of Regents Records Retention Schedule. The policy specifies the retention periods for many research-related records, in addition to certain types of research data, and should be reviewed by everyone involved in a research project.

Type of Research Data | Retention Period (data from projects that are not of major significance) | Retention Period (data from projects of national or international significance, interest, or controversy)
Animal Care and Use | 3 years | Permanent
Human Subjects | 70 years (if there are potential long-term effects to human subjects) | Permanent
Agricultural | 70 years (if project has potential long-term environmental effects) | Permanent
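
To make the arithmetic concrete, here is a minimal Python sketch that turns a project end date and a retention period in years into the earliest date the data could be discarded. It assumes that every retention period is counted from the end of the project, which the text above states only for the 3-year minimum; the function names are hypothetical, and `None` is used here to stand for permanent retention. Always confirm the applicable period against the retention schedule itself.

```python
from datetime import date


def add_years(start, years):
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # Only raised when start is Feb 29 and the target year is not a leap year.
        return start.replace(year=start.year + years, day=28)


def earliest_disposal_date(project_end, retention_years):
    """Return the first date data may be discarded, or None for permanent retention.

    Assumption: the retention period is counted from the project end date.
    """
    if retention_years is None:
        return None
    return add_years(project_end, retention_years)


if __name__ == "__main__":
    # 3-year period, e.g. the government-sponsored minimum.
    print(earliest_disposal_date(date(2020, 6, 30), 3))
    # 70-year period from the table above.
    print(earliest_disposal_date(date(2020, 6, 30), 70))
    # Permanent retention: there is no disposal date.
    print(earliest_disposal_date(date(2020, 6, 30), None))
```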

Depositing your data in a research data repository will facilitate its discovery and preservation. 

Science and Engineering Data Repositories

Discipline/Domain Repository
Astronomy
Atmospheric Science
Biology
Chemistry
Earth Science
Earthquake Engineering/Seismology
Nanotechnology
Oceanography
Space Science
Social Sciences

For a more complete list of data repositories, see DataBib, a searchable catalog of research data repositories.

Data that have been created at Georgia Tech or by GT researchers, in any discipline, can be archived in SMARTech, the GT repository created to capture, distribute, and preserve digital products of faculty and researchers. Authors can archive their digital works in a variety of formats, including datasets. For more information on how to deposit data into SMARTech, please review the submission guidelines or contact us.

Factors for Evaluating Data Repositories

While choosing to deposit your data into a repository is a great decision for preservation and access, not all data repositories are alike. When deciding where to deposit your research data, there are several factors to consider.

  • How is the repository sustained? What is its business model? Is this a recently established repository or has it been around for a while?
  • Is there evidence of an explicit institutional commitment to preservation?
  • What is the repository's preservation policy or plan?
  • Has the repository worked to ensure compliance with the OAIS Reference Model (ISO 14721), for example through an audit against the Trustworthy Repositories Audit & Certification (TRAC) checklist or ISO 16363?
  • Who is the audience for the repository?

If you have questions about a repository and whether it is a suitable home for your research data, contact the repository staff and ask how they will preserve and disseminate your data. In many cases, the repository will want to know in advance if you plan to archive your data with them, and they will appreciate hearing from you. Additionally, they may be able to help with your data management plan.