Preserve

Ensure that your data will be accessible long after your project is over. Think about a long-term plan from the outset of your project so that you can set aside enough time and resources to carry it out.

What should I keep?

Be selective about what data you plan to retain, as every file carries long-term storage and maintenance costs. It’s a good idea to:

  • keep anything irreproducible, such as observations specific to a particular time and place,
  • retain results that are tied to a specific publication or presentation,
  • discard intermediate tests or failed experiments at the end of a project.

How long should I keep it?

Check with your funding agency to find out if there is a specific policy that spells out a data retention period. For publicly funded research in the US, this is often a minimum of three years. It is better to aim for even longer, if possible, in case you or someone else needs the data later on. Five to ten years is a good rule of thumb.

Keep files readable

Making sure your data remain accessible for the long term is a big challenge, especially since technology changes so quickly. Choosing the right file formats can help avoid obsolescence. Use formats that are:

  • Non-proprietary, open, documented standards (e.g., .tif, .txt, .csv, .pdf)
  • Used commonly in your research community
  • Encoded with standard characters (e.g., ASCII, UTF-8)
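
As a rough illustration of the criteria above, the following Python sketch scans a project folder and flags files that may be at risk. It is only a starting point under stated assumptions: the extension lists simply repeat the examples given above, the function names are hypothetical, and an extension check cannot confirm that a file's contents actually follow the format.

```python
from pathlib import Path

# Assumed allow-list, taken from the example formats above; adjust for your field.
OPEN_EXTENSIONS = {".tif", ".txt", ".csv", ".pdf"}
# Plain-text formats whose character encoding is worth checking.
TEXT_EXTENSIONS = {".txt", ".csv"}


def check_file(path):
    """Return a list of preservation warnings for a single file."""
    path = Path(path)
    warnings = []
    ext = path.suffix.lower()
    if ext not in OPEN_EXTENSIONS:
        label = ext or "(none)"
        warnings.append(f"{path.name}: extension {label} is not on the open-format list")
    if ext in TEXT_EXTENSIONS:
        try:
            # ASCII is a subset of UTF-8, so ASCII files pass this check too.
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            warnings.append(f"{path.name}: not valid UTF-8")
    return warnings


def check_folder(folder):
    """Collect warnings for every file under a folder, including subfolders."""
    warnings = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            warnings.extend(check_file(path))
    return warnings
```

Calling check_folder on a project directory returns one message per problem found, which can serve as a to-do list for converting files before deposit. Reading whole files into memory keeps the sketch short; very large files would call for reading in chunks instead.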

Tools and resources

  • Re3data.org is a global registry of data repositories organized by academic discipline. A rating system and faceted browsing can help you find the best place to deposit your data.
  • Texas ScholarWorks (TSW) is UT’s web-accessible DSpace repository, managed by UT Libraries. A free and secure place for archiving and sharing faculty research output, it provides persistent URLs, searchable metadata, full-text indexing and long-term preservation.
  • The Texas Data Repository (TDR) is now open! Hosted by the Texas Digital Library, and based on Harvard University’s Dataverse platform, TDR will serve as a long-term solution for the preservation and dissemination of UT’s research data.