
Research Data Management: Overview

Resources for learning about best practices in research data management across a variety of disciplines.

Back up your data!

Keep your data safe 

Know your Risk! 

Yale IT provides in-depth guidance for assessing your data risk level, along with approved services to keep your research data safe.

For pros and cons of data storage options: Ruggiero, Paul and Matthew A. Heckathorn. Data Backup Options. Carnegie Mellon University for US-CERT, 2012.

Quick data management checklist

1. always keep original data

2. back up regularly (automate this if at all possible)

3. document your data thoroughly (metadata, data dictionary)

4. name and organize files according to a schema

5. use version control

6. secure the data appropriately

7. cite any secondary data you use

8. consider your long-term plan: reuse? sharing? what to keep? where to store?
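
Items 1 and 2 of the checklist (keep the original, back up regularly) can be automated with a short script. The following is a minimal sketch, not an approved Yale backup service: the function names `sha256_of` and `back_up` and the timestamped folder layout are illustrative assumptions. It copies a file into a new dated folder and confirms with a SHA-256 checksum that the copy matches the original.

```python
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, backup_root: Path) -> Path:
    """Copy `source` into a timestamped folder (hypothetical layout) and verify it."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target_dir = backup_root / stamp
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    shutil.copy2(source, target)  # the original file is left untouched
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} does not match the original")
    return target
```

A script like this could be run on a schedule (for example with cron or Windows Task Scheduler) so that backups do not depend on anyone remembering to make them. For sensitive data, the backup location should match the risk level of the data.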


Public Access to Federally Funded Research

Yale is a member of Dryad!

Dryad is an open-source research data curation and publication platform that makes data publishing easy for researchers. The Dryad platform accepts data from any discipline. Because Yale is an institutional member, Yale researchers can deposit their data free of charge, with no limit on the number of datasets deposited.

Highlights of the platform:

  • In addition to supporting datasets as part of a journal submission, Dryad now also supports datasets being submitted independently
  • Data can be uploaded from cloud storage or lab servers 
  • Datasets can be as large as 300GB 
  • Standardized data usage and citation statistics are updated and displayed for each published dataset 
  • Data can be submitted and downloaded through Dryad's REST APIs
  • All datasets are indexed by Web of Science, Scopus, and Google Dataset Search for increased discoverability
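
To give a sense of what working with the REST APIs mentioned in the list above might look like, here is a small sketch that only builds request URLs. It is an illustration under assumptions, not Dryad's documented client: the base URL `https://datadryad.org/api/v2`, the `search` endpoint with `q` and `per_page` parameters, and the `datasets/<encoded DOI>` path should all be checked against Dryad's current API documentation, and the DOI shown in the usage note is a made-up placeholder.

```python
from urllib.parse import quote, urlencode

# Assumed base URL for Dryad's public API; confirm in Dryad's API documentation.
DRYAD_API = "https://datadryad.org/api/v2"


def search_url(query: str, per_page: int = 10) -> str:
    """Build a keyword-search URL (assumed `search` endpoint and parameters)."""
    return f"{DRYAD_API}/search?" + urlencode({"q": query, "per_page": per_page})


def dataset_url(doi: str) -> str:
    """Build a URL for one dataset, percent-encoding the DOI as a single path segment."""
    return f"{DRYAD_API}/datasets/" + quote(f"doi:{doi}", safe="")
```

For example, `search_url("water quality")` yields a URL ending in `search?q=water+quality&per_page=10`, and `dataset_url("10.5061/dryad.example")` (a hypothetical DOI) encodes the colon and slash so the DOI travels as one path segment. The resulting URLs could then be fetched with any HTTP client.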

Dryad is committed to supporting the changing needs of research: datasets can be submitted and published at any point in the research process, with full support for versioning and fields for notes, methods, and vocabularies. While Dryad accepts all research data, the platform is intended for complete, reusable, low-risk, and open research datasets. For information on Dryad’s guidelines for human subjects data, see

Getting started with Dryad is easy. The platform uses ORCID as its primary login method, and an ORCID iD is required to deposit data. You can connect your ORCID iD to the Yale institutional membership using your NetID here. Don’t have an ORCID iD? Registration with ORCID is simple and quick. Anyone can browse available datasets using the Explore Data link.

Data Librarian

Barbara Esty
Marx Science and Social Science Library
219 Prospect St.
office: C42

Data Use Agreements

What is research data?

Research data is loosely defined as information collected, observed, or created for purposes of analysis to produce original research.

This includes observational variables like rainfall, wind speed, water quality, or survey data; simulated data from earthquake models; experimental data from lab instruments; and derived or compiled data for text mining or testing algorithms. Research data can take almost any digital file format (video, text, photographs, numbers), so managing it effectively can be a challenge.

Research Data Support Services

For help finding, using, managing, or archiving your research data, contact Research Data Support Services.

Why is managing research data important?

Good data management:

  • ensures integrity of data
  • ensures that data remains findable and usable after grad students leave projects over the years
  • makes the data of a project readily understandable to people outside the project
  • enables the sharing of data within and across disciplines
  • makes it easier to archive and preserve data in the long term
  • encourages data citation to increase the impact of the research