Resources for Text and Data Mining: Find Text Data

Finding help

Yale researchers who need help getting started with a humanities project can visit the Digital Humanities Lab during office hours.

For help with statistical analysis or projects in the sciences or social sciences, contact the StatLab at the Marx Science and Social Science Library.

Staff in these locations can work with you and a subject specialist to identify potential resources. 

Click here to browse all text data available through the library. 

Feedback

This is a new guide and still a work in progress. Please send us your feedback and suggestions: researchdata@yale.edu

Textual Data Sources

Yale University Library has negotiated agreements with many vendors to gain full-text access to a large and growing collection of historic newspapers. These agreements include access to collections such as ProQuest's Historic Newspapers.

Please email researchdata@yale.edu to request access to this data or with any questions about Text & Data Mining.

Newspapers

Below is a list of newspapers YUL has licensed for Text & Data Mining. Some of these titles are large collections of historic newspapers; click on the title to see the full list of items included in a collection.

Title Start End Size
19th Century British Library Newspapers: Part I & Part II 1800 1900  
Atlanta Constitution 1868 1935 215.25 GB
Arizona Republican 1890 2007 1.67 TB
Austin American Statesman 1871 1926 65.43 GB
Baltimore Sun 1837 1932 335.93 GB
Boston Globe 1872 1984 2.08 TB
Chicago Defender 1910 1975 494.24 GB
Chicago Tribune 1849 1950 660.41 GB
Christian Science Monitor 1908 1995 267.72 GB
Cincinnati Enquirer 1923 2009 1.47 TB
Confederate Newspapers: A Collection from Florida, Georgia, Tennessee, Virginia, and Alabama 1859 1866  
Dayton Daily News 1898 1922 85.94 GB
Detroit Free Press 1831 1999 1.14 TB
Hartford Courant 1837 1930 403.34 GB
Indianapolis Star 1903 2004 1.09 TB
Los Angeles Times 1881 1950 464.12 GB
Louisville Courier-Journal 1830 2001 1.24 TB
Minneapolis Star Tribune 1867 2001 1.83 TB
Nashville Tennessean 1812 2002 1.05 TB
New York Times 1851 1934 1.29 TB
New York Tribune / Herald Tribune 1841 1962 741.26 GB
Newsday 1940 1990 508.18 GB
Philadelphia Inquirer 1860 2001 1.84 TB
Pittsburgh Post Gazette 1786 2003 1.03 TB
San Francisco Chronicle 1865 1922 157.61 GB
St. Petersburg Times / Tampa Bay Times 1901 2009 1.43 TB
Sunday Times 1822 2006  
The Illustrated London News 1842 2003  
The Times Digital Archive 1785 2006  
Wall Street Journal 1888 1933 3.63 GB
Washington Post 1889 1935 690.64 GB

Yale University Library has negotiated agreements with many vendors to gain full-text access to a large and growing collection of magazines & journals. 

Please email researchdata@yale.edu to request access to this data or with any questions about Text & Data Mining.

Magazines & Journals

Below is a list of periodicals YUL has licensed for Text & Data Mining. Some of these titles are large collections of historic periodicals; click on the title to see the full list of items included in a collection.

Title Start End Size
19th Century UK Periodicals: Empire 1800 1900  
Art & Architecture Archive 1860 2005 1.51 TB
British Periodicals IV 1853 2007 2.47 TB
New Yorker 1925 2017 152.06 GB
Periodicals Archive Online 1802 2000  
The Listener 1929 1991  
Vogue 1892 2013 5.01 TB
Women's Wear Daily 1910 2015 1.26 TB
Women's Magazine Archive 1846 2005 1.97 TB

Yale University Library has negotiated agreements with many vendors to gain full-text access to a large and growing collection of archival documents. These agreements include access to collections such as Gale's Archives Unbound.

Please email researchdata@yale.edu to request access to this data or with any questions about Text & Data Mining.

Archival Collections & Government Documents

Below is a list of archival & government documents YUL has licensed for Text & Data Mining. These collections contain a large number of documents of varying types; click on the title to see the full list of items included in a collection.

Title Topic Start End Size
Archives Unbound General 1256 1994  
Black Freedom Struggle in the 20th Century: Organizational Records & Personal Papers, Part 2 Social Justice 1898 1976 192.1 GB
British Literary Manuscripts Online Literature 1120 1900  
Congressional Hearings Digital Collection War/Politics 1824 2019 80.3 GB
Declassified Documents Reference System War/Politics 1900 2008  
Eighteenth Century Collections Online II Literature 1701 1800  
House of Lords Parliamentary Papers War/Politics 1801 1900 1.58 TB
Sabin Americana, 1500-1926 War/Politics 1500 1926  
Slavery & Anti-Slavery, Part I: Debates over Slavery and Abolition Social Justice 1490 1896  
Slavery & Anti-Slavery, Part II: Slave Trade in the Atlantic World Social Justice 1490 1896  
Slavery & Anti-Slavery, Part III: The Institution of Slavery Social Justice 1490 1888  
State Papers Online, 1509-1714: Parts I, II, III, IV War/Politics 1509 1714  
The Making of the Modern World, Part II: 1851-1914 General 1851 1914  
Times Literary Supplement Historical Archive Literature 1902 2014  
World Scholar: Latin America and the Caribbean General 1492 1986  
World War II: U.S. Documents on Planning, Operations, Intelligence, Axis War Crimes, and Refugees War/Politics 1938 1949 192.56 GB

Yale University Library has negotiated agreements with BYU to gain full-text access to two of its English-language corpora. These resources are available to anyone in the Yale community. Click on the link below to gain access to the corpora.

English Language Corpora

Below is a list of corpora YUL has licensed for Text & Data Mining. These collections contain a large number of documents of varying types, including newspapers, magazines, fiction, and spoken word; click on the title to visit the catalog record and download information.

Title Topic Start End Size
Corpus of Contemporary American English (COCA) General 1990 2017  
Corpus of Historical American English (COHA) General 1810 2009  

Yale University Library has negotiated agreements with LexisNexis to gain bulk access to many of its text collections. These include print and online journals; television and radio broadcasts; newswires and blogs; local, regional, national, and international newspapers; and more. For a full list of available titles, download the Title List below.

Please email researchdata@yale.edu with questions about this collection or to request access. These resources are available to any current Yale student, faculty member, or staff member with an active NetID.

This is a list of free or Open Access sources of text and data that researchers can use for research projects. This is not a comprehensive list. To suggest a free resource for this list, please email researchdata@yale.edu.

Researchers are responsible for ensuring that their use complies with any terms required by the publishers. 

Some data providers (like HathiTrust or Springer) provide access to more sources for those affiliated with Yale. Be sure to check whether you need Yale affiliation to get full access.