Getting Started with Dewey: Data Available at Yale

Dewey Data Subscriptions Available at Yale

This list is current as of October 30, 2023

The following datasets are available for free to Yale-affiliated researchers, faculty, and students: 

 

SafeGraph: Global points of interest (POIs), building polygons, and aggregated US transaction data from 2019 - present

Datasets:

  • Spend - Aggregated, anonymized credit and debit transaction data associated with specific POIs, including median spend per day, median spend per customer, and other detailed statistics, as well as where else consumers spend money and the breakdown of online/offline spending.
  • Places - Global points of interest (POIs), including lat/long coordinates, address strings, place name, brand affiliation, NAICS categorization, open/close dates, open hours, contact info, and more.
  • Places + Geometry - Precision polygons representing place footprints, including detailed spatial hierarchy metadata that denotes how places are related to each other (e.g., stores inside a shopping mall or co-tenants in a plaza).

Learn more here


Advan Research: Aggregated foot traffic and mobility data for the US and Canada from 2019 - present

Datasets:

  • Weekly and Monthly Patterns - Aggregated raw counts of visits to POIs from a panel of mobile devices over a given month or week, detailing how often people visit, how long they stay, where they came from, where else they go, and more.
  • Neighborhood Patterns - Footfall data aggregated by census block group (CBG) in the US and dissemination area (DA) in Canada over the course of a month, showing how the population moves between different areas in terms of both volume and frequency.

Learn more here


Similarweb: Website metrics for top global brands, including website visits, search keywords, and popular pages from 2021 - present. 

Datasets:

  • Website Visits - Daily website visits for the top 1,000 global websites, top 500 most valuable companies, and all brands in SafeGraph and Advan datasets.
  • Organic Search Keywords - Monthly organic keywords for the domains of 500 of the most valuable companies, including keyword volume, CPC, URL, keyword position, and more.
  • Popular Pages - Data on up to 50 of the most visited sub-pages of 8000+ of the world’s most popular websites.
  • Additional custom datasets available upon request

Learn more here


LiveData Technologies: Historical job change data for working professionals in the US from 2012 - present.

 

Datasets:

  • Open to Work - Job History - Job history for individuals who are currently unemployed or "open to work," including up to 10 layers of job history for over 3.3 million people.
  • Additional custom datasets can be purchased upon request

Learn more here


Context Analytics: Social sentiment data from Twitter on thousands of US securities from 2011 - present. 

Dataset:

Normalized Social Sentiment Metrics - Aggregated tweet and social sentiment data, compared against the security’s own baseline, delivering 15 sentiment- and tweet volume-based metrics.

Learn more here


dataplor: Global points of interest (POIs) and GIS data for the current month, refreshed each month.

Datasets: 

  • Currently available in Australia, Austria, Belgium, Canada, Denmark, Finland, France, Germany, Ireland, Italy, Japan, Luxembourg, Netherlands, Norway, Portugal, Sweden, UK, & US
  • Additional countries available upon request.

Learn more here


CustomWeather: Detailed weather data from 2000+ weather stations in the US from 2000 - present. 

Datasets:

  • Daily Weather - A daily feed of climate data, including temperature, humidity, precipitation, windspeed, dew point, visibility, and sea level pressure.
  • Hourly Weather - An hourly feed of climate data, including temperature, humidity, precipitation, windspeed, dew point, visibility, and sea level pressure.

Learn more here


PredictHQ: Demand and event intelligence data for the US and Canada from 2020 - present.

Datasets:

  • Unscheduled Events - Data about unpredictable and unscheduled events, including severe weather, disasters, airport delays, health warnings, and terror.
  • Non-Attended Events - Data about non-attended events with a start and end date that are more fluid in impact, such as observances, public holidays, and school holidays.
  • All-Attended Events - Data about gatherings with a start and end date/time, where people come together in one location for entertainment or business, such as a sporting event or concert.

Learn more here


WageScape: Salary and compensation data for US workers from 2016 - present.

Datasets:

  • Historical Wage Data - Ethically sourced historical salary and compensation data for job titles, job families, and locations, all normalized with patented AI technology.
  • Access to real-time data can be purchased upon request.

Learn more here


7 Chord: Real-time predictive pricing metrics for fixed income assets from 2012 - present.

Dataset: 

Top Liquid Bond Prices and Liquidity Indicators - Professional grade feed of intraday bid-ask prices, spreads, and liquidity indicators for the most liquid USD-denominated High Grade, High Yield, and EM Sovereign bonds, enriched with market consensus T-benchmarks and essential bond and issuer information.

Learn more here


People Data Labs: Global company profile data from 2010 - present.

Dataset:

Company Insights - Premium company profile data including employee tenure, top company metro areas, employee growth and churn, and executive leadership changes.

Learn more here


Skupos: SKU-level transaction data for US convenience stores from 2019 - present

Dataset:

Convenience Store Transaction Data - Detailed SKU-level data from convenience stores, including basket details, payment info, loyalty programs, and more.

Learn more here


REsimplifi: Commercial real estate listing data from 2021 - present.

Dataset:

Commercial Real Estate Listings - Real estate listings for commercial properties, including detailed attributes related to property type, listing type, listing organization, and even property photos.

Learn more here


Samba TV: TV viewership & ad exposure data, including two years of historical data.

Datasets:

  • TV Ad Exposure - A detailed view of advertising TV spots, including information such as the timestamp, network, prior and next title, advertiser, product, and more.
  • TV Content Viewership - Viewership behavior, including content timestamps, content type, titles and episode information, release dates, network information, and household DMA.

Learn more here


pass_by: Branded consumer foot traffic insights in the US, including two years of historical data.

Datasets:

  • Retail Store Visits - Foot traffic data for specific POIs, including weekly, daily, and hourly breakdowns of aggregated and anonymized visitor volume.
  • Retail Store Visitors - Anonymized and aggregated visitor characteristic data for specific POIs, including date range, educational attainment, income, other brands visited, and more.
  • Store Visit Trends - Total monthly anonymized and aggregated foot traffic volume for specific POIs, including date range, brand, store location, and more.

Learn more here


WARN Tracker: US company layoff data from 1988 - present.

Dataset:

Historical Layoff Data - Detailed company layoff data, including company name, state where layoffs took place, notice date, effective layoff date, temporary status, and year.

Learn more here

Data Collections Librarian

Barbara Esty
Contact:
Yale University Library
Data-Intensive Social Science Center
85 Trumbull St.
203-432-4587