Finding Data

The world is full of data, but finding it can sometimes be difficult. As a starting point, GW Libraries has an excellent set of Research Guides that can be helpful. Depending on your research question, searching Google can also lead to good sources.

Here are some other resources:

Large, searchable sites

These work like a Google search, but only for datasets:

Websites

A collection of websites that host lots of interesting datasets:

Packages

Many packages contain interesting datasets. If you can find a package with data, the data will usually be nicely formatted 😄

Here is a table of many packages that contain multiple datasets. Some of my favorites are:

In addition, some packages contain only datasets, such as:

Government-ish sources

“Government-ish” because while some of these sites host government data, the sites themselves may or may not be affiliated with a government agency:

Energy data

Since I work with energy data a lot, here are some of my go-to sources:

  • China Energy Portal Statistics: Loads of energy statistics from China.
  • U.S. Energy Information Administration: According to Wikipedia, the “principal agency of the U.S. Federal Statistical System responsible for collecting, analyzing, and disseminating energy information to promote sound policymaking, efficient markets, and public understanding of energy and its interaction with the economy and the environment. EIA programs cover data on coal, petroleum, natural gas, electric, renewable and nuclear energy. EIA is part of the U.S. Department of Energy.”
  • Environmental Performance Index, Yale University: Ranks 163 countries on 25 performance indicators tracked across ten policy categories covering both environmental public health and ecosystem vitality. These indicators provide a gauge at a national government scale of how close countries are to established environmental policy goals.

Spatial data