The world is full of data, but finding it can be difficult. Here are some resources that you might find helpful for tracking down data across the web.
Many packages contain interesting datasets. Here is a table of packages that contain multiple datasets. Some of my favorites are:
In addition, some packages contain only datasets, such as:
A collection of websites that have lots of interesting datasets:
Large, searchable sites
Large sites that work like a “Google” for datasets:
“Government-ish” because while some of these sites host government data, the sites themselves may or may not be affiliated with a government agency:
Since I happen to work with energy data a lot, here are some of my go-to sources:
- China Energy Portal Statistics: Loads of energy statistics from China.
- U.S. Energy Information Administration: According to Wikipedia, the “principal agency of the U.S. Federal Statistical System responsible for collecting, analyzing, and disseminating energy information to promote sound policymaking, efficient markets, and public understanding of energy and its interaction with the economy and the environment. EIA programs cover data on coal, petroleum, natural gas, electric, renewable and nuclear energy. EIA is part of the U.S. Department of Energy.”
- Environmental Performance Index, Yale University: Ranks 163 countries on 25 performance indicators tracked across ten policy categories covering both environmental public health and ecosystem vitality. These indicators provide a gauge at a national government scale of how close countries are to established environmental policy goals.