The world is full of data, but finding it can be difficult.
Here are some resources that you might find helpful for tracking down
datasets across the web.
Packages
Many packages contain interesting datasets. Here
is a table of packages that contain multiple datasets. Some of my
favorites are:
In addition, some packages contain nothing but datasets, such
as:
Websites
A collection of various websites that have lots of interesting
datasets:
Large, searchable sites
Large sites that work like a “Google” for datasets:
Government-ish sources:
“Government-ish” because while some of these sites host government
data, the sites themselves may or may not be affiliated with a
government agency:
Energy data
Since I work with energy data a lot, here are some of my
go-to sources:
- China Energy Portal Statistics: Loads of energy statistics from
China.
- U.S. Energy Information Administration: From Wikipedia, the
“principal agency of the U.S.
Federal Statistical System responsible for collecting, analyzing, and
disseminating energy information to promote sound policymaking,
efficient markets, and public understanding of energy and its
interaction with the economy and the environment. EIA programs cover
data on coal, petroleum, natural gas, electric, renewable and nuclear
energy. EIA is part of the U.S. Department of Energy.”
- Environmental Performance Index, Yale University: Ranks 163
countries on 25
performance indicators tracked across ten policy categories covering
both environmental public health and ecosystem vitality. These
indicators provide a gauge at a national government scale of how close
countries are to established environmental policy goals.