pe-tro-le-um (n.)

A thick, flammable, yellow-to-black mixture of gaseous, liquid, and solid hydrocarbons that occurs naturally beneath the earth’s surface, can be separated into fractions including natural gas, gasoline, naphtha, kerosene, fuel and lubricating oils, paraffin wax, and asphalt and is used as raw material for a wide variety of derivative products.

— American Heritage Dictionary


Introduction

Since the mid-1950s, petroleum has been the world’s most important source of energy. Its products underpin modern society: they supply energy to power industry, heat homes, and fuel the vehicles and airplanes that carry goods and people around the world. But only certain countries hold abundant crude oil reserves, and for decades these countries have had the greatest influence on the global oil market. OPEC, the Organization of the Petroleum Exporting Countries, for instance, supplies almost half of the world’s total oil production. However, with US oil production growing tremendously over the last decade and the US becoming the world’s largest petroleum producer, there are expectations that OPEC’s role will shift.

It is not yet clear, however, whether the US will take over the global oil market or the OPEC countries will remain the driving force of this industry. Before diving into the analysis, we need to address the initial question: which countries are the major players in the global petroleum market?

Data sources

We used the websites of the U.S. Energy Information Administration (EIA) and British Petroleum (BP) as our main data sources, chosen for the quality of their data. The information provided by the EIA is regulated by the U.S. Department of Energy and is widely used for research, policymaking, and promoting public understanding of energy and its interaction with the economy [1]. BP’s statistical reviews are compiled by BP economists using publicly available information published by government agencies. The BP statistics complemented the EIA data and rounded out our analysis.


Data extraction and tidying

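For our research we used the following files: global_production_eia.csv (petroleum production by country), bp_stats.xlsx (oil consumption by country, sheet "Oil Consumption - Barrels"), and spot_prices_wti_brent.xls (WTI and Brent spot prices).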

The analysis of any market starts with an examination of price, supply, demand, and political factors. Since extracting information on the political factors requires a more thorough approach, and interpreting such data is highly subjective, we set this variable aside to focus on the remaining ones.

After tidying the data, we ended up with five variables: year, country, production (thousand barrels/day), consumption (thousand barrels/day), and price (US dollars/barrel).

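# Packages used throughout (here is called via here::here()).
library(tidyverse)  # readr, dplyr, tidyr, forcats, ggplot2
library(readxl)     # read_xlsx(), read_excel()
library(janitor)    # clean_names()
library(cowplot)    # theme_minimal_vgrid(), plot_grid()

# EIA production file: country names and series descriptions share column X2.
global_production <- read_csv(here::here('data_raw', 'global_production_eia.csv'), skip = 1) %>%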
  select(-API) %>% 
  mutate(country = ifelse(!X2 %in% c(
    "Production",
    "Total petroleum and other liquids (Mb/d)",
    "Crude oil, NGPL, and other liquids (Mb/d)",
    "Crude oil including lease condensate (Mb/d)",
    "NGPL (Mb/d)",
    "Other liquids (Mb/d)",
    "Refinery processing gain (Mb/d)"), X2, NA),
    description = ifelse(is.na(country), X2, NA)
  ) 

countries <- global_production %>% 
  filter(!is.na(country)) %>% 
  select(country)

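# OPEC members, used below to flag each country by exact name match.
opec <- c("Algeria", "Angola", "Congo", "Equatorial Guinea", "Gabon", "Iran", "Iraq", "Kuwait", "Libya", "Nigeria", "Saudi Arabia", "United Arab Emirates", "Venezuela")

# This assumes each country is followed by exactly 7 series rows, so every
# country name is repeated 7 times to label them before reshaping.
gp <- global_production %>% 
  filter(!is.na(description)) %>% 
  select(-country, -X2) %>% 
  mutate(country = rep(countries$country, each = 7)) %>% 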
  gather(key = year, value = rate, '1973':'2019') %>% 
  spread(key = description, value = rate) %>% 
  clean_names() %>% 
  select(-production, -total_petroleum_and_other_liquids_mb_d, -crude_oil_ngpl_and_other_liquids_mb_d) %>% 
  rename(crude = crude_oil_including_lease_condensate_mb_d,
         ngpl = ngpl_mb_d,
         other = other_liquids_mb_d,
         refinery_gain = refinery_processing_gain_mb_d) %>% 
  mutate(year = as.numeric(year),
         crude = as.numeric(crude),
         ngpl = as.numeric(ngpl),
         other = as.numeric(other),
         refinery_gain = as.numeric(refinery_gain),
         production = crude + ngpl + other + refinery_gain) %>% 
  filter(country != "World") %>% 
  select(country, year, production) %>% 
  filter(production > 1) %>% 
  mutate(opec = case_when(
    country %in% opec ~ 'opec',
    TRUE ~ 'non-opec'
  )) %>% 
  glimpse()
## Observations: 3,620
## Variables: 4
## $ country    <chr> "Albania", "Albania", "Albania", "Albania", "Albania", "Al…
## $ year       <dbl> 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989…
## $ production <dbl> 44.000000, 44.000000, 64.000000, 75.000000, 55.000000, 55.…
## $ opec       <chr> "non-opec", "non-opec", "non-opec", "non-opec", "non-opec"…
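Because the OPEC flag is assigned by exact name matching, a member whose name is spelled differently in the data would silently fall into the 'non-opec' group. The check below is a minimal sketch of how this could be verified. The `data_countries` vector is a hypothetical stand-in for `unique(gp$country)` with illustrative values only; the "Congo-Brazzaville" spelling is taken from the Figure 1 code in the Analysis section, which suggests such a mismatch may be present.

```r
# OPEC member names as written in the analysis code.
opec <- c("Algeria", "Angola", "Congo", "Equatorial Guinea", "Gabon", "Iran",
          "Iraq", "Kuwait", "Libya", "Nigeria", "Saudi Arabia",
          "United Arab Emirates", "Venezuela")

# Hypothetical stand-in for unique(gp$country); illustrative values only.
data_countries <- c(setdiff(opec, "Congo"), "Congo-Brazzaville", "United States")

# Members with no exact match in the data would be labelled 'non-opec'.
unmatched <- setdiff(opec, data_countries)
print(unmatched)
```

An empty result on the real data would confirm that every member is flagged as intended.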
# BP consumption sheet. In readxl, range takes precedence over sheet and skip,
# so the range alone selects both the sheet and the header row.
bp_consumption <- read_xlsx(here::here('data_raw', 'bp_stats.xlsx'), range = "Oil Consumption - Barrels!A3:BC108") %>% 
  gather(key = "year", value = "consumption", '1965':'2018') %>%
  rename("country" = "Thousand barrels daily") %>% 
  filter(!is.na(country)) %>% 
  spread(key = country, value = consumption) %>% 
  select(-contains("total")) %>% 
  gather(key = country, value = consumption, 'Algeria':'Western Africa') %>% 
  mutate(year = as.numeric(year),
         consumption = as.numeric(consumption))

glimpse(bp_consumption)
## Observations: 4,968
## Variables: 3
## $ year        <dbl> 1965, 1966, 1967, 1968, 1969, 1970, 1971, 1972, 1973, 197…
## $ country     <chr> "Algeria", "Algeria", "Algeria", "Algeria", "Algeria", "A…
## $ consumption <dbl> 26.71619, 35.35323, 33.28597, 35.37443, 37.71433, 43.0099…
# WTI and Brent spot prices; only the year of each date is kept.
spot_price <- read_excel(here::here('data_raw', 'spot_prices_wti_brent.xls'), sheet = 'Data 1', skip = 1, col_types = c("date", "numeric", "numeric")) %>% 
  rename(
    date = Sourcekey, 
    wti = RWTC, 
    brent = RBRTE
  ) %>% 
  gather(key = crude, value = price, 'wti':'brent') %>% 
  separate(date, into = c("year", "month", "day"), sep = "-", convert = TRUE) %>% 
  select(-month, -day)

glimpse(spot_price)
## Observations: 70
## Variables: 3
## $ year  <int> NA, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995,…
## $ crude <chr> "wti", "wti", "wti", "wti", "wti", "wti", "wti", "wti", "wti", …
## $ price <dbl> NA, 15.05, 19.20, 15.97, 19.64, 24.53, 21.54, 20.58, 18.43, 17.…
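The three tidy tables share keys: production and consumption are indexed by country and year, while price is indexed by year. As a rough sketch of how they could be combined into the five variables listed earlier, the example below uses base R `merge()` on toy data frames. All names and values here are illustrative assumptions, not the real data. In practice, country spellings may differ between sources and a single price benchmark would have to be chosen first.

```r
# Toy stand-ins for the production, consumption and price tables.
# All values are illustrative, not taken from the real files.
production <- data.frame(
  country = c("A", "A", "B"),
  year = c(2017, 2018, 2018),
  production = c(100, 110, 50)
)
consumption <- data.frame(
  country = c("A", "A", "B"),
  year = c(2017, 2018, 2018),
  consumption = c(80, 85, 60)
)
price <- data.frame(
  year = c(2017, 2018),
  price = c(54, 71)
)

# Production and consumption share country and year; price varies by year only.
combined <- merge(production, consumption, by = c("country", "year"))
combined <- merge(combined, price, by = "year")

# Order rows and columns to match the variable list in the text.
combined <- combined[order(combined$country, combined$year),
                     c("year", "country", "production", "consumption", "price")]
print(combined)
```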


Analysis

Who is leading in oil production in 2019?

The two figures below show the state of the petroleum industry in 2019. According to Figure 1, the United States leads in production at almost 20 million barrels per day, ahead of every individual OPEC member. However, the contribution of the OPEC countries to the global market is hard to overestimate, since members of the organization own almost 80% of the world’s total petroleum reserves [2]. Figure 2 shows that although the US has become the leading petroleum producer, OPEC’s total output still dominates.

Nevertheless, US output already equals more than half (56.9%) of OPEC’s combined production, as Figure 2 shows. This implies that the role of the US as a petroleum exporter will grow if its production keeps increasing. To predict the future state of the global petroleum market, we decided to examine historical data on both production and consumption of crude.
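The 56.9% figure follows from the two totals annotated in the Figure 2 code (34313 for OPEC and 19520 for the United States, in thousand barrels per day). The variable names below are introduced only for this illustration:

```r
# 2019 totals in thousand barrels per day, as annotated in Figure 2.
us_production <- 19520
opec_production <- 34313

# US output as a percentage of OPEC's combined output, to one decimal place.
us_share_of_opec <- round(100 * us_production / opec_production, 1)
print(us_share_of_opec)
```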


# Figure 1: 2019 production by country, with small producers lumped together.
gp2019 <- gp %>% 
  filter(year == 2019 & production > 100) %>%
  mutate(country = fct_other(country, drop = c("South Africa", "Denmark", "South Korea", "Brunei", "Japan", "Chad", "Italy", "Peru", "South Sudan", "Ghana", "Germany", "Vietnam", "Turkmenistan", "Congo-Brazzaville", "France"), other_level = "Other countries")) %>% 
  mutate(country = fct_reorder(country, production)) %>%
  ggplot() +
  geom_col(aes(x = country, y = production, fill = opec)) +
  coord_flip() +
  scale_y_continuous(
        # expansion() replaces expand_scale(), deprecated since ggplot2 3.3.0
        expand = expansion(mult = c(0, 0.05))) +
  theme_minimal_vgrid(font_family = "Roboto Condensed") +
  theme(axis.ticks = element_blank(),
        legend.position = 'none') +
  labs(x = "",
       y = "Production (thousand barrels per day)",
       title = "Figure 1. Global petroleum production in 2019",
       subtitle = "by country")

# Figure 2: the same data with OPEC members summed into a single bar.
gp2019_1 <- gp %>% 
  filter(year == 2019 & production > 100) %>% 
  mutate(country = fct_other(country, drop = c("Algeria", "Angola", "Congo", "Equatorial Guinea", "Gabon", "Iran", "Iraq", "Kuwait", "Libya", "Nigeria", "Saudi Arabia", "United Arab Emirates", "Venezuela"), other_level = "OPEC")) %>% 
  group_by(country) %>% 
  summarise(production = sum(production)) %>% 
  mutate(country = fct_other(country, drop = c("South Africa", "Denmark", "South Korea", "Brunei", "Japan", "Chad", "Italy", "Peru", "South Sudan", "Ghana", "Germany", "Vietnam", "Turkmenistan", "Congo-Brazzaville", "France"), other_level = "Other countries")) %>% 
  mutate(country = fct_reorder(country, production),
         cartel = country == "OPEC") %>% 
  ggplot() +
  geom_col(aes(x = country, y = production, fill = cartel)) +
  coord_flip() +
  annotate('text', y = 34313, x = "OPEC", label = "34313", hjust = 1, fontface = 3) +
  annotate('text', y = 19520, x = "United States", label = "19520", hjust = 1, fontface = 3) +
  scale_y_continuous(
        # expansion() replaces expand_scale(), deprecated since ggplot2 3.3.0
        expand = expansion(mult = c(0, 0.05))) +
  theme_minimal_vgrid(font_family = "Roboto Condensed") +
  theme(axis.ticks = element_blank(),
        legend.position = 'none') +
  labs(x = "",
       y = "Production (thousand barrels per day)",
       title = "Figure 2. Global petroleum production in 2019",
       subtitle = "by comparing total output of OPEC members and other countries")

plot_grid(gp2019, gp2019_1)