Car Bloat in the United States

Do Americans have an obsession with big vehicles?

Author

Ifeoluwa Olaniyan and Emmanuel Agbeko

Published

December 8, 2024

Introduction

In recent years, there has been growing concern over the increasing size of vehicles on American roads, often referred to as “car bloat.” This trend is driven in part by a regulatory loophole in U.S. fuel economy standards that allows larger vehicles to meet less stringent efficiency requirements compared to smaller cars. Under the Corporate Average Fuel Economy (CAFE) standards, automakers have found it easier to meet fuel efficiency targets by prioritizing the production of larger vehicles, such as SUVs and trucks, rather than improving the efficiency of smaller cars. The result has been a steady shift toward a vehicle fleet dominated by bulkier, heavier models.

The implications of this shift extend beyond mere aesthetics. Larger vehicles are associated with increased emissions, contributing significantly to environmental degradation. Additionally, their size and weight raise serious concerns about road safety, as they tend to cause more severe accidents and fatalities when collisions occur. The proliferation of larger vehicles has also exacerbated issues related to urban planning, from parking shortages to road wear and tear, further straining public infrastructure.

Do Americans genuinely prefer these larger vehicles, or are they simply buying what automakers are offering? SUVs and trucks now dominate the sales charts, but consumer preferences are shaped by a range of factors, including advertising, perceived safety, and limited choices in smaller, fuel-efficient models. Automakers have heavily marketed SUVs and trucks as symbols of freedom and ruggedness, creating a strong demand narrative that aligns with cultural ideals. At the same time, practical considerations like family size, cargo space, and all-weather capability often make these vehicles more appealing to buyers. However, demand is also constrained by supply: fewer automakers are investing in the development and promotion of smaller cars, which limits options for consumers who might otherwise prefer a compact or mid-size vehicle.

With SUVs and trucks dominating sales, understanding the U.S. vehicle fleet’s shift toward larger models is crucial for informing public safety regulations, consumer choices, and urban planning.

Data sources

The primary source of data for this project is the Car and Driver website, which provides detailed data on vehicle weight, size, and safety features for a wide variety of models. The website lets readers compare vehicles (e.g., by dimensions, weight, and safety ratings) across categories such as sedans, SUVs, and trucks. The information on the website is generally original data compiled from automakers and third-party testing. The data should be reliable for comparing vehicle size, weight, and safety features, but the specifications might be pre-processed into summaries or aggregated results, which may lack context about real-world performance in crashes.

The data used in this analysis was sourced from the Jhelvy GitHub repository, where it was scraped, cleaned, and made available by Professor Helveston. For this project, we filtered the data down to the vehicle models and specifications that we needed.

We also got data about the 25 Bestselling Cars, Trucks, and SUVs of 2024 So Far and the estimated number of each vehicle sold from the same website. We used this data to create a spreadsheet with the 25 Bestselling Vehicles, number of units sold, and vehicle type (Car/ Truck/ SUV) and joined that to the main data.
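A join like this silently leaves missing values wherever a model name in the main data has no exact match in the spreadsheet. The sketch below shows one way to check for that, assuming both tables share a style column as in our data. The toy tables specs and sales and all of their values are made up for illustration.

```r
library(dplyr)

# Toy stand-ins for the real tables; every value here is invented.
specs <- tibble(
  style  = c("Model A", "Model A", "Model B", "Model C"),
  year   = c(2022, 2023, 2023, 2023),
  length = c(180, 181, 190, 230)
)
sales <- tibble(
  style      = c("Model A", "Model B"),
  type       = c("Car", "SUV"),
  units_sold = c(1000, 2000)
)

# Same kind of join as in our pipeline: keep every spec row, attach type and sales
joined <- left_join(specs, sales, by = "style")

# Styles present in the specs but absent from the sales sheet;
# these rows carry NA for type and units_sold after the join
unmatched <- anti_join(specs, sales, by = "style") %>%
  distinct(style)

print(unmatched)
```

An empty unmatched table would indicate that every model in the main data received a vehicle type and a sales figure.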

Below is a table of the data we worked with. It contains information from as early as 2007 about the 25 bestselling vehicles of 2024 so far. From this data, we can see how the sizes of these models have evolved over the years, possibly due to advances in technology and efforts to comply with updated policies (e.g., CAFE standards). An alternative (and perhaps more comprehensive) approach would be to collect the bestseller list for each year over a period of time (say, the last decade) to see how the sizes of the vehicles that populate the US fleet have changed. We could not do this because reliable data for that approach was unavailable, but the current data should still give a clear enough picture of car bloat in the United States.

Code
main_df %>%
  reactable(
    searchable = TRUE,
    highlight = TRUE,
    filterable = TRUE,
    defaultPageSize = 10,
    showPageSizeOptions = TRUE,
    pageSizeOptions = c(10, 20, 25),
    theme = reactableTheme(
      style = list(fontSize = "12px"),
      stripedColor = "#f9f9f9", 
      borderColor = "#cccccc",
      rowHighlightStyle = list(backgroundColor = "#f7f7f7")
    )
  )
Code
# Count the number of each type of vehicle
vehicle_counts <- main_df %>%
  group_by(type) %>%
  summarise(count = n())

ggplot(vehicle_counts, aes(x = type, y = count)) +
  geom_col(show.legend = FALSE) + 
  theme(
    axis.title = element_blank(),
    plot.title = element_text(hjust = 0.5),
    panel.grid = element_blank()
  ) +
  labs(
    title = 'Number of Vehicles by Type'
  )

This is a basic representation of how often each vehicle type (Car, SUV, and Truck) appears in our data on the bestselling vehicles in the U.S. Because main_df holds one row per model and year, the bars count model-year records rather than distinct models. Observe that the count for SUVs is more than five times that of regular passenger cars. This suggests that larger vehicles, particularly SUVs, are favored by consumers, possibly due to factors like perceived safety, utility, or marketing influence, which could be contributing to the growing trend of “car bloat” in the U.S. vehicle fleet.
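Since the chart above is built with n() on a table that has one row per model and year, a model with a long history contributes more to its bar than a recently introduced one. The following sketch, using a made-up toy table toy_df in place of main_df, shows how counting distinct models with n_distinct() can be compared against the row count.

```r
library(dplyr)

# Toy data: one row per model and year (names and values are invented)
toy_df <- tibble(
  style = c("Model A", "Model A", "Model B", "Model B", "Model B", "Model C"),
  year  = c(2022, 2023, 2021, 2022, 2023, 2023),
  type  = c("Car", "Car", "SUV", "SUV", "SUV", "SUV")
)

type_counts <- toy_df %>%
  group_by(type) %>%
  summarise(
    model_years = n(),              # what the bar chart counts
    models      = n_distinct(style), # distinct models per type
    .groups = "drop"
  )

print(type_counts)
```

If the two columns give noticeably different ratios between types, the distinct-model count is probably the better basis for statements about the bestseller list itself.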

Footprint over time- all cars

Code
# First, we calculate the mean for each year across all vehicles
footprint_data <- main_df %>%
  group_by(year) %>%
  mutate(
    mean_footprint = mean(footprint, na.rm = TRUE)
  )


ggplot(footprint_data) +
  geom_line(
    aes(x = year, y = footprint, group = style),
    color = 'grey', alpha = 0.3
  ) +
  # linewidth replaces the deprecated size argument for lines (ggplot2 >= 3.4.0)
  geom_line(
    aes(x = year, y = mean_footprint),
    linewidth = 0.8, color = 'black'
  ) +
  annotate(
    'text', x = min(footprint_data$year) + 9,
    y = mean(footprint_data$mean_footprint, na.rm = TRUE) + 100,
    hjust = 0, label = 'US Mean', color = 'black'
  ) +
  theme_minimal() +
  theme(
    axis.title.x = element_text(face = "bold", hjust = 0),
    axis.text.y = element_text(hjust = 1, size = 9),
    plot.title = element_text(size = 16, face = "bold"),
    plot.subtitle = element_text(size = 13, face = "italic")
  ) +
  labs(
    x = NULL,
    y = NULL,
    title = 'Trends in Vehicle Footprint (2007 - 2023)',
    # footprint is length x width in square inches, not an emissions measure
    subtitle = 'How is vehicle footprint (length x width, sq. in.) changing over time?'
  ) +
  scale_y_continuous(
    limits = c(12500, 14500)
  )

As expected, the chart shows an overall increase in vehicle footprint, computed here as length times width in square inches, over the past several years. This could be due to several factors, including a shift in consumer preference toward larger vehicles like SUVs and trucks, which have a larger physical footprint than smaller cars. Automakers are also designing vehicles with more space and advanced features, which can increase their size and weight. Regulatory changes, such as those around fuel economy standards, possibly play a role, as larger vehicles are sometimes subject to less stringent efficiency requirements. This suggests that the bestselling vehicles in the US, and by extension the fleet, are becoming bulkier over time, which could contribute to higher resource consumption and environmental impact, making it important to consider how this affects sustainability efforts.
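To put a number on the increase rather than reading it off the chart, one option is to fit a simple linear trend to the yearly mean footprint. This is only a sketch under the assumption that a straight line is a reasonable summary; toy_df and its values are invented stand-ins for main_df.

```r
library(dplyr)

# Toy data: two models over four years (all values are made up)
toy_df <- tibble(
  style     = rep(c("Model A", "Model B"), each = 4),
  year      = rep(2020:2023, times = 2),
  footprint = c(12000, 12100, 12200, 12300,
                14000, 14100, 14200, 14300)
)

# Mean footprint per year across all models
yearly_mean <- toy_df %>%
  group_by(year) %>%
  summarise(mean_footprint = mean(footprint, na.rm = TRUE), .groups = "drop")

# Linear trend: the slope is the average change in square inches per year
trend <- lm(mean_footprint ~ year, data = yearly_mean)
slope <- unname(coef(trend)["year"])

print(slope)
```

On the real data, summary(trend) would also report a standard error for the slope, which helps judge whether the upward trend is distinguishable from year-to-year noise.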

Code
# Plot: Footprint over time across vehicle types
main_df %>% 
  filter(year >= 2014 & year <= 2023) %>% 
  ggplot(aes(x = year, y = footprint, color = type)) +
  # linewidth replaces the deprecated size argument for lines (ggplot2 >= 3.4.0)
  geom_smooth(linewidth = 1, alpha = 0.8, se = FALSE) +
  theme_minimal() +
  labs(
    x = 'Year',
    y = 'Footprint (sq. inches)',
    title = 'Footprint over Time by Vehicle Type'
  ) +
  facet_wrap(~ type, scales = "free_y") +
  theme(legend.position = "none")

Code
# Open the data viewer only in an interactive session, not when rendering
if (interactive()) View(main_df)

We plotted footprint over time (2014-2023) for each vehicle type to determine whether one particular type had the most influence on the overall trend. Despite their significantly larger footprint, trucks did not contribute substantially to the overall trend because they make up a relatively small share of the list of bestselling vehicles. In contrast, SUVs, while still smaller than trucks, have shown a consistent increase in footprint in recent years. Because larger vehicles generally consume more fuel and produce more emissions, this growth in physical size also implies a larger environmental impact. The shift towards larger vehicles, particularly SUVs, therefore appears to have played a major role in the overall increase. This suggests that policies focused on vehicle size and fuel efficiency may be essential in mitigating the transportation sector’s environmental footprint.
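How much each type influences the overall picture depends on how many units it sells, not only on how many models it has. One way to explore this is a sales-weighted mean footprint using the units_sold column. The sketch below uses an invented toy_df; note that units_sold in our data refers to 2024 sales only, so applying it to other model years is an assumption.

```r
library(dplyr)

# Toy data for a single model year (all values are made up)
toy_df <- tibble(
  style      = c("Model A", "Model B", "Model C", "Model D"),
  type       = c("SUV", "SUV", "Truck", "Truck"),
  year       = 2023,
  footprint  = c(12000, 13000, 18000, 20000),
  units_sold = c(300, 100, 100, 100)
)

by_type <- toy_df %>%
  filter(year == max(year)) %>%
  group_by(type) %>%
  summarise(
    mean_footprint     = mean(footprint),                         # each model counts once
    weighted_footprint = weighted.mean(footprint, w = units_sold), # each unit sold counts once
    .groups = "drop"
  )

print(by_type)
```

In the toy example the smaller SUV sells three times as many units, so the weighted SUV mean falls below the simple mean.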

Code
main_df <- main_df %>%
  mutate(volume = length * width * height)

us_mean_volume <- main_df %>%
  group_by(year) %>%
  summarise(mean_volume = mean(volume, na.rm = TRUE))

ggplot(main_df) +
  geom_line(aes(x = year, y = volume, group = style), color = 'grey', alpha = 0.3) +
  # linewidth replaces the deprecated size argument for lines (ggplot2 >= 3.4.0)
  geom_line(data = us_mean_volume, aes(x = year, y = mean_volume), color = 'black', linewidth = 1) +
  annotate('text', x = max(main_df$year) - 10, y = max(us_mean_volume$mean_volume) - 200000, hjust = 0,
           label = 'U.S. Mean Volume', color = 'black') +
  theme_minimal() +
  theme(axis.text.y = element_blank(),
        plot.title = element_text(face = "bold")) +
  # Axis titles are intentionally left blank; volume is in cubic inches
  labs(
    x = NULL,
    y = NULL,
    title = 'Change in Vehicle Volume over Time'
  ) +
  scale_y_continuous(
    limits = c(800000, 1000000)
  )

We computed a variable called volume (length times width times height) to show the change in the overall size of vehicles, and it confirmed a consistent overall increase in vehicle size. We observed a clear upward trend in the average volume of the top-selling vehicles over the years. This highlights the physical growth of vehicles in terms of length, width, and height, and it reinforces the broader pattern of “car bloat” in the market. Larger vehicles, particularly SUVs and trucks, have become more dominant, pushing the average vehicle size higher.
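The size of the increase can be summarized as the percent change in mean volume between the first and last year in the data. The following is a minimal sketch of that calculation, with a made-up toy_df standing in for main_df.

```r
library(dplyr)

# Toy data: two models in two years (dimensions in inches, all invented)
toy_df <- tibble(
  style  = rep(c("Model A", "Model B"), each = 2),
  year   = rep(c(2010, 2020), times = 2),
  length = c(100, 110, 200, 220),
  width  = c(50, 50, 50, 50),
  height = c(40, 40, 40, 40)
)

# Mean volume per year, ordered chronologically
yearly <- toy_df %>%
  mutate(volume = length * width * height) %>%
  group_by(year) %>%
  summarise(mean_volume = mean(volume, na.rm = TRUE), .groups = "drop") %>%
  arrange(year)

# Percent change from the earliest to the latest year
pct_change <- 100 * (last(yearly$mean_volume) / first(yearly$mean_volume) - 1)

print(pct_change)
```

A comparison of only two endpoints is sensitive to unusual years, so it is best read alongside the full time series in the chart.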

This growth is likely not merely a function of changing consumer preferences for more spacious, versatile vehicles, but also a result of broader industry shifts, such as the move towards vehicles designed to meet fuel efficiency standards that inadvertently favor larger models. This poses challenges for environmental sustainability and road safety, as it contributes to higher fuel consumption, greater emissions, and more severe collisions. Additionally, it strains urban infrastructure, leading to issues such as overcrowded parking and increased road maintenance needs.

Conclusion

The growing trend of “car bloat” in the United States, characterized by an increasing preference for larger vehicles, has significant implications for both the environment and public safety. Our analysis of the 25 bestselling vehicles of 2024 reveals a clear shift toward SUVs and trucks, which now dominate the market. These vehicles, while popular for their perceived safety, utility, and marketing appeal, contribute to a larger environmental footprint due to their increased size and weight.

The data shows that over the years, despite some fluctuations, both the footprint and volume of the bestselling vehicles in the U.S. have increased, a pattern consistent with footprint-based Corporate Average Fuel Economy (CAFE) standards that set less stringent targets for larger vehicles. This shift is not just a reflection of consumer preferences, but also of the regulatory environment and automakers’ strategies in meeting fuel efficiency targets. As larger vehicles proliferate, concerns about road safety, environmental sustainability, and urban infrastructure strain become more pronounced.

The increase in vehicle size contributes to higher resource consumption, greater emissions, and more severe accidents in the event of collisions. Urban planning challenges, such as parking shortages and road wear, are also exacerbated by these larger vehicles. Moving forward, addressing “car bloat” may require both regulatory changes to encourage smaller, more efficient vehicles and a cultural shift in consumer preferences toward more sustainable transportation options. The future of the U.S. vehicle fleet will depend on balancing consumer demand with environmental and safety concerns to mitigate the negative impacts of car bloat.

Limitations

While this analysis provides valuable insights into the trend of “car bloat” in the United States, we acknowledge that there are several limitations to the data:

  • The footprint and volume metrics were derived from vehicle dimensions but may not account for factors like weight distribution, materials used, or other physical design elements influencing environmental impact.
  • Our analysis focuses on the models that were bestsellers in 2024, because bestseller lists for earlier years were unavailable. Tracking each year's bestsellers over a decade or more could have offered a better understanding of how the sizes of vehicles sold have changed over longer periods.

Data Dictionary

The cleaned data file is named main_df.csv. The table below describes each variable:

variable     description
style        Model style of the vehicle
year         Year of the model’s release or manufacture
msrp         Manufacturer’s suggested retail price (in USD), averaged per “style” and “year”
wheelbase    Distance between front and rear axles in inches
length       Total length of the vehicle in inches
width        Width of the vehicle without mirrors in inches
height       Height of the vehicle in inches
footprint    Calculated vehicle footprint (length x width), in square inches
volume       Calculated vehicle volume (length x width x height), in cubic inches
type         Classification of the vehicle as “Car”, “Truck”, or “SUV”
units_sold   Estimated number of units sold in 2024 for each vehicle “style”

Appendix

Code
# Libraries and settings
library(tidyverse)
library(here)
library(janitor)
library(readxl)
library(gridExtra)
library(knitr)
library(DT)
library(reactable)

knitr::opts_chunk$set(
  warning = FALSE,
  message = FALSE,
  comment = "#>",
  fig.path = "figs/", # Folder where rendered plots are saved
  fig.width = 7.252, # Default plot width
  fig.height = 4, # Default plot height
  fig.retina = 3 # For better plot resolution
)

# Other "global" settings
theme_set(theme_bw(base_size = 20))

# Data used
car_data <- read_csv(here("data_raw", "main_data.csv"))

relevant_columns <- c("style", "year", "msrp", "wheelbase_inches_2", "length_inches_2", "width_without_mirrors_inches_2", "height_inches_2")

# We created a vector containing the list of the 25 bestselling cars of 2024 so far
bestsellingcars_2024 <- c("Honda HR-V", "Toyota Tundra", "Kia Sportage", "Nissan Sentra", "Honda Accord", "Subaru Outback", "Toyota Tacoma", "Subaru Forester", "Subaru Crosstrek", "Chevrolet Equinox", "Hyundai Tucson", "Ford Explorer", "Chevrolet Trax", "Jeep Grand Cherokee", "Toyota Corolla", "Honda Civic", "Nissan Rogue", "Toyota Camry", "GMC Sierra 1500", "GMC Sierra 2500", "GMC Sierra 3500", "Ram Pickup", "Honda CR-V", "Tesla Model Y", "Toyota RAV4", "Chevrolet Silverado", "Ford F-Series")

bestSellingCarData <- car_data %>% 
  filter(Style %in% bestsellingcars_2024) %>%
  clean_names() %>% 
  # all_of() makes explicit that relevant_columns is an external character vector
  select(all_of(relevant_columns)) %>% 
  # The GMC Sierra is one of the bestselling vehicles of 2024, but it spans the
  # light-duty 1500 and the heavy-duty 2500 and 3500 models, so we rename the
  # style for all three to "GMC Sierra"
  mutate(
    style = ifelse(style %in% c("GMC Sierra 1500", "GMC Sierra 2500", "GMC Sierra 3500"), "GMC Sierra", style),
    msrp = parse_number(msrp)
  )

# Average the specifications across trims for each model and year
grouped_data <- bestSellingCarData %>% 
  group_by(style, year) %>% 
  summarise(
    msrp = mean(msrp, na.rm = TRUE),
    wheelbase = mean(wheelbase_inches_2, na.rm = TRUE),
    length = mean(length_inches_2, na.rm = TRUE),
    width = mean(width_without_mirrors_inches_2, na.rm = TRUE),
    height = mean(height_inches_2, na.rm = TRUE),
    .groups = "drop"
  ) %>% 
  mutate(
    footprint = length * width,
    volume = length * width * height) %>% 
  filter(year != 2024)

# To get our final data frame, we created and imported an Excel sheet containing data about the bestselling vehicles, their type (Car/Truck/SUV), and the number of units sold in 2024 so far.
car_types_data <- read_excel(here("data_raw","style_and_units.sold.xlsx"))

main_df <- left_join(grouped_data, car_types_data, by = "style")

write_csv(main_df, here("data_processed", "main_df.csv"))