Imagine standing at a car dealership in 2024, faced with a choice that would have seemed impossible just a decade ago: gasoline, hybrid, plug-in hybrid, or fully electric? This decision represents more than personal preference; it reflects one of the most significant technological transitions in automotive history. As electric vehicles (EVs) gain unprecedented market share, consumers, policymakers, and industry stakeholders face critical questions about how these different powertrains perform in the real world.
The automotive industry is undergoing a fundamental transformation. Understanding how the different powertrain types (gasoline (CV), hybrid (HEV), battery electric (BEV), and plug-in hybrid electric (PHEV) vehicles) compare in terms of usage patterns, value retention, and market dynamics has become essential for making informed decisions. Do electric vehicles travel the same daily distances as conventional cars? How well do they retain value over time? Is the EV market dominated by a few manufacturers, or has competition increased?
This analysis examines US vehicle market trends from 2018 to 2024, leveraging comprehensive data from the vehicletrends open-source research project developed by Dr. John Paul Helveston at George Washington University. By analyzing over 10 million vehicle listings spanning six years, this study reveals how the transition to electric mobility is reshaping the American automotive landscape.
Key Findings:
Usage Patterns: While gasoline and hybrid vehicles continue to accumulate the highest annual mileage, BEVs demonstrate comparable median daily driving distances, indicating they meet typical daily transportation needs despite common range anxiety concerns.
Value Retention: Depreciation rates vary significantly across powertrain types, with all vehicles experiencing steep value loss in the first two years. Notably, the BEV depreciation curve has evolved as the market matures, with implications for total cost of ownership calculations.
Market Competition: The BEV market has transformed from a highly concentrated, Tesla-dominated segment in 2018 to a significantly more competitive landscape by 2024, reflecting the influx of new manufacturers and models into the electric vehicle space.
Geographic Disparities: EV adoption varies dramatically across states, with California achieving over 25% market share while some rural states remain below 3%, highlighting the uneven nature of the electrification transition.
Research Questions
Primary Research Question:
How do vehicle usage patterns, depreciation rates, and market concentration differ across powertrain types (Gasoline, Hybrid, BEV, PHEV) in the US automotive market?
This overarching question is addressed through the following sub-questions:
Mileage Patterns: How do daily and annual vehicle miles traveled (VMT) differ between electric and conventional vehicles?
Depreciation Trajectories: What is the depreciation trajectory for different powertrains, and how do battery electric vehicles compare to gasoline vehicles in terms of value retention?
Market Concentration: How concentrated is the automotive market by brand within each powertrain category?
Geographic Distribution: How does EV adoption vary across different states in the US?
Data Sources
Primary Data Source: MarketCheck Vehicle Listings
The data for this analysis comes from the vehicletrends project, an open-source research initiative developed by Dr. John Paul Helveston at George Washington University that analyzes US vehicle market trends using comprehensive vehicle listing data.
Data Origin and Collection:
Primary Data Provider: MarketCheck, a commercial automotive data aggregator
Data Type: Vehicle listings from online dealership inventories (NOT vehicle registration or sales transaction data)
Temporal Coverage: 2018-2024 (continuous)
Geographic Coverage: All 50 US states
Sample Size: Over 10 million vehicle listings
Update Frequency: Daily scraping of active dealer listings
Methodology Documentation: See Helveston et al. (2023) “Quantifying electric vehicle mileage in the United States” published in Joule
Data Collection and Processing Methodology
Raw Data Collection:
The vehicletrends project processes data scraped from MarketCheck’s API, which aggregates listings from thousands of dealerships across the United States. Each listing record includes:
Vehicle Identification: VIN (Vehicle Identification Number), make, model, trim, model year
Powertrain Classification: Gasoline (CV), Hybrid Electric Vehicle (HEV), Plug-in Hybrid Electric Vehicle (PHEV), Battery Electric Vehicle (BEV), Diesel
Temporal Data: Listing scrape date, vehicle model year, calculated vehicle age
Geographic Information: Dealer location (city, state, ZIP code)
Data Processing Pipeline:
VIN Decoding: Vehicle identification numbers decoded to extract make, model, trim, powertrain type, and body style using NHTSA VIN decoder API
Duplicate Removal: Multiple listings of the same vehicle (identified by VIN) over time are deduplicated to prevent double-counting
Data Cleaning: Outlier detection and removal for implausible odometer readings or pricing data
Age Calculation: Vehicle age computed from model year and listing date (in months and years)
Inflation Adjustment: All prices adjusted to January 2024 dollars using Consumer Price Index (CPI)
Quantile Computation: Mileage and price retention distributions calculated at percentile intervals (1-99)
Model Estimation: Linear regression models fitted to predict cumulative mileage and retention rates as functions of vehicle age, stratified by powertrain and body type
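The deduplication, age calculation, and inflation adjustment steps above can be sketched in base R as follows. This is a minimal illustration, not the vehicletrends implementation: the column names, the rule of keeping the earliest listing per VIN, the January 1 model-year start date, and the CPI index values are all assumptions made for the example.

```r
# Toy listings; column names and values are hypothetical, not the MarketCheck schema.
listings <- data.frame(
  vin          = c("VIN_A", "VIN_A", "VIN_B", "VIN_C"),
  model_year   = c(2020, 2020, 2019, 2021),
  listing_date = as.Date(c("2022-03-01", "2022-06-01", "2022-01-15", "2023-07-01")),
  price        = c(30000, 29000, 22000, 41000),
  stringsAsFactors = FALSE
)

# Duplicate removal: keep the earliest listing for each VIN (an assumed rule).
listings <- listings[order(listings$vin, listings$listing_date), ]
deduped  <- listings[!duplicated(listings$vin), ]

# Age calculation: assume each model year starts on January 1 of that year.
model_start        <- as.Date(paste0(deduped$model_year, "-01-01"))
deduped$age_years  <- as.numeric(deduped$listing_date - model_start) / 365.25
deduped$age_months <- deduped$age_years * 12

# Inflation adjustment: scale each price by CPI(target) / CPI(listing year).
# The index values below are placeholders, not published CPI figures.
cpi          <- c("2022" = 100, "2023" = 104)
cpi_target   <- 105  # stand-in for the January 2024 index level
listing_year <- format(deduped$listing_date, "%Y")
deduped$price_real <- deduped$price * cpi_target / unname(cpi[listing_year])

print(deduped)
```

Sorting before calling `duplicated()` makes the choice of which listing survives explicit; a different rule (for example, keeping the most recent listing) would only change the sort order.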
Data Files Used in This Analysis
The following table summarizes the specific pre-processed data files from the vehicletrends repository used in this analysis:
| Dataset | Description | Processing Method | File Location |
|---|---|---|---|
| quantiles_dvmt.csv | Daily Vehicle Miles Traveled (DVMT) distribution by powertrain, vehicle type, and percentile (1-99) | Empirical quantiles computed from odometer-derived daily mileage estimates | /data/quantiles_dvmt.csv |
| mileage_powertrain_type.csv | Predicted cumulative mileage by vehicle age, powertrain, and body type | Linear regression model coefficients: miles ~ age_years stratified by segment | /data/mileage_powertrain_type.csv |
| depreciation_powertrain_type.csv | Predicted price retention rates (listing price / MSRP) by vehicle age and segment | Linear regression model: retention ~ age_years stratified by powertrain × body type | /data/depreciation_powertrain_type.csv |
| hhi_make.csv | Herfindahl-Hirschman Index (HHI) market concentration scores by brand within each powertrain category for 2018 and 2024 | Market share calculations: HHI = Σ(market_share²) aggregated by geographic region | /data/hhi_make.csv |
Important Note on Data Usage: This analysis utilizes pre-processed summary statistics and model results rather than raw MarketCheck API responses. This approach is necessary given:
The proprietary nature of raw MarketCheck data
The computational intensity of processing 10+ million records
The project’s focus on population-level trends rather than individual vehicle tracking
All underlying analysis code and data processing scripts are available in the vehicletrends GitHub repository for transparency and reproducibility verification.
Data Validity Considerations and Potential Biases
When interpreting these results, readers should critically assess the following limitations and potential biases:
Data Source Limitations:
Listing Data vs. Transaction Data: This dataset reflects listed vehicles rather than actual completed sales transactions. This introduces potential selection bias, as listed vehicles may not represent the full population of vehicles on the road. Vehicles that sell quickly may be underrepresented, while vehicles that remain on the market longer may be overrepresented in the dataset.
Pre-Processing: The data used in this analysis comes from pre-processed model results published in the vehicletrends repository, not raw listing data. This means certain filtering and aggregation decisions were made by the original researchers that may affect results.
Geographic and Temporal Considerations:
Geographic Coverage: While the dataset spans all US states, dealer listing activity varies significantly by region. States with fewer dealers or lower population density may have less reliable estimates. Additionally, online listing platforms may not capture all vehicle sales, particularly private party transactions.
Rapid Market Evolution: The 2018-2024 timeframe captures an unprecedented period of EV market growth and technological advancement. Battery technology, charging infrastructure, and consumer incentives have evolved substantially during this period, meaning early-year trends may not predict future patterns.
Missing Data: The dataset primarily captures vehicles listed through commercial channels. It may underrepresent:
Private party sales
Vehicles sold at auction
Fleet vehicles and commercial sales
Vehicles in rural areas with limited dealer presence
Measurement and Classification Issues:
Powertrain Classification: Vehicles are categorized by broad powertrain types (CV, HEV, BEV, PHEV), but significant performance variation exists within each category. For example, BEVs range from ~150-mile range models to 400+ mile range vehicles, which likely exhibit different usage patterns.
Self-Selection Bias: Early EV adopters may represent a distinct demographic with different driving patterns than the general population. As EVs become mainstream, usage patterns may shift.
Odometer Accuracy: Mileage accumulation is inferred from odometer readings at the time of listing. This assumes odometer accuracy and does not account for potential odometer fraud or measurement error.
Economic Factors:
Inflation Adjustment: All pricing data has been adjusted to January 2024 dollars using Consumer Price Index (CPI) factors. However, CPI may not perfectly capture automotive-specific inflation or the impact of supply chain disruptions during the COVID-19 pandemic (2020-2022).
Incentive Effects: Federal tax credits and state-level incentives for EVs changed multiple times during the study period, affecting both purchase prices and depreciation patterns in ways that may not be fully captured in the data.
These limitations do not invalidate the findings but provide important context for interpretation. The large sample size (millions of listings) helps mitigate some concerns, but readers should view results as reflecting market conditions for listed vehicles rather than definitive statements about all vehicles on American roads.
Code
# Load packages used throughout the analysis
library(tidyverse)  # read_csv(), dplyr verbs, ggplot2, stringr
library(scales)     # comma(), percent(), and axis label formatters

# Load data from vehicletrends GitHub repository
base_url <- "https://raw.githubusercontent.com/jhelvy/vehicletrends/main/data/"

# Mileage model coefficients by powertrain and type
mileage_data <- read_csv(paste0(base_url, "mileage_powertrain_type.csv"), show_col_types = FALSE)

# Depreciation model results
depreciation_data <- read_csv(paste0(base_url, "depreciation_powertrain_type.csv"), show_col_types = FALSE)

# Market concentration (HHI) data
hhi_make <- read_csv(paste0(base_url, "hhi_make.csv"), show_col_types = FALSE)

# Daily VMT quantiles data
quantiles_dvmt <- read_csv(paste0(base_url, "quantiles_dvmt.csv"), show_col_types = FALSE)
Analysis and Results
Mileage Patterns Across Powertrains
One of the most persistent questions surrounding electric vehicles is whether they serve the same transportation needs as conventional vehicles. Do EV owners drive their vehicles the same distances? Or do range limitations and charging concerns result in different usage patterns?
Understanding how vehicles with different powertrains accumulate mileage is essential for assessing their real-world utility and whether EVs truly function as gasoline vehicle replacements or primarily serve as secondary household vehicles for shorter trips.
The vehicletrends project models cumulative odometer readings as a function of vehicle age using linear regression, stratified by both powertrain type and vehicle body type. By examining predicted mileage at age 3 years (when most vehicles have established stable usage patterns but before major depreciation affects the sample composition), we can compare annualized mileage rates across powertrains.
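To make the modeling step concrete, the following sketch fits a linear model of odometer reading on vehicle age for a single segment and then annualizes the prediction at age 3. The data are synthetic (generated around an assumed 12,000 miles per year), and the variable names are illustrative; this is not the vehicletrends estimation code.

```r
# Synthetic odometer readings for one powertrain/body-type segment.
# The 12,000 miles/year slope and the noise level are made-up values.
set.seed(42)
age_years <- runif(200, min = 0.5, max = 6)
miles     <- 12000 * age_years + rnorm(200, mean = 0, sd = 3000)
toy       <- data.frame(age_years = age_years, miles = miles)

# Fit cumulative mileage as a linear function of vehicle age.
fit <- lm(miles ~ age_years, data = toy)

# Predict the odometer reading at age 3, then divide by 3 to annualize.
pred_3yr     <- predict(fit, newdata = data.frame(age_years = 3))
annual_miles <- unname(pred_3yr) / 3

print(coef(fit))
print(annual_miles)
```

Because the prediction includes the fitted intercept, the annualized figure equals the slope only when the intercept is near zero; comparing `annual_miles` with `coef(fit)["age_years"]` is a quick check on that.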
Code
# Prepare mileage data for visualization
# Get mileage at age 3 years for comparison
# Note: powertrain_labels and powertrain_colors are named vectors assumed to be
# defined in a setup chunk that is not shown here.
mileage_plot_data <- mileage_data %>%
  filter(powertrain %in% c("cv", "hev", "bev", "phev")) %>%
  filter(age_years == 3) %>%  # Compare at 3 years of age
  mutate(
    powertrain = factor(powertrain, levels = c("cv", "hev", "phev", "bev")),
    powertrain_label = recode(powertrain, !!!powertrain_labels),
    vehicle_type_label = str_to_title(vehicle_type),
    annual_miles = mileage_predicted / 3  # Average annual miles
  )

# Create mileage comparison chart
ggplot(mileage_plot_data, aes(x = vehicle_type_label, y = annual_miles, fill = powertrain)) +
  geom_col(position = position_dodge(width = 0.8), width = 0.7) +
  geom_text(aes(label = comma(round(annual_miles))),
            position = position_dodge(width = 0.8),
            vjust = -0.5, size = 3) +
  scale_fill_manual(values = powertrain_colors, labels = powertrain_labels, name = "Powertrain") +
  scale_y_continuous(labels = comma_format(), expand = expansion(mult = c(0, 0.15))) +
  labs(
    title = "Average Annual Mileage by Powertrain and Vehicle Type",
    subtitle = "Based on predicted mileage at 3 years of age (2018-2024)",
    x = "Vehicle Type",
    y = "Estimated Annual Miles"
  ) +
  theme(panel.grid.major.x = element_blank(), legend.position = "bottom")
Annual Mileage by Powertrain and Vehicle Type
Figure 1 displays the estimated annual mileage rates for different powertrain and vehicle type combinations based on predicted odometer readings at age 3 years. Values are labeled on each bar for easy comparison. The analysis reveals several important patterns that challenge and confirm common assumptions about EV usage:
Key Observations:
Gasoline vehicles (CV) consistently accumulate the highest annual mileage across all vehicle types, averaging 12,000-13,500 miles per year depending on body type. This reflects their longstanding role as primary household transportation without range limitations.
Hybrid vehicles (HEV) demonstrate mileage patterns remarkably similar to gasoline vehicles, with usage rates within 5-10% of conventional vehicles. This strongly suggests that hybrids function as direct gasoline vehicle replacements rather than secondary vehicles, likely because they eliminate range anxiety while offering fuel efficiency benefits.
Battery Electric Vehicles (BEVs) show annual mileage rates approximately 10-15% lower than conventional vehicles in most categories. For example, BEV SUVs average around 11,000 miles per year compared to 13,000 for gasoline SUVs. This gap could reflect several factors:
Range limitations affecting long-distance travel decisions
Use as secondary household vehicles supplementing a gasoline vehicle
Demographic differences (early EV adopters may have different commute patterns)
Charging infrastructure constraints limiting spontaneous long trips
Plug-in Hybrid Electric Vehicles (PHEVs) exhibit the most varied patterns depending on vehicle type, generally falling between pure BEVs and gasoline vehicles. This intermediate position makes sense given PHEVs’ dual nature: owners may drive electrically for daily commutes but fall back on gasoline for longer trips.
Critical Interpretation:
While BEVs do show somewhat lower annual mileage, it’s important not to overinterpret this finding. An 11,000-mile-per-year usage rate for BEVs (versus 13,000 for gasoline vehicles) still represents substantial daily driving, averaging approximately 30 miles per day, well within the range capability of all modern EVs. The question is whether this reflects limitations of EV technology or simply different usage patterns among early adopters.
As charging infrastructure continues expanding and EVs become mainstream beyond early adopters, this mileage gap may narrow. The hybrid data provides an encouraging precedent: HEVs achieved parity with gasoline vehicles once they became sufficiently common and affordable.
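The conversion from annual to daily mileage is simple arithmetic; the short sketch below reproduces it using the approximate SUV figures quoted in this section (11,000 and 13,000 miles per year). The variable names are illustrative.

```r
# Approximate annual mileage figures quoted in the text for SUVs.
annual_miles_bev <- 11000
annual_miles_cv  <- 13000

# Convert to average daily mileage, assuming driving is spread over 365 days.
daily_bev <- annual_miles_bev / 365   # roughly 30 miles per day
daily_cv  <- annual_miles_cv / 365    # roughly 36 miles per day

# Relative gap between BEV and gasoline annual mileage, in percent.
gap_pct <- 100 * (annual_miles_cv - annual_miles_bev) / annual_miles_cv

print(round(c(daily_bev = daily_bev, daily_cv = daily_cv, gap_pct = gap_pct), 1))
```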
Depreciation Analysis
For most Americans, a vehicle represents one of their largest purchases after housing. Unlike homes, which often appreciate, vehicles are depreciating assets that lose value the moment they leave the dealership. Understanding depreciation patterns across powertrain types is therefore critical for consumers calculating total cost of ownership and for the automotive industry forecasting residual values.
Vehicle depreciation, measured here as the retention rate (listing price divided by original MSRP), varies significantly across powertrain types. This analysis examines how different vehicles retain their value over the first five years of ownership, a period that captures the steepest portion of the depreciation curve and represents the typical new-car ownership period before trade-in or sale.
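As a small worked example of the retention-rate definition, the sketch below computes retention and cumulative value lost for a hypothetical vehicle with a $40,000 MSRP. The prices are invented for illustration and are not drawn from the dataset.

```r
# Hypothetical listing prices for one model at ages 1-5; values are illustrative only.
toy <- data.frame(
  age_years     = 1:5,
  listing_price = c(34000, 29000, 26000, 23500, 21000),
  msrp          = rep(40000, 5)
)

# Retention rate: listing price divided by original MSRP.
toy$retention <- toy$listing_price / toy$msrp

# Cumulative value lost relative to MSRP, in percent.
toy$value_lost_pct <- 100 * (1 - toy$retention)

print(toy)
```

A retention rate of 0.65 at age 3 means the vehicle is listed at 65% of its original MSRP, or equivalently that it has lost 35% of that value.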
Code
# Prepare depreciation data - aggregate by powertrain and age
depreciation_summary <- depreciation_data %>%
  filter(powertrain %in% c("cv", "hev", "bev", "phev")) %>%
  group_by(powertrain, age_years) %>%
  summarise(mean_rr = mean(rr_predicted, na.rm = TRUE), .groups = "drop") %>%
  mutate(
    powertrain = factor(powertrain, levels = c("cv", "hev", "phev", "bev")),
    powertrain_label = recode(powertrain, !!!powertrain_labels)
  )

# Create depreciation chart
ggplot(depreciation_summary, aes(x = age_years, y = mean_rr, color = powertrain)) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 3) +
  scale_color_manual(values = powertrain_colors, labels = powertrain_labels, name = "Powertrain") +
  scale_y_continuous(labels = percent_format(), limits = c(0, 1), breaks = seq(0, 1, 0.2)) +
  scale_x_continuous(breaks = 1:5) +
  labs(
    title = "Vehicle Value Retention Over Time by Powertrain",
    subtitle = "Retention rate (Price/MSRP) predicted from depreciation models",
    x = "Vehicle Age (Years)",
    y = "Retention Rate (% of MSRP)"
  ) +
  theme(legend.position = "bottom", panel.grid.minor = element_blank())
Vehicle Value Retention by Powertrain Type
Figure 2 illustrates the depreciation trajectories for different powertrain types over the first five years of vehicle ownership. The visualization reveals striking differences in how vehicle types retain value, with important implications for buyers and sellers:
Depreciation Dynamics:
Universal Depreciation Curve: All powertrains follow the classic depreciation pattern, with the steepest value loss occurring in the first two years. A typical vehicle loses 25-35% of its MSRP by age 2, regardless of powertrain. This “new car premium” reflects both information asymmetry (buyers cannot observe vehicle history) and the strong preference for factory warranties.
Hybrid Vehicles (HEV) Lead in Value Retention: Hybrids demonstrate the strongest value retention across all powertrains, maintaining approximately 65-70% of MSRP at year 3 compared to 55-60% for other powertrains. This superior performance likely reflects multiple factors:
Proven fuel efficiency benefits that translate to lower operating costs
Established technology with track record of reliability (Toyota Prius has 20+ year history)
No range anxiety or charging infrastructure concerns
Growing fuel prices making efficiency increasingly valuable
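BEV Depreciation Evolution: Battery electric vehicle depreciation tells a more complex story. Early BEVs (particularly those from 2018-2020) experienced steeper depreciation, often falling to 50% retention by year 3. However, the data suggests this pattern is moderating as the market matures, with newer models showing improved retention that likely reflects improvements in battery technology, range, and consumer confidence.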
Importantly, rapid improvements in BEV technology created a “moving target” problem where older EVs with 150-mile ranges became less desirable as 300+ mile range vehicles entered the market at similar prices.
PHEV Uncertainty: Plug-in hybrid depreciation patterns show the highest variability, falling between HEVs and BEVs but with wider confidence intervals. This likely reflects the transitional, somewhat uncertain status of PHEVs:
Are they the best of both worlds, or a compromise that excels at neither?
Will manufacturers continue investing in PHEV technology, or abandon it in favor of pure EVs?
Do buyers value the electric-only range, or primarily use the gasoline engine?
Implications for Consumers:
These depreciation patterns have profound implications for total cost of ownership calculations:
New car buyers should recognize that hybrids may offer the best value retention, potentially offsetting higher initial purchase prices through superior resale value.
Used vehicle shoppers may find compelling value in the used BEV market, where depreciation has already occurred but the vehicles retain substantial utility.
EV buyers should consider depreciation risk alongside fuel savings: a BEV that saves $1,500 annually in fuel costs but depreciates $2,000 faster than a comparable gasoline vehicle may not be economically optimal unless kept long-term.
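The trade-off between fuel savings and faster depreciation can be sketched with a simple break-even calculation. The text does not say over what period the extra $2,000 of depreciation applies, so this example adopts one possible reading as an explicit assumption: the BEV loses an extra $2,000 per year during the first three years only (when depreciation is steepest), while the $1,500 annual fuel saving continues for as long as the vehicle is kept. The function name and the three-year window are hypothetical.

```r
# Net financial position of the BEV relative to a comparable gasoline vehicle.
# Assumption: extra depreciation accrues only during the first `dep_years` years.
net_position <- function(years, fuel_savings = 1500, extra_dep = 2000, dep_years = 3) {
  # Cumulative fuel savings minus cumulative extra depreciation.
  fuel_savings * years - extra_dep * pmin(years, dep_years)
}

years <- 1:8
net   <- net_position(years)

# First ownership year in which cumulative savings cover the extra depreciation.
break_even <- years[which(net >= 0)[1]]

print(data.frame(years = years, net = net))
print(break_even)
```

Under these assumed numbers the BEV is behind for the first three years and breaks even in year 4, which illustrates why the holding period matters; different assumptions about the depreciation gap would move the break-even point.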
Market Concentration Analysis
Markets with many competitors offering diverse products tend to deliver better outcomes for consumers: lower prices, more innovation, and greater choice. Conversely, concentrated markets dominated by one or two firms may suffer from reduced competition, higher prices, and less innovation. The Federal Trade Commission and Department of Justice use market concentration metrics to evaluate potential antitrust concerns.
This analysis examines market concentration across powertrain types using the Herfindahl-Hirschman Index (HHI), a standard measure calculated as the sum of squared market shares (expressed in percentage points). HHI ranges from near 0 (perfect competition among a very large number of small competitors) to 10,000 (pure monopoly). Markets with HHI below 1,500 are considered competitive, 1,500-2,500 moderately concentrated, and above 2,500 highly concentrated.
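The HHI calculation and the concentration bands described above can be sketched in a few lines of R. The brand shares below are hypothetical and chosen only to show the two ends of the scale; the function names are illustrative.

```r
# HHI: sum of squared market shares, with shares in percentage points (0-100).
hhi <- function(shares_pct) {
  stopifnot(abs(sum(shares_pct) - 100) < 1e-8)  # shares must sum to 100%
  sum(shares_pct^2)
}

# Classify an HHI value using the thresholds quoted in the text.
classify_hhi <- function(x) {
  if (x < 1500) {
    "competitive"
  } else if (x <= 2500) {
    "moderately concentrated"
  } else {
    "highly concentrated"
  }
}

# Hypothetical markets: one dominated by a single brand, one evenly split.
concentrated <- c(brand_a = 70, brand_b = 20, brand_c = 10)
fragmented   <- rep(10, 10)

print(c(hhi(concentrated), hhi(fragmented)))
print(c(classify_hhi(hhi(concentrated)), classify_hhi(hhi(fragmented))))
```

Squaring the shares is what makes the index sensitive to dominance: the 70% brand alone contributes 4,900 points, while ten equal 10% brands contribute only 1,000 in total.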
For the automotive market, we calculate HHI by brand within each powertrain category, comparing 2018 to 2024 to assess how market structure has evolved during this critical period of EV growth.
Code
# Prepare HHI data for visualization
# Note: the percent formatting below appears to assume HHI is stored on a 0-1 scale;
# multiply by 10,000 to compare with the 0-10,000 thresholds discussed in the text.
hhi_plot_data <- hhi_make %>%
  filter(powertrain %in% c("cv", "hev", "bev", "phev")) %>%
  mutate(
    powertrain = factor(powertrain, levels = c("cv", "hev", "phev", "bev")),
    powertrain_label = recode(powertrain, !!!powertrain_labels),
    year = factor(year)
  )

# Create grouped bar chart comparing 2018 vs 2024
ggplot(hhi_plot_data, aes(x = powertrain_label, y = median, fill = year)) +
  geom_col(position = position_dodge(width = 0.8), width = 0.7) +
  geom_errorbar(aes(ymin = q25, ymax = q75),
                position = position_dodge(width = 0.8),
                width = 0.25, alpha = 0.7) +
  geom_text(aes(label = percent(median, accuracy = 1)),
            position = position_dodge(width = 0.8),
            vjust = -2.5, size = 3) +
  scale_fill_manual(values = c("2018" = "#3182bd", "2024" = "#e6550d"), name = "Year") +
  scale_y_continuous(labels = percent_format(), expand = expansion(mult = c(0, 0.15))) +
  labs(
    title = "Brand Market Concentration by Powertrain Type",
    subtitle = "Median HHI with interquartile range (Q25-Q75) - Lower is more competitive",
    x = "Powertrain Type",
    y = "HHI (Market Concentration)"
  ) +
  theme(panel.grid.major.x = element_blank(), legend.position = "bottom")
Market Concentration (HHI) by Powertrain Type: 2018 vs 2024
Figure 3 presents the median HHI scores for each powertrain type, comparing 2018 (blue) to 2024 (orange). Error bars represent the interquartile range (Q25-Q75), showing variation across geographic markets. Values are labeled on each bar. The analysis reveals one of the most striking transformations in recent automotive history:
The Democratization of Electric Mobility:
BEV Market Transformation (2018→2024): In 2018, the BEV market exhibited extremely high concentration (median HHI ~4,500), firmly in the “highly concentrated” category by federal antitrust standards. This reflected Tesla’s overwhelming dominance: the company held an estimated 80% of the US EV market in 2018. By 2024, the median HHI had dropped dramatically to approximately 2,200, within the moderately concentrated range (1,500-2,500). This represents one of the fastest market concentration declines in modern automotive history.
What drove this transformation?
Traditional Automakers’ Entry: Ford (F-150 Lightning, Mustang Mach-E), GM (Bolt, Hummer EV, various Ultium models), Volkswagen (ID.4), Hyundai-Kia (Ioniq, EV6), and others launched competitive EVs
New Entrants: Rivian, Lucid, and other startups carved out niche positions
Tesla’s Relative Decline: While Tesla continued growing in absolute units, its market share fell from ~80% to ~50-55% as the total market expanded
Model Diversity: The number of available BEV models increased from ~15 in 2018 to over 50 by 2024
This competition benefits consumers through lower prices (Tesla has cut prices multiple times facing competition), greater choice across price points and vehicle types, and accelerated innovation in range, charging speed, and features.
Gasoline Market (CV) Remains Competitive: The conventional gasoline vehicle market maintains low HHI scores (~1,200-1,400), reflecting over a century of market evolution with dozens of established manufacturers. This highly competitive baseline represents the mature state the EV market may eventually approach.
Hybrid Market (HEV) Stability: The hybrid market shows moderate concentration (HHI ~2,000-2,500) that has remained relatively stable from 2018 to 2024. This reflects Toyota’s sustained dominance in hybrid technology: the company pioneered mass-market hybrids with the Prius (1997) and has maintained technological leadership. However, Ford, Honda, Hyundai, and others provide sufficient competition to prevent monopolistic concentration.
PHEV Market Concentration Increase: Counterintuitively, the plug-in hybrid market has become more concentrated, with median HHI rising from ~2,500 to ~3,000. This trend likely reflects strategic decisions by manufacturers to focus resources on either pure EVs or conventional vehicles rather than the “middle ground” of plug-in hybrids. Several manufacturers (including Ford and GM) have announced plans to phase out PHEV offerings in favor of dedicated EV platforms, consolidating this shrinking segment among fewer players.
Policy and Market Implications:
The BEV market’s evolution from near-monopoly to moderate competition represents a policy success story. Federal and state incentives, which critics sometimes characterized as “subsidizing Tesla,” helped create a market large enough to attract mainstream manufacturers. The result is a far more competitive landscape than existed under Tesla’s dominance.
However, the moderate concentration that persists (HHI ~2,200) suggests the market has not yet achieved the robust competition of the gasoline segment. Continued monitoring is warranted to ensure that as the market matures, it trends toward healthy competition rather than oligopoly among 3-4 major players.
The PHEV concentration trend, while concerning from an antitrust perspective, may actually reflect a natural market evolution as the automotive industry “skips” the partial electrification strategy in favor of full electrification, a pattern seen in other technology transitions.
Interactive Map: EV Adoption Across the US
The geographic distribution of electric vehicle adoption varies significantly across the United States, influenced by factors such as state incentives, charging infrastructure, climate, and consumer preferences.
Figure 4 shows an interactive map of EV adoption across the United States. Click on circles to see state details, or hover for quick information. Key geographic patterns include:
West Coast Leadership: California, Washington, and Oregon lead in EV adoption, driven by strong state incentives and charging infrastructure.
Northeast Corridor: States like Massachusetts, New Jersey, and Connecticut show above-average EV adoption rates.
Emerging Markets: Colorado, Utah, and Nevada are showing rapid EV growth in the Mountain West region.
Rural States: States with lower population density and longer driving distances tend to have lower EV adoption rates.
Daily Mileage Distribution Analysis
Understanding the distribution of daily vehicle miles traveled (DVMT) provides insight into how different powertrains fit into daily driving patterns.
Daily Vehicle Miles Traveled Distribution by Powertrain
Figure 5 shows the distribution of daily vehicle miles traveled for each powertrain type. The visualization reveals:
BEVs have comparable median daily mileage to conventional vehicles, indicating they meet typical daily driving needs.
The interquartile range shows that the middle 50% of vehicles, regardless of powertrain, travel between 20 and 50 miles per day.
Higher-mileage vehicles (at the 75th percentile) show similar patterns across all powertrains, suggesting all vehicle types serve diverse driving needs.
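The quantities discussed above (median, interquartile range, and the 1-99 percentile grid) can be computed from odometer-derived daily mileage as in the sketch below. The data are synthetic, and deriving daily mileage as odometer reading divided by vehicle age in days is an assumption made for this illustration rather than a confirmed detail of the vehicletrends pipeline.

```r
# Synthetic vehicles: age in days and an odometer reading built from a
# log-normal daily mileage with a median of 30 miles (made-up parameters).
set.seed(1)
n        <- 1000
age_days <- round(runif(n, min = 180, max = 2000))
odometer <- age_days * rlnorm(n, meanlog = log(30), sdlog = 0.5)

# Daily vehicle miles traveled (DVMT), assumed here to be odometer / age in days.
dvmt <- odometer / age_days

# Quartiles: the 25th-75th percentile span is the interquartile range,
# which by definition covers the middle 50% of vehicles.
q         <- quantile(dvmt, probs = c(0.25, 0.50, 0.75))
iqr_width <- unname(q["75%"] - q["25%"])

# Full percentile grid (1-99), matching the layout described for quantiles_dvmt.csv.
pct_grid <- quantile(dvmt, probs = seq(0.01, 0.99, by = 0.01))

print(round(q, 1))
print(round(iqr_width, 1))
```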
Conclusions
This comprehensive analysis of US vehicle market trends from 2018-2024 provides data-driven insights into one of the most significant technological transitions in automotive history. By examining over 10 million vehicle listings across six years, this study reveals fundamental differences in how gasoline, hybrid, and electric vehicles are used, valued, and compete in the American marketplace.
Principal Findings and Implications:
Mileage Patterns - Challenging Range Anxiety Narratives
The analysis reveals that while gasoline and hybrid vehicles continue to accumulate the highest annual mileage (averaging ~12,000-13,000 miles per year), BEVs demonstrate comparable median daily driving distances to conventional vehicles. This finding directly challenges persistent “range anxiety” concerns and suggests that for the median driver traveling 25-35 miles per day, current BEVs with 200+ mile ranges provide more than adequate daily capacity.
However, the somewhat lower annual mileage observed for BEVs may indicate they serve as secondary household vehicles or that long-distance travel remains a concern for some owners. As charging infrastructure continues to expand nationwide, this gap may narrow. Policymakers and manufacturers should focus on addressing the remaining use cases where EVs currently show lower utilization.
Depreciation Dynamics - Total Cost of Ownership Considerations
All powertrains experience steep depreciation curves, with vehicles losing 25-35% of their MSRP in the first two years. The depreciation analysis yields critical insights for consumers weighing the total cost of ownership:
Hybrids (HEV) demonstrate the strongest value retention, potentially offsetting their higher initial purchase prices.
BEV depreciation has evolved significantly as the market matures, with newer models showing improved retention compared to early EVs, likely reflecting improvements in battery technology, range, and consumer confidence.
Consumers considering EVs must factor depreciation into their calculations alongside fuel and maintenance savings. For buyers in the used vehicle market, current BEV depreciation patterns may present compelling value propositions.
Market Concentration - The Democratization of Electric Mobility
Perhaps the most striking finding is the transformation of the BEV market from a highly concentrated, Tesla-dominated sector in 2018 to a significantly more competitive landscape by 2024. This shift represents a fundamental democratization of electric mobility:
Increased competition benefits consumers through greater choice, innovation, and competitive pricing.
The influx of traditional automakers (Ford, GM, Volkswagen, Hyundai, etc.) into the EV space signals industry-wide commitment to electrification.
However, the PHEV market has become more concentrated, possibly indicating strategic retreats by some manufacturers as they focus resources on either pure EVs or conventional powertrains.
Geographic Disparities - An Uneven Transition
The stark geographic variation in EV adoption, ranging from over 25% market share in California to under 3% in some rural states, reveals the uneven nature of the electrification transition. This disparity likely reflects multiple reinforcing factors:
State-level incentive policies and regulations
Charging infrastructure availability
Climate considerations (cold weather impacts EV range)
Driving patterns (rural areas often require longer trips)
Cultural and political factors influencing technology adoption
This geographic divide has important policy implications. A nationally coordinated approach to charging infrastructure and incentives may be necessary to prevent deepening regional disparities in access to electric mobility benefits.
Limitations and Future Research:
While this analysis provides robust insights based on millions of data points, several limitations suggest directions for future research:
Longitudinal Consumer Studies: Following individual EV owners over time could reveal whether usage patterns change as drivers become more comfortable with the technology and as charging networks expand.
Battery Degradation Impact: Future research should explicitly link battery health metrics to depreciation curves, as battery degradation remains a primary concern for used EV buyers.
Incentive Policy Analysis: Quantifying the specific impacts of federal tax credits and state-level incentives on adoption patterns and depreciation rates would inform optimal policy design.
Total Cost of Ownership Modeling: Integrating this market data with fuel costs, maintenance expenses, and insurance premiums would provide consumers with more complete decision-making tools.
Environmental Impact Assessment: While outside the scope of this market analysis, connecting these usage patterns to lifecycle emissions calculations would address critical environmental questions.
Charging Infrastructure Correlation: Future work should explicitly test whether regional variations in public charging availability predict adoption rates and usage patterns.
Final Perspective:
The transition to electric vehicles represents more than a change in propulsion technology; it signals a fundamental restructuring of personal transportation, energy systems, and environmental policy. This analysis demonstrates that by 2024, electric vehicles have evolved from niche products to mainstream options that meet the daily driving needs of most Americans.
However, the transition remains incomplete and geographically uneven. The dramatic increase in BEV market competition since 2018 suggests the industry is responding to consumer demand, yet significant barriers remain, particularly around long-distance travel, charging infrastructure, and upfront costs.
As battery technology continues improving, charging networks expand, and production scales up, many current limitations will likely diminish. Continued monitoring of these market trends will be essential for policymakers designing incentive programs, manufacturers planning product strategies, and consumers making purchasing decisions in this rapidly evolving landscape.
The data tell a clear story: electric vehicles are not the future; they are the present. The question is no longer whether electrification will occur, but how quickly and equitably this transition will unfold across different regions, demographics, and use cases.
Attribution
This final report represents a team project completed by Nithin Sarva, Jake Springer, and Jagannath Narayanaswamy for EMSE 4572/6572 (Exploratory Data Analysis) at George Washington University, Fall 2025.
Team Contributions
All team members contributed equally to this project across all major components:
Nithin Sarva, Jake Springer, and Jagannath Narayanaswamy collaborated on:
Research Question Development: Jointly formulated primary and secondary research questions examining powertrain differences in usage patterns, depreciation, market concentration, and geographic adoption
Data Source Selection: Collaboratively identified and accessed the vehicletrends open-source project and evaluated data appropriateness for the research questions
Data Analysis:
Loaded and processed pre-computed summary statistics from vehicletrends repository
Performed exploratory data analysis to understand distributions and relationships
Generated statistical summaries and interpretations presented in the report
Visualization Development:
Created publication-quality visualizations (Figures 1-5) using ggplot2
Designed color palettes, labels, and annotations following data visualization best practices
Developed interactive geographic map using Leaflet package
Writing and Narrative:
Authored report text including introduction, methodology, results interpretation, and conclusions
Wrote comprehensive data source documentation and validity assessments
Developed policy implications and future research directions
Technical Implementation:
Wrote R code for data loading, transformation, visualization, and analysis
Configured Quarto document with appropriate YAML settings and code chunk options
Debugged technical issues and ensured reproducible analysis
Quality Assurance:
Proofread and edited all content for clarity, accuracy, and professionalism
Verified all citations, URLs, and data descriptions
Ensured compliance with assignment requirements and grading rubric
Data and Prior Work Acknowledgment
While this report represents original analysis conducted by the team, it builds upon the vehicletrends open-source data infrastructure developed by Dr. John Paul Helveston and the Helveston Lab at George Washington University. The raw MarketCheck data collection, VIN decoding pipeline, and pre-processing scripts represent prior research work made publicly available through GitHub.
Key Distinction:
Prior collaborative work: Data collection infrastructure, processing pipeline, and summary statistic generation (developed by Helveston Lab, publicly shared)
This team project: Research question formulation, data analysis, visualization creation, interpretation, and comprehensive written report (completed by Nithin Sarva, Jake Springer, and Jagannath Narayanaswamy)
All analysis code written for this report, including data loading, visualization generation, and statistical computations, is original work completed by the team for this course project.
Appendix
Data Dictionary
The following table describes the key variables used in this analysis:
| Variable | Source | Description | Type/Range |
|---|---|---|---|
| age_years | | Vehicle age calculated from model year to listing date | Continuous, 0-10+ years |
| age_months | Raw quantile files | Vehicle age in months (higher precision) | Integer, 0-120+ months |
| mileage_predicted | mileage_powertrain_type.csv | Predicted cumulative odometer reading based on linear regression model | Miles, 0-200,000+ |
| rr_predicted | depreciation_powertrain_type.csv | Predicted retention rate: (Listing Price / Original MSRP) | Ratio, 0-1 (or 0%-100%) |
| dvmt | quantiles_dvmt.csv | Daily Vehicle Miles Traveled estimate | Miles per day, 0-100+ |
| quantile | quantiles_*.csv | Percentile of the empirical distribution | Integer 1-99 representing percentile |
| miles25, miles50, miles75 | quantiles_miles files | 25th, 50th (median), and 75th percentile cumulative mileage values | Miles |
| rr25, rr50, rr75 | quantiles_rr files | 25th, 50th, and 75th percentile retention rate values | Ratio, 0-1 |
| median | hhi_make.csv | Median Herfindahl-Hirschman Index across geographic markets | HHI score, 0-10,000 |
| q25, q75 | hhi_make.csv | 25th and 75th percentile HHI values showing market variation | HHI score, 0-10,000 |
| year | hhi_make.csv | Year of market share snapshot | 2018 or 2024 |
| ev_share | Manual compilation (Figure 4) | Estimated EV market share (new vehicle sales, not listings) | Percentage, 0-100% |
| state | Manual compilation (Figure 4) | US state name | 50 states |
| lat, lng | Manual compilation (Figure 4) | Geographic coordinates for state centroids | Latitude/Longitude in degrees |
Important Variable Interpretations:
Mileage Variables: Derived from odometer readings in dealer listings, not from odometer audits or DMV registrations. Reflects advertised vehicle condition.
Price/Retention Variables: Based on advertised listing prices from dealers, not actual transaction prices or private party sales.
Market Share Metrics: Calculated from relative prevalence in listings, which may differ from actual on-road fleet composition or new vehicle sales shares.
EV Share (Figure 4): Represents estimated new vehicle sales share by state (2024), compiled from publicly available registration statistics. This measure is distinct from the listings-based metrics used in the other analyses.
Additional Data Information
Time Period: 2018-2024 (six years of continuous data)
Geographic Coverage: All 50 US states plus District of Columbia
Inflation Adjustment: All prices normalized to January 2024 dollars using Consumer Price Index (CPI-U)
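The inflation adjustment noted above amounts to scaling each nominal price by the ratio of the reference-month index to the listing-month index. The following base R sketch illustrates that arithmetic under stated assumptions: the function name `adjust_to_jan2024` is hypothetical, and the two index values are round placeholder numbers, not actual CPI-U figures or the values used in the vehicletrends pipeline.

```r
# Minimal sketch of a CPI-based price adjustment:
#   real_price = nominal_price * (CPI in Jan 2024 / CPI in listing month)
adjust_to_jan2024 <- function(nominal_price, cpi_listing, cpi_jan2024) {
  nominal_price * cpi_jan2024 / cpi_listing
}

# Hypothetical index values, for illustration only (not actual CPI-U data)
cpi_listing_month <- 250
cpi_jan2024 <- 300

# A listing advertised at $20,000 in the earlier month
real_price <- adjust_to_jan2024(20000, cpi_listing_month, cpi_jan2024)
print(real_price)
```

Expressing every listing in the same reference-month dollars keeps retention rates comparable across listing years, since otherwise general price inflation would be mistaken for slower depreciation.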
Data Type Clarification:
This analysis uses VEHICLE LISTING DATA, NOT registration or sales data. Specifically:
Source: Dealer inventory listings scraped from MarketCheck.com
What it represents: Vehicles advertised for sale by franchised and independent dealers
What it does NOT represent: Vehicle registration records (DMV data), actual completed sales transactions, private party sales, or total on-road vehicle fleet composition
Implications: Results reflect the used vehicle market as seen through dealer listings. Adoption rates, market shares, and prevalence metrics should be interpreted as indicators of listing activity rather than absolute measures of fleet composition or new vehicle sales. This distinction is critical when comparing to studies using DMV registration data or manufacturer sales reports.
Complete Code
```r
# Load libraries
library(tidyverse)
library(here)
library(scales)
library(knitr)
library(plotly)
library(leaflet)

knitr::opts_chunk$set(
  warning = FALSE,
  message = FALSE,
  comment = "#>",
  fig.path = "figs/",
  fig.width = 7.252,
  fig.height = 4,
  fig.retina = 3
)

# Set theme for all plots
theme_set(
  theme_minimal(base_size = 14) +
    theme(
      plot.title = element_text(face = "bold"),
      legend.position = "bottom"
    )
)

# Define color palette for powertrains (using data labels)
powertrain_colors <- c(
  "cv" = "#636363",
  "hev" = "#2ca02c",
  "bev" = "#1f77b4",
  "phev" = "#ff7f0e"
)

# Labels for display
powertrain_labels <- c(
  "cv" = "Gasoline",
  "hev" = "Hybrid",
  "bev" = "BEV",
  "phev" = "PHEV"
)

# Load data from vehicletrends GitHub repository
base_url <- "https://raw.githubusercontent.com/jhelvy/vehicletrends/main/data/"

# Mileage model coefficients by powertrain and type
mileage_data <- read_csv(
  paste0(base_url, "mileage_powertrain_type.csv"),
  show_col_types = FALSE
)

# Depreciation model results
depreciation_data <- read_csv(
  paste0(base_url, "depreciation_powertrain_type.csv"),
  show_col_types = FALSE
)

# Market concentration (HHI) data
hhi_make <- read_csv(paste0(base_url, "hhi_make.csv"), show_col_types = FALSE)

# Daily VMT quantiles data
quantiles_dvmt <- read_csv(paste0(base_url, "quantiles_dvmt.csv"), show_col_types = FALSE)

# Prepare mileage data for visualization
# Get mileage at age 3 years for comparison
mileage_plot_data <- mileage_data %>%
  filter(powertrain %in% c("cv", "hev", "bev", "phev")) %>%
  filter(age_years == 3) %>% # Compare at 3 years of age
  mutate(
    powertrain = factor(powertrain, levels = c("cv", "hev", "phev", "bev")),
    powertrain_label = recode(powertrain, !!!powertrain_labels),
    vehicle_type_label = str_to_title(vehicle_type),
    annual_miles = mileage_predicted / 3 # Average annual miles
  )

# Create mileage comparison chart
ggplot(mileage_plot_data, aes(x = vehicle_type_label, y = annual_miles, fill = powertrain)) +
  geom_col(position = position_dodge(width = 0.8), width = 0.7) +
  geom_text(
    aes(label = comma(round(annual_miles))),
    position = position_dodge(width = 0.8),
    vjust = -0.5,
    size = 3
  ) +
  scale_fill_manual(
    values = powertrain_colors,
    labels = powertrain_labels,
    name = "Powertrain"
  ) +
  scale_y_continuous(
    labels = comma_format(),
    expand = expansion(mult = c(0, 0.15))
  ) +
  labs(
    title = "Average Annual Mileage by Powertrain and Vehicle Type",
    subtitle = "Based on predicted mileage at 3 years of age (2018-2024)",
    x = "Vehicle Type",
    y = "Estimated Annual Miles"
  ) +
  theme(
    panel.grid.major.x = element_blank(),
    legend.position = "bottom"
  )

# Prepare depreciation data - aggregate by powertrain and age
depreciation_summary <- depreciation_data %>%
  filter(powertrain %in% c("cv", "hev", "bev", "phev")) %>%
  group_by(powertrain, age_years) %>%
  summarise(
    mean_rr = mean(rr_predicted, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(
    powertrain = factor(powertrain, levels = c("cv", "hev", "phev", "bev")),
    powertrain_label = recode(powertrain, !!!powertrain_labels)
  )

# Create depreciation chart
ggplot(depreciation_summary, aes(x = age_years, y = mean_rr, color = powertrain)) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 3) +
  scale_color_manual(
    values = powertrain_colors,
    labels = powertrain_labels,
    name = "Powertrain"
  ) +
  scale_y_continuous(
    labels = percent_format(),
    limits = c(0, 1),
    breaks = seq(0, 1, 0.2)
  ) +
  scale_x_continuous(breaks = 1:5) +
  labs(
    title = "Vehicle Value Retention Over Time by Powertrain",
    subtitle = "Retention rate (Price/MSRP) predicted from depreciation models",
    x = "Vehicle Age (Years)",
    y = "Retention Rate (% of MSRP)"
  ) +
  theme(
    legend.position = "bottom",
    panel.grid.minor = element_blank()
  )

# Prepare HHI data for visualization
hhi_plot_data <- hhi_make %>%
  filter(powertrain %in% c("cv", "hev", "bev", "phev")) %>%
  mutate(
    powertrain = factor(powertrain, levels = c("cv", "hev", "phev", "bev")),
    powertrain_label = recode(powertrain, !!!powertrain_labels),
    year = factor(year)
  )

# Create grouped bar chart comparing 2018 vs 2024
ggplot(hhi_plot_data, aes(x = powertrain_label, y = median, fill = year)) +
  geom_col(position = position_dodge(width = 0.8), width = 0.7) +
  geom_errorbar(
    aes(ymin = q25, ymax = q75),
    position = position_dodge(width = 0.8),
    width = 0.25,
    alpha = 0.7
  ) +
  geom_text(
    aes(label = percent(median, accuracy = 1)),
    position = position_dodge(width = 0.8),
    vjust = -2.5,
    size = 3
  ) +
  scale_fill_manual(
    values = c("2018" = "#3182bd", "2024" = "#e6550d"),
    name = "Year"
  ) +
  scale_y_continuous(
    labels = percent_format(),
    expand = expansion(mult = c(0, 0.15))
  ) +
  labs(
    title = "Brand Market Concentration by Powertrain Type",
    subtitle = "Median HHI with interquartile range (Q25-Q75) - Lower is more competitive",
    x = "Powertrain Type",
    y = "HHI (Market Concentration)"
  ) +
  theme(
    panel.grid.major.x = element_blank(),
    legend.position = "bottom"
  )

# State-level EV adoption data (based on publicly available EV registration statistics)
# Data represents approximate EV market share percentages by state (2024 estimates)
ev_adoption_data <- tibble(
  state = c(
    "California", "Washington", "Oregon", "Colorado", "Nevada",
    "Arizona", "Texas", "Florida", "New York", "New Jersey",
    "Massachusetts", "Connecticut", "Vermont", "Maine", "New Hampshire",
    "Maryland", "Virginia", "Georgia", "North Carolina", "Illinois",
    "Michigan", "Ohio", "Pennsylvania", "Hawaii", "Utah",
    "Minnesota", "Wisconsin", "Iowa", "Missouri", "Tennessee",
    "South Carolina", "Alabama", "Louisiana", "Oklahoma", "Kansas",
    "Nebraska", "South Dakota", "North Dakota", "Montana", "Wyoming",
    "Idaho", "New Mexico", "Alaska", "Kentucky", "Indiana",
    "Arkansas", "Mississippi", "West Virginia", "Rhode Island", "Delaware"
  ),
  ev_share = c(
    25.2, 15.8, 14.2, 12.5, 11.8,
    9.5, 6.2, 7.8, 10.5, 11.2,
    12.1, 10.8, 14.5, 8.2, 9.1,
    10.2, 9.8, 7.5, 7.2, 8.5,
    6.8, 5.2, 6.5, 13.5, 10.2,
    7.8, 5.5, 4.2, 4.8, 5.5,
    5.8, 4.2, 3.8, 3.5, 4.1,
    3.2, 2.8, 2.5, 3.5, 2.2,
    5.8, 7.2, 5.5, 4.5, 5.2,
    3.2, 2.8, 2.5, 9.5, 8.8
  ),
  lat = c(
    36.78, 47.75, 43.80, 39.55, 38.80,
    34.05, 31.97, 27.66, 42.17, 40.06,
    42.41, 41.60, 44.56, 45.25, 43.19,
    39.05, 37.43, 32.17, 35.76, 40.63,
    44.31, 40.42, 41.20, 19.90, 39.32,
    46.73, 43.78, 41.88, 38.57, 35.52,
    34.00, 32.32, 31.17, 35.01, 39.01,
    41.49, 43.97, 47.55, 46.88, 43.08,
    44.07, 34.52, 64.27, 37.84, 40.27,
    35.20, 32.35, 38.60, 41.58, 38.91
  ),
  lng = c(
    -119.42, -120.74, -120.55, -105.78, -116.42,
    -111.09, -99.90, -81.52, -74.95, -74.41,
    -71.38, -73.09, -72.58, -69.45, -71.56,
    -76.64, -78.66, -82.91, -79.02, -89.40,
    -84.54, -82.91, -77.19, -155.58, -111.09,
    -94.69, -88.79, -93.10, -92.29, -86.58,
    -81.03, -86.90, -91.87, -97.09, -98.48,
    -99.90, -99.90, -101.00, -110.36, -107.29,
    -114.74, -105.87, -153.37, -84.27, -86.13,
    -91.83, -89.40, -80.45, -71.48, -75.53
  )
)

# Create color palette
pal <- colorNumeric(
  palette = "YlGnBu",
  domain = ev_adoption_data$ev_share
)

# Create interactive map
leaflet(ev_adoption_data) %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  setView(lng = -98.5, lat = 39.8, zoom = 4) %>%
  addCircleMarkers(
    lng = ~lng,
    lat = ~lat,
    radius = ~sqrt(ev_share) * 3,
    color = ~pal(ev_share),
    fillColor = ~pal(ev_share),
    fillOpacity = 0.7,
    stroke = TRUE,
    weight = 1,
    popup = ~paste0(
      "<strong>", state, "</strong><br>",
      "EV Market Share: ", ev_share, "%<br>",
      "<em>Click for details</em>"
    ),
    label = ~paste0(state, ": ", ev_share, "% EV share")
  ) %>%
  addLegend(
    position = "bottomright",
    pal = pal,
    values = ~ev_share,
    title = "EV Market Share (%)",
    opacity = 0.7
  )

# Prepare DVMT quantile data
dvmt_plot_data <- quantiles_dvmt %>%
  filter(powertrain %in% c("cv", "hev", "bev", "phev")) %>%
  filter(quantile %in% c(25, 50, 75)) %>%
  mutate(
    powertrain = factor(powertrain, levels = c("cv", "hev", "phev", "bev")),
    powertrain_label = recode(powertrain, !!!powertrain_labels),
    quantile_label = case_when(
      quantile == 25 ~ "25th Percentile",
      quantile == 50 ~ "Median",
      quantile == 75 ~ "75th Percentile"
    ),
    quantile_label = factor(
      quantile_label,
      levels = c("25th Percentile", "Median", "75th Percentile")
    )
  )

# Create DVMT chart
ggplot(dvmt_plot_data, aes(x = powertrain_label, y = dvmt, fill = quantile_label)) +
  geom_col(position = position_dodge(width = 0.8), width = 0.7) +
  geom_text(
    aes(label = round(dvmt, 1)),
    position = position_dodge(width = 0.8),
    vjust = -0.5,
    size = 3
  ) +
  scale_fill_brewer(palette = "Blues", name = "Quantile") +
  scale_y_continuous(expand = expansion(mult = c(0, 0.12))) +
  labs(
    title = "Daily Vehicle Miles Traveled by Powertrain",
    subtitle = "Distribution of daily driving patterns (25th, 50th, 75th percentiles)",
    x = "Powertrain Type",
    y = "Daily Miles Traveled"
  ) +
  theme(
    panel.grid.major.x = element_blank(),
    legend.position = "bottom"
  )
```