The Study of Home Advantage in the English Premier League
Exploring the Existence and Influence of Crowd Atmosphere on Match Outcomes
Author
Badr Ismail, Sammy Sorayanejad, Sergio Ley, Tomas Haché Caro
Published
December 7, 2025
Introduction
In football, the idea of “home advantage” is widely accepted but not always well understood. Fans and commentators often talk about intimidating stadiums, familiar surroundings, or home supporters acting as a “twelfth man.” But beneath these narratives lies an important analytical question: does playing at home actually change match outcomes in a measurable and consistent way? And if home advantage does exist, does it affect all Premier League teams equally, or do some clubs benefit far more than others?
The English Premier League (EPL) provides an ideal setting to investigate these questions. It is one of the world’s most competitive leagues, with detailed match records spanning multiple decades. This allows us to examine long-term trends and to compare performance across seasons, teams, and match conditions.
Our project uses match data from 1993–94 through 2023–24, covering thousands of games across 31 seasons. We compute home and away win rates for every season and analyze goal-scoring patterns to evaluate whether home teams consistently outperform away teams. Beyond league-level trends, we also examine whether individual clubs show similar levels of home advantage, or whether team-specific factors lead to large variations.
Across the EPL era, we find clear evidence that home advantage exists: home teams win more often and score more goals than away teams in nearly every season. However, we also observe that the strength of home advantage is not constant. It varies across time and differs substantially between clubs. Some teams display a strong and consistent home edge, while others show relatively small differences between home and away performance.
Understanding these patterns matters because it reveals how environmental, psychological, and structural factors shape competitive outcomes. Home advantage is not just a narrative—it is a measurable feature of the league whose magnitude shifts depending on team identity, context, and league era.
Research Questions
This project investigates the existence and characteristics of home advantage in the English Premier League. Our analysis focuses on two guiding questions:
Primary Research Question
Does home advantage exist in the English Premier League?
This question establishes the foundation of our study by examining long-term differences in win rates and goal-scoring between home and away teams across more than 30 seasons.
Secondary Research Question
Do individual Premier League clubs follow the same home-advantage patterns as the league overall, or does the strength of home advantage vary significantly by team?
While league-wide averages show a clear home edge, this question explores deeper variation. By evaluating home and away performance for each club, we assess whether home advantage is universal or heavily team-specific.
Together, these questions allow us to measure the existence of home advantage at the league level and to understand how consistently (or inconsistently) this advantage appears across teams, seasons, and contexts.
Data Sources
Our analysis draws from three datasets that together cover Premier League seasons from 1993–94 through 2023–24. All files are stored locally in the data_raw/ directory as required by the course structure. Below, we describe each dataset, its origin, how it was collected, and potential limitations. Complete data dictionaries are included in the Appendix.
Dataset 1: English Premier League Results (1993–94 to 2021–22)
Description and collection details:
This dataset compiles nearly thirty seasons of Premier League match results. It was created by web-scraping match reports and score feeds from major sports reporting platforms, which draw from official Premier League statistical services. The dataset includes full-time and half-time scores, home/away team identifiers, match results, referee assignments, and various in-match statistics.
The file is pre-processed: the Kaggle contributor standardized variable names, removed inconsistencies, and formatted the data into a tidy match-level structure suitable for long-term seasonal analysis.
Potential missing data or biases:
Early-season records may lack metadata such as referee names or shot totals. Inconsistencies may reflect differences in record-keeping practices over time, especially before digital archiving was standardized. However, full-time scores and match outcomes — the variables central to our analysis — are consistently available.
Dataset 2: 2022–23 Season Match Statistics
Description and collection details:
This dataset provides match-level statistics for the full 2022–23 season. Football-Data.co.uk maintains weekly updates by scraping official Premier League match reports, incorporating bookmaker statistics, and manually correcting formatting issues. The Kaggle uploader reorganized the file into a tidy CSV format compatible with the long-term historical dataset.
Potential missing data or biases:
The dataset is complete for all 380 matches. Variables related to betting markets may differ from purely league-generated statistics but do not affect the scoring or outcome variables used in this study.
Dataset 3: 2023–24 Season Match Details
Description and collection details:
This dataset includes match-by-match details for the 2023–24 season, which was still in progress when the file was collected. It was collected by scraping FBref, a statistical service that maintains verified feeds from Premier League data partnerships. Unlike the historical match-level datasets, this file is team-centric: each match appears twice (once from each team’s perspective). We filtered on Venue == "Home" or Venue == "Away" when computing home and away statistics, so that each match contributes one home row and one away row.
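As a rough illustration of why the venue filter matters in a team-centric file, the sketch below uses a tiny made-up table. The column names Team, Venue, and Result follow the club-level code later in this report; the object name epl_toy and all values are hypothetical and are not taken from the real data.

```r
library(dplyr)

# Toy stand-in for the team-centric file: two matches, each recorded twice.
# All values are invented for illustration only.
epl_toy <- tibble(
  Team   = c("Team A", "Team B", "Team C", "Team D"),
  Venue  = c("Home",   "Away",   "Home",   "Away"),
  Result = c("W",      "L",      "D",      "D")
)

# Splitting by Venue gives one row per match on each side,
# so no match is counted twice within either group.
home_rows <- epl_toy %>% filter(Venue == "Home")
away_rows <- epl_toy %>% filter(Venue == "Away")

n_matches     <- nrow(home_rows)               # number of distinct matches
home_win_rate <- mean(home_rows$Result == "W") # share of matches won by the home side
away_win_rate <- mean(away_rows$Result == "W") # share of matches won by the away side
```

Without the split, a naive win rate over all rows would mix both perspectives and double the number of games in the denominator.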
Potential missing data or biases:
Because the season was still under way during collection, the source was being updated as new matches were played. Some advanced metrics (e.g., expected goals) rely on model-based estimates rather than direct observation.
General Considerations
All datasets used in our study were pre-processed before we obtained them; none represent raw scraping performed by our team.
Some variation in available variables across eras reflects changes in statistical reporting (e.g., xG added in recent seasons).
All three datasets record individual matches (one row per match in the first two, one row per team per match in the third), which enables both league-wide and team-specific comparisons. This is essential for analyzing variation in home advantage across clubs.
Differences in stadium construction, league rules, and scheduling formats may introduce contextual shifts in match outcomes over time.
Data dictionaries for all three datasets appear in the Appendix.
Results
To evaluate whether home advantage exists in the English Premier League, we began by examining long-term trends in home and away win rates across every season from 1993–94 through 2023–24. This first figure provides a high-level view of league-wide performance patterns over three decades and serves as the foundation for the rest of our analysis.
Code
library(tidyverse)  # dplyr, tidyr, ggplot2

Home_and_Away_Win_Rate_Table %>%
  pivot_longer(
    cols = c(home_win_rate, away_win_rate),
    names_to = "Type",
    values_to = "Rate"
  ) %>%
  ggplot(aes(x = Season, y = Rate, group = Type, color = Type)) +
  geom_line(linewidth = 1.3) +
  geom_point(size = 3) +
  scale_color_manual(
    values = c(
      "home_win_rate" = "#0077B6",  # Deep blue
      "away_win_rate" = "#D62828"   # Muted red
    )
  ) +
  labs(
    title = "Home vs Away Win Rate by Season",
    subtitle = "English Premier League, 1993–2024",
    x = "Season",
    y = "Win Rate",
    color = "Match Type"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", size = 18, hjust = 0.5),
    plot.subtitle = element_text(size = 13, hjust = 0.5, color = "gray40"),
    axis.text.x = element_text(angle = 65, hjust = 1, vjust = 1, size = 10),
    axis.title = element_text(face = "bold"),
    legend.position = "top",
    legend.title = element_text(face = "bold"),
    panel.grid.minor = element_blank(),
    panel.grid.major.x = element_blank(),
    plot.background = element_rect(fill = "white", color = NA)
  )
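The plotting code above starts from a season-level table with home_win_rate and away_win_rate columns, but the step that builds it is not shown. A minimal sketch of how such a table might be derived from match-level results follows. It assumes the full-time result column FTR is coded "H" (home win), "D" (draw), and "A" (away win); the object names matches_toy and win_rate_toy and all values are hypothetical.

```r
library(dplyr)

# Toy match-level data with placeholder season labels; values are invented.
matches_toy <- tibble(
  Season = c("S1", "S1", "S1", "S1", "S2", "S2"),
  FTR    = c("H",  "H",  "D",  "A",  "H",  "A")
)

# One row per season: the share of matches won by the home side and by the away side.
# Draws count toward the denominator of both rates.
win_rate_toy <- matches_toy %>%
  group_by(Season) %>%
  summarise(
    matches       = n(),
    home_win_rate = mean(FTR == "H"),
    away_win_rate = mean(FTR == "A"),
    .groups = "drop"
  )
```

Because draws stay in the denominator, the two rates do not sum to one, which is why home and away win rates can both move in the same direction from one season to the next.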
The pattern in this chart shows a clear and persistent gap between home and away performance: in nearly every season, home teams win more frequently than away teams, typically by a margin of 10–20 percentage points. This long-run consistency provides strong evidence that home advantage is a stable and measurable feature of Premier League competition.
While individual seasons show modest fluctuations, the overall trend is remarkably stable. Home teams consistently outperform away teams, suggesting that structural factors—such as pitch familiarity, reduced travel burden, and the psychological boost of playing in a supportive environment—play a sustained role in shaping match outcomes.
One feature stands out above all others: in the 2020–21 season, home and away win rates nearly converge. This is the only instance in the 31 seasons studied where the traditional home edge effectively disappears. The collapse is striking because it occurred during the COVID-19 pandemic, when almost all Premier League matches were played behind closed doors. This rare natural experiment strongly supports the idea that the crowd is a core driver of home advantage.
However, this chart only compares levels of home and away win rates. To better understand how strong the advantage is in each season, and how it changes over time, we calculate a more focused metric: net home advantage, defined as the difference between home and away win rates (Home − Away).
This transformation allows us to isolate the size of the advantage in each season and see more clearly when the gap widens, narrows, or behaves unusually. The next visualization shows this pattern over time.
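The metric is simply net home advantage = home win rate − away win rate, computed per season. A minimal static sketch of the calculation and a basic line chart is given below; it is not the animated chart used in this report, and the object names win_rates_toy, net_adv_toy, and p as well as all values are hypothetical placeholders.

```r
library(dplyr)
library(ggplot2)

# Toy season-level win rates with placeholder season labels; values are invented.
win_rates_toy <- tibble(
  Season        = c("S1", "S2", "S3"),
  home_win_rate = c(0.45, 0.50, 0.38),
  away_win_rate = c(0.30, 0.30, 0.40)
)

# Net home advantage: positive when home sides win more often than away sides.
net_adv_toy <- win_rates_toy %>%
  mutate(net_home_advantage = home_win_rate - away_win_rate)

# Static line chart with a dashed reference line at zero (no advantage).
p <- ggplot(net_adv_toy, aes(x = Season, y = net_home_advantage, group = 1)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray60") +
  geom_line(linewidth = 1) +
  geom_point(size = 3) +
  labs(x = "Season", y = "Net home advantage (home minus away win rate)")
```

A value of 0.15 means home teams won 15 percentage points more of their matches than away teams did; a negative value means away teams won more often.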
The animated chart reinforces the same conclusion from a different angle: home advantage is not only present, but consistently positive across the Premier League era. In most seasons, the net home advantage ranges from roughly 8 to 20 percentage points—meaning home teams win substantially more often than away teams. The long-term stability of this pattern highlights how deeply embedded home advantage is in the competitive structure of the league.
At the same time, the animation makes seasonal shifts easier to interpret. Some eras show stronger home dominance, while others exhibit a slightly narrower gap. These fluctuations likely reflect changes in team quality, league-wide tactical trends, and structural shifts such as stadium renovations or schedule congestion. Yet despite these contextual differences, the advantage almost never disappears—a sign of just how durable the phenomenon is.
The only season where the net home advantage approaches zero, and even dips slightly negative, is the pandemic-affected 2020–21 season. Almost all matches were played behind closed doors, providing a rare natural experiment. The collapse of home advantage during that season stands out sharply in the animation and underscores how sensitive match outcomes can be to environmental and psychological factors normally present during home fixtures.
Having established that home advantage exists and fluctuates in magnitude over time, the next question is how that advantage translates into on-field performance. Win rates reflect final outcomes, but to understand why home teams win more often, we examine scoring behavior directly. If home advantage influences match dynamics, we should observe consistent differences in the number of goals scored by home and away teams.
The next chart compares the average goals scored per match at home and away across the same thirty-year period.
Code
avg_goals_all %>%
  pivot_longer(
    cols = c(avg_home_goals, avg_away_goals),
    names_to = "Type",
    values_to = "Goals"
  ) %>%
  ggplot(aes(x = Season, y = Goals, group = Type, color = Type)) +
  geom_line(linewidth = 1.3) +
  geom_point(size = 3) +
  scale_color_manual(
    values = c(
      "avg_home_goals" = "#0077B6",  # same deep blue as home_win_rate
      "avg_away_goals" = "#D62828"   # same muted red as away_win_rate
    ),
    labels = c(
      "avg_home_goals" = "Home (avg goals)",
      "avg_away_goals" = "Away (avg goals)"
    )
  ) +
  labs(
    title = "Average Home vs Away Goals per Season",
    subtitle = "English Premier League, 1993–2024",
    x = "Season",
    y = "Average Goals Scored",
    color = "Team Type"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", size = 18, hjust = 0.5),
    plot.subtitle = element_text(size = 13, hjust = 0.5, color = "gray40"),
    axis.text.x = element_text(angle = 65, hjust = 1, vjust = 1, size = 10),
    axis.title = element_text(face = "bold"),
    legend.position = "top",
    legend.title = element_text(face = "bold"),
    panel.grid.minor = element_blank(),
    panel.grid.major.x = element_blank(),
    plot.background = element_rect(fill = "white", color = NA)
  )
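The chart code above assumes a season-level table with avg_home_goals and avg_away_goals. One way such a table could be produced from match-level scores is sketched below. The column names FTHG and FTAG (full-time home and away goals) are an assumption about the raw files, and matches_toy, avg_goals_toy, and all values are hypothetical.

```r
library(dplyr)

# Toy match-level scores with placeholder season labels; values are invented.
# FTHG / FTAG are assumed names for full-time home and away goals.
matches_toy <- tibble(
  Season = c("S1", "S1", "S2", "S2"),
  FTHG   = c(2, 1, 3, 0),
  FTAG   = c(1, 1, 0, 2)
)

# Average goals per match for each side, plus the per-match scoring gap.
avg_goals_toy <- matches_toy %>%
  group_by(Season) %>%
  summarise(
    avg_home_goals = mean(FTHG),
    avg_away_goals = mean(FTAG),
    goal_gap       = avg_home_goals - avg_away_goals,
    .groups = "drop"
  )
```

The goal_gap column is the quantity the text refers to when it describes a home scoring advantage in goals per match.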
The scoring trends mirror what we observed in the win-rate chart. Across nearly every season, home teams score more goals per match than away teams, typically maintaining a consistent advantage of about 0.3 to 0.5 goals. This sustained scoring gap reinforces the idea that home advantage is not random—it reflects a persistent performance edge that translates directly into match outcomes.
These scoring patterns also track closely with broader league trends. While individual seasons show small fluctuations, the overarching picture remains remarkably stable: home teams score more goals per match than away teams. Our data record goals rather than chances, so we cannot say whether the gap comes from creating more opportunities or from converting them more often, but greater offensive output is a key mechanism through which home advantage materializes over time.
Taken together, the first three charts establish a strong league-wide baseline:
Home teams win more often and score more goals, confirming the persistent existence of home advantage.
These patterns hold across three decades, spanning multiple tactical eras, managerial styles, and league evolutions.
The advantage is visible both in outcomes (win rates) and in underlying performance metrics (goals scored), strengthening the conclusion that home advantage is deeply embedded in Premier League competition.
With the league-wide picture clearly established, the next step is to determine whether this advantage is uniform across all clubs. Some teams may enjoy a pronounced home boost due to stadium atmosphere, travel distance for opponents, pitch dimensions, or fan engagement—while others may perform similarly whether at home or away.
To address our second research question, we now shift from the league level to the team level and analyze whether individual Premier League clubs exhibit comparable home-advantage patterns, or whether the strength of home advantage varies meaningfully across teams.
Code
# 1 Build unified match-level table

## ---- A) results + results_2022 (traditional FTR data) ----
results_all <- bind_rows(
  results %>%
    select(Season, HomeTeam, AwayTeam, FTR),
  results_2022 %>%
    mutate(Season = "2022-23") %>%  # this file covers the 2022-23 season only
    select(Season, HomeTeam, AwayTeam, FTR)
)

# Expand each match into two rows: one for the home club, one for the away club
results_long <- bind_rows(
  # home side
  results_all %>%
    transmute(
      Season,
      Club = HomeTeam,
      Venue = "Home",
      Win = if_else(FTR == "H", 1L, 0L),
      Game = 1L
    ),
  # away side
  results_all %>%
    transmute(
      Season,
      Club = AwayTeam,
      Venue = "Away",
      Win = if_else(FTR == "A", 1L, 0L),
      Game = 1L
    )
)

## ---- B) epl_2324 (per-team rows with Venue/Result) ----
epl_long <- epl_2324 %>%
  transmute(
    Season = "2023-24",
    Club = Team,
    Venue = if_else(Venue == "Home", "Home", "Away"),
    Win = if_else(Result == "W", 1L, 0L),
    Game = 1L
  )

## ---- C) Combined match-level data ----
matches_long <- bind_rows(results_long, epl_long)

# 2 Club-season summary with net home advantage
club_season <- matches_long %>%
  group_by(Season, Club) %>%
  summarise(
    home_games = sum(Game[Venue == "Home"], na.rm = TRUE),
    home_wins = sum(Win[Venue == "Home"], na.rm = TRUE),
    away_games = sum(Game[Venue == "Away"], na.rm = TRUE),
    away_wins = sum(Win[Venue == "Away"], na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(
    home_win_rate = if_else(home_games > 0, home_wins / home_games, NA_real_),
    away_win_rate = if_else(away_games > 0, away_wins / away_games, NA_real_),
    net_home_advantage = home_win_rate - away_win_rate
  ) %>%
  filter(!is.na(net_home_advantage))

# 3 Keep only clubs with >= 5 seasons and build a season index
club_season_filtered <- club_season %>%
  group_by(Club) %>%
  filter(n() >= 5) %>%
  ungroup()

# order seasons and create a numeric index for the x-axis
season_levels <- sort(unique(club_season_filtered$Season))

club_season_plot <- club_season_filtered %>%
  mutate(
    season_index = match(Season, season_levels),
    Club = factor(Club)  # facets ordered alphabetically by club name (factor default)
  )

# 4 Small-multiples plot: one panel per club
club_season_plot %>%
  ggplot(aes(x = season_index, y = net_home_advantage)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray85") +
  geom_line(aes(group = Club), linewidth = 0.35) +
  facet_wrap(vars(Club)) +
  scale_x_continuous(expand = c(0, 0)) +
  theme_minimal(base_size = 11) +
  theme(
    legend.position = "none",
    # remove all x-axis text and ticks for the facets
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    # keep the y-axis readable
    axis.text.y = element_text(size = 6),
    axis.title.y = element_text(size = 10),
    # facet labels
    strip.text = element_text(face = "bold", size = 9),
    panel.grid.minor = element_blank()
  ) +
  labs(
    x = "Seasons from 1993 to 2024",
    y = "Net home advantage",
    title = "Net Home Advantage by Season for Premier League Clubs"
  )
The fourth chart provides a club-by-club view of how home advantage has evolved across the Premier League era. Each small-multiples panel represents a single team’s trajectory (clubs with fewer than five seasons are omitted) and shows that home advantage is not uniform across clubs. Some teams, such as Liverpool, Manchester United, Newcastle, and Tottenham, display consistently positive home-advantage values across most seasons in which they appear. These clubs tend to maintain stable home performance, plausibly supported by strong fan environments, large stadiums, or tactical approaches that translate well to home fixtures.
For clubs that appear intermittently in the league, such as Coventry, Derby, Swansea, or Wigan, their shorter time spans reveal sharper fluctuations, suggesting that squad turnover, managerial instability, and adaptation to Premier League competition may all contribute to variability in home performance.
Taken together, this chart makes our conclusion unmistakably clear: home advantage exists at the league level, but the magnitude and consistency of that advantage vary widely by team.
Some clubs reliably outperform expectations at home year after year, while others show no meaningful home boost at all.
However, examining long-term club trends raises a deeper question: Did all clubs experience home advantage in the same way during major disruptions—especially those that directly affected fan presence and stadium atmosphere?
The most dramatic disruption occurred during the COVID-19 pandemic, when Premier League matches were played without spectators. Our earlier league-wide chart showed that home advantage nearly vanished during this period. But league aggregates alone cannot tell us whether every club lost its edge equally, or whether some teams were more resilient to the absence of fans.
To explore this, our next visualization summarizes each club’s average net home advantage across three distinct periods:
Pre-COVID (seasons through 2018–19: normal conditions with fans)
COVID (2019–20 and 2020–21: the two seasons affected by matches without fans)
Post-COVID (2021–22 onward: return of supporters)
This chart does not track season-by-season trajectories; instead, it condenses performance into period averages that show how much each club’s home advantage changed. By comparing the three periods side by side for every club that appears in all three (Norwich is excluded in the code), we can assess whether the pandemic weakened home advantage uniformly, or whether certain clubs were disproportionately affected or unexpectedly resilient.
Code
# Assign each club-season to a period based on the season's starting year
club_period <- club_season %>%
  mutate(
    net_home_advantage = home_win_rate - away_win_rate,
    start_year = as.integer(substr(Season, 1, 4)),
    period = case_when(
      start_year <= 2018 ~ "Pre-COVID",
      start_year %in% c(2019, 2020) ~ "COVID",
      start_year >= 2021 ~ "Post-COVID",
      TRUE ~ NA_character_
    )
  ) %>%
  filter(!is.na(period))

# Keep only clubs observed in all three periods
clubs_all3 <- club_period %>%
  group_by(Club) %>%
  filter(n_distinct(period) == 3) %>%
  ungroup()

# Average net home advantage per club and period
club_period_means <- clubs_all3 %>%
  group_by(Club, period) %>%
  summarise(
    net_home_advantage = mean(net_home_advantage, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  mutate(
    # ensure period order is correct
    period = factor(period, levels = c("Pre-COVID", "COVID", "Post-COVID"))
  )

# Remove Norwich and add short period labels for the x-axis
club_period_means_clean <- club_period_means %>%
  filter(Club != "Norwich") %>%
  mutate(
    PeriodShort = recode(
      period,
      "Pre-COVID" = "Pre",
      "COVID" = "COVID",
      "Post-COVID" = "Post"
    ),
    PeriodShort = factor(PeriodShort, levels = c("Pre", "COVID", "Post"))
  )

ggplot(
  club_period_means_clean,
  aes(x = PeriodShort, y = net_home_advantage, group = Club)
) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray85") +
  geom_line(color = "#555555", linewidth = 0.9) +
  geom_point(color = "#0077B6", size = 2.5) +
  facet_wrap(vars(Club), scales = "free_y") +
  labs(
    title = "Net Home Advantage for Each Club Across Covid Periods",
    x = "Period",
    y = "Net Home Advantage"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", size = 16, hjust = 0.5),
    plot.subtitle = element_text(size = 12, hjust = 0.5, color = "gray40"),
    strip.text = element_text(face = "bold", size = 10),
    # smaller axis text to avoid overlapping labels across facets
    axis.text.x = element_text(size = 7, angle = 0, vjust = 1),
    axis.text.y = element_text(size = 8),
    axis.ticks.length = unit(0.15, "cm"),
    panel.grid.minor = element_blank(),
    legend.position = "none"
  )
The fifth chart reveals several important insights about how clubs experienced home advantage before, during, and after the COVID-19 period. Across nearly every team, the COVID era produced a noticeable decline in net home advantage, confirming that fan absence weakened the traditional home boost. However, the magnitude of this decline—and the extent of the post-COVID recovery—varies considerably across clubs.
For almost every team shown, the COVID-era point drops sharply toward zero—and in many cases dips into negative values—indicating that clubs performed no better, and often worse, at home than away when stadiums were empty. This widespread collapse mirrors the league-wide pattern in earlier charts, demonstrating that the removal of crowds had an immediate and measurable impact on team performance.
What makes this finding especially compelling is its consistency across clubs with very different histories, budgets, playing styles, and stadiums. Whether a club traditionally shows a strong home edge (e.g., Liverpool, Manchester United, Tottenham) or a weaker one (e.g., Fulham, Brighton, Wolves), nearly all exhibit the same COVID-era reduction. This alignment between individual club behavior and league-wide trends again provides strong evidence that crowd presence is a fundamental driver of home advantage.
With the return of supporters in the post-COVID period, most clubs show a rebound toward pre-pandemic levels, further reinforcing the conclusion: when fans disappear, so does home advantage; when fans return, the advantage returns with them.
Together, the COVID-period analysis reinforces our earlier findings:
Home advantage weakened substantially when fans were absent.
The size of the decline differed dramatically across clubs.
The return of supporters restored home advantage for most teams, though not always to pre-pandemic levels.
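The size of the decline and the extent of the recovery summarized above can be expressed as simple per-club differences between period averages. The sketch below shows one possible way to compute them from a table shaped like the club-by-period averages used for the chart; club_period_toy, club_change_toy, covid_drop, recovery, and all values are hypothetical and are not results from our data.

```r
library(dplyr)
library(tidyr)

# Toy club-by-period averages; club names and values are invented.
club_period_toy <- tibble(
  Club   = rep(c("Club A", "Club B"), each = 3),
  period = rep(c("Pre-COVID", "COVID", "Post-COVID"), times = 2),
  net_home_advantage = c(0.30, 0.05, 0.25, 0.10, -0.05, 0.10)
)

# Reshape to one row per club, then measure the fall and the rebound.
club_change_toy <- club_period_toy %>%
  pivot_wider(names_from = period, values_from = net_home_advantage) %>%
  mutate(
    covid_drop = `Pre-COVID` - COVID,   # how far the advantage fell without fans
    recovery   = `Post-COVID` - COVID   # how much of it returned with supporters
  ) %>%
  arrange(desc(covid_drop))
```

Sorting by covid_drop puts the clubs whose home advantage fell the most at the top, which makes differences between clubs easier to compare than reading them off separately scaled panels.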
This evidence shows that home advantage is not merely a statistical artifact or a product of team quality alone. Instead, it is deeply connected to environmental and psychological factors—most notably the presence of fans—and these forces affect clubs in distinct ways.
With this understanding, we can now turn to the broader implications of our findings in the conclusion.
Conclusions
Across more than thirty seasons of Premier League data, our analysis shows that home advantage is both real and remarkably consistent. Home teams win more often, score more goals, and outperform away teams in nearly every season since 1993. Yet this edge is far from uniform: some clubs maintain a strong, stable home boost, while others fluctuate or show minimal benefit. The COVID-19 fan-free period further underscored the role of crowd atmosphere in sustaining home advantage, as league-wide performance gaps nearly vanished before rebounding once supporters returned.
Looking ahead, several avenues could deepen this work. Integrating attendance data, expected-goals metrics, travel distances, rest periods, and referee decision data would help clarify the mechanisms that create home advantage and explain why some clubs depend on it more than others. While many factors shape Premier League competition, one finding stands out: home advantage is a durable, structural feature of the league—powerful, uneven, and highly sensitive to the presence of fans.