Introduction


In this report, we investigate the following research question: What factors contribute to the severity of disciplinary actions against NFL players from 2002 to 2014? Our investigation stemmed from an article and dataset by Allison McCann, The NFL’s Uneven History Of Punishing Domestic Violence. For clarity, we define a few key terms. A disciplinary action is a suspension from one or more games; severity is the length of the suspension in games; and a suspension of 16 or more games is considered the most severe punishment. This research question is important because the topic receives little attention. The NFL is widely known for its ability to entertain and for the billions of dollars in revenue it generates. However, by not adequately punishing players accused of domestic violence, it creates an environment that is harmful to victims and survivors alike. It is important that institutions as large as the NFL are just and hold their players accountable.

Note: All indefinite game suspensions have been replaced with the value 32 for easier data exploration.


Data Sources


nfl_suspensions_clean_df is the data frame we used to answer our research question. The information in nfl_suspensions_clean_df was pulled from several other data sources cited below:


1. nfl_suspensions


Description: a dataset containing NFL suspensions from 1946 to 2014. The variables list the player name, the team they played for at the time, how many games they were suspended for, the category of the suspension, the description of the offense, the year it took place, and where the information on that suspension was sourced. We sourced this dataset from the fivethirtyeight package, which corresponds to the organization of the same name. The documentation for this dataset can be found at https://www.rdocumentation.org/packages/fivethirtyeight/versions/0.5.0/topics/nfl_suspensions. The data was originally compiled by Allison McCann using the San Diego Union-Tribune’s NFL arrests database, a Wikipedia list dating back to 1947, and the Spotrac suspension tracker.


Validity: The dataset was pre-processed from other sources by McCann, who admits that a direct database from the NFL would have been ideal. However, the NFL either did not keep records of its suspensions or was not willing to give her that information. The San Diego Union-Tribune’s NFL arrests database was updated yearly by staff writers who would comb through reports, public records, and databases using keywords to filter for “NFL” and “arrests”. The Wikipedia list is crowdsourced, but a variety of credible public news articles are cited as sources. The Spotrac suspension tracker comes from the Spotrac website, which tracks various data on different sports organizations, including the NFL and its arrests and suspensions per season. All three sources appear valid, drawing on information from verifiable public records. The only concern is the possibility that more suspensions occurred than were recorded.


2. All_Penalties.csv


Description: a dataset containing NFL suspensions and fines from 1996 to 2014. The variables list the player name, the position they played, the team they played for at the time, how many games they were suspended for, how much they were fined for, the category of the suspension, the description of the offense, and the date it took place. We originally sourced this dataset from the original creator, Alice Corona, on data.world which can be found here: https://data.world/alice-c/nfl-fines-and-suspensions/workspace/file?filename=All+Penalties.csv


Validity: this data was sourced from the original creator. From links posted in the Players_and_Penalties.csv file, it appears much of the data was compiled from Wikipedia profiles of the players; other sources are not listed. Wikipedia is crowdsourced, but a variety of credible public news articles are cited as sources. Since the observations in this dataset jump from 1996 to 2000, whereas nfl_suspensions has observations in between, some data points are clearly missing.


3. Players_and_Penalties.csv


Description: a dataset containing NFL players and their suspension and fine history from 1996 to 2014. The variables list the player name, the position they played, the team they played for at the time, how many times they were suspended or fined, how many games they were suspended for, how much they were fined for, the category of the suspension, the description of the offense, their age during their first fine, and the date their first fine took place. We originally sourced this dataset from the original creator, Alice Corona, on data.world which can be found here: https://data.world/alice-c/nfl-fines-and-suspensions/workspace/file?filename=Players+%26+Penalties.csv


Validity: this data was sourced from the original creator. From links posted in the file, it appears much of the data was compiled from Wikipedia profiles of the players; other sources are not listed. Wikipedia is crowdsourced, but a variety of credible public news articles are cited as sources. Since the observations in this dataset jump from 1996 to 2000, whereas nfl_suspensions has observations in between, some data points are clearly missing.


4. nfl_team_abbreviations


Description: a dataset containing NFL team names and their abbreviations. We sourced this dataset from https://gyaanipedia.fandom.com/wiki/NFL_Team_Abbreviations, which in turn sourced it from http://static.nfl.com/static/content/public/image/rulebook/pdfs/22_2012_Table_of_Codes.pdf. A second website, https://www.abbreviations.com/term/497231, was used to find the full name for the STL abbreviation. A third website, https://en.m.wikipedia.org/wiki/Free_agent, was used to find the full name for the FREE abbreviation.


Validity: this data frame was processed by Kareemot using the above three sources. The gyaanipedia fandom website copied the NFL team names and abbreviations from an official NFL document, so this source is very credible. The abbreviations.com website sources its information through an editorial board; since the team name and abbreviation align correctly, this source is credible. Finally, Wikipedia is crowdsourced with hyperlinked citations.


Data Cleaning


The data cleaning process for nfl_suspensions_clean_df is shown in the code chunk below. The data dictionary for this data frame and the others can be found in the Appendix.

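# Packages used throughout this report
library(tidyverse)   # dplyr, tidyr, stringr, forcats, ggplot2
library(ggrepel)     # geom_text_repel()
library(hrbrthemes)  # theme_ipsum()
library(gganimate)   # transition_reveal(), animate(), anim_save()

nfl_suspensions_copy <- nfl_suspensions_copy %>%

  # Split "F. Lastname" into first initial and last name
  separate(name, into = c("first_initial", "last_name"), sep = "\\.") %>%

  # Keep suspensions from 2000 through 2014 (compare as numbers, not strings)
  filter(year >= 2000, year <= 2014) %>%

  mutate(
    year = as.factor(year),
    last_name = str_trim(last_name),
    first_initial = str_trim(first_initial)
  )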


################  

nfl_penalties_copy <- nfl_penalties_copy %>%
  
  rename(
    
    "name" = "player"
  ) %>%
  
  filter(year >= 2000) %>%
  
  separate(name, into = c("first_name", "last_name"), sep = " " ) %>%
  
  mutate(
    
    first_name_copy = first_name
  ) %>%
  
  separate(first_name_copy, into = c("first_initial", "filler"), sep = 1 ) %>%
  
  mutate(
    year = as.factor(year),
    last_name = str_trim(last_name),
    first_initial = str_trim(first_initial),
    first_name = str_trim(first_name)
  ) %>%
  
  select(-filler, -team) %>%
  
  select(first_initial, last_name, first_name, everything())
  
################  

# Keep only suspensions that matched a row in the penalties data
nfl_suspensions_clean_df <- nfl_suspensions_copy %>%
  left_join(nfl_penalties_copy, by = c("last_name", "first_initial", "year")) %>%
  filter(!is.na(first_name))


 
 
################  
 
 copy_players_and_penalties <- copy_players_and_penalties %>%
   
   rename(
     "name" = "player"
     
   ) %>%
  
  separate(name, into = c("first_name", "last_name"), sep = " " ) %>%
  mutate(
    
    last_name = str_trim(last_name),
    first_name = str_trim(first_name)
  ) %>%
    
    select(first_name, last_name, date_of_birth)
    
    



################  

nfl_suspensions_clean_df <- nfl_suspensions_clean_df %>%
  
  left_join(copy_players_and_penalties, by = c("first_name", "last_name"))




################  

nfl_team_abbreviations <- nfl_team_abbreviations %>% 
  
  rename(
    
    "full_team_name" = "team",
    "team" = "abbreviation"
  )


nfl_suspensions_clean_df <- nfl_suspensions_clean_df %>% 

  left_join(nfl_team_abbreviations, by = "team")


###########################  


nfl_suspensions_clean_df <- nfl_suspensions_clean_df %>%
  
  separate(date_of_birth, into = c("filler", "year_of_birth"), sep = 6) %>%
  
  
  select(-first_initial, -source, -incident, -fine_category, -games_suspended_from, -fined_amount, -reason, -time_title, -date, -filler) %>%
  
  mutate(
    
    first_name_copy = first_name,
    last_name_copy = last_name,
    year = as.numeric(as.character(year)),
    year_of_birth = as.numeric(as.character(year_of_birth)),
    age_at_suspension = year - year_of_birth , 
    year_of_birth = as.factor(year_of_birth),
    year = as.factor(year)
   
    
  ) %>%
  
  unite(player_full_name, first_name_copy, last_name_copy, sep = " " ) %>%
   
   rename(

    "games_suspended_from" = "games"
    
  ) %>%
  
  mutate(

    # Trim first so the "Indef." comparisons below are not thrown off by stray whitespace
    games_suspended_from = str_trim(games_suspended_from),
    player_full_name = str_trim(player_full_name),
    description = str_trim(description),
    category = str_trim(category),

    suspended_indefinitely = games_suspended_from == "Indef.",

    # Personal conduct suspensions are labelled by their specific offense
    reason_for_suspension = case_when(
      category == "Personal conduct" ~ description,
      category != "Personal conduct" ~ category
    ),

    # Indefinite suspensions are recoded as 32 games
    games_suspended_from = case_when(
      games_suspended_from == "Indef." ~ "32",
      games_suspended_from != "Indef." ~ games_suspended_from
    )

  ) %>%
  
  
  select(player_full_name, first_name, last_name, reason_for_suspension, games_suspended_from, year, suspended_indefinitely, age_at_suspension, position, team, full_team_name, year_of_birth, category, description) %>%
  
  mutate(
    
    games_suspended_from = as.numeric(games_suspended_from)
  ) %>%
  
  separate(reason_for_suspension, into = c("reason_for_suspension", "additional_information_about_suspension"), sep = ", ") %>%
  
  mutate(
    
      additional_information_about_suspension = str_trim(additional_information_about_suspension),
      
      reason_for_suspension = str_trim(reason_for_suspension),
    
        
       # str_detect() already returns TRUE/FALSE (NA when there is no additional information)
       marijuana_related = str_detect(additional_information_about_suspension, "marijuana"),
       weapon_related = str_detect(additional_information_about_suspension, "weapon"),
       repeated_offense = str_detect(additional_information_about_suspension, "repeated"),
      
      
       additional_information_about_suspension = as.factor(additional_information_about_suspension),
       reason_for_suspension = as.factor(reason_for_suspension),
       team = as.factor(team),
       position = as.factor(position),
       full_team_name = as.factor(full_team_name)
       
    
  ) %>%
  
  # Final column order; category and description are dropped
  select(player_full_name, first_name, last_name, reason_for_suspension, additional_information_about_suspension, marijuana_related, weapon_related, repeated_offense, games_suspended_from, suspended_indefinitely, year, age_at_suspension, position, team, full_team_name, year_of_birth)

nfl_suspensions_clean_df <- nfl_suspensions_clean_df[!duplicated(nfl_suspensions_clean_df), ]  #removes duplicate rows




view(nfl_suspensions_clean_df)

# If games_suspended_from == 32, the player was suspended indefinitely (suspended_indefinitely is TRUE)
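
The matching step in the cleaning code above links the two sources on first initial, last name, and year, then recodes indefinite suspensions as 32 games. As a rough illustration only, the sketch below repeats that idea in base R on a tiny invented data set. The player names, values, and the object names suspensions, penalties, and matched are all hypothetical and are not taken from our data.

```r
# Toy data: every name and value here is invented for illustration only.
suspensions <- data.frame(
  name  = c("A. Smith", "B. Jones", "C. Brown"),
  year  = c(2010, 2012, 2013),
  games = c("4", "Indef.", "2"),
  stringsAsFactors = FALSE
)
penalties <- data.frame(
  player   = c("Adam Smith", "Bob Jones"),
  year     = c(2010, 2012),
  position = c("WR", "CB"),
  stringsAsFactors = FALSE
)

# Build the shared key columns: first initial and last name.
suspensions$first_initial <- trimws(sub("\\..*$", "", suspensions$name))
suspensions$last_name     <- trimws(sub("^[^.]*\\.", "", suspensions$name))
penalties$first_initial   <- substr(penalties$player, 1, 1)
penalties$last_name       <- trimws(sub("^\\S+\\s+", "", penalties$player))

# merge() keeps only rows whose key appears in both tables (an inner join),
# which has the same effect as a left join followed by dropping unmatched rows.
matched <- merge(
  suspensions[, c("first_initial", "last_name", "year", "games")],
  penalties[, c("first_initial", "last_name", "year", "position")],
  by = c("first_initial", "last_name", "year")
)

# Flag indefinite suspensions, recode them as 32 games, and convert to numeric.
matched$suspended_indefinitely <- matched$games == "Indef."
matched$games <- as.numeric(ifelse(matched$suspended_indefinitely, "32", matched$games))

print(matched)
```

One consequence of matching on a first initial rather than a full first name is that two different players who share an initial, a last name, and a year would be treated as the same person, so the matched rows are worth spot-checking.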


Finding Answers


To start, we identified the factors that could potentially contribute to the severity of disciplinary actions towards NFL players. Those factors are as follows:


  • Change in time
  • A player’s age at the time of their suspension
  • The team a player was on at the time of their suspension
  • A player’s position on their team at the time of their suspension
  • The offense that caused the suspension
  • Domestic Violence vs PEDs


Through different visualization methods, we will explore these factors and come to conclusions that will help us answer our research question.


Note:

The length of game suspensions is categorized into 5 severity levels where level 5 is the most severe.

  • Level 1 Severity: Suspended for 1 – 3 Games
  • Level 2 Severity: Suspended for 4 – 8 Games
  • Level 3 Severity: Suspended for 9 – 12 Games
  • Level 4 Severity: Suspended for 13 – 15 Games
  • Level 5 Severity: Suspended for 16+ Games
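
As a small sketch of how these five levels could be assigned, the base R function cut() can bin suspension lengths using the boundaries listed above. The vector games below is made up for illustration, and the short labels "Level 1" to "Level 5" are our own shorthand here, not the labels used in the figures.

```r
# Hypothetical suspension lengths in games (32 stands in for "indefinite").
games <- c(1, 3, 4, 8, 9, 12, 13, 15, 16, 32)

# Right-closed intervals: (0,3], (3,8], (8,12], (12,15], (15,Inf].
severity_level <- cut(
  games,
  breaks = c(0, 3, 8, 12, 15, Inf),
  labels = paste("Level", 1:5)
)

print(data.frame(games, severity_level))
print(table(severity_level))
```

Because the intervals are closed on the right, a 3-game suspension falls in level 1 and a 4-game suspension falls in level 2, matching the ranges in the list.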


Change in Time


kareemot_nfl_suspensions_clean_copy <-  nfl_suspensions_clean_df


kareemot_nfl_suspensions_clean_copy <- kareemot_nfl_suspensions_clean_copy %>%
  
  mutate(
    
    
    severity_level = case_when(

        games_suspended_from %in% 1:3 ~ "Level 1 Severity: 1-3 Games",
        games_suspended_from %in% 4:8 ~ "Level 2 Severity: 4-8 Games",
        games_suspended_from %in% 9:12 ~ "Level 3 Severity: 9-12 Games",
        games_suspended_from %in% 13:15 ~ "Level 4 Severity: 13-15 Games",
        games_suspended_from >= 16 ~ "Level 5 Severity: 16+ Games"

    )
    
  
  ) 



change_in_time_df <- kareemot_nfl_suspensions_clean_copy %>%

  count(year, severity_level) %>%

  ungroup()

#view(change_in_time_df)


change_in_time_line_graph2 <-  change_in_time_df %>%

  ggplot(aes(x = as.numeric(as.character(year)), y = n, color = severity_level)) +

  geom_line(size = 1) +

  geom_point(size = 2) +

  geom_text_repel(
   aes(label = severity_level),
   hjust = 0,
   nudge_x = 1,
   direction = "y",
   size = 4,
   segment.color = NA)  +

  scale_x_continuous(

  breaks = seq(2002, 2014, 4),
  expand = expansion(add = c(1,5))) +
  
  scale_color_manual(values = c("green4", "red2", "darkorchid1", "blue")) +

  theme_ipsum(grid = FALSE) +

  theme(legend.position = "none") +
  
  labs(
    
    x = "Year",
    y = "Frequency of Player Suspensions",
    title = "Frequency of Player Suspensions \n in the NFL Across 5 Severity Levels"
    
  )




change_in_time_line_anim <- change_in_time_line_graph2 +

  transition_reveal(as.numeric(as.character(year)))

# Render once, keep the result, and save that exact rendering
change_in_time_line_gif <- animate(change_in_time_line_anim, end_pause = 15, duration = 10, renderer = magick_renderer())

change_in_time_line_gif

anim_save(
  here::here("figs", "change_in_time_line_anim.gif"),
  animation = change_in_time_line_gif
)


From this animation, it is clear that NFL players receive suspensions across a wide range of severity levels. Level 2 suspensions appear to be the most common, while level 1 and level 5 suspensions were given most frequently after 2010. Although level 1 and level 5 suspensions appeared more often as time progressed, level 2 suspensions consistently surpassed them in frequency. However, the increased frequency of level 5 suspensions and the decreased frequency of level 1 suspensions as the years approached 2014 suggest that the NFL has become slightly harsher in the way it disciplines its players. We therefore tentatively conclude that disciplinary actions within the NFL have become more severe over time.




A player’s age at the time of their suspension


Next, we explore whether a player’s age at the time of their suspension affects the severity of their punishment.


ebun_nfl_suspensions_copy <- nfl_suspensions_clean_df


corr <- cor(
  ebun_nfl_suspensions_copy$age_at_suspension,
  ebun_nfl_suspensions_copy$games_suspended_from,
  method = 'spearman', use ="complete.obs")


# corr is Spearman's rank correlation (see method above); paste() already inserts a space
corrLabel <- paste("r =", round(corr, 2))

corrLabel2 <- paste("r^2 =", round(corr^2, 2))


ebun_nfl_suspensions_copy %>% 
  filter(! is.na(age_at_suspension)) %>% 
  filter(! is.na(games_suspended_from)) %>% 
  ggplot() +
  geom_point(aes(x = age_at_suspension, y = games_suspended_from), size = 2, alpha = 0.7, color = 'red3') +
  annotate(geom = 'text',
           x = 33, y = 28,
           label = corrLabel,
           hjust = 0, size = 5) +
    annotate(geom = 'text',
           x = 33, y = 23,
           label = corrLabel2,
           hjust = 0, size = 5) +
  # theme_ipsum() replaces the whole theme, so apply it before the theme() tweaks
  theme_ipsum() +
  labs(
    title = "Relationship Between Age and Suspension Length",
    x = "Player Age at Time of Suspension",
    y = "Length of Game Suspension") +
  theme(
    plot.title = element_text(hjust = 0.5),
    axis.title = element_text(margin = margin(t = 0, r = 20, b = 0, l = 0)),
    axis.title.y = element_text(margin = margin(t = 0, r = 10, b = 0, l = 0)),
    axis.text.x = element_text(margin = margin(t = 0, r = 20, b = 0, l = 0))
  )


The scatter plot shows no meaningful correlation between a player’s age and the length of the suspension they receive. This is supported by the correlation coefficient of -0.05 (a Spearman rank correlation, as computed above) and by its square, which rounds to 0, indicating that essentially none of the variation in suspension length is explained by age. This may suggest that factors associated with age, such as career length, relationships with NFL personnel, and maturity, also have little relationship with the severity of punishment a player receives, although we did not measure those factors directly.
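
For readers unfamiliar with the rank-based (Spearman) correlation used here, the sketch below shows on invented numbers that it is simply the ordinary Pearson correlation applied to ranks, and how cor.test() could be used to attach a p-value. The vectors age and games are hypothetical and are not drawn from our data. Note also that the squared value of a rank correlation is only a loose analogue of the coefficient of determination from a linear regression.

```r
# Made-up ages and suspension lengths, for illustration only.
age   <- c(22, 23, 24, 25, 26, 27, 28, 29, 30, 31)
games <- c(4, 1, 8, 2, 4, 16, 1, 4, 2, 6)

# Spearman's rho, computed directly.
rho <- cor(age, games, method = "spearman")

# The same value, obtained as the Pearson correlation of the ranks.
rho_from_ranks <- cor(rank(age), rank(games))

# cor.test() adds a p-value; exact = FALSE avoids a warning when ranks are tied.
result <- cor.test(age, games, method = "spearman", exact = FALSE)

print(round(rho, 2))
print(result$p.value)
```

Using ranks rather than raw values makes the measure less sensitive to extreme suspension lengths, such as indefinite suspensions recoded as 32 games.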


The team a player was on at the time of their suspension


In this next figure we decided to explore the distribution of game suspensions across the teams in the NFL as a whole.


boxplot_kyara <- nfl_suspensions_clean_df %>%
  
  mutate(full_team_name = fct_reorder(full_team_name, games_suspended_from)) %>%
  
  ggplot() +
    
  geom_boxplot(aes(x = games_suspended_from, y = full_team_name)) +
  geom_vline(
    xintercept = median(nfl_suspensions_clean_df$games_suspended_from), color = 'red', linetype = 'dashed') +
  theme_ipsum() +
  annotate(
    'text', x = 4.5, y = 'New York Jets', color = 'red', hjust = 0, label = 'Median Game Suspensions Across Teams', size = 4
  ) +
 labs(x = "Number of games suspended from", y = "NFL team name", title = "Distribution of Game Suspensions \nAcross NFL Teams", subtitle = "Compared to the overall median game suspensions across teams, which is equal to 4")

boxplot_kyara


From this, we found that the overall median suspension length across all teams is 4 games. The Dallas Cowboys, with zero outliers, have the highest median of all teams, well above the overall median. The San Francisco 49ers, on the other hand, have the lowest median of all teams, well below the overall median. The Green Bay Packers have the largest interquartile range (IQR) of all teams, meaning the middle half of their suspensions varies widely in length. A few teams, such as the New York Giants and the Miami Dolphins, had very few game suspensions.
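
The summary statistics read off the box plot (the overall median, each team's median, and each team's IQR) can also be computed directly. The following is a minimal sketch on invented data; the team labels "Team A" and "Team B" and all values are hypothetical.

```r
# Invented team labels and suspension lengths, for illustration only.
toy <- data.frame(
  team  = c("Team A", "Team A", "Team A", "Team B", "Team B", "Team B", "Team B"),
  games = c(1, 4, 16, 2, 4, 4, 6)
)

# Overall median: the value a dashed reference line would mark.
overall_median <- median(toy$games, na.rm = TRUE)

# Per-team median and interquartile range (the width of each box).
team_median <- tapply(toy$games, toy$team, median, na.rm = TRUE)
team_iqr    <- tapply(toy$games, toy$team, IQR, na.rm = TRUE)

print(overall_median)
print(cbind(median = team_median, IQR = team_iqr))
```

In this toy example both teams share the same median, yet one has a much wider IQR, which is why we report the spread alongside the median when comparing teams.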


A player’s position on their team at the time of their suspension


kareemot_nfl_suspensions_clean_copy <- kareemot_nfl_suspensions_clean_copy %>%
  
  mutate(
    
    
    # Same levels as before, with a line break in each label for the tile chart axis
    severity_level = case_when(

        games_suspended_from %in% 1:3 ~ "Level 1 Severity: \n 1-3 Games",
        games_suspended_from %in% 4:8 ~ "Level 2 Severity: \n 4-8 Games",
        games_suspended_from %in% 9:12 ~ "Level 3 Severity: \n 9-12 Games",
        games_suspended_from %in% 13:15 ~ "Level 4 Severity: \n 13-15 Games",
        games_suspended_from >= 16 ~ "Level 5 Severity: \n 16+ Games"

    )
    
  
  ) 



# x <-  player_position_df %>%
#   group_by(position) %>%
#   summarise(
#     
#     sum = sum(n)
#   )


player_position_df <- kareemot_nfl_suspensions_clean_copy %>%
   
   count(position, severity_level) %>%

  rename(

    "Frequency of \nPlayer Suspensions" = "n"
  )


tile_chart <- player_position_df %>%
  
  ggplot(aes(x = severity_level, y = position, fill = `Frequency of \nPlayer Suspensions`)) +
  
  geom_tile() +
  
  scale_y_discrete(
    
    expand = expansion(mult = c(0.05,0))
  ) +
  
  coord_cartesian(clip = "off") +
  
  labs(
    
    x = "Level of Suspension Severity",
    y = "Position on Team",
    title = "Severity of Suspensions in the NFL \nBased on Player Position",
    subtitle = "For the years 2002 - 2014"

    
  )  +
  
  theme_ipsum(grid = FALSE) +
  
  
  scale_fill_viridis_c(
    
    option = "magma", 
    direction = -1
  )


tile_chart


Wide Receivers clearly have the most suspensions of any position; they also tie with Safeties for the most level 5 suspensions. Cornerbacks and Running Backs appear to follow.


The offense that caused the suspension


mean_df <- kareemot_nfl_suspensions_clean_copy %>%

  # Label missing reasons explicitly instead of relying on row positions
  mutate(
    reason_for_suspension = replace_na(as.character(reason_for_suspension), "Unknown")
  ) %>%

  group_by(reason_for_suspension) %>%

  summarise(
    mean_game_suspensions = mean(games_suspended_from)
  ) %>%

  mutate(
    # Order the bars by average suspension length
    reason_for_suspension = fct_reorder(reason_for_suspension, mean_game_suspensions),
    # Flag the bar to highlight in red
    reason_for_suspension_fill = reason_for_suspension == "DUI manslaughter"
  )


  
mean_chart <- mean_df %>%
  
  ggplot() + 
  
  geom_col(aes(x = mean_game_suspensions, y = reason_for_suspension, fill = reason_for_suspension_fill)) +
  
  
  scale_x_continuous(

  limits = c(0, 14)) +
  
  scale_fill_manual(values = c("grey85", "red2")) +

  theme_ipsum(grid = "X") +

  labs(
    
    x = "Average Game Suspensions",
    y = "Offense",
    title = "Average Game Suspensions in the NFL",
    subtitle = "For the Years 2002 - 2014"
  ) +

  theme(legend.position = "none")


mean_chart   


DUI manslaughter has the highest average game suspension of all offenses, while the offenses with the lowest average game suspensions are disorderly conduct, accused of battery, and resisting arrest. The average game suspension for domestic violence appears to fall above most other offenses, including performance-enhancing drugs (PEDs). These findings contradict claims made by Allison McCann in her article The NFL’s Uneven History Of Punishing Domestic Violence where she states “The NFL’s punishment of personal conduct violations has been inconsistent and on average less harsh than its punishment of drug offenses.”

We have concluded that the difference in these findings may be due to the way we categorized the data. McCann grouped multiple offenses under “Personal Conduct Violations,” while we ungrouped them and allowed them to stand alone as individual offenses. The time span covered by her research also differs from ours.
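
To illustrate how grouping alone can change the picture, the sketch below compares averages computed by broad category with averages computed after splitting one category into its specific offenses. All values are invented, and "Offense X" and "Offense Y" are placeholder labels rather than offenses from our data.

```r
# Invented rows: four personal conduct suspensions and two PED suspensions.
toy <- data.frame(
  category    = c("Personal conduct", "Personal conduct", "Personal conduct",
                  "Personal conduct", "PEDs", "PEDs"),
  description = c("Offense X", "Offense X", "Offense Y", "Offense Y", NA, NA),
  games       = c(8, 6, 1, 1, 4, 4),
  stringsAsFactors = FALSE
)

# Grouped: every personal conduct offense shares a single average.
grouped <- tapply(toy$games, toy$category, mean)

# Ungrouped: personal conduct rows are split by their specific description.
toy$reason <- ifelse(toy$category == "Personal conduct", toy$description, toy$category)
ungrouped <- tapply(toy$games, toy$reason, mean)

print(grouped)
print(ungrouped)
```

In this toy data the grouped personal conduct average equals the PED average, while the ungrouped view shows one offense well above it and another well below it. A conclusion about which kind of offense is punished more harshly can therefore depend on how the offenses are grouped.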


Domestic Violence vs PEDs


After analyzing the highest and lowest average game suspensions, we decided to explore PEDs and domestic violence further as reasons for suspension. That exploration produced the following graph and conclusion:


kyara_copy_nfl_suspensions_clean <-nfl_suspensions_clean_df %>%
  filter(reason_for_suspension == "Domestic violence" | reason_for_suspension == "PEDs")


histogram_kyara <-kyara_copy_nfl_suspensions_clean %>%
  count(games_suspended_from, reason_for_suspension) %>%
  ggplot() +
  geom_col(aes(x = games_suspended_from, y = n), width = 1.0, color = 'blue', fill = 'blue') +
  facet_wrap(vars(reason_for_suspension), nrow = 1) +
  scale_y_continuous(
    expand = expansion(mult = c(0, 0.05))) +
  theme_ipsum() +
  labs(y = "Count", x = "Total number of games suspended from", title = "Frequency Distribution of Game Suspensions", subtitle = "Suspensions for use of PEDs and charges of Domestic Violence")

histogram_kyara


From the above graph, which shows the frequency of suspension lengths for PEDs and domestic violence, we conclude that suspensions for PEDs are more frequent and more severe than those for domestic violence. In addition, the most common suspension length for PEDs is around 3-4 games, while for domestic violence it is 1-3 games.
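
The "most common suspension length" for each reason can be read from a frequency table as well as from the graph. A minimal sketch on invented data follows; "Reason A" and "Reason B" are placeholder labels and the counts are hypothetical.

```r
# Invented suspension lengths for two placeholder reasons.
toy <- data.frame(
  reason = c("Reason A", "Reason A", "Reason A",
             "Reason B", "Reason B", "Reason B", "Reason B"),
  games  = c(1, 1, 2, 4, 4, 4, 16)
)

# Rows are reasons, columns are suspension lengths, cells are counts.
counts <- table(toy$reason, toy$games)

# For each reason, pick the suspension length with the highest count.
most_common <- apply(counts, 1, function(row) as.numeric(names(row)[which.max(row)]))

print(counts)
print(most_common)
```

When two lengths are tied for the highest count, which.max() returns the first one, so ties are worth checking by eye in the table.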


Final Conclusion


In looking at the factors that could contribute to the severity of disciplinary actions in the NFL, we found that some have little to no influence while others may be strong predictors. Age does not appear to affect a player’s level of punishment, since we found no correlation between the two. Similarly, the majority of NFL teams have a median suspension length of around 3 or 4 games. The exceptions are the Dallas Cowboys, who fall well above the overall median, and the San Francisco 49ers, who fall well below it. Even so, their medians remain reasonably close to those of the other teams. It is therefore reasonable to suggest that the team a player is on has minimal effect on the severity of their suspensions.


On the other hand, a player’s position may be associated with more severe suspensions. We observed that Wide Receivers receive more suspensions, and more severe ones, than any other position. While nothing inherent to the position would cause more suspensions, it is worthwhile to investigate whether the way Wide Receivers are perceived in the NFL contributes to a higher propensity to commit offenses or to receive more severe suspensions.


Ultimately, the most concrete factors in the severity of disciplinary actions in the NFL are the type of offense a player commits and the passage of time. Time matters because the league has become harsher in its discipline. This may be due to changes in American culture and in how we view certain violations, such as drug use. With respect to the type of offense, the trend is clear: domestic violence, excluding the one player in our data who was suspended indefinitely for the offense, is consistently under-punished compared to the use of performance-enhancing drugs (PEDs). As our culture as a nation continues to evolve, we must question why we penalize drug use more than violence against women, and ask how institutions like the NFL can take the lead in changing this narrative.


Appendix


nfl_suspensions_clean_df Data Dictionary:


Variable                                 Description
player_full_name                         First name, last name of player
first_name                               First name of player
last_name                                Last name of player
reason_for_suspension                    Category offense falls under
additional_information_about_suspension  Context of offense
marijuana_related                        TRUE if offense was marijuana related
weapon_related                           TRUE if offense was weapon related
repeated_offense                         TRUE if offense was a repeat
games_suspended_from                     Number of games player was suspended (indefinite suspensions recoded as 32)
suspended_indefinitely                   TRUE if player suspended indefinitely
year                                     Year offense took place
age_at_suspension                        Age of player at time of suspension
position                                 Position player played at time of suspension
team                                     Abbreviated team name player played for at time of suspension
full_team_name                           Full team name player played for at time of suspension
year_of_birth                            Year player was born


nfl_suspensions Data Dictionary:


Variable     Description
name         First-initial, last name of player
team         Team player played for at time of suspension
games        Number of games player suspended from
category     Category suspension fell under
description  Specific offense within the category
year         Year suspension occurred
source       Original news source on the suspension


All_Penalties.csv Data Dictionary:


Variable              Description
incident              Brief description of incident
player                First name, last name of player
position              Position player played
team                  Team player played for at time of incident
fine_category         Category offense falls under
games_suspended_from  Number of games suspended from
fined_amount          Amount player was fined
reason                Specific offense within the category
time_title            Which week during season suspension was applied
date                  Year, month, day of incident
year                  Year of incident


Players_and_Penalties.csv Data Dictionary:


Variable                     Description
player                       First name, last name of player
team_s_when_fined_suspended  Team player played for at time of incident
role_when_fined_suspended    Position player played
number_of_fines              Number of times player has been fined
days_of_suspensions          Number of games suspended from playing
fined_amount                 Amount player was fined first time
age_first_fine_approximate   Age of player during first fine
date_first_fine              Year, month, day of first fine
date_of_birth                Year, month, day of player’s birth
biography                    Brief description of player
wikepedia_profile            Wikipedia link to player profile
image                        Link to image of player


nfl_team_abbreviations Data Dictionary:


Variable      Description
Abbreviation  Abbreviation of NFL team name
Team          Full NFL team name