Research Question: What factors influence one's comfort on airplanes?

Many factors shape a passenger's comfort on an airplane. Whether they fly for business or vacation, millions of people fly every day and have formed opinions on how one should behave during a flight. A crying baby, a claimed middle armrest, or lost legroom when the person in front reclines their seat has affected most passengers at some point, which makes this a topic that nearly everyone can relate to. Now sit back and imagine you are about to take a flight where not everything goes as you had hoped.

Let's explore the data set

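         #install.packages("fivethirtyeight")
         library(fivethirtyeight)  # provides the flying data set
         library(tidyverse)        # provides glimpse(), drop_na(), the pipe, and ggplot2
         glimpse(flying)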
## Rows: 1,040
## Columns: 27
## $ respondent_id        <dbl> 3436139758, 3434278696, 3434275578, 3434268208, …
## $ gender               <chr> NA, "Male", "Male", "Male", "Male", "Male", "Mal…
## $ age                  <ord> NA, 30-44, 30-44, 30-44, 30-44, 30-44, 30-44, 30…
## $ height               <ord> NA, 6'3", 5'8", 5'11", 5'7", 5'9", 6'2", 6'0", 6…
## $ children_under_18    <lgl> NA, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE,…
## $ household_income     <ord> NA, NA, "$100,000 - $149,999", "$0 - $24,999", "…
## $ education            <ord> NA, Graduate degree, Bachelor degree, Bachelor d…
## $ location             <chr> NA, "Pacific", "Pacific", "Pacific", "Pacific", …
## $ frequency            <ord> Once a year or less, Once a year or less, Once a…
## $ recline_frequency    <ord> NA, About half the time, Usually, Always, About …
## $ recline_obligation   <lgl> NA, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE,…
## $ recline_rude         <ord> NA, Somewhat, No, No, No, No, Somewhat, No, No, …
## $ recline_eliminate    <lgl> NA, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRU…
## $ switch_seats_friends <ord> NA, No, No, Somewhat, No, Somewhat, Somewhat, No…
## $ switch_seats_family  <ord> NA, No, No, No, No, No, No, No, NA, Very, No, NA…
## $ wake_up_bathroom     <ord> NA, No, No, No, Somewhat, Somewhat, No, No, NA, …
## $ wake_up_walk         <ord> NA, No, Somewhat, Somewhat, Somewhat, Very, No, …
## $ baby                 <ord> NA, No, Somewhat, Somewhat, Somewhat, Very, No, …
## $ unruly_child         <ord> NA, No, Very, Very, Very, Very, Somewhat, Very, …
## $ two_arm_rests        <chr> NA, "The arm rests should be shared", "Whoever p…
## $ middle_arm_rest      <chr> NA, "The arm rests should be shared", "The arm r…
## $ shade                <chr> NA, "Everyone in the row should have some say", …
## $ unsold_seat          <ord> NA, No, No, No, No, Somewhat, No, No, No, Very, …
## $ talk_stranger        <ord> NA, No, No, No, No, No, Somewhat, No, No, Very, …
## $ get_up               <ord> NA, Twice, Three times, Three times, Twice, Once…
## $ electronics          <lgl> NA, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FAL…
## $ smoked               <lgl> NA, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FA…

About FiveThirtyEight

FiveThirtyEight is the source of this data set, which was published on September 5, 2014. FiveThirtyEight ran a SurveyMonkey survey containing various questions about one’s experience during a flight. It received 1,040 responses from regions across the United States (Mountain, West South Central, South Atlantic, West North Central, New England, Pacific, Middle Atlantic, East North Central, and East South Central), which produced a varied set of opinions. The data is reasonably trustworthy because it was gathered directly from passengers who have flown on airplanes. Although the data comes from first-hand sources, some answers may be inaccurate, since respondents may not have filled out the survey to the best of their ability. Andrei Scheinkman collected the original data, which was then published in an article by Walt Hickey. The data is pre-processed and is accessible from the “fivethirtyeight” R package.

It is also available on GitHub at https://github.com/fivethirtyeight/data/tree/master/flying-etiquette-survey and has not been altered since 2014.

Cleaning the Data Set

flying_new <- flying %>%
        drop_na()

flying_proj <- flying_new %>%
        select(gender, age, height, household_income, education, location, frequency, recline_frequency, recline_obligation, recline_rude, switch_seats_family, wake_up_bathroom, baby, unruly_child, two_arm_rests, middle_arm_rest, shade, unsold_seat, talk_stranger, get_up)

glimpse(flying_proj)
## Rows: 582
## Columns: 20
## $ gender              <chr> "Male", "Male", "Male", "Male", "Male", "Male", "…
## $ age                 <ord> 30-44, 30-44, 30-44, 30-44, 30-44, 30-44, 30-44, …
## $ height              <ord> 5'8", 5'11", 5'7", 5'9", 6'0", 5'6", 6'0", 5'8", …
## $ household_income    <ord> "$100,000 - $149,999", "$0 - $24,999", "$50,000 -…
## $ education           <ord> Bachelor degree, Bachelor degree, Bachelor degree…
## $ location            <chr> "Pacific", "Pacific", "Pacific", "East North Cent…
## $ frequency           <ord> Once a year or less, Once a year or less, Once a …
## $ recline_frequency   <ord> Usually, Always, About half the time, Usually, On…
## $ recline_obligation  <lgl> TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE…
## $ recline_rude        <ord> No, No, No, No, No, Very, No, No, Very, No, No, N…
## $ switch_seats_family <ord> No, No, No, No, No, Very, No, No, No, No, No, Som…
## $ wake_up_bathroom    <ord> No, No, Somewhat, Somewhat, No, Very, Somewhat, S…
## $ baby                <ord> Somewhat, Somewhat, Somewhat, Very, Somewhat, Ver…
## $ unruly_child        <ord> Very, Very, Very, Very, Very, Very, Very, No, No,…
## $ two_arm_rests       <chr> "Whoever puts their arm on the arm rest first", "…
## $ middle_arm_rest     <chr> "The arm rests should be shared", "The arm rests …
## $ shade               <chr> "The person in the window seat should have exclus…
## $ unsold_seat         <ord> No, No, No, Somewhat, No, Very, Very, No, No, No,…
## $ talk_stranger       <ord> No, No, No, No, No, Very, No, No, No, No, No, No,…
## $ get_up              <ord> Three times, Three times, Twice, Once, Four times…
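The cleaning step above calls drop_na() on all 27 columns before selecting the 20 columns of interest, so a row is also removed when its only missing value sits in a column that is later discarded. The following is a minimal sketch of that effect on made-up toy data (not the survey responses), written in base R with complete.cases() standing in for drop_na(). The data frame and its values are hypothetical.

```r
# Toy data (hypothetical values, not taken from the survey)
toy <- data.frame(
  gender = c("Male", "Female", NA, "Female"),
  baby   = c("No", "Very", "No", "Somewhat"),
  smoked = c(FALSE, NA, TRUE, FALSE)  # a column that is not kept afterwards
)

# Order used above: drop incomplete rows first, then keep two columns
drop_then_select <- toy[complete.cases(toy), c("gender", "baby")]

# Alternative order: keep two columns first, then drop incomplete rows
selected <- toy[, c("gender", "baby")]
select_then_drop <- selected[complete.cases(selected), ]

nrow(drop_then_select)  # 2
nrow(select_then_drop)  # 3
```

Which order is preferable depends on the analysis: dropping first keeps only fully completed surveys, while selecting first retains more respondents.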

The airport is always a busy place. Let's see how often people travel…

Visualization - The frequency of travel by region in the United States

library(ggridges)
library(tidyverse)
library(cowplot)  # provides theme_minimal_vgrid(), theme_minimal_hgrid(), and panel_border()


flying_map <- flying_proj %>%
  group_by(frequency)%>%
  count(location, frequency)%>%
  mutate(p = n/sum(n), percent = round(100 * p, 2))



ggplot(flying_map) +
  geom_segment(aes(x = location,xend = location,y = 0,yend = percent, color= frequency),size=2) +
  geom_point(aes(x=location, y=percent,color= frequency), size= 2.5)+
  coord_flip()+
 theme_minimal_vgrid(
                font_family = 'Fira Sans Condensed',
                font_size = 10 ) +
  scale_y_continuous(
            limits = c(0, 100))+
  facet_wrap(~frequency, ncol=1)+
  panel_border()+
  
     labs(title = '                US travel frequency by region',
        y = 'Passenger Percentage',
        x = 'US Region',
        color = 'Frequency')

Visualization 1 - Description and Observations:

We began by looking at how often people travel in each region of the United States, using a lollipop chart of the location and frequency variables. Before plotting, we grouped the data by frequency and computed the percentage of responses for each combination of location and frequency. The graph shows that most people travel once a year or less regardless of region (75% or more). There is a sharp drop to the next category: 10%-30% of passengers travel once a month or less. Two outliers stand out: one person in the Pacific region who travels every day, and one person in the South Atlantic region who travels a few times per week.
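The percentages in a chart like this depend on which variable defines the groups, because that choice sets the denominator. The following is a minimal sketch on made-up toy data (not the survey responses), using base R's table() and prop.table() rather than the dplyr pipeline above. The toy values are assumptions for illustration only.

```r
# Toy data (hypothetical values, not taken from the survey)
toy <- data.frame(
  location  = c("Pacific", "Pacific", "Pacific", "Mountain", "Mountain"),
  frequency = c("Once a year or less", "Once a year or less",
                "Once a month or less", "Once a year or less",
                "Once a month or less")
)

# Cross-tabulate: rows are regions, columns are travel frequencies
counts <- table(toy$location, toy$frequency)

# Share of each frequency within a region (each row sums to 100)
pct_within_location <- round(100 * prop.table(counts, margin = 1), 2)

# Share of each region within a frequency (each column sums to 100)
pct_within_frequency <- round(100 * prop.table(counts, margin = 2), 2)

pct_within_location
pct_within_frequency
```

A statement such as "most people in every region travel once a year or less" is a claim about the first table, where each region's percentages add up to 100.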

After booking a flight to Florida, you walk into a packed airplane where you find that your seat is taken….

Visualization - Correlation of One's Income and View of Using an Unsold Seat

flying_income <- flying_proj %>%
        group_by(household_income)%>%
        count(household_income, unsold_seat) %>%
        mutate(p = n/sum(n), percent = round(100 * p, 2))


ggplot(flying_income) +
        geom_col(aes(x = as.factor(household_income),y = percent, fill = unsold_seat),
        position = 'dodge',
        width = 0.7, alpha = 0.8) +
         scale_y_continuous(
                limits = c(0, 100))+
        coord_flip() +
        theme_minimal_vgrid() +
        labs(x = 'Household Income',
        y = ' Percentage of Responses',
        fill = 'Is it rude?',
        title = "Correlation of One's Income and View of Using an Unsold Seat")

Visualization 2 - Description and Observations:

Can one's income influence whether they find it rude for people to sit in an unsold seat during a flight? We explored the two variables household_income and unsold_seat with a bar chart. We first grouped the data by household_income and then computed the percentage of each unsold_seat response within each income bracket. The graph shows that the higher one's income, the larger the share of respondents who do not find it rude when someone sits in an unsold seat (84%). Surprisingly, passengers with a low income ($0-$24,999) have the highest percentage of responses finding it very rude when people sit in an unsold seat (8.3%). These trends indicate that if you decide to switch to an unsold seat, you most likely won't anger other passengers. It is likely, however, that a small percentage of passengers would be dissatisfied: an airplane ticket may be an expensive purchase relative to their income, so they may prefer that other passengers stay in the seats they purchased.

Following a long argument, you finally sit down in your seat, the flight takes off, and the babies begin to cry and children create chaos ….

Visualization - The correlation of one's age and their view of an unruly child being brought on a plane

flying_mod <- flying_proj %>%
        group_by(age)%>%
        count(age, unruly_child)%>%
        mutate(p = n/sum(n), percent = round(100 * p, 2))
       
        
ggplot(flying_mod) +
        geom_col(aes(y = as.factor(age),x = percent, fill = unruly_child),
        position = 'dodge',
        width = 0.7, alpha = 0.8) +
         scale_x_continuous(
                limits = c(0, 100))+
        coord_flip() +
        theme_minimal_hgrid() +
        labs(y = 'Age',
        x = ' Percentage of Responses',
        fill = 'Is it rude?',
        title = " How age influences one's view of an unruly child being \n                          brought on a plane?")

Visualization 3 - Description and Observations:

We expected an interesting relationship between the variables “age” and “unruly_child”, so we made a bar plot showing how passengers' age relates to their reaction to unruly children being brought onto an airplane. Our first step was to summarise the data frame and get the percentage of responses within each age group on whether it is rude to bring an unruly child onto a plane. The possible answers were “No”, “Somewhat”, or “Very”. The graph shows that as people grow older, they have less patience for unruly children. Among 18-29 year olds, only 35.94% of passengers thought it was very rude. Among 30-44 year olds, 37.25% felt the same way. Among 45-60 year olds, that percentage increases to 39.13%. Finally, over the age of 60, it goes up to 50%. It is also worth noting that among 30-44 year olds, the percentage of “No” answers is higher (23.53%) than in all the other categories (17.19%, 15.53%, and 12.86% in order). That may be because more people in the 30-44 age range have growing children, since many people have babies when they are 25-35 years old. These people may therefore relate to the parents who bring unruly children onto the plane and may not be as annoyed by it.

Visualization - The correlation of one's gender and their view of a baby being brought on a plane

flying_child <- flying_proj %>%
        group_by(gender)%>%
        count(gender, baby)%>%
        mutate(p = n/sum(n), percent = round(100 * p, 2))
       
        
ggplot(flying_child )+
        geom_col(aes(y = as.factor(gender),x = percent, fill = baby),
        position = 'dodge',
        width = 0.7, alpha = 0.8) +
         scale_x_continuous(
                limits = c(0, 100))+
        coord_flip() +
        theme_minimal_hgrid() +
        labs(y = 'Gender',
        x = ' Percentage of Responses',
        fill = 'Is it rude?',
        title = " How gender influences one's view of a\n     baby being brought on a plane?")

Visualization 4 - Description and Observations:

We chose a bar plot to compare the variables “gender” and “baby”. We first summarised the data by taking, for each gender, the percentage of each response on whether it is rude to bring a baby on the plane. The possible answers were “No”, “Somewhat”, and “Very”. After plotting, we concluded that women in general are less bothered by a baby on the plane: nearly 75% of them think that it is not rude at all, whereas only around 60% of the men think that. For the “Very” responses, men clearly have a higher percentage: around 15%, while only 5% of women think that it is very rude to bring a baby on the plane.

These findings raise social questions such as: “Do women genuinely care more about babies than men?” or even “Are women less bothered by a baby on the plane because they relate to the parents (considering what society imposes on women from birth)?”

As you try to get comfortable, in the middle seat, both passengers next to you decide to use the arm rests….

Visualization - Correlation of one's frequency of travel and their view of who should use the middle arm rest

flying_seat <- flying_proj %>%
  group_by(frequency)%>%
  count(middle_arm_rest, frequency)%>%
  mutate(p = n/sum(n), percent = round(100 * p, 2))




ggplot(flying_seat)+
  geom_col(aes(x = middle_arm_rest, y= percent, fill=frequency), size =3) +
  coord_flip()+
 theme_minimal_vgrid(
                font_family = 'Fira Sans Condensed',
                font_size = 10 ) +
  scale_y_continuous(
            limits = c(0, 100))+
  panel_border()+
  facet_wrap(~frequency, ncol=1)+
  labs(
    y = 'Passenger Percentage',
    x = 'Who should use it?',
    fill = 'Frequency of Travel',
    title = "                         Correlation of one's frequency of travel and their view\n                                 of who should use the middle arm rest")

Visualization 5 - Description and Observations:

The more often one flies, the more opinions one forms about other passengers' behavior, including how the armrests should be used. With several people sitting next to each other, the armrests have to be shared across the row, and the outcome is not always fair to every passenger. This graph shows how flying more frequently affects a passenger's view of who should use the middle arm rest. Passengers who fly less tend to believe that the armrests should be shared; as people fly more, they start to think that a specific passenger gets to use it, whether that is whoever gets there first or whoever is sitting in the aisle seat. The people who fly daily, who have the most experience, chose the person in the aisle seat. This may be because they usually sit in the aisle seat, as it is a preferred seat, and they personally like to use the armrest. While there is no correct opinion, it appears that as passengers take more flights, they become more likely to think that a particular passenger should get the middle armrest based on their seat, rather than whoever gets there first.

You try to ignore the commotion around you and attempt to sleep, but the person next to you won't put the shade down and the person in front of you reclines their seat, leaving you with no room….

Visualization - The correlation of one's education and their opinion of who should control the window shade

flying_education <- flying_proj %>%
        group_by(education)%>%
        count(education, shade)%>%
        mutate(p = n/sum(n), percent = round(100 * p, 2))



ggplot(flying_education) +
  geom_point(aes(x=education, y=percent,color= shade), size= 7)+
 theme_minimal_hgrid(
                font_family = 'Fira Sans Condensed',
                font_size = 25 ) +
  panel_border()+
  scale_y_continuous(
            limits = c(0, 100))+
     labs(x = 'Education level', y = 'Percentage of responses',
                color = 'Who should have control over the window shade?',
                title = "How one's education influences their opinion on who should have control over the window shade")

Visualization 6 - Description and Observations:

For this visualization, our goal was to explore the relationship between the variables “shade” and “education”, to see whether passengers with a higher or lower level of education had different opinions on whether the person sitting by the window should have exclusive control over the window shade or everyone sitting in that row should have a say. Our first step was to summarise the data set and calculate the percentage of each response within each education level. The possible answers were “Everyone in the row should have some say” or “The person in the window seat should have exclusive control”. The possible education levels were “graduate degree”, “bachelor degree”, “some college or associate degree”, “high school degree”, or “less than high school degree”.

Our findings were as follows. In the “extremes” (graduate degree and less than high school degree), passengers preferred that everyone in the row have a say (the percentages were 100% and 65.58%). Passengers in the “middle education levels” (bachelor degree, some college or associate degree, and high school degree) held the same majority opinion, but by much smaller margins (59.57%, 54.91%, and 51.22%). Correspondingly, the share of passengers who thought that the person sitting by the window should have exclusive control over the window shade was higher in the “middle education levels” and much lower in the “extreme” ones: 40.43%, 45.09%, and 48.78% for the “middle” levels, against 0% and 34.42% for the “extreme” ones. This result shows that the majority of people agree that everyone in the row should have some say. However, we found the differences in percentages interesting, since they showed an unexpected relationship.

Visualization - Relationship between passenger’s height and their opinion on if reclining a seat on an airplane is rude

label<- "Interesting to note that no one \nwithin the 5'1 height category found \nit 'very' rude to recline a seat! "

flying_heightRude <- flying_proj %>%
        group_by(height)%>%
        count(height, recline_rude) %>%
        mutate(p = n/sum(n), percent = round(100 * p, 2))


ggplot(flying_heightRude)+
        geom_col(aes(x = height, y = percent, fill = recline_rude), width = .55) +
        facet_wrap(~recline_rude, ncol = 3)+
        geom_curve(
                data = data.frame(x = 3, xend = 3,
                y = 50, yend = 5,
                recline_rude = 'Very'),
                mapping = aes(x = x, y = y, xend = xend, yend = yend),
                color = 'grey75', size = 0.5, curvature = 0.3,
                arrow = arrow(length = unit(0.09, 'npc'), type = 'open')) +
        geom_label(
                data = data.frame(x = 3, y = 50,
                label = label, recline_rude = 'Very'),
                mapping = aes(x = x, y = y, label = label),
                hjust = 0, lineheight = .8,
                family = 'Fira Sans Condensed',
                size= 9) +
        scale_y_continuous(
                limits = c(0, 100),
                expand = expansion(mult = c(0, 0.05)))+
        theme_minimal_hgrid(
                font_family = 'Fira Sans Condensed',
                font_size = 30 ) +
    theme(legend.position= "bottom" )+
  coord_flip()+
  panel_border()+
        labs(title = "                                                                                     Relationship between passenger's height and their opinion on if reclining a seat on an airplane is rude",
             x = "Height",
             y = "Percentage of Responses",
             fill = " Is it rude?")

Visualization 7 - Description and Observations:

We chose to do multiple bar graphs, one panel for each response, to display the relationship between passenger’s height and their opinion on whether reclining a seat on an airplane is rude. Each panel shows the height categories on the vertical axis and the percentage of passengers on the horizontal axis. These graphs indicate that taller passengers, especially those over 6’, are more likely to think that reclining a seat on a plane is rude. This makes sense because taller passengers have longer legs and therefore less leg room in front of them, so a reclined seat makes them more uncomfortable. Among those who are around 5’, the responses are almost all “No”, since a reclined seat would not make them as uncomfortable. This analysis suggests that there is a relationship between height and a passenger’s opinion on reclining a seat: the taller a passenger is, the more likely they are to think reclining a seat is rude.

Adding on to this tiring flight, the person next to you is asking you to get up again in order for him to use the restroom for the third time…

Visualization - Relationship between passenger’s height and the number of times they believe it is acceptable to get up throughout a flight

flying_getup <- flying_proj %>%
        group_by(height)%>%
        count(height, get_up) %>%
        mutate(p = n/sum(n), percent = round(100 * p, 2))


ggplot(flying_getup)+
        geom_col(aes(x = height, y = percent, fill = get_up), width = .55) +
        facet_wrap(~get_up, ncol =2)+
        scale_y_continuous(
                limits = c(0, 100),
                expand = expansion(mult = c(0, 0.05)))+
        theme_minimal_hgrid(
                font_family = 'Fira Sans Condensed',
                font_size =15 ) +
  panel_border()+
   theme(legend.position= "bottom" )+
        labs(title = "      Relationship between passenger's height and the number of times they believe it is acceptable to get up throughout a flight",
             x = "Height",
             y = "Percentage of Responses",
             fill = "How many times?")

Visualization 8 - Description and Observations:

Lastly, we chose to analyze the relationship between one's height and the number of times they believe it is acceptable to get up during a 6 hour flight. Is it possible that the taller one is, the more bothered one is by a neighbor's constant movement (usually because one has to adjust one's legs so that person can get up)? To explore this relationship we used the get_up and height variables. After grouping by height and calculating the percentage of each get_up response within each height, we drew a separate bar chart for each get_up response. The overall trends indicate that most people find it acceptable to get up twice or three times, while almost none said that it is completely unacceptable to get up. As an outlier, 40% of passengers with a height of 6’5” surprisingly indicate that it is acceptable to get up four times. This is an unexpected result, as it indicates that tall passengers in fact do not mind having a person get up often during a flight. Based on these results, most passengers share a common viewpoint regardless of their height.
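A percentage computed within a small group can look striking even when it reflects only a couple of respondents, so it can help to read group sizes alongside percentages. The following is a minimal sketch on made-up toy data (not the survey responses) in base R. The heights, counts, and the name summary_tbl are hypothetical.

```r
# Toy data (hypothetical values, not taken from the survey):
# 50 respondents in one height group, 5 in the other
toy <- data.frame(
  height = c(rep("5'8\"", 50), rep("6'5\"", 5)),
  get_up = c(rep("Twice", 30), rep("Four times", 20),
             rep("Twice", 3), rep("Four times", 2))
)

# Cross-tabulate: rows are heights, columns are get_up responses
counts <- table(toy$height, toy$get_up)

# Report the group size next to the within-group percentage
summary_tbl <- data.frame(
  height         = rownames(counts),
  group_size     = as.vector(rowSums(counts)),
  pct_four_times = as.vector(round(100 * counts[, "Four times"] / rowSums(counts), 2))
)

summary_tbl
```

In this toy example both groups show 40%, but one figure rests on 20 of 50 respondents and the other on 2 of 5.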

Conclusion

The airplane gathers hundreds of people from different demographics and viewpoints. To better understand the factors that cause tensions among passengers, we conducted this research using variables that we believed were important and demonstrated interesting trends and correlations.

Data Dictionary

Variable             Type       Description
gender               character  Gender
age                  ordinal    Age
height               ordinal    Height
household_income     ordinal    Household income bracket
education            ordinal    Education level
location             character  Location (census region)
frequency            ordinal    How often do you travel by plane?
recline_frequency    ordinal    Do you ever recline your seat when you fly?
recline_obligation   logical    Under normal circumstances, does a person who reclines their seat during a flight have any obligation to the person sitting behind them?
recline_rude         ordinal    Is it rude to recline your seat on a plane?
switch_seats_family  ordinal    Is it rude to ask someone to switch seats with you in order to be closer to family?
wake_up_bathroom     ordinal    Is it rude to wake a passenger up if you are trying to go to the bathroom?
baby                 ordinal    In general, is it rude to bring a baby on a plane?
unruly_child         ordinal    In general, is it rude to knowingly bring unruly children on a plane?
two_arm_rests        character  In a row of three seats, who should get to use the two arm rests?
middle_arm_rest      character  In a row of two seats, who should get to use the middle arm rest?
shade                character  Who should have control over the window shade?
unsold_seat          ordinal    Is it rude to move to an unsold seat on a plane?
talk_stranger        ordinal    Generally speaking, is it rude to say more than a few words to the stranger sitting next to you on a plane?
get_up               ordinal    On a 6 hour flight from NYC to LA, how many times is it acceptable to get up if you’re not in an aisle seat?