Market Pricing Strategies for Different Brands of Mobile Phones

Do consumers really like phones with high-specification camera capabilities?

Author

Ji Qi(祁绩), Zeyu Cheng(程泽宇)

Published

December 10, 2023

Introduction

Our study examines the dynamics of the cell phone market, using the data visualization and analytical capabilities of the R language. In an era where smartphones have moved beyond their role as mere communication tools and become integral to our personal and professional identities, it is crucial to understand what drives their market value. A pivotal question arises: do consumers really pay a premium for the enhancement of a single feature, namely camera functionality, when other hardware components remain largely unchanged? Consider the scenarios in which we use our phones for photography, and where these photos typically end up. Is there a real need for smartphone brands to keep improving their cameras and related photographic capabilities? This research is structured around several core questions aimed at unraveling the relationship between cell phone attributes, especially photographic performance, and market prices. Moreover, recently gathered data on cell phone shipments from various vendors gives us an opportunity to examine the relationship between sales volumes and the attributes of cell phones.

Research Questions

Our research issue can be broken down into the following specific questions.

Which phone attributes significantly affect the price of cell phones?
Does photographic performance significantly affect the price of cell phones?
Given the newly collected manufacturer shipment data and the costs involved, do consumers really want high-end cell phone photography features?

Compared with our initial proposal, the newly collected shipment data from some vendors allows us to analyze the relationship between sales and cell phone attributes. Because it is not related to the course content, the planned prediction of cell phone prices has been dropped for the time being.

Data Introduction and Discussion

Data Description

The study draws on three source data tables and one table derived from them. The names of the original data tables are “cleaned_all_phones”, “Phonesshipped”, “statistic_id271490_global-smartphone-shipments-by-vendor- 2009-2023” and “statistic_id271539_smartphone-unit-shipments-worldwide-2007-2022-by-vendor”.

In our research, we embarked on a meticulous process of data exploration, collection, screening, and cross-validation, with a particular focus on smartphone shipments. This process is critical in ensuring the integrity and reliability of our findings.

Cross-Validation of Data: During the organization of the smartphone shipment data, we conducted cross-validation using two distinct tables. This involved consolidating quarterly data into annual shipment volumes to compare the consistency of shipment volumes for the same brands in the same years across both tables. The outcome revealed a consistency in the data, thereby affirming the reliability of our information. This cross-validation step was crucial in establishing the credibility of our data sources.
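The consolidation-and-comparison step described above can be sketched roughly as follows. This is a minimal illustration, not the code used in the study: the toy tables, the column names (`Quarter`, `Year`, `Brand`, `ShipNum`), the brand names, and the 2% tolerance are all assumptions, and the values are made up.

```r
library(dplyr)

# Toy quarterly shipments in million units (illustrative values, not real data).
quarterly <- data.frame(
  Quarter = rep(c("2021 Q1", "2021 Q2", "2021 Q3", "2021 Q4"), times = 2),
  Brand   = rep(c("BrandA", "BrandB"), each = 4),
  ShipNum = c(10, 20, 30, 40, 5, 5, 5, 5)
)

# Toy annual shipments for the same brands and year.
annual <- data.frame(
  Year    = c(2021, 2021),
  Brand   = c("BrandA", "BrandB"),
  ShipNum = c(100, 21)
)

# Consolidate quarters into years, then compare against the annual table.
check <- quarterly %>%
  mutate(Year = as.numeric(substr(Quarter, 1, 4))) %>%
  group_by(Brand, Year) %>%
  summarise(ShipFromQuarters = sum(ShipNum), .groups = "drop") %>%
  inner_join(annual, by = c("Brand", "Year")) %>%
  mutate(
    RelDiff    = abs(ShipFromQuarters - ShipNum) / ShipNum,
    Consistent = RelDiff < 0.02  # 2% tolerance, chosen arbitrarily for the sketch
  )

print(check)
```

Rows flagged as inconsistent would then be inspected by hand, since small gaps can come from rounding in the published figures.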

Data Limitations: Our research encountered several limitations that necessitated careful consideration:

Granularity: The data we sourced was disaggregated by brand across different years. However, the granularity was not sufficient to detail the shipment volumes of each phone model in different years.

Substitution of Sales Volumes with Shipments: Since some actual sales volumes were not disclosed, we used shipment volumes as a proxy, relying on their positive correlation with sales to explore our research theme.

Data Gaps and Non-Public Company Data: We faced challenges due to missing data and non-disclosure from private companies. Consequently, our study was based as much as possible on available data, focusing on identifying overarching trends and common patterns.

“cleaned_all_phones” is the main dataset, which contains data on the attributes and prices of phones released by different brands from 2016 to 2023. The attributes in this dataset include screen size, RAM size, battery size, phone weight, memory size, and ten yes-no scales that measure the camera capabilities of the phone.
“statistic_id271490_global-smartphone-shipments-by-vendor- 2009-2023” contains shipment data for different brands of cell phones by quarter from the fourth quarter of 2009 through the second quarter of 2023.
“statistic_id271539_smartphone-unit-shipments-worldwide-2007-2022-by-vendor” contains shipment data for different brands of cell phones by year from 2007 through 2022.
“Phonesshipped” is the data extracted from “statistic_id271490_global-smartphone-shipments-by-vendor- 2009-2023”.

Data Resources and Data Validity

“cleaned_all_phones” comes from Kaggle (https://www.kaggle.com/datasets/berkayeserr/phone-prices/data). This is a public dataset. It is highly usable because it has few missing values. It is also reliable because the data it contains (i.e., attribute data and prices of different cell phones) can be checked against official sources, such as the official websites of cell phone manufacturers. The dataset may exclude some phone models that were discontinued in 2023.
“statistic_id271490_global-smartphone-shipments-by-vendor- 2009-2023” comes from the STATISTA database (https://www.statista.com/statistics/271490/quarterly-global-smartphone-shipments-by-vendor/). The author of the datasheet is IDC (idc.com) and the datasheet was released in July 2023. The dataset does not appear to have been preprocessed, and it has missing entries, which may be because the shipment data is sensitive and unavailable or because the brand concerned is no longer in the cell phone business (e.g., HTC, Lenovo).
“statistic_id271539_smartphone-unit-shipments-worldwide-2007-2022-by-vendor” also comes from the STATISTA database. The author of the datasheet is IDC (idc.com) and the datasheet was released in January 2023. The dataset does not appear to have been preprocessed either, and it shares the disadvantage of having many missing entries.
Data Alert: Due to the sensitivity of shipment data (for example, most companies do not directly report shipments for their handset business alone in their financial reports), the handsets covered by the first dataset and by the two shipment datasets do not exactly overlap. This does affect our research on consumer preferences.

Data primer

Data preprocessing

1. In our study, a crucial step involved in-depth preprocessing of the price data and the attributes of mobile phones. This preprocessing phase was carried out to ensure data accuracy and usability, and it encompassed several key actions:

Assigning Values to Non-Metric Scales: We encountered ten non-metric scales in our dataset. These scales were not initially in a format conducive to quantitative analysis. To resolve this, we assigned numerical values to these scales, specifically using the numbers 0 and 1. This binary coding transformed qualitative attributes into a format that could be easily processed and analyzed, enabling us to include these non-metric scales in our quantitative models.
Streamlining Date Data: The original dataset contained detailed date information, including both the year and month of the phone’s release. However, for the purposes of our analysis, such granular detail was unnecessary. Therefore, we streamlined this data to only include the year of release. This reduction not only simplified our dataset but also made it more relevant for our analysis, focusing on annual trends rather than monthly fluctuations.
Simplifying Complex Values: In the process of data cleaning, we encountered several instances where the values were overly complex or detailed for the scope of our analysis. To address this, we simplified these values without losing their essential characteristics or altering their underlying meaning. This simplification made the data more manageable and facilitated a more straightforward analysis.
Rearranging Data Tables: The organization of data in the tables was not optimal for analysis. We undertook a systematic rearrangement of these data tables, ensuring that related data points were aligned in a way that made sense for our analytical purposes. This reorganization not only improved the clarity of the dataset but also enhanced the efficiency of our subsequent data analysis processes.

library(tidyverse)  # readr, dplyr, tidyr, ggplot2
library(here)       # project-relative file paths

PhoneData <- readr::read_csv(here("data_raw", "cleaned_all_phones.csv"))
# str(PhoneData)

PhoneData <- PhoneData %>%
    # code the ten video-capability flags as integers 0 and 1
    mutate(across(video_720p:video_960fps, as.integer)) %>%
    mutate(price_USD = as.integer(price_USD)) %>%
    # keep only the release year from the announcement date
    mutate(release_year = as.numeric(substr(announcement_date, 1, 4))) %>%
    # move the identifying columns to the front and sort
    select(phone_name, brand, release_year, everything()) %>%
    arrange(brand, release_year)

write_csv(PhoneData, here("data_processed", "PhoneDataClean.csv"))

2. The consolidation of sales data was a critical step in our research, designed to streamline and prepare the dataset for subsequent analysis. This process involved several deliberate actions to ensure the data was both comprehensive and analytically viable.

Assigning ‘NA’ Values to Zero: In our dataset, the presence of ‘NA’ values represented a lack of recorded data in certain entries. To create a uniform dataset conducive to analysis, we assigned these ‘NA’ values a numerical value of zero. This approach was chosen to signify the absence of data while maintaining the integrity of the dataset. This treatment of ‘NA’ values was essential to prevent any analytical distortions that could arise from undefined or missing data points.

Aggregating Smaller Brands into ‘Others’: The dataset contained shipment data for a wide array of mobile phone brands, including some with relatively minor market shares. To streamline the analysis and focus on significant market players, we aggregated the shipment data of these smaller brands under a collective category named ‘others’. This aggregation provided a clearer view of the market by consolidating less significant data points into a single, more manageable category.

Transforming Data into a Long-Format Data Frame: We restructured the dataset into a long-format data frame. This transformation involved rearranging the data so that each row represented a single observation with variables such as the year, brand, and shipment volume. This format was chosen for its efficiency in handling and analyzing large datasets, as it allows for easier manipulation and visualization of data across different variables.

Saving the New Dataset as ‘ShipData’: After these modifications, the newly processed dataset was saved under the name ‘ShipData’ in the ‘data_processed’ folder. This step marked the completion of the data consolidation process, resulting in a dataset that was not only cleaner and more coherent but also more aligned with the analytical needs of our study.

Throughout this process, our focus was on maintaining the accuracy and relevance of the data while transforming it into a format that was both accessible and conducive to in-depth analysis. The careful handling of the dataset at this stage laid a solid foundation for the insightful exploration and analysis that would follow in our research.

ShipDataOrigin <- readr::read_csv(here("data_raw", "Phonesshipped.csv"))

ShipData <- ShipDataOrigin %>%
    # treat missing shipments as zero (funs() is deprecated, so across() is used)
    mutate(across(everything(), ~ replace(.x, is.na(.x), 0))) %>%
    # fold the smaller brands into "Others"
    mutate(Others = Others + Nokia + HTC + RIM + LG + Lenovo + vivo) %>%
    select(-c(Nokia, HTC, RIM, LG, Lenovo, vivo))

# one row per brand and year
ShipDataClean <- ShipData %>%
    pivot_longer(cols = -Year, names_to = "Brand", values_to = "ShipNum") %>%
    arrange(Brand, Year)

write_csv(ShipDataClean, here("data_processed", "ShipData.csv"))

3. Obtaining Brand-Specific Data: We collected data on the average prices and shipment volumes for various smartphone brands over different years. This step was essential to understand the market positioning and strategy of each brand.

PhoneDataClean <-
    readr::read_csv (here("data_processed", "PhoneDataClean.csv"))
average_prices <- PhoneDataClean %>%
    group_by(brand, release_year) %>%
    summarize(avg_price = mean(price_USD)) %>%
    mutate(avg_price = as.integer(avg_price))
write_csv(average_prices, "data_processed/AveragePriceDataClean.csv")

Integrating Sales Data: Using the sales data consolidated in step 2, we created a statistical table, named “PriceShipData,” which detailed the average sales volumes and average prices for different years. This table was saved in the “data_processed” folder. The integration of sales data with price data allowed us to explore correlations and trends between these two critical market variables.
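The code that builds “PriceShipData” is not shown in this report. As a minimal sketch of one way such a table could be assembled, the example below joins average prices to shipments by brand and year. The column names `Brand`, `Year`, `ShipNum` and `AvePrice` are taken from the Figure 3 code; the toy values, the brand names and the choice of an inner join are assumptions.

```r
library(dplyr)

# Toy stand-ins for AveragePriceDataClean.csv and ShipData.csv (illustrative values).
average_prices <- data.frame(
  brand        = c("BrandA", "BrandA", "BrandB"),
  release_year = c(2020, 2021, 2021),
  avg_price    = c(500L, 550L, 300L)
)
ship_data <- data.frame(
  Year    = c(2020, 2021, 2021, 2021),
  Brand   = c("BrandA", "BrandA", "BrandB", "Others"),
  ShipNum = c(100, 120, 80, 60)
)

# Align column names, then keep only brand-year pairs present in both tables.
PriceShipData <- average_prices %>%
  rename(Brand = brand, Year = release_year, AvePrice = avg_price) %>%
  inner_join(ship_data, by = c("Brand", "Year")) %>%
  select(Brand, Year, ShipNum, AvePrice) %>%
  arrange(Brand, Year)

print(PriceShipData)
```

An inner join silently drops rows whose brand spelling differs between the two sources, so brand names would need to be harmonized before joining the real tables.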

Overall Description of Mobile Phone Prices and Sales Volumes

Our analysis, grounded in the data above and visualized through descriptive charts, provides an overview of the smartphone market. It highlights not only the competitive positioning of major brands but also the diverse strategies they employ to appeal to consumers across price points. First, we investigate which specific attributes of cell phones significantly influence their market prices. The goal is to identify the key factors that consumers value most, as reflected in manufacturers’ pricing strategies.

Secondly, we explore the impact of photographic performance on cell phone pricing. In a world dominated by social media and visual communication, the quality of a phone’s camera can be a decisive factor for consumers. This part of the study aims to assess how much weight the photographic capabilities of a phone carry in its market valuation.

Given the prevalence of certain brands and the completeness of their data, our analysis primarily compared the data of major players like Apple, Huawei, Oppo, Samsung, and Xiaomi. This focus enabled us to draw more precise and relevant conclusions about market trends.

To begin, we give a brief review of the trends in phone prices and sales.

Figure 1 gives the Mobile Phone Price Distribution. This chart provides a visual representation of how smartphone prices are distributed across different brands and over time.

#figure 1
PhoneDataClean <-
    readr::read_csv (here("data_processed", "PhoneDataClean.csv"))
#str(PhoneDataClean)
Price <- PhoneDataClean %>%
    select(phone_name, brand, release_year, price_USD) %>%
    arrange(brand, release_year)

#selected_brands <- c("Apple", "Huawei", "Xiaomi", "Samsung","Oppo")
selected_brands <- c("Apple", "Huawei")
plot1 <- Price[Price$brand %in% selected_brands,]  %>%
    ggplot(aes(x = release_year,
               y = price_USD,
               color =  brand)) +
    geom_point(
        position = position_jitter(width = 0.25, height = 0.15),
        alpha = 0.5,
        size = 2.8
    ) +
    labs(
        title = "Phone Price of Apple and Huawei",
        x = "Year",
        y = "Price",
        color = "Selected Brand"
    ) +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5)) 


selected_brands <- c("Xiaomi")
plot2 <- Price[Price$brand %in% selected_brands,]  %>%
    ggplot(aes(x = release_year,
               y = price_USD)) +
    geom_point(
        position = position_jitter(width = 0.25, height = 0.15),
        alpha = 0.5,
        size = 2.8,color = "blue"
    ) +
    labs(
        title = "Phone Price of Xiaomi",
        x = "Year",
        y = "Price"
    ) +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5)) 

selected_brands <- c("Samsung")
plot3 <- Price[Price$brand %in% selected_brands,]  %>%
    ggplot(aes(x = release_year,
               y = price_USD)) +
    geom_point(
        position = position_jitter(width = 0.25, height = 0.15),
        alpha = 0.5,
        size = 2.8,color = "gray"
    ) +
    labs(
        title = "Phone Price of Samsung",
        x = "Year",
        y = "Price",
        color = "Selected Brand"
    ) +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5)) 

selected_brands <- c("Oppo")
plot4 <- Price[Price$brand %in% selected_brands,]  %>%
    ggplot(aes(x = release_year,
               y = price_USD)) +
    geom_point(
        position = position_jitter(width = 0.25, height = 0.15),
        alpha = 0.5,
        size = 2.8,color = "red"
    ) +
    labs(
        title = "Phone Price of Oppo",
        x = "Year",
        y = "Price"
    ) +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5)) 


combined_plots <- grid.arrange(plot1, plot2, plot3, plot4, 
                                nrow = 2, ncol = 2)

Figure 2 gives the Mobile Phone Sales Distribution. This chart depicts the distribution of shipment volumes over time, offering insight into the market reach of, and consumer preference for, different brands.

#figure2
ShipDataClean <-
    readr::read_csv (here("data_processed", "ShipData.csv"))
ShipDataClean <- ShipDataClean %>%
  filter(ShipNum != 0)

ShipDataPlot <- ggplot(ShipDataClean, aes(x = Year, y = ShipNum, color = Brand, group = Brand)) +
    geom_line(linewidth = 1) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
    geom_text(aes(label = Brand), hjust = -0.2, vjust = 0.5, check_overlap = TRUE, show.legend = FALSE) +
    labs(title = "Phone Shipments by Brand", x = "Year", y = "ShipNumber (million)") +
    scale_color_brewer(palette = "Set1") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))

ShipDataAnim <- ShipDataPlot +
    transition_reveal(Year)

animate(ShipDataAnim ,
        end_pause = 15,
        duration = 10,
        width = 1100, height = 650, res = 150,
        renderer = magick_renderer())

anim_save(here::here(
  'figs', 'ShipDataAnim.gif'))

Figure 3 gives the Distribution of Average Prices and Sales Volumes. This chart compares the average prices and sales volumes of different brands. Interestingly, it reveals no clear relationship between a brand’s prices and its sales volumes. This suggests that brands may be widening their price ranges to cater to a broader consumer base across all segments of the market.

# figure3
PriceShipData<- readr::read_csv (here("data_processed", "PriceShipData.csv"))

# Draw shipments and average price for one brand on a shared y scale.
# AvePrice is plotted in raw units, so the secondary axis uses the identity
# transform; the earlier ~ . / 2 labeled prices at half their plotted value.
plot_price_ship <- function(brand_name) {
    PriceShipData %>%
        filter(Brand == brand_name) %>%
        ggplot(aes(x = Year)) +
        geom_line(aes(y = ShipNum, color = "ShipNumber (million)"), linewidth = 1.5) +
        geom_line(aes(y = AvePrice, color = "Average Price"), linewidth = 1.5) +
        scale_y_continuous(name = "ShipNumber (million)",
                           sec.axis = sec_axis(~ ., name = "Average Price")) +
        labs(title = paste("ShipNumber and Average Price of", brand_name), x = "Year") +
        theme_minimal()
}

g1 <- plot_price_ship("Apple")
g2 <- plot_price_ship("Huawei")
g3 <- plot_price_ship("Oppo")
g4 <- plot_price_ship("Samsung")
g5 <- plot_price_ship("Xiaomi")
grid.arrange(g1, g2, g3, g4, g5, ncol = 2, nrow = 3)
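One way to put a number on the "no clear relationship" reading of Figure 3 is a per-brand correlation between average price and shipments. The sketch below is not part of the original analysis; it uses a toy table with the same columns as `PriceShipData` and made-up values, and with only a handful of yearly points per brand any such coefficient is a rough descriptive summary at best.

```r
library(dplyr)

# Toy table with the same columns as PriceShipData (illustrative values only).
toy <- data.frame(
  Brand    = rep(c("BrandA", "BrandB"), each = 4),
  Year     = rep(2018:2021, times = 2),
  ShipNum  = c(10, 20, 30, 40, 40, 30, 20, 10),
  AvePrice = c(100, 200, 300, 400, 100, 200, 300, 400)
)

# Pearson correlation between average price and shipments, one row per brand.
price_ship_cor <- toy %>%
  group_by(Brand) %>%
  summarise(
    n_years = n(),
    pearson = cor(AvePrice, ShipNum),
    .groups = "drop"
  )

print(price_ship_cor)
```

Coefficients of mixed sign and small magnitude across brands would be consistent with the visual impression, while uniformly strong coefficients would call it into question.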

Exploring Attributes That Significantly Affect the Price of Cell Phones

So, what exactly are the factors that influence the price of a mobile phone? This section explores the average price distribution of different cell phone manufacturers and takes a first look at some factors, such as the effect of screen size on price. The first graph gives the average price distribution for some brands.

It can be seen that different vendors have different pricing strategies. Most vendors show a fluctuating but rising average price from 2016 to 2021, while some, such as Google, show the opposite trend. The average price of Samsung’s cell phones remains stable. The impact of these different pricing strategies on the consumer side of the market, such as cell phone sales, is noteworthy, and this phenomenon and the market strategies behind it can be explored further in subsequent studies.

AveragePriceData <-
    readr::read_csv (here("data_processed", "AveragePriceDataClean.csv"))
selected_brands <-
    c("Apple", "Huawei", "Xiaomi", "Samsung", "Oppo")

AveragePricePlot <-
    AveragePriceData[AveragePriceData$brand %in% selected_brands,] %>%
    ggplot(aes(x = release_year, y = avg_price, color = brand)) +
    geom_line(linewidth = 1.25) +
    geom_point() +
    geom_text(
        aes(label = brand),
        hjust = -0.2,
        vjust = 0.5,
        check_overlap = TRUE,
        show.legend = FALSE
    ) +
    labs(title = "Average Prices for Different Phone Brands", x = "Year", y = "Average_Price (USD)") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))

AveragePriceAnim <- AveragePricePlot  +
    transition_reveal(release_year)

animate(
    AveragePriceAnim ,
    end_pause = 15,
    duration = 10,
    width = 1100,
    height = 650,
    res = 150,
    renderer = magick_renderer()
)

anim_save(here::here('figs', 'AveragePriceAnim.gif'))

Inch

This part shows how the average screen size in inches relates to the price of different phones. The first chart depicts the average screen size of phones released by different brands in various years. It can be seen that almost all cell phone manufacturers have continued to expand the average screen size of their products, but the growth rate slowed down rapidly after 2018.

The second graph and the third graph depict the relationship between the average screen size and the average price of different brands of cell phones. It can be seen that phones with large screens are gradually becoming mainstream in the market, and even lower-priced phones can have larger screens after 2019.

#set average inch dataframe
PhoneDataClean <- readr::read_csv (here("data_processed", "PhoneDataClean.csv"))

average_prices <- PhoneDataClean %>%
    group_by(brand, release_year) %>%
    summarize(avg_price = mean(price_USD)) %>%
    mutate(avg_price = as.integer(avg_price))

average_inch <- PhoneDataClean %>%
    group_by(brand, release_year) %>%
    summarize(avg_inch = mean(inches))

PhoneInch <- data.frame(brand = average_inch$brand, 
                    release_year = average_inch$release_year, 
                    avg_inch = average_inch$avg_inch,
                    avg_price = average_prices$avg_price)

selected_brands <- c("Apple", "Huawei", "Xiaomi", "Samsung","Oppo")

PhoneInchPlot <- PhoneInch[PhoneInch$brand %in% selected_brands,] %>%
    ggplot(aes(x = release_year, y = avg_inch, color = brand)) +
    geom_line(linewidth = 1.25, position = position_dodge(width = 0.5)) +
    geom_text(
        aes(label = brand),
        hjust = -0.2,
        vjust = 0.5,
        check_overlap = TRUE,
        show.legend = FALSE
    ) +
    labs(title = "Average Inches for Different Phone Brands", x = "Year", y = "Average_Inch") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))

PhoneInchAnim <- PhoneInchPlot   +
    transition_reveal(release_year)

animate(PhoneInchAnim ,
    end_pause = 15,
    duration = 10,
    width = 1100,
    height = 650,
    res = 150,
    renderer = magick_renderer()
)

anim_save(here::here('figs', 'PhoneInchAnim.gif'))

#plot the relationship between average price and average inch
PhoneInch %>%
    ggplot(aes(x = release_year, y = avg_price, color = avg_inch)) +
    geom_point(
        position = position_jitter(width = 0.1, height = 0.1),
        alpha = 0.7,
        size = 3.5
    ) +
    labs(
        title = "relationship between average price and average inch",
        x = "Year",
        y = "Average_Price",
        color = "Average_Inch"
    ) +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5)) +
    scale_color_gradient(low = "blue",
                         high = "red")

How Camera Quality Affects Phone Price

This subsection explores the relationship between different combinations of cell phone camera features and cell phone prices. Bubble plots of camera parameters versus price are given for phones released by five major manufacturers in various years. These graphs reflect the different product philosophies of the vendors: some, such as Apple, offer only a single set of camera parameters, while others, such as Xiaomi, use different camera parameters for different phone models. It can also be seen that camera resolution has a greater impact on phone price than the frame rate (fps), and that after 2021 a high frame rate is already the norm.

PhoneDataCamera <- PhoneDataClean %>%
    mutate(
        fps = ifelse(
            video_960fps == 1,
            "960",
            ifelse(
                video_480fps == 1,
                "480",
                ifelse(
                    video_240fps == 1,
                    "240",
                    ifelse(video_120fps == 1, "120", ifelse(
                        video_60fps == 1, "60", ifelse(video_30fps == 1, "30", "")
                    ))
                )
            )
        ),
        resolution = ifelse(video_8K == 1, "8K", ifelse(
            video_4K == 1, "4K", ifelse(video_1080p == 1, "1080P", ifelse(video_720p == 1, "720P", ""))
        ))
    ) %>%
    select(
        -video_30fps,
        -video_60fps,
        -video_120fps,
        -video_240fps,
        -video_480fps,
        -video_960fps,-video_720p,
        -video_1080p,
        -video_4K,
        -video_8K
    )

CameraData <- PhoneDataCamera %>%
    select(brand, phone_name, release_year, fps, resolution, price_USD)

write_csv(CameraData, "data_processed/CameraData.csv")
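The observation that resolution matters more than frame rate for price could be checked with a simple grouped summary before plotting. The sketch below is an illustration only: it uses toy rows with the same columns as `CameraData`, made-up prices, and a hypothetical helper `median_by()`, and it compares how far the median price moves across levels of each feature.

```r
library(dplyr)

# Toy rows with the same columns as CameraData (illustrative values only).
camera_toy <- data.frame(
  brand      = "BrandA",
  fps        = c("30", "30", "60", "60"),
  resolution = c("1080P", "4K", "1080P", "4K"),
  price_USD  = c(200, 400, 220, 420)
)

# Hypothetical helper: median price for each level of one camera feature.
median_by <- function(data, feature) {
  data %>%
    group_by(level = .data[[feature]]) %>%
    summarise(median_price = median(price_USD), n = n(), .groups = "drop")
}

by_resolution <- median_by(camera_toy, "resolution")
by_fps        <- median_by(camera_toy, "fps")

# Spread of the group medians: a larger spread hints at a stronger association.
spread_resolution <- diff(range(by_resolution$median_price))
spread_fps        <- diff(range(by_fps$median_price))

print(by_resolution)
print(by_fps)
```

On real data the two features are correlated with each other and with release year, so a comparison like this is descriptive rather than evidence of a causal effect.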

#XiaoMi Situation

# selected_brand <- c("Apple")
selected_brand <- c("Xiaomi")
# selected_brand <- c("Google")
# selected_brand <- c("Huawei")

CameraData2 <- CameraData %>%
     filter(brand == selected_brand)

# plot1 <- CameraData2 %>%
#     ggplot(aes(
#         x = as.factor(release_year),
#         y = price_USD,
#         color = factor(
#             fps,
#             levels = c("30", "60", "120", "240", "480", "960"),
#             ordered = TRUE
#         ),
#         size = factor(resolution, levels = c("720P", "1080P", "2K", "4K", "8K"))
#     )) +
#     geom_point(position = position_jitter(width = 0.18, height = 0.15), alpha = 0.7) +
#     labs(
#         title = "Camara Quality Effect on Phone Price",
#         x = "year",
#         y = "price",
#         size = "resolution",
#         color = "fps"
#     ) +
#     scale_size_manual(values = c(2, 5, 8, 11, 14)) +
#     scale_color_brewer(palette = "Blues") +
#     theme_minimal() +
#     theme(plot.title = element_text(hjust = 0.5))
# 
# 

plot1 <- CameraData2 %>%
    ggplot(aes(
        x = as.factor(release_year),
        y = price_USD,
        color = factor(
            fps,
            levels = c("30", "60", "120", "240", "480", "960"),
            ordered = TRUE
        )
    )) +
    geom_point(position = position_jitter(width = 0.18, height = 0.15), alpha = 0.7,size=4) +
    labs(
        title = "Camera Quality Effect on Phone Price",
        x = "year",
        y = "price",
        size = "resolution",
        color = "fps"
    ) +
    scale_size_manual(values = c(2, 5, 8, 11, 14)) +
    scale_color_brewer(palette = "Blues") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))



plot2 <- CameraData2 %>%
    ggplot(aes(
        x = as.factor(release_year),
        y = price_USD,
        color  = factor(resolution, levels = c("720P", "1080P", "2K", "4K", "8K"))
    )) +
    geom_point(position = position_jitter(width = 0.18, height = 0.15), alpha = 0.7,size=4) +
    labs(
        title = "Camera Quality Effect on Phone Price",
        x = "year",
        y = "price",
        color = "resolution"  # color encodes resolution in this panel
    ) +
    scale_size_manual(values = c(2, 5, 8, 11, 14)) +
    scale_color_brewer(palette = "Blues") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))

grid.arrange(plot1, plot2,ncol = 2)

#Huawei situation

# selected_brand <- c("Apple")
#selected_brand <- c("Xiaomi")
# selected_brand <- c("Google")
 selected_brand <- c("Huawei")
 
CameraData2 <- CameraData %>%
     na.omit() %>% 
     filter(brand == selected_brand)
 
plot1 <- CameraData2 %>%
    ggplot(aes(
        x = as.factor(release_year),
        y = price_USD,
        color = factor(
            fps,
            levels = c("30", "60", "120", "240", "480", "960"),
            ordered = TRUE
        )
    )) +
    geom_point(position = position_jitter(width = 0.18, height = 0.15), alpha = 0.7,size=4) +
    labs(
        title = "Camera Quality Effect on Phone Price",
        x = "year",
        y = "price",
        size = "resolution",
        color = "fps"
    ) +
    scale_size_manual(values = c(2, 5, 8, 11, 14)) +
    scale_color_brewer(palette = "Blues") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))



plot2 <- CameraData2 %>%
    ggplot(aes(
        x = as.factor(release_year),
        y = price_USD,
        color  = factor(resolution, levels = c("720P", "1080P", "2K", "4K", "8K"))
    )) +
    geom_point(position = position_jitter(width = 0.18, height = 0.15), alpha = 0.7,size=4) +
    labs(
        title = "Camera Quality Effect on Phone Price",
        x = "year",
        y = "price",
        color = "resolution"  # color encodes resolution in this panel
    ) +
    scale_size_manual(values = c(2, 5, 8, 11, 14)) +
    scale_color_brewer(palette = "Blues") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))

grid.arrange(plot1, plot2,ncol = 2)

# Samsung situation

# selected_brand <- c("Apple")
# selected_brand <- c("Xiaomi")
selected_brand <- c("Samsung")
# selected_brand <- c("Huawei")
 
CameraData2 <- CameraData %>%
    filter(brand == selected_brand) 

plot1 <- CameraData2 %>%
    ggplot(aes(
        x = as.factor(release_year),
        y = price_USD,
        color = factor(
            fps,
            levels = c("30", "60", "120", "240", "480", "960"),
            ordered = TRUE
        )
    )) +
    geom_point(position = position_jitter(width = 0.18, height = 0.15), alpha = 0.7,size=4) +
    labs(
        title = "Camera Quality Effect on Phone Price",
        x = "year",
        y = "price",
        size = "resolution",
        color = "fps"
    ) +
    scale_size_manual(values = c(2, 5, 8, 11, 14)) +
    scale_color_brewer(palette = "Blues") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))



plot2 <- CameraData2 %>%
    ggplot(aes(
        x = as.factor(release_year),
        y = price_USD,
        color  = factor(resolution, levels = c("720P", "1080P", "2K", "4K", "8K"))
    )) +
    geom_point(position = position_jitter(width = 0.18, height = 0.15), alpha = 0.7,size=4) +
    labs(
        title = "Camera Quality Effect on Phone Price",
        x = "year",
        y = "price",
        # color is mapped to resolution in this panel
        color = "resolution"
    ) +
    scale_size_manual(values = c(2, 5, 8, 11, 14)) +
    scale_color_brewer(palette = "Blues") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))

grid.arrange(plot1, plot2,ncol = 2)

# Oppo situation

selected_brand <- c("Oppo")
 
CameraData2 <- CameraData %>%
    filter(brand == selected_brand) 

plot1 <- CameraData2 %>%
    ggplot(aes(
        x = as.factor(release_year),
        y = price_USD,
        color = factor(
            fps,
            levels = c("30", "60", "120", "240", "480", "960"),
            ordered = TRUE
        )
    )) +
    geom_point(position = position_jitter(width = 0.18, height = 0.15), alpha = 0.7,size=4) +
    labs(
        title = "Camera Quality Effect on Phone Price",
        x = "year",
        y = "price",
        size = "resolution",
        color = "fps"
    ) +
    scale_size_manual(values = c(2, 5, 8, 11, 14)) +
    scale_color_brewer(palette = "Blues") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))



plot2 <- CameraData2 %>%
    ggplot(aes(
        x = as.factor(release_year),
        y = price_USD,
        color  = factor(resolution, levels = c("720P", "1080P", "2K", "4K", "8K"))
    )) +
    geom_point(position = position_jitter(width = 0.18, height = 0.15), alpha = 0.7,size=4) +
    labs(
        title = "Camera Quality Effect on Phone Price",
        x = "year",
        y = "price",
        # color is mapped to resolution in this panel
        color = "resolution"
    ) +
    scale_size_manual(values = c(2, 5, 8, 11, 14)) +
    scale_color_brewer(palette = "Blues") +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))

grid.arrange(plot1, plot2,ncol = 2)

In this section, we examine the relationship between combinations of cell phone camera features and the corresponding prices. Our aim is to show how different camera attributes correlate with the market price of cell phones.

Exploration of Camera Features and Cell Phone Prices:

Analyzing Major Manufacturers: We examine the camera features of cell phones released by five leading manufacturers over various years. This approach allows us to capture a broad spectrum of market trends and strategies employed by these major players.

Utilizing Bubble Plots: To effectively visualize and analyze this relationship, we employ bubble plots that map camera parameters against cell phone prices. These plots are not only illustrative but also provide a clear and concise representation of data, allowing for easier interpretation of complex relationships.
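The per-brand panels in this section all share the same structure, so they could be produced by a single helper instead of repeated blocks. The following is a minimal sketch, assuming a data frame with the columns `brand`, `release_year`, `price_USD`, `fps`, and `resolution` used in this report. The function name `plot_camera_price()` and the toy rows are hypothetical and are not part of the original analysis.

```r
library(ggplot2)

# Hypothetical helper: one jittered scatter plot of price by release year,
# colored by a chosen camera attribute ("fps" or "resolution").
plot_camera_price <- function(data, selected_brand, attribute, levels) {
    brand_data <- data[data$brand == selected_brand, ]
    # Convert the chosen attribute to an ordered factor so the palette
    # follows the intended low-to-high order.
    brand_data$attr_factor <- factor(
        as.character(brand_data[[attribute]]),
        levels = levels,
        ordered = TRUE
    )
    ggplot(brand_data, aes(
        x = as.factor(release_year),
        y = price_USD,
        color = attr_factor
    )) +
        geom_point(
            position = position_jitter(width = 0.18, height = 0.15),
            alpha = 0.7, size = 4
        ) +
        labs(
            title = paste("Camera Quality Effect on Phone Price:", selected_brand),
            x = "year", y = "price", color = attribute
        ) +
        scale_color_brewer(palette = "Blues") +
        theme_minimal() +
        theme(plot.title = element_text(hjust = 0.5))
}

# Toy rows for illustration only (not taken from the study's data).
toy <- data.frame(
    brand = c("Oppo", "Oppo", "Huawei"),
    release_year = c(2019, 2020, 2020),
    price_USD = c(300, 450, 600),
    fps = c("30", "60", "60"),
    resolution = c("1080P", "4K", "4K")
)

p_fps <- plot_camera_price(toy, "Oppo", "fps",
                           c("30", "60", "120", "240", "480", "960"))
p_res <- plot_camera_price(toy, "Oppo", "resolution",
                           c("720P", "1080P", "2K", "4K", "8K"))
```

The two panels can then be placed side by side with `gridExtra::grid.arrange(p_fps, p_res, ncol = 2)`. Passing the attribute name as the legend title keeps each legend consistent with what the color actually encodes.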

Reflecting Diverse Product Philosophies: These statistical graphs are instrumental in highlighting the diverse product strategies of different vendors in terms of cell phone cameras. For instance, Apple typically offers a single solution for cell phone camera parameters, emphasizing a standardized, high-quality approach. In contrast, Xiaomi adopts a more varied strategy, using different camera parameters for different phone models, reflecting its aim to cater to a wider range of consumer preferences.

Impact of Camera Resolution and Frame Rate:

Greater Impact of Camera Resolution: Our analysis indicates that, compared with the camera's frame rate (fps), the camera's resolution has a stronger effect on the phone's price. This finding suggests that consumers place a higher value on image quality, as denoted by resolution, than on frame rate capability.

Normalization of High Frame Rates Post-2021: Interestingly, we observe that after 2021 a high frame rate has become the norm across phone models. This shift indicates a market evolution in which high frame rates are no longer a premium feature but a standard expectation in new cell phone models.
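One way to check the claim that resolution matters more than frame rate is to compare how much each attribute contributes to a linear model of price. The sketch below is only an illustration under stated assumptions: the helper `compare_camera_effects()` is hypothetical, the toy data are simulated rather than drawn from our dataset, and the column names follow those used in this report.

```r
# Hypothetical helper: fit price ~ resolution + fps and report, for each
# term, the sum of squares lost when that term is dropped from the model.
compare_camera_effects <- function(data) {
    fit <- lm(price_USD ~ resolution + fps, data = data)
    tab <- drop1(fit, test = "F")
    # The first row is "<none>" (the full model); keep only the dropped terms.
    dropped <- tab[-1, ]
    data.frame(
        term = rownames(dropped),
        sum_sq = dropped[["Sum of Sq"]],
        p_value = dropped[["Pr(>F)"]]
    )
}

# Simulated toy data for illustration only: resolution is given a larger
# built-in effect on price than fps, so the output shows what a
# "resolution dominates" pattern looks like.
set.seed(1)
toy <- expand.grid(
    resolution = c("1080P", "4K", "8K"),
    fps = c("30", "60"),
    rep = 1:5,
    stringsAsFactors = TRUE
)
toy$price_USD <- 300 +
    200 * (as.integer(toy$resolution) - 1) +
    20 * (as.integer(toy$fps) - 1) +
    rnorm(nrow(toy), sd = 30)

result <- compare_camera_effects(toy)
print(result)
```

Using `drop1()` rather than a sequential ANOVA table means the comparison does not depend on the order in which the two terms enter the formula.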

How Camera Quality Affects Phone Shipments

In this section we explore the relationship between cell phone shipments and cell phone camera quality. The original data is first preprocessed by assigning ordinal scores to the fps and resolution values, which are then summed into a single camera quality score for comparison.
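The scoring step can be sketched as a lookup against an ordered vector of levels. This is a minimal illustration, not the exact preprocessing used for the figures: `score_levels()` is a hypothetical helper, the level vectors are assumptions borrowed from the plotting code in this report, and the toy rows are made up.

```r
# Assumed level orderings, lowest to highest.
fps_levels <- c("30", "60", "120", "240", "480", "960")
resolution_levels <- c("720P", "1080P", "2K", "4K", "8K")

# Hypothetical helper: return the 1-based position of each value in `levels`.
# Values that are not listed in `levels` become NA.
score_levels <- function(x, levels) {
    match(as.character(x), levels)
}

# Toy rows for illustration only; "480P" is deliberately not a listed level.
toy <- data.frame(
    fps = c("30", "120", "960"),
    resolution = c("1080P", "4K", "480P")
)

toy$camera_quality <- score_levels(toy$fps, fps_levels) +
    score_levels(toy$resolution, resolution_levels)
print(toy)
```

Because unmatched values yield `NA`, any mismatch between the level list and the labels in the data shows up as missing scores instead of being scored silently.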

Figure 1 shows that, for the major vendors other than Samsung, shipments rose as camera quality improved until shipments peaked, and declined afterwards. Samsung is the exception: its shipments trend downward throughout, which we attribute mainly to the fact that, as a major producer of cell phone camera components, Samsung's own products are affected more by factors other than the parameters considered here. Figure 2 shows that phone prices have risen as camera quality has improved. Most vendors do appear to use camera components as an important way to raise the price of their phones, and consumers appear willing to pay for them.
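To locate the shipment peak described for each vendor, one could select, per brand, the row with the largest shipment figure. The following is a minimal sketch, assuming a table shaped like `CameraShipPrice` with the columns `Brand`, `Year`, `avg_camera_quality`, and `ShipNum`. The helper `find_shipment_peak()` and the toy values are hypothetical.

```r
# Hypothetical helper: for each brand, return the row where ShipNum is highest.
find_shipment_peak <- function(data) {
    by_brand <- split(data, data$Brand)
    peaks <- lapply(by_brand, function(d) d[which.max(d$ShipNum), ])
    out <- do.call(rbind, peaks)
    rownames(out) <- NULL
    out
}

# Toy values for illustration only (not taken from the study's data).
toy <- data.frame(
    Brand = c("A", "A", "A", "B", "B", "B"),
    Year = c(2018, 2019, 2020, 2018, 2019, 2020),
    avg_camera_quality = c(3.0, 4.5, 5.2, 2.8, 3.9, 4.1),
    ShipNum = c(100, 150, 120, 80, 70, 60)
)

peaks <- find_shipment_peak(toy)
print(peaks)
```

A brand whose peak falls in its first observed year, like brand B in the toy data, has shipments that decline throughout, which is the pattern described for Samsung.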

CameraDataQua <- readr::read_csv(here("data_processed", "CameraData.csv"))

CameraDataQua$fps_numeric <- case_when(
  CameraDataQua$fps == "30" ~ 1,
  CameraDataQua$fps == "60" ~ 2,
  CameraDataQua$fps == "120" ~ 3,
  CameraDataQua$fps == "240" ~ 4,
  CameraDataQua$fps == "480" ~ 5,
   CameraDataQua$fps == "960" ~ 6
)

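# Map resolution to an ordinal score. Values not listed here (for example
# "720P" or "2K", which appear as levels in the plots above) become NA.
CameraDataQua$resolution_numeric <- case_when(
  CameraDataQua$resolution == "480P" ~ 1,
  CameraDataQua$resolution == "1080P" ~ 2,
  CameraDataQua$resolution == "4K" ~ 3,
  CameraDataQua$resolution == "8K" ~ 4
)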

CameraDataQua$camera_quality <- CameraDataQua$fps_numeric + CameraDataQua$resolution_numeric

CameraDataQua <- CameraDataQua[, !(names(CameraDataQua) %in% c("fps_numeric", "resolution_numeric"))]

CameraDataQua <- CameraDataQua %>%
    rename(Brand = brand, Year = release_year, Price = price_USD)

# Average camera quality score per brand and year, rounded to one decimal
CameraDataQuaClean <- CameraDataQua %>%
    group_by(Brand, Year) %>%
    summarise(avg_camera_quality = mean(camera_quality, na.rm = TRUE)) %>%
    mutate(avg_camera_quality = round(avg_camera_quality, 1))




PriceShipDataQua <- readr::read_csv(here("data_processed", "PriceShipData.csv"))

CameraShipPrice <- left_join(CameraDataQuaClean, PriceShipDataQua, by = c("Year", "Brand")) %>% 
    na.omit()

# Shipments against average camera quality (Apple excluded)
CameraShipPrice %>%
    filter(Brand != "Apple") %>%
    ggplot(aes(x = avg_camera_quality, y = ShipNum, color = Brand)) +
    geom_line(size = 1.5) +
    geom_point(size = 1.7) +
    labs(
        title = "Tendency between Camera Quality and Shipment",
        x = "avg_camera_quality",
        y = "ShipNum"
    ) +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))

# Average price against average camera quality (Apple excluded)
CameraShipPrice %>%
    filter(Brand != "Apple") %>%
    ggplot(aes(x = avg_camera_quality, y = AvePrice, color = Brand)) +
    geom_line(size = 1.5) +
    geom_point(size = 1.7) +
    labs(
        title = "Tendency between Camera Quality and Price",
        x = "avg_camera_quality",
        y = "AveragePrice"
    ) +
    theme_minimal() +
    theme(plot.title = element_text(hjust = 0.5))

Conclusions

Our analysis has led to a nuanced understanding of the impact of photographic capabilities on smartphone pricing and consumer purchasing behavior. Within a specific timeframe and price range, enhancements in camera functionality significantly and positively affect the price, indicating that consumers are indeed willing to pay more for improved photographic features. This trend highlights the value that consumers place on camera quality in their smartphones, especially in the context of widespread use in photography and social media.

However, an intriguing finding emerges as smartphones reach a certain developmental stage and price point: further improvements in camera functionality do not significantly boost sales volumes. This suggests a threshold in consumer willingness to pay a premium for camera enhancements. Beyond a certain level of quality and price, consumers may not perceive additional camera improvements as justifying higher costs.

To further explore this topic, future research could delve into consumer perception studies to understand better why this threshold exists and how it varies among different consumer segments. Additionally, examining technological advancements in smartphone cameras and their cost-effectiveness could provide insights into whether further improvements are perceived as valuable by consumers.

Further analysis could benefit from data sources like consumer satisfaction surveys, detailed reviews, and feedback on smartphone cameras. Such data could shed light on consumer expectations and satisfaction levels regarding camera functionality. Market trend analyses over a longer period could also reveal evolving consumer attitudes towards camera enhancements in smartphones.

In conclusion, our findings underscore a complex relationship between smartphone camera enhancements, pricing, and consumer willingness to pay. While improvements in camera features initially drive up prices and attract consumers, there appears to be a limit to how much more consumers are willing to pay for these enhancements. Understanding this dynamic is crucial for smartphone manufacturers and marketers in strategizing product development and pricing.

Attribution

All members contributed equally to the following aspects of the project:

Data Categorization, Cleaning, and Organization: The team collaboratively worked on sorting, cleaning, and structuring the data to ensure its usability and relevance for our research.

Establishing Research Issues: In the ‘Proposals of Research Issue’ section, every member contributed to identifying and formulating the research questions and objectives.

Discussing Data Resources and Data Validity: Except for the Data Dictionary, the team collectively engaged in the ‘Data Introduction and Discussion’ section, ensuring a comprehensive understanding and validation of the data sources used in our study.

Statistical Analysis and Text Preparation: For the ‘Preliminary Data Exploration’ section, the team was equally involved in conducting statistical analysis and preparing the text for three different preliminary data explorations.

Summary Writing: In drafting the ‘Summary of Conclusions’, each member played an equal role, synthesizing our findings and insights.

R Markdown Document Compilation: The team worked together on compiling the R Markdown document, ensuring a cohesive and well-structured presentation of our research.

Preparation of PPT Slides for Presentation: All members were involved in preparing the PowerPoint slides for the presentation, contributing to both the content and design aspects.

In addition to these collaborative efforts, individual responsibilities were as follows:

Ji Qi: Ji Qi was responsible for data collection; data cleaning, pre-processing, and organization; data analysis and the related R programming; writing the R Markdown document; and designing the research structure, ensuring the clarity, accuracy, and effectiveness of our data presentation and analysis.

Zeyu Cheng: Zeyu Cheng was responsible for creating the Data Dictionary and for producing and preparing the PowerPoint slides for the presentation, contributing to both content and design while ensuring their clarity, accuracy, and effectiveness.

Data Dictionary

Summary of Data Dictionary:

Our research proposal aims to elucidate the intricacies of smartphone pricing dynamics in relation to various hardware parameters. The datasets at our disposal provide comprehensive insights into global smartphone shipments and detailed specifications of different smartphone models.

• Global Smartphone Shipments Data: This dataset chronicles the yearly smartphone shipments from 2007 to 2022, categorized by the respective vendors. This information paints a broad picture of brand dominance and market penetration over the years, assisting in understanding market trends and brand trajectories.

• Smartphone Attributes & Pricing Data: A more granular dataset, it enumerates key hardware attributes of smartphones, such as video recording capabilities (spanning from 720p to 8K resolutions and diverse shooting frequencies), battery specifications, RAM, storage, and more. This dataset is pivotal in correlating the influence of these parameters on the smartphone’s pricing. For instance, understanding the value addition of shooting frequencies like 240fps or 960fps to the final price can be immensely beneficial for product development teams.

By synthesizing the insights from these datasets, we aim to offer valuable counsel to smartphone manufacturers and marketers. Our findings will not only shed light on effective pricing strategies but also guide product development endeavors. In the competitive realm of the mobile phone industry, such insights are invaluable for brands striving for differentiation and market leadership. Through our research, we aspire to deepen the understanding of industry dynamics and empower brands with knowledge, guiding them to astute decision-making.

Interpretation of Data Dictionary:

Global Smartphone Shipments Data:

• Variable Name: Global smartphone shipments (2007-2022)
• Type: Numerical
• Description: Represents the number of smartphone units shipped by different vendors each year.
• Relevance to Research: This dataset provides an overview of market trends, indicating the popularity and demand for specific brands over time. Understanding market dominance can indirectly give insights into which phone attributes might have been more appealing to the consumer base, affecting shipments.

Smartphone Attributes & Pricing Data:

• Variable Name: phone_name
• Type: Categorical
• Description: Name of the smartphone model.
• Relevance to Research: Allows for unique identification of each smartphone and comparison of attributes within and across brands.

• Variable Name: brand
• Type: Categorical
• Description: Brand or manufacturer of the smartphone.
• Relevance to Research: To observe trends in pricing strategies across different brands and evaluate how each brand values specific attributes.

• Variable Name: OS
• Type: Categorical
• Description: Operating system of the smartphone.
• Relevance to Research: Investigate if the OS has a significant influence on phone pricing.

• Variable Name: Inches
• Type: Numerical
• Description: Screen size of the smartphone.
• Relevance to Research: Determine if larger screens correlate with higher prices.

• Variable Name: resolution
• Type: Categorical
• Description: Display resolution of the smartphone.
• Relevance to Research: Assess if higher resolutions drive up the phone’s price.

• Variable Name: battery & battery_type
• Type: Numerical & Categorical
• Description: Capacity of the smartphone’s battery and its type.
• Relevance to Research: Check the influence of battery capacity and type on phone pricing.

• Variable Name: ram_GB
• Type: Numerical
• Description: RAM size in GB.
• Relevance to Research: Analyze if phones with more RAM are priced higher.

• Variable Name: announcement_date
• Type: Date
• Description: Date when the smartphone was announced.
• Relevance to Research: To see if the announcement date impacts the initial pricing due to technological advancements or market conditions.

• Variable Name: weight_g
• Type: Numerical
• Description: Weight of the smartphone in grams.
• Relevance to Research: Investigate if weight (either lightweight or premium feel) has an influence on price.

• Variable Name: storage_GB
• Type: Numerical
• Description: Internal storage size in GB.
• Relevance to Research: Determine if larger storage capacities command higher prices.

• Variable Name: video_* (720p, 1080p, etc.)
• Type: Binary/Categorical
• Description: Smartphone’s capability to record video at specific resolutions and frame rates.
• Relevance to Research: Explore how different video capabilities influence the price, especially as camera features are often a major selling point.

• Variable Name: price_USD
• Type: Numerical
• Description: Price of the smartphone in USD.
• Relevance to Research: The dependent variable in our study