Tidy Data

Due: Sep 02 by 11:59pm

Weight: This assignment is worth 1% of your final grade.

Purpose: The purpose of this assignment is to introduce you to the “tidy data” concept and to practice reshaping data frames between long and wide formats in R.

Assessment: This assignment is graded using a check system:

  • ✔+ (110%): Responses show phenomenal thought and engagement with the course content. I will not assign these often.
  • ✔ (100%): Responses are thoughtful, well-written, and show engagement with the course content. This is the expected level of performance.
  • ✔− (50%): Responses are hastily composed, too short, and/or engage only cursorily with the course content. This grade signals that you need to improve next time. I hope not to assign these often.

Notice that this is essentially a pass/fail system. I’m not grading your writing ability and I’m not counting the number of words you write; I’m looking for thoughtful engagement. One or two sentences are not enough. Write at least a paragraph and show me that you did the assigned readings.

1. Software

If you haven’t yet, go to the Course Software page and install all the software we’ll need for this course. You’ll need these tools for this assignment.

2. Get Organized

Follow these instructions:

  1. Download this template.
  2. Unzip the template folder. Make sure you actually unzip it! (In Windows, right-click it and choose “Extract All”.)
  3. Open the .Rproj file to open RStudio.
  4. Inside RStudio, open the hw1.qmd file, take notes, and write some example code as you go through the readings / exercises below.

3. Readings & Exercises

  • Getting Familiar with the Course: Follow Snoop’s advice and read the entire Course Syllabus (actually read the whole thing). Then review the schedule and make sure to note important upcoming deadlines.
  • Getting Familiar with Tidy Data: Read this tidyverse article explaining the concept of tidy data. In your hw1.qmd, copy some of the examples into a code chunk and run them to see the results of converting data between long and wide formats.
  • Use AI to Practice Reshaping Messy Data: Open ChatGPT, Claude, or whatever AI tools you prefer and start a chat to practice the concept of reshaping data into long and wide formats. Ask the AI questions about what it gives you to make sure you understand the concept. Are you certain the AI is giving you good code? Play around with it and see if chatting with the AI is helpful. Include the link to your chat in your submission if you have one. Here is an example prompt (feel free to experiment with other prompts):

I'm practicing the concept of tidy data in R. Provide me an example dataset in wide format and then show me R code for how to convert it to long format. Afterwards, do the opposite - show me an example of a dataset in long format and show me R code for how to convert it to wide format. In each case, explain your reasoning first, then show me the code.
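
If you want a known-good reference to compare against whatever the AI gives you, here is a minimal sketch of the wide-to-long-and-back round trip. It assumes the tidyr package is installed. The scores_wide data frame and its column names are made up for illustration and do not come from the readings.

```r
library(tidyr)

# Hypothetical wide data: one row per student, one column per quiz
scores_wide <- data.frame(
  student = c("Ana", "Ben"),
  quiz1 = c(90, 75),
  quiz2 = c(85, 80)
)

# Wide to long: stack the quiz columns into a "quiz" name column
# and a "score" value column, so each row is one student-quiz pair
scores_long <- pivot_longer(
  scores_wide,
  cols = c(quiz1, quiz2),
  names_to = "quiz",
  values_to = "score"
)

# Long to wide: spread the "quiz" names back out into separate columns
scores_back <- pivot_wider(
  scores_long,
  names_from = quiz,
  values_from = score
)

print(scores_long)
print(scores_back)
```

Try pasting this into a code chunk in your hw1.qmd and check that scores_back holds the same values as scores_wide. If the AI's code uses different functions (for example, the older gather() and spread()), ask it to explain the difference.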

Optional

  • Chapter 6 in Hadley Wickham’s R4DS book covers more detail on the concept of tidy data with even more examples. It’s worth reading through for a more comprehensive understanding.
  • Chapter 3 in the YARDBook also covers tidy data.
  • Chapter 29 in Hadley Wickham’s R4DS book is a great introduction to Quarto. I highly recommend taking a look through it to get a better understanding of Quarto (we’ll be using it the whole semester!).

4. Reflect

Reflect on what you’ve learned while going through these readings and exercises. Is there anything that jumped out at you? Anything you found particularly interesting or confusing? Write at least a paragraph in your hw1.qmd file, and include at least one question. The teaching team will review the questions we get and will try to answer them either in Slack or in class.

If you’re unsure where to start with a reflection, try filling out this template:

“I used to think ______, now I think ______ 🤔”

5. Submit

To submit your assignment, follow these instructions:

  1. Render your .qmd file by either clicking the “Render” button in RStudio or running the command quarto::quarto_render("hw1.qmd").
  2. Open the rendered HTML file and make sure it looks good! Is all the formatting as you expected?
  3. Create a zip file of all the files in your R project folder for this assignment and upload it to the corresponding assignment on Blackboard.