Progress Report

Due: Nov 01 by 11:59pm

Weight: This assignment is worth 12% of your final grade.

Assessment: Your submission will be assessed using the rubric at the bottom of this page.

Write a report summarizing progress you have made towards your project thus far, including summary statistics of your data and preliminary charts. You should have already identified data source(s) for your project and begun exploring the data. Your report should be written in a narrative format (i.e. using coherent paragraphs rather than a series of bullet points). You may use headings where appropriate to break up your report into sections. Below is a list of specific items your progress report should include.

1. Get organized

Download and unzip this template for your progress report. Open the project.Rproj file and write your progress report in the report.Rmd file. The template comes with some text and code explaining how to use it. You should delete this text and code, as it is only for explanatory purposes. Be sure to adjust the content in the YAML:

  • Write your project title in the title field (and provide a subtitle if you wish, otherwise delete the subtitle field).
  • In the author field, list the names of all teammates, e.g. author: Luke Skywalker, Leia Organa, and Han Solo.

You should also put the data you are working with for your project in the appropriate folders.

2. Write your research question

This can (and almost certainly should) be a revised research question from your proposal.

Again, follow these guidelines.

3. Discuss your data sources

Discuss the data source(s) you are using for your analysis. For each data source:

  • Describe it and include URLs or references to the original sources as well as links to any pre-processed or formatted data you are using. These are not always the same. For example, for Mini Project 2, the original source was the Transit Costs Project, whereas the data file used was posted on this GitHub repo.

  • Discuss the validity of your data:

    • Is the data you are using from the original source, or has it been pre-processed?
    • How was the original data collected and by whom?
    • Are any data missing, or are there observations that might not have been recorded?
    • Could the data be biased in some way?
  • Finally, provide a data dictionary for each data file you are using in an appendix at the end of your report. The dictionary should contain a table of each variable name along with a brief description of the variable.
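One possible way to draft a data dictionary, and to check for missing values at the same time, is to build the table in R. The sketch below is only an illustration: it uses R's built-in mtcars data as a stand-in for your own data file, and the column names and descriptions belong to that stand-in. Replace both with your own data and wording. It also assumes the knitr package is available (it is whenever you knit an .Rmd file) and falls back to a plain print otherwise.

```r
# Stand-in data: three columns of the built-in mtcars data frame.
# Replace this with the data frame you read from your own data file.
df <- mtcars[, c("mpg", "wt", "cyl")]

# Descriptions are written by hand; these describe the stand-in data only
descriptions <- c(
  mpg = "Fuel economy in miles per (US) gallon",
  wt  = "Vehicle weight in 1000s of pounds",
  cyl = "Number of cylinders"
)

# One row per variable: name, storage type, missing-value count, description
dictionary <- data.frame(
  variable    = names(df),
  type        = vapply(df, function(x) class(x)[1], character(1)),
  n_missing   = colSums(is.na(df)),
  description = unname(descriptions[names(df)]),
  row.names   = NULL
)

# Show a formatted table if knitr is installed, otherwise a plain data frame
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(dictionary))
} else {
  print(dictionary)
}
```

Placing a chunk like this in the appendix of report.Rmd keeps the dictionary in sync with the data, and the n_missing column gives you a starting point for the discussion of missing data.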

4. Evaluate your proposal expectations

  • Do you find support for your original proposal expectation about how one variable might be distributed? Show it by looking at summaries of the variable. If that variable doesn’t exist in your data, discuss another key variable instead and how it is distributed.
  • Do you find support for the expected relationships you wrote about in your proposal? If those variables don’t exist in your data, discuss one key relationship that you have identified between two or more variables.
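As a rough illustration of the kind of evidence that could back up this discussion, the sketch below summarizes one variable and measures one relationship using base R. It again uses the built-in mtcars data purely as a stand-in; the variables mpg and wt are placeholders for whichever variables your proposal discussed, and a correlation is only one of several reasonable ways to describe a relationship.

```r
# Stand-in data: built-in mtcars (replace with your own data frame)
df <- mtcars

# Distribution of one key variable: min, quartiles, median, mean, max
mpg_summary <- summary(df$mpg)
print(mpg_summary)

# Spread and missing values are often worth reporting as well
mpg_sd      <- sd(df$mpg, na.rm = TRUE)
mpg_missing <- sum(is.na(df$mpg))

# Relationship between two variables: Pearson correlation,
# computed only on rows where both values are present
wt_mpg_cor <- cor(df$wt, df$mpg, use = "complete.obs")
print(round(wt_mpg_cor, 2))
```

Numbers like these belong inside the narrative: state what you expected in your proposal, report the statistic, and say whether the two agree.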

5. Include two preliminary charts

  • Your charts should either provide evidence for or against the expectations behind your research question(s), or they should illustrate what else you might need to address your research question.
  • You may choose whatever chart types you wish, but your choices should highlight the point you want to make or should clearly show the relationship you want to emphasize. Consider these resources when choosing your charts.
  • Your charts should follow the design principles we have covered in class. They do not have to be fully “polished” yet, but at a minimum they should be accurate (i.e. not misleading) and they should not include distracting non-data ink.
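The points above might translate into a chart along the following lines. This is a minimal sketch, not a required approach: it assumes the ggplot2 package is installed, uses the built-in mtcars data as a stand-in for your own, and the title and axis labels are placeholders. The theme calls show one way to cut down on non-data ink.

```r
library(ggplot2)

# Stand-in data: built-in mtcars (replace with your own data frame)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(alpha = 0.7) +
  labs(
    title = "Heavier cars tend to get fewer miles per gallon",
    x = "Weight (1000 lbs)",
    y = "Fuel economy (miles per gallon)"
  ) +
  theme_minimal() +                          # drops the gray panel background
  theme(panel.grid.minor = element_blank())  # removes minor grid lines

print(p)
```

A title that states the takeaway, rather than just naming the variables, makes it easier for readers to see how the chart relates to your research question.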

6. Review your teammates

For students working in teams: On Blackboard under the assignment titled “Team Member Review: Progress Report”, submit a short description of the specific contributions of each team member in your team (this is one of the only things I’ll ask for on Blackboard). Here is an example review:

Student A identified the data source and wrote the documentation for it. Student B led the data cleaning process and did much of the initial data exploration. Student C helped write code for the main visualizations.

These reviews will be kept confidential and compared to ensure that the workload and grading for team members are equitable. Team members who do not make meaningful contributions to their projects will not receive the same grade as their teammates. If you are having any disputes among team members, please contact Professor Helveston as soon as possible so we can find a resolution.

7. Knit and submit

Click the “knit” button to compile your .Rmd file into an HTML web page, then create a zip file of everything in your R Project folder. Name your file progress-report.zip, then go to your team Box folder and submit your zip file in the “submissions” folder. Only one person from your team should submit the report.

Grading Rubric

68 Total Points

Organization & Formatting
  • Excellent (5): All formatting guidelines are followed; YAML is correct with all team members listed.
  • Good (4): Most formatting guidelines are followed; YAML is correct with all team members listed.
  • Needs work (3): Several or all formatting guidelines not followed; YAML contains elements that aren't updated from the template.

Research Question
  • Excellent (10 / 9): Research question is clear, focused, concise, complex, and arguable.
  • Good (8 / 7 / 6): Research question is reasonably clear and focused, but may be too simple, too complex, or too verbose.
  • Needs work (5 / 4 / 3): Research question is unclear and lacks focus; question is far too simple or overly complex.

Data Sources
  • Excellent (10 / 9): Data sources or plausible data sources are clearly described; validity of and concerns about data are discussed.
  • Good (8 / 7 / 6): Identified data sources or plausible data sources are not clearly described; validity of and concerns about data are minimally discussed.
  • Needs work (5 / 4 / 3): Data sources are poorly described or missing; description of validity of and concerns about data are poor or missing.

Evaluation of Expectations
  • Excellent (10 / 9): Clear discussion of whether or not proposal expectations about variables and relationships were met; evidence from data provided to support discussion.
  • Good (8 / 7 / 6): Description of only one expected variable, relationship, or chart, or minimal description of two; minimal evidence from data provided to support discussion.
  • Needs work (5 / 4 / 3): Poor or missing description of variables, relationships, or charts; poor or missing evidence to support discussion.

Data Visualization 1
  • Excellent (10 / 9): Chart expertly demonstrates best practices of visual design; chart is functionally accurate and generally easy to understand; chart addresses the research question.
  • Good (8 / 7 / 6): Chart generally demonstrates best practices of visual design; although accurate, some elements are distracting or confusing; chart mildly addresses the research question.
  • Needs work (5 / 4 / 3): Chart generally lacks best practices of visual design; multiple confusing, unclear, or distracting elements; unrelated to the research question.

Data Visualization 2
  • Excellent (10 / 9): Chart expertly demonstrates best practices of visual design; chart is functionally accurate and generally easy to understand; chart addresses the research question.
  • Good (8 / 7 / 6): Chart generally demonstrates best practices of visual design; although accurate, some elements are distracting or confusing; chart mildly addresses the research question.
  • Needs work (5 / 4 / 3): Chart generally lacks best practices of visual design; multiple confusing, unclear, or distracting elements; unrelated to the research question.

Technical things
  • Excellent (5): All code runs without errors; all files included in the submitted .zip file.
  • Good (4): Code has only one or two errors, otherwise runs; all files included in the submitted .zip file.
  • Needs work (3): Code has multiple errors; submitted .zip file is missing components necessary to reproduce analysis.

Individual Contributions
  • Excellent (8 / 7): (Individual score): Based on reviews by team members, the team member made substantial contributions to this assignment. Individual also submitted a review.
  • Good (6 / 5): (Individual score): Based on reviews by team members, the team member made contributions to this assignment, but contributions were late or incomplete. Individual also submitted a review.
  • Needs work (4 / 3 / 2 / 1 / 0): (Individual score): Based on reviews by team members, the team member made little to no contribution to this assignment. Individual’s peer review is missing.