Dr. John Helveston, Assistant Professor in Engineering Management & Systems Engineering
Background:
Yanjie He, Master's student in Data Analytics | Lingmei Zhao, Master's student in Statistics |
---|---|
Students should have taken Programming for Analytics or have experience with at least one programming language. If you’re not sure whether you have the necessary prerequisite skills, try to get up to speed by completing Assignment 0 before classes start. Once classes start, it may be difficult to keep up without this background, and it may be more beneficial to wait and take this course next year after taking Programming for Analytics in the coming Fall.
For this class, you’ll need to install some software and register for some websites. Go to the course prep page to get set up.
Students taking this course should have already taken Programming for Analytics. If you haven’t, I strongly recommend you review the lessons and assignments on the previous semester website. You can also get up to speed by completing Assignment 0.
While this course follows a structure similar to P4A, there will be several key distinctions:
Rather than well-defined exercises (e.g. writing a function like isPrime()), assignments will involve more real-world data problems that often have multiple, subjective solutions.

Here are some philosophies that will get you far in data analytic work. We will be revisiting these over and over again.
1) Embrace plain text
You will write code to produce rich outputs that include text and graphics. While your output may have lots of different formatting, your code will be written in plain text.
2) Embrace reproducibility
Everything you produce in this course will be a reproducible output. That is, you should be able to reproduce your output from the raw data and code. For example, this webpage was generated from this markdown source file on GitHub.
If you want to generate this very HTML page, download the .Rmd file, then open it in RStudio and run the following code:
rmarkdown::render('L1.1-course-introduction.Rmd')
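The same render step can be tried on a small stand-alone file. This is a minimal sketch, not the course's own workflow: the file name example.Rmd and its contents are hypothetical, and the render call is skipped when the rmarkdown package or pandoc is not available.

```r
# Write a tiny, hypothetical R Markdown file as plain text
rmd_path <- file.path(tempdir(), "example.Rmd")
writeLines(c(
  "---",
  "title: \"Example\"",
  "output: html_document",
  "---",
  "",
  "The mean of 1 through 10 is `r mean(1:10)`."
), rmd_path)

# Render the plain-text source to HTML (requires rmarkdown and pandoc)
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_path <- rmarkdown::render(rmd_path, quiet = TRUE)
}
```

Because the source is plain text, anyone with the same file and the same code can regenerate the same output.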
The syllabus is lengthy, but I do expect you to look through each section. If any changes need to be made, you’ll be notified through Slack.
The course schedule is your roadmap for the semester. Visit it often to make sure you are well-prepared for class and aware of upcoming assignment / quiz dates.
This can be a challenging class - don’t suffer in silence! Look at the “Getting Help” page, come to office hours, or send me a message on Slack.
Now that you’ve got R and RStudio installed, read the “Getting Started” lesson in Healy. We will follow the conventions laid out in this chapter throughout the class, including:
Check out the readr and readxl packages - we’ll be using these throughout the semester to import data into R.
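As a rough preview of what importing data with these packages can look like, here is a minimal sketch. The CSV contents are made up for illustration; the Excel file is one of the example workbooks bundled with readxl.

```r
library(readr)
library(readxl)

# Write a small, made-up CSV to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,score", "Ana,90", "Ben,85"), csv_path)

# read_csv() returns a tibble and guesses column types from the data
scores <- read_csv(csv_path, show_col_types = FALSE)

# readxl ships with example workbooks; readxl_example() returns the path to one
xlsx_path <- readxl_example("datasets.xlsx")
sheet_names <- excel_sheets(xlsx_path)   # names of the sheets in the workbook
mtcars_xl <- read_excel(xlsx_path, sheet = "mtcars")
```

Both functions return tibbles, so the imported data can be used the same way regardless of the original file type.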
We’ll cover the concept of “tidy” data in class on day 1. To get familiar with it, read Chapter 12 in R4DS, and take a look at these Tidy data explanations.
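To make the idea concrete before class, here is a small sketch of reshaping a "wide" table into a tidy one with tidyr. The data frame and its values are hypothetical, chosen only to show the shape of the change: each year starts as its own column and ends up as a value in a single year column.

```r
library(tidyr)

# Hypothetical wide table: one row per country, one column per year
wide <- data.frame(
  country = c("A", "B"),
  `2019` = c(10, 20),
  `2020` = c(12, 25),
  check.names = FALSE
)

# Tidy form: one row per country-year observation
long <- pivot_longer(
  wide,
  cols = c(`2019`, `2020`),
  names_to = "year",
  values_to = "cases"
)
```

In the tidy version every variable (country, year, cases) has its own column and every observation has its own row.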
Read through this guide from GMU on how to write a research question. We’ll come back to these ideas again later when you start working on your final projects, but it’s a good idea to start thinking about your research question early.