Introduction to exploratory data analysis using the R programming language; data visualization, data cleaning, exploratory analysis, information communication, rmarkdown, reproducibility.
This course provides students with a foundation in exploring data using the R programming language. Students will learn how to source, manage, transform, and explore a wide variety of data types. Students will also master the fundamental concepts for visualizing and communicating information contained in raw data, including the human psychology of visual information processing. All analyses will be conducted to support reproducibility from raw data to results using RMarkdown. Teaching will involve interactive lectures with plenty of class time spent working on examples and coding. Students will be assessed through quizzes and exams. Throughout the semester, students will work on a research project of their own design to demonstrate mastery of the course’s topics. At the end of the semester, students will submit a final, reproducible report of their project and will give a 5-minute presentation of their findings.
Students should have taken Programming for Analytics or have experience with at least one programming language. If you’re not sure whether you have the necessary prerequisite skills, you can try to get up to speed by completing Assignment 0 before classes start. Once classes start, it may be difficult to keep up without this background, and it may be more beneficial to wait and take this course next year after taking Programming for Analytics in the coming Fall.
Having successfully completed this course, students will be able to:
Learning a programming language can be as challenging as learning a new spoken language. Hadley Wickham - the chief data scientist at RStudio, and author of many amazing R packages you’ll be using - made this wise observation:
It’s easy when you start out programming to get really frustrated and think, “Oh it’s me, I’m really stupid,” or, “I’m not made out to program.” But, that is absolutely not the case. Everyone gets frustrated. I still get frustrated occasionally when writing R code. It’s just a natural part of programming. So, it happens to everyone and gets less and less over time. Don’t blame yourself. Just take a break, do something fun, and then come back and try again later.
If you’re finding yourself taking way too long hitting your head against a wall and not understanding, take a break, talk to classmates, ask questions in Slack, e-mail me, etc.
I promise, you can do this.
All texts and software for this course are freely available on the web. This includes:
Regular class attendance is essential. Much of the class time will be spent doing exercises and coding. Multiple absences, inappropriate or unprofessional behavior during class (such as monopolizing discussions or being rude or disruptive), not participating in classroom exercises, and not being prepared for class will result in a lower grade for the class participation component. As a rule of thumb, the participation grade will be assigned according to the following rubric:
|Low||Frequently absent||Rude; disruptive; distracting; monopolizes discussions|
|Moderate||Attended most classes, but often arrived late or left early||Takes notes; attentive; occasionally contributes in class discussion / exercises|
|High||Attends on time and prepared||Takes notes; attentive; regularly contributes in class discussion / exercises; does not dominate conversation; listens and responds thoughtfully to comments made by others|
Quizzes will be given about once every two weeks at the beginning of class. Quizzes cover material presented in previous classes and assignments during the weeks since the most recent quiz. Quizzes are designed to be time-intensive, to test for fluency, and to demonstrate where additional study is needed. Quizzes are low-stakes - your worst one is dropped, and the rest count for just 16% of your final grade. If you do poorly on one, use that as feedback on where you need additional improvement.
Why quiz at all? Research shows that giving small quizzes throughout a class can dramatically help with retention. It’s a phenomenon known as the “retrieval effect” - basically, you have to practice remembering things, otherwise your brain won’t remember them. The phenomenon and the research on it are explained in detail in the book “Make It Stick: The Science of Successful Learning”, by Brown, Roediger, and McDaniel.
Assignments will be a combination of “exercises” and “redesign” projects. Exercises include instructional videos and applied practice writing code, mostly using the DataCamp platform. These are designed to be completed prior to the associated lectures to prepare for class. Redesign project assignments provide hands-on experience with the material covered in class by redesigning an existing data visualization. While students may work with their peers on these assignments, each student must submit their own work. Credit for each assignment will be allocated according to a rubric provided in the assignment description. No more than 2 late days can be used on any one assignment.
Throughout the semester, students will work in teams of 1-3 students towards a final project of an exploratory data analysis. At the end of the semester, each student will submit a report of their analysis in the form of an HTML web page and give a 5-minute presentation of their results to the class. To make the overall project more manageable, it will be broken down into several separate “milestone” deliverables due throughout the semester, including a proposal, progress report, peer review, presentation, and final report. View the final project assignment page for more details.
There will be no final exam - the final project is the final exam.
Final grades will be calculated as follows:
|Attendance & Participation||5 %|
|Quizzes||16 %||Lowest quiz grade is dropped|
|Assignment 1||4 %||Exercises|
|Assignment 2||4 %||Exercises|
|Assignment 3||4 %||Exercises|
|Assignment 4||8 %||Redesign project|
|Assignment 5||4 %||Exercises|
|Assignment 6||8 %||Redesign project|
|Assignment 7||4 %||Exercises|
|Final Project Proposal||6 %|
|Final Project Progress Report||8 %|
|Final Project Peer Review||5 %|
|Final Project Presentation||8 %|
|Final Project Report||16 %|
Here’s a visual breakdown by category:
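The weights in the table above can also be tallied in a few lines of code. What follows is a minimal sketch in Python, not an official grade calculator: the grouping of line items into categories (`category`) and the helper names are assumptions made for illustration, and the quiz score is assumed to be the average after the lowest quiz has already been dropped.

```python
# Weights (percent of the final grade), copied from the table above.
WEIGHTS = {
    "Attendance & Participation": 5,
    "Quizzes": 16,
    "Assignment 1": 4,
    "Assignment 2": 4,
    "Assignment 3": 4,
    "Assignment 4": 8,
    "Assignment 5": 4,
    "Assignment 6": 8,
    "Assignment 7": 4,
    "Final Project Proposal": 6,
    "Final Project Progress Report": 8,
    "Final Project Peer Review": 5,
    "Final Project Presentation": 8,
    "Final Project Report": 16,
}


def category(item):
    """Hypothetical grouping of a line item into a broader category."""
    if item.startswith("Assignment"):
        return "Assignments"
    if item.startswith("Final Project"):
        return "Final Project"
    return item


def category_totals(weights):
    """Sum the weights of all line items that share a category."""
    totals = {}
    for item, pct in weights.items():
        key = category(item)
        totals[key] = totals.get(key, 0) + pct
    return totals


def final_grade(scores, weights=WEIGHTS):
    """Weighted average; `scores` maps each line item to a 0-100 score."""
    return sum(weights[item] * scores[item] for item in weights) / 100


if __name__ == "__main__":
    for name, pct in category_totals(WEIGHTS).items():
        print(f"{name}: {pct}%")
```

Under this grouping, the assignments together account for 36% of the final grade and the final project milestones for 43%.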
This Alternative Minimum Grading (AMG) policy is available to everybody, but is designed specifically for students who struggle in the first part of the course, and then through sustained hard work and dedication manage to elevate their performance in the latter part of the course to a level that merits passing with a C (even if their Standard Grade might be lower than that).
Students cannot “sign up” for AMG grading. Every student will be considered both for Standard Grading and AMG, and the instructor can choose to assign the AMG grade if a student’s effort merits it. To qualify for AMG you must put forth sustained effort, which means meeting the following requirements:
To compute your AMG score, first use the following to compute your raw score. If the resulting score is higher than a C, set it back to a C.
|Quizzes (Top 4)||25%|
|A||93 - 100%||C||73 - 76.99%|
|A-||90 - 92.99%||C-||70 - 72.99%|
|B+||87 - 89.99%||D+||67 - 69.99%|
|B||83 - 86.99%||D||63 - 66.99%|
|B-||80 - 82.99%||D-||60 - 62.99%|
|C+||77 - 79.99%||F||< 60%|
The course instructors may choose to change the scales at their discretion. You are guaranteed that your letter grade will never become worse as a result of changing scales.
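As a rough illustration of how the default scale above maps a percentage to a letter, here is a minimal Python sketch. The function name `letter_grade` is hypothetical, and the sketch assumes no rounding: a score that falls between two listed bands (for example 92.995) is assigned the lower band. It reflects only the default scale, not any adjustment the instructors may make.

```python
# Lower bound (percent) of each letter band, taken from the scale above.
SCALE = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def letter_grade(pct):
    """Return the letter for a final percentage under the default scale."""
    for cutoff, letter in SCALE:
        if pct >= cutoff:
            return letter
    return "F"  # anything below 60%


if __name__ == "__main__":
    for pct in (95, 86.5, 73, 59.9):
        print(pct, letter_grade(pct))
```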
This class can be challenging - don’t suffer in silence. Look at the “Getting Help” page for ways to get resources that can help you succeed.
Each student is allowed 5 late homework submission days - use them however you want, no questions asked. No more than 2 days can be applied toward a single assignment. Late days are meant to cover illness, family emergencies, and religious holidays. Assignments submitted more than 2 days after the due date will not be graded. In extreme circumstances, contact the instructor.
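The late-day rules can be read as a simple check, sketched below in Python. The function `apply_late_days` is hypothetical, and one part of it is an assumption the policy does not spell out: a late submission from a student who does not have enough late days left is treated here as not graded. In practice, contact the instructor in such cases.

```python
MAX_TOTAL_LATE_DAYS = 5   # late days available per student for the semester
MAX_PER_ASSIGNMENT = 2    # most late days usable on a single assignment


def apply_late_days(days_late, remaining=MAX_TOTAL_LATE_DAYS):
    """Return (graded, remaining_late_days) for one submission.

    days_late: whole days past the due date (0 means on time).
    remaining: late days the student still has before this submission.
    """
    if days_late <= 0:
        return True, remaining  # on time, no late days used
    if days_late > MAX_PER_ASSIGNMENT:
        return False, remaining  # more than 2 days late: not graded
    if days_late > remaining:
        # Assumption: not enough late days left, so treated as not graded.
        return False, remaining
    return True, remaining - days_late


if __name__ == "__main__":
    left = MAX_TOTAL_LATE_DAYS
    for late in (1, 2, 3, 2):
        graded, left = apply_late_days(late, left)
        print(f"{late} day(s) late -> graded={graded}, late days left={left}")
```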
Learning how to program is like learning how to ride a bicycle - to get better, you must practice writing code yourself. Therefore, we have a set of strict rules regarding what kind of collaboration is allowed, what counts as over-collaboration, and what counts as cheating.
Over-collaboration results in a warning on the first offense, and a penalty on later offenses. Examples include:
Cheating results in a penalty on the first offense, and failing the course on the second offense. Cheating on assignments can include:
Cheating on quizzes, assignments, or the final project can include:
Penalties are decided by the course instructors, and can vary based on the severity of the offense. Possible penalties include:
Penalties may also be accompanied by a letter to the Dean of Student Affairs, again at the instructors’ discretion. This can lead to university-level penalties, such as being suspended or expelled.
Programs are naturally structured, which makes them very easy to compare for plagiarism. Automated plagiarism detection systems make this process even easier. Watch this video showing plagiarism detection software in action (this example is using Python code, but this also works for R code).
In short, if you copy code, we will know - don’t copy code!
Your first year of college is a time when you do a lot of learning. Sometimes, you might make bad decisions or mistakes. The most important thing for you to do is to learn from your mistakes, to constantly grow, and become a better person.
Sometimes, students panic and copy code right before the deadline, then regret what they did afterwards. Therefore, you may rescind any homework submission for up to 24 hours after the deadline with no questions asked. Simply email the course instructors asking to delete the submission in question, and we will do so. Deleted submissions will not be considered during plagiarism detection, though of course they will also not be graded. However, it will always be better to get a 0 (or partial credit) on an assignment than to get a cheating violation!
I applaud all of you who go to graduate school with children! It is difficult to balance academic, work, and family commitments, and I want you to succeed. Here are my policies regarding children in class:
I understand that sleep deprivation and exhaustion are among the most difficult aspects of parenting young children. The struggle of balancing work, childcare, and graduate school is tiring, and I will do my best to accommodate any such issues while maintaining the same high expectations for all students enrolled in the class. Please do not hesitate to contact me with any questions or concerns.
I will listen and believe you if someone is threatening you.
Lauren McCluskey, a 21-year-old honors student athlete, was murdered on the University of Utah campus on October 22, 2018 by a man she had briefly dated. We must all take action to ensure that this never happens again.
If you are in immediate danger, call 911 or GWU police at 202-994-6111 (GWPD).
If you are experiencing sexual assault, domestic violence, or stalking and you report it to me, I will listen and connect you to resources or call GWU’s Counseling and Psychological Services (202-994-5300).
Any form of sexual harassment or violence will not be excused or tolerated at GWU. GWU has instituted procedures to respond to violations of these laws and standards, programs aimed at the prevention of such conduct, and intervention on behalf of the victims. GWU Police officers will treat victims of sexual assault, domestic violence, and stalking with respect and dignity. Advocates on campus and in the community can help with victims’ physical and emotional health, reporting options, and academic concerns.
All course materials available on the course website are developed open source - you are welcome to post and share them following the licensing guidelines listed in the license page.
However, all solutions to assignments and quizzes are proprietary. Don’t post them online or try to sell them - this is a violation of the student code of conduct.
If the instructor has not arrived, wait 20 minutes; after that, you’re free to leave. One member of the class should be selected to notify the EMSE Department of the instructor’s absence by calling the department at 202-994-4892 on the next business day.
In accordance with University Policy, students should notify faculty during the first week of the semester of their intention to be absent from class on their day(s) of religious observance. Official university policy here: https://students.gwu.edu/accommodations-religious-holidays
Disability Support Services (DSS): Any student who may need an accommodation based on the potential impact of a disability should contact the Disability Support Services office at 202-994-8250 in Rome Hall, Suite 102, to establish eligibility and to coordinate reasonable accommodations. For additional information please refer to: https://disabilitysupport.gwu.edu/
Mental Health Services (202-994-5300): The University’s Mental Health Services offers 24/7 assistance and referral to address students’ personal, social, career, and study skills problems. Services for students include: crisis and emergency mental health consultations, confidential assessment, counseling services (individual and small group), and referrals. https://healthcenter.gwu.edu/counseling-and-psychological-services
Academic dishonesty is defined as cheating of any kind, including misrepresenting one’s own work, taking credit for the work of others without crediting them and without appropriate authorization, and the fabrication of information. For the remainder of the code, see: https://studentconduct.gwu.edu/code-academic-integrity
In addition to the formal code of academic integrity, the instructor expects that students will treat this course with the level of professionalism required in the workplace. Remember that real firms are sponsoring student projects throughout the semester; in a workplace setting, these firms would be paying clients for the analyses being conducted. This course prepares students to succeed in the workplace, and maintaining a high degree of professionalism is expected.
Once you have read this entire syllabus and viewed the course schedule, please send me a picture of your favorite superhero in a direct message on Slack.
Brownie points if it’s animated.
Some content on this page is inspired by and / or modified from other sources: