Project

The purpose of the data project is for you to conduct a reproducible analysis with a data set of your choosing. The project has two components: the proposal, which is graded on a pass/fail basis, and the final report. An outline for each is provided in the templates. When submitting the assignments, include the R Markdown file (change the name to include your last name, for example Bryer-Proposal.Rmd and Bryer-Project.Rmd) along with any supplementary files necessary to run the R Markdown file (e.g. data files, screenshots, etc.). Suggestions for possible data sources are included below; however, you are free to use data not listed there. The only requirement is that you are allowed to share the data. Projects will be shared with others on this website, so they should be presented in a way that lets other students reproduce your analysis.

Project Proposal

The proposal can be more informal, using bullet points where necessary, and can include R code and output. You must address the following areas:

• Research question
• What are the cases, and how many are there?
• Describe the method of data collection.
• What type of study is this (observational/experiment)?
• Data Source: If you collected the data, state self-collected. If not, provide a citation/link.
• Response: What is the response variable, and what type is it (numerical/categorical)?
• Explanatory: What are the explanatory variable(s), and what type is each (numerical/categorical)?
• Relevant summary statistics
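
As a rough illustration of how several of the items above (number of cases, variable types, and summary statistics) might be addressed in an R chunk, here is a minimal sketch. It uses R's built-in mtcars data set purely as a placeholder, which would not be an acceptable choice for the actual project; the object name project_data and the choice of mpg as the response with wt and am as explanatory variables are assumptions made only for this example.

```r
# Placeholder only: the built-in mtcars data stands in for your own data set.
project_data <- mtcars

# Treat transmission type as categorical (0 = automatic, 1 = manual in mtcars).
project_data$am <- factor(project_data$am,
                          levels = c(0, 1),
                          labels = c("automatic", "manual"))

# Cases: each row is one car, so the number of rows is the number of cases.
n_cases <- nrow(project_data)
n_cases

# Variable types: mpg (response) and wt are numerical, am is categorical.
str(project_data[, c("mpg", "wt", "am")])

# Summary statistics for the response, overall and by group.
summary(project_data$mpg)
group_means <- tapply(project_data$mpg, project_data$am, mean)
group_means

# Counts for the categorical explanatory variable.
table(project_data$am)
```

In your own proposal, you would replace the placeholder with a call that reads your data file from the same folder as the R Markdown file, so that others can run it without editing paths.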

Example Data Sources

You may not use data sources that were used in class or in the textbooks. Possible data sources include, but are not limited to: