For the course data analysis project, you will apply statistical analysis to a public policy topic and dataset of interest to you. You will write a research paper of approximately 5 pages of written text, plus tables and any figures, that presents the analysis and discusses your conclusions and any limitations of your findings, applying the concepts we learn throughout the quarter.

You will conduct your analysis in a basic R script that documents your work and allows it to be reproduced. You may employ any of the methods covered in Weeks 1 through 9 of the course. You should develop a hypothesis relating a dependent variable to an independent variable and then test it using methods from the course. Graphics that explore your dataset and/or present your findings may enhance your paper but are not required.
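As a rough illustration of what such a script might look like, the sketch below tests one hypothesis about a dependent and an independent variable. It is only an example under stated assumptions: R's built-in mtcars data stands in for a policy dataset, the variable choices are hypothetical, and it assumes a two-sample t-test is among the methods you choose from the course.

```r
# Minimal sketch of a reproducible analysis script (illustrative only).
# ASSUMPTION: the built-in mtcars data stands in for your own policy dataset.
# In a real project you would load your data here, e.g. with read.csv().
dat <- mtcars

# Dependent variable: mpg (miles per gallon)
# Independent variable: am (0 = automatic, 1 = manual), recoded as a labeled group
dat$transmission <- factor(dat$am, levels = c(0, 1),
                           labels = c("automatic", "manual"))

# Look at the key variables before testing
summary(dat$mpg)
table(dat$transmission)

# H0: mean mpg is the same for both transmission groups
# Ha: mean mpg differs between the two groups
# t.test() with a formula runs Welch's two-sample t-test by default
result <- t.test(mpg ~ transmission, data = dat)
print(result)
```

Comments that name the variables, state the hypotheses, and label each step are what let a reader rerun the script and follow the analysis.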

Structure of Paper

Clarity and structure are critical both in writing your paper and in presenting your results. Your paper should be organized into the following parts.

  1. Introduction: Discussion of the problem, including the context of the question/issue and the citation of one to two background papers.
  2. Hypothesis: Statement of your hypotheses, both as formal statistical hypotheses and in plain English, following the hypothesis testing approach we review during the quarter.
  3. Data and Methods: Description of the data file and discussion of the statistical methods, including any assumptions of the methods used and whether those assumptions are reasonable. These should be based on what we learn about the methods and assumptions during the quarter. Descriptive statistics for the dataset and key variables may be included but are not explicitly required.
  4. Results: Presentation of the statistical analysis, which may include test statistics, p-values, confidence intervals, and the decision to reject or fail to reject your null hypothesis.
  5. Conclusion: Summary and interpretation of the results, including discussing any limitations of your findings or further questions for study based on your analysis.
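To make the Hypothesis and Results parts concrete, here is a small sketch of stating hypotheses in a script and pulling out the quantities a Results section might report. It is an illustration under assumptions, not a required approach: the built-in ToothGrowth data is a placeholder for your own dataset, the 0.05 significance level is assumed, and the names alpha, fit, and decision are hypothetical.

```r
# Illustrative sketch: extracting results to report in the paper.
# ASSUMPTION: R's built-in ToothGrowth data stands in for a policy dataset.
# H0: mean tooth length is the same for both supplement types (OJ and VC)
# Ha: mean tooth length differs between the two supplement types
alpha <- 0.05  # assumed significance level

# Welch two-sample t-test of len (dependent) by supp (independent)
fit <- t.test(len ~ supp, data = ToothGrowth)

t_stat  <- unname(fit$statistic)  # test statistic
df      <- unname(fit$parameter)  # degrees of freedom
p_value <- fit$p.value            # two-sided p-value
ci      <- fit$conf.int           # 95% CI for the difference in means (OJ minus VC)

# Decision rule: reject H0 only when the p-value is below alpha
decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"

cat(sprintf("t = %.2f, df = %.1f, p = %.3f\n", t_stat, df, p_value))
cat(sprintf("95%% CI for difference in means: [%.2f, %.2f]\n", ci[1], ci[2]))
cat("Decision at alpha =", alpha, ":", decision, "\n")
```

Reporting the confidence interval alongside the p-value shows the size and uncertainty of the estimated difference, which also helps when discussing limitations in the Conclusion.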

Please use Ch. 11 of the Using Statistics text (from Module 2) as a resource on good statistical writing practice.

Your supporting R script must be turned in along with your paper (the dataset and output are not required, except where you use the output to present results in your paper).

Assignment Details

  • Length: approximately 5 pages of double-spaced written text, not including the title page, references, tables, and figures
  • Use standard APA formatting (one-inch margins, top and bottom, 12-point font size)
  • All citations should be in APA format.
  • Rubric

    Application and Understanding of Statistical Concepts (50%)
      A (90-100): The student’s paper shows an exceptional understanding of the statistical topics and application of concepts from the course to their data analysis problem. The description of methods and statistical conclusions are appropriate and the student recognizes limitations of their conclusions due to the assumptions of the statistical methods.
      B (80-89): The student successfully conducted a statistical analysis between an independent and dependent variable, but some aspects of the student’s understanding of the statistical concepts may need to be improved. The statistical methods are appropriate, but the writing misses some important details. The discussion of the limitations of the analysis may miss some critical assumptions of the methods.
      C (70-79): Statistical methods are applied, but the student’s paper demonstrates some serious misunderstanding of the methods and/or fails to apply the method. The discussion of choice of statistical approach and/or conclusions/limitations misses critical aspects of the methods for understanding their findings. The student may draw incorrect conclusions.
      D (0-69): The student does not succeed in conducting an analysis between an independent and dependent variable. The student’s writing shows a complete misunderstanding of the application of the statistical methods. The methods are misapplied or totally inappropriate. Conclusions are incorrect.

    Clarity and Organization of Writing (40%)
      A (90-100): The student’s writing is clear and complete and organized in the appropriate sections. The student thoughtfully establishes some background information and motivation for their research question and structures their hypothesis appropriately. The presentation of data analysis results is clear and may go above and beyond the project requirements, including descriptive statistics regarding their dataset and/or data visualization.
      B (80-89): The writing is mostly organized, but there are areas of improvement for the student. Certain areas lack clarity. The student may have somewhat deviated from the required structure of the paper. The hypothesis tested is appropriate, but needed details may be missing. The presentation of the data analysis may need some strengthening to adequately portray the results.
      C (70-79): The writing and organization has some serious deficiencies making it difficult to follow or understand the student’s work. Sections may be missing or underdeveloped. The student’s research question and/or hypothesis may be poorly formulated.
      D (0-69): The writing is disorganized and unfocused. Sections of the paper are missing and do not at all demonstrate structured thinking to present analyses.

    Supporting R Script (10%)
      A (90-100): An R script is included that is well commented and clearly documents the analysis steps, including data input and statistical application.
      B (80-89): The R script presents the student’s analysis, but may need more clarity in portraying the steps the student undertook to conduct their analysis.
      C (70-79): The R script does not properly document the student’s work. The R script is available but does not make clear the student’s key analysis steps.
      D (0-69): The R script is entirely missing and/or unrelated to the students’ work and analyses.

