
# Lab 3: Analyzing the Association between Variables

Miami University | College of Arts and Science | Department of Statistics | Spring 2021-22
Course Title: Statistics | Subject and Course Number: STA 261
Lab 3: Analyzing the Association between Variables
Due date: Thursday, 02/17/2022
Total: 50 points
Table number:
Names of students: Write down the names of the students at your table who contributed equally to this lab:

Learning Objectives: Lab 3 will
Help you to determine if there is an association between two categorical variables
Develop your skills to interpret the estimated regression parameters and the coefficient of determination
Prepare you for the test
Assist you in exploring the resources available in the class (such as peers, the instructor, graduate assistants, etc.)
Submission guidelines:
Make a copy under the File menu and share it with your group mates (give them editing access). To share, click the Share button in the top right, add their email addresses, then click Done.
Only one person per table needs to submit a PDF of the lab (File → Download → PDF Document, then submit through Canvas).

Question 1: Vehicular Accidents. The data were collected in 2019 by the National Highway Traffic Safety Administration. The Vehicular Accidents 2019 dataset can be found in our StatCrunch group STA 261 Spring 2022. This dataset compiles all of the accidents in 2019 and includes variables that describe the details of each crash. We are interested in determining which variables, if any, are associated with the type of injuries sustained in the accident. (30 points)

Variables Used:
Max_Injury = A description of the maximum injury sustained in the crash
No injuries implies no persons were injured
Possible Injuries implies at least one person involved may have suffered an injury
Minor Injuries implies at least one person suffered minor injuries
Serious Injuries implies at least one person was severely injured in the crash
Fatalities implies at least one person died as a result of the crash

Alcohol_Involved = A Yes/No indicating if alcohol was involved in the crash.
AM_PM = indicates the time of the accident; AM or PM
Speeding_involved = A Yes/No indicating if a vehicle involved in the crash was speeding
Accident_Type = indicates if the accident was a single car or multi-car accident
Number_Injured = Number of persons involved in the crash that were injured in some way.

Part I: What is an observational unit within this dataset?

Part II: Construct a contingency table that shows the conditional proportions of type of injuries, given whether alcohol was involved. Interpret.
(5 points total: 1 point for explanatory/response variable, 2 points for creating a table, 2 points for interpretation)
What is the explanatory variable? What is the dependent (or response) variable?

Interpretation:
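
As a rough illustration of what "conditional proportions, given alcohol involvement" means, the sketch below computes row proportions in plain Python. The ten records are made-up toy values, not the Vehicular Accidents 2019 data, and the helper name `conditional_proportions` is hypothetical; in the lab itself the table would be built in StatCrunch.

```python
from collections import Counter

# Hypothetical toy records (NOT from the Vehicular Accidents 2019 dataset):
# each tuple is (Alcohol_Involved, Max_Injury) for one crash.
crashes = [
    ("Yes", "Fatalities"),
    ("Yes", "Serious Injuries"),
    ("Yes", "No injuries"),
    ("Yes", "Fatalities"),
    ("No", "No injuries"),
    ("No", "No injuries"),
    ("No", "Minor Injuries"),
    ("No", "Possible Injuries"),
    ("No", "No injuries"),
    ("No", "Serious Injuries"),
]


def conditional_proportions(records):
    """Return {explanatory value: {response value: proportion}}.

    Proportions are conditioned on the explanatory variable, so the
    proportions within each explanatory group sum to 1.
    """
    cell_counts = Counter(records)
    group_totals = Counter(explanatory for explanatory, _ in records)
    table = {}
    for (explanatory, response), count in cell_counts.items():
        table.setdefault(explanatory, {})[response] = count / group_totals[explanatory]
    return table


table = conditional_proportions(crashes)
for group, row in table.items():
    print(group, {injury: round(p, 2) for injury, p in row.items()})
```

Comparing the rows side by side is the point: if the distribution of injury types looks clearly different in the "Yes" row than in the "No" row, that suggests an association between the two variables.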

Part III: Is there a relationship between the time of the accident (AM or PM) and the type of injuries in vehicle accidents? Make a comparative bar graph that compares the two distributions. Interpret.
(5 points total: 1 point for explanatory/response variable, 2 points for creating a plot, 2 points for interpretation)
What is the explanatory variable? What is the dependent (or response) variable?

Paste your comparative bar graph here:

Interpretation:

Part IV: Is there a relationship between whether speeding was involved in the accident (Speeding_involvement) and the type of injuries in vehicle accidents? Make a segmented bar graph that compares the two distributions. Interpret.
(5 points total: 1 point for explanatory/response variable, 2 points for creating a plot, 2 points for interpretation)
What is the explanatory variable? What is the dependent (or response) variable?

Paste your segmented bar graph here:

Interpretation:

Part V: Is there a relationship between whether it was a single-car or multi-car accident (Accident_Type) and the type of injuries in vehicle accidents? Make a contingency table OR a comparative bar graph OR a segmented bar graph that compares the two distributions. Interpret.
(5 points total: 1 point for explanatory/response variable, 2 points for creating a table or plot, 2 points for interpretation)
What is the explanatory variable? What is the dependent (or response) variable?

Interpretation:

Part VI: Does the maximum speed of the accident differ when alcohol is involved? Create a comparative chart and describe the distributions. (Hint: don’t forget your SOCS!) (5 points)
Comparative Plots:

Interpretation:
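
The SOCS hint (shape, outliers, center, spread) can be made concrete with a small numeric summary per group. The sketch below is only an illustration: the speeds are made-up toy values, not the course dataset; the helper name `summarize` is hypothetical; and the 1.5 × IQR fence is one common outlier convention assumed here. Quartiles from `statistics.quantiles` may differ slightly from what StatCrunch reports, since software packages use different quartile methods.

```python
import statistics

# Hypothetical toy speeds in mph (NOT from the course dataset).
speeds = {
    "Alcohol: Yes": [35, 45, 50, 55, 60, 65, 70, 110],
    "Alcohol: No": [25, 30, 35, 40, 40, 45, 50, 55],
}


def summarize(values):
    """Return center, spread, and 1.5*IQR outliers for one group."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return {
        "median": median,                  # center (resistant to outliers)
        "mean": statistics.mean(values),   # center (pulled toward outliers)
        "iqr": iqr,                        # spread
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }


for group, values in speeds.items():
    print(group, summarize(values))
```

A mean noticeably above the median is a hint of right skew (shape), which a boxplot or histogram of each group would then confirm visually.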

Part VII: Does the maximum speed differ based on the time of the accident (AM vs. PM)? Create a comparative chart and describe the distributions. (Hint: don’t forget your SOCS!) (5 points)
Comparative Plots:

Interpretation:

Question 2: Fire Damage. A fire insurance company is interested in investigating the effect of the distance between the burning house and the nearest fire station (in miles) on the amount of fire damage (in thousands of dollars) in major residential fires. A sample of 15 recent fires in a suburb is selected. The Fire dataset can be found in our StatCrunch group STA 261 Spring 2022. (20 points)
Source: McClave, J.
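
To show what the estimated regression parameters and the coefficient of determination measure in a setting like this one, here is a minimal least-squares sketch. The five (distance, damage) pairs are invented for illustration and are not the Fire dataset; the function name `fit_line` is hypothetical.

```python
# Hypothetical toy data (NOT the Fire dataset):
# distance to the nearest fire station in miles, damage in thousands of dollars.
distance = [1, 2, 3, 4, 5]
damage = [12, 17, 20, 27, 29]


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple linear regression, R^2 equals the squared correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = fit_line(distance, damage)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.3f}")
```

With these units, the slope is read as the estimated change in damage (in thousands of dollars) for each additional mile from the fire station, and R² as the proportion of the variation in damage explained by the linear relationship with distance.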

## 2 discussion posts and 4 replies

Discussion 6: Could/Should This Type of Study Include Randomization to Condition?
Have a look at the article (https://www.goodtherapy.org/blog/study-do-self-help-programs-work-as-well-as-therapy-0828171) about whether self-help is as effective as professional therapy. (The article includes a link to the original study, but no need to read it.) Discuss whether and how a study of this type could be a “true” experiment (which includes randomization to condition). Notice, too, the experimental stimuli for the self-help condition, which is either a book or an online program. There are a few wrinkles, which are explained near the end of the article, but the bottom line is whether we should trust the results given design issues. Should we? And/or how would you design a study to test the hypothesis that professional therapy is more effective than self-help? (Note the directional hypothesis.)
REMEMBER: First, you must respond to the question posed in the discussion. Once you’ve done that, take some time and reply to at least two classmates’ discussion posts. Each peer response will include at least one positive observation (i.e., what do you agree with, appreciate, or find interesting? why?) and at least one critique (what would you change? what is unclear, needs more support, or further explanation? why?). Remember, you must post 3 times (your original response and 2 responses to classmates’ posts) to receive full credit.
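
For reference when discussing what "randomization to condition" would involve, the sketch below shows one simple way random assignment can be carried out. Everything in it is assumed for illustration: the participant IDs, the two condition labels, the seed, and the helper name `assign_to_conditions` are hypothetical and do not come from the article or the original study.

```python
import random


def assign_to_conditions(participants, conditions, seed=None):
    """Shuffle participants, then deal them out to conditions in turn.

    Every participant has the same chance of landing in each condition,
    and group sizes differ by at most one.
    """
    rng = random.Random(seed)  # seeded so the assignment can be reproduced
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups


# Hypothetical participant IDs and condition labels.
participants = [f"P{n:02d}" for n in range(1, 21)]
groups = assign_to_conditions(
    participants, ["professional therapy", "self-help"], seed=261
)
for condition, members in groups.items():
    print(condition, sorted(members))
```

The key design point is that chance, not the participant or the researcher, decides who receives which treatment, which is what distinguishes a true experiment from a comparison of self-selected groups.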

Discussion 7: Response Rates Matter. Or Not?
Recent election polls in the US and UK have been fairly inaccurate, and the 2020 US presidential election might be the worst of the lot. Read this article (https://www.politico.com/news/2021/07/18/pollsters-2020-polls-all-wrong-500050). Response rates have been pretty poor and/or the “wrong” sampling frames have been used. Or the wrong questions were asked. What response rate would you be comfortable with, assuming a useful sampling frame? Notice that we can determine if a poll is accurate by comparing it to election results. If a poll is off, is the response rate and/or sampling frame always the culprit? How about a social scientific survey? How can we trust the results if we don’t have a population to compare them to?
REMEMBER: First, you must respond to the questions posed in the discussion. Once you’ve done that, take some time and reply to at least two classmates’ discussion posts. Each peer response will include at least one positive observation (i.e., what do you agree with, appreciate, or find interesting? why?) and at least one critique (what would you change? what is unclear, needs more support, or further explanation? why?). Remember, you must post 3 times (your original response and 2 responses to classmates’ posts) to receive full credit.
No specific length; the posts should just meet the requirements. I need the discussion posts and replies done by midnight EST, so please send the discussion posts ASAP so I can post them and then send you the classmates’ posts for replies. I’m putting 13 hours just so I can select a tutor.