STA10003 FOUNDATIONS OF STATISTICS ASSIGNMENT Help
This Assignment – Part 2 is worth 20% of your final mark for STA10003.
The Industry Scenario
You are a new graduate researcher at a social science and psychological sciences research institute, and you have been given a dataset to analyse from a study of Californian adults in 2020. The survey, titled the California Health Interview Survey [CHIS], collects extensive information on health status, health conditions, health-related behaviours, health insurance coverage, other health-related issues and demographics. You have been tasked with conducting the initial analysis of some variables using SPSS and writing brief reports.
Data Preparation
For Assignment Part 2 you can use the random sample of 1500 of the 6259 observations you generated from the original data file STA10003 SP1 2022 Assignment Data.sav. You do not need to generate a new random sample. If you have misplaced the random sample drawn for Assignment Part 1, you should draw a new random sample as per the instructions in the Assignment Part 1 document.
The data file STA10003 SP1 2022 Assignment Data.sav can be found on Canvas under Assignments > Assignment – Part 1.
Submission Instructions
- Your submission must be a single Word file or PDF file.
- Although a cover page is not required, you should include your name and student number within the document [e.g., in the footer].
- You must submit your file via Canvas by the specified due date and time. Only the last document you submit will be retained by Canvas.
- Once submitted, please review your submission to ensure the correct file has been submitted.
- This is an individual assignment. Do not share your work with other students. They will have a different random sample of data, so any copying will be detected.
ASSIGNMENT – PART 2
For your Assignment Part 2, you are required to complete the first three [3] questions by producing the appropriate analyses and writing the relevant report for each question. You are also required to complete question 4, containing short answer questions.
For each of the first three questions requiring SPSS, include the relevant output – tables and graphs. Note that many of the variables have similar names, so it is important that the correct variable be selected to address the question asked.
Question 1: Cigarette Consumption
The variable Cigarettes gives the number of cigarettes each Californian adult reported smoking on the previous day. Research indicated that American adults smoke, on average, 15 cigarettes per day. A respiratory researcher has claimed that Californian adults smoke more. Conduct a one-sample t-test using the Cigarettes variable to test this claim.
Produce the relevant output and write a one-sample t-test report based on your output in the style presented in Supplement G: Report writing for Hypothesis Tests. Include the relevant output with your answer.
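The arithmetic behind the t value in a one-sample t-test can be sketched as follows. This is a minimal illustration in Python rather than SPSS, and it does not replace the SPSS output required for your report. The function name `one_sample_t` and the data values are hypothetical; they are not drawn from the CHIS data. Only the test value of 15 comes from the question.

```python
import math
import statistics


def one_sample_t(values, test_value):
    """Return (t, df, mean difference) for a one-sample t-test."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # sample SD (n - 1 denominator)
    standard_error = sd / math.sqrt(n)
    t = (mean - test_value) / standard_error
    return t, n - 1, mean - test_value


# Hypothetical values for illustration only; NOT CHIS data.
cigarettes = [12, 18, 20, 15, 22, 17, 14, 19]
t, df, diff = one_sample_t(cigarettes, 15)
print(f"t({df}) = {t:.3f}, mean difference = {diff:.3f}")
```

The p-value and the 95% confidence interval require the t distribution, so take those from your own SPSS output.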
Question 2a: Psychological Distress
The variable PsychDistressScore gives an indication of psychological distress displayed by each participant. Scores are based on the Kessler K6 screening scale which asks participants six questions relating to how often they felt nervous, hopeless, restless, depressed, worthless and ‘everything is an effort’ over the previous month. Scores can range between 0 [no distress] and 24 [high distress].
A psychologist predicted that there is a difference in psychological distress score between people aged less than 40 years and people aged 40 or older. Conduct an independent samples t-test using the PsychDistressScore and AgeGrp variables to test this claim.
Produce the relevant output and write an independent samples t-test report based on your output in the style presented in Supplement H: Report Writing for Independent Samples t-Test. Include the relevant output with your answer.
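SPSS reports the independent samples t-test in two rows: "Equal variances assumed" and "Equal variances not assumed". The sketch below shows, under stated assumptions, how the two t values and their degrees of freedom are calculated. It is written in Python for illustration only; the function name `independent_samples_t` and the scores are hypothetical and are not CHIS data.

```python
import math
import statistics


def independent_samples_t(group_a, group_b):
    """Return t and df for both the pooled and the Welch form of the test."""
    n1, n2 = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    v1, v2 = statistics.variance(group_a), statistics.variance(group_b)

    # Equal variances assumed: pool the two sample variances.
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_pooled = mean_diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    df_pooled = n1 + n2 - 2

    # Equal variances not assumed: Welch's t with Satterthwaite df.
    se_sq = v1 / n1 + v2 / n2
    t_welch = mean_diff / math.sqrt(se_sq)
    df_welch = se_sq ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
    )

    return {"pooled": (t_pooled, df_pooled), "welch": (t_welch, df_welch)}


# Hypothetical distress scores for illustration only; NOT CHIS data.
under_40 = [8, 5, 11, 7, 9, 6]
forty_plus = [4, 6, 3, 5, 7, 2, 4]
result = independent_samples_t(under_40, forty_plus)
for name, (t, df) in result.items():
    print(f"{name}: t({df:.2f}) = {t:.3f}")
```

Which row to report depends on the equality-of-variances check in your own SPSS output.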
Question 2b: Assumption Checking for the independent samples t-test
Check and comment on the assumptions of the independent samples t-test produced in Q2a. Include the relevant output with your answer.
Question 3a: Walking for leisure and walking to get somewhere
The variable WalkLeisure gives an indication of the time each participant spent walking for leisure, and the variable WalkSomewhere gives an indication of the time each participant spent walking ‘to get somewhere’ in particular. Each of these variables was measured in minutes spent walking in the previous week.
A health researcher believes that there is a difference, on average, between the time spent walking for leisure and the time spent walking purposefully to get somewhere in particular. Conduct a paired samples t-test [related samples t-test] using the WalkLeisure and WalkSomewhere variables to test this prediction.
Produce the relevant output and write a paired samples t-test report based on your output in the style presented in Supplement I: Report Writing for Paired Samples t-Test.
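A paired samples t-test works on the difference within each pair, and is equivalent to a one-sample t-test of those differences against zero. The sketch below illustrates that calculation in Python. It is an illustration only and does not replace the SPSS output; the function name `paired_t` and the walking times are hypothetical and are not CHIS data.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, df, mean difference) for a paired samples t-test."""
    if len(first) != len(second):
        raise ValueError("Paired samples must have the same length.")
    # One difference per participant: first measurement minus second.
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / standard_error, n - 1, mean_diff


# Hypothetical minutes walked in the previous week; NOT CHIS data.
walk_leisure = [60, 30, 0, 120, 45, 90]
walk_somewhere = [20, 40, 10, 60, 30, 45]
t, df, mean_diff = paired_t(walk_leisure, walk_somewhere)
print(f"t({df}) = {t:.3f}, mean difference = {mean_diff:.2f} minutes")
```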
Question 3b: Assumption Checking for the paired samples t-test
Check and comment on the assumptions of the paired samples t-test produced in Q3a. Include the relevant output with your answer.
Question 4: [does not require SPSS]
A dietician wants to investigate ice-cream consumption, as this is her favourite ‘treat’ food. She has recently learnt that Americans consume, on average, 20.8 litres of ice-cream per person per year. She wondered whether Australians' consumption of ice-cream differs from that of Americans, so she accessed data from an Australian Health Survey collected in 2021 on the consumption of various food groups. A random sample of the data provided information on the ice-cream consumption [litres per year] of Australians.
- (a) What type of statistical test would be appropriate to investigate the dietician’s prediction?
- (b) What is the population we can draw conclusions about in this study?
- (c) The dietician predicted that, on average, the consumption of ice-cream [litres per year] by Australians differs from that of Americans. She conducted the appropriate hypothesis test and obtained a p-value of 0.143. Based on this result, the dietician concluded that Australians consume 20.8 litres of ice-cream per year. Is this conclusion valid or not?
- (d) Comment on the validity of this conclusion. Provide justification for your answer.
Prior to submitting your Assignment via Canvas, use the following checklist as a guide to ensure that all of the relevant information is provided.
Q1 – Should include [as appropriate]:
The one-sample t-test output, including any additional output produced to answer the question, and a report of the results of the one-sample t-test following the format of reports in Supplement G.
Q2 – Should include [as appropriate]:
The independent samples t-test output, including any additional output produced to answer the question, and a report of the results of the independent samples t-test following the format of reports in Supplement H.
Q3 – Should include [as appropriate]:
The paired samples t-test output, including all other output produced to answer the question, and a report of the results of the paired samples t-test following the format of reports in Supplement I.
Q4 – The answers should be presented with sections (a), (b), (c), and (d) clearly identified.
Checklist:
- Correct variable used to produce output [note that many of the variables have similar names so it is important to double-check that the correct variable has been used]
- Correct procedure performed
- Correct test values used
- All figures quoted in report correct according to your own output
- 95% confidence interval interpretations included
- Significance interpreted correctly (i.e. not suggesting that the finding is significant when it is not or vice versa)
- Correctly referring to the sample or population where / when appropriate
- Proofreading of reports for errors
STA10003 Assignment Part 2 Marking Rubric [out of 27]:
Criterion | 0 | 1 | 2 | 3 | 4 | 5 |
Q1 One-sample t-test [5 marks] | Incorrect procedure and/or no report, report covers no relevant/correct features of output. | Correct procedure, but incorrect variable or comparison and/or major errors in report. | Correct output. Report presented following format used in course materials. Report has no major errors, however not all components are included, and report has multiple minor errors. | Correct output. Report presented following format used in course materials. Report has no major errors, however not all components are included, or report has multiple minor errors. | Correct output. Report presented following format used in course materials. Report has only 1-2 minor errors. | Correct output. Report presented following format used in course materials. Report has no errors and is clearly and concisely written. |
Q2a Independent samples t-test [5 marks] | Incorrect procedure and/or no report, report covers no relevant/correct features of output. | Correct procedure, but incorrect variable or comparison and/or major errors in report. | Correct output. Report presented following format used in course materials. Report has no major errors, however not all components are included, and report has multiple minor errors. | Correct output. Report presented following format used in course materials. Report has no major errors, however not all components are included, or report has multiple minor errors. | Correct output. Report presented following format used in course materials. Report has only 1-2 minor errors. | Correct output. Report presented following format used in course materials. Report has no errors and is clearly and concisely written. |
Q2b Assumptions of the independent samples t-test [3 marks] | Incorrect answer or no answer. | Incomplete answer. Limited information is included. | Incomplete answer. Most information is included. | Complete answer. All information is included. | Not applicable | Not applicable |
Q3a Paired samples t-test [5 marks] | Incorrect procedure and/or no report, report covers no relevant/correct features of output. | Correct procedure, but incorrect variable or comparison and/or major errors in report. | Correct output. Report presented following format used in course materials. Report has no major errors, however not all components are included, and report has multiple minor errors. | Correct output. Report presented following format used in course materials. Report has no major errors, however not all components are included, or report has multiple minor errors. | Correct output. Report presented following format used in course materials. Report has only 1-2 minor errors. | Correct output. Report presented following format used in course materials. Report has no errors and is clearly and concisely written. |
Q3b Assumptions of the paired samples t-test [3 marks] | Incorrect answer or no answer. | Incomplete answer. Limited information is included. | Incomplete answer. Most information is included. | Complete answer. All information is included. | Not applicable | Not applicable |
Q4a [1 mark] | Incorrect answer | Correct answer | Not applicable | Not applicable | Not applicable | Not applicable |
Q4b [1 mark] | Incorrect answer or incomplete answer | Correct answer | Not applicable | Not applicable | Not applicable | Not applicable |
Q4c [1 mark] | Incorrect answer | Correct answer | Not applicable | Not applicable | Not applicable | Not applicable |
Q4d [3 marks] | Incorrect answer or incomplete answer | Correct answer. Limited justification. | Correct answer. Partial justification. | Correct answer. Full justification. | Not applicable | Not applicable |