class: center, middle, inverse, title-slide .title[ #
ANOVA I
] .subtitle[ ## Psychological Research Methods:
Data Management & Analysis
] .author[ ### Monica Truelove-Hill ] .institute[ ### Department of Clinical Psychology
The University of Edinburgh ]

---
### This Week's Key Topics

+ Types of ANOVAs and their assumptions
+ Interpreting and reporting the results of ANOVA
+ Computing and interpreting effect sizes for ANOVA
+ Conducting a power analysis for ANOVA

---
### Moving on from `\(t\)`-tests

+ `\(t\)`-tests allow for the comparison of two means...what if we need to compare more than two?

+ 'We can just run more `\(t\)`-tests'...

--

+ **NO.**

--

+ *Type I Errors:* rejecting the null when it is actually true.
+ `\(\alpha\)` reflects our Type I error rate, or the probability of rejecting the null given that it is true.

--

+ When we run a statistical test, we have a Type I error probability equal to `\(\alpha\)` (usually .05, or 5%). This probability applies within each test, not across ALL tests.
+ Running multiple tests when a single test would do inflates the risk of making a Type I error. For example, three pairwise `\(t\)`-tests at `\(\alpha\)` = .05 give a probability of at least one Type I error of `\(1 - .95^3 \approx .14\)`.

---
### Basic ANOVA requirements

+ ANOVAs can be used to compare more than two means at once
+ All ANOVAs require:
  + A continuous dependent variable
  + At least one categorical independent variable with three or more levels

---
### Types of ANOVA

+ **One-Way ANOVA** tests for differences between 3 or more independent means
+ **Factorial ANOVA** tests for differences between multiple means across multiple independent variables (AKA multiway ANOVA)
+ Both of these could also be a **Repeated-Measures ANOVA**, which tests for differences within a single sample

--

<br>

| IVs | Between-Subjects IV | Within-Subjects IV | Both Between & Within IV(s) |
|:---:|:-------------------------------:|:--------------------------------:|:---------------------------:|
| 1 | Between-Subjects One-Way ANOVA | Repeated-Measures One-Way ANOVA | N/A |
| 2+ | Between-Subjects Factorial ANOVA| Repeated-Measures Factorial ANOVA| Mixed Factorial ANOVA |

---
### One Way ANOVAs

.pull-left[
.center[**Between-Subjects One-Way ANOVA**]

+ Compares the mean of the DV across multiple levels of a single IV
+ Each level
of the IV is collected from a separate sample
+ Results indicate whether there is a difference between at least one pair of levels
]

---
count: false
### One Way ANOVAs

.pull-left[
.center[**Between-Subjects One-Way ANOVA**]

+ Compares the mean of the DV across multiple levels of a single IV
+ Each level of the IV is collected from a separate sample
+ Results indicate whether there is a difference between at least one pair of levels

<br>
<p style="color:#9AD079;"><b>Example research question:</b></p>

When revising content, is it most effective to read the notes, rewrite the notes, or test yourself on the material?
]

--

.pull-right[
.center[**Repeated-Measures One-Way ANOVA**]

+ Compares the mean of the DV across multiple levels of a single IV
+ Each member of the sample provides data for *all* levels of the independent variable
+ Results indicate whether there is a difference between at least one pair of levels, but these levels are measured within the same sample
]

---
count: false
### One Way ANOVAs

.pull-left[
.center[**Between-Subjects One-Way ANOVA**]

+ Compares the mean of the DV across multiple levels of a single IV
+ Each level of the IV is collected from a separate sample
+ Results indicate whether there is a difference between at least one pair of levels

<br>
<p style="color:#9AD079;"><b>Example research question:</b></p>

When revising content, is it most effective to read the notes, rewrite the notes, or test yourself on the material?
]

.pull-right[
.center[**Repeated-Measures One-Way ANOVA**]

+ Compares the mean of the DV across multiple levels of a single IV
+ Each member of the sample provides data for *all* levels of the independent variable
+ Results indicate whether there is a difference between at least one pair of levels, but these levels are measured within the same sample

<p style="color:#9AD079;"><b>Example research question:</b></p>

Are there differences in the number of cigarettes one smokes per day at baseline, midway through, and after completion of a smoking cessation programme?
]

---
class: center, middle, inverse
### Questions?

---
### ANOVA

+ Running an ANOVA involves:
  + Computing `\(F\)`
  + Calculating the probability of obtaining our value of `\(F\)` if the null were true
  + Using this probability to decide whether to reject the null hypothesis

---
### The Logic Behind ANOVA

.pull-left[
+ When no further information is available, the best reflection of the DV as a whole is the mean.
]

.pull-right.center[**Base Mean Model**

<!-- -->
]

---
count: false
### The Logic Behind ANOVA

.pull-left[
+ When no further information is available, the best reflection of the DV as a whole is the mean.
+ The mean represents some individuals well, but not others.
]

.pull-right.center[**Base Mean Model**

<!-- -->
]

---
### The Logic Behind ANOVA

.pull-left[
+ When no further information is available, the best reflection of the DV as a whole is the mean.
+ The mean represents some individuals well, but not others.
+ Researchers often want to know whether another variable will help them more accurately capture variation in the DV
]

.pull-right.center[**Base Mean Model**

<!-- -->

**Model with IV**

<!-- -->
]

---
### The Logic Behind ANOVA

.pull-left[
+ When no further information is available, the best reflection of the DV as a whole is the mean.
+ The mean represents some individuals well, but not others.
+ Researchers often want to know whether another variable will help them more accurately capture variation in the DV
+ ANOVAs test whether the IV does a significantly better job than the mean at modelling variance in the DV
]

.pull-right.center[**Base Mean Model**

<!-- -->

**Model with IV**

<!-- -->
]

---
### The Logic Behind ANOVA

`$$F = \frac{MS_{model}}{MS_{residual}}$$`

+ ANOVAs produce an `\(F\)`-statistic that is the ratio of the variation in the DV explained by our IV(s) to the variation explained by other, unmeasured factors.
+ `\(MS_{model}\)` - reflects the variation in the DV explained by the IV
  + AKA `\(MS_{between}\)` or `\(MS_{treatment}\)`
+ `\(MS_{residual}\)` - reflects the variation in the DV unexplained by the IV
  + AKA `\(MS_{within}\)` or `\(MS_{error}\)`

---
### The Logic Behind ANOVA

+ To know how much variation in the DV can be explained by the IV (`\(MS_{model}\)`), we need to understand how much better our IV model does compared to the mean model.

--

+ This can be captured through the model sum of squares, `\(SS_{model}\)`:

.pull-left[
<!-- -->
]

--

.pull-right[
+ Difference between the grand mean and the mean of each IV group
+ Reflects the variance in the DV explained by the IV over and above the mean
]

---
### The Logic Behind ANOVA

+ It's also important to understand how accurately the IV captures individual participant data (`\(MS_{residual}\)`).

--

+ This is captured through the residual sum of squares, `\(SS_{residual}\)`:

--

.pull-left[
<!-- -->
]

.pull-right[
+ Difference between the IV group means and the observed data
+ Reflects the amount of variation in the DV not explained by the IV
]

---
### The Logic Behind ANOVA

+ Because results may be biased depending on the number of observations used to calculate them, the `\(SS\)` values must be adjusted by sample size and the number of IVs in the model.
+ This standardisation is done by dividing `\(SS\)` by the degrees of freedom:
  + `\(MS_{model} = \frac{SS_{model}}{df_{model}}\)`
  + `\(MS_{residual} = \frac{SS_{residual}}{df_{residual}}\)`

+ To summarise, the `\(F\)` statistic of an ANOVA is the standardised ratio of the variance in the DV explained by the IV to the variance in the DV unexplained by the IV.
+ In other words, the ratio of what our IV tells us about the DV to what it doesn't.

---
class: inverse, middle, center
### Questions?

---
### Calculating `\(F\)`

.pull-left[
`$$F = \frac{MS_{model}}{MS_{residual}}$$`
]

---
count: false
### Calculating `\(F\)`

.pull-left[
`$$F = \frac{\color{#4CA384}{MS_{model}}}{MS_{residual}}$$`

`\(MS_{model} = \frac{\color{#4CA384}{SS_{model}}}{df_{model}}\)`

+ Compute the distance between the grand mean and each observation's group mean
]

.pull-right[
<!-- -->
]

---
count: false
### Calculating `\(F\)`

.pull-left[
`$$F = \frac{\color{#4CA384}{MS_{model}}}{MS_{residual}}$$`

`\(MS_{model} = \frac{\color{#4CA384}{SS_{model}}}{df_{model}}\)`

+ Compute the distance between the grand mean and each observation's group mean
+ Square them
]

.pull-right[
<!-- -->
]

---
count: false
### Calculating `\(F\)`

.pull-left[
`$$F = \frac{\color{#4CA384}{MS_{model}}}{MS_{residual}}$$`

`\(MS_{model} = \frac{\color{#4CA384}{114.72}}{df_{model}}\)`

+ Compute the distance between the grand mean and each observation's group mean
+ Square them
+ Sum them up
]

.pull-right[
<!-- -->
]

---
count: false
### Calculating `\(F\)`

.pull-left[
`$$F = \frac{\color{#4CA384}{MS_{model}}}{MS_{residual}}$$`

`\(MS_{model} = \frac{114.72}{\color{#4CA384}{2}}\)`

+ Compute the distance between the grand mean and each observation's group mean
+ Square them
+ Sum them up
+ `\(df_{model}\)` = number of levels - 1
]

.pull-right[
<!-- -->
]

---
count: false
### Calculating `\(F\)`

.pull-left[
`$$F = \frac{\color{#4CA384}{57.36}}{MS_{residual}}$$`

`\(MS_{model} = \frac{114.72}{2}\)`

+ Compute the distance between the grand mean and each observation's group mean
+ Square them
+ Sum them up
+ `\(df_{model}\)` = number of levels - 1
]

.pull-right[
<!-- -->
]

---
### Calculating `\(F\)`

.pull-left[
`$$F = \frac{57.36}{\color{#4CA384}{MS_{residual}}}$$`

`\(MS_{residual} = \frac{\color{#4CA384}{SS_{residual}}}{df_{residual}}\)`

+ Compute the distance between each observation and its group mean
]

.pull-right[
<!-- -->
]

---
count: false
### Calculating `\(F\)`

.pull-left[
`$$F = \frac{57.36}{\color{#4CA384}{MS_{residual}}}$$`

`\(MS_{residual} = \frac{\color{#4CA384}{SS_{residual}}}{df_{residual}}\)`

+ Compute the distance between each observation and its group mean
+ Square them
]

.pull-right[
<!-- -->
]

---
count: false
### Calculating `\(F\)`

.pull-left[
`$$F = \frac{57.36}{\color{#4CA384}{MS_{residual}}}$$`

`\(MS_{residual} = \frac{\color{#4CA384}{198.61}}{df_{residual}}\)`

+ Compute the distance between each observation and its group mean
+ Square them
+ Sum them up
]

.pull-right[
<!-- -->
]

---
count: false
### Calculating `\(F\)`

.pull-left[
`$$F = \frac{57.36}{\color{#4CA384}{MS_{residual}}}$$`

`\(MS_{residual} = \frac{198.61}{\color{#4CA384}{9}}\)`

+ Compute the distance between each observation and its group mean
+ Square them
+ Sum them up
+ `\(df_{residual}\)` = number of observations - number of levels of the independent variable
]

.pull-right[
<!-- -->
]

---
count: false
### Calculating `\(F\)`

.pull-left[
`$$F = \frac{57.36}{\color{#4CA384}{22.07}}$$`

`\(MS_{residual} = \frac{198.61}{9}\)`

+ Compute the distance between each observation and its group mean
+ Square them
+ Sum them up
+ `\(df_{residual}\)` = number of observations - number of levels of the independent variable
]

.pull-right[
<!-- -->
]

---
### Calculating `\(F\)`

.pull-left[
`$$F = \frac{57.36}{22.07} = 2.6$$`

`\(MS_{residual} = \frac{198.61}{9}\)`

+ Compute the distance between each observation and its group mean
+ Square them
+ Sum them up
+ `\(df_{residual}\)` = number of observations - number of levels of the independent variable
]

.pull-right[
<!-- -->
]

---
### The `\(F\)` distribution

.pull-left[
+ The `\(F\)`-distribution is the null distribution against which our calculated `\(F\)`-statistic is compared.
+ Like the `\(t\)` distribution, the shape of the `\(F\)`-distribution depends on the degrees of freedom in our data.
+ Values near 0 or 1 are most common in the null distribution, with values farther from 1 increasingly less likely.
+ Note that `\(F\)` will never be less than 0.
]

.pull-right[
<!-- -->
]

---
### Putting it all together

.pull-left[
Basically:

1) Calculate your `\(F\)`-statistic

2) Identify the proper null distribution using `\(df_{model}\)` and `\(df_{residual}\)`

3) Using this distribution, compute the probability of getting an `\(F\)`-statistic at least as extreme as yours
]

.pull-right[
<!-- -->
]

---
### Posthoc Tests

+ After determining the presence of an overall effect, you can conduct posthoc tests to look for specific effects between groups.
+ Involves running separate tests on each possible comparison

--

+ But what about inflated Type I error risk??

--

+ First, posthoc tests are only evaluated if an overall effect is detected.
+ Second, `\(p\)`-values may be adjusted so that the overall Type I error rate across all tests is still equal to the `\(\alpha\)` threshold

---
class: center, inverse, middle
### Questions?

---
### Conducting an ANOVA

1. State your hypotheses
2. Conduct a power analysis
3. Check your data (visualisations/descriptives)
4. Check assumptions
5. Run the test
6. Calculate the effect size/confidence intervals
7. Interpret results
8. Report

---
### State Your Hypotheses

+ `\(H_0: \mu_1 = \mu_2 = ... = \mu_n\)`
+ `\(H_1:\)` at least one `\(\mu\)` is different from the other `\(\mu\)`s

+ Note that the ANOVA distribution only has one tail, due to the nature of the `\(F\)`-statistic calculation (a ratio of squared quantities cannot be less than 0)
+ However, `\(H_1\)` is still nondirectional
  + ANOVA doesn't tell you the direction of group differences, just that (at least) one exists.
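---
### Putting it all together: a worked check

The worked example above can be verified in code. This is a hedged sketch, not part of the course's SPSS workflow; it assumes Python with scipy available, and reuses the `\(SS\)` and `\(df\)` values from the slides.

```python
# Reproduce the worked example: F = MS_model / MS_residual,
# then find the probability of an F at least this extreme under the null.
# Values are taken from the slides; scipy is an assumed dependency.
from scipy import stats

ss_model, df_model = 114.72, 2      # model SS; df = levels - 1
ss_resid, df_resid = 198.61, 9      # residual SS; df = N - levels

ms_model = ss_model / df_model      # 57.36
ms_resid = ss_resid / df_resid      # ~22.07
f_stat = ms_model / ms_resid        # ~2.60

# One tail: area above f_stat in the F(2, 9) null distribution
p_value = stats.f.sf(f_stat, df_model, df_resid)
print(f"F({df_model}, {df_resid}) = {f_stat:.2f}, p = {p_value:.3f}")
```

With these values `\(p > .05\)`, so this particular example would not lead us to reject the null.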
---
### State Your Hypotheses

.pull-left[
.center[**ANOVA results not significant**]

<!-- -->
]

.pull-right[
.center[**ANOVA results significant**]

<!-- --><!-- -->
]

---
### Assumptions of ANOVA

+ **Normality:** DV should be normally distributed *within groups*

--

+ **Independence:** Observations/individuals should be sampled independently

--

+ **Homogeneity of Variance:** Equal variance across groups/categories
  + This is called sphericity in the case of repeated-measures ANOVA, but it's the same general idea.

---
### Effect size - `\(\eta^2\)`

`$$\eta^2 = \frac{SS_{model}}{SS_{total}}$$`

+ `\(\eta^2\)` is the proportion of variance in the dependent variable that is explained by the independent variable
  + Values range from 0 to 1
+ When you have multiple independent variables, you instead use `\(\eta^2_p\)`, which is the proportion of variance in the dependent variable that is explained by the independent variable *while controlling for the other variables in the model*.

.pull-left.center[
`\(SS_{model}\)`

<!-- -->
]

.pull-right.center[
`\(SS_{total}\)`

<!-- -->
]

---
### Interpretation of `\(\eta^2\)`

| Strength | Magnitude of `\(\eta^2\)` |
|:--------:|:-------------------------:|
| Weak     | `\(\leq\)` .01            |
| Moderate | `\(\approx\)` .06         |
| Strong   | `\(\geq\)` .14            |

---
class: middle, inverse, center
### Questions?

---
### Running a One-way ANOVA

**Step 1: State Your Hypotheses**

**Independent Variable:** Auditory Stimulus (3 levels: Infant-directed Speech; Adult-directed Speech; Infant-directed Singing)

**Dependent Variable:** Time until distressed

> **Test Your Understanding:** What will your alternative hypothesis be? What will your null hypothesis be?
--

.pull-left[.center[<span style = "color: #18778C"> Null Hypothesis </span>

`$$\mu_{IDspeech} = \mu_{ADspeech} = \mu_{IDsinging}$$`
]]

.pull-right[.center[<span style = "color: #18778C"> Alternative Hypothesis </span>

At least one `\(\mu \neq\)` the other `\(\mu\)`s
]]

---
### Running a One-way ANOVA

**Step 2: Conduct a Power Analysis**

+ [WebPower](https://webpower.psychstat.org/wiki/models/index)
+ Let's check the sample size required to achieve 80% power to detect a moderate effect ( `\(\eta^2\)` = .06) with `\(\alpha\)` = .05.
+ Although we use `\(\eta^2\)`, WebPower requires the effect size to be entered as `\(f\)`, so you'll need to convert it using the following formula:

`$$f = \sqrt{\frac{\eta^2}{1-\eta^2}} = \sqrt{\frac{.06}{1-.06}} \approx .25$$`

---
### Running a One-way ANOVA

**Step 3: Check your data**

+ Compute descriptive statistics
+ Look at relevant plots

--

+ Let's do this in SPSS using [these data](https://mtruelovehill.github.io/PRM/Data/babyDat.sav).

---
### Running a One-way ANOVA

**Step 4: Check Assumptions**

+ Normality
  + Have a look at the histograms & QQ-plots
+ Independence of Observations
  + Consider study design
+ Homogeneity of Variance
  + Check the residuals vs fitted values plot

---
### Running a One-way ANOVA

**Step 5: Run the test**

**Step 6: Calculate the effect size**

**Step 7: Interpret results**

Let's continue in SPSS...

---
### Running a One-way ANOVA

**Step 8: Report**

+ `\(\alpha\)` threshold
+ Type of test conducted
+ Variables tested
+ Descriptive data
+ Test results:
  + Test statistic ( `\(F\)` )
  + Degrees of freedom
  + `\(p\)`-value
+ Effect sizes and/or confidence intervals
+ Brief interpretation (NO DISCUSSION)

---
### Running a One-way ANOVA

**Step 8: Report**

We conducted a **One-Way ANOVA** to determine the effect of <span style = "color:#9AD079"><b>auditory stimulus</b></span> on <span style = "color:#9AD079"><b>an infant's affect</b></span>. The `\(\alpha\)` threshold was set at .05 for all analyses.
There was a significant effect of stimulus, <span style = "color:#18778C"><b>`\(F\)`(2, 57) = 5.99, `\(p\)` = .004, `\(\eta^2\)` = .174, 95% CI = [.02, .33]</b></span>. Posthoc Tukey's tests indicated that infants listening to infant-directed singing displayed a positive or neutral affect significantly longer <span style = "color:#19424C"><b>(`\(M\)` = 6.32, `\(SD\)` = 2.27)</b></span> than infants listening to adult-directed speech <span style = "color:#19424C"><b>(`\(M\)` = 4.36, `\(SD\)` = 1.55)</b></span>, `\(p\)` = .005, or infants listening to infant-directed speech <span style = "color:#19424C"><b>(`\(M\)` = 4.78, `\(SD\)` = 1.77)</b></span>, `\(p\)` = .033. There was no significant difference in time until distress between infants listening to adult-directed speech and those listening to infant-directed speech, `\(p\)` = .762.

---
### Running a One-way ANOVA

**Step 8: Report**

.pull-left[
+ Figures are useful in helping readers visualise your results.
+ A **boxplot** is especially good for demonstrating results when you have a continuous DV and a categorical IV
]

.pull-right[
<!-- -->
]

---
class: center, inverse, middle
### Questions?

---
### Running a Repeated-measures ANOVA

**Step 1: State Your Hypotheses**

Does the presence of a family member have an effect on an infant's affect?

.pull-left[.center[<span style = "color: #18778C"> Null Hypothesis </span>

`$$\mu_{parentAbsent} = \mu_{siblingPresent} = \mu_{parentPresent}$$`
]]

.pull-right[.center[<span style = "color: #18778C"> Alternative Hypothesis </span>

At least one `\(\mu \neq\)` the other `\(\mu\)`s
]]

---
### Running a Repeated-measures ANOVA

**Step 2: Conduct a Power Analysis**

+ [WebPower](https://webpower.psychstat.org/wiki/models/index)
+ As before, let's check the sample size required to achieve 80% power to detect a moderate effect ( `\(\eta^2\)` = .06) with `\(\alpha\)` = .05.
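---
### Running a Repeated-measures ANOVA

**Step 2 (continued): Converting the effect size**

As with the one-way example, WebPower takes Cohen's `\(f\)` rather than `\(\eta^2\)`. A quick sketch of the conversion in Python (the helper function name is ours, not WebPower's):

```python
# Convert eta-squared to Cohen's f for entry into WebPower:
# f = sqrt(eta^2 / (1 - eta^2))
import math

def eta2_to_f(eta2):
    """Cohen's f from eta-squared (illustrative helper)."""
    return math.sqrt(eta2 / (1 - eta2))

f = eta2_to_f(0.06)   # moderate effect, as on the slides
print(round(f, 2))    # 0.25 -- the value entered into WebPower
```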
--- ### Running a Repeated-measures ANOVA **Step 3: Check your data** + We can continue using the data from the previous example. --- ### Running a Repeated-measures ANOVA **Step 4: Check Assumptions** .pull-left[ + Normality of difference scores + Have a look at the histograms & QQ-plots + Check skewness and kurtosis values + May also run statistical tests of normality...but this is not recommended + Independence (of participants rather than observations) + Consider study design + Sphericity + Check Mauchly's test ] .pull-right[ <img src="https://mtruelovehill.github.io/PRM/Labs/images/week5_sphericity.png" width="75%" /> ] --- ### Running a Repeated-measures ANOVA **Step 5: Run the test** **Step 6: Calculate the effect size/confidence intervals** **Step 7: Interpret results** Again, let's continue in SPSS... --- ### Running a Repeated-measures ANOVA **Step 8: Report** We conducted a **Repeated-measures ANOVA** to determine the effect of <span style = "color:#9AD079"> <b> familial presence </span></b> on <span style = "color:#9AD079"> <b>infant affect </span></b>. There was a significant effect of familial presence, <span style = "color:#18778C"><b> `\(F\)`(1.83, 107.99) = 42.40, `\(p\)` < .001, `\(\eta^2_p\)` = 0.42 </span></b>. Mean comparisons with Bonferroni corrections indicated that infants became distressed more quickly when no family members were present <span style = "color:#19424C"><b> ( `\(M\)` = 3.85, `\(SD\)` = 2.05)</span></b> than when a sibling was present <span style = "color:#19424C"><b>( `\(M\)` = 5.04, `\(SD\)` = 1.79)</span></b>, `\(p\)` < .001, and when a parent was present <span style = "color:#19424C"><b>( `\(M\)` = 7.03, `\(SD\)` = 2.05)</span></b>, `\(p\)` < .001. Infants retained a positive or neutral affect for longer when a parent was present than when a sibling was present, `\(p\)` < .001. --- class: center, inverse, middle ## Questions?
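---
### Appendix: A One-way ANOVA outside SPSS

For readers working outside SPSS, Steps 5 and 6 of the one-way example can be sketched in Python. The data below are made up for illustration (they are NOT the babyDat.sav values), and scipy is an assumed dependency.

```python
# Between-subjects one-way ANOVA (Step 5) and eta-squared (Step 6)
# on hypothetical time-until-distressed scores for three stimuli.
from scipy import stats

id_speech  = [4.1, 5.0, 4.6, 5.2, 4.9]   # infant-directed speech (fabricated)
ad_speech  = [3.9, 4.2, 4.5, 4.0, 4.4]   # adult-directed speech (fabricated)
id_singing = [6.0, 6.8, 6.2, 7.1, 6.5]   # infant-directed singing (fabricated)

# Step 5: omnibus test
f_stat, p_value = stats.f_oneway(id_speech, ad_speech, id_singing)

# Step 6: eta^2 = SS_model / SS_total
groups = [id_speech, ad_speech, id_singing]
all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)
ss_total = sum((x - grand_mean) ** 2 for x in all_obs)
ss_model = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
eta_sq = ss_model / ss_total

print(f"F(2, 12) = {f_stat:.2f}, p = {p_value:.4f}, eta^2 = {eta_sq:.2f}")
```

A significant omnibus `\(F\)` here would still need posthoc tests to say *which* pairs of stimuli differ.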