Sampling and sampling distributions

Descriptive statistics: collecting, presenting, and describing data.
Inferential statistics: drawing conclusions and/or making decisions concerning a population based only on sample data.

Inferential statistics means making statements about a population by examining sample results. Sample statistics are known; population parameters are unknown, but can be estimated from sample evidence. Inference takes two forms:
• Estimation: e.g., estimate the population mean weight using the sample mean weight
• Hypothesis testing: e.g., use sample evidence to test the claim that the population mean weight is 120

Sampling from a population
A population is the set of all items or individuals of interest. E.g., all likely voters in the next election; all parts produced today; all sales receipts for November.
A sample is a subset of the population. E.g., 1000 voters selected at random for interview; a few parts selected for destructive testing; random receipts selected for audit.

Why sample?
• Less time consuming than a census
• Less costly to administer than a census
• It is possible to obtain statistical results of sufficiently high precision based on samples

Simple random sample
• Every object in the population has the same probability of being selected
• Objects are selected independently
• Samples can be obtained from a table of random numbers or computer random number generators
• A simple random sample is the ideal against which other sampling methods are compared

Sampling distributions
A sampling distribution is a probability distribution of all of the possible values of a statistic for a given size sample selected from a population.

Developing a sampling distribution
Assume there is a population ...
Population size N = 4. The random variable X is the age of individuals; values of X: 18, 20, 22, 24 (years). In this example the population distribution is uniform.

Let X1, X2, ..., Xn represent a random sample from a population. The sample mean of these observations is defined as xbar = (X1 + X2 + ... + Xn)/n.

Standard error of the mean
Different samples of the same size from the same population will yield different sample means. A measure of the variability in the mean from sample to sample is given by the standard error of the mean: σ_xbar = σ/√n. Note that the standard error of the mean decreases as the sample size increases.

Comparing the population with its sampling distribution
Population: N = 4, μ = 21, σ = 2.236
Sampling distribution of the mean (n = 2): μ_xbar = 21, σ_xbar = 1.58

If sample values are not independent
If the sample size n is not a small fraction of the population size N, then individual sample members are not distributed independently of one another; observations are not selected independently. A finite population correction is made to account for this: Var(xbar) = (σ²/n) · (N − n)/(N − 1). The term (N − n)/(N − 1) is often called the finite population correction factor.

If the population is normal
If a population is normal with mean μ and standard deviation σ, the sampling distribution of xbar is also normally distributed, with E(xbar) = μ and σ_xbar = σ/√n. If the sample size n is not small relative to the population size N, then E(xbar) = μ and σ_xbar = (σ/√n) · √((N − n)/(N − 1)).

Standard normal distribution for the sample means
z-value for the sampling distribution of xbar: Z = (xbar − μ)/σ_xbar, where xbar is the sample mean, μ is the population mean, and σ_xbar is the standard error of the mean. Z is a standardized normal random variable with mean 0 and variance 1.

Sampling distribution properties
• A normal population distribution gives a normal sampling distribution, and both distributions have the same mean, μ.
• As n increases, σ_xbar decreases: a larger sample size gives a more concentrated sampling distribution than a smaller one.

Central limit theorem
Even if the population is not normal ...
sample means from the population will be approximately normal as long as the sample size is large enough.

Let X1, X2, ..., Xn be a set of n independent random variables having identical distributions with mean μ and variance σ², and let xbar be the mean of these random variables. As n becomes large, the central limit theorem states that the distribution of Z = (xbar − μ)/(σ/√n) approaches the standard normal distribution.

As the sample size gets large enough, the sampling distribution becomes almost normal regardless of the shape of the population.

Sampling distribution properties (population not normal)
• Central tendency: the sampling distribution is centered at μ
• Variation: σ_xbar shrinks as the sample size grows, and the sampling distribution becomes normal as n increases

How large is large enough?
For most distributions, n > 25 will give a sampling distribution that is nearly normal. For normal population distributions, the sampling distribution of the mean is always normally distributed.

Acceptance intervals
Goal: determine a range within which sample means are likely to occur, given the population mean and variance. By the central limit theorem, we know that the distribution of xbar is approximately normal if n is large enough, with mean μ and standard deviation σ_xbar. Let z_a/2 be the z-value that leaves area a/2 in the upper tail of the normal distribution (i.e., the interval −z_a/2 to z_a/2 encloses probability 1 − a). Then μ ± z_a/2 · σ_xbar is the interval that includes xbar with probability 1 − a.

Sampling distribution of the proportion
P = the proportion of the population having some characteristic. The sample proportion phat provides an estimate of P:
phat = x/n = (number of items in the sample having the characteristic of interest)/(sample size), with 0 ≤ phat ≤ 1.
phat has a binomial distribution, but can be approximated by a normal distribution when nP(1 − P) > 5.

Normal approximation properties: E(phat) = P and Var(phat) = P(1 − P)/n.

z-value for proportions
Standardize phat to a z-value with the formula Z = (phat − P)/√(P(1 − P)/n), where the distribution of Z is a good approximation to the standard normal distribution if nP(1 − P) > 5.
Sample variance
Let X1, X2, ..., Xn be a random sample from a population. The sample variance is s² = Σ(Xi − xbar)²/(n − 1). The square root of the sample variance is called the sample standard deviation. The sample variance is different for different random samples from the same population.

Sampling distribution of the sample variance
The sampling distribution of s² has mean σ²: E(s²) = σ².

Chi-square distribution of sample and population variances
If the population distribution is normal, then (n − 1)s²/σ² has a chi-square (χ²) distribution with (n − 1) degrees of freedom. The chi-square distribution is a family of distributions, depending on the degrees of freedom (df); tables give the value of χ² for a chosen df and tail probability.

Point estimators
Population parameter → point estimate: mean μ → xbar; proportion P → phat.

Unbiasedness
A point estimator θhat is said to be an unbiased estimator of the parameter θ if its expected value is equal to that parameter: E(θhat) = θ.

Bias
Let θhat be an estimator of θ. The bias in θhat is defined as the difference between its mean and θ: Bias(θhat) = E(θhat) − θ. The bias of an unbiased estimator is 0.

Efficiency
The most efficient estimator, or minimum variance unbiased estimator, of θ is the unbiased estimator with the smallest variance. Let θhat1 and θhat2 be two unbiased estimators of θ, based on the same number of sample observations. Then θhat1 is said to be more efficient than θhat2 if Var(θhat1) < Var(θhat2). The relative efficiency of θhat1 with respect to θhat2 is the ratio of their variances: relative efficiency = Var(θhat2)/Var(θhat1).

Confidence intervals
An interval gives a range of values. It:
• Takes into consideration variation in sample statistics from sample to sample
• Is based on observations from one sample
• Gives information about closeness to unknown population parameters
• Is stated in terms of a level of confidence
• Can never be 100% confident

Confidence interval estimate
If P(a < θ < b) = 1 − α, then the interval from a to b is called a 100(1 − α)% confidence interval for θ.
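The unbiasedness claims above (E(xbar) = μ and E(s²) = σ²) can be checked exactly for the tiny N = 4 age population from the earlier example, by enumerating every possible sample of size n = 2 drawn with replacement. A minimal sketch in Python (the population values come from the example; everything else is illustrative):

```python
from itertools import product
from math import sqrt

population = [18, 20, 22, 24]          # ages from the N = 4 example
N = len(population)
mu = sum(population) / N               # population mean = 21
sigma2 = sum((x - mu) ** 2 for x in population) / N   # population variance = 5

# Enumerate all 16 possible samples of size n = 2, drawn with replacement
samples = list(product(population, repeat=2))
xbars = [(a + b) / 2 for a, b in samples]
s2s = [(a - b) ** 2 / 2 for a, b in samples]   # sample variance, n - 1 divisor

mean_xbar = sum(xbars) / len(xbars)    # average of all possible sample means
se_xbar = sqrt(sum((m - mu) ** 2 for m in xbars) / len(xbars))
mean_s2 = sum(s2s) / len(s2s)          # average of all possible sample variances

print(mean_xbar)             # 21.0  (E(xbar) = mu)
print(round(se_xbar, 2))     # 1.58  (= sigma / sqrt(2), the standard error)
print(mean_s2)               # 5.0   (E(s^2) = sigma^2)
```

Because the enumeration is exhaustive rather than simulated, the estimator means match the parameters exactly, which is precisely what unbiasedness asserts; the 1.58 standard error also reproduces the σ_xbar value quoted in the earlier slide.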
The quantity 100(1 − α)% is called the confidence level of the interval; α is between 0 and 1. In repeated samples of the population, the true value of the parameter θ would be contained in 100(1 − α)% of intervals calculated this way. The confidence interval calculated in this manner is written as a < θ < b with 100(1 − α)% confidence.

Confidence interval and confidence level
Suppose the confidence level is 95%, also written (1 − α) = 0.95. A relative frequency interpretation: from repeated samples, 95% of all the confidence intervals of size n that can be constructed will contain the unknown true parameter. A specific interval either will or will not contain the true parameter; no probability is involved in a specific interval.

General formula
The general form for all confidence intervals is: point estimate ± margin of error. The value of the margin of error depends on the desired level of confidence.

CI for the mean (variance known)
Assumptions: the population variance is known and the population is normally distributed; if not normal, use a large sample. The interval is xbar ± z_a/2 · σ/√n, where z_a/2 is the normal distribution value for a probability of a/2 in each tail, and the margin of error is ME = z_a/2 · σ/√n.

Reducing the margin of error
The margin of error can be reduced if:
• The population standard deviation can be reduced
• The sample size is increased
• The confidence level is decreased

CI for the mean (variance unknown)
Consider a random sample of n observations with mean xbar and standard deviation s from a normally distributed population with mean μ. Then t = (xbar − μ)/(s/√n) follows the Student's t distribution
with (n − 1) degrees of freedom.
t-distributions are bell-shaped and symmetric, but have 'fatter' tails than the normal. If the population standard deviation σ is unknown, we can substitute the sample standard deviation, s. This introduces extra uncertainty, since s varies from sample to sample, so we use the t distribution instead of the normal distribution.

CI formula: xbar ± t_{n−1, a/2} · s/√n.

CI for the population proportion
An interval estimate for the population proportion P can be calculated by adding an allowance for uncertainty to the sample proportion phat: phat ± z_a/2 · √(phat(1 − phat)/n), where z_a/2 is the standard normal value for the level of confidence desired, phat is the sample proportion, and n is the sample size. Required: nP(1 − P) > 5.

Confidence interval estimation for the variance
Goal: form a confidence interval for the population variance, σ². The confidence interval is based on the sample variance, s², and it is assumed that the population is normally distributed. The random variable (n − 1)s²/σ² follows a chi-square distribution with (n − 1) degrees of freedom, where the chi-square value χ²_{n−1, a} denotes the number for which P(χ²_{n−1} > χ²_{n−1, a}) = a. The confidence interval is (n − 1)s²/χ²_{n−1, a/2} < σ² < (n − 1)s²/χ²_{n−1, 1−a/2}.

Sample size determination: mean
To determine the required sample size for the mean, you must know:
• The desired level of confidence (1 − α), which determines the z_a/2 value
• The acceptable margin of error (sampling error), ME
• The population standard deviation, σ
Then n = (z_a/2 · σ/ME)².

Sample size determination: population proportion
The sample and population proportions, phat and P, are generally not known (since no sample has been taken yet). Using P(1 − P) = 0.25 generates the largest possible margin of error, and so guarantees that the resulting sample size will meet the desired level of confidence. To determine the required sample size for the proportion, you must know:
• The desired level of confidence (1 − α), which determines the critical z_a/2 value
• The acceptable sampling error (margin of error), ME
• The estimate P(1 − P) = 0.25
Then n = 0.25 (z_a/2 / ME)².

Dependent samples
Confidence interval estimation of the difference between two normal population means: dependent samples. Tests means of 2 related populations:
• Paired or matched samples
• Repeated measures (before/after)
Use the difference between paired values, di = xi − yi; this eliminates variation among subjects.
Assumptions: both populations are normally distributed.

(Topic map: sampling distributions; CI, one sample; CI, two samples; hypothesis tests, one sample; hypothesis tests, two samples; ANOVA; chi-squared.)

Mean difference (dependent samples)
The ith paired difference is di = xi − yi. The point estimate for the population mean paired difference is dbar = Σ di / n, and the sample standard deviation is sd = √(Σ(di − dbar)²/(n − 1)), where n is the number of matched pairs in the sample.

Confidence interval for the mean difference
The confidence interval for the difference between the two population means, μd, is dbar ± t_{n−1, a/2} · sd/√n, where t_{n−1, a/2} is the value from the Student's t distribution with (n − 1) degrees of freedom for which P(t_{n−1} > t_{n−1, a/2}) = a/2. When the confidence interval contains zero, we cannot be confident that the population means differ.

Difference between two means: independent samples
Goal: form a confidence interval for the difference between two population means, μx − μy.
Different data sources: unrelated and independent; the sample selected from one population has no effect on the sample selected from the other population. The point estimate is the difference between the two sample means, xbar − ybar.

Both variances known
Assumptions: samples are randomly and independently drawn; both population distributions are normal; population variances are known. When σx² and σy² are known and both populations are normal, the variance of xbar − ybar is σx²/nx + σy²/ny, and the random variable Z = ((xbar − ybar) − (μx − μy))/√(σx²/nx + σy²/ny) has a standard normal distribution. The confidence interval is (xbar − ybar) ± z_a/2 · √(σx²/nx + σy²/ny).

Both variances unknown, assumed equal
Assumptions: samples are randomly and independently drawn; populations are normally distributed; population variances are unknown but assumed equal. Forming interval estimates: the population variances are assumed equal, so use the two sample standard deviations and pool them to estimate σ; use a t value with (nx + ny − 2)
degrees of freedom. The pooled variance is sp² = ((nx − 1)sx² + (ny − 1)sy²)/(nx + ny − 2), and the confidence interval is (xbar − ybar) ± t_{nx+ny−2, a/2} · sp · √(1/nx + 1/ny).

Variances unknown, assumed unequal
Assumptions: samples are randomly and independently drawn; populations are normally distributed; population variances are unknown and assumed unequal. Forming interval estimates: the population variances are assumed unequal, so a pooled variance is not appropriate; use a t value with ν degrees of freedom, where ν is given by the Satterthwaite approximation, ν = (sx²/nx + sy²/ny)² / [(sx²/nx)²/(nx − 1) + (sy²/ny)²/(ny − 1)]. The confidence interval is (xbar − ybar) ± t_{ν, a/2} · √(sx²/nx + sy²/ny).

Two population proportions
Confidence interval estimation of the difference between two population proportions (large samples). Goal: form a confidence interval for the difference between two population proportions, Px − Py. Assumptions: both sample sizes are large (generally at least 40 observations in each sample), so the relevant random variable is approximately normally distributed. The confidence limits are (phatx − phaty) ± z_a/2 · √(phatx(1 − phatx)/nx + phaty(1 − phaty)/ny).

Hypothesis testing: one sample
A hypothesis is a claim (assumption) about a population parameter:
• Population mean. Example: the mean monthly cell phone bill of this city is μ = $52.
• Population proportion. Example: the proportion of adults in this city with cell phones is P = 0.88.

The null hypothesis, H0:
• States the assumption (numerical) to be tested. Example: the average number of TV sets in U.S. homes is equal to three (H0: μ = 3).
• Is always about a population parameter, not about a sample statistic.
• Begin with the assumption that the null hypothesis is true (similar to the notion of innocent until proven guilty).
• Refers to the status quo.
• Always contains an "=", "≤", or "≥" sign.
• May or may not be rejected.

The alternative hypothesis, H1:
• Is the opposite of the null hypothesis. E.g., the average number of TV sets in U.S. homes is not equal to 3 (H1: μ ≠ 3).
• Challenges the status quo.
• Never contains an "=", "≤", or "≥" sign.
• May or may not be supported.
• Is generally the hypothesis that the researcher is trying to support.

Level of significance, α:
• Defines the unlikely values of the sample statistic if the null hypothesis is true
• Defines the rejection region of the sampling distribution
• Is designated by α
(the level of significance). Typical values are 0.01, 0.05, or 0.10. It is selected by the researcher at the beginning and provides the critical value(s) of the test.

H0: μ = 3, H1: μ ≠ 3 (two-tailed test)
H0: μ ≤ 3, H1: μ > 3 (upper-tail test)
H0: μ ≥ 3, H1: μ < 3 (lower-tail test)

Errors in making decisions
Type I error: reject a true null hypothesis. This is considered a serious type of error. The probability of a Type I error is α, called the level of significance of the test, and it is set by the researcher in advance.
Type II error: fail to reject a false null hypothesis. The probability of a Type II error is β, and (1 − β) is called the power of the test.

Consequences of fixing the significance level of the test
Once the significance level α is chosen (generally less than 0.10), the probability of a Type II error, β, can be found. The investigator chooses the significance level (the probability of a Type I error), the decision rule is established, and the probability of a Type II error follows.

Type I and Type II error relationship
• Type I and Type II errors cannot happen at the same time
• A Type I error can occur only if H0 is true
• A Type II error can occur only if H0 is false
• If the Type I error probability (α) increases, then the Type II error probability (β) decreases

Factors affecting Type II error
All else equal, β increases when the difference between the hypothesized parameter and its true value decreases.

Power of the test
The power of a test is the probability of rejecting a null hypothesis that is false: Power = P(reject H0 | H1 is true). The power of the test increases as the sample size increases.

Tests of the mean of a normal distribution (σ known)
Convert the sample result xbar to a z value. Consider the test H0: μ = μ0, H1: μ > μ0 (assume the population is normal). The decision rule is: reject H0 if Z = (xbar − μ0)/(σ/√n) > z_a.

p-value
p-value: the probability of obtaining a test statistic at least as extreme as the observed sample value, given that H0 is true. Also called the observed level of significance; it is the smallest value of α for which H0 can be rejected.

p-value approach to testing
Convert the sample result (e.g., xbar) to a test
statistic (e.g., a z statistic). Obtain the p-value; for an upper-tail test, the p-value is P(Z > observed z). Decision rule: compare the p-value to α, and reject H0 if the p-value is less than α.

Tests of the mean of a normal distribution (σ unknown)
Convert the sample result xbar to a t test statistic. Consider the test H0: μ = μ0, H1: μ > μ0 (assume the population is normal). The decision rule is: reject H0 if t = (xbar − μ0)/(s/√n) > t_{n−1, a}.
For a two-tailed test, consider H0: μ = μ0, H1: μ ≠ μ0 (assume the population is normal and the population variance is unknown). The decision rule is: reject H0 if t = (xbar − μ0)/(s/√n) > t_{n−1, a/2}, or if t < −t_{n−1, a/2}.

Hypothesis tests for proportions
The sampling distribution of phat is approximately normal, so the test statistic is a z value: Z = (phat − P0)/√(P0(1 − P0)/n).

Calculating β
For a given true value of the parameter under H1, β is the probability that the test statistic falls in the nonrejection region.

Tests of the variance of a normal distribution
Goal: test hypotheses about the population variance, σ² (e.g., H0: σ² = σ0²). If the population is normally distributed, (n − 1)s²/σ0² has a chi-square distribution with (n − 1) degrees of freedom, and this is the test statistic for hypothesis tests about one population variance.

Hypothesis testing: two samples
Dependent samples
Tests of the difference between two normal population means: dependent samples. Tests means of 2 related populations:
• Paired or matched samples
• Repeated measures (before/after)
Use the difference between paired values, di = xi − yi. Assumptions: both populations are normally distributed.
Test statistic: the test statistic for the mean difference is a t value with (n − 1) degrees of freedom, t = (dbar − D0)/(sd/√n).

Independent samples
Goal: test hypotheses about the difference between two population means. Different populations: unrelated and independent (the sample selected from one population has no effect on the sample selected from the other population), and normally distributed.

Differences between two means (variances known)
Assumptions: samples are randomly and independently drawn; both population distributions are normal; population variances are known. When σx² and σy² are known and both populations are normal, the variance of xbar − ybar is σx²/nx + σy²/ny, and the random variable Z = ((xbar − ybar) − (μx − μy))/√(σx²/nx + σy²/ny) has a standard normal distribution; this Z is the test statistic when the variances are known.

Variances unknown, assumed equal
Assumptions:
• Samples are randomly and independently drawn
• Populations are normally distributed
• Population variances are unknown but assumed equal
The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate σ; use a t value with (nx + ny − 2) degrees of freedom.

Variances unknown, assumed unequal
Assumptions: samples are randomly and independently drawn; populations are normally distributed; population variances are unknown and assumed unequal. A pooled variance is not appropriate; use a t value with ν degrees of freedom (the same ν as for the corresponding confidence interval). The test statistic is t = ((xbar − ybar) − D0)/√(sx²/nx + sy²/ny).

Two population proportions
Goal: test hypotheses for the difference between two population proportions, Px − Py. The relevant random variable has a standard normal distribution. The test statistic for H0: Px − Py = 0 is Z = (phatx − phaty)/√(phat0(1 − phat0)(1/nx + 1/ny)), where phat0 = (nx·phatx + ny·phaty)/(nx + ny) is the pooled estimate of the common proportion.

Some comments on hypothesis testing
• A test with low power can result from: a small sample size; large variances in the underlying populations; poor measurement procedures
• If sample sizes are large, it is possible to find significant differences that are not practically important
• Researchers should select the appropriate level of significance before computing p-values

Tests of equality of two variances
Goal: test hypotheses about two population variances (e.g., H0: σx² = σy², two-tailed; H0: σx² ≤ σy², upper-tail; H0: σx² ≥ σy², lower-tail). Both populations are assumed to be independent and normally distributed. The test statistic is F = sx²/sy², and the critical value for a hypothesis test about two population variances is F_{nx−1, ny−1, a}, where F has (nx − 1) numerator degrees of freedom and (ny − 1) denominator degrees of freedom.

ANOVA
Elements of a designed experiment
Response variable: the response variable is the variable of interest to be measured in the experiment. We also refer to the response as the dependent variable. Typically, the response/dependent variable is quantitative in nature.
Factors: factors are those variables whose effect on the response is of interest to the experimenter. Quantitative factors are measured on a numerical scale, whereas qualitative factors are those that are not (naturally) measured on a numerical scale. Factors are also referred to as independent variables.
Factor levels: factor levels are the values of the factor used in the experiment.
Treatments: the treatments of an experiment are the factor-level combinations used.
Experimental unit: an experimental unit is the object on which the response and factors are observed or measured.

Designed and observational experiments
A designed study is one for which the analyst controls the specification of the treatments and the method of assigning the experimental units to each treatment. An observational study is one for which the analyst simply observes the treatments and the response on a sample of experimental units.

Experiment
The investigator controls one or more independent variables, called treatment variables or factors, each containing two or more levels (subcategories), and observes the effect on the dependent variable, i.e., the response to the levels of the independent variable. Experimental design: the plan used to test hypotheses.

One-way ANOVA
The completely randomized design: single factor
A completely randomized design is a design in which the experimental units are randomly assigned to the k treatments, or in which independent random samples of experimental units are selected for each treatment.
• Experimental units (subjects) are assigned randomly to treatments
• Subjects are assumed homogeneous
• One factor or independent variable, with two or more treatment levels or classifications
• Analyzed by one-way analysis of variance (ANOVA)

Conditions required for a valid ANOVA F-test: completely randomized design
1. The samples are randomly selected in an independent manner from the k treatment populations. (This can be accomplished by randomly assigning the experimental units to the treatments.)
2.
All k sampled populations have distributions that are approximately normal.
3. The k population variances are equal (i.e., σ1² = σ2² = σ3² = ... = σk²).

Hypotheses of one-way ANOVA
H0: all population means are equal, i.e., no variation in means between groups.
H1: at least one population mean is different, i.e., there is variation between groups. This does not mean that all population means are different (some pairs may be the same).

z/t test vs. ANOVA
With only two groups, the t-test and ANOVA are equivalent, but only if we use a pooled variance in the denominator of the t-statistic. With more than two groups, ANOVA compares the sample means to an overall grand mean.

Variability
The variability of the data is the key factor in testing the equality of means. The means may look different, but a large variation within groups makes the evidence that the means are different weak.

Sum of squares decomposition
Total variation can be split into two parts: SS(T) = SST + SSE.
• SS(T) = total sum of squares: total variation, the aggregate dispersion of the individual data values across the various groups
• SST = sum of squares for treatments: between-group/treatment variation, the aggregate dispersion between the group/treatment means
• SSE = sum of squares for error: within-group variation, the dispersion that exists among the data within a particular group

Total sum of squares
SS(T) = Σi Σj (xij − xbar)², where SS(T) is the total sum of squares, k is the number of groups (levels or treatments), ni is the number of observations in group i, xij is the jth observation from group i, and xbar is the overall sample mean.

Error / within-group variation
SSE = Σi Σj (xij − xbari)², where SSE is the sum of squares within groups, k is the number of groups, ni is the sample size from group i, xbari is the sample mean from group i, and xij is the jth observation in group i. Note that in some expositions the error sum of squares is referred to as the within-groups error. The intuition is that large variability within groups makes the evidence that the means are equal or different weak.
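The earlier claim that, with two groups, the pooled-variance t test and one-way ANOVA are equivalent can be checked numerically: the ANOVA F statistic equals the square of the pooled t statistic, and the p-values coincide. A sketch (the two samples are randomly generated purely for illustration; scipy is assumed available):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=1.0, size=12)   # group 1 (illustrative data)
y = rng.normal(loc=6.0, scale=1.0, size=15)   # group 2 (illustrative data)

t_stat, p_t = stats.ttest_ind(x, y, equal_var=True)   # pooled-variance t test
F_stat, p_F = stats.f_oneway(x, y)                    # one-way ANOVA with k = 2

assert np.isclose(t_stat ** 2, F_stat)   # F = t^2 when k = 2
assert np.isclose(p_t, p_F)              # identical p-values
```

The identity holds only for the pooled (equal-variance) form of the t test; Welch's unequal-variance t test does not correspond to the standard one-way ANOVA F test.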
Between-treatment/group variation
SST = Σi ni(xbari − xbar)², where SST is the sum of squares between treatments/groups, k is the number of groups, ni is the sample size from group i, xbari is the sample mean from group i, and xbar is the overall mean (mean of all data values).

Obtaining the mean squares
An unbiased estimator of the population variance results if SSE is divided by (n − k). The resulting estimate is called the within-groups mean square, denoted MSW (or MSE): MSE = SSE/(n − k). If the population means are equal, another unbiased estimator of the population variance is obtained by dividing SST by (k − 1). The resulting estimate is called the between-treatments/groups mean square, denoted MST: MST = SST/(k − 1). Here n is the sum of the sample sizes from all groups and k is the number of populations.

One-way ANOVA table
k = number of groups, n = sum of the sample sizes from all groups, df = degrees of freedom.

F test statistic
F = MST/MSE, where MST is the mean square between groups and MSE is the mean square within groups. Degrees of freedom: df1 = k − 1 (k = number of groups) and df2 = n − k (n = sum of sample sizes from all groups).

Interpreting the F statistic
The F statistic is the ratio of the between-groups estimate of variance to the within-groups estimate of variance, and it must always be positive. df1 = k − 1 will typically be small; df2 = n − k will typically be large. When the population means are not equal, the between-treatments/groups mean square does not provide an unbiased estimate of the common population variance; rather, the expected value of the corresponding random variable exceeds the common population variance. The greater the discrepancy between these two estimates (i.e., if the variability between treatments/groups is large compared to the variability within groups), all else being equal, the stronger our suspicion that the null hypothesis is not true. The null hypothesis is rejected for large values of this ratio. When the population means are equal (the null hypothesis is true), the between-groups mean square provides an unbiased estimate of the common population variance.
We would then be in possession of two unbiased estimates of the same quantity, the common population variance, and it would be reasonable to expect these estimates to be quite close to each other. If this ratio is quite close to 1, there is little cause to doubt the null hypothesis of equality of population means.

Post-hoc analysis
Bonferroni multiple comparisons of means procedure (pairwise comparisons)
Multiple comparisons between subgroup means are used to test which population means are significantly different, e.g., μ1 = μ2 ≠ μ3. This is done after rejection of equal means in a single-factor ANOVA design. It allows pairwise comparisons: compare absolute mean differences with a critical range.

Bonferroni method
We define α as the experimentwise error rate. We calculate the standard error for each treatment pair (i, j): SE_ij = √(MSE·(1/ni + 1/nj)), where ni and nj are the numbers of observations for treatments i and j. Now we need to calculate the significance level for each test. This is a two-tail test, so a/2; but there are c possible comparisons, where c depends on the number of treatments k (c = k(k − 1)/2 pairwise comparisons). We divide α among all possible tests, so the significance level becomes α/(2c), and the critical value of the t distribution is t_{ν, α/(2c)}, with ν = n − k degrees of freedom (the degrees of freedom associated with MSE). The test for each treatment pair compares the absolute mean difference |xbari − xbarj| with t_{ν, α/(2c)} · SE_ij.

Randomized block design
In contrast to the selection of independent samples of experimental units specified by the completely randomized design, the randomized block design uses experimental units that are matched sets, assigning one from each set to each treatment. The matched sets of experimental units are called blocks. The theory behind the randomized block design is that the sampling variability of the experimental units in each block will be reduced, in turn reducing the measure of error, MSE.
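The one-way sum-of-squares decomposition and F test described above can be sketched end to end. The three groups below are made-up numbers purely for illustration, and scipy is assumed available for the F tail probability and a cross-check:

```python
import numpy as np
from scipy import stats

groups = [np.array([22.0, 25.0, 24.0, 23.0]),   # hypothetical treatment 1
          np.array([28.0, 27.0, 30.0, 29.0]),   # hypothetical treatment 2
          np.array([21.0, 20.0, 23.0, 22.0])]   # hypothetical treatment 3

alldata = np.concatenate(groups)
n, k = len(alldata), len(groups)
grand_mean = alldata.mean()

SST = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # between groups
SSE = sum(((g - g.mean()) ** 2).sum() for g in groups)            # within groups
SS_T = ((alldata - grand_mean) ** 2).sum()                        # total

MST, MSE = SST / (k - 1), SSE / (n - k)
F = MST / MSE
p_value = stats.f.sf(F, k - 1, n - k)

assert np.isclose(SST + SSE, SS_T)      # SS(T) = SST + SSE
F_sp, p_sp = stats.f_oneway(*groups)    # scipy agrees with the hand computation
assert np.isclose(F, F_sp) and np.isclose(p_value, p_sp)
```

For these numbers SST = 104, SSE = 15, and F = (104/2)/(15/9) = 31.2, far beyond any common critical value, so H0 of equal means is rejected.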
Assumptions: populations are normally distributed; populations have equal variances; independent random samples are drawn.

Notation
Let xij denote the observation in the ith group and jth block. Suppose that there are K groups and H blocks, for a total of n = KH observations. Let the overall mean be xbar. Denote the group sample means by xbari· (i = 1, ..., K) and the block sample means by xbar·j (j = 1, ..., H).

Partition of total variation
The total sum of squares splits into between-groups, between-blocks, and error components: SS(T) = SSG + SSB + SSE. Dividing each sum of squares by its degrees of freedom (K − 1 for groups, H − 1 for blocks, (K − 1)(H − 1) for error) gives the corresponding mean square, and the F test statistic for equality of group means is F = MSG/MSE; these quantities are laid out in the general ANOVA table format.

Factorial experiments: two factors
Factorial design
A complete factorial experiment is one in which every factor-level combination is employed; that is, the number of treatments in the experiment equals the total number of factor-level combinations. This is also referred to as a two-way classification.

To determine the nature of the treatment effect, if any, on the response in a factorial experiment, we need to break the treatment variability into three components: the interaction between factors A and B, the main effect of factor A, and the main effect of factor B. The factor interaction component is used to test whether the factors combine to affect the response, while the factor main effect components are used to determine whether the factors separately affect the response.

• Experimental units (subjects) are assigned randomly to treatments
• Subjects are assumed homogeneous
• Two or more factors or independent variables, each with two or more treatments (levels)
• Analyzed by two-way ANOVA

Procedure for analysis of a two-factor factorial experiment
1. Partition the total sum of squares into the treatments and error components. Use either a statistical software package or the calculation formulas in Appendix C to accomplish the partitioning.
2. Use the F-ratio of the mean square for treatments to the mean square for error to test the null hypothesis that the treatment means are equal.
a. If the test results in nonrejection of the null hypothesis, consider refining the experiment by increasing the number of replications or introducing other factors.
Also consider the possibility that the response is unrelated to the two factors.
b. If the test results in rejection of the null hypothesis, then proceed to step 3.
3. Partition the treatments sum of squares into the main effect and interaction sums of squares. Use either a statistical software package or the calculation formulas in Appendix C to accomplish the partitioning.
4. Test the null hypothesis that factors A and B do not interact to affect the response by computing the F-ratio of the mean square for interaction to the mean square for error.
a. If the test results in nonrejection of the null hypothesis, proceed to step 5.
b. If the test results in rejection of the null hypothesis, conclude that the two factors interact to affect the mean response. Then proceed to step 6a.
5. Conduct tests of the two null hypotheses that the mean response is the same at each level of factor A and factor B. Compute the two F-ratios by comparing the mean square for each factor main effect to the mean square for error.
a. If one or both tests result in rejection of the null hypothesis, conclude that the factor affects the mean response. Proceed to step 6b.
b. If both tests result in nonrejection, an apparent contradiction has occurred. Although the treatment means apparently differ (step 2 test), the interaction (step 4) and main effect (step 5) tests have not supported that result. Further experimentation is advised.
6. Compare the means:
a. If the test for interaction (step 4) is significant, use a multiple comparisons procedure to compare any or all pairs of the treatment means.
b. If the test for one or both main effects (step 5) is significant, use a multiple comparisons procedure to compare the pairs of means corresponding to the levels of the significant factor(s).
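The partitioning in steps 1-4 can be sketched by hand for a small complete factorial. The 2×2 layout with r = 3 replicates below is invented purely for illustration, and scipy is assumed available for the F tail probability:

```python
import numpy as np
from scipy import stats

# Hypothetical 2x2 factorial with r = 3 replicates per treatment.
# y[i, j, :] holds the r responses for level i of factor A, level j of factor B.
a, b, r = 2, 2, 3
y = np.array([[[10.0, 11.0, 9.0],  [14.0, 15.0, 13.0]],
              [[12.0, 11.0, 13.0], [20.0, 22.0, 21.0]]])
n = a * b * r
grand = y.mean()
A_means = y.mean(axis=(1, 2))      # level means of factor A
B_means = y.mean(axis=(0, 2))      # level means of factor B
cell = y.mean(axis=2)              # treatment (cell) means

SSA = b * r * ((A_means - grand) ** 2).sum()                     # main effect A
SSB = a * r * ((B_means - grand) ** 2).sum()                     # main effect B
SSAB = r * ((cell - A_means[:, None] - B_means[None, :] + grand) ** 2).sum()
SSE = ((y - cell[:, :, None]) ** 2).sum()                        # error
SS_total = ((y - grand) ** 2).sum()   # equals SSA + SSB + SSAB + SSE

MSA, MSB = SSA / (a - 1), SSB / (b - 1)
MSAB, MSE = SSAB / ((a - 1) * (b - 1)), SSE / (n - a * b)

# Step 4: test interaction first; main effects are interpreted only if
# the interaction test does not reject.
F_AB = MSAB / MSE
p_AB = stats.f.sf(F_AB, (a - 1) * (b - 1), n - a * b)
F_A, F_B = MSA / MSE, MSB / MSE
```

For these numbers the decomposition gives SSA = 60.75, SSB = 126.75, SSAB = 18.75, SSE = 8, which sum to the total 214.25, matching the partition in step 3.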
ANOVA Tests Conducted for Factorial Experiments: Completely Randomized Design, r Replicates per Treatment

Test for Treatment Means
H0: No difference among the ab treatment means
Ha: At least two treatment means differ
Test Statistic: F = MST/MSE
Rejection region: F > Fα, based on (ab – 1) numerator and (n – ab) denominator degrees of freedom [Note: n = abr.]

Test for Factor Interaction
H0: Factors A and B do not interact to affect the response mean
Ha: Factors A and B do interact to affect the response mean
Test Statistic: F = MS(AB)/MSE
Rejection region: F > Fα, based on (a – 1)(b – 1) numerator and (n – ab) denominator degrees of freedom

Test for Main Effect of Factor A
H0: No difference among the a mean levels of factor A
Ha: At least two factor A mean levels differ
Test Statistic: F = MS(A)/MSE
Rejection region: F > Fα, based on (a – 1) numerator and (n – ab) denominator degrees of freedom

Test for Main Effect of Factor B
H0: No difference among the b mean levels of factor B
Ha: At least two factor B mean levels differ
Test Statistic: F = MS(B)/MSE
Rejection region: F > Fα, based on (b – 1) numerator and (n – ab) denominator degrees of freedom

Conditions Required for Valid F-tests in Factorial Experiments
1. The response distribution for each factor-level combination (treatment) is normal.
2. The response variance is constant for all treatments.
3. Random, independent samples of experimental units are associated with each treatment.

Analysis of categorical data

Multinomial experiment: When the qualitative variable results in one of two responses (success or failure), the data, called counts, can be analyzed using the binomial probability distribution. However, qualitative variables, such as level of education, that allow for more than two categories for a response are much more common, and these must be analyzed using a different method. Qualitative data that fall in more than two categories often result from a multinomial experiment.
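The rejection regions above all use an upper-tail F critical value Fα with the stated degree-of-freedom pairs. As a minimal sketch (assuming an illustrative a = 2, b = 2, r = 3 design, so n = abr = 12), the critical values can be obtained from the F distribution's percent-point function:

```python
from scipy.stats import f

# Illustrative design: a = 2 levels of factor A, b = 2 of factor B, r = 3 replicates.
a, b, r = 2, 2, 3
n = a * b * r          # n = abr = 12
alpha = 0.05

# Critical values F_alpha matching the degree-of-freedom pairs listed above
F_treat = f.ppf(1 - alpha, a * b - 1, n - a * b)            # (ab - 1, n - ab)
F_inter = f.ppf(1 - alpha, (a - 1) * (b - 1), n - a * b)    # ((a-1)(b-1), n - ab)
F_mainA = f.ppf(1 - alpha, a - 1, n - a * b)                # (a - 1, n - ab)
F_mainB = f.ppf(1 - alpha, b - 1, n - a * b)                # (b - 1, n - ab)
```

Each computed F-ratio is then compared against its own critical value; with a = b = 2 the interaction and main-effect tests all have 1 numerator degree of freedom, so their critical values coincide.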
The characteristics of a multinomial experiment with k outcomes are described in the next slide. It is easy to see that the binomial experiment is a multinomial experiment with k = 2.

Procedure: Testing categorical probabilities: one-way table

One-way table: In this section, we consider a multinomial experiment with k outcomes that correspond to categories of a single qualitative variable. The results of such an experiment are summarized in a one-way table. The term one-way is used because only one variable is classified. Typically, we want to make inferences about the true proportions that occur in the k categories based on the sample information in the one-way table.

In the consumer-preference survey, as in most practical applications of the multinomial experiment, the k outcome probabilities p1, p2, p3 are unknown, and we typically want to use the survey data to make inferences about their values. For example, to decide whether the consumers have a preference for any of the brands, we will want to test the null hypothesis that the brands of bread are equally preferred, that is, p1 = p2 = p3 = 1/3, against the alternative hypothesis that one brand is preferred, that is, at least one of the probabilities exceeds 1/3. We want to test:
H0: p1 = p2 = p3 = 1/3
H1: at least one of the probabilities exceeds 1/3

Testing categorical probabilities: One-way table
We calculate the statistics: E1 = np1 = 150 * 1/3 = 50 (the same for E2 and E3). The following test statistic, the chi-square test, measures the degree of disagreement between the data and the null hypothesis:

Definition

Goodness of fit tests: specified probabilities. Does sample data conform to a hypothesized distribution? Examples: Do sample results conform to specified expected probabilities? Are technical support calls equal across all days of the week? (i.e., do calls follow a uniform distribution?) Do measurements from a production process follow a normal distribution?
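The one-way test just described can be sketched in a few lines. The sample size n = 150 and the expected counts E_i = 50 come from the text; the observed brand counts below are hypothetical, since the survey data are not reproduced here:

```python
from scipy.stats import chisquare

# n = 150 respondents; E_i = n * p_i = 50 under H0: p1 = p2 = p3 = 1/3.
# The observed brand counts below are hypothetical.
observed = [61, 53, 36]
expected = [150 / 3] * 3

stat, pvalue = chisquare(observed, f_exp=expected)
# stat = sum over i of (O_i - E_i)^2 / E_i, with df = k - 1 = 2
```

For these illustrative counts the statistic is (11² + 3² + 14²)/50 = 6.52, which exceeds the df = 2 critical value 5.991 at α = 0.05, so H0 would be rejected.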
Chi-square goodness of fit test
Are technical support calls equal across all days of the week? (i.e., do calls follow a uniform distribution?) Sample data for 10 days per day of week: If calls are uniformly distributed, the 1,722 calls would be expected to be equally divided across the 7 days (1,722 / 7 = 246 expected calls per day). Chi-Square Goodness-of-Fit Test: a test of whether the sample results are consistent with the expected results.

Logic of goodness of fit test: observed vs. expected frequencies.

Chi-square test statistic: The test statistic is
where: K = number of categories, Oi = observed frequency for category i, Ei = expected frequency for category i.

Rejection region

Testing categorical probabilities: two-way (contingency) table
We now consider multinomial experiments in which the data are classified according to two criteria, that is, classification with respect to two qualitative factors.

Contingency table: Two-way tables are also called contingency tables. They are used to classify sample observations according to a pair of attributes, and are also called cross-classification or cross-tabulation tables. Assume r categories for attribute A and c categories for attribute B; then there are (r x c) possible cross-classifications. Consider n observations tabulated in an r x c contingency table, and denote by nij the number of observations in the cell that is in the ith row and the jth column. The null hypothesis is

Test for independence: The appropriate test is a chi-square test with (r-1)(c-1) degrees of freedom.

Testing Categorical Probabilities: Two-Way Table
This really is a multinomial experiment with a total of 300 trials, 4 cells or possible outcomes, and probabilities for each cell as shown in Table 10.4b. If the 300 TV viewers are randomly chosen, the trials are considered independent, and the probabilities are viewed as remaining constant from trial to trial. Suppose we want to know whether the two classifications, gender and brand awareness, are dependent.
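The day-of-week goodness-of-fit test described above follows the same pattern. The total of 1,722 calls and the expected count of 246 per day come from the text; the per-day observed counts below are hypothetical (chosen only so that they sum to 1,722):

```python
from scipy.stats import chisquare

# 1,722 calls over 7 weekdays; expected 1722 / 7 = 246 per day under uniformity.
# The per-day observed counts are hypothetical (they sum to 1,722).
observed = [290, 250, 238, 257, 265, 230, 192]   # Mon .. Sun
expected = [1722 / 7] * 7

stat, pvalue = chisquare(observed, f_exp=expected)
# Reject uniformity at alpha = 0.05 if stat > chi-square critical value
# with df = 7 - 1 = 6, i.e. 12.592
```
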
In the contingency table analysis, if the two classifications are independent, the probability that an item is classified in any particular cell of the table is the product of the corresponding marginal probabilities. Thus, under the hypothesis of independence we must have:
Note: if two events A and B are independent, then P(A ∩ B) = P(A) * P(B).

Procedure: Two-way table
The following test statistic, the chi-square test, is used to compare the observed and expected counts in each cell of the contingency table:
We make use of the fact that the sampling distribution of 𝛘² is approximately a 𝛘² probability distribution when the classifications are independent. When testing the null hypothesis of independence in a two-way contingency table, the appropriate degrees of freedom will be (r-1)(c-1), where r is the number of rows and c is the number of columns. A test of independence at a significance level α is based on the chi-square distribution and the following decision rule:
From this example, df = (2-1)(2-1) = 1. Then for α = 0.05, the critical value is 𝛘²c = 3.8415. Since 𝛘² = 46.14 > 3.8415, we reject the null hypothesis of independence. The pattern of dependence can be seen more clearly by expressing the data as percentages. We first select one of the two classifications to be used as the base variable. Suppose we select gender of the TV viewer as the classificatory variable to be the b
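The independence test for a 2 x 2 contingency table can be sketched as follows. The cell counts below are hypothetical (the counts behind the text's 𝛘² = 46.14 are not reproduced here); only the overall sample size of 300 viewers is taken from the text:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2 x 2 table of 300 viewers: rows = gender, columns = brand awareness.
table = np.array([[95, 55],
                  [50, 100]])

# correction=False gives the plain chi-square statistic used in the text
stat, pvalue, dof, expected = chi2_contingency(table, correction=False)
# dof = (r - 1)(c - 1) = 1; reject independence at alpha = 0.05 if stat > 3.8415
```

The `expected` array returned here contains the estimated expected counts (row total × column total / n), the same quantities compared against the observed counts in the chi-square statistic.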