Chi-Square Goodness of Fit Test
When an analyst attempts to fit a statistical model to observed data, he
or she may wonder how well the model actually reflects the data. How "close"
are the observed values to those which would be expected under the fitted
model? One statistical test that addresses this issue is the chi-square
goodness of fit test. This test is commonly used to test association of
variables in two-way tables (see the discussion of two-way tables below),
where the assumed model of independence is evaluated against the observed
data. In general, the chi-square test statistic is of the form

    χ² = Σ (observed - expected)² / expected

If the computed test statistic is large, then the observed and expected
values are not close and the model is a poor fit to the data.
A new casino game involves rolling 3 dice. The winnings are directly proportional
to the total number of sixes rolled. Suppose a gambler plays the game 100
times, with the following observed counts:
Number of Sixes    Number of Rolls
      0                  48
      1                  35
      2                  15
      3                   3
The casino becomes suspicious of the gambler and wishes to determine whether
the dice are fair. What do they conclude?
If a die is fair, we would expect the probability of rolling a 6 on
any given toss to be 1/6. Assuming the 3 dice are independent (the roll
of one die should not affect the roll of the others), we might assume that
the number of sixes in three rolls is distributed Binomial(3,1/6). To determine
whether the gambler's dice are fair, we may compare his results with the
results expected under this distribution. The expected values for 0, 1,
2, and 3 sixes under the Binomial(3,1/6) distribution are the following:
Null Hypothesis:
p1 = P(roll 0 sixes) = P(X=0) = 0.58
p2 = P(roll 1 six) = P(X=1) = 0.345
p3 = P(roll 2 sixes) = P(X=2) = 0.07
p4 = P(roll 3 sixes) = P(X=3) = 0.005.
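These binomial probabilities can be reproduced in R with dbinom; a minimal check:

round(dbinom(0:3, size = 3, prob = 1/6), 4)
# [1] 0.5787 0.3472 0.0694 0.0046
# i.e. the Binomial(3, 1/6) probabilities of 0, 1, 2, and 3 sixes;
# the article rounds these to 0.58, 0.345, 0.07, and 0.005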
Since the gambler plays 100 times, the expected counts are the following:
Number of Sixes    Expected Counts    Observed Counts
      0                 58                  48
      1                 34.5                35
      2                  7                  15
      3                  0.5                 3
The two plots in the original article (not reproduced here) provide a visual
comparison of the expected and observed values. From these graphs, it is
difficult to distinguish differences between the observed and expected counts.
A clearer visual representation of the differences is the chi-gram, which
plots the observed counts minus the expected counts, divided by the square
root of the expected counts, as shown below:
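A minimal R sketch that computes these standardized differences and draws a comparable chi-gram:

observed <- c(48, 35, 15, 3)
expected <- c(58, 34.5, 7, 0.5)
chi_values <- (observed - expected) / sqrt(expected)   # heights of the chi-gram bars
round(chi_values, 3)
# [1] -1.313  0.085  3.024  3.536
barplot(chi_values, names.arg = 0:3, xlab = "Number of sixes",
        ylab = "(observed - expected) / sqrt(expected)")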
The chi-square statistic is the sum of the squares of the plotted values,
(48-58)²/58 + (35-34.5)²/34.5 + (15-7)²/7 + (3-0.5)²/0.5
= 1.72 + 0.007 + 9.14 + 12.5 = 23.367.
Given this statistic, are the observed values likely under the assumed model?

The Chi-Square Distribution
A random variable is said to have a chi-square distribution with m degrees
of freedom if it is the sum of the squares of m independent standard normal
random variables (the square of a single standard normal random variable has
a chi-square distribution with one degree of freedom). This distribution is
denoted χ²(m), with associated probability values available in Table G in
Moore and McCabe and in MINITAB.
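This definition is easy to check by simulation in R; a small sketch:

set.seed(1)                               # for reproducibility
m <- 3                                    # degrees of freedom
z <- matrix(rnorm(10000 * m), ncol = m)   # 10,000 rows of m standard normals
q <- rowSums(z^2)                         # sum of m squared standard normals
mean(q)                                   # close to m, the mean of a chi-square(m) variable
quantile(q, 0.95)                         # close to qchisq(0.95, m) = 7.815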
The standardized counts (observed - expected)/sqrt(expected) for k
possibilities are approximately normal, but they are not independent, because
one of the counts is entirely determined by the sum of the others (since
both the observed and the expected counts must sum to n). This
results in a loss of one degree of freedom, so it turns out that the distribution
of the chi-square test statistic based on k counts is approximately
the chi-square distribution with m = k - 1 degrees of freedom,
denoted χ²(k-1).
Hypothesis Testing
We use the chi-square test to test the validity of a distribution assumed
for a random phenomenon. The test evaluates the null hypothesis H0
(that the data are governed by the assumed distribution) against the alternative
(that the data are not drawn from the assumed distribution).
Let p1, p2, ..., pk denote the probabilities hypothesized for k possible
outcomes. In n independent trials, we let Y1, Y2, ..., Yk denote the
observed counts of each outcome, which are to be compared to the expected
counts np1, np2, ..., npk. The chi-square test statistic is

    qk-1 = (Y1 - np1)²/np1 + (Y2 - np2)²/np2 + ... + (Yk - npk)²/npk
Reject H0 if this value exceeds the upper α critical value of the χ²(k-1)
distribution, where α is the desired level of significance.
In the gambling example above, the chi-square test statistic was calculated
to be 23.367. Since k = 4 in this case (the possibilities are 0,
1, or 3 sixes), the test statistic is associated with the chi-square
distribution with 3 degrees of freedom. If we are interested in a
significance level of 0.05, we may reject the null hypothesis (that the
dice are fair) if the test statistic exceeds 7.815, the value corresponding
to the 0.05 significance level for the χ²(3) distribution. Since
23.367 is clearly greater than 7.815, we may reject the null hypothesis
that the dice are fair at the 0.05 significance level.
Given this information, the casino asked the gambler to take his dice
(and his business) elsewhere.
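Here is a minimal R sketch of the whole test; note that chisq.test warns when some expected counts fall below 5 (the usual rule of thumb for the approximation), as happens here:

observed <- c(48, 35, 15, 3)
probs <- c(0.58, 0.345, 0.07, 0.005)           # null probabilities from above
expected <- 100 * probs
statistic <- sum((observed - expected)^2 / expected)
statistic                                      # 23.374 (the text's 23.367 rounds each term first)
qchisq(0.95, df = 3)                           # critical value: 7.815
pchisq(statistic, df = 3, lower.tail = FALSE)  # p-value, about 3e-05
chisq.test(observed, p = probs)                # same test in one call; warns about small expected counts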
Derivation of the Test Statistic
Consider a binomial random variable Y with mean (expected value) np and
variance σy² = np(1-p). From the Central Limit Theorem (the normal
approximation to the binomial), we know that Z = (Y - np)/σy
has an approximately Normal(0,1) distribution for large values of n.
Then Z² is approximately χ²(1), since the square of a standard normal
random variable has a chi-square distribution with one degree of freedom.
Suppose the random variable Y1 has a Bin(n, p1) distribution, and let
Y2 = n - Y1 and p2 = 1 - p1. Then

    Z² = (Y1 - np1)² / [np1(1-p1)]

       = [(Y1 - np1)²(1 - p1) + (Y1 - np1)²(p1)] / [np1(1-p1)]

       = (Y1 - np1)²/np1 + (Y1 - np1)²/[n(1-p1)]

Since (Y1 - np1)² = (n - Y2 - n + np2)² = (Y2 - np2)², we have

    Z² = (Y1 - np1)²/np1 + (Y2 - np2)²/np2

where Z² has a chi-square distribution with 1 degree of freedom.
If the observed values Y1 and Y2 are
close to their expected values np1 and np2,
then the calculated value Z² will be close to zero. If not,
Z² will be large.
In general, for k random variables Yi, i
= 1, 2,..., k, with corresponding expected values npi,
a statistic measuring the "closeness" of the observations to their expectations
is the sum
    (Y1 - np1)²/np1 + (Y2 - np2)²/np2 + ... + (Yk - npk)²/npk
which has a chi-square distribution with k-1 degrees of freedom.
Estimating Parameters
Often, the null hypothesis involves fitting a model with parameters estimated
from the observed data. In the above gambling example, for instance, we
might wish to fit a binomial model to evaluate the probability of rolling
a six with the gambler's loaded dice. We know that this probability is
not equal to 1/6, so we might estimate this value by calculating the probability
from the data. By estimating a parameter, we lose a degree of freedom in
the chi-square test statistic. In general, if we estimate d parameters
under the null hypothesis with k possible counts, the degrees of
freedom for the associated chi-square distribution will be k - 1 - d.
A two-way table for two categorical variables X and Y with
r and c levels, respectively, will have r rows and
c columns. The table will have rc cells, with any one cell
entirely determined by the sum of the others, so k-1 = rc - 1
in this case. A chi-square test of this table tests the null hypothesis
of independence against the alternative hypothesis of association between
the variables. Under the assumption of independence, we estimate (r-1)
+ (c-1) parameters to give the marginal probabilities that determine
the expected counts, so d = (r-1) + (c-1). The degrees of
freedom for the chi-square test statistic are
(rc - 1) - [(r-1) + (c-1)]
= rc -1 - r + 1 - c + 1
= rc - r - c + 1
= (r - 1)(c - 1).
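For example, chisq.test in R reports exactly these degrees of freedom for a two-way table; a small sketch with made-up counts:

# A hypothetical 2 x 3 table of counts (r = 2 rows, c = 3 columns)
tab <- matrix(c(20, 30, 25,
                15, 40, 35), nrow = 2, byrow = TRUE)
chisq.test(tab)   # the output reports df = (2 - 1)(3 - 1) = 2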
The chi-square goodness of fit test may also be applied to continuous
distributions. In this case, the observed data are grouped into discrete
bins so that the chi-square statistic may be calculated. The expected values
under the assumed distribution are the probabilities associated with each
bin multiplied by the number of observations. In the following example,
the chi-square test is used to determine whether or not a normal distribution
provides a good fit to observed data.
The MINITAB data file "GRADES.MTW" contains data on verbal and mathematical
SAT scores and grade point average for 200 college students. Suppose we
wish to determine whether the verbal SAT scores follow a normal distribution.
One method is to evaluate a normality plot for the data, such as a
normal quantile plot (not reproduced here).
The plot indicates that the assumption of normality is not unreasonable
for the verbal scores data.
To compute a chi-square test statistic, I first standardized the verbal
scores data by subtracting the sample mean and dividing by the sample standard
deviation. Since these are estimated parameters, my value for d
in the test statistic will be equal to two. The 200 standardized observations
are the following:
  [1] -2.173  0.702  0.942  0.202  0.673
 [11] -0.557 -0.573 -0.530  0.065 -0.264
 [21] -0.378  0.238  1.430 -0.693  1.102
 [31]  1.171  0.518 -0.630  0.306 -0.773
 [41] -1.546  0.009  0.235 -1.358  0.900
 [51]  0.109  0.388  1.130  1.978  0.766
 [61]  0.638  1.378  0.314 -0.535 -0.973
 [71]  0.413 -0.544  0.309  0.131  2.237
 [81] -1.275  1.060  0.066 -0.371 -0.770
 [91] -0.002  0.273 -0.071 -0.842 -0.231
[101] -0.035  0.243  1.035 -2.010  0.037
[111]  1.714 -0.735  0.689  0.965  0.049
[121]  0.627 -0.580 -2.453 -1.102  1.199
[131] -0.638 -0.329 -0.438  0.742 -0.486
[141] -0.306 -0.416 -0.238 -0.251  0.571
[151] -0.929  0.250  1.478 -1.300 -0.510
[161] -0.474  0.529  1.442  0.107 -0.371
[171] -0.072  1.335 -1.882 -0.029 -0.402
[181] -1.821  0.431  0.638  0.499 -0.378
[191]  1.721 -0.302 -0.093 -1.866  2.238
I chose to divide the observations into 10 bins, as follows:
Bin              Observed Counts
(< -2.0)                 6
(-2.0, -1.5)             6
(-1.5, -1.0)            18
(-1.0, -0.5)            33
(-0.5, 0.0)             38
(0.0, 0.5)              38
(0.5, 1.0)              28
(1.0, 1.5)              21
(1.5, 2.0)               9
(> 2.0)                  3
The corresponding standard normal probabilities and the expected number
of observations (with n=200) are the following:
Bin            Normal Prob.   Expected Counts   Observed - Expected   Chi-Value
(< -2.0)           0.023            4.6                 1.4              0.65
(-2.0, -1.5)       0.044            8.8                -2.8             -0.94
(-1.5, -1.0)       0.092           18.4                -0.4             -0.09
(-1.0, -0.5)       0.150           30.0                 3.0              0.55
(-0.5, 0.0)        0.191           38.2                -0.2             -0.03
(0.0, 0.5)         0.191           38.2                -0.2             -0.03
(0.5, 1.0)         0.150           30.0                -2.0             -0.36
(1.0, 1.5)         0.092           18.4                 2.6              0.61
(1.5, 2.0)         0.044            8.8                 0.2              0.07
(> 2.0)            0.023            4.6                -1.6             -0.75
The chi-square statistic is the sum of the squares of the values in the
last column, and is equal to 2.69.
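A minimal R sketch of this binned goodness-of-fit computation, assuming the 200 raw scores sit in a vector named verbal (a hypothetical name; the data would be imported first):

# verbal <- read.csv("grades.csv")$verbal       # hypothetical import of the scores
z <- (verbal - mean(verbal)) / sd(verbal)       # standardize with the two estimated parameters
breaks <- c(-Inf, seq(-2, 2, by = 0.5), Inf)    # the 10 bins used above
observed <- table(cut(z, breaks))               # observed counts per bin
probs <- diff(pnorm(breaks))                    # standard normal probability of each bin
expected <- length(z) * probs                   # expected counts with n = 200
statistic <- sum((observed - expected)^2 / expected)
statistic                                       # about 2.69 for the article's data
qchisq(0.95, df = 10 - 1 - 2)                   # critical value 14.07 with 7 df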
Since the data are divided into 10 bins and we have estimated two parameters,
the calculated value may be tested against the chi-square distribution
with 10 - 1 - 2 = 7 degrees of freedom. For this distribution, the critical
value for the 0.05 significance level is 14.07. Since 2.69 < 14.07,
we do not reject the null hypothesis that the data are normally distributed.

Degrees of Freedom
A lot of researchers seem to be struggling with their understanding of the statistical concept of degrees of freedom. Most do not really care about why degrees of freedom are important to statistical tests, but just want to know how to calculate and report them. This page will help. For those interested in learning more about degrees of freedom, take a look at the following resource:
Walker, H. W. (1940). Degrees of Freedom. Journal of Educational Psychology, 31(4), 253-269.
I couldn't find any resource on the web that explains calculating degrees of freedom in a simple and clear manner, and I believe this page will fill that void. It reflects my current understanding of degrees of freedom, based on what I read in textbooks and scattered sources on the web. Feel free to add or comment.
Conceptual Understanding
Let's start with a simple explanation of degrees of freedom. I will describe how to calculate degrees of freedom in an F-test (ANOVA) without much statistical terminology. When reporting an ANOVA, you write degrees of freedom 1 (df1) and degrees of freedom 2 (df2) between the brackets, like this: "F(df1, df2) = ...". Df1 and df2 refer to different things, but can be understood in the same way, as follows.
Imagine a set of three numbers; pick any numbers you want. For instance, it could be the set [1, 6, 5]. Calculating the mean for those numbers is easy: (1 + 6 + 5) / 3 = 4.
Now, imagine a set of three numbers whose mean is 3. There are lots of sets of three numbers with a mean of 3, but for any such set the bottom line is this: you can freely pick the first two numbers, any numbers at all, but the third (last) number is out of your hands as soon as you have picked the first two. Say our first two numbers are the same as in the previous set, 1 and 6, giving us a set of two freely picked numbers and one number that we still need to choose, x: [1, 6, x].
For this set to have a mean of 3, we don't have anything to choose about x: x has to be 2, because (1 + 6 + 2) / 3 is the only way to get to 3. So the first two values were free for you to choose, and the last value is set accordingly to get to the given mean. This set is said to have two degrees of freedom, corresponding to the number of values that you were free to choose (that is, that were allowed to vary freely).
This generalizes to a set of any given length. If I ask you to generate a set of 4, 10, or 1,000 numbers that average to 3, you can freely choose all numbers but the last one. In those sets the degrees of freedom are, respectively, 3, 9, and 999. The general rule for any set is that if n equals the number of values in the set, the degrees of freedom equal n - 1.
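A tiny R sketch of this idea: given any freely chosen first n - 1 values and a target mean, the last value is forced.

target_mean <- 3
free <- c(1, 6)                      # the n - 1 freely chosen values
n <- length(free) + 1
last <- n * target_mean - sum(free)  # the remaining value is determined
last                                 # 2
mean(c(free, last))                  # 3, as required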
This is the basic method to calculate degrees of freedom: just n - 1. It is as simple as that. The thing that makes it seem more difficult is the fact that in an ANOVA you don't have just one set of numbers; there is a system (design) to the numbers. In the simplest form, you test the mean of one set of numbers against the mean of another set of numbers (one-way ANOVA). In more complicated one-way designs, you test the means of three or more groups against each other. In a 2 x 2 design, things seem even more complicated, especially if there's a within-subjects variable involved (note: all examples on this page are between-subjects, but the reasoning mostly generalizes to within-subjects designs). However, things are not as complicated as you might think. It's all pretty much the same reasoning: how many values are free to vary to get to a given number?
Df1 is all about means and not about single observations. The value depends on the exact design of your test. Basically, the value represents the number of cell means that are free to vary to get to a given grand mean. The grand mean is just the mean across all groups and conditions of your entire sample. The cell means are nothing more than the means per group and condition. We’ll call the number of cells (or cell means) k.
Let’s start off with a one-way ANOVA. We have two groups that we want to compare, so we have two cells. If we know the mean of one of the cells and the grand mean, the other cell must have a specific value such that (cell mean 1 + cell mean 2) / 2 = grand mean (this example assumes equal cell sample sizes, but unequal cell sample sizes would not change the number of degrees of freedom). Conclusion: for a two-group design, df1 = 1.
Sticking with the one-way ANOVA, but moving on to three groups: we now have three cells, so we have three means and a grand mean. Again, how many means are free to vary to get to the given grand mean? That's right, 2. So df1 = 2. See the pattern? For one-way ANOVAs, df1 = k - 1.
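You can see df1 directly in R's ANOVA output; a minimal sketch with three made-up groups:

set.seed(42)
g <- factor(rep(c("a", "b", "c"), each = 20))  # k = 3 groups, n = 60
y <- rnorm(60)                                  # made-up outcome data
summary(aov(y ~ g))
# The g row shows Df = 2, i.e. df1 = k - 1 for three groups
# (the Residuals row shows the error df, discussed below as df2)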
Moving on to an ANOVA with four groups. We know the answer if this is a one-way ANOVA (that is, a 4 x 1 design): df1 = k - 1 = 4 - 1 = 3. However, what if this is a two-way ANOVA (a 2 x 2 design)? We still have four means, so to get to a given grand mean, we can have three freely varying cell means, right? Although this is true, we have more to deal with than just the grand mean, namely the marginal means. The marginal means are the combined cell means of one variable, given a specific level of the other variable. Let's say our 2 x 2 ANOVA follows a 2 (gender: male vs. female) x 2 (eye color: blue vs. brown) design. In that case, the grand mean is the average of all observations in all 4 cells. The marginal means are the average of all eye colors for male participants, the average of all eye colors for female participants, the average of all genders for blue-eyed participants, and the average of all genders for brown-eyed participants. The following table shows the same thing:
             Brown eyes            Blue eyes
Males        brown eyed males      blue eyed males      MARGINAL MEAN of brown eyed males and blue eyed males
Females      brown eyed females    blue eyed females    MARGINAL MEAN of brown eyed females and blue eyed females
             MARGINAL MEAN of      MARGINAL MEAN of     GRAND MEAN
             brown eyed males      blue eyed males
             and brown eyed        and blue eyed
             females               females
The reason that we are now dealing with marginal means is that we are interested in interactions. In a 4 x 1 one-way ANOVA, no interactions can be calculated. In our 2 x 2 two-way ANOVA, we can. For instance, we might be interested in whether females perform better than males depending on their eye color. Now, because we are interested in cell means differences in a specific way (i.e., we are not just interested in whether one cell mean deviates from the grand mean, but we are also interested in more complex patterns), we need to pay attention to the marginal means. As a consequence, we now have less freedom to vary our cell means, because we need to account for the marginal means (if you want to know how this all works, you should read up on how the sums of squares are partitioned in 2 x 2 ANOVA’s). It is also important to realize that if all marginal means are fixed, the grand mean is fixed too. In other words, we do not have to worry about the grand mean anymore for calculating our df1 in a two-way ANOVA, because we are already worrying about the marginal means. As a consequence, our df1 will not lose a degree of freedom because we do not want to get to a specific grand mean. Our df1 will only lose degrees of freedom to get to the specific marginal means.
Now, how many cell means are free to vary before we need to fill in the other cell means to get to the four marginal means in the 2 x 2 design? Let’s start with freely picking the cell mean for brown eyed males. We know the marginal mean for brown eyed males and blue eyed males together (it is given, all marginal means are), so I guess we can’t choose the blue eyed males cell mean freely. There goes one degree of freedom. We also know the marginal mean for brown eyed males and brown eyed females together. That means we can’t choose the brown eyed female cell mean freely either. And as we know the other two marginal means, we have no choice in what we put in the blue eyed females cell mean to get to the correct marginal means. So, we chose one cell mean, and the other three cell means had to be filled in as a consequence to get to the correct marginal means. You know what that means don’t you? We only have one degree of freedom in df1 for a 2 x 2 design. That’s different from the three degrees of freedom in a 4 x 1 design. The same number of groups and they might even contain the same observations, but we get a different number of degrees of freedom. So now you see that using the degrees of freedom, you can infer a lot about the design of the test.
You could do the same mental exercise for a 2 x 3 design, but it is tedious for me to write up, so I am going to give you the general rule. Every variable in your design has a certain number of levels. Variable 1 in the 2 x 3 design has 2 levels, variable 2 has 3 levels. You get df1 by multiplying the numbers of levels of all variables with each other, but with one level subtracted from each variable. So in the 2 x 3 design, df1 would be (2 - 1) x (3 - 1) = 2 degrees of freedom. Back to the 2 x 2 design: df1 would be (2 - 1) x (2 - 1) = 1 degree of freedom. Now let's see what happens with a 2 x 2 x 2 design: (2 - 1) x (2 - 1) x (2 - 1) = still 1 degree of freedom. A 3 x 3 x 4 design (I hope you'll never have to analyze that one): (3 - 1) x (3 - 1) x (4 - 1) = 2 x 2 x 3 = 12 degrees of freedom.
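Again, R's output shows these values; a sketch of a 2 x 2 between-subjects design with made-up factors:

set.seed(7)
gender <- factor(rep(c("male", "female"), each = 40))
eyes   <- factor(rep(c("blue", "brown"), times = 40))
y      <- rnorm(80)                 # made-up outcome data
summary(aov(y ~ gender * eyes))
# gender and eyes each show Df = 1, and the gender:eyes interaction
# shows Df = (2 - 1) x (2 - 1) = 1, as derived above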
By now, you should be able to calculate df1 in F(df1, df2) with ease. By the way, most statistical programs give you this value for free. However, you will now be able to judge, to some extent, whether researchers have performed the right analyses in their papers based on their df1 value. Also, df1 is calculated the same way in a within-subjects design. Just treat the within-subjects variable as any other variable. Let's move on to df2.
Whereas df1 was all about how the cell means relate to the grand mean or marginal means, df2 is about how the single observations in the cells relate to the cell means. Basically, df2 is the total number of observations in all cells (n) minus the degrees of freedom lost because the cell means are set (that is, minus the number of cell means or groups/conditions, k). Df2 = n - k, that's all folks! Say we have 150 participants across four conditions. That means we will have df2 = 150 - 4 = 146, regardless of whether the design is 2 x 2 or 4 x 1.
Most statistical packages give you df2 too. In SPSS, it’s called df error, in other packages it might be called df residuals.
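A quick R check of the 150-participants example; the Residuals row is what SPSS calls df error:

set.seed(1)
condition <- factor(rep(1:4, length.out = 150))  # four conditions, 150 participants
score     <- rnorm(150)                           # made-up outcome data
summary(aov(score ~ condition))
# condition: Df = 3 (df1 = k - 1); Residuals: Df = 146 (df2 = n - k)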
For the case of within-subjects designs, things can become a bit more complicated. The following paragraphs are work in progress. The calculation of df2 for a repeated measures ANOVA with one within-subjects factor is as follows: df2 = df_total - df_subjects - df_factor, where df_total = number of observations (across all levels of the within-subjects factor, n) - 1, df_subjects = number of participants (N) - 1, and df_factor = number of levels (k) - 1. Basically, the take-home message for repeated measures ANOVA is that you lose one additional degree of freedom for the subjects (if you're interested: this is because the sum of squares representing individual subjects' average deviation from the grand mean is partitioned separately, whereas in between-subjects designs that is not the case. To get to a specific sum of squares for subjects, N - 1 subject means are free to vary, hence you lose one additional degree of freedom).
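In R, this shows up in the Within stratum of a repeated-measures aov; a sketch with hypothetical data (N = 10 subjects, k = 3 conditions):

N <- 10; k <- 3
d <- data.frame(
  subject   = factor(rep(1:N, each = k)),   # each subject measured in every condition
  condition = factor(rep(1:k, times = N)),
  y         = rnorm(N * k)                  # made-up outcome data
)
summary(aov(y ~ condition + Error(subject), data = d))
# Error: subject stratum shows Residuals Df = N - 1 = 9;
# Error: Within stratum shows condition Df = 2 and Residuals
# Df = (30 - 1) - (10 - 1) - (3 - 1) = 18, matching the formula above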
Conclusion
You should be able to calculate df1 and df2 with ease now (or identify them from the output of your statistical package, such as SPSS). Keep in mind that the degrees of freedom you report are those of the effect that you are describing. There is no such thing as one set of degrees of freedom that is appropriate for every effect of your design (although, in some cases, they might seem to have the same value for every effect).
Moreover, although we have been discussing means in this tutorial, for a complete understanding you should learn about sums of squares, how those translate into variance, and how test statistics, such as the F-ratio, work. This will make clear to you how degrees of freedom are used in statistical analysis. The short functional description is that, primarily, degrees of freedom affect which critical values are chosen for test statistics of interest, given a specific alpha level (remember those look-up tables in your early statistics classes?).