Confidence Intervals for Sample Size Less Than 30

In the preceding discussion we have been using σ, the population standard deviation, to compute the standard error. However, we don't really know the population standard deviation, since we are working from samples. To get around this, we have been using the sample standard deviation (s) as an estimate. This is not a problem if the sample size is 30 or greater because of the central limit theorem. However, if the sample is small (<30), we have to adjust and use a t-value instead of a Z score in order to account for the smaller sample size and the use of the sample SD.

Therefore, if n<30, use the appropriate t score instead of a z score, and note that the t-value will depend on the degrees of freedom (df) as a reflection of sample size. When using the t-distribution to compute a confidence interval, df = n-1.

Calculation of a 95% confidence interval when n<30 will then use the appropriate t-value in place of Z in the formula:

x̄ ± t × (s / √n)

The T-distribution

One way to think about the t-distribution is that it is actually a large family of distributions that are similar in shape to the standard normal distribution, but adjusted to account for smaller sample sizes. A t-distribution for a small sample size would look like a squashed-down version of the standard normal distribution, but as the sample size increases the t-distribution gets closer and closer to the standard normal distribution.
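The convergence described above can be checked numerically. The following is a minimal sketch in R, using the built-in functions qt() and qnorm(); the particular df values are arbitrary choices made for illustration.

```r
# Critical values for a 95% confidence interval (upper tail area = 0.025)
df <- c(2, 5, 10, 20, 30, 100, 1000)   # illustrative degrees of freedom
t_crit <- qt(0.975, df)                # t-distribution, one value per df
z_crit <- qnorm(0.975)                 # standard normal, about 1.96

# Side-by-side comparison: the gap shrinks as df grows
comparison <- data.frame(df = df,
                         t = round(t_crit, 3),
                         z = round(z_crit, 3),
                         difference = round(t_crit - z_crit, 3))
print(comparison)
```

The t column should always sit above the z column, with the difference shrinking toward zero as df increases.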

The table below shows a portion of the table for the t-distribution. Notice that sample size is represented by the "degrees of freedom" in the first column. For determining the confidence interval, df=n-1. Notice also that this table is set up quite differently from the table of Z scores. Here, only five levels of probability are shown in the column titles, whereas in the table of Z scores, the probabilities were in the interior of the table. Consequently, the levels of probability are much more limited here, because t-values depend on the degrees of freedom, which are listed in the rows.

| Confidence Level        | 80%   | 90%   | 95%   | 98%   | 99%   |
|-------------------------|-------|-------|-------|-------|-------|
| Two-sided test p-values | .20   | .10   | .05   | .02   | .01   |
| One-sided test p-values | .10   | .05   | .025  | .01   | .005  |
| Degrees of Freedom (df) |       |       |       |       |       |
| 1                       | 3.078 | 6.314 | 12.71 | 31.82 | 63.66 |
| 2                       | 1.886 | 2.920 | 4.303 | 6.965 | 9.925 |
| 3                       | 1.638 | 2.353 | 3.182 | 4.541 | 5.841 |
| 4                       | 1.533 | 2.132 | 2.776 | 3.747 | 4.604 |
| 5                       | 1.476 | 2.015 | 2.571 | 3.365 | 4.032 |
| 6                       | 1.440 | 1.943 | 2.447 | 3.143 | 3.707 |
| 7                       | 1.415 | 1.895 | 2.365 | 2.998 | 3.499 |
| 8                       | 1.397 | 1.860 | 2.306 | 2.896 | 3.355 |
| 9                       | 1.383 | 1.833 | 2.262 | 2.821 | 3.250 |
| 10                      | 1.372 | 1.812 | 2.228 | 2.764 | 3.169 |
| 11                      | 1.362 | 1.796 | 2.201 | 2.718 | 3.106 |
| 12                      | 1.356 | 1.782 | 2.179 | 2.681 | 3.055 |
| 13                      | 1.350 | 1.771 | 2.160 | 2.650 | 3.012 |
| 14                      | 1.345 | 1.761 | 2.145 | 2.624 | 2.977 |
| 15                      | 1.341 | 1.753 | 2.131 | 2.602 | 2.947 |
| 16                      | 1.337 | 1.746 | 2.120 | 2.583 | 2.921 |
| 17                      | 1.333 | 1.740 | 2.110 | 2.567 | 2.898 |
| 18                      | 1.330 | 1.734 | 2.101 | 2.552 | 2.878 |
| 19                      | 1.328 | 1.729 | 2.093 | 2.539 | 2.861 |
| 20                      | 1.325 | 1.725 | 2.086 | 2.528 | 2.845 |

Notice that the value of t is larger for smaller sample sizes (i.e., lower df). When we use "t" instead of "Z" in the equation for the confidence interval, it will result in a larger margin of error and a wider confidence interval, reflecting the smaller sample size.
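To make the effect on the margin of error concrete, here is a small sketch in R comparing the t-based and Z-based margins for a few small samples. The standard deviation and the sample sizes are hypothetical values chosen only for illustration; they do not come from any data set in this module.

```r
s <- 4                       # hypothetical sample standard deviation
n <- c(5, 10, 15, 20)        # hypothetical small sample sizes
se <- s / sqrt(n)            # standard error of the mean

moe_t <- qt(0.975, df = n - 1) * se   # 95% margin of error using t
moe_z <- qnorm(0.975) * se            # 95% margin of error using Z

print(data.frame(n = n,
                 moe_t = round(moe_t, 3),
                 moe_z = round(moe_z, 3),
                 ratio = round(moe_t / moe_z, 3)))
```

The ratio column shows how much wider the t-based interval is; it should be largest for the smallest sample and move toward 1 as n grows.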

With an infinitely large sample size the t-distribution and the standard normal distribution will be the same, and for samples greater than 30 they will be similar, but the t-distribution will be somewhat more conservative. Consequently, one can always use a t-distribution instead of the standard normal distribution. However, when you want to compute a 95% confidence interval for an estimate from a large sample, it is easier to just use Z=1.96.

Because the t-distribution is, if anything, more conservative, R relies heavily on the t-distribution.

Test Yourself

Problem #1

Using the table above, what is the critical t score for a 95% confidence interval if the sample size (n) is 11?

Answer

Problem #2

A sample of n=10 patients free of diabetes have their body mass index (BMI) measured. The mean is 27.26 with a standard deviation of 2.10. Generate a 90% confidence interval for the mean BMI among patients free of diabetes.

Link to Answer in a Word file

Confidence Intervals for a Mean Using R

Instead of using the table, you can use R to generate t-values. For example, to generate t-values for calculating a 95% confidence interval, use the function qt(1-tail area,df).
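The relationship between a confidence level and the tail area passed to qt() can be sketched as follows. The helper name t_critical is hypothetical (it is not a built-in R function); it simply splits the leftover probability evenly between the two tails.

```r
# Hypothetical helper: two-sided critical t-value for a given confidence level
t_critical <- function(conf_level, df) {
  tail_area <- (1 - conf_level) / 2   # area in each tail
  qt(1 - tail_area, df)
}

# The five confidence levels used as column titles in the t-table
conf_levels <- c(0.80, 0.90, 0.95, 0.98, 0.99)

# Critical values for df = 20, one per confidence level
round(t_critical(conf_levels, df = 20), 3)
```

The values returned should agree with the df = 20 row of the t-table shown earlier.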

For example, if the sample size is 15, then df=14, and we can calculate the t-score for the lower and upper tails of the 95% confidence interval in R:

> qt(0.025,14)
[1] -2.144787
> qt(0.975,14)
[1] 2.144787

Then, to compute the 95% confidence interval we could plug t=2.144787 into the equation:

x̄ ± t × (s / √n)
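As a minimal sketch of that calculation, the function below computes a t-based confidence interval from summary statistics. The function name ci_mean_t and the example mean and standard deviation are hypothetical, chosen only to illustrate the arithmetic for a sample of 15.

```r
# Hypothetical helper: t-based confidence interval from summary statistics
ci_mean_t <- function(xbar, s, n, conf_level = 0.95) {
  df <- n - 1                                   # degrees of freedom
  t_crit <- qt(1 - (1 - conf_level) / 2, df)    # critical t-value
  margin <- t_crit * s / sqrt(n)                # margin of error
  c(lower = xbar - margin, upper = xbar + margin)
}

# Hypothetical summary statistics for a sample of 15 (df = 14)
ci_mean_t(xbar = 50, s = 8, n = 15)
```

With these made-up inputs the margin of error is 2.144787 × 8 / √15, or about 4.43, so the interval runs from roughly 45.57 to 54.43.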

Confidence Intervals from Raw Data Using R

It is also easy to compute the point estimate and 95% confidence interval from a raw data set using the "t.test" function in R. For example, in the data set from the Weymouth Health Survey I could compute the mean and 95% confidence interval for BMI as follows. First, I would load the data set and give it a short nickname. Then I would attach the data set and use the following command:

> t.test(bmi)

The output would look like this:

One Sample t-test

data:  bmi
t = 228.5395, df = 3231, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
26.66357 27.12504

sample estimates:
mean of x
26.8943

R defaults to computing a 95% confidence interval, but you can specify the confidence level as follows:

> t.test(bmi,conf.level=.90)

This would compute a 90% confidence interval.
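The pieces of the t.test output can also be pulled out individually and checked against the formula. The sketch below uses simulated values, not the Weymouth Health Survey data; the variable name bmi_sim and the simulation settings are assumptions made purely for illustration.

```r
set.seed(1)                               # make the simulated data reproducible
bmi_sim <- rnorm(25, mean = 27, sd = 4)   # hypothetical BMI values, not survey data

res <- t.test(bmi_sim, conf.level = 0.90)
res$estimate    # sample mean (the point estimate)
res$conf.int    # 90% confidence interval

# The same interval computed by hand: mean +/- t * s / sqrt(n)
n <- length(bmi_sim)
manual <- mean(bmi_sim) + c(-1, 1) * qt(0.95, n - 1) * sd(bmi_sim) / sqrt(n)
all.equal(as.numeric(res$conf.int), manual)
```

Because a 90% interval leaves 5% in each tail, the hand calculation uses qt(0.95, n - 1), and the final comparison should report that the two intervals agree.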

Test Yourself

Lozoff and colleagues compared developmental outcomes in children who had been anemic in infancy to those in children who had not been anemic. Some of the data are shown in the table below.

| Mean ± SD         | Anemia in Infancy (n=30) | Non-anemic in Infancy (n=133) |
|-------------------|--------------------------|-------------------------------|
| Gross Motor Score | 52.4 ± 14.3              | 58.7 ± 12.5                   |
| Verbal IQ         | 101.4 ± 13.2             | 102.9 ± 12.4                  |

Source: Lozoff et al.: Long-term Developmental Outcome of Infants with Iron Deficiency, NEJM, 1991

Compute the 95% confidence interval for verbal IQ using the t-distribution

Link to the Answer in a Word file