The Normal Distribution
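[Figure: bell-shaped density curve f(x) plotted against x, centered at $\mu$ with spread $\sigma$]
It is bell-shaped
Mean = Median = Mode
It is symmetrical around the mean
The random variable has an infinite theoretical range: $-\infty$ to $+\infty$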
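If random variable X has a normal distribution with mean $\mu$ and variance $\sigma^2$, then it is shown as
$X \sim N(\mu, \sigma^2)$
where the probability density function is
$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$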
The cumulative distribution function is
$F(x_0) = P(X \le x_0) = \int_{-\infty}^{x_0} f(x)\,dx$
[Figure: density f(x) with the area to the left of $x_0$ shaded]
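The density and the cumulative distribution function can be checked numerically. The following is only a sketch: the function names, the values $\mu = 8$, $\sigma = 5$, $x_0 = 8.6$, and the truncation of the lower limit at $\mu - 10\sigma$ are illustrative assumptions, and `statistics.NormalDist` from the Python standard library (3.8+) is used purely as a reference to compare against.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2), written out term by term from the formula."""
    coef = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return coef * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def normal_cdf_numeric(x0, mu, sigma, steps=20_000):
    """Approximate F(x0) with a midpoint Riemann sum of the density.

    Assumption: the lower limit -infinity is replaced by mu - 10*sigma,
    which drops only a negligible part of the tail.
    """
    lower = mu - 10.0 * sigma
    width = (x0 - lower) / steps
    total = sum(normal_pdf(lower + (k + 0.5) * width, mu, sigma) for k in range(steps))
    return total * width


mu, sigma, x0 = 8.0, 5.0, 8.6          # illustrative values
reference = NormalDist(mu, sigma)       # standard-library reference
pdf_gap = abs(normal_pdf(x0, mu, sigma) - reference.pdf(x0))
cdf_gap = abs(normal_cdf_numeric(x0, mu, sigma) - reference.cdf(x0))
print(pdf_gap, cdf_gap)                 # both gaps should be very small
```

The Riemann sum is a deliberately simple choice; any other quadrature rule would serve equally well.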
The total area under the curve is 1.0, and the curve is symmetric, so half is above the mean and half is below:
$P(-\infty < X < \mu) = 0.5$
$P(\mu < X < \infty) = 0.5$
$P(-\infty < X < \infty) = 1.0$
The Standardized Normal Distribution
Any normal distribution can be transformed into the standardized normal distribution (Z), with mean 0 and variance 1:
$Z = \frac{X - \mu}{\sigma}$ and $Z \sim N(0, 1)$
This gives the following:
[Figure: density f(Z) of $Z \sim N(0, 1)$, centered at 0 with standard deviation 1]
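Note that the distribution is the same, only the scale is standardized:
$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right) = F\left(\frac{b-\mu}{\sigma}\right) - F\left(\frac{a-\mu}{\sigma}\right)$
[Figure: the area between a and b under f(x) on the x scale corresponds to the area between $(a-\mu)/\sigma$ and $(b-\mu)/\sigma$ on the Z scale]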
The Standardized Normal Table gives cumulative probability for any value of z
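Ex: $X \sim N(8, 25)$ $\Rightarrow$ $P(X < 8.6) = ?$
$Z = \frac{X - \mu}{\sigma} = \frac{8.6 - 8}{5} = 0.12$, $P(Z < 0.12) = 0.5478$
[Figure: $P(X < 8.6)$ shaded on the X scale ($\mu = 8$) and the equal area $P(Z < 0.12)$ shaded on the Z scale ($\mu = 0$, $\sigma = 1$)]
54.78% of the values that the random variable X can take are below 8.6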
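The table lookup in this example can be reproduced in a few lines. This is a sketch that assumes Python's standard-library `statistics.NormalDist` (3.8+) in place of the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

mu, variance = 8.0, 25.0
sigma = variance ** 0.5                    # standard deviation is 5, not 25

z = (8.6 - mu) / sigma                     # standardize: Z = (X - mu) / sigma
p_via_z = NormalDist().cdf(z)              # P(Z < 0.12) on the standard normal
p_direct = NormalDist(mu, sigma).cdf(8.6)  # P(X < 8.6) without standardizing

print(round(z, 2), round(p_via_z, 4), round(p_direct, 4))
```

Both routes give the same probability, which is the point of the standardization: only the scale changes.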
For negative Z-values, use the fact that the distribution is symmetric
Ex: $P(Z < -2.00) = ?$
$P(Z < -2.00) = P(Z > 2.00) = 1 - P(Z < 2.00) = 1 - 0.9772 = 0.0228$
[Figure: the lower tail below -2.00 and the upper tail above 2.00 each have area .0228; the area below 2.00 is .9772]
Ex: Finding the X value for a Known Probability
–If $X \sim N(8, 25)$, which value of X lies above 20% of all the values that X can take?
[Figure: the lower 20% (.20) and upper 80% (.80) of the distribution; the unknown X value lies below the mean 8.0 and corresponds to Z = -0.84]
From the standard normal table we know that the Z value in question is $-0.84$. Therefore
$Z = \frac{X - \mu}{\sigma} \Rightarrow X = \mu + Z\sigma = 8 + (-0.84)(5) = 3.8$
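The reverse lookup (from a probability back to an X value) can be sketched the same way. The code assumes the standard-library `statistics.NormalDist` and its `inv_cdf` method as a stand-in for reading the table backwards; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 8.0, 5.0

z_20 = NormalDist().inv_cdf(0.20)               # Z value with 20% of the area below it
x_20 = mu + z_20 * sigma                        # undo the standardization: X = mu + Z*sigma
x_direct = NormalDist(mu, sigma).inv_cdf(0.20)  # same quantile taken directly on the X scale

print(round(z_20, 2), round(x_20, 1))
```

The unrounded quantile is slightly below the table value $-0.84$, so the result differs from 3.8 only in the second decimal place.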
Lognormal Distribution
If $X = \ln Y$ is normally distributed, $X \sim N(\mu, \sigma^2)$, then Y has a log-normal distribution
The lognormal distribution is used to model continuous random quantities when the distribution is believed to be skewed, such as certain income and lifetime variables
The lognormal is skewed to the right ($\ln 100 = 4.6$, $\ln 10 = 2.3$)
DISTRIBUTION OF SAMPLE STATISTICS
Sampling from a Population
Example: Suppose we have a population consisting of the numbers 2, 4, 6, 6, 7, 8
We can select a 3-element sample from these numbers. Let these elements be 2, 6, 7.
The mean of these 3 numbers is 5.
On the other hand, the mean of our population is 5.5.
If we continue to select samples:
Sample     Mean
2, 6, 7    5
2, 7, 8    5.67
4, 7, 8    6.33
2, 4, 7    4.33
Here we get an idea of how much the means of 3-element samples can vary (4.33, 5, ..., 5.67) (distribution of sample means)
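Because the population is tiny, every possible 3-element sample can be listed. The sketch below assumes sampling without replacement and treats the two 6s as distinct items; both are assumptions about how the samples were drawn.

```python
from itertools import combinations

population = [2, 4, 6, 6, 7, 8]

# All 3-element samples drawn without replacement (the two 6s count as distinct items).
samples = list(combinations(population, 3))
sample_means = [sum(s) / len(s) for s in samples]

population_mean = sum(population) / len(population)
mean_of_sample_means = sum(sample_means) / len(sample_means)

print(len(samples), min(sample_means), max(sample_means))
print(population_mean, round(mean_of_sample_means, 4))
```

Individual sample means range from 4 to 7, yet their average coincides with the population mean of 5.5.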
Sampling Distribution of Sample Means
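Central Limit Theorem: As n becomes large, the distribution of
$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
approaches the standard normal distribution regardless of the underlying probability distribution. That is, for large n, approximately
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$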
The standard deviation of the distribution of $\bar{X}$ decreases when the sample size, n, increases
Law of large numbers: The central limit theorem states that, for large n, $\bar{X} \sim N(\mu, \sigma^2/n)$.
Hence, as n becomes large, the variance $\sigma^2/n$ shrinks toward zero and the mean of the sample, $\bar{X}$, converges to the population mean, $\mu$.
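A small simulation can make the shrinking spread of $\bar{X}$ visible. This is a sketch under stated assumptions: the population is exponential with mean 1 and standard deviation 1 (chosen only because it is clearly skewed), the seed, sample sizes and number of repetitions are arbitrary, and the function names are hypothetical.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so repeated runs give the same numbers


def sample_mean(n):
    """Mean of n draws from an exponential population (mean 1, std. dev. 1)."""
    return sum(random.expovariate(1.0) for _ in range(n)) / n


def spread_of_sample_means(n, repetitions=5000):
    """Centre and standard deviation of many simulated sample means."""
    means = [sample_mean(n) for _ in range(repetitions)]
    return mean(means), stdev(means)


results = {n: spread_of_sample_means(n) for n in (5, 30, 120)}
for n, (centre, spread) in results.items():
    # last column is the theoretical value sigma / sqrt(n) with sigma = 1
    print(n, round(centre, 3), round(spread, 3), round(1 / n ** 0.5, 3))
```

The simulated standard deviation should track $\sigma/\sqrt{n}$ closely while the centre stays near the population mean.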
CONFIDENCE INTERVAL ESTIMATION: ONE POPULATION
A point estimator of a population parameter is
a function of the sample information that yields
a single number
An interval estimator of a population parameter
is a rule for determining (based on the sample
information) a range, or interval, in which the
parameter is likely to fall
Interval Estimation
Assume $\theta$ is a random variable
$P(a < \theta < b) = 1 - \alpha$
–the quantity $100(1-\alpha)\%$ is called the confidence level of the interval
–the interval from a to b is called the $100(1-\alpha)\%$ confidence interval of $\theta$
Confidence Interval Estimation for the Mean of a Normal Distribution: Population Variance Known
Example: Suppose we draw a sample X of n elements from a population with mean $\mu$ and standard deviation $\sigma$, and we want to find the population mean by interval estimation
For example, when we are interested only in the middle 90% of this distribution, we cut off 5% from each of the two tails
Cutting from the right-hand side, we can find that the Z value of interest is 1.645; cutting from the left-hand side, it is the mirror image of this value, $-1.645$
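The 90% confidence interval can be found as follows
$0.90 = P(-1.645 < Z < 1.645)$
$= P\left(-1.645 < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < 1.645\right)$
$= P\left(-\frac{1.645\,\sigma}{\sqrt{n}} < \bar{x} - \mu < \frac{1.645\,\sigma}{\sqrt{n}}\right)$
$= P\left(\bar{x} - \frac{1.645\,\sigma}{\sqrt{n}} < \mu < \bar{x} + \frac{1.645\,\sigma}{\sqrt{n}}\right)$
Going 1.645 standard deviations of the sample mean ($\sigma/\sqrt{n}$) to the right and to the left of the sample mean gives the 90% confidence interval for the population mean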
When different samples are used, confidence intervals for $\mu$ like the following can be obtained
[Figure: confidence intervals computed from different samples]
90% of these confidence intervals will contain $\mu$
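General form of confidence intervals: replacing 1.645 with $z_{\alpha/2}$ in the 90% interval gives
$\bar{x} - \frac{z_{\alpha/2}\,\sigma}{\sqrt{n}} < \mu < \bar{x} + \frac{z_{\alpha/2}\,\sigma}{\sqrt{n}}$
Besides 90%, the most commonly used confidence levels are 95% and 99%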
For these, the $\alpha$ values are 5% and 1% respectively
and the z values are
$z_{\alpha/2} = z_{0.025} = 1.96$
$z_{\alpha/2} = z_{0.005} = 2.575$
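The critical values quoted for the 90%, 95% and 99% levels, and the resulting interval when the variance is known, can be sketched as follows. The helper names `z_critical` and `z_interval` are hypothetical, the inputs passed to `z_interval` are illustrative numbers only, and `statistics.NormalDist` stands in for the printed table.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Upper alpha/2 point of the standard normal, with alpha = 1 - confidence."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def z_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for the mean when the population sigma is known."""
    half_width = z_critical(confidence) * sigma / n ** 0.5
    return sample_mean - half_width, sample_mean + half_width


for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))

# Illustrative inputs only (not a result stated in the text).
low, high = z_interval(sample_mean=5.038, sigma=0.1, n=16, confidence=0.90)
print(round(low, 3), round(high, 3))
```

The computed 99% value is closer to 2.576 than to the table figure 2.575; the difference is rounding in the printed table.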
Confidence Interval Estimation for the Mean of a Normal Distribution: Population Variance Unknown: The t Distribution
For a random sample from a normal population with mean $\mu$ and variance $\sigma^2$, the random variable $\bar{X}$ has a normal distribution with mean $\mu$ and variance $\sigma^2/n$; i.e.
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
has the standard normal distribution.
But if $\sigma$ is unknown, usually the sample estimate is used:
$t = \frac{\bar{X} - \mu}{s_x/\sqrt{n}}$
In this case the random variable t follows the Student's t distribution with $(n-1)$ degrees of freedom
A random variable having the Student's t distribution with $\nu$ degrees of freedom will be denoted $t_\nu$. Then $t_{\nu,\alpha}$ is the number for which
$P(t_\nu > t_{\nu,\alpha}) = \alpha$
A $100(1-\alpha)\%$ confidence interval for the population mean, variance unknown, is given by
$\bar{x} - t_{n-1,\alpha/2}\frac{s_x}{\sqrt{n}} < \mu < \bar{x} + t_{n-1,\alpha/2}\frac{s_x}{\sqrt{n}}$
Example: The fuel consumption figures, in gallons/mile, of 6 randomly selected cars are as follows: 18.6, 18.4, 19.2, 20.8, 19.4 and 20.5. If the fuel consumption of the cars in the population from which these cars were selected is normally distributed, find the 90% confidence interval for the mean fuel consumption of this population
–Since the population variance is not given, we can first compute the sample variance and then use the preceding formula. For the sample variance:
i       x_i      x_i^2
1       18.6     345.96
2       18.4     338.56
3       19.2     368.64
4       20.8     432.64
5       19.4     376.36
6       20.5     420.25
Sums    116.9    2282.41
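Therefore the sample mean is
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{116.9}{6} \approx 19.48$
the sample variance is
$s_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1} = \frac{2282.41 - 6(19.4833)^2}{5} \approx .96$
and the standard deviation is
$s_x = \sqrt{.96} = .98$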
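The confidence interval we are looking for is
$\bar{x} - \frac{t_{n-1,\alpha/2}\, s_x}{\sqrt{n}} < \mu < \bar{x} + \frac{t_{n-1,\alpha/2}\, s_x}{\sqrt{n}}$
where $n = 6$ and $\alpha/2 = .10/2 = .05 \Rightarrow t_{5,.05} = 2.015$:
$19.48 - \frac{2.015 \times .98}{\sqrt{6}} < \mu < 19.48 + \frac{2.015 \times .98}{\sqrt{6}}$
therefore
$18.67 < \mu < 20.29$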
The results for different confidence levels are as follows
[Figure: confidence intervals for this example at different confidence levels]
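The arithmetic of this example can be re-run as a check. This is a sketch: the critical value 2.015 is copied from the text's table lookup rather than computed, since the Python standard library has no t-distribution quantile function, and the variable names are illustrative.

```python
from statistics import mean, stdev

consumption = [18.6, 18.4, 19.2, 20.8, 19.4, 20.5]
t_5_05 = 2.015   # t_{5,.05}, taken from the text's table lookup, not computed here

n = len(consumption)
x_bar = mean(consumption)
s_x = stdev(consumption)                 # sample standard deviation (divides by n - 1)
half_width = t_5_05 * s_x / n ** 0.5

lower, upper = x_bar - half_width, x_bar + half_width
print(round(x_bar, 2), round(s_x, 2), round(lower, 2), round(upper, 2))
```

Carrying full precision instead of the rounded 19.48 and .98 can move the lower bound by about one unit in the second decimal place.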
HYPOTHESIS TESTING
We test the validity of a claim about a population parameter by using sample data
Null Hypothesis: The hypothesis that is maintained unless there is strong evidence against it
Alternative Hypothesis: The hypothesis that is
accepted when the null hypothesis is rejected
–Note: If you do not reject the null hypothesis,
it does not mean that you accept it. You just
fail to reject it
Simple Hypothesis: A hypothesis that the population parameter, $\theta$, is equal to a specific value, $\theta_0$
$H_0: \theta = \theta_0$
Composite Hypothesis: A hypothesis that the population parameter is equal to a range of values
Hypothesis Test Decisions:
–Type I Error: Rejecting a true null hypothesis
–Type II Error: The failure to reject a false null
hypothesis
–Significance Level of a Test: The probability of making a Type I error, which is often expressed as a percentage and denoted by $\alpha$
–Power of a Test: The probability of not making a Type II error
                       Null is True    Null is False
Reject Null            Type I Error    Correct
Fail to Reject Null    Correct         Type II Error
Type I and Type II errors are inversely related: As one increases, the other decreases (but not one to one)
Tests of the Mean of a Normal Distribution: Population Variance Known
A random sample of n observations was obtained from a normally distributed population with mean $\mu$ and known variance $\sigma^2$. We know that the standardized sample mean
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
has a standard normal distribution, with mean 0 and variance 1
A test with significance level $\alpha$ of the null hypothesis
$H_0: \mu = \mu_0$
against the alternative
$H_1: \mu > \mu_0$
is obtained by using the following decision rule
Reject $H_0$ if: $\frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha$
or equivalently $\bar{x} > \mu_0 + z_\alpha\,\sigma/\sqrt{n}$
Shown in a figure:
[Figure: illustration of the decision rule]
In this case $\alpha$ is the significance level of the test (the probability of rejecting a true null hypothesis)
If it were a two-sided test with the same critical value, the significance level of the test would have been $2\alpha$
Yet the power of the test (the probability of rejecting a false null hypothesis) is not $1 - 2\alpha$:
–Because, if the null hypothesis is wrong, then the alternative hypothesis holds. It means the underlying distribution is different
Example: When the production system for a product operates correctly, the weights of the products have been observed to have a mean of 5 kg and a standard deviation of 0.1 kg, and to follow a normal distribution. A change made by the production manager is intended to increase the mean product weight while leaving the standard deviation unchanged. After this change, a random sample of 16 items is selected, and the mean weight of the products in this sample is found to be 5.038 kg. Test the null hypothesis that the mean product weight in the new population is 5 kg against the alternative hypothesis that it is greater than 5 kg, at the 5% and 10% significance levels
–We want to test the hypothesis
$H_0: \mu = 5$
against the alternative hypothesis
$H_1: \mu > 5$
–We can reject $H_0$ in favor of $H_1$ when the following condition holds
$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha$
–Given in the problem: $\bar{x} = 5.038$, $\mu_0 = 5$, $n = 16$, $\sigma = .1$; therefore
$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{5.038 - 5}{.1/\sqrt{16}} = 1.52$
–If the significance level is 5%, the z value corresponding to 5% in the standard normal table is
$z_{0.05} = 1.645$
therefore, since 1.52 is not greater than this number, we cannot reject the null hypothesis at the 5% significance level (fail to reject)
–If the significance level is 10%, the z value corresponding to 10% in the standard normal table is
$z_{0.1} = 1.28$
this time, since 1.52 is greater than this number, we can reject the null hypothesis at the 10% significance level
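The two decisions in this example can be reproduced with a short sketch. The function name `upper_tail_z_test` is hypothetical, and `statistics.NormalDist().inv_cdf` is assumed as a replacement for the table lookup of $z_\alpha$.

```python
from statistics import NormalDist


def upper_tail_z_test(x_bar, mu_0, sigma, n, alpha):
    """Test H0: mu = mu_0 against H1: mu > mu_0 with known sigma.

    Returns the test statistic, the critical value z_alpha and the decision.
    """
    z_stat = (x_bar - mu_0) / (sigma / n ** 0.5)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return z_stat, z_alpha, z_stat > z_alpha


decisions = {}
for alpha in (0.05, 0.10):
    z_stat, z_alpha, reject = upper_tail_z_test(5.038, 5.0, 0.1, 16, alpha)
    decisions[alpha] = reject
    print(alpha, round(z_stat, 2), round(z_alpha, 3), reject)
```

The statistic 1.52 falls between the two critical values, which is why the decision flips between the 5% and 10% levels.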
Probability Value (p-value): In the previous example we saw that we could not reject the null hypothesis at the 5% significance level, but could at 10%. Hence it is possible to find the smallest significance level at which the null hypothesis is rejected; this is called the p-value of the test. Formally, if a random sample of n observations was obtained from a normally distributed population with mean $\mu$ and known variance $\sigma^2$, and if the observed sample mean is $\bar{x}$, the null hypothesis
$H_0: \mu = \mu_0$
is tested against the alternative
$H_1: \mu > \mu_0$
The p-value of the test is
$p\text{-value} = P\left(\frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \ge z_p \,\middle|\, H_0: \mu = \mu_0\right)$
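Example: In the previous example we found
$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{5.038 - 5}{.1/\sqrt{16}} = 1.52$
The value of $\alpha$ that satisfies this equality can be found from the standard normal table: $P(Z > 1.52) = 1 - 0.9357 = 0.0643$, which is the p-value of the test. Illustrated in a figure:
[Figure: illustration of the p-value of the test]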
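The p-value for this example is the upper-tail area beyond the observed statistic, which can be checked directly. This is a sketch assuming `statistics.NormalDist` in place of the table; the variable names are illustrative.

```python
from statistics import NormalDist

z_stat = (5.038 - 5.0) / (0.1 / 16 ** 0.5)   # observed test statistic, 1.52
p_value = 1.0 - NormalDist().cdf(z_stat)     # upper-tail area P(Z > z_stat)

print(round(z_stat, 2), round(p_value, 4))
```

A p-value between 0.05 and 0.10 agrees with the earlier decisions: reject at the 10% level, fail to reject at the 5% level.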
Simple Null Against Two-Sided Alternative
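To test the null hypothesis
$H_0: \mu = \mu_0$
against the alternative
$H_1: \mu \ne \mu_0$
at significance level $\alpha$, use the following decision rule
Reject $H_0$ if: $\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < -z_{\alpha/2}$ or $\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > z_{\alpha/2}$
Illustrated in a figure:
[Figure: illustration of the two-sided decision rule]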
Tests of the Mean of a Normal Distribution: Population Variance Unknown
We are given a random sample of n observations obtained from a normally distributed population with mean $\mu$. Using the sample mean and sample standard deviation, $\bar{x}$ and $s_x$ respectively, we can use the following tests with significance level $\alpha$
1. To test the null hypothesis
$H_0: \mu = \mu_0$ or $H_0: \mu \le \mu_0$
against the alternative
$H_1: \mu > \mu_0$
the decision rule is as follows
Reject $H_0$ if: $\frac{\bar{x} - \mu_0}{s_x/\sqrt{n}} > t_{n-1,\alpha}$
2. To test the null hypothesis
$H_0: \mu = \mu_0$ or $H_0: \mu \ge \mu_0$
against the alternative
$H_1: \mu < \mu_0$
the decision rule is as follows
Reject $H_0$ if: $\frac{\bar{x} - \mu_0}{s_x/\sqrt{n}} < -t_{n-1,\alpha}$
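3. To test the null hypothesis
$H_0: \mu = \mu_0$
against the alternative
$H_1: \mu \ne \mu_0$
the decision rule is as follows
Reject $H_0$ if: $\frac{\bar{x} - \mu_0}{s_x/\sqrt{n}} < -t_{n-1,\alpha/2}$ or $\frac{\bar{x} - \mu_0}{s_x/\sqrt{n}} > t_{n-1,\alpha/2}$
Illustrated in a figure:
[Figure: illustration of the two-sided t test decision rule]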
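The three decision rules differ only in which tail is compared with the critical value, as the sketch below shows. Everything specific in it is an assumption made for illustration: the function names, the reuse of the six fuel-consumption readings as the sample, the hypothesized mean $\mu_0 = 19$, and the choice of a two-sided test at the 10% level. The critical value $t_{5,.05} = 2.015$ is the table value quoted earlier in the text and is passed in rather than computed.

```python
from statistics import mean, stdev


def t_statistic(sample, mu_0):
    """(x_bar - mu_0) / (s_x / sqrt(n)) for a sample with unknown variance."""
    n = len(sample)
    return (mean(sample) - mu_0) / (stdev(sample) / n ** 0.5)


def reject_null(t_stat, t_critical, alternative):
    """Apply decision rule 1 ("greater"), 2 ("less") or 3 ("two-sided").

    t_critical must be t_{n-1,alpha} for one-sided tests and
    t_{n-1,alpha/2} for the two-sided test, read from a t table.
    """
    if alternative == "greater":
        return t_stat > t_critical
    if alternative == "less":
        return t_stat < -t_critical
    if alternative == "two-sided":
        return t_stat < -t_critical or t_stat > t_critical
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


sample = [18.6, 18.4, 19.2, 20.8, 19.4, 20.5]
mu_0 = 19.0                      # hypothetical null value, chosen for illustration
t_stat = t_statistic(sample, mu_0)
decision = reject_null(t_stat, t_critical=2.015, alternative="two-sided")
print(round(t_stat, 3), decision)
```

With these illustrative inputs the null is not rejected, consistent with 19 lying inside the 90% confidence interval computed for the same data.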
Assessing the Power of a Test
Determining the Probability of Type II Error
Consider the test
$H_0: \mu = \mu_0$
against the alternative
$H_1: \mu > \mu_0$
using the decision rule
Reject $H_0$ if: $\frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha$
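Now suppose the null hypothesis is wrong and the population mean, $\mu$, is in the region of $H_1$. Type II error is the failure to reject a false null hypothesis. Thus, we consider a $\mu = \mu^*$ such that $\mu^* > \mu_0$. Writing $\bar{x}_c = \mu_0 + z_\alpha\,\sigma/\sqrt{n}$ for the critical value of the decision rule, the probability of making a Type II error is
$\beta = P\left(Z < \frac{\bar{x}_c - \mu^*}{\sigma/\sqrt{n}}\right)$
and therefore the Power of a Test (the probability of not making a Type II error) is
$1 - \beta$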
Example: In the example given earlier, with a random sample of 16 items, we tested the null hypothesis that the mean product weight is 5 kg against the alternative hypothesis that it is greater than 5 kg, at the 5% significance level
–We want to test the hypothesis
$H_0: \mu = 5$
against the alternative hypothesis
$H_1: \mu > 5$
–Given in the problem: $\mu_0 = 5$, $n = 16$, $\sigma = .1$, $z_\alpha = z_{.05} = 1.645$; therefore the decision rule for rejecting $H_0$ in favor of $H_1$ is
$\frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{\bar{x} - 5}{.1/4} > 1.645$
or
$\bar{x} > 1.645\,(.1/4) + 5 = 5.041$
which means that when the sample mean is less than 5.041 we will not be able to reject our null hypothesis
–Suppose the population mean is 5.05 (that is, the alternative hypothesis is true), and let us find the probability of making a Type II error by not rejecting our null hypothesis. That is, the probability that the sample mean is less than 5.041 when the population mean is $\mu = 5.05$:
$P(\bar{X} \le 5.041) = P\left(Z \le \frac{5.041 - \mu}{\sigma/\sqrt{n}}\right) = P\left(Z \le \frac{5.041 - 5.05}{.1/4}\right) = P(Z \le -.36) = 1 - .64 = 0.36$
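therefore the power of our test is
$\text{Power} = 1 - \beta = .64$
Illustrated in a figure:
[Figure: illustration of the Type II error probability and the power of the test]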
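The Type II error probability and the power in this example can be recomputed in one pass. This is a sketch: the function name `power_upper_tail` is hypothetical, and `statistics.NormalDist` is assumed in place of the table lookups.

```python
from statistics import NormalDist


def power_upper_tail(mu_0, mu_true, sigma, n, alpha):
    """Power of the test H0: mu = mu_0 vs H1: mu > mu_0 when the true mean is mu_true.

    Returns the critical sample mean, beta (Type II error probability) and power.
    """
    std_error = sigma / n ** 0.5
    x_crit = mu_0 + NormalDist().inv_cdf(1.0 - alpha) * std_error  # reject H0 above this
    beta = NormalDist().cdf((x_crit - mu_true) / std_error)        # P(fail to reject | mu_true)
    return x_crit, beta, 1.0 - beta


x_crit, beta, power = power_upper_tail(mu_0=5.0, mu_true=5.05, sigma=0.1, n=16, alpha=0.05)
print(round(x_crit, 3), round(beta, 2), round(power, 2))
```

Calling the function with other values of `mu_true` traces out how the power rises as the true mean moves further above 5.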