The Normal Distribution

[Figure: bell-shaped density f(x) plotted against x, centered at μ with spread σ]

- It is bell-shaped
- Mean = Median = Mode
- It is symmetric around the mean
- The random variable has an infinite theoretical range: $-\infty$ to $+\infty$

If a random variable X has a normal distribution with mean $\mu$ and variance $\sigma^2$, this is written $X \sim N(\mu, \sigma^2)$, and the probability density function is

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

The cumulative distribution function is

$$F(x_0) = P(X \le x_0) = \int_{-\infty}^{x_0} f(x)\,dx$$

[Figure: the area under f(x) to the left of x_0]

The total area under the curve is 1.0, and the curve is symmetric, so half is above the mean and half is below:

$$P(-\infty < X < \mu) = 0.5, \qquad P(\mu < X < \infty) = 0.5, \qquad P(-\infty < X < \infty) = 1.0$$

The Standardized Normal

Any normal distribution can be transformed into the standardized normal distribution (Z), with mean 0 and variance 1:

$$Z = \frac{X - \mu}{\sigma}, \qquad Z \sim N(0, 1)$$

[Figure: density f(Z) of Z ~ N(0,1), centered at 0 with standard deviation 1]

Note that the distribution is the same; only the scale is standardized:

$$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right) = F\left(\frac{b-\mu}{\sigma}\right) - F\left(\frac{a-\mu}{\sigma}\right)$$

[Figure: the interval (a, b) on the X scale corresponds to ((a-μ)/σ, (b-μ)/σ) on the Z scale]

The Standardized Normal Table gives the cumulative probability for any value of z.

Ex: $X \sim N(8, 25) \Rightarrow P(X < 8.6) = ?$

$$Z = \frac{X-\mu}{\sigma} = \frac{8.6 - 8}{5} = 0.12, \qquad P(Z < 0.12) = 0.5478$$

[Figure: the area P(X < 8.6) under the X density (centered at 8) equals the area P(Z < 0.12) under the Z density (centered at 0)]

That is, 54.78% of the values that the random variable X can take lie below 8.6.

For negative Z-values, use the fact that the distribution is symmetric.

Ex: $P(Z < -2.00) = ?$

$$P(Z < -2.00) = P(Z > 2.00) = 1 - P(Z < 2.00) = 1 - 0.9772 = 0.0228$$

[Figure: the lower tail below -2.00 and the upper tail above 2.00 each have area 0.0228; the remaining area on either side is 0.9772]

Ex: Finding the X value for a Known Probability
- If $X \sim N(8, 25)$, which value of X lies above 20% of all the values that X can take?

[Figure: area 0.20 to the left and 0.80 to the right of the unknown X value; the corresponding point on the Z scale is -0.84]

We know from the standard normal table that the Z value in question is $-0.84$.
Hence

$$Z = \frac{X-\mu}{\sigma} \;\Rightarrow\; X = \mu + Z\sigma = 8 + (-0.84)(5) = 3.8$$

Lognormal Distribution

If $X = \ln Y$ is normally distributed with mean $\mu$ and variance $\sigma^2$, that is $\ln Y \sim N(\mu, \sigma^2)$, then Y has a lognormal distribution.

The lognormal distribution is used to model continuous random quantities when the distribution is believed to be skewed, such as certain income and lifetime variables.

The lognormal is skewed to the right ($\ln 100 = 4.6$, $\ln 10 = 2.3$).

DISTRIBUTION OF SAMPLE STATISTICS

Sampling from a Population

Example: Suppose we have a population consisting of the numbers 2, 4, 6, 6, 7, 8. We can select a sample of 3 elements from these numbers; let these elements be 2, 6, 7. The mean of these 3 numbers is 5. The population mean, on the other hand, is 5.5.

If we continue to select samples:

Sample      Mean
2, 6, 7     5
2, 7, 8     5.67
4, 7, 8     6.33
2, 4, 7     4.33

This gives us an idea of how much the means of 3-element samples can vary (4.33, 5, ..., 5.67); this is the distribution of sample means.

Sampling Distribution of Sample Means

Central Limit Theorem: As n becomes large, the distribution of

$$Z = \frac{\bar X - \mu_{\bar X}}{\sigma_{\bar X}} = \frac{\bar X - \mu}{\sigma/\sqrt{n}}$$

approaches the standard normal distribution regardless of the underlying probability distribution. That is, approximately,

$$\bar X \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

The standard deviation of the distribution of $\bar X$ decreases as the sample size n increases.

Law of large numbers: The central limit theorem states that $\bar X \sim N(\mu, \sigma^2/n)$.
Hence, as n becomes large, the variance $\sigma^2/n$ shrinks toward zero and the sample mean, $\bar X$, converges to the population mean, $\mu$.

CONFIDENCE INTERVAL ESTIMATION: ONE POPULATION

- A point estimator of a population parameter is a function of the sample information that yields a single number
- An interval estimator of a population parameter is a rule for determining (based on the sample information) a range, or interval, in which the parameter is likely to fall

Interval Estimation

Let $\theta$ be the parameter of interest, and let a and b be quantities computed from the sample such that

$$P(a < \theta < b) = 1 - \alpha$$

- the quantity $100(1-\alpha)\%$ is called the confidence level of the interval
- the interval from a to b is called the $100(1-\alpha)\%$ confidence interval of $\theta$

Confidence Interval Estimation for the Mean of a Normal Distribution: Population Variance Known

Example: Suppose we draw a sample of n elements from a population with mean $\mu$ and standard deviation $\sigma$, and want to use it to estimate the population mean with an interval.

For instance, when we are interested only in the middle 90% of this distribution, we cut off 5% from each tail. Cutting from the right tail, the Z value of interest is 1.645; cutting from the left tail, it is the symmetric counterpart, $-1.645$.

[Figure: standard normal density with 5% in each tail, cut at -1.645 and 1.645]

The 90% confidence interval can be found as follows:

$$0.90 = P(-1.645 < Z < 1.645)$$
$$= P\left(-1.645 < \frac{\bar x - \mu}{\sigma/\sqrt{n}} < 1.645\right)$$
$$= P\left(-\frac{1.645\,\sigma}{\sqrt{n}} < \bar x - \mu < \frac{1.645\,\sigma}{\sqrt{n}}\right)$$
$$= P\left(\bar x - \frac{1.645\,\sigma}{\sqrt{n}} < \mu < \bar x + \frac{1.645\,\sigma}{\sqrt{n}}\right)$$

Going 1.645 standard deviations of the sample mean ($\sigma/\sqrt{n}$) to the right and to the left of the sample mean gives the 90% confidence interval for the population mean.

When different samples are used, confidence intervals like the following are obtained for $\mu$:

[Figure: confidence intervals computed from different samples]

90% of these confidence intervals will contain $\mu$.

The general form of the confidence interval is

$$\bar x - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar x + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Besides 90%, the most commonly used confidence levels are 95% and 99%. For these, the $\alpha$ values are 5% and 1% respectively, and the z values are

$$z_{\alpha/2} = z_{0.025} = 1.96, \qquad z_{\alpha/2} = z_{0.005} = 2.575$$

Confidence Interval Estimation for the Mean of a Normal Distribution: Population Variance Unknown: The t Distribution

For a random sample from a normal population
with mean $\mu$ and variance $\sigma^2$, the random variable $\bar X$ has a normal distribution with mean $\mu$ and variance $\sigma^2/n$; i.e.

$$Z = \frac{\bar X - \mu}{\sigma/\sqrt{n}}$$

has the standard normal distribution.

But if $\sigma$ is unknown, the sample estimate $s_x$ is usually used instead:

$$t = \frac{\bar X - \mu}{s_x/\sqrt{n}}$$

In this case the random variable t follows the Student's t distribution with $(n-1)$ degrees of freedom.

A random variable having the Student's t distribution with $\nu$ degrees of freedom will be denoted $t_\nu$. Then $t_{\nu,\alpha}$ is the number for which

$$P(t_\nu > t_{\nu,\alpha}) = \alpha$$

[Figure: Student's t density with upper-tail area α to the right of t_{ν,α}]

A $100(1-\alpha)\%$ confidence interval for the population mean, variance unknown, is given by

$$\bar x - t_{n-1,\alpha/2}\frac{s_x}{\sqrt{n}} < \mu < \bar x + t_{n-1,\alpha/2}\frac{s_x}{\sqrt{n}}$$

Example: The fuel economy figures (in miles per gallon) of 6 randomly selected cars are as follows: 18.6, 18.4, 19.2, 20.8, 19.4 and 20.5. If fuel economy in the population from which these cars were selected is normally distributed, find the 90% confidence interval for the population mean.

- Since the population variance is not given, we first compute the sample variance and then use the formula above. For the sample variance:

i       x_i      x_i^2
1       18.6     345.96
2       18.4     338.56
3       19.2     368.64
4       20.8     432.64
5       19.4     376.36
6       20.5     420.25
Sums    116.9    2,282.41

Hence the sample mean is

$$\bar x = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{116.9}{6} \approx 19.48$$

the sample variance is

$$s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar x)^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar x^2}{n-1} = \frac{2282.41 - 6\,(19.4833)^2}{5} \approx 0.96$$

and the sample standard deviation is

$$s_x = \sqrt{0.96} = 0.98$$

The confidence interval we are looking for is

$$\bar x - \frac{t_{n-1,\alpha/2}\, s_x}{\sqrt{n}} < \mu < \bar x + \frac{t_{n-1,\alpha/2}\, s_x}{\sqrt{n}}$$

where $n = 6$ and $\alpha/2 = 0.10/2 = 0.05 \Rightarrow t_{5,0.05} = 2.015$, so

$$19.48 - \frac{2.015 \times 0.98}{\sqrt{6}} < \mu < 19.48 + \frac{2.015 \times 0.98}{\sqrt{6}}$$

and therefore

$$18.67 < \mu < 20.29$$

The results for other confidence levels are as follows:

[Table: confidence intervals at other confidence levels]

HYPOTHESIS TESTING

We test the validity of a claim about a population parameter by using sample data.

- Null Hypothesis: The hypothesis that is maintained unless there is strong evidence against it
- Alternative Hypothesis: The hypothesis that is accepted when the null hypothesis is rejected
- Note: If you do not reject the null hypothesis, it does not mean that you accept it.
You just fail to reject it.

- Simple Hypothesis: A hypothesis that the population parameter, $\theta$, is equal to a specific value, $\theta_0$: $H_0: \theta = \theta_0$
- Composite Hypothesis: A hypothesis that the population parameter lies in a range of values

Hypothesis Test Decisions:
- Type I Error: Rejecting a true null hypothesis
- Type II Error: The failure to reject a false null hypothesis
- Significance Level of a Test: The probability of making a Type I error, often expressed as a percentage and denoted by $\alpha$
- Power of a Test: The probability of not making a Type II error

                       Null is True     Null is False
Reject Null            Type I Error     Correct
Fail to Reject Null    Correct          Type II Error

Type I and Type II errors are inversely related: as one increases, the other decreases (but not one to one).

Tests of the Mean of a Normal Distribution: Population Variance Known

A random sample of n observations was obtained from a normally distributed population with mean $\mu$ and known variance $\sigma^2$. We know that the standardized sample mean

$$Z = \frac{\bar X - \mu}{\sigma/\sqrt{n}}$$

has a standard normal distribution, with mean 0 and variance 1.

A test with significance level $\alpha$ of the null hypothesis

$$H_0: \mu = \mu_0$$

against the alternative

$$H_1: \mu > \mu_0$$

is obtained by using the following decision rule:

$$\text{Reject } H_0 \text{ if: } \frac{\bar x - \mu_0}{\sigma/\sqrt{n}} > z_\alpha$$

or equivalently

$$\bar x > \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}}$$

[Figure: sampling distribution of the sample mean under H_0, with rejection region of area α to the right of the critical value]

In this case $\alpha$ is the significance level of the test (the probability of rejecting a true null hypothesis). If it were a two-sided test with the same critical value, the significance level would have been $2\alpha$. Yet the power of the test (the probability of rejecting a false null hypothesis) is not $1 - 2\alpha$:
- Because if the null hypothesis is wrong, the alternative hypothesis holds, which means the underlying distribution is different

Example: When the production system for a good operates correctly, the product weights have been observed to have a mean of 5 kg and a standard deviation of 0.1 kg, and to follow a normal distribution.
As a result of a change made by the production manager, the mean product weight is intended to increase while the standard deviation stays the same. After this change, a random sample of 16 items was selected, and the mean weight of the products in this sample was found to be 5.038 kg. Test the null hypothesis that the mean product weight in the new population is 5 kg against the alternative hypothesis that it is greater than 5 kg, at the 5% and 10% significance levels.

- We want to test the hypothesis
$$H_0: \mu = 5$$
against the alternative hypothesis
$$H_1: \mu > 5$$
- We can reject $H_0$ in favor of $H_1$ when the following condition holds:
$$\frac{\bar X - \mu_0}{\sigma/\sqrt{n}} > z_\alpha$$
- Given in the problem: $\bar x = 5.038$, $\mu_0 = 5$, $n = 16$, $\sigma = 0.1$; hence
$$\frac{\bar X - \mu_0}{\sigma/\sqrt{n}} = \frac{5.038 - 5}{0.1/\sqrt{16}} = 1.52$$
- If the significance level is 5%, the z value corresponding to 5% in the standard normal table is
$$z_{0.05} = 1.645$$
Since 1.52 is not greater than this number, we cannot reject the null hypothesis at the 5% significance level (fail to reject).
- If the significance level is 10%, the z value corresponding to 10% in the standard normal table is
$$z_{0.1} = 1.28$$
This time 1.52 is greater than this number, so we can reject the null hypothesis at the 10% significance level.

Probability Value (p-value): In the previous example we saw that we could not reject the null hypothesis at the 5% significance level, but could at 10%. Hence it is possible to find the smallest significance level at which the null hypothesis is rejected; this is called the p-value of a test. Formally, if a random sample of n observations was obtained from a normally distributed population with mean $\mu$ and known variance $\sigma^2$, and if the observed sample mean is $\bar x$, the null hypothesis

$$H_0: \mu = \mu_0$$

is tested against the alternative

$$H_1: \mu > \mu_0$$

The p-value of the test is

$$p\text{-value} = P\left(\frac{\bar X - \mu_0}{\sigma/\sqrt{n}} \ge z_p \;\middle|\; H_0: \mu = \mu_0\right)$$

where $z_p = (\bar x - \mu_0)/(\sigma/\sqrt{n})$ is the observed value of the test statistic.

Example: In the previous example we found

$$\frac{\bar X - \mu_0}{\sigma/\sqrt{n}} = \frac{5.038 - 5}{0.1/\sqrt{16}} = 1.52$$

The value of $\alpha$ that satisfies $z_\alpha = 1.52$ can be found from the standard normal table as 0.0643; this is the p-value of the test. It lies between 5% and 10%, which agrees with the two test decisions above.
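The test decision and the p-value for the production example can be checked numerically. The sketch below is only an illustration, not part of the original notes: it assumes Python's standard-library `statistics.NormalDist` in place of the printed standard normal table, and the function name `upper_tail_z_test` is a hypothetical helper introduced here.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(x_bar, mu0, sigma, n):
    """Return (z statistic, p-value) for H0: mu = mu0 against H1: mu > mu0.

    Hypothetical helper; assumes a normal population with known sigma.
    """
    # Standardize the observed sample mean under the null hypothesis.
    z = (x_bar - mu0) / (sigma / sqrt(n))
    # The p-value is the upper-tail area to the right of the observed z.
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value


# Numbers from the production example: sample mean 5.038, mu0 = 5, sigma = 0.1, n = 16.
z, p_value = upper_tail_z_test(x_bar=5.038, mu0=5.0, sigma=0.1, n=16)

# Rejecting when p-value < alpha is equivalent to rejecting when z > z_alpha.
reject_at_5 = p_value < 0.05
reject_at_10 = p_value < 0.10

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"reject at 5%: {reject_at_5}, reject at 10%: {reject_at_10}")
```

Comparing the p-value with $\alpha$ gives the same decisions as comparing 1.52 with $z_{0.05}$ and $z_{0.1}$, so a single number summarizes the test at every significance level.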
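[Figure: the p-value shown as the area to the right of 1.52 under the standard normal density]

Simple Null Against Two-Sided Alternative

To test the null hypothesis

$$H_0: \mu = \mu_0$$

against the alternative

$$H_1: \mu \ne \mu_0$$

at significance level $\alpha$, use the following decision rule:

$$\text{Reject } H_0 \text{ if: } \frac{\bar X - \mu_0}{\sigma/\sqrt{n}} < -z_{\alpha/2} \quad \text{or} \quad \frac{\bar X - \mu_0}{\sigma/\sqrt{n}} > z_{\alpha/2}$$

[Figure: two-sided rejection regions, each of area α/2, in the two tails]

Tests of the Mean of a Normal Distribution: Population Variance Unknown

We are given a random sample of n observations from a normally distributed population with mean $\mu$. Using the sample mean and sample standard deviation, $\bar x$ and $s_x$ respectively, we can use the following tests with significance level $\alpha$:

1. To test the null hypothesis
$$H_0: \mu = \mu_0 \quad \text{or} \quad H_0: \mu \le \mu_0$$
against the alternative
$$H_1: \mu > \mu_0$$
the decision rule is as follows:
$$\text{Reject } H_0 \text{ if: } \frac{\bar x - \mu_0}{s_x/\sqrt{n}} > t_{n-1,\alpha}$$

2. To test the null hypothesis
$$H_0: \mu = \mu_0 \quad \text{or} \quad H_0: \mu \ge \mu_0$$
against the alternative
$$H_1: \mu < \mu_0$$
the decision rule is as follows:
$$\text{Reject } H_0 \text{ if: } \frac{\bar x - \mu_0}{s_x/\sqrt{n}} < -t_{n-1,\alpha}$$

3. To test the null hypothesis
$$H_0: \mu = \mu_0$$
against the alternative
$$H_1: \mu \ne \mu_0$$
the decision rule is as follows:
$$\text{Reject } H_0 \text{ if: } \frac{\bar x - \mu_0}{s_x/\sqrt{n}} < -t_{n-1,\alpha/2} \quad \text{or} \quad \frac{\bar x - \mu_0}{s_x/\sqrt{n}} > t_{n-1,\alpha/2}$$

[Figure: rejection regions of the two-sided t test]

Assessing the Power of a Test

Determining the Probability of Type II Error

Consider the test

$$H_0: \mu = \mu_0$$

against the alternative

$$H_1: \mu > \mu_0$$

using the decision rule

$$\text{Reject } H_0 \text{ if: } \frac{\bar x - \mu_0}{\sigma/\sqrt{n}} > z_\alpha$$

Now suppose the null hypothesis is wrong and the population mean, $\mu$, is in the region of $H_1$. Type II error is the failure to reject a false null hypothesis. Thus, we consider a $\mu = \mu^*$ such that $\mu^* > \mu_0$.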
Then the probability of making a Type II error is

$$\beta = P\left(Z < \frac{\bar x_c - \mu^*}{\sigma/\sqrt{n}}\right), \qquad \bar x_c = \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}}$$

where $\bar x_c$ is the critical value of the sample mean implied by the decision rule. Therefore the Power of a Test (the probability of not making a Type II error) is $1 - \beta$.

Example: In the example given earlier, where a random sample of 16 items was selected, we tested the null hypothesis that the mean product weight is 5 kg against the alternative hypothesis that it is greater than 5 kg, at the 5% significance level.

- We want to test the hypothesis
$$H_0: \mu = 5$$
against the alternative hypothesis
$$H_1: \mu > 5$$
- Given in the problem: $\mu_0 = 5$, $n = 16$, $\sigma = 0.1$, $z_\alpha = z_{0.05} = 1.645$; hence the decision rule for rejecting $H_0$ in favor of $H_1$ is
$$\frac{\bar x - \mu_0}{\sigma/\sqrt{n}} = \frac{\bar x - 5}{0.1/4} > 1.645$$
or
$$\bar x > 1.645\,(0.1/4) + 5 = 5.041$$
This means that when the sample mean is less than 5.041, we will not be able to reject the null hypothesis.
- Suppose the population mean is 5.05 (that is, the alternative hypothesis is true), and let us find the probability of making a Type II error by failing to reject the null hypothesis. This is the probability that the sample mean is less than 5.041 when the population mean is 5.05:
$$P(\bar X \le 5.041) = P\left(Z \le \frac{5.041 - \mu^*}{\sigma/\sqrt{n}}\right) = P\left(Z \le \frac{5.041 - 5.05}{0.1/4}\right) = P(Z \le -0.36) = 1 - 0.64 = 0.36$$
Hence $\beta = 0.36$ and the power of our test is
$$\text{Power} = 1 - \beta = 0.64$$

[Figure: sampling distributions of the sample mean under μ = 5 and μ = 5.05, with β the area to the left of 5.041 under the second curve]
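The Type II error calculation above can be sketched in a few lines of code. This is an illustrative sketch under stated assumptions rather than part of the original notes: it uses Python's standard-library `statistics.NormalDist` instead of the printed table, and `upper_tail_power` is a hypothetical helper name. Because it does not round the critical value to 5.041 or z to -0.36, its results differ from the hand calculation in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_power(mu0, mu_star, sigma, n, alpha):
    """Critical value, beta and power for H0: mu = mu0 against H1: mu > mu0.

    Hypothetical helper; mu_star is the assumed true mean under H1 and
    sigma is the known population standard deviation.
    """
    std_normal = NormalDist()
    std_error = sigma / sqrt(n)
    # z_alpha leaves an upper-tail area of alpha under the standard normal.
    z_alpha = std_normal.inv_cdf(1.0 - alpha)
    # We fail to reject H0 whenever the sample mean is below this value.
    x_crit = mu0 + z_alpha * std_error
    # Type II error: the sample mean falls below x_crit although mu = mu_star.
    beta = std_normal.cdf((x_crit - mu_star) / std_error)
    return x_crit, beta, 1.0 - beta


# Numbers from the example: mu0 = 5, true mean 5.05, sigma = 0.1, n = 16, alpha = 5%.
x_crit, beta, power = upper_tail_power(mu0=5.0, mu_star=5.05, sigma=0.1, n=16, alpha=0.05)
print(f"critical value = {x_crit:.3f}, beta = {beta:.2f}, power = {power:.2f}")

# Power rises as the true mean moves further from mu0.
for mu_star in (5.02, 5.05, 5.08):
    print(mu_star, round(upper_tail_power(5.0, mu_star, 0.1, 16, 0.05)[2], 3))
```

The loop at the end shows why power is reported for a specific alternative value: the same test has a different power at each $\mu^*$ in the region of $H_1$.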