Statistics 226, Winter 1997

Final Exam

Outline of solutions


A series of numbers is given in brackets for each problem; these are the point values assigned to the different parts of the problem. Thus, 1. [8/4/8] would mean that problem 1 was assigned 20 points: 8 for part (a), 4 for part (b) and 8 for part (c).

This is not meant to be a full solution set, but rather an indication of some of the important aspects of the problems, or an indication of the direction for a solution.

The exams were graded with students' names concealed.

Problem I.
   1-7. [2 pts each]   7612435 or cdfegba

   8. [5]
	   (ICD) -> (ID,IC,DC) -> (IC,DC) -> (IC,D) -> (I,D,C)
	   [Also accepted: ... (IC,D) -> (IC) -> (I,D,C), although technically 
	   this would be true only for the model (IC) fit to uncollapsed data.]

   9. [3]  loglin count irritate conc drug, fit(irritate conc, drug)

  10. [3/3/3/3] 
      a) 5.046 / 5 / p > 0.25
      b) drug is jointly independent of concentration and score
      c) [1.6] fits as well and is simpler
      d) Irritation level is affected similarly by the two drugs, although
         higher concentrations are strongly associated with higher irritation.
           Note: this ignores the information in the order of irritation scores.
           Note: points were deducted for serious deviations from the "one 
                 brief sentence" requirement. 
  11. [3/3/3/3]
      a) 0.045 / 2 / approx 1
      b) At each fixed concentration, drug and irritation exhibit uniform 
         association. (The conditional statement is important here.)
      c) [1.13] fits as well and is simpler
      d) At each concentration, the association of irritation and drug (as 
		 measured by the local odds ratio) is equal at high, medium, and 
		 low levels of irritation.		 
		   Note: points were deducted for serious deviations from 
		         the "one brief sentence" requirement.
  12. [4/3/3]
      a) Models [1.16], [1.18], and [1.19] all show moderate       
		 evidence for an effect (p=0.035).  The estimated log-odds 
		 parameter is fairly constant in these models (about 0.82).  Model 
		 [1.14] does not involve drug.  The drug coefficient in model 
		 [1.20] cannot be directly interpreted, as the model also contains 
		 a higher-order interaction involving drug.
      b) The positive coefficient indicates that Drug 2 causes higher 
         levels of irritation.
      c) Given concentration level, the odds of being above any given 
         irritation level are about 2.3 times larger for drug 2 than 
         for drug 1.  [2.3 = exp(0.82)]
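
The tail probabilities quoted in items 10(a) and 11(a) of Problem I can be checked without tables. The following is a minimal Python sketch, not part of the original solutions. The helper name chi2_sf is hypothetical; it uses the closed-form upper tail of the chi-squared distribution, which is available when the degrees of freedom are a positive integer.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-squared with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction sum
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for k in range((df - 1) // 2):
        total += term
        term *= x / (2 * k + 3)
    return total


# G-squared and df as quoted in items 10(a) and 11(a)
p_10a = chi2_sf(5.046, 5)
p_11a = chi2_sf(0.045, 2)
print(f"10(a): p = {p_10a:.3f}")
print(f"11(a): p = {p_11a:.3f}")
```

With these inputs the first p-value falls well above 0.25 and the second is close to 1, in line with the entries above.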

Problem II.
   1. [10]
      independence, row effects, column effects, uniform association, 
   2. [5]
      Many residuals > 2 in magnitude; patterns of signs on residuals; 
      strong positive residuals in upper left, lower right corners with 
      strong negative residuals in the other corners. 
   3. [6]
      Chi-squared should be about 47 with 15 df, p less than 0.001.  [LRT 
      and Pearson chi-squared are nearly equivalent, especially in 
      moderately sized samples.]  This gives strong evidence against independence.
   4. [5] 
      The local log odds ratio (for adjacent categories of SES and of 
      mental health) is approximately 0.0907.  This means the odds ratio 
      in any 2x2 subtable of adjacent rows and columns is approximately 
      exp(0.0907) = 1.095.

   5. [6]
      Given uniform spacing of PSES, these represent optimal scores for 
      mental health status.  The scores in model [2.8] are proportional 
      to (0, 1, 1, 2).  This suggests that mild and moderate symptoms can't 
      be easily distinguished for these purposes.
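
The uniform-association reading in item 4 of Problem II can be illustrated numerically. This is a rough Python sketch under stated assumptions: the function name odds_ratio is hypothetical, the scores are taken to be unit-spaced integers, and the 6 x 4 table layout is an assumption that is merely consistent with the 15 df quoted in item 3.

```python
import math

BETA = 0.0907  # uniform-association parameter quoted in item 4


def odds_ratio(row_gap, col_gap, beta=BETA):
    """Odds ratio for the 2x2 subtable whose rows are row_gap categories
    apart and whose columns are col_gap categories apart, assuming
    unit-spaced integer scores on both margins."""
    return math.exp(beta * row_gap * col_gap)


# Adjacent rows and adjacent columns: the local odds ratio
local_or = odds_ratio(1, 1)

# Assumed 6 x 4 layout: the corner cells are 5 row steps and
# 3 column steps apart, so the log odds ratio is 15 * beta
corner_or = odds_ratio(5, 3)

print(round(local_or, 3), round(corner_or, 2))
```

Under these assumptions a local odds ratio close to 1 still compounds into a substantial odds ratio between the extreme categories.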

Problem III.
   1. [6]  
      Each factor (location, gender, seatbelt) contributes separately to 
      risk.  The effect on the odds of injury is multiplicative. 

   2. [6]
      Using a seatbelt decreases the odds of injury by a factor of exp(-0.817) = 0.442
      Rural accidents have increased odds of injury, by a factor of exp(0.758) = 2.134
      Males have decreased odds of injury, by a factor of exp(-0.545) = 0.580
   3. [4]
      [3.13]:  (SLG, IS, IL, IG)
      [3.14]:  (SL, SG, IS, IL, IG, LG)

   4. [3] (a)
   5. [4]
      [3.13] contains a three-factor interaction missing from [3.14]
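
The comparison in item 5 can be checked mechanically by expanding each generating class into the full set of terms implied by hierarchy. The following Python sketch is illustrative only; the function name hierarchical_terms is hypothetical, and the letters are read as S = seatbelt, L = location, G = gender, I = injury, which is an inference from item 1.

```python
from itertools import combinations


def hierarchical_terms(generators):
    """Expand a generating class such as ["SLG", "IS"] into every term
    implied by hierarchy (all non-empty subsets of each generator).
    Letters within a term are sorted, so SLG is reported as GLS."""
    terms = set()
    for gen in generators:
        for size in range(1, len(gen) + 1):
            for combo in combinations(sorted(gen), size):
                terms.add("".join(combo))
    return terms


model_313 = hierarchical_terms(["SLG", "IS", "IL", "IG"])
model_314 = hierarchical_terms(["SL", "SG", "IS", "IL", "IG", "LG"])

# Terms present in one model but not the other
print(sorted(model_313 - model_314))
print(sorted(model_314 - model_313))
```

The only term that separates the two expansions is the three-factor interaction among S, L and G, so [3.14] is nested in [3.13].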

   6. [4]
      [3.14] exhibits statistically significant lack of fit (see 
      G-squared), while [3.13] does not.
   7. [6] 
      2.264  2.255   =   exp(0.817)   exp(0.813)
      2.134  2.128   =   exp(0.758)   exp(0.755)
      0.580  0.580   =   exp(-0.544)  exp(-0.545)
   8. [4]
      The three-factor interaction changes the parameter estimates very 
      little, but the differences attain statistical significance because 
      of the large sample size (n=68694).
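
The multiplicative interpretation in items 1 and 2 of Problem III, and the reciprocal form used in item 7, can be sketched in a few lines of Python. The coefficients are those quoted in item 2; the dictionary keys and variable names are hypothetical labels chosen for illustration.

```python
import math

# Logit-scale coefficients as quoted in item 2
coef = {"seatbelt": -0.817, "rural": 0.758, "male": -0.545}

# Each coefficient becomes a multiplier on the odds of injury
factors = {name: math.exp(b) for name, b in coef.items()}

# Additive on the log-odds scale means multiplicative on the odds scale:
# a belted male in a rural accident, relative to the reference group
combined = factors["seatbelt"] * factors["rural"] * factors["male"]

# Reversing the comparison inverts the factor, e.g. no seatbelt vs seatbelt
no_belt = 1.0 / factors["seatbelt"]

for name, f in factors.items():
    print(f"{name}: {f:.3f}")
print(f"combined: {combined:.3f}, no seatbelt: {no_belt:.3f}")
```

The combined factor equals exp of the summed coefficients, which is what "each factor contributes separately" means on the odds scale.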