Quiz 1: September 19

Summary:

Draw a Paik-Agresti diagram
Find the conditional and marginal effects of X
Find the least-squares regression coefficients without computing \(\hat{{\boldsymbol\beta}}=({\mathrm X}'{\mathrm X})^{-1}{\mathrm X}'Y\)
Why is the word ‘effects’ misleading?

Suppose you have data on a continuous variable, \(Y\), and two dichotomous variables, \(X\) and \(Z\), each taking the values 0 and 1.

The following table shows the values of \(\bar{Y}\) for all the combinations of values of \(X\) and \(Z\):

	X=0	X=1
Z = 0	5	7
Z = 1	1	2

The number of observations in each cell is:

	X=0	X=1
Z = 0	300	10
Z = 1	100	40

Questions:

Draw the Paik diagram for these data.
What are the ‘conditional effects of X’?
What is the ‘marginal effect of X’?
If you were to fit the model ‘Y ~ X*Z’ in R, i.e. the model \[\hat{Y} = \hat{\beta}_0 + X \hat{\beta}_1 + Z \hat{\beta}_2 + X Z\hat{\beta}_3\] what would the values of the estimated regression coefficients be? (Hint: It’s a saturated model, so it gives an exact fit to the four cell means \(\bar{Y}\).)
How would you get the ‘conditional effects of X’ from the regression coefficients? Express your answer in the form of a ‘hypothesis matrix’ multiplying the vector of estimated coefficients, i.e. a matrix \({\mathrm L}\) so that your answer would have the form \({\mathrm L} \hat{{\boldsymbol\beta}}\).
What is the interpretation of the interaction coefficient, \(\hat{\beta}_3\)?
Why is the use of the word ‘effects’ problematic in this context? Can you think of a better word?
Is this an example of Simpson’s Paradox? Why or why not?
Challenging, or tedious, so not on the quiz: How would you get the marginal effect of X from the regression coefficients? What is the connection with the chain rule in multivariate calculus?
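
The arithmetic behind several of these questions can be checked with a short script. The following is only a sketch in plain Python, not part of the quiz and not the R fit itself. It assumes the cell means and counts in the two tables above, and all variable names (`ybar`, `n`, `conditional`, `marginal`, `beta`, `L`) are made up for illustration. The comparisons are called "differences" rather than "effects" in the code. The last step is one way to write the marginal difference in terms of the coefficients, by averaging the fitted model over the distribution of Z within each level of X.

```python
# Cell means of Y and cell counts from the two tables, keyed by (x, z).
ybar = {(0, 0): 5.0, (1, 0): 7.0, (0, 1): 1.0, (1, 1): 2.0}
n = {(0, 0): 300, (1, 0): 10, (0, 1): 100, (1, 1): 40}

# Conditional differences: compare X = 1 with X = 0 within each level of Z.
conditional = {z: ybar[(1, z)] - ybar[(0, z)] for z in (0, 1)}


def marginal_mean(x):
    """Mean of Y at X = x, pooling over Z with the cell counts as weights."""
    total = sum(n[(x, z)] * ybar[(x, z)] for z in (0, 1))
    return total / sum(n[(x, z)] for z in (0, 1))


# Marginal difference: compare X = 1 with X = 0 ignoring Z.
marginal = marginal_mean(1) - marginal_mean(0)

# Saturated model Y ~ X*Z: four coefficients reproduce the four cell means,
# so they can be read off the table without any matrix inversion.
b0 = ybar[(0, 0)]                       # mean at X = 0, Z = 0
b1 = ybar[(1, 0)] - ybar[(0, 0)]        # X difference when Z = 0
b2 = ybar[(0, 1)] - ybar[(0, 0)]        # Z difference when X = 0
b3 = (ybar[(1, 1)] - ybar[(0, 1)]) - b1 # change in the X difference from Z = 0 to Z = 1
beta = [b0, b1, b2, b3]

# Hypothesis matrix: row 1 picks b1 (Z = 0), row 2 picks b1 + b3 (Z = 1).
L = [[0, 1, 0, 0],
     [0, 1, 0, 1]]
L_beta = [sum(l * b for l, b in zip(row, beta)) for row in L]

# Marginal difference from the coefficients, using the share of Z = 1
# within each level of X (these shares come from the counts, not the model).
p1 = n[(1, 1)] / (n[(1, 0)] + n[(1, 1)])  # proportion with Z = 1 among X = 1
p0 = n[(0, 1)] / (n[(0, 0)] + n[(0, 1)])  # proportion with Z = 1 among X = 0
marginal_from_beta = b1 + b2 * (p1 - p0) + b3 * p1

print("conditional differences:", conditional)
print("marginal difference:", marginal)
print("coefficients:", beta)
print("L beta:", L_beta)
print("marginal difference from coefficients:", marginal_from_beta)
```

With these tables, both conditional differences come out positive while the marginal difference is negative, which is the pattern the Simpson's Paradox question asks about. The marginal difference needs the counts as well as the coefficients, because the distribution of Z differs between X = 0 and X = 1.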

Sample Quiz questions