Quiz 1: September 19
Summary:
- Draw a Paik-Agresti diagram
- Find the conditional and marginal effects of X
- Find the least-squares regression coefficients without computing \(\hat{{\boldsymbol\beta}}=({\mathrm X}'{\mathrm X})^{-1}{\mathrm X}'Y\)
- Why is the word ‘effects’ misleading?
Suppose you have data on a continuous variable, \(Y\),
and two variables \(X\) and \(Z\) that are dichotomous with
values 0 and 1.
The following table shows the values of \(\bar{Y}\) for
all the combinations of values of \(X\) and \(Z\):
The number of observations in each cell is:
|       | X = 0 | X = 1 |
|-------|-------|-------|
| Z = 0 | 300   | 10    |
| Z = 1 | 100   | 40    |
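As a rough sketch of what the cell counts imply, the snippet below tabulates the group sizes and the proportion of cases with Z = 1 within each level of X. It assumes that the first count in each row of the table belongs to X = 0 and the second to X = 1; the helper names `n_x` and `prop_z1_given_x` are made up for this illustration.

```python
# Cell counts from the table above, keyed by (x, z).
# ASSUMPTION: in each row the first count is X = 0 and the second is X = 1.
counts = {(0, 0): 300, (1, 0): 10, (0, 1): 100, (1, 1): 40}


def n_x(x):
    """Total number of observations with X equal to x."""
    return sum(n for (xi, _z), n in counts.items() if xi == x)


def prop_z1_given_x(x):
    """Proportion of observations with Z = 1 among those with X equal to x."""
    return counts[(x, 1)] / n_x(x)


for x in (0, 1):
    print(f"X = {x}: n = {n_x(x)}, proportion with Z = 1 = {prop_z1_given_x(x):.2f}")
```

Under that reading of the table, Z = 1 is much more common when X = 1 than when X = 0, so X and Z are far from balanced. This imbalance is what lets the marginal comparison of X differ from the comparisons made within levels of Z.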
Questions:
- Draw the Paik diagram for these data.
- What are the ‘conditional effects of X’?
- What is the ‘marginal effect of X’?
- If you were to fit the model ‘Y ~ X*Z’ in R, i.e. the model
\[\hat{Y} = \hat{\beta}_0 + X \hat{\beta}_1 + Z \hat{\beta}_2 + X Z\hat{\beta}_3\]
what would the values of the estimated regression coefficients be? (Hint: It’s a saturated model and it gives an exact fit to \(\bar{Y}\))
- How would you get the ‘conditional effects of X’ from the regression
coefficients? Express your answer in the form of a
‘hypothesis matrix’ multiplying the vector of estimated coefficients, i.e.
a matrix \({\mathrm L}\) so that your answer would have the form \({\mathrm L} \hat{{\boldsymbol\beta}}\).
- What is the interpretation of the interaction coefficient, \(\hat{\beta}_3\)?
- Why is the use of the word ‘effects’ problematic in this context? Can you think of a better word?
- Is this an example of Simpson’s Paradox? Why or why not?
- Challenging, or tedious, so not on the quiz: how would you get the ‘marginal effect of X’ from the regression coefficients?
What is the connection with the chain rule in multivariate calculus?
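The mechanics behind the coefficient questions can be sketched in a few lines of plain Python. The table of cell means is not reproduced in this text, so the values in `ybar` below are hypothetical and invented purely for illustration; the cell counts are read from the counts table on the assumption that the first count in each row is X = 0. The sketch relies on the hint that the saturated model fits every cell mean exactly. It is one possible way to organize the arithmetic, not an official solution.

```python
# Cell counts keyed by (x, z).
# ASSUMPTION: in each row of the counts table the first entry is X = 0.
counts = {(0, 0): 300, (1, 0): 10, (0, 1): 100, (1, 1): 40}

# HYPOTHETICAL cell means of Y, invented for illustration only.
ybar = {(0, 0): 10.0, (1, 0): 12.0, (0, 1): 20.0, (1, 1): 21.0}

# Saturated model Y ~ X*Z: fitted values equal the cell means, so each
# coefficient is a difference (or difference of differences) of cell means.
beta = [
    ybar[(0, 0)],                                               # beta_0
    ybar[(1, 0)] - ybar[(0, 0)],                                # beta_1
    ybar[(0, 1)] - ybar[(0, 0)],                                # beta_2
    ybar[(1, 1)] - ybar[(1, 0)] - ybar[(0, 1)] + ybar[(0, 0)],  # beta_3
]

# Hypothesis matrix L: row 1 picks out the X comparison at Z = 0,
# row 2 picks out the X comparison at Z = 1.
L = [
    [0, 1, 0, 0],
    [0, 1, 0, 1],
]
conditional = [sum(l * b for l, b in zip(row, beta)) for row in L]


def p_z1(x):
    """Proportion with Z = 1 among observations with X equal to x."""
    return counts[(x, 1)] / (counts[(x, 0)] + counts[(x, 1)])


def mean_y_given_x(x):
    """Count-weighted mean of Y over Z for a fixed value of x."""
    return (1 - p_z1(x)) * ybar[(x, 0)] + p_z1(x) * ybar[(x, 1)]


# Marginal comparison computed directly from the weighted means ...
marginal = mean_y_given_x(1) - mean_y_given_x(0)

# ... and the same quantity written in terms of the coefficients: a "direct"
# part through X plus a part that flows through the change in Z with X.
marginal_from_beta = (
    beta[1] + beta[3] * p_z1(1) + beta[2] * (p_z1(1) - p_z1(0))
)

print("coefficients:", beta)
print("conditional comparisons (Z = 0, Z = 1):", conditional)
print("marginal comparison:", round(marginal, 4))
print("marginal from coefficients:", round(marginal_from_beta, 4))
```

The last expression has the same shape as a total derivative: a term for the change in X with Z held fixed, plus a term for the coefficient on Z multiplied by how the distribution of Z shifts with X. That is one way to read the chain-rule connection asked about in the final question.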