Project: Logistic Regression
You will use the state data that you used in your linear regression project. Choose one or more numerical predictor variables, and one binary outcome. If any of your predictors have large p-values, be sure to justify why you are including them.
Please do not use HINCP or FINCP to predict FS.
For your final report:
- Include and explain all relevant output.
- Explain what (each of) your independent variable(s) is measuring and discuss the value (especially sign!) of its coefficient in the regression model.
- Discuss any outliers in your predicted vs observed graph.
- To really impress, make a prediction for a particular household with a given set of predictor variables.
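A prediction for one household can be computed by hand from the fitted coefficients. The sketch below is only an illustration: the intercept and coefficient are made-up numbers, and NP (number of persons in the household) is an assumed choice of predictor, so substitute the values from your own output. It also shows why the sign matters: a positive coefficient raises the predicted probability as the predictor increases, and a negative one lowers it.

```python
import math

def predicted_probability(intercept, coefficients, household):
    """Return P(outcome = 1) for one household from logistic regression output."""
    # Linear predictor (the log-odds): b0 + b1*x1 + b2*x2 + ...
    log_odds = intercept + sum(
        coefficients[name] * household[name] for name in coefficients
    )
    # The inverse logit maps the log-odds to a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical fitted values, for illustration only; use your own output.
intercept = -1.0
coefficients = {"NP": 0.25}  # assumed predictor: persons in the household
household = {"NP": 4}        # the particular household to predict for

p = predicted_probability(intercept, coefficients, household)
print(f"Predicted probability: {p:.3f}")
```

With these made-up numbers the log-odds are -1.0 + 0.25 * 4 = 0, so the predicted probability is 0.5.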
Rubric for Project (40 points)
8 points: Model, with relevant output including tables and p-values.
8 points: Graph. Comment on what you can learn from the graph, and what you get from the output that isn’t represented well in the graph. Is your model correctly predicting most of the outcomes?
8 points: Describe independent variable(s). Use the PUMS data dictionary as a starting point and explain in your own words.
8 points: Describe the model in your own words. Does it seem plausible to you?
8 points: Contrast with a linear regression model on the same data. Which do you prefer, and why?
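One possible way to check whether the model predicts most outcomes correctly is to turn each predicted probability into a 0 or 1 and compare it with the observed outcome. The sketch below assumes a cutoff of 0.5, which is a common default rather than a requirement, and uses made-up probabilities and outcomes rather than real data.

```python
def classification_counts(probabilities, observed, cutoff=0.5):
    """Count correct and incorrect predictions at the given cutoff."""
    counts = {"true_pos": 0, "true_neg": 0, "false_pos": 0, "false_neg": 0}
    for prob, actual in zip(probabilities, observed):
        predicted = 1 if prob >= cutoff else 0
        if predicted == 1 and actual == 1:
            counts["true_pos"] += 1
        elif predicted == 0 and actual == 0:
            counts["true_neg"] += 1
        elif predicted == 1 and actual == 0:
            counts["false_pos"] += 1
        else:
            counts["false_neg"] += 1
    return counts

def accuracy(counts):
    """Share of households whose outcome was predicted correctly."""
    correct = counts["true_pos"] + counts["true_neg"]
    return correct / sum(counts.values())

# Illustrative values only, not real model output.
probs = [0.9, 0.2, 0.6, 0.4, 0.1]
obs = [1, 0, 0, 1, 0]

counts = classification_counts(probs, obs)
print(counts)
print(f"Accuracy: {accuracy(counts):.2f}")
```

Keep in mind that when the outcome is rare, a model can score high accuracy simply by predicting 0 for every household, so look at the false positives and false negatives as well as the overall rate.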
Note: Many of the binary variables are not coded as 0’s and 1’s. Some are coded as 1’s and 2’s, and others are ordinal variables coded as 0’s, 1’s, and 2’s. Because of this, we need a way to recode our data.
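A minimal sketch of such a recode follows, written in plain Python for illustration (the same idea carries over to whatever software you use). It assumes a yes/no variable such as FS is coded 1 for yes and 2 for no; check the PUMS data dictionary for your year before relying on that. The function names and the sample codes are hypothetical, and the cutoff used to collapse an ordinal variable is your own choice, which you should justify in your report.

```python
def recode_binary(values, yes_code=1, no_code=2):
    """Recode a yes/no variable to 1/0; any other code becomes None."""
    mapping = {yes_code: 1, no_code: 0}
    return [mapping.get(value) for value in values]

def collapse_ordinal(values, cutoff):
    """Turn an ordinal variable into 0/1: 1 if the code is at or above cutoff.

    Assumes there are no missing values in the input.
    """
    return [1 if value >= cutoff else 0 for value in values]

# Illustrative raw codes, not real data.
fs_raw = [1, 2, 2, 1, 2]        # assumed coding: 1 = yes, 2 = no
ordinal_raw = [0, 1, 2, 0, 2]   # a hypothetical 0/1/2 ordinal variable

fs_binary = recode_binary(fs_raw)
ordinal_binary = collapse_ordinal(ordinal_raw, cutoff=1)

print(fs_binary)       # [1, 0, 0, 1, 0]
print(ordinal_binary)  # [0, 1, 1, 0, 1]
```

Codes that match neither the yes nor the no value come back as None, so they can be found and dropped before fitting the model.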