The Plan for Day Two
Practice and pitfalls
- (1) Natural experiments as interesting sources of instrumental variables
- (2) The consequences of “weak” instruments for causal inference
- (3) Some useful IV diagnostics
- (4) Walk through an empirical application
- Goal = provide concrete examples of instrumental variables methods

Instrumental Variables and Natural Experiments
- What is a natural experiment?
  - “situations where the forces of nature or government policy have conspired to produce an environment somewhat akin to a randomized experiment” (Angrist and Krueger, 2001, p. 73)
- Natural experiments can provide a useful source of exogenous variation in problematic regressors
  - But they require detailed institutional knowledge

Instrumental Variables and Natural Experiments
- Some natural experiments in economics
  - Existing policy differences, or changes that affect some jurisdictions (or groups) but not others
    - Minimum wage rate
    - Excise taxes on consumer goods
    - Unemployment insurance, workers' compensation
  - Unexpected “shocks” to the local economy
    - Coal prices and the Middle East oil embargo (1973)
    - Agricultural production and adverse weather events

Instrumental Variables and Natural Experiments
- Some potential pitfalls
  - Not all policy differences/changes are exogenous
    - Political factors and past realizations of the response variable can affect existing policies or policy changes
  - Generalizability of causal effect estimates
    - Results may not generalize beyond the units under study
  - Heterogeneity in causal effects
    - Results may be sensitive to the natural experiment chosen in a specific study (L.A.T.E.)

Instrumental Variables and Natural Experiments
- Some natural experiments of criminological interest
  - Levitt (1996) = prison population → crime rate
  - Levitt (1997) = police hiring → crime rate
  - Apel et al. (2008) = youth employment → delinquency
- Some natural experiments not of criminological interest, but interesting nonetheless
  - Angrist and Evans (1998) = fertility → labor supply

Levitt (1996), Q.J.E.
- Large decline in crime did not accompany the large increase in prison population (1971-1993)
  - Prima facie evidence of prison ineffectiveness
- But increased prison use could mask what would have been a greater increase in crime
  - Underlying determinants of crime probably worsened
- And prison population probably responded to the crime increase

Levitt (1996), Q.J.E.
- Prison overcrowding litigation
  - Population caps, prohibition of “double celling”
  - In 12 states, the entire prison system came under court control: AL, AK, AR, DE, FL, MS, NM, OK, RI, SC, TN, TX
- Relationship between litigation and prisons
  - Prior to filing, prison growth outpaced the national average by 2.3 percent
  - After filing, prison growth was 5.1 percent slower

Levitt (1996), Q.J.E.
- Logic of the instrumental variable in this study
  - Court rulings concerning prison capacity cannot be correlated with the unobserved determinants of crime rate changes
  - Or, put differently, the only reason court rulings are related to crime is that they limit prison population growth

Levitt (1996), Q.J.E.
- 2SLS model yields a “prison effect” on crime at least four times as large as the LS model
  - Violent crime rate: bLS = -.099 (s.e. = .033); bIV = -.424 (s.e. = .201)
  - Property crime rate: bLS = -.071 (s.e. = .019); bIV = -.321 (s.e. = .138)
- A 10% increase in prison size produces a 4.2% decrease in violent crime and a 3.2% decrease in property crime

Levitt (1996), Q.J.E.
- L.A.T.E. = effect of prison growth on crime among states under court order to slow growth
- Some relevant observations
  - Generalizability = predominantly Southern states
    - Large prison populations, unusually fast prison growth
  - Treatment-effect heterogeneity = (slowed) prison growth due to court-ordered prison reductions may be differentially related to crime rates
    - Other IVs could lead to different causal effect estimates

Levitt (1997), A.E.R.
- Breaking the simultaneity in the police-crime connection
  - When more police are hired, crime should decline
  - But more police may be hired during crime waves
- Election cycles and police hiring
  - Increases in size of police force are disproportionately concentrated in election years
  - Growth is 2.1% in mayoral election years, 2.0% in gubernatorial election years, and 0.0% in non-election years

Levitt (1997), A.E.R.
- However, can election cycles affect crime rates through other spending channels?
  - E.g., education, welfare, unemployment benefits
- If so, all of these other indirect channels must be netted out

Levitt (1997), A.E.R.
[Figures: first-stage coefficients; reduced-form coefficients]

Levitt (1997), A.E.R.
- Comparative estimates of the effect of police manpower on city crime rates
  - Violent crime rate
    - Levels: bLS = +.28 (s.e. = .05)
    - Changes: bLS = -.27 (s.e. = .06)
    - Changes: bIV = -1.39 (s.e. = .55)
  - Property crime rate
    - Levels: bLS = +.21 (s.e. = .05)
    - Changes: bLS = -.23 (s.e. = .09)
    - Changes: bIV = -.38 (s.e. = .83)
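
The sign flip in the comparative estimates (least squares in levels positive, IV negative) is the pattern simultaneity would produce. The following is a minimal Python sketch of that logic on simulated data. It is not Levitt's data, model, or code: the variable names, the true effect `beta = -0.3`, and all other parameter values are assumptions chosen for illustration. A binary instrument `z` (think "election year") shifts `x` (think "police"), while `x` also responds to the crime shock `u`.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

random.seed(12345)
n = 20000
beta = -0.3  # hypothetical true causal effect of x on y

# z: binary instrument, independent of the crime shock u by construction
z = [1.0 if random.random() < 0.5 else 0.0 for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]

# x responds to the instrument AND to the crime shock (the source of bias)
x = [1.0 * zi + 1.5 * ui + random.gauss(0, 0.5) for zi, ui in zip(z, u)]
y = [beta * xi + ui for xi, ui in zip(x, u)]

# Least squares slope: contaminated by Cov(x, u) > 0
b_ols = cov(x, y) / cov(x, x)

# IV with one instrument: reduced-form slope divided by first-stage slope
first_stage = cov(z, x) / cov(z, z)
reduced_form = cov(z, y) / cov(z, z)
b_iv = reduced_form / first_stage

print("true effect :", beta)
print("b_LS        :", round(b_ols, 3))
print("first stage :", round(first_stage, 3))
print("reduced form:", round(reduced_form, 3))
print("b_IV        :", round(b_iv, 3))
```

Because `x` rises with the crime shock, the least squares slope is pulled upward and can take the wrong sign, while the ratio of reduced-form to first-stage coefficients recovers a value near the assumed effect. This only works because `z` is generated independently of `u`, which is exactly the exclusion restriction that has to be argued, not tested, in a real application.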

Levitt (1997), A.E.R.
- Follow-up instrumental variables studies of the police-crime relationship in the U.S.
  - Levitt (2002) = Number of firefighters
  - Klick and Tabarrok (2005) = Washington, DC, terrorism alert levels post-9/11
  - Evans and Owens (2007) = Grants from the federal Office of C.O.P.S.
- These findings basically replicated those from Levitt's (1997) original study

Apel et al. (2008), J.Q.C.
- What effect does working have on adolescent behavior?
  - Prior research suggests the consequences of work are uniformly negative
  - Focus on “work intensity” rather than work per se
  - Youth Worker Protection Act
- Problem of non-random selection into the youth labor market
  - Especially pronounced for high-intensity workers

Apel et al. (2008), J.Q.C.
- Something interesting happens at age 16
  - Youth work is no longer governed by the federal Fair Labor Standards Act (F.L.S.A.)

Apel et al. (2008), J.Q.C.
- F.L.S.A. governs employment of all 15-year-olds during the school year
  - No work past 7:00 pm
  - Maximum 3 hours/day and 18 hours/week
- But F.L.S.A. expires for 16-year-olds
- And every state has its own law governing 16-year-old employment
- Thus, youth age into less restrictive regimes that vary across jurisdictions

Apel et al. (2008), J.Q.C.
- Change in work intensity at the 15-16 transition among 15-year-old non-workers
- Magnitude of change is an increasing function of the number of hours allowed at age 16

Apel et al. (2008), J.Q.C.
- A 20-hour increase in the number of hours worked per week reduces the “variety” of delinquent behavior by 0.47 (-.0233 × 20)

Angrist and Krueger (1991), Q.J.E.
- Returns to education (Y = wages)
  - Problem of omitted “ability bias”
- Years of schooling vary by quarter of birth
  - Compulsory schooling laws, age-at-entry rules
  - Someone born in Q1 is a little older and will be able to drop out sooner than someone born in Q4
- Q.O.B. can be treated as a useful source of exogeneity in schooling

Angrist and Krueger (1991), Q.J.E.
- People born in Q1 do obtain less schooling
  - But pay close attention to the scale of the y-axis
  - Mean difference between Q1 and Q4 is only 0.124 years, or 1.5 months
- So a large N is needed, since the R² of X on Z will be very small
  - A&K had over 300k for the 1930-39 cohort

Angrist and Krueger (1991), Q.J.E.
- Final 2SLS model interacted QOB with year of birth (30), state of birth (150)
  - OLS: b = .0628 (s.e. = .0003)
  - 2SLS: b = .0811 (s.e. = .0109)
- Least squares estimate does not appear to be badly biased by omitted variables
- But a replication effort identified some pitfalls in this analysis that are instructive

Bound, Jaeger, and Baker (1995), J.A.S.A.
- Potential problems with QOB as an IV
  - Correlation between QOB and schooling is weak
    - Small Cov(X,Z) introduces finite-sample bias, which will be exacerbated with the inclusion of many IVs
  - QOB may not be completely exogenous
    - Even small Cov(Z,e) will cause inconsistency, and this will be exacerbated when Cov(X,Z) is small
- QOB qualifies as a weak instrument that may be correlated with unobserved determinants of wages (e.g., family income)

Bound, Jaeger, and Baker (1995), J.A.S.A.
- Even if the instrument is “good,” matters can be made far worse with IV as opposed to LS
  - Weak correlation between IV and endogenous regressor can pose severe finite-sample bias
  - And really large samples won't help, especially if there is even weak endogeneity between IV and error
- First-stage diagnostics provide a sense of how good an IV is in a given setting
  - F-test and partial R² on IVs

Useful Diagnostic Tools for IV Models
- Tests of instrument relevance
  - Weak IVs → large variance of bIV as well as potentially severe finite-sample bias
- Tests of instrument exogeneity
  - Endogenous IVs → inconsistency of bIV that makes it no better (and probably worse) than bLS
- Durbin-Wu-Hausman test
  - Endogeneity of the problem regressor(s)

Tests of Instrument Relevance
- Diagnostics based on the F-test for the joint significance of the IVs
  - Nelson and Startz (1990); Staiger and Stock (1997)
  - Bound, Jaeger, and Baker (1995)
- Partial R-square for the IVs
  - Shea (1997)
- There is a growing econometric literature on the “weak instrument” problem

Tests of Instrument Exogeneity
- Model must be overidentified, i.e., more IVs than endogenous Xs
- H0: All IVs uncorrelated with structural error
- Overidentification test:
  1. Estimate structural model
  2. Regress IV residuals on all exogenous variables
  3. Compute N·R² and compare to chi-square
     - df = # IVs - # endogenous Xs

Durbin-Wu-Hausman (DWH) Test
- Balances the consistency of IV against the efficiency of LS
  - H0: IV and LS both consistent, but LS is efficient
  - H1: Only IV is consistent
- DWH test for a single endogenous regressor:
  $\mathrm{DWH} = (b_{IV} - b_{LS}) / \sqrt{s^2_{b_{IV}} - s^2_{b_{LS}}} \sim N(0,1)$
- If |DWH| > 1.96, then X is endogenous and IV is the preferred estimator despite its inefficiency

Durbin-Wu-Hausman (DWH) Test
- A roughly equivalent procedure for DWH:
  1. Estimate the first-stage model
  2. Include the first-stage residual in the structural model along with the endogenous X
  3. Test for significance of the coefficient on the residual
- Note: Coefficient on endogenous X in this model is bIV (standard error is smaller, though)
  - First-stage residual is a “generated regressor”

Software Considerations
- I have a strong preference for Stata
  - Classic routine (-ivreg-) as well as a user-written one with a lot more diagnostic capability (-ivreg2-)
  - Non-linear models: -ivprobit- and -ivtobit-
  - Panel models: -xtivreg- and -xtivreg2-
- Useful post-estimation routines
  - Overidentification: -overid-
  - Endogeneity of X in LS model: -ivendog-
  - Heteroscedasticity: -ivhettest-

Software Considerations
- Basic model specification in Stata

    ivreg y (x = z) w [weight = wtvar], options

  - y = dependent variable
  - x = endogenous variable
  - z = instrumental variable
  - w = control variable(s)
- Useful options: first, ffirst, robust, cluster(varname)

Software Considerations
- For SAS users: Proc Syslin (SAS/ETS)
- Basic command:

    proc syslin data=dataset 2sls options1;
      endogenous x;
      instruments z w;
      model y = x w / options2;
      weight wtvar;
    run;

- Useful “options1”: first
- Useful “options2”: overid

Software Considerations
- For SPSS users: 2SLS
- Basic command:

    2sls y with x w
      / instruments z w
      / constant.

- For point-and-click aficionados
  - Analyze → Regression → Two-Stage Least Squares
  - DEPENDENT, EXPLANATORY, and INSTRUMENTAL

Software Considerations
- For Limdep users: 2SLS
- Basic command:

    2SLS ; Lhs = y
         ; Rhs = one, x, w
         ; Inst = one, z, w
         ; Wts = wtvar
         ; Dfc $

Application: Adolescent Work and Delinquent Behavior
- Prior research shows a positive correlation between teenage work and delinquency
  - Reasons to suspect serious endogeneity bias
- 2nd wave of the NLSY97 (N = 8,368)
  - Y = 1 if committed delinquent act (31.9%)
  - X = 1 if worked in a formal job (52.6%)
  - Z1 = 1 if child labor law allows 40+ hours (14.2%)
  - Z2 = 1 if no child labor restriction in place (39.6%)

Regression Model Ignoring Endogeneity

    . reg pcrime work if nomiss=1 & wave=2

          Source |       SS       df       MS              Number of obs =    8368
    -------------+------------------------------           F(  1,  8366) =    6.33
           Model |  1.37395379     1  1.37395379           Prob > F      =  0.0119
        Residual |  1815.97786  8366  .217066443           R-squared     =  0.0008
    -------------+------------------------------           Adj R-squared =  0.0006
           Total |  1817.35182  8367  .217204711           Root MSE      =   .4659

    ------------------------------------------------------------------------------
          pcrime |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            work |   .0256633   .0102005     2.52   0.012     .0056677    .0456588
           _cons |   .3053242   .0074009    41.26   0.000     .2908167    .3198318
    ------------------------------------------------------------------------------

- Teenage workers are significantly more delinquent
- Modest effect, but consistent with prior research

First-Stage Model

    . reg work law40 nolaw if nomiss=1 & wave=2

          Source |       SS       df       MS              Number of obs =    8368
    -------------+------------------------------           F(  2,  8365) =  626.64
           Model |  271.829722     2  135.914861           Prob > F      =  0.0000
        Residual |  1814.33364  8365  .216895832           R-squared     =  0.1303
    -------------+------------------------------           Adj R-squared =  0.1301
           Total |  2086.16336  8367  .249332301           Root MSE      =  .46572

    ------------------------------------------------------------------------------
            work |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           law40 |   .0688902   .0154383     4.46   0.000     .0386274     .099153
           nolaw |   .3818684   .0110273    34.63   0.000     .3602521    .4034847
           _cons |   .3655636   .0074883    48.82   0.000     .3508847    .3802425
    ------------------------------------------------------------------------------

- State child labor laws affect probability of work
- This is a really strong first stage (F, R²)

Two-Stage Least Squares Model

    . ivreg pcrime (work = law40 nolaw) if nomiss=1 & wave=2

    Instrumental variables (2SLS) regression

          Source |       SS       df       MS              Number of obs =    8368
    -------------+------------------------------           F(  1,  8366) =    6.86
           Model | -19.5287923     1 -19.5287923           Prob > F      =  0.0088
        Residual |  1836.88061  8366  .219564978           R-squared     =       .
    -------------+------------------------------           Adj R-squared =       .
           Total |  1817.35182  8367  .217204711           Root MSE      =  .46858

    ------------------------------------------------------------------------------
          pcrime |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            work |  -.0744352   .0284206    -2.62   0.009    -.1301466   -.0187238
           _cons |   .3580171   .0158135    22.64   0.000     .3270187    .3890155
    ------------------------------------------------------------------------------
    Instrumented:  work
    Instruments:   law40 nolaw
    ------------------------------------------------------------------------------

What Do the Models Suggest Thus Far?
- Completely different conclusions!
  - OLS = Teenage work is criminogenic (b = +.026)
    - Delinquency risk increases by 8.5 percent (base = .305)
  - 2SLS = Teenage work is prophylactic (b = -.074)
    - Delinquency risk decreases by 20.7 percent (base = .358)
- Which model should we believe?
  - We still have some additional diagnostic work to do to evaluate the 2SLS model
  - Overidentification test, Hausman test
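
The two remaining diagnostics follow the step lists given earlier in the deck: the N·R² overidentification test and the residual-inclusion version of the DWH test. The following is a minimal Python sketch of both on simulated data. It does not use the NLSY97 or reproduce any result from the deck, and it is not the Stata `-overid-` or `-ivendog-` routines; every variable name and parameter value (including the assumed true effect `beta = -0.5`) is hypothetical. The model has one endogenous regressor and two instruments, so the overidentification test has 1 degree of freedom, for which the 5% chi-square critical value is 3.84.

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * d for a, d in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """Least squares of y on X; returns (coefficients, residuals, std. errors)."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    b = solve(XtX, Xty)
    resid = [yi - sum(bj * xj for bj, xj in zip(b, r)) for r, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (len(y) - k)
    # j-th diagonal element of (X'X)^-1, obtained by solving against unit vector j
    se = [(s2 * solve(XtX, [float(i == j) for i in range(k)])[j]) ** 0.5
          for j in range(k)]
    return b, resid, se

random.seed(1)
n = 5000
beta = -0.5  # hypothetical true effect of x on y
z1 = [float(random.random() < 0.3) for _ in range(n)]
z2 = [float(random.random() < 0.4) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
# x is endogenous: it depends on the structural error u
x = [0.5 * a + 0.8 * b + 0.8 * e + random.gauss(0, 1)
     for a, b, e in zip(z1, z2, u)]
y = [1.0 + beta * xi + e for xi, e in zip(x, u)]

Z = [[1.0, a, b] for a, b in zip(z1, z2)]  # constant plus both instruments

# First stage, then 2SLS via the fitted values of x
_, v_hat, _ = ols(Z, x)
x_hat = [xi - vi for xi, vi in zip(x, v_hat)]
b2, _, _ = ols([[1.0, xh] for xh in x_hat], y)
b_iv = b2[1]

# Overidentification: regress IV residuals (built from actual x) on Z, use N*R^2
e_iv = [yi - b2[0] - b_iv * xi for xi, yi in zip(x, y)]
_, w, _ = ols(Z, e_iv)
e_bar = sum(e_iv) / n
r2 = 1.0 - sum(wi * wi for wi in w) / sum((e - e_bar) ** 2 for e in e_iv)
overid_stat = n * r2  # compare to chi-square with 2 - 1 = 1 df

# DWH by residual inclusion: y on constant, x, and the first-stage residual
b_cf, _, se_cf = ols([[1.0, xi, vi] for xi, vi in zip(x, v_hat)], y)
t_resid = b_cf[2] / se_cf[2]  # standard error ignores the generated regressor

print("b_IV:", round(b_iv, 3), " coefficient on x with residual:", round(b_cf[1], 3))
print("overid N*R^2:", round(overid_stat, 3), "(5% critical value, 1 df: 3.84)")
print("t on first-stage residual:", round(t_resid, 2))
```

In this simulation the instruments are exogenous by construction and `x` is endogenous by construction, so the interesting part is the mechanics: the coefficient on `x` in the residual-inclusion regression reproduces the 2SLS estimate, and the t-statistic on the first-stage residual serves as the endogeneity test. With real data, a large N·R² would cast doubt on at least one instrument, though the test cannot say which.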