Heckit Method

Outline:
1. Introduction to the Sample Selection Problem
2. Describe the Heckit Model
3. Apply the Heckit method to the example of "Wage Offer for Married Women"

Introduction to Sample Selection Problem (background)
So far, we have mostly assumed the availability of a random sample from the population.
But random samples are not always available.
Sample selection problem: the observed sample is not a random sample but is systematically chosen from the population.
Causes of nonrandom sample selection:
- sample design;
- the behavior of the units being sampled (including nonresponse on survey questions and attrition from social programs).

Introduction to Sample Selection (examples)
Examples (by survey design):
1. Saving function. Suppose we only have access to a survey that included families whose household head was 45 years of age or older. If we are interested in the saving function for all families, we can obtain a random sample only for a subset of the population.
2. Truncation based on wealth (choice-based sampling). If we can only sample people with a net wealth less than $200,000, the sample is selected on the basis of wealth (the response variable).

Introduction to Sample Selection (examples)
Incidental truncation: we do not observe y because of the outcome of another variable.
Leading example: the wage offer function. Consider a wage offer equation for people of working age.
- in the work force: wage offer observed
- out of the work force: wage offer unobserved
The wage offer is missing as a result of the outcome of another variable, labor force participation.

Introduction to Sample Selection Problem (the case where bias doesn't exist)
When is OLS on the selected sample consistent?
The population model is
\[ y = X\beta + u, \qquad E(u \mid X) = 0 \qquad (1) \]
If we can observe y and X, we can simply use OLS. The model we actually estimate is
\[ s\,y = s\,X\beta + s\,u \qquad (2) \]
Selection indicator: s = 1 if we observe all of (y, X); s = 0 otherwise.
Condition for OLS to be consistent (the error term has zero mean and is uncorrelated with each explanatory variable):
- selected sample (2): \( E(su) = 0 \) and \( E[(sX_j)(su)] = E(sX_j u) = 0 \)
- population (1): \( E(u) = 0 \) and \( E(X_j u) = 0 \)

Introduction to Sample Selection Problem (the case where bias doesn't exist)
Condition for OLS to be unbiased:
- selected sample (2): \( E(su \mid sX) = 0 \)
- population (1): \( E(u \mid X) = 0 \)
Exogenous sample selection: OLS is consistent. If \( s = f(X) \), then \( sX = f(X)X \) is a function of X alone, so \( E(su \mid sX) = 0 \) because \( E(u \mid X) = 0 \).
Independent sample selection: OLS is consistent. If s is independent of X and u, then \( E(sXu) = E(s)E(Xu) = 0 \).
If s depends on the explanatory variables and on additional random terms that are independent of X and u, OLS is consistent and unbiased.
OLS is in general inconsistent if s and u are correlated.
The result on consistency of OLS can be extended to the consistency of 2SLS with instruments Z.

Heckit Model (type II Tobit model)
\[ Y_1 = X_1\beta_1 + u_1 \qquad (17.19) \]
\[ Y_2 = 1[X\delta_2 + v_2 > 0] \qquad (17.20) \]
Assumption:
a) \( (X, Y_2) \) are always observed; \( Y_1 \) is observed only when \( Y_2 = 1 \);
b) \( (u_1, v_2) \) is independent of X with zero mean;
c) \( v_2 \sim \mathrm{Normal}(0, 1) \);
d) \( E(u_1 \mid v_2) = \gamma_1 v_2 \).
Equations (17.19) and (17.20) are called the type II Tobit model by Amemiya (1985), but it is really a model of sample selection. Type II Tobit model = Heckit model.

Heckit Model
Correlation between \( u_1 \) and \( v_2 \) causes a sample selection problem.
To derive an estimating equation, we need \( E(Y_1 \mid X, Y_2 = 1) \), since \( Y_1 \) is observed only when \( Y_2 = 1 \).
1) First, under assumptions a) to d) and (17.19),
\[ E(Y_1 \mid X, v_2) = X_1\beta_1 + E(u_1 \mid v_2) = X_1\beta_1 + \gamma_1 v_2 \]
If \( \gamma_1 = 0 \), \( u_1 \) and \( v_2 \) are uncorrelated, and then \( E(Y_1 \mid X, Y_2) = X_1\beta_1 \), because \( Y_2 \) is a function of \( (X, v_2) \).
So if \( \gamma_1 = 0 \), there is no sample selection problem, and \( \beta_1 \) can be consistently estimated by OLS on the selected sample.
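
The role of the correlation between u1 and v2 can be checked numerically. The following is a minimal sketch, not part of the original slides: it assumes normal errors, a fixed selection index of 0.5, and illustrative values for the coefficient on v2 (0 and 0.8); the function names are hypothetical. It compares the average of u1 among selected draws (v2 > -index) with the standard truncated-normal result, the coefficient times φ(index)/Φ(index).

```python
import math
import random


def inverse_mills(c):
    """phi(c) / Phi(c); Phi is computed with erfc for numerical accuracy."""
    pdf = math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-c / math.sqrt(2.0))
    return pdf / cdf


def selected_mean_u1(gamma1, index, n=100_000, seed=0):
    """Average of u1 over the draws with v2 > -index, i.e. with Y2 = 1.

    Illustrative data-generating process (an assumption of this sketch):
    v2 ~ N(0, 1) and u1 = gamma1 * v2 + e, with e ~ N(0, 0.6^2).
    """
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(n):
        v2 = rng.gauss(0.0, 1.0)
        u1 = gamma1 * v2 + rng.gauss(0.0, 0.6)
        if v2 > -index:  # unit is selected
            total += u1
            count += 1
    return total / count


index = 0.5  # hypothetical value of the selection index X * delta2
for gamma1 in (0.0, 0.8):
    simulated = selected_mean_u1(gamma1, index)
    theory = gamma1 * inverse_mills(index)
    print(f"gamma1={gamma1}: simulated {simulated:.3f}, theory {theory:.3f}")
```

With a zero coefficient the selected-sample mean of u1 should stay close to zero, which is the case where OLS on the selected sample remains consistent; with a nonzero coefficient the mean shifts by an amount that depends on the selection index.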

Heckit Model
If \( \gamma_1 \neq 0 \), there is selection bias. By iterated expectations,
\[ E(Y_1 \mid X, Y_2) = X_1\beta_1 + \gamma_1 E(v_2 \mid X, Y_2) = X_1\beta_1 + \gamma_1 h(X, Y_2) \]
where \( h(X, Y_2) = E(v_2 \mid X, Y_2) \). If we knew \( h(X, Y_2) \), we could estimate \( \beta_1 \) and \( \gamma_1 \) from the regression of \( Y_1 \) on \( X_1 \) and \( h(X, Y_2) \). Since \( Y_2 = 1 \) in the selected sample, we only need
\[ h(X, 1) = E(v_2 \mid v_2 > -X\delta_2) = \lambda(X\delta_2) \]
where \( \lambda(\cdot) = \phi(\cdot)/\Phi(\cdot) \) is the inverse Mills ratio. So we get
\[ E(Y_1 \mid X, Y_2 = 1) = X_1\beta_1 + \gamma_1 \lambda(X\delta_2) \]

Heckman's 2-step Method
A) Obtain the probit estimate \( \hat\delta_2 \) from the model \( P(Y_{i2} = 1 \mid X_i) = \Phi(X_i\delta_2) \), using all N observations. Then obtain the estimated inverse Mills ratios \( \hat\lambda_{i2} = \lambda(X_i\hat\delta_2) \).
B) Obtain \( \hat\beta_1 \) and \( \hat\gamma_1 \) from the OLS regression on the selected sample,
\[ Y_{i1} \text{ on } X_{i1} \text{ and } \hat\lambda_{i2}, \qquad i = 1, 2, \ldots, N_1 \qquad (3) \]
These estimators are consistent and \( \sqrt{N} \)-asymptotically normal.

Test for selection bias
\( H_0: \gamma_1 = 0 \) (no selection bias). Under \( H_0 \), homoskedasticity holds, so it is appropriate to use the least squares estimates and their usual standard errors obtained from regression (3).
If \( \gamma_1 \) is not equal to 0, the usual OLS standard errors are not exactly correct, because they do not account for the estimation of \( \delta_2 \).

Note:
- \( X_1 \) should be a strict subset of X. Any element of \( X_1 \) should also be in X. Leaving elements of \( X_1 \) out of X leads to inconsistency if they are incorrectly excluded. So when selectivity bias is present, we had better include all elements of \( X_1 \) in X in order to get consistent estimates of the coefficients. (partial answer to question b)
- At least one element of X should not be in \( X_1 \). Although the inverse Mills ratio is a nonlinear function of X, it is often well approximated by a linear function. If \( X = X_1 \), \( \hat\lambda_{i2} \) can be highly correlated with the elements of \( X_{i1} \), and such multicollinearity can lead to very high standard errors for the coefficient estimates.

Example: Wage Offer Equation for Married Women
753 women in the sample; 428 worked for a wage during the year.
log(wage) is the dependent variable.
educ, exper, and exper^2 are the explanatory variables.

Results of the example:
Table 17.5: Dependent Variable: log(wage)
(standard errors in parentheses)

Independent variables   OLS                 Heckit
educ                    .108 (.014)         .109 (.016)
exper                   .042 (.012)         .044 (.016)
exper^2                 -.00081 (.00039)    -.00086 (.00044)
constant                -.522 (.199)        -.578 (.307)
lambda-hat              -                   .032 (.134)
Sample size             428                 428
R-squared               .157                .157

Results of the example:
The inverse Mills ratio term is statistically insignificant (its t statistic is only .032/.134 ≈ .239). There is therefore no evidence of a sample selection problem in estimating the wage offer equation: \( H_0: \gamma_1 = 0 \) is not rejected, i.e. no selection bias.
The table also shows that the differences between the OLS and Heckit estimates are practically small.
We can use the least squares estimates in this case.

Conclusion:
The sample selection problem arises when the observed sample is not a random sample.
Instead of OLS, we can use the Heckit method to get consistent estimates when we only have a selected sample.
We can also test for selection bias using this method.
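
The two-step procedure described in these notes can be sketched on simulated data. This is a hedged illustration rather than the computation behind Table 17.5: the data-generating process, the coefficient values, the extra selection variable z, and all function names are assumptions made for the example, and the probit step is a plain Newton-Raphson written with the standard library only.

```python
import math
import random


def inverse_mills(z):
    """lambda(z) = phi(z) / Phi(z), with Phi computed via erfc."""
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return pdf / (0.5 * math.erfc(-z / math.sqrt(2.0)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


def probit(X, s, iters=15):
    """Probit MLE by Newton-Raphson, starting from zero coefficients."""
    k = len(X[0])
    d = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        H = [[0.0] * k for _ in range(k)]  # negative Hessian
        for r, si in zip(X, s):
            z = sum(a * b for a, b in zip(r, d))
            g = inverse_mills(z) if si else -inverse_mills(-z)  # score weight
            w = g * (g + z)
            for i in range(k):
                grad[i] += g * r[i]
                for j in range(k):
                    H[i][j] += w * r[i] * r[j]
        d = [di + st for di, st in zip(d, solve(H, grad))]
    return d


def simulate(n=6000, seed=0, gamma1=0.8):
    """Hypothetical DGP: Y1 = 1 + 0.5*x1 + u1, Y2 = 1[0.3 + 0.5*x1 + z + v2 > 0]."""
    rng = random.Random(seed)
    X, X1, s, y1 = [], [], [], []
    for _ in range(n):
        x1, z, v2 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        u1 = gamma1 * v2 + rng.gauss(0, 0.5)
        X.append([1.0, x1, z])   # selection equation; z is excluded from X1
        X1.append([1.0, x1])     # outcome equation regressors
        s.append(0.3 + 0.5 * x1 + 1.0 * z + v2 > 0)
        y1.append(1.0 + 0.5 * x1 + u1)
    delta_hat = probit(X, s)                        # step A: probit on all n
    sel = [i for i in range(n) if s[i]]
    y_sel = [y1[i] for i in sel]
    lam = [inverse_mills(sum(a * b for a, b in zip(X[i], delta_hat))) for i in sel]
    naive = ols([X1[i] for i in sel], y_sel)        # OLS ignoring selection
    heckit = ols([X1[i] + [l] for i, l in zip(sel, lam)], y_sel)  # step B
    return naive, heckit


naive, heckit = simulate()
print("naive OLS (const, x1):        ", [round(b, 3) for b in naive])
print("Heckit    (const, x1, lambda):", [round(b, 3) for b in heckit])
```

The sketch reports point estimates only. As the notes point out, the usual OLS standard errors from the second step are not exactly correct when the coefficient on the inverse Mills ratio is nonzero, so standard errors would need a separate correction that is not shown here.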