An Introduction to Stata for Economists
Part II: Data Analysis
Kerry L. Papps

1. Overview
Do-files
Summary statistics
Correlation
Linear regression
Generating predicted values and hypothesis testing
Instrumental variables and other estimators
Panel data capabilities
Panel estimators

2. Overview (cont.)
Writing loops
Graphs

3. Comment on notation used
Consider the following syntax description:
    list [varlist] [in range]
Text in typewriter-style font should be typed exactly as it appears (although there are possibilities for abbreviation).
Italicised text should be replaced by desired variable names etc.
Square brackets (i.e. [ ]) enclose optional parts of a Stata command (do not actually type the brackets).

4. Comment on notation used (cont.)
For example, an actual Stata command might be:
    list name occupation
This notation is consistent with the notation in the Stata Help menu and manuals.

5. Do-files
Do-files allow commands to be saved and executed in “batch” form.
We will use the Stata do-file editor to write do-files.
To open the do-file editor, click Window > Do-File Editor or click the corresponding toolbar button.
You can also use WordPad or Notepad: save as “Text Document” with the extension “.do” (instead of “.txt”). This allows larger files than the do-file editor.

6. Do-files (cont.)
Note: a blank line must be included at the end of a WordPad do-file (otherwise the last line will not run).
To run a do-file from within the do-file editor, either select Tools > Do or click the corresponding toolbar button.
If you highlight certain lines of code, only those commands will run.
To run a do-file from the main Stata window, either select File > Do or type:
    do dofilename

7. Do-files (cont.)
You can “comment out” lines by preceding them with * or by enclosing text within /* and */.
You can save the contents of the Review window as a do-file by right-clicking on the window and selecting “Save All”.

8. Univariate summary statistics
tabstat produces a table of summary statistics:
    tabstat varlist [, statistics(statlist)]
Example:
    tabstat age educ, stats(mean sd semean n)
summarize displays a variety of univariate summary statistics (number of non-missing observations, mean, standard deviation, minimum, maximum):
    summarize varlist

9. Multivariate summary statistics
table displays a table of statistics:
    table rowvar [colvar] [, contents(clist varname)]
clist can be freq, mean, sum etc.
rowvar and colvar may be numeric or string variables.
Example:
    table sex educ, c(mean age median inc)

10. Multivariate summary statistics (cont.)
One “super-column” and up to 4 “super-rows” are also allowed.
Missing values are excluded from tables by default. To include them as a group, use the missing option with table.

EXERCISE 1
11. Generating simple statistics
Open the do-file editor in Stata. Run all your solutions to the exercises from here.
Open nlswork.dta from the internet as follows:
    webuse nlswork
Type summarize to look at the summary statistics for all variables in the dataset.
Generate a wage variable, which exponentiates ln_wage:
    gen wage=exp(ln_wage)

EXERCISE 1 (cont.)
12. Generating simple statistics
Restrict summarize to hours and wage and perform it separately for non-married and married (i.e. msp=0 and 1).
Use tabstat to report the mean, median, minimum and maximum for hours and wage.
Report the mean and median of wage by age (along the rows) and race (across the columns):
    table age race, c(mean wage median wage)

13. Sets of dummy variables
Dummy variables take the values 0 and 1 only.
Large sets of dummy variables can be created with:
    tab varname, gen(dummyname)
When using large numbers of dummies in regressions, it is useful to name them with a pattern, e.g. id1, id2, ...
Then id* can be used to refer to all variables whose names begin with id.

14. Correlation
To obtain the correlation between a set of variables, type:
    correlate varlist [weight] [, covariance]
The covariance option displays the covariances rather than the correlation coefficients.
pwcorr displays all the pairwise correlation coefficients between the variables in varlist:
    pwcorr varlist [weight] [, sig]

15. Correlation (cont.)
The sig option adds a line to each row of the matrix reporting the significance level of each correlation coefficient.
The difference between correlate and pwcorr is that the former performs listwise deletion of missing observations while the latter performs pairwise deletion.
To display the estimated covariance matrix of the coefficients after a regression command, use:
    estat vce

16. Correlation (cont.)
(This matrix can also be displayed using Stata's matrix commands, which we will not cover in this course.)

17. Linear regression
To perform a linear regression of depvar on varlist, type:
    regress depvar varlist [weight] [if exp] [, noconstant robust]
depvar is the dependent variable.
varlist is the set of independent variables (regressors).
By default Stata includes a constant. The noconstant option excludes it.

18. Linear regression (cont.)
robust specifies that Stata report the Huber-White standard errors (which account for heteroskedasticity).
Weights are often used, e.g. when data are group averages, as in:
    regress inflation unemplrate year [aweight=pop]
This is weighted least squares (a special case of GLS).
Note that here year allows for a linear time trend.

19. Post-estimation commands
After all estimation commands (e.g. regress, logit), several predicted values can be computed using predict.
predict refers to the most recent model estimated.
    predict yhat, xb
creates a new variable yhat equal to the linear prediction (after regress, the predicted values of the dependent variable).
    predict res, residual
creates a new variable res equal to the residuals.

20. Post-estimation commands (cont.)
Linear hypotheses can be tested (e.g. t-test or F-test) after estimating a model by using test.
    test varlist
tests that the coefficients corresponding to every element in varlist jointly equal zero.
    test eqlist
tests the restrictions in eqlist, e.g.:
    test sex=3
The accumulate option allows a hypothesis to be tested jointly with the previously tested hypotheses.

21. Post-estimation commands (cont.)
Example:
    regress lnw sex race school age
    test sex race
    test school = age, accum

EXERCISE 2
22. Linear regression
Compute the correlation between wage and grade. Is it significant at the 1% level?
Generate a variable called age2 that is equal to the square of age (the power operator in Stata is ^).
Create a set of race dummies with:
    tab race, gen(race)
Regress ln_wage on: age, age2, race2, race3, msp, grade, tenure, c_city.

EXERCISE 2 (cont.)
23. Linear regression
Display the covariance matrix from this regression.
Use predict to generate a variable res containing the residuals from the equation.
Use summarize to confirm that the mean of the residuals is zero.
Rerun the regression and report Huber-White standard errors.

24. Additional estimators
Instrumental variables:
    ivregress 2sls depvar exogvars (endogvars=ivvars)
Both exogvars and ivvars are used as instruments for endogvars.
For example:
    ivregress 2sls price inc pop (qty=cost)
Logit:
    logit depvar indepvars

25. Additional estimators (cont.)
Probit:
    probit depvar indepvars
Ordered probit:
    oprobit depvar indepvars
Tobit:
    tobit depvar indepvars, ll(cutoff)
For example, tobit could be used to estimate labour supply:
    tobit hrs educ age child, ll(0)

EXERCISE 3
26. IV and probit
Repeat the regression from Exercise 2 using ivregress 2sls and instrument for tenure using union and south. Compare the results with those from Exercise 2.
Estimate a probit model for union with the following regressors: age, age2, race2, race3, msp, grade, c_city, south.

27. Panel data manipulation
Panel data generally refer to the repeated observation of a set of fixed entities at fixed intervals of time (also known as longitudinal data).
Stata is particularly good at arranging and analysing panel data.
Stata refers to two panel display formats:
Wide form: useful for display purposes and often the form in which data are obtained.
Long form: needed for regressions etc.

28. Panel data manipulation (cont.)
Example of wide form:
[Example table not recovered; surviving column labels: i, xij]
Note the naming convention for inc.

29. Panel data manipulation (cont.)
Example of long form:
[Example table not recovered; surviving column labels: i, j, xij]

30. Panel data manipulation (cont.)
To change from long to wide form, type:
    reshape wide varlist, i(ivarname) j(jvarname)
varlist is the list of variables to be converted from long to wide form.
i(ivarname) specifies the variable(s) whose unique values denote the spatial unit.
j(jvarname) specifies the variable whose unique values denote the time period.

31. Panel data manipulation (cont.)
To change from wide to long form, type:
    reshape long stublist, i(ivarname) j(jvarname)
stublist is the “word” part of the names of variables to be converted from wide to long form, e.g. “inc” above.
It is important to name variables in this format, i.e. word description followed by year.

32. Panel data manipulation (cont.)
To move between the above example datasets use:
    reshape long inc, i(id) j(year)
    reshape wide inc, i(id) j(year)
These steps “undo” each other.

33. Lags
You can “declare” the data to be in panel form with the tsset command:
    tsset panelvar timevar
For example:
    tsset country year
After using tsset, a lag can be created with:
    gen lagname = L.varname
Similarly, L2.varname gives the second lag.

34. Panel estimators
Panel data estimation:
    xtreg depvar indepvars [, re fe i(panelvar)]
i(panelvar) specifies the variable corresponding to an independent unit (e.g. country). This can be omitted if the data have been tsset.
re and fe specify how we wish to treat the time-invariant error term (random effects vs fixed effects).

35. Panel estimators (cont.)
An alternative to fe is to regress depvar on indepvars together with a set of dummy variables, one for each panel unit.
You should either drop one dummy or use the noconstant option to avoid the dummy variable trap, although Stata automatically drops regressors when they are perfectly collinear.
To perform a Hausman test of fixed vs random effects, first run each estimator and save the estimates, then use the hausman command:

36. Panel estimators (cont.)
    xtreg depvar indepvars, fe
    estimates store fe_name
    xtreg depvar indepvars, re
    estimates store re_name
    hausman fe_name re_name
You must list fe_name before re_name in the hausman command.

EXERCISE 4
37. Manipulating a panel
Declare the data to be a panel using tsset, noting that idcode is the panel variable and year is the time variable.
Generate a new variable lwage equal to the lag of wage and confirm that this contains the correct values by listing some data (use the break button):
    list idcode year wage lwage
Save the file as “NLS data” in a folder of your choice.

EXERCISE 4 (cont.)
38. Manipulating a panel
Using the same regressors from the regress command in Exercise 2, run a fixed effects regression for ln_wage using xtreg. Note that all time-invariant variables are dropped.
Store the estimates as fixed.
Run a random effects regression and store the estimates as random.
Perform a Hausman test of random vs fixed effects. Which is preferred?

EXERCISE 4 (cont.)
39. Manipulating a panel
Drop all variables other than idcode, year and wage using the keep command (quicker than using drop).
Use the reshape wide command to rearrange the data so that the first column represents each person (idcode) and the other columns contain wage for a particular year.
Return the data to long form (change wide to long in the command).

EXERCISE 4 (cont.)
40. Manipulating a panel
Do not save the new dataset.

41. Writing loops
The foreach command allows one to repeat a sequence of commands over a set of variables:
    foreach name of varlist varlist {
        Stata commands referring to `name'
    }
Stata sequentially sets name equal to each element in varlist and executes the commands enclosed in braces.
name should be enclosed within the characters ` and ' when referred to within the braces.

42. Writing loops (cont.)
name can be any word and is an example of a “local macro”.
For example:
    foreach var of varlist age educ inc {
        gen l`var'=log(`var')
        drop `var'
    }

EXERCISE 5
43. Using loops in regression
Open “NLS data” and rerun the fixed effects regression from Exercise 4.
Use foreach with varlist to loop over all the regressors and report their t-statistics (using test).
Use foreach with varlist to create a loop that renames each variable by adding “68” to the end of the existing name.

44. Graphs
To obtain a basic histogram of varname, type:
    histogram varname, discrete freq
To display a scatterplot of two (or more) variables, type:
    scatter varlist [weight]
weight determines the diameter of the markers used in the scatterplot.

45. Graphs (cont.)
There are options for (among other things):
Adding a title (title)
Altering the scale of the axes (xscale, yscale)
Specifying what axis labels to use (xlabel, ylabel)
Changing the markers used (msymbol)
Changing the connecting lines (connect)

46. Graphs (cont.)
Particularly useful is mlabel(varname), which uses the values of varname as marker labels in the scatterplot.
Example:
    scatter gdp unemplrate, mlabel(country)

47. Graphs (cont.)
Graphs are not saved by log files (they appear in separate windows). To save one, select File > Save Graph.
To insert a graph in a Word document etc., select Edit > Copy and then paste into the Word document. The pasted graph can be resized but is not interactive (unlike Excel charts etc.).
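
Worked example: summary statistics and correlation. The following do-file fragment is a minimal sketch of the tabstat, summarize, correlate and pwcorr commands described in the slides. It assumes internet access (for webuse) and uses the nlswork dataset from Exercise 1; the particular variables chosen (hours, wage, grade, tenure) are only an illustration, not taken from the slides.

```stata
* Sketch only: assumes webuse can reach the Stata example datasets
webuse nlswork, clear
gen wage = exp(ln_wage)

* Univariate summaries
summarize hours wage
tabstat hours wage, statistics(mean median min max)

* The same kind of table split by marital status (msp = 0 or 1)
tabstat hours wage, statistics(mean median) by(msp)

* Listwise deletion: uses only observations complete on all three variables
correlate wage grade tenure

* Pairwise deletion, with the significance level under each coefficient
pwcorr wage grade tenure, sig
```

Because correlate and pwcorr handle missing values differently, the two matrices need not agree when some of the variables have missing observations.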
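
Worked example: regression and post-estimation. This sketch strings together regress, estat vce, predict and test as described in the slides. The regressors are chosen for illustration only (they are not the specification asked for in Exercise 2), and the variable names yhat and res are arbitrary.

```stata
* Sketch only: illustrative specification on the nlswork data
webuse nlswork, clear
gen age2 = age^2

* OLS with Huber-White (heteroskedasticity-robust) standard errors
regress ln_wage age age2 grade tenure, robust

* Covariance matrix of the coefficient estimates
estat vce

* Fitted values and residuals from the most recent model
predict yhat if e(sample), xb
predict res if e(sample), residual
summarize res

* Joint test that both age coefficients are zero
test age age2

* Add a second restriction and test it jointly with the previous one
test grade = tenure, accumulate
```

Restricting predict to e(sample) keeps the fitted values and residuals on the observations actually used in estimation.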
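
Worked example: reshape and lags. The example tables on the wide-form and long-form slides did not survive extraction, so the sketch below builds a tiny long-form dataset by hand. The numbers are made up purely for illustration; id, year and inc follow the names used in the reshape example on the slides, and laginc is an arbitrary name.

```stata
* Sketch only: hypothetical data typed in by hand
clear
input id year inc
1 2001 100
1 2002 110
2 2001 200
2 2002 210
end

* Long to wide: one row per id, with columns inc2001 and inc2002
reshape wide inc, i(id) j(year)
list

* Wide back to long: "inc" is the stub, year is rebuilt from the suffixes
reshape long inc, i(id) j(year)
list

* Declare the panel structure, then create a one-period lag
tsset id year
gen laginc = L.inc
list id year inc laginc
```

The first observation for each id has a missing lag, since there is no earlier period within that panel unit.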
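
Worked example: fixed effects, random effects and the Hausman test. This is a minimal sketch of the sequence described on the panel estimator slides, run on the nlswork data. The regressors (age, tenure, hours) are an assumption made for brevity and are not the specification requested in Exercise 4; fixed and random are simply the names given to the stored estimates.

```stata
* Sketch only: illustrative regressors on the nlswork panel
webuse nlswork, clear
tsset idcode year

* Fixed effects estimates, stored under the name "fixed"
xtreg ln_wage age tenure hours, fe
estimates store fixed

* Random effects estimates, stored under the name "random"
xtreg ln_wage age tenure hours, re
estimates store random

* Fixed effects must be listed first
hausman fixed random
```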
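
Worked example: a foreach loop. The loop syntax on the slides lost its braces and macro quotes in extraction, so here is a small runnable sketch of the same pattern. It assumes the auto dataset shipped with Stata (loaded with sysuse) rather than any data from the course; the choice of variables is arbitrary.

```stata
* Sketch only: uses the auto dataset that ships with Stata
sysuse auto, clear

foreach var of varlist price weight length {
    * `var' expands to price, then weight, then length
    gen l`var' = log(`var')
    summarize l`var'
}
```

On each pass the local macro var holds one variable name, so the loop creates lprice, lweight and llength in turn.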
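
Worked example: graphs. A short sketch of the histogram and scatter commands with a few of the options listed on the slides, again assuming the auto dataset shipped with Stata. The title text and the file name mpg_weight are hypothetical; graph save is used here as the command-line counterpart of saving through the File menu.

```stata
* Sketch only: uses the auto dataset that ships with Stata
sysuse auto, clear

* Histogram of a discrete variable, showing frequencies
histogram rep78, discrete freq

* Scatterplot with each marker labelled by the car's make
scatter mpg weight, mlabel(make) title("Mileage against weight")

* Save the graph currently displayed to mpg_weight.gph
graph save "mpg_weight", replace
```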
