version 8
capture log close
set matsize 500
set more off
set scheme spost // only used for generating graphs for publication
log using st8ch2.log, replace text

// *
// * RM4STATA Ch 2: Introduction to Stata - 5/26/2003
// *

// * Section 2.7: using and saving datasets
use nomocc2.dta, clear
save nomocc3.dta, replace

// * Section 2.9: command or do files

* example.do: short do file
/*
log using example, replace
use binlfp2, clear
tabulate hc wc, row nolabel
log close
*/

* example2.do: short do file using comments
/*
==> short simple do file
==> for didactic purposes

log using example, replace // this text is ignored
* next we load the data
use binlfp2, clear
* tabulate husband's and wife's education
tabulate hc wc, /// the next line is the continuation of this one
    row nolabel
* close up
log close
* make sure there is a cr at the end!
*/

* long lines using #delimit
use gsskidvalue2.dta
#delimit ;
recode income91 1=500 2=1500 3=3500 4=4500 5=5500 6=6500 7=7500 8=9000
    9=11250 10=13750 11=16250 12=18750 13=21250 14=23750 15=27500 16=32500
    17=37500 18=45000 19=55000 20=67500 21=75000 *=. ;
#delimit cr

* Tip: long lines
recode income91 1=500 2=1500 3=3500 4=4500 5=5500 6=6500 7=7500 8=9000 ///
    9=11250 10=13750 11=16250 12=18750 13=21250 14=23750 15=27500 16=32500 ///
    17=37500 18=45000 19=55000 20=67500 21=75000 *=.

// * Section 2.9.5: recommended structure of do files
/*
* Note: version number ensures compatibility with later Stata releases
version 8
* Note: if a log file is open, close it
capture log close
* Note: don't pause when output scrolls off the page
set more off
* Note: log results to file myfile.log
log using myfile, replace text
* myfile.do - written 29 jan 2003 to illustrate do files
* Note: your commands go here
* Note: close the log file.
log close
*/

// * Section 2.11: syntax of stata commands
use binlfp2, clear
tabulate hc wc if age>=40, row

// * Section 2.11.2: variable lists
sum age inc k5
sum
sum k*

// * Section 2.11.3: if and in
use gsskidvalue2, clear
sum income if age>=25 & age<=65
sum income if age>=25 & age<=65 & female==1
sum income if (age<25 | age>65) & age~=. & female==1

// * Section 2.12.2: getting information
use binlfp2, clear
describe
sum age, detail
tab hc
tab hc, nolabel
tab hc wc
tab1 hc wc
dotplot age
graph export 02dotplot.eps, replace
codebook age

// * Section 2.13.1: generate
use binlfp2, clear
generate age2 = age
summarize age2 age
gen age3 = age if age>40
sum age3 age
gen agesq = age^2
gen lnage = ln(age)

// * Section 2.13.2: replace
gen age4 = age
replace age4 = 40 if age<40
sum age4 age

// * Section 2.13.3: recode
use recodedata2.dta, clear
recode origvar (1=2) (3=4), generate(myvar1)
recode origvar (2=1) (*=0), gen(myvar2)
recode origvar (2=1) (nonmissing=0), gen(myvar3)
recode origvar (1/4=2), gen(myvar4)
recode origvar (1 3 4 5=7), gen(myvar5)
recode origvar (min/5=min), gen(myvar6)
recode origvar (missing=9), gen(myvar7)
recode origvar (.=-999) (1/3=-999) (7=-999)
recode origvar (-999=.), gen(myvar8)

// * Section 2.13.4: common transformations of rhs variables

* breaking a categorical variable into a set of dummy variables
use gsskidvalue2, clear

* example 1 - tab, gen()
tab degree, gen(edlevel)
sum edlevel*
tab degree edlevel1, missing

* example 2 - gen if
gen hsdeg = (degree==1 | degree==2) if degree<.
gen coldeg = (degree==3) if degree<.
gen graddeg = (degree==4) if degree<.
tab degree coldeg, missing

* more examples of creating binary variables
use ordwarm2, clear

* example 1 - a single binary variable
gen ed12plus = (ed>=12) if ed<.

* example 2 - three indicator variables
gen edlt13 = (ed<=12) if ed<.
gen ed1316 = (ed>=13 & ed<=16) if ed<.
gen ed17plus = (ed>=17) if ed<.
* example 3 - recode to a binary outcome
gen wrmagree = warm
recode wrmagree 1=0 2=0 3=1 4=1
tab wrmagree warm

* nonlinear transformations
use gsskidvalue2, clear
gen agesq = age*age
gen lnincome = ln(income)
sum age agesq income lnincome

* interaction terms
gen feminc = female * income

// * Section 2.14.1: variable label

* adding a label
label variable agesq "Age squared"
describe agesq

* dropping a label
label variable agesq
describe agesq

// * Section 2.14.2: value labels

* defining labels
label define yesno 1 yes 0 no
label define posneg4 1 veryN 2 negative 3 positive 4 veryP
label define agree4 1 StrongA 2 Agree 3 Disagree 4 StrongD
label define agree5 1 StrongA 2 Agree 3 Neutral 4 Disagree 5 StrongD

* assigning labels
label values female yesno
label values black yesno
label values anykids yesno
describe female black anykids
tab anykids

* defining and assigning labels
label define degree 0 "no_hs" 1 "hs" 2 "jun_col" 3 "bachelor" 4 "graduate"
label values degree degree
tab degree

// * Section 2.14.3: notes

* assign notes
notes: small General Social Survey extract for Stata book
notes income: self-reported family income, measured in dollars
notes income: refusals coded as missing

* list notes
notes

// * Section 2.15: macros

* global macros
use binlfp2, clear
global myopt = ", ce m nol ch nokey"
tab lfp wc $myopt
tab lfp hc $myopt
tab wc hc $myopt
tab lfp wc, ce m nol ch nokey
tab lfp hc, ce m nol ch nokey
tab wc hc, ce m nol ch nokey

* local macros
local myopt = ", ce m nol ch nokey"
tab lfp wc `myopt'
tab lfp hc `myopt'
tab wc hc `myopt'

* macro functions
global wclabel : variable label wc
display "$wclabel"

// * Section 2.16.1: the graph command
use lfpgraph2, clear
list income kid0p1 kid1p1

* a simple graph
graph twoway scatter kid0p1 kid1p1 kid2p1 income
graph export 02graphsimple.eps, replace

graph twoway (connected kid0p1 income) ///
    (scatter kid1p1 kid2p1 income)
graph export 02graphsimplec.eps, replace

graph twoway (connected kid0p1 kid1p1 kid2p1 income), ///
    ytitle("Probability") ///
    title("Predicted Probability of Female LFP") ///
    subtitle("(as predicted by logit model)") ///
    xtitle("Family income, excluding wife's") ///
    caption("Data from 1976 PSID-T Mroz")
graph export 02graphsimpled.eps, replace

graph twoway (connected kid0p1 kid1p1 kid2p1 income), ///
    xlabel(10 "minimum" 50 "median" 90 "maximum")
graph export 02graphsimpleg.eps, replace

graph twoway (connected kid0p1 kid1p1 kid2p1 income), ///
    ytitle("Probability") ///
    title("Predicted Probability of Female LFP") ///
    subtitle("(as predicted by logit model)") ///
    xtitle("Family income, excluding wife's") ///
    caption("Data from 1976 PSID-T Mroz") ///
    xlabel(10 20 30 40 50 60 70 80 90) ///
    legend(symxsize(9)) name(graph1, replace)
graph export 02graphsimpleh.eps, replace

* make graphs for graph combine example
use ordwarm2, clear
ologit warm yr89 male white age ed prst, nolog
prgen age, from(20) to(80) generate(w89) x(male=0 yr89=1) ncases(13)
label var w89p1 "SD"
label var w89p2 "D"
label var w89p3 "A"
label var w89p4 "SA"
label var w89s1 "SD"
label var w89s2 "SD or D"
label var w89s3 "SD, D or A"

* step 1: graph predicted probabilities
graph twoway connected w89p1 w89p2 w89p3 w89p4 w89x, ///
    title("Panel A: Predicted Probabilities") ///
    xtitle("Age") xlabel(20(10)80) ylabel(0(.25).50) ///
    yscale(noline) ylabel("") xline(44.93) ///
    ytitle("") name(graph1, replace)

* step 2: graph cumulative probabilities
graph twoway connected w89s1 w89s2 w89s3 w89x, ///
    title("Panel B: Cumulative Probabilities") ///
    xtitle("Age") xlabel(20(10)80) ylabel(0(.25)1) ///
    yscale(noline) ylabel("") xline(44.93) name(graph2, replace) ///
    ytitle("")

* step 3: combine graphs
//graph combine graph1 graph2, iscale(*.9) imargin(small)
graph combine graph1 graph2, imargin(small)
graph display, xsize(8) ysize(4)
graph export 02graphssimplee.eps, replace

graph combine graph1 graph2, iscale(*.9) imargin(small) ///
    ysize(3.9) xsize(3.5405) col(1)
graph display, xsize(6) ysize(8)
graph export 02graphssimplef.eps, replace

// * Section 2.17: tutorial - see st8ch2tutorial.do

log close