#delimit;
clear;
capture log close;
log using lab3.log, replace;

***********************************************************
LAB3.DO is a STATA do-file that inputs data from the 1997
March CPS, runs regressions, and tests and corrects for
heteroskedasticity and autocorrelation.

- written by Dan Rosenbaum, 2006

Located at my course website
(http://www.uncg.edu/bae/people/rosenbaum/Eco643/main.html)
is a file containing a 1 in 100 subsample of all persons in
the 1997 March CPS who are 25-54 and not in the armed
forces.  The file is named cps97.raw and is stored as
space-delimited ASCII.  Here is a short description of the
variables in the order that they are found in the data.

AGE      = age in years
RACE     = 1 if white
         = 2 if black
         = 3 if other
FEMALE   = 1 if female, 0 otherwise
EDUCATT  = 11 if high school dropout
         = 12 if high school graduate
         = 14 if some college
         = 16 if bachelors degree
         = 18 if masters degree or above
EARN     = annual earnings in nominal dollars
WEEKS    = total weeks worked last year
HOURS    = total hours worked last year
NUMKID   = number of children
MS       = 1 if married with spouse present
         = 2 if married but spouse absent
         = 3 if separated
         = 4 if divorced
         = 5 if never married
         = 6 if widowed
WGT      = March Supplement Weight
YEAR     = year (four digits) in July of previous year
STATE    = state of residence, alphabetical order (1-51)
INSCHOOL = 1 if attending school, 0 otherwise
UR       = state unemployment rate (in percentage points)
***********************************************************;

***********************************************************
I start by inputting the data using an INFILE statement,
since the data is space-delimited rather than tab-delimited.
I also calculate summary statistics for the sample.
***********************************************************;

infile age race female educatt earn weeks hours numkid ms wgt year
  state inschool ur using cps97;

***********************************************************
I rescale the annual earnings variable, so that the
coefficient estimates are easier to interpret.
***********************************************************;

gen earn1=earn/1000;
sum;

***********************************************************
Here I regress annual earnings onto female, age, and
educational attainment.  I then calculate and plot the
residuals.  It appears that the variance of the error term
increases with educational attainment.
***********************************************************;

reg earn1 female age educatt;
predict uhat, resid;
plot uhat educatt;

***********************************************************
YOU CAN IGNORE THIS!
Here I run the Goldfeld-Quandt test.  My high variance group
is those with 16 or more years of education and my low
variance group is those with 12 or fewer years of education.
Note how I use the DISP statement and FPROB function to
display the p-value for the F-statistic.  We reject the null
hypothesis of homoskedasticity.
***********************************************************;

reg earn1 female age educatt if educatt>=16;
reg earn1 female age educatt if educatt<=12;
disp fprob(149,236,1244/235);

***********************************************************
YOU CAN IGNORE THIS!
Here I run the Breusch-Pagan test.  Note that I create
"normalized" squared residuals.  I compute the test
statistic both with and without the assumption that the
errors in the original model are normally distributed.  The
null hypothesis of homoskedasticity is rejected in both
cases.
***********************************************************;

gen u2=uhat^2;
egen u2ave=mean(u2);
gen u2bp=u2/u2ave;
sum uhat u2 u2ave u2bp;
reg u2bp female age educatt;
predict u2bphat;
disp chiprob(3,364.8/2);
disp chiprob(3,560*0.0674);

***********************************************************
Here I run the White test.  Again, the null hypothesis of
homoskedasticity is rejected.
***********************************************************;

gen age2=age^2;
gen educatt2=educatt^2;
gen ageeduc=age*educatt;
gen fage=female*age;
gen feducatt=female*educatt;
sum age2 educatt2 ageeduc fage feducatt;
reg u2 female age age2 educatt educatt2 ageeduc fage feducatt;
disp chiprob(8,560*0.0977);

***********************************************************
Here I run GLS assuming that the variance of the error term
is proportional to age.  Note that the AWEIGHT=x option
implies that the observations are weighted by the square
root of x.
***********************************************************;

reg earn1 female age educatt [aweight=1/age];

***********************************************************
Here I run GLS assuming that the variance of the error term
is proportional to educational attainment.
***********************************************************;

reg earn1 female age educatt [aweight=1/educatt];

***********************************************************
Here I run GLS assuming that the variance of the error term
is an exponential function of an intercept, female, age, and
educational attainment.
***********************************************************;

gen lnu2=log(u2);
reg lnu2 female age educatt;
predict lnu2hata;
gen u2hata=exp(lnu2hata);
sum lnu2 lnu2hata u2 u2hata;
reg earn1 female age educatt [aweight=1/u2hata];

***********************************************************
Here I run GLS assuming that the variance of the error term
is an exponential function of an intercept, the predicted
value of EARN1, and its square.
***********************************************************;

reg earn1 female age educatt;
predict yhat;
gen yhat2=yhat^2;
reg lnu2 yhat yhat2;
predict lnu2hatb;
gen u2hatb=exp(lnu2hatb);
sum lnu2 lnu2hatb u2 u2hatb;
reg earn1 female age educatt [aweight=1/u2hatb];

***********************************************************
Here I run the same GLS by hand, again assuming that the
variance of the error term is an exponential function of an
intercept, the predicted value of EARN1, and its square.  I
divide every variable, including the intercept, by the
square root of the predicted variance and then run OLS
without a constant.
***********************************************************;

gen earn1_t=earn1/sqrt(u2hatb);
gen int_t=1/sqrt(u2hatb);
gen age_t=age/sqrt(u2hatb);
gen educ_t=educatt/sqrt(u2hatb);
gen fem_t=female/sqrt(u2hatb);
reg earn1_t fem_t age_t educ_t int_t, noc;

***********************************************************
Here I calculate White standard errors.
**********************************************************;

reg earn1 female age educatt, robust;

***********************************************************
Here I clear the data so that I can read in a new dataset.
**********************************************************;

clear;

***********************************************************
Located at my course website
(http://www.uncg.edu/bae/people/rosenbaum/Eco643/main.html)
is a file containing inflation and unemployment data from
1948 through 1996.  The file is named phillips.raw and is
stored as space-delimited ASCII.  Here is a short
description of the variables in the order that they are
found in the data.  Notice that I limit the sample to just
data from 1970 onward.
YEAR    = year (1948-1996)
UNEM    = civilian unemployment rate
INF     = CPI inflation rate
UNEM_1  = UNEM lagged once
INF_1   = INF lagged once
UNEM_2  = UNEM lagged twice
INF_2   = INF lagged twice
CUNEM   = UNEM - UNEM_1
CINF    = INF - INF_1
CUNEM_1 = CUNEM lagged once
CINF_1  = CINF lagged once
**********************************************************;

infile year unem inf unem_1 inf_1 unem_2 inf_2 cunem cinf cunem_1
  cinf_1 using phillips;
drop if year<1970;
sum;

***********************************************************
Here I first set the variable that orders the data, in this
case the YEAR variable.  Then I regress the inflation rate
onto unemployment and calculate the Durbin-Watson statistic.
**********************************************************;

tsset year;
reg inf unem;
dwstat;

***********************************************************
Here I regress the residuals onto the lagged residuals to
test for AR(1) serial correlation.
**********************************************************;

predict uhat1,resid;
gen uhat1_1=uhat1[_n-1];
reg uhat1 uhat1_1;

***********************************************************
Here I use the Prais-Winsten estimation method to estimate
Feasible GLS.
**********************************************************;

prais inf unem;
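
***********************************************************
ILLUSTRATIVE SKETCH, NOT PART OF THE ORIGINAL LAB.
The quasi-differencing behind Feasible GLS for AR(1) errors
can be approximated by hand.  The AR(1) coefficient is taken
from the regression of the residuals onto the lagged
residuals, each variable (including the intercept) is
quasi-differenced, and OLS is run without a constant.
Because the first observation is dropped rather than
rescaled, and the coefficient is not iterated, this is a
one-step Cochrane-Orcutt estimate rather than Prais-Winsten,
so the results need not match the PRAIS output exactly.
The names RHOHAT, INF_QD, UNEM_QD, and INT_QD are
hypothetical and do not appear in the data.  The L. lag
operator assumes that TSSET YEAR has already been run.
**********************************************************;

reg uhat1 uhat1_1;
scalar rhohat=_b[uhat1_1];
gen inf_qd=inf-scalar(rhohat)*L.inf;
gen unem_qd=unem-scalar(rhohat)*L.unem;
gen int_qd=1-scalar(rhohat);
reg inf_qd unem_qd int_qd, noc;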