This study requires each student to compile a sample of quarterly observations, in the UK, over a period of ten years. Thus, n=40. Each student will be given a specific period and area. You will collect data on these two variables.
(1) Birth rates (BR) (2) GDP per capita (GDP)
Table 1
Year Birth rate (Y) GDP per capita £ (x) Y*x x²
1991 Q1 15.86 3941 62504.26 15531481
1991 Q2 16 3923 62766 15389929
1991 Q3 16.79 3905 65564.95 15249025
1991 Q4 15.7 3905 61308.5 15249025
1992 Q1 15.92 3907 62199.44 15264649
1992 Q2 16.19 3898 63108.62 15194404
1992 Q3 16.4 3915 64206 15327225
1992 Q4 15.06 3934 59246.04 15476356
1993 Q1 15.52 3956 61397.12 15649936
1993 Q2 15.66 3972 62201.52 15776784
1993 Q3 16.22 4004 64944.88 16032016
1993 Q4 15.05 4037 60756.85 16297369
1994 Q1 15.3 4078 62393.4 16630084
1994 Q2 15.59 4132 64417.88 17073424
1994 Q3 15.52 4187 64982.24 17530969
1994 Q4 14.77 4215 62255.55 17766225
1995 Q1 14.64 4227 61883.28 17867529
1995 Q2 15.25 4245 64736.25 18020025
1995 Q3 15.36 4290 65894.4 18404100
1995 Q4 14.47 4309 62351.23 18567481
1996 Q1 14.49 4348 63002.52 18905104
1996 Q2 14.61 4363 63743.43 19035769
1996 Q3 15.57 4389 68336.73 19263321
1996 Q4 15.11 4420 66786.2 19536400
1997 Q1 14.53 4460 64803.8 19891600
1997 Q2 15.02 4499 67574.98 20241001
1997 Q3 15.14 4533 68629.62 20548089
1997 Q4 14.4 4579 65937.6 20967241
1998 Q1 14.29 4619 66005.51 21335161
1998 Q2 14.61 4651 67951.11 21631801
1998 Q3 15.23 4695 71504.85 22043025
1998 Q4 14.25 4746 67630.5 22524516
1999 Q1 13.98 4769 66670.62 22743361
1999 Q2 14.43 4794 69177.42 22982436
1999 Q3 14.68 4855 71271.4 23571025
1999 Q4 13.91 4910 68298.1 24108100
2000 Q1 13.69 4975 68107.75 24750625
2000 Q2 13.75 5026 69107.5 25260676
2000 Q3 14.14 5043 71308.02 25431849
2000 Q4 13.67 5073 69347.91 25735329
total 600.77 174728 2614314.30 768804465
The quarterly birth rates were obtained by interpolation, which converts yearly data into quarterly data. First I collected the number of live births, then the total female population aged 15-44 (the fertile age range), and then I interpolated to obtain quarterly birth rates. Each quarterly rate is the number of babies born per 1000 women.
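A minimal sketch of this step, assuming simple linear interpolation between consecutive annual figures. The report does not say which interpolation method was used, so the method, the function names and the annual values below are all illustrative assumptions and not the data behind Table 1.

```python
def birth_rate_per_1000(live_births, women_15_44):
    """Births per 1000 women aged 15-44."""
    return 1000.0 * live_births / women_15_44


def interpolate_quarterly(annual_values):
    """Linearly interpolate between consecutive annual values.

    Returns four quarterly values for every year except the last,
    which only serves as the end point of the final interval.
    """
    quarterly = []
    for start, end in zip(annual_values, annual_values[1:]):
        step = (end - start) / 4.0
        quarterly.extend(start + step * q for q in range(4))
    return quarterly


# Hypothetical annual rates, for illustration only (not the Table 1 data).
annual_rates = [60.0, 64.0, 62.0]
quarterly_rates = interpolate_quarterly(annual_rates)
print(quarterly_rates)
```

Linear interpolation spreads the annual change evenly over the four quarters, so any seasonal pattern in the quarterly series has to come from the underlying source data and not from this step.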
2. The equation to be estimated is:
BRi = b0 + b1 GDPi+ ui (i)
Answer these questions:
(i) In terms of the literature on demand for children, what would you expect to find for the coefficient on b1?
Given that the birth rate is the number of births per 1000 women, a positive sign on b1 means that the birth rate and GDP have a direct relationship: as GDP increases, so does the birth rate.
(ii) Explain how you would modify the model, implied by this equation, if there is an ‘Engel curve’ relationship in the demand for children.
An Engel curve shows that the demand for children does not change with income at a constant rate, because the curve is non-linear. To modify the equation, a GDP deflator should be used in calculating GDP, since the Engel curve is biased by the bias in the consumer price index. The Engel curve also has low explanatory power because it exhibits the problem of heteroscedasticity, which means that it is not a fully satisfactory model for explaining human behavior (Pasinetti 1981). Heteroscedasticity is caused by the omission of variables, averaging of data and errors of measurement. In addition, data collected from a cross section are likely to exhibit heteroscedasticity because the variance depends on the size of the group. Heteroscedasticity needs to be treated whenever it occurs because it violates one of the assumptions of ordinary least squares. It can be treated by using a logarithmic model, by reducing the scale of the variables and by incorporating all the relevant variables into the model.
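One way to make the logarithmic model concrete is the semi-log form BR = a0 + a1 ln(GDP) + u, in which the birth rate responds to proportional changes in income. The sketch below is only an illustration under that assumption: the functional form, the helper `fit_line` and the choice of the ten first-quarter observations from Table 1 are mine, not part of the assignment.

```python
import math


def fit_line(x, y):
    """Simple OLS of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# First-quarter observations from Table 1 (1991 Q1 to 2000 Q1).
gdp = [3941, 3907, 3956, 4078, 4227, 4348, 4460, 4619, 4769, 4975]
br = [15.86, 15.92, 15.52, 15.30, 14.64, 14.49, 14.53, 14.29, 13.98, 13.69]

# Semi-log form: BR = a0 + a1 * ln(GDP)
a0, a1 = fit_line([math.log(g) for g in gdp], br)
print(f"BR = {a0:.3f} + {a1:.3f} ln(GDP)")
```

In this form a1/100 is approximately the change in the birth rate associated with a 1% change in GDP per capita.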
(iii) Why is there a constant term in the equation with no variable attached?
The constant term b0 is the intercept of the equation: it gives the expected value of the birth rate when GDP is zero, so it does not need a variable attached. It also absorbs the average effect of the factors left out of the equation, which allows the ‘u’ term to have a mean of zero.
(iv) Why do these types of equations have a ‘u’ term?
Without the ‘u’ term the equation described above, like many economic equations, would be deterministic in nature. A deterministic equation is one in which each value of the independent variable has one and only one corresponding value of the dependent variable. However, deterministic equations are not realistic because human behavior cannot be determined in exact quantities. The ‘u’ term is therefore incorporated to take care of errors in measurement, omitted variables and errors of specification: there are many small factors that influence the dependent variable but are too small to be incorporated in the equation individually.
3. Estimate equation (i) by OLS and present the results in a suitable table
(n.b. marks will be lost for simply pasting over the computer output)
(i) Comment on the result for the coefficient on the GDP variable.
For calculation purposes in this question let the birth rate = Y and GDP = x.
BRi = b0 + b1 GDPi + ui.
b0 = [∑Yi ∑xi² - ∑xi ∑Yixi] / [n∑xi² - (∑xi)²]
b1 = [n∑Yixi - ∑Yi ∑xi] / [n∑xi² - (∑xi)²]
b0 = [(600.77)(174728) - (2614314.30)(174728)] / [(40)(174728) - (174728)²] = 14.962
b1 = [40(2614314.30) - (600.77)(174728)] / [(40)(174728) - (174728)²] = 0.00013
The model can now be specified as Y = 14.962 + 0.0013X
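As a cross-check on the hand calculation, the standard OLS formulas can be evaluated directly from the column totals in the last row of Table 1. This is only a sketch: the variable names are mine, and it takes the totals as printed in the table.

```python
# Column totals taken from the last row of Table 1.
n = 40
sum_y = 600.77          # sum of birth rates (Y)
sum_x = 174728          # sum of GDP per capita (x)
sum_xy = 2614314.30     # sum of Y*x
sum_x2 = 768804465      # sum of x squared

denominator = n * sum_x2 - sum_x ** 2
b1 = (n * sum_xy - sum_x * sum_y) / denominator
b0 = (sum_y * sum_x2 - sum_x * sum_xy) / denominator

print(f"b0 = {b0:.4f}, b1 = {b1:.6f}")
```

Note that the denominator uses ∑x² = 768804465 (the last column of Table 1) and not ∑x. Run on these totals, the sketch returns a negative slope, so the substitution in the hand calculation is worth checking against it.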
(ii) Comment on the R squared statistic.
The R squared statistic is obtained as the explained sum of squares divided by the total sum of squares:
R² = [b0∑Yi + b1∑Yixi] / ∑(Yi)² = [(14.962*600.77) + (0.00013*2614314.30)] / (600.77)
This gives an R squared of 2.5%, which can be explained as follows: holding all other factors constant, GDP explains 2.5% of the variation in the birth rate, while the remaining 97.5% is explained by other factors.
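R squared is usually computed from deviations about the mean, as 1 - RSS/TSS. The sketch below applies that definition to the 40 observations in Table 1; it is a check under the assumption that the table values are entered correctly, and the variable names are mine.

```python
# Birth rate (Y) and GDP per capita (x), 1991 Q1 to 2000 Q4, from Table 1.
br = [15.86, 16.00, 16.79, 15.70, 15.92, 16.19, 16.40, 15.06,
      15.52, 15.66, 16.22, 15.05, 15.30, 15.59, 15.52, 14.77,
      14.64, 15.25, 15.36, 14.47, 14.49, 14.61, 15.57, 15.11,
      14.53, 15.02, 15.14, 14.40, 14.29, 14.61, 15.23, 14.25,
      13.98, 14.43, 14.68, 13.91, 13.69, 13.75, 14.14, 13.67]
gdp = [3941, 3923, 3905, 3905, 3907, 3898, 3915, 3934,
       3956, 3972, 4004, 4037, 4078, 4132, 4187, 4215,
       4227, 4245, 4290, 4309, 4348, 4363, 4389, 4420,
       4460, 4499, 4533, 4579, 4619, 4651, 4695, 4746,
       4769, 4794, 4855, 4910, 4975, 5026, 5043, 5073]

n = len(br)
mean_y = sum(br) / n
mean_x = sum(gdp) / n

# Simple OLS slope and intercept in deviation form.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gdp, br))
sxx = sum((x - mean_x) ** 2 for x in gdp)
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

tss = sum((y - mean_y) ** 2 for y in br)                      # total sum of squares
rss = sum((y - b0 - b1 * x) ** 2 for x, y in zip(gdp, br))    # residual sum of squares
r_squared = 1 - rss / tss

print(f"b0 = {b0:.4f}, b1 = {b1:.6f}, R squared = {r_squared:.4f}")
```

Because this version works with deviations from the mean, the result always lies between 0 and 1 for a regression with a constant term.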
(iii) Derive estimates of the income elasticity of demand for children from your results.
Y=14.962+0.0013X. Assume that X=4905, then Y (birth rate) = 21.33. Income elasticity is obtained by dividing the percentage change in quantity demanded by the percentage change in income. In this case we shall assume that the birth rate is the number of children demanded and estimate at two points, X=4905 and X=3115.
The change in income is 1790, or 57.46%.
Y=14.962+0.0013*(4905) = 21.3385
Y=14.962+0.0013*(3115) = 19.0115
The change in birth rate is 2.327, or 12.24%.
The income elasticity is given as 12.24/57.46 = 0.21
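The elasticity between the two points can be sketched as follows, taking the percentage change in the birth rate over the percentage change in income, with both measured from the lower point. The function names are mine, and the coefficients are the ones reported in the text.

```python
def predicted_birth_rate(gdp, b0=14.962, b1=0.0013):
    """Fitted birth rate from the model reported in the text."""
    return b0 + b1 * gdp


def arc_elasticity(x0, x1, y0, y1):
    """Percentage change in y divided by percentage change in x."""
    pct_change_y = (y1 - y0) / y0
    pct_change_x = (x1 - x0) / x0
    return pct_change_y / pct_change_x


x_low, x_high = 3115, 4905
y_low = predicted_birth_rate(x_low)
y_high = predicted_birth_rate(x_high)
elasticity = arc_elasticity(x_low, x_high, y_low, y_high)
print(round(elasticity, 3))
```

An alternative that avoids choosing two arbitrary points is the point elasticity at the sample means, b1 multiplied by mean GDP and divided by the mean birth rate.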
4. Carry out the following hypothesis tests:
(i) b0=0 against the two- sided alternative at the 1% level
Step 1: formulate the hypothesis
H0: b0 = 0 (b0 is not significant)
HA: b0 is not equal to zero (b0 is significant)
H0: b1 = 0 (b1 is not significant)
HA: b1 is not equal to zero (b1 is significant)
Step 2: Obtain the absolute value of the calculated t statistic.
t-calculated b0 = b0 / standard error of b0
Standard error of model = 6.475637037 / (40 - 2) = 0.1703
Standard error of b0 = 0.1703√{768804465 / (40*768804465 - 40*4368.2*4368.2)} = 0.0273
t-calculated b1 = b1 / standard error of b1
Standard error of b1 = 0.1703√{1 / (768804465 - 40*4368.2*4368.2)} = 0.00722
Step3: obtain the critical t statistic from the t tables
t-critical = t with n-k degrees of freedom at α/2
α/2 = 0.01/2 = 0.005; n = 40; k = 2
t-critical=2.704
Step 4: compare the calculated t and the critical t
If t-calculated is greater than t-critical, reject the null hypothesis; if t-calculated is less than t-critical, do not reject the null hypothesis.
For b0, t-calculated = 14.962/0.0273 = 548.06, which is greater than 2.704; we reject the null hypothesis, thus b0 is statistically significant at the 1% level.
For b1, t-calculated = 0.00013/0.00722 = 0.018, which is less than 2.704; we do not reject the null hypothesis, and conclude that b1 is not statistically significant at the 1% level.
(ii) b1 = 0 against the two-sided alternative at the 5% level.
The critical t value for a two-sided alternative at the 5% level is 2.021. The calculated t for b1 (0.018) is less than 2.021, so b1 is not statistically significant at this level.
(iii) b0<0 against the alternative at the 5% level
H0: b0 < 0
HA: b0 >= 0
This is a one-sided test, so the whole 5% is placed in one tail and the critical t value is 1.684. The calculated t for b0 (548.06) is greater than 1.684, so the null hypothesis that b0 is negative is rejected at this level.
(iv) b1<0 against the alternative at the 5% level
H0: b1 < 0
HA: b1 >= 0
The calculated t for b1 (0.018) is less than 1.684, so the null hypothesis cannot be rejected; b1 is not statistically significant at this level.
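The standard errors behind these tests can be sketched with the textbook formulas for a two-variable regression. The sketch assumes that the figure 6.475637037 quoted above is the residual sum of squares, which the text does not state explicitly; the variable and function names are mine.

```python
import math

n, k = 40, 2
rss = 6.475637037        # assumed to be the residual sum of squares
sum_x2 = 768804465       # sum of x squared, from Table 1
mean_x = 4368.2          # mean of x, as used in the text

sxx = sum_x2 - n * mean_x ** 2           # sum of squared deviations of x
s = math.sqrt(rss / (n - k))             # standard error of the regression
se_b1 = s / math.sqrt(sxx)
se_b0 = s * math.sqrt(sum_x2 / (n * sxx))


def t_statistic(coefficient, standard_error):
    """Calculated t for the null hypothesis that the coefficient is zero."""
    return coefficient / standard_error


print(f"s = {s:.4f}, se(b0) = {se_b0:.4f}, se(b1) = {se_b1:.6f}")
```

The standard error of the regression is the square root of RSS/(n-k), and the denominator for the standard error of b0 is n multiplied by the sum of squared deviations of x, so both points are worth checking against the hand calculation.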
5. Differences in the pattern of births, over the calendar year, may cause serious problems with the accuracy of your results for this model. Outline the simple ‘seasonal dummy’ method of dealing with this and apply it to your data to produce a new set of results.
After estimating equation (i), the model can be written as Y=14.962+0.0013X. The data collected indicate that the birth rate increases in the third quarter. To take care of the seasonal change that occurs in the third quarter we introduce a dummy variable D as follows: Y=14.962+0.0013X(D), where D=1 in the third quarter and D=0 in every other quarter.
In the third quarter, when D=1, the birth rate will be given as 14.9633, while in the other quarters the birth rate will be the intercept of the model at 14.962. This intercept is lower than 26.605, which is the intercept of the extended model estimated in Question 7.
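In its usual form the seasonal dummy method adds a 0/1 intercept dummy for each quarter except one base quarter, so that the model becomes Y = b0 + b1 x + d2 Q2 + d3 Q3 + d4 Q4 + u. The sketch below estimates that specification on the Table 1 data. It assumes Q1 as the base quarter and measures GDP in £ thousands to keep the normal equations well scaled; `solve` and `ols` are helper names introduced here, not part of any library.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gauss-Jordan elimination."""
    m = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    cols = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(cols)]
           for i in range(cols)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(cols)]
    return solve(xtx, xty)


# Birth rate (Y) and GDP per capita (x), 1991 Q1 to 2000 Q4, from Table 1.
br = [15.86, 16.00, 16.79, 15.70, 15.92, 16.19, 16.40, 15.06,
      15.52, 15.66, 16.22, 15.05, 15.30, 15.59, 15.52, 14.77,
      14.64, 15.25, 15.36, 14.47, 14.49, 14.61, 15.57, 15.11,
      14.53, 15.02, 15.14, 14.40, 14.29, 14.61, 15.23, 14.25,
      13.98, 14.43, 14.68, 13.91, 13.69, 13.75, 14.14, 13.67]
gdp = [3941, 3923, 3905, 3905, 3907, 3898, 3915, 3934,
       3956, 3972, 4004, 4037, 4078, 4132, 4187, 4215,
       4227, 4245, 4290, 4309, 4348, 4363, 4389, 4420,
       4460, 4499, 4533, 4579, 4619, 4651, 4695, 4746,
       4769, 4794, 4855, 4910, 4975, 5026, 5043, 5073]

# One row per quarter: constant, GDP (in £ thousands), dummies for Q2, Q3, Q4.
rows = []
for t, g in enumerate(gdp):
    quarter = t % 4 + 1
    dummies = [1.0 if quarter == q else 0.0 for q in (2, 3, 4)]
    rows.append([1.0, g / 1000.0] + dummies)

b0, b1, d2, d3, d4 = ols(rows, br)
print(f"b0={b0:.3f} b1={b1:.3f} d2={d2:.3f} d3={d3:.3f} d4={d4:.3f}")
```

Each dummy coefficient measures how far that quarter's birth rate lies above or below the base quarter at the same level of GDP. Only three dummies are used with a constant term, because including all four would create perfect multicollinearity.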
6. Compare your new set of results (from Q.5) with your original results (from Q.3). You may consider the following relevant:
(i)Whether the new model offers a significant improvement in goodness of fit.
The new model does not improve the goodness of fit, because the goodness of fit only increases when new variables that make the model a better estimate are added. Introducing the dummy variable in this way therefore has no effect on the goodness of fit.
(ii) Assessing whether there has been any major change in estimated income elasticity.
Using the same example where X=4905 and X=3115, the model is rewritten as:
Y = 14.962 + (0.0013*4905)D = 14.962 + 6.3765D; when D=1 then Y = 21.3385
Y = 14.962 + (0.0013*3115)D = 14.962 + 4.0495D; when D=1 then Y = 19.0115
These are the same fitted values as before, so it can be concluded that introducing the dummy variable does not significantly change the income elasticity of demand for children.
(iv) Assessing which quarters of the year tend to have, ceteris paribus, a higher or lower birth rate than others.
The quarters with the highest birth rate when all other factors are held constant are the third quarters. Throughout the data collected, the third quarters have a higher birth rate.
7. You should now write a short report of 450-600 words. This should briefly summarize your findings but most of your answer should consist of further exploration of your data (such as collecting further explanatory variables and estimating new regressions) and suggestions for improvement of the model you have estimated.
A model with dummy variables may at times exhibit perfect multicollinearity. The consequences of multicollinearity include:
Treating multicollinearity
In this example, I have chosen to redesign the model to include new variables as a treatment for multicollinearity, for the same period (1991-2000), as shown in the table below. The new variables are Marriages, Employment Female, Unemployment Female and Deaths (Infants under one year).
Redesigned model to be estimated
Key: Y = Birth rate; x1 = GDP per capita (£); x2 = Marriages; x3 = Employment Female; x4 = Unemployment Female; x5 = Deaths (Infants under one year)
Table 2
Year Y x1 x2 x3 x4 x5
1991 Q1 15.86 3941 46.8 10647 488 1.57
1991 Q2 16 3923 101.6 10639 540.9 1.49
1991 Q3 16.79 3905 138.7 10562 584 1.35
1991 Q4 15.7 3905 62.7 10548 598.3 1.41
1992 Q1 15.92 3907 45.4 10495 618.4 1.36
1992 Q2 16.19 3898 101.9 10485 633.1 1.26
1992 Q3 16.4 3915 146.2 10302 656.9 1.22
1992 Q4 15.06 3934 62.3 10585 677.4 1.29
1993 Q1 15.52 3956 41.7 10528 686.3 1.29
1993 Q2 15.66 3972 100.5 10626 681.4 1.31
1993 Q3 16.22 4004 138.5 10633 677.8 1.15
1993 Q4 15.05 4037 60.9 10695 653.9 1.31
1994 Q1 15.3 4078 41.7 10603 637.9 1.22
1994 Q2 15.59 4132 95.5 10645 622.9 1.18
1994 Q3 15.52 4187 134.4 10663 614.3 1.08
1994 Q4 14.77 4215 59.8 10867 583.2 1.16
1995 Q1 14.64 4227 38.6 10762 560.8 1.16
1995 Q2 15.25 4245 92.8 10870 551.2 1.12
1995 Q3 15.36 4290 135.4 10821 544.6 1.06
1995 Q4 14.47 4309 55.2 11053 535.6 1.18
1996 Q1 14.49 4348 41 10992 523.8 1.06
1996 Q2 14.61 4363 91.4 11160 520 1.07
1996 Q3 15.57 4389 129.4 11230 506.6 1.13
1996 Q4 15.11 4420 55.8 11333 465.9 1.12
1997 Q1 14.53 4460 39.3 11207 414.9 1.1
1997 Q2 15.02 4499 87.1 11329 382.9 1.1
1997 Q3 15.14 4533 128.9 11361 346.6 1.01
1997 Q4 14.4 4579 54.9 11659 336.9 1.08
1998 Q1 14.29 4619 37.7 11614 326.9 1.02
1998 Q2 14.61 4651 85.6 11654 319.6 0.97
1998 Q3 15.23 4695 125.5 11728 315.7 0.98
1998 Q4 14.25 4746 56 11811 311.3 1.1
1999 Q1 13.98 4769 36.9 11688 307.8 1.06
1999 Q2 14.43 4794 83.2 11774 299.3 1.02
1999 Q3 14.68 4855 126.6 11827 284.5 0.99
1999 Q4 13.91 4910 52.1 11845 280.6 0.98
2000 Q1 13.69 4975 35.2 12494 273.2 1
2000 Q2 13.75 5026 84.7 12523 262.2 0.93
2000 Q3 14.14 5043 132.5 12603 247.6 0.96
2000 Q4 13.67 5073 52.9 12674 244.4 0.93
Summary Output Report.
Graph showing residuals against the birth rates.
Regression Statistics
Multiple R 0.964956395
R Square 0.931140844
Adjusted R Square 0.921014498
Standard Error 0.221707584
Observations 40
Variable Coefficient Standard error t-statistic
Intercept 26.6055 3.1146 8.5420
GDP per capita £ (x) -0.0028 0.00082 -3.4092
Marriages 0.0091 0.0010 8.4872
Employment Female 0.000094724 0.00028 0.3346
Unemployment Female -0.0025 0.00087 -2.9581
Deaths (Infants under one year) 0.0700 0.6953 0.1007
(All numbers have been presented to 4s.f)
The above table shows the regression results after increasing the number of independent variables from one to five. The model therefore takes the form Y = b0 + b1x1 + b2x2 + b3x3 + b4x4 + b5x5 + µ. The statistical significance of each coefficient is tested at the 1% significance level. The estimated model can be written as follows:
Y = 26.6055 - 0.0028X1 + 0.0091X2 + 0.000094724X3 - 0.0025X4 + 0.0700X5
S.E (3.1146) (0.00082) (0.0010) (0.00028) (0.00087) (0.6953)
A coefficient is statistically significant if the absolute value of its t statistic exceeds the 1% two-sided critical value, which is approximately 2.73 for 40 - 6 = 34 degrees of freedom.
H0: b0 = 0 (b0 is not significant)
HA: b0 is not equal to zero (b0 is significant)
8.5420 > 2.73, thus we reject the null hypothesis; b0 is statistically significant.
H0: b1 = 0 (b1 is not significant)
HA: b1 is not equal to zero (b1 is significant)
The absolute value of -3.4092 is greater than 2.73, so the null hypothesis is rejected and b1 is statistically significant in the model.
H0: b2 = 0 (b2 is not significant)
HA: b2 is not equal to zero (b2 is significant)
8.4872 > 2.73, so the null hypothesis is rejected and b2 is statistically significant in the model.
H0: b3 = 0 (b3 is not significant)
HA: b3 is not equal to zero (b3 is significant)
0.3346 < 2.73, so the null hypothesis is not rejected; b3 is not statistically significant in the model.
H0: b4 = 0 (b4 is not significant)
HA: b4 is not equal to zero (b4 is significant)
The absolute value of -2.9581 is greater than 2.73, so the null hypothesis is rejected and b4 is of statistical significance to the model.
H0: b5 = 0 (b5 is not significant)
HA: b5 is not equal to zero (b5 is significant)
0.1007 < 2.73, so the null hypothesis is not rejected; b5 is not statistically significant in the model.
Conclusion
From the above results it can be concluded that GDP per capita, marriages and female unemployment are all statistically significant in the model, while female employment and infant deaths are not.
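The decision rule for each coefficient compares the absolute value of its t statistic with the critical value. The sketch below applies the rule to the t statistics in the regression output. It reuses the 1% critical value of 2.704 quoted in Question 4 as an assumption; with 34 degrees of freedom the tabulated value is slightly higher, which does not change any of the decisions here. The names are mine.

```python
# t statistics copied from the regression output table.
t_statistics = {
    "Intercept": 8.5420,
    "GDP per capita": -3.4092,
    "Marriages": 8.4872,
    "Employment Female": 0.3346,
    "Unemployment Female": -2.9581,
    "Deaths (Infants under one year)": 0.1007,
}

T_CRITICAL = 2.704  # assumed 1% two-sided critical value, as quoted in Question 4


def is_significant(t_stat, t_critical):
    """Two-sided test: reject H0 (coefficient = 0) when |t| exceeds the critical value."""
    return abs(t_stat) > t_critical


significant = [name for name, t in t_statistics.items()
               if is_significant(t, T_CRITICAL)]
print(significant)
```

The comparison is always between the t statistic and the critical value; the standard error enters only through the t statistic itself.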
The above shows the results of a scatter diagram for the birth rates and all the statistically significant variables of the model.
Y = 26.6055 - 0.0028X1 + 0.0091X2 + 0.000094724X3 - 0.0025X4 + 0.0700X5. The newly estimated model shows that the birth rate decreases as GDP per capita and female unemployment increase, because the coefficients on these variables are negative. Female employment, marriages and infant deaths have a direct relationship with the birth rate.
R Square 0.931140844
Adjusted R Square 0.921014498
The R square for the model is 93%, which shows that the variables explain 93% of the variation in the birth rate, while the remaining 7% is explained by other factors. The adjusted R square (92%) is lower than the R square because it penalizes each additional explanatory variable; it increases only if an additional term improves the model by more than would be expected by chance.
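The link between the two statistics can be checked with the usual adjustment formula, 1 - (1 - R²)(n - 1)/(n - k - 1), where k is the number of explanatory variables. This is a short sketch using the values from the regression output; the function name is mine.

```python
def adjusted_r_squared(r_squared, n_obs, n_regressors):
    """Adjusted R squared for a model with a constant and n_regressors variables."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)


adj = adjusted_r_squared(0.931140844, 40, 5)
print(round(adj, 6))
```

With 40 observations and five explanatory variables this reproduces the adjusted figure in the output, which confirms that the gap between the two statistics comes from the degrees-of-freedom penalty alone.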