Air pollution control specialists in southern California monitor the amount of ozone, carbon dioxide, and nitrogen dioxide in the air on an hourly basis. The hourly time series data exhibit seasonality, with the levels of pollutants showing patterns that vary over the hours in the day. On July 15, 16, and 17, the following levels of nitrogen dioxide were observed for the 12 hours from 6:00 A.M. to 6:00 P.M.
July 15:  25  28  35  50  60  60  40  35  30  25  25  20 
July 16:  28  30  35  48  60  65  50  40  35  25  20  20 
July 17:  35  42  45  70  72  75  60  45  40  25  25  25 
(b)  Use a multiple linear regression model with dummy variables as follows to develop an equation to account for seasonal effects in the data:  
Hour1 = 1 if the reading was made between 6:00 A.M. and 7:00 A.M.; 0 otherwise  
Hour2 = 1 if the reading was made between 7:00 A.M. and 8:00 A.M.; 0 otherwise  
.  
.  
.  
Hour11 = 1 if the reading was made between 4:00 P.M. and 5:00 P.M.; 0 otherwise  
Note that when the values of the 11 dummy variables are equal to 0, the observation corresponds to the 5:00 P.M. to 6:00 P.M. hour.  
If required, round your answers to three decimal places. For subtractive or negative numbers use a minus sign even if there is a + sign before the blank. (Example: -300) Do not round intermediate calculations.  
Value = ___ + ___ Hour1 + ___ Hour2 + ___ Hour3 + ___ Hour4 + ___ Hour5 + ___ Hour6 + ___ Hour7 + ___ Hour8 + ___ Hour9 + ___ Hour10 + ___ Hour11  
(c)  Using the equation developed in part (b), compute estimates of the levels of nitrogen dioxide for July 18.  
If required, round your answers to three decimal places. Do not round intermediate calculations.  


(d)  Let t = 1 refer to the observation in hour 1 on July 15; t = 2 refer to the observation in hour 2 of July 15; …; and t = 36 refer to the observation in hour 12 of July 17. Using the dummy variables defined in part (b) and t, develop an equation to account for seasonal effects and any linear trend in the time series.  
If required, round your answers to three decimal places. For subtractive or negative numbers use a minus sign even if there is a + sign before the blank. (Example: -300)  
Value = ___ + ___ Hour1 + ___ Hour2 + ___ Hour3 + ___ Hour4 + ___ Hour5 + ___ Hour6 + ___ Hour7 + ___ Hour8 + ___ Hour9 + ___ Hour10 + ___ Hour11 + ___ t  
(e)  Based on the seasonal effects in the data and the linear trend estimated in part (d), compute estimates of the levels of nitrogen dioxide for July 18.  
If required, round your answers to three decimal places.  


(f)  Is the model you developed in part (b) or the model you developed in part (d) more effective?  
If required, round your answers to three decimal places.  


Select your answer: Model developed in part (b) / Model developed in part (d)  
Justify your answer. 
Answer:
Step1
Forecasting is a technique for predicting future values from present and past data, typically through trend analysis. A time series is a set of observations measured at successive points in time or over successive periods of time.
Step2
a.
Construct the time series plot using the XLMINER software. The procedure is given below:
1. Enter the provided data into a spreadsheet; the screenshot is shown below:
2. Select the provided data range and then click on the “XLMINER” Platform tab in the ribbon.
3. From the “Data Analysis” group select the “Explore” option. In the Explore option select the “Chart wizard” option.
A new dialog box will appear. Select the “Line chart” option and press “Next”. Now select “level” and “time period”: select “time period” and press “Next”, then select “level” and press “Next” again. Press “Finish”; the screenshot of the resulting time series plot is shown below:
The above time series plot indicates a seasonal pattern in the level of nitrogen dioxide, the pollutant recorded in the data.
Step3
b.
The multiple linear regression model is given as:
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$
Here, the intercept $b_0$ is the predicted value of $y$ when $x_1, x_2, \dots, x_k$ are equal to zero, and $b_1, b_2, \dots, b_k$ are the slope coefficients. According to the provided criteria, introduce the dummy variables for 11 hours for the three days of levels. Thus the obtained dummy variables for the 11 hours will be:
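The dummy coding described above can be sketched in plain Python. This is only an illustration of the coding rule from the problem statement, not the spreadsheet layout used in the original answer; the names `readings`, `hour_dummies`, and `rows` are assumptions made for this sketch.

```python
# Hourly nitrogen dioxide readings from the problem statement
# (12 hours per day, 6:00 A.M. to 6:00 P.M.).
readings = {
    "July 15": [25, 28, 35, 50, 60, 60, 40, 35, 30, 25, 25, 20],
    "July 16": [28, 30, 35, 48, 60, 65, 50, 40, 35, 25, 20, 20],
    "July 17": [35, 42, 45, 70, 72, 75, 60, 45, 40, 25, 25, 25],
}

def hour_dummies(hour):
    """Return [Hour1, ..., Hour11] for an hour index 1..12.

    Hour 12 (5:00 P.M. to 6:00 P.M.) is the baseline: all 11 dummies are 0.
    """
    return [1 if hour == j else 0 for j in range(1, 12)]

# One row per observation: (day, hour, dummies, level).
rows = [
    (day, hour, hour_dummies(hour), level)
    for day, levels in readings.items()
    for hour, level in enumerate(levels, start=1)
]

# Show the first two rows and the baseline row of July 15.
for day, hour, dummies, level in rows[:2] + rows[11:12]:
    print(day, hour, dummies, level)
```

Each day contributes 12 rows, so the regression data set has 36 rows and 11 dummy columns.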
Step4
Consider “Hour1…Hour11” as the explanatory variables and “level” as the dependent variable, and regress level on the explanatory variables in Excel as follows:
1. Select the provided data range and then click on the “XLMINER” Platform tab in the ribbon.
2. Select “Predict” in the “Data Mining” group and select the “Multiple Linear Regression” option.
3. A new dialog box will appear. Select the explanatory variables in the “Selected variables” box and the dependent variable in the “Output variable” box. The screenshot is shown below:
Step5
4. Click the “Next” option in the above dialog box; a new dialog box will appear. Select the options as shown below:
5. Press the “Finish” option in the above dialog box; the screenshot of the obtained regression analysis is shown below:
According to the above output, the multiple regression equation for the seasonal effects is given as:
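As a cross-check that does not depend on XLMINER, the coefficients can be reproduced in a few lines of Python. The sketch relies on a standard property of least squares: when the only regressors are an intercept and dummies for all but one category, the intercept equals the mean of the baseline category (hour 12) and each coefficient equals that hour's mean minus the baseline mean. The variable names are assumptions for this sketch.

```python
# Nitrogen dioxide levels: one list per day, 12 hourly readings each.
levels = [
    [25, 28, 35, 50, 60, 60, 40, 35, 30, 25, 25, 20],  # July 15
    [28, 30, 35, 48, 60, 65, 50, 40, 35, 25, 20, 20],  # July 16
    [35, 42, 45, 70, 72, 75, 60, 45, 40, 25, 25, 25],  # July 17
]

n_days = len(levels)

# Mean level for each of the 12 hours across the three days.
hour_means = [sum(day[h] for day in levels) / n_days for h in range(12)]

# Baseline is hour 12 (all dummies equal to 0).
intercept = hour_means[11]

# Coefficient of HourJ = mean of hour J minus the baseline mean.
coefs = [hour_means[h] - intercept for h in range(11)]

equation = f"Value = {intercept:.3f}" + "".join(
    f" {b:+.3f} Hour{j}" for j, b in enumerate(coefs, start=1)
)
print(equation)
```

The printed equation has the same form as the fill-in template in part (b), with values rounded to three decimal places.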
Step6
c.
To estimate the level of nitrogen dioxide for July 18, use the multiple regression equation obtained in part (b). Calculate the forecast for each hour as shown below:
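The hour-by-hour calculation can be sketched as follows. The code rebuilds the part (b) coefficients from the hour means (a property of a dummy-only least-squares fit) and then evaluates the equation with the appropriate dummy set to 1 for each hour. The function name `forecast` is an assumption for this sketch.

```python
levels = [
    [25, 28, 35, 50, 60, 60, 40, 35, 30, 25, 25, 20],  # July 15
    [28, 30, 35, 48, 60, 65, 50, 40, 35, 25, 20, 20],  # July 16
    [35, 42, 45, 70, 72, 75, 60, 45, 40, 25, 25, 25],  # July 17
]

n_days = len(levels)
hour_means = [sum(day[h] for day in levels) / n_days for h in range(12)]

# Part (b) model: intercept is the hour-12 mean, coefficients are offsets from it.
intercept = hour_means[11]
coefs = [hour_means[h] - intercept for h in range(11)]

def forecast(hour):
    """Evaluate the seasonal-only equation for an hour index 1..12."""
    dummies = [1 if hour == j else 0 for j in range(1, 12)]
    return intercept + sum(b * d for b, d in zip(coefs, dummies))

forecasts_b = [forecast(h) for h in range(1, 13)]

for h, f in enumerate(forecasts_b, start=1):
    print(f"July 18, hour {h:2d}: {f:.3f}")
```

Because the model contains no trend term, each July 18 forecast is simply the three-day average for that hour.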
Step7
Step8
Step9
Step10
Step11
Step12
Step13
Step14
Step15
Step16
d.
Now introduce a new variable to account for the linear trend in the data: add a new explanatory variable “Time (t)” to the data. The screenshot of the data file is shown below:
Step17
Now follow the same procedure as in part (b); the screenshot of the output is shown below:
According to the above output, the multiple regression equation to predict the seasonal effects and linear trend is given as:
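The fit with both the hour dummies and t can also be reproduced without XLMINER. The sketch below solves the least-squares normal equations (X'X)b = X'y with a small Gaussian elimination routine in plain Python; `solve` and `ols` are helper names introduced for this illustration, and in practice a library least-squares routine would normally be used instead.

```python
levels = [
    [25, 28, 35, 50, 60, 60, 40, 35, 30, 25, 25, 20],  # July 15
    [28, 30, 35, 48, 60, 65, 50, 40, 35, 25, 20, 20],  # July 16
    [35, 42, 45, 70, 72, 75, 60, 45, 40, 25, 25, 25],  # July 17
]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Response vector in time order, t = 1..36.
y = [float(v) for day in levels for v in day]

# Design matrix columns: intercept, Hour1..Hour11, t.
X = []
for t in range(1, 37):
    hour = (t - 1) % 12 + 1
    dummies = [1.0 if hour == j else 0.0 for j in range(1, 12)]
    X.append([1.0] + dummies + [float(t)])

beta = ols(X, y)
intercept, hour_coefs, trend = beta[0], beta[1:12], beta[12]

print(f"Value = {intercept:.3f}"
      + "".join(f" {b:+.3f} Hour{j}" for j, b in enumerate(hour_coefs, 1))
      + f" {trend:+.3f} t")
```

The last coefficient is the estimated change in the nitrogen dioxide level per hour of elapsed observation time, after the hour-of-day effects are accounted for.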
Step18
e.
To predict the hourly forecasts for July 18, use the multiple regression equation obtained in part (d). The forecast for each hour of July 18 is calculated as below:
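The July 18 forecasts (t = 37 to 48) can be sketched in Python as well. To keep the example short, it does not refit the full regression. Instead it uses an equivalent shortcut for a model with hour dummies plus t: the trend slope equals the slope of within-hour deviations of the level on within-hour deviations of t (partialling out the dummies), and each fitted line passes through that hour's mean point. This shortcut is a substitute for the XLMINER output, and the names below are assumptions for the sketch.

```python
levels = [
    [25, 28, 35, 50, 60, 60, 40, 35, 30, 25, 25, 20],  # July 15
    [28, 30, 35, 48, 60, 65, 50, 40, 35, 25, 20, 20],  # July 16
    [35, 42, 45, 70, 72, 75, 60, 45, 40, 25, 25, 25],  # July 17
]

n_days, n_hours = len(levels), 12

def t_of(day, hour):
    """Time index t for zero-based day and hour (t = 1 for hour 1 of July 15)."""
    return n_hours * day + hour + 1

# Per-hour means of the level and of t over the three observed days.
y_means = [sum(levels[d][h] for d in range(n_days)) / n_days
           for h in range(n_hours)]
t_means = [sum(t_of(d, h) for d in range(n_days)) / n_days
           for h in range(n_hours)]

# Trend slope from within-hour deviations (dummies partialled out).
num = sum((t_of(d, h) - t_means[h]) * (levels[d][h] - y_means[h])
          for d in range(n_days) for h in range(n_hours))
den = sum((t_of(d, h) - t_means[h]) ** 2
          for d in range(n_days) for h in range(n_hours))
slope = num / den

# July 18 is zero-based day 3, so t runs from 37 to 48.
forecasts_d = [y_means[h] + slope * (t_of(3, h) - t_means[h])
               for h in range(n_hours)]

for h, f in enumerate(forecasts_d, start=1):
    print(f"July 18, hour {h:2d} (t = {36 + h}): {f:.3f}")
```

Each forecast is the hour's three-day average plus the trend carried forward from the middle of the sample to July 18.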
Step19
Step20
Step21
f.
According to the results obtained in part (b), the mean squared error (MSE) is calculated as:
According to the results obtained in part (d), the MSE is calculated as:
Hence, the mean squared error for the model from part (d), which includes the seasonal effects and the linear trend, is smaller than the mean squared error for the model from part (b), which includes only the seasonal effects. So the model obtained in part (d) is more effective. This supports the initial impression from the time series plot created in part (a): the data show a linear trend with seasonality.
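The comparison can be checked numerically with a short Python sketch. Two assumptions are made here and should be adjusted to match the course convention: MSE is taken as the sum of squared errors divided by the number of observations (36), and the part (d) residuals are obtained with the within-hour shortcut for the trend slope rather than from the XLMINER output. The degrees-of-freedom version (dividing by n minus the number of estimated coefficients) is printed alongside for reference.

```python
levels = [
    [25, 28, 35, 50, 60, 60, 40, 35, 30, 25, 25, 20],  # July 15
    [28, 30, 35, 48, 60, 65, 50, 40, 35, 25, 20, 20],  # July 16
    [35, 42, 45, 70, 72, 75, 60, 45, 40, 25, 25, 25],  # July 17
]

n_days, n_hours = len(levels), 12
n = n_days * n_hours

y_means = [sum(levels[d][h] for d in range(n_days)) / n_days
           for h in range(n_hours)]

# Within-hour deviations of y and of t (t = 12 * day + hour + 1).
pairs = []
for d in range(n_days):
    for h in range(n_hours):
        t_dev = n_hours * (d - (n_days - 1) / 2)  # t minus its per-hour mean
        y_dev = levels[d][h] - y_means[h]
        pairs.append((t_dev, y_dev))

# Part (b): residuals are the deviations from the hour means.
sse_b = sum(y_dev ** 2 for _, y_dev in pairs)

# Part (d): remove the fitted trend from those deviations.
slope = (sum(t_dev * y_dev for t_dev, y_dev in pairs)
         / sum(t_dev ** 2 for t_dev, _ in pairs))
sse_d = sum((y_dev - slope * t_dev) ** 2 for t_dev, y_dev in pairs)

# Assumption: MSE = SSE / n; the SSE / (n - p) variant is shown too.
mse_b, mse_d = sse_b / n, sse_d / n
print(f"Part (b): SSE = {sse_b:.3f}, SSE/n = {mse_b:.3f}, "
      f"SSE/(n-12) = {sse_b / (n - 12):.3f}")
print(f"Part (d): SSE = {sse_d:.3f}, SSE/n = {mse_d:.3f}, "
      f"SSE/(n-13) = {sse_d / (n - 13):.3f}")
```

Under either divisor the model with the trend term has the smaller error, which is consistent with the conclusion stated above.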