Estimation of Demand and Forecasting
Specifying the Regression Equation and Obtaining the Data
Before estimating the demand for a specific good or service, determine all the factors that may influence it. Assume we want to estimate cell phone demand among students at various educational institutions in Karachi, Pakistan. What factors are most likely to influence their demand for a cell phone? We could begin to answer this question with price and all the nonprice determinants of demand (tastes and preferences, income, prices of related goods, future expectations, and the number of buyers). However, including all of these variables in a demand analysis is not always possible or appropriate. For example, one would not expect a general “taste for cell phones” to play a significant role in cell phone demand, whereas a preference for Android or Apple phones does.
In an ideal world, the regression analysis would include all variables that are thought to have an impact on demand. In reality, the variables used in regression analysis are determined by the amount of data available and the cost of obtaining new data. Cross-sectional and time-series data are the two types of data used in regression analysis. Cross-sectional data give information on variables for a single time period. Time-series data provide information about variables over multiple time periods. Let’s suppose that we obtained cross-sectional data on students by surveying 30 randomly selected educational institutions in Karachi during a specific month.
We then express the regression equation to be estimated in the following linear, additive manner using these data:
Y = a + b_{1}X_{1} + b_{2}X_{2} + b_{3}X_{3} + b_{4}X_{4}
where Y = Quantity of cell phones demanded (average number of cell phones sold per capita per month)
a = Constant value or Y-intercept
X_{1} = Average price of a cell phone (in thousands of rupees)
X_{2} = Average monthly income of parents (in thousands of rupees)
X_{3} = Average price of a phone call (e.g., Rs 1.65 for the first five minutes and Rs 0.75 for each additional minute)
X_{4} = Gender (1 if male, 0 if female)
b_{1}, b_{2}, b_{3}, b_{4} = Coefficients of the X variables, measuring the impact of each variable on the demand for cell phones.
The dependent variable is Y, the quantity demanded. The X variables are referred to as the independent or explanatory variables. It is crucial to keep track of the units of measurement for each variable, because the researcher is free to record the data for regression analysis in whatever units he or she chooses. Here, cell phone prices and parents’ average monthly income are both measured in thousands of rupees. It is also worth noting that the gender variable has a different unit of measurement from the others.
It takes the value 1 for “Male” and 0 for “Female.” A variable coded in this way is called a dummy variable or binary variable. Given this setup of the regression equation and measurement scheme for the variables, we can now estimate the slope coefficients of the independent variables, b_{i}, and the intercept term, a, using any of the many available software packages that include regression analysis, such as MS Excel in this case.
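To illustrate what a regression routine does with an equation of this form, the following Python sketch estimates a and the b_{i} by ordinary least squares, solving the normal equations with Gaussian elimination. It is only a sketch: the eight observations are invented solely to exercise the code (they are generated without noise from hypothetical coefficients, so the routine should recover those coefficients), they are not the survey data, and the names solve_linear_system and ols are illustrative.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | b] so row operations act on both sides.
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for k in range(col, n + 1):
                M[row][k] -= factor * M[col][k]
    # Back-substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x


def ols(rows, y):
    """Return [a, b1, ..., bk] minimizing the sum of squared residuals."""
    X = [[1.0] + list(r) for r in rows]  # leading 1 estimates the intercept
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical coefficients and made-up observations (X1, X2, X3, X4).
true_coefs = [20.0, -0.05, 0.10, -0.06, 0.80]
rows = [
    (90, 12, 100, 1), (110, 15, 95, 0), (100, 10, 120, 1), (120, 18, 105, 0),
    (95, 14, 115, 0), (105, 11, 90, 1), (115, 16, 110, 1), (85, 17, 125, 0),
]
y = [true_coefs[0] + sum(b * x for b, x in zip(true_coefs[1:], r)) for r in rows]

estimates = ols(rows, y)
print([round(v, 4) for v in estimates])
```

With real survey data the fit would not be exact, and a spreadsheet or statistics package would also report standard errors and test statistics alongside the coefficients.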
Regression Coefficients: Estimation and Interpretation
The regression function in Excel was used to estimate the demand for cell phones. Although it contains only the basic elements of regression (e.g., it does not provide a Durbin-Watson statistic), we believe it is perfectly suitable for many types of regression analysis conducted in business research. Furthermore, Excel is more widely used in businesses, colleges, and universities than dedicated statistical software. Running a regression analysis on the tabulated data in Excel gives the results shown in the summary output. Based on this output, we can express the following regression equation.
Y_{cellphone} = 21.992 – 0.059X_{1} + 0.102X_{2} – 0.064X_{3} + 0.851X_{4}
(3.650) (0.019) (0.090) (0.023) (0.999)
R^{2} = 0.693
R^{2}_{adj} = 0.644; F = 14.12
Standard Error of Estimate, SEE = 1.707
The standard error of each coefficient, SEC, is listed beneath each coefficient in parentheses.
Before interpreting these findings, consider the direction in which changes in the explanatory variables are expected to affect cellphone demand, as indicated by the signs of the estimated regression coefficients. To put it more formally, we can state the following hypotheses about the expected relationship between each of the explanatory variables and cellphone demand.
Hypothesis 1: The price of a cellphone (X1) is an inverse determinant of the number of cellphones demanded (i.e., the sign of the coefficient is expected to be negative).
Hypothesis 2: With parents’ average monthly income as the measure of income, a cellphone may be either a “normal” or an “inferior” good. As a result, we hypothesize that income (X2) is a determinant of cellphone demand, though we cannot say in advance whether it is an inverse or direct determinant (i.e., the sign of the coefficient could be either positive or negative).
Hypothesis 3: The cost of a phone call (X3) is an inverse determinant of cellphone demand (i.e., the sign of the coefficient is expected to be negative).
Hypothesis 4: Mobile phone demand is expected to be lower for female students (X_{4}) than for male students.
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.8326
R Square             0.6932
Adjusted R Square    0.6441
Standard Error       1.7067
Observations         30

ANOVA
              df    SS        MS       F        Significance F
Regression    4     164.54    41.14    14.12    0.00
Residual      25    72.82     2.91
Total         29    237.37

              Coefficient   Std. Error   t-Stat    P-value   Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept     21.992        3.650        6.025     0.000     14.475      29.510      14.475        29.510
X_{1}         -0.059        0.019        -3.054    0.005     -0.100      -0.019      -0.100        -0.019
X_{2}         0.102         0.090        1.133     0.268     -0.083      0.287       -0.083        0.287
X_{3}         -0.064        0.023        -2.853    0.009     -0.111      -0.018      -0.111        -0.018
X_{4}         0.851         0.999        0.852     0.403     -1.207      2.909       -1.207        2.909
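The lower and upper 95% bounds in an output of this kind are conventionally computed as the coefficient plus or minus the critical t-value times the standard error. The Python sketch below reproduces them under two stated assumptions: the coefficients of X1 and X3 are negative (as in the estimated equation), and the two-tail critical t-value at the .05 level with 25 degrees of freedom is 2.060. Because the inputs are rounded to three decimals, the computed bounds can differ from the printed ones in the third decimal place.

```python
# Coefficients and standard errors as reported in the regression output.
coefs = {"Intercept": 21.992, "X1": -0.059, "X2": 0.102, "X3": -0.064, "X4": 0.851}
std_errors = {"Intercept": 3.650, "X1": 0.019, "X2": 0.090, "X3": 0.023, "X4": 0.999}

T_CRIT_95 = 2.060  # assumed two-tail critical t at the .05 level, 25 df


def confidence_interval(coef, se, t_crit=T_CRIT_95):
    """Return (lower, upper): coefficient minus/plus critical t times SE."""
    margin = t_crit * se
    return coef - margin, coef + margin


intervals = {name: confidence_interval(coefs[name], std_errors[name]) for name in coefs}
for name, (low, high) in intervals.items():
    print(f"{name}: [{low:.3f}, {high:.3f}]")
```

An interval that contains zero (as for X2 and X4) signals that the data cannot distinguish the coefficient from zero at this confidence level.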
Looking at the regression results, we can see that the coefficient of the price of a cellphone (X1) has a negative sign, which is exactly what the law of demand leads us to expect: as price rises, quantity demanded falls, and vice versa. In other words, the quantity of cellphones demanded changes in the opposite direction to the price of a cellphone, as the negative slope coefficient indicates. The positive coefficient of the average income of parents (X2) indicates that income and the number of cellphones demanded are directly related: higher income is linked to increased cellphone demand, and vice versa. A cellphone thus appears to be a “normal” product; it would be an “inferior” product if the quantity demanded decreased as income increased.
The negative sign of the coefficient of the five-minute phone call price (X3) confirms the complementarity between phone calls and cellphones. Students tend to buy fewer cellphones as the cost of a phone call rises, and a decrease in the price of a call would have the opposite effect. The magnitudes of the estimated regression coefficients for continuous variables require a little more thought. Each estimated coefficient indicates how much the demand for cellphones changes when the corresponding explanatory variable changes by one unit, with the other variables held constant.
A b1 of –0.059, for example, indicates that a one-unit increase in price (X1) is associated with a 0.059-unit change in cellphone demand in the opposite direction. As you may recall, price was expressed in thousands of rupees. According to our regression estimates, a 100-unit increase in price will therefore lower the quantity of cellphones demanded by 5.9 units (100 * 0.059). Similarly, an increase in income of one unit (in this case Rs 1,000) is associated with an increase of 0.102 units in cellphone demand. Are these changes, as well as those linked to changes in call prices and gender, large or small?
Researchers who regularly estimate demand for a specific good or service will have a good idea of whether the magnitudes of the coefficients estimated in one study are high or low in comparison with their other work. However, if no other studies are available for comparison, researchers can at least use demand elasticities to assess the relative impact of the explanatory variables on the quantity demanded. Regression results are well suited to estimating point elasticity:
E_{X} = (\partial Q / \partial X) \times (X / Q)
where Q denotes the quantity demanded and X denotes any variable that influences Q (e.g., price or income). To estimate an elasticity, we first need an estimate of the quantity demanded. Assume we want to estimate the demand for cellphones among male students (X4 = 1) when the average price of a cellphone is X1 = 100, the average monthly income of parents is X2 = 14, and the average price of a call is X3 = 110, each expressed in the units used in the regression. In this case, the estimated demand is:
Y_{cellphone} = 21.992 – 0.059(100) + 0.102(14) – 0.064(110) + 0.851(1) = 11.3 (about 11 cellphones per month)
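The substitution can be checked with a few lines of Python. This is a minimal sketch: the coefficients are those of the estimated equation, the input values are the ones plugged into it, and the function name predict_demand is illustrative.

```python
# Estimated coefficients from the regression output.
INTERCEPT = 21.992
COEFS = {"X1": -0.059, "X2": 0.102, "X3": -0.064, "X4": 0.851}


def predict_demand(values):
    """Evaluate Y = a + b1*X1 + b2*X2 + b3*X3 + b4*X4 at the given X values."""
    return INTERCEPT + sum(COEFS[name] * values[name] for name in COEFS)


# Values substituted into the equation (X4 = 1 denotes a male student).
scenario = {"X1": 100, "X2": 14, "X3": 110, "X4": 1}
estimated_demand = predict_demand(scenario)
print(round(estimated_demand, 3))
```

The result is a little above 11, which is why the quantity is rounded to 11 in the elasticity calculations.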
To compute the point elasticity for each variable at the preceding values, we simply plug the appropriate numbers into the point-elasticity formula. The partial derivative of Y with respect to each variable (i.e., \partial Y / \partial X) is simply the estimated coefficient of that variable.
Price elasticity: –0.059 * (100/11) = –0.536
Income elasticity: 0.102 * (14/11) = 0.130
Cross-price elasticity: –0.064 * (110/11) = –0.640
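The same arithmetic can be written as a short Python sketch. It assumes the rounded quantity of 11 and the signed coefficients of the estimated equation; the function name point_elasticity and the elastic/inelastic labels are illustrative.

```python
def point_elasticity(coef, x, q):
    """Point elasticity: (dQ/dX) * (X / Q), where dQ/dX is the coefficient."""
    return coef * x / q


Q = 11  # rounded estimate of quantity demanded
elasticities = {
    "price": point_elasticity(-0.059, 100, Q),
    "income": point_elasticity(0.102, 14, Q),
    "cross_price": point_elasticity(-0.064, 110, Q),
}
for name, value in elasticities.items():
    # An absolute value below 1 means demand responds less than proportionally.
    label = "inelastic" if abs(value) < 1 else "elastic"
    print(f"{name}: {value:.3f} ({label})")
```

Using the unrounded quantity instead of 11 would change each elasticity slightly without altering the conclusion that all three are below 1 in absolute value.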
With these figures, we can conclude that cellphone demand is price inelastic and that there is some cross-price elasticity between call and cellphone prices. Judging by the relatively low elasticity coefficient of 0.130, income does not appear to have a large impact on cellphone demand. Furthermore, because demand is price inelastic, we conclude that lowering the price of a cellphone would not increase revenue.
To maximize total revenue, the following points must be kept in mind.
The first point to remember is that revenue is highest when demand is unit elastic. Why? If you own a coffee shop, for example, there are untapped opportunities whenever demand is elastic or inelastic.
If demand is elastic, the quantity effect outweighs the price effect, which means that if prices are lowered, the revenue gained from the increased number of units sold will outweigh the revenue lost from the price reduction.
If demand is inelastic, the price effect outweighs the quantity effect, which means that if prices are raised, the revenue gained from the higher price will outweigh the revenue lost from fewer units sold.
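The reasoning behind these points can be written out as a short derivation. The sketch below assumes a differentiable demand curve Q(P); the symbols R (total revenue) and \varepsilon (price elasticity of demand) are introduced here for illustration.

```latex
\begin{aligned}
R(P) &= P \cdot Q(P) \\
\frac{dR}{dP} &= Q + P\,\frac{dQ}{dP}
             = Q\left(1 + \frac{dQ}{dP}\cdot\frac{P}{Q}\right)
             = Q\,(1 + \varepsilon)
\end{aligned}
```

When \varepsilon < -1 (elastic demand), dR/dP is negative, so a price cut raises revenue. When -1 < \varepsilon < 0 (inelastic demand), dR/dP is positive, so a price increase raises revenue. Revenue stops changing, and is at its maximum, where \varepsilon = -1.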
Statistical Evaluation of the Regression Results
The results of our regression are based on a sample survey of educational institutions in Karachi. How confident are we that these findings accurately reflect the population of all educational institutions in Karachi? The t-test is a basic test of the statistical significance of each estimated regression coefficient. It is carried out by calculating a t-value, or t-statistic, for each estimated coefficient.
This is accomplished by dividing the estimated coefficient by the standard error of the coefficient (SEC), as follows: t = bn / SEC of bn. The standard errors in our cellphone regression are presented in parentheses under the estimated coefficients, as is standard practice in the presentation of regression results. The t-table is used to interpret the value of t. In economic research, the .05 level of significance is commonly used. This means accepting a 5% chance of declaring a coefficient different from zero when the variable in fact has no effect in the population.
We also need to know how many degrees of freedom (df) are involved in the calculation. Degrees of freedom are calculated as n – k – 1, where “n” is the sample size and “k” is the number of independent variables. The “1” represents the constant or intercept term. As a result, we have 30 – 4 – 1, or 25, degrees of freedom in our cellphone example. According to the t-table in Table 2, the critical t-value at the .05 level of significance is 1.708 for a one-tail test and 2.060 for a two-tail test.
Using a one-tail test, if the absolute t-value computed for a particular estimated coefficient is greater than 1.708, we can say the estimate is “significant at the .05 level.” The same can be said with a two-tail test if it is greater than 2.060. The rule of two is a simple and effective way to handle the critical level: if t has an absolute value greater than 2, the estimated coefficient is significant at the .05 level.
In the preceding regression equation, X1 (the price of a cellphone) and X3 (the price of a call) are statistically significant: the absolute values of their t-statistics are 3.054 and 2.853, respectively. The other two variables, X2 (income) and X4 (gender), are not statistically significant, because the absolute values of their t-statistics are less than 2. If a variable’s estimated coefficient passes the t-test, we can be reasonably confident that the variable has an effect on demand. If the variable fails the t-test, we cannot rule out that it has no effect in the entire population of educational institutions. In other words, its nonzero regression coefficient may simply be a fluke of the particular sample of institutions that we used.
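The test can be sketched in Python using the coefficients and standard errors from the regression output and the two-tail critical value of 2.060 for 25 degrees of freedom. Because the inputs are rounded to three decimals, the computed t-statistics for X1 and X3 (about 3.11 and 2.78 in absolute value) differ slightly from the 3.054 and 2.853 that Excel derives from unrounded figures; the conclusions are the same.

```python
# Coefficients and standard errors from the regression output.
coefs = {"X1": -0.059, "X2": 0.102, "X3": -0.064, "X4": 0.851}
std_errors = {"X1": 0.019, "X2": 0.090, "X3": 0.023, "X4": 0.999}

n, k = 30, 4                     # observations, independent variables
degrees_of_freedom = n - k - 1   # the 1 accounts for the intercept
T_CRIT_TWO_TAIL = 2.060          # .05 level, 25 df, from the t-table

# t-statistic: estimated coefficient divided by its standard error.
t_stats = {name: coefs[name] / std_errors[name] for name in coefs}
significant = {name: abs(t) > T_CRIT_TWO_TAIL for name, t in t_stats.items()}

for name in coefs:
    verdict = "significant" if significant[name] else "not significant"
    print(f"{name}: t = {t_stats[name]:.3f} -> {verdict}")
```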
The best we can hope for in statistical analysis is to be confident that our sample results are truly representative of the population from which the sample was drawn. However, there is no way to know for sure. As a result, statisticians work with stated degrees of uncertainty. Using the rule of two generally implies a 5% level of significance, as explained in greater detail later in this chapter. To put it another way, declaring a coefficient statistically significant because it passes the rule-of-two version of the t-test exposes us to a 5% chance of being wrong.
The coefficient of determination, or R2, is another important statistical indicator used to evaluate the regression results. It measures the proportion of the variation in the dependent variable that is explained by variation in all the explanatory variables in the regression equation. Its value can range from 0 (indicating that variations in the dependent variable are not accounted for by changes in the explanatory variables) to 1.0 (indicating that all the variation in the dependent variable can be accounted for by the explanatory variables). For statisticians, the closer R2 is to 1.0, the greater the regression equation’s explanatory power. R2 = 0.693 in our cellphone regression. This means that variations in the price of a cellphone, income, the cost of a phone call, and gender account for roughly 70% of the variation in student demand for cellphones. R2 never decreases, and usually increases, as more independent variables are added to a regression equation. As a result, most analysts prefer a measure that accounts for the number of independent variables, so that equations with different numbers of variables can be compared more fairly. The adjusted R2, which is 0.644 in our regression, is such a measure.
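Both measures can be recomputed from the sums of squares in the ANOVA section of the output. The Python sketch below uses the standard formulas, R2 as the regression sum of squares over the total sum of squares and the adjusted R2 as 1 minus (1 minus R2) times (n – 1)/(n – k – 1); the variable names are illustrative.

```python
# Sums of squares from the ANOVA section of the regression output.
ss_regression = 164.54
ss_total = 237.37
n, k = 30, 4  # observations, independent variables

# Share of total variation in Y explained by the regression.
r_squared = ss_regression / ss_total
# Adjusted R-squared penalizes each additional explanatory variable.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R2 = {r_squared:.4f}, adjusted R2 = {adj_r_squared:.4f}")
```

The gap between the two values widens as k grows relative to n, which is why the adjusted figure is preferred when comparing equations with different numbers of variables.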
In addition to R2, another test known as the F-test is frequently used. Rather than measuring the statistical significance of each individual coefficient (as the t-test is designed to do), this test measures the statistical significance of the entire regression equation. In effect, the F-test is a test of the statistical significance of R2. The F-test is carried out in much the same way as the t-test. A critical value for F is first determined, depending on the level of statistical significance that the researcher wants to achieve (typically the .05 or .01 level).
Table 3 shows the critical F-values corresponding to these levels. As can be seen, two degrees-of-freedom values must be taken into account when determining the critical F-value. The first is the number of independent variables in the equation (k), and the second is the sample size minus the number of independent variables and the equation’s intercept (n – k – 1). Because the cellphone example has a sample size of 30 and four independent variables, the degrees of freedom are 4 and 25 (30 – 4 – 1), respectively. The F-distribution table shows that the critical F-value with those degrees of freedom is 2.76 at the .05 level and 4.18 at the .01 level. Because the regression results for cellphone demand show an F-value of 14.12, we can conclude that our entire equation is statistically significant at the .01 level.
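The F-statistic can be recomputed in two equivalent ways, which also shows why the F-test is in effect a test of R2. The Python sketch below uses the mean squares and the degrees of freedom (4 and 25) from the ANOVA section of the output, the R2 of 0.6932, and the critical values 2.76 and 4.18 quoted in the text. Because the mean squares are rounded, the first calculation gives about 14.14 rather than the printed 14.12.

```python
# Mean squares and degrees of freedom from the ANOVA section of the output.
ms_regression, df_regression = 41.14, 4
ms_residual, df_residual = 2.91, 25

# F as the ratio of explained to unexplained variance per degree of freedom.
f_stat = ms_regression / ms_residual

# Equivalent form written in terms of R-squared.
r_squared = 0.6932
f_from_r2 = (r_squared / df_regression) / ((1 - r_squared) / df_residual)

# Critical F-values quoted in the text for the .05 and .01 levels.
F_CRIT = {0.05: 2.76, 0.01: 4.18}
significant_at = {level: f_stat > crit for level, crit in F_CRIT.items()}

print(f"F = {f_stat:.2f}, F from R2 = {f_from_r2:.2f}", significant_at)
```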