Product Market Forecasting using the Bass Model {#productForecastingBassModel}

A marketing application.

The Death of Mathematics

This topic is more mathematical than some of the others in machine learning: it requires basic calculus and differential equations. Keith Devlin at Stanford has bemoaned the loss of mathematical ability as technology has created ways to substitute for these skills. But, as he points out, it is hard to do math at the computer; the cognitive load is overwhelming, and good old-fashioned paper and pencil is best. We will see some of that here. For a read of his article, see: This is taken from a chapter in this fascinating book:

Main Ideas

The Bass product diffusion model is a classic one in the marketing literature. It has been successfully used to predict the market shares of various newly introduced products, as well as mature ones.

The main idea of the model is that the adoption rate of a product comes from two sources:

  1. The propensity of consumers to adopt the product independent of social influences to do so.

  2. The additional propensity to adopt the product because others have adopted it. Hence, at some point in the life cycle of a good product, social contagion, i.e., the influence of the early adopters, becomes sufficiently strong to drive many others to adopt the product as well. It may be going too far to think of this as a network effect, because Frank Bass did this work well before the concept of network effect was introduced, but essentially that is what it is.

The Bass model shows how the information of the first few periods of sales data may be used to develop a fairly good forecast of future sales. One can easily see that whereas this model came from the domain of marketing, it may just as easily be used to model forecasts of cashflows to determine the value of a start-up company.

Historical Examples

There are some classic examples from the literature of the Bass model providing a very good forecast of the ramp-up in product adoption as a function of the two sources described above. See, for example, the actual versus predicted market growth for VCRs in the 1980s and the adoption of answering machines shown in the Figures below.

The Basic Idea

We follow the exposition in Bass (1969).

Define the cumulative probability of purchase of a product from time zero to time $t$ by a single individual as $F(t)$. Then, the probability of purchase at time $t$ is the density function $f(t) = F'(t)$.

The rate of purchase at time $t$, given no purchase so far, is then

$$ \frac{f(t)}{1-F(t)}. $$

Modeling this is just like modeling the adoption rate of the product at a given time $t$.

Main Differential Equation

Bass suggested that this adoption rate be defined as

$$ \frac{f(t)}{1-F(t)} = p + q\; F(t), $$

where we may think of $p$ as defining the independent rate of a consumer adopting the product, and $q$ as the imitation rate, because it modulates the impact from the cumulative intensity of adoption, $F(t)$.
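To see how the two parameters drive adoption, the equation can be stepped forward numerically. The following is a minimal Python sketch, not the chapter's own code: the function name `simulate_bass_cdf`, the Euler step size `dt`, and the values $p=0.01$, $q=0.2$ are illustrative assumptions.

```python
def simulate_bass_cdf(p, q, horizon, dt=0.01):
    """Integrate dF/dt = (p + q*F) * (1 - F) from F(0) = 0 with Euler steps.

    Returns a list of (t, F) pairs. The step size dt is an illustrative choice.
    """
    n_steps = int(round(horizon / dt))
    t, F = 0.0, 0.0
    path = [(t, F)]
    for _ in range(n_steps):
        hazard = p + q * F            # adoption rate among those yet to adopt
        F += hazard * (1.0 - F) * dt  # f(t) = hazard * (1 - F)
        t += dt
        path.append((t, F))
    return path


# Illustrative parameter values (assumed, not estimated from data).
path = simulate_bass_cdf(p=0.01, q=0.2, horizon=40.0)
for t, F in path[::1000]:
    print(f"t = {t:5.1f}   F(t) = {F:.4f}")
```

Early on, $F$ is close to zero and adoption is driven almost entirely by $p$; as $F$ grows, the $q\,F$ term takes over and the path accelerates before saturating near one.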

Hence, if we can find $p$ and $q$ for a product, we can forecast its adoption over time, and thereby generate a time path of sales. To summarize:

Solving the Model for $F(t)$

We rewrite the Bass equation:

$$ \frac{dF/dt}{1-F} = p + q\; F, $$

and note that $F(0)=0$.

The steps in the solution are: $$ \begin{eqnarray} \frac{dF}{dt} &=& (p+qF)(1-F) \\ \frac{dF}{dt} &=& p + (q-p)F - qF^2 \\ \int \frac{1}{p + (q-p)F - qF^2}\;dF &=& \int dt \\ \frac{\ln(p+qF) - \ln(1-F)}{p+q} &=& t+c_1 \quad \quad (*) \\ t=0 &\Rightarrow& F(0)=0 \\ t=0 &\Rightarrow& c_1 = \frac{\ln p}{p+q} \\ F(t) &=& \frac{p(e^{(p+q)t}-1)}{p e^{(p+q)t} + q} \end{eqnarray} $$
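The last step, from equation (*) to the closed form for $F(t)$, skips some algebra. Spelled out, with $c_1 = \ln p/(p+q)$ substituted into (*), it runs roughly as follows:

$$ \begin{eqnarray} \ln(p+qF) - \ln(1-F) &=& (p+q)t + \ln p \\ \frac{p+qF}{1-F} &=& p\, e^{(p+q)t} \\ p + qF &=& p\, e^{(p+q)t} - p\, e^{(p+q)t} F \\ F \left[ p\, e^{(p+q)t} + q \right] &=& p \left( e^{(p+q)t} - 1 \right) \end{eqnarray} $$

Dividing both sides by the bracketed term gives the expression for $F(t)$.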

Another solution

An alternative approach (suggested by student Muhammad Sagarwalla, based on ideas from Alexey Orlovsky) goes as follows. First, split the integrand above into partial fractions.

$$ \int \frac{1}{(p+qF)(1-F)}\;dF = \int dt $$

So we write $$ \begin{eqnarray} \frac{1}{(p+qF)(1-F)} &=& \frac{A}{p+qF} + \frac{B}{1-F}\\ &=& \frac{A-AF+pB+qFB}{(p+qF)(1-F)}\\ &=& \frac{A+pB+F(qB-A)}{(p+qF)(1-F)} \end{eqnarray} $$

This implies that $$ \begin{eqnarray} A+pB &=& 1 \\ qB-A &=& 0 \end{eqnarray} $$

Solving we get $$ \begin{eqnarray} A &=& q/(p+q)\\ B &=& 1/(p+q) \end{eqnarray} $$

so that $$ \begin{eqnarray} \int \frac{1}{(p+qF)(1-F)}\;dF &=& \int dt \\ \int \left(\frac{A}{p+qF} + \frac{B}{1-F}\right) \; dF&=& t + c_1 \\ \int \left(\frac{q/(p+q)}{p+qF} + \frac{1/(p+q)}{1-F}\right) \; dF&=& t+c_1\\ \frac{1}{p+q}\ln(p+qF) - \frac{1}{p+q}\ln(1-F) &=& t+c_1\\ \frac{\ln(p+qF) - \ln(1-F)}{p+q} &=& t+c_1 \end{eqnarray} $$

which is the same as equation (*). The solution as before is

$$ F(t) = \frac{p(e^{(p+q)t}-1)}{p e^{(p+q)t} + q} $$
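As a sanity check on the algebra, we can confirm numerically that this $F(t)$ satisfies the Bass equation. This is a minimal Python sketch: the helper names `bass_cdf` and `ode_residual`, the finite-difference step `h`, and the parameter values are assumptions made for illustration.

```python
import math


def bass_cdf(t, p, q):
    """Closed-form F(t) = p (e^{(p+q)t} - 1) / (p e^{(p+q)t} + q)."""
    growth = math.exp((p + q) * t)
    return p * (growth - 1.0) / (p * growth + q)


def ode_residual(t, p, q, h=1e-5):
    """dF/dt (central difference) minus (p + q F)(1 - F); should be near zero."""
    dF_dt = (bass_cdf(t + h, p, q) - bass_cdf(t - h, p, q)) / (2.0 * h)
    F = bass_cdf(t, p, q)
    return dF_dt - (p + q * F) * (1.0 - F)


p, q = 0.01, 0.2  # illustrative values (assumed)
for t in (0.0, 5.0, 15.0, 30.0):
    print(t, bass_cdf(t, p, q), ode_residual(t, p, q))
```

The residuals should be zero up to finite-difference and rounding error, and `bass_cdf(0, p, q)` returns exactly zero, matching the initial condition $F(0)=0$.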

Solve for $f(t)$

We may also solve for

$$ f(t) = \frac{dF}{dt} = \frac{e^{(p+q)t}\; p \; (p+q)^2}{[p e^{(p+q)t} + q]^2} $$

Therefore, if the target market is of size $m$, then at each $t$, the adoptions are simply given by $m \times f(t)$.


For example, set $m=100{,}000$, $p=0.01$ and $q=0.2$. Then the adoption rate is shown in the Figure below.
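The adoption curve for these values can be computed directly from the formula for $f(t)$. This is a minimal Python sketch, not the chapter's plotting code: the function name `bass_density` and the time grid (0 to 40 in steps of 0.1) are assumptions chosen for illustration.

```python
import math


def bass_density(t, p, q):
    """f(t) = p (p+q)^2 e^{(p+q)t} / (p e^{(p+q)t} + q)^2."""
    growth = math.exp((p + q) * t)
    return p * (p + q) ** 2 * growth / (p * growth + q) ** 2


m, p, q = 100_000, 0.01, 0.2               # values from the example in the text
times = [0.1 * k for k in range(0, 401)]   # grid on [0, 40]; the spacing is arbitrary
adoptions = [m * bass_density(t, p, q) for t in times]

# Locate the largest value on the grid (a grid approximation to the true peak).
peak_index = max(range(len(times)), key=lambda k: adoptions[k])
print(f"adoptions at t = 0: {adoptions[0]:.1f}")
print(f"grid peak near t = {times[peak_index]:.1f}, "
      f"about {adoptions[peak_index]:.0f} adoptions per unit time")
```

At $t=0$ the formula reduces to $f(0)=p$, so the curve starts at $m\,p = 1000$ adoptions per unit time, then rises to a single peak and declines as the market saturates.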

Symbolic Math in R

Solution using Wolfram Alpha


How do we get coefficients $p$ and $q$? Given we have the current sales history of the product, we can use it to fit the adoption curve.

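Let $s(t) = m\, f(t)$ denote sales at time $t$ and $S(t) = m\, F(t)$ denote cumulative sales up to time $t$, where $m$ is the size of the target market. Substituting $f(t) = s(t)/m$ and $F(t) = S(t)/m$ in the Bass equation gives: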

$$ \frac{s(t)/m}{1-S(t)/m} = p + q\; S(t)/m $$

We may rewrite this as

$$ s(t) = [p+q\; S(t)/m][m - S(t)] $$

Therefore, $$ \begin{eqnarray} s(t) &=& \beta_0 + \beta_1 \; S(t) + \beta_2 \; S(t)^2 \quad (BASS) \\ \beta_0 &=& pm \\ \beta_1 &=& q-p \\ \beta_2 &=& -q/m \end{eqnarray} $$

Equation (BASS) may be estimated by a regression of sales on cumulative sales and the square of cumulative sales. Once the coefficients in the regression $\{\beta_0, \beta_1, \beta_2\}$ are obtained, the equations above may be inverted to determine the values of $\{m,p,q\}$. We note that since

$$ \beta_1 = q-p = -m \beta_2 - \frac{\beta_0}{m}, $$

we obtain a quadratic equation in $m$:

$$ \beta_2 m^2 + \beta_1 m + \beta_0 = 0 $$

Solving we have

$$ m = \frac{-\beta_1 \pm \sqrt{\beta_1^2 - 4 \beta_0 \beta_2}}{2 \beta_2} $$

and, keeping the positive root because $m$ is a market size, this value of $m$ may be used to solve for

$$ p = \frac{\beta_0}{m}; \quad \quad q = - m \beta_2 $$
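The estimation steps above can be sketched end to end. This is a minimal Python illustration under stated assumptions, not the chapter's R code: the sales history is synthetic (generated from known parameters so the recovery can be checked), cumulative sales are measured at the start of each period, and the helper names `fit_quadratic` and `bass_from_betas` are hypothetical.

```python
import math


def fit_quadratic(x, y):
    """Least-squares fit y ~ b0 + b1*x + b2*x^2 via the normal equations."""
    # Build the 3x3 system (X'X) b = X'y for the design columns [1, x, x^2].
    sums = [sum(xi ** k for xi in x) for k in range(5)]
    A = [[sums[i + j] for j in range(3)] for i in range(3)]
    rhs = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, 3):
            factor = A[r][col] / A[col][col]
            for c in range(col, 3):
                A[r][c] -= factor * A[col][c]
            rhs[r] -= factor * rhs[col]
    # Back substitution.
    b = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        b[i] = (rhs[i] - sum(A[i][j] * b[j] for j in range(i + 1, 3))) / A[i][i]
    return b


def bass_from_betas(b0, b1, b2):
    """Invert beta0 = p*m, beta1 = q - p, beta2 = -q/m; keep the positive root for m."""
    disc = math.sqrt(b1 ** 2 - 4.0 * b0 * b2)
    m = max((-b1 + disc) / (2.0 * b2), (-b1 - disc) / (2.0 * b2))
    return m, b0 / m, -m * b2


# Synthetic (made-up) sales history generated from known parameters.
m_true, p_true, q_true = 100_000.0, 0.01, 0.2
cumulative, sales, S = [], [], 0.0
for _ in range(30):
    s = (p_true + q_true * S / m_true) * (m_true - S)
    cumulative.append(S)   # cumulative sales before this period (an assumption)
    sales.append(s)
    S += s

scale = max(cumulative)    # rescale S to [0, 1] to keep the normal equations well conditioned
c0, c1, c2 = fit_quadratic([v / scale for v in cumulative], sales)
m_hat, p_hat, q_hat = bass_from_betas(c0, c1 / scale, c2 / scale ** 2)
print(m_hat, p_hat, q_hat)
```

Because the synthetic data satisfy equation (BASS) exactly, the fit should recover the generating parameters up to rounding error. With real sales data the regression is noisy, the discriminant can turn negative, and the recovered $\{m,p,q\}$ should be treated as estimates.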

iPhone Sales Forecast

As an example, let's look at the trend for iPhone sales (we store the quarterly sales in a file and read it in, and then undertake the Bass model analysis). We get the data from:

The R code for this computation is as follows:

Comparison to other products

The estimated Apple coefficients are: $p=0.0018$ and $q=0.1148$. For several other products, the table below shows the estimated coefficients reported in Table I of the original Bass (1969) paper.