March 15, 2021

Predicting Subscription-Based Customer Retention: A Probabilistic Approach

A key metric for success

At the heart of any business model is the concept of retention rate, the fraction of customers remaining at a given point in time. It’s a measure of how valuable your product or service is to your customers—the more valuable your product, the higher the retention rate.

Understanding customer retention helps you identify which products or cohorts¹ are not performing well and require more attention. 

Why we model

Accurately projecting customer retention rate into the future enables a business to make informed decisions. You can estimate the number of customers that will remain in Years 2, 3, and beyond, and how much revenue they’ll generate over their lifetime (Customer Lifetime Value, or CLV).

Traditionally, data scientists have used “curve-fitting” regression models, like linear, quadratic, and exponential curves, to predict CLV—often quite inaccurately. In this post, we introduce probabilistic modeling, which allows for far more accurate forecasts.

The “curve-fitting” (traditional) model in action

In the figure below, the blue dots represent the number of remaining customers over 12 months at a subscription-based software company. Traditional “curve-fitting” models (linear, quadratic, and exponential) were fitted to the first 6 months of data (highlighted in yellow) and used to predict the retention rate over the next 6 months.

All the models fit the data reasonably well for the first 6 months, but the linear and quadratic models deteriorate quickly when projected onto the next 6 months. The exponential model does much better, but still overestimates the last month by a noticeable 20%.
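The mechanics of this comparison can be sketched in a few lines of Python. This is a minimal illustration, not the analysis behind the figure: the customer counts are invented, the helper names (fit_line, predict_linear, predict_exponential) are hypothetical, and only the linear and exponential curves are shown. The exponential curve is fitted as a straight line to the logarithm of the counts, which is one common way to do it.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Invented customer counts for months 1..12 (NOT the data behind the figure).
months = list(range(1, 13))
customers = [1000, 800, 680, 600, 545, 505, 475, 452, 434, 420, 409, 400]

# Fit on the first 6 months only.
train_months, train_counts = months[:6], customers[:6]

# Linear model: n_t = a + b*t
a_lin, b_lin = fit_line(train_months, train_counts)

# Exponential model: n_t = exp(a + b*t), fitted on log(n_t)
a_exp, b_exp = fit_line(train_months, [math.log(c) for c in train_counts])

def predict_linear(t):
    return a_lin + b_lin * t

def predict_exponential(t):
    return math.exp(a_exp + b_exp * t)

# Project onto the held-out months 7..12 and compare with the actual counts.
for t in months[6:]:
    actual = customers[t - 1]
    print(f"month {t:2d}: actual {actual}, "
          f"linear {predict_linear(t):.0f}, "
          f"exponential {predict_exponential(t):.0f}")
```

On these invented numbers the errors will not match the percentages quoted above; the point is only the workflow of fitting on months 1 to 6 and extrapolating to months 7 to 12.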

The probabilistic (new) model in action

Traditional “curve-fitting” models are “blind” in that they simply assume a mathematical form with no justification. The probabilistic approach², on the other hand, takes the underlying behavior of a customer into account. Specifically, it assumes that customers pay at regular intervals and that we know when they decide to cancel/churn. Consider the following:

  • At the end of each month or period, a customer flips a coin: Heads she cancels her contract, Tails she renews it.
  • A given individual always has the same probability of flipping Heads.
  • The probability of flipping Heads varies across customers.

These simple assumptions are actually quite powerful. They capture heterogeneity: the idea that different customers have different probabilities of churning.

How? The model assumes there’s a probability distribution describing how likely it is for each customer to flip Heads. Early on, customers with a high probability of flipping Heads churn—so the retention curve falls quickly. These “high-churn-probability” customers all leave over time, until only the “low-churn-probability” customers remain. At that point, the retention curve flattens out. 
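The coin-flip story can be simulated directly to see the flattening emerge. The sketch below is an illustration under stated assumptions, not the fitted model from this post: it assumes each customer's churn probability is drawn once from a beta distribution (the distribution used in the cited Fader and Hardie paper), and the parameters alpha=1 and beta=3, the cohort size, and the function name simulate_retention are all made up for the example.

```python
import random

def simulate_retention(n_customers, n_periods, alpha, beta, seed=0):
    """Simulate the coin-flip story; return survivors after each period."""
    rng = random.Random(seed)
    # Each customer draws a personal churn probability once and keeps it.
    churn_probs = [rng.betavariate(alpha, beta) for _ in range(n_customers)]
    survivors = []
    for _ in range(n_periods):
        # Every remaining customer flips their own coin:
        # Heads (probability p) means cancel, Tails means renew.
        churn_probs = [p for p in churn_probs if rng.random() >= p]
        survivors.append(len(churn_probs))
    return survivors

# Assumed, illustrative parameters.
counts = simulate_retention(n_customers=10_000, n_periods=12, alpha=1.0, beta=3.0)

previous = 10_000
for month, remaining in enumerate(counts, start=1):
    rate = remaining / previous if previous else float("nan")
    print(f"month {month:2d}: {remaining:5d} remaining, "
          f"period retention {rate:.3f}")
    previous = remaining
```

No individual customer changes behavior in this simulation, yet the period-over-period retention rate rises, because the customers most likely to churn leave first and the remaining mix shifts toward the loyal ones.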

Let’s apply this new model to the example from before:

Fit to the same customer data, this new probabilistic model performs excellently—far better than the traditional models. The prediction error in the final month is <3%, compared to 20% from the exponential model. 

Using the probabilistic model to calculate CLV

Here’s how to calculate Customer Lifetime Value (CLV)³:

     \[ CLV = \sum_{t=1}^T m\frac{n_t}{(1 + d)^t} \]

where m is the expected revenue per customer each period, n_t is the number of remaining customers at period t, T is the total number of periods in consideration, and d is the discount rate per period, which reflects the time value of money.
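The formula translates almost directly into code. The function below is a minimal sketch; its name and argument names are illustrative choices, and the list of remaining customers is assumed to start at period t = 1.

```python
def customer_lifetime_value(revenue_per_customer, remaining_customers, discount=0.0):
    """Sum of m * n_t / (1 + d)^t for t = 1..T.

    revenue_per_customer: m, expected revenue per customer each period
    remaining_customers:  [n_1, ..., n_T], customers remaining at each period
    discount:             d, the per-period discount (0 disables discounting)
    """
    return sum(
        revenue_per_customer * n_t / (1.0 + discount) ** t
        for t, n_t in enumerate(remaining_customers, start=1)
    )

# Illustrative inputs: 10 per customer per period, 100 then 90 customers.
undiscounted = customer_lifetime_value(10, [100, 90])
discounted = customer_lifetime_value(10, [100, 90], discount=0.1)
print(undiscounted, round(discounted, 2))
```

As written, the sum runs over all remaining customers, so the result is the value of the whole cohort; dividing by the starting cohort size gives a per-customer figure.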

Because the probabilistic model provides a more accurate view of the future, it yields a more accurate estimate of CLV, allowing you to set better expectations and make better decisions.

And probabilistic modeling doesn’t even require more data—it just approaches the problem in a more intuitive manner.

Using the probabilistic model to understand churn

Below, we plot the probability distribution of two different cohorts of customers:

  • Standard customers vary widely and can be fickle. Some are highly likely to churn (near 100% probability), while the majority are not likely to churn (<20% probability).
  • Premium customers are loyal. They are unlikely to churn (near 2% probability).

It follows that the retention curve for Standard customers has a steep initial drop before flattening out, while the retention curve for Premium customers has a shallow, steady decline.
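The link between a churn distribution and its retention curve can be made concrete. If churn probabilities are assumed to follow a beta distribution with parameters alpha and beta, as in the cited Fader and Hardie paper, the expected fraction of customers surviving t periods is the product of the period retention rates (beta + i - 1) / (alpha + beta + i - 1) for i = 1..t. The sketch below applies this to two hypothetical parameter pairs, chosen only to mimic the shapes described above; they are not fitted values from this post, and the function name sbg_survival is made up.

```python
def sbg_survival(alpha, beta, n_periods):
    """Expected survival fractions S(1..n_periods) under a beta churn mix."""
    survival = 1.0
    curve = []
    for t in range(1, n_periods + 1):
        # Retention rate in period t among customers who reached period t.
        survival *= (beta + t - 1) / (alpha + beta + t - 1)
        curve.append(survival)
    return curve

# Hypothetical cohorts (assumed parameters, for illustration only):
# "standard": widely spread churn probabilities, mean 0.2 / 0.8 = 0.25
# "premium":  churn probabilities concentrated near 2 / 100 = 0.02
standard = sbg_survival(alpha=0.2, beta=0.6, n_periods=12)
premium = sbg_survival(alpha=2.0, beta=98.0, n_periods=12)

for t, (s, p) in enumerate(zip(standard, premium), start=1):
    print(f"period {t:2d}: standard {s:.3f}, premium {p:.3f}")
```

The widely spread cohort loses a large share of customers in the first periods and then levels off, while the concentrated low-churn cohort declines slowly and steadily.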


These charts help determine which cohorts have the highest churn rates. This tells a business which customers to support more, and when.


  • Traditional regression models can be quite inaccurate in forecasting customer retention.
  • Probabilistic models have an intuitive approach and can make more accurate predictions.
  • Probabilistic models make for more accurate CLV estimates, which allows for more informed decision-making.
  • Probabilistic models also provide insights into cohort churn patterns.

Have questions or are interested in applying more accurate modeling to your business? Reach out to us directly at


¹ A cohort is a group of customers defined by a common attribute, or set of common attributes. For instance, a product-based cohort is made of customers who use the same product; a unit-time-based cohort is made of customers who sign on in the same period; a geography-based cohort is made of customers from the same location. A cohort can be a combination too, e.g. customers who signed on in January from Maryland.

² Fader, Peter S. and Bruce G. S. Hardie (2007), “How to Project Customer Retention,” Journal of Interactive Marketing, 21 (Winter), 76–90.

³ For a more sophisticated estimate, consider using discounted cash flow if you are projecting several years into the future. For shorter horizons (months), discounting may be ignored by setting d=0.

Derek Chang
