Summary & Approach
Ultimately, I want to connect the dots. Lyft has several controls available to optimize for revenue, and it needs a mathematical model that makes revenue a function of Match Rate, Churn Rate, Lyft Take per Ride, and Customer Acquisition Cost. In this analysis, I built that model. I’m almost certain the result is wrong given the assumptions I made to keep the math simple, but the method showcases my approach to problem solving and provides a model that can be refined, given time, with more complex math and fewer assumptions.
Definitions & Assumptions
Let’s start by defining our terms and listing our assumptions so that the analysis can be understood in context and repeated by someone else later.
Definitions & Acronyms:
CAC = Customer Acquisition Cost
Match Rate = % of ride requests that successfully pair a rider and driver
RCR = Rider Churn Rate
DCR = Driver Churn Rate
LT = Lyft Take (e.g. $3 or $6)
LTR = Lyft Take Ratio (e.g. 12% for $3/$25)
CPR = Cost Per Ride (e.g. fuel + mileage)
ARR = Annual Recurring Revenue
PR = Prevailing Rate
Problem Statement
The goal is to model revenue by sequentially defining the relationships below:
LT (or LTR) → Match Rate
Match Rate → Rider Churn Rate
LT → Driver Churn Rate
Rider Churn Rate → Rider CAC
Driver Churn Rate → Driver CAC
Revenue as a function of LT, which is now determined by all of the above relationships
Analysis
Define relationship: LT (or LTR) → Match Rate
We already have two data points for this relationship and can infer a third. I plotted these three points in Excel and fit a logarithmic regression (a common practice in defining an ROI curve). I chose to plot against 1-LTR to make the graph look more like a common ROI curve.
Assumption: The average CPR is $10, incurred by the driver. At a $25 prevailing rate, once LT = $15 the driver keeps only $10, which just covers CPR. The driver then makes no money, so we can assume this would be a 0% Match Rate.
\(\text{Match Rate} = 0.996 + 1.11 \cdot \ln(1 - \text{LTR})\)
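As a quick sanity check, the fitted curve can be evaluated directly. This is a minimal Python sketch: the function name `match_rate` is illustrative, the $25 prevailing rate is an assumption taken from the $3/$25 example in the definitions, and the coefficients are the regression values above.

```python
import math

PREVAILING_RATE = 25.0  # assumed fare per ride, from the $3/$25 = 12% example


def match_rate(lyft_take: float) -> float:
    """Fitted Match Rate for a given Lyft Take in dollars (not clamped)."""
    ltr = lyft_take / PREVAILING_RATE
    return 0.996 + 1.11 * math.log(1 - ltr)


# Evaluate the fit at the take levels discussed in the text.
for lt in (0, 3, 6, 15):
    print(f"LT = ${lt}: Match Rate = {match_rate(lt):.3f}")
```

At LT = $15 the fit returns a value slightly below zero, which is close to the assumed 0% Match Rate.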
Define relationship: Match Rate → Rider Churn Rate
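This one is a bit simpler given the information we have. We are told the relationship is binary: rider churn is 10% when a match succeeds and 33% when it fails, so the expected churn is the average of the two weighted by Match Rate.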
\(\text{Rider Churn Rate} = 10\% \cdot \text{Match Rate} + 33\% \cdot (1 - \text{Match Rate})\)
Define relationship: LT → Driver Churn Rate
We have one data point and infer one more. With two data points, I will assume a linear relationship.
Known: LT = $0, Driver Churn Rate = 5%
Inferred: LT = $15, Driver Churn Rate = 100% (they make no money because CPR = $10).
\(\text{Driver Churn Rate} = 1.58 \cdot \text{LTR} + 0.05\)
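The slope and intercept follow from the two points. Here is a short Python check, assuming a $25 prevailing rate so that LT = $15 corresponds to LTR = 0.60 (the variable and function names are illustrative):

```python
# Two (LTR, Driver Churn Rate) points: one known, one inferred.
known = (0.0, 0.05)         # LT = $0  -> 5% churn
inferred = (15 / 25, 1.00)  # LT = $15 -> 100% churn, assuming a $25 fare

# Line through the two points.
slope = (inferred[1] - known[1]) / (inferred[0] - known[0])
intercept = known[1] - slope * known[0]


def driver_churn_rate(ltr: float) -> float:
    """Linear Driver Churn Rate as a function of Lyft Take Ratio."""
    return slope * ltr + intercept


print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

The slope works out to 0.95 / 0.60, which rounds to the 1.58 used in the formula.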
Define relationship: Rider Churn Rate → Rider CAC
We are told the Rider CAC is $10-$20. For simplicity, let’s assume $15 average. Normalizing the CAC per ride, we simply take the likelihood of churn times the cost of churn.
\(\text{Rider CAC} = \$15 \cdot \text{Rider Churn Rate}\)
Define relationship: Driver Churn Rate → Driver CAC
We are told the Driver CAC is $400-$600. For simplicity, let’s assume $500 average. Normalizing the CAC per ride, we again take the likelihood of churn times the cost of churn. We also divide by the driver’s rides per month so that the result is per ride and comparable to the rider’s formula.
Assumption: The driver churns at the end of the month, completing their 100 rides minimum.
\(\text{Driver CAC} = \$500 / (100 \text{ rides/month}) \cdot \text{Driver Churn Rate}\)
Put it all together: Revenue as a function of LT
Every relationship is now defined at its base by the effect of LT.
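The chain of relationships can be sketched in Python as follows. The formulas are the ones defined above. Everything else is an assumption made for illustration: the function names, the $25 prevailing rate, the clamping of rates to [0, 1], and in particular the final combination (net revenue per ride = LT minus Rider CAC minus Driver CAC), which is one plausible reading rather than a formula stated in this analysis.

```python
import math

PREVAILING_RATE = 25.0        # assumed fare per ride ($3/$25 = 12%)
RIDER_CAC = 15.0              # assumed average of the $10-$20 range
DRIVER_CAC = 500.0            # assumed average of the $400-$600 range
DRIVER_RIDES_PER_MONTH = 100  # driver completes the 100-ride minimum


def clamp01(x: float) -> float:
    """Keep a rate inside [0, 1]; the clamp is an addition to the fits."""
    return max(0.0, min(1.0, x))


def match_rate(ltr: float) -> float:
    return clamp01(0.996 + 1.11 * math.log(1 - ltr))


def rider_churn_rate(match: float) -> float:
    return 0.10 * match + 0.33 * (1 - match)


def driver_churn_rate(ltr: float) -> float:
    return clamp01(1.58 * ltr + 0.05)


def net_revenue_per_ride(lyft_take: float) -> float:
    """Assumed combination: Lyft Take minus both per-ride CACs."""
    ltr = lyft_take / PREVAILING_RATE
    rider_cac = RIDER_CAC * rider_churn_rate(match_rate(ltr))
    driver_cac = DRIVER_CAC / DRIVER_RIDES_PER_MONTH * driver_churn_rate(ltr)
    return lyft_take - rider_cac - driver_cac


# Tabulate the model across the $0-$15 range of Lyft Take.
for lt in range(0, 16, 3):
    print(f"LT = ${lt:>2}: net per ride = {net_revenue_per_ride(lt):.2f}")
```

Under these assumptions the per-ride figure rises across the whole $0 to $15 range, with no interior peak.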
Using Google Sheets, we can create the following model.
This is not what I was expecting and almost certainly wrong.
Reason 1: I would expect this chart to more closely resemble a bell curve that peaks at the revenue-maximizing point, then decreases as CACs increase to the point of making the LT untenable. The mismatch is likely due to my assuming so many linear relationships.
Reason 2: I assumed a static (average) CAC. CAC will increase over time and would likely exceed $600 once there were few or no drivers left. If we assume there are unlimited drivers and the CAC maxes at $600, then I suppose it’s true that Lyft would always be financially better off taking a huge share and risking churn. So in that sense, this result accurately reflects my assumptions.
Improving this model
The main thing missing from this model is the effect of time. I chose to focus on averages to eliminate this major variable and simplify the model to produce a proof of concept.
I would also incorporate:
The driver and rider population sizes (I assumed unlimited)
The minimum ratio of drivers to riders
Appendix
The assumptions I made throughout the analysis are repeated and consolidated here.
To calculate Match Rate, I assumed:
The average CPR is $10, incurred by the driver. Once LT = $15, the driver makes no money, so we can assume this would be a 0% Match Rate.
To calculate Driver Churn Rate, I assumed:
The driver churn increases linearly (from 5% at LT = 0, to 100% at LT = $15)
In reality, this relationship is probably not linear, but since we only have one data point and no trend, I will start with linear and recommend future experimentation to chart a curve (LT vs. Driver Churn %).
To calculate Rider CAC, I assumed:
Static (average) CAC
The rider churns at the end of the month, completing their 1 monthly ride even after the match fails. This assumption could just as easily be dropped, but doing so would make the Revenue calculation a little uglier.
To calculate Driver CAC, I assumed:
Static (average) CAC
The driver churns at the end of the month, completing their 100 rides minimum.