Uplift modelling
{{Short description|Predictive modelling technique}}
Uplift modelling, also known as incremental modelling, true lift modelling, or net modelling, is a predictive modelling technique that directly models the incremental impact of a treatment (such as a direct marketing action) on an individual's behaviour.
Uplift modelling has applications in customer relationship management for up-sell, cross-sell and retention modelling. It has also been applied to political elections and personalised medicine. Unlike the related differential prediction concept in psychology, uplift modelling assumes an active agent.
Introduction
Uplift modelling uses a randomised scientific control not only to measure the effectiveness of an action but also to build a predictive model that predicts the incremental response to the action. The response could be a binary variable (for example, a website visit){{cite journal|last1=Devriendt|first1=Floris|last2=Moldovan|first2=Darie|last3=Verbeke|first3=Wouter|date=2018|title=A literature survey and experimental evaluation of the state-of-the-art in uplift modeling: A stepping stone toward the development of prescriptive analytics|journal=Big Data|volume=6|issue=1|pages=13–41|doi=10.1089/big.2017.0104|pmid=29570415 }} or a continuous variable (for example, customer revenue).{{cite journal|last1=Gubela|first1=Robin M.|last2=Lessmann|first2=Stefan|last3=Jaroszewicz|first3=Szymon|date=2020|title=Response transformation and profit decomposition for revenue uplift modeling|journal=European Journal of Operational Research|volume=283|issue=2|pages=647–661|doi=10.1016/j.ejor.2019.11.030 |arxiv=1911.08729|s2cid=208175716 }} Uplift modelling is a data mining technique that has been applied predominantly in the financial services, telecommunications and retail direct marketing industries to up-sell, cross-sell, churn and retention activities.
Measuring uplift
The uplift of a marketing campaign is usually defined as the difference in response rate between a treated group and a randomized control group. This allows a marketing team to isolate the effect of an individual marketing action and measure its effectiveness. Strictly, a campaign should be credited only with this incremental effect.
However, many marketers define lift (rather than uplift) as the difference in response rate between treatment and control, so uplift modeling can be defined as improving (upping) lift through predictive modeling.
The table below shows the number of responses and the calculated response rate for a hypothetical marketing campaign. This campaign would be defined as having a response rate uplift of 5 percentage points (10% - 5%). It has created 50,000 incremental responses (100,000 - 50,000).
{| class="wikitable"
|-
! Group !! Number of Customers !! Responses !! Response Rate
|-
| Treated
|align="right" |1,000,000
|align="right" |100,000
|align="right" |10%
|-
| Control
|align="right" |1,000,000
|align="right" |50,000
|align="right" |5%
|}
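The arithmetic behind the table can be reproduced in a few lines. The following is a minimal sketch in Python; the function name `campaign_uplift` is illustrative and is not taken from any library.

```python
def campaign_uplift(treated_n, treated_resp, control_n, control_resp):
    """Return (uplift in response rate, incremental responses) for one campaign."""
    treated_rate = treated_resp / treated_n
    control_rate = control_resp / control_n
    uplift = treated_rate - control_rate
    # Scale the rate difference to the treated group to count incremental responses.
    incremental = uplift * treated_n
    return uplift, incremental


# Figures from the hypothetical campaign in the table.
uplift, incremental = campaign_uplift(1_000_000, 100_000, 1_000_000, 50_000)
print(f"uplift: {uplift:.1%}, incremental responses: {incremental:,.0f}")
```

Scaling by the size of the treated group matters when the two groups differ in size: the incremental count is then no longer a simple difference of the two response counts.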
Traditional response modelling
Traditional response modelling typically takes a group of treated customers and attempts to build a predictive model that separates the likely responders from the non-responders through the use of one of a number of predictive modelling techniques. Typically this would use decision trees or regression analysis.
Such a model is built using only the treated customers.
In contrast, uplift modeling uses both the treated and control customers to build a predictive model that focuses on the incremental response. To understand this type of model, it is proposed that there is a fundamental segmentation that separates customers into the following groups (their names were suggested by N. Radcliffe and explained in N. Radcliffe (2007). Identifying who can be saved and who will be driven away by retention activity. Stochastic Solutions Limited):
- The Persuadables : customers who only respond to the marketing action because they were targeted
- The Sure Things : customers who would have responded whether they were targeted or not
- The Lost Causes : customers who will not respond irrespective of whether or not they are targeted
- The Do Not Disturbs or Sleeping Dogs : customers who are less likely to respond because they were targeted
The only segment that provides true incremental responses is the Persuadables.
Uplift modelling provides a scoring technique that ranks customers by their estimated incremental response, and so can approximately separate them into the groups described above.
Traditional response modelling often targets the Sure Things, because it cannot distinguish them from the Persuadables.
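One simple way to use both groups is to estimate the response rate separately for treated and control customers within each customer segment and take the difference. This approach is not named in the text above and is chosen here only for illustration. The sketch below uses synthetic data, and the segment names `new`, `loyal` and `lapsing` are hypothetical.

```python
from collections import defaultdict

# Synthetic records: (segment, treated?, responded?). Segment names are hypothetical.
records = (
    [("new", True, True)] * 30 + [("new", True, False)] * 70
    + [("new", False, True)] * 10 + [("new", False, False)] * 90
    + [("loyal", True, True)] * 80 + [("loyal", True, False)] * 20
    + [("loyal", False, True)] * 80 + [("loyal", False, False)] * 20
    + [("lapsing", True, True)] * 10 + [("lapsing", True, False)] * 90
    + [("lapsing", False, True)] * 25 + [("lapsing", False, False)] * 75
)


def uplift_by_segment(rows):
    """Estimate uplift per segment as treated response rate minus control response rate."""
    # For each segment, keep [responses, customers] for treated (True) and control (False).
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, responded in rows:
        counts[segment][treated][0] += int(responded)
        counts[segment][treated][1] += 1
    result = {}
    for segment, groups in counts.items():
        treated_rate = groups[True][0] / groups[True][1]
        control_rate = groups[False][0] / groups[False][1]
        result[segment] = treated_rate - control_rate
    return result


scores = uplift_by_segment(records)
for segment, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{segment}: estimated uplift {score:+.2f}")
```

In this toy data the `new` segment behaves like a group dominated by Persuadables, `loyal` like Sure Things (a high response rate with or without treatment) and `lapsing` like Do Not Disturbs. Because each customer is either treated or not, individual membership of a group cannot be observed directly; only group-level differences can be estimated.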
Return on investment
Because uplift modelling focuses on incremental responses only, it can make a strong return-on-investment case when applied to traditional demand generation and retention activities. For example, by targeting only the persuadable customers in an outbound marketing campaign, contact costs can be reduced and hence the return per unit of spend substantially improved.
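The effect on return on investment can be illustrated with a small calculation. All figures below (segment sizes, uplift values, cost per contact and value per response) are hypothetical assumptions, and `campaign_roi` is an illustrative helper rather than a library function.

```python
# Hypothetical inputs: segment size and estimated uplift (incremental response probability).
segments = {
    "persuadable": {"customers": 20_000, "uplift": 0.10},
    "sure_thing": {"customers": 30_000, "uplift": 0.00},
    "lost_cause": {"customers": 50_000, "uplift": 0.00},
}
COST_PER_CONTACT = 1.0      # assumed cost of contacting one customer
VALUE_PER_RESPONSE = 40.0   # assumed value of one incremental response


def campaign_roi(targeted):
    """Return (incremental value - contact cost) / contact cost for the targeted segments."""
    contacts = sum(segments[name]["customers"] for name in targeted)
    incremental = sum(
        segments[name]["customers"] * segments[name]["uplift"] for name in targeted
    )
    cost = contacts * COST_PER_CONTACT
    return (incremental * VALUE_PER_RESPONSE - cost) / cost


roi_everyone = campaign_roi(segments)          # contact all 100,000 customers
roi_targeted = campaign_roi(["persuadable"])   # contact only the high-uplift segment
print(f"ROI contacting everyone: {roi_everyone:+.0%}")
print(f"ROI contacting persuadables only: {roi_targeted:+.0%}")
```

Both campaigns generate the same number of incremental responses under these assumptions; the targeted campaign simply pays for far fewer contacts.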
Removal of negative effects
One of the most effective uses of uplift modelling is in the removal of negative effects from retention campaigns. In both the telecommunications and financial services industries, retention campaigns can often trigger customers to cancel a contract or policy. Uplift modelling allows these customers, the Do Not Disturbs, to be removed from the campaign.
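Once each customer has a predicted uplift score, removing likely Do Not Disturbs amounts to a filter on that score. The sketch below assumes such scores are already available; the customer IDs, the scores and the `select_for_campaign` helper are hypothetical.

```python
# Hypothetical predicted uplift in retention probability for each customer ID.
predicted_uplift = {"c1": 0.08, "c2": -0.05, "c3": 0.00, "c4": 0.12, "c5": -0.01}


def select_for_campaign(scores, threshold=0.0):
    """Keep customers whose predicted uplift exceeds the threshold, highest first."""
    kept = [cid for cid, score in scores.items() if score > threshold]
    return sorted(kept, key=lambda cid: scores[cid], reverse=True)


contact_list = select_for_campaign(predicted_uplift)
suppressed = sorted(set(predicted_uplift) - set(contact_list))
print("contact:", contact_list)
print("suppress:", suppressed)
```

With a threshold of zero, customers with no predicted effect are also left out, since contacting them incurs cost without an expected gain. A higher threshold could be chosen to reflect the cost of a contact.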
Application to A/B and multivariate testing
It is rarely the case that there is a single treatment and control group. Often the "treatment" can be a variety of simple variations of a message or a multi-stage contact strategy that is classed as a single treatment. In the case of A/B or multivariate testing, uplift modelling can help in understanding whether the variations in tests provide any significant uplift compared to other targeting criteria such as behavioural or demographic indicators.
History of uplift modelling
True response modelling appears to have first been described in the work of Radcliffe and Surry. Radcliffe, N. J.; and Surry, P. D. (1999); Differential response analysis: Modelling true response by isolating the effect of a single action, in Proceedings of Credit Scoring and Credit Control VI, Credit Research Centre, University of Edinburgh Management School
Victor Lo also published on this topic in The True Lift Model (2002), Lo, V. S. Y. (2002); The True Lift Model, ACM SIGKDD Explorations Newsletter, Vol. 4, No. 2, 78–86, available at http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=4FD247B4987CBF2E29186DACE0D40C3D?doi=10.1.1.99.7064&rep=rep1&type=pdf and later Radcliffe again with Using Control Groups to Target on Predicted Lift: Building and Assessing Uplift Models (2007). Radcliffe, N. J. (2007); Using Control Groups to Target on Predicted Lift: Building and Assessing Uplift Models, Direct Marketing Analytics Journal, Direct Marketing Association
Radcliffe also provides a frequently asked questions (FAQ) section on his web site, Scientific Marketer. [http://scientificmarketer.com/2007/09/uplift-modelling-faq.html The Scientific Marketer FAQ on Uplift Modelling] Lo (2008) provides a more general framework, from program design to predictive modeling to optimization, along with future research areas. Lo, V. S.Y. (2008) “New Opportunities in Marketing Data Mining.” In Encyclopedia of Data Warehousing and Mining, 2nd edition, edited by Wang (2008), Idea Group Publishing.
Independently, uplift modelling has been studied by Piotr Rzepakowski. Together with Szymon Jaroszewicz he adapted information theory to build multi-class uplift decision trees and published the paper in 2010.{{cite book
|last1 = Rzepakowski
|first1 = Piotr
|last2 = Jaroszewicz
|first2 = Szymon
|title = 2010 IEEE International Conference on Data Mining
|chapter = Decision Trees for Uplift Modeling
|pages = 441–450
|date = 2010
|doi = 10.1109/ICDM.2010.62
|location = Sydney, Australia
|isbn = 978-1-4244-9131-5
|s2cid = 14362608
}} In 2011 they extended the algorithm to the multiple-treatment case.
{{cite journal
|last1 = Rzepakowski
|first1 = Piotr
|last2 = Jaroszewicz
|first2 = Szymon
|title = Decision trees for uplift modeling with single and multiple treatments
|journal = Knowledge and Information Systems
|volume = 32
|issue = 2
|pages = 303–327
|date = 2011
|doi = 10.1007/s10115-011-0434-0
|doi-access = free
}}
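As an illustration of how information theory can drive an uplift decision tree, one possible split criterion compares the Kullback–Leibler divergence between treated and control outcome distributions before and after a candidate split. The sketch below is a simplified illustration for a binary response, without the normalisation or pruning a full algorithm would need, and should not be read as the authors' exact method. The function names and the counts are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """KL divergence between two Bernoulli distributions with success probabilities p and q."""
    # Clip probabilities away from 0 and 1 to avoid taking log(0).
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_gain(children):
    """Gain of a candidate split.

    `children` is a list of (treated_resp, treated_n, control_resp, control_n)
    tuples, one per child node. The gain is the size-weighted divergence in the
    children minus the divergence in the parent node.
    """
    t_resp = sum(child[0] for child in children)
    t_n = sum(child[1] for child in children)
    c_resp = sum(child[2] for child in children)
    c_n = sum(child[3] for child in children)
    parent = kl_divergence(t_resp / t_n, c_resp / c_n)
    total = t_n + c_n
    weighted = sum(
        (child[1] + child[3]) / total
        * kl_divergence(child[0] / child[1], child[2] / child[3])
        for child in children
    )
    return weighted - parent


# Hypothetical split: the first child shows positive uplift, the second negative.
informative = kl_gain([(60, 100, 20, 100), (20, 100, 60, 100)])
# A split whose children look exactly like the parent gives no gain.
uninformative = kl_gain([(40, 100, 20, 100), (40, 100, 20, 100)])
print(f"informative split gain: {informative:.3f}")
print(f"uninformative split gain: {uninformative:.3f}")
```

A split scores highly when it separates customers whose treated and control response rates differ in different ways, which is the property an uplift tree is looking for, rather than when it merely separates responders from non-responders.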
Similar approaches have been explored in personalised medicine.Cai, T.; Tian, L.; Wong, P. H.; and Wei, L. J. (2009); Analysis of Randomized Comparative Clinical Trial Data for Personalized Treatment Selections, Harvard University Biostatistics Working Paper Series, Paper 97{{cite book|last1=Nassif|first1=Houssam|last2=Kuusisto|first2=Finn|last3=Burnside|first3=Elizabeth S|last4=Page|first4=David|last5=Shavlik|first5=Jude|last6=Santos Costa|first6=Vitor|title=Advanced Information Systems Engineering |chapter=Score as You Lift (SAYL): A Statistical Relational Learning Approach to Uplift Modeling |date=2013|pages=595–611|pmc=4492311|location=Prague|pmid=26158122|doi=10.1007/978-3-642-40994-3_38|volume=8190|series=Lecture Notes in Computer Science|isbn=978-3-642-38708-1}} Szymon Jaroszewicz and Piotr Rzepakowski (2014) designed uplift methodology for survival analysis and applied it to randomized controlled trial analysis.{{cite journal
|last1 = Jaroszewicz
|first1 = Szymon
|last2 = Rzepakowski
|first2 = Piotr
|title = Uplift modeling with survival data
|journal = ACM SIGKDD Workshop on Health Informatics (HI KDD'14)
|date = 2014
|url = http://cci.drexel.edu/hi/hi-kdd2014/morning_4.pdf
|location = New York, USA
}} Yong (2015) combined a mathematical optimization algorithm via dynamic programming with machine learning methods to optimally stratify patients.Yong, F.H. (2015), "Quantitative Methods for Stratified Medicine," PhD Dissertation, Department of Biostatistics, Harvard T.H. Chan School of Public Health, http://dash.harvard.edu/bitstream/handle/1/17463130/YONG-DISSERTATION-2015.pdf?sequence=1 .
Uplift modelling is a special case of the older psychology concept of Differential Prediction.{{cite book|last1=Nassif|first1=Houssam|last2=Santos Costa|first2=Vitor|last3=Burnside|first3=Elizabeth S|last4=Page|first4=David|title=Machine Learning and Knowledge Discovery in Databases |chapter=Relational Differential Prediction |volume=7523|date=2012|pages=617–632|location=Bristol, UK|doi=10.1007/978-3-642-33460-3_45|series=Lecture Notes in Computer Science|isbn=978-3-642-33459-7}} In contrast to differential prediction, uplift modelling assumes an active agent, and uses the uplift measure as an optimization metric.
Uplift modeling has recently been extended and incorporated into diverse machine learning algorithms, such as inductive logic programming, Bayesian networks,{{cite journal|last1=Nassif|first1=Houssam|last2=Wu|first2=Yirong|last3=Page|first3=David|last4=Burnside|first4=Elizabeth|title=Logical Differential Prediction Bayes Net, Improving Breast Cancer Diagnosis for Older Women|journal=American Medical Informatics Association Symposium (AMIA'12)|date=2012|pages=1330–1339|pmc=3540455|pmid=23304412|volume=2012}} statistical relational learning, support vector machines,{{cite book|last1=Kuusisto|first1=Finn|last2=Santos Costa|first2=Vitor|last3=Nassif|first3=Houssam|last4=Burnside|first4=Elizabeth|last5=Page|first5=David|last6=Shavlik|first6=Jude|title=Machine Learning and Knowledge Discovery in Databases |chapter=Support Vector Machines for Differential Prediction |date=2014|location=Nancy, France|pmc=4492338|pmid=26158123|doi=10.1007/978-3-662-44851-9_4|volume=8725|pages=50–65|series=Lecture Notes in Computer Science|isbn=978-3-662-44850-2}}{{cite journal|last1=Zaniewicz|first1=Lukasz|last2=Jaroszewicz|first2=Szymon|title=Support Vector Machines for Uplift Modeling|journal=The First IEEE ICDM Workshop on Causal Discovery|date=2013|location=Dallas, Texas}} survival analysis and ensemble learning.{{cite journal|last1 = Sołtys|first1 = Michał|last2 = Jaroszewicz|first2 = Szymon|last3 = Rzepakowski|first3 = Piotr|title = Ensemble methods for uplift modeling |journal = Data Mining and Knowledge Discovery|volume = 29|issue = 6|pages = 1531–1559|date = 2015|doi = 10.1007/s10618-014-0383-9|doi-access = free}}
Even though uplift modeling is widely applied in marketing practice (along with political elections), it has rarely appeared in marketing literature. Kane, Lo and Zheng (2014) published a thorough analysis of three data sets using multiple methods in a marketing journal and provided evidence that a newer approach (known as the Four Quadrant Method) worked quite well in practice.{{cite journal | last1 = Kane | first1 = K. | last2 = Lo | first2 = V.S.Y. | last3 = Zheng | first3 = J. | year = 2014 | title = Mining for the Truly Responsive Customers and Prospects Using True-Lift Modeling: Comparison of New and Existing Methods | journal = Journal of Marketing Analytics | volume = 2 | issue = 4| pages = 218–238 | doi=10.1057/jma.2014.18| s2cid = 256513132 }} Lo and Pachamanova (2015) extended uplift modeling to prescriptive analytics for multiple treatment situations and proposed algorithms to solve large deterministic optimization problems and complex stochastic optimization problems where estimates are not exact.{{cite journal | last1 = Lo | first1 = V.S.Y. | last2 = Pachamanova | first2 = D. | year = 2015 | title = From Predictive Uplift Modeling to Prescriptive Uplift Analytics: A Practical Approach to Treatment Optimization While Accounting for Estimation Risk | journal = Journal of Marketing Analytics | volume = 3 | issue = 2| pages = 79–95 | doi=10.1057/jma.2015.5| s2cid = 256508939 }}
Recent research analyses the performance of various state-of-the-art uplift models in benchmark studies using large amounts of data.{{cite journal|last1=Gubela|first1=Robin M.|last2=Bequé|first2=Artem|last3=Lessmann|first3=Stefan|last4=Gebert|first4=Fabian|date=2019|title=Conversion Uplift in E-Commerce: A Systematic Benchmark of Modeling Strategies|journal=International Journal of Information Technology & Decision Making|volume=18|issue=3|pages=747–791|doi=10.1142/S0219622019500172|hdl=10419/230773|s2cid=126538764 |hdl-access=free}}
A detailed description of uplift modeling, its history, the way uplift models are built, differences from classical model building, uplift-specific evaluation techniques, a comparison of various software solutions and an explanation of different economic scenarios is given by Michel, Schnakenburg and von Martens (2019). R. Michel, I. Schnakenburg, T. von Martens (2019). "Targeting Uplift". Springer, {{ISBN|978-3-030-22625-1}}
Implementations
= In Python =
- [https://github.com/uber/causalml CausalML], implements algorithms related to causal inference and machine learning, and aims to bridge the gap between theoretical work on methodology and practical applications{{cite arXiv |eprint=2002.11631 |last1=Chen |first1=Huigang |last2=Harinen |first2=Totte |last3=Lee |first3=Jeong-Yoon |last4=Yung |first4=Mike |last5=Zhao |first5=Zhenyu |title=CausalML: Python Package for Causal Machine Learning |date=2020 |class=cs.CY }}
- [https://docs.doubleml.org/stable/index.html DoubleML], implements Chernozhukov et al.'s double/debiased machine learning framework{{Cite journal |last1=Chernozhukov |first1=Victor |last2=Chetverikov |first2=Denis |last3=Demirer |first3=Mert |last4=Duflo |first4=Esther |last5=Hansen |first5=Christian |last6=Newey |first6=Whitney |last7=Robins |first7=James |date=2018-02-01 |title=Double/debiased machine learning for treatment and structural parameters |url=https://academic.oup.com/ectj/article/21/1/C1/5056401 |journal=The Econometrics Journal |volume=21 |issue=1 |pages=C1–C68 |doi=10.1111/ectj.12097 |issn=1368-4221|hdl=10419/189736 |hdl-access=free }}
- [https://github.com/microsoft/EconML EconML], estimates heterogeneous treatment effects from observational data via machine learning, built as a part of Microsoft Research's Automated Learning and Intelligence for Causation and Economics (ALICE) project
- [https://github.com/bookingcom/upliftml UpliftML], provides scalable unconstrained and constrained uplift modeling from experimental data
- [https://github.com/wayfair/pylift PyLift] (was archived on GitHub on Nov 29, 2022)
- [https://github.com/maks-sh/scikit-uplift scikit-uplift], provides fast sklearn-style models implementation, evaluation metrics and visualization tools
= In R =
- [https://docs.doubleml.org/stable/index.html DoubleML], implements Chernozhukov et al.'s double/debiased machine learning framework
- [https://cran.r-project.org/web/packages/uplift/index.html uplift package] (was removed from CRAN on February 19, 2022)
= Other languages =
- JMP by SAS
- Portrait Uplift by Pitney Bowes
- Uplift node for KNIME by Dymatrix
- Uplift Modelling in [http://www.stochasticsolutions.com/miro/ Miró] by [http://www.stochasticsolutions.com/ Stochastic Solutions]
Datasets
- [https://blog.minethatdata.com/2008/05/best-answer-e-mail-analytics-challenge.html Hillstrom Email Marketing dataset]
- [http://ailab.criteo.com/criteo-uplift-prediction-dataset/ Criteo Uplift Prediction dataset]
- [https://www.uplift-modeling.com/en/latest/api/datasets/fetch_lenta.html#lenta-uplift-modeling-dataset Lenta Uplift Modeling Dataset]
- [https://www.uplift-modeling.com/en/latest/api/datasets/fetch_x5.html#x5-retailhero-uplift-modeling-dataset X5 RetailHero Uplift Modeling Dataset]
- [https://www.uplift-modeling.com/en/latest/api/datasets/fetch_megafon.html#megafon-uplift-competition-dataset MegaFon Uplift Competition Dataset]
Notes and references
{{reflist}}
External links
- [http://videos.smallbusinessnewz.com/2011/01/05/how-uplift-modeling-boosts-marketing-efforts/ Abby Johnson explains how it works in this video broadcast]
- [http://www.predictiveanalyticsworld.com/signup-uplift-whitepaper.php Introductory white paper with full references]
- [http://www.predictiveanalyticsworld.com/pdf/YTW03080USEN/Uplift-Modeling-Optimizes-Marketing-Decisions-White-Paper.pdf Eric Siegel: Uplift Modeling]
- [https://www.uplift-modeling.com/en/latest/user_guide/index.html User guide for uplift modelling on uplift-modeling.com]