Zero-Inflated Regression: Modeling a Distribution with a Mass at Zero
Bruce Ratner, Ph.D.

The standard approach for modeling a continuous target variable is the ordinary least-squares (OLS) regression model. One assumption of OLS regression is that the target variable is mainly continuous, with permissible discontinuities and minor clumping at several values, including zero. If the target variable's distribution has a mass at zero, then OLS regression yields questionable results. The purpose of this article is to present the flexible (nonparametric, assumption-free, data-defined model structure) GenIQ approach for modeling a continuous target variable with a mass at zero, a situation quite common in direct and database marketing, CRM, catalogue campaign management, risk assessment, and the like. I illustrate the Zero-Inflated Regression GenIQ Model using a real case study, focusing on sales per account. I use a scaled-down version of the original data to make the application tractable. Suffice it to say, GenIQ is most valuable in big data settings.
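
To make the mass-at-zero problem concrete, the following is a minimal sketch in Python on simulated data. It is not the case-study data and it does not implement GenIQ. The function names (`simulate_zero_inflated`, `fit_ols`), the single predictor `x`, and every distribution parameter are assumptions chosen only for illustration. The sketch builds a sales-like target in which roughly three of four accounts have exactly zero sales, fits a straight-line OLS model, and reports how the fit relates to the zeros.

```python
import numpy as np

def simulate_zero_inflated(n=5000, seed=0):
    """Simulate a hypothetical sales-per-account target with a mass at zero."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=n)            # one illustrative predictor
    p_buy = 0.1 + 0.3 * x                        # assumed purchase probability
    buys = rng.random(n) < p_buy                 # True for accounts with sales
    amount = rng.lognormal(mean=3.0 + x, sigma=0.5, size=n)  # assumed positive sales
    y = np.where(buys, amount, 0.0)              # non-buyers are exactly zero
    return x, y

def fit_ols(x, y):
    """Fit y = b0 + b1 * x by ordinary least squares."""
    X = np.column_stack([np.ones_like(x), x])    # intercept column plus predictor
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

x, y = simulate_zero_inflated()
beta = fit_ols(x, y)
fitted = beta[0] + beta[1] * x
residuals = y - fitted

share_zero = float(np.mean(y == 0.0))            # size of the mass at zero
share_negative_resid = float(np.mean(residuals < 0.0))
n_fitted_zero = int(np.sum(fitted == 0.0))       # fitted values equal to zero

print(f"share of accounts with zero sales: {share_zero:.3f}")
print(f"share of negative OLS residuals:   {share_negative_resid:.3f}")
print(f"fitted values exactly at zero:     {n_fitted_zero}")
```

In this simulated setting the straight-line fit cannot place any fitted value on the mass at zero, so most residuals are negative and the error distribution is far from symmetric, which is one way the questionable results described above can show up.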
