Data defines the model by dint of genetic programming, producing the best decile table.

Data Mining Using Genetic Programming
Bruce Ratner, Ph.D.

The GenIQ Model© is an evolutionary advance in data mining methodology using genetic programming. GenIQ offers exceptional predictions with minimal error variance, and has a unique feature that accommodates dirty and incomplete data. GenIQ can handle both classification (e.g., a yes-no response target variable) and regression (e.g., a continuous sales target variable) problems with categorical, ordinal, and continuous candidate predictor variables.
GenIQ is designed for the optimization of the ubiquitous decile analysis (gains chart). When GenIQ achieves this goal - for either a simple or complex model - the visual displays it produces make it easy for the analyst to see the impact of any relevant predictor variable, or pair of predictor variables, on the target variable, thus revealing the underlying data structure.
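
For readers unfamiliar with the decile analysis, the following is a minimal sketch of how such a table is commonly built from model scores and observed responses. It is a generic illustration, not GenIQ's own procedure: the function name decile_table, its parameters, and the column names are all assumptions made for this example. It also assumes at least as many records as groups and at least one responder.

```python
def decile_table(scores, responses, n_groups=10):
    """Build a simple decile (gains) table.

    scores    -- model scores, one per record (higher = more likely to respond)
    responses -- observed 0/1 responses, in the same order as scores
    Hypothetical helper for illustration; not part of GenIQ.
    """
    # Rank records from highest score to lowest.
    ranked = sorted(zip(scores, responses), key=lambda p: p[0], reverse=True)
    n = len(ranked)
    overall_rate = sum(r for _, r in ranked) / n

    rows = []
    cum_n = 0
    cum_resp = 0
    for g in range(n_groups):
        # Integer arithmetic keeps the groups as equal in size as possible.
        start = g * n // n_groups
        end = (g + 1) * n // n_groups
        group = ranked[start:end]
        resp = sum(r for _, r in group)
        cum_n += len(group)
        cum_resp += resp
        rows.append({
            "decile": g + 1,
            "n": len(group),
            "responses": resp,
            "response_rate": resp / len(group),
            # Cumulative lift: cumulative response rate relative to the
            # overall response rate, indexed so that 100 = no improvement.
            "cum_lift": 100 * (cum_resp / cum_n) / overall_rate,
        })
    return rows


# Toy data: 100 records, the 20 highest-scoring ones are the responders.
toy_scores = list(range(100))
toy_responses = [1 if s >= 80 else 0 for s in toy_scores]
toy_rows = decile_table(toy_scores, toy_responses)
```

In this toy case the scores rank every responder ahead of every non-responder, so the top two deciles contain all the responders. A model that pushes more responders into the upper deciles produces a better decile table in this sense.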

GenIQ is a tool to be used virtually without data preparation - except for ensuring there are no impossible or improbable values (e.g., an age of 120 years, or a boy named Sue). GenIQ quickly leads to a detailed understanding of the value of the data, i.e., the identification of the key drivers of the target variable. The GenIQ model output looks like a tree: not like a CHAID or CART tree, but like itself! Technically, the output is a computer program, so the GenIQ Model is a set of computer code. Each branch, which is defined by two or more variables tied together by one or more functions, identifies genetically evolved key drivers of the target variable. This is the unique data mining feature of the GenIQ Model in the tree below.

Thus, the pieces of structure mined by GenIQ for predicting response are as follows:

    1. Structure_1 (mini-model #1) = 3 / recency_mos
    2. Structure_2 (mini-model #2) = no_bal_decr + mos_on_file
    3. Structure_3 (mini-model #3) = no_of_trans / hi_balance
    4. Structure_4 (mini-model #4) = Structure_3 / 3
    5. Structure_5 (mini-model #5) = Structure_1 / Structure_2
    6. Structure_6 (mini-model #6) = tranx_active=med / Structure_3
    7. GenIQ Model (Super-structure) = Structure_6 - Structure_4
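
To make the arithmetic concrete, the structures listed above can be evaluated for a single record. This is a minimal sketch under stated assumptions, not GenIQ's actual code: the function name geniq_structures and the sample record are hypothetical, "tranx_active=med" is read as a 0/1 indicator that equals 1 when tranx_active is "med", and the divisors are assumed to be nonzero.

```python
def geniq_structures(rec):
    """Evaluate the seven listed structures for one record (a dict).

    Hypothetical helper for illustration; variable names follow the list above.
    """
    # Assumption: "tranx_active=med" is a 0/1 indicator variable.
    tranx_active_med = 1.0 if rec["tranx_active"] == "med" else 0.0

    s1 = 3 / rec["recency_mos"]                    # Structure_1
    s2 = rec["no_bal_decr"] + rec["mos_on_file"]   # Structure_2
    s3 = rec["no_of_trans"] / rec["hi_balance"]    # Structure_3
    s4 = s3 / 3                                    # Structure_4
    s5 = s1 / s2                                   # Structure_5
    s6 = tranx_active_med / s3                     # Structure_6
    model = s6 - s4                                # GenIQ Model (super-structure)

    return {
        "Structure_1": s1,
        "Structure_2": s2,
        "Structure_3": s3,
        "Structure_4": s4,
        "Structure_5": s5,
        "Structure_6": s6,
        "GenIQ_Model": model,
    }


# Made-up record, for illustration only.
sample = {
    "recency_mos": 6,
    "no_bal_decr": 2,
    "mos_on_file": 48,
    "no_of_trans": 12,
    "hi_balance": 600,
    "tranx_active": "med",
}
result = geniq_structures(sample)
```

For this made-up record, Structure_3 is 12 / 600 = 0.02, Structure_6 is 1 / 0.02 = 50, and the model score is 50 - 0.02 / 3, or about 49.99. Note that, as listed, Structure_5 does not enter the super-structure; it is computed here only for completeness.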

For another excellent GenIQ data mining illustration, go here. In it, GenIQ serves as a "data straightener": it seeks the maximum predictive power of a variable while also satisfying the necessary (but not sufficient) any-model assumption that the relationship between the target and predictor variables is "straight."

For more information about this article, call Bruce Ratner at 516.791.3544 or 1 800 DM STAT-1; or e-mail at