Data defines the model by dint of genetic programming, producing the best decile table.

CHAID-based Data Mining for Paired-Variable Assessment
Bruce Ratner, Ph.D.

Assessing the relationship between predictor and dependent variables is an essential task in the model-building process. If the relationship is identified and tractable, the variables are re-expressed to reflect the uncovered relationship and then tested for inclusion in the model. The purpose of this article is twofold: 1) to review the standard smooth scatterplot for unmasking an underlying relationship presumed to exist in a raw-data scatterplot; and 2) to introduce a new method of obtaining a smoother scatterplot, which gives a more reliable depiction of the unmasked relationship. The former scatterplot uses averages of raw data, and the latter uses averages of the fitted values of CHAID end-nodes. I illustrate both the smooth and smoother scatterplots using a real study. This article appears in my new book.
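As a rough illustration of the first of the two scatterplots, the Python sketch below computes the points of a smooth scatterplot by averaging raw data within slices. It is a minimal sketch under stated assumptions, not the procedure from the article: the function name smooth_points, the default of ten equal-sized slices, and the toy data are all hypothetical choices made for the example. The smoother variant would replace the slice averages with averages of the fitted values in CHAID end-nodes; that step needs a CHAID implementation and is not shown here.

```python
from statistics import mean


def smooth_points(x, y, n_slices=10):
    """Return one (mean_x, mean_y) point per slice.

    The (x, y) pairs are sorted by x and cut into n_slices groups of
    (nearly) equal size. Ten slices is an assumption made for this
    sketch, not a value stated in the article.
    """
    pairs = sorted(zip(x, y))
    n = len(pairs)
    points = []
    for i in range(n_slices):
        lo = i * n // n_slices
        hi = (i + 1) * n // n_slices
        chunk = pairs[lo:hi]
        if chunk:  # skip empty slices when there are fewer pairs than slices
            points.append((mean(p[0] for p in chunk),
                           mean(p[1] for p in chunk)))
    return points


# Hypothetical toy data: a noisy, roughly increasing relationship.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11]
for mx, my in smooth_points(x, y, n_slices=4):
    print(mx, my)
```

Plotting the returned points in place of the raw pairs gives the smooth scatterplot: the slice averages suppress point-to-point noise so the overall shape of the relationship is easier to see.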

For more information about this article, call Bruce Ratner at 516.791.3544 or 1 800 DM STAT-1.