3.1 The p-dimensional Cube and Cone
To understand the mechanism behind convergence, we first examine the formula for calculating the correlation coefficient. There are three major steps in computing a sample Pearson product-moment correlation: the centering step, the product step, and the scaling step. The centering step divides the p coefficients in each of the p columns into positive and negative camps. Given two columns whose coordinates lie on the same (opposite) side of the column means for most of the p coordinates at iteration n, the product step pushes the correlation coefficient between them at the next iteration to a relatively larger positive (negative) number. The scaling step then bounds the coefficients between -1 and 1.
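The three steps can be made concrete with a minimal NumPy sketch. The function name `correlation_by_steps`, the random seed, and the toy data are illustrative assumptions, not part of the original study; the sketch only shows that centering, taking products, and scaling reproduce the usual Pearson correlation between columns.

```python
import numpy as np

def correlation_by_steps(M):
    """Pearson correlations between the columns of M, computed in three steps.

    Hypothetical helper for illustration only.
    """
    M = np.asarray(M, dtype=float)
    # Centering step: subtract each column's mean, so every column is split
    # into a positive and a negative camp.
    centered = M - M.mean(axis=0, keepdims=True)
    # Product step: inner products between every pair of centered columns.
    products = centered.T @ centered
    # Scaling step: divide by the column norms, bounding results in [-1, 1].
    norms = np.sqrt(np.diag(products))
    return products / np.outer(norms, norms)

# Toy data (an assumption): 100 observations on 3 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))

R0 = correlation_by_steps(X)    # initial proximity matrix
R1 = correlation_by_steps(R0)   # one iteration: correlations of the columns of R0

# Both agree with NumPy's built-in column-wise correlation.
assert np.allclose(R0, np.corrcoef(X, rowvar=False))
assert np.allclose(R1, np.corrcoef(R0, rowvar=False))
```

Applying the same function to its own output is what moves the matrix from iteration n to iteration n + 1.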
Correlation matrices can also be visualized geometrically. Given a correlation matrix, its p column (row) vectors can be treated as p points in p-dimensional Euclidean space. Since the correlation coefficients lie in the interval [-1, 1], the p points move within the cube [-1, 1]^p. A 3-dimensional example is illustrated in Figure 4a with a 3×3 correlation matrix of three simulated variables, A, B, and C. The sample correlation matrix, computed from a sample of 100 observations, serves as the initial proximity matrix. It takes six iterations to converge.
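A simulation of this kind might be sketched as follows. The generating covariance matrix, the seed, the tolerance, and the helper name `iterate_correlations` are all assumptions chosen for illustration, since the distribution used in the original simulation is not reproduced here. The number of iterations reported depends on the tolerance and will not necessarily match the six quoted above.

```python
import numpy as np

rng = np.random.default_rng(2024)

# Assumed population covariance (hypothetical): A and B are positively
# related, and C is negatively related to both.
cov = np.array([[ 1.0,  0.7, -0.5],
                [ 0.7,  1.0, -0.5],
                [-0.5, -0.5,  1.0]])
X = rng.multivariate_normal(mean=np.zeros(3), cov=cov, size=100)

def iterate_correlations(R, tol=1e-10, max_iter=50):
    """Repeatedly replace R by the correlation matrix of its own columns.

    Stops when the largest entrywise change falls below tol.
    Returns the final matrix and the number of iterations performed.
    """
    R = np.asarray(R, dtype=float)
    for n in range(1, max_iter + 1):
        R_next = np.corrcoef(R, rowvar=False)
        if np.max(np.abs(R_next - R)) < tol:
            return R_next, n
        R = R_next
    return R, max_iter

R0 = np.corrcoef(X, rowvar=False)      # initial proximity matrix
R_limit, n_iter = iterate_correlations(R0)

print(np.round(R_limit, 6))            # entries are numerically +1 or -1
print("iterations:", n_iter)
```

Under these assumed settings, the limiting matrix contains only ones and negative ones, with A and B sharing one sign pattern and C taking the opposite one.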
Figure 4. The Converging Paths of the Columns on the p-dimensional Cube. (a) Simulation Study with 3 Columns on the 3-dimensional Cube; (b) Converging Paths for the Fifty Symptoms Projected onto the 3-dimensional Cube of (NB1, DL9, and TH3).
Columns A and B move gradually to (1, 1, -1) while column C moves to the opposite corner (-1, -1, 1) of the cube. A closer look shows that the points cannot move freely in the 3-dimensional cube at any iteration; they are confined to three of the six faces of the cube. These faces form a 3-sided "cone" with its vertex at the intersection point, (1, 1, 1), of the three planes. (This is due to the ones on the diagonal of every correlation matrix.) Every column vector i at iteration n (n = 0, 1, …; i = 1, …, p) can only move on the one face of the p-dimensional cube whose ith coordinate equals one. At early iterations (iteration 0 to iteration 1 in this example), each point (column vector) moves toward its own corner (the corner with all coordinates equal to -1 except for that of the point itself). At intermediate iterations, because of the centering and product forces, columns with similar patterns (positioned on the same side of the means on most of the p coordinates) attract each other and move toward the corner with simultaneous ones on these coordinates. Several groups may form at this intermediate stage. At the final iterations, only two groups survive, and these two groups of points move to one of the pairs of opposite corners with ones and negative ones on opposite coordinates. In the simulation, columns A and B form a group and move into the corner (1, 1, -1), while column C alone moves into its own corner (-1, -1, 1). The converging paths of the fifty-symptom example, projected onto the 3-dimensional cube of columns NB1, DL9, and TH3, are displayed in Figure 4b. Behavior similar to that in the simulation can be seen in these converging paths.
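The confinement to the cone and the final pair of opposite corners can be checked numerically. The sketch below starts from a hypothetical 3×3 correlation matrix (not the matrix of the original simulation) and uses an assumed helper, `column_paths`, to record the matrix at every iteration.

```python
import numpy as np

# Hypothetical starting correlation matrix for columns A, B, C.
R0 = np.array([[ 1.0,  0.7, -0.5],
               [ 0.7,  1.0, -0.4],
               [-0.5, -0.4,  1.0]])

def column_paths(R, n_iter=8):
    """Return the list of iterated correlation matrices, starting with R."""
    path = [np.asarray(R, dtype=float)]
    for _ in range(n_iter):
        path.append(np.corrcoef(path[-1], rowvar=False))
    return path

path = column_paths(R0)

for R in path:
    # Column i always has its ith coordinate equal to one, so it stays on
    # the face x_i = 1; the three faces meet at the vertex (1, 1, 1).
    assert np.allclose(np.diag(R), 1.0)
    # All coordinates stay inside the cube [-1, 1]^3.
    assert np.all(np.abs(R) <= 1.0 + 1e-12)

final = path[-1]
col_A, col_B, col_C = final[:, 0], final[:, 1], final[:, 2]

# Two surviving groups: the sign of each column's correlation with A.
groups = np.sign(final[0])

print(np.round(final, 6))
print("group labels relative to A:", groups)
```

In this example the columns for A and B end at the same corner and the column for C ends at the opposite one, so the sign pattern of any single column of the limiting matrix is enough to read off the two groups.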
