5 Applications

In previous sections, we described convergence properties for the sequence of iteratively formed correlation matrices. Various possible applications of these properties are discussed in this section.

5.1 The Hierarchical Divisive Clustering Tree with the Rank-One Splitting Rule

Unless the proximity matrix has a perfectly symmetric structure, the limit of the sequence of correlation matrices divides the p objects (variables or subjects) into two groups. Applying this correlation splitting rule recursively to each group yields a hierarchical divisive clustering tree. The clustering tree for the correlation proximity matrix of the fifty symptoms is illustrated in Figure 6a. Hierarchical divisive clustering with the correlation splitting rule was the main focus of the studies by McQuitty (1968) and Breiger, Boorman, and Arabie (1975).
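The recursive splitting described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce Figure 6a: the function names `correlation_split` and `divisive_tree`, the convergence tolerance, the iteration cap, and the `min_size` stopping rule are all hypothetical choices, and each subgroup is assumed to be split by restarting the iteration from its block of the original proximity matrix.

```python
import numpy as np

def correlation_split(R, max_iter=500, tol=1e-10):
    """Iterate column correlations until all entries are close to +1 or -1,
    then split the objects by the sign pattern of the first row."""
    R = np.asarray(R, dtype=float)
    for _ in range(max_iter):
        R = np.corrcoef(R, rowvar=False)      # correlation matrix of the columns
        if np.all(1.0 - np.abs(R) < tol):     # every entry is numerically +/-1
            break
    same_side = R[0] > 0                      # objects grouped with object 0
    return np.flatnonzero(same_side), np.flatnonzero(~same_side)

def divisive_tree(R, labels=None, min_size=2):
    """Return the tree as nested tuples; a leaf is a tuple of object labels.
    Groups with at most min_size objects are not split further (assumption)."""
    R = np.asarray(R, dtype=float)
    if labels is None:
        labels = list(range(R.shape[0]))
    if len(labels) <= min_size:
        return tuple(labels)
    left, right = correlation_split(R)
    if len(left) == 0 or len(right) == 0:     # no split found, keep as a leaf
        return tuple(labels)
    return tuple(
        divisive_tree(R[np.ix_(g, g)], [labels[i] for i in g], min_size)
        for g in (left, right)
    )
```

For a proximity matrix with two clear blocks, the first split should separate the blocks. A perfectly symmetric sub-block (for example, one whose off-diagonal entries are all equal) is exactly the case in which the iteration need not converge to a two-group split, which is why the sketch caps the number of iterations.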

Figure 2c is also reconstructed as Figure 6b, with the fifty symptoms permuted according to the order of the terminal nodes of this clustering tree, combined with a scheme that flips the two branches at each intermediate node. This permutation method is the first type of seriation developed in the present study; it is similar to the approach of Gale, Halperin, and Costanzo (1984).
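The text does not spell out the flipping scheme, so the following sketch uses one hypothetical rule purely for illustration: at each intermediate node, the two branches are oriented so that the two objects meeting at the join have the largest proximity. The function name `tree_order` and the nested-tuple tree format (a leaf is a tuple of object indices, an intermediate node is a pair of subtrees) are assumptions of this sketch.

```python
import numpy as np

def tree_order(tree, R):
    """Read the terminal nodes of a binary tree from left to right,
    flipping branches so that adjacent ends are as similar as possible."""
    if not isinstance(tree[0], tuple):
        return list(tree)                     # leaf: a tuple of object indices
    left = tree_order(tree[0], R)
    right = tree_order(tree[1], R)
    # Four orientations: each branch is either kept or reversed.
    candidates = [(l, r) for l in (left, left[::-1])
                         for r in (right, right[::-1])]
    # Hypothetical rule: maximize the proximity across the join.
    best = max(candidates, key=lambda lr: R[lr[0][-1], lr[1][0]])
    return best[0] + best[1]

# Small example with four objects and a fixed two-level tree.
R = np.array([[1.0, 0.5, 0.2, 0.9],
              [0.5, 1.0, 0.1, 0.3],
              [0.2, 0.1, 1.0, 0.5],
              [0.9, 0.3, 0.5, 1.0]])
tree = (((0,), (1,)), ((2,), (3,)))
order = tree_order(tree, R)
```

Reversing a whole branch is equivalent to flipping every node inside it, so each candidate orientation remains a valid reading of the same tree.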

Next, Figure 6b is compared with Figure 2b to see whether Figure 6b has recovered, from Figure 2c, the original structure embedded in Figure 2b. Figure 6b in fact shows an even clearer structural pattern than Figure 2b. Five major groups appear along the main diagonal: the negative symptoms, the thought process symptoms, the hallucination symptoms, the delusion symptoms, and the mania symptoms.

Figure 6. Seriation Methods and the Reconstructed Correlation Maps. (a) Divisive Clustering Tree with the Rank-One Splitting Rule; (b) Correlation Map Sorted by the Tree Seriation in (a); (c) Rank-Two Ellipse Seriation at ; (d) Correlation Map Sorted by the Ellipse Seriation in (c).

5.2 The Rank-Two Ellipse Seriation Technique

When the sequence reaches an iteration at which the correlation matrix has rank two, the p objects fall on an ellipse, and each object has a unique position relative to the others. The ellipse can be cut at any of the p gaps between neighboring objects to obtain a linear order. Combining the order on the two-dimensional ellipse with the one-dimensional split yields two orders, obtained by cutting at the two gaps between the two converged groups. The elliptical seriation and the correspondingly sorted correlation map are given in Figures 6c and 6d. The symptom order in Figure 6d differs from that in Figure 6b, but the major grouping patterns are identical.
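The ordering step can be sketched as follows, assuming the input `R` is already the iterate that has reached (numerical) rank two. The function name `ellipse_order` is hypothetical, and the cut is placed at the widest angular gap rather than at the gaps between the two converged groups, which is a simplification made for this sketch. Because a rank-two correlation matrix has unit diagonal, the objects lie on the unit circle once the two leading eigenvectors are scaled by the square roots of their eigenvalues, so the angle of each object gives its position along the ellipse.

```python
import numpy as np

def ellipse_order(R):
    """Order objects by their angle in the plane of the two leading
    eigenvectors, opening the closed curve at its widest gap."""
    R = np.asarray(R, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(R)      # eigenvalues in ascending order
    # Coordinates on the two leading axes, scaled by sqrt(eigenvalue).
    lead = eigvecs[:, -2:] * np.sqrt(np.clip(eigvals[-2:], 0.0, None))
    angles = np.arctan2(lead[:, 0], lead[:, 1])
    order = np.argsort(angles)
    # Circular gaps between neighbors; the last entry wraps around.
    sorted_angles = angles[order]
    gaps = np.diff(np.append(sorted_angles, sorted_angles[0] + 2 * np.pi))
    cut = int(np.argmax(gaps))                # widest gap on the closed curve
    return np.roll(order, -(cut + 1))         # start just after the cut

# Example: eight objects on part of a circle, presented in shuffled order.
theta = np.linspace(0.1, 2.4, 8)
perm = np.array([3, 0, 6, 1, 7, 4, 2, 5])
shuffled = theta[perm]
R = np.cos(shuffled[:, None] - shuffled[None, :])   # exact rank-two correlations
order = ellipse_order(R)
recovered = perm[order].tolist()              # original positions, in seriated order
```

The eigenvectors are determined only up to sign, so the recovered order is defined up to reversal.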

Given a pre-Robinson matrix, the correlation matrix at the very first iteration shows a perfect half-ellipse structure, with all p vectors falling on one half of the two-dimensional ellipse, which gives a perfect seriation (see web page for an example). For data sets with clustering structure, the rank-1 splitting rule and the rank-2 elliptical seriation can be combined into a hybrid seriation: a separate rank-2 seriation is performed on each sub-matrix produced by the split.

References:

  • Breiger, R. L., Boorman, S. A., and Arabie, P. (1975), "An Algorithm for Clustering Relational Data with Applications to Social

  • Gale, N., Halperin, C. W., and Costanzo, C. M. (1984), "Unclassed Matrix Shading and Optimal Ordering in Hierarchical Cluster Analysis," Journal of Classification, 1, 75-92.

  • McQuitty, L. L. (1968), "Multiple Clusters, Types, and Dimensions from Iterative Intercolumnar Correlational Analysis," Multivariate Behavioral Research, 3, 465-477.