IEEE Transactions on Visualization and Computer Graphics, early access

Optimally Ordered Orthogonal Neighbor Joining Trees for Hierarchical Cluster Analysis

Tong Ge, Xu Luo, Yunhai Wang, Michael Sedlmair, Zhanglin Cheng, Ying Zhao, Xin Liu, Oliver Deussen, Baoquan Chen

Abstract

We propose to use optimally ordered orthogonal neighbor-joining (O3NJ) trees as a new way to visually explore cluster structures and outliers in multi-dimensional data. Neighbor-joining (NJ) trees are widely used in biology, and their visual representation is similar to that of dendrograms. The core difference to dendrograms, however, is that NJ trees correctly encode distances between data points, resulting in trees with varying edge lengths. We optimize NJ trees for their use in visual analysis in two ways. First, we propose to use a novel leaf sorting algorithm that helps users to better interpret adjacencies and proximities within such a tree. Second, we provide a new method to visually distill the cluster tree from an ordered NJ tree. Numerical evaluation and three case studies illustrate the benefits of this approach for exploring multi-dimensional data in areas such as biology or image analysis.
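As a rough illustration of why NJ trees carry varying edge lengths, the following Python sketch implements the classical neighbor-joining procedure (the standard Saitou-Nei algorithm). It is not the O3NJ leaf ordering or the cluster-tree extraction proposed in the paper. The function names `neighbor_joining` and `path_length`, the internal node names `u0`, `u1`, ..., and the toy distance matrix are assumptions made for this example only.

```python
from itertools import combinations


def neighbor_joining(dist, labels):
    """Classical NJ sketch: return tree edges as (node_a, node_b, length).

    dist is a symmetric matrix (list of lists); labels are unique strings
    that must not collide with the generated internal names "u0", "u1", ...
    """
    # Working copy: d[a][b] is the current distance between active nodes.
    d = {a: {b: dist[i][j] for j, b in enumerate(labels) if i != j}
         for i, a in enumerate(labels)}
    edges = []
    next_id = 0
    while len(d) > 2:
        n = len(d)
        r = {a: sum(d[a].values()) for a in d}  # row sums
        # Join the pair that minimizes the Q criterion.
        a, b = min(combinations(d, 2),
                   key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        dab = d[a][b]
        # Branch lengths from the joined nodes to the new internal node.
        la = dab / 2 + (r[a] - r[b]) / (2 * (n - 2))
        lb = dab - la
        u = f"u{next_id}"
        next_id += 1
        edges += [(a, u, la), (b, u, lb)]
        # Distances from the new node to every remaining node.
        new_row = {k: (d[a][k] + d[b][k] - dab) / 2
                   for k in d if k not in (a, b)}
        del d[a], d[b]
        for k in d:
            del d[k][a], d[k][b]
            d[k][u] = new_row[k]
        d[u] = new_row
    a, b = d  # the two remaining nodes are connected directly
    edges.append((a, b, d[a][b]))
    return edges


def path_length(edges, src, dst):
    """Sum of edge lengths along the unique tree path from src to dst."""
    adj = {}
    for a, b, w in edges:
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    stack = [(src, None, 0.0)]
    while stack:
        node, parent, total = stack.pop()
        if node == dst:
            return total
        for nxt, w in adj[node]:
            if nxt != parent:
                stack.append((nxt, node, total + w))
    raise ValueError("no path between the given nodes")


# Toy additive distance matrix over five points (illustrative data only).
labels = ["a", "b", "c", "d", "e"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
for a, b, w in neighbor_joining(dist, labels):
    print(f"{a} -- {b}: {w}")
```

For an additive matrix such as this toy one, the path length between any two leaves in the resulting tree reproduces the input distance, which is the property a dendrogram with uniform leaf depth generally cannot offer.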

Results


Figure 1: A given dataset (a) is better clustered by our optimally ordered orthogonal neighbor joining tree (O3NJTree) (f) than by a dendrogram (d) produced by the hierarchical clustering (HC) method with complete linkage. Subfigure (b) shows the five nested clusters generated by cutting the dendrogram with a minimum similarity bar; (c) shows the four clusters automatically extracted with our method; (e) displays the NJ tree with a random leaf order, while (f) shows the same tree with the optimal leaf order (left) and the resulting cluster tree (right), as described in the paper.

Acknowledgements

This work was supported in part by the National Key R&D Program of China under Grant 2022ZD0160805, in part by NSFC under Grants 62132017 and 62141217, in part by the Shandong Provincial Natural Science Foundation under Grant ZQ2022JQ32, in part by the Shenzhen Science and Technology Program under Grant GJHZ20210705141402008, in part by the DFG (German Research Foundation) under Germany’s Excellence Strategy under Grant EXC2117-422037984, and in part by Project-ID 251654672 – TRR 161.

Copyright © IDEAS Lab 2024
Shandong University, Qingdao, China