TY - GEN
T1 - A large-scale performance study of cluster-based high-dimensional indexing
AU - Gudmundsson, Gylfi Þór
AU - Jónsson, Björn Þór
AU - Amsaleg, Laurent
PY - 2010
Y1 - 2010
N2 - High-dimensional clustering is used by some content-based image retrieval systems to partition the data into groups; the groups (clusters) are then indexed to accelerate processing of queries. Recently, the Cluster Pruning approach was proposed as a simple way to produce such clusters. While the original evaluation of the algorithm was performed within a text indexing context at a rather small scale, its simplicity motivated us to study its behavior in an image indexing context at a much larger scale. This paper summarizes the results of this study and shows that while the basic algorithm works fairly well, three extensions dramatically improve its performance and scalability, accelerating both query processing and the construction of clusters, making Cluster Pruning a promising basis for building large-scale systems that require a clustering algorithm.
KW - Algorithms
KW - Performance
UR - http://www.scopus.com/inward/record.url?scp=78650862575&partnerID=8YFLogxK
U2 - 10.1145/1878137.1878145
DO - 10.1145/1878137.1878145
M3 - Conference contribution
AN - SCOPUS:78650862575
SN - 9781450301664
T3 - VLS-MCMR'10 - Proceedings of the 2010 ACM International Workshop on Very-Large-Scale Multimedia Corpus, Mining and Retrieval, Co-located with ACM Multimedia 2010
SP - 31
EP - 36
BT - VLS-MCMR'10 - Proceedings of the 2010 ACM International Workshop on Very-Large-Scale Multimedia Corpus, Mining and Retrieval, Co-located with ACM Multimedia 2010
T2 - 2010 ACM International Workshop on Very-Large-Scale Multimedia Corpus, Mining and Retrieval, VLS-MCMR'10, Co-located with ACM Multimedia 2010
Y2 - 29 October 2010
ER -