Principal Component Analysis based Filtering for Scalable, High Precision k-NN Search

Abstract

Approximate k Nearest Neighbours (A k NN) search is widely used in domains such as computer vision and machine learning. However, A k NN search in high-dimensional datasets does not scale well on multicore platforms, due to its large memory footprint. Parallel A k NN search using space subdivision for filtering helps reduce the memory footprint, but its loss of precision is unstable. In this paper, we propose a new data filtering method—PCAF—for parallel A k NN search based on principal component analysis. PCAF improves on previous methods, demonstrating sustained, high scalability for a wide range of high-dimensional datasets on both Intel and AMD multicore platforms. Moreover, PCAF maintains highly precise A k NN search results.

Publication
IEEE Transactions on Computers 2018, Vol. 67, Issue 2, pp. 252-267