PCAF: Scalable, High Precision k-NN Search using Principal Component Analysis based Filtering

Abstract

Approximate k Nearest Neighbours (AkNN) search is widely used in domains such as computer vision and machine learning. However, AkNN search on high-dimensional datasets scales poorly on multicore platforms because of its large memory footprint. Current parallel AkNN search methods that filter data by space subdivision reduce the memory footprint, but at the cost of precision. We propose a new data filtering method for parallel AkNN search, PCAF, based on principal component analysis (PCA). PCAF improves on previous methods by demonstrating sustained, high scalability for a wide range of high-dimensional datasets on both Intel and AMD multicore platforms. Moreover, PCAF maintains high precision in its AkNN search results.
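
The abstract does not describe how PCAF works internally, so the following is only a minimal sketch of the general idea of filtering nearest-neighbour candidates with a principal-component projection. It is not the authors' algorithm. It relies on one fact: the distance between two points projected onto a unit vector never exceeds their true Euclidean distance, so the projected distance can serve as a cheap lower bound for discarding candidates. The function names (`top_principal_component`, `knn_with_projection_filter`), the use of a single component estimated by power iteration, and the synthetic data are all assumptions made for illustration. Unlike PCAF, this sketch is sequential and returns exact rather than approximate neighbours.

```python
import heapq
import math
import random


def top_principal_component(points, iters=50):
    """Estimate the first principal component by power iteration (hypothetical helper)."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit-length start vector
    for _ in range(iters):
        # w = X^T (X v), which is proportional to the covariance matrix times v
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # degenerate data: keep the current direction
        v = [x / norm for x in w]
    return v


def knn_with_projection_filter(points, query, k, direction):
    """Return (sorted indices of the k nearest points, number of full distances computed)."""
    def dot(a):
        return sum(x * y for x, y in zip(a, direction))

    q_proj = dot(query)
    # The 1-D projected distance is a lower bound on the true distance.
    lower = [abs(dot(p) - q_proj) for p in points]
    order = sorted(range(len(points)), key=lambda i: lower[i])
    best = []  # max-heap emulated with negated distances: (-distance, index)
    evaluated = 0
    for i in order:
        if len(best) == k and lower[i] >= -best[0][0]:
            break  # no remaining candidate can beat the current k-th neighbour
        dist = math.dist(points[i], query)
        evaluated += 1
        if len(best) < k:
            heapq.heappush(best, (-dist, i))
        elif dist < -best[0][0]:
            heapq.heapreplace(best, (-dist, i))
    return sorted(i for _, i in best), evaluated


# Synthetic data (an assumption): 500 points in 8 dimensions, most variance on axis 0.
random.seed(0)
data = [[random.gauss(0, 5 if j == 0 else 1) for j in range(8)] for _ in range(500)]
query = [random.gauss(0, 1) for _ in range(8)]
direction = top_principal_component(data)
neighbours, evaluated = knn_with_projection_filter(data, query, 5, direction)
print(neighbours, "full distance computations:", evaluated, "of", len(data))
```

Sorting candidates by the lower bound lets the scan stop as soon as the bound reaches the current k-th best distance, which is where the saving in full-dimensional distance computations comes from. How much is saved depends on how much of the data's variance the chosen component captures.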

Publication
45th International Conference on Parallel Processing