Approximate Nearest Neighbor (ANN) search is a fundamental problem in large-scale high-dimensional data retrieval, where exact search methods become computationally prohibitive. Traditional hashing-based techniques, such as Locality-Sensitive Hashing (LSH), offer sub-linear query times but rely on global, data-independent transformations that often fail to capture the underlying geometry of non-uniform data. In this paper, we propose a cluster-aware neural hashing framework that integrates spatial partitioning with data-driven representation learning. The dataset is first partitioned using the K-Means algorithm, and a lightweight neural hashing model is trained independently for each cluster to capture local data structures. To support dynamic environments, the framework incorporates a localized fine-tuning mechanism that allows incremental updates to be applied only to affected clusters, thereby avoiding the computational bottleneck of global retraining. Experimental results on controlled, high-dimensional synthetic datasets demonstrate that our approach achieves significantly higher recall than a standard LSH baseline while maintaining stable query latency. Furthermore, the method effectively handles boundary queries and supports efficient data insertions, providing a scalable and adaptive solution for ANN search in evolving data environments.
*** Title, author list and abstract as submitted during Camera-Ready version delivery. Small changes that may have occurred during processing by Springer may not appear in this window.
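
The pipeline outlined in the abstract (K-Means partitioning, one hash model per cluster, updates confined to the affected cluster) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not specify the neural hashing model, so a sign-of-PCA-projection hash stands in for it, refitting a single cluster stands in for localized fine-tuning, and probing the `n_probe` nearest clusters is one possible way to treat boundary queries. The names `kmeans`, `ClusterHasher`, `ClusterAwareIndex`, `n_probe`, and `n_candidates` are hypothetical.

```python
import numpy as np


def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's K-Means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = ((X[:, None, :] - centroids[None]) ** 2).sum(axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old centroid if a cluster empties
                centroids[j] = X[labels == j].mean(axis=0)
    labels = ((X[:, None, :] - centroids[None]) ** 2).sum(axis=2).argmin(axis=1)
    return centroids, labels


class ClusterHasher:
    """Stand-in for a learned per-cluster hash: signs of PCA projections.

    Assumption: this is NOT a neural model; it only mimics a data-dependent,
    cluster-local binary code.
    """

    def __init__(self, n_bits):
        self.n_bits = n_bits

    def fit(self, X):
        self.mean = X.mean(axis=0)
        # Rows of vt are the cluster's principal directions.
        _, _, vt = np.linalg.svd(X - self.mean, full_matrices=False)
        self.proj = vt[: self.n_bits].T
        return self

    def encode(self, X):
        return (np.atleast_2d(X) - self.mean) @ self.proj > 0


class ClusterAwareIndex:
    """Sketch of a cluster-partitioned hash index (assumes no empty cluster)."""

    def __init__(self, X, k=4, n_bits=8):
        self.n_bits = n_bits
        self.centroids, labels = kmeans(X, k)
        self.points = [X[labels == j] for j in range(k)]
        self.ids = [np.flatnonzero(labels == j) for j in range(k)]
        self.next_id = len(X)
        self.hashers, self.codes = [None] * k, [None] * k
        for j in range(k):
            self._refit(j)

    def _refit(self, j):
        # Refit one cluster's hash; a stand-in for localized fine-tuning.
        self.hashers[j] = ClusterHasher(self.n_bits).fit(self.points[j])
        self.codes[j] = self.hashers[j].encode(self.points[j])

    def insert(self, x):
        # Route the new point to its nearest centroid; update only that cluster.
        j = int(((self.centroids - x) ** 2).sum(axis=1).argmin())
        self.points[j] = np.vstack([self.points[j], x])
        self.ids[j] = np.append(self.ids[j], self.next_id)
        self.next_id += 1
        self._refit(j)
        return j

    def query(self, q, top_k=5, n_probe=2, n_candidates=50):
        # Probe the nearest clusters so queries near a boundary see both sides.
        probe = ((self.centroids - q) ** 2).sum(axis=1).argsort()[:n_probe]
        cand_pts, cand_ids = [], []
        for j in probe:
            hamming = (self.codes[j] != self.hashers[j].encode(q)).sum(axis=1)
            keep = hamming.argsort()[:n_candidates]
            cand_pts.append(self.points[j][keep])
            cand_ids.append(self.ids[j][keep])
        pts, ids = np.vstack(cand_pts), np.concatenate(cand_ids)
        # Re-rank the shortlisted candidates by exact squared distance.
        return ids[((pts - q) ** 2).sum(axis=1).argsort()[:top_k]]


# Toy usage on synthetic, well-separated Gaussian blobs (illustrative data only).
rng = np.random.default_rng(0)
centers = rng.normal(scale=10.0, size=(4, 16))
X = np.vstack([c + rng.normal(size=(200, 16)) for c in centers])
index = ClusterAwareIndex(X, k=4, n_bits=8)
neighbors = index.query(X[0], top_k=5)
new_point = centers[1] + rng.normal(size=16)
touched = index.insert(new_point)  # only cluster `touched` is refit
```

In this sketch an insertion costs one hash refit over a single cluster rather than over the whole dataset, which is the property the abstract attributes to localized fine-tuning. A real neural hasher would presumably take a few gradient steps on the affected cluster instead of refitting from scratch.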