-
Notifications
You must be signed in to change notification settings - Fork 359
Open
Labels
enhancementNew feature or requestNew feature or request
Description
The code is in the function search_for_point_and_prune in index.cpp file:
std::shared_lock<std::shared_timed_mutex> tl(_tag_lock, std::defer_lock);
if (_dynamic_index)
tl.lock();
std::vector<uint32_t> filter_specific_start_nodes;
for (auto &x : _location_to_labels[location])
filter_specific_start_nodes.emplace_back(_label_to_start_id[x]);
if (_dynamic_index)
tl.unlock();
_data_store->get_vector(location, scratch->aligned_query());
iterate_to_fixed_point(scratch, filteredLindex, filter_specific_start_nodes, true,
_location_to_labels[location], false);
// Here
// combine candidate pools obtained with filter and unfiltered criteria.
std::set<Neighbor> best_candidate_pool;
for (auto filtered_neighbor : scratch->pool())
{
best_candidate_pool.insert(filtered_neighbor);
}
// clear scratch for finding unfiltered candidates
scratch->clear();
_data_store->get_vector(location, scratch->aligned_query());
iterate_to_fixed_point(scratch, Lindex, init_ids, false, unused_filter_label, false);
for (auto unfiltered_neighbour : scratch->pool())
{
// insert if this neighbour is not already in best_candidate_pool
if (best_candidate_pool.find(unfiltered_neighbour) == best_candidate_pool.end())
{
best_candidate_pool.insert(unfiltered_neighbour);
}
}
scratch->pool().clear();
std::copy(best_candidate_pool.begin(), best_candidate_pool.end(), std::back_inserter(scratch->pool()));The algorithm in the paper does not mention using the results of unfiltered search as part of the neighbor candidate set.
Then, I commented out the code for the unfiltered search and only used the results of the filtered search as neighbor candidates for pruning, and the constructed index performed better:
# combine filtered search and unfiltered search
Using 10 threads to search
Ls QPS Avg dist cmps Mean Latency (mus) 99.9 Latency Recall@10
=================================================================================
10 12354.06 74.61 807.78 1561.57 16.92
50 4896.18 230.08 2040.62 3916.28 52.96
100 3160.04 361.73 3162.08 5720.58 67.57
200 1895.10 597.24 5273.57 8610.87 79.11
300 1348.74 820.74 7409.41 10601.51 84.14
400 1045.22 1037.71 9560.64 12604.73 86.97
500 850.17 1249.79 11755.75 15468.46 88.82
600 713.07 1457.35 14015.92 17650.42 90.14
650 648.90 1559.70 15401.10 18695.55 90.68
# only use filtered search
Using 10 threads to search
Ls QPS Avg dist cmps Mean Latency (mus) 99.9 Latency Recall@10
=================================================================================
10 11770.21 125.70 847.92 1624.26 31.96
50 5101.96 336.21 1957.79 3600.15 70.68
100 3246.70 526.91 3077.13 4975.10 82.16
200 1896.82 872.66 5268.55 7761.59 89.84
300 1331.29 1195.33 7506.48 10179.60 92.85
400 1025.38 1501.36 9746.72 11969.66 94.38
500 830.82 1793.70 12028.78 14463.68 95.29
600 699.44 2076.82 14289.13 16531.08 95.94
650 647.19 2214.84 15442.54 17868.96 96.20I want to know why the results of unfiltered search are considered here. Is it to be compatible with query searches without filters?
Metadata
Metadata
Assignees
Labels
enhancementNew feature or requestNew feature or request