Searches against a GPU index using normalize_L2 is heavily CPU intensive #3662

jpletcher · 2024-07-22T21:54:20Z

Summary

Running searches against a GPU index using normalize_L2 is heavily CPU intensive.

Please refer to the following pseudocode:

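import faiss

VALUE_OF_K = 10  # placeholder; the actual value of k is not stated in this issue


def setup_index(numpy_array):    # float32 array, shape (num_vectors, 1024)
    res = faiss.StandardGpuResources()
    flat_index = faiss.IndexFlatIP(numpy_array.shape[1])  # dimension taken from the data
    gpu_index = faiss.index_cpu_to_gpu(res, 0, flat_index)
    faiss.normalize_L2(numpy_array)  # in-place normalization, runs on the CPU
    gpu_index.add(numpy_array)
    return gpu_index


def query(gpu_index, query_vectors):    # 1024 32-bit floats per query
    faiss.normalize_L2(query_vectors)  # the CPU hot spot described below
    distances, indices = gpu_index.search(query_vectors, VALUE_OF_K)
    return distances, indices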

When invoking query(), we find that the platform is CPU-bound: CPU time is spent in fvec_renorm_L2 (faiss/swigfaiss.py), while GPU utilization remains stable.

In the pseudocode above, can normalize_L2 be optimized to use the GPU?

Additionally, for this case, AVX2 support is not available.

Would AVX2 support provide any performance benefit for normalize_L2?

Platform

OS: Linux

Faiss version: faiss-gpu-1.7.2

Installed from: pip

Faiss compilation options: N/A

Running on:

CPU
GPU

Interface:

C++
Python

Reproduction instructions


junjieqi · 2024-07-29T21:24:59Z

@jpletcher

normalize_L2 is only implemented on the CPU, so we can't use the GPU for it for now.
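Because normalization stays on the CPU for now, one possible way to reduce that cost is to skip it for queries when only the ranking matters. For an inner-product index over unit-length database vectors, normalizing a query divides every score by the same positive constant (the query's norm), so the top-k order does not change; only the returned scores differ, and they can be rescaled afterwards if needed. The NumPy-only sketch below illustrates this with a brute-force search standing in for the GPU index. The helper names `l2_normalize_rows` and `topk_inner_product` and the array sizes are hypothetical, not Faiss APIs, and this assumes non-zero query vectors.

```python
import numpy as np


def l2_normalize_rows(x):
    """Return a copy of x with each non-zero row scaled to unit L2 norm."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows unchanged
    return x / norms


def topk_inner_product(database, queries, k):
    """Brute-force stand-in for an inner-product search: top-k by score."""
    scores = queries @ database.T
    idx = np.argsort(-scores, axis=1)[:, :k]
    return np.take_along_axis(scores, idx, axis=1), idx


rng = np.random.default_rng(0)
database = l2_normalize_rows(rng.standard_normal((200, 16)))  # normalized once, at index time
queries = rng.standard_normal((5, 16))                        # deliberately NOT unit length

# Search with normalized queries (what the pseudocode in this issue does).
scores_norm, idx_norm = topk_inner_product(database, l2_normalize_rows(queries), k=10)

# Search with raw queries, then rescale the scores by each query's norm.
scores_raw, idx_raw = topk_inner_product(database, queries, k=10)
rescaled = scores_raw / np.linalg.norm(queries, axis=1, keepdims=True)
```

If the cosine scores themselves are needed, the rescaling step still requires one norm per query, but it touches only the query batch and can be done after the search returns.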

Right now, the normalize_L2 implementation should already leverage compiler autovectorization:

faiss/faiss/utils/distances_simd.cpp

Lines 200 to 209 in 34bbe5e

    
float fvec_norm_L2sqr(const float* x, size_t d) {
    // the double in the _ref is suspected to be a typo. Some of the manual
    // implementations this replaces used float.
    float res = 0;
    FAISS_PRAGMA_IMPRECISE_LOOP
    for (size_t i = 0; i != d; ++i) {
        res += x[i] * x[i];
    }
    return res;
}
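For readers who want to see what the normalization computes end to end, its effect can be sketched in NumPy: take each row's squared L2 norm (the quantity fvec_norm_L2sqr returns above) and scale the row by the inverse square root. This is an illustrative equivalent, not the Faiss implementation; the function name `normalize_rows_inplace` is hypothetical, and leaving zero-norm rows untouched is an assumption about the intended behavior.

```python
import numpy as np


def normalize_rows_inplace(x):
    """Scale each non-zero row of a 2-D float32 array to unit L2 norm, in place."""
    sq_norms = np.einsum("ij,ij->i", x, x)  # per-row sum of squares
    nonzero = sq_norms > 0                  # skip all-zero rows to avoid dividing by zero
    x[nonzero] *= (1.0 / np.sqrt(sq_norms[nonzero]))[:, None]
    return x


vectors = np.array([[3.0, 4.0], [0.0, 0.0], [1.0, 1.0]], dtype=np.float32)
normalize_rows_inplace(vectors)
```

The work is one pass of multiply-adds per row plus one scaling pass, which is the kind of loop a compiler can vectorize automatically.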

kuarora added GPU feature request Performance labels Jul 24, 2024
