Submitted by RingoCatKeeper t3_zypzrv in MachineLearning
RingoCatKeeper OP t1_j287d0y wrote
Reply to comment by learn-deeply in [P]Run CLIP on your iPhone to Search Photos offline. by RingoCatKeeper
I implemented the part of cosine similarity calculation myself, as for the topK, you can use .sort().prefix(k)
in Swift.
Steve132 t1_j28og0o wrote
There's an O(n) algorithm for top k partitioning that could be much much faster than .sort() when you have thousands of elements.
QuickSelect. In C++ its available as std::nth_element
in swift I couldn't find it directly but you can implement it in a few lines using .partition as a subroutine
RingoCatKeeper OP t1_j28p4zl wrote
Will certainly check it out!
ElectronicCress3132 t1_j29c108 wrote
Btw, one should take care not to implement the worst-case O(n) algorithm (which is Quickselect + Median of Medians), because it has high constant factors in the time complexity which slow it down in the average case. QuickSelect + Random Partitioning, or Introselect (the C++ standard library function mentioned) have good average time complexities and rarely hit the worst case.
ElectronicCress3132 t1_j29byah wrote
I think the one in the standard library is introselect, which is a hybrid of QuickSelect
learn-deeply t1_j287u7z wrote
So it's calculating nearest neighbor compared to all of the images in the index every time a new search is done? Might be slow past say, 1,000 images.
londons_explorer t1_j28cfh3 wrote
It should scale to 1 million images without much slowdown.
1 million images * 512 vector length= 512 million multiples, which the neural engine ought to be able to do in ~100ms
learn-deeply t1_j28hirz wrote
Is that calculation taking into account memory (RAM/SSD) access latencies?
londons_explorer t1_j28kvqp wrote
There is no latency constraint - it's a pure streaming operation, and total data to be transferred is 1 gigabyte for the whole set of vectors - which is well within the read performance of apples ssd's.
This is also the naive approach - there are probably smarter approaches by doing an approximate search with very low resolution vectors (eg. 3 bit depth), and then a 2nd pass of the high resolution vectors of only the most promising few thousand results.
Steve132 t1_j28oxex wrote
One thing you aren't taking into account is that the computation of the similarity scores is O(n) but the sorting he's doing is n log n which for 1m might dominate especially since it's not necessarily hardware optimized
londons_explorer t1_j28ufby wrote
Top K sorting is linear in computational complexity, and I doubt it will dominate because it just needs to be done on a single number rather than a vector of 512 numbers.
RingoCatKeeper OP t1_j2885ds wrote
You're right. There were some optimized work by Google called ScanNN, which is much faster on large scale vector similarity search. However, it's much more complicated to port this model to iOS.
hattulanHuumeparoni t1_j28fac9 wrote
I mean it's just matrix-vector multiplication of (1000x 512) x 512
Viewing a single comment thread. View all comments