Submitted by RingoCatKeeper t3_zypzrv in MachineLearning
learn-deeply t1_j287u7z wrote
Reply to comment by RingoCatKeeper in [P]Run CLIP on your iPhone to Search Photos offline. by RingoCatKeeper
So it's calculating nearest neighbor compared to all of the images in the index every time a new search is done? Might be slow past say, 1,000 images.
londons_explorer t1_j28cfh3 wrote
It should scale to 1 million images without much slowdown.
1 million images * 512 vector length = 512 million multiplies, which the Neural Engine ought to be able to do in ~100 ms.
learn-deeply t1_j28hirz wrote
Is that calculation taking into account memory (RAM/SSD) access latencies?
londons_explorer t1_j28kvqp wrote
There is no latency constraint: it's a pure streaming operation, and the total data to be transferred is 1 gigabyte for the whole set of vectors, which is well within the read performance of Apple's SSDs.
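The figures in this exchange can be checked with a few lines of arithmetic. This is only a sketch: the 2-bytes-per-component (float16) storage is an assumption that happens to match the 1 gigabyte estimate; with 32-bit floats the total would double.

```python
# Back-of-the-envelope numbers for a brute-force search over the whole index.
NUM_IMAGES = 1_000_000
DIM = 512                 # embedding length quoted in the thread
BYTES_PER_COMPONENT = 2   # assumption: float16 storage

# One multiply per component per image for the dot products.
multiplies = NUM_IMAGES * DIM

# Total bytes that must be streamed from storage for one full scan.
total_bytes = NUM_IMAGES * DIM * BYTES_PER_COMPONENT

print(f"{multiplies:,} multiplies")          # 512,000,000 multiplies
print(f"{total_bytes / 1e9:.3f} GB to read")  # 1.024 GB to read
```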
This is also the naive approach. There are probably smarter approaches, such as doing an approximate search with very low-resolution vectors (e.g. 3-bit depth), followed by a second pass over the high-resolution vectors of only the most promising few thousand results.
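A rough sketch of that two-pass idea, in plain Python for readability. Everything here is hypothetical: the function names, the uniform quantizer over [-1, 1], and the shortlist size are illustrative choices, not anything from the app or from a real library.

```python
import heapq

def quantize(vec, bits=3):
    """Map each component in [-1, 1] to an integer code with 2**bits levels."""
    levels = (1 << bits) - 1
    return [round((min(1.0, max(-1.0, x)) + 1.0) / 2.0 * levels) for x in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def two_pass_search(query, vectors, coarse_codes, shortlist=2000, k=10, bits=3):
    """Coarse pass on low-bit codes, then exact re-ranking of the shortlist."""
    center = ((1 << bits) - 1) / 2.0
    # Center the codes so the coarse dot product roughly tracks the true one.
    q_coarse = [c - center for c in quantize(query, bits)]
    coarse_scores = (
        (dot(q_coarse, [c - center for c in codes]), i)
        for i, codes in enumerate(coarse_codes)
    )
    # Pass 1: keep only the most promising candidates.
    candidates = heapq.nlargest(shortlist, coarse_scores)
    # Pass 2: full-precision scores for the shortlist only.
    exact_scores = ((dot(query, vectors[i]), i) for _, i in candidates)
    return [i for _, i in heapq.nlargest(k, exact_scores)]

# Tiny illustrative index: four axis-aligned vectors and one opposite vector.
vectors = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0]]
coarse_codes = [quantize(v) for v in vectors]  # built once, ahead of time
print(two_pass_search([0.9, 0.1, 0.0, 0.0], vectors, coarse_codes, shortlist=2, k=1))
```

A real implementation would keep the codes bit-packed and score them with integer arithmetic; the point here is only the structure of a cheap first pass followed by an exact second pass.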
Steve132 t1_j28oxex wrote
One thing you aren't taking into account is that computing the similarity scores is O(n), but the sorting he's doing is O(n log n), which for 1M images might dominate, especially since it's not necessarily hardware-optimized.
londons_explorer t1_j28ufby wrote
Top-K selection is linear in computational complexity, and I doubt it will dominate because it only has to handle a single score per image rather than a vector of 512 numbers.
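One way to pick the best K scores without fully sorting them is a bounded heap, which costs O(n log k) and is effectively linear in n for a small fixed K. This is a minimal sketch with a hypothetical `top_k` helper, not the app's code.

```python
import heapq

def top_k(scores, k):
    """Return (score, index) pairs for the k highest scores, best first."""
    # heapq.nlargest keeps a heap of at most k items while scanning once.
    return heapq.nlargest(k, ((s, i) for i, s in enumerate(scores)))

scores = [0.12, 0.87, 0.45, 0.91, 0.30]
print(top_k(scores, 2))  # [(0.91, 3), (0.87, 1)]
```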
RingoCatKeeper OP t1_j2885ds wrote
You're right. There is some optimized work by Google called ScaNN, which is much faster for large-scale vector similarity search. However, it's much more complicated to port it to iOS.
hattulanHuumeparoni t1_j28fac9 wrote
I mean, it's just a matrix-vector multiplication of (1000 x 512) x 512
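Spelled out, the brute-force search is one dot product per stored image. A minimal sketch in plain Python, assuming the embeddings are L2-normalized so that the dot product equals cosine similarity; the function names and the toy 2-dimensional data are illustrative only.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity_scores(image_matrix, query):
    """Matrix-vector product: one score per row (image) of the matrix."""
    return [sum(a * b for a, b in zip(row, query)) for row in image_matrix]

# Toy index with 3 "images" of dimension 2; a real index would be N x 512.
image_matrix = [normalize(v) for v in ([1, 0], [0, 1], [3, 4])]
query = normalize([1, 0])
print(similarity_scores(image_matrix, query))
```

On a device this would be a single call into an optimized linear algebra routine rather than Python loops, but the amount of arithmetic is the same.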