
Agitated-Purpose-171 t1_j4z7iz5 wrote

Hi everybody, I have a question about VLAD that came up while reading the CVPR paper "Aggregating local descriptors into a compact image representation".

My question is why VLAD works.

Link to the paper:

https://lear.inrialpes.fr/pubs/2010/JDSP10/jegou_compactimagerepresentation.pdf

The paper describes VLAD, a method that turns the local features (N*D dimensions) into a global feature (k*D dimensions).

Below is my understanding of the operations of VLAD, step by step.

=> input: N*D dimension local feature.

(i) use k-means to find the k clusters and the central feature for each cluster.

(ii) for each cluster find a residual sum.

V = summation of ( each local feature in the cluster minus the central feature).

V = sum (Xi - C)

V: residual sum of the cluster

X: local feature in the cluster

C: Central feature of the cluster

(iii) concatenate the residual sums to get the global feature.

global feature = [V1,V2,....Vk]

(V1 is the residual sum of cluster 1, V2 is the residual sum of cluster 2... and so on.)

=> output: k*D dimension global feature.
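
Steps (ii) and (iii) above can be sketched in a few lines of NumPy. This is only a minimal illustration of my understanding, not the paper's implementation: the function name `vlad`, the random data, and the randomly drawn `centers` (standing in for the k-means result of step (i)) are all assumptions made for the example.

```python
import numpy as np

def vlad(X, centers):
    """Aggregate N local descriptors (N x D) into one k*D vector, following steps (ii)-(iii)."""
    # Squared Euclidean distance from every descriptor to every center: shape (N, k)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)  # index of the closest center for each descriptor
    k, D = centers.shape
    V = np.zeros((k, D))
    for j in range(k):
        members = X[nearest == j]
        # Residual sum of cluster j; an empty cluster sums to zeros
        V[j] = (members - centers[j]).sum(axis=0)
    return V.reshape(-1)  # concatenate [V1, V2, ..., Vk]

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))        # N = 50 made-up local features, D = 4
centers = rng.normal(size=(3, 4))   # k = 3 hypothetical centers, supplied from outside
g = vlad(X, centers)
print(g.shape)  # (12,)
```

Note that the centers are passed in as an argument here, so the sketch says nothing about which data they were computed from.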

My question is why the residual sum of each cluster is "not" zero.

The central feature of each cluster found by k-means is the average of the local features in that cluster.

The central feature of cluster 1 = average of the local feature in cluster 1.

C1 = (X1 + X2 + X3 + ...+ Xm) / m

The residual sum of cluster 1 = (X1-C1) + (X2-C1) + (X3-C1) + ... + (Xm-C1) = V1

Based on the above equation, I think the residual sum of each cluster is zero. So the global feature will be a zero matrix = [V1, V2,..., Vk] = [zero vector, zero vector, ..., zero vector].
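
Written out, the cancellation behind this reasoning looks as follows. It assumes that C_1 is exactly the mean of the same m local features X_1, ..., X_m that are summed in V_1:

```latex
V_1 = \sum_{i=1}^{m} (X_i - C_1)
    = \sum_{i=1}^{m} X_i - m\,C_1
    = \sum_{i=1}^{m} X_i - m \cdot \frac{1}{m} \sum_{i=1}^{m} X_i
    = 0
```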

The only reason I can think of is that k-means has not run for enough iterations, so the central feature of each cluster is not equal to the average of the local features in the cluster. Am I right?
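
A quick numerical check of the algebra, as a sketch with made-up data: whenever the center differs from the cluster mean, for whatever reason, the residual sum equals m * (mean - C) rather than zero. The variable names and the offset used for `shifted_center` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
cluster = rng.normal(size=(6, 3))  # m = 6 made-up local features, D = 3

mean_center = cluster.mean(axis=0)
# Hypothetical center that is not the mean of these features
shifted_center = mean_center + np.array([0.5, 0.0, -0.2])

def residual_sum(X, c):
    # V = sum(Xi - C) over all features in the cluster
    return (X - c).sum(axis=0)

v_mean = residual_sum(cluster, mean_center)
v_shifted = residual_sum(cluster, shifted_center)

print(np.allclose(v_mean, 0.0))  # True
print(np.allclose(v_shifted, len(cluster) * (mean_center - shifted_center)))  # True
```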

Could anybody let me know why the residual sum is not a zero vector? Thanks a lot.
