Submitted by Even_Stay3387 t3_zcdw0k in MachineLearning
The paper is "A Neural Corpus Indexer for Document Retrieval"
According to the revision history on OpenReview, Table 1 in the final revision of the rebuttal phase reads as follows:
[Image: Table 1 from the final rebuttal-phase revision]
But in the camera-ready version, the results of the same experiments in Table 1 are obviously different from those in the original submission, and the difference is huge.
dojoteef t1_iyvxzsz wrote
See the author's explanation on OpenReview:
> We update the result tables in the camera-ready version. The revision is due to a different data version of query augmentation. Previously, the data is cooked by one of our co-authors while using a different train-test split to train the query generator, causing some data leakage issue. All experiments in the previous submission are based on this query augmentation version, so the performance is relatively higher. When preparing the camera-ready version, we review and reproduce the code end-to-end for official release. At that time, we realize the data leakage problem. So, we re-cook the query augmentation data and reproduce all the experiments again in the new table. After solving the data leakage problem, NCI still shows more than 15% improvement over the current best SOTA. We have released the complete open-source code at GitHub:
>
> https://github.com/solidsea98/Neural-Corpus-Indexer-NCI
>
> Welcome to follow and reproduce our work. Looking forward to further discussions and collaborations.