searchivarius.org - Blogs


Due to high annotation costs, making the best use of existing human-created training data is an important research direction. We therefore carried out a systematic evaluation of the transferability of BERT-based neural ranking models across five English datasets. Previous studies focused primarily on zero-shot and few-shot transfer from a large dataset to a dataset with a small number of queries. In contrast, each of our collections has a substantial number of queries, which enables a full-shot evaluation mode.
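To make these regimes concrete, here is a minimal sketch of the three evaluation modes. The `train()` stub, the dataset sizes, and the query names are hypothetical placeholders for illustration, not the study's actual experimental code.

```python
# Sketch of the three transfer regimes: zero-shot (no target-domain
# training), few-shot (a handful of target queries), and full-shot
# (all target queries). Everything here is a hypothetical stand-in.

def train(base_model, labeled_queries):
    """Stand-in for fine-tuning a BERT ranker on labeled queries."""
    return {"init": base_model, "num_queries_seen": len(labeled_queries)}

source_queries = [f"src_q{i}" for i in range(500_000)]  # large source collection
target_queries = [f"tgt_q{i}" for i in range(5_000)]    # substantial target set

source_model = train(None, source_queries)

zero_shot = source_model                              # transfer the model as-is
few_shot  = train(source_model, target_queries[:50])  # only a few target queries
full_shot = train(source_model, target_queries)       # all available target queries
```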

We have tried to answer several research questions related to the usefulness of transfer learning and pseudo-labeling in the small and big data regimes. It was quite interesting to verify the pseudo-labeling results of the now well-known paper by Dehghani, Zamani, and colleagues, "Neural Ranking Models with Weak Supervision," which showed that training a student neural network using BM25 as a teacher model allows one to greatly outperform BM25. Dehghani et al. trained a pre-BERT neural model using an insane amount of pseudo-labeled data.
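For illustration, below is a minimal sketch of this BM25-as-teacher pseudo-labeling loop. The tiny corpus, the bag-of-words student, and the training details are hypothetical stand-ins (the actual paper trained a pre-BERT neural model on a very large query log); the sketch only shows the shape of the technique: BM25 ranks unlabeled documents, and a student learns to reproduce its preferences via a pairwise loss.

```python
# Minimal sketch of BM25-as-teacher pseudo-labeling in the spirit of
# Dehghani et al. All data and the student model are toy stand-ins.
import random
import torch
import torch.nn as nn
from rank_bm25 import BM25Okapi  # pip install rank-bm25

corpus = [
    "neural ranking models with weak supervision",
    "bm25 is a strong lexical baseline for retrieval",
    "transfer learning across retrieval collections",
    "pseudo labeling uses only in-domain text data",
]
tokenized = [doc.split() for doc in corpus]
bm25 = BM25Okapi(tokenized)  # the "teacher": needs no human labels

def pseudo_label(query):
    """Use BM25 scores to pick a positive and a negative document."""
    scores = bm25.get_scores(query.split())
    ranked = sorted(range(len(corpus)), key=lambda i: -scores[i])
    pos = ranked[0]                  # teacher's top-ranked document
    neg = random.choice(ranked[1:])  # a lower-ranked document
    return corpus[pos], corpus[neg]

# Hypothetical student: a linear scorer over crude word-overlap
# features; a real student would be a neural ranker over raw text.
vocab = {w: i for i, w in enumerate({w for d in tokenized for w in d})}

def featurize(query, doc):
    overlap = set(query.split()) & set(doc.split()) & vocab.keys()
    x = torch.zeros(len(vocab))
    for w in overlap:  # indicator features for query-document overlap
        x[vocab[w]] = 1.0
    return x

student = nn.Linear(len(vocab), 1)
opt = torch.optim.Adam(student.parameters(), lr=1e-2)
loss_fn = nn.MarginRankingLoss(margin=1.0)

queries = ["weak supervision ranking", "lexical retrieval baseline"]
for epoch in range(10):
    for q in queries:
        pos_doc, neg_doc = pseudo_label(q)
        s_pos = student(featurize(q, pos_doc))
        s_neg = student(featurize(q, neg_doc))
        # Pairwise hinge loss: the student should score the teacher's
        # preferred document above the sampled negative.
        loss = loss_fn(s_pos, s_neg, torch.ones(1))
        opt.zero_grad()
        loss.backward()
        opt.step()
```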

We find that transfer learning has mixed success, which is not entirely surprising given a potential distribution shift; pseudo-labeling, in contrast, uses only in-domain text data. Even though transfer learning and/or pseudo-labeling can both be effective, it is natural to try improving the model using a small number of available in-domain queries. However, this is not always possible due to the "A Little Bit Is Worse Than None" phenomenon, where training on small amounts of in-domain data degrades model performance.
