It’s clear that LLMs have made significant strides in many domains. With ongoing research exploring and expanding the use of LLMs, the potential for advancing state-of-the-art methods is enormous.
It is difficult to keep up with the pace of progress, since new applications and capabilities of LLMs are being discovered constantly. In this piece, I would like to shed light on their progress in the area of Search & Ranking. If you are someone like me who is interested in Search, I hope you find this blog insightful (:
Is ChatGPT Good at Search?
First, let’s see whether LLMs can be used for ranking directly. Researchers at Baidu Inc. have studied both using an LLM as a ranker and using it to label ranking datasets. 🔎
- Most previous methods rely heavily on manual supervision signals, which require significant human effort and generalize poorly. 😵💫
- There is a growing interest in leveraging the zero-shot language understanding and reasoning capabilities of LLMs in information retrieval (IR). However, most existing approaches focus on exploiting LLMs for content generation (e.g., queries or passages) rather than for ranking groups of passages by relevance.
How Proposed LLM Ranking Works:
- Since ChatGPT and GPT-4 have a strong capacity for text understanding, instruction following, and reasoning 🚀, the authors propose prompting the LLM to re-rank a list of passages given a query. However, because of the model’s context-length limit, they propose sliding-window prompting (as shown above): the window moves over the candidate list a few passages at a time, and the LLM permutes each window. The passages are ranked directly by their identifiers (e.g., [2] > [3] > [1]) instead of producing intermediate relevance scores.
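To make the sliding-window idea concrete, here is a minimal sketch. The `rank_window` callable stands in for the actual LLM call (an assumption on my part; the paper prompts ChatGPT/GPT-4 with identifier-based instructions), and the window/step sizes are illustrative:

```python
def sliding_window_rerank(query, passages, rank_window, window=4, step=2):
    """Slide a window from the back of the candidate list to the front,
    asking `rank_window` to permute each window; relevant passages
    gradually bubble toward the top.

    rank_window(query, window_passages) -> indices of the window's
    passages in relevance order. This is a hypothetical stand-in for
    the LLM call described in the paper.
    """
    passages = list(passages)
    end = len(passages)
    while end > 0:
        start = max(0, end - window)
        chunk = passages[start:end]
        order = rank_window(query, chunk)
        passages[start:end] = [chunk[i] for i in order]
        end -= step
    return passages


def toy_rank(query, window_passages):
    # Toy stand-in for the LLM ranker: sort by query-term overlap.
    q = set(query.lower().split())
    scores = [len(q & set(p.lower().split())) for p in window_passages]
    return sorted(range(len(window_passages)), key=lambda i: -scores[i])
```

With a toy lexical ranker plugged in, a clearly relevant passage sitting near the bottom of the list is carried to the front after the window passes over it, which is the whole point of the back-to-front sweep.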
- Consequently, the generated permutations can be used as supervision to distill the ranking knowledge into smaller language models, such as DeBERTa-large. The authors use a RankNet loss to optimize the student model to mimic the LLM’s ranking behavior. 👀
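The distillation objective can be sketched as follows: for every pair of passages where the teacher permutation puts one above the other, RankNet penalizes the student when its score for the higher-ranked passage does not exceed its score for the lower-ranked one. This is a plain-Python illustration of the loss, not the paper's training code:

```python
import math

def ranknet_loss(student_scores, teacher_order):
    """Pairwise RankNet loss against a teacher permutation.

    teacher_order[k] is the index (into student_scores) of the passage
    the teacher ranked at position k. For each pair the teacher orders
    as i above j, the pairwise loss is log(1 + exp(s_j - s_i)).
    """
    loss, pairs = 0.0, 0
    for a in range(len(teacher_order)):
        for b in range(a + 1, len(teacher_order)):
            i, j = teacher_order[a], teacher_order[b]
            loss += math.log(1.0 + math.exp(student_scores[j] - student_scores[i]))
            pairs += 1
    return loss / pairs
```

In training, `student_scores` would come from the smaller model (e.g., DeBERTa-large) and the loss would be backpropagated; scoring in the teacher's order yields a strictly lower loss than scoring against it, which is what pushes the student toward the LLM's permutation.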