Modern Weak Supervision for ML Ranking Applications: Combining Heuristics and LLMs
In recent years, the real-world impact of machine learning (ML) has grown by leaps and bounds, and labels have been indispensable to these constant improvements and innovations. The ImageNet dataset alone has been a major driving force behind the growth of computer vision. Accurate labels allow us to improve model performance as well as safely and reliably evaluate our experiments.
These hand-labeled training sets are expensive and time-consuming to create, often requiring person-months or even person-years to assemble, clean, and debug, especially when domain expertise is required.
On top of this, tasks in the real world often change and evolve, and the labels that took months to collect may no longer be reliable. In ranking, implicit feedback (clicks, purchases, likes, and other engagement signals) has been another go-to source for producing ground-truth labels. Relying on it alone, however, is often insufficient: these signals can be noisy, biased, or incomplete. They may also not be applicable at all, for instance when we want to derive other features for a ranking model that already consumes implicit feedback.
For all these reasons, practitioners have increasingly been turning to weaker forms of supervision.
This piece collects and summarizes several papers and blog posts in this research area, both as a survey and as inspiration for future work.
Weak Supervision
Many traditional lines of research in ML are similarly motivated by the enormous appetite of deep learning models for labeled training data. This matters especially in the age of generative AI.
Weak supervision is about leveraging higher-level and/or noisier input from subject matter experts (SMEs).
Weak supervision involves the use of multiple “labeling functions”, which can be anything from simple rules to ML models, to create training labels. It has some similarities to rule-based classification; however, a rule-based classifier stops there: the rules are the classifier. In weak supervision, the labeling functions’ noisy, possibly conflicting votes are instead aggregated and denoised into training labels for a separate downstream model.
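To make this concrete, here is a minimal sketch of the labeling-function pattern using the open-source Snorkel library. The columns (`clicked`, `dwell_seconds`), the thresholds, and the binary relevance label space are illustrative assumptions for a toy ranking scenario, not something prescribed by any particular system.

```python
import pandas as pd
from snorkel.labeling import labeling_function, PandasLFApplier
from snorkel.labeling.model import LabelModel

# Label space for a binary relevance task.
ABSTAIN, NOT_RELEVANT, RELEVANT = -1, 0, 1

# Toy query-document interaction log; `clicked` and `dwell_seconds`
# are hypothetical columns standing in for real engagement signals.
df_train = pd.DataFrame({
    "clicked": [True, False, True, False],
    "dwell_seconds": [42.0, 0.0, 3.0, 0.0],
})

@labeling_function()
def lf_clicked(x):
    # Heuristic rule: a click is weak evidence of relevance.
    return RELEVANT if x.clicked else ABSTAIN

@labeling_function()
def lf_short_dwell(x):
    # Heuristic rule: a very short dwell time suggests non-relevance.
    return NOT_RELEVANT if x.dwell_seconds < 5 else ABSTAIN

# Apply every labeling function to every row, producing an
# (n_examples, n_labeling_functions) matrix of votes and abstentions.
applier = PandasLFApplier(lfs=[lf_clicked, lf_short_dwell])
L_train = applier.apply(df=df_train)

# The label model estimates each labeling function's accuracy from
# their agreements and disagreements, then combines the votes into
# denoised probabilistic labels.
label_model = LabelModel(cardinality=2, verbose=False)
label_model.fit(L_train=L_train, n_epochs=500, seed=123)
probs_train = label_model.predict_proba(L=L_train)
print(probs_train)  # soft training labels for a downstream ranking model
```

The same pattern extends naturally to the theme of this article: a prompt to an LLM can be wrapped in a labeling function just like a hand-written rule, and its votes denoised alongside the heuristics.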