Distant Supervision
Distant supervision is a learning scheme in which a classifier is trained on a weakly labeled training set, where the labels are assigned automatically by heuristics or rules rather than by hand.
A distant supervision setup usually has the following components:
- The learner may start with a small set of labeled training data.
- It has access to a pool of unlabeled data.
- It has a labeling operator that samples examples from the unlabeled pool and assigns labels to them. These labels are noisy.
- The algorithm trains on both the original labeled data and the new noisily labeled data to produce the final classifier.
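The list above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a reference implementation: the seed examples, the unlabeled pool, the emoticon-based rule in noisy_label, and the small Naive Bayes classifier are all hypothetical choices made for the example. Any heuristic labeler and any classifier could take their place.

```python
import math
from collections import Counter

# Hypothetical seed set of hand-labeled examples (1 = positive, 0 = negative).
labeled = [
    ("great movie loved it", 1),
    ("terrible plot hated it", 0),
]

# Hypothetical pool of unlabeled text.
unlabeled = [
    "loved the acting :)",
    "hated the ending :(",
    "what a great cast :)",
    "terrible pacing :(",
    "the film opens in paris",
]

def noisy_label(text):
    """Heuristic labeling operator (an assumption): emoticons stand in for labels.
    Returns None when no rule fires. The labels it does return may be wrong."""
    if ":)" in text:
        return 1
    if ":(" in text:
        return 0
    return None

def distant_label(pool):
    """Label the pool with the heuristic, removing the cue from the text
    so the classifier cannot simply memorize the rule."""
    examples = []
    for text in pool:
        label = noisy_label(text)
        if label is not None:
            cleaned = text.replace(":)", "").replace(":(", "").strip()
            examples.append((cleaned, label))
    return examples

def train_naive_bayes(examples):
    """Count words per class for a multinomial Naive Bayes model."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab

def predict(model, text):
    """Return the class with the highest log-probability (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in (0, 1):
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Train on the original labeled data plus the noisily labeled data.
noisy_examples = distant_label(unlabeled)
model = train_naive_bayes(labeled + noisy_examples)
print(predict(model, "great acting"))
```

Examples for which no rule fires (here, the sentence about Paris) are simply left out of training. In practice the noisy labels often outnumber the hand-labeled ones, so it can help to weight the two sources differently. This sketch treats them equally for simplicity.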