Distant supervision can reduce the cost of labeling data.
In relation extraction, distant supervision aligns knowledge-base facts with sentences, but the automatically generated labels often contain noise that the model must learn to handle.
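The alignment step can be illustrated with a minimal sketch. It assumes plain substring matching stands in for real entity linking. The `Fact` type, the `distant_labels` helper, and the toy knowledge base and sentences are all hypothetical, made up for illustration only.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """One knowledge-base triple (hypothetical schema for this sketch)."""
    head: str
    relation: str
    tail: str


def distant_labels(facts, sentences):
    """Label a sentence with a fact's relation whenever it mentions both entities."""
    labeled = []
    for sentence in sentences:
        for fact in facts:
            # Assumption: substring matching replaces proper entity linking.
            if fact.head in sentence and fact.tail in sentence:
                labeled.append((sentence, fact.head, fact.tail, fact.relation))
    return labeled


# Toy knowledge base and corpus, invented for illustration.
facts = [Fact("Barack Obama", "born_in", "Honolulu")]
sentences = [
    "Barack Obama was born in Honolulu.",        # expresses the relation
    "Barack Obama visited Honolulu last week.",  # both entities, different meaning
    "Honolulu is the capital of Hawaii.",        # only one entity, so no label
]

labeled = distant_labels(facts, sentences)
for row in labeled:
    print(row)
```

The second sentence receives the `born_in` label even though it does not express that relation. This is the kind of label noise a model trained on such data has to tolerate.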
Mintz, M., Bills, S., Snow, R., & Jurafsky, D. (2009). *Distant supervision for relation extraction without labeled data.*
Riedel, S., Yao, L., & McCallum, A. (2010). *Modeling relations and their mentions without labeled text.* (Often regarded as one of the key works on distant supervision for relation extraction.)
Hoffmann, R., Zhang, C., Ling, X., Zettlemoyer, L., & Weld, D. S. (2011). *Knowledge-based weak supervision for information extraction of overlapping relations.*
Ratner, A., De Sa, C., Wu, S., Selsam, D., & Ré, C. (2016). *Data programming: Creating large training sets, quickly.* (A representative framework closely related to weak and distant supervision.)