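Hi everyone! In my recent research I've noticed that the datasets used for event extraction have become increasingly diverse, so I've gathered them in one place here: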
Sentence-level EE
ACE2005
KBP2017
MAVEN
- Time: EMNLP2020
- Paper: MAVEN: A Massive General Domain Event Detection Dataset
- Link: https://github.com/THU-KEG/MAVEN-dataset
FewED
FMC
CySecED
- Time: EMNLP2020
- Paper: Introducing a New Dataset for Event Detection in Cybersecurity Texts
CASIE
- Time: AAAI2020
- Paper: CASIE: Extracting Cybersecurity Event Information from Text
- Link: https://github.com/Ebiquity/CASIE
Dialogue EE
- Time: Proceedings of the First Joint Workshop on Narrative Understanding, Storylines, and Events, 2020
- Paper: Automatic extraction of personal events from dialogue
- Link: https://www.artie.com/data/personaleventsindialogue/
Commodity News Corpus for Event Extraction
Few-shot Financial Chinese event extraction dataset
DuEE
Genia Event Extraction (GE)
- Time: 2011
- Link: http://bionlp-st.dbcls.jp/GE/2011/eval-test/
TimeBank
LitBank
- Time: ACL2019
- Paper: https://aclanthology.org/P19-1353/
- Link: https://github.com/dbamman/litbank
Doc-level EE
MUC4
DCFEE
ChFinAnn
RAMS
- Time: ACL2020
- Paper: Multi-Sentence Argument Linking
- Link: https://nlp.jhu.edu/rams/
WIKIEVENTS
- Time: NAACL2021
- Paper: Document-Level Event Argument Extraction by Conditional Generation
- Link: https://github.com/raspberryice/gen-arg
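This list is aimed mainly at current event extraction tasks, and it is also kept in sync on GitHub. If you spot any omissions, feel free to get in touch and discuss!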
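Copyright notice: This is an original article by carrie_0307, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.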