Event Detection from News in Indian Languages
The dataset is annotated at the word level. Each word is enclosed in a <W> tag, and the <W> tags are grouped inside <P> tags. Words that describe an event, or the reason, place, casualties, or time of an event, are additionally enclosed in dedicated tags. These tags are listed below.
<MANMADE_EVENT TYPE="Subtype">: This tag contains the words that describe a man-made disaster. The TYPE attribute holds one of the subtypes defined for man-made disasters in Task 2.
<NATURAL_EVENT TYPE="Subtype">: This tag contains the words that describe a natural disaster. The TYPE attribute holds one of the subtypes defined for natural disasters in Task 2.
<REASON-ARG>: This tag contains the words that give the reason the event occurred.
<TIME-ARG>: This tag contains the words that give the time at which the event occurred.
<CASUALTIES-ARG>: This tag contains the words that describe the casualties caused by the event.
<PLACE_ARG>: This tag contains the words that give the place where the event occurred.
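The tag layout described above can be read with a standard XML parser. The following is a minimal sketch, assuming that each <P> block is well-formed XML with straight-quoted attributes and that the event and argument tags are direct children of <P> wrapping one or more <W> elements. The sample sentence, the subtype value EARTHQUAKE, and the function name extract_spans are illustrative assumptions and are not taken from the dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical paragraph in the annotation format described above.
# The subtype "EARTHQUAKE" is a placeholder, not a confirmed Task 2 subtype.
SAMPLE = """<P>
<NATURAL_EVENT TYPE="EARTHQUAKE"><W>Earthquake</W></NATURAL_EVENT>
<W>hits</W>
<PLACE_ARG><W>Nepal</W></PLACE_ARG>
<TIME-ARG><W>on</W><W>Saturday</W></TIME-ARG>
<W>,</W>
<CASUALTIES-ARG><W>ten</W><W>dead</W></CASUALTIES-ARG>
</P>"""

# Tag names as listed in the dataset description.
ANNOTATION_TAGS = {
    "MANMADE_EVENT",
    "NATURAL_EVENT",
    "REASON-ARG",
    "TIME-ARG",
    "CASUALTIES-ARG",
    "PLACE_ARG",
}


def extract_spans(paragraph_xml):
    """Return (tag, subtype, text) for each annotated span in one <P> block.

    subtype is the TYPE attribute for event tags and None for argument tags.
    Words outside any annotation tag are skipped.
    """
    root = ET.fromstring(paragraph_xml)
    spans = []
    for child in root:
        if child.tag in ANNOTATION_TAGS:
            # Collect the text of every <W> element wrapped by this tag.
            words = [w.text or "" for w in child.iter("W")]
            spans.append((child.tag, child.get("TYPE"), " ".join(words)))
    return spans


for tag, subtype, text in extract_spans(SAMPLE):
    print(tag, subtype, text)
```

If the released files hold several <P> blocks without a single enclosing root element, they would need to be split or wrapped in a root element before being passed to a parser like this one.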
Language | Train Data | Test Data |
---|---|---|
Bengali | | |
English | | |
Hindi | | |
Marathi | | |
Tamil | | |
The decryption key for the datasets can be obtained by registering for the task.
If you use the EDNIL dataset, please cite the following paper:
@inproceedings{dave2020fire, title={FIRE 2020 EDNIL track: Event detection from news in Indian languages}, author={Dave, Bhargav and Gangopadhyay, Surupendu and Majumder, Prasenjit and Bhattacharya, Pushpak and Sarkar, Sudeshna and Devi, Sobha Lalitha}, booktitle={Proceedings of the 12th Annual Meeting of the Forum for Information Retrieval Evaluation}, pages={25--28}, year={2020} }
@article{davea2020overview, title={Overview of the FIRE 2020 EDNIL Track: Event Detection from News in Indian Languages}, author={Dave, Bhargav and Gangopadhyay, Surupendu and Majumder, Prasenjit and Bhattacharya, Pushpak and Sarkar, Sudeshna and Devi, Sobha Lalitha}, year={2020} }