Task Description:
Verb ambiguity in Arabic is a challenging problem at all natural language processing levels.
This task addresses verb ambiguity in three areas:
(A) Verb type and tense (imperative, past, present).
(B) Active and passive voice.
(C) Verb morphological features (person, number, and gender).
An example for (A): The verb forms تعلم/تعلموا can be an imperative تَعَلَّمْ/تَعَلَّمُوا, a past tense verb تَعَلَّمَ/تَعَلَّمُوا, or a present tense verb تَعْلَمُ/تَعْلَمُوا.
An example for (B): The verb forms حمل/يحمل can be active voice حَمَلَ/يَحْمِلُ or passive voice حُمِلَ/يُحْمَلُ.
An example for (C): The verb form فعلت can be a first person singular past verb فَعَلْتُ, a second person singular masculine past verb فَعَلْتَ, a second person singular feminine past verb فَعَلْتِ, or a third person singular feminine past verb فَعَلَتْ.
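The ambiguity in example (C) arises because ordinary written Arabic omits the diacritics. The following minimal Python sketch illustrates this, assuming the diacritics are encoded as Unicode nonspacing marks (category Mn). The helper name strip_diacritics is hypothetical and not part of the task definition.

```python
import unicodedata

def strip_diacritics(word: str) -> str:
    """Remove Unicode nonspacing marks (Arabic short vowels, sukun, shadda)."""
    return "".join(ch for ch in word if unicodedata.category(ch) != "Mn")

# The four readings of example (C), keyed by the diacritized form.
readings = {
    "فَعَلْتُ": "first person singular, past",
    "فَعَلْتَ": "second person singular masculine, past",
    "فَعَلْتِ": "second person singular feminine, past",
    "فَعَلَتْ": "third person singular feminine, past",
}

# All four diacritized forms collapse to one undiacritized surface form.
surface_forms = {strip_diacritics(form) for form in readings}
print(surface_forms)
```

Four distinct analyses share one surface form, so a system has to recover the features from the sentence context.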
Sub-tasks: The task has three sub-tasks and a complete task:
Task 1.A. Verb tense classification
Input: a white space tokenized Arabic sentence.
Output: A list of the sentence tokens, each token is annotated as follows:
O: If the token is not a verb
TENSE: if the token is a verb, where TENSE:=PAST|PRESENT|FUTURE|IMPERATIVE
Task 1.B. Active/passive voice classification
Input: a white space tokenized Arabic sentence.
Output: A list of the sentence tokens, each token is annotated as follows:
O: If the token is not a verb
VOICE: if the token is a verb, where
VOICE:=ACTIVE|PASSIVE
Task 1.C. Verb morphological features classification
Input: a white space tokenized Arabic sentence.
Output: A list of the sentence tokens, each token is annotated as follows:
O: If the token is not a verb
PERSON-NUMBER-GENDER: if the token is a verb, where
PERSON:=FIRST|SECOND|THIRD
NUMBER:=SINGULAR|DUAL|PLURAL
GENDER:=MASCULINE|FEMININE
Complete Task:
Input: a white space tokenized Arabic sentence.
Output: A list of the sentence tokens, each token is annotated as follows:
O: If the token is not a verb
TENSE-VOICE-PERSON-NUMBER-GENDER: if the token is a verb, where
TENSE:=PAST|PRESENT|FUTURE|IMPERATIVE
VOICE:=ACTIVE|PASSIVE
PERSON:=FIRST|SECOND|THIRD
NUMBER:=SINGULAR|DUAL|PLURAL
GENDER:=MASCULINE|FEMININE
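The label grammar above can be checked mechanically. The sketch below is one possible validator, not an official tool of the task. It assumes that the five values are joined with hyphens in the order TENSE-VOICE-PERSON-NUMBER-GENDER, and that a Task 1.C label joins person, number, and gender in the same way. The function names parse_label and subtask_labels are hypothetical.

```python
# Allowed values, copied from the label definitions of the complete task.
FIELDS = (
    ("tense", {"PAST", "PRESENT", "FUTURE", "IMPERATIVE"}),
    ("voice", {"ACTIVE", "PASSIVE"}),
    ("person", {"FIRST", "SECOND", "THIRD"}),
    ("number", {"SINGULAR", "DUAL", "PLURAL"}),
    ("gender", {"MASCULINE", "FEMININE"}),
)

def parse_label(label: str):
    """Return None for 'O', otherwise a dict with the five verb features."""
    if label == "O":
        return None
    parts = label.split("-")  # assumption: hyphen-joined values
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(parts)}: {label!r}")
    parsed = {}
    for (name, allowed), value in zip(FIELDS, parts):
        if value not in allowed:
            raise ValueError(f"invalid {name} value {value!r} in {label!r}")
        parsed[name] = value
    return parsed

def subtask_labels(label: str) -> dict:
    """Project a complete-task label onto the three sub-task labels."""
    parsed = parse_label(label)
    if parsed is None:
        return {"1.A": "O", "1.B": "O", "1.C": "O"}
    return {
        "1.A": parsed["tense"],
        "1.B": parsed["voice"],
        "1.C": "-".join(parsed[k] for k in ("person", "number", "gender")),
    }

print(subtask_labels("PRESENT-ACTIVE-THIRD-SINGULAR-MASCULINE"))
```

Keeping the allowed values in one table means a change to the label set only has to be made in one place.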
Example:
Input: التركيز على تناول أغذية غنية بحمض الفوليك حيث تشير الدراسات وجود علاقة بين النقص في حمض الفوليك وحالات الإكتئاب كون نقص حمض الفوليك يساهم في إنخفاض مستويات مادة السيروتونين في الدماغ
Output (columns as they appear in this example: token | voice | tense | person | number | gender):
التركيز | O | O | O | O | O |
على | O | O | O | O | O |
تناول | O | O | O | O | O |
أغذية | O | O | O | O | O |
غنية | O | O | O | O | O |
بحمض | O | O | O | O | O |
الفوليك | O | O | O | O | O |
حيث | O | O | O | O | O |
تشير | active | present | third | plural | feminine |
الدراسات | O | O | O | O | O |
وجود | O | O | O | O | O |
علاقة | O | O | O | O | O |
بين | O | O | O | O | O |
النقص | O | O | O | O | O |
في | O | O | O | O | O |
حمض | O | O | O | O | O |
الفوليك | O | O | O | O | O |
وحالات | O | O | O | O | O |
الإكتئاب | O | O | O | O | O |
كون | O | O | O | O | O |
نقص | O | O | O | O | O |
حمض | O | O | O | O | O |
الفوليك | O | O | O | O | O |
يساهم | active | present | third | singular | masculine |
في | O | O | O | O | O |
إنخفاض | O | O | O | O | O |
مستويات | O | O | O | O | O |
مادة | O | O | O | O | O |
السيروتونين | O | O | O | O | O |
في | O | O | O | O | O |
الدماغ | O | O | O | O | O |
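The example output can be read back with a few lines of Python. This is a sketch under assumptions taken from the example itself: each line holds the token followed by five pipe-separated cells with a trailing pipe, and the cells appear in the order voice, tense, person, number, gender. The names COLUMNS and parse_row are hypothetical.

```python
# Column order as it appears in the example rows (an assumption read off the example).
COLUMNS = ("voice", "tense", "person", "number", "gender")

def parse_row(line: str):
    """Split 'token | c1 | ... | c5 |' into (token, features), features=None for non-verbs."""
    cells = [cell.strip() for cell in line.split("|")]
    if cells and cells[-1] == "":
        cells.pop()  # drop the empty cell produced by the trailing pipe
    if len(cells) != 1 + len(COLUMNS):
        raise ValueError(f"expected a token plus {len(COLUMNS)} cells: {line!r}")
    token, *values = cells
    if all(value == "O" for value in values):
        return token, None
    return token, dict(zip(COLUMNS, values))

# Two rows copied from the example output.
rows = [
    "كون | O | O | O | O | O |",
    "يساهم | active | present | third | singular | masculine |",
]

# Keep only the tokens annotated as verbs.
verbs = [(tok, feats) for tok, feats in map(parse_row, rows) if feats is not None]
print(verbs)
```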
Data:
You are free to use any available Arabic corpus. We recommend using Khaleej-2004 and/or Watan-2004. Both corpora are in Windows-1256 encoding and need to be converted to UTF-8.
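The encoding conversion mentioned above could be done along the following lines in Python. This is a sketch only: the function name convert_to_utf8 and the file names are hypothetical, and the layout of the corpus files on disk is not specified here. Python's built-in codec for Windows-1256 is named cp1256.

```python
import tempfile
from pathlib import Path

def convert_to_utf8(src: Path, dst: Path, source_encoding: str = "cp1256") -> None:
    """Re-encode one text file from Windows-1256 (cp1256) to UTF-8."""
    text = src.read_text(encoding=source_encoding)
    dst.write_text(text, encoding="utf-8")

# Round-trip demonstration on a temporary file with hypothetical names.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "sample_cp1256.txt"
    dst = Path(tmp) / "sample_utf8.txt"
    src.write_bytes("تعلم".encode("cp1256"))  # simulate a Windows-1256 corpus file
    convert_to_utf8(src, dst)
    converted = dst.read_text(encoding="utf-8")

print(converted)
```

For a whole corpus, the same function can be applied to every file in a directory, for example by iterating over Path.rglob("*.txt") if the files use that extension.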
Tools:
You are free to use any available NLP tools, on one condition: the tools must not reveal the verb tense, verb type, or verb morphological features. For example, you are not allowed to use Madamira unless you are one of the developers of Madamira. If you intend to use ALP or Farasa, please state in the tool request that you need a simplified version of the tool that does not expose the verb tense and/or verb type.
This restriction does not apply if you are the developer of the NLP tool that you are using. For example, the developers of Madamira, ALP, and Farasa are allowed to participate using their own tools without this restriction.
Important Dates: http://nsurl.org/importantdates
Task participation:
To participate in this task, the team leader has to do the following:
- Choose a name for your team (the name should reflect your team)
- Log in as an author to https://easychair.org/conferences/?conf=nsurl2019
- Add the paper title: <Team-name> at NSURL-2019 Task 1: Verb Ambiguity in Arabic Morphological Analysis
- Add the paper authors: the team members
- Add the paper abstract and keywords: a simple tentative abstract that you can modify at any time
- Submit
Results:
We list here the results of the participating teams after 30 June 2019.
Paper submission:
We list here instructions for paper submissions after 30 June 2019.
Task Organizers:
If you have any queries regarding this task, please refer to the task organizers:
Abed Alhakim Freihat: <abed.freihat@unitn.it>
Mourad Abbas: <m_abbas04@yahoo.fr>
Abdallah Abushmaes: <abdallah.abushamaes@mawdoo3.com>
Gabor Bella: <gabor.bella@unitn.it>
Follow us on Researchgate: https://www.researchgate.net/project/Task-1-Verb-ambiguity-in-Arabic-morphological-analysis