Arabic Information Extraction Methods: A Survey

London Journal of Engineering Research
Authored by Mazen El Sayed, George Lebbos, Haissam Hajjar
Language: English

Information retrieval (IR) systems developed for Western languages such as English perform well on those languages, but they do not achieve the same performance when applied to Eastern languages such as Arabic. This is because Arabic has a different and complex structure and morphology: polysemy, irregular and inflected derived forms, variant spellings of certain words, variant writings of certain character combinations, and short (diacritic) and long vowels. In addition, an Arabic word is derived from a root by attaching affixes according to a regular set of word patterns. To address these problems, several methods have been proposed. The aim of this paper is to survey these methods. Although we do not claim that this study is exhaustive, it covers nearly 20 different methods. The main approaches applied in these methods are morphological and statistical analysis. To extract information from an Arabic document, methods based on either approach must answer the following question: "How can we find the root of the word we are searching for?" To look up a word in an Arabic dictionary, we must first extract its root and then find that root in the dictionary, because the vocabulary of Arabic is essentially built by derivation from roots. Roots are words composed of three to five consonant letters. This work contributes to improving the performance of Arabic information retrieval systems, because Arabic information extraction methods are the kernel of such systems.
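The root-finding question raised in the abstract can be illustrated with a minimal Python sketch of the morphological approach: strip candidate affixes, match the remaining stem against word patterns, and accept a root only if it appears in a root dictionary. This is not one of the surveyed methods. The affix lists, the patterns, the dictionary, and all function names below are hypothetical and far smaller than what a real system would need.

```python
# Hypothetical, tiny resources for illustration only; real systems rely on
# much larger affix lists, pattern tables, and root dictionaries.
PREFIXES = ["ال", "و"]
SUFFIXES = ["ات", "ون", "ة"]
PATTERNS = ["فاعل", "مفعول", "فعل"]
ROOT_DICTIONARY = {"كتب", "درس"}

# In a pattern, these three letters are placeholders for root consonants;
# every other letter must match the stem literally.
ROOT_SLOTS = "فعل"


def match_pattern(stem, pattern):
    """Return the letters filling the root slots, or None if no match."""
    if len(stem) != len(pattern):
        return None
    root = []
    for stem_char, pattern_char in zip(stem, pattern):
        if pattern_char in ROOT_SLOTS:
            root.append(stem_char)
        elif stem_char != pattern_char:
            return None
    return "".join(root)


def candidate_stems(word):
    """Return the word plus variants with one prefix and/or suffix removed."""
    stems = [word]
    for prefix in PREFIXES:
        if word.startswith(prefix):
            stems.append(word[len(prefix):])
    for stem in list(stems):
        for suffix in SUFFIXES:
            if stem.endswith(suffix):
                stems.append(stem[:-len(suffix)])
    return stems


def extract_root(word):
    """Return the first dictionary root found for the word, or None."""
    for stem in candidate_stems(word):
        for pattern in PATTERNS:
            root = match_pattern(stem, pattern)
            if root in ROOT_DICTIONARY:
                return root
    return None
```

With these toy resources, `extract_root("مكتوب")` and `extract_root("الكاتب")` both resolve to the root `كتب`, the first through the pattern `مفعول` and the second after removing the prefix `ال` and matching `فاعل`. The dictionary check is what keeps the sketch from accepting arbitrary three-letter strings as roots. Irregular forms and spelling variants, which the abstract names as the main difficulties, would require considerably more machinery.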


