Text mining has drawn increasing attention in recent years and has been applied in many domains, including web mining and sentiment analysis. Text preprocessing is an important stage in text mining. The main problems in text mining are imposing structure on text data and coping with its very high dimensionality. Natural language processing and morphological tools can be employed to reduce the dimensionality of text data. In addition, term weighting schemes can be used to enhance the representation of text as feature vectors. Research in the field of Arabic text mining is still fairly limited. This book presents and compares the impact of text preprocessing on Arabic text classification using popular text classification algorithms. Text preprocessing includes applying different term weighting schemes and Arabic morphological analysis (stemming and light stemming). Text classification algorithms are applied to seven Arabic corpora. Results show that light stemming with term pruning is the best feature reduction technique; Support Vector Machines and Naive Bayes variants outperform the other algorithms; and weighting schemes affect the performance of distance-based classifiers.
Product Identifiers
Publisher
LAP
ISBN-13
9783844319576
eBay Product ID (ePID)
108481229
Product Key Features
Author
Motaz Saad
Publication Name
Arabic Text Classification
Format
Paperback
Language
English
Subject
Engineering & Technology
Publication Year
2011
Type
Textbook
Number of Pages
172 Pages
Dimensions
Item Height
229mm
Item Width
152mm
Item Weight
259g
Additional Product Features
Title_Author
Motaz Saad
Country/Region of Manufacture
Germany