Title: Arabic documents indexing and classification based on latent semantic analysis and self-organizing map
Authors: Mokbel, Chafic 
Greige, Hanna 
Sarraf, Charles
Kurimo, Mikko
Affiliations: Department of Electrical Engineering 
Department of Mathematics 
Issue Date: 2001
Part of: Proceedings of the IEEE workshop on Natural Language Processing in Arabic
Conference: Workshop on Natural Language Processing in Arabic (2001 : Beirut, Lebanon) 
This paper describes an Arabic document indexing system based on a hybrid "Latent Semantic Analysis" (LSA) and "Self-Organizing Maps" (SOM) algorithm. The approach has the advantage of being completely statistical and of automatically inferring the indices from the document database. A rule-based stemming method is also proposed for the Arabic language. The whole system was tested on a database of Alnahar newspaper articles from 1999. Document clustering and a few retrieval experiments gave satisfactory results.
Type: Conference Paper
Appears in Collections: Department of Electrical Engineering

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.