Please use this identifier to cite or link to this item:
https://scholarhub.balamand.edu.lb/handle/uob/391
Title: | Arabic documents indexing and classification based on latent semantic analysis and self-organizing map |
Authors: | Mokbel, Chafic; Greige, Hanna; Sarraf, Charles; Kurimo, Mikko |
Affiliations: | Department of Electrical Engineering; Department of Mathematics |
Issue Date: | 2001 |
Part of: | Proceedings of the IEEE workshop on Natural Language Processing in Arabic |
Conference: | Workshop on Natural Language Processing in Arabic (2001 : Beirut, Lebanon) |
Abstract: | This paper describes an Arabic document indexing system based on a hybrid "Latent Semantic Analysis" (LSA) and "Self-Organizing Maps" (SOM) algorithm. The approach has the advantage of being entirely statistical and of automatically inferring the indices from the document database. A rule-based stemming method for the Arabic language is also proposed. The whole system was evaluated on a database of Alnahar newspaper articles from 1999. Document clustering and a few retrieval experiments gave satisfactory results. |
URI: | https://scholarhub.balamand.edu.lb/handle/uob/391 |
Type: | Conference Paper |
Appears in Collections: | Department of Electrical Engineering |