Acknowledgments
We thank the Linguistic Data Consortium (LDC) for providing the LDC2001T55 Arabic Newswire Part 1 data set at no cost and for awarding us the Fall 2012 LDC Data Scholarship.