• Title/Summary/Keyword: data mining processes

Search Result 140, Processing Time 0.029 seconds

A study on the CRM strategy for medium and small industry of distribution (중소유통업체의 CRM 도입방안에 관한 연구)

  • Kim, Gi-Pyoung
    • Journal of Distribution Science
    • /
    • v.8 no.3
    • /
    • pp.37-47
    • /
    • 2010
  • CRM refers to the operating activities that always maintain and promote good relationship with customers to ultimately maximize the company's profits by understanding the value of customers to meet their demands, establishing a strategy which may maximize the Life Time Value and successfully operating the business by integrating the customer management processes. In our country, many big businesses are introducing CRM initiatively to use it in marketing strategy however, most medium and small sized companies do not understand CRM clearly or they feel difficult to introduce it due to huge investment needed. This study is intended to present CRM promotion strategy and activities plan fit for the medium and small sized companies by analyzing the success factors of the leading companies those have already executed CRM by surveying the precedents to make the distributors out of the industries have close relation with consumers to overcome their weakness in scale and strengthen their competitiveness in such a rapidly changing and fiercely competing market. There are 5 stages to build CRM such as the recognition of the needs of CRM establishment, the establishment of CRM integrated database, the establishment of customer analysis and marketing strategy through data mining, the practical use of customer analysis through data mining and the implementation of response analysis and close loop process. Through the case study of leading companies, CRM is needed in types of businesses where the companies constantly contact their customers. To meet their needs, they assertively analyze their customer information. Through this, they develop their own CRM programs personalized for their customers to provide high quality service products. For customers helping them make profits, the VIP marketing strategy is conducted to keep the customers from breaking their relationships with the companies. Through continuous management, CRM should be executed. In other words, through customer segmentation, the profitability for the customers should be maximized. The maximization of the profitability for the customers is the key to CRM. These are the success factors of the CRM of the distributors in Korea. Firstly, the top management's will power for CS management is needed. Secondly, the culture across the company should be made to respect the customers. Thirdly, specialized customer management and CRM workers should be trained. Fourthly, CRM behaviors should be developed for the whole staff members. Fifthly, CRM should be carried out through systematic cooperation between related departments. To make use of the case study for CRM, the company should understand the customer and establish customer management programs to set the optimal CRM strategy and continuously pursue it according to a long-term plan. For this, according to collected information and customer data, customers should be segmented and the responsive customer system should be designed according to the differentiated strategy according to the class of the customers. In terms of the future CRM, integrated CRM is essential where the customer information gathers together in one place. As the degree of customers' expectation increases a lot, the effective way to meet the customers' expectation should be pursued. As the IT technology improved rapidly, RFID (Radio Frequency Identification) appears. On a real-time basis, information about products and customers is obtained massively in a very short time. A strategy for successful CRM promotion should be improving the organizations in charge of contacting customers, re-planning the customer management processes and establishing the integrated system with the marketing strategy to keep good relation with the customers according to a long-term plan and a proper method suitable to the market conditions and run a company-wide program. In addition, a CRM program should be continuously improved and complemented to meet the company's characteristics. Especially, a strategy for successful CRM for the medium and small sized distributors should be as follows. First, they should change their existing recognition in CRM and keep in-depth care for the customers. Second, they should benchmark the techniques of CRM from the leading companies and find out success points to use. Third, they should seek some methods best suited for their particular conditions by achieving the ideas combining their own strong points with marketing. Fourth, a CRM model should be developed that will promote relationship with individual customers just like the precedents of small sized businesses in Switzerland through small but noticeable events.

  • PDF

A study on hard-core users and bots detection using classification of game character's growth type in online games (캐릭터 성장 유형 분류를 통한 온라인 게임 하드코어 유저와 게임 봇 탐지 연구)

  • Lee, Jin;Kang, Sung Wook;Kim, Huy Kang
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.25 no.5
    • /
    • pp.1077-1084
    • /
    • 2015
  • Security issues such as an illegal acquisition of personal information and identity theft happen due to using game bots in online games. Game bots collect items and money unfairly, so in-game contents are rapidly depleted, and honest users feel deprived. It causes a downturn in the game market. In this paper, we defined the growth types by analyzing the growth processes of users with actual game data. We proposed the framework that classify hard-core users and game bots in the growth patterns. We applied the framework in the actual data. As a result, we classified five growth types and detected game bots from hard-core users with 93% precision. Earlier studies show that hard-core users are also detected as a bot. We clearly separated game bots and hard-core users before full growth.

USE OF NEAR INFRARED FOR THE QUANTITATIVE ANALYSES OF BAUXITE

  • Walker, Graham S.;Cirulis, Robyn;Fletcher, Benjimin;Chandrashekar, S.
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference
    • /
    • 2001.06a
    • /
    • pp.1171-1171
    • /
    • 2001
  • Quantitative analysis is an important requirement in exploration, mining and processing of minerals. There is an increasing need for the use of quantitative mineralogical data to assist with bore hole logging, deposit delineation, grade control, feed to processing plants and monitoring of solid process residues. Quantitative analysis using X-Ray Powder Diffraction (XRD) requires fine grinding and the addition of a reference material, or the application of Rietveld analysis to XRD patterns to provide accurate analysis of the suite of minerals present. Whilst accurate quantitative data can be obtained in this manner, the method is time consuming and limited to the laboratory. Mid infrared when combined with multivariant analysis has also been used for quantitative analysis. However, factors such as the absorption coefficients and refractive index of the minerals requires special sample preparation and dilution in a dispersive medium, such as KBr to minimize distortion of spectral features. In contrast, the lower intensity of the overtones and combinations of the fundamental vibrations in the near infrared allow direct measurement of virtually any solid without special sample preparation or dilution. Thus Near Infrared Spectroscopy (NIR) has found application for quantitative on-line/in line analysis and control in a range of processing applications which include, moisture control in clay and textile processing, fermentation processes, wheat analysis, gasoline analysis and chemicals and polymers. It is developing rapidly in the mineral exploration industry and has been underpinned by the development of portable NIR spectrometers and spectral libraries of a wide range of minerals. For example, iron ores have been identified and characterized in terms of the individual mineral components using field spectrometers. Data acquisition time of NIR field instruments is of the order of seconds and sample preparation is minimal. Consequently these types of spectrometers have great potential for in-line or on-line application in the minerals industry. To demonstrate the applicability of NIR field spectroscopy for quantitative analysis of minerals, a specific example on the quantification of lateritic bauxites will be presented. It has been shown that the application of Partial Least Squares regression analysis (PLS) to the NIR spectra can be used to quantify chemistry and mineralogy in a range of lateritic bauxites. Important, issues such as sampling, precision, repeatability, and replication which influence the results will be discussed.

  • PDF

Measuring the Public Service Quality Using Process Mining: Focusing on N City's Building Licensing Complaint Service (프로세스 마이닝을 이용한 공공서비스의 품질 측정: N시의 건축 인허가 민원 서비스를 중심으로)

  • Lee, Jung Seung
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.35-52
    • /
    • 2019
  • As public services are provided in various forms, including e-government, the level of public demand for public service quality is increasing. Although continuous measurement and improvement of the quality of public services is needed to improve the quality of public services, traditional surveys are costly and time-consuming and have limitations. Therefore, there is a need for an analytical technique that can measure the quality of public services quickly and accurately at any time based on the data generated from public services. In this study, we analyzed the quality of public services based on data using process mining techniques for civil licensing services in N city. It is because the N city's building license complaint service can secure data necessary for analysis and can be spread to other institutions through public service quality management. This study conducted process mining on a total of 3678 building license complaint services in N city for two years from January 2014, and identified process maps and departments with high frequency and long processing time. According to the analysis results, there was a case where a department was crowded or relatively few at a certain point in time. In addition, there was a reasonable doubt that the increase in the number of complaints would increase the time required to complete the complaints. According to the analysis results, the time required to complete the complaint was varied from the same day to a year and 146 days. The cumulative frequency of the top four departments of the Sewage Treatment Division, the Waterworks Division, the Urban Design Division, and the Green Growth Division exceeded 50% and the cumulative frequency of the top nine departments exceeded 70%. Higher departments were limited and there was a great deal of unbalanced load among departments. Most complaint services have a variety of different patterns of processes. Research shows that the number of 'complementary' decisions has the greatest impact on the length of a complaint. This is interpreted as a lengthy period until the completion of the entire complaint is required because the 'complement' decision requires a physical period in which the complainant supplements and submits the documents again. In order to solve these problems, it is possible to drastically reduce the overall processing time of the complaints by preparing thoroughly before the filing of the complaints or in the preparation of the complaints, or the 'complementary' decision of other complaints. By clarifying and disclosing the cause and solution of one of the important data in the system, it helps the complainant to prepare in advance and convinces that the documents prepared by the public information will be passed. The transparency of complaints can be sufficiently predictable. Documents prepared by pre-disclosed information are likely to be processed without problems, which not only shortens the processing period but also improves work efficiency by eliminating the need for renegotiation or multiple tasks from the point of view of the processor. The results of this study can be used to find departments with high burdens of civil complaints at certain points of time and to flexibly manage the workforce allocation between departments. In addition, as a result of analyzing the pattern of the departments participating in the consultation by the characteristics of the complaints, it is possible to use it for automation or recommendation when requesting the consultation department. In addition, by using various data generated during the complaint process and using machine learning techniques, the pattern of the complaint process can be found. It can be used for automation / intelligence of civil complaint processing by making this algorithm and applying it to the system. This study is expected to be used to suggest future public service quality improvement through process mining analysis on civil service.

Self Introduction Essay Classification Using Doc2Vec for Efficient Job Matching (Doc2Vec 모형에 기반한 자기소개서 분류 모형 구축 및 실험)

  • Kim, Young Soo;Moon, Hyun Sil;Kim, Jae Kyeong
    • Journal of Information Technology Services
    • /
    • v.19 no.1
    • /
    • pp.103-112
    • /
    • 2020
  • Job seekers are making various efforts to find a good company and companies attempt to recruit good people. Job search activities through self-introduction essay are nowadays one of the most active processes. Companies spend time and cost to reviewing all of the numerous self-introduction essays of job seekers. Job seekers are also worried about the possibility of acceptance of their self-introduction essays by companies. This research builds a classification model and conducted an experiments to classify self-introduction essays into pass or fail using deep learning and decision tree techniques. Real world data were classified using stratified sampling to alleviate the data imbalance problem between passed self-introduction essays and failed essays. Documents were embedded using Doc2Vec method developed from existing Word2Vec, and they were classified using logistic regression analysis. The decision tree model was chosen as a benchmark model, and K-fold cross-validation was conducted for the performance evaluation. As a result of several experiments, the area under curve (AUC) value of PV-DM results better than that of other models of Doc2Vec, i.e., PV-DBOW and Concatenate. Furthmore PV-DM classifies passed essays as well as failed essays, while PV_DBOW can not classify passed essays even though it classifies well failed essays. In addition, the classification performance of the logistic regression model embedded using the PV-DM model is better than the decision tree-based classification model. The implication of the experimental results is that company can reduce the cost of recruiting good d job seekers. In addition, our suggested model can help job candidates for pre-evaluating their self-introduction essays.

Real-time CRM Strategy of Big Data and Smart Offering System: KB Kookmin Card Case (KB국민카드의 빅데이터를 활용한 실시간 CRM 전략: 스마트 오퍼링 시스템)

  • Choi, Jaewon;Sohn, Bongjin;Lim, Hyuna
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.2
    • /
    • pp.1-23
    • /
    • 2019
  • Big data refers to data that is difficult to store, manage, and analyze by existing software. As the lifestyle changes of consumers increase the size and types of needs that consumers desire, they are investing a lot of time and money to understand the needs of consumers. Companies in various industries utilize Big Data to improve their products and services to meet their needs, analyze unstructured data, and respond to real-time responses to products and services. The financial industry operates a decision support system that uses financial data to develop financial products and manage customer risks. The use of big data by financial institutions can effectively create added value of the value chain, and it is possible to develop a more advanced customer relationship management strategy. Financial institutions can utilize the purchase data and unstructured data generated by the credit card, and it becomes possible to confirm and satisfy the customer's desire. CRM has a granular process that can be measured in real time as it grows with information knowledge systems. With the development of information service and CRM, the platform has change and it has become possible to meet consumer needs in various environments. Recently, as the needs of consumers have diversified, more companies are providing systematic marketing services using data mining and advanced CRM (Customer Relationship Management) techniques. KB Kookmin Card, which started as a credit card business in 1980, introduced early stabilization of processes and computer systems, and actively participated in introducing new technologies and systems. In 2011, the bank and credit card companies separated, leading the 'Hye-dam Card' and 'One Card' markets, which were deviated from the existing concept. In 2017, the total use of domestic credit cards and check cards grew by 5.6% year-on-year to 886 trillion won. In 2018, we received a long-term rating of AA + as a result of our credit card evaluation. We confirmed that our credit rating was at the top of the list through effective marketing strategies and services. At present, Kookmin Card emphasizes strategies to meet the individual needs of customers and to maximize the lifetime value of consumers by utilizing payment data of customers. KB Kookmin Card combines internal and external big data and conducts marketing in real time or builds a system for monitoring. KB Kookmin Card has built a marketing system that detects realtime behavior using big data such as visiting the homepage and purchasing history by using the customer card information. It is designed to enable customers to capture action events in real time and execute marketing by utilizing the stores, locations, amounts, usage pattern, etc. of the card transactions. We have created more than 280 different scenarios based on the customer's life cycle and are conducting marketing plans to accommodate various customer groups in real time. We operate a smart offering system, which is a highly efficient marketing management system that detects customers' card usage, customer behavior, and location information in real time, and provides further refinement services by combining with various apps. This study aims to identify the traditional CRM to the current CRM strategy through the process of changing the CRM strategy. Finally, I will confirm the current CRM strategy through KB Kookmin card's big data utilization strategy and marketing activities and propose a marketing plan for KB Kookmin card's future CRM strategy. KB Kookmin Card should invest in securing ICT technology and human resources, which are becoming more sophisticated for the success and continuous growth of smart offering system. It is necessary to establish a strategy for securing profit from a long-term perspective and systematically proceed. Especially, in the current situation where privacy violation and personal information leakage issues are being addressed, efforts should be made to induce customers' recognition of marketing using customer information and to form corporate image emphasizing security.

Impact of Semantic Characteristics on Perceived Helpfulness of Online Reviews (온라인 상품평의 내용적 특성이 소비자의 인지된 유용성에 미치는 영향)

  • Park, Yoon-Joo;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.29-44
    • /
    • 2017
  • In Internet commerce, consumers are heavily influenced by product reviews written by other users who have already purchased the product. However, as the product reviews accumulate, it takes a lot of time and effort for consumers to individually check the massive number of product reviews. Moreover, product reviews that are written carelessly actually inconvenience consumers. Thus many online vendors provide mechanisms to identify reviews that customers perceive as most helpful (Cao et al. 2011; Mudambi and Schuff 2010). For example, some online retailers, such as Amazon.com and TripAdvisor, allow users to rate the helpfulness of each review, and use this feedback information to rank and re-order them. However, many reviews have only a few feedbacks or no feedback at all, thus making it hard to identify their helpfulness. Also, it takes time to accumulate feedbacks, thus the newly authored reviews do not have enough ones. For example, only 20% of the reviews in Amazon Review Dataset (Mcauley and Leskovec, 2013) have more than 5 reviews (Yan et al, 2014). The purpose of this study is to analyze the factors affecting the usefulness of online product reviews and to derive a forecasting model that selectively provides product reviews that can be helpful to consumers. In order to do this, we extracted the various linguistic, psychological, and perceptual elements included in product reviews by using text-mining techniques and identifying the determinants among these elements that affect the usability of product reviews. In particular, considering that the characteristics of the product reviews and determinants of usability for apparel products (which are experiential products) and electronic products (which are search goods) can differ, the characteristics of the product reviews were compared within each product group and the determinants were established for each. This study used 7,498 apparel product reviews and 106,962 electronic product reviews from Amazon.com. In order to understand a review text, we first extract linguistic and psychological characteristics from review texts such as a word count, the level of emotional tone and analytical thinking embedded in review text using widely adopted text analysis software LIWC (Linguistic Inquiry and Word Count). After then, we explore the descriptive statistics of review text for each category and statistically compare their differences using t-test. Lastly, we regression analysis using the data mining software RapidMiner to find out determinant factors. As a result of comparing and analyzing product review characteristics of electronic products and apparel products, it was found that reviewers used more words as well as longer sentences when writing product reviews for electronic products. As for the content characteristics of the product reviews, it was found that these reviews included many analytic words, carried more clout, and related to the cognitive processes (CogProc) more so than the apparel product reviews, in addition to including many words expressing negative emotions (NegEmo). On the other hand, the apparel product reviews included more personal, authentic, positive emotions (PosEmo) and perceptual processes (Percept) compared to the electronic product reviews. Next, we analyzed the determinants toward the usefulness of the product reviews between the two product groups. As a result, it was found that product reviews with high product ratings from reviewers in both product groups that were perceived as being useful contained a larger number of total words, many expressions involving perceptual processes, and fewer negative emotions. In addition, apparel product reviews with a large number of comparative expressions, a low expertise index, and concise content with fewer words in each sentence were perceived to be useful. In the case of electronic product reviews, those that were analytical with a high expertise index, along with containing many authentic expressions, cognitive processes, and positive emotions (PosEmo) were perceived to be useful. These findings are expected to help consumers effectively identify useful product reviews in the future.

Improving Performance of Recommendation Systems Using Topic Modeling (사용자 관심 이슈 분석을 통한 추천시스템 성능 향상 방안)

  • Choi, Seongi;Hyun, Yoonjin;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.3
    • /
    • pp.101-116
    • /
    • 2015
  • Recently, due to the development of smart devices and social media, vast amounts of information with the various forms were accumulated. Particularly, considerable research efforts are being directed towards analyzing unstructured big data to resolve various social problems. Accordingly, focus of data-driven decision-making is being moved from structured data analysis to unstructured one. Also, in the field of recommendation system, which is the typical area of data-driven decision-making, the need of using unstructured data has been steadily increased to improve system performance. Approaches to improve the performance of recommendation systems can be found in two aspects- improving algorithms and acquiring useful data with high quality. Traditionally, most efforts to improve the performance of recommendation system were made by the former approach, while the latter approach has not attracted much attention relatively. In this sense, efforts to utilize unstructured data from variable sources are very timely and necessary. Particularly, as the interests of users are directly connected with their needs, identifying the interests of the user through unstructured big data analysis can be a crew for improving performance of recommendation systems. In this sense, this study proposes the methodology of improving recommendation system by measuring interests of the user. Specially, this study proposes the method to quantify interests of the user by analyzing user's internet usage patterns, and to predict user's repurchase based upon the discovered preferences. There are two important modules in this study. The first module predicts repurchase probability of each category through analyzing users' purchase history. We include the first module to our research scope for comparing the accuracy of traditional purchase-based prediction model to our new model presented in the second module. This procedure extracts purchase history of users. The core part of our methodology is in the second module. This module extracts users' interests by analyzing news articles the users have read. The second module constructs a correspondence matrix between topics and news articles by performing topic modeling on real world news articles. And then, the module analyzes users' news access patterns and then constructs a correspondence matrix between articles and users. After that, by merging the results of the previous processes in the second module, we can obtain a correspondence matrix between users and topics. This matrix describes users' interests in a structured manner. Finally, by using the matrix, the second module builds a model for predicting repurchase probability of each category. In this paper, we also provide experimental results of our performance evaluation. The outline of data used our experiments is as follows. We acquired web transaction data of 5,000 panels from a company that is specialized to analyzing ranks of internet sites. At first we extracted 15,000 URLs of news articles published from July 2012 to June 2013 from the original data and we crawled main contents of the news articles. After that we selected 2,615 users who have read at least one of the extracted news articles. Among the 2,615 users, we discovered that the number of target users who purchase at least one items from our target shopping mall 'G' is 359. In the experiments, we analyzed purchase history and news access records of the 359 internet users. From the performance evaluation, we found that our prediction model using both users' interests and purchase history outperforms a prediction model using only users' purchase history from a view point of misclassification ratio. In detail, our model outperformed the traditional one in appliance, beauty, computer, culture, digital, fashion, and sports categories when artificial neural network based models were used. Similarly, our model outperformed the traditional one in beauty, computer, digital, fashion, food, and furniture categories when decision tree based models were used although the improvement is very small.

Performance Comparison of Clustering using Discritization Algorithm (이산화 알고리즘을 이용한 계층적 클러스터링의 실험적 성능 평가)

  • Won, Jae Kang;Lee, Jeong Chan;Jung, Yong Gyu;Lee, Young Ho
    • Journal of Service Research and Studies
    • /
    • v.3 no.2
    • /
    • pp.53-60
    • /
    • 2013
  • Datamining from the large data in the form of various techniques for obtaining information have been developed. In recent years one of the most sought areas of pattern recognition and machine learning method is created with most of existing learning algorithms based on categorical attributes to a rule or decision model. However, the real-world data, it may consist of numeric attributes in many cases. In addition it contains attributes with numerical values to the normal categorical attribute. In this case, therefore, it is required processes in order to use the data to learn an appropriate value for the type attribute. In this paper, the domain of the numeric attributes are divided into several segments using learning algorithm techniques of discritization. It is described Clustering with other data mining techniques. Large amount of first cluster with characteristics is similar records from the database into smaller groups that split multiple given finite patterns in the pattern space. It is close to each other of a set of patterns that together make up a bunch. Among the set without specifying a particular category in a given data by extracting a pattern. It will be described similar grouping of data clustering technique to classify the data.

  • PDF

Korean Word Sense Disambiguation using Dictionary and Corpus (사전과 말뭉치를 이용한 한국어 단어 중의성 해소)

  • Jeong, Hanjo;Park, Byeonghwa
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.1
    • /
    • pp.1-13
    • /
    • 2015
  • As opinion mining in big data applications has been highlighted, a lot of research on unstructured data has made. Lots of social media on the Internet generate unstructured or semi-structured data every second and they are often made by natural or human languages we use in daily life. Many words in human languages have multiple meanings or senses. In this result, it is very difficult for computers to extract useful information from these datasets. Traditional web search engines are usually based on keyword search, resulting in incorrect search results which are far from users' intentions. Even though a lot of progress in enhancing the performance of search engines has made over the last years in order to provide users with appropriate results, there is still so much to improve it. Word sense disambiguation can play a very important role in dealing with natural language processing and is considered as one of the most difficult problems in this area. Major approaches to word sense disambiguation can be classified as knowledge-base, supervised corpus-based, and unsupervised corpus-based approaches. This paper presents a method which automatically generates a corpus for word sense disambiguation by taking advantage of examples in existing dictionaries and avoids expensive sense tagging processes. It experiments the effectiveness of the method based on Naïve Bayes Model, which is one of supervised learning algorithms, by using Korean standard unabridged dictionary and Sejong Corpus. Korean standard unabridged dictionary has approximately 57,000 sentences. Sejong Corpus has about 790,000 sentences tagged with part-of-speech and senses all together. For the experiment of this study, Korean standard unabridged dictionary and Sejong Corpus were experimented as a combination and separate entities using cross validation. Only nouns, target subjects in word sense disambiguation, were selected. 93,522 word senses among 265,655 nouns and 56,914 sentences from related proverbs and examples were additionally combined in the corpus. Sejong Corpus was easily merged with Korean standard unabridged dictionary because Sejong Corpus was tagged based on sense indices defined by Korean standard unabridged dictionary. Sense vectors were formed after the merged corpus was created. Terms used in creating sense vectors were added in the named entity dictionary of Korean morphological analyzer. By using the extended named entity dictionary, term vectors were extracted from the input sentences and then term vectors for the sentences were created. Given the extracted term vector and the sense vector model made during the pre-processing stage, the sense-tagged terms were determined by the vector space model based word sense disambiguation. In addition, this study shows the effectiveness of merged corpus from examples in Korean standard unabridged dictionary and Sejong Corpus. The experiment shows the better results in precision and recall are found with the merged corpus. This study suggests it can practically enhance the performance of internet search engines and help us to understand more accurate meaning of a sentence in natural language processing pertinent to search engines, opinion mining, and text mining. Naïve Bayes classifier used in this study represents a supervised learning algorithm and uses Bayes theorem. Naïve Bayes classifier has an assumption that all senses are independent. Even though the assumption of Naïve Bayes classifier is not realistic and ignores the correlation between attributes, Naïve Bayes classifier is widely used because of its simplicity and in practice it is known to be very effective in many applications such as text classification and medical diagnosis. However, further research need to be carried out to consider all possible combinations and/or partial combinations of all senses in a sentence. Also, the effectiveness of word sense disambiguation may be improved if rhetorical structures or morphological dependencies between words are analyzed through syntactic analysis.