• Title/Summary/Keyword: Automatically

Search Result 6,810, Processing Time 0.041 seconds

Construction of Event Networks from Large News Data Using Text Mining Techniques (텍스트 마이닝 기법을 적용한 뉴스 데이터에서의 사건 네트워크 구축)

  • Lee, Minchul;Kim, Hea-Jin
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.183-203
    • /
    • 2018
  • News articles are the most suitable medium for examining the events occurring at home and abroad. Especially, as the development of information and communication technology has brought various kinds of online news media, the news about the events occurring in society has increased greatly. So automatically summarizing key events from massive amounts of news data will help users to look at many of the events at a glance. In addition, if we build and provide an event network based on the relevance of events, it will be able to greatly help the reader in understanding the current events. In this study, we propose a method for extracting event networks from large news text data. To this end, we first collected Korean political and social articles from March 2016 to March 2017, and integrated the synonyms by leaving only meaningful words through preprocessing using NPMI and Word2Vec. Latent Dirichlet allocation (LDA) topic modeling was used to calculate the subject distribution by date and to find the peak of the subject distribution and to detect the event. A total of 32 topics were extracted from the topic modeling, and the point of occurrence of the event was deduced by looking at the point at which each subject distribution surged. As a result, a total of 85 events were detected, but the final 16 events were filtered and presented using the Gaussian smoothing technique. We also calculated the relevance score between events detected to construct the event network. Using the cosine coefficient between the co-occurred events, we calculated the relevance between the events and connected the events to construct the event network. Finally, we set up the event network by setting each event to each vertex and the relevance score between events to the vertices connecting the vertices. The event network constructed in our methods helped us to sort out major events in the political and social fields in Korea that occurred in the last one year in chronological order and at the same time identify which events are related to certain events. Our approach differs from existing event detection methods in that LDA topic modeling makes it possible to easily analyze large amounts of data and to identify the relevance of events that were difficult to detect in existing event detection. We applied various text mining techniques and Word2vec technique in the text preprocessing to improve the accuracy of the extraction of proper nouns and synthetic nouns, which have been difficult in analyzing existing Korean texts, can be found. In this study, the detection and network configuration techniques of the event have the following advantages in practical application. First, LDA topic modeling, which is unsupervised learning, can easily analyze subject and topic words and distribution from huge amount of data. Also, by using the date information of the collected news articles, it is possible to express the distribution by topic in a time series. Second, we can find out the connection of events in the form of present and summarized form by calculating relevance score and constructing event network by using simultaneous occurrence of topics that are difficult to grasp in existing event detection. It can be seen from the fact that the inter-event relevance-based event network proposed in this study was actually constructed in order of occurrence time. It is also possible to identify what happened as a starting point for a series of events through the event network. The limitation of this study is that the characteristics of LDA topic modeling have different results according to the initial parameters and the number of subjects, and the subject and event name of the analysis result should be given by the subjective judgment of the researcher. Also, since each topic is assumed to be exclusive and independent, it does not take into account the relevance between themes. Subsequent studies need to calculate the relevance between events that are not covered in this study or those that belong to the same subject.

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.1
    • /
    • pp.1-21
    • /
    • 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data which constitutes a large portion of big data. Over the past decades, text mining technologies have been utilized in various industries for practical applications. In the field of business intelligence, it has been employed to discover new market and/or technology opportunities and support rational decision making of business participants. The market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been a continuous demand in various fields for specific product level-market information. However, the information has been generally provided at industry level or broad categories based on classification standards, making it difficult to obtain specific and proper information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than that of previously offered. We applied Word2Vec algorithm, a neural network based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows: First, the data related to product information is collected, refined, and restructured into suitable form for applying Word2Vec model. Next, the preprocessed data is embedded into vector space by Word2Vec and then the product groups are derived by extracting similar products names based on cosine similarity calculation. Finally, the sales data on the extracted products is summated to estimate the market size of the product groups. As an experimental data, text data of product names from Statistics Korea's microdata (345,103 cases) were mapped in multidimensional vector space by Word2Vec training. We performed parameters optimization for training and then applied vector dimension of 300 and window size of 15 as optimized parameters for further experiments. We employed index words of Korean Standard Industry Classification (KSIC) as a product name dataset to more efficiently cluster product groups. The product names which are similar to KSIC indexes were extracted based on cosine similarity. The market size of extracted products as one product category was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For the performance verification, the results were compared with actual market size of some items. The Pearson's correlation coefficient was 0.513. Our approach has several advantages differing from the previous studies. First, text mining and machine learning techniques were applied for the first time on market size estimation, overcoming the limitations of traditional sampling based- or multiple assumption required-methods. In addition, the level of market category can be easily and efficiently adjusted according to the purpose of information use by changing cosine similarity threshold. Furthermore, it has a high potential of practical applications since it can resolve unmet needs for detailed market size information in public and private sectors. Specifically, it can be utilized in technology evaluation and technology commercialization support program conducted by governmental institutions, as well as business strategies consulting and market analysis report publishing by private firms. The limitation of our study is that the presented model needs to be improved in terms of accuracy and reliability. The semantic-based word embedding module can be advanced by giving a proper order in the preprocessed dataset or by combining another algorithm such as Jaccard similarity with Word2Vec. Also, the methods of product group clustering can be changed to other types of unsupervised machine learning algorithm. Our group is currently working on subsequent studies and we expect that it can further improve the performance of the conceptually proposed basic model in this study.

Automatic Quality Evaluation with Completeness and Succinctness for Text Summarization (완전성과 간결성을 고려한 텍스트 요약 품질의 자동 평가 기법)

  • Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.125-148
    • /
    • 2018
  • Recently, as the demand for big data analysis increases, cases of analyzing unstructured data and using the results are also increasing. Among the various types of unstructured data, text is used as a means of communicating information in almost all fields. In addition, many analysts are interested in the amount of data is very large and relatively easy to collect compared to other unstructured and structured data. Among the various text analysis applications, document classification which classifies documents into predetermined categories, topic modeling which extracts major topics from a large number of documents, sentimental analysis or opinion mining that identifies emotions or opinions contained in texts, and Text Summarization which summarize the main contents from one document or several documents have been actively studied. Especially, the text summarization technique is actively applied in the business through the news summary service, the privacy policy summary service, ect. In addition, much research has been done in academia in accordance with the extraction approach which provides the main elements of the document selectively and the abstraction approach which extracts the elements of the document and composes new sentences by combining them. However, the technique of evaluating the quality of automatically summarized documents has not made much progress compared to the technique of automatic text summarization. Most of existing studies dealing with the quality evaluation of summarization were carried out manual summarization of document, using them as reference documents, and measuring the similarity between the automatic summary and reference document. Specifically, automatic summarization is performed through various techniques from full text, and comparison with reference document, which is an ideal summary document, is performed for measuring the quality of automatic summarization. Reference documents are provided in two major ways, the most common way is manual summarization, in which a person creates an ideal summary by hand. Since this method requires human intervention in the process of preparing the summary, it takes a lot of time and cost to write the summary, and there is a limitation that the evaluation result may be different depending on the subject of the summarizer. Therefore, in order to overcome these limitations, attempts have been made to measure the quality of summary documents without human intervention. On the other hand, as a representative attempt to overcome these limitations, a method has been recently devised to reduce the size of the full text and to measure the similarity of the reduced full text and the automatic summary. In this method, the more frequent term in the full text appears in the summary, the better the quality of the summary. However, since summarization essentially means minimizing a lot of content while minimizing content omissions, it is unreasonable to say that a "good summary" based on only frequency always means a "good summary" in its essential meaning. In order to overcome the limitations of this previous study of summarization evaluation, this study proposes an automatic quality evaluation for text summarization method based on the essential meaning of summarization. Specifically, the concept of succinctness is defined as an element indicating how few duplicated contents among the sentences of the summary, and completeness is defined as an element that indicating how few of the contents are not included in the summary. In this paper, we propose a method for automatic quality evaluation of text summarization based on the concepts of succinctness and completeness. In order to evaluate the practical applicability of the proposed methodology, 29,671 sentences were extracted from TripAdvisor 's hotel reviews, summarized the reviews by each hotel and presented the results of the experiments conducted on evaluation of the quality of summaries in accordance to the proposed methodology. It also provides a way to integrate the completeness and succinctness in the trade-off relationship into the F-Score, and propose a method to perform the optimal summarization by changing the threshold of the sentence similarity.

The Precision Test Based on States of Bone Mineral Density (골밀도 상태에 따른 검사자의 재현성 평가)

  • Yoo, Jae-Sook;Kim, Eun-Hye;Kim, Ho-Seong;Shin, Sang-Ki;Cho, Si-Man
    • The Korean Journal of Nuclear Medicine Technology
    • /
    • v.13 no.1
    • /
    • pp.67-72
    • /
    • 2009
  • Purpose: ISCD (International Society for Clinical Densitometry) requests that users perform mandatory Precision test to raise their quality even though there is no recommendation about patient selection for the test. Thus, we investigated the effect on precision test by measuring reproducibility of 3 bone density groups (normal, osteopenia, osteoporosis). Materials and Methods: 4 users performed precision test with 420 patients (age: $57.8{\pm}9.02$) for BMD in Asan Medical Center (JAN-2008 ~ JUN-2008). In first group (A), 4 users selected 30 patient respectively regardless of bone density condition and measured 2 part (L-spine, femur) in twice. In second group (B), 4 users measured bone density of 10 patients respectively in the same manner of first group (A) users but dividing patient into 3 stages (normal, osteopenia, osteoporosis). In third group (C), 2 users measured 30 patients respectively in the same manner of first group (A) users considering bone density condition. We used GE Lunar Prodigy Advance (Encore. V11.4) and analyzed the result by comparing %CV to LSC using precision tool from ISCD. Check back was done using SPSS. Results: In group A, the %CV calculated by 4 users (a, b, c, d) were 1.16, 1.01, 1.19, 0.65 g/$cm^2$ in L-spine and 0.69, 0.58, 0.97, 0.47 g/$cm^2$ in femur. In group B, the %CV calculated by 4 users (a, b, c, d) were 1.01, 1.19, 0.83, 1.37 g/$cm^2$ in L-spine and 1.03, 0.54, 0.69, 0.58 g/$cm^2$ in femur. When comparing results (group A, B), we found no considerable differences. In group C, the user_1's %CV of normal, osteopenia and osteoporosis were 1.26, 0.94, 0.94 g/$cm^2$ in L-spine and 0.94, 0.79, 1.01 g/$cm^2$ in femur. And the user_2's %CV were 0.97, 0.83, 0.72 g/$cm^2$ L-spine and 0.65, 0.65, 1.05 g/$cm^2$ in femur. When analyzing the result, we figured out that the difference of reproducibility was almost not found but the differences of two users' several result values have effect on total reproducibility. Conclusions: Precision test is a important factor of bone density follow up. When Machine and user's reproducibility is getting better, it’s useful in clinics because of low range of deviation. Users have to check machine's reproducibility before the test and keep the same mind doing BMD test for patient. In precision test, the difference of measured value is usually found for ROI change caused by patient position. In case of osteoporosis patient, there is difficult to make initial ROI accurately more than normal and osteopenia patient due to lack of bone recognition even though ROI is made automatically by computer software. However, initial ROI is very important and users have to make coherent ROI because we use ROI Copy function in a follow up. In this study, we performed precision test considering bone density condition and found LSC value was stayed within 3%. There was no considerable difference. Thus, patient selection could be done regardless of bone density condition.

  • PDF

Utility of Wide Beam Reconstruction in Whole Body Bone Scan (전신 뼈 검사에서 Wide Beam Reconstruction 기법의 유용성)

  • Kim, Jung-Yul;Kang, Chung-Koo;Park, Min-Soo;Park, Hoon-Hee;Lim, Han-Sang;Kim, Jae-Sam;Lee, Chang-Ho
    • The Korean Journal of Nuclear Medicine Technology
    • /
    • v.14 no.1
    • /
    • pp.83-89
    • /
    • 2010
  • Purpose: The Wide Beam Reconstruction (WBR) algorithms that UltraSPECT, Ltd. (U.S) has provides solutions which improved image resolution by eliminating the effect of the line spread function by collimator and suppression of the noise. It controls the resolution and noise level automatically and yields unsurpassed image quality. The aim of this study is WBR of whole body bone scan in usefulness of clinical application. Materials and Methods: The standard line source and single photon emission computed tomography (SPECT) reconstructed spatial resolution measurements were performed on an INFINA (GE, Milwaukee, WI) gamma camera, equipped with low energy high resolution (LEHR) collimators. The total counts of line source measurements with 200 kcps and 300 kcps. The SPECT phantoms analyzed spatial resolution by the changing matrix size. Also a clinical evaluation study was performed with forty three patients, referred for bone scans. First group altered scan speed with 20 and 30 cm/min and dosage of 740 MBq (20 mCi) of $^{99m}Tc$-HDP administered but second group altered dosage of $^{99m}Tc$-HDP with 740 and 1,110 MBq (20 mCi and 30 mCi) in same scan speed. The acquired data was reconstructed using the typical clinical protocol in use and the WBR protocol. The patient's information was removed and a blind reading was done on each reconstruction method. For each reading, a questionnaire was completed in which the reader was asked to evaluate, on a scale of 1-5 point. Results: The result of planar WBR data improved resolution more than 10%. The Full-Width at Half-Maximum (FWHM) of WBR data improved about 16% (Standard: 8.45, WBR: 7.09). SPECT WBR data improved resolution more than about 50% and evaluate FWHM of WBR data (Standard: 3.52, WBR: 1.65). A clinical evaluation study, there was no statistically significant difference between the two method, which includes improvement of the bone to soft tissue ratio and the image resolution (first group p=0.07, second group p=0.458). Conclusion: The WBR method allows to shorten the acquisition time of bone scans while simultaneously providing improved image quality and to reduce the dosage of radiopharmaceuticals reducing radiation dose. Therefore, the WBR method can be applied to a wide range of clinical applications to provide clinical values as well as image quality.

  • PDF

Quantitative Assessment Technology of Small Animal Myocardial Infarction PET Image Using Gaussian Mixture Model (다중가우시안혼합모델을 이용한 소동물 심근경색 PET 영상의 정량적 평가 기술)

  • Woo, Sang-Keun;Lee, Yong-Jin;Lee, Won-Ho;Kim, Min-Hwan;Park, Ji-Ae;Kim, Jin-Su;Kim, Jong-Guk;Kang, Joo-Hyun;Ji, Young-Hoon;Choi, Chang-Woon;Lim, Sang-Moo;Kim, Kyeong-Min
    • Progress in Medical Physics
    • /
    • v.22 no.1
    • /
    • pp.42-51
    • /
    • 2011
  • Nuclear medicine images (SPECT, PET) were widely used tool for assessment of myocardial viability and perfusion. However it had difficult to define accurate myocardial infarct region. The purpose of this study was to investigate methodological approach for automatic measurement of rat myocardial infarct size using polar map with adaptive threshold. Rat myocardial infarction model was induced by ligation of the left circumflex artery. PET images were obtained after intravenous injection of 37 MBq $^{18}F$-FDG. After 60 min uptake, each animal was scanned for 20 min with ECG gating. PET data were reconstructed using ordered subset expectation maximization (OSEM) 2D. To automatically make the myocardial contour and generate polar map, we used QGS software (Cedars-Sinai Medical Center). The reference infarct size was defined by infarction area percentage of the total left myocardium using TTC staining. We used three threshold methods (predefined threshold, Otsu and Multi Gaussian mixture model; MGMM). Predefined threshold method was commonly used in other studies. We applied threshold value form 10% to 90% in step of 10%. Otsu algorithm calculated threshold with the maximum between class variance. MGMM method estimated the distribution of image intensity using multiple Gaussian mixture models (MGMM2, ${\cdots}$ MGMM5) and calculated adaptive threshold. The infarct size in polar map was calculated as the percentage of lower threshold area in polar map from the total polar map area. The measured infarct size using different threshold methods was evaluated by comparison with reference infarct size. The mean difference between with polar map defect size by predefined thresholds (20%, 30%, and 40%) and reference infarct size were $7.04{\pm}3.44%$, $3.87{\pm}2.09%$ and $2.15{\pm}2.07%$, respectively. Otsu verse reference infarct size was $3.56{\pm}4.16%$. MGMM methods verse reference infarct size was $2.29{\pm}1.94%$. The predefined threshold (30%) showed the smallest mean difference with reference infarct size. However, MGMM was more accurate than predefined threshold in under 10% reference infarct size case (MGMM: 0.006%, predefined threshold: 0.59%). In this study, we was to evaluate myocardial infarct size in polar map using multiple Gaussian mixture model. MGMM method was provide adaptive threshold in each subject and will be a useful for automatic measurement of infarct size.

Study the Analysis of Comparison with AROI and MROI Mode in Gated Cardiac Blood Pool Scan (게이트심장혈액풀 스캔에서 자동 관심영역 설정과 수동 관심영역 설정 모드의 비교 분석에 관한 고찰)

  • Kim, Jung-Yul;Kang, Chun-Koo;Kim, Yung-Jae;Park, Hoon-Hee;Kim, Jae-Sam;Lee, Chang-Ho
    • The Korean Journal of Nuclear Medicine Technology
    • /
    • v.12 no.3
    • /
    • pp.222-228
    • /
    • 2008
  • Purpose: The objectives of this study were to compare the left ventricle ejection fraction (LVEF) from gated cardiac blood pool scan (GCBP) for analysis auto-drawing region of interest mode (AROI) and manual-drawing region of interest mode (MROI), respectively. To evaluation the relationships between values produced by both ROI modes. Materials and Methods: Gated cardiac blood pool scan using in vivo method Tc-99m Red Blood Cell were performed for 33 patients (mean age: $53.2{\pm}13.2\;y$) with objective of chemotherapy using single head gamma camera (ADAC Laboratories, Milpitas, CA). Left ventricular ejection fraction was automatically and manually measured, respectively. Results: There was significant difference statistically between AROI and MROI ($LVEF^{AROI}$: $71.4{\pm}12.4%$ vs. $LVEF^{MROI}$: $65.8{\pm}5.9%$, p=0.003). Intra-observer agreements in AROI was higher than MROI ($\gamma^{AROI}=0.964$, Cronbach's $\alpha^{AROI}=0.986$ vs. $\gamma^{MROI}=0.793$, Cronbach's $\alpha^{MROI}=0.911$), either. Additionally, there was no significant difference statistically at best septal view (${\Delta}LVEF^{BSV}=0.7{\pm}2.3%$, p=0.233), however statistically significant difference was found at badly separated septal view (${\Delta}LVEF=10.9{\pm}11.4%$, p=0.001). Moreover, Intra-observer agreements in best septal view was higher than badly separated septal view ($\gamma^{BSV}=0.939$, Cronbach's $\alpha^{BSV}=0.978$; $\gamma=0.948$, Cronbach's $\alpha=0.981$ at AROI, $\gamma^{BSV}=0.836$, Cronbach's $\alpha^{BSV}=0.936$; $\gamma=0.748$, Cronbach's $\alpha=0.888$ at MROI). Conclusion: When best septal view was acquired, LVEF by AROI and MROI indicated not different. Comparing Intra-observer agreements with AROI and MROI, the AROI tended to show higher. Therefore, it is considered that the AROI than MROI is valuable in reproducibility and objective when ROI analysis by acquire left ventricular of best septal view.

  • PDF

Usefulness of Gated RapidArc Radiation Therapy Patient evaluation and applied with the Amplitude mode (호흡 동조 체적 세기조절 회전 방사선치료의 유용성 평가와 진폭모드를 이용한 환자적용)

  • Kim, Sung Ki;Lim, Hhyun Sil;Kim, Wan Sun
    • The Journal of Korean Society for Radiation Therapy
    • /
    • v.26 no.1
    • /
    • pp.29-35
    • /
    • 2014
  • Purpose : This study has already started commercial Gated RapidArc automation equipment which was not previously in the Gated radiation therapy can be performed simultaneously with the VMAT Gated RapidArc radiation therapy to the accuracy of the analysis to evaluate the usability, Amplitude mode applied to the patient. Materials and Methods : The analysis of the distribution of radiation dose equivalent quality solid water phantom and GafChromic film was used Film QA film analysis program using the Gamma factor (3%, 3 mm). Three-dimensional dose distribution in order to check the accuracy of Matrixx dosimetry equipment and Compass was used for dose analysis program. Periodic breathing synchronized with solid phantom signals Phantom 4D Phantom and Varian RPM was created by breathing synchronized system, free breathing and breath holding at each of the dose distribution was analyzed. In order to apply to four patients from February 2013 to August 2013 with liver cancer targets enough to get a picture of 4DCT respiratory cycle and then patients are pratice to meet patient's breathing cycle phase mode using the patient eye goggles to see the pattern of the respiratory cycle to be able to follow exactly in a while 4DCT images were acquired. Gated RapidArc treatment Amplitude mode in order to create the breathing cycle breathing performed three times, and then at intervals of 40% to 60% 5-6 seconds and breathing exercises that can not stand (Fig. 5), 40% While they are treated 60% in the interval Beam On hold your breath when you press the button in a way that was treated with semi-automatic. Results : Non-respiratory and respiratory rotational intensity modulated radiation therapy technique absolute calculation dose of using computerized treatment plan were shown a difference of less than 1%, the difference between treatment technique was also less than 1%. Gamma (3%, 3 mm) and showed 99% agreement, each organ-specific dose difference were generally greater than 95% agreement. The rotational intensity modulated radiation therapy, respiratory synchronized to the respiratory cycle created Amplitude mode and the actual patient's breathing cycle could be seen that a good agreement. Conclusion : When you are treated Non-respiratory and respiratory method between volumetric intensity modulated radiation therapy rotation of the absolute dose and dose distribution showed a very good agreement. This breathing technique tuning volumetric intensity modulated radiation therapy using a rotary moving along the thoracic or abdominal breathing can be applied to the treatment of tumors is considered. The actual treatment of patients through the goggles of the respiratory cycle to create Amplitude mode Gated RapidArc treatment equipment that does not automatically apply to the results about 5-6 seconds stopped breathing in breathing synchronized rotary volumetric intensity modulated radiation therapy facilitate could see complement.

Intelligent VOC Analyzing System Using Opinion Mining (오피니언 마이닝을 이용한 지능형 VOC 분석시스템)

  • Kim, Yoosin;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.3
    • /
    • pp.113-125
    • /
    • 2013
  • Every company wants to know customer's requirement and makes an effort to meet them. Cause that, communication between customer and company became core competition of business and that important is increasing continuously. There are several strategies to find customer's needs, but VOC (Voice of customer) is one of most powerful communication tools and VOC gathering by several channels as telephone, post, e-mail, website and so on is so meaningful. So, almost company is gathering VOC and operating VOC system. VOC is important not only to business organization but also public organization such as government, education institute, and medical center that should drive up public service quality and customer satisfaction. Accordingly, they make a VOC gathering and analyzing System and then use for making a new product and service, and upgrade. In recent years, innovations in internet and ICT have made diverse channels such as SNS, mobile, website and call-center to collect VOC data. Although a lot of VOC data is collected through diverse channel, the proper utilization is still difficult. It is because the VOC data is made of very emotional contents by voice or text of informal style and the volume of the VOC data are so big. These unstructured big data make a difficult to store and analyze for use by human. So that, the organization need to automatic collecting, storing, classifying and analyzing system for unstructured big VOC data. This study propose an intelligent VOC analyzing system based on opinion mining to classify the unstructured VOC data automatically and determine the polarity as well as the type of VOC. And then, the basis of the VOC opinion analyzing system, called domain-oriented sentiment dictionary is created and corresponding stages are presented in detail. The experiment is conducted with 4,300 VOC data collected from a medical website to measure the effectiveness of the proposed system and utilized them to develop the sensitive data dictionary by determining the special sentiment vocabulary and their polarity value in a medical domain. Through the experiment, it comes out that positive terms such as "칭찬, 친절함, 감사, 무사히, 잘해, 감동, 미소" have high positive opinion value, and negative terms such as "퉁명, 뭡니까, 말하더군요, 무시하는" have strong negative opinion. These terms are in general use and the experiment result seems to be a high probability of opinion polarity. Furthermore, the accuracy of proposed VOC classification model has been compared and the highest classification accuracy of 77.8% is conformed at threshold with -0.50 of opinion classification of VOC. Through the proposed intelligent VOC analyzing system, the real time opinion classification and response priority of VOC can be predicted. Ultimately the positive effectiveness is expected to catch the customer complains at early stage and deal with it quickly with the lower number of staff to operate the VOC system. It can be made available human resource and time of customer service part. Above all, this study is new try to automatic analyzing the unstructured VOC data using opinion mining, and shows that the system could be used as variable to classify the positive or negative polarity of VOC opinion. It is expected to suggest practical framework of the VOC analysis to diverse use and the model can be used as real VOC analyzing system if it is implemented as system. Despite experiment results and expectation, this study has several limits. First of all, the sample data is only collected from a hospital web-site. It means that the sentimental dictionary made by sample data can be lean too much towards on that hospital and web-site. Therefore, next research has to take several channels such as call-center and SNS, and other domain like government, financial company, and education institute.

The Audience Behavior-based Emotion Prediction Model for Personalized Service (고객 맞춤형 서비스를 위한 관객 행동 기반 감정예측모형)

  • Ryoo, Eun Chung;Ahn, Hyunchul;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.2
    • /
    • pp.73-85
    • /
    • 2013
  • Nowadays, in today's information society, the importance of the knowledge service using the information to creative value is getting higher day by day. In addition, depending on the development of IT technology, it is ease to collect and use information. Also, many companies actively use customer information to marketing in a variety of industries. Into the 21st century, companies have been actively using the culture arts to manage corporate image and marketing closely linked to their commercial interests. But, it is difficult that companies attract or maintain consumer's interest through their technology. For that reason, it is trend to perform cultural activities for tool of differentiation over many firms. Many firms used the customer's experience to new marketing strategy in order to effectively respond to competitive market. Accordingly, it is emerging rapidly that the necessity of personalized service to provide a new experience for people based on the personal profile information that contains the characteristics of the individual. Like this, personalized service using customer's individual profile information such as language, symbols, behavior, and emotions is very important today. Through this, we will be able to judge interaction between people and content and to maximize customer's experience and satisfaction. There are various relative works provide customer-centered service. Specially, emotion recognition research is emerging recently. Existing researches experienced emotion recognition using mostly bio-signal. Most of researches are voice and face studies that have great emotional changes. However, there are several difficulties to predict people's emotion caused by limitation of equipment and service environments. So, in this paper, we develop emotion prediction model based on vision-based interface to overcome existing limitations. Emotion recognition research based on people's gesture and posture has been processed by several researchers. This paper developed a model that recognizes people's emotional states through body gesture and posture using difference image method. And we found optimization validation model for four kinds of emotions' prediction. A proposed model purposed to automatically determine and predict 4 human emotions (Sadness, Surprise, Joy, and Disgust). To build up the model, event booth was installed in the KOCCA's lobby and we provided some proper stimulative movie to collect their body gesture and posture as the change of emotions. And then, we extracted body movements using difference image method. And we revised people data to build proposed model through neural network. The proposed model for emotion prediction used 3 type time-frame sets (20 frames, 30 frames, and 40 frames). And then, we adopted the model which has best performance compared with other models.' Before build three kinds of models, the entire 97 data set were divided into three data sets of learning, test, and validation set. The proposed model for emotion prediction was constructed using artificial neural network. In this paper, we used the back-propagation algorithm as a learning method, and set learning rate to 10%, momentum rate to 10%. The sigmoid function was used as the transform function. And we designed a three-layer perceptron neural network with one hidden layer and four output nodes. Based on the test data set, the learning for this research model was stopped when it reaches 50000 after reaching the minimum error in order to explore the point of learning. We finally processed each model's accuracy and found best model to predict each emotions. The result showed prediction accuracy 100% from sadness, and 96% from joy prediction in 20 frames set model. And 88% from surprise, and 98% from disgust in 30 frames set model. The findings of our research are expected to be useful to provide effective algorithm for personalized service in various industries such as advertisement, exhibition, performance, etc.