Search | Korea Science

Development of Sentiment Analysis Model for the hot topic detection of online stock forums (온라인 주식 포럼의 핫토픽 탐지를 위한 감성분석 모형의 개발)

Hong, Taeho;Lee, Taewon;Li, Jingjing
- Journal of Intelligence and Information Systems / v.22 no.1 / pp.187-204 / 2016
Document classification based on emotional polarity has become a welcome emerging task owing to the great explosion of data on the Web. In the big data age, there are too many information sources to refer to when making decisions. For example, when considering travel to a city, a person may search reviews from a search engine such as Google or social networking services (SNSs) such as blogs, Twitter, and Facebook. The emotional polarity of positive and negative reviews helps a user decide whether or not to make a trip. Sentiment analysis of customer reviews has become an important research topic as data mining technology is widely accepted for text mining of the Web. Sentiment analysis has been used to classify documents through machine learning techniques, such as the decision tree, neural networks, and support vector machines (SVMs). It is used to determine the attitude, position, and sensibility of people who write articles about various topics that are published on the Web. Regardless of the polarity of customer reviews, emotional reviews are very helpful materials for analyzing the opinions of customers. Sentiment analysis helps with understanding what customers really want instantly through automated text mining techniques. Sentiment analysis applies text mining techniques to text on the Web to extract the subjective information it contains. It is utilized to determine the attitudes or positions of the person who wrote an article and presented an opinion about a particular topic. In this study, we developed a model that selects a hot topic from user posts at China's online stock forum by using the k-means algorithm and self-organizing map (SOM). In addition, we developed a detecting model to predict a hot topic by using machine learning techniques such as logit, the decision tree, and SVM.
We employed sentiment analysis to develop our model for the selection and detection of hot topics from China's online stock forum. The sentiment analysis calculates a sentiment value for a document by comparing and classifying its words against a sentiment polarity dictionary (positive or negative). The online stock forum was an attractive site because of its information about stock investment. Users post numerous texts about stock movement by analyzing the market according to government policy announcements, market reports, reports from research institutes on the economy, and even rumors. We divided the online forum's topics into 21 categories to utilize sentiment analysis. One hundred forty-four topics were selected among the 21 categories at online forums about stock. The posts were crawled to build a positive and negative text database. We ultimately obtained 21,141 posts on 88 topics by preprocessing the text from March 2013 to February 2015. The interest index was defined to select the hot topics, and the k-means algorithm and SOM presented equivalent results with this data. We developed a decision tree model to detect hot topics with three algorithms: CHAID, CART, and C4.5. The results of CHAID were subpar compared to the others. We also employed SVM to detect the hot topics from negative data. The SVM models were trained with the radial basis function (RBF) kernel, with parameters chosen by a grid search, to detect the hot topics. The detection of hot topics by using sentiment analysis provides the latest trends and hot topics in the stock forum for investors so that they no longer need to search the vast amounts of information on the Web. Our proposed model is also helpful for rapidly determining customers' signals or attitudes towards government policy and firms' products and services.
https://doi.org/10.13088/jiis.2016.22.1.187
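
The dictionary-based scoring described in the abstract above can be illustrated with a minimal sketch. The word lists, the function name `sentiment_score`, and the normalization `(pos - neg) / (pos + neg)` are assumptions made for illustration only; the abstract does not give the study's dictionary or its exact formula.

```python
from collections import Counter

# Hypothetical miniature polarity dictionary (not the one used in the study).
POSITIVE = {"rise", "gain", "profit", "bullish"}
NEGATIVE = {"fall", "loss", "crash", "bearish"}


def sentiment_score(text: str) -> float:
    """Return (pos - neg) / (pos + neg), a value in [-1, 1].

    Returns 0.0 when the text contains no dictionary word.
    """
    # Naive whitespace tokenization with punctuation stripped from token edges.
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    counts = Counter(tokens)
    pos = sum(counts[w] for w in POSITIVE)
    neg = sum(counts[w] for w in NEGATIVE)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


if __name__ == "__main__":
    for post in ["Profit and gain, but one loss.", "Crash! Bearish all week."]:
        print(post, "->", sentiment_score(post))
```

A score per post like this could then feed a classifier or a clustering step; how the study combined scores into its interest index is not stated in the abstract.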

Quantitative Assessment of Myocardial Tissue Velocity in Normal Children with Doppler Tissue Imaging : Reference Values, Growth and Heart Rate Related Change (소아에서 도플러 조직영상을 이용한 최대 심근 속도의 계측 : 정상 추정치 및 성장 및 심박동수에 따른 변화)

Kim, Se Young;Hyun, Myung Chul;Lee, Sang Bum
- Clinical and Experimental Pediatrics / v.48 no.8 / pp.846-856 / 2005
Purpose : To measure the peak myocardial tissue velocities and patterns of longitudinal motion of the atrioventricular (AV) annuli and to assess body weight- and heart rate-related changes in normal children. Methods : Using pulsed wave Tissue Doppler Imaging (TDI), we measured peak systolic, early diastolic, and late diastolic myocardial velocities in 72 normal children at six different sites in the apical 4-chamber (A4C) view and at four different sites in the apical 2-chamber (A2C) view, compared those values with each other, and observed the effects of body weight and heart rate. Longitudinal motions of the AV annuli were measured at three different sites in A4C. Results : There were no significant differences in the TDI parameters between genders, between echocardiography machines, or among the three doctors performing TDI. Peak myocardial velocities were significantly higher at the base of the heart than in the mid-ventricular region, and higher in the right lateral ventricular wall than in the left lateral ventricular wall or IVS. The TDI parameters showed no significant correlation with fractional shortening (%). Peak systolic and early diastolic myocardial velocities had no correlation with heart rate, but peak late diastolic velocities and the A/E ratio correlated positively with heart rate. Correlations between the TDI parameters and body weight were inconsistent. Absolute longitudinal displacement and % displacement did not differ between genders and were not correlated with the TDI parameters. Conclusion : We measured the peak myocardial velocities with TDI and the longitudinal motion of the AV annuli using M-mode echocardiography in normal children. With larger-scale evaluation, we may establish reference values in normal children and broaden clinical applicability in congenital and acquired heart diseases.

Current Status and Future Perspective of PET (PET 이용 현황 및 전망)

Lee, Myung-Chul
- The Korean Journal of Nuclear Medicine / v.36 no.1 / pp.1-7 / 2002
Positron Emission Tomography (PET) is a nuclear medicine imaging modality that consists of systemic administration to a subject of a radiopharmaceutical labeled with a positron-emitting radionuclide. Following administration, its distribution in the organ or structure under study can be assessed as a function of time and space by (1) detecting the annihilation radiation resulting from the interaction of the positrons with matter, and (2) reconstructing the distribution of the radioactivity from a series of measurements, as is done in computed tomography (CT). The nuclides generally exhibit chemical properties that render them particularly desirable in physiological studies. The radionuclides most widely used in PET are F-18, C-11, O-15 and N-13. Regarding the number of current PET Centers worldwide (based on ICP data), more than 300 PET Centers were in operation in 2000. The use of PET technology grew rapidly compared to that in 1992 and 1996, particularly in the USA, which demonstrates a three-fold rise in PET installations. In 2001, 194 PET Centers were operating in the USA. In 1994, two clinical and research-oriented PET Centers, at Seoul National University Hospital and Samsung Medical Center, were established with the first dedicated PET and cyclotron machines in Korea, followed by two more PET facilities at the Korea Cancer Center Hospital, Ajou Medical Center, Yonsei University Medical Center, National Cancer Center and established their PET Center. Catholic Medical School and Pusan National University Hospital have finalized a plan to install PET machines in 2002, which will result in a total of nine PET Centers in Korea. Considering annual trends of PET application in four major PET centers in Korea in Asan Medical Center over the recent six years (from 1995 to 2000), a total of 11,564 patients have been studied every year and the number of PET studies has shown steep growth year upon year. We had 1,020 PET patients in 1995.
This number increased to 1,196, 1,756, 2,379, 3,015 and 4,414 in 1996, 1997, 1998, 1999 and 2000, respectively. The application in cardiac disorders is minimal, and among various neuropsychiatric diseases, patients with epilepsy or dementia can benefit from PET studies. Recently, we investigated brain mapping and neuroreceptor works. PET is not a key application for evaluation of cardiac patients in Korea because of the relatively low incidence of cardiac disease and because less costly procedures such as SPECT can now be performed. The changes in the application of PET studies indicate that, initially, brain PET occupied almost 60% in 1995, followed by a gradual decrease in brain application. However, overall PET use in the diagnosis and management of patients with cancer was up to 63% in 2000. The current Medicare coverage policy in the USA is very important because reimbursement policy is critical for the promotion of PET. In May 1995, the Health Care Financing Administration (HCFA) began covering the PET perfusion study using Rubidium-82, evaluation of a solitary pulmonary nodule and pathologically proven non-small cell lung cancer. As of July 1999, Medicare's coverage policy expanded to include additional indications: evaluation of recurrent colorectal cancer with a rising CEA level, staging of lymphoma and detection of recurrent or metastatic melanoma. In December of 2001, National Coverage decided to expand Medicare reimbursement for broad use in 6 cancers: lung, colorectal, lymphoma, melanoma, head and neck, and esophageal cancers; for determining revascularization in heart diseases; and for identifying epilepsy patients. In addition, PET coverage is expected to further expand to diseases affecting women, such as breast, ovarian, uterine and vaginal cancers as well as diseases like prostate cancer and Alzheimer's disease.

Comparison and evaluation of treatment plans using Abdominal compression and Continuous Positive Air Pressure for lung cancer SABR (폐암의 SABR(Stereotactic Ablative Radiotherapy)시 복부압박(Abdominal compression)과 CPAP(Continuous Positive Air Pressure)를 이용한 치료계획의 비교 및 평가)

Kim, Dae Ho;Son, Sang Jun;Mun, Jun Ki;Park, Jang Pil;Lee, Je Hee
- The Journal of Korean Society for Radiation Therapy / v.33 / pp.35-46 / 2021
Purpose : By comparing and analyzing treatment plans using abdominal compression and Continuous Positive Air Pressure (CPAP) during SABR of lung cancer, we aim to contribute to improving the effect of radiotherapy. Materials & Methods : For two lung SABR patients (patients A and B), we developed SABR plans using an abdominal compression device (the Body Pro-Lok, BPL) and CPAP, and analyzed the treatment plans through homogeneity, conformity, and the parameters proposed in RTOG 0813. Furthermore, for each phase, the X, Y, and Z axis movements centered on the PTV were analyzed in all 4D CTs and compared by obtaining the volume and average dose of the PTV and OAR. Four cone beam computed tomography (CBCT) scans were used to measure the distances from the center of the PTV to the intrathoracic contacts in three directions out of 0°, 90°, 180° and 270°, and to compare the differences from the average distance values in each direction. Result : Both treatment plans obtained using BPL and CPAP followed recommendations from RTOG, and there was no significant difference in homogeneity and conformity. The X-axis, Y-axis, and Z-axis movements centered on the PTV in patient A were 0.49 cm, 0.37 cm, and 1.66 cm with BPL and 0.16 cm, 0.12 cm, and 0.19 cm with CPAP; in patient B they were 0.22 cm, 0.18 cm, and 1.03 cm with BPL and 0.14 cm, 0.11 cm, and 0.4 cm with CPAP. In patient A, when using CPAP compared to BPL, ITV decreased by 46.27%, left lung volume increased by 41.94%, and average dose decreased by 52.81% in the heart. In patient B, volume increased by 106.89% in the left lung and 87.32% in the right lung, with the average dose decreased by 44.30% in the stomach. For patient A, the maximum difference between the straight distance value and the mean distance value in each direction was 0.05 cm in the a-direction, 0.05 cm in the b-direction, and 0.41 cm in the c-direction. In patient B, there was a difference of 0.19 cm in the d-direction, 0.49 cm in the e-direction, and 0.06 cm in the f-direction.
Conclusion : We confirm that increased lung volume with CPAP can reduce doses to OAR near the target more effectively than BPL, and can also contribute more effectively to restricting tumor movement with respiration. Radiation therapy effects may be improved by applying CPAP to various sites and by combining CPAP with other treatment machines.

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

Ahn, Hyunchul
- Information Systems Review / v.16 no.3 / pp.161-177 / 2014
Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive since domain experts should be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural network (ANN), and multiclass support vector machine (MSVM) have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model, and apply it to corporate credit rating prediction in order to enhance the accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies like Lorena and de Carvalho (2008), and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results from studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest the genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate the biological evolution phenomenon. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results.
In particular, the mutation operator prevents GA from falling into local optima, so we can find the globally optimal or near-optimal solution using it. GA has popularly been applied to search for optimal parameters or feature subset selections of AI techniques including MSVM. For these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, and their credit ratings. Using various statistical methods including one-way ANOVA and stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e. credit rating, was labeled as four classes: 1(A1); 2(A2); 3(A3); 4(B and C). Eighty percent of the total data for each class was used for training, and the remaining 20 percent was used for validation. To overcome the small sample size, we applied five-fold cross validation to our dataset. In order to examine the competitiveness of the proposed model, we also experimented with several comparative models including MDA, MLOGIT, CBR, ANN and MSVM. In the case of MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source software package, and Evolver 5.5, a commercial software package that enables GA. Other comparative models were tested using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the competitive models.
In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, namely X7 (total debt), X9 (sales per employee), X13 (years after founded), X15 (accumulated earning to total asset), and X39 (the index related to the cash flows from operating activity), were found to be the most important factors in predicting corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same among the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than that of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
https://doi.org/10.14329/isr.2014.16.3.161
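
The McNemar test mentioned in the abstract above compares two classifiers on the same validation cases by looking only at the cases where they disagree. The sketch below uses the exact binomial form of the test, which is an assumption: the abstract does not say which variant the authors used, and the counts in the usage example are hypothetical, not results from the paper.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: cases that model A classified correctly and model B did not
    c: cases that model B classified correctly and model A did not
    Under the null hypothesis, each discordant case is equally likely
    to fall in b or c, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant cases, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical discordant counts for two models on one validation set.
    p = mcnemar_exact(b=25, c=9)
    print(f"p-value = {p:.4f}")
```

A p-value below 0.05 or 0.01 would correspond to the 5% and 1% significance levels quoted in the abstract.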

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
- Journal of Intelligence and Information Systems / v.20 no.2 / pp.109-122 / 2014
People are nowadays creating a tremendous amount of data on Social Network Services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive amounts of data generation, thereby greatly influencing society. This is an unmatched phenomenon in history, and now we live in the Age of Big Data. SNS data qualifies as Big Data because it satisfies the conditions on the amount of data (volume), data input and output speeds (velocity), and the variety of data types (variety). If someone intends to discover the trend of an issue in SNS Big Data, this information can be used as an important new source for the creation of new value because it covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and established to meet the needs of analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) Provide the topic keyword set that corresponds to daily ranking; (2) Visualize the daily time series graph of a topic for the duration of a month; (3) Provide the importance of a topic through a treemap based on the score system and frequency; (4) Visualize the daily time-series graph of keywords by searching the keyword. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including the removal of stop words and noun extraction, for processing various unrefined forms of unstructured data. In addition, such analysis requires the latest big data technology to rapidly process a large amount of real-time data, such as the Hadoop distributed system or NoSQL, which is an alternative to relational databases. We built TITS based on Hadoop to optimize the processing of big data because Hadoop is designed to scale up from single node computing to thousands of machines.
Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no schemas or tables, and its most important goals are data accessibility and data processing performance. In the Age of Big Data, the visualization of Big Data is attractive to the Big Data community because it helps analysts to examine such data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed for creating Data Driven Documents that bind the document object model (DOM) to any data; the interaction between data is easy and useful for managing real-time data streams with smooth animation. In addition, TITS uses Bootstrap, which consists of pre-configured plug-in style sheets and JavaScript libraries, to build a web system. The TITS Graphical User Interface (GUI) is designed using these libraries, and it is capable of detecting issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system, and make the system available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.
https://doi.org/10.13088/jiis.2014.20.2.109
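
The preprocessing and daily ranking steps described in the abstract above (stop-word removal, then a keyword set ranked per day) can be sketched roughly as follows. This is a simplification under stated assumptions: plain keyword frequency stands in for the topic modeling the authors actually apply, and the stop-word list, the function name `daily_keyword_ranking`, and the sample tweets are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real system would use a much larger one.
STOP_WORDS = {"the", "a", "is", "in", "on", "of", "and", "to"}


def daily_keyword_ranking(tweets, top_n=3):
    """Rank keywords per day by raw frequency.

    tweets: iterable of (date_string, text) pairs.
    Returns {date_string: [(keyword, count), ...]} with at most top_n entries.
    """
    per_day = defaultdict(Counter)
    for day, text in tweets:
        # Naive whitespace tokenization; strip common punctuation and symbols.
        tokens = (t.strip(".,!?#@").lower() for t in text.split())
        per_day[day].update(t for t in tokens if t and t not in STOP_WORDS)
    return {day: counts.most_common(top_n) for day, counts in per_day.items()}


if __name__ == "__main__":
    sample = [
        ("2013-03-01", "The election is on"),
        ("2013-03-01", "Election results to follow"),
        ("2013-03-02", "Rain in Seoul"),
    ]
    for day, ranking in daily_keyword_ranking(sample).items():
        print(day, ranking)
```

Per-day counts of this shape are also what a daily time-series graph of a keyword would plot.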
