• Title/Summary/Keyword: Opinion detection

Fusion Approach to Targeted Opinion Detection in Blogosphere (블로고스피어에서 주제에 관한 의견을 찾는 융합적 의견탐지방법)

  • Yang, Kiduk
    • Journal of Korean Library and Information Science Society / v.46 no.1 / pp.321-344 / 2015
  • This paper presents a fusion approach to sentiment detection that combines multiple sources of evidence to retrieve blogs that contain opinions on a specific topic. Our approach to finding opinionated blogs on a topic first applies traditional information retrieval methods to retrieve blogs on the given topic, then boosts the ranks of opinionated blogs based on opinion scores computed by multiple sentiment detection methods. Our sentiment detection strategy, whose central idea is to rely on a variety of complementary evidence rather than to optimize the use of a single source of evidence, comprises five modules: the High Frequency module, which identifies opinions based on the frequency of opinion terms (i.e., terms that occur frequently in opinionated documents); the Low Frequency module, which makes use of uncommon or rare terms (e.g., "sooo good") that express strong sentiments; the IU module, which leverages n-grams with IU (I and you) anchor terms (e.g., "I believe", "You will love"); the Wilson's lexicon module, which uses a collection-independent opinion lexicon constructed from Wilson's subjectivity terms; and the Opinion Acronym module, which uses a small set of opinion acronyms (e.g., "imho"). The results of our study show that combining multiple sources of opinion evidence is an effective method for improving opinion detection performance.
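
The two-stage idea in this abstract (topical retrieval first, then a rank boost from several opinion modules) can be illustrated with a minimal sketch. The abstract does not state the fusion formula, so the weighted sum below, the function name `fuse_and_rerank`, the module keys, and the weights are all assumptions made for illustration.

```python
def fuse_and_rerank(results, weights):
    """Rerank topically retrieved blogs by adding a fused opinion score.

    results: list of dicts with keys 'doc', 'topic_score', and
             'opinion_scores' (module name -> score, assumed to lie in [0, 1]).
    weights: module name -> weight (hypothetical values, not from the paper).
    """
    reranked = []
    for r in results:
        # Assumed fusion rule: weighted sum of the per-module opinion scores.
        opinion = sum(weights.get(m, 0.0) * s
                      for m, s in r["opinion_scores"].items())
        reranked.append((r["doc"], r["topic_score"] + opinion))
    # Highest combined score first.
    return sorted(reranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical scores for two blogs; module names mirror the abstract.
results = [
    {"doc": "blog_a", "topic_score": 1.0,
     "opinion_scores": {"high_freq": 0.0, "iu": 0.0}},
    {"doc": "blog_b", "topic_score": 0.9,
     "opinion_scores": {"high_freq": 0.8, "iu": 0.5}},
]
weights = {"high_freq": 0.2, "iu": 0.2}
ranking = fuse_and_rerank(results, weights)
```

In this toy input the opinionated blog overtakes the slightly more topical but opinion-free one, which is the boosting behavior the abstract describes.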

A Study on the Characteristics of Opinion Retrieval Using Term Statistical Analysis in Opinion Documents (의견 문서의 단어 통계 분석을 통한 의견 검색 특성에 관한 연구)

  • Han, Kyoung-Soo
    • Journal of the Korea Society of Computer and Information / v.15 no.11 / pp.21-29 / 2010
  • Opinion retrieval, which searches for the opinions users express in documents, does not yet significantly outperform traditional topical retrieval, which searches for facts. This paper therefore aims to identify statistical characteristics that can be applied to opinion retrieval by comparing and analyzing the term statistics of opinion and non-opinion documents in the blog domain. The TREC Blogs06 collection and 150 TREC topics are used in the experiments. The difference between term probability distributions in opinion documents is measured by JS divergence, and the differences across topic types and topic domains are also investigated. Moreover, the term probabilities of opinion terms are analyzed comparatively. The main findings are as follows: topic-specific characteristics need to be considered for opinion detection; extracting positive and negative opinion terms per topic is effective; topic types are complementary to topic domains; and special attention has to be given to the usage of positive opinion terms.
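
As a rough illustration of the measurement mentioned above, the following sketch computes the Jensen-Shannon (JS) divergence between two term probability distributions. The maximum-likelihood estimate without smoothing and the base-2 logarithm are assumptions; the abstract does not say which estimator or log base the study used.

```python
import math
from collections import Counter


def term_distribution(tokens):
    """Maximum-likelihood term probabilities for a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def kl_divergence(p, q):
    """KL(p || q) in bits; q must be nonzero wherever p is nonzero."""
    return sum(pv * math.log2(pv / q[t]) for t, pv in p.items() if pv > 0)


def js_divergence(p, q):
    """JS divergence: mean KL of p and q to their midpoint distribution m."""
    vocab = set(p) | set(q)
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in vocab}
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Toy documents (hypothetical tokens, not from the TREC Blogs06 collection).
opinion = term_distribution("i love this phone i really love it".split())
factual = term_distribution("the phone has a five inch screen".split())
divergence = js_divergence(opinion, factual)
```

With base-2 logarithms the value lies between 0 (identical distributions) and 1 (no shared terms), which makes scores comparable across topics.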

Opinion Bias Detection Based on Social Opinions for Twitter

  • Kwon, A-Rong;Lee, Kyung-Soon
    • Journal of Information Processing Systems / v.9 no.4 / pp.538-547 / 2013
  • In this paper, we propose a bias detection method based on personal and social opinions that express contrasting views on competing topics on Twitter. Unsupervised polarity classification is conducted to learn social opinions on targets. The $tf \cdot idf$ algorithm is applied to extract targets that reflect the sentiments and features of tweets. Our method addresses the lack of a sentiment lexicon when learning social opinions. To evaluate the effectiveness of our method, experiments were conducted on four issues using a Twitter test collection. The proposed method achieved significant improvements over the baselines.
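
For readers unfamiliar with the weighting scheme named above, here is a minimal sketch of one common $tf \cdot idf$ variant (relative term frequency times the natural log of inverse document frequency). The abstract does not specify which variant the authors used, and the tokenized tweets are invented for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: tf*idf} dict per tokenized document.

    tf  = term count / document length
    idf = ln(N / document frequency)   (one common variant; an assumption here)
    """
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        scores.append({t: (c / len(d)) * math.log(n / df[t])
                       for t, c in tf.items()})
    return scores


# Hypothetical tokenized tweets about two competing products.
tweets = [["iphone", "great"], ["galaxy", "great"]]
scores = tf_idf(tweets)
```

Terms that appear in every tweet score zero, so the product names rather than the shared sentiment word stand out as candidate targets.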

CRPN (Customer-oriented Risk Priority Number): RPN Evaluation Method Based on Customer Opinion through SNS Opinion Mining (CRPN(Customer-oriented Risk Priority Number): SNS 오피니언 마이닝을 활용한 고객 의견 기반의 RPN 평가 기법)

  • Yoo, In-Hyeok;Kang, Won-Kyung;Choi, Kyu-Nam;Park, Ji-Yun;Lee, Geon-Ju;Kang, Sung-Woo
    • Journal of Korean Society for Quality Management / v.47 no.1 / pp.97-108 / 2019
  • Purpose: This study proposes a new Risk Priority Number (RPN) evaluation method that analyzes the value of product functions by mining customer opinions on Social Network Services (SNS). Methods: A traditional RPN is measured by three evaluation standards (Severity, Occurrence, Detection) that are assessed by manufacturing engineers and researchers. In this research, by contrast, these standards are assessed from the customers' viewpoint through SNS opinion mining. To extract customer feedback from textual data sets, the methodology employs natural language processing, collecting product-related data sets and analyzing the opinions automatically. The emotional polarity of an opinion indicates severity, the number of negative opinions indicates occurrence, and the total number of customer opinions indicates detection. Results: The CRPN evaluation confirms that features evaluated as risky are highly likely to be improved in the next product series. Therefore, CRPN is an effective risk assessment model that reflects customer feedback. Conclusion: Reflecting customer feedback is useful for product risk assessment as well as for developing new products and improving existing ones.
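
The mapping described in the Methods can be sketched as follows. A conventional RPN is the product Severity × Occurrence × Detection, each rated on a 1-10 scale; how the paper converts opinion statistics into those ratings is not given in the abstract, so the linear scaling in `rating`, the polarity range of [-1, 1], and the feature names are assumptions.

```python
def rating(value, max_value):
    """Map value in [0, max_value] linearly onto a 1-10 scale (assumed scaling)."""
    if max_value <= 0:
        return 1.0
    return 1.0 + 9.0 * value / max_value


def crpn(polarities_by_feature):
    """Compute a customer-oriented RPN per product feature.

    polarities_by_feature: feature -> list of opinion polarities in [-1, 1].
    Severity   <- mean strength of the negative opinions
    Occurrence <- number of negative opinions
    Detection  <- total number of opinions
    """
    neg_counts = {f: sum(1 for p in ps if p < 0)
                  for f, ps in polarities_by_feature.items()}
    totals = {f: len(ps) for f, ps in polarities_by_feature.items()}
    max_neg = max(neg_counts.values(), default=0)
    max_total = max(totals.values(), default=0)

    scores = {}
    for f, ps in polarities_by_feature.items():
        negatives = [-p for p in ps if p < 0]
        mean_negativity = sum(negatives) / len(negatives) if negatives else 0.0
        severity = rating(mean_negativity, 1.0)
        occurrence = rating(neg_counts[f], max_neg)
        detection = rating(totals[f], max_total)
        scores[f] = severity * occurrence * detection
    return scores


# Hypothetical polarity scores mined from SNS posts.
scores = crpn({"battery": [-0.9, -0.8, 0.2], "camera": [0.5, 0.6]})
```

Features with many strongly negative opinions receive the highest score, which matches the abstract's claim that risky features are the ones most likely to be improved in the next series.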

Research on the Financial Data Fraud Detection of Chinese Listed Enterprises by Integrating Audit Opinions

  • Leiruo Zhou;Yunlong Duan;Wei Wei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.12 / pp.3218-3241 / 2023
  • Financial fraud undermines the sustainable development of financial markets. Financial statements can be regarded as the key source of information on the operating conditions of listed companies. Current research focuses more on mining numerical financial data than on text data. However, text data can reveal emotional information, which is an important basis for detecting financial fraud. In particular, the audit opinion on a financial statement is the fair opinion of a certified public accountant on the quality of an enterprise's financial reports. This research therefore uses the data features of the annual financial reports and audit opinion texts of 4,153 listed companies over the past six years, and puts forward a financial fraud detection model that integrates audit opinions. First, a financial data index database and an audit opinion text database were built. Second, the audit opinions were digitized with the deep learning BERT model. Finally, both the numerical features extracted from the audit opinions and the numerical financial indicators were used as training data for a LightGBM model. The imbalanced distribution of sample labels is also a focus of financial fraud research. To address this problem, data enhancement and the Focal Loss function were used in data processing and model training, respectively. The experimental results show that, compared with a conventional financial fraud detection model, the performance of the proposed model is greatly improved, with Area Under the Curve (AUC) and Accuracy reaching 81.42% and 78.15%, respectively.
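
To make the class-imbalance remedy concrete, the sketch below implements the standard binary focal loss, $FL(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$. The abstract does not report the hyperparameters or how the loss was wired into LightGBM, so `gamma=2.0` and `alpha=0.25` are commonly used defaults, not values from the paper.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one example.

    p: predicted probability of the positive (fraud) class.
    y: true label, 1 for fraud and 0 for non-fraud.
    gamma, alpha: assumed defaults, not taken from the paper.
    """
    # Clip to avoid log(0).
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    # p_t is the probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.95, 1)   # confident, correct prediction
hard = focal_loss(0.30, 1)   # fraud case the model mostly misses
```

With `gamma=0` the expression reduces to class-weighted cross-entropy; larger `gamma` shifts the training signal toward hard, typically minority-class, examples.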

A Crowdsourcing-Based Paraphrased Opinion Spam Dataset and Its Implication on Detection Performance (크라우드소싱 기반 문장재구성 방법을 통한 의견 스팸 데이터셋 구축 및 평가)

  • Lee, Seongwoon;Kim, Seongsoon;Park, Donghyeon;Kang, Jaewoo
    • KIISE Transactions on Computing Practices / v.22 no.7 / pp.338-343 / 2016
  • Today, opinion reviews on the Web are often used as a means of information exchange. As the importance of opinion reviews continues to grow, the number of issues with opinion spam also increases. Even though many studies on detecting spam reviews have been conducted, limitations of the gold-standard datasets hinder research. Therefore, we introduce a new dataset called "Paraphrased Opinion Spam (POS)" that contains a new type of review spam that imitates truthful reviews. We have noticed that spammers refer to existing truthful reviews to fabricate spam reviews. To create such a seemingly truthful review spam dataset, we asked task participants to paraphrase truthful reviews into new deceptive reviews. The experimental results show that classifying our POS dataset is more difficult than classifying existing spam datasets, since the reviews in our dataset look linguistically more like truthful reviews. Training volume was also found to be an important factor in classification model performance.

Outlier Detection Techniques for Biased Opinion Discovery (편향된 의견 문서 검출을 위한 이상치 탐지 기법)

  • Yeon, Jongheum;Shim, Junho;Lee, Sanggoo
    • The Journal of Society for e-Business Studies / v.18 no.4 / pp.315-326 / 2013
  • Users of social media post various types of opinions, such as product reviews and movie reviews. Customers commonly draw on these opinions when making decisions. However, as opinion usage grows, distorted feedback has also increased. For example, exaggerated positive opinions are posted to promote target products, as are negative opinions that are far from common evaluations. Finding these biased opinions is important for keeping social media reliable. Techniques of opinion mining (or sentiment analysis) have been developed to determine the sentiment polarity of opinionated documents, and they can be used to find biased opinions. However, the previous techniques have some drawbacks: they categorize text only as positive or negative, and they need a large amount of training data to build the classifier. In this paper, we propose methods for discovering biased opinions that are skewed from the overall common opinions. The methods are based on angle-based outlier detection and personalized PageRank, and can be applied without training data. We analyze the performance of the proposed techniques by presenting experimental results on a movie review dataset.
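
A simplified illustration of angle-based outlier detection, one of the two techniques named above: for each point, compute the variance of distance-weighted angles to all pairs of other points; a low variance suggests an outlier, because the rest of the data lies in roughly one direction. How the paper represents reviews as vectors is not stated in the abstract, so the 2-D points below are purely hypothetical, and this unweighted-variance form is a common simplification rather than the authors' exact formulation.

```python
from itertools import combinations


def abof(points, i):
    """Angle-based outlier factor of points[i] (lower means more outlying)."""
    a = points[i]
    others = [p for j, p in enumerate(points) if j != i]
    values = []
    for b, c in combinations(others, 2):
        ab = [x - y for x, y in zip(b, a)]
        ac = [x - y for x, y in zip(c, a)]
        norm_ab = sum(x * x for x in ab)  # squared length of AB
        norm_ac = sum(x * x for x in ac)  # squared length of AC
        if norm_ab == 0 or norm_ac == 0:
            continue  # skip duplicates of the point itself
        dot = sum(x * y for x, y in zip(ab, ac))
        # Angle term weighted by the inverse squared distances.
        values.append(dot / (norm_ab * norm_ac))
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Hypothetical review vectors: a tight cluster plus one far-away point.
reviews = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
factors = [abof(reviews, i) for i in range(len(reviews))]
most_biased = min(range(len(reviews)), key=lambda i: factors[i])
```

No labeled data is needed, which is the property the abstract highlights; the cost is cubic in the number of points for this naive version.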

A Nonuniform Sampling Technique and Its Application to Speech Coding (비균등 표본화 기법과 음성 부호화로의 응용)

  • Iem, Byeong-Gwan
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.1 / pp.28-32 / 2014
  • For a signal such as speech, which shows a piecewise linear shape over very short time periods, a nonuniform sampling method based on inflection point detection (IPD) is proposed to reduce the data rate. The method exploits the geometrical characteristics of the signal further than the existing sampling method based on local maxima/minima detection (MMD). As a result, the signal reconstructed by interpolating the IPD-based samples resembles the original speech more closely. Computer simulation shows that the proposed IPD-based method produces an improvement of about 9 to 23 dB over the existing MMD method. To show the usefulness of the IPD technique, it is applied to speech coding and compared with continuously variable slope delta modulation (CVSD). Each nonuniformly sampled value is binary coded and sent with a one-bit flag set to 1. Non-inflection samples are not sent; only a flag bit set to 0 is sent for each. The method shows improvements of 0.3 to 9 dB in SNR and 0.5 to 1.3 in mean opinion score (MOS) over CVSD.
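
The abstract does not define how inflection points are detected, so the following is only one plausible reading: keep a sample when the sign of the discrete second difference changes, always keep the two endpoints, and rebuild the signal by linear interpolation. The function names and the test signal are hypothetical.

```python
def inflection_indices(x):
    """Indices kept by a simple inflection-point detector (assumed definition)."""
    if not x:
        return []
    keep = [0]
    prev_sign = 0
    for n in range(1, len(x) - 1):
        d2 = x[n + 1] - 2 * x[n] + x[n - 1]  # discrete second difference
        sign = (d2 > 0) - (d2 < 0)
        # Keep the first sample whose curvature sign differs from the
        # previous nonzero curvature sign.
        if sign != 0 and prev_sign != 0 and sign != prev_sign:
            keep.append(n)
        if sign != 0:
            prev_sign = sign
    if len(x) > 1:
        keep.append(len(x) - 1)
    return keep


def reconstruct(x, keep):
    """Linearly interpolate between the kept samples."""
    y = [0.0] * len(x)
    for a, b in zip(keep, keep[1:]):
        for n in range(a, b + 1):
            y[n] = x[a] + (x[b] - x[a]) * (n - a) / (b - a)
    return y


signal = [0, 1, 4, 9, 14, 17, 18]   # convex, then concave
kept = inflection_indices(signal)
approx = reconstruct(signal, kept)
```

Only the kept samples would be coded and transmitted, with one flag bit per original sample position marking whether a value follows, as the abstract describes.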

End-to-end Transmission Performance of VoIP Traffics based on Mobility Pattern over MANET with IDS (IDS가 있는 MANET에서 이동패턴에 기반한 VoIP 트래픽의 종단간 전송성능)

  • Kim, Young-Dong
    • The Journal of the Korea institute of electronic communication sciences / v.9 no.7 / pp.773-778 / 2014
  • An IDS (Intrusion Detection System) can be used as a countermeasure against blackhole attacks, which degrade transmission performance through malicious intrusion into the routing function of networks. In this paper, the effect of IDS on transmission performance under different mobility patterns is analyzed for MANET (Mobile Ad-hoc Networks), and a suggestion for an effective countermeasure is considered. Computer simulation based on NS-2 is used for the performance analysis, and VoIP (Voice over Internet Protocol) is chosen as the application service for performance measurement. MOS (Mean Opinion Score), call connection ratio, and end-to-end delay are used as performance parameters.

Use of laser fluorescence device 'DIAGNOdent®' for detecting caries (레이저 우식진단기기 'DIAGNOdent®'의 활용)

  • Lee, Byoung-Jin
    • The Journal of the Korean dental association / v.49 no.8 / pp.461-471 / 2011
  • The detection of carious lesions is key to applying appropriate preventive measures or operative treatment for dental caries. The laser fluorescence device DIAGNOdent® (KaVo, Biberach, Germany) has been shown to be of additional clinical value in the detection of initial caries. This report focuses on DIAGNOdent® for caries detection. DIAGNOdent® emits visible red light at a wavelength of 655 nm to elicit near-infrared fluorescence from carious lesions. The device is known as a reproducible method for caries detection, with good sensitivity and specificity, especially on occlusal and accessible smooth surfaces. DIAGNOdent® tended to be more sensitive in detecting occlusal dentinal caries, but it showed more false-positive diagnoses than visual inspection. Clinicians should therefore not use the device as their primary diagnostic method; it is recommended as a second opinion in the decision-making process for caries diagnosis in cases of doubt after visual inspection. Modern dentistry is trending toward a preventive approach rather than invasive treatment of the disease. This is possible only with early detection and corresponding preventive measures, and DIAGNOdent® can support this change.