• Title/Summary/Keyword: Entropy score

Evaluation of Classification Algorithm Performance of Sentiment Analysis Using Entropy Score (엔트로피 점수를 이용한 감성분석 분류알고리즘의 수행도 평가)

  • Park, Man-Hee
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.9 / pp.1153-1158 / 2018
  • Among the many available information sources, online customer evaluations and social media information are critical for businesses because they influence customers' decision making. Surveys that identify customers' varied needs and complaints are limited by time and cost. Customer review data from online shopping malls are an ideal source for analyzing customer sentiment about products. In this study, we collected product review data on Samsung and Apple smartphones from Amazon. We applied five classification algorithms that previous studies have used as representative sentiment analysis techniques: support vector machines (SVMs), bagging, random forest, classification and regression tree (CART), and maximum entropy. We propose an entropy score that comprehensively evaluates the performance of a classification algorithm. When the five algorithms were evaluated with the entropy score, the SVM algorithm ranked highest.

Entropy-based Similarity Measures for Memory-based Collaborative Filtering

  • Kwon, Hyeong-Joon;Latchman, Haniph
    • International Journal of Internet, Broadcasting and Communication / v.5 no.2 / pp.5-10 / 2013
  • We propose a novel similarity measure based on weighted difference entropy (WDE) to improve the performance of collaborative filtering (CF) systems. The proposed metric computes the entropy of the preference score differences between the items that two users have both rated, and normalizes it using a Gaussian, tanh, or sigmoid function. The experimental results showed significant improvement across the tested environments. The experiments varied the number of nearest neighbors, and we present results for two data sets with different characteristics as well as results for recommendation quality.

A New Statistical Index for Detecting Cheaters on Multiple Choice Tests (다중선택 시험에서 부정행위자 발견을 위한 새로운 통계적 측도)

  • Han, Eun Su;Lim, Johan;Lee, Kyeong Eun
    • The Korean Journal of Applied Statistics / v.26 no.1 / pp.81-92 / 2013
  • It is important to construct a firm basis for accusing potential violators of academic integrity in order to avoid spurious accusations and false convictions. Educational researchers have developed many statistical methods that can either uncover or confirm cases of cheating on tests. However, most of them rely on simple correlation-based measures and often fail to account for patterns in responses or answers. In this paper, we propose a new statistical index, the Standardized Signed Entropy Similarity Score, to resolve this difficulty. We also apply the proposed method to a real data set and compare the results with those of existing methods.

An Algorithm of Score Function Generation using Convolution-FFT in Independent Component Analysis (독립성분분석에서 Convolution-FFT을 이용한 효율적인 점수함수의 생성 알고리즘)

  • Kim Woong-Myung;Lee Hyon-Soo
    • The KIPS Transactions:PartB / v.13B no.1 s.104 / pp.27-34 / 2006
  • In this study, we propose a new algorithm that generates the score function in independent component analysis (ICA) using entropy theory. Generating the score function requires estimating the probability density function of the original signals and differentiating that density. We therefore use kernel density estimation to derive the differential form of the score function from the original signal. To speed up density estimation, we rewrite the formula as a convolution and compute it with the FFT algorithm, which evaluates convolutions quickly. The proposed score function generation method reduces the error, defined as the density difference between the recovered signals and the original signals. In computer simulations of the blind source separation problem, the estimated density function was closer to that of the original signals than with Extended Infomax and Fixed Point ICA, and the signal-to-noise ratio (SNR) between the recovered and original signals improved.
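
The kernel density and FFT convolution steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes a Gaussian kernel, a user-supplied bandwidth, samples binned onto a regular grid, and the sign convention phi(y) = -p'(y)/p(y) for the score function. The function names `fft_convolve_same` and `score_function_fft` are hypothetical.

```python
import numpy as np

def fft_convolve_same(signal, kernel):
    """Linear convolution via zero-padded FFT, cropped to len(signal)."""
    m = signal.size + kernel.size - 1
    full = np.fft.irfft(np.fft.rfft(signal, m) * np.fft.rfft(kernel, m), m)
    start = kernel.size // 2  # index of the kernel's zero offset
    return full[start:start + signal.size]

def score_function_fft(samples, bandwidth, grid_size=1024):
    """Estimate phi(y) = -p'(y) / p(y) on a regular grid (assumed convention)."""
    samples = np.asarray(samples, dtype=float)
    lo = samples.min() - 4.0 * bandwidth
    hi = samples.max() + 4.0 * bandwidth
    grid = np.linspace(lo, hi, grid_size)
    dx = grid[1] - grid[0]

    # Bin the samples so that bin centers coincide with the grid points.
    counts, _ = np.histogram(
        samples, bins=grid_size, range=(lo - dx / 2, hi + dx / 2)
    )
    counts = counts.astype(float)

    # Gaussian kernel and its derivative, sampled on every possible
    # grid-to-grid offset so that no pair of points is truncated.
    offsets = np.arange(-(grid_size - 1), grid_size) * dx
    kernel = np.exp(-0.5 * (offsets / bandwidth) ** 2)
    kernel /= bandwidth * np.sqrt(2.0 * np.pi)
    kernel_deriv = -offsets / bandwidth**2 * kernel

    # Density and its derivative are both convolutions with the binned data.
    density = fft_convolve_same(counts, kernel) / samples.size
    density_deriv = fft_convolve_same(counts, kernel_deriv) / samples.size

    # Floor the density to guard against division by ~0 far from the data.
    score = -density_deriv / np.maximum(density, 1e-12)
    return grid, score
```

For a unit-variance Gaussian source the true score function is phi(y) = y, so the estimate should be close to a straight line through the origin, slightly flattened by the kernel smoothing.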

An Analysis of Quality Efficiency of Loan Consultants in a Bank using Shannon's Entropy and PCA-DEA Model (Entropy와 PCA-DEA 모형을 이용한 은행 대출상담사의 서비스 품질 효율성 분석)

  • Choi, Jang Ki;Kim, Kyeongtaek;Suh, Jae Joon
    • Journal of Korean Society of Industrial and Systems Engineering / v.40 no.3 / pp.7-17 / 2017
  • Loan consultants assist clients with loan application processing and loan decisions. Their duties may include contacting people to ask if they want a loan, meeting with loan applicants, and explaining different loan options. We studied the efficiency of service quality of loan consultants contracted to a bank in Korea. They work independently rather than as a team. Because consultants are not employees of the bank, they are paid solely in proportion to the loans they sell. In this study, each consultant is treated as a decision making unit (DMU) in the data envelopment analysis (DEA) model. We use a principal component analysis-data envelopment analysis (PCA-DEA) model integrated with Shannon's entropy to evaluate the quality efficiency of the consultants, following a three-stage process. In the first stage, we use PCA to obtain 6 synthetic indicators (4 input indicators and 2 output indicators) from survey results whose questionnaire items are based on the SERVQUAL model. In the second stage, 3 DEA models that allow negative values are used to calculate the relative efficiency of each DMU. In the third stage, the weight of each result is calculated on the basis of Shannon's entropy theory and then used to generate a comprehensive efficiency score. An example illustrates the proposed process of evaluating the relative quality efficiency of the loan consultants and shows how the efficiency can be used to improve their service quality.
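
The third stage described above can be sketched with the standard Shannon entropy weight calculation. This is an illustration under stated assumptions, not the paper's exact procedure: the efficiency values are invented, all scores are assumed to be positive, and the function names `entropy_weights` and `comprehensive_scores` are hypothetical.

```python
import math

def entropy_weights(matrix):
    """Shannon entropy weights for the columns of a positive-valued matrix.

    Rows are DMUs; columns are the results of different models.
    A column whose values differ more across DMUs gets a larger weight.
    """
    m = len(matrix)
    divergences = []
    for column in zip(*matrix):
        total = sum(column)
        shares = [value / total for value in column]
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(m)
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    if total_divergence < 1e-12:
        # Every column is uniform, so fall back to equal weights.
        return [1.0 / len(divergences)] * len(divergences)
    return [d / total_divergence for d in divergences]

def comprehensive_scores(matrix, weights):
    """Weighted sum of each DMU's results."""
    return [sum(w * v for w, v in zip(weights, row)) for row in matrix]

# Hypothetical efficiencies of four consultants under three DEA models.
efficiencies = [
    [1.00, 0.90, 0.95],
    [0.80, 0.85, 0.60],
    [0.60, 0.95, 0.75],
    [0.90, 0.80, 1.00],
]
weights = entropy_weights(efficiencies)
scores = comprehensive_scores(efficiencies, weights)
```

In this sample the second model separates the consultants least, so it receives the smallest weight in the comprehensive score.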

Priority Determination of the Projects for Ecological Restoration of the Stream : Case Study for Han River Estuary (생태하천 복원사업 우선순위 선정에 대한 연구: 한강하구를 중심으로)

  • Seonuk Baek;Junhak Lee;Seungmin Lee;Haneul Lee;Hung Soo Kim;Soojun Kim
    • Journal of Wetlands Research / v.25 no.1 / pp.64-73 / 2023
  • Before 2022, the planning and implementation of stream ecological restoration projects were marked by considerable confusion because responsibility for stream management was divided between two principal agents. Because the Ministry of Environment took charge of the projects in 2022, securing the health of the stream aquatic ecosystem became an essential factor. In this study, we therefore selected the streams that require ecological restoration in the Han River estuary, where securing the health of the stream aquatic ecosystem is essential because a brackish water zone and a Ramsar wetland are located there. Evaluation indexes for physical, chemical, spatial/humanistic, and aquatic ecosystem health factors were first calculated from the detailed facts and figures of the stream ecological restoration projects. Ranking, re-scaling, z-score, and t-score normalization methods were applied to the calculated evaluation indexes, and the resulting values were compared and analyzed. The entropy weight method was then applied to each evaluation index. Through this process, the streams that require ecological restoration (Mokgamcheon, Anyangcheon, etc.) were selected for the purpose of securing the health of the aquatic ecosystem in the Han River estuary. The results of this study can serve as basic research data for prioritizing stream ecological restoration projects.
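
The four normalization methods named in this abstract can be sketched as below. The abstract does not give the formulas, so these are common textbook definitions taken as assumptions: min-max re-scaling to [0, 1], z-scores using the population standard deviation, T-scores as 50 + 10z, and ranks with ties sharing their average rank. The sample values are invented.

```python
import statistics

def rank(values):
    """Rank 1 is the smallest value; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def rescale(values):
    """Min-max re-scaling to [0, 1]; assumes the values are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Standardize with the population standard deviation (an assumption)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def t_score(values):
    """T-score with mean 50 and standard deviation 10."""
    return [50 + 10 * z for z in z_score(values)]

# Hypothetical index values for four streams.
index_values = [10, 30, 20, 30]
print(rank(index_values))
print(rescale(index_values))
print(t_score(index_values))
```

Ranking discards the spacing between values, while the other three methods preserve it on different scales, which is one reason the normalized results are worth comparing before weights are applied.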

Failure Modes and Effects Analysis by using the Entropy Method and Fuzzy ELECTRE III (엔트로피법과 Fuzzy ELECTRE III를 이용한 고장모드영향분석)

  • Ryu, Si Wook
    • Journal of the Korea Safety Management & Science / v.16 no.4 / pp.229-236 / 2014
  • Failure modes and effects analysis (FMEA) is a widely used engineering tool for improving the quality or performance of a product or process design by prioritizing potential failure modes in terms of three risk factors: severity, occurrence, and detection. In classical FMEA, the risk priority number is obtained by multiplying the three values, each evaluated on a 10-point scale for one of the risk factors. However, many previous researchers have pointed out drawbacks of classical FMEA. To overcome these difficulties, this paper proposes using ELECTRE III, a representative outranking technique. Fuzzy linguistic variables are included to deal with the ambiguous and imperfect evaluation process, and the entropy method is applied to obtain the importance weights of the three risk factors. The numerical example previously studied by Kutlu and Ekmekçioğlu (2012), who proposed the fuzzy TOPSIS method along with fuzzy AHP, is adopted so that the results can be compared with theirs. Finally, after comparing the results of the two studies, possible directions for further research are discussed.
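
The classical risk priority number described above is simply the product of the three ratings. The sketch below shows only this baseline calculation, not the fuzzy ELECTRE III approach of the paper; the failure mode names and ratings are invented, and `FailureMode` and `prioritize` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    severity: int
    occurrence: int
    detection: int

    def rpn(self) -> int:
        """Classical risk priority number: severity x occurrence x detection."""
        for rating in (self.severity, self.occurrence, self.detection):
            if not 1 <= rating <= 10:
                raise ValueError("each rating must be on the 1-10 scale")
        return self.severity * self.occurrence * self.detection

def prioritize(modes):
    """Sort failure modes from highest to lowest RPN."""
    return sorted(modes, key=lambda mode: mode.rpn(), reverse=True)

# Hypothetical failure modes and ratings.
modes = [
    FailureMode("seal leak", severity=9, occurrence=2, detection=2),
    FailureMode("loose bolt", severity=2, occurrence=9, detection=2),
    FailureMode("sensor drift", severity=6, occurrence=5, detection=4),
]
for mode in prioritize(modes):
    print(mode.name, mode.rpn())
```

In this sample, "seal leak" and "loose bolt" receive the same RPN even though their severity ratings differ sharply. Ties of this kind are a commonly cited limitation of the plain product, and weighting the risk factors is one way to separate such cases.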

Modeling the Spatial Distribution of Black-Necked Cranes in Ladakh Using Maximum Entropy

  • Meenakshi Chauhan;Randeep Singh;Puneet Pandey
    • Proceedings of the National Institute of Ecology of the Republic of Korea / v.4 no.2 / pp.79-85 / 2023
  • The Tibetan Plateau is home to the only alpine crane species, the black-necked crane (Grus nigricollis). Conservation efforts are severely hampered by a lack of knowledge on the spatial distribution and breeding habitats of this species. An ecological niche modeling framework based on maximum entropy and occurrence record data allowed us to generate a species-specific spatial distribution map for Ladakh, Trans-Himalaya, India. The model was created by combining species occurrence data from 486 geographical sites with 24 topographic and bioclimatic variables. Fourteen variables helped forecast the distribution of black-necked cranes by 96.2%. The area under the curve (AUC) score for the model training data was high (0.98), indicating the accuracy and predictive performance of the model. Of the total study area, the areas with high and moderate habitat suitability for black-necked cranes were estimated to be 8,156 km² and 6,759 km², respectively. The area with high habitat suitability within the protected areas was 5,335 km². The spatial distribution predicted by our model showed that the majority of speculated conservation areas bordered the existing protected areas of the Changthang Wildlife Sanctuary. Hence, we believe that expanding the current study area would account for these gaps in conservation areas more effectively.

Speaker Identification Using Score-based Confidence in Noisy Environments (스코어 기반 관측신뢰도를 이용한 잡음환경하 화자식별)

  • Min, So-Hee;Song, Min-Gyu;Na, Seung-You;Choi, Seung-Ho;Kim, Jin-Young
    • Speech Sciences / v.14 no.4 / pp.145-156 / 2007
  • The performance of speaker identification is severely degraded in noisy environments. Recently, a probability weighting method based on observation membership was proposed to overcome the noise problem [1]. In that paper [1], the observation confidence was calculated from the SNR with a sigmoid function. However, estimating the SNR requires additional computation, and the estimated SNR is corrupted in dynamic noisy environments. In this paper, we propose methods for estimating the observation confidence based on score-based reliabilities (SBRs) that use entropy and dispersion measures. In general, SBRs are obtained from the speaker models' probabilities. The proposed methods are evaluated with the ETRI speaker recognition DB. We compared the performance of the proposed methods with those in [1][8]. The experimental results show that the proposed methods can be applied successfully when the SNR is not available.
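
The abstract does not give the SBR formulas, so the following is only one plausible reading of an entropy-based reliability, offered as an assumption: the speaker models' log-likelihoods for an observation are converted to posterior probabilities with a softmax, and the confidence is one minus the normalized entropy of those posteriors. The function name `entropy_confidence` and the sample scores are hypothetical.

```python
import math

def entropy_confidence(log_likelihoods):
    """Return 1 - H / ln(N) for the softmax of N speaker model scores.

    A value near 1 means one speaker model clearly dominates; a value
    near 0 means the models are indistinguishable for this observation.
    Requires at least two speaker models.
    """
    n = len(log_likelihoods)
    peak = max(log_likelihoods)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(value - peak) for value in log_likelihoods]
    total = sum(exps)
    posteriors = [e / total for e in exps]
    entropy = -sum(p * math.log(p) for p in posteriors if p > 0)
    return 1.0 - entropy / math.log(n)

# Hypothetical per-frame log-likelihoods from three speaker models.
clean_frame = [-12.0, -25.0, -30.0]
noisy_frame = [-20.0, -20.5, -20.2]
print(entropy_confidence(clean_frame))
print(entropy_confidence(noisy_frame))
```

A measure of this kind needs only the model scores, which is consistent with the abstract's point that the confidence can be estimated when the SNR is not available.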

Measuring and Describing Seoul's Mixed-Use Phenomenon (서울시 용도복합 현상의 측정 및 기술에 관한 연구)

  • KIM, Hyun-Moo;LEE, Woo-Jin;KWON, Tae-Jung;YEON, Jeong-Min
    • Journal of the Korean Association of Geographic Information Studies / v.24 no.3 / pp.10-31 / 2021
  • This study defines the mixed-use concept as the mixing of three or more major types of urban use to realize economic, social, and environmental values in urban space. With this definition, the study explores Seoul's mixed-use phenomenon. The quantification method is relative entropy, which calculates the balance of urban uses in a given area. The relative entropy method, also known as the land-use mix (LUM) score, uses three urban-use categories derived from the mixed-use concept definition. Hundreds of building-use types in the building regulations are grouped into these categories, and the LUM is calculated for Seoul's legal-status neighborhoods. The results are interpreted as criteria for Seoul's mixed-use phenomenon and classify mixed land-use status by value: the 'non mixed-use' category has a value of 0.631 and below, the 'unbalanced mixed-use' category has a value between 0.631 and 0.884, the 'balanced mixed-use' category has a value between 0.884 and 0.991, and the 'complete mixed-use' category has a value of 0.991 and over.
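
The relative entropy score and the four categories above can be sketched as follows. The formula used is the common land-use mix definition, LUM = -Σ p_i ln p_i / ln k, where p_i is the share of use i and k is the number of categories; the abstract does not spell it out, so treat it as an assumption. The floor-area figures, the function names, and the handling of a value exactly equal to 0.884 are also assumptions.

```python
import math

def land_use_mix(areas):
    """Relative entropy of land-use shares, scaled to the range [0, 1]."""
    k = len(areas)
    total = sum(areas)
    shares = [area / total for area in areas]
    entropy = -sum(p * math.log(p) for p in shares if p > 0)
    return entropy / math.log(k)

def classify(lum):
    """Map a LUM value to the categories reported in the abstract."""
    if lum <= 0.631:
        return "non mixed-use"
    if lum < 0.884:  # assumption: exactly 0.884 counts as balanced
        return "unbalanced mixed-use"
    if lum < 0.991:
        return "balanced mixed-use"
    return "complete mixed-use"

# Hypothetical floor areas for three urban-use categories in one neighborhood.
neighborhood = [5000.0, 3000.0, 2000.0]
score = land_use_mix(neighborhood)
print(round(score, 3), classify(score))
```

With three categories, a neighborhood split evenly between only two uses scores ln 2 / ln 3 ≈ 0.631, which coincides with the upper bound of the 'non mixed-use' category.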