• Title/Summary/Keyword: misclassification rate (오분류율)

Intraspecific Relationship of Eleutherococcus senticosus Max. by RAPD Markers (가시오갈피 수집종의 RAPD 변이분석)

  • Kim, Sun;Kim, Ki-Young;Park, Mun-Su;Choi, Sun-Young;Yun, Song-Joong
    • Korean Journal of Medicinal Crop Science / v.6 no.3 / pp.165-169 / 1998
  • To evaluate intraspecific variation among Kasiogalpi (Eleutherococcus senticosus Max.) collections, randomly amplified DNA polymorphisms were examined. Twenty of the 90 primers tested were selected. The range of polymorphism was 7.1% to 90.9% in 113 randomly and specifically amplified DNA fragments. The collections were divided into two major groups at a similarity coefficient of 0.65. A considerable degree of genetic diversity was also detected among plants within the same collection. The Deokyu (1, 2, 3, 4, 6), Bukhaedo (7, 8) and Odae (9, 10) collections showed a higher degree of genetic similarity, with values of 0.65 to 0.86, while Deokyu 5 showed much lower genetic similarity than the other collections.

On Practical Choice of Smoothing Parameter in Nonparametric Classification (베이즈 리스크를 이용한 커널형 분류에서 평활모수의 선택)

  • Kim, Rae-Sang;Kang, Kee-Hoon
    • Communications for Statistical Applications and Methods / v.15 no.2 / pp.283-292 / 2008
  • The smoothing parameter, or bandwidth, plays a key role in nonparametric classification based on kernel density estimation. We consider choosing the smoothing parameter that optimizes the Bayes risk in nonparametric classification. Hall and Kang (2005) clarified the theoretical properties of the smoothing parameter in terms of minimizing the Bayes risk and derived its optimal order. They used the bootstrap method to explore its numerical properties. We compare the cross-validation and bootstrap methods numerically in terms of the optimal order of the bandwidth. Effects on the misclassification rate are also examined. We confirm that the bootstrap method is superior to cross-validation in both cases.
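To make the role of the bandwidth concrete, the following is a minimal sketch of a one-dimensional kernel density classifier whose bandwidth is picked by leave-one-out cross-validation on the misclassification rate. It is not the authors' procedure: the Gaussian kernel, the simulated data, the candidate grid and all function names are assumptions chosen for illustration, and the bootstrap alternative discussed in the abstract is not shown.

```python
import math
import random


def gaussian_kde(x, sample, h):
    """Gaussian kernel density estimate at x from a 1-D sample with bandwidth h."""
    norm = len(sample) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample) / norm


def classify(x, class0, class1, h):
    """Assign x to the class with the larger prior-weighted density estimate."""
    n = len(class0) + len(class1)
    score0 = len(class0) / n * gaussian_kde(x, class0, h)
    score1 = len(class1) / n * gaussian_kde(x, class1, h)
    return 0 if score0 >= score1 else 1


def loo_misclassification_rate(class0, class1, h):
    """Leave-one-out estimate of the misclassification rate for bandwidth h."""
    errors = 0
    for i, x in enumerate(class0):
        rest = class0[:i] + class0[i + 1:]
        errors += classify(x, rest, class1, h) != 0
    for i, x in enumerate(class1):
        rest = class1[:i] + class1[i + 1:]
        errors += classify(x, class0, rest, h) != 1
    return errors / (len(class0) + len(class1))


def select_bandwidth(class0, class1, grid):
    """Pick the candidate bandwidth with the smallest leave-one-out error."""
    return min(grid, key=lambda h: loo_misclassification_rate(class0, class1, h))


# Simulated data (assumption): two unit-variance normal classes, means 0 and 2.
rng = random.Random(0)
class0 = [rng.gauss(0.0, 1.0) for _ in range(60)]
class1 = [rng.gauss(2.0, 1.0) for _ in range(60)]
grid = [0.1, 0.2, 0.4, 0.8, 1.6]  # hypothetical candidate bandwidths

best_h = select_bandwidth(class0, class1, grid)
best_rate = loo_misclassification_rate(class0, class1, best_h)
print(f"selected bandwidth: {best_h}, leave-one-out error: {best_rate:.3f}")
```

A single bandwidth is shared by both classes here to keep the sketch short; using a separate bandwidth per class would require a two-dimensional grid search.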

Criterion of Test Statistics for Validation in Credit Rating Model (신용평가모형에서 타당성검증 통계량들의 판단기준)

  • Park, Yong-Seok;Hong, Chong-Sun;Lim, Han-Seung
    • Communications for Statistical Applications and Methods / v.16 no.2 / pp.239-347 / 2009
  • This paper presents four well-known statistics that have been widely used to evaluate the discriminatory power of credit rating models: Kolmogorov-Smirnov, mean difference, AUROC and AR. Criteria for these statistics are determined by the value of the mean difference under the assumption of normality and equal standard deviations. Alternative criteria are proposed through simulations over various sample sizes, type II error rates, and ratios of bads. We also suggest an interpretation of each statistic on the basis of discriminatory power. Finally, we compare the currently used guidelines with the simulated results.
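The four statistics named in this abstract can be sketched on a toy scorecard as follows. The definitions are the common textbook ones and are assumptions here, since the paper's exact formulas are not given in the abstract: the mean difference is standardized by a pooled standard deviation, and AR is taken as 2 × AUROC − 1. The scores are invented for illustration.

```python
import statistics


def ks_statistic(goods, bads):
    """Maximum gap between the empirical score CDFs of the two groups."""
    def ecdf(sample, t):
        return sum(s <= t for s in sample) / len(sample)

    thresholds = sorted(set(goods) | set(bads))
    return max(abs(ecdf(bads, t) - ecdf(goods, t)) for t in thresholds)


def mean_difference(goods, bads):
    """Difference of mean scores divided by the pooled standard deviation."""
    pooled_var = (
        (len(goods) - 1) * statistics.variance(goods)
        + (len(bads) - 1) * statistics.variance(bads)
    ) / (len(goods) + len(bads) - 2)
    return (statistics.mean(goods) - statistics.mean(bads)) / pooled_var ** 0.5


def auroc(goods, bads):
    """Probability that a random good outscores a random bad (ties count 1/2)."""
    wins = sum((g > b) + 0.5 * (g == b) for g in goods for b in bads)
    return wins / (len(goods) * len(bads))


def accuracy_ratio(goods, bads):
    """Accuracy ratio (Gini coefficient), taken here as 2 * AUROC - 1."""
    return 2.0 * auroc(goods, bads) - 1.0


# Invented credit scores: higher means more creditworthy.
goods = [620, 640, 700, 710, 750]
bads = [540, 600, 630, 650]

print("KS    :", round(ks_statistic(goods, bads), 3))
print("MD    :", round(mean_difference(goods, bads), 3))
print("AUROC :", round(auroc(goods, bads), 3))
print("AR    :", round(accuracy_ratio(goods, bads), 3))
```

For this toy data, 17 of the 20 good/bad pairs are ordered correctly, so AUROC is 0.85 and AR is 0.70; the largest CDF gap occurs at a score of 650, giving KS = 0.6.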

Opinion Mining based Broadcasting-Consumption Impact Modeling (오피니언 마이닝기반 방송-소비 영향 모델링)

  • Kim, Jinah;Shin, Yoonmi;Moon, Nammee
    • Proceedings of the Korea Information Processing Society Conference / 2018.10a / pp.592-595 / 2018
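  • Predicting consumer behavior requires accounting not only for past consumption behavior but also for the influence of broadcast media, one of the external environmental factors, and the analysis should suit the 'snack culture' era. In this paper, we model the impact of broadcasting on consumption using domestic broadcast video content on Naver TV. For broadcasts with high monthly preference, main keywords were extracted by text mining from the titles, descriptions, tags, and comments of the video content. Based on these, consumption-propensity keywords were filtered through SO-PMI-based opinion mining and a consumption sentiment index was calculated. For this purpose, a consumption sentiment dictionary capable of identifying consumption preferences was newly built and used. Finally, consumers were classified by age and gender, and a broadcasting-consumption impact model was designed and implemented based on the consumption sentiment index and a broadcast preference rate that reflects the view counts and like counts of the broadcast content.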
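As a rough illustration of SO-PMI scoring, the sketch below computes a word's semantic orientation as its summed pointwise mutual information with positive seed words minus its summed PMI with negative seed words. Everything specific is an assumption: the toy tokenized comments, the seed words, document-level co-occurrence counting, and the additive smoothing constant `eps`. The paper's consumption sentiment dictionary and index are not reproduced.

```python
import math


def so_pmi(word, docs, pos_seeds, neg_seeds, eps=0.01):
    """Semantic orientation: sum of PMI with positive seeds minus PMI with negative seeds."""
    n = len(docs)
    doc_sets = [set(d) for d in docs]

    def count(*terms):
        # Number of documents that contain every given term.
        return sum(all(t in d for t in terms) for d in doc_sets)

    def pmi(a, b):
        # eps is additive smoothing (an assumption) so the log stays defined
        # when two words never co-occur.
        joint = (count(a, b) + eps) / n
        marginal = ((count(a) + eps) / n) * ((count(b) + eps) / n)
        return math.log2(joint / marginal)

    positive = sum(pmi(word, p) for p in pos_seeds)
    negative = sum(pmi(word, q) for q in neg_seeds)
    return positive - negative


# Hypothetical tokenized comments and seed words, in English for readability.
docs = [
    ["want", "buy", "delicious"],
    ["delicious", "buy", "recommend"],
    ["expensive", "regret", "bad"],
    ["bad", "regret", "refund"],
    ["buy", "recommend", "pretty"],
]
pos_seeds = ["buy", "recommend"]
neg_seeds = ["regret"]

for w in ["delicious", "refund"]:
    print(w, round(so_pmi(w, docs, pos_seeds, neg_seeds), 3))
```

In this toy corpus "delicious" co-occurs only with the positive seeds and scores above zero, while "refund" co-occurs only with the negative seed and scores below zero. Keywords could then be filtered by thresholding this score.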

Automatic Construction of a Negative/positive Corpus and Emotional Classification using the Internet Emotional Sign (인터넷 감정기호를 이용한 긍정/부정 말뭉치 구축 및 감정분류 자동화)

  • Jang, Kyoungae;Park, Sanghyun;Kim, Woo-Je
    • Journal of KIISE / v.42 no.4 / pp.512-521 / 2015
  • Internet users purchase goods on the Internet and express their positive or negative emotions about the goods in product reviews. Analysis of product reviews provides critical data both to potential consumers and for enterprise decision making. Opinion mining techniques, which derive opinions by analyzing meaningful data from large numbers of Internet reviews, have therefore become important. Existing studies were mostly based on comments written in English, and analysis of Korean has not been actively pursued. Unlike English, Korean has complex adjectives and suffixes. Existing studies also did not consider the characteristics of Internet language. This study proposes an emotional classification method that increases classification accuracy by analyzing the characteristics of Internet language that connote feelings. Positive and negative comments about products can be classified automatically using Internet emoticons. The validity of the proposed algorithm is supported by the high precision, recall and coverage obtained in its evaluation.

Prediction of replacement period of shield TBM disc cutter using SVM (SVM 기법을 이용한 쉴드 TBM 디스크 커터 교환 주기 예측)

  • La, You-Sung;Kim, Myung-In;Kim, Bumjoo
    • Journal of Korean Tunnelling and Underground Space Association / v.21 no.5 / pp.641-656 / 2019
  • In this study, a machine learning method is proposed for predicting the optimal replacement period of shield TBM (Tunnel Boring Machine) disc cutters. To this end, a large dataset of ground conditions, disc cutter replacement records and TBM excavation-related data, collected from a shield TBM tunnel site in Korea, was built and used to construct a disc cutter replacement period prediction model with a machine learning algorithm, SVM (Support Vector Machine), and to assess the performance of the model. The results showed that the RBF (Radial Basis Function) SVM performed best among the three SVM classification functions tested (80% accuracy and 10% error rate on average). Across ground types, the more disc cutter replacement data were available, the better the prediction results. From these results, machine learning methods are expected to become widely used in practice in the near future as more data accumulate and the models continue to be fine-tuned.

Undecided inference using bivariate probit models (이변량 프로빗모형을 이용한 미결정자 추론)

  • Hong, Chong-Sun;Jung, Mi-Yang
    • Journal of the Korean Data and Information Science Society / v.22 no.6 / pp.1017-1028 / 2011
  • When it is not easy to decide the credit score for some loan applicants, credit evaluation is postponed and the undecided applicants are referred to a specialist for further evaluation. This undecided inference is a problem that arises in most statistical models, in biostatistics and sports statistics as well as in credit evaluation. In this work, undecided inference is treated as a missing data mechanism under the MNAR assumption, and the bivariate probit model, one of the sample selection models, is used. Two undecided inference methods are proposed: one uses characteristic variables that represent the state of the decided applicants, and the other collects additional, more accurate information and applies these new variables. With an illustrative example, misclassification error rates for undecided and overall applicants are obtained and compared across various characteristic variables, undecided intervals, and thresholds. The misclassification error rates can be reduced when the undecided interval is widened and more accurate information is put into the model, since the situation of the decided applicants is then reflected more accurately in the bivariate probit model.

The Hybrid Model using SVM and Decision Tree for Intrusion Detection (SVM과 의사결정트리를 이용한 혼합형 침입탐지 모델)

  • Um, Nam-Kyoung;Woo, Sung-Hee;Lee, Sang-Ho
    • The KIPS Transactions:PartC / v.14C no.1 s.111 / pp.1-6 / 2007
  • To operate a secure network, it is very important to raise positive detection and lower negative detection so as to reduce the damage from network intrusion. By applying SVM to intrusion detection, we expect to improve real-time detection of intrusion data. However, because SVM classifies by computing values after representing the input data in a vector space, continuous data cannot be used directly as input. We therefore present a hybrid model that combines SVM with a decision tree method to make up for this weakness. As a result, the intrusion detection rate, F-P error rate and F-N error rate improve by 5.6%, 0.16% and 0.82%, respectively.
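The three rates reported in this abstract, together with the overall misclassification rate, can be computed from predicted and true labels as in the sketch below. The definitions are the usual confusion-matrix ones and are assumptions, since the abstract does not state the paper's formulas; the labels are invented, and both classes must be present for the divisions to be defined.

```python
def detection_error_rates(y_true, y_pred):
    """Return detection, false-positive, false-negative and misclassification rates.

    Labels: 1 = intrusion, 0 = normal traffic.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # intrusions caught
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # intrusions missed
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false alarms
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # normal traffic passed
    return {
        "detection_rate": tp / (tp + fn),
        "fp_rate": fp / (fp + tn),
        "fn_rate": fn / (tp + fn),
        "misclassification_rate": (fp + fn) / len(pairs),
    }


# Invented labels: 4 intrusions and 6 normal connections.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

rates = detection_error_rates(y_true, y_pred)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

With these labels one intrusion is missed and one normal connection is flagged, so the detection rate is 0.75, the F-P rate is 1/6, the F-N rate is 0.25 and the overall misclassification rate is 0.2.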

Correlation between the dielectric constant and porosity due to the nano pore in the thin film (나노기공에 의한 박막 내의 기공율과 절연상수의 상관관계)

  • Oh, Teresa
    • Journal of the Institute of Electronics Engineers of Korea SD / v.44 no.3 s.357 / pp.1-5 / 2007
  • SiOC films were made using a mixed precursor of oxygen and bistrimethylsilylmethane. The chemical properties of the SiOC films fell into three types, organic, hybrid and inorganic, depending on the flow rate ratio between oxygen and the bistrimethylsilylmethane precursor. The films with organic properties had a lower dielectric constant because of pore incorporation in the final material. In this study, the porosity of SiOC films with organic properties was investigated using the Maxwell-Garnett equation. The porosity of the films could be correlated with the blue shift observed by infrared spectroscopy, and it increased as the dielectric constant of the film decreased.
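The relation between porosity and dielectric constant can be sketched with the standard Maxwell-Garnett mixing rule for spherical pores in a dense matrix, (ε_film − ε_m)/(ε_film + 2ε_m) = f (ε_p − ε_m)/(ε_p + 2ε_m), solved for the pore volume fraction f. The abstract does not give the form of the equation the author used, so this form, the function names and the numerical values (matrix constant 4.0, pore constant 1.0) are assumptions for illustration only.

```python
def mg_effective_permittivity(porosity, eps_matrix, eps_pore=1.0):
    """Maxwell-Garnett effective dielectric constant for a given pore fraction."""
    beta = (eps_pore - eps_matrix) / (eps_pore + 2.0 * eps_matrix)
    return eps_matrix * (1.0 + 2.0 * porosity * beta) / (1.0 - porosity * beta)


def mg_porosity(eps_film, eps_matrix, eps_pore=1.0):
    """Pore volume fraction implied by a measured film dielectric constant."""
    beta = (eps_pore - eps_matrix) / (eps_pore + 2.0 * eps_matrix)
    return (eps_film - eps_matrix) / (eps_film + 2.0 * eps_matrix) / beta


# Hypothetical values: dense matrix with k = 4.0, air-filled pores with k = 1.0.
eps_matrix = 4.0
for eps_film in [3.5, 3.0, 2.5, 2.0]:
    f = mg_porosity(eps_film, eps_matrix)
    print(f"k = {eps_film:.1f} -> porosity = {f:.3f}")
```

Under these assumptions a film with k = 2.5 corresponds to a porosity of 3/7 (about 0.43), and the estimated porosity rises as the film's dielectric constant falls, which matches the trend stated in the abstract.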

Eigenvoice Adaptation of Classification Model for Binary Mask Estimation (Eigenvoice를 이용한 이진 마스크 분류 모델 적응 방법)

  • Kim, Gibak
    • Journal of Broadcast Engineering / v.20 no.1 / pp.164-170 / 2015
  • This paper deals with adapting the classification model in the binary mask approach to suppressing noise in noisy environments. The binary mask estimation approach is known to improve the intelligibility of noisy speech. However, the training data used to build the classification model for binary mask estimation must include the same type of noise as the test data. Eigenvoice adaptation is applied to the noise-independent classification model, and the adapted model is used as a noise-dependent model. The results are reported as hit rates and false alarm rates. The experiments confirmed that classification accuracy improves as the number of adaptation sentences increases.