• Title/Summary/Keyword: VE model

Word Embedding using word position information (단어의 위치정보를 이용한 Word Embedding)

  • Hwang, Hyunsun;Lee, Changki;Jang, HyunKi;Kang, Dongho
    • Proceedings of the Korean Language Information Society Conference (한국어정보학회:학술대회논문집) / 2017.10a / pp.60-63 / 2017
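  • Word embedding, used to apply deep learning to natural language processing, represents words in a vector space; it reduces dimensionality and gives words with similar meanings similar vectors. Because word embeddings must be trained on large corpora to perform well, the widely used word2vec model simplifies its architecture to make large-corpus training feasible. As a result, it learns mainly from word occurrence ratios and does not use word position information. In this paper, we modify the existing word embedding training model so that it can use word position information during training. Experiments show that training word embeddings with position information greatly improves syntactic performance on the word-analogy task, with an especially large effect for Korean, where word order can vary.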

Word Representation Analysis of Bio-marker and Disease Word (바이오 마커와 질병 용어의 단어 표현 분석)

  • Youn, Young-Shin;Nam, Kyung-Min;Kim, Yu-Seop
    • Annual Conference on Human and Language Technology / 2015.10a / pp.165-168 / 2015
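  • One of the important steps in a machine learning-based natural language processing module is representing words as input to the module. In contrast to the one-hot form, which has large vectors and no notion of similarity between words, word representation (word embedding) expresses words as vectors that capture similarity. It has attracted much attention because it improves the performance of machine learning models in several areas of natural language processing. In this paper, we use Word2Vec, CCA, and GloVe on a corpus built from the abstracts of 106,552 biomedical papers in PubMed and examine how well each word representation model separates the words of the corpus into categories. The fine-grained categories are disease names, disease symptoms, and ovarian cancer markers. To examine this classification ability, we map the word representations into two dimensions with t-SNE and visualize them.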

Combining Feature Variables for Improving the Accuracy of Naïve Bayes Classifiers (나이브베이즈분류기의 정확도 향상을 위한 자질변수통합)

  • Heo Min-Oh;Kim Byoung-Hee;Hwang Kyu-Baek;Zhang Byoung-Tak
    • Proceedings of the Korean Information Science Society Conference / 2005.07b / pp.727-729 / 2005
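  • The naïve Bayes classifier is a very efficient model in terms of training, application, and use of computational resources. Various experiments have also shown that its classification performance does not fall far behind that of other methods. In particular, it is most effective when it can exactly represent the true probability distribution that generated the data. However, when the conditional independence present in the true distribution does not match the structure of the naïve Bayes classifier, performance can degrade. More specifically, the degradation worsens when probabilistic dependencies exist among the feature variables. In this paper, we present a technique for combining feature variables that efficiently addresses this weakness of the naïve Bayes classifier. Combining feature variables explicitly represents the relationships among the variables, and we show experimentally that selecting the variables to combine on the basis of mutual information contributes substantially to the performance improvement.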
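
As a rough illustration of the idea in this abstract, the sketch below scores a pair of discrete feature variables by their mutual information and merges the pair into one joint variable. The function names, the use of empirical counts, and the toy data are assumptions for illustration; the abstract does not describe the authors' actual selection procedure.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete features."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_xy = c / n
        p_x = count_x[x] / n
        p_y = count_y[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


def combine_features(xs, ys):
    """Merge two features into a single joint feature (hypothetical helper)."""
    return list(zip(xs, ys))


# Toy data: f1 and f2 are fully dependent, f1 and f3 are independent.
f1 = [0, 0, 1, 1]
f2 = [0, 0, 1, 1]
f3 = [0, 1, 0, 1]

mi_dependent = mutual_information(f1, f2)
mi_independent = mutual_information(f1, f3)
joint = combine_features(f1, f2)
```

A pair with high mutual information, such as `f1` and `f2` here, is a natural candidate for merging, because a naïve Bayes classifier would otherwise treat the two as independent given the class.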

Text Categorization Using TextRank Algorithm (TextRank 알고리즘을 이용한 문서 범주화)

  • Bae, Won-Sik;Cha, Jeong-Won
    • Journal of KIISE:Computing Practices and Letters / v.16 no.1 / pp.110-114 / 2010
  • We describe a new method for text categorization using the TextRank algorithm. Text categorization is the problem of assigning one or more predefined categories to a text document. TextRank is a graph-based ranking algorithm. If each word is treated as a vertex and the co-occurrence of two adjacent words as an edge, a graph can be built from a document. We then use TextRank to find the important words in the graph and form features that are pairs consisting of an important word and a word adjacent to it. We use four classifiers: SVM, a naïve Bayesian classifier, a maximum entropy model, and a k-NN classifier. We use the non-cross-posted version of the 20 Newsgroups data set. Performance improved for all classifiers, which suggests that the TextRank algorithm has potential for text categorization.
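
The graph construction and feature extraction described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the damping factor, the fixed iteration count, the `top_k` cutoff, and all function names are assumptions.

```python
from collections import defaultdict


def build_graph(tokens):
    """Undirected graph: words are vertices, adjacent co-occurrence is an edge."""
    graph = defaultdict(set)
    for a, b in zip(tokens, tokens[1:]):
        if a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """PageRank-style scoring on the word graph (assumed parameters)."""
    scores = {word: 1.0 for word in graph}
    for _ in range(iterations):
        updated = {}
        for word in graph:
            # Each neighbor shares its score equally among its own neighbors.
            rank = sum(scores[n] / len(graph[n]) for n in graph[word])
            updated[word] = (1 - damping) + damping * rank
        scores = updated
    return scores


def pair_features(tokens, top_k=2):
    """Pairs of (important word, adjacent word) for the top_k ranked words."""
    graph = build_graph(tokens)
    scores = textrank(graph)
    important = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return {(word, neighbor) for word in important for neighbor in graph[word]}


tokens = "the cat sat on the mat near the door".split()
scores = textrank(build_graph(tokens))
features = pair_features(tokens, top_k=1)
```

In this toy sentence the most connected word is a function word, so a practical pipeline would probably filter stop words before building the graph; the abstract does not say whether the authors did so.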

A Development of Quantitative Analysis Model for the Policy Analysis in Feasibility Study Using the Performance Assessment Method (성능평가기법을 활용한 타당성조사 정책적 분석단계의 정량적 의사결정모델 개발 - 복수대안의 타당성 평가를 중심으로 -)

  • Lim, Yong-Soo;Song, Hyun-Young;Jeong, Han-Kee;Jeong, Min-Chul;Kong, Jung-Sik
    • Korean Journal of Construction Engineering and Management / v.12 no.2 / pp.89-100 / 2011
  • Social overhead capital (SOC) affects industries and the national economy; it is a major determinant of national competitiveness, and investment in SOC is essential for economic growth. Accordingly, the preliminary feasibility study was introduced, and its legal institutionalization and evaluation system have been established and reviewed since 1999. Despite these efforts, basic problems continue to be raised, such as the lack of a scientific method for investment evaluation and the loss of effectiveness of feasibility studies. Moreover, because prior studies aimed at resolving these problems have focused mainly on economic analysis and demand estimation, research on policy analysis, the most important phase of a feasibility study, is still insufficient. Therefore, this paper develops and proposes a quantitative decision-making model that applies the performance assessment method of Value Engineering (VE). The model is intended to improve the policy analysis of (preliminary) feasibility studies, which requires integration with related studies; to introduce a quantitative analysis method; to help improve value from the policy perspective of SOC investment goals; and to serve as a strategic decision-making method based on systematic analysis.

Isolation and Characterization of Two Methyltransferase Genes, AfuvipB and AfuvipC in Aspergillus fumigatus (Aspergillus fumigatus에서 Methyltransferase 유전자 AfuvipB와 AfuvipC의 분리 및 분석)

  • Elgabbar, Mohammed A. Abdo;Han, Kap-Hoon
    • The Korean Journal of Mycology / v.43 no.1 / pp.33-39 / 2015
  • In filamentous fungi, the velvet complex associated with the veA gene plays pivotal roles in development and secondary metabolism. In the model fungus Aspergillus nidulans, many proteins that can interact with VeA, including the two methyltransferases VipB and VipC, have been isolated and characterized. In this study, we isolated homologs of the vipB and vipC genes in the opportunistic human pathogenic fungus Aspergillus fumigatus and named them AfuvipB and AfuvipC. The AfuvipB gene, annotated as Afu3g14920 in the Aspergillus Genome Database (AspGD), consists of 1,510 bp interrupted by 10 introns and yields a putative methyltransferase protein 336 amino acids long. Similarly, AfuvipC, which is Afu8g01930, has 10 introns and encodes a polypeptide of 339 amino acids with a methyltransferase domain in the middle of the protein. To characterize the function of the genes in A. fumigatus, knock-out mutants were generated and their phenotypes were observed. Deletion of the AfuvipB gene caused no obvious phenotypic change on point inoculation, but the mutant formed smaller colonies than the wild type under single spore-derived culture conditions. In contrast, the AfuvipC deletion mutant showed no phenotypic difference from the wild type in either point inoculation or streak cultures. These results indicate that the two methyltransferases might have redundant roles and could be dispensable under normal culture conditions.

Effectiveness of Two-dose Varicella Vaccination: Bayesian Network Meta-analysis

  • Kwan Hong;Young June Choe;Young Hwa Lee;Yoonsun Yoon;Yun-Kyung Kim
    • Pediatric Infection and Vaccine / v.31 no.1 / pp.55-63 / 2024
  • Purpose: A 2-dose varicella vaccination strategy has been introduced in many countries worldwide, aiming to increase vaccine effectiveness (VE) against varicella infection. In this network meta-analysis, we aimed to provide a comprehensive evaluation and an overall estimated effect of varicella vaccination strategies via a Bayesian model. Methods: For each eligible study, we collected trial characteristics such as 1-dose vs. 2-dose, demographic characteristics, and outcomes of interest. For studies involving different doses, we aggregated the data for the same number of doses delivered into one arm. The preventive effects of 1 dose vs. 2 doses of varicella vaccine were evaluated in terms of the odds ratio (OR) and the corresponding equal-tailed 95% confidence interval (95% CI). Results: A total of 903 studies were retrieved during our literature search, and 25 interventional or observational studies were selected for the Bayesian network meta-analysis. A total of 49,265 observed individuals were included in this network meta-analysis. Compared to the 0-dose control group, the ORs for all varicella infections were 0.310 (95% CI, 0.198-0.484) for 1 dose and 0.087 (95% CI, 0.046-0.164) for 2 doses, corresponding to VEs of 69.0% (95% CI, 51.6-81.2) and 91.3% (95% CI, 83.6-95.4), respectively. Conclusions: A 2-dose vaccine strategy was able to significantly reduce the varicella burden. The effectiveness of 2-dose vaccination in reducing the risk of infection was demonstrated by sound statistical evidence, which highlights the public health need for a 2-dose vaccine recommendation.
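
The VE point estimates in this abstract are consistent with the common relation VE = (1 − OR) × 100. The abstract does not state the formula, so that relation is an assumption here, and the function names below are illustrative only. A minimal Python sketch of the conversion:

```python
def vaccine_effectiveness(odds_ratio):
    """VE in percent, assuming VE = (1 - OR) * 100."""
    if odds_ratio < 0:
        raise ValueError("odds ratio must be non-negative")
    return (1.0 - odds_ratio) * 100.0


def ve_with_interval(or_point, or_low, or_high):
    """Return (VE, VE lower bound, VE upper bound).

    The bounds swap: the upper OR bound gives the lower VE bound.
    """
    return (
        vaccine_effectiveness(or_point),
        vaccine_effectiveness(or_high),
        vaccine_effectiveness(or_low),
    )


# Odds ratios as reported in the abstract (vs. the 0-dose control group).
one_dose = ve_with_interval(0.310, 0.198, 0.484)
two_dose = ve_with_interval(0.087, 0.046, 0.164)

print("1 dose:", [round(v, 1) for v in one_dose])
print("2 doses:", [round(v, 1) for v in two_dose])
```

With the reported odds ratios this gives about 69.0% for one dose and 91.3% for two doses, and a 2-dose interval of 83.6 to 95.4, matching the abstract. For the 1-dose interval the same mapping gives 51.6 to 80.2, whereas the abstract prints an upper bound of 81.2; the reason for that difference is not clear from the abstract alone.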

A study on the Revenue model of Character business on the Internet (인터넷상에서 캐릭터를 활용한 수익 모델에 관한 연구)

  • Kim, Joon-Young
    • Proceedings of the Korea Contents Association Conference / 2004.11a / pp.78-83 / 2004
  • In the early days of the internet, almost all content was available free of charge. As the internet has become an important medium, users increasingly have to pay for some of its content, and this has already become a large market. A revenue model for character brands in internet business needs to be developed. A character brand can give internet content a competitive advantage. Revenue has been generated by using character brands to make goods and services more valuable. The internet is used in a wide range of everyday communication, and this use is expected to grow. It is therefore necessary to study which kinds of revenue models currently exist.

A Proposal of Quality Model for Alternative Evaluation of Major Construction Projects (건설공사에서 주요공종 대안평가를 위한 품질모델 제안)

  • Lee, Jong-Min;Seo, Bo-Ram;Son, Hyun-Jeong;Lee, Tae-Shim;Yang, Jin-Kook;Lee, Sang-Beom
    • Proceedings of the Korean Institute of Building Construction Conference / 2019.05a / pp.43-44 / 2019
  • For the major work types of a construction project, the choice of construction method directly affects construction cost and construction period. However, at some sites the method is chosen without sufficient consideration of site conditions. This study therefore proposes a quality model suited to the characteristics of the major work types.

Scoring Korean Written Responses Using English-Based Automated Computer Scoring Models and Machine Translation: A Case of Natural Selection Concept Test (영어기반 컴퓨터자동채점모델과 기계번역을 활용한 서술형 한국어 응답 채점 -자연선택개념평가 사례-)

  • Ha, Minsu
    • Journal of The Korean Association For Science Education / v.36 no.3 / pp.389-397 / 2016
  • This study aims to test the efficacy of English-based automated computer scoring models and machine translation for scoring Korean college students' written responses to natural selection concept items. To this end, I collected 128 pre-service biology teachers' written responses to a four-item instrument (512 written responses in total). Machine translation software (i.e., Google Translate) translated both the original responses and the spell-corrected responses. The presence or absence of five scientific ideas and three naïve ideas in both sets of translated responses was judged by the automated computer scoring models (i.e., EvoGrader). The computer-scored results (4096 predictions) were compared with expert-scored results. No significant differences were found between the computer-scored and expert-scored results, either in average scores or in statistical results based on average scores. The Pearson correlation coefficients of composite scores for each student between computer scoring and expert scoring were 0.848 for scientific ideas and 0.776 for naïve ideas. The inter-rater reliability indices (Cohen's kappa) between computer scoring and expert scoring for linguistically simple concepts (e.g., variation, competition, and limited resources) were over 0.8. These findings suggest that English-based automated computer scoring models combined with machine translation can be a promising method for scoring Korean college students' written responses to natural selection concept items.