• Title/Summary/Keyword: Compound words

Environment for Translation Domain Adaptation and Continuous Improvement of English-Korean Machine Translation System

  • Kim, Sung-Dong;Kim, Namyun
    • International Journal of Internet, Broadcasting and Communication / v.12 no.2 / pp.127-136 / 2020
  • This paper presents an environment for a rule-based English-Korean machine translation system that supports translation domain adaptation and continuous improvement of translation quality. For these purposes, a corpus is essential, since the information needed for translation is acquired from it. The environment consists of a corpus construction part and a translation knowledge extraction part. The corpus construction part crawls news articles from newspaper sites. The extraction part builds translation knowledge such as newly created words, compound words, collocation information, and distributional word representations. For translation domain adaptation, a corpus for the domain should be built and the translation knowledge should be constructed from that corpus. For continuous improvement, the corpus needs to be continuously expanded and the translation knowledge enhanced from the expanded corpus. The proposed web-based environment is expected to facilitate the tasks of domain adaptation and translation system improvement.

Conceptual Extraction of Compound Korean Keywords

  • Lee, Samuel Sangkon
    • Journal of Information Processing Systems / v.16 no.2 / pp.447-459 / 2020
  • After reading a document, people form a concept of the information they consumed and merge multiple words into keywords that represent the material. With that in mind, this study proposes a more efficient keyword extraction method in which production rules are established from scholarly journals, based on the concept information of the words appearing in a document, so that author-provided keywords remain usable even when they do not appear in the body of the document. This study also presents a new way to determine the importance of each keyword and to exclude non-relevant keywords. To validate the extracted keywords, titles and abstracts of journals about natural language and auditory language were collected for analysis. A comparison of author-provided keywords with the keywords produced by the developed system showed that the system was highly useful, with an accuracy of up to 96%.

New Methods for Isolation of Sesquiterpene from Panax ginseng (인삼 Sesquiterpene의 새로운 분리방법)

  • 위재준;신지영
    • Journal of Ginseng Research / v.21 no.3 / pp.214-218 / 1997
  • New simple methods for the isolation of sesquiterpenes from Panax ginseng were developed. First, volatile compounds were isolated by simultaneous distillation and extraction (SDE) with 30% methanol and n-hexane instead of water and ethyl ether/pentane (1:1). Second, headspace volatiles in a U-shaped tube at $70^{\circ}C$ were passed through a C18 Sep-Pak by nitrogen gas streaming, and the adsorbed volatiles were eluted with n-hexane. TLC analysis showed that the volatile concentrates consisted mainly of terpenes when colored with vanillin-sulfuric acid. GC/MS data revealed that approximately 30 sesquiterpenes of molecular weight 204 accounted for 81.1% or more of the volatile concentrates isolated by the two newly developed methods. Among these, alloaromadendrene, germacrene B, isocaryophyllene, $\alpha$-neoclovene, $\gamma$-muurolene, $\beta$-panasinsene, and $\alpha$-humulene were identified as the major sesquiterpenes by comparison with authentic samples or by literature search. Key words: Panax ginseng, volatile compound, sesquiterpene, isolation, new method, GC/MS.

Performance Analysis of n-Gram Indexing Methods for Korean Text Retrieval (한글 문서 검색에서 n-Gram 색인방법의 성능 분석)

  • 이준규;심수정;박혁로
    • Proceedings of the IEEK Conference / 2003.11b / pp.145-148 / 2003
  • The agglutinative nature of the Korean language makes automatic indexing of Korean very different from that of Indo-European languages. In particular, indexing with compound nouns in Korean is problematic because of the exponential number of possible analyses and the existence of unknown words. To deal with this compound noun indexing problem, we propose a new indexing method that combines the merits of morpheme-based indexing and n-gram based indexing. Through experiments, we also find that the best performance of n-gram indexing methods is achieved with 1.75-gram, which has not been considered in previous research.
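
As a rough illustration of the n-gram side of such an approach, the sketch below builds a character-bigram inverted index over a few toy Korean strings. It is not the authors' method: the morpheme-based component and the 1.75-gram setting are not reproduced, and the names `char_ngrams`, `build_index`, `search`, and the sample documents are all hypothetical.

```python
from collections import defaultdict


def char_ngrams(text, n=2):
    """Return overlapping character n-grams for each whitespace-delimited token.

    Tokens shorter than n are kept whole so that they remain searchable.
    """
    grams = []
    for token in text.split():
        if len(token) < n:
            grams.append(token)
        else:
            grams.extend(token[i:i + n] for i in range(len(token) - n + 1))
    return grams


def build_index(docs, n=2):
    """Build an inverted index mapping each n-gram to the set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in char_ngrams(text, n):
            index[gram].add(doc_id)
    return index


def search(index, query, n=2):
    """Return ids of documents that contain every n-gram of the query."""
    grams = char_ngrams(query, n)
    if not grams:
        return set()
    result = set(index.get(grams[0], set()))
    for gram in grams[1:]:
        result &= index.get(gram, set())
    return result


# Toy collection (assumed data): document 1 contains the unsegmented
# compound noun "정보검색" (information retrieval).
DOCS = {1: "정보검색 시스템", 2: "검색 엔진", 3: "정보 처리"}
INDEX = build_index(DOCS)

print(search(INDEX, "검색"))      # matches inside the compound as well
print(search(INDEX, "정보검색"))  # only the document with the full compound
```

Because the bigrams are taken inside each token, a query for a component noun still reaches documents where it appears only inside a longer compound, with no morphological analysis required.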

The Experimental Study on the Relationship between Hierarchical Agglomerative Clustering and Compound Nouns Indexing (계층적 결합형 문서 클러스터링 시스템과 복합명사 색인방법과의 연관관계 연구)

  • Cho Hyun-Yang;Choi Sung-Pil
    • Journal of the Korean Society for Library and Information Science / v.38 no.4 / pp.179-192 / 2004
  • In this paper, we show that the results of document clustering can change dramatically depending on how compound nouns are indexed. First, we introduce the automatic indexing engine specialized for Korean word analysis, which also serves as the backbone engine of the automatic document clustering system. Then, we describe the details of the hierarchical agglomerative clustering (HAC) method, one of the most widely used clustering methodologies today. From the experiments carried out in the final part of this paper, we conclude that the various modes of indexing compound nouns affect the outcome of HAC.
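
The effect described above can be sketched on toy data. The example below is only an illustration under stated assumptions: it uses single-linkage agglomerative clustering with Jaccard similarity and a stopping threshold, none of which is confirmed as the paper's configuration, and the documents, the `+` notation for compound nouns, and the function names are hypothetical.

```python
def index_terms(nouns, split_compounds):
    """Index a noun list, keeping compounds whole or splitting them on '+'."""
    terms = set()
    for noun in nouns:
        if split_compounds:
            terms.update(noun.split("+"))
        else:
            terms.add(noun)
    return terms


def jaccard(a, b):
    """Jaccard similarity between two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hac(doc_terms, threshold):
    """Single-linkage agglomerative clustering.

    Repeatedly merges the two most similar clusters until the best
    similarity falls below the threshold.
    """
    clusters = [[doc_id] for doc_id in sorted(doc_terms)]

    def similarity(c1, c2):
        return max(jaccard(doc_terms[a], doc_terms[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        best_sim, i, j = max(
            ((similarity(clusters[i], clusters[j]), i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda item: item[0],
        )
        if best_sim < threshold:
            break
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return sorted(clusters)


# Assumed toy documents; '+' marks the boundary inside a compound noun.
RAW_DOCS = {
    "d1": ["information+retrieval", "index"],
    "d2": ["information+processing", "index"],
    "d3": ["image+retrieval", "camera"],
}

whole = {d: index_terms(n, split_compounds=False) for d, n in RAW_DOCS.items()}
split = {d: index_terms(n, split_compounds=True) for d, n in RAW_DOCS.items()}

print(hac(whole, threshold=0.4))  # compounds indexed as single terms
print(hac(split, threshold=0.4))  # compounds split into component nouns
```

With compounds kept whole, d1 and d2 share only one term and stay apart at this threshold; once the compounds are split they also share "information" and merge, so the same documents yield different clusters depending on the indexing mode.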

Antimicrobial Characteristics of Antimicrobial Agent (Antibiotics) and Reduction Effect on Malodor (항균제의 항균특성 및 악취제거 효과)

  • Shin, Choon-Hwan;Kim, Jong-Hyun;Han, Sun-Hong
    • Journal of Environmental Science International / v.3 no.2 / pp.157-164 / 1994
  • Various antimicrobial agents are widely used in antimicrobial processing. We investigated the antimicrobial activity and malodor reduction efficiency of a diphenyl ether compound (2,4,4'-trichloro-2'-hydroxydiphenyl ether) against Staphylococcus aureus (S. aureus) and Proteus vulgaris (P. vulgaris), which cause malodor. Notably, the diphenyl ether compound is not restricted by water-contamination regulations. In this research, we found that the optimum concentration of the diphenyl ether compound was 1.5 wt% for both strains, and that the antimicrobial expressions were $c^{0.38}t = 2.56$ for S. aureus and $c^{0.38}t = 2.67$ for P. vulgaris. We also found that the -OH group acted as the antimicrobial functional group. Lastly, the malodor reduction effect was more than 90% for both strains under the optimum conditions. Key Words: antimicrobial agents, antimicrobial activity, reduction effect of malodor, antimicrobial expression, antimicrobial functional group.

Integrated Indexing Method using Compound Noun Segmentation and Noun Phrase Synthesis (복합명사 분할과 명사구 합성을 이용한 통합 색인 기법)

  • Won, Hyung-Suk;Park, Mi-Hwa;Lee, Geun-Bae
    • Journal of KIISE:Software and Applications / v.27 no.1 / pp.84-95 / 2000
  • In this paper, we propose an integrated indexing method that combines compound noun segmentation and noun phrase synthesis. Statistical information is used in the compound noun segmentation, and natural language processing techniques are used in the noun phrase synthesis. First, we choose index terms from simple words using the results of morphological analysis and part-of-speech tagging. Second, noun phrases are automatically synthesized from the syntactic analysis results; if syntactic analysis fails, only the morphological analysis and tagging results are applied. Third, we select compound nouns from the tagging results and then segment and re-synthesize them using statistical information. In this way, segmented and synthesized terms are used together as index terms to supplement the single terms. We demonstrate the effectiveness of the proposed integrated indexing method for Korean compound noun processing using KTSET2.0 and KRIST SET, which are standard test collections for Korean information retrieval.
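
The abstract does not say which statistics drive the segmentation step, so the following is only one plausible sketch: a dynamic-programming segmenter that picks the split maximizing the sum of unigram log-probabilities. The frequency table `FREQ`, the `max_len` limit, and the function name `segment` are assumptions made for illustration.

```python
import math


def segment(compound, freq, max_len=4):
    """Split a compound noun into known nouns, maximizing total log-probability.

    best[k] holds (score, pieces) for the best segmentation of compound[:k].
    If no full segmentation exists, the compound is returned unsplit.
    """
    total = sum(freq.values())
    n = len(compound)
    best = [(-math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = compound[start:end]
            prev_score, prev_pieces = best[start]
            if piece not in freq or prev_score == -math.inf:
                continue
            score = prev_score + math.log(freq[piece] / total)
            if score > best[end][0]:
                best[end] = (score, prev_pieces + [piece])
    return best[n][1] or [compound]


# Hypothetical noun frequencies, standing in for corpus statistics.
FREQ = {"정보": 50, "검색": 40, "시스템": 30, "정보검색": 5, "색시": 1}

print(segment("정보검색시스템", FREQ))  # split into component nouns
print(segment("미등록어", FREQ))        # unknown compound stays whole
```

In an indexing setting like the one described, both the unsplit compound and the segmented pieces could then be emitted as index terms, so that queries on either form can match.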

Sign Language Recognition Using Sequential RAM-based Cumulative Neural Networks (순차 램 기반 누적 신경망을 이용한 수화 인식)

  • Lee, Dong-Hyung;Kang, Man-Mo;Kim, Young-Kee;Lee, Soo-Dong
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.9 no.5 / pp.205-211 / 2009
  • The weightless neural network (WNN) offers faster processing and requires less computation than a weighted neural network, which must readjust its weights. Behavioral information such as sequential gestures exhibits strong serial correlation, so recognizing it requires considerable computation and processing time. To address these problems, many algorithms add preprocessing and hardware interface devices to reduce the computation and processing time. In this paper, we propose the RAM-based Sequential Cumulative Neural Network (SCNN) model, a sign language recognition system that needs neither preprocessing nor a hardware interface. We experimented with compound words in continuous Korean sign language, using edge-detected binary images from a camera as input. Without preprocessing, the sign language recognition system achieved a 93% recognition rate.

Automatic Tagging and Tag Recommendation Techniques Using Tag Ontology (태그 온톨로지를 이용한 자동 태깅 및 태그 추천 기법)

  • Kim, Jae-Seung;Mun, Hyeon-Jeong;Woo, Tae-Yong
    • The Journal of Society for e-Business Studies / v.14 no.4 / pp.167-179 / 2009
  • This paper introduces techniques for recommending standardized tags using a tag ontology. Tag recommendation consists of TWCIDF and TWCITC; the former automatically tags large existing document groups, and the latter recommends tags for new documents. Tag groups are created through several processes, including preprocessing, standardization using the tag ontology, automatic tagging, and ranking for recommendation. In the preprocessing step, words are combined into basic word groups in order to find semantic compound nouns. In the standardization step, typographical errors and similar words are handled. Experiments with the techniques presented in this paper show that real-time automatic tagging and tag recommendation are possible while maintaining the accuracy of tag recommendation.

A Study on the Space Constitution for the Complex of Educational Facilities - Focused on Public Space Formation for Composition of Complex with Welfare Facilities for the Aged - (교육시설의 복합화를 위한 공간구성에 관한 연구 - 고령자 복지시설과의 복합화를 위한 공용공간구성을 중심으로 -)

  • Kim, Jin-Mo
    • Journal of the Korean Institute of Educational Facilities / v.14 no.3 / pp.27-35 / 2007
  • The purpose of this research is to analyze the senior-citizen welfare functions found in similar types of combined school facilities and to identify the problems of current conditions. Based on the results, it also proposes a spatial composition that students and local senior citizens can easily use. Among the types of educational facility complexes, it became clear that the share of vacant classrooms occupied by elderly welfare facilities, and their utilization ranking, were low. In other words, elderly welfare facilities are an important issue when combining school facilities. An analysis of existing combined facilities showed that interaction between students and the elderly was limited. In addition, there were problems with the learning environment caused by intersecting circulation paths. Therefore, combining educational facilities with elderly welfare facilities is not merely a spatial composition that overlays specific functions. Rather, a program for the common space needed to form the local community is required. The semi-public area is proposed on the premise of the existing facility's functions and the actual conditions of spatial use.