• Title/Summary/Keyword: Natural language process

Context Management of Conversational Agent using Two-Stage Bayesian Network (2단계 베이지안 네트워크를 이용한 대화형 에이전트의 문맥 관리)

  • Hong, Jin-Hyuk;Cho, Sung-Bae
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.10 no.1
    • /
    • pp.89-98
    • /
    • 2004
  • A conversational agent is a system that provides users with appropriate information and maintains the context of a dialogue in natural language. Analyzing and modeling the user's query is essential to making the agent more realistic, and the Bayesian network is a promising technique for this. When experts design a network for a domain, however, it is usually very complicated and hard to understand. Separating the variables of the domain reduces the size of the networks and makes the conversational agent easier to design. By composing the Bayesian network in two stages, we aim to design the conversational agent easily and to analyze the user's query in detail. In addition, information from earlier in the dialogue makes it possible to maintain the context of the conversation. Implementing the approach as a guide to web pages confirms the usefulness of the proposed architecture for conversational agents.
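
As a rough illustration of the two-stage idea in this abstract, the sketch below first infers a coarse topic from query keywords and then runs a smaller, topic-specific model to pick a detailed intent. It is not the authors' network: the topics, intents, keywords, and probabilities are all invented for the example, and each stage is simplified to a naive-Bayes style model. Reusing the previous topic belief as the next prior is likewise only an assumed way to carry dialogue context.

```python
from math import prod

# All names and probabilities below are hypothetical, chosen only for illustration.
TOPIC_PRIOR = {"location": 0.5, "contact": 0.5}
KEYWORD_GIVEN_TOPIC = {
    "location": {"where": 0.8, "phone": 0.1},
    "contact": {"where": 0.1, "phone": 0.8},
}
INTENT_PRIOR = {
    "location": {"lab_location": 0.5, "campus_map": 0.5},
    "contact": {"lab_phone": 0.5, "professor_contact": 0.5},
}
KEYWORD_GIVEN_INTENT = {
    "location": {
        "lab_location": {"lab": 0.9, "map": 0.1},
        "campus_map": {"lab": 0.1, "map": 0.9},
    },
    "contact": {
        "lab_phone": {"lab": 0.9, "professor": 0.1},
        "professor_contact": {"lab": 0.1, "professor": 0.9},
    },
}


def posterior(prior, likelihood, observed):
    """Return P(hypothesis | observed keywords) for a naive-Bayes style model."""
    scores = {}
    for hypothesis, p_h in prior.items():
        # Each known keyword contributes p if present, (1 - p) if absent.
        scores[hypothesis] = p_h * prod(
            p if word in observed else 1.0 - p
            for word, p in likelihood[hypothesis].items()
        )
    total = sum(scores.values())
    return {h: s / total for h, s in scores.items()}


def two_stage_infer(words, topic_prior=TOPIC_PRIOR):
    """Stage 1 picks a topic; stage 2 picks an intent within that topic."""
    observed = set(words)
    topic_belief = posterior(topic_prior, KEYWORD_GIVEN_TOPIC, observed)
    topic = max(topic_belief, key=topic_belief.get)
    intent_belief = posterior(
        INTENT_PRIOR[topic], KEYWORD_GIVEN_INTENT[topic], observed
    )
    intent = max(intent_belief, key=intent_belief.get)
    return topic, intent, topic_belief


# First turn, then a follow-up that reuses the earlier topic belief as context.
topic, intent, belief = two_stage_infer(["phone", "lab"])
next_topic, next_intent, _ = two_stage_infer(["professor"], topic_prior=belief)
print(topic, intent, next_topic, next_intent)
```

Splitting the model this way keeps each table small: the second-stage tables only need the keywords relevant to one topic, which is the size reduction the abstract attributes to separating domain variables.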

Natural language processing techniques for bioinformatics

  • Tsujii, Jun-ichi
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2003.10a
    • /
    • pp.3-3
    • /
    • 2003
  • With biomedical literature expanding so rapidly, there is an urgent need to discover and organize knowledge extracted from texts. Although factual databases contain crucial information, the overwhelming amount of new knowledge remains in textual form (e.g. MEDLINE). In addition, new terms are constantly coined as the relationships linking new genes, drugs, proteins, etc. As the size of the biomedical literature expands, more systems are applying a variety of methods to automate the process of knowledge acquisition and management. In my talk, I focus on GENIA, a project of our group at the University of Tokyo, whose objective is to construct a system that extracts protein-protein interaction information from MEDLINE abstracts. The talk includes (1) Techniques we use for named entity recognition: (1-a) SOHMM (Self-organized HMM), (1-b) Maximum Entropy Model, (1-c) Lexicon-based Recognizer; (2) Treatment of term variants and acronym finders; (3) Event extraction using a full parser; (4) Linguistic resources for text mining (GENIA corpus): (4-a) Semantic Tags, (4-b) Structural Annotations, (4-c) Co-reference tags, (4-d) GENIA ontology. I will also talk about a possible extension of our work that links the findings of molecular biology with clinical findings, and claim that text-based or concept-based biology would be a viable alternative to systems biology, which tends to emphasize the role of simulation models in bioinformatics.

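
Of the named entity recognition techniques listed in this abstract, the lexicon-based recognizer is the simplest to sketch. The example below is a generic longest-match dictionary tagger, not the GENIA implementation; the lexicon entries and labels are placeholders invented for the illustration.

```python
# Hypothetical lexicon: lowercase token tuples mapped to an entity label.
LEXICON = {
    ("interleukin", "2"): "protein",
    ("nf", "kappa", "b"): "protein",
    ("t", "cells"): "cell_type",
}
MAX_TERM_LENGTH = max(len(term) for term in LEXICON)


def tag_entities(tokens):
    """Return (surface form, label) pairs using greedy longest-match lookup."""
    lowered = [token.lower() for token in tokens]
    found = []
    i = 0
    while i < len(lowered):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(MAX_TERM_LENGTH, len(lowered) - i), 0, -1):
            label = LEXICON.get(tuple(lowered[i:i + n]))
            if label is not None:
                found.append((" ".join(tokens[i:i + n]), label))
                i += n
                break
        else:
            i += 1  # No lexicon entry starts at this token.
    return found


print(tag_entities("Interleukin 2 activates NF kappa B in T cells".split()))
```

A lookup like this only finds terms already in the lexicon, which is one reason the talk pairs it with statistical recognizers and with handling of term variants and acronyms.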

An Example-based Korean Standard Industrial and Occupational Code Classification (예제기반 한국어 표준 산업/직업 코드 분류)

  • Lim Heui-Seok
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.7 no.4
    • /
    • pp.594-601
    • /
    • 2006
  • Coding of occupational and industrial codes is a major operation in census surveys by the Korean statistics bureau. The coding process has been done manually. Such manual work is very labor- and cost-intensive, and it usually produces inconsistent results. This paper proposes an automatic coding system based on example-based learning. The system converts natural language input into the corresponding numeric codes using a code generation system trained by example-based learning, after applying manually built rules. In experiments with training data consisting of 400,000 records and 260 manual rules, the proposed system showed about 76.69% and 99.68% accuracy for occupational code classification and industrial code classification, respectively.

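
The pipeline in this abstract (manual rules first, then an example-based classifier) can be sketched as follows. This is only an assumed reading of the abstract: the rule, the stored examples, and the code labels are placeholders, and nearest-example lookup by Jaccard token overlap stands in for whatever example-based learner the paper actually uses.

```python
import re

# Hypothetical manual rules: (pattern, code). Codes are placeholders, not real
# Korean standard classification codes.
RULES = [
    (re.compile(r"\bteacher\b"), "CODE_TEACHING"),
]

# Hypothetical labeled examples standing in for the training records.
EXAMPLES = [
    ("grows rice on a farm", "CODE_FARMING"),
    ("sells clothes in a shop", "CODE_RETAIL"),
    ("drives a city bus", "CODE_TRANSPORT"),
]


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(description):
    """Apply manual rules first, then fall back to the most similar example."""
    text = description.lower()
    for pattern, code in RULES:
        if pattern.search(text):
            return code
    query = tokenize(text)
    _, best_code = max(EXAMPLES, key=lambda ex: jaccard(query, tokenize(ex[0])))
    return best_code


print(classify("High school teacher"))
print(classify("sells shoes in a shop"))
```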

A Decision Tree based Real-time Hand Gesture Recognition Method using Kinect

  • Chang, Guochao;Park, Jaewan;Oh, Chimin;Lee, Chilwoo
    • Journal of Korea Multimedia Society
    • /
    • v.16 no.12
    • /
    • pp.1393-1402
    • /
    • 2013
  • Hand gesture is one of the most popular communication methods in everyday life. In human-computer interaction applications, hand gesture recognition provides a natural way of communication between humans and computers. There are two main approaches to hand gesture recognition: glove-based and vision-based. In this paper, we propose a vision-based hand gesture recognition method using Kinect. Using depth information makes the hand detection process efficient and robust. Finger labeling lets the system classify poses according to the finger names and the relationships between fingers. It also makes the classification more effective and accurate. Our system can recognize two kinds of gesture sets. According to the experiment, the average accuracy on the American Sign Language (ASL) number gesture set is 94.33%, and that on the general gesture set is 95.01%. Since our system runs in real time and has a high recognition rate, it can be embedded into various applications.
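
To make the role of finger labeling concrete, here is a minimal hand-written decision tree that maps a set of labeled, extended fingers to the ASL numbers 1 to 5. It assumes hand detection and finger labeling have already been done upstream, and it is not the tree used in the paper; the function name and the restriction to five gestures are choices made for this sketch.

```python
def classify_asl_number(extended_fingers):
    """Classify ASL numbers 1-5 from the names of the extended fingers.

    Returns None when the finger combination matches none of the five poses.
    """
    fingers = set(extended_fingers)
    # First split: is the thumb extended?
    if "thumb" in fingers:
        others = fingers - {"thumb"}
        if others == {"index", "middle"}:
            return 3
        if others == {"index", "middle", "ring", "pinky"}:
            return 5
        return None
    # Thumb tucked: decide by which of the remaining fingers are extended.
    if fingers == {"index"}:
        return 1
    if fingers == {"index", "middle"}:
        return 2
    if fingers == {"index", "middle", "ring", "pinky"}:
        return 4
    return None


print(classify_asl_number(["thumb", "index", "middle"]))
```

Branching on finger names rather than on a raw finger count is what separates poses such as 2 and 3 from other combinations with the same number of extended fingers.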

User Needs-Based Technology Opportunities in Heterogeneous Fields Using Opinion Mining and Patent Analysis (오피니언 마이닝 및 특허분석을 통한 사용자 니즈기반 이종영역 기술기회 탐색)

  • Jang, Hyejin;Roh, Taeyeoun;Yoon, Byungun
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.43 no.1
    • /
    • pp.39-48
    • /
    • 2017
  • In a digital economy, users actively express their needs in many ways. Thus, many researchers use opinion mining to analyze what users need and whether they are satisfied. Researchers have also begun to look for technology opportunities in heterogeneous technology fields. However, they have not connected users' opinions to the technology development process, focusing only on natural language processing, marketing, or manufacturing. Also, studies of heterogeneous technology fields have focused on fusion technology. Thus, this study suggests a novel approach that is based on sentimental value and can be applied to exploring technology opportunities in heterogeneous fields. Sentimental value is calculated from users' opinions through sLDA. The heterogeneous technology opportunity is explored by patent analysis. This research contributes a hybrid methodology that combines patents and users' opinions. In addition, it can improve managerial efficiency by providing base data for decision making.

Application of Classification of Object-property Represented in Korea Building Act Sentences for BIM-enabled Automated Code Compliance Checking (BIM기반 설계 품질검토 자동화를 위한 건축 관련 법규문장의 객체 및 속성 표현에 대한 체계화 접근방법)

  • Shin, Jaeyoung;Lee, Jin-Kook
    • Korean Journal of Computational Design and Engineering
    • /
    • v.21 no.3
    • /
    • pp.325-333
    • /
    • 2016
  • This paper aims to classify the objects and their properties represented in Korea Building Act sentences for use in BIM-enabled automated code compliance checking. To conduct automated code compliance checking, it is necessary to develop a translation process that converts the building act sentences into computer-executable forms. However, since Korea Building Act sentences are written in natural language, some requirements are too ambiguous to translate explicitly. Within the scope of this paper, the building act sentences regarding building permit requirements are therefore analyzed, focusing on how regulation-specific objects and their related properties are represented in noun phrases. From 1977 building act sentences and attached reference regulations, 1200 regulation-specific objects and about 220 related properties are extracted and classified. As an application of the classification, an object-property database is implemented, and several applications using the database and the regulation-specific classification are suggested to support the generation of rule sets written in computable code.

Topic Extraction and Classification Method Based on Comment Sets

  • Tan, Xiaodong
    • Journal of Information Processing Systems
    • /
    • v.16 no.2
    • /
    • pp.329-342
    • /
    • 2020
  • In recent years, emotional text classification has been one of the essential research topics in the field of natural language processing. It has been widely used in the sentiment analysis of reviews of commodities such as hotels and of other commentary corpora. This paper proposes an improved W-LDA (weighted latent Dirichlet allocation) topic model to address the shortcomings of traditional LDA topic models. During Gibbs sampling of topic words and the calculation of the expected word distribution in the W-LDA topic model, an average weighted value is adopted to keep topic-related words from being submerged by high-frequency words and so improve the distinction between topics. The method then integrates a support vector machine classifier based on the extracted high-quality document-topic distribution and topic-word vectors. Finally, an efficient integration method is constructed for the analysis and extraction of emotional words, topic distribution calculations, and sentiment classification. Tests on real teaching evaluation data and on a public comment test set show that the proposed method has distinct advantages over two other typical algorithms in terms of topic differentiation, classification precision, and F1-measure.

A Study on the Nature of the Mathematical Reasoning (수학적 추론의 본질에 관한 연구)

  • Seo, Dong-Yeop
    • Journal of Elementary Mathematics Education in Korea
    • /
    • v.14 no.1
    • /
    • pp.65-80
    • /
    • 2010
  • The aims of our study are to investigate the nature of mathematical reasoning and the teaching of mathematical reasoning in school mathematics. We analysed the process by which deduction took shape in ancient Greece, based on Netz's study, and compared his study with Freudenthal's local organization. The result of our analysis shows that mathematical reasoning in elementary school has to be based on children's natural language and their intuitions, and that mathematical necessity has to be formed afterwards. We also discussed the sequence and implications of teaching the sum of the interior angles of polygons, which consists of discovery by induction, justification by intuition and logical reasoning, and generalization to polygons.

A Meta Analysis of the Edible Insects (식용곤충 연구 메타 분석)

  • Yu, Ok-Kyeong;Jin, Chan-Yong;Nam, Soo-Tai;Lee, Hyun-Chang
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2018.10a
    • /
    • pp.182-183
    • /
    • 2018
  • Big data analysis is the process of discovering meaningful correlations, patterns, and trends in large data sets stored with existing data warehouse management tools, and of creating new value from them. It also refers to technology that extracts new value from large volumes of structured and unstructured data and analyzes the results. Most big data analysis methods, such as data mining, machine learning, natural language processing, and pattern recognition, are already used in statistics and computer science. Global research institutes have identified big data as the most notable new technology since 2011.

SMS Text Messages Filtering using Word Embedding and Deep Learning Techniques (워드 임베딩과 딥러닝 기법을 이용한 SMS 문자 메시지 필터링)

  • Lee, Hyun Young;Kang, Seung Shik
    • Smart Media Journal
    • /
    • v.7 no.4
    • /
    • pp.24-29
    • /
    • 2018
  • Text analysis techniques for natural language processing in deep learning represent words in vector form through word embedding. In this paper, we propose a method of constructing a document vector and classifying it as a spam or normal text message, using word embedding and deep learning. Automatic spacing applied during preprocessing ensures that words with similar contexts are represented close together in vector space. Additionally, intentional word-formation errors using non-alphabetic or unusual characters are designed to evade spam message filters. Two embedding algorithms, CBOW and skip-gram, are used to produce the sentence vector, and the performance and accuracy of the deep learning based spam filter model are measured and compared with those of SVM Light.
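
One common way to turn word embeddings into a sentence vector is to average them; the abstract does not say which construction the authors use, so the sketch below is only an assumption. The three-dimensional vectors are made-up stand-ins for trained CBOW or skip-gram embeddings, and a nearest-centroid rule with cosine similarity replaces the paper's deep learning classifier to keep the example self-contained.

```python
from math import sqrt

# Hypothetical 3-dimensional embeddings; real ones would come from a trained model.
EMBEDDINGS = {
    "free": (0.9, 0.1, 0.0),
    "prize": (0.8, 0.2, 0.1),
    "win": (0.85, 0.1, 0.05),
    "meeting": (0.1, 0.9, 0.2),
    "lunch": (0.0, 0.8, 0.3),
    "tomorrow": (0.1, 0.7, 0.4),
}


def sentence_vector(words):
    """Average the embeddings of known words; return None if none are known."""
    vectors = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    if not vectors:
        return None
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


# Class centroids built from a few hypothetical labeled messages.
CENTROIDS = {
    "spam": sentence_vector(["free", "prize", "win"]),
    "normal": sentence_vector(["meeting", "lunch", "tomorrow"]),
}


def classify(words):
    """Label a tokenized message by its most similar class centroid."""
    vector = sentence_vector(words)
    if vector is None:
        return None
    return max(CENTROIDS, key=lambda label: cosine(vector, CENTROIDS[label]))


print(classify(["win", "a", "free", "prize"]))
print(classify(["lunch", "meeting", "tomorrow"]))
```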