• Title/Summary/Keyword: Needleman-Wunsch Algorithm

Search Result 5, Processing Time 0.018 seconds

Searching Similar Example-Sentences Using the Needleman-Wunsch Algorithm (Needleman-Wunsch 알고리즘을 이용한 유사예문 검색)

  • Kim Dong-Joo;Kim Han-Woo
    • Journal of the Korea Society of Computer and Information
    • /
    • v.11 no.4 s.42
    • /
    • pp.181-188
    • /
    • 2006
  • In this paper, we propose a search algorithm for similar example-sentences in the computer-aided translation. The search for similar examples, which is a main part in the computer-aided translation, is to retrieve the most similar examples in the aspect of structural and semantical analogy for a given query from examples. The proposed algorithm is based on the Needleman-Wunsch algorithm, which is used to measure similarity between protein or nucleotide sequences in bioinformatics. If the original Needleman-Wunsch algorithm is applied to the search for similar sentences, it is likely to fail to find them since similarity is sensitive to word's inflectional components. Therefore, we use the lemma in addition to (typographical) surface information. In addition, we use the part-of-speech to capture the structural analogy. In other word, this paper proposes the similarity metric combining the surface, lemma, and part-of-speech information of a word. Finally, we present a search algorithm with the proposed metric and present pairs contributed to similarity between a query and a found example. Our algorithm shows good performance in the area of electricity and communication.

  • PDF

Searching Similar Example Sentences for the Computer-Aided Translation System (번역지원 시스템을 위한 유사 예문 검색)

  • Kim Dong-Joo;Kim Han-Woo
    • KSCI Review
    • /
    • v.14 no.1
    • /
    • pp.197-204
    • /
    • 2006
  • This paper proposes an similar sentence searching algorithm for the computer-aided translation. The Proposed algorithm, which is based on the Needleman-Wunsch algorithm, measures the similarity between the input sentence and the example sentences through combining surface. lemma, part-of-speech information of words with the multi-layered information. It also carries out the alignment between them. The accuracy of the proposed algorithm was very high in the experiment for the example sentences of the area of electricity and communication.

  • PDF

A DNA Sequence Alignment Algorithm Using Quality Information and a Fuzzy Inference Method (품질 정보와 퍼지 추론 기법을 이용한 DNA 염기 서열 배치 알고리즘)

  • Kim, Kwang-Baek
    • Journal of Intelligence and Information Systems
    • /
    • v.13 no.2
    • /
    • pp.55-68
    • /
    • 2007
  • DNA sequence alignment algorithms in computational molecular biology have been improved by diverse methods. In this paper, we proposed a DNA sequence alignment algorithm utilizing quality information and a fuzzy inference method utilizing characteristics of DNA sequence fragments and a fuzzy logic system in order to improve conventional DNA sequence alignment methods using DNA sequence quality information. In conventional algorithms, DNA sequence alignment scores were calculated by the global sequence alignment algorithm proposed by Needleman-Wunsch applying quality information of each DNA fragment. However, there may be errors in the process for calculating DNA sequence alignment scores in case of low quality of DNA fragment tips, because overall DNA sequence quality information are used. In the proposed method, exact DNA sequence alignment can be achieved in spite of low quality of DNA fragment tips by improvement of conventional algorithms using quality information. And also, mapping score parameters used to calculate DNA sequence alignment scores, are dynamically adjusted by the fuzzy logic system utilizing lengths of DNA fragments and frequencies of low quality DNA bases in the fragments. From the experiments by applying real genome data of NCBI (National Center for Biotechnology Information), we could see that the proposed method was more efficient than conventional algorithms using quality information in DNA sequence alignment.

  • PDF

Analysis and Visualization for Comment Messages of Internet Posts (인터넷 게시물의 댓글 분석 및 시각화)

  • Lee, Yun-Jung;Ji, Jeong-Hoon;Woo, Gyun;Cho, Hwan-Gue
    • The Journal of the Korea Contents Association
    • /
    • v.9 no.7
    • /
    • pp.45-56
    • /
    • 2009
  • There are many internet users who collect the public opinions and express their opinions for internet news or blog articles through the replying comment on online community. But, it is hard to search and explore useful messages on web blogs since most of web blog systems show articles and their comments to the form of sequential list. Also, spam and malicious comments have become social problems as the internet users increase. In this paper, we propose a clustering and visualizing system for responding comments on large-scale weblogs, namely 'Daum AGORA,' using similarity analysis. Our system shows the comment clustering result as a simple screen view. Our system also detects spam comments using Needleman-Wunsch algorithm that is a well-known algorithm in bioinformatics.

A Method to Measure the Self-Supplied News Volumes of Internet Newspaper Company

  • Kim, Dong-Joo;Lee, Won Joo
    • Journal of the Korea Society of Computer and Information
    • /
    • v.20 no.10
    • /
    • pp.99-105
    • /
    • 2015
  • The growth of internet infrastructure and a tremendous increment of internet users lead actively to found internet newspaper publishing companies, which are able to dig up and publish own news articles. In disregard of these quantitative growth of internet newspaper companies, the qualitative growth of them doesn't coincide with the quantitative growth. Therefore, to require social responsibility and to build healthy media environment, Korean government has put in force registration system of internet newspaper company. According to this system, internet newspaper companies have to produce at the inside over 30 percent of weekly publications, and this requisite increases the needs of its verification. This paper investigates technologies to measure the self-supplied news volumes of internet newspaper company, examines validity of them, and presents appropriate method to measure. To compare huge amount of news articles rapidly, the presented method is based on the modified edit-distance, which reflects human cognition of word and empirical information related with it. To prove correctness of our presented method, we show experimental results for some real internet news articles.