• Title/Abstract/Keywords: Feature data file

68 search results (processing time 0.023 s)

Reviving GOR method in protein secondary structure prediction: Effective usage of evolutionary information

  • Lee, Byung-Chul;Lee, Chang-Jun;Kim, Dong-Sup
    • 한국생물정보학회:학술대회논문집 / 한국생물정보시스템생물학회 2003년도 제2차 연례학술대회 발표논문집 / pp.133-138 / 2003
  • The prediction of protein secondary structure has been an important bioinformatics tool and is an essential component of the template-based protein tertiary structure prediction process. Predicted secondary structure information is known to improve both fold recognition performance and alignment accuracy. In this paper, we describe several novel ideas that may improve the prediction accuracy. The main idea is motivated by the observation that a protein's structural information, especially when combined with evolutionary information, significantly improves the accuracy of the predicted tertiary structure. From a non-redundant set of protein structures, we derive 'potential' parameters for protein secondary structure prediction that contain the structural information of proteins, following a procedure similar to the one used to derive the directional information table of the GOR method. These potential parameters are combined with the frequency matrices obtained by running PSI-BLAST to construct the feature vectors used to train support vector machines (SVM) as secondary structure classifiers. Moreover, the problem of huge model file size, one of the known shortcomings of SVM, is partially overcome by reducing the size of the training data: redundancy is filtered out not only at the protein level but also at the feature vector level. A preliminary result, measured by the average three-state prediction accuracy, is encouraging.

DNA Chip Database for the Korean Functional Genomics Project

  • Kim, Sang-Soo
    • 한국생물정보학회:학술대회논문집 / 한국생물정보시스템생물학회 2001년도 제2회 생물정보 워크샵 (DNA Chip Bioinformatics) / pp.11-28 / 2001
  • The Korean Functional Genomics Project focuses on stomach and liver cancers. Specimens collected by six hospital teams are used in DNA microarray experiments. Experimental conditions, spot measurement data, and the associated clinical information are stored in a relational database. The microarray database schema was developed based on EBI's ArrayExpress. A diagrammatic representation of the schema helps users navigate the many tables in the database. Field descriptions, table-to-table relationships, and other database features are also stored in the database, and a PERL interface program uses them to generate web-based input forms on the fly. As such, it is rather simple to modify the database definition and implement controlled vocabularies. This PERL program is a general-purpose utility that can be used for inputting and updating data in relational databases. It supports file upload and user-supplied filters for uploaded data. Joining related tables is implemented in JavaScript, allowing this step to be deferred to a later stage. This feature eases the burden of inputting data into a multi-table database and promotes collaborative data input among several teams. Pathological findings, clinical laboratory parameters, demographic information, and environmental factors are also collected and stored in a separate database. The same PERL program facilitated the development of this database and its user interface.

An Efficient Multi-Dimensional Index Structure for Large Data Set

  • 이병엽;유재수
    • 한국지리정보학회지 / Vol. 5, No. 2 / pp.54-68 / 2002
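  • Interest has recently grown in applications that use multi-dimensional data, such as geographic information systems, moving-object management systems, content-based video/image retrieval systems, and time-series database systems. This paper proposes the VA (vector approximate)-tree, which represents multi-dimensional feature vectors as vector approximations and then builds an index tree over them to improve search efficiency. To reduce the storage space of the overall index structure, the proposed VA-tree applies the vector approximation concept of the VA-file, and it has a tree-shaped structure so that search performance does not degrade as the data volume grows. The VA-tree is an MBR (minimum bounding rectangle)-based index structure, but it uses a partitioning method that produces no overlap between MBRs, which improves search efficiency. A performance evaluation against several existing multi-dimensional index structures shows the superiority of the proposed method.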
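
The vector-approximation concept that the abstract borrows from the VA-file can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' VA-tree: every coordinate is assumed to lie in [0, 1), the grid is uniform with `bits` bits per dimension, and the names `approximate` and `lower_bound` are hypothetical.

```python
from typing import Sequence, Tuple


def approximate(vector: Sequence[float], bits: int = 2) -> Tuple[int, ...]:
    """Quantize each coordinate in [0, 1) to a grid-cell index of `bits` bits."""
    cells = 1 << bits
    # Clamp so that a value of exactly 1.0 still maps to the last cell.
    return tuple(min(int(x * cells), cells - 1) for x in vector)


def lower_bound(query: Sequence[float], approx: Sequence[int], bits: int = 2) -> float:
    """Smallest possible squared Euclidean distance from `query` to any
    point inside the grid cell described by `approx`."""
    cells = 1 << bits
    total = 0.0
    for q, c in zip(query, approx):
        lo, hi = c / cells, (c + 1) / cells
        if q < lo:
            total += (lo - q) ** 2
        elif q > hi:
            total += (q - hi) ** 2
        # If lo <= q <= hi, this dimension contributes nothing.
    return total
```

A search can discard any vector whose `lower_bound` already exceeds the best exact distance found so far, so only a few full vectors need to be read.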

Same music file recognition method by using similarity measurement among music feature data

  • 성보경;정명범;고일주
    • 한국컴퓨터정보학회논문지 / Vol. 13, No. 3 / pp.99-106 / 2008
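  • Digital music search is now used in a variety of areas, such as web portals and paid music services. Conventional digital music search relies on the meta-information embedded in the music data itself. However, accurate search is difficult when that meta-information is written inconsistently or is missing. To address this problem, content-based information retrieval techniques that use the music itself have recently been studied. This paper discusses a method for recognizing the same music source by measuring the similarity between feature information extracted from the music waveform. The feature information of digital music was extracted from the waveform using a simplified MFCC (Mel Frequency Cepstral Coefficient). The similarity between pieces of digital music was measured with DTW (Dynamic Time Warping), a technique previously used in vision and speech recognition. To verify the proposed method, 500 searches were run against 1,000 songs randomly sampled from the same genre, and all of them succeeded. The 500 digital audio files used as queries were produced from 60 digital music sources by combining different compression formats and bit rates. The experimental results demonstrate that DTW-based similarity measurement can recognize the same music source.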
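
The DTW similarity measure named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each track is assumed to be already reduced to a sequence of equal-length feature frames (for example MFCC vectors), the local cost is the Euclidean distance between frames, and `dtw_distance` is a hypothetical name.

```python
import math
from typing import Sequence


def dtw_distance(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]) -> float:
    """Classic dynamic time warping cost between two frame sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i frames of a
    # with the first j frames of b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of a repeated against b[j-1]
                cost[i][j - 1],      # frame of b repeated against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]
```

Two encodings of the same recording should give a small distance even when their frame counts differ, because the warping path can stretch one sequence against the other.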

Fast Array Architecture with Improved Reconfigurability

  • 이재익;김진상;조원경;김영수
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 2004년도 하계종합학술대회 논문집(2) / pp.451-454 / 2004
  • Reconfigurable architectures are increasingly important for the design of multi-mode communication systems and computation-intensive DSP systems. The proposed coarse-grain architecture is based on a reconfigurable processing element (PE) consisting of a MAC unit, a register file, a context data register, and PE interconnect control blocks. The main feature of the proposed architecture is the loop context, which enables faster configuration. We also propose another area-efficient reconfigurable architecture with improved reconfigurability. SystemC modeling results show that the proposed architecture takes 9 fewer clock cycles for the 2D DCT than existing architectures.

A Method for Identifying Splice Sites and Translation Start Sites in Human Genomic Sequences

  • Kim, Ki-Bong;Park, Kie-Jung;Kong, Eun-Bae
    • BMB Reports / Vol. 35, No. 5 / pp.513-517 / 2002
  • We describe a new method for identifying the sequences that signal the start of translation, and the boundaries between exons and introns (donor and acceptor sites) in human mRNA. According to the mandatory keyword, ORGANISM, and feature key, CDS, a large set of standard data for each signal site was extracted from the ASCII flat file, gbpri.seq, in the GenBank release 108.0. This was used to generate the scoring matrices, which summarize the sequence information for each signal site. The scoring matrices take into account the independent nucleotide frequencies between adjacent bases in each position within the signal site regions, and the relative weight on each nucleotide in proportion to their probabilities in the known signal sites. Using a scoring scheme that is based on the nucleotide scoring matrices, the method has great sensitivity and specificity when used to locate signals in uncharacterized human genomic DNA. These matrices are especially effective at distinguishing true and false sites.
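
The idea of scoring candidate signal sites against a matrix built from known sites can be sketched as follows. This is a simplified illustration, not the paper's scoring scheme: it assumes independent positions, a uniform background frequency of 0.25, and a pseudocount of 1, and it does not model the adjacent-base information the abstract mentions. The function names and the toy training sites are hypothetical.

```python
import math
from typing import Dict, List, Sequence

BASES = "ACGT"


def build_matrix(sites: Sequence[str], pseudocount: float = 1.0) -> List[Dict[str, float]]:
    """Build a log-odds matrix from aligned, equal-length known sites."""
    matrix = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log-odds of the observed frequency against a uniform background.
        matrix.append({b: math.log2((counts[b] / total) / 0.25) for b in BASES})
    return matrix


def score(matrix: List[Dict[str, float]], window: str) -> float:
    """Sum the per-position log-odds for one candidate window."""
    return sum(column[base] for column, base in zip(matrix, window))


def best_site(matrix: List[Dict[str, float]], sequence: str) -> int:
    """Return the start index of the highest-scoring window in `sequence`."""
    width = len(matrix)
    starts = range(len(sequence) - width + 1)
    return max(starts, key=lambda i: score(matrix, sequence[i:i + width]))
```

The sketch only accepts the four unambiguous bases; handling `N` or other IUPAC codes would need an explicit policy.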

ChroView: A Trace Viewer for Browsing and Editing Chromatogram files

  • Tae, Hong-Seok;Kong, Eun-Bae;Park, Kie-Jung
    • Genomics & Informatics / Vol. 5, No. 1 / pp.30-31 / 2007
  • Many visualization tools have been designed to aid information processing during whole genome projects. We have developed a trace viewer program, ChroView, which can read a chromatogram file and display the chromatogram traces of the four bases. The program can be used to examine sequencing quality and base-calling errors. It can also help researchers to edit and save base-calling results while browsing the traces. Additionally, this program has a basecalling feature which can produce supplementary data for validation of the results from other base-calling programs.

Metadata Processing Technique for Similar Image Search of Mobile Platform

  • Seo, Jung-Hee
    • Journal of information and communication convergence engineering / Vol. 19, No. 1 / pp.36-41 / 2021
  • Text-based image retrieval is not only cumbersome, because it requires the user to enter keywords manually, but is also limited in how well keywords capture meaning. Content-based image retrieval, in contrast, uses visual processing by a computer to solve the problems of text retrieval more fundamentally. Vision applications such as the extraction and mapping of image characteristics require the processing of a large amount of data in a mobile environment, which makes efficient power consumption difficult. Hence, an effective image retrieval method for mobile platforms is proposed herein. To give keywords inserted into images a visual meaning, the method improves retrieval efficiency by extracting keywords from the exchangeable image file format metadata of images retrieved through content-based similar image retrieval and then automatically adding those keywords to images captured on mobile devices. Additionally, users can manually add or modify keywords in the image metadata.

A Moving Synchronization Technique for Virtual Target Overlay

  • 김계영;장석우
    • 인터넷정보학회논문지 / Vol. 7, No. 4 / pp.45-55 / 2006
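  • For realistic simulated training, this paper proposes a method for displaying virtual targets according to a specified scenario on ground-based CCD camera images rather than on virtual images. To this end, a realistic 3D model (for the operator) was generated from high-resolution GeoTIFF (Geographic Tag Image File Format) satellite images and DTED (Digital Terrain Elevation Data), and roads were extracted from the input CCD images (for the operator trainee). However, satellite images and ground-based sensor images differ considerably in viewpoint, resolution, and scale, which makes feature-based matching difficult. Therefore, this paper proposes a moving synchronization method that applies TPS (Thin-Plate Spline) interpolation, an image warping function, to two matched sets of control points so that the target is also displayed on the CCD image along the movement path marked on the 3D model. Experiments with satellite and CCD images of the Daejeon area demonstrated the validity of the proposed algorithm.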

A novel, reversible, Chinese text information hiding scheme based on lookalike traditional and simplified Chinese characters

  • Feng, Bin;Wang, Zhi-Hui;Wang, Duo;Chang, Ching-Yun;Li, Ming-Chu
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 8, No. 1 / pp.269-281 / 2014
  • Compared with hiding information in a digital image, hiding information in a digital text file requires less storage space and less bandwidth for data transmission, and it is widely applicable. However, text files have low redundancy, so it is more difficult to hide information in them. To overcome this difficulty, Wang et al. proposed a reversible information hiding scheme using left-right and up-down representations of Chinese characters, but, when implemented, the scheme does not provide good visual steganographic effectiveness, and the embedding and extracting processes are too complicated to be done with reasonable effort and cost. We observed that many traditional and simplified Chinese characters look nearly the same (so-called lookalikes), so we use this feature to propose a novel information hiding scheme that hides secret data in lookalike Chinese characters. Compared with Wang et al.'s scheme, the proposed scheme significantly simplifies the embedding and extracting procedures and improves the effectiveness of visual steganographic images. The experimental results demonstrate the advantages of the proposed scheme.
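
The general idea of carrying one bit per lookalike character can be sketched as follows. This is a hedged illustration, not the scheme from the paper: the four-entry pair table is a hypothetical sample, the rule "simplified form = 0, traditional form = 1" is an assumption, and the sketch does not implement the reversibility (cover recovery) that the paper claims.

```python
# Hypothetical sample table: simplified form -> lookalike traditional form.
PAIRS = {
    "\u6237": "\u6236",  # 户 -> 戶
    "\u5415": "\u5442",  # 吕 -> 呂
    "\u522b": "\u5225",  # 别 -> 別
    "\u5434": "\u5433",  # 吴 -> 吳
}
TO_SIMPLIFIED = {trad: simp for simp, trad in PAIRS.items()}


def embed(cover: str, bits: str) -> str:
    """Write `bits` into the cover text, one bit per lookalike character."""
    out = []
    used = 0
    for ch in cover:
        simp = TO_SIMPLIFIED.get(ch, ch)
        if simp in PAIRS and used < len(bits):
            # Assumed rule: traditional form encodes 1, simplified encodes 0.
            out.append(PAIRS[simp] if bits[used] == "1" else simp)
            used += 1
        else:
            out.append(ch)
    if used < len(bits):
        raise ValueError("cover text has too few lookalike characters")
    return "".join(out)


def extract(stego: str, n_bits: int) -> str:
    """Read back the first `n_bits` hidden bits from the stego text."""
    bits = []
    for ch in stego:
        if len(bits) == n_bits:
            break
        if ch in PAIRS:
            bits.append("0")
        elif ch in TO_SIMPLIFIED:
            bits.append("1")
    return "".join(bits)
```

The receiver must know the payload length (`n_bits`) in advance; a real scheme would need an agreed length field or terminator.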