• Title/Summary/Keyword: feature similarity

Classification Protein Subcellular Locations Using n-Gram Features (단백질 서열의 n-Gram 자질을 이용한 세포내 위치 예측)

  • Kim, Jinsuk
    • Proceedings of the Korea Contents Association Conference / 2007.11a / pp.12-16 / 2007
  • The function of a protein is closely correlated with its subcellular location(s). Given a protein sequence, determining its subcellular location is therefore a vitally important problem. We have developed a new prediction method for protein subcellular location(s) based on n-gram feature extraction and the k-nearest neighbor (kNN) classification algorithm. It assigns a protein sequence to one or more subcellular compartments based on the locations of the top k sequences that show the highest similarity weights against the input sequence. The similarity weight is a similarity measure determined by comparing n-gram features between two sequences. Currently our method extracts penta-grams as features of protein sequences, computes scores for the potential localization site(s) using the kNN algorithm, and finally presents the locations and their associated scores. We constructed a large-scale data set of protein sequences with known subcellular locations from the SWISS-PROT database. This data set contains 51,885 entries with one or more known subcellular locations. Our method shows a very high prediction precision of about 93% on this data set and, compared with other methods, it also showed a comparable improvement in prediction on a test collection used in a previous work.
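
The n-gram and kNN steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not define the similarity weight, so Jaccard overlap of penta-gram sets is used here as an assumed stand-in, and the function names and the summed-weight scoring rule are hypothetical.

```python
from collections import defaultdict


def ngrams(seq, n=5):
    """Return the set of overlapping n-grams (penta-grams by default)."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def similarity(a, b, n=5):
    """Jaccard overlap of n-gram sets (assumed stand-in for the similarity weight)."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)


def predict_locations(query, training, k=3, n=5):
    """Score each location by summing similarity over the k nearest sequences.

    `training` is a list of (sequence, [locations]) pairs.
    Returns (location, score) pairs sorted by descending score.
    """
    weighted = [(similarity(query, seq, n), locs) for seq, locs in training]
    nearest = sorted(weighted, key=lambda pair: pair[0], reverse=True)[:k]
    scores = defaultdict(float)
    for weight, locs in nearest:
        for loc in locs:
            scores[loc] += weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `predict_locations(query, training, k=3)` returns every location seen among the three most similar training sequences, each with its accumulated score, which matches the multi-label output the abstract describes.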

Identification and Characterization of Agar-degrading Vibrio sp. GNUM08123 Isolated from Marine Red Macroalgae (한천분해 미생물 Vibrio sp. GNUM08123의 동정 및 agarase 생산의 발효적 특성)

  • Chi, Won-Jae;Kim, Yoon Hee;Kim, Jong-Hee;Hong, Soon-Kwang
    • Microbiology and Biotechnology Letters / v.45 no.3 / pp.243-249 / 2017
  • An agar-degrading bacterium, designated as strain GNUM08123, was isolated from samples of red algae collected from Yongil Bay near the East Sea, Korea. The isolated strain was gram-negative, aerobic, motile, and beige-pigmented, with $C_{16:0}$ (25.9%) and summed feature 3 (comprising $C_{16:1}\,\omega 7c$/iso-$C_{15:0}$ 2-OH, 34.4%) as its major cellular fatty acids. A similarity search based on the 16S rRNA gene sequence revealed that it belonged to the class Gammaproteobacteria and shared 97.7% similarity with the type strain Vibrio chagasii R-3712$^T$. The DNA G+C content of strain GNUM08123$^T$ was 46.9 mol%. The major isoprenoid quinone was ubiquinone-8. The results of DNA-DNA relatedness and 16S rRNA sequence similarity analyses, together with its phenotypic and chemotaxonomic characteristics, suggest that strain GNUM08123 is a novel species within the genus Vibrio, designated as Vibrio sp. GNUM08123. Agarase production by strain GNUM08123 was induced by agar and sucrose, but was repressed by glucose and maltose, probably owing to carbon catabolite repression.

Implementation and Performance Evaluation of Self-Similar Traffic Generator Using OPNET (OPNET을 이용한 자기유사성 트래픽 발생기 설계 및 성능 평가)

  • Han Kyeong-Eun;Jung Kwang-Bon;Lee Seung-Hyun;Kim Young-Chon
    • The Journal of Korean Institute of Communications and Information Sciences / v.31 no.5A / pp.441-450 / 2006
  • Recently, with the exponential growth in the number of Internet users, IP traffic, which accounts for more than 90 percent of all Internet traffic, significantly affects network performance. Therefore, designing a self-similar traffic generator that reflects the characteristics of IP traffic is very important for designing networks efficiently and evaluating their performance correctly. In this paper, we design a self-similar traffic generator using OPNET. To implement the self-similar characteristics, ON-OFF sources with Pareto distribution are employed and aggregated. The designed self-similar traffic generator is evaluated and verified with R/S plots and variance-time (VT) plots under various offered loads and numbers of sources. The designed generator is expected to be of practical use when wired or wireless networks are designed and verified, and to be useful for deciding specific parameter values for Internet traffic modeling.
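
The aggregation of ON-OFF sources with Pareto-distributed periods can be sketched in plain Python, independent of OPNET. This is only an illustration of the general technique: the shape parameters, the slot-based time model, and all function names are assumptions, not values or interfaces taken from the paper.

```python
import random


def pareto_duration(alpha, minimum, rng):
    """Draw a Pareto-distributed duration by inverse-transform sampling."""
    u = 1.0 - rng.random()  # lies in (0, 1], so the division below is safe
    return minimum / (u ** (1.0 / alpha))


def on_off_source(total_slots, alpha_on, alpha_off, rng, minimum=1.0):
    """Return a 0/1 list: 1 while the source is ON (sending), 0 while OFF."""
    slots = []
    on = rng.random() < 0.5  # random initial state
    while len(slots) < total_slots:
        alpha = alpha_on if on else alpha_off
        remaining = total_slots - len(slots)
        # Cap the period so a heavy-tailed draw cannot exceed the trace length.
        length = min(remaining, max(1, round(pareto_duration(alpha, minimum, rng))))
        slots.extend([1 if on else 0] * length)
        on = not on
    return slots


def aggregate(num_sources, total_slots, alpha_on=1.4, alpha_off=1.2, seed=0):
    """Sum the per-slot output of several independent ON-OFF sources."""
    rng = random.Random(seed)
    sources = [on_off_source(total_slots, alpha_on, alpha_off, rng)
               for _ in range(num_sources)]
    return [sum(column) for column in zip(*sources)]
```

The list returned by `aggregate` is the per-slot load of the combined traffic, which is the kind of series that an R/S or variance-time plot would then be computed from.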

Ontology Alignment based on Parse Tree Kernel using Structural and Semantic Information (구조 및 의미 정보를 활용한 파스 트리 커널 기반의 온톨로지 정렬 방법)

  • Son, Jeong-Woo;Park, Seong-Bae
    • Journal of KIISE:Software and Applications / v.36 no.4 / pp.329-334 / 2009
  • Ontology alignment has two major problems. First, the features used for ontology alignment are usually defined by experts, so it is highly possible for some critical features to be excluded from the feature set. Second, the semantic and structural similarities are usually computed independently and then combined in an ad-hoc way in which the weights are determined heuristically. This paper proposes the modified parse tree kernel (MPTK) for ontology alignment. To compute the similarity between entities in the ontologies, a tree is adopted as the representation of an ontology. After an ontology is transformed into a set of trees, their similarity is computed using MPTK without explicit enumeration of features. In computing the similarity between trees, approximate string matching is adopted to naturally reflect not only the structural information but also the semantic information. According to a series of experiments with a standard data set, the kernel method outperforms other structural similarities such as GMO. In addition, the proposed method shows state-of-the-art performance in ontology alignment.

Evaluation Model for Gap Analysis Between NCS Competence Unit Element and Traditional Curriculum (NCS 능력단위 요소와 기존 교육과정 간 갭 분석을 위한 평가모델)

  • Kim, Dae-kyung;Kim, Chang-Bok
    • Journal of Advanced Navigation Technology / v.19 no.4 / pp.338-344 / 2015
  • The national competency standards (NCS) systematize and standardize the skills required to perform a job. The NCS has developed learning modules that are specified and standardized by competence unit element, which is the unit of specific job competency. To use competence unit elements in education and training, the existing curriculum must undergo gap analysis against them. Existing gap analysis has been performed subjectively by experts. Expert gap analysis raises problems of subjective judgment, lack of accuracy, and temporal and spatial inefficiency caused by psychological factors. This paper proposes an automated evaluation model to resolve the problems of subjective evaluation. It uses index term extraction, term frequency-inverse document frequency (TF-IDF) for feature value extraction, and the cosine similarity algorithm for gap analysis between the existing curriculum and competence unit elements. It presents a similarity mapping table between the existing curriculum and competence unit elements. The evaluation model should be complemented by an improved algorithm in terms of structural characteristics and speed.
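
The TF-IDF and cosine similarity steps named in this abstract can be sketched as follows. This is an illustrative sketch only: the smoothed IDF formula is one common variant chosen here as an assumption, index term extraction is reduced to pre-tokenized word lists, and the function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of tokenized documents."""
    n = len(docs)
    # Document frequency: the number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc) or 1
        vectors.append({
            term: (count / total) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

A similarity mapping table of the kind the abstract mentions could then be filled by calling `cosine` on every pair of curriculum-subject vector and competence-unit-element vector.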

A Content-Based Image Retrieval using Object Segmentation Method (물체 분할 기법을 이용한 내용기반 영상 검색)

  • 송석진;차봉현;김명호;남기곤;이상욱;주재흠
    • Journal of the Institute of Convergence Signal Processing / v.4 no.1 / pp.1-8 / 2003
  • In recent years, various methods have been studied to manage and apply the multimedia information that is rapidly increasing across all fields of society. For retrieval of still images, this paper implements a content-based image retrieval system that, when a user submits a query, segments the query object from the background and then retrieves similar objects from an image database. The query image is first median-filtered to remove noise, and then object edges are detected by Canny edge detection. The query object is segmented from the background using a convex hull. A similarity value is obtained by histogram intersection with the database image after computing a color histogram from the segmented image. The segmented image is also converted to grayscale and wavelet-transformed to extract the spatial gray distribution and texture features, from which a further similarity value is obtained by means of the banded autocorrelogram and energy. The final similar images are retrieved by adding the above similarity values; using the object segmentation method makes retrieval both robust to the background and more accurate for the object.
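
The color-histogram comparison step can be sketched as below. This covers only histogram intersection on already segmented pixels. The bin count, the joint RGB binning, and the function names are assumptions for illustration, and the filtering, edge detection, and convex hull stages are not shown.

```python
def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram; pixels are (r, g, b) tuples in 0-255."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each 0-255 channel value to a bin index in 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[ri * bins * bins + gi * bins + bi] += 1.0
    total = len(pixels)
    return [count / total for count in hist] if total else hist


def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

Because both histograms are normalized, the intersection lies between 0 (no shared colors) and 1 (identical color distributions), so it can be added to other similarity values on a comparable scale.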

Super-Pixels Generation based on Fuzzy Similarity (퍼지 유사성 기반 슈퍼-픽셀 생성)

  • Kim, Yong-Gil;Moon, Kyung-Il
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.17 no.2 / pp.147-157 / 2017
  • In recent years, super-pixels have become very popular in computer vision applications. A super-pixel algorithm groups pixels into perceptually meaningful regions to reduce the rigidity of the pixel grid. In particular, super-pixels are useful for depth estimation, skeleton work, body labeling, and feature localization. However, it is not easy to generate a good super-pixel partition for these tasks. In particular, super-pixels do not satisfy more meaningful features from the viewpoint of gestalt aspects such as non-sum, continuation, closure, and perceptual constancy. In this paper, we suggest an advanced algorithm that combines simple linear iterative clustering with fuzzy clustering concepts. Simple linear iterative clustering adheres well to image boundaries and is faster and more memory-efficient than conventional methods. However, it does not give the super-pixel shapes good compactness and regularity in the context of gestalt aspects. Fuzzy similarity measures provide a reasonable graph in terms of bounded size and few neighbors. Thus, more compact and regular super-pixels are obtained, and locally relevant features can be extracted. Simulation shows that fuzzy-similarity-based super-pixel building represents natural features in the manner in which humans decompose images.

Design of Efficient Storage Exploiting Structural Similarity in Microarray Data (마이크로어레이 데이터의 구조적 유사성을 이용한 효율적인 저장 구조의 설계)

  • Yun, Jong-Han;Shin, Dong-Kyu;Shin, Dong-Il
    • The KIPS Transactions:PartD / v.16D no.5 / pp.643-650 / 2009
  • As one of the typical techniques for acquiring bio-information, microarrays have contributed greatly to the development of bioinformatics. Although established as a core technology in bioinformatics, microarray data is difficult to share and store because experimental data is large and has a complex type. In this paper, we propose a new method that exploits the fact that microarray data in MAGE-ML, a standard format for exchanging data, frequently contains structurally similar patterns. This method constructs a compact database by simplifying the MAGE-ML schema, using inlining techniques and newly proposed classification techniques based on the structural similarity of elements. With this method, the database structure becomes simpler, the number of table joins is reduced, and performance is enhanced.

A Performance Improvement of Automatic Butterfly Identification Method Using Color Intensity Entropy (영상의 색체 강도 엔트로피를 이용한 나비 종 자동 인식 향상 방법)

  • Kang, Seung-Ho;Kim, Tae-Hee
    • The Journal of the Korea Contents Association / v.17 no.5 / pp.624-632 / 2017
  • Automatic butterfly identification using images is an interesting research field because it greatly helps researchers who study species diversity and evolutionary and developmental processes. The performance of a butterfly species identification system depends heavily on the quality of the selected features. In this paper, we propose color intensity (CI) entropy, which uses the distribution of color intensities in a butterfly image. We show that color intensity entropy can increase the recognition rate by 10% when used together with the previously suggested branch length similarity entropy. In addition, a performance comparison with other features such as Eigenface, 2D Fourier transform, and 2D wavelet transform is conducted using several well-known machine learning methods.
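
The idea of an entropy computed from the distribution of color intensities can be illustrated with plain Shannon entropy over an intensity histogram. The abstract does not give the exact definition of CI entropy, so this is an assumed simplification, and the function name is hypothetical.

```python
import math
from collections import Counter


def intensity_entropy(intensities):
    """Shannon entropy (in bits) of the empirical intensity distribution."""
    total = len(intensities)
    counts = Counter(intensities)
    # Each distinct intensity contributes -p * log2(p) for its relative frequency p.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

An image with a single intensity value has entropy 0, and the value grows as intensities spread more evenly across levels, so the entropy summarizes how varied the coloring of a wing image is in a single number.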

The Bird Diversity and Feature by the Habitat Environment in Gotjawal area, Jeju Island, the Republic of Korea (제주도 곶자왈 지역에서 서식 환경에 따른 조류 다양성 및 특징)

  • Kim, Eun-Mi;Kang, Chang-Wan;Choi, Hyung-Soon
    • Journal of Environmental Science International / v.28 no.11 / pp.917-925 / 2019
  • All animals and plants in an ecosystem are intimately connected to one another, and changes in forests and their surroundings directly affect wild animals. This study was conducted at Cheongsu-ri in Hangyeong-myeon, located in the western part of Jeju Island and belonging to the Hangyeong Andeok Gotjawal Zone, and at Seonheul-ri in Jocheon-eup, located in the eastern part of Jeju Island and belonging to the Jocheon Hamdeok Gotjawal Zone. The bird survey was carried out twice a month from January 2014 to December 2015. We divided the habitat environments into three survey sites: forest, shrub forest, and farmland. A total of 65 species and 4,802 individuals were observed during the survey period. In the forest, 36 species and 1,287 individuals were observed, while the shrub forest had 40 species and 1,554 individuals, and the farmland had 41 species and 1,961 individuals. Ten species were observed only in the forest, seven only in the shrub forest, and ten only in the farmland. Species diversity and evenness were highest in the farmland, species richness was highest in the shrub forest, and dominance was highest in the forest. The similarity index between the shrub forest and the farmland was high, while that between the forest and the farmland was low. The similarity index related to breeding indicated that the forest and the farmland differed from each other.
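
The community indices mentioned in this abstract can be sketched as follows. The abstract does not say which formulas were used, so the choices below (Shannon diversity, Pielou evenness, Margalef richness, Berger-Parker dominance, and Sørensen similarity) are assumptions based on commonly used definitions, and the function names are hypothetical.

```python
import math


def shannon_diversity(counts):
    """Shannon diversity H' = -sum(p * ln p) over species with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """Pielou evenness J' = H' / ln(S), where S is the number of species."""
    species = sum(1 for c in counts if c > 0)
    return shannon_diversity(counts) / math.log(species) if species > 1 else 0.0


def margalef_richness(counts):
    """Margalef richness (S - 1) / ln(N), where N is the number of individuals."""
    species = sum(1 for c in counts if c > 0)
    total = sum(counts)
    return (species - 1) / math.log(total) if total > 1 else 0.0


def berger_parker_dominance(counts):
    """Share of all individuals that belong to the most abundant species."""
    return max(counts) / sum(counts)


def sorensen_similarity(species_a, species_b):
    """Sorensen index 2|A & B| / (|A| + |B|) between two sets of species."""
    a, b = set(species_a), set(species_b)
    return 2 * len(a & b) / (len(a) + len(b)) if a or b else 0.0
```

Each count list holds the number of individuals per species at one site, so the three habitat types would each get their own diversity, evenness, richness, and dominance values, while `sorensen_similarity` compares the species lists of two sites.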