• Title/Summary/Keyword: sequence databases

Sequence Analysis of Hypothetical Proteins from Helicobacter pylori 26695 to Identify Potential Virulence Factors

  • Naqvi, Ahmad Abu Turab;Anjum, Farah;Khan, Faez Iqbal;Islam, Asimul;Ahmad, Faizan;Hassan, Md. Imtaiyaz
    • Genomics & Informatics
    • /
    • v.14 no.3
    • /
    • pp.125-135
    • /
    • 2016
  • Helicobacter pylori is a Gram-negative bacterium that is responsible for gastritis in humans. Its spiral flagellated body helps in locomotion and colonization in the host environment. It is capable of living in the highly acidic environment of the stomach with the help of acid-adaptive genes. The genome of the H. pylori 26695 strain contains 1,555 coding genes that encode 1,445 proteins. Of these, 340 proteins are characterized as hypothetical proteins (HPs). This study involves extensive analysis of the HPs using an established pipeline, comprising various bioinformatics tools and databases, to find probable functions of the HPs and to identify virulence factors. After extensive analysis of all 340 HPs, we found that 104 HPs show characteristic similarities with proteins of known function. On the basis of these similarities, we assigned probable functions to 104 HPs with high confidence and precision. The predicted HPs contain representative members of diverse functional classes of proteins such as enzymes, transporters, binding proteins, regulatory proteins, proteins involved in cellular processes, and other proteins with miscellaneous functions. Therefore, we classified the 104 HPs into the aforementioned functional groups. During the virulence factor analysis of the HPs, we found that 11 HPs show significant virulence. The identification of virulence proteins with the help of their predicted functions may pave the way for drug target estimation and the development of effective drugs to counter the activity of those proteins.

Comparative chloroplast genomics and phylogenetic analysis of the Viburnum dilatatum complex (Adoxaceae) in Korea

  • PARK, Jongsun;XI, Hong;OH, Sang-Hun
    • Korean Journal of Plant Taxonomy
    • /
    • v.50 no.1
    • /
    • pp.8-16
    • /
    • 2020
  • Complete chloroplast genome sequences provide detailed information about any structural changes of the genome, instances of phylogenetic reconstruction, and molecular markers for fine-scale analyses. Recent developments of next-generation sequencing (NGS) tools have led to the rapid accumulation of genomic data, especially data pertaining to chloroplasts. Short reads deposited in public databases such as the Sequence Read Archive of the NCBI are open resources, and the corresponding chloroplast genomes are yet to be completed. The V. dilatatum complex in Korea consists of four morphologically similar species: V. dilatatum, V. erosum, V. japonicum, and V. wrightii. Previous molecular phylogenetic analyses based on several DNA regions did not resolve the relationships at the species level. In order to examine the level of variation of the chloroplast genome in the V. dilatatum complex, raw reads of V. dilatatum deposited in the NCBI database were used to reconstruct the whole chloroplast genome, and the result was compared to the genomes of V. erosum, V. japonicum, and three other species in Viburnum. These comparative genomics results found no significant structural changes in Viburnum. The degree of interspecific variation among the species in the V. dilatatum complex is very low, suggesting that the species of the complex may have differentiated recently. The species of the V. dilatatum complex share large unique deletions, providing evidence of close relationships among the species. A phylogenetic analysis of the entire genome of Viburnum showed that V. dilatatum is sister to one of two accessions of V. erosum, making V. erosum paraphyletic. Given that the overall degree of variation among the species in the V. dilatatum complex is low, the chloroplast genome may not provide a phylogenetic signal pertaining to relationships among the species.

Prediction of ORFs in Metagenome by Using Cis-acting Transcriptional and Translational Factors (메타게놈 서열에 존재하는 보존적인 전사와 번역 인자를 이용한 ORF 예측)

  • Cheong, Dea-Eun;Kim, Geun-Joong
    • KSBB Journal
    • /
    • v.25 no.5
    • /
    • pp.490-496
    • /
    • 2010
  • As sequencing technologies steadily improve, massive amounts of sequence data have accumulated in public databases. Accordingly, programs based on various algorithms have been developed to mine useful information, such as genes, operons, and regulatory factors, from these sequences. However, despite their usefulness in a wide range of applications, comprehensive analyses of metagenomes using these programs have some drawbacks and can yield inaccurate or complex results. Here we explore the possibility of using signature sequences (cis-acting transcriptional and translational factors in the metagenome) as a hallmark for finding ORFs in metagenomes.
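The idea of using a translational signal as supporting evidence for an ORF can be illustrated with a minimal sketch. The abstract does not say which signature sequences the authors used, so everything here is an assumption: the Shine-Dalgarno-like motif, the 20-base upstream window, the forward-strand-only scan, and the function name `find_orfs` are all hypothetical.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}
# Assumed signature: a Shine-Dalgarno-like ribosome binding motif (illustrative only).
SD_MOTIF = re.compile(r"AGGAGG|GGAGG|AGGAG")


def find_orfs(seq, min_codons=3, upstream_window=20):
    """Return (start, end, has_signature) for forward-strand ORFs.

    An ORF here runs from an ATG to the first in-frame stop codon.
    has_signature is True when the assumed motif occurs in the window
    immediately upstream of the start codon.
    """
    seq = seq.upper()
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3  # codons before the stop
                if n_codons >= min_codons:
                    upstream = seq[max(0, start - upstream_window):start]
                    orfs.append((start, pos + 3, bool(SD_MOTIF.search(upstream))))
                break
    return orfs


with_motif = "CCAGGAGGTTACC" + "ATGAAACCCGGGTAA" + "CC"
without_motif = "CCCCCC" + "ATGAAACCCGGGTAA" + "CC"
print(find_orfs(with_motif))
print(find_orfs(without_motif))
```

In a sketch like this, the signature flag could be used to rank or filter candidate ORFs rather than to find them outright.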

Development of a 3-D Immersion Type Training Simulator

  • Jung, Young-Beom;Park, Chang-Hyun;Jang, Gil-Soo
    • KIEE International Transactions on Power Engineering
    • /
    • v.4A no.4
    • /
    • pp.171-177
    • /
    • 2004
  • In today's information-oriented society, many people use PCs and are dependent on the databases provided by network servers. However, online data can be lost during a blackout, and power failures can greatly affect power quality. This has resulted in a trend toward interruption-free live-line work when trouble occurs in a power system. However, 83% of the population receives an electric shock experience when a laborer is performing interruption-free live-line work. Education and training problems have been identified for the interruption-free method: there are few instructors to deliver the necessary training, and trainees undergo only a short training period of just 4 weeks. In this paper, to develop a method with no restrictions on time and place and to reduce the misuse of materials, immersion-type virtual reality (or virtual environment) technology is used. The users of a 3D immersion-type VR training system can interact with the system by performing the equivalent action in a safe environment. Thus, it can be valuable to apply this training system to such dangerous work as 'interruption-free live-line work exchanging COS (Cut-Out-Switch)'. In this program, the user carries out work according to instructions delivered through the window and speaker, and cannot perform other tasks until each part of the task is completed in the proper sequence. Workers using this system can use their hands and viewpoint movement as if they were in a real environment, but the trainee cannot use all parts and senses of a real body with current VR technology. Despite these weak points, considering the trends of improvement in electrical devices and communication technology, 3D graphic VR applications have high potential.

Atrial Fibrillation Pattern Analysis based on Symbolization and Information Entropy (부호화와 정보 엔트로피에 기반한 심방세동 (Atrial Fibrillation: AF) 패턴 분석)

  • Cho, Ik-Sung;Kwon, Hyeog-Soong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.16 no.5
    • /
    • pp.1047-1054
    • /
    • 2012
  • Atrial fibrillation (AF) is the most common arrhythmia encountered in clinical practice, and its risk increases with age. Conventionally, AF has been detected by time-frequency domain analysis of RR variability. However, detection from the ECG signal is difficult because of the low amplitude of the P wave and corruption by noise. In addition, time-frequency domain analysis of RR variability has difficulty capturing the details of irregular RR interval rhythm. In this study, we describe an atrial fibrillation pattern analysis based on symbolization and information entropy. We transformed RR interval data into a symbolic sequence through differential partition, analyzed the RR interval pattern, quantified the complexity through Shannon entropy, and detected atrial fibrillation. The detection algorithm was tested using thresholds between 10 ms and 100 ms on two databases, including the MIT-BIH Atrial Fibrillation Database.
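The symbolization and entropy steps can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the three-symbol partition of successive RR differences, the 50 ms default threshold (chosen from within the 10 ms to 100 ms range mentioned above), the word length of 3, and the function names are all assumptions.

```python
import math
from collections import Counter


def symbolize(rr_ms, threshold_ms=50):
    """Map successive RR-interval differences to symbols.

    '+' : interval lengthened by more than threshold_ms
    '-' : interval shortened by more than threshold_ms
    '0' : change within the threshold
    """
    symbols = []
    for prev, curr in zip(rr_ms, rr_ms[1:]):
        diff = curr - prev
        if diff > threshold_ms:
            symbols.append("+")
        elif diff < -threshold_ms:
            symbols.append("-")
        else:
            symbols.append("0")
    return "".join(symbols)


def shannon_entropy(symbols, word_len=3):
    """Shannon entropy (bits) of overlapping words of length word_len."""
    words = [symbols[i:i + word_len] for i in range(len(symbols) - word_len + 1)]
    if not words:
        return 0.0
    total = len(words)
    counts = Counter(words)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical RR series in milliseconds (not taken from any database).
regular = [800] * 10
irregular = [800, 600, 900, 650, 1000, 700, 720, 500, 950, 800]
print(symbolize(regular), shannon_entropy(symbolize(regular)))
print(symbolize(irregular), shannon_entropy(symbolize(irregular)))
```

A perfectly regular rhythm produces a single repeated word and zero entropy, whereas an irregular rhythm spreads probability over several words. A detector along these lines would compare the entropy against a cutoff, which is not specified in the abstract.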

Discovering Temporal Relation Rules from Temporal Interval Data (시간간격을 고려한 시간관계 규칙 탐사 기법)

  • Lee, Yong-Joon;Seo, Sung-Bo;Ryu, Keun-Ho;Kim, Hye-Kyu
    • Journal of KIISE:Databases
    • /
    • v.28 no.3
    • /
    • pp.301-314
    • /
    • 2001
  • Data mining refers to a set of techniques for discovering implicit and useful knowledge from large databases. Many studies on data mining have been pursued, and some of them have addressed temporal data mining for discovering knowledge from temporal databases, such as sequential patterns, similar time sequences, and cyclic and temporal association rules. However, all of these works discover temporal patterns from data stamped with time points and do not consider discovering knowledge from temporal interval data. There are many examples of temporal interval data from which useful knowledge can be discovered, including patient histories, purchaser histories, and web logs. Allen introduced relationships between intervals and operators for reasoning about relations between intervals. We present a new data mining technique that can discover temporal relation rules in temporal interval data by using Allen's theory. In this paper, we present two new algorithms for generating temporal relation rules and discovering rules from temporal interval data. This technique can discover more useful knowledge than conventional data mining techniques.
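Allen's interval relations, on which the technique is built, can be computed directly from interval endpoints. The sketch below classifies a pair of intervals into one of the thirteen relations. It assumes proper intervals (start strictly less than end); the function name and the example data are hypothetical, and the rule-mining algorithms themselves are not reproduced here.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a with respect to interval b.

    Each interval is a (start, end) pair with start < end.
    """
    a_start, a_end = a
    b_start, b_end = b
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met-by"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Remaining cases are partial overlaps.
    return "overlaps" if a_start < b_start else "overlapped-by"


# Hypothetical patient-history intervals, in days since admission.
fever = (1, 5)
medication = (3, 8)
print(allen_relation(fever, medication))
```

Counting how often a given relation holds between two event types across many histories is one way such relations could feed a support measure for a temporal relation rule.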

Temporal Pattern Mining of Moving Objects for Location based Services (위치 기반 서비스를 위한 이동 객체의 시간 패턴 탐사 기법)

  • Lee, Jun-Uk;Baek, Ok-Hyeon;Ryu, Geun-Ho
    • Journal of KIISE:Databases
    • /
    • v.29 no.5
    • /
    • pp.335-346
    • /
    • 2002
  • Location-based services (LBS) provide location-based information to their mobile users. The primary function of these services is to provide useful information to users at a minimum cost of resources. This function can be implemented through data mining techniques. However, conventional data mining research has not considered the spatial and temporal aspects of data simultaneously. Therefore, these techniques are inappropriate for the objects of LBS, whose spatial attributes change over time. In this paper, we propose a new data mining technique for identifying temporal patterns from the series of locations of moving objects that have both temporal and spatial dimensions. We use the spatial operation contains to generalize the location of a moving point and apply time constraints between the locations of a moving object to make a valid moving sequence. The spatio-temporal technique proposed in this paper is a practical approach both for providing more useful knowledge to LBS and for improving the quality of the services.

Iceberg Query Evaluation Technique Using a Cuboid Prefix Tree (큐보이드 전위트리를 이용한 빙산질의 처리)

  • Han, Sang-Gil;Yang, Woo-Sock;Lee, Won-Suk
    • Journal of KIISE:Databases
    • /
    • v.36 no.3
    • /
    • pp.226-234
    • /
    • 2009
  • A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Due to these characteristics, it is impossible to save all the data elements of a data stream. Therefore, it is necessary to define a new synopsis structure to store the summary information of a data stream. For this purpose, this paper proposes a cuboid prefix tree that can be effectively employed in evaluating an iceberg query over data streams. A cuboid prefix tree stores only those itemsets that consist of grouping attributes used in a GROUP BY query. In addition, a cuboid prefix tree can compute multiple iceberg queries simultaneously by sharing their common sub-expressions. A cuboid prefix tree evaluates an iceberg query over an infinitely generated data stream while efficiently reducing memory usage and processing time, which is verified by a series of experiments.
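An iceberg query returns only those GROUP BY groups whose aggregate exceeds a threshold. The sketch below shows how a plain prefix tree over the grouping attributes lets queries on shared attribute prefixes reuse the same counts. It is an illustration only and not the cuboid prefix tree of the paper: the class and method names are hypothetical, counts are exact, and nothing is pruned, so memory is not bounded the way a stream synopsis would require.

```python
class Node:
    """One attribute value along a path of grouping attributes."""

    def __init__(self):
        self.count = 0
        self.children = {}


class GroupByPrefixTree:
    def __init__(self, group_attrs):
        # Fixed attribute order; a node at depth k counts the group
        # formed by the first k attributes.
        self.group_attrs = list(group_attrs)
        self.root = Node()

    def insert(self, record):
        """Add one stream element, updating counts along its path."""
        node = self.root
        for attr in self.group_attrs:
            node = node.children.setdefault(record[attr], Node())
            node.count += 1

    def iceberg(self, depth, min_count):
        """Groups over the first `depth` attributes with COUNT(*) >= min_count."""
        results = {}

        def walk(node, prefix):
            if len(prefix) == depth:
                if node.count >= min_count:
                    results[tuple(prefix)] = node.count
                return
            for value, child in node.children.items():
                walk(child, prefix + [value])

        walk(self.root, [])
        return results


tree = GroupByPrefixTree(["region", "product"])
for region, product in [("east", "a"), ("east", "a"), ("east", "b"), ("west", "a")]:
    tree.insert({"region": region, "product": product})

print(tree.iceberg(depth=1, min_count=3))  # GROUP BY region HAVING COUNT(*) >= 3
print(tree.iceberg(depth=2, min_count=2))  # GROUP BY region, product HAVING COUNT(*) >= 2
```

Both queries are answered from the same tree, which is the sense in which common sub-expressions are shared in this simplified setting.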

Finding the Workflow Critical Path in the Extended Structural Workflow Schema (확장된 구조적 워크플루우 스키마에서 워크플로우 임계 경로의 결정)

  • Son, Jin-Hyeon;Kim, Myeong-Ho
    • Journal of KIISE:Databases
    • /
    • v.29 no.2
    • /
    • pp.138-147
    • /
    • 2002
  • The concept of the critical path in a workflow is important because it can be utilized in many issues in workflow systems, e.g., workflow resource management and workflow time management. However, the critical path in the context of workflows has not been much addressed in the past. This is because control flows in a workflow, generally including sequence, parallel, alternative, iteration, and so on, are much more complex than those in an ordinary graph or network. In this paper, we first describe our workflow model, which has a considerable set of workflow control constructs. They provide sufficient expressive power for modeling the growing complexity of most of today's business processes. Then, we propose a method to systematically determine the critical path in a workflow schema built from the workflow control constructs described in our workflow model.
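To make the notion of a critical path over structured control constructs concrete, here is a minimal recursive sketch. The abstract does not describe the authors' method, so the rules below are assumptions: each task has a fixed duration, a sequence adds durations, a parallel (AND) block takes its longest branch, an alternative (XOR) block is treated as its worst-case branch, and iteration is omitted. The tuple encoding and the function name are hypothetical.

```python
def critical_path(node):
    """Return (duration, task_names) of the longest path through a workflow.

    node is one of:
      ("task", name, duration)
      ("seq", [children])   executed one after another
      ("and", [children])   parallel branches, all must finish
      ("xor", [children])   alternative branches, worst case assumed
    """
    kind = node[0]
    if kind == "task":
        _, name, duration = node
        return duration, [name]

    results = [critical_path(child) for child in node[1]]
    if kind == "seq":
        total = sum(duration for duration, _ in results)
        path = [name for _, names in results for name in names]
        return total, path
    if kind in ("and", "xor"):
        # The slowest branch determines when the block completes.
        return max(results, key=lambda result: result[0])
    raise ValueError(f"unknown construct: {kind}")


workflow = ("seq", [
    ("task", "A", 2),
    ("and", [
        ("task", "B", 5),
        ("seq", [("task", "C", 3), ("task", "D", 4)]),
    ]),
    ("task", "E", 1),
])
print(critical_path(workflow))
```

Because the constructs nest cleanly, the longest path can be computed bottom-up in a single traversal, which is what makes structured workflow schemas easier to analyze than arbitrary graphs with alternatives and loops.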

Discovery of Frequent Sequence Pattern in Moving Object Databases (이동 객체 데이터베이스에서 빈발 시퀀스 패턴 탐색)

  • Vu, Thi Hong Nhan;Lee, Bum-Ju;Ryu, Keun-Ho
    • The KIPS Transactions:PartD
    • /
    • v.15D no.2
    • /
    • pp.179-186
    • /
    • 2008
  • The convergence of location-aware devices, GIS functionalities, and the increasing accuracy and availability of positioning technologies paves the way for a range of new types of location-based services. The field of spatiotemporal data mining, where relationships are defined by spatial and temporal aspects of data, faces big challenges because of the increased search space of knowledge. Therefore, in this paper we propose algorithms for mining spatiotemporal patterns in a mobile environment. Moving patterns are generated using two algorithms called All_MOP and Max_MOP. The first mines all frequent patterns and the other discovers only maximal frequent patterns. Our proposed approach reduces processing time compared with the DFS_MINE algorithm. In addition, our approach is applicable to location-based services such as tourist services, traffic services, and so on.
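The difference between mining all frequent patterns and mining only maximal ones can be shown on a toy example. This brute-force sketch is not All_MOP or Max_MOP, whose details are not given in the abstract. It assumes that a pattern is a contiguous run of location identifiers and that support is the number of trajectories containing it; the function names and the data are hypothetical.

```python
from collections import Counter


def frequent_patterns(trajectories, min_support):
    """All contiguous location sequences found in >= min_support trajectories."""
    support = Counter()
    for traj in trajectories:
        seen = set()  # count each pattern once per trajectory
        for i in range(len(traj)):
            for j in range(i + 1, len(traj) + 1):
                seen.add(tuple(traj[i:j]))
        support.update(seen)
    return {p: s for p, s in support.items() if s >= min_support}


def is_contiguous_subpattern(small, big):
    """True if `small` occurs as a contiguous run inside `big`."""
    n = len(small)
    return any(big[i:i + n] == small for i in range(len(big) - n + 1))


def maximal_patterns(frequent):
    """Frequent patterns that are not contained in a longer frequent pattern."""
    return {
        p: s
        for p, s in frequent.items()
        if not any(p != q and is_contiguous_subpattern(p, q) for q in frequent)
    }


# Hypothetical trajectories over generalized locations A-D.
trajectories = [["A", "B", "C"], ["A", "B", "D"], ["B", "C"]]
freq = frequent_patterns(trajectories, min_support=2)
print(freq)
print(maximal_patterns(freq))
```

The maximal set is a compact summary: every frequent pattern in this setting is a contiguous part of some maximal pattern, although the supports of the shorter patterns are not retained.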