Bioinformatics and Biosystems
Korean Society for Bioinformatics (한국생명정보학회)
- Semiannual
- 1738-9798(pISSN)
Aim & Scope
Aims: Interdisciplinary Bio Central (IBC) aims to provide an interdisciplinary medium for open, interactive, and rapid communication and for archiving scientific and technological achievements in interdisciplinary bioscience and bioengineering.
Scope: IBC covers areas of experimental, theoretical, fundamental, and applied science and engineering dedicated to understanding, analyzing, and applying biological phenomena and medical issues from interdisciplinary perspectives. The areas covered by IBC include but are not limited to: Bioinformatics/Computational biology/Molecular modeling, Systems biology, Cheminformatics/Chemical biology, Omics (Physiomics/metabolomics/proteomics/genomics), Synthetic biology, Biophysics, Biomathematics/Mathematical Biology and Medicine, Biomechanics/Biomachine interface/Biorobotics, Bioimaging, Biomaterials/Biomimetics, Bioelectronics, Medical informatics, Biological computation/Database, Pharmaceutical bioscience and technology, Resources, Nano-bio and Nano-medical science and technology, Environmental bioscience and technology/Bioenergy, and Biological frontiers. In addition, any biological paper will be considered for publication if it may promote interdisciplinary thinking and research.
Volume 1, Issue 1
-
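Modern biology is awash in a flood of information. Systematically organizing, understanding, and making sense of this outpouring of information is very important. Bioinformatics is a new discipline, and a forward-looking interdisciplinary field, that seeks to systematize this information using methods from mathematics, computer science, and information science.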
-
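DNA microarray technology can probe the expression of thousands of genes simultaneously. Before analysis, data obtained with this technology require three preprocessing steps: background correction, normalization, and summarization. Normalization is the technique used to identify and remove the systematic noise that technical issues introduce into a microarray experiment, and many methods have been proposed for it. Many summarization methods have also been studied for the analysis of microarray data. This article analyzes and compares the characteristics of these normalization and summarization methods.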
-
Recent progress in the development of non-invasive imaging technologies continues to strengthen the role of molecular imaging in biological research. These tools have recently been validated in a variety of research models and have been shown to provide continuous quantitative monitoring of the location(s), magnitude, and time-variation of gene delivery and/or expression. This article reviews the use of radionuclide, magnetic resonance, and optical imaging technologies for imaging gene delivery and gene expression in molecular imaging applications. The studies published to date demonstrate that non-invasive imaging tools will help to accelerate pre-clinical model validation and allow for clinical monitoring of human diseases.
-
Chung, Joon-Yong; Kim, Nari; Joo, Hyun; Youm, Jae-Boum; Park, Won-Sun; Lee, Sang-Kyoung; Warda, Mohamad; Han, Jin
Recent studies in molecular biology and proteomics have identified a significant number of novel diagnostic, prognostic, and therapeutic disease markers. However, validating these markers in clinical specimens with traditional histopathological techniques is low-throughput, time-consuming, and labor-intensive. Tissue microarrays (TMAs) offer a means of combining tens to hundreds of tissue specimens onto a single slide for simultaneous analysis. This capability is particularly pertinent in cancer research for verifying targets obtained from cDNA microarrays and protein expression profiling of tissues, as well as in epidemiology-based investigations using histochemical/immunohistochemical staining or in situ hybridization. In combination with automated image analysis, TMA technology can be used in the global cellular network analysis of tissues. This potential has generated particular excitement in cardiovascular disease research. The following review discusses recent advances in the construction and application of TMAs and the opportunity for developing novel, highly sensitive diagnostic tools for the early detection of cardiovascular disease.
-
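This paper presents a client-server system architecture for inferring large-scale gene interaction networks based on Bayesian networks. Inferring a genome-wide gene interaction network in the form of a Bayesian network typically takes tens of hours, even on a parallel server. A batch, distributed system architecture is therefore better suited to the task than the usual interactive, standalone architecture. We describe the results of implementing a loosely coupled client-server system suited to this situation. Gene interaction network inference proceeds in two main steps. First, biological annotation and gene expression data are used to divide the full set of genes into modules that may overlap. Second, Bayesian learning is performed independently on each module, and the per-module inference results are merged into a single network using the genes that the modules share.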
-
With advances in experimental techniques in molecular biology, genomic and protein sequence databases are growing exponentially in size, and mean sequence lengths are also increasing. As these databases grow, it becomes difficult to search them for sequences with significant homology to a query sequence. In this paper, we present an N-gram indexing method for retrieving similar sequences quickly and precisely, with accuracy comparable to existing tools. The method regards a protein sequence as text written in a language of 20 amino acid codes and adopts fixed-length N-gram tokens as its indexing scheme for sequence strings. Once such tokens are indexed for all the sequences in the database, sequences can be searched with information retrieval algorithms. Using this method, we have developed a protein sequence search system named ProSeS (PROtein Sequence Search). ProSeS is a protein sequence analysis system that provides overall analysis results such as similar sequences with significant homology, predicted subcellular locations of the query sequence, and major keywords extracted from the annotations of similar sequences. We show experimentally that the N-gram indexing approach significantly reduces retrieval time and that it is as accurate as the popular search tool BLAST.
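The N-gram indexing idea in the abstract above can be illustrated with a small inverted index. This is only a minimal sketch, not the ProSeS implementation: the function names, the choice of N=3, the toy sequences, and the ranking by count of shared tokens are all assumptions made for illustration, and the abstract does not say which information retrieval scoring ProSeS actually uses.

```python
from collections import Counter, defaultdict


def ngrams(sequence, n=3):
    """Return all overlapping fixed-length N-gram tokens of a sequence."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


def build_index(database, n=3):
    """Map each N-gram token to the set of sequence IDs that contain it."""
    index = defaultdict(set)
    for seq_id, sequence in database.items():
        for token in ngrams(sequence, n):
            index[token].add(seq_id)
    return index


def search(query, index, n=3):
    """Rank sequences by how many distinct query N-grams they share.

    Assumption: a simple shared-token count stands in for whatever
    retrieval scoring the real system applies.
    """
    hits = Counter()
    for token in set(ngrams(query, n)):
        for seq_id in index.get(token, ()):
            hits[seq_id] += 1
    return hits.most_common()


# Toy database; the IDs and sequences are made up for illustration.
database = {
    "seqA": "MKTAYIAKQR",
    "seqB": "MKTAYLLKQR",
    "seqC": "GGGSSGGGSS",
}
index = build_index(database)
results = search("MKTAYIAK", index)
```

Because the index is built once, each query only touches the posting sets of its own tokens instead of scanning every sequence, which is where the retrieval-time saving would come from.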
-
The paradigm in biology is currently changing from conducting hypothesis-driven individual experiments to utilizing the results of massive data analysis with appropriate computational tools. We present LayMap, an implemented visualization system that helps the user deal with a high volume of biomedical literature, such as MEDLINE, through layered maps constructed from the results of an information extraction system. LayMap also utilizes filtering and granularity for an enhanced view of the results. Since a biomedical information extraction system provides a focused and effective way of slicing up the data space, combining LayMap with such a system can help the user navigate the data space in a speedy and guided manner. As a case study, we applied the system to datasets of journal abstracts on 'MAPK pathway' and 'bufalin' from MEDLINE. With the proposed visualization, we successfully rediscovered pathway maps of reasonable quality for ERK, p38 and JNK. Furthermore, with respect to bufalin, we were able to identify, at a high level of detail, the potentially interesting relation between the Chinese medicine Chan su and apoptosis.
-
We propose a new fragment assembly program MLP (mate-based layout with PHRAP). MLP consists of PHRAP, repeat masking, and a new layout algorithm that uses the mate pair information. Our experimental results show that by using MLP instead of PHRAP, we can significantly reduce the difference between the assembled sequence and the original genome sequence.
-
Effect of missing values in detecting differentially expressed genes in a cDNA microarray experiment
The aim of this paper is to discuss the effect of missing values in detecting differentially expressed genes in a cDNA microarray experiment in the context of a one-sample problem. We conducted a cDNA microarray experiment to detect differentially expressed genes for the metastasis of colorectal cancer based on twenty patients who underwent liver resection due to liver metastasis from colorectal cancer. Total RNAs from metastatic liver tumor and adjacent normal liver tissue from a single patient were labeled with Cy5 and Cy3, respectively, and competitively hybridized to a cDNA microarray with 7775 human genes. We used $M=\log_2(R/G)$ for the signal evaluation, where R and G denote the fluorescent intensities of the Cy5 and Cy3 dyes, respectively. The statistical problem comprises a one-sample test of $E(M)=0$ for each gene and involves multiple tests. The twenty cDNA microarray data sets would form a matrix of dimension 7775 by 20 if there were no missing values. However, missing values occur for various reasons. For each gene, the no missing proportion (NMP) was defined as the proportion of non-missing values out of twenty. In detecting differentially expressed (DE) genes, we first used the genes whose NMP is greater than or equal to 0.4 and then increased the NMP threshold sequentially by 0.1 to investigate its effect on the detection of DE genes. For each fixed NMP, we imputed the missing values with the K-nearest neighbor method (K=10) and applied the nonparametric t-test of Dudoit et al. (2002), SAM by Tusher et al. (2001), and the empirical Bayes procedure of Lönnstedt and Speed (2002) to determine the effect of missing values on the final outcome. These three procedures yielded substantially concordant results in detecting DE genes. Of the three, we used SAM to explore the acceptable NMP level. The results showed that the optimum NMP for this data set was 80%. It is preferable to find the optimum NMP for each data set by applying the method described in this note, when the plot of (NMP, number of overlapping genes) shows a turning point.
-
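For the missing-value abstract above, the signal $M=\log_2(R/G)$ and the NMP filter can be sketched in a few lines. This is a hedged illustration only: the helper names, the use of `None` to mark a missing spot, the toy intensities, and the thresholds swept are assumptions, and the K-nearest neighbor imputation and the three test procedures are deliberately left out.

```python
import math


def log_ratio(red, green):
    """M = log2(R/G); None when an intensity is missing or non-positive."""
    if red is None or green is None or red <= 0 or green <= 0:
        return None
    return math.log2(red / green)


def non_missing_proportion(values):
    """NMP: fraction of non-missing (non-None) entries in one gene's row."""
    return sum(v is not None for v in values) / len(values)


def filter_by_nmp(matrix, threshold):
    """Keep only the genes whose NMP is at least the given threshold."""
    return {gene: row for gene, row in matrix.items()
            if non_missing_proportion(row) >= threshold}


# Toy (R, G) intensity pairs for five arrays; None marks a missing spot.
# All gene names and values are made up for illustration.
raw = {
    "gene1": [(800, 200), (400, 100), (1600, 400), (200, 50), (400, 100)],
    "gene2": [(100, 100), None, None, None, (50, 100)],
    "gene3": [(300, 600), (100, 200), None, (50, 100), (20, 40)],
}

# Gene-by-array matrix of M values, with missing entries kept as None.
m_matrix = {
    gene: [log_ratio(*pair) if pair is not None else None for pair in pairs]
    for gene, pairs in raw.items()
}

# Sweep a few NMP thresholds and record which genes survive each one.
kept = {t: sorted(filter_by_nmp(m_matrix, t)) for t in (0.4, 0.5, 0.8, 1.0)}
```

In a full analysis, each filtered matrix would then be imputed and tested, and the number of DE genes that overlap between thresholds would be plotted against NMP to look for the turning point the abstract mentions.
-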
This paper describes a genetic algorithm for predicting RNA structures that contain various types of pseudoknots. Pseudoknotted RNA structures are much more difficult to predict computationally than RNA secondary structures, as they are more complex and the analysis is time-consuming. We developed an efficient genetic algorithm to predict RNA folding structures containing any type of pseudoknot, as well as a novel initial population method that decreases computational complexity and increases the accuracy of the results. We also used an interaction filter to decrease the size of the possible stem lists for long RNA sequences. We predicted RNA structures using a number of different termination conditions and compared the validity of the results and the times required for the analyses. The algorithm proved able to efficiently predict RNA structures containing various types of pseudoknots.
-
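The most widely used protein structure prediction method today is comparative modeling. Improving the accuracy of comparative modeling requires accurate alignments. In the fold-recognition step of comparative modeling, selecting a template by alignment accuracy has received less attention than simply selecting the most similar template. Recently, a measure called the shift score, based on shift information between two alignments, was developed to express alignment quality. We developed a method to predict the shift score, which can serve as a first step toward more accurate structure prediction. Support vector regression (SVR) was used to predict the shift score. The alignment between a template of length n in a pre-built library and a query protein of unknown structure is converted into an input vector of dimension n+2. The structural alignment was assumed to be the best alignment, and the SVR was trained to predict the shift score between the structural alignment and the profile-profile alignment of the query and template proteins. Prediction accuracy was measured with the Pearson correlation coefficient. The trained SVR achieved a Pearson correlation coefficient of 0.80 between the actual and predicted shift scores.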
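The evaluation step in the abstract above, comparing predicted and actual shift scores with the Pearson correlation coefficient, can be sketched as follows. The function name and the sample score lists are hypothetical and chosen only for illustration; they are not data from the paper, and the SVR training itself is not shown.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of centered products and squares (the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical actual vs. predicted shift scores, for illustration only.
actual = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.7, 0.75]
r = pearson(actual, predicted)
```

A value of `r` near 1 means the predicted scores rank and scale with the actual ones; the abstract reports 0.80 for the trained SVR.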