• Title/Summary/Keyword: Database Algorithm

A Practical Approximate Sub-Sequence Search Method for DNA Sequence Databases (DNA 시퀀스 데이타베이스를 위한 실용적인 유사 서브 시퀀스 검색 기법)

  • Won, Jung-Im;Hong, Sang-Kyoon;Yoon, Jee-Hee;Park, Sang-Hyun;Kim, Sang-Wook
    • Journal of KIISE:Databases / v.34 no.2 / pp.119-132 / 2007
  • In molecular biology, approximate subsequence search is one of the most important operations. In this paper, we propose an accurate and efficient method for approximate subsequence search in large DNA databases. The proposed method uses a binary trie as its primary index structure and stores all the window subsequences extracted from a DNA sequence. For approximate subsequence search, it traverses the binary trie in a breadth-first fashion and retrieves all the matched subsequences from the traversed path within the trie by a dynamic programming technique. However, the proposed method stores only window subsequences of a predetermined length, and thus incurs a large post-processing time for long query sequences. To overcome this problem, we divide a query sequence into shorter pieces, search for each piece, and then merge the results. To verify the superiority of the proposed method, we evaluated its performance in a series of experiments. The results reveal that the proposed method, which requires less storage space, achieves a 4- to 17-fold performance improvement over the suffix-tree-based method. Even when the query sequence is long, our method is more than an order of magnitude faster than the suffix-tree-based method and the Smith-Waterman algorithm.
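
The dynamic programming step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' trie-based method: it is a plain semi-global edit-distance scan over a single text, and the function name `approx_search` and its parameters are assumptions made for illustration.

```python
def approx_search(text: str, query: str, max_dist: int) -> list[tuple[int, int]]:
    """Return (end_index, distance) for every position in `text` where some
    substring ending there is within `max_dist` edits of `query`."""
    m = len(query)
    # prev[i] = edit distance between query[:i] and the best substring
    # of the text ending just before the current character.
    prev = list(range(m + 1))
    hits = []
    for j, ch in enumerate(text):
        cur = [0] * (m + 1)  # cur[0] = 0: a match may start at any position
        for i in range(1, m + 1):
            cost = 0 if query[i - 1] == ch else 1
            cur[i] = min(prev[i - 1] + cost,  # match or substitution
                         prev[i] + 1,         # extra character in the text
                         cur[i - 1] + 1)      # query character skipped
        if cur[m] <= max_dist:
            hits.append((j, cur[m]))
        prev = cur
    return hits
```

For example, `approx_search("ACGTACGT", "CGT", 0)` reports the two exact occurrences by their end positions. A trie over window subsequences, as the abstract describes, would share this computation across all windows with a common prefix instead of rescanning the text.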

Directional Feature Extraction of Handwritten Numerals using Local min/max Operations (Local min/max 연산을 이용한 필기체 숫자의 방향특징 추출)

  • Jung, Soon-Won;Park, Joong-Jo
    • Journal of the Institute of Convergence Signal Processing / v.10 no.1 / pp.7-12 / 2009
  • In this paper, we propose a directional feature extraction method for off-line handwritten numerals that uses morphological operations. Directional features are obtained from four directional line images, which contain the horizontal, vertical, right-diagonal and left-diagonal lines of the numeral, respectively. The conventional method for extracting directional features uses Kirsch masks, which generate edge-shaped double-line images for each direction, whereas our method uses directional erosion operations and generates single-line images for each direction. Applying these directional erosion operations to the numeral image requires preprocessing steps such as thinning and dilation, but the resulting directional lines are more similar to the numeral lines themselves. Our four [$4{\times}4$] directional features of a numeral are obtained from the four directional line images through a zoning method. To obtain higher recognition rates for handwritten numerals, we use a multiple feature composed of our proposed feature and two conventional features: a Kirsch directional feature and a concavity feature. For the recognition test with the given features, we use a multi-layer perceptron neural network classifier trained with the back-propagation algorithm. In experiments with the CENPARMI numeral database of Concordia University, we achieved a recognition rate of 98.35%.
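
The zoning step that turns one directional line image into a [$4{\times}4$] feature can be sketched as follows. This is a minimal illustration under assumptions: the image is a binary list of lists whose height and width are divisible by the grid size, each zone's feature is its fraction of foreground pixels, and the name `zoning_features` is hypothetical. The abstract does not state how the paper normalizes zone values.

```python
def zoning_features(image: list[list[int]], grid: int = 4) -> list[float]:
    """Split a binary image into grid x grid zones and return the
    foreground-pixel density of each zone in row-major order."""
    rows, cols = len(image), len(image[0])
    zone_h, zone_w = rows // grid, cols // grid
    features = []
    for zr in range(grid):
        for zc in range(grid):
            total = sum(image[r][c]
                        for r in range(zr * zone_h, (zr + 1) * zone_h)
                        for c in range(zc * zone_w, (zc + 1) * zone_w))
            features.append(total / (zone_h * zone_w))
    return features
```

Applying this to each of the four directional line images and concatenating the results would give a 64-dimensional directional feature vector.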

An Experiment for Surface Reflectance Image Generation of KOMPSAT 3A Image Data by Open Source Implementation (오픈소스 기반 다목적실용위성 3A호 영상자료의 지표면 반사도 영상 제작 실험)

  • Lee, Kiwon;Kim, Kwangseob
    • Korean Journal of Remote Sensing / v.35 no.6_4 / pp.1327-1339 / 2019
  • Surface reflectance obtained by absolute atmospheric correction from satellite images is useful for scientific land applications and analysis ready data (ARD). For Landsat and Sentinel-2 images, many types of radiometric processing methods have been developed, and these images are supported by most commercial and open-source software. However, in the case of KOMPSAT 3/3A images, there are currently no tools or open-source resources for obtaining the reflectance at the top-of-atmosphere (TOA) and top-of-canopy (TOC). In this study, an atmospheric correction module for KOMPSAT 3/3A images is newly added to the optical calibration algorithm supported in the Orfeo ToolBox (OTB), an open-source remote sensing tool. This module contains the sensor model and spectral response data of KOMPSAT 3A. Aerosol measurement properties, such as AERONET data, can be used to generate TOC reflectance images. Using this module, an experiment was conducted, and reflectance products for TOA and TOC, with and without AERONET data, were obtained. This approach can be used to build an ARD database of surface reflectance derived by absolute atmospheric correction from KOMPSAT 3/3A satellite images.

Landslide Susceptibility Mapping by Comparing GIS-based Spatial Models in the Java, Indonesia (GIS 기반 공간예측모델 비교를 통한 인도네시아 자바지역 산사태 취약지도 제작)

  • Kim, Mi-Kyeong;Kim, Sangpil;Nho, Hyunju;Sohn, Hong-Gyoo
    • KSCE Journal of Civil and Environmental Engineering Research / v.37 no.5 / pp.927-940 / 2017
  • Landslides have been a major source of disaster in Indonesia, and recent climate change and indiscriminate urban development around the mountains have increased landslide risks. Java Island, where more than half of Indonesia's population lives, suffers a great deal of damage from frequent landslides. Even in this dangerous situation, the number of inhabitants in landslide-prone areas increases year by year, so a technique for analyzing landslide-hazardous and vulnerable areas is needed. This study therefore aims to evaluate the landslide susceptibility of Java by using GIS-based spatial prediction models. We constructed a geospatial database covering landslide locations, topography, hydrology, soil type, and land cover over the study area, and created spatial prediction models by applying Weight of Evidence (WoE), a decision tree algorithm, and an artificial neural network. The three models showed prediction accuracies of 66.95%, 67.04%, and 69.67%, respectively. The results of the study are expected to be useful for preventing future landslide damage and for landslide disaster management policies in Indonesia.
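
As a rough illustration of the Weight of Evidence model named in this abstract, the sketch below computes the standard positive and negative weights and their contrast for one factor class from cell counts. The function name, parameter names, and the example counts are assumptions; the abstract does not give the paper's exact formulation.

```python
import math

def weights_of_evidence(class_slide_cells: int, class_cells: int,
                        slide_cells: int, total_cells: int) -> tuple[float, float, float]:
    """Return (W+, W-, contrast) for one factor class.

    B = cell belongs to the factor class, L = cell contains a landslide.
    W+ = ln(P(B|L) / P(B|not L)),  W- = ln(P(not B|L) / P(not B|not L)).
    """
    p_b_given_l = class_slide_cells / slide_cells
    p_b_given_not_l = (class_cells - class_slide_cells) / (total_cells - slide_cells)
    w_plus = math.log(p_b_given_l / p_b_given_not_l)
    w_minus = math.log((1 - p_b_given_l) / (1 - p_b_given_not_l))
    return w_plus, w_minus, w_plus - w_minus
```

A positive contrast indicates that the class is associated with landslide occurrence. In this kind of model, a susceptibility score for a cell is typically the sum of the weights of the classes it falls into across all factors.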

Application of Particle Swarm Optimization(PSO) for Prediction of Water Quality in Agricultural Reservoirs of Korea (농업용 저수지의 수질 예측 모델을 위한 PSO(Particle Swarm Optimization) 알고리즘의 적용)

  • Kwon, Yong-Su;Bae, Mi-Jung;Hwang, Soon-Jin;Park, Young-Seuk
    • Korean Journal of Ecology and Environment / v.41 no.spc / pp.11-20 / 2008
  • In this study, we applied a Particle Swarm Optimization (PSO) algorithm to predict changes in chlorophyll-${\alpha}$ in relation to environmental factors in agricultural reservoirs at the national scale in Korea. Data were obtained from the reservoir water quality monitoring networks operated by the Ministry of Agriculture and Forestry and the Ministry of Environment of Korea. From the database of the monitoring networks, 290 reservoirs were chosen, with chlorophyll-${\alpha}$ and 13 environmental factors (COD, TN, TP, altitude, bank height, etc.) measured in 2002. Based on Carlson's trophic state index, the reservoirs were divided into five groups, and most agricultural reservoirs $(TSI_{CHL}\;64.1\%,\;TSI_{TP}\;75.5\%)$ were in the eutrophic state. The groups were discriminated by the environmental variables, showing that COD, DO, and TP were important factors in determining the trophic state. MLP-PSO (a multilayer perceptron (MLP) optimized with PSO) was applied to predict chlorophyll-${\alpha}$ from the environmental factors, and showed high predictability (r=0.83, p<0.001). Additionally, the sensitivity analysis of the MLP-PSO model showed that COD had the strongest positive effect on the concentration of chlorophyll-${\alpha}$, followed by TP, TN, and DO, whereas altitude and bank height had negative effects.
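
To make the PSO component concrete, here is a minimal global-best PSO sketch. It is an illustration under assumptions, not the paper's MLP-PSO: the objective is an arbitrary function of a parameter vector (in MLP-PSO it would be the network's prediction error as a function of its weights), and the swarm size, inertia, and acceleration coefficients are common textbook defaults, not values from the paper.

```python
import random

def pso(objective, dim, n_particles=20, iters=100, bounds=(-5.0, 5.0),
        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # best position seen per particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

Calling `pso(lambda p: sum(x * x for x in p), dim=2)` searches for the minimum of a simple sphere function, which serves here as a stand-in for a training loss.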

Building an SNS Crawling System Using Python (Python을 이용한 SNS 크롤링 시스템 구축)

  • Lee, Jong-Hwa
    • Journal of Korea Society of Industrial Information Systems / v.23 no.5 / pp.61-76 / 2018
  • Everything in the world where modern people live is being drawn into the network. The Internet of Things, which attaches sensors to objects, allows real-time data transfer to and from the network. Mobile devices, essential for modern people, play an important role in recording all traces of everyday life in real time. Through social network services (SNS), information acquisition and communication activities are left in a huge network in real time. From a business point of view, customer needs analysis begins with SNS data. In this research, we build a system in Python that automatically collects SNS content from the web in real time. The system supports customer needs analysis by collecting data from Instagram, Twitter, and YouTube, which have large numbers of users worldwide. Content is collected with a virtual web browser in a Python web server environment, passed through extraction and NLP processing, and stored in a database. Based on the results of this study, we intend to offer a service through a site on which the desired data are collected automatically by a search function and users' responses can be checked in real time through time-series data analysis. In addition, searches completed within 5 seconds, which confirms the advantage of the proposed algorithm.

Arrhythmia Classification based on Binary Coding using QRS Feature Variability (QRS 특징점 변화에 따른 바이너리 코딩 기반의 부정맥 분류)

  • Cho, Ik-Sung;Kwon, Hyeog-Soong
    • Journal of the Korea Institute of Information and Communication Engineering / v.17 no.8 / pp.1947-1954 / 2013
  • Previous work on detecting arrhythmia has mostly used nonlinear methods such as artificial neural networks, fuzzy theory, and support vector machines to increase classification accuracy. Most of these methods require accurate detection of the P-QRS-T points, a high computational cost, and a long processing time. However, the P and T waves are difficult to detect because of differences between individuals. It is therefore necessary to design an efficient algorithm that classifies different arrhythmias in real time and decreases the computational cost by extracting a minimal set of features. In this paper, we propose arrhythmia detection based on binary coding using QRS feature variability. For this purpose, we detected the R wave, RR interval, and QRS width from a noise-free ECG signal obtained through preprocessing. We then classified arrhythmia in real time by converting the threshold variability of each feature into a binary code. Classification of PVC, PAC, Normal, BBB, and Paced beats was evaluated using 39 records of the MIT-BIH arrhythmia database. The average scores achieved were 97.18%, 94.14%, 99.83%, 92.77%, and 97.48% for PVC, PAC, Normal, BBB, and Paced beat classification, respectively.
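
The idea of converting threshold comparisons on QRS features into a binary code can be sketched as below. Everything specific here is an assumption for illustration: the abstract gives neither the thresholds nor the code table, so the 15% RR tolerance, the 0.12 s QRS limit, the bit layout, and the label table are hypothetical, and the sketch is not clinical guidance.

```python
def binary_code(rr_interval: float, qrs_width: float, mean_rr: float,
                rr_tol: float = 0.15, qrs_limit: float = 0.12) -> int:
    """Encode one beat as a 3-bit code: (short RR, long RR, wide QRS).
    Intervals and widths are in seconds; thresholds are illustrative."""
    short_rr = rr_interval < (1 - rr_tol) * mean_rr
    long_rr = rr_interval > (1 + rr_tol) * mean_rr
    wide_qrs = qrs_width > qrs_limit
    return (int(short_rr) << 2) | (int(long_rr) << 1) | int(wide_qrs)

# Hypothetical lookup table from code to a coarse beat label.
LABELS = {
    0b000: "normal-like",
    0b100: "PAC-like (premature, narrow QRS)",
    0b101: "PVC-like (premature, wide QRS)",
    0b001: "BBB-like (regular rhythm, wide QRS)",
}

def classify(rr_interval: float, qrs_width: float, mean_rr: float) -> str:
    """Look up the label for a beat; unknown codes are reported as such."""
    return LABELS.get(binary_code(rr_interval, qrs_width, mean_rr), "unclassified")
```

Because each beat needs only a few comparisons and a table lookup, a scheme of this shape is cheap enough to run in real time, which is the motivation the abstract gives.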

An Item-based Collaborative Filtering Technique by Associative Relation Clustering in Personalized Recommender Systems (개인화 추천 시스템에서 연관 관계 군집에 의한 아이템 기반의 협력적 필터링 기술)

  • 정경용;김진현;정헌만;이정현
    • Journal of KIISE:Software and Applications / v.31 no.4 / pp.467-477 / 2004
  • While recommender systems were once used by only a few e-commerce sites, they are now becoming serious business tools that are reshaping the world of e-commerce. Collaborative filtering has been a very successful recommendation technique in both research and practice. However, personalized recommender systems face two problems: the first-rating problem and the sparsity problem. In this paper, we address these problems using associative relation clustering and the “Lift” of association rules. We compute the “Lift” between items from users' rating data and apply a threshold by α-cut to the associations between items. To make associative relation clustering more efficient, we use not only the existing Hypergraph Clique Clustering algorithm but also the proposed Split Cluster method. Once the clusters are complete, we calculate item similarity within each cluster, and the resulting index is saved in the database for fast access. We apply the created index to predict preferences for new items. To evaluate performance, the proposed method is compared with existing collaborative filtering techniques. The results show that the proposed method improves prediction accuracy by solving these problems of existing collaborative filtering techniques.
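
The “Lift” between items can be illustrated with a short sketch that uses the standard association-rule definition, lift(A, B) = P(A and B) / (P(A) P(B)), computed over the sets of items each user has rated. The function names and the treatment of any rating as an occurrence are assumptions; the abstract does not say how ratings are binarized.

```python
from itertools import combinations

def item_lift(rated_sets: list[set[str]]) -> dict[tuple[str, str], float]:
    """Return the lift of every item pair that co-occurs at least once.
    Each element of `rated_sets` is the set of items one user has rated."""
    n = len(rated_sets)
    count: dict[str, int] = {}
    pair_count: dict[tuple[str, str], int] = {}
    for items in rated_sets:
        for a in items:
            count[a] = count.get(a, 0) + 1
        for pair in combinations(sorted(items), 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1
    return {(a, b): (c / n) / ((count[a] / n) * (count[b] / n))
            for (a, b), c in pair_count.items()}

def strong_pairs(lift: dict[tuple[str, str], float], threshold: float) -> list[tuple[str, str]]:
    """Keep only the item pairs whose lift reaches the threshold."""
    return sorted(pair for pair, value in lift.items() if value >= threshold)
```

A lift above 1 means two items are rated together more often than independence would predict. Pairs that survive the threshold are the associations a clustering step would then group.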

Indexing and Retrieval Mechanism using Variation Patterns of Theme Melodies in Content-based Music Information Retrievals (내용 기반 음악 정보 검색에서 주제 선율의 변화 패턴을 이용한 색인 및 검색 기법)

  • 구경이;신창환;김유성
    • Journal of KIISE:Databases / v.30 no.5 / pp.507-520 / 2003
  • In this paper, we propose a method for automatically constructing a theme melody index for a large music database, and an associative content-based music retrieval mechanism that relies mainly on this index to improve response time for users. First, the system automatically extracts the theme melody from a music file with a graphical clustering algorithm based on the similarities between the motifs of the music. To place an extracted theme melody in the metric space of an M-tree, we chose the average length variation and the average pitch variation of the theme melody as the major features. Moreover, we added a pitch signature and a length signature, which summarize the pitch variation pattern and the length variation pattern of a theme melody, respectively, to increase the precision of the retrieval results. In the proposed retrieval mechanism, the k-nearest neighbor search and range search algorithms of the M-tree are used to select melodies similar to the user's query melody from the theme melody index. To improve user satisfaction, the retrieval mechanism includes ranking and relevance feedback functions. We also implemented the proposed mechanisms as the essential components of a content-based music retrieval system to verify their usefulness.

Incremental Generation of A Decision Tree Using Global Discretization For Large Data (대용량 데이터를 위한 전역적 범주화를 이용한 결정 트리의 순차적 생성)

  • Han, Kyong-Sik;Lee, Soo-Won
    • The KIPS Transactions:PartB / v.12B no.4 s.100 / pp.487-498 / 2005
  • Recently, attention has focused on decision tree algorithms that can handle large datasets. However, because most of these algorithms process data in batch mode, they have to rebuild the tree from scratch whenever new data is added. A more efficient way to reduce the cost of rebuilding is to build the tree incrementally. Representative algorithms for incremental tree construction are BOAT and ITI, and most of these algorithms use a local discretization method to handle numeric data. However, because discretization requires sorted numeric data, a global discretization method that sorts all the data only once is more suitable for large datasets than a local discretization method that sorts at every node. This paper proposes an incremental tree construction method that efficiently rebuilds a tree using a global discretization method to handle numeric data. When new data is added, the categories affected by the data must be recreated, and the tree structure must then be changed in accordance with the category changes. To recreate categories efficiently, the proposed method extracts sample points and performs discretization on them; to adjust the tree structure to the category changes, it uses confidence intervals and a tree restructuring method. An experiment using a people database was conducted to compare the proposed method with an existing method that uses local discretization.
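
The contrast between global and local discretization can be shown with a minimal sketch of the global variant: the numeric attribute is sorted once, cut points are chosen, and every tree node then reuses the same categories. The equal-frequency rule and the function names are assumptions for illustration; the abstract says only that the paper derives categories from extracted sample points.

```python
from bisect import bisect_right

def global_cut_points(values: list[float], n_bins: int) -> list[float]:
    """Sort the attribute once and return equal-frequency cut points."""
    ordered = sorted(values)
    n = len(ordered)
    cuts: list[float] = []
    for k in range(1, n_bins):
        candidate = ordered[k * n // n_bins]
        if not cuts or candidate > cuts[-1]:  # skip duplicate cut points
            cuts.append(candidate)
    return cuts

def to_category(value: float, cuts: list[float]) -> int:
    """Map a numeric value to its category index using the shared cut points."""
    return bisect_right(cuts, value)
```

With local discretization, each node would instead sort its own subset of the data before choosing a split, which is the repeated cost the abstract argues against for large datasets.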