• Title/Summary/Keyword: cluster method


Analysis of Microbial Communities During Cyanobacterial Bloom in Daechung Reservoir by DGGE (DGGE를 이용한 대청호 수화 발생시기의 세균군집 분석)

  • Ko So-Ra;Park Seong-Joo;Ahn Chi-Yong;Choi Aeran;Lee Jung-Sook;Kim Hee-Sik;Yoon Byung-Dae;Oh Hee-Mock
    • Korean Journal of Microbiology / v.40 no.3 / pp.205-210 / 2004
  • Changes in bacterial communities during a cyanobacterial bloom in Daechung Reservoir were analyzed by DGGE from July to October 2003. Traditional morphological analysis showed that the genera Microcystis, Chroococcus, Oscillatoria, and Phormidium were dominant. The most frequent band in the DGGE profile was identified by 16S rDNA sequence analysis as Microcystis flos-aquae, and the cyanobacterial bloom peaked on September 2. Oscillatoria spp. were also identified, and Aphanizomenon flos-aquae dominated in mid-August. Cluster analysis of the digitized DGGE profiles indicated that the microbial community on September 2 differed considerably from the others. Consequently, the gene fingerprinting method appears to give not only results similar to those of the traditional morphological method but also additional information on the bacterial species and on the similarity among the examined microbial communities.

Classification of Pollution Patterns in High School Classrooms using Disjoint Principal Component Analysis (분산주성분 분석을 이용한 고등학교교실 내 오염패턴분류에 관한 연구)

  • Jang, Choul-Soon;Lee, Tae-Jung;Kim, Dong-Sool
    • Journal of Korean Society for Atmospheric Environment / v.22 no.6 / pp.808-820 / 2006
  • In regard to indoor air quality, the government introduced various policies for managing and monitoring indoor air quality as a major task, and enforced the 'Indoor Air Quality Management Act', which was presented in May 2004. However, schools were not yet included among the multi-use facilities controlled by the Act. The goal of this study was to investigate PM10 pollution patterns in high school classrooms using a pattern recognition method based on cluster analysis and disjoint principal component analysis, and further to survey levels of inorganic elements in May, June, and September 2004. A hierarchical clustering method was applied to obtain candidate objects in pseudo-homogeneous sample classes by transforming the raw data and applying various distance measures. Following this analysis, disjoint principal component analysis was used to define homogeneous sample classes after deleting outliers. Three homogeneous patterns were obtained as follows. Objects in the first class were considered to have been sampled under semi-open conditions. This class had high concentrations of Ca, Fe, Mg, K, Al, and Na, which are related to soil and chalk compounds. Objects in the second class were sampled while air conditioners were operating, and this class showed low concentrations of PM10 and elements. Objects in the last class were sampled on rainy days. Chalk, soil elements, and various anthropogenic sources, including combustion and industrial sources, influenced the third class. This methodology is considered helpful for classifying indoor air quality patterns and indoor environmental categories when controlling indoor air quality.

An Optimized User Behavior Prediction Model Using Genetic Algorithm On Mobile Web Structure

  • Hussan, M.I. Thariq;Kalaavathi, B.
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.5 / pp.1963-1978 / 2015
  • With the advancement of mobile web environments, identifying and analyzing user behavior plays a significant role and remains challenging to implement given the variations observed in the model. This paper presents an efficient method for mining an optimized user behavior prediction model using a genetic algorithm on mobile web structure. The framework integrates temporary and permanent register information and stores it immediately in the form of integrated logs, which have higher precision and minimize the time needed to determine user behavior. Then, by applying temporal characteristics, a suitable time interval table is obtained by segmenting the logs. The suitable time interval table that splits the huge data logs is obtained using a genetic algorithm. Existing cluster-based temporal mobile sequential arrangement provides efficiency without reducing accuracy but compromises precision when predicting user behavior. To efficiently discover mobile users' behavior, the prediction model is associated with region and requested services, and a method called optimized user behavior Prediction Model using Genetic Algorithm (PM-GA) on mobile web structure is introduced. This paper also provides a technique called MAA for use when the number of models related to the region and requested services increases. Based on our analysis, we contend that PM-GA provides improved performance in terms of precision, number of mobile models generated, execution time, and prediction accuracy. Experiments are conducted with different parameters on a real dataset in a mobile web environment. Analytical and empirical results show that the approach offers efficient and effective mining and prediction of user behavior on mobile web structure.

Improvement of TAOS data process

  • Lee, Dong-Wook;Byun, Yong-Ik;Chang, Seo-Won;Kim, Dae-Won;TAOS Team
    • The Bulletin of The Korean Astronomical Society / v.36 no.2 / pp.129.1-129.1 / 2011
  • We have applied advanced multi-aperture indexing photometry and a sophisticated de-trending method to existing Taiwanese-American Occultation Survey (TAOS) data sets. TAOS, a wide-field ($3^{\circ} \times 3^{\circ}$) and rapid photometry (5 Hz) survey, is designed to detect small objects in the Kuiper Belt. Because TAOS takes fast, multiple exposures per zipper-mode image, the point spread function (PSF) varies within a given image. Selecting an appropriate aperture among apertures of various sizes allows us to reflect these variations in each light curve. The survey data turned out to contain various trends caused by telescope vibration, CCD noise, and unstable local weather. We select multiple sets of stars using a hierarchical clustering algorithm in such a way that the light curves in each cluster are strongly correlated with one another. We then determine a primary trend (PT) per cluster using a weighted sum of the normalized light curves, and we use the constructed PTs to remove trends from individual light curves. After removing the trend, we obtain for each star a synthetic light curve with a much higher signal-to-noise ratio. We compare the efficiency of the synthetic light curves with that of light curves produced by existing photometry pipelines. Our photometric method can restore subtle brightness variations that tend to be missed by conventional aperture photometry, and it can be applied to other wide-field surveys suffering from PSF variations and trends. We are developing an analysis package for the next-generation TAOS survey (TAOS II) based on the current experiments.
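
The primary-trend step described in the abstract above can be illustrated with a minimal sketch. It assumes that cluster membership is already known, that each light curve is normalized to zero mean and unit variance, and that a trend is removed by subtracting a least-squares scaled copy of the PT. The abstract does not state the weighting scheme or the removal rule, so the function names `normalize`, `primary_trend`, and `detrend`, the equal weights in the example, and the subtraction step are all assumptions made for illustration, not the TAOS pipeline itself.

```python
from statistics import mean, pstdev


def normalize(curve):
    """Shift a light curve to zero mean and scale it to unit variance.

    Assumes the curve is not constant (pstdev > 0).
    """
    mu, sigma = mean(curve), pstdev(curve)
    return [(x - mu) / sigma for x in curve]


def primary_trend(curves, weights):
    """Weighted sum of the normalized light curves in one cluster.

    The weights are an assumption; the abstract does not say how they are chosen.
    """
    normed = [normalize(c) for c in curves]
    total = sum(weights)
    length = len(normed[0])
    return [
        sum(w * c[t] for w, c in zip(weights, normed)) / total
        for t in range(length)
    ]


def detrend(curve, trend):
    """Subtract the least-squares scaled trend from a single light curve."""
    mu = mean(curve)
    centered = [x - mu for x in curve]
    scale = sum(c * p for c, p in zip(centered, trend)) / sum(p * p for p in trend)
    return [x - scale * p for x, p in zip(curve, trend)]


# Hypothetical cluster: two stars sharing the same linear drift.
cluster = [[10, 11, 12, 13, 14], [20, 22, 24, 26, 28]]
pt = primary_trend(cluster, weights=[1.0, 1.0])

# A third star carrying the same drift; after de-trending only its mean level remains.
star = [5, 8, 11, 14, 17]
cleaned = detrend(star, pt)
```

In this toy case the star contains nothing but the shared drift, so `cleaned` is flat at the star's mean brightness; a real light curve would keep whatever variation is uncorrelated with the PT.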


A Study on Foot Shape of Women (성인 여성의 발 형태 분석에 관한 연구)

  • 서추연;석은영
    • Journal of the Korean Home Economics Association / v.41 no.6 / pp.1-12 / 2003
  • The purpose of this study was to analyze anthropometric data on the feet of Korean women in relation to aging, to categorize women's foot shapes, and to compare shoe sizes according to foot shape in order to provide basic information for more comfortable shoes. The subjects were 181 women over age 20. They were measured using direct and indirect measurement methods. Twenty-six items were measured on the right foot and 6 items were taken from the foot outline. Factor analysis, cluster analysis, analysis of variance, post-hoc tests, and cross-tabulation were performed with SPSS for statistical analysis of the data. There were significant differences in height, breadth, girth, and angle items by subject age. The older subjects' feet were wide and thick, with large toe deformity, and their arch height was low. This implies that the degree of toe deformity, the foot ratio, the foot girth, the foot breadth, and the arch height, as well as the foot length, need to be considered in developing comfortable shoes. Nine foot construction factors were extracted by factor analysis of the anthropometric measurements: foot size factor, heel and instep factor, malleolus lateralis factor, malleolus medialis factor, foot shape factor, shape of toes factor, heel height factor, big toe height factor, and internal factor. On the basis of the cluster analysis, three foot shapes were categorized. Type 1 was a large, wide foot with little deformity of the little toe. Type 2 was a medium foot with deformation of the big toe and the lowest arch height. Type 3 was a small, narrow foot with the highest arch height. The distribution of shoe size according to foot shape was analyzed. The ball-of-foot breadth was more widely distributed than the ball-of-foot girth. This implies that girth and breadth items of the foot should be included for the same foot length in the shoe sizing system.

A Study on Somatotype Classification of Muscular Men's Lower Body (근육형 남성의 하반신 체형분류에 관한 연구)

  • Jeong, Hye-Jin;Kim, So-Ra
    • Journal of the Ergonomics Society of Korea / v.28 no.1 / pp.21-27 / 2009
  • The purpose of this research is to understand the physiological characteristics of muscular men between the ages of 20 and 34 years, who are distinct from the general population due to their muscular development, and to categorize them according to lower body somatotypes. This research was conducted to provide basic data necessary for developing clothing products for muscular men. The research method and results were as follows: 1. Factor analysis was carried out on the body measurements of 168 muscular men according to the body classification method of Sheldon and Heath-Carter. Lower body types were then derived statistically by cluster analysis, using the scores of each factor extracted in the factor analysis as independent variables. Discriminant analysis was also carried out on the cluster analysis results so that the morphological characteristics of each type were clearly distinguished. 2. In the factor analysis, the number of factors was set to three. Factor 1, a size factor of the lower body, accounted for 38.149% of the total variance. Factor 2, a height and length factor of the lower body, accounted for 20.417% of the total variance. Factor 3, a length factor of the hip, accounted for 8.466% of the total variance. 3. The lower body was classified into three types with the following characteristics. Type 1 was the group with the best developed lower body muscles, given that their lower body size was the largest. Type 2 comprised well-balanced muscular men whose lower body size was smaller than that of the other types; this type had neither abdominal fat nor large hips. Type 3 was a body type with a long waist-to-hip length. 4. Discriminant analysis to distinguish muscular men's lower body types yielded an overall discriminant accuracy of 86.3%.

Detection of Entry/Exit Zones for Visual Surveillance System using Graph Theoretic Clustering (그래프 이론 기반의 클러스터링을 이용한 영상 감시 시스템 시야 내의 출입 영역 검출)

  • Woo, Ha-Yong;Kim, Gyeong-Hwan
    • Journal of the Institute of Electronics Engineers of Korea SC / v.46 no.6 / pp.1-8 / 2009
  • Detecting entry and exit zones in a view covered by multiple cameras is an essential step in determining the topology of the camera setup, which is critical for achieving and sustaining the accuracy and efficiency of a multi-camera surveillance system. In this paper, a graph theoretic clustering method is proposed to detect zones using data points that correspond to entry and exit events of objects in the camera view. The minimum spanning tree (MST) is constructed by associating the data points. Then a set of well-formed clusters is sought by removing inconsistent edges of the MST, based on the concepts of cluster balance and cluster density defined in the paper. Experimental results suggest that the proposed method is effective, even for sparse, elongated clusters, which could be problematic for expectation-maximization (EM). In addition, compared with EM-based approaches, the number of data points required to obtain a stable outcome is relatively small, which shortens the learning period.
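
The general idea of MST-based clustering in the abstract above can be sketched as follows. The paper's own inconsistency test relies on cluster balance and cluster density, which the abstract does not define, so this sketch substitutes a simpler and clearly different rule: an MST edge is cut when it is longer than `factor` times the mean MST edge length. The function names `mst_edges` and `mst_clusters`, the `factor` parameter, and the sample points are hypothetical.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (length, i, j) tuples, one per MST edge.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known connection of each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def mst_clusters(points, factor=2.0):
    """Cut long MST edges and label the remaining connected components.

    Assumption: an edge is "inconsistent" if it exceeds factor * mean edge length.
    This is a stand-in for the balance/density criteria of the paper.
    """
    edges = mst_edges(points)
    if not edges:
        return [0] * len(points)
    mean_len = sum(e[0] for e in edges) / len(edges)
    kept = [(i, j) for d, i, j in edges if d <= factor * mean_len]

    # Union-find over the kept edges.
    root = list(range(len(points)))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for i, j in kept:
        root[find(i)] = find(j)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


# Hypothetical entry/exit events: two elongated groups along one image axis.
events = [(0, 0), (1, 0), (2, 0), (3, 0), (20, 0), (21, 0), (22, 0)]
zone_labels = mst_clusters(events)
```

Because the MST follows chains of nearby points, elongated groups such as the two in `events` stay intact, whereas a mixture of isotropic Gaussians would tend to fit them poorly.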

A GIS-Based Method for Delineating Spatial Clusters: A Modified AMOEBA Technique (공간 클러스터의 범역 설정을 위한 GIS-기반 방법론 연구 -수정 AMOEBA 기법-)

  • Lee, Sang-Il;Cho, Dae-Heon;Sohn, Hak-Gi;Chae, Mi-Ok
    • Journal of the Korean Geographical Society / v.45 no.4 / pp.502-520 / 2010
  • The main objective of the paper is to develop a GIS-based method for delineating spatial clusters. The major tasks are: (i) to devise a sustainable algorithm with reference to various methods developed in the fields of geographic boundary analysis and cluster detection; (ii) to develop a GIS-based program that implements the algorithm. The main results are as follows. First, the AMOEBA technique utilizing LISA is identified as the best candidate. Second, a modified version of the AMOEBA technique is proposed and implemented in a GIS environment. Third, the validity and usefulness of the modified AMOEBA algorithm are confirmed by applying it to test and real data sets.

Proposal For Improving Data Processing Performance Using Python (파이썬 활용한 데이터 처리 성능 향상방법 제안)

  • Kim, Hyo-Kwan;Hwang, Won-Yong
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.13 no.4 / pp.306-311 / 2020
  • This paper addresses how to improve the performance of Python, with its various libraries, when developing a model on big data. Python uses the Pandas library to process spreadsheet-format data such as Excel files. When processing data, Python operates on an in-memory basis. There is no performance issue when processing small-scale data, but performance issues arise when processing large-scale data. Therefore, this paper introduces a method for distributing execution tasks across a single cluster or multiple clusters by using the Dask library, which can be used together with Pandas. The experiment compares, on hardware of the same specification, the speed of processing a simple exponential model using only Pandas with the speed of processing using Dask together with Pandas. This paper presents a method for developing a model by distributing large-scale data across CPU cores to improve performance while retaining Python's advantage of easy access to various libraries.

Investigating Binding Area of Protein Surface using MCL Algorithm (MCL 알고리즘을 이용한 단백질 표면의 바인딩 영역 분석 기법)

  • Jung, Kwang-Su;Yu, Ki-Jin;Chung, Yong-Je;Ryu, Keun-Ho
    • The KIPS Transactions:PartD / v.14D no.7 / pp.743-752 / 2007
  • Proteins combine with other materials to achieve their function and have similar functions if their active sites are similar. Thus we can infer the function of a protein by identifying its binding area. This paper proposes a novel method for selecting the binding area of a protein using the MCL (Markov Cluster) algorithm. We construct a distance matrix from the distances between surface residues of the protein. This distance matrix is then transformed into a connectivity matrix to which the MCL process is applied. We used Catalytic Site Atlas (CSA) data to evaluate the proposed method. In the experiment using CSA data (94 selected single-chain proteins), our algorithm detected the binding area near the active site in 91 (97%) of the proteins. We introduce new geometrical features, which mainly contribute to reducing the time needed to analyze a protein by selecting the residues near the active site.
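
The distance-to-connectivity step and the MCL iteration mentioned in the abstract above can be sketched in a few lines. This is a generic Markov Cluster loop (add self-loops, column-normalize, then alternate expansion and inflation), not the authors' implementation. The distance cutoff, the inflation value of 2.0, the threshold used to read clusters from the converged matrix, and the names `connectivity_from_distances` and `mcl` are assumptions chosen for illustration.

```python
def connectivity_from_distances(dist, cutoff):
    """Mark two residues as connected when their distance is within the cutoff.

    The cutoff rule is an assumption; the abstract does not give the transform.
    """
    n = len(dist)
    return [
        [1 if i != j and dist[i][j] <= cutoff else 0 for j in range(n)]
        for i in range(n)
    ]


def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mcl(adjacency, inflation=2.0, iterations=50, tol=1e-9):
    """Generic Markov Cluster loop on a symmetric 0/1 connectivity matrix."""
    n = len(adjacency)
    # Self-loops keep every column non-empty and damp oscillation.
    m = [
        [float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    m = normalize_columns(m)
    for _ in range(iterations):
        expanded = matmul(m, m)  # expansion: two-step random walks
        inflated = normalize_columns(
            [[x ** inflation for x in row] for row in expanded]
        )  # inflation: strengthen strong flows, weaken weak ones
        delta = max(
            abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n)
        )
        m = inflated
        if delta < tol:
            break

    # Nodes joined by non-negligible flow belong to the same cluster.
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for i in range(n):
        for j in range(n):
            if m[i][j] > 1e-6:
                root[find(i)] = find(j)

    groups = {}
    for node in range(n):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())


# Hypothetical connectivity: two triangles of residues joined by one contact (2-3).
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = mcl(adjacency)
```

Raising `inflation` generally yields finer clusters and lowering it yields coarser ones, so the value would need tuning for real residue data.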