• Title/Summary/Keyword: k-mean clustering


Highlight based Lyrics Search Considering the Characteristics of Query (사용자 질의어 특징을 반영한 하이라이트 기반 노래 가사 검색)

  • Kim, Kweon Yang
    • Journal of the Korean Institute of Intelligent Systems / v.26 no.4 / pp.301-307 / 2016
  • This paper proposes a lyric search method that takes the characteristics of user queries into account. Because queries for lyric search tend to be drawn from the highlight part of a song, the method uses hierarchical agglomerative clustering to find the highlight and proposes a Gaussian weighting that covers the neighborhood of the highlight as well as the highlight itself. With the mean of the Gaussian set at the highlight, the weighting function assigns higher weights near the highlight and lower weights far from it. The paper then builds an index of lyrics using this Gaussian weighting. Experimental results on a data set obtained from 5 real users indicate that the proposed method is effective.
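
The Gaussian weighting described in this abstract can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the function name `gaussian_weights`, the use of lyric line positions as the distance unit, and the width `sigma` are all assumptions, since the abstract only states that the mean of the Gaussian sits at the highlight.

```python
import math

def gaussian_weights(num_lines, highlight_index, sigma=2.0):
    """Weight each lyric line by its distance from the highlight line.

    The peak (weight 1.0) is at highlight_index; sigma is an assumed width.
    """
    return [
        math.exp(-((i - highlight_index) ** 2) / (2.0 * sigma ** 2))
        for i in range(num_lines)
    ]

# Hypothetical song with 9 lyric lines whose highlight is line 4.
weights = gaussian_weights(num_lines=9, highlight_index=4, sigma=2.0)
for i, w in enumerate(weights):
    print(i, round(w, 3))
```

Such weights could then scale the term counts of each line when the lyric index is built, so that matches near the highlight score higher.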

Blind Channel Estimation through Clustering in Backscatter Communication Systems (후방산란 통신시스템에서 군집화를 통한 블라인드 채널 추정)

  • Kim, Soo-Hyun;Lee, Donggu;Sun, Young-Ghyu;Sim, Issac;Hwang, Yu-Min;Shin, Yoan;Kim, Dong-In;Kim, Jin-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.2 / pp.81-86 / 2020
  • Ambient backscatter communication has the drawback that transmission power is limited, because data is transmitted using ambient RF signals. To improve transmission efficiency between transceivers, a channel estimator capable of estimating the channel state at the receiver is needed. In this paper, we consider the K-means algorithm to improve the performance of a channel estimator based on the EM algorithm. The simulation uses MSE as the performance metric to verify the proposed channel estimator. Setting the initial values through K-means shows improved performance compared with channel estimation using the general EM algorithm.
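
The idea of seeding EM with K-means can be illustrated on a toy problem. The sketch below fits a two-component one-dimensional Gaussian mixture and takes its initial means from K-means. It does not reproduce the paper's backscatter channel model; the function names, the sample values, and the starting variances are assumptions made for illustration.

```python
import math

def kmeans_1d(xs, iters=20):
    """Two-cluster 1-D K-means, seeded at the min and max of the data."""
    c = [min(xs), max(xs)]
    for _ in range(iters):
        groups = [[], []]
        for x in xs:
            groups[0 if abs(x - c[0]) <= abs(x - c[1]) else 1].append(x)
        # Keep the old centre if a cluster ends up empty.
        c = [sum(g) / len(g) if g else c[j] for j, g in enumerate(groups)]
    return c

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_two_gaussians(xs, init_means, iters=30):
    """EM for a two-component 1-D Gaussian mixture, starting from init_means."""
    means = list(init_means)
    variances = [1.0, 1.0]   # assumed starting variances
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in xs:
            p = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(2)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate means, variances, and mixing weights.
        for j in range(2):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a degenerate component
            weights[j] = nj / len(xs)
    return means, variances, weights

# Hypothetical received-signal samples scattered around two levels.
samples = [0.9, 1.0, 1.1, 1.05, 0.95, 3.9, 4.0, 4.1, 4.05, 3.95]
init = kmeans_1d(samples)
means, variances, weights = em_two_gaussians(samples, init)
print(init, means)
```

Starting EM near the cluster centres is a common way to reduce the number of iterations and the risk of a poor local optimum, which matches the motivation stated in the abstract.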

Data Clustering Algorithm Adaptive to Data Forms (데이터 형태에 적응하는 클러스터링 알고리즘)

  • Lee, K.H.;Lee, K.C.
    • Annual Conference of KIPS / 2000.10b / pp.1433-1436 / 2000
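  • In clustering, existing algorithms such as k-means[7], DBSCAN[2], CURE[4], ROCK[5], and PAM[8] determine clusters according to some fixed shape, such as a circle or an ellipse. If the distribution of the data to be clustered happens to match the shape assumed by the algorithm, an accurate solution can be obtained, but this rarely happens with naturally distributed data. The CHAMELEON[1] algorithm, which addresses this problem by tracking the shape of the data, was published recently. Although it is independent of shape, its running time grows explosively as the amount of data increases, which makes it hard to apply in practice given that data sets used in mining are typically large. To address this, this paper attempts to improve CHAMELEON[1] by using K-means[7] to select representatives (EF-CHAMELEON). In addition, starting from the idea that many natural shapes can be composed of sets of very small circles, it introduces the ADF algorithm: K-means is used to form a large number of initial clusters small enough to be unaffected by noise, these are merged again using the relative distances between clusters, which removes the dependence on shape, and, to stay faithful to unsupervised learning, a threshold is applied so that the algorithm stops at an appropriate stage. The experiments compare the performance of ADF, CHAMELEON[1], and EF-CHAMELEON on two-dimensional arrays containing various shapes that existing clustering algorithms could not distinguish.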

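
The two-stage idea in the abstract above (over-segment with K-means, then merge small clusters by distance until a threshold stops the process) can be sketched roughly as follows. This is not the ADF algorithm itself: the abstract does not define the "relative distance" or the threshold rule, so the sketch assumes plain Euclidean distance between centroids, a fixed threshold, and single-link merging. All names and data are hypothetical.

```python
import math

def kmeans(points, k, iters=20):
    """Plain K-means on 2-D points; seeds are evenly spaced samples of the input."""
    step = len(points) / k
    centroids = [points[int(i * step)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[j].append(p)
        centroids = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids

def merge_by_distance(centroids, threshold):
    """Union centroids closer than threshold (single-link merge via union-find)."""
    parent = list(range(len(centroids)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            if math.dist(centroids[i], centroids[j]) < threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(centroids))]

# Two elongated (non-circular) shapes: parallel line segments 10 units apart.
line_a = [(float(x), 0.0) for x in range(20)]
line_b = [(float(x), 10.0) for x in range(20)]
points = line_a + line_b

centroids = kmeans(points, k=8)                       # many small clusters
labels = merge_by_distance(centroids, threshold=7.0)  # merge neighbours
print(len(set(labels)), "merged clusters")
```

In this toy case the small clusters chain together along each segment, so the merged result follows the elongated shapes instead of forcing circular clusters.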

A Study on Automatic Analysis Method of Human Behavior Using K-Mean Clustering of Smartphone Acceleration Sensor (스마트폰 가속도 센서의 K-평균 클러스터링을 이용한 사람행동 자동분석 방법에 대한 연구)

  • Park, Jong-Kun;Song, Teuk-Seob
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2019.05a / pp.486-487 / 2019
  • Smartphones have various built-in sensors. In particular, acceleration sensors are used to analyze human behavior because they can detect the movement of objects. Previous studies have analyzed people's behavior from the magnitude of acceleration sensor values. In this study, we propose a method of detecting motion by applying K-means clustering to the values of the smartphone's built-in acceleration sensor, and use it to recognize walking and running, which are basic human behaviors.


A Fine Dust Measurement Technique using K-means and Sobel-mask Edge Detection Method (K-means와 Sobel-mask 윤곽선 검출 기법을 이용한 미세먼지 측정 방법)

  • Lee, Won-Hyeung;Seo, Ju-Wan;Kim, Ki-Yeon;Lin, Chi-Ho
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.2 / pp.97-101 / 2022
  • In this paper, we propose a method of measuring fine dust in CCTV images using K-means and Sobel-mask based edge detection. The proposed algorithm collects images with a CCTV camera and designates an image range through a region of interest. Once clustering with the K-means algorithm is complete, outlines are detected with the Sobel mask, edge strength is measured, and the concentration of fine dust is determined from the measured data. The proposed method extracts the contour of a mountain range using the characteristics of the Sobel mask, which has an advantage in diagonal measurement, and the experimental results show how detection differs with the concentration of fine dust.
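
The edge-strength measurement mentioned in this abstract can be illustrated with the standard 3x3 Sobel masks. The sketch below is an assumption-laden toy, not the paper's pipeline: it skips the K-means step, uses tiny hand-made grayscale grids, and takes the mean gradient magnitude as the "edge strength". How the paper maps edge strength to a dust concentration is not stated in the abstract.

```python
import math

# Standard 3x3 Sobel masks for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def mean_edge_strength(image):
    """Average Sobel gradient magnitude over the interior pixels of a grayscale grid."""
    h, w = len(image), len(image[0])
    total, count = 0.0, 0
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = sum(SOBEL_X[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            total += math.hypot(gx, gy)
            count += 1
    return total / count

# Hypothetical regions of interest: the same vertical edge, sharp vs. washed out.
clear = [[0, 0, 255, 255] for _ in range(4)]
hazy = [[100, 100, 140, 140] for _ in range(4)]

print(mean_edge_strength(clear), mean_edge_strength(hazy))
```

The washed-out grid yields a much lower mean edge strength, which is the kind of contrast loss a haze or dust measure could build on.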

A Space-Time Cluster of Foot-and-Mouth Disease Outbreaks in South Korea, 2010~2011 (구제역의 시.공간 군집 분석 - 2010~2011 한국에서 발생한 구제역을 사례로 -)

  • Pak, Son Il;Bae, Sun Hak
    • Journal of the Korean Association of Regional Geographers / v.18 no.4 / pp.464-472 / 2012
  • To assess the space-time clustering of the FMD (Foot-and-Mouth Disease) epidemic that occurred in Korea between November 2010 and April 2011, a geographical information system (GIS)-based spatial analysis technique was used. Farm addresses and geographic data obtained from a commercial portal site were integrated into GIS software, which we used to map the outbreaks with color shading through a process called thematic mapping, and to produce a visual representation of the relationship between epidemic course and time throughout the country. FMD cases reported in the northern area of Gyounggi province were clustered in space and time within small geographic areas, because livestock population density there is high enough to facilitate transmission of the FMD virus to neighboring farms, whereas FMD cases in the southern and eastern areas of Gyounggi province were clustered in space but not in time. When the data were analyzed at 7-day intervals, the mean radius of the space-time clusters was 25 km, with a minimum of 5.4 km and a maximum of 74 km. In addition, the cluster radius was relatively small in the early stage of the FMD epidemic and expanded geographically over the course of the epidemic. Before control measures are implemented during an outbreak, we recommend assessing the geographic units potentially affected and identifying high-risk areas to be targeted for specific intervention measures.


Analyzing the Co-occurrence of Endangered Brackish-Water Snails with Other Species in Ecosystems Using Association Rule Learning and Clustering Analysis (연관 규칙 학습과 군집분석을 활용한 멸종위기 기수갈고둥과 생태계 내 종 간 연관성 분석)

  • Sung-Ho Lim;Yuno Do
    • Korean Journal of Ecology and Environment / v.57 no.2 / pp.83-91 / 2024
  • This study utilizes association rule learning and clustering analysis to explore the co-occurrence and relationships within ecosystems, focusing on the endangered brackish-water snail Clithon retropictum, classified as Class II endangered wildlife in Korea. The goal is to analyze co-occurrence patterns between brackish-water snails and other species to better understand their roles within the ecosystem. By examining co-occurrence patterns and relationships among species in large datasets, association rule learning aids in identifying significant relationships. Meanwhile, K-means and hierarchical clustering analyses are employed to assess ecological similarities and differences among species, facilitating their classification based on ecological characteristics. The findings reveal a significant level of relationship and co-occurrence between brackish-water snails and other species. This research underscores the importance of understanding these relationships for the conservation of endangered species like C. retropictum and for developing effective ecosystem management strategies. By emphasizing the role of a data-driven approach, this study contributes to advancing our knowledge on biodiversity conservation and ecosystem health, proposing new directions for future research in ecosystem management and conservation strategies.

Analysis of genetic diversity and population structure of rice cultivars from Africa, Asia, Europe, South America, and Oceania using SSR markers

  • Cheng, Yi;Cho, Young-Il;Chung, Jong-Wook;Ma, Kyung-Ho;Park, Yong-Jin
    • KOREAN JOURNAL OF CROP SCIENCE / v.54 no.4 / pp.441-451 / 2009
  • In this study, 29 simple sequence repeat (SSR) markers were used to analyze the genetic diversity and population structure of 125 rice accessions from 40 different origins in Africa, Asia, Europe, South America, and Oceania. A total of 333 alleles were detected, with an average of 11.5 per locus. The mean values of major allele frequency, expected heterozygosity, and polymorphism information content (PIC) for each SSR locus were 0.39, 0.73, and 0.70, respectively. The highest mean PIC was 0.71 for Asia, followed by 0.66 for Africa, 0.59 for South America, 0.53 for Europe, and 0.47 for Oceania. Model-based structure analysis revealed the presence of five subpopulations, which was largely consistent with clustering based on genetic distance. Some accessions were clearly assigned to a single population in which >70% of their inferred ancestry was derived from one of the model-based populations. In addition, 12 accessions (9.6%) were categorized as having admixed ancestry. The results could be used to understand the genetic structure of rice cultivars from these regions and to support effective breeding programs to broaden the genetic basis of rice varieties.
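
For readers unfamiliar with the per-locus statistics quoted in this abstract, the sketch below computes expected heterozygosity and PIC from allele frequencies. The abstract does not say which definitions the authors used, so the commonly used forms are assumed here: H = 1 - Σ p_i² and PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The allele frequencies in the example are made up.

```python
def expected_heterozygosity(freqs):
    """H = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)

def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 (assumed definition)."""
    sq = [p * p for p in freqs]
    cross = sum(
        2.0 * sq[i] * sq[j]
        for i in range(len(sq))
        for j in range(i + 1, len(sq))
    )
    return 1.0 - sum(sq) - cross

# Hypothetical locus with two equally frequent alleles.
two_alleles = [0.5, 0.5]
print(expected_heterozygosity(two_alleles), pic(two_alleles))

# The average number of alleles per locus reported above: 333 alleles over 29 loci.
print(round(333 / 29, 1))
```

PIC is never larger than the expected heterozygosity for the same locus, which is consistent with the reported means of 0.70 and 0.73.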

Genetic diversity and phenotype variation analysis among rice mutant lines (Oryza sativa L.)

  • Truong, Thi Tu Anh;Do, Tan Khang;Phung, Thi Tuyen;Pham, Thi Thu Ha;Tran, Dang Xuan
    • Proceedings of the Korean Society of Crop Science Conference / 2017.06a / pp.22-22 / 2017
  • Genetic diversity is one of the fundamental parameters for rice cultivar improvement. Rice mutants are also a new source for rice breeding innovation. In this study, ninety-three SSR markers were applied to evaluate the genetic variation among nineteen rice mutant lines. A total of 169 alleles from 56 polymorphic markers were recorded, with an average of 3.02 alleles per locus. The polymorphism information content (PIC) values varied from 0.09 to 0.79. The maximum number of alleles was 7, whereas the minimum was 2. The heterozygosity values ranged from 0.10 to 0.81. Four clusters were generated using unweighted pair group method with arithmetic mean (UPGMA) clustering. Fourteen phenotype characteristics were also evaluated, and the correlation coefficients among them were obtained. Genetic diversity information on rice mutant lines can support rice breeders in releasing new rice varieties with elite characteristics.


Genetic Relationships among Korean Adlay, Coix lachryma-jobi L., Landraces Based on AFLPs

  • Moon Jung-Hun;Jang Jung Hee;Park Jung Soo;Kim Sung Kee;Lee Kyung-Jun;Lee Sang-Kyu;Kim Kyung-Hee;Lee Byung-Moo
    • KOREAN JOURNAL OF CROP SCIENCE / v.50 no.2 / pp.142-146 / 2005
  • Thirty-two germplasms of Korean adlay landraces were examined to analyze their genetic relationships using the amplified fragment length polymorphism (AFLP) approach. The total number of AFLP products generated by 12 selective primer combinations was 882. The number of polymorphic fragments (bands visible on the polyacrylamide gel) per primer combination varied greatly, from 4 to 51, with a mean of 20.3. A genetic similarity coefficient was used for cluster analysis following the UPGMA (unweighted pair grouping method of averages) method. The resulting clusters were represented in the form of a dendrogram. The clustering in the dendrogram was not tight, and there was generally no clear grouping of the adlay according to the geographic regions in which the germplasms were collected. The present AFLP analysis implies that although Korean adlay displayed a large amount of AFLP variation within germplasms, the variation occurred independently without reflecting clinal variation. This study demonstrated that the AFLP method can be used to examine the genetic relationships among different germplasms of adlay.
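
The UPGMA step named in this abstract is average-linkage agglomerative clustering, and its merge order is what a dendrogram draws. The sketch below is a small pure-Python version under stated assumptions: the input is a distance matrix (for a similarity coefficient s, a distance such as 1 - s is assumed), and the four labels and their distances are invented for illustration.

```python
def upgma(dist, labels):
    """Return the merge history [(members_a, members_b, average_distance)].

    At each step the two clusters with the smallest average pairwise
    distance between their original members are merged.
    """
    clusters = {i: [i] for i in range(len(labels))}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for a_pos, a in enumerate(keys):
            for b in keys[a_pos + 1:]:
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(dist[i][j] for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((
            sorted(labels[i] for i in clusters[a]),
            sorted(labels[i] for i in clusters[b]),
            d,
        ))
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges

# Hypothetical symmetric distance matrix for four accessions.
labels = ["A", "B", "C", "D"]
dist = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 4],
    [10, 10, 4, 0],
]

merges = upgma(dist, labels)
for left, right, d in merges:
    print(left, "+", right, "at average distance", d)
```

Here A and B join first, then C and D, and the two pairs join last at the average of their four cross distances, which is the order a dendrogram of this matrix would show.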