• Title/Summary/Keyword: K-means cluster

Hierarchical and Incremental Clustering for Semi Real-time Issue Analysis on News Articles (준 실시간 뉴스 이슈 분석을 위한 계층적·점증적 군집화)

  • Kim, Hoyong;Lee, SeungWoo;Jang, Hong-Jun;Seo, DongMin
    • The Journal of the Korea Contents Association / v.20 no.6 / pp.556-578 / 2020
  • Many studies have examined how to analyze issues from real-time news streams. However, few analyze issues hierarchically from news articles, and in an earlier study of hierarchical issue analysis, clustering slowed down as the number of news articles grew. In this paper, we propose a hierarchical and incremental clustering method for semi-real-time issue analysis of news articles. We trained a weighted cosine similarity model based on a Siamese neural network, applied this model to the k-means algorithm to build word clusters, and converted news articles into document vectors using these word clusters. Finally, we initialized an issue cluster tree from the document vectors, updated the tree whenever new articles arrived, and analyzed issues in semi-real time. In our experiments and evaluation, NMI improved by up to about 0.26, and incremental clustering was about 10 times faster than before.
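
The NMI (normalized mutual information) figure quoted in this abstract compares a predicted clustering against reference labels. The following is a minimal sketch of one common way to compute it, not the authors' evaluation code. It assumes the arithmetic-mean normalization (other variants exist), and the names `entropy` and `nmi` are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a list of cluster labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(truth, pred):
    """Normalized mutual information between two labelings.

    Assumption: normalizes by the arithmetic mean of the two entropies.
    """
    if len(truth) != len(pred):
        raise ValueError("labelings must have the same length")
    n = len(truth)
    joint = Counter(zip(truth, pred))
    count_t = Counter(truth)
    count_p = Counter(pred)
    # Mutual information summed over the label pairs that actually occur.
    mi = sum(
        (c / n) * math.log(c * n / (count_t[t] * count_p[p]))
        for (t, p), c in joint.items()
    )
    denom = (entropy(truth) + entropy(pred)) / 2
    # Both labelings consist of a single cluster: treat as identical.
    return mi / denom if denom > 0 else 1.0
```

NMI is invariant to how clusters are numbered, so `nmi([0, 0, 1, 1], [1, 1, 0, 0])` scores a perfect match, while two labelings that share no information score zero.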

A Study on Sizing System of Cycle Tights for Athlete depending on Lower Body Type for High School Boys Cyclist (남자 고등학교 사이클 선수의 하반신 유형 분류에 따른 선수용 사이클복 하의 치수설정에 관한 연구)

  • Park, Hyunjeong;Do, Wolhee
    • Fashion & Textile Research Journal / v.19 no.3 / pp.320-330 / 2017
  • People have recently become interested in eco-friendly cycling, which has attracted further attention as a sport. The number of high school cyclists has increased with the popularity of cycling; however, high school cyclists have trouble choosing cycling suits because no professional cycling suit is available for them in Korea. Because sportswear is an important means of improving athletic performance, a professional cycling suit for high school cyclists needs to be developed. This study suggests a standard sizing system for high school student athletes' cycle tights. The subjects were 111 high school cyclists. Cluster analysis yielded 3 clusters, and the sizing system was classified according to the three lower body types. The size intervals for waist girth, hip girth, and height were each 5 cm. The most frequent sizes were 75-100-175 for figure type 1, 70-90-170 and 75-95-170 for figure type 2, and 70-90-175 and 70-90-180 for figure type 3. Sizes with frequencies above 3.6% were classified into 9, 8, and 5 cases, respectively, by lower body type. The results will contribute to the development of performance cycle wear for high school cyclists.

Genetic Diversity Based on Morphology and RAPD Analysis in Vegetable Soybean

  • Srinives, P.;Chowdhury, A.K.;Tongpamnak, P.;Saksoong, P.
    • KOREAN JOURNAL OF CROP SCIENCE / v.46 no.2 / pp.112-120 / 2001
  • Genetic diversity of 47 East-Asian vegetable soybean varieties was characterized by means of agro-morphological traits and RAPD markers. A field trial was conducted to evaluate 14 agro-morphological traits. For the RAPD-based DNA analysis, a total of sixty 10-mer random primers were screened. Of these, 23 yielded polymorphic markers in the 16 varieties used for screening. Among 207 markers amplified, 48 were polymorphic for at least one pairwise comparison within the 47 varieties. RAPD markers showed a higher level of differentiation between varieties than morphological markers. Correspondence analysis using both types of marker showed that RAPD data could fully discriminate between all varieties, whereas morphological markers could not. Genetic distances between the varieties, estimated from simple matching coefficients, ranged from 0.0 to 0.640 with an average of 0.295$\pm$0.131 for morphological traits and from 0.042 to 0.625 with an average of 0.336$\pm$0.099 for RAPD data. Cluster analysis based on the genetic dissimilarity of these varieties gave rise to 4 distinct groups. The clustering results based on RAPDs did not match those based on morphological traits. The geographical distribution of most varieties in each group was not well defined. The results suggest that the level of genetic diversity within this group of East-Asian vegetable soybean varieties is sufficient for a breeding program and can be used to establish genetic relationships among varieties with unknown or unrelated pedigrees.
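
As a rough illustration of how a genetic distance can be derived from simple matching coefficients, the sketch below scores two binary marker profiles (1 = band present, 0 = band absent). It assumes the distance is taken as one minus the simple matching coefficient, which the abstract does not state explicitly; the function name and the example profiles are hypothetical.

```python
def simple_matching_distance(profile_a, profile_b):
    """Dissimilarity between two binary marker profiles.

    Assumption: distance = 1 - (matching positions / total positions),
    where shared presences and shared absences both count as matches.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same markers")
    if not profile_a:
        raise ValueError("profiles must not be empty")
    matches = sum(1 for a, b in zip(profile_a, profile_b) if a == b)
    return 1 - matches / len(profile_a)


# Hypothetical band scores for two varieties over four markers.
variety_1 = [1, 0, 1, 1]
variety_2 = [1, 1, 1, 0]
distance = simple_matching_distance(variety_1, variety_2)  # 2 of 4 differ
```

A matrix of such pairwise distances is the usual input to the kind of cluster analysis the abstract describes.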

Favorable Colors on the Facial Color Types of Korean Adult Females (한국 여성의 얼굴 피부색 유형에 어울리는 색채에 대한 연구)

  • Kim, Ku-Ja
    • Journal of the Korean Society of Clothing and Textiles / v.30 no.6 s.154 / pp.971-980 / 2006
  • The colors of apparel are closely related to the facial color types of consumers. To extract the favorable colors that flatter consumers' facial color types, the facial colors of Korean females were analyzed. With the JX-777 color meter, 2 points on the face were measured, and the subjects were classified into 3 clusters with similar hue, value, and chroma. Another 10 college women were then measured, and 3 of them were selected as subjects because they matched the classified facial color types. 175 respondents rated how becoming the color samples were on the three subjects. Data were analyzed by K-means cluster analysis, ANOVA, and Duncan's multiple range test using SPSS Win. 12. Findings were as follows: 1) 324 subjects who had YR facial colors were classified into 3 facial color groups. The average facial color of Type 1 was 4.82YR 6.47/3.70, accounting for 48.88% of total observations. Type 2 was 5.99YR 6.12/4.12 (30.25%) and Type 3 was 5.15YR 7.07/4.97 (20.99%). 2) Favorable colors for Type 1 were the 18 colors that belonged to group 'a' when colors were divided into groups a, b, and c by Duncan's post hoc test. 3) Type 2 had many unfavorable colors: 18 colors belonged to group 'c' by the Duncan test. 4) For Type 3, black was the most favorable color, and 18 colors were at the middle level, belonging to group 'b' from among 18 colors that were divided into a, b, and c by the Duncan test.

A Study on the Deduction of Social Issues Applying Word Embedding: With an Empasis on News Articles related to the Disables (단어 임베딩(Word Embedding) 기법을 적용한 키워드 중심의 사회적 이슈 도출 연구: 장애인 관련 뉴스 기사를 중심으로)

  • Choi, Garam;Choi, Sung-Pil
    • Journal of the Korean Society for Information Management / v.35 no.1 / pp.231-250 / 2018
  • In this paper, we propose a new methodology for extracting and formalizing subjective topics at a specific time using a set of keywords extracted automatically from online news articles. We first extracted a set of keywords by applying the TF-IDF method, which was selected through a series of comparative experiments on various statistical weighting schemes that measure the importance of individual words in a large set of texts. To calculate the semantic relations between the extracted keywords effectively, a set of word embedding vectors was constructed from about 1,000,000 separately collected news articles. The extracted keywords were represented as numerical vectors and clustered by the K-means algorithm. A qualitative in-depth analysis of the resulting keyword clusters showed that most of them were appropriate topics with enough semantic concentration for labels to be assigned easily.
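
The keyword-extraction step can be sketched with a basic TF-IDF weighting. This is an illustrative sketch only: it assumes the plain variant (term frequency times log of inverse document frequency), which may differ from the exact scheme the authors selected, and the function names and the tiny tokenized corpus are hypothetical.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return one {word: TF-IDF score} dict per tokenized document.

    Assumption: tf = count / document length, idf = ln(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


def top_keywords(docs, top_n=2):
    """Pick the top_n highest-scoring words per document (ties broken alphabetically)."""
    return [
        [w for w, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]]
        for s in tfidf_scores(docs)
    ]


# Hypothetical, already-tokenized articles.
articles = [
    ["disability", "welfare", "policy"],
    ["disability", "employment", "policy"],
    ["disability", "education"],
]
keywords = top_keywords(articles, top_n=1)
```

A word that appears in every article ("disability" here) gets a score of zero, so the more distinctive word in each article is selected. In the pipeline the abstract describes, the selected keywords would then be mapped to embedding vectors and clustered with K-means.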

The Character Area Extraction and the Character Segmentation on the Color Document (칼라 문서에서 문자 영역 추출믹 문자분리)

  • 김의정
    • Journal of the Korean Institute of Intelligent Systems / v.9 no.4 / pp.444-450 / 1999
  • This paper presents a clustering method that uses the k-means algorithm to extract character areas from a document image, together with a distance function suited to the HIS coordinate system for clustering the image. As a preprocessing step for recognition, that is, as a character segmentation method, an algorithm that extracts discrete characters using connected picture elements is also proposed. This algorithm can separate characters that touch or overlap. Projection and edge tracking have so far been used to segment such characters. With the method proposed here, however, discrete characters can be extracted with only a single projection after the character string has been extracted, dividing the area into character and non-character regions. This is significant because it handles color documents rather than simple binary images, and the method has been verified to be more advanced than previous document processing systems.
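
The abstract does not give the distance function it uses, so the following is only a sketch under stated assumptions: "HIS" is read as a hue/intensity/saturation color space, hue is an angle in degrees, and each pixel is mapped to cylindrical Cartesian coordinates so that hues near 0 and 360 degrees count as close. The helper names, the deterministic initialization from the first k points, and the sample pixels are all hypothetical.

```python
import math


def hsi_to_xyz(hue_deg, saturation, intensity):
    """Map a color to Cartesian coordinates so that hue wraps around correctly."""
    h = math.radians(hue_deg)
    return (saturation * math.cos(h), saturation * math.sin(h), intensity)


def kmeans(points, k, max_iters=20):
    """Plain Lloyd's k-means with Euclidean distance.

    Assumption: centroids start at the first k points (deterministic, for clarity).
    """
    centroids = [points[j] for j in range(k)]
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members; keep empty ones unchanged.
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
    return labels, centroids


# Hypothetical pixels as (hue in degrees, saturation, intensity):
# dark character pixels alternate with bright background pixels.
pixels = [(350, 0.1, 0.10), (200, 0.2, 0.90), (10, 0.1, 0.12), (205, 0.2, 0.92)]
labels, centroids = kmeans([hsi_to_xyz(*px) for px in pixels], k=2)
```

With this mapping, the two dark pixels at hues 350 and 10 fall into the same cluster even though their hue values are numerically far apart, which is the reason for using a color-space-aware distance.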

Clustering of Facial Color Types and Their Favorable Colors on Korean Adult Males (한국 남성의 얼굴 피부색 분류와 유형에 어울리는 색채 연구)

  • Kim, Ku-Ja
    • Journal of the Korean Society of Clothing and Textiles / v.30 no.2 s.150 / pp.316-325 / 2006
  • The colors of apparel are becoming more important for giving fibers and fabrics a differentiated character. This study aimed to extract the favorable colors that suit each facial color type. Research was carried out to classify facial colors into several groups of similar facial color. With the JX-777, 2 points on the face, the forehead and the cheek, were measured and classified into 3 facial color types. The sample comprised 418 Korean adult males and 15 additional male subjects. Three newly chosen subjects who had the classified facial color types wore a silver gown and a black hat to minimize the interaction of clothing color and hair. Forty standardized color samples were used to extract the favorable colors. 187 respondents rated how becoming the color samples were on the 3 facial color types. Data were analyzed by K-means cluster analysis, ANOVA, and Duncan's multiple range test using SPSS Win. 12. Findings were as follows: 1. The 418 subjects who had YR colors were classified into 3 facial color groups. Type 1 was 4.59YR 5.89/5.12, Type 2 was 5.61YR 5.41/4.79, and Type 3 was 4.38YR 6.49/4.89. 2. Favorable colors for Type 1 were 2 colors that belonged to group 'a' when colors were divided into groups a, b, and c, and 18 colors that belonged to group 'a' when colors were divided into groups a and b by Duncan's post hoc test. 3. Type 2 had many unfavorable colors: 16 colors belonged to group 'c' by the Duncan test. 4. Favorable colors for Type 3 were 14 colors that belonged to group 'a' when colors were divided into groups a, b, and c, and 16 colors that belonged to group 'a' when colors were divided into groups a and b by the Duncan test.

Integrated Stochastic Admission Control Policy in Clustered Continuous Media Storage Server (클리스터 기반 연속 미디어 저장 서버에서의 통합형 통계적 승인 제어 기법)

  • Kim, Yeong-Ju;No, Yeong-Uk
    • The KIPS Transactions:PartA / v.8A no.3 / pp.217-226 / 2001
  • In this paper, for continuous media access operations performed by a Clustered Continuous Media Storage Server (CCMSS) system, we present an analytical model based on an open queueing network that simultaneously considers two critical delay factors in the CCMSS system: disk I/O and the internal network. Using the analytical model, we derive a stochastic model for the total service delay time in the system. Next, we propose an integrated stochastic admission control model for the CCMSS system, which uses the derived stochastic model to estimate the maximum number of admissible service requests at the allowable service failure rate and applies that number in the admission control operation. To evaluate the proposed model, we computed deadline miss rates using the previous stochastic model, which considers only disk I/O, and the proposed stochastic model, which considers both disk I/O and the internal network, and compared the values with results obtained from simulation in a real cluster-based distributed media server environment. The evaluation showed that the proposed admission control policy reflects the delay factors in the CCMSS system more precisely.

Item Filtering System Using Associative Relation Clustering Split Method (연관관계 군집 분할 방법을 이용한 아이템 필터링 시스템)

  • Cho, Dong-Ju;Park, Yang-Jae;Jung, Kyung-Yong
    • The Journal of the Korea Contents Association / v.7 no.6 / pp.1-8 / 2007
  • In electronic commerce, it is important to recommend suitable items to users from large item sets while saving them time and effort. If a recommendation system can recommend suitable items, user satisfaction will improve. In this paper, we propose the associative relation clustering split method for collaborative filtering in order to improve accuracy and scalability. We compute the lift between associated items using the ratings data and then split the node group that consists of the items to improve the efficiency of the associative relation cluster. This method differentiates the associations among the items of the groups. Once the association of a group is satisfied, the remaining items are combined. To estimate performance, the suggested method is compared with K-means and EM on the MovieLens data set.
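
Lift, the association measure named in this abstract, is the ratio of how often two items occur together to how often they would co-occur if independent. The sketch below is a minimal illustration, not the authors' implementation. It assumes each user's ratings have already been reduced to a set of items that user rated favorably; that preprocessing step, the function name, and the item labels are hypothetical.

```python
def lift(transactions, item_a, item_b):
    """lift(A, B) = P(A and B) / (P(A) * P(B)) over a list of item sets."""
    n = len(transactions)
    count_a = sum(1 for t in transactions if item_a in t)
    count_b = sum(1 for t in transactions if item_b in t)
    count_ab = sum(1 for t in transactions if item_a in t and item_b in t)
    if count_a == 0 or count_b == 0:
        return 0.0  # an item that never occurs has no measurable association
    return (count_ab * n) / (count_a * count_b)


# Hypothetical data: one set of favorably rated items per user.
rated = [{"A", "B"}, {"A", "B"}, {"A"}, {"C"}]
lift_ab = lift(rated, "A", "B")  # 2 * 4 / (3 * 2)
lift_ac = lift(rated, "A", "C")  # never rated together
```

A lift above 1 suggests the two items are rated together more often than chance, a lift near 1 suggests independence, and a lift below 1 suggests they tend not to co-occur.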

A Study on Ethical Consumption Behaviors of College Students: Classification and Analysis according to the Ethical Consumption Behaviors (대학생 소비자의 윤리적 소비행동에 따른 유형분류 및 특성분석)

  • Hong, Eun-Sil;Shin, Hyo-Yeon
    • Korean Journal of Human Ecology / v.20 no.4 / pp.801-817 / 2011
  • The purpose of this research was to explore the level of ethical consumption among college students and to classify them into types according to their ethical consumption behaviors. The research was conducted with university students living in Gwangju. A total of 761 questionnaires were analyzed using the t-test, one-way ANOVA, Duncan's multiple range test, the $\chi^2$ test, and Ward's hierarchical cluster analysis. The results are summarized as follows. First, the overall average ethical consumption score of the college students was 3.14. Second, the surveyed students were classified into five types based on their mean scores on the three dimensions of ethical consumption behavior. Type 1 (the "entire region active" group), comprising 16.7% of students, scored high on all three dimensions. Type 2 (the "entire region average" group), comprising about 41.6% of students, scored at the average level on all three dimensions. Type 3 (the "future-oriented" group), comprising 13.9%, scored low on ethical consumption in commercial transactions but high on ethical consumption for future generations. Type 4 (the "commercial transaction oriented" group), comprising 9.1%, scored low on ethical consumption for contemporary humankind and for future generations but high on ethical consumption in commercial transactions. Type 5 (the "entire region passive" group), comprising 18.7% of students, scored low on all three dimensions.