• Title/Summary/Keyword: clustering problem (클러스터링 문제)

Search Result 429, Processing Time 0.023 seconds

Korean Phoneme Recognition Using Self-Organizing Feature Map (SOFM 신경회로망을 이용한 한국어 음소 인식)

  • Jeon, Yong-Koo;Yang, Jin-Woo;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / v.14 no.2 / pp.101-112 / 1995
  • Constructing a feature-map-based phoneme classification system for speech recognition usually requires two procedures: clustering and labeling. In this paper, we present a phoneme classification system that uses Kohonen's Self-Organizing Feature Map (SOFM) as both clusterer and labeler. The SOFM is known to perform a self-organizing process that produces an optimal local topographical mapping of the signal space and yields reasonably high accuracy in recognition tasks. Consequently, the SOFM can be applied effectively to phoneme recognition. In addition, to improve the performance of the phoneme classification system, we propose a learning algorithm combined with the classical K-means clustering algorithm in the fine-tuning stage. To evaluate the proposed phoneme classification algorithm, we first use a total of 43 phonemes, which form six intra-class feature maps for six different phoneme classes. In speaker-dependent phoneme classification tests using these six feature maps, we obtain a recognition rate of $87.2\%$ and confirm that the proposed algorithm is an efficient method for improving recognition performance and convergence speed.

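The K-means fine-tuning stage mentioned in the abstract above might look roughly like the sketch below. The abstract does not say how the SOFM and K-means are combined, so this is only an assumption: the codebook (standing in for trained SOFM weight vectors) is refined by repeated assign-and-average steps. The function names and the toy data are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook: List[List[float]], sample: Vector) -> int:
    """Index of the codebook vector closest to the sample."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(codebook[i], sample))


def kmeans_fine_tune(codebook: List[Vector], samples: List[Vector],
                     iterations: int = 10) -> List[List[float]]:
    """Refine codebook vectors with plain K-means updates.

    Assumption: `codebook` holds weight vectors from an already trained
    map; how the paper couples the two stages is not stated.
    """
    tuned = [list(v) for v in codebook]
    for _ in range(iterations):
        groups: List[List[Vector]] = [[] for _ in tuned]
        for s in samples:
            groups[nearest(tuned, s)].append(s)
        for i, group in enumerate(groups):
            if group:  # leave units with no assigned samples unchanged
                tuned[i] = [sum(col) / len(group) for col in zip(*group)]
    return tuned


# Toy data (hypothetical): two well-separated groups of 2-D feature vectors.
initial = [[0.0, 0.0], [10.0, 10.0]]
data = [[1.0, 1.0], [3.0, 1.0], [9.0, 9.0], [11.0, 9.0]]
tuned_codebook = kmeans_fine_tune(initial, data)
print(tuned_codebook)
```

Each codebook vector moves to the mean of the samples it wins, which is the standard K-means update; the topological neighborhood of the SOFM is ignored at this stage.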

A Study on Improvement of Energy Efficiency for LEACH Protocol in WSN (WSN에서 LEACH 프로토콜의 에너지 효율 향상에 관한 연구)

  • Lee, Won-Seok;Ahn, Tae-Won;Song, ChangYoung
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.3 / pp.213-220 / 2015
  • A wireless sensor network (WSN) consists of many inexpensive battery-operated sensors that cannot be replaced once deployed, so energy efficiency is essential. Among the methods for improving network energy efficiency, clustering algorithms, which divide a WSN into multiple smaller clusters and separate all sensors into cluster heads and their associated member nodes, are a very energy-efficient routing technique. The first cluster-based routing protocol, LEACH, elects cluster heads randomly according to a probability. However, if the selected cluster heads are poorly distributed, uniform energy consumption among cluster heads is not guaranteed and the number of active nodes may decrease. Here we propose a new routing scheme that compares the remaining energy of all nodes in a cluster and selects the node with the most remaining energy as the cluster head. Because the energy gap between nodes decreases, a node that has served as a cluster head then operates as a member node for much longer. As a result, the network lifespan increases and more data arrives at the base station.
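The cluster-head selection rule described in this abstract can be sketched as follows. This is a minimal illustration, assuming clusters are already formed and each node is a hypothetical `(node_id, remaining_energy)` pair; the names, the tie-breaking rule, and the energy values are assumptions, not details from the paper.

```python
from typing import Dict, List, Tuple

Node = Tuple[str, float]  # (node_id, remaining_energy); hypothetical representation


def select_cluster_head(cluster: List[Node]) -> str:
    """Return the id of the node with the most remaining energy."""
    if not cluster:
        raise ValueError("cluster must contain at least one node")
    # max() keeps the first maximal element, so ties go to the earliest node
    # (an assumption; the abstract does not specify tie-breaking).
    return max(cluster, key=lambda node: node[1])[0]


def select_all_heads(clusters: Dict[str, List[Node]]) -> Dict[str, str]:
    """Pick one cluster head per cluster."""
    return {cid: select_cluster_head(nodes) for cid, nodes in clusters.items()}


# Hypothetical energy levels in arbitrary units.
clusters = {
    "c1": [("n1", 0.42), ("n2", 0.80), ("n3", 0.55)],
    "c2": [("n4", 0.30), ("n5", 0.30)],
}
heads = select_all_heads(clusters)
print(heads)
```

Re-running the selection each round with updated energy values rotates the head role toward whichever node currently has the most energy left, which is what narrows the energy gap between nodes.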

A Flexible Multi-Threshold Based Control of Server Power Mode for Handling Rapidly Changing Loads in an Energy Aware Server Cluster (에너지 절감형 서버 클러스터에서 급변하는 부하 처리를 위한 유연한 다중 임계치 기반의 서버 전원 모드 제어)

  • Ahn, Taejune;Cho, Sungchoul;Kim, Seokkoo;Chun, Kyongho;Chung, Kyusik
    • KIPS Transactions on Computer and Communication Systems / v.3 no.9 / pp.279-292 / 2014
  • An energy-aware server cluster aims to reduce power consumption as much as possible while keeping QoS (quality of service) at the level of an energy non-aware server cluster. Existing methods for energy-aware server clusters calculate the minimum number of active servers needed to handle current user requests and control the server power mode at a fixed time interval so that only the needed servers are ON. When loads change rapidly, the QoS of the existing methods degrades because they cannot increase the number of active servers quickly enough. To solve this QoS problem, we classify load changes into five types (rapid growth, growth, normal, decline, and rapid decline) and apply a different threshold to each type when calculating the number of active servers. We also use a flexible scheme that adjusts the classification criteria for the multiple thresholds, considering not only the load change but also the remaining capacity of the servers to handle user requests. We performed experiments with a cluster of 15 servers. SPECweb, a benchmarking tool, was used to generate load patterns with rapid changes. Experimental results showed that the QoS of the proposed method improves to the level of an energy non-aware server cluster and that power consumption is reduced by up to about 50 percent, depending on the load pattern.
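A rough sketch of the multi-threshold idea in this abstract is given below. The five load-change types come from the abstract, but everything else is an assumption made for illustration: the threshold values, the cut-offs used to classify a change, and the formula for the number of active servers are hypothetical, and the paper's flexible adjustment based on remaining server capacity is not modeled.

```python
import math

# Hypothetical target-utilization thresholds per load-change type.
# Lower values keep more servers ON as headroom for growing load.
THRESHOLDS = {
    "rapid_growth": 0.5,
    "growth": 0.6,
    "normal": 0.7,
    "decline": 0.8,
    "rapid_decline": 0.9,
}


def classify_load_change(previous: float, current: float,
                         rapid: float = 0.5, mild: float = 0.1) -> str:
    """Classify the relative load change into one of five types.

    The cut-offs `rapid` and `mild` are assumed values.
    """
    if previous <= 0:
        return "rapid_growth" if current > 0 else "normal"
    change = (current - previous) / previous
    if change >= rapid:
        return "rapid_growth"
    if change >= mild:
        return "growth"
    if change <= -rapid:
        return "rapid_decline"
    if change <= -mild:
        return "decline"
    return "normal"


def active_servers(previous: float, current: float,
                   capacity_per_server: float, total_servers: int) -> int:
    """Number of servers to keep ON for the current load (requests/s)."""
    threshold = THRESHOLDS[classify_load_change(previous, current)]
    needed = math.ceil(current / (capacity_per_server * threshold))
    return max(1, min(total_servers, needed))


# 15 servers as in the abstract; the per-server capacity is hypothetical.
print(active_servers(100, 200, capacity_per_server=100, total_servers=15))
print(active_servers(200, 205, capacity_per_server=100, total_servers=15))
print(active_servers(1000, 100, capacity_per_server=100, total_servers=15))
```

With a single fixed threshold the first call would turn on fewer servers; choosing a lower threshold during rapid growth is what buys the extra headroom.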

Clustering Analysis by Customer Feature based on SOM for Predicting Purchase Pattern in Recommendation System (추천시스템에서 구매 패턴 예측을 위한 SOM기반 고객 특성에 의한 군집 분석)

  • Cho, Young Sung;Moon, Song Chul;Ryu, Keun Ho
    • Journal of the Korea Society of Computer and Information / v.19 no.2 / pp.193-200 / 2014
  • With the advent of the ubiquitous computing environment, ubiquitous computing is becoming part of everyday life, and a tremendous amount of information is accumulating rapidly. Given these trends, technology for finding exact information in large data sets and presenting it to users is becoming very important. Collaborative filtering, a method based on other users' preferences, cannot reflect the exact attributes of a user and still suffers from sparsity and scalability problems, although it has been used in practice with efforts to remedy these defects. In this paper, we propose a clustering method based on SOM that uses user features to predict purchase patterns in u-Commerce. Clusters must be formed by similarity of user features so that they reflect the attributes of the customer information and items with the same propensity can be found quickly within a cluster. The proposed method performs clustering using a feature vector built from user information and RFM factors derived from purchase history data. To verify the improved performance of the proposed system, we conduct experiments with a data set collected from an internet cosmetics shopping mall.

A Stable Multilevel Partitioning Algorithm for VLSI Circuit Designs Using Adaptive Connectivity Threshold (가변적인 연결도 임계치 설정에 의한 대규모 집적회로 설계에서의 안정적인 다단 분할 방법)

  • 임창경;정정화
    • Journal of the Korean Institute of Telematics and Electronics C / v.35C no.10 / pp.69-77 / 1998
  • This paper presents a new efficient and stable multilevel partitioning algorithm for VLSI circuit design. The performance of multilevel partitioning algorithms, which were proposed to enhance previous iterative-improvement partitioning algorithms for large-scale circuits, depends on the choice of construction method for the partition hierarchy. Because most previous multilevel partitioning algorithms impose experimental constraints on hierarchy construction, their performance is less stable. This lack of stability causes large variation in partition results across multiple runs. In this paper, we minimize the use of experimental constraints and propose a new method for constructing the partition hierarchy. The proposed method clusters cells according to the connection status of the circuit. After the partition hierarchy is constructed, a partition improvement algorithm, HYIP$^{[11]}$, which uses a hybrid bucket structure, unclusters the hierarchy to obtain partition results. Experimental results on ACM/SIGDA benchmark circuits show an improvement of up to 10-40% in minimum cutsize over previous algorithms$^{[3] [4] [5] [8] [10]}$. Our technique also outperforms ML$^{[10]}$, a representative multilevel partitioning method, by about 5% and 20% in minimum and average cutsize, respectively. In addition, the results of our algorithm with 10 runs are better than those of the ML algorithm with 100 runs.


A Shared Cache Directory based Wireless Internet Proxy Server Cluster (공유 캐시 디렉토리 기반의 무선 인터넷 프록시 서버 클러스터)

  • Kwak Hu-Keun;Chung Kyu-Sik
    • The KIPS Transactions:PartA / v.13A no.4 s.101 / pp.343-350 / 2006
  • Wireless internet proxy server clusters are used for the wireless internet because their caching, distillation, and clustering functions help overcome its limitations and meet its needs. A wireless internet proxy server cluster needs systematic scalability, a simple communication structure, cooperative caching, and the ability to serve Hot Spot requests. In our earlier research, we proposed the CD-A structure, which scales in a systematic way and has a simple communication structure but lacks cooperative caching. Hash-based load balancing can solve this problem, but it cannot handle Hot Spot requests. In this paper, we propose a shared-storage-based wireless internet proxy server cluster that provides systematic scalability, a simple communication structure, cooperative caching, and service for Hot Spot requests. The proposed method shares a single cache directory and combines the advantages of the existing CD-A structure with cooperative caching and Hot Spot request handling. We performed experiments using 16 PCs, and the results show a large performance improvement over the existing systems for Hot Spot requests.
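The Hot Spot limitation of hash-based load balancing noted in this abstract can be illustrated with a short sketch. It is not the paper's implementation: the URLs, the request mix, and the use of an MD5 digest as the hash are assumptions chosen only to make the effect visible on 16 servers.

```python
import hashlib
from collections import Counter


def pick_server(url: str, num_servers: int) -> int:
    """Map a URL to a server index with a deterministic hash."""
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_servers


NUM_SERVERS = 16  # cluster size used in the experiments; the rest is hypothetical
HOT_URL = "http://example.com/hot"

# 1000 requests for one popular page plus 100 requests for distinct pages.
requests = [HOT_URL] * 1000 + [f"http://example.com/page{i}" for i in range(100)]

load = Counter(pick_server(url, NUM_SERVERS) for url in requests)
hot_server = pick_server(HOT_URL, NUM_SERVERS)
print(f"server {hot_server} handles {load[hot_server]} of {len(requests)} requests")
```

Because the same URL always hashes to the same server, that server's cache is reused (a form of cooperative caching), but every request for the popular page lands on it, which is the Hot Spot problem the shared cache directory is meant to avoid.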

Analyzing Self-Introduction Letter of Freshmen at Korea National College of Agricultural and Fisheries by Using Semantic Network Analysis : Based on TF-IDF Analysis (언어네트워크분석을 활용한 한국농수산대학 신입생 자기소개서 분석 - TF-IDF 분석을 기초로 -)

  • Joo, J.S.;Lee, S.Y.;Kim, J.S.;Kim, S.H.;Park, N.B.
    • Journal of Practical Agriculture & Fisheries Research / v.23 no.1 / pp.89-104 / 2021
  • Using TF-IDF weights, which evaluate the importance of key words, semantic network analysis (SNA) was conducted on the self-introduction letters of freshmen at Korea National College of Agriculture and Fisheries (KNCAF) in 2020. The top three words by TF-IDF weight were agriculture, mathematics, and study (Q. 1); clubs, plants, and friends (Q. 2); friends, clubs, and opinions (Q. 3); and mushrooms, insects, and fathers (Q. 4). In the relationships between words, the words with high betweenness centrality were reason, high school, and attending (Q. 1); garbage, high school, and school (Q. 2); importance, misunderstanding, and completion (Q. 3); and processing, feed, and farmhouse (Q. 4). The words with high degree centrality were high school, inquiry, and grades (Q. 1); garbage, cleanup, and class time (Q. 2); opinion, meetings, and volunteer activities (Q. 3); and processing, space, and practice (Q. 4). Word pairs that frequently appeared together, that is, pairs with high correlation, were 'certification - acquisition', 'problem - solution', 'science - life', and 'misunderstanding - concession'. In the cluster analysis, the number of clusters obtained from the height of the cluster dendrogram was 2 (Q. 1), 4 (Q. 2 and Q. 4), and 5 (Q. 3). Cohesion within clusters was high, and heterogeneity between clusters was clearly shown.
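For readers unfamiliar with the TF-IDF weighting this abstract relies on, a minimal sketch follows. It uses one common variant (term frequency normalized by document length, times the natural log of the inverse document frequency); the abstract does not say which variant the authors used, and the tokenized toy documents are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Compute TF-IDF weights for each term in each tokenized document.

    tf  = term count / document length
    idf = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    doc_freq: Counter = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document

    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical tokenized answers; real input would be morphologically analyzed text.
documents = [
    ["agriculture", "study", "study"],
    ["clubs", "friends", "study"],
]
scores = tf_idf(documents)
for doc_scores in scores:
    print(sorted(doc_scores.items(), key=lambda kv: kv[1], reverse=True))
```

A term that occurs in every document ("study" here) gets weight zero, so the ranking favors words that are frequent in one answer but rare across the others.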

Topic Modeling Insomnia Social Media Corpus using BERTopic and Building Automatic Deep Learning Classification Model (BERTopic을 활용한 불면증 소셜 데이터 토픽 모델링 및 불면증 경향 문헌 딥러닝 자동분류 모델 구축)

  • Ko, Young Soo;Lee, Soobin;Cha, Minjung;Kim, Seongdeok;Lee, Juhee;Han, Ji Yeong;Song, Min
    • Journal of the Korean Society for information Management / v.39 no.2 / pp.111-129 / 2022
  • Insomnia is a chronic disease of modern society, with the number of new patients increasing by more than 20% over the last 5 years. It is a serious disease that requires diagnosis and treatment because the individual and social problems caused by lack of sleep are serious and its triggers are complex. This study collected 5,699 documents from 'insomnia', a community on 'Reddit', a social media platform where opinions are freely expressed. Based on the International Classification of Sleep Disorders (ICSD-3) standard and guidelines, and with the help of experts, an insomnia corpus was constructed by tagging the documents as insomnia-tendency or non-insomnia-tendency documents. Five deep learning language models (BERT, RoBERTa, ALBERT, ELECTRA, XLNet) were trained using the constructed corpus as training data. In the performance evaluation, RoBERTa performed best, with an accuracy of 81.33%. For in-depth analysis of the insomnia social data, topic modeling was performed using BERTopic, a recent method that addresses the weaknesses of the widely used LDA. The analysis identified 8 subject groups ('Negative emotions', 'Advice and help and gratitude', 'Insomnia-related diseases', 'Sleeping pills', 'Exercise and eating habits', 'Physical characteristics', 'Activity characteristics', 'Environmental characteristics'). Users expressed negative emotions and sought help and advice from the Reddit insomnia community. They also mentioned diseases related to insomnia, shared discourse on the use of sleeping pills, and expressed interest in exercise and eating habits. As insomnia-related characteristics, we found physical characteristics such as breathing, pregnancy, and heart; activity characteristics such as zombies, hypnic jerk, and groggy; and environmental characteristics such as sunlight, blankets, temperature, and naps.

Digital Archives of Cultural Archetype Contents: Its Problems and Direction (디지털 아카이브즈의 문제점과 방향 - 문화원형 콘텐츠를 중심으로 -)

  • Hahm, Han-Hee;Park, Soon-Cheol
    • Journal of the Korean BIBLIA Society for library and Information Science / v.17 no.2 / pp.23-42 / 2006
  • This is a study of the digital archives of Culturecontent.com, where 'Cultural Archetype Contents' are currently in service. One of the major purposes of our study is to point out problems in the current system and propose improvements to the digital archives. The government launched a four-year project to develop cultural archetype content sources and establish related businesses, with the hope of enhancing the nation's competitiveness. More specifically, the project focuses on producing source materials of cultural archetype contents on Korea's history, tradition, everyday life, arts, and general geographical books. Through this project, the government also intends to establish a proper distribution system for digitalized culture contents and to manage copyright issues. This paper analyzes the digital archives system that stores the culture content data produced from 2002 to 2005 and evaluates the current system's weaknesses and strengths. Our findings are summarized as follows. First, the digital archives system does not contain a semantic search engine, and therefore its full functionality lags. Second, similar data is classified into different categories rather than the same one, which confuses and inconveniences users. Users who want to find source materials could be disappointed by the current distribution system. Our paper suggests a better digital archives system using text mining technology, which consists of five significant intelligent processes: keyword search, summarization, clustering, classification, and topic tracking. Our paper endeavors to develop the best technical environment for preserving and using culture contents data. With the new, upgraded digital settings, users of culture contents data will discover a world of new knowledge. The technology we introduce in this paper will lead to the highest achievable digital intelligence through a new framework.