• Title/Summary/Keyword: clustering problem


The Recommendation System based on Staged Clustering for Leveled Programming Education (수준별 프로그래밍 교육을 위한 단계별 클러스터링 기반 추천시스템)

  • Kim, Kyung-Ah; Moon, Nam-Mee
    • Journal of the Korea Society of Computer and Information, v.15 no.8, pp.51-58, 2010
  • Programming education requires learning that is adjusted to each learner's level of ability. A recommendation system is one way of implementing such a personalized service. In this research, we propose a recommendation method that recommends learning items to individual learners in a web-based programming education environment. Our proposed system for leveled programming education provides programming problems appropriate to a learner's level and learning scope by employing collaborative filtering with learners' level profiles and a correlation profile between learning topics. As a result, it addresses the problem of providing programming problems appropriate to the learner's level, and we observed an improvement in learners' programming ability. Furthermore, compared with the original collaborative filtering method, our proposed method offers a way to address scalability, one of the limitations of recommendation systems, by improving recommendation performance and reducing analysis time.

Word Separation in Handwritten Legal Amounts on Bank Check by Measuring Gap Distance Between Connected Components (연결 성분 간 간격 측정에 의한 필기체 수표 금액 문장에서의 단어 추출)

  • Kim, In-Cheol
    • Journal of the Korean Institute of Intelligent Systems, v.14 no.1, pp.57-62, 2004
  • We propose an efficient method of word separation in handwritten legal amounts on bank checks, based on the spatial gaps between connected components. Previous gap measures all suffer from an inherent problem of underestimation or overestimation, which degrades separation performance. To alleviate this problem, we have developed a modified version of each distance measure. We also propose a method based on 4-class clustering that integrates three different types of distance measures to compensate effectively for the errors in each measure, whereby further improvement in word separation performance is expected. Through a series of word separation experiments, we found that the modified distance measures perform better, with word separation rates over 2-3% higher than those of their corresponding original distance measures. In addition, the proposed combining method based on 4-class clustering achieved further improvement by effectively reducing the errors common to two of the three distance measures as well as the individual errors.
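
To make the idea of gap-based word separation concrete, here is a minimal sketch of the simplest kind of gap measure: the horizontal distance between the bounding boxes of neighboring connected components, split with a fixed threshold. This is not the paper's method; the modified distance measures and the 4-class clustering step are not reproduced. The function names, the `(x_min, x_max)` box format, and the threshold value are all assumptions made for illustration.

```python
def bounding_box_gaps(boxes):
    """Horizontal gaps between consecutive components.

    boxes: list of (x_min, x_max) extents, one per connected component
    (hypothetical input format). Overlapping boxes yield a gap of 0,
    which is one way a bounding-box measure can underestimate a gap.
    """
    ordered = sorted(boxes)
    gaps = []
    right_edge = ordered[0][1]
    for x_min, x_max in ordered[1:]:
        gaps.append(max(0, x_min - right_edge))
        right_edge = max(right_edge, x_max)
    return gaps


def split_words(boxes, threshold):
    """Group components into words: a gap above `threshold` starts a new word."""
    ordered = sorted(boxes)
    words = [[ordered[0]]]
    for box, gap in zip(ordered[1:], bounding_box_gaps(boxes)):
        if gap > threshold:
            words.append([box])
        else:
            words[-1].append(box)
    return words


# Four components: two close pairs separated by one wide gap.
boxes = [(0, 10), (12, 20), (35, 48), (50, 60)]
gaps = bounding_box_gaps(boxes)
words = split_words(boxes, threshold=8)
print(gaps)
print(words)
```

A fixed threshold is the weakest part of this sketch; the abstract's point is that combining several gap measures by clustering compensates for the errors any single measure makes.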

Effective Streaming of XML Data for Wireless Broadcasting (무선 방송을 위한 효과적인 XML 스트리밍)

  • Park, Jun-Pyo; Park, Chang-Sup; Chung, Yon-Dohn
    • Journal of KIISE:Databases, v.36 no.1, pp.50-62, 2009
  • In wireless and mobile environments, data broadcasting is recognized as an effective means of data dissemination because of its benefits in bandwidth efficiency, energy efficiency, and scalability. In this paper, we address the problem of delayed query processing caused by tree-based index structures in wireless broadcast environments, which increases the access time of mobile clients. We propose a novel distributed index structure and a clustering strategy for streaming XML data that enable energy- and latency-efficient broadcast of XML data. We first define the DIX node structure to implement a fully distributed index structure, which contains the tag name, attributes, and text content of an element as well as its corresponding indices. By exploiting the index information in the DIX node stream, a mobile client can access the wireless stream with shorter latency. We also suggest a method of clustering DIX nodes in the stream, which can further enhance the performance of query processing over the stream in mobile clients. Through extensive performance experiments, we demonstrate that our approach is effective for wireless broadcasting of XML data and outperforms previous methods.

Separating nanocluster Si formation and Er activation in nanocluster-Si sensitized Er luminescence

  • Kim, In-Yong; Sin, Jung-Hun; Kim, Gyeong-Jung
    • Proceedings of the Korean Vacuum Society Conference, 2010.02a, pp.109-109, 2010
  • $Er^{3+}$ ions show stable and efficient luminescence at $1.54\;{\mu}m$ due to the $^4I_{13/2}\;{\rightarrow}\;^4I_{15/2}$ intra-4f transition. As this corresponds to the low-loss window of silica-based optical fibers, Er-based light sources have become a mainstay of long-distance telecom. In most telecom applications, $Er^{3+}$ ions are excited via resonant optical pumping. However, if nanocluster-Si (nc-Si) is co-doped with $Er^{3+}$, $Er^{3+}$ can also be excited via energy transfer from excited electrical carriers in the nc-Si. This combines the broad, strong absorption band of nc-Si with the narrow, stable emission spectra of $Er^{3+}$ to allow top-pumping with off-resonant, low-cost broadband light sources as well as electrical pumping. A widely used method to achieve nc-Si sensitization of $Er^{3+}$ is high-temperature annealing of an Er-doped, non-stoichiometric amorphous thin film with excess Si (e.g., silicon-rich silicon oxide (SRSO)) to precipitate nc-Si and optically activate $Er^{3+}$ at the same time. Unfortunately, such precipitation and growth of nc-Si in the Er-doped oxide matrix can lead to $Er^{3+}$ clustering away from nc-Si at anneal temperatures much lower than the ${\sim}1000^{\circ}C$ necessary for full optical activation of $Er^{3+}$ in $SiO_2$. Recently, silicon-rich silicon nitride (SRSN) was reported to be a promising alternative to SRSO that can overcome this problem of Er clustering. But as nc-Si formation and optical activation of $Er^{3+}$ remain linked in Er-doped SRSN, it is not clear which mechanism is responsible for the observed improvement. In this paper, we report on the effect of separating nc-Si formation and $Er^{3+}$ activation by using hetero-multilayers that consist of nm-thin SRSO or SRSN sensitizing layers with Er-doped $SiO_2$ or $Si_3N_4$ luminescing layers.


A Probability-based Clustering Protocol for Data Dissemination in Wireless Sensor Networks (무선 센서 네트워크에서 확률 기반의 클러스터링을 이용한 계층적 데이터 전송 프로토콜)

  • Kim, Moon-Seong; Cho, Sang-Hun; Lim, Hyung-Jin; Choo, Hyun-Seung
    • Journal of Internet Computing and Services, v.10 no.2, pp.153-160, 2009
  • One of the major challenges in designing a dissemination protocol for Wireless Sensor Networks (WSNs) is energy efficiency. Recently, this issue has received much attention from the research community, and SPMS, which outperforms the well-known protocol SPIN, is a representative protocol. One characteristic of SPMS is its use of the shortest path to minimize energy consumption. However, because it repeatedly uses the same shortest path, it cannot maximize network lifetime even though it reduces energy consumption. In this paper, we propose a dissemination protocol using probability-based clustering that guarantees energy-efficient data transmission and maximizes network lifetime. The proposed protocol solves the network lifetime problem with a novel probability function that depends on the residual energy and the transmission radius between nodes. The simulation results show that it guarantees energy-efficient transmission and increases network lifetime by approximately 78% compared with SPMS.


MD-TIX: Multidimensional Type Inheritance Indexing for Efficient Execution of XML Queries (MD-TIX: XML 질의의 효율적 처리를 위한 다차원 타입상속 색인기법)

  • Lee, Jong-Hak
    • Journal of Korea Multimedia Society, v.10 no.9, pp.1093-1105, 2007
  • This paper presents a multidimensional type inheritance indexing technique (MD-TIX) for XML databases. We use a multidimensional file organization as the index structure. Conventional XML database indexing techniques that use one-dimensional index structures do not efficiently handle complex queries involving both nested elements and type inheritance hierarchies. We extend a two-dimensional type hierarchy indexing technique (2D-THI) to index the nested elements of XML databases. 2D-THI is an indexing scheme that deals with the problem of clustering elements in a two-dimensional domain space, consisting of the key value domain and the type identifier domain, for indexing a simple element in a type hierarchy. In our extended scheme, we handle the clustering of the index entries in a multidimensional domain space consisting of a key value domain and multiple type identifier domains, with one type identifier domain per type hierarchy on a path expression. This scheme efficiently supports queries that involve search conditions on a nested element represented by an extended path expression. An extended path expression is a path expression in which every type hierarchy on the path can be substituted by an individual type or a subtype hierarchy.


An Optimal Cluster Analysis Method with Fuzzy Performance Measures (퍼지 성능 측정자를 결합한 최적 클러스터 분석방법)

  • 이현숙; 오경환
    • Journal of the Korean Institute of Intelligent Systems, v.6 no.3, pp.81-88, 1996
  • Cluster analysis partitions a collection of data points into a number of clusters such that the data points inside a cluster have a certain degree of similarity, and it is a fundamental process of data analysis. It has therefore played an important role in solving many problems in pattern recognition and image processing. For these problems, many clustering algorithms based on distance criteria have been developed, and fuzzy set theory has been introduced to reflect the nature of real data, where boundaries might be fuzzy. If fuzzy cluster analysis is to make a significant contribution to engineering applications, much more attention must be paid to the fundamental question of cluster validity, that is, how well the analysis has identified the structure that is present in the data. Several validity functionals, such as the partition coefficient, classification entropy, and proportion exponent, have been used to measure validity mathematically. But because the issue of cluster validity involves complex aspects, it is difficult to measure it with a single function, as conventional studies do. In this paper, we propose four performance indices and a way to measure the quality of the clustering formed by a given learning strategy.
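
Two of the validity functionals named in the abstract above have standard definitions. As a minimal sketch, Bezdek's partition coefficient and classification entropy can be computed from a fuzzy membership matrix as shown below. The function names and the row-per-cluster matrix layout are assumptions made for illustration, and the four performance indices proposed in the paper are not reproduced here.

```python
import math


def partition_coefficient(u):
    """Mean of squared memberships.

    u[i][k] is the membership of data point k in cluster i, with each
    column summing to 1 (assumed layout). The value ranges from 1/c for a
    maximally fuzzy partition up to 1 for a crisp partition.
    """
    n = len(u[0])
    return sum(x * x for row in u for x in row) / n


def classification_entropy(u):
    """Mean entropy of the memberships; 0 for crisp, log(c) for maximally fuzzy."""
    n = len(u[0])
    total = sum(x * math.log(x) for row in u for x in row if x > 0)
    return -total / n


# Two clusters, three points: one crisp partition and one maximally fuzzy one.
crisp = [[1.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]
fuzzy = [[0.5, 0.5, 0.5],
         [0.5, 0.5, 0.5]]

print(partition_coefficient(crisp), classification_entropy(crisp))
print(partition_coefficient(fuzzy), classification_entropy(fuzzy))
```

Both functionals reduce a whole partition to one number, which is the limitation the abstract raises when it argues that a single measuring function cannot capture cluster validity.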


Unified Labeling and Fine-Grained Verification for Improving Ground-Truth of Malware Analysis (악성코드 분석의 Ground-Truth 향상을 위한 Unified Labeling과 Fine-Grained 검증)

  • Oh, Sang-Jin; Park, Leo-Hyun; Kwon, Tae-Kyoung
    • Journal of the Korea Institute of Information Security & Cryptology, v.29 no.3, pp.549-555, 2019
  • According to a recent report by anti-virus vendors, the number of new and modified malware samples has increased exponentially. Malware analysis using machine learning has therefore been actively researched as a replacement for passive analysis methods, which are slow. However, many studies that use supervised machine learning take the low-reliability malware family names provided by antivirus vendors as labels. To solve the problem of low-reliability malware labels, this paper introduces a new labeling technique, "Unified Labeling", and further verifies the similarity of malicious behavior through fine-grained feature analysis. To validate the approach, various clustering algorithms were used and the results were compared with existing labeling techniques.

An improved LEACH-C routing protocol considering the distance between the cluster head and the base station (클러스터 헤드와 기지국간의 거리를 고려한 향상된 LEACH-C 라우팅 프로토콜)

  • Kim, TaeHyeon; Park, Sea Young; Kwon, Oh Seok; Lee, Jong-Yong; Jung, Kye-Dong
    • The Journal of the Convergence on Culture Technology, v.8 no.2, pp.373-377, 2022
  • Wireless sensor networks are applied in many areas, such as security, military detection, environmental management, industrial control, and home automation. Sensor networks, however, are inherently constrained by limited energy. In this paper, we propose the LEACH-CCBD (Low Energy Adaptive Clustering Hierarchy - Centralized with Cluster and Basestation Distance) algorithm, which uses energy efficiently by improving network transmission based on LEACH-C, one of the representative routing protocols. When configuring clusters, the LEACH-CCBD algorithm assigns each member node to a cluster head by comparing the sum of the distance from the member node to the cluster head and the distance from the cluster head to the base station. We used Matlab simulation to compare the performance of each protocol. The experiments showed that the proposed algorithm increases network lifetime and is superior to the LEACH and LEACH-C algorithms.
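
The assignment rule described in the abstract above can be sketched as follows, under one reading of it: each member node joins the cluster head that minimizes the distance from the node to the head plus the distance from the head to the base station. This is an illustrative interpretation, not the authors' implementation. The function names, the coordinates, and the use of plain Euclidean distance as the cost are assumptions.

```python
import math


def path_cost(node, head, base_station):
    """Assumed cost: node-to-head distance plus head-to-base-station distance."""
    return math.dist(node, head) + math.dist(head, base_station)


def assign_members(members, heads, base_station):
    """Map each member node name to the cluster head with the lowest path cost."""
    return {
        name: min(heads, key=lambda h: path_cost(pos, heads[h], base_station))
        for name, pos in members.items()
    }


def nearest_head(pos, heads):
    """Baseline for comparison: choose the closest head, ignoring the base station."""
    return min(heads, key=lambda h: math.dist(pos, heads[h]))


# Hypothetical layout: base station at the origin, two cluster heads.
base_station = (0.0, 0.0)
heads = {"A": (10.0, 0.0), "B": (0.0, 40.0)}
members = {"n1": (12.0, 1.0), "n2": (5.0, 38.0), "n3": (3.0, 30.0)}

assignment = assign_members(members, heads, base_station)
print(assignment)
print({name: nearest_head(pos, heads) for name, pos in members.items()})
```

In this layout node `n3` is closer to head `B`, but it is assigned to head `A` because `A` sits much nearer the base station, so the total transmission distance is shorter. That is the kind of difference from nearest-head assignment that the distance comparison in the abstract is meant to capture.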

Multi-Document Summarization Method of Reviews Using Word Embedding Clustering (워드 임베딩 클러스터링을 활용한 리뷰 다중문서 요약기법)

  • Lee, Pil Won; Hwang, Yun Young; Choi, Jong Seok; Shin, Young Tae
    • KIPS Transactions on Software and Data Engineering, v.10 no.11, pp.535-540, 2021
  • A multi-document is a document that covers various topics rather than a single topic; online reviews are a typical example. There have been several attempts to summarize online reviews because of the vast amount of information they contain. However, summarizing reviews collectively with existing summary models loses the various topics that make up the reviews. Therefore, in this paper, we present a method that summarizes reviews with minimal loss of topics. The proposed method classifies reviews through preprocessing, importance evaluation, embedding substitution using BERT, and embedding clustering. The classified sentences are then used to generate the final summary with a trained Transformer summary model. We evaluated the proposed model against an existing summary model, a seq2seq model, using ROUGE scores and cosine similarity, and it achieved higher summarization performance than the existing model.