• Title/Abstract/Keywords: Unsupervised

819 search results, processing time 0.025 s

A Semantic Representation Based-on Term Co-occurrence Network and Graph Kernel

  • Noh, Tae-Gil;Park, Seong-Bae;Lee, Sang-Jo
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 11, No. 4
    • /
    • pp.238-246
    • /
    • 2011
  • This paper proposes a new semantic representation and its associated similarity measure. The representation expresses the textual context observed around a given term as a network in which nodes are terms and edge weights are the numbers of co-occurrences between the connected terms. To compare terms represented as networks, a graph kernel is adopted as the similarity measure. The proposed representation has two notable merits compared with previous semantic representations. First, it handles polysemous words better than a vector representation: the network of a polysemous term is regarded as a combination of sub-networks that represent its senses, and the appropriate sub-network is identified from context before being compared by the kernel. Second, the representation permits not only words but also senses or contexts to be represented directly from the corresponding set of terms. The validity of the representation and its similarity measure is evaluated on two tasks: a synonym test and unsupervised word sense disambiguation. The method performed well and could compete with state-of-the-art unsupervised methods.
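
    A minimal sketch of the network construction described in this abstract, assuming each observed context is simply a list of terms and that an edge weight is the number of contexts in which two terms appear together. The function name `cooccurrence_network` and the toy contexts are hypothetical and not taken from the paper; the graph kernel itself is not sketched here.

    ```python
    from collections import Counter
    from itertools import combinations


    def cooccurrence_network(contexts):
        """Return {(term_a, term_b): count}, each pair stored in sorted order."""
        edges = Counter()
        for context in contexts:
            # Count each unordered pair of distinct terms once per context.
            for a, b in combinations(sorted(set(context)), 2):
                edges[(a, b)] += 1
        return edges


    # Toy contexts (hypothetical) observed around the polysemous term "bank".
    contexts = [
        ["money", "loan", "interest"],
        ["river", "water", "shore"],
        ["money", "interest", "account"],
    ]
    network = cooccurrence_network(contexts)
    print(network[("interest", "money")])
    ```

    In this toy input the financial terms and the river terms share no edge, which mirrors the idea of separate sub-networks for separate senses.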

Generic Summarization Using Generic Important of Semantic Features

  • 박선;이종훈
    • 한국항행학회논문지
    • /
    • Vol. 12, No. 5
    • /
    • pp.502-508
    • /
    • 2008
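  • The rapid spread of the Internet and the movement of massive amounts of information are making document summarization increasingly necessary. This paper proposes a new method for generic document summarization that extracts sentences using the non-negative semantic variable matrix obtained by non-negative matrix factorization and the generic importance of semantic features. The proposed method uses non-negativity constraints, which resemble the human cognitive process. As a result, it can summarize documents by selecting more meaningful sentences than unsupervised learning methods based on topic clustering or latent semantic analysis. Experimental results show that the proposed method performs better than the other methods.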

Decoding Brain States during Auditory Perception by Supervising Unsupervised Learning

  • Porbadnigk, Anne K.;Gornitz, Nico;Kloft, Marius;Muller, Klaus-Robert
    • Journal of Computing Science and Engineering
    • /
    • Vol. 7, No. 2
    • /
    • pp.112-121
    • /
    • 2013
  • Recent years have seen a rise of interest in using electroencephalography-based brain-computer interfacing methodology for investigating non-medical questions, beyond the purpose of communication and control. One of these novel applications is to examine how signal quality is processed neurally, which is of particular interest for industry, besides providing neuroscientific insights. As for most behavioral experiments in the neurosciences, the assessment of a given stimulus by a subject is required. Based on an EEG study on speech quality of phonemes, we will first discuss the information contained in the neural correlate of this judgement. Typically, this is done by analyzing the data along behavioral responses/labels. However, participants in such complex experiments often guess at the threshold of perception. This leads to labels that are only partly correct, and oftentimes random, which is a problematic scenario for using supervised learning. Therefore, we propose a novel supervised-unsupervised learning scheme, which aims to differentiate true labels from random ones in a data-driven way. We show that this approach provides a crisper view of the brain states that experimenters are looking for, besides discovering additional brain states to which the classical analysis is blind.

Improved Linear Dynamical System for Unsupervised Time Series Recognition

  • Thi, Ngoc Anh Nguyen;Yang, Hyung-Jeong;Kim, Soo-Hyung;Lee, Guee-Sang;Kim, Sun-Hee
    • International Journal of Contents
    • /
    • Vol. 10, No. 1
    • /
    • pp.47-53
    • /
    • 2014
  • The paper considers the challenges involved in measuring the similarities between time series, such as time shifts and the mixture of frequencies. To improve recognition accuracy, we investigate an improved linear dynamical system for discovering prominent features by exploiting the evolving dynamics and correlations in a time series, as the quality of unsupervised pattern recognition relies strongly on the extracted features. The proposed approach yields a set of compact extracted features that boosts the accuracy and reliability of clustering for time series data. Experimental evaluations are carried out on time series applications from the scientific, socio-economic, and business domains. The results show that our method exhibits improved clustering performance compared to conventional methods. In addition, the computation time of the proposed approach increases linearly with the length of the time series.

Design of Pattern Classification Rule based on Local Linear Discriminant Analysis Classifier by using Differential Evolutionary Algorithm

  • 노석범;황은진;안태천
    • 한국지능시스템학회논문지
    • /
    • Vol. 22, No. 1
    • /
    • pp.81-86
    • /
    • 2012
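  • This paper proposes a new method that extends conventional Linear Discriminant Analysis by partitioning the entire input space into multiple local spaces and designing pattern classification rules based on Local Linear Discriminant Analysis in each partitioned space. To partition the entire input space into several local spaces, the method uses k-Means clustering, a representative unsupervised clustering technique, together with the differential evolution algorithm, an optimization algorithm. To evaluate the performance of the proposed algorithm, comparison results against existing pattern classifiers are presented.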

Outlier detection of main engine data of a ship using ensemble method

  • 김동현;이지환;이상봉;정봉규
    • 수산해양기술연구
    • /
    • Vol. 56, No. 4
    • /
    • pp.384-394
    • /
    • 2020
  • This paper proposes an outlier detection model based on machine learning that can diagnose the presence or absence of major engine parts through unsupervised learning analysis of main engine big data of a ship. Engine big data of the ship was collected for more than seven months, and expert knowledge and correlation analysis were used to select features that are closely related to the operation of the main engine. For the unsupervised learning analysis, an ensemble model, in which many predictive models are strategically combined to increase model performance, is used for anomaly detection. As a result, the proposed model successfully distinguished the anomalous engine status from the normal status. To validate our approach, clustering analysis was conducted to find the different patterns of anomalies among the anomalous points. By examining the distribution of each cluster, we could successfully find the patterns of anomalies.

Adolescent Drinking Behaviors in Pusan City: An Analysis on the Sociopsychological Model

  • 고정자
    • 아동학회지
    • /
    • Vol. 7, No. 2
    • /
    • pp.55-73
    • /
    • 1986
  • This study analyzed the socio-psychological process of adolescent drinking behaviors. A total of 1,732 high school students in Pusan city were surveyed by questionnaire from May to July, 1985. A structural model based on a review of the literature was examined in order to test the following three hypotheses: (1) sociocultural and environmental impact on the adolescent belief system for drinking, on drinking situations, and on experiences of deviation, (2) relationships among adolescent belief system, drinking situations, and experiences of deviation, and (3) impact of antecedent variables on adolescent drinking levels. All hypotheses were supported by the data. The important outcomes were discussed as follows: 1. Because interpersonal factors were influential for the adolescent belief system concerning drinking, public drinking education through mass communication or drinking education in the curriculum was recommended. In addition to sex variables, friends' drinking and siblings' drinking were shown to have positive impacts on drinking situations. Also, adolescents' self-reported parents' views on drinking had significant effects. Because adolescent deviant experiences were generally affected by environmental factors, it is recommended that positive extra-curricular activities at both home and school be investigated. 2. There were significant relationships among adolescent belief systems, drinking situations, and deviant experiences. However, adolescent drinking behaviors in supervised situations had weak correlations with their belief systems and deviant behaviors. 3. Adolescent drinking levels were remarkably influenced by drinking behaviors in unsupervised situations. Because it is difficult to control actual adolescent drinking behaviors in unsupervised situations, it is important to fortify their belief system with continuous education programs.

Abrupt Shot Change Detection using an Unsupervised Clustering of Multiple Features

  • 이훈철;고윤호;윤병주;김성대;유상조
    • 대한전자공학회논문지SP
    • /
    • Vol. 38, No. 6
    • /
    • pp.712-720
    • /
    • 2001
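  • This paper proposes a method for detecting abrupt shot changes using clustering. The features commonly used in shot change detection tend to work well only in particular situations, so clustering-based techniques that consider several kinds of features simultaneously are widely used. In that case, however, choosing the initial cluster centers becomes an important problem. In this paper, the presence of a shot change is decided while adaptively changing the initial centers in k-means clustering. Experiments show better results than when the initial cluster centers are fixed.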
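
    A minimal sketch of plain k-means started from explicitly supplied initial centers, to illustrate why the choice of initial centers matters in this setting. This is not the paper's adaptive scheme: the 1-D toy values standing in for frame-difference features, the choice of the minimum and maximum as initial centers, and the function name `kmeans` are all assumptions made for illustration.

    ```python
    def kmeans(points, centers, iterations=10):
        """Plain k-means on 1-D values, starting from the given initial centers."""
        centers = list(centers)
        clusters = [[] for _ in centers]
        for _ in range(iterations):
            clusters = [[] for _ in centers]
            for p in points:
                # Assign each point to its nearest current center.
                nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
                clusters[nearest].append(p)
            # Recompute each center as its cluster mean; keep the old one if empty.
            centers = [
                sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
            ]
        return centers, clusters


    # Hypothetical frame-to-frame difference values; one value stands out.
    diffs = [0.1, 0.2, 0.15, 0.9, 0.12, 0.18]
    centers, clusters = kmeans(diffs, centers=[min(diffs), max(diffs)])
    print(clusters[1])
    ```

    With these toy values, the second cluster holds only the outlying difference, which is the kind of frame a shot change detector would flag.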

A Clustering Algorithm using Self-Organizing Feature Maps

  • 이종섭;강맹규
    • 대한산업공학회지
    • /
    • Vol. 31, No. 3
    • /
    • pp.257-264
    • /
    • 2005
  • This paper suggests a heuristic algorithm for the clustering problem. Clustering involves grouping similar objects into a cluster, and it is used in a wide variety of fields including data mining, marketing, and biology. Many approaches using Self-Organizing Feature Maps (SOFMs) have been proposed, but they have problems with a small number of output-layer nodes and with the initial weights. For example, one of them uses a one-dimensional map of k output-layer nodes to make k clusters, which makes fine-grained classification difficult. This paper suggests one-dimensional output-layer nodes in SOFMs, where the number of output-layer nodes is larger than the number of clusters to be found and the output-layer nodes are ordered by ascending sum of each node's weights. Input data can be located at SOFM output nodes and classified among the output nodes using Euclidean distance. The well-known IRIS data is used as experimental data. Unsupervised clustering of IRIS data typically results in 15 to 17 clustering errors, whereas the proposed algorithm has only six clustering errors.

Unsupervised Clustering of Multivariate Time Series Microarray Experiments based on Incremental Non-Gaussian Analysis

  • Ng, Kam Swee;Yang, Hyung-Jeong;Kim, Soo-Hyung;Kim, Sun-Hee;Anh, Nguyen Thi Ngoc
    • International Journal of Contents
    • /
    • Vol. 8, No. 1
    • /
    • pp.23-29
    • /
    • 2012
  • Multiple expression levels of genes obtained using time series microarray experiments have been exploited effectively to enhance understanding of a wide range of biological phenomena. However, the unique nature of microarray data is usually in the form of large matrices of expression genes with high dimensions. Among the huge number of genes presented in microarrays, only a small number of genes are expected to be effective for performing a certain task. Hence, discounting the majority of unaffected genes is the crucial goal of gene selection to improve accuracy for disease diagnosis. In this paper, a non-Gaussian weight matrix obtained from an incremental model is proposed to extract useful features of multivariate time series microarrays. The proposed method can automatically identify a small number of significant features via discovering hidden variables from a huge number of features. An unsupervised hierarchical clustering representative is then taken to evaluate the effectiveness of the proposed methodology. The proposed method achieves promising results based on predictive accuracy of clustering compared to existing methods of analysis. Furthermore, the proposed method offers a robust approach with low memory and computation costs.