• Title/Abstract/Keywords: second-order similarity

110 search results (processing time: 0.018 seconds)

Improving the Performance of Document Clustering with Distributional Similarities

  • 이재윤
    • 정보관리학회지
    • /
    • Vol. 24, No. 4
    • /
    • pp.267-283
    • /
    • 2007
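  • This study explores whether distributional similarity measures can replace the traditional cosine similarity formula in document clustering. We devised a way to apply three formulas derived from the KL divergence, a representative distributional similarity measure, to document vectors: the Jensen-Shannon divergence, the symmetric skew divergence, and the minimum skew divergence. To verify document clustering performance with distributional similarity, two experiments were designed and run on three test collections. In the first document clustering experiment, the minimum skew divergence clearly outperformed not only cosine similarity but also the other divergence formulas. In the second experiment, second-order distributional similarities were computed from the first-order similarity matrix using the Pearson correlation coefficient, and document clustering was performed with them. The results showed that second-order distributional similarity gave better document clustering performance overall. When both processing time and classification performance matter in document clustering, the minimum skew divergence formula proposed in this study is preferable; when only classification performance matters, the second-order distributional similarity approach is preferable.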
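
The measures named in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the abstract gives no formulas, so the skew divergence follows the commonly cited definition (KL divergence from `p` to a mixture of `q` smoothed with `p`), while the symmetric form (sum of both directions), the minimum form (smaller of both directions), and the `alpha=0.99` default are assumptions. All function names are hypothetical.

```python
import math

def kl(p, q):
    """KL divergence D(p || q), natural log; terms where p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon(p, q):
    """Average KL divergence of p and q from their midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def skew(p, q, alpha=0.99):
    """Skew divergence: KL from p to q smoothed with a small share of p."""
    mix = [alpha * qi + (1 - alpha) * pi for pi, qi in zip(p, q)]
    return kl(p, mix)

def symmetric_skew(p, q, alpha=0.99):
    """Assumed form: sum of the skew divergence in both directions."""
    return skew(p, q, alpha) + skew(q, p, alpha)

def min_skew(p, q, alpha=0.99):
    """Assumed form: the smaller of the two directional skew divergences."""
    return min(skew(p, q, alpha), skew(q, p, alpha))

def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def second_order_similarity(first_order):
    """Correlate the rows of a first-order similarity matrix with each other."""
    n = len(first_order)
    return [[pearson(first_order[i], first_order[j]) for j in range(n)]
            for i in range(n)]
```

Inputs to the divergence functions are assumed to be document vectors normalized to sum to 1. The second-order step takes any square first-order similarity matrix, so two documents count as similar when their similarity profiles against all other documents correlate.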

A Study on Keyword Extraction From a Single Document Using Term Clustering

  • 한승희
    • 한국문헌정보학회지
    • /
    • Vol. 44, No. 3
    • /
    • pp.155-173
    • /
    • 2010
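  • This study proposes an algorithm that extracts keywords from a single document using term clustering. First-order similarity and second-order distributional similarity were computed on single documents segmented into paragraph units, and term clustering was performed; the best performance was obtained when second-order distributional similarity was applied to 50-word paragraphs. Then, to extract keywords from a single document using the term clustering results, various keyword extraction formulas were derived by combining simple and relative frequencies and applied; the best results were obtained under the paragraph frequency (pf) and term frequency × inverse paragraph frequency ($tf \times ipf$) conditions. These results confirm that the proposed algorithm can effectively extract keywords from a single document with respect to the two conditions a good keyword should satisfy: topicality and an even frequency distribution.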
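
The two weights mentioned above can be sketched as follows. The abstract does not give the exact formulas, so this assumes `ipf` is defined by analogy with inverse document frequency, as `log(N / pf)` over the `N` paragraphs of one document; the function name and the input format (a list of token lists) are hypothetical.

```python
import math
from collections import Counter

def pf_and_tf_ipf(paragraphs):
    """Return {term: (pf, tf * ipf)} for one document split into paragraphs.

    paragraphs: list of token lists, one list per paragraph.
    Assumption: ipf = log(N / pf), where N is the number of paragraphs.
    """
    n = len(paragraphs)
    # tf: total occurrences of each term in the whole document
    tf = Counter(tok for para in paragraphs for tok in para)
    # pf: number of paragraphs that contain the term at least once
    pf = Counter(tok for para in paragraphs for tok in set(para))
    return {t: (pf[t], tf[t] * math.log(n / pf[t])) for t in tf}
```

Under this assumed definition, a term that appears in every paragraph gets a `tf * ipf` weight of zero, while `pf` rewards exactly such evenly spread terms, which is one way to read the abstract's pairing of topicality with an even frequency distribution.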

Image Data Classification using a Similarity Function based on Second Order Tensor

  • 윤동우;이관용;박혜영
    • 한국정보과학회논문지:소프트웨어및응용
    • /
    • Vol. 36, No. 8
    • /
    • pp.664-672
    • /
    • 2009
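  • Research that uses tensors for the efficient representation and processing of image data has recently attracted attention. This study aims to develop a system that effectively classifies data represented as second-order tensors. To this end, we first extend a data generation model, which consists of a class factor and an environment factor and was originally developed for ordinary vector data, to define a data generation model suited to images represented as second-order tensors, and we propose a similarity function suited to that model. The proposed similarity function is obtained by estimating the probability distribution of the environment factor with the matrix normal distribution. Experiments on several benchmark datasets confirmed that using second-order tensors improved the classification rate over vector-form representations. We also confirmed that the proposed similarity function is better suited to image data than other existing similarity functions.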
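
For reference, the standard matrix normal density is shown below. The abstract does not state the authors' formulation, so the symbols are assumptions chosen for illustration: $X$ is an $n \times p$ matrix (for example, an image), $M$ is its mean, $U$ is the $n \times n$ row covariance, and $V$ is the $p \times p$ column covariance.

```latex
p(X \mid M, U, V) =
  \frac{\exp\left(-\tfrac{1}{2}\,\mathrm{tr}\left[V^{-1}(X-M)^{\top}U^{-1}(X-M)\right]\right)}
       {(2\pi)^{np/2}\,|V|^{n/2}\,|U|^{p/2}}
```

Modeling rows and columns with separate covariances needs far fewer parameters than a full covariance over the $np$ entries of the vectorized image, which is one plausible reason a second-order tensor representation can be estimated more reliably.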

SYMMETRY REDUCTIONS, VARIABLE TRANSFORMATIONS AND EXACT SOLUTIONS TO THE SECOND-ORDER PDES

  • Liu, Hanze;Liu, Lei
    • Journal of applied mathematics & informatics
    • /
    • Vol. 29, No. 3_4
    • /
    • pp.563-572
    • /
    • 2011
  • In this paper, Lie symmetry analysis is performed on three mixed second-order PDEs that arise in fluid dynamics, nonlinear wave theory, plasma physics, and other fields. The symmetries and similarity reductions of the equations are obtained, and exact solutions to the equations are investigated by the dynamical system and power series methods. Then, exact solutions to the general types of PDEs are considered through a variable transformation. Finally, the symmetry and integration method is employed to reduce the nonlinear ODEs.

The Relationships Between Children's Perceptions Toward Grandparents and Their Intimate Behavior

  • Jung, Min-Suk;Ko, Eun-Kyo;Rho, Joseph Y.;Lee, Seung-Hyun
    • International Journal of Contents
    • /
    • Vol. 5, No. 4
    • /
    • pp.19-29
    • /
    • 2009
  • This study focuses on the causal relationship between children's intimate behavior and their level of perception of their grandparents. These perceptions are related to factors such as proximity, similarity, superiority, favorableness, and self-disclosure. We clarified the relation between intimate behavior and perception using the factors that affect children's behavior toward their grandparents, so that this study could serve as basic material for developing ways to improve the grandparent-grandchild relationship in which the grandparent actively encourages the grandchildren's intimate behavior. Regression analysis was used for hypothesis testing. The results indicated the following three points. First, perception factors affect active intimate behavior in the order of favorableness, superiority, self-disclosure, and similarity. Second, perception factors affect intimate behavior will in the order of favorableness, superiority, and self-disclosure. Lastly, a child's active intimate behavior was shown to influence their intimate behavior will.

Heat and mass transfer of a second grade magnetohydrodynamic fluid over a convectively heated stretching sheet

  • Das, Kalidas;Sharma, Ram Prakash;Sarkar, Amit
    • Journal of Computational Design and Engineering
    • /
    • Vol. 3, No. 4
    • /
    • pp.330-336
    • /
    • 2016
  • The present work is concerned with heat and mass transfer of an electrically conducting second grade MHD fluid past a semi-infinite stretching sheet with convective surface heat flux. The analysis accounts for thermophoresis and thermal radiation. A similarity transformation is used to reduce the governing equations to a dimensionless form. The local similarity equations are derived and solved using the Nachtsheim-Swigert shooting iteration technique together with a sixth-order Runge-Kutta integration scheme. Results for various flow characteristics are presented through graphs and tables that show the effect of the parameters characterizing the flow. Our analysis shows that the rate of heat transfer increases with the surface convection parameter. The fluid velocity and temperature in the boundary layer region also rise significantly as the thermal radiation parameter increases.

The Strength of the Relationship between Semantic Similarity and the Subcategorization Frames of the English Verbs: a Stochastic Test based on the ICE-GB and WordNet

  • 송상헌;최재웅
    • 한국언어정보학회지:언어와정보
    • /
    • Vol. 14, No. 1
    • /
    • pp.113-144
    • /
    • 2010
  • The primary goal of this paper is to find a feasible way to answer the question: Does the similarity in meaning between verbs relate to the similarity in their subcategorization? In order to answer this question in a rather concrete way on the basis of a large set of English verbs, this study made use of various language resources, tools, and statistical methodologies. We first compiled a list of 678 verbs that were selected from the most and second most frequent word lists from the Collins Cobuild English Dictionary, which also appeared in WordNet 3.0. We calculated similarity measures between all the pairs of the words based on the 'jcn' algorithm (Jiang and Conrath, 1997) implemented in the WordNet::Similarity module (Pedersen, Patwardhan, and Michelizzi, 2004). The clustering process followed, first building similarity matrices out of the similarity measure values, next drawing dendrograms on the basis of the matrices, then finally getting 177 meaningful clusters (covering 437 verbs) that passed a certain level set by z-score. The subcategorization frames and their frequency values were taken from the ICE-GB. In order to calculate the Selectional Preference Strength (SPS) of the relationship between a verb and its subcategorizations, we relied on the Kullback-Leibler Divergence model (Resnik, 1996). The SPS values of the verbs in the same cluster were compared with each other, which served to give the statistical values that indicate how much the SPS values overlap between the subcategorization frames of the verbs. Our final analysis shows that the degree of overlap, or the relationship between semantic similarity and the subcategorization frames of the verbs in English, is equally spread out from the 'very strongly related' to the 'very weakly related'. Some semantically similar verbs share a lot in terms of their subcategorization frames, and some others indicate an average degree of strength in the relationship, while the others, though still semantically similar, tend to share little in their subcategorization frames.
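
The KL-divergence form of selectional preference strength can be sketched as below. This follows the usual reading of Resnik's measure, applied here to subcategorization frames rather than semantic classes. The function name, the frame labels, the dictionary input format, and the choice of log base 2 are all illustrative assumptions, not details from the paper.

```python
import math

def selectional_preference_strength(frame_given_verb, frame_prior):
    """KL divergence between P(frame | verb) and the prior P(frame), in bits.

    frame_given_verb: {frame: P(frame | verb)} for one verb.
    frame_prior:      {frame: P(frame)} estimated over all verbs.
    Frames with zero conditional probability contribute nothing.
    """
    return sum(
        p * math.log2(p / frame_prior[frame])
        for frame, p in frame_given_verb.items()
        if p > 0
    )
```

A verb whose frame distribution matches the prior scores 0, and the score grows as the verb concentrates on fewer frames than the corpus as a whole.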


THE SPACE-TIME FRACTIONAL DIFFUSION EQUATION WITH CAPUTO DERIVATIVES

  • HUANG F.;LIU F.
    • Journal of applied mathematics & informatics
    • /
    • Vol. 19, No. 1_2
    • /
    • pp.179-190
    • /
    • 2005
  • We deal with the Cauchy problem for the space-time fractional diffusion equation, which is obtained from the standard diffusion equation by replacing the second-order space derivative with a Caputo (or Riemann-Liouville) derivative of order $\beta \in (0, 2]$ and the first-order time derivative with a Caputo derivative of order $\alpha \in (0, 1]$. The fundamental solution (Green function) for the Cauchy problem is investigated with respect to its scaling and similarity properties, starting from its Fourier-Laplace representation. We derive an explicit expression for the Green function. The Green function can also be interpreted as a spatial probability density function evolving in time. We further explain the similarity property by discussing the scale-invariance of the space-time fractional diffusion equation.

A Study on Influence of Stroke Element Properties to find Hangul Typeface Similarity

  • 박동연;전자연;임서영;임순범
    • 한국멀티미디어학회논문지
    • /
    • Vol. 23, No. 12
    • /
    • pp.1552-1564
    • /
    • 2020
  • As fonts of various styles have come into use, problems such as output errors caused by uninstalled fonts and difficulty in font recognition have arisen. To solve these problems, research on font recognition and recommendation has been actively conducted. However, Hangul font research remains at a basic level. Therefore, in order to automate the comparison of Hangul font similarity in the future, we analyze the influence of each stroke element property. First, we select seven representative properties based on Hangul stroke shape elements. Second, we design a calculation model to compare similarity between fonts. Third, we analyze the effect of each stroke element through the cosine similarity between the users' evaluations and the results of the model. As a result, there was no significant difference in the individual effect of each representative property. In addition, a more accurate similarity comparison was possible when many representative properties were used.
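
The comparison step described above relies on cosine similarity between two score vectors. A minimal sketch follows, assuming the user evaluations and the model outputs are given as equal-length lists of numbers; the function name is hypothetical and the paper's actual data layout is not stated in the abstract.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors.

    Returns 0.0 when either vector has zero length, to avoid dividing by zero.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0
```

Because the measure ignores vector length, a model whose scores are a constant multiple of the user ratings still reaches a similarity of 1.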

Machine-Part Grouping with Alternative Process Plans

  • 이종섭;강맹규
    • 대한산업공학회지
    • /
    • Vol. 31, No. 1
    • /
    • pp.20-26
    • /
    • 2005
  • This paper proposes a heuristic algorithm for generalized group technology (GT) problems that takes into account constraints on the number of machine cells and on the maximum and minimum number of machines per cell. The approach is split into two phases. In the first phase, we compute the proposed similarity coefficient for every pair of machines and sort the values in descending order. If the machine pair with the largest similarity coefficient strictly satisfies the birds of a different feather (BODF) constraint for a machine cell, we assign the machines to that cell. In the second phase, we assign parts to the machine cells so that the number of exceptional elements is smallest. The result is a machine-part grouping. The proposed algorithm is compared with the modified p-median model for machine-part grouping.