• Title/Abstract/Keywords: global similarity

242 search results (processing time: 0.026 s)

Improving of kNN-based Korean text classifier by using heuristic information

  • 임희석;남기춘
    • 컴퓨터교육학회논문지 / Vol. 5, No. 3 / pp.37-44 / 2002
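  • Automatic text classification is the task of assigning predefined categories to an input document, and its importance is growing as a means of managing documents efficiently and systematically. Machine-learning approaches to automatic text classification are being actively studied in Korea and abroad, but most work concentrates on proposing new learning models to improve classifier performance or on comparing learning models with one another; research on optimizing or improving a classification system built on a particular learning model remains relatively scarce. This paper defines the parameters that play an important role in the performance of a kNN-based text classification system and proposes ways to improve a Korean text classifier using heuristic information obtained through experiments. In the experiments, a classification function that uses the similarity weights of neighboring documents, a feature selection method that uses category information, and a global classification method showed high performance. We also confirmed that a local method with a k value chosen carefully for each classification domain can perform comparably to the global method, which requires far more computation.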
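
The similarity-weighted classification function mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' formula, so the cosine similarity, the bag-of-words representation, and the names `cosine` and `knn_classify` are assumptions chosen for illustration only.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors (dict-like)."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(query, training, k=3):
    """Return the label whose k nearest neighbours have the largest summed similarity.

    training is a list of (term_vector, label) pairs. Each neighbour votes with
    its similarity to the query instead of a flat count of 1 (hypothetical
    weighting scheme, not taken from the paper).
    """
    scored = sorted(
        ((cosine(query, vec), label) for vec, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = defaultdict(float)
    for similarity, label in scored[:k]:
        votes[label] += similarity
    return max(votes, key=votes.get)


# One very close neighbour of class "A" outweighs two distant ones of class "B".
training = [
    (Counter({"a": 1}), "A"),
    (Counter({"a": 1, "b": 3}), "B"),
    (Counter({"a": 1, "c": 3}), "B"),
]
label = knn_classify(Counter({"a": 1}), training, k=3)
```

With plain majority voting the same three neighbours would yield "B"; weighting by similarity is what lets the single near-identical document decide the outcome.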

Line Segments Map Building Using Sonar for Mobile Robot

  • 홍현주;권석근;노영식
    • 제어로봇시스템학회논문지 / Vol. 7, No. 9 / pp.783-789 / 2001
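  • This paper represents the surroundings of a mobile robot as line segments, using obstacle information on a grid map acquired while the robot travels through an unknown environment. Because the obstacle information in the grid map is obtained with sonar (ultrasonic) sensors, only obstacles near the robot are detected. Line segments are built from the acquired grid information with the Hough transform and are completed incrementally. The proposed method was verified through experiments.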
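
As a rough illustration of how the Hough transform turns occupied grid cells into line candidates, the sketch below votes each cell into (rho, theta) bins. It is not the authors' implementation: the function name `hough_lines`, the bin resolutions, and the vote threshold are assumptions, and segment endpoints are not computed.

```python
import math
from collections import Counter


def hough_lines(cells, theta_steps=180, rho_res=1.0, min_votes=3):
    """Vote occupied grid cells (x, y) into (rho, theta) bins.

    A line is parameterised as rho = x*cos(theta) + y*sin(theta).
    Returns [((rho, theta), votes), ...] for bins with at least min_votes
    votes, ordered from most to fewest votes.
    """
    accumulator = Counter()
    for x, y in cells:
        for i in range(theta_steps):
            theta = math.pi * i / theta_steps
            rho = x * math.cos(theta) + y * math.sin(theta)
            accumulator[(round(rho / rho_res), i)] += 1
    return [
        ((rho_bin * rho_res, math.pi * theta_bin / theta_steps), votes)
        for (rho_bin, theta_bin), votes in accumulator.most_common()
        if votes >= min_votes
    ]


# Ten occupied cells forming a wall along x = 5.
wall = [(5, y) for y in range(10)]
lines = hough_lines(wall)
```

For the wall above, the bin with rho = 5 and theta = 0 collects a vote from every cell. Neighbouring angle bins can tie with it on such a short wall, so a practical system would merge nearby peaks before extracting segments.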

Comparative Analysis of Large Genome in Human-Chimpanzee

  • Kim, Tae-Hyung;Kim, Dae-Soo;Jeon, Yeo-Jin;Cho, Hwan-Gue;Kim, Heui-Soo
    • 한국생물정보학회: Conference Proceedings / 한국생물정보시스템생물학회, Proceedings of the 2nd Annual Conference (2003) / pp.183-192 / 2003
  • With the availability of complete genome sequences such as those of human, mouse, fugu, and chimpanzee chromosome 22, comparative analysis of large genomes across species at varying evolutionary distances is considered a powerful approach for identifying coding and functional non-coding sequences. Here we describe a fast and efficient global alignment method designed for large genomic regions spanning megabase pairs. We identify all regions of similarity as HSP (high-scoring segment pair) regions using local alignments, and then build large syntenic regions based on both the extension of anchors at the HSP regions in the two species and a global conservation map. Using this alignment approach, we examined rearrangement loci in human chromosome 21 and chimpanzee chromosome 22. We extracted 30 Mb of human chromosome 21 that is syntenic with chimpanzee chromosome 22 and identified genomic rearrangements (deletions and insertions ranging in size from 0.3 to 200 kb). Our experiment covers all insertion/deletion (indel) events in excess of 300 bp within the chimpanzee chromosome 22 and human chromosome 21 alignments, in order to identify new insertions that occurred over the last 7 million years of evolution. Finally, we discuss evolutionary features through comparative analysis of the Ka/Ks (non-synonymous/synonymous substitution) ratio in 119 orthologous genes of chromosome 21 and 53 MHC class I genes in the human and chimpanzee genomes.
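
The abstract describes building a global alignment from local-alignment anchors but does not say how anchors are combined. One common way to do this is collinear chaining, sketched below as an assumption and not as the authors' method; the function name `chain_anchors` and the (start_a, start_b, length) anchor format are hypothetical.

```python
def chain_anchors(anchors):
    """Return the highest-scoring collinear, non-overlapping chain of anchors.

    Each anchor is (start_a, start_b, length): a local match starting at
    start_a in sequence A and start_b in sequence B. The score of a chain is
    the sum of its anchor lengths. Simple O(n^2) dynamic programming.
    """
    anchors = sorted(anchors)
    n = len(anchors)
    if n == 0:
        return []
    best = [length for _, _, length in anchors]  # best chain score ending at i
    prev = [-1] * n                              # back-pointer for traceback
    for i, (a_i, b_i, len_i) in enumerate(anchors):
        for j in range(i):
            a_j, b_j, len_j = anchors[j]
            # Anchor j may precede anchor i only if it ends before i starts
            # in both sequences (same order, no overlap).
            if a_j + len_j <= a_i and b_j + len_j <= b_i and best[j] + len_i > best[i]:
                best[i] = best[j] + len_i
                prev[i] = j
    i = max(range(n), key=best.__getitem__)
    chain = []
    while i != -1:
        chain.append(anchors[i])
        i = prev[i]
    return chain[::-1]


# The third anchor lies off the main diagonal and is left out of the chain.
anchors = [(0, 0, 100), (150, 160, 80), (300, 50, 60), (400, 420, 120)]
chain = chain_anchors(anchors)
```

Anchors that fall outside the best chain, like the third one here, are the kind of off-diagonal matches that would be inspected as candidate rearrangements.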

Graph Structure and Evolution of the Korea web

  • 한인규;이상호
    • 정보처리학회논문지D / Vol. 14D, No. 3 / pp.293-302 / 2007
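  • Research on the web graph is very important for algorithms that crawl web documents efficiently and for searching and discovering communities. The phenomena observed in web graph studies reveal the characteristics of the web, and studying the evolution of the web graph makes it possible to predict the size of the web and its evolutionary process. This paper studies the Korean web graph, which has about 110 million nodes and about 2.7 billion edges. We first study connectivity, that is, how well the pages of the Korean web are linked to one another; the connectivity of the Korean web can be represented by the bow-tie model. We then analyze how characteristics of the Korean web, such as the power-law phenomenon, differ from those of the global web. The properties of the Korean web graph differed considerably from those of the global web. Finally, we study the evolution of the Korean web graph from several perspectives.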
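
The bow-tie model splits a directed graph into a largest strongly connected core (SCC), pages that can reach the core (IN), pages reachable from the core (OUT), and everything else. The sketch below shows that decomposition on a toy graph. It is illustrative only: the names `reach` and `bow_tie` are hypothetical, and the quadratic SCC search would not scale to a graph of the size studied in the paper.

```python
from collections import defaultdict, deque


def reach(start, adjacency):
    """Set of nodes reachable from start (inclusive) by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def bow_tie(edges):
    """Split a directed graph into SCC (largest core), IN, OUT and OTHER."""
    forward, backward = defaultdict(set), defaultdict(set)
    nodes = set()
    for source, target in edges:
        forward[source].add(target)
        backward[target].add(source)
        nodes.update((source, target))
    if not nodes:
        return {"SCC": set(), "IN": set(), "OUT": set(), "OTHER": set()}

    # The strongly connected component of a node is the set of nodes it can
    # reach that can also reach it. Keep the largest component found.
    core, assigned = set(), set()
    for node in nodes:
        if node in assigned:
            continue
        component = reach(node, forward) & reach(node, backward)
        assigned |= component
        if len(component) > len(core):
            core = component

    seed = next(iter(core))
    out_part = reach(seed, forward) - core
    in_part = reach(seed, backward) - core
    other = nodes - core - in_part - out_part
    return {"SCC": core, "IN": in_part, "OUT": out_part, "OTHER": other}


edges = [("a", "b"), ("b", "c"), ("c", "a"),  # core cycle
         ("i", "a"),                          # links into the core
         ("c", "o"),                          # links out of the core
         ("x", "y")]                          # disconnected from the core
parts = bow_tie(edges)
```

On a real crawl, a linear-time SCC algorithm such as Tarjan's or Kosaraju's would replace the per-node search used here.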

Surface Flux Measurements at King Sejong Station in West Antarctica: I. Turbulent Characteristics and Sensible Heat Flux

  • 최태진;이방용;이희춘;심재설
    • Ocean and Polar Research / Vol. 26, No. 3 / pp.453-463 / 2004
  • The Antarctic Peninsula is important for global warming research because of the pronounced increase in air temperature there over the last century. The first eddy covariance system was established at King Sejong Station, located in the northern region of the Antarctic Peninsula, in December 2002 and has been operated for over one year. Here, we analyze turbulent characteristics to determine quality control criteria for turbulent sensible heat flux data and to assess the feasibility of long-term eddy covariance measurement under the extreme weather conditions of the Antarctic Peninsula. We also report preliminary results on sensible heat flux. Analyses of turbulent characteristics, including the integral turbulence characteristics of vertical velocity (w) and heat (T), a stationarity test, and an investigation of the correlation coefficient, show that the data follow Monin-Obukhov similarity and that the eddy covariance flux data are reliable. ${\sim}47\%$ of the total retrieved sensible heat flux data could be used for further analysis. Daytime-averaged sensible heat flux showed a pronounced seasonal variation, with a maximum of up to $300\ \mathrm{W\,m^{-2}}$ in summer. In conclusion, continuous and long-term eddy covariance measurement may be possible at the study site, and the land surface may influence the atmosphere significantly through heat transport in summer.
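
In the eddy covariance method, sensible heat flux is estimated as H = rho * c_p * cov(w, T), the covariance of the fluctuations of vertical velocity and temperature scaled by air density and specific heat. The sketch below shows that calculation together with one common form of stationarity check, which compares the whole-period covariance with the mean covariance of shorter sub-periods. The default values for `rho` and `cp`, the number of sub-periods, and the function names are assumptions, not values from the paper.

```python
def covariance(w, t):
    """Covariance of the fluctuations w' and T' about their period means."""
    n = len(w)
    w_mean = sum(w) / n
    t_mean = sum(t) / n
    return sum((wi - w_mean) * (ti - t_mean) for wi, ti in zip(w, t)) / n


def sensible_heat_flux(w, t, rho=1.2, cp=1005.0):
    """H = rho * cp * cov(w, T) in W m^-2.

    w: vertical velocity samples (m/s); t: temperature samples (K or deg C).
    rho (kg/m^3) and cp (J/(kg K)) are assumed typical values for air.
    """
    return rho * cp * covariance(w, t)


def stationarity_deviation(w, t, parts=6):
    """Relative difference between the mean sub-period covariance and the
    covariance of the whole period; small values suggest stationarity.
    Undefined (division by zero) when the whole-period covariance is zero.
    """
    size = len(w) // parts
    sub = [
        covariance(w[i * size:(i + 1) * size], t[i * size:(i + 1) * size])
        for i in range(parts)
    ]
    whole = covariance(w[:parts * size], t[:parts * size])
    return abs((sum(sub) / parts - whole) / whole)


w = [0.1, -0.1, 0.2, -0.2]   # m/s
t = [0.5, -0.5, 1.0, -1.0]   # K (fluctuations about the mean)
flux = sensible_heat_flux(w, t)
```

A run would typically be rejected when the stationarity deviation exceeds a chosen threshold; the threshold used at King Sejong Station is not stated in the abstract.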

Utilizing Local Bilingual Embeddings on Korean-English Law Data

  • 최순영;;임희석
    • 한국융합학회논문지 / Vol. 9, No. 10 / pp.45-53 / 2018
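  • Research on bilingual word embedding has recently attracted much attention. However, bilingual word embedding from parallel-aligned corpora that pair Korean with another language has not been actively studied, because large, high-quality corpora are hard to obtain. Local bilingual word embeddings that can be used in a specific domain are rarer still. In addition, translation pairs in bilingual word embedding often fail to correspond one-to-one in the number of words. For local word embedding, this paper crawled 868,163 Korean-English paragraphs of Korean law, embedded them, and proposed three connection strategies. These strategies address the irregular correspondence problem mentioned above, improve the quality of translation pairs in a paragraph-aligned corpus, and achieve twice the performance of the baseline global bilingual word embedding.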

Cross-Cultural Studies in Fashion Marketing Discipline

  • 조윤진;양수진;김은영;추호정
    • 한국의류학회지 / Vol. 30, No. 8 / pp.1312-1322 / 2006
  • Recent accelerated globalization has changed every aspect of consumers' lives around the globe, so understanding the similarities and differences among people in the world has become a crucial element of business for many global companies. As one of the most globalized industries in Korea, the fashion business also urgently needs help from academics in understanding global consumers. This study aimed to analyze cross-cultural fashion marketing studies published in two respected journals in fashion studies: Journal of Korean Society of Clothing and Textiles and Journal of the Korean Society of Costume. Four researchers independently searched the target journals to locate studies using cross-cultural approaches. A total of 45 cross-cultural studies published in the two journals between 1977 and 2005 were found and analyzed. The major findings can be summarized as follows. First, the US was the most frequently studied country, followed by China, Japan, Hong Kong, and others. Second, the popular subjects of cross-cultural studies in fashion marketing were the fashion marketing environment and management rather than consumer psychology. Third, about 78% of the sampled studies used a quantitative approach, and statistical methods such as factor analysis, t-test, ANOVA, and $\chi^2$ analysis were commonly used. Finally, problems in sampling methods, translation of scales, and equivalence of concept, measure, and sample were analyzed. Suggestions for future cross-cultural studies are discussed.

A Multi-Phase Decision Making Model for Supplier Selection Under Supply Risks

  • 유준수;박양병
    • 산업경영시스템학회지 / Vol. 40, No. 4 / pp.112-119 / 2017
  • Selecting suppliers in a global supply chain is a very difficult and complicated decision-making problem, particularly because of the various types of supply risk in addition to the uncertain performance of potential suppliers. This paper proposes a multi-phase decision-making model for supplier selection under supply risks in global supply chains. In the first phase, the model suggests supplier selection solutions suited to a given decision-making condition using a rule-based expert system. The expert system consists of a knowledge base of supplier selection solutions and an "if-then" rule-based inference engine. The knowledge base contains information about options and their consistency for seven characteristics of 20 supplier selection solutions chosen from articles published in SCIE journals since 2010. In the second phase, the model computes the potential suppliers' general performance indices using the technique for order preference by similarity to ideal solution (TOPSIS), based on their scores obtained by applying the suggested solutions. In the third phase, the model computes their risk indices using TOPSIS, based on their historical and predicted scores obtained by applying a risk evaluation algorithm. The evaluation algorithm deals with seven types of supply risk that significantly affect a supplier's performance and eventually influence the buyer's production plan. In the fourth phase, the model selects Pareto-optimal suppliers based on their general performance and risk indices. An example demonstrates the implementation of the proposed model. The proposed model gives supply chain managers a practical tool for selecting the best suppliers while considering supply risks as well as general performance.
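
The second and third phases both rely on TOPSIS, which ranks alternatives by their closeness to an ideal solution. The following is a minimal sketch of the standard TOPSIS steps (vector normalization, weighting, distance to the ideal and anti-ideal points). The function name `topsis`, the sample scores, and the weights are made up for illustration and do not come from the paper.

```python
import math


def topsis(matrix, weights, benefit):
    """Return the TOPSIS closeness coefficient (0 to 1) of each alternative.

    matrix:  rows are alternatives (e.g. suppliers), columns are criteria.
    weights: one weight per criterion.
    benefit: benefit[j] is True if larger values of criterion j are better,
             False if smaller values are better (cost criterion).
    Assumes no criterion column is all zeros and that the alternatives are
    not all identical (either case would divide by zero).
    """
    n_criteria = len(matrix[0])
    # Step 1: vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_criteria)]
        for row in matrix
    ]
    # Step 2: ideal and anti-ideal points per criterion.
    columns = list(zip(*weighted))
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    anti = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    # Step 3: closeness = distance to anti-ideal / (sum of both distances).
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Three hypothetical suppliers scored on quality (higher is better)
# and cost (lower is better).
scores = topsis(
    matrix=[[9, 2], [5, 5], [1, 9]],
    weights=[0.5, 0.5],
    benefit=[True, False],
)
```

The first supplier is best on both criteria and so coincides with the ideal point, while the third coincides with the anti-ideal point. A risk index could be computed the same way from risk scores, with the Pareto-optimal suppliers then chosen on the two indices.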

A New Classification for Cervical Ossification of the Posterior Longitudinal Ligament Based on the Coexistence of Segmental Disc Degeneration

  • Lee, Jun Ki;Ham, Chang Hwa;Kwon, Woo-Keun;Moon, Hong Joo;Kim, Joo Han;Park, Youn-Kwan
    • Journal of Korean Neurosurgical Society / Vol. 64, No. 1 / pp.69-77 / 2021
  • Objective : Classification systems for cervical ossification of the posterior longitudinal ligament (OPLL) have traditionally focused on the morphological characteristics of ossification. Although the classification describes many clinical features associated with the shape of the ossification, including the concept of spondylosis seems necessary because of the similarity in age distribution. Methods : Patients diagnosed with OPLL who presented with increased signal intensity (ISI) on magnetic resonance imaging were surgically treated in our department. The patients were divided into two groups (pure versus degenerative) according to the presence of disc degeneration. Results : Of 141 patients enrolled in this study, more than half (61%) were classified into the degenerative group. The pure group showed a profound male predominance, early presentation of myelopathy, and a different predilection for ISI compared to the degenerative group. The mean canal compromise ratio (CC) of the ISI was 47% in the degenerative group versus 61% in the pure group (p<0.0000). In contrast, the global and segmental motions were significantly larger in the degenerative group (p<0.0000 and p=0.003, respectively). The canal diameters and global angles did not differ between groups. Conclusion : Classifying cervical OPLL based on the presence of combined disc degeneration is beneficial for understanding the disorder's behavior. CC appears to be the main factor in the development of myelopathy in the pure group, whereas additional dynamic factors appear to affect its development in the degenerative group.

Multivariable Integrated Evaluation of GloSea5 Ocean Hindcasting

  • Lee, Hyomee;Moon, Byung-Kwon;Kim, Han-Kyoung;Wie, Jieun;Park, Hyo Jin;Chang, Pil-Hun;Lee, Johan;Kim, Yoonjae
    • 한국지구과학회지 / Vol. 42, No. 6 / pp.605-622 / 2021
  • Seasonal forecasting has numerous socioeconomic benefits because it can be used for disaster mitigation. Therefore, it is necessary to diagnose and improve the seasonal forecast model. Moreover, the model performance is partly related to the ocean model. This study evaluated the hindcast performance in the upper ocean of the Global Seasonal Forecasting System version 5-Global Coupled Configuration 2 (GloSea5-GC2) using a multivariable integrated evaluation method. The normalized potential temperature, salinity, zonal and meridional currents, and sea surface height anomalies were evaluated. Model performance was affected by the target month and was found to be better in the Pacific than in the Atlantic. An increase in lead time led to a decrease in overall model performance, along with decreases in interannual variability, pattern similarity, and root mean square vector deviation. Improving the performance for ocean currents is more critical than enhancing the performance for the other evaluated variables. The tropical Pacific showed the best accuracy in the surface layer, but a spring predictability barrier was present. At a depth of 301 m, the north Pacific and tropical Atlantic exhibited the best and worst accuracies, respectively. These findings provide fundamental evidence for the ocean forecasting performance of GloSea5.