• Title/Abstract/Keywords: Similarity Metrics

Search results: 75 items (processing time: 0.024 seconds)

Signal Peptide Cleavage Site Prediction Using a String Kernel with Real Exponent Metric

  • 지상문
    • 한국정보과학회논문지:소프트웨어및응용 / Vol. 36, No. 10 / pp. 786-792 / 2009
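  • A support vector machine computes the similarity between data points using a kernel function and uses this similarity to find the optimal hyperplane for classifying patterns. It is therefore important to use a similarity measure that effectively reflects the characteristics of the data. In this study, to obtain an optimal similarity between amino acid sequences, metrics derived from the evolutionary relationships and hydrophobicity of amino acids are generalized into a form with a real exponent. We show that the proposed metric satisfies the conditions of a metric, and we examine its relationship to the metrics used within the string kernels that are widely used to compute the similarity of amino acid sequences and DNA sequences. We also show, through signal peptide cleavage site prediction experiments, that a metric better suited to the target problem can be found among the generalized metrics.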

Pattern and process in MAEUL, a traditional Korean rural landscape

  • Kim, Jae-Eun;Hong, Sun-Kee
    • Journal of Ecology and Environment / Vol. 34, No. 2 / pp. 237-249 / 2011
  • Land-use changes due to the socio-economic environment influence landscape patterns and processes, which affect habitats and biodiversity. This study considers the effects of such land-use changes, particularly on the traditional rural "Maeul" forested landscape, by analyzing landscape structure and vegetation changes. Three study areas were examined that have seen their populations decrease and age over the last few decades. Five types of plant life-forms (Raunkiaer life-forms) were distinguished to investigate ecosystem function. Principal component analysis was used to understand vegetation dynamics and community characteristics based on a vegetation similarity index. Ordination analysis of transformed species-coverage data was introduced to clarify vegetation dynamics. Landscape indices, such as area metrics, edge metrics, and shape metrics, showed that spatial heterogeneity has increased over time in all areas. Pinus densiflora was the main land-use plant type in all study areas but decreased over time, whereas Quercus spp. increased. Over a decade, P. densiflora communities shifted to deciduous oak and plantation. These findings indicate that the impact of human activities on the Maeul landscape is twofold. While forestry activities caused heavy disturbances, the abandonment of traditional human activities has led to natural succession. Furthermore, it can be concluded that the type and intensity of these human impacts on landscape heterogeneity relate differently to vegetation succession. This reflects the cause and consequence of patch dynamics. We discuss an approach for sustainable landscape planning and management of the Maeul landscape based on traditional management.

Parametric and Non Parametric Measures for Text Similarity

  • 존 믈랴히루;김종남
    • 융합신호처리학회논문지 / Vol. 20, No. 4 / pp. 193-198 / 2019
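  • The flood of genuine and fake information on the Internet has prompted numerous studies on text analysis. Unauthorized copying of others' work without citation and the manipulation of unrelated research results have drawn public attention for some time. Various tools have been developed to counter and reduce plagiarism in research. In this study, sentence similarity is measured using parametric and non-parametric methods based on cosine similarity and the Pearson and Spearman correlations. Cosine similarity and the Pearson correlation yielded the highest similarity coefficients, whereas the Spearman correlation yielded lower similarity coefficients. This paper proposes using non-parametric methods to measure sentence similarity under a non-normality assumption, in contrast to parametric methods, which depend on the normality assumption and are subject to bias.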
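
The three measures named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the whitespace tokenizer, the term-frequency representation, and the helper names (`term_vectors`, `ranks`) are assumptions made for the example.

```python
import math
from collections import Counter


def term_vectors(a, b):
    """Build aligned term-frequency vectors over the joint vocabulary (assumed representation)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    vocab = sorted(set(ca) | set(cb))
    return [ca[w] for w in vocab], [cb[w] for w in vocab]


def cosine(x, y):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(p * q for p, q in zip(x, y))
    nx = math.sqrt(sum(p * p for p in x))
    ny = math.sqrt(sum(q * q for q in y))
    return dot / (nx * ny) if nx and ny else 0.0


def pearson(x, y):
    """Pearson correlation (parametric); 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = math.sqrt(sum((p - mx) ** 2 for p in x))
    sy = math.sqrt(sum((q - my) ** 2 for q in y))
    return cov / (sx * sy) if sx and sy else 0.0


def ranks(v):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman correlation (non-parametric): Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


x, y = term_vectors("the cat sat on the mat", "the cat lay on the rug")
print(cosine(x, y), pearson(x, y), spearman(x, y))
```

Because Spearman works on ranks rather than raw counts, it does not rely on the normality assumption that underlies inference with Pearson correlation, which is the contrast the abstract draws.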

Using Genre Rating Information for Similarity Estimation in Collaborative Filtering

  • Lee, Soojung
    • 한국컴퓨터정보학회논문지 / Vol. 24, No. 12 / pp. 93-100 / 2019
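  • Similarity computation is critical to the performance of memory-based collaborative filtering systems. These systems use user ratings to recommend products to customers on online commercial sites. To make more suitable recommendations, they select and consult the users most similar to the current user. Many similarity measures have been developed in the literature, but most of them suffer from the data sparsity or cold-start problem. Unlike existing measures, this paper aims to produce more reliable similarity values even under sparse data conditions by extracting as much preference information as possible from user ratings. We present a new similarity measure that uses not only user ratings but also the movie genre information provided by the dataset. Experiments comparing the proposed measure with existing related measures confirm that it performs better than or comparably to them on the major performance criteria.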

Development of a Process Planning Expert System: Application to Block Division in Shipbuilding

  • 박병태;이재원
    • 한국정밀공학회:학술대회논문집 / 한국정밀공학회 1993 Spring Conference Proceedings / pp. 370-374 / 1993
  • This paper describes a study on expert-system-based process planning of the block division process in shipbuilding. The prototype system developed determines the block division line of the midship of a crude-oil tanker. A case-based reasoning (CBR) approach, which relies on previous similar cases to solve the problem, is applied instead of rule-based reasoning (RBR). Similar cases are retrieved from the case base according to the similarity metrics between the input problem and the cases. The retrieved case with the highest priority is then adapted to fit the input problem by adaptation rules. The adapted solution is proposed as the division line for the input problem.

Generative probabilistic model with Dirichlet prior distribution for similarity analysis of research topic

  • Milyahilu, John;Kim, Jong Nam
    • 한국멀티미디어학회논문지 / Vol. 23, No. 4 / pp. 595-602 / 2020
  • We propose a generative probabilistic model with a Dirichlet prior distribution for topic modeling and text similarity analysis. It assigns a topic and calculates text correlation between documents within a corpus. It also provides posterior probabilities that are assigned to each topic of a document based on the prior distribution in the corpus. We then present a Gibbs sampling algorithm for inference about the posterior distribution and compute text correlation among 50 abstracts from papers published by IEEE. We also conduct supervised learning to set a benchmark that justifies the performance of LDA (Latent Dirichlet Allocation). The experiments show that the accuracy of topic assignment to a given document is 76% for LDA. The results for supervised learning show an accuracy of 61%, a precision of 93%, and an F1-score of 96%. A discussion of the experimental results provides a thorough justification based on probabilities, distributions, evaluation metrics, and correlation coefficients with respect to topic assignment.

Precise segmentation of fetal head in ultrasound images using improved U-Net model

  • Vimala Nagabotu;Anupama Namburu
    • ETRI Journal / Vol. 46, No. 3 / pp. 526-537 / 2024
  • Monitoring fetal growth in utero is crucial to anomaly diagnosis. However, current computer-vision models struggle to accurately assess the key metrics (i.e., head circumference and occipitofrontal and biparietal diameters) from ultrasound images, largely owing to a lack of training data. Mitigation usually entails image augmentation (e.g., flipping, rotating, scaling, and translating). Nevertheless, the accuracy of our task remains insufficient. Hence, we offer a U-Net fetal head measurement tool that leverages a hybrid Dice and binary cross-entropy loss to compute the similarity between actual and predicted segmented regions. Ellipse-fitted two-dimensional ultrasound images acquired from the HC18 dataset are input, and their lower feature layers are reused for efficiency. During regression, a novel region of interest pooling layer extracts elliptical feature maps, and during segmentation, feature pyramids fuse field-layer data with a new scale attention method to reduce noise. Performance is measured by Dice similarity, mean pixel accuracy, and mean intersection-over-union, giving 97.90%, 99.18%, and 97.81% scores, respectively, which match or outperform the best U-Net models.
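
A hybrid Dice and binary cross-entropy loss, together with the Dice and intersection-over-union scores mentioned above, can be sketched on flattened masks as follows. This is only an illustration under stated assumptions: the equal weighting (`weight=0.5`), the smoothing term `eps`, and all function names are hypothetical and are not taken from the paper.

```python
import math


def dice(pred, target, eps=1e-7):
    """Dice coefficient: 2|A∩B| / (|A| + |B|), on flat lists of values in [0, 1]."""
    inter = sum(p * t for p, t in zip(pred, target))
    return (2 * inter + eps) / (sum(pred) + sum(target) + eps)


def iou(pred, target, eps=1e-7):
    """Intersection-over-union: |A∩B| / |A∪B|."""
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) - inter
    return (inter + eps) / (union + eps)


def bce(prob, target, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for p, t in zip(prob, target):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(prob)


def hybrid_loss(prob, target, weight=0.5):
    """Weighted sum of Dice loss (1 - Dice) and BCE; the weight is an assumed value."""
    return weight * (1 - dice(prob, target)) + (1 - weight) * bce(prob, target)


predicted = [0.9, 0.8, 0.2, 0.1]   # predicted foreground probabilities
actual = [1, 1, 0, 0]              # ground-truth mask
print(dice(predicted, actual), iou(predicted, actual), hybrid_loss(predicted, actual))
```

The Dice term rewards overlap of the whole region, which helps when the foreground is small, while the cross-entropy term penalizes each pixel independently.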

An approach for improving the performance of the Content-Based Image Retrieval (CBIR)

  • Jeong, Inseong
    • 한국측량학회지 / Vol. 30, No. 6_2 / pp. 665-672 / 2012
  • Amid rapidly increasing imagery inputs and their volume in remote sensing imagery databases, Content-Based Image Retrieval (CBIR) is an effective tool for searching for an image feature or image content that a user wants to retrieve. It seeks to capture salient features from a 'query' image and then to locate other image regions with similar features elsewhere in the image database. For a CBIR approach that uses texture as its primary feature primitive, designing a texture descriptor that better represents image contents is key to improving CBIR results. For this purpose, an extended feature vector combining the Gabor filter and the co-occurrence histogram method is suggested and evaluated against quantitative and qualitative retrieval performance criteria. For better CBIR performance, assessing similarity between high-dimensional feature vectors is also a challenging issue. Therefore, a number of distance metrics (i.e., the L1 and L2 norms) are tried to measure closeness between two feature vectors, and their impact on the retrieval result is analyzed. In this paper, experimental results are presented with several CBIR samples. The current results show that 1) the overall retrieval quantity and quality are improved by combining the two types of feature vectors, 2) some features are better retrieved by a specific feature vector, and 3) retrieval result quality (i.e., the ranking of retrieved image tiles) is sensitive to the adopted similarity metric when the extended feature vector is employed.
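
The sensitivity of the ranking to the chosen distance can be illustrated with a small sketch. The two-dimensional feature vectors and the tile names below are made up for the example and are far smaller than real Gabor or co-occurrence descriptors; `rank_tiles` is a hypothetical helper.

```python
import math


def l1(a, b):
    """L1 (Manhattan) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2(a, b):
    """L2 (Euclidean) distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_tiles(query, database, metric):
    """Return tile names ordered from closest to farthest under the given metric."""
    return sorted(database, key=lambda name: metric(query, database[name]))


query = [0.0, 0.0]
tiles = {"A": [3.0, 0.0], "B": [2.0, 2.0]}  # hypothetical feature vectors

print(rank_tiles(query, tiles, l1))  # A first: L1 distances are 3 and 4
print(rank_tiles(query, tiles, l2))  # B first: L2 distances are 3 and about 2.83
```

The same two tiles swap places depending on the norm, which is the kind of ranking sensitivity the abstract reports in its third finding.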

Evaluation of Geo-based Image Fusion on Mobile Cloud Environment using Histogram Similarity Analysis

  • Lee, Kiwon;Kang, Sanggoo
    • 대한원격탐사학회지 / Vol. 31, No. 1 / pp. 1-9 / 2015
  • Mobility and cloud platforms have become the dominant paradigm for developing web services that deal with huge and diverse digital content for scientific solutions or engineering applications. These two trends are technically combined into the mobile cloud computing environment, which takes the benefits of each. The intention of this study is to design and implement a mobile cloud application for remotely sensed image fusion, toward further practical geo-based mobile services. In this implementation, the system architecture consists of two parts: a mobile web client and a cloud application server. The mobile web client provides the user interface for image fusion processing and image visualization, and the mobile web service for data listing and browsing. The cloud application server runs on OpenStack, an open source cloud platform. In this part, three server instances are generated: a web server instance, a tiling server instance, and a fusion server instance. With metadata browsing of the processing data, image fusion by a Bayesian approach is performed using functions within Orfeo Toolbox (OTB), an open source remote sensing library. In addition, the similarity of fused images with respect to the input image set is estimated by histogram distance metrics. This result can be used as the reference criterion for user parameter choice in Bayesian image fusion. The implementation strategy for a mobile cloud application based fully on open source is thought to offer advantages for a mobile service that supports specific remote sensing functions, besides image fusion schemes, according to user demands to expand remote sensing application fields.
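
The abstract does not say which histogram distance metrics were used. As a hedged sketch, two common choices (histogram intersection and the chi-square distance) can be computed on normalized gray-level histograms as follows; the bin count and the function names are assumptions made for this example.

```python
def histogram(pixels, bins=4, max_value=256):
    """Normalized histogram of integer pixel values in [0, max_value)."""
    counts = [0] * bins
    for v in pixels:
        counts[min(v * bins // max_value, bins - 1)] += 1
    n = len(pixels)
    return [c / n for c in counts]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalized histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def chi_square(h1, h2):
    """Symmetric chi-square distance: 0.0 for identical histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


source = histogram([0, 64, 128, 192])   # stand-in for an input band
fused = histogram([0, 70, 130, 250])    # stand-in for a fused result
print(intersection(source, fused), chi_square(source, fused))
```

A higher intersection or a lower chi-square value indicates that the fused image preserves the gray-level distribution of the input more closely, which is how such a score could guide the choice of fusion parameters.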

Steganography Software Analysis: Focusing on Performance Comparison

  • 이효주;박용석
    • 한국정보통신학회논문지 / Vol. 25, No. 10 / pp. 1359-1368 / 2021
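  • Steganography is a technique for concealing data within data, and its main purpose is to keep the existence of the transmission medium from being detected. Current steganography research covers a wide range of algorithm-based hiding and detection techniques, but experiment-driven studies that analyze software performance are relatively scarce. This paper aims to identify and evaluate the characteristics of five steganography software tools that conceal data using different algorithms. To investigate the performance of the software, we use PSNR (Peak Signal to Noise Ratio) and SSIM (Structural SIMilarity), which are used as visual quality metrics. We derive the PSNR and SSIM of the stego images embedded by each tool and compare their performance quantitatively. By presenting the tools that perform best under each metric, we aim to contribute to forensics.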
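
The two quality metrics can be sketched as follows for 8-bit grayscale pixels stored in flat lists. This is a simplified illustration, not the paper's evaluation code: `ssim_global` computes SSIM once over the whole image, whereas the standard SSIM averages over local windows, and the constants 0.01 and 0.03 are the commonly used defaults assumed here.

```python
import math


def psnr(cover, stego, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    mse = sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)


def ssim_global(x, y, max_value=255):
    """Single-window SSIM over the whole image (a simplification of windowed SSIM)."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * max_value) ** 2  # stabilizes the luminance term
    c2 = (0.03 * max_value) ** 2  # stabilizes the contrast/structure term
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )


cover = [100, 120, 140, 160]
stego = [101, 120, 141, 160]  # hypothetical LSB-style changes to two pixels
print(psnr(cover, stego), ssim_global(cover, stego))
```

Higher PSNR and an SSIM closer to 1 both indicate that embedding changed the cover image less, which is the basis for the quantitative comparison described above.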