• Title/Summary/Keyword: Similarity Metrics

An Implementation of XML document searching system based on Structure and Semantics Similarity (구조와 내용 유사도에 기반한 XML 웹 문서 검색시스템 구축)

  • Park Uchang;Seo Yeojin
    • Journal of Internet Computing and Services / v.6 no.2 / pp.99-115 / 2005
  • Extensible Markup Language (XML) is an Internet standard used to express and convert data. To find the necessary information in XML documents, a search system for XML documents is needed. In this research, we developed a search system that finds documents matching the structure and content of a given XML document, making the best use of XML structure. The search metrics take into account similarity in tag names, tag values, and tag structure. After a search, the system displays the results ranked by aggregate similarity. Three query methods are provided: conventional keyword search, search by tag names and their values, and search by XML document. Users can choose the method that best suits their preference, which increases the usefulness of the system.

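The ranking idea in this abstract (similarity in tag names, tag values, and tag structure combined into one aggregate score) can be sketched as follows. This is a minimal illustration, not the authors' method: the use of Jaccard overlap, the equal weights, and the helper names `features`, `aggregate_similarity`, and `rank` are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def features(xml_text):
    """Collect tag names, (tag, value) pairs, and root-to-node paths."""
    root = ET.fromstring(xml_text)
    tags, values, paths = set(), set(), set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        tags.add(node.tag)
        paths.add(path)
        text = (node.text or "").strip().lower()
        if text:
            values.add((node.tag, text))
        for child in node:
            walk(child, path)

    walk(root, "")
    return tags, values, paths


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def aggregate_similarity(query_xml, doc_xml, weights=(1.0, 1.0, 1.0)):
    """Weighted mean of tag-name, tag-value, and structure similarity.

    The weights are hypothetical; the abstract does not state them.
    """
    q, d = features(query_xml), features(doc_xml)
    scores = [jaccard(x, y) for x, y in zip(q, d)]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def rank(query_xml, docs):
    """Return document names ordered by descending aggregate similarity."""
    return sorted(docs, key=lambda n: aggregate_similarity(query_xml, docs[n]),
                  reverse=True)


query = "<book><title>XML</title><year>2005</year></book>"
docs = {
    "a": "<book><title>XML</title><year>2005</year></book>",
    "b": "<book><title>SQL</title></book>",
    "c": "<cd><artist>X</artist></cd>",
}
print(rank(query, docs))
```

Here document `a` matches the query on all three components, `b` shares some tags and structure but no values, and `c` shares nothing, so the ranking follows that order.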
Isomer Differentiation Using in silico MS2 Spectra. A Case Study for the CFM-ID Mass Spectrum Predictor

  • Milman, Boris L.;Ostrovidova, Ekaterina V.;Zhurkovich, Inna K.
    • Mass Spectrometry Letters / v.10 no.3 / pp.93-101 / 2019
  • Algorithms and software for predicting tandem mass spectra have been developed in recent years. In this work, we explore how distinct in silico MS² spectra are predicted for isomers, i.e. compounds having the same formula and similar molecular structures, to differentiate between them. We used the CFM-ID 2.0/3.0 predictor with regard to (a) test compounds, whose experimental mass spectra had been randomly sampled from the MassBank of North America (MoNA) collection, and to (b) the most widespread isomers of test compounds searched in the PubChem database. In the first validation test, in silico mass spectra constitute a reference library, and library searches are performed for test experimental spectra of "unknowns". The searches led to the true positive rate (TPR) of (46-48 ± 10)%. In the second test, in silico and experimental spectra were interchanged and this resulted in a TPR of (58 ± 10)%. There were no significant differences between results obtained with different metrics of spectral similarity and predictor versions. In a comparison of test compounds vs. their isomers, a statistically significant correlation between mass spectral data and structural features was observed. The TPR values obtained should be regarded as reasonable results for predicting tandem mass spectra of related chemical structures.
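The library-search validation described above can be illustrated with a small sketch. The abstract does not say which spectral similarity metrics were used, so cosine similarity over integer-binned peaks is an assumption here, and the function names and toy spectra are hypothetical.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectra given as {m/z bin: intensity}."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def library_search(query, library):
    """Return the name of the library spectrum most similar to the query."""
    return max(library, key=lambda name: cosine_similarity(query, library[name]))


def true_positive_rate(queries, library):
    """Fraction of queries whose best library hit is the true compound."""
    hits = sum(1 for true_name, spec in queries
               if library_search(spec, library) == true_name)
    return hits / len(queries)


# Toy reference library standing in for predicted (in silico) spectra.
library = {
    "isomer_A": {91: 100.0, 119: 40.0},
    "isomer_B": {91: 30.0, 105: 100.0},
}
# Toy "experimental" spectra labelled with the true compound.
queries = [
    ("isomer_A", {91: 90.0, 119: 50.0}),
    ("isomer_B", {91: 80.0, 119: 10.0, 105: 20.0}),
]
print(true_positive_rate(queries, library))
```

In this toy case the second query is closer to the wrong isomer, so only one of the two searches is a true positive.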

Deep Learning Framework with Convolutional Sequential Semantic Embedding for Mining High-Utility Itemsets and Top-N Recommendations

  • Siva S;Shilpa Chaudhari
    • Journal of information and communication convergence engineering / v.22 no.1 / pp.44-55 / 2024
  • High-utility itemset mining (HUIM) is a dominant technology that enables enterprises to make real-time decisions, including supply chain management, customer segmentation, and business analytics. However, classical support value-driven Apriori solutions are confined and unable to meet real-time enterprise demands, especially for large amounts of input data. This study introduces a groundbreaking model for top-N high utility itemset mining in real-time enterprise applications. Unlike traditional Apriori-based solutions, the proposed convolutional sequential embedding metrics-driven cosine-similarity-based multilayer perceptron learning model leverages global and contextual features, including semantic attributes, for enhanced top-N recommendations over sequential transactions. The MATLAB-based simulations of the model on diverse datasets demonstrated an impressive precision (0.5632), mean absolute error (MAE) (0.7610), hit rate (HR)@K (0.5720), and normalized discounted cumulative gain (NDCG)@K (0.4268). The average MAE across different datasets and latent dimensions was 0.608. Additionally, the model achieved remarkable cumulative accuracy and precision of 97.94% and 97.04%, respectively, surpassing existing state-of-the-art models. This affirms the robustness and effectiveness of the proposed model in real-time enterprise scenarios.
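The two ranking metrics reported in this abstract, HR@K and NDCG@K, are commonly computed as sketched below. The sketch assumes the usual leave-one-out setting with a single held-out target item per user; the abstract does not describe the evaluation protocol, and the function names and toy data are hypothetical.

```python
import math


def hit_rate_at_k(ranked_lists, targets, k):
    """Share of users whose held-out item appears in their top-k list."""
    hits = sum(1 for ranked, t in zip(ranked_lists, targets) if t in ranked[:k])
    return hits / len(targets)


def ndcg_at_k(ranked_lists, targets, k):
    """Mean NDCG@k with one relevant item per user.

    With a single relevant item the ideal DCG is 1, so the per-user
    score is 1 / log2(position + 1) for a 1-based position.
    """
    total = 0.0
    for ranked, t in zip(ranked_lists, targets):
        top = ranked[:k]
        if t in top:
            total += 1.0 / math.log2(top.index(t) + 2)
    return total / len(targets)


# Toy recommendations for three users and their held-out items.
ranked_lists = [["a", "b", "c"], ["x", "y", "z"], ["p", "q", "r"]]
targets = ["a", "y", "zz"]
print(hit_rate_at_k(ranked_lists, targets, k=2))
print(ndcg_at_k(ranked_lists, targets, k=2))
```

HR@K only records whether the target item was retrieved, while NDCG@K also rewards placing it nearer the top of the list.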

Ontology Selection Ranking Model based on Semantic Similarity Approach (의미적 유사성에 기반한 온톨로지 선택 랭킹 모델)

  • Oh, Sun-Ju;Ahn, Joong-Ho;Park, Jin-Soo
    • The Journal of Society for e-Business Studies / v.14 no.2 / pp.95-116 / 2009
  • Ontologies have supported the integration of heterogeneous and distributed information. More and more ontologies and tools have been developed in various domains. However, building ontologies requires much time and effort. Therefore, ontologies need to be shared and reused among users. Specifically, finding the desired ontology in an ontology repository will benefit users. In the past, most studies on retrieving and ranking ontologies focused mainly on lexical-level support. In those cases, it is impossible to find an ontology that includes the concepts users want to use at the semantic level. Most ontology libraries and ontology search engines have not provided semantic matching capability. Retrieving an ontology that users want to use requires a new ontology selection and ranking mechanism based on semantic similarity matching. We propose an ontology selection and ranking model consisting of selection criteria and metrics with enhanced semantic matching capabilities. The proposed model has two novel features that distinguish it from previous research models. First, it enhances the ontology selection and ranking method practically and effectively by enabling semantic matching of taxonomy or relational linkage between concepts. Second, it identifies which measures should be used to rank ontologies in a given context and what weight should be assigned to each selection measure.

Sound quality characteristics of heavy-weight impact sounds generated by impact ball (임팩트 볼에 의한 중량 충격음의 Sound Quality 특성)

  • You, Jin;Lee, Hye-Mi;Jeon, Jin-Yong
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2006.11a / pp.671-674 / 2006
  • Heavy-weight impact sounds generated by an impact ball were classified according to their frequency characteristics on the equal-loudness contours. Sound quality metrics such as Zwicker's loudness, sharpness, and roughness were also measured for each classified impact sound. The loudness spectrum was regarded as an indicator of the differences in characteristics among the classified impact sounds. Korean adjectives expressing the sound quality characteristics of floor impact sounds were also investigated through adoptability and similarity tests. The resulting group of adjectives was used to evaluate the sound quality of floor impact sounds by the semantic differential test method.

Transformer-based dense 3D reconstruction from RGB images (RGB 이미지에서 트랜스포머 기반 고밀도 3D 재구성)

  • Xu, Jiajia;Gao, Rui;Wen, Mingyun;Cho, Kyungeun
    • Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.646-647 / 2022
  • Multiview stereo (MVS) 3D reconstruction of a scene from images is a fundamental computer vision problem that has been thoroughly researched in recent times. Traditionally, MVS approaches create dense correspondences by constructing regularizations and hand-crafted similarity metrics. Although these techniques have achieved excellent results in the best Lambertian conditions, traditional MVS algorithms still contain a lot of artifacts. Therefore, in this study, we suggest using a transformer network to accelerate the MVS reconstruction. The network is based on a transformer model and can extract dense features with 3D consistency and global context, which are necessary to provide accurate matching for MVS.

Clustering Korean Stock Return Data Based on GARCH Model (이분산 시계열모형을 이용한 국내주식자료의 군집분석)

  • Park, Man-Sik;Kim, Na-Young;Kim, Hee-Young
    • Communications for Statistical Applications and Methods / v.15 no.6 / pp.925-937 / 2008
  • In this study, we consider cluster analysis of the returns of stocks traded on the stock market. Most financial time-series data, such as stock prices and exchange rates, have conditionally heterogeneous variability that depends on time and hence are not well suited to the autoregressive moving-average (ARMA) model, which assumes constant variance. Moreover, this variability is front and center for stock investors as well as academic researchers. This paper therefore focuses on the generalized autoregressive conditional heteroscedastic (GARCH) model, which is known as a solution for capturing the conditional variance (or volatility). We define metrics for the similarity of unconditional volatility and for the homogeneity of model structure, and then evaluate the performance of these metrics. In a real application, we cluster the stock returns of 11 Korean companies, measured over the latest three years, in terms of volatility and structure.
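To make "similarity of unconditional volatility" concrete, the sketch below assumes a GARCH(1,1) model, whose unconditional variance is ω / (1 − α − β) when α + β < 1. The abstract does not state the model order or the exact metric, so the absolute difference used here, the function names, and the parameter values are illustrative assumptions only.

```python
import math


def unconditional_volatility(omega, alpha, beta):
    """Long-run volatility of a GARCH(1,1) process.

    Uses sigma^2 = omega / (1 - alpha - beta), valid only when the
    process is covariance stationary (alpha + beta < 1).
    """
    persistence = alpha + beta
    if omega <= 0 or persistence >= 1:
        raise ValueError("requires omega > 0 and alpha + beta < 1")
    return math.sqrt(omega / (1.0 - persistence))


def volatility_distance(params_a, params_b):
    """Hypothetical dissimilarity: gap between two long-run volatilities."""
    return abs(unconditional_volatility(*params_a)
               - unconditional_volatility(*params_b))


# Made-up (omega, alpha, beta) estimates for two stocks.
stock_1 = (0.10, 0.10, 0.80)
stock_2 = (0.20, 0.15, 0.80)
print(volatility_distance(stock_1, stock_2))
```

A matrix of such pairwise distances could then be passed to any standard clustering routine, for example hierarchical clustering.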

High Noise Density Median Filter Method for Denoising Cancer Images Using Image Processing Techniques

  • Priyadharsini.M, Suriya;Sathiaseelan, J.G.R
    • International Journal of Computer Science & Network Security / v.22 no.11 / pp.308-318 / 2022
  • Noise is a serious issue when sending images via electronic communication. Impulse noise, which is created by unsteady voltage, is one of the most common types of noise in digital communication. Pictures were collected during the acquisition process. Accurate diagnostic images can be obtained by removing this noise without affecting edges and fine features. This paper proposes the New Average High Noise Density Median Filter (HNDMF), which operates in two steps for each pixel. The filter decides whether the test pixel is degraded by salt-and-pepper noise (SPN). In the first stage, a detector identifies corrupted pixels; in the second stage, each corrupted pixel is replaced by a noise-free processed pixel that the proposed new average filter produces for its window. Known image denoising methods are compared, and a new decision-based weighted median filter is used to remove impulse noise. Using Mean Square Error (MSE), Peak Signal to Noise Ratio (PSNR), and Structure Similarity Index Method (SSIM) metrics, the paper examines the performance of Gaussian Filter (GF), Adaptive Median Filter (AMF), and PHDNF. A detailed simulation is performed on the Mini-MIAS dataset to verify the improvement offered by the presented model. The experimental values show that the HNDMF model achieves better performance with the maximum picture quality. Results for images affected by various amounts of simulated salt-and-pepper noise, as well as speckle noise, are calculated and provided as experimental results. According to the quality metrics, the HNDMF method produces better results than the existing filter methods. It accurately detects salt-and-pepper noise pixels in images and replaces their values with mean and median values. The proposed method improves the median filter through a significant change.
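The two-stage idea (detect corrupted pixels, then replace only those) and the MSE and PSNR metrics can be sketched as below. This is a generic decision-based median filter, not the HNDMF algorithm itself, whose details the abstract does not give. Treating extreme values 0 and 255 as corrupted, using a 3x3 window, and the function names are all assumptions.

```python
import math
from statistics import median


def denoise(image, low=0, high=255):
    """Replace suspected salt-and-pepper pixels with a local median."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            # Stage 1: only extreme values are treated as corrupted.
            if image[y][x] not in (low, high):
                continue
            # Stage 2: median of uncorrupted neighbours in a 3x3 window.
            neighbours = [
                image[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
                if (j, i) != (y, x) and image[j][i] not in (low, high)
            ]
            if neighbours:
                out[y][x] = int(median(neighbours))
    return out


def mse(a, b):
    """Mean squared error between two equally sized grayscale images."""
    n = len(a) * len(a[0])
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n


def psnr(a, b, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(a, b)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)


clean = [[100, 100, 100] for _ in range(3)]
noisy = [row[:] for row in clean]
noisy[1][1] = 255  # "salt" pixel
noisy[0][0] = 0    # "pepper" pixel
restored = denoise(noisy)
print(psnr(noisy, clean), psnr(restored, clean))
```

Leaving uncorrupted pixels untouched is what lets this family of filters preserve edges and fine detail better than a plain median filter applied to every pixel.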

Proximity based Circular Visualization for similarity analysis of voting patterns between nations in UN General Assembly (UN 국가의 투표 성향 유사도 분석을 위한 Proximity based Circular 시각화 연구)

  • Choi, Han Min;Mun, Seong Min;Ha, Hyo Ji;Lee, Kyung Won
    • Design Convergence Study / v.14 no.4 / pp.133-150 / 2015
  • In this study, we propose interactive visualization methods for analyzing relations between nations from various viewpoints, such as period and issue, using a total of 5,211 items of UN General Assembly voting data. For this research, we devised a similarity matrix between nations and developed two visualization methods based on it. The first is a Network Graph Visualization that shows, by year and in the manner of a social network graph, relations between the nations that participated in UN General Assembly votes. The second is a Proximity based Circular Visualization that supports analysis of relations between nations centered on one nation, or of changes in voting patterns between nations over time. This study is significant because the Proximity based Circular Visualization, which merges line and circle graphs for network analysis, had not been attempted in previous studies that used conventional voting data. We also identified how the two visualizations complement each other through a comparative experiment. As a result, we found that the Proximity based Circular Visualization is better for analyzing each node, while the Network Graph Visualization is better for analyzing patterns among the nations.
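A similarity matrix of the kind mentioned above could be built as in the following sketch. The abstract does not define the similarity measure, so the agreement ratio over jointly cast votes, the vote encoding, and the function names are assumptions for illustration.

```python
def agreement(votes_a, votes_b):
    """Share of resolutions on which two nations cast the same vote.

    Votes are "Y", "N", or "A" (abstain); None marks an absence and is
    skipped, so only resolutions both nations voted on are compared.
    """
    shared = [(a, b) for a, b in zip(votes_a, votes_b)
              if a is not None and b is not None]
    if not shared:
        return 0.0
    return sum(a == b for a, b in shared) / len(shared)


def similarity_matrix(votes):
    """Nested dict holding the pairwise agreement for every nation pair."""
    nations = sorted(votes)
    return {a: {b: agreement(votes[a], votes[b]) for b in nations}
            for a in nations}


def closest(matrix, nation):
    """Other nations ordered from most to least similar to `nation`."""
    others = [n for n in matrix if n != nation]
    return sorted(others, key=lambda n: matrix[nation][n], reverse=True)


# Toy voting records over four resolutions.
votes = {
    "A": ["Y", "N", "Y", None],
    "B": ["Y", "N", "N", "Y"],
    "C": ["N", "Y", "N", "Y"],
}
matrix = similarity_matrix(votes)
print(closest(matrix, "B"))
```

Ordering the other nations by similarity to one focal nation, as `closest` does, is the kind of input a proximity-based layout centered on that nation would need.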

One-shot multi-speaker text-to-speech using RawNet3 speaker representation (RawNet3를 통해 추출한 화자 특성 기반 원샷 다화자 음성합성 시스템)

  • Sohee Han;Jisub Um;Hoirin Kim
    • Phonetics and Speech Sciences / v.16 no.1 / pp.67-76 / 2024
  • Recent advances in text-to-speech (TTS) technology have significantly improved the quality of synthesized speech, reaching a level where it can closely imitate natural human speech. In particular, TTS models offering various voice characteristics and personalized speech are widely used in fields such as artificial intelligence (AI) tutors, advertising, and video dubbing. Accordingly, in this paper, we propose a one-shot multi-speaker TTS system that can ensure acoustic diversity and synthesize personalized voices by generating speech from unseen target speakers' utterances. The proposed model integrates a speaker encoder into a TTS model consisting of the FastSpeech2 acoustic model and the HiFi-GAN vocoder. The speaker encoder, based on the pre-trained RawNet3, extracts speaker-specific voice features. Furthermore, the proposed approach not only includes an English one-shot multi-speaker TTS but also introduces a Korean one-shot multi-speaker TTS. We evaluate the naturalness and speaker similarity of the generated speech using objective and subjective metrics. In the subjective evaluation, the proposed Korean one-shot multi-speaker TTS obtained a naturalness mean opinion score (NMOS) of 3.36 and a similarity MOS (SMOS) of 3.16. The objective evaluation of the proposed English and Korean one-shot multi-speaker TTS showed a prediction MOS (P-MOS) of 2.54 and 3.74, respectively. These results indicate that the performance of our proposed model is improved over the baseline models in terms of both naturalness and speaker similarity.