• Title/Summary/Keyword: Similarity Metrics


Extracting Maximal Similar Paths between Two XML Documents using Sequential Pattern Mining (순차 패턴 마이닝을 사용한 두 XML 문서간 최대 유사 경로 추출)

  • 이정원;박승수
    • Journal of KIISE:Databases / v.31 no.5 / pp.553-566 / 2004
  • Current research on XML techniques focuses mainly on storing XML documents, optimizing queries, and indexing. We instead focus on sets of documents that have varied structures and do not share a common structure such as the same DTD or XML Schema. In this case, it is essential to analyze the structural similarities and differences among the documents. For example, when documents from the Web or an EDMS (Electronic Document Management System) need to be merged or classified, finding their common structure is an important step in handling them. In this paper, we adapt sequential pattern mining algorithms to extract maximal similar paths between two XML documents. Experiments with XML documents show that the adapted algorithms find the common structures and the maximal similar paths between them exactly. Analysis of the experimental results shows that similarity metrics based on maximal similar paths can classify the types of XML documents exactly.
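As a rough illustration of what a "maximal common path" between two XML documents can look like, the sketch below collects every root-to-node tag path in each document and keeps the shared paths that no longer shared path extends. This is a simplified stand-in and not the paper's sequential pattern mining algorithm; the function names and the sample documents are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def tag_paths(xml_text):
    """Return the set of root-to-node tag paths, each as a tuple of tag names."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = prefix + (node.tag,)
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, ())
    return paths


def maximal_common_paths(xml_a, xml_b):
    """Shared tag paths that are not a proper prefix of another shared path."""
    common = tag_paths(xml_a) & tag_paths(xml_b)
    return {
        p for p in common
        if not any(q != p and q[:len(p)] == p for q in common)
    }


# Hypothetical documents with no shared DTD or schema.
doc_a = "<book><title/><author><name/></author><price/></book>"
doc_b = "<book><title/><author><name/><email/></author></book>"

for path in sorted(maximal_common_paths(doc_a, doc_b)):
    print("/".join(path))
```

A similarity score could then be derived from these paths, for instance from their number or length relative to all paths in each document, though the abstract does not state which definition the authors use.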

A Model for Measuring the R&D Project Similarity using Patent Information (특허 정보를 활용한 R&D 과제 유사도 측정 모델)

  • Kim, Jong-Bae;Byun, Jung-Won;Sun, Dong-Ju;Kim, Tae-Gyun;Kim, Yung
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.5 / pp.1013-1021 / 2014
  • For efficient investment of government budgets, it is important to analyze the similarities among R&D projects. Existing studies have proposed techniques that analyze similarity using keywords or segments, but these techniques have low accuracy. We propose a technique for measuring the similarity of projects using patent information. To this end, we suggest three metrics based on set theory and probability theory. To validate the technique, we perform case studies on 156 R&D projects and 160,218 patent records.
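The abstract does not spell out its three metrics, so the following is only a generic example of a set-theoretic similarity: the Jaccard index between the sets of patents associated with two projects. The patent identifiers and the function name are hypothetical.

```python
def jaccard_similarity(a, b):
    """|A intersect B| / |A union B|; defined here as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical patent IDs linked to two R&D projects.
project_a = {"P1", "P2", "P3"}
project_b = {"P2", "P3", "P4", "P5"}

score = jaccard_similarity(project_a, project_b)
print(f"Jaccard similarity: {score:.2f}")
```

Two shared patents out of five distinct ones give a score of 0.4 in this example.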

Perceptual Color Difference based Image Quality Assessment Method and Evaluation System according to the Types of Distortion (인지적 색 차이 기반의 이미지 품질 평가 기법 및 왜곡 종류에 따른 평가 시스템 제안)

  • Lee, Jee-Yong;Kim, Young-Jin
    • Journal of KIISE / v.42 no.10 / pp.1294-1302 / 2015
  • Many image quality assessment metrics that aim to reflect the human visual system (HVS) precisely have been studied. The Structural SIMilarity (SSIM) index is a notable HVS-aware metric that uses structural information, since the HVS is sensitive to the overall structure of an image. However, SSIM does not account for color difference as perceived by the HVS. To address this problem, the Structural and Hue SIMilarity (SHSIM) index adopts the Hue, Saturation, Intensity (HSI) model as its color space, but it cannot reflect the HVS-aware color difference between two color images. In this paper, we propose a new image quality assessment method for color images that uses the CIE Lab color space. In addition, using a support vector machine (SVM) classifier, we propose an optimization system that applies the optimal metric according to the type of distortion. To evaluate the proposed index, we use the LIVE database, the best-known database in the area of image quality assessment, together with four criteria. Experimental results show that the proposed index is more consistent than the other methods.
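For readers unfamiliar with color difference in CIE Lab, the simplest formulation is CIE76, the Euclidean distance between two (L*, a*, b*) triples. The sketch below shows only that formula; the abstract does not say which color difference formula the proposed method uses, and the sample values are made up.

```python
import math


def delta_e_cie76(lab1, lab2):
    """CIE76 color difference: Euclidean distance between two (L*, a*, b*) triples."""
    return math.dist(lab1, lab2)


# Hypothetical Lab values for a reference pixel and a distorted pixel.
reference_pixel = (50.0, 0.0, 0.0)
distorted_pixel = (50.0, 3.0, 4.0)

print(delta_e_cie76(reference_pixel, distorted_pixel))
```

Later formulas such as CIEDE2000 add perceptual corrections on top of this distance.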

Fuzzy Clustering with Genre Preference for Collaborative Filtering

  • Lee, Soojung
    • Journal of the Korea Society of Computer and Information / v.25 no.5 / pp.99-106 / 2020
  • The scalability problem inherent in collaborative filtering-based recommender systems has been studied for decades. Clustering is a well-known technique for handling this problem, but it has not been actively studied because of its low performance. This paper adopts a clustering method to overcome the scalability problem, an inherent drawback of collaborative filtering systems. To handle the performance degradation caused by applying clustering to collaborative filtering, we adopt two strategies: first, we use fuzzy clustering; second, we propose and apply a similarity estimation method based on user preference for movie genres. The proposed method is evaluated experimentally and compared with several relevant previous methods in terms of major performance metrics. Experimental results show that the proposed method achieves superior prediction and rank accuracy, and recommendation accuracy comparable to that of the best method in our experiments.
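One simple way to compare users by genre preference is cosine similarity over per-genre preference vectors. The sketch below shows that idea only; it is not the similarity estimation method proposed in the paper, and the users, genres, and weights are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical preference weights over (action, comedy, drama).
alice = [0.7, 0.2, 0.1]
bob = [0.6, 0.3, 0.1]
carol = [0.0, 0.1, 0.9]

print(cosine_similarity(alice, bob))    # similar tastes, close to 1
print(cosine_similarity(alice, carol))  # different tastes, much lower
```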

Boosting the Reasoning-Based Approach by Applying Structural Metrics for Ontology Alignment

  • Khiat, Abderrahmane;Benaissa, Moussa
    • Journal of Information Processing Systems / v.13 no.4 / pp.834-851 / 2017
  • The number of information sources on the web that use ontologies as support continues to increase, and these sources are often heterogeneous and distributed. Ontology alignment is the solution for ensuring semantic interoperability. In this paper, we describe a new ontology alignment approach that combines structure-based and reasoning-based approaches in order to discover new semantic correspondences between entities of different ontologies. We used the biblio test of the benchmark series and the anatomy series of the Ontology Alignment Evaluation Initiative (OAEI) 2012 evaluation campaign to evaluate the performance of our approach. We compared our approach with the LogMap and YAM++ systems in turn. We also analyzed the contribution of our method compared to structural and semantic methods. The results show that our approach provides good performance. Indeed, these results are better than those of the LogMap system in terms of precision, recall, and F-measure. Our approach has also been shown to be more relevant than YAM++ for certain types of ontologies, and it significantly improves on the structure-based and reasoning-based methods.
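Precision, recall, and F-measure for an alignment are usually computed by comparing the set of correspondences a system finds against a reference alignment. The following minimal sketch assumes correspondences are represented as pairs of entity names; the sample pairs are invented and do not come from the OAEI data.

```python
def evaluate_alignment(found, reference):
    """Return (precision, recall, F-measure) for two sets of correspondences."""
    true_positives = len(found & reference)
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical correspondences between entities of two ontologies.
found = {("Author", "Writer"), ("Book", "Volume"), ("Title", "Name"), ("Page", "Date")}
reference = {("Author", "Writer"), ("Book", "Volume"), ("Publisher", "Press")}

precision, recall, f_measure = evaluate_alignment(found, reference)
print(f"P={precision:.2f} R={recall:.2f} F={f_measure:.2f}")
```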

Weighted Local Naive Bayes Link Prediction

  • Wu, JieHua;Zhang, GuoJi;Ren, YaZhou;Zhang, XiaYan;Yang, Qiao
    • Journal of Information Processing Systems / v.13 no.4 / pp.914-927 / 2017
  • Weighted network link prediction is a challenging issue in complex network analysis. Unsupervised methods based on local structure are widely used for this predictive task. However, the results are still far from satisfactory, because most of the literature neglects two important points: common neighbors exert different influence on potential links, and the weights associated with links in the local structure also differ. In this paper, we adapt an effective link prediction model, the local naive Bayes model, to a weighted scenario to address this issue. Correspondingly, we propose a weighted local naive Bayes (WLNB) probabilistic link prediction framework. The main contribution is that a weighted clustering coefficient is incorporated, allowing our model to infer the weighted contribution in the prediction stage. In addition, WLNB can be applied to several classic similarity metrics. We evaluate WLNB on different kinds of real-world weighted datasets. Experimental results show that our proposed approach performs better (by AUC and Prec) than several alternative methods for link prediction in weighted complex networks.
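To make the "classic similarity metrics" concrete, the sketch below computes the common-neighbors score and a commonly used weighted variant that sums the weights of the two links through each shared neighbor. It is a baseline illustration, not the WLNB model, and the toy graph and its weights are assumptions.

```python
def common_neighbors(graph, x, y):
    """Set of nodes adjacent to both x and y."""
    return set(graph[x]) & set(graph[y])


def weighted_common_neighbors(graph, x, y):
    """Sum of w(x, z) + w(y, z) over every common neighbor z."""
    return sum(graph[x][z] + graph[y][z] for z in common_neighbors(graph, x, y))


# Hypothetical undirected weighted graph stored as adjacency dictionaries.
graph = {
    "a": {"c": 1.0, "d": 2.0},
    "b": {"c": 0.5, "d": 1.5, "e": 3.0},
    "c": {"a": 1.0, "b": 0.5},
    "d": {"a": 2.0, "b": 1.5},
    "e": {"b": 3.0},
}

print(len(common_neighbors(graph, "a", "b")))      # unweighted score for the pair (a, b)
print(weighted_common_neighbors(graph, "a", "b"))  # weighted score for the same pair
```

Both scores treat every common neighbor alike apart from its link weights, which is the limitation the abstract points to.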

Synthetic Computed Tomography Generation while Preserving Metallic Markers for Three-Dimensional Intracavitary Radiotherapy: Preliminary Study

  • Jin, Hyeongmin;Kang, Seonghee;Kang, Hyun-Cheol;Choi, Chang Heon
    • Progress in Medical Physics / v.32 no.4 / pp.172-178 / 2021
  • Purpose: This study aimed to develop a deep learning architecture combining two task models to generate synthetic computed tomography (sCT) images from low-tesla magnetic resonance (MR) images to improve metallic marker visibility. Methods: Twenty-three patients with cervical cancer treated with intracavitary radiotherapy (ICR) were retrospectively enrolled, and images were acquired using both a computed tomography (CT) scanner and a low-tesla MR machine. The CT images were aligned to the corresponding MR images using deformable registration, and the metallic dummy source markers were delineated using threshold-based segmentation followed by manual modification. The deformed CT (dCT), MR, and segmentation mask pairs were used for training and testing. The sCT generation model has a cascaded three-dimensional (3D) U-Net-based architecture that converts MR images to CT images and segments the metallic marker. The performance of the model was evaluated with intensity-based comparison metrics. Results: The proposed model with segmentation loss outperformed the 3D U-Net in terms of errors between the sCT and dCT. The difference in structural similarity score was not significant. Conclusions: Our study demonstrates two-task-based deep learning models that generate sCT images from low-tesla MR images for 3D ICR. This approach will be useful for the MR-only workflow in high-dose-rate brachytherapy.

Simultaneous Motion Recognition Framework using Data Augmentation based on Muscle Activation Model (근육 활성화 모델 기반의 데이터 증강을 활용한 동시 동작 인식 프레임워크)

  • Sejin Kim;Wan Kyun Chung
    • The Journal of Korea Robotics Society / v.19 no.2 / pp.203-212 / 2024
  • Simultaneous motion is essential in the activities of daily living (ADL). Motion intention recognition requires surface electromyogram (sEMG) signals and the corresponding motion labels. However, acquiring them is time-consuming and may increase the burden on the user. Therefore, we propose a simultaneous motion recognition framework that uses data augmentation based on a muscle activation model. The model consists of multiple point sources to be optimized, while the number of point sources and their initial parameters are determined automatically. The experimental results show that the framework generates data similar to the real data. This similarity is quantified with two metrics: the structural similarity index measure (SSIM) and the mean squared error (MSE). Furthermore, with a k-nearest neighbor (k-NN) classifier or a support vector machine (SVM), the classification accuracy is also improved by the proposed framework. From these results, we conclude that the generalization property of the training data is enhanced and the classification accuracy increases accordingly. We expect this framework to reduce the burden that excessive and time-consuming data acquisition places on the user.
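The two metrics named in the abstract can be sketched in a few lines of plain Python. The SSIM below is a simplified single-window version computed over the whole signal with population variances, whereas standard SSIM averages the same expression over local windows; the constants follow the usual choice of K1 = 0.01 and K2 = 0.03. The sample values and the `data_range` default are assumptions.

```python
def mse(x, y):
    """Mean squared error between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM over the whole input (no sliding window)."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * data_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical real and generated samples on a 0-255 scale.
real = [10, 50, 90, 130]
generated = [12, 48, 95, 128]

print(mse(real, generated))
print(global_ssim(real, generated))
```

A lower MSE and an SSIM closer to 1 both indicate that the generated data resemble the real data more closely.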

Effectiveness of Fuzzy Graph Based Document Model

  • Aswathy M R;P.C. Reghu Raj;Ajeesh Ramanujan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.8 / pp.2178-2198 / 2024
  • Graph-based document models are well suited to revealing inter-dependencies in unstructured text data. Natural language processing (NLP) systems that use such models as an intermediate representation have shown good performance. This paper proposes a novel fuzzy graph-based document model and demonstrates its effectiveness by applying fuzzy logic tools to text summarization. The proposed system accepts a text document as input and identifies several of its sentence-level features, namely sentence position, sentence length, numerical data, thematic word, proper noun, title feature, upper case feature, and sentence similarity. The fuzzy membership value of each feature is computed from the sentences. We also propose a novel algorithm to construct the fuzzy graph as an intermediate representation of the input document. The Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metric is used to evaluate the model. An evaluation based on different quality metrics was also performed to verify the effectiveness of the model. The ANOVA test confirms the hypothesis that the proposed model improves summarizer performance by 10% compared with state-of-the-art summarizers that employ alternative intermediate representations for the input text.
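ROUGE covers a family of measures; the simplest, ROUGE-1 recall, is the fraction of reference unigrams that also appear in the candidate summary, with counts clipped. The sketch below assumes lowercase whitespace tokenization and uses invented sentences; the abstract does not state which ROUGE variants were reported.

```python
from collections import Counter


def rouge1_recall(candidate, reference):
    """Clipped unigram overlap divided by the number of reference unigrams."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    if not ref_counts:
        return 0.0
    overlap = sum(min(count, cand_counts[token]) for token, count in ref_counts.items())
    return overlap / sum(ref_counts.values())


# Hypothetical system summary and reference summary.
candidate = "the cat lay on a mat"
reference = "the cat sat on the mat"

print(f"ROUGE-1 recall: {rouge1_recall(candidate, reference):.3f}")
```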

Material Image Classification using Normal Map Generation (Normal map 생성을 이용한 물질 이미지 분류)

  • Nam, Hyeongil;Kim, Tae Hyun;Park, Jong-Il
    • Journal of Broadcast Engineering / v.27 no.1 / pp.69-79 / 2022
  • This study proposes a method that generates a normal map image, which represents the surface characteristics of the material in an image, and uses it to improve the classification accuracy of the original material image. First, (1) to generate a normal map that reflects the surface properties of a material in an image, we use a Pix2Pix-based method with a U-Net with an attention-R2 gate as the generator and with the similarity between the generated normal map and the original normal map as the reconstruction loss. Next, (2) we propose a network that improves the classification accuracy of the original material image by applying the generated normal map image to the attention gate of the classification network. For normal maps generated using the Pixar Dataset, we evaluate their similarity to the corresponding ground-truth normal maps and compare the results obtained when the reconstruction loss function is defined with different similarity metrics. For material image classification, comparative experiments against previous studies on the MINC-2500 and FMD datasets confirmed that the proposed method distinguishes materials more accurately. The proposed method is expected to serve as a basis for various image processing and network construction tasks that identify substances within an image.