• Title/Summary/Keyword: Similarity on Data Structures

Sensor fault diagnosis for bridge monitoring system using similarity of symmetric responses

  • Xu, Xiang;Huang, Qiao;Ren, Yuan;Zhao, Dan-Yang;Yang, Juan
    • Smart Structures and Systems
    • /
    • v.23 no.3
    • /
    • pp.279-293
    • /
    • 2019
  • To ensure that high-quality data are used for data mining or feature extraction in bridge structural health monitoring (SHM) systems, a practical sensor fault diagnosis methodology has been developed based on the similarity of symmetric structural responses. First, the similarity of symmetric responses is discussed using field monitoring data from different sensor types. All sensors are initially paired, and sensor faults are then detected pair by pair to achieve multi-fault diagnosis of the sensor system. To resolve the coupling between structural damage and sensor faults, the similarity for the target zone (where the studied sensor pair is located) is assessed to determine whether localized structural damage or a sensor fault causes the dissimilarity of the studied sensor pair. If at least one sensor in the suspected pair is found to be faulty, a field test can be carried out to support a regression analysis, based on the monitoring and field test data, for sensor fault isolation and reconstruction. Finally, a case study is used to demonstrate the effectiveness of the proposed methodology. Dasarathy's information fusion model is adopted for multi-sensor information fusion, and Euclidean distance is selected as the index to assess similarity. In conclusion, the proposed method is practical for real engineering applications and ensures the reliability of further analysis based on monitoring data.
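
The pair-by-pair check described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the sensor names, the readings, and the threshold are all hypothetical, and the only element taken from the abstract is the use of Euclidean distance as the similarity index for a symmetric sensor pair.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length response series."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flag_dissimilar_pairs(pairs, threshold):
    """Return the names of sensor pairs whose distance exceeds the threshold."""
    flagged = []
    for name, (left, right) in pairs.items():
        if euclidean_distance(left, right) > threshold:
            flagged.append(name)
    return flagged


# Hypothetical readings from two symmetric sensor pairs (assumed data).
pairs = {
    "strain_1L_1R": ([1.0, 2.0, 3.0], [1.1, 2.0, 2.9]),
    "strain_2L_2R": ([1.0, 2.0, 3.0], [4.0, 6.0, 3.0]),
}
# The threshold is an assumed value; in practice it would be calibrated.
suspects = flag_dissimilar_pairs(pairs, threshold=1.0)
```

A flagged pair only indicates dissimilarity; per the abstract, deciding between structural damage and a sensor fault requires the additional target-zone assessment.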

A Plagiarism Detection Technique for Source Codes Considering Data Structures (데이터 구조를 고려한 소스코드 표절 검사 기법)

  • Lee, Kihwa;Kim, Yeoneo;Woo, Gyun
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.3 no.6
    • /
    • pp.189-196
    • /
    • 2014
  • Although plagiarism is illegal and should be avoided, it still occurs frequently. Plagiarism of source code is especially common because its digital nature makes it easy to copy. A variety of studies have been reported on preventing code plagiarism. However, previous plagiarism detection techniques for source code do not consider data structures, even though source code consists of both data structures and algorithms. In this paper, a plagiarism detection technique for source code that considers data structures is proposed. Specifically, the data structures of two source codes are represented as sets of trees and compared with each other using the Hungarian method. To show the usefulness of this technique, an experiment was performed on 126 source codes submitted as homework in an object-oriented programming course. When both the data structures and the algorithms of the source codes are considered, the precision and the F-measure improve by 22.6% and 19.3%, respectively, over the case where only the algorithms are considered.
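
The comparison of two sets of trees can be illustrated with a small sketch. Several pieces here are assumptions rather than details from the paper: trees are encoded as `(label, children)` tuples, the per-tree similarity is a simple Jaccard measure over node labels, and the optimal one-to-one pairing is found by brute force over permutations. The Hungarian method named in the abstract reaches the same optimum in polynomial time, so the brute-force search is only a stand-in for small inputs.

```python
from itertools import permutations


def tree_labels(tree):
    """Collect every node label of a (label, children) tree into a list."""
    label, children = tree
    labels = [label]
    for child in children:
        labels.extend(tree_labels(child))
    return labels


def tree_similarity(t1, t2):
    """Jaccard similarity of the two trees' label sets (an assumed measure)."""
    a, b = set(tree_labels(t1)), set(tree_labels(t2))
    return len(a & b) / len(a | b)


def best_matching_score(trees_a, trees_b):
    """Score of the best one-to-one pairing between two sets of trees.

    Brute force over permutations; unmatched trees in the larger set
    lower the score because the total is divided by the larger size.
    """
    small, large = sorted((trees_a, trees_b), key=len)
    if not small:
        return 0.0
    best = max(
        sum(tree_similarity(s, l) for s, l in zip(small, perm))
        for perm in permutations(large, len(small))
    )
    return best / len(large)


# Hypothetical data structures from two programs (assumed examples).
struct_int_list = ("struct", [("int", []), ("pointer", [])])
struct_reordered = ("struct", [("pointer", []), ("int", [])])
struct_point = ("struct", [("float", []), ("float", [])])

program_a = [struct_int_list, struct_point]
program_b = [struct_point, struct_reordered]
score = best_matching_score(program_a, program_b)
```

Because the pairing is chosen to maximize total similarity, reordering declarations or fields does not lower the score in this sketch.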

A Clustering Technique using Common Structures of XML Documents (XML 문서의 공통 구조를 이용한 클러스터링 기법)

  • Hwang, Jeong-Hee;Ryu, Keun-Ho
    • Journal of KIISE:Databases
    • /
    • v.32 no.6
    • /
    • pp.650-661
    • /
    • 2005
  • As the Internet grows, the use of XML, a standard for semi-structured documents, is increasing, and there is ongoing work on the integration and retrieval of XML documents. The basis of efficient integration and retrieval is to cluster XML documents with similar structures. Conventional XML clustering approaches use a hierarchical clustering algorithm that produces the required number of clusters through repeated merging, but they have two problems: it is difficult to compute the similarity between XML documents, and repeatedly comparing similarities is time-consuming. To address these problems, we use a clustering algorithm for transactional data that scales to large data sets. In this paper, we use common structures from XML documents that have no DTD or schema. To obtain the common structures of XML documents, we extract representative structures by decomposing the structure of a tree model that expresses each XML document, and we perform clustering with the extracted structures. In addition, we show the efficiency of the proposed method by comparing it with a previous method.
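
One simple way to decompose an XML tree into structural items is to list its root-to-node tag paths. The sketch below takes that route as an assumption; the abstract does not say which decomposition the authors use, and the sample documents are hypothetical. It only shows how two schema-less documents can be reduced to comparable sets of structures.

```python
import xml.etree.ElementTree as ET


def structure_paths(xml_text):
    """Return the set of root-to-node tag paths of an XML document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        # Extend the parent's path with this node's tag, then recurse.
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


# Two hypothetical documents without a DTD or schema.
doc_a = "<book><title/><author><name/></author></book>"
doc_b = "<book><title/><price/></book>"

paths_a = structure_paths(doc_a)
paths_b = structure_paths(doc_b)
common = paths_a & paths_b  # structures shared by both documents
```

Text content and attributes are ignored here, so two documents with the same element layout but different data yield identical path sets.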

Similarity Measure based on XML Document's Structure and Contents (XML 문서의 구조와 내용을 고려한 유사도 측정)

  • Kim, Woo-Saeng
    • Journal of Korea Multimedia Society
    • /
    • v.11 no.8
    • /
    • pp.1043-1050
    • /
    • 2008
  • XML has become a standard for data representation and exchange on the Internet. With a large number of XML documents on the Web, there is an increasing need to automatically process these structurally rich documents for information retrieval, document management, and data mining applications. In this paper, we propose a new method to measure the similarity between XML documents by considering their structures and contents. The structural similarity of documents is computed with a simple string matching technique, and the content similarity is computed with weights that take into account the names and positions of elements. The overall algorithm runs in time linear in the combined size of the two documents being compared.

Analysis of Image Similarity Index of Woven Fabrics and Virtual Fabrics - Application of Textile Design CAD System and Shuttle Loom - (직물과 가상소재의 화상 유사성 분석 연구 - 수직기 및 텍스타일 CAD시스템 활용 -)

  • Yoon, Jung-Won;Kim, Jong-Jun
    • Fashion & Textile Research Journal
    • /
    • v.15 no.6
    • /
    • pp.1010-1017
    • /
    • 2013
  • The global textile and fashion industries have gradually shifted their focus to high value-added, high-sensibility, and multi-functional products based on new human-friendly and sustainable growth technologies. Textile design CAD systems have been developed in conjunction with advances in the computer hardware and software sectors. This study compares the patterns or images of actual woven fabrics and virtual fabrics prepared with a textile design CAD system. Several weave structures (such as fancy yarn weaves and patterns) were prepared with a shuttle loom, and images of the woven textiles were taken using a CCD camera. The same weave structure data and yarn data were fed into a textile design CAD system in order to simulate fabric images as similar to the originals as possible. Similarity index analysis methods were then used to compare each actual fabric specimen with the simulated image of the corresponding fabric. The results showed that repeated small-pattern weaves yield higher similarity index values than fancy yarn weaves, which show some irregularities due to fancy yarn attributes. The Complex Wavelet Structural Similarity (CW-SSIM) index gave better index values than other methods, such as Multi-Scale (MS) SSIM and Feature Similarity (FS) SSIM, across fabric specimen images. A correlation analysis between the image-based similarity index and a similarity evaluation by panel members was also performed.

Approximate Top-k Labeled Subgraph Matching Scheme Based on Word Embedding (워드 임베딩 기반 근사 Top-k 레이블 서브그래프 매칭 기법)

  • Choi, Do-Jin;Oh, Young-Ho;Bok, Kyoung-Soo;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association
    • /
    • v.22 no.8
    • /
    • pp.33-43
    • /
    • 2022
  • Labeled graphs are used to represent entities, their relationships, and their structures in real data such as knowledge graphs and protein interactions. With the rapid development of IT and the explosive increase in data, there is a need for subgraph matching technology that provides the information a user is interested in. In this paper, we propose an approximate Top-k labeled subgraph matching scheme that considers the semantic similarity of labels and the difference in graph structure. The proposed scheme uses a learning model based on FastText to capture the semantic similarity of labels. In addition, a label similarity graph (LSG) is used for approximate subgraph matching by calculating similarity values between labels in advance. The LSG resolves a limitation of existing schemes, in which subgraph expansion is possible only if the labels match exactly. The scheme supports structural similarity for a query graph by searching up to 2 hops. Based on the similarity values, it returns k subgraph matching results. We conduct various performance evaluations to show the superiority of the proposed scheme.

Nonlinear damage detection using linear ARMA models with classification algorithms

  • Chen, Liujie;Yu, Ling;Fu, Jiyang;Ng, Ching-Tai
    • Smart Structures and Systems
    • /
    • v.26 no.1
    • /
    • pp.23-33
    • /
    • 2020
  • The majority of damage in engineering structures is nonlinear. Damage-sensitive features (DSFs) extracted by traditional methods from linear time series models cannot effectively handle the nonlinearity induced by structural damage. A new DSF is proposed based on vector space cosine similarity (VSCS), which is combined with K-means cluster analysis and Bayesian discrimination to detect nonlinear structural damage. A reference autoregressive moving average (ARMA) model is built from measured acceleration data. This study first considers an existing DSF, the residual standard deviation (RSD). The DSF is then advanced using the VSCS, and the advanced VSCS is classified using K-means cluster analysis and Bayes discriminant analysis, respectively. The performance of the proposed approach is verified using experimental data from a three-story shear building structure and compared with the results of the existing RSD. It is demonstrated that combining the linear ARMA model and the advanced VSCS with cluster analysis and Bayes discriminant analysis, respectively, is an effective approach for detecting nonlinear damage. This approach improves the reliability and accuracy of nonlinear damage detection using the linear model and significantly reduces the computational cost. The results indicate that the proposed approach has the potential to be a promising damage detection technique.
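
The cosine similarity underlying the VSCS feature is easy to state in code. The sketch below shows only the generic measure; which vectors the authors compare (for example, residuals or model coefficients) is not specified in the abstract, so the feature vectors here are hypothetical placeholders.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical feature vectors: a reference state and two test states.
reference = [1.0, 0.0]
same_direction = [2.0, 0.0]
orthogonal = [0.0, 3.0]

sim_same = cosine_similarity(reference, same_direction)
sim_orth = cosine_similarity(reference, orthogonal)
```

The measure depends only on direction, not magnitude, so a test vector that is a scaled copy of the reference scores 1 while an orthogonal one scores 0.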

Incremental Clustering of XML Documents based on Similar Structures (유사 구조 기반 XML 문서의 점진적 클러스터링)

  • Hwang, Jeong-Hee;Ryu, Keun-Ho
    • Journal of KIISE:Databases
    • /
    • v.31 no.6
    • /
    • pp.699-709
    • /
    • 2004
  • XML is increasingly important in data exchange and information management. The starting point for retrieving structures and integrating documents efficiently is to cluster documents that have similar structures, because clustered documents can be retrieved more flexibly and faster than a whole collection of documents with different structures. Therefore, in this paper, we propose a similar-structure-based incremental clustering method that is useful for retrieving the structure of XML documents and integrating them. As a novel approach, we use a clustering algorithm for transactional data that handles large volumes of data, which is quite different from existing methods that measure the similarity between documents using vectors. We first extract the representative structures of XML documents using a sequential pattern algorithm, and then perform similar-structure-based document clustering, treating each document as a transaction and the representative structures of the document as the items of the transaction. In addition, we define cluster cohesion and inter-cluster similarity, and analyze the efficiency of the proposed method through experimental comparison with an existing method.
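
The idea of treating each document as a transaction and assigning it to a cluster as it arrives can be sketched as below. The assignment rule is an assumption made for illustration: Jaccard similarity against the union of a cluster's items, with a hypothetical threshold of 0.5. The paper's own cluster cohesion and inter-cluster similarity definitions are not given in the abstract and are not reproduced here.

```python
def jaccard(a, b):
    """Jaccard similarity of two item sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def add_document(clusters, items, threshold=0.5):
    """Assign one transaction to the most similar cluster, or open a new one.

    Each cluster is a dict with 'items' (union of its members' items)
    and 'members' (the list of member transactions).
    Returns the index of the cluster the document was placed in.
    """
    items = set(items)
    best_index, best_score = None, threshold
    for index, cluster in enumerate(clusters):
        score = jaccard(items, cluster["items"])
        if score >= best_score:
            best_index, best_score = index, score
    if best_index is None:
        clusters.append({"items": set(items), "members": [items]})
        return len(clusters) - 1
    clusters[best_index]["items"] |= items
    clusters[best_index]["members"].append(items)
    return best_index


# Hypothetical documents, each reduced to a set of representative structures.
stream = [{"a", "b", "c"}, {"a", "b"}, {"x", "y"}, {"x", "y", "z"}]
clusters = []
assignments = [add_document(clusters, doc) for doc in stream]
```

Each document is processed once and existing assignments are never revisited, which is what makes the procedure incremental.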

A XML Schema Matching based on Fuzzy Similarity Measure

  • Kim, Chang-Suk;Sim, Kwee-Bo
    • Institute of Control, Robotics and Systems (ICROS): Conference Proceedings
    • /
    • 2005.06a
    • /
    • pp.1482-1485
    • /
    • 2005
  • Matching equivalent schemas among several different source schemas is very important for information integration and mining on the XML-based World Wide Web. Finding the source schema most similar to a mediated schema is a major bottleneck because of the arbitrary nesting and hierarchical structures of XML DTD schemas. It is a complex, labor-intensive, and error-prone task. In this paper, we present the first complex matching of XML schemas, i.e., XML DTDs, which inlines a two-dimensional DTD graph into flat feature values. The proposed method captures not only the schematic information but also the integrity constraint information of DTDs in order to match differently structured DTDs. We show that hierarchical schema matching based on integrity constraints is more semantic than schema matching that uses only schematic information and stored data.

Micro-seismic monitoring in mines based on cross wavelet transform

  • Huang, Linqi;Hao, Hong;Li, Xibing;Li, Jun
    • Earthquakes and Structures
    • /
    • v.11 no.6
    • /
    • pp.1143-1164
    • /
    • 2016
  • Time Delay of Arrival (TDOA) estimation methods based on correlation function analysis play an important role in micro-seismic event monitoring. They make full use of the similarity between recorded signals that come from the same source. However, these methods are susceptible to noise, particularly when the global similarity of the signals is low. This paper proposes a new approach for micro-seismic monitoring based on the cross wavelet transform. The cross wavelet transform is used to analyse the signals measured during micro-seismic events, and the cross wavelet power spectrum is used to measure the similarity of two signals across multiple scales and subsequently to identify the TDOA. The time offset associated with the maximum cross wavelet power is taken as the TDOA, from which the location of the micro-seismic event can be identified. Individual and statistical identification tests are performed with measurement data from an in-field mine. Experimental studies demonstrate that the proposed approach significantly improves the robustness and accuracy of micro-seismic source location in mines compared with several existing methods, such as the cross-correlation, multi-correlation, STA/LTA, and kurtosis methods.
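
For context, the cross-correlation baseline that the abstract compares against can be sketched in a few lines. This is not the cross wavelet method the paper proposes; it is a minimal illustration of correlation-based delay estimation, with hypothetical signals and an assumed search window `max_lag`.

```python
def cross_correlation_lag(x, y, max_lag):
    """Return the lag (in samples) at which y best aligns with x.

    A positive result means y is delayed relative to x. Only lags in
    the range [-max_lag, max_lag] are searched.
    """
    best_lag, best_value = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        value = 0.0
        for i, xi in enumerate(x):
            j = i + lag
            if 0 <= j < len(y):
                value += xi * y[j]
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag


# Hypothetical recordings of one pulse at two sensors; the second
# sensor receives the pulse three samples later.
sensor_1 = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
sensor_2 = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]

delay_samples = cross_correlation_lag(sensor_1, sensor_2, max_lag=5)
```

Dividing the lag by the sampling rate converts it to a time delay. With noisy signals the correlation peak can become ambiguous, which is the weakness the abstract attributes to correlation-based methods.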