• Title/Summary/Keyword: Similar Data

Search Result 9,284, Processing Time 0.037 seconds

Extraction of similar XML data based on XML structure and processing unit

  • Park, Jong-Hyun
    • Journal of the Korea Society of Computer and Information / v.22 no.4 / pp.59-65 / 2017
  • XML has established itself as the format for data exchange on the internet, and the volume of XML instances is large. Extracting similar information from XML instances is therefore a research topic, but existing work remains insufficient. In this paper, we extract similar information from various kinds of XML instances that serve the same goal. We use only the structure information of an XML instance for extraction, because some XML instances are described without a schema. To extract similar information efficiently, we propose a minimum unit of processing and two approaches for finding that unit. One is a structure-based method that uses only the structure information of the XML instance; the other is a measure-based method that finds the unit by a numerical formula. Both approaches can be applied to any application that needs to extract similar information from XML data, and they can also be used for HTML instances.

An Approach to Applying Multiple Linear Regression Models by Interlacing Data in Classifying Similar Software

  • Lim, Hyun-il
    • Journal of Information Processing Systems / v.18 no.2 / pp.268-281 / 2022
  • The development of information technology is bringing many changes to everyday life, and machine learning can be used as a technique to solve a wide range of real-world problems. Analysis and utilization of data are essential processes in applying machine learning to real-world problems. As a method of processing data in machine learning, we propose an approach based on applying multiple linear regression models by interlacing data to the task of classifying similar software. Linear regression is widely used in estimation problems to model the relationship between input and output data. In our approach, multiple linear regression models are generated by training on interlaced feature data. A combination of these multiple models is then used as the prediction model for classifying similar software. Experiments are performed to evaluate the proposed approach as compared to conventional linear regression, and the experimental results show that the proposed method classifies similar software more accurately than the conventional model. We anticipate the proposed approach to be applied to various kinds of classification problems to improve the accuracy of conventional linear regression.
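
The abstract above does not say how the data is interlaced, so the following is only a minimal sketch under an assumed reading: training rows are split into k interleaved subsets (rows j, j+k, j+2k, ...), one ordinary least-squares model is fitted per subset, and the averaged score is thresholded to decide "similar" or "not similar". All function names, the toy data, and the 0.5 threshold are hypothetical and are not taken from the paper.

```python
import random

def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_linear(rows, targets):
    """Least squares via the normal equations (X^T X) w = X^T y, with a bias term."""
    Xb = [list(r) + [1.0] for r in rows]
    d = len(Xb[0])
    XtX = [[sum(x[i] * x[j] for x in Xb) for j in range(d)] for i in range(d)]
    Xty = [sum(x[i] * t for x, t in zip(Xb, targets)) for i in range(d)]
    return solve_linear_system(XtX, Xty)

def fit_interlaced(rows, targets, k=3):
    """Assumed interlacing: model j is trained on rows j, j+k, j+2k, ..."""
    return [fit_linear(rows[j::k], targets[j::k]) for j in range(k)]

def predict_similar(models, row, threshold=0.5):
    """Average the scores of all models and threshold the result."""
    xb = list(row) + [1.0]
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, xb)) for w in models]
    return sum(scores) / len(scores) >= threshold

# Toy data (hypothetical): two features, label 1.0 when their sum is positive.
rng = random.Random(0)
rows = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(60)]
targets = [1.0 if r[0] + r[1] > 0 else 0.0 for r in rows]

models = fit_interlaced(rows, targets, k=3)
correct = sum(predict_similar(models, r) == (t == 1.0) for r, t in zip(rows, targets))
accuracy = correct / len(rows)
```

If the paper interlaces feature columns rather than rows, the same combination step applies; only the slicing in `fit_interlaced` would change.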

Self-Similarity Characteristic in Data traffic (데이터 트래픽 Self-Similar 특성에 관한 연구)

  • 장우현;오행석
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2000.10a / pp.272-277 / 2000
  • Classical queuing analysis has been tremendously useful for capacity planning and performance prediction. However, in many real-world cases, it has been found that the results predicted by queuing analysis differ substantially from the actual observed performance. In particular, a number of recent studies have demonstrated that in some environments the traffic pattern is self-similar rather than Poisson. In this paper, we study these self-similar traffic characteristics and the definition of self-similar stochastic processes. We then consider examples of self-similar data traffic reported in recent measurement studies. Finally, we hope that this work makes the characteristics of actual data traffic easier to understand.
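
To make the contrast between self-similar and Poisson-like traffic concrete, here is a minimal sketch of the aggregated-variance (variance-time) estimate of the Hurst parameter H. It is not taken from the paper. It relies on the standard property that, for a second-order self-similar series, the variance of block means over blocks of size m scales as m^(2H-2), so independent (short-range) traffic gives H near 0.5 while self-similar traffic gives 0.5 < H < 1. The function name, block sizes, and test series are assumptions chosen for illustration.

```python
import math
import random

def aggregated_variance_hurst(series, block_sizes=(1, 2, 4, 8, 16, 32)):
    """Estimate H from the slope of log Var(block means) against log(block size)."""
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        # Aggregate the series into non-overlapping block means of size m.
        means = [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        mu = sum(means) / n_blocks
        var = sum((v - mu) ** 2 for v in means) / n_blocks
        log_m.append(math.log(m))
        log_var.append(math.log(var))
    # Least-squares slope of log_var versus log_m.
    mx = sum(log_m) / len(log_m)
    my = sum(log_var) / len(log_var)
    slope = sum((x - mx) * (y - my) for x, y in zip(log_m, log_var)) / sum(
        (x - mx) ** 2 for x in log_m
    )
    # Var ~ m^(2H - 2)  =>  slope = 2H - 2  =>  H = 1 + slope / 2.
    return 1.0 + slope / 2.0

# Independent Gaussian noise stands in for short-range-dependent traffic.
rng = random.Random(1)
iid_noise = [rng.gauss(0, 1) for _ in range(8192)]
h_iid = aggregated_variance_hurst(iid_noise)  # expected to be close to 0.5
```

Applied to measured packet or byte counts per time slot, an estimate well above 0.5 would be consistent with the self-similar behavior the abstract describes; this estimator is known to be rough, so it serves as a first check rather than a definitive test.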

Mobile Communications Data traffic using Self-Similarity Characteristic (Self-Similar 특성을 이용한 이동전화 데이터 트래픽 특성)

  • 이동철;양성현;김기문
    • Journal of the Korea Computer Industry Society / v.3 no.7 / pp.915-920 / 2002
  • Classical queuing analysis has been tremendously useful for capacity planning and performance prediction. However, in many real-world cases, it has been found that the results predicted by queuing analysis differ substantially from the actual observed performance. In particular, a number of recent studies have demonstrated that in some environments the traffic pattern is self-similar rather than Poisson. In this paper, we study these self-similar traffic characteristics and the definition of self-similar stochastic processes. We then consider examples of self-similar data traffic reported in recent measurement studies. Finally, we hope that this work makes the characteristics of actual data traffic easier to understand.

Self-Similarity Characteristic in Data traffic (Self-Similar특성을 이용한 데이터 트래픽 특성에 관한 연구)

  • 이동철;김기문;김동일
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2001.05a / pp.173-178 / 2001
  • Classical queuing analysis has been tremendously useful for capacity planning and performance prediction. However, in many real-world cases, it has been found that the results predicted by queuing analysis differ substantially from the actual observed performance. In particular, a number of recent studies have demonstrated that in some environments the traffic pattern is self-similar rather than Poisson. In this paper, we study these self-similar traffic characteristics and the definition of self-similar stochastic processes. We then consider examples of self-similar data traffic reported in recent measurement studies. Finally, we hope that this work makes the characteristics of actual data traffic easier to understand.

Self-Similarity Characteristic in Data traffic (Self-Similar특성을 이용한 데이터 트래픽 특성에 관한 연구)

  • 이동철;김기문;김동일
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2001.10a / pp.454-459 / 2001
  • Classical queuing analysis has been tremendously useful for capacity planning and performance prediction. However, in many real-world cases, it has been found that the results predicted by queuing analysis differ substantially from the actual observed performance. In particular, a number of recent studies have demonstrated that in some environments the traffic pattern is self-similar rather than Poisson. In this paper, we study these self-similar traffic characteristics and the definition of self-similar stochastic processes. We then consider examples of self-similar data traffic reported in recent measurement studies. Finally, we hope that this work makes the characteristics of actual data traffic easier to understand.

Self-Similarity Characteristic in Mobile Communications Data traffic (이동전화 데이터 트래픽에서의 Self-Similar 특성)

  • 이동철;정인명;김기문;김동일
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2001.10a / pp.468-471 / 2001
  • Classical queuing analysis has been tremendously useful for capacity planning and performance prediction. However, in many real-world cases, it has been found that the results predicted by queuing analysis differ substantially from the actual observed performance. In particular, a number of recent studies have demonstrated that in some environments the traffic pattern is self-similar rather than Poisson. In this paper, we study these self-similar traffic characteristics and the definition of self-similar stochastic processes. We then consider examples of self-similar data traffic reported in recent measurement studies. Finally, we hope that this work makes the characteristics of actual data traffic easier to understand.

An Efficient Method for Finding Similar Regions in a 2-Dimensional Array Data (2차원 배열 데이터에서 유사 구역의 효율적인 탐색 기법)

  • Choe, YeonJeong;Lee, Ki Yong
    • KIPS Transactions on Software and Data Engineering / v.6 no.4 / pp.185-192 / 2017
  • In various fields of science, 2-dimensional array data is actively generated by measurements and simulations. Although various query processing techniques for array data are being studied, the problem of finding similar regions in a 2-dimensional array, whose sizes are not known in advance, has not yet been addressed. Therefore, in this paper, we propose an efficient method for finding regions with similar element values, whose size is larger than a user-specified value, in a given 2-dimensional array. For each pair of elements in the array, the proposed method expands the corresponding two regions, whose initial size is 1, to the right and downward in stages, keeping the shapes of the two regions the same. If the difference between the element values in the two regions becomes larger than a user-specified value, the method stops the expansion. Consequently, the proposed method finds similar regions efficiently by accessing only those parts that are likely to be similar regions. Through theoretical analysis and various experiments, we show that the proposed method can find similar regions very efficiently.
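
The expansion step described in this abstract can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: it assumes that "difference" means the element-wise absolute difference between corresponding cells, that expansion alternates between adding one column on the right and one row at the bottom, and that region size is counted in cells. The names `expand_pair`, `find_similar_regions`, `eps`, and `min_cells` are hypothetical.

```python
def expand_pair(arr, p, q, eps):
    """Grow two same-shaped regions anchored at p and q while they stay within eps."""
    rows, cols = len(arr), len(arr[0])
    (r1, c1), (r2, c2) = p, q
    if abs(arr[r1][c1] - arr[r2][c2]) > eps:
        return (0, 0)
    h, w = 1, 1  # both regions start as a single cell
    grew = True
    while grew:
        grew = False
        # Try to add one column on the right of both regions.
        if c1 + w < cols and c2 + w < cols and all(
            abs(arr[r1 + i][c1 + w] - arr[r2 + i][c2 + w]) <= eps for i in range(h)
        ):
            w += 1
            grew = True
        # Try to add one row at the bottom of both regions.
        if r1 + h < rows and r2 + h < rows and all(
            abs(arr[r1 + h][c1 + j] - arr[r2 + h][c2 + j]) <= eps for j in range(w)
        ):
            h += 1
            grew = True
    return (h, w)

def find_similar_regions(arr, eps, min_cells):
    """Check every pair of start cells and keep pairs whose regions are large enough."""
    cells = [(r, c) for r in range(len(arr)) for c in range(len(arr[0]))]
    results = []
    for i, p in enumerate(cells):
        for q in cells[i + 1:]:
            h, w = expand_pair(arr, p, q, eps)
            if h * w >= min_cells:
                results.append((p, q, h, w))
    return results

grid = [
    [1, 2, 9, 1, 2],
    [3, 4, 9, 3, 4],
    [5, 6, 9, 7, 8],
]
matches = find_similar_regions(grid, eps=0, min_cells=4)
# matches == [((0, 0), (0, 3), 2, 2)]: the 2x2 blocks at columns 0-1 and 3-4 agree.
```

Because expansion stops at the first stage where the difference exceeds `eps`, most pairs are rejected after touching only a few cells, which is the efficiency argument the abstract makes. This greedy sketch does not guarantee the largest possible region and does not exclude overlapping regions.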

Similar sub-Trajectory Retrieval Technique based on Grid for Video Data (비디오 데이타를 위한 그리드 기반의 유사 부분 궤적 검색 기법)

  • Lee, Ki-Young;Lim, Myung-Jae;Kim, Kyu-Ho;Kim, Joung-Joon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.9 no.5 / pp.183-189 / 2009
  • Recently, with the spread of PCS phones, PDAs, and other mobile devices, the use of GPS (Global Positioning System), and the rapid development of wireless networks, the use of multimedia data such as images, audio, and video has increased even among ordinary users. In particular, video data, unlike text or image data, contains information about the movement of moving objects and therefore has spatio-temporal attributes that change over time. The continuous path along which the spatial location of a moving object changes over time is called a trajectory, and finding data trajectories in a database that are similar to a query trajectory given by the user is called similar sub-trajectory retrieval. Searching for such trajectories requires approximate matching, which retrieves trajectories that are similar to the user's query within a given tolerance. In addition, an efficient search method that differs from existing research is needed to find the desired data quickly in a large multimedia database. To this end, this paper proposes a new grid-based search technique that divides the trajectories of moving objects into a grid and effectively supports similar sub-trajectory retrieval.
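
The abstract gives no algorithmic detail, so the following is only a rough sketch of what grid-based approximate matching could look like. It assumes that each trajectory point is mapped to the grid cell containing it and that a window of the data trajectory matches the query when every pair of corresponding cells is within `tolerance` cells of each other (Chebyshev distance). The function names, the cell size, and the sample coordinates are all hypothetical.

```python
def to_cells(trajectory, cell_size):
    """Map each (x, y) point to the index of the grid cell that contains it."""
    return [(int(x // cell_size), int(y // cell_size)) for x, y in trajectory]

def find_similar_subtrajectories(data, query, cell_size, tolerance=0):
    """Return start indices of data windows whose cells match the query within tolerance."""
    d = to_cells(data, cell_size)
    q = to_cells(query, cell_size)
    hits = []
    for start in range(len(d) - len(q) + 1):
        window = d[start:start + len(q)]
        if all(
            max(abs(a[0] - b[0]), abs(a[1] - b[1])) <= tolerance
            for a, b in zip(window, q)
        ):
            hits.append(start)
    return hits

data = [(0, 0), (12, 3), (25, 11), (38, 22), (51, 30), (60, 45)]
query = [(24, 12), (37, 21), (52, 33)]
hits = find_similar_subtrajectories(data, query, cell_size=10)
# hits == [2]: the query falls in the same cells as data points 2, 3, and 4.
```

Comparing cell indices instead of raw coordinates means that small positional differences inside a cell are ignored, and a larger `tolerance` widens the match to neighboring cells.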

Self-Similarity Characteristic in Data traffic (데이터 트래픽에서의 Self-Similar 특성)

  • 김창호;황인수;최삼길;김동일;이동철;박기식
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 1999.05a / pp.146-151 / 1999
  • Classical queuing analysis has been tremendously useful for capacity planning and performance prediction. However, in many real-world cases, it has been found that the results predicted by queuing analysis differ substantially from the actual observed performance. In particular, a number of recent studies have demonstrated that in some environments the traffic pattern is self-similar rather than Poisson. In this paper, we study these self-similar traffic characteristics and the definition of self-similar stochastic processes. We then consider examples of self-similar data traffic reported in recent measurement studies. Finally, we hope that this work makes the characteristics of actual data traffic easier to understand.
