• Title/Summary/Keyword: 중복제거기법 (deduplication techniques)


Data Quality Management: Operators and a Matching Algorithm with a CRM Example (데이터 품질 관리 : CRM을 사례로 연산자와 매칭기법 중심)

  • 심준호
    • The Journal of Society for e-Business Studies, v.8 no.3, pp.117-130, 2003
  • It is not unusual to observe a great amount of redundant or inconsistent data even within an e-business system such as a CRM (Customer Relationship Management) system. The problem is aggravated when we construct a system whose information is gathered from different sources. Data quality management is needed to avoid redundant or inconsistent data in such an information system. A data quality process, in general, consists of three phases: data cleaning (scrubbing), matching, and integration. In this paper, we introduce and categorize data quality operators for each phase. We then describe the distance function used in the matching phase and present a matching algorithm, PRIMAL (a PRactical Matching ALgorithm). Finally, we discuss related work and future research.
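The PRIMAL algorithm and the paper's actual distance function are not reproduced in the abstract; as a rough illustration of the matching phase it describes, the sketch below pairs records whose normalized Levenshtein distance falls under a threshold. The distance function and the threshold value are stand-ins, not the paper's.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming (two rolling rows)."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n]

def match_records(records, threshold=0.3):
    """Report record pairs whose normalized distance is below the threshold."""
    matches = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            a, b = records[i], records[j]
            dist = edit_distance(a.lower(), b.lower()) / max(len(a), len(b), 1)
            if dist <= threshold:
                matches.append((a, b))
    return matches
```

A real matching phase would compare structured fields (name, address, phone) with field-specific distances; this sketch treats each record as a single string for brevity.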


Removing Non-informative Features by Robust Feature Wrapping Method for Microarray Gene Expression Data (유전자 알고리즘과 Feature Wrapping을 통한 마이크로어레이 데이타 중복 특징 소거법)

  • Lee, Jae-Sung;Kim, Dae-Won
    • Journal of KIISE:Software and Applications, v.35 no.8, pp.463-478, 2008
  • Due to the high-dimensional nature of microarray gene expression datasets, machine learning algorithms typically rely on feature selection techniques to perform effective classification. However, the large number of features compared to the number of samples makes feature selection computationally prohibitive and prone to errors. A traditional feature selection approach is feature filtering, which measures one gene per step; it is therefore a univariate approach that cannot capture multivariate correlations. In this paper, we propose a scoring function that measures both class separability and inter-feature correlations, solving the problem inherent in the feature filtering approach.
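The paper's exact scoring function is not given in the abstract; the sketch below only illustrates the general idea of combining class separability (a Fisher-style score) with an inter-feature correlation penalty inside a greedy selection loop. The particular score and the way the two terms are combined are assumptions.

```python
import statistics as st

def fisher_score(values, labels):
    """Class separability for two classes: squared mean gap over summed variances."""
    g0 = [v for v, y in zip(values, labels) if y == 0]
    g1 = [v for v, y in zip(values, labels) if y == 1]
    var = st.pvariance(g0) + st.pvariance(g1)
    return (st.mean(g0) - st.mean(g1)) ** 2 / (var + 1e-12)

def pearson(x, y):
    """Pearson correlation coefficient between two feature vectors."""
    mx, my = st.mean(x), st.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def select_features(genes, labels, k):
    """Greedy pick: high separability, discounted by correlation with chosen genes."""
    chosen = []
    remaining = list(genes)
    while remaining and len(chosen) < k:
        def score(name):
            sep = fisher_score(genes[name], labels)
            red = max((abs(pearson(genes[name], genes[c])) for c in chosen),
                      default=0.0)
            return sep * (1.0 - red)  # multivariate penalty, unlike pure filtering
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

Unlike univariate filtering, the redundancy term makes each new pick depend on the genes already selected.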

Storage and Retrieval of XML Documents Without Redundant Path Information (경로정보의 중복을 제거한 XML 문서의 저장 및 질의처리 기법)

  • Lee Hiye-Ja;Jeong Byeong-Soo;Kim Dae-Ho;Lee Young-Koo
    • The KIPS Transactions:PartD, v.12D no.5 s.101, pp.663-672, 2005
  • This paper proposes an approach that removes the redundancy of path information and uses an inverted index as an efficient way to store a large volume of XML documents and to retrieve the wanted information from them. An XML document is decomposed into nodes based on its tree structure and stored in relational tables according to node type, with path information from the root to each node. Existing methods using path information store data for all element paths, which causes retrieval performance to degrade as data volume increases. Our approach stores data only for leaf element paths, excluding internal element paths. Because the inverted index is built from leaf element paths only, the number of posting lists per keyword becomes smaller than in the existing methods. For the storage and retrieval of XML data, our approach requires neither the XML schema information of the documents nor any extension of the relational database. We demonstrate the better performance of our approach over the existing approaches within the scope of our experiments.
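As a minimal illustration of the leaf-path idea (not the paper's actual relational schema), the sketch below decomposes XML documents into leaf element paths only and builds an inverted index whose postings map keywords to (document, leaf path) pairs; internal element paths are never stored.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def leaf_paths(xml_text):
    """Yield (root-to-leaf path, text) pairs; internal paths are skipped."""
    root = ET.fromstring(xml_text)
    def walk(node, path):
        children = list(node)
        if not children:  # leaf element: keep its full path and text
            yield path, (node.text or "").strip()
        for child in children:
            yield from walk(child, path + "/" + child.tag)
    yield from walk(root, "/" + root.tag)

def build_index(docs):
    """Inverted index: keyword -> postings of (doc id, leaf path)."""
    index = defaultdict(list)
    for doc_id, xml_text in docs.items():
        for path, text in leaf_paths(xml_text):
            for word in text.split():
                index[word.lower()].append((doc_id, path))
    return index
```

Because only leaf paths appear in the postings, a path-based query resolves directly to the data-bearing elements without joining over intermediate path rows.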

Mapping Cache for High-Performance Memory Mapped File I/O in Memory File Systems (메모리 파일 시스템 기반 고성능 메모리 맵 파일 입출력을 위한 매핑 캐시)

  • Kim, Jiwon;Choi, Jungsik;Han, Hwansoo
    • Journal of KIISE, v.43 no.5, pp.524-530, 2016
  • The desire to access data faster and the growth of next-generation memories, such as non-volatile memories, have driven research on memory file systems. Memory mapped file I/O, which has less overhead than read-write I/O, is recommended for a high-performance memory file system. Memory mapped file I/O, however, incurs a page table overhead, which becomes one of the major overheads to resolve in overall file I/O performance. We find that the same overhead recurs unnecessarily, because the page table of a file is removed whenever the file is closed and rebuilt when it is opened again. To remove this duplicated overhead, we propose the mapping cache, a technique that does not delete a file's page table when its mapping is released, but saves it for reuse. We demonstrate that the mapping cache improves the performance of traditional file I/O by 2.8x and web server performance by 12%.
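The paper's mapping cache lives inside the kernel's memory file system and retains page tables; the sketch below only mirrors that idea at application level, keeping `mmap` objects alive across close/open cycles so that reopening a file reuses the existing mapping instead of re-establishing it. All names are illustrative, not the paper's.

```python
import mmap
import os

class MappingCache:
    """User-level analogy of the in-kernel mapping cache: retain file
    mappings on release so a later open reuses them."""
    def __init__(self):
        self._cache = {}  # path -> live mmap object

    def open_mapped(self, path):
        mapping = self._cache.get(path)
        if mapping is None or mapping.closed:
            fd = os.open(path, os.O_RDONLY)
            try:
                mapping = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
            finally:
                os.close(fd)  # the mapping stays valid after the fd is closed
            self._cache[path] = mapping
        return mapping

    def release(self, path):
        """'Close' is a no-op: the mapping is retained for reuse."""
        pass

    def evict(self, path):
        """Actually tear the mapping down (cache pressure, file deleted, ...)."""
        mapping = self._cache.pop(path, None)
        if mapping is not None:
            mapping.close()
```

An explicit `evict` path is still needed, just as the kernel cache must reclaim retained page tables under memory pressure.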

Spectral Analysis Method to Eliminate Spurious in FMICW HRR Millimeter-Wave Seeker (주파수 변조 단속 지속파를 이용하는 고해상도 밀리미터파 탐색기의 스퓨리어스 제거를 위한 스펙트럼 분석 기법)

  • Yang, Hee-Seong;Chun, Joo-Hwan;Song, Sung-Chan
    • The Journal of Korean Institute of Electromagnetic Engineering and Science, v.23 no.1, pp.85-95, 2012
  • In this paper, we develop a spectral analysis scheme to eliminate the spurious peaks generated in a high range resolution (HRR) millimeter-wave seeker based on an FMICW system. In contrast to an FMCW system, an FMICW system generates spurious peaks in the spectrum of its IF signal, caused by the periodic discontinuity of the signal. If a band-pass filter is used to eliminate these peaks, the accuracy of the system depends on the previously estimated range; if a random interrupted sequence is used, the noise floor rises; and if a staggering process is used, several waveforms must be transmitted to obtain overlapped information. Using recently introduced spectral analysis schemes such as IAA (Iterative Adaptive Approach) and SPICE (SemiParametric Iterative Covariance-based Estimation), the spurious peaks can be eliminated effectively. Since IAA and SPICE require distinguishing reliable data from unreliable data and using only the former, an STFT (Short Time Fourier Transform) is applied in the distinguishing process.

Program Optimality Using Network Partition in Embedded Systems (임베디드 시스템에서 네트워크 분할을 이용한 프로그램 최적화)

  • Choi Kang-Hee;Shin Hyun-Duck
    • Journal of the Korea Computer Industry Society, v.7 no.3, pp.145-154, 2006
  • This paper improves the Speculative Partial Redundancy Elimination (SPRE) algorithm proposed by Knoop et al. The improved SPRE algorithm performs execution-speed optimization, based on execution-frequency information obtained from profiling, together with memory-space optimization. The first goal of the presented algorithm is to reduce space requirements; the second is to decrease execution time. Since placing too much weight on execution-speed optimization may cause an explosion of memory usage, it is important to consider memory size. This is a significant advantage in embedded systems, which are often more concerned with the required memory size than with execution speed. In this paper, we implement a min-cut algorithm in which the control flow graph is modeled as a network and partitioned.
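Knoop et al.'s SPRE formulation is not detailed in the abstract; the sketch below shows only the underlying building block it names: a BFS-based (Edmonds-Karp) max-flow computation whose residual graph yields the min-cut partition of a small network, such as one derived from a control flow graph. The example graph is invented for illustration.

```python
from collections import deque

def min_cut(graph, s, t):
    """Edmonds-Karp max-flow; returns (cut value, source-side node set).
    graph: {u: {v: capacity}}."""
    res = {u: dict(nbrs) for u, nbrs in graph.items()}  # residual capacities
    for u, nbrs in graph.items():
        for v in nbrs:
            res.setdefault(v, {}).setdefault(u, 0)      # reverse edges
    flow = 0
    while True:
        parent = {s: None}                 # BFS for an augmenting path
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, cap in res[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            break
        path, v = [], t                    # walk parents back to s
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[u][v] for u, v in path)
        for u, v in path:                  # push flow along the path
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
        flow += bottleneck
    side = {s}                             # source side = residual-reachable nodes
    q = deque([s])
    while q:
        u = q.popleft()
        for v, cap in res[u].items():
            if cap > 0 and v not in side:
                side.add(v)
                q.append(v)
    return flow, side
```

In an SPRE setting, edge capacities would encode profiled execution frequencies, and the cut decides where computations are placed.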


Multi-view Video Coding with View Scalability (시점 계위성을 고려한 다시점 비디오 부호화 기법)

  • Kim, Jae-Sub;Choi, Mi-Nam;Baek, Yun-Ki;Kim, Dong-Wook;Kim, Hwa-Sung;Yoo, Ji-Sang
    • The Journal of Korean Institute of Communications and Information Sciences, v.32 no.8C, pp.703-711, 2007
  • In this paper, we propose a multi-view video coding (MVC) algorithm that considers view scalability. The proposed algorithm achieves high compression efficiency by reducing inter-view redundancy through inter-view decomposition, and adaptively reconstructs a multi-view video from an encoded bit stream. Furthermore, the reference view can be decoded by a conventional H.264/AVC decoder, and the other views are adaptively decoded at the receiver by filtering to support view scalability. Experimental results show that the proposed algorithm performs better than the conventional H.264 codec while also offering view scalability.

Multispectral Image Compression Using Classified Interband Prediction and Vector Quantization in Wavelet domain (웨이브릿 영역에서의 영역별 대역간 예측과 벡터 양자화를 이용한 다분광 화상 데이타의 압축)

  • 반성원;권성근;이종원;박경남;김영춘;장종국;이건일
    • The Journal of Korean Institute of Communications and Information Sciences, v.25 no.1B, pp.120-127, 2000
  • In this paper, we propose multispectral image compression using classified interband prediction and vector quantization in the wavelet domain. The method classifies each region by considering the reflection characteristics of each band in the image data. In the wavelet domain, we perform classified intraband VQ to remove intraband redundancy for a reference band image, chosen as the band with the lowest spatial variance and the best correlation with the other bands. We then perform classified interband prediction to remove interband redundancy for the remaining bands. The error wavelet coefficients between the original and predicted images are intraband vector quantized to reduce the prediction error. Experiments on remotely sensed satellite images show that the coding efficiency of the proposed method is better than that of the conventional method.
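The codebook design, classification, and wavelet steps are not given in the abstract; the sketch below shows only the core vector quantization operation the method relies on: mapping each coefficient block to the index of its nearest codeword. The codebook and block values are made up for illustration.

```python
def quantize(vector, codebook):
    """Return the index of the nearest codeword (squared Euclidean distance)."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: d2(vector, codebook[i]))

def vq_encode(blocks, codebook):
    """Replace each coefficient block by its codebook index."""
    return [quantize(b, codebook) for b in blocks]

def vq_decode(indices, codebook):
    """Reconstruct blocks by codebook lookup."""
    return [codebook[i] for i in indices]
```

Compression comes from transmitting small indices instead of the blocks themselves; in the paper this step is applied both to the reference band and to the interband prediction errors.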


A Fast Half Pixel Motion Estimation Method based on the Correlations between Integer pixel MVs and Half pixel MVs (정 화소 움직임 벡터와 반 화소 움직임 벡터의 상관성을 이용한 빠른 반 화소 움직임 추정 기법)

  • Yoon HyoSun;Lee GueeSang
    • The KIPS Transactions:PartB, v.12B no.2 s.98, pp.131-136, 2005
  • Motion estimation (ME) has been developed to remove redundant data contained in a sequence of images. ME is an important part of video encoding systems, since it can significantly affect the quality of an encoded sequence. Generally, ME consists of two stages: integer pixel motion estimation and half pixel motion estimation. Many methods have been developed to reduce the computational complexity of integer pixel motion estimation, but further study is needed to reduce the complexity of half pixel motion estimation. In this paper, a method based on the correlations between integer pixel motion vectors and half pixel motion vectors is proposed for half pixel motion estimation. The proposed method has less computational complexity than the full half pixel search method (FHSM), which requires the bilinear interpolation of half pixels and examines nine half pixel points to find the half pixel motion vector. Experimental results show that the proposed method is 2.5 to 80 times faster than FHSM, with image quality degradation of only about 0.07 to 0.69 dB.
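The paper's correlation rule for choosing candidate half-pel points is not given in the abstract; the sketch below shows the shared machinery instead: bilinear interpolation at half-pel positions and an SAD-based refinement that can be run over all nine offsets (as in FHSM) or over a reduced, correlation-guided subset. Coordinates are in half-pel units; frame data and sizes are invented for illustration.

```python
def half_pixel(frame, y2, x2):
    """Sample `frame` at half-pel coordinates (y2, x2) by bilinear interpolation."""
    y0, x0 = y2 // 2, x2 // 2
    fy, fx = y2 % 2, x2 % 2          # 0 = integer position, 1 = half position
    def px(y, x):                    # clamp at frame borders
        y = min(max(y, 0), len(frame) - 1)
        x = min(max(x, 0), len(frame[0]) - 1)
        return frame[y][x]
    return (px(y0, x0) * (2 - fy) * (2 - fx)
            + px(y0, x0 + 1) * (2 - fy) * fx
            + px(y0 + 1, x0) * fy * (2 - fx)
            + px(y0 + 1, x0 + 1) * fy * fx) / 4.0

def sad(cur, ref, mvy2, mvx2):
    """Sum of absolute differences for block `cur` at a half-pel displacement."""
    return sum(abs(v - half_pixel(ref, 2 * y + mvy2, 2 * x + mvx2))
               for y, row in enumerate(cur) for x, v in enumerate(row))

def half_pel_search(cur, ref, int_mv, offsets):
    """Refine an integer MV over the given half-pel offsets (FHSM uses all nine;
    a reduced, correlation-guided subset cuts the interpolation cost)."""
    base_y, base_x = 2 * int_mv[0], 2 * int_mv[1]
    return min(((base_y + dy, base_x + dx) for dy, dx in offsets),
               key=lambda mv: sad(cur, ref, mv[0], mv[1]))
```

Passing fewer than nine offsets is exactly where the proposed method saves work: each skipped offset avoids one bilinear interpolation of the whole block.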

Investigating an Automatic Method in Summarizing a Video Speech Using User-Assigned Tags (이용자 태그를 활용한 비디오 스피치 요약의 자동 생성 연구)

  • Kim, Hyun-Hee
    • Journal of the Korean Society for Library and Information Science, v.46 no.1, pp.163-181, 2012
  • We investigated how useful video tags are in summarizing video speech and how valuable positional information is for speech summarization. Furthermore, we examined the similarity among sentences selected for a speech summary in order to reduce its redundancy. Based on these analysis results, we designed and evaluated a method for automatically summarizing speech transcripts using a modified Maximum Marginal Relevance model, which not only reduces redundancy but also makes use of social tags, title words, and sentence positional information. Finally, we compared the proposed method to the Extractor system, in which key sentences of a video speech are chosen using the frequency and location information of speech content words. Results showed that the precision and recall rates of the proposed method were higher than those of the Extractor system, although the difference in recall rates was not statistically significant.
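The paper's exact modification of the MMR model is not specified in the abstract; the sketch below is a generic greedy MMR with word-overlap (Jaccard) similarity, where the query stands in for tag and title words and a small positional bonus stands in for sentence position. The weighting (`lam`, the bonus factor) is an assumption.

```python
def similarity(s1, s2):
    """Word-overlap (Jaccard) similarity between two sentences."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    return len(w1 & w2) / len(w1 | w2) if w1 | w2 else 0.0

def mmr_summary(sentences, query, k, lam=0.7):
    """Greedy MMR: relevance to the query (tag/title words) minus redundancy
    with already-selected sentences; earlier sentences get a small bonus."""
    n = len(sentences)
    selected, candidates = [], list(range(n))
    while candidates and len(selected) < k:
        def score(i):
            rel = similarity(sentences[i], query) + 0.05 * (n - i) / n
            red = max((similarity(sentences[i], sentences[j]) for j in selected),
                      default=0.0)
            return lam * rel - (1 - lam) * red
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in sorted(selected)]
```

The redundancy term is what keeps two near-duplicate sentences from both entering the summary, which is the similarity analysis the abstract describes.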