• Title/Summary/Keyword: redundant data

Search Results: 442

A Study on Solving of Double-layer Pattern Problem in Daejeon Correlator (대전상관기에서 복층패턴 문제의 해결에 관한 연구)

  • Oh, Se-Jin;Roh, Duk-Gyoo;Yeom, Jae-Hwan;Chung, Dong-Kyu;Oh, Chung-Sik;Hwang, Ju-Yeon
    • Journal of the Institute of Convergence Signal Processing / v.16 no.4 / pp.162-167 / 2015
  • This paper describes the cause of, and the solution to, the double-layer pattern observed in the Daejeon correlator operated by the Korea-Japan Correlation Center. When the power of an input signal to the correlator is small enough to be buried in the noise, no specific pattern is visible in the signal, but when the power is large, a distinct pattern appears. By comparing observed data with the output of a software correlator, analysis with the AIPS software confirmed that the pattern affected the amplitude gain of the source signal by about 3%. We found that the periodically detected double-layer pattern is caused by a defect in the memory management module responsible for both data input and data serialization in the correlator: while data is repeatedly serialized and read in the memory area assigned by the serialization module, the last portion of the data is duplicated and the memory allocations overlap. By modifying the FPGA program of the memory section in the serialization module to correct this, we confirmed that the double-layer pattern disappears and correlation results are acquired normally.
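
The failure mode described in this abstract (a serialization buffer re-read past the length of its latest fill, so the tail of the previous fill leaks out as redundant last data) can be illustrated with a small conceptual sketch. The buffer size, data layout, and function names below are illustrative assumptions, not the correlator's actual FPGA design:

```python
# Conceptual sketch of the double-layer-pattern failure mode: a fixed serialization
# buffer is refilled without tracking the valid length, so the stale tail of the
# previous fill is re-emitted as "redundant last data".  All names and sizes here
# are illustrative assumptions, not the actual FPGA memory layout.

BUF_SIZE = 8

def serialize_buggy(buffer, frames):
    """Refill the shared buffer per frame but always read BUF_SIZE words."""
    out = []
    for frame in frames:
        buffer[:len(frame)] = frame          # overwrite only the first len(frame) words
        out.append(list(buffer))             # bug: stale words beyond len(frame) leak out
    return out

def serialize_fixed(buffer, frames):
    """Track the valid length of each fill and read only that many words."""
    out = []
    for frame in frames:
        buffer[:len(frame)] = frame
        out.append(list(buffer[:len(frame)]))  # read exactly what was written
    return out

if __name__ == "__main__":
    shared = [0] * BUF_SIZE
    frames = [[1, 2, 3, 4, 5, 6, 7, 8], [9, 10, 11, 12]]   # second fill is shorter
    print(serialize_buggy(shared, frames)[1])   # [9, 10, 11, 12, 5, 6, 7, 8] <- duplicated tail
    shared = [0] * BUF_SIZE
    print(serialize_fixed(shared, frames)[1])   # [9, 10, 11, 12]
```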

An Hybrid Clustering Using Meta-Data Scheme in Ubiquitous Sensor Network (유비쿼터스 센서 네트워크에서 메타 데이터 구조를 이용한 하이브리드 클러스터링)

  • Nam, Do-Hyun;Min, Hong-Ki
    • Journal of the Institute of Convergence Signal Processing / v.9 no.4 / pp.313-320 / 2008
  • The dynamic clustering technique has problems regarding energy consumption. In terms of cluster configuration, the cluster structure must be rebuilt every time the head nodes are re-selected, resulting in high energy consumption. There is also excessive energy consumption when a cluster head node receives identical data from adjacent source nodes in the cluster. This paper proposes a solution to these problems from the energy-efficiency perspective. The round-robin cluster header (RRCH) technique, which fixes the initially constructed cluster and selects cluster head nodes sequentially, is suggested to eliminate the energy cost of repetitive cluster construction. Furthermore, the issue of redundant data arriving at the cluster head node is handled by broadcasting metadata of the initially received data, so that sensor nodes holding identical data do not transmit it again. A simulation experiment was performed to verify the validity of the proposed approach. The results were compared with two of the most widely used conventional techniques, the LEACH (Low-Energy Adaptive Clustering Hierarchy) and HEED (Hybrid Energy-Efficient Distributed clustering) algorithms, in terms of energy consumption, remaining energy per node, and uniformity of distribution. The evaluation confirmed that, in terms of energy consumption, the proposed technique was 29.3% and 21.2% more efficient than LEACH and HEED, respectively.
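
A minimal sketch of the two ideas summarized above, rotating the cluster-head role inside a fixed cluster and suppressing duplicate readings via a metadata broadcast, is given below. The node model, metadata keying, and round handling are simplified assumptions, not the RRCH protocol itself:

```python
# Minimal sketch of round-robin cluster-head rotation (RRCH) plus metadata-based
# suppression of redundant readings.  The node model and round handling are
# illustrative assumptions, not the protocol specification from the paper.
from itertools import cycle

class Cluster:
    def __init__(self, node_ids):
        self.node_ids = list(node_ids)
        self._head_cycle = cycle(self.node_ids)   # fixed cluster, rotating head role
        self.head = next(self._head_cycle)

    def next_round(self):
        """Rotate the head role instead of re-clustering every round."""
        self.head = next(self._head_cycle)

    def collect(self, readings):
        """readings: {node_id: value}. The head broadcasts metadata (the values it
        has already seen) so nodes holding identical data stay silent."""
        seen_metadata = set()
        delivered = []
        for node_id, value in readings.items():
            if value in seen_metadata:      # node suppresses transmission of a duplicate
                continue
            seen_metadata.add(value)
            delivered.append((node_id, value))
        return delivered

if __name__ == "__main__":
    c = Cluster(["n1", "n2", "n3", "n4"])
    print(c.head, c.collect({"n1": 21.5, "n2": 21.5, "n3": 22.0, "n4": 21.5}))
    c.next_round()
    print(c.head)   # head rotates without rebuilding the cluster
```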


An Efficient Spatial Index Technique based on Flash-Memory (플래시 메모리 기반의 효율적인 공간 인덱스 기법)

  • Kim, Joung-Joon;Sim, Hee-Joung;Kang, Hong-Koo;Lee, Ki-Young;Han, Ki-Joon
    • Journal of Korea Spatial Information System Society / v.11 no.2 / pp.133-142 / 2009
  • Recently, with the advance of the wireless Internet and the widespread use of mobile devices, demand for LBS (Location-Based Services) is increasing, and research is required on spatial indexes for storing and maintaining spatial data to provide efficient LBS in mobile environments. In addition, flash memory is increasingly used as an auxiliary storage device to hold large spatial data in mobile terminals with small storage space. However, applying existing spatial indexes to flash memory lowers index performance because of frequent node updates. To solve this problem, flash-memory-based spatial indexes have been studied, but their efficiency suffers from low utilization of buffer and flash-memory space. Accordingly, this paper proposes the FR-Tree (Flash-Memory-based R-Tree), which uses a node compression technique and a delayed write operation technique. The node compression technique of the FR-Tree increases the utilization of flash-memory space by compressing the MBR (Minimum Bounding Rectangle) of spatial data using relative coordinates and the MBR size. The delayed write operation technique reduces the number of write operations in flash memory by storing spatial data temporarily in the buffer and flushing them to flash memory at once, instead of reflecting each insert, update, and delete of spatial data in flash memory per operation. In particular, buffer-space utilization is enhanced by preventing redundant storage of the same spatial data in the buffer. Finally, we performed various performance evaluations and demonstrated the superiority of the FR-Tree over existing spatial indexes.
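
The two FR-Tree techniques named in the abstract, relative-coordinate MBR compression and delayed (buffered) writes, can be sketched roughly as follows; the encoding, flush policy, and class names are assumptions for illustration only:

```python
# Sketch of the two FR-Tree ideas: (1) store child MBRs as small offsets relative to
# the parent MBR's lower-left corner, (2) delay writes in a buffer and flush them to
# flash in one batch.  Field layout and the flush policy are illustrative assumptions.

def compress_mbr(parent_mbr, child_mbr):
    """Encode a child MBR as (dx, dy, width, height) relative to the parent's origin."""
    pxmin, pymin, _, _ = parent_mbr
    cxmin, cymin, cxmax, cymax = child_mbr
    return (cxmin - pxmin, cymin - pymin, cxmax - cxmin, cymax - cymin)

def decompress_mbr(parent_mbr, rel):
    pxmin, pymin, _, _ = parent_mbr
    dx, dy, w, h = rel
    return (pxmin + dx, pymin + dy, pxmin + dx + w, pymin + dy + h)

class DelayedWriteBuffer:
    """Collect node updates in RAM; one flash write per flush instead of per update."""
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.pending = {}                    # node_id -> latest version (no duplicates)

    def stage(self, node_id, node_bytes, flash):
        self.pending[node_id] = node_bytes   # the same node staged twice is stored once
        if len(self.pending) >= self.capacity:
            self.flush(flash)

    def flush(self, flash):
        flash.update(self.pending)           # one batched write to flash
        self.pending.clear()

if __name__ == "__main__":
    parent = (1000.0, 2000.0, 1100.0, 2100.0)
    child = (1010.0, 2005.0, 1030.0, 2025.0)
    rel = compress_mbr(parent, child)
    assert decompress_mbr(parent, rel) == child
    flash_store, buf = {}, DelayedWriteBuffer(capacity=2)
    buf.stage("N1", b"v1", flash_store)
    buf.stage("N1", b"v2", flash_store)      # redundant staging of the same node
    buf.stage("N2", b"v1", flash_store)
    print(rel, sorted(flash_store))
```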


A Study on Automatic Threshold Selection in Line Simplification for Pedestrian Road Network Using Road Attribute Data (보행자용 도로망 선형단순화를 위한 도로속성정보 기반 임계값 자동 선정 연구)

  • Park, Bumsub;Yang, Sungchul;Yu, Kiyun
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.31 no.4 / pp.269-275 / 2013
  • Recently, the importance of pedestrian road networks has been emphasized because they make it possible to provide mobile device users with both route guidance services and surrounding spatial information. However, generating and updating a pedestrian road network nationwide requires a tremendous budget, which hinders further advances of these services. Hence, algorithms that automatically extract pedestrian road networks from raster data are needed. On the other hand, road datasets generated from raster data usually contain unnecessary vertices, which cause maintenance problems such as excessive turns and increased data size. Therefore, this study proposes a method of automatically selecting a proper threshold for each road entity, using not only the Douglas-Peucker algorithm but also road attribute data from the digital map, in order to remove redundant vertices while maximizing line simplification efficiency and minimizing distortion of the road shape. Test results show that the proposed method is suitable for automatic line simplification in terms of vertex reduction ratio and positional accuracy.
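
The simplification step builds on the standard Douglas-Peucker algorithm with a threshold picked per road entity; a minimal sketch is shown below, where the attribute-to-threshold mapping is a placeholder assumption (the paper derives it automatically from digital-map attribute data):

```python
# Minimal Douglas-Peucker simplification with a per-road threshold chosen from road
# attributes.  The attribute -> threshold mapping is an assumed placeholder.
import math

def point_line_distance(p, a, b):
    """Perpendicular distance of point p from the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == dy == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * px - dx * py + bx * ay - by * ax) / math.hypot(dx, dy)

def douglas_peucker(points, tol):
    if len(points) < 3:
        return list(points)
    # find the vertex farthest from the chord joining the endpoints
    idx, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_line_distance(points[i], points[0], points[-1])
        if d > dmax:
            idx, dmax = i, d
    if dmax <= tol:
        return [points[0], points[-1]]          # all intermediate vertices are redundant
    left = douglas_peucker(points[: idx + 1], tol)
    right = douglas_peucker(points[idx:], tol)
    return left[:-1] + right

def threshold_for(road_attributes):
    """Assumed mapping: different road classes tolerate different thresholds."""
    return {"sidewalk": 0.5, "crosswalk": 0.2, "alley": 1.0}.get(road_attributes["type"], 0.5)

if __name__ == "__main__":
    road = {"type": "sidewalk",
            "geometry": [(0, 0), (1, 0.1), (2, -0.1), (3, 0.05), (4, 0)]}
    simplified = douglas_peucker(road["geometry"], threshold_for(road))
    print(simplified)   # redundant vertices removed within the per-road tolerance
```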

CORE-Dedup: IO Extent Chunking based Deduplication using Content-Preserving Access Locality (CORE-Dedup: 내용보존 접근 지역성 활용한 IO 크기 분할 기반 중복제거)

  • Kim, Myung-Sik;Won, You-Jip
    • Journal of the Korea Society of Computer and Information / v.20 no.6 / pp.59-76 / 2015
  • The recent spread of embedded devices and the growth of broadband communication technology have led to a rapid increase in the volume of data created and managed. As a result, data centers have to increase their storage capacity cost-effectively to store the created data. Data deduplication is one way to save storage space by removing redundant data. This work proposes an IO-extent-based deduplication scheme called CORE-Dedup that exploits content-preserving access locality. We acquire IO traces from the block device layer of a virtual machine host and compare the deduplication performance of fixed-size chunking with that of IO-extent-based chunking. For a workload of ten users compiling in a virtual machine environment, 4 KB fixed-size chunking and IO-extent-based chunking use chunk indexes of 14,500 and 1,700 entries, respectively, and the deduplication rates are 60.4% and 57.6%, respectively.
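
A rough illustration of the comparison in the abstract, fixed-size chunking versus IO-extent chunking feeding a deduplication index, is sketched below; the hashing, extent boundaries, and synthetic data are simplified assumptions and not the CORE-Dedup implementation:

```python
# Illustrative comparison of fixed-size chunking vs. IO-extent chunking for
# deduplication.  Hashing, extent boundaries, and the synthetic trace are assumptions.
import hashlib

def dedup_index(chunks):
    """Return (index_size, duplicate_bytes_saved) for a sequence of byte chunks."""
    index, saved = {}, 0
    for chunk in chunks:
        key = hashlib.sha1(chunk).hexdigest()
        if key in index:
            saved += len(chunk)       # chunk already stored once; only a reference is kept
        else:
            index[key] = len(chunk)
    return len(index), saved

def fixed_chunks(data, size=4096):
    return [data[i:i + size] for i in range(0, len(data), size)]

def extent_chunks(data, extents):
    """extents: list of (offset, length) pairs taken from the IO trace."""
    return [data[off:off + length] for off, length in extents]

if __name__ == "__main__":
    block = b"".join(i.to_bytes(2, "big") for i in range(8192))   # 16 KB of unique content
    data = block * 4                                              # written four times
    extents = [(i * len(block), len(block)) for i in range(4)]
    print("fixed 4KB :", dedup_index(fixed_chunks(data)))   # larger index, same savings
    print("IO extent :", dedup_index(extent_chunks(data, extents)))
```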

The Development of Topographic Feature Extraction Method by use of the Seafloor Curvature Measurement (곡률 계산에 의한 해저면 지형요소 추출 기법 개발)

  • Kim, Hyun-Sub;Jung, Mee-Sook;Park, Cheong-Kee
    • Geophysics and Geophysical Exploration / v.10 no.3 / pp.163-172 / 2007
  • A seafloor curvature measurement method was developed to extract redundant topographic features from multi-beam bathymetry data and was then applied to data from an abyssal plain area in the Pacific. Any patch of seafloor can be modeled by a quadratic surface fitted in a linear least-squares sense, and its curvature can be derived from the eigenvalues related to the quadratic model parameters. The magnitude as well as the polarity of the curvature shows a distinct relationship with geometric characteristics of the seafloor such as ridges and valleys. By investigating how the curvature varies with the number of data points used in the quadratic fit, an optimal data aperture size could be chosen for real bathymetry data. Applying the method to real data also required determining threshold values appropriate to the corresponding topographic features. The calculation methods of previous studies were reported to be sensitive to background noise. The improved curvature measurement method, which incorporates the sum of the eigenvalues, reduces unwanted artifacts and enhances the ability to extract lineament features along the strike direction. The application results show that the curvature measurement method is an effective tool for estimating possible mining areas in seamount-free abyssal hill terrain.
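
The curvature computation described above can be sketched as a least-squares quadratic surface fit followed by an eigenvalue analysis of its Hessian; the window construction and synthetic test data below are assumptions, while the use of the eigenvalue sum follows the abstract:

```python
# Sketch of the curvature measurement: fit a quadratic surface
# z = a*x^2 + b*x*y + c*y^2 + d*x + e*y + f to a window of soundings by linear least
# squares, then take the eigenvalues of its Hessian.  The synthetic data and window
# handling are assumed simplifications.
import numpy as np

def quadratic_curvature(x, y, z):
    """Return (sum, min, max) of the Hessian eigenvalues of the fitted surface."""
    A = np.column_stack([x**2, x * y, y**2, x, y, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, z, rcond=None)
    a, b, c = coef[0], coef[1], coef[2]
    hessian = np.array([[2 * a, b], [b, 2 * c]])      # second derivatives of the fit
    eig = np.linalg.eigvalsh(hessian)
    return eig.sum(), eig.min(), eig.max()            # the sum is the noise-robust measure

if __name__ == "__main__":
    # synthetic ridge: z = -x^2 plus a gentle tilt in y
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 200)
    y = rng.uniform(-1, 1, 200)
    z = -x**2 + 0.1 * y
    total, kmin, kmax = quadratic_curvature(x, y, z)
    print(round(total, 3), round(kmin, 3), round(kmax, 3))   # negative sum -> ridge-like
```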

Data Cube Index to Support Integrated Multi-dimensional Concept Hierarchies in Spatial Data Warehouse (공간 데이터웨어하우스에서 통합된 다차원 개념 계층 지원을 위한 데이터 큐브 색인)

  • Lee, Dong-Wook;Baek, Sung-Ha;Kim, Gyoung-Bae;Bae, Hae-Young
    • Journal of Korea Multimedia Society / v.12 no.10 / pp.1386-1396 / 2009
  • Most decision support functions of a spatial data warehouse rely on OLAP operations over a spatial cube, and higher performance is generally obtained by indexing the cube, which stores a huge amount of pre-aggregated information. The Hierarchical Dwarf was proposed as one solution; it can be regarded as an extension of the Dwarf, a compressed index for cube structures. However, it does not consider the spatial dimension and even aggregates incorrectly when there are redundant values at the lower levels. OLAP-favored Searching was proposed as a spatial-hierarchy-based OLAP operation that exploits the advantages of the R-tree. Although it supports aggregation functions well over specified areas, it ignores the operations on the spatial dimensions. In this paper, an indexing approach is proposed that utilizes the concept hierarchy of the spatial cube for decision support. The index consists of concept hierarchy trees of all dimensions, which are linked according to the tuples stored in the fact table. It saves storage cost by preventing identical trees from being created redundantly, and it reduces the OLAP operation cost by integrating the spatial and aspatial dimensions in the virtual concept hierarchy.
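
A conceptual sketch of the proposed index, one concept-hierarchy path per dimension with identical paths interned rather than rebuilt and fact-table rows attached to those paths, is given below; the schema and path encoding are illustrative assumptions rather than the paper's storage layout:

```python
# Conceptual sketch: concept-hierarchy paths per dimension, interned so that no
# identical tree/path is created twice, with fact-table rows linked to each path.
# Schema and path encoding are illustrative assumptions.

class HierarchyIndex:
    def __init__(self):
        self.paths = {}      # interned hierarchy paths, e.g. (country, city, cell)
        self.postings = {}   # path id -> row ids in the fact table

    def intern(self, path):
        """Reuse an existing hierarchy path instead of creating a redundant copy."""
        if path not in self.paths:
            self.paths[path] = len(self.paths)
        return self.paths[path]

    def insert(self, row_id, dimension_paths):
        for path in dimension_paths:
            pid = self.intern(tuple(path))
            self.postings.setdefault(pid, []).append(row_id)

    def rollup(self, prefix):
        """Collect the row ids of every interned path under a hierarchy prefix."""
        rows = []
        for path, pid in self.paths.items():
            if path[:len(prefix)] == tuple(prefix):
                rows.extend(self.postings.get(pid, []))
        return rows

if __name__ == "__main__":
    idx = HierarchyIndex()
    idx.insert(0, [("Korea", "Daejeon", "cell-17"), ("2009", "Q3", "July")])
    idx.insert(1, [("Korea", "Daejeon", "cell-17"), ("2009", "Q3", "August")])  # spatial path reused
    print(len(idx.paths))                       # 3 interned paths, not 4
    print(idx.rollup(("Korea", "Daejeon")))     # [0, 1]
```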


EST Analysis system for panning gene

  • Hur, Cheol-Goo;Lim, So-Hyung;Goh, Sung-Ho;Shin, Min-Su;Cho, Hwan-Gue
    • Proceedings of the Korean Society for Bioinformatics Conference / 2000.11a / pp.21-22 / 2000
  • Expressed sequence tags (ESTs) are partial segments of cDNA produced by 5' or 3' single-pass sequencing of cDNA clones; they are error-prone and generated in highly redundant sets. The advancement and expansion of genomics has led biologists to generate huge numbers of ESTs from a variety of organisms, including humans, microorganisms, and plants, and the cumulative number of ESTs exceeds 5.3 million. As EST data accumulate ever more rapidly, the need for EST analysis tools that extract biological meaning from the data grows. Among the various EST analyses, extracting protein sequences or functional motifs from ESTs is important for identifying their function in vivo. To accomplish this, the precise and accurate identification of the region containing the coding sequence (CDS) is the primary problem to solve, and it helps to extract and detect genuine CDSs and protein motifs from EST collections. Although several public tools are available for EST analysis, none of them accomplishes this objective, and they are targeted at human or microbial ESTs rather than plant ESTs. Thus, to meet the urgent needs of collaborators working with plant ESTs and to establish an analysis system that can serve as general-purpose public software, we constructed a pipelined EST analysis system by integrating public software components. The software we used is as follows: Phred/Cross-match for quality control and vector screening, NCBI BLAST for similarity searching, ICATools for EST clustering, Phrap for EST contig assembly, and BLOCKS/Prosite for protein motif searching. The sample data set used for the construction and verification of this system was 1,386 ESTs from human intrathymic T-cells, verified using the UniGene and Nr databases of NCBI. CDSs were extracted from the sample data set by comparing the sample data with protein sequence/motif databases, determining matched protein sequences/motifs that agree with our defined parameters, and extracting the regions that show similarity. In the near future, software for peptide mass spectrometry fingerprint analysis, one of the proteomics fields, is also expected to be integrated into our system and served. This pipelined EST analysis system will extend our knowledge of plant ESTs and proteins by identifying unknown genes.
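
The pipeline architecture listed in the abstract can be summarized as an ordered chain of stages passing file outputs to the next stage; the skeleton below shows only that ordering and hand-off, leaving the actual Phred/Cross-match, BLAST, ICATools, Phrap, and BLOCKS/Prosite invocations as user-supplied callables, since their exact command lines are not reproduced here:

```python
# Skeleton of the pipelined EST analysis flow: quality control / vector screening,
# similarity search, clustering, contig assembly, then motif search.  The tool
# invocations are left as user-supplied callables; only the stage ordering and the
# file hand-off are shown.
from typing import Callable, List

Stage = Callable[[str], str]   # takes an input file path, returns an output file path

def run_pipeline(raw_est_file: str, stages: List[Stage]) -> str:
    """Feed the output of each stage into the next and return the final result path."""
    current = raw_est_file
    for stage in stages:
        current = stage(current)
    return current

if __name__ == "__main__":
    # Placeholder stages that only record the hand-off; replace each with a wrapper
    # around the corresponding external tool.
    def passthrough(name: str) -> Stage:
        return lambda path: f"{path}.{name}"

    stages = [passthrough(s) for s in
              ("qc_vector_screen", "blast", "cluster", "assemble", "motif")]
    print(run_pipeline("plant_ests.fasta", stages))
    # plant_ests.fasta.qc_vector_screen.blast.cluster.assemble.motif
```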


Underground Facility Survey and 3D Visualization Using Drones (드론을 활용한 지하시설물측량 및 3D 시각화)

  • Kim, Min Su;An, Hyo Won;Choi, Jae Hoon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.40 no.1 / pp.1-14 / 2022
  • In order to conduct rapid, accurate, and safe surveying at excavation sites, this study examined the feasibility of surveying underground facilities with drones and the expected effects of 3D visualization, with the following results. A Phantom 4 Pro 20 MP drone was flown at a 30 m altitude with an 85% overlap (redundancy) flight plan, securing a GSD (Ground Sampling Distance) of 0.85 cm; with 4 GCPs (Ground Control Points) and 2 check points, residual errors of 7.3 mm at the ground control points and 11 mm at the check points were obtained. The importance of GCPs was confirmed when measuring with a low-cost drone: without ground control points, the error range of the X values was -81.2 cm to +90.0 cm and that of the Y values was +6.8 cm to +155.9 cm. This study classifies point cloud data using the Pix4D program, separates underground facility data from road pavement data, and visualizes 3D data of the roads and underground facilities of the actual model through an overlay process. The overlaid point cloud data can be used to check the location and depth of any desired place through the open-source program CloudCompare. This study will become a new paradigm for underground facility surveying.
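
The reported ground sampling distance follows the usual photogrammetric relation GSD = altitude x pixel pitch / focal length; the sketch below uses nominal Phantom 4 Pro 20 MP sensor figures as assumptions to show the calculation at a 30 m flight altitude:

```python
# Ground Sampling Distance from flight altitude via GSD = altitude * pixel_pitch / focal_length.
# The sensor values below are nominal Phantom 4 Pro 20 MP figures assumed for illustration.

def gsd_cm(altitude_m, pixel_pitch_um=2.41, focal_length_mm=8.8):
    """Return the ground sampling distance in centimetres per pixel."""
    gsd_m = altitude_m * (pixel_pitch_um * 1e-6) / (focal_length_mm * 1e-3)
    return gsd_m * 100.0

if __name__ == "__main__":
    print(round(gsd_cm(30.0), 2))   # ~0.82 cm/pixel at a 30 m flight altitude
```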

A Real-Time Stock Market Prediction Using Knowledge Accumulation (지식 누적을 이용한 실시간 주식시장 예측)

  • Kim, Jin-Hwa;Hong, Kwang-Hun;Min, Jin-Young
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.109-130 / 2011
  • One of the major problems in data mining is the size of the data, as most data sets have huge volume these days. Streams of data are normally accumulated into data storages or databases: transactions on the Internet, mobile devices, and ubiquitous environments produce streams of data continuously. Some data sets are simply buried unused inside huge data storage because of their size; others are lost as soon as they are created because, for many reasons, they are never saved. How to use such large data sets, and how to use data on a stream efficiently, are challenging questions in data mining research. Stream data is a data set that is accumulated into data storage from a data source continuously, and in many cases it grows increasingly large over time. Mining information from this massive data takes many resources such as storage, money, and time; these characteristics of stream data make it difficult and expensive to store all the stream data accumulated over time. Conversely, if one uses only recent or partial data to mine information or patterns, valuable and useful information may be lost. To avoid these problems, this study suggests a method that efficiently accumulates information or patterns in the form of a rule set over time. A rule set is mined from each data set in the stream and accumulated into a master rule set store, which also serves as a model for real-time decision making. One of the main advantages of this method is that it takes much less storage space than the traditional method of saving the whole data set. Another advantage is that the accumulated rule set is used directly as a prediction model: a prompt response to user requests is possible at any time because the rule set is always ready to be used for decisions, which makes real-time decision making possible and is the greatest advantage of this method. Based on the theory of ensemble approaches, a combination of many different models can produce a better prediction model, and the consolidated rule set covers the whole data set whereas the traditional sampling approach covers only part of it. This study uses stock market data, which is heterogeneous in the sense that the characteristics of the data vary over time: stock market indexes fluctuate whenever an event influences them, so the variance of the values in each variable is large compared to that of a homogeneous data set. Prediction with a heterogeneous data set is naturally much more difficult than with a homogeneous one, as it is harder to predict in unpredictable situations. This study tests two general mining approaches and compares their prediction performance with that of the suggested method. The first approach induces a rule set from the most recent data set to predict the next data set; the second induces a rule set from all the data accumulated from the beginning every time a new data set has to be predicted. Neither of these performs as well as the accumulated rule set method. Furthermore, the study reports experiments with different prediction models: the first builds a prediction model only with the more important rule sets, while the second uses all the rule sets, assigning weights to the rules based on their performance.
    The second approach shows better performance than the first. The experiments also show that the suggested method can be an efficient approach for mining information and patterns from stream data. One limitation is that its application is bound to stock market data; a more dynamic real-time stream data set is desirable for applying this method. Another open issue is that, as the number of rules grows over time, special rules such as redundant or conflicting rules must be managed efficiently.
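
The accumulation scheme described above (mine a rule set per stream window, add it to a master rule set, and predict by performance-weighted voting) can be illustrated with a toy sketch; the rule-induction step here is a deliberately trivial threshold rule standing in for a real rule-mining algorithm:

```python
# Toy sketch of the accumulated-rule-set idea: mine a (trivial) rule from each data
# window in the stream, add it to a master rule set, and predict by performance-
# weighted voting over the accumulated rules.  The threshold rule below is an assumed
# stand-in for a real rule-mining step.

def mine_rule(window):
    """Induce one rule from a window of (x, label) pairs: predict 1 if x >= threshold."""
    threshold = sum(x for x, _ in window) / len(window)
    correct = sum((x >= threshold) == bool(y) for x, y in window)
    return {"threshold": threshold, "weight": correct / len(window)}

class AccumulatedRuleSet:
    def __init__(self):
        self.rules = []                        # master rule set, grows over time

    def update(self, window):
        self.rules.append(mine_rule(window))   # keep the rule, not the raw window

    def predict(self, x):
        """Performance-weighted vote of all accumulated rules."""
        vote = sum(r["weight"] if x >= r["threshold"] else -r["weight"] for r in self.rules)
        return 1 if vote >= 0 else 0

if __name__ == "__main__":
    model = AccumulatedRuleSet()
    stream = [[(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)],
              [(1.5, 0), (2.5, 1), (3.5, 1), (4.5, 1)]]
    for window in stream:
        model.update(window)                   # only the rule set is stored, not the data
    print(len(model.rules), model.predict(3.2), model.predict(1.2))
```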