• Title/Abstract/Keywords: histogram data

Search results: 488 items (processing time 0.023 seconds)

Hadoop Based Wavelet Histogram for Big Data in Cloud

  • Kim, Jeong-Joon
    • Journal of Information Processing Systems / Vol. 13, No. 4 / pp. 668-676 / 2017
  • Recently, the importance of big data has been emphasized with the spread of smartphones, the web, and SNS. As a result, MapReduce, which can process big data efficiently, is receiving worldwide attention for its excellent scalability and stability. Because big data is large in volume, generated quickly, and varied in its properties, it is more efficient to process summary information about the data than the data itself. The wavelet histogram, a typical technique for generating data summaries, can produce optimal summary information without losing information from the original data. Systems that apply a MapReduce-based wavelet histogram generation technique have therefore been actively studied. However, existing approaches are slow because the wavelet histogram is generated through one or more MapReduce jobs, and the error of data restored from the wavelet histogram is likely to be large. In contrast, the MapReduce-based system developed in this paper generates the wavelet histogram in a single MapReduce job, which greatly increases generation speed. In addition, because the wavelet histogram is generated according to an error bound specified by the user, the error of the data restored from the wavelet histogram can be controlled. Finally, we verified the efficiency of the developed system through performance evaluation.
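
To make the idea of a wavelet histogram with a user-specified error bound concrete, here is a minimal single-machine sketch. It is not the MapReduce system from the paper: it assumes an unnormalized (averaging) Haar transform over a frequency vector whose length is a power of two, and a simple greedy pruning rule. The names `haar_transform`, `inverse_haar`, and `prune_to_error_bound` are hypothetical.

```python
def haar_transform(freqs):
    """Averaging Haar decomposition of a frequency vector (length must be 2^k)."""
    n = len(freqs)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    data = [float(v) for v in freqs]
    details = []
    while len(data) > 1:
        averages = [(data[i] + data[i + 1]) / 2 for i in range(0, len(data), 2)]
        diffs = [(data[i] - data[i + 1]) / 2 for i in range(0, len(data), 2)]
        details = diffs + details  # coarser levels end up first
        data = averages
    return data + details  # [overall average, detail coefficients...]


def inverse_haar(coeffs):
    """Rebuild the frequency vector from coefficients made by haar_transform."""
    data = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        diffs = coeffs[pos:pos + len(data)]
        expanded = []
        for avg, diff in zip(data, diffs):
            expanded.extend((avg + diff, avg - diff))
        pos += len(data)
        data = expanded
    return data


def prune_to_error_bound(freqs, max_abs_error):
    """Greedy pruning (an assumption, not the paper's rule): zero the smallest
    detail coefficients while every restored value stays within max_abs_error."""
    coeffs = haar_transform(freqs)
    kept = list(coeffs)
    for idx in sorted(range(1, len(coeffs)), key=lambda i: abs(coeffs[i])):
        trial = list(kept)
        trial[idx] = 0.0
        restored = inverse_haar(trial)
        if max(abs(r - f) for r, f in zip(restored, freqs)) <= max_abs_error:
            kept = trial
    return kept
```

With an error bound of 0 every coefficient is kept and the restored data matches the input exactly. A larger bound lets more coefficients be dropped, so the summary gets smaller and the restored data less exact.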

Piecewise Continuous Linear Density Estimator

  • Jang, Dae-Heung
    • Journal of the Korean Data and Information Science Society / Vol. 16, No. 4 / pp. 959-968 / 2005
  • The piecewise linear histogram can be used as a simple and efficient density estimator. However, the piecewise linear histogram is a discontinuous function. We propose the piecewise continuous linear histogram as a simple and efficient density estimator and as an alternative to the piecewise linear histogram.

Reversible Data Hiding Scheme Based on Maximum Histogram Gap of Image Blocks

  • Arabzadeh, Mohammad;Rahimi, Mohammad Reza
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 6, No. 8 / pp. 1964-1981 / 2012
  • In this paper, a reversible data hiding scheme based on histogram shifting of host image blocks is presented. The method attempts to use the full available capacity for data embedding by dividing the image into non-overlapping blocks. Applying histogram shifting to each block requires extra information to be saved as overhead data for that block. This extra information (overhead or bookkeeping information) is used to extract the payload and recover the block to its original state. A method that eliminates the need for this extra information is also introduced. It uses the maximum gap between histogram bins to find the pixel value that was used for embedding on the sender side. Experimental results show that the proposed method provides higher embedding capacity than the original histogram-shifting-based reversible data hiding method and its improved versions in the current literature, while maintaining the quality of the marked image at an acceptable level.
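
For readers unfamiliar with histogram shifting, the sketch below shows the baseline technique that the abstract builds on, applied to a flat list of 8-bit pixel values. It is not the block-based maximum-gap method proposed in the paper. In particular, it still passes the peak and empty-bin values to the receiver, which is exactly the overhead the paper aims to remove. The function names `embed` and `extract` are hypothetical, and the sketch assumes an empty bin exists to the right of the peak.

```python
from collections import Counter


def embed(pixels, bits):
    """Hide bits in pixels by shifting the histogram to the right of its peak."""
    hist = Counter(pixels)
    peak = max(hist, key=hist.get)  # most frequent value carries the payload
    zero = next((v for v in range(peak + 1, 256) if hist[v] == 0), None)
    if zero is None:
        raise ValueError("no empty bin to the right of the peak")
    if len(bits) > hist[peak]:
        raise ValueError("payload exceeds capacity")
    payload = iter(bits)
    marked = []
    for p in pixels:
        if peak < p < zero:
            marked.append(p + 1)  # shift to free the bin next to the peak
        elif p == peak:
            marked.append(p + next(payload, 0))  # bit 1 moves the pixel to peak + 1
        else:
            marked.append(p)
    return marked, peak, zero, len(bits)


def extract(marked, peak, zero, n_bits):
    """Recover the payload and the original pixels exactly."""
    bits, restored = [], []
    for p in marked:
        if p == peak:
            bits.append(0)
            restored.append(peak)
        elif p == peak + 1:
            bits.append(1)
            restored.append(peak)
        elif peak + 1 < p <= zero:
            restored.append(p - 1)  # undo the shift
        else:
            restored.append(p)
    return bits[:n_bits], restored
```

Capacity equals the number of pixels at the peak value. This is why applying the same idea per block, as the abstract describes, can raise total capacity at the cost of per-block bookkeeping.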

A Novel Filtered Bi-Histogram Equalization Method

  • Sengee, Nyamlkhagva;Choi, Heung-Kook
    • Journal of Korea Multimedia Society (한국멀티미디어학회논문지) / Vol. 18, No. 6 / pp. 691-700 / 2015
  • Here, we present a new framework for histogram equalization in which both local and global contrasts are enhanced using neighborhood metrics. While neighborhood information is being checked, filters can simultaneously improve image quality. Filters are chosen according to image properties and the desired effect, such as noise removal or smoothing. Our experimental results confirmed that this does not increase the computational cost, because in the proposed arrangement the filtering is performed while the histogram is built and the neighborhood metrics are checked. If the two methods, i.e., histogram equalization and filtering, are performed sequentially, the first method uses the original image data and the next method uses the data altered by the first. With combined histogram equalization and filtering, the original data can be used for both methods. The proposed method is fully automated, and any spatial neighborhood filter type and size can be used. Our experiments confirmed that the proposed method is more effective than similar techniques reported previously.

Fuzzy histogram in estimating loss distributions for operational risk

  • 박노진
    • Journal of the Korean Data and Information Science Society / Vol. 20, No. 4 / pp. 705-712 / 2009
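  • Histograms are simple to use and convey the overall structure of data at a glance, but their appearance can change depending on how the class intervals are set. To address this problem, a histogram based on fuzzy concepts was proposed and its effectiveness was demonstrated (Loquin and Strauss, 2008). Histograms are used in many fields, and recently they have proven useful for estimating loss distributions related to operational risk. However, when they are used to estimate extreme-value probability functions with a threshold, the histogram's shape changes with the choice of threshold, which tends to make it difficult to use. This study shows that when a fuzzy histogram is used to estimate the extreme-value distribution of losses, the overall shape varies less with the choice of threshold than that of an ordinary histogram, so a relatively stable distribution can be estimated.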

Extensions of Histogram Construction Algorithms for Interval Data

  • 이호석;심규석;이병기
    • Journal of KIISE: Databases (한국정보과학회논문지:데이타베이스) / Vol. 34, No. 4 / pp. 369-377 / 2007
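  • A histogram is one of the techniques for effectively summarizing original data and is widely used for selectivity estimation and approximate query processing. Existing histogram construction algorithms apply only to point data, in which each item is represented by a single value. In everyday life, however, interval data such as the temperature over a day or stock prices are encountered as often as point data. In this paper, we extend histogram construction algorithms for point data to interval data. Experiments on synthetic data show that the proposed algorithms perform better than naive extensions of existing point-data histograms.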

Double monothetic clustering for histogram-valued data

  • Kim, Jaejik;Billard, L.
    • Communications for Statistical Applications and Methods / Vol. 25, No. 3 / pp. 263-274 / 2018
  • One of the common issues in large dataset analyses is to detect and construct homogeneous groups of objects in those datasets. This is typically done by some form of clustering technique. In this study, we present a divisive hierarchical clustering method for two monothetic characteristics of histogram data. Unlike a classical data point, a histogram has its own internal variation as well as location information. However, to find the optimal bipartition, existing divisive monothetic clustering methods for histogram data consider only location information as a monothetic characteristic, so they cannot distinguish histograms with the same location but different internal variations. Thus, a divisive clustering method considering both the location and the internal variation of histograms is proposed in this study. The method has an advantage in interpreting clustering outcomes because it provides a binary question for each split. The proposed clustering method is verified through a simulation study and applied to a large U.S. house property value dataset.

Application of Zero-Inflated Poisson Distribution to Utilize Government Quality Assurance Activity Data

  • 김지훈;이창우
    • Journal of Korean Society for Quality Management (품질경영학회지) / Vol. 46, No. 3 / pp. 509-522 / 2018
  • Purpose: The purpose of this study was to propose a more accurate mathematical model that can represent the results of government quality assurance activity, especially corrective actions and flaws. Methods: The data collected during government quality assurance activity were represented as histograms. To determine which distribution (Poisson or Zero-Inflated Poisson) better represents each histogram, this study applied Pearson's correlation coefficient. Results: The histogram of corrective actions over the past 3 years and the Zero-Inflated Poisson distribution were strongly related, with correlation coefficients above 0.94. The flaw data could not be re-parameterized to the Zero-Inflated Poisson distribution because flaws occurred too infrequently. However, the histogram of flaw data over the past 3 years and the Poisson distribution were strongly related, with a correlation coefficient of 0.99. Conclusion: The Zero-Inflated Poisson distribution represented the corrective action histogram better than the Poisson distribution. For the flaw data histogram, however, the Poisson distribution was more accurate than the Zero-Inflated Poisson distribution.
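
The comparison described in the abstract can be sketched as follows: evaluate each candidate probability mass function at the counts 0, 1, 2, ... and correlate it with the histogram's relative frequencies. This is only an illustration of the general idea. The helper names are hypothetical, and the rate `lam` and zero-inflation weight `pi` are assumed to have been estimated already, since the abstract does not give the study's fitting procedure.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-Inflated Poisson: extra probability mass pi placed at zero."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_score(rel_freqs, pmf):
    """Correlation between histogram relative frequencies and pmf(0..len-1)."""
    return pearson(rel_freqs, [pmf(k) for k in range(len(rel_freqs))])
```

For example, `fit_score(freqs, lambda k: zip_pmf(k, lam, pi))` and `fit_score(freqs, lambda k: poisson_pmf(k, lam))` can be compared directly, and the higher correlation indicates the distribution whose shape tracks the histogram more closely.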

Histogram Equalization Using Background Speakers' Utterances for Speaker Identification

  • 김명재;양일호;소병민;김민석;유하진
    • Phonetics and Speech Sciences (말소리와 음성과학) / Vol. 4, No. 2 / pp. 79-86 / 2012
  • In this paper, we propose a novel approach to improve histogram equalization for speaker identification. Our method collects all speech features of the UBM training data to form a reference distribution. The ranks of the feature vectors are calculated in the sorted list of the combined UBM training data and test data. We use these ranks to perform order-based histogram equalization. The proposed method improves the accuracy of the speaker recognition system on short utterances. We use four kinds of speech databases to evaluate the proposed speaker recognition system and compare it with cepstral mean normalization (CMN), mean and variance normalization (MVN), and histogram equalization (HEQ). Our system reduced the error rate by 33.3% relative to the baseline system.
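
The rank-based step described above can be sketched for a single feature dimension as follows. This is a rough illustration, not the authors' implementation: the function name `order_based_heq` is hypothetical, and mapping the ranks onto a standard normal target distribution is an assumption that the abstract does not confirm.

```python
from bisect import bisect_right
from statistics import NormalDist


def order_based_heq(test_values, background_values):
    """Equalize one feature dimension using ranks in the pooled sorted list.

    background_values stands in for the UBM training features; the standard
    normal target is an assumption made for this sketch.
    """
    pooled = sorted(list(background_values) + list(test_values))
    n = len(pooled)
    target = NormalDist()  # mean 0, standard deviation 1
    equalized = []
    for value in test_values:
        rank = bisect_right(pooled, value)  # 1-based rank; ties take the highest
        quantile = (rank - 0.5) / n  # strictly inside (0, 1)
        equalized.append(target.inv_cdf(quantile))
    return equalized
```

Pooling the background data with the test data means that even a very short test utterance is ranked against a large reference set, which is one plausible reading of why the approach helps with short utterances.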

Histogram-based Reversible Data Hiding Based on Pixel Differences with Prediction and Sorting

  • Chang, Ya-Fen;Tai, Wei-Liang
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 6, No. 12 / pp. 3100-3116 / 2012
  • Reversible data hiding enables messages to be embedded in a host image without any loss of host content. It has been proposed for image authentication: if the watermarked image is deemed authentic, it can be reverted to an exact copy of the original image as it was before embedding. In this paper, we present an improved histogram-based reversible data hiding scheme based on prediction and sorting. A rhombus prediction is employed to generate the predictions used for histogram-based embedding. Sorting the predictions helps increase the embedding capacity. Characteristics of the pixel differences are used to achieve a large hiding capacity while keeping distortion low. The proposed scheme uses a two-stage embedding strategy to solve the problem of communicating peak points. We also present a histogram shifting technique to prevent overflow and underflow. Performance comparisons with existing reversible data hiding schemes are provided to demonstrate the superiority of the proposed scheme.