• Title/Summary/Keyword: kernel estimation

Estimation of Document Similarity using Semantic Kernel Derived from Helmholtz Machines (헬름홀츠머신 학습 기반의 의미 커널을 이용한 문서 유사도 측정)

  • 장정호;김유섭;장병탁
    • Proceedings of the Korean Information Science Society Conference / 2003.04c / pp.440-442 / 2003
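  • Automatic analysis of the concepts or semantic relations within a document collection enables more efficient information acquisition and document comparison at the concept level rather than only at the word level. This paper presents a method that uses a latent variable model to automatically extract semantic relations between words from a document collection and thereby measure inter-document similarity effectively. As the latent variable model we use the Helmholtz machine, which makes multiple-factor models easy to train, and we build a semantic kernel for document comparison from its learning results. In retrieval experiments on two document collections, MEDLINE and CACM, the proposed method improved average precision by more than 20% over the basic VSM (Vector Space Model).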

Kernel Regression Estimation for Permutation Fixed Design Additive Models

  • Baek, Jangsun;Wehrly, Thomas E.
    • Journal of the Korean Statistical Society / v.25 no.4 / pp.499-514 / 1996
  • Consider an additive regression model of $Y$ on $X = (X_1, X_2, \ldots, X_p)$, $Y = \sum_{j=1}^{p} f_j(X_j) + \varepsilon$, where the $f_j$ are smooth functions to be estimated and $\varepsilon$ is a random error. If the $X_j$ are fixed design points, we call it the fixed design additive model. Since the response variable $Y$ is observed at fixed $p$-dimensional design points, the behavior of the nonparametric regression estimator depends on the design. We propose a fixed design called permutation fixed design, and fit the regression function by the kernel method. The estimator in the permutation fixed design achieves the univariate optimal rate of convergence in mean squared error for any $p \geq 2$.

Estimating small area proportions with kernel logistic regressions models

  • Shim, Jooyong;Hwang, Changha
    • Journal of the Korean Data and Information Science Society / v.25 no.4 / pp.941-949 / 2014
  • The unit-level logistic regression model with mixed effects has been used for estimating small area proportions; it treats the spatial effects as random effects and assumes linearity between the logistic link and the covariates. However, when the functional form of the relationship between the logistic link and the covariates is not linear, it may lead to biased estimators of the small area proportions. In this paper, we relax the linearity assumption and propose two types of kernel-based logistic regression models for estimating small area proportions. We also demonstrate the efficiency of our proposed models using simulated data and real data.

Radar Pulse Clustering using Kernel Density Window (커널 밀도 윈도우를 이용한 레이더 펄스 클러스터링)

  • Lee, Dong-Weon;Han, Jin-Woo;Lee, Won-Don
    • Proceedings of the IEEK Conference / 2008.06a / pp.973-974 / 2008
  • As radar signal environments become denser and more complex, ES (Electronic warfare Support) systems require high-speed, accurate signal analysis to identify individual radar signals in real time. In this paper, we propose a novel clustering algorithm for radar pulses that alleviates the load on the signal analysis process and supports reliable analysis. The proposed algorithm uses KDE (Kernel Density Estimation) and its CDF (Cumulative Distribution Function) to compose clusters that reflect the distribution characteristics of the pulses. Simulation results show that the proposed algorithm performs well in clustering and classifying the emitters.

DEVELOPMENT OF POINT KERNEL SHIELDING ANALYSIS COMPUTER PROGRAM IMPLEMENTING RECENT NUCLEAR DATA AND GRAPHIC USER INTERFACES

  • Kang, Sang-Ho;Lee, Seung-Gi;Chung, Chan-Young;Lee, Choon-Sik;Lee, Jai-Ki
    • Journal of Radiation Protection and Research / v.26 no.3 / pp.215-224 / 2001
  • In order to comply with revised national regulations on radiological protection and to implement recent nuclear data and dose conversion factors, KOPEC developed a new point kernel gamma and beta ray shielding analysis computer program. This new code, named VisualShield, adopted mass attenuation coefficients and buildup factors from recent ANSI/ANS standards and flux-to-dose conversion factors from the International Commission on Radiological Protection (ICRP) Publication 74 for estimation of the effective/equivalent dose recommended in ICRP 60. VisualShield utilizes graphical user interfaces and 3-D visualization of the geometric configuration for preparing input data sets and analyzing results, which guides users toward error-free processing with visual effects. Code validation and data analysis were performed by comparing the results of various calculations to the data outputs of previous programs such as MCNP 4B, ISOSHLD-II, QAD-CGGP, etc.

Semiparametric and Nonparametric Mixed Effects Models for Small Area Estimation (비모수와 준모수 혼합모형을 이용한 소지역 추정)

  • Jeong, Seok-Oh;Shin, Key-Il
    • The Korean Journal of Applied Statistics / v.26 no.1 / pp.71-79 / 2013
  • Semiparametric and nonparametric small area estimation has been studied as a way to overcome the large variance caused by the small sample size allocated to a small area. In this study, we investigate semiparametric and nonparametric mixed effects small area estimators using penalized spline and kernel smoothing methods, respectively, and compare their performance using labor statistics.

FREQUENCY HISTOGRAM MODEL FOR LINE TRANSECT DATA WITH AND WITHOUT THE SHOULDER CONDITION

  • EIDOUS OMAR
    • Journal of the Korean Statistical Society / v.34 no.1 / pp.49-60 / 2005
  • In this paper we introduce a nonparametric method for estimating the probability density function of detection distances in line transect sampling. The estimator is obtained using a frequency histogram density estimation method. The asymptotic properties of the proposed estimator are derived and compared with those of the kernel estimator under the assumption that the data collected satisfy the shoulder condition. We found that the asymptotic mean square errors (AMSE) of the two estimators have about the same convergence rate. A formula for the optimal histogram bin width, which minimizes the AMSE, is derived. Moreover, the performances of the corresponding k-nearest-neighbor estimators are studied through simulation techniques. For the case where it is not known whether the shoulder condition holds, a new semiparametric model is suggested to fit the line transect data. The performances of the two proposed estimators are studied and compared with some existing nonparametric and semiparametric estimators using simulation techniques. The results demonstrate the superiority of the new estimators in most cases considered.
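
As a rough illustration of a frequency histogram density estimate for detection distances, the sketch below bins distances on $[0, w]$ and returns a piecewise-constant density. The function name, the truncation width `w`, the number of bins, and the sample distances are all hypothetical; the paper's own estimator and its optimal bin width formula are not reproduced here.

```python
def histogram_density(distances, n_bins, w):
    """Piecewise-constant density estimate on [0, w].

    Returns (heights, bin_width); heights[i] is the estimated density
    on [i * bin_width, (i + 1) * bin_width).
    """
    bin_width = w / n_bins
    # Keep only distances inside the truncation window [0, w].
    kept = [d for d in distances if 0.0 <= d <= w]
    if not kept:
        raise ValueError("no distances fall inside [0, w]")
    counts = [0] * n_bins
    for d in kept:
        # A distance exactly equal to w is assigned to the last bin.
        idx = min(int(d / bin_width), n_bins - 1)
        counts[idx] += 1
    n = len(kept)
    heights = [c / (n * bin_width) for c in counts]
    return heights, bin_width


# Hypothetical perpendicular detection distances (illustration only).
distances = [0.5, 1.2, 1.9, 2.4, 3.1, 3.3, 4.8, 6.0, 7.5, 9.9]
heights, h = histogram_density(distances, n_bins=5, w=10.0)
total_mass = sum(height * h for height in heights)  # integrates to 1
```

The height of the first bin serves as a crude estimate of the density at distance zero, and its quality depends strongly on the chosen bin width.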

Lagged Cross-Correlation of Probability Density Functions and Application to Blind Equalization

  • Kim, Namyong;Kwon, Ki-Hyeon;You, Young-Hwan
    • Journal of Communications and Networks / v.14 no.5 / pp.540-545 / 2012
  • In this paper, the lagged cross-correlation of two probability density functions constructed by kernel density estimation is proposed, and adaptive filtering algorithms for supervised and unsupervised training, obtained by maximizing the proposed function, are also introduced. Simulation results for blind equalization in multipath channels with impulsive noise and slowly varying direct current (DC) bias noise show that the Gaussian kernel of the proposed algorithm cuts out the large errors due to impulsive noise, and that the output affected by the DC bias noise can be effectively controlled by the lag $\tau$ intrinsically embedded in the proposed function.
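
One way to picture the quantity described here is the sketch below, which builds two Gaussian kernel density estimates and evaluates their cross-correlation at a lag. It assumes the cross-correlation is defined as $\int \hat f_X(x)\,\hat f_Y(x+\tau)\,dx$, which may differ from the paper's convention; the function names, the bandwidth `sigma`, and the sample values are hypothetical. With Gaussian kernels the integral has a closed form, because the convolution of two Gaussians of width $\sigma$ is a Gaussian of width $\sigma\sqrt{2}$.

```python
import math


def gaussian(u, sigma):
    """Gaussian kernel with standard deviation sigma, evaluated at u."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def kde(x, samples, sigma):
    """Gaussian kernel density estimate at x."""
    return sum(gaussian(x - s, sigma) for s in samples) / len(samples)


def lagged_cross_correlation(xs, ys, tau, sigma):
    """Closed form of the integral of kde_X(x) * kde_Y(x + tau) over x."""
    wide = sigma * math.sqrt(2.0)  # width after convolving two kernels
    total = sum(gaussian(y - x - tau, wide) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


# Hypothetical samples: ys is xs shifted by a constant bias of 0.5.
xs = [0.0, 0.1, -0.1]
ys = [x + 0.5 for x in xs]
at_bias = lagged_cross_correlation(xs, ys, tau=0.5, sigma=0.2)
at_zero = lagged_cross_correlation(xs, ys, tau=0.0, sigma=0.2)
```

In this toy setting the value at `tau=0.5` exceeds the value at `tau=0.0`, since the lag that matches the constant shift aligns the two estimated densities.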

Bezier curve smoothing of cumulative hazard function estimators

  • Cha, Yongseb;Kim, Choongrak
    • Communications for Statistical Applications and Methods / v.23 no.3 / pp.189-201 / 2016
  • In survival analysis, the Nelson-Aalen estimator and the Peterson estimator are often used to estimate a cumulative hazard function from randomly right-censored data. In this paper, we suggest smoothed versions of these cumulative hazard function estimators using a Bezier curve. We compare them with existing estimators, including kernel-smoothed versions of the Nelson-Aalen estimator and the Peterson estimator, in terms of mean integrated square error, and show through numerical studies that the proposed estimators are better than the existing ones. Further, we apply our method to Cox regression, where covariates are used as predictors, and suggest a survival function estimate at a given covariate.
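
For reference, the unsmoothed Nelson-Aalen estimator is a step function that adds $d_i / n_i$ at each event time $t_i$, where $d_i$ is the number of events at $t_i$ and $n_i$ is the number of subjects still at risk just before $t_i$. The sketch below computes only that step function on hypothetical data; the Bezier smoothing proposed in the paper is not implemented, and the function name is illustrative.

```python
def nelson_aalen(times, events):
    """Nelson-Aalen cumulative hazard for right-censored data.

    times[i] is the observed time; events[i] is 1 for an observed event
    and 0 for a censored observation. Returns a list of (t, H(t)) pairs,
    one per distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    cum_hazard = 0.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Process all observations tied at time t together.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            cum_hazard += deaths / n_at_risk
            steps.append((t, cum_hazard))
        n_at_risk -= removed
    return steps


# Hypothetical survival data (illustration only).
times = [2, 3, 3, 5, 7, 8]
events = [1, 1, 0, 1, 0, 1]
hazard_steps = nelson_aalen(times, events)
```

Smoothing methods such as kernel or Bezier smoothing take these jump points as input and replace the staircase with a continuous curve.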

kNNDD-based One-Class Classification by Nonparametric Density Estimation (비모수 추정방법을 활용한 kNNDD의 이상치 탐지 기법)

  • Son, Jung-Hwan;Kim, Seoung-Bum
    • Journal of Korean Institute of Industrial Engineers / v.38 no.3 / pp.191-197 / 2012
  • One-class classification (OCC) is one of the recently growing areas in data mining and pattern recognition. In the present study we examine the k-nearest neighbors data description (kNNDD) algorithm, one of the widely used OCC algorithms. In particular, we propose using nonparametric estimation methods to determine the threshold of the kNNDD algorithm. A simulation study was conducted to explore the characteristics of the proposed approach and to compare it with the existing approach to determining the threshold. The results demonstrate the usefulness and flexibility of the proposed approach.
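
The idea of a data-driven threshold can be sketched with a simplified stand-in: score each point by the distance to its k-th nearest training point, and set the threshold to an empirical quantile of leave-one-out scores on the training data. This is neither the exact kNNDD score nor the paper's threshold procedure; the function names, the quantile level, and the grid data are hypothetical.

```python
import math


def kth_nn_distance(point, data, k):
    """Euclidean distance from point to its k-th nearest neighbour in data."""
    dists = sorted(math.dist(point, q) for q in data)
    return dists[k - 1]


def fit_threshold(train, k, quantile=0.95):
    """Empirical quantile of leave-one-out k-th neighbour distances."""
    scores = []
    for i, p in enumerate(train):
        others = train[:i] + train[i + 1:]
        scores.append(kth_nn_distance(p, others, k))
    scores.sort()
    idx = min(math.ceil(quantile * len(scores)) - 1, len(scores) - 1)
    return scores[max(idx, 0)]


def is_outlier(point, train, k, threshold):
    """Flag a point whose k-th neighbour distance exceeds the threshold."""
    return kth_nn_distance(point, train, k) > threshold


# Hypothetical target class: a 5 x 5 grid with spacing 0.1.
train = [(x * 0.1, y * 0.1) for x in range(5) for y in range(5)]
threshold = fit_threshold(train, k=2)
far_flag = is_outlier((2.0, 2.0), train, 2, threshold)
near_flag = is_outlier((0.25, 0.25), train, 2, threshold)
```

Here the far point is flagged and the point inside the grid is not. Replacing the empirical quantile with a smoothed density estimate of the scores would be one way to make the threshold less sensitive to small training sets.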