• Title/Summary/Keyword: Sparseness

SOME POPULAR WAVELET DISTRIBUTION

  • Nadarajah, Saralees
    • Bulletin of the Korean Mathematical Society / v.44 no.2 / pp.265-270 / 2007
  • The modern approach for wavelets imposes a Bayesian prior model on the wavelet coefficients to capture the sparseness of the wavelet expansion. The idea is to build flexible probability models for the marginal posterior densities of the wavelet coefficients. In this note, we derive exact expressions for a popular model for the marginal posterior density.

Sparse Web Data Analysis Using MCMC Missing Value Imputation and PCA Plot-based SOM (MCMC 결측치 대체와 주성분 산점도 기반의 SOM을 이용한 희소한 웹 데이터 분석)

  • Jun, Sung-Hae;Oh, Kyung-Whan
    • The KIPS Transactions:PartD / v.10D no.2 / pp.277-282 / 2003
  • Knowledge discovery from the web has been studied extensively. Web logs, however, are difficult to use as training data for efficient predictive models. In this paper, we study a method that eliminates sparseness from web log data and then clusters web users. The sparseness of the web data is removed by imputing missing values with MCMC-based Bayesian inference. Web user clustering is then performed with self-organizing maps based on a 3-D plot of principal components. Finally, experiments on KDD Cup data illustrate the problem-solving process and evaluate performance.

Text-Confidence Feature Based Quality Evaluation Model for Knowledge Q&A Documents (텍스트 신뢰도 자질 기반 지식 질의응답 문서 품질 평가 모델)

  • Lee, Jung-Tae;Song, Young-In;Park, So-Young;Rim, Hae-Chang
    • Journal of KIISE:Software and Applications / v.35 no.10 / pp.608-615 / 2008
  • In Knowledge Q&A services, where information is created by unspecified users, document quality is an important factor in user satisfaction with search results. Previous work on quality prediction for Knowledge Q&A documents evaluates document quality using non-textual information, such as click counts and recommendation counts, and focuses on enhancing retrieval performance by incorporating the quality measure into the retrieval model. Although the non-textual information used in previous work has proven useful in experiments, a data sparseness problem may occur when such information is used to predict the quality of newly created documents. To solve the data sparseness problem of non-textual features, this paper proposes new features for document quality prediction, namely text-confidence features, which indicate how trustworthy the content of a document is. The proposed features, extracted directly from the document content, are robust to data sparseness, unlike non-textual features, which can be collected only through the participation of service users. Experiments conducted on real-world Knowledge Q&A documents suggest that text-confidence features perform comparably to the non-textual features. We believe the proposed features can serve as effective features for document quality prediction and improve the performance of Knowledge Q&A services in the future.

BOOTSTRAP TESTS FOR THE EQUALITY OF DISTRIBUTIONS

  • Ping, Jing
    • Journal of applied mathematics & informatics / v.7 no.2 / pp.467-482 / 2000
  • Testing the equality of two and k distributions has long been an interesting issue in statistical inference. To overcome the sparseness of data points in high-dimensional space and deal with the general cases, we suggest several projection pursuit type statistics. Some results on the limiting distributions of the statistics are obtained, and some properties of the bootstrap approximation are investigated. Furthermore, for computational reasons, an approximation for the statistics based on the number-theoretic method is applied. Several simulation experiments are performed.

Dimension-Reduced Model for Word Co-occurrence Probability Estimation (단어 공기 확률 추정을 위한 차원 축소 모델)

  • 김길연;최기선
    • Proceedings of the Korean Society for Cognitive Science Conference / 2000.05a / pp.137-142 / 2000
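  • This paper presents a dimension-reduced model as a new way to address data sparseness, an important problem in probabilistic natural language processing. Three specific methods are proposed. Taking the performance of Katz's back-off method as the lower bound, they improve performance by about 60%. They also improve performance by about 5 to 20% over the similarity-based method, which has shown the best performance to date. The dimension-reduced model can therefore be used as a new method for probability estimation.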

On Linear Discriminant Procedures Based On Projection Pursuit Method

  • Hwang, Chang-Ha;Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society / v.5 no.1 / pp.1-10 / 1994
  • Projection pursuit (PP) is a computer-intensive method that automatically seeks out interesting linear projections of multivariate data onto a lower-dimensional space. By working with lower-dimensional projections, projection pursuit avoids the sparseness of high-dimensional data. We show through simulation that the two projection pursuit discriminant methods proposed by Chen (1989) and Huber (1985) do not improve the error rate much over existing methods, and we compare several classification procedures.

A Note on A Bayesian Approach to the Choice of Wavelet Basis Functions at Each Resolution Level

  • Park, Chun-Gun
    • Journal of the Korean Data and Information Science Society / v.19 no.4 / pp.1465-1476 / 2008
  • In recent years, wavelet methods have focused on block shrinkage or thresholding approaches to account for the sparseness of the wavelet representation of an unknown function. Block shrinkage and thresholding methods have been developed in both classical and Bayesian settings. In this paper, we propose a Bayesian approach to selecting wavelet basis functions at each resolution level without an MCMC procedure. A simulation study and an application are presented.

Wavelet Denoising based on a Bayesian Approach (Bayesian 방법에 의한 잡음감소 방법에 관한 연구)

  • Lee, Moon-Jik;Chung, Chin-Hyun
    • Proceedings of the KIEE Conference / 1999.07g / pp.2956-2958 / 1999
  • The classical solution to the noise removal problem is the Wiener filter, which utilizes the second-order statistics of the Fourier decomposition. We discuss a Bayesian formalism which gives rise to a type of wavelet threshold estimation in non-parametric regression. A prior distribution is imposed on the wavelet coefficients of the unknown response function, designed to capture the sparseness of the wavelet expansion common to most applications. For the prior specified, the posterior median yields a thresholding procedure.
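
To make the idea of wavelet threshold estimation concrete, here is a minimal sketch in plain Python. It is not the procedure from the paper: it uses a one-level Haar transform and ordinary soft thresholding with a hand-picked threshold `t`, whereas the abstract describes a threshold rule that arises from the posterior median under a specific prior. All function names are hypothetical.

```python
import math

def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length sequence."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed samples."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x

def soft_threshold(c, t):
    """Shrink a coefficient toward zero by t; small coefficients become 0."""
    return math.copysign(max(abs(c) - t, 0.0), c)

def denoise(x, t):
    """Threshold only the detail coefficients (assumed to hold mostly noise)."""
    approx, detail = haar_forward(x)
    detail = [soft_threshold(d, t) for d in detail]
    return haar_inverse(approx, detail)
```

Because most detail coefficients of a smooth signal are near zero (the sparseness the prior is meant to capture), thresholding them removes noise while leaving the coarse shape of the signal intact.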

A Feature Generation Method for Multimedia Recommendation System (멀티미디어 추천시스템을 위한 속성 생성 기법)

  • Kim, Hyung-Il;Eom, Jeong-Kook
    • Journal of Korea Multimedia Society / v.11 no.2 / pp.257-268 / 2008
  • Multimedia recommendation systems analyze user preferences and recommend items (multimedia contents) to a user by predicting the user's preference for those items. Among the various recommendation methods, collaborative filtering (CF) has been widely used and successfully applied in practice. However, collaborative filtering has two inherent problems: data sparseness and the cold-start problem. If few preferences are known for a user, it is difficult to find many similar users, and recommendation performance therefore degrades. This problem is more serious when a new user first uses the system. In this paper, we propose a method that generates additional features of users and items for CF to overcome the difficulties caused by sparseness and improve recommendation accuracy. In our method, we first generate additional features using the probability distribution of feature values, then recommend items by applying collaborative filtering to the data modified to include the additional features. Several experimental results showing the effectiveness of the proposed method are also presented.
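
The sparseness and cold-start problems described above can be seen in a minimal sketch of plain user-based collaborative filtering. This illustrates the baseline only, not the feature generation method the paper proposes. The data layout (a dict mapping each user to an item-rating dict) and the function names are assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two users' rating dicts (item -> rating)."""
    common = set(u) & set(v)
    if not common:
        return 0.0  # no co-rated items: similarity cannot be measured
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)

def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no similar user has rated the item, which is what
    happens for sparse data or a brand-new user.
    """
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    return num / den if den else None
```

A user with an empty rating dict has zero similarity to everyone, so `predict` returns `None` for every item. That is the gap additional user and item features are meant to fill.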

Time delay estimation between two receivers using basis pursuit denoising (Basis pursuit denoising을 사용한 두 수신기 간 시간 지연 추정 알고리즘)

  • Lim, Jun-Seok;Cheong, MyoungJun
    • The Journal of the Acoustical Society of Korea / v.36 no.4 / pp.285-291 / 2017
  • Many methods have been studied to estimate the time delay between signals arriving at two receivers. In methods based on channel estimation, the relative delay between the input signals of the two receivers is estimated from the impulse response of the channel between the two signals. This channel is sparse, but most existing methods do not take advantage of its sparseness. In this paper, we propose a time delay estimation method that exploits the channel sparseness by using BPD (Basis Pursuit Denoising) optimization, one of the sparse signal optimization methods. Compared with the existing GCC (Generalized Cross Correlation) method, the adaptive eigen decomposition method, and RZA-LMS (Reweighted Zero-Attracting Least Mean Square), the proposed method mitigates the threshold phenomenon even under a white Gaussian source, a colored signal source, and an oceanic mammal sound source.
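
As a rough illustration of the channel-estimation view, the sketch below models the second receiver's signal as the first signal convolved with a sparse channel `h`, solves the basis pursuit denoising problem min 0.5·||A h − x2||² + λ·||h||₁, and reads the delay off the largest tap. The solver here is ISTA (iterative soft thresholding), chosen for brevity; the abstract does not say which solver the authors use. The names `estimate_delay`, `max_delay`, `lam`, and `iters` are hypothetical.

```python
def soft(v, t):
    """Soft-thresholding operator, the proximal map of t * |v|."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def estimate_delay(x1, x2, max_delay, lam=0.1, iters=300):
    """Estimate an integer delay of x2 relative to x1 via a sparse channel fit."""
    n = len(x1)
    taps = max_delay + 1
    # Column k of A is x1 delayed by k samples (zero-padded at the start).
    A = [[x1[i - k] if i >= k else 0.0 for k in range(taps)] for i in range(n)]
    # Step size 1/L, where L is the squared Frobenius norm of A. This is an
    # upper bound on the largest eigenvalue of A^T A, so the iteration is stable.
    L = sum(a * a for row in A for a in row) or 1.0
    h = [0.0] * taps
    for _ in range(iters):
        resid = [sum(a * hk for a, hk in zip(row, h)) - y
                 for row, y in zip(A, x2)]
        grad = [sum(A[i][k] * resid[i] for i in range(n)) for k in range(taps)]
        h = [soft(hk - g / L, lam / L) for hk, g in zip(h, grad)]
    # The estimated delay is the index of the dominant channel tap.
    delay = max(range(taps), key=lambda k: abs(h[k]))
    return delay, h
```

The L1 penalty drives most taps of `h` to exactly zero, which is how the formulation takes advantage of channel sparseness. The conservative step size keeps the sketch short at the cost of slower convergence.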