• Title/Summary/Keyword: Sampling Set Selection (샘플링 집합 선택)


Sampling Set Selection Algorithm for Weighted Graph Signals (가중치를 갖는 그래프신호를 위한 샘플링 집합 선택 알고리즘)

  • Kim, Yoon Hak
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.1 / pp.153-160 / 2022
  • A greedy algorithm is proposed to select a subset of graph nodes for bandlimited graph signals in which each signal value carries its own weight. Because the signals are weighted, we seek to minimize the weighted reconstruction error, which is formulated using QR factorization, and derive an analytic result for iteratively finding the node that minimizes this error, leading to a simplified iterative selection process. Experiments show that the proposed method achieves a significant performance gain for weighted graph signals on various graphs compared with previous state-of-the-art selection techniques.
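
To make the objective in this abstract concrete, here is a minimal Python sketch that greedily grows a sampling set by directly estimating the weighted reconstruction error with Monte Carlo trials. The brute-force cost evaluation, the random-coefficient signal model, and all function names are illustrative assumptions; the paper's contribution is an analytic QR-based update that avoids exactly this kind of exhaustive evaluation.

```python
# Hypothetical illustration of the objective only: a brute-force greedy pass
# that estimates the weighted reconstruction error by Monte Carlo. The paper
# derives an analytic QR-based update precisely to avoid this costly search.
import numpy as np

def weighted_error(U_K, S, w, trials=200, rng=None):
    """Average weighted MSE when K-bandlimited signals are reconstructed
    from the sample set S by least squares."""
    rng = rng or np.random.default_rng(0)
    N, K = U_K.shape
    err = 0.0
    for _ in range(trials):
        c = rng.standard_normal(K)                 # random spectral coefficients
        x = U_K @ c                                # bandlimited graph signal
        c_hat, *_ = np.linalg.lstsq(U_K[S, :], x[S], rcond=None)
        x_hat = U_K @ c_hat
        err += np.sum(w * (x - x_hat) ** 2)        # per-node weighted error
    return err / trials

def greedy_weighted_sampling(U_K, w, m):
    """Greedily grow a sampling set of size m that keeps the weighted error small."""
    N = U_K.shape[0]
    S = []
    for _ in range(m):
        candidates = [v for v in range(N) if v not in S]
        v_best = min(candidates, key=lambda v: weighted_error(U_K, S + [v], w))
        S.append(v_best)
    return S
```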

Fast Sampling Set Selection Algorithm for Arbitrary Graph Signals (임의의 그래프신호를 위한 고속 샘플링 집합 선택 알고리즘)

  • Kim, Yoon-Hak
    • The Journal of the Korea institute of electronic communication sciences / v.15 no.6 / pp.1023-1030 / 2020
  • We address the sampling set selection problem for arbitrary graph signals, where the original graph signal must be reconstructed from the signal values on the nodes in the sampling set. We introduce the variation difference as a new indirect metric that measures the error in signal variation caused by the sampling process without resorting to eigen-decomposition, which incurs a huge computational cost. Instead of directly minimizing the reconstruction error, we propose a simple and fast greedy selection algorithm that minimizes the variation difference at each iteration, and justify this choice by showing that the underlying principle is similar to that of a previous state-of-the-art technique. Experiments show that the proposed method yields competitive reconstruction performance with substantially reduced complexity on various graphs compared with previous selection methods.
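
The abstract's key point is a greedy pass that never performs an eigen-decomposition. The sketch below only mirrors that overall structure: the per-node score (remaining Laplacian variation of a cheaply smoothed random signal after zeroing the candidate) is a hypothetical stand-in of mine, not the paper's variation-difference metric.

```python
# Sketch of the general shape of an eigen-decomposition-free greedy pass.
# The score below is a placeholder; only the structure (greedy selection
# using the Laplacian directly) is taken from the abstract.
import numpy as np

def greedy_variation_selection(L, m, taps=4, rng=None):
    """L: N x N graph Laplacian (dense ndarray), m: number of samples."""
    rng = rng or np.random.default_rng(1)
    N = L.shape[0]
    # Cheap low-pass proxy signal: a few smoothing steps on noise, so no
    # eigen-decomposition is ever performed. Step size is Gershgorin-safe.
    step = 1.0 / (np.abs(L).sum(axis=1).max() + 1e-12)
    x = rng.standard_normal(N)
    for _ in range(taps):
        x = x - step * (L @ x)
    S, mask = [], np.zeros(N, dtype=bool)
    for _ in range(m):
        scores = np.full(N, np.inf)
        for v in np.flatnonzero(~mask):
            x_s = x.copy()
            x_s[v] = 0.0                       # "sampling" removes this value
            scores[v] = x_s @ (L @ x_s)        # remaining variation energy
        v_best = int(np.argmin(scores))
        mask[v_best] = True
        S.append(v_best)
    return S
```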

Efficient Sampling of Graph Signals with Reduced Complexity (저 복잡도를 갖는 효율적인 그래프 신호의 샘플링 알고리즘)

  • Kim, Yoon Hak
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.2 / pp.367-374 / 2022
  • A sampling set selection algorithm is proposed to reconstruct original graph signals from the sampled signals observed on the nodes in the sampling set. Instead of directly minimizing the reconstruction error, we focus on minimizing an upper bound on the reconstruction error to reduce the algorithm's complexity. The metric is manipulated via QR factorization to produce an upper triangular matrix, and an analytic result is presented that enables greedy selection of the next node at each iteration using the diagonal entries of this matrix, leading to an efficient sampling process with reduced complexity. Experiments on various graphs demonstrate that the proposed algorithm achieves competitive reconstruction performance while running about 3.5 times faster than one of the previous selection methods.
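
One common way to realize "greedy selection from the diagonal entries of an upper triangular matrix" is pivoted-QR-style row selection, sketched below under that assumption: at each step the node whose basis row has the largest component orthogonal to the rows already chosen (i.e. the largest attainable R diagonal entry) is added. Whether this coincides exactly with the paper's upper bound is not confirmed by the abstract.

```python
# Minimal sketch of QR-flavoured greedy row selection for a bandlimited basis.
import numpy as np

def qr_greedy_sampling(U_K, m):
    """U_K: N x K bandlimited basis; returns up to min(m, K) node indices,
    since a K-dimensional basis supports at most K informative steps."""
    N, K = U_K.shape
    residual = U_K.astype(float).copy()        # rows with selected directions removed
    S = []
    for _ in range(min(m, K)):
        norms = np.linalg.norm(residual, axis=1)
        norms[S] = -np.inf                     # never reselect a node
        v = int(np.argmax(norms))              # largest attainable R diagonal entry
        S.append(v)
        q = residual[v] / np.linalg.norm(residual[v])
        residual -= np.outer(residual @ q, q)  # Gram-Schmidt deflation of all rows
    return S
```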

Low-complexity Sampling Set Selection for Bandlimited Graph Signals (대역폭 제한 그래프신호를 위한 저 복잡도 샘플링 집합 선택 알고리즘)

  • Kim, Yoon Hak
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.12 / pp.1682-1687 / 2020
  • We study the problem of sampling a subset of graph nodes for bandlimited graph signals such that the signal values on the sampled nodes provide the most information for reconstructing the original signal. Instead of directly minimizing the reconstruction error, we focus on minimizing an upper bound on the reconstruction error to reduce the complexity of the selection process. We further simplify the upper bound with useful approximations to obtain a lightweight greedy selection process that is iterated to find a suboptimal sampling set. Extensive experiments on various graphs compare the proposed algorithm with other sampling set selection methods and show that it runs fast while preserving competitive reconstruction performance, yielding a practical solution for real-time applications.
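
As a loose illustration of replacing the exact bound with a cheaper surrogate, the sketch below keeps one running score per node and updates it with a single inner product per iteration. The particular surrogate (row energy minus accumulated coherence with already-selected rows) is my own placeholder, not the approximation derived in the paper.

```python
# Generic low-complexity greedy selection with an O(NK) update per step,
# standing in for the paper's approximated upper bound.
import numpy as np

def low_complexity_sampling(U_K, m):
    """U_K: N x K bandlimited basis; returns m selected node indices."""
    N, K = U_K.shape
    energy = np.sum(U_K ** 2, axis=1)          # per-node "information" proxy
    penalty = np.zeros(N)                      # accumulated similarity to selected rows
    S = []
    for _ in range(m):
        score = energy - penalty
        score[S] = -np.inf                     # exclude already-selected nodes
        v = int(np.argmax(score))
        S.append(v)
        # one cheap update per remaining node: coherence with the new row
        penalty += (U_K @ U_K[v]) ** 2 / max(energy[v], 1e-12)
    return S
```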

Accelerating the EM Algorithm through Selective Sampling for Naive Bayes Text Classifier (나이브베이즈 문서분류시스템을 위한 선택적샘플링 기반 EM 가속 알고리즘)

  • Chang Jae-Young;Kim Han-Joon
    • The KIPS Transactions:PartD / v.13D no.3 s.106 / pp.369-376 / 2006
  • This paper presents a new method for significantly improving the conventional Bayesian statistical text classifier by incorporating an accelerated EM (Expectation-Maximization) algorithm. The EM algorithm suffers from slow convergence and performance degradation in its iterative process, especially when real online text documents do not follow its assumptions. In this study, we propose a new accelerated EM algorithm with uncertainty-based selective sampling, which is simple yet converges quickly and allows a more accurate classification model to be estimated for the Naive Bayes text classifier. Experiments using the popular Reuters-21578 document collection show that the proposed algorithm effectively improves classification accuracy.
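
A hedged sketch of the overall loop the abstract describes: semi-supervised multinomial Naive Bayes in which each EM-style round folds in only a subset of unlabeled documents chosen by an uncertainty rule. The hard-label (CEM-style) update and the entropy threshold used for selection are my reading of "uncertainty-based selective sampling", not necessarily the paper's criterion.

```python
# Semi-supervised Naive Bayes with selective use of unlabeled documents.
import numpy as np

def nb_fit(X, y, n_classes, alpha=1.0):
    """X: docs x vocab count matrix, y: labels; returns (log prior, log likelihood)."""
    prior = np.array([(y == c).sum() for c in range(n_classes)]) + alpha
    log_prior = np.log(prior / prior.sum())
    counts = np.vstack([X[y == c].sum(axis=0) for c in range(n_classes)]) + alpha
    log_lik = np.log(counts / counts.sum(axis=1, keepdims=True))
    return log_prior, log_lik

def nb_posterior(X, log_prior, log_lik):
    joint = X @ log_lik.T + log_prior
    joint -= joint.max(axis=1, keepdims=True)   # numerical stabilization
    p = np.exp(joint)
    return p / p.sum(axis=1, keepdims=True)

def selective_em(X_lab, y_lab, X_unl, n_classes, rounds=5, max_entropy=0.3):
    """Each round re-estimates the model using labeled docs plus only those
    unlabeled docs whose current posterior entropy is below max_entropy."""
    log_prior, log_lik = nb_fit(X_lab, y_lab, n_classes)
    for _ in range(rounds):
        post = nb_posterior(X_unl, log_prior, log_lik)
        entropy = -(post * np.log(post + 1e-12)).sum(axis=1)
        keep = entropy < max_entropy            # selective sampling step
        X_aug = np.vstack([X_lab, X_unl[keep]])
        y_aug = np.concatenate([y_lab, post[keep].argmax(axis=1)])
        log_prior, log_lik = nb_fit(X_aug, y_aug, n_classes)
    return log_prior, log_lik
```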

Low-Complexity Graph Sampling Algorithm Based on Thresholding (임계값 적용에 기반한 저 복잡도 그래프 신호 샘플링 알고리즘)

  • Yoon-Hak Kim
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.5 / pp.895-900 / 2023
  • We study low-complexity graph sampling, which selects a subset of graph nodes so that the original signal can be reconstructed from the sampled one. To reduce complexity, we propose a graph sampling algorithm with thresholding that, at each step, selects a node whose cost is lower than a given threshold without fully searching all remaining nodes for the minimum-cost node. Since the threshold should be as close as possible to the minimum cost to avoid degrading the reconstruction performance, we present a mathematical expression for computing the threshold at each step. We investigate the performance of different sampling methods on various graphs and show that the proposed algorithm runs 1.3 times faster than the previous method while maintaining reconstruction performance.
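
The acceleration idea, sketched below in Python, is to accept the first node whose per-step score clears a threshold instead of always scanning all remaining nodes for the optimum. The row-norm "gain" (higher is better, the mirror image of the paper's cost), the Gram-Schmidt bookkeeping, and the threshold rule (a fixed fraction of the previous step's best score) are placeholders; the paper derives an explicit expression for the threshold.

```python
# Greedy selection with an early-exit threshold instead of a full scan.
import numpy as np

def threshold_sampling(U_K, m, ratio=0.95):
    """U_K: N x K bandlimited basis; ratio sets the acceptance threshold
    as a fraction of the previous step's best gain."""
    N, K = U_K.shape
    residual = U_K.astype(float).copy()
    S, prev_best = [], None
    for _ in range(min(m, K)):
        threshold = None if prev_best is None else ratio * prev_best
        best_v, best_gain = None, -np.inf
        for v in range(N):
            if v in S:
                continue
            gain = np.linalg.norm(residual[v])     # hypothetical per-node gain
            if gain > best_gain:
                best_v, best_gain = v, gain
            if threshold is not None and gain >= threshold:
                best_v, best_gain = v, gain
                break                              # early exit: good enough
        S.append(best_v)
        prev_best = best_gain
        q = residual[best_v] / max(np.linalg.norm(residual[best_v]), 1e-12)
        residual -= np.outer(residual @ q, q)      # remove the chosen direction
    return S
```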

Resampling Feedback Documents Using Overlapping Clusters (중첩 클러스터를 이용한 피드백 문서의 재샘플링 기법)

  • Lee, Kyung-Soon
    • The KIPS Transactions:PartB / v.16B no.3 / pp.247-256 / 2009
  • Typical pseudo-relevance feedback methods assume the top-retrieved documents are relevant and use these pseudo-relevant documents for term expansion. The initial retrieval set, however, can contain a great deal of noise. In this paper, we present a cluster-based resampling method for selecting better pseudo-relevant documents based on the relevance model. The main idea is to use document clusters to find dominant documents in the initial retrieval set and to feed those documents repeatedly so as to emphasize the core topics of a query. Experimental results on large-scale web TREC collections show significant improvements over the relevance model. To justify the resampling approach, we examine the relevance density of the feedback documents. The resampling approach shows higher relevance density than the baseline relevance model on all collections, resulting in better retrieval accuracy in pseudo-relevance feedback. This result indicates that the proposed method is effective for pseudo-relevance feedback.
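
A simplified sketch of the resampling idea: build one overlapping cluster around each top-ranked document, score clusters by their retrieval scores, and keep the documents that recur in the best clusters as the dominant feedback documents. The vector representation and scoring details are assumptions for illustration, not the paper's relevance-model formulation.

```python
# Cluster-based resampling of pseudo-relevant feedback documents.
import numpy as np

def resample_feedback(doc_vecs, retrieval_scores, n_clusters=10, nn=4, n_feedback=10):
    """doc_vecs: top-k x dim row-normalized document vectors,
    retrieval_scores: length top-k array of initial retrieval scores."""
    sims = doc_vecs @ doc_vecs.T                      # cosine similarities
    order = np.argsort(-retrieval_scores)[:n_clusters]
    counts = np.zeros(len(doc_vecs))
    for seed in order:
        cluster = np.argsort(-sims[seed])[:nn + 1]    # overlapping: docs can recur
        cluster_score = retrieval_scores[cluster].mean()
        counts[cluster] += cluster_score              # weight by cluster quality
    return np.argsort(-counts)[:n_feedback]           # dominant documents to feed back
```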

Analytical Approach for Scalable Feature Selection (확장 가능한 요소선택방법을 위한 분석적 접근)

  • Yang, Jae-Kyung;Lee, Tae-Han
    • Journal of Korean Society of Industrial and Systems Engineering / v.29 no.2 / pp.75-82 / 2006
  • In this study, we propose an optimization-based feature selection method using the Nested Partition (NP) method, which is rooted in combinatorial optimization theory. The new method employs a heuristic search procedure for finding good feature subsets and uses random sampling of data instances (records) to improve its performance in terms of processing time. To speed up processing, the approach adopts a two-stage sampling scheme that determines a sample size guaranteeing convergence to a near-optimal solution; this provides a theoretical basis for making precise statements about the quality of the final feature-subset solution when the algorithm terminates in finite time. Five data sets of various types were used to illustrate the main results, and results from five repeated experiments show that the new approach is more efficient in processing time than feature selection based on the plain Nested Partition method.
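
The sketch below illustrates only the two-stage sampling idea described above (not the full nested-partition search): candidate feature subsets are screened on a small random sample of instances, and only the survivors are re-scored on a larger sample. The correlation-based scoring function and all parameter names are illustrative choices of mine.

```python
# Two-stage instance sampling for evaluating candidate feature subsets.
import numpy as np

def subset_score(X, y, features, rows):
    """Mean |correlation| between the sampled rows of each feature and the label."""
    Xs, ys = X[np.ix_(rows, features)], y[rows].astype(float)
    ys = (ys - ys.mean()) / (ys.std() + 1e-12)
    Xs = (Xs - Xs.mean(axis=0)) / (Xs.std(axis=0) + 1e-12)
    return np.abs(Xs.T @ ys / len(rows)).mean()

def two_stage_selection(X, y, candidates, n_small=100, n_large=1000, keep=3, rng=None):
    """candidates: list of feature-index lists; returns the best candidate subset."""
    rng = rng or np.random.default_rng(2)
    n = X.shape[0]
    small = rng.choice(n, size=min(n_small, n), replace=False)
    # stage 1: cheap screening on the small sample
    screened = sorted(candidates, key=lambda f: -subset_score(X, y, f, small))[:keep]
    large = rng.choice(n, size=min(n_large, n), replace=False)
    # stage 2: re-score only the survivors on the larger sample
    return max(screened, key=lambda f: subset_score(X, y, f, large))
```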

Comparison of Bacterial Counts from Slaughtered Chicken Carcasses by Sampling Method (도계육의 시료 채취방법에 따른 세균의 검출량 비교)

  • Yun, Hye-Suk
    • Bulletin of Food Technology / v.18 no.1 / pp.127-135 / 2005
  • At a commercial processing plant, 25 groups of uneviscerated or eviscerated carcasses of broiler chickens less than 10 weeks old were collected at four points along the slaughter line. From the carcasses at each point, the samples were skin excised from a randomly selected site, excised neck skin, a swab of the carcass taken with a cellulose-acetate sponge swab, or the rinse fluid from washing the whole carcass. Total viable bacteria, coliforms, and E. coli were enumerated from each sample. The mean log count per 1 cm² (log A) and the log of the total count measured over 25 cm² (N) were obtained for the 25 carcasses in each group. For bacterial counts from carcasses taken at the same processing step, the three values of N, or of log A, obtained from skin excised at a random site, neck skin, and rinse samples differed by less than 0.5 log units. However, roughly half of the log A and N values obtained by the swab method differed from the corresponding values of the other methods by more than 0.5 log units. These results indicate that sampling by excising skin from a random site or from the neck, or by collecting a whole-carcass rinse, recovers similar numbers of bacteria per unit area of carcass surface.


An Active Learning-based Method for Composing Training Document Set in Bayesian Text Classification Systems (베이지언 문서분류시스템을 위한 능동적 학습 기반의 학습문서집합 구성방법)

  • 김제욱;김한준;이상구
    • Journal of KIISE:Software and Applications / v.29 no.12 / pp.966-978 / 2002
  • There are two important problems in improving text classification systems based on the machine learning approach. The first, called the "selection problem", is how to select a minimum number of informative documents from a given document collection. The second, called the "composition problem", is how to reorganize the selected training documents so that they fit the adopted learning method. The former is addressed by "active learning" algorithms and the latter by "boosting" algorithms. This paper proposes a new learning method, called AdaBUS, which proactively solves both problems in the context of Naive Bayes classification systems. The proposed method constructs a more accurate classification hypothesis by increasing the variance of the "weak" hypotheses that determine the final classification hypothesis; the resulting perturbation effect makes the boosting algorithm work properly. Through empirical experiments using the Reuters-21578 document collection, we show that the AdaBUS algorithm improves the Naive Bayes-based classification system more significantly than other conventional learning methods.
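
As a rough illustration of the two ingredients the abstract combines, the sketch below runs a generic loop with scikit-learn's MultinomialNB: boosting-style up-weighting of misclassified training documents plus active querying of the least confident unlabeled documents. The `oracle` callback and every numeric choice are hypothetical; this is a generic stand-in, not the actual AdaBUS procedure.

```python
# Generic active-learning + boosting loop around a Naive Bayes text classifier.
import numpy as np
from sklearn.naive_bayes import MultinomialNB

def adabus_like(X_lab, y_lab, X_unl, oracle, rounds=5, batch=10):
    """X_lab/X_unl: dense docs x vocab count matrices; oracle(original_indices)
    returns true labels and stands in for a human annotator."""
    w = np.ones(len(y_lab), dtype=float)
    unl_idx = np.arange(len(X_unl))            # original positions in X_unl
    pool = X_unl
    clf = MultinomialNB().fit(X_lab, y_lab, sample_weight=w)
    for _ in range(rounds):
        # boosting-style update: up-weight training docs the model gets wrong
        wrong = clf.predict(X_lab) != y_lab
        w[wrong] *= 2.0
        if len(pool) == 0:
            break
        # active learning: query the least confident unlabeled documents
        conf = clf.predict_proba(pool).max(axis=1)
        ask = np.argsort(conf)[:batch]
        X_lab = np.vstack([X_lab, pool[ask]])
        y_lab = np.concatenate([y_lab, oracle(unl_idx[ask])])
        w = np.concatenate([w, np.ones(len(ask))])
        pool = np.delete(pool, ask, axis=0)
        unl_idx = np.delete(unl_idx, ask)
        clf = MultinomialNB().fit(X_lab, y_lab, sample_weight=w)
    return clf
```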