• Title/Summary/Keyword: sparse data (희소데이터)

Dual Dictionary Learning for Cell Segmentation in Bright-field Microscopy Images (명시야 현미경 영상에서의 세포 분할을 위한 이중 사전 학습 기법)

  • Lee, Gyuhyun;Quan, Tran Minh;Jeong, Won-Ki
    • Journal of the Korea Computer Graphics Society / v.22 no.3 / pp.21-29 / 2016
  • Cell segmentation is an important but time-consuming and laborious task in biological image analysis. An automated, robust, and fast method is required to overcome this burden. Meeting these needs is challenging, however, because cells vary in shape and intensity and often have incomplete boundaries. Precise cell segmentation enables pathological diagnosis of tissue samples. A vast body of literature exists on cell segmentation in microscopy images [1]. The majority of existing work is based on input images and predefined feature models only - for example, using a deformable model to extract edge boundaries in the image. Only a handful of recent methods employ data-driven approaches, such as supervised learning. In this paper, we propose a novel data-driven cell segmentation algorithm for bright-field microscopy images. The proposed method minimizes an energy function defined by two dictionaries - one for input images and the other for their manual segmentation results - and a common sparse code, and it obtains a pixel-level classification by deploying the learned dictionaries on new images. In contrast to deformable models, our method requires no prior knowledge of the objects. We also employ convolutional sparse coding and the Alternating Direction Method of Multipliers (ADMM) for fast dictionary learning and energy minimization. Unlike an existing method [1], our method trains both dictionaries concurrently and is implemented on the GPU for faster performance.

Topographic Non-negative Matrix Factorization for Topic Visualization from Text Documents (Topographic non-negative matrix factorization에 기반한 텍스트 문서로부터의 토픽 가시화)

  • Chang, Jeong-Ho;Eom, Jae-Hong;Zhang, Byoung-Tak
    • Proceedings of the Korean Information Science Society Conference / 2006.10b / pp.324-329 / 2006
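  • Non-negative matrix factorization (NMF) is a data analysis technique that decomposes data consisting of non-negative values into the product of two non-negative matrices, and it has been applied to text mining, bioinformatics, and multimedia data analysis. In this study, we propose Topographic NMF (TNMF), which builds on basic NMF to extract topics from text documents and simultaneously visualize them. Topic visualization by TNMF can help one grasp the data more intuitively from a global perspective. From a generative-model viewpoint, TNMF can be expressed as a hierarchical model with two hidden layers, in which the connections from the upper hidden layer to the lower hidden layer define the transition probabilities, or a neighborhood function, between topics in the topic space. Learning in TNMF proceeds through iterative parameter updates under a continuous scheduling of the transition probability values, and we show that the parameter updates can be derived in a form similar to the basic NMF learning procedure. In addition, we analyze and present the connections to a topic visualization technique based on Probabilistic LSA and to non-smooth NMF, which aims at sparse solutions. Experiments on NIPS conference paper data show that the proposed method can effectively visualize the topics latent in the documents.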
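The basic NMF step that this abstract builds on can be sketched as follows. This is a minimal pure-Python illustration of standard NMF using the commonly used multiplicative updates for squared error, not the TNMF method itself; the function names, the small `eps` guard, and the toy document-term matrix are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initialization keeps every update well defined.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical toy matrix: rows are documents, columns are term counts.
V = [[3.0, 2.0, 0.0, 0.0],
     [2.0, 3.0, 0.0, 0.0],
     [0.0, 0.0, 4.0, 1.0],
     [0.0, 0.0, 1.0, 4.0]]
W, H = nmf(V, rank=2)
print(frobenius_error(V, W, H))
```

Because every update multiplies a non-negative entry by a non-negative ratio, W and H stay non-negative throughout, which is what lets the rows of H be read as topics over terms.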

Compare to Factorization Machines Learning and High-order Factorization Machines Learning for Recommend system (추천시스템에 활용되는 Matrix Factorization 중 FM과 HOFM의 비교)

  • Cho, Seong-Eun
    • Journal of Digital Contents Society / v.19 no.4 / pp.731-737 / 2018
  • Recommendation systems are actively researched for suggesting information that users may be interested in, in fields such as content, online commerce, social networks, and advertising. However, many recommendation systems make suggestions based on past preference data, so it is difficult to serve users who have little or no such data. Therefore, interest in higher-order data analysis is increasing and Matrix Factorization is attracting attention. In this paper, we compare and reproduce the Factorization Machines Learning (FM) model, which is attracting attention in recommendation systems, and High-Order Factorization Machines Learning (HOFM), which is a high-dimensional data analysis method.
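For readers unfamiliar with the FM model compared here, the standard second-order FM prediction can be sketched as below. This is a generic illustration of the well-known FM equation, not code from the paper; the function names and toy parameters are hypothetical. HOFM generalizes the pairwise term to interactions of order three and above.

```python
def fm_predict(x, w0, w, v):
    """Second-order FM prediction in O(k * n) time.

    x:  feature vector of length n
    w0: global bias
    w:  linear weights of length n
    v:  n x k matrix holding one k-dimensional latent factor per feature
    """
    n, k = len(x), len(v[0])
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        # Identity: sum_{i<j} v_if v_jf x_i x_j
        #         = 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_predict_naive(x, w0, w, v):
    """Same model written as the explicit O(k * n^2) sum over pairs i < j."""
    n = len(x)
    out = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(v[i], v[j]))
            out += dot * x[i] * x[j]
    return out


# Hypothetical toy parameters: 3 features, 2 latent dimensions.
x = [1.0, 0.0, 2.0]
w0, w = 0.5, [0.1, -0.2, 0.3]
v = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
print(fm_predict(x, w0, w, v), fm_predict_naive(x, w0, w, v))
```

Because each pairwise weight is a dot product of latent factors rather than a free parameter, the model can estimate interactions between features that never co-occur in the training data, which is why FM is attractive for sparse recommendation data.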

Sparse Web Data Analysis Using MCMC Missing Value Imputation and PCA Plot-based SOM (MCMC 결측치 대체와 주성분 산점도 기반의 SOM을 이용한 희소한 웹 데이터 분석)

  • Jun, Sung-Hae;Oh, Kyung-Whan
    • The KIPS Transactions:PartD / v.10D no.2 / pp.277-282 / 2003
  • Knowledge discovery from the web has been the subject of many studies. Web log data are difficult to use as training data for efficient predictive models. In this paper, we study a method to eliminate sparseness from web log data and to perform web user clustering. The sparseness of the web data is removed by missing value imputation using Bayesian inference with MCMC. Web user clustering is then performed using self-organizing maps based on a 3-D plot of principal components. Finally, experiments on KDD Cup data illustrate the problem-solving process and evaluate performance.

Image Denoising Using Nonlocal Similarity and 3D Filtering (비지역적 유사성 및 3차원 필터링 기반 영상 잡음제거)

  • Kim, Seehyun
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.10 / pp.1886-1891 / 2017
  • Denoising, one of the major research topics in image processing, deals with restoring noisy images. Natural images are well known for their nonlocal as well as local similarity. Patterns of unique edges and texture, which are crucial for understanding the image, are repeated over the nonlocal region. In this paper, a nonlocal similarity based denoising algorithm is proposed. First, for every block of the noisy image, nonlocal similar blocks are gathered to construct an overcomplete data set, which is inherently sparse in the transform domain due to the characteristics of the images. Then, the sparse transform coefficients are filtered to suppress the non-sparse additive noise. Finally, the image is recovered by aggregating the overcomplete estimates of each pixel. Experiments with several images show that the proposed algorithm outperforms conventional methods in removing additive Gaussian noise while preserving image details.

Method to Improve Data Sparsity Problem of Collaborative Filtering Using Latent Attribute Preference (잠재적 속성 선호도를 이용한 협업 필터링의 데이터 희소성 문제 개선 방법)

  • Kwon, Hyeong-Joon;Hong, Kwang-Seok
    • Journal of Internet Computing and Services / v.14 no.5 / pp.59-67 / 2013
  • In this paper, we propose LAR_CF, latent attribute rating-based collaborative filtering, which is robust to the data sparsity problem, one of the traditional causes of decreased rating prediction accuracy. Whereas existing collaborative filtering methods use the preference ratings given by users as a feature vector to calculate the similarity between objects, the proposed method mitigates the data sparsity problem by using the unique attributes of the two target objects together with existing explicit preferences. We use the MovieLens 100k dataset and its item attributes to evaluate LAR_CF. Through artificial data sparsity and full-rating experiments, we confirmed that LAR_CF can improve rating prediction accuracy under data sparsity conditions.
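The sparsity problem described above can be illustrated with a small sketch: a rating-based similarity is undefined when no user has rated both items, while an attribute-based similarity is always available. This is not the LAR_CF algorithm; the function names, the choice of cosine and Jaccard similarity, and the toy data are assumptions made purely for illustration.

```python
import math


def rating_similarity(ratings_a, ratings_b):
    """Cosine similarity between two items over the users who rated both.

    ratings_a, ratings_b: dicts mapping user id -> rating.
    Returns None when the similarity cannot be computed, which is the
    typical failure mode under data sparsity.
    """
    common = ratings_a.keys() & ratings_b.keys()
    if not common:
        return None
    dot = sum(ratings_a[u] * ratings_b[u] for u in common)
    norm_a = math.sqrt(sum(ratings_a[u] ** 2 for u in common))
    norm_b = math.sqrt(sum(ratings_b[u] ** 2 for u in common))
    if norm_a == 0 or norm_b == 0:
        return None
    return dot / (norm_a * norm_b)


def attribute_similarity(attrs_a, attrs_b):
    """Jaccard similarity between two items' attribute sets (e.g. genres)."""
    union = attrs_a | attrs_b
    return len(attrs_a & attrs_b) / len(union) if union else 0.0


# Hypothetical toy data: the two items share no raters but share a genre.
item_a = {"u1": 5, "u2": 3}
item_b = {"u3": 4}
print(rating_similarity(item_a, item_b))
print(attribute_similarity({"Action", "Sci-Fi"}, {"Action", "Drama"}))
```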

Improving prediction performance of network traffic using dense sampling technique (밀집 샘플링 기법을 이용한 네트워크 트래픽 예측 성능 향상)

  • Jin-Seon Lee;Il-Seok Oh
    • Smart Media Journal / v.13 no.6 / pp.24-34 / 2024
  • Predicting the future from network traffic data, which form a time series, can bring benefits such as efficient resource allocation, prevention of malicious attacks, and energy saving. Many models based on statistical and deep learning techniques have been proposed, and most of these studies have focused on improving model structures and learning algorithms. Another approach to improving the prediction performance of a model is to obtain good-quality data. To that end, this paper applies a dense sampling technique that augments time series data to network traffic prediction and analyzes the resulting performance improvement. The dataset is UNSW-NB15, which is widely used for network traffic analysis. Performance is analyzed using RMSE, MAE, and MAPE. To increase the objectivity of the performance measurement, each experiment is run independently 10 times, and the performance of the existing sparse sampling and of dense sampling is compared using box plots. Across different window sizes and horizon factors, dense sampling consistently showed better performance.
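The difference between sparse and dense sampling, and the three error metrics named above, can be sketched as follows. The abstract does not define the sampling schemes precisely, so this sketch assumes that dense sampling slides the window one step at a time while sparse sampling uses non-overlapping windows; `make_windows` and its `stride` and `horizon` parameters are hypothetical names, and `horizon` here is a plain step count rather than the paper's horizon factor. The RMSE, MAE, and MAPE definitions are the standard ones.

```python
import math


def make_windows(series, window, horizon, stride):
    """Cut (input, target) pairs from a series.

    Each input holds `window` consecutive points; the target is the value
    `horizon` steps after the end of the window. A stride of 1 gives dense
    sampling; a stride equal to `window` gives non-overlapping inputs.
    """
    samples = []
    last_start = len(series) - window - horizon
    for start in range(0, last_start + 1, stride):
        x = series[start:start + window]
        y = series[start + window + horizon - 1]
        samples.append((x, y))
    return samples


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes no zero targets."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy series of 20 points.
series = list(range(1, 21))
dense = make_windows(series, window=4, horizon=1, stride=1)
sparse = make_windows(series, window=4, horizon=1, stride=4)
print(len(dense), len(sparse))
```

Under these assumptions the dense scheme yields several times more training pairs from the same series, which is the sense in which it acts as data augmentation.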

Modified Transformation and Evaluation for High Concentration Ozone Predictions (고농도 오존 예측을 위한 향상된 변환 기법과 예측 성능 평가)

  • Cheon, Seong-Pyo;Kim, Sung-Shin;Lee, Chong-Bum
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.4 / pp.435-442 / 2007
  • To reduce damage from high concentration ozone in the air, we have researched how to predict high concentration ozone before it occurs. High concentration ozone is a rare event, and its reaction mechanism is nonlinear and complex. In this paper, we applied and considered as many methods as we could. We clustered the data using the fuzzy c-means method and applied rejection sampling to fill in missing and abnormal data. Next, correlations between the input components and the output ozone concentration were calculated, and the more strongly correlated components were transformed by a modified log transformation. Then, we built prediction models using Dynamic Polynomial Neural Networks. To select the optimal model, we adopted a minimum bias criterion. Finally, to evaluate the suggested models, we compared two models: one trained and tested on the transformed data and the other on the untransformed data. We concluded that the modified transformation yielded good to ideal performance in some evaluations, in particular when the data were related to seasonal characteristics or their variation trends.

Improving the MAE by Removing Lower Rated Items in Recommender System

  • Kim, Sun-Ok;Lee, Seok-Jun;Park, Young-Seo
    • Journal of the Korean Data and Information Science Society / v.19 no.3 / pp.819-830 / 2008
  • Web recommender systems were proposed to solve the problem caused by information overload. Collaborative filtering is a technique that predicts and recommends suitable goods to a user by collecting preference information based on the history of what the user was interested in. However, recommendation is difficult for less popular goods because little information about them is available. In this paper, we study a way to select goods and preferences according to their sparsity in order to solve the recommender system's sparsity problem, which arises from this lack of information, and we describe a solution that improves the quality of the recommender system by selecting the customers who were interested in those goods.

BICF : Collaborative Filtering Based on Online Behavior Information (온라인 행동정보를 이용한 협업 필터링)

  • Kwak, Jee-yoon;Kim, Ga-yeong;Hong, Da-young;Kim, Hyon Hee
    • Proceedings of the Korea Information Processing Society Conference / 2020.05a / pp.401-404 / 2020
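  • Collaborative filtering as currently used in e-commerce builds recommender systems from rating information entered by customers. However, because customers must enter ratings themselves, this information suffers from data sparsity, and false information cannot be screened out. To address these problems of rating-based collaborative filtering recommender systems, this paper proposes a collaborative filtering algorithm that uses online customer behavior information. Experimental results show that the proposed Collaborative Filtering based on Online Behavior Information (BICF) algorithm outperforms the conventional rating-based collaborative filtering approach.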