• Title/Summary/Keyword: Dirichlet process


A Proofreader Matching Method Based on Topic Modeling Using the Importance of Documents (문서 중요도를 고려한 토픽 기반의 논문 교정자 매칭 방법론)

  • Son, Yeonbin;An, Hyeontae;Choi, Yerim
    • Journal of Internet Computing and Services / v.19 no.4 / pp.27-33 / 2018
  • When submitting a manuscript to a journal to present research results, researchers often have the manuscript proofread so that it communicates the results more effectively. Currently, most manuscript proofreading companies assign proofreaders manually, based on the subjective judgment of a matching manager. In this paper, we therefore propose a topic-based proofreader matching method for more effective proofreading. The proposed method consists of two steps. First, topic modeling is performed using Latent Dirichlet Allocation. In this step, the frequency of each document making up a user's representative document is determined according to the importance of the document. Second, user similarity is calculated using cosine similarity. We validated the method through experiments on a real-world dataset: the proposed method outperformed the comparison method, and the validity of the matching results was verified through qualitative evaluation.
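
The second step of the abstract above (cosine similarity between users) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each user is already represented by a vector of topic proportions from a topic model, and the names `best_match`, `author`, and `proofreaders` and all numbers are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

def best_match(author_topics, proofreader_topics):
    """Return (name, score) of the proofreader most similar to the author."""
    scored = [(name, cosine_similarity(author_topics, vec))
              for name, vec in proofreader_topics.items()]
    return max(scored, key=lambda pair: pair[1])

# Hypothetical topic proportions over three topics (made-up numbers).
author = [0.7, 0.2, 0.1]
proofreaders = {
    "proofreader_a": [0.6, 0.3, 0.1],
    "proofreader_b": [0.1, 0.1, 0.8],
}
name, score = best_match(author, proofreaders)
print(name, round(score, 3))
```

Cosine similarity depends only on the direction of the vectors, so it compares the mix of topics rather than how much text each user has written.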

A Study on Automatic Analysis System of National Defense Articles (국방 기사 자동 분석 시스템 구축 방안 연구)

  • Kim, Hyunjung;Kim, Wooju
    • Journal of the Korea Institute of Military Science and Technology / v.21 no.1 / pp.86-93 / 2018
  • Media articles have a great influence on public opinion, but because they reach the public through many different media, analyzing them manually is very difficult. Methods for collecting, processing, and analyzing documents are widely discussed in academia, but mostly in areas related to politics and stocks; national defense articles remain poorly researched. In this study, we explain how to build an automatic analysis system for national defense articles that collects information on defense articles automatically and processes it quickly using topic modeling with LDA, sentiment analysis, and extraction-based text summarization.

Smartphone-User Interactive based Self Developing Place-Time-Activity Coupled Prediction Method for Daily Routine Planning System (일상생활 계획을 위한 스마트폰-사용자 상호작용 기반 지속 발전 가능한 사용자 맞춤 위치-시간-행동 추론 방법)

  • Lee, Beom-Jin;Kim, Jiseob;Ryu, Je-Hwan;Heo, Min-Oh;Kim, Joo-Seuk;Zhang, Byoung-Tak
    • KIISE Transactions on Computing Practices / v.21 no.2 / pp.154-159 / 2015
  • Over the past few years, user needs in the smartphone application market have shifted from diversity toward intelligence. Here, we propose a novel cognitive agent that plans users' daily routines using lifelog data collected by their smartphones. The proposed method first employs a DPGMM (Dirichlet Process Gaussian Mixture Model) to automatically extract each user's POIs (Points of Interest) from the lifelog data. The POIs and other meaningful features, such as GPS data and the user's activity labels extracted from the log data, are then used to learn the patterns of the user's daily routine with a POMDP (Partially Observable Markov Decision Process). To determine the significant patterns among the user's time-dependent patterns, the system was integrated with the SNS application Foursquare to record the locations the user visited and the activities the user performed. The method was evaluated by predicting the daily routines of seven users with 3,300 feedback data points. Experimental results showed that daily routine scheduling can be established once seven days of lifelog and feedback data have been collected, demonstrating the potential of place-time-activity coupled daily routine planning systems in the intelligent application market.
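
A Dirichlet process mixture does not fix the number of clusters in advance, which is presumably why it suits extracting an unknown number of POIs. The sketch below is not the DPGMM from the abstract: it only samples a partition from the Chinese restaurant process, the clustering prior behind Dirichlet process mixtures, and leaves out the Gaussian likelihood entirely. The function name `sample_crp` and the parameter values are assumptions for illustration.

```python
import random

def sample_crp(n_points, alpha, seed=0):
    """Sample cluster assignments from a Chinese restaurant process.

    Each point joins existing cluster k with probability proportional to
    that cluster's current size, or opens a new cluster with probability
    proportional to the concentration parameter alpha (> 0).
    """
    rng = random.Random(seed)
    assignments = []
    cluster_sizes = []
    for _ in range(n_points):
        weights = cluster_sizes + [alpha]  # last slot = "new cluster"
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(cluster_sizes):
            cluster_sizes.append(1)   # open a new cluster
        else:
            cluster_sizes[k] += 1     # join an existing cluster
        assignments.append(k)
    return assignments, cluster_sizes

assignments, sizes = sample_crp(n_points=20, alpha=1.0, seed=42)
print("clusters used:", len(sizes), "sizes:", sizes)
```

A larger `alpha` tends to produce more clusters; in a full mixture model the data likelihood would also influence each assignment.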

Establishment of ITS Policy Issues Investigation Method in the Road Section applied Textmining (텍스트마이닝을 활용한 도로분야 ITS 정책이슈 탐색기법 정립)

  • Oh, Chang-Seok;Lee, Yong-taeck;Ko, Minsu
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.15 no.6 / pp.10-23 / 2016
  • Given the need for careful use of big data, this study develops and applies a method for searching for audit issues related to ITS policies or programs. To this end, the auditing process of the Board of Audit and Inspection was combined with the theoretical framework of boundary analysis proposed by William Dunn as an analysis tool for audit issues. Moreover, we apply text mining to computerize the analysis tool, since text mining resembles boundary analysis in how it approaches meta-problems. For the text mining analysis, we applied an antisymmetry-symmetry compound lexeme-based LDA model, which builds on the Latent Dirichlet Allocation (LDA) methodology proposed by David Blei. A case analysis identified several main issues: insufficient collection of traffic information by the Urban Traffic Information System operated by the National Police Agency; overlap problems between the Ministry of Land, Infrastructure and Transport and the Advanced Traffic Management System; and fabrication of mileage on digital tachographs.

Unsupervised Motion Learning for Abnormal Behavior Detection in Visual Surveillance (영상감시시스템에서 움직임의 비교사학습을 통한 비정상행동탐지)

  • Jeong, Ha-Wook;Chang, Hyung-Jin;Choi, Jin-Young
    • Journal of the Institute of Electronics Engineers of Korea SC / v.48 no.5 / pp.45-51 / 2011
  • In this paper, we propose an unsupervised learning method for effectively modeling motion trajectory patterns. In our approach, observations of an object along a trajectory are treated as words in a document for the latent Dirichlet allocation algorithm, which is used in natural language processing to cluster words by topic. This allows topics (e.g., go straight, turn left, turn right) to be clustered effectively in complex scenes such as crossroads. We then learn the patterns of word sequences in each cluster using the Baum-Welch algorithm, which finds the unknown parameters of a hidden Markov model. Abnormality can be evaluated with the forward algorithm by comparing an input sequence against the learned sequences. Experimental results show that the modeling of semantic regions is robust against noise in various scenes.
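
The abnormality evaluation mentioned above relies on the forward algorithm, which computes how likely an observation sequence is under a hidden Markov model. The following is a minimal sketch of a scaled forward pass for a discrete HMM. It assumes the model parameters are already known (in the abstract they would come from Baum-Welch training), and the two-state parameters `START`, `TRANS`, `EMIT` and both sequences are made-up values.

```python
import math

def forward_log_likelihood(obs, start_p, trans_p, emit_p):
    """Log P(obs) under a discrete HMM, via the scaled forward algorithm."""
    n_states = len(start_p)
    # Initialisation: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [start_p[i] * emit_p[i][obs[0]] for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Recursion: alpha_t(j) = (sum_i alpha_{t-1}(i) * a_ij) * b_j(o_t)
            alpha = [
                sum(alpha[i] * trans_p[i][j] for i in range(n_states))
                * emit_p[j][obs[t]]
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under the model
        log_lik += math.log(scale)          # accumulate log of scaling factors
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik

# Hypothetical two-state, two-symbol model (made-up numbers).
START = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.1, 0.9]]
EMIT = [[0.9, 0.1], [0.1, 0.9]]

typical = [0, 0, 0, 0]
unusual = [0, 1, 0, 1]
ll_typical = forward_log_likelihood(typical, START, TRANS, EMIT)
ll_unusual = forward_log_likelihood(unusual, START, TRANS, EMIT)
print(ll_typical > ll_unusual)
```

One plausible way to flag abnormality is to compare a sequence's log-likelihood (perhaps normalised by its length) against a threshold; the abstract does not say which criterion the authors use.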

Detection of Abnormal Behavior by Scene Analysis in Surveillance Video (감시 영상에서의 장면 분석을 통한 이상행위 검출)

  • Bae, Gun-Tae;Uh, Young-Jung;Kwak, Soo-Yeong;Byun, Hye-Ran
    • The Journal of Korean Institute of Communications and Information Sciences / v.36 no.12C / pp.744-752 / 2011
  • Various methods for detecting abnormal behavior in intelligent surveillance systems have been proposed recently. However, most of them assume that individual objects can be tracked, so they are not robust enough for real-world settings, where occlusions are common. This paper presents a novel method that detects abnormal behavior by analyzing the major motion of the scene in complex environments where object tracking does not work. First, we generate visual words and visual documents from motion information extracted from the input video and process them with the LDA (Latent Dirichlet Allocation) algorithm, a document analysis technique, to obtain the major motion information of the scene (location, magnitude, direction, distribution). Using this information, we compare the motion appearing in the input video with the analyzed major motion and detect motions that do not match the major motions as abnormal behavior.

On Efficient Algorithms for Generating Fundamental Units and their H/W Implementations over Number Fields (효율적인 수체의 기본단수계 생성 알고리즘과 H/W 구현에 관한 연구)

  • Kim, Yong-Tae
    • The Journal of the Korea institute of electronic communication sciences / v.12 no.6 / pp.1181-1188 / 2017
  • The units and fundamental units of number fields are important for number field sieves that test the primality of integers of more than 400 digits, for the number field sieve that factors the number in the RSA cryptosystem, and for multiplying ideals and counting the class number of the number field in imaginary quadratic cryptosystems. To minimize the time and space required for H/W implementations of cryptosystems that use fundamental units, in this paper we introduce Dirichlet's unit theorem and propose our process for generating the fundamental units of a number field. We then present an algorithm that generates these fundamental units so as to minimize time and space in H/W implementation, along with implementation results obtained using the algorithm over the number field.

Performance Improvement of Topic Modeling using BART based Document Summarization (BART 기반 문서 요약을 통한 토픽 모델링 성능 향상)

  • Eun Su Kim;Hyun Yoo;Kyungyong Chung
    • Journal of Internet Computing and Services / v.25 no.3 / pp.27-33 / 2024
  • The academic research environment is continuously changing as the volume of information grows, which raises the need for an effective way to analyze and organize large numbers of documents. In this paper, we propose a method for improving topic modeling performance using document summarization based on BART (Bidirectional and Auto-Regressive Transformers). The proposed method uses a BART-based document summarization model to extract the core content and then applies the LDA (Latent Dirichlet Allocation) algorithm, improving topic modeling performance. We suggest an approach that improves the performance and efficiency of LDA topic modeling through document summarization and validate it through experiments. The experimental results show that the BART-based model for summarizing article data captures the important information of the original articles, with F1-scores of 0.5819, 0.4384, and 0.5038 on ROUGE-1, ROUGE-2, and ROUGE-L, respectively. In addition, topic modeling on the summarized documents performs about 8.08% better than topic modeling on the full text when compared using the perplexity metric. This contributes to reducing data throughput and improving efficiency in the topic modeling process.
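
For readers unfamiliar with the perplexity metric cited above, the sketch below shows the usual definition (the exponential of the negative mean log-probability per token; lower is better) and one way a percentage improvement could be computed. The abstract does not state how the 8.08% figure was derived, so treating it as a relative reduction in perplexity is an assumption, and all numbers here are made up.

```python
import math

def perplexity(token_log_probs):
    """exp of the negative mean natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def relative_improvement(baseline, improved):
    """Fractional reduction in perplexity relative to the baseline."""
    return (baseline - improved) / baseline

# A model that assigns probability 1/4 to every token has perplexity 4.
uniform_log_probs = [math.log(0.25)] * 10
print(perplexity(uniform_log_probs))

# Hypothetical perplexities for full-text vs. summarized input.
print(relative_improvement(baseline=100.0, improved=92.0))
```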

Indian Buffet Process Inspired Component Analysis for fMRI Data (fMRI 데이터에 적용한 인디언 뷔페 프로세스 닮은 성분 분석법)

  • Kim, Joon-Shik;Kim, Eun-Sol;Lim, Byoung-Kwon;Lee, Chung-Yeon;Zhang, Byoung-Tak
    • Proceedings of the Korean Information Science Society Conference / 2011.06c / pp.191-194 / 2011
  • Zipf's law states that the frequencies of the words making up a document follow a power law. The Dirichlet process is a machine learning method that finds the topics of documents by taking this word distribution into account. Extending this, the Indian buffet process is a method that finds the latent factors of data through sampling based on a Bayesian probabilistic model. We applied principal component analysis to the PBAIC 2007 data, in which ratings for 25 features are provided together with BOLD (blood oxygen level dependent) signals. The PBAIC 2007 data is a public dataset acquired with functional magnetic resonance imaging (fMRI) during video game play. In our study, we used principal component analysis to find 10 independent components. We then obtained weights by taking the inner product between the BOLD signals, acquired every 1.75 seconds, and the 10 eigenvectors. By sorting the component weights in ascending order, we were able to identify the components with the dominant influence at each time point.

FEYNMAN-KAC SEMIGROUPS, MARTINGALES AND WAVE OPERATORS

  • Van Casteren, Jan A.
    • Journal of the Korean Mathematical Society / v.38 no.2 / pp.227-274 / 2001
  • In this paper we discuss the following topics: (1) Notation, generalities, Markov processes. The close relationship between (generators of) Markov processes and the martingale problem is exhibited. A link between the Korovkin property and generators of Feller semigroups is established. (2) Feynman-Kac semigroups: 0-order regular perturbations, pinned Markov measures. A basic representation via distributions of Markov processes is depicted. (3) Dirichlet semigroups: 0-order singular perturbations, harmonic functions, multiplicative functionals. Here a representation theorem for solutions to the heat equation is depicted in terms of the distributions of the underlying Markov process and a suitable stopping time. (4) Sets of finite capacity, wave operators, and related results. In this section a number of results are presented concerning the completeness of scattering systems (and its spectral consequences). (5) Some (abstract) problems related to Neumann semigroups: 1st order perturbations. In this section some rather abstract problems are presented, which lie on the borderline between first order perturbations together with their boundary limits (Neumann type boundary conditions) and reflected Markov processes.
