• 제목/요약/키워드: Topic Segmentation

검색결과 36건 처리시간 0.028초

Object detection in financial reporting documents for subsequent recognition

  • Sokerin, Petr;Volkova, Alla;Kushnarev, Kirill
    • International journal of advanced smart convergence
    • /
    • 제10권1호
    • /
    • pp.1-11
    • /
    • 2021
  • Document page segmentation is an important step in building a quality optical character recognition module. The study examined already existing work on the topic of page segmentation and focused on the development of a segmentation model that has greater functional significance for application in an organization, as well as broad capabilities for managing the quality of the model. The main problems of document segmentation were highlighted, which include a complex background of intersecting objects. As classes for detection, not only classic text, table and figure were selected, but also additional types, such as signature, logo and table without borders (or with partially missing borders). This made it possible to pose a non-trivial task of detecting non-standard document elements. The authors compared existing neural network architectures for object detection based on published research data. The most suitable architecture was RetinaNet. To ensure the possibility of quality control of the model, a method based on neural network modeling using the RetinaNet architecture is proposed. During the study, several models were built, the quality of which was assessed on the test sample using the Mean average Precision metric. The best result among the constructed algorithms was shown by a model that includes four neural networks: the focus of the first neural network on detecting tables and tables without borders, the second - seals and signatures, the third - pictures and logos, and the fourth - text. As a result of the analysis, it was revealed that the approach based on four neural networks showed the best results in accordance with the objectives of the study on the test sample in the context of most classes of detection. The method proposed in the article can be used to recognize other objects. A promising direction in which the analysis can be continued is the segmentation of tables; the areas of the table that differ in function will act as classes: heading, cell with a name, cell with data, empty cell.

Variational Expectation-Maximization Algorithm in Posterior Distribution of a Latent Dirichlet Allocation Model for Research Topic Analysis

  • Kim, Jong Nam
    • 한국멀티미디어학회논문지
    • /
    • 제23권7호
    • /
    • pp.883-890
    • /
    • 2020
  • In this paper, we propose a variational expectation-maximization algorithm that computes posterior probabilities from Latent Dirichlet Allocation (LDA) model. The algorithm approximates the intractable posterior distribution of a document term matrix generated from a corpus made up by 50 papers. It approximates the posterior by searching the local optima using lower bound of the true posterior distribution. Moreover, it maximizes the lower bound of the log-likelihood of the true posterior by minimizing the relative entropy of the prior and the posterior distribution known as KL-Divergence. The experimental results indicate that documents clustered to image classification and segmentation are correlated at 0.79 while those clustered to object detection and image segmentation are highly correlated at 0.96. The proposed variational inference algorithm performs efficiently and faster than Gibbs sampling at a computational time of 0.029s.

부식 검출과 분석에 적용한 영상 처리 기술 동향 (Trends in image processing techniques applied to corrosion detection and analysis)

  • 김범수;권재성;양정현
    • 한국표면공학회지
    • /
    • 제56권6호
    • /
    • pp.353-370
    • /
    • 2023
  • Corrosion detection and analysis is a very important topic in reducing costs and preventing disasters. Recently, image processing techniques have been widely applied to corrosion identification and analysis. In this work, we briefly introduces traditional image processing techniques and machine learning algorithms applied to detect or analyze corrosion in various fields. Recently, machine learning, especially CNN-based algorithms, have been widely applied to corrosion detection. Additionally, research on applying machine learning to region segmentation is very actively underway. The corrosion is reddish and brown in color and has a very irregular shape, so a combination of techniques that consider color and texture, various mathematical techniques, and machine learning algorithms are used to detect and analyze corrosion. We present examples of the application of traditional image processing techniques and machine learning to corrosion detection and analysis.

A multisource image fusion method for multimodal pig-body feature detection

  • Zhong, Zhen;Wang, Minjuan;Gao, Wanlin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제14권11호
    • /
    • pp.4395-4412
    • /
    • 2020
  • The multisource image fusion has become an active topic in the last few years owing to its higher segmentation rate. To enhance the accuracy of multimodal pig-body feature segmentation, a multisource image fusion method was employed. Nevertheless, the conventional multisource image fusion methods can not extract superior contrast and abundant details of fused image. To superior segment shape feature and detect temperature feature, a new multisource image fusion method was presented and entitled as NSST-GF-IPCNN. Firstly, the multisource images were resolved into a range of multiscale and multidirectional subbands by Nonsubsampled Shearlet Transform (NSST). Then, to superior describe fine-scale texture and edge information, even-symmetrical Gabor filter and Improved Pulse Coupled Neural Network (IPCNN) were used to fuse low and high-frequency subbands, respectively. Next, the fused coefficients were reconstructed into a fusion image using inverse NSST. Finally, the shape feature was extracted using automatic threshold algorithm and optimized using morphological operation. Nevertheless, the highest temperature of pig-body was gained in view of segmentation results. Experiments revealed that the presented fusion algorithm was able to realize 2.102-4.066% higher average accuracy rate than the traditional algorithms and also enhanced efficiency.

분야연상어를 이용한 화제의 계속성과 전환성을 추적하는 단락분할 방법 (Passage Retrieval based on Tracing Topic Continuity and Transition by Using Field-Associated Term)

  • 이상곤
    • 정보처리학회논문지B
    • /
    • 제10B권1호
    • /
    • pp.57-66
    • /
    • 2003
  • 복수의 화제가 혼합되어 있는 문서에서 각 화제의 경계부분을 구분하여 결정하는 기술을 단락분할이라 한다. 이 기술은 정보검색의 분야에만 한정되지 않고 다양한 분야에서 중요한 역할을 담당할 기술이다. 잘 정의된 분야체계에 따라 구축된 분야연상어를 이용하여 단락분할을 시도한다. 분야연상어란 특정한 분야를 정확하게 연상할 수 있는 단어로서 잘 분류된 문서 컬렉션에서 구축할 수 있다. 이 분야연상어를 이용하여 문서를 관련된 분야별로 추출하여 의미기반 단락추출 방법을 제안한다. 화제의 계속성에 주목하여 분야연상어의 수준(범위)이나 연속출현성에 의해 계산된 계속도에 의해 화제의 실마리를 추적하고, 화제의 전환성을 고려한 방법을 제안한다. 문서 내 각 화제의 단락구분을 명확히 하여, 단락을 화제분야별로 추출하는 방법을 제안한다. 일본어 50문서를 실험한 결과 82%의 정확율과 63%의 재현율을 얻어 실용성을 기대할 수 있었고, 한국어에 적용하여도 좋을 것으로 예상한다.

기간별 이슈 매핑을 통한 이슈 생명주기 분석 방법론 (Analyzing the Issue Life Cycle by Mapping Inter-Period Issues)

  • 임명수;김남규
    • 지능정보연구
    • /
    • 제20권4호
    • /
    • pp.25-41
    • /
    • 2014
  • 최근 스마트 기기를 통해 소셜미디어에 참여하는 사용자가 급격히 증가하고 있다. 이에 따라 빅데이터 분석에 대한 관심이 높아지고 있으며 최근 포털 사이트에서 검색어로 자주 입력되거나 다양한 소셜미디어에서 자주 언급되는 단어에 대한 분석을 통해 사회적 이슈를 파악하기 위한 시도가 이루어 지고 있다. 이처럼 다량의 텍스트를 통해 도출된 사회적 이슈의 기간별 추이를 비교하는 분석을 이슈 트래킹이라 한다. 하지만 기존의 이슈 트래킹은 두 가지 한계를 가지고 있다. 첫째, 전통적 방식의 이슈 트래킹은 전체 기간의 문서에 대해 일괄 토픽 분석을 실시하고 각 토픽의 기간별 분포를 파악하는 방식으로 이루어지므로, 새로운 기간의 문서가 추가되었을 때 추가된 문서에 대해서만 분석을 추가 실시하는 것이 아니라 전체 기간의 문서에 대한 분석을 다시 실시해야 한다는 실용성 측면의 한계를 갖고 있다. 둘째, 이슈는 끊임 없이 생성되고 소멸될 뿐 아니라, 때로는 하나의 이슈가 둘 이상의 이슈로 분화하고 둘 이상의 이슈가 하나로 통합되기도 한다. 즉, 이슈는 생성, 변화(병합, 분화), 그리고 소멸의 생명주기를 갖게 되는데, 전통적 이슈 트래킹은 이러한 이슈의 가변성을 다루지 않았다는 한계를 갖는다. 본 연구에서는 이러한 한계를 극복하기 위해 대상 기간 전체의 문서를 한꺼번에 분석하는 방식이 아닌 세부 기간별 문서에 대해 독립적인 분석을 수행하고 이를 통합할 수 있는 방안을 제시하였으며, 이를 통해 새로운 이슈가 생성되고 변화하며 소멸되는 전체 과정을 규명하였다. 또한 실제 인터넷 뉴스에 대해 제안 방법론을 적용함으로써, 제안 방법론의 실무 적용 가능성을 분석하였다.

Fast Mode Decision For Depth Video Coding Based On Depth Segmentation

  • Wang, Yequn;Peng, Zongju;Jiang, Gangyi;Yu, Mei;Shao, Feng
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제6권4호
    • /
    • pp.1128-1139
    • /
    • 2012
  • With the development of three-dimensional display and related technologies, depth video coding becomes a new topic and attracts great attention from industries and research institutes. Because (1) the depth video is not a sequence of images for final viewing by end users but an aid for rendering, and (2) depth video is simpler than the corresponding color video, fast algorithm for depth video is necessary and possible to reduce the computational burden of the encoder. This paper proposes a fast mode decision algorithm for depth video coding based on depth segmentation. Firstly, based on depth perception, the depth video is segmented into three regions: edge, foreground and background. Then, different mode candidates are searched to decide the encoding macroblock mode. Finally, encoding time, bit rate and video quality of virtual view of the proposed algorithm are tested. Experimental results show that the proposed algorithm save encoding time ranging from 82.49% to 93.21% with negligible quality degradation of rendered virtual view image and bit rate increment.

A Novel Text Sample Selection Model for Scene Text Detection via Bootstrap Learning

  • Kong, Jun;Sun, Jinhua;Jiang, Min;Hou, Jian
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제13권2호
    • /
    • pp.771-789
    • /
    • 2019
  • Text detection has been a popular research topic in the field of computer vision. It is difficult for prevalent text detection algorithms to avoid the dependence on datasets. To overcome this problem, we proposed a novel unsupervised text detection algorithm inspired by bootstrap learning. Firstly, the text candidate in a novel form of superpixel is proposed to improve the text recall rate by image segmentation. Secondly, we propose a unique text sample selection model (TSSM) to extract text samples from the current image and eliminate database dependency. Specifically, to improve the precision of samples, we combine maximally stable extremal regions (MSERs) and the saliency map to generate sample reference maps with a double threshold scheme. Finally, a multiple kernel boosting method is developed to generate a strong text classifier by combining multiple single kernel SVMs based on the samples selected from TSSM. Experimental results on standard datasets demonstrate that our text detection method is robust to complex backgrounds and multilingual text and shows stable performance on different standard datasets.

중소기업을 위한 지식서비스 수요 조사 모형 개발 (The development of knowledge service needs assessment model for small and medium-sized businesses)

  • 맹윤호;유선희;서진이
    • 지식경영연구
    • /
    • 제16권4호
    • /
    • pp.169-190
    • /
    • 2015
  • The status of small and medium-sized enterprises has been changed into more independent business entities rather than simply subcontractor so that the utilization of specialized knowledge has been much more necessary for the survival in the market. However, small and medium-sized enterprises, it is difficult to sufficient investment in knowledge services due to limited resources relative to large enterprises and demand for knowledge services business of government support is growing. For this reason, it is important to measure accurately the demand for knowledge services of small and medium-sized enterprises in knowledge management for effective utilization of knowledge service. In this study, we analyzed previous studies on small and medium-sized enterprises knowledge services that can be utilized in a comprehensive way. As a result, we developed knowledge service needs assessment model based on five critical success factors for continual growth and 12 types of knowledge service. This model has been modified and supplemented through expert meeting using delphi research method and topic modeling analysis using secondary data. This study is attempted to appropriately measure necessary knowledge services for small and medium-sized enterprises so that generated the evaluation model of knowledge service demands, comprehensively dealing with core knowledge services for many kinds of business entities. It is expected that the developed model will be a useful tool to understand and evaluate knowledge services demands of enterprises.

화상 통신에서의 사생활 보호를 위한 실시간 전경 분리 및 배경 대체 (Real-Time Foreground Segmentation and Background Substitution for Protecting Privacy on Visual Communication)

  • 배건태;곽수영;변혜란
    • 한국통신학회논문지
    • /
    • 제34권5C호
    • /
    • pp.505-513
    • /
    • 2009
  • 본 논문은 모노 카메라로 입력받은 영상에서 실시간으로 전경과 배경을 분리하여 배경을 자연스럽게 대체 하는 방법을 제안한다. 기존 연구는 대부분 단일 색상의 배경을 이용하여 전경 색에 대한 제약이 있거나, 깊이 정보를 추출을 위한 스테레오 카메라와 같은 장치에 대한 제약이 있거나, 제한적인 전경의 모양 모델을 이용하여 분리할 수 있는 전경의 모양에 대한 제약이 있었다. 이에 본 논문에서는 일반적으로 사용되는 웹캠과 같은 고정된 모노 카메라를 이용하여 실시간으로 전경 분리가 가능한 전경 분리 방법을 제안한다. 또한, 전경 분리의 성능 향상을 위하여 통영상의 시간적인 특징 정보를 이용한 시간적 전경 확률 모델을 제안한다. 또한 분리된 전경과 새로운 배경의 자연스러운 합성을 위한 알파 매트를 이용한 경계선 영역 처리방법과 간단한 후 처리 방법을 제안한다. 제안된 방법은 실제의 화상통신에서 개인의 사적인 정보가 포함된 배경을 자연스럽게 대체시켜 개인의 사생활을 보호할 수 있다.