• Title/Summary/Keyword: Automatic Data Extraction


Feature Extraction of Non-proliferative Diabetic Retinopathy Using Faster R-CNN and Automatic Severity Classification System Using Random Forest Method

  • Jung, Younghoon;Kim, Daewon
    • Journal of Information Processing Systems / v.18 no.5 / pp.599-613 / 2022
  • Non-proliferative diabetic retinopathy is a representative complication in diabetic patients and is known to be a major cause of impaired vision and blindness. There has been ongoing research on automatic detection of diabetic retinopathy; however, there is also a growing need for an automatic severity classification system. This study proposes an automatic detection system for pathological signs of diabetic retinopathy, such as microaneurysms, retinal hemorrhage, and hard exudate, by applying the Faster R-CNN technique. An automatic severity classification system was devised by training and testing a Random Forest classifier on data obtained through preprocessing of the detected features. In an experiment classifying 228 test fundus images, the proposed classification system achieved 97.8% accuracy.
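The second stage described above, a Random Forest severity classifier over detected lesion features, can be sketched as follows. The feature layout (per-image counts of microaneurysms, hemorrhages, and hard exudates), the toy severity rule, and all data here are illustrative assumptions, not the paper's actual pipeline.

```python
# Sketch: Random Forest severity classification from per-image lesion counts.
# Columns are assumed to be [microaneurysms, hemorrhages, hard exudates];
# the labelling rule below is a toy stand-in for clinical severity grades.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.integers(0, 20, size=(500, 3))           # lesion counts per fundus image
y = np.digitize(X.sum(axis=1), bins=[10, 30])    # toy grades 0-2 by total lesions

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.3f}")
```

With real detector output, the preprocessing step in the paper would replace the synthetic `X` above.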

A Study on the Automatic Detection and Extraction of Narrowband Multiple Frequency Lines (협대역 다중 주파수선의 자동 탐지 및 추출 기법 연구)

  • 이성은;황수복
    • The Journal of the Acoustical Society of Korea / v.19 no.8 / pp.78-83 / 2000
  • A passive sonar system classifies underwater targets by analyzing and comparing various acoustic characteristics, such as signal strength, bandwidth, the number of tonals, and the relationships among tonals, obtained from the extracted tonals and frequency lines. Precise detection and extraction of the signal frequency lines is therefore of particular importance for enhancing the reliability of target classification. However, narrowband frequency lines, which are formed in the spectrogram by a tonal of constant frequency in each frame, can be detected weakly or discontinuously because of variations in signal strength and transmission loss in the sea. The complexity of impulsive ambient noise and signal components also makes it very difficult to detect and extract the signal frequency lines precisely. In this paper, an automatic detection and extraction method that can precisely detect and extract the signal components of frequency lines is proposed. The proposed method can be applied under poor conditions with weak signal strength and high ambient noise, as confirmed by simulation using real underwater target data.


A Development of Automatic Lineament Extraction Algorithm from Landsat TM images for Geological Applications (지질학적 활용을 위한 Landsat TM 자료의 자동화된 선구조 추출 알고리즘의 개발)

  • 원중선;김상완;민경덕;이영훈
    • Korean Journal of Remote Sensing / v.14 no.2 / pp.175-195 / 1998
  • Automatic lineament extraction algorithms have been developed by various researchers for geological purposes using remotely sensed data. However, most are designed for a particular topographic model, for instance a rugged mountainous region or a flat basin. The most common topography in Korea is mountainous terrain adjoining alluvial plains, so it is difficult to apply previous algorithms directly to this area. In this study, a new algorithm for automatic lineament extraction from remotely sensed images is developed specifically for geological applications. An algorithm named DSTA (Dynamic Segment Tracing Algorithm) is developed to produce a binary image composed of linear and non-linear components. The proposed algorithm effectively reduces the look-direction bias associated with the sun's azimuth angle and the noise in low-contrast regions by utilizing a dynamic sub-window, and it can successfully accommodate lineaments in alluvial plains as well as mountainous regions. Two additional algorithms for estimating individual lineament vectors, named ALEHHT (Automatic Lineament Extraction by Hierarchical Hough Transform) and ALEGHT (Automatic Lineament Extraction by Generalized Hough Transform), which merge line segments through the Hierarchical Hough transform and the Generalized Hough transform respectively, are also developed to generate geological lineaments. The merging operation proposed in this study depends on three parameters: the angle between two lines ($\delta\beta$), the perpendicular distance ($d_{ij}$), and the distance between the midpoints of the lines ($d_n$). Test results of the developed algorithm on a Landsat TM image demonstrate that lineaments in alluvial plains as well as in rugged mountains are extremely well extracted; even lineaments parallel to the sun's azimuth angle are well detected by this approach. Further study is, however, required to accommodate the effect of the quantization interval parameter ($d_\rho$) in ALEGHT for optimization.
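The three-parameter merging test described in the abstract can be sketched as a simple predicate over two extracted line segments. The threshold values and the segment representation here are illustrative assumptions, not the paper's calibrated settings.

```python
# Sketch: merge two line segments when the angle between them (delta beta),
# the perpendicular distance (d_ij), and the midpoint distance (d_n) are all
# below thresholds. Threshold values are illustrative.
import math

def should_merge(seg_a, seg_b, max_angle=5.0, max_perp=3.0, max_mid=20.0):
    """Each segment is ((x1, y1), (x2, y2)); returns True if mergeable."""
    (ax1, ay1), (ax2, ay2) = seg_a
    (bx1, by1), (bx2, by2) = seg_b
    # delta beta: acute angle between the two direction vectors, in degrees
    ang_a = math.atan2(ay2 - ay1, ax2 - ax1)
    ang_b = math.atan2(by2 - by1, bx2 - bx1)
    d_beta = abs(math.degrees(ang_a - ang_b)) % 180
    d_beta = min(d_beta, 180 - d_beta)
    # d_ij: perpendicular distance from seg_b's midpoint to the line of seg_a
    mbx, mby = (bx1 + bx2) / 2, (by1 + by2) / 2
    nx, ny = -(ay2 - ay1), ax2 - ax1            # normal to seg_a
    d_ij = abs((mbx - ax1) * nx + (mby - ay1) * ny) / math.hypot(nx, ny)
    # d_n: distance between the two segment midpoints
    max_a, may_a = (ax1 + ax2) / 2, (ay1 + ay2) / 2
    d_n = math.hypot(mbx - max_a, mby - may_a)
    return d_beta < max_angle and d_ij < max_perp and d_n < max_mid

print(should_merge(((0, 0), (10, 0)), ((12, 0.5), (22, 0.5))))  # near-collinear
```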

DEVELOPMENT OF AN AUTOMATIC PROCESSING PROGRAM FOR BOES DATA (BOES 관측데이터의 자동처리 프로그램 개발)

  • Kang, Dong-Il;Park, Hong-Suh;Han, In-Woo;Valyavin, G.;Lee, Byeong-Cheol;Kim, Kang-Min
    • Publications of The Korean Astronomical Society / v.20 no.1 s.24 / pp.97-107 / 2005
  • We developed a data reduction program (RX) to automatically process BOES data. It processes a whole night's set of data automatically: preprocessing, extraction to one-dimensional spectra, and wavelength calibration. Execution is very fast and performance is good. We describe the performance of this program, comparing its procedure with that of IRAF. RX does not yet have functions for continuum normalization; we will develop them in future work.

Automatic Extraction of References for Research Reports using Deep Learning Language Model (딥러닝 언어 모델을 이용한 연구보고서의 참고문헌 자동추출 연구)

  • Yukyung Han;Wonsuk Choi;Minchul Lee
    • Journal of the Korean Society for information Management / v.40 no.2 / pp.115-135 / 2023
  • The purpose of this study is to assess the effectiveness of using deep learning language models to extract references automatically and create a reference database for research reports in an efficient manner. Unlike academic journals, research reports present difficulties in automatically extracting references due to variations in formatting across institutions. In this study, we addressed this issue by introducing the task of separating references from non-reference phrases, in addition to the commonly used metadata extraction task for reference extraction. The study employed datasets that included various types of references, such as those from research reports of a particular institution, academic journals, and a combination of academic journal references and non-reference texts. Two deep learning language models, namely RoBERTa+CRF and ChatGPT, were compared to evaluate their performance in automatic extraction. They were used to extract metadata, categorize data types, and separate original text. The research findings showed that the deep learning language models were highly effective, achieving maximum F1-scores of 95.41% for metadata extraction and 98.91% for categorization of data types and separation of the original text. These results provide valuable insights into the use of deep learning language models and different types of datasets for constructing reference databases for research reports including both reference and non-reference texts.

An Automatic Extraction Algorithm of Structure Boundary from Terrestrial LIDAR Data (지상라이다 데이터를 이용한 구조물 윤곽선 자동 추출 알고리즘 연구)

  • Roh, Yi-Ju;Kim, Nam-Woon;Yun, Kee-Bang;Jung, Kyeong-Hoon;Kang, Dong-Wook;Kim, Ki-Doo
    • 전자공학회논문지 IE / v.46 no.1 / pp.7-15 / 2009
  • In this paper, automatic extraction of structure boundaries from 3-dimensional terrestrial LIDAR (Light Detection And Ranging) data is proposed. The algorithm requires neither photographs nor pre-processing. An efficient decimation method is proposed that considers the size of the object, the amount of LIDAR data, and so on. From the decimated data, object points and non-object points are distinguished using distance information, which is a major feature of LIDAR. Then, points with large or small local variations are extracted as boundary point candidates. Finally, a boundary line is drawn through these candidates, yielding the approximate boundary of the object.
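The distance-based separation and local-variation steps above can be sketched on a toy 1-D range scan. The threshold values and the sample ranges are illustrative assumptions, not values from the paper.

```python
# Sketch: classify near points as object points via a range threshold, then
# mark boundary candidates where the local range variation jumps sharply.
# Thresholds (5.0 m object band, 2.0 m jump) are illustrative.
import numpy as np

ranges = np.array([9.8, 9.9, 10.0, 3.1, 3.0, 3.2, 3.1, 9.9, 10.1])  # metres
object_mask = ranges < 5.0                           # near points -> object
variation = np.abs(np.diff(ranges))                  # local range variation
boundary_candidates = np.where(variation > 2.0)[0]   # indices of large jumps

print(object_mask.tolist())
print(boundary_candidates.tolist())
```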

Automatic Extraction of Stable Visual Landmarks for a Mobile Robot under Uncertainty (이동로봇의 불확실성을 고려한 안정한 시각 랜드마크의 자동 추출)

  • Moon, In-Hyuk
    • Journal of Institute of Control, Robotics and Systems / v.7 no.9 / pp.758-765 / 2001
  • This paper proposes a method to automatically extract stable visual landmarks from sensory data. Given a 2D occupancy map, a mobile robot first extracts vertical line features that are distinct and lie on vertical planar surfaces, because these are expected to be observed reliably from various viewpoints. Since feature information such as position and length includes uncertainty due to errors of vision and motion, the robot then reduces the uncertainty by matching the planar surface containing the features to the map. As a result, the robot obtains modeled stable visual landmarks from the extracted features. This extraction process is performed on-line to adapt to actual changes of lighting and scene depending on the robot's view. Experimental results in various real scenes show the validity of the proposed method.


Fault Pattern Extraction Via Adjustable Time Segmentation Considering Inflection Points of Sensor Signals for Aircraft Engine Monitoring (센서 데이터 변곡점에 따른 Time Segmentation 기반 항공기 엔진의 고장 패턴 추출)

  • Baek, Sujeong
    • Journal of Korean Society of Industrial and Systems Engineering / v.44 no.3 / pp.86-97 / 2021
  • As mechatronic systems have various, complex functions and require high performance, automatic fault detection is necessary for secure operation in manufacturing processes. For automatic, real-time fault detection in modern mechatronic systems, multiple sensor signals are collected via Internet-of-Things technologies. Traditional statistical control charts and machine learning approaches perform well with unified, solid density models under normal operating states, but they have limitations when signal models are scattered under normal states, so pattern extraction and matching approaches have received much attention. Signal-discretization-based pattern extraction is a popular form of signal analysis, which reduces the size of a given dataset as much as possible while highlighting significant and inherent signal behaviors. Since general pattern extraction methods usually use a fixed time segmentation, they can easily cut off significant behaviors, reducing the quality of the extracted fault patterns. In this regard, adjustable time segmentation is proposed to extract more meaningful fault patterns from multiple sensor signals. By considering inflection points of the signals, we determine the optimal cut-points of the time segments in each sensor signal. In addition, to clarify the inflection points, we apply a Savitzky-Golay filter to the original datasets. To validate and verify the performance of the proposed segmentation, a dataset collected from an aircraft engine (provided by the NASA prognostics center) is used for fault pattern extraction. As a result, the proposed adjustable time segmentation shows better performance in fault pattern extraction.
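The segmentation idea above (Savitzky-Golay smoothing, then cut-points at inflection points) can be sketched on a synthetic signal. The window length, polynomial order, and test signal are illustrative assumptions, not the paper's settings or the NASA dataset.

```python
# Sketch: smooth a noisy sensor signal with a Savitzky-Golay filter, then
# place segment cut-points at inflection points, i.e. sign changes of the
# filter's second-derivative estimate. Parameters are illustrative.
import numpy as np
from scipy.signal import savgol_filter

t = np.linspace(0, 4 * np.pi, 200)
signal = np.sin(t) + 0.05 * np.random.default_rng(0).normal(size=t.size)

smooth = savgol_filter(signal, window_length=21, polyorder=3)
d2 = savgol_filter(signal, window_length=21, polyorder=3, deriv=2)
cuts = np.where(np.diff(np.sign(d2)) != 0)[0]   # inflection-point indices
segments = np.split(smooth, cuts)               # variable-length time segments

print(f"{len(segments)} segments from {len(cuts)} inflection points")
```

Unlike a fixed-size segmentation, the segment lengths here follow the signal's own shape, which is the point the abstract makes about preserving significant behaviors.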

Extraction of Geometric Primitives from Point Cloud Data

  • Kim, Sung-Il;Ahn, Sung-Joon
    • 제어로봇시스템학회:학술대회논문집 / 2005.06a / pp.2010-2014 / 2005
  • Object detection and parameter estimation in point cloud data are relevant to robotics, reverse engineering, computer vision, and sport mechanics. In this paper, software is presented for fully automatic object detection and parameter estimation in unordered, incomplete, and error-contaminated point clouds with a large number of data points. The software consists of three algorithmic modules, one each for object identification, point segmentation, and model fitting. Newly developed algorithms for orthogonal distance fitting (ODF) play a fundamental role in each of the three modules. The ODF algorithms estimate the model parameters by minimizing the sum of squares of the shortest distances between the model feature and the measurement points. Curvature analysis of local quadric surfaces fitted to small patches of the point cloud provides the seed information necessary for automatic model selection, point segmentation, and model fitting. The performance of the software on a variety of point cloud data will be demonstrated live.
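Orthogonal distance fitting as described above, minimizing the squared shortest distances between model and points, can be sketched for the simplest primitive, a circle. The data, initial guess, and use of a general-purpose least-squares solver are illustrative assumptions; the paper's ODF algorithms are specialized implementations.

```python
# Sketch: orthogonal distance fitting of a circle. The residual for each
# measurement point is its shortest (orthogonal) distance to the circle,
# i.e. |distance to centre - radius|; data are synthetic.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(1)
theta = rng.uniform(0, 2 * np.pi, 100)
pts = np.c_[3 + 5 * np.cos(theta), -2 + 5 * np.sin(theta)]
pts += rng.normal(scale=0.05, size=pts.shape)       # noisy circle, r = 5

def ortho_residuals(p):
    cx, cy, r = p
    return np.hypot(pts[:, 0] - cx, pts[:, 1] - cy) - r

fit = least_squares(ortho_residuals, x0=[0.0, 0.0, 1.0])
cx, cy, r = fit.x
print(f"centre=({cx:.2f}, {cy:.2f}), radius={r:.2f}")
```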


A Research on Automatic Data Extract Method for Herbal Formula Combinations Using Herb and Dosage Terminology - Based on 『Euijongsonik』 - (본초 및 용량 용어를 이용한 방제구성 자동추출방법에 대한 연구 -『의종손익』을 중심으로-)

  • Keum, Yujeong;Lee, Byungwook;Eom, Dongmyung;Song, Jichung
    • Journal of Korean Medical classics / v.33 no.4 / pp.67-81 / 2020
  • Objectives : This research aims to suggest an automatic data extraction method for herbal formula combinations from the texts of medical classics. Methods : The research was carried out using Access from Microsoft Office 365 on Microsoft Windows 10. The subject text for extraction was 『Euijongsonik』. Using data sets of herb and dosage terminology, herbal medicinals and their dosages were extracted; then, using the position values of the character strings, the formula combinations were automatically extracted. Results : The PC environment for this research was an Intel Core i7-1065G7 CPU at 1.30 GHz with 8 GB of RAM and a Windows 10 64-bit operating system. From 6,115 verses, 19,277 herb-dosage combinations were extracted. Conclusions : This research demonstrated that, for classical texts available as data, knowledge of herbal medicine can be extracted without additional human or material resources, suggesting the applicability of classical text knowledge to clinical practice.
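The terminology-and-position approach above can be sketched as a simple scan: match known herb names followed by a dosage expression and record the string position of each hit. The herb list, dosage pattern, and sample verse below are toy stand-ins, not the actual 『Euijongsonik』 terminology sets.

```python
# Sketch: extract herb-dosage pairs from a classical-text verse using a
# terminology list of herb names and a dosage pattern, keeping the string
# position of each match. All data here are illustrative.
import re

herbs = ["人參", "甘草", "白朮"]                    # toy herb terminology set
dosage = r"[一二三四五六七八九十]+[錢兩分]"          # toy dosage pattern
pattern = re.compile("(" + "|".join(herbs) + ")(" + dosage + ")")

verse = "人參三錢 甘草一錢 白朮二錢"
pairs = [(m.group(1), m.group(2), m.start()) for m in pattern.finditer(verse)]
print(pairs)
```

The recorded positions are what allow the formula combinations to be reassembled in reading order afterwards.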