• Title/Summary/Keyword: Automatic Data Extraction

Development of Expert Systems using Automatic Knowledge Acquisition and Composite Knowledge Expression Mechanism

  • Kim, Jin-Sung
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.09a / pp.447-450 / 2003
  • In this research, we propose an automatic knowledge acquisition and composite knowledge expression mechanism based on machine learning and a relational database. Most traditional approaches to developing the knowledge base and inference engine of an expert system were based on IF-THEN rules, AND-OR graphs, semantic networks, or frames, each used separately. However, these approaches have limitations in automatic knowledge acquisition, complicated knowledge expression, expansibility of the knowledge base, speed of inference, and hierarchies among rules. To overcome these limitations, many researchers have tried to develop automatic knowledge acquisition, composite knowledge expression, and fast inference methods. As a result, the adaptability of expert systems improved rapidly. Nonetheless, they did not suggest a hybrid, generalized solution that supports the entire development process of expert systems. Our proposed mechanism has five empirically observed advantages. First, it can extract domain-specific knowledge from an incomplete database using a machine learning algorithm. Second, it can efficiently reduce the number of rules through the rule extraction mechanism used in machine learning. Third, it can expand the knowledge base without limit by using a relational database. Fourth, the backward inference engine developed in this study can rapidly manipulate the knowledge base stored in the relational database, so inference is faster than with traditional text-oriented inference mechanisms. Fifth, our composite knowledge expression mechanism can reflect traditional knowledge expression methods such as IF-THEN rules, AND-OR graphs, and relationship matrices simultaneously. To validate the inference ability of our system, we adopted a real data set from a clinical diagnosis task classifying dermatological diseases.
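As a rough illustration of the idea in this abstract (IF-THEN rules kept in a relational table and queried by a backward inference engine), a minimal sketch follows. The table layout, the function names `build_kb` and `prove`, and the rule contents are all hypothetical; the abstract does not describe the authors' actual schema or engine, and the symptom labels are placeholders rather than clinical knowledge.

```python
import sqlite3

def build_kb(rules):
    """Store IF-THEN rules in an in-memory relational table, one row per premise.

    The schema (rule_id, premise, conclusion) is an assumption for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rule (rule_id INTEGER, premise TEXT, conclusion TEXT)")
    for rule_id, (premises, conclusion) in enumerate(rules):
        conn.executemany(
            "INSERT INTO rule VALUES (?, ?, ?)",
            [(rule_id, premise, conclusion) for premise in premises],
        )
    return conn

def prove(conn, goal, facts, seen=frozenset()):
    """Backward chaining: a goal holds if it is a known fact, or if every
    premise of some rule that concludes it can itself be proved."""
    if goal in facts:
        return True
    if goal in seen:  # guard against cyclic rule sets
        return False
    rule_ids = [row[0] for row in conn.execute(
        "SELECT DISTINCT rule_id FROM rule WHERE conclusion = ?", (goal,))]
    for rule_id in rule_ids:
        premises = [row[0] for row in conn.execute(
            "SELECT premise FROM rule WHERE rule_id = ?", (rule_id,))]
        if all(prove(conn, p, facts, seen | {goal}) for p in premises):
            return True
    return False

# Hypothetical placeholder rules, not taken from the paper.
RULES = [
    (["scaling", "erythema"], "inflammatory_lesion"),
    (["inflammatory_lesion", "itching"], "suspect_dermatitis"),
]
```

Calling `prove(build_kb(RULES), "suspect_dermatitis", {"scaling", "erythema", "itching"})` walks back from the goal through both rules. Because the rules live in a table, adding knowledge is an `INSERT` rather than a code change, which is the kind of expansibility the abstract attributes to the relational store.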


Automatic Object Extraction from Electronic Documents Using Deep Neural Network (심층 신경망을 활용한 전자문서 내 객체의 자동 추출 방법 연구)

  • Jang, Heejin;Chae, Yeonghun;Lee, Sangwon;Jo, Jinyong
    • KIPS Transactions on Software and Data Engineering / v.7 no.11 / pp.411-418 / 2018
  • With the proliferation of artificial intelligence technology, it is becoming important to obtain, store, and utilize scientific data in research and science sectors. A number of methods for extracting meaningful objects such as graphs and tables from research articles have been proposed to eventually obtain scientific data. Existing extraction methods using heuristic approaches are hardly applicable to electronic documents having heterogeneous manuscript formats because they are designed to work properly for some targeted manuscripts. This paper proposes a prototype of an object extraction system which exploits a recent deep-learning technology so as to overcome the inflexibility of the heuristic approaches. We implemented our trained model, based on the Faster R-CNN algorithm, using the Google TensorFlow Object Detection API and also composed an annotated data set from 100 research articles for training and evaluation. Finally, a performance evaluation shows that the proposed system outperforms a comparator adopting heuristic approaches by 5.2%.

Automatic Tree Extraction Using LIDAR Data (라이다 자료를 이용한 수목추출 자동화)

  • Lee, Su Jee;Kim, Eui Myoung
    • Journal of Korean Society for Geospatial Information Science / v.21 no.1 / pp.39-44 / 2013
  • Trees are important ground objects that produce oxygen and reduce carbon dioxide in urban areas. Many studies using LIDAR data have been conducted to support tree management. However, they rely on LIDAR data processing software developed overseas because domestically developed software is lacking. This work therefore proposes an automated process for extracting trees from LIDAR data. The proposed process classifies LIDAR data and extracts building regions and trees automatically. It was tested on a study site in Yongin. As a result, about 88% of the trees were extracted by the automated process.

A Study on Automatic Extraction of Buildings Using LIDAR with Aerial Imagery (LIDAR 데이터와 항공사진을 이용한 건물의 자동추출에 관한 연구)

  • Lee, Young-Jin;Cho, Woo-Sug
    • Proceedings of the Korean Society of Surveying, Geodesy, Photogrammetry, and Cartography Conference / 2003.04a / pp.471-477 / 2003
  • This paper presents an algorithm that automatically extracts buildings from among the many different features on the earth's surface by fusing LIDAR data with panchromatic aerial images. The proposed algorithm consists of three stages: a point-level process, a polygon-level process, and a parameter-space-level process. In the first stage, we eliminate gross errors and apply a local maxima filter to detect building candidate points in the raw laser scanning data. A grouping procedure then segments the raw LIDAR data, and the segmented data are polygonized by the encasing polygon algorithm developed in this research. In the second stage, we eliminate non-building polygons using several constraints such as area and circularity. In the last stage, all the polygons generated in the second stage are projected onto the aerial stereo images through collinearity condition equations. Finally, we fuse the projected encasing polygons with edges detected by image processing to refine the building segments. The experimental results showed that the RMSEs of building corners in X, Y and Z were ±8.1 cm, ±24.7 cm, and ±35.9 cm, respectively.
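The polygon-level filtering step (discarding non-building polygons by area and circularity) can be sketched as follows. This is only an illustration under assumptions: the abstract does not define circularity or give thresholds, so the sketch uses the common isoperimetric quotient 4πA/P², and the `min_area` and `min_circularity` values are hypothetical.

```python
import math

def polygon_area(points):
    """Area of a simple polygon from its vertices (shoelace formula)."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_perimeter(points):
    """Sum of edge lengths around the closed polygon."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))

def circularity(points):
    """Isoperimetric quotient 4*pi*A / P^2: 1 for a circle, near 0 for slivers."""
    perimeter = polygon_perimeter(points)
    if perimeter == 0:
        return 0.0
    return 4.0 * math.pi * polygon_area(points) / (perimeter * perimeter)

def keep_building_candidates(polygons, min_area=30.0, min_circularity=0.4):
    """Keep polygons that are large and compact enough (thresholds are assumed)."""
    return [
        poly for poly in polygons
        if polygon_area(poly) >= min_area and circularity(poly) >= min_circularity
    ]
```

With these assumed thresholds, a 10 m by 10 m footprint passes both tests, while a 100 m by 0.5 m sliver fails on circularity and a 2 m by 2 m patch fails on area.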


Recognizing Emotional Content of Emails as a byproduct of Natural Language Processing-based Metadata Extraction (이메일에 포함된 감성정보 관련 메타데이터 추출에 관한 연구)

  • Paik, Woo-Jin
    • Journal of the Korean Society for Information Management / v.23 no.2 / pp.167-183 / 2006
  • This paper describes a metadata extraction technique based on natural language processing (NLP) that extracts personalized information from email communications between financial analysts and their clients. Personalization means connecting users with content in a personally meaningful way to create, grow, and retain online relationships. It often results in the creation of user profiles that store individuals' preferences regarding goods or services offered by various e-commerce merchants. We developed an automatic metadata extraction system designed to process textual data such as emails, discussion group postings, or chat group transcriptions. The focus of this paper is the recognition, as metadata, of emotional content such as mood and urgency embedded in business communications.

3D BUILDING INFORMATION EXTRACTION FROM A SINGLE QUICKBIRD IMAGE

  • Kim, Hye-Jin;Han, Dong-Yeob;Kim, Yong-Il
    • Proceedings of the KSRS Conference / v.1 / pp.409-412 / 2006
  • Today's commercial high-resolution satellite imagery, such as IKONOS and QuickBird, offers the potential to extract useful spatial information for geographical database construction and GIS applications. Recognizing this potential, KARI is carrying out a project to develop the Korea Multipurpose Satellite 3 (KOMPSAT-3). It is therefore necessary to develop techniques for various GIS applications of KOMPSAT-3 using similar high-resolution satellite imagery. As fundamental studies for this purpose, we focused on the extraction of 3D spatial information and the update of existing GIS data from QuickBird imagery. This paper examines a scheme for rectifying high-resolution images and suggests a convenient semi-automatic algorithm for extracting 3D building information from a single image. The algorithm is based on a triangular vector structure that consists of a building bottom point, its corresponding roof point, and a shadow end point. The proposed method can increase the number of measurable buildings and enhance digitizing accuracy and computational efficiency.


Car Frame Extraction using Background Frame in Video (동영상에서 배경프레임을 이용한 차량 프레임 검출)

  • Nam, Seok-Woo;Oh, Hea-Seok
    • The KIPS Transactions:PartB / v.10B no.6 / pp.705-710 / 2003
  • In recent years, with the rapid development of multimedia technology, video database systems that retrieve video data efficiently have come to be seen as a core technology of the information-oriented society. This thesis describes an efficient automatic frame detection and location method for content-based retrieval of video. The frame extraction part consists of an incoming/outgoing car frame extraction stage and a car number frame extraction stage. We obtain the start and end times of the car video as well as the car number frames. Frames are sampled from the video at a fixed time interval, and key frames are selected using a color scale histogram and an edge operation method. Recognized car frames can then be searched with a content-based retrieval method.
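The sampling and histogram-based key frame selection described above can be sketched as follows. This is an assumption-laden illustration, not the authors' method: frames are modelled as flat, non-empty lists of 8-bit intensities, the edge operation is omitted, and `step`, `bins`, and `threshold` are hypothetical parameters.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalized intensity histogram of a frame given as a flat list of pixels."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    total = len(frame)
    return [count / total for count in counts]

def hist_distance(h1, h2):
    """Half the L1 distance between two normalized histograms, in [0, 1]."""
    return sum(abs(a - b) for a, b in zip(h1, h2)) / 2.0

def select_key_frames(frames, step=2, threshold=0.3):
    """Sample every `step`-th frame; keep it as a key frame when its histogram
    differs from the previous key frame's histogram by more than `threshold`."""
    key_indices = []
    last_hist = None
    for index in range(0, len(frames), step):
        current = histogram(frames[index])
        if last_hist is None or hist_distance(current, last_hist) > threshold:
            key_indices.append(index)
            last_hist = current
    return key_indices
```

Comparing against the last key frame rather than the immediately preceding sample is a design choice made here for simplicity: it avoids emitting a run of near-identical key frames during a slow change.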

Automatic Extraction of Route Information from Road Sign Imagery

  • Youn, Junhee;Chong, Kyusoo
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.33 no.6 / pp.595-603 / 2015
  • With advances in big-data processing technology, acquiring real-time information from the massive image data taken by a mobile device inside a vehicle will be possible in the near future. Among the information that can be found around the vehicle, route information is needed for safe driving. This study deals with the automatic extraction of route information from road sign imagery. The scope of the route information includes the route number, the route type, and their relationship with the driving direction. To recognize the route number, a modified Tesseract OCR (Optical Character Recognition) engine is used after the rectangular road sign area is extracted with the Freeman chain code tracing algorithm. The route types (expressway, highway, rural highway, and municipal road) are recognized using the proposed algorithms, which are derived from colour space analysis. Road signs provide information about the route number as well as the roads that may be encountered along the way. In this study, such information is called “OTW (on the way)” or “TTW (to the way)”; which of the two applies is determined using direction information. Finally, the route number is matched with the direction information. Experiments were carried out with road sign imagery taken inside a car. Route numbers, route number types, and OTW or TTW were recognized successfully; however, some errors occurred when matching TTW numbers with the direction.
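To make the Freeman chain code step more concrete, the sketch below encodes an already traced, closed pixel boundary as 8-direction codes and applies a simple rectangularity check. The direction numbering (0 = east, counted counterclockwise, with image rows growing downward), the function names, and the axis-aligned rectangle heuristic are assumptions for illustration; the abstract does not specify how the authors trace boundaries or test for rectangular signs.

```python
# Freeman 8-direction codes keyed by the (dx, dy) step between neighbouring
# boundary pixels; y grows downward as in image coordinates (assumed convention).
DIRECTIONS = {
    (1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
    (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7,
}

def freeman_chain_code(boundary):
    """Encode an ordered, closed, 8-connected pixel boundary as direction codes."""
    codes = []
    n = len(boundary)
    for i in range(n):
        x1, y1 = boundary[i]
        x2, y2 = boundary[(i + 1) % n]
        codes.append(DIRECTIONS[(x2 - x1, y2 - y1)])
    return codes

def looks_rectangular(codes):
    """Heuristic: an axis-aligned rectangle uses only the even codes (0, 2, 4, 6)
    and changes direction exactly four times around the closed loop."""
    if not codes or any(code % 2 for code in codes):
        return False
    turns = sum(1 for i in range(len(codes)) if codes[i] != codes[i - 1])
    return turns == 4
```

For the boundary `[(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1)]` the chain code uses only horizontal and vertical steps with four direction changes, so the heuristic accepts it. A rotated or perspective-distorted sign would need a more tolerant test than this one.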

A Study on Automatic Extraction of Buildings Using LIDAR with Aerial Imagery

  • Lee, Young-Jin;Cho, Woo-Sug;Jeong, Soo;Kim, Kyung-Ok
    • Proceedings of the KSRS Conference / 2003.11a / pp.241-243 / 2003
  • This paper presents an algorithm that automatically extracts buildings from among the many different features on the earth's surface by fusing LIDAR data with panchromatic aerial images. The proposed algorithm consists of three stages: a point-level process, a polygon-level process, and a parameter-space-level process. In the first stage, we eliminate gross errors and apply a local maxima filter to detect building candidate points in the raw laser scanning data. A grouping procedure then segments the raw LIDAR data, and the segmented data are polygonized by the encasing polygon algorithm developed in this research. In the second stage, we eliminate non-building polygons using several constraints such as area and circularity. In the last stage, all the polygons generated in the second stage are projected onto the aerial stereo images through collinearity condition equations. Finally, we fuse the projected encasing polygons with edges detected by image processing to refine the building segments. The experimental results showed that the RMSEs of building corners in X, Y and Z were ±8.1 cm, ±24.7 cm, and ±35.9 cm, respectively.


The Verification of Accuracy of 3D Body Scan Data - Focused on the Cyberware WB4 Whole Body Scanner - (3차원 인체 스캔 데이터의 정확도 검증에 관한 연구 - Cyberware의 WB4 스캐너를 중심으로 -)

  • Park, Sun-Mi;Nam, Yun-Ja
    • Journal of the Korea Fashion and Costume Design Association / v.14 no.1 / pp.81-96 / 2012
  • The purpose of this study is to provide fundamental information for the standardization of 3D body measurement. This research analyzes errors that occur in the process of extracting body sizes from 3D body scan data. First, an analysis of the basic calibration state of the 3D body scanner showed that the number of points in each section was almost the same, while the left-right and front-back coordinates of the center of gravity were not, indicating unstable data. Nevertheless, the latter does not influence cylinder dimensions such as width and circumference. Next, we analyzed variations in the point coordinates of scan data from a nude mannequin produced by life casting. The result showed large deviations for complicated or horizontal sections, including reference points beyond the proper distance from the centers of the four cameras. For the mannequin's size, accuracy proved comparatively high: measurement errors in the height, width, depth, and length dimensions all fell within allowable limits, with the sole exception of chest depth, whereas circumference dimensions showed many measurement errors. Second, an accuracy analysis of the automatic extraction and identification program algorithm showed that a semi-automatic measurement program performs better than an automatic one. While both are very accurate in parts related to the crotch, they are not in armpit-related parts. Therefore, when extracting human body sizes from 3D scan data, what really matters appears to be the parts related to the armpits.
