• Title/Summary/Keyword: Automatic Data Extraction

Search results: 312

Trend Analysis using Spatial-Temporal Visualization of Event Information based on Social Media (소셜 미디어에 기반한 이벤트 정보의 시공간적 시각화를 통한 추이 분석)

  • Oh, Hyo-Jung;Yun, Bo-Hyun;Yoo, Cheol-Jung;Kim, Yong
    • Journal of Internet Computing and Services / v.15 no.6 / pp.65-75 / 2014
  • The main focus of this paper is to analyze trends in event information across a variety of mass media by visualizing it graphically along the axes of time and location. In particular, continuity analysis based on user-generated social media can reflect the social impact of an event as its time and location change, together with the direction of those changes. To reveal the characteristics of continuous events, we survey a data set collected from news articles and tweets over two years. Based on case studies on 'disease' and 'leisure', we verify the effectiveness and usefulness of the proposed method. Even for events that occurred during the same period, we show directional changes that have high impact in social media and reflect user interest, in contrast with fact-based continuous visualization results.

Automatic Construction of Reduced Dimensional Cluster-based Keyword Association Networks using LSI (LSI를 이용한 차원 축소 클러스터 기반 키워드 연관망 자동 구축 기법)

  • Yoo, Han-mook;Kim, Han-joon;Chang, Jae-young
    • Journal of KIISE / v.44 no.11 / pp.1236-1243 / 2017
  • In this paper, we propose a novel way of producing keyword networks, named LSI-based ClusterTextRank, which extracts significant key words from a set of clusters with a mutual information metric, and constructs an association network using latent semantic indexing (LSI). The proposed method reduces the dimension of documents through LSI, decomposes documents into multiple clusters through k-means clustering, and expresses the words within each cluster as a maximal spanning tree graph. The significant key words are identified by evaluating their mutual information within clusters. Then, the method calculates the similarities between the extracted key words using the term-concept matrix, and the results are represented as a keyword association network. To evaluate the performance of the proposed method, we used travel-related blog data and showed that the proposed method outperforms the existing TextRank algorithm by about 14% in terms of accuracy.
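The abstract says the words within each cluster are expressed as a maximal spanning tree. As a rough illustration of that one step only, the sketch below builds a maximum spanning tree with Kruskal's algorithm. The edge weights and the function name `maximum_spanning_tree` are hypothetical; the paper's actual weighting (mutual information, LSI similarities) and tree construction are not given in the abstract.

```python
def maximum_spanning_tree(edges):
    """Return the edges of a maximum spanning forest (Kruskal's algorithm).

    edges: iterable of (weight, word_a, word_b) tuples.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    # Visit the heaviest edges first; keep an edge only if it joins two components.
    for weight, a, b in sorted(edges, key=lambda e: e[0], reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            tree.append((weight, a, b))
    return tree


# Hypothetical association weights between words in one cluster.
example_edges = [
    (0.9, "hotel", "booking"),
    (0.8, "hotel", "beach"),
    (0.3, "booking", "beach"),
    (0.7, "beach", "flight"),
    (0.2, "hotel", "flight"),
]
mst = maximum_spanning_tree(example_edges)
```

With four words, the tree keeps the three strongest edges that do not close a cycle.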

Automatic Extract User Intention from Web Search Log (웹 정보 검색 이력을 이용한 사용자 의도 자동 추출)

  • Park, Kinam;Jung, Soonyoung;Suh, Taewon;Ji, Hyesung;Lee, Taemin;Lim, Heuiseok
    • The Journal of Korean Association of Computer Education / v.12 no.6 / pp.21-32 / 2009
  • This paper proposes a method for automatically extracting a user's intention, and an implementation of an intention map that helps users obtain appropriate search results by accurately capturing their information need. The method selects user intentions based on the search history of previous users who issued the same queries and extracts them using a clustering algorithm and a user intention extraction algorithm. The extracted intentions are represented in an intention map based on a theory of knowledge representation. To analyze the efficiency of the intention map, we extracted user intentions from 2,600 search history records provided by a domestic commercial search engine. The experimental results indicate that user satisfaction with intention map search was higher than with general search engines, and that the difference was statistically significant.

AUTOMATIC DETECTION AND EXTRACTION ALGORITHM OF INTER-GRANULAR BRIGHT POINTS

  • Feng, Song;Ji, Kai-Fan;Deng, Hui;Wang, Feng;Fu, Xiao-Dong
    • Journal of The Korean Astronomical Society / v.45 no.6 / pp.167-173 / 2012
  • Inter-granular Bright Points (igBPs) are small-scale objects in the solar photosphere that can be seen within dark inter-granular lanes. We present a new algorithm to automatically detect and extract igBPs. The algorithm employs a Laplacian and Morphological Dilation (LMD) technique. It involves three basic processing steps: (1) obtaining candidate "seed" regions with the Laplacian; (2) determining the boundary and size of igBPs by morphological dilation; (3) discarding brighter granules by a probability criterion. To validate the algorithm, we used observations from the Dutch Open Telescope (DOT) collected on April 12, 2007. They comprise 180 high-resolution images, each with an $85 \times 68\ \mathrm{arcsec}^2$ field of view (FOV). Two important results are obtained: first, the identification rate of igBPs reaches 95% and is higher than previous results; second, the diameter distribution is $220 \pm 25\ \mathrm{km}$, which is fully consistent with previously published data. We conclude that the presented algorithm can detect and extract igBPs automatically and effectively.
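The first two steps listed in the abstract (Laplacian seeds, then morphological dilation) can be sketched on a toy image as follows. This is a minimal illustration, not the authors' implementation: the 4-neighbour Laplacian kernel, the thresholds, the intensity constraint on dilation, and all function names are assumptions, and the probability criterion of step (3) is omitted.

```python
def laplacian(img):
    """4-neighbour discrete Laplacian; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = (img[y - 1][x] + img[y + 1][x]
                         + img[y][x - 1] + img[y][x + 1] - 4 * img[y][x])
    return out


def find_seeds(img, threshold):
    """Seed pixels: strongly negative Laplacian, i.e. sharp local brightness peaks."""
    lap = laplacian(img)
    return {(y, x)
            for y, row in enumerate(lap)
            for x, value in enumerate(row)
            if value < -threshold}


def dilate(seeds, img, min_intensity, iterations=1):
    """Grow the seed set by 4-neighbours, only into pixels >= min_intensity (assumed rule)."""
    h, w = len(img), len(img[0])
    region = set(seeds)
    for _ in range(iterations):
        grown = set(region)
        for y, x in region:
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and img[ny][nx] >= min_intensity:
                    grown.add((ny, nx))
        region = grown
    return region


# Toy 5x5 image with one bright point in the centre (hypothetical values).
image = [
    [1, 1, 1, 1, 1],
    [1, 1, 5, 1, 1],
    [1, 5, 9, 5, 1],
    [1, 1, 5, 1, 1],
    [1, 1, 1, 1, 1],
]
seeds = find_seeds(image, threshold=10)
region = dilate(seeds, image, min_intensity=5, iterations=2)
```

Here the centre pixel becomes the only seed, and dilation extends it to the four moderately bright neighbours, which gives an estimate of the bright point's boundary and size.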

Development of Automatic Extraction Model of Soil Erosion Management Area using ArcGIS Model Builder (ArcGIS Model Builder를 이용한 토양유실 우선관리 지역 선정 자동화 모형 개발)

  • Kum, Dong-Hyuk;Choi, Jae-Wan;Kim, Ik-Jae;Kong, Dong-Soo;Ryu, Ji-Chul;Kang, Hyun-Woo;Lim, Kyoung-Jae
    • Journal of The Korean Society of Agricultural Engineers / v.53 no.1 / pp.71-81 / 2011
  • Due to increased human activities and intensive rainfall events in watersheds, soil erosion and sediment transport have become pressing issues in many areas of the world. To evaluate soil erosion problems spatially and temporally, many computer models have been developed and evaluated over the years. However, it would not be reasonable to apply such a model to a watershed whose topography and environment differ to some degree from those it was developed for. Also, the source code of these models is not always publicly available for modification. The ArcGIS Model Builder provides an easy-to-use interface for developing a model by linking several processes and input/output data together. In addition, it makes it much easier to modify or enhance models developed by others. Thus, in this study a simple model was developed with the ArcGIS Model Builder tool to identify soil erosion hot spot areas. The tool was applied to a watershed to evaluate model performance. Sediment yield was estimated at 13.7 ton/ha/yr at the most severe soil erosion hot spot area in the study watershed. As shown in this study, the ArcGIS Model Builder is an efficient tool for developing simple models without professional programming skills. The model developed in this study is available for free download at http://www.EnvSys.co.kr/~sateec/toolbox. The tool can be easily modified for further enhancement with simple operations within the ArcGIS Model Builder interface. Although only a very simple soil erosion and sediment yield model was developed with Model Builder and applied to the study watershed to identify soil erosion hot spot areas, the approach shown in this study provides insights into model development and code sharing for researchers in related areas.

A Study on Automation about Painting the Letters to Road Surface

  • Lee, Kyong-Ho
    • Journal of the Korea Society of Computer and Information / v.23 no.1 / pp.75-84 / 2018
  • In this study, we attempted to use information and communication technology to automate the painting of characters on road surfaces, which is currently done manually. First, we reviewed the current regulations on painting letters and characters on roads, with reference to the Road Mark Installation Management Manual of the National Police Agency. For the graphemes, we adopted a new design using connected components in Gothic print characters, which falls within the range permitted by the manual. We also enabled the automated program to recognize graphemes by means of feature points: isolated points, end points, points where two lines meet, and points where three or more lines meet. For the database, we built a grapheme database for plotting information, classified characters by the arrangement of their graphemes and the layers the graphemes form within each character, and used these data to build a character shape database for character plotting. We measured the layers and arrangement of the graphemes constituting each character using 1) the position of the center of gravity and 2) the grapheme information acquired through vertical exploration from the center of gravity of each grapheme. We identified and compared the group to which each character in the database belonged, and recognized characters using the information gathered in this way. We analyzed the input characters using this analysis method and database, and then converted them into plotting information. The results showed that plotting was performed after correction.

A Study on Speechreading about the Korean 8 Vowels (한국어 8모음 자동 독화에 관한 연구)

  • Lee, Kyong-Ho;Yang, Ryong;Kim, Sun-Ok
    • Journal of the Korea Society of Computer and Information / v.14 no.3 / pp.173-182 / 2009
  • In this paper, we study parameter extraction and the implementation of a speechreading system to recognize the eight Korean vowels. Facial features are detected by amplifying and reducing image values and comparing the values represented in various color spaces. The positions of the eyes and nose, the inner boundary of the lips, the outer boundary of the upper lip, and the outline of the teeth are found as features. From their analysis, the area of the inner lip, the height and width of the inner lip, the ratio of the tooth outline length to the inner mouth area, and the distance between the nose and the outer boundary of the upper lip are used as parameters. A total of 2,400 data items were gathered and analyzed. Based on this analysis, a neural network was constructed and recognition experiments were performed. Five normal subjects were sampled for the experiment. The observational error between samples was corrected by normalization. The experiments show very encouraging results regarding the usefulness of the parameters.

Application of the Developed Pre- and Post-Processing System to Yongdamdam Watershed using PRMS Hydrological Model (수문학적 유역특성자료 자동화 추출 및 분석시스템 적용 (II) -PRMS 모형을 이용한 용담댐 유역을 대상으로-)

  • Kwon, Hyung-Joong;Hwang, Eui-Ho;Lee, Geun-Sang;Yu, Byeong-Hyeok;Koh, Deuk-Koo
    • Journal of the Korean Association of Geographic Information Studies / v.11 no.3 / pp.13-22 / 2008
  • The objective of this study is to evaluate the applicability of PRMS input parameters extracted by KGIS-Hydrology for the Yongdam-Dam watershed. KGIS-Hydrology is a system for the automatic extraction and analysis of watershed characteristic data. Input parameters of PRMS were generated from GIS data (DEM, soil, forest type, etc.) using KGIS-Hydrology. Multi-temporal meteorological data from the Jangsu station of the KMA (Korea Meteorological Administration) were used for all simulation periods. Input parameters of PRMS were optimized using observed runoff data from the Yongdam-Dam station (1966-2001) and validated using observed runoff data from the Yongdam-Dam station (2002-2006, Yongdam-Dam watershed). The results showed that the simulated flows were very close to the observed flows at the Yongdam-Dam (2002-2006) and Donghyang (2001-2004) stations, with model efficiencies of 0.49-0.83 and 0.57-0.75, respectively.
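The abstract reports "model efficiencies" without naming the statistic. In hydrological modelling this commonly refers to the Nash-Sutcliffe efficiency, and the sketch below assumes that interpretation; the function name and the sample flow values are hypothetical and are not taken from the paper.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    1.0 means a perfect fit; 0.0 means the simulation is no better than
    always predicting the mean of the observations.
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("observed and simulated must be non-empty and equally long")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - residual / variance


# Hypothetical runoff values, for illustration only.
observed_flow = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated_flow = [1.1, 1.9, 3.2, 3.8, 5.1]
nse = nash_sutcliffe_efficiency(observed_flow, simulated_flow)
```

Under this assumption, the reported ranges of 0.49-0.83 and 0.57-0.75 would mean the simulated flows explain roughly half to four fifths of the variance in the observed flows.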

The Construction of GIS-based Flood Risk Area Layer Considering River Bight (하천 만곡부를 고려한 GIS 기반 침수지역 레이어 구축)

  • Lee, Geun-Sang;Yu, Byeong-Hyeok;Park, Jin-Hyeog;Lee, Eul-Rae
    • Journal of the Korean Association of Geographic Information Studies / v.12 no.1 / pp.1-11 / 2009
  • Rapid visualization of the downstream flood area resulting from dam discharge in the flood season is very important in dam management. The overlay zone at river bights should be removed in order to represent the flood area efficiently based on the flood stage modeled in river channels. This study applied a drainage enforcement algorithm to visualize the flood area considering river bights, coupling the Coordinate Operation System for Flood control In Multi-reservoir (COSFIM) with the Flood Wave routing model (FLDWAV). The drainage enforcement algorithm is a kind of interpolation that benefits hydrological process studies by removing spurious sinks from the terrain in automatic drainage algorithms. This study presented a mapping technique for the flood area layer considering river bights downstream of Namgang-Dam, and developed a system based on ArcObjects components to execute this process automatically. The automatic extraction system for the flood area layer could save time in flood inundation visualization work that relies on large volumes of data. In addition, coupling the flood area layer with IKONOS satellite imagery provided realistic information for flood disaster work.

An Efficient Numeric Character Segmentation of Metering Devices for Remote Automatic Meter Reading (원격 자동 검침을 위한 효과적인 계량기 숫자 분할)

  • Toan, Vo Van;Chung, Sun-Tae;Cho, Seong-Won
    • Journal of Korea Multimedia Society / v.15 no.6 / pp.737-747 / 2012
  • Recently, to support automatic meter reading for conventional metering devices, image processing-based approaches that recognize the numeric readings in captured meter images have attracted the interest of many researchers. Numeric character segmentation is a critical process for successful recognition. In this paper, we propose an efficient numeric character segmentation method that segments numeric characters well for any metering device type under diverse illumination environments. The proposed method consists of two consecutive stages: detection of the number area containing all numbers as a tight ROI (Region of Interest), and segmentation of the numeric characters in the ROI. Detection of the tight ROI is achieved in two steps: extraction of a rough ROI using horizontal line segments after illumination enhancement preprocessing, and tightening of the rough ROI by clipping based on vertical and horizontal projections of the binarized ROI. Numeric character segmentation in the detected ROI is achieved stably in two processes: 'vertical segmentation of each number region' and 'number segmentation within each vertically segmented number region'. Experiments on an in-house meter image database containing images of various meter types with low contrast, low intensity, shadow, and saturation show that the proposed method performs effectively for any metering device type under diverse illumination environments.
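The projection-based steps mentioned in the abstract can be illustrated with a vertical projection profile over a binarized ROI. The sketch below is a generic version of that technique, not the paper's method: the function names, the `min_count` parameter, and the toy ROI are assumptions, and the illumination preprocessing and ROI detection stages are not covered.

```python
def vertical_projection(binary):
    """Count foreground pixels (1s) in each column of a binarized image."""
    return [sum(column) for column in zip(*binary)]


def segment_columns(binary, min_count=1):
    """Split the image into column ranges [start, end) separated by empty columns."""
    profile = vertical_projection(binary)
    segments = []
    start = None
    for x, count in enumerate(profile):
        if count >= min_count and start is None:
            start = x                      # a character region begins
        elif count < min_count and start is not None:
            segments.append((start, x))    # the region ended at the previous column
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


# Toy binarized ROI with three digit-like blobs (hypothetical data).
roi = [
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 1, 0, 0, 0, 1],
]
digit_ranges = segment_columns(roi)
```

Each returned range marks the columns occupied by one candidate digit. Applying the same idea to rows (a horizontal projection) is one way to clip the ROI more tightly.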