• Title/Summary/Keyword: 자동정보 추출 (automatic information extraction)


Automatic Electronic Medical Record Generation System using Speech Recognition and Natural Language Processing Deep Learning (음성인식과 자연어 처리 딥러닝을 통한 전자의무기록자동 생성 시스템)

  • Hyeon-kon Son;Gi-hwan Ryu
    • The Journal of the Convergence on Culture Technology / v.9 no.3 / pp.731-736 / 2023
  • Recently, the medical field has mandated Electronic Medical Record (EMR) and Electronic Health Record (EHR) systems that computerize and manage medical records, and has distributed them throughout the medical industry so that patients' past records can be used in subsequent care. However, the conversations between medical professionals and patients during general consultations and counseling sessions are not separately recorded or stored, so this additional patient information cannot be used efficiently. We therefore propose an electronic medical record system that uses speech recognition and natural language processing deep learning to store the conversation between a medical professional and a patient as text, automatically extract and summarize the important consultation content, and generate the electronic medical record. The system first acquires text by applying speech recognition to the consultation. The acquired text is split into sentences, and the importance of the keywords in each sentence is calculated. Based on these importance scores, the system ranks the sentences and summarizes them to create the final electronic medical record. Quantitative analysis verifies that the proposed system performs well.
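
The sketch below (a minimal illustration, not the authors' code) shows the keyword-importance summarization step described above; the term-frequency weighting and the function names are assumptions, since the abstract does not specify the exact scoring formula.

```python
# Extractive summarization sketch: score keywords by frequency, rank
# sentences by average keyword importance, keep the top-k sentences.
from collections import Counter

def summarize(sentences: list[str], top_k: int = 3) -> list[str]:
    tokens_per_sentence = [s.lower().split() for s in sentences]
    # Keyword importance approximated as corpus-wide term frequency
    # (a stand-in for the paper's unspecified importance measure).
    importance = Counter(t for tokens in tokens_per_sentence for t in tokens)

    def score(tokens: list[str]) -> float:
        return sum(importance[t] for t in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(tokens_per_sentence[i]),
                    reverse=True)
    # Return the top-k sentences in their original order.
    return [sentences[i] for i in sorted(ranked[:top_k])]
```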

A Methodology for Extracting Shopping-Related Keywords by Analyzing Internet Navigation Patterns (인터넷 검색기록 분석을 통한 쇼핑의도 포함 키워드 자동 추출 기법)

  • Kim, Mingyu;Kim, Namgyu;Jung, Inhwan
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.123-136 / 2014
  • Recently, online shopping has developed further as Internet use and a variety of smart mobile devices become more prevalent. The growth in the scale of such shopping has led to the creation of many Internet shopping malls. Consequently, competition among online retailers is increasingly fierce, and many Internet shopping malls make significant efforts to attract online users to their sites. One such effort is keyword marketing, whereby a retail site pays a fee to expose its link to potential customers when they enter a specific keyword on an Internet portal site. The price of each keyword is generally estimated from the keyword's frequency of appearance. However, it is widely accepted that keyword prices cannot be based solely on frequency, because many keywords appear frequently yet have little relationship to shopping. This implies that it is unreasonable for an online shopping mall to spend a great deal on certain keywords simply because people use them frequently. Therefore, from the perspective of shopping malls, a specialized process is required to extract meaningful keywords, and demand for automating this extraction is increasing as retailers seek better online sales performance. In this study, we propose a methodology that automatically extracts only shopping-related keywords from the entire set of search keywords used on portal sites. We define a shopping-related keyword as one used directly before shopping behavior; in other words, only search keywords whose search-results page leads to a shopping-related page are extracted. The rankings of the extracted keywords are then compared with the rankings of the entire keyword set. Two types of data are used in the experiment: web browsing history from July 1, 2012 to June 30, 2013, and site information. The browsing history was obtained from a website-ranking service, and the search keywords come from the largest portal site in Korea; the original sample contains 150 million transaction logs. First, portal sites are selected, and the search keywords used on them are extracted by simple parsing and ranked by frequency. The experiment uses approximately 3.9 million search results from Korea's largest search portal, from which a total of 344,822 search keywords were extracted. Next, using the web browsing history and site information, the shopping-related keywords were taken from the entire keyword set, yielding 4,709 shopping-related keywords. For performance evaluation, we compared the hit ratios of all search keywords with those of the shopping-related keywords. To do so, we extracted 80,298 search keywords from several Internet shopping malls and chose the top 1,000 as the set of true shopping keywords. We then measured precision, recall, and F-score (the harmonic mean of precision and recall) for the entire keyword set and for the shopping-related keywords. The precision, recall, and F-score of the shopping-related keywords derived by the proposed methodology were all higher than those of the entire keyword set. This study thus proposes a scheme that obtains shopping-related keywords in a relatively simple manner: we can extract them simply by examining transactions whose next visit is a shopping mall. The resulting shopping-related keyword set should be a useful asset for shopping malls that participate in keyword marketing, and the methodology can easily be applied to constructing other special-area keyword sets as well.
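
As an illustration of the core filtering rule, the sketch below flags a search keyword as shopping-related when the next page visited in the same user's browsing history belongs to a shopping-mall domain; the log format and the SHOPPING_DOMAINS list are hypothetical.

```python
# Shopping-keyword filter sketch: a keyword counts as shopping-related if
# the transaction immediately after the search leads to a shopping mall.
from collections import Counter

SHOPPING_DOMAINS = {"examplemall.com", "shop.example.co.kr"}  # hypothetical

def shopping_keywords(logs: list[dict]) -> Counter:
    """logs: time-ordered records like
    {"user": "u1", "url": "http://...", "keyword": "sneakers" or None}."""
    hits = Counter()
    for prev, nxt in zip(logs, logs[1:]):
        same_user = prev["user"] == nxt["user"]
        # Crude domain extraction, sufficient for the sketch.
        domain = nxt["url"].split("/")[2] if "://" in nxt["url"] else nxt["url"]
        if same_user and prev.get("keyword") and domain in SHOPPING_DOMAINS:
            hits[prev["keyword"]] += 1
    return hits
```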

Automated Analyses of Ground-Penetrating Radar Images to Determine Spatial Distribution of Buried Cultural Heritage (매장 문화재 공간 분포 결정을 위한 지하투과레이더 영상 분석 자동화 기법 탐색)

  • Kwon, Moonhee;Kim, Seung-Sep
    • Economic and Environmental Geology / v.55 no.5 / pp.551-561 / 2022
  • Geophysical exploration methods are very useful for generating high-resolution images of underground structures, and they can be applied to the investigation of buried cultural properties and the determination of their exact locations. In this study, image feature extraction and image segmentation methods were applied to automatically distinguish the structures of buried relics in high-resolution ground-penetrating radar (GPR) images obtained at the center of the ancient Silla Kingdom in Gyeongju, South Korea. The main purpose of the feature extraction analyses is to identify the circular features arising from building remains and the linear features arising from ancient roads and fences. Feature extraction is implemented by applying the Canny edge detector and the Hough transform: we apply the Hough transform to the edge image produced by the Canny algorithm in order to determine the locations of the target features. However, the Hough transform requires different parameter settings for each survey sector. For image segmentation, we applied connected-component labeling and object-based image analysis using the Orfeo Toolbox (OTB) in QGIS. The connected-component-labeled image shows that the signals associated with the target buried relics are effectively connected and labeled, although multiple labels are often assigned to a single structure in the given GPR data. Object-based image analysis was conducted using Large-Scale Mean-Shift (LSMS) image segmentation: a vector layer containing pixel statistics for each segmented polygon was estimated first and then used to build a training-validation dataset by assigning each polygon either to a class associated with the buried relics or to a background class. With a random forest classifier, we find that the polygons in the LSMS segmentation layer can be successfully classified into buried-relic polygons and background polygons. We therefore propose that the automatic classification methods applied to the GPR images of buried cultural heritage in this study can help obtain consistent analysis results for planning excavations.
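
A minimal sketch of the feature-extraction stage follows, using OpenCV's Canny detector and probabilistic Hough transform together with connected-component labeling; all thresholds are placeholders, since the paper notes that the Hough parameters must be re-tuned for each survey sector.

```python
# Feature-extraction sketch: Canny edges + probabilistic Hough transform,
# then connected-component labeling of the edge image.
import cv2
import numpy as np

def extract_features(gpr_slice: np.ndarray):
    # Scale the GPR amplitude slice to 8-bit, as Canny requires.
    img8 = cv2.normalize(gpr_slice, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    edges = cv2.Canny(img8, 50, 150)  # sector-dependent thresholds
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                            threshold=60, minLineLength=30, maxLineGap=5)
    # Group edge pixels into candidate relic structures; as noted above,
    # a single structure may end up with multiple labels.
    n_labels, labels = cv2.connectedComponents(edges)
    return lines, n_labels, labels
```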

Automatic Matching of Building Polygon Dataset from Digital Maps Using Hierarchical Matching Algorithm (계층적 매칭 기법을 이용한 수치지도 건물 폴리곤 데이터의 자동 정합에 관한 연구)

  • Yeom, Junho;Kim, Yongil;Lee, Jeabin
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.33 no.1 / pp.45-52 / 2015
  • The interoperability of multi-source data has become more important as various digital maps are produced by public institutions and private enterprises. In this study, an automatic matching algorithm for multi-source building data based on hierarchical matching is proposed. First, we divide the digital maps into blocks and perform a primary geometric registration of the buildings with the ICP algorithm. Corresponding building pairs are then determined by evaluating the similarity of their overlap area, with the matching threshold for the similarity derived automatically by Otsu's binary thresholding. After this first matching, we extract candidate mismatched buildings whose similarity lies near the threshold, conduct a secondary ICP matching, and make the final matching decision using turning-angle function analysis. For evaluation, the proposed method was applied to representative public digital maps: the road-name address map and digital topographic map 2.0. As a result, the F-measures for matched and non-matched buildings increased by 2% and 17%, respectively. The proposed method is therefore effective for matching building polygons from multi-source digital maps.
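
The sketch below illustrates the overlap-similarity test with an automatically derived Otsu threshold, as described above; the IoU-style similarity measure and the function names are assumptions, not the authors' exact formulation.

```python
# Building-pair matching sketch: score candidate pairs by overlap area,
# derive the matching threshold with Otsu's method instead of by hand.
import numpy as np
from shapely.geometry import Polygon
from skimage.filters import threshold_otsu

def overlap_similarity(a: Polygon, b: Polygon) -> float:
    union = a.union(b).area
    return a.intersection(b).area / union if union > 0 else 0.0

def match_pairs(pairs: list[tuple[Polygon, Polygon]]):
    sims = np.array([overlap_similarity(a, b) for a, b in pairs])
    t = threshold_otsu(sims)  # data-driven matching threshold
    return [(p, s) for p, s in zip(pairs, sims) if s >= t]
```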

Development of a Prototype Automatic Sorting System for Dried Oak Mushrooms (건표고 자동선별을 위한 시작시스템 개발)

  • Hwang, H.;Lee, C.H.
    • Journal of Biosystems Engineering / v.21 no.4 / pp.414-421 / 1996
  • In Korea and Japan, dried oak mushrooms are graded into 12 to 16 classes according to their external quality. Grading is currently done by hand: expert inspectors judge randomly drawn samples. The external quality factors that determine the grade are distributed over both the cap and the gill side of the mushroom. This paper describes the structure, functions, and performance of a prototype system for the automatic grading and sorting of dried oak mushrooms based on computer image processing. The prototype consists of a vibratory feeder, a turn-over device, and conveyors for automated feeding and handling; two sets of computer vision systems; an IBM PC/AT-compatible computer with a digital I/O board for overall system control; and a PLC for driving the pneumatic cylinders. For grading efficiency and real-time operation, grading is performed independently in two stages by the two vision systems, depending on whether the cap or the gill side of the conveyed mushroom faces upward. The first vision unit classifies the four high-quality grades from cap-surface images, and the second unit classifies medium- and low-quality mushrooms into eight grades from gill-surface images. A previously developed neural-network grading algorithm was applied to the prototype for real-time image processing. The prototype achieved a grading accuracy of over 88%, and sorting took about 0.7 seconds per mushroom owing to the actuation limits of the pneumatic system. For a single sorting line, the prototype is expected to handle roughly 5,000 mushrooms per hour, provided the mushrooms are continuously fed to the first stage with their caps facing upward.
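
The two-stage grading flow can be pictured with the following illustrative routine (not the authors' code); the classifier callbacks stand in for the paper's neural-network grading algorithm, and the grade numbering is an assumption.

```python
# Two-stage grading sketch: route each mushroom to the vision stage that
# matches its orientation, yielding one of 12 grades overall.

def route_and_grade(image, cap_side_up: bool, grade_cap, grade_gill) -> int:
    """grade_cap / grade_gill: classifiers returning a 1-based grade index
    within their own stage (4 high-quality grades, 8 medium/low grades)."""
    if cap_side_up:
        return grade_cap(image)       # grades 1-4 (high quality)
    return 4 + grade_gill(image)      # grades 5-12 (medium/low quality)
```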


Reducing latency of neural automatic piano transcription models (인공신경망 기반 저지연 피아노 채보 모델)

  • Dasol Lee;Dasaem Jeong
    • The Journal of the Acoustical Society of Korea / v.42 no.2 / pp.102-111 / 2023
  • Automatic Music Transcription (AMT) is the task of detecting and recognizing musical note events in a given audio recording. In this paper, we focus on reducing the latency of real-time AMT systems for piano music. Although neural AMT models have been adapted for real-time piano transcription, they suffer from high latency, which hinders their usefulness in interactive scenarios. To tackle this issue, we explore several techniques for reducing the intrinsic latency of a neural network for piano transcription: reducing the window and hop sizes of the Fast Fourier Transform (FFT), modifying the kernel sizes of the convolutional layers, and shifting labels along the time axis to train the model to predict onsets earlier. Our experiments demonstrate that combining these approaches can lower latency while maintaining high transcription accuracy. Specifically, our modified models achieved note F1 scores of 92.67 % and 90.51 % at latencies of 96 ms and 64 ms, respectively, compared to the baseline model's note F1 score of 93.43 % at a latency of 160 ms. This methodology can be used to train AMT models for various interactive scenarios, including real-time feedback for piano education.
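
One plausible way to account for the intrinsic latency these techniques manipulate is sketched below: with centered FFT frames, the look-ahead grows with half the window size plus any future-context frames times the hop size. The sample rate and the accounting itself are illustrative assumptions, not the paper's exact formula.

```python
# Latency-accounting sketch for a frame-wise transcription model.
SAMPLE_RATE = 16_000  # Hz, assumed for illustration

def intrinsic_latency_ms(win: int, hop: int, future_frames: int) -> float:
    """Delay between a note onset and the last audio sample the model
    needs before it can emit a prediction for that onset."""
    look_ahead = win // 2 + hop * future_frames
    return 1000 * look_ahead / SAMPLE_RATE

# Under these assumed settings a 2048-sample window with a 512-sample hop
# and one frame of future context gives 96 ms of intrinsic latency.
print(intrinsic_latency_ms(2048, 512, 1))  # 96.0
```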

Automatic Inference of Standard BOQ(Bill of Quantities) Items using BIM and Ontology (BIM과 온톨로지를 활용한 표준내역항목 추론 자동화)

  • Lee, Seul-Ki;Kim, Ka-Ram;Yu, Jung-Ho
    • Korean Journal of Construction Engineering and Management / v.13 no.3 / pp.99-108 / 2012
  • Only rough design information is available from a BIM (Building Information Model)-based schematic design, so it is difficult to obtain sufficient information for generating a BOQ (bill of quantities). As with 2D design, the results also depend on the choices made by the cost estimator. Moreover, most research on BIM-based cost estimation focuses on quantity takeoff, and the selection of work items for generating the BOQ has received insufficient attention. This paper therefore presents an automated process for inferring the work items of a BOQ using an ontology. The proposed process and ontology are validated by applying them to tiling construction. If the proposed process is adopted, it is expected to serve as the basis for a BOQ generation method that produces consistent results by removing the cost estimator's arbitrary decisions.
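
A hedged sketch of ontology-based work-item inference with rdflib is shown below; the namespace, the properties (ex:hasMaterial, ex:impliesWorkItem), and the tiling_ontology.ttl file are hypothetical stand-ins for the paper's tiling-construction ontology.

```python
# Ontology-inference sketch: query which standard BOQ work items are
# implied by a BIM element's material and location properties.
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/boq#")  # hypothetical namespace

g = Graph()
g.bind("ex", EX)
g.parse("tiling_ontology.ttl", format="turtle")  # assumed ontology file

query = """
PREFIX ex: <http://example.org/boq#>
SELECT ?workItem WHERE {
    ?element ex:hasMaterial ex:CeramicTile ;
             ex:hasLocation ex:Bathroom .
    ?rule ex:matchesMaterial ex:CeramicTile ;
          ex:matchesLocation ex:Bathroom ;
          ex:impliesWorkItem ?workItem .
}
"""
for row in g.query(query):
    print(row.workItem)  # inferred standard BOQ items
```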

The Reference Identifier Matching System for Developing Reference Linking Service (참조연계 서비스 구현을 위한 참고문헌 식별자 매칭 시스템)

  • Lee, Yong-Sik;Lee, Sang-Gi
    • Journal of Information Management / v.41 no.3 / pp.191-209 / 2010
  • A reference linking service, which connects different information resources to one another, requires building a reference database and matching identifiers. Many overseas agencies, such as CrossRef, PubMed, and Web of Science, have developed reference linking services using automatic tools such as Inera's eXtyles and Parity Computing's Reference Extractor, built on identifiers such as DOI and PMID. In Korea, various agencies, including KISTI (Korea Institute of Science and Technology Information) and KRF (Korea Research Foundation), are constructing reference databases, but because each research community adopts a different reference formatting style, collecting the data and building the databases is difficult. In this paper, we developed a citation matcher system that automatically parses reference strings into metadata and matches identifiers such as DOI, PMID, and KOI. It improves the efficiency of reference database construction.
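
An illustrative sketch of the parsing step in such a citation matcher appears below; the regular expression assumes one simple reference style and is far looser than a production parser, and the field names are assumptions.

```python
# Reference-string parsing sketch: split a citation into rough metadata
# fields that can then be queried against CrossRef (DOI) or PubMed (PMID).
import re

# Assumes "Authors. Title. Journal, vol(no), pages, year" for illustration.
REF = re.compile(
    r"^(?P<authors>[^.]+)\.\s+(?P<title>[^.]+)\.\s+(?P<journal>[^,]+),\s*"
    r"(?P<volume>\d+)\((?P<issue>\d+)\),\s*(?P<pages>[\d-]+),\s*(?P<year>\d{4})"
)

def parse_reference(ref: str) -> dict | None:
    m = REF.match(ref)
    return m.groupdict() if m else None

print(parse_reference(
    "Lee Y, Lee S. Reference matching. J Inf Manage, 41(3), 191-209, 2010"
))
```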

XML Schema Model of Great Staff Music Score using the Integration Method (통합 방식을 이용한 대보표 악보의 XML 스키마 모델)

  • 김정희;곽호영
    • Journal of the Korea Institute of Information and Communication Engineering / v.7 no.2 / pp.302-313 / 2003
  • DTD (Document Type Definition) definitions for music scores have been widely studied for various applications, and methods for automatically transforming defined DTDs into XML Schemas are in progress. Existing DTD structures, however, express musical information in element-by-element formats. In this paper, exploiting the fact that the measure is the basic component of a score, we propose a method that expresses musical information as continuous string values, and we model the corresponding XML Schema. We also present a mechanism for extracting musical information from XML instances expressed with the proposed method. As a result, an XML Schema holding continuous string values could be defined, and instances obtained by the proposed method increase efficiency through simpler XPath expressions and fewer search steps than the previous method. The representation can also be written directly by a human, and the instance size decreases.
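
The one-step extraction enabled by storing each measure as a continuous string can be sketched as follows; the element and attribute names are illustrative, not the paper's actual schema.

```python
# XPath-extraction sketch: with one string value per measure, a single
# short XPath retrieves the music, versus navigating note-by-note in an
# element-per-note schema.
from lxml import etree

xml = b"""
<score>
  <measure n="1">C4q D4q E4q F4q</measure>
  <measure n="2">G4h G4h</measure>
</score>
"""
root = etree.fromstring(xml)

for value in root.xpath("/score/measure/text()"):
    print(value)  # the full content of each measure in one step
```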

An Experimental Study on Feature Selection Using Wikipedia for Text Categorization (위키피디아를 이용한 분류자질 선정에 관한 연구)

  • Kim, Yong-Hwan;Chung, Young-Mee
    • Journal of the Korean Society for Information Management / v.29 no.2 / pp.155-171 / 2012
  • In text categorization, core terms of an input document are rarely selected as classification features if they do not occur in the training document set, and synonymous terms expressing the same concept are usually treated as different features. This study aims to improve text categorization performance by integrating synonyms into a single feature and by replacing input terms that are absent from the training set with the most similar term occurring in the training documents, using Wikipedia. For the selection of classification features, experiments were performed under various settings composed of three conditions: the use of category information for non-training terms, the part of Wikipedia used for measuring term-term similarity, and the type of similarity measure. The categorization performance of a kNN classifier improved by 0.35~1.85% in $F_1$ value across all experimental settings when non-training terms were replaced by the training term with the highest similarity above a threshold. Although the improvement is not as large as expected, several semantic as well as structural features of Wikipedia can be used to select more effective classification features.
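
The replacement rule described above can be sketched as follows; the similarity callback is a placeholder for the paper's Wikipedia-derived measures, and the threshold value is an assumption.

```python
# Term-replacement sketch: swap each out-of-vocabulary term for the most
# similar training-vocabulary term, but only above a similarity threshold.

def replace_terms(doc_terms: list[str], vocab: set[str],
                  similarity, threshold: float = 0.5) -> list[str]:
    out = []
    for term in doc_terms:
        if term in vocab:
            out.append(term)
            continue
        best = max(vocab, key=lambda v: similarity(term, v))
        if similarity(term, best) >= threshold:
            out.append(best)  # substitute the nearest training term
        # terms below the threshold are dropped, since they cannot
        # serve as classification features anyway
    return out
```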