• Title/Summary/Keyword: vocabulary search


The LMOF Preprocessing Tool for Mapping Laboratory Vocabulary to LOINC in Clinical Document Architecture (임상문서표준규격내 검사실 용어의 LOINC 매핑을 위한 LMOF 전처리 도구)

  • Do, Hyoung-Ho; Kim, Il-Kon; Lee, Sung-Kee; Kwak, Yun-Sik
    • Journal of KIISE: Computer Systems and Theory / v.35 no.4 / pp.158-165 / 2008
  • LOINC (Logical Observation Identifiers Names and Codes) is a database and universal standard for identifying laboratory and clinical test results, developed and maintained by the Regenstrief Institute. Exchanging laboratory test results is one of the most important areas in EHR systems, so the terminology for laboratory test results has to be standardized. In this paper, we present a preprocessing tool that converts a local database in a healthcare organization to the LMOF format. The LMOF format is required by RELMA, and our work helps map laboratory test results to LOINC efficiently. Our proposed tool provides a user-friendly interface and a 15% keyword reduction in RELMA search compared with RELMA search without preprocessing.

A Study on the Online Service of Cultural Heritage Contents (문화유산 콘텐츠 온라인 서비스에 관한 연구)

  • Park, Ok Nam
    • Journal of Korean Society of Archives and Records Management / v.19 no.1 / pp.195-224 / 2019
  • Various studies have emphasized online services for the use and diffusion of content in the cultural heritage domain. This study investigates the status of content organization and information services in online cultural heritage services and suggests directions for improvement. It conducted case studies and expert interviews covering contents, search systems, additional services, and expansion services. It suggests an integrated information retrieval service for cultural heritage contents as well as the provision of high-quality content in various types. It also focuses on the flexibility of the search function through the content hierarchy, the expansion of access points through the construction of a controlled vocabulary, and authority data. As additional services, the study proposes a curation-based, user-customized service, the opening and sharing of data sets, and user participation.

1-Pass Semi-Dynamic Network Decoding Using a Subnetwork-Based Representation for Large Vocabulary Continuous Speech Recognition (대어휘 연속음성인식을 위한 서브네트워크 기반의 1-패스 세미다이나믹 네트워크 디코딩)

  • Chung Minhwa; Ahn Dong-Hoon
    • MALSORI / no.50 / pp.51-69 / 2004
  • In this paper, we present a one-pass semi-dynamic network decoding framework that inherits both the fast decoding speed of static network decoders and the memory efficiency of dynamic network decoders. Our method is based on a novel language model network representation that is essentially a finite state machine (FSM). The static network derived from the language model network [1][2] is partitioned into smaller subnetworks which are static by nature or self-structured. The whole network is dynamically managed so that the subnetworks required for decoding are cached in memory. The network is near-minimized by applying the tail-sharing algorithm. Our decoder is evaluated on a 25k-word Korean broadcast news transcription task. The tail-sharing algorithm reduces the search network itself by 73.4%. Compared with the equivalent static network decoder, the semi-dynamic network decoder increases decoding time by at most 6%, while it can be flexibly adapted to various memory configurations, with a minimal usage of 37.6% of the complete network size.


Improving Phoneme Recognition based on Gaussian Model using Bhattacharyya Distance Measurement Method (바타챠랴 거리 측정 기법을 사용한 가우시안 모델 기반 음소 인식 향상)

  • Oh, Sang-Yeob
    • Journal of Korea Multimedia Society / v.14 no.1 / pp.85-93 / 2011
  • Existing vocabulary recognition programs calculate general vector values from a database, so they cannot process phonemes that form during a search. Because they cannot create a model for phoneme data, the accuracy of the Gaussian model cannot be secured. Therefore, in this paper, we recommend using the Bhattacharyya distance measurement method based on the features of the phoneme, which allows us to improve the recognition rate by picking up accurate phonemes and minimizing the recognition of similar and erroneous phonemes. We test the Gaussian model optimization through a shared continuous probability distribution and confirm the heightened recognition rate. The Bhattacharyya distance measurement method suggested in this paper shows an average 1.9% improvement in performance compared with previous methods, and an average 2.9% improvement in recognition rate based on reliability.
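The abstract does not say how the distance is computed or applied, so the following is only a minimal sketch of the standard closed-form Bhattacharyya distance between two Gaussians, assuming diagonal covariances (an assumption made here for simplicity). The function names `bhattacharyya_diag` and `closest_phoneme`, and the idea of picking the phoneme model with the smallest distance, are hypothetical illustrations rather than the paper's method.

```python
import math


def bhattacharyya_diag(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians.

    With diagonal covariances the closed form separates per dimension:
    (m1 - m2)^2 / (8 * v) + 0.5 * ln(v / sqrt(v1 * v2)), with v = (v1 + v2) / 2.
    """
    distance = 0.0
    for m1, v1, m2, v2 in zip(mean1, var1, mean2, var2):
        avg_var = (v1 + v2) / 2.0
        distance += (m1 - m2) ** 2 / (8.0 * avg_var)               # mean separation term
        distance += 0.5 * math.log(avg_var / math.sqrt(v1 * v2))  # variance mismatch term
    return distance


def closest_phoneme(models, mean, var):
    """Return the name of the model nearest to the observed Gaussian (hypothetical use).

    `models` maps a phoneme name to a (mean, variance) pair of equal-length lists.
    """
    return min(models, key=lambda name: bhattacharyya_diag(mean, var, *models[name]))
```

The distance is zero for identical distributions and grows both when the means move apart and when the variances differ, which is why it can separate phoneme models that a mean-only measure would confuse.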

Theory and practice of alphabetical subject indexing (주제색인의 이론과 실제)

  • 윤구호
    • Journal of Korean Library and Information Science Society / v.10 / pp.95-131 / 1983
  • An index is a systematic guide to items contained in, or concepts derived from, a collection. Thus, it is represented as a paired set of index terms ($t$) and documents ($D$): $I = \{(t, D) \mid t \in V, D \in W\}$, where $V$ is the index vocabulary and $W$ is the document collection. Indexing is the process of analysing the informational content of records of knowledge and expressing that content in the language of the indexing system. It involves: 1) selecting indexable concepts in a document; and 2) expressing these concepts in the language of the indexing system (as index entries) and an ordered list. The indexing process involves technical, semantic and syntactic problems. Technical problems are related to the accuracy of indexing, which is primarily governed by the indexer's ability to analyse the subject, identify indexable concepts, and code them. The proper levels of indexing exhaustivity and index language specificity are also significant factors affecting the quality of an index. Semantic problems are related to the choice of index terms and the form in which they should be used. Equivalent, hierarchical and affinitive/associative relationships of index terms are involved. Syntactic problems are largely related to the coordination of index terms. This process of coordination arises from the need to be able to search for the intersection of two or more classes defined by terms denoting distinct concepts. Finally, the most valuable aspects of alphabetical subject indexing theories and practices are derived from those of Cutter, Kaiser, Ranganathan, Coates, Lynch and Austin, and are discussed in detail.
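The paired-set view of an index and the coordination of terms by intersection can be illustrated with a small sketch. It assumes the index is given as an iterable of (term, document) pairs; the names `build_index` and `coordinate` are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def build_index(pairs):
    """Group (term, document) pairs into a mapping from term to its set of documents."""
    postings = defaultdict(set)
    for term, doc in pairs:
        postings[term].add(doc)
    return postings


def coordinate(postings, *terms):
    """Return the documents indexed under every given term (intersection of classes)."""
    if not terms:
        return set()
    # Copy the first class so the index itself is never modified.
    result = set(postings.get(terms[0], set()))
    for term in terms[1:]:
        result &= postings.get(term, set())
    return result
```

For example, with pairs `("cats", "d1")`, `("cats", "d2")` and `("dogs", "d2")`, coordinating `"cats"` and `"dogs"` yields only `d2`, the one document that belongs to both classes.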


Critical Discourse Analysis of '5.18' in 'Honam' and 'Yeongnam' Local Newspapers by Using Corpus (코퍼스를 이용한 '호남'과 '영남' 지역신문에서의 '5.18'에 대한 비판적 담화분석)

  • Lee, Sukeui; Jin, Duhyeon
    • Korean Linguistics / v.76 / pp.83-112 / 2017
  • In this paper, newspaper articles were collected from '5.18' keyword search results, and a news corpus was constructed from the collected data. The ideological differences regarding '5.18' in the articles of 'Honam' and 'Yeongnam' local newspapers were investigated, and these differences in local newspaper discourse were analyzed through objective figures. The subjects of the newspaper articles and the frequency of nouns and predicates were analyzed. The use and meaning of the intended vocabulary were examined. Analysis of the titles of the newspaper articles showed that the discourse written in 'Honam' emphasized the necessity of re-recognition of 5.18. In both regions, the word 'Gwangju' is often used. However, 'Gwangju' in 'Honam' newspapers means a spiritual space, not a physical space. Honam regional newspapers contain many words describing the events, such as 'shoot' and 'fire', which calls for recollection and memory of '5.18'. Contrastive analysis of local newspapers has been very scarce in the analysis of newspaper discourse, and this study was conducted to analyze the discourse among local newspapers.

Development of A System for Registration of Korean Terminology on The Electropedia

  • Moon, Bonghee
    • Journal of the Korea Society of Computer and Information / v.24 no.8 / pp.105-111 / 2019
  • In this paper, I introduce the development of a system for registering Korean standard technical terms that correspond to the English electrotechnical terminology on the Electropedia of the International Electrotechnical Commission (IEC). The project started in 2016 with permission for registration granted at Technical Committee 1 during the 80th IEC General Meeting in Frankfurt, Germany. The work consisted of three steps. The first step was gathering Korean vocabulary and building a database for translating the English terms of the International Electrotechnical Vocabulary (IEV) into Korean terms. The second step was finding the correct or proper Korean term for each English IEV term on the Electropedia. In this step, members of Korean TC 1 searched for proper Korean terms using specially developed computer programs and databases built from Korean electrotechnical dictionaries. After selecting proper terms, they cross-checked one another's Korean terms. The last step was registering all of these Korean terms on the Electropedia. As a result, 20,766 Korean electrotechnical terms were registered on the Electropedia in 2017. In the future, the definitions of the English technical terms need to be translated into Korean.

Anomalies of the clivus of interest in dental practice: A systematic review

  • McCartney, Troy E.; Mupparapu, Mel
    • Imaging Science in Dentistry / v.51 no.4 / pp.351-361 / 2021
  • Purpose: The clivus is a region in the anterior section of the occipital bone that is commonly imaged on large-volume cone-beam computed tomography (CBCT). There have been several reports of incidental clivus variations and certain pathological entities that have been attributed to the variations. This study aimed to evaluate the effects of these variations within the scope of dentistry. Materials and Methods: Medical databases (PubMed, Scopus, and Web of Science) were searched using a controlled vocabulary (clival anomalies, cone-beam CT, canalis basilaris medianus, fossa navicularis magna, clival variation). The search was limited to English language, humans, and studies published in the last 25 years. The articles were exported into RefWorks® and duplicates were removed. The remaining articles were screened and reviewed for supporting information on variations of the clivus on CBCT imaging. Results: Canalis basilaris medianus and fossa navicularis magna were the most common anomalies noted. Many of these variations were asymptomatic, with most patients unaware of the anomaly. In certain cases, associated pathologies ranged from developmental (Tornwaldt cyst), to acquired (recurrent meningitis). While no distinct pathognomonic aspects were noted, there were unique patterns of radiographic diagnosis and treatment modalities. Most patients had a normal course of follow-up. Conclusion: Interpretation of CBCT volumes is a skill every dentist must possess. When reviewing large-volume CBCT scans, the clinician should be able to distinguish pathology from normal anatomic variations within the skull base. The majority of clivus variations are asymptomatic and will remain undetected unless incidentally noted on radiographic examinations.

Incorporating Deep Median Networks for Arabic Document Retrieval Using Word Embeddings-Based Query Expansion

  • Yasir Hadi Farhan; Mohanaad Shakir; Mustafa Abd Tareq; Boumedyen Shannaq
    • Journal of Information Science Theory and Practice / v.12 no.3 / pp.36-48 / 2024
  • The information retrieval (IR) process often encounters a challenge known as query-document vocabulary mismatch, where user queries do not align with document content, impacting search effectiveness. Automatic query expansion (AQE) techniques aim to mitigate this issue by augmenting user queries with related terms or synonyms. Word embedding, particularly Word2Vec, has gained prominence for AQE due to its ability to represent words as real-number vectors. However, AQE methods typically expand individual query terms, potentially leading to query drift if not carefully selected. To address this, researchers propose utilizing median vectors derived from deep median networks to capture query similarity comprehensively. Integrating median vectors into candidate term generation and combining them with the BM25 probabilistic model and two IR strategies (EQE1 and V2Q) yields promising results, outperforming baseline methods in experimental settings.
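The general idea of expanding a query around a single query-level vector and then ranking with BM25 can be sketched as follows. This is not the paper's deep median network: a plain coordinate-wise median of the query terms' embeddings stands in for it, and the function names, the `top_k` parameter, and the BM25 constants `k1=1.5` and `b=0.75` are assumptions chosen for illustration. The EQE1 and V2Q strategies are not reproduced.

```python
import math
from collections import Counter
from statistics import median


def median_vector(vectors):
    """Coordinate-wise median of equal-length vectors (stand-in for a learned median)."""
    return [median(column) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(query_terms, embeddings, top_k=2):
    """Append the top_k vocabulary terms closest to the median of the query vectors."""
    known = [embeddings[t] for t in query_terms if t in embeddings]
    if not known:
        return list(query_terms)
    centre = median_vector(known)
    candidates = [t for t in embeddings if t not in query_terms]
    candidates.sort(key=lambda t: cosine(embeddings[t], centre), reverse=True)
    return list(query_terms) + candidates[:top_k]


def bm25_scores(query_terms, documents, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document for the given query terms."""
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores
```

Scoring candidates against one vector for the whole query, rather than against each query term separately, is what is meant to limit query drift: a term is added only if it is close to the query as a whole.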

Development of a Korean Speech Recognition Platform (ECHOS) (한국어 음성인식 플랫폼 (ECHOS) 개발)

  • Kwon Oh-Wook; Kwon Sukbong; Jang Gyucheol; Yun Sungrack; Kim Yong-Rae; Jang Kwang-Dong; Kim Hoi-Rin; Yoo Changdong; Kim Bong-Wan; Lee Yong-Ju
    • The Journal of the Acoustical Society of Korea / v.24 no.8 / pp.498-504 / 2005
  • We introduce a Korean speech recognition platform (ECHOS) developed for education and research purposes. ECHOS lowers the entry barrier to speech recognition research and can be used as a reference engine by providing elementary speech recognition modules. It has a simple object-oriented architecture, implemented in the C++ language with the standard template library. The input of ECHOS is digital speech data sampled at 8 or 16 kHz. Its output is the 1-best recognition result, N-best recognition results, and a word graph. The recognition engine is composed of MFCC/PLP feature extraction, HMM-based acoustic modeling, n-gram language modeling, and finite state network (FSN)-based and lexical tree-based search algorithms. It can handle various tasks from isolated word recognition to large vocabulary continuous speech recognition. We compare the performance of ECHOS and the hidden Markov model toolkit (HTK) for validation. In an FSN-based task, ECHOS shows similar word accuracy, while the recognition time is doubled because of the object-oriented implementation. For an 8000-word continuous speech recognition task, using a lexical tree search algorithm different from the algorithm used in HTK, it increases the word error rate by 40% in relative terms but halves the recognition time.
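Because the comparison is stated as a relative change in word error rate, a short sketch may help separate relative from absolute differences. It uses the common definition of word error rate as word-level edit distance divided by reference length; the abstract does not state how the rate was measured, and the function names `word_error_rate` and `relative_change` are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance (substitutions, insertions, deletions) per reference word."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    previous = list(range(len(hyp) + 1))  # distances for an empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1] / len(ref)


def relative_change(baseline, new):
    """Relative change of `new` with respect to `baseline` (0.4 means +40%)."""
    return (new - baseline) / baseline
```

As a purely illustrative case with made-up numbers, a baseline error rate of 10% that rises by 40% in relative terms becomes 14%, not 50%.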