• Title/Summary/Keyword: terminology analysis

An Intelligence Support System Research on KTX Rolling Stock Failure Using Case-based Reasoning and Text Mining (사례기반추론과 텍스트마이닝 기법을 활용한 KTX 차량고장 지능형 조치지원시스템 연구)

  • Lee, Hyung Il;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.47-73 / 2020
  • KTX rolling stock is a system consisting of many machines, electrical devices, and components, and its maintenance requires considerable expertise and experience from maintenance workers. In the event of a rolling stock failure, the maintainer's knowledge and experience determine how quickly and how well the problem is solved, so the resulting availability of the vehicle varies. Although problem solving is generally based on fault manuals, experienced and skilled professionals can quickly diagnose a failure and take action by applying personal know-how. Since this knowledge exists in tacit form, it is difficult to pass on completely to a successor, and previous studies have developed case-based rolling stock expert systems to turn it into data-driven knowledge. Nonetheless, research on KTX rolling stock, the most commonly used on the main line, and on systems that extract the meaning of text and search for similar cases is still lacking. Therefore, this study proposes an intelligent support system that provides an action guide for newly occurring failures by using the know-how of rolling stock maintenance experts as examples of problem solving. For this purpose, a case base was constructed by collecting rolling stock failure data generated from 2015 to 2017, and an integrated dictionary was constructed separately from the case base to include the essential terminology and failure codes, in consideration of the specialized nature of the railway rolling stock sector. Based on the deployed case base, each new failure was matched against past cases, and the three most similar failure cases were extracted so that the actual actions taken in those cases could be proposed as a diagnostic guide. 
In this study, to compensate for the limitations of keyword-matching case retrieval in earlier case-based rolling stock failure expert systems, various dimensionality reduction techniques were applied so that similarity could be calculated with the semantic relationships among failure details taken into account, and their usefulness was verified through experiments. Similar cases were retrieved by applying three algorithms, Non-negative Matrix Factorization (NMF), Latent Semantic Analysis (LSA), and Doc2Vec, to extract the characteristics of each failure and by measuring the cosine distance between the resulting vectors. Precision, recall, and F-measure were used to assess the performance of the proposed actions. To compare the techniques, these three algorithms were evaluated alongside two baselines: an algorithm that randomly extracts failure cases with identical failure codes and an algorithm that applies cosine similarity directly to words. Analysis of variance confirmed that the performance differences among the five algorithms were statistically significant. In addition, optimal techniques for practical application were derived by verifying how performance varies with the number of dimensions used in dimensionality reduction. The analysis showed that word-based cosine similarity outperformed dimensionality reduction using NMF and LSA, and that the algorithm using Doc2Vec performed best. Furthermore, for the dimensionality reduction techniques, performance improved as the number of dimensions increased, up to an appropriate level. 
Through this study, we confirmed the usefulness of effective methods for extracting the characteristics of data and converting unstructured data when applying case-based reasoning in the specialized field of KTX rolling stock, where most of the attributes are text. Text mining is being studied for use in many areas, but studies using such text data are still lacking in environments with many specialized terms and limited access to data, such as the one addressed in this study. In this regard, it is significant that this study is the first to present an intelligent diagnostic system that suggests actions by retrieving cases with text mining techniques that extract the characteristics of a failure, complementing keyword-based case searches. It is expected to provide implications as a basic study for developing diagnostic systems that can be used immediately on site.
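
The retrieval and evaluation steps described in this abstract can be illustrated with a minimal sketch. It covers only the word-based cosine similarity baseline, not the NMF, LSA, or Doc2Vec variants, and it is not the authors' implementation. The case texts, case ids, function names, and the set of "relevant" cases are all hypothetical, made up for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_cases(query: str, case_base: dict, k: int = 3) -> list:
    """Return the ids of the k past cases most similar to the query text."""
    q = Counter(query.lower().split())
    scored = [
        (cosine_similarity(q, Counter(text.lower().split())), case_id)
        for case_id, text in case_base.items()
    ]
    # Highest similarity first; ties are broken by case id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [case_id for _, case_id in scored[:k]]


def precision_recall_f1(retrieved: list, relevant: set) -> tuple:
    """Precision, recall, and F-measure of a retrieved list against a relevant set."""
    hits = sum(1 for case_id in retrieved if case_id in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical failure descriptions standing in for a real case base.
CASE_BASE = {
    "c1": "traction motor overheating alarm",
    "c2": "door closing failure sensor fault",
    "c3": "traction motor vibration alarm",
    "c4": "brake pressure low warning",
}

top3 = top_k_cases("traction motor alarm", CASE_BASE, k=3)
precision, recall, f1 = precision_recall_f1(top3, relevant={"c1", "c3"})
print(top3, precision, recall, f1)
```

Swapping the `Counter` vectors for reduced-dimension document vectors would leave the ranking and scoring code unchanged, which is why the same metrics can compare all five algorithms.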

An Investigation of Local Naming Issue of Tamarix aphylla (에셀나무(Tamarix aphylla)의 명칭문제에 대한 고찰)

  • Kim, Young-Sook
    • Journal of the Korean Institute of Traditional Landscape Architecture / v.37 no.1 / pp.56-67 / 2019
  • In order to investigate the issue of the proper name for eshel (Tamarix aphylla) mentioned in the Bible, this study analyzed the morphological and taxonomic features of the plants, examined the symbolism of the Tamarix genus, analyzed examples in Korean and Chinese classics, and studied the problems found in Korean, Chinese and Japanese translations of the Bible. The results are as follows. According to plant taxonomy, similar species of the Tamarix genus are differentiated by the leaf and flower, and because these are very small, about 2-4 mm, it is difficult to differentiate the species with the naked eye. However, T. aphylla, found in the plains of Israel, and T. chinensis of China and Korea differ distinctly in the shape of the drooping branches and in their blooming periods. The Tamarix genus is a very precious tree that was planted in royal courtyards of ancient Mesopotamia and the Han(漢) Dynasty of China, and in ancient Egypt it was said to be a tree that gave life to the dead. In the Bible, it was used as a sign of the covenant that God was with Abraham, and it also symbolized the prophet Samuel and the court of Samuel. In the Korean classics, the Tamarix genus was referred to by a commonly used term in the Joseon Dynasty, often the medical term 'Chēngliǔ(檉柳)'. Meanwhile, the term 'wiseonglyu(渭城柳)' was used as a literary term. A survey of the periods and names in the literature related to Chēngliǔ(檉柳) among Chinese medicinal herb books showed that a total of 16 terms were used, and the term Chuísīliǔ(垂絲柳) used in the Chinese Bible was not among them. The word 'wiseonglyu(渭城柳)', which originated from a poem by Wang Wei (699-759) of the Tang(唐) Dynasty, was not found either; instead, there was the word 'halyu(河柳)', which was related to Zhou(周) China. 
However, in the academic terminology currently used in China, the words Chuísīliǔ(垂絲柳) and Chēngliǔ(檉柳) are used equally, so translating eshel in the Chinese Bible as either Chuísīliǔ(垂絲柳) or Chēngliǔ(檉柳) appears to raise no issue. In Japanese, tamarix was mistranslated as 'やなぎ (willow)' in the Meiji Testaments (舊新約全書, 1887) and has been translated correctly as 'ぎょりゅう (檉柳)' since the Colloquial Japanese Bible (口語譯 聖書, 1955). However, there are claims that 'gyoryu (ぎょりゅう 檉柳)' is not an indigenous species but an exotic species from the Edo period, so the terminology needs to be reconsidered. As the analysis of examples in the Korean classics shows, it is highly possible that Korea's T. chinensis was grown on the Korean Peninsula for medicinal and gardening purposes. Therefore, the use of the medicinal term Chēngliǔ(檉柳) or the literary term 'wiseonglyu' in the Korean Bible may not be a big issue. However, the term 'wiseonglyu' is used very rarely even in China, and its use may be connected to the admiration of China and Chinese things among literary persons of the Joseon Dynasty, so the use of this term should be reviewed carefully. Therefore, rather than using terms that may be problematic in the Bible, it is more feasible to transliterate the Hebrew word and call the tree eshel.

Questions and Answers about the Humidifier Disinfectant Disaster as of February 2017 (가습기살균제 참사의 진행과 교훈(Q&A))

  • Choi, Yeyong
    • Journal of Environmental Health Sciences / v.43 no.1 / pp.1-22 / 2017
  • 'The worst environmental disaster', 'World's first biocide massacre', and 'Home-based Sewol ferry disaster' are all phrases attached to the recent humidifier disinfectant disaster. In the spring of 2011, four of 8 patients (pregnant women and 1 adult man) passed away at a university hospital in Seoul due to respiratory failure. An epidemiologic investigation conducted by the Korean CDC soon revealed that inhalation of humidifier disinfectant, which had been widely used in Korea during the winter, was responsible for the disease. In addition to lung fibrosis (hardening of the lungs), other diseases including asthma, rhinitis, skin disease, liver disease, fetal disease and cancers have been researched for their relation to exposure to the products. By February 9, 2017, 5,342 cases had been registered for health problems and 1,131 of them were already dead (20.8% mortality rate). Based on studies by government agencies and a telephone survey of the general population by Seoul National University and civic groups, around 20% of the general public of Korea has used these products. Since the market release of the first product by SK Chemical in 1994, over 7.1 million items from around 20 brands were sold up to 2011. Most of the products were manufactured by well-known large conglomerates such as SK, Lotte, Samsung, Shinsegye, LG, and GS, as well as some European companies including UK-based Reckitt Benckiser and TESCO, the German firm Henkel, the Danish firm KeTox, and an Irish company. Even though this disaster was revealed in 2011 by the Korean government, the issue of the victims was neglected for over five years. In 2016, an unexpected but intensive investigation by prosecutors found that Reckitt Benckiser had manipulated and concealed animal tests for its own brand, and several university experts and company employees were brought to court. The matter was an intense social issue in Korea from May to June, with a surge in media coverage. 
The prosecutor's investigation and a nationwide boycott campaign organized by victims and environmental groups against Reckitt Benckiser, whose product had been used by more than 70% of victims, led to the producer's official apology and a compensation scheme. A legislative investigation organized after the April 2016 national election revealed the producers' faults and the government's responsibility, but failed to meet expectations. A special law for the victims passed the National Assembly in January 2017, and a punitive system together with a massive environmental epidemiology investigation are expected to be the only solutions for this tragedy. The sciences of medicine, toxicology and environmental health have provided decisive evidence so far, but for the remaining problems the perspectives of social sciences such as sociology and jurisprudence are highly necessary, as with the Minamata disease and Wonjin Rayon events. It may not be easy to follow this issue, given the unfamiliar terminology from medical and chemical science and the long, complicated history of the event. For these reasons the author has attempted to write this article in a question and answer format to make it easier to follow. The 17 questions are: Q1 What is humidifier disinfectant? Q2 What kind of health problems are caused by humidifier disinfectant? Q3 How many victims are there? Q4 What is the analysis of the 1,112 cases of death? Q5 What is the problem with the government's diagnostic criteria and the solution? Q6 Who made what brands? Q7 Has there been a recall? What is still on sale? Q8 Was safety not checked by any producers? Q9 What are the government's responsibilities? Q10 Is it true that these products were sold only in Korea? Q11 Why and how was it unveiled only in 2011 after 17 years of sales? Q12 What delayed the resolution of the victim issue? Q13 What is the background of the prosecutor's investigation in early 2016? 
Q14 Is it possible to report new victim cases without evidence of product purchase? Q15 What is happening with the victim issue? Q16 How does it compare with the cases of Minamata disease and Wonjin Rayon? Q17 Are there prevention measures and lessons?

VIDEO GAME CULTURE AND INTERACTIVITY -An exploration of digital interactive media through a metaphorical approach to video game culture-

  • U, Tak
    • Journal of Korea Game Society (한국게임학회지) / v.6 no.1 / pp.70-72 / 2009
  • This research is focused on defining interaction within the context of digital media and creating a multi cultural definition of interactivity. The concept of multi digital culture and a definition of interaction in digital media have often been overlooked by other researchers and this has caused the emergence of many different notions on this issue. As a result of these varied notions of the concept, public confusion has arisen regarding interactivity. The main purpose of this research is to find a suitable multi definition of interaction through examining local digital culture. In order to analogise multi digital culture, the video game culture is employed as a metaphor to interpret local digital culture. The reason for this is that a specific national culture can be easily identified within the video game culture. Four countries, South Korea, Japan, the U.S. and the UK have been chosen for comparison purposes. Case study, questionnaire and publicly accessible video game related data, such as, video game charts, are used for formalising and analysing unique local digital culture. The Heyri POP UP IMAGE Festival, S. Korea, was also used as a pilot study, with some of the above research methods being employed to analyse South Korean digital culture. In relation to western cases, interview and questionnaire were primarily used. The data from the case countries was carefully compared and analysed and then it became the basis of a theory of multi definition of interaction in digital media. The case study employed the cultural metaphor for this research and in addition video game culture related questionnaires and interviews with experts of interactive art genre, regarding new notions of digital interaction were utilised. The survey was conducted simultaneously in the four different cultural case nations of this research. 
Twenty respondents from each case nation participated in the survey, in order to investigate firstly, the existence of 'local digital culture' and secondly, the trends and phenomena of 'digital culture' in these four different 'local digital cultural areas'. The interviews with experts of the interactive art genre focused on obtaining their understanding of contemporary digital culture in their research. Using data gathered from the observation of local digital culture, the basic theory of interaction and the terminology of interaction are reformed. Localised definitions of interaction in digital media, control based interaction and communication based interaction, are presented in order to identify a 'locality' in terms of various contemporary digital cultures. As a result of analysing digital culture, new definitions of 'multi definition of digital interaction' were formulated. As mentioned above, 'control' and 'communication' based interaction were introduced, based on 'user to media' relationships. Based on the degree of physical interaction, 'liminal' and 'transitive' interactions were introduced. Less physical digital interaction is named 'liminal' interaction and more physical digital interaction is named 'transitive' interaction. These new definitions of interaction were applied to real world examples of uses of digital interaction, such as digital interactive installation artworks and video games. The newly defined meaning of digital interaction can be applied to analysing digital interactive installation artworks and can possibly indicate their future development and the prospects of future electronic games. Three leading digital interactive artists were selected for this analysis and their works were studied in terms of the implementation of 'multi definition of digital interaction'. Throughout these processes, the meaning of 'communication' in digital interactive media was emphasised. 
Many of the selected artists' digital installations were focused on 'communication' or 'interaction between each user through digital media', rather than the concept of 'control' in digital interaction, otherwise termed 'communication with digital media'. In their artworks, interaction between audience members was digitally engaged within the physical interactive environment created by the digital media. Both the audience's action and all the reaction throughout the interaction between audience members triggered the digital media's reaction. This audience-audience-media interaction is the key to understanding the concept of 'communication' in physical digital media, and it is the main interactive concept upon which the digital interactive installation artists selected for this research, and many other artists from similar fields, are concentrating their efforts. In the case of the video game, a trend similar to that of digital interactive installations was noticed. Based on this research's 'multi definition of digital interaction', the video game has evolved from the early stage of the conventional game, which was focused on control based interaction, to the on-line game, which was focused on communication based interaction, to physical interactive games such as Nintendo Wii, which are focused on more physical interaction, and finally to the ubiquitous interactive game, which is mainly concentrated on the concept of 'communication' in physical digital interaction. It is possible that this evolution of the video game concept of interaction is comparable to the progress of digital interactive artworks. This view is based on the fact that both genres show evidence that they are developing in the direction of the concept of 'communication', in terms of physical digital interaction. The important emphasis of this research's results is 'locality' and 'communication' in physical digital interaction. 
The existence of different digital culture trends, which were assessed by the 'multi definition of digital interaction', can explain the concept of 'locality' in digital interaction. This meaning of 'locality' may assist in understanding contemporary digital culture and can reduce possible misunderstanding as regards 'local' digital culture. In the application of the concept of digital interaction to the field of either artworks or video games, it is possible to form the opinion that an innovative concept of physical digital interaction is 'communication' within this context. This concept and its applications can improve the potential of both digital interactive culture and technology.

An Article Analysis of Animation-specialized Magazine: Focusing on Animatoon magazine (애니메이션 전문 정보지의 기사 분석: 『애니메이툰』의 기사 항목과 특성에 대한 고찰)

  • Kwon, Jae-Woong
    • Cartoon and Animation Studies / s.43 / pp.151-184 / 2016
  • To develop the animation industry, it is essential to encourage related areas, because related industries can create synergy by exchanging business with one another. The magazine industry dealing with animation is one of them, since an animation magazine plays a critical role in providing information and knowledge as well as in analyzing the market and industry from a practical viewpoint. However, there is only one such magazine in operation in Korea, and it has never been researched even though over 20 years have passed since its first publication in 1995. This is Animatoon, published by Akom Production Co., which produced animation from the mid-1980s; Nelson Shin, the CEO of Akom, has worked as its Chief Editor. This research covers Animatoon from the first issue to vol. 115, published in 2015, and explores its characteristics. The results are as follows. First, Animatoon provides items that make every article easy to understand. Second, it provides original English articles written by foreign correspondents. Third, it provides different types of articles, such as terminology explanations, knowledge related to the production pipeline, and information about newly released animation DVDs. In addition, Animatoon was found to carry a small amount of advertising. These characteristics show that Animatoon helps subscribers judge the contents of articles more easily, tries to focus on global trends, and provides basic and critical information. Therefore, it can be said that Animatoon has enough features to be judged a specialized magazine. First, Animatoon has its own specialized area, which is animation. Second, considering the types of articles, Animatoon tries to have all the parties concerned as its subscribers. Although the amount of advertising in Animatoon is small, only animation-related ads are run in every issue.

Manbojeonseo(萬寶全書) Geumdoron(琴道論) in the old scores of Joseon(朝鮮) (조선시대 고악보에 나타난 『만보전서(萬寶全書)』의 금도론(琴道論))

  • Choi, Sun-a
    • (The) Research of the performance art and culture
    • /
    • no.20
    • /
    • pp.251-307
    • /
    • 2010
  • Manbojeonseo, a kind of encyclopedia published several times during the Ming and Ch'ing dynasties, contains information on daily life useful to scholars and common people. In 1720, Manbojeonseo was first introduced to the Joseon(朝鮮) dynasty by the diplomatic corps visiting the Ch'ing dynasty, and it circulated widely in society as a useful source of information or as a reference book in individual collections. Since Manbojeonseo includes systematically organized contents of Geumdoron(琴道論, a theory of the heptachord), it could serve as a useful reference when Geumdoron was included in old scores. For instance, Obultan(五不彈), Tangeumsuji(彈琴須知), and Taeeumgibeop(太音紀法) recorded in Hangeumsinbo(韓琴新譜, 1724) clearly acknowledge Manbojeonseo as their common source. In this paper, the order and contents of the Geumdoron in four different editions of Manbojeonseo are compared. First, Manbojeonseo(1610) edited by Seo Giryong(徐企龍) and Manbojeonseo(1612) edited by Yu Jamyeong(劉子明) are compared with a focus on the contents of the Geumdoron, as both contain a considerable number of Geumdoron sections. The tables of contents in both are composed of upper and lower levels, each classified into 4 large divisions. While the contents of the upper level are presumably older and focus more on the theory of the cardinal virtues, the contents of the lower level are relatively new and center more on the skills for actually playing the heptachord(琴), the lyrics, and the musical scores written in Gamjabo(減字譜). Therefore, it could be said that the upper level is metaphysical while the lower level is physical. One of the differences between the two lies in the order and the terminology of the large divisions. 
In Manbojeonseo(1612), some terms in the large divisions represent and theoretically group the detailed descriptions in the small divisions, such as the 5 demands or 7 taboos in playing the heptachord. In addition, a few lower divisions were newly added or revised to make Geumhangmun(琴學門, study of the heptachord) more complete, and the detailed classification was revised and polished to make it more reasonable. In Manbojeonseo(1614), compiled by the same editor as Manbojeonseo(1610), the contents of the Geumdoron are much briefer than those of Manbojeonseo(1610) and Manbojeonseo(1612). Manbojeonseo(1739) carries a similarly brief Geumdoron section but includes a new type of Geumdoron called Oeumjeongjobo(五音正操譜). Finally, the Geumdoron in Manbojeonseo and in several old scores are comparatively analyzed. While the Geumbo(琴譜) owned by Gugagwon(國樂院) and Hangeumsinbo contain a relatively old Geumdoron, Yuyeji(遊藝志) and Bangsanhanssigeumbo(芳山韓氏琴譜) adopt a practical and relatively new Geumdoron that differs from the former old scores and resembles Manbojeonseo(1739) in order and contents. In particular, the Geumdoron in Geumheonakbo(琴軒樂譜) is notably unique in containing much of the upper and lower levels of Manbojeonseo(1612), and it is therefore thought to have actively adopted the contents of the new Geumdoron.

Analysis of Authority Control System in Collecting Repository -from the case of Archival Management System in Korea Democracy Foundation- (수집형 기록관의 전거제어시스템 분석 - 민주화운동기념사업회 사료관리시스템의 사례를 중심으로 -)

  • Lee, Hyun-Jeong
    • The Korean Journal of Archival Studies / no.13 / pp.91-134 / 2006
  • In general, personally collected archives (manuscripts) are in poor physical condition, and contextual information about the archives and their production history is mostly only partially collected with the manuscripts. A collecting repository therefore needs to control effectively the names of the producers of archives collected in various ways and to accumulate provenance information, which is the key element for understanding the production background. Authority control and provenance information management must be organized from the beginning of acquisition, which means that the necessary information must be collected with the acquisition control process in mind as well. This thesis aims to verify the necessity of authority control and the accumulation of provenance information in a collecting repository and to suggest points to consider when building an archival authority system for such a repository. To this end, the thesis examines the necessity of authority control in archival management and researches the standards, work process, and accumulation process of archival authority control. In the archival authority control system, provenance information management and authority control are carried out through all the steps of archival management, from the lead file to the names of the producers at archival registration and the archival description at acquisition. A great deal of information is registered and described at the proper point in time, and finally all of this information, including the authority control that controls the Heading in authority management, must be organized so that it can be used for the intellectual management of archives and as Finding Aids. The features of the archival authority system are as follows. First, the authority file types needed for archival authority control of the democracy movement are made up of names of groups, persons, affairs, and terminology (subject names). 
Second, the basic record structure and description elements of the authority collection of the Korea Democracy Foundation Archives follow paragraph 1 of ISAAR(CPF), with some necessary elements added, while the details of the description rules, such as word spacing and the use of periods, follow paragraph 4 of KCR, adapted to the features of the archival management system. The way authority records are input is based on EAC (Encoded Archival Context). Third, the system lets users reach the sources they want more easily by connecting authority terms systematically: related terms can be connected in various and concrete ways as upper and lower terms and as earlier and later terms, which expands term relations beyond the traditional authority system, where they are usually expressed only as related terms (see also). The authority control of the archival management system can thus effectively collect and manage information on the functions and main activities of various and multiple groups, in addition to its own function of controlling the Heading; it can express the multiple and intermediary relationships between archives and producers or between producers; and it provides an expanded record information service that satisfies users' various requests through an indexing service. Finally, this instance of authority management applying the international standard ISAAR(CPF) can serve as a reference for building archival authority systems in collecting repositories hereafter, by reorganizing the description elements into appropriate forms and setting up the authority file types to be managed properly for each service.
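
The expanded term relations described above can be pictured as a small data structure. The following is a minimal sketch only, not the Korea Democracy Foundation system and not an ISAAR(CPF) or EAC schema: the class names, field names, relation labels (`broader`, `narrower`, `earlier`, `later`, `related`, an assumed reading of "upper and lower" and "earlier and later" terms), and the example headings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

# Inverse of each relation type, so links stay consistent in both directions.
INVERSE = {
    "broader": "narrower",
    "narrower": "broader",
    "earlier": "later",
    "later": "earlier",
    "related": "related",  # the traditional "see also" link
}


@dataclass
class AuthorityRecord:
    heading: str  # the controlled (authorized) form of the name
    kind: str     # "group", "person", "affair", or "subject"
    variants: set = field(default_factory=set)     # non-preferred name forms
    relations: dict = field(default_factory=dict)  # relation type -> set of headings


class AuthorityFile:
    def __init__(self) -> None:
        self._records = {}
        self._lookup = {}  # lowercased name form -> authorized heading

    def add(self, record: AuthorityRecord) -> None:
        self._records[record.heading] = record
        for name in {record.heading, *record.variants}:
            self._lookup[name.lower()] = record.heading

    def resolve(self, name: str) -> Optional[str]:
        """Map any known name form to its authorized heading."""
        return self._lookup.get(name.lower())

    def relate(self, source: str, relation: str, target: str) -> None:
        """Record that target is a `relation` term of source, plus the inverse link."""
        self._records[source].relations.setdefault(relation, set()).add(target)
        self._records[target].relations.setdefault(INVERSE[relation], set()).add(source)

    def related(self, heading: str, relation: str) -> list:
        return sorted(self._records[heading].relations.get(relation, set()))


# Made-up headings for illustration only.
authority = AuthorityFile()
authority.add(AuthorityRecord("Example Student Federation", "group", {"ESF"}))
authority.add(AuthorityRecord("Example Student Council", "group"))
authority.add(AuthorityRecord("Example Youth Alliance", "group"))
authority.relate("Example Student Federation", "narrower", "Example Student Council")
authority.relate("Example Youth Alliance", "later", "Example Student Federation")
print(authority.related("Example Student Federation", "earlier"))
```

Storing the inverse link at write time keeps lookups symmetric, so a user arriving at either heading can navigate to the other.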

International Comparative Study on Education for International Understanding(EIU) : Based on the Regional Analysis of Europe, North America, Asia Pacific, and Africa (국제이해교육의 지역별 동향 분석 연구: 유럽·북미·아시아태평양·아프리카를 중심으로)

  • Kim, Hyun-Duk;Kang, Soon-Won;Yi, Kyeong-Han;Kim, Da-Won
    • Korean Journal of Comparative Education / v.27 no.4 / pp.127-154 / 2017
  • EIU has evolved in diverse ways depending on national environment and culture, on the basis of the philosophy of individual human rights and world peace articulated in the "1974 Recommendation on EIU". However, the global environment surrounding EIU has changed socially, economically, culturally and ecologically in the 21st century, and it is therefore necessary to raise the following questions: Is the concept of EIU, initiated for international understanding and cooperation for world peace in the 20th century, still valid in the 21st century? Which direction should we take in order for EIU to be effective in the globalized world? To answer these questions, this study reviewed and analyzed the historical development and current trends of EIU in Europe, North America, the Asia Pacific area, and Africa. For the empirical study, thirty-four experts in EIU selected from the four regions were interviewed by the researchers. Based on the interviews and the related literature review, it was found that diverse terms for EIU were used in the four regions and that the focus of EIU differed depending on the geographical, historical and social environment of each region. Despite the diversity in EIU terminology, human rights, peace, equity and social justice, which are emphasized by UNESCO, were universally taught in EIU. EIU in these regions is currently dealt with in school education, social education and lifelong education, and global citizenship allowing multiple identities in particular is treated as important together with citizenship education. Another important aspect of EIU commonly found in the four regions was that global citizenship education for solving global problems coexisted with the reinforcement of nationalism for the economic competitiveness of each nation in a globalized world. 
The issue of global inequality was particularly dealt with in EIU, and the teaching of voluntary civic involvement and responsibility was particularly emphasized. Based on these research findings, the study proposes "glocalism", connecting global issues with local issues for solving global problems, as a new approach to the EIU of the 21st century.

Nonlinear Vector Alignment Methodology for Mapping Domain-Specific Terminology into General Space (전문어의 범용 공간 매핑을 위한 비선형 벡터 정렬 방법론)

  • Kim, Junwoo;Yoon, Byungho;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.127-146 / 2022
  • Recently, as word embedding has shown excellent performance in various deep learning-based natural language processing tasks, research on the advancement and application of word, sentence, and document embedding is being actively conducted. Within this research, cross-language transfer, which enables semantic exchange between different languages, is growing together with the development of embedding models. Academic interest in vector alignment is growing with the expectation that it can be applied to various embedding-based analyses. In particular, vector alignment is expected to be applied to mapping between specialized domains and generalized domains. In other words, it is expected to make it possible to map the vocabulary of specialized fields such as R&D, medicine, and law into the space of a pre-trained language model learned from a huge volume of general-purpose documents, or to provide a clue for mapping vocabulary between different specialized fields. However, since the linear vector alignment that has mainly been studied in academia basically assumes statistical linearity, it tends to simplify the vector space. It essentially assumes that different types of vector spaces are geometrically similar, which has the limitation of causing inevitable distortion in the alignment process. To overcome this limitation, we propose a deep learning-based vector alignment methodology that effectively learns the nonlinearity of data. The proposed methodology consists of sequential learning of a skip-connected autoencoder and a regression model to align the specialized word embedding expressed in each space to the general embedding space. Finally, through inference with the two trained models, the specialized vocabulary can be aligned in the general space. 
To verify the proposed methodology, an experiment was performed on a total of 77,578 documents in the field of 'health care' among national R&D tasks performed from 2011 to 2020. The proposed methodology achieved higher cosine similarity than the existing linear vector alignment.
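
The linear baseline that the abstract contrasts against can be illustrated with a small sketch. This is not the authors' model (their method uses a skip-connected autoencoder and a regression model, neither of which is reproduced here). It only fits a least-squares linear map between two toy embedding spaces and scores it with mean cosine similarity. The data, dimensions, and function names are assumptions made for illustration.

```python
import numpy as np

def fit_linear_alignment(src, tgt):
    """Least-squares matrix W that minimises ||src @ W - tgt||."""
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W

def mean_cosine_similarity(a, b):
    """Average cosine similarity between corresponding rows of a and b."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return float(np.mean(np.sum(a_unit * b_unit, axis=1)))

# Toy data (assumption): 500 "specialized" vectors of dimension 16.
rng = np.random.default_rng(0)
src = rng.normal(size=(500, 16))
true_map = rng.normal(size=(16, 16))

tgt_linear = src @ true_map              # general space related linearly
tgt_nonlinear = np.tanh(src @ true_map)  # general space related nonlinearly

# Fit one linear alignment per scenario and score it.
score_linear = mean_cosine_similarity(
    src @ fit_linear_alignment(src, tgt_linear), tgt_linear)
score_nonlinear = mean_cosine_similarity(
    src @ fit_linear_alignment(src, tgt_nonlinear), tgt_nonlinear)

print(f"linear relation:    {score_linear:.3f}")
print(f"nonlinear relation: {score_nonlinear:.3f}")
```

In this toy setting the linear map recovers a linear relationship almost exactly but scores lower when the relationship is nonlinear, which is the kind of distortion the abstract attributes to linear alignment.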

Term Mapping Methodology between Everyday Words and Legal Terms for Law Information Search System (법령정보 검색을 위한 생활용어와 법률용어 간의 대응관계 탐색 방법론)

  • Kim, Ji Hyun;Lee, Jong-Seo;Lee, Myungjin;Kim, Wooju;Hong, June Seok
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.3
    • /
    • pp.137-152
    • /
    • 2012
  • In the Web 2.0 era, as many users create their own web content (user-created content), the World Wide Web is overflowing with information, so finding meaningful information among these resources has become key. Information retrieval is now important in every field, and several types of search services have been developed and are widely used to retrieve the information users actually want. Legal information search in particular is an indispensable service, because it lets people find the laws relevant to their present situation. The Office of Legislation in Korea has provided the Korean Law Information portal service since 2009 for searching law information such as legislation, administrative rules, and judicial precedents, so people can conveniently find law-related information. However, this service is limited because its search engine basically returns documents according to whether they contain the query. Despite the efforts of the Office of Legislation, it is therefore difficult for general users who are unfamiliar with legal terms to retrieve law-related information from a search engine that uses simple keyword matching, because there is a large gap between everyday words and legal terms, especially those derived from Chinese words. People generally try to access law information using everyday words, so they have difficulty getting exactly the results they want. In this paper, we propose a term mapping methodology between everyday words and legal terms for general users who lack sufficient background in legal terminology, and we develop a search service that returns law information search results from everyday words. 
The service makes it possible to search law information accurately without knowledge of legal terminology. In other words, our research goal is a law information search system in which general users can retrieve law information with everyday words. First, drawing on the concept of collective intelligence, this paper uses the tags of internet blogs to find the mapping relationship between everyday words and legal terms. To this end, we collect tags related to an everyday word from web blog posts. When people write a blog post, they generally add non-hierarchical keywords or terms, called tags, that act like synonyms to describe, classify, and manage the post. Second, the collected tags are clustered with K-means cluster analysis. We then find the mapping between an everyday word and a legal term by using our estimation measure to select the legal term that best matches the everyday word. The selected legal terms are given a definite relationship, and the relations between everyday words and legal terms are described using SKOS, an ontology for describing knowledge related to thesauri, classification schemes, taxonomies, and subject headings. Based on the proposed mapping and searching methodologies, when a user tries to retrieve law information with an everyday word, our legal information search system finds the legal term mapped to the user query and retrieves law information using that term. Users can therefore get accurate results even without knowledge of legal terms. We expect that general users without a professional legal background will be able to retrieve legal information conveniently and efficiently using everyday words.
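
The tag-clustering step described above can be sketched on toy data. The abstract does not specify the tag features, the number of clusters, or the estimation measure, so everything below is an assumption for illustration: tags are represented by their occurrence across blog posts, K-means uses a simple deterministic initialisation, and a frequency count within the most "legal" cluster stands in for the paper's measure. The posts, the `legal_vocabulary` set, and the everyday word "getting fired" are hypothetical.

```python
import numpy as np

def kmeans(points, k, n_iter=20):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Distance from every point to its nearest already-chosen center.
        nearest = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[int(np.argmax(nearest))])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

# Hypothetical tags of blog posts retrieved for the word "getting fired".
posts = [
    ["dismissal", "severance pay", "labor law"],
    ["dismissal", "labor law", "unfair dismissal"],
    ["severance pay", "dismissal", "unfair dismissal"],
    ["resume", "job hunting", "interview"],
    ["interview", "resume"],
    ["job hunting", "interview", "resume"],
]
# Hypothetical list of known legal terms.
legal_vocabulary = {"dismissal", "unfair dismissal", "severance pay", "labor law"}

# Represent each tag by the posts it occurs in (one row per tag).
tags = sorted({tag for post in posts for tag in post})
occurrence = np.array(
    [[1.0 if tag in post else 0.0 for post in posts] for tag in tags])
labels = kmeans(occurrence, k=2)

clusters = {}
for tag, label in zip(tags, labels):
    clusters.setdefault(int(label), []).append(tag)

# Stand-in measure (assumption): take the cluster with the most legal terms,
# then the legal term in it that appears in the most posts.
best_cluster = max(
    clusters.values(), key=lambda c: sum(t in legal_vocabulary for t in c))
frequency = {tag: sum(tag in post for post in posts) for tag in tags}
legal_term = max(
    (t for t in best_cluster if t in legal_vocabulary), key=frequency.get)

term_map = {"getting fired": legal_term}
print(term_map)
```

A search front end could then rewrite a query containing the everyday word with the mapped legal term before keyword matching. How the mapping is stored (the abstract mentions SKOS) is left out of this sketch.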