• Title/Summary/Keyword: Language order

A review of the direction of French liberal arts education based on a university competency-based education approach (대학의 역량 중심 교육 방안에 따른 프랑스어 교양교육의 방향성 고찰)

  • KIM Eunnekyung
    • The Journal of the Convergence on Culture Technology
    • /
    • v.10 no.4
    • /
    • pp.729-736
    • /
    • 2024
  • In connection with the OECD's core competency proposal, we would like to consider an attempt to realize this in liberal arts education at Korean universities and examine what kind of education plan it is desirable to present to learners. Universities are expanding competency-based education into human and social fields by reconsidering new talent awards and the direction of education. In this way, each university selects and organizes core competencies and incorporates the core competencies that the university pursues into educational goals. Under the supervision of the Ministry of Education, education centered on core competencies is exploring its potential in liberal arts courses at universities above all else. We want to explore a methodology that can achieve learner-centered teaching and learning effects in the process of incorporating and accepting this. Language acquisition along with cross-cultural understanding is above all else a part that can promote learners' competencies in terms of diversity and mutual understanding. Therefore, we reflect this in French liberal arts education and explore teaching and learning processes by incorporating respect for diversity and mutual cultural understanding competency education related to learners' motivation into lectures. We aim to supplement this through collaboration and mutual cultural understanding processes as presentation tasks in order to overcome the existing competency-based evaluation while deriving acceptance results from learners. Therefore, they recognize that the direction of core competency education naturally shifts to value-centered education.

Query-based Answer Extraction using Korean Dependency Parsing (의존 구문 분석을 이용한 질의 기반 정답 추출)

  • Lee, Dokyoung;Kim, Mintae;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.3
    • /
    • pp.161-177
    • /
    • 2019
  • In this paper, we study how to improve answer extraction in a Question-Answering (QA) system by using the results of sentence dependency parsing. A QA system consists of query analysis, which analyzes the user's query, and answer extraction, which extracts appropriate answers from a document, and various studies have been conducted on both. To improve the performance of answer extraction, the grammatical information of sentences must be reflected accurately. Because Korean word order is free and sentence components are frequently omitted, dependency parsing is a good way to analyze Korean syntax. Therefore, in this study, we improved the performance of answer extraction by adding features generated by dependency parsing to the inputs of the answer extraction model (Bidirectional LSTM-CRF). We compared the performance of the model when given only basic word features generated without dependency parsing against its performance when the Eojeol tag feature and the dependency graph embedding feature were added. Since dependency parsing is performed on the Eojeol, a component of a sentence separated by spaces, as its basic unit, the tag information of each Eojeol is obtained as a result of the parsing. The Eojeol tag feature is this tag information. The process of generating the dependency graph embedding consists of generating the dependency graph from the dependency parsing result and learning the embedding of the graph. 
From the dependency parsing result, a graph is generated by mapping each Eojeol to a node, each dependency between Eojeol to an edge, and each Eojeol tag to a node label. Depending on whether the direction of the dependency relation is considered, either a directed or an undirected graph is generated. To obtain the embedding of the graph, we used Graph2Vec, a method that finds the embedding of a graph from the subgraphs that constitute it. The maximum path length between nodes can be specified when finding the subgraphs. If the maximum path length is 1, the graph embedding is generated only from direct dependencies between Eojeol; as the maximum path length becomes larger, indirect dependencies are included as well. In the experiment, the maximum path length between nodes was varied from 1 to 3, both with and without the direction of dependency, and the performance of answer extraction was measured. The experimental results show that both the Eojeol tag feature and the dependency graph embedding feature improve the performance of answer extraction. In particular, the highest answer extraction performance was obtained when the direction of the dependency relation was considered and the dependency graph embedding generated with a maximum path length of 1 in the Graph2Vec subgraph extraction process was used as the model input. From these experiments, we concluded that it is better to take the direction of dependency into account and to consider only direct connections rather than indirect dependencies between words. The significance of this study is as follows. First, we improved the performance of answer extraction by adding features based on dependency parsing results, taking into account the characteristics of Korean, namely its free word order and frequent omission of sentence components. 
Second, we generated the dependency parsing feature with a learning-based graph embedding method, without defining the patterns of dependency between Eojeol. Future research directions are as follows. In this study, the features generated from dependency parsing were applied only to the answer extraction model in order to grasp their significance. In the future, if their performance is confirmed by applying them to various natural language processing models such as sentiment analysis or named entity recognition, the validity of the features can be verified more accurately.
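
The graph construction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and it does not reproduce Graph2Vec; it only shows Eojeol as nodes, dependencies as edges, Eojeol tags as node labels, the directed/undirected choice, and the effect of a maximum path length. The input format (a list of `(eojeol, tag, head_index)` tuples), the tag names, and the function names `build_dependency_graph` and `neighbourhood` are all assumptions made for illustration.

```python
from collections import deque


def build_dependency_graph(parse, directed=True):
    """Build node labels and adjacency sets from a dependency parse.

    `parse` is a hypothetical format: a list of (eojeol, tag, head_index)
    tuples, where head_index is -1 for the root of the sentence.
    """
    labels = {i: tag for i, (_, tag, _) in enumerate(parse)}
    adjacency = {i: set() for i in range(len(parse))}
    for i, (_, _, head) in enumerate(parse):
        if head < 0:
            continue  # the root has no head
        adjacency[i].add(head)  # edge from dependent to head
        if not directed:
            adjacency[head].add(i)  # undirected: add the reverse edge too
    return labels, adjacency


def neighbourhood(adjacency, start, max_path_length):
    """Return the nodes reachable from `start` within `max_path_length` edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_path_length:
            continue  # do not expand beyond the path-length limit
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return seen - {start}


# Illustrative three-Eojeol sentence; the tags are made-up placeholders.
example_parse = [("나는", "NP_SBJ", 2), ("학교에", "NP_AJT", 2), ("갔다", "VP", -1)]
labels, directed_graph = build_dependency_graph(example_parse, directed=True)
_, undirected_graph = build_dependency_graph(example_parse, directed=False)
```

With a maximum path length of 1, `neighbourhood` sees only direct dependencies; raising it to 2 in the undirected graph also reaches sibling Eojeol through the shared head, which corresponds to the indirect dependencies mentioned above.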

A Study on the Motive of Escape from the North Korea and the Life Situation of Female Fugitives in China - based on the Interview with North Korean Female Refugees in Yenben Province - (북한 여성들의 탈북동기와 생활실태 - 중국 연변지역의 탈북 여성들을 중심으로 -)

  • 문숙재;김지희;이명근
    • Journal of the Korean Home Economics Association
    • /
    • v.38 no.5
    • /
    • pp.137-152
    • /
    • 2000
  • "North Korean fugitives" is one of various terms referring to North Koreans who have secretly crossed the border of their country. It is a new term that has gained wider usage in our society since the 1990s. North Koreans list various motives for escaping their country, such as food shortage and disillusionment with the system. Most forced repatriation of North Korean escapees takes place in China. The purpose of this study is to examine the family life of female fugitives from North Korea in order to provide pertinent alternatives needed to secure their basic human rights and to enable them to maintain stable family lives and adapt to new socio-cultural circumstances in China. To this end, a preliminary survey was performed to examine the demographic characteristics of the female fugitives; to find out the incentives and channels of their escape from North Korea; to investigate what types of family life and family relationships they maintain in China; and to grasp their problems and needs in family life while adapting to Chinese society. The specific questions on the general characteristics of the female fugitives cover age, education level, and residential district in North Korea. To find out the main causes and influential factors of their escape from North Korea, the following questions are included: what the most important incentives and motives are; the frequency of escape; and whether they discussed their escape with their family. The questions on their present life situation in China concern difficulties in adjusting to China, family life, relationship with husband, conversational language, the degree of their mastery of the Chinese language, the degree of their adaptation to the Chinese way of living, and so forth, which reveal to what extent they have adapted to the new cultural situation in China. 
This study collected data through face-to-face personal interviews from July to October 1999 in Yenben province along the China-North Korea border. Data from 202 female fugitives were used in the final analysis. The study used the SAS PC program for Windows, Ver. 6.12, to analyze the data, including frequency distributions, percentages, and means. The results of the analysis are as follows. The principal motive for North Korean women's escape to China is to find food to survive the famine. As for the year of escape, all of the interviewees have escaped since 1990. Regarding contact with their family in North Korea after escape, 81.7% of the respondents have not been in touch with their family. The main reason given for not contacting their family in North Korea is that contact would not be helpful. Female fugitives from North Korea have difficulties in life. They have rather stable relationships with their husbands, but they have experienced difficulties in other aspects of family life. Their main difficulties stem largely from their relationships with their husbands' family members, and from problems related to their family in North Korea and their children. Based on this study, further research should present supportive policies that help North Korean female escapees live without deprivation and that protect their human rights. The development of practical programs to help their efficient social adaptation should also continue without interruption.

Some General Characteristics of the Abstracting Journals Published in Korea (한국초록집의 특성)

  • 최성진
    • Journal of the Korean BIBLIA Society for library and Information Science
    • /
    • v.7 no.1
    • /
    • pp.5-22
    • /
    • 1994
  • This paper attempts to define some general characteristics of the abstracting journals published in Korea, as evidenced in those published during the last ten years. This purpose is achieved by comparing the results of two studies conducted by the author in 1984 and in 1994. Both studies were conducted to present the state of the art in abstracting services in Korea. The major conclusions of this paper are summarised as follows: (1) Researchers and professionals working in a small number of subject fields benefit from the abstracting journals, which provide current-awareness services on recent achievements in research and development in Korea. Those in most fields have no abstracting journals of their own, and consequently no substantial abstracting services. Even in the fields that have some abstracting journals, many researchers and professionals are not informed of research results in their fields, because the abstracting journals are scattered across many narrow subjects and, in many cases, cover only publications of specific forms and kinds. (2) Abstracting journals that cover more than two subject fields, which are supposed to be of some help to researchers and professionals in subject fields that have no abstracting journals of their own, have rapidly increased in number in the past ten years. Most of such abstracting journals carry thesis and dissertation abstracts, and the rest carry abstracts of research papers published in specific places, in specific forms, or by specific institutions, and of reports of research projects sponsored by specific foundations. These abstracting journals are not of the kind that comprehensively provide researchers in related fields with current awareness of publications of research results in Korea. 
(3) Most of the abstracting journals existing in Korea are published by institutions of higher education and research institutes, and the rest by commercial publishers, industrial firms, libraries, information centres, government agencies, research foundations, learned societies, etc. Publishers that issue many titles are few in number, and those that publish one or two titles are many. The former are largely institutions of higher education and research institutes. (4) The abstracting journals published in Korea are classified by type into those of dissertations, research papers, journal articles, and patent specifications, in that descending order. The fact that Master's and doctoral dissertation abstracts dominate in Korea is due to the irrational practice of publishing those abstracts at many different institutions. (5) Most of the abstracting journals existing in Korea are published by national or government-supported research institutes in order to publicise their own research outputs. Their coverage of literature is normally narrow, and consequently their value to users is limited. (6) Korean is the desirable language for abstracting journals intended to be distributed within Korea. About half of the abstracting journals published in Korea are printed in Korean, and the other half in foreign languages or in Korean and foreign languages together. All the abstracting journals in foreign languages are printed in English except one, which is printed in Japanese. (7) Some twenty per cent of the abstracting journals in Korea are published monthly, bimonthly, or quarterly. The others are published annually, biannually, or irregularly. The latter may not function properly as a current-awareness tool because of the long intervals between their issues. It is particularly undesirable that about half of the abstracting journals in Korea are published irregularly. 
Most of the abstracting journals published in Korea are distributed free of charge to individuals and institutions selected by the publishers. (8) The number of abstracting journals produced with computers increased drastically in the past ten years. Abstracting journals produced by the conventional type-setting method will possibly disappear in Korea within the next ten years. Automation of the production of abstracting journals means not simply technical and economic improvement in publishing processes but also the availability of machine-readable databases that can be used for many other purposes, including generation of other bibliographical publications and provision of machine literature searching capabilities. Necessary steps should be taken for this important development immediately.

An Empirical Study on How the Moderating Effects of Individual Cultural Characteristics towards a Specific Target Affects User Experience: Based on the Survey Results of Four Types of Digital Device Users in the US, Germany, and Russia (특정 대상에 대한 개인 수준의 문화적 성향이 사용자 경험에 미치는 조절효과에 대한 실증적 연구: 미국, 독일, 러시아의 4개 디지털 기기 사용자를 대상으로)

  • Lee, In-Seong;Choi, Gi-Woong;Kim, So-Lyung;Lee, Ki-Ho;Kim, Jin-Woo
    • Asia pacific journal of information systems
    • /
    • v.19 no.1
    • /
    • pp.113-145
    • /
    • 2009
  • Recently, due to the globalization of the IT(Information Technology) market, devices and systems designed in one country are used in other countries as well. This phenomenon is becoming the key factor for increased interest on cross-cultural, or cross-national, research within the IT area. However, as the IT market is becoming bigger and more globalized, a great number of IT practitioners are having difficulty in designing and developing devices or systems which can provide optimal experience. This is because not only tangible factors such as language and a country's economic or industrial power affect the user experience of a certain device or system but also invisible and intangible factors as well. Among such invisible and intangible factors, the cultural characteristics of users from different countries may affect the user experience of certain devices or systems because cultural characteristics affect how they understand and interpret the devices or systems. In other words, when users evaluate the quality of overall user experience, the cultural characteristics of each user act as a perceptual lens that leads the user to focus on a certain elements of experience. Therefore, there is a need within the IT field to consider cultural characteristics when designing or developing certain devices or systems and plan a strategy for localization. In such an environment, existing IS studies identify the culture with the country, emphasize the importance of culture in a national level perspective, and hypothesize that users within the same country have same cultural characteristics. Under such assumptions, these studies focus on the moderating effects of cultural characteristics on a national level within a certain theoretical framework. 
This has already been suggested by cross-cultural studies conducted by scholars such as Hofstede(1980) in providing numerical research results and measurement items for cultural characteristics and using such results or items as they increase the efficiency of studies. However, such national level culture has its limitations in forecasting and explaining individual-level behaviors such as voluntary device or system usage. This is because individual cultural characteristics are the outcome of not only the national culture but also the culture of a race, company, local area, family, and other groups that are formulated through interaction within the group. Therefore, national or nationally dominant cultural characteristics may have its limitations in forecasting and explaining the cultural characteristics of an individual. Moreover, past studies in psychology suggest a possibility that there exist different cultural characteristics within a single individual depending on the subject being measured or its context. For example, in relation to individual vs. collective characteristics, which is one of the major cultural characteristics, an individual may show collectivistic characteristics when he or she is with family or friends but show individualistic characteristics in his or her workplace. Therefore, this study acknowledged such limitations of past studies and conducted a research within the framework of 'theoretically integrated model of user satisfaction and emotional attachment', which was developed through a former study, on how the effects of different experience elements on emotional attachment or user satisfaction are differentiated depending on the individual cultural characteristics related to a system or device usage. In order to do this, this study hypothesized the moderating effects of four cultural dimensions (uncertainty avoidance, individualism vs, collectivism, masculinity vs. 
femininity, and power distance) as suggested by Hofstede(1980) within the theoretically integrated model of emotional attachment and user satisfaction. Statistical tests were then implemented on these moderating effects through conducting surveys with users of four digital devices (mobile phone, MP3 player, LCD TV, and refrigerator) in three countries (US, Germany, and Russia). In order to explain and forecast the behavior of personal device or system users, individual cultural characteristics must be measured, and depending on the target device or system, measurements must be measured independently. Through this suggestion, this study hopes to provide new and useful perspectives for future IS research.

A Research in Applying Big Data and Artificial Intelligence on Defense Metadata using Multi Repository Meta-Data Management (MRMM) (국방 빅데이터/인공지능 활성화를 위한 다중메타데이터 저장소 관리시스템(MRMM) 기술 연구)

  • Shin, Philip Wootaek;Lee, Jinhee;Kim, Jeongwoo;Shin, Dongsun;Lee, Youngsang;Hwang, Seung Ho
    • Journal of Internet Computing and Services
    • /
    • v.21 no.1
    • /
    • pp.169-178
    • /
    • 2020
  • Reductions in troops and human resources and the improvement of combat power have led the Korean Department of Defense to actively adopt 4th Industrial Revolution technology (Artificial Intelligence, Big Data). The defense information system has been developed in various ways according to the task and the uniqueness of each military. In order to take full advantage of the 4th Industrial Revolution technology, it is necessary to improve the closed defense data management system. However, the establishment and usage of data standards in all information systems for the utilization of defense big data and artificial intelligence has limitations due to security issues, the business characteristics of each military, and the difficulty of standardizing large-scale systems. Based on the interworking requirements of each system, data sharing is limited to direct linkage through interoperability agreements between systems. In order to implement smart defense using the 4th Industrial Revolution technology, it is urgent to prepare a system that can share defense data and make good use of it. To support this technically, it is critical to develop Multi Repository Meta-Data Management (MRMM), which supports systematic standard management of defense data by managing the enterprise standard and the standard mapping for each system, and which promotes data interoperability through linkage between standards in compliance with the Defense Interoperability Management Development Guidelines. We introduce MRMM and implement it using vocabulary similarity based on machine learning and a statistical approach. Based on MRMM, we expect to simplify the standardization and integration of all military databases using artificial intelligence and big data. This will lead to a large reduction of the defense budget while increasing combat power for implementing smart defense.

A Convergence Study for the Academic Systematization of Cartoon-animation (만화영상학의 학문적 체계화를 위한 융합적 연구)

  • Lim, Jae-Hwan
    • Cartoon and Animation Studies
    • /
    • s.43
    • /
    • pp.285-320
    • /
    • 2016
  • Cartoons and animation are convergent arts created through a composite application of language arts expressed in the form of literary texts and sounds, plastic arts visualized in the form of artistic paintings, and film arts produced in the form of moving pictures. The university major in cartoons and animation studies, established in the late 20th century, however, did not satisfactorily meet the needs of academic research and development, and the free expression of artistic creation was limited. In order to systematize the major in cartoons and animation studies, a convergent approach is needed to establish and clarify the following: the terms and definitions, the historical developments, the research areas and methods, and the major education and related jobs and start-ups. New culture and arts industries including cartoons, animation, moving images, and game contents are not yet listed in the industry listing service jointly provided online by the portal site Naver.com and Hyung-Seol publishing company. Above all, cartoons and animation are so inseparably related that if either term is used separately and independently, the meaning may not be complete. A new combined term, "Animatoon", can therefore be established for the major in cartoons and animation studies and also used for its degree, with concentrations in cartoons, animation, moving images, games, etc. In the Introduction, the new combined term Animatoon is defined, and its use as the name of the major and degree in cartoons and animation studies is explained. In the body, first, the Historical Developments section classifies Animatoon into ancient, medieval, and modern times and analyzes each with the help of aesthetics and the arts, using examples of mural frescos, animal paintings, religious cartoons, caricatures, cartoons, satirical cartoons, comics, animation, 2- or 3-dimensional webtoons, and K-toons. 
Second, the Research Areas of Animatoon section reviews the theories, genres, artworks, and artists, and the Research Methods of Animatoon section presents a curriculum that integrates courses in the humanities, science and technology, culture and arts, etc. Third, the Major Education section considers Animatoon education for children, young adults, and students of the major, and the Related Jobs and Start-Ups section explores various jobs relating to personal creation of artwork and collective production of business-oriented artwork. In the Conclusion, the current challenges of Animatoon are identified as personalization of the artists, specialization of the contents, diversification of the types, and liberalization of art creation. The proposed direction of improvement is for Animatoon to become an academic field of study, an art, a culture, and an industry. The importance of cartoons and animation, along with videos and games, rose in the 21st century. For cartoons and animation to take a leading role, efforts should be made to study Animatoon academically and to develop it into good content for the cultural industries.

Semantic Process Retrieval with Similarity Algorithms (유사도 알고리즘을 활용한 시맨틱 프로세스 검색방안)

  • Lee, Hong-Joo;Klein, Mark
    • Asia pacific journal of information systems
    • /
    • v.18 no.1
    • /
    • pp.79-96
    • /
    • 2008
  • One of the roles of the Semantic Web services is to execute dynamic intra-organizational services including the integration and interoperation of business processes. Since different organizations design their processes differently, the retrieval of similar semantic business processes is necessary in order to support inter-organizational collaborations. Most approaches for finding services that have certain features and support certain business processes have relied on some type of logical reasoning and exact matching. This paper presents our approach of using imprecise matching for expanding results from an exact matching engine to query the OWL(Web Ontology Language) MIT Process Handbook. MIT Process Handbook is an electronic repository of best-practice business processes. The Handbook is intended to help people: (1) redesigning organizational processes, (2) inventing new processes, and (3) sharing ideas about organizational practices. In order to use the MIT Process Handbook for process retrieval experiments, we had to export it into an OWL-based format. We model the Process Handbook meta-model in OWL and export the processes in the Handbook as instances of the meta-model. Next, we need to find a sizable number of queries and their corresponding correct answers in the Process Handbook. Many previous studies devised artificial dataset composed of randomly generated numbers without real meaning and used subjective ratings for correct answers and similarity values between processes. To generate a semantic-preserving test data set, we create 20 variants for each target process that are syntactically different but semantically equivalent using mutation operators. These variants represent the correct answers of the target process. We devise diverse similarity algorithms based on values of process attributes and structures of business processes. 
We use simple similarity algorithms for text retrieval, such as TF-IDF and Levenshtein edit distance, to devise our approaches, and we use a tree edit distance measure because semantic processes appear to have a graph structure. We also design similarity algorithms that consider the similarity of process structure, such as part processes, goals, and exceptions. Since we can identify relationships between a semantic process and its subcomponents, this information can be used for calculating similarities between processes. Dice's coefficient and Jaccard similarity measures are used to calculate the proportion of overlap between processes in diverse ways. We perform retrieval experiments to compare the performance of the devised similarity algorithms. We measure retrieval performance in terms of precision, recall, and F-measure, the harmonic mean of precision and recall. The tree edit distance shows the poorest performance on all measures. TF-IDF and the method combining the TF-IDF measure with Levenshtein edit distance perform better than the other devised methods. These two measures focus on the similarity between the names and descriptions of processes. In addition, we calculate a rank correlation coefficient, Kendall's tau-b, between the number of process mutations and the ranking of similarity values among the mutation sets. In this experiment, similarity measures based on process structure, such as Dice's, Jaccard, and derivatives of these measures, show greater coefficients than measures based on values of process attributes. However, the Lev-TFIDF-JaccardAll measure, which considers process structure and attribute values together, shows reasonably good performance in both experiments. For retrieving semantic processes, it therefore appears better to consider diverse aspects of process similarity, such as process structure and values of process attributes. 
We generate semantic process data and a dataset for retrieval experiments from the MIT Process Handbook repository. We suggest imprecise query algorithms that expand the retrieval results of an exact matching engine such as SPARQL, and we compare the retrieval performance of the similarity algorithms. As for limitations and future work, we need to perform experiments with other datasets from other domains. Also, since there are many similarity values from diverse measures, we may find better ways to identify relevant processes by applying these values simultaneously.
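
Several of the measures named in this abstract have standard textbook definitions, sketched below in Python. This is a generic illustration, not the paper's Lev-TFIDF-JaccardAll method or its process model. The function names and the idea of representing a process by a set of sub-activity names are assumptions made for the example.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def jaccard(a, b):
    """Jaccard similarity: size of intersection over size of union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def dice(a, b):
    """Dice's coefficient: twice the intersection over the sum of set sizes."""
    a, b = set(a), set(b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0


def f_measure(retrieved, relevant):
    """Harmonic mean of precision and recall for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(retrieved)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)


# Hypothetical processes described by the names of their sub-activities.
process_a = {"select", "pay", "ship"}
process_b = {"select", "pay", "deliver"}
overlap_jaccard = jaccard(process_a, process_b)
overlap_dice = dice(process_a, process_b)
```

Jaccard and Dice both measure set overlap but weight the shared elements differently, so they rank candidate processes similarly while giving different absolute scores.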

Abstracting Services in Korea (한국의 초록서비스에 대하여)

  • Choi Sung-Jin
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.24
    • /
    • pp.9-51
    • /
    • 1993
  • The purpose of this study is twofold: to investigate into general characteristics of the abstracting services in Korea and to discuss general directions of development of the abstracting services in the country. This study is designed to achieve the purpose by gathering and analysing data related to the abstracting journals published in the past ten years and by comparing the results with similar data gathered by the investigator in 1984. The major conclusions made in this study is summarised as follows. (1) Researchers and professionals working in limited numbers of subject fields are benefited by abstracting services of recent achievements in research and development in Korea. Those in most of the fields have essentially no abstracting services of such achievements. Even many researchers and professionals in the limited numbers of the fields that have some elementary abstracting services are not informed of research results in their fields because the abstracting journals are scattered in many narrow subjects and in many cases, the abstracting journals only cover publications in some specific forms and kinds. (2) Abstracting journals of general subjects, which are supposed to be of more or less help to the researchers in the subject fields that have no abstracting journals of their own, have rapidly increased in number in the past ten years. Most of such abstracting journals carry thesis and dissertation abstracts, and the rest those of research papers published in specific places, in specific forms, by specific institutes, and of reports of research projects sponsored by specific foundations. These abstracting journals are not of the kind that comprehensively provide general readers with current awareness of publications of research results in Korea. 
(3) Most of the abstracting journals existing in Korea are published by institutions of higher education and research institutes, and the rest by commercial publishers, industrial firms, libraries, information centers, government agencies, research foundations, learned societies, etc. Publishers of many titles are few, and publishers of only one or two titles are many. The former are largely institutions of higher education and research institutes. (4) Ten years ago, there was not a single publishing house that produced abstracting journals. Three commercial publishing houses now produce them. With this change, centers of excellence are being founded and competitive elements are being introduced into abstracting services. This change, in turn, is expected to improve the quality of the other abstracting journals in Korea. (5) The abstracting journals published in Korea are classified by type into those of dissertations, research papers, journal articles, and patent specifications, in that descending order. The dominance of master's and doctoral dissertation abstracts in Korea is due to the irrational practice of publishing those abstracts at many institutions. (6) Most of the abstracting journals existing in Korea are published by national or government-supported research institutes in order to publicise their own research outputs. Their coverage of the literature is normally narrow, and their value to users is accordingly limited. (7) The number of abstracting journals published in Korea increased at a rate of $77.8-100\%$ every five years over the past twenty-five years. Most of the abstracting journals that ceased publication during the period survived for two years. (8) Korean is the desirable language for abstracting journals designed to be distributed within Korea. 
About half of the abstracting journals published in Korea are printed in Korean, and the other half in foreign languages or in Korean together with foreign languages. All the abstracting journals in foreign languages are printed in English except one, which is printed in Japanese. (9) Some twenty percent of the abstracting journals in Korea are published monthly, bimonthly, or quarterly. The others are published annually, biannually, or irregularly. The latter may not function properly as current-awareness tools because of the long intervals between their issues. It is particularly undesirable that about half of the abstracting journals in Korea are published irregularly. Most of the abstracting journals published in Korea are distributed free of charge to individuals and institutions selected by the publishers. (10) The number of abstracting journals produced with computers has increased drastically in the past ten years. Abstracting journals produced by the conventional type-setting method will probably disappear in Korea within another ten years. Automating the production of abstracting journals means not only technical and economic improvement of publishing processes but also the availability of machine-readable databases that can be used for other purposes, including the generation of other publications and the provision of machine literature searching capabilities. Necessary steps should be taken to support this important development in the abstracting services in Korea.

A Methodology for Automatic Multi-Categorization of Single-Categorized Documents (단일 카테고리 문서의 다중 카테고리 자동확장 방법론)

  • Hong, Jin-Sung;Kim, Namgyu;Lee, Sangwon
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.3
    • /
    • pp.77-92
    • /
    • 2014
  • Recently, numerous documents including unstructured data and text have been created due to the rapid increase in the usage of social media and the Internet. Each document is usually assigned a specific category for the convenience of users. In the past, categorization was performed manually. However, manual categorization cannot guarantee accuracy, and it also requires a large amount of time and cost. Many studies have been conducted on the automatic creation of categories to overcome the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to complex documents with multiple topics, because they assume that one document can be assigned to one category only. To overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, they are also limited in that their learning process requires training on a multi-categorized document set. These methods therefore cannot be applied to the multi-categorization of most documents unless multi-categorized training sets are provided. To remove the requirement of a multi-categorized training set imposed by traditional multi-categorization algorithms, we propose a new methodology that can extend the category of a single-categorized document to multiple categories by analyzing the relationships among categories, topics, and documents. First, we find the relationship between documents and topics by using the result of topic analysis for single-categorized documents. Second, we construct a correspondence table between topics and categories by investigating the relationship between them. Finally, we calculate the matching scores of each document against multiple categories. 
A document is classified into a certain category if and only if its matching score for that category is higher than a predefined threshold. For example, we can classify a document into the three categories whose matching scores exceed the predefined threshold. The main contribution of our study is that our methodology can improve the applicability of traditional multi-category classifiers by generating multi-categorized documents from single-categorized documents. Additionally, we propose a module for verifying the accuracy of the proposed methodology. For performance evaluation, we performed intensive experiments with news articles. News articles are clearly categorized by theme, and they contain less vulgar language and slang than other ordinary text documents. We collected news articles from July 2012 to June 2013. The number of articles varies widely across categories. This is because readers have different levels of interest in each category. The variation is also attributed to differences in the frequency of events in each category. To minimize the distortion of the results caused by the unequal number of articles in different categories, we extracted 3,000 articles from each of the eight categories. Therefore, the total number of articles used in our experiments was 24,000. The eight categories were "IT Science," "Economy," "Society," "Life and Culture," "World," "Sports," "Entertainment," and "Politics." Using the news articles that we collected, we calculated the document/category correspondence scores from the topic/category and document/topic correspondence scores. The document/category correspondence score indicates the degree to which each document corresponds to a certain category. As a result, we could present two additional categories for each of 23,089 documents. 
Precision, recall, and F-score were 0.605, 0.629, and 0.617, respectively, when only the top 1 predicted category was evaluated, whereas they were 0.838, 0.290, and 0.431 when the top 1-3 predicted categories were considered. Notably, precision, recall, and F-score varied widely among the eight categories.
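
The scoring steps described in this abstract can be illustrated with a minimal sketch. It assumes that the document/category score is a weighted sum of document/topic weights and topic/category weights; the abstract does not state the exact formula, so this combination rule, the toy weights, the threshold value, and all function names are hypothetical. The `f_score` helper uses the standard harmonic mean of precision and recall, which is consistent with the figures reported above.

```python
def document_category_scores(doc_topic, topic_category):
    """Combine document/topic and topic/category weights (assumed: weighted sum)."""
    n_categories = len(topic_category[0])
    scores = []
    for topic_weights in doc_topic:
        row = []
        for c in range(n_categories):
            # Sum over topics: weight of topic in document * weight of topic in category.
            row.append(sum(w * topic_category[t][c] for t, w in enumerate(topic_weights)))
        scores.append(row)
    return scores


def assign_categories(scores, category_names, threshold):
    """Keep every category whose matching score exceeds the threshold."""
    return [
        [name for name, s in zip(category_names, row) if s > threshold]
        for row in scores
    ]


def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): 2 documents, 3 topics, 2 categories.
doc_topic = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
topic_category = [[0.9, 0.1], [0.5, 0.5], [0.0, 1.0]]
categories = ["Economy", "Sports"]

scores = document_category_scores(doc_topic, topic_category)
labels = assign_categories(scores, categories, threshold=0.25)
print(labels)
print(round(f_score(0.605, 0.629), 3), round(f_score(0.838, 0.290), 3))
```

With these toy weights, the first document exceeds the threshold for both categories while the second exceeds it for only one, which mirrors how a single-categorized document can gain additional categories. Applying `f_score` to the reported precision and recall pairs reproduces the reported F-scores after rounding to three decimals.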