• Title/Summary/Keyword: 키워드 학습 (keyword learning)

The Use of Reinforcement Learning and The Reference Page Selection Method to improve Web Spidering Performance (웹 탐색 성능 향상을 위한 강화학습 이용과 기준 페이지 선택 기법)

  • 이기철;이선애
    • Journal of the Korea Computer Industry Society / v.3 no.3 / pp.331-340 / 2002
  • The web is growing so large and intractable that, without intelligent information extraction, users become increasingly helpless. Conventional web spidering techniques designed for general-purpose search engines may be too slow for specialized search engines, which concentrate only on specific areas or keywords. This paper proposes and experimentally evaluates a new model for improving web spidering. Selecting adequate reference web pages from the initial web page set relevant to a given area (or set of keywords) can be very important for reducing spidering time. Our reference page selection method, DOPS, dynamically and orthogonally selects web pages, and it can also decide the appropriate number of reference pages using a newly defined measure. Even for a very specific area, this method performed almost at the level of human experts. Considering that experts cannot work on a huge initial page set and still have difficulty deciding the optimal number of reference pages, the method seems very promising. We also applied reinforcement learning to the web environment, and DOPS-based reinforcement learning experiments show that our method performs favorably in terms of both the number of hyperlinks followed and time.

Literature Review of AI Hallucination Research Since the Advent of ChatGPT: Focusing on Papers from arXiv (챗GPT 등장 이후 인공지능 환각 연구의 문헌 검토: 아카이브(arXiv)의 논문을 중심으로)

  • Park, Dae-Min;Lee, Han-Jong
    • Informatization Policy / v.31 no.2 / pp.3-38 / 2024
  • Hallucination is a significant barrier to the use of large language models and multimodal models. In this study, we collected 654 computer science papers with "hallucination" in the abstract, posted to arXiv between December 2022 and January 2024, following the advent of ChatGPT, and conducted frequency analysis, knowledge network analysis, and a literature review to explore the latest trends in hallucination research. The results showed that research was most active in the fields of "Computation and Language," "Artificial Intelligence," "Computer Vision and Pattern Recognition," and "Machine Learning." We then analyzed the research trends in these four major fields by focusing on the main authors and dividing the work into data, hallucination detection, and hallucination mitigation. The main research trends included hallucination mitigation through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), inference enhancement via "chain of thought" (CoT), and growing interest in hallucination mitigation within multimodal AI. This study provides insights into the latest developments in hallucination research through a technology-oriented literature review, and it is expected to help subsequent research in engineering as well as the humanities and social sciences.

Analysis of Research Trends in Home Economics Education by Language Network Analysis: Focused on the KCI Journals (2000-2019) (언어 네트워크 분석에 기반 한 가정과교육 연구 동향 분석: 2000-2019년 KCI 등재지를 중심으로)

  • Gham, Kyoung Won;Park, Mi Jeong
    • Journal of Korean Home Economics Education Association / v.32 no.3 / pp.179-197 / 2020
  • This study analyzed trends in home economics education research using language network analysis, focusing on papers published in KCI-listed journals over the 20 years from 2000 to 2019. A total of 501 home economics education papers were analyzed through word clouds, centrality analysis, and topic modeling using NetMiner 4.4, and the results are as follows. First, the number of home economics education papers published in KCI-listed journals increased from 186 in the 2000s to 315 in the 2010s. The academic journals in which these papers were published diversified from 16 in the 2000s to 22 in the 2010s. 60% of all papers were published in the 'Journal of Korean Home Economics Education Association', and since 2018 the number of papers published in the 'Journal of Learner-Centered Curriculum and Instruction' has increased dramatically. Second, in both the 2000s and the 2010s, home economics education studies published in KCI-listed journals fell into four categories: home economics education content analysis, home economics educational program development & application, curriculum analysis, and perception survey & direction exploration. In the 2000s, 'Home Economics Teacher' appeared as the main keyword, and many perception survey & direction exploration studies were conducted. By comparison, the influence of 'development' increased in the 2010s, and many studies analyzed home economics education content and developed and applied home economics programs. This study is significant in that it analyzed research trends in home economics education (HEE) by expanding the analysis target and analysis period of existing studies.

A Question Example Generation System for Multiple Choice Tests by utilizing Concept Similarity in Korean WordNet (한국어 워드넷에서의 개념 유사도를 활용한 선택형 문항 생성 시스템)

  • Kim, Young-Bum;Kim, Yu-Seop
    • The KIPS Transactions:PartA / v.15A no.2 / pp.125-134 / 2008
  • We implemented a system that suggests example sentences for multiple choice tests while taking the level of students into account. To build the system, we designed an automatic method for sentence generation that makes it possible to control the difficulty of questions. Proper evaluation with multiple choice tests requires a question pool of adequate size. To satisfy this requirement, a system is needed that can quickly generate many varied questions and their example sentences. In this paper, we designed an automatic generation method using a linguistic resource called WordNet. For automatic generation, we first extract keywords from existing sentences through morphological analysis, and then suggest candidate terms whose meaning is similar to the keywords in the Korean WordNet space. To suggest candidate terms, we transformed the existing Korean WordNet scheme into a new scheme in order to construct a concept similarity matrix. The similarity degree between concepts ranges from 0, representing a synonym relationship, to 9, representing a non-connected relationship. By using this degree, we can control the difficulty of newly generated questions. We used two methods for evaluating the semantic similarity between two concepts. The first considers only the distance between the two concepts, and the second additionally considers the positions of the two concepts in the Korean WordNet space. With these methods, we can build a system that helps instructors more easily generate new questions and example sentences, with varied content and difficulty, from existing sentences.
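
The distance-only similarity degree described in this abstract might be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy taxonomy `EDGES`, the helper names, and the choice of shortest path length as the degree are all assumptions, and only the 0-to-9 range comes from the abstract.

```python
from collections import deque

# Hypothetical toy taxonomy (undirected links); not taken from Korean WordNet.
EDGES = [
    ("animal", "dog"), ("animal", "cat"), ("dog", "puppy"),
    ("plant", "tree"),
]
MAX_DEGREE = 9  # degree assigned to non-connected concepts, as in the abstract


def build_graph(edges):
    """Build an adjacency map from a list of undirected concept links."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def similarity_degree(graph, source, target):
    """Shortest path length between two concepts, capped at MAX_DEGREE.

    Returns 0 for the same concept and MAX_DEGREE when no path exists.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == target:
                return min(dist + 1, MAX_DEGREE)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return MAX_DEGREE


def candidates(graph, keyword, degree):
    """Concepts at exactly the requested similarity degree from the keyword."""
    return sorted(
        c for c in graph
        if c != keyword and similarity_degree(graph, keyword, c) == degree
    )


GRAPH = build_graph(EDGES)
```

Under these assumptions, `candidates(GRAPH, "dog", 1)` returns closely related terms, while a larger degree returns more distant ones, which is one way a single number could steer how hard a generated question is.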

Keyword Analysis of Research on Consumption of Children and Adolescents Using Text Mining (텍스트마이닝을 활용한 아동, 청소년 대상 소비관련 연구 키워드 분석)

  • Jin, Hyun-Jeong
    • Journal of Korean Home Economics Education Association / v.33 no.4 / pp.1-13 / 2021
  • The purpose of this study is to identify trends and potential themes in 20 years of research on consumption by children and adolescents by analyzing keywords. The keywords of 869 studies on consumption by children and adolescents, published in journals listed in the Korean Citation Index, were analyzed using text mining techniques. The most frequent keywords were, in order, youth, youth consumers, consumer education, conspicuous consumption, consumption behavior, and character. Analyzing keyword frequency in five-year periods confirmed that the frequency of consumer education was significantly higher between 2006 and 2010. Research on ethical consumption has been active since 2011, and during the most recent five-year period research has covered various topics without a single prominent keyword. Based on TF-IDF, keywords related to the environment and the Internet were the main keywords between 2001 and 2005. From 2006 to 2010, the TF-IDF values of media use, advertisement education, and Internet items were high. Important themes were fair trade, green growth, green consumption, North Korean defector youths, and social media from 2011 to 2015, and text mining, sustainable development education, maker education, and the 2015 revised curriculum from 2016 to 2020. Topic modeling derived eight topics: consumer education, mass media/peer culture, rational consumption, Hallyu/cultural industry, consumer competency, economic education, teaching and learning method, and eco-friendly/ethical consumption. Network analysis found that conspicuous consumption and consumer education are important topics in research on consumption by children and adolescents.
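
A period-level TF-IDF comparison of the kind this abstract mentions can be sketched as below. The keyword lists are hypothetical stand-ins, not the study's data, and treating each five-year period as one document with an unsmoothed `log(N / df)` weight is an assumption; the abstract does not state which TF-IDF variant was used.

```python
import math
from collections import Counter

# Hypothetical keyword lists per period; not the study's data.
PERIOD_KEYWORDS = {
    "2001-2005": ["environment", "internet", "consumer education"],
    "2006-2010": ["consumer education", "media use", "consumer education"],
    "2011-2015": ["fair trade", "consumer education", "green consumption"],
}


def tf_idf(period_keywords):
    """Score each keyword per period, treating each period as one document."""
    n_periods = len(period_keywords)
    # Document frequency: in how many periods each keyword appears at all.
    doc_freq = Counter()
    for keywords in period_keywords.values():
        doc_freq.update(set(keywords))
    scores = {}
    for period, keywords in period_keywords.items():
        counts = Counter(keywords)
        total = len(keywords)
        scores[period] = {
            kw: (count / total) * math.log(n_periods / doc_freq[kw])
            for kw, count in counts.items()
        }
    return scores


SCORES = tf_idf(PERIOD_KEYWORDS)
```

In this toy data, "consumer education" appears in every period, so its score is zero everywhere even where its raw frequency is highest. That illustrates why a frequency ranking and a TF-IDF ranking can surface different keywords for the same period.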

A Generation and Matching Method of Normal-Transient Dictionary for Realtime Topic Detection (실시간 이슈 탐지를 위한 일반-급상승 단어사전 생성 및 매칭 기법)

  • Choi, Bongjun;Lee, Hanjoo;Yong, Wooseok;Lee, Wonsuk
    • The Journal of Korean Institute of Next Generation Computing / v.13 no.5 / pp.7-18 / 2017
  • Recently, the number of SNS users has increased rapidly with the growth of the smart device industry, and the amount of generated data is increasing exponentially. On Twitter, text data generated by users is a key research subject because it reflects events, accidents, product reputations, and brand images. Twitter has become a channel through which users receive and exchange information, and one of its important characteristics is that it operates in real time. Among the various kinds of events, earthquakes, floods, and suicides need to be analyzed rapidly so that a response can follow immediately. Analyzing an event requires collecting the tweets related to it, but it is difficult to find all related tweets using normal keywords alone. To solve this problem, this paper proposes a method for generating and matching a normal-transient dictionary for real-time topic detection. Normal dictionaries consist of general keywords related to an event (for the event "suicide": death-loop, death, die, hang oneself, etc.), whereas transient dictionaries consist of transient keywords related to an event (for the event "suicide": names and information of celebrities, information on social issues). Experimental results show that the matching method using the two dictionaries finds more tweets related to an event than a simple keyword search.
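
The two-dictionary matching idea can be illustrated with a small sketch. Everything here is hypothetical: the dictionaries, the sample tweets, and the rule that a tweet matches when it contains a term from either dictionary. The abstract does not give the paper's actual generation or matching rules.

```python
# Hypothetical dictionaries for an "earthquake" event; not from the paper.
NORMAL = {"earthquake", "quake", "tremor"}   # general, stable keywords
TRANSIENT = {"pohang", "aftershock"}         # terms that surge around one event

# Hypothetical sample tweets.
TWEETS = [
    "Strong earthquake felt downtown!",
    "Is everyone in Pohang okay? #aftershock",
    "Lovely weather today",
]


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation removed."""
    return {tok.strip(".,!?#").lower() for tok in text.split()}


def keyword_search(tweets, normal):
    """Baseline: keep tweets that contain at least one normal keyword."""
    return [t for t in tweets if tokenize(t) & normal]


def dictionary_match(tweets, normal, transient):
    """Keep tweets that contain a term from either dictionary (assumed rule)."""
    return [t for t in tweets if tokenize(t) & (normal | transient)]


BASELINE = keyword_search(TWEETS, NORMAL)
MATCHED = dictionary_match(TWEETS, NORMAL, TRANSIENT)
```

With this toy data the baseline finds one tweet and the two-dictionary match finds two, since the second tweet mentions the event only through transient terms.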

A Study on the Research Trends in the Fourth Industrial Revolution in Korea Using Topic Modeling (토픽모델링을 활용한 4차 산업혁명 분야의 국내 연구 동향 분석)

  • Gi Young Kim;Dong-Jo Noh
    • Journal of the Korean BIBLIA Society for library and Information Science / v.34 no.4 / pp.207-234 / 2023
  • Since the advent of the Fourth Industrial Revolution, related studies have been conducted in various fields, including industry. In this study, to analyze domestic research trends on the Fourth Industrial Revolution, keyword analysis and topic modeling based on the LDA algorithm were conducted on 2,115 papers included in the KCI from January 2016 to August 2023. The results are as follows. First, the journals that published more than 30 academic papers related to the Fourth Industrial Revolution were digital convergence research, humanities society 21, e-business research, and learner-centered subject education research. Second, the topic modeling analysis selected seven topics: "human and artificial intelligence," "data and personal information management," "curriculum change and innovation," "corporate change and innovation," "education change and jobs," "culture and arts and content," and "information and corporate policies and responses." Third, common research topics related to the Fourth Industrial Revolution are "change in the curriculum," "human and artificial intelligence," and "culture arts and content," and common keywords include "company," "information," "protection," "smart," and "system." Fourth, in the first half of the research period (2016-2019), topics in the field of education ranked highest, but in the second half (2020-2023), topics related to corporate, smart, digital, and service innovation ranked highest. Fifth, research topics tended to become more specific or subdivided in the second half of the research period. This trend is interpreted as a result of the socioeconomic changes that occurred as the core technologies of the Fourth Industrial Revolution were applied in various industrial fields after the COVID-19 pandemic. The results of this study are expected to provide useful information for identifying research trends in the field of the Fourth Industrial Revolution, establishing strategies, and conducting subsequent research.

Exploratory research on the dynamic capabilities of leading firms: Research framework building (시장 선도 기업의 동태적 역량에 대한 탐색적 연구: 연구 프레임워크 구축)

  • Paek, Byung-Joo;Lee, Hee-Sang
    • Journal of the Korea Academia-Industrial cooperation Society / v.16 no.12 / pp.8262-8273 / 2015
  • Sources of innovation and sustainable competitive advantage for market-leading firms in a dynamic environment are major concerns for both firm managers and academic researchers. For this reason, the dynamic capability view (DCV), which evolved from the resource-based view (RBV), is gaining popularity as a cornerstone for the success of firms in a dynamic market. DCV proposes a number of new concepts and elements, but these are still under development. This study suggests a comprehensive constitution of dynamic capabilities by performing a systematic literature review of 59 conceptual research papers from 238 bodies of literature. In the integrative conceptual framework, the elements of dynamic capabilities are segmented into organizational learning and strategic choices across the sensing, seizing, and reconfiguring stages. In addition, our conceptual framework lays a foundation for further empirical studies investigating the sources of competitive advantage of market-leading firms.

Feature Extraction to Detect Hoax Articles (낚시성 인터넷 신문기사 검출을 위한 특징 추출)

  • Heo, Seong-Wan;Sohn, Kyung-Ah
    • Journal of KIISE / v.43 no.11 / pp.1210-1215 / 2016
  • Readership of online newspapers has grown with the proliferation of smart devices. However, fierce competition between Internet newspaper companies has resulted in a large increase in the number of hoax articles. Hoax articles are those whose title does not convey the content of the main story, which gives readers the wrong information about the contents. We note that hoax articles have certain characteristics, such as unnecessary celebrity quotations, a mismatch between the title and the content, or incomplete sentences. Based on these, we extract and validate features to identify hoax articles. We build a large-scale training dataset by analyzing text keywords in replies to articles and extract five effective features. We evaluate the performance of a support vector machine classifier on the extracted features and observe 92% accuracy on our validation set. In addition, we present a selective bigram model to measure the consistency between the title and the content, which can be used effectively to analyze short texts in general.
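
One simple way to score title-content consistency is plain bigram overlap, sketched below. This is not the paper's selective bigram model, whose selection rule the abstract does not describe; the function names, the tokenization, and the overlap ratio are assumptions chosen for illustration.

```python
def bigrams(text):
    """Set of adjacent lowercase token pairs, with edge punctuation stripped."""
    tokens = [t.strip(".,!?\"'").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    return set(zip(tokens, tokens[1:]))


def title_consistency(title, body):
    """Fraction of the title's bigrams that also occur in the body text."""
    title_bigrams = bigrams(title)
    if not title_bigrams:
        return 0.0
    return len(title_bigrams & bigrams(body)) / len(title_bigrams)


# Hypothetical example article body.
BODY = "The city council approves budget cuts today."
MATCHING = title_consistency("City council approves budget", BODY)
MISLEADING = title_consistency("You will not believe this", BODY)
```

A score near 1 suggests the title is echoed in the body, while a score near 0 flags a possible mismatch. Such a score could serve as one input feature to a classifier alongside the other features the abstract lists.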

Case Study for the Communication Method of Information Design Type Advertising (정보디자인형 광고의 커뮤니케이션 기법에 관한 연구)

  • Kim, Jong-Min;Park, Han-Sol
    • The Journal of the Korea Contents Association / v.17 no.11 / pp.90-101 / 2017
  • This study analyzes the meaning and characteristics of information design type advertising. It examines how advertising relates to information and explores samples of information design type advertising through in-depth analysis with an expert group and a non-expert group. This type of advertising attracts customers by visualizing striking information and data with information design techniques. It can be classified into manual type ads, identity type ads, and data visualizing type ads. Its communication formula follows the keywords Attention, Curation, and Study; Curation and Study are new steps that did not previously exist in consumer behavior models. The information it uses comes from common sense or from storytelling built on imagination, but no example was found that used false information distorting the truth. Rather than exaggeration or falsehood, interest grounded in trust creates a bond of sympathy.