• Title/Summary/Keyword: keyword-based learning

Search Result 133, Processing Time 0.026 seconds

Understanding of Generative Artificial Intelligence Based on Textual Data and Discussion for Its Application in Science Education (텍스트 기반 생성형 인공지능의 이해와 과학교육에서의 활용에 대한 논의)

  • Hunkoog Jho
    • Journal of The Korean Association For Science Education
    • /
    • v.43 no.3
    • /
    • pp.307-319
    • /
    • 2023
  • This study aims to explain the key concepts and principles of text-based generative artificial intelligence (AI) that has been receiving increasing interest and utilization, focusing on its application in science education. It also highlights the potential and limitations of utilizing generative AI in science education, providing insights for its implementation and research aspects. Recent advancements in generative AI, predominantly based on transformer models consisting of encoders and decoders, have shown remarkable progress through optimization of reinforcement learning and reward models using human feedback, as well as understanding context. Particularly, it can perform various functions such as writing, summarizing, keyword extraction, evaluation, and feedback based on the ability to understand various user questions and intents. It also offers practical utility in diagnosing learners and structuring educational content based on provided examples by educators. However, it is necessary to examine the concerns regarding the limitations of generative AI, including the potential for conveying inaccurate facts or knowledge, bias resulting from overconfidence, and uncertainties regarding its impact on user attitudes or emotions. Moreover, the responses provided by generative AI are probabilistic based on response data from many individuals, which raises concerns about limiting insightful and innovative thinking that may offer different perspectives or ideas. In light of these considerations, this study provides practical suggestions for the positive utilization of AI in science education.

Recognition Method of Korean Abnormal Language for Spam Mail Filtering (스팸메일 필터링을 위한 한글 변칙어 인식 방법)

  • Ahn, Hee-Kook;Han, Uk-Pyo;Shin, Seung-Ho;Yang, Dong-Il;Roh, Hee-Young
    • Journal of Advanced Navigation Technology
    • /
    • v.15 no.2
    • /
    • pp.287-297
    • /
    • 2011
  • As electronic mails are being widely used for facility and speedness of information communication, as the amount of spam mails which have malice and advertisement increase and cause lots of social and economic problem. A number of approaches have been proposed to alleviate the impact of spam. These approaches can be categorized into pre-acceptance and post-acceptance methods. Post-acceptance methods include bayesian filters, collaborative filtering and e-mail prioritization which are based on words or sentances. But, spammers are changing those characteristics and sending to avoid filtering system. In the case of Korean, the abnormal usages can be much more than other languages because syllable is composed of chosung, jungsung, and jongsung. Existing formal expressions and learning algorithms have the limits to meet with those changes promptly and efficiently. So, we present an methods for recognizing Korean abnormal language(Koral) to improve accuracy and efficiency of filtering system. The method is based on syllabic than word and Smith-waterman algorithm. Through the experiment on filter keyword and e-mail extracted from mail server, we confirmed that Koral is recognized exactly according to similarity level. The required time and space costs are within the permitted limit.

An Investigation on Digital Humanities Research Trend by Analyzing the Papers of Digital Humanities Conferences (디지털 인문학 연구 동향 분석 - Digital Humanities 학술대회 논문을 중심으로 -)

  • Chung, EunKyung
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.55 no.1
    • /
    • pp.393-413
    • /
    • 2021
  • Digital humanities, which creates new and innovative knowledge through the combination of digital information technology and humanities research problems, can be seen as a representative multidisciplinary field of study. To investigate the intellectual structure of the digital humanities field, a network analysis of authors and keywords co-word was performed on a total of 441 papers in the last two years (2019, 2020) at the Digital Humanities Conference. As the results of the author and keyword analysis show, we can find out the active activities of Europe, North America, and Japanese and Chinese authors in East Asia. Through the co-author network, 11 dis-connected sub-networks are identified, which can be seen as a result of closed co-authoring activities. Through keyword analysis, 16 sub-subject areas are identified, which are machine learning, pedagogy, metadata, topic modeling, stylometry, cultural heritage, network, digital archive, natural language processing, digital library, twitter, drama, big data, neural network, virtual reality, and ethics. This results imply that a diver variety of digital information technologies are playing a major role in the digital humanities. In addition, keywords with high frequency can be classified into humanities-based keywords, digital information technology-based keywords, and convergence keywords. The dynamics of the growth and development of digital humanities can represented in these combinations of keywords.

High-Speed Search for Pirated Content and Research on Heavy Uploader Profiling Analysis Technology (불법복제물 고속검색 및 Heavy Uploader 프로파일링 분석기술 연구)

  • Hwang, Chan-Woong;Kim, Jin-Gang;Lee, Yong-Soo;Kim, Hyeong-Rae;Lee, Tae-Jin
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.30 no.6
    • /
    • pp.1067-1078
    • /
    • 2020
  • With the development of internet technology, a lot of content is produced, and the demand for it is increasing. Accordingly, the number of contents in circulation is increasing, while the number of distributing illegal copies that infringe on copyright is also increasing. The Korea Copyright Protection Agency operates a illegal content obstruction program based on substring matching, and it is difficult to accurately search because a large number of noises are inserted to bypass this. Recently, researches using natural language processing and AI deep learning technologies to remove noise and various blockchain technologies for copyright protection are being studied, but there are limitations. In this paper, noise is removed from data collected online, and keyword-based illegal copies are searched. In addition, the same heavy uploader is estimated through profiling analysis for heavy uploaders. In the future, it is expected that copyright damage will be minimized if the illegal copy search technology and blocking and response technology are combined based on the results of profiling analysis for heavy uploaders.

The way to make training data for deep learning model to recognize keywords in product catalog image at E-commerce (온라인 쇼핑몰에서 상품 설명 이미지 내의 키워드 인식을 위한 딥러닝 훈련 데이터 자동 생성 방안)

  • Kim, Kitae;Oh, Wonseok;Lim, Geunwon;Cha, Eunwoo;Shin, Minyoung;Kim, Jongwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.1-23
    • /
    • 2018
  • From the 21st century, various high-quality services have come up with the growth of the internet or 'Information and Communication Technologies'. Especially, the scale of E-commerce industry in which Amazon and E-bay are standing out is exploding in a large way. As E-commerce grows, Customers could get what they want to buy easily while comparing various products because more products have been registered at online shopping malls. However, a problem has arisen with the growth of E-commerce. As too many products have been registered, it has become difficult for customers to search what they really need in the flood of products. When customers search for desired products with a generalized keyword, too many products have come out as a result. On the contrary, few products have been searched if customers type in details of products because concrete product-attributes have been registered rarely. In this situation, recognizing texts in images automatically with a machine can be a solution. Because bulk of product details are written in catalogs as image format, most of product information are not searched with text inputs in the current text-based searching system. It means if information in images can be converted to text format, customers can search products with product-details, which make them shop more conveniently. There are various existing OCR(Optical Character Recognition) programs which can recognize texts in images. But existing OCR programs are hard to be applied to catalog because they have problems in recognizing texts in certain circumstances, like texts are not big enough or fonts are not consistent. Therefore, this research suggests the way to recognize keywords in catalog with the Deep Learning algorithm which is state of the art in image-recognition area from 2010s. Single Shot Multibox Detector(SSD), which is a credited model for object-detection performance, can be used with structures re-designed to take into account the difference of text from object. But there is an issue that SSD model needs a lot of labeled-train data to be trained, because of the characteristic of deep learning algorithms, that it should be trained by supervised-learning. To collect data, we can try labelling location and classification information to texts in catalog manually. But if data are collected manually, many problems would come up. Some keywords would be missed because human can make mistakes while labelling train data. And it becomes too time-consuming to collect train data considering the scale of data needed or costly if a lot of workers are hired to shorten the time. Furthermore, if some specific keywords are needed to be trained, searching images that have the words would be difficult, as well. To solve the data issue, this research developed a program which create train data automatically. This program can make images which have various keywords and pictures like catalog and save location-information of keywords at the same time. With this program, not only data can be collected efficiently, but also the performance of SSD model becomes better. The SSD model recorded 81.99% of recognition rate with 20,000 data created by the program. Moreover, this research had an efficiency test of SSD model according to data differences to analyze what feature of data exert influence upon the performance of recognizing texts in images. As a result, it is figured out that the number of labeled keywords, the addition of overlapped keyword label, the existence of keywords that is not labeled, the spaces among keywords and the differences of background images are related to the performance of SSD model. This test can lead performance improvement of SSD model or other text-recognizing machine based on deep learning algorithm with high-quality data. SSD model which is re-designed to recognize texts in images and the program developed for creating train data are expected to contribute to improvement of searching system in E-commerce. Suppliers can put less time to register keywords for products and customers can search products with product-details which is written on the catalog.

The Prediction of Cryptocurrency on Using Text Mining and Deep Learning Techniques : Comparison of Korean and USA Market (텍스트 마이닝과 딥러닝을 활용한 암호화폐 가격 예측 : 한국과 미국시장 비교)

  • Won, Jonggwan;Hong, Taeho
    • Knowledge Management Research
    • /
    • v.22 no.2
    • /
    • pp.1-17
    • /
    • 2021
  • In this study, we predicted the bitcoin prices of Bithum and Coinbase, a leading exchange in Korea and USA, using ARIMA and Recurrent Neural Networks(RNNs). And we used news articles from each country to suggest a separated RNN model. The suggested model identifies the datasets based on the changing trend of prices in the training data, and then applies time series prediction technique(RNNs) to create multiple models. Then we used daily news data to create a term-based dictionary for each trend change point. We explored trend change points in the test data using the daily news keyword data of testset and term-based dictionary, and apply a matching model to produce prediction results. With this approach we obtained higher accuracy than the model which predicted price by applying just time series prediction technique. This study presents that the limitations of the time series prediction techniques could be overcome by exploring trend change points using news data and various time series prediction techniques with text mining techniques could be applied to improve the performance of the model in the further research.

Design and Implementation of Web Based Instruction Based on Constructivism for Self-Directed Learning Ablity (구성주의 이론에 기반한 자기주도적 웹 기반 교육의 설계와 구현)

  • Kim Gi-Nam;Kim Eui-Jeong;Kim Chang-Suk
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2006.05a
    • /
    • pp.855-858
    • /
    • 2006
  • First of all, Developing information technology makes it possible to change a paradigm of all kinds of areas, including an education. Students can choose learning goals and objects themselves and acquire not the accumulation of knowledge but the method of their learning. Moreover, Teachers get to be adviser, and students play a key role in teaming. That is, the subject of leaning is students. Constructivism emphasizes the student-oriented environment of education, which corresponds to the characteristics of hypeimedia. In addition, Internet allows us to make a practical plan for constructivism. Web Based Internet provides us with a proper environment to make constructivism practice md causes an education system to change. Sure Web Based Instruction makes them motivated to learn more, they can gain plenty of information regardless of places or time. Besides, they are able to consult more up-to-date information regarding their learning use hypermedia such as an image, audio, video, and test, and effectively communicate with their instructor through a board, an e-mail, a chatting etc. A school and instructors have been making effort to develop a new model of a teaching method to cope with a new environment change. In this thesis, with 'Design and Implementation of Web Based Instruction Based on Constructivism', providing online learner-oriented and indexed video lesson, learners can get chance of self-oriented learning. In addition, learners doesn't have to cover all contents of a lesson but can choose contents they want to have from a indexed list of a lesson, and they ran search contents they want to have with a 'Keyword Search' on a main page, which can make learners improve learner's achievement.

  • PDF

Intelligent Spam-mail Filtering Based on Textual Information and Hyperlinks (텍스트정보와 하이퍼링크에 기반한 지능형 스팸 메일 필터링)

  • Kang, Sin-Jae;Kim, Jong-Wan
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.14 no.7
    • /
    • pp.895-901
    • /
    • 2004
  • This paper describes a two-phase intelligent method for filtering spam mail based on textual information and hyperlinks. Scince the body of spam mail has little text information, it provides insufficient hints to distinguish spam mails from legitimate mails. To resolve this problem, we follows hyperlinks contained in the email body, fetches contents of a remote webpage, and extracts hints (i.e., features) from original email body and fetched webpages. We divided hints into two kinds of information: definite information (sender`s information and definite spam keyword lists) and less definite textual information (words or phrases, and particular features of email). In filtering spam mails, definite information is used first, and then less definite textual information is applied. In our experiment, the method of fetching web pages achieved an improvement of F-measure by 9.4% over the method of using on original email header and body only.

A Distinction Technology for Harmful Web Documents by Rates (등급에 따른 웹 유해 문서 분류 기술)

  • Kim, Yong-Soo;Nam, Taek-Yong;Won, Dong-Ho
    • The KIPS Transactions:PartC
    • /
    • v.13C no.7 s.110
    • /
    • pp.859-864
    • /
    • 2006
  • The openness of the Web allows any user to access almost any type of information easily at any time and anywhere. However, with function of easy access for useful information, internet has dysfunctions of providing users with harmful contents indiscriminately. Some information, such as adult content, is not appropriate for all users, notably children. Additionally for adults, some contents included in abnormal porn sites can do ordinary people's mental health harm. In the meantime, since Internet is a worldwide open network it has a limit to regulate users providing harmful contents through each countrie's national laws or systems. Additionally it is not a desirable way of developing a certain system-specific classification technology for harmful contents, because internet users can contact with them in diverse way, for example, porn sites, harmful spams, or peer-to-peer networks, etc. Therefore, it is being emphasized to research and develop context-based core technologies for classifying harmful contents. In this paper, we propose an efficient text filter for blocking harmful texts of web documents using context-based technologies.

Meta-Analysis of Self-Advocacy of People with Developmental Disabilities : Focusing on Research from 2000 to 2023 (발달장애인의 자기옹호에 관련 메타분석 2000년부터 2023년까지 -)

  • Su-Mi Jin;Wha-Soo Kim;Ji-Woo Lee
    • The Journal of the Convergence on Culture Technology
    • /
    • v.9 no.4
    • /
    • pp.201-210
    • /
    • 2023
  • The purpose of this study is to analyze the general characteristics, effect size, and qualitative indicators of self-advocacy studies of people with developmental disabilities published in domestic academic journals and theses. For this purpose, among a total of 2153 papers related to self-advocacy published from 2000 to 2023, 41 studies with developmental disabilities as the keyword were selected, and the specific research results are as follows. Based on the results of this study, when developing a language intervention program related to self-advocacy for people with developmental disabilities, it is recommended to develop an intervention program based on the number of sessions of 10-19 in a learning situation with 20-30 people in adolescents and adults, or during the transition period. There are many studies limited to educational aspects such as special education and integrated education, and by applying this, it is hoped that a self-advocacy language intervention program will be developed at the level of language rehabilitation that can effectively and sophisticatedly assert self-assertion and self-rights after experiencing difficulties in communication.