• Title/Summary/Keyword: Dictionary learning

Named entity recognition using transfer learning and small human- and meta-pseudo-labeled datasets

  • Kyoungman Bae;Joon-Ho Lim
    • ETRI Journal / v.46 no.1 / pp.59-70 / 2024
  • We introduce a high-performance named entity recognition (NER) model for written and spoken language. To overcome challenges related to labeled data scarcity and domain shifts, we use transfer learning to leverage our previously developed KorBERT as the base model. We also adopt a meta-pseudo-label method using a teacher/student framework with labeled and unlabeled data. Our model presents two modifications. First, the student model is updated with an average loss from both human- and pseudo-labeled data. Second, the influence of noisy pseudo-labeled data is mitigated by considering feedback scores and updating the teacher model only when below a threshold (0.0005). We achieve the target NER performance in the spoken language domain and improve that in the written language domain by proposing a straightforward rollback method that reverts to the best model based on scarce human-labeled data. Further improvement is achieved by adjusting the label vector weights in the named entity dictionary.
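The two modifications and the rollback step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the losses and scores are plain floats rather than model outputs, and reading "below a threshold" as applying to the feedback score itself is an assumption.

```python
from typing import Sequence

# Threshold value quoted in the abstract; how it is compared is an assumption.
FEEDBACK_THRESHOLD = 0.0005


def student_loss(human_loss: float, pseudo_loss: float) -> float:
    """Average the loss on human-labeled data with the loss on pseudo-labeled data."""
    return 0.5 * (human_loss + pseudo_loss)


def should_update_teacher(feedback_score: float,
                          threshold: float = FEEDBACK_THRESHOLD) -> bool:
    """Update the teacher only when the feedback score is below the threshold."""
    return feedback_score < threshold


def rollback_index(dev_scores: Sequence[float]) -> int:
    """Return the index of the checkpoint that scored best on human-labeled data."""
    return max(range(len(dev_scores)), key=lambda i: dev_scores[i])
```

In a real training loop these checks would wrap the optimizer steps of the student and teacher models; here they only make the control flow of the description concrete.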

Korean Word Learning System Using Automatic Question Generation Technique (자동 문제 생성 기술을 이용한 한국어 어휘학습시스템)

  • Choe, Su-Il;Im, Ji-Hui;Choe, Ho-Seop;Ock, Cheol-Young
    • Korean Journal of Cognitive Science / v.17 no.4 / pp.271-286 / 2006
  • In this paper, we introduce an automatic question generation technique that uses language resources such as the User-Word Intelligent Network (U-WIN) and a Korean dictionary containing a large amount of information, and we present a Korean word learning system based on this technique. The item pool method used by most learning systems causes several problems. As a solution, we classified questions into 8 types and implemented a Korean word learning system that generates Korean questions automatically, using morphological and semantic information according to the automatic question generation pattern of each type.


Edutech in the Era of the 4th Industrial Revolution (4차 산업혁명 시대의 에듀테크)

  • Park, Ji Su;Gil, Joon-Min
    • KIPS Transactions on Software and Data Engineering / v.9 no.11 / pp.329-331 / 2020
  • Edutech, a compound of education and technology, is an educational paradigm of the era of the 4th industrial revolution. It refers to next-generation education that uses the information and communication technology (ICT) of the 4th industrial revolution, such as big data, artificial intelligence (AI), robots, and virtual reality (VR). e-Learning is already used for online lectures in ICT-based education, and edutech is attracting attention alongside e-learning as non-face-to-face education has rapidly increased due to COVID-19. This paper therefore summarizes the reviewed papers on a blockchain-based badge service platform, a simulation-based collaborative e-Learning system, a video English dictionary, and a blockchain-based access control audit system.

Developing Learning Materials of Multimedia for General Science Instruction of High School (고등학교 공통과학 학습을 위한 멀티미디어 자료 구축)

  • Kim, Jae Hyun;Lee, Hee Bok;Kim, Hyun Sub;Kim, Hee Soo;Park, Jeong Wok;Park, Hyun Ju
    • Journal of the Korean Chemical Society / v.44 no.3 / pp.249-257 / 2000
  • This study was designed to develop multimedia learning materials for general science instruction in high school. The learning material consists of an HTML record for each middle unit of the general science curriculum and includes a variety of text, graphs, pictures, drawings, animations, and other moving-image materials. It comprises five coursewares: Content, Dictionary, Science Story, Image Material, and Questions. The learning material has been uploaded to an internet website under the Science Education Research Institute of Kongju National University (http://science.kongju.ac.kr) and is also provided as a CD-ROM title.


Symbolizing Numbers to Improve Neural Machine Translation (숫자 기호화를 통한 신경기계번역 성능 향상)

  • Kang, Cheongwoong;Ro, Youngheon;Kim, Jisu;Choi, Heeyoul
    • Journal of Digital Contents Society / v.19 no.6 / pp.1161-1167 / 2018
  • The development of machine learning has enabled machines to perform delicate tasks that previously only humans could do, and many companies have therefore introduced machine-learning-based translators. Existing translators perform well, but they have problems translating numbers: they often mistranslate numbers when the input sentence includes a large number, and the output sentence structure changes completely even if only one number in the input sentence changes. In this paper, we first optimized a neural machine translation model architecture that uses a bidirectional RNN, LSTM, and the attention mechanism through data cleansing and changing the dictionary size. We then implemented a number-processing algorithm specialized in number translation and applied it to the neural machine translation model to solve the problems above. The paper presents the data cleansing method, an optimal dictionary size, and the number-processing algorithm, as well as experimental results for translation performance based on the BLEU score.
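The abstract does not spell out the number-processing algorithm, so the following is only one plausible reading of "symbolizing numbers": replace each number with a placeholder symbol before translation and put the original number back afterwards. The placeholder format `<NUM0>`, the regular expression, and both function names are assumptions made for illustration.

```python
import re
from typing import List, Tuple

# Matches digit runs with optional "." or "," groups, e.g. 3, 19.6, 1,250,000.
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")
SYMBOL_RE = re.compile(r"<NUM(\d+)>")


def symbolize_numbers(sentence: str) -> Tuple[str, List[str]]:
    """Replace each number with an indexed symbol and return the numbers found."""
    numbers: List[str] = []

    def _replace(match: "re.Match[str]") -> str:
        numbers.append(match.group(0))
        return f"<NUM{len(numbers) - 1}>"

    return NUMBER_RE.sub(_replace, sentence), numbers


def restore_numbers(sentence: str, numbers: List[str]) -> str:
    """Substitute the saved numbers back for their symbols in a translated sentence."""
    return SYMBOL_RE.sub(lambda m: numbers[int(m.group(1))], sentence)
```

Because the symbols carry an index, the numbers can be restored correctly even when the translation reorders them, which is the situation the abstract describes as problematic.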

A comparison of the effects of a programmed instruction method and a lecture/laboratory method on achievement in a course in reference materials (강의식교수법과 프로그램식교수법에 의한 참고정보원의 학습효과 비교연구)

  • Ro, Jin Young
    • Journal of Korean Library and Information Science Society / v.28 / pp.93-135 / 1998
  • The purpose of this study was to compare the effectiveness of programmed instruction with that of the lecture and discussion method on the knowledge of basic reference sources among undergraduate library and information science students. The hypotheses of the study were: 1. Programmed instruction will be more effective than the lecture/discussion method with regard to academic achievement. 2. There will be a significant difference in learning time between the experimental and the control groups. Seventy-eight library and information science students from two universities in Chungchong Province participated in the study. A programmed instruction manual covering four types of reference sources (dictionary, encyclopedia, bibliography, and indexes and abstracts), a 40-item multiple-choice post-test, and a questionnaire on the students' attitude toward programmed instruction were developed specifically for this research. The post-test-only control-group design was selected for this experimental study. Students were given instruction on specific reference titles in the dictionary, encyclopedia, bibliography, and indexes and abstracts categories. The control group was instructed by the lecture and discussion method, while the experimental group completed a programmed instruction manual by themselves. Both the control and the experimental groups were tested right after the instruction on the four types of reference sources. In addition, a questionnaire asking about students' attitude toward programmed instruction was administered to the experimental group. The findings from this study are summarized as follows: 1. There was no significant difference in the mean post-test score between the two groups. Therefore, programmed instruction is viable as an alternative method of instruction in the teaching of reference sources. 2. There was a significant difference between the two groups in the mean time spent learning bibliography and indexes and abstracts. Accordingly, programmed instruction proved to be more efficient than the conventional lecture/discussion method in terms of learning time. 3. Students responded positively to programmed instruction and rated it as very interesting and challenging. In conclusion, the programmed instruction method was just as effective as the lecture/discussion method in the teaching of reference sources, and students' attitude toward programmed instruction was favorable enough to support continued use of this method for the teaching of reference sources.


The practical use with online database program of cosmetics' raw materials. (화장품원료 온라인 데이터베이스 구축과 활용)

  • Jeon Sang-hoon;Kim Ju-Duck
    • Journal of the Society of Cosmetic Scientists of Korea / v.29 no.2 s.43 / pp.233-250 / 2003
  • We often use the KCID (Korean Cosmetic Ingredient Dictionary) and the ICID (International Cosmetic Ingredient Dictionary) in cosmetics research and in cosmetics export and import. So far, however, there has been no database of cosmetics' raw materials, so finding the raw material data that is needed takes a lot of time. This study constructs a cosmetics' raw material database and develops a program to retrieve from it. We used a Linux machine as the equipment for this study, with the Apache web server, the MySQL database server, and PHP as the tools. The database holds 11,817 kinds of raw material data registered under the ICID, 866 kinds registered under the KCID, and 28,008 kinds registered with a trade name. The database was structured in the association form. Once it serves its purpose, the online database can ultimately reduce task time. The product of this study can become a good base of data to reconfigure, and in the future it can become a good database in relation to other databases.

The Smart Learning System for English Language Using Hangeul (한글을 이용한 스마트 영어 학습 시스템)

  • Kwon, Seung-tag;Kim, Yong-seok
    • The Journal of Korean Institute of Communications and Information Sciences / v.40 no.6 / pp.1157-1163 / 2015
  • In this paper, we developed a Web App that runs on a mobile device. We also designed and developed an electronic dictionary in which English words and sentences are expressed by their English pronunciation written in Hangeul. The system's database contains English words, Hangeul codes with pictures, vocabulary definitions, speech sound files, and many sentences. We developed the English learning system using HTML5 and the m-Bizmaker software tools.

Keyword Extraction in Korean Using Unsupervised Learning Method (비감독 학습 기법에 의한 한국어의 키워드 추출)

  • Shin, Seong-Yoon;Rhee, Yang-Won
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.6 / pp.1403-1408 / 2010
  • Korean information retrieval uses nouns as index terms or keywords that represent a document, and noun and keyword extraction consists of finding all nouns present in the document. In this paper, we propose a method of keyword extraction using a pre-built dictionary. This method reduces execution time by eliminating unnecessary operations. Nouns can also be extracted from even large documents without significantly affecting accuracy. This paper proposes a noun extraction method that uses the appearance characteristics of nouns and a keyword extraction method that uses unsupervised learning techniques.

Keyword Extraction Using Unsupervised Learning Method (비감독 학습 기법에 의한 키워드 추출)

  • Shin, Seong-Yoon;Baek, Jeong-Uk;Rhee, Yang-Won
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2010.05a / pp.165-166 / 2010
  • Noun extraction finds all nouns present in a document, and Korean information retrieval uses nouns as index terms or keywords that represent the document. In this paper, we propose a method of keyword extraction using a pre-built dictionary. This method reduces execution time by eliminating unnecessary operations. Nouns can also be extracted from even large documents without significantly affecting accuracy. This paper proposes a noun extraction method that uses the appearance characteristics of nouns and a keyword extraction method that uses unsupervised learning techniques.
