Search | Korea Science

Voice Synthesis Detection Using Language Model-Based Speech Feature Extraction (언어 모델 기반 음성 특징 추출을 활용한 생성 음성 탐지)

Seung-min Kim;So-hee Park;Dae-seon Choi
- Journal of the Korea Institute of Information Security & Cryptology / v.34 no.3 / pp.439-449 / 2024
Recent rapid advancements in voice generation technology have enabled the natural synthesis of voices using text alone. However, this progress has led to an increase in malicious activities, such as voice phishing (vishing), where generated voices are exploited for criminal purposes. Numerous models have been developed to detect the presence of synthesized voices, typically by extracting features from the voice and using these features to determine the likelihood of voice generation. This paper proposes a new model for extracting voice features to address misuse cases arising from generated voices. It utilizes a deep learning-based audio codec model and the pre-trained natural language processing model BERT to extract novel voice features. To assess the suitability of the proposed voice feature extraction model for voice detection, four generated voice detection models were created using the extracted features, and performance evaluations were conducted. For performance comparison, three voice detection models based on Deepfeature proposed in previous studies were evaluated against other models in terms of accuracy and EER. The model proposed in this paper achieved an accuracy of 88.08% and a low EER of 11.79%, outperforming the existing models. These results confirm that the voice feature extraction method introduced in this paper can be an effective tool for distinguishing between generated and real voices.
https://doi.org/10.13089/JKIISC.2024.34.3.439
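
The abstract above reports an equal error rate (EER) alongside accuracy. As a reading aid, here is a minimal sketch of how an EER can be estimated from detector scores by sweeping a threshold until the false positive and false negative rates are as close as possible. The function name, the scores, and the labels are all made up for illustration; this is not the authors' evaluation code.

```python
def equal_error_rate(scores, labels):
    """Estimate the EER by sweeping a threshold over the observed scores.

    scores: higher means "more likely a generated voice" (assumed convention)
    labels: 1 = generated voice, 0 = real voice
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    best_gap, eer = float("inf"), None
    for t in sorted(set(scores)):
        # False positive: a real voice scored at or above the threshold.
        fpr = sum(s >= t for s in negatives) / len(negatives)
        # False negative: a generated voice scored below the threshold.
        fnr = sum(s < t for s in positives) / len(positives)
        if abs(fpr - fnr) < best_gap:
            best_gap, eer = abs(fpr - fnr), (fpr + fnr) / 2
    return eer


# Toy data (illustrative only).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
print(equal_error_rate(scores, labels))
```

On real data the two error rates rarely cross exactly at an observed score, so the sketch reports the midpoint at the closest threshold; evaluation toolkits typically interpolate instead.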

Korean Ironic Expression Detector (한국어 반어 표현 탐지기)

Seung Ju Bang;Yo-Han Park;Jee Eun Kim;Kong Joo Lee
- The Transactions of the Korea Information Processing Society / v.13 no.3 / pp.148-155 / 2024
Despite the increasing importance of irony and sarcasm detection in the field of natural language processing, research on the Korean language is relatively scarce compared to other languages. This study aims to experiment with various models for irony detection in Korean text. The study conducted irony detection experiments using KoBERT, a BERT-based model, and ChatGPT. For KoBERT, two methods of additional training on sentiment data were applied (Transfer Learning and MultiTask Learning). Additionally, for ChatGPT, the Few-Shot Learning technique was applied by increasing the number of example sentences entered as prompts. The results of the experiments showed that the Transfer Learning and MultiTask Learning models, which were trained with additional sentiment data, outperformed the baseline model without additional sentiment data. On the other hand, ChatGPT exhibited significantly lower performance compared to KoBERT, and increasing the number of example sentences did not lead to a noticeable improvement in performance. In conclusion, this study suggests that a model based on KoBERT is more suitable for irony detection than ChatGPT, and it highlights the potential contribution of additional training on sentiment data to improve irony detection performance.
https://doi.org/10.3745/TKIPS.2024.13.3.148

Digital Library Interface Research Based on EEG, Eye-Tracking, and Artificial Intelligence Technologies: Focusing on the Utilization of Implicit Relevance Feedback (뇌파, 시선추적 및 인공지능 기술에 기반한 디지털 도서관 인터페이스 연구: 암묵적 적합성 피드백 활용을 중심으로)

Hyun-Hee Kim;Yong-Ho Kim
- Journal of the Korean Society for Information Management / v.41 no.1 / pp.261-282 / 2024
This study proposed and evaluated electroencephalography (EEG)-based and eye-tracking-based methods to determine relevance by utilizing users' implicit relevance feedback while navigating content in a digital library. For this, EEG/eye-tracking experiments were conducted on 32 participants using video, image, and text data. To assess the usefulness of the proposed methods, deep learning-based artificial intelligence (AI) techniques were used as a competitive benchmark. The evaluation results showed that EEG component-based methods (av_P600 and f_P3b components) demonstrated high classification accuracy in selecting relevant videos and images (faces/emotions). In contrast, AI-based methods, specifically object recognition and natural language processing, showed high classification accuracy for selecting images (objects) and texts (newspaper articles). Finally, guidelines for implementing a digital library interface based on EEG, eye-tracking, and artificial intelligence technologies have been proposed. Specifically, a system model based on implicit relevance feedback has been presented. Moreover, to enhance classification accuracy, methods suitable for each media type have been suggested, including EEG-based, eye-tracking-based, and AI-based approaches.
https://doi.org/10.3743/KOSIM.2024.41.1.261

Active Inferential Processing During Comprehension in Poor Readers (미숙 독자들에 있어 이해 도중의 능동적 추리의 처리)

Zoh Myeong-Han;Ahn Jeung-Chan
- Korean Journal of Cognitive Science / v.17 no.2 / pp.75-102 / 2006
Three experiments were conducted using a verification task to examine good and poor readers' generation of causal inferences (with because sentences) and contrastive inferences (with although sentences). The unfamiliar, critical verification statement was either explicitly mentioned or implied. In Experiment 1, both good and poor readers responded accurately to the critical statement, suggesting that both groups had the linguistic knowledge necessary to make the required inferences. Differences were found, however, in the groups' verification latencies. Poor, but not good, readers responded faster to explicit than to implicit verification statements for both because and although sentences. In Experiment 2, poor readers were induced to generate causal inferences for the because experimental sentences by including fillers that were apparently counterfactual unless a causal inference was made. In Experiment 3, poor readers were induced to generate contrastive inferences for the although sentences by including fillers that could only be resolved by making a contrastive inference. Verification latencies for the critical statements showed that poor readers made causal inferences in Experiment 2 and contrastive inferences in Experiment 3 during comprehension. These results were discussed in terms of a context effect: specific encoding operations performed on an anomaly backgrounded in another passage would form part of the context that guides the ongoing activity in processing potentially relevant subsequent text.

Target Word Selection Disambiguation using Untagged Text Data in English-Korean Machine Translation (영한 기계 번역에서 미가공 텍스트 데이터를 이용한 대역어 선택 중의성 해소)

Kim Yu-Seop;Chang Jeong-Ho
- The KIPS Transactions:PartB / v.11B no.6 / pp.749-758 / 2004
In this paper, we propose a new method for disambiguating target word selection in English-Korean machine translation that uses only a raw corpus, without additional human effort. We use two data-driven techniques: Latent Semantic Analysis (LSA) and Probabilistic Latent Semantic Analysis (PLSA). These two techniques can represent complex semantic structures in given contexts such as text passages. We construct linguistic semantic knowledge with the two techniques and use that knowledge for target word selection in English-Korean machine translation. For target word selection, we utilize grammatical relationships stored in a dictionary. We use the k-nearest neighbor learning algorithm to resolve the data sparseness problem in target word selection and estimate the distance between instances based on these models. In experiments, we use TREC data of AP news to construct the latent semantic space and the Wall Street Journal corpus to evaluate target word selection. With the latent semantic analysis methods, the accuracy of target word selection improved by over 10%, and PLSA showed better accuracy than LSA. Finally, using correlation calculation, we showed the relatedness between the accuracy and two important factors: the dimensionality of the latent space and the k value of k-NN learning.
https://doi.org/10.3745/KIPSTB.2004.11B.6.749

Automatic Speech Style Recognition Through Sentence Sequencing for Speaker Recognition in Bilateral Dialogue Situations (양자 간 대화 상황에서의 화자인식을 위한 문장 시퀀싱 방법을 통한 자동 말투 인식)

Kang, Garam;Kwon, Ohbyung
- Journal of Intelligence and Information Systems / v.27 no.2 / pp.17-32 / 2021
Speaker recognition is generally divided into speaker identification and speaker verification. Speaker recognition plays an important role in automatic voice systems, and the importance of speaker recognition technology is becoming more prominent as portable devices, voice technology, and audio content fields continue to expand. Previous speaker recognition studies have been conducted with the goal of automatically determining who the speaker is based on voice files and improving accuracy. Speech is an important sociolinguistic subject, and it contains very useful information that reveals the speaker's attitude, conversation intention, and personality, and this can be an important clue to speaker recognition. The final ending used in the speaker's speech determines the type of sentence or carries functions and information such as the speaker's intention, psychological attitude, or relationship to the listener. The use of the terminating ending has various probabilities depending on the characteristics of the speaker, so the type and distribution of the terminating ending of a specific unidentified speaker will be helpful in recognizing the speaker. However, there have been few studies that considered speech in the existing text-based speaker recognition, and if speech information is added to the speech signal-based speaker recognition technique, the accuracy of speaker recognition can be further improved. Hence, the purpose of this paper is to propose a novel method using speech style expressed as a sentence-final ending to improve the accuracy of Korean speaker recognition. To this end, a method called sentence sequencing that generates vector values by using the type and frequency of the sentence-final ending appearing in the utterance of a specific person is proposed. To evaluate the performance of the proposed method, learning and performance evaluation were conducted with an actual drama script.
The method proposed in this study can be used as a means to improve the performance of Korean speech recognition service.
https://doi.org/10.13088/jiis.2021.27.2.017
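
The abstract above describes turning the type and frequency of a speaker's sentence-final endings into a vector. The sketch below illustrates that general idea only: the ending inventory, the naive suffix matching, and the function name are assumptions made for this example, and the paper's actual sentence sequencing procedure may differ substantially.

```python
from collections import Counter

# Hypothetical, tiny inventory of sentence-final endings (illustration only).
ENDINGS = ["습니다", "아요", "어요", "다"]


def ending_vector(sentences, endings=ENDINGS):
    """Return the relative frequency of each ending across a speaker's sentences."""
    counts = Counter()
    for sentence in sentences:
        core = sentence.rstrip(" .?!")  # drop trailing punctuation
        # Try longer endings first so "습니다" is not counted as plain "다".
        for ending in sorted(endings, key=len, reverse=True):
            if core.endswith(ending):
                counts[ending] += 1
                break
    total = sum(counts.values())
    return [counts[e] / total if total else 0.0 for e in endings]


speaker_lines = ["알겠습니다.", "괜찮아요?", "먹었어요.", "고맙습니다!"]
print(ending_vector(speaker_lines))
```

Vectors like this could then be compared across speakers or fed to a classifier; a real system would need a morphological analyzer rather than suffix matching.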

Plans for Teaching and Learning of Learner-centered Activities in Korean Verse Education (시조교육의 현황과 학습자 활동 중심의 교수·학습 모형 - 고등학교 국어 교과서 수록 작품 <시조>를 중심으로 -)

Kang Myong-Hye
- Sijohaknonchong / v.20 / pp.141-171 / 2004
Even though only three sijo appear in the high school textbook, each type can be understood through them, in that they represent pyung sijo, sasul sijo, and present sijo respectively. To learn with learner-centered activities, which aim for full knowledge acquisition regarding literary works, students can, as the preparing stage, learn from teachers what they will study. Sijo are, so to speak, formed with three chapters, and stand for the world that is colorless, scentless, and flavorless. So, the theme can be found with ease. Compared with other genres, the background of a sijo can be created with ease. Moreover, sijo are not too long, so learners can paraphrase them. Sijo that express private experiences in everyday language can be related to other genres or to everyday language. So, these sijo are presented last. In the teaching phase, on the gradation of concretion and gradation, writing or presentation activities are presented. After class, learners keep a reaction journal. In the phase of concretion and gradation, learners can apprehend that the typical differences in the emotions of the poetic speakers come from typical differences, even though the emotions of the poetic speakers of (1)·(2)·(3), which stand for pyung sijo, sasul sijo, and present sijo respectively, are roughly summarized as loneliness, desolateness, and gloominess. Moreover, these typical differences come from social, political, and cultural settings, namely, the differences of contexts. In this teaching model, learners should prepare content regarding context and text before the class. Teachers should act as assistants to help learners pre-understand their subjective experiences and imaginations.

Revisiting the cause of unemployment problem in Korea's labor market: The job seeker's interests-based topic analysis (취업준비생 토픽 분석을 통한 취업난 원인의 재탐색)

Kim, Jung-Su;Lee, Suk-Jun
- Management & Information Systems Review / v.35 no.1 / pp.85-116 / 2016
The present study aims to explore the causes of employment difficulty on the basis of job applicants' interests from a P-E (person-environment) fit perspective. Our approach relied on a textual analytic method to reveal insights from their situational interests in a job search during changes in the labor market. Thus, to investigate the types of major interests and psychological responses, user-generated texts in a social community were collected for analysis from January 1, 2013 through December 31, 2015 by crawling an online community for job seeking and for sharing information and opinions. The results of topic analysis indicated that users' primary interests were divided into four types: perception of vocation expectation, employment pre-preparation behaviors, perception of labor market, and job-seeking stress. Specifically, job applicants were mainly concerned with monetary reward and form of employment, rather than their work values or career exploration, and young job applicants expressed their psychological responses using contextualized language (e.g., slang, vulgarisms) to project their unstable state under uncertainty in response to environmental changes. Additionally, they perceived activities in restricted preparation (e.g., certification, English exam) as determinant factors for success in employment and suffered from job-seeking stress. On the basis of these findings, current unemployment matters are attributed to the absence of pursuing the value of vocation and job in individuals, organizations, and society. Concretely, job seekers are preoccupied with occupational prestige in its social aspect and have undecided vocational values. On the other hand, most companies have no perception of the importance of human resources and have overlooked the need to develop a proper work environment that stimulates individual motivation.
The attempt in this study to reinterpret the effect of environment as for classifying job applicant's interests in reference to linguistic and psychological theories not only helps conduct a more comprehensive meaning for understanding social matters, but guides new directions for future research on job applicant's psychological factors (e.g., attitudes, motivation) using topic analysis.

Exploring Changes in Science PCK Characteristics through a Family Resemblance Approach (가족유사성 접근을 통한 과학 PCK 변화 탐색)

Kwak, Youngsun
- Journal of the Korean Society of Earth Science Education / v.15 no.2 / pp.235-248 / 2022
With the changes in the future educational environment, such as the rapid decline of the school-age population and the expansion of students' choice of curriculum, changes are also required in PCK, the expertise of science teachers. In other words, the categories constituting the existing 'consensus-PCK' and the characteristics of 'science PCK' are not fixed, so more categories and characteristics can be added. The purpose of this study is to explore the potential area of science PCK required to cope with changes in the future educational environment in the form of 'Family Resemblance Science PCK (Family Resemblance-PCK, hereafter)' through Wittgenstein's family resemblance approach. For this purpose, in-depth interviews were conducted with three focus groups. In the focus group in-depth interviews, participants discussed how the science PCK required of science teachers in future schools in 2030-2045 will change due to changes in the future society and educational environment. Qualitative analysis was performed based on the in-depth interviews, and semantic network analysis was performed on the interview text to analyze the characteristics of 'Family Resemblance-PCK' that differentiate it from the existing 'consensus-PCK'. In the results, the characteristics of Family Resemblance-PCK, which are newly requested along with changes in role expectations of science teachers, were examined by PCK area. The semantic network analysis of Family Resemblance-PCK found that Family Resemblance-PCK expands its boundaries from the existing consensus-PCK, which is the starting point, and that new PCK elements were added.
Looking at the aspects of Family Resemblance-PCK, [AI-Convergence Knowledge-Contents-Digital], [Community-Network-Human Resources-Relationships], [Technology-Exploration-Virtual Reality-Research], [Self-Directed Learning-Collaboration-Community], etc., form a distinct network cluster, and it is expected that future science teacher expertise will be formed and strengthened around these PCK areas. Based on the research results, changes in the professionalism of science teachers in future schools and countermeasures were proposed as a conclusion.
https://doi.org/10.15523/JKSESE.2022.15.2.235

A Study on the Development Trend of Artificial Intelligence Using Text Mining Technique: Focused on Open Source Software Projects on Github (텍스트 마이닝 기법을 활용한 인공지능 기술개발 동향 분석 연구: 깃허브 상의 오픈 소스 소프트웨어 프로젝트를 대상으로)

Chong, JiSeon;Kim, Dongsung;Lee, Hong Joo;Kim, Jong Woo
- Journal of Intelligence and Information Systems / v.25 no.1 / pp.1-19 / 2019
Artificial intelligence (AI) is one of the main driving forces leading the Fourth Industrial Revolution. The technologies associated with AI have already shown abilities that are equal to or better than those of people in many fields, including image and speech recognition. In particular, many efforts have been made to identify current technology trends and analyze their development directions, because AI technologies can be utilized in a wide range of fields including the medical, financial, manufacturing, service, and education fields. Major platforms that can develop complex AI algorithms for learning, reasoning, and recognition have been opened to the public as open source projects. As a result, technologies and services that utilize them have increased rapidly. This has been confirmed as one of the major reasons for the fast development of AI technologies. Additionally, the spread of the technology is greatly indebted to open source software, developed by major global companies, supporting natural language recognition, speech recognition, and image recognition. Therefore, this study aimed to identify the practical trend of AI technology development by analyzing OSS projects associated with AI, which have been developed through the online collaboration of many parties. This study searched and collected a list of major projects related to AI that were generated from 2000 to July 2018 on Github. This study confirmed the development trends of major technologies in detail by applying a text mining technique to topic information, which indicates the characteristics of the collected projects and technical fields. The results of the analysis showed that the number of software development projects was less than 100 per year until 2013. However, it increased to 229 projects in 2014 and 597 projects in 2015. In particular, the number of open source projects related to AI increased rapidly in 2016 (2,559 OSS projects).
It was confirmed that the number of projects initiated in 2017 was 14,213, which is almost four times the total number of projects generated from 2009 to 2016 (3,555 projects). The number of projects initiated from January to July 2018 was 8,737. The development trend of AI-related technologies was evaluated by dividing the study period into three phases. The appearance frequency of topics indicates the technology trends of AI-related OSS projects. The results showed that natural language processing technology has continued to be at the top in all years. This implied that OSS had been developed continuously. Until 2015, the programming languages Python, C++, and Java were listed among the ten most frequently appearing topics. However, after 2016, programming languages other than Python disappeared from the top ten topics. In their place, platforms supporting the development of AI algorithms, such as TensorFlow and Keras, show high appearance frequency. Additionally, reinforcement learning algorithms and convolutional neural networks, which have been used in various fields, were frequently appearing topics. The results of topic network analysis showed that the most important topics by degree centrality were similar to those by appearance frequency. The main difference was that visualization and medical imaging topics were found at the top of the list, although they were not at the top of the list from 2009 to 2012. The results indicated that OSS was developed in the medical field in order to utilize AI technology. Moreover, although computer vision was in the top 10 of the appearance frequency list from 2013 to 2015, it was not in the top 10 by degree centrality. The topics at the top of the degree centrality list were similar to those at the top of the appearance frequency list. It was found that the ranks of the convolutional neural network and reinforcement learning changed slightly.
The trend of technology development was examined using the appearance frequency of topics and degree centrality. The results showed that machine learning had the highest frequency and the highest degree centrality in all years. Moreover, it is noteworthy that, although the deep learning topic showed a low frequency and a low degree centrality between 2009 and 2012, its rank abruptly increased between 2013 and 2015. It was confirmed that in recent years both technologies had high appearance frequency and degree centrality. TensorFlow first appeared during the 2013-2015 phase, and its appearance frequency and degree centrality soared between 2016 and 2018, placing it at the top of the lists after deep learning and Python. Computer vision and reinforcement learning did not show an abrupt increase or decrease, and they had relatively low appearance frequency and degree centrality compared with the above-mentioned topics. Based on these analysis results, it is possible to identify the fields in which AI technologies are actively developed. The results of this study can be used as a baseline dataset for more empirical analysis of future technology trends that can be converged.
https://doi.org/10.13088/jiis.2019.25.1.001
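
The abstract above ranks topics both by appearance frequency and by degree centrality in a topic network. The following sketch shows one plausible way to compute the two measures from per-project topic lists, assuming that two topics are linked when they co-occur in a project. The project data and the function name are invented for illustration; the study's actual network construction is not specified in the abstract.

```python
from collections import Counter
from itertools import combinations


def topic_stats(projects):
    """Return (appearance frequency, degree centrality) for each topic."""
    # Frequency: number of projects that list the topic.
    freq = Counter(t for topics in projects for t in set(topics))
    # Network: connect every pair of topics that co-occur in a project.
    neighbors = {t: set() for t in freq}
    for topics in projects:
        for a, b in combinations(sorted(set(topics)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(freq)
    # Degree centrality: share of the other topics a topic is linked to.
    centrality = {
        t: (len(nb) / (n - 1) if n > 1 else 0.0) for t, nb in neighbors.items()
    }
    return freq, centrality


# Made-up projects and topics (illustrative only).
projects = [
    ["machine-learning", "python", "tensorflow"],
    ["machine-learning", "nlp"],
    ["machine-learning", "computer-vision", "python"],
    ["nlp"],
]
freq, centrality = topic_stats(projects)
for topic in sorted(freq, key=freq.get, reverse=True):
    print(topic, freq[topic], centrality[topic])
```

In this toy data "nlp" and "python" appear equally often but differ in degree centrality, which is the kind of divergence between the two rankings that the abstract discusses.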

