• Title/Summary/Keyword: audio technology (오디오기술)


Analysis of the Status of Legal Deposit and Acquisition of Electronic Publications in Korea (국내 전자출판물의 납본·수집 현황 분석)

  • Gyuhwan Kim;Daekeun Jeong;Soojung Kim
    • Journal of Korean Library and Information Science Society / v.54 no.4 / pp.281-306 / 2023
  • This study analyzed the status of legal deposit, acquisition, and donation of e-publications from 2020 to 2022, along with the deposit status of e-publications with issued ISBNs, and derived improvement measures to strengthen compliance with legal deposit obligations for domestic e-publications. The key findings are as follows. By collection method, acquisition accounted for 57.07%, legal deposit for 41.74%, and donation for 1.19%. The file formats varied and included e-books (pdf, epub), webtoons (jpg), and audiobooks (mp3). Most of the collected e-publications were published from 2012 to 2022, with some from 1960 to 2011. Webtoons dominated the acquired materials, while legal deposits mainly comprised e-books. Among e-publications with issued ISBNs, e-books (96.2%) were the most common type, and the literature field received the highest number of ISBNs. Most ISBNs were issued from 2020 to 2022. Among the top 10 publishers, the legal deposit rate was low, indicating a need for improvement. The proposed improvement measures include enhancing publishers' awareness of legal deposit, strengthening incentives and sanctions, encouraging voluntary participation through transparent disclosure of legal deposit status, and improving the accuracy of data in the ISBN issuance and deposit system.

Research on Generative AI for Korean Multi-Modal Montage App (한국형 멀티모달 몽타주 앱을 위한 생성형 AI 연구)

  • Lim, Jeounghyun;Cha, Kyung-Ae;Koh, Jaepil;Hong, Won-Kee
    • Journal of Service Research and Studies / v.14 no.1 / pp.13-26 / 2024
  • Multi-modal generation is the process of generating results from a variety of information, such as text, images, and audio. With the rapid development of AI technology, a growing number of multi-modal systems synthesize different types of data to produce results. In this paper, we present an AI system that uses speech and text recognition to take a description of a person and generate a montage image. While existing montage generation technology is based on the appearance of Westerners, the system developed in this paper trains a model on Korean facial features. It can therefore create more accurate and effective Korean montage images from multi-modal voice and text input in Korean. Because the output of the developed app can serve as a draft montage, it can substantially reduce the manual labor of existing montage production personnel. For this purpose, we used persona-based virtual person montage data provided by the AI-Hub of the National Information Society Agency. AI-Hub is an AI integration platform that aims to provide a one-stop service by building the training data needed to develop AI technology and services. The image generation system was implemented using VQGAN, a deep learning model for generating high-resolution images, and KoDALLE, a Korean-language image generation model. The trained model was confirmed to create a montage image of a face closely matching what was described by voice and text. To verify the practicality of the developed app, 10 testers used it, and more than 70% responded that they were satisfied. The montage generator can be used in various fields, such as criminal investigation, to describe and visualize facial features.

Design and Implementation of Web Based Instruction Based on Constructivism for Self-Directed Learning Ability (구성주의 이론에 기반한 자기주도적 웹 기반 교육의 설계와 구현)

  • Kim Gi-Nam;Kim Eui-Jeong;Kim Chang-Suk
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2006.05a / pp.855-858 / 2006
  • Advances in information technology are changing paradigms in many fields, including education. Students can choose their own learning goals and objects, and acquire not an accumulation of knowledge but a method of learning. Teachers become advisers, and students play the key role in learning; that is, students are the subject of learning. Constructivism emphasizes a student-oriented educational environment, which corresponds to the characteristics of hypermedia. In addition, the Internet makes it possible to put constructivism into practice. The Web-based Internet provides a suitable environment for practicing constructivism and drives change in the education system. Web Based Instruction motivates students to learn more, and they can obtain plenty of information regardless of place or time. They can also consult more up-to-date information about their learning through hypermedia such as images, audio, video, and text, and communicate effectively with their instructor through a bulletin board, e-mail, chat, and so on. Schools and instructors have been working to develop new teaching models to cope with this changing environment. This thesis, 'Design and Implementation of Web Based Instruction Based on Constructivism', provides online, learner-oriented, indexed video lessons that give learners the opportunity for self-directed learning. Learners do not have to cover all the contents of a lesson but can choose the contents they want from an indexed list for the lesson, and they can search for contents with a 'Keyword Search' on the main page, which can improve learner achievement.
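
The indexed lesson list and 'Keyword Search' described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the system's data model, so the `Segment` type, the sample `LESSON` entries, and the `keyword_search` function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One indexed part of a video lesson (hypothetical data model)."""
    index: int
    title: str
    start_seconds: int


# Hypothetical indexed list for a single lesson; not data from the paper.
LESSON = [
    Segment(1, "Introduction to HTML", 0),
    Segment(2, "HTML tables", 310),
    Segment(3, "Style sheets", 655),
]


def keyword_search(segments, keyword):
    """Return the segments whose title contains the keyword, ignoring case."""
    needle = keyword.casefold().strip()
    if not needle:
        return []  # an empty query matches nothing
    return [s for s in segments if needle in s.title.casefold()]


if __name__ == "__main__":
    for seg in keyword_search(LESSON, "html"):
        print(seg.index, seg.title, seg.start_seconds)
```

A learner could then jump to the `start_seconds` of any matching segment instead of watching the whole lesson.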


Automatic Speech Style Recognition Through Sentence Sequencing for Speaker Recognition in Bilateral Dialogue Situations (양자 간 대화 상황에서의 화자인식을 위한 문장 시퀀싱 방법을 통한 자동 말투 인식)

  • Kang, Garam;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.27 no.2 / pp.17-32 / 2021
  • Speaker recognition is generally divided into speaker identification and speaker verification. It plays an important role in automatic voice systems, and its importance is growing as portable devices, voice technology, and audio content continue to expand. Previous speaker recognition studies have aimed to determine automatically who the speaker is from voice files and to improve accuracy. Speech style is an important sociolinguistic subject: it carries useful information about the speaker's attitude, conversational intention, and personality, and can be an important clue for speaker recognition. The sentence-final ending used in a speaker's utterance determines the sentence type and conveys information such as the speaker's intention, psychological attitude, or relationship to the listener. Because the use of sentence-final endings varies with the characteristics of the speaker, the type and distribution of the endings used by an unidentified speaker can help in recognizing that speaker. However, few studies have considered speech style in existing text-based speaker recognition, and adding speech style information to speech signal-based speaker recognition techniques could further improve accuracy. Hence, the purpose of this paper is to propose a novel method that uses speech style, expressed as sentence-final endings, to improve the accuracy of Korean speaker recognition. To this end, a method called sentence sequencing is proposed, which generates vector values from the type and frequency of the sentence-final endings appearing in the utterances of a specific person. To evaluate the performance of the proposed method, training and performance evaluation were conducted with an actual drama script. The method proposed in this study can be used as a means to improve the performance of Korean speech recognition services.
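
As a rough illustration of turning the type and frequency of sentence-final endings into a vector, the sketch below counts endings per speaker and compares two speakers by cosine similarity. It is not the authors' implementation: the `ENDINGS` inventory, the suffix-matching rule, and the cosine comparison are all assumptions made for this example, and a real system would likely rely on a morphological analyzer.

```python
from collections import Counter

# Hypothetical inventory of sentence-final endings (an assumption for this
# sketch, not the list used in the paper).
ENDINGS = ["습니다", "어요", "아요", "네요", "거든", "잖아", "지"]


def final_ending(sentence):
    """Return the longest listed ending the sentence ends with, or None."""
    stripped = sentence.strip().rstrip(".?!~ ")
    for ending in sorted(ENDINGS, key=len, reverse=True):
        if stripped.endswith(ending):
            return ending
    return None


def ending_vector(sentences):
    """Relative frequency of each listed ending across a speaker's sentences."""
    counts = Counter(e for e in map(final_ending, sentences) if e is not None)
    total = sum(counts.values())
    return [counts[e] / total if total else 0.0 for e in ENDINGS]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


if __name__ == "__main__":
    formal = ending_vector(["알겠습니다.", "좋아요!"])
    casual = ending_vector(["몰랐잖아?", "그렇지."])
    print(formal, casual, cosine(formal, casual))
```

Here the two sample speakers share no endings, so their vectors are orthogonal; speakers with similar ending habits would score closer to 1.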

Real data-based active sonar signal synthesis method (실데이터 기반 능동 소나 신호 합성 방법론)

  • Yunsu Kim;Juho Kim;Jongwon Seok;Jungpyo Hong
    • The Journal of the Acoustical Society of Korea / v.43 no.1 / pp.9-18 / 2024
  • Active sonar systems are becoming more important as underwater targets grow quieter and ambient noise rises with increasing maritime traffic. However, the low signal-to-noise ratio of the echo signal, caused by multipath propagation, clutter, ambient noise, and reverberation, makes it difficult to identify underwater targets with active sonar. Attempts have been made to apply data-driven methods such as machine learning and deep learning to improve underwater target recognition, but the nature of sonar datasets makes it difficult to collect enough training data. Methods based on mathematical modeling have mainly been used to compensate for insufficient active sonar data, but they are limited in how accurately they can simulate complex underwater phenomena. Therefore, in this paper, we propose a sonar signal synthesis method based on a deep neural network. To apply a neural network model to sonar signal synthesis, the proposed method adapts to sonar signals the attention-based encoder and decoder, the main modules of the Tacotron model widely used in speech synthesis. Training the proposed model on a dataset collected by placing a simulated target in an actual marine environment makes it possible to synthesize signals more similar to real ones. To verify the performance of the proposed method, a Perceptual Evaluation of Audio Quality test was conducted, and the score difference from the actual signal was within -2.3 across four different environments. These results show that the active sonar signal generated by the proposed method approximates the actual signal.