• Title/Summary/Keyword: natural language generation


Voice Synthesis Detection Using Language Model-Based Speech Feature Extraction (언어 모델 기반 음성 특징 추출을 활용한 생성 음성 탐지)

  • Seung-min Kim;So-hee Park;Dae-seon Choi
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.3 / pp.439-449 / 2024
  • Recent rapid advances in voice generation technology have made it possible to synthesize natural-sounding voices from text alone. This progress, however, has led to an increase in malicious activities such as voice phishing (vishing), in which generated voices are exploited for criminal purposes. Numerous models have been developed to detect synthesized voices, typically by extracting features from the audio and using these features to estimate the likelihood that the voice was generated. This paper proposes a new voice feature extraction model to address misuse cases arising from generated voices. It combines a deep learning-based audio codec model with the pre-trained natural language processing model BERT to extract novel voice features. To assess the suitability of the proposed feature extraction model for detection, four generated-voice detection models were built on the extracted features and evaluated. For comparison, three detection models based on the Deepfeature approach proposed in previous studies were evaluated in terms of accuracy and EER. The model proposed in this paper achieved an accuracy of 88.08% and a low EER of 11.79%, outperforming the existing models. These results confirm that the proposed voice feature extraction method can be an effective tool for distinguishing generated voices from real ones.
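The pipeline described in this abstract (codec-derived speech tokens fed to a BERT-style encoder, then a small classifier) can be illustrated with a minimal PyTorch sketch. The module sizes, the `CodecBertDetector` name, and the randomly generated codec tokens are assumptions for illustration; the paper's actual codec, BERT checkpoint, and detection heads are not reproduced here.

```python
# Hypothetical sketch: discrete audio-codec tokens + a small BERT-style encoder
# for scoring real vs. generated speech. Sizes are illustrative, not the paper's.
import torch
import torch.nn as nn
from transformers import BertConfig, BertModel

class CodecBertDetector(nn.Module):
    """Embeds discrete codec tokens and classifies a clip as real or generated."""
    def __init__(self, codebook_size=1024, hidden=256):
        super().__init__()
        self.token_emb = nn.Embedding(codebook_size, hidden)
        # Small BERT-style encoder consuming the codec-token embeddings directly.
        self.encoder = BertModel(BertConfig(hidden_size=hidden, num_hidden_layers=4,
                                            num_attention_heads=4,
                                            intermediate_size=hidden * 4))
        self.head = nn.Linear(hidden, 2)  # logits for [real, generated]

    def forward(self, codec_tokens):           # codec_tokens: (batch, seq_len) int64
        x = self.token_emb(codec_tokens)        # (batch, seq_len, hidden)
        h = self.encoder(inputs_embeds=x).last_hidden_state
        return self.head(h.mean(dim=1))         # mean-pool over time, then classify

# Dummy usage; a real pipeline would obtain the tokens from a neural audio codec.
logits = CodecBertDetector()(torch.randint(0, 1024, (2, 300)))
```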

A Study on the Process Form Generation and Expressive Characteristic by Storytelling in BIG's Architecture (BIG의 건축에서 나타나는 스토리텔링에 의한 형태생성 프로세스와 표현 특성에 관한 연구)

  • Kim, Jong-Sung;Kim, Kai-Chun
    • Korean Institute of Interior Design Journal / v.24 no.6 / pp.79-86 / 2015
  • This study began with an interest in Bjarke Ingels, an emerging architect noted for being both creative and popular. The contemporary architecture field gives architects a foundation for expressing new form-generation processes through a variety of new expressive languages, design concepts, and methods. The global Danish group BIG (Bjarke Ingels Group) develops a story through its own distinctive architectural language. Storytelling is used in many fields, and the 'story' is settling in as an important element of everyday life. Bjarke Ingels, who leads BIG, pursues formal expression through scientific analysis and adaptation, influenced by his Danish regional background and by OMA. He creates forms that share stories with local communities by visually simplifying the region, culture, environment, social phenomena, economy, and politics that are invisible and formless in modern society. The elements and expressive features of spatial storytelling include locality, culture, natural environment, and connectivity, the content structure (story) that lets the subject intervene in the story and imagine a new space. The expressive elements include the viewing circulation story of successive, hierarchical, and organic structures, constructive elements that create various spaces through the mixture, transformation, and relocation of the program and that draw users into the space. The case analysis of BIG's work shows spatial storytelling appearing in diverse forms composed of symbolism, community, and eco-friendliness. The significance of this study is that it identifies, from a storytelling viewpoint, the methods and features many contemporary architects use in the form-generation process, classifies that process through a new storytelling typology, and suggests possibilities for developing various methodologies.

Story Generation Method using User Information in Mobile Environment (모바일 환경에서 사용자 정보를 이용한 스토리 생성 방법)

  • Hong, Jeen-Pyo;Cha, Jeong-Won
    • Journal of Internet Computing and Services / v.14 no.3 / pp.81-90 / 2013
  • Because users always carry their mobile devices, these devices can collect useful user information. In this paper, we propose a method for automatic story generation and user-topic extraction from user information in a mobile environment. The proposed method works as follows: (1) we collect user action information on the mobile device; (2) we extract topics from the collected information; (3) from the results of (2), we determine the episodes for one day; and (4) we generate sentences using sentence templates and compose theme-based or time-based stories. Because the proposed method is simpler than previous methods, it can run entirely on the mobile device, leaving no opportunity for user information to leak. The proposed method is also more informative than previous methods because it provides sentence-based results. The extracted user topics can be used to analyze user actions and preferences.
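A minimal sketch of the template-and-episode idea described above, assuming a simple action-log format (`type`, `time`, `place`/`person`) and two illustrative templates; the paper's actual templates, episode segmentation, and topic extraction are not reproduced here.

```python
# Minimal sketch: turn logged mobile actions into a templated, time-ordered daily story.
from collections import Counter

TEMPLATES = {
    "visit": "At {time}, you spent time at {place}.",
    "call":  "At {time}, you talked with {person}.",
}

def daily_story(actions):
    """actions: list of dicts like {'type': 'visit', 'time': '09:10', 'place': 'a cafe'}."""
    topic = Counter(a["type"] for a in actions).most_common(1)[0][0]  # dominant activity as topic
    ordered = sorted(actions, key=lambda a: a["time"])                 # time-based composition
    sentences = [TEMPLATES[a["type"]].format(**a) for a in ordered]
    return topic, " ".join(sentences)

topic, story = daily_story([
    {"type": "visit", "time": "09:10", "place": "a cafe"},
    {"type": "call",  "time": "13:40", "person": "Jeong-won"},
])
print(topic, "|", story)
```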

Automated Story Generation with Image Captions and Recursive Calls (이미지 캡션 및 재귀호출을 통한 스토리 생성 방법)

  • Isle Jeon;Dongha Jo;Mikyeong Moon
    • Journal of the Institute of Convergence Signal Processing / v.24 no.1 / pp.42-50 / 2023
  • Technological development has brought digital innovation across the media industry, including production and editing technologies, and the OTT and streaming era has diversified how consumers view content. The convergence of big data and deep learning networks has enabled the automatic generation of text in formats such as news articles, novels, and scripts, but few studies have reflected the author's intention and generated stories that are contextually smooth. In this paper, we describe the flow of pictures in a storyboard using image caption generation techniques and automatically generate story-tailored scenarios with a language model. Using a CNN with an attention mechanism for image captioning, we generate sentences describing the pictures on the storyboard and feed the generated sentences into the Korean natural language processing model KoGPT-2 to automatically generate scenarios that match the planning intention. In this way, scenarios tailored to the author's intention and story can be created in large quantities, easing the burden of content creation, and artificial intelligence can participate in the overall process of digital content production to advance media intelligence.
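The caption-to-scenario step can be sketched as follows, assuming the publicly available skt/kogpt2-base-v2 checkpoint on Hugging Face (downloading it requires network access) and a stubbed captioner; the paper's own CNN-plus-attention captioner and any fine-tuned scenario model are not reproduced here.

```python
# Hedged sketch: feed a storyboard caption into KoGPT-2 and let it continue the scenario.
import torch
from transformers import GPT2LMHeadModel, PreTrainedTokenizerFast

tok = PreTrainedTokenizerFast.from_pretrained(
    "skt/kogpt2-base-v2", bos_token="</s>", eos_token="</s>",
    unk_token="<unk>", pad_token="<pad>", mask_token="<mask>")
lm = GPT2LMHeadModel.from_pretrained("skt/kogpt2-base-v2")

def continue_scenario(caption, max_new_tokens=64):
    """caption: a sentence produced by a CNN + attention captioner (not shown here)."""
    ids = tok.encode(caption, return_tensors="pt")
    out = lm.generate(ids, max_new_tokens=max_new_tokens, do_sample=True, top_p=0.9,
                      pad_token_id=tok.pad_token_id)
    return tok.decode(out[0], skip_special_tokens=True)

# Example caption for one storyboard frame ("Two people walk along the beach at sunset").
print(continue_scenario("두 사람이 해질녘 바닷가를 걷는다."))
```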

Research on the Utilization of Recurrent Neural Networks for Automatic Generation of Korean Definitional Sentences of Technical Terms (기술 용어에 대한 한국어 정의 문장 자동 생성을 위한 순환 신경망 모델 활용 연구)

  • Choi, Garam;Kim, Han-Gook;Kim, Kwang-Hoon;Kim, You-eil;Choi, Sung-Pil
    • Journal of the Korean Society for Library and Information Science / v.51 no.4 / pp.99-120 / 2017
  • This paper aims to support the development of a semi-automatic system that allows researchers to efficiently analyze technical trends in the ever-growing industry and market. It introduces Korean sentence generation models that can automatically generate definitional statements and descriptions of technical terms and concepts. The proposed models are based on LSTM (Long Short-Term Memory), a deep learning model capable of effectively labeling textual sequences by taking into account the contextual relations among the items in a sequence. The models take technical terms as input and can generate a broad range of heterogeneous textual descriptions that explain the concepts behind those terms. In experiments with large-scale training collections, we confirmed that more accurate and reasonable sentences are generated by the CHAR-CNN-LSTM model, a word-based LSTM that exploits character embeddings built with convolutional neural networks (CNN). The results of this study can drive the development of an extended model that generates a set of sentences covering the same subjects and, furthermore, an artificial intelligence model that automatically produces technical literature.
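A rough sketch of the CHAR-CNN-LSTM idea mentioned above: character convolutions build word representations that feed a word-level LSTM language model. All layer sizes and the `CharCNNLSTM` name are assumptions for illustration, not the paper's configuration.

```python
# Sketch: character-CNN word encodings feeding a word-level LSTM that predicts the next word.
import torch
import torch.nn as nn

class CharCNNLSTM(nn.Module):
    def __init__(self, n_chars=200, n_words=50_000, char_dim=16, word_dim=128, hidden=256):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.char_cnn = nn.Conv1d(char_dim, word_dim, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(word_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_words)            # next-word distribution

    def forward(self, char_ids):                          # (batch, words_in_seq, chars_per_word)
        b, t, c = char_ids.shape
        x = self.char_emb(char_ids.view(b * t, c))        # (b*t, chars, char_dim)
        x = self.char_cnn(x.transpose(1, 2))              # (b*t, word_dim, chars)
        x = x.max(dim=2).values                           # max-over-time pooling -> word vectors
        h, _ = self.lstm(x.view(b, t, -1))
        return self.out(h)                                # logits for the next word at each step

logits = CharCNNLSTM()(torch.randint(0, 200, (2, 12, 8)))
```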

A general-purpose model capable of image captioning in Korean and English and a method to generate text suitable for the purpose (한국어 및 영어 이미지 캡션이 가능한 범용적 모델 및 목적에 맞는 텍스트를 생성해주는 기법)

  • Cho, Su Hyun;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.8 / pp.1111-1120 / 2022
  • Image captioning is the task of viewing an image and describing it in language. It is an important problem that can be solved by connecting and jointly understanding the two areas of image processing and natural language processing. By automatically recognizing and describing images in text, images can be converted into text and then into speech, helping visually impaired people understand their surroundings and supporting important applications such as image search, art therapy, sports commentary, and real-time traffic information commentary. So far, image captioning research has focused solely on recognizing images and turning them into text. For practical use, however, the various environments found in reality must be considered, and image descriptions should be provided for the intended purpose. In this work, we present a general-purpose model capable of image captioning in Korean and English, together with a technique for generating text suited to the intended purpose.

A Morpheme Analyzer based on Transformer using Morpheme Tokens and User Dictionary (사용자 사전과 형태소 토큰을 사용한 트랜스포머 기반 형태소 분석기)

  • DongHyun Kim;Do-Guk Kim;ChulHui Kim;MyungSun Shin;Young-Duk Seo
    • Smart Media Journal / v.12 no.9 / pp.19-27 / 2023
  • Since morphemes are the smallest unit of meaning in Korean, an accurate morpheme analyzer is needed to improve the performance of Korean language models. Most existing analyzers, however, produce morpheme analysis results by learning from word-unit tokens as input. Because Korean words consist of a root with attached postpositions and affixes, words with the same root can differ in meaning depending on those postpositions or affixes, so learning morphemes from word-unit tokens can lead to the misclassification of postpositions or affixes. In this paper, we use morpheme-level tokens to capture the inherent meaning of Korean sentences and propose a morpheme analyzer based on a Transformer sequence-generation method. In addition, a user dictionary is constructed from corpus data to address the out-of-vocabulary problem. In the experiments, the morphemes and morpheme tags produced by each analyzer were compared against gold-standard data, and the results showed that the proposed analyzer outperforms existing morpheme analyzers.
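The dictionary-first behaviour described above can be illustrated with a short sketch; the `USER_DICT` entries and the `generate_morphemes` call on a stubbed Transformer analyzer are hypothetical placeholders, not the authors' implementation.

```python
# Illustrative sketch: consult a user dictionary first, then fall back to a (stubbed)
# Transformer sequence-generation analyzer for words the dictionary does not cover.
USER_DICT = {"세종대왕": [("세종대왕", "NNP")]}   # surface form -> [(morpheme, tag), ...]

def analyze(sentence, model=None):
    result = []
    for word in sentence.split():
        if word in USER_DICT:                           # OOV-safe path: user entries win
            result.extend(USER_DICT[word])
        elif model is not None:
            result.extend(model.generate_morphemes(word))  # hypothetical seq2seq call
        else:
            result.append((word, "UNK"))                # no model available in this sketch
    return result

print(analyze("세종대왕 한글"))
```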

Design of Parallel Input Pattern and Synchronization Method for Multimodal Interaction (멀티모달 인터랙션을 위한 사용자 병렬 모달리티 입력방식 및 입력 동기화 방법 설계)

  • Im, Mi-Jeong;Park, Beom
    • Journal of the Ergonomics Society of Korea / v.25 no.2 / pp.135-146 / 2006
  • Multimodal interfaces are recognition-based technologies that interpret and encode hand gestures, eye gaze, movement patterns, speech, physical location, and other natural human behaviors. Modality is the type of communication channel used for interaction; it also covers the way an idea is expressed or perceived, or the manner in which an action is performed. Multimodal interfaces constitute multimodal interaction processes that occur consciously or unconsciously while a human communicates with a computer, so their input/output forms differ from those of existing interfaces. Moreover, different people show different cognitive styles, and individual preferences play a role in the selection of one input mode over another. To develop an effective design for multimodal user interfaces, the input/output structure therefore needs to be formulated through research on human cognition. This paper analyzes the characteristics of each human modality and suggests combination types of modalities and dual coding for formulating multimodal interaction. It then designs a multimodal language and an input synchronization method according to the granularity of input synchronization. To effectively guide the development of next-generation multimodal interfaces, substantial cognitive modeling will be needed to understand the temporal and semantic relations between different modalities, their joint functionality, and their overall potential for supporting computation in different forms. This paper is expected to show multimodal interface designers how to organize and integrate human input modalities in multimodal interfaces.
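One simple way to realize granularity-based synchronization of parallel modality inputs is to fuse events whose timestamps fall within a window. The sketch below uses an assumed event format and a 300 ms window purely for illustration; it is not the synchronization method designed in the paper.

```python
# Sketch: group events from parallel modalities into one multimodal command
# when their timestamps fall within a single granularity window.
def synchronize(events, window_ms=300):
    """events: list of (timestamp_ms, modality, token), e.g. (120, 'speech', 'open')."""
    events = sorted(events)                   # order by timestamp
    fused, group = [], [events[0]]
    for ev in events[1:]:
        if ev[0] - group[-1][0] <= window_ms:
            group.append(ev)                  # close enough in time: same command
        else:
            fused.append(group)
            group = [ev]
    fused.append(group)
    return fused

print(synchronize([(100, "speech", "open"),
                   (250, "gesture", "point-at-door"),
                   (900, "speech", "close")]))
```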

Choosing preferable labels for the Japanese translation of the Human Phenotype Ontology

  • Ninomiya, Kota;Takatsuki, Terue;Kushida, Tatsuya;Yamamoto, Yasunori;Ogishima, Soichi
    • Genomics & Informatics / v.18 no.2 / pp.23.1-23.6 / 2020
  • The Human Phenotype Ontology (HPO) is the de facto standard ontology to describe human phenotypes in detail, and it is actively used, particularly in the field of rare disease diagnoses. For clinicians who are not fluent in English, the HPO has been translated into many languages, and there have been four initiatives to develop Japanese translations. At the Biomedical Linked Annotation Hackathon 6 (BLAH6), a rule-based approach was attempted to determine the preferable Japanese translation for each HPO term among the candidates developed by the four approaches. The relationship between the HPO and Mammalian Phenotype translations was also investigated, with the eventual goal of harmonizing the two translations to facilitate phenotype-based comparisons of species in Japanese through cross-species phenotype matching. In order to deal with the increase in the number of HPO terms and the need for manual curation, it would be useful to have a dictionary containing word-by-word correspondences and fixed translation phrases for English word order. These considerations seem applicable to HPO localization into other languages.
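A hedged illustration of the two ideas mentioned above: preferring one candidate label by a simple rule and falling back to a word-by-word dictionary. The rule used here (shortest candidate), the dictionary entries, and the example labels are invented for illustration and are not the BLAH6 rules or data.

```python
# Sketch: choose a preferred translation among candidates, with a word-by-word fallback.
WORD_DICT = {"abnormality": "異常", "of": "の", "the": "", "heart": "心臓"}

def prefer_label(candidates):
    """candidates: translations from the four initiatives; illustrative rule: shortest non-empty."""
    candidates = [c for c in candidates if c]
    return min(candidates, key=len) if candidates else None

def dictionary_fallback(english_label):
    # Word-by-word lookup; English word order is kept, which is why fixed phrases help.
    return "".join(WORD_DICT.get(w.lower(), w) for w in english_label.split())

print(prefer_label(["心臓の異常", "心臓異常", None]))   # pick among existing candidates
print(dictionary_fallback("Abnormality of the heart"))  # fallback when no candidate exists
```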

An Example-based Korean Standard Industrial and Occupational Code Classification (예제기반 한국어 표준 산업/직업 코드 분류)

  • Lim Heui-Seok
    • Journal of the Korea Academia-Industrial cooperation Society / v.7 no.4 / pp.594-601 / 2006
  • Assigning occupational and industrial codes is a major operation in the census surveys of the Korean statistics bureau. The coding process has been done manually; such manual work is labor- and cost-intensive and often produces inconsistent results. This paper proposes an automatic coding system based on example-based learning. The system converts natural language input into the corresponding numeric codes using a code generation component trained by example-based learning, applied after manually built rules. In experiments with training data consisting of 400,000 records and 260 manual rules, the proposed system achieved about 76.69% accuracy for occupational code classification and 99.68% for industrial code classification.
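The rules-then-examples flow described above can be sketched as follows. The keyword rule, the example records, and the codes are placeholders, and the similarity measure (difflib string matching) merely stands in for the system's example-based learner.

```python
# Sketch under assumed data: apply hand-built rules first, then let the nearest
# training example decide the numeric code.
import difflib

RULES = {"벼농사": "011"}                                   # placeholder rule: keyword -> code
EXAMPLES = {"초등학교 교사": "233",                          # placeholder example records
            "택시 운전": "842",
            "소프트웨어 개발": "223"}

def classify(text):
    for keyword, code in RULES.items():                     # rule pass
        if keyword in text:
            return code
    best = difflib.get_close_matches(text, EXAMPLES, n=1, cutoff=0.0)  # example-based pass
    return EXAMPLES[best[0]]

print(classify("고등학교 수학 교사"))   # no rule fires, so the closest example decides
```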
