• Title/Summary/Keyword: artificial intelligence-based models


Token-Based Classification and Dataset Construction for Detecting Modified Profanity (변형된 비속어 탐지를 위한 토큰 기반의 분류 및 데이터셋)

  • Sungmin Ko;Youhyun Shin
    • The Transactions of the Korea Information Processing Society / v.13 no.4 / pp.181-188 / 2024
  • Traditional profanity detection methods have limitations in identifying intentionally altered profanities. This paper introduces a new method based on Named Entity Recognition, a subfield of Natural Language Processing. We developed a profanity detection technique using sequence labeling, for which we constructed a dataset by labeling some profanities in Korean malicious comments and conducted experiments. Additionally, to enhance the model's performance, we augmented the dataset by labeling parts of a Korean hate speech dataset using a large language model, ChatGPT, and trained on the result. During this process, we confirmed that having humans filter the dataset created by the large language model was enough on its own to improve performance. This suggests that human oversight is still necessary in the dataset augmentation process.
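
The sequence-labeling setup described in this abstract can be illustrated with a minimal sketch. It assumes a BIO tagging scheme, whitespace tokenization, and a hypothetical `PROF` label; the abstract does not state the paper's actual tag set or tokenizer, and the sample text uses a harmless placeholder in place of a real altered profanity.

```python
import re


def tokenize_with_offsets(text):
    """Split on whitespace and keep each token's character offsets."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def bio_labels(text, spans, label="PROF"):
    """Convert character-level spans (start, end exclusive) into BIO tags.

    `label` is a hypothetical tag name chosen for illustration.
    """
    tokens = tokenize_with_offsets(text)
    tags = ["O"] * len(tokens)
    for span_start, span_end in spans:
        inside = False
        for i, (_, tok_start, tok_end) in enumerate(tokens):
            # A token belongs to the span if the two ranges overlap.
            if tok_start < span_end and tok_end > span_start:
                tags[i] = ("I-" if inside else "B-") + label
                inside = True
    return [(tok, tag) for (tok, _, _), tag in zip(tokens, tags)]


# The placeholder "b@d w0rd" stands in for an altered profanity that was
# split across two tokens; it covers characters 5..13 of the text.
text = "that b@d w0rd again"
labels = bio_labels(text, [(5, 13)])
for token, tag in labels:
    print(f"{token}\t{tag}")
```

Labeling at the token level lets a model flag only the offending span, even when the word has been split or obfuscated, rather than classifying the whole comment.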

A Research of User Experience on Multi-Modal Interactive Digital Art

  • Qianqian Jiang;Jeanhun Chung
    • International Journal of Internet, Broadcasting and Communication / v.16 no.1 / pp.80-85 / 2024
  • The concept of single-modal digital art originated in the 20th century and has evolved through three key stages. Over time, digital art has transformed into multi-modal interaction, representing a new era in art forms. Based on multi-modal theory, this paper aims to explore the characteristics of interactive digital art in innovative art forms and its impact on user experience. Through an analysis of practical applications of multi-modal interactive digital art, this study summarises the impact of creative models of digital art on the physical and mental aspects of user experience. In creating audio-visual-based art, multi-modal digital art should seamlessly incorporate sensory elements and leverage computer image processing technology. Focusing on user perception, emotional expression, and cultural communication, it strives to establish an immersive environment with user experience at its core. Future research, particularly with emerging technologies like Artificial Intelligence (AI) and Virtual Reality (VR), should not merely prioritize technology but aim for meaningful interaction. Through multi-modal interaction, digital art is poised to continually innovate, offering new possibilities and expanding the realm of interactive digital art.

Risk Estimates of Structural Changes in Freight Rates (해상운임의 구조변화 리스크 추정)

  • Hyunsok Kim
    • Journal of Korea Port Economic Association / v.39 no.4 / pp.255-268 / 2023
  • This paper focuses on tests for generalized fluctuation in the context of assessing structural changes based on linear regression models. For efficient estimation, there has been a growing focus on structural change monitoring, particularly in relation to fields such as artificial intelligence (hereafter AI) and machine learning (hereafter ML). Specifically, the investigation elucidates the implementation of structural changes and presents a coherent approach for practical application to the BDI (Baltic Dry-bulk Index), which serves as a representative maritime trade index in the global market. The framework encompasses a range of F-statistic-type methodologies for fitting, visualization, and evaluation of empirical fluctuation processes, including CUSUM, MOSUM, and estimates-based processes. Additionally, it provides functionality for the computation and evaluation of sequences of pruned exact linear time (hereafter PELT).
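
As an illustration of one of the empirical fluctuation processes named in this abstract, the following minimal sketch computes an OLS-based CUSUM path for the simplest possible regression, a constant-only model. The series is synthetic (not BDI data), and the function name `ols_cusum` is an assumption made for this example rather than anything taken from the paper.

```python
import math


def ols_cusum(y):
    """OLS-based CUSUM process for a constant-only regression.

    Residuals from the sample mean are cumulated and scaled by
    sigma_hat * sqrt(n), so the path starts near 0 and ends at 0.
    """
    n = len(y)
    mean = sum(y) / n
    resid = [v - mean for v in y]
    # Residual standard deviation with n - 1 degrees of freedom
    # (one estimated coefficient: the intercept).
    sigma = math.sqrt(sum(e * e for e in resid) / (n - 1))
    path, total = [], 0.0
    for e in resid:
        total += e
        path.append(total / (sigma * math.sqrt(n)))
    return path


# Synthetic series with a level shift after the 20th observation.
series = [1.0] * 20 + [3.0] * 20
path = ols_cusum(series)

# The largest absolute excursion points at the most likely break date.
peak = max(range(len(path)), key=lambda i: abs(path[i]))
print("largest excursion at index", peak, "value", round(path[peak], 3))
```

Under the null hypothesis of no structural change the path should stay close to zero, so a large excursion, as produced by the level shift in this synthetic series, is evidence of a break near the index where the excursion peaks.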

CoNSIST: Consist of New Methodologies on AASIST for Audio Deepfake Detection (컨시스트: 오디오 딥페이크 탐지를 위한 그래프 어텐션 기반 새로운 모델링 방법론 연구)

  • Jae Hoon Ha;Joo Won Mun;Sang Yup Lee
    • The Transactions of the Korea Information Processing Society / v.13 no.10 / pp.513-519 / 2024
  • Advancements in artificial intelligence (AI) have significantly improved deep learning-based audio deepfake technology, which has been exploited for criminal activities. To detect audio deepfakes, we propose CoNSIST, an advanced audio deepfake detection model. CoNSIST builds on AASIST, which is a graph-based end-to-end model, by integrating three key components: Squeeze and Excitation, Positional Encoding, and Reformulated HS-GAL. These additions aim to enhance feature extraction, eliminate unnecessary operations, and incorporate diverse information. Our experimental results demonstrate that CoNSIST significantly outperforms existing models in detecting audio deepfakes, offering a more robust solution to combat the misuse of this technology.
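
Of the three components listed in this abstract, Squeeze and Excitation is the easiest to show in isolation. The sketch below is a generic, dependency-free version of the standard channel-gating block (global average pooling, two fully connected layers, sigmoid gates), with biases omitted and toy weights chosen by hand. How CoNSIST wires the block into AASIST is not stated in the abstract, so nothing here should be read as the authors' implementation.

```python
import math


def squeeze_excite(feature_map, w1, w2):
    """Generic Squeeze-and-Excitation gating over channels.

    feature_map: list of C channels, each a list of activations.
    w1: (C/r) x C weights of the reduction layer (bias omitted).
    w2: C x (C/r) weights of the expansion layer (bias omitted).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(channel) / len(channel) for channel in feature_map]
    # Excitation: FC -> ReLU -> FC -> sigmoid.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every activation by its channel's gate.
    scaled = [[g * v for v in channel] for g, channel in zip(gates, feature_map)]
    return scaled, gates


# Toy input: 4 channels with 3 activations each, reduction ratio r = 2.
feature_map = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [4.0, 4.0, 4.0], [-1.0, 1.0, 0.0]]
w1 = [[0.5, -0.2, 0.1, 0.3], [-0.4, 0.6, 0.2, 0.1]]
w2 = [[1.0, -1.0], [0.5, 0.5], [-0.3, 0.8], [0.2, -0.6]]

scaled, gates = squeeze_excite(feature_map, w1, w2)
print([round(g, 3) for g in gates])
```

Each gate lies strictly between 0 and 1, so the block can only attenuate or preserve a channel, which is what lets a network emphasize informative channels relative to the rest.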

Development of AI-based Cognitive Production Technology for Digital Datadriven Agriculture, Livestock Farming, and Fisheries (디지털 데이터 중심의 AI기반 환경인지 생산기술 개발 방향)

  • Kim, S.H.
    • Electronics and Telecommunications Trends / v.36 no.1 / pp.54-63 / 2021
  • Since the recent COVID-19 pandemic, countries have been strengthening trade protection for their security, and the importance of securing strategic materials, such as food, is drawing attention. In addition to the cultural aspects, the global preference for food produced in Korea is increasing because of the Korean Wave. Thus, the Korean food industry can be developed into a high-value-added export food industry. Currently, Korea has a low self-sufficiency rate for foodstuffs apart from rice. Korea also suffers from problems arising from population decline, aging, rapid climate change, and various animal and plant diseases. It is necessary to develop technologies that can overcome food production structures that depend heavily on the outside world and foster them into export-type system industries. The global agricultural industry-related technologies are actively being modified via data accumulation, e.g., environmental data, production information, and distribution and consumption information in climate and production facilities, and by actively expanding the introduction of the latest information and communication technologies such as big data and artificial intelligence. However, long-term research and investment should precede the field of living organisms. Compared to other industries, it is necessary to overcome poor production and labor environment investment efficiency in the food industry with respect to the production cost, equipment post-management, development tailored to the eye level of field workers, and service models suitable for production facilities of various sizes. This paper discusses the flow of domestic and international technologies that form the core issues of the site centered on the 4th Industrial Revolution in the field of agriculture, livestock, and fisheries. It also explains the environmental awareness production technologies centered on sustainable intelligence platforms that link climate change responses, optimization of energy costs, and mass production for unmanned production, distribution, and consumption using the unstructured data obtained based on detection and growth measurement data.

Imputation of Missing SST Observation Data Using Multivariate Bidirectional RNN (다변수 Bidirectional RNN을 이용한 표층수온 결측 데이터 보간)

  • Shin, YongTak;Kim, Dong-Hoon;Kim, Hyeon-Jae;Lim, Chaewook;Woo, Seung-Buhm
    • Journal of Korean Society of Coastal and Ocean Engineers / v.34 no.4 / pp.109-118 / 2022
  • Missing sections of the vertex sea surface temperature observation data were imputed using a Bidirectional Recurrent Neural Network (BiRNN). Among artificial intelligence techniques, Recurrent Neural Networks (RNNs), which are commonly used for time series data, only estimate in the direction of time flow or in the reverse direction to the missing estimation position, so the estimation performance is poor in long-term missing sections. In this study, by contrast, estimation performance can be improved even for long-term missing data by estimating in both directions, before and after the missing section. Also, by using all available data around the observation point (sea surface temperature, temperature, wind field, atmospheric pressure, humidity), the imputation performance was further improved by estimating the imputation data from these correlations together. For performance verification, a statistical model, Multivariate Imputation by Chained Equations (MICE), a machine learning-based Random Forest model, and an RNN model using Long Short-Term Memory (LSTM) were compared. For imputation of a long-term gap of 7 days, the average accuracy of the BiRNN/statistical models is 70.8%/61.2%, respectively, and the average error is 0.28 degrees/0.44 degrees, respectively, so the BiRNN model performs better than the other models. By applying a temporal decay factor representing the missing pattern, it is judged that the BiRNN technique has better imputation performance than the existing method as the missing section becomes longer.
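
The abstract mentions a temporal decay factor that represents the missing pattern but does not give its formula. The sketch below shows one common formulation from GRU-D-style imputation models, assumed here purely for illustration: a time gap that grows while observations are missing, passed through an exponential decay. The parameters `w` and `b` would normally be learned; the fixed values are hypothetical.

```python
import math


def time_gaps(mask):
    """Time elapsed since the last observed value, with unit time steps.

    mask[t] is 1 when the value at step t was observed and 0 when missing.
    The gap resets to 1 right after an observation and keeps growing
    across consecutive missing steps.
    """
    gaps = [0]
    for t in range(1, len(mask)):
        gaps.append(1 if mask[t - 1] else gaps[-1] + 1)
    return gaps


def decay_factors(gaps, w=0.5, b=0.0):
    """Exponential decay exp(-max(0, w * gap + b)), a value in (0, 1].

    w and b are hypothetical fixed values standing in for learned parameters.
    """
    return [math.exp(-max(0.0, w * g + b)) for g in gaps]


# Two observed steps, a three-step gap, then one observed step.
mask = [1, 1, 0, 0, 0, 1]
gaps = time_gaps(mask)
decays = decay_factors(gaps)
for t, (m, g, d) in enumerate(zip(mask, gaps, decays)):
    print(t, m, g, round(d, 3))
```

The further a step is from the last real observation, the smaller the factor, so a model using it relies less on stale hidden state deep inside a long gap.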

Development and Assessment of LSTM Model for Correcting Underestimation of Water Temperature in Korean Marine Heatwave Prediction System (한반도 고수온 예측 시스템의 수온 과소모의 보정을 위한 LSTM 모델 구축 및 예측성 평가)

  • Na Kyoung Im;Hyunkeun Jin;Gyundo Pak;Young-Gyu Park;Kyeong Ok Kim;Yonghan Choi;Young Ho Kim
    • The Sea: Journal of the Korean Society of Oceanography / v.29 no.2 / pp.101-115 / 2024
  • The ocean heatwave is emerging as a major issue due to global warming, posing a direct threat to marine ecosystems and humanity through decreased food resources and reduced carbon absorption capacity of the oceans. Consequently, the prediction of ocean heatwaves in the vicinity of the Korean Peninsula is becoming increasingly important for marine environmental monitoring and management. In this study, an LSTM model was developed to improve the underestimated prediction of ocean heatwaves caused by the coarse vertical grid system of the Korean Peninsula Ocean Prediction System. Based on the results of ocean heatwave predictions for the Korean Peninsula conducted in 2023, as well as those generated by the LSTM model, the performance of heatwave predictions in the East Sea, Yellow Sea, and South Sea areas surrounding the Korean Peninsula was evaluated. The LSTM model developed in this study significantly improved the prediction performance of sea surface temperatures during periods of temperature increase in all three regions. However, its effectiveness in improving prediction performance during periods of temperature decrease or before temperature rise initiation was limited. This demonstrates the potential of the LSTM model to address the underestimated prediction of ocean heatwaves caused by the coarse vertical grid system during periods of enhanced stratification. It is anticipated that the utility of data-driven artificial intelligence models will expand in the future to improve the prediction performance of dynamical models or even replace them.

One-shot multi-speaker text-to-speech using RawNet3 speaker representation (RawNet3를 통해 추출한 화자 특성 기반 원샷 다화자 음성합성 시스템)

  • Sohee Han;Jisub Um;Hoirin Kim
    • Phonetics and Speech Sciences / v.16 no.1 / pp.67-76 / 2024
  • Recent advances in text-to-speech (TTS) technology have significantly improved the quality of synthesized speech, reaching a level where it can closely imitate natural human speech. In particular, TTS models offering various voice characteristics and personalized speech are widely utilized in fields such as artificial intelligence (AI) tutors, advertising, and video dubbing. Accordingly, in this paper, we propose a one-shot multi-speaker TTS system that can ensure acoustic diversity and synthesize personalized voice by generating speech using unseen target speakers' utterances. The proposed model integrates a speaker encoder into a TTS model consisting of the FastSpeech2 acoustic model and the HiFi-GAN vocoder. The speaker encoder, based on the pre-trained RawNet3, extracts speaker-specific voice features. Furthermore, the proposed approach not only includes an English one-shot multi-speaker TTS but also introduces a Korean one-shot multi-speaker TTS. We evaluate naturalness and speaker similarity of the generated speech using objective and subjective metrics. In the subjective evaluation, the proposed Korean one-shot multi-speaker TTS obtained a naturalness mean opinion score (NMOS) of 3.36 and a similarity MOS (SMOS) of 3.16. The objective evaluation of the proposed English and Korean one-shot multi-speaker TTS showed a prediction MOS (P-MOS) of 2.54 and 3.74, respectively. These results indicate that the performance of our proposed model is improved over the baseline models in terms of both naturalness and speaker similarity.

Deep Learning-Based Brain Tumor Classification in MRI images using Ensemble of Deep Features

  • Kang, Jaeyong;Gwak, Jeonghwan
    • Journal of the Korea Society of Computer and Information / v.26 no.7 / pp.37-44 / 2021
  • Automatic classification of brain MRI images plays an important role in the early diagnosis of brain tumors. In this work, we present a deep learning-based brain tumor classification model for MRI images using an ensemble of deep features. In our proposed framework, three different deep features are extracted from a brain MR image using three different pre-trained models. The extracted deep features are then fed to the classification module. In the classification module, the three deep features are first fed into fully-connected layers individually to reduce the dimension of the features. The output features from these fully-connected layers are then concatenated and fed into a final fully-connected layer to predict the output. To evaluate our proposed model, we use an openly accessible brain MRI dataset from the web. Experimental results show that our proposed model outperforms other machine learning-based models.

Microcode based Controller for Compact CNN Accelerators Aimed at Mobile Devices (모바일 디바이스를 위한 소형 CNN 가속기의 마이크로코드 기반 컨트롤러)

  • Na, Yong-Seok;Son, Hyun-Wook;Kim, Hyung-Won
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.3 / pp.355-366 / 2022
  • This paper proposes a microcode-based neural network accelerator controller for artificial intelligence accelerators that can be reconfigured using a programmable architecture and provides the advantages of low power and ultra-small chip size. For the target accelerator to support various neural network models, a neural network model can be converted into microcode through a microcode compiler and loaded onto the accelerator to control its operators, such as the datapath and memory access. While the proposed controller and accelerator can run various CNN models, in this paper we tested them using the YOLOv2-Tiny CNN model. Using a system clock of 200 MHz, the controller and accelerator achieved an inference time of 137.9 ms/image for object detection on the VOC 2012 dataset and 99.5 ms/image for mask-wearing detection on a mask detection dataset. When an accelerator equipped with the proposed controller is implemented as a silicon chip, the gate count is 618,388, which corresponds to a 65.5% reduction in chip area compared with an accelerator employing a CPU-based controller (RISC-V).
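
The reported latencies can be turned into clock cycles and throughput with simple arithmetic. The sketch below uses only the figures stated in this abstract (a 200 MHz clock, 137.9 ms and 99.5 ms per image); the function and variable names are illustrative.

```python
CLOCK_HZ = 200e6  # system clock reported in the abstract

# Per-image inference latencies reported in the abstract, in milliseconds.
latencies_ms = {
    "VOC 2012 object detection": 137.9,
    "mask detection": 99.5,
}


def cycles_per_image(latency_ms, clock_hz=CLOCK_HZ):
    """Clock cycles consumed by one inference at the given clock rate."""
    return latency_ms / 1000.0 * clock_hz


def frames_per_second(latency_ms):
    """Throughput when images are processed one after another."""
    return 1000.0 / latency_ms


results = {
    name: (cycles_per_image(ms), frames_per_second(ms))
    for name, ms in latencies_ms.items()
}
for name, (cycles, fps) in results.items():
    print(f"{name}: {cycles / 1e6:.2f} M cycles/image, {fps:.2f} images/s")
```

At 200 MHz this works out to roughly 27.6 million cycles (about 7.25 images per second) for the VOC 2012 case and 19.9 million cycles (about 10 images per second) for mask detection.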