• 제목/요약/키워드: artificial intelligence-based model

Search Result 1,215, Processing Time 0.03 seconds

Imputation of Missing SST Observation Data Using Multivariate Bidirectional RNN (다변수 Bidirectional RNN을 이용한 표층수온 결측 데이터 보간)

  • Shin, YongTak;Kim, Dong-Hoon;Kim, Hyeon-Jae;Lim, Chaewook;Woo, Seung-Buhm
    • Journal of Korean Society of Coastal and Ocean Engineers
    • /
    • v.34 no.4
    • /
    • pp.109-118
    • /
    • 2022
  • The data of the missing section among the vertex surface sea temperature observation data was imputed using the Bidirectional Recurrent Neural Network(BiRNN). Among artificial intelligence techniques, Recurrent Neural Networks (RNNs), which are commonly used for time series data, only estimate in the direction of time flow or in the reverse direction to the missing estimation position, so the estimation performance is poor in the long-term missing section. On the other hand, in this study, estimation performance can be improved even for long-term missing data by estimating in both directions before and after the missing section. Also, by using all available data around the observation point (sea surface temperature, temperature, wind field, atmospheric pressure, humidity), the imputation performance was further improved by estimating the imputation data from these correlations together. For performance verification, a statistical model, Multivariate Imputation by Chained Equations (MICE), a machine learning-based Random Forest model, and an RNN model using Long Short-Term Memory (LSTM) were compared. For imputation of long-term missing for 7 days, the average accuracy of the BiRNN/statistical models is 70.8%/61.2%, respectively, and the average error is 0.28 degrees/0.44 degrees, respectively, so the BiRNN model performs better than other models. By applying a temporal decay factor representing the missing pattern, it is judged that the BiRNN technique has better imputation performance than the existing method as the missing section becomes longer.

One-shot multi-speaker text-to-speech using RawNet3 speaker representation (RawNet3를 통해 추출한 화자 특성 기반 원샷 다화자 음성합성 시스템)

  • Sohee Han;Jisub Um;Hoirin Kim
    • Phonetics and Speech Sciences
    • /
    • v.16 no.1
    • /
    • pp.67-76
    • /
    • 2024
  • Recent advances in text-to-speech (TTS) technology have significantly improved the quality of synthesized speech, reaching a level where it can closely imitate natural human speech. Especially, TTS models offering various voice characteristics and personalized speech, are widely utilized in fields such as artificial intelligence (AI) tutors, advertising, and video dubbing. Accordingly, in this paper, we propose a one-shot multi-speaker TTS system that can ensure acoustic diversity and synthesize personalized voice by generating speech using unseen target speakers' utterances. The proposed model integrates a speaker encoder into a TTS model consisting of the FastSpeech2 acoustic model and the HiFi-GAN vocoder. The speaker encoder, based on the pre-trained RawNet3, extracts speaker-specific voice features. Furthermore, the proposed approach not only includes an English one-shot multi-speaker TTS but also introduces a Korean one-shot multi-speaker TTS. We evaluate naturalness and speaker similarity of the generated speech using objective and subjective metrics. In the subjective evaluation, the proposed Korean one-shot multi-speaker TTS obtained naturalness mean opinion score (NMOS) of 3.36 and similarity MOS (SMOS) of 3.16. The objective evaluation of the proposed English and Korean one-shot multi-speaker TTS showed a prediction MOS (P-MOS) of 2.54 and 3.74, respectively. These results indicate that the performance of our proposed model is improved over the baseline models in terms of both naturalness and speaker similarity.

Development of Low-Cost Vision-based Eye Tracking Algorithm for Information Augmented Interactive System

  • Park, Seo-Jeon;Kim, Byung-Gyu
    • Journal of Multimedia Information System
    • /
    • v.7 no.1
    • /
    • pp.11-16
    • /
    • 2020
  • Deep Learning has become the most important technology in the field of artificial intelligence machine learning, with its high performance overwhelming existing methods in various applications. In this paper, an interactive window service based on object recognition technology is proposed. The main goal is to implement an object recognition technology using this deep learning technology to remove the existing eye tracking technology, which requires users to wear eye tracking devices themselves, and to implement an eye tracking technology that uses only usual cameras to track users' eye. We design an interactive system based on efficient eye detection and pupil tracking method that can verify the user's eye movement. To estimate the view-direction of user's eye, we initialize to make the reference (origin) coordinate. Then the view direction is estimated from the extracted eye pupils from the origin coordinate. Also, we propose a blink detection technique based on the eye apply ratio (EAR). With the extracted view direction and eye action, we provide some augmented information of interest without the existing complex and expensive eye-tracking systems with various service topics and situations. For verification, the user guiding service is implemented as a proto-type model with the school map to inform the location information of the desired location or building.

Development of ResNet-based WBC Classification Algorithm Using Super-pixel Image Segmentation

  • Lee, Kyu-Man;Kang, Soon-Ah
    • Journal of the Korea Society of Computer and Information
    • /
    • v.23 no.4
    • /
    • pp.147-153
    • /
    • 2018
  • In this paper, we propose an efficient WBC 14-Diff classification which performs using the WBC-ResNet-152, a type of CNN model. The main point of view is to use Super-pixel for the segmentation of the image of WBC, and to use ResNet for the classification of WBC. A total of 136,164 blood image samples (224x224) were grouped for image segmentation, training, training verification, and final test performance analysis. Image segmentation using super-pixels have different number of images for each classes, so weighted average was applied and therefore image segmentation error was low at 7.23%. Using the training data-set for training 50 times, and using soft-max classifier, TPR average of 80.3% for the training set of 8,827 images was achieved. Based on this, using verification data-set of 21,437 images, 14-Diff classification TPR average of normal WBCs were at 93.4% and TPR average of abnormal WBCs were at 83.3%. The result and methodology of this research demonstrates the usefulness of artificial intelligence technology in the blood cell image classification field. WBC-ResNet-152 based morphology approach is shown to be meaningful and worthwhile method. And based on stored medical data, in-depth diagnosis and early detection of curable diseases is expected to improve the quality of treatment.

Low-Quality Banknote Serial Number Recognition Based on Deep Neural Network

  • Jang, Unsoo;Suh, Kun Ha;Lee, Eui Chul
    • Journal of Information Processing Systems
    • /
    • v.16 no.1
    • /
    • pp.224-237
    • /
    • 2020
  • Recognition of banknote serial number is one of the important functions for intelligent banknote counter implementation and can be used for various purposes. However, the previous character recognition method is limited to use due to the font type of the banknote serial number, the variation problem by the solid status, and the recognition speed issue. In this paper, we propose an aspect ratio based character region segmentation and a convolutional neural network (CNN) based banknote serial number recognition method. In order to detect the character region, the character area is determined based on the aspect ratio of each character in the serial number candidate area after the banknote area detection and de-skewing process is performed. Then, we designed and compared four types of CNN models and determined the best model for serial number recognition. Experimental results showed that the recognition accuracy of each character was 99.85%. In addition, it was confirmed that the recognition performance is improved as a result of performing data augmentation. The banknote used in the experiment is Indian rupee, which is badly soiled and the font of characters is unusual, therefore it can be regarded to have good performance. Recognition speed was also enough to run in real time on a device that counts 800 banknotes per minute.

Mask Region-Based Convolutional Neural Network (R-CNN) Based Image Segmentation of Rays in Softwoods

  • Hye-Ji, YOO;Ohkyung, KWON;Jeong-Wook, SEO
    • Journal of the Korean Wood Science and Technology
    • /
    • v.50 no.6
    • /
    • pp.490-498
    • /
    • 2022
  • The current study aimed to verify the image segmentation ability of rays in tangential thin sections of conifers using artificial intelligence technology. The applied model was Mask region-based convolutional neural network (Mask R-CNN) and softwoods (viz. Picea jezoensis, Larix gmelinii, Abies nephrolepis, Abies koreana, Ginkgo biloba, Taxus cuspidata, Cryptomeria japonica, Cedrus deodara, Pinus koraiensis) were selected for the study. To take digital pictures, thin sections of thickness 10-15 ㎛ were cut using a microtome, and then stained using a 1:1 mixture of 0.5% astra blue and 1% safranin. In the digital images, rays were selected as detection objects, and Computer Vision Annotation Tool was used to annotate the rays in the training images taken from the tangential sections of the woods. The performance of the Mask R-CNN applied to select rays was as high as 0.837 mean average precision and saving the time more than half of that required for Ground Truth. During the image analysis process, however, division of the rays into two or more rays occurred. This caused some errors in the measurement of the ray height. To improve the image processing algorithms, further work on combining the fragments of a ray into one ray segment, and increasing the precision of the boundary between rays and the neighboring tissues is required.

A Study on NaverZ's Metaverse Platform Scaling Strategy

  • Song, Minzheong
    • International journal of advanced smart convergence
    • /
    • v.11 no.3
    • /
    • pp.132-141
    • /
    • 2022
  • We look at the rocket life stages of NaverZ's metaverse platform scaling and investigate the ignition and scale-up stage of its metaverse platform brand, Zepeto based on the Rocket Model (RM). The results are derived as follows: Firstly, NaverZ shows the event strategy by collaborating with K-pops, the piggybacking strategy by utilizing other SNSs, and the VIP strategy by investing in game and entertainment content genres in the 'attract' function. In the second 'match' function, based on the matching rule of Zepeto, the users can generate their own characters and "World" with Zepeto Studio. However, for strengthening the matching quality, NaverZ is investing in the artificial intelligence (AI) based companies consistently. In the 'connect' function, NaverZ's maximization of the positive interaction is possible by inducing feed activities in Zepeto & other SNSs and by uploading attractive content for viral effects in the ignition. For facilitating this, NaverZ expands the scale to other continents like Southeast Asia and Middle East with the localization strategy inclusive investment. Lastly, in the 'transact' function, based on three monetization experiments like Coin & ZEM, user generated content (UGC) fee, and advertising revenue in the ignition, NaverZ starts to invest in NFT platforms and abroad blockchain companies.

An Approach of Cognitive Health Advisor Model for Untact Technology Environment (언택트 기술 환경에서의 지능형 헬스 어드바이저 모델 접근 방안)

  • Hwang, Tae-Ho;Lee, Kang-Yoon
    • The Journal of Bigdata
    • /
    • v.5 no.1
    • /
    • pp.139-145
    • /
    • 2020
  • In the era of the 4th Industrial Revolution, the use of information based on AI APIs has a great influence on industry and life. In particular, the use of artificial intelligence data in the medical field will have many changes and effects on society. This paper is to study the necessary components to implement the "Cognitive Health Advisor model (CHA model)" and to implement the "CHA model using chatbot" based on this. It uses the open Cognitive chatbot to analyze and analyze the health status of users changing in their daily lives. The user's health information analyzed by the biometric sensor and chatbot consultation delivers the information to the user through the chatbot. And it implements a cognitive health advisor model that provides educational information for users' health promotion. Through this implementation, it intends to confirm the possibility of future use and to suggest research directions.

Factors affecting success and failure of Internet company business model using inductive learning based on ID3 algorithm (ID3 알고리즘 기반의 귀납적 추론을 활용한 인터넷 기업 비즈니스 모델의 성공과 실패에 영향을 미치는 요인에 관한 연구)

  • Jin, Dong-su
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.23 no.2
    • /
    • pp.111-116
    • /
    • 2019
  • New technologies such as the IoT, Big Data, and Artificial Intelligence, starting from the Web, mobile, and smart device, enable new business models that did not exist before, and various types of Internet companies based on these business models has been emerged. In this research, we examine the factors that influence the success and failure of Internet companies. To do this, we review the recent studies on business model and examine the variables affecting the success of Internet companies in terms of network effect, user interface, cooperation with actors, creating value for users. Using the five derived variables, we will select 14 Internet companies that succeeded and failed in seven commercial business model categories. We derive decision tree by applying inductive learning based on ID3 algorithm to the analysis result and derive rules that affect success and failure based on derived decision tree. With these rules, we want to present the strategic implications for actors to succeed in Internet companies.

Performance Evaluation of Efficient Vision Transformers on Embedded Edge Platforms (임베디드 엣지 플랫폼에서의 경량 비전 트랜스포머 성능 평가)

  • Minha Lee;Seongjae Lee;Taehyoun Kim
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.18 no.3
    • /
    • pp.89-100
    • /
    • 2023
  • Recently, on-device artificial intelligence (AI) solutions using mobile devices and embedded edge devices have emerged in various fields, such as computer vision, to address network traffic burdens, low-energy operations, and security problems. Although vision transformer deep learning models have outperformed conventional convolutional neural network (CNN) models in computer vision, they require more computations and parameters than CNN models. Thus, they are not directly applicable to embedded edge devices with limited hardware resources. Many researchers have proposed various model compression methods or lightweight architectures for vision transformers; however, there are only a few studies evaluating the effects of model compression techniques of vision transformers on performance. Regarding this problem, this paper presents a performance evaluation of vision transformers on embedded platforms. We investigated the behaviors of three vision transformers: DeiT, LeViT, and MobileViT. Each model performance was evaluated by accuracy and inference time on edge devices using the ImageNet dataset. We assessed the effects of the quantization method applied to the models on latency enhancement and accuracy degradation by profiling the proportion of response time occupied by major operations. In addition, we evaluated the performance of each model on GPU and EdgeTPU-based edge devices. In our experimental results, LeViT showed the best performance in CPU-based edge devices, and DeiT-small showed the highest performance improvement in GPU-based edge devices. In addition, only MobileViT models showed performance improvement on EdgeTPU. Summarizing the analysis results through profiling, the degree of performance improvement of each vision transformer model was highly dependent on the proportion of parts that could be optimized in the target edge device. In summary, to apply vision transformers to on-device AI solutions, either proper operation composition and optimizations specific to target edge devices must be considered.