• Title/Summary/Keyword: Improved Convolutional Neural Network

Search Result 171, Processing Time 0.033 seconds

Non-intrusive Calibration for User Interaction based Gaze Estimation (사용자 상호작용 기반의 시선 검출을 위한 비강압식 캘리브레이션)

  • Lee, Tae-Gyun;Yoo, Jang-Hee
    • Journal of Software Assessment and Valuation
    • /
    • v.16 no.1
    • /
    • pp.45-53
    • /
    • 2020
  • In this paper, we describe a new method for acquiring calibration data using a user interaction process, which occurs continuously during web browsing in gaze estimation, and for performing calibration naturally while estimating the user's gaze. The proposed non-intrusive calibration is a tuning process over the pre-trained gaze estimation model to adapt to a new user using the obtained data. To achieve this, a generalized CNN model for estimating gaze is trained, then the non-intrusive calibration is employed to adapt quickly to new users through online learning. In experiments, the gaze estimation model is calibrated with a combination of various user interactions to compare the performance, and improved accuracy is achieved compared to existing methods.

FGW-FER: Lightweight Facial Expression Recognition with Attention

  • Huy-Hoang Dinh;Hong-Quan Do;Trung-Tung Doan;Cuong Le;Ngo Xuan Bach;Tu Minh Phuong;Viet-Vu Vu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.9
    • /
    • pp.2505-2528
    • /
    • 2023
  • The field of facial expression recognition (FER) has been actively researched to improve human-computer interaction. In recent years, deep learning techniques have gained popularity for addressing FER, with numerous studies proposing end-to-end frameworks that stack or widen significant convolutional neural network layers. While this has led to improved performance, it has also resulted in larger model sizes and longer inference times. To overcome this challenge, our work introduces a novel lightweight model architecture. The architecture incorporates three key factors: Depth-wise Separable Convolution, Residual Block, and Attention Modules. By doing so, we aim to strike a balance between model size, inference speed, and accuracy in FER tasks. Through extensive experimentation on popular benchmark FER datasets, our proposed method has demonstrated promising results. Notably, it stands out due to its substantial reduction in parameter count and faster inference time, while maintaining accuracy levels comparable to other lightweight models discussed in the existing literature.

Enhancing Wind Speed and Wind Power Forecasting Using Shape-Wise Feature Engineering: A Novel Approach for Improved Accuracy and Robustness

  • Mulomba Mukendi Christian;Yun Seon Kim;Hyebong Choi;Jaeyoung Lee;SongHee You
    • International Journal of Advanced Culture Technology
    • /
    • v.11 no.4
    • /
    • pp.393-405
    • /
    • 2023
  • Accurate prediction of wind speed and power is vital for enhancing the efficiency of wind energy systems. Numerous solutions have been implemented to date, demonstrating their potential to improve forecasting. Among these, deep learning is perceived as a revolutionary approach in the field. However, despite their effectiveness, the noise present in the collected data remains a significant challenge. This noise has the potential to diminish the performance of these algorithms, leading to inaccurate predictions. In response to this, this study explores a novel feature engineering approach. This approach involves altering the data input shape in both Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) and Autoregressive models for various forecasting horizons. The results reveal substantial enhancements in model resilience against noise resulting from step increases in data. The approach could achieve an impressive 83% accuracy in predicting unseen data up to the 24th steps. Furthermore, this method consistently provides high accuracy for short, mid, and long-term forecasts, outperforming the performance of individual models. These findings pave the way for further research on noise reduction strategies at different forecasting horizons through shape-wise feature engineering.

Multimodal audiovisual speech recognition architecture using a three-feature multi-fusion method for noise-robust systems

  • Sanghun Jeon;Jieun Lee;Dohyeon Yeo;Yong-Ju Lee;SeungJun Kim
    • ETRI Journal
    • /
    • v.46 no.1
    • /
    • pp.22-34
    • /
    • 2024
  • Exposure to varied noisy environments impairs the recognition performance of artificial intelligence-based speech recognition technologies. Degraded-performance services can be utilized as limited systems that assure good performance in certain environments, but impair the general quality of speech recognition services. This study introduces an audiovisual speech recognition (AVSR) model robust to various noise settings, mimicking human dialogue recognition elements. The model converts word embeddings and log-Mel spectrograms into feature vectors for audio recognition. A dense spatial-temporal convolutional neural network model extracts features from log-Mel spectrograms, transformed for visual-based recognition. This approach exhibits improved aural and visual recognition capabilities. We assess the signal-to-noise ratio in nine synthesized noise environments, with the proposed model exhibiting lower average error rates. The error rate for the AVSR model using a three-feature multi-fusion method is 1.711%, compared to the general 3.939% rate. This model is applicable in noise-affected environments owing to its enhanced stability and recognition rate.

Classification of Gravitational Waves from Black Hole-Neutron Star Mergers with Machine Learning

  • Nurzhan Ussipov;Zeinulla Zhanabaev;Almat, Akhmetali;Marat Zaidyn;Dana Turlykozhayeva;Aigerim Akniyazova;Timur Namazbayev
    • Journal of Astronomy and Space Sciences
    • /
    • v.41 no.3
    • /
    • pp.149-158
    • /
    • 2024
  • This study developed a machine learning-based methodology to classify gravitational wave (GW) signals from black hol-eneutron star (BH-NS) mergers by combining convolutional neural network (CNN) with conditional information for feature extraction. The model was trained and validated on a dataset of simulated GW signals injected to Gaussian noise to mimic real world signals. We considered all three types of merger: binary black hole (BBH), binary neutron star (BNS) and neutron starblack hole (NSBH). We achieved up to 96% correct classification of GW signals sources. Incorporating our novel conditional information approach improved classification accuracy by 10% compared to standard time series training. Additionally, to show the effectiveness of our method, we tested the model with real GW data from the Gravitational Wave Transient Catalog (GWTC-3) and successfully classified ~90% of signals. These results are an important step towards low-latency real-time GW detection.

Development of a modified model for predicting cabbage yield based on soil properties using GIS (GIS를 이용한 토양정보 기반의 배추 생산량 예측 수정모델 개발)

  • Choi, Yeon Oh;Lee, Jaehyeon;Sim, Jae Hoo;Lee, Seung Woo
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.40 no.5
    • /
    • pp.449-456
    • /
    • 2022
  • This study proposes a deep learning algorithm to predict crop yield using GIS (Geographic Information System) to extract soil properties from Soilgrids and soil suitability class maps. The proposed model modified the structure of a published CNN-RNN (Convolutional Neural Network-Recurrent Neural Network) based crop yield prediction model suitable for the domestic crop environment. The existing model has two characteristics. The first is that it replaces the original yield with the average yield of the year, and the second is that it trains the data of the predicted year. The new model uses the original field value to ensure accuracy, and the network structure has been improved so that it can train only with data prior to the year to be predicted. The proposed model predicted the yield per unit area of autumn cabbage for kimchi by region based on weather, soil, soil suitability classes, and yield data from 1980 to 2020. As a result of computing and predicting data for each of the four years from 2018 to 2021, the error amount for the test data set was about 10%, enabling accurate yield prediction, especially in regions with a large proportion of total yield. In addition, both the proposed model and the existing model show that the error gradually decreases as the number of years of training data increases, resulting in improved general-purpose performance as the number of training data increases.

Improved CNN Algorithm for Object Detection in Large Images

  • Yang, Seong Bong;Lee, Soo Jin
    • Journal of the Korea Society of Computer and Information
    • /
    • v.25 no.1
    • /
    • pp.45-53
    • /
    • 2020
  • Conventional Convolutional Neural Network(CNN) algorithms have limitations in detecting small objects in large image. In this paper, we propose an improved model which is based on Region Of Interest(ROI) selection and image dividing technique. We prepared YOLOv3 / Faster R-CNN algorithms which are transfer-learned by airfield and aircraft datasets. Also we prepared large images for testing. In order to verify our model, we selected airfield area from large image as ROI first and divided it in two power n orders. Then we compared the aircraft detection rates by number of divisions. We could get the best size of divided image pieces for efficient small object detection derived from the comparison of aircraft detection rates. As a result, we could verify that the improved CNN algorithm can detect small object in large images.

ACA: Automatic search strategy for radioactive source

  • Jianwen Huo;Xulin Hu;Junling Wang;Li Hu
    • Nuclear Engineering and Technology
    • /
    • v.55 no.8
    • /
    • pp.3030-3038
    • /
    • 2023
  • Nowadays, mobile robots have been used to search for uncontrolled radioactive source in indoor environments to avoid radiation exposure for technicians. However, in the indoor environments, especially in the presence of obstacles, how to make the robots with limited sensing capabilities automatically search for the radioactive source remains a major challenge. Also, the source search efficiency of robots needs to be further improved to meet practical scenarios such as limited exploration time. This paper proposes an automatic source search strategy, abbreviated as ACA: the location of source is estimated by a convolutional neural network (CNN), and the path is planned by the A-star algorithm. First, the search area is represented as an occupancy grid map. Then, the radiation dose distribution of the radioactive source in the occupancy grid map is obtained by Monte Carlo (MC) method simulation, and multiple sets of radiation data are collected through the eight neighborhood self-avoiding random walk (ENSAW) algorithm as the radiation data set. Further, the radiation data set is fed into the designed CNN architecture to train the network model in advance. When the searcher enters the search area where the radioactive source exists, the location of source is estimated by the network model and the search path is planned by the A-star algorithm, and this process is iterated continuously until the searcher reaches the location of radioactive source. The experimental results show that the average number of radiometric measurements and the average number of moving steps of the ACA algorithm are only 2.1% and 33.2% of those of the gradient search (GS) algorithm in the indoor environment without obstacles. In the indoor environment shielded by concrete walls, the GS algorithm fails to search for the source, while the ACA algorithm successfully searches for the source with fewer moving steps and sparse radiometric data.

Vector-Based Data Augmentation and Network Learning for Efficient Crack Data Collection (효율적인 균열 데이터 수집을 위한 벡터 기반 데이터 증강과 네트워크 학습)

  • Kim, Jong-Hyun
    • Journal of the Korea Computer Graphics Society
    • /
    • v.28 no.2
    • /
    • pp.1-9
    • /
    • 2022
  • In this paper, we propose a vector-based augmentation technique that can generate data required for crack detection and a ConvNet(Convolutional Neural Network) technique that can learn it. Detecting cracks quickly and accurately is an important technology to prevent building collapse and fall accidents in advance. In order to solve this problem with artificial intelligence, it is essential to obtain a large amount of data, but it is difficult to obtain a large amount of crack data because the situation for obtaining an actual crack image is mostly dangerous. This problem of database construction can be alleviated with elastic distortion, which increases the amount of data by applying deformation to a specific artificial part. In this paper, the improved crack pattern results are modeled using ConvNet. Rather than elastic distortion, our method can obtain results similar to the actual crack pattern. By designing the crack data augmentation based on a vector, rather than the pixel unit used in general data augmentation, excellent results can be obtained in terms of the amount of crack change. As a result, in this paper, even though a small number of crack data were used as input, a crack database can be efficiently constructed by generating various crack directions and patterns.

Development of Methodology for Measuring Water Level in Agricultural Water Reservoir through Deep Learning anlaysis of CCTV Images (딥러닝 기법을 이용한 농업용저수지 CCTV 영상 기반의 수위계측 방법 개발)

  • Joo, Donghyuk;Lee, Sang-Hyun;Choi, Gyu-Hoon;Yoo, Seung-Hwan;Na, Ra;Kim, Hayoung;Oh, Chang-Jo;Yoon, Kwang-Sik
    • Journal of The Korean Society of Agricultural Engineers
    • /
    • v.65 no.1
    • /
    • pp.15-26
    • /
    • 2023
  • This study aimed to evaluate the performance of water level classification from CCTV images in agricultural facilities such as reservoirs. Recently, the CCTV system, widely used for facility monitor or disaster detection, can automatically detect and identify people and objects from the images by developing new technologies such as a deep learning system. Accordingly, we applied the ResNet-50 deep learning system based on Convolutional Neural Network and analyzed the water level of the agricultural reservoir from CCTV images obtained from TOMS (Total Operation Management System) of the Korea Rural Community Corporation. As a result, the accuracy of water level detection was improved by excluding night and rainfall CCTV images and applying measures. For example, the error rate significantly decreased from 24.39 % to 1.43 % in the Bakseok reservoir. We believe that the utilization of CCTVs should be further improved when calculating the amount of water supply and establishing a supply plan according to the integrated water management policy.