• Title/Summary/Keyword: Deep Features


Effects of CNN Backbone on Trajectory Prediction Models for Autonomous Vehicle

  • Seoyoung Lee;Hyogyeong Park;Yeonhwi You;Sungjung Yong;Il-Young Moon
    • Journal of information and communication convergence engineering / v.21 no.4 / pp.346-350 / 2023
  • Trajectory prediction is an essential element of driving autonomous vehicles, and various trajectory prediction models have emerged with the development of deep learning technology. The convolutional neural network (CNN) is the most commonly used neural network architecture for extracting features from visual images, and the latest models exhibit high performance. This study was conducted to identify an efficient CNN backbone model among the components of deep learning models for trajectory prediction. We replaced the existing CNN backbone network used as a feature extractor in multiple trajectory prediction models with various state-of-the-art CNN models. The experiment was conducted using nuScenes, a dataset used for the development of autonomous vehicles. The results of each model were compared using evaluation metrics frequently used for trajectory prediction. Analyzing the impact of the backbone can improve the performance of the trajectory prediction task, and investigating its influence on multiple deep learning models remains a future challenge.
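
To make the backbone-swapping idea concrete, here is a minimal PyTorch sketch of a trajectory model whose CNN feature extractor can be exchanged via a constructor argument; the model names, head dimensions, and prediction horizon are illustrative assumptions, not the paper's architecture.

```python
# Minimal sketch: a trajectory model with a swappable CNN backbone.
# Backbone names, head size, and horizon are illustrative assumptions.
import torch
import torch.nn as nn
import torchvision.models as models

def build_backbone(name: str) -> nn.Module:
    """Return a CNN feature extractor with its classification head removed."""
    if name == "resnet50":
        net = models.resnet50(weights=None)
        return nn.Sequential(*list(net.children())[:-1])  # drop the FC layer
    if name == "efficientnet_b0":
        net = models.efficientnet_b0(weights=None)
        return net.features
    raise ValueError(f"unknown backbone: {name}")

class TrajectoryModel(nn.Module):
    def __init__(self, backbone: str, horizon: int = 12):
        super().__init__()
        self.backbone = build_backbone(backbone)
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Hypothetical head: (x, y) offsets for `horizon` future steps.
        self.head = nn.LazyLinear(horizon * 2)

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        feat = self.pool(self.backbone(img)).flatten(1)
        return self.head(feat).view(img.size(0), -1, 2)

# Comparing backbones only requires changing the constructor argument.
model = TrajectoryModel("resnet50")
out = model(torch.randn(2, 3, 224, 224))  # -> (2, 12, 2)
```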

How Long Will Your Videos Remain Popular? Empirical Study with Deep Learning and Survival Analysis

  • Min Gyeong Choi;Jae Hong Park
    • Asia pacific journal of information systems / v.33 no.2 / pp.282-297 / 2023
  • One of the emerging trends in the marketing field is digital video marketing. Online videos offer rich content, typically containing more information than any other type of content (e.g., audible or textual content). Accordingly, previous researchers have examined factors influencing videos' popularity. However, few studies have examined what causes a video to remain popular. Some videos achieve continuous, ongoing popularity, while others fade out quickly. For practitioners, videos in recommendation slots may serve as strong communication channels, as many potential consumers are exposed to such videos. Thus, this study provides practitioners with important advice on how to choose videos that will survive as long-lasting favorites, allowing them to advertise in a cost-effective manner. Using deep learning techniques, this study extracts text from videos and measures the videos' tones, including factual and emotional tones. Additionally, we measure the aesthetic score by analyzing the thumbnail images in the data. We then empirically show that the cognitive features of a video, such as the tone of a message and the aesthetic assessment of a thumbnail image, play an important role in determining a video's long-term popularity. We believe that this is the first study of its kind to examine new factors that help a video remain popular using both deep learning and econometric methodologies.
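
As a hedged illustration of the survival-analysis step, the following sketch fits a Cox proportional-hazards model over hypothetical per-video features; the column names, toy data, and the use of the lifelines library are assumptions, not the paper's exact setup.

```python
# Minimal sketch of a survival analysis over video features.
# All data and column names are hypothetical placeholders.
import pandas as pd
from lifelines import CoxPHFitter

df = pd.DataFrame({
    "weeks_on_chart": [3, 10, 1, 7, 5, 12],   # observed duration of popularity
    "dropped_out":    [1, 0, 1, 1, 1, 0],     # 1 = left the popular list (event)
    "factual_tone":   [0.2, 0.7, 0.1, 0.5, 0.6, 0.3],
    "emotional_tone": [0.8, 0.4, 0.3, 0.6, 0.7, 0.5],
    "aesthetic_score": [5.1, 7.9, 4.2, 6.3, 8.0, 5.8],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="weeks_on_chart", event_col="dropped_out")
cph.print_summary()  # hazard ratios show which features prolong popularity
```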

Deep Learning-Based Speech Emotion Recognition Technology Using Voice Feature Filters

  • Shin Hyun Sam;Jun-Ki Hong
    • The Journal of Bigdata / v.8 no.2 / pp.223-231 / 2023
  • In this study, we propose a deep learning-based model that extracts and analyzes features from speech signals, generates filters, and uses these filters to recognize emotions in speech signals. We evaluate the emotion recognition accuracy of the proposed model. According to the simulation results, the average emotion recognition accuracies of the DNN and RNN were very similar, at 84.59% and 84.52%, respectively. However, we observed that the simulation time for the DNN was approximately 44.5% shorter than that of the RNN, enabling quicker emotion prediction.
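
A minimal sketch of a DNN speech-emotion classifier over MFCC features is shown below; it omits the paper's voice-feature-filter step, and the file name, feature size, and six-class label set are illustrative assumptions.

```python
# Minimal sketch: DNN emotion classifier over time-averaged MFCC features.
# The feature-filter step from the paper is not reproduced here.
import librosa
import torch
import torch.nn as nn

def mfcc_features(path: str, n_mfcc: int = 40) -> torch.Tensor:
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return torch.from_numpy(mfcc.mean(axis=1)).float()  # time-averaged

dnn = nn.Sequential(
    nn.Linear(40, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 6),  # hypothetical 6 emotion classes
)
logits = dnn(mfcc_features("sample.wav").unsqueeze(0))  # "sample.wav" is a placeholder
emotion = logits.argmax(dim=1)
```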

Estimation of tomato maturity as a continuous index using deep neural networks

  • Taehyeong Kim;Dae-Hyun Lee;Seung-Woo Kang;Soo-Hyun Cho;Kyoung-Chul Kim
    • Korean Journal of Agricultural Science / v.49 no.4 / pp.837-845 / 2022
  • In this study, tomato maturity was estimated based on deep learning for a harvesting robot. Tomato images were obtained using an RGB camera installed on a previously developed monitoring robot, and the samples were cropped to 128 × 128 pixel images to generate a dataset for training the classification model. The classification model was constructed based on convolutional neural networks, and the mean-variance loss was used to implicitly learn the per-class distribution of the data features. In the test stage, the tomato maturity was estimated as a continuous index ranging from 0 to 1 by calculating the expected class value. The results show that the F1-score of the classification was approximately 0.94, a performance similar to that of other deep learning-based classification tasks in the agricultural field. In addition, it was possible to estimate the distribution within each maturity stage. From these results, it was found that our approach can not only classify the discrete maturation stages of the tomatoes but also estimate their continuous maturity.
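
The expected-class-value idea can be sketched directly: given softmax probabilities over ordered maturity stages, the continuous index is the probability-weighted mean of the stage indices. The snippet below also includes a generic mean-variance loss; the class count and loss weights are assumptions, not the paper's settings.

```python
# Minimal sketch: continuous maturity as the expected class value, plus a
# generic mean-variance loss. Class count and weights are illustrative.
import torch
import torch.nn.functional as F

def maturity_index(logits: torch.Tensor) -> torch.Tensor:
    """Expected class value, normalized to [0, 1]."""
    probs = F.softmax(logits, dim=1)                     # (B, K)
    k = torch.arange(probs.size(1), dtype=probs.dtype)   # class indices 0..K-1
    expected = (probs * k).sum(dim=1)                    # E[class]
    return expected / (probs.size(1) - 1)

def mean_variance_loss(logits, target, lm=0.2, lv=0.05):
    probs = F.softmax(logits, dim=1)
    k = torch.arange(probs.size(1), dtype=probs.dtype)
    mean = (probs * k).sum(dim=1)
    var = (probs * (k - mean.unsqueeze(1)) ** 2).sum(dim=1)
    ce = F.cross_entropy(logits, target)
    # Penalize distance of the expected class from the true class, and spread.
    return ce + lm * ((mean - target.float()) ** 2).mean() + lv * var.mean()

logits = torch.randn(4, 5)     # 4 samples, 5 hypothetical maturity stages
print(maturity_index(logits))  # continuous index in [0, 1]
```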

Image Captioning with Synergy-Gated Attention and Recurrent Fusion LSTM

  • Yang, You;Chen, Lizhi;Pan, Longyue;Hu, Juntao
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.10 / pp.3390-3405 / 2022
  • Long Short-Term Memory (LSTM) combined with an attention mechanism is extensively used to generate semantic sentences of images in image captioning models. However, the features of salient regions and spatial information are not utilized sufficiently in most related works. Meanwhile, the LSTM also suffers from underutilized information within a single time step. In this paper, two innovative approaches are proposed to solve these problems. First, the Synergy-Gated Attention (SGA) method is proposed, which can process the spatial features and the salient-region features of given images simultaneously. SGA establishes a gated mechanism through the global features to guide the interaction of information between these two feature types. Then, the Recurrent Fusion LSTM (RF-LSTM) mechanism is proposed, which can predict the next hidden vectors in one time step and improve linguistic coherence by fusing future information. Experimental results on the MSCOCO benchmark dataset show that, compared with state-of-the-art methods, the proposed method improves the performance of the image captioning model and achieves competitive results on multiple evaluation metrics.
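
A minimal sketch of the gating idea, where global features control the mix of spatial and salient-region features, might look as follows; the dimensions and the exact gating form are assumptions rather than the paper's SGA equations.

```python
# Minimal sketch: global features gate the blend of two feature streams.
# Dimensions and the sigmoid-gate form are simplifying assumptions.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, dim: int = 512):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, global_feat, spatial_feat, region_feat):
        g = self.gate(global_feat)               # (B, D), values in (0, 1)
        return g * spatial_feat + (1 - g) * region_feat

fuse = GatedFusion()
b, d = 2, 512
out = fuse(torch.randn(b, d), torch.randn(b, d), torch.randn(b, d))
```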

Merging Features and Optical-NIR Color Gradient of Early-type Galaxies

  • Kim, Du-Ho;Im, Myeong-Sin
    • The Bulletin of The Korean Astronomical Society / v.37 no.1 / pp.41.2-41.2 / 2012
  • It has been suggested that merging plays an important role in the formation and evolution of early-type galaxies (ETGs). Optical-NIR color gradients of ETGs in high-density environments are found to be less steep than those of ETGs in low-density environments, hinting at frequent merger activity among ETGs in high-density environments. To examine whether the flat color gradients are the result of dry mergers, we studied the relations between merging features, luminosities, environments, and color gradients of 196 low-redshift ETGs selected from Sloan Digital Sky Survey (SDSS) Stripe82. Near-infrared (NIR) images are taken from the UKIRT Infrared Deep Sky Survey (UKIDSS) Large Area Survey (LAS). Color (r-K) gradients of ETGs with tidal features are slightly flatter than those of relaxed ETGs, but the difference is not significant. We found that massive (> 10^11.3 solar masses) ETGs show ~40% less scatter in their color gradients than less massive ETGs. The reduced scatter in the color gradients of massive ETGs could be evidence of dry merger processes in their evolution. We found no relation between the color gradients of ETGs and their environments.
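
For readers unfamiliar with the measurement, a color gradient can be sketched as the slope of the (r - K) color against log radius; the aperture values below are made up for illustration, and the underlying photometry is assumed already done.

```python
# Minimal sketch: color gradient as the slope of (r - K) vs. log radius.
# All magnitudes and radii are invented for illustration.
import numpy as np

radius_kpc = np.array([0.5, 1.0, 2.0, 4.0, 8.0])   # hypothetical apertures
r_mag = np.array([18.9, 18.4, 18.0, 17.7, 17.5])   # r-band magnitudes
k_mag = np.array([15.6, 15.2, 14.9, 14.7, 14.6])   # K-band magnitudes

color = r_mag - k_mag
slope, intercept = np.polyfit(np.log10(radius_kpc), color, deg=1)
print(f"d(r-K)/dlog(r) = {slope:.2f} mag/dex")  # negative = redder center
```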

A Gradient-Based Explanation Method for Node Classification Using Graph Convolutional Networks

  • Chaehyeon Kim;Hyewon Ryu;Ki Yong Lee
    • Journal of Information Processing Systems / v.19 no.6 / pp.803-816 / 2023
  • Explainable artificial intelligence is a method that explains how a complex model (e.g., a deep neural network) yields its output from a given input. Recently, graph-type data have been widely used in various fields, and diverse graph neural networks (GNNs) have been developed for such data. However, methods to explain the behavior of GNNs have not been studied much, and only a limited understanding of GNNs is currently available. Therefore, in this paper, we propose an explanation method for node classification using graph convolutional networks (GCNs), a representative type of GNN. The proposed method identifies which features of each node have the greatest influence on that node's classification by the GCN. It identifies these influential features by backtracking the layers of the GCN from the output layer to the input layer using the gradients. The experimental results on both synthetic and real datasets demonstrate that the proposed explanation method accurately identifies the features of each node that have the greatest influence on its classification.
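
A simplified version of gradient-based node explanation can be sketched with plain PyTorch autograd, computing the gradient of a node's class score with respect to the input features; the toy graph and two-layer GCN below are assumptions, and the paper's layer-by-layer backtracking procedure is not reproduced.

```python
# Minimal sketch: feature saliency for one node of a toy two-layer GCN,
# via the gradient of its class score w.r.t. the input features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCNLayer(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out)

    def forward(self, a_norm, x):
        return self.lin(a_norm @ x)   # aggregate neighbors, then transform

n, d, c = 5, 8, 3                     # nodes, features, classes (toy sizes)
a = ((torch.rand(n, n) > 0.5).float() + torch.eye(n)).clamp(max=1)
a_norm = a / a.sum(dim=1, keepdim=True)   # row-normalized adjacency

x = torch.rand(n, d, requires_grad=True)
layer1, layer2 = GCNLayer(d, 16), GCNLayer(16, c)
logits = layer2(a_norm, F.relu(layer1(a_norm, x)))

node = 0
logits[node, logits[node].argmax()].backward()
saliency = x.grad[node].abs()             # per-feature influence on node 0
print(saliency.argsort(descending=True))  # most influential features first
```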

Neural-network based Computerized Emotion Analysis using Multiple Biological Signals

  • Lee, Jee-Eun;Kim, Byeong-Nam;Yoo, Sun-Kook
    • Science of Emotion and Sensibility / v.20 no.2 / pp.161-170 / 2017
  • Emotion affects many parts of human life, such as learning ability, behavior, and judgment, so understanding it is important for understanding human nature. Emotion cannot be observed directly; it can only be inferred from facial expressions or gestures. In particular, emotion is difficult to classify, not only because individuals experience emotion differently but also because visually induced emotion is not sustained over the whole testing period. To address this problem, we acquired bio-signals and extracted features from them, which offer objective information about the emotion stimulus. The emotion pattern classifier was composed of an unsupervised learning algorithm with hidden nodes and feature vectors. A restricted Boltzmann machine (RBM) based on probability estimation was used for the unsupervised learning, mapping the emotion features into a transformed feature space. Emotions were then characterized by a non-linear classifier built from the hidden nodes of a multi-layer neural network, namely a deep belief network (DBN). The accuracy of the DBN (about 94%) was better than that of a back-propagation neural network (about 40%). The DBN showed good performance as an emotion pattern classifier.
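
The RBM-then-classifier pipeline can be sketched with scikit-learn's BernoulliRBM standing in for a full DBN; the bio-signal features below are random placeholders and the four-class label set is an assumption.

```python
# Minimal sketch: unsupervised RBM feature mapping followed by a supervised
# classifier, as a stand-in for a DBN. Data are random placeholders.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
X = rng.random((200, 32))          # 200 samples of 32 bio-signal features
y = rng.integers(0, 4, size=200)   # 4 hypothetical emotion classes

model = Pipeline([
    ("rbm", BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X, y)          # RBM transforms the features, classifier labels them
print(model.score(X, y))
```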

BSR (Buzz, Squeak, Rattle) noise classification based on convolutional neural network with short-time Fourier transform noise-map

  • Bu, Seok-Jun;Moon, Se-Min;Cho, Sung-Bae
    • The Journal of the Acoustical Society of Korea / v.37 no.4 / pp.256-261 / 2018
  • Three types of noise are generated inside a vehicle: buzz, squeak, and rattle (BSR). In this paper, we propose a classifier that automatically classifies automotive BSR noise using features extracted from deep convolutional neural networks. In the preprocessing step, the features of the three noise types are represented as a noise-map using the STFT (Short-time Fourier Transform) algorithm. Because the position of the actual noise within the generated noise-map is unknown, the noise-map is divided using a sliding-window method. In this paper, the internal parameters of the deep convolutional neural network are visualized using the t-SNE (t-Stochastic Neighbor Embedding) algorithm, and the misclassified data are analyzed qualitatively. To analyze the classified data, the similarity between noise types was quantified by the SSIM (Structural Similarity Index) value, and it was found that the retractor tremble sound is most similar to the normal travel sound. Compared with other machine learning classifiers, the proposed method recorded the highest classification accuracy (99.15%).
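
The preprocessing described above can be sketched as follows: compute an STFT noise-map and slice it with a sliding window so the CNN sees every region; the signal, window width, and hop size are illustrative assumptions.

```python
# Minimal sketch: build an STFT "noise-map" and slice it with a sliding
# window. Signal and window parameters are illustrative.
import numpy as np
from scipy.signal import stft

fs = 16000
x = np.random.randn(fs * 2)                  # placeholder 2 s noise recording
_, _, zxx = stft(x, fs=fs, nperseg=512)      # complex spectrogram
noise_map = np.log1p(np.abs(zxx))            # log-magnitude noise-map

def sliding_windows(m: np.ndarray, width: int = 32, hop: int = 16):
    """Yield fixed-width time slices so every region is covered."""
    for start in range(0, m.shape[1] - width + 1, hop):
        yield m[:, start:start + width]

patches = list(sliding_windows(noise_map))
print(len(patches), patches[0].shape)        # each patch feeds the CNN
```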

Animal Face Classification using Dual Deep Convolutional Neural Network

  • Khan, Rafiul Hasan;Kang, Kyung-Won;Lim, Seon-Ja;Youn, Sung-Dae;Kwon, Oh-Jun;Lee, Suk-Hwan;Kwon, Ki-Ryong
    • Journal of Korea Multimedia Society / v.23 no.4 / pp.525-538 / 2020
  • A practical animal face classification system that classifies animals in image and video data is considered a pivotal topic in machine learning. In this research, we propose a novel, fully connected dual Deep Convolutional Neural Network (DCNN), which extracts and analyzes image features on a large scale. With the inclusion of state-of-the-art Batch Normalization and Exponential Linear Unit (ELU) layers, our proposed DCNN can analyze large datasets and extract more features than before. For this research, we built a dataset containing ten thousand animal faces across ten animal classes, along with a dual DCNN. The significance of our network is that it has four sets of convolutional functions that work laterally with each other. We used a relatively small batch size and a large number of iterations to mitigate overfitting during training. We also used image augmentation to vary the shapes of the training images for a better learning process. The results demonstrate that, with an accuracy of 92.0%, the proposed DCNN outperforms its counterparts while incurring lower computing costs.
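
A minimal sketch of a dual-branch CNN with Batch Normalization and ELU, merged by concatenation, is given below; the layer sizes and the two branches (rather than the paper's four lateral convolutional sets) are simplifying assumptions.

```python
# Minimal sketch: dual-branch CNN with BatchNorm and ELU, merged by
# concatenation. Layer sizes and the 10-class head are illustrative.
import torch
import torch.nn as nn

def branch() -> nn.Module:
    return nn.Sequential(
        nn.Conv2d(3, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ELU(),
        nn.MaxPool2d(2),
        nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ELU(),
        nn.AdaptiveAvgPool2d(1),
    )

class DualDCNN(nn.Module):
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.a, self.b = branch(), branch()   # two lateral feature extractors
        self.fc = nn.Linear(64 * 2, n_classes)

    def forward(self, img):
        fa = self.a(img).flatten(1)
        fb = self.b(img).flatten(1)
        return self.fc(torch.cat([fa, fb], dim=1))

model = DualDCNN()
print(model(torch.randn(2, 3, 128, 128)).shape)  # -> (2, 10)
```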