• Title/Summary/Keyword: 3D network


Speech Emotion Recognition Using 2D-CNN with Mel-Frequency Cepstrum Coefficients

  • Eom, Youngsik;Bang, Junseong
    • Journal of information and communication convergence engineering / v.19 no.3 / pp.148-154 / 2021
  • With the advent of context-aware computing, many attempts have been made to understand emotions. Among these, Speech Emotion Recognition (SER) is a method of recognizing a speaker's emotions from speech information. Successful SER depends on selecting distinctive features and classifying them appropriately. In this paper, the performance of SER using neural network models (e.g., fully connected network (FCN), convolutional neural network (CNN)) with Mel-Frequency Cepstral Coefficients (MFCC) is examined in terms of the accuracy and distribution of emotion recognition. On the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset, after tuning model parameters, a two-dimensional Convolutional Neural Network (2D-CNN) model with MFCC showed the best performance, with an average accuracy of 88.54% across five emotions (anger, happiness, calm, fear, and sadness) for male and female speakers. In addition, the distribution of emotion recognition accuracies across the neural network models indicates that the 2D-CNN with MFCC can be expected to reach an overall accuracy of 75% or more.

A Study on Unsupervised Learning Method of RAM-based Neural Net (RAM 기반 신경망의 비지도 학습에 관한 연구)

  • Park, Sang-Moo;Kim, Seong-Jin;Lee, Dong-Hyung;Lee, Soo-Dong;Ock, Cheol-Young
    • Journal of the Korea Society of Computer and Information / v.16 no.1 / pp.31-38 / 2011
  • A RAM-based neural net is a weightless neural network based on a binary neural network. The 3-D neural network used in this paper is a binary neural network with multiple information bits that stores training counts. Recognition by the MRD technique is based on supervised learning. The network therefore cannot distinguish between categories by itself, and good performance can be achieved only with well-separated categories of training data. In this paper, an unsupervised learning algorithm is proposed that trains the existing 3-D neural network without labeling the data, so that categories are distinguished from the input training patterns alone. The training data for the proposed unsupervised learning are the MNIST handwritten digits from NIST, which consist of multiple patterns of the digits 0 to 9; randomly selected samples are used as training patterns. Through experiments, the neural network determines the number of discriminators, each of which holds an interpretable representation of a handwritten digit.

An algorithm for estimating surface normal from its boundary curves

  • Park, Jisoon;Kim, Taewon;Baek, Seung-Yeob;Lee, Kunwoo
    • Journal of Computational Design and Engineering / v.2 no.1 / pp.67-72 / 2015
  • Recently, along with improvements in sketch-based geometry modeling methods, there has been much research on generating surface models from 3D curves. However, surfacing a 3D curve network remains an ambiguous problem because of the lack of geometric information. In this paper, we propose a new algorithm for estimating normal vectors of 3D curves that accord closely with user intent. Bending energy is defined using the rotation-minimizing frame (RMF) of a 3D curve, and the minimal-energy frame is taken as the one that accords with design intent. The proposed algorithm is demonstrated by creating surface models from various curve networks. The algorithm can be used to extract new geometric information from 3D curves in the sketch-based modeling process. A new framework for 3D modeling can also be expected from combining curve networks with surface creation algorithms.

A Conceptual Data Model for a 3D Cadastre in Korea

  • Lee, Ji-Yeong;Koh, June-Hwan
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.25 no.6_1 / pp.565-574 / 2007
  • Because most current cadastral systems maintain 2D geometric descriptions of parcels linked to administrative records, they may not reflect the current tendency to use space above and below the surface. Land is now used on multiple levels, e.g. in multi-use complex buildings, subways, and infrastructure above and below ground. Such multilevel land use cannot be defined as (2D parcel-based) cadastral objects in current cadastre systems. This trend calls for a new system in which rights to land are clearly and indisputably recorded, because a right of ownership on a parcel relates to a space in 3D rather than to a 2D surface area. Therefore, this article proposes a 3D spatial data model to represent the geometrical and topological data of 3D property situations arising from multilevel land use in 3D cadastre systems, and a conceptual 3D cadastral model for Korea as the basis of a conceptual schema for a 3D cadastre. Lastly, this paper presents the results of an experimental implementation of the 3D cadastre that performs topological analyses based on a 3D Network Data Model to identify spatial neighbors.

Neural Network Approach to Sensor Fusion System for Improving the Recognition Performance of 3D Objects (3차원 물체의 인식 성능 향상을 위한 감각 융합 신경망 시스템)

  • Dong Sung Soo;Lee Chong Ho;Kim Ji Kyoung
    • The Transactions of the Korean Institute of Electrical Engineers D / v.54 no.3 / pp.156-165 / 2005
  • Human beings recognize the physical world by integrating a great variety of sensory inputs, the information acquired through their own actions, and their knowledge of the world, using a hierarchical, parallel-distributed mechanism. In this paper, the authors propose a sensor fusion system that can recognize multiple 3D objects from 2D projection images and tactile information. The proposed system focuses on improving the recognition performance for 3D objects. Unlike conventional object recognition systems that use an image sensor alone, the proposed method uses tactile sensors in addition to a visual sensor. A neural network is used to fuse the two sensory signals. Tactile signals are obtained from the reaction force of the pressure sensors at the fingertips when unknown objects are grasped by a four-fingered robot hand. The experiment evaluates the recognition rate and the number of learning iterations for various objects. The merits of the proposed system are not only its high learning performance but also its reliability: with tactile information it can recognize various objects even when the visual signals are degraded. The experimental results show that the proposed system can improve the recognition rate and reduce learning time. These results verify the effectiveness of the proposed sensor fusion system as a recognition scheme for 3D objects.

Implementation of an Autostereoscopic Virtual 3D Button in Non-contact Manner Using Simple Deep Learning Network

  • You, Sang-Hee;Hwang, Min;Kim, Ki-Hoon;Cho, Chang-Suk
    • Journal of Information Processing Systems / v.17 no.3 / pp.505-517 / 2021
  • This research presents an implementation of a non-contact autostereoscopic virtual three-dimensional (3D) button device. The proposed device is characterized by its visible features, non-contact use, and artificial intelligence (AI) engine. It was designed to be contactless to prevent virus contamination and consists of 3D buttons in a virtual stereoscopic view. To identify the button virtually pressed by a pointing fingertip, a simple two-stage deep learning network without convolution filters was designed. As the experiment confirmed, if the composition of the input data is clearly designed, the deep learning network does not need a complex configuration. In testing and evaluation by a certification institute, the proposed button device showed high reliability and stability.

No-Reference Sports Video-Quality Assessment Using 3D Shearlet Transform and Deep Residual Neural Network (3차원 쉐어렛 변환과 심층 잔류 신경망을 이용한 무참조 스포츠 비디오 화질 평가)

  • Lee, Gi Yong;Shin, Seung-Su;Kim, Hyoung-Gook
    • Journal of Korea Multimedia Society / v.23 no.12 / pp.1447-1453 / 2020
  • In this paper, we propose a method for no-reference quality assessment of sports videos using a 3D shearlet transform and a deep residual neural network. In the proposed method, 3D shearlet transform-based spatiotemporal features are extracted from overlapped video blocks and applied to logistic regression concatenated with a deep residual neural network, based on a conditional video block-wise constraint, to learn the spatiotemporal correlation and predict the quality score. Our evaluation shows that the proposed method predicts video quality with higher accuracy than conventional no-reference video quality assessment methods.

3D Adjacency Spatial Query using 3D Topological Network Data Model (3차원 네트워크 기반 위상학적 데이터 모델을 이용한 3차원 인접성 공간질의)

  • Lee, Seok-Ho;Park, Se-Ho;Lee, Ji-Yeong
    • Spatial Information Research / v.18 no.5 / pp.93-105 / 2010
  • Spatial neighborhoods are spaces that are related to a target space. A 3D spatial query, a function for searching spatial neighborhoods, is an important function in spatial analysis. Various methodologies have been proposed in related studies; this study suggests an adjacency-based methodology. The methodology builds topological data that represent adjacency using a network-based topological data model, then applies a modified Dijkstra's algorithm to the topological data. The results of ordering adjacent spaces by their relation to a target space were visualized, and ways to take advantage of them were considered. The objective of this paper is to implement a 3D spatial query for searching spaces that have an adjacency relationship with a target space in 3D space. The purposes of this study are to 1) generate adjacency-based 3D network data via a network-based topological data model and 2) implement a 3D spatial query for searching spatial neighborhoods by applying Dijkstra's algorithm to these data.
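
The adjacency ordering described in this abstract can be illustrated with a minimal sketch. It uses the textbook Dijkstra's algorithm on a small adjacency network, not the paper's modified variant, which the abstract does not specify. The function name `adjacency_order`, the room and corridor identifiers, and the edge weights are all hypothetical.

```python
import heapq


def adjacency_order(graph, target):
    """Rank every reachable space by shortest network distance from target.

    graph maps a space id to {neighbor id: edge weight}. This is plain
    Dijkstra, used here only to illustrate adjacency ordering.
    """
    dist = {target: 0.0}
    heap = [(0.0, target)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    ranked = [(space, d) for space, d in dist.items() if space != target]
    ranked.sort(key=lambda item: (item[1], item[0]))
    return ranked


# Hypothetical two-floor building: nodes are spaces, edges are adjacencies.
rooms = {
    "room_101": {"corridor_1": 1.0},
    "corridor_1": {"room_101": 1.0, "room_102": 1.0, "stair_A": 2.0},
    "room_102": {"corridor_1": 1.0},
    "stair_A": {"corridor_1": 2.0, "corridor_2": 3.0},
    "corridor_2": {"stair_A": 3.0, "room_201": 1.0},
    "room_201": {"corridor_2": 1.0},
}

for space, distance in adjacency_order(rooms, "room_101"):
    print(space, distance)
```

With unit edge weights the ranking reduces to the number of adjacency hops from the target space; weighted edges let vertical connections such as stairs count as farther away.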

Synchronization Method of Stereoscopic Video in 3D Mobile Broadcasting through Heterogeneous Network (이종망을 통한 3D 모바일 방송에서의 스테레오스코픽 비디오 전송을 위한 동기화 방법)

  • Kwon, Ki-Deok;Yoo, Young-Hwan;Jeong, Hyeon-Jun;Lee, Gwang-Soon;Cheong, Won-Sik;Hur, Nam-Ho
    • Journal of Broadcast Engineering / v.17 no.4 / pp.596-610 / 2012
  • This paper proposes a method for providing a high-quality 3D broadcasting service in a mobile broadcasting system. In this method, audio and video data are delivered through a heterogeneous network, consisting of a mobile network as well as a broadcasting network, because of the limited bandwidth of the broadcasting system. However, it is more difficult to synchronize the left and right video frames of a 3D stereoscopic service when they arrive through different types of networks. The proposed method uses the offset from the initial RTP (Real-time Transport Protocol) timestamp to determine the order of frames and to find the pair of left and right frames that must be played at the same time. Additionally, a new signaling method is introduced for a mobile device to request a 3D service and to obtain the initial RTP timestamp.
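
The offset-based pairing described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: both streams are assumed to use the same RTP clock rate, each stream is assumed to start from its own initial timestamp, and matching frames are assumed to have exactly equal offsets. The names `ts_offset` and `pair_frames` and the sample values are hypothetical. The only protocol fact relied on is that RTP timestamps are 32-bit and wrap around.

```python
RTP_TS_MODULUS = 1 << 32  # RTP timestamps are 32-bit and wrap around


def ts_offset(timestamp, initial_timestamp):
    """Offset of a frame from its stream's initial timestamp, wrap-safe."""
    return (timestamp - initial_timestamp) % RTP_TS_MODULUS


def pair_frames(left_frames, right_frames, left_initial, right_initial):
    """Pair left and right frames that share the same timestamp offset.

    Each frame is a (rtp_timestamp, payload) tuple. Frames may arrive in
    any order; the result is sorted by offset, i.e. playback order.
    """
    right_by_offset = {
        ts_offset(ts, right_initial): payload for ts, payload in right_frames
    }
    pairs = []
    for ts, payload in left_frames:
        offset = ts_offset(ts, left_initial)
        if offset in right_by_offset:
            pairs.append((offset, payload, right_by_offset[offset]))
    pairs.sort(key=lambda pair: pair[0])
    return pairs


# Hypothetical streams: the left one wraps past 2**32 on its fourth frame,
# and the right one arrives out of order over the other network.
left_initial, right_initial = 4294960000, 1000
left = [(4294960000, "L0"), (4294963000, "L1"), (4294966000, "L2"), (1704, "L3")]
right = [(4000, "R1"), (1000, "R0"), (10000, "R3"), (7000, "R2")]

for offset, left_payload, right_payload in pair_frames(
    left, right, left_initial, right_initial
):
    print(offset, left_payload, right_payload)
```

Comparing offsets rather than raw timestamps is what makes the two streams comparable, since each sender picks its own starting timestamp. A real receiver would also need a tolerance for unequal offsets and a policy for frames whose partner never arrives, which this sketch omits.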