• Title/Summary/Keyword: Multimedia Features


Comparison of Off-the-Shelf DCNN Models for Extracting Bark Feature and Tree Species Recognition Using Multi-layer Perceptron (수피 특징 추출을 위한 상용 DCNN 모델의 비교와 다층 퍼셉트론을 이용한 수종 인식)

  • Kim, Min-Ki
    • Journal of Korea Multimedia Society / v.23 no.9 / pp.1155-1163 / 2020
  • Deep learning is emerging as a new way to improve the accuracy of tree species identification from bark images. However, the approach has not been studied enough because it requires a large bark image dataset, which is difficult to acquire. This study addresses the problem by using pretrained off-the-shelf DCNN models. It compares the discrimination power of the bark features extracted by each DCNN model, then extracts features with the selected model and feeds them to a multi-layer perceptron (MLP). We found that the ResNet50 model is effective at extracting bark features and that the MLP can be trained well on features reduced by principal component analysis. The proposed approach achieves accuracies of 99.1% and 98.4% on the BarkTex and Trunk12 datasets, respectively.

Human Gait Recognition Based on Spatio-Temporal Deep Convolutional Neural Network for Identification

  • Zhang, Ning;Park, Jin-ho;Lee, Eung-Joo
    • Journal of Korea Multimedia Society / v.23 no.8 / pp.927-939 / 2020
  • Gait recognition can identify people from a long distance, which is very important for improving the intelligence of monitoring systems. Among many human features, gait features have the advantages of being remotely available, robust, and secure. Traditional gait feature extraction, affected by the development of behavior recognition, relies on manual feature extraction, which cannot meet the needs of fine gait recognition. The emergence of deep convolutional neural networks has freed researchers from complex feature engineering, because such networks automatically learn useful features from data, and they have been widely adopted. In this paper, we conduct feature metric learning in three-dimensional space by combining three-dimensional convolution features of the gait sequence with a Siamese structure. This method captures spatial and temporal information from the continuous periodic gait sequence and further improves the accuracy and practicality of gait recognition.
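
The metric-learning idea behind a Siamese structure can be sketched with a pairwise loss. The abstract does not name the loss it uses, so the contrastive loss below is an assumption (it is a common choice for Siamese networks), and the short lists stand in for the three-dimensional convolution features of two gait sequences. The name `contrastive_loss` and the `margin` value are hypothetical.

```python
import math


def contrastive_loss(emb_a, emb_b, same_person, margin=1.0):
    """Contrastive loss for one pair of embeddings (assumed loss, not from the paper).

    Pairs from the same person are pulled together (loss grows with distance);
    pairs from different people are pushed apart until they are at least
    `margin` away from each other.
    """
    distance = math.dist(emb_a, emb_b)  # Euclidean distance between embeddings
    if same_person:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


# Different people whose embeddings are too close are penalised.
print(contrastive_loss([0.0, 0.0], [0.5, 0.0], same_person=False))
```

Both branches of a Siamese network share weights, so minimising this loss over many pairs shapes a single embedding space in which distance reflects identity.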

Acoustic Cues in Spoken French for the Pronunciation Assessment Multimedia System (발음평가용 멀티미디어 시스템 구현을 위한 구어 프랑스어의 음향학적 단서)

  • Lee, Eun-Yung;Song, Mi-Young
    • Speech Sciences / v.12 no.3 / pp.185-200 / 2005
  • The objective of this study is to examine acoustic cues in spoken French for pronunciation assessment, which is necessary for realizing the multimedia system. The corpus is composed of simple expressions that cover all phonemes of the French phonological system. The experiment was conducted with 4 male and female native French speakers and 20 Korean speakers, university students who had studied French for more than two years. We analyzed the recorded data with a spectrograph and measured comparative features numerically. First, we computed the mean and the deviation for all phonemes, and then chose the features that showed a high error frequency and large differences between French and Korean pronunciations. The selected data were simplified and compared with one another. After judging whether each Korean speaker's pronunciation problems were utterance mistakes or interference from the mother tongue, in terms of articulatory and auditory aspects, we tried to find acoustic features that were as simple as possible. From this experiment, we extracted acoustic cues for constructing a French pronunciation training system.


Face Recognition Based on Improved Fuzzy RBF Neural Network for Smart Device

  • Lee, Eung-Joo
    • Journal of Korea Multimedia Society / v.16 no.11 / pp.1338-1347 / 2013
  • Face recognition is the science of automatically identifying individuals based on their unique facial features. To avoid overfitting and reduce the computational burden, this paper proposes a new face recognition algorithm using the PCA-Fisher linear discriminant (PCA-FLD) and a fuzzy radial basis function neural network (RBFNN). First, face features are extracted by principal component analysis (PCA). Then, the extracted features are further processed by Fisher's linear discriminant technique to obtain lower-dimensional discriminant patterns, which serve as the input to the fuzzy RBFNN. The BP learning algorithm, which is widely applied in fuzzy RBF neural networks, converges slowly; therefore, this paper introduces an improved learning algorithm for the fuzzy RBF neural network based on Levenberg-Marquardt (L-M), which combines the gradient descent algorithm with the Gauss-Newton algorithm. Experimental results on the ORL face database demonstrate that the proposed algorithm has satisfactory performance and a high recognition rate.
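
How Levenberg-Marquardt combines gradient descent with Gauss-Newton can be seen from its standard weight update. The notation below is assumed for illustration and is not taken from the paper: $\mathbf{w}$ is the weight vector, $\mathbf{e}$ the vector of output errors, $\mathbf{J} = \partial \mathbf{e} / \partial \mathbf{w}$ its Jacobian, and $\mu > 0$ a damping factor.

```latex
\Delta \mathbf{w} = -\left(\mathbf{J}^{\top}\mathbf{J} + \mu \mathbf{I}\right)^{-1} \mathbf{J}^{\top}\mathbf{e}
% \mu \to 0:      \Delta \mathbf{w} \to -(\mathbf{J}^{\top}\mathbf{J})^{-1}\mathbf{J}^{\top}\mathbf{e}   (Gauss-Newton step)
% \mu \to \infty: \Delta \mathbf{w} \approx -\tfrac{1}{\mu}\,\mathbf{J}^{\top}\mathbf{e}                   (gradient descent on \tfrac{1}{2}\lVert\mathbf{e}\rVert^{2} with step 1/\mu)
```

In typical implementations $\mu$ is decreased after a step that lowers the error and increased otherwise, so the method behaves like gradient descent far from a minimum and like Gauss-Newton close to it.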

DETECTION OF FACIAL FEATURES IN COLOR IMAGES WITH VARIOUS BACKGROUNDS AND FACE POSES

  • Park, Jae-Young;Kim, Nak-Bin
    • Journal of Korea Multimedia Society / v.6 no.4 / pp.594-600 / 2003
  • In this paper, we propose a method for detecting facial features in color images with various backgrounds and face poses. First, the proposed method extracts face candidate regions from images with various backgrounds, which contain skin-tone colors and complex objects, using the color and edge information of the face. Then, using the elliptical shape of the face, we correct the rotation, scale, and tilt of the face region caused by various head poses. Finally, we verify the face using facial features and detect the facial features. Our experimental results show that the detection accuracy is high and that the proposed method can be used effectively in pose-invariant face recognition systems.


An Intelligent Fire Learning and Detection System (지능형 화재 학습 및 탐지 시스템)

  • Cheoi, Kyungjoo
    • Journal of Korea Multimedia Society / v.18 no.3 / pp.359-367 / 2015
  • In this paper, we propose an intelligent fire learning and detection system that uses a hybrid human visual attention mechanism. The proposed fire learning system generates learned data through a learning process on fire and smoke images. The learning features are selected from the many features extracted by a bottom-up human visual attention mechanism, and they are converted into learned data by computing their average and standard deviation. The proposed fire detection system detects fire using the learned data generated by the fire learning system together with the features of the input image.
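
The learning step described above (summarising selected features by their average and standard deviation) can be sketched as follows. The abstract does not give the decision rule used at detection time, so the "within k standard deviations" test is an assumption, as are the names `learn` and `looks_like_fire`, the threshold `k`, and the toy feature values.

```python
from statistics import mean, pstdev


def learn(feature_vectors):
    """Summarise each feature by its (mean, standard deviation) over the training images."""
    columns = zip(*feature_vectors)  # regroup values feature by feature
    return [(mean(col), pstdev(col)) for col in columns]


def looks_like_fire(features, learned, k=3.0, min_std=1e-6):
    """Assumed rule: every feature must lie within k standard deviations of its mean."""
    return all(
        abs(value - mu) <= k * max(sigma, min_std)
        for value, (mu, sigma) in zip(features, learned)
    )


# Hypothetical two-feature vectors extracted from three fire images.
learned = learn([[0.80, 0.60], [0.90, 0.70], [0.85, 0.65]])
print(looks_like_fire([0.86, 0.66], learned))
print(looks_like_fire([0.20, 0.10], learned))
```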

Music Key Identification using Chroma Features and Hidden Markov Models

  • Kanyange, Pamela;Sin, Bong-Kee
    • Journal of Korea Multimedia Society / v.20 no.9 / pp.1502-1508 / 2017
  • A musical key is a fundamental concept in Western music theory. It is a collective characterization of the pitches and chords that together create the musical perception of an entire piece. It is based on the group of pitches in a scale from which a piece of music is constructed. Each key specifies the set of seven primary notes that are used out of the twelve possible chromatic notes. This paper presents a method that identifies the key of a song using Hidden Markov Models, given a sequence of chroma features. Given an input song, a sequence of chroma features is computed. It is then classified into one of the 24 keys using discrete Hidden Markov Models. The proposed method can help musicians and disc jockeys mix segments of tracks to create a medley. When tested on 120 songs, the success rate of music key identification reached around 87.5%.
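
Classification with discrete Hidden Markov Models can be sketched as scoring the observation sequence under one model per key and picking the most likely one. Everything specific below is an assumption made for illustration: each chroma frame is reduced to its strongest pitch class (0 = C, ..., 11 = B), only the 12 major keys are modelled, each model has two hidden states (tonic and dominant harmony), and the probabilities are set by hand rather than trained as in the paper.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11]   # semitone offsets from the tonic
TONIC_TRIAD = [0, 4, 7]
DOMINANT_TRIAD = [7, 11, 2]


def emission_row(tonic, chord):
    """Emission probabilities over 12 pitch classes for one hidden state (hand-set weights)."""
    weights = []
    for pitch in range(12):
        rel = (pitch - tonic) % 12
        if rel in chord:
            weights.append(4.0)      # chord tones are most likely
        elif rel in MAJOR_SCALE:
            weights.append(2.0)      # other scale tones
        else:
            weights.append(0.001)    # out-of-key tones get a small floor
    total = sum(weights)
    return [w / total for w in weights]


def key_model(tonic):
    """Hypothetical two-state HMM for the major key with the given tonic."""
    start = [0.7, 0.3]
    trans = [[0.8, 0.2], [0.3, 0.7]]
    emit = [emission_row(tonic, TONIC_TRIAD), emission_row(tonic, DOMINANT_TRIAD)]
    return start, trans, emit


def log_likelihood(obs, model):
    """Scaled forward algorithm: log P(obs | model)."""
    start, trans, emit = model
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)               # rescale to avoid numerical underflow
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_p


def identify_key(obs):
    """Return the key whose model gives the observation sequence the highest likelihood."""
    scores = {
        NOTE_NAMES[k] + " major": log_likelihood(obs, key_model(k)) for k in range(12)
    }
    return max(scores, key=scores.get)


print(identify_key([0, 4, 7, 5, 2, 11, 0, 9, 7, 0]))   # C E G F D B C A G C
```

Extending the sketch to all 24 keys would add a minor-scale model per tonic; in practice the parameters would be estimated from labelled songs rather than fixed by hand.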

Multi-Path Feature Fusion Module for Semantic Segmentation (다중 경로 특징점 융합 기반의 의미론적 영상 분할 기법)

  • Park, Sangyong;Heo, Yong Seok
    • Journal of Korea Multimedia Society / v.24 no.1 / pp.1-12 / 2021
  • In this paper, we present a new architecture for semantic segmentation. Semantic segmentation aims at pixel-wise classification, which is important for fully understanding images. Previous semantic segmentation networks use features from multiple encoder layers to predict the final results. However, these multi-layer features do not contain varied receptive fields, which easily leads to inaccurate results at boundaries between different classes and for small objects. To solve this problem, we propose a multi-path feature fusion module that allows the features of each layer to contain various receptive fields by using a set of dilated convolutions with different dilation rates. Various experiments demonstrate that our method outperforms previous methods in terms of mean intersection over union (mIoU).
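
The effect of running dilated convolutions with different dilation rates in parallel can be illustrated in one dimension. This is a simplified sketch and not the paper's module: it works on a 1-D list instead of 2-D feature maps, and fusing the paths by element-wise summation is an assumption. The function names are hypothetical.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal, kernel, dilation):
    """Same-size 1-D dilated convolution (cross-correlation) with zero padding."""
    k = len(kernel)
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - k // 2) * dilation   # taps are spaced `dilation` apart
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def multi_path_fusion(signal, kernel, dilations=(1, 2, 4)):
    """Run one path per dilation rate and fuse them by summation (assumed fusion)."""
    paths = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(values) for values in zip(*paths)]


impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
print([receptive_field(3, d) for d in (1, 2, 4)])
print(multi_path_fusion(impulse, [1.0, 1.0, 1.0]))
```

With a 3-tap kernel the three paths cover 3, 5, and 9 input positions, so the fused output at each position mixes both local and wider context.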

Research on the Features of VR Marketing Design Based on Emotional Experience

  • Sui, Qiao;Cho, Dong-Min
    • Journal of Korea Multimedia Society / v.25 no.3 / pp.537-545 / 2022
  • Emotional experience (James, 1884)[1] can affect people's behavior. There is little research on VR marketing (Maojun Zhou, Zeru Yan, 2018)[2] design based on emotional experience. This article is based on emotional evaluation theory, empirical research, and the VR marketing case of the "Buy+" online shopping platform (Wu Yongyi, 2016). It concludes that emotional experience in VR marketing can be defined at three levels, decomposes the features of the VR marketing design of "Buy+" as an online shop accordingly, and identifies the design features of VR marketing from the perspective of emotional experience. Finally, analysis of the questionnaire data verified that vividness, functionality, and effectiveness can represent the features of VR marketing design. The study also analyzed the correlation among these factors; vividness and functionality have the closest relationship. The definition, the components, and the correlation of the three-layer emotional experience obtained from this research can provide theoretical support and a reference for other VR marketing designs.

Multimedia documents for user interfaces of cooperative work (공동 작업을 위한 사용자 인터페이스로서의 멀티미디어 문서)

  • 성미영
    • Proceedings of the ESK Conference / 1995.10a / pp.46-55 / 1995
  • Multimedia documents are becoming the most natural user interface for CSCW (Computer Supported Cooperative Work) in distributed environments. The objective of this study is to propose a multimedia document architecture and to develop a system that can manage it well. The new architecture is for revisable documents and is the basic layer for hypermedia documents. A good document architecture for CSCW must support pointing, marking, and editing over a part of a document. User views, version control, and full-content search are also desirable features. In this paper, we discuss the basic concept of a new document architecture for CSCW. We also present the user interfaces for spatio-temporal composition of multimedia documents.
