• Title/Summary/Keyword: Visual Feature Extraction

Video retrieval method using non-parametric based motion classification (비-파라미터 기반의 움직임 분류를 통한 비디오 검색 기법)

  • Kim Nac-Woo;Choi Jong-Soo
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.43 no.2 s.308
    • /
    • pp.1-11
    • /
    • 2006
  • In this paper, we propose a novel video retrieval algorithm that uses non-parametric motion classification within a shot-based video indexing structure. The proposed system first obtains the key frame and motion information from each shot segmented by a scene change detection method, and then extracts visual features and non-parametric motion information from them. Finally, we construct a real-time retrieval system that supports similarity comparison of these spatio-temporal features. After the normalized motion vector fields are created from the MPEG compressed stream, the non-parametric motion feature is extracted by discretizing the normalized motion vectors into angle bins and considering the mean, variance, and direction of these bins. We use an edge-based spatial descriptor to extract the visual feature from key frames, and R*-tree structures to index the feature vectors. Experimental evidence shows that our algorithm outperforms other video retrieval methods for image indexing and retrieval.
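
The angle-bin step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the choice of eight bins, and the reading of "mean, variance, and direction" as statistics of the bin counts plus the dominant bin are all assumptions.

```python
import math

def motion_angle_features(vectors, num_bins=8):
    """Quantize motion vectors into angle bins and summarize the histogram.

    vectors: iterable of (dx, dy) motion vectors; zero vectors are skipped.
    Returns (histogram, mean, variance, dominant_bin), where mean and
    variance are taken over the bin counts (an assumed interpretation).
    """
    hist = [0] * num_bins
    bin_width = 2 * math.pi / num_bins
    for dx, dy in vectors:
        if dx == 0 and dy == 0:
            continue  # a zero vector has no direction
        angle = math.atan2(dy, dx) % (2 * math.pi)  # map into [0, 2*pi]
        # The extra modulo guards against rounding up to exactly 2*pi.
        hist[int(angle / bin_width) % num_bins] += 1
    if sum(hist) == 0:
        return hist, 0.0, 0.0, None
    mean = sum(hist) / num_bins
    variance = sum((count - mean) ** 2 for count in hist) / num_bins
    dominant_bin = max(range(num_bins), key=lambda i: hist[i])
    return hist, mean, variance, dominant_bin
```

A low variance suggests motion spread evenly over directions, while a high variance with a clear dominant bin suggests coherent motion such as a camera pan.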

Robot Control based on Steady-State Visual Evoked Potential using Arduino and Emotiv Epoc (아두이노와 Emotiv Epoc을 이용한 정상상태시각유발전위 (SSVEP) 기반의 로봇 제어)

  • Yu, Je-Hun;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.25 no.3
    • /
    • pp.254-259
    • /
    • 2015
  • In this paper, a wireless robot control system is proposed that uses a brain-computer interface (BCI) based on the steady-state visual evoked potential (SSVEP). Cross power spectral density (CPSD) was used to analyze the electroencephalogram (EEG) and extract feature data, and linear discriminant analysis (LDA) and a support vector machine (SVM) were used for pattern classification. We obtained an average classification rate of about 70% for each subject. Robot control was implemented using the EEG classification results, with movement commands sent to the robot over Bluetooth.

Medical Image Watermarking Based on Visual Secret Sharing and Cellular Automata Transform for Copyright Protection

  • Fan, Tzuo-Yau;Chao, Her-Chang;Chieu, Bin-Chang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.12
    • /
    • pp.6177-6200
    • /
    • 2018
  • To protect medical images and prevent unnecessary medical disputes, existing watermarking techniques mainly focus on improving the invisibility and robustness of the method. This paper proposes a novel copyright protection method for medical images based on visual secret sharing (VSS) and the cellular automata transform (CAT). The method combines features of the protected medical image with VSS and a watermark to produce the ownership share image (OSI). The OSI is used for medical image verification and must be registered with a certified authority. In the watermark extraction process, the suspected medical image is used to generate a master share image (MSI), and the watermark is extracted by combining the MSI and the OSI. Unlike traditional methods, the proposed method does not need to modify the medical image in order to protect its copyright. Moreover, the registered OSI used to verify ownership displays meaningful information in its appearance, which facilitates image management. Finally, experimental results demonstrate the effectiveness of our method.
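
The share-based workflow above can be illustrated with a deliberately simplified sketch. It replaces the paper's CAT features and VSS codebook with two stand-ins that are assumptions made here for brevity: a mean-threshold feature bit per pixel, and XOR as the share-combination rule. All function names are hypothetical.

```python
def master_share(pixels):
    """Derive a binary master share (MSI) from image features.

    Assumption: one bit per pixel, 1 if the pixel is above the image mean.
    The paper derives its features with the cellular automata transform.
    """
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]

def make_ownership_share(pixels, watermark):
    """Combine the image's master share with the watermark bits (XOR stand-in).

    The image itself is never modified; only the resulting share is registered.
    """
    return [m ^ w for m, w in zip(master_share(pixels), watermark)]

def extract_watermark(suspect_pixels, ownership_share):
    """Recover the watermark from a suspect image and the registered share."""
    return [m ^ o for m, o in zip(master_share(suspect_pixels), ownership_share)]
```

Because the feature bits depend only on coarse image statistics, a lightly distorted copy of the image still yields the same master share in this toy setting, so the registered share still reveals the watermark.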

A Multi-category Task for Bitrate Interval Prediction with the Target Perceptual Quality

  • Yang, Zhenwei;Shen, Liquan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.12
    • /
    • pp.4476-4491
    • /
    • 2021
  • Video service providers often face user network problems when transmitting video streams, and they strive to provide users with superior video quality in a limited-bitrate environment. It is therefore necessary to accurately determine the target bitrate range of a video under different quality requirements. Several schemes have recently been proposed to meet this requirement, but they do not take visual factors into account. In this paper, we propose a new multi-category model that uses machine learning to accurately predict the target bitrate range for a target visual quality. First, a dataset is constructed for training the multi-category models; the quality score ladders and the corresponding bitrate-interval categories are defined in the dataset. Second, several types of spatial-temporal features related to the VMAF evaluation metric and to visual factors are extracted and processed statistically for classification. Finally, bitrate prediction models trained on the dataset with a RandomForest classifier are used to predict the target bitrate of input videos for a target video quality. The classification accuracy of the model reaches 0.705, and video encoded at the bitrate predicted by the model achieves the target perceptual quality.

Activity Object Detection Based on Improved Faster R-CNN

  • Zhang, Ning;Feng, Yiran;Lee, Eung-Joo
    • Journal of Korea Multimedia Society
    • /
    • v.24 no.3
    • /
    • pp.416-422
    • /
    • 2021
  • Because of large intra-class differences in human activity, high similarity between classes, and problems of viewing angle and occlusion, it is difficult to extract features manually, and the detection rate of human behavior is low. To address these problems, this paper proposes an improved detection algorithm based on Faster R-CNN. It achieves multi-object recognition and localization through a two-stage detection network and replaces the original feature extraction module with Dense-Net, which can fuse multi-level feature information, increase network depth, and avoid vanishing gradients. In addition, the proposal merging strategy is improved with Soft-NMS, in which an attenuation function replaces the conventional NMS algorithm, thereby avoiding missed detections of adjacent or overlapping objects and improving detection accuracy when multiple objects are present. In the experiments, the improved Faster R-CNN method achieves a target detection result of 84.7%, an improvement over the compared methods, which indicates that the method has significant advantages and potential.
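
To make the Soft-NMS idea concrete, here is a minimal sketch that uses the commonly cited Gaussian decay, score * exp(-IoU^2 / sigma). The abstract does not specify the attenuation function the authors designed, so the decay form, `sigma`, and `score_threshold` are assumptions.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Return (box, score) pairs kept by Gaussian Soft-NMS, best first.

    Instead of deleting boxes that overlap the current best box, their
    scores are decayed according to the amount of overlap.
    """
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best)
        kept.append((best_box, best_score))
        decayed = []
        for box, score in candidates:
            overlap = iou(best_box, box)
            new_score = score * math.exp(-(overlap ** 2) / sigma)
            if new_score >= score_threshold:  # drop only near-zero scores
                decayed.append((box, new_score))
        candidates = decayed
    return kept
```

With hard NMS, a box that overlaps the top detection beyond the threshold is discarded outright; here it survives with a reduced score, which is what helps with adjacent or overlapping objects.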

Revisiting Deep Learning Model for Image Quality Assessment: Is Strided Convolution Better than Pooling? (영상 화질 평가 딥러닝 모델 재검토: 스트라이드 컨볼루션이 풀링보다 좋은가?)

  • Uddin, AFM Shahab;Chung, TaeChoong;Bae, Sung-Ho
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2020.11a
    • /
    • pp.29-32
    • /
    • 2020
  • Because image acquisition processes are imperfect, the introduction of noise is inevitable. As a result, objective image quality assessment (IQA) plays an important role in estimating the visual quality of noisy images. Many IQA methods have been proposed, including traditional signal-processing-based methods and current deep-learning-based methods; the latter show promising performance because of their complex representation ability. Deep-learning-based methods consist of several convolution layers and down-sampling layers for feature extraction, followed by fully connected layers for regression. Down-sampling is usually performed by a max-pooling layer after each convolutional block. We show that this max-pooling causes information loss because it discards features without regard to their importance. Consequently, we propose an improved IQA method that replaces the max-pooling layers with strided convolutions to down-sample the feature space. Because the strided convolution layers have learnable parameters, they can preserve useful features and discard redundant information, thereby improving prediction accuracy. The experimental results verify the effectiveness of the proposed method.
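
The difference between the two down-sampling operations can be seen in a small one-dimensional sketch. This is only an illustration of the operations themselves, not the proposed IQA network; the kernel weights are fixed by hand here, whereas in a network they would be learned.

```python
def max_pool_1d(signal, size=2):
    """Downsample by keeping only the maximum of each non-overlapping window."""
    return [max(signal[i:i + size]) for i in range(0, len(signal) - size + 1, size)]

def strided_conv_1d(signal, kernel, stride=2):
    """Downsample by applying a weighted kernel every `stride` samples."""
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, signal[i:i + k]))
        for i in range(0, len(signal) - k + 1, stride)
    ]

signal = [1.0, 3.0, 2.0, 8.0]
pooled = max_pool_1d(signal)                  # smaller values are discarded
convolved = strided_conv_1d(signal, [0.5, 0.5])  # every sample contributes
```

Both outputs have half the input length, but max-pooling keeps one value per window while the strided convolution mixes all values in the window with weights that training can adjust.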

Improved Cycle GAN Performance By Considering Semantic Loss (의미적 손실 함수를 통한 Cycle GAN 성능 개선)

  • Tae-Young Jeong;Hyun-Sik Lee;Ye-Rim Eom;Kyung-Su Park;Yu-Rim Shin;Jae-Hyun Moon
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.11a
    • /
    • pp.908-909
    • /
    • 2023
  • Recently, several generative models have emerged and are being used in various industries. Among them, Cycle GAN is still used in fields such as style transfer, medical care, and autonomous driving. In this paper, we propose two methods to improve the performance of the Cycle GAN model. First, the ReLU activation function previously used in the generator is replaced with Leaky ReLU. Second, a new loss function is proposed that uses a VGG feature extractor to consider the semantic level rather than focusing only on the pixel level. The proposed model showed improved quality on the test set in the art domain, and it is expected to improve performance in other domains in the future.

Moving Object Tracking Using Active Contour Model (동적 윤곽 모델을 이용한 이동 물체 추적)

  • Han, Kyu-Bum;Baek, Yoon-Su
    • Transactions of the Korean Society of Mechanical Engineers A
    • /
    • v.27 no.5
    • /
    • pp.697-704
    • /
    • 2003
  • In this paper, a visual tracking system for arbitrarily shaped moving objects is proposed. Existing tracking systems can be divided into model-based methods, which need a prior model of the target object, and image-based methods, which use image features. With model-based methods, reliable tracking is possible, but the shape must be simplified and the application is restricted to a specific target model. With image-based methods, on the other hand, processing speed can be increased, but the shape information is lost and the tracking system is sensitive to image noise. The proposed tracking system is composed of an extraction process that recognizes the existence of a moving object and a tracking process that extracts the dynamic characteristics and shape information of the target objects. In particular, an active contour model is used to effectively track an object that is undergoing shape change. In the initialization process of the contour model, the proposed initialization method avoids semi-automatic operation and increases the convergence speed of the contour. In addition, to solve the correspondence problem efficiently in multiple-object tracking, a variation function is proposed that uses the variation of the position structure in the image frame and the snake energy level. To verify the validity and effectiveness of the proposed tracking system, a real-time tracking experiment with multiple moving objects is carried out.

Physiological Responses-Based Emotion Recognition Using Multi-Class SVM with RBF Kernel (RBF 커널과 다중 클래스 SVM을 이용한 생리적 반응 기반 감정 인식 기술)

  • Vanny, Makara;Ko, Kwang-Eun;Park, Seung-Min;Sim, Kwee-Bo
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.19 no.4
    • /
    • pp.364-371
    • /
    • 2013
  • Emotion recognition is an important component in the development of human-human and human-computer interaction. In this paper, we focus on the performance of a multi-class SVM (Support Vector Machine) with a Gaussian RBF (Radial Basis Function) kernel, which is used to recognize emotions from physiological signals and to improve recognition accuracy. In the experimental paradigm for data acquisition, visual stimuli from the IAPS (International Affective Picture System) are used to induce emotional states (fear, disgust, joy, and neutral) in each subject. The raw acquired signals are split into trials within each session for pre-processing. The mean and standard deviation are then computed as features for the subsequent classification step. The experimental results show that the proposed multi-class SVM with a Gaussian RBF kernel and the OVO (One-Versus-One) method classifies these four emotions successfully.
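
The feature step and the one-versus-one scheme described above can be sketched as follows. The binary classifiers are passed in as a hypothetical callable, `pairwise_decide`; in the paper's setting each would be a trained RBF-kernel SVM, which this sketch does not implement. The use of the population standard deviation is also an assumption.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev

def extract_features(trial):
    """Mean and (population) standard deviation of one trial's samples."""
    return (mean(trial), pstdev(trial))

def ovo_predict(features, classes, pairwise_decide):
    """One-versus-one voting: every pair of classes casts exactly one vote.

    pairwise_decide(a, b, features) must return either a or b. It stands in
    for a trained binary classifier for that pair (hypothetical here).
    """
    votes = Counter(
        pairwise_decide(a, b, features) for a, b in combinations(classes, 2)
    )
    return votes.most_common(1)[0][0]
```

With four emotions there are six class pairs, so six binary decisions are combined for each prediction.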

Soft Sensor Design Using Image Analysis and its Industrial Applications Part 1. Estimation and Monitoring of Product Appearance (화상분석을 이용한 소프트 센서의 설계와 산업응용사례 1. 외관 품질의 수치적 추정과 모니터링)

  • Liu, J. Jay
    • Korean Chemical Engineering Research
    • /
    • v.48 no.4
    • /
    • pp.475-482
    • /
    • 2010
  • In this work, a soft sensor based on image analysis is proposed for quantitatively estimating the visual appearance of manufactured products and is applied to quality monitoring. The methodology consists of three steps: (1) textural feature extraction from product images using the wavelet transform, (2) numerical estimation of the product appearance through projection of the textural features onto a subspace, and (3) use of the latent variables of the textural features (i.e., the numerical estimates of product appearance). The focus of this approach is on the consistent and quantitative estimation of continuous variations in visual appearance rather than on classification into discrete classes. The approach is illustrated through its application to the estimation and monitoring of the appearance of engineered stone countertops.
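
Step (1) can be illustrated with a one-level 2D Haar transform and per-subband energies. This is a sketch under assumptions: the abstract does not name the wavelet family, the number of decomposition levels, or the texture statistic, so Haar, a single level, and mean squared coefficient energy are choices made here for illustration.

```python
def haar_2d_level(image):
    """One level of a 2D Haar transform on an image with even dimensions.

    Returns four half-size subbands: approximation, horizontal difference,
    vertical difference, and diagonal difference.
    """
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, len(image), 2):
        rows = ([], [], [], [])
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 4)  # local average
            rows[1].append((a - b + d - e) / 4)  # change along a row
            rows[2].append((a + b - d - e) / 4)  # change along a column
            rows[3].append((a - b - d + e) / 4)  # diagonal change
        approx.append(rows[0])
        horiz.append(rows[1])
        vert.append(rows[2])
        diag.append(rows[3])
    return approx, horiz, vert, diag

def subband_energy(band):
    """Mean squared coefficient of a subband, used as a texture feature."""
    values = [v for row in band for v in row]
    return sum(v * v for v in values) / len(values)

def texture_features(image):
    """Energy of each subband from a single decomposition level."""
    return [subband_energy(band) for band in haar_2d_level(image)]
```

The resulting feature vector is what a subspace projection method would take as input in step (2).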