• Title/Summary/Keyword: convolution layer

Search Result 138, Processing Time 0.025 seconds

CNN Based 2D and 2.5D Face Recognition For Home Security System (홈보안 시스템을 위한 CNN 기반 2D와 2.5D 얼굴 인식)

  • MaYing, MaYing;Kim, Kang-Chul
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.14 no.6
    • /
    • pp.1207-1214
    • /
    • 2019
  • Technologies of the 4th industrial revolution have been unknowingly seeping into our lives. Many IoT based home security systems are using the convolutional neural network(CNN) as good biometrics to recognize a face and protect home and family from intruders since CNN has demonstrated its excellent ability in image recognition. In this paper, three layouts of CNN for 2D and 2.5D image of small dataset with various input image size and filter size are explored. The simulation results show that the layout of CNN with 50*50 input size of 2.5D image, 2 convolution and max pooling layer, and 3*3 filter size for small dataset of 2.5D image is optimal for a home security system with recognition accuracy of 0.966. In addition, the longest CPU time consumption for one input image is 0.057S. The proposed layout of CNN for a face recognition is suitable to control the actuators in the home security system because a home security system requires good face recognition and short recognition time.

Studies on Joint Source/Channel Coding for MPEG-4 Scalable Video Transmission in Mobile Broadcast Receiving Environments (이동방송수신환경에서 MPEG-4 계층적 비디오 전송을 위한 결합 소스/채널 부호화에 관한 연구)

  • Lee Woon-Moon;Sohn Won
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.42 no.3 s.303
    • /
    • pp.31-40
    • /
    • 2005
  • In this paper, we develop an approach toward JSC(Joint Source-Channel Coding) method for MPEG-4 based FGS(Fine Granular Scalability) video coding and transmission in fixed and mobile receiving environment(Digital Audio Broadcasting, DAB). The source coder used MPEG-4 FGS video codec, the channel coder used RCPC(Rate Compatible Punctured Convolution) code and the modulation method used QPSK modulation. We have considered channel environment of AWGN and mobile receiving environment. This study determined optimum Trade-off point between source bit rate and channel coding rate in variable channel states. We compared FGS-JSC method and general single layer CBR(Constant Bit Rate) transmission. In this results, FGS-JSC was appeared better performance than CBR transmission.

Speech detection from broadcast contents using multi-scale time-dilated convolutional neural networks (다중 스케일 시간 확장 합성곱 신경망을 이용한 방송 콘텐츠에서의 음성 검출)

  • Jang, Byeong-Yong;Kwon, Oh-Wook
    • Phonetics and Speech Sciences
    • /
    • v.11 no.4
    • /
    • pp.89-96
    • /
    • 2019
  • In this paper, we propose a deep learning architecture that can effectively detect speech segmentation in broadcast contents. We also propose a multi-scale time-dilated layer for learning the temporal changes of feature vectors. We implement several comparison models to verify the performance of proposed model and calculated the frame-by-frame F-score, precision, and recall. Both the proposed model and the comparison model are trained with the same training data, and we train the model using 32 hours of Korean broadcast data which is composed of various genres (drama, news, documentary, and so on). Our proposed model shows the best performance with F-score 91.7% in Korean broadcast data. The British and Spanish broadcast data also show the highest performance with F-score 87.9% and 92.6%. As a result, our proposed model can contribute to the improvement of performance of speech detection by learning the temporal changes of the feature vectors.

Defect Classification of Cross-section of Additive Manufacturing Using Image-Labeling (이미지 라벨링을 이용한 적층제조 단면의 결함 분류)

  • Lee, Jeong-Seong;Choi, Byung-Joo;Lee, Moon-Gu;Kim, Jung-Sub;Lee, Sang-Won;Jeon, Yong-Ho
    • Journal of the Korean Society of Manufacturing Process Engineers
    • /
    • v.19 no.7
    • /
    • pp.7-15
    • /
    • 2020
  • Recently, the fourth industrial revolution has been presented as a new paradigm and additive manufacturing (AM) has become one of the most important topics. For this reason, process monitoring for each cross-sectional layer of additive metal manufacturing is important. Particularly, deep learning can train a machine to analyze, optimize, and repair defects. In this paper, image classification is proposed by learning images of defects in the metal cross sections using the convolution neural network (CNN) image labeling algorithm. Defects were classified into three categories: crack, porosity, and hole. To overcome a lack-of-data problem, the amount of learning data was augmented using a data augmentation algorithm. This augmentation algorithm can transform an image to 180 images, increasing the learning accuracy. The number of training and validation images was 25,920 (80 %) and 6,480 (20 %), respectively. An optimized case with a combination of fully connected layers, an optimizer, and a loss function, showed that the model accuracy was 99.7 % and had a success rate of 97.8 % for 180 test images. In conclusion, image labeling was successfully performed and it is expected to be applied to automated AM process inspection and repair systems in the future.

Deep Learning based HEVC Double Compression Detection (딥러닝 기술 기반 HEVC로 압축된 영상의 이중 압축 검출 기술)

  • Uddin, Kutub;Yang, Yoonmo;Oh, Byung Tae
    • Journal of Broadcast Engineering
    • /
    • v.24 no.6
    • /
    • pp.1134-1142
    • /
    • 2019
  • Detection of double compression is one of the most efficient ways of remarking the validity of videos. Many methods have been introduced to detect HEVC double compression with different coding parameters. However, HEVC double compression detection under the same coding environments is still a challenging task in video forensic. In this paper, we introduce a novel method based on the frame partitioning information in intra prediction mode for detecting double compression in with the same coding environments. We propose to extract statistical feature and Deep Convolution Neural Network (DCNN) feature from the difference of partitioning picture including Coding Unit (CU) and Transform Unit (TU) information. Finally, a softmax layer is integrated to perform the classification of the videos into single and double compression by combing the statistical and the DCNN features. Experimental results show the effectiveness of the statistical and the DCNN features with an average accuracy of 87.5% for WVGA and 84.1% for HD dataset.

EPS Gesture Signal Recognition using Deep Learning Model (심층 학습 모델을 이용한 EPS 동작 신호의 인식)

  • Lee, Yu ra;Kim, Soo Hyung;Kim, Young Chul;Na, In Seop
    • Smart Media Journal
    • /
    • v.5 no.3
    • /
    • pp.35-41
    • /
    • 2016
  • In this paper, we propose hand-gesture signal recognition based on EPS(Electronic Potential Sensor) using Deep learning model. Extracted signals which from Electronic field based sensor, EPS have much of the noise, so it must remove in pre-processing. After the noise are removed with filter using frequency feature, the signals are reconstructed with dimensional transformation to overcome limit which have just one-dimension feature with voltage value for using convolution operation. Then, the reconstructed signal data is finally classified and recognized using multiple learning layers model based on deep learning. Since the statistical model based on probability is sensitive to initial parameters, the result can change after training in modeling phase. Deep learning model can overcome this problem because of several layers in training phase. In experiment, we used two different deep learning structures, Convolutional neural networks and Recurrent Neural Network and compared with statistical model algorithm with four kinds of gestures. The recognition result of method using convolutional neural network is better than other algorithms in EPS gesture signal recognition.

Shadow Removal based on the Deep Neural Network Using Self Attention Distillation (자기 주의 증류를 이용한 심층 신경망 기반의 그림자 제거)

  • Kim, Jinhee;Kim, Wonjun
    • Journal of Broadcast Engineering
    • /
    • v.26 no.4
    • /
    • pp.419-428
    • /
    • 2021
  • Shadow removal plays a key role for the pre-processing of image processing techniques such as object tracking and detection. With the advances of image recognition based on deep convolution neural networks, researches for shadow removal have been actively conducted. In this paper, we propose a novel method for shadow removal, which utilizes self attention distillation to extract semantic features. The proposed method gradually refines results of shadow detection, which are extracted from each layer of the proposed network, via top-down distillation. Specifically, the training procedure can be efficiently performed by learning the contextual information for shadow removal without shadow masks. Experimental results on various datasets show the effectiveness of the proposed method for shadow removal under real world environments.

The Design and Practice of Disaster Response RL Environment Using Dimension Reduction Method for Training Performance Enhancement (학습 성능 향상을 위한 차원 축소 기법 기반 재난 시뮬레이션 강화학습 환경 구성 및 활용)

  • Yeo, Sangho;Lee, Seungjun;Oh, Sangyoon
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.7
    • /
    • pp.263-270
    • /
    • 2021
  • Reinforcement learning(RL) is the method to find an optimal policy through training. and it is one of popular methods for solving lifesaving and disaster response problems effectively. However, the conventional reinforcement learning method for disaster response utilizes either simple environment such as. grid and graph or a self-developed environment that are hard to verify the practical effectiveness. In this paper, we propose the design of a disaster response RL environment which utilizes the detailed property information of the disaster simulation in order to utilize the reinforcement learning method in the real world. For the RL environment, we design and build the reinforcement learning communication as well as the interface between the RL agent and the disaster simulation. Also, we apply the dimension reduction method for converting non-image feature vectors into image format which is effectively utilized with convolution layer to utilize the high-dimensional and detailed property of the disaster simulation. To verify the effectiveness of our proposed method, we conducted empirical evaluations and it shows that our proposed method outperformed conventional methods in the building fire damage.

Active pulse classification algorithm using convolutional neural networks (콘볼루션 신경회로망을 이용한 능동펄스 식별 알고리즘)

  • Kim, Geunhwan;Choi, Seung-Ryul;Yoon, Kyung-Sik;Lee, Kyun-Kyung;Lee, Donghwa
    • The Journal of the Acoustical Society of Korea
    • /
    • v.38 no.1
    • /
    • pp.106-113
    • /
    • 2019
  • In this paper, we propose an algorithm to classify the received active pulse when the active sonar system is operated as a non-cooperative mode. The proposed algorithm uses CNN (Convolutional Neural Networks) which shows good performance in various fields. As an input of CNN, time frequency analysis data which performs STFT (Short Time Fourier Transform) of the received signal is used. The CNN used in this paper consists of two convolution and pulling layers. We designed a database based neural network and a pulse feature based neural network according to the output layer design. To verify the performance of the algorithm, the data of 3110 CW (Continuous Wave) pulses and LFM (Linear Frequency Modulated) pulses received from the actual ocean were processed to construct training data and test data. As a result of simulation, the database based neural network showed 99.9 % accuracy and the feature based neural network showed about 96 % accuracy when allowing 2 pixel error.

SIFT Image Feature Extraction based on Deep Learning (딥 러닝 기반의 SIFT 이미지 특징 추출)

  • Lee, Jae-Eun;Moon, Won-Jun;Seo, Young-Ho;Kim, Dong-Wook
    • Journal of Broadcast Engineering
    • /
    • v.24 no.2
    • /
    • pp.234-242
    • /
    • 2019
  • In this paper, we propose a deep neural network which extracts SIFT feature points by determining whether the center pixel of a cropped image is a SIFT feature point. The data set of this network consists of a DIV2K dataset cut into $33{\times}33$ size and uses RGB image unlike SIFT which uses black and white image. The ground truth consists of the RobHess SIFT features extracted by setting the octave (scale) to 0, the sigma to 1.6, and the intervals to 3. Based on the VGG-16, we construct an increasingly deep network of 13 to 23 and 33 convolution layers, and experiment with changing the method of increasing the image scale. The result of using the sigmoid function as the activation function of the output layer is compared with the result using the softmax function. Experimental results show that the proposed network not only has more than 99% extraction accuracy but also has high extraction repeatability for distorted images.