• Title/Summary/Keyword: 3D-CNN

Search Result 157, Processing Time 0.025 seconds

A Study on the Design of Prediction Model for Safety Evaluation of Partial Discharge (부분 방전의 안전도 평가를 위한 예측 모델 설계)

  • Lee, Su-Il;Ko, Dae-Sik
    • Journal of Platform Technology
    • /
    • v.8 no.3
    • /
    • pp.10-21
    • /
    • 2020
  • Partial discharge occurs a lot in high-voltage power equipment such as switchgear, transformers, and switch gears. Partial discharge shortens the life of the insulator and causes insulation breakdown, resulting in large-scale damage such as a power outage. There are several types of partial discharge that occur inside the product and the surface. In this paper, we design a predictive model that can predict the pattern and probability of occurrence of partial discharge. In order to analyze the designed model, learning data for each type of partial discharge was collected through the UHF sensor by using a simulator that generates partial discharge. The predictive model designed in this paper was designed based on CNN during deep learning, and the model was verified through learning. To learn about the designed model, 5000 training data were created, and the form of training data was used as input data for the model by pre-processing the 3D raw data input from the UHF sensor as 2D data. As a result of the experiment, it was found that the accuracy of the model designed through learning has an accuracy of 0.9972. It was found that the accuracy of the proposed model was higher in the case of learning by making the data into a two-dimensional image and learning it in the form of a grayscale image.

  • PDF

A Comparative Study on the Effective Deep Learning for Fingerprint Recognition with Scar and Wrinkle (상처와 주름이 있는 지문 판별에 효율적인 심층 학습 비교연구)

  • Kim, JunSeob;Rim, BeanBonyka;Sung, Nak-Jun;Hong, Min
    • Journal of Internet Computing and Services
    • /
    • v.21 no.4
    • /
    • pp.17-23
    • /
    • 2020
  • Biometric information indicating measurement items related to human characteristics has attracted great attention as security technology with high reliability since there is no fear of theft or loss. Among these biometric information, fingerprints are mainly used in fields such as identity verification and identification. If there is a problem such as a wound, wrinkle, or moisture that is difficult to authenticate to the fingerprint image when identifying the identity, the fingerprint expert can identify the problem with the fingerprint directly through the preprocessing step, and apply the image processing algorithm appropriate to the problem. Solve the problem. In this case, by implementing artificial intelligence software that distinguishes fingerprint images with cuts and wrinkles on the fingerprint, it is easy to check whether there are cuts or wrinkles, and by selecting an appropriate algorithm, the fingerprint image can be easily improved. In this study, we developed a total of 17,080 fingerprint databases by acquiring all finger prints of 1,010 students from the Royal University of Cambodia, 600 Sokoto open data sets, and 98 Korean students. In order to determine if there are any injuries or wrinkles in the built database, criteria were established, and the data were validated by experts. The training and test datasets consisted of Cambodian data and Sokoto data, and the ratio was set to 8: 2. The data of 98 Korean students were set up as a validation data set. Using the constructed data set, five CNN-based architectures such as Classic CNN, AlexNet, VGG-16, Resnet50, and Yolo v3 were implemented. A study was conducted to find the model that performed best on the readings. Among the five architectures, ResNet50 showed the best performance with 81.51%.

2D and 3D Hand Pose Estimation Based on Skip Connection Form (스킵 연결 형태 기반의 손 관절 2D 및 3D 검출 기법)

  • Ku, Jong-Hoe;Kim, Mi-Kyung;Cha, Eui-Young
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.12
    • /
    • pp.1574-1580
    • /
    • 2020
  • Traditional pose estimation methods include using special devices or images through image processing. The disadvantage of using a device is that the environment in which the device can be used is limited and costly. The use of cameras and image processing has the advantage of reducing environmental constraints and costs, but the performance is lower. CNN(Convolutional Neural Networks) were studied for pose estimation just using only camera without these disadvantage. Various techniques were proposed to increase cognitive performance. In this paper, the effect of the skip connection on the network was experimented by using various skip connections on the joint recognition of the hand. Experiments have confirmed that the presence of additional skip connections other than the basic skip connections has a better effect on performance, but the network with downward skip connections is the best performance.

ROV Manipulation from Observation and Exploration using Deep Reinforcement Learning

  • Jadhav, Yashashree Rajendra;Moon, Yong Seon
    • Journal of Advanced Research in Ocean Engineering
    • /
    • v.3 no.3
    • /
    • pp.136-148
    • /
    • 2017
  • The paper presents dual arm ROV manipulation using deep reinforcement learning. The purpose of this underwater manipulator is to investigate and excavate natural resources in ocean, finding lost aircraft blackboxes and for performing other extremely dangerous tasks without endangering humans. This research work emphasizes on a self-learning approach using Deep Reinforcement Learning (DRL). DRL technique allows ROV to learn the policy of performing manipulation task directly, from raw image data. Our proposed architecture maps the visual inputs (images) to control actions (output) and get reward after each action, which allows an agent to learn manipulation skill through trial and error method. We have trained our network in simulation. The raw images and rewards are directly provided by our simple Lua simulator. Our simulator achieve accuracy by considering underwater dynamic environmental conditions. Major goal of this research is to provide a smart self-learning way to achieve manipulation in highly dynamic underwater environment. The results showed that a dual robotic arm trained for a 3DOF movement successfully achieved target reaching task in a 2D space by considering real environmental factor.

Multi-modal Emotion Recognition using Semi-supervised Learning and Multiple Neural Networks in the Wild (준 지도학습과 여러 개의 딥 뉴럴 네트워크를 사용한 멀티 모달 기반 감정 인식 알고리즘)

  • Kim, Dae Ha;Song, Byung Cheol
    • Journal of Broadcast Engineering
    • /
    • v.23 no.3
    • /
    • pp.351-360
    • /
    • 2018
  • Human emotion recognition is a research topic that is receiving continuous attention in computer vision and artificial intelligence domains. This paper proposes a method for classifying human emotions through multiple neural networks based on multi-modal signals which consist of image, landmark, and audio in a wild environment. The proposed method has the following features. First, the learning performance of the image-based network is greatly improved by employing both multi-task learning and semi-supervised learning using the spatio-temporal characteristic of videos. Second, a model for converting 1-dimensional (1D) landmark information of face into two-dimensional (2D) images, is newly proposed, and a CNN-LSTM network based on the model is proposed for better emotion recognition. Third, based on an observation that audio signals are often very effective for specific emotions, we propose an audio deep learning mechanism robust to the specific emotions. Finally, so-called emotion adaptive fusion is applied to enable synergy of multiple networks. The proposed network improves emotion classification performance by appropriately integrating existing supervised learning and semi-supervised learning networks. In the fifth attempt on the given test set in the EmotiW2017 challenge, the proposed method achieved a classification accuracy of 57.12%.

Building Dataset of Sensor-only Facilities for Autonomous Cooperative Driving

  • Hyung Lee;Chulwoo Park;Handong Lee;Junhyuk Lee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.1
    • /
    • pp.21-30
    • /
    • 2024
  • In this paper, we propose a method to build a sample dataset of the features of eight sensor-only facilities built as infrastructure for autonomous cooperative driving. The feature extracted from point cloud data acquired by LiDAR and build them into the sample dataset for recognizing the facilities. In order to build the dataset, eight sensor-only facilities with high-brightness reflector sheets and a sensor acquisition system were developed. To extract the features of facilities located within a certain measurement distance from the acquired point cloud data, a cylindrical projection method was applied to the extracted points after applying DBSCAN method for points and then a modified OTSU method for reflected intensity. Coordinates of 3D points, projected coordinates of 2D, and reflection intensity were set as the features of the facility, and the dataset was built along with labels. In order to check the effectiveness of the facility dataset built based on LiDAR data, a common CNN model was selected and tested after training, showing an accuracy of about 90% or more, confirming the possibility of facility recognition. Through continuous experiments, we will improve the feature extraction algorithm for building the proposed dataset and improve its performance, and develop a dedicated model for recognizing sensor-only facilities for autonomous cooperative driving.

An Efficient CT Image Denoising using WT-GAN Model

  • Hae Chan Jeong;Dong Hoon Lim
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.5
    • /
    • pp.21-29
    • /
    • 2024
  • Reducing the radiation dose during CT scanning can lower the risk of radiation exposure, but not only does the image resolution significantly deteriorate, but the effectiveness of diagnosis is reduced due to the generation of noise. Therefore, noise removal from CT images is a very important and essential processing process in the image restoration. Until now, there are limitations in removing only the noise by separating the noise and the original signal in the image area. In this paper, we aim to effectively remove noise from CT images using the wavelet transform-based GAN model, that is, the WT-GAN model in the frequency domain. The GAN model used here generates images with noise removed through a U-Net structured generator and a PatchGAN structured discriminator. To evaluate the performance of the WT-GAN model proposed in this paper, experiments were conducted on CT images damaged by various noises, namely Gaussian noise, Poisson noise, and speckle noise. As a result of the performance experiment, the WT-GAN model is better than the traditional filter, that is, the BM3D filter, as well as the existing deep learning models, such as DnCNN, CDAE model, and U-Net GAN model, in qualitative and quantitative measures, that is, PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity Index Measure) showed excellent results.

Deep Learning-based Depth Map Estimation: A Review

  • Abdullah, Jan;Safran, Khan;Suyoung, Seo
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.1
    • /
    • pp.1-21
    • /
    • 2023
  • In this technically advanced era, we are surrounded by smartphones, computers, and cameras, which help us to store visual information in 2D image planes. However, such images lack 3D spatial information about the scene, which is very useful for scientists, surveyors, engineers, and even robots. To tackle such problems, depth maps are generated for respective image planes. Depth maps or depth images are single image metric which carries the information in three-dimensional axes, i.e., xyz coordinates, where z is the object's distance from camera axes. For many applications, including augmented reality, object tracking, segmentation, scene reconstruction, distance measurement, autonomous navigation, and autonomous driving, depth estimation is a fundamental task. Much of the work has been done to calculate depth maps. We reviewed the status of depth map estimation using different techniques from several papers, study areas, and models applied over the last 20 years. We surveyed different depth-mapping techniques based on traditional ways and newly developed deep-learning methods. The primary purpose of this study is to present a detailed review of the state-of-the-art traditional depth mapping techniques and recent deep learning methodologies. This study encompasses the critical points of each method from different perspectives, like datasets, procedures performed, types of algorithms, loss functions, and well-known evaluation metrics. Similarly, this paper also discusses the subdomains in each method, like supervised, unsupervised, and semi-supervised methods. We also elaborate on the challenges of different methods. At the conclusion of this study, we discussed new ideas for future research and studies in depth map research.

Training Performance Analysis of Semantic Segmentation Deep Learning Model by Progressive Combining Multi-modal Spatial Information Datasets (다중 공간정보 데이터의 점진적 조합에 의한 의미적 분류 딥러닝 모델 학습 성능 분석)

  • Lee, Dae-Geon;Shin, Young-Ha;Lee, Dong-Cheon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.40 no.2
    • /
    • pp.91-108
    • /
    • 2022
  • In most cases, optical images have been used as training data of DL (Deep Learning) models for object detection, recognition, identification, classification, semantic segmentation, and instance segmentation. However, properties of 3D objects in the real-world could not be fully explored with 2D images. One of the major sources of the 3D geospatial information is DSM (Digital Surface Model). In this matter, characteristic information derived from DSM would be effective to analyze 3D terrain features. Especially, man-made objects such as buildings having geometrically unique shape could be described by geometric elements that are obtained from 3D geospatial data. The background and motivation of this paper were drawn from concept of the intrinsic image that is involved in high-level visual information processing. This paper aims to extract buildings after classifying terrain features by training DL model with DSM-derived information including slope, aspect, and SRI (Shaded Relief Image). The experiments were carried out using DSM and label dataset provided by ISPRS (International Society for Photogrammetry and Remote Sensing) for CNN-based SegNet model. In particular, experiments focus on combining multi-source information to improve training performance and synergistic effect of the DL model. The results demonstrate that buildings were effectively classified and extracted by the proposed approach.

Image Retrieval Based on the Weighted and Regional Integration of CNN Features

  • Liao, Kaiyang;Fan, Bing;Zheng, Yuanlin;Lin, Guangfeng;Cao, Congjun
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.3
    • /
    • pp.894-907
    • /
    • 2022
  • The features extracted by convolutional neural networks are more descriptive of images than traditional features, and their convolutional layers are more suitable for retrieving images than are fully connected layers. The convolutional layer features will consume considerable time and memory if used directly to match an image. Therefore, this paper proposes a feature weighting and region integration method for convolutional layer features to form global feature vectors and subsequently use them for image matching. First, the 3D feature of the last convolutional layer is extracted, and the convolutional feature is subsequently weighted again to highlight the edge information and position information of the image. Next, we integrate several regional eigenvectors that are processed by sliding windows into a global eigenvector. Finally, the initial ranking of the retrieval is obtained by measuring the similarity of the query image and the test image using the cosine distance, and the final mean Average Precision (mAP) is obtained by using the extended query method for rearrangement. We conduct experiments using the Oxford5k and Paris6k datasets and their extended datasets, Paris106k and Oxford105k. These experimental results indicate that the global feature extracted by the new method can better describe an image.