• Title/Summary/Keyword: Deep Features

Multi-Object Goal Visual Navigation Based on Multimodal Context Fusion (멀티모달 맥락정보 융합에 기초한 다중 물체 목표 시각적 탐색 이동)

  • Jeong Hyun Choi;In Cheol Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.9 / pp.407-418 / 2023
  • Multi-Object Goal Visual Navigation (MultiOn) is a visual navigation task in which an agent must visit multiple object goals in an unknown indoor environment in a given order. Existing models for the MultiOn task are limited in that they cannot exploit an integrated view of multimodal context, because they use only a unimodal context map. To overcome this limitation, we propose a novel deep neural network-based agent model for the MultiOn task. The proposed model, MCFMO, uses a multimodal context map containing visual appearance features, semantic features of environmental objects, and goal object features. Moreover, the proposed model effectively fuses these three heterogeneous features into a global multimodal context map by using a point-wise convolutional neural network module. Lastly, the proposed model adopts an auxiliary task learning module that predicts the observation status, goal direction, and goal distance, which helps the agent learn the navigation policy efficiently. Through various quantitative and qualitative experiments using the Habitat-Matterport3D simulation environment and scene dataset, we demonstrate the superiority of the proposed model.
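The point-wise fusion step described in the abstract above can be illustrated with a minimal sketch. It assumes the three context maps share the same grid size and are fused by channel concatenation followed by a 1x1 convolution (the same linear map applied at every cell). The function names, channel counts, and weights are hypothetical and are not taken from the MCFMO paper.

```python
from typing import List

Grid = List[List[List[float]]]  # feature map stored as H x W x C nested lists


def concat_channels(maps: List[Grid]) -> Grid:
    """Stack several H x W x C_i maps along the channel axis."""
    height, width = len(maps[0]), len(maps[0][0])
    return [
        [sum((m[y][x] for m in maps), []) for x in range(width)]
        for y in range(height)
    ]


def pointwise_conv(grid: Grid, weights: List[List[float]], bias: List[float]) -> Grid:
    """Apply a 1x1 convolution: one shared linear map per grid cell.

    weights has shape C_out x C_in; bias has length C_out.
    """
    return [
        [
            [sum(w * v for w, v in zip(row, cell)) + b for row, b in zip(weights, bias)]
            for cell in line
        ]
        for line in grid
    ]


# Hypothetical 1 x 2 maps: 2 visual channels, 1 semantic channel, 1 goal channel.
visual = [[[1.0, 2.0], [3.0, 4.0]]]
semantic = [[[0.5], [1.5]]]
goal = [[[1.0], [0.0]]]

stacked = concat_channels([visual, semantic, goal])  # 1 x 2 x 4
weights = [[1.0, 0.0, 0.0, 0.0], [0.5, 0.5, 1.0, 1.0]]  # illustrative, not learned
fused = pointwise_conv(stacked, weights, bias=[0.0, 0.0])  # 1 x 2 x 2
print(fused)  # [[[1.0, 3.0], [3.0, 5.0]]]
```

Because the kernel covers a single cell, the fusion mixes modalities without mixing neighboring map locations, so the spatial layout of the context map is preserved.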

Feasibility of Deep Learning Algorithms for Binary Classification Problems (이진 분류문제에서의 딥러닝 알고리즘의 활용 가능성 평가)

  • Kim, Kitae;Lee, Bomi;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.23 no.1 / pp.95-108 / 2017
  • Recently, AlphaGo, the Baduk (Go) artificial intelligence program developed by Google DeepMind, won a decisive victory against Lee Sedol. Many people thought that machines would not be able to beat a human at Go because, unlike chess, the number of possible move sequences exceeds the number of atoms in the universe, but the result was the opposite of what people predicted. After the match, artificial intelligence was highlighted as a core technology of the fourth industrial revolution and attracted attention from various application domains. In particular, deep learning drew attention as the core artificial intelligence technique used in the AlphaGo algorithm. Deep learning is already being applied to many problems and performs especially well in image recognition. It also performs well on high-dimensional data such as voice, images, and natural language, where it was difficult to achieve good performance with existing machine learning techniques. In contrast, it is difficult to find deep learning research on traditional business data and structured data analysis. In this study, we examine whether the deep learning techniques studied so far can be used not only for the recognition of high-dimensional data but also for binary classification problems in traditional business data analysis, such as customer churn analysis, marketing response prediction, and default prediction. We also compare the performance of the deep learning techniques with that of traditional artificial neural network models. The experimental data are the telemarketing response data of a bank in Portugal. The data set has input variables such as age, occupation, loan status, and the number of previous telemarketing contacts, and a binary target variable that records whether the customer intends to open an account. 
In this study, to evaluate the applicability of deep learning algorithms and techniques to binary classification problems, we compared the performance of various models using CNN, LSTM, and dropout, which are widely used in deep learning, with that of MLP models, a traditional artificial neural network architecture. However, since not all network design alternatives can be tested, owing to the nature of artificial neural networks, the experiment was conducted under restricted settings for the number of hidden layers, the number of neurons in each hidden layer, the number of output channels (filters), and the conditions under which dropout was applied. The F1 score, rather than overall accuracy, was used to evaluate the models in order to show how well they classify the class of interest. The detailed methods for applying each deep learning technique in the experiment are as follows. The CNN algorithm reads values adjacent to a given value and recognizes features from them, but in business data the distance between fields carries no meaning because each field is usually independent. In this experiment, we therefore set the filter size of the CNN to the number of fields, so that it learns the characteristics of the whole record at once, and added a hidden layer to make decisions based on the additional features. For the model with two LSTM layers, the input direction of the second layer is reversed relative to the first layer in order to reduce the influence of the position of each field. For dropout, we set the neurons in each hidden layer to be dropped with a probability of 0.5. The experimental results show that the model with the highest F1 score was the CNN model using dropout, and the next best was the MLP model with two hidden layers using dropout. 
In this study, we obtained several findings from the experiments. First, models using dropout make slightly more conservative predictions than those without dropout and generally show better classification performance. Second, CNN models show better classification performance than MLP models. This is interesting because CNNs performed well on binary classification problems, to which they have rarely been applied, as well as in the fields where their effectiveness has already been proven. Third, the LSTM algorithm seems to be unsuitable for binary classification problems because its training time is too long relative to the performance improvement. From these results, we can confirm that some deep learning algorithms can be applied to solve business binary classification problems.
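The abstract above evaluates models with the F1 score rather than overall accuracy. The following minimal sketch shows why that matters on imbalanced response data; the labels and predictions are made-up illustrations, not values from the study.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical campaign: 3 of 10 customers respond (label 1).
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_no = [0] * 10                        # ignores the responders entirely
model = [1, 0, 0, 0, 0, 0, 0, 1, 1, 0]      # finds 2 of 3, with 1 false alarm

print(accuracy(y_true, always_no), f1_score(y_true, always_no))  # 0.7 0.0
print(accuracy(y_true, model), f1_score(y_true, model))
```

A classifier that never predicts a response still reaches 70% accuracy here, while its F1 score is zero, which is the behavior the F1 metric is meant to expose.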

A Real-time People Counting Algorithm Using Background Modeling and CNN (배경모델링과 CNN을 이용한 실시간 피플 카운팅 알고리즘)

  • Yang, HunJun;Jang, Hyeok;Jeong, JaeHyup;Lee, Bowon;Jeong, DongSeok
    • Journal of the Institute of Electronics and Information Engineers / v.54 no.3 / pp.70-77 / 2017
  • Recently, the Internet of Things (IoT) and deep learning techniques have affected video surveillance systems in various ways. The surveillance features that perform detection, tracking, and classification of specific objects in Closed Circuit Television (CCTV) video are becoming more intelligent. This paper presents a real-time algorithm that can run in a PC environment using only a low-power CPU. Traditional tracking algorithms combine background modeling using the Gaussian Mixture Model (GMM), the Hungarian algorithm, and a Kalman filter; they have relatively low complexity but high detection errors. To compensate for this, deep learning, which can be trained on large amounts of data, was used. In particular, an SRGB (Sequential RGB) 3-layer CNN was applied to tracked objects to emphasize the features of moving people. A performance evaluation comparing the proposed algorithm with existing ones using HOG and SVM showed move-in and move-out error rate reductions of 7.6% and 9.0%, respectively.

Optimal strategy for low surface brightness imaging with KMTNet

  • Byun, Woowon;Kim, Minjin;Sheen, Yun-Kyeong;Ho, Luis C.;Lee, Joon Hyeop;Jeong, Hyunjin;Kim, Sang Chul;Park, Byeong-Gon;Seon, Kwang-Il
    • The Bulletin of The Korean Astronomical Society / v.43 no.1 / pp.42.4-43 / 2018
  • Most galaxies are believed to evolve through mergers and accretion. In particular, minor mergers and gas accretion appear to play an important role in galaxy evolution in the present-day Universe. Tidally disrupted debris from such processes remains as diffuse, low-surface-brightness structures because the dynamical timescale in the outskirts is significantly longer than that in the central regions. Although these structures give useful insight into the mass assembly history of galaxies, they are difficult to detect because of their faint surface brightness. To investigate the structural properties of the outskirts of nearby galaxies, we are conducting a deep, wide-field imaging survey with KMTNet. We present our observing strategy and an optimal data reduction process for recovering faint extended features in KMTNet images. Using imaging data of NGC 1291 obtained with KMTNet, we find that the peak-to-peak sky gradient can be reduced to less than 0.4-0.6% of the original sky level across the entire image. We also find that we can reach surface brightnesses of $\mu_{B,1\sigma} \sim 29.5$ and $\mu_{R,1\sigma} \sim 28.5$ mag arcsec$^{-2}$ in the one-dimensional profile, a limit set mainly by the uncertainty in the sky determination. This indicates that deep KMTNet imaging data are suitable for studying the extended faint features of nearby galaxies, such as stellar halos, outer disks, and dwarf companions.

Single Panorama Depth Estimation Using Domain Adaptation (도메인 적응을 이용한 단일 파노라마 깊이 추정)

  • Lee, Jonghyeop;Son, Hyeongseok;Lee, Junyong;Yoon, Haeun;Cho, Sunghyun;Lee, Seungyong
    • Journal of the Korea Computer Graphics Society / v.26 no.3 / pp.61-68 / 2020
  • In this paper, we propose a deep learning framework for predicting a depth map of a 360° panorama image. Previous works use synthetic 360° panorama datasets to train networks because realistic datasets are lacking. However, the synthetic nature of the datasets causes the features extracted by the networks to differ from those of real 360° panorama images, which inevitably leads previous methods to fail in depth prediction on real 360° panorama images. To address this gap, we use domain adaptation to learn features shared by real and synthetic panorama images. Experimental results show that our approach can greatly improve the accuracy of depth estimation on real panorama images while achieving state-of-the-art performance on synthetic images.

Reynolds and Froude number effect on the flow past an interface-piercing circular cylinder

  • Koo, Bonguk;Yang, Jianming;Yeon, Seong Mo;Stern, Frederick
    • International Journal of Naval Architecture and Ocean Engineering / v.6 no.3 / pp.529-561 / 2014
  • The two-phase turbulent flow past an interface-piercing circular cylinder is studied using a high-fidelity orthogonal curvilinear grid solver with a Lagrangian dynamic subgrid-scale model for large-eddy simulation and a coupled level set and volume of fluid method for air-water interface tracking. The simulations cover the sub-critical, critical, and post-critical Reynolds number regimes and sub- and super-critical Froude numbers in order to investigate the effect of both dimensionless parameters on the flow. Significant changes in flow features near the air-water interface were observed as the Reynolds number was increased from the sub-critical to the critical regime. The interface considerably delays the separation point near the interface for all Reynolds numbers. The separation region at intermediate depths is remarkably reduced in the critical Reynolds number regime. The deep flow resembles the single-phase turbulent flow past a circular cylinder, but includes the effect of the free surface and the limited span length for sub-critical Reynolds numbers. At different Froude numbers, the air-water interface exhibits significantly changed structures, including breaking bow waves with splashes and bubbles at high Froude numbers. Instantaneous and mean flow features such as interface structures, vortex shedding, Reynolds stresses, and vorticity transport are also analyzed. The results are compared with reference experimental data available in the literature. The deep flow is also compared with the single-phase turbulent flow past a circular cylinder in similar ranges of Reynolds numbers. The limitations of the current simulations and of the available experimental data are discussed, along with future research.
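For reference, the two dimensionless parameters named in the abstract above are conventionally defined as follows for flow past a cylinder. The choice of the cylinder diameter as the length scale in both numbers is an assumption here; the abstract does not state which length scale the authors use.

```latex
\[
\mathrm{Re} = \frac{U D}{\nu},
\qquad
\mathrm{Fr} = \frac{U}{\sqrt{g D}}
\]
% U   : free-stream velocity
% D   : cylinder diameter (assumed length scale)
% \nu : kinematic viscosity of the fluid
% g   : gravitational acceleration
```

The Reynolds number compares inertial to viscous forces, while the Froude number compares inertial to gravitational forces, which is why the latter governs the wave and interface structures at the air-water surface.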

Face Super-Resolution using Adversarial Distillation of Multi-Scale Facial Region Dictionary (다중 스케일 얼굴 영역 딕셔너리의 적대적 증류를 이용한 얼굴 초해상화)

  • Jo, Byungho;Park, In Kyu;Hong, Sungeun
    • Journal of Broadcast Engineering / v.26 no.5 / pp.608-620 / 2021
  • Recent deep learning-based face super-resolution (FSR) works have achieved strong performance by utilizing facial prior knowledge, such as facial landmarks and dictionaries that reflect structural or semantic characteristics of the human face. However, most of these methods require additional processing time and memory. To solve this issue, this paper proposes an efficient FSR model using knowledge distillation techniques. The intermediate features of the teacher network, which contains dictionary information based on major face regions, are transferred to the student through adversarial multi-scale feature distillation. Experimental results show that the proposed model is superior to other SR methods and demonstrate its effectiveness compared to the teacher model.

Atrous Residual U-Net for Semantic Segmentation in Street Scenes based on Deep Learning (딥러닝 기반 거리 영상의 Semantic Segmentation을 위한 Atrous Residual U-Net)

  • Shin, SeokYong;Lee, SangHun;Han, HyunHo
    • Journal of Convergence for Information Technology / v.11 no.10 / pp.45-52 / 2021
  • In this paper, we propose an Atrous Residual U-Net (AR-UNet) to improve the accuracy of U-Net-based semantic segmentation. U-Net is mainly used in fields such as medical image analysis, autonomous vehicles, and remote sensing. The conventional U-Net extracts insufficient features because of the small number of convolution layers in its encoder. The extracted features are essential for classifying object categories, and when they are insufficient, segmentation accuracy drops. To address this problem, we propose the AR-UNet, which uses residual learning and ASPP in the encoder. Residual learning improves feature extraction ability and is effective in preventing the feature loss and vanishing gradient problems caused by consecutive convolutions. In addition, ASPP enables additional feature extraction without reducing the resolution of the feature map. Experiments on the Cityscapes dataset verified the effectiveness of the AR-UNet: it produced better segmentation results than the conventional U-Net. AR-UNet can thus contribute to the advancement of many applications where accuracy is important.
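ASPP (atrous spatial pyramid pooling) is built from atrous, or dilated, convolutions. The one-dimensional sketch below illustrates only the dilation idea: the same three-tap kernel covers a wider input span as the rate grows, with no extra weights. It is a simplified illustration, not the AR-UNet encoder. It uses a "valid" convolution without padding, so the output shortens, whereas ASPP implementations typically pad the input to keep the feature-map resolution.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """Valid-mode 1-D atrous (dilated) cross-correlation.

    Kernel taps are spaced `rate` samples apart, so a kernel of length k
    covers (k - 1) * rate + 1 input samples.
    """
    span = (len(kernel) - 1) * rate
    return [
        sum(k * signal[i + j * rate] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]  # illustrative weights

print(atrous_conv1d(signal, kernel, rate=1))  # [6, 9, 12, 15, 18]
print(atrous_conv1d(signal, kernel, rate=2))  # [9, 12, 15]
```

With rate 1 each output sees 3 neighboring samples; with rate 2 the same 3 weights see a window of 5 samples, which is how atrous convolution enlarges the receptive field without downsampling.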

Decision Tree Techniques with Feature Reduction for Network Anomaly Detection (네트워크 비정상 탐지를 위한 속성 축소를 반영한 의사결정나무 기술)

  • Kang, Koohong
    • Journal of the Korea Institute of Information Security & Cryptology / v.29 no.4 / pp.795-805 / 2019
  • Recently, there has been growing interest in network anomaly detection technology for tackling unknown attacks. To this end, diverse approaches based on data mining, machine learning, and deep learning have been applied to detect network anomalies. In this paper, we evaluate the feasibility of the decision tree, one of the most popular data mining techniques for classification, for network anomaly detection on the NSL-KDD data set. To handle the over-fitting problem of decision trees, we select 13 features from the original 41 features of the data set using the chi-square test, and then build the decision tree using TensorFlow and scikit-learn, obtaining binary classification accuracies of 84% and 70% on the KDDTest+ and KDDTest-21 sets of the NSL-KDD test data, respectively. These results are improvements of 3 and 6 percentage points over the previous decision-tree accuracies of 81% and 64%, respectively.
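Chi-square feature selection, as mentioned in the abstract above, ranks each feature by how strongly its values are associated with the class label. The sketch below computes the Pearson chi-square statistic for categorical features in plain Python and keeps the top-scoring ones. The feature names and rows are illustrative toy data, not records from NSL-KDD, and `select_top_k` is a hypothetical helper rather than the paper's procedure.

```python
from collections import Counter


def chi_square(feature, labels):
    """Pearson chi-square statistic between a categorical feature and labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    stat = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            expected = f_count * l_count / n  # count expected under independence
            stat += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return stat


def select_top_k(columns, labels, k):
    """Return the names of the k features with the largest chi-square score."""
    scores = {name: chi_square(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data: 1 = attack, 0 = normal.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "flag": ["S0", "S0", "S0", "S0", "SF", "SF", "SF", "SF"],           # tracks the label
    "proto": ["tcp", "udp", "tcp", "udp", "tcp", "udp", "tcp", "udp"],  # unrelated
}

print(chi_square(columns["flag"], labels))   # 8.0
print(chi_square(columns["proto"], labels))  # 0.0
print(select_top_k(columns, labels, k=1))    # ['flag']
```

Dropping low-scoring features leaves the tree fewer uninformative attributes to split on, which is one common way to limit over-fitting.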

Vehicle Detection in Aerial Images Based on Hyper Feature Map in Deep Convolutional Network

  • Shen, Jiaquan;Liu, Ningzhong;Sun, Han;Tao, Xiaoli;Li, Qiangyi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.4 / pp.1989-2011 / 2019
  • Vehicle detection in aerial images is an interesting and challenging research topic. Most traditional vehicle detection methods are based on the sliding-window search algorithm, but these methods extract object features insufficiently and incur heavy computational costs. Recent studies have shown that convolutional neural network algorithms, especially Faster R-CNN, have made significant progress in computer vision. However, Faster R-CNN mainly detects objects in natural scenes and is not suitable for detecting small objects in aerial views. In this paper, an accurate and effective vehicle detection algorithm based on Faster R-CNN is proposed. Our method fuses a hyperactive feature map network with an Eltwise model and a Concat model, which is more conducive to the extraction of small-object features. Moreover, our model sets suitable anchor boxes based on the size of the objects, which also effectively improves detection performance. We evaluate the detection performance of our method on the Munich dataset and on our own collected dataset, and it improves on other methods in accuracy and effectiveness. Our model achieves an 82.2% recall rate and a 90.2% accuracy rate on the Munich dataset, increases of 2.5 and 1.3 percentage points, respectively, over the state-of-the-art methods.
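The idea of matching anchor boxes to object size can be sketched with the usual Faster R-CNN-style parameterization, in which each anchor is defined by a scale (the side of a square with the same area) and an aspect ratio (height divided by width). The scales and ratios below are hypothetical values chosen for small objects; the abstract does not report the settings the authors used.

```python
import math


def make_anchors(scales, ratios):
    """Return (width, height) pairs, one per scale/ratio combination.

    Each anchor keeps area scale**2 while its height/width equals ratio.
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            width = scale / math.sqrt(ratio)
            height = scale * math.sqrt(ratio)
            anchors.append((width, height))
    return anchors


# Hypothetical settings for small vehicles seen from above.
small_object_anchors = make_anchors(scales=[32], ratios=[0.25, 1.0, 4.0])
print(small_object_anchors)  # [(64.0, 16.0), (32.0, 32.0), (16.0, 64.0)]
```

Choosing scales close to the typical object size in pixels means more anchors overlap each ground-truth box well enough to count as positive training examples.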