• Title/Summary/Keyword: CNN model

Search Result 992, Processing Time 0.027 seconds

Vehicle Headlight and Taillight Recognition in Nighttime using Low-Exposure Camera and Wavelet-based Random Forest (저노출 카메라와 웨이블릿 기반 랜덤 포레스트를 이용한 야간 자동차 전조등 및 후미등 인식)

  • Heo, Duyoung;Kim, Sang Jun;Kwak, Choong Sub;Nam, Jae-Yeal;Ko, Byoung Chul
    • Journal of Broadcast Engineering
    • /
    • v.22 no.3
    • /
    • pp.282-294
    • /
    • 2017
  • In this paper, we propose a novel intelligent headlight control (IHC) system which is durable to various road lights and camera movement caused by vehicle driving. For detecting candidate light blobs, the region of interest (ROI) is decided as front ROI (FROI) and back ROI (BROI) by considering the camera geometry based on perspective range estimation model. Then, light blobs such as headlights, taillights of vehicles, reflection light as well as the surrounding road lighting are segmented using two different adaptive thresholding. From the number of segmented blobs, taillights are first detected using the redness checking and random forest classifier based on Haar-like feature. For the headlight and taillight classification, we use the random forest instead of popular support vector machine or convolutional neural networks for supporting fast learning and testing in real-life applications. Pairing is performed by using the predefined geometric rules, such as vertical coordinate similarity and association check between blobs. The proposed algorithm was successfully applied to various driving sequences in night-time, and the results show that the performance of the proposed algorithms is better than that of recent related works.

Rare Malware Classification Using Memory Augmented Neural Networks (메모리 추가 신경망을 이용한 희소 악성코드 분류)

  • Kang, Min Chul;Kim, Huy Kang
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.28 no.4
    • /
    • pp.847-857
    • /
    • 2018
  • As the number of malicious code increases steeply, cyber attack victims targeting corporations, public institutions, financial institutions, hospitals are also increasing. Accordingly, academia and security industry are conducting various researches on malicious code detection. In recent years, there have been a lot of researches using machine learning techniques including deep learning. In the case of research using Convolutional Neural Network, ResNet, etc. for classification of malicious code, it can be confirmed that the performance improvement is higher than the existing classification method. However, one of the characteristics of the target attack is that it is custom malicious code that makes it operate only for a specific company, so it is not a form spreading widely to a large number of users. Since there are not many malicious codes of this kind, it is difficult to apply the previously studied machine learning or deep learning techniques. In this paper, we propose a method to classify malicious codes when the amount of samples is insufficient such as targeting type malicious code. As a result of the study, we confirmed that the accuracy of 97% can be achieved even with a small amount of data by applying the Memory Augmented Neural Networks model.

Speaker-Independent Korean Digit Recognition Using HCNN with Weighted Distance Measure (가중 거리 개념이 도입된 HCNN을 이용한 화자 독립 숫자음 인식에 관한 연구)

  • 김도석;이수영
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.18 no.10
    • /
    • pp.1422-1432
    • /
    • 1993
  • Nonlinear mapping function of the HCNN( Hidden Control Neural Network ) can change over time to model the temporal variability of a speech signal by combining the nonlinear prediction of conventional neural networks with the segmentation capability of HMM. We have two things in this paper. first, we showed that the performance of the HCNN is better than that of HMM. Second, the HCNN with its prediction error measure given by weighted distance is proposed to use suitable distance measure for the HCNN, and then we showed that the superiority of the proposed system for speaker-independent speech recognition tasks. Weighted distance considers the differences between the variances of each component of the feature vector extraced from the speech data. Speaker-independent Korean digit recognition experiment showed that the recognition rate of 95%was obtained for the HCNN with Euclidean distance. This result is 1.28% higher than HMM, and shows that the HCNN which models the dynamical system is superior to HMM which is based on the statistical restrictions. And we obtained 97.35% for the HCNN with weighted distance, which is 2.35% better than the HCNN with Euclidean distance. The reason why the HCNN with weighted distance shows better performance is as follows : it reduces the variations of the recognition error rate over different speakers by increasing the recognition rate for the speakers who have many misclassified utterances. So we can conclude that the HCNN with weighted distance is more suit-able for speaker-independent speech recognition tasks.

  • PDF

Experimental Comparison of Network Intrusion Detection Models Solving Imbalanced Data Problem (데이터의 불균형성을 제거한 네트워크 침입 탐지 모델 비교 분석)

  • Lee, Jong-Hwa;Bang, Jiwon;Kim, Jong-Wouk;Choi, Mi-Jung
    • KNOM Review
    • /
    • v.23 no.2
    • /
    • pp.18-28
    • /
    • 2020
  • With the development of the virtual community, the benefits that IT technology provides to people in fields such as healthcare, industry, communication, and culture are increasing, and the quality of life is also improving. Accordingly, there are various malicious attacks targeting the developed network environment. Firewalls and intrusion detection systems exist to detect these attacks in advance, but there is a limit to detecting malicious attacks that are evolving day by day. In order to solve this problem, intrusion detection research using machine learning is being actively conducted, but false positives and false negatives are occurring due to imbalance of the learning dataset. In this paper, a Random Oversampling method is used to solve the unbalance problem of the UNSW-NB15 dataset used for network intrusion detection. And through experiments, we compared and analyzed the accuracy, precision, recall, F1-score, training and prediction time, and hardware resource consumption of the models. Based on this study using the Random Oversampling method, we develop a more efficient network intrusion detection model study using other methods and high-performance models that can solve the unbalanced data problem.

S-PRESENT Cryptanalysis through Know-Plaintext Attack Based on Deep Learning (딥러닝 기반의 알려진 평문 공격을 통한 S-PRESENT 분석)

  • Se-jin Lim;Hyun-Ji Kim;Kyung-Bae Jang;Yea-jun Kang;Won-Woong Kim;Yu-Jin Yang;Hwa-Jeong Seo
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.33 no.2
    • /
    • pp.193-200
    • /
    • 2023
  • Cryptanalysis can be performed by various techniques such as known plaintext attack, differential attack, side-channel analysis, and the like. Recently, many studies have been conducted on cryptanalysis using deep learning. A known-plaintext attack is a technique that uses a known plaintext and ciphertext pair to find a key. In this paper, we use deep learning technology to perform a known-plaintext attack against S-PRESENT, a reduced version of the lightweight block cipher PRESENT. This paper is significant in that it is the first known-plaintext attack based on deep learning performed on a reduced lightweight block cipher. For cryptanalysis, MLP (Multi-Layer Perceptron) and 1D and 2D CNN(Convolutional Neural Network) models are used and optimized, and the performance of the three models is compared. It showed the highest performance in 2D convolutional neural networks, but it was possible to attack only up to some key spaces. From this, it can be seen that the known-plaintext attack through the MLP model and the convolutional neural network is limited in attackable key bits.

AI-Based Object Recognition Research for Augmented Reality Character Implementation (증강현실 캐릭터 구현을 위한 AI기반 객체인식 연구)

  • Seok-Hwan Lee;Jung-Keum Lee;Hyun Sim
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.18 no.6
    • /
    • pp.1321-1330
    • /
    • 2023
  • This study attempts to address the problem of 3D pose estimation for multiple human objects through a single image generated during the character development process that can be used in augmented reality. In the existing top-down method, all objects in the image are first detected, and then each is reconstructed independently. The problem is that inconsistent results may occur due to overlap or depth order mismatch between the reconstructed objects. The goal of this study is to solve these problems and develop a single network that provides consistent 3D reconstruction of all humans in a scene. Integrating a human body model based on the SMPL parametric system into a top-down framework became an important choice. Through this, two types of collision loss based on distance field and loss that considers depth order were introduced. The first loss prevents overlap between reconstructed people, and the second loss adjusts the depth ordering of people to render occlusion inference and annotated instance segmentation consistently. This method allows depth information to be provided to the network without explicit 3D annotation of the image. Experimental results show that this study's methodology performs better than existing methods on standard 3D pose benchmarks, and the proposed losses enable more consistent reconstruction from natural images.

Deep learning-based automatic segmentation of the mandibular canal on panoramic radiographs: A multi-device study

  • Moe Thu Zar Aung;Sang-Heon Lim;Jiyong Han;Su Yang;Ju-Hee Kang;Jo-Eun Kim;Kyung-Hoe Huh;Won-Jin Yi;Min-Suk Heo;Sam-Sun Lee
    • Imaging Science in Dentistry
    • /
    • v.54 no.1
    • /
    • pp.81-91
    • /
    • 2024
  • Purpose: The objective of this study was to propose a deep-learning model for the detection of the mandibular canal on dental panoramic radiographs. Materials and Methods: A total of 2,100 panoramic radiographs (PANs) were collected from 3 different machines: RAYSCAN Alpha (n=700, PAN A), OP-100 (n=700, PAN B), and CS8100 (n=700, PAN C). Initially, an oral and maxillofacial radiologist coarsely annotated the mandibular canals. For deep learning analysis, convolutional neural networks (CNNs) utilizing U-Net architecture were employed for automated canal segmentation. Seven independent networks were trained using training sets representing all possible combinations of the 3 groups. These networks were then assessed using a hold-out test dataset. Results: Among the 7 networks evaluated, the network trained with all 3 available groups achieved an average precision of 90.6%, a recall of 87.4%, and a Dice similarity coefficient (DSC) of 88.9%. The 3 networks trained using each of the 3 possible 2-group combinations also demonstrated reliable performance for mandibular canal segmentation, as follows: 1) PAN A and B exhibited a mean DSC of 87.9%, 2) PAN A and C displayed a mean DSC of 87.8%, and 3) PAN B and C demonstrated a mean DSC of 88.4%. Conclusion: This multi-device study indicated that the examined CNN-based deep learning approach can achieve excellent canal segmentation performance, with a DSC exceeding 88%. Furthermore, the study highlighted the importance of considering the characteristics of panoramic radiographs when developing a robust deep-learning network, rather than depending solely on the size of the dataset.

Sorghum Panicle Detection using YOLOv5 based on RGB Image Acquired by UAV System (무인기로 취득한 RGB 영상과 YOLOv5를 이용한 수수 이삭 탐지)

  • Min-Jun, Park;Chan-Seok, Ryu;Ye-Seong, Kang;Hye-Young, Song;Hyun-Chan, Baek;Ki-Su, Park;Eun-Ri, Kim;Jin-Ki, Park;Si-Hyeong, Jang
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.24 no.4
    • /
    • pp.295-304
    • /
    • 2022
  • The purpose of this study is to detect the sorghum panicle using YOLOv5 based on RGB images acquired by a unmanned aerial vehicle (UAV) system. The high-resolution images acquired using the RGB camera mounted in the UAV on September 2, 2022 were split into 512×512 size for YOLOv5 analysis. Sorghum panicles were labeled as bounding boxes in the split image. 2,000images of 512×512 size were divided at a ratio of 6:2:2 and used to train, validate, and test the YOLOv5 model, respectively. When learning with YOLOv5s, which has the fewest parameters among YOLOv5 models, sorghum panicles were detected with mAP@50=0.845. In YOLOv5m with more parameters, sorghum panicles could be detected with mAP@50=0.844. Although the performance of the two models is similar, YOLOv5s ( 4 hours 35 minutes) has a faster training time than YOLOv5m (5 hours 15 minutes). Therefore, in terms of time cost, developing the YOLOv5s model was considered more efficient for detecting sorghum panicles. As an important step in predicting sorghum yield, a technique for detecting sorghum panicles using high-resolution RGB images and the YOLOv5 model was presented.

Selective Word Embedding for Sentence Classification by Considering Information Gain and Word Similarity (문장 분류를 위한 정보 이득 및 유사도에 따른 단어 제거와 선택적 단어 임베딩 방안)

  • Lee, Min Seok;Yang, Seok Woo;Lee, Hong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.105-122
    • /
    • 2019
  • Dimensionality reduction is one of the methods to handle big data in text mining. For dimensionality reduction, we should consider the density of data, which has a significant influence on the performance of sentence classification. It requires lots of computations for data of higher dimensions. Eventually, it can cause lots of computational cost and overfitting in the model. Thus, the dimension reduction process is necessary to improve the performance of the model. Diverse methods have been proposed from only lessening the noise of data like misspelling or informal text to including semantic and syntactic information. On top of it, the expression and selection of the text features have impacts on the performance of the classifier for sentence classification, which is one of the fields of Natural Language Processing. The common goal of dimension reduction is to find latent space that is representative of raw data from observation space. Existing methods utilize various algorithms for dimensionality reduction, such as feature extraction and feature selection. In addition to these algorithms, word embeddings, learning low-dimensional vector space representations of words, that can capture semantic and syntactic information from data are also utilized. For improving performance, recent studies have suggested methods that the word dictionary is modified according to the positive and negative score of pre-defined words. The basic idea of this study is that similar words have similar vector representations. Once the feature selection algorithm selects the words that are not important, we thought the words that are similar to the selected words also have no impacts on sentence classification. This study proposes two ways to achieve more accurate classification that conduct selective word elimination under specific regulations and construct word embedding based on Word2Vec embedding. To select words having low importance from the text, we use information gain algorithm to measure the importance and cosine similarity to search for similar words. First, we eliminate words that have comparatively low information gain values from the raw text and form word embedding. Second, we select words additionally that are similar to the words that have a low level of information gain values and make word embedding. In the end, these filtered text and word embedding apply to the deep learning models; Convolutional Neural Network and Attention-Based Bidirectional LSTM. This study uses customer reviews on Kindle in Amazon.com, IMDB, and Yelp as datasets, and classify each data using the deep learning models. The reviews got more than five helpful votes, and the ratio of helpful votes was over 70% classified as helpful reviews. Also, Yelp only shows the number of helpful votes. We extracted 100,000 reviews which got more than five helpful votes using a random sampling method among 750,000 reviews. The minimal preprocessing was executed to each dataset, such as removing numbers and special characters from text data. To evaluate the proposed methods, we compared the performances of Word2Vec and GloVe word embeddings, which used all the words. We showed that one of the proposed methods is better than the embeddings with all the words. By removing unimportant words, we can get better performance. However, if we removed too many words, it showed that the performance was lowered. For future research, it is required to consider diverse ways of preprocessing and the in-depth analysis for the co-occurrence of words to measure similarity values among words. Also, we only applied the proposed method with Word2Vec. Other embedding methods such as GloVe, fastText, ELMo can be applied with the proposed methods, and it is possible to identify the possible combinations between word embedding methods and elimination methods.

Deep Learning Architectures and Applications (딥러닝의 모형과 응용사례)

  • Ahn, SungMahn
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.2
    • /
    • pp.127-142
    • /
    • 2016
  • Deep learning model is a kind of neural networks that allows multiple hidden layers. There are various deep learning architectures such as convolutional neural networks, deep belief networks and recurrent neural networks. Those have been applied to fields like computer vision, automatic speech recognition, natural language processing, audio recognition and bioinformatics where they have been shown to produce state-of-the-art results on various tasks. Among those architectures, convolutional neural networks and recurrent neural networks are classified as the supervised learning model. And in recent years, those supervised learning models have gained more popularity than unsupervised learning models such as deep belief networks, because supervised learning models have shown fashionable applications in such fields mentioned above. Deep learning models can be trained with backpropagation algorithm. Backpropagation is an abbreviation for "backward propagation of errors" and a common method of training artificial neural networks used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of an error function with respect to all the weights in the network. The gradient is fed to the optimization method which in turn uses it to update the weights, in an attempt to minimize the error function. Convolutional neural networks use a special architecture which is particularly well-adapted to classify images. Using this architecture makes convolutional networks fast to train. This, in turn, helps us train deep, muti-layer networks, which are very good at classifying images. These days, deep convolutional networks are used in most neural networks for image recognition. Convolutional neural networks use three basic ideas: local receptive fields, shared weights, and pooling. By local receptive fields, we mean that each neuron in the first(or any) hidden layer will be connected to a small region of the input(or previous layer's) neurons. Shared weights mean that we're going to use the same weights and bias for each of the local receptive field. This means that all the neurons in the hidden layer detect exactly the same feature, just at different locations in the input image. In addition to the convolutional layers just described, convolutional neural networks also contain pooling layers. Pooling layers are usually used immediately after convolutional layers. What the pooling layers do is to simplify the information in the output from the convolutional layer. Recent convolutional network architectures have 10 to 20 hidden layers and billions of connections between units. Training deep learning networks has taken weeks several years ago, but thanks to progress in GPU and algorithm enhancement, training time has reduced to several hours. Neural networks with time-varying behavior are known as recurrent neural networks or RNNs. A recurrent neural network is a class of artificial neural network where connections between units form a directed cycle. This creates an internal state of the network which allows it to exhibit dynamic temporal behavior. Unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Early RNN models turned out to be very difficult to train, harder even than deep feedforward networks. The reason is the unstable gradient problem such as vanishing gradient and exploding gradient. The gradient can get smaller and smaller as it is propagated back through layers. This makes learning in early layers extremely slow. The problem actually gets worse in RNNs, since gradients aren't just propagated backward through layers, they're propagated backward through time. If the network runs for a long time, that can make the gradient extremely unstable and hard to learn from. It has been possible to incorporate an idea known as long short-term memory units (LSTMs) into RNNs. LSTMs make it much easier to get good results when training RNNs, and many recent papers make use of LSTMs or related ideas.