• Title/Summary/Keyword: Convolution Kernel


Lane Detection System using CNN (CNN을 사용한 차선검출 시스템)

  • Kim, Jihun; Lee, Daesik; Lee, Minho
    • IEMEK Journal of Embedded Systems and Applications / v.11 no.3 / pp.163-171 / 2016
  • Lane detection is a widely researched topic. Although simple road detection is easily achieved by previous methods, lane detection becomes very difficult in complex cases involving noisy edges. To address this, we use a convolutional neural network (CNN) for image enhancement. CNN is a deep learning method that has been applied very successfully to object detection and recognition. In this paper, we introduce a robust lane detection method based on a CNN combined with the random sample consensus (RANSAC) algorithm. First, we calculate edges in an image using a hat-shaped kernel; then we detect lanes using the CNN combined with RANSAC. In the training process of the CNN, the input data consist of edge images and the target data are images that have real white lanes on an otherwise black background. The CNN structure consists of 8 layers: 3 convolutional layers, 2 subsampling layers, and a multi-layer perceptron (MLP) of 3 fully connected layers. Convolutional and subsampling layers are hierarchically arranged to form a deep structure. Our proposed lane detection algorithm successfully eliminates noise lines and was found to perform better than other conventional line detection algorithms such as RANSAC.
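
The RANSAC step named in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `fit_line_ransac`, its parameters, and the toy points are assumptions made for illustration, and the CNN enhancement stage is omitted entirely. It only shows how random two-point hypotheses and inlier counting reject stray edge points.

```python
import math
import random


def fit_line_ransac(points, n_iters=200, inlier_tol=1.0, seed=0):
    """Fit a 2-D line a*x + b*y + c = 0 (unit normal) with plain RANSAC.

    Returns (line, inliers). All parameter names are illustrative.
    """
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(n_iters):
        # Hypothesis: the line through two randomly chosen points.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2
        norm = math.hypot(a, b)
        if norm == 0:
            continue  # identical points, no line defined
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # Consensus: points whose perpendicular distance is within tolerance.
        inliers = [(x, y) for x, y in points
                   if abs(a * x + b * y + c) <= inlier_tol]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers


# Toy data (hypothetical): 20 points on y = 2x + 1 plus 5 stray edge points.
lane = [(x, 2 * x + 1) for x in range(20)]
noise = [(3, 40), (15, 2), (8, 30), (18, 5), (1, 25)]
line, inliers = fit_line_ransac(lane + noise)
slope = -line[0] / line[1]
```

In a pipeline like the one described, the input points would come from the enhanced edge image rather than from a synthetic list.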

Elderly Assistance System Development based on Real-time Embedded Linux (실시간 임베디드 리눅스 기반 노약자 지원 로봇 개발)

  • Koh, Jae-Hwan; Yang, Gil-Jin; Choi, Byoung-Wook
    • Journal of Institute of Control, Robotics and Systems / v.19 no.11 / pp.1036-1042 / 2013
  • In this paper, an elderly assistance system is developed based on Xenomai, a real-time development framework cooperating with the Linux kernel. A Kinect sensor is used to recognize the behavior of the elderly, and an A-star search algorithm is implemented to find the shortest path to the person. The mobile robot also generates a trajectory for smooth driving using a digital convolution operator based on a Bezier curve. In order to follow the generated trajectory within the control period, we developed real-time tasks and compared their trajectory-tracking performance with that of non-real-time tasks. The real-time task follows the trajectory better within the physical constraints, which means that it is more appropriate for an elderly assistance system.
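
The A-star search mentioned in this abstract can be sketched on a small occupancy grid. The grid representation, the 4-connected moves, and the function name `astar` are assumptions for illustration; the abstract does not describe the map model used on the robot.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) / 1 (blocked) cells.

    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    came_from = {}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start.
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry, a cheaper route was found later
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = cell
                    heapq.heappush(
                        open_heap, (new_cost + h((nr, nc)), new_cost, (nr, nc))
                    )
    return None


# Hypothetical map: a wall with a single gap between robot and person.
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
path = astar(grid, (0, 0), (2, 0))
blocked = astar([[0, 1], [1, 0]], (0, 0), (1, 1))
```

The resulting cell sequence would then be smoothed into a drivable trajectory, which is the role the abstract assigns to the Bezier-based convolution operator.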

Parameter Analysis for Time Reduction in Extracting SIFT Keypoints in the Aspect of Image Stitching (영상 스티칭 관점에서 SIFT 특징점 추출시간 감소를 위한 파라미터 분석)

  • Moon, Won-Jun; Seo, Young-Ho; Kim, Dong-Wook
    • Journal of Broadcast Engineering / v.23 no.4 / pp.559-573 / 2018
  • Recently, omni-directional or panoramic images have become one of the most actively used image media in fields such as virtual reality (VR). Such an image is generated by stitching images obtained by various methods. In this process, extracting the keypoints necessary for stitching takes the most time. In this paper, we analyze the parameters involved in extracting SIFT keypoints, the most widely used keypoints, with the aim of reducing the computation time. The parameters considered are the initial standard deviation of the Gaussian kernel used for Gaussian filtering, the number of difference-of-Gaussian image sets used for extracting local extrema, and the number of octaves. Two SIFT schemes are considered: the originally proposed Lowe scheme and the Hess scheme, which is a convolution cascade scheme. First, the effect of each parameter value on the computation time is analyzed, and then the effect of each parameter on the stitching performance is analyzed by performing actual stitching experiments. Finally, based on the results of the two analyses, we extract a set of parameter values that minimizes computation time without degrading stitching performance.
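
The three parameters analyzed here fix the Gaussian scale ladder inside each octave. The sketch below assumes the standard SIFT construction (scale factor k = 2^(1/s) for s intervals per octave, s + 3 Gaussian images and s + 2 difference-of-Gaussian images per octave) and uses the fact that successive Gaussian blurs add in quadrature, which is what makes a cascade of small convolutions possible. The function name and the defaults of 1.6 and 3 follow Lowe's commonly cited settings and are not values taken from this paper.

```python
import math


def gaussian_ladder(sigma0=1.6, intervals=3):
    """Return (total, incremental) blur levels for one SIFT octave.

    total[i]       : overall sigma of the i-th Gaussian image
    incremental[i] : extra blur applied to image i-1 to obtain image i
    """
    k = 2.0 ** (1.0 / intervals)
    total = [sigma0 * k ** i for i in range(intervals + 3)]
    # Cascade: sigma_total[i]^2 = sigma_total[i-1]^2 + sigma_inc[i]^2.
    # The first entry assumes an unblurred base image (a simplification).
    incremental = [total[0]] + [
        math.sqrt(total[i] ** 2 - total[i - 1] ** 2)
        for i in range(1, len(total))
    ]
    return total, incremental


total, incremental = gaussian_ladder()
num_dog_images = len(total) - 1  # adjacent Gaussian images are subtracted
```

Because each cascade step uses a smaller sigma than the corresponding total blur, its kernel is shorter, which is one reason the choice of initial sigma and interval count affects extraction time.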

Classification Algorithms for Human and Dog Movement Based on Micro-Doppler Signals

  • Lee, Jeehyun; Kwon, Jihoon; Bae, Jin-Ho; Lee, Chong Hyun
    • IEIE Transactions on Smart Processing and Computing / v.6 no.1 / pp.10-17 / 2017
  • We propose classification algorithms for human and dog movement. The proposed algorithms use micro-Doppler signals obtained from humans and dogs moving in four different directions. A two-stage classifier based on a support vector machine (SVM) is proposed, which uses a radial basis function (RBF) kernel and $16^{th}$-order linear predictive code (LPC) coefficients as feature vectors. With the proposed algorithms, we obtain the best classification results when a first-level SVM classifies the type of movement and a second-level SVM then classifies the moving object. The average correct classification probability is 95.54%. Next, to deal with the difficult problem of distinguishing running humans from running dogs, we propose a two-layer convolutional neural network (CNN). The proposed CNN is composed of six ($6{\times}6$) convolution filters at the first and second layers, with ($5{\times}5$) max pooling for the first layer and ($2{\times}2$) max pooling for the second layer. The proposed CNN-based classifier adopts as its feature image an autoregressive spectrogram obtained from the $16^{th}$-order LPC vectors for a specific time duration. The proposed CNN exhibits 100% classification accuracy and outperforms the SVM-based classifier. These results show that the proposed classifiers can be used for human and dog classification systems and also for classification problems using data obtained from an ultra-wideband (UWB) sensor.
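
The layer sizes quoted in this abstract determine how quickly the feature image shrinks. The arithmetic below is a sketch under stated assumptions: unpadded stride-1 convolution, non-overlapping max pooling, and a hypothetical 70 x 70 input spectrogram, since the abstract does not give the input size.

```python
def conv_pool_size(size, kernel, pool):
    """Side length after a valid stride-1 convolution and non-overlapping pooling."""
    conv_out = size - kernel + 1  # valid convolution, no padding
    return conv_out // pool       # pooling windows that do not overlap


input_size = 70  # hypothetical; not stated in the abstract
after_layer1 = conv_pool_size(input_size, kernel=6, pool=5)    # 65 -> 13
after_layer2 = conv_pool_size(after_layer1, kernel=6, pool=2)  # 8 -> 4

# With six feature maps in the second layer, the flattened feature length is:
flat_features = 6 * after_layer2 * after_layer2
```

Under these assumptions the classifier stage would see 96 values per spectrogram; a different input size or padding choice changes that number.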

A new lightweight network based on MobileNetV3

  • Zhao, Liquan; Wang, Leilei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.1 / pp.1-15 / 2022
  • MobileNetV3 is specially designed for mobile devices with limited memory and computing power. To reduce the number of network parameters and improve inference speed, a new lightweight network based on MobileNetV3 is proposed. First, to reduce the computation of residual blocks, a partial residual structure is designed by dividing the input feature maps into two parts. This partial residual structure replaces the residual block in MobileNetV3. Second, a dual-path feature extraction structure is designed to further reduce the computation of MobileNetV3. Different convolution kernel sizes are used in the two paths to extract feature maps of different sizes. In addition, a transition layer is designed for fusing features to reduce the influence of the new structure on accuracy. The CIFAR-100 and ImageNet datasets are used to test the performance of the proposed partial residual structure. A ResNet based on the proposed partial residual structure has fewer parameters and FLOPs than the original ResNet. The performance of the improved MobileNetV3 is tested on the CIFAR-10, CIFAR-100, and ImageNet image classification datasets. Compared with MobileNetV3, GhostNet, and MobileNetV2, the improved MobileNetV3 has fewer parameters and FLOPs. The improved MobileNetV3 is also tested on a CPU and a Raspberry Pi, where it is faster than the other networks.
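
Why splitting the feature maps saves computation can be seen from a simple parameter count. The sketch assumes one plausible reading of the partial residual idea, namely that a standard convolution is applied to only half of the channels while the other half bypasses it; the abstract does not spell out the exact block layout, and the channel count of 64 and the kernel sizes are hypothetical.

```python
def conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution layer (bias ignored)."""
    return kernel * kernel * c_in * c_out


channels = 64  # hypothetical channel count

# Convolution over all channels versus over one half of them (assumed split).
full = conv_params(3, channels, channels)
half = conv_params(3, channels // 2, channels // 2)
saving_ratio = full / half

# Kernel size matters in the same way for paths with different kernel sizes.
path_3x3 = conv_params(3, channels // 2, channels // 2)
path_5x5 = conv_params(5, channels // 2, channels // 2)
```

Halving both input and output channels cuts the weight count by a factor of four, which is the kind of saving a channel split can offer before any transition layer is added.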

Compare the Clinical Tissue Dose Distributions to the Derived from the Energy Spectrum of 15 MV X Rays Linear Accelerator by Using the Transmitted Dose of Lead Filter (연(鉛)필터의 투과선량을 이용한 15 MV X선의 에너지스펙트럼 결정과 조직선량 비교)

  • Choi, Tae-Jin; Kim, Jin-Hee; Kim, Ok-Bae
    • Progress in Medical Physics / v.19 no.1 / pp.80-88 / 2008
  • Recent radiotherapy dose planning systems (RTPS) generally adopt kernel beams with the convolution method to compute tissue dose. To obtain the depth dose and the dose profile at a given depth for a given photon beam, the energy spectrum was reconstructed from the attenuated dose transmitted through a filter using iterative numerical analysis. The experiments were performed with 15 MV X-rays (Oncor, Siemens) and an ionization chamber (0.125 cc, PTW) for measuring the filter-transmitted dose. The energy spectrum of the 15 MV X-rays was determined from the attenuated dose transmitted through lead filters 0.51 cm to 8.04 cm thick, with an energy interval of 0.25 MeV. The peak flux appeared at 3.75 MeV, and the mean energy of the 15 MV X-rays was 4.639 MeV. The transmitted dose through the lead filter agreed within 0.6% on average, with a maximum discrepancy of 2.5% at a lead thickness of 5 cm. Since the tissue dose depends strongly on the beam energy, the lateral dose was derived from the lateral spread of energy fluence through the flattening filter shape at tangents of 0.075 and 0.125, which gave 4.211 MeV and 3.906 MeV, respectively. The analyzed energy spectrum was applied to obtain the percent depth dose in the RTPS (XiO, Version 4.3.1, CMS). The generated percent depth dose for field sizes from $6{\times}6cm^2$ to $30{\times}30cm^2$ agreed with experimental measurements within 1% on average. The computed dose profiles agreed with measurement within 1% for a field size of $10{\times}10cm$, while the larger field sizes agreed within 2%. The resulting algorithm produced an X-ray spectrum that matched in both quality and quantity, with small discrepancy.
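
The mean energy quoted in this abstract is a fluence-weighted average over the energy bins of the reconstructed spectrum. The sketch below shows only that averaging step. The triangular spectrum is a toy shape invented for illustration on a 0.25 MeV grid; it is not the measured spectrum and is not expected to reproduce the reported 4.639 MeV.

```python
def mean_energy(energies, fluence):
    """Fluence-weighted mean energy: sum(E_i * phi_i) / sum(phi_i)."""
    total = sum(fluence)
    return sum(e * f for e, f in zip(energies, fluence)) / total


# Energy bins from 0.25 MeV to 15.0 MeV in 0.25 MeV steps.
energies = [0.25 * i for i in range(1, 61)]

# Toy spectrum (hypothetical): rises linearly to a peak, then falls to zero
# at the 15 MV end point.
peak = 3.75
fluence = [e / peak if e <= peak else (15.0 - e) / (15.0 - peak)
           for e in energies]

e_mean = mean_energy(energies, fluence)
e_peak = energies[fluence.index(max(fluence))]
```

In the reconstruction described, the fluence values themselves would come from iteratively fitting the measured lead-filter transmission, which this sketch does not attempt.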


Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Mode (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon; Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.141-154 / 2019
  • Internet technology and social media are growing rapidly. Data mining technology has evolved to handle unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish low-quality from high-quality content using text data about products, and it has proliferated within text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined categories such as positive and negative. It has been studied in various directions in terms of accuracy, from simple rule-based approaches to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active research areas in natural language processing and is widely studied in text mining. Because online reviews are openly available, they are not only easy to collect but also affect business. In marketing, real-world information from customers is gathered on websites, not through surveys. Whether a website's posts are positive or negative is reflected in sales, so companies try to identify this information. However, many reviews on a website are not always good and are difficult to identify. Earlier studies in this area used review data from the Amazon.com shopping mall, whereas recent studies use data such as stock market trends, blogs, news articles, weather forecasts, IMDB, and Facebook. However, a lack of accuracy is recognized because sentiment calculations change according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify sentiment polarity into positive and negative categories and to increase the prediction accuracy of polarity analysis using the pretrained IMDB review data set.
First, popular machine learning algorithms for text classification, such as NB (naive Bayes), SVM (support vector machines), XGBoost, RF (random forests), and Gradient Boost, are adopted as comparative models. Second, deep learning has demonstrated the ability to extract complex, discriminative features from data. Representative algorithms are CNN (convolutional neural networks), RNN (recurrent neural networks), and LSTM (long short-term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but it does not consider sequential data attributes. RNN handles ordered data well because it takes into account the time information of the data, but it suffers from the long-term dependency problem. LSTM is used to solve this problem. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to classical machine learning algorithms, CNN, LSTM, and the integrated models were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination. We also tried to figure out how well the models work for sentiment analysis and how they work. This study proposes an integrated CNN and LSTM algorithm to extract the positive and negative features of text. The reasons for combining these two algorithms are as follows. CNN can extract features for classification automatically by applying convolution layers and massively parallel processing. LSTM is not capable of highly parallel processing. Like faucets, the LSTM has input, output, and forget gates that can be opened and closed at a desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can compensate for the CNN's inability to capture long-term dependencies.
Furthermore, when LSTM is used in the CNN's pooling layer, the model has an end-to-end structure, so that spatial and temporal features can be designed simultaneously. The combined CNN-LSTM model achieved 90.33% accuracy. It is slower than CNN but faster than LSTM. The presented model was more accurate than the other models. In addition, each word embedding layer can be improved when training the kernel step by step. CNN-LSTM can compensate for the weakness of each model, and it has the advantage of improving layer-wise learning through the end-to-end structure of LSTM. For these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
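
The faucet analogy for the LSTM gates can be made concrete with a single scalar cell. This is a textbook LSTM step written as a sketch, not the model trained in the study: the weight names in the dictionary, the scalar state, and the example values are all assumptions for illustration, and no CNN front end is included.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell; `w` holds hypothetical weights."""
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate value
    c = f * c_prev + i * g   # memory: keep part of the old, admit part of the new
    h = o * math.tanh(c)     # hidden state exposed to the next layer
    return h, c


# Gates set to "remember everything, admit nothing": forget gate near 1,
# input gate near 0, so the memory cell carries its value across time steps.
keep = {"wi": 0.0, "ui": 0.0, "bi": -20.0,
        "wf": 0.0, "uf": 0.0, "bf": 20.0,
        "wo": 0.0, "uo": 0.0, "bo": 0.0,
        "wg": 1.0, "ug": 0.0, "bg": 0.0}

h, c = 0.0, 0.5
for x in [1.0, -2.0, 3.0, 0.5]:  # hypothetical input sequence
    h, c = lstm_step(x, h, c, keep)
```

With these gate settings the cell state stays close to its initial value however long the sequence is, which is the long-term memory behavior the abstract relies on. In an integrated model the inputs `x` would be features produced by the convolution and pooling layers rather than raw numbers.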