• Title/Summary/Keyword: Network Feature Extraction

491 search results

Unsupervised Monocular Depth Estimation Using Self-Attention for Autonomous Driving (자율주행을 위한 Self-Attention 기반 비지도 단안 카메라 영상 깊이 추정)

  • Seung-Jun Hwang;Sung-Jun Park;Joong-Hwan Baek
    • Journal of Advanced Navigation Technology
    • /
    • v.27 no.2
    • /
    • pp.182-189
    • /
    • 2023
  • Depth estimation is a key technology for 3D map generation in the autonomous driving of vehicles, robots, and drones. Existing sensor-based methods are accurate but expensive and low-resolution, while camera-based methods are more affordable and offer higher resolution. In this study, we propose self-attention-based unsupervised monocular depth estimation for a UAV camera system. A self-attention operation is applied to the network to improve global feature extraction, and the weight size of the self-attention operation is reduced to lower the computational cost. The estimated depth and camera pose are transformed into a point cloud, which is mapped into a 3D map using an octree-based occupancy grid. The proposed network is evaluated on synthesized image and depth sequences from the Mid-Air dataset and demonstrates a 7.69% reduction in error compared to prior studies.
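
The global self-attention step this abstract describes can be sketched as a NumPy toy: projecting a flattened feature map to reduced-width queries, keys, and values (a stand-in for the paper's weight-size reduction, not its actual implementation) and attending over all spatial positions.

```python
import numpy as np

def spatial_self_attention(feat, wq, wk, wv):
    """Scaled dot-product self-attention over spatial positions.

    feat: (H*W, C) flattened feature map; wq/wk/wv project C -> C_r
    with C_r < C, a hypothetical way to shrink the attention weights.
    """
    q, k, v = feat @ wq, feat @ wk, feat @ wv
    scores = q @ k.T / np.sqrt(k.shape[1])       # (HW, HW) global affinities
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # softmax over positions
    return attn @ v                              # (HW, C_r) attended features

rng = np.random.default_rng(0)
hw, c, c_r = 16, 8, 4                            # 4x4 map, 8 channels -> 4
out = spatial_self_attention(rng.normal(size=(hw, c)),
                             rng.normal(size=(c, c_r)),
                             rng.normal(size=(c, c_r)),
                             rng.normal(size=(c, c_r)))
```

Because every position attends to every other, the output mixes global context into each location, which is the property the paper exploits for depth cues.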

Multi-classification Sensitive Image Detection Method Based on Lightweight Convolutional Neural Network

  • Yueheng Mao;Bin Song;Zhiyong Zhang;Wenhou Yang;Yu Lan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.5
    • /
    • pp.1433-1449
    • /
    • 2023
  • In recent years, the rapid development of social networks has led to a rapid increase in the amount of information on the Internet, including a large amount of sensitive content related to pornography, politics, and terrorism. For sensitive image detection, existing machine learning algorithms suffer from large model size, long training time, and slow detection speed during auditing and supervision. To detect sensitive images more accurately and quickly, this paper proposes a multi-classification sensitive image detection method based on a lightweight Convolutional Neural Network. Building on the EfficientNet model, the method adopts the Ghost Module idea of GhostNet and adds an SE channel attention mechanism inside the Ghost Module for feature extraction. Experiments on the sensitive image dataset constructed in this paper show that the proposed method reaches 94.46% accuracy in sensitive information detection, higher than comparable methods. The model is then pruned through an ablation experiment and the activation function is replaced with Hard-Swish, reducing the parameters of the original model by 54.67%. While preserving accuracy, the detection time for a single image drops from 8.88 ms to 6.37 ms. These results demonstrate that the method improves the precision of multi-class sensitive image detection and achieves higher accuracy than comparable algorithms with a significantly more lightweight model.
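
Two of the ingredients named here, the SE channel attention and the Hard-Swish activation, are standard and can be sketched in a few lines of NumPy; this is an illustrative toy, not the paper's EfficientNet/GhostNet pipeline.

```python
import numpy as np

def hard_swish(x):
    """Hard-Swish activation, x * ReLU6(x + 3) / 6, the cheap
    replacement activation mentioned in the abstract."""
    return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0

def se_scale(feat, w1, w2):
    """Squeeze-and-Excitation channel attention (sketch).

    feat: (C, H, W); w1: (C, C//r); w2: (C//r, C). Squeeze by global
    average pooling, excite with two FC layers, gate with a sigmoid.
    """
    z = feat.mean(axis=(1, 2))                  # squeeze: one value per channel
    h = np.maximum(z @ w1, 0.0)                 # excitation FC + ReLU
    gates = 1.0 / (1.0 + np.exp(-(h @ w2)))     # FC + sigmoid, gates in (0, 1)
    return feat * gates[:, None, None]          # reweight each channel

rng = np.random.default_rng(1)
feat = rng.normal(size=(8, 4, 4))               # 8 channels, 4x4 feature map
out = se_scale(feat, rng.normal(size=(8, 2)), rng.normal(size=(2, 8)))
```

Since the gates lie in (0, 1), SE can only attenuate channels, letting the network learn which feature maps matter for each input.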

Features Extraction for Classifying Parkinson's Disease Based on Gait Analysis (걸음걸이 분석 기반의 파킨슨병 분류를 위한 특징 추출)

  • Lee, Sang-Hong;Lim, Joon-S.;Shin, Dong-Kun
    • Journal of Internet Computing and Services
    • /
    • v.11 no.6
    • /
    • pp.13-20
    • /
    • 2010
  • This paper presents a measure for classifying healthy persons and Parkinson's disease patients from foot-pressure data, using gait-analysis-based feature extraction and a Neural Network with Weighted Fuzzy Membership Functions (NEWFM). To extract the NEWFM inputs, in the first step, four features each were extracted from the foot-pressure data provided by PhysioBank and from the changes in foot pressure over time. In the second step, wavelet coefficients were extracted from the eight features of the previous stage using the wavelet transform (WT). In the final step, 40 inputs were extracted from the wavelet coefficients using statistical methods, including the frequency distribution of the signals and the amount of variability in that distribution. When healthy persons and Parkinson's disease patients were classified using the eight features extracted from the foot-pressure data, NEWFM showed high accuracy for the features based on the differences between left and right foot pressure and for those based on differences in the changes in foot pressure over time. These results indicate that the differences between the left and right foot pressures of Parkinson's disease patients, who characteristically drag their feet while walking, are relatively smaller than those of healthy persons.
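
The wavelet-plus-statistics step can be illustrated with the simplest wavelet, the Haar transform; the paper does not state which wavelet it uses, so this is an assumption made for the sketch.

```python
import math

def haar_dwt(signal):
    """One level of the Haar wavelet transform: approximation
    (scaled pairwise sums) and detail (scaled pairwise differences)."""
    approx = [(a + b) / math.sqrt(2) for a, b in zip(signal[::2], signal[1::2])]
    detail = [(a - b) / math.sqrt(2) for a, b in zip(signal[::2], signal[1::2])]
    return approx, detail

def coeff_stats(coeffs):
    """Mean and standard deviation of a coefficient band -- the kind of
    summary statistic condensed into a classifier input."""
    mean = sum(coeffs) / len(coeffs)
    var = sum((c - mean) ** 2 for c in coeffs) / len(coeffs)
    return mean, math.sqrt(var)

approx, detail = haar_dwt([4.0, 4.0, 6.0, 2.0])  # toy foot-pressure samples
```

A smooth pressure signal yields near-zero detail coefficients, so the detail-band statistics capture exactly the kind of gait variability the paper feeds to NEWFM.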

A Study on Face Recognition using Neural Networks and Characteristics Extraction based on Differential Image and DCT (차영상과 DCT 기반 특징 추출과 신경망을 이용한 얼굴 인식에 관한 연구)

  • 임춘환;고낙용;박종안
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.24 no.8B
    • /
    • pp.1549-1557
    • /
    • 1999
  • In this paper, we propose a face recognition algorithm based on the differential image method and the DCT. The algorithm uses a neural network, which is robust to noise. Under identical conditions (the same luminous intensity and the same distance from a fixed CCD camera to the face), we captured two images: one without a human face and one containing it. The differential image method is used to separate the second image into a face region and a background region. We then extract a square area from the face region based on the edge distribution; this square region, containing the eyebrows, eyes, nose, and mouth, serves as the characteristic region of the face. After applying the DCT to the square region, we extract feature vectors, which are normalized and used as input vectors for the neural network. Simulation results on 30 persons show a 100% recognition rate for learned face images and a 92.25% recognition rate for unlearned ones.
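
The two core operations, differencing the background and face images and taking a 2-D DCT of the extracted square region, can be sketched as follows; the threshold and block size are illustrative, not the paper's values.

```python
import math

def face_mask(background, scene, threshold):
    """Differential-image step: pixels that differ between the empty
    background shot and the shot with a person mark the subject region."""
    return [[abs(a - b) > threshold for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(background, scene)]

def dct2(block):
    """Naive 2-D DCT-II of a square region; its low-frequency
    coefficients form the face feature vector."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    for x in range(n) for y in range(n))
            cu = math.sqrt((1 if u == 0 else 2) / n)
            cv = math.sqrt((1 if v == 0 else 2) / n)
            out[u][v] = cu * cv * s
    return out

coeffs = dct2([[1.0, 1.0], [1.0, 1.0]])  # constant block: energy in DC only
```

The DCT compacts the region's energy into a few low-frequency coefficients, which is why a short, normalized feature vector suffices as neural-network input.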

Extracting Core Events Based on Timeline and Retweet Analysis in Twitter Corpus (트위터 문서에서 시간 및 리트윗 분석을 통한 핵심 사건 추출)

  • Tsolmon, Bayar;Lee, Kyung-Soon
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.1 no.1
    • /
    • pp.69-74
    • /
    • 2012
  • Many Internet users focus on issues posted to social network services within a very short time. When a big social issue or event occurs, it affects the number of comments and retweets on Twitter that day. In this paper, we propose a method for extracting core events based on timeline analysis, sentiment features, and retweet information in Twitter data. To validate the method, we compared approaches using only word frequency, word frequency with sentiment analysis, only the chi-square method, and sentiment analysis combined with the chi-square method. For justification of the proposed approach, we evaluated the accuracy of correct answers among the top 10 results; the proposed method achieved 94.9%. The experimental results show that the proposed method is effective for extracting core events from a Twitter corpus.
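
The chi-square scoring the comparison relies on is the standard 2x2 contingency statistic; a minimal sketch, with the cell meanings chosen here as a plausible mapping to the timeline setting:

```python
def chi_square(a, b, c, d):
    """2x2 chi-square score for a candidate event term:
    a = tweets on the target day containing the term,
    b = tweets on the target day without it,
    c = tweets on other days containing it,
    d = tweets on other days without it."""
    n = a + b + c + d
    num = n * (a * d - b * c) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    return num / den if den else 0.0
```

A term that bursts on one day (large a, small c) scores high, so ranking terms by this score surfaces candidate core events for that date.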

Corroded and loosened bolt detection of steel bolted joints based on improved you only look once network and line segment detector

  • Youhao Ni;Jianxiao Mao;Hao Wang;Yuguang Fu;Zhuo Xi
    • Smart Structures and Systems
    • /
    • v.32 no.1
    • /
    • pp.23-35
    • /
    • 2023
  • Steel bolted joints are an important part of steel structures, and their damage directly affects the bearing capacity and durability of the structure. Existing research mainly identifies corroded bolts and loosened bolts separately, and studies covering multiple states are rare. This study proposes a detection framework for corroded and loosened bolts, with the following innovations: (i) a Vision Transformer (ViT) replaces the third and fourth C3 modules of the you-only-look-once version 5s (YOLOv5s) algorithm, which increases the attention weights of feature channels and the feature extraction capability; (ii) three bolt states are considered: corroded bolt, missing bolt, and clean bolt; (iii) a line segment detector (LSD) is introduced for bolt rotation angle calculation, enabling bolt looseness detection. The improved YOLOv5s model was validated on the dataset, raising the mean average precision (mAP) from 0.902 to 0.952. On a lab-scale joint, the LSD algorithm and the Hough transform were compared from different perspective angles; the error in the bolt loosening angle was within 1.09% for the LSD algorithm, versus 8.91% for the Hough transform. Furthermore, the framework was applied to full-scale joints of a steel bridge in China, where synthetic images of loosened bolts were successfully identified and the multiple states were well detected. The proposed framework can therefore serve as an alternative for management departments monitoring steel bolted joints.
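
The rotation-angle step can be sketched with stdlib math: once the LSD returns a line segment on the bolt head, its orientation and the change between inspections give the loosening angle. The 180-degree folding is an assumption here, since a segment has no inherent direction.

```python
import math

def segment_angle(x1, y1, x2, y2):
    """Orientation of a detected line segment in degrees, folded into
    [0, 180) because a segment has no inherent direction."""
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

def loosening_angle(before, after):
    """Smallest rotation between two segment orientations, i.e. the
    estimated bolt loosening angle between two inspections."""
    d = abs(after - before) % 180.0
    return min(d, 180.0 - d)

angle = segment_angle(0.0, 0.0, 1.0, 1.0)   # a 45-degree edge on the bolt head
```

Comparing the same edge across inspection images turns a pure detection pipeline into a looseness measurement, which is the role the LSD plays in the framework.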

RoutingConvNet: A Light-weight Speech Emotion Recognition Model Based on Bidirectional MFCC (RoutingConvNet: 양방향 MFCC 기반 경량 음성감정인식 모델)

  • Hyun Taek Lim;Soo Hyung Kim;Guee Sang Lee;Hyung Jeong Yang
    • Smart Media Journal
    • /
    • v.12 no.5
    • /
    • pp.28-35
    • /
    • 2023
  • In this study, we propose RoutingConvNet, a new light-weight model with fewer parameters, to improve the applicability and practicality of speech emotion recognition. To reduce the number of learnable parameters, the proposed model concatenates bidirectional MFCCs channel-by-channel to learn long-term emotional dependencies and extract contextual features. A light-weight deep CNN is constructed for low-level feature extraction, and self-attention is used to capture channel and spatial information in the speech signal. In addition, dynamic routing is applied to improve accuracy and make the model robust to feature variation. Across experiments on the speech emotion datasets EMO-DB, RAVDESS, and IEMOCAP, the proposed model reduces parameters while improving accuracy, achieving 87.86%, 83.44%, and 66.06% accuracy respectively with about 156,000 parameters. We also propose a metric that quantifies the trade-off between parameter count and accuracy for evaluating light-weight models.
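
The abstract does not spell out how the bidirectional MFCCs are formed; one plausible reading, sketched below as an assumption rather than the paper's method, is stacking the MFCC matrix with its time-reversed copy as two input channels.

```python
import numpy as np

def bidirectional_mfcc(mfcc):
    """Stack an MFCC matrix (frames x coefficients) with its
    time-reversed copy as two channels -- a hypothetical reading of the
    paper's channel-wise bidirectional MFCC input."""
    return np.stack([mfcc, mfcc[::-1]], axis=0)   # (2, T, n_mfcc)

mfcc = np.arange(12.0).reshape(6, 2)              # toy 6-frame, 2-coefficient MFCC
feat = bidirectional_mfcc(mfcc)
```

Pairing each frame with its mirror-position frame gives convolutions simultaneous access to past and future context without adding any learnable parameters.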

Speech Recognition for the Korean Vowel 'ㅣ' based on Waveform-feature Extraction and Neural-network Learning (파형 특징 추출과 신경망 학습 기반 모음 'ㅣ' 음성 인식)

  • Rho, Wonbin;Lee, Jongwoo;Lee, Jaewon
    • KIISE Transactions on Computing Practices
    • /
    • v.22 no.2
    • /
    • pp.69-76
    • /
    • 2016
  • With the recent increase of interest in IoT across almost all areas of industry, computing technologies are being applied in human environments such as houses, buildings, cars, and streets; in these IoT environments, speech recognition is widely accepted as a means of HCI. Existing server-based speech recognition techniques are typically fast and show quite high recognition rates; however, they require an Internet connection and complicated server computing, because a voice is recognized in units of words stored in server databases. Continuing our research on speech recognition algorithms for the Korean phonemic vowels 'ㅏ' and 'ㅓ', this paper presents an implementation of speech recognition algorithms for the Korean phonemic vowel 'ㅣ'. We observed that almost all vocal waveform patterns for 'ㅣ' are unique and distinct from the 'ㅏ' and 'ㅓ' waveform patterns. We propose specific waveform patterns for the Korean vowel 'ㅣ' and the corresponding recognition algorithms, and present experimental results showing that adding neural-network learning to our algorithm increases the recognition success rate: 90% or more of vocal expressions of the vowel 'ㅣ' were successfully recognized with our algorithms.
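
The paper's exact waveform patterns are not given in the abstract; as an illustration of the kind of cheap, on-device waveform feature such a system can use, here is a zero-crossing rate, a classic discriminator of vowel spectral content (an assumption for the sketch, not the paper's feature).

```python
def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ; a simple
    waveform feature separating high- from low-frequency-dominated
    vowels without any server-side computation."""
    crossings = sum(1 for a, b in zip(samples, samples[1:])
                    if (a < 0) != (b < 0))
    return crossings / (len(samples) - 1)

rate = zero_crossing_rate([1.0, -1.0, 1.0, -1.0])  # alternating: every pair crosses
```

Features like this can be computed per frame on a microcontroller and fed directly to a small neural network, matching the paper's goal of avoiding server-based word-unit recognition.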

Traffic Flow Prediction Model Based on Spatio-Temporal Dilated Graph Convolution

  • Sun, Xiufang;Li, Jianbo;Lv, Zhiqiang;Dong, Chuanhao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.9
    • /
    • pp.3598-3614
    • /
    • 2020
  • With the increase of motor vehicles and tourism demand, traffic problems such as congestion, safety accidents, and insufficient allocation of traffic resources gradually appear. Facing these challenges, a Spatio-Temporal Dilated Graph Convolutional Network (STDGCN) is proposed to help extract highly nonlinear and complex characteristics and accurately predict future traffic flow. In particular, we model traffic as undirected graphs, on which graph convolutions are built to extract spatial feature information, and deploy dilated convolution within the graph convolution to capture multi-scale contextual messages. The proposed STDGCN integrates the dilated convolution into the graph convolution, extracting the spatial and temporal characteristics of traffic flow data as well as road-occupancy features. To assess the proposed model, we compare it with four rival models using four evaluation indicators. The experimental results show STDGCN's effectiveness: prediction accuracy improves by 17% over traditional prediction methods on various real-world traffic datasets.
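
The two building blocks named here, a graph convolution over the road network and a dilated convolution over time, can each be sketched in a few lines; this NumPy toy shows the standard forms, not the paper's integrated STDGCN layer.

```python
import numpy as np

def graph_conv(x, adj, w):
    """One graph-convolution layer: symmetric-normalized adjacency
    aggregation, linear projection, ReLU (standard GCN form)."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt      # D^-1/2 (A+I) D^-1/2
    return np.maximum(a_norm @ x @ w, 0.0)

def dilated_conv1d(seq, kernel, dilation):
    """Temporal dilated convolution: kernel taps spaced `dilation`
    steps apart, widening the receptive field at no extra weight cost."""
    k = len(kernel)
    span = (k - 1) * dilation
    return [sum(kernel[i] * seq[t - (k - 1 - i) * dilation] for i in range(k))
            for t in range(span, len(seq))]

adj = np.array([[0.0, 1.0], [1.0, 0.0]])          # two connected road nodes
out = graph_conv(np.ones((2, 3)), adj, np.ones((3, 2)))
```

Stacking dilated temporal filters on top of graph-convolved node features is what lets a model of this family see both neighboring roads and multi-scale history at once.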

Fall Detection Based on Human Skeleton Keypoints Using GRU

  • Kang, Yoon-Kyu;Kang, Hee-Yong;Weon, Dal-Soo
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.12 no.4
    • /
    • pp.83-92
    • /
    • 2020
  • Recent studies on fall detection have focused on analyzing fall motions with recurrent neural networks (RNNs), using deep learning approaches that detect 2D human poses from a monocular color image with good results. In this paper, we investigate an improved detection method that estimates the positions of the head and shoulder key points and the acceleration of their position change, using the skeletal key-point information extracted with PoseNet from images captured by a low-cost 2D RGB camera, to increase the accuracy of fall judgment. In particular, we propose a fall detection method based on the characteristics of the post-fall posture, on the velocity of change of the human skeleton key points, and on the change in the width-to-height ratio of the body's bounding box. A public dataset was used to extract human skeletal features and to train the deep-learning GRU model; in experiments searching for a feature extraction method with high classification accuracy, the proposed method detected falls with a 99.8% success rate, more effectively than the conventional method using raw skeletal data.
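
The per-frame features feeding the GRU can be sketched from the abstract's description: key-point velocity plus the bounding-box width-to-height ratio. The exact key-point subset and normalization are the paper's; this toy uses arbitrary points.

```python
import math

def fall_features(prev_kps, cur_kps, dt):
    """Per-frame fall cues (sketch): mean key-point speed and the
    width-to-height ratio of the key-point bounding box -- a fall shows
    a speed spike followed by a wide, flat box."""
    speeds = [math.hypot(x2 - x1, y2 - y1) / dt
              for (x1, y1), (x2, y2) in zip(prev_kps, cur_kps)]
    xs = [x for x, _ in cur_kps]
    ys = [y for _, y in cur_kps]
    aspect = (max(xs) - min(xs)) / max(max(ys) - min(ys), 1e-6)
    return sum(speeds) / len(speeds), aspect

# Two key points translated one unit right between frames (standing posture)
speed, aspect = fall_features([(0.0, 0.0), (0.0, 2.0)],
                              [(1.0, 0.0), (1.0, 2.0)], dt=1.0)
```

A sequence of such (speed, aspect) pairs is exactly the kind of compact, pose-derived time series a GRU can classify far more reliably than raw key-point coordinates.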