Search | Korea Science

Representative Batch Normalization for Scene Text Recognition

Sun, Yajie;Cao, Xiaoling;Sun, Yingying
- KSII Transactions on Internet and Information Systems (TIIS), v.16 no.7, pp.2390-2406, 2022
Scene text recognition has important application value and has attracted the interest of many researchers. Many existing methods achieve good results, but most of them attempt to improve recognition performance at the image level. They work well on regular scene text. However, recognizing text in low-quality images, such as those with curved, occluded, or blurred text, remains difficult, because uneven image quality makes feature extraction harder. In addition, test results are highly dependent on the training data, so there is still room for improvement in scene text recognition methods. In this work, we present a natural scene text recognizer that improves recognition performance at the feature level, through feature representation and feature enhancement. For feature representation, we propose an efficient feature extractor that combines Representative Batch Normalization with ResNet. It reduces the model's dependence on training data and improves its ability to represent the features of different instances. For feature enhancement, we use a feature enhancement network to expand the receptive field of the feature maps, so that they contain richer feature information. The enhanced feature representation capability helps improve the model's recognition performance. We conducted experiments on seven benchmarks, which show that the method is highly competitive in recognizing both regular and irregular text. The method achieved top-1 recognition accuracy on four benchmarks: IC03, IC13, IC15, and SVTP.
https://doi.org/10.3837/tiis.2022.07.015
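The abstract above does not spell out how Representative Batch Normalization works. As a rough illustration only, the sketch below follows the general idea of calibrating batch normalization with per-instance statistics: a centering step before normalization and a sigmoid-gated scaling step after it. The function name, the parameter names (`w_m`, `w_v`, `w_b`, `gamma`, `beta`), and the exact form of both calibration steps are assumptions, not details taken from this paper, and only training-mode batch statistics are handled.

```python
import numpy as np

def representative_batch_norm(x, w_m, w_v, w_b, gamma, beta, eps=1e-5):
    """Hypothetical sketch. x has shape (N, C, H, W); every weight has shape (C,)."""
    shape = (1, -1, 1, 1)  # broadcast per-channel weights over N, H, W

    # Centering calibration (assumed form): shift each instance by a learned
    # multiple of its own per-channel spatial mean.
    k_m = x.mean(axis=(2, 3), keepdims=True)
    x = x + w_m.reshape(shape) * k_m

    # Ordinary batch normalization over the batch and spatial axes.
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x = (x - mu) / np.sqrt(var + eps)

    # Scaling calibration (assumed form): gate each instance with a sigmoid
    # of its own normalized per-channel statistic.
    k_s = x.mean(axis=(2, 3), keepdims=True)
    gate = 1.0 / (1.0 + np.exp(-(w_v.reshape(shape) * k_s + w_b.reshape(shape))))
    x = x * gate

    # Usual learnable affine transform.
    return gamma.reshape(shape) * x + beta.reshape(shape)
```

With all three calibration weights set to zero the gate is a constant 0.5, so the layer reduces to plain batch normalization scaled by one half, which is a convenient sanity check.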

Feature Extraction in 3-Dimensional Object with Closed-surface using Fourier Transform (Fourier Transform을 이용한 3차원 폐곡면 객체의 특징 벡터 추출)

이준복;김문화;장동식
- Journal of the Institute of Convergence Signal Processing, v.4 no.3, pp.21-26, 2003
A new method for realizing a 3-dimensional object pattern recognition system using a Fourier-based feature extractor is proposed. The procedure for obtaining the invariant feature vector is as follows: A closed surface is generated by tracing the surface of the object in 3-dimensional polar coordinates. The centroidal distances between the object's geometric center and each point on the closed surface are calculated. The resulting distance vector is translation invariant. The distance vector is then normalized, which makes it scale invariant. The Fourier spectrum of each normalized distance vector is calculated, and the spectrum is rotation invariant. The Fourier-based feature generated by this procedure completely eliminates the effect of variations in translation, scale, and rotation of a 3-dimensional object with a closed surface. The experimental results show that the proposed method has high accuracy.
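The steps listed in this abstract (centroidal distance, normalization, Fourier spectrum) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the surface is assumed to be already sampled on a regular elevation-by-azimuth grid, normalization by the maximum distance is an arbitrary choice, and the function name `fourier_surface_feature` is hypothetical. In this simplified form the magnitude spectrum is invariant only to rotations about the axis that correspond to a circular shift along the azimuth direction.

```python
import numpy as np

def fourier_surface_feature(points):
    """points: array of shape (n_elev, n_azim, 3), surface samples on a regular angular grid."""
    points = np.asarray(points, dtype=float)

    # Distance from the geometric center to every surface sample.
    # Subtracting the centroid removes the effect of translation.
    centroid = points.reshape(-1, 3).mean(axis=0)
    dist = np.linalg.norm(points - centroid, axis=-1)

    # Dividing by the largest distance removes the effect of uniform scaling.
    dist = dist / dist.max()

    # A rotation about the polar axis circularly shifts each azimuth ring;
    # the FFT magnitude is unchanged by circular shifts.
    return np.abs(np.fft.fft(dist, axis=1))
```

Applying the function to a sampled surface and to a translated, scaled, or azimuth-shifted copy of it should return the same array up to floating-point error.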

Side Scan Sonar based Pose-graph SLAM (사이드 스캔 소나 기반 Pose-graph SLAM)

Gwon, Dae-Hyeon;Kim, Joowan;Kim, Moon Hwan;Park, Ho Gyu;Kim, Tae Yeong;Kim, Ayoung
- The Journal of Korea Robotics Society, v.12 no.4, pp.385-394, 2017
Side scan sonar (SSS) provides valuable information for robot navigation. However, the use of side scan sonar images in navigation has not been fully studied. In this paper, we use range data and side scan sonar images from the UnderWater Simulator (UWSim) and propose measurement models in a feature-based simultaneous localization and mapping (SLAM) framework. The range data are obtained from an echosounder, and the side scan sonar images from the side scan sonar module for UWSim. For features, we use A-KAZE to match SSS images and adjust the relative robot pose by SSS bundle adjustment (BA) with the Ceres solver. We use BA for the loop closure constraint of pose-graph SLAM. We use Incremental Smoothing and Mapping (iSAM) to optimize the graph. The optimized trajectory is compared against dead reckoning (DR).
https://doi.org/10.7746/jkros.2017.12.4.385
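To make the comparison between dead reckoning and a loop-closed pose graph concrete, here is a deliberately simplified sketch. It assumes 2D translation-only poses, so every constraint is linear and the whole graph can be solved in one batch with ordinary least squares. It does not use iSAM, Ceres, A-KAZE, or sonar data, and the function names are hypothetical.

```python
import numpy as np

def dead_reckoning(odometry):
    """Integrate relative 2D displacements, starting from the origin."""
    return np.vstack([np.zeros(2), np.cumsum(odometry, axis=0)])

def optimize_pose_graph(n_poses, edges):
    """edges: list of (i, j, z), each meaning pose[j] - pose[i] is measured as z."""
    rows, rhs = [], []
    for i, j, z in edges:
        row = np.zeros(n_poses)
        row[i], row[j] = -1.0, 1.0
        rows.append(row)
        rhs.append(z)
    # Anchor the first pose at the origin to remove the global offset ambiguity.
    anchor = np.zeros(n_poses)
    anchor[0] = 1.0
    rows.append(anchor)
    rhs.append([0.0, 0.0])
    A = np.array(rows)
    b = np.array(rhs, dtype=float)
    # Equal-weight linear least squares over all odometry and loop-closure edges.
    poses, *_ = np.linalg.lstsq(A, b, rcond=None)
    return poses

# Usage: a square path whose odometry drifts, plus one loop closure back to the start.
odom = np.array([[1.1, 0.0], [0.0, 1.1], [-1.0, 0.0], [0.0, -1.0]])
edges = [(k, k + 1, odom[k]) for k in range(4)] + [(0, 4, np.zeros(2))]
dr_path = dead_reckoning(odom)
opt_path = optimize_pose_graph(5, edges)
```

In this toy case the loop-closure edge pulls the final pose much closer to the start than dead reckoning alone. A real system would add rotations, measurement covariances, and an incremental solver.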

An Implementation of a Feature Extraction Hardware Accelerator based on Memory Usage Improvement SURF Algorithm (메모리 사용률을 개선한 SURF 알고리즘 특징점 추출기의 하드웨어 가속기 설계)

Jung, Chang-min;Kwak, Jae-chang;Lee, Kwang-yeob
- Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2013.10a, pp.77-80, 2013
SURF is an algorithm that extracts feature points and generates descriptors from input images. It is robust to changes in scale, rotation, illumination, and viewpoint. Because of these properties, it is used in many image processing applications such as object recognition, panorama construction, and 3D image restoration. However, recognition algorithms such as SURF require a large amount of computation, which is a disadvantage for real-time operation. In this paper, we propose a design of a SURF-based feature extractor and descriptor generator with high memory efficiency. The proposed design reduces memory accesses and memory usage so that it can operate in real time.

A Study on the Implementation of Hybrid Learning Rule for Neural Network (다층신경망에서 하이브리드 학습 규칙의 구현에 관한 연구)

Song, Do-Sun;Kim, Suk-Dong;Lee, Haing-Sei
- The Journal of the Acoustical Society of Korea, v.13 no.4, pp.60-68, 1994
In this paper we propose a new hybrid learning rule for multilayer feedforward neural networks, constructed by combining the Hebbian learning rule, which is a good feature extractor, with the Back-Propagation (BP) learning rule, which is an excellent classifier. Unlike the BP rule used in the multi-layer perceptron (MLP), the proposed hybrid learning rule is used to update all connection weights except the output connection weights, because Hebbian learning in the output layer does not guarantee learning convergence. To evaluate its performance, the proposed hybrid rule is applied to classification problems in two-dimensional space, where it shows better performance than the BP rule alone. In terms of learning speed, the proposed rule converges faster than conventional BP. For example, learning with the proposed hybrid rule takes 2/10 of the iterations required by BP, while its recognition rate improves by about $0.778\%$ at the peak.

Layer-wise Feature Extraction Capacity using Pre-trained CNN (사전학습된 CNN의 계층별 특징추출능력연구)

Lee, Jaehwan;Yoon, Sook;Park, Dong Sun
- Proceedings of the Korea Contents Association Conference, 2016.05a, pp.435-436, 2016
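Recently, Convolutional Neural Networks (CNNs) have been attracting attention in the field of object recognition. One characteristic of a CNN is that it learns by itself how to extract features from input images. Traditional object recognition methods use a hand-crafted feature extractor, whereas a CNN extracts features by itself. However, a CNN requires a large amount of training data and training time. We extracted features using a CNN pre-trained on object recognition data and performed people re-identification with these features. Although no training was performed in this process, the CNN showed relatively good performance on a different image processing application.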

ECG Pattern Classification Using Back Propagation Neural Network (역전달 신경회로망을 이용한 심전도 신호의 패턴분류에 관한 연구)

이제석;이정환;권혁제;이명호
- Journal of the Korean Institute of Telematics and Electronics B, v.30B no.6, pp.67-75, 1993
ECG patterns were classified using a back-propagation neural network. An improved ECG feature extractor is proposed for better classification capability. It consists of preprocessing the ECG signal with an FIR filter that is faster than a conventional one by a factor of 5, QRS complex recognition by moving-window integration, and peak extraction by quadratic approximation. Because the FIR filter has a periodic frequency spectrum, only one-fifth of the usual processing time is required. In addition, segmentation of the ECG signal followed by quadratic approximation of each segment enables accurate detection of both P and T waves. When the important features were extracted and fed into the back-propagation neural network for pattern classification, the required number of nodes in the hidden and input layers was reduced compared with using raw data as input, which also reduced the training time. Accurate pattern classification was possible with appropriate feature selection.
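Two of the steps named in this abstract, moving-window integration and peak extraction by quadratic approximation, can be illustrated with a short sketch. The abstract gives no parameters, so the window length, the three-point parabola fit, and both function names are assumptions chosen for illustration rather than the authors' design.

```python
import numpy as np

def moving_window_integration(signal, window):
    """Average the signal over a sliding window of `window` samples."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(signal, dtype=float), kernel, mode="same")

def refine_peak_quadratic(y, k):
    """Fit a parabola through samples k-1, k, k+1 and return (position, height) of its vertex."""
    ym, y0, yp = float(y[k - 1]), float(y[k]), float(y[k + 1])
    denom = ym - 2.0 * y0 + yp
    if denom == 0.0:
        # The three samples are collinear; there is no vertex to refine.
        return float(k), y0
    offset = 0.5 * (ym - yp) / denom
    height = y0 - 0.25 * (ym - yp) * offset
    return k + offset, height

# Usage: a parabola whose true peak lies between two samples.
x = np.arange(20)
y = 5.0 - (x - 10.3) ** 2
position, height = refine_peak_quadratic(y, int(np.argmax(y)))
```

The quadratic fit recovers a sub-sample peak location from only three samples, which is why it is a cheap way to locate wave peaks after a coarse detection step.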

Transformation Technique for Null Space-Based Linear Discriminant Analysis with Lagrange Method (라그랑지 기법을 쓴 영 공간 기반 선형 판별 분석법의 변형 기법)

Hou, Yuxi;Min, Hwang-Ki;Song, Iickho;Choi, Myeong Soo;Park, Sun;Lee, Seong Ro
- The Journal of Korean Institute of Communications and Information Sciences, v.38C no.2, pp.208-212, 2013
Because the within-class scatter matrix is singular, linear discriminant analysis (LDA) becomes ill-posed for small sample size (SSS) problems. An extension of LDA, the null space-based LDA (NLDA), provides good discriminant performance for SSS problems. In this paper, by applying the Lagrange technique, we derive a procedure that transforms the problem of finding the feature extractor of NLDA into a linear equation problem.
https://doi.org/10.7840/kics.2013.38C.2.208
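For readers unfamiliar with NLDA, the sketch below shows the conventional eigendecomposition route: restrict the data to the null space of the within-class scatter, then maximize between-class scatter there. It is not the Lagrange-based linear-equation procedure this paper derives, which the abstract does not detail. The function name `nlda` and the tolerance used to decide which eigenvalues count as zero are assumptions.

```python
import numpy as np

def nlda(X, y, tol=1e-10):
    """X: (n_samples, n_features); y: integer class labels. Returns a projection matrix."""
    classes = np.unique(y)
    d = X.shape[1]
    mean = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)

    # Basis of the null space of Sw: eigenvectors with (numerically) zero eigenvalue.
    evals, evecs = np.linalg.eigh(Sw)
    Q = evecs[:, evals <= tol * max(evals.max(), 1.0)]

    # Inside that null space, keep the directions with the largest between-class scatter.
    b_evals, b_evecs = np.linalg.eigh(Q.T @ Sb @ Q)
    order = np.argsort(b_evals)[::-1][: len(classes) - 1]
    return Q @ b_evecs[:, order]
```

Because the projection lies in the null space of the within-class scatter, all training samples of one class map to the same point, which is the property that makes NLDA attractive when there are fewer samples than features.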

Efficient Iris Recognition using Deep-Learning Convolution Neural Network (딥러닝 합성곱 신경망을 이용한 효율적인 홍채인식)

Choi, Gwang-Mi;Jeong, Yu-Jeong
- The Journal of the Korea institute of electronic communication sciences, v.15 no.3, pp.521-526, 2020
This paper presents an improved HOLP neural network that adds 25 average values to a typical HOLP neural network, which takes as input 25 feature vector values obtained by applying the high-order local autocorrelation function, a function well suited to extracting invariant feature values from iris images. To compare deep learning structures of different types, we compared the iris recognition rates obtained with a Back-Propagation neural network, which shows excellent performance in the voice and image fields, and a convolutional neural network, which integrates a feature extractor and a classifier.
https://doi.org/10.13067/JKIECS.2020.15.3.521

DA-Res2Net: a novel Densely connected residual Attention network for image semantic segmentation

Zhao, Xiaopin;Liu, Weibin;Xing, Weiwei;Wei, Xiang
- KSII Transactions on Internet and Information Systems (TIIS), v.14 no.11, pp.4426-4442, 2020
Since scene segmentation is becoming a hot topic in autonomous driving and medical image analysis, researchers are actively trying new methods to improve segmentation accuracy. At present, the main issues in image semantic segmentation are intra-class inconsistency and inter-class indistinction. Our analysis suggests that the two main reasons are the lack of global information and the lack of macroscopic discrimination of objects. In this paper, we propose a Densely connected residual Attention network (DA-Res2Net), which consists of a dense residual network and a channel attention guidance module, to deal with these problems and improve the accuracy of image segmentation. Specifically, to give the extracted features stronger multi-scale characteristics, a densely connected residual network is proposed as the feature extractor. Furthermore, to improve the representativeness of each channel feature, we design a Channel-Attention-Guide module that makes the model focus on high-level semantic features and low-level location features simultaneously. Experimental results show that the method achieves strong performance on various datasets. Compared to other state-of-the-art methods, the proposed method reaches a mean IoU of 83.2% on PASCAL VOC 2012 and 79.7% on the Cityscapes dataset.
https://doi.org/10.3837/tiis.2020.11.010
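The mean IoU figure quoted in this abstract is a standard segmentation metric. As a reminder of how it is usually computed, here is a small sketch based on a confusion matrix. The function name `mean_iou` and the choice to skip classes that appear in neither the prediction nor the ground truth are assumptions; benchmark toolkits may handle ignored labels differently.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """pred, target: integer label arrays of the same shape, values in [0, num_classes)."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predicted classes.
    conf = np.bincount(target * num_classes + pred,
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)
    inter = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    valid = union > 0  # ignore classes absent from both prediction and ground truth
    return float((inter[valid] / union[valid]).mean())

# Usage: class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```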
