Search | Korea Science

Research on Korea Text Recognition in Images Using Deep Learning (딥 러닝 기법을 활용한 이미지 내 한글 텍스트 인식에 관한 연구)

Sung, Sang-Ha; Lee, Kang-Bae; Park, Sung-Ho
- Journal of the Korea Convergence Society, v.11 no.6, pp.1-6, 2020
This study addresses character recognition, a field of computer vision. Optical character recognition (OCR), one of the most widely used character recognition techniques, suffers from a falling recognition rate when the recognition target deviates from a standard format. This study aims to address that limitation by applying deep learning techniques to character recognition. Because most character recognition studies have been limited to English text or digits, the recognition range was expanded through additional training on Korean text data. The result is a deep learning-based character recognition algorithm for Korean text. The algorithm scored 0.841 on the 1-NED evaluation metric, which is comparable to results for English recognition. Based on an analysis of the results, the paper also discusses major issues in Korean text recognition and possible directions for future work.
https://doi.org/10.15207/JKCS.2020.11.6.001
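
The 1-NED score in the abstract above is usually read as one minus a normalized edit distance between the predicted and reference strings. The following is a minimal sketch, assuming Levenshtein distance divided by the longer string's length and averaged over samples; the abstract does not state the exact normalization, and the function names are illustrative only.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def one_minus_ned(predictions, references):
    """Average of (1 - normalized edit distance) over paired strings.

    Assumption: each distance is normalized by the longer string's length.
    """
    if not predictions or len(predictions) != len(references):
        raise ValueError("need two non-empty lists of equal length")
    total = 0.0
    for pred, ref in zip(predictions, references):
        longest = max(len(pred), len(ref))
        ned = levenshtein(pred, ref) / longest if longest else 0.0
        total += 1.0 - ned
    return total / len(predictions)
```

Distances here are counted in Unicode code points, so a precomposed Hangul syllable counts as a single character.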

A Study on the Voiced, Unvoiced and Silence Classification (유.무성음 및 묵음 식별에 관한 연구)

김명환
- Proceedings of the Acoustical Society of Korea Conference, 1984.12a, pp.73-77, 1984
This paper reports on voiced/unvoiced/silence classification of speech for Korean speech recognition. It describes a method that uses a pattern recognition technique to classify a given speech segment into the three classes. The best result is obtained with the combination of ZCR, P1, and Ep, for which the classification error rate is less than 1%.
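
Of the three features named above, ZCR (zero-crossing rate) has a widely used definition that can be sketched briefly. The version below is one common variant, assuming a frame given as a sequence of samples and treating zero as non-negative; P1 and Ep are not defined in the abstract and are left out. The function name is illustrative.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Assumption: a sample equal to zero is treated as non-negative.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1
        for prev, curr in zip(frame, frame[1:])
        if (prev >= 0) != (curr >= 0)
    )
    return crossings / (len(frame) - 1)
```

Unvoiced speech tends to produce a higher zero-crossing rate than voiced speech, which is why the feature is often used in this kind of classifier.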

A Study on the Effectiveness of the Image Recognition Technique of Augmented Reality Contents (증강현실 콘텐츠의 이미지 인식 기법 효과성 연구)

Suh, Dong-Hee
- Cartoon and Animation Studies, s.41, pp.337-356, 2015
Augmented reality (AR) content is now widely used in public settings such as advertisements and exhibits, as well as in children's books, and the market for developing AR content is growing steadily. AR producers are likely to be familiar with using images as markers by means of an image recognition technique. When they use image recognition, they usually rely on the augmented reality marker platform from Qualcomm, since it can recognize self-produced images and 3-dimensional figures at no cost. This study began when undergraduate students started to use these general techniques in their content production process. Students majoring in AR at Namseoul University applied the image recognition technique to three AR works exhibited at the Sejong Center. They created three different images and registered them with the Image Target Manager provided by Vuforia for use as markers. Through their research, they also modified the image production method to raise the recognition rate, since a higher recognition rate makes AR content more stable to use. To reach a satisfactory rate, they compared elements such as color contrast and pattern on the platform, and from this derived an effective image creation method. The study aims to suggest how to produce stable content, including educational content, while taking the limitations of smart devices into account. Its purpose is to offer practical help to AR content developers by illustrating both the application of AR content based on image recognition and its effectiveness.
https://doi.org/10.7230/KOSCAS.2015.41.337

The Speaker Recognition System using the Pitch Alteration (피치변경을 이용한 화자인식 시스템)

Jung JongSoon; Bae MyungJin
- Proceedings of the Acoustical Society of Korea Conference, spring, pp.115-118, 2002
Parameters used in a speaker recognition system should fully express the speaker's characteristics present in speech. That is, a feature whose inter-speaker variance is larger than its intra-speaker variance is useful for distinguishing between speakers. Minimizing errors between speakers also requires improved recognition technology in addition to distinguishing features. Recent simulation results show that more accurate performance is obtained by using dynamic characteristics together with the constant characteristics that arise from speaking habits. We therefore propose using prosodic information as a feature vector of speech. The feature vector generally used in speaker recognition systems models spectral information and performs well in noise-free conditions. However, it becomes distorted in noisy conditions, which reduces the recognition rate. In this paper, we alter the pitch contour divided into segments, from which a dynamic characteristic can be estimated, and use it as a recognition feature. Simulation confirmed that this dynamic characteristic is very robust in noisy conditions. Acceptance or rejection is decided by comparison with the test pattern, and the recognition rate with the proposed algorithm improves on that obtained with spectral and prosodic information. In particular, the simulation shows that a stable recognition rate can be obtained in noisy conditions.

A Study on Lip-reading Enhancement Using Time-domain Filter (시간영역 필터를 이용한 립리딩 성능향상에 관한 연구)

신도성; 김진영; 최승호
- The Journal of the Acoustical Society of Korea, v.22 no.5, pp.375-382, 2003
Bimodal lip-reading aims to improve the speech recognition rate in noisy environments. Detecting the lip image correctly is the most important step, but stable performance is hard to achieve in dynamic environments because many factors degrade lip-reading performance: changes in illumination, the speaker's pronunciation habits, the variability of lip shape, and rotation or size changes of the lips. In this paper, we propose IIR filtering in the time domain for stable performance. Digital filtering in the time domain is well suited to removing noise from speech and improving recognition performance. While lip-reading on the whole lip image produces a massive amount of data, applying Principal Component Analysis (PCA) as a pre-processing step reduces the data quantity by extracting features without loss of image information. To observe speech recognition performance using image information only, we ran a recognition experiment on 22 words selected from those available in a car service. A Hidden Markov Model (HMM) was used as the speech recognition algorithm to compare recognition performance on these words. As a result, the recognition rate of lip-reading using PCA was 64%, while applying the time-domain filter raised the recognition rate to 72.4%.
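
The abstract above does not give the order or coefficients of its IIR filter. As an illustration of time-domain IIR filtering in general, the sketch below applies a first-order recursive low-pass filter, y[n] = alpha * x[n] + (1 - alpha) * y[n-1], to one feature trajectory (for example, one PCA coefficient tracked across video frames). The filter form, the alpha value, and the function name are assumptions, not the authors' design.

```python
def iir_smooth(sequence, alpha=0.5):
    """First-order IIR low-pass filter over a time series.

    y[0] = x[0]; y[n] = alpha * x[n] + (1 - alpha) * y[n-1].
    Assumption: alpha in (0, 1]; smaller alpha gives stronger smoothing.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x in sequence:
        if smoothed:
            y = alpha * x + (1.0 - alpha) * smoothed[-1]
        else:
            y = x  # initialize the filter state with the first sample
        smoothed.append(y)
    return smoothed
```

Each output depends on the previous output, which is what makes the filter recursive (infinite impulse response) rather than a fixed-length moving average.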

A Tracking Algorithm to Certain People Using Recognition of Face and Cloth Color and Motion Analysis with Moving Energy in CCTV (폐쇄회로 카메라에서 운동에너지를 이용한 모션인식과 의상색상 및 얼굴인식을 통한 특정인 추적 알고리즘)

Lee, In-Jung
- The KIPS Transactions:PartB, v.15B no.3, pp.197-204, 2008
Tracking a specific person is well known to be a much-needed technique for humanoid robots. Robot techniques must consider three aspects: clothing color matching, face recognition, and motion analysis. Because a robot relies on several sensors, tracking a specific person through CCTV images differs considerably from the robot technique. A CCTV-based system must be fast, so the amount of computation must be small. We use a statistical variable for color matching and adopt the eigenface method for face recognition to speed up the system. Motion analysis must also be added to make the detection system efficient. In many motion analysis systems, however, both speed and recognition rate are low because the system operates on the entire image area. In this paper, we compute the moving energy only on the face area located during face recognition, since the moving energy requires little computation. In experiments comparing the proposed algorithm with the method of Girondel, V. et al., we obtained the same recognition rate at a higher speed. When LDA was used, the speed was the same and the recognition rate was better than that of Girondel, V. et al.'s method. The proposed algorithm is therefore more efficient for tracking a specific person.
https://doi.org/10.3745/KIPSTB.2008.15-B.3.197

Automatic hand gesture area extraction and recognition technique using FMCW radar based point cloud and LSTM (FMCW 레이다 기반의 포인트 클라우드와 LSTM을 이용한 자동 핸드 제스처 영역 추출 및 인식 기법)

Seung-Tak Ra; Seung-Ho Lee
- Journal of IKEEE, v.27 no.4, pp.486-493, 2023
In this paper, we propose an automatic hand gesture area extraction and recognition technique using an FMCW radar-based point cloud and LSTM. The proposed technique differs from existing methods in two ways. First, unlike methods that use 2D images such as range-Doppler maps as input vectors, time-series point cloud input vectors are intuitive input data that represent movement occurring in front of the radar over time in a coordinate system. Second, because the input vector is small, the deep learning model used for recognition can also be lightweight. The technique is implemented as follows. Using the distance, speed, and angle information measured by the FMCW radar, a point cloud containing x, y, z coordinates and Doppler velocity is constructed. The hand gesture area is extracted automatically by identifying the start and end points of the gesture from the Doppler points obtained through speed information. The time-series point cloud corresponding to the extracted gesture interval is then used to train the LSTM deep learning model and to perform recognition. To evaluate the proposed technique objectively, we ran one experiment comparing MAE against other deep learning models and another comparing recognition rate against existing techniques. The time-series point cloud input vector combined with the LSTM model achieved an MAE of 0.262 and a recognition rate of 97.5%. Since a lower MAE and a higher recognition rate indicate better results, this demonstrates the efficiency of the proposed technique.
https://doi.org/10.7471/ikeee.2023.27.4.486
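
For reference, the MAE (mean absolute error) reported in the abstract above is the average absolute difference between target and predicted values. A minimal sketch follows; the abstract does not say which quantities were compared, so the inputs here are plain lists of numbers and the function name is illustrative.

```python
def mean_absolute_error(targets, predictions):
    """Average of |target - prediction| over paired values."""
    if not targets or len(targets) != len(predictions):
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)
```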

Pattern Classification Algorithm for Wrist Movements based on EMG (근전도 신호 기반 손목 움직임 패턴 분류 알고리즘에 대한 연구)

Cui, H.D.; Kim, Y.H.; Shim, H.M.; Yoon, K.S.; Lee, S.M.
- Journal of rehabilitation welfare engineering & assistive technology, v.7 no.2, pp.69-74, 2013
In this paper, we propose a pattern classification algorithm that recognizes wrist movements from electromyogram (EMG) signals, with the aim of raising the recognition rate. To extract precise features, we consider 30 characteristics of EMG signals using the root mean square (RMS) and the difference absolute standard deviation value (DASDV). To obtain a group for each wrist movement, we estimated 2-dimensional features. On this basis, we divide each group into two parts at the mean in order to compare and improve the recognition rate of pattern classification. The k-nearest neighbor (k-NN) method is used for EMG-based motion classification. The recognition rate is 92.59%, which is 0.84% higher than in the previous study.
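
The two features named above have standard definitions in the EMG literature: RMS is the square root of the mean squared amplitude, and DASDV is the square root of the mean squared difference between adjacent samples. The sketch below computes both for one analysis window; the window length and how the paper arranges these into its 2-dimensional features are not stated in the abstract, and the function names are illustrative.

```python
import math


def rms(window):
    """Root mean square of the samples in one analysis window."""
    if not window:
        raise ValueError("window must be non-empty")
    return math.sqrt(sum(x * x for x in window) / len(window))


def dasdv(window):
    """Difference absolute standard deviation value.

    sqrt( sum((x[i+1] - x[i])^2) / (N - 1) ) over adjacent samples.
    """
    if len(window) < 2:
        raise ValueError("window needs at least two samples")
    diffs = [b - a for a, b in zip(window, window[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))
```

A pair such as (rms(w), dasdv(w)) for each window w would give one 2-dimensional point that a k-NN classifier could then label.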

The Automated Threshold Decision Algorithm for Node Split of Phonetic Decision Tree (음소 결정트리의 노드 분할을 위한 임계치 자동 결정 알고리즘)

Kim, Beom-Seung; Kim, Soon-Hyob
- The Journal of the Acoustical Society of Korea, v.31 no.3, pp.170-178, 2012
In this paper, a phonetic decision tree of triphone units was built for phoneme-based speech recognition of the 640 stations operated by Korail. The clustering rate was determined by Pearson and regression analysis in order to decide the threshold used in node splitting. Using the determined clustering rate, thresholds are decided automatically according to the average clustering rate. In recognition experiments verifying the proposed method, performance improved by 1.4 to 2.3% in absolute terms over the baseline system.
https://doi.org/10.7776/ASK.2012.31.3.170

Dynamic PCA algorithm for Detecting Types of Electric Poles (전신주의 종류 판별을 위한 동적 PCA 알고리즘)

Choi, Jae-Young; Lee, Jang-Myung
- The Transactions of The Korean Institute of Electrical Engineers, v.59 no.3, pp.651-656, 2010
This paper proposes a new dynamic PCA algorithm for recognizing types of electric poles, which is necessary for a mobile robot that moves along the neutral line to inspect high-voltage facilities. Since the mobile robot needs to pass over each electric pole and grasp the neutral wire again to inspect the next region, detecting the pole type is critical to passing over the pole successfully. The CCD camera installed on the mobile robot captures an image of the electric pole as the robot approaches it. By applying the dynamic PCA algorithm to the CCD image, the pole type is classified so that the mobile robot can perform a stable grasping operation. The new dynamic PCA algorithm replaces the reference image in real time to improve the robustness of the PCA algorithm, adjusts the brightness to obtain clear images, and applies Laplacian edge detection to increase the recognition rate of the pole type. Real experiments demonstrate the effectiveness of the proposed dynamic PCA algorithm with Laplacian edge detection, which improves the recognition rate by about 20% compared with the conventional PCA algorithm.
https://doi.org/10.5370/KIEE.2010.59.3.651
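
The Laplacian edge detection step mentioned in the abstract above can be illustrated with the common 4-neighbor 3x3 kernel. This is a minimal sketch, assuming a grayscale image stored as a list of equal-length rows and returning only the interior ("valid") region; the paper's actual kernel and preprocessing are not given in the abstract, and the names are illustrative.

```python
LAPLACIAN_KERNEL = [
    [0, 1, 0],
    [1, -4, 1],
    [0, 1, 0],
]


def laplacian(image):
    """Apply the 3x3 Laplacian kernel to a 2D grayscale image.

    Returns the valid region only, of size (height - 2) x (width - 2).
    """
    height, width = len(image), len(image[0])
    output = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            acc = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    weight = LAPLACIAN_KERNEL[dr + 1][dc + 1]
                    acc += weight * image[r + dr][c + dc]
            row.append(acc)
        output.append(row)
    return output
```

Flat regions produce zeros, while intensity steps produce a positive and negative response on either side of the edge, which is what makes the output useful as an edge map.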
