• Title/Summary/Keyword: Bit Recognition Rate

A Biological Fuzzy Multilayer Perceptron Algorithm

  • Kim, Kwang-Baek;Seo, Chang-Jin;Yang, Hwang-Kyu
    • Journal of information and communication convergence engineering / v.1 no.3 / pp.104-108 / 2003
  • A biologically inspired fuzzy multilayer perceptron is proposed in this paper. The proposed algorithm is designed with consideration of the biological neuronal structure as well as fuzzy logic operations. We applied the proposed learning algorithm to neural network benchmark problems such as exclusive OR and 3-bit parity, and to digit image recognition problems. To compare the existing and proposed neural networks, the convergence speed is measured. The simulation results indicate that the proposed learning algorithm converges much faster than the conventional backpropagation algorithm. Furthermore, in the image recognition task, the recognition rate of our learning algorithm is higher than that of the conventional backpropagation algorithm.
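
The XOR and 3-bit parity benchmarks named in this abstract can be generated with a few lines of Python. This is only an illustrative sketch of the benchmark data, not the authors' code; the function name `parity_dataset` is hypothetical.

```python
from itertools import product


def parity_dataset(n_bits):
    """Return (inputs, targets) for the n-bit parity problem.

    Each input is a list of n_bits binary values; the target is 1 when
    the number of 1 bits is odd, else 0. XOR is the 2-bit case.
    """
    inputs = [list(bits) for bits in product((0, 1), repeat=n_bits)]
    targets = [sum(bits) % 2 for bits in inputs]
    return inputs, targets


# XOR is 2-bit parity; the abstract also uses 3-bit parity.
xor_inputs, xor_targets = parity_dataset(2)
p3_inputs, p3_targets = parity_dataset(3)

for x, t in zip(p3_inputs, p3_targets):
    print(x, "->", t)
```

Parity is a common convergence benchmark because flipping any single input bit flips the target, so the classes are not linearly separable.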

Speech Recognition in the Car Noise Environment (자동차 소음 환경에서 음성 인식)

  • 김완구;차일환;윤대희
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.2 / pp.51-58 / 1993
  • This paper describes the development of a speaker-dependent isolated word recognizer as applied to voice dialing in a car noise environment. For this purpose, several methods to improve performance under such conditions are evaluated using a database collected in a small car moving at 100 km/h. The main features of the recognizer are as follows. The endpoint detection error can be reduced by using the magnitude of the signal that is inverse filtered by the AR model of the background noise, and it can be compensated for by using variants of the DTW algorithm. To remove the noise, an autocorrelation subtraction method is used with the constraint that the residual energy obtainable by linear predictive analysis should be positive. By using a noise-robust distance measure, distortion of the feature vector is minimized. The speech recognizer is implemented using the Motorola DSP56001 (a 24-bit general-purpose digital signal processor). The recognition database is composed of 50 Korean names spoken by 3 male speakers. The recognition error rate of the system is reduced to 4.3% using a single reference pattern for each word and to 1.5% using 2 reference patterns for each word.

Speech Recognition Using Formant Bandwidth Normalization (포만트 밴드폭 정규화를 이용한 음성인식)

  • 홍종진;강석건;박군작;박규태
    • The Journal of Korean Institute of Communications and Information Sciences / v.16 no.5 / pp.458-467 / 1991
  • In this paper, the cause of linear prediction error is analysed, the theoretical basis for normalizing the formant bandwidth to 0 is given, and its validity is verified. The formant and bandwidth in relation to the position of the poles of the AR filter are measured to analyse the relation between the pole position and the formant bandwidth. By changing the glottis reflection coefficient to 1, the effect of the glottis is eliminated, and as a result new linear prediction coefficients are obtained by normalizing the formant bandwidth of the signal to 0. Since these coefficients are symmetrical, the standard deviation is larger than that of the coefficients with a fixed glottis reflection coefficient. The bit rate for speech coding can be reduced by a factor of 2 without any loss of information. Through computer simulation, a recognition rate of 96.7% is obtained by using the proposed algorithm to recognize 5 Korean vowels in a noisy environment.

Conception and Performance Analysis of Efficient CDMA-Based Full-Duplex Anti-collision Scheme

  • Cao, Xiaohua;Li, Tiffany
    • ETRI Journal / v.37 no.5 / pp.929-939 / 2015
  • Ultra-high-frequency radio-frequency identification (UHF RFID) is widely applied in different industries. The Frame Slotted ALOHA in EPC C1G2 suffers severe collisions that limit the efficiency of tag recognition. An efficient full-duplex anti-collision scheme is proposed to reduce the rate of collision by coordinating the transmitting process of the CDMA UWB uplink and the UHF downlink. The relevant mathematical models are built to analyze the performance of the proposed scheme. Through simulation, some important findings are gained. The maximum number of identified tags in one slot is g/e (g is the number of PN codes and e is Euler's number) when the number of tags is equal to mg (m is the number of slots). Unlike the Frame Slotted ALOHA, even if the frame size is small and the number of tags is large, there are not too many collisions if the number of PN codes is large enough. Our approach with 7-bit Gold codes, 15-bit Gold codes, or 31-bit Gold codes operates 1.4 times, 1.7 times, or 3 times faster than the CDMA Slotted ALOHA, respectively, and 14.5 times, 16.2 times, or 18.5 times faster than the EPC C1G2 system, respectively. More than 2,000 tags can be processed within 300 ms in our approach.
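
The g/e figure in this abstract can be checked numerically under a simple textbook model, which is an assumption here and not necessarily the paper's own derivation: each of n tags independently picks one of m slots and one of g PN codes uniformly at random, and a tag is identified only if no other tag picked the same (slot, code) pair. The function name below is hypothetical.

```python
import math


def expected_identified_per_slot(n_tags, n_slots, n_codes):
    """Expected number of tags identified in one slot.

    Assumed model: a tag succeeds when it is alone on its (slot, code)
    pair, which happens with probability (1 - p)**(n_tags - 1), where
    p = 1 / (n_slots * n_codes). Dividing the expected total by the
    number of slots gives the per-slot value.
    """
    p = 1.0 / (n_slots * n_codes)
    return (n_tags / n_slots) * (1.0 - p) ** (n_tags - 1)


m = 64  # number of slots (illustrative value)
for g in (7, 15, 31):  # number of PN codes (illustrative values)
    value = expected_identified_per_slot(m * g, m, g)
    print(f"g={g:2d}: model={value:.3f}  g/e={g / math.e:.3f}")
```

With n = mg tags, the per-slot expectation is g(1 - 1/(mg))^(mg - 1), which approaches g/e as mg grows.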

Feed-forward Learning Algorithm by Generalized Clustering Network (Generalized Clustering Network를 이용한 전방향 학습 알고리즘)

  • Min, Jun-Yeong;Jo, Hyeong-Gi
    • The Transactions of the Korea Information Processing Society / v.2 no.5 / pp.619-625 / 1995
  • This paper constructs a composite feed-forward learning algorithm that replaces backpropagation learning. The algorithm first organizes the pattern vectors into clusters using the Generalized Learning Vector Quantization (GLVQ) clustering algorithm (Nikhil R. Pal et al., 1993), then regroups the pattern vectors belonging to different clusters, and finally recognizes the regrouped pattern vectors with a single-layer perceptron. Because this is a feed-forward learning algorithm, the learning time is shorter than that of the backpropagation algorithm and the recognition rate is increased. We use 250 ASCII code bit patterns normalized to 16×8. In the experimental results, when the 250 patterns are divided into 10 clusters, the average number of iterations per cluster is 94.7 and the recognition rate is 100%.

An Enhanced Fuzzy Single Layer Perceptron for Image Recognition (이미지 인식을 위한 개선된 퍼지 단층 퍼셉트론)

  • Lee, Jong-Hee
    • Journal of Korea Multimedia Society / v.2 no.4 / pp.490-495 / 1999
  • In this paper, a method of improving the learning time and convergence rate is proposed that applies the advantages of artificial neural networks and fuzzy theory to the neuron structure. This method is applied to the XOR problem, the n-bit parity problem, which is used as a benchmark for neural network structures, and the recognition of digit images in vehicle plate images as a practical image application. In the experiments, convergence was not always guaranteed. However, the network showed improved learning time and a high convergence rate. The proposed network can be extended to an arbitrary number of layers. Though a single-layer structure is considered, the proposed method is capable of high-speed learning even on large images.

HMM-based Music Identification System for Copyright Protection (저작권 보호를 위한 HMM기반의 음악 식별 시스템)

  • Kim, Hee-Dong;Kim, Do-Hyun;Kim, Ji-Hwan
    • Phonetics and Speech Sciences / v.1 no.1 / pp.63-67 / 2009
  • In this paper, in order to protect music copyrights, we propose a music identification system that is scalable to the number of pieces of registered music and robust to signal-level variations of registered music. For its implementation, we define the new concepts of 'music word' and 'music phoneme' as recognition units to construct 'music acoustic models'. Then, with these concepts, we apply the HMM-based framework used in continuous speech recognition to identify the music. Each music file is transformed into a sequence of 39-dimensional vectors. This sequence of vectors is represented as ordered states with Gaussian mixtures. These ordered states are trained using the Baum-Welch re-estimation method. Music files with a suspicious copyright are also transformed into a sequence of vectors. Then, the most probable music file is identified using the Viterbi algorithm through the music identification network. We implemented a music identification system for 1,000 MP3 music files and tested this system with variations in MP3 bit rate and music speed rate. Our proposed music identification system demonstrates robust performance under signal variations. In addition, the scalability of this system is independent of the number of registered music files, since our system is based on the HMM method.

A design of the processor dedicated to LPC-CEPSTRUM (LPC-CEPSTRUM 추출을 위한 전용 프로세서의 설계)

  • 황인철;김성남;김영우;김태근;김수원
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.8 / pp.71-78 / 1997
  • An LPC cepstrum processor for speech recognition is implemented on a CMOS array process. The designed processor contains a 24-bit floating-point MAC unit to quickly perform the correlation, which accounts for the majority of operations used in the algorithm, and has 22 register files to store temporary variables. For fast operation, the floating-point MAC consists of a 3-stage pipeline, and a new post-normalization scheme is proposed and applied to it. Experimental results show that it takes approximately 266 µs to process 200 samples/frame at a 15 MHz clock rate. This processor runs at a maximum rate of 16.6 MHz and the number of gates is 27,760.

Same music file recognition method by using similarity measurement among music feature data (음악 특징점간의 유사도 측정을 이용한 동일음원 인식 방법)

  • Sung, Bo-Kyung;Chung, Myoung-Beom;Ko, Il-Ju
    • Journal of the Korea Society of Computer and Information / v.13 no.3 / pp.99-106 / 2008
  • Recently, digital music retrieval has come into use in many fields (Web portals, audio service sites, etc.). In existing fields, metadata of music are used for digital music retrieval. If the metadata are incorrect or do not exist, it is hard to obtain highly accurate retrieval results. Content-based information retrieval, which uses the music itself, is being researched to solve this problem. In this paper, we propose a same-music recognition method using similarity measurement. Feature data of digital music are extracted from the music waveform using Simplified MFCC (Mel Frequency Cepstral Coefficient). Similarity between digital music files is measured using DTW (Dynamic Time Warping), which is used in the vision and speech recognition fields. To verify the proposed same-music recognition method, we ran 500 experiments on 1000 randomly collected songs from the same genre, and all 500 succeeded. The 500 digital music files were made from 60 digital audio sources by mixing different compression codecs and bit rates. We showed that similarity measurement using DTW can recognize the same music.
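
The DTW similarity measure named in this abstract can be sketched as the classic dynamic-programming recurrence. This is a minimal illustration, not the authors' implementation. It assumes scalar features and an absolute-difference local cost, whereas the paper compares MFCC feature data. The function name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences.

    cost[i][j] holds the minimum accumulated cost of aligning the first
    i elements of a with the first j elements of b. Each step may
    advance in a, in b, or in both.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


# A time-stretched copy of a sequence aligns with zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
print(dtw_distance([0, 0], [1, 1]))
```

Because the warping path may repeat elements, two encodings of the same song with slightly different timing can still score as close, which is the property the abstract relies on.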

Efficient Transmission Scheme with Viewport Prediction of 360VR Content using Sound Location Information (360VR 콘텐츠의 음원위치정보를 활용한 시점예측 전송기법)

  • Jeong, Eunyoung;Kim, Dong Ho
    • Journal of Broadcast Engineering / v.24 no.6 / pp.1002-1012 / 2019
  • 360VR content requires short latency, such as an immediate response to changes in the viewer's viewport, and high-quality video delivery. It is necessary to consider efficient transmission that guarantees the QoE (Quality of Experience) of 360VR content with limited bandwidth. Several studies have been introduced to reduce overall bandwidth consumption by predicting a user's viewport and allocating different bit rates to the area corresponding to the viewport. In this paper, we propose a novel viewport prediction scheme that uses the sound source location information of 360VR content as auditory recognition information along with visual recognition information. We also propose an efficient transmission algorithm that allocates bit rates appropriately based on the improved viewport prediction. The proposed scheme improves the accuracy of viewport prediction and provides high-quality video to the tiles corresponding to the user's viewpoint within the limited bandwidth.