Search | Korea Science

On a robust text-dependent speaker identification over telephone channels (전화음성에 강인한 문장종속 화자인식에 관한 연구)

Jung, Eu-Sang;Choi, Hong-Sub
- Speech Sciences
- /
- v.2
- /
- pp.57-66
- /
- 1997
This paper studies the effect of cepstral mean subtraction (CMS), which compensates for some of the speech distortion caused by telephone channels, on the performance of a text-dependent speaker identification system. The system is based on vector quantization (VQ) and hidden Markov models (HMM) and uses the LPC-cepstrum and mel-cepstrum as feature vectors extracted from speech transmitted over telephone channels, so the correct recognition rates obtained with the two features can be compared. The experimental results show that the mel-cepstrum outperforms the LPC-cepstrum and that recognition performance improves by about 10% when the telephone channel is compensated for with CMS.
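As a rough illustration of the CMS step named in this abstract: a stationary channel appears as a constant additive offset in the cepstral domain, so subtracting the per-utterance mean of each coefficient removes it. The sketch below is not the authors' implementation; the function name and the toy data are assumptions made for illustration.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over all frames of one utterance.

    `frames` is a list of cepstral vectors (lists of floats), one per frame.
    """
    if not frames:
        return []
    n_frames = len(frames)
    dim = len(frames[0])
    # Mean of each cepstral coefficient across the utterance.
    means = [sum(f[k] for f in frames) / n_frames for k in range(dim)]
    return [[f[k] - means[k] for k in range(dim)] for f in frames]


# Toy data (hypothetical): the same cepstra with and without a constant
# channel offset added to every frame.
clean = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0]]
channel_offset = [0.5, -1.0]
distorted = [[c + h for c, h in zip(f, channel_offset)] for f in clean]

cms_clean = cepstral_mean_subtraction(clean)
cms_distorted = cepstral_mean_subtraction(distorted)
```

After CMS the two versions coincide, which is why the technique helps when training and test speech pass through different channels.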

Video Rate Control Using Activity Based Rate Prediction

Park, Hyung-Shin;Jung, You-Young;Kim, Young-Ro;Ko, Sung-Jea
- Proceedings of the IEEK Conference
- /
- 2000.07a
- /
- pp.454-457
- /
- 2000
In this paper, an efficient rate control algorithm based on rate prediction is proposed for maintaining a smooth buffer variation and a small buffer size. The proposed method adjusts the quantization scaling factor using the predicted bit rate so that the target bit budget is met exactly. Experimental results show that the proposed prediction-based rate control scheme regulates the bit rate across scene changes more effectively and achieves better PSNR performance than existing rate control mechanisms such as the MPEG-2 Test Model 5 (TM5) and the Adaptive Scene Analysis (ASA).
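The idea of picking a quantizer scale from a predicted rate can be sketched with a TM5-style complexity model, in which the bits spent on a picture are roughly its complexity divided by the quantizer scale. This is an assumed stand-in, not the activity-based predictor of the paper; the function names and the 1 to 31 clamp range are illustrative choices.

```python
def choose_quantizer_scale(predicted_complexity, target_bits, q_min=1, q_max=31):
    """Pick a quantizer scale so that complexity / q is close to target_bits.

    Assumes the simple model bits ~= complexity / q (hypothetical here).
    """
    if target_bits <= 0:
        return q_max
    q = round(predicted_complexity / target_bits)
    # Keep the scale inside the allowed range.
    return max(q_min, min(q_max, q))


def update_complexity(actual_bits, q_used):
    """Re-estimate complexity from the picture that was just coded."""
    return actual_bits * q_used


# Example: a picture predicted to be 10x more complex than the budget allows.
q = choose_quantizer_scale(predicted_complexity=400_000, target_bits=40_000)
```

After each coded picture, `update_complexity` feeds the measured bit count back into the prediction for the next picture of the same type.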

A Study on the Implementation of Low Power DCT Architecture for MPEG-4 AVC (저전력 DCT를 이용한 MPEG-4 AVC 압축에 관한 연구)

Kim, Dong-Hoon;Seo, Sang-Jin;Park, Sang-Bong;Jin, Hyun-Joon;Park, Nho-Kyung
- Proceedings of the KIEE Conference
- /
- 2007.10a
- /
- pp.371-372
- /
- 2007
In this paper we present performance and implementation comparisons of a high-performance two-dimensional forward and inverse Discrete Cosine Transform (2D DCT/IDCT) algorithm and a low-power algorithm for $8 \times 8$ 2D DCT and quantization based on partial sums, together with the corresponding hardware architecture for FPGA in MPEG-4. The architecture used in both the low-power 2D DCT and the 2D IDCT is based on the conventional row-column decomposition method. Using a fast algorithm and the distributed arithmetic (DA) technique to implement the DCT/IDCT reduces the hardware complexity. The design was made using Mentor Graphics tools for design entry and implementation. Mentor Graphics ModelSim SE 6.1f was used for Verilog HDL entry, behavioral simulation, and synthesis. The 2D DCT/IDCT consumes only 50% of the operating power.
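The row-column decomposition mentioned in this abstract computes a 2D DCT as a 1D DCT over every row followed by a 1D DCT over every column of the result. The floating-point sketch below shows only that decomposition, using an orthonormal DCT-II; it does not model the paper's distributed-arithmetic hardware, and the function names are illustrative.

```python
import math


def dct_1d(x):
    """Orthonormal 1D DCT-II of a sequence of floats."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out


def dct_2d_row_column(block):
    """2D DCT via row-column decomposition: rows first, then columns."""
    rows = [dct_1d(r) for r in block]
    # zip(*rows) iterates over columns of the row-transformed block.
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # Transpose back so that result[u][v] is indexed row-frequency first.
    return [list(r) for r in zip(*cols)]


# A flat 8x8 block has all of its energy in the DC coefficient.
flat = [[10.0] * 8 for _ in range(8)]
coeffs = dct_2d_row_column(flat)
```

For an $8 \times 8$ block this needs 16 one-dimensional transforms instead of a direct 64-term sum per coefficient, which is the reason the decomposition is the conventional starting point for hardware designs.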

Visual Modeling and Content-based Processing for Video Data Storage and Delivery

Hwang Jae-Jeong;Cho Sang-Gyu
- Journal of information and communication convergence engineering
- /
- v.3 no.1
- /
- pp.56-61
- /
- 2005
In this paper, we present a video rate control scheme for storage and delivery in which the time-varying viewing interests are controlled by human gaze. To track the gaze, the pupil's movement is detected using a three-step process: detecting the face region, the eye region, and the pupil point. To control bit rates, the quantization parameter (QP) is changed by considering the static parameters, the video object priority derived from the pupil tracking, the target PSNR, and the weighted distortion value of the coder. As a result, we obtained a human-interfaced visual model and a corresponding region-of-interest rate control system.

Robust Speech Recognition by Utilizing Class Histogram Equalization (클래스 히스토그램 등화 기법에 의한 강인한 음성 인식)

Suh, Yung-Joo;Kim, Hor-Rin;Lee, Yun-Keun
- MALSORI
- /
- no.60
- /
- pp.145-164
- /
- 2006
This paper proposes class histogram equalization (CHEQ) to compensate noisy acoustic features for robust speech recognition. CHEQ aims to compensate for the acoustic mismatch between training and test speech recognition environments as well as to reduce the limitations of conventional histogram equalization (HEQ). In contrast to HEQ, CHEQ adopts multiple class-specific distribution functions for the training and test environments and equalizes the features using their class-specific training and test distributions. Depending on the method used to extract class information, CHEQ takes one of two forms: hard-CHEQ, based on vector quantization, and soft-CHEQ, based on the Gaussian mixture model. Experiments on the Aurora 2 database confirmed the effectiveness of CHEQ, which produced a relative word error reduction of 61.17% over the baseline mel-cepstral features and of 19.62% over conventional HEQ.
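A minimal sketch of the two ideas in this abstract, under stated assumptions: conventional HEQ maps each test value through the empirical test CDF and then through the inverse reference (training) CDF, and a hard class-based variant applies the same mapping separately within each class. The class labels are taken as given here, whereas the paper derives them by vector quantization; all names and data are hypothetical.

```python
import bisect


def histogram_equalize(test_values, reference_values):
    """Map test values onto the reference distribution by matching CDFs."""
    ref_sorted = sorted(reference_values)
    test_sorted = sorted(test_values)
    n_test, n_ref = len(test_sorted), len(ref_sorted)
    out = []
    for v in test_values:
        # Empirical CDF of v in the test data (midpoint convention).
        rank = bisect.bisect_right(test_sorted, v)
        cdf = (rank - 0.5) / n_test
        # Inverse reference CDF: pick the reference value at that quantile.
        idx = min(n_ref - 1, int(cdf * n_ref))
        out.append(ref_sorted[idx])
    return out


def class_histogram_equalize(test_values, test_labels, reference_by_class):
    """Hard class-based variant: equalize each class with its own reference."""
    out = list(test_values)
    for label in set(test_labels):
        idxs = [i for i, lab in enumerate(test_labels) if lab == label]
        eq = histogram_equalize([test_values[i] for i in idxs],
                                reference_by_class[label])
        for i, v in zip(idxs, eq):
            out[i] = v
    return out


# Toy data: test features are a shifted, scaled copy of the reference.
equalized = histogram_equalize([10, 12, 14, 16, 18], [0, 1, 2, 3, 4])
```

Because the mapping depends only on ranks, any monotonic distortion of the test features is undone, which is the property both HEQ and CHEQ rely on.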

Trend of Edge Machine Learning as-a-Service (서비스형 엣지 머신러닝 기술 동향)

Na, J.C.;Jeon, S.H.
- Electronics and Telecommunications Trends
- /
- v.37 no.5
- /
- pp.44-53
- /
- 2022
The Internet of Things (IoT) is growing exponentially, with the number of IoT devices multiplying annually. Accordingly, the paradigm is changing from cloud computing to edge computing and even tiny edge computing because of the low latency and cost reduction. Machine learning is likewise shifting from the cloud to the edge or tiny edge. However, the fragmented and resource-constrained nature of IoT devices has limited the development of artificial intelligence applications. Edge MLaaS (Machine Learning as-a-Service) has been studied as a way to adopt machine learning in products easily and quickly and to overcome the device limitations. This paper briefly summarizes what Edge MLaaS is and which research elements it requires.
https://doi.org/10.22648/ETRI.2022.J.370505

Sparsity Increases Uncertainty Estimation in Deep Ensemble

Dorjsembe, Uyanga;Lee, Ju Hong;Choi, Bumghi;Song, Jae Won
- Proceedings of the Korea Information Processing Society Conference
- /
- 2021.05a
- /
- pp.373-376
- /
- 2021
Deep neural networks have achieved almost human-level results in various tasks and have become popular in the broad artificial intelligence domains. Uncertainty estimation is an on-demand task caused by the black-box point estimation behavior of deep learning. The deep ensemble provides increased accuracy and estimated uncertainty; however, linearly increasing the size makes the deep ensemble unfeasible for memory-intensive tasks. To address this problem, we used model pruning and quantization with a deep ensemble and analyzed the effect in the context of uncertainty metrics. We empirically showed that the ensemble members' disagreement increases with pruning, making models sparser by zeroing irrelevant parameters. Increased disagreement implies increased uncertainty, which helps in making more robust predictions. Accordingly, an energy-efficient compressed deep ensemble is appropriate for memory-intensive and uncertainty-aware tasks.
https://doi.org/10.3745/PKIPS.y2021m05a.373
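The abstract does not say which uncertainty metrics were used, so the following is only one common way to quantify ensemble disagreement: the entropy of the averaged prediction minus the average entropy of the members. It is zero when all members agree and grows as their predicted distributions diverge. The function names and toy probabilities are assumptions.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)


def mean_prediction(member_probs):
    """Average the class probabilities predicted by the ensemble members."""
    n = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(p[c] for p in member_probs) / n for c in range(n_classes)]


def disagreement(member_probs):
    """Entropy of the mean prediction minus the mean member entropy."""
    avg_member_entropy = sum(entropy(p) for p in member_probs) / len(member_probs)
    return entropy(mean_prediction(member_probs)) - avg_member_entropy


# Three members that agree, and three that do not (toy numbers).
agreeing = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]
diverging = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
d_agree = disagreement(agreeing)
d_diverge = disagreement(diverging)
```

Comparing this quantity before and after pruning the members would be one way to check the abstract's claim that sparser members disagree more.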

A Model Compression for Super Resolution Multi Scale Residual Networks based on a Layer-wise Quantization (계층별 양자화 기반 초해상화 다중 스케일 잔차 네트워크 압축)

Hwang, Jiwon;Bae, Sung-Ho
- Proceedings of the Korean Society of Broadcast Engineers Conference
- /
- 2020.07a
- /
- pp.540-543
- /
- 2020
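Existing deep learning methods for super-resolution achieve good performance as models grow deeper, but they are becoming increasingly complex and require a great deal of time in practical use. To address this, we quantize the weights of the deep learning model to reduce inference time. A super-resolution model consists of three parts (feature extraction, non-linear mapping, and reconstruction) and is characterized by many skip connections between layers. Consequently, the effect of quantization on the final performance degradation differs from layer to layer; taking this into account, we use reinforcement learning to find the optimal bit-width for each layer and thereby minimize the performance drop. In this paper we use MSRN, which has many skip connections, and the results show that the feature extraction and reconstruction parts, as well as layers at particular positions within a block, are always assigned high bit-widths. This is the first paper to reduce the model size of a deep learning super-resolution method using mixed-bit quantization, which had previously been limited to image classification, and we expect the proposed method to be applicable to constrained environments such as mobile devices.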
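Layer-wise (mixed-bit) quantization can be illustrated with a symmetric uniform quantizer applied with a different bit-width per layer. The sketch below assumes the bit-widths are already known; the reinforcement-learning search the abstract describes is not reproduced, and the layer names, weights, and bit assignments are hypothetical.

```python
def quantize_weights(weights, bits):
    """Symmetric uniform quantization of a list of floats to `bits` bits."""
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0 or levels == 0:
        return list(weights)
    step = max_abs / levels
    # Round each weight to the nearest multiple of the step size.
    return [round(w / step) * step for w in weights]


def quantize_layers(layers, bits_per_layer):
    """Quantize each named layer with its own bit-width."""
    return {name: quantize_weights(w, bits_per_layer[name])
            for name, w in layers.items()}


def max_error(original, quantized):
    return max(abs(a - b) for a, b in zip(original, quantized))


# Hypothetical layers: sensitive layers keep more bits than the middle one.
layers = {
    "feature_extraction": [0.9, -0.35, 0.12, 0.5],
    "mapping": [0.9, -0.35, 0.12, 0.5],
}
bits = {"feature_extraction": 8, "mapping": 2}
quantized = quantize_layers(layers, bits)
```

With identical weights, the 8-bit layer is reproduced almost exactly while the 2-bit layer is not, which is why a per-layer choice of bit-width matters when layers differ in sensitivity.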

A Study on the Mixed Model Approach and Symbol Probability Weighting Function for Maximization of Inter-Speaker Variation (화자간 변별력 최대화를 위한 혼합 모델 방식과 심볼 확률 가중함수에 관한 연구)

Chin Se-Hoon;Kang Chul-Ho
- The Journal of the Acoustical Society of Korea
- /
- v.24 no.7
- /
- pp.410-415
- /
- 2005
Most recent speaker verification systems are based on a pattern recognition approach, and the performance of the pattern classifier depends on how well it separates the feature parameters of different speakers. To classify feature parameters efficiently and effectively, it is important to enlarge the variation between speakers and to measure the distances between feature parameters effectively. This paper therefore proposes a mixed model scheme that enlarges inter-speaker variation by searching the individual model and the world model at the same time, so that inter-speaker variation is maximized during the decision procedure. We also use a symbol probability weighting function to reduce vector quantization errors; the symbol probability is derived from the ratio of the distances to the world codebook and the individual codebook. In our experiments, this method roughly halved the Detection Cost Function (DCF) of the system, from 2.37% to 1.16%.

A High Performance Permanent Magnet Synchronous Motor Servo System Using Predictive Functional Control and Kalman Filter

Wang, Shuang;Zhu, Wenju;Shi, Jian;Ji, Hua;Huang, Surong
- Journal of Power Electronics
- /
- v.15 no.6
- /
- pp.1547-1558
- /
- 2015
A predictive functional control (PFC) scheme for permanent magnet synchronous motor (PMSM) servo systems is proposed in this paper. The PFC-based method is first introduced in the control design of the speed loop. Because the accuracy of the PFC model is affected by external disturbances and by the speed detection quantization errors of the low-resolution optical encoder in servo systems, the standard PFC method does not achieve satisfactory results in the presence of strong disturbances. This paper adopts a Kalman filter to observe the load torque, the rotor position, and the rotor angular velocity under the condition of a limited-precision encoder. The observations are then fed back into the PFC model to rebuild it with the influence of the perturbation taken into account. An improved PFC method, called the PFC+Kalman filter method, is thus presented, and a high-performance PMSM servo system is achieved. The validity of the proposed controller was tested via experiments. Excellent results were obtained with respect to speed trajectory tracking, stability, and disturbance rejection.
https://doi.org/10.6113/JPE.2015.15.6.1547
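To show how a Kalman filter can recover speed from a coarsely quantized position signal, here is a deliberately reduced sketch: a two-state filter (rotor angle and angular velocity) with a constant-velocity model. The paper's observer also estimates load torque, which is omitted, and every numerical value below (sample time, encoder resolution, noise levels) is an assumption chosen for illustration.

```python
def kalman_step(state, cov, z, dt, q_omega, r):
    """One predict/update cycle for the state (theta, omega).

    cov is the 2x2 covariance stored as (p00, p01, p10, p11);
    z is the quantized position measurement.
    """
    theta, omega = state
    p00, p01, p10, p11 = cov
    # Predict with a constant-velocity model.
    theta_p = theta + dt * omega
    omega_p = omega
    a00 = p00 + dt * (p01 + p10) + dt * dt * p11
    a01 = p01 + dt * p11
    a10 = p10 + dt * p11
    a11 = p11 + q_omega
    # Update with the position measurement (H = [1, 0]).
    s = a00 + r
    k0, k1 = a00 / s, a10 / s
    resid = z - theta_p
    new_state = (theta_p + k0 * resid, omega_p + k1 * resid)
    new_cov = ((1 - k0) * a00, (1 - k0) * a01,
               a10 - k1 * a00, a11 - k1 * a01)
    return new_state, new_cov


def simulate(true_omega=1.3, dt=0.01, resolution=0.01, steps=1000):
    """Track a constant-speed rotor seen through a quantizing encoder."""
    state = (0.0, 0.0)
    cov = (1.0, 0.0, 0.0, 10.0)
    r = resolution ** 2 / 12.0  # variance of uniform quantization error
    for k in range(1, steps + 1):
        theta_true = true_omega * dt * k
        z = round(theta_true / resolution) * resolution
        state, cov = kalman_step(state, cov, z, dt, q_omega=1e-6, r=r)
    return state


theta_est, omega_est = simulate()
```

Differencing the raw encoder counts at this resolution would give speed readings that jump between 1 and 2 rad/s, whereas the filtered estimate settles near the true value.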
