• Title/Summary/Keyword: Quality metrics

Search Result 522, Processing Time 0.031 seconds

Performance comparison evaluation of real and complex networks for deep neural network-based speech enhancement in the frequency domain (주파수 영역 심층 신경망 기반 음성 향상을 위한 실수 네트워크와 복소 네트워크 성능 비교 평가)

  • Hwang, Seo-Rim;Park, Sung Wook;Park, Youngcheol
    • The Journal of the Acoustical Society of Korea
    • /
    • v.41 no.1
    • /
    • pp.30-37
    • /
    • 2022
  • This paper compares and evaluates model performance from two perspectives according to the learning target and network structure for training Deep Neural Network (DNN)-based speech enhancement models in the frequency domain. In this case, spectrum mapping and Time-Frequency (T-F) masking techniques were used as learning targets, and a real network and a complex network were used for the network structure. The performance of the speech enhancement model was evaluated through two objective evaluation metrics: Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI) depending on the scale of the dataset. Test results show the appropriate size of the training data differs depending on the type of networks and the type of dataset. In addition, they show that, in some cases, using a real network may be a more realistic solution if the number of total parameters is considered because the real network shows relatively higher performance than the complex network depending on the size of the data and the learning target.

Performance Enhancement of Speech Declipping using Clipping Detector (클리핑 감지기를 이용한 음성 신호 클리핑 제거의 성능 향상)

  • Eunmi Seo;Jeongchan Yu;Yujin Lim;Hochong Park
    • Journal of Broadcast Engineering
    • /
    • v.28 no.1
    • /
    • pp.132-140
    • /
    • 2023
  • In this paper, we propose a method for performance enhancement of speech declipping using clipping detector. Clipping occurs when the input speech level exceeds the dynamic range of microphone, and it significantly degrades the speech quality. Recently, many methods for high-performance speech declipping based on machine learning have been developed. However, they often deteriorate the speech signal because of degradation in signal reconstruction process when the degree of clipping is not high. To solve this problem, we propose a new approach that combines the declipping network and clipping detector, which enables a selective declipping operation depending on the clipping level and provides high-quality speech in all clipping levels. We measured the declipping performance using various metrics and confirmed that the proposed method improves the average performance over all clipping levels, compared with the conventional methods, and greatly improves the performance when the clipping distortion is small.

Complex nested U-Net-based speech enhancement model using a dual-branch decoder (이중 분기 디코더를 사용하는 복소 중첩 U-Net 기반 음성 향상 모델)

  • Seorim Hwang;Sung Wook Park;Youngcheol Park
    • The Journal of the Acoustical Society of Korea
    • /
    • v.43 no.2
    • /
    • pp.253-259
    • /
    • 2024
  • This paper proposes a new speech enhancement model based on a complex nested U-Net with a dual-branch decoder. The proposed model consists of a complex nested U-Net to simultaneously estimate the magnitude and phase components of the speech signal, and the decoder has a dual-branch decoder structure that performs spectral mapping and time-frequency masking in each branch. At this time, compared to the single-branch decoder structure, the dual-branch decoder structure allows noise to be effectively removed while minimizing the loss of speech information. The experiment was conducted on the VoiceBank + DEMAND database, commonly used for speech enhancement model training, and was evaluated through various objective evaluation metrics. As a result of the experiment, the complex nested U-Net-based speech enhancement model using a dual-branch decoder increased the Perceptual Evaluation of Speech Quality (PESQ) score by about 0.13 compared to the baseline, and showed a higher objective evaluation score than recently proposed speech enhancement models.

Cascade Fusion-Based Multi-Scale Enhancement of Thermal Image (캐스케이드 융합 기반 다중 스케일 열화상 향상 기법)

  • Kyung-Jae Lee
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.19 no.1
    • /
    • pp.301-307
    • /
    • 2024
  • This study introduces a novel cascade fusion architecture aimed at enhancing thermal images across various scale conditions. The processing of thermal images at multiple scales has been challenging due to the limitations of existing methods that are designed for specific scales. To overcome these limitations, this paper proposes a unified framework that utilizes cascade feature fusion to effectively learn multi-scale representations. Confidence maps from different image scales are fused in a cascaded manner, enabling scale-invariant learning. The architecture comprises end-to-end trained convolutional neural networks to enhance image quality by reinforcing mutual scale dependencies. Experimental results indicate that the proposed technique outperforms existing methods in multi-scale thermal image enhancement. Performance evaluation results are provided, demonstrating consistent improvements in image quality metrics. The cascade fusion design facilitates robust generalization across scales and efficient learning of cross-scale representations.

EDF: An Interactive Tool for Event Log Generation for Enabling Process Mining in Small and Medium-sized Enterprises

  • Frans Prathama;Seokrae Won;Iq Reviessay Pulshashi;Riska Asriana Sutrisnowati
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.6
    • /
    • pp.101-112
    • /
    • 2024
  • In this paper, we present EDF (Event Data Factory), an interactive tool designed to assist event log generation for process mining. EDF integrates various data connectors to improve its capability to assist users in connecting to diverse data sources. Our tool employs low-code/no-code technology, along with graph-based visualization, to help non-expert users understand process flow and enhance the user experience. By utilizing metadata information, EDF allows users to efficiently generate an event log containing case, activity, and timestamp attributes. Through log quality metrics, our tool enables users to assess the generated event log quality. We implement EDF under a cloud-based architecture and run a performance evaluation. Our case study and results demonstrate the usability and applicability of EDF. Finally, an observational study confirms that EDF is easy to use and beneficial, expanding small and medium-sized enterprises' (SMEs) access to process mining applications.

A Method for Measuring and Evaluating for Block-based Programming Code (블록기반 프로그래밍 코드의 수준 및 취약수준 측정방안)

  • Sohn, Wonsung
    • Journal of The Korean Association of Information Education
    • /
    • v.20 no.3
    • /
    • pp.293-302
    • /
    • 2016
  • It is the latest fashion of interesting with software education in public school environment and also consider as high priority issue of curriculum for college freshman with programming 101 courses. The block-based programming tool is used widely for the beginner and provides several positive features compare than text-based programming language tools. To measure quality of programming code elaborately which is based script language, it is need to very tough manual process. As a result the previously research related with evaluation of block-based script code has been focused very simple methods in which normalize the number of blocks used which is related with programming concept. In such cases in this, it is difficult to measure structural vulnerability of script code and implicit programming concept which does not expose. In this research, the framework is proposed which enable to measure and evaluate quality of code script of block-based programming tools and also provides method to find of vulnerability of script code. In this framework, the quality metrics is constructed to structuralize implicit programming concept and then developed the quality measure and vulnerability model of script to improve level of programming. Consequently, the proposed methods enable to check of level of programming and predict the heuristic target level.

Development of Support Package for the Software Quality Assurance (소프트웨어 품질보증(SQA) 지원 패키지 개발)

  • Yu, Chung-Jae;Han, Hyuk-Soo
    • The KIPS Transactions:PartD
    • /
    • v.11D no.5
    • /
    • pp.1105-1122
    • /
    • 2004
  • The organization and company's effort to improve software qualify contributes to the increase of software productivity and quality in some sense. However, it has not been a solution of root causes. This result is caused not because of people or technology, but process in-stitutionalization. Recently SQA (Software Quality Assurance), which provide mechanism to make sure that the software development process and products follow the assigned requirements, plan and standards, is applied to achieve the quality improvements. Several standards and models are developed for SQA activities. However, those standards and mode]s are written in form and do not provide information related to the detailed procedures, methods and outputs. Therefore, the organizations that want to adopt those models or standards have to put a lot of effort to acquire the knowledge about the models and to set up SQA Process that is tailored to meet organization's goal and objectives. In this research, we developed SQA support package to support the organization to develop their SQA process in more convenient and systematic ways. With this package, the organizations can establish SQA Process by tailoring those features necessary to reflect organization's characteristics. We expect this package contribute the organizations in a way that it reduce the effort and cost for establishing SQA process.

A study on speech enhancement using complex-valued spectrum employing Feature map Dependent attention gate (특징 맵 중요도 기반 어텐션을 적용한 복소 스펙트럼 기반 음성 향상에 관한 연구)

  • Jaehee Jung;Wooil Kim
    • The Journal of the Acoustical Society of Korea
    • /
    • v.42 no.6
    • /
    • pp.544-551
    • /
    • 2023
  • Speech enhancement used to improve the perceptual quality and intelligibility of noise speech has been studied as a method using a complex-valued spectrum that can improve both magnitude and phase in a method using a magnitude spectrum. In this paper, a study was conducted on how to apply attention mechanism to complex-valued spectrum-based speech enhancement systems to further improve the intelligibility and quality of noise speech. The attention is performed based on additive attention and allows the attention weight to be calculated in consideration of the complex-valued spectrum. In addition, the global average pooling was used to consider the importance of the feature map. Complex-valued spectrum-based speech enhancement was performed based on the Deep Complex U-Net (DCUNET) model, and additive attention was conducted based on the proposed method in the Attention U-Net model. The results of the experiments on noise speech in a living room environment showed that the proposed method is improved performance over the baseline model according to evaluation metrics such as Source to Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short Time Object Intelligence (STOI), and consistently improved performance across various background noise environments and low Signal-to-Noise Ratio (SNR) conditions. Through this, the proposed speech enhancement system demonstrated its effectiveness in improving the intelligibility and quality of noisy speech.

Comparative analysis of wavelet transform and machine learning approaches for noise reduction in water level data (웨이블릿 변환과 기계 학습 접근법을 이용한 수위 데이터의 노이즈 제거 비교 분석)

  • Hwang, Yukwan;Lim, Kyoung Jae;Kim, Jonggun;Shin, Minhwan;Park, Youn Shik;Shin, Yongchul;Ji, Bongjun
    • Journal of Korea Water Resources Association
    • /
    • v.57 no.3
    • /
    • pp.209-223
    • /
    • 2024
  • In the context of the fourth industrial revolution, data-driven decision-making has increasingly become pivotal. However, the integrity of data analysis is compromised if data quality is not adequately ensured, potentially leading to biased interpretations. This is particularly critical for water level data, essential for water resource management, which often encounters quality issues such as missing values, spikes, and noise. This study addresses the challenge of noise-induced data quality deterioration, which complicates trend analysis and may produce anomalous outliers. To mitigate this issue, we propose a noise removal strategy employing Wavelet Transform, a technique renowned for its efficacy in signal processing and noise elimination. The advantage of Wavelet Transform lies in its operational efficiency - it reduces both time and costs as it obviates the need for acquiring the true values of collected data. This study conducted a comparative performance evaluation between our Wavelet Transform-based approach and the Denoising Autoencoder, a prominent machine learning method for noise reduction.. The findings demonstrate that the Coiflets wavelet function outperforms the Denoising Autoencoder across various metrics, including Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Mean Squared Error (MSE). The superiority of the Coiflets function suggests that selecting an appropriate wavelet function tailored to the specific application environment can effectively address data quality issues caused by noise. This study underscores the potential of Wavelet Transform as a robust tool for enhancing the quality of water level data, thereby contributing to the reliability of water resource management decisions.

Comparative Analysis of Markerless Facial Recognition Technology for 3D Character's Facial Expression Animation -Focusing on the method of Faceware and Faceshift- (3D 캐릭터의 얼굴 표정 애니메이션 마커리스 표정 인식 기술 비교 분석 -페이스웨어와 페이스쉬프트 방식 중심으로-)

  • Kim, Hae-Yoon;Park, Dong-Joo;Lee, Tae-Gu
    • Cartoon and Animation Studies
    • /
    • s.37
    • /
    • pp.221-245
    • /
    • 2014
  • With the success of the world's first 3D computer animated film, "Toy Story" in 1995, industrial development of 3D computer animation gained considerable momentum. Consequently, various 3D animations for TV were produced; in addition, high quality 3D computer animation games became common. To save a large amount of 3D animation production time and cost, technological development has been conducted actively, in accordance with the expansion of industrial demand in this field. Further, compared with the traditional approach of producing animations through hand-drawings, the efficiency of producing 3D computer animations is infinitely greater. In this study, an experiment and a comparative analysis of markerless motion capture systems for facial expression animation has been conducted that aims to improve the efficiency of 3D computer animation production. Faceware system, which is a product of Image Metrics, provides sophisticated production tools despite the complexity of motion capture recognition and application process. Faceshift system, which is a product of same-named Faceshift, though relatively less sophisticated, provides applications for rapid real-time motion recognition. It is hoped that the results of the comparative analysis presented in this paper become baseline data for selecting the appropriate motion capture and key frame animation method for the most efficient production of facial expression animation in accordance with production time and cost, and the degree of sophistication and media in use, when creating animation.