• Title/Summary/Keyword: perceptual quality

Improved CycleGAN for underwater ship engine audio translation (수중 선박엔진 음향 변환을 위한 향상된 CycleGAN 알고리즘)

  • Ashraf, Hina;Jeong, Yoon-Sang;Lee, Chong Hyun
    • The Journal of the Acoustical Society of Korea
    • /
    • v.39 no.4
    • /
    • pp.292-302
    • /
    • 2020
  • Machine learning algorithms have made immense contributions in various fields, including sonar and radar applications. The recently developed Cycle-Consistency Generative Adversarial Network (CycleGAN), a variant of GAN, has been successfully used for unpaired image-to-image translation. We present a modified CycleGAN for translating underwater ship engine sounds with high perceptual quality. The proposed network is composed of an improved generator trained to translate underwater audio from one vessel type to another, an improved discriminator that identifies data as real or fake, and a modified cycle-consistency loss function. Quantitative and qualitative analyses of the proposed CycleGAN are performed on the publicly available underwater dataset ShipsEar by evaluating Mel-cepstral distortion, pitch contour matching, nearest neighbor comparison, and mean opinion score, and comparing them with existing algorithms. The results demonstrate the effectiveness of the proposed network.
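    Mel-cepstral distortion, one of the metrics named in the abstract above, can be illustrated with a minimal sketch. The abstract does not say how the authors computed it, so the conventions below are assumptions: the widely used form MCD = (10 / ln 10) · sqrt(2 · Σ (c_d − c'_d)²), frames that are already time-aligned, and the 0th (energy) coefficient skipped. The function name `mel_cepstral_distortion` is hypothetical.

```python
import math


def mel_cepstral_distortion(ref_frames, conv_frames):
    """Mean MCD in dB over time-aligned frames of mel-cepstral coefficients.

    Assumptions (not taken from the paper): frames are already aligned,
    and coefficient 0 (energy) is excluded from the distance.
    """
    if not ref_frames or len(ref_frames) != len(conv_frames):
        raise ValueError("need the same non-zero number of frames in both inputs")
    scale = 10.0 / math.log(10.0)
    total = 0.0
    for ref, conv in zip(ref_frames, conv_frames):
        # Squared Euclidean distance over coefficients 1..D.
        squared = sum((r - c) ** 2 for r, c in zip(ref[1:], conv[1:]))
        total += scale * math.sqrt(2.0 * squared)
    return total / len(ref_frames)


if __name__ == "__main__":
    reference = [[0.0, 1.0, 0.0], [0.0, 0.5, 0.5]]
    converted = [[5.0, 0.0, 0.0], [1.0, 0.5, 0.5]]
    print(round(mel_cepstral_distortion(reference, converted), 3))
```

    Lower values mean the converted audio is spectrally closer to the reference; identical frames give 0 dB.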

Effects of Injection Laryngoplasty with Hyaluronic Acid in Patients with Vocal Fold Paralysis

  • Kim, Geun-Hyo;Lee, Jae-Seok;Lee, Chang-Yoon;Lee, Yeon-Woo;Bae, In-Ho;Park, Hee-June;Lee, Byung-Joo;Kwon, Soon-Bok
    • Osong Public Health and Research Perspectives
    • /
    • v.9 no.6
    • /
    • pp.354-361
    • /
    • 2018
  • Objectives: The purpose of this study was to explore the effects of injection laryngoplasty (IL) with hyaluronic acid in patients with vocal fold paralysis (VFP). Methods: A total of 50 patients with VFP participated in this study. Pre- and post-IL assessments were performed, which included analysis of sustained vowel /a/ phonation and of the patient reading 1 Korean sentence from the "Walk" passage, comprising 25 syllables in 10 words. To investigate the effect of IL on vocal fold function, acoustic analysis (acoustic voice quality index, cepstral peak prominence, maximum phonation time, speaking fundamental frequency) was conducted, and auditory-perceptual (grade and overall severity), visual judgment (gap), and self-questionnaire (voice handicap index-10) assessments were performed. Results: The patients with VFP showed statistically significant differences between pre- and post-IL assessments in the acoustic, auditory-perceptual, visual judgment, and self-questionnaire measures. Conclusion: The patients with VFP showed positive change in vocal fold function between pre- and post-IL measurements. The findings showed that IL with hyaluronic acid is an effective method to improve vocal fold function in patients with VFP.

A RST Resistant Logo Embedding Technique Using Block DCT and Image Normalization (블록 DCT와 영상 정규화를 이용한 회전, 크기, 이동 변환에 견디는 강인한 로고 삽입방법)

  • Choi Yoon-Hee;Choi Tae-Sun
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.15 no.5
    • /
    • pp.93-103
    • /
    • 2005
  • In this paper, we propose an RST-resistant robust logo embedding technique for multimedia copyright protection. Geometric manipulations are challenging attacks because they introduce little quality degradation yet make the detection process very complex and difficult. Embedding a watermark directly in the normalized image suffers from a smoothing effect caused by interpolation during image normalization. This can be avoided by using an image normalization technique to estimate the transform parameters instead of embedding in the normalized image. Conventional RST-resistant schemes that use a full-frame transform suffer from the absence of effective perceptual masking methods. We therefore adopt the $8\times8$ block DCT and calculate masking using the spatio-frequency localization of the $8\times8$ block DCT coefficients. Simulation results show that the proposed algorithm is robust against various signal processing techniques, compression, and geometric manipulations.

The Effect of Voice Therapy for the Treatment of Functional Aphonia: A Preliminary Study (기능적 실성증에 대한 음성치료의 효과 분석: 기초 연구)

  • Kim, No Eul;Kim, Jun Seok;Oh, Jae Hwan;Kim, Dong Young;Woo, Joo Hyun
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
    • /
    • v.32 no.2
    • /
    • pp.75-80
    • /
    • 2021
  • Background and Objectives: Functional aphonia refers to a condition in which patients present with a whispering voice or produce a very high-pitched, tense voice. Voice therapy is the most effective treatment, but there is a lack of consensus on how it should be applied. The purpose of this study was to examine the vocal characteristics of functional aphonia and the effect of voice therapy applied accordingly. Materials and Method: From October 2019 to December 2020, 11 patients with functional aphonia were treated with voice therapy that proceeded in three stages: vocal hygiene, trial therapy, and behavioral therapy. Of these, 7 patients who completed the voice evaluation before and after voice therapy were enrolled in this study. Clinical information such as sex, age, symptoms, duration, social and medical history, the process of voice therapy, and subjective and objective findings was analyzed by retrospective chart review. Voice parameters before and after voice therapy were compared. Results: In the GRBAS scale, grade, rough, and asthenic, and in the Consensus Auditory-Perceptual Evaluation of Voice, overall severity, roughness, pitch, and loudness were significantly improved after voice therapy. In the Voice Handicap Index, the total score and all sub-category scores decreased significantly. In objective voice analysis, jitter, cepstral peak prominence, and maximum phonation time were significantly improved. Conclusion: Voice therapy was effective for the treatment of functional aphonia, restoring the patient's vocalization and improving voice quality, pitch, and loudness.

Determinant-based two-channel noise reduction method using speech presence probability (음성존재확률을 이용한 행렬식 기반 2채널 잡음제거기법)

  • Park, Jinuk;Hong, Jungpyo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.5
    • /
    • pp.649-655
    • /
    • 2022
  • In this paper, a determinant-based two-channel noise reduction method that utilizes speech presence probability (SPP) is proposed. The proposed method improves noise reduction performance over the conventional determinant-based two-channel noise reduction method in [7] by applying SPP to the Wiener filter gain. Consequently, the proposed method adaptively controls the amount of noise reduction depending on the SPP. For performance evaluation, the segmental signal-to-noise ratio (SNR), the perceptual evaluation of speech quality, the short-time objective intelligibility, and the log spectral distance were measured in simulated noisy environments that varied the type of noise, reverberation, SNR, and the direction and number of noise sources. The experimental results showed that determinant-based methods outperform phase difference-based methods in most cases. In particular, the proposed method achieved the best noise reduction performance while maintaining minimum speech distortion.
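    The idea of letting the SPP control the amount of noise reduction can be sketched as follows. The abstract does not give the paper's gain formula, so this is only one common way to combine the two, stated here as an assumption: the Wiener gain ξ / (1 + ξ) is used in full when speech is certainly present, and the gain falls back to a small floor when speech is certainly absent. The names `wiener_gain`, `spp_weighted_gain`, and `gain_floor` are hypothetical.

```python
def wiener_gain(snr_prior):
    """Classic Wiener gain for an a priori SNR given on a linear scale."""
    if snr_prior < 0.0:
        raise ValueError("a priori SNR must be non-negative")
    return snr_prior / (1.0 + snr_prior)


def spp_weighted_gain(snr_prior, spp, gain_floor=0.1):
    """Blend the Wiener gain with a gain floor according to the SPP.

    Assumption (not the paper's formula): a linear mix, so spp = 1 gives
    the plain Wiener gain and spp = 0 gives the floor.
    """
    if not 0.0 <= spp <= 1.0:
        raise ValueError("speech presence probability must lie in [0, 1]")
    return spp * wiener_gain(snr_prior) + (1.0 - spp) * gain_floor


if __name__ == "__main__":
    # Same a priori SNR, decreasing confidence that speech is present.
    for probability in (1.0, 0.5, 0.0):
        print(probability, spp_weighted_gain(3.0, probability))
```

    With this blend, time-frequency bins judged unlikely to contain speech are attenuated more strongly than the Wiener gain alone would attenuate them.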

Blind Noise Separation Method of Convolutive Mixed Signals (컨볼루션 혼합신호의 암묵 잡음분리방법)

  • Lee, Haeng-Woo
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.17 no.3
    • /
    • pp.409-416
    • /
    • 2022
  • This paper presents a blind noise separation method for time-delayed convolutive mixed signals. Since the mixing model of acoustic signals in a closed space is multi-channel, a convolutive blind signal separation method is applied, and time-delayed data samples of the two microphone input signals are used. For signal separation, the mixing coefficient is calculated using an inverse model rather than by calculating the separation coefficient directly, and the coefficient is updated iteratively on the basis of second-order statistical properties to estimate the speech signal. Many simulations were performed to verify the performance of the proposed blind signal separation. The simulation results show that noise separation with this method operates stably regardless of convolutive mixing, and that PESQ improves by 0.3 points compared to the general adaptive FIR filter structure.

Analysis on Subjective Image Quality Assessments for 4K-UHD Video Viewing Environments (4K-UHD 비디오 시청환경 특성분석을 위한 주관적 화질평가 분석)

  • Park, In-Kyung;Ha, Kwang-Sung;Kim, Mun-Churl;Cho, Suk-Hee;Cho, Jin-Soo
    • Journal of Broadcast Engineering
    • /
    • v.15 no.4
    • /
    • pp.563-581
    • /
    • 2010
  • In this paper, we perform subjective visual quality assessments on UHD video for UHD TV services and analyze the results. Demand for video services has increased with the availability of DTV, the Internet, and personal media equipment, and demand for high-definition video has grown with it. Currently, 2K-HD ($1920{\times}1080$) video is widely consumed over DTV, DVD, digital camcorders, security cameras, and other multimedia terminals. Recently, 4K-UHD ($3840{\times}2160$) digital cinema content has become popular, and cameras, beam projectors, and display panels that support 4K-UHD video are beginning to reach the multimedia market. 4K-UHD services are also expected to appear soon in broadcasting and telecommunications environments. Therefore, we carried out subjective assessments of visual quality across resolutions, color formats, frame rates, and compression rates to provide baseline information for standardizing UHD video signal specifications and viewing environments for future UHDTV. In the assessments, evaluators rated UHD video as having better subjective visual quality than HD video. The 4K-UHD test sequences in YUV444 showed better subjective visual quality than those in YUV422 and YUV420, but there was little perceptual difference between the YUV422 and YUV420 formats. For frame rate, 4K-UHD test sequences at 60 fps gave better subjective visual quality than those at 30 fps. For bit depth, HD test sequences at 10-bit depth were barely distinguishable from those at 8-bit depth in the subjective visual quality assessment. The larger the PSNR of the reconstructed 4K-UHD test sequences, the higher the subjective visual quality. Finally, with respect to viewing distance, differences among the encoded 4K-UHD test sequences were harder to distinguish at longer distances from the display.
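    The PSNR mentioned in the abstract above is an objective measure computed from the mean squared error between an original and a reconstructed frame: PSNR = 10 · log10(MAX² / MSE). A minimal sketch follows, assuming 8-bit samples (MAX = 255) passed as flat lists of pixel values; the function name `psnr` and the `max_value` parameter are illustrative and not taken from the paper.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if not reference or len(reference) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical signals: no distortion to measure
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    original = [10, 20, 30, 40]
    decoded = [10, 22, 30, 38]
    print(round(psnr(original, decoded), 2))
```

    For 10-bit video, `max_value` would be 1023 under the same definition.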

The Usefulness of Product Display of Online Store by the Product Type of Usage Situation - Focusing on the moderate effect of the product portability - (사용상황별 제품유형에 따른 온라인 점포 제품디스플레이의 유용성 - 제품 휴대성의 조절효과를 중심으로 -)

  • Lee, Dong-Il;Choi, Seung-Hoon
    • Journal of Distribution Research
    • /
    • v.16 no.2
    • /
    • pp.1-24
    • /
    • 2011
  • 1. Introduction: In contrast to the offline purchasing environment, an online store cannot offer consumers the sense of touch or direct visual information about its products. The builder of an online shopping mall should therefore provide more concrete and detailed product information (Kim 2008), and Alba (1997) also predicted that the quality of the offered information is determined by the post-purchase consumer satisfaction. In practice, many fashion and apparel online shopping malls show the product on a real-person model to enhance the usefulness of the product information. On the other hand, virtual product experience has been suggested as a way of overcoming online consumers' limited perceptual capability (Jiang & Benbasat 2005). However, adopting and operating virtual reality (VR) tools requires high investment and technical expertise compared to text or picture product information (Shaffer 2006). This can create an entry barrier to online shopping for small retailers, and it can sometimes demand a high level of perceptual effort from consumers. An expensive technological solution could therefore negatively affect the consumer decision-making process. Nevertheless, most previous research on online product information provision suggests that VR is the more effective tool. 2. Research Model and Hypothesis: The research model suggests that the VR effect could be moderated by product type according to usage situation. Product types are defined as portable products and installed products, and information types as a still picture of the product, a picture of the product with a real-person model, and VR. 3. Methods and Results: 3.1. Experimental design and measured variables: We designed a 2 (product types) X 3 (product information types) experimental setting and measured dependent variables such as information usefulness, attitude toward the shopping mall, overall product quality, purchase intention, and revisiting intention. Information usefulness and attitude toward the shopping mall were measured with multi-item scales. In the reliability test, the Cronbach's alpha of each variable exceeded 0.6, so the internal consistency of the items was ensured. 3.2. Manipulation check: The main concern of this study is to verify the moderating effect of product type by usage situation. The manipulation check indicates that our experimental manipulation of the moderating effect of product type was successful. 3.3. Results: There was a significant main effect of information type on only one dependent variable (attitude toward the shopping mall). As predicted, VR had the highest mean value compared to the other information types. Thus, H1 was partially supported. However, no main effect of product type was found. To evaluate H2 and H3, a two-way ANOVA was conducted. Interaction effects between information type and product type were found on three dependent variables (information usefulness, overall product quality, and purchase intention). As predicted, the picture of the product with the real-person model had the highest mean among the information types for the portable product, whereas VR had the highest mean for the installed product. Thus, H2 and H3 were supported. 4. Implications: The present study found a moderating effect of product type by usage situation. Based on the findings, the following managerial implications are proposed. First, information type was found to affect only the attitude toward the shopping mall. This means that VR effects alone are not enough to help consumers understand the product itself, so retailers must consider when and how to use VR tools. Second, interaction effects were found on information usefulness, overall product quality, and purchase intention. This suggests that considering the usage situation helps consumers understand the product and promotes their purchase intention. In conclusion, online retailers must fully consider not only product attributes but also product usage situations when they want to meet the needs of consumers.
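    The reliability check in section 3.1 of the abstract above relies on Cronbach's alpha, which for k items is α = k / (k − 1) · (1 − Σ item variances / variance of the summed score). The sketch below is a generic illustration with made-up toy responses, not the study's data; the function name `cronbach_alpha` is hypothetical, and sample variances are used as an assumption.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a multi-item scale.

    `items` is a list of k lists, one per item, each holding the scores of
    the same respondents in the same order. Sample variances are used.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if n < 2 or any(len(item) != n for item in items):
        raise ValueError("every item needs the same number (>= 2) of responses")
    item_variance_sum = sum(variance(item) for item in items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent sum score
    return k / (k - 1) * (1.0 - item_variance_sum / variance(totals))


if __name__ == "__main__":
    # Toy responses from three respondents to a two-item scale.
    usefulness_items = [[1, 2, 3], [2, 4, 6]]
    print(round(cronbach_alpha(usefulness_items), 3))
```

    A scale would pass the abstract's stated criterion when the returned value exceeds 0.6.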

  • A Research on the Influences on the Intention to be Continuously Subscribed to the Pension Service -Centered on the Small and Medium-sized Enterprises Science and Technology Pension (연금서비스의 지속가입의도에 영향을 미치는 요인에 관한 연구 -중소기업 과학기술인연금을 중심으로)

    • Jung, Soo-Yong;Shin, Yong-Tae;Koh, In-Soo
      • Journal of Digital Convergence
      • /
      • v.16 no.5
      • /
      • pp.85-95
      • /
      • 2018
    • This study surveyed scientists and technicians at small and medium-sized enterprises who subscribe to the pension service. It examined the factors that influence their intention to continue subscribing, and then compared subscribers of the two types the pension service provides, the safety type and the profit type. Data were collected through a questionnaire survey and analyzed empirically. Reliability and feasibility analyses were carried out with a statistical program, and the fit of the structural equation model was tested. Finally, the hypotheses were verified through the research model and the differences between the groups were analyzed. Among the service quality factors, reliability and responsiveness had a positive influence on the perceived value, which is a parameter, whereas materiality and perceptual openness had no influence. Stability and usefulness, which are attributes of the pension service, also had positive influences on the perceived value. Finally, the perceived value of the pension service had a positive influence on the intention to subscribe continuously. These results can contribute to the invigoration of the pension service and to the provision of a better pension service than the existing one.

    A CELP Coder using the Band-Divided Long Term Prediction (대역 분할 장구간 예측을 이용한 CELP 부호화기)

    • Choi, Young-Soo;Kang, Hong-Goo;Lim, Myoung-Seob;Ahn, Dong-Soon;Youn, Dae-Hee
      • The Journal of the Acoustical Society of Korea
      • /
      • v.14 no.4
      • /
      • pp.38-45
      • /
      • 1995
    • In this paper, a way to improve the performance of long term prediction is proposed, which adopts the Multi-band Excitation (MBE) method in addition to the Code-Excited Linear Prediction (CELP) method at low bit rates below 4.8 kbps. In the proposed method, multiband long term prediction is performed on the periodic components that still remain after the long term prediction of the conventional CELP method. The whole frequency region is divided into subbands whose size equals the spacing between the harmonics of the fundamental frequency, and the periodic multiband excitation signals are represented as a sum of sine waves approximately as large as the spectrum of the excitation signals, so that the actual characteristics of the excitation signals are better taken into account. To evaluate the performance of the proposed method, computer simulation was performed at 4.8 kbps. The 4.8 kbps DoD CELP and the 4.4 kbps IMBE were chosen as the reference vocoders for the speech quality measure. The perceptual speech quality measure showed that the proposed method performs better than the 4.8 kbps DoD CELP vocoder and similarly to the 4.4 kbps IMBE vocoder.

    (34141) Korea Institute of Science and Technology Information, 245, Daehak-ro, Yuseong-gu, Daejeon
    Copyright (C) KISTI. All Rights Reserved.