• 제목/요약/키워드: feature enhancement

Search Result 258, Processing Time 0.032 seconds

Comparative Study of Sonar Image Processing for Underwater Navigation (항법 적용을 위한 수중 소나 영상 처리 요소 기법 비교 분석)

  • Shin, Young-Sik;Cho, Younggun;Lee, Yeongjun;Choi, Hyun-Taek;Kim, Ayoung
    • Journal of Ocean Engineering and Technology
    • /
    • v.30 no.3
    • /
    • pp.214-220
    • /
    • 2016
  • Imaging sonars such as side-scanning sonar or forward-looking sonar are becoming fundamental sensors in the underwater robotics field. However, using sonar images for underwater perception presents many challenges. Sonar images are usually low resolution with inherent speckled noise. To overcome the limited sensor information for underwater perception, we investigated preprocessing methods for sonar images and feature detection methods for a nonlinear scale space. In this paper, we focus on a comparative analysis of (1) preprocessing for sonar images and (2) the feature detection performance in relation to the scale space composition.

Estimation of speech feature vectors and enhancement of speech recognition performance using lip information (입술정보를 이용한 음성 특징 파라미터 추정 및 음성인식 성능향상)

  • Min So-Hee;Kim Jin-Young;Choi Seung-Ho
    • MALSORI
    • /
    • no.44
    • /
    • pp.83-92
    • /
    • 2002
  • Speech recognition performance is severly degraded under noisy envrionments. One approach to cope with this problem is audio-visual speech recognition. In this paper, we discuss the experiment results of bimodal speech recongition based on enhanced speech feature vectors using lip information. We try various kinds of speech features as like linear predicion coefficient, cepstrum, log area ratio and etc for transforming lip information into speech parameters. The experimental results show that the cepstrum parameter is the best feature in the point of reconition rate. Also, we present the desirable weighting values of audio and visual informations depending on signal-to-noiso ratio.

  • PDF

Enhancement of Ship's Wheel Order Recognition System using Speaker's Intention Predictive Parameters (화자의도예측 파라미터를 이용한 조타명령 음성인식 시스템의 개선)

  • Moon, Serng-Bae
    • Journal of Advanced Marine Engineering and Technology
    • /
    • v.32 no.5
    • /
    • pp.791-797
    • /
    • 2008
  • The officer of the deck(OOD) may sometimes have to carry out lookout as well as handling of auto pilot without a quartermaster at sea. The purpose of this paper is to develop the ship's auto pilot control module using speech recognition in order to reduce the potential risk of one man bridge system. The feature parameters predicting the OOD's intention was extracted from the sample wheel orders written in SMCP(IMO Standard Marine Communication Phrases). We designed a pre-recognition procedure which could make some candidate words using DTW(Dynamic Time Warping) algorithm, a post-recognition procedure which made a final decision from the candidate words using the feature parameters. To evaluate the effectiveness of these procedures the experiment was conducted with 500 wheel orders.

Variation Analysis of Feature Parameters According to the Channel Distortion of Korean Telephone Digit Speech (한국어 숫자음 전화음성의 채널왜곡에 따른 특징파라미터의 변이 분석)

  • 정성윤;손종목;김민성;배건성
    • Proceedings of the IEEK Conference
    • /
    • 2002.06d
    • /
    • pp.191-194
    • /
    • 2002
  • The final purpose of this paper is the enhancement of speech recognition rate under the matched telephone environment between training data and test data. To analyze the effect by the distortion of the changing telephone channel on every call, MFCC is used as the feature parameter and CMN, RTCN, and RASTA are used as channel compensation techniques. For each case, the variation of feature parameters of all phones is analyzed. And, we find recognition rates according to each compensation method using the continuous HMM recognizer, and examine the relationship between variation and recognition rate.

  • PDF

Enhanced and applicable algorithm for Big-Data by Combining Sparse Auto-Encoder and Load-Balancing, ProGReGA-KF

  • Kim, Hyunah;Kim, Chayoung
    • International Journal of Advanced Culture Technology
    • /
    • v.9 no.1
    • /
    • pp.218-223
    • /
    • 2021
  • Pervasive enhancement and required enforcement of the Internet of Things (IoTs) in a distributed massively multiplayer online architecture have effected in massive growth of Big-Data in terms of server over-load. There have been some previous works to overcome the overloading of server works. However, there are lack of considered methods, which is commonly applicable. Therefore, we propose a combing Sparse Auto-Encoder and Load-Balancing, which is ProGReGA for Big-Data of server loads. In the process of Sparse Auto-Encoder, when it comes to selection of the feature-pattern, the less relevant feature-pattern could be eliminated from Big-Data. In relation to Load-Balancing, the alleviated degradation of ProGReGA can take advantage of the less redundant feature-pattern. That means the most relevant of Big-Data representation can work. In the performance evaluation, we can find that the proposed method have become more approachable and stable.

End-to-End Learning-based Spatial Scalable Image Compression with Multi-scale Feature Fusion Module (다중 스케일 특징 융합 모듈을 통한 종단 간 학습기반 공간적 스케일러블 영상 압축)

  • Shin Juyeon;Kang Jewon
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2022.11a
    • /
    • pp.1-3
    • /
    • 2022
  • 최근 기존의 영상 압축 파이프라인 대신 신경망의 종단 간 학습을 통해 압축을 수행하는 알고리즘의 연구가 활발히 진행되고 있다. 본 논문은 종단 간 학습 기반 공간적 스케일러블 압축 기술을 제안한다. 보다 구체적으로 본 논문은 신경망의 각 계층에서 하위 계층의 학습된 특징 (feature)을 융합하여 상위 계층으로 전달하는 다중 스케일 특징 융합 (multi-scale feature fusion) 모듈을 도입해 상위 계층이 더욱 풍부한 특징 정보를 학습하고 계층 사이의 특징 중복성을 더욱 잘 제거할 수 있도록 한다. 기존 방법 대비 향상 계층(enhancement layer)에서 1.37%의 BD-rate가 향상된 결과를 볼 수 있다.

  • PDF

Improvement in Supervector Linear Kernel SVM for Speaker Identification Using Feature Enhancement and Training Length Adjustment (특징 강화 기법과 학습 데이터 길이 조절에 의한 Supervector Linear Kernel SVM 화자식별 개선)

  • So, Byung-Min;Kim, Kyung-Wha;Kim, Min-Seok;Yang, Il-Ho;Kim, Myung-Jae;Yu, Ha-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.30 no.6
    • /
    • pp.330-336
    • /
    • 2011
  • In this paper, we propose a new method to improve the performance of supervector linear kernel SVM (Support Vector Machine) for speaker identification. This method is based on splitting one training datum into several pieces of utterances. We use four different databases for evaluating performance and use PCA (Principal Component Analysis), GKPCA (Greedy Kernel PCA) and KMDA (Kernel Multimodal Discriminant Analysis) for feature enhancement. As a result, the proposed method shows improved performance for speaker identification using supervector linear kernel SVM.

AANet: Adjacency auxiliary network for salient object detection

  • Li, Xialu;Cui, Ziguan;Gan, Zongliang;Tang, Guijin;Liu, Feng
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.10
    • /
    • pp.3729-3749
    • /
    • 2021
  • At present, deep convolution network-based salient object detection (SOD) has achieved impressive performance. However, it is still a challenging problem to make full use of the multi-scale information of the extracted features and which appropriate feature fusion method is adopted to process feature mapping. In this paper, we propose a new adjacency auxiliary network (AANet) based on multi-scale feature fusion for SOD. Firstly, we design the parallel connection feature enhancement module (PFEM) for each layer of feature extraction, which improves the feature density by connecting different dilated convolution branches in parallel, and add channel attention flow to fully extract the context information of features. Then the adjacent layer features with close degree of abstraction but different characteristic properties are fused through the adjacent auxiliary module (AAM) to eliminate the ambiguity and noise of the features. Besides, in order to refine the features effectively to get more accurate object boundaries, we design adjacency decoder (AAM_D) based on adjacency auxiliary module (AAM), which concatenates the features of adjacent layers, extracts their spatial attention, and then combines them with the output of AAM. The outputs of AAM_D features with semantic information and spatial detail obtained from each feature are used as salient prediction maps for multi-level feature joint supervising. Experiment results on six benchmark SOD datasets demonstrate that the proposed method outperforms similar previous methods.

Experiment on Low Light Image Enhancement and Feature Extraction Methods for Rover Exploration in Lunar Permanently Shadowed Region (달 영구음영지역에서 로버 탐사를 위한 저조도 영상강화 및 영상 특징점 추출 성능 실험)

  • Park, Jae-Min;Hong, Sungchul;Shin, Hyu-Soung
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.42 no.5
    • /
    • pp.741-749
    • /
    • 2022
  • Major space agencies are planning for the rover-based lunar exploration since water-ice was detected in permanently shadowed regions (PSR). Although sunlight does not directly reach the PSRs, it is expected that reflected sunlight sustains a certain level of low-light environment. In this research, the indoor testbed was made to simulate the PSR's lighting and topological conditions, to which low light enhancement methods (CLAHE, Dehaze, RetinexNet, GLADNet) were applied to restore image brightness and color as well as to investigate their influences on the performance of feature extraction and matching methods (SIFT, SURF, ORB, AKAZE). The experiment results show that GLADNet and Dehaze images in order significantly improve image brightness and color. However, the performance of the feature extraction and matching methods were improved by Dehaze and GLADNet images in order, especially for ORB and AKAZE. Thus, in the lunar exploration, Dehaze is appropriate for building 3D topographic map whereas GLADNet is adequate for geological investigation.

Two-Microphone Binary Mask Speech Enhancement in Diffuse and Directional Noise Fields

  • Abdipour, Roohollah;Akbari, Ahmad;Rahmani, Mohsen
    • ETRI Journal
    • /
    • v.36 no.5
    • /
    • pp.772-782
    • /
    • 2014
  • Two-microphone binary mask speech enhancement (2mBMSE) has been of particular interest in recent literature and has shown promising results. Current 2mBMSE systems rely on spatial cues of speech and noise sources. Although these cues are helpful for directional noise sources, they lose their efficiency in diffuse noise fields. We propose a new system that is effective in both directional and diffuse noise conditions. The system exploits two features. The first determines whether a given time-frequency (T-F) unit of the input spectrum is dominated by a diffuse or directional source. A diffuse signal is certainly a noise signal, but a directional signal could correspond to a noise or speech source. The second feature discriminates between T-F units dominated by speech or directional noise signals. Speech enhancement is performed using a binary mask, calculated based on the proposed features. In both directional and diffuse noise fields, the proposed system segregates speech T-F units with hit rates above 85%. It outperforms previous solutions in terms of signal-to-noise ratio and perceptual evaluation of speech quality improvement, especially in diffuse noise conditions.