Search | Korea Science

Comparative Study of Sonar Image Processing for Underwater Navigation (항법 적용을 위한 수중 소나 영상 처리 요소 기법 비교 분석)

Shin, Young-Sik;Cho, Younggun;Lee, Yeongjun;Choi, Hyun-Taek;Kim, Ayoung
Journal of Ocean Engineering and Technology, v.30 no.3, pp.214-220, 2016
Imaging sonars such as side-scan sonar and forward-looking sonar are becoming fundamental sensors in underwater robotics. However, using sonar images for underwater perception presents many challenges: sonar images usually have low resolution and inherent speckle noise. To overcome the limited sensor information available for underwater perception, we investigated preprocessing methods for sonar images and feature detection methods for a nonlinear scale space. In this paper, we focus on a comparative analysis of (1) preprocessing for sonar images and (2) feature detection performance in relation to the scale space composition.
https://doi.org/10.5574/KSOE.2016.30.3.214

Estimation of speech feature vectors and enhancement of speech recognition performance using lip information (입술정보를 이용한 음성 특징 파라미터 추정 및 음성인식 성능향상)

Min So-Hee;Kim Jin-Young;Choi Seung-Ho
MALSORI, no.44, pp.83-92, 2002
Speech recognition performance is severely degraded in noisy environments. One approach to this problem is audio-visual speech recognition. In this paper, we discuss experimental results for bimodal speech recognition based on speech feature vectors enhanced using lip information. We try various speech features, such as linear prediction coefficients, cepstrum, and log area ratio, for transforming lip information into speech parameters. The experimental results show that the cepstrum parameter is the best feature in terms of recognition rate. We also present desirable weighting values for audio and visual information as a function of signal-to-noise ratio.

Enhancement of Ship's Wheel Order Recognition System using Speaker's Intention Predictive Parameters (화자의도예측 파라미터를 이용한 조타명령 음성인식 시스템의 개선)

Moon, Serng-Bae
Journal of Advanced Marine Engineering and Technology, v.32 no.5, pp.791-797, 2008
The officer of the deck (OOD) may sometimes have to keep lookout as well as operate the autopilot without a quartermaster at sea. The purpose of this paper is to develop a ship's autopilot control module using speech recognition in order to reduce the potential risk of a one-man bridge system. Feature parameters predicting the OOD's intention were extracted from sample wheel orders written in SMCP (IMO Standard Marine Communication Phrases). We designed a pre-recognition procedure that generates candidate words using the DTW (Dynamic Time Warping) algorithm, and a post-recognition procedure that makes a final decision among the candidate words using the feature parameters. To evaluate the effectiveness of these procedures, an experiment was conducted with 500 wheel orders.
https://doi.org/10.5916/jkosme.2008.32.5.791
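
The DTW-based pre-recognition step in this abstract, which shortlists candidate words for a later decision stage, might be sketched as follows. This is a minimal illustration and not the paper's implementation: the one-dimensional feature sequences, the absolute-difference local cost, the template dictionary, and the names `dtw_distance` and `candidate_words` are all assumptions made for the example.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch b
                cost[i][j - 1],      # stretch a
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


def candidate_words(query, templates, k=2):
    """Return the k template words closest to the query under DTW."""
    ranked = sorted(templates, key=lambda w: dtw_distance(query, templates[w]))
    return ranked[:k]


# Hypothetical toy templates standing in for wheel-order feature sequences.
templates = {
    "port": [1, 2, 3],
    "midships": [1, 2, 4],
    "starboard": [5, 5, 5],
}
shortlist = candidate_words([1, 2, 2, 3], templates, k=2)
```

A post-recognition stage like the one the abstract describes would then choose among the entries of `shortlist` using additional information; that stage is not sketched here because the abstract does not specify it.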

Variation Analysis of Feature Parameters According to the Channel Distortion of Korean Telephone Digit Speech (한국어 숫자음 전화음성의 채널왜곡에 따른 특징파라미터의 변이 분석)

정성윤;손종목;김민성;배건성
Proceedings of the IEEK Conference, 2002.06d, pp.191-194, 2002
The ultimate goal of this paper is to improve the speech recognition rate when the training and test data come from matched telephone environments. To analyze the effect of telephone channel distortion, which changes on every call, MFCC is used as the feature parameter and CMN, RTCN, and RASTA are used as channel compensation techniques. For each case, the variation of the feature parameters of all phones is analyzed. We then obtain the recognition rate for each compensation method using a continuous HMM recognizer and examine the relationship between variation and recognition rate.
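
Of the compensation techniques named in this abstract, CMN (cepstral mean normalization) is the simplest to illustrate. A fixed channel shows up as an approximately constant additive offset in the cepstral domain, so subtracting the per-coefficient mean over an utterance removes it. The sketch below is a generic illustration under that assumption, not the authors' code; the function name and the toy frame values are hypothetical.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean over all frames of one utterance.

    frames: list of equal-length lists, one cepstral vector per frame.
    """
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in frames]


# Toy data: two frames of two coefficients each (values are made up).
clean = [[1.0, 2.0], [3.0, 6.0]]
channel_offset = [0.5, -1.0]  # constant offset standing in for one call's channel
distorted = [[c + o for c, o in zip(f, channel_offset)] for f in clean]

normalized_clean = cepstral_mean_normalize(clean)
normalized_distorted = cepstral_mean_normalize(distorted)
```

After normalization the two utterances coincide, which is why CMN reduces call-to-call variation of the feature parameters when the channel is roughly stationary within a call.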

Enhanced and applicable algorithm for Big-Data by Combining Sparse Auto-Encoder and Load-Balancing, ProGReGA-KF

Kim, Hyunah;Kim, Chayoung
International Journal of Advanced Culture Technology, v.9 no.1, pp.218-223, 2021
Pervasive enhancement and required enforcement of the Internet of Things (IoT) in a distributed massively multiplayer online architecture have resulted in massive growth of Big-Data in terms of server overload. Some previous works have tried to overcome server overloading. However, there is a lack of well-considered methods that are commonly applicable. Therefore, we propose combining Sparse Auto-Encoder and Load-Balancing, namely ProGReGA, for Big-Data of server loads. In the Sparse Auto-Encoder process, when feature-patterns are selected, the less relevant feature-patterns can be eliminated from Big-Data. In relation to Load-Balancing, the alleviated degradation of ProGReGA can take advantage of the less redundant feature-patterns. That means the most relevant Big-Data representation can work. In the performance evaluation, we find that the proposed method has become more approachable and stable.
https://doi.org/10.17703/IJACT.2021.9.1.218

End-to-End Learning-based Spatial Scalable Image Compression with Multi-scale Feature Fusion Module (다중 스케일 특징 융합 모듈을 통한 종단 간 학습기반 공간적 스케일러블 영상 압축)

Shin Juyeon;Kang Jewon
Proceedings of the Korean Society of Broadcast Engineers Conference, 2022.11a, pp.1-3, 2022
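Recently, algorithms that perform compression through end-to-end learning of neural networks, instead of the conventional image compression pipeline, have been actively studied. This paper proposes an end-to-end learning-based spatially scalable compression technique. More specifically, it introduces a multi-scale feature fusion module that, at each layer of the neural network, fuses the learned features of the lower layer and passes them to the upper layer, so that the upper layer can learn richer feature information and better remove feature redundancy between layers. Compared with the existing method, a BD-rate improvement of 1.37% is observed in the enhancement layer.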

Improvement in Supervector Linear Kernel SVM for Speaker Identification Using Feature Enhancement and Training Length Adjustment (특징 강화 기법과 학습 데이터 길이 조절에 의한 Supervector Linear Kernel SVM 화자식별 개선)

So, Byung-Min;Kim, Kyung-Wha;Kim, Min-Seok;Yang, Il-Ho;Kim, Myung-Jae;Yu, Ha-Jin
The Journal of the Acoustical Society of Korea, v.30 no.6, pp.330-336, 2011
In this paper, we propose a new method to improve the performance of supervector linear kernel SVM (Support Vector Machine) for speaker identification. This method is based on splitting one training datum into several pieces of utterances. We use four different databases for evaluating performance and use PCA (Principal Component Analysis), GKPCA (Greedy Kernel PCA) and KMDA (Kernel Multimodal Discriminant Analysis) for feature enhancement. As a result, the proposed method shows improved performance for speaker identification using supervector linear kernel SVM.
https://doi.org/10.7776/ASK.2011.30.6.330

AANet: Adjacency auxiliary network for salient object detection

Li, Xialu;Cui, Ziguan;Gan, Zongliang;Tang, Guijin;Liu, Feng
KSII Transactions on Internet and Information Systems (TIIS), v.15 no.10, pp.3729-3749, 2021
At present, deep convolutional network-based salient object detection (SOD) has achieved impressive performance. However, making full use of the multi-scale information in the extracted features and choosing an appropriate feature fusion method to process the feature maps remain challenging problems. In this paper, we propose a new adjacency auxiliary network (AANet) based on multi-scale feature fusion for SOD. First, we design a parallel connection feature enhancement module (PFEM) for each layer of feature extraction, which improves the feature density by connecting different dilated convolution branches in parallel and adds a channel attention flow to fully extract the context information of the features. Then the adjacent layer features, which have a similar degree of abstraction but different characteristic properties, are fused through the adjacency auxiliary module (AAM) to eliminate the ambiguity and noise of the features. In addition, to refine the features effectively and obtain more accurate object boundaries, we design an adjacency decoder (AAM_D) based on the AAM, which concatenates the features of adjacent layers, extracts their spatial attention, and then combines them with the output of the AAM. The outputs of AAM_D, which carry semantic information and spatial detail obtained from each feature, are used as salient prediction maps for multi-level joint feature supervision. Experimental results on six benchmark SOD datasets demonstrate that the proposed method outperforms similar previous methods.
https://doi.org/10.3837/tiis.2021.10.014

Experiment on Low Light Image Enhancement and Feature Extraction Methods for Rover Exploration in Lunar Permanently Shadowed Region (달 영구음영지역에서 로버 탐사를 위한 저조도 영상강화 및 영상 특징점 추출 성능 실험)

Park, Jae-Min;Hong, Sungchul;Shin, Hyu-Soung
KSCE Journal of Civil and Environmental Engineering Research, v.42 no.5, pp.741-749, 2022
Major space agencies have been planning rover-based lunar exploration since water ice was detected in permanently shadowed regions (PSRs). Although sunlight does not directly reach the PSRs, reflected sunlight is expected to sustain a certain level of low-light environment. In this research, an indoor testbed was built to simulate the PSR's lighting and topological conditions. Low-light enhancement methods (CLAHE, Dehaze, RetinexNet, GLADNet) were applied to the testbed images to restore image brightness and color, and their influence on the performance of feature extraction and matching methods (SIFT, SURF, ORB, AKAZE) was investigated. The experimental results show that GLADNet and Dehaze, in that order, most significantly improve image brightness and color. However, the performance of the feature extraction and matching methods was improved most by Dehaze and GLADNet, in that order, especially for ORB and AKAZE. Thus, in lunar exploration, Dehaze is appropriate for building 3D topographic maps, whereas GLADNet is adequate for geological investigation.
https://doi.org/10.12652/Ksce.2022.42.5.0741

Two-Microphone Binary Mask Speech Enhancement in Diffuse and Directional Noise Fields

Abdipour, Roohollah;Akbari, Ahmad;Rahmani, Mohsen
ETRI Journal, v.36 no.5, pp.772-782, 2014
Two-microphone binary mask speech enhancement (2mBMSE) has been of particular interest in recent literature and has shown promising results. Current 2mBMSE systems rely on spatial cues of speech and noise sources. Although these cues are helpful for directional noise sources, they lose their efficiency in diffuse noise fields. We propose a new system that is effective in both directional and diffuse noise conditions. The system exploits two features. The first determines whether a given time-frequency (T-F) unit of the input spectrum is dominated by a diffuse or directional source. A diffuse signal is certainly a noise signal, but a directional signal could correspond to a noise or speech source. The second feature discriminates between T-F units dominated by speech or directional noise signals. Speech enhancement is performed using a binary mask, calculated based on the proposed features. In both directional and diffuse noise fields, the proposed system segregates speech T-F units with hit rates above 85%. It outperforms previous solutions in terms of signal-to-noise ratio and perceptual evaluation of speech quality improvement, especially in diffuse noise conditions.
https://doi.org/10.4218/etrij.14.0113.0917
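
The binary-mask idea in this abstract can be illustrated with a small sketch. It does not reproduce the paper's two features: as a stand-in, it assumes the speech and noise power of every time-frequency (T-F) unit is known (an oracle mask) and keeps a unit only when speech dominates. The function names, the 0 dB threshold, and the toy values are all assumptions, and the hit rate is taken here to mean the fraction of speech-dominated units that an estimated mask keeps.

```python
def binary_mask(speech_power, noise_power, threshold_db=0.0):
    """1 where the local SNR exceeds threshold_db, else 0 (oracle mask)."""
    ratio = 10.0 ** (threshold_db / 10.0)
    return [
        [1 if s > n * ratio else 0 for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]


def apply_mask(mixture, mask):
    """Zero out the T-F units of the mixture that the mask rejects."""
    return [
        [x * m for x, m in zip(x_row, m_row)]
        for x_row, m_row in zip(mixture, mask)
    ]


def hit_rate(estimated, ideal):
    """Fraction of speech-dominated units (ideal == 1) kept by the estimate."""
    pairs = [
        (e, i)
        for e_row, i_row in zip(estimated, ideal)
        for e, i in zip(e_row, i_row)
    ]
    speech_units = [e for e, i in pairs if i == 1]
    if not speech_units:
        return 0.0
    return sum(speech_units) / len(speech_units)


# Toy 2x2 spectrogram powers (made-up values).
speech = [[4.0, 1.0], [0.5, 9.0]]
noise = [[1.0, 2.0], [1.0, 1.0]]
ideal = binary_mask(speech, noise)
enhanced = apply_mask([[5.0, 3.0], [1.5, 10.0]], ideal)
```

In a real system the mask would be estimated from observable cues rather than from the separate speech and noise powers, which are not available at run time.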

