• Title/Summary/Keyword: labeling technique (레이블링 기법)

Speech Recognition in the Pager System displaying Defined Sentences (문자출력 무선호출기를 위한 음성인식 시스템)

  • Park, Gyu-Bong;Park, Jeon-Gue;Suh, Sang-Weon;Hwang, Doo-Sung;Kim, Hyun-Bin;Han, Mun-Sung
    • Annual Conference on Human and Language Technology
    • /
    • 1996.10a
    • /
    • pp.158-162
    • /
    • 1996
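  • This paper describes a specialized speech recognition system that applies speech recognition technology to a pager capable of displaying text. In operation, once a caller connects to the speech recognition server, the server recognizes the caller's naturally spoken input and outputs the result as a sentence on the called party's pager terminal. The system adopts a statistical speech recognition approach and models each word as a continuous HMM. Each model, which uses Gaussian mixture probability density functions, is trained with the Baum-Welch algorithm, one of the traditional HMM training methods, and at recognition time Viterbi beam search is applied to the models to obtain the best result. A 26-dimensional feature vector combining MFCCs and power is extracted from each frame, and models are ultimately built for 83 domain vocabulary words and for special vocabulary items such as silence. These models are combined with an FSN that performs both syntactic and semantic functions to form a continuous speech recognition system for spontaneous speech. Beyond the above, the paper examines factors external to the system that directly affect its performance, such as the speech database and labeling, describes the various features implemented in the system, and discusses experimental results and directions for future improvement.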
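
The Viterbi beam search named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper uses word-level continuous HMMs with Gaussian mixture emissions, whereas this toy version assumes a discrete-observation HMM with two hypothetical states (`A`, `B`) and made-up probabilities, and the `beam` width is likewise an assumption. Scores are kept in the log domain, and hypotheses that fall more than `beam` below the best score of a frame are dropped before the next frame.

```python
import math


def viterbi_beam(obs, states, log_start, log_trans, log_emit, beam=10.0):
    """Return (best_path, best_log_prob) for a discrete-observation HMM.

    Hypotheses scoring more than `beam` below the frame's best score
    are pruned before the next frame is processed.
    """
    # One score table and one backpointer table per frame.
    scores = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best_prev = max(scores[-1].values())
        # Beam pruning: keep only predecessors close to the current best.
        active = {s: v for s, v in scores[-1].items() if v >= best_prev - beam}
        frame, ptr = {}, {}
        for s in states:
            best_score, best_p = max(
                (v + log_trans[p][s], p) for p, v in active.items()
            )
            frame[s] = best_score + log_emit[s][obs[t]]
            ptr[s] = best_p
        scores.append(frame)
        back.append(ptr)
    # Trace back from the best final state.
    last = max(scores[-1], key=scores[-1].get)
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, scores[-1][last]


if __name__ == "__main__":
    # Hypothetical two-state model; all numbers are illustrative only.
    log = math.log
    states = ["A", "B"]
    log_start = {"A": log(0.6), "B": log(0.4)}
    log_trans = {"A": {"A": log(0.7), "B": log(0.3)},
                 "B": {"A": log(0.4), "B": log(0.6)}}
    log_emit = {"A": {"x": log(0.5), "y": log(0.4), "z": log(0.1)},
                "B": {"x": log(0.1), "y": log(0.3), "z": log(0.6)}}
    print(viterbi_beam(["x", "y", "z"], states, log_start, log_trans, log_emit))
```

A wider beam keeps more hypotheses alive and approaches exact Viterbi decoding; a narrower beam is faster but can prune the path that would have won.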

Study on the Ship Detection Method Using SAR Imagery (SAR 영상을 이용한 선박탐지에 관한 연구)

  • Kwon, Seung-Joon;Shin, Sung-Woong
    • Journal of Korean Society for Geospatial Information Science
    • /
    • v.17 no.1
    • /
    • pp.131-139
    • /
    • 2009
  • Existing vessel monitoring systems based on ground surveillance radar have difficulty monitoring ships continuously because their detection range is limited. To resolve this problem, we study ship detection, a core technology of vessel monitoring systems for ocean monitoring using SAR imagery. There are two methods of detecting ships in SAR imagery: detecting the ship target itself and detecting the ship wake. In this paper, we focus mainly on algorithms that detect the ship itself, and we also present an accuracy test after extracting the positional and directional figures of the ships. After rectifying the input SAR imagery using a polynomial transformation, we use a Wiener filter to remove speckle noise. A labeling technique and morphological filtering, in conjunction with Otsu's method, are used to detect the ships automatically by image processing. For ground truth, information from a radar system is used, which allows the accuracy of the proposed method to be assessed. The results show that the proposed method has high potential for detecting ships and their positional/directional figures automatically and quickly.
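
Two of the steps named in this abstract, Otsu's method and labeling, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's pipeline: it omits the rectification, Wiener filtering, and morphological filtering, assumes 8-bit intensities and 4-connectivity, and uses a made-up 4x5 image. The function names `otsu_threshold` and `label_components` are hypothetical.

```python
from collections import deque


def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value > t are treated as foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # intensity sum of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def label_components(binary):
    """4-connected component labeling; returns (label image, component count)."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if binary[r][c] and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                # Breadth-first flood fill of the new component.
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


if __name__ == "__main__":
    # Made-up image: dark background with two bright blobs.
    image = [
        [10, 12, 11, 10, 13],
        [11, 200, 210, 12, 10],
        [10, 205, 12, 11, 220],
        [12, 11, 10, 13, 215],
    ]
    t = otsu_threshold([p for row in image for p in row])
    binary = [[1 if p > t else 0 for p in row] for row in image]
    labels, count = label_components(binary)
    print(t, count)
```

Each labeled component can then be treated as a candidate target whose area, centroid, or orientation is measured separately.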

Robust Object Detection from Indoor Environmental Factors (다양한 실내 환경변수로부터 강인한 객체 검출)

  • Choi, Mi-Young;Kim, Gye-Young;Choi, Hyung-Il
    • Journal of the Korea Society of Computer and Information
    • /
    • v.15 no.2
    • /
    • pp.41-46
    • /
    • 2010
  • In this paper, we propose a detection method of reduced computational complexity aimed at separating moving objects from the background in a generic video sequence. In indoor environments, it is generally difficult to detect objects accurately because of environmental factors such as lighting changes, shadows, and reflections on the floor. First, a background image for object detection is created. If an object exists in the video, a mixture image is generated through several operations based on a similarity comparison between the previously created background image and the current input image. Objects are then detected using the mixture image and the input video. To refine the detected objects, noise components are removed through a labeling process, and a morphological technique is then applied to complete the object region. As a result, objects are detected robustly under environmental variables such as lighting changes and shadows. Because the proposed system uses mixture images to account for environmental factors such as lighting changes, shadows, and floor reflections, it detects the object region more effectively than existing systems.

A Data Type for Concept-Based Retrieval against Image Databases Indefinitely Indexed (불확정적으로 색인된 이미지 데이터베이스를 개념 기반으로 검색하기 위한 자료형)

  • Yang, Jae-Dong
    • Journal of KIISE:Databases
    • /
    • v.29 no.1
    • /
    • pp.27-33
    • /
    • 2002
  • There are two significant drawbacks in triple image indexing: it cannot support concept-based image retrieval, and it fails to allow disjunctive labeling of images. To remedy these drawbacks, we propose a new technique supporting concept-based retrieval against images indexed by indefinite fuzzy triples (I-fuzzy triples). The I-fuzzy triples allow not only disjunctive image labeling but also concept-based matching against images labeled disjunctively. The disjunctive labeling is based on the extended closed world assumption, and the concept-based image retrieval is based on fuzzy matching. In this paper, we also propose a concept-based query evaluation against the image database to extract desired answers with a degree of certainty $\alpha \in [0,1]$.

Human Skin Region Detection Utilizing Depth Information (깊이 정보를 활용한 사람의 피부영역 검출)

  • Jang, Seok-Woo;Park, Young-Jae;Kim, Gye-Young
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.6
    • /
    • pp.29-36
    • /
    • 2012
  • In this paper, we suggest a new method of detecting human skin-color regions from three-dimensional static or dynamic stereoscopic images by effectively integrating depth and color features. The suggested method first extracts depth information that represents the distance between a camera and an object from input left and right stereoscopic images through a stereo matching technique. It then performs labeling for pixels with similar depth features and determines the labeled regions having human skin color as actual skin color regions. Our experimental results show that the suggested skin region extraction method outperforms existing skin detection methods in terms of skin-color region extraction accuracy.

Performance Comparison of Machine Learning Algorithms for TAB Digit Recognition (타브 숫자 인식을 위한 기계 학습 알고리즘의 성능 비교)

  • Heo, Jaehyeok;Lee, Hyunjung;Hwang, Doosung
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.8 no.1
    • /
    • pp.19-26
    • /
    • 2019
  • In this paper, the classification performance of learning algorithms is compared for TAB digit recognition. The TAB digits segmented from TAB musical notes contain TAB lines and musical symbols. A labeling method and a non-linear filter are designed and applied to extract only the fret digits. A shift operation in four directions is applied to generate more data. The selected models are a Bayesian classifier, a support vector machine, prototype-based learning, a multi-layer perceptron, and a convolutional neural network. The results show that the mean accuracy of the Bayesian classifier is about 85.0%, while that of the others exceeds 99.0%. In addition, the convolutional neural network outperforms the others in terms of generalization and the data preprocessing step.
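
The four-direction shift used to generate more data can be sketched as follows. The abstract does not state the step size, the padding value, or whether the original image is kept, so the one-pixel step, zero padding, and inclusion of the original are assumptions made for illustration; the helper names `shift` and `augment_four_directions` are hypothetical.

```python
def shift(image, dy, dx, fill=0):
    """Shift a 2-D list by (dy, dx) pixels, padding exposed cells with `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            nr, nc = r + dy, c + dx
            # Pixels shifted outside the frame are discarded.
            if 0 <= nr < h and 0 <= nc < w:
                out[nr][nc] = image[r][c]
    return out


def augment_four_directions(image, step=1):
    """Return the original image plus copies shifted up, down, left, and right."""
    offsets = [(-step, 0), (step, 0), (0, -step), (0, step)]
    return [image] + [shift(image, dy, dx) for dy, dx in offsets]


if __name__ == "__main__":
    digit = [
        [0, 1, 0],
        [0, 1, 0],
        [0, 1, 0],
    ]
    for variant in augment_four_directions(digit):
        print(variant)
```

Under these assumptions each source digit yields five training samples, which makes a classifier less sensitive to small segmentation offsets.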

CALS: Channel State Information Auto-Labeling System for Large-scale Deep Learning-based Wi-Fi Sensing (딥러닝 기반 Wi-Fi 센싱 시스템의 효율적인 구축을 위한 지능형 데이터 수집 기법)

  • Jang, Jung-Ik;Choi, Jaehyuk
    • Journal of IKEEE
    • /
    • v.26 no.3
    • /
    • pp.341-348
    • /
    • 2022
  • Wi-Fi sensing, which uses Wi-Fi technology to sense the surrounding environment, has strong potential in a variety of sensing applications. Recently, several advanced deep learning-based solutions using CSI (Channel State Information) data have achieved high performance, but they remain difficult to use in practice without explicit data collection, which requires expensive adaptation effort for model retraining. In this study, we propose a Channel State Information Automatic Labeling System (CALS) that automatically collects and labels training CSI data for deep learning-based Wi-Fi sensing systems. The proposed system uses computer vision technologies such as object detection algorithms to collect labeled CSI data for supervised learning efficiently. We built a prototype of CALS to demonstrate its efficiency and collected data to train deep learning models for detecting the presence of a person in an indoor environment, achieving an accuracy of over 90% with the auto-labeled data sets generated by CALS.

Research on Driving Pattern Analysis Techniques Using Contrastive Learning Methods (대조학습 방법을 이용한 주행패턴 분석 기법 연구)

  • Hoe Jun Jeong;Seung Ha Kim;Joon Hee Kim;Jang Woo Kwon
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.23 no.1
    • /
    • pp.182-196
    • /
    • 2024
  • This study introduces driving pattern analysis and change detection methods using smartphone sensors, based on contrastive learning. These methods characterize driving patterns without labeled data, allowing accurate classification with minimal labeling. In addition, they are robust to domain changes, such as different vehicle types. The study also examined the applicability of these methods to smartphones by comparing them with six lightweight deep-learning models. This comparison supported the development of smartphone-based driving pattern analysis and assistance systems, utilizing smartphone sensors and contrastive learning to enhance driving safety and efficiency while reducing the need for extensive labeled data. This research offers a promising avenue for addressing contemporary transportation challenges and advancing intelligent transportation systems.

Estimation of Maximum Crack Width Using Histogram Analysis in Concrete Structures (히스토그램 분석을 이용한 콘크리트 구조물의 최대 균열 폭 평가)

  • Lee, Seok-Min;Jung, Beom-Seok
    • Journal of the Korea institute for structural maintenance and inspection
    • /
    • v.23 no.7
    • /
    • pp.9-15
    • /
    • 2019
  • The purpose of the present study is to assess the maximum width of surface cracks in concrete structures using histogram analysis, an image processing technique. For this purpose, a concrete crack image is acquired with a camera. The image is converted to grayscale and then binarized. After dilation and erosion are applied to the binary image, it is separated into individual objects by applying labeling techniques. Over time, dust and stains may occur naturally on the surface of concrete, and a crack image may include shadows and reflections caused by lighting, depending on the surrounding conditions. In general, concrete cracks occur in a continuous pattern, whereas image noise appears in the form of shot noise. Bilateral blurring and adaptive thresholding are applied to the grayscale image to eliminate these effects. The remaining noise is removed using the ratio of the object area to the labeled area. The maximum number of pixels and its position in the noise-free crack objects are calculated in the x-direction and y-direction by histogram analysis. The crack widths are estimated by a trigonometric ratio at the positions of the maximum pixel counts for the labeled objects. Finally, the maximum crack width estimated by the proposed method is compared with the crack width measured with a crack gauge. The proposed method may increase the reliability of maximum crack width estimation using image processing techniques on concrete surface images.
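
The abstract does not give the trigonometric relation it uses, so the following is only one plausible reading, offered as a sketch. It assumes the crack is locally a straight strip of width $w$ inclined at angle $\theta$ to the x-axis. A horizontal line then crosses the strip over a length $h = w / \sin\theta$ and a vertical line over $v = w / \cos\theta$, which gives $w = hv / \sqrt{h^2 + v^2}$. The per-row and per-column pixel counts stand in for $h$ and $v$; the function names and the test mask are hypothetical.

```python
import math


def axis_histograms(mask):
    """Count crack pixels per row and per column of a binary mask."""
    row_counts = [sum(row) for row in mask]
    col_counts = [sum(col) for col in zip(*mask)]
    return row_counts, col_counts


def strip_width(run_x, run_y):
    """Perpendicular width of a straight strip from its horizontal and
    vertical chord lengths (in pixels)."""
    if run_x == 0 or run_y == 0:
        return 0.0
    return run_x * run_y / math.hypot(run_x, run_y)


if __name__ == "__main__":
    # Made-up mask: a band running diagonally at 45 degrees.
    mask = [
        [1, 1, 1, 0, 0, 0],
        [0, 1, 1, 1, 0, 0],
        [0, 0, 1, 1, 1, 0],
        [0, 0, 0, 1, 1, 1],
    ]
    rows, cols = axis_histograms(mask)
    print(strip_width(max(rows), max(cols)))
```

The strip assumption breaks down for nearly horizontal or vertical cracks, where one of the two counts is bounded by the crack length in the image and no longer approximates a chord, so the estimate should be read with that limitation in mind.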

Creation and labeling of multiple phonotopic maps using a hierarchical self-organizing classifier (계층적 자기조직화 분류기를 이용한 다수 음성자판의 생성과 레이블링)

  • Chung, Dam;Lee, Kee-Cheol;Byun, Young-Tai
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.21 no.3
    • /
    • pp.600-611
    • /
    • 1996
  • Recently, neural network-based speech recognition has been studied to utilize the adaptivity and learnability of neural network models. However, conventional neural network models have difficulty with co-articulation processing and with boundary detection between similar phonemes in Korean speech. Also, when a single phonotopic map is used, learning time may increase dramatically and inaccuracies may arise, because a homogeneous learning and recognition method must be applied to heterogeneous data. Hence, in this paper, a neural net typewriter is designed using a hierarchical self-organizing classifier (HSOC), and related algorithms are presented. During its learning stage, the HSOC distributes phoneme data over hierarchically structured multiple phonotopic maps using Kohonen's self-organizing feature maps (SOFM). This paper presents and tests algorithms for deciding the number of maps, the map sizes, the selection of phonemes and their placement per map, and an appropriate learning and preprocessing method per map. If maps were divided according to a priori linguistic knowledge, it would be difficult to acquire that knowledge and to decide how to apply it (e.g., in processing extended phonemes). In contrast, our HSOC has the advantage that multiple phonotopic maps suited to the given input data are self-organizable. The resulting three Korean phonotopic maps are optimally labelled, have their own optimal preprocessing schemes, and conform to conventional linguistic knowledge.
