• Title/Summary/Keyword: segment-based classification

Object-based classification for building detection using VHR image and Lidar data (고해상도 영상 및 라이다 자료를 이용한 객체 기반 건물 탐지)

  • Yoon Yeo-Sang
    • Proceedings of the KSRS Conference / 2006.03a / pp.307-310 / 2006
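  • Very high resolution (VHR) imagery has great potential value as a source of diverse information about urban areas, depending on how it is used. However, because of its very high spatial resolution, such imagery represents objects of the same use, or even the same object (e.g., a building), with varied spectral characteristics and shapes. The pixel-based analysis methods that have mainly been used in image classification to date are therefore limited in their ability to produce thematic maps effectively from VHR imagery. To compensate for this problem, this study applied segment-based (or object-based) classification, an area of active research, to VHR imagery and Lidar data to extract buildings in an urban area, and assessed the applicability of the approach. Segmentation here means grouping the objects to be classified into sets that share the same characteristics; for this purpose, the study used the eCognition software, which provides a multi-resolution image segmentation technique.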

Classification of Lower Body Types of Female Adults aged 18 to 69 based on 3D Body Scan Data - Focusing on the Front Type, Lateral-Front Type, and Lateral-Back Type -

  • Kim, Min Kyoung; Nam, Yun Ja
    • Fashion & Textile Research Journal / v.18 no.1 / pp.91-102 / 2016
  • This study classified the lower body types of female adults aged 18 to 69. The lower body was divided into front, lateral front, and lateral back segments. To understand the shape and somatotype of each segment, 592 people were analyzed based on girth, height, length, depth, width, angle, and cross-section distance for each segment. For data analysis, SPSS 18.0 was used for descriptive statistics, principal component analysis, K-means cluster analysis, ANOVA, and Duncan's test (as verification). Factor analysis was performed based on index values, calculation values, angles, and cross-section distances. From the measured items, (a) 16 items were reduced to 5 factors for the front factor (FF) of the lower body, and (b) 24 items were reduced to 6 factors for the lateral front factor (LFF) and lateral back factor (LBF). Each factor was put through K-means cluster analysis, classifying the lower bodies into one of four types based on the front type (FT), the lateral front type (LFT), and the lateral back type (LBT), respectively. By segmenting and classifying lower body shapes for each type, this study offers a way to understand the variety of lower body shapes.

Automatic Indexing Algorithm of Golf Video Using Audio Information (오디오 정보를 이용한 골프 동영상 자동 색인 알고리즘)

  • Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.28 no.5 / pp.441-446 / 2009
  • This paper proposes an automatic indexing algorithm for golf video using audio information. In the proposed algorithm, the input stream is demultiplexed into video and audio streams. An Adaboost-cascade classifier then classifies the continuous audio stream into five segment types: announcer's speech recorded in the studio, music accompanying players' names on the TV screen, audience reaction to the play, reporter's speech with field background, and field noise such as wind or waves. Golf swing sounds, including drive shots, iron shots, and putting shots, are detected by impulse onset detection and modulation spectrum verification. The detected swings and applause are then used to index action or highlight units. Compared with video-based semantic analysis, the main advantage of the proposed system is its small computation requirement, which makes it easier to apply the technology to embedded consumer electronic devices for fast browsing.

Detection of Music Mood for Context-aware Music Recommendation (상황인지 음악추천을 위한 음악 분위기 검출)

  • Lee, Jong-In; Yeo, Dong-Gyu; Kim, Byeong-Man
    • The KIPS Transactions:PartB / v.17B no.4 / pp.263-274 / 2010
  • To provide a context-aware music recommendation service, we first need to identify the music mood that a user prefers in a given situation or context. Among various music characteristics, music mood is closely related to people's emotions. Based on this relationship, some researchers have studied music mood detection by manually selecting a representative segment of the music and classifying its mood. Although such approaches perform well on music mood classification, the manual intervention makes them difficult to apply to new music. Mood detection is made more difficult still because the mood usually varies over time. To cope with these problems, this paper presents an automatic method for classifying music mood. First, a whole piece of music is segmented into several groups with similar characteristics using structural information. Then the mood of each segment is detected, with each individual's mood preference modelled by regression based on Thayer's two-dimensional mood model. Experimental results show that the proposed method achieves 80% or higher accuracy.

Document Image Segmentation and Classification using Texture Features and Structural Information (텍스쳐 특징과 구조적인 정보를 이용한 문서 영상의 분할 및 분류)

  • Park, Kun-Hye; Kim, Bo-Ram; Kim, Wook-Hyun
    • Journal of the Institute of Convergence Signal Processing / v.11 no.3 / pp.215-220 / 2010
  • In this paper, we propose a new texture-based page segmentation and classification method that automatically identifies the table, background, image, and text regions in a given document image. The proposed method consists of two stages: document segmentation and content classification. In the first stage we segment the document image, and in the second stage we classify the contents of the document. The proposed classification method is based on texture analysis: each type of content in the document is treated as a region with a different texture. The problem of classifying document contents can thus be posed as a texture segmentation and analysis problem. Two-dimensional Gabor filters are used to extract texture features for each of these regions. Our method does not assume any a priori knowledge about the content or language of the document. Experimental results show that our method performs well in both document segmentation and content classification. The proposed system is expected to be applicable to areas such as multimedia data searching and real-time image processing.
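
The Gabor-filter step in the abstract above can be illustrated with a minimal sketch. It builds the real part of a standard two-dimensional Gabor kernel in plain Python. The function name `gabor_kernel` and all parameter values are assumptions made for illustration; the abstract does not give the filter bank the authors used.

```python
import math

def gabor_kernel(size, sigma, theta, wavelength, gamma=1.0, psi=0.0):
    """Real part of a 2D Gabor kernel (size should be odd).

    sigma: width of the Gaussian envelope, theta: orientation in radians,
    wavelength: period of the cosine carrier, gamma: aspect ratio,
    psi: phase offset. All values here are illustrative assumptions.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

# Example: a 5x5 kernel oriented along the x axis.
k = gabor_kernel(size=5, sigma=2.0, theta=0.0, wavelength=4.0)
```

Convolving an image region with a bank of such kernels at several orientations and wavelengths, then taking the mean energy of each response, is one common way to obtain a texture feature vector per region.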

Image-based Soft Drink Type Classification and Dietary Assessment System Using Deep Convolutional Neural Network with Transfer Learning

  • Rubaiya Hafiz; Mohammad Reduanul Haque; Aniruddha Rakshit; Amina khatun; Mohammad Shorif Uddin
    • International Journal of Computer Science & Network Security / v.24 no.2 / pp.158-168 / 2024
  • There is hardly any person in modern times who has not taken soft drinks instead of drinking water. Because the rate of soft drink consumption is surprisingly high, researchers around the world have repeatedly cautioned that these drinks lead to weight gain, raise the risk of non-communicable diseases, and so on. In this work, therefore, an image-based tool is developed to monitor the nutritional information of soft drinks using a deep convolutional neural network with transfer learning. First, visual saliency, mean shift segmentation, thresholding, and noise reduction, collectively referred to as 'pre-processing', are used to locate the drink region. After the background is removed and only the desired area is segmented out of the image, a Discrete Wavelet Transform (DWT) based resolution enhancement technique is applied to improve image quality. A transfer learning model is then employed to classify the drinks. Finally, the nutritional value of each drink is estimated using Bag-of-Features (BoF) based classification and a Euclidean distance-based ratio calculation. To this end, a dataset was built covering the ten most consumed soft drinks in Bangladesh, with images collected from the ImageNet dataset as well as the internet. The proposed method detects and recognizes the different types of drinks with an accuracy of 98.51%.

Maritime region segmentation and segment-based destination prediction methods for vessel path prediction (선박 이동 경로 예측을 위한 해상 영역 분할 및 영역 단위 목적지 예측 방법)

  • Kim, Jonghee; Jung, Chanho; Kang, Dokeun; Lee, Chang Jin
    • Journal of IKEEE / v.24 no.2 / pp.661-664 / 2020
  • In this paper, we propose a maritime region segmentation method and a segment-based destination prediction method for vessel path prediction. To perform maritime segmentation, destination candidates generated from past paths are clustered. Segment-based destination prediction then follows, with a different prediction method applied depending on whether the current region is linear. In a linear region, the vessel is assumed to move at a constant rate, and linear prediction is applied. In a nonlinear region with uncertainty, we assume that the vessel moves like the most similar past path. Experimental results show that choosing between linear prediction and similar-path prediction according to the linearity and uncertainty of the path performs better than applying either method alone.
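
The two prediction modes described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: positions are treated as planar (x, y) tuples, linear prediction is a constant-velocity extrapolation from the last two points, and path similarity is the mean point-wise Euclidean distance. All function names are hypothetical.

```python
import math

def linear_predict(track, steps=1):
    """Extrapolate at constant velocity from the last two track points."""
    (x0, y0), (x1, y1) = track[-2], track[-1]
    return (x1 + steps * (x1 - x0), y1 + steps * (y1 - y0))

def path_distance(a, b):
    """Mean point-wise Euclidean distance over the overlapping length."""
    n = min(len(a), len(b))
    return sum(math.dist(p, q) for p, q in zip(a[:n], b[:n])) / n

def similar_path_destination(track, past_paths):
    """End point of the past path that is closest to the current track."""
    best = min(past_paths, key=lambda p: path_distance(track, p))
    return best[-1]

def predict(track, past_paths, region_is_linear, steps=1):
    """Pick the prediction mode from the (assumed given) region type."""
    if region_is_linear:
        return linear_predict(track, steps)
    return similar_path_destination(track, past_paths)

# Example with made-up coordinates.
track = [(0, 0), (1, 0)]
history = [[(0, 0), (1, 0), (5, 0)], [(0, 5), (1, 5), (9, 9)]]
destination = predict(track, history, region_is_linear=False)
```

How a region is labelled linear or nonlinear is left outside the sketch, since the abstract does not describe that criterion.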

A study on the classifying vehicles for traffic flow analysis using LiDAR DATA

  • Heo J.Y.; Choi J.W.; Kim Y.I.; Yu K.Y.
    • Proceedings of the KSRS Conference / 2004.10a / pp.633-636 / 2004
  • Airborne laser scanning technology has been studied in many applications, such as DSM (Digital Surface Model) development, building extraction, and 3D virtual city modeling. In this paper, we evaluate the potential of airborne laser scanning for transportation applications, especially for recognizing moving vehicles on roads. First, we segment the road region from the full LiDAR data using a GIS map and the intensity image. Second, the segmented region is divided into roads and vehicles using a height threshold within a local window. Finally, the vehicles are classified into several vehicle types by the MDC (Minimum Distance Classification) method using the vehicles' geometric information, such as height, length, and width.
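
The last two steps of the abstract above, height thresholding and minimum distance classification, can be sketched as below. This is a minimal illustration only: the class prototypes, the 0.5 m threshold, and the function names are hypothetical values chosen for the example, and the local road height is assumed to be supplied rather than estimated from a window.

```python
import math

# Hypothetical class prototypes: (height, length, width) in metres.
PROTOTYPES = {
    "car": (1.5, 4.5, 1.8),
    "bus": (3.2, 11.0, 2.5),
    "truck": (3.0, 7.5, 2.4),
}

def vehicle_points(heights, road_height, threshold=0.5):
    """Indices of points more than `threshold` above the local road level."""
    return [i for i, h in enumerate(heights) if h - road_height > threshold]

def classify_mdc(features, prototypes=PROTOTYPES):
    """Assign the class whose prototype is nearest in Euclidean distance."""
    return min(prototypes, key=lambda name: math.dist(features, prototypes[name]))

# Example with made-up measurements.
points = vehicle_points([10.0, 10.1, 11.4, 11.6, 10.05], road_height=10.0)
label = classify_mdc((1.4, 4.2, 1.7))
```

In practice the prototypes would be class means estimated from labelled vehicles, and the features are often scaled so that length does not dominate the distance.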

Classification of Cognitive Mental States for Brain Wave based Human-Computer Interface (뇌파기반 휴먼-컴퓨터 인터페이스를 위한 인지적 정신상태의 분별)

  • 신승철
    • Proceedings of the IEEK Conference / 2001.06e / pp.61-64 / 2001
  • This paper describes a basic study on the classification of cognitive mental states, as groundwork for a human-computer interface technique. To recognize the mental states, we recorded the brain waves of 22 subjects during two types of experiments. In one experiment, the subject answers a given question by choosing among yes, no, and reject buttons; in the other, the subject selects an icon displayed on a monitor screen. After acquiring the brain wave signals, we construct a feature set from the percent power increase of a given segment relative to that of the reference period. A linear discriminative algorithm is used to classify the cognitive yes/no mental states.
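
The feature described in the abstract above, the percent power increase of a segment relative to a reference period, can be written as a short sketch. It assumes power is the mean squared amplitude of the raw samples; the paper may well compute power per frequency band instead, and the function names are hypothetical.

```python
def mean_power(samples):
    """Mean squared amplitude of a signal segment (assumed power measure)."""
    return sum(s * s for s in samples) / len(samples)

def percent_power_increase(segment, reference):
    """Percent change in power of `segment` relative to `reference`."""
    p_ref = mean_power(reference)
    return 100.0 * (mean_power(segment) - p_ref) / p_ref

# Example with made-up samples: segment power is 4, reference power is 1.
feature = percent_power_increase([2, 2], [1, 1])
```

Computing this value per channel (and per band, if band powers are used) yields the feature vector that a linear classifier can then separate into yes and no states.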

Implementation of Sports Video Clip Extraction Based on MobileNetV3 Transfer Learning (MobileNetV3 전이학습 기반 스포츠 비디오 클립 추출 구현)

  • YU, LI
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.5 / pp.897-904 / 2022
  • Sports video is an important information resource. High-precision extraction of the effective segments in sports video helps coaches analyze players' actions and lets users appreciate a player's hitting action more intuitively. To address the shortcomings of current sports video clip extraction, such as strong subjectivity, heavy workload, and low efficiency, a classification method for sports video clips based on MobileNetV3 is proposed to save user time. Experiments evaluate the effectiveness of the effective-segment extraction. Of the extracted segments, 97.0% are effective, indicating that the extraction results are good and can lay the foundation for building a subsequent badminton action metadata video dataset.