• Title/Abstract/Keywords: multi-scale features

185 search results (processing time: 0.025 s)

Human Action Recognition Using Pyramid Histograms of Oriented Gradients and Collaborative Multi-task Learning

  • Gao, Zan;Zhang, Hua;Liu, An-An;Xue, Yan-Bing;Xu, Guang-Ping
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 8, No. 2
    • /
    • pp.483-503
    • /
    • 2014
  • This paper proposes human action recognition using pyramid histograms of oriented gradients and collaborative multi-task learning. First, we accumulate global activities and construct a motion history image (MHI) for the RGB and depth channels separately to encode the dynamics of an action in different modalities; different action descriptors are then extracted from the depth and RGB MHIs to represent the global textural and structural characteristics of these actions. Specifically, the average value in hierarchical blocks, GIST, and pyramid histograms of oriented gradients descriptors are employed to represent human motion. To demonstrate the superiority of the proposed method, we evaluate the descriptors with KNN, SVM with linear and RBF kernels, SRC, and CRC models on the DHA dataset, a well-known dataset for human action recognition. Large-scale experimental results show that our descriptors are robust, stable, and efficient, and outperform the state-of-the-art methods. In addition, we further investigate the performance of our descriptors by combining them on the DHA dataset, and observe that the combined descriptors perform much better than any single descriptor. With multimodal features, we also propose a collaborative multi-task learning method for model learning and inference based on transfer learning theory. The main contributions lie in four aspects: 1) the proposed encoding scheme can filter out the stationary parts of the human body and reduce noise interference; 2) different kinds of features and models are assessed, and the neighboring gradient information and pyramid layers prove very helpful for representing these actions; 3) the proposed model can fuse features from different modalities regardless of the sensor types, value ranges, and dimensions of the different features; 4) the latent common knowledge among different modalities can be discovered by transfer learning to boost performance.
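
The motion history image mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the authors' exact formulation, so this assumes the commonly used update rule: a pixel is set to a maximum value `tau` when frame differencing detects motion, and otherwise decays by one per frame. The function names, `tau`, and `diff_threshold` are hypothetical choices for illustration only.

```python
def update_mhi(mhi, prev_frame, frame, tau=5, diff_threshold=30):
    """Return the next motion history image from two consecutive grayscale frames.

    Assumed rule (not taken from the paper): moving pixels are reset to tau,
    all other pixels decay by 1 down to a floor of 0.
    """
    new_mhi = []
    for y, row in enumerate(frame):
        new_row = []
        for x, value in enumerate(row):
            moving = abs(value - prev_frame[y][x]) >= diff_threshold
            new_row.append(tau if moving else max(0, mhi[y][x] - 1))
        new_mhi.append(new_row)
    return new_mhi


def motion_history_image(frames, tau=5, diff_threshold=30):
    """Accumulate a motion history image over a list of grayscale frames."""
    height, width = len(frames[0]), len(frames[0][0])
    mhi = [[0] * width for _ in range(height)]
    for prev_frame, frame in zip(frames, frames[1:]):
        mhi = update_mhi(mhi, prev_frame, frame, tau, diff_threshold)
    return mhi


# A bright pixel moving left to right across a 1x3 image.
frames = [[[255, 0, 0]], [[0, 255, 0]], [[0, 0, 255]]]
print(motion_history_image(frames))
```

Pixels that never change stay at zero under this rule, which is one way to read the abstract's claim that the encoding filters out stationary parts of the body; more recent motion carries larger values than older motion.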

다중 스케일 그라디언트 조건부 적대적 생성 신경망을 활용한 문장 기반 영상 생성 기법 (Text-to-Face Generation Using Multi-Scale Gradients Conditional Generative Adversarial Networks)

  • ;;추현승
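    • 한국정보처리학회:학술대회논문집 (Proceedings of the Korea Information Processing Society Conference)
    • /
    • 한국정보처리학회 2021년도 추계학술발표대회 (Korea Information Processing Society 2021 Fall Conference)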
    • /
    • pp.764-767
    • /
    • 2021
  • While Generative Adversarial Networks (GANs) have seen huge success in image synthesis tasks, synthesizing high-quality images from text descriptions remains a challenging problem in computer vision. This paper proposes a method named Text-to-Face Generation Using Multi-Scale Gradients for Conditional Generative Adversarial Networks (T2F-MSGGANs), which combines GANs and a natural language processing model to create human faces that have the features described in the input text. The proposed method addresses two problems of GANs, mode collapse and training instability, by investigating how gradients at multiple scales can be used to generate high-resolution images. We show that T2F-MSGGANs converge stably and generate good-quality images.

스케일링을 이용한 다중 스케일 균열 검출 (Multi-scale Crack Detection Using Scaling)

  • 김영로;오태명
    • 전자공학회논문지 (Journal of the Institute of Electronics and Information Engineers)
    • /
    • Vol. 50, No. 9
    • /
    • pp.194-200
    • /
    • 2013
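  • This paper proposes a multi-scale crack detection method using scaling. The proposed method is based on morphological algorithms, crack features, and scaling. The morphological operators extract crack patterns, and opening and closing operations separate cracks from the background. Morphology-based segmentation performs better than the conventional difference-based integration method for detecting narrow cracks. However, morphological methods that use only a single structuring element can detect only cracks of a fixed size. Scaling is therefore applied, using bilinear interpolation. The proposed method computes feature values such as the pixel count and maximum length of each segmented region, and decides from these values whether the region corresponds to a crack. Experimental results show that the proposed multi-scale crack detection method improves on existing detection methods.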
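
The idea in the abstract above, that a single fixed structuring element only finds cracks of one width unless the image is rescaled, can be sketched as follows. This is an illustrative simplification and not the authors' algorithm: it assumes dark cracks on a brighter background, uses a black-hat response (closing minus the original) with a fixed 3x3 window, and omits the region features (pixel count, maximum length) that the paper uses for the final decision. All function names, the scale set, and the threshold are hypothetical.

```python
def resize_bilinear(img, new_h, new_w):
    """Resize a 2D grayscale image (list of rows) with bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(new_h):
        # Map the output pixel centre back to source coordinates, clamped to the image.
        sy = min(max((i + 0.5) * h / new_h - 0.5, 0.0), h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for j in range(new_w):
            sx = min(max((j + 0.5) * w / new_w - 0.5, 0.0), w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def _morph(img, pick):
    """Apply max (dilation) or min (erosion) over a 3x3 window clipped at the borders."""
    h, w = len(img), len(img[0])
    return [[pick(img[yy][xx]
                  for yy in range(max(0, y - 1), min(h, y + 2))
                  for xx in range(max(0, x - 1), min(w, x + 2)))
             for x in range(w)] for y in range(h)]


def black_hat(img):
    """Closing (dilation then erosion) minus the original: highlights thin dark structures."""
    closed = _morph(_morph(img, max), min)
    return [[c - v for c, v in zip(c_row, v_row)] for c_row, v_row in zip(closed, img)]


def detect_cracks(img, scales=(1.0, 0.5), threshold=100):
    """Mark pixels whose black-hat response reaches the threshold at any scale."""
    h, w = len(img), len(img[0])
    mask = [[False] * w for _ in range(h)]
    for s in scales:
        sh, sw = max(1, round(h * s)), max(1, round(w * s))
        # Shrink, respond with the fixed 3x3 element, then map back to full size.
        response = resize_bilinear(black_hat(resize_bilinear(img, sh, sw)), h, w)
        for y in range(h):
            for x in range(w):
                if response[y][x] >= threshold:
                    mask[y][x] = True
    return mask


# A 4-pixel-wide dark crack: too wide for the 3x3 element at full resolution.
image = [[200, 200, 0, 0, 0, 0, 200, 200] for _ in range(8)]
print(any(any(row) for row in detect_cracks(image, scales=(1.0,))))  # full scale only
print(detect_cracks(image)[0])  # full scale plus half scale
```

In this toy image the crack is missed at full resolution because the 3x3 closing cannot bridge four dark pixels, but after halving the image the crack is two pixels wide and the same operator responds to it.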

가려짐 영역 검출 및 스테레오 영상 내의 특징들을 이용한 다시점 영상 생성 (Multi-view Image Generation from Stereoscopic Image Features and the Occlusion Region Extraction)

  • 이왕로;고민수;엄기문;정원식;허남호;유지상
    • 방송공학회논문지 (Journal of Broadcast Engineering)
    • /
    • Vol. 17, No. 5
    • /
    • pp.838-850
    • /
    • 2012
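  • This paper proposes a method for generating multi-view images using various features obtained from a stereoscopic image. The proposed technique first generates an intensity gradient saliency map from the given stereo image. Next, it computes the optical flow, which represents block-wise motion between the left and right images, uses the scale-invariant feature transform (SIFT) to find feature points that are invariant to object scale and rotation, computes the disparity between these feature points, and then combines the two sets of disparity information into a disparity saliency map. Erroneous disparities are removed from the generated disparity saliency map through occlusion region detection. Third, line segments are extracted to minimize the distortion of straight lines during image warping. Finally, the multi-view images are generated by a grid-mesh-based image warping technique that uses the extracted image features as constraints. Experimental results confirm that the multi-view images generated by the proposed technique have better image quality than those of the conventional DIBR technique.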

여름강수량의 단기예측을 위한 Multi-Ensemble GCMs 기반 시공간적 Downscaling 기법 개발 (Development of Multi-Ensemble GCMs Based Spatio-Temporal Downscaling Scheme for Short-term Prediction)

  • 권현한;민영미
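    • 한국수자원학회:학술대회논문집 (Proceedings of the Korea Water Resources Association Conference)
    • /
    • 한국수자원학회 2009년도 학술발표회 초록집 (Korea Water Resources Association 2009 Conference Abstracts)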
    • /
    • pp.1142-1146
    • /
    • 2009
  • A rainfall simulation and forecasting technique that can generate daily rainfall sequences conditional on multi-model ensemble GCMs is developed and applied to data in Korea for the major rainy season. The GCM forecasts are provided by the APEC Climate Center. A Weather State Based Downscaling Model (WSDM) is used to map teleconnections from ocean-atmosphere data or key state variables from numerical integrations of Ocean-Atmosphere General Circulation Models to simulate daily sequences at multiple rain gauges. The method presented is general and is applied to data for the wet season in Korea, JJA (June-July-August). The sequences of weather states identified by the EM algorithm are shown to correspond to dominant synoptic-scale features of rainfall-generating mechanisms. Application of the methodology to seasonal rainfall forecasts using empirical teleconnections and GCM-derived climate forecasts is discussed.

Towards Resource-Generative Skyscrapers

  • Imam, Mohamed;Kolarevic, Branko
    • 국제초고층학회논문집 (International Journal of High-Rise Buildings)
    • /
    • Vol. 7, No. 2
    • /
    • pp.161-170
    • /
    • 2018
  • Rapid urbanization, resource depletion, and limited land are further increasing the need for skyscrapers in city centers; therefore, it is imperative to enhance tall building performance efficiency and energy-generative capability. Potential performance improvements can be explored using parametric multi-objective optimization, aided by evaluation tools, such as computational fluid dynamics and energy analysis software, to visualize and explore skyscrapers' multi-resource, multi-system generative potential. An optimization-centered, software-based design platform can potentially enable the simultaneous exploration of multiple strategies for the decreased consumption and large-scale production of multiple resources. Resource Generative Skyscrapers (RGS) are proposed as a possible solution to further explore and optimize the generative potentials of skyscrapers. RGS can be optimized with waste-energy-harvesting capabilities by capitalizing on passive features of integrated renewable systems. This paper describes various resource-generation technologies suitable for a synergetic integration within the RGS typology, and the software tools that can facilitate exploration of their optimal use.

Multi-resolution Fusion Network for Human Pose Estimation in Low-resolution Images

  • Kim, Boeun;Choo, YeonSeung;Jeong, Hea In;Kim, Chung-Il;Shin, Saim;Kim, Jungho
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 16, No. 7
    • /
    • pp.2328-2344
    • /
    • 2022
  • 2D human pose estimation still faces difficulty in low-resolution images. Most existing top-down approaches scale up the bounding box image of the target human to a large size and feed the scaled image into the network. Because of up-sampling, artifacts occur in the low-resolution target images, and the degraded images adversely affect the accurate estimation of the joint positions. To address this issue, we propose a multi-resolution input feature fusion network for human pose estimation. Specifically, the bounding box image of the target human is rescaled to multiple input images of various sizes, and the features extracted from the multiple images are fused in the network. Moreover, we introduce a guiding channel that induces the multi-resolution input features to alternately affect the network according to the resolution of the target image. We conduct experiments on the MS COCO dataset, a representative dataset for 2D human pose estimation, where our method achieves superior performance compared with the strong baseline HRNet and previous state-of-the-art methods.

셀프센싱 상시계측 기반 CFRP보강 콘크리트 구조물의 손상검색 (Damage Detection of CFRP-Laminated Concrete based on a Continuous Self-Sensing Technology)

  • 김영진;박승희;진규남;이창길
    • 토지주택연구 (LHI Journal of Land, Housing, and Urban Affairs)
    • /
    • Vol. 2, No. 4
    • /
    • pp.407-413
    • /
    • 2011
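  • This paper introduces a structural health monitoring technique for diagnosing debonding damage of CFRP (Carbon Fiber Reinforced Plastic) reinforcement attached to the surface of a concrete beam. To this end, a multi-scale measurement technique based on a self-sensing circuit using piezoelectric active sensors was applied. The multi-scale measurement system provides the frequency-domain structural response through self-sensing impedance measurement and the structural response at specific frequencies through self-sensing guided ultrasonic wave measurement. To quantify the debonding damage, a two-dimensional damage index was derived from damage features extracted from the impedance and guided ultrasonic wave signals and applied to a supervised-learning-based probabilistic pattern recognition technique.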

A Multi-Scale Parallel Convolutional Neural Network Based Intelligent Human Identification Using Face Information

  • Li, Chen;Liang, Mengti;Song, Wei;Xiao, Ke
    • Journal of Information Processing Systems
    • /
    • Vol. 14, No. 6
    • /
    • pp.1494-1507
    • /
    • 2018
  • Intelligent human identification using face information has become a research hotspot, with applications ranging from the Internet of Things (IoT), intelligent self-service banking, and intelligent surveillance to public safety and intelligent access control. 2D face images can usually be captured from a long distance in an unconstrained environment. To fully exploit this advantage and make human recognition suitable for wider intelligent applications with higher security and convenience, several key difficulties must be overcome: gray-scale change caused by illumination variance, occlusion caused by glasses, hair, or scarves, and self-occlusion and deformation caused by pose or expression variation. Many solutions have been proposed to overcome these difficulties. However, most of them improve recognition performance under only one influencing factor and therefore still cannot handle real face recognition scenarios. In this paper we propose a multi-scale parallel convolutional neural network architecture to extract deep, robust facial features with high discriminative ability. Extensive experiments are conducted on the CMU-PIE, extended FERET, and AR databases. The experimental results show that the proposed algorithm exhibits excellent discriminative ability compared with other existing algorithms.

Free vibration analysis of chiral double-walled carbon nanotube embedded in an elastic medium using non-local elasticity theory and Euler Bernoulli beam model

  • Dihaj, Ahmed;Zidour, Mohamed;Meradjah, Mustapha;Rakrak, Kaddour;Heireche, Houari;Chemi, Awda
    • Structural Engineering and Mechanics
    • /
    • Vol. 65, No. 3
    • /
    • pp.335-342
    • /
    • 2018
  • The transverse free vibration of chiral double-walled carbon nanotubes (DWCNTs) embedded in an elastic medium is modeled by the non-local elasticity theory and the Euler-Bernoulli beam model. The governing equations are derived and the frequency solutions are obtained. In this study, the effects of the vibrational mode number, the small-scale coefficient, the Winkler parameter, and the chirality of the double-walled carbon nanotube on the frequency ratio (xN) of the DWCNTs are studied and discussed. New features of the vibration behavior of DWCNTs embedded in an elastic medium are presented, and the present solutions can be used for the static and dynamic analyses of double-walled carbon nanotubes.