• Title/Summary/Keyword: Scene-context

Scene Graph Generation with Graph Neural Network and Multimodal Context (그래프 신경망과 멀티 모달 맥락 정보를 이용한 장면 그래프 생성)

  • Jung, Ga-Young;Kim, In-cheol
    • Proceedings of the Korea Information Processing Society Conference / 2020.05a / pp.555-558 / 2020
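  • This paper proposes a new deep neural network model that effectively detects the various objects in an input image and the relationships among them, and represents them as a single scene graph. To detect objects and relationships effectively, the proposed model uses diverse multimodal context information, including not only visual context features based on convolutional neural networks but also linguistic context features. In addition, the model embeds the context information with a graph neural network so that the interdependence between the two objects in a relationship is sufficiently reflected in the graph node feature values. Comparative experiments on the Visual Genome benchmark dataset demonstrate the effectiveness and performance of the proposed model.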

Transformer-based dense 3D reconstruction from RGB images (RGB 이미지에서 트랜스포머 기반 고밀도 3D 재구성)

  • Xu, Jiajia;Gao, Rui;Wen, Mingyun;Cho, Kyungeun
    • Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.646-647 / 2022
  • Multiview stereo (MVS) 3D reconstruction of a scene from images is a fundamental computer vision problem that has been thoroughly researched in recent times. Traditionally, MVS approaches create dense correspondences by constructing regularizations and hand-crafted similarity metrics. Although these techniques achieve excellent results under ideal Lambertian conditions, traditional MVS algorithms still produce many artifacts. Therefore, in this study, we propose using a transformer network to accelerate MVS reconstruction. The network is based on a transformer model and can extract dense features with 3D consistency and global context, which are necessary for accurate matching in MVS.

GLIBP: Gradual Locality Integration of Binary Patterns for Scene Images Retrieval

  • Bougueroua, Salah;Boucheham, Bachir
    • Journal of Information Processing Systems / v.14 no.2 / pp.469-486 / 2018
  • We propose an enhanced version of the local binary pattern (LBP) operator for texture extraction in the context of image retrieval. The novelty of our proposal is based on the observation that the LBP exploits only the lowest kind of local information through the global histogram. However, such global histograms reflect only the statistical distribution of the various LBP codes in the image. The block-based LBP, which uses local histograms of the LBP, was one of the few attempts to capture higher-level textural information. We believe that important and useful local information between these two levels is simply ignored by the two schemes. The newly developed method, gradual locality integration of binary patterns (GLIBP), is a novel attempt to capture as much local information as possible in a gradual fashion. GLIBP aggregates the texture features that LBP extracts from grayscale images through a complex structure. The framework comprises a multitude of ellipse-shaped regions arranged in concentric circular forms of increasing size, and it is derived from a simple parameterized generator. The elliptic forms allow texture directionality to be targeted, which is a very useful property in texture characterization. In addition, the general framework of ellipses takes spatial information (specifically rotation) into account. The effectiveness of GLIBP was investigated on the Corel-1K (Wang) dataset and compared to published works, including the very effective DLEP. Results show significantly higher or comparable performance of GLIBP relative to the other methods, which qualifies it as a good tool for scene image retrieval.
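
For orientation, the baseline that the abstract builds on can be sketched as follows. This is only a minimal illustration of the basic 8-neighbour LBP code and its global histogram, not the GLIBP elliptical framework itself; the function names `lbp_code` and `lbp_histogram`, the neighbour ordering, and the `>=` comparison are assumptions chosen for the example.

```python
def lbp_code(img, r, c):
    """Return the 8-neighbour LBP code of pixel (r, c).

    img is a 2D list of gray levels; (r, c) must not lie on the border.
    """
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel
    # (this ordering is an assumption; any fixed order works).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the center sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Global 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

A block-based variant would call `lbp_histogram` on sub-images and concatenate the results; GLIBP, per the abstract, instead aggregates codes over ellipse-shaped regions of increasing size.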

Object Tracking System Using Kalman Filter (칼만 필터를 이용한 물체 추적 시스템)

  • Xu, Yanan;Ban, Tae-Hak;Yuk, Jung-Soo;Park, Dong-Won;Jung, Hoe-kyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2013.10a / pp.1015-1017 / 2013
  • Object tracking, in general, is a challenging problem. Difficulties in tracking objects can arise from abrupt object motion, changing appearance patterns of both the object and the scene, non-rigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is usually performed in the context of higher-level applications that require the location or the shape of the object in every frame. This paper describes an object tracking system based on active vision with two cameras. An active visual tracking and object locking system based on the Extended Kalman Filter (EKF) is introduced into the single-camera tracking algorithm; by analyzing its data, the next running state of the object can be estimated. After tracking is performed at each camera, the individual tracks are fused (combined) to obtain the final system object track.
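
The predict-then-correct cycle that such a tracker relies on can be sketched as below. Note the simplification: this is a plain linear Kalman filter with a 1D constant-velocity model, not the Extended Kalman Filter or the two-camera fusion described in the abstract. The class name `ConstantVelocityKalman1D` and the noise parameters `q` and `r` are hypothetical values chosen for illustration.

```python
class ConstantVelocityKalman1D:
    """Linear Kalman filter with state [position, velocity].

    Only the position is measured. q is the process noise added to each
    diagonal entry of the covariance; r is the measurement noise variance.
    """

    def __init__(self, pos=0.0, vel=0.0, q=1e-3, r=1.0):
        self.x = [pos, vel]
        self.P = [[1.0, 0.0], [0.0, 1.0]]
        self.q = q
        self.r = r

    def predict(self, dt=1.0):
        """Propagate state and covariance by dt; return predicted position."""
        p, v = self.x
        (a, b), (c, d) = self.P
        self.x = [p + v * dt, v]
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]], written out by hand.
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]
        return self.x[0]

    def update(self, z):
        """Correct the state with a position measurement z."""
        (a, b), (c, d) = self.P
        residual = z - self.x[0]
        s = a + self.r              # innovation variance
        k0, k1 = a / s, c / s       # Kalman gain for position and velocity
        self.x = [self.x[0] + k0 * residual, self.x[1] + k1 * residual]
        # P <- (I - K H) P with H = [1, 0].
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]
        return self.x[0]
```

In use, each frame calls `predict()` to estimate where the object will be and `update(z)` once the detector reports its measured position. An EKF follows the same cycle but linearizes a nonlinear motion or camera model at each step.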

Development and Performance Evaluation of an Image Detection System for Efficient 4D Images (효율적인 4D 영상을 위한 영상 검출 시스템 개발 및 성능평가)

  • Cho, Kyoung-Woo;Liu, Ze-Qi;Jeon, Min-Ho;Oh, Chang-Heon
    • Journal of Advanced Navigation Technology / v.17 no.6 / pp.792-797 / 2013
  • A 4D film is a film made by adding physical effects to a 3D or conventional film. To deliver physical effects to the audience, the data that drive those effects must be added to each frame. In this paper, we propose a video detection system that can efficiently provide physical effects by assessing the on-screen situation, such as an explosion scene or a snow scene. The proposed system contains a fire detection algorithm that uses the R color channel and the $C_r$ value, and a snow detection algorithm that uses the RGB color model. The system is implemented on an MCU from the 8051 family. In the performance evaluation, the system showed a 91% detection rate for fire and a 25% false detection rate for snow. The system is also capable of providing physical effects automatically.
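
A fire test based on the R channel and the $C_r$ value could look roughly like the sketch below. The abstract does not give the actual rule or thresholds, so `r_min` and `cr_min` are purely hypothetical, as are the function names; the $C_r$ conversion uses the standard full-range BT.601 (JFIF) formula.

```python
def cr_component(r, g, b):
    """Red-difference chroma Cr of an 8-bit RGB pixel (full-range BT.601)."""
    return 128 + 0.5 * r - 0.418688 * g - 0.081312 * b


def is_fire_pixel(r, g, b, r_min=190, cr_min=150):
    """Flag a pixel as a fire candidate when it is both bright red and
    strongly red-shifted in chroma. Thresholds are illustrative only."""
    return r >= r_min and cr_component(r, g, b) >= cr_min


def fire_ratio(pixels, r_min=190, cr_min=150):
    """Fraction of (r, g, b) pixels in a frame flagged as fire candidates."""
    if not pixels:
        return 0.0
    hits = sum(1 for (r, g, b) in pixels if is_fire_pixel(r, g, b, r_min, cr_min))
    return hits / len(pixels)
```

A frame could then be labelled an explosion scene when `fire_ratio` exceeds some fraction of the image. Integer arithmetic of this kind is also cheap enough for a small microcontroller, although the thresholds would need tuning on real footage.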

A New Approach to Naturalness for Still Images-Depending On TV Genre (TV화질에 있어서 자연스러움의 새로운 접근-TV장르)

  • Park, Yung-Kyung
    • Science of Emotion and Sensibility / v.13 no.1 / pp.251-258 / 2010
  • 'Naturalness' is an important "ness" and a key factor in image quality assessment. It is a representative factor that depends on the context of the image, which arouses different emotions. The Image Quality Circle was split into two steps. The first step predicts the visual perceptual attributes, which are lightness, colourfulness, hue and contrast. The next step is the SSE, which depends on the image content. In this study, the image contents are grouped into genres. The images were rendered using four different colour attributes: lightness, contrast, colourfulness and hue. Each participant was asked to score the image quality and SSE of every rendered image on a scale. A seven-point category scale of increasing amount of "ness" was used as a quantitative sequence of adjectives. The image quality model was built by combining the SSEs for each scene. The SSEs, among which vividness is common, are treated as independent variables to predict the image quality score. The vividness model was then built using colour attributes as variables to predict the vividness of each scene (genre). Vividness is an important factor of naturalness that links naturalness and image quality, and its meaning differs for each scene (genre). Therefore, the colour attributes that express vividness depend on the image content.

Development of facial recognition application for automation logging of emotion log (감정로그 자동화 기록을 위한 표정인식 어플리케이션 개발)

  • Shin, Seong-Yoon;Kang, Sun-Kyoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.4 / pp.737-743 / 2017
  • The intelligent life-log system proposed in this paper is intended to identify and record a wide range of everyday information about events, based on when, where, with whom, what and how they occur. This contextual information covers person, scene, age, emotion, relation, state, location, moving route, and so on. Each piece of information receives a unique tag so that users can access it quickly and easily. Context awareness generates and classifies information on a per-tag basis using auto-tagging and biometric recognition technology, and builds a situation information database. In this paper, we developed an active modeling method and an application that recognizes expressionless and smiling expressions from lip lines in order to record emotion information automatically.

An Efficient Competition-based Skip Motion Vector Coding Scheme Based on the Context-based Adaptive Choice of Motion Vector Predictors (효율적 경쟁 기반 스킵모드 부호화를 위한 적응적 문맥 기반 움직임 예측 후보 선택 기법)

  • Kim, Sung-Jei;Kim, Yong-Goo;Choe, Yoon-Sik
    • The Journal of Korean Institute of Communications and Information Sciences / v.35 no.5C / pp.464-471 / 2010
  • The demand for high-quality multimedia applications, which far outpaces the rapid evolution of transmission and storage technologies, makes better compression capabilities increasingly important. To provide enhanced video coding performance, this paper proposes an efficient competition-based motion vector coding scheme. The proposed algorithm adaptively forms the motion vector predictors based on the context of scene characteristics such as camera motion and nearby motion vectors. It thereby provides more efficient candidate predictors than previous competition-based motion vector coding schemes, which rely on fixed candidates optimized by extensive simulations. In the experimental results, a compression gain of up to 200% was observed when the proposed scheme was applied to motion vector selection for skip mode processing.

ORMN: A Deep Neural Network Model for Referring Expression Comprehension (ORMN: 참조 표현 이해를 위한 심층 신경망 모델)

  • Shin, Donghyeop;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.7 no.2 / pp.69-76 / 2018
  • Referring expressions are natural language constructions used to identify particular objects within a scene. In this paper, we propose a new deep neural network model for referring expression comprehension. The proposed model locates the region of the referred object in the given image by making use of rich information about the referred object itself, the context object, and the relationship with the context object mentioned in the referring expression. In the proposed model, the object matching score and the relationship matching score are combined to compute the fitness score of each candidate region according to the structure of the referring expression sentence. Accordingly, the proposed model consists of four sub-networks: Language Representation Network (LRN), Object Matching Network (OMN), Relationship Matching Network (RMN), and Weighted Composition Network (WCN). We demonstrate that our model achieves state-of-the-art comprehension results on three referring expression datasets.
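
The score combination step can be pictured with the toy sketch below. It assumes, purely for illustration, that the fitness score is a weighted sum of the two matching scores and that the weights are given. In the described model the scores and weights come from the sub-networks, and the exact composition rule is not stated in the abstract. The names `fitness` and `best_region` are hypothetical.

```python
def fitness(obj_score, rel_score, w_obj, w_rel):
    """Weighted sum of object and relationship matching scores
    (an assumed composition rule, used only for illustration)."""
    return w_obj * obj_score + w_rel * rel_score


def best_region(candidates, w_obj=0.6, w_rel=0.4):
    """Return the name of the candidate region with the highest fitness.

    candidates is a list of dicts with keys 'region', 'obj' and 'rel'.
    """
    top = max(candidates, key=lambda c: fitness(c["obj"], c["rel"], w_obj, w_rel))
    return top["region"]
```

With weights that depend on the sentence structure, an expression that mentions no context object could set `w_rel` to zero, so that only the object matching score decides.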

Townscape Color Character by Form Finishes of the Traditional Area - Focusing on Stockholm, Sweden - (전통지역의 형태 마감재별 경관 색채 특성 - 스웨덴 스톡홀름시의 실례를 대상으로 -)

  • Choe, Seung-Heuy
    • Journal of the Korean Institute of Traditional Landscape Architecture / v.29 no.4 / pp.49-58 / 2011
  • This article proposes a control plan for townscape color around historic and cultural heritage sites. The streets and roads of historic conservation areas in Stockholm, and the perspectives on them, have changed dramatically during this century. New development or changes to existing buildings should be carried out in a way that acknowledges the surroundings and is a good neighbour, in both the cultural and social sense, and that makes good color design sense. There are many examples of townscape color, but the conserved and historical streets and roads throughout the city of Stockholm should benefit from careful design of the environment. To this end, case studies of several streets and roads are reviewed, and color contexts that relate to urban architectural design proposals for specific cultural heritage sites are explored. In all new developments, the scale of new buildings and the material finishes and colors used should respect the character of their surroundings and have due regard to the setting of any listed building. Visual assessment proposals for streetscape color should aim to help assimilate the development into the local scene. Townscape color should also be considered for important streets and roads.