• Title/Abstract/Keyword: scene image

Search results: 945 items, processing time: 0.025 seconds

Salient Region Extraction based on Global Contrast Enhancement and Saliency Cut for Image Information Recognition of the Visually Impaired

  • Yoon, Hongchan;Kim, Baek-Hyun;Mukhriddin, Mukhiddinov;Cho, Jinsoo
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12, No. 5 / pp.2287-2312 / 2018
  • Extracting key visual information from natural scene images is a challenging task and an important step toward enabling the visually impaired to recognize information based on tactile graphics. In this study, a novel method is proposed for extracting salient regions based on global contrast enhancement and saliency cuts in order to improve the process of recognizing images for the visually impaired. To accomplish this, an image enhancement technique is applied to natural scene images, and a saliency map is acquired to measure the color contrast of homogeneous regions against other areas of the image. The saliency maps also support automatic salient region extraction, referred to as saliency cuts, and assist in obtaining a high-quality binary mask. Finally, outer boundaries and inner edges are detected in natural scene images to identify edges that are visually significant. Experimental results indicate that the proposed method extracts salient objects effectively and achieves remarkable performance compared to conventional methods. Our method offers benefits in extracting salient objects, generating simple but important edges from natural scene images, and providing information to the visually impaired.
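The global-contrast idea above (scoring each region's color contrast against the rest of the image) can be sketched with a simple histogram-contrast saliency map. This is a minimal illustration in the spirit of histogram-based contrast methods, not the paper's full pipeline (which adds contrast enhancement and saliency cuts); the bin count and quantization are arbitrary choices:

```python
import numpy as np

def global_contrast_saliency(img, bins=8):
    """Sketch of histogram-contrast saliency: a pixel is salient when its
    quantized color is far (in RGB distance) from the colors that dominate
    the rest of the image, weighted by how often those colors occur."""
    # Quantize each channel into `bins` levels and give each color a code.
    q = (img.astype(np.float64) / 256.0 * bins).astype(int)
    codes = q[..., 0] * bins * bins + q[..., 1] * bins + q[..., 2]
    labels, counts = np.unique(codes, return_counts=True)
    freq = counts / counts.sum()
    # Representative color of each occupied quantization bin.
    centers = np.stack([labels // (bins * bins),
                        (labels // bins) % bins,
                        labels % bins], axis=1).astype(np.float64)
    # Pairwise color distances between bins, weighted by bin frequency.
    dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    bin_saliency = (dist * freq[None, :]).sum(axis=1)
    # Map each pixel's code back to its bin saliency, normalize to [0, 1].
    lut = dict(zip(labels.tolist(), bin_saliency.tolist()))
    sal = np.vectorize(lut.get)(codes)
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-12)
```

On an image with a small red square on a gray background, the square region receives a near-1 saliency score while the background stays near 0, which is the behaviour a saliency-cut step would then threshold into a binary mask.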

A Remote Sensing Scene Classification Model Based on EfficientNetV2L Deep Neural Networks

  • Aljabri, Atif A.;Alshanqiti, Abdullah;Alkhodre, Ahmad B.;Alzahem, Ayyub;Hagag, Ahmed
    • International Journal of Computer Science & Network Security / Vol. 22, No. 10 / pp.406-412 / 2022
  • Scene classification of very high-resolution (VHR) imagery can attribute semantics to land cover in a variety of domains. Conventional techniques for remote sensing image classification have not addressed real-world application requirements. Recent research has demonstrated that deep convolutional neural networks (CNNs) are effective at extracting features due to their strong feature extraction capabilities. In order to improve classification performance, these approaches rely primarily on semantic information. Because abstract, global semantic information makes it difficult for a network to correctly classify scene images with similar structures and high inter-class similarity, such approaches achieve low classification accuracy. We propose a VHR remote sensing image classification model that extracts global features from the original VHR image using an EfficientNetV2-L CNN pre-trained to distinguish similar classes. The image is then classified using a multilayer perceptron (MLP). This method was evaluated using two benchmark remote sensing datasets: the 21-class UC Merced and the 38-class PatternNet. Compared to other state-of-the-art models, the proposed model significantly improves performance.
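The described pipeline (a pre-trained CNN backbone producing a global feature vector, followed by an MLP classifier) can be sketched as below. The backbone itself is omitted; the 1280-dimensional feature matches EfficientNetV2-L's pooled output and 21 classes matches UC Merced, but the hidden width, random weights, and inputs here are stand-ins, not the paper's trained model:

```python
import numpy as np

def mlp_classify(features, w1, b1, w2, b2):
    """MLP head over pooled CNN features: ReLU hidden layer + softmax."""
    h = np.maximum(features @ w1 + b1, 0.0)           # hidden layer, ReLU
    logits = h @ w2 + b2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)           # class probabilities

rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 1280))                     # pooled features for 4 images
w1, b1 = rng.normal(size=(1280, 256)) * 0.02, np.zeros(256)
w2, b2 = rng.normal(size=(256, 21)) * 0.02, np.zeros(21)   # 21 scene classes
probs = mlp_classify(feat, w1, b1, w2, b2)
```

Each row of `probs` is a distribution over the 21 scene classes; training the head (and fine-tuning the backbone) is what the paper's experiments evaluate.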

Real Scene Text Image Super-Resolution Based on Multi-Scale and Attention Fusion

  • Xinhua Lu;Haihai Wei;Li Ma;Qingji Xue;Yonghui Fu
    • Journal of Information Processing Systems / Vol. 19, No. 4 / pp.427-438 / 2023
  • Many works have indicated that single image super-resolution (SISR) models relying on synthetic datasets are difficult to apply to real scene text image super-resolution (STISR) because of its more complex degradation. The most recent dataset for realistic STISR is TextZoom, but current methods trained on this dataset have not considered the effect of multi-scale features of text images. In this paper, a multi-scale and attention fusion model for realistic STISR is proposed. A multi-scale learning mechanism is introduced to acquire sophisticated feature representations of text images, and spatial and channel attention are introduced to capture the local information and inter-channel interactions of text images. Finally, this paper designs a multi-scale residual attention module by fusing the multi-scale learning and attention mechanisms. Experiments on TextZoom demonstrate that the proposed model increases the average recognition accuracy of the scene text recognizer (ASTER) by 1.2% compared to the text super-resolution network baseline.
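The channel- and spatial-attention ingredients mentioned above can be illustrated with a stripped-down sketch. Real attention modules use learned FC/convolution layers to compute the gates; this version replaces them with parameter-free pooling plus a sigmoid, purely to show the gating structure, and is not the paper's module:

```python
import numpy as np

def _sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x):
    """Gate each channel by a squeeze (global average pool) of that channel."""
    gate = _sigmoid(x.mean(axis=(1, 2)))          # one gate per channel, in (0, 1)
    return x * gate[:, None, None]

def spatial_attention(x):
    """Gate each spatial location by a pool across channels."""
    gate = _sigmoid(x.mean(axis=0))               # one gate per (h, w) location
    return x * gate[None, :, :]

def fused_attention(x):
    """Apply channel attention, then spatial attention (a simple fusion order)."""
    return spatial_attention(channel_attention(x))
```

For a feature map `x` of shape `(channels, height, width)`, the output has the same shape, with each value rescaled by its channel gate times its spatial gate; a learned module would additionally adapt those gates during training.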

자연 영상에서의 정확한 문자 검출에 관한 연구 (A Study on Localization of Text in Natural Scene Images)

  • 최미영;김계영;최형일
    • Journal of the Korea Society of Computer and Information / Vol. 13, No. 5 / pp.77-84 / 2008
  • In this paper, we propose a new approach for efficiently detecting text in natural scene images. Reflection components present in an image acquired under light or illumination can introduce errors into text extraction and recognition when the boundaries of text or objects of interest become ambiguous, or when objects of interest and the background are mixed together. To remove the reflection component in an image, we first detect two peak points in the histogram of the red color channel of the image. Using the distribution between the two detected peaks, we determine whether the image is a normal or a polarized image. For a normal image, text regions are detected without additional processing; for a polarized image, homomorphic filtering is applied to remove the regions corresponding to the reflection component and eliminate the illumination component. To detect text regions, candidate text regions are then determined using color merging and a saliency map, respectively. Finally, the final text regions are detected using the two sets of candidate regions.
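The homomorphic filtering step mentioned in this abstract follows a classic recipe: illumination varies slowly (low frequency) while reflectance varies quickly (high frequency), and taking the log turns their product into a sum, so a high-emphasis frequency filter can suppress the illumination component. A minimal sketch, with cutoff and gain parameters as illustrative choices rather than the paper's values:

```python
import numpy as np

def homomorphic_filter(img, cutoff=0.1, gamma_l=0.5, gamma_h=2.0):
    """Suppress slowly varying illumination via log-domain frequency filtering."""
    log_img = np.log1p(img.astype(np.float64))        # product -> sum of components
    F = np.fft.fftshift(np.fft.fft2(log_img))
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    d2 = ((yy - h / 2) / h) ** 2 + ((xx - w / 2) / w) ** 2
    # Gaussian high-emphasis transfer function: attenuate lows (gamma_l < 1),
    # boost highs (gamma_h > 1).
    H = gamma_l + (gamma_h - gamma_l) * (1 - np.exp(-d2 / (2 * cutoff ** 2)))
    out = np.fft.ifft2(np.fft.ifftshift(F * H)).real
    return np.expm1(out)                              # back from the log domain
```

Applied to an image with a smooth illumination gradient, the low-frequency gradient is attenuated while edges (such as text strokes) are emphasized, which is why the abstract uses it before text-region extraction.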


애니메이션 화면 전환 수단으로서의 조형 요소 변화에 대한 연구 (A Study on the code and design elements as a way of transition)

  • 김진영
    • Cartoon and Animation Studies / No. 14 / pp.83-99 / 2008
  • In film, scene transitions are generally represented by wholesale changes of the entire frame, such as cuts and dissolves. In animated film, because each frame image is created individually, the various elements of the frame can be endowed with the emotion or narrative content to be conveyed, and it is also possible to shift into expression on a different semiotic level. In the modern era, as image-manipulation techniques such as morphing and metamorphosis have become more diverse and sophisticated, continuous scene composition can no longer be regarded as unique to 2D animation. However, continuously and intensely immersing the viewer's gaze in different visual dimensions, beyond character and background, that is, beyond object and space, remains a strong appeal of hand-drawn 2D animation. Ultimately, this characteristic enables a literary function: the delivery of delicate metaphor through the compositional elements of the whole frame, and of the connotative meaning of each individual element. The interpretation of a scene has become more multifaceted and complex, breaking down the boundary between the world of semiotic perspective and the world of flat graphic form. Analyzing the compositional criteria of graphic elements in animated film frames, and the effects of their use, should therefore aid analysis and application in today's advanced moving-image media with their new means of immersion.


영상합성을 위한 3D 공간 해석 및 조명환경의 재구성 (3D Analysis of Scene and Light Environment Reconstruction for Image Synthesis)

  • 황용호;홍현기
    • Journal of Korea Game Society / Vol. 6, No. 2 / pp.45-50 / 2006
  • To composite a virtual object realistically into a real-world space, the illumination information present in the space must be analyzed. This paper proposes a new method for reconstructing the light environment that estimates the positions of the camera and the lights without prior camera calibration. First, an HDR (high dynamic range) radiance map is generated from omni-directional multi-exposure images captured with a fisheye lens. The camera position is then estimated from a set of corresponding points, and the light positions are reconstructed using direction vectors. The light environment is further reconstructed by classifying lights into global lights, which affect much of the target space, and directional local lights, whose influence is confined to particular regions. Rendering within the reconstructed light environment using distributed ray tracing confirmed that realistic composite images are obtained. The proposed method has the advantage of requiring no prior camera calibration and of reconstructing the light environment automatically.
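The HDR radiance-map step described above can be illustrated with a simplified Debevec-style exposure merge. This sketch assumes a linear camera response (the full method also recovers the response curve) and uses a simple hat-shaped weight favoring mid-range pixel values; the weighting function is an illustrative choice:

```python
import numpy as np

def hdr_radiance_map(exposures, times):
    """Merge multiple exposures of a static scene into a radiance estimate.

    Each pixel's radiance is a weighted average of (pixel / exposure_time),
    with weights that favor well-exposed, mid-range values and down-weight
    pixels near 0 (noise) or 255 (clipping)."""
    num = np.zeros_like(exposures[0], dtype=np.float64)
    den = np.zeros_like(num)
    for img, t in zip(exposures, times):
        z = img.astype(np.float64)
        w = 1.0 - np.abs(z / 255.0 - 0.5) * 2.0   # hat weight, peak at mid-gray
        num += w * z / t
        den += w
    return num / np.maximum(den, 1e-12)
```

With a linear response and no clipping, every exposure of a constant scene radiance yields the same `pixel / time` ratio, so the merged map recovers that radiance exactly; in practice the weights matter because short exposures clip shadows and long exposures clip highlights.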


Effectual Method for 3D Rebuilding from Diverse Images

  • Leung, Carlos Wai Yin
    • Korea Information Convergence Society: Proceedings of the 2008 International Conference on Information Convergence / pp.145-150 / 2008
  • This thesis explores the problem of reconstructing a three-dimensional (3D) scene given a set of images or image sequences of the scene. It describes efficient methods for the 3D reconstruction of static and dynamic scenes from stereo images, stereo image sequences, and images captured from multiple viewpoints. Novel methods for image-based and volumetric modelling approaches to 3D reconstruction are presented, with an emphasis on the development of efficient algorithms which produce high-quality and accurate reconstructions. For image-based 3D reconstruction, a novel energy minimisation scheme, Iterated Dynamic Programming, is presented for the efficient computation of strong local minima of discontinuity-preserving energy functions. Coupled with a novel morphological decomposition method and subregioning schemes for the efficient computation of a narrowband matching cost volume, the minimisation framework is applied to solve problems in stereo matching, stereo-temporal reconstruction, motion estimation, 2D image registration and 3D image registration. This thesis establishes Iterated Dynamic Programming as an efficient and effective energy minimisation scheme suitable for computer vision problems which involve finding correspondences across images. For 3D reconstruction from multiple-view images with arbitrary camera placement, a novel volumetric modelling technique, Embedded Voxel Colouring, is presented that efficiently embeds all reconstructions of a 3D scene into a single output in a single scan of the volumetric space under exact visibility. An adaptive thresholding framework is also introduced for the computation of the optimal set of thresholds to obtain high-quality 3D reconstructions. This thesis establishes the Embedded Voxel Colouring framework as a fast, efficient and effective method for 3D reconstruction from multiple-view images.
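The stereo-matching problems this thesis targets are classically attacked with dynamic programming over per-pixel matching costs plus a smoothness penalty. As a hedged illustration of that ingredient (a single-scanline DP matcher, not the thesis's Iterated Dynamic Programming scheme), with absolute-difference costs and a linear penalty as arbitrary choices:

```python
import numpy as np

def scanline_disparity(left, right, max_disp, smooth=1.0):
    """DP stereo matching along one scanline.

    Unary cost: |left[x] - right[x - d]| for each pixel x and disparity d
    (a large constant where the match falls outside the image).  A linear
    penalty on disparity changes between neighbours is added, and the
    minimum-cost disparity path is recovered by backtracking."""
    n = len(left)
    disps = np.arange(max_disp + 1)
    cost = np.full((n, max_disp + 1), 1e9)
    for d in disps:
        cost[d:, d] = np.abs(left[d:] - right[:n - d])
    total = cost.copy()
    back = np.zeros((n, max_disp + 1), dtype=int)
    for x in range(1, n):
        # trans[d, d_prev] = accumulated cost of arriving at d from d_prev.
        trans = total[x - 1][None, :] + smooth * np.abs(disps[:, None] - disps[None, :])
        back[x] = trans.argmin(axis=1)
        total[x] += trans.min(axis=1)
    # Backtrack the optimal disparity path from the cheapest final state.
    out = np.zeros(n, dtype=int)
    out[-1] = total[-1].argmin()
    for x in range(n - 1, 0, -1):
        out[x - 1] = back[x, out[x]]
    return out
```

On a pair where the left scanline is the right one shifted by two pixels, the recovered path settles on disparity 2; the thesis's contribution lies in iterating and generalizing this kind of minimisation efficiently beyond single scanlines.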


동적 환경에 강인한 장면 인식 기반의 로봇 자율 주행 (Scene Recognition based Autonomous Robot Navigation robust to Dynamic Environments)

  • 김정호;권인소
    • The Journal of Korea Robotics Society / Vol. 3, No. 3 / pp.245-254 / 2008
  • Recently, many vision-based navigation methods have been introduced as intelligent robot applications. However, many of these methods mainly focus on finding the database image that corresponds to a query image. Thus, if the environment changes, for example, when objects move in the environment, a robot is unlikely to find consistent corresponding points with one of the database images. To solve this problem, we propose a novel navigation strategy that uses fast motion estimation and a practical scene recognition scheme that addresses the kidnapping problem, defined as the problem of re-localizing a mobile robot after it has undergone an unknown motion or visual occlusion. The algorithm is based on camera motion estimation to plan the robot's next movement and an efficient outlier rejection algorithm for scene recognition. Experimental results demonstrate the capability of the vision-based autonomous navigation in dynamic environments.


Video Segmentation and Key frame Extraction using Multi-resolution Analysis and Statistical Characteristic

  • Cho, Wan-Hyun;Park, Soon-Young;Park, Jong-Hyun
    • Communications for Statistical Applications and Methods / Vol. 10, No. 2 / pp.457-469 / 2003
  • In this paper, we propose an efficient algorithm that can detect video scene changes using various statistical characteristics obtained by applying the wavelet transform to each frame. Our method first extracts histogram features from the low-frequency subband of the wavelet-transformed image and uses these features to detect abrupt scene changes. Second, it extracts edge information by applying the mesh method to the high-frequency subband of the transformed image. We quantify the extracted edge information as the variance characteristic of each pixel and use these values to detect gradual scene changes. We also propose an algorithm for extracting a proper key frame from each segmented video scene. Experimental results show that the proposed method is both very efficient at segmenting video frames and an appropriate key frame extraction method.
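The abrupt-change step above (comparing histogram features of the low-frequency wavelet subband across consecutive frames) can be sketched as follows. The LL subband is approximated here by one level of Haar-style 2x2 averaging, and the bin count, histogram distance, and threshold are illustrative choices rather than the paper's:

```python
import numpy as np

def ll_subband(frame):
    """One-level Haar LL approximation: average each 2x2 block."""
    h, w = frame.shape
    f = frame[:h - h % 2, :w - w % 2].astype(np.float64)
    return (f[0::2, 0::2] + f[0::2, 1::2] + f[1::2, 0::2] + f[1::2, 1::2]) / 4.0

def abrupt_cuts(frames, bins=16, threshold=0.5):
    """Flag frame indices where LL-subband histograms change sharply."""
    hists = []
    for f in frames:
        hist, _ = np.histogram(ll_subband(f), bins=bins, range=(0, 256))
        hists.append(hist / hist.sum())
    cuts = []
    for i in range(1, len(hists)):
        # Total-variation distance between consecutive normalized histograms.
        d = 0.5 * np.abs(hists[i] - hists[i - 1]).sum()
        if d > threshold:
            cuts.append(i)
    return cuts
```

On a synthetic sequence of five dark frames followed by five bright frames, the only flagged index is the transition frame; gradual transitions would need the variance-based high-frequency features the abstract describes in addition.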

비전 시스템을 이용한 이동로봇 Self-positioning과 VRML과의 영상오버레이 (Self-Positioning of a Mobile Robot using a Vision System and Image Overlay with VRML)

  • 권방현;정길도
    • The Korean Institute of Electrical Engineers: Proceedings of the 2005 Symposium, Information and Control Section / pp.258-260 / 2005
  • We describe a method for localizing a mobile robot in its working environment using a vision system and VRML. The robot identifies landmarks in the environment and carries out self-positioning. Image-processing and neural-network pattern-matching techniques are employed to recognize landmarks placed in the robot's working environment. Self-positioning using the vision system is based on a well-known localization algorithm. After self-positioning, the 2D scene is overlaid with the VRML scene. This paper describes how to realize the self-positioning and shows the result of overlaying the 2D scene and the VRML scene. In addition, we describe the advantages expected from overlapping both scenes.
