Title/Summary/Keyword: saliency

Building Change Detection Using Deep Learning for Remote Sensing Images

  • Wang, Chang; Han, Shijing; Zhang, Wen; Miao, Shufeng
    • Journal of Information Processing Systems, v.18 no.4, pp.587-598, 2022
  • To increase building change recognition accuracy, we present a deep learning-based building change detection method for remote sensing images. In the proposed approach, we create a difference image (DI) by merging pixel-level and object-level information from multitemporal remote sensing images, and a frequency-domain saliency technique is used to generate the DI saliency map. Fuzzy C-means clustering then pre-classifies a coarse change detection map by thresholding the DI saliency map. We then extract the neighborhood features of the unchanged and changed (building) pixels from pixel-level and object-level feature images, which are used as valid training samples for a deep neural network (DNN). The trained DNNs are then utilized to identify changes in the DI. The suggested strategy was evaluated and compared against current detection methods on two datasets. The results suggest that the proposed technique can detect more building change information and improve change detection accuracy.
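
As a rough illustration of the pre-classification stage described above, the sketch below derives a difference image, computes a frequency-domain (spectral-residual) saliency map over it, and thresholds that map. It assumes opencv-contrib-python for the saliency call, and Otsu thresholding stands in for the paper's fuzzy C-means step.

```python
import cv2
import numpy as np

def coarse_change_map(t1: np.ndarray, t2: np.ndarray) -> np.ndarray:
    """Rough changed/unchanged mask from two co-registered grayscale epochs."""
    di = cv2.absdiff(t1, t2)                       # pixel-level difference image (DI)
    sal = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, sal_map = sal.computeSaliency(di)          # frequency-domain saliency of the DI
    sal_map = (sal_map * 255).astype(np.uint8)
    # Otsu is an illustrative stand-in for the fuzzy C-means pre-classification.
    _, coarse = cv2.threshold(sal_map, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return coarse
```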

Wine Label Detection Using Saliency Map and Mean Shift Algorithm (중요도 맵과 Mean Shift 알고리즘을 이용한 와인 라벨 검출)

  • Chen, Yan-Juan; Lee, Myung-Eun; Kim, Soo-Hyung
    • Proceedings of the Korea Information Processing Society Conference, 2011.04a, pp.384-385, 2011
  • This paper proposes a method for detecting wine labels in mobile phone images using a saliency map and the mean shift algorithm. The mean shift algorithm is a non-parametric clustering technique that requires no prior knowledge of the number of clusters, but it has the drawback of a long running time. To address this problem, a saliency map is first applied to the input color wine image to find the salient regions. Next, following the color mask obtained from the mean shift segmentation result, the color region with the highest frequency is found and the wine label region is detected. Experimental results show that the proposed method can efficiently detect the label regions of various wine images acquired with mobile phones.
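
A minimal sketch of the saliency-then-mean-shift idea, assuming a BGR input and opencv-contrib-python: cv2.pyrMeanShiftFiltering stands in for the paper's mean shift clustering, and the saliency mask restricts it to a region of interest to cut the running time the paper identifies as the bottleneck.

```python
import cv2
import numpy as np

def label_candidate(img_bgr: np.ndarray) -> np.ndarray:
    """Mean-shift segmentation restricted to the salient (label) area."""
    sal = cv2.saliency.StaticSaliencySpectralResidual_create()
    _, sal_map = sal.computeSaliency(img_bgr)
    _, mask = cv2.threshold((sal_map * 255).astype(np.uint8), 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    x, y, w, h = cv2.boundingRect(mask)            # bounding box of the salient area
    roi = img_bgr[y:y + h, x:x + w]
    # Running mean shift on the ROI only is what saves time versus the full image.
    return cv2.pyrMeanShiftFiltering(roi, sp=20, sr=40)
```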

Automatic Segmentation of Product Bottle Label Based on GrabCut Algorithm

  • Na, In Seop; Chen, Yan Juan; Kim, Soo Hyung
    • International Journal of Contents, v.10 no.4, pp.1-10, 2014
  • In this paper, we propose a method to build an accurate initial trimap for the GrabCut algorithm without the need for human interaction. First, we identify a rough candidate for the label region of a bottle by applying a saliency map to find a salient area in the image. Then, the Hough transform is used to detect the left and right borders of the label region, and the k-means algorithm is used to localize its upper and lower borders. These four borders are used to build an initial trimap for the GrabCut method. Finally, GrabCut segments an accurate label region. Experimental results on 130 wine bottle images demonstrated that the saliency map extracted a rough label region with an accuracy of 97.69% while removing the complex background. The Hough transform and projection method accurately drew the outline of the label from the saliency area, and the outline was then used to build an initial trimap for GrabCut. Finally, the GrabCut algorithm successfully segmented the bottle label with an average accuracy of 92.31%. We therefore believe that our method is suitable for product label recognition systems that automatically segment product labels. Although our method achieved encouraging results, it produces unreliable results under varying illumination and reflections, so we are developing preprocessing algorithms to account for these variations.
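
The GrabCut call itself is standard OpenCV; the sketch below assumes the four borders have already been fused into a bounding rectangle (the Hough/k-means estimates), which is one way an initial trimap can be seeded without user interaction.

```python
import cv2
import numpy as np

def segment_label(img: np.ndarray, rect: tuple) -> np.ndarray:
    """rect = (x, y, w, h) built from the detected label borders."""
    mask = np.zeros(img.shape[:2], np.uint8)
    bgd = np.zeros((1, 65), np.float64)            # background GMM state
    fgd = np.zeros((1, 65), np.float64)            # foreground GMM state
    cv2.grabCut(img, mask, rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
    # Definite or probable foreground pixels form the segmented label.
    fg = (mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)
    return np.where(fg, 255, 0).astype(np.uint8)
```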

Automatic Detection of Objects-of-Interest using Visual Attention and Image Segmentation (시각 주의와 영상 분할을 이용한 관심 객체 자동 검출 기법)

  • Shi, Do Kyung; Moon, Young Shik
    • Journal of the Institute of Electronics and Information Engineers, v.51 no.5, pp.137-151, 2014
  • This paper proposes a method of detecting objects of interest (OOI) in general natural images. An OOI is what a human subjectively judges to be the object of interest in an image; human vision, in general, tends to focus on it. As the first step toward automatic OOI detection, candidate regions are detected using a saliency map based on human visual perception. The saliency map locates an approximate OOI, but the object is not accurately segmented. To address this problem, in the second step, the exact object region is automatically detected by combining graph-based image segmentation and skeletonization. We compute precision, recall, and accuracy to compare the performance of the proposed method with existing methods. Experimental results show that the proposed method achieves better performance than existing methods by reducing problems such as under-detection and over-detection.
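
As an illustrative sketch of the second step, assuming scikit-image: felzenszwalb() plays the role of the graph-based segmentation, and segments are kept when their mean saliency is high, refining the coarse saliency blob into a sharper object mask (the paper's skeletonization step is omitted here).

```python
import numpy as np
from skimage.segmentation import felzenszwalb

def ooi_mask(img: np.ndarray, sal_map: np.ndarray, thresh: float = 0.5) -> np.ndarray:
    """Keep graph-based segments whose mean saliency exceeds a threshold."""
    segments = felzenszwalb(img, scale=100, sigma=0.8, min_size=50)
    mask = np.zeros(segments.shape, dtype=bool)
    for lbl in np.unique(segments):
        region = segments == lbl
        if sal_map[region].mean() > thresh:        # salient segment -> part of the OOI
            mask |= region
    return mask
```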

Generation of Stereoscopic Image from 2D Image based on Saliency and Edge Modeling (관심맵과 에지 모델링을 이용한 2D 영상의 3D 변환)

  • Kim, Manbae
    • Journal of Broadcast Engineering, v.20 no.3, pp.368-378, 2015
  • 3D conversion technology has been studied over the past decades and integrated into commercial 3D displays and 3DTVs. 3D conversion plays an important role in the augmented functionality of three-dimensional television (3DTV) because it can easily provide 3D content. Generally, depth cues extracted from a static image are used to generate a depth map, followed by depth-image-based rendering (DIBR) to produce a stereoscopic image. However, except for some particular images, depth cues are rare, so consistent depth-map quality cannot be guaranteed. It is therefore imperative to devise a 3D conversion method that produces satisfactory and consistent 3D for diverse video content. From this viewpoint, this paper proposes a novel method applicable to general types of images, utilizing edges as well as saliency. To generate a depth map, geometric perspective, an affinity model, and a binomial filter are used. In the experiments, the proposed method was run on 24 video clips with a variety of content. A subjective test of 3D perception and visual fatigue validated satisfactory and comfortable viewing of the 3D content.
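
A toy sketch of the DIBR stage only, under the assumption that a depth map in [0, 1] has already been produced from the saliency and edge cues: pixels are shifted horizontally by a depth-proportional disparity, leaving holes for later inpainting.

```python
import numpy as np

def render_right_view(img: np.ndarray, depth: np.ndarray, max_disp: int = 16) -> np.ndarray:
    """Warp one view horizontally by a disparity derived from the depth map."""
    h, w = depth.shape
    right = np.zeros_like(img)
    disp = (depth * max_disp).astype(int)          # nearer pixels shift further
    for y in range(h):
        for x in range(w):
            xs = x - disp[y, x]
            if 0 <= xs < w:
                right[y, xs] = img[y, x]           # unfilled pixels remain as holes
    return right
```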

A Method of Auto Photography Composition Suggestion (사진의 자동 구도 보정 제시 기법)

  • Choi, Yong-Sub; Park, Dae-Hyun; Kim, Yoon
    • Journal of the Korea Society of Computer and Information, v.19 no.1, pp.9-21, 2014
  • In this paper, we propose an automatic photographic-composition correction technique that helps a general user obtain a stably composed image that draws the viewer's eye. Because general users usually photograph without background knowledge of photographic composition, the subject is often poorly placed and the composition unstable, in contrast to the stable compositions of pictures taken by experts. Therefore, we provide not a method that processes the image after shooting, but a method that automatically suggests a stable composition while the user is taking the photograph. The proposed method analyzes the subject using a saliency map, image segmentation, edge detection, etc., and indicates where the subject should be placed for a stable composition, along with a rule-of-thirds guideline. Experimental results show that a good composition was suggested to the user automatically.
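
One hedged reading of the rule-of-thirds suggestion, as a sketch: take the saliency centroid of the subject and report the offset that would move it onto the nearest third-line intersection. The grid points and centroid rule here are illustrative, not the paper's exact procedure.

```python
import numpy as np

def thirds_offset(sal_map: np.ndarray) -> tuple:
    """(dy, dx) that moves the saliency centroid onto the nearest thirds point."""
    h, w = sal_map.shape
    ys, xs = np.mgrid[0:h, 0:w]
    total = sal_map.sum()
    cy, cx = (ys * sal_map).sum() / total, (xs * sal_map).sum() / total
    points = [(r * h / 3, c * w / 3) for r in (1, 2) for c in (1, 2)]
    ty, tx = min(points, key=lambda p: (p[0] - cy) ** 2 + (p[1] - cx) ** 2)
    return ty - cy, tx - cx
```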

Salient Object Detection via Multiple Random Walks

  • Zhai, Jiyou; Zhou, Jingbo; Ren, Yongfeng; Wang, Zhijian
    • KSII Transactions on Internet and Information Systems (TIIS), v.10 no.4, pp.1712-1731, 2016
  • In this paper, we propose a novel saliency detection framework based on multiple random walks (MRW), which simulates multiple agents on a graph simultaneously. In the MRW system, two agents, representing the background and foreground seeds, traverse the graph according to a transition matrix and interact with each other to reach a state of equilibrium. The proposed algorithm consists of three steps. First, an initial segmentation partitions the input image into homogeneous regions (i.e., superpixels) for saliency computation. Based on these regions, we construct a graph whose nodes correspond to the superpixels and whose edges between neighboring nodes represent the similarities of the corresponding superpixels. Second, to generate the background seeds, we filter out the one of the four image boundaries that is least likely to belong to the background; the superpixels on the three remaining sides are labeled as background seeds. To generate the foreground seeds, we use the center prior that foreground objects tend to appear near the image center. In the last step, the foreground and background seeds are treated as two different agents in multiple random walks to complete the salient object detection. Experimental results on three benchmark databases demonstrate that the proposed method performs well against state-of-the-art methods in terms of accuracy and robustness.
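
The core update for a single agent can be written compactly; the sketch below iterates a random walk with restart on the superpixel affinity matrix until equilibrium. The paper's MRW couples two such agents through their interaction, which this single-agent sketch does not model.

```python
import numpy as np

def random_walk(W: np.ndarray, seed: np.ndarray, alpha: float = 0.9) -> np.ndarray:
    """W: superpixel affinity matrix; seed: restart distribution of one agent."""
    P = W / W.sum(axis=1, keepdims=True)           # row-stochastic transition matrix
    p = seed.copy()
    for _ in range(100):
        p_next = alpha * P.T @ p + (1 - alpha) * seed
        if np.abs(p_next - p).sum() < 1e-6:        # reached (near) equilibrium
            break
        p = p_next
    return p
```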

Video Based Tail-Lights Status Recognition Algorithm (영상기반 차량 후미등 상태 인식 알고리즘)

  • Kim, Gyu-Yeong; Lee, Geun-Hoo; Do, Jin-Kyu; Park, Keun-Soo; Park, Jang-Sik
    • The Journal of the Korea institute of electronic communication sciences, v.8 no.10, pp.1443-1449, 2013
  • Automatic detection of vehicles ahead is an integral component of many advanced driver-assistance systems, such as collision mitigation, automatic cruise control, and automatic headlamp dimming. Day and night, tail-lights play an important role in detecting vehicles ahead and recognizing their status. However, some drivers do not know the status of their vehicle's tail-lights, so it is useful to report tail-light status to drivers automatically. In this paper, a method for recognizing tail-light status based on video processing and recognition technology is proposed. Background estimation, optical flow, and Euclidean distance are used to detect vehicles entering a tollgate. A saliency map is then used to detect the tail-lights, and their status is recognized in the Lab color space. Experiments on tollgate videos show that the proposed method can be used to report tail-light status.
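
A hedged sketch of the Lab-space status check: a lit red tail-light pushes the a* channel high, so counting strongly red pixels inside a detected light region gives an ON/OFF decision. The thresholds here are illustrative, not the paper's values.

```python
import cv2
import numpy as np

def taillight_on(roi_bgr: np.ndarray, a_thresh: int = 160) -> bool:
    """True if enough strongly red pixels appear in the tail-light region."""
    lab = cv2.cvtColor(roi_bgr, cv2.COLOR_BGR2LAB)
    a = lab[:, :, 1]                               # a* channel (red-green axis, +128 offset)
    return float((a > a_thresh).mean()) > 0.05
```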

Tile-Based 360 Degree Video Streaming System with User's gaze Prediction (사용자 시선 예측을 통한 360 영상 타일 기반 스트리밍 시스템)

  • Lee, Soonbin; Jang, Dongmin; Jeong, Jong-Beom; Lee, Sangsoon; Ryu, Eun-Seok
    • Journal of Broadcast Engineering, v.24 no.6, pp.1053-1063, 2019
  • Recently, tile-based streaming, which transmits one 360-degree video as several tiles, has been actively studied to deliver 360-degree video more efficiently. In this paper, to transmit high-definition 360-degree video corresponding to the user's viewport in tile-based streaming scenarios, we propose a system that assigns a quality to each tile by applying a saliency map generated by existing network models. Each tile is encoded independently using the motion-constrained tile set (MCTS) technique, and the user's viewport is rendered and tested on the Salient360! dataset. Streaming 360-degree video with the proposed system yields a gain of up to 23% in the user's viewport compared to conventional high-efficiency video coding (HEVC).
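
A sketch of the saliency-driven quality assignment, under an assumed tile grid and QP range: tiles with higher mean saliency receive a lower quantization parameter (higher quality), which is one plausible way to realize the per-tile assignment the paper describes.

```python
import numpy as np

def tile_qps(sal_map: np.ndarray, rows: int = 4, cols: int = 8,
             qp_min: int = 22, qp_max: int = 37) -> np.ndarray:
    """Map mean tile saliency in [0, 1] linearly onto a QP range."""
    h, w = sal_map.shape
    qps = np.empty((rows, cols), dtype=int)
    for r in range(rows):
        for c in range(cols):
            tile = sal_map[r * h // rows:(r + 1) * h // rows,
                           c * w // cols:(c + 1) * w // cols]
            qps[r, c] = round(qp_max - tile.mean() * (qp_max - qp_min))
    return qps
```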

Face Recognition Network using gradCAM (gradCam을 사용한 얼굴인식 신경망)

  • Chan Hyung Baek; Kwon Jihun; Ho Yub Jung
    • Smart Media Journal, v.12 no.2, pp.9-14, 2023
  • In this paper, we propose a face recognition network that attempts to use more facial features while using a smaller number of training sets. When combining neural networks for face recognition, we want to use networks that rely on different parts of the facial features; however, training chooses randomly where these facial features are obtained. On the other hand, the judgment basis of a network model can be expressed as a saliency map through gradCAM. Therefore, in this paper, we use gradCAM to visualize where the trained face recognition model made its observations and recognition judgments, so that the network combination can be constructed from the different facial features used. Using this approach, we trained a network for a small face recognition problem. In a simple toy face recognition example, the recognition network used in this paper improves accuracy by 1.79% and reduces the equal error rate (EER) by 0.01788 compared to the conventional approach.
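
Grad-CAM itself is standard and can be sketched in a few lines of PyTorch; `model`, the chosen `target_layer`, and a logits-per-class output are assumptions here, as the abstract does not spell out the backbone.

```python
import torch

def grad_cam(model, x, target_layer, class_idx):
    """Saliency map of the target class from the chosen conv layer."""
    feats, grads = {}, {}
    h1 = target_layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))
    score = model(x)[0, class_idx]                 # assumes logits of shape (B, classes)
    model.zero_grad()
    score.backward()
    h1.remove(); h2.remove()
    w = grads['a'].mean(dim=(2, 3), keepdim=True)  # channel weights: pooled gradients
    cam = torch.relu((w * feats['a']).sum(dim=1))  # weighted activation sum, then ReLU
    return cam / cam.max()
```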