• Title/Summary/Keyword: saliency.

Search Results: 225

Small Object Segmentation Based on Visual Saliency in Natural Images

  • Manh, Huynh Trung;Lee, Gueesang
    • Journal of Information Processing Systems, v.9 no.4, pp.592-601, 2013
  • Object segmentation is a challenging task in image processing and computer vision. In this paper, we present a visual attention based segmentation method to segment small interesting objects in natural images. Different from traditional methods, we first search for the region of interest using our novel saliency-based method, which is mainly based on band-pass filtering at an appropriate frequency. Second, we apply a Gaussian Mixture Model (GMM) to locate the object region. By incorporating visual attention analysis into object segmentation, our proposed approach narrows the search region for segmentation, so that accuracy is increased and computational complexity is reduced. The experimental results indicate that our proposed approach is efficient for object segmentation in natural images, especially for small objects, and significantly outperforms traditional GMM-based segmentation.
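
The abstract above outlines a two-stage pipeline: a band-pass-filtering saliency map to narrow the search region, followed by GMM-based segmentation. The exact filter and model settings are not given here, so the following is a minimal sketch assuming a difference-of-Gaussians band-pass filter in Lab color space and a two-component GMM; the sigmas, the top_fraction threshold, and the image path are illustrative placeholders, not values from the paper.

```python
# Minimal sketch (assumed pipeline, not the paper's exact implementation):
# difference-of-Gaussians band-pass saliency in Lab space, then a 2-component
# GMM fitted only inside the salient window.
import cv2
import numpy as np
from sklearn.mixture import GaussianMixture

def bandpass_saliency(image_bgr, low_sigma=1.0, high_sigma=15.0):
    """Band-pass (DoG) saliency: keep mid frequencies, drop fine noise and DC."""
    lab = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    fine = cv2.GaussianBlur(lab, (0, 0), low_sigma)
    coarse = cv2.GaussianBlur(lab, (0, 0), high_sigma)
    saliency = np.linalg.norm(fine - coarse, axis=2)
    return cv2.normalize(saliency, None, 0.0, 1.0, cv2.NORM_MINMAX)

def segment_salient_region(image_bgr, saliency, top_fraction=0.2):
    """Fit a 2-component GMM on the bounding box of the most salient pixels."""
    ys, xs = np.where(saliency > np.quantile(saliency, 1.0 - top_fraction))
    y0, y1, x0, x1 = ys.min(), ys.max(), xs.min(), xs.max()
    roi = image_bgr[y0:y1 + 1, x0:x1 + 1].reshape(-1, 3).astype(np.float64)
    labels = GaussianMixture(n_components=2, random_state=0).fit_predict(roi)
    return labels.reshape(y1 - y0 + 1, x1 - x0 + 1), (y0, y1, x0, x1)

if __name__ == "__main__":
    img = cv2.imread("example.jpg")            # placeholder: any natural image
    sal = bandpass_saliency(img)
    mask, box = segment_salient_region(img, sal)
```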

Video Saliency Detection Using Bi-directional LSTM

  • Chi, Yang;Li, Jinjiang
    • KSII Transactions on Internet and Information Systems (TIIS), v.14 no.6, pp.2444-2463, 2020
  • Saliency detection in video can allocate computing resources more rationally and reduce the amount of computation while improving accuracy. Deep learning can extract the edge features of an image, providing technical support for video saliency. This paper proposes a new detection method that combines a Convolutional Neural Network (CNN) and a Deep Bidirectional LSTM network (DB-LSTM) to learn spatio-temporal features, exploiting object motion information to generate saliency maps for consecutive video frames. We also analyzed the sample database and found that human attention and saliency transitions are time-dependent, so we additionally consider cross-frame saliency detection. Finally, experiments show that our method is superior to other state-of-the-art methods.
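
As an illustration of the CNN + bidirectional LSTM combination described above, the sketch below wires a small per-frame convolutional encoder to a bidirectional LSTM that predicts a coarse saliency map for every frame. The layer sizes, the 8x8 output resolution, and the class name CNNBiLSTMSaliency are assumptions for illustration, not the paper's DB-LSTM configuration.

```python
# Minimal PyTorch sketch of a CNN + bidirectional LSTM video-saliency model.
# Layer sizes are illustrative; the paper's exact DB-LSTM setup is not reproduced.
import torch
import torch.nn as nn

class CNNBiLSTMSaliency(nn.Module):
    def __init__(self, feat_dim=128, hidden=64):
        super().__init__()
        # Per-frame spatial encoder (stand-in for the CNN feature extractor).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((8, 8)),
        )
        # Bidirectional LSTM over the frame sequence (temporal modeling).
        self.lstm = nn.LSTM(feat_dim * 8 * 8, hidden,
                            batch_first=True, bidirectional=True)
        # Decode each temporal state back to a coarse 8x8 saliency map.
        self.decoder = nn.Linear(2 * hidden, 8 * 8)

    def forward(self, clip):                           # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.encoder(clip.flatten(0, 1))       # (B*T, C, 8, 8)
        feats = feats.flatten(1).view(b, t, -1)        # (B, T, C*64)
        states, _ = self.lstm(feats)                   # (B, T, 2*hidden)
        maps = self.decoder(states).view(b, t, 1, 8, 8)
        return torch.sigmoid(maps)                     # per-frame saliency

# Example: 2 clips of 16 RGB frames at 64x64 -> 16 coarse saliency maps each.
model = CNNBiLSTMSaliency()
out = model(torch.randn(2, 16, 3, 64, 64))             # (2, 16, 1, 8, 8)
```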

Location-Based Saliency Maps from a Fully Connected Layer using Multi-Shapes

  • Kim, Hoseung;Han, Seong-Soo;Jeong, Chang-Sung
    • KSII Transactions on Internet and Information Systems (TIIS), v.15 no.1, pp.166-179, 2021
  • Recently, with the development of technology, computer vision research based on the human visual system has been actively conducted. Saliency maps have been used to highlight areas that are visually interesting within the image, but they can suffer from low performance due to external factors, such as an indistinct background or light source. In this study, existing color, brightness, and contrast feature maps are subjected to multiple shape and orientation filters and then connected to a fully connected layer to determine pixel intensities within the image based on location-based weights. The proposed method demonstrates better performance in separating the background from the area of interest in terms of color and brightness in the presence of external elements and noise. Location-based weight normalization is also effective in removing pixels with high intensity that are outside of the image or in non-interest regions. Our proposed method also demonstrates that multi-filter normalization can be processed faster using parallel processing.
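
The abstract above combines multi-shape/orientation filtering with location-based weighting. As a rough sketch of those two ingredients (not the authors' network), the code below applies a small Gabor filter bank to a feature map and modulates the response with a center-prior location weight; the filter parameters, the Gaussian prior, and the image path are assumptions.

```python
# Sketch: multi-orientation Gabor responses on a feature map, followed by
# location-based weight normalization. Parameters are illustrative only.
import cv2
import numpy as np

def oriented_responses(feature_map, n_orientations=4, ksize=15):
    responses = []
    for i in range(n_orientations):
        theta = i * np.pi / n_orientations
        kern = cv2.getGaborKernel((ksize, ksize), sigma=3.0, theta=theta,
                                  lambd=8.0, gamma=0.5)
        responses.append(np.abs(cv2.filter2D(feature_map, cv2.CV_32F, kern)))
    return np.max(responses, axis=0)           # strongest orientation per pixel

def location_weighted(saliency):
    """Down-weight responses far from the image center (location prior)."""
    h, w = saliency.shape
    yy, xx = np.mgrid[0:h, 0:w]
    dist = np.hypot((yy - h / 2) / h, (xx - w / 2) / w)
    weight = np.exp(-8.0 * dist ** 2)
    out = saliency * weight
    return out / (out.max() + 1e-8)

gray = cv2.imread("example.jpg", cv2.IMREAD_GRAYSCALE).astype(np.float32)  # placeholder path
sal = location_weighted(oriented_responses(gray))
```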

Saliency-Assisted Collaborative Learning Network for Road Scene Semantic Segmentation

  • Haifeng Sima;Yushuang Xu;Minmin Du;Meng Gao;Jing Wang
    • KSII Transactions on Internet and Information Systems (TIIS), v.17 no.3, pp.861-880, 2023
  • Semantic segmentation of road scenes is a key technology for autonomous driving, and improvements in convolutional neural network architectures drive improvements in segmentation performance. However, existing convolutional neural networks tend to learn over-simplified knowledge while remaining structurally complex. To address this issue, we propose a road scene semantic segmentation algorithm based on multi-task collaborative learning. First, a depthwise separable convolution atrous spatial pyramid pooling module is proposed to reduce model complexity. Second, a collaborative learning framework incorporating saliency detection is proposed, and the joint loss function is defined using homoscedastic uncertainty to fit the new learning model. Experiments are conducted on road and natural scene datasets. The proposed method achieves 70.94% and 64.90% mIoU on the Cityscapes and PASCAL VOC 2012 datasets, respectively. Qualitatively, compared with strong existing methods, the proposed method shows clear advantages in segmenting fine targets and boundaries.
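
The joint loss based on homoscedastic uncertainty mentioned above is commonly implemented with one learnable log-variance per task. The sketch below shows that form for a segmentation task plus a saliency task; the loss choices, tensor shapes, and class name are illustrative assumptions rather than the paper's exact formulation.

```python
# Sketch of a joint loss weighted by homoscedastic (task) uncertainty for a
# segmentation + saliency collaborative model; the learned log-variances
# trade off the two tasks automatically.
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    def __init__(self):
        super().__init__()
        # One learnable log(sigma^2) per task (segmentation, saliency).
        self.log_var_seg = nn.Parameter(torch.zeros(1))
        self.log_var_sal = nn.Parameter(torch.zeros(1))
        self.seg_loss = nn.CrossEntropyLoss()
        self.sal_loss = nn.BCEWithLogitsLoss()

    def forward(self, seg_logits, seg_target, sal_logits, sal_target):
        l_seg = self.seg_loss(seg_logits, seg_target)
        l_sal = self.sal_loss(sal_logits, sal_target)
        # L = exp(-s) * L_task + s, with s = log(sigma^2) for each task.
        return (torch.exp(-self.log_var_seg) * l_seg + self.log_var_seg
                + torch.exp(-self.log_var_sal) * l_sal + self.log_var_sal)

# Example with random tensors (19 classes, as in Cityscapes training labels).
criterion = UncertaintyWeightedLoss()
loss = criterion(torch.randn(2, 19, 64, 64),            # segmentation logits
                 torch.randint(0, 19, (2, 64, 64)),      # class labels
                 torch.randn(2, 1, 64, 64),              # saliency logits
                 torch.rand(2, 1, 64, 64))               # saliency ground truth
```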

Multi-view Image Generation from Stereoscopic Image Features and the Occlusion Region Extraction (가려짐 영역 검출 및 스테레오 영상 내의 특징들을 이용한 다시점 영상 생성)

  • Lee, Wang-Ro;Ko, Min-Soo;Um, Gi-Mun;Cheong, Won-Sik;Hur, Nam-Ho;Yoo, Ji-Sang
    • Journal of Broadcast Engineering, v.17 no.5, pp.838-850, 2012
  • In this paper, we propose a novel algorithm that generates multi-view images by using various image features obtained from given stereoscopic images. In the proposed algorithm, we first create an intensity-gradient saliency map from the given stereo images. We then calculate a block-based optical flow that represents the relative movement (disparity) of each fixed-size block between the left and right images, and also obtain the disparities of feature points extracted by SIFT (scale-invariant feature transform). We then create a disparity saliency map by combining these extracted disparity features; the disparity saliency map is refined through occlusion detection and removal of false disparities. Third, we extract straight line segments in order to minimize the distortion of straight lines during image warping. Finally, we generate multi-view images with a grid-mesh-based image warping algorithm, in which the extracted image features are used as constraints. The experimental results show that the proposed algorithm performs better than the conventional DIBR algorithm in terms of visual quality.
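
As a small illustration of the first cue described above, the intensity-gradient saliency map, the sketch below computes a normalized Sobel gradient magnitude for each view of the stereo pair; the kernel size and file names are placeholders, and the later steps (block optical flow, SIFT disparities, mesh warping) are not shown.

```python
# Sketch of an intensity-gradient saliency map for each stereo view.
import cv2
import numpy as np

def gradient_saliency(image_bgr, ksize=3):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=ksize)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=ksize)
    magnitude = cv2.magnitude(gx, gy)
    return magnitude / (magnitude.max() + 1e-8)

left = cv2.imread("left.png")                  # placeholder: left view of the pair
right = cv2.imread("right.png")                # placeholder: right view of the pair
saliency_left, saliency_right = gradient_saliency(left), gradient_saliency(right)
```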

Method of creating augmented saliency map for 360-degree video (360 도 비디오의 객체 증강 saliency map 생성 방법)

  • Shim, Yoojeong;Seo, Jimin;Lee, Myeong-jin
    • Proceedings of the Korean Society of Broadcast Engineers Conference, 2021.06a, pp.109-111, 2021
  • 360-degree video provides a sense of immersion different from conventional media, but HMD-based viewing can cause motion sickness and physical discomfort. In addition, there is also demand for services on conventional displays, due to the limited penetration of viewing devices, network bandwidth constraints, and the need to reuse a single source for multiple purposes. This paper presents a technique for augmenting the visual saliency map using the dynamic properties of objects in the video, which is needed for viewport extraction in conventional-display services for 360-degree video, together with a service architecture that uses it.
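
The abstract above augments a saliency map with the dynamic properties of objects for viewport extraction. As one hedged interpretation of that idea, the sketch below blends dense optical-flow magnitude between consecutive ERP frames into a base saliency map; the Farneback parameters, the blend weight alpha, and the frame paths are assumptions, not the authors' method.

```python
# Sketch: augment a static saliency map with object motion, using dense
# optical-flow magnitude between consecutive ERP frames as the motion cue.
import cv2
import numpy as np

def motion_augmented_saliency(static_sal, prev_gray, curr_gray, alpha=0.5):
    """static_sal: HxW float map in [0, 1]; prev/curr_gray: uint8 grayscale frames."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    motion = np.linalg.norm(flow, axis=2)
    motion /= motion.max() + 1e-8
    return (1.0 - alpha) * static_sal + alpha * motion

prev = cv2.imread("erp_frame_000.png", cv2.IMREAD_GRAYSCALE)   # placeholder frames
curr = cv2.imread("erp_frame_001.png", cv2.IMREAD_GRAYSCALE)
static = np.full(prev.shape, 0.5, dtype=np.float32)            # stand-in base saliency
augmented = motion_augmented_saliency(static, prev, curr)
```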

A Scale Invariant Object Detection Algorithm Using Wavelet Transform in Sea Environment (해양 환경에서 웨이블렛 변환을 이용한 크기 변화에 무관한 물표 탐지 알고리즘)

  • Bazarvaani, Badamtseren;Park, Ki Tae;Jeong, Jongmyeon
    • Journal of the Korean Institute of Intelligent Systems, v.23 no.3, pp.249-255, 2013
  • In this paper, we propose an algorithm to detect scale-invariant objects in IR images obtained in a sea environment. After noise reduction using morphology operations, we create horizontal edge (HL), vertical edge (LH), and diagonal edge (HH) images through the 2-D discrete Haar wavelet transform (DHWT). Considering the sea environment, Gaussian blurring is applied to the horizontal and vertical edge images at each wavelet level, and a saliency map is then generated by multiplying the blurred horizontal and vertical edges and combining the results into one image. Object candidate regions are then extracted by binarizing the saliency map, and small areas in the candidate regions are removed to produce the final result. Experimental results show the feasibility of the proposed algorithm.
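
The wavelet-based saliency step described above can be sketched as follows: Haar DWT sub-bands are blurred, the horizontal and vertical detail images are multiplied, the products are accumulated across levels, and the result is binarized. The blur sigma, level count, morphological opening, and file name are illustrative assumptions, not the paper's values.

```python
# Sketch of a Haar-wavelet saliency map for sea-environment IR images.
import cv2
import numpy as np
import pywt

def wavelet_saliency(gray, levels=3, blur_sigma=3.0):
    saliency = np.zeros_like(gray, dtype=np.float32)
    coeffs = pywt.wavedec2(gray.astype(np.float32), "haar", level=levels)
    for (lh, hl, _hh) in coeffs[1:]:           # detail sub-bands per level
        h = cv2.GaussianBlur(np.abs(lh), (0, 0), blur_sigma)
        v = cv2.GaussianBlur(np.abs(hl), (0, 0), blur_sigma)
        prod = cv2.resize(h * v, (gray.shape[1], gray.shape[0]))
        saliency += prod / (prod.max() + 1e-8)
    return saliency / (saliency.max() + 1e-8)

if __name__ == "__main__":
    ir = cv2.imread("sea_ir.png", cv2.IMREAD_GRAYSCALE)        # placeholder IR frame
    ir = cv2.morphologyEx(ir, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))  # noise reduction
    sal = wavelet_saliency(ir)
    _, candidates = cv2.threshold((sal * 255).astype(np.uint8),
                                  0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
```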

Object Detection and 3D Position Estimation based on Stereo Vision (스테레오 영상 기반의 객체 탐지 및 객체의 3차원 위치 추정)

  • Son, Haengseon;Lee, Seonyoung;Min, Kyoungwon;Seo, Seongjin
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology, v.10 no.4, pp.318-324, 2017
  • We introduce a stereo camera on an aircraft to detect flying objects and estimate their 3D positions. A saliency map algorithm based on PCT is proposed to detect small objects among clouds, and a stereo matching algorithm is then applied to find the disparity between the left and right cameras. To extract accurate disparity, the cost aggregation region is treated as a variable region that adapts to the detected object; in this paper, the detection result itself is used as the cost aggregation region. To obtain more precise results, sub-pixel interpolation is used to extract floating-point disparity at the sub-pixel level. We also propose a method to estimate the spatial position of an object using the camera parameters. We expect that the approach can be applied to image-based object detection and collision avoidance systems for autonomous aircraft in the future.
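
Two of the steps described above, sub-pixel disparity refinement and 3D position estimation from camera parameters, can be illustrated with the short sketch below, which uses parabolic interpolation around the best integer disparity and standard pinhole/baseline triangulation; the cost values and camera parameters are made-up placeholders.

```python
# Sketch: parabolic sub-pixel disparity refinement and 3D back-projection.
import numpy as np

def subpixel_disparity(costs, d_best):
    """Fit a parabola through the costs at d_best-1, d_best, d_best+1."""
    c0, c1, c2 = costs[d_best - 1], costs[d_best], costs[d_best + 1]
    denom = c0 - 2.0 * c1 + c2
    offset = 0.0 if denom == 0 else 0.5 * (c0 - c2) / denom
    return d_best + offset

def triangulate(u, v, disparity, fx, fy, cx, cy, baseline):
    """Back-project a pixel (u, v) with a given disparity to camera space."""
    z = fx * baseline / disparity
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

costs = np.array([9.0, 4.0, 2.5, 3.5, 8.0])           # cost per candidate disparity
d = subpixel_disparity(costs, int(np.argmin(costs)))  # float-valued disparity
point = triangulate(u=420, v=300, disparity=d,
                    fx=800.0, fy=800.0, cx=320.0, cy=240.0, baseline=0.5)
```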

Image Classification Using Bag of Visual Words and Visual Saliency Model (이미지 단어집과 관심영역 자동추출을 사용한 이미지 분류)

  • Jang, Hyunwoong;Cho, Soosun
    • KIPS Transactions on Software and Data Engineering, v.3 no.12, pp.547-552, 2014
  • As social multimedia sites such as Flickr and Facebook have become popular, the amount of image information has been increasing very fast, and many studies have addressed accurate social image retrieval, including web image classification using the semantic relations of image tags and BoVW (Bag of Visual Words). In this paper, we propose a method to detect salient regions in images using the GBVS (Graph Based Visual Saliency) model, which can eliminate less important regions such as the background. First, we construct a BoVW representation based on the SIFT algorithm from a database of images retrieved with semantically related tags. Second, we detect salient regions in test images using the GBVS model. The resulting image classification showed higher accuracy than previous research, so we expect that our method can classify a variety of images more accurately.
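
As a sketch of the BoVW construction referred to above (the GBVS masking step is assumed to have already cropped the images to their salient regions), the code below extracts SIFT descriptors, clusters them into a codebook with k-means, and encodes each image as a normalized visual-word histogram; the vocabulary size and helper names are assumptions.

```python
# Sketch: SIFT descriptors -> k-means codebook -> per-image BoVW histograms.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def sift_descriptors(image_paths):
    sift = cv2.SIFT_create()
    per_image = []
    for path in image_paths:
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        _, desc = sift.detectAndCompute(gray, None)
        per_image.append(desc if desc is not None else np.zeros((0, 128), np.float32))
    return per_image

def build_bovw(per_image_desc, vocab_size=200):
    all_desc = np.vstack([d for d in per_image_desc if len(d)])
    codebook = KMeans(n_clusters=vocab_size, n_init=4, random_state=0).fit(all_desc)
    histograms = []
    for desc in per_image_desc:
        words = codebook.predict(desc) if len(desc) else np.array([], int)
        hist = np.bincount(words, minlength=vocab_size).astype(np.float32)
        histograms.append(hist / max(hist.sum(), 1.0))   # L1-normalized word counts
    return codebook, np.array(histograms)
```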

Preprocessing Technique for Improving Action Recognition Performance in ERP Video with Multiple Objects (다중 객체가 존재하는 ERP 영상에서 행동 인식 모델 성능 향상을 위한 전처리 기법)

  • Park, Eun-Soo;Kim, Seunghwan;Ryu, Eun-Seok
    • Journal of Broadcast Engineering, v.25 no.3, pp.374-385, 2020
  • In this paper, we propose a preprocessing technique to address the problems of action recognition in Equirectangular Projection (ERP) video. The proposed technique treats the person object as the subject of the action, that is, the Object of Interest (OOI), and the area surrounding the OOI as the ROI. The preprocessing consists of three modules: I) recognize person objects in the image with an object detection model; II) create a saliency map from the input image; III) select the subject of the action using the recognized person objects and the saliency map. The bounding box of the selected subject is then fed to the action recognition model in order to improve recognition performance. Compared with feeding the original ERP images directly to the action recognition model, the proposed preprocessing improves performance by up to 99.6%, and the action can still be recognized when only the OOI is detected. The effect can also be seen in related applications such as video summarization.
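
The OOI-selection module described above can be sketched as follows: among the detected person boxes, keep the one whose region has the highest mean saliency, then expand it into an ROI for the action recognition model. The detector and saliency model are outside this sketch, and the margin, box coordinates, and map size are placeholders.

```python
# Sketch: pick the person box with the highest mean saliency and pad it into an ROI.
import numpy as np

def select_ooi(person_boxes, saliency_map, margin=0.25):
    """person_boxes: list of (x0, y0, x1, y1); saliency_map: HxW in [0, 1]."""
    h, w = saliency_map.shape
    best_box, best_score = None, -1.0
    for (x0, y0, x1, y1) in person_boxes:
        score = float(saliency_map[y0:y1, x0:x1].mean())   # mean saliency inside the box
        if score > best_score:
            best_box, best_score = (x0, y0, x1, y1), score
    x0, y0, x1, y1 = best_box
    dx, dy = int(margin * (x1 - x0)), int(margin * (y1 - y0))  # ROI padding
    return (max(0, x0 - dx), max(0, y0 - dy), min(w, x1 + dx), min(h, y1 + dy))

sal = np.random.rand(480, 960)                  # stand-in ERP saliency map
boxes = [(100, 200, 180, 400), (600, 220, 700, 420)]   # stand-in person detections
roi = select_ooi(boxes, sal)
```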