• Title/Summary/Keyword: Visual Saliency Model


A Salient Based Bag of Visual Word Model (SBBoVW): Improvements toward Difficult Object Recognition and Object Location in Image Retrieval

  • Mansourian, Leila; Abdullah, Muhamad Taufik; Abdullah, Lilli Nurliyana; Azman, Azreen; Mustaffa, Mas Rina
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.10 no.2
    • /
    • pp.769-786
    • /
    • 2016
  • Object recognition and object localization have long drawn much interest, and various computational models have been designed in recent years. One major issue in this domain is the lack of an appropriate model for extracting the important parts of a picture and estimating an object's location within the same scene, which has led to low accuracy. To solve this problem, a new Salient Based Bag of Visual Word (SBBoVW) model for object recognition and object location estimation is presented. The contributions of the present study are two-fold. The first is the SBBoVW model itself, which recognizes difficult objects on which previous methods have had low accuracy: it extracts SIFT features from the original picture and from its salient parts and fuses them to generate better codebooks with the bag-of-visual-words method. The second contribution is a new algorithm that automatically locates objects based on the saliency map. Performance evaluation on several data sets shows that the new approach outperforms other state-of-the-art methods.
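The abstract describes fusing SIFT descriptors from the whole image and from its salient parts before learning a codebook. A minimal sketch of that bag-of-visual-words encoding step, with random vectors standing in for real SIFT descriptors and plain k-means standing in for the paper's (unspecified) clustering:

```python
import numpy as np

def build_codebook(descriptors, k, iters=10, seed=0):
    """Cluster local descriptors into k visual words with plain k-means."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # assign each descriptor to its nearest center
        dists = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def bovw_histogram(descriptors, codebook):
    """Encode an image as a normalized histogram of visual-word counts."""
    dists = np.linalg.norm(descriptors[:, None] - codebook[None], axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(1)
# stand-ins for 128-D SIFT descriptors from the full image and its salient crop
full_desc = rng.normal(size=(200, 128))
salient_desc = rng.normal(loc=0.5, size=(80, 128))
fused = np.vstack([full_desc, salient_desc])   # fuse before codebook learning
codebook = build_codebook(fused, k=16)
hist = bovw_histogram(fused, codebook)
print(hist.shape, round(hist.sum(), 6))
```

The resulting fixed-length histogram is what a downstream classifier would consume in place of the raw variable-length descriptor set.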

An Explainable Deep Learning-Based Classification Method for Facial Image Quality Assessment

  • Kuldeep Gurjar; Surjeet Kumar; Arnav Bhavsar; Kotiba Hamad; Yang-Sae Moon; Dae Ho Yoon
    • Journal of Information Processing Systems
    • /
    • v.20 no.4
    • /
    • pp.558-573
    • /
    • 2024
  • Considering factors such as illumination, camera quality variations, and background-specific variations, identifying a face with a smartphone-based facial image capture application is challenging. Face image quality assessment refers to the process of taking a face image as input and producing some form of "quality" estimate as output. Typically, quality assessment techniques use deep learning methods to categorize images, but deep learning models behave as black boxes, which raises the question of their trustworthiness. Several explainability techniques have gained importance in building this trust by providing visual evidence of the regions within an image on which a deep learning model bases its prediction. Here, we developed a technique for the reliable prediction of facial images before medical analysis and security operations. A combination of gradient-weighted class activation mapping (Grad-CAM) and local interpretable model-agnostic explanations (LIME) was used to explain the model. This approach has been applied to the preselection of facial images for skin feature extraction, which is important in critical medical science applications. We demonstrate that the combined explanations provide better visual explanations for the model, with both the saliency-map and perturbation-based explainability techniques verifying the predictions.
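The gradient-weighted class activation mapping half of that combination reduces to a small computation once a network's convolutional activations and their gradients are available. A minimal sketch with toy NumPy arrays standing in for a real CNN's tensors:

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM: weight each feature map by the spatial mean of its
    gradients, sum the weighted maps, and keep only positive evidence."""
    weights = gradients.mean(axis=(1, 2))             # (C,) channel importances
    cam = np.tensordot(weights, activations, axes=1)  # weighted sum -> (H, W)
    cam = np.maximum(cam, 0)                          # ReLU
    if cam.max() > 0:
        cam /= cam.max()                              # normalize to [0, 1]
    return cam

rng = np.random.default_rng(0)
acts = rng.random((8, 7, 7))       # toy conv feature maps, shape (C, H, W)
grads = rng.normal(size=(8, 7, 7)) # toy d(class score)/d(activations)
cam = grad_cam(acts, grads)
print(cam.shape)
```

In practice the resulting map is upsampled to the input resolution and overlaid on the face image; the perturbation-based LIME explanation is computed separately and the two are compared.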

Traffic Lights Detection Based on Visual Attention and Spot-Lights Regions Detection (시각적 주의 및 Spot-Lights 영역 검출 기반의 교통신호등 검출 방안)

  • Kim, JongBae
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.51 no.6
    • /
    • pp.132-142
    • /
    • 2014
  • In this paper, we propose a traffic light detection method based on visual attention and spot-light detection. To detect traffic lights on city streets during the day and at night, the proposed method uses structural characteristics of traffic lights such as color, intensity, shape, and texture. In general, traffic lights are installed in positions that maximize their visibility to drivers. The proposed method detects candidate traffic light regions using a top-down visual saliency model and a spot-light detection model; visual saliency and spot-light regions are locations whose features differ from those of neighboring locations across multiple features and multiple scales. Because it does not rely on color thresholding to detect traffic lights, the proposed method can be applied to urban environments with widely varying illumination and at night.
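The core of such a saliency model is center-surround contrast: a location is salient when a fine-scale average of a feature differs from a coarse-scale average of its neighborhood. A minimal single-feature, single-scale-pair sketch (the paper combines multiple features and scales; the blur radii here are illustrative):

```python
import numpy as np

def box_blur(img, r):
    """Mean filter with a (2r+1)x(2r+1) window, edge-padded, via 2D cumsum."""
    k = 2 * r + 1
    p = np.pad(img, r, mode='edge')
    c = np.pad(p, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    windowed = c[k:, k:] - c[:-k, k:] - c[k:, :-k] + c[:-k, :-k]
    return windowed / (k * k)

def center_surround_saliency(intensity, center_r=2, surround_r=6):
    """Contrast map: |fine blur - coarse blur| highlights regions that
    differ from their neighborhood, e.g. bright spot lights at night."""
    sal = np.abs(box_blur(intensity, center_r) - box_blur(intensity, surround_r))
    return sal / sal.max() if sal.max() > 0 else sal

img = np.zeros((40, 40))
img[15:25, 15:25] = 1.0            # a bright "spot light" on a dark scene
sal = center_surround_saliency(img)
print(sal.shape)
```

A full implementation would compute such maps for color-opponent, intensity, and orientation channels at several scale pairs and sum the normalized maps.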

Detection of ROIs using the Bottom-Up Saliency Model for Selective Visual Attention (관심영역 검출을 위한 상향식 현저함 모델 기반의 선택적 주의 집중 연구)

  • Kim, Jong-Bae
    • Annual Conference of KIPS
    • /
    • 2011.11a
    • /
    • pp.314-317
    • /
    • 2011
  • This paper proposes a method for automatically detecting regions of visual attention in an input image using a bottom-up saliency model. Like the human visual system, the proposed method interprets a scene based on the spatial distribution of visual information, without prior knowledge, by applying a bottom-up saliency model to the input image to detect object regions of interest. As suggested by Treisman's feature integration theory, bottom-up saliency concentrates visual attention on regions with strong contrast in visual information, allowing regions of interest to be separated from the background. A three-dimensional saliency map is generated from the input image through the saliency model, and to extract the actual regions of interest from this map, the proposed method applies an adaptive thresholding technique. Applying the proposed method to region-of-interest segmentation yields a segmentation accuracy and precision of approximately 88% and 89%, respectively, demonstrating its applicability to region-of-interest segmentation systems.
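The final step above, extracting binary ROIs from a continuous saliency map with an adaptive threshold, can be sketched as follows. The specific rule (mean plus a multiple of the standard deviation) is an assumption for illustration; the abstract does not specify the paper's exact thresholding formula:

```python
import numpy as np

def detect_rois(saliency, k=1.0):
    """Adaptive threshold on a saliency map: keep pixels whose saliency
    exceeds mean + k * std of the map (illustrative rule, not the
    paper's exact formula)."""
    threshold = saliency.mean() + k * saliency.std()
    return saliency > threshold

rng = np.random.default_rng(0)
sal = rng.random((30, 30)) * 0.2   # low-saliency background clutter
sal[10:20, 10:20] += 0.8           # one strongly salient object region
mask = detect_rois(sal)
print(mask.shape, int(mask.sum()))
```

Because the threshold adapts to each map's statistics, the same code separates strong ROIs from the background regardless of the map's absolute scale.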

Driver Assistance System for Integration Interpretation of Driver's Gaze and Selective Attention Model (운전자 시선 및 선택적 주의 집중 모델 통합 해석을 통한 운전자 보조 시스템)

  • Kim, Jihun; Jo, Hyunrae; Jang, Giljin; Lee, Minho
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.16 no.3
    • /
    • pp.115-122
    • /
    • 2016
  • This paper proposes a system that detects a driver's cognitive state from information inside and outside the vehicle. The proposed system measures the driver's eye gaze using the concepts of information delivery and a mutual information measure. For this study, we mounted two web cameras in the vehicle to capture visual information about the driver and the scene in front of the vehicle. We propose a selective attention model based on Gestalt principles to quantify the information content of the road scene; in the resulting saliency map, stimuli such as traffic signals are prominently represented. The proposed system estimates the driver's allocation of cognitive resources to the front scene from gaze analysis and head pose direction. Several feature algorithms are then used to detect the driver's characteristics in real time: modified census transform (MCT)-based AdaBoost detects the driver's face and its components, while the POSIT algorithm is used for eye detection and 3D head pose estimation. Experimental results show that the proposed system works well in a real environment and confirm its usability.
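The mutual information measure mentioned above quantifies how strongly the driver's gaze distribution tracks the scene's salient regions. A minimal sketch of discrete mutual information over a toy joint histogram (the binning of gaze and saliency into regions is an assumption; the abstract does not give the paper's exact formulation):

```python
import numpy as np

def mutual_information(joint):
    """I(X;Y) = sum p(x,y) * log2( p(x,y) / (p(x) p(y)) ), in bits,
    from a joint count/probability table."""
    joint = joint / joint.sum()                 # normalize counts to p(x,y)
    px = joint.sum(axis=1, keepdims=True)       # marginal p(x)
    py = joint.sum(axis=0, keepdims=True)       # marginal p(y)
    nz = joint > 0                              # skip zero cells (0 log 0 = 0)
    return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

# toy joint histograms of (gaze region, salient region) co-occurrence
aligned = np.array([[8.0, 1.0], [1.0, 8.0]])          # gaze follows saliency
independent = np.array([[4.5, 4.5], [4.5, 4.5]])      # gaze ignores saliency
print(round(mutual_information(aligned), 3), mutual_information(independent))
```

High mutual information indicates the driver's cognitive resources are allocated to the informative parts of the scene; values near zero would flag inattention.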

Superpixel Exclusion-Inclusion Multiscale Approach for Explanations of Deep Learning (딥러닝 설명을 위한 슈퍼픽셀 제외·포함 다중스케일 접근법)

  • Seo, Dasom; Oh, KangHan; Oh, Il-Seok; Yoo, Tae-Woong
    • Smart Media Journal
    • /
    • v.8 no.2
    • /
    • pp.39-45
    • /
    • 2019
  • As deep learning has become popular, research that helps explain its predictions has also become important. A superpixel-based multi-scale combining technique, which yields visually pleasing explanations by preserving object shapes, was recently proposed. Based on the principle of prediction difference, this technique computes a saliency map from the difference between the prediction with a superpixel excluded and the original prediction. In this paper, we propose a new technique that both excludes and includes superpixels. Experimental results show a 3.3% improvement in IoU evaluation.
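The exclusion-inclusion idea can be sketched with a stand-in classifier: exclusion scores a superpixel by how much the prediction drops when it is removed, and inclusion by how much the superpixel alone raises the prediction over a blank baseline. The combination rule below (summing the two differences) is an illustrative assumption, not the paper's exact formula:

```python
import numpy as np

def predict(img):
    """Stand-in classifier score: mean brightness of the image.
    (A real setup would use a CNN's class probability here.)"""
    return float(img.mean())

def exclusion_inclusion_saliency(img, segments, baseline=0.0):
    """Per-superpixel importance from prediction differences:
    exclusion  = p(full image) - p(image with superpixel blanked)
    inclusion  = p(superpixel alone) - p(blank baseline)."""
    full_score = predict(img)
    base_score = predict(np.full_like(img, baseline))
    sal = np.zeros_like(img)
    for s in np.unique(segments):
        mask = segments == s
        excluded = img.copy()
        excluded[mask] = baseline                 # remove this superpixel
        included = np.full_like(img, baseline)
        included[mask] = img[mask]                # keep only this superpixel
        score = (full_score - predict(excluded)) + (predict(included) - base_score)
        sal[mask] = score
    return sal

img = np.zeros((4, 4))
img[:2, :2] = 1.0                                 # the "object" (top-left)
segments = np.zeros((4, 4), dtype=int)
segments[:2, :2] = 1                              # two superpixels: object, background
sal = exclusion_inclusion_saliency(img, segments)
print(float(sal[0, 0]), float(sal[3, 3]))
```

Real superpixels would come from an oversegmentation algorithm such as SLIC, evaluated at several scales and combined, as the paper describes.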

Image-based Soft Drink Type Classification and Dietary Assessment System Using Deep Convolutional Neural Network with Transfer Learning

  • Rubaiya Hafiz; Mohammad Reduanul Haque; Aniruddha Rakshit; Amina Khatun; Mohammad Shorif Uddin
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.2
    • /
    • pp.158-168
    • /
    • 2024
  • There is hardly any person in modern times who has not taken soft drinks instead of drinking water. Because the rate at which people consume soft drinks is surprisingly high, researchers around the world have repeatedly cautioned that these drinks lead to weight gain, raise the risk of non-communicable diseases, and so on. Therefore, in this work an image-based tool is developed to monitor the nutritional information of soft drinks using a deep convolutional neural network with transfer learning. First, visual saliency, mean-shift segmentation, thresholding, and noise reduction, collectively known as 'pre-processing', are applied to locate the drink region. After removing the background and segmenting out only the desired area of the image, a Discrete Wavelet Transform (DWT)-based resolution enhancement technique is applied to improve image quality. A transfer learning model is then employed to classify the drinks. Finally, the nutritional value of each drink is estimated using Bag-of-Features (BoF)-based classification and a Euclidean distance-based ratio calculation. To achieve this, a dataset of the ten most consumed soft drinks in Bangladesh was built, with images collected from the ImageNet dataset as well as the internet. The proposed method detects and recognizes different types of drinks with an accuracy of 98.51%.
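The final lookup stage, matching a query's bag-of-features histogram to the nearest known drink class by Euclidean distance and reading off its nutrition entry, can be sketched as follows. The class names, histograms, and calorie values are all illustrative placeholders, not the paper's data:

```python
import numpy as np

def nearest_drink(query_hist, class_hists):
    """Assign a drink type by the smallest Euclidean distance between
    bag-of-features histograms."""
    names = list(class_hists)
    dists = [np.linalg.norm(query_hist - class_hists[n]) for n in names]
    return names[int(np.argmin(dists))]

# toy BoF histograms for two hypothetical drink classes
class_hists = {
    "cola":  np.array([0.6, 0.3, 0.1]),
    "lemon": np.array([0.1, 0.2, 0.7]),
}
# illustrative nutrition table (kcal per 100 ml), not real product data
nutrition_kcal_per_100ml = {"cola": 42, "lemon": 40}

query = np.array([0.55, 0.35, 0.10])     # histogram of a segmented drink image
drink = nearest_drink(query, class_hists)
print(drink, nutrition_kcal_per_100ml[drink])
```

The paper's ratio calculation would additionally scale the looked-up nutrition value by the estimated portion size from the segmented drink region.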