• Title/Summary/Keyword: Salient object detection


Context Aware Feature Selection Model for Salient Feature Detection from Mobile Video Devices (모바일 비디오기기 위에서의 중요한 객체탐색을 위한 문맥인식 특성벡터 선택 모델)

  • Lee, Jaeho;Shin, Hyunkyung
    • Journal of Internet Computing and Services, v.15 no.6, pp.117-124, 2014
  • A cluttered background is a major obstacle in developing salient object detection and tracking systems for natural-scene video frames captured by mobile devices. In this paper, we propose a context-aware feature vector selection model that provides efficient noise filtering with machine-learning-based classifiers. Since context awareness for feature selection is achieved by searching nearest neighborhoods, which is known to be an NP-hard problem, we apply a fast approximation method and analyze its complexity in detail. The separability enhancement in feature vector space obtained by adding the context-aware feature subsets is studied rigorously using principal component analysis (PCA). The overall performance enhancement is quantified with statistical measures across various machine learning models, including MLP, SVM, naïve Bayes, and CART. A summary of the computational costs and performance gains is also presented.
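
A minimal sketch, not the authors' implementation, of the general idea: augment each feature vector with context gathered from its nearest neighbours, project with PCA, and compare the classifiers named in the abstract. The data shapes, the value of k, and the use of scikit-learn are assumptions for illustration; a KD-tree stands in for the paper's own fast approximate neighbour search.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier  # CART-style tree

def add_context_features(X, k=5):
    """Append the mean of each sample's k nearest neighbours as context features."""
    nn = NearestNeighbors(n_neighbors=k + 1, algorithm="kd_tree").fit(X)
    _, idx = nn.kneighbors(X)
    context = X[idx[:, 1:]].mean(axis=1)  # skip the sample itself
    return np.hstack([X, context])

# Toy data standing in for frame-level feature vectors (hypothetical shapes).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

X_ctx = add_context_features(X, k=5)
X_pca = PCA(n_components=8).fit_transform(X_ctx)  # inspect separability in PCA space

X_tr, X_te, y_tr, y_te = train_test_split(X_pca, y, random_state=0)
for clf in (MLPClassifier(max_iter=500), SVC(), GaussianNB(), DecisionTreeClassifier()):
    print(type(clf).__name__, clf.fit(X_tr, y_tr).score(X_te, y_te))
```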

Repeated Cropping based on Deep Learning for Photo Re-composition (사진 구도 개선을 위한 딥러닝 기반 반복적 크롭핑)

  • Hong, Eunbin;Jeon, Junho;Lee, Seungyong
    • Journal of KIISE, v.43 no.12, pp.1356-1364, 2016
  • This paper proposes a novel aesthetic photo recomposition method using a deep convolutional neural network (DCNN). Previous recomposition approaches define an aesthetic score of photo composition based on the distribution of salient objects and enhance the composition by maximizing that score. These methods suffer from heavy computational overheads and often fail to enhance the composition because their optimization depends on the performance of existing salient object detection algorithms. Unlike previous approaches, we address the photo recomposition problem with a DCNN, which shows remarkable performance in object detection and recognition. The DCNN is used to iteratively predict cropping directions for a given photo, thus generating a photo whose composition is aesthetically enhanced. Experimental results and a user study show that the proposed framework can automatically crop photos to follow specific composition guidelines, such as the rule of thirds.
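
A rough, hypothetical illustration of the iterative cropping loop described in the abstract. The paper's DCNN is replaced here by a stub that merely trims toward a square aspect ratio; the action set and step size are assumptions, not the authors' design.

```python
import numpy as np

ACTIONS = ("crop_left", "crop_right", "crop_top", "crop_bottom", "stop")
STEP = 0.05  # fraction of the current width/height removed per step (assumed)

def predict_action(image: np.ndarray) -> str:
    """Placeholder for the DCNN; here it simply stops once the image is near square."""
    h, w = image.shape[:2]
    if w > 1.1 * h:
        return "crop_right"
    if h > 1.1 * w:
        return "crop_bottom"
    return "stop"

def recompose(image: np.ndarray, max_steps: int = 50) -> np.ndarray:
    """Iteratively apply predicted crop actions until 'stop' (or a step limit)."""
    for _ in range(max_steps):
        action = predict_action(image)
        h, w = image.shape[:2]
        if action == "stop":
            break
        elif action == "crop_left":
            image = image[:, int(w * STEP):]
        elif action == "crop_right":
            image = image[:, : w - int(w * STEP)]
        elif action == "crop_top":
            image = image[int(h * STEP):, :]
        elif action == "crop_bottom":
            image = image[: h - int(h * STEP), :]
    return image

photo = np.zeros((600, 1000, 3), dtype=np.uint8)  # dummy landscape photo
print(recompose(photo).shape)
```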

Product Label Detection based on the Local Structure Tensor (구조 텐서 기반의 상품 라벨 검출)

  • Chen, Yan-Juan;Lee, Myung-Eun;Kim, Soo-Hyung
    • Proceedings of the Korean Information Science Society Conference, 2011.06c, pp.397-400, 2011
  • In this paper, we propose an approach for detecting product labels in mobile phone images based on a saliency map and the local structure tensor. Object boundary information is better described by the local structure tensor than by other edge detectors, while saliency map methods can find the most salient area and shorten the computation time by reducing the size of the original image. Therefore, these two methods are combined for product label detection. The experimental results show acceptable performance for the proposed approach.
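
A rough sketch, assuming NumPy/SciPy, of the two ingredients named in the abstract: a very simple saliency map and the larger eigenvalue of the local structure tensor as a boundary-strength measure. The thresholds and the toy image are illustrative only, not the paper's settings.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def saliency_map(gray: np.ndarray) -> np.ndarray:
    """Very simple saliency: distance of each (blurred) pixel from the image mean."""
    blurred = gaussian_filter(gray, sigma=3)
    return np.abs(blurred - gray.mean())

def structure_tensor_edges(gray: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    """Per-pixel boundary strength from the larger eigenvalue of the structure tensor."""
    Ix = sobel(gray, axis=1)
    Iy = sobel(gray, axis=0)
    Jxx = gaussian_filter(Ix * Ix, sigma)
    Jxy = gaussian_filter(Ix * Iy, sigma)
    Jyy = gaussian_filter(Iy * Iy, sigma)
    trace, det = Jxx + Jyy, Jxx * Jyy - Jxy ** 2
    return 0.5 * (trace + np.sqrt(np.maximum(trace ** 2 - 4 * det, 0)))

gray = np.random.rand(240, 320)                 # stand-in for a phone image
edges = structure_tensor_edges(gray)
sal = saliency_map(gray)
candidate = (sal > sal.mean()) & (edges > np.percentile(edges, 90))
print(candidate.sum(), "candidate label-boundary pixels")
```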

Applicability Evaluation of Deep Learning-Based Object Detection for Coastal Debris Monitoring: A Comparative Study of YOLOv8 and RT-DETR (해안쓰레기 탐지 및 모니터링에 대한 딥러닝 기반 객체 탐지 기술의 적용성 평가: YOLOv8과 RT-DETR을 중심으로)

  • Suho Bak;Heung-Min Kim;Youngmin Kim;Inji Lee;Miso Park;Seungyeol Oh;Tak-Young Kim;Seon Woong Jang
    • Korean Journal of Remote Sensing, v.39 no.6_1, pp.1195-1210, 2023
  • Coastal debris has emerged as a salient issue due to its adverse effects on coastal aesthetics, ecological systems, and human health. To support effective countermeasures, this study constructed a specialized image dataset for coastal debris detection and conducted a comparative analysis of two prominent real-time object detection algorithms, YOLOv8 and RT-DETR. Robustness was rigorously assessed by subjecting the models to a variety of image distortion conditions. YOLOv8 achieved a mean Average Precision (mAP) of 0.927 to 0.945 at 65 to 135 Frames Per Second (FPS), whereas RT-DETR achieved an mAP of 0.917 to 0.918 at 40 to 53 FPS. While RT-DETR exhibited greater robustness against color distortions, YOLOv8 showed greater resilience under the other evaluation criteria. These findings provide practical guidance for algorithm selection in the deployment of marine debris monitoring systems.
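
For readers who want to reproduce a comparison of this kind, a hedged sketch using the ultralytics package is shown below; `coastal_debris.yaml` is a hypothetical dataset configuration, and the training settings do not reproduce those of the paper.

```python
from ultralytics import YOLO, RTDETR

# Train and validate both detectors on the same (hypothetical) dataset config.
for name, model in (("YOLOv8", YOLO("yolov8m.pt")), ("RT-DETR", RTDETR("rtdetr-l.pt"))):
    model.train(data="coastal_debris.yaml", epochs=100, imgsz=640)
    metrics = model.val()                      # mAP on the validation split
    print(name, "mAP50-95:", metrics.box.map)
```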

2D-to-3D Conversion System using Depth Map Enhancement

  • Chen, Ju-Chin;Huang, Meng-yuan
    • KSII Transactions on Internet and Information Systems (TIIS), v.10 no.3, pp.1159-1181, 2016
  • This study introduces an image-based 2D-to-3D conversion system that provides significant stereoscopic visual effects for viewers. Linear and atmospheric perspective cues, which compensate for each other, are employed to estimate depth information. Rather than retrieving a precise depth value for each pixel from the depth cues, a direction angle of the image is estimated, and the depth gradient corresponding to that direction angle is integrated with superpixels to obtain the depth map. However, the stereoscopic effects of views synthesized from this depth map are limited and can dissatisfy viewers. To obtain more impressive visual effects, the viewer's main focus is considered: salient object detection is performed to identify the region of visual attention. The depth map is then refined by locally modifying the depth values within this salient region. The refinement process not only maintains global depth consistency by correcting non-uniform depth values but also enhances the stereoscopic effect. Experimental results show that, in a subjective evaluation, satisfaction with the proposed method is approximately 7% higher than with both existing commercial conversion software and a state-of-the-art approach.
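
A simplified, assumed illustration of the refinement step: given a global depth ramp and a salient-object mask, depth values inside the mask are locally adjusted so the main subject stands out while the global depth ordering is preserved. The mask, boost amount, and depth layout are placeholders, not values from the paper.

```python
import numpy as np

h, w = 240, 320
# Global depth ramp derived from an assumed direction angle
# (here a simple top-to-bottom gradient; 1.0 = near, 0.0 = far).
depth = np.tile(np.linspace(1.0, 0.0, h)[:, None], (1, w))

# Hypothetical salient-object mask (in the paper this comes from salient object detection).
mask = np.zeros((h, w), dtype=bool)
mask[120:200, 100:220] = True

def refine_depth(depth, mask, boost=0.2):
    """Pull the salient region slightly toward the viewer, anchored to its mean depth."""
    refined = depth.copy()
    region_mean = depth[mask].mean()
    refined[mask] = np.clip(region_mean + boost, 0.0, 1.0)
    return refined

print(refine_depth(depth, mask)[160, 160], depth[160, 160])
```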

DCNN Optimization Using Multi-Resolution Image Fusion

  • Alshehri, Abdullah A.;Lutz, Adam;Ezekiel, Soundararajan;Pearlstein, Larry;Conlen, John
    • KSII Transactions on Internet and Information Systems (TIIS), v.14 no.11, pp.4290-4309, 2020
  • In recent years, advancements in machine learning have led to its widespread adoption for tasks such as object detection, image classification, and anomaly detection. However, a network's performance is ultimately limited by the quality of the data it receives: a well-trained network will still perform poorly if the data supplied to it contains artifacts, out-of-focus regions, or other visual distortions. Normally, images of the same scene captured from differing points of focus, angles, or modalities must be analysed separately by the network, even though they may contain overlapping information (as with images of the same scene captured from different angles) or irrelevant information (as with infrared sensors, which capture thermal information well but not topographical details). This can add significantly to the computational time and resources required to use the network without providing any additional benefit. In this study, we plan to explore image fusion techniques that assemble multiple images of the same scene into a single image retaining the most salient key features of the individual source images while discarding overlapping or irrelevant data that does not benefit the network. By applying this fusion step before inputting a dataset into the network, the number of images is significantly reduced, with the potential to improve classification accuracy by enhancing images while discarding irrelevant and overlapping regions.
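
A minimal sketch, not the authors' pipeline, of fusing two registered images of the same scene before classification: at each pixel, the source with the higher local Laplacian energy (a crude focus/salience measure) is kept, so a single fused image can be passed to the DCNN. The focus measure and test images are assumptions for illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, laplace

def fuse_pair(img_a: np.ndarray, img_b: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    """Per-pixel selection of the source with the higher smoothed Laplacian energy."""
    energy_a = gaussian_filter(np.abs(laplace(img_a)), sigma)
    energy_b = gaussian_filter(np.abs(laplace(img_b)), sigma)
    return np.where(energy_a >= energy_b, img_a, img_b)

# Two stand-in grayscale captures of the same scene (e.g. different focal planes).
a = np.random.rand(256, 256)
b = gaussian_filter(a, 3)        # a blurrier view of the same content
fused = fuse_pair(a, b)
print(fused.shape)               # one fused image replaces the pair as network input
```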