• Title/Abstract/Keyword: Visual Attention Software


횡단보도 옐로카펫 설치에 따른 시인성 증진효과 연구 : Visual Attention Software 분석 중심으로 (Study on Visual Recognition Enhancement of Yellow Carpet Placed at Near Pedestrian Crossing Areas : Visual Attention Software Implementation)

  • 안효섭;김진태
    • 한국IT서비스학회지 / Vol. 15, No. 4 / pp. 73-83 / 2016
  • Pedestrian safety has recently been addressed with the yellow carpet, a yellow-colored pavement installed where children wait for the signal at pedestrian crossings, although its efficiency had not been validated in practice. It is a promising device expected to improve road safety by encouraging pedestrians to stand on the yellow-colored area, a so-called nudge effect. This paper reports a study that examined the effectiveness of the yellow carpet from the vehicle driver's perspective, in three different aspects, using a newly introduced information technology (IT) service: Visual Attention Software (VAS). It was assumed that VAS, developed by 3M in the United States, could explain Korean drivers' visual reaction behaviors, since the technology embedded in VAS was developed and validated on data from various countries and continents. Sets of pictures were taken at thirteen field sites in seven school zones in the Seoul metropolitan area, before and after the installation of a yellow carpet, respectively. The picture sets were analyzed with VAS, and the results were compared using selected safety measures: the likelihood of focusing on a standing pedestrian (waiting for the green pedestrian signal) as affected by the contrasting yellow-colored pavement behind him or her. The before-and-after comparisons showed that placing a yellow carpet would (1) increase drivers' visual attention on the pedestrian crossing area by 71% and (2) move that area 2.4 steps earlier in the sequential order of visual attention. These findings support wider deployment of the device and thus improved children's safety at pedestrian crossings. The results also highlight how advanced IT services can support change in the conservative traffic safety engineering field, although more robust research is recommended to overcome the simplifications of this study.

Multi-level Cross-attention Siamese Network For Visual Object Tracking

  • Zhang, Jianwei;Wang, Jingchao;Zhang, Huanlong;Miao, Mengen;Cai, Zengyu;Chen, Fuguo
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 12 / pp. 3976-3990 / 2022
  • Currently, cross-attention is widely used in Siamese trackers to replace traditional correlation operations for feature fusion between the template and the search region. Cross-attention can establish the similarity relationship between the target and the search region better than correlation, enabling more robust visual object tracking. However, existing cross-attention trackers focus only on the rich semantic information of high-level features while ignoring the appearance information contained in low-level features, which makes them vulnerable to interference from similar objects. In this paper, we propose a Multi-level Cross-attention Siamese network (MCSiam) to aggregate semantic and appearance information at the same time. Specifically, a multi-level cross-attention module is designed to fuse the multi-layer features extracted from the backbone, integrating template and search-region features at different levels so that both rich appearance information and semantic information can be used in the tracking task. In addition, before cross-attention, a target-aware module is introduced to enhance the target feature and alleviate interference, making the multi-level cross-attention module more efficient at fusing information from the target and the search region. We test MCSiam on four tracking benchmarks, and the results show that the proposed tracker achieves performance comparable to state-of-the-art trackers.
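The cross-attention fusion the abstract describes can be sketched as plain scaled dot-product attention, with template features as queries and search-region features as keys and values. This is a minimal NumPy illustration of the operation, not the paper's MCSiam implementation; the token counts, feature dimension, and absence of learned projections are assumptions of the sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(template, search, d_k):
    """Fuse template features (queries) with search-region features
    (keys/values) via scaled dot-product cross-attention."""
    q, k, v = template, search, search       # sketch: no learned projections
    scores = q @ k.T / np.sqrt(d_k)          # (N_t, N_s) similarity map
    weights = softmax(scores, axis=-1)       # each template token attends over search
    return weights @ v                       # (N_t, d) fused features

rng = np.random.default_rng(0)
template = rng.standard_normal((4, 8))   # 4 template tokens, dim 8
search = rng.standard_normal((16, 8))    # 16 search-region tokens
fused = cross_attention(template, search, d_k=8)
print(fused.shape)  # (4, 8)
```

In the multi-level variant the abstract describes, this operation would be applied per backbone level and the outputs fused, with learned query/key/value projections in place of the raw features used here.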

MLSE-Net: Multi-level Semantic Enriched Network for Medical Image Segmentation

  • Di Gai;Heng Luo;Jing He;Pengxiang Su;Zheng Huang;Song Zhang;Zhijun Tu
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 9 / pp. 2458-2482 / 2023
  • Medical image segmentation techniques based on convolutional neural networks tend to extract redundant features, causing parameter redundancy and unsatisfactory target localization, which results in segmentation too inaccurate to assist doctors in diagnosis. In this paper, we propose a multi-level semantic-enriched encoder-decoder network consisting of a Pooling-Conv-Former (PCFormer) module and a Cbam-Dilated-Transformer (CDT) module. The PCFormer module tackles the parameter explosion of the conventional transformer and compensates for the feature loss incurred during down-sampling. In the CDT module, the CBAM attention module highlights feature regions by implicitly blending attention mechanisms, and the Dilated convolution-Concat (DCC) module is designed as a parallel concatenation of multiple atrous convolution blocks to explicitly expand the receptive field. In addition, a MultiHead Attention-DwConv-Transformer (MDTransformer) module is used to clearly distinguish the target region from the background. Extensive experiments on the Glas, SIIM-ACR, ISIC, and LGG medical image segmentation datasets demonstrate that the proposed network outperforms existing advanced methods in both objective evaluation and subjective visual quality.
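The DCC module's parallel atrous convolutions can be illustrated in one dimension: the same kernel is applied at several dilation rates and the responses are concatenated, widening the receptive field without adding parameters. A minimal NumPy sketch, assuming a toy 1-D signal and a 3-tap kernel (the paper works on 2-D feature maps with learned kernels):

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """'Same'-padded 1-D dilated convolution (correlation form)."""
    k = len(kernel)
    span = dilation * (k - 1)          # effective receptive field minus 1
    pad = span // 2
    xp = np.pad(x, (pad, span - pad))
    out = np.zeros_like(x, dtype=float)
    for i in range(len(x)):
        taps = xp[i : i + span + 1 : dilation]  # samples `dilation` apart
        out[i] = taps @ kernel
    return out

def dcc_block(x, kernel, dilations=(1, 2, 4)):
    """Sketch of the DCC idea: run the same kernel at several dilation
    rates in parallel and concatenate the responses."""
    return np.stack([dilated_conv1d(x, kernel, d) for d in dilations])

x = np.arange(8, dtype=float)
k = np.array([1.0, 1.0, 1.0])
feats = dcc_block(x, k)
print(feats.shape)  # (3, 8)
```

Each row of `feats` sees a progressively wider neighborhood of the input, which is the receptive-field expansion the abstract attributes to the atrous blocks.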

2D-to-3D Conversion System using Depth Map Enhancement

  • Chen, Ju-Chin;Huang, Meng-yuan
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 10, No. 3 / pp. 1159-1181 / 2016
  • This study introduces an image-based 2D-to-3D conversion system that provides strong stereoscopic visual effects for human viewers. Linear and atmospheric perspective cues, which complement each other, are employed to estimate depth information. Rather than retrieving a precise depth value for each pixel from the depth cues, a direction angle of the image is estimated, and a depth gradient consistent with that angle is combined with superpixels to obtain the depth map. However, the stereoscopic effect of views synthesized from this depth map is limited and unsatisfying to viewers. To obtain a more impressive visual effect, the viewer's main focus is considered: salient object detection is performed to find the region of visual attention, and the depth map is refined by locally modifying the depth values within that salient region. The refinement not only maintains global depth consistency by correcting non-uniform depth values but also enhances the stereoscopic effect. Experimental results show that, in subjective evaluation, the degree of satisfaction with the proposed method is approximately 7% higher than with both existing commercial conversion software and a state-of-the-art approach.
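The two steps the abstract names, a global depth gradient laid out along an estimated direction angle, then a local refinement inside the salient region, can be sketched as follows. This is a hypothetical NumPy illustration; the angle, the boost amount, and the "larger value = closer" depth convention are assumptions, not the paper's actual procedure.

```python
import numpy as np

def gradient_depth_map(h, w, angle_rad):
    """Build a global depth map whose value increases along the
    estimated direction angle (0 = left-to-right)."""
    ys, xs = np.mgrid[0:h, 0:w]
    # Project each pixel onto the direction vector, normalize to [0, 1].
    proj = xs * np.cos(angle_rad) + ys * np.sin(angle_rad)
    return (proj - proj.min()) / (proj.max() - proj.min())

def refine_salient(depth, mask, boost=0.2):
    """Locally pull the salient region toward the viewer, clipped to
    the valid depth range."""
    out = depth.copy()
    out[mask] = np.clip(out[mask] + boost, 0.0, 1.0)
    return out

depth = gradient_depth_map(4, 6, angle_rad=0.0)  # depth grows left to right
mask = np.zeros((4, 6), dtype=bool)
mask[1:3, 2:4] = True                            # stand-in for the detected salient object
refined = refine_salient(depth, mask)
```

The refinement leaves the global gradient untouched outside the mask, which corresponds to the abstract's claim that global depth consistency is preserved while the salient object is emphasized.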

시각추적과제의 뇌자도 : 예비실험 (A Pilot MEG Study During A Visual Search Task)

  • 김성훈;이상건;김광기
    • Annals of Clinical Neurophysiology / Vol. 8, No. 1 / pp. 44-47 / 2006
  • Background: The present study used magnetoencephalography (MEG) to investigate the neural substrates of a modified version of Treisman's visual search task. Methods: Two volunteers who gave informed consent participated in the MEG experiment: a 27-year-old male and a 24-year-old female, both right-handed. Experiments were performed using a 306-channel biomagnetometer (Neuromag Ltd.). There were three task conditions: searching for an open circle among seven closed circles (open condition); searching for a closed circle among seven uni-directionally open circles (closed condition); and searching for a closed circle among seven eight-directionally open circles (random (closed) condition). Each run contained one task condition, so a session comprised three runs, with 128 trials performed across the three runs. Each participant underwent one session and pressed a button upon finding the target. Magnetic source localization images were generated using software that allowed interactive identification of a common set of fiduciary points in the MRI and MEG coordinate frames. Results: In each participant we found activation of the anterior cingulate, the primary visual and association cortices, the posterior parietal cortex, and brain areas in the vicinity of the thalamus. Conclusions: We found activations corresponding to the anterior and posterior visual attention systems.


인간의 상향식 시각적 주의 특성에 바탕을 둔 현저한 영역 탐지 (Detecting Salient Regions based on Bottom-up Human Visual Attention Characteristic)

  • 최경주;이일병
    • 한국정보과학회논문지:소프트웨어및응용 / Vol. 31, No. 2 / pp. 189-202 / 2004
  • This paper proposes a new method for detecting only the salient regions, those considered perceptually important, among the vast amount of information in an image captured by an image input device. The proposed method is grounded in the human visual attention mechanism and based on the features that make up the image. First, several features known to influence human visual attention are extracted over every region of the input image and assembled into a feature map for each feature. The feature values composing each feature map are then converted, through their local competitive characteristics, into values representing the importance of each image region, forming importance maps. These importance maps are all integrated into a single saliency map. The saliency map expresses the degree of saliency at each location in the image as a scalar value derived from the spatial importance measurements of the precomputed features, guiding the search for the most salient region in the image. Experiments with a system built on the proposed method show that it satisfactorily detects the major regions humans consider important.
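The pipeline described, per-feature maps, then importance maps via local competition, then one integrated saliency map, can be sketched minimally in NumPy. Plain min-max normalization stands in for the paper's local-competition operator, and the two random feature channels are toy inputs; both are assumptions of this sketch.

```python
import numpy as np

def normalize(fmap):
    """Scale a feature map to [0, 1] (a stand-in for the paper's
    local-competition step that builds an importance map)."""
    span = fmap.max() - fmap.min()
    return (fmap - fmap.min()) / span if span > 0 else np.zeros_like(fmap)

def saliency_map(feature_maps):
    """Normalize each feature map into an importance map, then
    average them into a single saliency map."""
    importance = [normalize(f) for f in feature_maps]
    return sum(importance) / len(importance)

rng = np.random.default_rng(1)
h, w = 8, 8
intensity = rng.random((h, w))
color = rng.random((h, w))
intensity[3, 4] = 5.0          # a conspicuous spot in one feature channel
sal = saliency_map([intensity, color])
y, x = np.unravel_index(sal.argmax(), sal.shape)  # most salient location
```

The key property is that a value that dominates its own feature channel survives normalization and pulls the integrated map toward its location, which is how the conspicuous spot guides attention here.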

Audio and Video Bimodal Emotion Recognition in Social Networks Based on Improved AlexNet Network and Attention Mechanism

  • Liu, Min;Tang, Jun
    • Journal of Information Processing Systems / Vol. 17, No. 4 / pp. 754-771 / 2021
  • In continuous dimensional emotion recognition, the parts that highlight emotional expression differ across modalities, and different modalities influence the estimated emotional state to different degrees. This paper therefore studies the fusion of the two most important modalities in emotion recognition, voice and facial expression, and proposes a dual-modal emotion recognition method that combines an improved AlexNet network with an attention mechanism. After simple preprocessing of the audio and video signals, audio features are first extracted using prior knowledge. Facial expression features are then extracted by the improved AlexNet network. Finally, a multimodal attention mechanism fuses the facial expression and audio features, and an improved loss function is used to mitigate the missing-modality problem, improving the robustness of the model and its emotion recognition performance. Experimental results show that the concordance correlation coefficients of the proposed model in the arousal and valence dimensions were 0.729 and 0.718, respectively, which is superior to several comparison algorithms.
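The concordance correlation coefficient (CCC) quoted as the evaluation metric has a simple closed form: twice the covariance of predictions and labels, divided by the sum of their variances plus the squared mean difference. A small NumPy sketch (the toy values are illustrative, not the paper's data):

```python
import numpy as np

def ccc(y_true, y_pred):
    """Concordance correlation coefficient: 1.0 means predictions
    track the labels exactly in both scale and location."""
    mt, mp = y_true.mean(), y_pred.mean()
    vt, vp = y_true.var(), y_pred.var()
    cov = ((y_true - mt) * (y_pred - mp)).mean()
    return 2 * cov / (vt + vp + (mt - mp) ** 2)

t = np.array([0.1, 0.4, 0.35, 0.8])
perfect = ccc(t, t)       # identical predictions give CCC = 1.0
opposed = ccc(t, -t)      # systematically opposed predictions give CCC < 0
```

Unlike Pearson correlation, CCC also penalizes constant offsets and scale mismatches, which is why it is the standard metric for continuous arousal/valence prediction.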

신호등 인식 성능 향상을 위한 쿠버네티스 기반의 프레임워크: YOLOv5와 Visual Attention을 적용한 C-RNN의 융합 Vision AI 시스템 (Kubernetes-based Framework for Improving Traffic Light Recognition Performance: Convergence Vision AI System based on YOLOv5 and C-RNN with Visual Attention)

  • 조형서;이민정;한연지
    • 한국정보처리학회:학술대회논문집 / Proceedings of the KIPS 2022 Fall Conference / pp. 851-853 / 2022
  • Population aging has sharply increased the number of drivers aged 65 and over, and the growing share of traffic accidents involving elderly drivers is emerging as an urgent social problem. This study therefore proposes a Kubernetes-based framework that combines object detection and recognition models to recognize traffic lights and announce them via Text-To-Speech (TTS). For the object detection stage, the performance of several YOLOv5 models was compared and the best was used; for the object recognition stage, a C-RNN-based attention-OCR model was used. This approach recognizes the entire image rather than only the internal LED area of the traffic light, reducing false detections and raising the recognition rate. On 1,628 test images, the framework achieved an accuracy of 0.997 and an F1-score of 0.991, demonstrating its validity. This work allows follow-up research to combine models from various fields rather than restricting deep learning models to a specific domain, and can help prevent traffic accidents caused by elderly drivers and signal violations.

선택적 주의집중 모델과 YOLO를 이용한 선행 차량 정지등 검출 시스템 구현 (Implementation of Preceding Vehicle Break-Lamp Detection System using Selective Attention Model and YOLO)

  • 이우범
    • 융합신호처리학회논문지 / Vol. 22, No. 2 / pp. 85-90 / 2021
  • Advanced Driver Assistance Systems (ADAS) for safe driving are one of the important research areas for autonomous vehicles. In particular, ADAS software based on image sensors already mounted on vehicles is cheap to build and highly useful. This paper proposes an algorithm that detects the brake-lamp region of a preceding vehicle's tail lamps, from which the preceding vehicle's driving state can be inferred. The proposed method extracts vehicle objects from driving video using YOLO, which shows excellent object tracking performance, and detects brightness-change regions of the brake lamps using the HSV image of the extracted vehicle region of interest. Each isolated candidate brake-lamp region is then labeled, and a selective attention model that recognizes the shape symmetry between candidate regions is applied to detect the final brake-lamp regions. In performance evaluations on various driving videos, the proposed algorithm showed detection results good enough for use in ADAS.
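The HSV brightness step, keeping pixels of the vehicle region that are bright and saturated enough to be a lit brake lamp, can be sketched with the standard-library colorsys conversion. The thresholds and the tiny 2x2 patch are assumptions for illustration; the paper's actual thresholding and the later symmetry check are not reproduced here.

```python
import colorsys

import numpy as np

def brake_lamp_mask(rgb, v_thresh=0.8, s_thresh=0.5):
    """Mark pixels that are both bright (high V) and saturated (high S)
    in HSV space, as lit brake lamps tend to be."""
    h, w, _ = rgb.shape
    mask = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            r, g, b = rgb[y, x] / 255.0
            _, s, v = colorsys.rgb_to_hsv(r, g, b)
            mask[y, x] = v >= v_thresh and s >= s_thresh
    return mask

patch = np.zeros((2, 2, 3), dtype=np.uint8)
patch[0, 0] = (255, 0, 0)      # lit lamp pixel: bright, saturated red
patch[1, 1] = (40, 40, 40)     # dark gray background pixel
mask = brake_lamp_mask(patch)  # only the lamp pixel survives
```

In the full pipeline, connected components of this mask would be labeled and checked for left/right shape symmetry before a region is accepted as a brake lamp.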

비주얼 서보잉을 위한 딥러닝 기반 물체 인식 및 자세 추정 (Object Recognition and Pose Estimation Based on Deep Learning for Visual Servoing)

  • 조재민;강상승;김계경
    • 로봇학회논문지 / Vol. 14, No. 1 / pp. 1-7 / 2019
  • Recently, smart factories have attracted much attention as a result of the Fourth Industrial Revolution. Existing factory automation technologies are generally designed for simple repetition without vision sensors, and even small object assemblies still depend on manual work. To replace existing systems with new technologies such as bin picking and visual servoing, precision and real-time operation are essential. In this work we therefore focus on these core elements, using a deep learning algorithm to detect and classify the target object in real time and to analyze its features. Although there are many good deep learning algorithms, such as Mask R-CNN and Fast R-CNN, we chose the YOLO CNN because it runs in real time and combines the two tasks mentioned above. Then, from the line and interior features extracted from the target object, we obtain the final outline and estimate the object's pose.