• Title/Abstract/Keywords: Attention algorithm

754 search results (processing time: 0.03 s)

Query-based Visual Attention Algorithm for Object Recognition of A Mobile Robot

  • 류광근;이상훈;서일홍
    • 전자공학회논문지SC / Vol. 44, No. 1 / pp. 50-58 / 2007
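  • In this paper, we propose a query-based visual attention algorithm that extends the existing bottom-up attention algorithm so that a robot attends visually to the parts of a scene relevant to its task. When the robot queries an object related to the task it is to perform, the algorithm analyzes the attributes of that object and produces the weights to be applied to the different conspicuity maps. The weights are then used to combine the conspicuity maps into a saliency map, which we evaluated against the existing attention algorithm. As an example, color is used here as the attribute of the queried object.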
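The weighted combination described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `combine_conspicuity_maps` and `most_salient`, the map names, the toy values, and the weights are all hypothetical, and the maps are plain nested lists rather than real images.

```python
def combine_conspicuity_maps(maps, weights):
    """Weighted sum of equally sized 2-D conspicuity maps, rescaled so the peak is 1."""
    names = list(maps)
    rows = len(maps[names[0]])
    cols = len(maps[names[0]][0])
    saliency = [[0.0] * cols for _ in range(rows)]
    for name in names:
        w = weights.get(name, 0.0)  # maps without a weight are ignored
        for r in range(rows):
            for c in range(cols):
                saliency[r][c] += w * maps[name][r][c]
    peak = max(max(row) for row in saliency)
    if peak > 0:
        saliency = [[v / peak for v in row] for row in saliency]
    return saliency


def most_salient(saliency):
    """Return (row, col) of the largest saliency value."""
    best = (0, 0)
    for r, row in enumerate(saliency):
        for c, v in enumerate(row):
            if v > saliency[best[0]][best[1]]:
                best = (r, c)
    return best


# Hypothetical 2x2 conspicuity maps (values in [0, 1]).
maps = {
    "color": [[0.1, 0.9], [0.2, 0.3]],
    "intensity": [[0.8, 0.1], [0.2, 0.1]],
    "orientation": [[0.1, 0.1], [0.9, 0.2]],
}

# Assumed weights: a query for a strongly colored object favors the color map.
query_weights = {"color": 0.8, "intensity": 0.1, "orientation": 0.1}
# Uniform weights stand in for an unweighted bottom-up combination.
uniform_weights = {"color": 1 / 3, "intensity": 1 / 3, "orientation": 1 / 3}

query_focus = most_salient(combine_conspicuity_maps(maps, query_weights))
bottom_up_focus = most_salient(combine_conspicuity_maps(maps, uniform_weights))
print(query_focus, bottom_up_focus)
```

In this toy setup the query weights move the focus of attention to the cell where the color map is strongest, while uniform weights leave it on the cell dominated by orientation.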

An Improved Recommendation Algorithm Based on Two-layer Attention Mechanism

  • Kim, Hye-jin
    • 한국컴퓨터정보학회논문지 / Vol. 26, No. 10 / pp. 185-198 / 2021
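  • With the development of Internet technology, existing recommendation algorithms fall short because they cannot learn the deep characteristics of users or items. To address this problem, this paper proposes a recommendation algorithm based on AMITI (attention mechanism and improved TF-IDF). Introducing a two-layer attention mechanism into a CNN (Convolutional Neural Network) improves the CNN's feature extraction capability, assigns different preference weights to item features, and yields recommendations that better match user preferences. When recommending items to a target user, rating data and item type data are combined with TF-IDF to group the recommendation results. Experiments on the MovieLens-1M dataset show that the AMITI algorithm improves recommendation accuracy and improves the order and selectivity of the presentation method.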

Video Compression Standard Prediction using Attention-based Bidirectional LSTM

  • 김상민;박범준;정제창
    • 방송공학회논문지 / Vol. 24, No. 5 / pp. 870-878 / 2019
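  • In this paper, we use an attention-based bidirectional long short-term memory (BLSTM) network to predict the compression standard of a video. In natural language processing (NLP), research that uses recurrent neural network (RNN) architectures to predict the next word of a sentence, classify sentences by meaning, or translate them has been ongoing, and has been commercialized in chatbots, voice-recognition speakers, translation applications, and the like. LSTM was devised to solve the vanishing gradient problem in RNNs and is widely used in NLP. The proposed algorithm applies a BLSTM, together with an attention algorithm that can focus on specific words for classification, to the bitstream of a video rather than to natural-language sentences, making it possible to predict the video's compression standard.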
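The attention step mentioned in this abstract, which lets a classifier focus on particular positions of a sequence, can be sketched as attention pooling over per-step hidden states. The abstract does not state the scoring function, so dot-product scoring against a query vector is an assumption here; `attention_pool`, the toy hidden states, and the query are hypothetical stand-ins for BLSTM outputs and a learned vector.

```python
import math


def attention_pool(hidden_states, query):
    """Softmax-weighted average of hidden states, scored by dot product with `query`."""
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Hypothetical hidden states for a three-step sequence (for example, three
# chunks of a bitstream), each a 2-dimensional vector.
states = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
query = [1.0, 0.0]  # assumed learned query vector

weights, context = attention_pool(states, query)
print(weights, context)
```

The context vector would then feed a classifier over the candidate compression standards; that classifier is outside the scope of this sketch.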

Explaining the Translation Error Factors of Machine Translation Services Using Self-Attention Visualization

  • 장청롱;안현철
    • 한국IT서비스학회지 / Vol. 21, No. 2 / pp. 85-95 / 2022
  • This study analyzed the translation error factors of machine translation services such as Naver Papago and Google Translate through Self-Attention path visualization. Self-Attention is a key method of the Transformer and BERT NLP models and has recently been widely used in machine translation. We propose a method to explain the translation error factors of machine translation algorithms by comparing the Self-Attention paths of the ST (source text) and ST' (a transformed ST whose meaning is unchanged but whose translation output is more accurate). This method provides explainability for analyzing the internal process of a machine translation algorithm, which is otherwise invisible, like a black box. In our experiment, it was possible to explore the factors that caused translation errors by analyzing the differences in the attention paths of key words. The study used the XLM-RoBERTa multilingual NLP model provided by exBERT for Self-Attention visualization, and applied it to two examples, Korean-Chinese and Korean-English translation.

A Study on Automatic Recommendation of Keywords for Sub-Classification of National Science and Technology Standard Classification System Using AttentionMesh

  • 박진호;송민선
    • 한국도서관정보학회지 / Vol. 53, No. 2 / pp. 95-115 / 2022
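  • The purpose of this study is to convert the sub-classification terms of the National Science and Technology Standard Classification System into technical keywords by applying a machine learning algorithm. To this end, AttentionMeSH was used as a learning algorithm suited to subject-term recommendation. The source data were the research status files for the four years from 2017 to 2020, refined by the Korea Institute of S&T Evaluation and Planning (KISTEP). Training used four attributes that express the research content well: project title, research objectives, research content, and expected effects. As a result, an MiF of 0.6377 was obtained at a threshold of 0.5. To apply machine learning to actual work and to secure technical keywords in the future, it appears necessary to build a term management system and to secure data with a variety of attributes.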
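For readers unfamiliar with the reported metric, the sketch below shows how a thresholded multi-label score could be computed, assuming MiF denotes the micro-averaged F-measure. The function name `micro_f1`, the toy scores, and the choice of `>=` for the threshold comparison are assumptions for illustration; the abstract does not give these details.

```python
def micro_f1(scores, gold, threshold=0.5):
    """Micro-averaged F-measure: pool true/false positives over all documents."""
    tp = fp = fn = 0
    for doc_scores, doc_gold in zip(scores, gold):
        # Keep every label whose predicted score reaches the threshold.
        predicted = {label for label, s in doc_scores.items() if s >= threshold}
        tp += len(predicted & doc_gold)
        fp += len(predicted - doc_gold)
        fn += len(doc_gold - predicted)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical keyword scores for two documents, with their gold keywords.
scores = [
    {"a": 0.9, "b": 0.4, "c": 0.6},
    {"a": 0.2, "b": 0.7},
]
gold = [{"a", "b"}, {"b"}]

mif = micro_f1(scores, gold, threshold=0.5)
print(round(mif, 4))
```

Raising the threshold generally trades recall for precision, which is why a score of this kind is reported together with the threshold used.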

A Tuberculosis Detection Method Using Attention and Sparse R-CNN

  • Xu, Xuebin;Zhang, Jiada;Cheng, Xiaorui;Lu, Longbin;Zhao, Yuqing;Xu, Zongyu;Gu, Zhuangzhuang
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 7 / pp. 2131-2153 / 2022
  • To achieve accurate detection of tuberculosis (TB) areas in chest radiographs, we design a chest X-ray TB area detection algorithm. The algorithm consists of two stages: the chest X-ray TB classification network (CXTCNet) and the chest X-ray TB area detection network (CXTDNet). CXTCNet judges the presence or absence of TB areas in chest X-ray images, thereby excluding the influence of other lung diseases on the detection of TB areas. This reduces false positives in the detection network and improves the accuracy of the detection results. In CXTCNet, we propose a channel attention mechanism (CAM) module and combine it with DenseNet. This module enables the network to learn more spatial and channel feature information from chest X-ray images, thereby improving network performance. CXTDNet is based on a sparse object detection algorithm (Sparse R-CNN). A fixed set of learnable proposal boxes and learnable proposal features is used for classification and localization. The predictions of the algorithm are output directly, without non-maximum suppression post-processing. Furthermore, we use CLAHE in data preprocessing to reduce image noise and improve image quality. Experiments on the TBX11K dataset show that the accuracy of the proposed CXTCNet is up to 99.10%, which is better than most current TB classification algorithms. Finally, our proposed chest X-ray TB detection algorithm achieves an AP of 45.35% and an AP50 of 74.20%. We also establish a chest X-ray TB dataset with 304 sheets. Experiments on this dataset showed that the accuracy of the diagnosis was comparable to that of radiologists. We hope that our proposed algorithm and established dataset will advance the field of TB detection.

Multi-Task FaceBoxes: A Lightweight Face Detector Based on Channel Attention and Context Information

  • Qi, Shuaihui;Yang, Jungang;Song, Xiaofeng;Jiang, Chen
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 14, No. 10 / pp. 4080-4097 / 2020
  • In recent years, convolutional neural networks (CNNs) have become the primary method for face detection. Their shortcomings are obvious, however: computation is expensive and the models are heavy. This makes CNNs difficult to use on mobile devices, which have limited computing and storage capabilities. The design of lightweight CNNs for face detection is therefore becoming more and more important with the popularity of smartphones and the mobile Internet. Based on the CPU real-time face detector FaceBoxes, we propose a multi-task lightweight face detector with low computing cost and higher detection precision. First, to improve the detection capability, squeeze-and-excitation modules are used to extract attention between channels. Then, textual and semantic information is extracted by shallow and deep networks, respectively, to obtain rich features. Finally, a landmark detection module is used to improve the detection performance for small faces and to provide landmark data for face alignment. Experiments on the AFW, FDDB, PASCAL, and WIDER FACE datasets show that our algorithm achieves a significant improvement in mean average precision. In particular, on the WIDER FACE hard validation set, our algorithm outperforms the mean average precision of FaceBoxes by 7.2%. For VGA-resolution images, the running speed of our algorithm can reach 23 FPS on a CPU device.

Visual Attention Algorithm for Object Recognition

  • 류광근;이상훈;서일홍
    • 대한전기학회:학술대회논문집 / Proceedings of the 대한전기학회 2006 Symposium, Information and Control Section / pp. 306-308 / 2006
  • We propose an attention-based object recognition system that recognizes objects quickly and robustly. To this end, we calculate the degrees of visual stimulus and build saliency maps. Using these maps, we find the strongly attended part of the image according to the stimulus degrees, and extract local features there to recognize objects.

Region of Interest Detection Based on Visual Attention and Threshold Segmentation in High Spatial Resolution Remote Sensing Images

  • Zhang, Libao;Li, Hao
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 7, No. 8 / pp. 1843-1859 / 2013
  • The continuous increase in the spatial resolution of remote sensing images poses a great challenge to image analysis and processing. Traditional prior-knowledge-based region detection and target recognition algorithms for high-resolution remote sensing images generally employ a global search, which results in prohibitive computational complexity. In this paper, a more efficient region of interest (ROI) detection algorithm based on visual attention and threshold segmentation (VA-TS) is proposed, wherein a visual attention mechanism is used to avoid applying image segmentation and feature detection to the entire image. The input image is subsampled to decrease the amount of data, and the discrete moment transform (DMT) feature is extracted to provide a finer description of the edges. The feature maps are combined with weights according to the number of "strong points" and "salient points". A threshold segmentation strategy is employed to obtain more accurate ROI shape information at very low computational complexity. Experimental statistics show that the proposed algorithm is computationally efficient and provides more visually accurate detection results. The calculation time is only about 0.7% of that of the traditional Itti model.

Implementation of Image Adaptive Map

  • 박상범;김기중;한영준;한헌수
    • 한국정밀공학회지 / Vol. 25, No. 2 / pp. 131-139 / 2008
  • This paper presents a new saliency map that is constructed by applying dynamic weights to individual features of an input image in order to search for the ROI (Region Of Interest) or FOA (Focus Of Attention). To construct a saliency map when no a priori information is available, three feature maps are constructed first, which emphasize the orientation, color, and intensity of individual pixels, respectively. From the feature maps, conspicuity maps are generated using Itti's algorithm, and their information quantities are measured in terms of entropy. The final saliency map is constructed by summing the conspicuity maps weighted by their individual entropies. The merit of the proposed algorithm is demonstrated by showing that the ROIs it detects in ten different images are similar to those selected by one hundred people with the naked eye.
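The entropy weighting described in this abstract can be sketched as follows. This is a rough illustration under stated assumptions rather than the authors' method: each map's entropy is used directly as its weight, entropy is estimated from an 8-bin histogram of values in [0, 1], and the names `map_entropy` and `entropy_weighted_saliency` and the toy maps are hypothetical.

```python
import math
from collections import Counter


def map_entropy(cmap, bins=8):
    """Shannon entropy (in bits) of a map's values, quantized into `bins` bins."""
    values = [v for row in cmap for v in row]
    counts = Counter(min(int(v * bins), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_weighted_saliency(cmaps, bins=8):
    """Sum conspicuity maps, each weighted by its own entropy (an assumption)."""
    entropies = [map_entropy(m, bins) for m in cmaps]
    rows, cols = len(cmaps[0]), len(cmaps[0][0])
    saliency = [
        [sum(e * m[r][c] for e, m in zip(entropies, cmaps)) for c in range(cols)]
        for r in range(rows)
    ]
    return entropies, saliency


# Hypothetical 2x2 conspicuity maps with values in [0, 1].
flat_map = [[0.5, 0.5], [0.5, 0.5]]    # uniform map: carries no information
varied_map = [[0.0, 0.3], [0.6, 0.9]]  # four distinct levels

entropies, saliency = entropy_weighted_saliency([flat_map, varied_map])
print(entropies, saliency)
```

In this toy case the uniform map has zero entropy and drops out of the sum, so the saliency map is determined entirely by the varied map.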