• Title/Summary/Keyword: Visual Attention Software

LVLN : A Landmark-Based Deep Neural Network Model for Vision-and-Language Navigation (LVLN: 시각-언어 이동을 위한 랜드마크 기반의 심층 신경망 모델)

  • Hwang, Jisu;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.8 no.9 / pp.379-390 / 2019
  • In this paper, we propose a novel deep neural network model for Vision-and-Language Navigation (VLN) named LVLN (Landmark-based VLN). In addition to the visual features extracted from input images and the linguistic features extracted from natural language instructions, the model uses information about places and landmark objects detected in the images. The model also applies a context-based attention mechanism to associate each entity mentioned in the instruction with the corresponding region of interest (ROI) in the image and with the corresponding place and landmark object detected in the image. Moreover, to improve the success rate of arriving at the target goal, the model adopts a progress monitor module that checks actual progress toward the goal. Through experiments with the Matterport3D simulator and the Room-to-Room (R2R) benchmark dataset, we demonstrate the high performance of the proposed model.

2-Stage Detection and Classification Network for Kiosk User Analysis (디스플레이형 자판기 사용자 분석을 위한 이중 단계 검출 및 분류 망)

  • Seo, Ji-Won;Kim, Mi-Kyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.5 / pp.668-674 / 2022
  • Machine learning techniques that use visual data are widely applicable in industry and service fields such as scene recognition, fault detection, security, and user analysis. Among these, user analysis of CCTV video is one of the practical ways of using vision data. In addition, many studies on lightweight artificial neural networks have been published to improve usability in mobile and embedded environments. In this study, we propose a network that combines object detection and classification for mobile graphics processing units. The network detects pedestrians and faces and classifies age and gender from each detected face. The proposed network is built on MobileNet, YOLOv2, and skip connections. The detection and classification models are trained individually and then combined into a 2-stage structure. An attention mechanism is also used to improve detection and classification ability. An Nvidia Jetson Nano is used to run and evaluate the proposed system.

A Safety Score Prediction Model in Urban Environment Using Convolutional Neural Network (컨볼루션 신경망을 이용한 도시 환경에서의 안전도 점수 예측 모델 연구)

  • Kang, Hyeon-Woo;Kang, Hang-Bong
    • KIPS Transactions on Software and Data Engineering / v.5 no.8 / pp.393-400 / 2016
  • Recently, there has been a variety of research on methods for efficient, automatic analysis of urban environments using computer vision and machine learning. Among these new analyses, urban safety analysis has received considerable attention. To predict safety scores more accurately and to reflect human visual perception, it is necessary to consider the generic and local information that is most important to human perception. In this paper, we use a Double-column Convolutional Neural Network consisting of generic and local columns to predict urban safety. The generic and local columns take resized and randomly cropped versions of the original images as input, respectively. In addition, a new learning method is proposed to solve the problem of over-fitting in a particular column during training. To evaluate our Double-column Convolutional Neural Network, we compare it with two Support Vector Regression models and three Convolutional Neural Network models using Root Mean Square Error and correlation analysis. Our experimental results demonstrate that the Double-column Convolutional Neural Network model shows the best performance, with a Root Mean Square Error of 0.7432 and Pearson/Spearman correlation coefficients of 0.853/0.840.

System Design and Application of External Feature Extraction for Quality Maintenance of Yukwa (유과의 품질규격 유지를 위한 외형 정보 측정 시스템 설계 및 적용 연구)

  • Cho, Sung Ho;Kim, Tae Jung;Hwang, Heon
    • The Korean Journal of Community Living Science / v.24 no.2 / pp.251-258 / 2013
  • Yukwa, a traditional Korean confection made with oil and honey, has attracted attention as a ceremonial sweet for traditional seasonal holidays and religious services. The quality of Yukwa, however, has been maintained arbitrarily by each manufacturer. Because even the same type of Yukwa varies widely in size, weight, and pattern, consumers have been negatively affected. The Yukwa industry needs to set up quantitative rather than qualitative quality specifications to keep quality uniform, and an efficient, economical inspection and process control system should be developed. In developing quality standards for Yukwa, features that allow quality to be measured quantitatively in real time should be properly chosen. Existing quality features such as acidity, oxidation, hardness, viscosity, and texture have been measured by destructive chemical or physical methods. Much research and development has investigated and analyzed how these quality features change chemically as the environment or storage conditions change. Most of these methods, however, require off-line processing, complex treatment, or time-consuming analysis. Consumers, by contrast, select products mostly on external features such as shape, size, and color. Therefore, critical visual quality features should be chosen, and an efficient real-time measurement system must be developed. In this paper, a computer image acquisition and processing system was developed, together with software modules that extract quantitative data for those features in real time. The computer image processing system will help maintain uniform quality of Yukwa and establish quality standards for it.

Correlation between Brain Cognition and Cyberdisease in VR Media (VR매체에서의 뇌인지와 사이버 멀미의 상관관계)

  • Kim, Min-Seo;Kim, Kyun-Ho;Kim, Yu-Ri;Kim, Eun-Seo;HUH, Won-Whoi
    • The Journal of the Convergence on Culture Technology / v.8 no.5 / pp.603-611 / 2022
  • As the era of the metaverse approaches, there are challenges that need to be solved. Among them, 'cyber motion sickness' has been a representative problem since 2016, when VR technology began to attract attention. According to sensory conflict theory, motion sickness occurs when the perceived direction of motion does not match the expected value. This paper theoretically explores the relationship between brain cognition and cyber motion sickness and, on that basis, examines the effect of user immersion on motion sickness symptoms. An SSQ experiment showed that camera rotation aggravates the symptoms of cyber motion sickness, and that giving the viewer visual and shift missions to solve increases immersion in the game and can thereby alleviate the symptoms. This study was conducted to address cyber motion sickness during the development of the VR rhythm game "beatale", and it is expected to serve as a basis for reducing cyber motion sickness not only in this project but also in the future production of VR content.

The research and Development trends of Telecommunications of the End of the 20th Century(Present) and the Beginning of the 21st Century(Future) (20세기 말과 21세기 초의 전기통신의 연구개발동향)

  • 조규심
    • Journal of the Korean Professional Engineers Association / v.29 no.2 / pp.15-23 / 1996
  • With the ever-increasing importance of high-speed information in society as we move towards the 21st century, the telecommunication laboratories of advanced nations are pressing forward with research and development aimed at implementing VI&P (Visual, Intelligent and Personal) services and constructing a new network to support them. In regard to the former, based on a long-term view of technological and market trends, the laboratories are researching and developing services that will make possible an effective progression from services that answer potential needs towards the full-scale implementation of VI&P services. In regard to the latter, the laboratories are responding flexibly to the increasing diversity and dispersal of the communications environment by separating the network into a transmission system and a versatile information control/conversion system, and they are working to enhance the performance of both. Within these broad aims, the laboratories are currently focusing their attention on three areas: technology for a high-speed broadband transmission system featuring optical frequency multiplexing and ATM techniques, network and software technologies for advanced information control and conversion, and technology for constructing a new access network that can provide a comprehensive range of multimedia services. This article describes the laboratories' concept of how VI&P services will develop in the future and the latest trends in the field of communications. It also describes the ideal configuration of the new network and discusses the important technological aspects of how it is to be constructed. Finally, it presents the results of the laboratories' recent research, which include some innovative work, and points out the areas requiring future investigation.

Rotation Invariant 3D Star Skeleton Feature Extraction (회전무관 3D Star Skeleton 특징 추출)

  • Chun, Sung-Kuk;Hong, Kwang-Jin;Jung, Kee-Chul
    • Journal of KIISE: Software and Applications / v.36 no.10 / pp.836-850 / 2009
  • Human posture recognition has attracted tremendous attention in ubiquitous computing environments, the performing arts, and robot control, and many researchers in pattern recognition and computer vision are now working to build efficient posture recognition systems. However, most existing studies are very sensitive to human variations such as rotation or translation of the body. This is because the feature produced by the feature extraction stage, the first step of a general posture recognition system, is influenced by these variations. To alleviate the effect of these variations and improve posture recognition results, this paper presents feature extraction methods based on the 3D Star Skeleton and Principal Component Analysis (PCA) in a multi-view environment. The proposed system uses 8 projection maps, a kind of depth map, as input data. The projection maps are extracted during the visual hull generation process. From these data, the system constructs the 3D Star Skeleton and extracts a rotation-invariant feature using PCA. In the experiments, we extract the feature from the 3D Star Skeleton and use it to recognize human posture. Finally, we show that the proposed method is robust to human variations.

Effective Multi-Modal Feature Fusion for 3D Semantic Segmentation with Multi-View Images (멀티-뷰 영상들을 활용하는 3차원 의미적 분할을 위한 효과적인 멀티-모달 특징 융합)

  • Hye-Lim Bae;Incheol Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.12 / pp.505-518 / 2023
  • 3D point cloud semantic segmentation is a computer vision task that divides a point cloud into different objects and regions by predicting the class label of each point. Existing 3D semantic segmentation models are limited in how well they fuse multi-modal features while preserving the characteristics of both the 2D visual features extracted from RGB images and the 3D geometric features extracted from the point cloud. Therefore, in this paper, we propose MMCA-Net, a novel 3D semantic segmentation model that uses 2D-3D multi-modal features. The proposed model effectively fuses the two heterogeneous feature types, 2D visual features and 3D geometric features, by using an intermediate fusion strategy and a fusion operation based on multi-modal cross attention. The proposed model also extracts context-rich 3D geometric features from an input point cloud of irregularly distributed points by adopting PTv2 as its 3D geometric encoder. We conducted both quantitative and qualitative experiments on the ScanNetv2 benchmark dataset to analyze the performance of the proposed model. In terms of the mIoU metric, the proposed model showed a 9.2% performance improvement over the PTv2 model, which uses only 3D geometric features, and a 12.12% performance improvement over the MVPNet model, which uses 2D-3D multi-modal features. These results demonstrate the effectiveness and usefulness of the proposed model.

Application of Cognitive Enhancement Protocol Based on Information & Communication Technology Program to Improve Cognitive Level of Older Adults Residents in Small-Sized City Community: A Pilot Study (중소도시 지역사회 거주 노인의 치매예방을 위한 Information & Communication Technology 프로그램 기반 인지향상 프로토콜 적용: 파일럿(Pilot) 연구)

  • Yun, Sohyeon;Lee, Hamin;Kim, Mi Kyeong;Park, Hae Yean
    • Therapeutic Science for Rehabilitation / v.12 no.2 / pp.69-83 / 2023
  • Objective: As a preliminary study, this study applied a home-based Information & Communication Technology (ICT) program to elderly people aged 65 years or older to confirm the effect of the cognitive enhancement program and to explore the possibility of remote rehabilitation. Methods: Three subjects were selected, and the intervention was conducted for about 2 months between August and October 2022. The Korean version of the Mini-Mental State Examination, the Korean version of the Montreal Cognitive Assessment (MoCA-K), the Computer Cognitive Senior Assessment System, and the Center for Epidemiologic Studies Depression scale were used to evaluate cognitive improvement before and after the program. The therapist remotely set the level of cognitive training according to each subject's level through weekly feedback. Results: After the intervention, all subjects showed improved scores on most items of the MoCA-K compared with before the intervention. In addition, among the items of Cotras-pro, higher-order cognition, language ability, attention, visual perception, and memory improved. Conclusion: Cognitive rehabilitation training using a home-based ICT program not only contributed to dementia prevention but also made the training habitual. This study confirmed that remote rehabilitation for the elderly may be possible.

The new explore of the animated content using OculusVR - Focusing on the VR platform and killer content - (오큘러스 VR (Oculus VR)를 이용한 애니메이션 콘텐츠의 새로운 모색 - VR 플랫폼과 킬러콘텐츠를 중심으로 -)

  • Lee, Jong-Han
    • Cartoon and Animation Studies / s.45 / pp.197-214 / 2016
  • Augmented reality (AR), virtual reality (VR), and mixed reality, which blends the two, have recently attracted attention throughout the world and have had a significant impact on popular culture as a whole, beyond the scope of science and technology. The world's leading IT companies, including Google, Apple, Samsung, Microsoft, Sony, and LG, are focusing on developing AR and VR technology for the public, and many large and small companies are developing VR hardware, software, and content. VR encompasses the technologies that realize a virtual space through a specific platform or program, giving people a cognitive experience of places or situations that they cannot see or cannot experience directly. In particular, it moves beyond the limitations of conventional two-dimensional and three-dimensional images: 180- and 360-degree images let participants choose their own subjective or objective point of view and sense of time. Because it can strongly induce immersion and participation, VR technology is drawing unprecedented attention from industry as well as the general public. More than 10 VR works were introduced in the New Frontier program of the 2015 Sundance Film Festival. VR content is appearing in medicine, architecture, shopping, movies, and animation. Individuals can also produce and share VR video with 360-degree cameras, so VR is becoming an interactive channel between users. Nevertheless, confusion of values and moral degeneration have been pointed out as problems inherent in the realization of virtual space. With 4K displays, HUDs, location tracking, motion sensors, greater processing power, superior 3D graphics, touch, smell, 4D technology, and 3D audio, the technology has developed further than ever and can now approach reality. This is why concerns arise about moral degeneration, identity, generational conflict, and escapism. Animation is also seeking its place in this category of reality. Although animation is likewise made of images, that very similarity may be the reason it has been pushed back in VR content creation. However, while VR technology and platforms have focused on games and entertainment, animation needs to seek new points of contact rather than staying on the flat screen, given that VR, too, ultimately consists of visual images. Finally, how could the reality created in virtual space using VR technology be applied to animation? The common interest of this study is therefore which methods and means can be applied.