• Title/Summary/Keyword: multimodal information (멀티모달정보)

Game Platform and System that Synchronize Actual Humanoid Robot with Virtual 3D Character Robot (가상의 3D와 실제 로봇이 동기화하는 시스템 및 플랫폼)

  • Park, Chang-Hyun;Lee, Chang-Jo
    • Journal of Korea Entertainment Industry Association / v.8 no.2 / pp.283-297 / 2014
  • Human life is expected to change across social, economic, political, and personal spheres as multidisciplinary technologies advance. In robotics and next-generation robot games in particular, multidisciplinary contributions and interaction are expected to accelerate the convergence of technologies. This study pursues a more reliable and easier-to-use "human-robot interface technology": a new interface model that overcomes the technical limitations and the time and space constraints of existing human-robot interfaces by fusing multiple modalities that those interfaces lack. To that end, it develops a robot game system built on three components: a real-time synchronization engine that links the behavior of a biped humanoid robot to the position values of a virtual 3D robot shown on a mobile device screen, a wireless protocol for exchanging information between the two, and a "Direct Teaching & Play" teaching program for effective teaching.

A Study on Success Strategies for Generative AI Services in Mobile Environments: Analyzing User Experience Using LDA Topic Modeling Approach (모바일 환경에서의 생성형 AI 서비스 성공 전략 연구: LDA 토픽모델링을 활용한 사용자 경험 분석)

  • Soyon Kim;Ji Yeon Cho;Sang-Yeol Park;Bong Gyou Lee
    • Journal of Internet Computing and Services / v.25 no.4 / pp.109-119 / 2024
  • This study aims to contribute to early research on on-device AI at a time when generative AI-based services on mobile and other on-device platforms are increasing. To derive success strategies for generative AI-based chatbot services in a mobile environment, more than 200,000 user reviews collected from the Google Play Store were analyzed using LDA topic modeling. The derived topics were interpreted through the Information System Success Model (ISSM): tutoring, limitation of response, and hallucination and outdated information were linked to information quality; multimodal service, quality of response, and issues of device interoperability were linked to system quality; inter-device compatibility, utility of the service, quality of premium services, and challenges in account were linked to service quality; and creative collaboration was linked to net benefits. Humanization of generative AI emerged as a new experience factor not explained by the existing model. By explaining specific positive and negative experience dimensions from the user's perspective on a theoretical basis, this study suggests directions for future research and offers strategic insights for companies seeking to improve and supplement their services.
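
The topic-to-dimension interpretation described in this abstract can be written down as a small lookup table. The sketch below only transcribes the pairs named in the abstract; the names `TOPIC_TO_DIMENSION` and `group_by_dimension` are hypothetical, and the LDA step itself is not reproduced here.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Topic -> ISSM dimension, transcribed from the abstract above.
# None marks the factor the abstract says the existing model does not explain.
TOPIC_TO_DIMENSION: Dict[str, Optional[str]] = {
    "tutoring": "information quality",
    "limitation of response": "information quality",
    "hallucination and outdated information": "information quality",
    "multimodal service": "system quality",
    "quality of response": "system quality",
    "issues of device interoperability": "system quality",
    "inter-device compatibility": "service quality",
    "utility of the service": "service quality",
    "quality of premium services": "service quality",
    "challenges in account": "service quality",
    "creative collaboration": "net benefits",
    "humanization of generative AI": None,
}


def group_by_dimension(mapping: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Invert the table so each dimension lists its topics."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for topic, dimension in mapping.items():
        # Topics without a dimension are collected under an explicit label.
        groups[dimension or "unexplained by ISSM"].append(topic)
    return dict(groups)


groups = group_by_dimension(TOPIC_TO_DIMENSION)
```

Grouping the topics this way makes it easy to see which ISSM dimension attracts the most review themes and which factor falls outside the model.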

Designing a Feedback for Exercises Using a Wearable Device (웨어러블 디바이스를 활용한 운동 중 피드백 방식 연구 - 근력 운동에 대한 멀티 모달 피드백 적용을 중심으로 -)

  • Yoo, Hyunjin;Maeng, Wookjae;Lee, Joongseek
    • Journal of the HCI Society of Korea / v.11 no.3 / pp.23-30 / 2016
  • Current fitness trackers focus largely on aerobic exercise, and weight training is comparatively neglected. A few weight-training trackers have been released recently, but their human-computer interaction is poorly designed because it does not take the exercise context into account. Body movement is intense during exercise, so requiring exercise performers to hold or operate a device creates a negative experience. A wearable device stays attached to the body, so it can deliver feedback effectively without being held or operated. This study therefore aims to give exercise performers natural feedback through a wearable device so that they can exercise effectively. It reports three findings. First, exercise performers needed information most 'during exercise,' and the information they most needed from the wearable device's sensory feedback concerned 'pace control,' together with counting and motivation. Second, the preferred presentation types of sensory feedback were, in order, auditory, haptic, and visual. Third, the satisfaction, utility, and usefulness scores of the sensory feedback were the same as those of a personal trainer's feedback. In conclusion, the study presents design implications for feedback delivered by a wearable device during weight training and suggests that a wearable device could substitute for a personal trainer.

Multi-classification of Osteoporosis Grading Stages Using Abdominal Computed Tomography with Clinical Variables : Application of Deep Learning with a Convolutional Neural Network (멀티 모달리티 데이터 활용을 통한 골다공증 단계 다중 분류 시스템 개발: 합성곱 신경망 기반의 딥러닝 적용)

  • Tae Jun Ha;Hee Sang Kim;Seong Uk Kang;DooHee Lee;Woo Jin Kim;Ki Won Moon;Hyun-Soo Choi;Jeong Hyun Kim;Yoon Kim;So Hyeon Bak;Sang Won Park
    • Journal of the Korean Society of Radiology / v.18 no.3 / pp.187-201 / 2024
  • Osteoporosis is a major health issue globally, often remaining undetected until a fracture occurs. To facilitate early detection, deep learning (DL) models were developed to classify osteoporosis using abdominal computed tomography (CT) scans. This study was conducted using retrospectively collected data from 3,012 contrast-enhanced abdominal CT scans. Three DL models were built, using image data, demographic/clinical information, and multi-modality data, respectively. Patients were categorized into normal, osteopenia, and osteoporosis groups based on their T-scores, obtained from dual-energy X-ray absorptiometry. The models showed high accuracy and effectiveness, with the combined data model performing the best, achieving an area under the receiver operating characteristic curve of 0.94 and an accuracy of 0.80. The image-based model also performed well, while the demographic data model had lower accuracy and effectiveness. In addition, the DL model was interpreted by gradient-weighted class activation mapping (Grad-CAM) to highlight clinically relevant features in the images, revealing the femoral neck as a common site for fractures. The study shows that DL can accurately identify osteoporosis stages from clinical data, indicating the potential of abdominal CT scans in early osteoporosis detection and reducing fracture risks with prompt treatment.
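
The abstract says patients were labeled from their T-scores but does not state the cut-offs. The sketch below assumes the commonly cited WHO thresholds (normal at -1.0 or above, osteoporosis at -2.5 or below); the function name `grade_from_t_score` and the sample values are hypothetical and are not taken from the paper.

```python
from typing import List


def grade_from_t_score(t_score: float) -> str:
    """Map a DXA T-score to a grade.

    Assumption: WHO-style cut-offs, which the abstract does not confirm.
    """
    if t_score <= -2.5:
        return "osteoporosis"
    if t_score < -1.0:
        return "osteopenia"
    return "normal"


# Hypothetical T-scores, including both boundary values.
sample_scores: List[float] = [-0.4, -1.0, -1.8, -2.5, -3.1]
labels = [grade_from_t_score(t) for t in sample_scores]
```

Labels produced this way would serve as the three-class targets for the image, clinical, and combined models.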

Multi-modal Image Processing for Improving Recognition Accuracy of Text Data in Images (이미지 내의 텍스트 데이터 인식 정확도 향상을 위한 멀티 모달 이미지 처리 프로세스)

  • Park, Jungeun;Joo, Gyeongdon;Kim, Chulyun
    • Database Research / v.34 no.3 / pp.148-158 / 2018
  • Optical character recognition (OCR) is a technique for extracting and recognizing text from images. It is an important preprocessing step in data analysis because much real-world text is embedded in images. Many OCR engines achieve high recognition accuracy on images where text is clearly separable from the background, such as black lettering on a white background. However, their accuracy is low on images where text is not easily separable from a complex background. To improve accuracy on such images, the input image must be transformed to make the text more noticeable. In this paper, we propose a method that segments an input image into text lines so that OCR engines can recognize each line more efficiently, and that determines the final output by comparing the recognition rates of a CLAHE module and a Two-step module, both of which distinguish text from background regions using image processing techniques. Through thorough experiments comparing against the well-known OCR engines Tesseract and Abbyy, we show that the proposed method has the best recognition accuracy on images with complex backgrounds.
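
The final selection step described in this abstract can be sketched as a per-line comparison. This is only an illustration: it assumes each module returns one candidate per text line with a recognition score in [0, 1], and the names `LineResult` and `pick_best` and the sample strings are hypothetical. The CLAHE and Two-step preprocessing themselves are not reproduced.

```python
from typing import List, NamedTuple


class LineResult(NamedTuple):
    text: str
    confidence: float  # hypothetical recognition score in [0, 1]


def pick_best(clahe: List[LineResult], two_step: List[LineResult]) -> List[str]:
    """For each text line, keep the candidate with the higher score."""
    if len(clahe) != len(two_step):
        raise ValueError("both modules must return one result per text line")
    return [
        a.text if a.confidence >= b.confidence else b.text
        for a, b in zip(clahe, two_step)
    ]


# Made-up candidates for two text lines of the same image.
clahe_out = [LineResult("TOTAL 12.50", 0.91), LineResult("THANK Y0U", 0.40)]
two_step_out = [LineResult("T0TAL 12.5O", 0.55), LineResult("THANK YOU", 0.87)]
final_lines = pick_best(clahe_out, two_step_out)
```

Selecting per line rather than per image lets each preprocessing path win on the lines where it separates text from background better.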

A Study on Biometric Model for Information Security (정보보안을 위한 생체 인식 모델에 관한 연구)

  • Jun-Yeong Kim;Se-Hoon Jung;Chun-Bo Sim
    • The Journal of the Korea institute of electronic communication sciences / v.19 no.1 / pp.317-326 / 2024
  • Biometric recognition is a technology that identifies a person by extracting information on the person's biometric and behavioral characteristics with a specific device. Cyber threats such as forgery, duplication, and hacking of biometric characteristics are increasing in the field of biometrics. In response, security systems are becoming stronger but also more complex, which makes them harder for individuals to use. To address this, multi-biometric models are being studied. Existing studies have suggested feature fusion methods, but comparisons between those methods are insufficient. Therefore, in this paper, we compared and evaluated fusion methods for multi-biometric models using fingerprint, face, and iris images. VGG-16, ResNet-50, EfficientNet-B1, EfficientNet-B4, EfficientNet-B7, and Inception-v3 were used for feature extraction, and the 'Sensor-Level', 'Feature-Level', 'Score-Level', and 'Rank-Level' fusion methods were compared and evaluated. In the comparative evaluation, the EfficientNet-B7 model showed 98.51% accuracy and high stability with the 'Feature-Level' fusion method. However, because the EfficientNet-B7 model is large, research on lightweight models is needed for biometric feature fusion.
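
Two of the fusion levels named in this abstract can be contrasted in a few lines. The sketch below is a generic illustration, not the paper's implementation: feature-level fusion is shown as concatenation of L2-normalized per-modality vectors, and score-level fusion as a weighted average of per-modality match scores. All function names, vectors, and weights are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def feature_level_fusion(features: Dict[str, Sequence[float]]) -> List[float]:
    """Concatenate normalized feature vectors in a fixed modality order."""
    fused: List[float] = []
    for modality in sorted(features):
        fused.extend(l2_normalize(features[modality]))
    return fused


def score_level_fusion(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-modality match scores."""
    total_weight = sum(weights[m] for m in scores)
    return sum(scores[m] * weights[m] for m in scores) / total_weight


# Toy two-dimensional features standing in for CNN embeddings.
features = {"fingerprint": [3.0, 4.0], "face": [1.0, 0.0], "iris": [0.0, 2.0]}
fused_vector = feature_level_fusion(features)

scores = {"fingerprint": 0.9, "face": 0.6, "iris": 0.75}
equal_weights = {"fingerprint": 1.0, "face": 1.0, "iris": 1.0}
fused_score = score_level_fusion(scores, equal_weights)
```

Feature-level fusion hands one joint vector to a single classifier, while score-level fusion keeps a matcher per modality and combines only their outputs.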

Multi-modal Emotion Recognition using Semi-supervised Learning and Multiple Neural Networks in the Wild (준 지도학습과 여러 개의 딥 뉴럴 네트워크를 사용한 멀티 모달 기반 감정 인식 알고리즘)

  • Kim, Dae Ha;Song, Byung Cheol
    • Journal of Broadcast Engineering / v.23 no.3 / pp.351-360 / 2018
  • Human emotion recognition is a research topic that receives continuous attention in the computer vision and artificial intelligence domains. This paper proposes a method for classifying human emotions with multiple neural networks based on multi-modal signals consisting of image, landmark, and audio data captured in a wild environment. The proposed method has the following features. First, the learning performance of the image-based network is greatly improved by employing both multi-task learning and semi-supervised learning that exploit the spatio-temporal characteristics of videos. Second, a model that converts one-dimensional (1D) facial landmark information into two-dimensional (2D) images is newly proposed, and a CNN-LSTM network based on this model is proposed for better emotion recognition. Third, based on the observation that audio signals are often very effective for specific emotions, we propose an audio deep learning mechanism robust to those emotions. Finally, so-called emotion adaptive fusion is applied to enable synergy among the multiple networks. The proposed network improves emotion classification performance by appropriately integrating existing supervised learning and semi-supervised learning networks. In the fifth attempt on the given test set in the EmotiW2017 challenge, the proposed method achieved a classification accuracy of 57.12%.
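
The abstract does not define its emotion adaptive fusion, so the sketch below shows one plausible reading only: each network gets a separate weight per emotion class, so that a network known to be strong for a given emotion contributes more to that class. The reduced label set, probabilities, weights, and the name `adaptive_fusion` are all hypothetical.

```python
from typing import Dict, List

EMOTIONS = ["angry", "happy", "sad"]  # hypothetical reduced label set


def adaptive_fusion(
    probs: Dict[str, List[float]],
    class_weights: Dict[str, List[float]],
) -> str:
    """Sum class probabilities across networks using per-class weights."""
    fused = [0.0] * len(EMOTIONS)
    for network, network_probs in probs.items():
        weights = class_weights[network]
        for i in range(len(EMOTIONS)):
            fused[i] += weights[i] * network_probs[i]
    best = max(range(len(EMOTIONS)), key=lambda i: fused[i])
    return EMOTIONS[best]


# Made-up softmax outputs from three networks for one video clip.
probs = {
    "image": [0.2, 0.6, 0.2],
    "landmark": [0.3, 0.4, 0.3],
    "audio": [0.5, 0.1, 0.4],
}
uniform = {name: [1.0, 1.0, 1.0] for name in probs}
# Assumption: the audio network is trusted more for "angry".
adaptive = dict(uniform, audio=[2.0, 0.5, 1.0])

plain_label = adaptive_fusion(probs, uniform)
adaptive_label = adaptive_fusion(probs, adaptive)
```

With uniform weights the image and landmark networks outvote the audio network; with the per-class weights the audio network's evidence for "angry" decides the result.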