• Title/Abstract/Keyword: video language model


Three Dimensional Target Volume Reconstruction from Multiple Projection Images (다중투사영상을 이용한 표적체적의 3차원 재구성)

  • 정광호;진호상;이형구;최보영;서태석
    • Progress in Medical Physics
    • /
    • v.14 no.3
    • /
    • pp.167-174
    • /
    • 2003
  • In the radiation treatment planning (RTP) process, especially for stereotactic radiosurgery (SRS), knowing the exact volume and shape and the precise position of a lesion is very important. Sometimes X-ray projection images, such as angiograms, become the best choice for lesion identification. However, while the exact target position can be acquired from bi-projection images, 3D target reconstruction from bi-projection images is considered to be impossible. The aim of this study was to reconstruct the 3D target volume from multiple projection images. It was assumed that we knew the exact target position in advance, and all processes were performed in Target Coordinates, where the origin was the center of the target. We used six projections: two projections were used to make a Reconstruction Box and four projections were used for image acquisition. The Reconstruction Box was made up of voxels in a 3D matrix. Projection images were transformed into 3D in this virtual box using a geometric back-projection method. The resolution and the accuracy of the reconstructed target volume were dependent on the target size. The algorithm was applied to an ellipsoid model and a horseshoe-shaped model. Projection images were created geometrically using the C programming language, and reconstruction was also performed using the C programming language and Matlab ver. 6 (The MathWorks Inc., USA). For the ellipsoid model, the reconstructed volume was slightly overestimated, but the target shape and position proved to be correct. For the horseshoe-shaped model, the reconstructed volume was somewhat different from the original target model, but there was a considerable improvement in determining the target volume.

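The geometric back-projection described in this abstract can be pictured as a silhouette-carving procedure: each voxel of a Reconstruction Box centered on the target is kept only if it back-projects inside the target silhouette in every projection image. The sketch below is only a minimal illustration of that idea, assuming binary silhouettes and caller-supplied projection functions; it is not the authors' implementation.

```python
import numpy as np

def carve_target_volume(silhouettes, projectors, box_size=64, half_extent=1.0):
    """Carve a voxel Reconstruction Box from binary projection silhouettes.

    silhouettes : list of 2D boolean arrays (True where the target is seen)
    projectors  : list of callables mapping (N, 3) voxel centers in Target
                  Coordinates (origin = target center) to integer (rows, cols)
    """
    axis = np.linspace(-half_extent, half_extent, box_size)
    xs, ys, zs = np.meshgrid(axis, axis, axis, indexing="ij")
    voxels = np.stack([xs.ravel(), ys.ravel(), zs.ravel()], axis=1)

    inside = np.ones(len(voxels), dtype=bool)
    for sil, project in zip(silhouettes, projectors):
        rows, cols = project(voxels)              # back-project voxel centers
        ok = (rows >= 0) & (rows < sil.shape[0]) & (cols >= 0) & (cols < sil.shape[1])
        hit = np.zeros(len(voxels), dtype=bool)
        hit[ok] = sil[rows[ok], cols[ok]]
        inside &= hit                             # keep voxels seen in every view

    return inside.reshape(box_size, box_size, box_size)  # boolean target volume
```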

ORMN: A Deep Neural Network Model for Referring Expression Comprehension (ORMN: 참조 표현 이해를 위한 심층 신경망 모델)

  • Shin, Donghyeop;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.7 no.2
    • /
    • pp.69-76
    • /
    • 2018
  • Referring expressions are natural language constructions used to identify particular objects within a scene. In this paper, we propose a new deep neural network model for referring expression comprehension. The proposed model localizes the region of the referred object in the given image by making use of the rich information about the referred object itself, the context object, and the relationship with the context object mentioned in the referring expression. In the proposed model, the object matching score and the relationship matching score are combined to compute the fitness score of each candidate region according to the structure of the referring expression sentence. Therefore, the proposed model consists of four different sub-networks: a Language Representation Network (LRN), an Object Matching Network (OMN), a Relationship Matching Network (RMN), and a Weighted Composition Network (WCN). We demonstrate that our model achieves state-of-the-art comprehension results on three referring expression datasets.
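As a rough illustration of the weighted composition step described in this abstract, the sketch below combines an object matching score and a relationship matching score per candidate region with weights reflecting the expression structure. The score values and weights are made-up placeholders, not outputs of the paper's actual sub-networks.

```python
import numpy as np

def fitness_scores(obj_scores, rel_scores, w_obj, w_rel):
    """Combine per-region matching scores into one fitness score per candidate.

    obj_scores : (R,) object matching scores for R candidate regions
    rel_scores : (R,) relationship matching scores w.r.t. the context object
    w_obj, w_rel : weights derived from the structure of the referring expression
    """
    return w_obj * obj_scores + w_rel * rel_scores

# Pick the region that best fits the whole expression (toy numbers).
scores = fitness_scores(np.array([0.2, 0.9, 0.4]),
                        np.array([0.6, 0.7, 0.1]), w_obj=0.7, w_rel=0.3)
best_region = int(np.argmax(scores))
```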

A study on the lip shape recognition algorithm using 3-D Model (3차원 모델을 이용한 입모양 인식 알고리즘에 관한 연구)

  • 김동수;남기환;한준희;배철수;나상동
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 1998.11a
    • /
    • pp.181-185
    • /
    • 1998
  • Recently, communication systems have moved toward using the speaker's face image together with the voice data, since this provides a higher recognition rate than voice data alone. We therefore present a method of lipreading in speech image sequences using a 3-D facial shape model. The method uses feature information of the face image such as the opening level of the lips, the movement of the jaw, and the projection height of the lips. First, we fit the 3-D face model to the speaking face image sequence. Then, to obtain the feature information, we compute the variation of the fitted 3-D shape model across the image sequence and use this variation as the recognition parameters. We use the intensity gradient values obtained from the variation of the 3-D feature points to separate recognition units from the sequential images. In the recognition process we then apply a discrete HMM algorithm that uses multiple observation sequences reflecting the variation of the 3-D feature points. In a recognition experiment with 8 Korean vowels and 2 Korean consonants, we obtained a recognition rate of about 80% for the plosives and vowels.

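The recognition step in this abstract relies on discrete HMMs scored over observation sequences. As a minimal sketch of that scoring, the scaled forward algorithm below computes the log-likelihood of a discrete observation sequence under one HMM; in a lipreading setup, one such model per recognition unit would be trained and the highest-scoring model chosen. The parameter shapes are generic assumptions, not the paper's configuration.

```python
import numpy as np

def forward_log_likelihood(obs, start_p, trans_p, emit_p):
    """Scaled forward algorithm for a discrete-observation HMM.

    obs     : sequence of observation symbol indices (length T)
    start_p : (S,)   initial state probabilities
    trans_p : (S, S) state transition probabilities
    emit_p  : (S, V) emission probabilities over V discrete symbols
    """
    alpha = start_p * emit_p[:, obs[0]]
    c = alpha.sum()                      # scaling factor avoids underflow
    alpha /= c
    log_lik = np.log(c)
    for o in obs[1:]:
        alpha = (alpha @ trans_p) * emit_p[:, o]
        c = alpha.sum()
        alpha /= c
        log_lik += np.log(c)
    return log_lik

# Classification: evaluate each unit's HMM on the sequence and keep the best score.
```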

Tensorflow Model Environment with JavaCv for Mobile Devices (모바일을 위한 JavaCv를 이용한 Tensoflow모델 구동환경 개발)

  • Park, JinSang;Oh, SangGwon;Lee, SeongJin
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2020.01a
    • /
    • pp.23-24
    • /
    • 2020
  • Many studies are currently under way on running deep learning models not only in the PC environment but also in mobile and embedded environments. In this study, we implement an environment for running a trained deep learning model in Java in order to improve development accessibility. When OpenCV is used for image and video processing, documentation for the C++ API is widespread, whereas documentation for the JavaCv API is not. However, given the nature of the mobile development environment, code written in Java can be brought into Android Studio as-is, which makes development easier. We developed the overall image processing pipeline and working environment for running the model.


Scene Graph Generation with Graph Neural Network and Multimodal Context (그래프 신경망과 멀티 모달 맥락 정보를 이용한 장면 그래프 생성)

  • Jung, Ga-Young;Kim, In-cheol
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2020.05a
    • /
    • pp.555-558
    • /
    • 2020
  • In this paper, we propose a new deep neural network model that effectively detects the various objects in an input image and the relationships between them, and represents them as a single scene graph. To detect objects and relationships effectively, the proposed model makes use of diverse multimodal context information, including not only convolutional-neural-network-based visual context features but also linguistic context features. In addition, the proposed model embeds the context information with a graph neural network so that the interdependence between two related objects is sufficiently reflected in the graph node feature values. We demonstrate the effectiveness and performance of the proposed model through comparative experiments on the Visual Genome benchmark dataset.
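As a loose illustration of the graph-embedding step mentioned in this abstract, the sketch below runs one round of mean-aggregation message passing over object nodes so that each node's features mix with those of the objects it may be related to. This is a generic message-passing layer with placeholder weight matrices, not the paper's actual network.

```python
import numpy as np

def message_passing(node_feats, adj, w_self, w_nbr):
    """One round of mean-aggregation message passing over a scene graph.

    node_feats : (N, D) per-object context features (visual + linguistic)
    adj        : (N, N) binary adjacency over candidate object relationships
    w_self     : (D, D) projection applied to a node's own features
    w_nbr      : (D, D) projection applied to the aggregated neighbor features
    """
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    nbr_mean = (adj @ node_feats) / deg                  # average neighbor features
    return np.maximum(node_feats @ w_self + nbr_mean @ w_nbr, 0.0)  # ReLU
```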

A Study on the Construction of a Real-time Sign-language Communication System between Korean and Japanese Using 3D Model on the Internet (인터넷상에 3차원 모델을 이용한 한-일간 실시간 수화 통신 시스템의 구축을 위한 기초적인 검토)

  • Kim, Sang-Woon;Oh, Ji-Young;Aoki, Yoshinao
    • Journal of the Korean Institute of Telematics and Electronics S
    • /
    • v.36S no.7
    • /
    • pp.71-80
    • /
    • 1999
  • Sign-language communication can be a useful way of exchanging messages between people who use different languages. In this paper, we report an experimental study on the construction of a Korean-Japanese sign-language communication system using a 3D model. For real-time communication, we introduced an intelligent communication method and built the system as a client-server architecture on the Internet. A character model is stored in advance on the clients, and a series of animation parameters is sent instead of real image data. The input sentence is converted into a series of parameters of Korean sign language or Japanese sign language at the server. The parameters are transmitted to the clients and used to generate the animation. We also employ emotional expressions, a variable frame allocation method, and cubic spline interpolation to enhance the realism of the animation. The proposed system is implemented with Visual C++ and the Open Inventor library on the Windows platform. Experimental results show the possibility that the system could be used as a non-verbal means of communication beyond the linguistic barrier.

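The cubic spline interpolation mentioned in this abstract can be sketched as below: keyframe animation parameters are interpolated into however many frames the variable frame allocation assigns to a motion segment, and only those parameters (not image data) would be sent to the client. The function name and array shapes are assumptions made for illustration.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def interpolate_sign_frames(key_times, key_params, n_frames):
    """Interpolate sign-language animation parameters between keyframe poses.

    key_times  : (K,)   times of the keyframe poses
    key_params : (K, P) animation parameters (e.g. joint angles) per keyframe
    n_frames   : number of frames allocated to this motion segment
    """
    spline = CubicSpline(key_times, key_params, axis=0)
    frame_times = np.linspace(key_times[0], key_times[-1], n_frames)
    return spline(frame_times)          # (n_frames, P) parameters for the client
```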

Exploring Narrative Intelligence in AI: Implications for the Evolution of Homo narrans (인공지능의 서사 지능 탐구 : 새로운 서사 생태계와 호모 나랜스의 진화)

  • Hochang Kwon
    • Trans-
    • /
    • v.16
    • /
    • pp.107-133
    • /
    • 2024
  • Narratives are fundamental to human cognition and social culture, serving as the primary means by which individuals and societies construct meaning, share experiences, and convey cultural and moral values. The field of artificial intelligence, which seeks to mimic human thought and behavior, has long studied story generation and story understanding, and today's Large Language Models are demonstrating remarkable narrative capabilities based on advances in natural language processing. This situation brings about a variety of changes and new issues, but a comprehensive discussion of them is hard to find. This paper aims to provide a holistic view of the current state and future changes by exploring the intersections and interactions of human and AI narrative intelligence. It begins with a review of multidisciplinary research on the intrinsic relationship between humans and narrative, represented by the term Homo narrans, and then provides a historical overview of how narrative has been studied in the field of AI. The paper then explores the possibilities and limitations of narrative intelligence as revealed by today's Large Language Models, and presents three philosophical challenges for understanding the implications of AI with narrative intelligence.

Design of Translator for generating Secure Java Bytecode from Thread code of Multithreaded Models (다중스레드 모델의 스레드 코드를 안전한 자바 바이트코드로 변환하기 위한 번역기 설계)

  • 김기태;유원희
    • Proceedings of the Korea Society for Industrial Systems Conference
    • /
    • 2002.06a
    • /
    • pp.148-155
    • /
    • 2002
  • Multithreaded models improve the efficiency of parallel systems by combining inner parallelism, asynchronous data availability, and the locality of the von Neumann model. Such a model executes thread code that is generated by a compiler and whose quality depends on the method of generation. However, multithreaded models have the drawback that the execution model is restricted to a specific platform. Java, on the contrary, is platform independent, so if we can translate thread code into Java bytecode, we can use the advantages of multithreaded models on many platforms. Java executes Java bytecode, an intermediate language format for the Java virtual machine. In the translator, Java bytecode plays the role of an intermediate language and the Java virtual machine works as the back-end. However, Java bytecode translated from multithreaded models has the drawback that it is not secure. In this paper, thread code of the multithreaded model, which is platform independent, is made executable on the Java virtual machine. We design and implement a translator that translates thread code of multithreaded models into Java bytecode and checks security problems in the resulting Java bytecode.


A Study on Deep Learning Based RobotArm System (딥러닝 기반의 로봇팔 시스템 연구)

  • Shin, Jun-Ho;Shim, Gyu-Seok
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2020.11a
    • /
    • pp.901-904
    • /
    • 2020
  • This system is built by combining models in three stages. In the first stage, human speech is converted into text, and a natural language processing algorithm that can understand human commands is constructed using a bag-of-words (BoW) approach to classify the user's utterance intent. Next, a real-time image processing model based on YOLOv3-tiny and an OctoMapping model are used to generate a 3D map of the surrounding environment, and a manipulator control algorithm that operates on the map data is organized under a supervisor system built on ROS actionlib. In this way we propose a convenient human-robot interaction system that uses ROS and deep learning.
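The BoW intent classification mentioned in this abstract can be sketched with an off-the-shelf pipeline: utterances are turned into bag-of-words count vectors and a simple classifier maps them to intent labels. The utterances, labels, and the choice of Naive Bayes below are illustrative assumptions, not the authors' setup.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy command utterances and intent labels (placeholders, not the paper's data).
utterances = ["pick up the red cup", "put the cup on the table",
              "move to the shelf", "stop moving"]
intents = ["grasp", "place", "navigate", "halt"]

# Bag-of-words features + Naive Bayes classifier over utterance intents.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(utterances, intents)

print(clf.predict(["please pick the cup up"]))  # classify a new spoken command
```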

Digital Rights Management Implementation Using XrML (XrML을 이용한 디지털 저작권 관리구현)

  • Park, Jeong-Hui;Lee, Gi-Dong
    • 한국디지털정책학회:학술대회논문집
    • /
    • 2005.11a
    • /
    • pp.523-530
    • /
    • 2005
  • The development of e-commerce is replacing traditional commerce with a new market structure based on IT technology. In particular, commerce in digital contents such as software, games, and video over the Internet is increasing rapidly, so a system that can provide trust in the distribution of such digital contents is needed, namely a digital rights management (DRM) system, or digital intellectual property management system, together with a corresponding distribution business model. XrML is a language for specifying rights, and in this study we present a prototype system that implements the rights structure needed for rights management using XrML. It specifies the rights and conditions under which contents and the associated services can be used. XrML is currently the most widely used rights language for digital rights management. XrML is a rights description language developed by ContentGuard, which is promoting it as a worldwide industry standard by expanding its partner companies, extending its functionality, and distributing it in a free and open format.
