• Title/Summary/Keyword: Image-to-Video

Efficient Intra Predictor Design for H.264/AVC Decoder (H.264/AVC 복호기를 위한 효율적인 인트라 예측기 설계)

  • Kim, Ok;Ryoo, Kwangki
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2009.10a
    • /
    • pp.175-178
    • /
    • 2009
  • H.264/AVC is a video coding standard of ITU-T and ISO/IEC whose applications are spreading widely because it offers high image quality and a compression ratio more than twice that of MPEG-2. This paper explains intra prediction in H.264/AVC, which achieves higher compression efficiency by removing the correlation between adjacent samples in the spatial domain, and proposes an efficient intra predictor architecture for an H.264/AVC decoder. The proposed system reduces computation cycles by using a processing element and a precomputation processing element, and reduces the number of external memory accesses by using registers efficiently. The design was written in Verilog-HDL and verified with suitable test vectors. The proposed intra predictor achieves about a 60% cycle reduction compared with existing intra predictors.
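
To make the idea of spatial intra prediction concrete, the following minimal Python sketch predicts a 4x4 block from its already-decoded neighbours using three of the standard H.264 intra 4x4 modes (vertical, horizontal, DC). It is only an illustration of the prediction arithmetic, not the hardware architecture proposed in the paper; the function names `intra4x4_predict` and `reconstruct` are hypothetical, and the DC case assumes both the top and left neighbours are available.

```python
def intra4x4_predict(top, left, mode):
    """Predict a 4x4 block from reconstructed neighbouring samples.

    top  -- the 4 samples directly above the block
    left -- the 4 samples directly to the left of the block
    mode -- "vertical", "horizontal" or "dc" (a subset of the H.264 modes)
    """
    if mode == "vertical":
        # Every row repeats the samples above the block.
        return [list(top) for _ in range(4)]
    if mode == "horizontal":
        # Every column repeats the samples to the left of the block.
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":
        # Rounded mean of the 8 neighbours (both edges assumed available).
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(f"unsupported mode: {mode}")


def reconstruct(pred, residual):
    """Add the decoded residual to the prediction and clip to 8-bit range."""
    return [
        [min(255, max(0, p + r)) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(pred, residual)
    ]
```

A decoder only has to transmit the mode and the (usually small) residual, which is where the compression gain from removing spatial correlation comes from.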

Considerations for Applying Korean Natural Language Processing Technology in Records Management (기록관리 분야에서 한국어 자연어 처리 기술을 적용하기 위한 고려사항)

  • Haklae, Kim
    • Journal of Korean Society of Archives and Records Management
    • /
    • v.22 no.4
    • /
    • pp.129-149
    • /
    • 2022
  • Records have temporal characteristics, including the past and present; linguistic characteristics not limited to a specific language; and various types categorized in a complex way. Processing records such as text, video, and audio across the life cycle of records' creation, preservation, and utilization entails exhaustive effort and cost. Primary natural language processing (NLP) technologies, such as machine translation, document summarization, named-entity recognition, and image recognition, can be widely applied to electronic records and to the digitization of analog records. In particular, Korean deep learning-based NLP technologies effectively recognize various record types and generate records management metadata. This paper provides an overview of Korean NLP technologies and discusses considerations for applying NLP technology in records management. The process of using NLP technologies such as machine translation and optical character recognition for the digital conversion of records is introduced through an example implemented in a Python environment. In addition, a plan to improve environmental factors and record digitization guidelines is proposed so that NLP technology can be applied in the records management field.

A Study on the ACC Safety Evaluation Method Using Dual Cameras (듀얼카메라를 활용한 ACC 안전성 평가 방법에 관한 연구)

  • Kim, Bong-Ju;Lee, Seon-Bong
    • Journal of Auto-vehicle Safety Association
    • /
    • v.14 no.2
    • /
    • pp.57-69
    • /
    • 2022
  • Recently, as interest in self-driving cars has increased worldwide, research and development on Advanced Driver Assistance Systems (ADAS) has been actively underway. Among these systems, Adaptive Cruise Control (ACC) aims to minimize driver fatigue by controlling the vehicle's longitudinal speed and relative distance. In this study, to investigate ACC testing in a real environment, real-road tests were conducted based on the domestic-road test scenarios proposed in a preceding study, taking the ISO 15622 test method into account. The dual-camera distance measurement method was verified by comparing its results with those of dedicated measurement equipment. Two results were derived. First, the relative distance after the ACC stabilized was compared: the minimum error rate was 0.251% in the first test of scenario 8 and the maximum was 4.202% in the third test of scenario 9. Second, results at the same time points were compared: the minimum error rate was 0.000% in the second test of scenario 10 and the maximum was 9.945% in the second test of scenario 1. However, the average error rate for all scenarios was within 3%. The main cause of the maximum error was attributed to the dual camera installed in the test vehicle: shaking caused by road surface vibration and air resistance during driving, changes in ambient brightness, and the video focusing process. Accordingly, the distance to the preceding vehicle calculated from the affected images was judged to be incorrect. Based on these test results, it is judged that using dual cameras alone can reduce the cost burden in the development stage of ADAS functions such as ACC.
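
The abstract does not state how the dual camera computes distance or how the error rate is defined. As a hedged illustration, the sketch below assumes the textbook stereo triangulation relation (distance = focal length × baseline / disparity) and a relative error measured against the reference equipment. The function names and parameter values are hypothetical and are not taken from the paper.

```python
def stereo_distance_m(focal_px, baseline_m, disparity_px):
    """Distance to a target from a rectified stereo pair.

    Assumes the standard pinhole relation Z = f * B / d, where f is the
    focal length in pixels, B the camera baseline in metres and d the
    horizontal disparity of the target in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def error_rate_percent(camera_m, reference_m):
    """Relative error of the camera estimate against the reference (%)."""
    if reference_m <= 0:
        raise ValueError("reference distance must be positive")
    return abs(camera_m - reference_m) / reference_m * 100.0
```

Because distance is inversely proportional to disparity, a disparity error of a pixel or two from vibration or defocus grows into a larger distance error at long range, which is consistent with the error sources listed in the abstract.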

Detection of Smoking Behavior in Images Using Deep Learning Technology (딥러닝 기술을 이용한 영상에서 흡연행위 검출)

  • Dong Jun Kim;Yu Jin Choi;Kyung Min Park;Ji Hyun Park;Jae-Moon Lee;Kitae Hwang;In Hwan Jung
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.23 no.4
    • /
    • pp.107-113
    • /
    • 2023
  • This paper proposes a method for detecting smoking behavior in images using artificial intelligence technology. Since smoking is not a static phenomenon but an action, object detection was combined with posture estimation, which can detect actions. A smoker detection model was trained to detect smokers in images, and the characteristics of smoking behavior were applied to posture estimation to detect smoking behavior in images. YOLOv8 was used for object detection, and OpenPose was used for posture estimation. In addition, for images that include both smokers and non-smokers, a method of separating only the people was applied. The proposed method was implemented in Python on a Google Colab NVIDIA Tesla T4 GPU, and testing showed that smoking behavior was perfectly detected in the given video.

An Embedded Text Index System for Mass Flash Memory (대용량 플래시 메모리를 위한 임베디드 텍스트 인덱스 시스템)

  • Yun, Sang-Hun;Cho, Haeng-Rae
    • Journal of the Korea Society of Computer and Information
    • /
    • v.14 no.6
    • /
    • pp.1-10
    • /
    • 2009
  • Flash memory has the advantages of nonvolatility, low power consumption, light weight, and high endurance. These make flash memory suitable as storage for mobile computing devices such as the PMP (Portable Multimedia Player). A portable device with mass flash memory can store various multimedia data such as video, audio, and images. Typical index systems for mobile computers are inefficient at searching text such as lyrics or titles. In this paper, we propose a new text index system named EMTEX (Embedded Text Index). EMTEX has the following salient features. First, it uses a compression algorithm designed for embedded systems. Second, when an insert or delete operation is executed on the base table, EMTEX updates the text index immediately. Third, EMTEX takes the characteristics of flash memory into account in the design of its insert, delete, and rebuild operations on the text index. Finally, EMTEX runs as a layer above the DBMS and is therefore independent of the underlying DBMS. We evaluate the performance of EMTEX, and the experimental results show that it can outperform conventional index systems such as Oracle Text and FT3.
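
The "immediate update" behaviour described above can be pictured with a small in-memory inverted index that is kept in sync on every insert and delete. This is a sketch under simplifying assumptions: the class name `TinyTextIndex` is hypothetical, tokens are split on whitespace, and the compression and flash-aware write handling that characterize EMTEX are not modelled.

```python
from collections import defaultdict


class TinyTextIndex:
    """Toy inverted index updated immediately on insert and delete."""

    def __init__(self):
        self._postings = defaultdict(set)  # token -> set of row ids
        self._rows = {}                    # row id -> indexed text

    @staticmethod
    def _tokens(text):
        # Simplistic tokenizer: lower-case, whitespace-separated words.
        return set(text.lower().split())

    def insert(self, row_id, text):
        """Index a row; re-inserting an existing id replaces its text."""
        if row_id in self._rows:
            self.delete(row_id)
        self._rows[row_id] = text
        for tok in self._tokens(text):
            self._postings[tok].add(row_id)

    def delete(self, row_id):
        """Remove a row and drop posting lists that become empty."""
        text = self._rows.pop(row_id, None)
        if text is None:
            return
        for tok in self._tokens(text):
            ids = self._postings[tok]
            ids.discard(row_id)
            if not ids:
                del self._postings[tok]

    def search(self, word):
        """Return the sorted ids of rows containing the word."""
        return sorted(self._postings.get(word.lower(), ()))
```

Keeping the index separate from the row store mirrors the abstract's point that the index sits above the DBMS: the base table only needs to call `insert` and `delete` when it changes.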

The Algorithm Improved the Speed for the 3-Dimensional CT Video Composition (3D CT 동영상 구성을 위한 속도 개선 알고리즘)

  • Jeong, Chan-Woong;Park, Jin-Woo;Jun, Kyu-Suk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.2
    • /
    • pp.141-147
    • /
    • 2009
  • This paper presents a new fast algorithm, the rotation-based method (RBM), for reconstructing 3-dimensional images in a cone-beam computerized tomography (CB CT) system. A cone-beam system requires less radiation exposure time than a fan-beam system. Three-pass shear matrices (TPSM) are applied because they require fewer transcendental function evaluations than the one-pass shear method, which reduces computation time. To evaluate the quality of the 3-D images and the reconstruction time, a second set of 3-D images was reconstructed with the Radon transform under the same conditions. In terms of quality, the Radon transform images were slightly better than those from RBM. In terms of reconstruction time, however, the RBM algorithm was 35 times faster than the Radon transform. The algorithm delivered 4 to 5 frames per second, which means that real-time reconstruction of 3-D dynamic images should be possible.
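
The abstract does not give the shear matrices themselves. The sketch below assumes the well-known three-shear decomposition of a 2-D rotation, in which a rotation by θ equals a horizontal shear by −tan(θ/2), a vertical shear by sin θ, and a second horizontal shear by −tan(θ/2). The function name `shear_rotate_point` is hypothetical, and whether the paper uses exactly these coefficients is an assumption.

```python
import math


def shear_rotate_point(x, y, theta):
    """Rotate (x, y) counter-clockwise by theta using three shears.

    Only two transcendental values are needed per angle (tan and sin);
    each pass is then a multiply-add along one axis.
    """
    a = -math.tan(theta / 2.0)  # horizontal shear factor
    b = math.sin(theta)         # vertical shear factor
    x = x + a * y               # pass 1: shear along x
    y = y + b * x               # pass 2: shear along y
    x = x + a * y               # pass 3: shear along x
    return x, y
```

Since `a` and `b` depend only on the angle, they can be computed once per projection and reused for every pixel, which is one way to read the claim that the three-pass form needs fewer transcendental function evaluations.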

Implementation of AESA Radar Integration Analysis System by using Heterogeneous Media

  • Min-Jung Kang
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.3
    • /
    • pp.117-125
    • /
    • 2024
  • This paper proposes and implements an Active Electronically Scanned Array (AESA) radar integration analysis system, specialized for radar development, that uses heterogeneous media. Analysis systems are mostly used to analyze and correct the causes of defects, which makes testing easier. However, previous log analysis systems that operate only on text are not intuitive, and when there is a lot of log information it is difficult to find what the user wants at once, so there are limits to analyzing the cause when an equipment defect occurs. The analysis system in this paper therefore uses heterogeneous media, defined here as recording text-based data, displaying data as images or video, and visualizing data. The proposed system classifies and stores data transmitted and received between radar devices, radar target detection and tracking algorithm data, and so on, and also displays and visualizes radar operation results and equipment defect information in real time. With this system, users can quickly obtain the information they want, which assists in developing high-quality radar.

Development for Analysis Service of Crowd Density in CCTV Video using YOLOv4 (YOLOv4를 이용한 CCTV 영상 내 군중 밀집도 분석 서비스 개발)

  • Seung-Yeon Hwang;Jeong-Joon Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.24 no.3
    • /
    • pp.177-182
    • /
    • 2024
  • The purpose of this paper is to predict and prevent crowd-concentration risks in advance, motivated by the Itaewon crowd crush in Korea on October 29, 2022. With a single CCTV, an administrator can assess the current situation in real time but cannot watch the screen all day. Objects are therefore detected using YOLOv4 trained on images taken from CCTV angles, and safety accidents caused by crowd concentration are prevented by sending a notification when the number of clusters exceeds a set limit. YOLOv4 was chosen because it offers higher accuracy and faster speed than previous YOLO models, which makes object detection easier. The service is to be tested with CCTV image data registered on the AI-Hub site. The number of CCTVs in Korea has grown exponentially, and if the service is applied to actual CCTVs, it is expected to help prevent various accidents, including those caused by crowd concentration.
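
The notification rule can be sketched independently of the detector. The example below assumes that each frame's detections arrive as `(label, confidence)` pairs and that a simple count of detected people stands in for the paper's "number of clusters". The function names, the confidence cut-off, and the limit are all hypothetical, and no YOLOv4 API is called.

```python
def count_people(detections, min_confidence=0.5):
    """Count 'person' detections at or above the confidence cut-off.

    detections -- iterable of (label, confidence) pairs for one frame,
                  as they might be produced by an object detector.
    """
    return sum(
        1
        for label, confidence in detections
        if label == "person" and confidence >= min_confidence
    )


def crowd_alert(detections, max_people, min_confidence=0.5):
    """Return True when the frame holds more people than the set limit."""
    return count_people(detections, min_confidence) > max_people
```

In a deployed service the limit would presumably depend on the area covered by each camera, so a per-camera setting is a more natural fit than a single global constant.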

Implementation of Radiotherapy Educational Contents Using Virtual Reality (가상현실 기술을 활용한 방사선치료 교육 콘텐츠 제작 구현)

  • Kwon, Soon-Mu;Shim, Jae-Goo;Chon, Kwon-Su
    • Journal of the Korean Society of Radiology
    • /
    • v.12 no.3
    • /
    • pp.409-415
    • /
    • 2018
  • The development of smart devices has brought significant changes to daily life, and one of the most significant is virtual reality. Virtual reality is a technology that uses a display device to present pre-built high-resolution 3D imagery so that the user has the illusion of being in the scene itself. Subjects that cannot be practiced hands-on must rely on audiovisual materials, which lowers concentration during practice and the quality of classes. This study used virtual reality to develop effective teaching materials for radiology students. To produce the virtual reality video, a radiology clinic was selected and filming was conducted twice between July and September 2017. The video was produced with the radiology workflow chart in mind, and filming was carried out in two separate locations: the computerized tomography unit and the LINAC room. The scenario and the filming route were checked before filming to facilitate editing of the video. Modeling and mapping were performed in a PC environment using the Windows XP operating system. Footage was shot with two GoPro Hero cameras, a leading virtual reality camera, and produced in 4K UHD with Adobe software at an 8-megapixel resolution of 3,840×2,160 / 4,096×2,160. Total playback time was kept to about 5 minutes to prevent vomiting and dizziness while using virtual reality. The virtual reality radiotherapy educational content developed so far is being promoted so that it can secure a market and be used by various institutions. The researchers will investigate the level of satisfaction with the virtual reality radiotherapy educational content and carry out supplementary work depending on the results.

Intensity Compensation for Efficient Stereo Image Compression (효율적인 스테레오 영상 압축을 위한 밝기차 보상)

  • Jeon Youngtak;Jeon Byeungwoo
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.42 no.2 s.302
    • /
    • pp.101-112
    • /
    • 2005
  • Just as we perceive the world as 3-dimensional through our two eyes, we can extract 3-dimensional information from stereo images obtained from two or more cameras. Since stereo images contain a large amount of data, efficient compression algorithms have been developed for them alongside recent advances in digital video coding technology. To compress stereo images and to obtain 3-D information such as depth, disparity vectors are found using a disparity estimation algorithm that generally relies on pixel differences between stereo pairs. However, it is not unusual for stereo images to have different intensity values for several reasons, such as incorrect control of the iris of each camera, mismatched focus between the two cameras, orientation, position, and differing characteristics of the CCD (charge-coupled device) cameras. Intensity differences between stereo pairs often cause undesirable problems such as incorrect disparity vectors and consequently low coding efficiency. By compensating for intensity differences between the left and right images, we can obtain higher coding efficiency and hopefully reduce the perceptual burden on the brain of combining different information coming from the two eyes. We propose several intensity compensation methods, namely local, global, and hierarchical intensity compensation, as very simple and efficient preprocessing tools. Experimental results show that the proposed algorithm provides a significant improvement in coding efficiency.
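
The abstract names global intensity compensation without giving its formula. One common way to realize the idea, shown here purely as an assumption, is to apply a single gain and offset to one view so that its mean and standard deviation match the other view before disparity estimation. The function name `global_intensity_compensate` is hypothetical, and images are plain nested lists of 8-bit grey levels.

```python
from statistics import mean, pstdev


def global_intensity_compensate(target, reference):
    """Match the global mean and spread of `target` to `reference`.

    Both images are lists of rows of 8-bit grey levels. One gain and one
    offset are applied to every pixel of `target` (assumed formula:
    out = (p - mean_t) * sd_r / sd_t + mean_r), then clipped to 0..255.
    """
    t_pixels = [p for row in target for p in row]
    r_pixels = [p for row in reference for p in row]
    mean_t, mean_r = mean(t_pixels), mean(r_pixels)
    sd_t, sd_r = pstdev(t_pixels), pstdev(r_pixels)
    # A flat target image has no spread to rescale; keep the gain at 1.
    gain = sd_r / sd_t if sd_t > 0 else 1.0
    return [
        [min(255, max(0, round((p - mean_t) * gain + mean_r))) for p in row]
        for row in target
    ]
```

A local variant would compute the same statistics per block instead of per image, and a hierarchical variant could apply the global step first and then refine per block; both readings are guesses at what the abstract's terms mean.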