• Title/Summary/Keyword: Real Time Image Processing

R-lambda Model based Rate Control for GOP Parallel Coding in A Real-Time HEVC Software Encoder (HEVC 실시간 소프트웨어 인코더에서 GOP 병렬 부호화를 지원하는 R-lambda 모델 기반의 율 제어 방법)

  • Kim, Dae-Eun;Chang, Yongjun;Kim, Munchurl;Lim, Woong;Kim, Hui Yong;Seok, Jin Wook
    • Journal of Broadcast Engineering / v.22 no.2 / pp.193-206 / 2017
  • In this paper, we propose a rate control method based on the $R$-$\lambda$ model that supports parallel encoding at the GOP level or IDR period level for real-time encoding of 4K UHD input video. To this end, a slice-level bit allocation method is proposed for parallel rather than sequential encoding. When a rate control algorithm is applied under GOP-level or IDR-period-level parallelism, information on how many bits have been consumed cannot be shared among frames belonging to the same frame level, except at the lowest frame level of the hierarchical B structure. It is therefore impossible to manage the bit budget with the existing bit allocation method. To solve this problem, we modify the conventional bit allocation procedure, which allocates target bits sequentially in encoding order. Specifically, the proposed strategy first assigns target bits to GOPs and then distributes each GOP's target bits from the lowest depth level to the highest depth level of the HEVC hierarchical B structure. In addition, we propose a preprocessing method that improves subjective image quality by allocating bits according to the coding complexity of the frames. Experimental results show that the proposed bit allocation method works well for frame-level parallel HEVC software encoders, and confirm that the performance of our rate controller can be improved with a more elaborate bit allocation strategy that uses the preprocessing results.
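
The two-step allocation described above (GOPs first, then depth levels within each GOP) can be sketched roughly as follows. This is only an illustration under stated assumptions: the even split across GOPs, the function names, and the per-depth weights are hypothetical and are not taken from the paper.

```python
def allocate_gop_bits(total_bits, num_gops):
    """Step 1: fix each GOP's budget up front (assumed here to be an even split),
    so GOPs encoded in parallel never need each other's consumed-bit counts."""
    return [total_bits / num_gops] * num_gops


def allocate_level_bits(gop_bits, frames_per_level, level_weights):
    """Step 2: distribute one GOP's budget over depth levels, lowest depth first.

    frames_per_level[d] is the number of frames at depth d of the hierarchical
    B structure; level_weights[d] is a hypothetical per-frame weight for depth d.
    Returns the per-frame target bits for each depth level.
    """
    total_weight = sum(n * w for n, w in zip(frames_per_level, level_weights))
    return [gop_bits * w / total_weight for w in level_weights]


# Example: a GOP of 8 frames with 1, 1, 2 and 4 frames at depths 0..3.
per_gop = allocate_gop_bits(1_000_000, 5)
per_frame = allocate_level_bits(per_gop[0], [1, 1, 2, 4], [8, 4, 2, 1])
print(per_frame)
```

Because every target is derived from the GOP budget alone, frames at the same depth can be encoded concurrently without waiting for feedback from one another.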

A Real-Time Head Tracking Algorithm Using Mean-Shift Color Convergence and Shape Based Refinement (Mean-Shift의 색 수렴성과 모양 기반의 재조정을 이용한 실시간 머리 추적 알고리즘)

  • Jeong Dong-Gil;Kang Dong-Goo;Yang Yu Kyung;Ra Jong Beom
    • Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.6 / pp.1-8 / 2005
  • In this paper, we propose a two-stage head tracking algorithm suitable for a real-time active camera system with pan-tilt-zoom functions. In the color convergence stage, we first assume that the shape of a head is an ellipse and that its model color histogram is acquired in advance. Then, the mean-shift method is applied to roughly estimate the target position by examining the histogram similarity between the model and a candidate ellipse. To reflect temporal changes in object color and enhance the reliability of mean-shift based tracking, the target histogram obtained in the previous frame is used to update the model histogram. In the updating process, to alleviate error accumulation due to outliers in the target ellipse of the previous frame, the target histogram of the previous frame is obtained within an ellipse that is adaptively shrunk on the basis of the model histogram. In addition, to further enhance tracking reliability, we set the initial position closer to the true position by compensating for the global motion, which is rapidly estimated from two 1-D projection datasets. In the subsequent stage, we refine the position and size of the ellipse obtained in the first stage by using shape information. Here, we define a robust shape-similarity function based on the gradient direction. Extensive experimental results show that the proposed algorithm performs head tracking well, even when a person moves fast, the head size changes drastically, or the background has many clusters and distracting colors. The proposed algorithm also performs tracking at a processing speed of about 30 fps on a standard PC.
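
As a rough illustration of the histogram comparison and model update mentioned above, the sketch below uses the Bhattacharyya coefficient, a similarity measure commonly paired with mean-shift tracking. The abstract does not name the measure it uses, so that choice, the blending factor `alpha`, and all function names are assumptions.

```python
import numpy as np


def normalize(hist):
    """Scale a histogram so that its bins sum to 1."""
    hist = np.asarray(hist, dtype=float)
    return hist / hist.sum()


def bhattacharyya(p, q):
    """Similarity of two normalized histograms: 1.0 for identical, 0.0 for disjoint."""
    return float(np.sum(np.sqrt(p * q)))


def update_model(model, target_prev, alpha=0.1):
    """Blend the previous frame's target histogram into the model histogram.

    alpha is a hypothetical update rate, not a value from the paper.
    """
    blended = (1.0 - alpha) * model + alpha * target_prev
    return blended / blended.sum()


model = normalize([4, 3, 2, 1])
candidate = normalize([1, 2, 3, 4])
print(bhattacharyya(model, candidate))
print(update_model(model, candidate))
```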

Lightweight Super-Resolution Network Based on Deep Learning using Information Distillation and Recursive Methods (정보 증류 및 재귀적인 방식을 이용한 심층 학습법 기반 경량화된 초해상도 네트워크)

  • Woo, Hee-Jo;Sim, Ji-Woo;Kim, Eung-Tae
    • Journal of Broadcast Engineering / v.27 no.3 / pp.378-390 / 2022
  • With the recent development of deep convolutional neural network learning, deep learning techniques applied to single-image super-resolution have shown good results, and the strong representational power of deep networks has enabled complex nonlinear mapping between low-resolution and high-resolution images. However, excessive use of convolutional neural networks increases the number of parameters and the amount of computation, which limits their application to real-time or low-power devices. This paper uses blocks that gradually extract hierarchical features through information distillation and proposes the Recursive Distillation Super Resolution Network (RDSRN), a lightweight network that improves performance by producing more accurate high-frequency components through high-frequency residual refinement blocks. It was confirmed that the proposed network restores images of similar quality to RDN while running 3.5 times faster with about 32 times fewer parameters and about 10 times less computation, and that it achieves 0.16 dB better performance with about 2.2 times fewer parameters and 1.8 times faster processing than the existing lightweight network CARN.

A preliminary study for development of an automatic incident detection system on CCTV in tunnels based on a machine learning algorithm (기계학습(machine learning) 기반 터널 영상유고 자동 감지 시스템 개발을 위한 사전검토 연구)

  • Shin, Hyu-Soung;Kim, Dong-Gyou;Yim, Min-Jin;Lee, Kyu-Beom;Oh, Young-Sup
    • Journal of Korean Tunnelling and Underground Space Association / v.19 no.1 / pp.95-107 / 2017
  • This preliminary study was undertaken to develop an automatic tunnel incident detection system based on a machine learning algorithm, which is intended to detect incidents taking place in tunnels in real time and to identify the type of each incident. Two road sites where CCTVs are operating were selected, and part of the CCTV footage was processed to produce training data sets. The data sets consist of position and time information of moving objects on the CCTV screen, extracted by detecting and tracking objects entering the screen with a conventional image processing technique. The data sets are matched with 6 categories of events, such as lane change and stopping, which are also included in the training data. The training data are learned by a resilience neural network with two hidden layers, and 9 architectural models are set up for parametric studies. Of these, the model with 300 nodes in the first hidden layer and 150 nodes in the second is found to be optimal, giving the highest accuracy on both the training data and test data not used for training. This study showed that highly variable and complex traffic and incident features can be identified well by machine learning without any hand-defined feature rules. In addition, the detection capability and accuracy of the machine learning based system are expected to improve automatically as more CCTV image data from tunnels are accumulated.

3D Facial Animation with Head Motion Estimation and Facial Expression Cloning (얼굴 모션 추정과 표정 복제에 의한 3차원 얼굴 애니메이션)

  • Kwon, Oh-Ryun;Chun, Jun-Chul
    • The KIPS Transactions:PartB / v.14B no.4 / pp.311-320 / 2007
  • This paper presents a vision-based 3D facial expression animation technique and system that provide robust 3D head pose estimation and real-time facial expression control. Much research on 3D face animation has addressed facial expression control itself rather than 3D head motion tracking. However, head motion tracking is one of the critical issues to be solved in developing realistic facial animation. In this research, we developed an integrated animation system that performs 3D head motion tracking and facial expression control at the same time. The proposed system consists of three major phases: face detection, 3D head motion tracking, and facial expression control. For face detection, using the non-parametric HT skin color model and template matching, we can efficiently detect the facial region in a video frame. For 3D head motion tracking, we exploit a cylindrical head model that is projected onto the initial head motion template. Given an initial reference template of the face image and the corresponding head motion, the cylindrical head model is created and the full head motion is traced using the optical flow method. For facial expression cloning, we use a feature-based method. The major facial feature points are detected from the geometric information of the face with template matching and traced by optical flow. Since the locations of the varying feature points combine head motion and facial expression information, the animation parameters that describe the variation of the facial features are acquired from a geometrically transformed frontal head pose image. Finally, facial expression cloning is done by two fitting processes. The control points of the 3D model are moved by applying the animation parameters to the face model, and the non-feature points around the control points are changed by means of a Radial Basis Function (RBF). The experiments show that the developed vision-based animation system can create realistic facial animation with robust head pose estimation and facial variation from input video.
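
The RBF step, in which non-feature points follow the moved control points, can be illustrated with a minimal interpolation sketch. The Gaussian kernel, the `sigma` value, and the function names are assumptions made for illustration; the abstract does not state which basis function or parameters the authors used.

```python
import numpy as np


def rbf_kernel(a, b, sigma):
    """Gaussian RBF values between every point in a and every point in b."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def propagate_displacements(control_pts, control_disp, other_pts, sigma=1.0):
    """Interpolate control-point displacements to surrounding non-feature points."""
    K = rbf_kernel(control_pts, control_pts, sigma)
    weights = np.linalg.solve(K, control_disp)  # one weight row per control point
    return rbf_kernel(other_pts, control_pts, sigma) @ weights


control = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
disp = np.array([[0.0, 0.1, 0.0], [0.0, 0.0, 0.0], [0.05, 0.0, 0.0]])
nearby = np.array([[0.2, 0.1, 0.0]])
print(propagate_displacements(control, disp, nearby))
```

By construction the interpolant reproduces the given displacement exactly at each control point, and points farther from every control point move less.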

A Framework for Digitalizing Handwritten Document using Digital Pen and Handwriting Recognition Technology (디지털펜과 필기체인식 기술을 이용한 수기문서 전자화 프레임워크)

  • Son, Bong-Ki;Kim, Hak-Joon
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.3 / pp.1417-1426 / 2011
  • Business still relies heavily on pen and paper for legal reasons or convenience. Handwritten documents must be converted into digital documents so that IT systems can manage and process them in real time. Because previous document digitalization systems convert handwritten documents into digital documents by scanning and post-processing them, it is difficult to carry out the work process seamlessly. This paper proposes LiveForm, a framework for digitalizing handwritten documents using a digital pen and handwriting recognition technology. To demonstrate the applicability of LiveForm, we also implement a LiveForm based service for an industrial gas distribution process and analyze the effects of the system. LiveForm generates a digital image identical to the handwritten document by recording the absolute coordinates of the digital pen as the paper is filled in, and converts the handwriting data to digital text so that the information can be inserted into a back-end system. The LiveForm based system eliminates both scanning for document digitalization and keyboard data entry into the back-end system in paper-based information gathering. LiveForm can therefore improve work processes in various business areas.

A Block based 3D Map for Recognizing Three Dimensional Spaces (3차원 공간의 인식을 위한 블록기반 3D맵)

  • Yi, Jong-Su;Kim, Jun-Seong
    • Journal of the Institute of Electronics Engineers of Korea CI / v.49 no.4 / pp.89-96 / 2012
  • A 3D map provides useful information for intelligent services. Traditional 3D maps, however, consist of raw image data and are not suitable for real-time applications. In this paper, we propose the Block-based 3D map, which represents three-dimensional spaces as a collection of square blocks. The Block-based 3D map has two major variables: an object ratio and a block size. The object ratio is defined as the proportion of object pixels to space pixels in a block and determines the type of the block. The block size is defined as the number of pixels along the side of a block and determines the size of the block. Experiments show the advantage of the Block-based 3D map in reducing noise and in reducing the amount of data to be processed. With a block size of $40 \times 40$ and an object ratio of 30% to 50%, we obtain the best-matching Block-based 3D map for a $320 \times 240$ depth map. The Block-based 3D map provides useful information that can enable a variety of new high-value services in intelligent environments.
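
The two variables can be made concrete with a small sketch that tiles a binary object mask into square blocks and labels each block by its object ratio. How object pixels are derived from the depth map is not described in the abstract, so the input mask is assumed to be given; the threshold of 0.4 is simply picked from the reported 30% to 50% range, and the function names are hypothetical.

```python
import numpy as np


def block_object_ratio(mask, block_size):
    """Fraction of object pixels in each block_size x block_size tile of a 2D mask."""
    h, w = mask.shape
    rows, cols = h // block_size, w // block_size
    trimmed = mask[:rows * block_size, :cols * block_size]
    tiles = trimmed.reshape(rows, block_size, cols, block_size)
    return tiles.mean(axis=(1, 3))


def classify_blocks(mask, block_size=40, ratio_threshold=0.4):
    """Mark a block as an object block when its object ratio reaches the threshold."""
    return block_object_ratio(mask, block_size) >= ratio_threshold


# A 320x240 depth map gives a 240-row by 320-column mask, i.e. 6 x 8 blocks of 40 pixels.
mask = np.zeros((240, 320), dtype=bool)
mask[0:40, 0:40] = True    # block (0, 0) fully covered by an object
mask[40:60, 0:40] = True   # block (1, 0) half covered
print(classify_blocks(mask).astype(int))
```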

An Algorithm for Segmenting the License Plate Region of a Vehicle Using a Color Model (차량번호판 색상모델에 의한 번호판 영역분할 알고리즘)

  • Jun Young-Min;Cha Jeong-Hee
    • Journal of the Institute of Electronics Engineers of Korea CI / v.43 no.2 s.308 / pp.21-32 / 2006
  • A license plate recognition (LPR) unit consists of the following core components: plate region segmentation, individual character extraction, and character recognition. Of these three components, the accuracy of plate region segmentation determines the overall recognition rate of the LPR unit. This paper proposes an algorithm for quickly and accurately segmenting the license plate region on the front or rear of a vehicle. The algorithm operates on images captured on site, where unmanned monitoring of illegal parking and stopping is performed, and takes a variety of roadway environments into account. To enhance segmentation performance on these on-site images, the proposed algorithm uses a mathematical model of license plate colors to convert color images into digital data. In addition, the algorithm uses Gaussian smoothing and a double threshold to eliminate image noise, one-pass boundary tracing to label regions, and the minimum bounding rectangle (MBR) to determine license plate region candidates. Individual characters are then extracted from the candidates, and the license plate region is segmented through a verification process. Conventional techniques cannot segment the license plate region when the frame of the plate is damaged; this study addresses that limitation while processing images in real time, which allows the proposed algorithm to be applied in practice.
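
The MBR step can be illustrated with a short sketch that computes the axis-aligned bounding rectangle of one labeled region and applies a simple shape check. The label image is assumed to come from an earlier labeling step, and the aspect-ratio limits and function names are hypothetical; the abstract does not say what criteria the authors apply to the MBR.

```python
import numpy as np


def minimum_bounding_rectangle(labels, region_id):
    """Return (x_min, y_min, x_max, y_max) of a labeled region, or None if absent."""
    ys, xs = np.nonzero(labels == region_id)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def is_plate_candidate(box, min_aspect=2.0, max_aspect=6.0):
    """Keep boxes whose width/height ratio looks plate-like (assumed limits)."""
    x0, y0, x1, y1 = box
    width, height = x1 - x0 + 1, y1 - y0 + 1
    return min_aspect <= width / height <= max_aspect


labels = np.zeros((10, 20), dtype=int)
labels[2:5, 3:15] = 1  # a 12-pixel-wide, 3-pixel-high region
box = minimum_bounding_rectangle(labels, 1)
print(box, is_plate_candidate(box))
```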

Application of Traffic Conflict Decision Criteria for Signalized Intersections Using an Individual Vehicle Tracking Technique (개별차량 추적기법을 이용한 신호교차로 교통상충 판단기준 정립 및 적용)

  • Kim, Myung-Seob;Oh, Ju-Taek;Kim, Eung-Cheol;Jung, Dong-Woo
    • Journal of Korean Society of Transportation / v.26 no.4 / pp.173-184 / 2008
  • An accident estimation model based on accident data can be developed only after accidents have occurred. However, collecting historical accident data is not easy, and there are differences between real accident data and police-reported accident data. Another shortcoming is that historical traffic accident data do not adequately consider driver behavior or intersection characteristics. A new method is needed that can predict accident occurrences in order to improve traffic safety in black spots. Traffic conflict decision techniques can acquire and analyze data in time and space, and they require less data collection through field investigation. However, they also have shortcomings: because existing traffic conflict techniques do not operate automatically, the analyst's opinion can easily affect the study results. Existing methods also do not consider the severity of traffic conflicts. In this study, the authors present traffic conflict decision criteria that consider conflict severity, including criteria for opposing left-turn conflicts and cross-traffic conflicts. To test these criteria, the authors acquired images of three signalized intersections (two in Sungnam city and one in Paju) and analyzed them using image processing techniques based on individual vehicle tracking technology. Within the analyzed images, level 1 conflicts occurred 343 times over the three intersections. Some of these traffic conflicts resulted in level 3 conflict situations. Level 3 traffic conflicts occurred 25 times. From the study results, the authors found that traffic conflict decision techniques can be an alternative means of evaluating traffic safety in black spots.

A Study on the Estimation of Multi-Object Social Distancing Using Stereo Vision and AlphaPose (Stereo Vision과 AlphaPose를 이용한 다중 객체 거리 추정 방법에 관한 연구)

  • Lee, Ju-Min;Bae, Hyeon-Jae;Jang, Gyu-Jin;Kim, Jin-Pyeong
    • KIPS Transactions on Software and Data Engineering / v.10 no.7 / pp.279-286 / 2021
  • Recently, a policy of keeping a physical distance of at least 1 m between people has been implemented to prevent the spread of COVID-19 in public places. In this paper, we propose a method for measuring distances between people in real time, together with an automated system that uses the estimated distances to recognize objects within 1 m of each other in stereo images acquired by drones or CCTVs. A problem with existing methods for estimating distances between multiple objects is that, using only one CCTV, they cannot obtain three-dimensional information about the objects. This matters because three-dimensional information is necessary to measure distances between people when they are right next to each other or overlap in the two-dimensional image. Furthermore, existing methods use only bounding box information to obtain the coordinates at which a person is located. Therefore, in this paper, to obtain the exact two-dimensional coordinates at which a person is located, we extract the person's key points to detect the location, convert it to three-dimensional coordinates using stereo vision and camera calibration, and estimate the Euclidean distance between people. In an experiment evaluating the accuracy of the 3D coordinates and of the distance between objects (persons), the average error was within 0.098 m when estimating distances between multiple people located within 1 m of each other.
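
The conversion from a 2D key point to 3D coordinates and the distance check can be sketched with the standard pinhole stereo relations. This assumes a rectified stereo pair with known focal length, baseline, and principal point; the function names and the numbers in the example are illustrative assumptions, not values from the paper.

```python
import math


def pixel_to_3d(u, v, disparity, focal_px, baseline_m, cx, cy):
    """Back-project a key point (u, v) with known disparity to camera coordinates.

    Uses depth Z = f * B / d for a rectified stereo pair, then the pinhole model
    for X and Y. Lengths are in meters, image quantities in pixels.
    """
    z = focal_px * baseline_m / disparity
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)


def too_close(p, q, min_distance_m=1.0):
    """True when the Euclidean distance between two 3D points is below the limit."""
    return math.dist(p, q) < min_distance_m


# Two people's key points at the same depth, 100 pixels apart horizontally.
a = pixel_to_3d(640, 360, 100.0, focal_px=1000.0, baseline_m=0.5, cx=640, cy=360)
b = pixel_to_3d(740, 360, 100.0, focal_px=1000.0, baseline_m=0.5, cx=640, cy=360)
print(a, b, math.dist(a, b), too_close(a, b))
```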