• Title/Summary/Keyword: Scene Matching

Search results: 156

A Real-time Dual-mode Temporal Synchronization and Compensation based on Reliability Measure in Stereoscopic Video (3D 입체 영상 시스템에서 신뢰도를 활용한 듀얼 모드 실시간 동기 에러 검출 및 보상 방법)

  • Kim, Giseok;Cho, Jae-Soo;Lee, Gwangsoon;Lee, Eung-Don
    • Journal of Broadcast Engineering / v.19 no.6 / pp.896-906 / 2014
  • In this paper, a real-time dual-mode temporal synchronization and compensation method based on a new reliability measure in stereoscopic video is proposed. The goal of temporal alignment is to detect temporal asynchrony and recover the synchronization of the two video streams. The accuracy of the temporal synchronization algorithm depends on the 3DTV content. In order to compensate for temporal synchronization errors, it is necessary to judge whether the result of the temporal synchronization is reliable or not. Based on our recently developed temporal synchronization method [1], we define a new reliability measure for its result. Furthermore, we developed a dual-mode temporal synchronization method that uses a conventional texture-matching method and the temporal spatiogram method [1]. The new reliability measure is based on two distinctive features: a dynamic feature for scene change and a matching-distinction feature. Various experimental results show the effectiveness of the proposed method. The proposed algorithms are evaluated and verified through an experimental system implemented for 3DTV.
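The abstract above hinges on two ideas: scanning candidate temporal offsets with a frame-matching cost, and judging how distinctive the best match is. A minimal sketch of that idea in NumPy, assuming a simple SAD frame cost and a best-vs-second-best ratio as a stand-in for the paper's matching-distinction feature (the function names and the reliability formula are illustrative, not the paper's):

```python
import numpy as np

def frame_cost(a, b):
    """Sum of absolute differences between two frames (a simple texture-matching cost)."""
    return np.abs(a.astype(float) - b.astype(float)).sum()

def estimate_offset(left, right, max_offset):
    """Estimate the temporal offset between two frame sequences, plus a
    confidence score based on how distinctive the best offset's cost is."""
    offsets = list(range(-max_offset, max_offset + 1))
    costs = []
    for d in offsets:
        # Align right[i + d] with left[i] over the overlapping index range.
        pairs = [(i, i + d) for i in range(len(left)) if 0 <= i + d < len(right)]
        costs.append(np.mean([frame_cost(left[i], right[j]) for i, j in pairs]))
    costs = np.array(costs)
    order = np.argsort(costs)
    best, second = costs[order[0]], costs[order[1]]
    reliability = 1.0 - best / (second + 1e-9)  # near 1 when the best offset is distinctive
    return offsets[order[0]], reliability
```

A flat cost curve (e.g. a static scene) drives the reliability toward 0, which is exactly the situation where a compensation step should not trust the detected offset.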

New Methods for Correcting the Atmospheric Effects in Landsat Imagery over Turbid (Case-2) Waters

  • Ahn Yu-Hwan;Shanmugam P.
    • Korean Journal of Remote Sensing / v.20 no.5 / pp.289-305 / 2004
  • Atmospheric correction of Landsat Visible and Near Infrared (VIS/NIR) imagery over the aquatic environment is more demanding than over land, because the signal from the water column is small yet carries immense information about biogeochemical variables in the ocean. This paper introduces two methods, a modified dark-pixel subtraction technique (path-extraction) and our spectral shape matching method (SSMM), for the correction of atmospheric effects in Landsat VIS/NIR imagery in relation to the retrieval of meaningful information about ocean color, especially from Case-2 waters (Morel and Prieur, 1977) around the Korean peninsula. The results of these methods are compared with classical atmospheric correction approaches based on the 6S radiative transfer model and the standard SeaWiFS atmospheric algorithm. The atmospheric correction scheme using the 6S radiative transfer code assumes a standard atmosphere with constant aerosol loading and a uniform, Lambertian surface, while path-extraction assumes that the total radiance (L_TOA) of a black-ocean pixel (referred to by Antoine and Morel, 1999) in a given image can be taken as the path signal, which remains constant over at least the subscene of the Landsat VIS/NIR imagery. The assumption of SSMM is nearly similar, but it extracts the path signal from L_TOA by matching up in-situ water-leaving radiance data for typical clear and turbid waters, and extrapolates it as the spatially homogeneous contribution of the scattered signal after the complex interaction of light with atmospheric aerosols and Rayleigh scattering, and the direct reflection of light on the sea surface.
The overall shape and magnitude of the radiance or reflectance spectra of the Landsat VIS/NIR imagery atmospherically corrected by SSMM show good agreement with the in-situ spectra collected for clear and turbid waters, while path-extraction over turbid waters, though it often reproduces the in-situ spectra, yields significant errors for clear waters due to the invalid assumption of zero water-leaving radiance for the black-ocean pixels. Because of the standard atmosphere with constant aerosols and the models adopted in the 6S radiative transfer code, a large error is possible between the retrieved and in-situ spectra. The efficiency of spectral shape matching has also been explored using SeaWiFS imagery for turbid waters and compared with that of the standard SeaWiFS atmospheric correction algorithm, which fails in highly turbid waters due to the assumption that the water-leaving radiance values in the two NIR bands are negligible, made to enable the retrieval of aerosol reflectance in the correction of ocean color imagery. Validation suggests that accurate retrieval of water-leaving radiance is not feasible under the invalid assumptions of the classical algorithms, but is feasible with SSMM.
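The core of dark-pixel-style path correction can be stated in a few lines: take the darkest observed value in each band as pure atmospheric path signal and subtract it scene-wide. A minimal sketch of that idea, assuming a simple per-band minimum as the dark value (a deliberate simplification of the paper's path-extraction, which uses black-ocean pixels and in-situ match-ups):

```python
import numpy as np

def path_extraction_correct(l_toa):
    """Dark-pixel subtraction sketch: treat the darkest pixel in each band as
    pure path signal and subtract it from the whole scene.
    l_toa: array of shape (rows, cols, bands), top-of-atmosphere radiance."""
    # Per-band dark value, assumed spatially constant over the (sub)scene.
    path = l_toa.reshape(-1, l_toa.shape[-1]).min(axis=0)
    return l_toa - path
```

The known failure mode the abstract describes follows directly: if no pixel in a band is truly "black" (clear water still leaves the surface with some radiance), the subtracted path is wrong by exactly that residual water-leaving signal.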

Speaker verification with ECAPA-TDNN trained on new dataset combined with Voxceleb and Korean (Voxceleb과 한국어를 결합한 새로운 데이터셋으로 학습된 ECAPA-TDNN을 활용한 화자 검증)

  • Keumjae Yoon;Soyoung Park
    • The Korean Journal of Applied Statistics / v.37 no.2 / pp.209-224 / 2024
  • Speaker verification is becoming popular as a method of non-face-to-face identity authentication. It involves determining whether two voice samples belong to the same speaker. In cases where the criminal's voice is recorded at the crime scene, it is vital to establish a speaker verification system that can accurately compare the two pieces of voice evidence. In this study, to achieve this, a new speaker verification system was built using a deep learning model for the Korean language. High-dimensional voice data with high variability, such as background noise, made it necessary to use deep-learning-based methods for speaker matching. To construct the matching algorithm, the ECAPA-TDNN model, one of the best-known deep learning systems for speaker verification, was selected. Voxceleb, a large voice dataset collected from people of various nationalities, contains no Korean. To study the appropriate form of dataset necessary for learning the Korean language, experiments were carried out to find out how Korean voice data affect matching performance. The results showed that, when comparing models trained only on Voxceleb with models trained on datasets combining Voxceleb and Korean data to maximize language and speaker diversity, the model trained on data including Korean performed better on all test sets.
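At verification time, systems of this kind typically reduce each utterance to a fixed-length embedding and compare the two embeddings with cosine similarity against a tuned threshold. A minimal sketch of that scoring backend, assuming embeddings are already extracted (the threshold value here is a hypothetical operating point, not one from the paper):

```python
import numpy as np

def cosine_score(emb1, emb2):
    """Cosine similarity between two speaker embeddings."""
    e1, e2 = np.asarray(emb1, float), np.asarray(emb2, float)
    return float(e1 @ e2 / (np.linalg.norm(e1) * np.linalg.norm(e2)))

def same_speaker(emb1, emb2, threshold=0.5):
    """Accept the pair as the same speaker when the score clears the threshold.
    The threshold would be tuned on a development set (e.g. at the EER point)."""
    return cosine_score(emb1, emb2) >= threshold
```

Because cosine similarity ignores vector magnitude, only the direction of the embedding carries speaker identity, which is why embedding extractors are usually trained with angular-margin losses.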

A reliable quasi-dense corresponding points for structure from motion

  • Oh, Jangseok;Hong, Hyunggil;Cho, Yongjun;Yun, Haeyong;Seo, Kap-Ho;Kim, Hochul;Kim, Mingi;Lee, Onseok
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.9 / pp.3782-3796 / 2020
  • Three-dimensional (3D) reconstruction is an important research area in computer vision. The ability to detect and match features across multiple views of a scene is a critical initial step. The tracking matrix W obtained during reconstruction can be fed to structure from motion (SFM) algorithms for 3D modeling. We often fail to generate an acceptable number of features when processing face or medical images, because such images typically contain large homogeneous regions with minimal variation in intensity. In this study, we seek to locate sufficient matching points not only in general images but also in face and medical images, where it is difficult to determine the feature points. The algorithm is implemented with an adaptive threshold value and the scale-invariant feature transform (SIFT), affine SIFT, speeded-up robust features (SURF), and affine SURF. By applying the algorithm to face and general images and studying the geometric errors, we obtain quasi-dense matching points that satisfy well-functioning geometric constraints. We also demonstrate a 3D reconstruction with respectable performance by applying a column-space fitting algorithm, which is an SFM algorithm.
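Pipelines built on SIFT/SURF-style descriptors usually filter raw nearest-neighbour matches with Lowe's ratio test before enforcing geometric constraints. A small sketch of that filter over precomputed descriptor arrays (the ratio value is illustrative; the paper's adaptive thresholding is not reproduced here):

```python
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.8):
    """Nearest-neighbour descriptor matching with Lowe's ratio test.
    A match is kept only when the best distance is clearly smaller than the
    second best, i.e. the match is distinctive rather than ambiguous."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches
```

The surviving matches are the candidates one would then test against epipolar geometry (e.g. a RANSAC-estimated fundamental matrix) to reach the "well-functioning geometric constraints" the abstract mentions.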

Content-Based Image Retrieval Algorithm Using HAQ Algorithm and Moment-Based Feature (HAQ 알고리즘과 Moment 기반 특징을 이용한 내용 기반 영상 검색 알고리즘)

  • 김대일;강대성
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.4 / pp.113-120 / 2004
  • In this paper, we propose an efficient feature extraction and image retrieval algorithm for content-based retrieval. First, we extract the object using a Gaussian edge detector from the input image, which is a key frame of an MPEG video, and extract the object features: a location feature, a distributed-dimension feature, and invariant-moment features. Next, we extract the characteristic color feature using the proposed HAQ (Histogram Analysis and Quantization) algorithm. Finally, we perform retrieval over the four features in sequence with the proposed matching method for a query image, which is a shot frame other than the key frames of the MPEG video. The purpose of this paper is to propose a novel content-based image retrieval algorithm which retrieves the key frame, within the shot boundary of the MPEG video, belonging to the scene requested by the user. The experimental results show efficient retrieval for 836 sample images from 10 music videos using the proposed algorithm.
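A characteristic-color feature of the kind described usually amounts to a coarsely quantized color histogram compared with a histogram-similarity measure. A minimal sketch under that assumption (the bin count and the use of histogram intersection are illustrative; the actual HAQ quantization is specific to the paper):

```python
import numpy as np

def color_histogram(image, bins=8):
    """Coarsely quantized per-channel color histogram, normalized to sum to 1.
    image: uint8 array of shape (rows, cols, channels)."""
    hist = np.concatenate([
        np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
        for c in range(image.shape[-1])
    ]).astype(float)
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms; 1 means identical."""
    return float(np.minimum(h1, h2).sum())
```

Ranking candidate key frames by this similarity gives a cheap first-stage filter before the more specific shape and moment features are compared.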

3D Shape Reconstruction of Non-Lambertian Surface (Non-Lambertian면의 형상복원)

  • 김태은;이말례
    • Journal of Korea Multimedia Society / v.1 no.1 / pp.26-36 / 1998
  • How to obtain 3D information from a 2D image is a very important field of study in computer vision. For this purpose, we must know the position of the camera, the direction of the light source, and the surface reflectance property before we take the image, which are intrinsic information about the object in the scene. Among them, the surface reflectance property provides very important clues. Most previous research assumes that objects have only Lambertian reflectance, but many real-world objects have non-Lambertian reflectance. In this paper, a new method is proposed for analyzing the properties of surface reflectance and reconstructing the shape of an object by estimating the reflectance parameters. We are interested in non-Lambertian surfaces exhibiting both specular and diffuse reflection, which can be explained by the Torrance-Sparrow model. The photometric matching method proposed in this paper is robust because it matches the reference image and the object image considering the neighboring brightness distribution. A neural-network-based shape reconstruction method is also proposed, which can be performed in the absence of reflectance information. When the brightness obtained under each light source is input, the neural network, trained on surface normals, can determine the surface shape of the object.
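The diffuse-plus-specular split the abstract relies on can be written compactly: a Lambertian cosine term plus a specular lobe that falls off as a Gaussian of the facet angle, which is the heart of the Torrance-Sparrow facet-distribution idea. A simplified sketch (the coefficients are illustrative, and the geometric attenuation and Fresnel terms of the full model are omitted):

```python
import numpy as np

def hybrid_reflectance(theta_i, alpha, k_d=0.7, k_s=0.3, sigma=0.2):
    """Simplified diffuse + specular brightness in the spirit of the
    Torrance-Sparrow model.
    theta_i: angle between surface normal and light direction (radians).
    alpha:   angle between surface normal and the half-vector (radians).
    sigma:   surface roughness controlling the width of the specular lobe."""
    diffuse = k_d * np.cos(theta_i)                       # Lambertian term
    specular = k_s * np.exp(-alpha ** 2 / (2.0 * sigma ** 2))  # Gaussian facet lobe
    return diffuse + specular
```

Shape-from-reflectance methods of this kind invert such a model: given observed brightness under known lights, they estimate (k_d, k_s, sigma) and the surface normals that best explain the measurements.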


A Modified Diamond Zonal Search Algorithm for Motion Estimation (움직임추정을 위한 수정된 다이아몬드 지역탐색 알고리즘)

  • Kwak, Sung-Keun
    • Journal of the Korea Computer Industry Society / v.10 no.5 / pp.227-234 / 2009
  • This paper introduces a new technique for block-matching motion estimation that exploits the temporal correlation of an animation sequence between the motion vector of the current block and the motion vector of the corresponding block in the previous frame. We propose a scene-change detection algorithm for block matching using the temporal correlation of the animation sequence and the center-biased property of motion vectors. The proposed algorithm determines a better starting point for the search for an exact motion vector: the point with the smallest SAD (sum of absolute differences) value among the motion vector predicted from the same block of the previous frame and the predictor candidate points in each search region. Simulation results show that, compared with algorithms other than the FS (full search) algorithm, the average number of search points per motion-vector estimation is reduced by as much as 9~32% and the PSNR is improved by about 0.06~0.21 dB on average.
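The starting-point selection described above reduces to evaluating the SAD cost at a handful of candidate positions and keeping the cheapest one. A minimal sketch of that step, assuming grayscale frames and (x, y) candidates already derived from the temporal predictor (the full diamond search around the chosen start is not reproduced):

```python
import numpy as np

def sad(block, frame, x, y):
    """Sum of absolute differences of `block` against `frame` at offset (x, y)."""
    h, w = block.shape
    return np.abs(frame[y:y + h, x:x + w].astype(float) - block).sum()

def best_start_point(block, frame, candidates):
    """Pick the candidate position with the smallest SAD as the search start,
    e.g. the temporally predicted vector plus its predictor candidates."""
    return min(candidates, key=lambda p: sad(block, frame, p[0], p[1]))
```

Starting the diamond search from a well-predicted point is what lets the method examine far fewer positions than an exhaustive search while losing almost no PSNR.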


Analysis on 3D Positioning Precision Using Mobile Mapping System Images in Photogrammetric Perspective (사진측량 관점에서 차량측량시스템 영상을 이용한 3차원 위치의 정밀도 분석)

  • 조우석;황현덕
    • Korean Journal of Remote Sensing / v.19 no.6 / pp.431-445 / 2003
  • In this paper, we experimentally investigated the precision of 3D positioning using 4S-Van images from a photogrammetric perspective. A 3D calibration target was built on a building facade outdoors and was captured separately by the two CCD cameras installed in the 4S-Van. We then determined the interior orientation parameters for each CCD camera through a self-calibration technique. With the interior orientation parameters computed, bundle adjustment was performed to obtain the exterior orientation parameters simultaneously for the two CCD cameras, using the calibration-target image and object coordinates. The reverse lens-distortion coefficients were obtained by the least-squares method so as to introduce lens distortion into the epipolar line. It was shown that the reverse lens-distortion coefficients could transform image coordinates into lens-distorted image coordinates within about 0.5 pixel. The proposed semi-automatic matching scheme incorporating the lens-distorted epipolar line was applied to scene images captured by the 4S-Van while moving. The experimental results showed that the precision of 3D positioning from 4S-Van images, from a photogrammetric perspective, is within 2 cm at a range of 20 m from the camera.
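The lens-distortion mapping the abstract fits coefficients for is typically a polynomial radial model. A minimal sketch of the forward direction, x' = x(1 + k1 r² + k2 r⁴), applied to normalized image points; the paper's "reverse" coefficients would be the least-squares fit of the inverse of such a mapping (the coefficient values below are illustrative):

```python
import numpy as np

def distort(points, k1, k2):
    """Apply a radial lens-distortion model to normalized image points.
    points: array-like of shape (n, 2); k1, k2: radial distortion coefficients."""
    pts = np.asarray(points, float)
    r2 = (pts ** 2).sum(axis=1, keepdims=True)   # squared radius per point
    return pts * (1.0 + k1 * r2 + k2 * r2 ** 2)  # scale each point radially
```

Sampling points along an ideal epipolar line and pushing them through such a mapping yields the curved, lens-distorted epipolar line on which the matching scheme searches.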

Visual Model of Pattern Design Based on Deep Convolutional Neural Network

  • Jingjing Ye;Jun Wang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.2 / pp.311-326 / 2024
  • The rapid development of neural network technology enables big-data-driven neural network models to handle the texture effects of complex objects. Because of the limitations of complex scenes, it is necessary to establish custom template matching and apply it to many fields of computer vision. Such systems depend only weakly on high-quality, small labeled sample databases, and deep-feature-based machine learning systems for inferring texture effects perform relatively poorly. A neural-network-based style transfer algorithm collects and preserves pattern data, extracting and modernizing their features; through the algorithm model, it becomes easier to present the texture and color of patterns and display them digitally. In this paper, following the texture-effect reasoning of custom template matching, the visualization of the target is transformed into a 3D model. The similarity between the scene to be inferred and the user-defined template is calculated from a user-defined template of multi-dimensional external feature labels. A convolutional neural network is adopted to optimize the external area of the object, to improve the sampling quality and computational performance of the sample pyramid structure. The results indicate that the proposed algorithm can accurately capture salient targets, suppress noise, and improve the visualization results. The proposed deep convolutional neural network optimization algorithm shows good speed, data accuracy, and robustness. It can adapt to the computation of more task scenes, display the redundant vision-related information of image conversion, and further improve the computational efficiency and accuracy of convolutional networks, which is of high significance for research on image-information conversion.

Automatic Extraction of Stable Visual Landmarks for a Mobile Robot under Uncertainty (이동로봇의 불확실성을 고려한 안정한 시각 랜드마크의 자동 추출)

  • Moon, In-Hyuk
    • Journal of Institute of Control, Robotics and Systems / v.7 no.9 / pp.758-765 / 2001
  • This paper proposes a method to automatically extract stable visual landmarks from sensory data. Given a 2D occupancy map, a mobile robot first extracts vertical line features which are distinct and lie on vertical planar surfaces, because they are expected to be observed reliably from various viewpoints. Since the feature information, such as position and length, includes uncertainty due to errors of vision and motion, the robot then reduces the uncertainty by matching the planar surface containing the features to the map. As a result, the robot obtains modeled stable visual landmarks from the extracted features. This extraction process is performed online to adapt to actual changes of lighting and scene depending on the robot's view. Experimental results in various real scenes show the validity of the proposed method.
