• Title/Summary/Keyword: Information Processing Model

Search Result 5,541, Processing Time 0.032 seconds

Efficient Methodology in Markov Random Field Modeling : Multiresolution Structure and Bayesian Approach in Parameter Estimation (피라미드 구조와 베이지안 접근법을 이용한 Markove Random Field의 효율적 모델링)

  • 정명희;홍의석
    • Korean Journal of Remote Sensing
    • /
    • v.15 no.2
    • /
    • pp.147-158
    • /
    • 1999
  • Remote sensing technique has offered better understanding of our environment for the decades by providing useful level of information on the landcover. In many applications using the remotely sensed data, digital image processing methodology has been usefully employed to characterize the features in the data and develop the models. Random field models, especially Markov Random Field (MRF) models exploiting spatial relationships, are successfully utilized in many problems such as texture modeling, region labeling and so on. Usually, remotely sensed imagery are very large in nature and the data increase greatly in the problem requiring temporal data over time period. The time required to process increasing larger images is not linear. In this study, the methodology to reduce the computational cost is investigated in the utilization of the Markov Random Field. For this, multiresolution framework is explored which provides convenient and efficient structures for the transition between the local and global features. The computational requirements for parameter estimation of the MRF model also become excessive as image size increases. A Bayesian approach is investigated as an alternative estimation method to reduce the computational burden in estimation of the parameters of large images.

Hybrid All-Reduce Strategy with Layer Overlapping for Reducing Communication Overhead in Distributed Deep Learning (분산 딥러닝에서 통신 오버헤드를 줄이기 위해 레이어를 오버래핑하는 하이브리드 올-리듀스 기법)

  • Kim, Daehyun;Yeo, Sangho;Oh, Sangyoon
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.10 no.7
    • /
    • pp.191-198
    • /
    • 2021
  • Since the size of training dataset become large and the model is getting deeper to achieve high accuracy in deep learning, the deep neural network training requires a lot of computation and it takes too much time with a single node. Therefore, distributed deep learning is proposed to reduce the training time by distributing computation across multiple nodes. In this study, we propose hybrid allreduce strategy that considers the characteristics of each layer and communication and computational overlapping technique for synchronization of distributed deep learning. Since the convolution layer has fewer parameters than the fully-connected layer as well as it is located at the upper, only short overlapping time is allowed. Thus, butterfly allreduce is used to synchronize the convolution layer. On the other hand, fully-connecter layer is synchronized using ring all-reduce. The empirical experiment results on PyTorch with our proposed scheme shows that the proposed method reduced the training time by up to 33% compared to the baseline PyTorch.

The Method for Colorizing SAR Images of Kompsat-5 Using Cycle GAN with Multi-scale Discriminators (다양한 크기의 식별자를 적용한 Cycle GAN을 이용한 다목적실용위성 5호 SAR 영상 색상 구현 방법)

  • Ku, Wonhoe;Chun, Daewon
    • Korean Journal of Remote Sensing
    • /
    • v.34 no.6_3
    • /
    • pp.1415-1425
    • /
    • 2018
  • Kompsat-5 is the first Earth Observation Satellite which is equipped with an SAR in Korea. SAR images are generated by receiving signals reflected from an object by microwaves emitted from a SAR antenna. Because the wavelengths of microwaves are longer than the size of particles in the atmosphere, it can penetrate clouds and fog, and high-resolution images can be obtained without distinction between day and night. However, there is no color information in SAR images. To overcome these limitations of SAR images, colorization of SAR images using Cycle GAN, a deep learning model developed for domain translation, was conducted. Training of Cycle GAN is unstable due to the unsupervised learning based on unpaired dataset. Therefore, we proposed MS Cycle GAN applying multi-scale discriminator to solve the training instability of Cycle GAN and to improve the performance of colorization in this paper. To compare colorization performance of MS Cycle GAN and Cycle GAN, generated images by both models were compared qualitatively and quantitatively. Training Cycle GAN with multi-scale discriminator shows the losses of generators and discriminators are significantly reduced compared to the conventional Cycle GAN, and we identified that generated images by MS Cycle GAN are well-matched with the characteristics of regions such as leaves, rivers, and land.

Conversion of Camera Lens Distortions between Photogrammetry and Computer Vision (사진측량과 컴퓨터비전 간의 카메라 렌즈왜곡 변환)

  • Hong, Song Pyo;Choi, Han Seung;Kim, Eui Myoung
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.37 no.4
    • /
    • pp.267-277
    • /
    • 2019
  • Photogrammetry and computer vision are identical in determining the three-dimensional coordinates of images taken with a camera, but the two fields are not directly compatible with each other due to differences in camera lens distortion modeling methods and camera coordinate systems. In general, data processing of drone images is performed by bundle block adjustments using computer vision-based software, and then the plotting of the image is performed by photogrammetry-based software for mapping. In this case, we are faced with the problem of converting the model of camera lens distortions into the formula used in photogrammetry. Therefore, this study described the differences between the coordinate systems and lens distortion models used in photogrammetry and computer vision, and proposed a methodology for converting them. In order to verify the conversion formula of the camera lens distortion models, first, lens distortions were added to the virtual coordinates without lens distortions by using the computer vision-based lens distortion models. Then, the distortion coefficients were determined using photogrammetry-based lens distortion models, and the lens distortions were removed from the photo coordinates and compared with the virtual coordinates without the original distortions. The results showed that the root mean square distance was good within 0.5 pixels. In addition, epipolar images were generated to determine the accuracy by applying lens distortion coefficients for photogrammetry. The calculated root mean square error of y-parallax was found to be within 0.3 pixels.

A vision-based system for long-distance remote monitoring of dynamic displacement: experimental verification on a supertall structure

  • Ni, Yi-Qing;Wang, You-Wu;Liao, Wei-Yang;Chen, Wei-Huan
    • Smart Structures and Systems
    • /
    • v.24 no.6
    • /
    • pp.769-781
    • /
    • 2019
  • Dynamic displacement response of civil structures is an important index for in-construction and in-service structural condition assessment. However, accurately measuring the displacement of large-scale civil structures such as high-rise buildings still remains as a challenging task. In order to cope with this problem, a vision-based system with the use of industrial digital camera and image processing has been developed for long-distance, remote, and real-time monitoring of dynamic displacement of supertall structures. Instead of acquiring image signals, the proposed system traces only the coordinates of the target points, therefore enabling real-time monitoring and display of displacement responses in a relatively high sampling rate. This study addresses the in-situ experimental verification of the developed vision-based system on the Canton Tower of 600 m high. To facilitate the verification, a GPS system is used to calibrate/verify the structural displacement responses measured by the vision-based system. Meanwhile, an accelerometer deployed in the vicinity of the target point also provides frequency-domain information for comparison. Special attention has been given on understanding the influence of the surrounding light on the monitoring results. For this purpose, the experimental tests are conducted in daytime and nighttime through placing the vision-based system outside the tower (in a brilliant environment) and inside the tower (in a dark environment), respectively. The results indicate that the displacement response time histories monitored by the vision-based system not only match well with those acquired by the GPS receiver, but also have higher fidelity and are less noise-corrupted. In addition, the low-order modal frequencies of the building identified with use of the data obtained from the vision-based system are all in good agreement with those obtained from the accelerometer, the GPS receiver and an elaborate finite element model. Especially, the vision-based system placed at the bottom of the enclosed elevator shaft offers better monitoring data compared with the system placed outside the tower. Based on a wavelet filtering technique, the displacement response time histories obtained by the vision-based system are easily decomposed into two parts: a quasi-static ingredient primarily resulting from temperature variation and a dynamic component mainly caused by fluctuating wind load.

Comparison of Korean Real-time Text-to-Speech Technology Based on Deep Learning (딥러닝 기반 한국어 실시간 TTS 기술 비교)

  • Kwon, Chul Hong
    • The Journal of the Convergence on Culture Technology
    • /
    • v.7 no.1
    • /
    • pp.640-645
    • /
    • 2021
  • The deep learning based end-to-end TTS system consists of Text2Mel module that generates spectrogram from text, and vocoder module that synthesizes speech signals from spectrogram. Recently, by applying deep learning technology to the TTS system the intelligibility and naturalness of the synthesized speech is as improved as human vocalization. However, it has the disadvantage that the inference speed for synthesizing speech is very slow compared to the conventional method. The inference speed can be improved by applying the non-autoregressive method which can generate speech samples in parallel independent of previously generated samples. In this paper, we introduce FastSpeech, FastSpeech 2, and FastPitch as Text2Mel technology, and Parallel WaveGAN, Multi-band MelGAN, and WaveGlow as vocoder technology applying non-autoregressive method. And we implement them to verify whether it can be processed in real time. Experimental results show that by the obtained RTF all the presented methods are sufficiently capable of real-time processing. And it can be seen that the size of the learned model is about tens to hundreds of megabytes except WaveGlow, and it can be applied to the embedded environment where the memory is limited.

Performance Evaluation of KOMPSAT-3 Satellite DSM in Overseas Testbed Area (해외 테스트베드 지역 아리랑 위성 3호 DSM 성능평가)

  • Oh, Kwan-Young;Hwang, Jeong-In;Yoo, Woo-Sun;Lee, Kwang-Jae
    • Korean Journal of Remote Sensing
    • /
    • v.36 no.6_2
    • /
    • pp.1615-1627
    • /
    • 2020
  • The purpose of this study is to compare and analyze the performance of KOMPSAT-3 Digital Surface Model (DSM) made in overseas testbed area. To that end, we collected the KOMPSAT-3 in-track stereo image taken in San Francisco, the U.S. The stereo geometry elements (B/H, converse angle, etc.) of the stereo image taken were all found to be in the stable range. By applying precise sensor modeling using Ground Control Point (GCP) and DSM automatic generation technique, DSM with 1 m resolution was produced. Reference materials for evaluation and calibration are ground points with accuracy within 0.01 m from Compass Data Inc., 1 m resolution Elevation 1-DSM produced by Airbus. The precision sensor modeling accuracy of KOMPSAT-3 was within 0.5 m (RMSE) in horizontal and vertical directions. When the difference map was written between the generated DSM and the reference DSM, the mean and standard deviation were 0.61 m and 5.25 m respectively, but in some areas, they showed a large difference of more than 100 m. These areas appeared mainly in closed areas where high-rise buildings were concentrated. If KOMPSAT-3 tri-stereo images are used and various post-processing techniques are developed, it will be possible to produce DSM with more improved quality.

Line-Segment Feature Analysis Algorithm for Handwritten-Digits Data Reduction (필기체 숫자 데이터 차원 감소를 위한 선분 특징 분석 알고리즘)

  • Kim, Chang-Min;Lee, Woo-Beom
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.4
    • /
    • pp.125-132
    • /
    • 2021
  • As the layers of artificial neural network deepens, and the dimension of data used as an input increases, there is a problem of high arithmetic operation requiring a lot of arithmetic operation at a high speed in the learning and recognition of the neural network (NN). Thus, this study proposes a data dimensionality reduction method to reduce the dimension of the input data in the NN. The proposed Line-segment Feature Analysis (LFA) algorithm applies a gradient-based edge detection algorithm using median filters to analyze the line-segment features of the objects existing in an image. Concerning the extracted edge image, the eigenvalues corresponding to eight kinds of line-segment are calculated, using 3×3 or 5×5-sized detection filters consisting of the coefficient values, including [0, 1, 2, 4, 8, 16, 32, 64, and 128]. Two one-dimensional 256-sized data are produced, accumulating the same response values from the eigenvalue calculated with each detection filter, and the two data elements are added up. Two LFA256 data are merged to produce 512-sized LAF512 data. For the performance evaluation of the proposed LFA algorithm to reduce the data dimension for the recognition of handwritten numbers, as a result of a comparative experiment, using the PCA technique and AlexNet model, LFA256 and LFA512 showed a recognition performance respectively of 98.7% and 99%.

Comparison of Korean Classification Models' Korean Essay Score Range Prediction Performance (한국어 학습 모델별 한국어 쓰기 답안지 점수 구간 예측 성능 비교)

  • Cho, Heeryon;Im, Hyeonyeol;Yi, Yumi;Cha, Junwoo
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.11 no.3
    • /
    • pp.133-140
    • /
    • 2022
  • We investigate the performance of deep learning-based Korean language models on a task of predicting the score range of Korean essays written by foreign students. We construct a data set containing a total of 304 essays, which include essays discussing the criteria for choosing a job ('job'), conditions of a happy life ('happ'), relationship between money and happiness ('econ'), and definition of success ('succ'). These essays were labeled according to four letter grades (A, B, C, and D), and a total of eleven essay score range prediction experiments were conducted (i.e., five for predicting the score range of 'job' essays, five for predicting the score range of 'happiness' essays, and one for predicting the score range of mixed topic essays). Three deep learning-based Korean language models, KoBERT, KcBERT, and KR-BERT, were fine-tuned using various training data. Moreover, two traditional probabilistic machine learning classifiers, naive Bayes and logistic regression, were also evaluated. Experiment results show that deep learning-based Korean language models performed better than the two traditional classifiers, with KR-BERT performing the best with 55.83% overall average prediction accuracy. A close second was KcBERT (55.77%) followed by KoBERT (54.91%). The performances of naive Bayes and logistic regression classifiers were 52.52% and 50.28% respectively. Due to the scarcity of training data and the imbalance in class distribution, the overall prediction performance was not high for all classifiers. Moreover, the classifiers' vocabulary did not explicitly capture the error features that were helpful in correctly grading the Korean essay. By overcoming these two limitations, we expect the score range prediction performance to improve.

A Study on Automated Fake News Detection Using Verification Articles (검증 자료를 활용한 가짜뉴스 탐지 자동화 연구)

  • Han, Yoon-Jin;Kim, Geun-Hyung
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.12
    • /
    • pp.569-578
    • /
    • 2021
  • Thanks to web development today, we can easily access online news via various media. As much as it is easy to access online news, we often face fake news pretending to be true. As fake news items have become a global problem, fact-checking services are provided domestically, too. However, these are based on expert-based manual detection, and research to provide technologies that automate the detection of fake news is being actively conducted. As for the existing research, detection is made available based on contextual characteristics of an article and the comparison of a title and the main article. However, there is a limit to such an attempt making detection difficult when manipulation precision has become high. Therefore, this study suggests using a verifying article to decide whether a news item is genuine or not to be affected by article manipulation. Also, to improve the precision of fake news detection, the study added a process to summarize a subject article and a verifying article through the summarization model. In order to verify the suggested algorithm, this study conducted verification for summarization method of documents, verification for search method of verification articles, and verification for the precision of fake news detection in the finally suggested algorithm. The algorithm suggested in this study can be helpful to identify the truth of an article before it is applied to media sources and made available online via various media sources.