• Title/Summary/Keyword: image learning (이미지 학습)


Artificial Neural Network Method Based on Convolution to Efficiently Extract the DoF Embodied in Images

  • Kim, Jong-Hyun
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.3
    • /
    • pp.51-57
    • /
    • 2021
  • In this paper, we propose a method to find the DoF (depth of field), i.e., the regions blurred by camera focusing and out-of-focus effects, in an image using an efficient convolutional neural network. Our approach uses an RGB channel-based cross-correlation filter to classify the DoF region in the image and to build training data for the convolutional neural network. Each training sample pairs an image with its DoF weight map. The DoF weight maps extracted by the cross-correlation filters are smoothed before training to increase the convergence rate in the network learning stage. The DoF weight image obtained at test time stably locates the DoF region in the input image. As a result, by treating the DoF area as the user's ROI (region of interest), the proposed method can be applied in various areas such as NPR (non-photorealistic rendering) and object detection.
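As a rough illustration of the general idea above (not the authors' actual filter), a per-channel high-pass cross-correlation followed by smoothing yields a crude sharpness/DoF weight map; the Laplacian-like kernel and box-filter window below are assumptions.

```python
import numpy as np

def dof_weight_map(img, win=5):
    """Rough DoF weight map: cross-correlate each RGB channel with a
    high-pass (Laplacian-like) kernel, sum the response magnitudes,
    then smooth with a box filter.  Low values = blurred regions."""
    k = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], float)
    h, w, _ = img.shape
    resp = np.zeros((h, w))
    pad = np.pad(img.astype(float), ((1, 1), (1, 1), (0, 0)), mode="edge")
    for c in range(3):                       # RGB channel-wise response
        for i in range(h):
            for j in range(w):
                resp[i, j] += abs(np.sum(pad[i:i+3, j:j+3, c] * k))
    # Box-filter smoothing (stands in for the paper's smoothing step).
    r = win // 2
    out = np.zeros_like(resp)
    padr = np.pad(resp, r, mode="edge")
    for i in range(h):
        for j in range(w):
            out[i, j] = padr[i:i+win, j:j+win].mean()
    return out / (out.max() + 1e-8)
```

Sharp (high-frequency) regions receive weights near 1, smooth out-of-focus regions near 0, which matches the role of the DoF weight map as a training target.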

Estimation of High-Resolution Soil Moisture Using Sentinel-1A/B SAR and Deep Learning Regression Model (딥러닝 모형을 이용한 Sentinel SAR 기반 고해상도 토양수분 산정)

  • Lee, Taehwa;Kim, Sangwoo;Chun, Beomseok;Jung, Younghun;Shin, Yongchul
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2021.06a
    • /
    • pp.114-114
    • /
    • 2021
  • In this study, high-resolution soil moisture was estimated using Sentinel-1 SAR imagery and a deep learning technique. The input data consisted of surface characteristics (sand content, clay content, slope), satellite-based rainfall, and LANDSAT-based imagery (NDVI, LST, spatially distributed soil moisture). For rainfall, GPM (Global Precipitation Measurement) daily rainfall data were used, with the rainfall for the five days preceding each observation date and the five-day mean rainfall treated separately. The LANDSAT-based soil moisture images were calibrated and validated against in-situ soil moisture observations and then used as input to the deep learning model. The inputs were resampled to a 30 m × 30 m resolution to train the model, and the trained model was used to produce Sentinel-1-based high-resolution (10 m × 10 m) soil moisture images. Soil moisture observation stations in Geochang-eup (Geochang-gun), Duma-myeon (Gyeryong-si), Jangsu-eup (Jangsu-gun), and Muju-eup (Muju-gun) were selected as validation sites. At the Geochang site, the LANDSAT-based and DNN-based soil moisture images were very similar, and the simulated values (DNN-based soil moisture) reproduced the measured values (LANDSAT-based soil moisture) well (R: 0.875; RMSE: 0.013). When the trained model was applied to regions with similar land cover, the simulated values at the validation sites in Gyeryong-si (R: 0.897; RMSE: 0.014), Jangsu-gun (R: 0.770; RMSE: 0.024), and Muju-gun (R: 0.909; RMSE: 0.012) also agreed closely with the measured values. Based on these results, high-resolution soil moisture data produced by combining Sentinel-1 SAR imagery with deep learning are expected to be useful in various fields such as agriculture, hydrology, and the environment.
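The abstract above does not specify the network architecture; as a minimal sketch of the regression setup (the feature layout, layer sizes, and learning rate here are all assumptions, and the data are synthetic), a small numpy MLP mapping per-pixel features to a soil-moisture value might look like:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical per-pixel feature vector: [sand, clay, slope,
# 5-day rainfall, 5-day mean rainfall, NDVI, LST] -> soil moisture.
X = rng.random((200, 7))
true_w = rng.random(7)
y = X @ true_w + 0.01 * rng.standard_normal(200)   # synthetic target

# One-hidden-layer MLP regressor (a stand-in for the paper's DNN).
W1 = 0.1 * rng.standard_normal((7, 16)); b1 = np.zeros(16)
W2 = 0.1 * rng.standard_normal((16, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.maximum(X @ W1 + b1, 0.0)     # ReLU hidden layer
    return h, (h @ W2 + b2).ravel()

losses, lr = [], 0.05
for _ in range(300):                     # plain gradient descent on MSE
    h, pred = forward(X)
    err = pred - y
    losses.append(float(np.mean(err ** 2)))
    g2 = h.T @ err[:, None] / len(y)
    dh = err[:, None] @ W2.T * (h > 0)
    g1 = X.T @ dh / len(y)
    W2 -= lr * g2; b2 -= lr * err.mean()
    W1 -= lr * g1; b1 -= lr * dh.mean(axis=0)
```

In the paper's setting the same kind of model would be trained on 30 m resampled inputs and then evaluated on Sentinel-1-scale (10 m) grids.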


Fashion Search Service Using Transfer Learning (전이 학습을 이용한 패션 스타일 검색 서비스)

  • Lee, Byeong-Jun;Sim, Ju-Yong;Lee, Jun-Yeong;Lee, Songwook
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2022.11a
    • /
    • pp.432-434
    • /
    • 2022
  • We trained a classifier for specific fashion styles using transfer learning, and we provide users with a web service that links fashion-style search results to online shopping malls. The fashion style classifier was obtained by transfer learning on ResNet34 [1] using data collected through Google image search. Using the trained model, user images were classified into 17 fashion-style classes with an average F1 score of 65.5%. The classification results are linked to the Naver shopping mall, providing a service through which users can purchase the fashion items they want.
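To illustrate the transfer-learning recipe without the actual ResNet34 weights, the numpy sketch below freezes a fixed "pretrained" feature extractor and trains only a new 17-class softmax head on toy data; the projection, prototypes, and counts are all illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_feat = 17, 32               # 17 style classes as in the paper

# Stand-in for a frozen pretrained backbone (e.g. ResNet34 features):
# a fixed projection that is never updated during training.
W_frozen = rng.standard_normal((64, n_feat)) / np.sqrt(64)

def features(x):
    return np.maximum(x @ W_frozen, 0.0)  # frozen feature extractor

# Toy "fashion" data: each class drawn around its own prototype.
protos = rng.standard_normal((n_classes, 64))
labels = rng.integers(0, n_classes, 340)
X = protos[labels] + 0.1 * rng.standard_normal((340, 64))
F = features(X)

# Transfer learning = train only the new classification head.
W_head = np.zeros((n_feat, n_classes))
for _ in range(200):
    logits = F @ W_head
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(len(labels)), labels] -= 1.0   # softmax cross-entropy grad
    W_head -= 0.1 * F.T @ p / len(labels)

acc = float((np.argmax(F @ W_head, axis=1) == labels).mean())
```

The design point is the same as fine-tuning ResNet34: the scarce, scraped training data only has to fit the small head, not the whole backbone.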

Arrhythmia classification based on meta-transfer learning using 2D-CNN model (2D-CNN 모델을 이용한 메타-전이학습 기반 부정맥 분류)

  • Kim, Ahyun;Yeom, Sunhwoong;Kim, Kyungbaek
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2022.11a
    • /
    • pp.550-552
    • /
    • 2022
  • As Internet of Things (IoT) devices have become widespread, long-term monitoring and data collection in wearable-device environments have become possible, stimulating research on biosignal processing and ECG analysis. However, ECG data suffer from a class imbalance problem due to the irregular occurrence of arrhythmia beats, from low signal quality caused by noise such as muscle tremor and weak signals, and from the small size of public training datasets. In this paper, 1D ECG signals are converted into 2D spectrogram images to minimize the influence of noise, and the advantages of transfer learning and meta-learning are combined so that the method handles class imbalance and learns quickly even from small amounts of data. Accordingly, this paper proposes a 2D-CNN meta-transfer-learning-based arrhythmia classification method using ECG spectrogram images.
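The 1D-to-2D conversion step can be sketched with a plain short-time Fourier transform; the frame length, hop, and sampling rate below are assumptions, not values from the paper.

```python
import numpy as np

def spectrogram(sig, n_fft=64, hop=16):
    """Convert a 1-D signal into a 2-D magnitude spectrogram image,
    as done to feed ECG segments into a 2D-CNN."""
    win = np.hanning(n_fft)
    frames = [sig[s:s + n_fft] * win
              for s in range(0, len(sig) - n_fft + 1, hop)]
    spec = np.abs(np.fft.rfft(np.array(frames), axis=1))   # (time, freq)
    return spec.T                                          # (freq, time) image

fs = 256                                   # assumed sampling rate (Hz)
t = np.arange(fs) / fs
sig = np.sin(2 * np.pi * 8 * t)            # 8 Hz test tone
img = spectrogram(sig)
```

Broadband noise such as muscle tremor spreads across many frequency bins of the image, while the beat morphology concentrates in a few, which is why the 2D representation helps the CNN separate signal from noise.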

Character-based Subtitle Generation by Learning of Multimodal Concept Hierarchy from Cartoon Videos (멀티모달 개념계층모델을 이용한 만화비디오 컨텐츠 학습을 통한 등장인물 기반 비디오 자막 생성)

  • Kim, Kyung-Min;Ha, Jung-Woo;Lee, Beom-Jin;Zhang, Byoung-Tak
    • Journal of KIISE
    • /
    • v.42 no.4
    • /
    • pp.451-458
    • /
    • 2015
  • Previous multimodal learning methods focus on problem-solving aspects, such as image and video search and tagging, rather than on knowledge acquisition via content modeling. In this paper, we propose the Multimodal Concept Hierarchy (MuCH), a content modeling method that uses a cartoon video dataset, together with a character-based subtitle generation method built on the learned model. The MuCH model has a multimodal hypernetwork layer, in which the patterns of words and image patches are represented, and a concept layer, in which each concept variable is represented by a probability distribution over the words and image patches. Using a Bayesian learning method, the model can learn the characteristics of the characters as concepts from the video subtitles and scene images, and it can generate character-based subtitles from the learned model when text queries are provided. In our experiments, the MuCH model learned concepts from 'Pororo' cartoon videos totaling 268 minutes in length and generated character-based subtitles. Finally, we compare the results with those of other multimodal learning models. The experimental results indicate that, given the same text query, our model generates more accurate and more character-specific subtitles than the other models.
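A heavily simplified sketch of the "concept as a probability distribution over words" idea (the vocabulary, counts, and generation rule are invented for illustration and are not the MuCH hypernetwork):

```python
import numpy as np

# Toy concept layer: each character-concept is a probability
# distribution over subtitle words (normalized co-occurrence counts).
vocab = ["hello", "play", "fly", "eat", "friend", "snow"]
counts = {
    "Pororo": np.array([4.0, 6.0, 5.0, 1.0, 3.0, 2.0]),
    "Crong":  np.array([2.0, 3.0, 0.0, 6.0, 4.0, 1.0]),
}
concepts = {c: v / v.sum() for c, v in counts.items()}

def generate_subtitle(character, length=3):
    """Pick the most probable words under the character's concept
    distribution -- a stand-in for MuCH's query-driven generation."""
    p = concepts[character]
    top = np.argsort(p)[::-1][:length]
    return [vocab[i] for i in sorted(top)]
```

In the actual model these distributions are learned jointly from subtitle words and image patches via Bayesian updating, so the same query mechanism is conditioned on visual evidence as well.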

A Study on Relationship Between Instructor's Image Perceived by Elderly Participating in Life Sports and Participation Motive (생활체육 참여 노인이 인식하는 지도자의 이미지와 참여동기의 관계)

  • Son, Ji-Young
    • The Journal of the Korea Contents Association
    • /
    • v.19 no.5
    • /
    • pp.371-380
    • /
    • 2019
  • To identify the relationship between instructors' image and participation motive, this study surveyed 293 elderly people over 65 at six sports centers located in the Seoul, Gyeonggi, and Incheon regions. The results were as follows. First, for instructor image by demographic characteristic, gender showed a significant difference in instructor's talent, while age showed significant differences in instructor's talent, instructing image, and living image. The sports type also differed significantly in all subfactors of instructor image. Second, for participation motive by demographic characteristic, gender showed a significant difference in internal motive, while age and sports type showed significant differences in all subfactors of participation motive. Third, instructor image had a significant influence on both internal and external motive: instructing image and human-relationship image significantly affected internal motive, while instructor's talent, living image, and human-relationship image significantly affected external motive. Thus, the results show that the instructor's image is a significant variable in enhancing the elderly's motive for participating in life sports.

A Study on the Image Preprocessing Model Linkage Method for Usability of Pix2Pix (Pix2Pix의 활용성을 위한 학습이미지 전처리 모델연계방안 연구)

  • Kim, Hyo-Kwan;Hwang, Won-Yong
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.15 no.5
    • /
    • pp.380-386
    • /
    • 2022
  • This paper proposes a method for structuring the preprocessing of training images when colorization is applied using Pix2Pix, one of the adversarial generative neural network techniques. The paper focuses on the fact that the prediction result can be degraded depending on the degree of light reflection in the training image. Therefore, image preprocessing and parameters for model optimization were configured before the model was applied. Increasing the image resolution of the training and prediction results requires modifying the model, so this part is designed to be tuned with parameters. In addition, logic that processes only the part of the prediction result degraded by light reflection is configured, together with preprocessing logic that does not distort the prediction result. To improve usability, accuracy was improved through experiments on applying a light-reflection tuning filter to the training images of the Pix2Pix model and on the parameter configuration.
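The paper does not publish the filter itself; as one plausible sketch of a "light-reflection tuning" preprocessing step (the threshold, window size, and grayscale simplification are assumptions), over-bright specular pixels can be detected and replaced with a local mean so they do not dominate training:

```python
import numpy as np

def suppress_highlights(img, thresh=0.9, win=3):
    """Replace over-bright (light-reflection) pixels in a grayscale
    image with the mean of their non-highlight neighbors -- a crude
    stand-in for the paper's light-reflection tuning filter."""
    out = img.astype(float).copy()
    h, w = img.shape
    r = win // 2
    pad = np.pad(out, r, mode="edge")
    for i in range(h):
        for j in range(w):
            if out[i, j] > thresh:
                patch = pad[i:i + win, j:j + win]
                keep = patch[patch <= thresh]    # ignore other highlights
                if keep.size:
                    out[i, j] = keep.mean()
    return out
```

Only the flagged pixels are touched, which matches the paper's goal of correcting the reflection-damaged region without distorting the rest of the prediction.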

Performance Improvement of Image-to-Image Translation with RAPGAN and RRDB (RAPGAN와 RRDB를 이용한 Image-to-Image Translation의 성능 개선)

  • Dongsik Yoon;Noyoon Kwak
    • Journal of Internet of Things and Convergence
    • /
    • v.9 no.1
    • /
    • pp.131-138
    • /
    • 2023
  • This paper concerns performance improvement of image-to-image translation using a Relativistic Average Patch GAN and Residual in Residual Dense Blocks. Its purpose is to improve performance through technical changes in three aspects that compensate for the shortcomings of pix2pix, an earlier image-to-image translation model. First, unlike the original pix2pix generator, it enables deeper learning by using Residual in Residual Dense Blocks in the part that encodes the input image. Second, because a loss function based on the Relativistic Average Patch GAN is used to predict how real the original image is compared to the generated image, both images affect the adversarial generative learning. Finally, the generator is pre-trained to prevent the discriminator from being trained prematurely. With the proposed method, it was possible to generate images that surpass those of pix2pix by more than 13% on average in terms of FID.
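The relativistic-average discriminator objective described above can be written down directly; this is the standard RaGAN-style formulation applied to per-patch scores, not the authors' exact code, and the score arrays stand in for discriminator outputs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rapgan_d_loss(d_real, d_fake):
    """Relativistic average (patch) discriminator loss: each real patch
    score is judged relative to the *average* fake patch score and
    vice versa, so both real and generated images drive the
    adversarial learning."""
    real_rel = d_real - d_fake.mean()
    fake_rel = d_fake - d_real.mean()
    eps = 1e-12
    return float(-np.mean(np.log(sigmoid(real_rel) + eps))
                 - np.mean(np.log(1.0 - sigmoid(fake_rel) + eps)))
```

When real patches score well above the average fake patch (and vice versa) the loss approaches zero; with indistinguishable scores it sits near 2·log 2, so the discriminator is rewarded only for *relative* realism.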

A study on age distortion reduction in facial expression image generation using StyleGAN Encoder (StyleGAN Encoder를 활용한 표정 이미지 생성에서의 연령 왜곡 감소에 대한 연구)

  • Hee-Yeol Lee;Seung-Ho Lee
    • Journal of IKEEE
    • /
    • v.27 no.4
    • /
    • pp.464-471
    • /
    • 2023
  • In this paper, we propose a method to reduce age distortion in facial-expression image generation using StyleGAN Encoder. The generation process first creates a face image with StyleGAN Encoder and then changes the expression by applying a boundary, learned with an SVM, to the latent vector. However, when the boundary for a smiling expression is learned, age distortion occurs: the smile boundary produced by SVM training includes the wrinkles caused by the expression change as learned features, so age characteristics are learned along with it. To solve this problem, the proposed method computes the correlation coefficient between the smile boundary and the age boundary and adjusts the smile boundary by the age boundary in proportion to that coefficient. To confirm the effectiveness of the proposed method, experiments were conducted on the FFHQ dataset, a publicly available standard face dataset, measuring FID scores. For smile images, the FID score between the ground truth and the images generated by the proposed method improved by about 0.46 over the existing method, and the FID score between the StyleGAN Encoder images and the generated smile images improved by about 1.031. For non-smile images, the corresponding FID scores improved by about 2.25 and about 1.908, respectively. In addition, when the age of each generated facial-expression image was estimated and the MSE against the age estimated from the StyleGAN Encoder image was measured, the proposed method improved the result by about 1.5 on average for smile images and about 1.63 for non-smile images compared to the existing method, demonstrating its effectiveness.
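The boundary adjustment described above amounts to removing the age component from the smile direction; a minimal sketch (using cosine similarity as the correlation coefficient on unit-normalized boundaries, which is an assumption about the paper's exact definition) is:

```python
import numpy as np

def adjust_smile_boundary(b_smile, b_age):
    """Subtract the age boundary from the smile boundary in proportion
    to their correlation coefficient, so traversing the adjusted
    boundary changes the expression without shifting apparent age."""
    b_smile = b_smile / np.linalg.norm(b_smile)
    b_age = b_age / np.linalg.norm(b_age)
    rho = float(b_smile @ b_age)           # correlation coefficient
    b_adj = b_smile - rho * b_age
    return b_adj / np.linalg.norm(b_adj)
```

With unit vectors this is exactly the projection-removal step: the adjusted boundary is orthogonal to the age boundary, so moving a latent along it leaves the age direction untouched.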

A novel Node2Vec-based 2-D image representation method for effective learning of cancer genomic data (암 유전체 데이터를 효과적으로 학습하기 위한 Node2Vec 기반의 새로운 2 차원 이미지 표현기법)

  • Choi, Jonghwan;Park, Sanghyun
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2019.05a
    • /
    • pp.383-386
    • /
    • 2019
  • The Fourth Industrial Revolution has drawn worldwide attention to smart cities and personalized medicine for healthy living, and machine learning in particular is widely used in genome-based precision medicine research to overcome cancer, enabling prognosis prediction for cancer patients and prognosis-tailored treatment strategies. However, the gene expression data mainly used in cancer prognosis studies contain about 17,000 genes but only about 200 samples, which makes it difficult for a neural network prognosis model to generalize. To address this problem, this study proposes a technique that represents high-dimensional gene expression data as 2D images so that neural network models can learn them effectively. A 1D gene vector of length 17,000 is mapped onto a 64 × 64 2D image, compressing the input size. Gene network data and Node2Vec were used to obtain the gene coordinates on the 2D plane, and a convolutional neural network was used for image-based cancer prognosis prediction. For rigorous evaluation, model selection and assessment were performed with nested cross-validation and random search, and the results confirmed that the proposed method achieves higher prediction accuracy than the baseline, a multilayer perceptron that takes the high-dimensional gene vector as input.
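The embedding-to-image step can be sketched as a binning of 2D gene coordinates into a pixel grid; how the paper resolves collisions (17,000 genes into 4,096 pixels) is not stated, so averaging colliding genes here is an assumption, and the coordinates and values below are synthetic.

```python
import numpy as np

def embed_to_image(coords, values, size=64):
    """Map genes with 2-D embedding coordinates (e.g. from Node2Vec on
    a gene network) onto a size x size image whose pixel intensities
    are the genes' expression values; colliding genes are averaged."""
    img = np.zeros((size, size))
    cnt = np.zeros((size, size))
    lo, hi = coords.min(axis=0), coords.max(axis=0)
    ij = ((coords - lo) / (hi - lo + 1e-12) * (size - 1)).astype(int)
    for (i, j), v in zip(ij, values):
        img[i, j] += v
        cnt[i, j] += 1
    return np.where(cnt > 0, img / np.maximum(cnt, 1), 0.0)
```

Because Node2Vec places interacting genes near each other, nearby pixels carry related expression values, which is what lets a convolutional network exploit local structure in the resulting image.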