• Title/Abstract/Keyword: Image model

6,498 search results (0.029 seconds)

A Research on Aesthetic Aspects of Checkpoint Models in [Stable Diffusion]

  • Ke Ma;Jeanhun Chung
    • International Journal of Advanced Smart Convergence
    • /
    • Vol. 13, No. 2
    • /
    • pp.130-135
    • /
    • 2024
  • The Stable Diffusion AI tool is popular among designers because of its flexible and powerful image-generation capabilities. However, because of the diversity of its AI models, much time must be spent testing different models against different design briefs, so choosing a suitable general-purpose model has become a significant problem. In this paper, by comparing AI images generated by two different Stable Diffusion models, the advantages and disadvantages of each model are analyzed in terms of how well the generated image matches the prompt and the image's color and light composition, in order to identify a general-purpose model whose output is aesthetically pleasing, so that designers can obtain a satisfactory AI image without cumbersome steps. The results show that the Playground V2.5 model can serve as a general-purpose model, offering both aesthetic and design quality across various style requirements. As a result, content designers can focus more on creative content development, and further groundbreaking technologies merging generative AI with content design can be expected.

전이학습 기반 사출 성형품 burr 이미지 검출 시스템 개발 (Development of a transfer learning based detection system for burr image of injection molded products)

  • 양동철;김종선
    • Design & Manufacturing
    • /
    • Vol. 15, No. 3
    • /
    • pp.1-6
    • /
    • 2021
  • An artificial neural network model based on a deep learning algorithm is known to outperform humans in image classification, but a limitation remains in that a large amount of training data, so-called big data, is required. Therefore, various techniques are being studied to build high-precision neural network models even with small datasets, and transfer learning is assessed as an excellent alternative. The purpose of this study is thus to develop a neural network system that classifies burr images of light guide plate products with 99% accuracy using transfer learning. Specifically, 150 images each of normal and burr light guide plate products were taken at various angles, heights, and positions. After preprocessing such as thresholding and image augmentation, a total of 3,300 images were generated; 2,970 were set aside for training and the remaining 330 for accuracy testing. For transfer learning, a base model was built on NASNet-Large, pre-trained on the 14-million-image ImageNet dataset. The final accuracy test confirmed 99% classification accuracy on both training and test images. Based on these results, training on various defect images beyond burrs is expected to help develop an integrated AI production management system.
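The core idea above, freezing a pre-trained backbone and training only a small classification head, can be sketched with numpy alone. Here a fixed random projection stands in for the frozen NASNet-Large feature extractor, and a logistic-regression head is trained on two toy "normal vs. burr" clusters; all names and dimensions are hypothetical, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def frozen_features(images, W_base):
    """Stand-in for a frozen pre-trained backbone (e.g. NASNet-Large):
    a fixed random projection followed by ReLU. W_base is never updated."""
    return np.maximum(images @ W_base, 0.0)

def train_head(feats, labels, epochs=300, lr=0.1):
    """Train only the classification head (logistic regression) on top
    of the frozen features -- the essence of transfer learning."""
    w = np.zeros(feats.shape[1])
    for _ in range(epochs):
        z = np.clip(feats @ w, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-z))
        w -= lr * feats.T @ (p - labels) / len(labels)
    return w

# Toy data: two separable "image" classes (normal vs. burr).
X = np.vstack([rng.normal(0.0, 1.0, (50, 8)), rng.normal(2.0, 1.0, (50, 8))])
y = np.array([0] * 50 + [1] * 50)

W_base = rng.normal(size=(8, 16))      # frozen backbone weights
F = frozen_features(X, W_base)
w_head = train_head(F, y)
pred = (F @ w_head > 0.0).astype(int)
accuracy = (pred == y).mean()
```

Only `w_head` is learned; the backbone weights stay fixed, which is why the approach works with small datasets such as the 150 images per class used here.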

Bi-GRU 이미지 캡션의 서술 성능 향상을 위한 Parallel Injection 기법 연구 (Parallel Injection Method for Improving Descriptive Performance of Bi-GRU Image Captions)

  • 이준희;이수환;태수호;서동환
    • 한국멀티미디어학회논문지
    • /
    • Vol. 22, No. 11
    • /
    • pp.1223-1232
    • /
    • 2019
  • Injection is the method by which the image feature vector is passed from the encoder to the decoder. Since the image feature vector contains object details such as color and texture, it is essential for generating image captions. However, a bidirectional decoder using the existing injection method receives the image feature vector only at the first step, so its influence vanishes over the backward sequence, which makes it difficult to describe the context in detail. Therefore, in this paper we propose a parallel injection method to improve the descriptive performance of image captions. The proposed method fuses the image vector with every embedding to preserve context. We also build the caption model on a Bidirectional Gated Recurrent Unit (Bi-GRU) to reduce the decoder's computation. To validate the proposed model, experiments were conducted on a standard image-caption dataset, and the model outperformed recent models on BLEU and METEOR scores, improving BLEU by up to 20.2 points and METEOR by up to 3.65 points over the existing caption model.
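The difference between the two injection schemes is purely about where the image vector enters the decoder input sequence. A shape-level numpy sketch (with hypothetical dimensions, not the paper's) makes the contrast concrete:

```python
import numpy as np

rng = np.random.default_rng(1)
T, d_word, d_img = 5, 16, 32            # caption length and feature dims (hypothetical)

words = rng.normal(size=(T, d_word))    # word embeddings for one caption
img = rng.normal(size=(d_img,))         # encoder (CNN) image feature vector

# Initial injection (baseline): the image vector enters only at t = 0,
# so the backward pass of a Bi-GRU sees it last and its influence fades.
initial_injection = [np.concatenate([words[0], img])] + \
                    [np.concatenate([words[t], np.zeros(d_img)]) for t in range(1, T)]

# Parallel injection (proposed): fuse the image vector with EVERY word
# embedding, so both directions of the Bi-GRU get image context at each step.
parallel_injection = [np.concatenate([words[t], img]) for t in range(T)]

step_dim = d_word + d_img               # decoder input size per timestep
```

Each decoder step now receives a `d_word + d_img` vector, so the image information is preserved symmetrically for the forward and backward GRU directions.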

이미지-텍스트 쌍을 활용한 이미지 분류 정확도 향상에 관한 연구 (A Study on Improvement of Image Classification Accuracy Using Image-Text Pairs)

  • 김미희;이주혁
    • 전기전자학회논문지
    • /
    • Vol. 27, No. 4
    • /
    • pp.561-566
    • /
    • 2023
  • Advances in deep learning have enabled a wide range of computer vision research. Deep learning has shown high accuracy and strong performance in image processing, but most image-processing approaches rely only on the visual information in the image. When image-text pairs are used, text data such as descriptions and annotations associated with an image can provide additional context that is hard to obtain from the image alone. In this paper, we propose a deep learning model that analyzes both images and text using image-text pairs. The proposed model achieved classification accuracy about 11% higher than a deep learning model using image information alone.
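The benefit of pairing modalities can be illustrated with a minimal late-fusion sketch: when the discriminative signal lives partly in the text features, a classifier over concatenated image+text features beats an image-only one. The feature construction and classifier below are toy stand-ins, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200

# Toy setup: the class signal lives in the "text" features, so an
# image-only classifier underperforms a fused image+text classifier.
y = rng.integers(0, 2, n)
img_feats = rng.normal(size=(n, 4))                     # no class signal
txt_feats = rng.normal(size=(n, 4)) + 1.5 * y[:, None]  # carries the signal

def centroid_accuracy(X, y):
    """Nearest-class-centroid classifier: the simplest fusion baseline."""
    c0, c1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    d0 = ((X - c0) ** 2).sum(axis=1)
    d1 = ((X - c1) ** 2).sum(axis=1)
    return ((d1 < d0).astype(int) == y).mean()

acc_image_only = centroid_accuracy(img_feats, y)
acc_fused = centroid_accuracy(np.hstack([img_feats, txt_feats]), y)
```

Concatenation (late fusion) is the simplest way to combine modalities; the paper's deep model learns the fusion instead, but the accuracy gap has the same origin.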

싱글 야외 영상에서 계층적 이미지 트리 모델과 k-평균 세분화를 이용한 날씨 분류와 안개 검출 (Weather Classification and Fog Detection using Hierarchical Image Tree Model and k-mean Segmentation in Single Outdoor Image)

  • 박기홍
    • 디지털콘텐츠학회 논문지
    • /
    • Vol. 18, No. 8
    • /
    • pp.1635-1640
    • /
    • 2017
  • In this paper, we define a hierarchical image tree model for weather classification from a single outdoor image and propose a classification algorithm that uses image brightness and k-means segmentation. The first level of the tree distinguishes indoor from outdoor images; the second level determines whether an outdoor image is daytime, nighttime, or sunrise/sunset, using the brightness image and the k-means segmented image. At the final level, images classified as daytime are further estimated to be clear or foggy based on an edge map and a fog rate. Experimental results confirmed that the weather classification performed as designed and that the proposed method effectively detects weather features in a given image.
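The brightness-plus-k-means step at the second tree level can be sketched in a few lines of numpy: cluster pixel brightness values with k = 2 and use the overall mean brightness as a coarse day/night cue. The synthetic brightness data and the 0.4 threshold are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def kmeans_1d(values, k=2, iters=20, seed=0):
    """Plain k-means on scalar pixel-brightness values."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(values, size=k, replace=False).astype(float)
    for _ in range(iters):
        # Assign each pixel to its nearest center, then recompute centers.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    return np.sort(centers)

# Synthetic "outdoor image" brightness: a dark ground region (~0.2) and a
# bright sky region (~0.8). The k-means centers expose the two dominant
# regions; the mean brightness gives a coarse day/night decision.
rng = np.random.default_rng(3)
brightness = np.concatenate([rng.normal(0.2, 0.05, 500),
                             rng.normal(0.8, 0.05, 500)])
centers = kmeans_1d(brightness, k=2)
is_daytime = brightness.mean() > 0.4        # toy threshold, not the paper's
```

On a real image, the segmented regions (rather than raw pixels) would feed the later tree levels, e.g. the sky region for the fog test.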

A Model-Based Image Steganography Method Using Watson's Visual Model

  • Fakhredanesh, Mohammad;Safabakhsh, Reza;Rahmati, Mohammad
    • ETRI Journal
    • /
    • Vol. 36, No. 3
    • /
    • pp.479-489
    • /
    • 2014
  • This paper presents a model-based image steganography method based on Watson's visual model. Model-based steganography assumes a model for cover-image statistics. This approach, however, has some weaknesses, including perceptual detectability. We propose to use Watson's visual model to improve the perceptual undetectability of model-based steganography. The proposed method prevents visually perceptible changes during embedding. First, the maximum acceptable change in each discrete cosine transform coefficient is extracted based on Watson's visual model. Then, a model is fitted to a low-precision histogram of such coefficients and the message bits are encoded to this model. Finally, the encoded message bits are embedded in those coefficients whose maximum possible changes are visually imperceptible. Experimental results show that changes resulting from the proposed method are perceptually undetectable, whereas model-based steganography retains perceptually detectable changes. This perceptual undetectability is achieved while the perceptual quality (based on the structural similarity measure) and the security (based on two steganalysis methods) show no significant change.
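The first step, deriving a per-DCT-coefficient "slack" (maximum imperceptible change), can be sketched as follows. This is a deliberately simplified version: it keeps only Watson's luminance-masking term (brighter blocks tolerate larger changes) with assumed constants, whereas the full model also uses an 8x8 frequency-sensitivity table and contrast masking.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix for n-point transforms."""
    k = np.arange(n)
    M = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    M *= np.sqrt(2.0 / n)
    M[0] /= np.sqrt(2.0)
    return M

D = dct_matrix(8)

def block_dct(block):
    """2-D DCT of an 8x8 pixel block via the separable matrix form."""
    return D @ block @ D.T

def perceptual_slack(block, t_base=4.0, alpha=0.65, c00_ref=1024.0):
    """Simplified Watson-style slack: a baseline just-noticeable threshold
    scaled by luminance masking. t_base, alpha, c00_ref are assumed
    constants, not values from the paper."""
    c = block_dct(block)
    lum = max(c[0, 0], 1e-6)                 # DC coefficient ~ block brightness
    slack = t_base * (lum / c00_ref) ** alpha
    return np.full((8, 8), slack), c

# A dark block tolerates less change than a bright block.
slack_dark, _ = perceptual_slack(np.full((8, 8), 64.0))
slack_bright, _ = perceptual_slack(np.full((8, 8), 224.0))
```

In the paper's pipeline, message bits would then be embedded only into coefficients whose required change stays below this slack, keeping the embedding below the visibility threshold.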

A Comparison of the Rudin-Osher-Fatemi Total Variation model and the Nonlocal Means Algorithm

  • ;최흥국
    • 한국멀티미디어학회:학술대회논문집
    • /
    • 한국멀티미디어학회 2012년도 춘계학술발표대회논문집
    • /
    • pp.6-9
    • /
    • 2012
  • In this study, we compare two image-denoising methods, the Rudin-Osher-Fatemi total variation (TV) model and the nonlocal means (NLM) algorithm, on medical images. To evaluate the methods, we used two well-known quality metrics. The methods were tested on one CT image, one X-ray image, and three MRI images. Experimental results show that the NLM algorithm gives better results than the ROF TV model, but its computational complexity is high.
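The ROF TV model minimizes a data-fidelity term plus the total variation of the image; a minimal numpy sketch with smoothed-TV gradient descent shows the idea (the parameters and boundary handling are simplifications, and the paper's NLM comparator, which averages similar patches instead, is omitted for brevity):

```python
import numpy as np

def tv_denoise(noisy, lam=0.1, step=0.1, iters=100, eps=1e-6):
    """Gradient descent on the (smoothed) ROF objective
        min_u  0.5 * ||u - f||^2 + lam * TV(u),
    with TV approximated by sum sqrt(|grad u|^2 + eps) for differentiability."""
    u = noisy.copy()
    for _ in range(iters):
        gx = np.diff(u, axis=1, append=u[:, -1:])   # forward differences
        gy = np.diff(u, axis=0, append=u[-1:, :])
        mag = np.sqrt(gx**2 + gy**2 + eps)
        px, py = gx / mag, gy / mag                 # normalized gradient field
        div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
        u -= step * ((u - noisy) - lam * div)       # descend the ROF gradient
    return u

# Piecewise-constant test image (an edge) with additive Gaussian noise.
rng = np.random.default_rng(4)
clean = np.zeros((32, 32)); clean[:, 16:] = 1.0
noisy = clean + rng.normal(0, 0.2, clean.shape)
denoised = tv_denoise(noisy)
mse_noisy = ((noisy - clean) ** 2).mean()
mse_denoised = ((denoised - clean) ** 2).mean()
```

TV regularization suppresses noise while keeping the sharp edge, which is why it remains a standard baseline; NLM typically scores better, at a much higher per-pixel cost.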


Noise PDF Analysis of Nonlinear Image Sensor Model: GOCI Case

  • Myung, Hwan-Chun;Youn, Heong-Sik
    • 대한원격탐사학회:학술대회논문집
    • /
    • 대한원격탐사학회 2007년도 Proceedings of ISRS 2007
    • /
    • pp.191-194
    • /
    • 2007
  • The paper identifies all the noise sources of the CMOS image sensor with which the GOCI (Geostationary Ocean Color Imager) is equipped and analyzes their contribution to a nonlinear image-sensor model. In particular, the noise PDF (probability density function) is derived in terms of the sensor-gain coefficients, a linear and a nonlinear gain. As a result, the relation between the noise characteristics and the sensor gains is studied.
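A toy Monte Carlo version of such a nonlinear sensor model illustrates why the noise PDF depends on the gain coefficients: Poisson shot noise is signal-dependent, and the quadratic gain term reshapes its distribution. All parameter values below are illustrative assumptions, not GOCI's.

```python
import numpy as np

rng = np.random.default_rng(5)

def sensor_output(photons, g1=1.0, g2=5e-4, read_sigma=2.0, n=200_000):
    """Toy nonlinear image-sensor model: Poisson shot noise on the photon
    count, a linear + quadratic gain (g1, g2), and additive Gaussian read
    noise. Returns n Monte Carlo samples of the sensor output."""
    e = rng.poisson(photons, n).astype(float)        # shot noise
    return g1 * e + g2 * e**2 + rng.normal(0, read_sigma, n)

low = sensor_output(100.0)
high = sensor_output(1000.0)
# Shot noise is signal-dependent: output spread grows with photon count,
# and the nonlinear gain amplifies that growth.
var_low, var_high = low.var(), high.var()
```

Histogramming `low` and `high` would show the empirical noise PDFs whose analytic form, as a function of g1 and g2, is what the paper derives.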


구조적 왜곡특성 측정을 이용한 블록기반 DCT 영상 부호화기의 객관적 화질평가 (Objective Image Quality Metric for Block-Based DCT Image Coder Using Structural Distortion Measurement)

  • 정태윤
    • 대한전기학회논문지:시스템및제어부문D
    • /
    • Vol. 52, No. 7
    • /
    • pp.434-441
    • /
    • 2003
  • This paper proposes a new quantitative, objective image quality metric, which is essential for verifying the performance of block-based DCT image coding. The proposed metric considers not only global distortion of the coded image, such as spatial frequency sensitivity and channel masking using an HVS-based multi-channel model, but also the structural distortions caused by block-based coding. Experimental results show a strong correlation between the proposed metric and subjective quality scores.
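The structural-distortion part of such a metric exploits the fact that block coders create discontinuities exactly at 8-pixel boundaries. A minimal numpy cue, mean gradient across block boundaries normalized by the mean gradient inside blocks, captures this; it is a sketch of the idea only, whereas the paper combines such cues with an HVS multi-channel model.

```python
import numpy as np

def blockiness(img, B=8):
    """Blocking-artifact cue: mean absolute difference across block
    boundaries divided by the mean absolute difference inside blocks.
    Values near 1 mean no visible block structure; larger values mean
    stronger blocking artifacts."""
    d = np.abs(np.diff(img, axis=1))                 # horizontal gradients
    cols = np.arange(d.shape[1])
    at_boundary = (cols % B) == (B - 1)              # diffs straddling a block edge
    inside = d[:, ~at_boundary].mean() + 1e-12
    return d[:, at_boundary].mean() / inside

# A smooth ramp has no block structure; coarsely coding it in 8-pixel-wide
# blocks (here: replacing each block by its mean) introduces one.
x = np.tile(np.linspace(0, 255, 64), (64, 1))
blocky = np.repeat(x.reshape(64, 8, 8).mean(axis=2), 8, axis=1)
score_smooth = blockiness(x)
score_blocky = blockiness(blocky)
```

A full metric would also apply vertical gradients and weight the cue by HVS sensitivity so that blockiness in smooth, visible regions counts more than in textured ones.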
