• Title/Summary/Keyword: synthesis models


Multicontents Integrated Image Animation within Synthesis for High Quality Multimodal Video (고화질 멀티 모달 영상 합성을 통한 다중 콘텐츠 통합 애니메이션 방법)

  • Jae Seung Roh;Jinbeom Kang
    • Journal of Intelligence and Information Systems / v.29 no.4 / pp.257-269 / 2023
  • Demand is growing for image synthesis from photos and videos using deep learning models. Existing video synthesis models extract only motion information from the provided video to animate a photo. However, these models struggle to achieve accurate lip synchronization with the audio and to maintain the image quality of the synthesized output. To tackle these issues, this paper introduces a novel framework based on an image animation approach. Given a photo, a video, and an audio input, the framework produces an output that retains the unique characteristics of the individuals in the photo, synchronizes their movements with the provided video, and synchronizes their lips with the audio. Furthermore, a super-resolution model is employed to enhance the quality and resolution of the synthesized output.

Entity Embeddings for Enhancing Feasible and Diverse Population Synthesis in a Deep Generative Models (심층 생성모델 기반 합성인구 생성 성능 향상을 위한 개체 임베딩 분석연구)

  • Donghyun Kwon;Taeho Oh;Seungmo Yoo;Heechan Kang
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.22 no.6 / pp.17-31 / 2023
  • An activity-based model requires detailed population information to model individual travel behavior in a disaggregated manner. A recent approach developed deep generative models with novel regularization terms that improve fidelity and diversity in population synthesis. Since the method relies on measuring the distance between the distribution boundaries of the sample data and the generated samples, it is crucial to obtain a well-defined continuous representation from the discretized dataset. Therefore, we propose improved entity embedding models to enhance the performance of the regularization terms, which indirectly supports the synthesis of feasible and diverse populations. Our results show a 28.87% improvement in the F1 score compared to the baseline method.

'Hanmal' Korean Language Diphone Database for Speech Synthesis

  • Chung, Hyun-Song
    • Speech Sciences / v.12 no.1 / pp.55-63 / 2005
  • This paper introduces the 'Hanmal' Korean language diphone database for speech synthesis, which has been publicly available on the MBROLA website since 1999 but has never been properly published in a journal. The diphone database is compatible with the MBROLA programme of high-quality multilingual speech synthesis systems. The paper presents the usefulness of the diphone database and describes its phonetic and phonological structure, showing the process of creating the text corpus. A machine-readable Korean SAMPA convention for the control data input to the MBROLA application is also suggested. Diphone concatenation and prosody manipulation are performed using the MBR-PSOLA algorithm. A set of segment duration models can be applied to the diphone synthesis of Korean.
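
To make the idea of diphone concatenation concrete, here is a minimal sketch of how a phoneme sequence maps to the diphone units that a synthesizer would look up and join. The function name `to_diphones`, the use of `_` as the silence symbol, and the sample phoneme labels are illustrative assumptions; they are not the Korean SAMPA inventory proposed in the paper.

```python
def to_diphones(phonemes, silence="_"):
    """Pad a phoneme sequence with silence and name each adjacent pair.

    Each diphone spans from the middle of one phoneme to the middle of
    the next, so an utterance of N phonemes needs N + 1 units.
    """
    padded = [silence, *phonemes, silence]
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


if __name__ == "__main__":
    # Placeholder phoneme labels, not the paper's Korean SAMPA symbols.
    print(to_diphones(["h", "a", "n"]))
```

A real system would then fetch the recorded segment for each unit name and smooth the joins, which is the step the abstract attributes to MBR-PSOLA.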


The μ-synthesis and analysis of water level control in steam generators

  • Salehi, Ahmad;Kazemi, Mohammad Hosein;Safarzadeh, Omid
    • Nuclear Engineering and Technology / v.51 no.1 / pp.163-169 / 2019
  • This paper addresses robust controller synthesis and analysis for the water level process in the U-tube steam generator (UTSG). The parameter uncertainties of the steam generator (SG) are modeled as multiplicative perturbations, which are normalized by designing suitable weighting functions. The relative errors of the nominal SG model with respect to the models at the other operating power levels are used to specify the weighting functions that normalize the plant uncertainties. A robust controller is then designed based on $\mu$-synthesis and D-K iteration, and its stability robustness is verified over the whole range of power operations. A gain-scheduled controller based on $H_{\infty}$-synthesis is also designed so that its robustness can be compared with that of the proposed controller. The stability analysis is performed and compared with the previous QFT design. The $\mu$-analysis of the system shows that the proposed controller has favorable stability robustness over the whole range of operating power conditions. The response of the proposed controller is simulated for power level deviations in the start-up and shutdown stages and compared with the other controllers considered.
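
The uncertainty description and robustness test summarized above follow a standard pattern, sketched here in generic notation. The symbols $G_0$ (nominal SG model), $G_i$ (model at the $i$-th operating power level), $W$ (weighting function), $\Delta$ (normalized perturbation), and $M$ (the closed-loop transfer matrix seen by $\Delta$) are assumptions chosen for illustration and are not taken from the paper.

$$
G_i(s) = G_0(s)\,\bigl(1 + W(s)\,\Delta(s)\bigr), \qquad \|\Delta\|_\infty \le 1,
$$

$$
|W(j\omega)| \;\ge\; \max_i \left| \frac{G_i(j\omega) - G_0(j\omega)}{G_0(j\omega)} \right| \quad \text{for all } \omega,
$$

$$
\text{robust stability (given nominal stability)} \iff \sup_{\omega}\, \mu_{\Delta}\bigl(M(j\omega)\bigr) < 1 .
$$

The second line is the sense in which the relative errors between the nominal model and the other power-level models fix the weighting function: $W$ must bound the largest relative error at every frequency.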

Singing Voice Synthesis Using HMM Based TTS and MusicXML (HMM 기반 TTS와 MusicXML을 이용한 노래음 합성)

  • Khan, Najeeb Ullah;Lee, Jung-Chul
    • Journal of the Korea Society of Computer and Information / v.20 no.5 / pp.53-63 / 2015
  • Singing voice synthesis is the computer generation of a song from its lyrics and musical notes. Hidden Markov models (HMMs) have proved to be the models of choice for text-to-speech synthesis. HMMs have also been used in singing voice synthesis research; however, training HMMs for singing voice synthesis requires a huge database. In addition, commercially available singing voice synthesis systems use piano roll notation and would need to adopt easy-to-read standard music notation to be suitable for singing learning applications. To overcome these problems, we use a speech database to train context-dependent HMMs for singing voice synthesis. Pitch and duration control methods have been devised to modify the parameters of the HMMs trained on speech so that they can serve as the synthesis units for the singing voice. This work describes a singing voice synthesis system that uses a MusicXML-based music score editor as the front-end interface for entering the notes and lyrics to be synthesized and an HMM-based text-to-speech synthesis system as the back-end synthesizer. A perceptual test shows the feasibility of the proposed system.
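
As a rough illustration of what the front end has to hand to the pitch and duration control stage, the sketch below reads pitched notes from a MusicXML fragment and converts each to a target frequency and a duration in seconds. It assumes equal temperament with A4 = 440 Hz; the function name `note_targets` and the idea of passing `divisions` and `tempo_bpm` as arguments are illustrative choices, not the paper's implementation.

```python
import xml.etree.ElementTree as ET

# Semitone offset of each note letter above C within an octave.
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_targets(measure_xml, divisions, tempo_bpm):
    """Return (lyric, frequency_hz, seconds) for each pitched note.

    divisions: MusicXML duration units per quarter note.
    tempo_bpm: quarter notes per minute.
    """
    targets = []
    for note in ET.fromstring(measure_xml).iter("note"):
        pitch = note.find("pitch")
        if pitch is None:  # rests have no pitch element
            continue
        step = pitch.findtext("step")
        alter = float(pitch.findtext("alter", default="0"))
        octave = int(pitch.findtext("octave"))
        midi = 12 * (octave + 1) + SEMITONES[step] + alter
        freq = 440.0 * 2 ** ((midi - 69) / 12)  # equal temperament
        quarters = int(note.findtext("duration")) / divisions
        seconds = quarters * 60.0 / tempo_bpm
        targets.append((note.findtext("lyric/text"), freq, seconds))
    return targets


EXAMPLE = """
<measure number="1">
  <note>
    <pitch><step>A</step><octave>4</octave></pitch>
    <duration>2</duration>
    <lyric><text>la</text></lyric>
  </note>
  <note><rest/><duration>1</duration></note>
</measure>
"""

if __name__ == "__main__":
    for lyric, freq, seconds in note_targets(EXAMPLE, divisions=1, tempo_bpm=120):
        print(lyric, round(freq, 2), seconds)
```

In a system like the one described, these per-note targets would drive the modification of HMM pitch and state-duration parameters rather than being rendered directly.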

A simulation study on synthesis gas process optimization for FT(Fischer-Tropsch) synthesis (FT(Fischer-Tropsch) 합성유 제조를 위한 합성가스 공정 최적화 연구)

  • Kim, Yong-Heon;Lee, Won-Su;Lee, Heoung-Yeoun;Koo, Kee-Young;Song, In-Kyu
    • 한국신재생에너지학회:학술대회논문집 / 2009.06a / pp.888-888 / 2009
  • A simulation study of the SCR (steam carbon dioxide reforming) process in the gas-to-liquid (GTL; natural gas to Fischer-Tropsch synthetic fuel) process was carried out to find the optimum reaction conditions for the GTL process. Optimum SCR operating conditions for supplying synthesis gas to the FT (Fischer-Tropsch) process were determined by varying reaction variables such as feed temperature and pressure. During the simulation, the overall synthesis process was assumed to proceed under steady-state conditions, and the physical properties of the reaction medium were assumed to be governed by the RKS (Redlich-Kwong-Soave) equation. The SCR process was taken as the reaction model for synthesis gas production in the GTL process. The effects of temperature and pressure on the $H_2$/CO ratio of the SCR process and the effect of reaction pressure on the SCR reaction were the main subjects examined. The simulation results were compared with experimental results to confirm the reliability of the simulation model, and the two matched reasonably well.
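
As a rough illustration of why the $H_2$/CO ratio is the quantity tracked above, the idealized stoichiometry of combined steam and carbon dioxide reforming of methane can be written out. This is textbook stoichiometry, not the paper's simulation model; the split fraction $x$ is a hypothetical symbol, and side reactions such as the water-gas shift are ignored.

$$
\begin{aligned}
\mathrm{CH_4 + H_2O} &\rightarrow \mathrm{CO + 3\,H_2} && \text{(steam reforming, fraction } x \text{ of the methane)} \\
\mathrm{CH_4 + CO_2} &\rightarrow \mathrm{2\,CO + 2\,H_2} && \text{(carbon dioxide reforming, fraction } 1 - x \text{)} \\
\frac{H_2}{CO} &= \frac{3x + 2(1 - x)}{x + 2(1 - x)} = \frac{2 + x}{2 - x}
\end{aligned}
$$

Under these assumptions the ratio runs from 1 at $x = 0$ to 3 at $x = 1$ and equals 2 at $x = 2/3$, which shows how mixing the two reforming routes lets the synthesis gas composition be tuned for a downstream FT step.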


Accurate Face Pose Estimation and Synthesis Using Linear Transform Among Face Models (얼굴 모델간 선형변환을 이용한 정밀한 얼굴 포즈추정 및 포즈합성)

  • Suvdaa, B.;Ko, J.
    • Journal of Korea Multimedia Society / v.15 no.4 / pp.508-515 / 2012
  • This paper presents a method that estimates the face pose of a given face image and synthesizes face images in arbitrary poses using the Active Appearance Model (AAM). The AAM, which has been successfully applied in various applications, is an example-based learning model that learns the variations in its training examples. However, with a single model it is difficult to handle large pose variations in face images. This paper proposes building a separate model for each pose, each covering only a small range of angles. Then, with the proper model for a given face image, we can achieve accurate pose estimation and synthesis. When the model used for pose estimation was not trained on the angle to be synthesized, we solve the problem by learning the linear relationship between the models in advance. In experiments on the Yale B public face database, we present accurate pose estimation and pose synthesis results. For our own face database, which has large pose variations, we demonstrate successful frontal pose synthesis results.
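
One plausible reading of "learning the linear relationship between the models in advance" is a least-squares fit that maps parameter vectors of one pose-specific model to those of another, using faces for which both are known. The sketch below shows that idea with NumPy; the function names, the affine (bias-augmented) form, and the synthetic data are assumptions for illustration, not the authors' formulation.

```python
import numpy as np


def fit_linear_map(src_params, dst_params):
    """Least-squares fit of dst ≈ [src, 1] @ W from paired rows.

    src_params: (n_samples, d_src) parameters under the source model.
    dst_params: (n_samples, d_dst) parameters under the target model.
    Returns W with shape (d_src + 1, d_dst); the last row is the bias.
    """
    ones = np.ones((len(src_params), 1))
    X = np.hstack([src_params, ones])
    W, *_ = np.linalg.lstsq(X, dst_params, rcond=None)
    return W


def apply_linear_map(W, params):
    """Map source-model parameters into the target model's space."""
    ones = np.ones((len(params), 1))
    return np.hstack([params, ones]) @ W


if __name__ == "__main__":
    # Synthetic stand-in for paired model parameters of the same faces.
    rng = np.random.default_rng(0)
    src = rng.normal(size=(50, 4))
    dst = src @ rng.normal(size=(4, 3)) + rng.normal(size=3)
    W = fit_linear_map(src, dst)
    print(np.allclose(apply_linear_map(W, src), dst))
```

Once fitted, such a map would let parameters estimated with one pose model be carried over to a neighboring pose model before synthesis.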

On a Substructure Synthesis Having Non-Matching Nodes (비부합 절점으로 이루어진 구조물의 합성과 재해석)

  • 정의일;박윤식
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2001.11a / pp.155-160 / 2001
  • Actual engineering structures are frequently very complex, and their parts are designed independently by different engineers. Each structure also contains a large number of degrees of freedom. For these reasons, methods have been developed that permit a structure to be divided into components or substructures, with the analysis done on small substructures in order to obtain the full structural system. In such cases, because of different mesh sizes among finite element models (FEM) or different matching points between FEM models and experimentally obtained models, the interfacing points may be non-matching. Solving this non-matching problem is also useful in other applications such as structural dynamic modification and model updating. In this work, a virtual node concept is introduced. Lagrange multipliers are used to enforce the interface compatibility constraint, and the interface displacement is approximated by a polynomial representation. The governing equation of the whole structure is derived using a hybrid variational principle. The eigenvalues of the whole structure are calculated using the determinant search method. The number of degrees of freedom in the eigenvalue problem can be drastically reduced to just the number of interface degrees of freedom. Numerical simulations are performed to show the usefulness of the synthesis method.
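
The reduction to interface degrees of freedom can be seen in a generic Lagrange-multiplier formulation, sketched here under stated assumptions. The notation is hypothetical and not taken from the paper: $K_i$, $M_i$, $u_i$ are the stiffness matrix, mass matrix, and displacements of substructure $i$; $B_i$ extracts its interface displacements; $q$ holds the polynomial coefficients of the interface displacement; $L_i$ evaluates that polynomial at substructure $i$'s interface nodes; $\lambda_i$ are the Lagrange multipliers.

$$
\Pi = \sum_i \left[ \tfrac{1}{2}\, u_i^{T} Z_i(\omega)\, u_i + \lambda_i^{T} \bigl( B_i u_i - L_i q \bigr) \right], \qquad Z_i(\omega) = K_i - \omega^2 M_i
$$

$$
\delta u_i:\; Z_i u_i + B_i^{T} \lambda_i = 0, \qquad \delta \lambda_i:\; B_i u_i = L_i q, \qquad \delta q:\; \sum_i L_i^{T} \lambda_i = 0
$$

$$
D(\omega)\, q = 0, \qquad D(\omega) = \sum_i L_i^{T} \bigl[ B_i Z_i(\omega)^{-1} B_i^{T} \bigr]^{-1} L_i, \qquad \det D(\omega) = 0
$$

Eliminating $u_i$ and $\lambda_i$ leaves a matrix $D(\omega)$ whose size equals the number of interface coefficients in $q$, and the natural frequencies are the roots of $\det D(\omega) = 0$, which is the kind of problem a determinant search method is suited to.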
