Search | Korea Science

Extending StarGAN-VC to Unseen Speakers Using RawNet3 Speaker Representation (RawNet3 화자 표현을 활용한 임의의 화자 간 음성 변환을 위한 StarGAN의 확장)

Bogyung Park;Somin Park;Hyunki Hong
- KIPS Transactions on Software and Data Engineering
- /
- v.12 no.7
- /
- pp.303-314
- /
- 2023
Voice conversion, a technology that allows an individual's speech data to be regenerated with the acoustic properties(tone, cadence, gender) of another, has countless applications in education, communication, and entertainment. This paper proposes an approach based on the StarGAN-VC model that generates realistic-sounding speech without requiring parallel utterances. To overcome the constraints of the existing StarGAN-VC model that utilizes one-hot vectors of original and target speaker information, this paper extracts feature vectors of target speakers using a pre-trained version of Rawnet3. This results in a latent space where voice conversion can be performed without direct speaker-to-speaker mappings, enabling an any-to-any structure. In addition to the loss terms used in the original StarGAN-VC model, Wasserstein distance is used as a loss term to ensure that generated voice segments match the acoustic properties of the target voice. Two Time-Scale Update Rule (TTUR) is also used to facilitate stable training. Experimental results show that the proposed method outperforms previous methods, including the StarGAN-VC network on which it was based.
https://doi.org/10.3745/KTSDE.2023.12.7.303 인용 PDF

Fabrication and Performance of Anode-Supported Flat Tubular Solid Oxide Fuel Cell Unit Bundle (연료극 지지체식 평관형 고체산화물 연료전지 단위 번들의 제조 및 성능)

Lim, Tak-Hyoung;Kim, Gwan-Yeong;Park, Jae-Layng;Lee, Seung-Bok;Shin, Dong-Ryul;Song, Rak-Hyun
- Journal of the Korean Electrochemical Society
- /
- v.10 no.4
- /
- pp.283-287
- /
- 2007
KIER has been developing the anode-supported flat tubular solid oxide fuel cell unit bundle for the intermediate temperature($700{\sim}800^{\circ}C$) operation. Anode-supported flat tubular cells have Ni/YSZ cermet anode support, 8 moi.% $Y_2O_3$ stabilized $ZrO_2(YSZ)$ thin electrolyte, and cathode multi-layer composed of Sr-doped $LaSrMnO_3(LSM)$, LSM-YSZ composite, and $LaSrCoFeO_3(LSCF)$. The prepared anode-supported flat tubular cell was joined with ferritic stainless steel cap by induction brazing process. Current collection for the cathode was achieved by winding Ag wire and $La_{0.6}Sr_{0.4}CoO_3(LSCo)$ paste, while current collection for the anode was achieved by using Ni wire and felt. For making stack, the prepared anode-supported flat tubular cells with effective electrode area of $90\;cm^2$ connected in series with 12 unit bundles, in which unit bundle consists of two cells connected in parallel. The performance of unit bundle in 3% humidified $H_2$ and air at $800^{\circ}C$ shows maximum power density of $0.39\;W/cm^2$ (@ 0.7V). Through these experiments, we obtained basic technology of the anode-supported flat tubular cell and established the proprietary concept of the anode-supported flat tubular cell unit bundle.
https://doi.org/10.5229/JKES.2007.10.4.283 인용 PDF KSCI

ATM Cell Encipherment Method using Rijndael Algorithm in Physical Layer (Rijndael 알고리즘을 이용한 물리 계층 ATM 셀 보안 기법)

Im Sung-Yeal;Chung Ki-Dong
- The KIPS Transactions:PartC
- /
- v.13C no.1 s.104
- /
- pp.83-94
- /
- 2006
This paper describes ATM cell encipherment method using Rijndael Algorithm adopted as an AES(Advanced Encryption Standard) by NIST in 2001. ISO 9160 describes the requirement of physical layer data processing in encryption/decryption. For the description of ATM cell encipherment method, we implemented ATM data encipherment equipment which satisfies the requirements of ISO 9160, and verified the encipherment/decipherment processing at ATM STM-1 rate(155.52Mbps). The DES algorithm can process data in the block size of 64 bits and its key length is 64 bits, but the Rijndael algorithm can process data in the block size of 128 bits and the key length of 128, 192, or 256 bits selectively. So it is more flexible in high bit rate data processing and stronger in encription strength than DES. For tile real time encryption of high bit rate data stream. Rijndael algorithm was implemented in FPGA in this experiment. The boundary of serial UNI cell was detected by the CRC method, and in the case of user data cell the payload of 48 octets (384 bits) is converted in parallel and transferred to 3 Rijndael encipherment module in the block size of 128 bits individually. After completion of encryption, the header stored in buffer is attached to the enciphered payload and retransmitted in the format of cell. At the receiving end, the boundary of ceil is detected by the CRC method and the payload type is decided. n the payload type is the user data cell, the payload of the cell is transferred to the 3-Rijndael decryption module in the block sire of 128 bits for decryption of data. And in the case of maintenance cell, the payload is extracted without decryption processing.
https://doi.org/10.3745/KIPSTC.2006.13C.1.083 인용 PDF KSCI

Transfer Learning using Multiple ConvNet Layers Activation Features with Principal Component Analysis for Image Classification (전이학습 기반 다중 컨볼류션 신경망 레이어의 활성화 특징과 주성분 분석을 이용한 이미지 분류 방법)

Byambajav, Batkhuu;Alikhanov, Jumabek;Fang, Yang;Ko, Seunghyun;Jo, Geun Sik
- Journal of Intelligence and Information Systems
- /
- v.24 no.1
- /
- pp.205-225
- /
- 2018
Convolutional Neural Network (ConvNet) is one class of the powerful Deep Neural Network that can analyze and learn hierarchies of visual features. Originally, first neural network (Neocognitron) was introduced in the 80s. At that time, the neural network was not broadly used in both industry and academic field by cause of large-scale dataset shortage and low computational power. However, after a few decades later in 2012, Krizhevsky made a breakthrough on ILSVRC-12 visual recognition competition using Convolutional Neural Network. That breakthrough revived people interest in the neural network. The success of Convolutional Neural Network is achieved with two main factors. First of them is the emergence of advanced hardware (GPUs) for sufficient parallel computation. Second is the availability of large-scale datasets such as ImageNet (ILSVRC) dataset for training. Unfortunately, many new domains are bottlenecked by these factors. For most domains, it is difficult and requires lots of effort to gather large-scale dataset to train a ConvNet. Moreover, even if we have a large-scale dataset, training ConvNet from scratch is required expensive resource and time-consuming. These two obstacles can be solved by using transfer learning. Transfer learning is a method for transferring the knowledge from a source domain to new domain. There are two major Transfer learning cases. First one is ConvNet as fixed feature extractor, and the second one is Fine-tune the ConvNet on a new dataset. In the first case, using pre-trained ConvNet (such as on ImageNet) to compute feed-forward activations of the image into the ConvNet and extract activation features from specific layers. In the second case, replacing and retraining the ConvNet classifier on the new dataset, then fine-tune the weights of the pre-trained network with the backpropagation. In this paper, we focus on using multiple ConvNet layers as a fixed feature extractor only. However, applying features with high dimensional complexity that is directly extracted from multiple ConvNet layers is still a challenging problem. We observe that features extracted from multiple ConvNet layers address the different characteristics of the image which means better representation could be obtained by finding the optimal combination of multiple ConvNet layers. Based on that observation, we propose to employ multiple ConvNet layer representations for transfer learning instead of a single ConvNet layer representation. Overall, our primary pipeline has three steps. Firstly, images from target task are given as input to ConvNet, then that image will be feed-forwarded into pre-trained AlexNet, and the activation features from three fully connected convolutional layers are extracted. Secondly, activation features of three ConvNet layers are concatenated to obtain multiple ConvNet layers representation because it will gain more information about an image. When three fully connected layer features concatenated, the occurring image representation would have 9192 (4096+4096+1000) dimension features. However, features extracted from multiple ConvNet layers are redundant and noisy since they are extracted from the same ConvNet. Thus, a third step, we will use Principal Component Analysis (PCA) to select salient features before the training phase. When salient features are obtained, the classifier can classify image more accurately, and the performance of transfer learning can be improved. To evaluate proposed method, experiments are conducted in three standard datasets (Caltech-256, VOC07, and SUN397) to compare multiple ConvNet layer representations against single ConvNet layer representation by using PCA for feature selection and dimension reduction. Our experiments demonstrated the importance of feature selection for multiple ConvNet layer representation. Moreover, our proposed approach achieved 75.6% accuracy compared to 73.9% accuracy achieved by FC7 layer on the Caltech-256 dataset, 73.1% accuracy compared to 69.2% accuracy achieved by FC8 layer on the VOC07 dataset, 52.2% accuracy compared to 48.7% accuracy achieved by FC7 layer on the SUN397 dataset. We also showed that our proposed approach achieved superior performance, 2.8%, 2.1% and 3.1% accuracy improvement on Caltech-256, VOC07, and SUN397 dataset respectively compare to existing work.
https://doi.org/10.13088/jiis.2018.24.1.205 인용 PDF KSCI

An Analysis of Big Video Data with Cloud Computing in Ubiquitous City (클라우드 컴퓨팅을 이용한 유시티 비디오 빅데이터 분석)

Lee, Hak Geon;Yun, Chang Ho;Park, Jong Won;Lee, Yong Woo
- Journal of Internet Computing and Services
- /
- v.15 no.3
- /
- pp.45-52
- /
- 2014
The Ubiquitous-City (U-City) is a smart or intelligent city to satisfy human beings' desire to enjoy IT services with any device, anytime, anywhere. It is a future city model based on Internet of everything or things (IoE or IoT). It includes a lot of video cameras which are networked together. The networked video cameras support a lot of U-City services as one of the main input data together with sensors. They generate huge amount of video information, real big data for the U-City all the time. It is usually required that the U-City manipulates the big data in real-time. And it is not easy at all. Also, many times, it is required that the accumulated video data are analyzed to detect an event or find a figure among them. It requires a lot of computational power and usually takes a lot of time. Currently we can find researches which try to reduce the processing time of the big video data. Cloud computing can be a good solution to address this matter. There are many cloud computing methodologies which can be used to address the matter. MapReduce is an interesting and attractive methodology for it. It has many advantages and is getting popularity in many areas. Video cameras evolve day by day so that the resolution improves sharply. It leads to the exponential growth of the produced data by the networked video cameras. We are coping with real big data when we have to deal with video image data which are produced by the good quality video cameras. A video surveillance system was not useful until we find the cloud computing. But it is now being widely spread in U-Cities since we find some useful methodologies. Video data are unstructured data thus it is not easy to find a good research result of analyzing the data with MapReduce. This paper presents an analyzing system for the video surveillance system, which is a cloud-computing based video data management system. It is easy to deploy, flexible and reliable. It consists of the video manager, the video monitors, the storage for the video images, the storage client and streaming IN component. The "video monitor" for the video images consists of "video translater" and "protocol manager". The "storage" contains MapReduce analyzer. All components were designed according to the functional requirement of video surveillance system. The "streaming IN" component receives the video data from the networked video cameras and delivers them to the "storage client". It also manages the bottleneck of the network to smooth the data stream. The "storage client" receives the video data from the "streaming IN" component and stores them to the storage. It also helps other components to access the storage. The "video monitor" component transfers the video data by smoothly streaming and manages the protocol. The "video translator" sub-component enables users to manage the resolution, the codec and the frame rate of the video image. The "protocol" sub-component manages the Real Time Streaming Protocol (RTSP) and Real Time Messaging Protocol (RTMP). We use Hadoop Distributed File System(HDFS) for the storage of cloud computing. Hadoop stores the data in HDFS and provides the platform that can process data with simple MapReduce programming model. We suggest our own methodology to analyze the video images using MapReduce in this paper. That is, the workflow of video analysis is presented and detailed explanation is given in this paper. The performance evaluation was experiment and we found that our proposed system worked well. The performance evaluation results are presented in this paper with analysis. With our cluster system, we used compressed $1920{\times}1080(FHD)$ resolution video data, H.264 codec and HDFS as video storage. We measured the processing time according to the number of frame per mapper. Tracing the optimal splitting size of input data and the processing time according to the number of node, we found the linearity of the system performance.
https://doi.org/10.7472/jksii.2014.15.3.45 인용 PDF KSCI

A Hardware Implementation of the Underlying Field Arithmetic Processor based on Optimized Unit Operation Components for Elliptic Curve Cryptosystems (타원곡선을 암호시스템에 사용되는 최적단위 연산항을 기반으로 한 기저체 연산기의 하드웨어 구현)

Jo, Seong-Je;Kwon, Yong-Jin
- Journal of KIISE:Computing Practices and Letters
- /
- v.8 no.1
- /
- pp.88-95
- /
- 2002
In recent years, the security of hardware and software systems is one of the most essential factor of our safe network community. As elliptic Curve Cryptosystems proposed by N. Koblitz and V. Miller independently in 1985, require fewer bits for the same security as the existing cryptosystems, for example RSA, there is a net reduction in cost size, and time. In this thesis, we propose an efficient hardware architecture of underlying field arithmetic processor for Elliptic Curve Cryptosystems, and a very useful method for implementing the architecture, especially multiplicative inverse operator over GF$GF (2^m)$ onto FPGA and futhermore VLSI, where the method is based on optimized unit operation components. We optimize the arithmetic processor for speed so that it has a resonable number of gates to implement. The proposed architecture could be applied to any finite field $F_{2m}$. According to the simulation result, though the number of gates are increased by a factor of 8.8, the multiplication speed We optimize the arithmetic processor for speed so that it has a resonable number of gates to implement. The proposed architecture could be applied to any finite field $F_{2m}$. According to the simulation result, though the number of gates are increased by a factor of 8.8, the multiplication speed and inversion speed has been improved 150 times, 480 times respectively compared with the thesis presented by Sarwono Sutikno et al. [7]. The designed underlying arithmetic processor can be also applied for implementing other crypto-processor and various finite field applications.
PDF KSCI

Search Result 1,946, Processing Time 0.024 seconds

Extending StarGAN-VC to Unseen Speakers Using RawNet3 Speaker Representation (RawNet3 화자 표현을 활용한 임의의 화자 간 음성 변환을 위한 StarGAN의 확장)

Fabrication and Performance of Anode-Supported Flat Tubular Solid Oxide Fuel Cell Unit Bundle (연료극 지지체식 평관형 고체산화물 연료전지 단위 번들의 제조 및 성능)

ATM Cell Encipherment Method using Rijndael Algorithm in Physical Layer (Rijndael 알고리즘을 이용한 물리 계층 ATM 셀 보안 기법)

Transfer Learning using Multiple ConvNet Layers Activation Features with Principal Component Analysis for Image Classification (전이학습 기반 다중 컨볼류션 신경망 레이어의 활성화 특징과 주성분 분석을 이용한 이미지 분류 방법)

An Analysis of Big Video Data with Cloud Computing in Ubiquitous City (클라우드 컴퓨팅을 이용한 유시티 비디오 빅데이터 분석)

A Hardware Implementation of the Underlying Field Arithmetic Processor based on Optimized Unit Operation Components for Elliptic Curve Cryptosystems (타원곡선을 암호시스템에 사용되는 최적단위 연산항을 기반으로 한 기저체 연산기의 하드웨어 구현)

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)