Search | Korea Science

Zero-shot voice conversion with HuBERT

Hyelee Chung;Hosung Nam
- Phonetics and Speech Sciences / v.15 no.3 / pp.69-74 / 2023
This study introduces an innovative model for zero-shot voice conversion that utilizes the capabilities of HuBERT. Zero-shot voice conversion models can transform the speech of one speaker to mimic that of another, even when the model has not been exposed to the target speaker's voice during the training phase. Comprising five main components (HuBERT, feature encoder, flow, speaker encoder, and vocoder), the model offers remarkable performance across a range of scenarios. Notably, it excels in the challenging unseen-to-unseen voice-conversion tasks. The effectiveness of the model was assessed based on the mean opinion scores and similarity scores, reflecting high voice quality and similarity to the target speakers. This model demonstrates considerable promise for a range of real-world applications demanding high-quality voice conversion. This study sets a precedent in the exploration of HuBERT-based models for voice conversion, and presents new directions for future research in this domain. Despite its complexities, the robust performance of this model underscores the viability of HuBERT in advancing voice conversion technology, making it a significant contributor to the field.
https://doi.org/10.13064/KSSS.2023.15.3.069

Motion classification using distributional features of 3D skeleton data

Woohyun Kim;Daeun Kim;Kyoung Shin Park;Sungim Lee
- Communications for Statistical Applications and Methods / v.30 no.6 / pp.551-560 / 2023
Recently, there has been significant research into the recognition of human activities using three-dimensional sequential skeleton data captured by the Kinect depth sensor. Many of these studies employ deep learning models. This study introduces a novel feature selection method for these data and analyzes them using machine learning models. Because the original Kinect data are high-dimensional, effective feature extraction methods are required to address the classification challenge. In this research, we propose using the first four moments as predictors to represent the distribution of joint sequences and evaluate their effectiveness on two datasets: the exergame dataset, consisting of three activities, and the MSR daily activity dataset, composed of ten activities. The results show that, averaged across different classifiers, our approach achieves higher accuracy than existing methods.
https://doi.org/10.29220/CSAM.2023.30.6.551
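
The moment-based features described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the "first four moments" are the mean, variance, skewness, and kurtosis of each joint coordinate over time, and the function names `four_moments` and `skeleton_features` are hypothetical.

```python
import math
from typing import List, Sequence


def four_moments(series: Sequence[float]) -> List[float]:
    """Return [mean, variance, skewness, kurtosis] of one coordinate sequence.

    Assumption: population (1/n) moments; the paper may use a different convention.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    var = sum(c * c for c in centered) / n
    if var == 0.0:
        # A constant sequence has no defined skewness or kurtosis; use zeros.
        return [mean, 0.0, 0.0, 0.0]
    std = math.sqrt(var)
    skew = sum((c / std) ** 3 for c in centered) / n
    kurt = sum((c / std) ** 4 for c in centered) / n
    return [mean, var, skew, kurt]


def skeleton_features(frames: Sequence[Sequence[float]]) -> List[float]:
    """Summarize a variable-length sequence of frames as a fixed-length vector.

    Each frame is a flat list of joint coordinates (x, y, z per joint).
    The output holds four moments per coordinate, whatever the number of frames.
    """
    n_coords = len(frames[0])
    features: List[float] = []
    for j in range(n_coords):
        features.extend(four_moments([frame[j] for frame in frames]))
    return features
```

Because the output length depends only on the number of coordinates, sequences of different durations map to vectors of the same size, which is what lets standard classifiers consume them.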

Residual Blocks-Based Convolutional Neural Network for Age, Gender, and Race Classification (연령, 성별, 인종 구분을 위한 잔차블록 기반 컨볼루션 신경망)

Khasanova Nodira Gayrat Kizi;Bong-Kee Sin
- Annual Conference of KIPS / 2023.11a / pp.568-570 / 2023
Classifying age, gender, and race from images still poses challenges. Despite advances in deep learning and machine learning, convolutional neural networks (CNNs) remain pivotal in addressing these issues. This paper introduces a novel CNN-based approach for accurate and efficient age, gender, and race classification. Leveraging CNNs with residual blocks, our method enhances learning while minimizing computational complexity. The model effectively captures low-level and high-level features, yielding improved classification accuracy. Evaluation on the diverse 'fair face' dataset shows our model achieving 56.3%, 94.6%, and 58.4% accuracy for age, gender, and race, respectively.
https://doi.org/10.3745/PKIPS.y2023m11a.568

Research on predicting changes in crop cultivation areas due to climate change: Focusing on Hallabong (기후변화에 따른 과수작물 재배지 변화 예측 연구: 한라봉을 중심으로)

Park, Hye Eun;Lee, Jong Tae
- The Journal of Information Systems / v.33 no.1 / pp.31-44 / 2024
Purpose: The purpose of this study is to use climate data to identify the algorithm that best predicts Hallabong production and to predict future Hallabong production in areas where Hallabong cultivation is expected to become possible. Design/methodology/approach: The research is conducted in two stages. In the first stage, we identify the algorithm with the highest predictive power among XGBoost, Random Forest, SVM, and LSTM. In the second stage, the algorithm identified in the first stage is applied to predict future Hallabong production in three regions where Hallabong production is expected to be possible. Findings: As in many prediction studies, XGBoost showed the highest predictive power. Among the areas where Hallabong production is expected to be possible, production was predicted to be highest in Hongcheon, Gangwon-do, which has the highest latitude.
https://doi.org/10.5859/KAIS.2024.33.1.31

Criteria for implementing artificial intelligence systems in reproductive medicine

Enric Guell
- Clinical and Experimental Reproductive Medicine / v.51 no.1 / pp.1-12 / 2024
This review article discusses the integration of artificial intelligence (AI) in assisted reproductive technology and provides key concepts to consider when introducing AI systems into reproductive medicine practices. The article highlights the various applications of AI in reproductive medicine and discusses whether to use commercial or in-house AI systems. This review also provides criteria for implementing new AI systems in the laboratory and discusses the factors that should be considered when introducing AI in the laboratory, including the user interface, scalability, training, support, follow-up, cost, ethics, and data quality. The article emphasises the importance of ethical considerations, data quality, and continuous algorithm updates to ensure the accuracy and safety of AI systems.
https://doi.org/10.5653/cerm.2023.06009

Artificial Intelligence and Air Pollution : A Bibliometric Analysis from 2012 to 2022

Yong Sauk Hau
- International journal of advanced smart convergence / v.13 no.1 / pp.48-56 / 2024
The application of artificial intelligence (AI) is becoming increasingly important in coping with air pollution. AI is effective in various ways, including air pollution forecasting, monitoring, and control, and is therefore attracting a lot of attention. This attention has created a strong need to analyze studies on AI and air pollution. To help meet this need, this study performed bibliometric analyses of studies on AI and air pollution from 2012 to 2022 using the Web of Science database. The studies were analyzed from various aspects, such as the trend in the number of articles, the trend in the number of citations, the top 10 countries of origin, the top 10 research organizations, the top 10 research funding agencies, the top 10 journals, the top 10 articles in terms of total citations, and the distribution by language. This study not only reports the bibliometric analysis results but also reveals eight distinct features of the research stream in studies on AI and air pollution, identified from those results. These features are expected to contribute to understanding the research stream in AI and air pollution.
https://doi.org/10.7236/IJASC.2024.13.1.48

Development and Validation of a Prediction Model: Application to Digestive Cancer Research (예측모형의 구축과 검증: 소화기암연구 사례를 중심으로)

Yonghan Kwon;Kyunghwa Han
- Journal of Digestive Cancer Research / v.11 no.3 / pp.157-164 / 2023
Prediction is a significant topic in clinical research. Studies on the development and validation of prediction models are increasingly published in clinical research. In this review, we investigated the analytical methods and validation schemes used for clinical prediction models in digestive cancer research. Deep learning and logistic regression, with split-sample validation as an internal or external validation, were the most commonly used methods. Furthermore, we briefly introduced and summarized the advantages and disadvantages of each method. Finally, we discussed several points to consider when conducting prediction model studies.
https://doi.org/10.52927/jdcr.2023.11.3.157
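
As a rough illustration of the split-sample validation mentioned in the abstract above, the sketch below randomly partitions record indices into a development set and a held-out validation set. The function name `split_sample`, the 30% default fraction, and the fixed seed are assumptions made for illustration and are not taken from the review.

```python
import random
from typing import List, Tuple


def split_sample(
    n_records: int, validation_fraction: float = 0.3, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Randomly split record indices into (development, validation) sets.

    The model would be fitted on the development indices only and then
    evaluated once on the validation indices.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    indices = list(range(n_records))
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(indices)
    n_validation = max(1, round(n_records * validation_fraction))
    validation = sorted(indices[:n_validation])
    development = sorted(indices[n_validation:])
    return development, validation
```

A single random split of this kind is simple but can give unstable estimates on small cohorts, which is one reason such reviews weigh it against other validation schemes.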

Design of detection method for malicious URL based on Deep Neural Network (뉴럴네트워크 기반에 악성 URL 탐지방법 설계)

Kwon, Hyun;Park, Sangjun;Kim, Yongchul
- Journal of Convergence for Information Technology / v.11 no.5 / pp.30-37 / 2021
Various devices are connected to the Internet, and attacks that exploit the Internet are occurring. Some of these attacks use malicious URLs to lure users to phishing sites or to distribute malware. Detecting such malicious URL attacks is therefore an important security issue. Among recent deep learning technologies, neural networks show good performance in image recognition, speech recognition, and pattern recognition, and they can be applied to analyzing and detecting patterns in the characteristics of malicious URLs. In this paper, we analyzed the performance of a neural-network-based method for detecting malicious URLs while changing the activation function, learning rate, and neural network structure. The experimental data were built by crawling the Alexa top 1 million list and Whois, and TensorFlow was used as the machine learning library. In the experiment, with 4 layers, a learning rate of 0.005, and 100 nodes in each layer, the method obtained an accuracy of 97.8% and an F1 score of 92.94%.
https://doi.org/10.22156/CS4SMB.2021.11.05.030
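
The abstract above reports both accuracy and an F1 score. The sketch below shows the standard definitions of the two metrics for a binary detector, computed from confusion-matrix counts. The function name `accuracy_and_f1` is hypothetical, and any counts passed to it here are illustrative, not data from the paper.

```python
from typing import Tuple


def accuracy_and_f1(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Compute accuracy and F1 from binary confusion-matrix counts.

    tp: malicious URLs flagged as malicious
    fp: benign URLs flagged as malicious
    fn: malicious URLs missed
    tn: benign URLs passed as benign
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0.0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1
```

Accuracy counts the benign majority as well, whereas F1 ignores true negatives, so on imbalanced data the F1 score can sit well below the accuracy, as it does in the reported figures.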

Effects of Spatio-temporal Features of Dynamic Hand Gestures on Learning Accuracy in 3D-CNN (3D-CNN에서 동적 손 제스처의 시공간적 특징이 학습 정확성에 미치는 영향)

Yeongjee Chung
- The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.3 / pp.145-151 / 2023
3D-CNN is one of the deep learning techniques for learning time series data. Such three-dimensional learning can generate many parameters, so that high-performance machine learning is required or the learning rate can be strongly affected. To improve the efficiency of learning dynamic hand gestures in the spatiotemporal domain with a 3D-CNN, it is necessary to find the optimal conditions of the input video data by analyzing how learning accuracy varies with spatiotemporal changes in the input, without changing the structure of the 3D-CNN model. First, the time ratio between dynamic hand-gesture actions is adjusted by setting the learning interval of image frames in the dynamic hand-gesture video data. Second, through 2D cross-correlation analysis between classes, the similarity between image frames of the input video data is measured and normalized to obtain an average value between frames, and the learning accuracy is analyzed. Based on this analysis, this work proposes two methods to effectively select input video data for 3D-CNN deep learning of dynamic hand gestures. Experimental results showed that the learning interval of image data frames and the similarity of image frames between classes can affect the accuracy of the learning model.
https://doi.org/10.7236/JIIBC.2023.23.3.145
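
The two input-selection ideas in the abstract above, a frame learning interval and a normalized 2D cross-correlation similarity averaged over frames, can be sketched as follows. This is an illustration under stated assumptions rather than the author's procedure: the similarity used here is the zero-lag normalized cross-correlation coefficient, frames are plain nested lists of grayscale values, and the names `sample_frames`, `normalized_cross_correlation`, and `mean_frame_similarity` are hypothetical.

```python
import math
from typing import List, Sequence

Frame = Sequence[Sequence[float]]


def sample_frames(video: Sequence[Frame], interval: int) -> List[Frame]:
    """Keep every `interval`-th frame, starting from the first one."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return list(video[::interval])


def normalized_cross_correlation(a: Frame, b: Frame) -> float:
    """Zero-lag normalized cross-correlation of two equal-size frames, in [-1, 1]."""
    flat_a = [p for row in a for p in row]
    flat_b = [p for row in b for p in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("frames must have the same number of pixels")
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(flat_a, flat_b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in flat_a)
        * sum((y - mean_b) ** 2 for y in flat_b)
    )
    # A flat (constant) frame has no variation to correlate; report 0.
    return num / den if den else 0.0


def mean_frame_similarity(video_a: Sequence[Frame], video_b: Sequence[Frame]) -> float:
    """Average the frame-by-frame similarity of two videos over paired frames."""
    scores = [normalized_cross_correlation(fa, fb) for fa, fb in zip(video_a, video_b)]
    return sum(scores) / len(scores)
```

Comparing `mean_frame_similarity` across pairs of gesture classes gives a rough indication of which classes look alike frame by frame and may therefore be harder for the model to separate.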

Evaluation of a Thermal Conductivity Prediction Model for Compacted Clay Based on a Machine Learning Method (기계학습법을 통한 압축 벤토나이트의 열전도도 추정 모델 평가)

Yoon, Seok;Bang, Hyun-Tae;Kim, Geon-Young;Jeon, Haemin
- KSCE Journal of Civil and Environmental Engineering Research / v.41 no.2 / pp.123-131 / 2021
The buffer is a key component of an engineered barrier system that safeguards the disposal of high-level radioactive waste. Buffers are located between disposal canisters and host rock, and they can restrain the release of radionuclides and protect canisters from the inflow of groundwater. Since considerable heat is released from a disposal canister to the surrounding buffer, the thermal conductivity of the buffer is a very important parameter for overall disposal safety. For this reason, much research has been conducted on thermal conductivity prediction models that consider various factors. In this study, the thermal conductivity of a buffer is estimated using the following machine learning methods: linear regression, decision tree, support vector machine (SVM), ensemble, Gaussian process regression (GPR), neural network, deep belief network, and genetic programming. Machine learning methods such as ensemble, genetic programming, SVM with cubic parameter, and GPR performed better than the regression model, with the XGBoost ensemble and Gaussian process regression models performing best.
https://doi.org/10.12652/Ksce.2021.41.2.0123
