Sea Ice Type Classification with Optical Remote Sensing Data (광학영상에서의 해빙종류 분류 연구)

  • Chi, Junhwa;Kim, Hyun-cheol
    • Korean Journal of Remote Sensing / v.34 no.6_2 / pp.1239-1249 / 2018
  • Optical remote sensing sensors provide visually more familiar images than radar images. However, it is difficult to discriminate sea ice types in optical images using machine learning algorithms based on spectral information. This study addresses two topics. First, we propose semantic segmentation, a state-of-the-art deep learning technique, to identify ice types by learning hierarchical and spatial features of sea ice. Second, we propose a new approach that combines semi-supervised and active learning to obtain accurate and meaningful labels from unlabeled or unseen images and thereby improve the performance of supervised classification for multiple images. With this approach, we successfully added new labels from unlabeled data to automatically update the semantic segmentation model. It should be noted that an operational system to generate ice type products from optical remote sensing data may be possible in the near future.
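
The abstract above does not describe how the semi-supervised and active learning steps are combined. One common pattern, shown here only as a minimal sketch under that assumption, is to accept confident model predictions as pseudo-labels and send the least confident samples to a human annotator. The function name `split_by_confidence`, the threshold, the query budget, and the probabilities are all hypothetical and not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def split_by_confidence(
    probs: Sequence[Sequence[float]],
    accept_threshold: float = 0.9,  # hypothetical cut-off for pseudo-labels
    query_budget: int = 2,          # hypothetical number of annotator queries
) -> Tuple[Dict[int, int], List[int]]:
    """Return (pseudo_labels, query_indices) for a batch of unlabeled samples.

    probs[i] holds the predicted class probabilities for sample i.
    """
    pseudo_labels: Dict[int, int] = {}
    uncertain: List[Tuple[float, int]] = []
    for i, p in enumerate(probs):
        confidence = max(p)
        if confidence >= accept_threshold:
            # Semi-supervised step: keep the model's own confident prediction.
            pseudo_labels[i] = list(p).index(confidence)
        else:
            uncertain.append((confidence, i))
    # Active learning step: query the least confident samples first.
    uncertain.sort()
    query_indices = [i for _, i in uncertain[:query_budget]]
    return pseudo_labels, query_indices


# Made-up probabilities for five unlabeled pixels or patches, three ice classes.
example_probs = [
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.05, 0.92, 0.03],
    [0.55, 0.30, 0.15],
    [0.34, 0.33, 0.33],
]
pseudo, queries = split_by_confidence(example_probs)
print(pseudo, queries)
```

In this sketch, the pseudo-labels and the annotator's answers would both be added to the training set before the segmentation model is retrained.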

Predicting tensile strength of reinforced concrete composited with geopolymer using several machine learning algorithms

  • Ibrahim Albaijan;Hanan Samadi;Arsalan Mahmoodzadeh;Danial Fakhri;Mehdi Hosseinzadeh;Nejib Ghazouani;Khaled Mohamed Elhadi
    • Steel and Composite Structures / v.52 no.3 / pp.293-312 / 2024
  • Researchers are actively investigating the potential for utilizing alternative materials in construction to tackle the environmental and economic challenges linked to traditional concrete-based materials. Nevertheless, conventional laboratory methods for testing the mechanical properties of concrete are both costly and time-consuming. The limitations of traditional models in predicting the tensile strength of concrete composited with geopolymer have created a demand for more advanced models. Fortunately, the increasing availability of data has facilitated the use of machine learning methods, which offer powerful and cost-effective models. This paper aims to explore the potential of several machine learning methods in predicting the tensile strength of geopolymer concrete under different curing conditions. The study utilizes a dataset of 221 tensile strength test results for geopolymer concrete with varying mix ratios and curing conditions. The effectiveness of the machine learning models is evaluated using additional unseen datasets. Based on the values of loss functions and evaluation metrics, the results indicate that most models have the potential to estimate the tensile strength of geopolymer concrete satisfactorily. However, the Takagi Sugeno fuzzy model (TSF) and gene expression programming (GEP) models demonstrate the highest robustness. Both the laboratory tests and machine learning outcomes indicate that geopolymer concrete composed of 50% fly ash and 40% ground granulated blast slag, mixed with 10 mol of NaOH, and cured in an oven at 190°F for 28 days has superior tensile strength.

One-shot multi-speaker text-to-speech using RawNet3 speaker representation (RawNet3를 통해 추출한 화자 특성 기반 원샷 다화자 음성합성 시스템)

  • Sohee Han;Jisub Um;Hoirin Kim
    • Phonetics and Speech Sciences / v.16 no.1 / pp.67-76 / 2024
  • Recent advances in text-to-speech (TTS) technology have significantly improved the quality of synthesized speech, reaching a level where it can closely imitate natural human speech. In particular, TTS models offering various voice characteristics and personalized speech are widely utilized in fields such as artificial intelligence (AI) tutors, advertising, and video dubbing. Accordingly, in this paper, we propose a one-shot multi-speaker TTS system that can ensure acoustic diversity and synthesize personalized voices by generating speech from the utterances of unseen target speakers. The proposed model integrates a speaker encoder into a TTS model consisting of the FastSpeech2 acoustic model and the HiFi-GAN vocoder. The speaker encoder, based on the pre-trained RawNet3, extracts speaker-specific voice features. Furthermore, the proposed approach not only includes an English one-shot multi-speaker TTS but also introduces a Korean one-shot multi-speaker TTS. We evaluate the naturalness and speaker similarity of the generated speech using objective and subjective metrics. In the subjective evaluation, the proposed Korean one-shot multi-speaker TTS obtained a naturalness mean opinion score (NMOS) of 3.36 and a similarity MOS (SMOS) of 3.16. The objective evaluation of the proposed English and Korean one-shot multi-speaker TTS showed a prediction MOS (P-MOS) of 2.54 and 3.74, respectively. These results indicate that the performance of our proposed model is improved over the baseline models in terms of both naturalness and speaker similarity.

Extending StarGAN-VC to Unseen Speakers Using RawNet3 Speaker Representation (RawNet3 화자 표현을 활용한 임의의 화자 간 음성 변환을 위한 StarGAN의 확장)

  • Bogyung Park;Somin Park;Hyunki Hong
    • KIPS Transactions on Software and Data Engineering / v.12 no.7 / pp.303-314 / 2023
  • Voice conversion, a technology that allows an individual's speech data to be regenerated with the acoustic properties (tone, cadence, gender) of another, has countless applications in education, communication, and entertainment. This paper proposes an approach based on the StarGAN-VC model that generates realistic-sounding speech without requiring parallel utterances. To overcome the constraints of the existing StarGAN-VC model, which uses one-hot vectors to encode original and target speaker information, this paper extracts feature vectors of target speakers using a pre-trained version of RawNet3. This results in a latent space where voice conversion can be performed without direct speaker-to-speaker mappings, enabling an any-to-any structure. In addition to the loss terms used in the original StarGAN-VC model, the Wasserstein distance is used as a loss term to ensure that generated voice segments match the acoustic properties of the target voice. The Two Time-Scale Update Rule (TTUR) is also used to facilitate stable training. Experimental results show that the proposed method outperforms previous methods, including the StarGAN-VC network on which it was based.
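
The abstract above does not say how the Wasserstein term is computed; in GAN training it is usually approximated by a critic network rather than evaluated directly. The following is only a minimal sketch of the quantity itself in the simplest setting: the Wasserstein-1 distance between two one-dimensional empirical samples of equal size, which reduces to the mean absolute difference between the sorted values. The function name `wasserstein_1d` and the sample values are hypothetical.

```python
from typing import Sequence


def wasserstein_1d(a: Sequence[float], b: Sequence[float]) -> float:
    """Wasserstein-1 distance between two equal-size 1-D empirical samples."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    # With equal sizes and uniform weights, the optimal transport plan
    # matches the k-th smallest value of one sample to the k-th smallest
    # value of the other.
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


# Made-up scalar features for generated and target voice segments.
generated = [0.0, 1.0, 2.0]
target = [3.0, 1.0, 2.0]
print(wasserstein_1d(generated, target))
print(wasserstein_1d(target, target))
```

The distance is zero only when the two samples contain the same values, which is why it can serve as a measure of how closely generated features match target features.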

Machine Learning-based Phase Picking Algorithm of P and S Waves for Distributed Acoustic Sensing Data (분포형 광섬유 센서 자료 적용을 위한 기계학습 기반 P, S파 위상 발췌 알고리즘 개발)

  • Yonggyu, Choi;Youngseok, Song;Soon Jee, Seol;Joongmoo, Byun
    • Geophysics and Geophysical Exploration / v.25 no.4 / pp.177-188 / 2022
  • Recently, the application of distributed acoustic sensors (DAS), which can replace geophones and seismometers, has increased significantly along with interest in microseismic monitoring, one of the CO2 storage monitoring techniques. A DAS monitoring system records a large amount of temporally and spatially continuous data, which necessitates fast and accurate data processing techniques. Because event detection and seismic phase picking are the most basic data processing techniques, they should be performed on all data. In this study, a machine learning-based P and S wave phase picking algorithm was developed to compensate for the limitations of conventional phase picking algorithms, and it was modified using a transfer learning technique for application to DAS data, which consist of a single component with a low signal-to-noise ratio. Our model was constructed by modifying the convolution-based EQTransformer, which performs well in phase picking, to the ResUNet structure. Both the global earthquake dataset STEAD and an augmented dataset were used for training to enhance the prediction performance on the unseen characteristics of the target dataset. The performance of the developed algorithm was verified using K-net and KiK-net data with characteristics different from the training data. Additionally, after modifying the trained model to suit DAS data using the transfer learning technique, the performance was verified by applying it to the DAS field data measured in the Pohang Janggi basin.

A Method to Estimate the Cell Based Sustainable Development Yield of Groundwater (셀기반 지하수 개발가능량 산정기법)

  • Chung, Il-Moon;Kim, Nam Won;Lee, Jeongwoo;Na, Hanna;Kim, Youn-Jung;Park, Seunghyuk
    • Economic and Environmental Geology / v.47 no.6 / pp.635-643 / 2014
  • The sustainable development yield of groundwater in Korea has been determined from the 10-year drought frequency of groundwater recharge in a standard mid-sized watershed or a relatively large district. As a result, it is hard to apply to evaluating groundwater impact in a small watershed. To address this, a novel approach to estimate the cell-based sustainable development yield of groundwater (SDYG) is suggested and applied to the Gyeongju region. Cell-based groundwater recharge is computed through hydrological component analysis with SWAT-MODFLOW, an integrated surface water-groundwater model. To estimate the potential amount of groundwater development, the existing method, which multiplies the 10-year drought frequency rainfall by a recharge coefficient, is adopted. Cell-based SDYGs are computed and summed for 143 sub-watersheds and administrative districts. When these SDYGs are combined with groundwater usage data, the groundwater usage rate (total usage / SDYG) shows wide local variations (7.1~108.8%), which are unseen when only the average rate (24%) is evaluated. It is also expected that additional SDYGs could be estimated for any small district.
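
The two relations stated in the abstract above (SDYG as drought-frequency rainfall multiplied by a recharge coefficient, and usage rate as total usage divided by SDYG) can be illustrated with a short sketch. The conversion to a volume via cell area, the function names, and every number below are assumptions made for illustration and do not come from the paper.

```python
from typing import Sequence


def cell_sdyg(
    drought_rainfall_mm: float,  # 10-year drought frequency rainfall (mm/yr)
    recharge_coeff: float,       # dimensionless recharge coefficient of the cell
    cell_area_m2: float,         # assumed cell area, used to convert depth to volume
) -> float:
    """Sustainable development yield of one cell in m^3/yr."""
    return drought_rainfall_mm / 1000.0 * recharge_coeff * cell_area_m2


def usage_rate_percent(total_usage_m3: float, cell_sdygs: Sequence[float]) -> float:
    """Groundwater usage rate (total usage / summed SDYG) as a percentage."""
    return 100.0 * total_usage_m3 / sum(cell_sdygs)


# Hypothetical sub-watershed made of two 100 m x 100 m cells.
cells = [
    cell_sdyg(1000.0, 0.10, 10_000.0),
    cell_sdyg(1000.0, 0.20, 10_000.0),
]
rate = usage_rate_percent(750.0, cells)
print(f"usage rate: {rate:.1f}%")
```

Summing per-cell yields before dividing is what allows the rate to be reported for any grouping of cells, such as a sub-watershed or an administrative district.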

Deep Learning-based Fracture Mode Determination in Composite Laminates (복합 적층판의 딥러닝 기반 파괴 모드 결정)

  • Muhammad Muzammil Azad;Atta Ur Rehman Shah;M.N. Prabhakar;Heung Soo Kim
    • Journal of the Computational Structural Engineering Institute of Korea / v.37 no.4 / pp.225-232 / 2024
  • This study focuses on the determination of the fracture mode in composite laminates using deep learning. With the increasing use of laminated composites in numerous engineering applications, ensuring their integrity and performance is of paramount importance. However, owing to the complex nature of these materials, the identification of fracture modes is often a tedious and time-consuming task that requires critical domain knowledge. To alleviate these issues, this study aims to utilize modern artificial intelligence technology to automate the fractographic analysis of laminated composites. To accomplish this goal, scanning electron microscopy (SEM) images of fractured tensile test specimens are obtained from laminated composites to showcase various fracture modes. These SEM images are then categorized by fracture mode, including fiber breakage, fiber pull-out, mix-mode fracture, matrix brittle fracture, and matrix ductile fracture. Next, the collective data for all classes are divided into train, test, and validation datasets. Two state-of-the-art, deep learning-based pre-trained models, namely, DenseNet and GoogleNet, are trained to learn the discriminative features for each fracture mode. The DenseNet model shows training and testing accuracies of 94.01% and 75.49%, respectively, whereas those of the GoogleNet model are 84.55% and 54.48%, respectively. The trained deep learning models are then validated on unseen validation datasets. This validation demonstrates that the DenseNet model, owing to its deeper architecture, can extract high-quality features, resulting in 84.44% validation accuracy. This value is 36.84% higher than that of the GoogleNet model. Hence, these results affirm that the DenseNet model is effective in performing fractographic analyses of laminated composites by predicting fracture modes with high precision.

KB-BERT: Training and Application of Korean Pre-trained Language Model in Financial Domain (KB-BERT: 금융 특화 한국어 사전학습 언어모델과 그 응용)

  • Kim, Donggyu;Lee, Dongwook;Park, Jangwon;Oh, Sungwoo;Kwon, Sungjun;Lee, Inyong;Choi, Dongwon
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.191-206 / 2022
  • Recently, it has become the de facto approach to use a pre-trained language model (PLM) to achieve state-of-the-art performance on various natural language tasks (called downstream tasks) such as sentiment analysis and question answering. However, like any other machine learning method, a PLM tends to depend on the data distribution seen during the training phase and shows worse performance on an unseen (out-of-distribution) domain. For this reason, there have been many efforts to develop domain-specific PLMs for fields such as the medical and legal industries. In this paper, we discuss the training of a finance domain-specific PLM for the Korean language and its applications. Our finance domain-specific PLM, KB-BERT, is trained on a carefully curated financial corpus that includes domain-specific documents such as financial reports. We provide extensive performance evaluation results on three natural language tasks: topic classification, sentiment analysis, and question answering. Compared to state-of-the-art Korean PLMs such as KoELECTRA and KLUE-RoBERTa, KB-BERT shows comparable performance on general datasets based on common corpora like Wikipedia and news articles. Moreover, KB-BERT outperforms the compared models on finance domain datasets that require finance-specific knowledge to solve the given problems.