• Title/Summary/Keyword: feature model validation

Alzheimer progression classification using fMRI data (fMRI 데이터를 이용한 알츠하이머 진행상태 분류)

  • Ju Hyeon-Noh;Hee-Deok Yang
    • Smart Media Journal / v.13 no.4 / pp.86-93 / 2024
  • The development of functional magnetic resonance imaging (fMRI) has significantly contributed to mapping brain functions and understanding brain networks during rest. This paper proposes a CNN-LSTM-based classification model to classify the progression stages of Alzheimer's disease. Firstly, four preprocessing steps are performed to remove noise from the fMRI data before feature extraction. Secondly, the U-Net architecture is utilized to extract spatial features once preprocessing is completed. Thirdly, the extracted spatial features undergo LSTM processing to extract temporal features, ultimately leading to classification. Experiments were conducted by adjusting the temporal dimension of the data. Using 5-fold cross-validation, an average accuracy of 96.4% was achieved, indicating that the proposed method has high potential for identifying the progression of Alzheimer's disease by analyzing fMRI data.
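
The 5-fold cross-validation protocol mentioned in the abstract can be sketched in plain Python as below. This is a minimal illustration, not the authors' code: `k_fold_indices`, `cross_validate`, and the `train_and_score` callback are hypothetical names, the labels are made up, and a majority-class scorer stands in for the CNN-LSTM model.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(labels, train_and_score, k=5, seed=0):
    """Return the mean score over k train/test splits.

    train_and_score(train_idx, test_idx) is a caller-supplied (hypothetical)
    function that fits a model on train_idx and returns accuracy on test_idx.
    """
    folds = k_fold_indices(len(labels), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return mean(scores)


def majority_baseline(labels):
    """Build a toy scorer that always predicts the most common training label."""
    def score(train_idx, test_idx):
        train_labels = [labels[j] for j in train_idx]
        guess = max(set(train_labels), key=train_labels.count)
        return sum(labels[j] == guess for j in test_idx) / len(test_idx)
    return score


# Made-up labels standing in for disease stages.
toy_labels = [0] * 12 + [1] * 8
average_accuracy = cross_validate(toy_labels, majority_baseline(toy_labels), k=5)
```

Each sample is used for testing exactly once, so the reported figure is an average over five disjoint held-out sets rather than a single split.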

Corporate Default Prediction Model Using Deep Learning Time Series Algorithm, RNN and LSTM (딥러닝 시계열 알고리즘 적용한 기업부도예측모형 유용성 검증)

  • Cha, Sungjae;Kang, Jungseok
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.1-32 / 2018
  • Corporate defaults have a ripple effect not only on the stakeholders of bankrupt companies, including managers, employees, creditors, and investors, but also on the local and national economy. Before the Asian financial crisis, the Korean government analyzed only SMEs and tried to improve the forecasting power of a default prediction model rather than developing a variety of corporate default models. As a result, even large corporations, the so-called 'chaebol enterprises', went bankrupt. Even after that, the analysis of past corporate defaults focused on specific variables, and when the government carried out restructuring immediately after the global financial crisis, it focused only on certain main variables such as the 'debt ratio'. A multifaceted study of corporate default prediction models is essential to protect diverse interests and to avoid a sudden total collapse like the 'Lehman Brothers case' of the global financial crisis. The key variables used in predicting corporate defaults vary over time. Deakin's (1972) study confirms this, showing that the major factors affecting corporate failure had changed relative to the analyses of Beaver (1967, 1968) and Altman (1968). In Grice's (2001) study, the importance of predictive variables was likewise assessed through Zmijewski's (1984) and Ohlson's (1980) models. However, past studies use static models, and most of them do not consider changes that occur over time. Therefore, to construct consistent prediction models, it is necessary to compensate for time-dependent bias by means of a time series analysis algorithm that reflects dynamic change. Centered on the global financial crisis, which had a significant impact on Korea, this study uses 10 years of annual corporate data from 2000 to 2009. The data are divided into training, validation, and test sets covering 7, 2, and 1 years, respectively. 
In order to construct a bankruptcy model that stays consistent as time passes, we first train a time series deep learning model using the data from before the financial crisis (2000-2006). The parameters of the existing models and of the deep learning time series algorithm are tuned with validation data that include the financial crisis period (2007-2008). As a result, we obtain a model whose validation results follow a pattern similar to its results on the training data and which shows excellent predictive power. After that, each bankruptcy prediction model is rebuilt on the combined training and validation data (2000-2008), applying the optimal parameters found in the preceding validation. Finally, the corporate default prediction models trained over these nine years are evaluated and compared using the test data (2009). This demonstrates the usefulness of the corporate default prediction model based on the deep learning time series algorithm. In addition, by adding Lasso regression to the existing variable selection methods (multiple discriminant analysis and the logit model), it is shown that the deep learning time series model based on the three bundles of variables is useful for robust corporate default prediction. The definition of bankruptcy is the same as that of Lee (2015). Independent variables include financial information such as the financial ratios used in previous studies. Multivariate discriminant analysis, the logit model, and the Lasso regression model are used to select the optimal variable group. The multivariate discriminant analysis model proposed by Altman (1968), the logit model proposed by Ohlson (1980), non-time-series machine learning algorithms, and deep learning time series algorithms are compared. Corporate data suffer from three limitations: 'nonlinear variables', 'multi-collinearity' among variables, and 'lack of data'. 
While the logit model is nonlinear, the Lasso regression model solves the multi-collinearity problem, and the deep learning time series algorithm, combined with a variable data generation method, compensates for the lack of data. Big data technology, a leading technology of the future, is moving from simple human analysis to automated AI analysis and, eventually, toward intertwined AI applications. Although the study of corporate default prediction using time series algorithms is still in its early stages, the deep learning algorithm is much faster than regression analysis at corporate default prediction modeling and also has greater predictive power. With the Fourth Industrial Revolution, the current government and governments overseas are working hard to integrate such systems into the everyday life of their nations and societies. Yet deep learning time series research for the financial industry remains scarce. This is an initial study on deep learning time series analysis of corporate defaults. It is therefore hoped that it will serve as comparative material for non-specialists starting research that combines financial data with deep learning time series algorithms.
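
The chronological train/validation/test protocol described in this abstract (train on 2000-2006, tune on 2007-2008, refit on 2000-2008, test on 2009) can be sketched as follows. This is an illustrative outline only: `split_by_year`, the record layout, and the `default` field are hypothetical, and the records are made up.

```python
def split_by_year(records, train_years, valid_years, test_years):
    """Partition firm-year records chronologically instead of at random."""
    split = {"train": [], "valid": [], "test": []}
    for record in records:
        year = record["year"]
        if year in train_years:
            split["train"].append(record)
        elif year in valid_years:
            split["valid"].append(record)
        elif year in test_years:
            split["test"].append(record)
    return split


# Made-up firm-year records; "default" is a hypothetical label field.
records = [{"firm": firm, "year": year, "default": 0}
           for firm in ("A", "B") for year in range(2000, 2010)]

# Step 1: train on 2000-2006 and tune parameters on 2007-2008.
tuning = split_by_year(records, range(2000, 2007), range(2007, 2009),
                       range(2009, 2010))

# Step 2: refit on 2000-2008 with the chosen parameters, then test on 2009.
final = split_by_year(records, range(2000, 2009), range(0),
                      range(2009, 2010))
```

Splitting by year keeps every evaluation record strictly later than the records used for fitting, which a random split would not guarantee.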

Development of The Irregular Radial Pulse Detection Algorithm Based on Statistical Learning Model (통계적 학습 모형에 기반한 불규칙 맥파 검출 알고리즘 개발)

  • Bae, Jang-Han;Jang, Jun-Su;Ku, Boncho
    • Journal of Biomedical Engineering Research / v.41 no.5 / pp.185-194 / 2020
  • Arrhythmia is usually diagnosed from the electrocardiogram (ECG) signal; however, the ECG is difficult to measure and requires expert help to analyze. The radial pulse, on the other hand, can be measured easily in daily life, and could be a suitable bio-signal for the recent contactless ('untact') paradigm as well as an extensible signal for pulse-pattern-based diagnosis in Korean medicine. In this study, we developed an irregular radial pulse detection algorithm based on a learning model and considered its applicability to arrhythmia screening. A total of 1432 pulse waves, including irregular pulse data, were used in the experiment. Three data sets were prepared with minimal preprocessing to avoid heuristic feature extraction. As classification algorithms, elastic net logistic regression, random forest, and extreme gradient boosting were applied to each data set, and irregular pulse detection performance was estimated using the area under the receiver operating characteristic curve based on 10-fold cross-validation. The extreme gradient boosting method outperformed the others, with a classification accuracy reaching 99.7%. The results confirmed that the proposed algorithm could be used for arrhythmia screening. To build a fusion technology integrating Western and Korean medicine, future research will need to address arrhythmia subtype classification from the perspective of Korean medicine.
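
The area under the ROC curve used as the evaluation metric above has a simple pairwise interpretation: it is the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as one half. The sketch below computes it directly from that definition; `roc_auc` is a hypothetical helper, and treating label 1 as "irregular pulse" is an assumption made for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Made-up labels (1 = irregular pulse, by assumption) and classifier scores.
example_auc = roc_auc([1, 1, 0, 0, 1, 0], [0.9, 0.8, 0.3, 0.6, 0.5, 0.1])
```

The double loop is quadratic in the sample count, which is fine for a sketch; library implementations typically use a sort-based formulation instead.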

Prediction of the Exposure to 1763MHz Radiofrequency Radiation Based on Gene Expression Patterns

  • Lee, Min-Su;Huang, Tai-Qin;Seo, Jeong-Sun;Park, Woong-Yang
    • Genomics & Informatics / v.5 no.3 / pp.102-106 / 2007
  • Radiofrequency (RF) radiation at the frequency of mobile phones has not been reported to induce cellular responses in in vitro and in vivo models. We exposed HEI-OC1, conditionally immortalized mouse auditory cells, to RF radiation to characterize cellular responses to 1763 MHz RF radiation. While we could not detect any differences upon RF exposure, whole-genome expression profiling might provide the most sensitive method to find the molecular responses to RF radiation. HEI-OC1 cells were exposed to 1763 MHz RF radiation at an average specific absorption rate (SAR) of 20 W/kg for 24 hr and harvested after 5 hr of recovery (R5), alongside sham-exposed samples (S5). From the whole-genome profiles of mouse neurons, we selected 9 genes differentially expressed between the S5 and R5 groups using an information gain-based recursive feature elimination procedure. Based on a support vector machine (SVM), we designed a prediction model using the 9 genes to discriminate the two groups. Our prediction model could predict the target class without any error. From these results, we developed a prediction model using biomarkers to determine RF radiation exposure in mouse auditory cells with perfect accuracy, which may need validation in in vivo RF-exposure models.

A Korean speech recognition based on conformer (콘포머 기반 한국어 음성인식)

  • Koo, Myoung-Wan
    • The Journal of the Acoustical Society of Korea / v.40 no.5 / pp.488-495 / 2021
  • We propose a speech recognition system based on the conformer. The conformer is a convolution-augmented transformer, which combines a transformer model for capturing global information with a Convolutional Neural Network (CNN) for exploiting local features effectively. The baseline system is a transformer-based speech recognizer with a Long Short-Term Memory (LSTM)-based language model. The proposed system uses a conformer instead of a transformer, together with a transformer-based language model. When evaluated on the Electronics and Telecommunications Research Institute (ETRI) speech corpus in AI-Hub, the proposed system yields a Character Error Rate (CER) of 5.7 %, while the baseline system yields a CER of 11.8 %. Even when the speech corpus is extended to another AI-Hub domain, such as the NHNdiguest speech corpus, the proposed system performs robustly across the two domains. Through these experiments, we validate the proposed system.
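
The Character Error Rate reported above is conventionally the character-level edit distance between the recognizer output and the reference transcript, divided by the reference length. A minimal sketch follows, assuming whitespace is counted like any other character (the abstract does not say how the paper normalizes text); `edit_distance` and `character_error_rate` are hypothetical helper names.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two character sequences."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            current.append(min(previous[j] + 1,      # deletion
                               current[j - 1] + 1,   # insertion
                               substitution))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER = (substitutions + deletions + insertions) / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Because insertions are counted against the reference length, CER can exceed 1.0 when the hypothesis is much longer than the reference.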

Quality Prediction Model for Manufacturing Process of Free-Machining 303-series Stainless Steel Small Rolling Wire Rods (쾌삭 303계 스테인리스강 소형 압연 선재 제조 공정의 생산품질 예측 모형)

  • Seo, Seokjun;Kim, Heungseob
    • Journal of Korean Society of Industrial and Systems Engineering / v.44 no.4 / pp.12-22 / 2021
  • This article proposes a machine learning model, i.e., a classifier, for predicting the production quality of free-machining 303-series stainless steel (STS303) small rolling wire rods according to the operating conditions of the manufacturing process. To develop the classifier, manufacturing data for 37 operating variables were collected from the manufacturing execution system (MES) of Company S, and 12 types of derived variables were generated based on a literature review and interviews with field experts. The research comprised data preprocessing, exploratory data analysis, feature selection, machine learning modeling, and the evaluation of alternative models. In the preprocessing stage, missing values and outliers are removed, and oversampling with SMOTE (Synthetic Minority Over-sampling Technique) is applied to resolve data imbalance. Features are selected by the variable importance of LASSO (least absolute shrinkage and selection operator) regression, extreme gradient boosting (XGBoost), and random forest models. Finally, logistic regression, support vector machine (SVM), random forest, and XGBoost classifiers are developed to predict adequate or defective products under new operating conditions. The optimal hyper-parameters for each model are investigated by grid search and random search based on k-fold cross-validation. In the experiment, XGBoost showed relatively high predictive performance compared to the other models, with an accuracy of 0.9929, specificity of 0.9372, F1-score of 0.9963, and logarithmic loss of 0.0209. The classifier developed in this study is expected to improve productivity by enabling effective management of the manufacturing process for STS303 small rolling wire rods.
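
The four evaluation metrics quoted above (accuracy, specificity, F1-score, and logarithmic loss) can be computed from predicted probabilities as in the sketch below. It uses the standard textbook definitions; the function name, the 0.5 decision threshold, the choice of label 1 as the positive class, and the sample data are all assumptions made for illustration and are not taken from the paper.

```python
import math


def binary_metrics(y_true, y_prob, threshold=0.5, eps=1e-15):
    """Accuracy, specificity, F1, and log loss for binary labels (1 = positive)."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

    # Clip probabilities so that log(0) never occurs.
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    log_loss = -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                    for t, p in zip(y_true, clipped)) / len(y_true)

    return {
        "accuracy": (tp + tn) / len(y_true),
        "specificity": tn / (tn + fp),        # true negative rate
        "f1": 2 * tp / (2 * tp + fp + fn),    # harmonic mean of precision/recall
        "log_loss": log_loss,
    }


# Made-up labels and predicted probabilities.
metrics = binary_metrics([1, 1, 1, 0, 0, 1, 0, 1],
                         [0.9, 0.8, 0.4, 0.2, 0.6, 0.7, 0.1, 0.95])
```

Specificity and F1 respond differently to class imbalance, which is one reason to report them alongside plain accuracy on data where one class is rare.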

Predictive Models for the Tourism and Accommodation Industry in the Era of Smart Tourism: Focusing on the COVID-19 Pandemic (스마트관광 시대의 관광숙박업 영업 예측 모형: 코로나19 팬더믹을 중심으로)

  • Yu Jin Jo;Cha Mi Kim;Seung Yeon Son;Mi Jin Noh
    • Smart Media Journal / v.12 no.8 / pp.18-25 / 2023
  • The COVID-19 outbreak in 2020 caused continuous damage worldwide; the smart tourism industry in particular was hit directly by the closure of air routes and restrictions on going out. With overseas and domestic travel both down significantly, the number of tourist hotels closing because of continued deficits is increasing. Therefore, in this study, licensing data from the Ministry of Public Administration and Security were collected and visualized to understand the operating status of the tourism and lodging industry. A machine learning classification algorithm was applied to implement a business status prediction model for tourist hotels, the performance of the prediction model was optimized using an ensemble algorithm, and the model was evaluated through 5-fold cross-validation. The survival rate of tourist hotels was predicted to decrease somewhat, but the actual survival rate was found to be no different from before COVID-19. The business status predictions for the hotel industry in this paper can serve as a basis for grasping the operability and development trends of the entire tourism and lodging industry.

Evaluation of reactor pulse experiments

  • I. Svajger;D. Calic;A. Pungercic;A. Trkov;L. Snoj
    • Nuclear Engineering and Technology / v.56 no.4 / pp.1165-1203 / 2024
  • In this paper we validate theoretical models of the pulse against experimental data from the Jozef Stefan Institute TRIGA Mark II research reactor. Data from all pulse experiments since 1991 have been collected and analysed, and are publicly available. This paper summarizes the validation study, which focuses on the comparison between experimental values, theoretical predictions (Fuchs-Hansen and Nordheim-Fuchs models) and calculations using the computational program Improved Pulse Model. The results show that, compared to the Improved Pulse Model, the theoretical models predict a higher maximum power but a lower total released energy and full width at half maximum, and a shorter time to reach the maximum power. We evaluate the uncertainties in pulse physical parameters (maximum power, total released energy and full width at half maximum) due to uncertainties in reactor physical parameters (inserted reactivity, delayed neutron fraction, prompt neutron lifetime and effective temperature reactivity coefficient of fuel). It is found that taking into account an overestimated correlation of reactor physical parameters does not significantly affect the estimated uncertainties of pulse physical parameters. The relative uncertainties of pulse physical parameters decrease with increasing inserted reactivity. If all reactor physical parameters feature an uncorrelated uncertainty of 10 %, the estimated total uncertainty in peak pulse power at 3 $ inserted reactivity is 59 %, with significant contributions coming from uncertainties in the prompt neutron lifetime and the effective temperature reactivity coefficient of fuel. In addition, we analyse the contributions of two physical mechanisms (Doppler broadening of resonances and neutron spectrum shift) to the temperature reactivity coefficient of fuel. The Doppler effect contributes around 30 % to 15 %, while the rest is due to thermal spectrum hardening, for a temperature range between 300 K and 800 K.
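
For orientation, the three pulse parameters named above have closed-form expressions in the textbook Fuchs-Nordheim point-kinetics model, which assumes adiabatic heating, a constant temperature coefficient and heat capacity, and no delayed-neutron contribution during the pulse. The sketch below evaluates those expressions; it is not the paper's Improved Pulse Model, the function and argument names are hypothetical, and the input values are illustrative rather than parameters of the JSI TRIGA reactor.

```python
import math


def fuchs_nordheim(rho, beta, prompt_lifetime, alpha, heat_capacity):
    """Closed-form pulse parameters of the textbook Fuchs-Nordheim model.

    rho, beta        inserted reactivity and delayed neutron fraction (absolute)
    prompt_lifetime  prompt neutron lifetime [s]
    alpha            magnitude of the fuel temperature reactivity coefficient [1/K]
    heat_capacity    effective fuel heat capacity [J/K]
    """
    prompt_rho = rho - beta  # reactivity above prompt critical
    if prompt_rho <= 0:
        raise ValueError("model applies only above prompt critical (rho > beta)")
    peak_power = heat_capacity * prompt_rho ** 2 / (2 * alpha * prompt_lifetime)
    energy = 2 * heat_capacity * prompt_rho / alpha
    # The pulse shape is sech^2, whose full width at half maximum is
    # 4 * arccosh(sqrt(2)) / omega with omega = prompt_rho / prompt_lifetime.
    fwhm = 4 * math.acosh(math.sqrt(2)) * prompt_lifetime / prompt_rho
    return {"peak_power": peak_power, "energy": energy, "fwhm": fwhm}


# Illustrative inputs only: a 3 $ insertion with made-up reactor parameters.
pulse = fuchs_nordheim(rho=3 * 0.007, beta=0.007, prompt_lifetime=4e-5,
                       alpha=1e-4, heat_capacity=1e5)
```

In this simplified model the peak power scales with the square of the prompt reactivity while the pulse width scales with its inverse, so larger insertions give taller and narrower pulses.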

Enhancement of concrete crack detection using U-Net

  • Molaka Maruthi;Lee, Dong Eun;Kim Bubryur
    • International conference on construction engineering and project management / 2024.07a / pp.152-159 / 2024
  • Cracks in structural materials present a critical challenge to infrastructure safety and long-term durability. Timely and precise crack detection is essential for proactive maintenance and the prevention of catastrophic structural failures. This study introduces an approach to this issue using the U-Net deep learning architecture. The primary objective is to explore the potential of U-Net to enhance the precision and efficiency of concrete crack detection under various environmental conditions. The work begins by assembling a comprehensive dataset of diverse concrete crack images, then applies advanced image processing techniques to optimize crack visibility and facilitate feature extraction. The U-Net model, well recognized for its proficiency in image segmentation tasks, is implemented to achieve precise segmentation and localization of concrete cracks. In terms of accuracy, our research attests to a substantial advancement in automated crack detection, reaching 95% across all tested concrete materials and surpassing traditional manual inspection methods. The accuracy extends to detecting cracks of varying sizes and orientations and under challenging lighting conditions, underlining the system's robustness and reliability. The reliability of the proposed model is measured using performance metrics such as precision (93%), recall (96%), and F1-score (94%). For validation, the model was tested on a different set of data and confirmed an accuracy of 94%. The results show that the system consistently performs well, even with different concrete types and lighting conditions. With real-time monitoring capabilities, the system ensures the prompt detection of cracks as they emerge, holding significant potential for reducing risks associated with structural damage and achieving substantial cost savings.
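
As a quick consistency check on the reported metrics, the F1-score is the harmonic mean of precision and recall. Using the standard definition with the rounded percentages from the abstract (the underlying unrounded values are not given, so this is only approximate):

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.93 \times 0.96}{0.93 + 0.96}
    = \frac{1.7856}{1.89}
    \approx 0.945
```

This agrees with the reported F1-score of 94% to within rounding of the inputs.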

Development of an Interface Module with a Microscopic Simulation Model for COSMOS Evaluation (미시적 시뮬레이터를 이용한 실시간 신호제어시스템(COSMOS) 평가 시뮬레이션 환경 개발)

  • Song, Sung-Ju;Lee, Seung-Hwan;Lee, Sang-Soo
    • Journal of Korean Society of Transportation / v.22 no.2 s.73 / pp.95-102 / 2004
  • COSMOS is an adaptive traffic control system that can adjust signal timing parameters in response to various traffic conditions. To evaluate the performance of COSMOS, a field study has been the only practical option because no evaluation tools are available. To overcome this limitation, a new simulator that interfaces a microscopic simulation program with COSMOS was developed. This paper describes the detector module, the signal timing module, and the general features of the simulator. A validation test was performed to verify the accuracy of the data flow within the simulator. It showed that the accuracy of the information from the simulator was high enough for real application. Several practical comments on further studies are also included to enhance the functional specifications of the simulator.