• Title/Summary/Keyword: Machine Learning Methodologies

A Study on the Prediction Model of the Elderly Depression

  • SEO, Beom-Seok;SUH, Eung-Kyo;KIM, Tae-Hyeong
    • The Journal of Industrial Distribution & Business / v.11 no.7 / pp.29-40 / 2020
  • Purpose: Modern society faces many urban problems, such as aging, the hollowing out of old city centers, and polarization within cities. In this study, we apply big data and machine learning methodologies to predict depression symptoms in the elderly population early, thereby contributing to solving the problem of elderly depression. Research design, data and methodology: We used a random forest and analyzed the correlation between the CES-D10, a depression scale widely used worldwide, and other variables to estimate the important variables. Two dependent variables were defined, one distinguishing normal from depression and one distinguishing moderate from severe depression. A total of 106 independent variables were included, covering the objective characteristics of the elderly as well as surveys of subjective health, health, cognitive abilities, employment, household background, income, consumption, assets, subjective expectations, and quality of daily life. Results: Satisfaction with residential area, quality of life, and cognitive ability scores had important effects in classifying elderly depression; satisfaction with living quality and economic conditions and the number of outpatient care visits in living areas and clinics were also important variables. In the random forest performance evaluation, the model classifying whether or not an elderly person is depressed achieved an accuracy of 86.3%, a sensitivity of 79.5%, and a specificity of 93.3%. The model classifying the degree of elderly depression achieved an accuracy of 86.1%, a sensitivity of 93.9%, and a specificity of 74.7%. Conclusions: In this study, the important variables of the estimated predictive model were identified using the random forest technique, with a focus on predictive performance itself. The research has limitations, such as the lack of clear criteria for classifying depression levels and the failure to reflect variables other than KLoSA data. Nevertheless, if additional variables are secured and high-performance predictive models are estimated with various machine learning techniques, it is expected that early detection of depression can help improve the quality of life of senior citizens and support public policy decisions.
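
The accuracy, sensitivity, and specificity figures reported in this abstract follow from the counts in a binary confusion matrix. The sketch below shows how the three metrics are conventionally computed. The function name and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: depressed cases predicted as depressed / as normal
    tn/fp: normal cases predicted as normal / as depressed
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total      # share of all cases classified correctly
    sensitivity = tp / (tp + fn)      # true positive rate
    specificity = tn / (tn + fp)      # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only (not data from the paper)
acc, sens, spec = classification_metrics(tp=80, fn=20, tn=90, fp=10)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters here because the two models in the abstract trade them off in opposite directions.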

Predicting the splitting tensile strength of manufactured-sand concrete containing stone nano-powder through advanced machine learning techniques

  • Manish Kewalramani;Hanan Samadi;Adil Hussein Mohammed;Arsalan Mahmoodzadeh;Ibrahim Albaijan;Hawkar Hashim Ibrahim;Saleh Alsulamy
    • Advances in nano research / v.16 no.4 / pp.375-394 / 2024
  • The extensive use of concrete has raised environmental concerns, particularly the depletion of river sand. To address this issue, waste deposits can provide manufactured sand (MS) as a substitute for river sand. This study explores the application of machine learning techniques to facilitate the production of manufactured-sand concrete (MSC) containing stone nano-powder by estimating the splitting tensile strength (STS) from nine input features: compressive strength of cement (CSC), tensile strength of cement (TSC), curing age (CA), maximum size of the crushed stone (Dmax), stone nano-powder content (SNC), fineness modulus of sand (FMS), water-to-cement ratio (W/C), sand ratio (SR), and slump (S). To this end, a total of 310 data points covering these nine factors affecting the mechanical properties of MSC were collected through laboratory tests. The dataset was then divided into a training subset and a testing subset, comprising 90% (280 samples) and 10% (30 samples) of the data, respectively. Using this dataset, novel models were developed to evaluate the STS of MSC from the nine input features. The analysis revealed significant correlations of CSC and CA with STS. Moreover, sensitivity analysis using an empirical model showed that parameters such as FMS and W/C exert minimal influence on the STS. Various loss functions were used to gauge the effectiveness and precision of the methods. All models achieved an R-squared value above 0.75 and negligibly small loss function values. To support the estimation of STS in engineering practice, a user-friendly graphical interface was also developed for the machine learning models. The proposed models offer a practical alternative to laborious, expensive, and complex laboratory techniques, thereby simplifying the production of mortar specimens.
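
The R-squared threshold quoted in this abstract can be made concrete with a minimal sketch of the standard coefficient of determination, 1 - SS_res / SS_tot. The function name and the sample values below are illustrative assumptions; they are not measurements or predictions from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1 - ss_res / ss_tot


# Hypothetical strength values and model predictions (illustration only)
measured = [2.0, 3.0, 4.0, 5.0]
predicted = [2.1, 2.9, 4.2, 4.8]
print(f"R^2 = {r_squared(measured, predicted):.2f}")
```

In a setting like the one described, the score would be computed on the held-out test samples rather than on the training samples.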

Study on the Performance Evaluation of Encoding and Decoding Schemes in Vector Symbolic Architectures (벡터 심볼릭 구조의 부호화 및 복호화 성능 평가에 관한 연구)

  • Youngseok Lee
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.17 no.4 / pp.229-235 / 2024
  • Recent years have seen active research on methods for efficiently processing and interpreting large volumes of data in the fields of artificial intelligence and machine learning. One of these data processing technologies, Vector Symbolic Architecture (VSA), offers an innovative approach to representing complex symbols and data using high-dimensional vectors. VSA has garnered particular attention in various applications such as natural language processing, image recognition, and robotics. This study quantitatively evaluates the characteristics and performance of VSA methodologies by applying five VSA methodologies to the MNIST dataset and measuring key performance indicators such as encoding speed, decoding speed, memory usage, and recovery accuracy across different vector lengths. BSC and VT demonstrated relatively fast performance in encoding and decoding speeds, while MAP and HRR were relatively slow. In terms of memory usage, BSC was the most efficient, whereas MAP used the most memory. The recovery accuracy was highest for MAP and lowest for BSC. The results of this study provide a basis for selecting appropriate VSA methodologies depending on the application area.
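
To illustrate what encoding and decoding mean in a VSA, the sketch below binds and unbinds two random bipolar vectors by element-wise multiplication. This is the binding operation usually associated with MAP, assuming MAP here stands for Multiply-Add-Permute, which the abstract does not spell out. The vector length, seed, and function names are hypothetical, and this is not the paper's evaluation code.

```python
import random


def random_bipolar(dim, rng):
    """Random hypervector with entries drawn uniformly from {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def bind(a, b):
    """Element-wise multiplication; self-inverse for bipolar vectors."""
    return [x * y for x, y in zip(a, b)]


def similarity(a, b):
    """Normalized dot product, in [-1, 1]."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)            # fixed seed so the sketch is repeatable
dim = 1000                        # hypothetical vector length
key = random_bipolar(dim, rng)
value = random_bipolar(dim, rng)

bound = bind(key, value)          # encoding: associate key with value
recovered = bind(bound, key)      # decoding: multiply by the key again

print(similarity(recovered, value))   # exact recovery for a single bound pair
print(similarity(bound, value))       # bound vector is nearly orthogonal to value
```

Recovery is exact in this case because each key entry squared equals 1. Accuracy typically degrades only when several bound pairs are superimposed into one vector, which is where vector length begins to matter.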

Generalized Steganalysis using Deep Learning (딥러닝을 이용한 범용적 스테그아날리시스)

  • Kim, Hyunjae;Lee, Jaekoo;Kim, Gyuwan;Yoon, Sungroh
    • KIISE Transactions on Computing Practices / v.23 no.4 / pp.244-249 / 2017
  • Steganalysis aims to detect information hidden by steganography inside general data such as images. Some steganalysis techniques use machine learning (ML). Existing ML approaches to steganalysis are based on extracting features from stego images and modeling them. Recently, deep learning-based methodologies have shown significant improvements in detection accuracy. However, all existing methods, including deep learning-based ones, have a critical limitation in that they can only detect stego images created by a specific steganography method. In this paper, we propose a generalized steganalysis method that can model multiple types of stego images using deep learning. Through various experiments, we confirm the effectiveness of our approach and envision directions for future research. In particular, we show that our method can detect each type of steganography with the same level of accuracy as a steganalysis method dedicated to that type, thereby demonstrating the general applicability of our approach to multiple types of stego images.

Dynamic forecasts of bankruptcy with Recurrent Neural Network model (RNN(Recurrent Neural Network)을 이용한 기업부도예측모형에서 회계정보의 동적 변화 연구)

  • Kwon, Hyukkun;Lee, Dongkyu;Shin, Minsoo
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.139-153 / 2017
  • Corporate bankruptcy can cause great losses not only to stakeholders but also to many related sectors in society. Through successive economic crises, bankruptcies have increased and bankruptcy prediction models have become increasingly important. Corporate bankruptcy has therefore been regarded as one of the major research topics in business management, and many studies are also in progress in industry. Previous studies used various methodologies to improve bankruptcy prediction accuracy and to resolve the overfitting problem, including statistical methods such as Multivariate Discriminant Analysis (MDA) and the Generalized Linear Model (GLM). More recently, researchers have used machine learning methodologies such as Support Vector Machine (SVM) and Artificial Neural Network (ANN), as well as fuzzy theory and genetic algorithms. As a result, many bankruptcy models have been developed and performance has improved. In general, a company's financial and accounting information changes over time, as does the market situation, so it is difficult to predict bankruptcy using only information from a single point in time. Although traditional research does not take this time effect into account, dynamic models have not been studied much. Ignoring the time effect yields biased results, so a static model may not be suitable for predicting bankruptcy, and a dynamic model may improve bankruptcy prediction. In this paper, we propose a Recurrent Neural Network (RNN), one of the deep learning methodologies. RNNs learn from time series data and are known to perform well. For the experiment, we selected non-financial firms listed on the KOSPI, KOSDAQ and KONEX markets from 2010 to 2016 for estimating the bankruptcy prediction model and comparing forecasting performance. To avoid predicting bankruptcy from financial information that already reflects the deterioration of a company's financial condition, financial information was collected with a lag of two years, and the default period was defined as January to December of the year. We defined bankruptcy as delisting due to sluggish earnings, and confirmed delistings on KIND, a corporate stock information website. We then selected variables from previous papers. The first set consists of Z-score variables, which have become traditional variables in bankruptcy prediction. The second set is a dynamic variable set. We finally selected 240 normal companies and 226 bankrupt companies for the first variable set, and 229 normal companies and 226 bankrupt companies for the second. We created a model that reflects dynamic changes in time-series financial data and, by comparing it with existing bankruptcy prediction models, found that it could help improve the accuracy of bankruptcy predictions. We used financial data from KIS Value (a financial database) and selected MDA, GLM (logistic regression), SVM, and ANN models as benchmarks. The experimental results showed that the RNN outperformed the comparison models. The accuracy of the RNN was high for both variable sets, as was the Area Under the Curve (AUC) value. In the hit-ratio table, the proportion of poorly performing companies that the RNN predicted to be bankrupt was also higher than that of the comparison models. A limitation of this paper is that overfitting occurs during RNN training, but we expect this problem can be solved by selecting more training data and appropriate variables. From these results, we expect this research to contribute to the development of bankruptcy prediction by proposing a new dynamic model.

Tree size determination for classification ensemble

  • Choi, Sung Hoon;Kim, Hyunjoong
    • Journal of the Korean Data and Information Science Society / v.27 no.1 / pp.255-264 / 2016
  • Classification is predictive modeling for a categorical target variable. Classification ensemble methods, which predict with better accuracy by combining multiple classifiers, have become a powerful machine learning and data mining paradigm. Well-known classification ensemble methodologies are boosting, bagging, and random forest. In this article, we assume that decision trees are used as the classifiers in the ensemble, and we hypothesize that tree size affects classification accuracy. To study how tree size influences accuracy, we performed experiments using twenty-eight data sets and compared the performance of four ensemble algorithms (bagging, double-bagging, boosting, and random forest) with different tree sizes.

Bioinformatics and Genomic Medicine (생명정보학과 유전체의학)

  • Kim, Ju-Han
    • Journal of Preventive Medicine and Public Health / v.35 no.2 / pp.83-91 / 2002
  • Bioinformatics is a rapidly emerging field of biomedical research. A flood of large-scale genomic and postgenomic data means that many of the challenges in biomedical research are now challenges in computational sciences. Clinical informatics has long developed methodologies to improve biomedical research and clinical care by integrating experimental and clinical information systems. The informatics revolutions both in bioinformatics and clinical informatics will eventually change the current practice of medicine, including diagnostics, therapeutics, and prognostics. Postgenome informatics, powered by high throughput technologies and genomic-scale databases, is likely to transform our biomedical understanding forever much the same way that biochemistry did a generation ago. The paper describes how these technologies will impact biomedical research and clinical care, emphasizing recent advances in biochip-based functional genomics and proteomics. Basic data preprocessing with normalization, primary pattern analysis, and machine learning algorithms will be presented. Use of integrated biochip informatics technologies, text mining of factual and literature databases, and integrated management of biomolecular databases will be discussed. Each step will be given with real examples in the context of clinical relevance. Issues of linking molecular genotype and clinical phenotype information will be discussed.

Systematic Literature Review for the Application of Artificial Intelligence to the Management of Construction Claims and Disputes

  • Seo, Wonkyoung;Kang, Youngcheol
    • International conference on construction engineering and project management / 2022.06a / pp.57-66 / 2022
  • Claims and disputes are major causes of cost and schedule overruns in the construction business. To manage claims and disputes effectively, it is necessary to analyze various types of contract documents promptly and accurately. Because the volume of such documents is vast, analyzing them in a timely manner is very challenging in practice. Recently developed approaches such as artificial intelligence (AI), machine learning algorithms, and natural language processing (NLP) have been applied to various topics in the field of construction contract and claim management. Based on a systematic literature review, this paper analyzes the goals, methodologies, and application results of such approaches. AI methods applied to construction contract management are classified into several categories. The study identifies the possibilities and limitations of applying such approaches, and provides directions for how they should be applied to contract management in future studies, which will eventually lead to more effective management of claims and disputes.

Blockchain-Enabled Decentralized Clustering for Enhanced Decision Support in the Coffee Supply Chain

  • Keo Ratanak;Muhammad Firdaus;Kyung-Hyune Rhee
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.260-263 / 2023
  • Considering the growth of blockchain technology, this research aims to transform the efficiency of recommending optimal coffee suppliers within the complex supply chain network. This transformation relies on extracting vital transactional data and insights from stakeholders, facilitated by the interaction between the application interface (e.g., a REST API) and the blockchain network. The extracted data are then subjected to advanced data processing techniques and used with machine learning methodologies to establish a robust recommendation system. This approach seeks to empower users with informed decision-making abilities, thereby enhancing operational efficiency in identifying the most suitable coffee supplier for each customer. Furthermore, the research employs data visualization techniques to illustrate the clustering patterns generated by the K-Means algorithm, providing a visual dimension to the study's evaluation.

Applying NIST AI Risk Management Framework: Case Study on NTIS Database Analysis Using MAP, MEASURE, MANAGE Approaches (NIST AI 위험 관리 프레임워크 적용: NTIS 데이터베이스 분석의 MAP, MEASURE, MANAGE 접근 사례 연구)

  • Jung Sun Lim;Seoung Hun, Bae;Taehoon Kwon
    • Journal of Korean Society of Industrial and Systems Engineering / v.47 no.2 / pp.21-29 / 2024
  • Fueled by international efforts towards AI standardization, including those by the European Commission, the United States, and international organizations, this study introduces an AI-driven framework for analyzing advancements in drone technology. Using project data retrieved from the NTIS DB via the "drone" keyword, the framework employs a diverse toolkit of supervised learning methods (Keras MLP, XGBoost, LightGBM, and CatBoost) enhanced by BERTopic (a natural language analysis tool). This multifaceted approach ensures both comprehensive data quality evaluation and in-depth structural analysis of documents. Furthermore, a 6T-based classification method refines non-applicable data for year-on-year AI analysis, improving accuracy as measured by the accuracy metric. Using AI tools, including GPT-4, this research reveals year-on-year trends in emerging keywords and employs them to generate detailed summaries, enabling efficient processing of large text datasets and offering an AI analysis system applicable to policy domains. Notably, this study not only advances methodologies aligned with AI Act standards but also lays the groundwork for responsible AI implementation through analysis of government research and development investments.