• Title/Summary/Keyword: Multiple Machine Learning


Combining Multiple Classifiers for Automatic Classification of Email Documents (전자우편 문서의 자동분류를 위한 다중 분류기 결합)

  • Lee, Jae-Haeng;Cho, Sung-Bae
    • Journal of KIISE:Software and Applications / v.29 no.3 / pp.192-201 / 2002
  • Automated text classification is considered an important method for managing and processing the huge and continuously growing volume of documents in digital form. Recently, text classification has been addressed with machine learning technologies such as k-nearest neighbor, decision trees, support vector machines and neural networks. However, few investigations in text classification address real problems; most are carried out on well-organized text corpora and therefore do not demonstrate practical usefulness. This paper proposes and analyzes text classification methods for a real application, the email document classification task. First, we propose a method of combining multiple neural networks that improves performance through combination with the maximum and with neural networks. Second, we present another strategy that combines multiple machine learning classifiers: voting, Borda count and neural networks improve the overall classification performance. Experimental results show the usefulness of the proposed methods for a real application domain, yielding precision rates above 90%.
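The abstract above names voting and Borda count as combination rules. A minimal sketch of both rules follows, assuming each classifier returns either a single label or a best-first ranking of labels. The function names, the example labels and the alphabetical tie-break are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the most classifiers.

    Ties are broken alphabetically (an assumption made for determinism).
    """
    counts = Counter(predictions)
    return min(counts, key=lambda label: (-counts[label], label))


def borda_count(rankings):
    """Combine best-first label rankings with the Borda count.

    In a ranking of n labels, the label at position i (0 = best)
    receives n - 1 - i points; the label with the highest total wins.
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, label in enumerate(ranking):
            scores[label] = scores.get(label, 0) + (n - 1 - position)
    # Ties are broken alphabetically (same assumption as above).
    return min(scores, key=lambda label: (-scores[label], label))


if __name__ == "__main__":
    # Hypothetical rankings from five classifiers over three mail folders.
    rankings = [
        ["work", "personal", "spam"],
        ["work", "personal", "spam"],
        ["work", "personal", "spam"],
        ["personal", "spam", "work"],
        ["personal", "spam", "work"],
    ]
    top_choices = [ranking[0] for ranking in rankings]
    print("vote: ", majority_vote(top_choices))
    print("borda:", borda_count(rankings))
```

The two rules can disagree: voting looks only at each classifier's first choice, while the Borda count also rewards labels that are consistently ranked near the top.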

Malaysian Name-based Ethnicity Classification using LSTM

  • Hur, Youngbum
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.12 / pp.3855-3867 / 2022
  • Name separation (splitting full names into surnames and given names) is not a trivial task in a multiethnic country because the procedure for splitting surnames and given names is ethnicity-specific. Malaysia has multiple main ethnic groups; therefore, separating Malaysian full names into surnames and given names is a challenge. In this study, we develop a two-phase framework for Malaysian name separation using deep learning. In the first phase, we predict the ethnicity of full names, for which we propose a long short-term memory (LSTM) recurrent neural network model with character embeddings. In the second phase, we use a rule-based algorithm to split full names into surnames and given names based on the predicted ethnicity. We evaluate the performance of the proposed model against various machine learning models and demonstrate that it outperforms them by an average of 9%. Moreover, transfer learning and fine-tuning of the proposed model with an additional dataset result in an improvement of up to 7% on average.

Incorporating Machine Learning into a Data Warehouse for Real-Time Construction Projects Benchmarking

  • Yin, Zhe;DeGezelle, Deborah;Hirota, Kazuma;Choi, Jiyong
    • International conference on construction engineering and project management / 2022.06a / pp.831-838 / 2022
  • Machine learning is a process of using computer algorithms to extract information from raw data to solve complex problems in a data-rich environment. It has been used in the construction industry by both academics and practitioners for multiple applications to improve the construction process. The Construction Industry Institute, a leading construction research organization, has twenty-five years of experience in benchmarking capital projects in the industry. The organization is well placed to develop useful machine learning applications because it possesses an enormous amount of real construction data. Its benchmarking programs are actively used by owner and contractor companies to assess the performance of their capital projects. A credible benchmarking program requires statistically valid data without subjective interference in the program administration. In developing the next-generation benchmarking program, the Data Warehouse, the organization aims to use machine learning algorithms to minimize human effort and to enable rapid ingestion of data from diverse sources while maintaining data validity and reliability. This research effort uses a focus group composed of practitioners from the construction industry and data scientists from a variety of disciplines. The group collaborated to identify the machine learning requirements and potential applications in the program. Technical and domain experts worked to select appropriate algorithms to support the business objectives. This paper presents the initial steps in what is expected to be a chain of numerous learning algorithms supporting high-performance computing and a fully automated performance benchmarking system.


Estimation of Cerchar abrasivity index based on rock strength and petrological characteristics using linear regression and machine learning (선형회귀분석과 머신러닝을 이용한 암석의 강도 및 암석학적 특징 기반 세르샤 마모지수 추정)

  • Ju-Pyo Hong;Yun Seong Kang;Tae Young Ko
    • Journal of Korean Tunnelling and Underground Space Association / v.26 no.1 / pp.39-58 / 2024
  • Tunnel Boring Machines (TBM) use multiple disc cutters to excavate tunnels through rock. These cutters wear out due to continuous contact and friction with the rock, leading to decreased cutting efficiency and reduced excavation performance. The rock's abrasivity significantly affects cutter wear, with highly abrasive rocks causing more wear and reducing the cutter's lifespan. The Cerchar Abrasivity Index (CAI) is a key indicator for assessing rock abrasivity and is essential for predicting disc cutter life and performance. This study aims to develop a new method for effectively estimating CAI from rock strength and petrological characteristics using linear regression and machine learning. A database including CAI, uniaxial compressive strength, Brazilian tensile strength, and equivalent quartz content was created, with additional derived variables. Variables for multiple linear regression were selected considering statistical significance and multicollinearity, while machine learning model inputs were chosen based on variable importance. Among the machine learning prediction models, the Gradient Boosting model showed the highest predictive performance. Finally, the predictive performance of the multiple linear regression model and the Gradient Boosting model derived in this study was compared with that of CAI prediction models from previous studies to validate the results of this research.

Control of Single Propeller Pendulum with Supervised Machine Learning Algorithm

  • Tengis, Tserendondog;Batmunkh, Amar
    • International journal of advanced smart convergence / v.7 no.3 / pp.15-22 / 2018
  • Nowadays, multiple control methods are used in robot control systems. A model, predictor or error estimator is often used as a feedback controller to control a robot. Meanwhile, robots increasingly rely on algorithms capable of acquiring knowledge independently from raw data. This paper presents experimental results of real-time machine learning control that does not require explicit knowledge about the plant. The controller can be applied to a broad range of tasks with different dynamic characteristics. We tested our controller on the balancing problem of a single propeller pendulum. Experimental results show that using a supervised machine learning algorithm on a single propeller pendulum allows a stable swing at a given angle.

A Study on Prediction Model of Scaffold Appearance Defect Using Machine Learning (기계 학습을 이용한 인공지지체 외형 불량 예측 모델에 관한 연구)

  • Lee, Song-Yeon;Huh, Yong Jeong
    • Journal of the Semiconductor & Display Technology / v.19 no.2 / pp.26-30 / 2020
  • In this paper, we studied the problem of the number of experiments needed to identify defects in scaffolds. To predict defects when printing disk-type scaffolds with an FDM 3D printer, each of the five print factors must be varied, so the number of scaffold prints would exceed 100,000. Such a number of experiments is difficult to perform in the field. To solve this problem, we produced a prediction model based on multiple linear regression, a machine learning method, using print conditions and the corresponding scaffold defect data. The prediction model was verified through experiments. The verification confirmed that the error was less than 0.5%, which is within the target margin of error of 5%.

A shop recommendation learning with Tensorflow.js (Tensorflow.js를 활용한 상점 추천 학습)

  • Cho, Jaeyoung;Lee, Sangwon;Chung, Tai Myoung
    • Proceedings of the Korean Society of Computer Information Conference / 2019.07a / pp.267-270 / 2019
  • This research analyzed shop rating data. A model was designed for discrete multi-class classification of these data, and experiments were then run to observe the trained model. Across benchmarks with different settings of the model's variables, we measured the hit ratio, which indicates how often the prediction matches the expected label. Based on the benchmark results, the model was redesigned once during the research, and the effect of each setting variable on the model was clarified. The results also point to future work on how learning could be improved and what should be designed in further research.


Production Performance Prediction of Pig Farming using Machine Learning (기계학습기반 양돈생산성 예측방안)

  • Lee, Woongsup;Sung, Kil-Young;Ban, Tae-Won;Ham, Young Hwa
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.1 / pp.130-133 / 2020
  • IoT-based smart pig farms have been widely adopted by pig farmers. To achieve optimal control of a smart pig farm, the relation between environmental conditions and performance metrics should be characterized. In this study, the relation between multiple environmental conditions, including temperature and humidity, and various performance metrics, namely daily gain, feed intake, and MSY, is analyzed based on data obtained from 55 real pig farms. In particular, after preprocessing the data, various regression-based machine learning algorithms are considered. Through performance evaluation, we show that performance can be predicted with high precision, which can improve the efficiency of farm management.

Severity Prediction of Sleep Respiratory Disease Based on Statistical Analysis Using Machine Learning (머신러닝을 활용한 통계 분석 기반의 수면 호흡 장애 중증도 예측)

  • Jun-Su Kim;Byung-Jae Choi
    • IEMEK Journal of Embedded Systems and Applications / v.18 no.2 / pp.59-65 / 2023
  • Currently, polysomnography is essential for diagnosing sleep-related breathing disorders. However, polysomnography has several disadvantages, such as the requirement for multiple sensors and a long reading time. In this paper, we propose a system for predicting the severity of sleep-related breathing disorders at home using elements measurable by a wearable device. To predict severity, the variables were refined through a three-step variable selection process, and the refined variables were used as inputs to three machine learning models. Random forest models showed excellent prediction performance throughout. The best F1 scores of the model for the three threshold criteria of 5, 15, and 30 on the AHI were about 87.3%, 90.7%, and 90.8%, respectively, and the best F1 scores for the three threshold criteria on the RDI were approximately 79.8%, 90.2%, and 90.1%, respectively.
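The abstract above reports F1 scores at AHI thresholds of 5, 15, and 30. As a rough illustration of how such a score is computed, the sketch below binarizes index values at a threshold and evaluates F1 against predicted labels. The function names, the `>=` cut-off convention and the sample values are assumptions for illustration only; they are not taken from the paper.

```python
def binarize(values, threshold):
    """Label each index value True when it reaches the threshold.

    Using >= as the cut-off is an assumption, not the paper's stated rule.
    """
    return [value >= threshold for value in values]


def f1_score(actual, predicted):
    """F1 = 2 * precision * recall / (precision + recall) for boolean labels."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical measured AHI values and hypothetical model outputs.
    measured_ahi = [2.0, 7.5, 16.0, 31.0, 4.0, 22.0]
    predicted_severe = [False, True, True, True, False, False]
    actual = binarize(measured_ahi, threshold=15)
    print(f"F1 at threshold 15: {f1_score(actual, predicted_severe):.3f}")
```

Repeating the evaluation with thresholds 5 and 30 would give one F1 score per severity criterion, which is the shape of the results quoted in the abstract.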

Clinico-pathologic Factors and Machine Learning Algorithm for Survival Prediction in Parotid Gland Cancer (귀밑샘 암종에서 생존 예측을 위한 임상병리 인자 분석 및 머신러닝 모델의 구축)

  • Kwak, Seung Min;Kim, Se-Heon;Choi, Eun Chang;Lim, Jae-Yol;Koh, Yoon Woo;Park, Young Min
    • Korean Journal of Head & Neck Oncology / v.38 no.1 / pp.17-24 / 2022
  • Background/Objectives: This study analyzed the prognostic significance of clinico-pathologic factors, including comprehensive nodal factors, in patients with parotid gland cancers (PGCs) and constructed a survival prediction model for these patients using machine learning techniques. Materials & Methods: A total of 131 PGCs patients were enrolled in the study. Results: There were 19 cases (14.5%) of lymph node (LN) metastasis at the lower neck level, and 43 cases (32.8%) involved multiple-level LN metastases. There were 2 cases (1.5%) of metastases to the contralateral LNs. Intraparotid LN metastasis was observed in 6 cases (4.6%), and extranodal extension (ENE) was observed in 35 cases (26.7%). Lymphovascular invasion (LVI) and perineural invasion were observed in 42 cases (32.1%) and 49 cases (37.4%), respectively. Machine learning prediction models were constructed using clinico-pathologic factors including comprehensive nodal factors; the Decision Tree and Stacking models showed the highest accuracy, at 74% and 70%, respectively, for predicting patient survival. Conclusion: Lower-level LN metastasis and LNR have important prognostic significance for predicting disease recurrence and survival in PGCs patients. These two factors were used as important features in constructing the machine learning prediction model. Our machine learning model could predict the survival of PGCs patients with a considerable level of accuracy.