Search | Korea Science

Study on Accelerating Distributed ML Training in Orchestration

Su-Yeon Kim;Seok-Jae Moon
- International Journal of Advanced Smart Convergence, v.13 no.3, pp.143-149, 2024
As the size of data and models in machine learning training continues to grow, training on a single server is becoming increasingly challenging. Consequently, the importance of distributed machine learning, which distributes computational loads across multiple machines, is becoming more prominent. However, several unresolved issues remain regarding the performance enhancement of distributed machine learning, including communication overhead, inter-node synchronization challenges, data imbalance and bias, as well as resource management and scheduling. In this paper, we propose ParamHub, which utilizes orchestration to accelerate training speed. This system monitors the performance of each node after the first iteration and reallocates resources to slow nodes, thereby speeding up the training process. This approach ensures that resources are appropriately allocated to nodes in need, maximizing the overall efficiency of resource utilization and enabling all nodes to perform tasks uniformly, resulting in a faster training speed overall. Furthermore, this method enhances the system's scalability and flexibility, allowing for effective application in clusters of various sizes.
https://doi.org/10.7236/IJASC.2024.13.3.143
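
The reallocation step described in the abstract above (measure each node after the first iteration, then give more resources to slow nodes) might look roughly like the sketch below. The abstract does not state ParamHub's actual policy, so the proportional rule, the function name `reallocate`, the node names, and the timings are all illustrative assumptions.

```python
def reallocate(iter_seconds, total_cpus):
    """Split total_cpus across nodes in proportion to first-iteration time.

    Assumed policy (not taken from the paper): a node that took twice as
    long in the first iteration receives twice the CPU share afterwards.
    """
    total_time = sum(iter_seconds.values())
    return {node: total_cpus * t / total_time for node, t in iter_seconds.items()}


# Hypothetical first-iteration timings in seconds; node-b is the straggler.
first_iteration = {"node-a": 10.0, "node-b": 20.0, "node-c": 10.0}
shares = reallocate(first_iteration, total_cpus=8)
```

With these made-up timings, the slow node receives half of the eight CPUs and the other two nodes split the rest. A real orchestrator would also have to respect per-node limits and integer resource units.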

Wavelet-like convolutional neural network structure for time-series data classification

Park, Seungtae;Jeong, Haedong;Min, Hyungcheol;Lee, Hojin;Lee, Seungchul
- Smart Structures and Systems, v.22 no.2, pp.175-183, 2018
Time-series data often contain some of the most valuable information in many fields, including manufacturing. Because time-series data (e.g., vibration signals) are relatively cheap to acquire, they have become a crucial part of big data even on manufacturing shop floors. Recently, deep-learning models have shown state-of-the-art performance in analyzing big data because of their sophisticated structures and considerable computational power. Traditional models for machinery-monitoring systems have relied heavily on features selected by human experts. In addition, the representational power of such models fails as the data distribution becomes complicated. On the other hand, deep-learning models automatically select highly abstracted features during the optimization process, and their representational power is better than that of traditional neural network models. However, the applicability of deep-learning models to the field of prognostics and health management (PHM) has not yet been well investigated. This study integrates the "residual fitting" mechanism inherently embedded in the wavelet transform into the convolutional neural network deep-learning structure. As a result, the architecture combines a signal smoother and classification procedures into a single model. Validation results from rotor vibration data demonstrate that our model outperforms all other off-the-shelf feature-based models.
https://doi.org/10.12989/sss.2018.22.2.175

Comparison of Scala and R for Machine Learning in Spark (스파크에서 스칼라와 R을 이용한 머신러닝의 비교)

Woo-Seok Ryu
- The Journal of the Korea Institute of Electronic Communication Sciences, v.18 no.1, pp.85-90, 2023
Data analysis methodology in the healthcare field is shifting from traditional statistics-oriented research methods to predictive research using machine learning. In this study, we survey various machine learning tools and compare several programming models that combine R and Spark, so that R, a statistical tool widely used in the healthcare field, can be applied to machine learning. In addition, we compare the performance of a linear regression model implemented in Scala, the native language of Spark, and in R. In the experiment, the training execution time when using SparkR was 10 to 20% longer than with Scala. Given this performance degradation, SparkR's distributed processing was confirmed to be useful, because R, the traditional statistical analysis tool, can be used as it is.
https://doi.org/10.13067/JKIECS.2023.18.1.85

Simulation for Power Efficiency Optimization of Air Compressor Using Machine Learning Ensemble (머신러닝 앙상블을 활용한 공압기의 전력 효율 최적화 시뮬레이션)

Juhyeon Kim;Moonsoo Jang;Jieun Choi;Yoseob Heo;Hyunsang Chung;Soyoung Park
- Journal of the Korean Society of Industry Convergence, v.26 no.6_3, pp.1205-1213, 2023
This study delves into methods for enhancing the power efficiency of air compressor systems, with the primary objective of significantly impacting industrial energy consumption and environmental preservation. The paper scrutinizes Shinhan Airro Co., Ltd.'s power efficiency optimization technology and employs machine learning ensemble models to simulate power efficiency optimization. The results indicate that Shinhan Airro's optimization system led to a notable 23.5% increase in power efficiency. Nonetheless, the study's simulations, utilizing machine learning ensemble techniques, reveal the potential for a further 51.3% increase in power efficiency. By continually exploring and advancing these methodologies, this research introduces a practical approach for identifying optimization points through data-driven simulations using machine learning ensembles.
https://doi.org/10.21289/KSIC.2023.26.6.1205

Machine Learning vs. Statistical Model for Prediction Modelling: Application in Medical Imaging Research (예측모형의 머신러닝 방법론과 통계학적 방법론의 비교: 영상의학 연구에서의 적용)

Leeha Ryu;Kyunghwa Han
- Journal of the Korean Society of Radiology, v.83 no.6, pp.1219-1228, 2022
Clinical prediction models have been increasingly published in radiology research. In particular, as radiomics research is being actively conducted, prediction models are developed based on traditional statistical models, as well as machine learning, to account for high-dimensional data. In this review, we investigated the statistical and machine learning methods used in clinical prediction model research, and briefly summarized each analytical method for statistical models, machine learning, and statistical learning. Finally, we discussed several considerations for choosing the prediction modeling method.
https://doi.org/10.3348/jksr.2022.0111

A Novel Feature Selection Approach to Classify Breast Cancer Drug using Optimized Grey Wolf Algorithm

Shobana, G.;Priya, N.
- International Journal of Computer Science & Network Security, v.22 no.9, pp.258-270, 2022
Cancer has become a common disease across the globe over the past two decades, and there has been a significant increase in cancer among women. Breast and ovarian cancers are more prevalent among women. The majority of patients approach physicians only during the final stage of the disease. Early diagnosis of cancer remains a great challenge for researchers. Although new drugs are synthesized very often, their multiple benefits are less investigated. With millions of drugs synthesized and their data accessible through open repositories, drug repurposing can be done using machine learning techniques. In this paper, we propose a novel feature selection technique that generates multiple populations for the grey wolf algorithm and classifies breast cancer drugs efficiently. Three supervised machine learning algorithms, namely the Random Forest classifier, Multilayer Perceptron, and Support Vector Machine, were applied, and the Multilayer Perceptron had the higher accuracy rate of 97.7% for breast cancer drug classification. A leukemia drug dataset was also investigated, and the Multilayer Perceptron achieved 96% prediction accuracy.
https://doi.org/10.22937/IJCSNS.2022.22.9.36

Rockfall Source Identification Using a Hybrid Gaussian Mixture-Ensemble Machine Learning Model and LiDAR Data

Fanos, Ali Mutar;Pradhan, Biswajeet;Mansor, Shattri;Yusoff, Zainuddin Md;Abdullah, Ahmad Fikri bin;Jung, Hyung-Sup
- Korean Journal of Remote Sensing, v.35 no.1, pp.93-115, 2019
The availability of high-resolution laser scanning data and advanced machine learning algorithms has enabled accurate identification of potential rockfall sources. However, the presence of other mass movements, such as landslides within the same region of interest, poses additional challenges to this task. Thus, this research presents a method based on an integration of a Gaussian mixture model (GMM) and an ensemble artificial neural network (bagging ANN [BANN]) for automatic detection of potential rockfall sources in the Kinta Valley area, Malaysia. The GMM was utilised to determine slope angle thresholds of various geomorphological units. Different algorithms (ANN, support vector machine [SVM] and k-nearest neighbour [kNN]) were individually tested with various ensemble models (bagging, voting and boosting). A grid search method was adopted to optimise the hyperparameters of the investigated base models. The proposed model achieves excellent results, with success and prediction accuracies of 95% and 94%, respectively. In addition, this technique has achieved excellent accuracies (ROC = 95%) over other methods used. Moreover, the proposed model has achieved the optimal prediction accuracies (92%) on the basis of testing data, thereby indicating that the model can be generalised and replicated in different regions, and the proposed method can be applied to various landslide studies.
https://doi.org/10.7780/kjrs.2019.35.1.7
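
Of the ensemble schemes named in the abstract above (bagging, voting and boosting), voting is the simplest to illustrate. The sketch below shows plain majority voting over per-model predictions. The label names and the three prediction lists are made up for illustration, and the paper's own ensemble configuration is not reproduced here.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists (all the same length) by majority vote."""
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


# Hypothetical outputs of three base models for three terrain cells.
ann = ["source", "non-source", "source"]
svm = ["source", "source", "non-source"]
knn = ["non-source", "non-source", "source"]
combined = majority_vote([ann, svm, knn])
```

With an odd number of models and two classes, as here, no tie can occur. Other setups need an explicit tie-breaking rule.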

Identification of Pb-Zn ore under the condition of low count rate detection of slim hole based on PGNAA technology

Haolong Huang;Pingkun Cai;Wenbao Jia;Yan Zhang
- Nuclear Engineering and Technology, v.55 no.5, pp.1708-1717, 2023
The grade analysis of lead-zinc ore is the basis for the optimal development and utilization of deposits. In this study, a method combining Prompt Gamma Neutron Activation Analysis (PGNAA) technology and machine learning is proposed for lead-zinc mine borehole logging, which can identify lead-zinc ores of different grades and gangue in the formation, providing real-time grade information qualitatively and semi-quantitatively. Firstly, Monte Carlo simulation is used to obtain a gamma-ray spectrum data set for training and testing machine learning classification algorithms. These spectra are broadened, normalized and separated into inelastic scattering and capture spectra, and then used to fit different classifier models. When the comprehensive grade boundary of high- and low-grade ores is set to 5%, the evaluation metrics calculated by the 5-fold cross-validation show that the SVM (Support Vector Machine), KNN (K-Nearest Neighbor), GNB (Gaussian Naive Bayes) and RF (Random Forest) models can effectively distinguish lead-zinc ore from gangue. At the same time, the GNB model has achieved the optimal accuracy of 91.45% when identifying high- and low-grade ores, and the F₁ score for both types of ores is greater than 0.9.
https://doi.org/10.1016/j.net.2023.01.005
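
The abstract above evaluates its classifiers with 5-fold cross-validation. As a minimal sketch of what that split involves, the code below partitions sample indices into five folds, each used once as the test set. The function name `k_fold_indices` and the sample count of 12 are illustrative assumptions. The spectra and classifiers themselves are not modeled here.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical data set of 12 spectra split into 5 folds.
folds = list(k_fold_indices(12, k=5))
```

Each sample lands in exactly one test fold, so averaging a metric over the five folds uses every sample for testing once. In practice the indices are usually shuffled, or stratified by class, before splitting.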

Empirical evaluations for predicting the damage of FRC wall subjected to close-in explosions

Duc-Kien Thai;Thai-Hoan Pham;Duy-Liem Nguyen;Tran Minh Tu;Phan Van Tien
- Steel and Composite Structures, v.49 no.1, pp.65-79, 2023
This paper presents the development of empirical evaluations that can be used to evaluate the damage of fiber-reinforced concrete composite (FRC) walls subjected to close-in blast loads. For this development, a combined application of numerical simulation and machine learning approaches is employed. First, finite element modeling of an FRC wall under blast loading is developed and verified using experimental data. Numerical analyses are then carried out to investigate the dynamic behavior of the FRC wall under blast loading, and a data set of 384 samples on the damage of the FRC wall due to blast loads is produced in order to develop machine learning models. Second, three robust machine learning models, Random Forest (RF), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost), are employed to propose empirical evaluations for predicting the damage of the FRC wall. The proposed empirical evaluations are very useful for practical evaluation and design of FRC walls subjected to blast loads.
https://doi.org/10.12989/scs.2023.49.1.065

A Study on Drift Phenomenon of Trained ML (학습된 머신러닝의 표류 현상에 관한 고찰)

Shin, ByeongChun;Cha, YoonSeok;Kim, Chaeyun;Cha, ByungRae
- Smart Media Journal, v.11 no.7, pp.61-69, 2022
In a trained machine learning model, performance degrades as drift occurs over time in the learning model and the training data. To address this problem, we propose the concept of ML drift and an evaluation method for it, in order to determine when a machine learning model should be retrained. XAI tests were performed on strawberry images according to clarity, and on apple images. In the case of strawberries, the change in the XAI analysis of ML models according to the clarity value was insignificant. In the case of apple images, objects and heat map areas were classified normally for apples, but for apple flowers and buds the results were insignificant compared to strawberries and apples. This is expected to be caused by the lack of training images of apple flowers and buds, and more apple flower and bud images will be studied and tested in the future.

