• Title/Summary/Keyword: Cluster Models

Design and Implementation of Incremental Learning Technology for Big Data Mining

  • Min, Byung-Won;Oh, Yong-Sun
    • International Journal of Contents
    • /
    • v.15 no.3
    • /
    • pp.32-38
    • /
    • 2019
  • Traditional mining techniques struggle to handle Big Data generated by various digital media and sensors. As new data accumulate continuously, growing volumes of text also cause memory shortages and a heavier learning curve, because the whole data set, including data already collected and analyzed, is inefficiently re-analyzed. In this paper, we propose a general-purpose classifier and its structure to solve these problems. We depart from current feature-reduction methods and introduce a new scheme that adopts only the changed elements when new features are partially accumulated in this free-style learning environment. The incremental learning module, built up gradually, learns only the changed parts of the data without re-processing what has already been accumulated, whereas traditional methods re-learn the entire data set whenever data are added or changed. Additionally, users can freely merge new data with previous data through the resource management procedure whenever re-learning is needed. At the end of the paper, an analysis confirms that the method performs well in a Big Data environment because of its learning efficiency. In a comparison with NB and SVM, all three models achieve an accuracy of approximately 95%. We expect the method to be a viable substitute, in both performance and accuracy, for large computing systems in Big Data analysis using a PC cluster environment.
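The idea of learning only from newly arrived data can be illustrated with a minimal sketch. This is not the classifier proposed in the paper: it is a plain multinomial Naive Bayes with Laplace smoothing, and the class name `IncrementalNaiveBayes`, the method `partial_fit`, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class IncrementalNaiveBayes:
    """Multinomial Naive Bayes whose counts are updated batch by batch.

    Illustrative stand-in only; not the paper's classifier.
    """

    def __init__(self):
        self.class_docs = Counter()              # documents seen per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocab = set()

    def partial_fit(self, docs, labels):
        """Fold a new batch into the stored counts; old data is not revisited."""
        for text, label in zip(docs, labels):
            words = text.lower().split()
            self.class_docs[label] += 1
            self.word_counts[label].update(words)
            self.vocab.update(words)

    def predict(self, text):
        """Return the class with the highest Laplace-smoothed log-posterior."""
        words = text.lower().split()
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = IncrementalNaiveBayes()
model.partial_fit(["cheap pills offer", "meeting agenda attached"], ["spam", "ham"])
# Later, only the newly arrived batch is processed.
model.partial_fit(["limited offer cheap", "project meeting notes"], ["spam", "ham"])
print(model.predict("cheap offer"))
```

Because the model stores only sufficient statistics (counts), adding a batch costs time proportional to that batch rather than to everything collected so far.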

Physical Properties of Molecular Clouds in NGC 6822 Hubble V

  • Lee, Hye-In;Pak, Soojong;Oh, Heeyoung;Le, Huynh Anh N.;Lee, Sungho;Lim, Beomdu;Tatematsu, Ken'ichi;Park, Sangwook;Mace, Gregory;Jaffe, Daniel T.
    • The Bulletin of The Korean Astronomical Society
    • /
    • v.44 no.1
    • /
    • pp.66.4-66.4
    • /
    • 2019
  • NGC 6822 is a dwarf irregular galaxy whose metal abundance is lower than that of the Large Magellanic Cloud. Hubble V is its brightest HII complex, where molecular clouds surround the core cluster of OB stars. Because of its proximity (d = 500 kpc), we can resolve the star-forming regions on parsec scales (1 arcsec = 2.4 pc). Using the high-resolution (R = 45,000) near-infrared spectrograph IGRINS, we observed molecular hydrogen emission lines from photo-dissociation regions (PDRs) and the $\mathrm{Br}\gamma$ emission line from ionized regions. In this presentation, we compare our data with PDR models in order to derive the density distribution of the molecular clouds on parsec scales and to estimate the total mass of the clouds.
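The quoted scale of about 2.4 pc per arcsecond follows from the small-angle relation (length = distance × angle in radians). A short check, where the function name `arcsec_to_parsec` is a hypothetical helper and not something from the abstract:

```python
import math

ARCSEC_PER_RADIAN = 180 * 3600 / math.pi  # about 206265


def arcsec_to_parsec(angle_arcsec, distance_pc):
    """Projected length (pc) subtended by a small angle at a given distance."""
    return distance_pc * angle_arcsec / ARCSEC_PER_RADIAN


scale = arcsec_to_parsec(1.0, 500e3)  # 500 kpc expressed in parsecs
print(f"{scale:.2f} pc per arcsec")
```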

Wellness Prediction in Diabetes Mellitus Risks Via Machine Learning Classifiers

  • Saravanakumar M, Venkatesh;Sabibullah, M.
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.4
    • /
    • pp.203-208
    • /
    • 2022
  • The occurrence of Type 2 Diabetes Mellitus (T2DM) is increasing globally. All kinds of Diabetes Mellitus are estimated to affect over 415 million adults worldwide. Diabetes was the seventh leading cause of death worldwide, with an estimated 1.6 million deaths directly caused by it during 2016. Over 90% of diabetes cases are T2DM, with most patients in the UK having at least one other chronic condition. To evaluate contemporary applications of Big Data (BD) to Diabetes Medicare and their future potential, it is necessary to carry out an in-depth review of the main theoretical literature. Long-term progress in medicine and, in particular, in the field of "Diabetology" is strongly influenced by a series of changes and innovations. Medical and healthcare data from varied sources, such as diagnosis and treatment plans, help healthcare workers gain real insight into the development of the Diabetes Medicare measures available to them. Apache Spark provides the "Resilient Distributed Dataset (RDD)", a core data structure distributed across a cluster of machines. Machine Learning (ML) offers a noteworthy method for building smart, automated algorithms. In this proposed work, common algorithms from an ML library, such as Support Vector Classification and Random Forest, are investigated using Python code in Jupyter Notebook, and the models achieve a significant level of accuracy.

K-Means Clustering with Deep Learning for Fingerprint Class Type Prediction

  • Mukoya, Esther;Rimiru, Richard;Kimwele, Michael;Mashava, Destine
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.3
    • /
    • pp.29-36
    • /
    • 2022
  • In deep learning classification tasks, most models assume that all labels are available for the training datasets. As such, strategies to learn new concepts from unlabeled datasets are scarce. In fingerprint classification tasks, most fingerprint datasets are labelled by subject/individual, and datasets labelled with finger type classes are scarce. In this paper, we develop approaches for classifying fingerprint images using the commonly known fingerprint classes. Our study provides a flexible method to learn new classes of fingerprints. Our classifier model combines a clustering technique with deep learning to cluster, and hence label, the fingerprint images into appropriate classes. The K-means clustering strategy explores the label uncertainty and high-density regions of the unlabeled data to be clustered. Using a similarity index, five clusters are created. Deep learning is then used to train a model on a publicly known fingerprint dataset with known finger class types. A prediction technique is then employed to predict the classes of the clusters from the trained model. Our proposed model performs better and has lower computational cost in learning new classes, and hence significantly reduces the cost of labelling fingerprint images.
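The cluster-then-label pipeline described above can be sketched in a few lines. Everything below is an illustrative assumption rather than the authors' implementation: the points are 2-D toy features instead of fingerprint images, two clusters are used instead of five, `toy_classifier` is a stand-in for the trained deep model, and the class names are placeholders.

```python
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, init_centroids, iters=20):
    """Plain Lloyd's k-means; returns (centroids, cluster index per point)."""
    centroids = list(init_centroids)
    k = len(centroids)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: centroid becomes the mean of its members
        # (an empty cluster keeps its previous centroid).
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assign


def label_clusters(points, assign, k, classifier):
    """Give each cluster the class its members are most often predicted as."""
    labels = {}
    for c in range(k):
        votes = Counter(classifier(p) for p, a in zip(points, assign) if a == c)
        if votes:
            labels[c] = votes.most_common(1)[0][0]
    return labels


def toy_classifier(point):
    """Hypothetical stand-in for a model trained on a labelled dataset."""
    return "arch" if point[0] < 5 else "whorl"


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (9.0, 9.0), (9.0, 10.0), (10.0, 9.0), (10.0, 10.0)]
centroids, assign = kmeans(points, [points[0], points[-1]])
labels = label_clusters(points, assign, 2, toy_classifier)
print(labels)
```

Labelling by majority vote over a whole cluster means a few misclassified members do not change the cluster's label, which is one plausible way to read "predict the classes of the clusters".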

Prediction of Energy Consumption in a Smart Home Using Coherent Weighted K-Means Clustering ARIMA Model

  • Magdalene, J. Jasmine Christina;Zoraida, B.S.E.
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.10
    • /
    • pp.177-182
    • /
    • 2022
  • Technology is progressing with every passing day, and heavy use of electricity is becoming a necessity. One way to enjoy the benefits of a smart home is to manage electric energy efficiently. When electric energy is managed appropriately, enough power is saved to be used even in hard times, such as after a natural calamity. To accomplish this, prediction of energy consumption plays a very important role. The proposed prediction model, Coherent Weighted K-Means Clustering ARIMA (CWKMCA), enhances the weighted k-means clustering technique by adding weights to the cluster points. Forecasting is done using the ARIMA model based on the centroids of the clusters produced. The dataset for this work is taken from the Pecan Project in Texas, USA. The accuracy of the model is compared with that of the traditional ARIMA model and the Weighted K-Means Clustering ARIMA model. When prediction errors such as RMSE, MAPE, AIC and AICC are analysed, the proposed model shows lower values than the ARIMA and Weighted K-Means Clustering ARIMA models. It also has a greater log-likelihood, demonstrating that it outperforms the ARIMA model for time series forecasting.
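Two of the ingredients named above, a weighted cluster centroid and the RMSE/MAPE error measures, can be written down directly. This is a sketch under stated assumptions: the abstract does not say how CWKMCA chooses its weights, so the weights and readings below are made-up values, the function names are hypothetical, and the ARIMA forecasting step is omitted.

```python
import math


def weighted_centroid(values, weights):
    """Weighted mean of the points assigned to one cluster."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up consumption readings in one cluster and illustrative weights.
centroid = weighted_centroid([1.0, 2.0, 4.0], [1.0, 1.0, 2.0])

# Made-up actual vs. forecast values for the error measures.
actual, forecast = [2.0, 4.0], [1.0, 5.0]
print(centroid, rmse(actual, forecast), mape(actual, forecast))
```

Lower RMSE and MAPE indicate forecasts closer to the observed series, which is the sense in which the abstract compares the three models.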

Implementation of Digital Management System for the Enterprises Development and Distribution in Aviation Industry

  • TIKHONOV, Alexey;SAZONOV, Andrey
    • Journal of Distribution Science
    • /
    • v.20 no.9
    • /
    • pp.39-46
    • /
    • 2022
  • Purpose: At the industrial sites of aviation enterprises, the main production processes are being significantly optimized through the use of advanced digital technologies. The most promising are the latest technologies of the industrial Internet of Things, active use of big data and practical application of artificial intelligence in production. Research design, data and methodology: The process of creating a competitive product in the high-tech aviation sector is closely linked to the investment appeal of aircraft and helicopter construction products, which rests on reducing production and time costs through the creation of an effective digital system. Results: The aviation cluster of Rostec State Corporation is currently being transformed in a significant way. The leading enterprises of the Russian aviation industry are actively mastering cooperation schemes using integrated digital management principles and the widespread introduction of digital products from leading Russian vendors. Conclusions: Following the transition to electronic aircraft design technologies and modern materials in the production of aircraft, UAC continues to improve all production processes through robotization and optimization of technological processes, enabled by the introduction of aircraft assembly technology in accordance with digital models.

Cyclic behavior of self-centering braces utilizing energy absorbing steel plate clusters

  • Jiawang Liu;Canxing Qiu
    • Steel and Composite Structures
    • /
    • v.47 no.4
    • /
    • pp.523-537
    • /
    • 2023
  • This paper proposes a new self-centering brace (SCB), which consists of four post-tensioned (PT) high-strength steel strands and energy absorbing steel plate (EASP) clusters. First, analytical equations were derived to describe the working principle of the SCB. Then, to investigate the hysteretic performance of the SCB, four full-size specimens were manufactured and subjected to the same cyclic loading protocol. One additional specimen using only EASP clusters was also tested to highlight the contribution of the PT strands. The parameters varied in the tests were the thickness of the EASP and the number of EASPs in each cluster. The test results showed that the SCB exhibited nearly flag-shaped hysteresis, as expected, with excellent recentering capability and satisfactory energy dissipating capacity. For all the specimens, the ratio of recovered deformation is in the range of 89.6% to 92.1%, and the ratio of the height of the hysteresis loop to the yielding force is in the range of 0.47 to 0.77. Finally, to further understand the mechanism of the SCB and supplement the test results, high-fidelity finite element (FE) models were established and the numerical results were compared against the experimental data. Good agreement between the experimental, numerical, and analytical results was observed, with a maximum difference of less than 12%. A parametric analysis was also carried out based on the validated FE model to evaluate the effect of some key parameters on the cyclic behavior of the SCB.

Molecular Simulation of Influence of Surface Energy on Water Lubrication (표면 에너지가 물 윤활 현상에 미치는 영향에 대한 분자시뮬레이션 연구)

  • Hyun-Joon Kim
    • Tribology and Lubricants
    • /
    • v.39 no.6
    • /
    • pp.273-277
    • /
    • 2023
  • This paper presents a molecular dynamics simulation-based numerical investigation of the influence of surface energy on water lubrication. Models composed of a crystalline substrate, a half-cylindrical tip, and a cluster of water molecules are prepared for a tribological-characteristic evaluation. To determine the effect of surface energy on lubrication, the surface energy between the substrate and water molecules as well as that between the tip and water molecules are controlled by changing the interatomic potential parameters. Simulations are conducted to investigate the indentation and sliding processes. Three different normal forces are applied to the system by controlling the indentation depth to examine the influence of normal force on the lubrication of the system. The simulation results reveal that the surface energy of the solid surface and the normal force significantly affect the behavior of the water molecules and the lubrication characteristics. The lubrication characteristics of the water molecules deteriorate with increasing normal force. At a low surface energy, the water molecules are readily squeezed out of the interface under a load, thus increasing the frictional force. In contrast, a moderate surface energy prevents expulsion of the water molecules due to squeezing, resulting in a low frictional force. At a high surface energy, although squeezing of the water molecules is restricted, as in the case of moderate surface energy, dragging occurs at the solid surface-water molecule interface, and the frictional force increases.

An Empirical Analysis on the Determinants of Residential Mobility and Reclassifying Urban and Rural Areas (도시와 농촌의 재유형화와 주거이동 결정요인 분석)

  • Heewon Chang;Donghwan An
    • Journal of Korean Society of Rural Planning
    • /
    • v.30 no.2
    • /
    • pp.79-96
    • /
    • 2024
  • The purpose of this study is to analyze the factors affecting residential mobility between urban and rural areas. After classifying urban and rural regions based on their distinguishing attributes, we applied a multinomial logistic model to sample data from the 2020 Korea Population and Housing Census. The major findings are as follows. Young, highly educated people in cities avoided rural areas. Young, less educated people in rural areas worked in secondary and tertiary industries as well as agriculture, but remained in low-paying and unstable jobs. In addition, various classes moved to rural areas, and rising house prices in cities pushed people toward rural areas. Therefore, it is necessary to develop diversified regional industry models and provide opportunities for high-quality, stable jobs in rural areas by linking industrial demand, education and jobs. Preserving the rural environment, settlement conditions and residential environment is also needed to satisfy the various needs of urban residents who migrate to rural areas. While regional policies so far have focused on maintaining population size and promoting population influx, rural development and population policies should be established in a way that responds to diverse population classes in an era of population decline.

Prediction of multipurpose dam inflow utilizing catchment attributes with LSTM and transformer models (유역정보 기반 Transformer및 LSTM을 활용한 다목적댐 일 단위 유입량 예측)

  • Kim, Hyung Ju;Song, Young Hoon;Chung, Eun Sung
    • Journal of Korea Water Resources Association
    • /
    • v.57 no.7
    • /
    • pp.437-449
    • /
    • 2024
  • Rainfall-runoff prediction studies using deep learning while considering catchment attributes have been gaining attention. In this study, we selected two models: the Transformer model, which is suitable for large-scale data training through the self-attention mechanism, and the LSTM-based multi-state-vector sequence-to-sequence (LSTM-MSV-S2S) model with an encoder-decoder structure. These models were constructed to incorporate catchment attributes and predict the inflow of 10 multi-purpose dam watersheds in South Korea. The experimental design consisted of three training methods: Single-basin Training (ST), Pretraining (PT), and Pretraining-Finetuning (PT-FT). The input data for the models included 10 selected watershed attributes along with meteorological data. The inflow prediction performance was compared based on the training methods. The results showed that the Transformer model outperformed the LSTM-MSV-S2S model when using the PT and PT-FT methods, with the PT-FT method yielding the highest performance. The LSTM-MSV-S2S model showed better performance than the Transformer when using the ST method; however, it showed lower performance when using the PT and PT-FT methods. Additionally, the embedding layer activation vectors and raw catchment attributes were used to cluster watersheds and analyze whether the models learned the similarities between them. The Transformer model demonstrated improved performance among watersheds with similar activation vectors, proving that utilizing information from other pre-trained watersheds enhances the prediction performance. This study compared the suitable models and training methods for each multi-purpose dam and highlighted the necessity of constructing deep learning models using PT and PT-FT methods for domestic watersheds. Furthermore, the results confirmed that the Transformer model outperforms the LSTM-MSV-S2S model when applying PT and PT-FT methods.