• Title/Summary/Keyword: cluster method

An Analytic solution for the Hadoop Configuration Combinatorial Puzzle based on General Factorial Design

  • Priya, R. Sathia;Prakash, A. John;Uthariaraj, V. Rhymend
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.11 / pp.3619-3637 / 2022
  • Big data analytics offers endless opportunities for operational enhancement by extracting valuable insights from complex, voluminous data. Hadoop is a comprehensive technology suite that offers solutions for the large-scale storage and computing needs of big data. The performance of Hadoop is closely tied to its configuration settings, which depend on the cluster capacity and the application profile. Since Hadoop has over 190 configuration parameters, tuning them for optimal application performance is a daunting challenge. Our approach is to extract a subset of impactful parameters, from which a performance-enhancing sub-optimal configuration is then narrowed down. This paper presents a statistical model to analyze the significance of the effect of Hadoop parameters on a variety of performance metrics. Our model decomposes the total observed performance variation and ascribes it to the main parameters, their interaction effects, and noise factors. The method clearly separates impactful parameters from the rest. The configuration setting determined by our methodology reduced the job completion time by 22%, resource utilization in terms of memory and CPU by 15% and 12% respectively, the number of killed Maps by 50%, and disk spillage by 23%. The proposed technique can be leveraged to ease the configuration tuning task of any Hadoop cluster, regardless of differences in the underlying infrastructure and the application running on it.
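
The variance decomposition described in this abstract can be illustrated with a minimal sketch of a balanced two-factor factorial design. The paper's actual model and parameters are not given here, so the factor names `A` and `B`, the function `factorial_ss`, and the run times in `runs` are all hypothetical. The sketch splits the total sum of squares into two main effects, their interaction, and residual error (noise).

```python
from itertools import product
from statistics import mean


def factorial_ss(data):
    """Decompose variation for a balanced two-factor design with replicates.

    data maps (level_a, level_b) -> list of replicate measurements.
    Returns sums of squares for A, B, the A x B interaction, error, and total.
    """
    levels_a = sorted({a for a, _ in data})
    levels_b = sorted({b for _, b in data})
    n = len(next(iter(data.values())))  # replicates per cell (balanced design assumed)
    grand = mean(y for ys in data.values() for y in ys)
    mean_a = {a: mean(y for b in levels_b for y in data[(a, b)]) for a in levels_a}
    mean_b = {b: mean(y for a in levels_a for y in data[(a, b)]) for b in levels_b}
    cell = {key: mean(ys) for key, ys in data.items()}

    ss_a = n * len(levels_b) * sum((mean_a[a] - grand) ** 2 for a in levels_a)
    ss_b = n * len(levels_a) * sum((mean_b[b] - grand) ** 2 for b in levels_b)
    ss_ab = n * sum((cell[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
                    for a, b in product(levels_a, levels_b))
    ss_err = sum((y - cell[key]) ** 2 for key, ys in data.items() for y in ys)
    ss_total = sum((y - grand) ** 2 for ys in data.values() for y in ys)
    return {"A": ss_a, "B": ss_b, "AxB": ss_ab, "error": ss_err, "total": ss_total}


# Hypothetical job completion times (seconds): two parameters, two levels each,
# two replicate runs per configuration. These numbers are illustrative only.
runs = {
    (0, 0): [100, 104],
    (0, 1): [90, 94],
    (1, 0): [80, 84],
    (1, 1): [60, 64],
}
ss = factorial_ss(runs)
for source in ("A", "B", "AxB", "error"):
    print(f"{source}: {100 * ss[source] / ss['total']:.1f}% of total variation")
```

A parameter whose share of the total variation is large relative to the error term would be a candidate "impactful" parameter; a formal significance test (for example an F-test on the mean squares) would be the next step and is not shown.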

TPACK of Faculty in Higher Education: Current Status and Future Directions

  • KIM, Dongsim;KIM, Wonsik
    • Educational Technology International / v.19 no.1 / pp.153-173 / 2018
  • The purpose of this study was to investigate the teaching competence of faculty members based on TPACK, which should be examined to ensure high quality in higher education. The study focused on TPACK, which integrates technology knowledge (TK), content knowledge (CK), and pedagogy knowledge (PK). After excluding insincere responses, data from a total of 85 participants were used for analysis. K-means cluster analysis was used to examine how faculty members could be distinguished by TPACK type. The results showed three different types of faculty groups (well-balanced competence type, development-required competence type, and lack-of-technology competence type). First, faculty members in the well-balanced competence type were above the average level in TPACK. Second, faculty members in the development-required competence type reported below-average levels in TPACK and therefore need to increase their teaching competence. Finally, faculty members in the lack-of-technology competence type need to enhance their technology-related competence because their overall TK level was relatively low. The study also examined what distinctive characteristics existed in each type depending on gender, teaching career, nationality, and age. The results offer a basis for better understanding TPACK in order to enhance teaching competence at the university level.

Genome-Wide Comprehensive Analysis of the GASA Gene Family in Peanut (Arachis hypogaea L.)

  • Rizwana B.Syed Nabi;Eunyoung Oh;Sungup Kim;Kwang-Soo Cho;Myoung Hee Lee
    • Proceedings of the Korean Society of Crop Science Conference / 2022.10a / pp.231-231 / 2022
  • GASA (gibberellic acid-stimulated Arabidopsis) proteins are a family of small cysteine-rich peptides found in plants. The GASA gene family is mainly involved in biotic/abiotic stress responses and plant development. Despite being present in a wide range of plant species, their action and functions remain unclear. In this study, using in-silico analysis, we identified 41 GASA genes in peanut (Arachis hypogaea L.). Based on phylogenetic analysis, the 41 GASA genes are classified into four major clusters and subclades. Clusters IV and III comprise the majority of GASA genes, with 15 and 11 genes respectively, followed by cluster I and cluster II with 9 and 6 genes respectively. Additionally, based on in-silico analysis, we predicted the post-transcriptional and post-translational changes of GASA proteins under abiotic stresses such as drought and salt stress, which would aid our understanding of the regulatory mechanisms. Hence, a further study is planned to evaluate the expression of these GASA genes under stress in different plant tissues to elucidate their possible functional role in peanut plants. These findings might offer insightful data for peanut improvement.

Molecular dynamics simulation of primary irradiation damage in Ti-6Al-4V alloys

  • Tengwu He;Xipeng Li;Yuming Qi;Min Zhao;Miaolin Feng
    • Nuclear Engineering and Technology / v.56 no.4 / pp.1480-1489 / 2024
  • Displacement cascade behaviors of Ti-6Al-4V alloys are investigated using molecular dynamics (MD) simulation. The embedded atom method (EAM) potential for the Ti, Al and V elements is modified by adding the Ziegler-Biersack-Littmark (ZBL) potential to describe the short-range interaction among different atoms. The time evolution of displacement cascades at the atomic scale is quantitatively evaluated for primary knock-on atom (PKA) energies ranging from 0.5 keV to 15 keV, and that for pure Ti is also computed for comparison. The effects of temperature and the incident direction of the PKA are studied in detail. The results show that increasing temperature reduces the number of surviving Frenkel pairs (FPs), while the incident direction of the PKA shows little correlation with that number. Furthermore, increasing temperature promotes the clustering of point defects but reduces the number of defects, owing to the accelerated recombination of vacancies and interstitial atoms at relatively high temperature. The cluster fractions of interstitials and vacancies both increase with the PKA energy, whereas the increase for interstitial clusters is slightly larger due to their higher mobility. Compared to pure Ti, the presence of Al and V is beneficial to the formation of interstitial clusters and indirectly hinders the production of vacancy clusters.

An Inference Similarity-based Federated Learning Framework for Enhancing Collaborative Perception in Autonomous Driving

  • Zilong Jin;Chi Zhang;Lejun Zhang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.5 / pp.1223-1237 / 2024
  • Autonomous vehicles use onboard sensors to sense the surrounding environment. In complex autonomous driving scenarios, detection and recognition capabilities are constrained, which may result in serious accidents. An efficient way to enhance these capabilities is to establish collaborations with neighboring vehicles. However, such collaborations introduce additional challenges in terms of data heterogeneity, communication cost, and data privacy. In this paper, a novel personalized federated learning framework is proposed to address these challenges and enable efficient collaboration in autonomous driving environments. To obtain a global model, vehicles perform local training and transmit logits to a central unit instead of the entire model, so that the communication cost is minimized and data privacy is protected. Then, the inference similarity is derived to capture the characteristics of data heterogeneity. The vehicles are divided into clusters based on the inference similarity, and a weighted aggregation is performed within each cluster. Finally, the vehicles download the corresponding aggregated global model and train a model personalized for the cluster with a similar data distribution, so that accuracy is not affected by heterogeneous data. Experimental results demonstrate significant advantages of the proposed method in improving the efficiency of collaborative perception and reducing communication cost.
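
The clustering and aggregation steps in this abstract can be sketched as follows. The abstract does not specify the similarity measure, the clustering rule, or the aggregation weights, so everything here is an assumption made for illustration: cosine similarity between logit vectors stands in for "inference similarity", a greedy threshold rule stands in for the clustering, and local sample counts stand in for the weights. The vehicle ids and numbers are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def cluster_by_similarity(logits, threshold):
    """Greedy grouping: a vehicle joins the first cluster whose representative
    (its first member) is at least `threshold` similar; otherwise it starts a new one."""
    clusters = []
    for vid, vec in logits.items():
        for members in clusters:
            if cosine(vec, logits[members[0]]) >= threshold:
                members.append(vid)
                break
        else:
            clusters.append([vid])
    return clusters


def weighted_aggregate(logits, weights, members):
    """Weighted average of the members' logit vectors."""
    total = sum(weights[m] for m in members)
    dim = len(logits[members[0]])
    return [sum(weights[m] * logits[m][i] for m in members) / total
            for i in range(dim)]


# Hypothetical logits produced by four vehicles on a shared reference input.
logits = {
    "v1": [2.0, 0.1, 0.1],
    "v2": [1.8, 0.2, 0.0],
    "v3": [0.1, 0.2, 2.5],
    "v4": [0.0, 0.1, 2.0],
}
# Hypothetical weights: number of local training samples per vehicle.
weights = {"v1": 100, "v2": 300, "v3": 50, "v4": 50}

clusters = cluster_by_similarity(logits, threshold=0.9)
aggregated = [weighted_aggregate(logits, weights, members) for members in clusters]
print(clusters)
print(aggregated)
```

Only logit vectors, not model parameters, are exchanged in this sketch, which mirrors the communication-saving idea stated in the abstract.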

On Enhanced e-Government Security - Network Forensics

  • Wei, Ren
    • Korean Society of Digital Policy: Conference Proceedings (한국디지털정책학회:학술대회논문집) / 2004.11a / pp.173-184 / 2004
  • E-government security is crucial to the development of e-government. Owing to the complexity and characteristics of e-government security, current viable security approaches focus on preventing network intrusion or misuse in advance and seldom address the acquisition of forensic data for investigation after a network attack or fraud. We discuss a method for addressing the problem of e-government security from a different point of view, namely network forensics approaches, based on the idea of active protection or defense, which can also improve emergency response and incident investigation capabilities for e-government security.

Environmental Survey Data Modeling Using K-means Clustering Techniques

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Journal of the Korean Data and Information Science Society / v.16 no.3 / pp.557-566 / 2005
  • Clustering is the process of grouping data into clusters so that objects within a cluster are highly similar to one another. In this paper, among several clustering techniques, we use k-means clustering, which is classified as a partitional clustering method. We analyze the 2002 Gyeongnam social indicator survey data using k-means clustering techniques to obtain environmental information. The outputs of the k-means clustering can be used for environmental preservation and environmental improvement.
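
As a minimal sketch of the partitional k-means method named in this abstract, the following implements Lloyd's algorithm in plain Python. The survey data are not available here, so the two-dimensional `points` and the choice of initial centroids are hypothetical, and the function name `kmeans` is illustrative rather than the authors' implementation.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical survey responses reduced to two numeric scores per respondent.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

The result depends on the initial centroids; in practice several random initializations are usually tried and the partition with the lowest within-cluster sum of squares is kept.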

DNA Marker Mining of BMS1167 Microsatellite Locus in Hanwoo Chromosome 17

  • Lee, Jea-Young;Lee, Yong-Won;Kwon, Jae-Chul
    • Journal of the Korean Data and Information Science Society / v.17 no.2 / pp.325-333 / 2006
  • We describe tests for detecting and locating quantitative trait loci (QTL) for traits in Hanwoo. LOD scores and a permutation test are described. From the results of a permutation test to detect QTL, we select major DNA markers of the BMS1167 microsatellite locus on Hanwoo chromosome 17 for further analysis. K-means clustering analysis applied to four traits and eight DNA markers in BMS1167 resulted in three cluster groups. We conclude that the major DNA markers of the BMS1167 microsatellite locus on Hanwoo chromosome 17 are the 100bp, 108bp and 110bp markers.
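
The idea behind the permutation test mentioned in this abstract can be sketched in a simplified form. The paper applies it to LOD scores for QTL detection; the sketch below swaps in a simpler statistic, the absolute difference in mean trait value between animals with and without a marker allele, and the trait values are hypothetical.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Shuffling breaks any real link between marker and trait.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical trait values for animals with and without a given marker allele.
with_marker = [20.0, 21.0, 22.0, 23.0, 24.0]
without_marker = [10.0, 11.0, 12.0, 13.0, 14.0]
p_value = permutation_test(with_marker, without_marker)
print(p_value)
```

A small p-value indicates that a difference as large as the observed one rarely arises when marker labels are assigned at random, which is the logic used to set significance thresholds in QTL scans.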

SOME RESULTS FOR THE EXTREMAL LENGTHS OF CURVE FAMILIES (II)

  • Chung, Bo-Hyun
    • Journal of applied mathematics & informatics / v.15 no.1_2 / pp.495-502 / 2004
  • We consider applications of extremal length to the boundary behavior of analytic functions and derive a theorem in connection with capacity. This theorem applies extremal length to analytic functions defined on a domain with a number of holes, and thus demonstrates the usefulness of the method of extremal length.

A surface area measurement technique for the human head and face (두상의 표면적 측정 방안)

  • 이근부
    • Proceedings of the Korean Operations and Management Science Society Conference / 1996.04a / pp.655-657 / 1996
  • In this study, methods and equipment for capturing detailed anthropometric data were developed. The new method, which uses Moiré interferometry and image processing techniques, is less expensive than conventional methods. We measured 36 subjects aged 18 to 28 years. The face area was calculated from contour information. Cluster analysis of these data enabled us to classify the subjects into four groups.
