• Title/Summary/Keyword: Optimal Cluster Size

Search Result 38, Processing Time 0.02 seconds

A design of GPU container co-execution framework measuring interference among applications (GPU 컨테이너 동시 실행에 따른 응용의 간섭 측정 프레임워크 설계)

  • Kim, Sejin;Kim, Yoonhee
    • KNOM Review
    • /
    • v.23 no.1
    • /
    • pp.43-50
    • /
    • 2020
  • As General Purpose Graphics Processing Unit (GPGPU) recently plays an essential role in high-performance computing, several cloud service providers offer GPU service. Most cluster orchestration platforms in a cloud environment using containers allocate the integer number of GPU to jobs and do not allow a node shared with other jobs. In this case, resource utilization of a GPU node might be low if a job does not intensively require either many cores or large size of memory in GPU. GPU virtualization brings opportunities to realize kernel concurrency and share resources. However, performance may vary depending on characteristics of applications running concurrently and interference among them due to resource contention on a node. This paper proposes GPU container co-execution framework with multiple server creation and execution based on Kubernetes, container orchestration platform for measuring interference which may be occurred by sharing GPU resources. Performance changes according to scheduling policies were investigated by executing several jobs on GPU. The result shows that optimal scheduling is not possible only considering GPU memory and computing resource usage. Interference caused by co-execution among applications is measured using the framework.

A Study on the Job Career Patterns of Korean IT Personnel

  • Lee, Kyoungnam
    • Journal of Information Technology Services
    • /
    • v.17 no.4
    • /
    • pp.37-52
    • /
    • 2018
  • With the increasing severity of the shortage of highly-skilled IT personnel, more and more attention is being paid to the professional career path and systematic career management of IT employees. Although some studies have been conducted to describe the current status of IT personnel, limited attempts have been made to analyze the career paths or career patterns of IT professionals from a longitudinal perspective. In this context, this study explored the job career patterns of IT professionals in Korea and examined their relationship with subjective and objective career success. To identify job career patterns over time, detailed information about jobs and positions were used and an optimal matching analysis (OMA) was conducted to calculate the dissimilarity matrix between employees' career sequences, while a cluster analysis was used to categorize the meaningful groups based on this dissimilarity data. This analysis revealed that career patterns among Korean IT personnel are more varied than previously thought. These career types have a significant relationship with individual profiles, such as age, education, industry and company size, and account for significant variations in the three main career success variables, i.e. quality of life, assessment of software quality, and wage level. It is expected that the findings of this study will contribute to refining the Korean career path so as to retain IT personnel, and raise the need to improve the low quality of life and poor SW work environment of IT personnel.

A Study on Scalability of Profiling Method Based on Hardware Performance Counter for Optimal Execution of Supercomputer (슈퍼컴퓨터 최적 실행 지원을 위한 하드웨어 성능 카운터 기반 프로파일링 기법의 확장성 연구)

  • Choi, Jieun;Park, Guenchul;Rho, Seungwoo;Park, Chan-Yeol
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.9 no.10
    • /
    • pp.221-230
    • /
    • 2020
  • Supercomputer that shares limited resources to multiple users needs a way to optimize the execution of application. For this, it is useful for system administrators to get prior information and hint about the applications to be executed. In most high-performance computing system operations, system administrators strive to increase system productivity by receiving information about execution duration and resource requirements from users when executing tasks. They are also using profiling techniques that generates the necessary information using statistics such as system usage to increase system utilization. In a previous study, we have proposed a scheduling optimization technique by developing a hardware performance counter-based profiling technique that enables characterization of applications without further understanding of the source code. In this paper, we constructed a profiling testbed cluster to support optimal execution of the supercomputer and experimented with the scalability of the profiling method to analyze application characteristics in the built cluster environment. Also, we experimented that the profiling method can be utilized in actual scheduling optimization with scalability even if the application class is reduced or the number of nodes for profiling is minimized. Even though the number of nodes used for profiling was reduced to 1/4, the execution time of the application increased by 1.08% compared to profiling using all nodes, and the scheduling optimization performance improved by up to 37% compared to sequential execution. In addition, profiling by reducing the size of the problem resulted in a quarter of the cost of collecting profiling data and a performance improvement of up to 35%.

A Clustering Technique to Minimize Energy Consumption of Sensor networks by using Enhanced Genetic Algorithm (진보된 유전자 알고리즘 이용하여 센서 네트워크의 에너지 소모를 최소화하는 클러스터링 기법)

  • Seo, Hyun-Sik;Oh, Se-Jin;Lee, Chae-Woo
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.46 no.2
    • /
    • pp.27-37
    • /
    • 2009
  • Sensor nodes forming a sensor network have limited energy capacity such as small batteries and when these nodes are placed in a specific field, it is important to research minimizing sensor nodes' energy consumption because of difficulty in supplying additional energy for the sensor nodes. Clustering has been in the limelight as one of efficient techniques to reduce sensor nodes' energy consumption in sensor networks. However, energy saving results can vary greatly depending on election of cluster heads, the number and size of clusters and the distance among the sensor nodes. /This research has an aim to find the optimal set of clusters which can reduce sensor nodes' energy consumption. We use a Genetic Algorithm(GA), a stochastic search technique used in computing, to find optimal solutions. GA performs searching through evolution processes to find optimal clusters in terms of energy efficiency. Our results show that GA is more efficient than LEACH which is a clustering algorithm without evolution processes. The two-dimensional GA (2D-GA) proposed in this research can perform more efficient gene evolution than one-dimensional GA(1D-GA)by giving unique location information to each node existing in chromosomes. As a result, the 2D-GA can find rapidly and effectively optimal clusters to maximize lifetime of the sensor networks.

A Study on Energy Conservative Hierarchical Clustering for Ad-hoc Network (애드-혹 네트워크에서의 에너지 보존적인 계층 클러스터링에 관한 연구)

  • Mun, Chang-Min;Lee, Kang-Whan
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.16 no.12
    • /
    • pp.2800-2807
    • /
    • 2012
  • An ad-hoc wireless network provides self-organizing data networking while they are routing of packets among themselves. Typically multi-hop and control packets overhead affects the change of route of transmission. There are numerous routing protocols have been developed for ad hoc wireless networks as the size of the network scale. Hence the scalable routing protocol would be needed for energy efficient various network routing environment conditions. The number of depth or layer of hierarchical clustering nodes are analyzed the different clustering structure with topology in this paper. To estimate the energy efficient number of cluster layer and energy dissipation are studied based on distributed homogeneous spatial Poisson process with context-awareness nodes condition. The simulation results show that CACHE-R could be conserved the energy of node under the setting the optimal layer given parameters.

Prognostic Evaluation of Categorical Platelet-based Indices Using Clustering Methods Based on the Monte Carlo Comparison for Hepatocellular Carcinoma

  • Guo, Pi;Shen, Shun-Li;Zhang, Qin;Zeng, Fang-Fang;Zhang, Wang-Jian;Hu, Xiao-Min;Zhang, Ding-Mei;Peng, Bao-Gang;Hao, Yuan-Tao
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.15 no.14
    • /
    • pp.5721-5727
    • /
    • 2014
  • Objectives: To evaluate the performance of clustering methods used in the prognostic assessment of categorical clinical data for hepatocellular carcinoma (HCC) patients in China, and establish a predictable prognostic nomogram for clinical decisions. Materials and Methods: A total of 332 newly diagnosed HCC patients treated with hepatic resection during 2006-2009 were enrolled. Patients were regularly followed up at outpatient clinics. Clustering methods including the Average linkage, k-modes, fuzzy k-modes, PAM, CLARA, protocluster, and ROCK were compared by Monte Carlo simulation, and the optimal method was applied to investigate the clustering pattern of the indices including platelet count, platelet/lymphocyte ratio (PLR) and serum aspartate aminotransferase activity/platelet count ratio index (APRI). Then the clustering variable, age group, tumor size, number of tumor and vascular invasion were studied in a multivariable Cox regression model. A prognostic nomogram was constructed for clinical decisions. Results: The ROCK was best in both the overlapping and non-overlapping cases performed to assess the prognostic value of platelet-based indices. Patients with categorical platelet-based indices significantly split across two clusters, and those with high values, had a high risk of HCC recurrence (hazard ratio [HR] 1.42, 95% CI 1.09-1.86; p<0.01). Tumor size, number of tumor and blood vessel invasion were also associated with high risk of HCC recurrence (all p< 0.01). The nomogram well predicted HCC patient survival at 3 and 5 years. Conclusions: A cluster of platelet-based indices combined with other clinical covariates could be used for prognosis evaluation in HCC.

Development of Market Growth Pattern Map Based on Growth Model and Self-organizing Map Algorithm: Focusing on ICT products (자기조직화 지도를 활용한 성장모형 기반의 시장 성장패턴 지도 구축: ICT제품을 중심으로)

  • Park, Do-Hyung;Chung, Jaekwon;Chung, Yeo Jin;Lee, Dongwon
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.4
    • /
    • pp.1-23
    • /
    • 2014
  • Market forecasting aims to estimate the sales volume of a product or service that is sold to consumers for a specific selling period. From the perspective of the enterprise, accurate market forecasting assists in determining the timing of new product introduction, product design, and establishing production plans and marketing strategies that enable a more efficient decision-making process. Moreover, accurate market forecasting enables governments to efficiently establish a national budget organization. This study aims to generate a market growth curve for ICT (information and communication technology) goods using past time series data; categorize products showing similar growth patterns; understand markets in the industry; and forecast the future outlook of such products. This study suggests the useful and meaningful process (or methodology) to identify the market growth pattern with quantitative growth model and data mining algorithm. The study employs the following methodology. At the first stage, past time series data are collected based on the target products or services of categorized industry. The data, such as the volume of sales and domestic consumption for a specific product or service, are collected from the relevant government ministry, the National Statistical Office, and other relevant government organizations. For collected data that may not be analyzed due to the lack of past data and the alteration of code names, data pre-processing work should be performed. At the second stage of this process, an optimal model for market forecasting should be selected. This model can be varied on the basis of the characteristics of each categorized industry. As this study is focused on the ICT industry, which has more frequent new technology appearances resulting in changes of the market structure, Logistic model, Gompertz model, and Bass model are selected. A hybrid model that combines different models can also be considered. The hybrid model considered for use in this study analyzes the size of the market potential through the Logistic and Gompertz models, and then the figures are used for the Bass model. The third stage of this process is to evaluate which model most accurately explains the data. In order to do this, the parameter should be estimated on the basis of the collected past time series data to generate the models' predictive value and calculate the root-mean squared error (RMSE). The model that shows the lowest average RMSE value for every product type is considered as the best model. At the fourth stage of this process, based on the estimated parameter value generated by the best model, a market growth pattern map is constructed with self-organizing map algorithm. A self-organizing map is learning with market pattern parameters for all products or services as input data, and the products or services are organized into an $N{\times}N$ map. The number of clusters increase from 2 to M, depending on the characteristics of the nodes on the map. The clusters are divided into zones, and the clusters with the ability to provide the most meaningful explanation are selected. Based on the final selection of clusters, the boundaries between the nodes are selected and, ultimately, the market growth pattern map is completed. The last step is to determine the final characteristics of the clusters as well as the market growth curve. The average of the market growth pattern parameters in the clusters is taken to be a representative figure. Using this figure, a growth curve is drawn for each cluster, and their characteristics are analyzed. Also, taking into consideration the product types in each cluster, their characteristics can be qualitatively generated. We expect that the process and system that this paper suggests can be used as a tool for forecasting demand in the ICT and other industries.

An Analysis of Big Video Data with Cloud Computing in Ubiquitous City (클라우드 컴퓨팅을 이용한 유시티 비디오 빅데이터 분석)

  • Lee, Hak Geon;Yun, Chang Ho;Park, Jong Won;Lee, Yong Woo
    • Journal of Internet Computing and Services
    • /
    • v.15 no.3
    • /
    • pp.45-52
    • /
    • 2014
  • The Ubiquitous-City (U-City) is a smart or intelligent city to satisfy human beings' desire to enjoy IT services with any device, anytime, anywhere. It is a future city model based on Internet of everything or things (IoE or IoT). It includes a lot of video cameras which are networked together. The networked video cameras support a lot of U-City services as one of the main input data together with sensors. They generate huge amount of video information, real big data for the U-City all the time. It is usually required that the U-City manipulates the big data in real-time. And it is not easy at all. Also, many times, it is required that the accumulated video data are analyzed to detect an event or find a figure among them. It requires a lot of computational power and usually takes a lot of time. Currently we can find researches which try to reduce the processing time of the big video data. Cloud computing can be a good solution to address this matter. There are many cloud computing methodologies which can be used to address the matter. MapReduce is an interesting and attractive methodology for it. It has many advantages and is getting popularity in many areas. Video cameras evolve day by day so that the resolution improves sharply. It leads to the exponential growth of the produced data by the networked video cameras. We are coping with real big data when we have to deal with video image data which are produced by the good quality video cameras. A video surveillance system was not useful until we find the cloud computing. But it is now being widely spread in U-Cities since we find some useful methodologies. Video data are unstructured data thus it is not easy to find a good research result of analyzing the data with MapReduce. This paper presents an analyzing system for the video surveillance system, which is a cloud-computing based video data management system. It is easy to deploy, flexible and reliable. It consists of the video manager, the video monitors, the storage for the video images, the storage client and streaming IN component. The "video monitor" for the video images consists of "video translater" and "protocol manager". The "storage" contains MapReduce analyzer. All components were designed according to the functional requirement of video surveillance system. The "streaming IN" component receives the video data from the networked video cameras and delivers them to the "storage client". It also manages the bottleneck of the network to smooth the data stream. The "storage client" receives the video data from the "streaming IN" component and stores them to the storage. It also helps other components to access the storage. The "video monitor" component transfers the video data by smoothly streaming and manages the protocol. The "video translator" sub-component enables users to manage the resolution, the codec and the frame rate of the video image. The "protocol" sub-component manages the Real Time Streaming Protocol (RTSP) and Real Time Messaging Protocol (RTMP). We use Hadoop Distributed File System(HDFS) for the storage of cloud computing. Hadoop stores the data in HDFS and provides the platform that can process data with simple MapReduce programming model. We suggest our own methodology to analyze the video images using MapReduce in this paper. That is, the workflow of video analysis is presented and detailed explanation is given in this paper. The performance evaluation was experiment and we found that our proposed system worked well. The performance evaluation results are presented in this paper with analysis. With our cluster system, we used compressed $1920{\times}1080(FHD)$ resolution video data, H.264 codec and HDFS as video storage. We measured the processing time according to the number of frame per mapper. Tracing the optimal splitting size of input data and the processing time according to the number of node, we found the linearity of the system performance.