A Workflow Scheduling Technique Using Genetic Algorithm in Spot Instance-Based Cloud

Jung, Daeyong;Suh, Taeweon;Yu, Heonchang;Gil, JoonMin;

doi:10.3837/tiis.2014.09.010

KSII Transactions on Internet and Information Systems (TIIS)

Volume 8 Issue 9
/
Pages.3126-3145
/
2014
/
1976-7277(pISSN)
/
1976-7277(eISSN)

Korean Society for Internet Information (한국인터넷정보학회)

DOI QR Code

A Workflow Scheduling Technique Using Genetic Algorithm in Spot Instance-Based Cloud

Jung, Daeyong (Dept. of Computer Science Education, Korea University) ;
Suh, Taeweon (Dept. of Computer Science Education, Korea University) ;
Yu, Heonchang (Dept. of Computer Science Education, Korea University) ;
Gil, JoonMin (School of Information Technology Engineering, Catholic University of Daegu)

Received : 2013.12.02
Accepted : 2014.08.19
Published : 2014.09.30

https://doi.org/10.3837/tiis.2014.09.010 Citation PDF KSCI KPUBS HTML

Download PDF

⟨ Previous Next ⟩

Abstract

Cloud computing is a computing paradigm in which users can rent computing resources from service providers according to their requirements. A spot instance in cloud computing helps a user to obtain resources at a lower cost. However, a crucial weakness of spot instances is that the resources can be unreliable anytime due to the fluctuation of instance prices, resulting in increasing the failure time of users' job. In this paper, we propose a Genetic Algorithm (GA)-based workflow scheduling scheme that can find the optimal task size of each instance in a spot instance-based cloud computing environment without increasing users' budgets. Our scheme reduces total task execution time even if an out-of-bid situation occurs in an instance. The simulation results, based on a before-and-after GA comparison, reveal that our scheme achieves performance improvements in terms of reducing the task execution time on average by 7.06%. Additionally, the cost in our scheme is similar to that when GA is not applied. Therefore, our scheme can achieve better performance than the existing scheme, by optimizing the task size allocated to each available instance throughout the evolutionary process of GA.

Keywords

1. Introduction

In recent years, due to the increased interest in cloud computing, many cloud projects and commercial systems such as the Amazon Elastic Compute Cloud (EC2) [1], and FlexiScale [2] have been implemented. Cloud computing provides high utilization and high flexibility for managing computing resources. In addition, cloud computing services provide a high level of scalability of computing resources combined with Internet technology that are distributed among several customers [3, 4, 5]. In most cloud services, the concept of an instance unit is used to provide users with resources in a cost-efficient manner. Generally, instances are classified into three types: on-demand instances, reserved instances, and spot instances. On-demand instances allow a user to pay for computing capacity by the hour, with no long-term commitments. This frees users from the costs and complexities of planning, purchasing, and maintaining hardware, and transforms what are usually large fixed costs into much smaller variable costs [1]. Reserved instances allow a user to make a low, one-time payment to reserve instance capacity and further reduce the user’s cost. While in reserved instances a user pays a yearly fee and receives a discount on hourly rates, in on-demand instances a user pays one hourly rate [6]. On the other hand, spot instances allow customers to bid on unused Amazon EC2 capacity and run those instances for as long as their bid exceeds the current spot price. Based on the supply and demand of spot instances, the spot price is changed. Customers whose bid exceeds the spot price can gain access to the available spot instances. If the applications executed are time-flexible, spot instances can significantly decrease the Amazon EC2 costs [7]. Therefore, spot instances may incur lower costs while performing tasks than on-demand instances.

Spot market-based cloud environment configures the spot instance. The environment affects the successful completion or failure of tasks depending on the changing spot prices. Because spot prices have a market structure and follow the law of demand and supply, cloud services in Amazon EC2 can provide a spot instance when a user’s bid is higher than the current spot price. Furthermore, a running instance stops when a user’s bid becomes less than or equal to the current spot price. After a running instance stops, it restarts when a user’s bid becomes greater than the current spot price [8].

Scientific applications, in particular, make the current common of workflow. However, spot instance-based cloud computing has variable performance, because the available execution time of spot instances depends on the spot price. The completion time for the same amount of a task varies according to the performance of an instance. In other words, the failure time of each instance differs according to the user’s bid and the performance in an instance. Thereby, we infer that the completion time of a task in an instance increases when a failure occurs. For efficient execution of tasks, we analyze the task and instance information from the price history data of spot instances, and estimate the task size and instance availability from the analyzed data. A workflow is created based on each available instance and the task size. However, the created workflow has a problem in that it does not consider the failure time of each instance. To solve this problem, we propose a scheme to change the task size of each instance using a Genetic Algorithm (GA). Our proposed scheduling scheme uses workflow mechanisms and GA to handle job execution. In our scheme, a user’s job is executed within selected instances and the user’s budget is stretched.

2. Related Works

Two different environments in cloud computing have been considered in recent research studies: reliable environments (with on-demand instances [9, 10]) and unreliable environments (with spot instances [8, 11, 12, 12]). Our study falls into the latter category of cloud computing environments.

Typically, studies on spot instances have mainly focused on performing tasks at low monetary costs. The spot instances in the Amazon EC2 offer cloud services to users at lower prices at the expense of reliability [1]. Cloud exchange [14] provides the actual price history of EC2 spot instances. There has been numerous studies on resource allocation [12, 13], fault tolerance [8, 11, 12, 15], workflow scheduling [16, 17], and the use of Genetic Algorithms [18, 19, 20] for spot instances.

In the field of resource allocation, Voorsluys et al. [12] provided a solution for running computation-intensive tasks in a pool of intermittent VMs. To mitigate potential unavailability periods, the study proposed a multifaceted fault-aware resource provisioning policy. Voorsluys et al’s solution employed price and runtime estimation mechanisms. The proposed strategy achieved cost savings and stricter adherence to deadlines. Zhang et al. [13] demonstrated how best to match customer demand in terms of both supply and price, and how to maximize provider revenue and customer satisfaction in terms of VM scheduling. Their model was designed to solve the problem of discrete-time optimal control. It achieved higher revenues than static allocation strategies and minimized the average request waiting time. Our work differs from [12] and [13] in that we focus on reducing the rollback time after a task failure, achieving cost savings and reducing the total execution time.

In the fault tolerance category, two similar studies ([11] and [8]) proposed enforcing fault tolerance in cloud computing with spot instances. Based on the actual price history of EC2 spot instances, these studies compared several adaptive checkpointing schemes in terms of monetary costs and job execution time. Goiri et al. [10] evaluated three fault tolerance schemes - checkpointing, migration, and job duplication - assuming fixed communication costs. The migration-based scheme showed a better performance than the checkpointing or the job duplication-based scheme. Voorsluys et al. [12] also analyzed and evaluated the impact of checkpointing and migration on fault tolerance using spot instances. Our paper differs from [8, 10, 11, 12] in that we utilize two thresholds for fault tolerance.

A workflow is a model that represents complex problems with structures such as Directed Acyclic Graphs (DAG). Workflow scheduling is a kind of global task scheduling as it focuses on mapping and managing the execution of interdependent tasks on shared resources. The existing workflow scheduling methods have limited scalability and are based on centralized scheduling algorithms. Consequently, these methods are not suitable for spot instance-based cloud computing. In spot instances, the available time and instance costs have to be considered for job execution. Fully decentralized workflow scheduling systems use a chemistry-inspired model to determine the instance in a community cloud platform [16]. The throughput maximization strategy is designed for transaction-intensive workflow scheduling that does not support multiple workflows [17]. Our proposed scheduling scheme guarantees an equal task distribution across available instances in spot instance-based cloud computing.

A GA is a popular search technique that uses the concept of evolution in the natural world to solve optimization problems [18, 19, 20]. It has been used for cloud scheduling in various studies [21, 22, 23]. The synapsing variable-length crossover (SVLC) algorithm provides a biologically inspired method for performing meaningful crossover between variable-length genomes [18]. However, traditional GAs [19, 20] operate on a population of fixed-length genomes. In additional, those have the problem to relate a set of potential solutions. Our GA utilizes the crossover to adopt the variable-length. In [21], the scheduling of VM resources in a load balanced manner was based on GA. In a private cloud environment, a Modified Genetic algorithm (MGA) was utilized for task scheduling with a combination of Shortest Cloudlet to Fastest Processor (SCFP) and Longest Cloudlet to Fastest Processor (LFCP) methods [22]. In this study, authors also used a meta-heuristic GA as an optimization method. Fatma et al. [23] adapted task scheduling to use two fitness functions. The first fitness function is concerned with minimizing the total execution time, and the second is concerned with the load balance. However, the existing studies did not consider the failure time of instances that makes task execution stop. Our GA-based scheduling method considers the failure time of instances, the task execution time, and task execution costs.

3. System Architecture

Our proposed scheme expands on our previous work [24] and includes a workflow scheduling algorithm. We make workflow tasks based on available instances and propose how to handle tasks. Each available instance operates workflow tasks and uses checkpointing scheme in Section 3.3. Fig. 1 presents the mapping relationship between workflows and instances, and illustrates the roles of the instance information manager, the workflow manager, and the resource scheduler. The instance information manager obtains the information required for job allocation and resource management. This information includes the VM specifications for each instance and execution related information such as the execution costs, execution completion time, and failure time. The execution-related information is calculated by using the selected VM and is based on the spot history. The workflow manager and resource scheduler extract the necessary execution-related information from the instance information manager. First, the workflow manager generates the workflow for the request job. The generated workflow determines the task size according to the VM performance, the execution time and costs, and the failure time in the selected instance. Second, the resource scheduler manages the resource and allocates the task to handle the job. Resource and task managements perform reallocation when the resource cannot get the mapping-related information for the task, or when the task has a fault during execution.

Fig. 1.Mapping between workflows and instances

In the above model, our proposed scheme uses the workflow in spot instances to minimize job processing time within the user’s budget. The task size is determined by evaluating the availability and performance of each instance in order to minimize the job processing time. The available time is estimated by calculating the execution time and cost using the price history of spot instances. This helps to improve the performance and stability of task processing. The estimated time data is used in assigning the amount of task to each instance. Our proposed scheme reduces out-of-bid situations and improves the job execution time. However, the total cost is higher than the cost when workflow scheduling is not used.

3.1 Instance types

An instance means a VM rented by a cloud user. Instances are classified into two types: on-demand instances and spot instances. In on-demand instances, users can use VM resources after paying a fixed cost on an hourly basis. On the other hand, when using spot instances, users can use VM resources only when the price of the instances is lower than their bid. The difference between the two instance types is as follows. In on-demand instances, a failure does not occur during task execution, but the cost is comparatively high. In contrast, the cost of spot instances is lower than that of on-demand instances. However, in the case of spot instances, there is a risk of task failures when the price of the instance becomes higher than the user’s bid.

Amazon allows users to bid on unused EC2 capacity that includes 42 types of spot instances [1]. Their prices, which are referred to as spot prices, change dynamically based on the supply and demand. Fig. 2 shows an example of spot price fluctuation during seven days in December 2010 for c1-xlarge (High-CPU Spot Instances - Extra Large) [14]. Our proposed system model is based on the characteristics of Amazon EC2's spot instances, which are as follows:

Fig. 2.Price history of EC2's spot instances for c1-xlarge

3.2 Estimated Interval

Our checkpointing operation is performed by analyzing the price variations during selected time intervals in the past. We estimate the job execution time and cost from the analyzed data. These estimations are combined with the failure probability in order to calculate the thresholds for the checkpointing operation. The proper estimation of the execution time and cost is crucial for maintaining the credibility of service providers with customers.

In this paper, we introduce a terminology referred to as Estimated Interval (EI). Fig. 3 shows an illustrative definition of the EI. The detailed definition is as follows:

Fig. 3.EI illustration

3.3 Our Proposed Checkpointing Scheme

Fig. 4 illustrates our proposed checkpointing scheme [24]. This scheme performs checkpointing using two kinds of thresholds - price threshold and time threshold - depending on the expected execution time based on the price history. Let tstart and tend denote a start point and an end point in the expected execution time, respectively. Based on tstart and tend , we obtain the price threshold (PriceTh ) and the time threshold (TimeThPi ), which are used as thresholds in our proposed checkpoint scheme. The price threshold, PriceTh , can be calculated as

Fig. 4.Our proposed checkpointing scheme

where Userbid represents the bid suggested by the user. Pmin is the available minimum price that the function PriceMin extracts in the period between tstart and tend . This is given as follows:

The time threshold of the price Pi , TimeThPi , can be calculated as

where FPi is the failure probability of price Pi , and AvgTimePi (tstart , tend ) represents the average execution time of Pi , in the period between tstart and tend .

4. Genetic Algorithm Based Job Scheduling

The main goal of our scheduling method is to minimize the execution time and the cost of running applications. Our scheduling method is based on the genetic algorithm (GA) that is a well-known and robust search technique for solving large-scale optimization problems. The GA consists of the following five steps:

4.1 Initial Population

Initially, all chromosomes in a population are randomly constructed without any knowledge of experts. Before describing the representation of a chromosome, let us consider the definitions of both instances and jobs that each chromosome consists.

To avoid the complexity of task allocations to instances, we assume two constraints as follows: one is that all tasks have an identical size. The other is that there is no sequence for task execution on an instance. Under this assumption, we construct the structure of two-dimensional chromosomes. Fig. 5 illustrates the two-dimensional chromosome represented as a grid with N rows and M columns. In this figure, a gene g(Ii, Jj) means the i-th row and the j-th column (1 ≤ i ≤ N , 1 ≤ j ≤ M ) in the chromosome and is interpreted as follows.

Fig. 5.Chromosome representation

where k and l respectively represent the values of the adjacent two genes, g(Ii-1, Jj) and g(Ii, Jj) (1 ≤ k ≤ l ≤ Lj ).

To meet the requirements for task execution, n instances satisfying the condition in Eq. (5) are chosen when allocating tasks to initial instances. Our task distribution method determines the task size in order to allocate a task to a selected instance. Based on a compute-unit and an available state, the task size of an instance Ii for Jj () is calculated as

where represents the total size of tasks in Jj required for executing a user request. In an instance Ii , UIi and AIi represent the compute-unit (the product of CPU and cores) and the available state, respectively. The available state AIi can be either 0 (unavailable) or 1 (available). The Ubaseline represents the compute-unit of the selected instance to request the user. The task size is decided by the compute-unit rate based on the baseline.

Each gene g(Ii, Jj) shows the number of task sequences according to the compute-unit of the selected instance and the variation of task size. The task sequences are related to the task size. Eq. (6) represents the number of task sequences in gene g(Ii, Jj).

The task size of each instance in the chromosome adjusts based on Eq. (7). The variation of task size is randomly set by Eq. (8).

When an initial population is constructed, all genes in the chromosome are randomly set to an integer value adopting Eq. (6).

4.2 Fitness Function

The fitness function defines the criterion for ranking potential hypotheses and for probabilistically selecting them for inclusion in the next generation of the population. Additionally, the fitness function is the criterion for determining the quality of chromosomes in the population. It directly reflects the performance of task distribution. In our paper, the standard deviation is used as the fitness function to evaluate the performance of chromosomes. Based on the standard deviation for each chromosome Ck , our fitness function f(σCk) is defined as follows.

In Eq. (9), EETIi is the total estimated execution time of the instance Ii . AvgCk is the average of total estimated execution time and can be calculated as

where, P represents population size.

4.3 Selection

The fittest chromosomes have higher probability to be selected for the next generation. We use two selection mechanisms. One is the elitism method that forces the GA to retain some number of the best chromosomes at each generation. The other is a roulette wheel method. Given the population size P, P - 1 chromosomes in a new population are made by the roulette wheel method. The elitism method is applied to produce the remaining one chromosome, which is the best of chromosomes in population pool.

4.4 Genetic Operations (Crossover, Mutation, and Reproduction)

The successive generation in GA is determined by a set of operators that recombine and mutate selected members of the current population. The crossover operator produces two new offsprings from two parents, leading to improving the fitness of chromosomes in the current population. We adapt two-point crossover method to the spot instance environment. In the typical two-point crossover, a pair of chromosomes is selected according to the crossover probability pc, and two crossover points are randomly selected. Fig. 6 illustrates an example of two-point crossover operation in our genetic algorithm.

Fig. 6.Example of two-point crossover

The mutation operation changes the value of genes in the chromosome according to the mutation probability pm. Our mutation operation is depicted in Fig. 7. The number of mutation points is randomly selected by row. After selecting the number of rows, the number of column mutations is randomly selected in each row. After the mutation operation is applied, the gene g'(Ii, Jj) is calculated by

Fig. 7.Example of mutation

where g(Ii, Jj) is a previous gene in the chromosome Ck. r1 and r2 represent the variation to the gene g(Ii, Jj). There are two cases. In the first case (i.e. r1 ), if the EETIi of the instance Ii is given less than the average of EET of available instances, the gene g(Ii, Jj) is changed into the new gene g'(Ii, Jj) greater than the current value. The r1 is extracted from the value ranging between g(Ii, Jj) and g(Ii+1, Jj). In the second case (i.e. r2), if the EETIi of the instance Ii is greater than the average of EET of available instances, the gene g(Ii, Jj) is changed into the new gene value less than the current value. The r2 is extracted from the value ranging between g(Ii-1, Jj) and g(Ii, Jj).

The above-mentioned two cases are applied to only instances between I1 and IN-1. The genes of last instance (g(IN, Jj)) can not be changed, because they have the final task number in a job; i.e, the mutation operation is not applied to these genes.

The reproduction operation is used to obtain the modifed chromosome for the next generation. The reproducted chromosomes selects chromosomes to consider the low ranking of the fitness according to the reproduction probability pr. Each chromosome calculates the ranking of the fitness based on Eq. (9). The selected chromosome deletes current chromosome and than reproduces new chromosome. The selected chromosome adjusts the task size based on the elitism chromosome of the selection phase. The task size of each instance is determined according to the standard deviation of each instance in the elitism chromosome.

4.5 GA-based Workflow Scheduling (GAWS) Algorithm

Our GA-based Workflow Scheduling (GAWS) algorithm is described in Fig. 8. The Task_flag represents the occurrence of a task execution, and its initial value is false. When the task execution is normal (i.e., Task_flag is true), the scheduler performs a workflow operation (lines 3-27). The GA_flag represents the execution of GA, and its initial value is true; Lines 17-24 are initial population, fitness function, and genetic operations of GA, respectively. Lines 28-35 show the workflow function.

Fig. 8.GAWS algorithm

5. Performance Evaluation

In this section, we evaluate and analyze the performance of the proposed scheme through simulations. The performance comparision for the scheme is also presented.

5.1 Simulation Environment

Before performing our simulations, we describe spot-instance environments and the simulation parameters used in our GA. Our simulations were conducted using the history data obtained from Amazon EC2 spot instances [14]. The history data before 10-01-2010 was used to extract the expected execution time and failure occurrence probability for our checkpointing scheme. The applicability of our scheme was tested using the history data after 10-01-2010.

In the simulations, one type of spot instance was applied to demonstrate the effect of task execution time analysis on the performance. Table 1 shows the various resource types used in Amazon EC2. In this table, resource types are comprised of different instance types. First, standard instances offer a basic resource type. Second, high-CPU instances offer more compute units than other resources, and can be used for computation-intensive applications. Finally, high-memory instances offer more memory capacity than other resources, and can be used for high-throughput applications such as databases and memory caching applications. In the simulation environments, we compared the performance of our proposed scheme with that of the existing schemes without considering task distribution based on the task execution time.

Table 1.Resource type information

Table 1 shows the information related to resource types in each instance, and Table 2 shows the parameters and values for our simulation. Table 3 shows the parameters and values for our genetic algorithm. The spot price information was extracted from the spot history data from 11-30-2009 to 01-23-2011. The user’s bid was taken as the average of the spot prices from the spot history data. Using Eq. (5), the task size was decided by the compute-unit rate based on the baseline m1.xlarge.

Table 2.Parameters and values for simulation

Table 3.Parameters and values for GA

5.2 Effect of GA Analysis

In this section, we show the size variations of requested task and the fitness variation according to GA. Fig. 9 shows the size variations of requested tasks in each instance before and after using GA. In this figure, “B” and “A” denote the before and after GA, respectively, and each instance type (m1.small, m1.large, etc.) indicates the task size allocated to each instance among the requested tasks of users. As shown in Fig. 9, when GA is applied, the task size is varied for each instance. This variation directly relates to the failure time of each instance; i.e, it is because more tasks are allocated to instances with low failure time by the evolutionary process of GA.

Fig. 9.Size variations in requested tasks

Fig. 10 shows the fitness curve of the best chromosome at each generation when GA is applied in case that the task size is 259,200. In this figure, we can observe that the value of fitness becomes higher as the generation increases. Therefore, we can see that our genetic algorithm-based scheme can sufficiently find the optimum solution.

Fig. 10.Fitness curve

Fig. 11 shows the performance comparison between with and without crossover operation. In this figure, “Without” and “With” denote without and with the crossover operation, respectively. Except for the crossover operation, the remaining operations such as initial population, fitness functions, mutation, and reproduction, are the same as in the previous simulation. As we can see in this figure, the total execution time is reduced by an average of 3.08% with crossover operation compared to without crossover operation. We can see from the result of this figure that crossover operation can improve more performance rather than without crossover operation.

Fig. 11.Comparison with and without crossover operation

5.3 Effect of the Estimated Execution Analysis

In this simulation, we examined the performance of the estimated execution analysis according to the task size before and after GA. Figs. 12, 13, 14, and 15 show the total estimated execution time, estimated cost, estimated failure time, estimated rollback time, respectively. TotalT is the sum of the estimated execution time in each instance, task distribution time, and task merge time. TotalC denotes the sum of estimated costs for task execution in each instance. Fig. 12 shows the total estimated execution time according to the task size before and after GA, respectively. After GA was applied, the total estimated execution time was reduced by an average of 33.59%. The total estimated execution times before and after GA were different because the instance failures occurred at different times. Fig. 13 shows the estimated failure times before and after GA, respectively. The estimated failure time of all instances was reduced by an average of 15.45% after applying GA as compared to when GA was not applied. In Fig. 14, the estimated rollback time after GA showed an average performance improvement of 19.61% when compared to the rollback time before GA. The rollback time is calculated from a failure point to the last checkpoint time. Fig. 15 shows the estimated cost before and after applying GA, respectively. The cost after GA was decreased by an average of $0.12 as compared to the cost before GA. The difference between costs before and after applying GA is a little due to handling the equal task size. From the above results, we observe that the estimated execution times and failure times after applying GA were reduced as compared to before GA. Moreover, the costs were similar. The actual execution times and costs were compared based on above information. The improved results is had the reallocated task size according to GA. Because, our proposed GA is obtained relatively low the standard deviation about the estimated execution time than the standard deviation without GA.

Fig. 12.Total estimated execution time

Fig. 13.Estimated failure time

Fig. 14.Estimated rollback time

Fig. 15.Estimated costs

5.4 Effect of the Execution Analysis

In this simulation, we examined the performance of the execution analysis according to the task size before and after GA. Figs. 16, 17, 18, and 19 show the execution results of the actual data based on the estimated data, before and after GA. In the figures, TotalT denotes the total time taken for the distribution and merging of tasks. TotalC denotes the sum of costs of task execution in each instance. Fig. 16 shows that the total execution time is reduced by an average of 7.06% after GA as compared to before GA. Fig. 17 shows that the total costs after GA decreased the average by $0.17 when compared to the cost before GA. Fig. 18 shows that the failure time after GA was increased on average by 13.93% as compared to before GA. In Fig. 19, the rollback time after GA showed an average performance improvement of 6.52% when compared to the rollback time before GA. Different fluctuation of spot price contributes to difference between estimation and actual performances.

Fig. 16.Total execution time in task distribution

Fig. 17.Costs in task distribution

Fig. 18.Failure time in task distribution

Fig. 19.Rollback time in task distribution

6. Conclusion

In this paper, we proposed a GA-based workflow scheduling technique for task distribution in unreliable cloud computing environments. In our environment, the resources can be unreliable anytime due to the fluctuation of instance prices, resulting in increasing the failure time of users’ job. In order to solve the problem, we proposed GA-based workflow scheduling technique. The scheme proposed in this study reduced the failure time and the rollback time. The rollback time in our scheme was less than that of the existing scheme (without GA) because our scheme adaptively performs task distribution according to the estimated execution time of available instances. The simulation results showed that the execution time in our scheme was improved on average by 7.06% after GA as compared to before GA. Additionally, the failure time after applying GA was reduced on average by 6.52% as compared to before GA. Therefore, our scheduling method achieved minimizing the execution time and the cost of running applications. In future, we plan to expand our environment with an efficient GA operation that takes into consideration the current state of available instances.

References

Elastic Compute Cloud (EC2), http://aws.amazon.com/ec2, 2013.
F.L. Ferraris, D. Franceschelli, M.P. Gioiosa, D. Lucia, D. Ardagna, E. Di Nitto, and T. Sharif, "Evaluating the Auto Scaling Performance of Flexiscale and Amazon EC2 Clouds," in Proc. of Proceedings of 14th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), pp. 423-429, 2012.
H.N. Van, F.D. Tran, and J.M. Menaud, "SLA-Aware Virtual Resource Management for Cloud Infrastructures," in Proc. of Proceedings of the 2009 Ninth IEEE International Conference on Computer and Information Technology, vol. 2, pp. 357-362. IEEE Computer Society, 2009.
M. Komal, M. Ansuyia, and D. Deepak, "Round Robin with Server Affinity: A VM Load Balancing Algorithm for Cloud Based Infrastructure," Journal of Information Processing Systems, vol. 9, no. 3, pp. 379-394, 2013. https://doi.org/10.3745/JIPS.2013.9.3.379
Hasan Sabbir and Eui-Nam Huh, "Heuristic based Energy-aware Resource Allocation by Dynamic Consolidation of Virtual Machines in Cloud Data Center," KSII Transactions on Internet & Information Systems, vol. 7, Issue 8, pp. 1825-1842, 2013. https://doi.org/10.3837/tiis.2013.08.005
Siqi Shen, Kefeng Deng, Alexandru Iosup, and Dick Epema, "Scheduling jobs in the cloud using on-demand and reserved instances," in Proc. of Proceedings of the 19th international conference on Parallel Processing (Euro-Par'13), pp. 242-254, 2013.
Amazon EC2 spot Instances, http://aws.amazon.com/ec2/spot-instances/, 2013.
S. Yi, D. Kondo, and A. Andrzejak, "Reducing Costs of Spot Instances via Checkpointing in the Amazon Elastic Compute Cloud," in Proc. of Proceedings of the 2010 IEEE 3rd International Conference on Cloud Computing, pp. 236-243. IEEE Computer Society, 2010.
G. Singer, I. Livenson, M. Dumas, S. N. Srirama, and U. Norbisrath, "Towards a model for cloud computing cost estimation with reserved resources," in Proc. of Proceedings. of 2nd ICST International Conference on CloudComp 2010, Barcelona, Spain. Springer, October 2010.
M. Mazzucco and M. Dumas, "Reserved or On-Demand Instances? A Revenue Maximization Model for Cloud Providers," in Proc. of Proceedings of the 4th IEEE International CLOUD 2011, pp. 428-435, July 2011.
S. Yi, J. Heo, Y. Cho, and J. Hong, "Taking point decision mechanism for page-level incremental checkpointing based on cost analysis of process execution time," Journal of Information Science and Engineering, vol. 23, no. 5, pp. 1325-1337, 2007.
William Voorsluys and Rajkumar Buyya., "Reliable Provisioning of Spot Instances for Compute-intensive Applications," in Proc. of IEEE 26th International Conference on Advanced Information Networking and Applications, 2012.
Qi Zhang, Eren Gurses, Raouf Boutaba, and Jin Xiao., "Dynamic resource allocation for spot markets in clouds," in Proc. of the 11th USENIX conference Hot-ICE'11, pp. 1-6, 2011.
Cloud exchange, http://cloudexchange.org, 2013.
Goiri, F. Julia, J. Guitart, and J. Torres., "Checkpoint-based Fault-tolerant Infrastructure for Virtualized Service Providers," 12th IEEE/IFIP NOMS'10, pp. 455-462, April 2010.
H. Fernandez, M. Obrovac, and C. Tedeschi, "Decentralised Multiple Workflow Scheduling via a Chemically-coordinated Shared Space," INRIA Research Report, RR-7925, pp. 1-14, 2012.
K. Liu, J. Chen, Y. Yang, and H. Jin, "A throughput maximization strategy for scheduling transaction-intensive workflows on SwinDeW-G," Concurrency and Computation: Practice and Experience, vol. 20, issue 15, pp. 1807-1820, 2008. https://doi.org/10.1002/cpe.1316
B. Hutt and K. Warwick, "Synapsing Variable-Length Crossover: Meaningful Crossover for Variable-Length Genomes," IEEE Transactions on Evolutionary Computation, vol. 11, issue 1, pp. 118-131, 2007. https://doi.org/10.1109/TEVC.2006.878096
John H. Holland, "Adaptation in natural and artificial systems: An introductory analysis with applications to biology, control, and artificial intelligence," U. Michigan Press, 1975.
Fullmer, Brad, and Risto Miikkulainen, "Using marker-based genetic encoding of neural networks to evolve finite-state behavior," in Proc. of Toward a Practice of Autonomous Systems: Proceedings of the First European Conference on Artificial Life, pp. 252-262, 1992.
J. Gu, J. Hu, Tianhai Zhao, and Guofei Sun, "A new resource scheduling strategy based on genetic algorithm in cloud computing environment," Journal of Computers, vol. 7, no. 1, pp. 42-52, 2012.
S. Kaur and A. Verma, "An Efficient Approach to Genetic Algorithm for Task Scheduling in Cloud Computing Environment," International Journal of Information Technology and Computer Science (IJITCS), vol. 4, no.10, pp. 74-79, 2012.
Fatma A. Omara and Mona M. Arafa, "Genetic algorithms for task scheduling problem," Journal of Parallel and Distributed Computing (JPDC), vol. 70, issue 7, pp. 758-766, 2010. https://doi.org/10.1016/j.jpdc.2010.03.011
D. Jung, S. Chin, K. Chung, H. Yu, and J. Gil, "An Efficient Checkpointing Scheme Using Price History of Spot Instances in Cloud Computing Environment," in Proc. of Proceeding of NPC2011, pp. 185-200, 2011.

Cited by

Deadline Constrained Adaptive Multilevel Scheduling System in Cloud Environment vol.9, pp.4, 2014, https://doi.org/10.3837/tiis.2015.04.002
A Multi-objective Optimization Approach to Workflow Scheduling in Clouds Considering Fault Recovery vol.10, pp.3, 2014, https://doi.org/10.3837/tiis.2016.03.002
Cost Optimization Control of Logistics Service Supply Chain Based on Cloud Genetic Algorithm vol.102, pp.4, 2014, https://doi.org/10.1007/s11277-018-5335-z

KSII Transactions on Internet and Information Systems (TIIS)

A Workflow Scheduling Technique Using Genetic Algorithm in Spot Instance-Based Cloud

Abstract

Keywords

1. Introduction

2. Related Works

3. System Architecture

3.1 Instance types

3.2 Estimated Interval

3.3 Our Proposed Checkpointing Scheme

4. Genetic Algorithm Based Job Scheduling

4.1 Initial Population

4.2 Fitness Function

4.3 Selection

4.4 Genetic Operations (Crossover, Mutation, and Reproduction)

4.5 GA-based Workflow Scheduling (GAWS) Algorithm

5. Performance Evaluation

5.1 Simulation Environment

5.2 Effect of GA Analysis

5.3 Effect of the Estimated Execution Analysis

5.4 Effect of the Execution Analysis

6. Conclusion

References

Cited by

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)