Approximate methods based on particle swarm optimization and differential evolution to solve the workflow scheduling problem in cloud environment

We use both random and real world instances in the experiments. Random instances were taken from Cloud Sim library. Real world instances were taken from Vietnamese Cloud company and Cloud Amazon EC2. The instances are divided into groups based on: • The number of servers N • The number of tasks M • The ratio of the number of edges to the number of vertices of graph G (denoted by ) The experiment results of five compared algorithms: proposed algorithm MODE, PSO_H, Random, RRTSM, and EGA. For each instance, we compare the best position vector (gbest). It is notable that in most cases the best value of MODE are better than those of Random, RRTSM. Compared to the PSO_H algorithm, the best value of MODE are better than those of PSO_H from 2.1% to 12.5%, and better than those of EGA from 1.2% to 9.4%.

27 trang | Chia sẻ: tueminh09 | Lượt xem: 843 | Lượt tải: 1

Bạn đang xem trước 20 trang tài liệu Approximate methods based on particle swarm optimization and differential evolution to solve the workflow scheduling problem in cloud environment, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên

MINISTRY OF EDUCATION AND TRAINING MINISTRY OF NATIONAL DEFENCE ACADEMY OF MILITARY SCIENCE AND TECHNOLOGY PHAN THANH TOÀN APPROXIMATE METHODS BASED ON PARTICLE SWARM OPTIMIZATION AND DIFFERENTIAL EVOLUTION TO SOLVE THE WORKFLOW SCHEDULING PROBLEM IN CLOUD ENVIRONMENT Major : mathematical basis for computing Major code : 62 46 01 10 SUMMARY OF PHD. THESIS HA NOI - 2018 Research has been accomplished at: Academy of Military Science and Technology – Ministry of National Defence Supervisor: 1. Dr. Nguyễn Thế Lộc 2. Dr. Nguyễn Doãn Cường Reviewer 1: Assoc. Prof, Ph.D. Nguyễn Đức Nghĩa Hanoi University of science and technology Reviewer 2: Assoc. Prof, Ph.D. Lê Trọng Vĩnh University of Science Vietnam National University, Hanoi Reviewer 3: Assoc. Prof, Ph.D. Nguyễn Xuân Hoài Hanoi University Thesis will be defended in front of PhD thesis examination Committee at Academy of Military Science and Technology in hour on . , 2018 The thesis could be found at: - Library of Academy of Military Science and Technology, - Vietnamese National library. 1 INTRODUCTION Cloud computing environment has been defined to be a type of parallel and distributed system consisting of inter-connected and virtualized computers that allows users to utilize on-demand computation, storage, data and services from around the world. Cloud service providers charge users for these services. Via the support of virtualization technology, cloud platforms enable enterprises to lease computing power in the form of virtual machines to users. Because there are hundreds of thousands of Virtual Machines (VM) are used on the cloud environment, it is difficult to manually assign tasks to computing resources in clouds. Cloud computing offers everything as a service. Therefore, there are mainly three service which are Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS). In a Software as a Service (SaaS) model, a pre-made application, along with any required software, operating system, hardware, and network are provided. In PaaS, an operating system, hardware, and network are provided, and customers install or develop their own software and applications. The IaaS model provides just the hardware and network; customers install or develop their own operating systems, software and applications. Workflow scheduling is applied in many areas of science and life, such as scheduling resources in the operating system, distributed systems, scheduling in production line system. Scientists have used workflow data in many fields of science such as astronomy, earthquakes, biophysics, and physics. The characteristics of these applications are the need to handle a large number of tasks, the volume of data exchanged between tasks is also very large so these 2 applications are usually deployed on distributed computing systems, such as grid computing or cloud computing. Completion time and execution cost of workflow depend on many inputs such as: + Number of tasks in the workflow + Computing resources + Relationship between tasks in the workflow Many instances of scheduling problems have been proven to be NP-hard [2]. Therefore, finding the optimal solution for large-input data with exhausted algorithm takes a lot of time. Some traditional heuristic approaches such as min-min, max-min, ... often give poor quality solutions. Some other solutions, such as GA or PSO, suggested by researchers so far are not aimed at minimizing the makespan as in this thesis. Therefore, researching and proposing scheduling algorithms that find near-optimal solutions in short time will improve the efficiency of the cloud center. Organization of the thesis The thesis includes abstract, conclusion, references and three chapters: • Chapter 1: Introduction to the problem and some related studies • Chapter 2: Solving the CLOS problem based on Particle Swarm Optimization Strategy • Chapter 3: Solving the CLOS problem based on Differential Evolution Strategy. 3 Chapter 1 INTRODUCTION TO THE PROBLEM AND SOME RELATED STUDIES This chapter presents a model of workflow scheduling problem in cloud computing environment, classifying methods for solving scheduling problems and prove the proposed problem CLOS is NP- hard. Workflow Scheduling Model on The Cloud Environment We denote the workflow as a Directed Acyclic Graph (DAG) represented by G=(V, E), where: • V is set of vertex, each vertex represents a task • T={T1, T2,,TM } is the set of tasks, M is the number of tasks • E represents the data dependencies between these tasks. The edge (Ti, Tj)  E means the task Ti is the father of the task Tj, the data produced by Ti will be consumed by the task Tj. • The Cloud’s computation resource, set of servers S = {S1, S2,.,SN}. N is the number of servers. • Each task Ti can be executed by any server SjS, and Si have to handle whole the workload of Ti • The computation of task Ti denoted by Wi (flop-floating point operations) • P(Si) : the computation power of the server Si (MI/s : million instructions/second) • The bandwidth B(Si,Sj) between server Si and server Sj represents by the function B(): S×S → R+ . We assume that B(Si,Si) = ∞ and B(Si,Sj ) = B(Sj,Si) 4 • Dij: data produced by task Ti and consumed by task Tj. Each scheduling plan can be represented by the function f(): T→S where f(Ti) is the server which handle the task Ti Thanks to the above assumptions we have: • The execution time of the task Ti is   i i TfP W • The communication time between the task Ti and Tj is     ji ij TfTfB D , Formally, we need to minimize the execution time of the workflow: 𝑚𝑎𝑘𝑒𝑠𝑝𝑎𝑛(𝐹) = 𝑚𝑎𝑥 𝑇𝑖∈𝑇 {𝑡𝑓(𝑇𝑖)} − 𝑚𝑖𝑛 𝑇𝑖∈𝑇 {𝑡𝑠(𝑇𝑖)} where the execution time, called makespan, is the time difference between the start and finish of a sequence of workflow's tasks. Classify the CLOS problem through Graham's notation The CLOS problem can be represented according to Graham's notation as follows: Q|outtree, cij |Cmax The complexity of the CLOS problem Based on the SCHED problem, O. Sinnen has proven this problem to be in NP-hard class, the author demonstrates that the CLOS problem is also in NP-hard class by reduce the SCHED problem to CLOS problem in polynomial time. Related Work The methods for solving scheduling problems 5 Fig 1.12: A taxonomy of scheduling methods A taxonomy of static scheduling algorithms as in the figure 1.13 Workflow Scheduling Architecture Planning Scheme Decision Making Centralized Hierarchical Decentralize d Static Dynamic Local Global Heuristic algorithms Metaheuristic algorithms Static Algorithms List scheduling Cluster based scheduling Duplication based scheduling individual task scheduling Genetic algorithms Ant Colony Optimization PSO Simulated annealing Differential Evolution Fig 1.13: A taxonomy of static scheduling methods 6 Scheduling algorithms Heuristic algorithms In general, there are four classes of scheduling heuristics for workflow applications, namely individual task scheduling, list scheduling, cluster and duplication based scheduling. Some typical heuristic algorithms are Myopic, Min-min, Max-min, HEFT, TANH, Random, and RRTSM. Metaheuristic algorithms There have been many researches solving scheduling problems based on metaheuristic approaches such as EGA algorithms, GATSM algorithm, GAPSO, PSO_H, MPSO, ... Comparison of scheduling algorithms The heuristic based algorithms can produce a reasonable good solution in a polynomial time. Among the heuristic algorithms, individual task scheduling is simplest and only suitable for simple workflow structures such as a pipeline in which several tasks are required to be executed in sequential. Unlike individual task scheduling, list scheduling algorithms set the priorities of tasks in order to make an efficient schedule in the situation of many tasks compete for limited number of resources. The priority of the tasks determines their execution order. A drawback of the heuristic approach is that it cannot efficiently solve resource competition problem for a workflow consisting of many parallel tasks having the same length. Even though data transmission time has been considered in the list scheduling approach, it still may not provide an efficient schedule for data intensive workflow applications. 7 In compared to the heuristics based scheduling approaches, the advantage of the meta-heuristics based approaches is that producing an optimized scheduling solution based on the performance of entire workflow, rather than the partial of the workflow as considered by heuristics based approach. Thus, unlike heuristics based approach designed for a specified type of workflow application, it can produce good quality solutions for different types of workflow applications. Chapter 2 Solving the CLOS problem based on Particle Swarm Optimization Strategy This chapter is divided into two main sections: i) Proposed PSOi_H algorithm ii) Proposed LPSO_H algorithm 2.1. PSOi_H algorithm The proposed algorithm works in the Particle Swarm Optimization method, but the algorithm is improved in the following points: i) Modify the updating Particle’s Position Method ii) Propose Inverse procedure to escape from local extremum Particle Representation In the design of PSOi_H, at the iteration k particle i determined by the velocity vector vi k and position vector xi k . In the proposed scheduling algorithm, the solution is represented as a vector of length equal to the number of tasks. The value corresponding to each position i in the vector represents the server to which task i was executed. 8 Consider the workflow G=(V,E), the set of tasks T={T1, T2, T3, T4, T5}, set of servers S = {S1, S2, S3}. So the particle which has the position vector xi k = [1 ; 2 ; 1 ; 3 ; 2] give us the following scheduling plan: T1 T2 T3 T4 T5 S1 S2 S1 S3 S2 Updating Particle’s Position Method Because of the number of server (N) is finite integer number, so the elements of the position vector xi k[t] must be the integer number (t=1..M). In the equation (1.9.a), the value of the left hand side xi k+1 is integer number while the value of the right hand side (xi k + vi k) is the real number. In order to solve that issue, some related work round the real value of the right hand side, after that they assign the integer result to the left hand side. For example, if xi k[t] + vi k[t] = 3.2 then task Tt had been assign and execute by the server S3. In case of xi k[t] + vi k[t] = 3.8 then Tt had been execute by the server S4. Obviously, in the above rough way, the calculated task had been assigned to the random server which has the numerical order match to the value of the right hand side quite by chance. We proposes a novel method as the follows. The left hand side xi k+1 will be assigned by the identity of the server which it’s computation power closer to (xi k + vi k) than any other servers xi k+1[t]←j nếu │P(Sj) - (xi k[t] + vi k[t])│≤│P(Sr) - (xi k[t] + vi k[t])│ 9 SrS ; t =1,2 .. M Escape From Local Extremum In progressing PSO algorithms may be trapped in the area around the local extrema. We proposes a function named Inverse as the method used to move the swarm to a new area (see Figure 2.1) Fig 2.1: The particles migrate by the function Inverse Fig 2.2: Function Inverse 0 2 4 6 8 10 12 14 0 5 10 15 Xi k[2] Xi k[1] The original positions The new positions 3.1 4.1 5.2 Sπ(2) Sπ(1) Sπ(3) Sπ(1) Sπ(3) S1  S2 3.1 5.2 4.1 S2 S1 S3 10 Like other PSO algorithms, PSOi updates the position vectors of particles at each iteration by using formulas (1.9.a), (1.9.b). In case of the deviation of gbest is not larger than  during K continuous generations, the function Inverse will be used to move the swarm to the new area (a migration). In case of the gbest had not been significantly improved, i.e. the deviation of gbest is not larger than  after K continuous migrations by using function Inverse, PSOi will be stopped. Experimental results We use both random and real world instances in the experiments. Random instances were taken from Cloud Sim library. Real world instances were taken from Vietnamese Cloud company and Cloud Amazon EC2. The instances are divided into groups based on: • The number of servers N • The number of tasks M • The ratio of the number of edges to the number of vertices of graph G (denoted by ) The experiment results of five compared algorithms: proposed algorithm PSOi_H, PSO_H, Random, RRTSM, and EGA. For each instance, we compare the best position vector (gbest). It is notable that in most cases the best value of PSOi_H are better than those of Random, RRTSM. Compared to the PSO_H algorithm, the best value of PSOi_H are better than those of PSO_H from 2% to 7%, and better than those of EGA from 2.3% to 8%. In most case the mean value of PSOi_H are better than those of PSO_H from 1% to 7%, and better than those of EGA from 2% to 9%. 11 Especially, in some case the difference between the PSOi_H’s best plan and the global extremum found by exhausted search is negligible. Comparison the makespan of PSOi_H with other algorithms Fig 2.4: M=10, N=5 Fig 2.5: M=20, N=3 0 50 100 150 200 STD Mean Best 0 10 20 30 40 50 Random RRTSM PSO_H PSOi_H EGA STD Mean Best 12 2.2. LPSO_H algorithm The LPSO_H algorithm is a hybrid Particle Swarm Optimization Algorithm, which uses a combination of Particle Swarm Optimization methods and local search methods. The algorithm is improved in the following points: i) Modify the updating Particle’s Position Method ii) Combine the PSO algorithm with a neighborhood searching method to move particles to the new area Proposed Algorithm Escaping Local Extremum As PSO algorithms progress, they may get trapped in a local extremum. We here propose the following method for escaping such local extrema: When the swarm gets stuck in an area around a local extremum, we combine the PSOs having topological neighborhood with a neighborhood searching function to move particles to the new area. Variable Neighborhood Searching Function A neighborhood search algorithm typically starts from a solution (that is, an assignment of values to its decision variables) and moves from solutions to neighboring solutions in hope of improving a function f . The function f measures the quality of solutions to the problem at hand. The main operation of a neighborhood search algorithm amounts to moving from a solution s to one of its neighbors. The set of neighboring solutions of s , denoted by N(s) , is called the neighborhood of s . At a specific computation step, some of these neighbors may be legal, in which case they may be selected, or they 13 may be forbidden. Once the legal neighbors are identified (by operation L ), the local search selects one of them and decides whether to move to this neighbor or to stay at s (operation S ). In order to help the swarm escape from the area around the local extrema, we designed two operators named Exchange and RotateRight, as illustrated in the Figure 2.14, and built a Variable_Neighborhood_Searching function based on these operators. Fig 2.14: Operator RotateRight (a) and Operator Exchange (b) At each iteration, the LPSO updates the position vectors of particles based on gbest using formulas (1.9). If the deviation of gbest less than  during K continuous generations, this means that the swarm is trapped in a local extremum area, and hence the function Variable_Neighbourhood_Searching() should be called. This function moves (migrates) the swarm to a new area and produces a new generation. The LPSO algorithm can be described as follows: Algorithm LPSO ( ) Input: T, S, size of workload W[1×M], P[1×N], B[N×N], D[M×M], the deviation , the number of particle NoP, the constant K (b ) (a) 3 1 2 3 1 3 1 2 3 1 3 3 2 1 1 14 Output: the best position gbest 1. For i: = 1 to NoP do 2. xi:=random vectors;vi:=random vectors; 3. end for 4. t: = 0 ; 5. While (t ≤ number of iterations) Do 6. for i: = 1 to NoP do 7. Compute new position xi 8. end for 9. for i: = 1 to NoP do 10. Update pbesti; 11. end for 12. Update gbest; 13. for i: = 1 to NoP do 14. lbesti: = Ring(xi) ; 15. end for 16. for i: = 1 to NoP do 17. Update vik and compute xi; 19. end for 20. t++ ; 21. if (the deviation of gbest ≤  after K generations) then gbest:= Variable_Neighborhood_Searching (gbest); 23. End while; 24. Return gbest; At each iteration, the LPSO updates the position vectors of particles based on gbest using formulas (1.9.a) and (1.9.b). If the deviation of gbest less than  during K continuous generations, this means that the swarm is stuck in a local extremum and hence the function Variable_Neighbourhood_Searching() should be called. 15 This function moves (migrates) the swarm to a new area and produces a new generation. In the case of LPSO fall into a trap of the local extremum, LPSO will escape from the area around the extremum by calling the function Variable_Neighbourhood_Searching( ). Experimental results We use both random and real world instances in the experiments. Random instances were taken from Cloud Sim library. Real world instances were taken from Vietnamese Cloud company and Cloud Amazon EC2. The instances are divided into groups based on: • The number of servers N • The number of tasks M • The ratio of the number of edges to the number of vertices of graph G (denoted by ) The experiment results of five compared algorithms: proposed algorithm LPSO_H, PSO_H, Random, RRTSM, and EGA. For each instance, we compare the best position vector (gbest). It is notable that in most cases the best value of LPSO_H are better than those of Random, RRTSM. Compared to the PSO_H algorithm, the best value of LPSO_H are better than those of PSO_H from 2.1% to 11%, and better than those of EGA from 1.2% to 15.4%. In most case the mean value of LPSO_H are better than those of PSO_H from 2.7% to 9.8%, and better than those of EGA from 2.4% to 11%. Especially, in some case the difference between the LPSO_H’s best plan and the global extremum found by exhausted search is negligible. 16 Comparison the makespan of LPSO_H with other algorithms Fig 2.18: M=50, N=8 Fig 2.17 : M=20, N=3 0 20 40 60 80 100 120 140 160 180 200 Random RRTSM PSO_H LPSO_H EGA STD Mean Best 0 10 20 30 40 50 60 70 80 Random RRTSM PSO_H LPSO_H EGA STD Mean Best 17 Chapter 3 Solving the CLOS problem based on Particle Swarm Optimization Strategy This chapter is divided into two main sections: i) Opposition-Based Differential Evolution Algorithm ii) Propose a new algorithm for workflow scheduling on the cloud environment based on differential evolution method. Opposition-Based Differential Evolution Opposition-based Differential Evolution (ODE) is an evolutionary optimization technique that consists of two main steps: population initialization and producing new generations by genetic operations such as selection, crossover and mutation. ODE enhances these two steps by considering the opposite point of individuals in the population. Definition 1 (Opposite Number): Let x  [a,b] be a real number, then the opposite number �̅� is defined as �̅� = 𝑎 + 𝑏 − 𝑥 Definition 2 (Opposite Point in n-Dimensional Space): Similarly, the above definition can be extended for higher dimensions as follows: Let P(x1, x2,,xD) be an D-dimemsional vector, where xi  [ai, bi]; i=1,2,,D. The opposite of P is defined by: �̅� = (�̅�1, �̅�2, , �̅�𝐷) where 𝑥�̅� = 𝑎𝑖 + 𝑏𝑖 − 𝑥𝑖 The opposition-based optimization can be defined as follows: 18 Let P=(x1, x2,..,xD) be a point in D-dimensional space, and f(.) is the fitness function which is used to measure the candidate’s fitness. According to the definition of opposite point, �̅� = (�̅�1, �̅�2, , �̅�𝐷) is the opposite of P=(x1, x2,..,xD). Now if 𝑓(�̅�) ≤ 𝑓(𝑃), then point P can be replaced by 𝑃,̅ otherwise we continue with P. Proposed Algorithm MODE MODE is a new workflow scheduling algorithm in the cloud environment to minimize the makespan of workflow, which is derived from the Opposition-based Differential Evolution method, but the algorithm is improved in the following points: i) Define the method to calculate the opposite of individuals ii) Using the Rank-based Roulette Wheel Selection method to select individuals Method to calculate the opposite of individuals The design of ODE entails calculating the opposite of individuals in the population, which can be carried out as follows: Let a = Max{P(Si)}; i=1,2,..,N b = Min{P(Si)}; i=1,2,..,N Assuming that the particle xi = (Si(1), Si(2),,Si(M)); Si(j)  S, j=1,2,..,M; the opposite of xi, denoted by 𝑥�̅�, will be calculated as follows: 𝑥�̅� = (𝑆�̅�𝜋(1), 𝑆�̅�𝜋(2), , 𝑆�̅�𝜋(𝑀)) where: 𝑆�̅�𝜋(𝑗) = 𝑎 + 𝑏 − 𝑆𝑖𝜋(𝑗); ∀𝑗 = 1,2, . . , 𝑀 19 We subsequently assign the value corresponding to each position j of vector 𝑥�̅� by identifying the server which has a computation power closer to 𝑆�̅�𝜋(𝑗) than any other server 𝑥𝑖𝑗̅̅̅̅ ← 𝑘 𝑛ế𝑢 |𝑃(𝑆𝑘) − 𝑆�̅�𝜋(𝑗)| ≤ |𝑃(𝑆𝑟) − 𝑆𝑖𝜋(𝑗)| ∀𝑆𝑟 (3.4) Rank-based Roulette Wheel Selection Rank-based roulette wheel selection is the selection strategy where the probability of a particle being selected is based on its fitness rank relative to the entire population. Rank-based selection schemes first sort individuals in the population according to their fitness and then computes the selection probabilities according to their ranks rather than the fitness values. Rank-based selection uses a function to map the indices of individuals in the sorted list to their selection probabilities. The rank for an individual may be scaled linearly using the following formula: 𝑟𝑎𝑛𝑘(𝑝𝑜𝑠) = 2 − 𝑆𝑃 + (2 × (𝑆𝑃 − 1) × 𝑝𝑜𝑠 − 1 𝑃𝑜𝑝𝑆𝑖𝑧𝑒 − 1 ) (3.5) where 1.0  SP  2.0 The MODE algorithm can be described as follows: Algorithm MODE ( ) Input: T, S, size of workload W[1×M], P[1×N], B[N×N], D[M×M], the number of particle NoP Output: the best position gbest 1. Call procedure: Opposition-Based Population Initialization 2. while(criteria is not satisfied)do 20 3. for i=1 to PopSize do 4. selecting p1 from population by RBRWS algorithm 5. selecting p2 from population by RBRWS algorithm 6. F  Random(1,0) 7. vi  pbest + F(p1 – p2) 8. assign the identity of the server to each position j of vector vi by equation (7) 9. randi,j  Random(0,1) 10. Irand  random(1,M) 11. 𝑢𝑖,𝑗 = { 𝑣𝑖,𝑗 𝑛ế𝑢 𝑟𝑎𝑛𝑑𝑖,𝑗 ≤ 𝐶𝑅 ℎ𝑜ặ𝑐 𝑖 = 𝐼𝑟𝑎𝑛𝑑 𝑥𝑖,𝑗 𝑛ế𝑢 𝑟𝑎𝑛𝑑𝑖,𝑗 ≥ 𝐶𝑅 ℎ𝑜ặ𝑐 𝑖 ≠ 𝐼𝑟𝑎𝑛𝑑 12. if (makespan(ui) < makespan(xi)) 13. xi  ui 14. if(rand < Jr) 15. Calculating Opposite Population OP by OP_Algorithm 16. Fitness Evaluation 17. Selecting PopSize Fittest Individuals from {Current Population, OP} 18. End while Return gbest; In the Initialization step, we randomly generate a population of PopSize individuals, and calculate the opposite of individuals in the population, then select the Popsize fittest individuals from current population and their counterpart opposites. In each iteration, MODE selects two individuals from population according to the Rank-based Roulette Wheel Selection method. 21 According to the mutation operator, a mutant vector is generated by adding the weighted difference between two selected vector of the target population individuals to a third one as follows : vi  pbest + F(p1 – p2) where p1, p2 are two individuals selected by rank-based roulette wheel selection method, and pbest is the best individual in the population. Rank-based Roulette Wheel Selection method helps prevent premature convergence due to “super” individuals, because the best individual is always assigned the same selection probability, regardless of its objective value. After the mutation phase, the crossover operator is applied to obtain the trail vector ui using the following equation : 𝑢𝑖,𝑗 = { 𝑣𝑖,𝑗 𝑖𝑓 𝑟𝑎𝑛𝑑𝑖,𝑗 ≤ 𝐶𝑅 𝑜𝑟 𝑖 = 𝐼𝑟𝑎𝑛𝑑 𝑥𝑖,𝑗 𝑖𝑓 𝑟𝑎𝑛𝑑𝑖,𝑗 ≥ 𝐶𝑅 𝑜𝑟 𝑖 ≠ 𝐼𝑟𝑎𝑛𝑑 Where randi,j is the jth independent random number uniformly distributed in the range of [0, 1]. Also Irand is a random number in the range of [1, M]. CR is a user defined crossover factor in the range [0, 1]. Now, we need to decide whether the trail vector ui should be a member of the population of the next generation, it is compared with the target individual xi. Finally, the selection is based on the survival of the fitness as follows : 𝑥𝑖 = { 𝑢𝑖, 𝑖𝑓 𝑚𝑎𝑘𝑒𝑠𝑝𝑎𝑛(𝑢𝑖) < 𝑚𝑎𝑘𝑒𝑠𝑝𝑎𝑛(𝑥𝑖) 𝑥𝑖, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒 } 22 After the crossover phase, we calculate the opposite of individuals in the population. The fitness of individuals is evaluated, then the PopSize fittest individuals are selected for the next generation. Experimental results We use both random and real world instances in the experiments. Random instances were taken from Cloud Sim library. Real world instances were taken from Vietnamese Cloud company and Cloud Amazon EC2. The instances are divided into groups based on: • The number of servers N • The number of tasks M • The ratio of the number of edges to the number of vertices of graph G (denoted by ) The experiment results of five compared algorithms: proposed algorithm MODE, PSO_H, Random, RRTSM, and EGA. For each instance, we compare the best position vector (gbest). It is notable that in most cases the best value of MODE are better than those of Random, RRTSM. Compared to the PSO_H algorithm, the best value of MODE are better than those of PSO_H from 2.1% to 12.5%, and better than those of EGA from 1.2% to 9.4%. In most case the mean value of MODE are better than those of PSO_H from 2% to 9%, and better than those of EGA from 1.1% to 11%. Especially, in some case the difference between the MODE’s best plan and the global extremum found by exhausted search is negligible. Comparison the makespan of LPSO_H with other algorithms 23 Fig 3.4: M=10, N=5 Fig 3.6: M=20, N=3 Comparison of algorithms PSOi_H, LPSO-H and MODE Based on experiments, we suggest using the algorithms as following: With the complex workflow, there are many data distribution tasks and data aggregation tasks, and there is a huge amount of data to transfer between tasks. In this case MODE algorithm has the best solution. 0 5 10 15 20 25 30 STD Mean Best 0 100 200 300 400 500 STD Mean Best 24 With the simple workflow, such as process structure, or pipeline structure the PSOi_H algorithm is a good choice. With the complex workflow, and there is a little amount of data to transfer between tasks. In this case we should use the LPSO_H algorithm. CONCLUTION AND FUTURE WORK CONCLUTION The thesis presented the workflow scheduling problem on the cloud computing environment CLOS, this is the first time the CLOS problem has been formal express. The thesis presented an overview of Graham's classification for scheduling problem and gave Graham's classification for the CLOS problem. Based on the SCHED problem, the thesis also demonstrated that the CLOS problem belong to NP-hard. In chapter 2, the thesis proposed two new algorithms for solving the CLOS problem based on Particle Swarm Optimization. In chapter 3, the thesis proposed one new algorithm for solving the CLOS problem based on Differential Evolution. FUTURE WORK Apply the parallel algorithms, Branch and bound algorithms to find the optimal solution for large and medium-sized datasets, in order to test the quality of the proposed algorithms. Another research, we apply the resource demand predicting methods in order to schedule better LIST OF SCIENTIFIC PUBLICATIONS [1]. Phan Thanh Toàn, Kiều Tuấn Dũng, Nguyễn Thế Lộc, Nguyễn Doãn Cường, “Sắp xếp lịch biểu thực thi luồng công việc tại đám mây điện toán”, Hội thảo quốc gia lần thứ XVI, một số vấn đề chọn lọc của công nghệ thông tin và truyền thông (@ 2013), pp. 285-290, 2013. [2]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường, Đỗ Như Long, “Thuật toán lập lịch luồng công việc theo phương pháp tối ưu bày đàn trong môi trường điện toán đám mây”, tạp chí Nghiên cứu khoa học và Công nghệ quân sự, pp. 132-138, 2015. [3]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường, Đỗ Như Long, “Giải thuật tối thiểu hóa chi phí thực thi luồng công việc trong môi trường điện toán đám mây”, Tạp chí khoa học trường đại học Sư Phạm Hà Nội, pp. 47-55, 2015. [4]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường , “Thuật toán lập lịch luồng công việc trong môi trường điện toán đám mây”, Proc. Of the 8th National Conference on Foundamental and Applied Information Techonoly Research (FAIR’8), pp. 687-693, 2015. [5]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường, “Thuật toán lập lịch luồng công việc trong môi trường điện toán đám mây dựa trên chiến lược tối ưu bày đàn”, tạp chí Công nghệ thông tin và Truyền thông, pp. 15-20, 2015. [6]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường, “A Novel Workflow Scheduling Algorithm in Cloud Environment”, Proc. Of 2015 2nd National Foundation for Science and Technology Development Conference on Information and Computer Science (NICS’2015), pp. 125-129, 2015. [7]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường, Trần Đăng Hưng, “MODE: Hướng tiếp cận mới cho việc thực thi luồng công việc”, tạp chí khoa học Công nghệ thông tin và Truyền thông, tập 1, số 1, pp. 63-70, 2016. [8]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường , “Thuật toán LPSO lập lịch cho các ứng dụng khoa học trong môi trường điện toán đám mây”, tạp chí Khoa học và Công nghệ, Viện hàn lâm khoa học Việt Nam, pp. 287-299, 2016. [9]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường, “A Robus and Effective MODE Algorithm for Workflow Scheduling in Cloud Enveronment”, Proc. Of The 7th International Symposium on Information and Communication Technology (SoICT’2016), pp. 259-264, 2016. [10]. Toan Phan Thanh, Loc Nguyen The, Said Elnaffar and Cuong Nguyen Doan, "LPSO: Another Algorithm for Workflow Scheduling in the Cloud", Journal of Computer Science, ISSN: 1549-3636, DOI: 10.3844/jcssp.2016, pp.611-617, Volume 12, Issue 12, 2016.

Các file đính kèm theo tài liệu này:

approximate_methods_based_on_particle_swarm_optimization_and.pdf
TomTat LuanAn NCS PhanThanhToan_TiengViet.pdf