We use both random and real world instances in the experiments.
Random instances were taken from Cloud Sim library. Real world
instances were taken from Vietnamese Cloud company and Cloud
Amazon EC2. The instances are divided into groups based on:
• The number of servers N
• The number of tasks M
• The ratio of the number of edges to the number of vertices of
graph G (denoted by )
The experiment results of five compared algorithms: proposed
algorithm MODE, PSO_H, Random, RRTSM, and EGA. For each
instance, we compare the best position vector (gbest). It is notable that
in most cases the best value of MODE are better than those of
Random, RRTSM. Compared to the PSO_H algorithm, the best value
of MODE are better than those of PSO_H from 2.1% to 12.5%, and
better than those of EGA from 1.2% to 9.4%.
27 trang |
Chia sẻ: tueminh09 | Lượt xem: 666 | Lượt tải: 1
Bạn đang xem trước 20 trang tài liệu Approximate methods based on particle swarm optimization and differential evolution to solve the workflow scheduling problem in cloud environment, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên
MINISTRY OF EDUCATION AND TRAINING MINISTRY OF NATIONAL DEFENCE
ACADEMY OF MILITARY SCIENCE AND TECHNOLOGY
PHAN THANH TOÀN
APPROXIMATE METHODS BASED ON PARTICLE SWARM
OPTIMIZATION AND DIFFERENTIAL EVOLUTION TO
SOLVE THE WORKFLOW SCHEDULING PROBLEM
IN CLOUD ENVIRONMENT
Major : mathematical basis for computing
Major code : 62 46 01 10
SUMMARY OF PHD. THESIS
HA NOI - 2018
Research has been accomplished at: Academy of Military
Science and Technology – Ministry of National Defence
Supervisor:
1. Dr. Nguyễn Thế Lộc
2. Dr. Nguyễn Doãn Cường
Reviewer 1: Assoc. Prof, Ph.D. Nguyễn Đức Nghĩa
Hanoi University of science and technology
Reviewer 2: Assoc. Prof, Ph.D. Lê Trọng Vĩnh
University of Science
Vietnam National University, Hanoi
Reviewer 3: Assoc. Prof, Ph.D. Nguyễn Xuân Hoài
Hanoi University
Thesis will be defended in front of PhD thesis examination
Committee at Academy of Military Science and Technology in
hour on . , 2018
The thesis could be found at:
- Library of Academy of Military Science and Technology,
- Vietnamese National library.
1
INTRODUCTION
Cloud computing environment has been defined to be a type of
parallel and distributed system consisting of inter-connected and
virtualized computers that allows users to utilize on-demand
computation, storage, data and services from around the world. Cloud
service providers charge users for these services. Via the support of
virtualization technology, cloud platforms enable enterprises to lease
computing power in the form of virtual machines to users. Because
there are hundreds of thousands of Virtual Machines (VM) are used
on the cloud environment, it is difficult to manually assign tasks to
computing resources in clouds.
Cloud computing offers everything as a service. Therefore, there
are mainly three service which are Software as a Service (SaaS),
Platform as a Service (PaaS) and Infrastructure as a Service (IaaS). In
a Software as a Service (SaaS) model, a pre-made application, along
with any required software, operating system, hardware, and network
are provided. In PaaS, an operating system, hardware, and network are
provided, and customers install or develop their own software and
applications. The IaaS model provides just the hardware and network;
customers install or develop their own operating systems, software and
applications.
Workflow scheduling is applied in many areas of science and life,
such as scheduling resources in the operating system, distributed
systems, scheduling in production line system. Scientists have used
workflow data in many fields of science such as astronomy,
earthquakes, biophysics, and physics. The characteristics of these
applications are the need to handle a large number of tasks, the volume
of data exchanged between tasks is also very large so these
2
applications are usually deployed on distributed computing systems,
such as grid computing or cloud computing. Completion time and
execution cost of workflow depend on many inputs such as:
+ Number of tasks in the workflow
+ Computing resources
+ Relationship between tasks in the workflow
Many instances of scheduling problems have been proven to be
NP-hard [2]. Therefore, finding the optimal solution for large-input
data with exhausted algorithm takes a lot of time. Some traditional
heuristic approaches such as min-min, max-min, ... often give poor
quality solutions. Some other solutions, such as GA or PSO, suggested
by researchers so far are not aimed at minimizing the makespan as in
this thesis. Therefore, researching and proposing scheduling
algorithms that find near-optimal solutions in short time will improve
the efficiency of the cloud center.
Organization of the thesis
The thesis includes abstract, conclusion, references and three
chapters:
• Chapter 1: Introduction to the problem and some related studies
• Chapter 2: Solving the CLOS problem based on Particle Swarm
Optimization Strategy
• Chapter 3: Solving the CLOS problem based on Differential
Evolution Strategy.
3
Chapter 1
INTRODUCTION TO THE PROBLEM AND SOME
RELATED STUDIES
This chapter presents a model of workflow scheduling problem in
cloud computing environment, classifying methods for solving
scheduling problems and prove the proposed problem CLOS is NP-
hard.
Workflow Scheduling Model on The Cloud Environment
We denote the workflow as a Directed Acyclic Graph (DAG)
represented by G=(V, E), where:
• V is set of vertex, each vertex represents a task
• T={T1, T2,,TM } is the set of tasks, M is the number of tasks
• E represents the data dependencies between these tasks. The edge
(Ti, Tj) E means the task Ti is the father of the task Tj, the data
produced by Ti will be consumed by the task Tj.
• The Cloud’s computation resource, set of servers S = {S1,
S2,.,SN}. N is the number of servers.
• Each task Ti can be executed by any server SjS, and Si have to
handle whole the workload of Ti
• The computation of task Ti denoted by Wi (flop-floating point
operations)
• P(Si) : the computation power of the server Si (MI/s : million
instructions/second)
• The bandwidth B(Si,Sj) between server Si and server Sj represents
by the function B(): S×S → R+ . We assume that B(Si,Si) = ∞ and
B(Si,Sj ) = B(Sj,Si)
4
• Dij: data produced by task Ti and consumed by task Tj.
Each scheduling plan can be represented by the function f(): T→S
where f(Ti) is the server which handle the task Ti
Thanks to the above assumptions we have:
• The execution time of the task Ti is
i
i
TfP
W
• The communication time between the task Ti and Tj is
ji
ij
TfTfB
D
,
Formally, we need to minimize the execution time of the workflow:
𝑚𝑎𝑘𝑒𝑠𝑝𝑎𝑛(𝐹) = 𝑚𝑎𝑥
𝑇𝑖∈𝑇
{𝑡𝑓(𝑇𝑖)} − 𝑚𝑖𝑛
𝑇𝑖∈𝑇
{𝑡𝑠(𝑇𝑖)}
where the execution time, called makespan, is the time difference
between the start and finish of a sequence of workflow's tasks.
Classify the CLOS problem through Graham's notation
The CLOS problem can be represented according to Graham's
notation as follows: Q|outtree, cij |Cmax
The complexity of the CLOS problem
Based on the SCHED problem, O. Sinnen has proven this problem
to be in NP-hard class, the author demonstrates that the CLOS problem
is also in NP-hard class by reduce the SCHED problem to CLOS
problem in polynomial time.
Related Work
The methods for solving scheduling problems
5
Fig 1.12: A taxonomy of scheduling methods
A taxonomy of static scheduling algorithms as in the figure 1.13
Workflow
Scheduling
Architecture Planning
Scheme
Decision
Making
Centralized
Hierarchical
Decentralize
d
Static
Dynamic
Local
Global
Heuristic algorithms Metaheuristic algorithms
Static Algorithms
List scheduling
Cluster based scheduling
Duplication based scheduling
individual task scheduling
Genetic algorithms
Ant Colony Optimization
PSO
Simulated annealing
Differential Evolution
Fig 1.13: A taxonomy of static scheduling methods
6
Scheduling algorithms
Heuristic algorithms
In general, there are four classes of scheduling heuristics for
workflow applications, namely individual task scheduling, list
scheduling, cluster and duplication based scheduling. Some typical
heuristic algorithms are Myopic, Min-min, Max-min, HEFT, TANH,
Random, and RRTSM.
Metaheuristic algorithms
There have been many researches solving scheduling problems
based on metaheuristic approaches such as EGA algorithms, GATSM
algorithm, GAPSO, PSO_H, MPSO, ...
Comparison of scheduling algorithms
The heuristic based algorithms can produce a reasonable good
solution in a polynomial time. Among the heuristic algorithms,
individual task scheduling is simplest and only suitable for simple
workflow structures such as a pipeline in which several tasks are
required to be executed in sequential. Unlike individual task
scheduling, list scheduling algorithms set the priorities of tasks in
order to make an efficient schedule in the situation of many tasks
compete for limited number of resources. The priority of the tasks
determines their execution order. A drawback of the heuristic
approach is that it cannot efficiently solve resource competition
problem for a workflow consisting of many parallel tasks having the
same length. Even though data transmission time has been considered
in the list scheduling approach, it still may not provide an efficient
schedule for data intensive workflow applications.
7
In compared to the heuristics based scheduling approaches, the
advantage of the meta-heuristics based approaches is that producing
an optimized scheduling solution based on the performance of entire
workflow, rather than the partial of the workflow as considered by
heuristics based approach. Thus, unlike heuristics based approach
designed for a specified type of workflow application, it can produce
good quality solutions for different types of workflow applications.
Chapter 2
Solving the CLOS problem based on Particle Swarm
Optimization Strategy
This chapter is divided into two main sections:
i) Proposed PSOi_H algorithm
ii) Proposed LPSO_H algorithm
2.1. PSOi_H algorithm
The proposed algorithm works in the Particle Swarm Optimization
method, but the algorithm is improved in the following points:
i) Modify the updating Particle’s Position Method
ii) Propose Inverse procedure to escape from local extremum
Particle Representation
In the design of PSOi_H, at the iteration k particle i determined by
the velocity vector vi
k and position vector xi
k . In the proposed
scheduling algorithm, the solution is represented as a vector of length
equal to the number of tasks. The value corresponding to each position
i in the vector represents the server to which task i was executed.
8
Consider the workflow G=(V,E), the set of tasks T={T1, T2, T3, T4, T5},
set of servers S = {S1, S2, S3}. So the particle which has the position
vector xi
k = [1 ; 2 ; 1 ; 3 ; 2] give us the following scheduling plan:
T1 T2 T3 T4 T5
S1 S2 S1 S3 S2
Updating Particle’s Position Method
Because of the number of server (N) is finite integer number, so the
elements of the position vector xi
k[t] must be the integer number
(t=1..M). In the equation (1.9.a), the value of the left hand side xi
k+1 is
integer number while the value of the right hand side (xi
k + vi
k) is the
real number. In order to solve that issue, some related work round the
real value of the right hand side, after that they assign the integer result
to the left hand side.
For example, if xi
k[t] + vi
k[t] = 3.2 then task Tt had been assign and
execute by the server S3. In case of xi
k[t] + vi
k[t] = 3.8 then Tt had been
execute by the server S4.
Obviously, in the above rough way, the calculated task had been
assigned to the random server which has the numerical order match to
the value of the right hand side quite by chance.
We proposes a novel method as the follows. The left hand side xi
k+1
will be assigned by the identity of the server which it’s computation
power closer to (xi
k + vi
k) than any other servers
xi
k+1[t]←j nếu │P(Sj) - (xi
k[t] + vi
k[t])│≤│P(Sr) - (xi
k[t] + vi
k[t])│
9
SrS ; t =1,2 .. M
Escape From Local Extremum
In progressing PSO algorithms may be trapped in the area around
the local extrema. We proposes a function named Inverse as the
method used to move the swarm to a new area (see Figure 2.1)
Fig 2.1: The particles migrate by the function Inverse
Fig 2.2: Function Inverse
0
2
4
6
8
10
12
14
0 5 10 15
Xi
k[2]
Xi
k[1]
The original positions
The new positions
3.1 4.1 5.2
Sπ(2) Sπ(1) Sπ(3)
Sπ(1) Sπ(3)
S1 S2
3.1 5.2 4.1
S2 S1 S3
10
Like other PSO algorithms, PSOi updates the position vectors of
particles at each iteration by using formulas (1.9.a), (1.9.b). In case of
the deviation of gbest is not larger than during K continuous
generations, the function Inverse will be used to move the swarm to
the new area (a migration). In case of the gbest had not been
significantly improved, i.e. the deviation of gbest is not larger than
after K continuous migrations by using function Inverse, PSOi will be
stopped.
Experimental results
We use both random and real world instances in the experiments.
Random instances were taken from Cloud Sim library. Real world
instances were taken from Vietnamese Cloud company and Cloud
Amazon EC2. The instances are divided into groups based on:
• The number of servers N
• The number of tasks M
• The ratio of the number of edges to the number of vertices of
graph G (denoted by )
The experiment results of five compared algorithms: proposed
algorithm PSOi_H, PSO_H, Random, RRTSM, and EGA. For each
instance, we compare the best position vector (gbest). It is notable that
in most cases the best value of PSOi_H are better than those of
Random, RRTSM. Compared to the PSO_H algorithm, the best value
of PSOi_H are better than those of PSO_H from 2% to 7%, and better
than those of EGA from 2.3% to 8%.
In most case the mean value of PSOi_H are better than those of
PSO_H from 1% to 7%, and better than those of EGA from 2% to 9%.
11
Especially, in some case the difference between the PSOi_H’s best
plan and the global extremum found by exhausted search is negligible.
Comparison the makespan of PSOi_H with other algorithms
Fig 2.4: M=10, N=5
Fig 2.5: M=20, N=3
0
50
100
150
200
STD
Mean
Best
0
10
20
30
40
50
Random RRTSM PSO_H PSOi_H EGA
STD
Mean
Best
12
2.2. LPSO_H algorithm
The LPSO_H algorithm is a hybrid Particle Swarm Optimization
Algorithm, which uses a combination of Particle Swarm Optimization
methods and local search methods. The algorithm is improved in the
following points:
i) Modify the updating Particle’s Position Method
ii) Combine the PSO algorithm with a neighborhood searching
method to move particles to the new area
Proposed Algorithm
Escaping Local Extremum
As PSO algorithms progress, they may get trapped in a local
extremum. We here propose the following method for escaping such
local extrema: When the swarm gets stuck in an area around a local
extremum, we combine the PSOs having topological neighborhood
with a neighborhood searching function to move particles to the new
area.
Variable Neighborhood Searching Function
A neighborhood search algorithm typically starts from a solution
(that is, an assignment of values to its decision variables) and moves
from solutions to neighboring solutions in hope of improving a
function f . The function f measures the quality of solutions to the
problem at hand. The main operation of a neighborhood search
algorithm amounts to moving from a solution s to one of its neighbors.
The set of neighboring solutions of s , denoted by N(s) , is called the
neighborhood of s . At a specific computation step, some of these
neighbors may be legal, in which case they may be selected, or they
13
may be forbidden. Once the legal neighbors are identified (by
operation L ), the local search selects one of them and decides whether
to move to this neighbor or to stay at s (operation S ).
In order to help the swarm escape from the area around the local
extrema, we designed two operators named Exchange and
RotateRight, as illustrated in the Figure 2.14, and built a
Variable_Neighborhood_Searching function based on these operators.
Fig 2.14: Operator RotateRight (a) and Operator Exchange (b)
At each iteration, the LPSO updates the position vectors of
particles based on gbest using formulas (1.9). If the deviation of gbest
less than during K continuous generations, this means that the swarm
is trapped in a local extremum area, and hence the function
Variable_Neighbourhood_Searching() should be called. This function
moves (migrates) the swarm to a new area and produces a new
generation.
The LPSO algorithm can be described as follows:
Algorithm LPSO ( )
Input: T, S, size of workload W[1×M],
P[1×N], B[N×N], D[M×M], the deviation ,
the number of particle NoP, the constant K
(b
)
(a)
3 1 2 3 1
3 1 2 3 1
3 3 2 1 1
14
Output: the best position gbest
1. For i: = 1 to NoP do
2. xi:=random vectors;vi:=random vectors;
3. end for
4. t: = 0 ;
5. While (t ≤ number of iterations) Do
6. for i: = 1 to NoP do
7. Compute new position xi
8. end for
9. for i: = 1 to NoP do
10. Update pbesti;
11. end for
12. Update gbest;
13. for i: = 1 to NoP do
14. lbesti: = Ring(xi) ;
15. end for
16. for i: = 1 to NoP do
17. Update vik and compute xi;
19. end for
20. t++ ;
21. if (the deviation of gbest ≤ after
K generations) then gbest:=
Variable_Neighborhood_Searching (gbest);
23. End while;
24. Return gbest;
At each iteration, the LPSO updates the position vectors of
particles based on gbest using formulas (1.9.a) and (1.9.b). If the
deviation of gbest less than during K continuous generations, this
means that the swarm is stuck in a local extremum and hence the
function Variable_Neighbourhood_Searching() should be called.
15
This function moves (migrates) the swarm to a new area and
produces a new generation.
In the case of LPSO fall into a trap of the local extremum, LPSO
will escape from the area around the extremum by calling the function
Variable_Neighbourhood_Searching( ).
Experimental results
We use both random and real world instances in the experiments.
Random instances were taken from Cloud Sim library. Real world
instances were taken from Vietnamese Cloud company and Cloud
Amazon EC2. The instances are divided into groups based on:
• The number of servers N
• The number of tasks M
• The ratio of the number of edges to the number of vertices of
graph G (denoted by )
The experiment results of five compared algorithms: proposed
algorithm LPSO_H, PSO_H, Random, RRTSM, and EGA. For each
instance, we compare the best position vector (gbest). It is notable that
in most cases the best value of LPSO_H are better than those of
Random, RRTSM. Compared to the PSO_H algorithm, the best value
of LPSO_H are better than those of PSO_H from 2.1% to 11%, and
better than those of EGA from 1.2% to 15.4%.
In most case the mean value of LPSO_H are better than those of
PSO_H from 2.7% to 9.8%, and better than those of EGA from 2.4%
to 11%. Especially, in some case the difference between the
LPSO_H’s best plan and the global extremum found by exhausted
search is negligible.
16
Comparison the makespan of LPSO_H with other algorithms
Fig 2.18: M=50, N=8
Fig 2.17 : M=20, N=3
0
20
40
60
80
100
120
140
160
180
200
Random RRTSM PSO_H LPSO_H EGA
STD
Mean
Best
0
10
20
30
40
50
60
70
80
Random RRTSM PSO_H LPSO_H EGA
STD
Mean
Best
17
Chapter 3
Solving the CLOS problem based on Particle Swarm
Optimization Strategy
This chapter is divided into two main sections:
i) Opposition-Based Differential Evolution Algorithm
ii) Propose a new algorithm for workflow scheduling on the cloud
environment based on differential evolution method.
Opposition-Based Differential Evolution
Opposition-based Differential Evolution (ODE) is an
evolutionary optimization technique that consists of two main steps:
population initialization and producing new generations by genetic
operations such as selection, crossover and mutation. ODE enhances
these two steps by considering the opposite point of individuals in the
population.
Definition 1 (Opposite Number): Let x [a,b] be a real number, then
the opposite number �̅� is defined as
�̅� = 𝑎 + 𝑏 − 𝑥
Definition 2 (Opposite Point in n-Dimensional Space):
Similarly, the above definition can be extended for higher
dimensions as follows:
Let P(x1, x2,,xD) be an D-dimemsional vector, where xi [ai, bi];
i=1,2,,D. The opposite of P is defined by: �̅� = (�̅�1, �̅�2, , �̅�𝐷) where
𝑥�̅� = 𝑎𝑖 + 𝑏𝑖 − 𝑥𝑖
The opposition-based optimization can be defined as follows:
18
Let P=(x1, x2,..,xD) be a point in D-dimensional space, and f(.) is
the fitness function which is used to measure the candidate’s fitness.
According to the definition of opposite point, �̅� = (�̅�1, �̅�2, , �̅�𝐷) is
the opposite of P=(x1, x2,..,xD). Now if 𝑓(�̅�) ≤ 𝑓(𝑃), then point P can
be replaced by 𝑃,̅ otherwise we continue with P.
Proposed Algorithm MODE
MODE is a new workflow scheduling algorithm in the cloud
environment to minimize the makespan of workflow, which is derived
from the Opposition-based Differential Evolution method, but the
algorithm is improved in the following points:
i) Define the method to calculate the opposite of individuals
ii) Using the Rank-based Roulette Wheel Selection method to select
individuals
Method to calculate the opposite of individuals
The design of ODE entails calculating the opposite of individuals
in the population, which can be carried out as follows:
Let a = Max{P(Si)}; i=1,2,..,N
b = Min{P(Si)}; i=1,2,..,N
Assuming that the particle xi = (Si(1), Si(2),,Si(M)); Si(j) S,
j=1,2,..,M; the opposite of xi, denoted by 𝑥�̅�, will be calculated as
follows:
𝑥�̅� = (𝑆�̅�𝜋(1), 𝑆�̅�𝜋(2), , 𝑆�̅�𝜋(𝑀))
where:
𝑆�̅�𝜋(𝑗) = 𝑎 + 𝑏 − 𝑆𝑖𝜋(𝑗); ∀𝑗 = 1,2, . . , 𝑀
19
We subsequently assign the value corresponding to each position j of
vector 𝑥�̅� by identifying the server which has a computation power
closer to 𝑆�̅�𝜋(𝑗) than any other server
𝑥𝑖𝑗̅̅̅̅ ← 𝑘 𝑛ế𝑢 |𝑃(𝑆𝑘) − 𝑆�̅�𝜋(𝑗)| ≤ |𝑃(𝑆𝑟) − 𝑆𝑖𝜋(𝑗)| ∀𝑆𝑟 (3.4)
Rank-based Roulette Wheel Selection
Rank-based roulette wheel selection is the selection strategy where
the probability of a particle being selected is based on its fitness rank
relative to the entire population. Rank-based selection schemes first
sort individuals in the population according to their fitness and then
computes the selection probabilities according to their ranks rather
than the fitness values. Rank-based selection uses a function to map
the indices of individuals in the sorted list to their selection
probabilities. The rank for an individual may be scaled linearly using
the following formula:
𝑟𝑎𝑛𝑘(𝑝𝑜𝑠) = 2 − 𝑆𝑃 + (2 × (𝑆𝑃 − 1) ×
𝑝𝑜𝑠 − 1
𝑃𝑜𝑝𝑆𝑖𝑧𝑒 − 1
) (3.5)
where 1.0 SP 2.0
The MODE algorithm can be described as follows:
Algorithm MODE ( )
Input: T, S, size of workload W[1×M],
P[1×N], B[N×N], D[M×M], the number of
particle NoP
Output: the best position gbest
1. Call procedure: Opposition-Based
Population Initialization
2. while(criteria is not satisfied)do
20
3. for i=1 to PopSize do
4. selecting p1 from population by RBRWS
algorithm
5. selecting p2 from population by RBRWS
algorithm
6. F Random(1,0)
7. vi pbest + F(p1 – p2)
8. assign the identity of the server to each
position j of vector vi by equation (7)
9. randi,j Random(0,1)
10. Irand random(1,M)
11. 𝑢𝑖,𝑗 = {
𝑣𝑖,𝑗 𝑛ế𝑢 𝑟𝑎𝑛𝑑𝑖,𝑗 ≤ 𝐶𝑅 ℎ𝑜ặ𝑐 𝑖 = 𝐼𝑟𝑎𝑛𝑑
𝑥𝑖,𝑗 𝑛ế𝑢 𝑟𝑎𝑛𝑑𝑖,𝑗 ≥ 𝐶𝑅 ℎ𝑜ặ𝑐 𝑖 ≠ 𝐼𝑟𝑎𝑛𝑑
12. if (makespan(ui) < makespan(xi))
13. xi ui
14. if(rand < Jr)
15. Calculating Opposite Population OP by
OP_Algorithm
16. Fitness Evaluation
17. Selecting PopSize Fittest Individuals
from {Current Population, OP}
18. End while
Return gbest;
In the Initialization step, we randomly generate a population of
PopSize individuals, and calculate the opposite of individuals in the
population, then select the Popsize fittest individuals from current
population and their counterpart opposites.
In each iteration, MODE selects two individuals from population
according to the Rank-based Roulette Wheel Selection method.
21
According to the mutation operator, a mutant vector is generated by
adding the weighted difference between two selected vector of the
target population individuals to a third one as follows :
vi pbest + F(p1 – p2)
where p1, p2 are two individuals selected by rank-based roulette
wheel selection method, and pbest is the best individual in the
population.
Rank-based Roulette Wheel Selection method helps prevent
premature convergence due to “super” individuals, because the best
individual is always assigned the same selection probability,
regardless of its objective value.
After the mutation phase, the crossover operator is applied to
obtain the trail vector ui using the following equation :
𝑢𝑖,𝑗 = {
𝑣𝑖,𝑗 𝑖𝑓 𝑟𝑎𝑛𝑑𝑖,𝑗 ≤ 𝐶𝑅 𝑜𝑟 𝑖 = 𝐼𝑟𝑎𝑛𝑑
𝑥𝑖,𝑗 𝑖𝑓 𝑟𝑎𝑛𝑑𝑖,𝑗 ≥ 𝐶𝑅 𝑜𝑟 𝑖 ≠ 𝐼𝑟𝑎𝑛𝑑
Where randi,j is the jth independent random number uniformly
distributed in the range of [0, 1]. Also Irand is a random number in the
range of [1, M]. CR is a user defined crossover factor in the range [0,
1].
Now, we need to decide whether the trail vector ui should be a
member of the population of the next generation, it is compared with
the target individual xi. Finally, the selection is based on the survival
of the fitness as follows :
𝑥𝑖 = {
𝑢𝑖, 𝑖𝑓 𝑚𝑎𝑘𝑒𝑠𝑝𝑎𝑛(𝑢𝑖) < 𝑚𝑎𝑘𝑒𝑠𝑝𝑎𝑛(𝑥𝑖)
𝑥𝑖, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
}
22
After the crossover phase, we calculate the opposite of individuals
in the population. The fitness of individuals is evaluated, then the
PopSize fittest individuals are selected for the next generation.
Experimental results
We use both random and real world instances in the experiments.
Random instances were taken from Cloud Sim library. Real world
instances were taken from Vietnamese Cloud company and Cloud
Amazon EC2. The instances are divided into groups based on:
• The number of servers N
• The number of tasks M
• The ratio of the number of edges to the number of vertices of
graph G (denoted by )
The experiment results of five compared algorithms: proposed
algorithm MODE, PSO_H, Random, RRTSM, and EGA. For each
instance, we compare the best position vector (gbest). It is notable that
in most cases the best value of MODE are better than those of
Random, RRTSM. Compared to the PSO_H algorithm, the best value
of MODE are better than those of PSO_H from 2.1% to 12.5%, and
better than those of EGA from 1.2% to 9.4%.
In most case the mean value of MODE are better than those of
PSO_H from 2% to 9%, and better than those of EGA from 1.1% to
11%. Especially, in some case the difference between the MODE’s
best plan and the global extremum found by exhausted search is
negligible.
Comparison the makespan of LPSO_H with other algorithms
23
Fig 3.4: M=10, N=5
Fig 3.6: M=20, N=3
Comparison of algorithms PSOi_H, LPSO-H and MODE
Based on experiments, we suggest using the algorithms as
following:
With the complex workflow, there are many data distribution tasks
and data aggregation tasks, and there is a huge amount of data to
transfer between tasks. In this case MODE algorithm has the best
solution.
0
5
10
15
20
25
30
STD
Mean
Best
0
100
200
300
400
500
STD
Mean
Best
24
With the simple workflow, such as process structure, or pipeline
structure the PSOi_H algorithm is a good choice.
With the complex workflow, and there is a little amount of data to
transfer between tasks. In this case we should use the LPSO_H
algorithm.
CONCLUTION AND FUTURE WORK
CONCLUTION
The thesis presented the workflow scheduling problem on the
cloud computing environment CLOS, this is the first time the CLOS
problem has been formal express.
The thesis presented an overview of Graham's classification for
scheduling problem and gave Graham's classification for the CLOS
problem. Based on the SCHED problem, the thesis also demonstrated
that the CLOS problem belong to NP-hard.
In chapter 2, the thesis proposed two new algorithms for solving
the CLOS problem based on Particle Swarm Optimization.
In chapter 3, the thesis proposed one new algorithm for solving the
CLOS problem based on Differential Evolution.
FUTURE WORK
Apply the parallel algorithms, Branch and bound algorithms to find
the optimal solution for large and medium-sized datasets, in order to
test the quality of the proposed algorithms.
Another research, we apply the resource demand predicting
methods in order to schedule better
LIST OF SCIENTIFIC PUBLICATIONS
[1]. Phan Thanh Toàn, Kiều Tuấn Dũng, Nguyễn Thế Lộc, Nguyễn Doãn Cường,
“Sắp xếp lịch biểu thực thi luồng công việc tại đám mây điện toán”, Hội thảo
quốc gia lần thứ XVI, một số vấn đề chọn lọc của công nghệ thông tin và truyền
thông (@ 2013), pp. 285-290, 2013.
[2]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường, Đỗ Như Long,
“Thuật toán lập lịch luồng công việc theo phương pháp tối ưu bày đàn trong môi
trường điện toán đám mây”, tạp chí Nghiên cứu khoa học và Công nghệ quân
sự, pp. 132-138, 2015.
[3]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường, Đỗ Như Long, “Giải
thuật tối thiểu hóa chi phí thực thi luồng công việc trong môi trường điện toán
đám mây”, Tạp chí khoa học trường đại học Sư Phạm Hà Nội, pp. 47-55, 2015.
[4]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường , “Thuật toán lập lịch
luồng công việc trong môi trường điện toán đám mây”, Proc. Of the 8th National
Conference on Foundamental and Applied Information Techonoly Research
(FAIR’8), pp. 687-693, 2015.
[5]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường, “Thuật toán lập lịch
luồng công việc trong môi trường điện toán đám mây dựa trên chiến lược tối ưu
bày đàn”, tạp chí Công nghệ thông tin và Truyền thông, pp. 15-20, 2015.
[6]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường, “A Novel Workflow
Scheduling Algorithm in Cloud Environment”, Proc. Of 2015 2nd National
Foundation for Science and Technology Development Conference on
Information and Computer Science (NICS’2015), pp. 125-129, 2015.
[7]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường, Trần Đăng Hưng,
“MODE: Hướng tiếp cận mới cho việc thực thi luồng công việc”, tạp chí khoa
học Công nghệ thông tin và Truyền thông, tập 1, số 1, pp. 63-70, 2016.
[8]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường , “Thuật toán LPSO
lập lịch cho các ứng dụng khoa học trong môi trường điện toán đám mây”, tạp
chí Khoa học và Công nghệ, Viện hàn lâm khoa học Việt Nam, pp. 287-299,
2016.
[9]. Phan Thanh Toàn, Nguyễn Thế Lộc, Nguyễn Doãn Cường, “A Robus and
Effective MODE Algorithm for Workflow Scheduling in Cloud Enveronment”,
Proc. Of The 7th International Symposium on Information and Communication
Technology (SoICT’2016), pp. 259-264, 2016.
[10]. Toan Phan Thanh, Loc Nguyen The, Said Elnaffar and Cuong Nguyen Doan,
"LPSO: Another Algorithm for Workflow Scheduling in the Cloud", Journal of
Computer Science, ISSN: 1549-3636, DOI: 10.3844/jcssp.2016, pp.611-617,
Volume 12, Issue 12, 2016.
Các file đính kèm theo tài liệu này:
- approximate_methods_based_on_particle_swarm_optimization_and.pdf
- TomTat LuanAn NCS PhanThanhToan_TiengViet.pdf