DC Algorithms and Nonconvex Quadratic Programming


VIETNAM ACADEMY OF SCIENCE AND TECHNOLOGY
INSTITUTE OF MATHEMATICS

Hoang Ngoc Tuan

DC ALGORITHMS AND NONCONVEX QUADRATIC PROGRAMMING

Speciality: Applied Mathematics
Speciality code: 62 46 01 12

SUMMARY OF PH.D. DISSERTATION SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY IN MATHEMATICS

Supervisor: Prof. Dr. Hab. Nguyen Dong Yen

HANOI - 2015

The dissertation was written on the basis of the author's research works carried out at the Institute of Mathematics, Vietnam Academy of Science and Technology.

Supervisor: Prof. Dr. Hab. Nguyen Dong Yen
First referee: ...
Second referee: ...
Third referee: ...

To be defended at the Jury of the Institute of Mathematics, Vietnam Academy of Science and Technology, on ........., at ........ o'clock.

The dissertation is publicly available at:
• The National Library of Vietnam
• The Library of the Institute of Mathematics

Introduction

Convex functions have many nice properties. For instance, a convex function ϕ : Rn → R is continuous, directionally differentiable, and locally Lipschitz at any point u ∈ Rn. In addition, ϕ is Fréchet differentiable almost everywhere on Rn, i.e., the set of points where the gradient ∇ϕ(x) exists is of full Lebesgue measure.
It is also known that the subdifferential

∂ϕ(u) := {x* ∈ Rn : ⟨x*, x − u⟩ ≤ ϕ(x) − ϕ(u) ∀x ∈ Rn}

of a convex function ϕ : Rn → R ∪ {+∞} at u ∈ dom ϕ := {x ∈ Rn : ϕ(x) < +∞} is a closed, convex set. If x ∉ dom ϕ, then one puts ∂ϕ(x) = ∅. The Fermat rule for convex optimization problems asserts that x̄ ∈ Rn is a solution of the minimization problem min{ϕ(x) : x ∈ Rn} if and only if 0 ∈ ∂ϕ(x̄). Convex analysis is a powerful machinery for dealing with convex optimization problems. Note that convex programming is an important branch of optimization theory, which continues to draw the attention of many researchers worldwide.

If f : Rn → R is a given function, and if there exist convex functions g : Rn → R and h : Rn → R such that f(x) = g(x) − h(x) for every x ∈ Rn, then f is called a d.c. function. The abbreviation “d.c.” comes from the combination of the words “Difference (of) Convex (functions)”. More generally, a function f : Rn → R̄, where R̄ = R ∪ {±∞}, is said to be a d.c. function if there are lower semicontinuous, proper, convex functions g, h : Rn → R̄ such that f(x) = g(x) − h(x) for all x ∈ Rn. The convention (+∞) − (+∞) = +∞ is used here. Despite their possible nonconvexity, d.c. functions still enjoy some good properties of convex functions.

A minimization problem with a geometrical constraint,

min{f(x) = g(x) − h(x) : x ∈ C},   (0.1)

where f, g, and h are given as above and C ⊂ Rn is a nonempty closed convex set, is a typical DC programming problem. Setting f̃(x) = (g(x) + δC(x)) − h(x), where δC, with δC(x) = 0 for all x ∈ C and δC(x) = +∞ for all x ∉ C, is the indicator function of the set C, one can easily transform (0.1) into

min{f̃(x) : x ∈ Rn},   (0.2)

which is an unconstrained DC programming problem with the d.c. objective function f̃. DC programming and DC algorithms (DCA, for brevity) treat the problem of minimizing a function f = g − h, with g, h being lower semicontinuous, proper, convex functions on Rn, on the whole space.
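For instance (a numeric illustration of ours, not an example from the text), every quadratic function is d.c.: for any symmetric matrix A and any ρ no smaller than the largest eigenvalue of A, f splits into a difference of two convex quadratics. A minimal sketch:

```python
import numpy as np

# Numeric sketch (our illustration, not from the text): a quadratic
# f(x) = (1/2) x^T A x + b^T x is a d.c. function.  For any
# rho >= lambda_max(A) one may take
#   g(x) = (rho/2)||x||^2 + b^T x        (convex),
#   h(x) = (1/2) x^T (rho*I - A) x       (convex, since rho*I - A is PSD),
# so that f = g - h.
rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                       # symmetric, generally indefinite
b = rng.standard_normal(n)
rho = np.linalg.eigvalsh(A).max() + 1.0

f = lambda x: 0.5 * x @ A @ x + b @ x
g = lambda x: 0.5 * rho * (x @ x) + b @ x
h = lambda x: 0.5 * x @ ((rho * np.eye(n) - A) @ x)

x = rng.standard_normal(n)
assert abs(f(x) - (g(x) - h(x))) < 1e-9                      # f = g - h
assert np.linalg.eigvalsh(rho * np.eye(n) - A).min() >= 0.0  # h is convex
```

Decompositions of exactly this type, with ρ chosen against the largest eigenvalue of the matrix, reappear in the algorithms recalled in Chapters 2 and 3.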
Usually, g and h are called d.c. components of f. The DCA are constructed on the basis of DC programming theory and the duality theory of J. F. Toland. It was Pham Dinh Tao who suggested a general DCA theory, which has been developed intensively by him and Le Thi Hoai An starting from their fundamental paper “Convex analysis approach to D.C. programming: Theory, algorithms and applications” (Acta Mathematica Vietnamica, Vol. 22, 1997).

Note that DC programming is among the most successful convex analysis approaches to nonconvex programming. One wishes to make an extension of convex programming that is not too wide, so that the powerful tools of convex analysis and convex optimization can still be used, but sufficiently large to cover the most important nonconvex optimization problems. The set of d.c. functions, which is closed under the basic operations usually considered in optimization, serves this purpose well. Note that the convexity of the two components of the objective function has been employed widely in DC programming to obtain essential theoretical results and to construct efficient solution methods. The DC duality scheme of J. F. Toland is an example of such essential theoretical results. To be more precise, Toland's Duality Theorem asserts that, under mild conditions, the dual problem of a DC program is also a DC program, and the two problems have the same optimal value.

Due to their local character, DCA (i.e., DC algorithms) do not ensure the convergence of an iteration sequence to a global solution of the problem in question. However, with the help of a restart procedure, DCA applied to trust-region subproblems can yield a global solution of the problem. In practice, DCA have been successfully applied to many different nonconvex optimization problems, for which they have proved to be more robust and efficient than many standard methods; in particular, DCA work well for large-scale problems.
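Toland's duality statement mentioned above can be checked on a toy one-dimensional DC program (our example, not the dissertation's), comparing the primal and dual optimal values numerically:

```python
import numpy as np

# Toy 1-D check of Toland's Duality Theorem (our example): take
# g(x) = x^2/2 and h(x) = |x|, both convex.  Their conjugates are
# g*(y) = y^2/2 and h*(y) = 0 for |y| <= 1 (+infinity otherwise),
# so the dual DC program minimizes h* - g* over the effective
# domain [-1, 1] of h*.
xs = np.linspace(-3.0, 3.0, 60001)
primal = (0.5 * xs**2 - np.abs(xs)).min()   # inf (g - h), attained at x = ±1
ys = np.linspace(-1.0, 1.0, 20001)          # dom h*
dual = (0.0 - 0.5 * ys**2).min()            # inf (h* - g*), attained at y = ±1
assert abs(primal + 0.5) < 1e-6             # both optimal values equal -1/2
assert abs(dual + 0.5) < 1e-6
```

Both programs attain the common optimal value −1/2, as the duality theorem predicts.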
Note also that, with appropriate decompositions of the objective functions, DCA can generate several standard algorithms in convex and nonconvex programming.

This dissertation studies the DCA applied to minimizing a quadratic function on a Euclidean ball (the so-called trust-region subproblem) and to minimizing a quadratic function on a polyhedral convex set. These problems play important roles in optimization theory.

Let A ∈ Rn×n be a symmetric matrix, b ∈ Rn a given vector, and r > 0 a real number. The nonconvex quadratic programming problem with a convex constraint

min{ f(x) := (1/2)xᵀAx + bᵀx : ‖x‖² ≤ r² },

where ‖x‖ = (Σᵢ₌₁ⁿ xᵢ²)^{1/2} denotes the Euclidean norm of x = (x₁, ..., xₙ)ᵀ ∈ Rn and ᵀ means matrix transposition, is called the trust-region subproblem. One encounters this problem while applying the trust-region method (see, e.g., A. R. Conn, N. I. M. Gould, and P. L. Toint, “Trust-Region Methods”, 2000) to solve the unconstrained problem of finding the minimum of a C²-smooth function ϕ : Rn → R. Having an approximate solution x^k at step k of the trust-region method, to get a better approximate solution x^{k+1} one finds the minimum of ϕ on a ball with center x^k and a radius depending on a ratio defined by some calculations on ϕ and the point x^k. If one replaces ϕ by its second-order Taylor expansion around x^k, an auxiliary problem of the form of the trust-region subproblem appears, and x^{k+1} is a solution of this problem.

Consider a quadratic programming problem under linear constraints of the form

min{ f(x) := (1/2)xᵀQx + qᵀx : Dx ≥ d },

where Q ∈ Rn×n and D ∈ Rm×n are given matrices, and q ∈ Rn and d ∈ Rm are given vectors. It is assumed that Q is symmetric. This class of optimization problems is well known and has various applications.
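A tiny instance of this second problem class may help fix ideas (the data below are our hypothetical choice, not from the text): an indefinite Q makes the objective nonconvex, yet on a compact polyhedron a global solution still exists, and for two variables it can be located by brute force.

```python
import numpy as np

# Hypothetical 2-D instance of min (1/2) x^T Q x + q^T x s.t. Dx >= d:
#   f(x, y) = (1/2)(x^2 - y^2) - x,  feasible set the box [0, 1]^2,
# written as Dx >= d with four inequality rows.
Q = np.array([[1.0, 0.0],
              [0.0, -1.0]])                       # symmetric, indefinite
q = np.array([-1.0, 0.0])
D = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
d = np.array([0.0, 0.0, -1.0, -1.0])

f = lambda x: 0.5 * x @ Q @ x + q @ x
feasible = lambda x: np.all(D @ x >= d - 1e-12)

# brute force over a grid of the (compact) feasible set
grid = np.linspace(0.0, 1.0, 101)
pts = np.array([(a, c) for a in grid for c in grid])
assert all(feasible(p) for p in pts)
vals = np.array([f(p) for p in pts])
best = pts[vals.argmin()]
# the minimum -1 is attained at the vertex (1, 1): f = 0.5(1 - 1) - 1
assert np.allclose(best, [1.0, 1.0])
assert abs(vals.min() + 1.0) < 1e-12
```

In higher dimensions such enumeration is of course hopeless, which is what motivates the qualitative theory and the DCA-based methods discussed next.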
Basic qualitative properties related to solution existence, the structure of the solution set, first-order necessary and sufficient optimality conditions, second-order necessary and sufficient optimality conditions, stability, and differential stability of the problem can be found in the books of B. Bank, J. Guddat, D. Klatte, B. Kummer, and K. Tammer, “Non-Linear Parametric Optimization” (1982), R. W. Cottle, J.-S. Pang, and R. E. Stone, “The Linear Complementarity Problem” (1992), G. M. Lee, N. N. Tam, and N. D. Yen, “Quadratic Programming and Affine Variational Inequalities: A Qualitative Study” (2005), and the references therein. The structure of the solution set and of the Karush-Kuhn-Tucker point set of this problem is far different from that of the trust-region subproblem, since the constraint set of the trust-region subproblem is convex, compact, and has smooth boundary.

Our aim is to study the convergence and the convergence rate of the DCA applied to the two mentioned problems. An open question and a conjecture raised in two papers by H. A. Le Thi, T. Pham Dinh, and N. D. Yen (J. Global Optim., Vol. 49, 2011, pp. 481–495, and Vol. 53, 2012, pp. 317–329) will be completely solved in this dissertation. By using some advanced tools, we are able to obtain complete results on the convergence of DCA sequences. Moreover, convergence rates of DCA sequences are established for the first time in this dissertation. The results of this dissertation complement and develop the corresponding published results of T. Pham Dinh and H. A. Le Thi (SIAM J. Optim., Vol. 8, 1998), T. Pham Dinh, H. A. Le Thi, and F. Akoa (Optim. Methods Softw., Vol. 23, 2008), and H. A. Le Thi, T. Pham Dinh, and N. D. Yen (J. Global Optim., Vol. 49, 2011; Vol. 53, 2012).

The dissertation has three chapters and a list of references. Chapter 1, “Preliminaries”, presents basic concepts and results of the general theory on DC programming and DCA.
Chapter 2, “Minimization of a Quadratic Function on a Euclidean Ball”, considers an application of the DCA to trust-region subproblems. Here we present in detail a useful restart procedure that allows the algorithm to find a global solution. We also answer in the affirmative the question raised by H. A. Le Thi, T. Pham Dinh, and N. D. Yen (2012) about the convergence of the DCA. Furthermore, the convergence rate of the DCA is studied.

Chapter 3, “Minimization of a Quadratic Function on a Polyhedral Convex Set”, investigates an application of the DCA to indefinite quadratic programs under linear constraints. Here we settle in the affirmative a conjecture raised by H. A. Le Thi, T. Pham Dinh, and N. D. Yen (2011) about the boundedness of DCA sequences. First, by a direct proof, we obtain the boundedness of DCA sequences for quadratic programs in R². Then, by using some error bounds for affine variational inequalities, we establish the R-linear convergence rate of the algorithm, hence giving a complete solution of the conjecture.

The results of Chapter 2 were published in Journal of Global Optimization [5] (a joint work with N. D. Yen) and in Journal of Optimization Theory and Applications [2]. Chapter 3 is written on the basis of the paper [3], which was published in Journal of Optimization Theory and Applications, and of the paper [4], which was published in Journal of Mathematical Analysis and Applications.
The above results were reported by the author of this dissertation at the Seminar of the Department of Numerical Analysis and Scientific Computing of the Hanoi Institute of Mathematics, the 8th Vietnam-Korea Workshop “Mathematical Optimization Theory and Applications” (University of Dalat, December 8-10, 2011), the 5th International Conference on High Performance Scientific Computing (March 5-9, 2012, Hanoi, Vietnam), the Joint Congress of the French Mathematical Society (SMF) and the Vietnamese Mathematical Society (VMS) (University of Hue, August 20-24, 2012), the 8th Vietnamese Mathematical Conference (Nha Trang, August 10-14, 2013), and the 12th Workshop on Optimization and Scientific Computing (Ba Vi, April 23-25, 2014).

Chapter 1. Preliminaries

This chapter reviews some background material on DC algorithms. For more details, we refer to H. A. Le Thi and T. Pham Dinh's papers (1997, 1998), H. N. Tuan's Master dissertation (“DC Algorithms and Applications in Quadratic Programming”, Hanoi, 2010), and the references therein.

1.1 Toland's Duality Theorem for DC Programs and Iteration Algorithms

Consider the space Rn equipped with the canonical inner product ⟨·, ·⟩. Then the dual space of Rn can be identified with Rn. A function θ : Rn → R̄ is said to be proper if it does not take the value −∞ and is not identically equal to +∞, i.e., there is some x ∈ Rn with θ(x) ∈ R. The effective domain of θ is defined by dom θ := {x ∈ Rn : θ(x) < +∞}. Let Γ0(Rn) be the set of all lower semicontinuous, proper, convex functions on Rn. The conjugate function g* of a function g ∈ Γ0(Rn) is defined by

g*(y) = sup{⟨x, y⟩ − g(x) : x ∈ Rn}  ∀y ∈ Rn.

Note that g* : Rn → R̄ is also a lower semicontinuous, proper, convex function. In the sequel, we use the convention (+∞) − (+∞) = +∞.

Definition 1.1 The optimization problem

inf{f(x) := g(x) − h(x) : x ∈ Rn},   (P)

where g and h are functions belonging to Γ0(Rn), is called a DC program. The functions g and h are called d.c. components of f.

Definition 1.2 For any g, h ∈ Γ0(Rn), the DC program

inf{h*(y) − g*(y) : y ∈ Rn}   (D)

is called the dual problem of (P).

Proposition 1.1 (Toland's Duality Theorem) The DC programs (P) and (D) have the same optimal value.

Definition 1.3 A vector x* ∈ Rn is said to be a local solution of (P) if f(x*) = g(x*) − h(x*) is finite (i.e., x* ∈ dom g ∩ dom h) and there exists a neighborhood U of x* such that g(x*) − h(x*) ≤ g(x) − h(x) for all x ∈ U. If we can choose U = Rn, then x* is called a (global) solution of (P).

Proposition 1.2 (First-order optimality condition) If x* is a local solution of (P), then ∂h(x*) ⊂ ∂g(x*).

Definition 1.4 A point x* ∈ Rn satisfying ∂h(x*) ⊂ ∂g(x*) is called a stationary point of (P).

Definition 1.5 A point x* ∈ Rn is said to be a critical point of (P) if ∂g(x*) ∩ ∂h(x*) ≠ ∅.

If ∂h(x*) ≠ ∅ and x* is a stationary point of (P), then x* is a critical point of (P). The reverse implication does not hold in general.

The idea of the theory of DC algorithms (DCA, for brevity) is to construct two sequences {x^k} and {y^k} (approximate solution sequences of (P) and (D), respectively) in an appropriate way such that:

(i) the sequences {(g − h)(x^k)} and {(h* − g*)(y^k)} are decreasing;

(ii) every cluster point x* (resp., y*) of {x^k} (resp., of {y^k}) is a critical point of (P) (resp., of (D)).

The general DC algorithm of H. A. Le Thi and T. Pham Dinh (1997) is formulated as follows.

DCA
• Choose x^0 ∈ dom g.
• For each k ≥ 0, if x^k has been defined, select a vector

y^k ∈ ∂h(x^k).   (1.1)

• For each k ≥ 0, if y^k has been defined, select a vector

x^{k+1} ∈ ∂g*(y^k).   (1.2)

It can be shown that y^k satisfies (1.1) if and only if it is a solution of the convex program

min{h*(y) − ⟨x^k, y⟩ : y ∈ Rn}.   (Dk)

Similarly, x^{k+1} satisfies (1.2) if and only if it is a solution of the convex program

min{g(x) − ⟨x, y^k⟩ : x ∈ Rn}.   (Pk)

A simplified version of the DCA, with a termination procedure added, can be stated as follows:

• Step 1. Choose x^0 ∈ dom g. Take ε ≥ 0. Put k = 0.
• Step 2. Solve (Dk) by some algorithm from convex programming to get a solution y^k. Solve (Pk) by some algorithm from convex programming to get a solution x^{k+1}. Check the condition ‖x^{k+1} − x^k‖ < ε. If it is satisfied, then terminate the computation; otherwise, go to Step 3.
• Step 3. Increase k by 1 and return to Step 2.

1.2 General Convergence Theorem

Definition 1.6 Let ρ ≥ 0 and let C be a convex set in the space Rn. A function θ : C → R ∪ {+∞} is called ρ-convex if

θ(λx + (1 − λ)x′) ≤ λθ(x) + (1 − λ)θ(x′) − (λ(1 − λ)ρ/2)‖x − x′‖²

for all numbers λ ∈ (0, 1) and vectors x, x′ ∈ C. This amounts to saying that the function θ(·) − (ρ/2)‖·‖² is convex on C.

Definition 1.7 The modulus of convexity of θ on C is given by

ρ(θ, C) = sup{ρ ≥ 0 : θ − (ρ/2)‖·‖² is convex on C}.

If C = Rn, then we write ρ(θ) instead of ρ(θ, C). The function θ is called strongly convex on C if ρ(θ, C) > 0.

Consider the problem (P). If ρ(g) > 0 (resp., ρ(g*) > 0), let ρ₁ (resp., ρ₁*) be a real number such that 0 ≤ ρ₁ < ρ(g) (resp., 0 ≤ ρ₁* < ρ(g*)). If ρ(g) = 0 (resp., ρ(g*) = 0), let ρ₁ = 0 (resp., ρ₁* = 0). If ρ(h) > 0 (resp., ρ(h*) > 0), let ρ₂ (resp., ρ₂*) be a real number such that 0 ≤ ρ₂ < ρ(h) (resp., 0 ≤ ρ₂* < ρ(h*)). If ρ(h) = 0 (resp., ρ(h*) = 0), let ρ₂ = 0 (resp., ρ₂* = 0). One adopts the abbreviations dx^k = x^{k+1} − x^k and dy^k = y^{k+1} − y^k. Let α := inf{f(x) = g(x) − h(x) : x ∈ Rn}.

Theorem 1.1 Assume that {x^k} and {y^k} are generated by the DCA. We have:

(i) (g − h)(x^{k+1}) ≤ (h* − g*)(y^k) − max{(ρ₂/2)‖dx^k‖², (ρ₂*/2)‖dy^k‖²}
  ≤ (g − h)(x^k) − max{((ρ₁ + ρ₂)/2)‖dx^k‖², (ρ₁*/2)‖dy^{k−1}‖² + (ρ₂/2)‖dx^k‖², (ρ₁*/2)‖dy^{k−1}‖² + (ρ₂*/2)‖dy^k‖²};

(ii) (h* − g*)(y^{k+1}) ≤ (g − h)(x^{k+1}) − max{(ρ₁/2)‖dx^{k+1}‖², (ρ₁*/2)‖dy^k‖²}
  ≤ (h* − g*)(y^k) − max{((ρ₁* + ρ₂*)/2)‖dy^k‖², (ρ₁/2)‖dx^{k+1}‖² + (ρ₂/2)‖dx^k‖², (ρ₁*/2)‖dy^k‖² + (ρ₂/2)‖dx^k‖²};

(iii) If α is finite, then {(g − h)(x^k)} and {(h* − g*)(y^k)} are decreasing sequences that converge to the same limit β ≥ α.
Furthermore:

(a) If ρ(g) + ρ(h) > 0 (resp., ρ(g*) + ρ(h*) > 0), then lim_{k→∞} (x^{k+1} − x^k) = 0 (resp., lim_{k→∞} (y^{k+1} − y^k) = 0);

(b) lim_{k→∞} [g(x^k) + g*(y^k) − ⟨x^k, y^k⟩] = 0;

(c) lim_{k→∞} [h(x^{k+1}) + h*(y^k) − ⟨x^{k+1}, y^k⟩] = 0.

(iv) If α is finite, and {x^k} and {y^k} are bounded, then for every cluster point x* of {x^k} (resp., y* of {y^k}) there is a cluster point y* of {y^k} (resp., x* of {x^k}) such that:

(d) (x*, y*) ∈ [∂g*(y*) ∩ ∂h*(y*)] × [∂g(x*) ∩ ∂h(x*)];

(e) (g − h)(x*) = (h* − g*)(y*) = β;

(f) lim_{k→∞} [g(x^k) + g*(y^k)] = lim_{k→∞} ⟨x^k, y^k⟩.

Chapter 2. Minimization of a Quadratic Function on a Euclidean Ball

In this chapter, we prove that any DCA sequence constructed by the Pham Dinh–Le Thi algorithm for the trust-region subproblem converges to a Karush-Kuhn-Tucker point. We also obtain sufficient conditions for the Q-linear convergence of DCA sequences. In addition, we give two examples showing that, if the sufficient conditions are not satisfied, then the sequences may not be Q-linearly convergent. This chapter is written on the basis of the papers [2, 5]. A part of the results from [4] is used in the final section of this chapter.

2.1 The Trust-Region Subproblem

Let A ∈ Rn×n be a symmetric matrix, b ∈ Rn a given vector, and r > 0 a real number. The trust-region subproblem corresponding to the triple {A, b, r} is the optimization problem

min{ f(x) := (1/2)xᵀAx + bᵀx : ‖x‖² ≤ r² }.   (2.1)

It is well known that if x ∈ E := {x ∈ Rn : ‖x‖ ≤ r} is a local minimum of (2.1), then there exists a unique Lagrange multiplier λ ≥ 0 such that

(A + λI)x = −b,  λ(‖x‖ − r) = 0,   (2.2)

where I denotes the n × n unit matrix. If x ∈ E and there exists λ ≥ 0 satisfying (2.2), then x is said to be a Karush-Kuhn-Tucker point (or a KKT point) of (2.1).

2.2 Solving the Trust-Region Subproblem by a DC Algorithm

2.2.1 The Pham Dinh–Le Thi Algorithm

Applying the DCA presented in Chapter 1 to (2.1), we obtain the following iterative algorithm:

1. Choose ρ > 0 so that ρI − A is a positive semidefinite matrix.

2. Fix an initial point x^0 ∈ Rn and a constant ε ≥ 0 (a tolerance). Set k = 0.

3. If

ρ⁻¹‖(ρI − A)x^k − b‖ ≤ r,   (2.3)

then take

x^{k+1} = ρ⁻¹[(ρI − A)x^k − b].   (2.4)

Otherwise, set

x^{k+1} = r‖(ρI − A)x^k − b‖⁻¹[(ρI − A)x^k − b].   (2.5)

4. If ‖x^{k+1} − x^k‖ < ε, then terminate the computation. Otherwise, increase k by 1 and resume the test (2.3).

For ε = 0, the above algorithm generates an infinite sequence {x^k}_{k≥0}, called a DCA sequence. Basic properties of the DCA sequences produced by this algorithm can be stated as follows.

Theorem 2.1 (Pham Dinh and Le Thi, 1998) For any k ≥ 1,

f(x^{k+1}) ≤ f(x^k) − (1/2)(ρ + θ₁)‖x^{k+1} − x^k‖²,

where θ₁ is the smallest eigenvalue of the matrix ρI − A. If x^0 ∈ E, then the inequality holds for all k ≥ 0. It holds that lim_{k→∞} ‖x^{k+1} − x^k‖ = 0 and f(x^k) → β ≥ α as k → ∞, where α is the optimal value of (2.1) and β is a constant depending on the choice of x^0. In addition, every cluster point of the sequence {x^k} is a KKT point.

After proving that “if A is a nonsingular matrix that has no multiple negative eigenvalues, then any DCA sequence of (2.1) converges to a KKT point”, H. A. Le Thi, T. Pham Dinh, and N. D. Yen (J. Global Optim., 2012) posed the following question.

Question. Under what conditions is the DCA sequence {x^k} convergent?

The next sections are aimed at completely solving the above question. It will be proved that any DCA sequence constructed by the Pham Dinh–Le Thi algorithm for the trust-region subproblem (2.1) converges to a KKT point.

2.2.2 Restart Procedure

DCA for finding global solutions of (2.1):

Start. Compute λ₁ (the smallest eigenvalue of A), λₙ (the largest eigenvalue of A), and an eigenvector u corresponding to λ₁ by a suitable algorithm. Take ρ > max{0, λₙ}, x ∈ dom f, stop := false.

While stop = false do

1. Implement the DCA with the initial point x^0 := x to get a KKT point x*.

2. Take λ* = (−⟨x*, Ax*⟩ − ⟨x*, b⟩)/r². If λ* ≥ −λ₁, then stop := true (x* is a global solution of (2.1)); else use the procedure of T. Pham Dinh and H. A. Le Thi (SIAM J. Optim., 1998) to find a new vector x such that f(x) < f(x*), and return to 1.

End while.

This enlarged version of the DCA is no longer a local optimization algorithm. In fact, it is a very efficient global optimization algorithm for solving (2.1).

2.3 Auxiliary Results

This section presents four lemmas together with two corollaries, which are needed in establishing the convergence of DCA sequences.

2.4 Convergence Theorem

The main result of this section is the following.

Theorem 2.2 For every initial point x^0 ∈ Rn, the DCA sequence {x^k}_{k≥0} for the trust-region subproblem (2.1) constructed by the Pham Dinh–Le Thi algorithm (2.3)–(2.5) converges to a Karush-Kuhn-Tucker point of (2.1).

2.5 Illustrative Examples

This section gives two examples to illustrate the application of Theorem 2.2 to the trust-region subproblem.

2.6 Convergence Rate

A sequence {x^k} ⊂ Rn is said to be Q-linearly convergent to x* if there is a constant µ ∈ (0, 1) such that ‖x^{k+1} − x*‖ ≤ µ‖x^k − x*‖ for all sufficiently large k. If limsup_{k→∞} ‖x^k − x*‖^{1/k} < 1, then one says that {x^k} is R-linearly convergent to x*. It is well known that Q-linear convergence implies R-linear convergence, but the reverse implication is not true.

Theorem 2.3 Let a DCA sequence {x^k}_{k≥0} for the trust-region subproblem (2.1) be convergent to a KKT point x*, and let λ* be the Lagrange multiplier corresponding to x*. Then we have

‖x^{k+1} − x*‖ ≤ ((ρ − λ₁)/√(ρ(ρ + λ*)))‖x^k − x*‖   (2.6)

for all sufficiently large k. Here λ₁ denotes the smallest eigenvalue of A. Hence, if (ρ − λ₁)/√(ρ(ρ + λ*)) ∈ (0, 1), then {x^k} converges Q-linearly to x*.

From Theorem 2.3 we can derive the following sufficient conditions for the linear convergence rate of DCA sequences for problem (2.1).

Theorem 2.4 Let {x^k} be a DCA sequence of problem (2.1) converging to a KKT point x*. Let λ* ≥ 0 be the Lagrange multiplier associated with x*.
The following is valid:

(a) If A is positive definite, then {x^k} converges Q-linearly to x*;

(b) If A is positive semidefinite, but not positive definite, and λ* is positive, then {x^k} converges Q-linearly to x*.

Definition 2.1 A KKT point of (2.1) is called proper if the Lagrange multiplier corresponding to it is positive.

By using the above properness concept, we can restate claim (b) of Theorem 2.4 as follows: if A is positive semidefinite, but not positive definite, and x* is a proper KKT point, then {x^k} converges Q-linearly to x*.

2.7 Further Analysis

In this section, we give two examples to show that, in general, the convergence rate of {x^k} may not be Q-linear if either A is not positive semidefinite or x* is improper.

2.8 Minimization of a Quadratic Form on a Ball Revisited

The convergence rate of the Pham Dinh–Le Thi algorithm described in Subsection 2.2.1 is classified further in this section. Namely, by an example we show that there is a DCA sequence which does not converge R-linearly.

Chapter 3. Minimization of a Quadratic Function on a Polyhedral Convex Set

In this chapter, we first prove that any iteration sequence generated by the projection DC decomposition algorithm in quadratic programming is bounded, provided that the quadratic program in question is two-dimensional and solvable. Then, by using some error bounds for affine variational inequalities, we prove that any DCA sequence of that type is R-linearly convergent, provided that the original problem has solutions. Our results solve in the affirmative the first part of the conjecture stated by H. A. Le Thi, T. Pham Dinh, and N. D. Yen (2011). This chapter is written on the basis of the papers [3, 4].

3.1 Quadratic Problems with Linear Constraints

Let Q ∈ Rn×n and D ∈ Rm×n be given matrices, and let q ∈ Rn and d ∈ Rm be given vectors. Suppose that Q is symmetric.
Consider the indefinite quadratic programming problem under linear constraints

min{ f(x) := (1/2)xᵀQx + qᵀx : Dx ≥ d }.   (3.1)

Let C := {x ∈ Rn : Dx ≥ d}. Since Q is not required to be positive semidefinite, (3.1) is in general a nonconvex optimization problem. It is a well-known polynomial optimization problem; qualitative properties of the problem, as well as numerical methods for solving it, have been discussed in many books and research papers.

3.2 Solving QP Problems by a DC Algorithm

The general DCA theory of Pham Dinh and Le Thi can be applied with success to indefinite QPs in a similar way as it has been used for problem (2.1). In fact, among the many solution methods for (3.1), the following one of T. Pham Dinh, H. A. Le Thi, and F. Akoa (2008) deserves special attention due to its simplicity and its effectiveness in finding the stationary point set of the problem.

Projection DC Decomposition Algorithm. For a given initial point x^0 ∈ Rn, the DCA sequence {x^k} corresponding to x^0 is defined by the iteration formula

x^{k+1} := P_C( x^k − (1/ρ)(Qx^k + q) ),  k = 0, 1, 2, ...   (3.2)

Here the constant ρ > 0 should be larger than the largest eigenvalue of Q, and P_C(u) denotes the metric projection of u ∈ Rn onto C; that is, P_C(u) ∈ C and ‖u − P_C(u)‖ ≤ ‖u − x‖ for every x ∈ C. Since Q is symmetric, the procedure of finding the largest eigenvalue of Q is rather simple and requires very little computation time.

Definition 3.1 A point x ∈ Rn is said to be a Karush-Kuhn-Tucker point (or a KKT point, for brevity) of (3.1) if there exists a multiplier λ ∈ Rm such that

Qx + q − Dᵀλ = 0,  Dx ≥ d,  λ ≥ 0,  λᵀ(Dx − d) = 0.

Theorem 3.1 (T. Pham Dinh, H. A. Le Thi, and F. Akoa, 2008) Every DCA sequence {x^k} generated by the above algorithm from an initial point x^0 ∈ Rn possesses the following properties:

(i) f(x^{k+1}) ≤ f(x^k) − (ρ/2)‖x^{k+1} − x^k‖² for every k ≥ 1;

(ii) The sequence {f(x^k)} converges to an upper bound f* for the optimal value of (3.1);

(iii) Every cluster point of {x^k} is a KKT point of (3.1);

(iv) If inf_{x∈C} f(x) > −∞, then lim_{k→∞} ‖x^{k+1} − x^k‖ = 0.

Our aim is to settle the next conjecture, which is the first part of the conjecture stated by H. A. Le Thi, T. Pham Dinh, and N. D. Yen (J. Global Optim., 2011).

Conjecture 1 If (3.1) has solutions, then every DCA sequence generated by the projection DC decomposition algorithm (3.2) must be bounded.

Conjecture 1 was solved by those authors under certain additional assumptions. In accordance with the progress of our long efforts in finding a solution of the conjecture, we first solve Conjecture 1 in the case of two-dimensional quadratic programs, without any additional assumption on the data. Then, by a different approach, we give a complete solution of Conjecture 1.

3.3 DCA Sequences in Two-Dimensional Spaces

Our result on the boundedness of DCA sequences in two-dimensional spaces can be stated as follows.

Theorem 3.2 If (3.1) has a global solution and if n = 2, then every DCA sequence generated by the projection DC decomposition algorithm (3.2) is bounded.

The arguments used in proving Theorem 3.2 cannot be applied to indefinite QPs of the form (3.1) with n ≥ 3. Next, by employing some advanced tools (namely, the error bounds for affine variational inequalities due to Z. Q. Luo and P. Tseng, SIAM J. Optim., Vol. 2, 1992, pp. 43–54), we are able not only to solve Conjecture 1 completely but also to establish the R-linear convergence rate of the DCA sequences defined by (3.2). Our result shows clearly that the Pham Dinh–Le Thi algorithm can solve problem (3.1) effectively.

3.4 Complete Solution of the Conjecture

Denote by C* the KKT point set of (3.1).
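The behavior that the results above describe can be observed on a small instance. The sketch below runs iteration (3.2) on a hypothetical two-dimensional problem of our own choosing (not one of the dissertation's examples); the box feasible set makes the projection P_C a simple componentwise clip.

```python
import numpy as np

# Minimal sketch of iteration (3.2) on a hypothetical solvable
# instance (ours, not from the dissertation): the polyhedron is the
# box C = [0, 1]^2, i.e. Dx >= d with D = [I; -I], d = (0, 0, -1, -1),
# so the metric projection P_C is a componentwise clip.
Q = np.array([[2.0, 0.0],
              [0.0, -2.0]])               # symmetric, indefinite
q = np.array([-1.0, -0.5])
f = lambda x: 0.5 * x @ Q @ x + q @ x
rho = np.linalg.eigvalsh(Q).max() + 1.0   # rho > largest eigenvalue of Q

x = np.array([0.9, 0.2])
vals = [f(x)]
for _ in range(100):
    # x^{k+1} = P_C( x^k - (1/rho)(Q x^k + q) ), as in (3.2)
    x = np.clip(x - (Q @ x + q) / rho, 0.0, 1.0)
    vals.append(f(x))

# f(x^k) is nonincreasing (cf. Theorem 3.1(i)); here the iterates
# converge to the KKT point (0.5, 1), the global solution
assert all(v2 <= v1 + 1e-12 for v1, v2 in zip(vals, vals[1:]))
assert np.allclose(x, [0.5, 1.0], atol=1e-10)
```

On this instance the error ‖x^k − x*‖ shrinks roughly geometrically, in line with the R-linear rate asserted by Theorem 3.3.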
The convergence and the rate of convergence of the projection DC decomposition algorithm can be described as follows.

Theorem 3.3 If the problem (3.1) has a nonempty solution set, then for each x^0 ∈ Rn the DCA sequence {x^k} constructed by the projection DC decomposition algorithm (3.2) converges R-linearly to a KKT point of (3.1); that is, there exists x* ∈ C* such that limsup_{k→∞} ‖x^k − x*‖^{1/k} < 1.

General Conclusions

This dissertation investigates in detail the convergence and the convergence rate of iteration sequences generated by Pham Dinh–Le Thi's projection algorithms for solving two classes of nonconvex quadratic programs: (a) trust-region subproblems; (b) quadratic programs on polyhedral convex sets. Our main results include:

- A convergence theorem for a DC algorithm applied to the trust-region subproblem, and sufficient conditions for the Q-linear convergence of the DCA iteration sequences in question.

- A direct proof of the boundedness of DCA sequences in two-dimensional quadratic programming, and a theorem on the convergence and the R-linear convergence rate of DCA sequences for solving indefinite quadratic programs under linear constraints.

Our results complement and develop the corresponding published results of T. Pham Dinh and H. A. Le Thi (SIAM J. Optim., Vol. 8, 1998), T. Pham Dinh, H. A. Le Thi, and F. Akoa (Optim. Methods Softw., Vol. 23, 2008), and H. A. Le Thi, T. Pham Dinh, and N. D. Yen (J. Global Optim., Vol. 49, 2011; Vol. 53, 2012).

Investigations on the following issues would be of interest:

1. Sufficient conditions for the R-linear convergence of the DCA sequences studied in Chapter 2;

2. Q-linear convergence of the DCA sequences studied in Chapter 3;

3. Solving (3.1) globally by DCA;

4. Convergence and convergence rate of DCA sequences generated by proximal DC decomposition algorithms;

5. Applicability of the general DC algorithm discussed in Chapter 1 to optimization problems not belonging to the above classes (a) and (b).

References

[1] H. N. Tuan, DC algorithms and applications in quadratic programming, Master Thesis, Institute of Mathematics, VAST, Hanoi, 2010.

[2] H. N. Tuan, Convergence rate of the Pham Dinh–Le Thi algorithm for the trust-region subproblem, J. Optim. Theory Appl., 154 (2012), 904–915.

[3] H. N. Tuan, Boundedness of DCA iterative sequences in two-dimensional quadratic programming, J. Optim. Theory Appl., 164 (2015), 234–245.

[4] H. N. Tuan, Linear convergence of a type of iterative sequences in nonconvex quadratic programming, J. Math. Anal. Appl., 423 (2015), 1311–1319.

[5] H. N. Tuan and N. D. Yen, Convergence of Pham Dinh–Le Thi's algorithm for the trust-region subproblem, J. Global Optim., 55 (2013), 337–347.
