Convex Analysis

Convex Analysis

by Ralph Tyrell Rockafellar
     
 

View All Available Formats & Editions

Available for the first time in paperback, R. Tyrrell Rockafellar's classic study presents readers with a coherent branch of nonlinear mathematical analysis that is especially suited to the study of optimization problems. Rockafellar's theory differs from classical analysis in that differentiability assumptions are replaced by convexity assumptions. The topics

Overview

Available for the first time in paperback, R. Tyrrell Rockafellar's classic study presents readers with a coherent branch of nonlinear mathematical analysis that is especially suited to the study of optimization problems. Rockafellar's theory differs from classical analysis in that differentiability assumptions are replaced by convexity assumptions. The topics treated in this volume include: systems of inequalities, the minimum or maximum of a convex function over a convex set, Lagrange multipliers, minimax theorems and duality, as well as basic results about the structure of convex sets and the continuity and differentiability of convex functions and saddle- functions.

This book has firmly established a new and vital area not only for pure mathematics but also for applications to economics and engineering. A sound knowledge of linear algebra and introductory real analysis should provide readers with sufficient background for this book. There is also a guide for the reader who may be using the book as an introduction, indicating which parts are essential and which may be skipped on a first reading.

Editorial Reviews

Edinburgh Mathematical Society
This book should remain for some years as the standard reference for anyone interested in convex analysis.
— J. D. Pryce
Edinburgh Mathematical Society - J.D. Pryce
This book should remain for some years as the standard reference for anyone interested in convex analysis.
Edinburgh Mathematical Society - J. D. Pryce
This book should remain for some years as the standard reference for anyone interested in convex analysis.
From the Publisher
"This book should remain for some years as the standard reference for anyone interested in convex analysis."—J. D. Pryce, Edinburgh Mathematical Society

Product Details

ISBN-13:
9780691015866
Publisher:
Princeton University Press
Publication date:
12/23/1996
Series:
Princeton Landmarks in Mathematics and Physics Series
Edition description:
Reprint
Pages:
469
Product dimensions:
5.98(w) x 8.92(h) x 0.96(d)

Read an Excerpt

Convex Analysis


By R. Tyrrell Rockafellar

PRINCETON UNIVERSITY PRESS

Copyright © 1970 Princeton University Press
All rights reserved.
ISBN: 978-0-691-08069-7



CHAPTER 1

SECTION 1

Affine Sets

Throughout this book, R denotes the real number system, and Rn is the usual vector space of real n-tuples x = ([xi]1 ..., [xi]n). Everything takes place in Rn unless otherwise specified. The inner product of two vectors x and x* in Rn is expressed by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The same symbol A is used to denote an m X n real matrix A and the corresponding linear transformation x -> Ax from Rn to Rn. The transpose matrix and the corresponding adjoint linear transformation from Rn to Rn are denoted by A*, so that one has the identity

= .

(In a symbol denoting a vector, * has no operational significance; all vectors are to be regarded as column vectors for purposes of matrix multiplication. Vector symbols involving * are used from time to time merely to bring out the familiar duality between vectors considered as points and vectors considered as the coefficient n-tuples of linear functions.) The end of a proof is signalled by [parallel].

If x and y are different points in Rn, the set of points of the form

(1 - λ)x + λy = x + λ(y - x), λ [member of] R,

is called the line through x and y. A subset M of Rn is called an affine set if (1 - λ)x + λy [member of] M for every x [member of] M, y [member of] M and A [member of] R. (Synonyms for "affine set" used by other authors are "affine manifold," "affine variety," "linear variety" or "flat.")

The empty set Ø and the space Rn itself are extreme examples of affine sets. Also covered by the definition is the case where M consists of a solitary point. In general, an affine set has to contain, along with any two different points, the entire line through those points. The intuitive picture is that of an endless uncurved structure, like a line or a plane in space.

The formal geometry of affine sets may be developed from the theorems of linear algebra about subspaces of Rn. The exact correspondence between affine sets and subspaces is described in the two theorems which follow.

Theorem 1.1. The subspaces of Rn are the affine sets which contain the origin.

Proof. Every subspace contains 0 and, being closed under addition and scalar multiplication, is in particular an affine set.

Conversely, suppose M is an affine set containing 0. For any x [member of] M and λ [member of] R, we have

λx = (1 - λ)0 + λx [member of] M,

so M is closed under scalar multiplication. Now, if x [member of] M and y [member of] M, we have

1/2 (x + y) = 1/2x + (1 - 1/2)y [member of] M,

and hence

x + y = 2(|1/2 + y)) [member of] M.

Thus M is also closed under addition and is a subspace. [parallel]|

For M [subset] Rn and a [member of] Rn, the translate of M by a is defined to be the set

M + a=-{x + a\x [member of] M}.

A translate of an affine set is another affine set, as is easily verified.

An affine set M is said to be parallel to an affine set L if M = L + a for some a. Evidently "M is parallel to L" is an equivalence relation on the collection of affine subsets of Rn. Note that this definition of parallelism is more restrictive than the everyday one, in that it does not include the idea of a line being parallel to a plane. One has to speak of a line which is parallel to another line within a given plane, and so forth.

Theorem 1.2. Each non-empty affine set M is parallel to a unique subspace L. This L is given by

L = M - M = {x - y\ x [member of] M, y [member of] M}.

Proof. Let us show first that M cannot be parallel to two different subspaces. Subspaces L1 and L2 parallel to M would be parallel to each other, so that L2 = L1 + a for some a. Since 0 [member of] L2, we would then have - a [member of L1, and hence a [member of] L1. But then L1 [contains] + a = L2. By a similar argument L2L1, so L1 = L2. This establishes the uniqueness. Now observe that, for any y [member of] M, M - y = M + (-y) is a translate of M containing 0. By Theorem 1.1 and what we have just proved, this affine set must be the unique subspace L parallel to M. Since L = M - y no matter which y [member of] M is chosen, we actually have L = M - M. [parallel]|

The dimension of a non-empty affine set is defined as the dimension of the subspace parallel to it. (The dimension of Ø is -1 by convention.) Naturally, affine sets of dimension 0, 1 and 2 are called points, lines and planes, respectively. An (n - 1)-dimensional affine set in Rn is called a hyperplane. Hyperplanes are very important, because they play a role dual to the role of points in n-dimensional geometry.

Hyperplanes and other affine sets may be represented by linear functions and linear equations. It is easy to deduce this from the theory of orthogonality in Rn. Recall that, by definition, x [perpendicular to] j means <x, y> = 0. Given a subspace L of Rn, the set of vectors x such that x [perpendicular to] L, i.e. x [perpendicular to] y for every y [member of] L, is called the orthogonal complement of L, denoted L[perpendicular to]. It is another subspace, of course, and

dim L + dim L[perpendicular to] = n.

The orthogonal complement (L[perpendicular to])[perpendicular to] of L[perpendicular to] is in turn L. If b1, ..., bm is a basis for L, then x [perpendicular to] L is equivalent to the condition that x [perpendicular to] b1, ..., x [perpendicular to] b 1. In particular, the (n - l)-dimensional subspaces of Rn are the orthogonal complements of the one-dimensional subspaces, which are the subspaces L having a basis consisting of a single non-zero vector b (unique up to a non-zero scalar multiple). Thus the (n- l)-dimensional subspaces are the sets of the form {x | x [perpendicular to] b}, where b [not equal] 0. The hyperplanes are the translates of these. But

{x | x [perpendicular to] b} + a = {x + a | = 0} = {y | = 0} = {y\ - β},

where β = <a, b>. This leads to the following characterization of hyper- planes.

Theorem 1.3. Given β [member of] R and a non-zero b [member of] Rn, the set

H = {x | = β}

is a hyperplane in Rn. Moreover, every hyperplane may be represented in this way, with b and β unique up to a common non-zero multiple.

In Theorem 1.3, the vector b is called a normal to the hyperplane H. Every other normal to H is either a positive or a negative scalar multiple of b. A good interpretation of this is that every hyperplane has "two sides," like one's picture of a line in R2 or a plane in R3. Note that a plane in R4 would not have "two sides," any more than a line in R3 has.

The next theorem characterizes the affine subsets of Rn as the solution sets to systems of simultaneous linear equations in n variables.

Theorem 1.4. Given b e Rm and an m X n real matrix B, the set

M = {x [member of] Rn\Bx = b}

is an affine set in Rn. Moreover, every affine set may be represented in this way.

Proof. If x [member of] M, y [member of] M and λ [member of] R, then for z = (1 - λ)x + λy one has

Bz = (1 - λ)Bx + λBy = (1 - λ)b + λb = b,

so z [member of] M. Thus the given M is affine.

On the other hand, starting with an arbitrary non-empty affine set M other than Rn itself, let L be the subspace parallel to M. Let b1, ..., bm be a basis for L[perpendicular to]. Then

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where B is the m × n matrix whose rows are b1, ..., bm. Since M is parallel to L, there exists an a [member of] Rn such that

M = L + a - {x | B(x - a) = 0} = {x | Bx = b},

where b = Ba. (The affine sets Rn and Ø can be represented in the form in the theorem by taking B to be the m × n zero matrix, say, with b = 0 in the case of Rn and b ≠ 0 in the case of Ø .)[parallel]

Observe that in Theorem 1.4 one has

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where bi, is the ith row of B, βi, is the ith component of b, and

Hi = (x | <x, bi = βi}.

Each Hi is a hyperplane (bi ≠ 0), or the empty set (bi , = 0, βi ≠ 0), or Rn (bi , = 0, βi = 0). The empty set may itself be regarded as the inter- section of two different parallel hyperplanes, while Rn may be regarded as the intersection of the empty collection of hyperplanes of Rn. Thus:

Corollary 1.4.1. Every affine subset of Rn is an intersection of a finite collection of hyperplanes.

The affine set M in Theorem 1.4 can be expressed in terms of the vectors b'1, ..., b'n which form the columns of B by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Obviously, the intersection of an arbitrary collection of affine sets is again affine. Therefore, given any S [subset] Rn there exists a unique smallest affine set containing S (namely, the intersection of the collection of affine sets M such that M [contains] S). This set is called the affine hull of S and is denoted by aff S. It can be proved, as an exercise, that aff S consists of all the vectors of the form [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], such that xi [member of] S and [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

A set of m + 1 points b0, b1, ..., bm is said to be affinely independent if aff {b0, b1, ..., bm} is m-dimensional. Of course

aff {b0, b1, ..., bm} = L + b0,

where

L = aff {0, b0, b1, ..., bm- b0}.


By Theorem 1.1, L is the same as the smallest subspace containing b1 - b0, b1, ..., bm - b0. Its dimension is m if and only if these vectors are linearly independent. Thus b0, b1, ..., bm are affinely independent if and only if b1 - b0, b1, ..., bm - b0 are linearly independent.

All the facts about linear independence can be applied to affine independence in the obvious way. For instance, any affinely independent set of m + 1 points in Rn can be enlarged to an affinely independent set of n + 1 points. An m-dimensional affine set M can be expressed as the affine hull of m + 1 points (translate the points which correspond to a basis of the subspace parallel to M).

Note that, if M = aff {b0, b1, ..., bm}, the vectors in the subspace L parallel to M are the linear combinations of b1- b0, b1, ..., bm - b0. The vectors in M are therefore those expressible in the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

i.e. in the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

The coefficients in such an expression of .v are unique if and only if b0, b1, ..., bm are affinely independent. In that event, λ0, λ1, ..., λm, as parameters, define what is called a barycentric coordinate system for M.

A single-valued mapping T: x -> Tx from Rn to Rm is called an affine transformation if

T(1 - λ)x + λy) = (1 - λ)Tx + λTy

for every x and y in Rn and λ [member of] R.

Theorem 1.5. The affine transformationsfrom Rn to Rm are the mappings T of the form Tx = Ax + a, where A is a linear transformation and a [member of] Rm.

Proof. If T is affine, let a = T0 and Ax = Tx - a. Then A is an affine transformation with A0 = 0. A simple argument resembling the one in Theorem 1.1 shows that A is actually linear.

Conversely, if Tx = Ax + a where A is linear, one has

T((1 -λ)x + λy) = (1 - λ)Ax + λAy + a = (1 - λ)Tx + λTy.

Thus T is affine. [parallel]

The inverse of an affine transformation, if it exists, is affine.

As an elementary exercise, one can demonstrate that if a mapping T from Rn to Rm is an affine transformation the image set TM = {Tx | x [member] M} is affine in Rm for every affine set M in Rn. In particular, then, affine transformations preserve affine hulls:

aff (TS) = T(aff S).

Theorem 1.6. Let {b0, b1, ..., bm} and {b'0, b'1,..., b'm} be affinely independent sets in Rn. Then there exists a one-to-one affine transformation T of Rn onto itself, such that Tb1 = b'i for I = 0, ..., m. If m = n, T is unique.

Proof. Enlarging the given affinely independent sets if necessary, we can reduce the question to the case where m = n. Then, as is well known in linear algebra, there exists a unique one-to-one linear transformation A of Rn onto itself carrying the basis b1 - b0, ..., bn - b0 of Rn onto the basis b1 - b'0, ..., b'n - b'0. The desired affine transformation is then given by Tx = Ax + a, where a = b'0 - Ab0. [parallel]

Corollary 1.6.1. Let M1and M2be any two affine sets in Rn of the same dimension. Then there exists a one-to-one affine transformation T of Rn onto itself such that TM1 = M2.

Proof. Any m-dimensional affine set can be expressed as the affine hull of an affinely independent set of m + 1 points, and affine hulls are preserved by affine transformations. [parallel]

The graph of an affine transformation T from Rn to Rm is an affine subset of Rn+m. This follows from Theorem 1.4, for if Tx = Ax + a the graph of T consists of the vectors z = (x, y), x [member of] Rn and y [member of] Rm, such that Bz = b, where b = -a and B is the linear transformation (x, y) ->Ax - y from Rn+m to Rm.


(Continues...)

Excerpted from Convex Analysis by R. Tyrrell Rockafellar. Copyright © 1970 Princeton University Press. Excerpted by permission of PRINCETON UNIVERSITY PRESS.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

Customer Reviews

Average Review:

Write a Review

and post it to your social network

     

Most Helpful Customer Reviews

See all customer reviews >