Jordan Canonical Form

The characteristic polynomial $f_T(x)$ of a linear operator $T$ on a vector space $V$ of dimension $n$ may, unambiguously, be defined as the characteristic polynomial of the matrix $[T]_\beta$ of $T$ with respect to any basis $\beta$ of $V$. This is because, in view of the relation $[T]_\beta = [\beta]_\alpha^{-1}[T]_\alpha[\beta]_\alpha$ for any two bases $\alpha$ and $\beta$, all such matrices are similar and so have the same characteristic polynomial.
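As a quick numerical illustration (an assumed example, not from the text), one can check that a matrix and any of its similarity transforms have the same characteristic polynomial:

```python
import numpy as np

# Similar matrices B = P^{-1} A P share the characteristic polynomial
# det(xI - .); np.poly returns its coefficients for a square matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))   # [T]_alpha, the matrix of T in one basis
P = rng.standard_normal((4, 4))   # change-of-basis matrix (invertible a.s.)
B = np.linalg.inv(P) @ A @ P      # [T]_beta, the matrix of T in another basis
print(np.allclose(np.poly(A), np.poly(B)))   # True, up to round-off
```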

If $W$ is a subspace of $V$ such that $Tw \in W$ for all $w \in W$, $W$ is said to be $T$-invariant. For such a $W$, the restriction $R$ of $T$ to $W$ is a linear operator on $W$. If a basis $w = \{w_1, w_2, \dots, w_m\}$ of $W$ is extended to an ordered basis $\beta = \{w_1, w_2, \dots, w_m, v_{m+1}, \dots, v_n\}$ of $V$, there follows

$$[T]_\beta = \begin{pmatrix} [R]_w & X \\ 0 & Y \end{pmatrix},$$

for some $m \times (n-m)$ matrix $X$ and $(n-m) \times (n-m)$ matrix $Y$. It follows that $f_T(x) = f_R(x)f_Y(x)$, showing that the characteristic polynomial of a restriction of $T$ divides the characteristic polynomial of $T$. In particular, if $f_T(x)$ is a product of linear factors, so is $f_R(x)$.

If $(\lambda, v)$ is an eigenpair of $T$ (i.e., $v \neq 0$ and $Tv = \lambda v$), then $W = \langle v \rangle$, the span of $v$, is $T$-invariant and the characteristic polynomial of the restriction of $T$ to $\langle v \rangle$ is simply $(x-\lambda)$. It follows that $(x-\lambda)$ divides $f_T(x)$. Conversely, if $(x-\lambda)$ is a factor of $f_T(x)$, then $\lambda I - T$ is singular and so there exists a vector $v$ such that $(\lambda, v)$ is an eigenpair of $T$. The above background is enough to prove the important:

Jordan Canonical Form Theorem. If $T$ is a linear operator on a finite dimensional vector space $V$ over a field $F$ such that $f_T(x)$ is a product of linear factors, there exists a basis $\beta$ of $V$ such that $[T]_\beta$ is in Jordan canonical form.

Proof: Let $\dim(V) = n$ and let the characteristic polynomial $f_T(x)$ of $T$ be a product of linear factors. Let $\lambda$ be an eigenvalue of $T$. Put $S = T - \lambda I$. Then $S$ is singular and $\dim(S(V)) < n$. By induction on the dimension of the vector space, we may assume that $S$ restricted to $S(V)$ has a Jordan basis

$$\gamma = \{z_{i,j} : 1 \le j \le l(i),\ 1 \le i \le p\} \cup \{y_{i,j} : 1 \le j \le m(i),\ 1 \le i \le q\},$$

where the $z_{i,j}$'s are associated with the blocks of zero eigenvalue and the $y_{i,j}$'s with the non-zero eigenvalues $\lambda_i$:

$$S z_{i,1} = 0, \quad S z_{i,j} = z_{i,j-1}, \quad 1 < j \le l(i),\ 1 \le i \le p,$$

$$S y_{i,1} = \lambda_i y_{i,1}, \quad S y_{i,j} = \lambda_i y_{i,j} + y_{i,j-1}, \quad 1 < j \le m(i),\ 1 \le i \le q.$$

Since $z_{i,l(i)} \in S(V)$, there exist $z_i \in V$ such that

$$S z_i = z_{i,l(i)}, \quad 1 \le i \le p.$$

Let $\{z_{i,1} : 1 \le i \le p\} \cup \{x_i : 1 \le i \le r\}$ be a basis of the null space $N(S)$ of $S$. We show that the following ordered set $\beta$ is a Jordan basis for $T$ in $V$:

$$\beta = \{z_{i,1}, z_{i,2}, \dots, z_{i,l(i)}, z_i : 1 \le i \le p\} \cup \{y_{i,j} : 1 \le j \le m(i),\ 1 \le i \le q\} \cup \{x_i : 1 \le i \le r\}.$$

To show that the set $\beta$ is linearly independent, let

$$0 = \sum_{1\le i\le p} \Big( \sum_{1\le j\le l(i)} a_{i,j} z_{i,j} + a_i z_i \Big) + \sum_{1\le i\le q} \sum_{1\le j\le m(i)} b_{i,j} y_{i,j} + \sum_{1\le i\le r} c_i x_i.$$

Operating by $S$, we have

$$0 = \sum_{1\le i\le p} \Big( \sum_{2\le j\le l(i)} a_{i,j} z_{i,j-1} + a_i z_{i,l(i)} \Big) + \sum_{1\le i\le q} \Big[ \sum_{2\le j\le m(i)} b_{i,j} (\lambda_i y_{i,j} + y_{i,j-1}) + b_{i,1} \lambda_i y_{i,1} \Big].$$

By the linear independence of the $z_{i,j}$'s and $y_{i,j}$'s we have

$$a_{i,j} = 0,\ 2 \le j \le l(i),\ 1 \le i \le p; \quad a_i = 0,\ 1 \le i \le p; \quad b_{i,j} = 0,\ 2 \le j \le m(i),\ 1 \le i \le q; \quad b_{i,1} = 0,\ 1 \le i \le q.$$

Using these, for the remaining coefficients we have

$$\sum_{1\le i\le p} a_{i,1} z_{i,1} + \sum_{1\le i\le r} c_i x_i = 0,$$

which by the linear independence of the $z_{i,1}$'s and $x_i$'s implies $a_{i,1} = 0$, $1 \le i \le p$, and $c_i = 0$, $1 \le i \le r$, establishing that $\beta$ is a linearly independent set. Note that the number of vectors $z_{i,j}$ and $y_{i,j}$ equals the rank of $S$, while the number of vectors $z_i$ and $x_i$ equals the nullity of $S$. Hence $\beta$ is a basis of $V$. Clearly $\beta$ is a Jordan basis for $S$ and hence for $T$. #

The matrix $[T]_\beta$ constructed above is itself generally referred to as a Jordan canonical form of $T$.
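For concreteness, here is a small computation (an assumed example, not from the text) using sympy, whose jordan_form method returns a similarity transform $P$ and the Jordan matrix $J$ with $A = PJP^{-1}$:

```python
from sympy import Matrix

# A 4x4 matrix with eigenvalues 1, 2 and a double eigenvalue 4 that
# carries a single 2x2 Jordan block.
A = Matrix([[ 5,  4,  2,  1],
            [ 0,  1, -1, -1],
            [-1, -1,  3,  0],
            [ 1,  1, -1,  2]])
P, J = A.jordan_form()
print(J)                        # the Jordan canonical form of A
print(A == P * J * P.inv())     # True: A = P J P^{-1}
```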

Corollary. Let $\lambda$ be an eigenvalue of $T$ and a root of order $m$ of a polynomial $P(x)$. Then, if $\lambda$ is associated with $l$ Jordan blocks of sizes $\{m_i \times m_i : 1 \le i \le l\}$ in the Jordan canonical form of $T$, the Jordan blocks corresponding to the eigenvalue $P(\lambda)$ in the Jordan canonical form of the operator $P(T)$ have the sizes $\{(m_i - m) \times (m_i - m) : m_i > m\}$, their number equalling the number of $m_i$'s larger than $m$. If $\lambda$ is not a root of $P(x)$, then in the Jordan canonical form of $P(T)$, $P(\lambda)$ is associated with the same number of Jordan blocks, of the same sizes, as is $\lambda$ in the Jordan canonical form of $T$.

 

PROBLEMS

1. Prove that to a linear operator $T$ on a vector space $V$ there corresponds a Jordan basis iff to $T$ there corresponds a triangularizing basis.

2. Show that the number of linearly independent eigenvectors possessed by a triangulable operator equals the number of Jordan blocks in its Jordan canonical form.

3. Prove that a Jordan canonical form of an operator is unique except for a reordering of its Jordan blocks.

4. If $A$ is a triangulable matrix, show that there exists a non-singular matrix $C$ such that $C^{-1}AC$ is in Jordan canonical form.

 

Rank of a Linear Transformation

The rank of a linear transformation $T : U \to V$, denoted by $r(T)$, is defined to be the dimension of its range space $R(T) = \{Tu : u \in U\}$. The null space $N(T) = \{u \in U : Tu = 0\}$ of $T$ is the subspace of vectors $u$ annihilated by $T$. The dimension of $N(T)$ is called the nullity of $T$ and is written $n(T)$.

Example. If $V$ is the vector space of all $2 \times 2$ matrices over a field $F$ and $T \in L(V)$ is defined by a fixed matrix formula [the displayed matrices of this example were lost in transcription], then, if $F$ is not of characteristic 2, $N(T)$ is a one-parameter family (parametrized by $x \in F$) and $R(T)$ a three-parameter family (parametrized by $x, y, z \in F$), so that $n(T) = 1$ and $r(T) = 3$; while if $\operatorname{ch}(F) = 2$, both $N(T)$ and $R(T)$ are two-parameter families, so that $n(T) = r(T) = 2$.

Theorem. Let $T : V \to W$ be a linear transformation. Then $r(T) + n(T) = \dim V$.

Proof: Let $\{v_1, \dots, v_r\}$ be a basis of $R(T)$. There exist $u_1, \dots, u_r \in V$ such that $Tu_i = v_i$, $i = 1, \dots, r$. Let $\{w_1, \dots, w_k\}$ be a basis of $N(T)$. Let $x \in V$ and write $Tx = \sum a_i v_i$. Then $x - \sum a_i u_i \in N(T)$, so we may write $x - \sum a_i u_i = \sum b_j w_j$. Then $x = \sum a_i u_i + \sum b_j w_j$, and therefore $\operatorname{span}\{u_1, \dots, u_r, w_1, \dots, w_k\} = V$. It remains to show that $\{u_1, \dots, u_r, w_1, \dots, w_k\}$ is an independent set. If not, there exist non-trivial scalars $p_i$, $q_j$ such that $\sum p_i u_i + \sum q_j w_j = 0$. The $p_i$'s cannot all be zero, for otherwise the independence of the $w_j$'s is contradicted. Now $0 = T(\sum p_i u_i + \sum q_j w_j) = \sum p_i v_i$, contradicting the independence of the $v_i$'s. #
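The theorem is easy to check numerically; the following sketch (with an arbitrary matrix standing in for $T$) computes both sides of $r(T) + n(T) = \dim V$:

```python
import numpy as np

# Rank-nullity for a matrix T : R^5 -> R^3 (illustrative example).
A = np.array([[1., 2., 0., 1., 3.],
              [0., 1., 1., 0., 1.],
              [1., 3., 1., 1., 4.]])   # row 3 = row 1 + row 2, so rank 2
rank = np.linalg.matrix_rank(A)
# The nullity: count directions killed by A, via the singular values.
_, s, Vt = np.linalg.svd(A)
nullity = sum(1 for i in range(A.shape[1])
              if i >= len(s) or s[i] < 1e-12)
print(rank, nullity, rank + nullity == A.shape[1])   # 2 3 True
```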

 

PROBLEMS

1. Let $T$ be a linear operator on a two dimensional vector space $V$. If $T^2 = T$, prove that $T = 0$ or $T = I$ or that there is a basis $\beta$ of $V$ such that

$$[T]_\beta = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$$

2. Let $T$ be a linear operator on a two dimensional vector space $V$. If $r(T) = 2$ and $T^2 = T$, prove that $T = I$.

3. Let $T$ be a linear operator on a two dimensional complex or real vector space. Let $\lim_{n\to\infty} T^n = T$ (in the sense that the entries of the matrix of $T^n$ approach the corresponding entries of the matrix of $T$ with respect to some basis). Prove that $T = 0$, $I$, or that for some basis $\beta$,

$$[T]_\beta = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$$

4. Let $T$ be a linear operator on a two dimensional complex or real vector space. Let $\lim_{n\to\infty} T^n = S$. Prove that $S = 0$, $I$, or that for some basis $\beta$,

$$[S]_\beta = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$$

5. Let $B$ be an $n \times p$ and $A$ an $m \times n$ matrix over a field $F$. Prove that $R(A') = N(B')$ iff $R(B) = N(A)$.

 

Linear Functionals

A linear functional $f$ on a vector space $V$ is simply a linear transformation from $V$ to $F$. The set of all linear functionals on $V$ is denoted by $V'$ and is called the algebraic dual of $V$. Thus $V' = L(V, F)$ is a vector space with the usual operations, and $V'$ is isomorphic to $V$ if $V$ is finite dimensional. The dual space $V'$ of $V$ deserves special consideration due to its role in analysing problems on $V$.

If $\beta = \{v_\alpha : \alpha \in J\}$ is a basis of $V$, $J$ being an indexing set, and $\varphi : J \to F$ is any map, the rule $f(v_\alpha) = \varphi(\alpha)$ determines the most general linear functional $f$: writing $v = \sum_{\alpha\in J} c_\alpha v_\alpha$ for $v \in V$, where all but a finite number of the $c_\alpha$'s on the right are zero so that the sum is actually finite for each $v \in V$, we have $f(v) = \sum_{\alpha\in J} c_\alpha f(v_\alpha) = \sum_{\alpha\in J} c_\alpha \varphi(\alpha)$. If $J$ is finite, for a given basis $\beta = \{v_\alpha : \alpha \in J\}$ of $V$, the linear functionals $f_\alpha$ determined by $f_\alpha(v_\omega) = \delta_{\alpha\omega}$, $\alpha, \omega \in J$, are linearly independent, and any linear functional $f$ on $V$ can be written as the linear combination $f = \sum_{\alpha\in J} f(v_\alpha) f_\alpha$. In particular, if $V = F^n$, $f$ is a linear functional on $V$, and $e_i$ is the $i$-th standard unit vector in $F^n$, then putting $y_i = f(e_i)$, $1 \le i \le n$, we get $f(x) = \sum_{1\le i\le n} f(e_i)x_i = \sum_{1\le i\le n} y_ix_i = y'x$, and the correspondence $f \leftrightarrow y \in F^n$ is obviously one-to-one and onto.
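In $F^n$ the dual basis can be computed by a matrix inversion; the sketch below (an assumed numerical example) represents each functional $f_i$ by its coefficient vector $y_i$, so that $f_i(v_j) = \delta_{ij}$ reads $YV = I$:

```python
import numpy as np

# Columns of V are the basis vectors v_1, v_2, v_3 of R^3; the rows of
# Y = V^{-1} are the coefficient vectors of the dual basis functionals f_i.
V = np.array([[2., 1., 1.],
              [1., 2., 1.],
              [1., 1., 2.]])
Y = np.linalg.inv(V)
print(np.allclose(Y @ V, np.eye(3)))   # f_i(v_j) = delta_ij: True
```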

Lemma. $f_1, f_2, \dots, f_m \in V'$ are l.i. iff there exist vectors $u_1, u_2, \dots, u_m \in V$ such that $\det(f_i(u_j)) \neq 0$.

Proof: If the $f_i$'s are linearly dependent, one of them is a linear combination of the others, and then the corresponding row of $(f_i(u_j))$ is a linear combination of the other rows, so that $\det(f_i(u_j)) = 0$ for every choice of the $u_j$'s. This proves the 'if' part. For the 'only if' part, we note that if $m = 1$ the result is true: $\{f_1\}$ l.i. is equivalent to $f_1 \neq 0$, in which case there exists a $u_1$ such that $f_1(u_1) \neq 0$. Thus assume the result for $m$ and let $f_1, f_2, \dots, f_m, f_{m+1}$ be l.i. Then $f_1, f_2, \dots, f_m$ are l.i. and so there exist $u_1, \dots, u_m$ such that $\det(f_i(u_j))_{1\le i,j\le m} \neq 0$. Let $u$ be an arbitrary vector and consider

$$\det \begin{pmatrix} f_1(u_1) & \cdots & f_1(u_m) & f_1(u) \\ \vdots & & \vdots & \vdots \\ f_{m+1}(u_1) & \cdots & f_{m+1}(u_m) & f_{m+1}(u) \end{pmatrix}.$$

Suppose that for no $u$ is it different from zero. Let $C_1, C_2, \dots, C_{m+1}$ denote the cofactors corresponding to the elements in the last column. Then $0 = \sum_{1\le i\le m+1} C_i f_i(u)$ for all $u$ implies that $\sum_{1\le i\le m+1} C_i f_i = 0$. By the linear independence of the $f_i$'s, $C_{m+1} = 0$. But $C_{m+1} = \det(f_i(u_j))_{1\le i,j\le m} \neq 0$, a contradiction. This completes the proof. #

Lemma. $f_1, f_2, \dots, f_m \in V'$ are l.i. iff there exist $v_1, v_2, \dots, v_m \in V$ such that $f_i(v_j) = \delta_{ij}$, $i, j = 1, 2, \dots, m$.

Proof: If $f_i(v_j) = \delta_{ij}$, then $\det(f_i(v_j)) = 1 \neq 0$, so by the previous lemma $f_1, f_2, \dots, f_m$ are linearly independent. Conversely, if $f_1, f_2, \dots, f_m$ are l.i., by the previous lemma there exist vectors $u_1, \dots, u_m$ such that $\det(f_i(u_j)) \neq 0$. Then for each $1 \le k \le m$ the system $\sum_{1\le j\le m} a_{kj} f_i(u_j) = \delta_{ki}$, $i = 1, \dots, m$, has a solution. Hence, putting $v_k = \sum_{1\le j\le m} a_{kj} u_j$, we have $f_i(v_k) = \delta_{ik}$, $i, k = 1, \dots, m$. #

Theorem. $T \in L(V, W)$ has rank $m$ iff there exist l.i. $f_1, f_2, \dots, f_m \in V'$ and l.i. $w_1, \dots, w_m \in W$ such that $T = \sum_{1\le i\le m} f_i w_i$, i.e., $Tv = \sum_{1\le i\le m} f_i(v) w_i$, $v \in V$.

Proof: If $T$ is of the given form, $R(T) \subset \operatorname{span}\{w_1, \dots, w_m\}$. By the previous lemma there exist $v_1, \dots, v_m$ with $f_i(v_j) = \delta_{ij}$, so that $T(v_j) = \sum_{1\le i\le m} f_i(v_j) w_i = \sum_{1\le i\le m} \delta_{ij} w_i = w_j$, and hence $w_j \in R(T)$, $j = 1, \dots, m$. As the $w_j$'s are linearly independent, $\dim R(T) = \dim(\operatorname{span}\{w_1, \dots, w_m\}) = m$. Conversely, let $r(T) = m$ and let $w_1, \dots, w_m$ be a basis of $R(T)$. Then, for some functions $f_i$, $i = 1, \dots, m$, $Tu = \sum_{1\le i\le m} f_i(u) w_i$. Thus $\sum_{1\le i\le m} f_i(au+bv) w_i = T(au+bv) = aTu + bTv = \sum_{1\le i\le m} (af_i(u) + bf_i(v)) w_i$. The linear independence of the $w_i$'s implies that $f_i(au+bv) = af_i(u) + bf_i(v)$ for $i = 1, \dots, m$ and for all vectors $u, v$ and scalars $a, b$; thus each $f_i$ is a linear functional. If the $f_i$'s are not l.i., w.l.g. let $f_m = \sum_{1\le i\le m-1} a_i f_i$. Then $Tu = \sum_{1\le i\le m} f_i(u) w_i = \sum_{1\le i\le m-1} f_i(u)(w_i + a_iw_m)$, implying that $r(T) \le m-1$, a contradiction. Thus the $f_i$'s and $w_i$'s are l.i., completing the proof. #

Corollary. Let $A$ be an $m \times n$ matrix. Then $r(A) = r$ iff there exist $r$ l.i. $m \times 1$ vectors $w_1, w_2, \dots, w_r$ and $r$ l.i. $n \times 1$ vectors $u_1, u_2, \dots, u_r$ such that

$$A = \sum_{1\le i\le r} w_i u_i' = (w_1, \dots, w_r) \begin{pmatrix} u_1' \\ \vdots \\ u_r' \end{pmatrix}.$$
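Such a full-rank factorization is easy to produce numerically; one way (a sketch using the SVD, which is only one of several valid constructions) is:

```python
import numpy as np

# Rank factorization A = sum_i w_i u_i' = W U' from the SVD of A.
A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.]])          # rank 2: row 2 = 2 * row 1
U_, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))            # numerical rank
W = U_[:, :r] * s[:r]                 # r linearly independent m x 1 vectors
U = Vt[:r, :].T                       # r linearly independent n x 1 vectors
print(r, np.allclose(A, W @ U.T))     # 2 True
```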

Corollary. Let $f, f_1, f_2, \dots, f_k \in V'$ and let $N(f), N(f_1), \dots, N(f_k)$ denote their null spaces. Then $f$ is a linear combination of $f_1, f_2, \dots, f_k$ iff $\bigcap_{1\le i\le k} N(f_i) \subset N(f)$.

Proof: Without loss of generality we can assume that $f_1, f_2, \dots, f_m$ are l.i., $m \le k$, and that the $f_i$, $i > m$, can be written as l.c.'s of $f_1, f_2, \dots, f_m$. Now $N_k = \bigcap_{1\le i\le k} N(f_i) \subset \bigcap_{1\le i\le m} N(f_i) = N_m$. Also, if $j > m$ and $v \in N_m$, then $f_j(v) = 0$, i.e., $v \in N(f_j)$. Hence $N_m \subset N(f_j)$, $j \ge m+1$. It follows that $N_k = N_m$. Hence it is sufficient to prove that $f$ is an l.c. of $f_1, f_2, \dots, f_m$ iff $N_m \subset N(f)$, where $f_1, f_2, \dots, f_m$ are l.i.

Now, if $f$ is an l.c. of the $f_i$'s, obviously $\bigcap N(f_i) \subset N(f)$. And if $f$ is not an l.c. of the $f_i$'s, then $f, f_1, f_2, \dots, f_m$ are l.i. (as the $f_i$'s are). But then, by the preceding lemma, there is a $v$ such that $f(v) = 1$ and $f_i(v) = 0$, $1 \le i \le m$, and so $\bigcap N(f_i)$ is not contained in $N(f)$. #

PROBLEMS

1. In $\mathbb{R}^3$, let $u_1 = (2,1,1)'$, $u_2 = (1,2,1)'$, $u_3 = (1,1,2)'$. (A) If $f$ is a linear functional on $\mathbb{R}^3$ such that $f(u_1) = 1$, $f(u_2) = 2$, $f(u_3) = 3$, and if $u = (1,2,3)'$, find $f(u)$. (B) Describe explicitly a linear functional $f$ on $\mathbb{R}^3$ such that $f(u_1+u_2) = f(u_2+u_3) = 0$ but $f(u_1+u_3) \neq 0$. (C) Let $f$ be any linear functional such that $f(u_1+u_2) = f(u_2+u_3) = 0$ and $f(u_1+u_3) \neq 0$. If $u = (1,1,1)'$, show that $f(u) \neq 0$.

2. Let $u_1 = (1,1,1)'$, $u_2 = (1,\omega,\omega^2)'$, and $u_3 = (1,\omega^2,\omega)'$, where $\omega \neq 1$ is a cube root of 1. Is $\beta = \{u_1, u_2, u_3\}$ a basis for $\mathbb{C}^3$? If so, find the dual basis of $\beta$. What is the matrix of the identity operator relative to $\beta$?

3. Let $V$ be a finite dimensional vector space, $\beta = \{u_1, \dots, u_n\}$ a basis of $V$ and $\omega = \{f_1, \dots, f_n\}$ the dual basis. If $T \in L(V)$, we define the trace of $T$ by $\operatorname{tr} T = \sum_{1\le i\le n} f_i(Tu_i)$. Show that: (a) the trace function is independent of the choice of the basis $\beta$; (b) $\operatorname{tr} TS = \operatorname{tr} ST$; and (c) $\operatorname{tr} T = \operatorname{tr} [T]_\beta$ for any basis $\beta$ of $V$, where the trace of a matrix is the familiar sum of its diagonal entries.

4. Let $V$ be the vector space of all real polynomial functions $p(x) = c_0 + c_1x + \dots + c_{n-1}x^{n-1}$ on $\mathbb{R}$ of degree $< n$. Define the linear functionals $f_k$ on $V$ by $f_k(p) = \int_0^k p(x)\,dx$, $1 \le k \le n$. Show that $\{f_1, f_2, \dots, f_n\}$ is a basis for $V'$. Exhibit the basis for $V$ of which it is the dual.

5. If $T$ and $S$ are linear operators on a finite dimensional vector space over a field of characteristic zero, show that the relation $TS - ST = I$ is impossible.

6. Show that every $T \in L(F^n, F^m)$ can be expressed as $Tx = (f_1(x), \dots, f_m(x))'$, $x \in F^n$, for some linear functionals $f_1, \dots, f_m$ on $F^n$, uniquely determined by $T$. Use this result to show that $N(T)$ can be written as an intersection of $m$ subspaces of $F^n$, each of dimension $n-1$. What are $f_1, f_2, \dots, f_m$ if $n = m$ and $T = I$ on $F^n$?

7. If $u_1 = (1,1,1,1)$, $u_2 = (1,2,3,4)$, $u_3 = (4,3,2,1)$, and if $W$ is the span of $u_1, u_2$ and $u_3$ in the row space $\mathbb{C}^{(4)}$, characterize the linear functionals $f : f(x) = y'x$ lying in the annihilator $W^0$ of $W$, by finding a basis of $W^0$.

8. Find a basis for $W^0$ if $W$ denotes the subspace of $\mathbb{R}^5$ spanned by the three vectors $u_1 = e_1+e_2+e_3$, $u_2 = e_2+e_3+e_4$, $u_3 = e_3+e_4+e_5$.

9. Let $P$ denote the $2\times 2$ real symmetric matrix such that $P^2 = P$ and $(1,1)Px = 0$ for all $x \in \mathbb{R}^2$. Let $W$ consist of the matrices $A \in \mathbb{R}^{2\times 2}$ such that $AP = 0$. Are there $f \in W^0$ such that $f(B) = 0$ and $f(C) = 1$, if [the matrices $B$ and $C$ were lost in transcription]?

10. If the linear functionals $f_k$ on $\mathbb{R}^n$ are defined by $f_k(x_1, \dots, x_n) = \sum_{1\le j\le n} (k^2 - j^2)x_j$, $1 \le k \le n$, find the dimension of the subspace annihilated by $f_1, \dots, f_n$.

11. If $W_1$ and $W_2$ are subspaces of $F^n$, prove that: (a) $(W_1+W_2)^0 = W_1^0 \cap W_2^0$; (b) $(W_1 \cap W_2)^0 = W_1^0 + W_2^0$.

12. If $\varphi$ is any non-zero linear functional on the vector space $F[x]$ of polynomials over a field $F$ such that $\varphi(fg) = \varphi(f)\varphi(g)$ for all $f, g \in F[x]$, show that for some $c \in F$, $\varphi(f) = f(c)$, $f \in F[x]$.

13. Let $V$ be a vector space over a field $F$ and let $W$ be a subspace of $V$. If $f$ is a linear functional on $W$, show that it can be extended to a linear functional on the whole of $V$, i.e., there is a linear functional $g$ on $V$ such that $g(w) = f(w)$, $w \in W$. Find the dimension of the subspace of all such functionals $g$ if $V$ is finite dimensional.

14. Let $F$ be a field having at least three distinct elements. Let $V$ be any vector space over $F$. If $f$ and $g$ are linear functionals on $V$ such that the function $h$ defined by $h(v) = f(v)g(v)$ is also a linear functional on $V$, prove that either $f = 0$ or $g = 0$. If $F$ has no more than two elements, verify that the conclusion is false.

15. Let $F$ be any infinite field and let $V$ be a finite-dimensional vector space over $F$. If $v_1, \dots, v_m$ are finitely many non-zero vectors in $V$, prove that there is a linear functional $f$ on $V$ such that $f(v_i) \neq 0$, $1 \le i \le m$.

16. If $W$ is an $n$-dimensional vector space over a field $F$, the dual space $W'$, being isomorphic to $W$, could be 'naturally' identified with $W$ itself. In particular, if $W = F^{(n)}$, then $f \in W'$ iff $f$ has the form $f(x_1, x_2, \dots, x_n) = w_1x_1 + w_2x_2 + \dots + w_nx_n$, where $(w_1, w_2, \dots, w_n) \in W$ itself. Now let $n$ be a positive integer and $F$ a field. Let $W$ be the set of row vectors $(x_1, x_2, \dots, x_n)$ in $F^{(n)}$ such that $x_1 + x_2 + \dots + x_n = 0$. Is it true that $W'$ consists of the linear functionals $f(x_1, x_2, \dots, x_n) = c_1x_1 + c_2x_2 + \dots + c_nx_n$ on $F^{(n)}$ which satisfy $c_1 + c_2 + \dots + c_n = 0$? Verify that this happens iff the characteristic of the field, $\operatorname{ch}(F)$, does not divide $n$.

17. Let $z \in V = F^n$ and let $W = \{x \in F^n : z'x = 0\}$. Let $g \in V'$ be defined by $g(x) = z'x$, $x \in V$. Show that: (a) $W^0$ is precisely the one dimensional subspace spanned by the linear functional $g$; (b) $W'$ can be 'naturally' identified with the restrictions to $W$ of the linear functionals $f$ on $V$ of the form $f(x) = w'x$, $w \in W$, iff $z'z \neq 0$; and (c) if $z'z = 0$, then $W'$ can be identified with the restrictions to $W$ of all $f \in V'$ of the form $f(x) = u'x$, $u \in U$, where $U$ is any subspace of $F^n$ such that $F^n = \langle z \rangle \oplus U$.

18. Characterize the subspaces $W$ of an infinite dimensional vector space $V$ such that $W$ is finite dimensional.

19. If $W$ is a finite dimensional subspace of a vector space $V$, show that $W^0 = \{f \in V' : f(w) = 0,\ w \in W\}$ is finite dimensional iff $V$ is finite dimensional.

20. If $W$ is a subspace of a finite dimensional vector space $V$, and $\{f_1, \dots, f_r\} \subset V'$ is any basis for $W^0$, show that $W = \bigcap_{1\le i\le r} N(f_i)$.

21. Let $V$ denote the vector space of all functions from a given set $S$ into a field $F$, with the usual operations. Show that for any $n$-dimensional subspace $W$ of $V$ there exist points $x_1, \dots, x_n \in S$ and functions $f_1, \dots, f_n \in W$ such that $f_i(x_j) = \delta_{ij}$, $1 \le i, j \le n$.

22. Let $V$ and $W$ be, respectively, $n$ and $m$-dimensional vector spaces over a field $F$. If $R \in L(V)$, $S \in L(W)$ and $T$ is the linear operator defined on $L(V, W)$ by $T(U) = SUR$, prove that $\operatorname{tr}(T) = \operatorname{tr}(S)\operatorname{tr}(R)$.

23. Let $V$ be a finite dimensional vector space over a field $F$. If $f$ is a linear functional on $L(V)$ such that $f(TS) = f(ST)$ for all $S, T \in L(V)$, show that $f$ is a scalar multiple of the trace function on $L(V)$. If, in addition, $f(I) = n$, and if the characteristic of the field $F$ does not divide $n$, then $f$ is the trace function.

24. If $V$ is a finite dimensional vector space, any operator $T$ on $V$ of the form $T = RS - SR$, $S, R \in L(V)$, is of trace zero. Show that there are enough operators of this type to span the subspace of all linear operators on $V$ of trace zero.

25. If $V$ is a vector space, show that each $v \in V$ defines a linear functional $\varphi_v$ on $V'$ by $\varphi_v(f) = f(v)$, $f \in V'$, and that $V$ is 'naturally' isomorphic to the subspace $\{\varphi_v : v \in V\}$ of $(V')'$ under the correspondence $v \leftrightarrow \varphi_v$.

26. Prove that $V'$ is isomorphic to $V$ iff $\dim V < \infty$.

 

The Transpose of a Linear Transformation

The transpose of a linear transformation $T \in L(V, W)$ is a transformation in $L(W', V')$, written $T^t$ or $T'$, defined by $(T'f)(v) = f(Tv)$, $f \in W'$, $v \in V$, in which the right hand side is linear in $f$, $T$, as well as $v$, so that $T'$ is unambiguously defined; moreover, $(aT + bS)' = aT' + bS'$, $a, b \in F$, $T, S \in L(V, W)$, showing that the operation of taking the transpose is a linear operation from $L(V, W)$ to $L(W', V')$. If we agree to write $g(v)$ as $(v, g)$, the relation between $T$ and $T'$ can be expressed simply as $(Tv, g) = (v, T'g)$, $g \in W'$, $v \in V$.

The dual space $V''$ of the dual space $V'$ of $V$ is called the bi-dual or the double dual of $V$. The map $i : v \mapsto \Phi_v \in V''$, $v \in V$, defined by $\Phi_v(f) = f(v)$, $f \in V'$, is a homomorphism of $V$ into $V''$ (a linear transformation from $V$ to $V''$), the homomorphic image of $V$ under $i$ being $R(i)$, the range of $i$. Since $f(v) = 0$ for all $f \in V'$ implies that $v = 0$, $i$ is an isomorphism of $V$ into $V''$ (i.e., $i$ is invertible as a transformation from $V$ to $R(i) \subset V''$). If $V$ is finite dimensional, $\dim V = \dim V' = \dim V'' = \dim R(i)$, so that $R(i) = V''$ and $i$ is an isomorphism between $V$ and $V''$ (an isomorphism of $V$ onto $V''$).

Hence, in case $V$ and $W$ are finite dimensional vector spaces, $V''$ and $W''$ can be identified, respectively, with $V$ and $W$, in the sense that any $\Phi \in V''$ is, for some $v \in V$, of the form $\Phi_v$ given by $\Phi_v(f) = f(v)$, $f \in V'$, and, similarly, any $\Psi \in W''$ is a $\Psi_g$, for a $g$ in $W$, given by $\Psi_g(\psi) = \psi(g)$, $\psi \in W'$. With these identifications, the transpose of the transpose of $T$, $T'' : V \to W$, is given by $\psi(T''v) = (T'\psi)(v) = \psi(Tv)$, $\psi \in W'$, $v \in V$. Since this relation holds for all $\psi \in W'$, $T''v = Tv$, $v \in V$, which in turn, as $v \in V$ is arbitrary, implies that $T'' = T$. Hence $T''$, for a linear transformation $T$ from a finite dimensional vector space $V$ to another finite dimensional vector space $W$, gets identified with the transformation $T$ itself.

Taking $V = F^n$ and $W = F^m$, and identifying $L(F^n, F^m)$ with the space $F^{m\times n}$ of $m \times n$ matrix transformations, $(F^n)'$ with $F^{(n)}$ and $(F^m)'$ with $F^{(m)}$, we have, for $T = A$, $(T'z)(x) = z'Tx = z'Ax = (A'z)'(x)$ for all $x \in F^n$ and $z \in F^m$. Hence $T'$ can be identified with the ordinary transpose $A'$ of the matrix $A$ when $T = A$, a reason for the nomenclature 'transpose'.
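Numerically, the defining identity $(Tv, g) = (v, T'g)$ is just the familiar matrix identity $z'(Ax) = (A'z)'x$; a quick illustrative check:

```python
import numpy as np

# (T'z)(x) = z'(Ax) = (A'z)'x for the matrix transformation T = A.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))   # T = A : R^4 -> R^3
x = rng.standard_normal(4)        # v in V
z = rng.standard_normal(3)        # a functional on R^3, as a column vector
print(np.isclose(z @ (A @ x), (A.T @ z) @ x))   # True
```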

 

PROBLEMS

1. Describe $T'$ if $T$ equals: (a) 0, the zero transformation from $V$ to $W$; (b) $I$, the identity operator from $V$ to $V$; and (c) $f$, a linear functional from $V$ to $F$.

2. If $T$ is a linear transformation, show that the transformations $T \to T'$ and $T \to T''$ are non-singular.

3. If $T$ is a linear operator on a finite dimensional vector space $V$, show that $T$, $T'$ and $T''$ have the same characteristic and minimal polynomials and the same eigenvalues, with the same algebraic and geometric multiplicities.

4. Let $V$ be a finite dimensional vector space over a field $F$, $\beta$ a basis of $V$ and $\beta'$ the basis dual to $\beta$. Let $F$, as a vector space over itself, have its standard basis $e = \{1\}$. Let $f \in V'$. Determine $[f]_{\beta'}$ and $[f]_\beta \equiv [f]_{\beta,e}$, and the relation between $[f]_{\beta'}$ and $([f]_\beta)'$.

5. Let $V$ and $W$ be finite dimensional and $T \in L(V, W)$. If $[T]_\beta$ denotes the matrix of $T$ with respect to a pair of ordered bases $\beta_V$ and $\beta_W$ of $V$ and $W$, and $[T']_{\beta'}$ the matrix of $T'$ with respect to the dual bases $\beta_W'$ and $\beta_V'$, show that $[T']_{\beta'} = ([T]_\beta)'$.

6. Describe $T'$ and $T'f$ explicitly and compute $(T'f)(v)$ if:

(a) $T(a,b,c) = a+b+c$, $f(a) = -a$, $v = (1,2,3)$;

(b) $T(a,b,c) = (a+b, b+c)$, $f(a,b) = b-a$, $v = (-1,1,-1)$; and

(c) $T(a,b,c) = (a+b, b+c, c+a)$, $f(a,b,c) = c+b+a$, $v = (1,-1,1)$.

7. Let $V$ be the vector space of polynomials over $\mathbb{R}$, let $D$ denote the differentiation operator, $Dp(x) = p'(x)$, and $J$ the indefinite integral operator, $Jp(x) = \int_0^x p(t)\,dt$. Give an explicit description of $D'$ and $J'$ and find $[D'J_a - J'D_a]p$, if $J_a p = Jp(a)$ and $D_a p = Dp(a)$.

8. Let $T$ and $S$ be linear operators on a vector space $V$ such that, for some non-zero $a \in V$, $Ta = Sa$. Prove that there is a non-zero linear functional $f$ on $V$ such that $T'f = S'f$.

9. If $V$ is a finite-dimensional vector space, prove that $T \to T^t$ is an isomorphism of $L(V, V)$ onto $L(V', V')$. Is the statement true if $V$ is not finite dimensional?

10. If $V$ is the vector space of $n\times n$ matrices over a field $F$ and $P$ is any $n\times n$ permutation matrix, verify that: (a) for each $B \in V$, $f_B(A) = \operatorname{trace}(B'AP)$, $A \in V$, defines a linear functional on $V$; (b) every linear functional on $V$ is an $f_B$ for some $B$; and (c) $B \to f_B$ is an isomorphism of $V$ onto $V'$.

11. Let $A$ be an $m\times n$ matrix with real entries. Are $AA^t$ and $A^tA$ defined if $A^t$ denotes the transpose of the matrix transformation $A$ from $\mathbb{R}^n$ to $\mathbb{R}^m$? Are $AA'$ and $A'A$ defined if $A'$ denotes the transpose of the matrix $A$?

12. Let $i_n : F^n \to F^{(n)}$ be defined by $i_n x = x'$. If $A$ denotes an $m\times n$ matrix transformation and $A'$ the transpose of $A$, find the matrix of the operator $T_A = i_n^{-1}A'i_mA$ on $F^n$ in the standard ordered basis, and prove that $\operatorname{trace}(T_A) = 0$ iff $A = 0$.

13. Determine the transpose of the differentiation operator $D$ on the vector space of real polynomials of degree not exceeding $n$. Determine $R(D')$ and $N(D')$ and a basis for each of these.

 

Invariant Subspaces and Direct Sum of Operators

Let $T$ be a linear operator on a vector space $V$. A subspace $W$ of $V$ is called invariant with respect to $T$, or simply $T$-invariant, if $TW = \{Tw : w \in W\} \subset W$ (i.e., the action of $T$ on $W$ does not take one outside $W$).

Thus if $A = \begin{pmatrix} a & b \\ 0 & c \end{pmatrix}$ and $B$ is any matrix whose $(2,1)$ entry is non-zero [the displayed matrices were lost in transcription; this reading is consistent with the discussion below], the subspace $F_1$ of $F^2$ spanned by $e_1$ is $A$-invariant but not $B$-invariant.

If $W$ is $T$-invariant, $T$ can be regarded as an operator on $W$ in its own right. This defines the operator $T|_W : W \to W$, the restriction of $T$ to $W$, by the rule $(T|_W)w = Tw$, $w \in W$. Note that if $W$ is not $T$-invariant, $T|_W$ is not defined as an operator from $W$ to $W$, but can still be regarded as a linear transformation from the vector space $W$ to the vector space $V$.

If $T$ is a linear operator on a vector space $V$, a $T$-invariant subspace $W$ of $V$ is said to possess a $T$-complementary subspace if there exists another $T$-invariant subspace $W^c$ such that $V = W \oplus W^c$. For example, the one dimensional $A$-invariant subspace $F_1$ considered above has an $A$-complementary subspace iff $c \neq a$ or $b = 0$.

A direct sum decomposition $V = W_1 \oplus W_2 \oplus \dots \oplus W_k$ of the vector space $V$ is called a $T$-invariant direct sum decomposition, or simply an invariant direct sum decomposition provided the operator $T$ under consideration is implicitly understood, if each $W_i$ is $T$-invariant. In this case, if $T_i = T|_{W_i}$, one writes $T = T_1 \oplus T_2 \oplus \dots \oplus T_k$ and says that the operator $T$ is a direct sum of the operators $T_i$.

An invariant direct sum decomposition of $V$ enables one to reduce a problem concerning the operator $T$ by transferring it to the lower dimensional problems for the operators $T_i$ on the vector spaces $W_i$. In this connection the following result is evident:

THEOREM. If $V = W_1 \oplus W_2 \oplus \dots \oplus W_k$ is a $T$-invariant direct sum decomposition of $V$, $\beta_i$ is a basis of $W_i$, $T_i$ is the restriction of $T$ to $W_i$, and $\beta$ denotes the pooled basis $\{\beta_1, \beta_2, \dots, \beta_k\}$, then $[T]_\beta = \operatorname{diag}([T_1]_{\beta_1}, [T_2]_{\beta_2}, \dots, [T_k]_{\beta_k})$.

The projection operators $E_i$, $1 \le i \le k$, associated with a direct sum decomposition $V = W_1 \oplus W_2 \oplus \dots \oplus W_k$ are defined as follows. A vector $v \in V$ can be uniquely written as $v = w_1 + w_2 + \dots + w_k$, where $w_i \in W_i$. Writing $E_i v = w_i$, it follows that, for $1 \le i \le k$: (1) $E_i$ is a linear operator with range $W_i$; (2) $E_i^2 = E_i$; (3) $E_iE_j = 0$, $i \neq j$; and (4) $I = E_1 + E_2 + \dots + E_k$.

In general, a linear operator $P$ on a vector space $V$ is called a projection if $P^2 = P$. If $P$ is a projection, it is easily verified that $N(P) = R(I-P)$ and $N(I-P) = R(P)$, that $V = R(P) \oplus N(P)$, and that $P$ and $I-P$ are the projections associated with this direct sum decomposition. One says that $P$ is the projection of $V$ on the subspace $R(P)$ along the subspace $R(I-P) = N(P)$.
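A tiny numerical instance (assumed for illustration): the matrix below is idempotent and realizes the projection of $\mathbb{R}^2$ on $\operatorname{span}\{(1,0)'\}$ along $\operatorname{span}\{(1,1)'\}$:

```python
import numpy as np

P = np.array([[1., -1.],
              [0.,  0.]])
print(np.allclose(P @ P, P))               # P^2 = P: P is a projection
v = np.array([3., 2.])
w1, w2 = P @ v, (np.eye(2) - P) @ v        # w1 in R(P), w2 in N(P)
print(w1, w2, np.allclose(w1 + w2, v))     # v = w1 + w2 uniquely
```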

 

PROBLEMS

1. Let $E$ be a projection ($E^2 = E$) on $V$ and let $T$ be a linear operator on $V$. Prove that the range of $E$ is invariant under $T$ if and only if $ETE = TE$, and that the null space of $E$ is invariant under $T$ iff $ETE = ET$. Hence deduce that both the range and the null space of $E$ are invariant under $T$ if and only if $ET = TE$.

2. Let $A \in \mathbb{C}^{2\times 2}$ and $W = \{(x, x)' : x \in \mathbb{C}\} \subset \mathbb{C}^2$. Find conditions on $a_{11}, a_{12}, a_{21}, a_{22}$ such that $W$ has an $A$-complementary subspace.

3. If $T$ is a linear operator on a finite-dimensional vector space $V$, verify that $R = R(T)$, the range of $T$, and $N = N(T)$, the null space of $T$, are $T$-invariant, and that $V = R \oplus N$ iff $R \cap N = \{0\}$.

4. Let $V = W_1 \oplus W_2 \oplus \dots \oplus W_k$, where each $W_i$ is invariant under a linear operator $T$. Express (a) the determinant, (b) the characteristic polynomial, and (c) the minimal polynomial of $T$ in terms of those for the restriction operators $T_i = T|_{W_i}$.

5. If $E_i$, $1 \le i \le k$, are projections such that $E_1 + \dots + E_k = I$, prove that $T = c_1E_1 + \dots + c_kE_k$, where the $c_i$'s are scalars, is diagonalizable, and conversely.

6. Characterize the linear operators on a finite dimensional vector space $V$ which commute with every projection operator on $V$.

7. Let $P$ denote the space of all polynomials on $(-\infty, \infty)$ and let $P_e$ and $P_o$ denote the subspaces of even, respectively odd, polynomials. (a) Show that $P = P_e \oplus P_o$. (b) If $T$ is the indefinite integral operator $(Tf)(x) = \int_0^x f(t)\,dt$ and $D$ the differentiation operator $Df(x) \equiv df(x)/dx$, for which of the operators $T, D, T^2, D^2, \dots$ are $P_e$ and $P_o$ invariant?

8. Can a $T$-invariant subspace $W$ possess two distinct $T$-complementary subspaces?

 

Norms on Finite Dimensional Real or Complex Vector Spaces

A norm on a vector space $V$ over the real or complex field $K$ is a map $\|\cdot\| : V \to [0, \infty)$ satisfying, for $x, y \in V$ and $c \in K$: (a) $\|x\| = 0$ iff $x = 0$; (b) $\|cx\| = |c|\,\|x\|$ (homogeneity); and (c) $\|x+y\| \le \|x\| + \|y\|$ (triangle inequality). A vector space with a norm defined on it is called a normed linear space, or just a normed space. It is called real if $K = \mathbb{R}$, and complex if $K = \mathbb{C}$.

The axioms of a metric, or a distance function, on a set $M$ are: $d(\cdot,\cdot) : M \times M \to [0, \infty)$ such that, for $x, y, z \in M$: (a) $d(x,y) = 0$ iff $x = y$; (b) $d(x,y) = d(y,x)$ (symmetry); and (c) $d(x,z) \le d(x,y) + d(y,z)$ (triangle inequality). A set $M$ along with a metric $d(\cdot,\cdot)$ on it constitutes a metric space.

A normed space $(V, K, \|\cdot\|)$ has a natural metric associated with it, given by $d(x,y) = \|x - y\|$; it is termed the metric induced by the norm on $V$.

By the convergence of a sequence $\{x^{(k)}\} \to x$ in a metric $d(\cdot,\cdot)$ we mean $d(x^{(k)}, x) \to 0$ as $k \to \infty$. Convergence in a norm means convergence in the metric induced by the norm, described above.

 

Examples of Norms: Let $V = K^n \ni x = (x_1, x_2, \dots, x_n)'$.

(a) $\|x\|_\infty = \max\{|x_j| : 1 \le j \le n\}$ (max or sup norm).

(b) $\|x\|_2 = \sqrt{x^*x} = (\sum_{1\le j\le n} |x_j|^2)^{1/2}$ (the $l_2$, Euclidean ($K = \mathbb{R}$), or unitary ($K = \mathbb{C}$) norm).

(c) $\|x\|_1 = \sum_{1\le j\le n} |x_j|$ (the $l_1$ norm).

(d) More generally, for $p \ge 1$, the $l_p$ norm $\|x\|_p = (\sum_{1\le j\le n} |x_j|^p)^{1/p}$, of which the above norms are special cases (where we note that $\lim_{p\to\infty} \|x\|_p = \|x\|_\infty$).
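These norms, and the limit in (d), are easy to inspect numerically (an illustrative sketch):

```python
import numpy as np

x = np.array([3., -4., 1.])
for p in (1, 2, 10, 100):
    print(p, np.linalg.norm(x, p))        # l_p norms decrease toward ...
print('inf', np.linalg.norm(x, np.inf))  # ... the sup norm, here 4.0
```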

 

PROBLEMS

1. Prove that a positive multiple of a norm and the sum of two norms are norms. Is the geometric mean of two norms a norm?

2. Consider all possible positive numbers $a$ and $b$. For what pairs $a, b$ is the function $N(x) = (\sum_{1\le i\le n} |x_i|^a)^b$ a norm on $K^n$? For what $a, b$ is $d(x,y) = (\sum_{1\le i\le n} |x_i - y_i|^a)^b$ a metric on $K^n$?

3. Find all norms on $K$ and show that $\|x\|/\|y\|$, $x, y \in K$, $y \neq 0$, does not depend on the choice of the norm.

4. Consider $\mathbb{R}^n$ with the Euclidean metric. Let $K$ be a closed, bounded and convex neighborhood of the origin in $\mathbb{R}^n$ such that $x \in K$ implies that $-x \in K$ (i.e., $K$ is symmetric about the origin). Show that $K$ defines a norm on $\mathbb{R}^n$ by declaring its boundary to be the unit sphere. Show also that every norm on $\mathbb{R}^n$ arises in this fashion.

 

The Hölder and Minkowski Inequalities

 

Hölder's Inequality. Let $x, y \in \mathbb{R}^n$ with $x_i, y_i \ge 0$, $1 \le i \le n$. Let $1 < p < \infty$ and $1/p + 1/q = 1$ (i.e., $q = p/(p-1)$). Then $\sum_{1\le i\le n} x_iy_i \le (\sum_{1\le i\le n} x_i^p)^{1/p} (\sum_{1\le i\le n} y_i^q)^{1/q}$, where the equality holds iff the vectors $x^p \equiv (x_1^p, x_2^p, \dots, x_n^p)'$ and $y^q \equiv (y_1^q, y_2^q, \dots, y_n^q)'$ are linearly dependent.

Proof: If $x$ or $y$ equals $0$ the result is trivial, and so is the case $n = 1$. Hence, using induction on $n$, we may assume the result for $\mathbb{R}^m$ with $m < n$, and work with $x, y \neq 0$. We note that the quotient $Q_n(x, y) = (\sum_{1\le i\le n} x_iy_i)\,/\,[(\sum_{1\le i\le n} x_i^p)^{1/p} (\sum_{1\le i\le n} y_i^q)^{1/q}]$ is invariant with respect to multiplication of $x$ and $y$ by positive scalars. Hence, as we are to prove that $Q_n(x, y) \le 1$, with equality iff $x^p, y^q \neq 0$ are linearly dependent, without loss of generality we may restrict ourselves to the region $\mathcal{R}_n = \{(x, y) \in \mathbb{R}^n \times \mathbb{R}^n : 1 \le \sum_{1\le i\le n} x_i^p,\ \sum_{1\le i\le n} y_i^q \le 2,\ x_i, y_i \ge 0\}$. Since $\mathcal{R}_n$ is a bounded and closed region in $\mathbb{R}^n \times \mathbb{R}^n$, and $Q_n(x, y)$ is a continuous function on it, it attains its maximum at some point of $\mathcal{R}_n$. Note that, since $\partial Q/\partial x_j = [y_j - x_j^{p-1}(\sum_{1\le i\le n} x_iy_i)/(\sum_{1\le i\le n} x_i^p)]/D$ and $\partial Q/\partial y_k = [x_k - y_k^{q-1}(\sum_{1\le i\le n} x_iy_i)/(\sum_{1\le i\le n} y_i^q)]/D$, where $D = (\sum_{1\le i\le n} x_i^p)^{1/p} (\sum_{1\le i\le n} y_i^q)^{1/q}$, $Q$ does not attain its maximum at a boundary point of $\mathcal{R}_n$ at which some $x_j$ or $y_k$ equals zero but the corresponding $y_j$ or $x_k$ is positive, as in this case $Q$ increases towards the interior of $\mathcal{R}_n$. If, however, the corresponding $y_j$ or $x_k$ also equals zero, we are reduced to a problem in $\mathbb{R}^m$ with $m < n$. Hence, in view of the invariance of $Q$ with respect to the scalar multiplications, without loss of generality we can assume that the maximum of $Q$ is attained at an interior point of $\mathcal{R}_n$. By the theory of local maxima and minima, then, $\partial Q/\partial x_j = 0$ and $\partial Q/\partial y_k = 0$, $1 \le j, k \le n$, each of which is equivalent to $x^p$ and $y^q$ being linearly dependent. For computing the local maximum we can assume that $x^p = y^q$ (due to the invariance with respect to scalar multiplication). As $x^p = y^q$ means $y_i^q = x_i^p$, so that $x_iy_i = x_i^{1+p/q} = x_i^{1+p-1} = x_i^p$ for $1 \le i \le n$, in this case $Q(x, y) = 1$. The result follows. #
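A quick numerical spot check of the inequality (an illustrative sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
p = 3.0
q = p / (p - 1.0)
x, y = rng.random(6), rng.random(6)            # non-negative entries
lhs = np.sum(x * y)
rhs = np.sum(x**p)**(1/p) * np.sum(y**q)**(1/q)
print(lhs <= rhs)   # True; equality needs x**p, y**q linearly dependent
```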

Corollary. If $x_i, y_i \ge 0$ and $\sum_{i\ge 1} x_i^p, \sum_{i\ge 1} y_i^q < \infty$, the series $S = \sum_{i\ge 1} x_iy_i$ converges and $S \le (\sum_{i\ge 1} x_i^p)^{1/p} (\sum_{i\ge 1} y_i^q)^{1/q}$.

Minkowski's Inequality. Let $x, y \in \mathbb{R}^n$ with $x_i, y_i \ge 0$, $1 \le i \le n$. If $1 \le p < \infty$, then $[\sum_{1\le i\le n} (x_i+y_i)^p]^{1/p} \le (\sum_{1\le i\le n} x_i^p)^{1/p} + (\sum_{1\le i\le n} y_i^p)^{1/p}$, with equality (for $p > 1$) iff the vectors $x$ and $y$ are linearly dependent.

Proof: If $p = 1$, the inequality follows from the triangle inequality. So let $p > 1$. Writing $\sum_{1\le i\le n} (x_i+y_i)^p = \sum_{1\le i\le n} x_i(x_i+y_i)^{p-1} + \sum_{1\le i\le n} y_i(x_i+y_i)^{p-1}$, and using Hölder's inequality for the first term on the right, $\sum_{1\le i\le n} x_i(x_i+y_i)^{p-1} \le (\sum_{1\le i\le n} x_i^p)^{1/p} (\sum_{1\le i\le n} (x_i+y_i)^{(p-1)q})^{1/q} = (\sum_{1\le i\le n} x_i^p)^{1/p} (\sum_{1\le i\le n} (x_i+y_i)^p)^{1/q}$. Similarly, for the second term we have the inequality $\sum_{1\le i\le n} y_i(x_i+y_i)^{p-1} \le (\sum_{1\le i\le n} y_i^p)^{1/p} (\sum_{1\le i\le n} (x_i+y_i)^p)^{1/q}$.

Minkowski's inequality follows after adding these two inequalities and dividing both sides by $(\sum_{1\le i\le n} (x_i+y_i)^p)^{1/q}$.

For the equality, we must have equality in both of the above inequalities, i.e., the pairs $x^p, (x+y)^p$ and $y^p, (x+y)^p$ must be linearly dependent, which is easily seen to be the case iff $x$ and $y$ are linearly dependent. This completes the proof. #

Corollary. The set $l_p$ of infinite sequences $x = \{x_1, x_2, \dots\}$ over $K$ for which $\sum_{i\ge 1} |x_i|^p < \infty$, where $1 \le p < \infty$, is a normed vector space over $K$, with component-wise addition and scalar multiplication and with the norm defined by $\|x\| = (\sum_{i\ge 1} |x_i|^p)^{1/p}$.

 

Equivalence of Norms on Finite Dimensional Spaces

If $h$ is an isomorphism from a vector space $U$ to a vector space $V$ and $\|\cdot\|$ is a norm on $V$, then $\|x\|_U := \|h(x)\|$ defines a norm on $U$, induced by the norm $\|\cdot\|$ on $V$.

Theorem. All norms on a finite dimensional vector space $V$ are equivalent, in the sense that for any two norms $\|\cdot\|_A$ and $\|\cdot\|_B$ there exist positive constants $p, q$ such that $p\|x\|_A \le \|x\|_B \le q\|x\|_A$ for all $x \in V$.

Proof: Due to the isomorphism of an $n$-dimensional vector space over $K$ with $K^n$, it is sufficient to consider the case $V = K^n$, $\|\cdot\|_A = \|\cdot\|_\infty$ and $\|\cdot\|_B = \|\cdot\|$, the latter being any norm on $K^n$. Let $e_i$, $1 \le i \le n$, denote the standard unit vectors in $K^n$. By the triangle inequality, $\|x\| = \|\sum_{1\le i\le n} x_ie_i\| \le \sum_{1\le i\le n} |x_i|\,\|e_i\| \le \|x\|_\infty \sum_{1\le i\le n} \|e_i\| = q\|x\|_\infty$, say.

If the reverse type of inequality does not hold, i.e., if there is no positive number $p$ such that $p\|x\|_\infty \le \|x\|$ for all $x \in V$, then there is a sequence $\{x^{(m)} \in V\}_{m\ge 1}$ for which $\|x^{(m)}\|_\infty / \|x^{(m)}\| \to \infty$. Using the homogeneity property $\|cx\| = |c|\,\|x\|$ of any norm, without loss of generality we may assume that $\|x^{(m)}\|_\infty = 1$ for all $m$. Then, by the Bolzano-Weierstrass principle, $\{x^{(m)}\}$ has a $\|\cdot\|_\infty$-convergent subsequence with limit $x^{(0)} \neq 0$, say. For the subsequence we shall use the same notation $\{x^{(m)}\}$. Then,

$$\|x^{(m)}\| \ge \|x^{(0)}\| - \|x^{(m)} - x^{(0)}\| \ge \|x^{(0)}\| - \sum_{1\le i\le n} |x^{(m)}_i - x^{(0)}_i|\,\|e_i\| \to \|x^{(0)}\| \neq 0, \quad \text{as } m \to \infty.$$

Hence the sequence $\{\|x^{(m)}\|\}$ is bounded away from zero, and since $\|x^{(m)}\|_\infty = 1$, it follows that $\|x^{(m)}\|_\infty / \|x^{(m)}\|$ cannot diverge to $\infty$, a contradiction. #

Corollary. If $V$ is a finite dimensional vector space over $K$ and $V \ni x^{(m)} \to x \in V$, as $m \to \infty$, in some norm, then $x^{(m)} \to x$, as $m \to \infty$, in any norm on $V$.
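For the pair $\|\cdot\|_\infty$, $\|\cdot\|_1$ on $\mathbb{R}^n$ the equivalence constants can be taken as $p = 1$ and $q = n$; the sketch below samples random vectors and confirms the standard chain of bounds:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
for _ in range(1000):
    x = rng.standard_normal(n)
    ninf = np.linalg.norm(x, np.inf)
    n1 = np.linalg.norm(x, 1)
    assert ninf <= n1 <= n * ninf   # ||x||_inf <= ||x||_1 <= n ||x||_inf
print("equivalence bounds hold on all samples")
```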

 

Matrix Norms

A norm $\|\cdot\|$ on the space $K^{n\times n}$ of $n\times n$ matrices over $K$ is called a matrix norm if, in addition to the norm axioms, it also satisfies $\|AB\| \le \|A\|\,\|B\|$ for all $A, B \in K^{n\times n}$.

As is easily verified, a norm on $K^n$, i.e., on the $n\times 1$ vectors, induces a norm on $K^{n\times n}$ by defining

$$\|A\| = \sup_{x\neq 0} \|Ax\|/\|x\| = \max\{\|Ax\| : \|x\| = 1\}, \quad A \in K^{n\times n}.$$

Since this means that $\|Ax\| \le \|A\|\,\|x\|$, we have $\|(AB)x\| = \|A(Bx)\| \le \|A\|\,\|Bx\| \le \|A\|\,\|B\|\,\|x\|$, from which it follows that $\|AB\| \le \|A\|\,\|B\|$. Hence an induced norm on matrices is always a matrix norm.

The above notion of an induced norm can be generalized to rectangular matrices, and in general to linear operators $T : U \to V$, by defining $\|T\| = \sup_{u\neq 0} \|Tu\|/\|u\|$, where $\|Tu\|$ refers to a norm on $V$ and $\|u\|$ to the norm on $U$. Then if $S : V \to W$, with the induced norm $\|S\| = \sup_{v\neq 0} \|Sv\|/\|v\|$, where $\|Sv\|$ refers to a norm on $W$, we have $\|STu\| \le \|S\|\,\|Tu\| \le \|S\|\,\|T\|\,\|u\|$, implying that $\|ST\| \le \|S\|\,\|T\|$.

Examples. Let the matrix norms induced by the vector norms $\|\cdot\|_1$, $\|\cdot\|_2$ and $\|\cdot\|_\infty$ be denoted by the same notations. Then

(i) $\|A\|_2 = \sqrt{\rho(A^*A)} = \|A\|_S$ (the spectral norm);

(ii) $\|A\|_\infty = \max_{1\le i\le n} \sum_{1\le j\le n} |a_{ij}|$ (the maximum absolute row sum);

(iii) $\|A\|_1 = \max_{1\le j\le n} \sum_{1\le i\le n} |a_{ij}|$ (the maximum absolute column sum).

To prove these forms of the respective induced norms, we have: (i) $\|A\|_2^2 = \max_{x\neq 0} (Ax, Ax)/(x, x) = \max_{x\neq 0} (A^*Ax, x)/(x, x) = \rho(A^*A)$. (ii) $\|A\|_\infty = \max_i \{|\sum_{1\le j\le n} a_{ij}x_j| : \max_j |x_j| = 1\} \le \max_i \sum_{1\le j\le n} |a_{ij}| = M$, say. If this maximum is attained for $i = k$, say, then choosing $x$ by $x_j = \operatorname{sgn} a_{kj}$, $j = 1, 2, \dots, n$, where $\operatorname{sgn} z = |z|/z$ for $z \neq 0$, and zero if $z = 0$, we find $\|A\|_\infty \ge M$, so that $\|A\|_\infty = \max_i \sum_{1\le j\le n} |a_{ij}|$. Finally, (iii) $\|A\|_1 = \max\{\sum_{1\le i\le n} |\sum_{1\le j\le n} a_{ij}x_j| : \sum_{1\le j\le n} |x_j| = 1\} \le \max\{\sum_{1\le j\le n} (\sum_{1\le i\le n} |a_{ij}|)|x_j| : \sum_{1\le j\le n} |x_j| = 1\} \le \max_j \sum_{1\le i\le n} |a_{ij}| = N$, say. If $N$ is attained for $j = k$, then with $x_j = \delta_{jk}$, $j = 1, 2, \dots, n$, we have $\|Ax\|_1 = \sum_{1\le i\le n} |\sum_{1\le j\le n} a_{ij}\delta_{jk}| = \sum_{1\le i\le n} |a_{ik}| = N$, so that, as $\|x\|_1 = 1$, also $\|A\|_1 \ge N$. Hence $\|A\|_1 = \max_{1\le j\le n} \sum_{1\le i\le n} |a_{ij}|$. #
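These closed forms can be checked against numpy's built-in matrix norms (illustrative):

```python
import numpy as np

A = np.array([[1., -2.,  3.],
              [0.,  4., -1.],
              [2.,  1.,  1.]])
row_sum = np.max(np.sum(np.abs(A), axis=1))               # max abs row sum
col_sum = np.max(np.sum(np.abs(A), axis=0))               # max abs col sum
spectral = np.sqrt(np.max(np.linalg.eigvalsh(A.T @ A)))   # sqrt(rho(A*A))
print(np.isclose(row_sum, np.linalg.norm(A, np.inf)),     # ||A||_inf
      np.isclose(col_sum, np.linalg.norm(A, 1)),          # ||A||_1
      np.isclose(spectral, np.linalg.norm(A, 2)))         # ||A||_2
```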

The Frobenius norm $\|A\|_F$ of $A \in \mathbb{C}^{n\times n}$ is defined by $\|A\|_F^2 = \operatorname{tr}(A^*A)$, i.e., $\|A\|_F = \sqrt{\operatorname{tr}(A^*A)}$. As $\operatorname{tr}(A^*A) = \sum_{1\le i\le n}\sum_{1\le j\le n} |a_{ij}|^2$, $\|A\|_F \ge 0$ and $\|A\|_F = 0$ iff $A = 0$. Further, $\|cA\|_F = |c|\,\|A\|_F$ is clear. For proving the triangle inequality: $|\operatorname{tr}(A^*B)| = |\sum_{1\le i\le n}\sum_{1\le j\le n} \bar a_{ji} b_{ji}| \le \|A\|_F\,\|B\|_F$, by the Cauchy-Schwarz inequality. Similarly, $|\operatorname{tr}(B^*A)| \le \|A\|_F\,\|B\|_F$. Hence $\|A+B\|_F^2 = \|A\|_F^2 + \operatorname{tr}(A^*B) + \operatorname{tr}(B^*A) + \|B\|_F^2 \le (\|A\|_F + \|B\|_F)^2$, showing that the triangle inequality, $\|A+B\|_F \le \|A\|_F + \|B\|_F$, holds.

To prove that $\|\cdot\|_F$ is a matrix norm: $\|AB\|_F^2 = \operatorname{tr}((AB)^*AB) = \operatorname{tr}(B^*A^*AB) = \operatorname{tr}(A^*ABB^*)$. Since the trace functional is invariant under a unitary similarity, $A^*A$ may be taken to be a diagonal matrix $D$, and $BB^*$ a p.s.d. $P$. Then, as $d_i, p_{ii} \ge 0$,

$$\|AB\|_F^2 = \operatorname{tr}(DP) = \sum_{1\le i\le n} d_ip_{ii} \le \Big(\sum_{1\le i\le n} d_i\Big)\Big(\sum_{1\le i\le n} p_{ii}\Big) = \|A\|_F^2\,\|B\|_F^2,$$

proving that $\|AB\|_F \le \|A\|_F\,\|B\|_F$.

Note, however, that $\|\cdot\|_F$ is not an induced norm for $n \ge 2$. This may be seen, e.g., by noting that $\|I\|_F = \sqrt{n}$, whereas for any induced norm $\|\cdot\|$, $\|I\| = \max\{\|Ix\| : \|x\| = 1\} = 1$. [Note that $\|\cdot\|_F/n$, $n \ge 2$, fails to be a matrix norm at all: e.g., take $A = \operatorname{diag}(1, 0, \dots, 0) = B$ to get the contradiction $1/n \le 1/n^2$.]
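The non-inducedness is visible already in the identity matrix (illustrative check):

```python
import numpy as np

n = 4
I = np.eye(n)
print(np.linalg.norm(I, 'fro'))   # sqrt(n) = 2.0, the Frobenius norm
print(np.linalg.norm(I, 2))       # 1.0, as for every induced norm
```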

A matrix norm on $\mathbb{C}^{n\times n}$ is called unitarily invariant if $\|A\| = \|AU\| = \|UA\|$ for every $A \in \mathbb{C}^{n\times n}$ and unitary $U \in \mathbb{C}^{n\times n}$.

Theorem. The Frobenius norm $\|A\|_F = (\operatorname{tr} A^*A)^{1/2}$ and the spectral norm $\|A\|_S = (\rho(A^*A))^{1/2}$ are unitarily invariant. If $\|\cdot\|$ is any unitarily invariant matrix norm, there holds the inequality $\|A\| \ge \|A\|_S$ for any $A \in \mathbb{C}^{n\times n}$.

Proof: $\|AU\|_F^2 = \operatorname{tr}((AU)^*AU) = \operatorname{tr}(U^*A^*AU) = \operatorname{tr}(A^*AUU^*) = \operatorname{tr}(A^*A) = \|A\|_F^2$, and $\|UA\|_F^2 = \operatorname{tr}((UA)^*UA) = \operatorname{tr}(A^*U^*UA) = \operatorname{tr}(A^*A) = \|A\|_F^2$. Similarly, $\|UA\|_S^2 = \rho((UA)^*UA) = \rho(A^*A) = \|A\|_S^2$, and $\|AU\|_S^2 = \rho((AU)^*AU) = \rho(U^*A^*AU) = \rho(A^*A) = \|A\|_S^2$. Finally, let $A = U\Lambda V^*$ (s.v.d.). Then, for a unitarily invariant matrix norm $\|\cdot\|$, $\|A\| = \|U\Lambda V^*\| = \|\Lambda V^*\| = \|\Lambda\| \ge \rho(\Lambda) = (\rho(AA^*))^{1/2} = (\rho(A^*A))^{1/2} = \|A\|_S$. #
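A random-unitary check of the invariance (illustrative; the unitary comes from a QR factorization):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Q, _ = np.linalg.qr(rng.standard_normal((4, 4))
                    + 1j * rng.standard_normal((4, 4)))   # Q is unitary
for norm in ('fro', 2):
    print(np.isclose(np.linalg.norm(A, norm), np.linalg.norm(Q @ A, norm)),
          np.isclose(np.linalg.norm(A, norm), np.linalg.norm(A @ Q, norm)))
```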

 

PROBLEMS

1. Let $A$ be an $n\times n$ complex matrix and $W$ unitary. Show that $AW^*$ is positive definite iff $\|A - W\|_F \le \|A - U\|_F$ for all unitary $U$.

2. Prove that $\rho(A) \le \|A\|_S \le \|A\|_F$.

3. Is $\|A\|_{L_\infty} = \max_{i,j} |a_{ij}|$ a matrix norm on $\mathbb{C}^{n\times n}$?

4. For what $p \ge 1$ is $\|A\|_{L_p} = (\sum_{1\le i\le n}\sum_{1\le j\le n} |a_{ij}|^p)^{1/p}$ a matrix norm on $\mathbb{C}^{n\times n}$? Is the answer $1 \le p \le 2$, for $n > 1$?