Chapter 7
Jordan Canonical Form

7.1 Jordan Canonical Form Theorem

We start this chapter with the following theorem, which generalizes the Schur Upper Triangularization Theorem.

Theorem 7.1.1. [Generalized Schur's Theorem] Let A ∈ Mn(ℂ). Suppose λ1, …, λk are the distinct eigenvalues of A with multiplicities m1, …, mk, respectively. Then, there exists a non-singular matrix W such that

\[
W^{-1}AW = \bigoplus_{i=1}^{k} T_i, \qquad \text{where } T_i \in M_{m_i}(\mathbb{C}), \ \text{for } 1 \le i \le k,
\]
and the Ti's are upper triangular matrices with constant diagonal λi. If A has real entries with real eigenvalues then W can be chosen to have real entries.

Proof. By Schur Upper Triangularization (see Lemma 6.2.12), there exists a unitary matrix U such that U*AU = T, an upper triangular matrix with diag(T) = (λ1, …, λ1, λ2, …, λ2, …, λk, …, λk), where λi appears mi times.

Now, for any upper triangular matrix B, a real number α and i < j, consider the matrix F(B, i, j, α) = Eij(-α) B Eij(α), where the matrix Eij(α) is defined in Definition 2.1.13. Then, for 1 ≤ k, ℓ ≤ n,
\[
(F(B,i,j,\alpha))_{k\ell} =
\begin{cases}
B_{ij} - \alpha B_{jj} + \alpha B_{ii}, & \text{whenever } k = i,\ \ell = j,\\
B_{i\ell} - \alpha B_{j\ell}, & \text{whenever } k = i,\ \ell \ne j,\\
B_{kj} + \alpha B_{ki}, & \text{whenever } k \ne i,\ \ell = j,\\
B_{k\ell}, & \text{otherwise.}
\end{cases}
\tag{7.1.1}
\]

Now, using Equation (7.1.1), the diagonal entries of F(T,i,j,α) and T are equal and

                {
(F(T,i,j,α))  =    Tij,  whenever Tjj = Tii          T
            ij      0,   whenever Tjj ⁄= Tii and α = Tjji-jTii.
Thus, if we denote the matrix F(T,i,j,α) by T1 then (F(T1, i-1, j, α))_{i-1,j} = 0, for some choice of α, whenever (T1)_{i-1,i-1} ≠ T_{jj}. Moreover, this operation also preserves the 0 created by F(T,i,j,α) at the (i,j)-th place. Similarly, F(T1, i, j+1, α) preserves the 0 created by F(T,i,j,α) at the (i,j)-th place. So, we can successively apply the following sequence of operations to get
\[
T \to F(T, m_1, m_1+1, \alpha) = T_1 \to F(T_1, m_1-1, m_1+1, \beta) \to \cdots \to F(T_{m_1-1}, 1, m_1+1, \gamma) = T_{m_1},
\]
where α, β, …, γ are appropriately chosen and T_{m_1}[:, m_1+1] = λ2 e_{m_1+1}. Thus, observe that the above operation can be applied for different choices of i and j with i < j to get the required result. _

Exercise 7.1.2. Apply Theorem 7.1.1 to the matrix given below for better understanding.

\[
\left[\begin{array}{ccc|ccc|ccc}
1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9\\
0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8\\
0 & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7\\ \hline
0 & 0 & 0 & 2 & 3 & 4 & 5 & 6 & 7\\
0 & 0 & 0 & 0 & 2 & 3 & 4 & 5 & 6\\
0 & 0 & 0 & 0 & 0 & 2 & 3 & 4 & 5\\ \hline
0 & 0 & 0 & 0 & 0 & 0 & 3 & 4 & 5\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 3 & 4\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3
\end{array}\right].
\]
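To experiment with the similarity operation used in the proof of Theorem 7.1.1, one may implement F(B, i, j, α) = Eij(-α)BEij(α) directly. The following Python sketch does this for a small ad hoc upper triangular matrix (the helper names E and F are not from the text); it checks that the diagonal is unchanged and that the chosen α kills the (i, j)-th entry.
\begin{verbatim}
import numpy as np

def E(n, i, j, alpha):
    """Elementary matrix E_ij(alpha) = I + alpha * e_i e_j^T (0-based indices)."""
    M = np.eye(n)
    M[i, j] += alpha
    return M

def F(B, i, j, alpha):
    """The similarity B -> E_ij(-alpha) B E_ij(alpha) from the proof of Theorem 7.1.1."""
    n = B.shape[0]
    return E(n, i, j, -alpha) @ B @ E(n, i, j, alpha)

T = np.array([[1., 2, 3, 4],
              [0., 1, 2, 3],
              [0., 0, 2, 3],
              [0., 0, 0, 3]])
i, j = 1, 2                                   # T[i, i] = 1 and T[j, j] = 2 differ
alpha = T[i, j] / (T[j, j] - T[i, i])
T1 = F(T, i, j, alpha)
print(np.allclose(np.diag(T1), np.diag(T)))   # True: diagonal entries are unchanged
print(abs(T1[i, j]) < 1e-12)                  # True: the (i, j)-th entry is now 0
\end{verbatim}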

Definition 7.1.3. [Jordan Block and Jordan Matrix]

1.
Let λ ∈ ℂ and let k be a positive integer. Then, by the Jordan block Jk(λ) ∈ Mk(ℂ), we understand the matrix
\[
J_k(\lambda) =
\begin{bmatrix}
\lambda & 1 & & \\
 & \ddots & \ddots & \\
 & & \lambda & 1\\
 & & & \lambda
\end{bmatrix}.
\]
2.
A Jordan matrix is a direct sum of Jordan blocks. That is, if A is a Jordan matrix having r blocks then there exist positive integers ki's and complex numbers λi's (not necessarily distinct), for 1 ≤ i ≤ r, such that
\[
A = J_{k_1}(\lambda_1) \oplus \cdots \oplus J_{k_r}(\lambda_r).
\]

We now give some examples of Jordan matrices with diagonal entries 0.

Example 7.1.4.

1.
J1(0) = [0] is the only Jordan matrix of size 1.
2.
$J_1(0) \oplus J_1(0) = \begin{bmatrix} 0 & 0\\ 0 & 0 \end{bmatrix}$ and $J_2(0) = \begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix}$ are Jordan matrices of size 2.
3.
Even though $J_1(0) \oplus J_2(0)$ and $J_2(0) \oplus J_1(0)$ are two Jordan matrices of size 3, we do not differentiate between them as they are similar (use permutations).
4.
$J_1(0) \oplus J_1(0) \oplus J_1(0) = \begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix}$, $J_2(0) \oplus J_1(0) = \begin{bmatrix} 0 & 1 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix}$ and $J_3(0) = \begin{bmatrix} 0 & 1 & 0\\ 0 & 0 & 1\\ 0 & 0 & 0 \end{bmatrix}$ are Jordan matrices of size 3.
5.
Observe that the number of Jordan matrices of size 4 with 0 on the diagonal is 5.

We now give some properties of the Jordan blocks. The proofs are immediate and hence left for the reader. They will be used in the proof of subsequent results.

Remark 7.1.5. [Jordan blocks] Fix a positive integer k. Then,

1.
Jk(λ) is an upper triangular matrix with λ as an eigenvalue.
2.
Jk(λ) = λIk + Jk(0).
3.
Alg.Mulλ(Jk(λ)) = k.
4.
The matrix Jk(0) satisfies the following properties.
(a)
Rank(Jk(0)^i) = k - i, for 1 ≤ i ≤ k.
(b)
$J_k(0)^T J_k(0) = \begin{bmatrix} 0 & 0\\ 0 & I_{k-1} \end{bmatrix}$.
(c)
Jk(0)^p = 0 whenever p ≥ k.
(d)
$J_k(0)e_i = e_{i-1}$, for i = 2, …, k.
(e)
$(I - J_k(0)^TJ_k(0))\mathbf{x} = \begin{bmatrix} x_1\\ \mathbf{0} \end{bmatrix} = \langle \mathbf{x}, e_1\rangle e_1$.
5.
Thus, using Remark 7.1.5.4d, Geo.Mulλ(Jk(λ)) = 1.
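The properties listed in Remark 7.1.5.4 are easy to confirm numerically. A minimal Python check (the helper name jordan_block is ad hoc, not from the text):
\begin{verbatim}
import numpy as np

def jordan_block(k, lam=0.0):
    """The k x k Jordan block J_k(lam): lam on the diagonal, 1 on the superdiagonal."""
    return lam * np.eye(k) + np.diag(np.ones(k - 1), 1)

k = 5
J = jordan_block(k)                                    # J_k(0)
for i in range(1, k + 1):                              # 4(a): Rank(J_k(0)^i) = k - i
    assert np.linalg.matrix_rank(np.linalg.matrix_power(J, i)) == k - i
assert np.allclose(J.T @ J, np.diag([0.] + [1.] * (k - 1)))   # 4(b): 0 oplus I_{k-1}
assert np.allclose(np.linalg.matrix_power(J, k), 0)    # 4(c): J_k(0)^k = 0
e = np.eye(k)
assert np.allclose(J @ e[:, 2], e[:, 1])               # 4(d): J_k(0) e_3 = e_2
print("Remark 7.1.5.4 verified for k =", k)
\end{verbatim}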

Exercise 7.1.6.

1.
Fix a positive integer k and a complex number λ. Then, prove that
(a)
Rank(Jk(λ) - λIk) = k - 1.
(b)
Rank(Jk(λ) - αIk) = k, whenever α ≠ λ. Or equivalently, for all α ≠ λ the matrix Jk(λ) - αIk is invertible.
(c)
for 1 ≤ i ≤ k, Rank((Jk(λ) - λIk)^i) = k - i.
(d)
for α ≠ λ, Rank((Jk(λ) - αIk)^i) = k, for all i.
2.
Let J be a Jordan matrix that contains ℓ Jordan blocks for λ. Then, prove that
(a)
Rank(J - λI) = n - ℓ.
(b)
J has ℓ linearly independent eigenvectors for λ.
(c)
Rank(J - λI) ≥ Rank((J - λI)^2) ≥ Rank((J - λI)^3) ≥ ⋯.
3.
Let A ∈ Mn(ℂ). Then, prove that AJn(λ) = Jn(λ)A if and only if AJn(0) = Jn(0)A.

Definition 7.1.7. [Index of an Eigenvalue] Let J be a Jordan matrix containing Jt(λ), for some positive integer t and some complex number λ. Then, the smallest value of k for which Rank((J - λI)^k) stops decreasing is the order of the largest Jordan block Jk(λ) in J. This number k is called the index of the eigenvalue λ.

Lemma 7.1.8. Let A ∈ Mn(ℂ) be strictly upper triangular. Then, A is similar to a direct sum of Jordan blocks. That is, there exists a non-singular matrix S and integers n1 ≥ ⋯ ≥ nm ≥ 1 such that
\[
A = S^{-1}\bigl(J_{n_1}(0) \oplus \cdots \oplus J_{n_m}(0)\bigr)S.
\]
If A ∈ Mn(ℝ) then S can be chosen to have real entries.

Proof. We will prove the result by induction on n. For n = 1, the statement is trivial. So, let the result be true for matrices of size n - 1 and let A ∈ Mn(ℂ) be strictly upper triangular. Then, $A = \begin{bmatrix} 0 & \mathbf{a}^T\\ \mathbf{0} & A_1 \end{bmatrix}$. By the induction hypothesis there exists an invertible matrix S1 such that

\[
A_1 = S_1^{-1}\bigl(J_{n_1}(0) \oplus \cdots \oplus J_{n_m}(0)\bigr)S_1, \quad \text{with } \sum_{i=1}^{m} n_i = n - 1.
\]
Thus,
\[
\begin{bmatrix} 1 & 0\\ 0 & S_1^{-1} \end{bmatrix} A \begin{bmatrix} 1 & 0\\ 0 & S_1 \end{bmatrix}
= \begin{bmatrix} 1 & 0\\ 0 & S_1^{-1} \end{bmatrix} \begin{bmatrix} 0 & \mathbf{a}^T\\ \mathbf{0} & A_1 \end{bmatrix} \begin{bmatrix} 1 & 0\\ 0 & S_1 \end{bmatrix}
= \begin{bmatrix} 0 & \mathbf{a}^TS_1\\ \mathbf{0} & S_1^{-1}A_1S_1 \end{bmatrix}
= \begin{bmatrix} 0 & \mathbf{a}_1^T & \mathbf{a}_2^T\\ \mathbf{0} & J_{n_1}(0) & 0\\ \mathbf{0} & 0 & J \end{bmatrix},
\]
where we write $J_{n_1}(0) \oplus \cdots \oplus J_{n_m}(0) = J_{n_1}(0) \oplus J$ and $\mathbf{a}^TS_1 = \begin{bmatrix} \mathbf{a}_1^T & \mathbf{a}_2^T \end{bmatrix}$. Now, writing $J_{n_1}$ to mean $J_{n_1}(0)$ and using Remark 7.1.5.4e, we have
\[
\begin{bmatrix} 1 & -\mathbf{a}_1^TJ_{n_1}^T & 0\\ 0 & I_{n_1} & 0\\ 0 & 0 & I \end{bmatrix}
\begin{bmatrix} 0 & \mathbf{a}_1^T & \mathbf{a}_2^T\\ 0 & J_{n_1} & 0\\ 0 & 0 & J \end{bmatrix}
\begin{bmatrix} 1 & \mathbf{a}_1^TJ_{n_1}^T & 0\\ 0 & I_{n_1} & 0\\ 0 & 0 & I \end{bmatrix}
= \begin{bmatrix} 0 & \langle \mathbf{a}_1, e_1\rangle e_1^T & \mathbf{a}_2^T\\ 0 & J_{n_1} & 0\\ 0 & 0 & J \end{bmatrix}.
\]
So, we now need to consider two cases depending on whether $\langle \mathbf{a}_1, e_1\rangle = 0$ or $\langle \mathbf{a}_1, e_1\rangle \ne 0$. In the first case, A is similar to $\begin{bmatrix} 0 & 0 & \mathbf{a}_2^T\\ 0 & J_{n_1} & 0\\ 0 & 0 & J \end{bmatrix}$. This in turn is similar to $\begin{bmatrix} J_{n_1} & 0 & 0\\ 0 & 0 & \mathbf{a}_2^T\\ 0 & 0 & J \end{bmatrix}$ by permuting the first row and column. At this stage, one can apply induction and, if necessary, do a block permutation in order to keep the block sizes in decreasing order.

So, let us now assume that $\langle \mathbf{a}_1, e_1\rangle \ne 0$. Then, writing $\alpha = \langle \mathbf{a}_1, e_1\rangle$, we have

\[
\begin{bmatrix} \tfrac{1}{\alpha} & 0 & 0\\ 0 & I & 0\\ 0 & 0 & \tfrac{1}{\alpha}I \end{bmatrix}
\begin{bmatrix} 0 & \alpha e_1^T & \mathbf{a}_2^T\\ 0 & J_{n_1} & 0\\ 0 & 0 & J \end{bmatrix}
\begin{bmatrix} \alpha & 0 & 0\\ 0 & I & 0\\ 0 & 0 & \alpha I \end{bmatrix}
= \begin{bmatrix} 0 & e_1^T & \mathbf{a}_2^T\\ 0 & J_{n_1} & 0\\ 0 & 0 & J \end{bmatrix}
\equiv \begin{bmatrix} J_{n_1+1} & e_1\mathbf{a}_2^T\\ 0 & J \end{bmatrix}.
\]
Now, using Remarks 7.1.5.4c and 7.1.5.4d, verify that
\[
\begin{bmatrix} I & e_{i+1}\mathbf{a}_2^TJ^{i-1}\\ 0 & I \end{bmatrix}
\begin{bmatrix} J_{n_1+1} & e_i\mathbf{a}_2^TJ^{i-1}\\ 0 & J \end{bmatrix}
\begin{bmatrix} I & -e_{i+1}\mathbf{a}_2^TJ^{i-1}\\ 0 & I \end{bmatrix}
= \begin{bmatrix} J_{n_1+1} & e_{i+1}\mathbf{a}_2^TJ^{i}\\ 0 & J \end{bmatrix}, \quad \text{for } i \ge 1.
\]
Hence, for p = n - n1 - 1, we have
\[
\begin{bmatrix} I & e_{p+1}\mathbf{a}_2^TJ^{p-1}\\ 0 & I \end{bmatrix} \cdots
\begin{bmatrix} I & e_2\mathbf{a}_2^T\\ 0 & I \end{bmatrix}
\begin{bmatrix} J_{n_1+1} & e_1\mathbf{a}_2^T\\ 0 & J \end{bmatrix}
\begin{bmatrix} I & -e_2\mathbf{a}_2^T\\ 0 & I \end{bmatrix} \cdots
\begin{bmatrix} I & -e_{p+1}\mathbf{a}_2^TJ^{p-1}\\ 0 & I \end{bmatrix}
= \begin{bmatrix} J_{n_1+1} & 0\\ 0 & J \end{bmatrix}.
\]
If necessary, we need to do a block permutation, in order to keep the block sizes in decreasing order. Hence, the required result follows. _

Exercise 7.1.9. Convert $\begin{bmatrix} 0 & 1 & 1\\ 0 & 0 & 1\\ 0 & 0 & 0 \end{bmatrix}$ to $J_3(0)$ and $\begin{bmatrix} 0 & 1 & 2\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix}$ to $J_2(0) \oplus J_1(0)$.

Corollary 7.1.10. Let A ∈ Mn(ℂ). Then, A is similar to J, a Jordan matrix.

Proof. Let λ1, …, λk be the distinct eigenvalues of A with algebraic multiplicities m1, …, mk. By Theorem 7.1.1, there exists a non-singular matrix S such that $S^{-1}AS = \bigoplus_{i=1}^{k} T_i$, where Ti is upper triangular with constant diagonal λi. Thus, Ti - λiImi is a strictly upper triangular matrix. Hence, by Lemma 7.1.8, there exists a non-singular matrix Si such that

\[
S_i^{-1}(T_i - \lambda_i I_{m_i})S_i = \widehat{J}_i,
\]
a Jordan matrix with 0 on the diagonal in which the sizes of the Jordan blocks decrease as we move down the diagonal. So, $S_i^{-1}T_iS_i = \widehat{J}_i + \lambda_i I_{m_i}$ is a Jordan matrix with λi on the diagonal in which the sizes of the Jordan blocks decrease as we move down the diagonal.

Now, take $W = S\bigl(\bigoplus_{i=1}^{k} S_i\bigr)$. Then, verify that $W^{-1}AW$ is a Jordan matrix. _

Let A ∈ Mn(ℂ). Suppose λ ∈ σ(A) and J is a Jordan matrix that is similar to A. Then, for each fixed i, 1 ≤ i ≤ n, by ℓi(λ) we denote the number of Jordan blocks Jk(λ) in J for which k ≥ i. The next result uses Exercise 7.1.6 to determine the numbers ℓi(λ).

Remark 7.1.11. Let A ∈ Mn(ℂ). Suppose λ ∈ σ(A) and J is a Jordan matrix that is similar to A. Then, for 1 ≤ k ≤ n,

\[
\ell_k(\lambda) = \operatorname{Rank}\bigl((A - \lambda I)^{k-1}\bigr) - \operatorname{Rank}\bigl((A - \lambda I)^{k}\bigr).
\]

Proof. In view of Exercise 7.1.6, we need to consider only the Jordan blocks Jk(λ), for different values of k. Hence, without loss of generality, let us assume that $J = \bigoplus_{i=1}^{n} a_iJ_i(\lambda)$, where the ai's are non-negative integers and J contains exactly ai copies of the Jordan block Ji(λ), for 1 ≤ i ≤ n. Then, by definition and Exercise 7.1.6, we observe the following:

1.
$n = \sum_{i\ge 1} i\,a_i$.
2.
$\operatorname{Rank}(J - \lambda I) = \sum_{i\ge 2}(i-1)a_i$.
3.
$\operatorname{Rank}\bigl((J - \lambda I)^2\bigr) = \sum_{i\ge 3}(i-2)a_i$.
4.
In general, for 1 ≤ k ≤ n, $\operatorname{Rank}\bigl((J - \lambda I)^k\bigr) = \sum_{i\ge k+1}(i-k)a_i$.

Thus, writing ℓi in place of ℓi(λ), we get

\begin{align*}
\ell_1 &= \sum_{i\ge 1} a_i = \sum_{i\ge 1} i\,a_i - \sum_{i\ge 2}(i-1)a_i = n - \operatorname{Rank}(J - \lambda I),\\
\ell_2 &= \sum_{i\ge 2} a_i = \sum_{i\ge 2}(i-1)a_i - \sum_{i\ge 3}(i-2)a_i = \operatorname{Rank}(J - \lambda I) - \operatorname{Rank}\bigl((J - \lambda I)^2\bigr),\\
&\ \,\vdots\\
\ell_k &= \sum_{i\ge k} a_i = \sum_{i\ge k}\bigl(i-(k-1)\bigr)a_i - \sum_{i\ge k+1}(i-k)a_i = \operatorname{Rank}\bigl((J - \lambda I)^{k-1}\bigr) - \operatorname{Rank}\bigl((J - \lambda I)^{k}\bigr).
\end{align*}
Now, the required result follows as rank is invariant under similarity and the matrices J and A are similar. _
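Remark 7.1.11 translates directly into a small computation. The Python sketch below (the function name jordan_block_counts is ad hoc) returns, for a given eigenvalue λ, the number of Jordan blocks of each size; it assumes the ranks can be computed reliably, which is fine for small integer examples but, as noted later in Remark 7.1.15.9, is numerically delicate in floating point.
\begin{verbatim}
import numpy as np

def jordan_block_counts(A, lam):
    """Number of Jordan blocks J_k(lam) of each size k, computed via
    l_k = Rank((A - lam I)^{k-1}) - Rank((A - lam I)^k)  (Remark 7.1.11)."""
    n = A.shape[0]
    B = A - lam * np.eye(n)
    ranks = [n] + [np.linalg.matrix_rank(np.linalg.matrix_power(B, k))
                   for k in range(1, n + 1)]
    ell = [ranks[k - 1] - ranks[k] for k in range(1, n + 1)]   # ell[k-1] = l_k
    counts = {}
    for k in range(1, n + 1):                 # a block of size exactly k is counted
        nxt = ell[k] if k < n else 0          # by l_k but not by l_{k+1}
        if ell[k - 1] - nxt > 0:
            counts[k] = ell[k - 1] - nxt
    return counts

A = np.array([[0., 0, 1, 0],                  # J_4(0)^2 (see Example 7.1.14.1 below)
              [0., 0, 0, 1],
              [0., 0, 0, 0],
              [0., 0, 0, 0]])
print(jordan_block_counts(A, 0.0))            # {2: 2}, i.e. J_2(0) oplus J_2(0)
\end{verbatim}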

Lemma 7.1.12. [Similar Jordan Matrices] Let J and J′ be two similar Jordan matrices of size n. Then, J′ is a block permutation of J.

Proof. For each eigenvalue λ and for 1 ≤ i ≤ n, let ℓi and ℓ′i be, respectively, the number of Jordan blocks of J and J′ of size at least i corresponding to λ. Since J and J′ are similar, the matrices (J - λI)^i and (J′ - λI)^i are similar for all i, 1 ≤ i ≤ n. Therefore, their ranks are equal for all i ≥ 1 and hence, by Remark 7.1.11, ℓi = ℓ′i for all i ≥ 1. Thus, the required result follows. _

We now state the main result of this section, which directly follows from Lemma 6.2.12, Theorem 7.1.1, Corollary 7.1.10 and Lemma 7.1.12, and hence the proof is omitted.

Theorem 7.1.13. [Jordan Canonical Form Theorem] Let A ∈ Mn(ℂ). Then, A is similar to a Jordan matrix J, which is unique up to a permutation of the Jordan blocks. If A ∈ Mn(ℝ) and has real eigenvalues then the similarity transformation matrix S may be chosen to have real entries. This matrix J is called the Jordan canonical form of A, denoted Jordan CF(A).

We now start with a few examples and observations.

Example 7.1.14. Let us use the idea from Remark 7.1.11 to find the Jordan Canonical Form of the following matrices.

1.
Let $A = J_4(0)^2 = \begin{bmatrix} 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix}$.

Solution: Note that ℓ1 = 4 - Rank(A - 0I) = 2. So, there are two Jordan blocks.

Also, ℓ2 = Rank(A - 0I) - Rank((A - 0I)^2) = 2. So, there are 2 Jordan blocks of size at least 2. As there are exactly two Jordan blocks, both the blocks must have size 2. Hence, Jordan CF(A) = J2(0) ⊕ J2(0).

2.
Let $A_1 = \begin{bmatrix} 1 & 1 & 0 & 1\\ 0 & 1 & 1 & 1\\ 0 & 0 & 1 & 1\\ 0 & 0 & 0 & 1 \end{bmatrix}$.

Solution: Let B = A1 - I. Then, ℓ1 = 4 - Rank(B) = 1. So, B has exactly one Jordan block and hence A1 is similar to J4(1).

3.
$A_2 = \begin{bmatrix} 1 & 1 & 0 & 1\\ 0 & 1 & 1 & 1\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}$.

Solution: Let C = A2 - I. Then, ℓ1 = 4 - Rank(C) = 2. So, C has exactly two Jordan blocks. Also, ℓ2 = Rank(C) - Rank(C^2) = 1 and ℓ3 = Rank(C^2) - Rank(C^3) = 1. So, there is at least one Jordan block of size 3.

Thus, we see that there are two Jordan blocks and one of them is of size 3. Also, the size of the matrix is 4. Thus, A2 is similar to J3(1) ⊕ J1(1).

4.
Let A = J4(1) ⊕ J4(1) ⊕ A1 ⊕ A2, where A1 and A2 are given in the previous exercises.

Solution: One can directly get the answer from the previous exercises as the matrix A is already in block diagonal form. But, we compute it again for better understanding.

Let B = A - I. Then, ℓ1 = 16 - Rank(B) = 5, ℓ2 = Rank(B) - Rank(B^2) = 11 - 7 = 4, ℓ3 = Rank(B^2) - Rank(B^3) = 7 - 3 = 4 and ℓ4 = Rank(B^3) - Rank(B^4) = 3 - 0 = 3.

Hence, J4(1) appears thrice (as ℓ4 = 3 and ℓ5 = 0), J3(1) appears once (as ℓ3 - ℓ4 = 1), J2(1) does not appear (as ℓ2 - ℓ3 = 0) and J1(1) appears once (as ℓ1 - ℓ2 = 1). Thus, the required result follows.

Remark 7.1.15. [Observations about Jordan CF(A)]

1.
What are the steps to find Jordan CF(A)?

Let λ1, …, λk be the distinct eigenvalues of A. Now, apply the Schur Upper Triangularization Lemma (see Lemma 6.2.12) to get an upper triangular matrix, say T, such that the diagonal entries of T are λ1, …, λ1, λ2, …, λ2, …, λk, …, λk. Now, apply similarity transformations (see Theorem 7.1.1) to get $\bigoplus_{i=1}^{k} T_i$, where each diagonal entry of Ti is λi. Then, for each i, 1 ≤ i ≤ k, use Lemma 7.1.8 to get an invertible matrix Si such that $S_i^{-1}(T_i - \lambda_iI)S_i = \widehat{J}_i$, a Jordan matrix. Thus, we obtain a Jordan matrix $J_i = \widehat{J}_i + \lambda_iI = S_i^{-1}T_iS_i$, where each diagonal entry of Ji is λi. Hence, $S = \bigoplus_{i=1}^{k} S_i$ converts $\bigoplus_{i=1}^{k} T_i$ into the required Jordan matrix.

2.
Let A ∈ Mn(ℂ) be a diagonalizable matrix. Then, by definition, A is similar to $\bigoplus_{i=1}^{n} J_1(\lambda_i) = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, where λi ∈ σ(A), for 1 ≤ i ≤ n. Thus, Jordan CF(A) = $\bigoplus_{i=1}^{n} J_1(\lambda_i)$, up to a permutation of the λi's.
3.
In general, the computation of Jordan CF(A) is not numerically stable. To understand this, let $A_\epsilon = \begin{bmatrix} \epsilon & 0\\ 1 & 0 \end{bmatrix}$. Then, Aϵ is diagonalizable as Aϵ has distinct eigenvalues. So, Jordan CF(Aϵ) = $\begin{bmatrix} \epsilon & 0\\ 0 & 0 \end{bmatrix}$.

Whereas, for $A = \begin{bmatrix} 0 & 0\\ 1 & 0 \end{bmatrix}$, we know that Jordan CF(A) = $\begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix} \ne \lim_{\epsilon\to 0}$ Jordan CF(Aϵ). Thus, a small change in the entries of A may change Jordan CF(A) significantly.

4.
Let A Mn() and ϵ > 0 be given. Then, there exists an invertible matrix S such that S-1AS = i=1kJni(λi), where Jni(λi) is obtained from Jni(λi) by replacing each off diagonal entry 1 by an ϵ. To get this, define Di(ϵ) = diag(1,ϵ,ϵ2,ni-1), for 1 i k. Now compute i=1k(       -1           )
 (Di(ϵ))  Jni(λi)Di (ϵ).
5.
Let Jordan CF(A) contain ℓ Jordan blocks for λ. Then, A has ℓ linearly independent eigenvectors for λ.

For, if A had at least ℓ + 1 linearly independent eigenvectors for λ, then dim(Null(A - λI)) > ℓ and so Rank(A - λI) < n - ℓ. But, the number of Jordan blocks for λ in Jordan CF(A) is ℓ and hence, by Exercise 7.1.6.2a, Rank(A - λI) = Rank(J - λI) = n - ℓ, a contradiction.

6.
Let λ ∈ σ(A). Then, by Remark 7.1.5.5, Geo.Mulλ(A) = the number of Jordan blocks Jk(λ) in Jordan CF(A).
7.
Let λ ∈ σ(A). Then, by Remark 7.1.5.3, Alg.Mulλ(A) = the sum of the sizes of all Jordan blocks Jk(λ) in Jordan CF(A).
8.
Let λ ∈ σ(A). Then, Jordan CF(A) is not determined by Alg.Mulλ(A) and Geo.Mulλ(A) alone. For example, $[0] \oplus \begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix} \oplus \begin{bmatrix} 0 & 1 & 0\\ 0 & 0 & 1\\ 0 & 0 & 0 \end{bmatrix}$ and $\begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix} \oplus \begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix} \oplus \begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix}$ are different Jordan CFs but they have the same algebraic and geometric multiplicities.
9.
Let A ∈ Mn(ℂ). Suppose that, for each λ ∈ σ(A), the values of Rank((A - λI)^k), for k = 1, …, n, are known. Then, using Remark 7.1.11, Jordan CF(A) can be computed. But, note here that finding the rank is numerically unstable: the 1 × 1 matrix [ϵ] has rank 1, but it converges to [0], which has a different rank.
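As mentioned in item 4 above, conjugating a Jordan block by D(ϵ) = diag(1, ϵ, …, ϵ^{n-1}) replaces the superdiagonal 1's by ϵ's. A short Python illustration (helper names are ad hoc):
\begin{verbatim}
import numpy as np

def jordan_block(n, lam):
    return lam * np.eye(n) + np.diag(np.ones(n - 1), 1)

def eps_scaled_block(n, lam, eps):
    """D(eps)^{-1} J_n(lam) D(eps), with D(eps) = diag(1, eps, ..., eps^{n-1})."""
    D = np.diag(eps ** np.arange(n))
    return np.linalg.inv(D) @ jordan_block(n, lam) @ D

print(np.round(eps_scaled_block(4, 2.0, 0.01), 10))
# diagonal entries 2.0, every superdiagonal entry equal to 0.01
\end{verbatim}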

Theorem 7.1.16. [A is similar to A^T] Let A ∈ Mn(ℂ). Then, A is similar to A^T.

Proof. Let $K_n = \begin{bmatrix} 0 & \cdots & 0 & 1\\ 0 & \cdots & 1 & 0\\ \vdots & & & \vdots\\ 1 & \cdots & 0 & 0 \end{bmatrix}$ be the matrix with 1's on the anti-diagonal and 0's elsewhere. Then, observe that $K_n^{-1} = K_n$ and $K_nJ_n(a)K_n = J_n(a)^T$, as the (i,j)-th entry of a matrix A goes to the (n - i + 1, n - j + 1)-th position in $K_nAK_n$. Hence,

\[
\Bigl[\bigoplus_i K_{n_i}\Bigr]\Bigl[\bigoplus_i J_{n_i}(\lambda_i)\Bigr]\Bigl[\bigoplus_i K_{n_i}\Bigr] = \Bigl[\bigoplus_i J_{n_i}(\lambda_i)\Bigr]^T.
\]
Thus, J is similar to J^T. But, A is similar to J and hence A^T is similar to J^T, and finally we get that A is similar to A^T. Therefore, the required result follows. _
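The reversal matrix Kn used in the proof is just the identity with its columns reversed, so the claims $K_n^{-1} = K_n$ and $K_nJ_n(a)K_n = J_n(a)^T$ are easy to spot-check in Python (the example size and value of a are arbitrary):
\begin{verbatim}
import numpy as np

n, a = 5, 3.0
K = np.fliplr(np.eye(n))                        # 1's on the anti-diagonal
J = a * np.eye(n) + np.diag(np.ones(n - 1), 1)  # the Jordan block J_n(a)
print(np.allclose(K @ K, np.eye(n)))            # True: K^{-1} = K
print(np.allclose(K @ J @ K, J.T))              # True: K J_n(a) K = J_n(a)^T
\end{verbatim}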

7.2 Minimal polynomial

We start this section with the following definition. Recall that a polynomial p(x) = a0 + a1x + ⋅⋅⋅ + anxn with an = 1 is called a monic polynomial.

Definition 7.2.1. [Companion Matrix] Let $P(t) = t^n + a_{n-1}t^{n-1} + \cdots + a_0$ be a monic polynomial in t of degree n. Then, the n × n matrix
\[
A = \begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & -a_0\\
1 & 0 & 0 & \cdots & 0 & -a_1\\
0 & 1 & 0 & \cdots & 0 & -a_2\\
\vdots & & \ddots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \ddots & 0 & -a_{n-2}\\
0 & 0 & 0 & \cdots & 1 & -a_{n-1}
\end{bmatrix},
\]
denoted A(n : a0, …, a_{n-1}) or Companion(P), is called the companion matrix of P(t).
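A quick way to get a feel for Definition 7.2.1 is to build A(n : a0, …, a_{n-1}) in Python and read off its characteristic polynomial; numpy.poly returns the coefficients of det(tI - A) from the highest degree down. The helper name companion_matrix is ad hoc.
\begin{verbatim}
import numpy as np

def companion_matrix(coeffs):
    """Companion matrix of t^n + a_{n-1} t^{n-1} + ... + a_0,
    where coeffs = [a_0, a_1, ..., a_{n-1}]."""
    n = len(coeffs)
    A = np.zeros((n, n))
    A[1:, :-1] = np.eye(n - 1)        # 1's on the subdiagonal
    A[:, -1] = -np.asarray(coeffs)    # last column: -a_0, -a_1, ..., -a_{n-1}
    return A

A = companion_matrix([6., 11., 6.])   # P(t) = t^3 + 6t^2 + 11t + 6 = (t+1)(t+2)(t+3)
print(np.poly(A))                     # [ 1.  6. 11.  6.]
\end{verbatim}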

Definition 7.2.2. [Annihilating Polynomial] Let A ∈ Mn(ℂ). Then, the polynomial P(t) is said to annihilate (destroy) A if P(A) = 0.

Let P(x) be the characteristic polynomial of A. Then, by the Cayley-Hamilton Theorem, P(A) = 0. So, if f(x) = P(x)g(x), for any polynomial g(x), then f(A) = P(A)g(A) = 0 · g(A) = 0. Thus, there are infinitely many polynomials which annihilate A. In this section, we will concentrate on a monic polynomial of least positive degree that annihilates A.

Definition 7.2.3. [Minimal polynomial] Let A ∈ Mn(ℂ). Then, the minimal polynomial of A, denoted mA(x), is a monic polynomial of least positive degree satisfying mA(A) = 0.

Theorem 7.2.4. Let A be the companion matrix of the monic polynomial $P(t) = t^n + a_{n-1}t^{n-1} + \cdots + a_0$. Then, P(t) is both the characteristic and the minimal polynomial of A.

Proof. Expanding det(tIn - Companion(P)) along the first row, we have
\begin{align*}
\det\bigl(tI_n - A(n : a_0,\ldots,a_{n-1})\bigr) &= t\,\det\bigl(tI_{n-1} - A(n-1 : a_1,\ldots,a_{n-1})\bigr) + (-1)^{n+1}a_0(-1)^{n-1}\\
&= t^2\det\bigl(tI_{n-2} - A(n-2 : a_2,\ldots,a_{n-1})\bigr) + a_0 + a_1t\\
&\ \,\vdots\\
&= P(t).
\end{align*}
Thus, P(t) is the characteristic polynomial of A and hence P(A) = 0.

We will now show that P(t) is the minimal polynomial of A. To do so, we first observe that Ae1 = e2, …, Ae_{n-1} = e_n. That is,
\[
A^k e_1 = e_{k+1}, \quad \text{for } 1 \le k \le n - 1.
\tag{7.2.1}
\]

Now, suppose we have a monic polynomial $Q(t) = t^m + b_{m-1}t^{m-1} + \cdots + b_0$, with m < n, such that Q(A) = 0. Then, using Equation (7.2.1), we get

\[
0 = Q(A)e_1 = A^me_1 + b_{m-1}A^{m-1}e_1 + \cdots + b_0Ie_1 = e_{m+1} + b_{m-1}e_m + \cdots + b_0e_1,
\]
a contradiction to the linear independence of {e1, …, e_{m+1}} ⊆ {e1, …, e_n}. _

The next result gives us the existence of such a polynomial for every matrix A. To do so, recall that the well-ordering principle states that every non-empty subset of the natural numbers contains a least element.

Lemma 7.2.5. [Existence of the Minimal Polynomial] Let A ∈ Mn(ℂ). Then, there exists a unique monic polynomial m(x) of minimum (positive) degree such that m(A) = 0. Further, if f(x) is any polynomial with f(A) = 0 then m(x) divides f(x).

Proof. Let P(x) be the characteristic polynomial of A. Then, deg(P(x)) = n and by the Cayley-Hamilton Theorem, P(A) = 0. So, consider the set

\[
S = \{\deg(f(x)) : f(x) \text{ is a nonzero polynomial and } f(A) = 0\}.
\]
Then, S is a non-empty subset of ℕ as n ∈ S. Thus, by the well-ordering principle there exists a smallest positive integer, say M, and a corresponding polynomial, say m(x), such that deg(m(x)) = M and m(A) = 0.

Also, without loss of generality, we can assume that m(x) is monic. It is also unique: if two distinct monic polynomials of degree M annihilate A, then their difference, suitably normalized, is a monic polynomial of smaller degree that annihilates A, contradicting the minimality of M.

Now, suppose there is a polynomial f(x) such that f(A) = 0. Then, by the division algorithm, there exist polynomials q(x) and r(x) such that f(x) = m(x)q(x) + r(x), where either r(x) is identically the zero polynomial or deg(r(x)) < M = deg(m(x)). As

\[
0 = f(A) = m(A)q(A) + r(A) = 0 \cdot q(A) + r(A) = r(A),
\]
we get r(A) = 0. But, m(x) was the least degree polynomial with m(A) = 0 and hence r(x) is the zero polynomial. That is, m(x) divides f(x). _

As an immediate corollary, we have the following result.

Corollary 7.2.6. [Minimal polynomial divides the Characteristic Polynomial] Let mA(x) and PA(x) be, respectively, the minimal and the characteristic polynomials of A ∈ Mn(ℂ).

1.
Then, mA(x) divides PA(x).
2.
Further, if λ is an eigenvalue of A then mA(λ) = 0.

Proof. The first part follows directly from Lemma 7.2.5. For the second part, let (λ, x) be an eigen-pair of A. Then, f(A)x = f(λ)x, for any polynomial f, implies that

\[
m_A(\lambda)\mathbf{x} = m_A(A)\mathbf{x} = 0\,\mathbf{x} = \mathbf{0}.
\]
But, x ≠ 0 and hence mA(λ) = 0. Thus, the required result follows. _

We also have the following result.

Lemma 7.2.7. Let A and B be two similar matrices. Then, they have the same minimal polynomial.

Proof. Since A and B are similar, there exists an invertible matrix S such that A = S^{-1}BS. Hence, f(A) = f(S^{-1}BS) = S^{-1}f(B)S, for any polynomial f. Thus, f(A) = 0 if and only if f(B) = 0, so mA(x) and mB(x) divide each other. As both are monic, mA(x) = mB(x) and the required result follows. _

Theorem 7.2.8. Let A ∈ Mn(ℂ) and let λ1, …, λk be the distinct eigenvalues of A. If ni is the size of the largest Jordan block for λi in J = Jordan CF(A) then
\[
m_A(x) = \prod_{i=1}^{k}(x - \lambda_i)^{n_i}.
\]

Proof. Using Corollary 7.2.6, we see that $m_A(x) = \prod_{i=1}^{k}(x - \lambda_i)^{\alpha_i}$, for some αi's with 1 ≤ αi ≤ Alg.Mulλi(A). As mA(A) = 0, using Lemma 7.2.7 we have $m_A(J) = \prod_{i=1}^{k}(J - \lambda_iI)^{\alpha_i} = 0$. But, observe that for the Jordan block Jni(λi), one has

1.
$(J_{n_i}(\lambda_i) - \lambda_iI)^{\alpha_i} = 0$ if and only if αi ≥ ni, and
2.
$(J_{n_m}(\lambda_m) - \lambda_iI)^{\alpha_i}$ is invertible, for all m ≠ i.

Thus, mA(J) = 0 forces αi ≥ ni, for each i. On the other hand, $\prod_{i=1}^{k}(J - \lambda_iI)^{n_i} = 0$, so the monic polynomial $\prod_{i=1}^{k}(x - \lambda_i)^{n_i}$ annihilates A and hence mA(x) divides it, giving αi ≤ ni. Therefore, αi = ni for each i and the result follows. _
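Theorem 7.2.8 can be checked symbolically: compute the Jordan form, take the largest block for each eigenvalue, and multiply the corresponding factors. A sympy sketch, using the matrix A2 of Example 7.1.14.3 (exact arithmetic is assumed so that the eigenvalues are found exactly):
\begin{verbatim}
import sympy as sp

x = sp.symbols('x')
A = sp.Matrix([[1, 1, 0, 1],
               [0, 1, 1, 1],
               [0, 0, 1, 0],
               [0, 0, 0, 1]])                 # the matrix A2 of Example 7.1.14.3
P, J = A.jordan_form()                        # J = J_3(1) oplus J_1(1)
largest = {}                                  # largest Jordan block size per eigenvalue
for block in J.get_diag_blocks():
    lam = block[0, 0]
    largest[lam] = max(largest.get(lam, 0), block.shape[0])
m = sp.Integer(1)
for lam, n in largest.items():
    m *= (x - lam)**n
print(m)                                      # (x - 1)**3, as Theorem 7.2.8 predicts
\end{verbatim}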

As an immediate consequence, we also have the following result which corresponds to the converse of the above theorem.

Theorem 7.2.9. Let A ∈ Mn(ℂ) and let λ1, …, λk be the distinct eigenvalues of A. If the minimal polynomial of A equals $\prod_{i=1}^{k}(x - \lambda_i)^{n_i}$ then ni is the size of the largest Jordan block for λi in J = Jordan CF(A).

Proof. It directly follows from Theorem 7.2.8. _

We now give equivalent conditions for a square matrix to be diagonalizable.

Theorem 7.2.10. Let A ∈ Mn(ℂ). Then, the following statements are equivalent.

1.
A is diagonalizable.
2.
Every zero of mA(x) has multiplicity 1.
3.
Whenever mA(α) = 0, for some α, then $\frac{d}{dx}m_A(x)\big|_{x=\alpha} \ne 0$.

Proof. Part 1 ⇒ Part 2. If A is diagonalizable, then each Jordan block in J = Jordan CF(A) has size 1. Hence, by Theorem 7.2.8, $m_A(x) = \prod_{i=1}^{k}(x - \lambda_i)$, where the λi's are the distinct eigenvalues of A.

Part 2 ⇒ Part 3. Let $m_A(x) = \prod_{i=1}^{k}(x - \lambda_i)$, where the λi's are the distinct eigenvalues of A. Then, mA(x) = 0 if and only if x = λi, for some i, 1 ≤ i ≤ k. In that case, it is easy to verify that $\frac{d}{dx}m_A(x)\big|_{x=\lambda_i} \ne 0$, for each λi.

Part 3 ⇒ Part 1. Suppose that for each α satisfying mA(α) = 0, one has $\frac{d}{dx}m_A(x)\big|_{x=\alpha} \ne 0$. Then, it follows that each zero of mA(x) has multiplicity 1. Also, using Corollary 7.2.6, each zero of mA(x) is an eigenvalue of A and hence, by Theorem 7.2.8, the size of each Jordan block is 1. Thus, A is diagonalizable. _
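Part 2 of the theorem gives a practical diagonalizability test: A is diagonalizable exactly when mA(x) has no repeated zeros and, since mA(x) and PA(x) have the same zeros, this holds if and only if the squarefree part of PA(x) already annihilates A. A sympy sketch of that test (the function name is ad hoc):
\begin{verbatim}
import sympy as sp

x = sp.symbols('x')

def is_diagonalizable(A):
    """A is diagonalizable iff the squarefree part of its characteristic
    polynomial annihilates A (Theorem 7.2.10 with Corollary 7.2.6)."""
    p = A.charpoly(x).as_expr()
    q = sp.quo(p, sp.gcd(p, sp.diff(p, x)), x)   # squarefree part of p
    coeffs = sp.Poly(q, x).all_coeffs()
    n = A.shape[0]
    val = sp.zeros(n, n)
    for c in coeffs:                             # evaluate q(A) by Horner's rule
        val = val * A + c * sp.eye(n)
    return val == sp.zeros(n, n)

print(is_diagonalizable(sp.Matrix([[0, 0], [1, 0]])))   # False: similar to J_2(0)
print(is_diagonalizable(sp.eye(3)))                     # True
\end{verbatim}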

We now have the following remarks and observations.

Remark 7.2.11.

1.
Let f(x) be a monic polynomial and A = Companion(f) be the companion matrix of f. Then, by Theorem 7.2.4, f(A) = 0 and no monic polynomial of smaller degree annihilates A. Thus, PA(x) = mA(x) = f(x), where PA(x) is the characteristic polynomial and mA(x) the minimal polynomial of A.
2.
Let A ∈ Mn(ℂ). Then, A is similar to Companion(f), for some monic polynomial f of degree n, if and only if mA(x) = f(x).

Proof. Let B = Companion(f). Then, using Lemma 7.2.7, we see that mA(x) = mB(x). But, by Remark 7.2.11.1, we get mB(x) = f(x) and hence the required result follows.

Conversely, assume that mA(x) = f(x). By Remark 7.2.11.1, mB(x) = f(x) = PB(x), the characteristic polynomial of B. Since mA(x) = mB(x), the matrices A and B have, for each eigenvalue λ, largest Jordan blocks of the same size. As PB = mB, for each λ there is only one Jordan block in Jordan CF(B), and the sum of the sizes of these blocks equals deg(f) = n, so Jordan CF(A) can contain no further blocks. Thus, Jordan CF(A) = Jordan CF(B) and hence A is similar to Companion(f). _


Exercise 7.2.12. The following are some facts and questions.

1.
Let A ∈ Mn(ℂ). If PA(x) is the minimal polynomial of A then A is similar to Companion(PA) if and only if A is nonderogatory. T/F?
2.
Let A, B ∈ M3(ℂ) with eigenvalues 1, 2, 3. Is it necessary that A is similar to B?
3.
Let A, B ∈ M3(ℂ) with eigenvalues 1, 1, 3. Is it necessary that A is similar to B?
4.
Let A, B ∈ M4(ℂ) with the same minimal polynomial. Is it necessary that A is similar to B?
5.
Let A, B ∈ M3(ℂ) with the same minimal polynomial. Is it necessary that A is similar to B?
6.
Let A ∈ Mn(ℂ) be idempotent and let J = Jordan CF(A). Thus, J^2 = J and hence conclude that J must be a diagonal matrix. Hence, every idempotent matrix is diagonalizable.
7.
Let A ∈ Mn(ℂ). Suppose that mA(x) divides x(x - 1)(x - 2)(x - 3). Must A be diagonalizable?
8.
Let A ∈ M9(ℂ) be a nilpotent matrix such that A^5 ≠ 0 but A^6 = 0. Determine PA(x) and mA(x).
9.
Recall that for A, B ∈ Mn(ℂ), the characteristic polynomials of AB and BA are the same. That is, PAB(x) = PBA(x). However, they need not have the same minimal polynomial. Take $A = \begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 0\\ 0 & 1 \end{bmatrix}$ to verify that mAB(x) ≠ mBA(x).

We end this section with a method to compute the minimal polynomial of a given matrix.

Example 7.2.13. [Computing the Minimal Polynomial] Let λ1, …, λk be the distinct eigenvalues of A ∈ Mn(ℂ).
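One possible way to organize this computation (sketched below in sympy, under the assumption that the eigenvalues can be computed exactly): for each λi, find its index, i.e. the first power at which Rank((A - λiI)^k) stops decreasing (Definition 7.1.7); by Theorems 7.2.8 and 7.2.9 these indices are the exponents in mA(x). The helper name minimal_poly is ad hoc.
\begin{verbatim}
import sympy as sp

x = sp.symbols('x')

def minimal_poly(A):
    """m_A(x) = prod_i (x - lam_i)^{n_i}, where n_i is the index of lam_i,
    i.e. the first power at which Rank((A - lam_i I)^k) stops decreasing."""
    n = A.shape[0]
    m = sp.Integer(1)
    for lam in A.eigenvals():                 # the distinct eigenvalues of A
        B = A - lam * sp.eye(n)
        prev, k, P = n, 0, sp.eye(n)
        while True:
            k += 1
            P = P * B
            r = P.rank()
            if r == prev:                     # rank did not drop: the index is k - 1
                k -= 1
                break
            if r == 0:                        # rank hit 0: the index is k
                break
            prev = r
        m = m * (x - lam)**k
    return m

A = sp.Matrix([[1, 1, 0, 1], [0, 1, 1, 1], [0, 0, 1, 0], [0, 0, 0, 1]])
print(minimal_poly(A))                        # (x - 1)**3
\end{verbatim}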

7.3 Applications of Jordan Canonical Form

In the last section, we saw that if A is a square matrix then A and A^T are similar. In this section, we look at some more applications of the Jordan Canonical Form.

7.3.1 Coupled system of linear differential equations

Consider the first order Initial Value Problem (IVP)
\[
\mathbf{x}'(t) = \begin{bmatrix} x_1'(t)\\ \vdots\\ x_n'(t) \end{bmatrix}
= A\begin{bmatrix} x_1(t)\\ \vdots\\ x_n(t) \end{bmatrix} = A\mathbf{x}(t),
\quad \text{with } \mathbf{x}(0) = \mathbf{x}_0.
\]
If A is not a diagonal matrix then the system is called coupled and is hard to solve. Note that if A can be transformed to a nearly diagonal matrix, then the amount of coupling among the xi's can be reduced. So, let us look at J = Jordan CF(A) = S^{-1}AS. Then, using S^{-1}A = JS^{-1}, verify that the initial problem $\mathbf{x}'(t) = A\mathbf{x}(t)$ is equivalent to the equation $S^{-1}\mathbf{x}'(t) = S^{-1}A\mathbf{x}(t)$, which in turn is equivalent to $\mathbf{y}'(t) = J\mathbf{y}(t)$, where $\mathbf{y}(t) = S^{-1}\mathbf{x}(t)$ and $\mathbf{y}(0) = S^{-1}\mathbf{x}(0) = \mathbf{y}_0$. Therefore, if $\mathbf{y}$ is a solution of the second equation then $\mathbf{x}(t) = S\mathbf{y}(t)$ is a solution of the initial problem.

When J is a diagonal matrix, solving the second system is as easy as solving $y_i'(t) = \lambda_iy_i(t)$, for which the required solution is given by $y_i(t) = y_i(0)e^{\lambda_it}$.

If J is not diagonal, then for each Jordan block, the system reduces to

\[
y_1'(t) = \lambda y_1(t) + y_2(t),\ \ldots,\ y_{k-1}'(t) = \lambda y_{k-1}(t) + y_k(t),\ y_k'(t) = \lambda y_k(t).
\]
This problem can also be solved as, in this case, the solution is given by $y_k = c_0e^{\lambda t}$, $y_{k-1} = (c_0t + c_1)e^{\lambda t}$, and so on.
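Numerically, the whole IVP is usually handled at once through the matrix exponential, x(t) = e^{tA}x(0), which reproduces the block-wise formulas above. A brief scipy sketch for a single Jordan block (the matrix and initial vector are ad hoc):
\begin{verbatim}
import numpy as np
from scipy.linalg import expm

A = np.array([[2., 1.], [0., 2.]])      # one Jordan block J_2(2)
x0 = np.array([1., 1.])

def x(t):
    """Solution of x'(t) = A x(t), x(0) = x0, via the matrix exponential."""
    return expm(t * A) @ x0

# closed form for J_2(lambda): x1(t) = (x1(0) + t x2(0)) e^{lambda t},
#                              x2(t) = x2(0) e^{lambda t}
t = 0.7
print(np.allclose(x(t), [(1 + t) * np.exp(2 * t), np.exp(2 * t)]))   # True
\end{verbatim}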

7.3.2 Commuting matrices

Let P(x) be a polynomial and A ∈ Mn(ℂ). Then, P(A)A = AP(A). What about the converse? That is, suppose we are given that AB = BA for some B ∈ Mn(ℂ). Does it necessarily imply that B = P(A), for some polynomial P(x)? The answer is No: for example, every B commutes with A = I, but not every B is a polynomial in I, since polynomials in I are just the scalar matrices. We start with a set of remarks.

Theorem 7.3.1. Let A ∈ Mn(ℂ) and B ∈ Mm(ℂ). Then, the linear system AX - XB = 0, in the variable matrix X of size n × m, has a unique solution, namely X = 0 (the trivial solution), if and only if σ(A) and σ(B) are disjoint.

Proof. Let us assume that σ(A) and σ(B) are disjoint.

Since σ(A) and σ(B) are disjoint, the matrix $P_B(A) = \prod_{\lambda\in\sigma(B)}(A - \lambda I)^{m_\lambda}$, obtained by evaluating the characteristic polynomial $P_B(t) = \prod_{\lambda\in\sigma(B)}(t - \lambda)^{m_\lambda}$ of B at A, is invertible, as no eigenvalue of A is a zero of PB(t). So, let us look at the implication of the condition AX = XB. This condition implies that $A^2X = AXB = XBB = XB^2$ and hence, P(A)X = XP(B), for any polynomial P(t). In particular, $P_B(A)X = XP_B(B) = X \cdot 0 = 0$. As PB(A) is invertible, we get X = 0.

Now, conversely, assume that AX - XB = 0 has only the trivial solution X = 0. Suppose, on the contrary, that λ is a common eigenvalue of both A and B. So, choose nonzero vectors x ∈ ℂ^n and y ∈ ℂ^m such that (λ, x) is an eigen-pair of A and (λ, y) is a left eigen-pair of B. Now, define X0 = xy^T. Then, X0 is an n × m nonzero matrix and

\[
AX_0 - X_0B = A\mathbf{x}\mathbf{y}^T - \mathbf{x}\mathbf{y}^TB = \lambda\mathbf{x}\mathbf{y}^T - \lambda\mathbf{x}\mathbf{y}^T = 0.
\]
Thus, we see that if λ is a common eigenvalue of A and B then the system AX - XB = 0 has a nonzero solution X0, a contradiction. Hence, the required result follows. _

Corollary 7.3.2. Let A ∈ Mn(ℂ), B ∈ Mm(ℂ) and let C be an n × m matrix. Also, assume that σ(A) and σ(B) are disjoint. Then, the system AX - XB = C, in the variable matrix X of size n × m, has a unique solution, for any given C.
Proof. Consider the linear transformation T : Mn,m(ℂ) → Mn,m(ℂ) defined by T(X) = AX - XB. Then, by Theorem 7.3.1, Null(T) = {0}. Hence, by the rank-nullity theorem, T is a bijection and the required result follows. _
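Corollary 7.3.2 is the statement that the Sylvester equation AX - XB = C has a unique solution when σ(A) ∩ σ(B) = ∅. scipy's solver handles the form AX + XB = Q, so one passes -B; a short sketch with ad hoc example matrices:
\begin{verbatim}
import numpy as np
from scipy.linalg import solve_sylvester

A = np.array([[1., 2.], [0., 3.]])      # sigma(A) = {1, 3}
B = np.array([[-1., 0.], [1., -2.]])    # sigma(B) = {-1, -2}, disjoint from sigma(A)
C = np.array([[1., 0.], [2., 1.]])

X = solve_sylvester(A, -B, C)           # solves A X + X (-B) = C, i.e. A X - X B = C
print(np.allclose(A @ X - X @ B, C))    # True: the unique solution
\end{verbatim}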

Definition 7.3.3. [Toeplitz Matrix] A square matrix A is said to be of Toeplitz type if each (super/sub)-diagonal of A consists of the same element. For example, $A = \begin{bmatrix} b_1 & b_2 & b_3 & b_4\\ a_1 & b_1 & b_2 & b_3\\ a_2 & a_1 & b_1 & b_2\\ a_3 & a_2 & a_1 & b_1 \end{bmatrix}$ is a 4 × 4 Toeplitz type matrix and $B = \begin{bmatrix} b_1 & b_2 & b_3 & b_4\\ 0 & b_1 & b_2 & b_3\\ 0 & 0 & b_1 & b_2\\ 0 & 0 & 0 & b_1 \end{bmatrix}$ is an upper triangular Toeplitz type matrix.

Exercise 7.3.4. Let Jn(0) ∈ Mn(ℂ) be the Jordan block with 0 on the diagonal.

1.
If A ∈ Mn(ℂ) is such that AJn(0) = Jn(0)A then prove that A is an upper triangular Toeplitz type matrix.
2.
Further, if A, B ∈ Mn(ℂ) are two upper triangular Toeplitz type matrices then prove that
(a)
there exist a0, a1, …, a_{n-1} ∈ ℂ such that $A = a_0I + a_1J_n(0) + \cdots + a_{n-1}J_n(0)^{n-1}$.
(b)
P(A) is an upper triangular Toeplitz type matrix for any polynomial P(t).
(c)
AB is an upper triangular Toeplitz type matrix.
(d)
if A is invertible then A^{-1} is also an upper triangular Toeplitz type matrix.

To proceed further, recall that a matrix A ∈ Mn(ℂ) is called non-derogatory if Geo.Mulα(A) = 1, for each α ∈ σ(A) (see Definition 6.2.4).

Theorem 7.3.5. Let A ∈ Mn(ℂ) be a non-derogatory matrix and let B ∈ Mn(ℂ). Then, the matrices A and B commute if and only if B = P(A), for some polynomial P(t) of degree at most n - 1.

Proof. If B = P(A), for some polynomial P(t), then A and B commute. Conversely, suppose that AB = BA, σ(A) = {λ1, …, λk} and let J = Jordan CF(A) = S^{-1}AS be the Jordan matrix of A. Since A is non-derogatory, there is exactly one Jordan block for each eigenvalue, so
\[
J = \begin{bmatrix} J_{n_1}(\lambda_1) & & \\ & \ddots & \\ & & J_{n_k}(\lambda_k) \end{bmatrix}.
\]
Now, write $\overline{B} = S^{-1}BS = \begin{bmatrix} \overline{B}_{11} & \cdots & \overline{B}_{1k}\\ \vdots & \ddots & \vdots\\ \overline{B}_{k1} & \cdots & \overline{B}_{kk} \end{bmatrix}$, where $\overline{B}$ is partitioned conformally with J. Note that AB = BA gives $J\overline{B} = \overline{B}J$. Thus, verify that
\[
J_{n_1}(\lambda_1)\overline{B}_{12} = [J\overline{B}]_{12} = [\overline{B}J]_{12} = \overline{B}_{12}J_{n_2}(\lambda_2),
\]
and hence, by Theorem 7.3.1 (as λ1 ≠ λ2), $\overline{B}_{12} = 0$. A similar argument gives $\overline{B}_{ij} = 0$, for all i ≠ j. Hence, $J\overline{B} = \overline{B}J$ implies $J_{n_i}(\lambda_i)\overline{B}_{ii} = \overline{B}_{ii}J_{n_i}(\lambda_i)$, for 1 ≤ i ≤ k. Or equivalently, $J_{n_i}(0)\overline{B}_{ii} = \overline{B}_{ii}J_{n_i}(0)$, for 1 ≤ i ≤ k (using Exercise 7.1.6.3). Now, using Exercise 7.3.4.1, we see that $\overline{B}_{ii}$ is an upper triangular Toeplitz type matrix.

To proceed further, for 1 ≤ i ≤ k, define $F_i(t) = \prod_{j\ne i}(t - \lambda_j)^{n_j}$. Then, Fi(t) is a polynomial with deg(Fi(t)) = n - ni and $F_i(J_{n_j}(\lambda_j)) = 0$ if j ≠ i. Also, note that $F_i(J_{n_i}(\lambda_i))$ is a nonsingular upper triangular Toeplitz type matrix. Hence, its inverse has the same form and, using Exercise 7.3.4.2, the matrix $F_i(J_{n_i}(\lambda_i))^{-1}\overline{B}_{ii}$ is also an upper triangular Toeplitz type matrix. Hence,

\[
F_i(J_{n_i}(\lambda_i))^{-1}\overline{B}_{ii} = c_1I + c_2J_{n_i}(0) + \cdots + c_{n_i}J_{n_i}(0)^{n_i-1} = R_i(J_{n_i}(\lambda_i)), \ \text{(say)}.
\]
Thus, $\overline{B}_{ii} = F_i(J_{n_i}(\lambda_i))R_i(J_{n_i}(\lambda_i))$. Putting Pi(t) = Fi(t)Ri(t), for 1 ≤ i ≤ k, we see that Pi(t) is a polynomial of degree at most n - 1 with $P_i(J_{n_j}(\lambda_j)) = 0$, for j ≠ i, and $P_i(J_{n_i}(\lambda_i)) = \overline{B}_{ii}$. Taking P = P1 + ⋯ + Pk, we have
\begin{align*}
P(J) &= P_1\!\left(\begin{bmatrix} J_{n_1}(\lambda_1) & & \\ & \ddots & \\ & & J_{n_k}(\lambda_k) \end{bmatrix}\right) + \cdots + P_k\!\left(\begin{bmatrix} J_{n_1}(\lambda_1) & & \\ & \ddots & \\ & & J_{n_k}(\lambda_k) \end{bmatrix}\right)\\
&= \begin{bmatrix} \overline{B}_{11} & & \\ & \ddots & \\ & & 0 \end{bmatrix} + \cdots + \begin{bmatrix} 0 & & \\ & \ddots & \\ & & \overline{B}_{kk} \end{bmatrix} = \overline{B}.
\end{align*}
Hence, $B = S\overline{B}S^{-1} = SP(J)S^{-1} = P(SJS^{-1}) = P(A)$ and the required result follows. _