MATRIX THEORY

Chapter 2

MATRIX THEORY

The basic purpose of matrix theory is to provide a systematic mechanism for solving a system of linear equations:

a₁₁x₁ + a₁₂x₂ + ... + a_1nx_n = b₁,

a₂₁x₁ + a₂₂x₂ + ... + a_2nx_n = b₂,

………………………………

a_m1x₁ + a_m2x₂ + ... + a_mnx_n = b_m,

This system has "m" equations in "n" unknowns x₁, x₂, ... , x_n. The a_ij’s constitute the "matrix" A of the system, x_i’s the "solution vector" x, and b_i’s the "right-hand" or the "data vector" b, written, using matrix/vector notation, as

In the matrix/vector notation, the system of equations is abbreviated as: Ax = b. The definitions and operations on matrices/vectors that are introduced subsequently, make this equation not only symbolically but also mathematically meaningful.

Matrices

An m´ n matrix A = (a_ij), 1 £ i £ m, 1 £ j £ n, is an abstraction of a rectangular arrangement of the m´ n entries a_ijinto m-rows and n-columns:

The entry a_ij occurring in the i-th row and the j-th column is called the (i, j)-th entry of the matrix A. By changing i one moves along the rows of the matrix and for entries in a particular row i remains constant. Similarly, varying j takes one to different columns of the matrix, j for all entries in a particular column being the same.

In general the entries in a matrix could be all kind of objects. If these objects come from a set S , we say that the matrix is over the set S . For definitive results in the sequel we shall mostly confine ourselves to matrices over certain structured sets (e.g., those of real and complex numbers, or polynomials etc.) which seem to us most useful.

The set of all m´ n matrices over a field F is denoted by F ^m´ n , F ^m,n or just F ^mn, if no confusion is caused. Those not interested in a general field F may, with a little loss and a possible ambiguity at places, assume F to be particularized to mean either the set R of real numbers or the set C of complex numbers with the standard operations of addition and multiplication (as well as subtraction and division) defined for them. A matrix over R is called a real matrix and that over C a complex matrix. A general m´ n matrix A is called a rectangular matrix, while if m and n equal, A is termed a square matrix of order n. An m´ 1 matrix is also called an m-component or m-dimensional column vector and a 1´ n matrix is called an n-component or n-dimensional row vector. Here the usage "n-component vector" is to be preferred over "n-dimensional vector" since the latter only refers to a property of the space the vector is coming from and not of the vector itself, whereas the n-components of the vectors are, of course, visibly clear. The set of all n´ 1 column vectors over a field F is simply denoted by F ⁿ (instead of the bulkier notation F ⁿ´ 1 ). The particular case C ⁿ is called the unitary n-space and R ⁿ, the Euclidean n-space. Similarly. we will denote the set of all 1´ n row-vectors over a field F by F ⁽ⁿ⁾. A sub-matrix of a matrix A is a matrix obtained from A by deleting a certain number of rows and columns from A.

As is the case with all mathematical objects, two matrices A and B are said to be equal, written A = B iff they are identical, i.e., of the same size and with the same respective entries.

PROBLEMS

1. If A = B and B = C, verify that B = A and A = C (symmetry and transitivity of equality).

2. Check if the total number of sub-matrices of an m´ n matrix is the odd number 2^m+n – 2^m – 2ⁿ + 1?

3. If the 3´ 4 matrix A of units of assets of three Stockists is given as per the table:

	Item 1	Item 2	Item 3	Item 4
Stockist 1	10	4	8	2
Stockist 2	8	6	6	3
Stockist 3	15	8	4	4

and a commission agent buys/sells different items in rupees per unit of the items as given in the following 4´ 2 matrix C in the table:

	Buying price	Selling price
Item 1	20000	22000
Item 2	15000	20000
Item 3	200000	240000
Item 4	300000	320000

verify that the profits the agent will get in buying and then disposing of the total assests of each of the three holders is given by the column vector t with components calculable according to the formula

t_i = S ₁£ j£ 4 a_ij (c_j2 – c_j1 ), 1 £ i £ 3.

If all the holders wish to sell their entire assets while the agent can purchase the assets of only one holder, which one should he approach to make a maximum percentage of profit on his investment?

4. Show the number of p´ q sub-matrices of an m´ n matrix is [m! n!]/[(m-p)!(n-q)!p! q!] = ^mC_p ´ ⁿC_q.

Matrix Addition

If A and B are two m´ n matrices over a field F , their addition or sum matrix A+B is defined as an m´ n matrix C written as C = A + B, whose (i, j)-th element c_ij is given by: c_ij = a_ij + b_ij, 1 £ i £ m, 1 £ j £ n. The difference E = A - B of two matrices A and B is defined similarly (i.e., by: e_ij = a_ij – b_ij , 1 £ i £ m, 1 £ j £ n). Note that a zero matrix (i.e., a matrix with all entries 0) added to, or subtracted from, a matrix does not change it and that this property is enjoyed only by the zero matrix.

Negative of a matrix A, written -A, is the matrix the (i, j)-th entry of which is -a , the negative of the corresponding entry in A. It follows that A - B = A + (-B).

Example. .

PROBLEMS

1. For matrices over a field, verify that A+B = B+A, known as the commutativity property of addition, and that A+(B+C) = (A+B)+C, the associativity property of matrix addition.

2. Let the box X contain 10 white balls and 20 red balls and the box Y contain 25 red and 15 white balls. If the contents of X and Y are cleared into an empty box Z and we wish to know the number of white and red balls in Z, what is wrong with the following statements:

x = , y = , z = ?

If one wishes to keep x as it is, correct y and z so as to make sense.

3. Set up an xy-coordinate system. Let r₁ denote the velocity of a river in the x-direction and r₂ in the y-direction. Let the respective components of the velocity of a boat relative to the river be denoted by b₁ and b₂. Find the absolute velocity components v₁ and v₂ of the boat in the x and y directions. If one puts

r = , b = , v =,

find the relation between the vectors r, b and v. Can you express b in terms of r and v?

4. Let {A_k}_k³ 1 denote a sequence of matrices. Discuss what meaning should be ascribed to the writing

B = A₁ + A₂ + ... + A_j-1 + A_j = S ₁£ k£ j A_k?

What would you like to mean by an infinite sum

C = A₁ + A₂ + ... + A_j + ... = S _k³ 1 A_k?

Could you imagine a situations where an infinite sum could be meaningful? Would an infinite sum be meaningful if the field under consideration is not a subfield of C ?

5. The stocks (in kilograms) of three milkmen are as follows:

	Milk	Butter	Cheese	Curd
Milkman 1	110	14	18	28
Milkman 2	108	17	16	35
Milkman 3	115	18	14	41

If at the end of the day the stock position is according to

	Milk	Butter	Cheese	Curd
Milkman 1	15	4	13	11
Milkman 2	8	7	11	17
Milkman 3	14	5	9	15

find the amounts of various items the milkmen sold during the day. What has the matrix subtraction to do with the calculations? Use matrix addition to verify that the items sold and the items remaining add up to the morning stocks.

6. Let the components of a force F in the x, y and z directions be denoted by F₁ , F₂ and F₃, respectively, so that F may be represented as a row vector (F₁, F₂, F₃). If G is another force and both F and G act on the same particle, what is the resultant force H on the particle. If the particle moves from the point P₁ = (x₁, y₁ , z₁) to the point P₂ = (x₂, y₂, z₂), find the work done by the forces F, G and the resultant H of F and G. What is the relation between these three works done?

7. If p is a positive integer pA is defined as: 1A = A, 2A = A+A, 3A = A + 2A and (p+1)A = A + pA, in general. We put 0A = 0 and -pA = p(-A). Verify that if p and q are any integers, (p + q)A = pA + qA, and that (pq)A = p(qA).

Matrix Multiplication

If A is m´ n and B is n´ p, that is the number of columns of A equals the number of rows in B, their multiplication or product AB is defined as an m´ p matrix C, written C = AB, for which

Note that AB is an m´ p matrix and that in order for it to be defined the number of columns in A has to equal the number of rows in B. Moreover, the order in which A and B occur in AB is important.

Example: If , out of the possible nine juxtapositions AA, AB, AC, BA, BB, BC, CA, CB and CC, the four products AA, AB, BC and CC are not defined, whereas the remaining five products AC, BA, BB, CA, and CB are defined and are given by

In a product AB, A is called the pre-multiplying and B as the post-multiplying matrix. Thus, AB is the pre-multiplication of B by A, or the post-multiplication of A by B.

If A = (a_ij ) is an m´ n matrix, the n´ m matrix B defined by b_ij = a_ji is called the transpose of the matrix A and is denoted by A¢ , A^tr or A^t. If A and C are both m´ n we note that both the products A'B and BA' are defined and are respectively n´ n and m´ m matrices.

An n´ n matrix B for which b_ij = 0, if i ¹ j, and b_ii = 1 is called the n´ n identity matrix and is denoted by I_n. When the size n is understood from the context one drops n form I_n and writes it simply as I. Thus the symbol I commonly denotes identity matrices of various sizes. It is clear from the definition of matrix multiplication that IA = A and AI = A for any m´ n A, where the two identity matrices are respectively m´ m and n´ n.

PROBLEMS

1. If x = Ay and y = Bz, is it true that x = (AB)z?

2. Verify that the matrix multiplication is associative, i.e., A(BC) = (AB)C, (A l´ m, B m´ n, C n´ p). Construct examples to show that it is not commutative, in general, i.e., AB = BA does not hold for all n´ n A and B. Also prove that the matrix multiplication is both left and right distributive over the matrix addition, i.e., A(B + C) = AB + AC, (A m´ n, B, C n´ p) and (A + B)C = AC + BC, (A, B m´ n, C n´ p).

3. By now you can multiply two matrices and also know at least one situation where it could be useful. Can you imagine and interpret a case where you may need division of a matrix by another?

4. Let e = ad - bc ¹ 0. Let . Compute AB and BA. If x = Ay, show that y = Bx? If F is a matrix and z a vector such that Ay = Fz, find a matrix C such that y = Bz?

5. Find the number of distinct 2x2 matrices A ¹ 0, with entries 0 and 1 only that satisfy A² = 0, where the field F under consideration equals (a) R , (b) C , (c) Z ₂, & (d) Z ₃.

6. Let A and B be 2x2 matrices such that AB = I. Prove that BA = I.

7. Prove that there exist matrices A and B such that AB - BA = , iff a + d = 0.

8. If BA = A for all m´ n A, prove that B = I_m and that if AC = A for all m´ n A, then C = I_n.

9. Since A(BC) = (AB)C, both the sides could be commonly represented as ABC. Then ABC could be evaluated in two ways. Similarly, (A(BC))D = ((AB)C)D = A((BC)D) = A(B(CD)) = (AB)₁(CD)₂ = (AB)₂(CD)₁ , where in the last product, e.g., the subscripts refer to the sequence in which the product is

formed and which is that first we evaluate CD and then AB and finally multiply AB with CD. All of these six equal products are commonly denoted by ABCD. Thus ABCD could be evaluated in six distinct ways. Give a similar meaning to the product A₁A₂A₃... A_n-1A_n of n matrices and show that this product can be evaluated in precisely (n-1)! distinct ways.

10. Cite and discuss some physical situations where the matrix multiplication could be a useful tool in solving some problem.

11. Consider the triadiagonal and the bi-diagonal matrices

show that A = LU, where b ₁ = b₁, g ₁ = c₁/b ₁; b _j = b_j –a_jg j-1, j = 2, 3, ... , n, and g _j= c_j/b _j, j = 2, 3, ... , n-1, provided that b _j, 1 £ j £ n-1 are all different from zero. Further, show that the problem of solving the system of linear equations Ax = y gets transformed to that of solving the two related systems Lz = y and Ux = z, which, if none of the b _j 's are zero, could be solved as follows:

z₁= y₁/b ₁, z_j = (y_j – a_jz_j-1)/b _j, j = 2, 3, ... , n; x_n = z_n, x_j = z_j - g _jz_j+1, j = n-1, n-2, ... , 1.

12. If the n´ n matrices A, B commute, prove that the binomial expansion (A+B)^m = S ₀£ k£ m ^mC_kA^m-kB^k, is valid. Construct examples of non-commutative matrices where (a) the expansion is valid, and (b) the expansion is not valid.

13. A matrix A = (a ) is said to be upper-triangular if a_ij = 0, whenever i > j. On the other hand, A is called lower-triangular if a_ij = 0, whenever i < j. Prove that the product of two upper triangular matrices is upper triangular and that of two lower triangular ones is lower triangular.

14. Entries in a matrix just below the main diagonal are said to constitute the sub-diagonal of the matrix. A square matrix in which every entry below the sub-diagonal is zero is called an upper-Hessenberg or just a Hessenberg matrix. Show that a product of a Hessenberg matrix and an upper triangular matrix is a Hessenberg matrix.

15. Banded matrices: A matrix A is called a (2k+1)-band matrix if a_ij = 0, whenever |i-j| > k. Thus a triadiagonal matrix is a 3-band matrix, a pentadiagonal matrix is a 5-band matrix and so on. Verify that the product of two triadiagonal matrices is a pentadiagonal matrix. Generalize to prove that the product of a (2k+1)-band matrix with a (2m+1)-band matrix is a (2[m+2k]+1)-band matrix.

16. If A is an n´ n upper triangular matrix, show that Ax = b is solvable for every b iff a_ii ¹ 0, 1 £ i £ n. What happens if A is lower triangular?

17. The obvious way of solving an upper triangular system (a system whose matrix is upper triangular) is known as the method of backward substitution. Similarly, for a lower triangular system (a system whose matrix is upper triangular) we may use the method of forward substitution. Use appropriate simplifications and backward/forward substitutions to solve the two systems:

x₁ + 2x₂ + 3x₃ + ...………………………+ nx_n = n²,

x₂ + 2x₃ + …………………… + (n-1)x_n = (n-1)²,

……………………………………………………………..

x_k + 2x_k+1 + ... + (n-k+1)x_n = (n-k+1)²,

………………………………………

x_n-1 + 2x_n = 4,

x_n = 1;

and

z₁ = n,

z₁ + z₂ = n-1,

z₁ + z₂ + z₃ = n-2,

…………………………………………………………

z₁ + z₂ + z₃ + …………………………+ z_n-1 = 2,

z₁ + z₂ + z₃ + …………………………+ z_n-1 + z_n = 1,

18. Positive integral powers of a square matrix are defined as follows: A¹ = A, A² = A¹A = AA, A³ = A²A = AAA, and in general A^p+1 = A^pA, where p is a positive integer. One puts A⁰ = I, the identity matrix of the same order as A. If there exists a matrix B such that BA = I, we define A^-1 = B and A^-p = (A^-1)^p . Thus A^q is defined for any integer q, where for negative q existence of the matrix B = A^-1 is assumed. Show that if p and q are arbitrary integers, there hold: (i) A^pq = (A^p)^q,and (ii) A^p+q = A^pA^q.

19. Let A be a matrix with real entries. Suppose the matrices A², A³, A⁴, ... are all having only positive entries. Is it necessary that A must also be a matrix of the same type?

20. Let A be a matrix with complex entries. If the powers A², A³, A⁴, ... are all real matrices with all entries non-zero, is it necessary that A, too, is a matrix of the same type?

Scalar Multiplication

The Product of a matrix A by a scalar t is defined as the matrix C = tA where c_ij = ta_ij, i.e., to get tA, each entry of A is multiplied by the scalar t. An m´ m matrix B is called a scalar matrix if for some scalar c, b_ii = c and b_ij = 0, if i ¹ j. Thus a scalar matrix is a matrix of type cI, where I is an identity matrix. It is then easy to see that cA = (cI)A, i.e., scalar multiplying A by c is equivalent to pre-multiplying A by the scalar matrix cI, where I is m´ m if A is m´ n. Since (tI_m)A = tA = A(tI_n), often "At" is also used denote the scalar multiplication of A by the scalar t, and thus we have At º tA.

Quite often one identifies a scalar s with the 1´ 1 matrix (s); but then sA or As are not to be confused with the matrix products (s)A or A(s), which may not be defined at all. Indeed (s)A is defined iff A is 1´ n and then of course (s)A = sA. Similarly A(s) is defined iff A is m´ 1, and then A(s) = sA.

PROBLEMS

1. Let U denote an n´ n matrix with 1's along the super-diagonal and 0's elsewhere:

Compute U², U³, ... , U^n-1, Uⁿ. Hence, if I denotes the n´ n identity matrix (i.e., the matrix whose all diagonal elements are 1 and the off diagonal elements 0) and p = min{m, n-1}, show that

(l I+U)^m = S ₁£ k£ p ^mC_k l ^m-k U^k.

2. Let A be m´ n, x is n´ 1 and b = Ax. Let C_j denote the n´ 1 vector with i-th component as a_ij. C_j is called the j-th column vector of A. Verify that b = S ₁£ j£ n C_jx_j = C₁x₁+ C₂x₂ + ... + C_nx_n = S ₁£ j£ n x_jC_j.

3. Let A be a 3´ 2 matrix. Prove that for whatever 2´ 2 matrix B there exists a 2´ 3 matrix C such that CA = B iff no column vector of A is a scalar multiple of the other.

4. Let A be a 2´ 2 matrix over a field F with ch(F ) ¹ 2. If A has only two distinct entries a and b both occurring twice and a + b ¹ 0, show that there exist two vectors v₁ and v₂ none of which is a scalar multiple of the other such that Av_i = c_iv_i for some scalar c_i, i = 1, 2. Moreover, show that if v ¹ 0 is any vector and c is a scalar such that Av =cv, then c = c₁ or c = c₂ and then: (i) if c₁ ¹ c₂ , v is a scalar multiples of v₁ or v₂ according as c = c₁ or c = c₂ and (ii) if c₁ = c₂ , v is a sum of scalar multiples of v₁ and v₂.

5. Prove that: (i) (a + b)A = aA + bA, (ii) (ab)A = a(bA), and (iii) a(A + B) = aA + aB. What names would you like to give to these laws?

6. Let A =, x, y Î F ², and c Î F . If Ax = cx, prove that cx = 0. If Ax = Ay = 0, show that x₁y₂ = y₁x₂. Arrive at the same conclusions if A = .

7. If f(x) is a polynomial, A is an n´ n, B an n´ m and C an m´ n matrix over a field F and BC = I, the identity matrix show that f(CAB) = C[f(A)]B.

Block-Partitioned Matrices and Block Operations

By drawing horizontal and vertical lines across a matrix we get a partitioning of a matrix into blocks of sub-matrices such that the sub-matrices along the same row block have the same number of rows and those along the same column block have the same number of columns. Given such a configuration of a matrix A, we say that the matrix A has been given in a block-partitioned form or that A is a block-partitioned matrix. Thus if A is partitioned into p-row blocks and q-column blocks, the (i, j)-th block A_ij, being a sub-matrix of A of size p_i´ q_j , we write

If C₁, C₂, ... ,C_j, ... , C_n denote the column vectors of an m´ n matrix A, the column partitioned form of A is as: A = [C₁ | C₂ | … | C_j | … | C_n]. The row partitioned form of A is defined analogously.

Using the block-partitioned forms, the operations of addition, multiplication and scalar multiplication on matrices can be performed en-block. Thus, let A = (A_ij) and B = (B_ij) be two block partitioned matrices. Then, in the respective cases, assuming that sizes of A and B 's are compatible, in the block-partitioned form one has A+B = (A_ij +B_ij), AB = (S _k A_ikB_kj), and a A = (aA_ij), where on the right hand sides is shown the (i, j)-th block of the matrices on the left hand sides. The compatibility in the above refers to the sizes of blocks being such that the sub-matrices A_ij +B_ij and A_ikB_kj make sense. A verification of these is straightforward evaluation.

PROBLEMS

1. Let B and C be real n´ n matrices and b, c, y and z real n´ 1 column vectors. Let A = B + iC, and for1 £ j £ n, a_j = b_j + ic_j, and x_j = y_j + iz_j. Verify that the n´ n complex system of equations Ax = a is equivalent to the 2n´ 2n real system of equations, written in a block partitioned form as:

2. If A = diag (A₁₁,A₂₂, ... ,A_kk) is a block-diagonal n´ n matrix where the diagonal blocks A_ii are square matrices and B is another n´ n matrix such that AB = I, the identity matrix, show that also the matrix B has the form diag (B₁₁,B₂₂, ... ,B_kk). Generalize the result for block triangular matrices.

3. Use partitioning to explicitly verify that the columns of AB are linear combinations of the columns of A and that the rows of of AB are linear combinations of the rows of B.

Vector Spaces of Matrices

With respect to the operations (+) (matrix addition) and (.) (scalar multiplication of matrices) the set V = F^{m´ n} of m´ n matrices over a field F has the following properties:

(1) For A, B Î V and c Î F, A + B Î V and cA Î V. [V is closed under (+) and (× )].

(2) A + B = B + A, for A, B Î V.

(3) A + (B + C) = (A + B) + C, for A, B, C Î V.

(4) There is a 0 (the zero matrix) Î V, such that A + 0 = A for every A Î V.

(5) For every A Î V there is a -A (negative of A, i.e., the matrix whose every entry is the negative of the corresponding entry in A) Î V such that A+(-A) = 0.

(6) 1.A = A, where A Î V and 1 is the unity in F.

(7) a(bC) = (ab)C, where a, b Î F and C Î V (associativity with respect to scalar multiplication.

(8) (a+b)C = aC+bC, where a, b Î F and C Î V (distributivity of scalar multiplication over a scalar sum).

(9) a(B + C) = aB + aC, where a Î F and B, C Î V (distributivity of scalar multiplication over a vector sum).

The vector spaces Fⁿ and F⁽ⁿ⁾ of n-componet column and row vectors, respectively, are simply the vector spaces F^{n´ 1} and F^{1´ n} . More generally, in the sequel, a set V is called a vector space over a field F if there exist operations of vector addition + : V ´ V ® V and scalar multiplication • : F ´ V ® V for which (1)-(9) hold. A subset W of a vector space V which is closed under the operations of addition and scalar multiplication (i.e., + : W ´ W ® W and • : F ´ W ® W) is called a subspace of V.

PROBLEMS

1. If W₁ and W₂ are subspaces of F^{m´ n} show that W = W₁ Ç W₂ is also a subspace. Also prove that W₁ È W₂ is a subspace iff one of W₁ and W₂ is contained in the other.

2. Defining W = W₁ + W₂ = {w₁ + w₂ : w_i Î W_i }, where W_i 's are subspaces of F^{m´ n}, show that W is a subspace of F^{m´ n}.

3. Show that the equation (W₁ + W₂ ) Ç W₃ = W₁ Ç W₃ + W₂ Ç W₃ , is not necessarily true if W_i 's are subspaces.

4. Let W₁ and W₂ denote subsets of F^{2´ 2} of matrices of type and , respectively.

(i) Show that W₁ , W₂ , W₁ Ç W₂ and W₁ + W₂ are subspaces of F^{2´ 2}.

(ii) Note that W₁ and W₂ are each described by three parameters (namely x, y, z and a, b, c) and that no two will do. Now find the number of minimal parameters that descibe W₁ Ç W₂ and W₁ + W₂ .

The Standard Inner Product In Rⁿ and C ⁿ

Let K denote any of the fields R or C . Consider the vector space K ⁿ over K . The sum S ₁£ i£ n x_iy_i* = (x,y), where x = (x₁,x₂, ... ,x_n)¢ , and y = (y₁,y₂, ... ,y_n)¢ Î K ⁿ and y* denotes the complex-conjugate of y, is called the standard inner-product in K ⁿ. Note that when K = R , the complex conjugation is not required and we simply have (x,y) = S ₁£ i£ n x_iy_i.

PROBLEMS

1. Verify the following properties of the standard inner product: (i) (x,x) ³ 0 and the equality holds iff x = 0, (ii) (x+y,z) = (x,z)+(y,z), (iii) (cx,y) = c(x,y), (iv) (x,y) = (y,x)* .

2. Show that the complex l minimizing f(l ) = (x-l y,x-l y) is given by l = (x,y)/(y,y). Deduce that (x,y)(y,x) £ (x,x)(y,y). What if we minimize f(l ) with respect to a real l ?

3. Let A be an m´ n matrix, x Î Rⁿ and y Î R^m . If (Ax,y) = 0 for all x and y, show that A = 0, the matrix with all zero entries.

4. If for A = (a_ij), its conjugate transpose or adjoint A* is defined as the matrix whose (i,j)-th entry equals a_ji* , the complex conjugate of the (j,i)-th entry of A, prove that (Ax,y) = (x,A* y).

5. If A is a complex n´ n matrix and A* = A (i.e., A is self-adjoint), show that (i) (Ax,x) is real for all x Î ℂⁿ, (ii) if for some l Î ℂ and 0 ¹ x Î ℂⁿ, Ax = l x, then l Î ℝ and (iii) if Ax = l x and Ay = m y, where l ¹ m and x, y ¹ 0, then (x,y) = 0.

Linear Independence of Matrices

Let A₁, A₂, ... , A_k be m´ n matrices over a field F. We call them linealy-dependent or simply dependent if there exist c₁, c₂, ... , c_k Î F, not all zero, such that c₁ A₁ + c₂ A₂ + ... + c_k A_k = 0, the m´ n zero-matrix. A finite or infinite set of matrices is called linearly dependent or dependent if a finite subset of it is dependent. A set S of matrices (and the members of it) are called linearly-independent if S is not linearly-dependent. It follows that if S is linearly-independent then A₁, A₂, ... , A_k Î S, c₁, c₂, ... , c_k Î F and c₁ A₁ + c₂ A₂ + ... + c_k A_k = 0 imply that c₁ = c₂ = ... = c_k = 0.

Linear dependence or independence of vectors in Fⁿ and F⁽ⁿ⁾ is synonymous with their linear dependence or independence as elements of F^{n´ 1} and F^{1´ n} , respectively. The notion of linear independence in a general vector space is exaclty analogous, but will be taken up only later.

PROBLEMS

1. Find three vectors in R ³ which are linearly dependent but are such that any two of them are linearly independent.

2. Prove that the four vectors (1,0,0), (0,1,0), (0,0,1) and (1,1,1) in R ⁽³⁾ form a linearly dependent set, but that any three of them are linearly dependent.

3. Give an upper bound for the number of elements in a linearly independent subset of (i) F^{m´ n}, (ii) the set of symmetric matrices in F^{n´ n}, (iii) the set of upper triangular matrices in F^{n´ n} and (iv) the set of diagonal matrices in F^{n´ n}. Is the upper bound sharp?

4. For 1 £ i £ m and 1 £ j £ n, let E_ij denote the m´ n matrix whose (i,j)-th entry is 1 and the remaining entries are all zero. Prove that the matrices E_ij are linearly independent.

5. Prove that the row vectors of an n´ n upper triangular matrix A are linearly independent iff a_ii ¹ 0, 1 £ i £ n. Deduce that the column vectors of an n´ n upper triangular matrix A are linearly independent iff a_ii ¹ 0, 1 £ i £ n.

Some Standard Matrices

In the sequel, unless otherwise stated, elements of matrices will be scalars from a general field F or ℝ (real line) or ℂ (complex plane). The square matrix I = (d _ij), where d _ij is the Kronecker-delta

is called an identity matrix. (We have come across this matrix earlier.) Obviously IA = A, AI = A, where A is any m´ n matrix, the first I being m´ m and the second n´ n. A matrix all of whose entries are zero is called a zero or a null matrix.

A matrix B is called a left inverse of a matrix A if BA = I, an identity matrix. In this situation A is equivalently called a right inverse of B. Note that if A is m´ n, here, then B is n´ m. If AB = BA = I, B is called an inverse of A (and is written as B = A^-1 ). Here A has to be a square matrix. Later we will see that in this case AB = I Þ BA = I. Moreover, an inverse, if it exists, will be seen to be unique. Also, the symmetry of the definition implies that if B is the inverse of A then A is the inverse of B. A matrix possessing an inverse is called an invertible or a non- singular matrix. If A^-1 does not exist, A is called singular or non-invertible.

A square matrix is called a diagonal matrix if all entries above or below the main diagonal in A are zero. For a diagonal matrix D one writes D = diag(d₁,d₂,d₃, ...,d_n), where d_ii is the i-th diagonal entry of D. A matrix T is called upper triangular if entries in T below the main diagonal are zero. Similarly L is called lower triangular if entries in L above the main diagonal are all zero. An upper triangular matrix is called strictly upper triangular if also its diagonal entries are zero. Similarly, a strictly lower triangular matrix is a lower triangular matrix whose diagonal entries are also all zero.

A¢ º A^T º A^t , called the transpose of matrix A = (a_ij), is the matrix A¢ = (a_ji), whose (i,j)-th element is the (j,i)-th element of A. Ā, called conjugate of a complex matrix A is the matrix whose (i,j)-th element is the complex conjugate of the (i,j)-th element of A. Thus Ā = (ā_ij). The conjugate-transpose (sometimes, also called the adjoint) of a complex matrix A is the matrix A* º Ā¢ º (A¢ )^-, obtained by conjugating the transpose, or equivalently, transposing the conjugate of the matrix A.

A matrix A is called symmetric if A = A¢ and skew-symmetric if A¢ = -A. A symmetric matrix with real entries is called a real-symmetric matrix. A is called hermitian if A* = A and skew-hermitian if A* = -A. A is called orthogonal if A^-1 = A¢ , i.e. A¢ A = AA¢ = I. An orthogonal matrix with real entries is called real-orthogonal. A matrix is called unitary if A^-1 = A* , i.e. A* A = AA* = I. Clearly, a real Hermitian matrix is symmetric and a real symmetric matrix is hermitian. Also, a real unitary matrix is orthogonal and a real-orthogonal matrix is unitary.

An m´ n A is called sub-unitary if A* A = I. Thus, a square sub-unitary matrix is unitary. Similarly, if A¢ A = I and A is m´ n, it is called sub-orthogonal (for m = n it becomes orthogonal). A sub-orthogonal matrix with real entries is called a real-sub-orthogonal matrix.

A matrix A with real or complex entries is called positive definite if x* Ax º S ₁£ i£ nS 1£ j£ n a_ij(x_i)^-x_j > 0, for every n´ 1 non-zero complex vector x. If x* Ax ³ 0 for all complex n´ 1 vectors x, A is called a positive semidefinite matrix. If -A is p.d. (positive definite), A is called negative definite (n.d) and similarly, if -A is positive semidefinite (p.s.d), A is called negative semidefinite (n.s.d).

Let A be a square n´ n matrix. The trace of A, written Tr A or tr A, is the sum of its diagonal entries: tr A = S ₁£ i£ n a_ii. By changing the order of the double summation, one easily verifies that tr AB = tr BA, where A is m´ n and B is n´ m.

For a square A, (l ,x) is called an eigen-pair, l an eigen-value and x a corresponding eigen-vector of A if Ax = l x, and x ¹ 0, where l Î F, A is n´ n and x is n´ 1.

PROBLEMS

1. Every real n´ n matrix A can be uniquely written as a sum of a real symmetric matrix and a real skew-symmetric matrix.

2. Every complex n´ n matrix A can be uniquely written as a sum of a Hermitian matrix and a skew-Hermitian matrix.

3. Every diagonal matrix commutes with every other diagonal matrix.

4. If D is a diagonal matrix with distinct diagonal entries and A is a matrix such that DA = AD, then A is also a diagonal matrix.

5. If S is an n´ n upper (lower)-triangular matrix (i.e., entries below (above) the diagonal are all zero) and UT = I, then T is an upper (lower)-triangular matrix. Moreover, the set of all upper (lower) triangular matrices is closed with respect to the operations of sums, differences and scalar, as well as, matrix products.

6. Show that a diagonal matrix D = diag (d₁,d₂, ... ,d_n) has an inverse iff none of d_i 's is zero. Moreover, if the inverse exists show that it is given by D = diag (d₁^-1,d₂^-1, ... ,d_n^-1).

7. Eigen-values of a triangular-matrix are its diagonal entries. (Prove.)

8. Prove that tr(ABC) = tr(BCA) = tr(CAB) and find couter examples for the other non-cyclic permutations.

9. Let A be a complex matrix (i.e., with complex entries). Prove that tr(A* A) = 0 iff A = 0. Find an example of a matrix A ¹ 0, for which tr(A²) = 0.

10. If A is p.d., can you prove that A is necessarily hermitian?

11. Prove that A^*A is p.s.d. for every complex A and it is p.d. if Ax ¹ 0 for every x ¹ 0.

12. A², even for a real matrix A, need not be p.s.d., p.d., positive or even non-negative. Find examples to this effect.

13. If A and B are n´ n matrices, AB = I implies that BA = I. Can you prove it? Construct examples to show that if A is m´ n and B is n´ m and m < n, it is possible that AB = I (the m´ m identity matrix) but that it is impossible that BA = I (the n´ n identity matrix).

14. Prove that a 2´ 2 matrix A = is invertible iff ad-bc ¹ 0, in which case the inverse of A is given by A^-1 = (ad-bc)^-1.

15. If D is a diagonal matrix with positive diagonal entries and A is a complex matrix such that D² commutes with A, show that A commutes with D.

Elementary Row And Column Operations

The elementary row operations (EROs) on an m´ n matrix A are of the following three types:

(i) Multiplying i-th row by a non-zero scalar c.

(ii) Interchanging i-th and j (¹ i)-th rows.

(iii) Adding to i-th row a scalar c times j (¹ i)-th row.

The operations (i)-(iii) may, respectively, be denoted by cR , Ri « R_j and R_i + cR_j. It is easily seen that their effects are reversible (through the inverse elementary row operations c^-1R , R_i « R_j and R_i - cR_j, respectively).

Let A and B be two m´ n matrices. A is said to be row equivalent to B (A ~ B) if A can be obtained from B by performing a finite number of elementary row operations on B. It follows that the row equivalence is an equivalence relation on F^{m´ n} (i.e., A ~ A, A ~ B Þ B ~ A and A ~ B and B ~ C Þ A ~ C).

A motivation for studying elementary row operations comes from systematizing methods for solving a system of linear equations and related questions. Consider, for instance the problem of determining when two m´ n consistent systems Ax = b and Cx = d are equivalent. A consistent system means that it

possesses at least one solution, i.e., the solution set is non-empty. By equvalence of two consistent systems we mean that the two systems have exactly the same set of solutions.

Now, starting with a system Ax = b if we apply a succession of the following three types of operations: (1) multiplying both sides of the i-th equation by a non-zero scalar c; (2) interchanging i-th and j (¹ i)-th equations; and (3) adding to i-th equation a scalar c times j (¹ i)-th equation, on the system we get another equivalent system. This is because all these operations are reversible and so if x satisfies the original system it satisfies the transformed one and vice versa. Note the similarity of these operations with the corresponding elementary row operations. Now the interesting result in this direction is the following converse: "Two consistent m´ n systems are equivalent only if one can be obtained from the other in this fashion, i.e., by a succession of the above three types of operations."

Now, suppose Ax = b is an n´ n system possessing a unique solution, given by x_i = d_i, 1 £ i £ n, say. But this writing of the solution again constitutes a system of linear equations of the form Cx = d, with C as the identity matrix. Hence by the above equivalence it follows that the solution can indeed be obtained simply by an appropriate succession of the operations of type (1)-(3). This idea of Gauss leads us to the well-known "elimination method" for solving Ax = b : Use one of the equations to eliminate x₁ from the remaining n-1 equations. Use one of the n-1 equations to eliminate x₂ from the remaining n-2 equations, and so on till you eliminate x_n-1 from the remaining 1 equation which would then contain only x_n . This process is called forward elimination. The n-equations that you have after the forward elimination, are of type Ux = v, where U is an upper triangular matrix. Its solution can be found by the so-called back substitution, which consists in determining x_n from the last equation, substituting for x_n and determining x_n-1 from the (n-1)-th equation, substituting for x_n and x_n-1 and determining x_n-2 from the (n-2)-th equation, and so on till finally substituting for x_n, x_n-1, x_n-2, ... , x₃, x₂ and determining x₁ from the first equation.

One more point: The m´ n system Ax = b is characterized by the matrix A of the system and the right hand vector b. If we construct the m´ (n+1) matrix [A | b], whose first n columns are the columns of the matrix A and the last column is a copy of the vector b, the matrix [A | b] contains all information regarding the system Ax = b excepting for the dummy notation for the vector x, which is of no mathematical consequence anyway. This matrix [A | b] is called the augmented matrix of the system Ax = b. Using the notion of the augmented matrix, we see that if Ax = b is transformed into Cx = d by a succession of the operations (1)-(3), the corresponding operations (i)-(iii) applied on [A | b] convert it to [C | d]. That is why for studying a system Ax = b of equations, we could confine our attention to the augmented matrix [A | b] and the elementary row operations (i)-(iii). The studies subsequently reveal as to how to deal with the inconsistent systems, or the systems that do not possess a unique solution. For the present, we give the following definition: Two m´ n systems (consistent or not) Ax = b and Cx = d are called equivalent if one can be obtained from the other by a finite sequence of operations of type (1)-(3), or equivalently, if [A | b] and [C | d] can be obtained from each other by a succession of elementary row operations.

Let A be m´ n. Each of the above mentioned three elementary row operations (i)-(iii) on A corresponds to premultiplication of A by an m´ m matrix, obtainable by performing the same operation on an m´ m identity matrix. The respective pre-multiplying matrices for (i)-(iii) are thus:

That the premultiplications by these E₁, E₂, E₃ results in the indicated EROs (i)-(iii), is a matter of direct verification. Matrices of above three types are called elementary matrices.

In (i)-(iii) the word 'row' changed to 'column' results in the elementary column operations (ECOs). These in turn correspond to post multiplication by n´ n matrices:

Note that in the column case, the matricesf F₁, F₂ in the cases (i)-(ii), excepting for their sizes, have the same structure while in (iii) in the column case we get F₃ which is the transpose of the matrix E₃ of the row case (iii).

PROBLEMS

1. Are the following pairs of complex systems of linear equations equivalent? If so, express each equation in each system as a linear combination of the equations in the other system:

(a) x₁ - ix₂ = 0, x₁ + ix₂ = 0,

x₁ + x₂= 0, x₁ + x₂ = 0.

(b) x₁ + x₂ + x₃ = 0, x₁ + 2x₂ + x₃ = 0,

x₁- x₂ + x₃ = 0, 2x₁ + x₂ + 2x₃ = 0.

x₁ + x₂ + x₃ = 0.

x₂ - ix₃+ 5x₄= 0, 5x₁ + 4x₂ + ix_{3 +}0x₄ = 0.

2. For what choice of the parameter a are the following systems of linear equations over the field R consistent:

(a) x₁ + x₂ = 10, (b) x₁ + x₂ = 10, (c) a x₁ + a x₂ = 10,

a x₁ + x₂ = 20. a x₁ + a x₂ = 20. a x₁ + a x₂ = 20.

(d) x₁ + x₂ - x₃ = 1, (e) x₁ + x₂ - x₃ = 1,

x₁ - x₂ + x₃ = 1, x₁ + x₂ + x₃ = 1,

a x₁ + x₂ + x₃ = 1. a x₁ - x₂ + x₃ = 1.

3. Devise an algorithm (a general strategy, or procedure) for expressing (a) a non-singular n´ n matrix A and (b) its inverse A^-1, as products of elementary matrices. The algorithm should fail iff A is singular. Practise the algorithm on the following matrices:

4. Prove that the matrices E₁, E₂, E₃ and F₁, F₂, F₃ obtained by performing the elementary row and column operations (i)-(iii) on an identity matrix are invertible and find their inverses.

Row-Reduced Echelon Form

A matrix is said to be in a row-reduced echelon form if:

(i) no non-zero row succeeds a zero row, i.e., zero rows occur only at the bottom of the matrix.

(ii) the first non-zero entry in any non zero row is 1 (unity of the field, called a leading unity.

(iii) the column containing a leading unity has every other entry zero.

(iv) if leading unity in i-th row occurs in p(i)-th column p(1) < p(2) < …, i.e., {p(i)} is an increasing sequence.

It is clear that through elementary row operations (eros) any matrix could be reduced to a row-reduced echelon form: we start with the first column. If all entries in it are zero, we proceed to the next column and so on. If some entry in the column is non-zero, apply an interchange of rows to bring it to the top position and then divide this row by the non-zero entry to make the entry a leading unity. Next subtract

appropriate multiples of this row from every other row so that every other entry in the same column is zero. Continuing this process with subsequent columns and rows we end up, finally with a row-reduced echelon form.

Replacing the word 'row' by 'column' every in above, we get the so called column-reduced echelon form of a matrix, which can be realized through the elementary column operations.

In view of the equivalence of elementary row (column) operations with pre (post) multiplication by the corresponding elementary matrices, we immediately have:

Theorem. Let A be m´ n. Then there exist non-singular m´ m B and n´ n C such that BA is in row-reduced echelon form and AC is in column-reduced echelon form.

Theorem (Unicity of the row-reduced echelon form). Let A be an m´ n matrix over a field. If B and C are non-singular and BA and CA are in row-reduced echelon forms, then BA = CA.

Proof. The proof is by induction on m. When m = 1 the result is obvious. Thus assume the result for (m-1)´ n matrices. Let E = BA and F = CA. Since B and C are non-singular and hence products of elementary matrices, each row of E can be obtained as a linear combination of rows of F and conversely. If E = 0, obviously F = 0 and conversely. If E ¹ 0, F ¹ 0, and let p denote the number of the column in E containing the leading unity of the first row and let let q denote the same for F. If p < q, the columns 1 to q-1 in F are zero. In particular the p-th column of F is zero and hence the first row of E can not be obtained from F by elementary row operations as the p-th column entry of the first row of E is non-zero. Similarly q < p is equally false and so p = q.

Now the second onward rows of F can be written as linear combinations of the second onward rows of E, since if the first row of E is involved in the corresponding linear combination the p-th entry in the row would be non-zero, which would contradict the just established fact that the p-th column of F has each entry, excepting the one in the first row, zero. Similarly the second onward rows of E can be written as linear combinations of the second onward rows of F.

Thus, if E₁ and F₁ denote the matrices obtained from E and F, respectively, by deleting their first rows, then E₁ and F₁ are row-equivalent row-reduced echelon forms and hence, by the induction hypothesis for (m-1)´ n matrices, are equal. In view of this the first rows of E and F are also to be the same, since no contribution can come from second onward rows of F while we are writing the first row of E as a linear combinations of the rows of F. Thus the result for m´ n matrices follows and the induction is complete. #

Note that B in the above theorem is the product of the pre-multiplying elementary matrices which give rise to the row-reduced echelon form of A. Similarly, C is the product of post multiplying elementary matrices corresponding to the elementary column operations giving rise to the column-reduced echelon form for A.

Since the column-reduced echelon form of A is the transpose of the row-reduced echelon form of the transpose of A, it follows that the column-reduced echelon form of a matrix is also unique.

PROBLEMS

1. Find all solutions ofo the following system of equations by row-reducing the coefficient matrix to its echelon form:

x₁ + 6x₂ - 18x₃ = 0,

3x₁ - 6x₂ + 13x₃ = 0,

5x₁ + 6x₂ - 23x₃ = 0,

7x₁ - 6x₂ + 8x₃ = 0.

2. Find a row-reduced echelon matrix which is row-equivalent to: .

What are all the solutions of Ax = 0? If the last row of A is changed to something else, would it make a difference in the results?

3. Describe explicitly the possible forms of all 2´ 2 and 3´ 3 row-reduced echelon matrices. Do these forms depend on the field F under consideration?

4. Is the system of equations

x₁ - 2x₂ + 8x₃ = 1,

x₁ - 3x₂ + 5x₃ = 1,

x₁ - 4x₂ + 2x₃ = 1,

consistent? If so, describe explicitly all solutions.

5. Show that the system

x₁ - 2x₂ + x₃ + 4x₄ = 1,

x₁ + x₂- x₃ + 2x₄ = 2,

x₁ + a x₂ -5x₃ - 2x₄ = 3,

has no solution iff a = 7. If a ¹ 7, find the general solution?

6. Find all solutions of

x₁ + x₂ + 3x₃ - 3x₄ = 3,

x₂ + x₃ - x₄ = -2,

4x₂ + 4x₃ - 4x₄ - x₅= 7,

x₁ -3x₂ - 3x₃ + 2x₄ +x₅= 0.

7. Find the most general a, b and c if the system of equations

is to be consistent. What is the smallest number of parameters required to describe a, b and c?

8. Let . What is the minimum number of parameters needed to represent the most general b if Ax = b is to be consistent? What is the minimum number of parameters required to describe z such that Az takes all possible values b that make Ax = b consistent?

9. If A is an m´ n matrix over a field F, prove that there exists an n´ m matrix B such that x = Bb is a solution of Ax = b, whenever b is such that the system Ax = b is consistent.

10. If Ax = b is inconsistent, there exists a c such that Ax = c has more than one solution. True or false?

11. For a square matrix A over a field F, the following statements are equivalent:

(i) A is invertible.

(ii) The row (column) -reduced echelon form of A is I, the identity matrix.

(iii) A is a product of elementary matrices.

(iv) Each row (column) in the row (column) -reduced echelon form of A is non-zero.

12. For the block partitioned matrices

where R = (A + B)^-1, S = -RB, and T = RA = I + S. If A + B is not invertible, could it be that P+Q is still invertible.

13. Prove that there are exactly two 2´ 2 complex row-reduced echelon matrices the sum of all the four entries of which is 0.

14. Prove that the interchange of two rows of matrix can be accomplished by a finite sequence of elementary row operations of the other two types. (Hint: R_i - R_j , R_j + R_i , R_i + R_j , -R_i).

15. Find all solutions of the system of equations: 2x₁ + (1+i)x₂ = 0, (1+i)x₁ + ix₂ = 0.

16. If , find all solutions of Ax = 0 by row-reducing A to its echelon form.

17. If , prove that Ax = l x is a consistent system iff l = 1, or 2, and find all solutions of the systems Ax = x and Ax = 2x.

18. Find the row-reduced echelon form of the matrix .

19. Verify that the following matrices are not row equivalent: .

20. Find all solutions of the system of equations Ax = 0, where , when: (i) A = 0, (ii) A ¹ 0, but ad - bc = 0 and (iii) ad - bc ¹ 0. [The quantity (ad-bc) is known as the determinant of the 2´ 2 A.]