DRAFT Lecture Notes on Linear Algebra

July 10, 2016


## Chapter 1: Introduction to Matrices

### 1.1 Definition of a Matrix

Definition 1.1.1.1. [Matrix] A rectangular array of numbers is called a matrix.

The horizontal arrays of a matrix are called its rows and the vertical arrays are called its columns. Let A be a matrix having m rows and n columns. Then, A is said to have order m × n or is called a matrix of size m × n and can be represented in either of the following forms:

where aij is the entry at the intersection of the ith row and jth column. To be concise, one writes Am×n = [aij] or, in short, A = [aij]. We will also use A[i,:] to denote the ith row of A and A[:,j] to denote the jth column of A. We shall mostly be concerned with matrices having complex numbers as entries.

For example, if A = then A[1,:] = [1 3 + i 7], A[:,3] = and a22 = 5. In general, in a row vector, commas are inserted to differentiate between entries. Thus, A[1,:] = [1, 3 + i, 7]. A matrix having only one column is called a column vector and a matrix with only one row is called a row vector. All our vectors will be column vectors and will be represented by bold letters. Thus, A[1,:] is a row vector and A[:,3] is a column vector.
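These indexing conventions can be tried out numerically. The sketch below uses NumPy with a hypothetical matrix (the notes' own example matrix was lost in typesetting); note that NumPy indexes from 0 while the notes index from 1.

```python
import numpy as np

# A hypothetical 2x3 matrix; NumPy indexes rows/columns from 0,
# while the notes index them from 1.
A = np.array([[1, 3 + 1j, 7],
              [4, 5,      6]])

m, n = A.shape       # the order of A is m x n
row1 = A[0, :]       # A[1,:] in the notes' notation, a row vector
col3 = A[:, 2]       # A[:,3] in the notes' notation, a column vector
a22 = A[1, 1]        # the entry a_22 at row 2, column 2
```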

Definition 1.1.1.2. [Equality of two Matrices] Two matrices A = [aij] and B = [bij] having the same order m × n are equal if aij = bij, for each i = 1, 2, …, m and j = 1, 2, …, n.

In other words, two matrices are said to be equal if they have the same order and their corresponding entries are equal.

Example 1.1.1.3. The linear system of equations 2x+3y = 5 and 3x+2y = 5 can be identified with the matrix A = . Note that x and y are unknowns with the understanding that x is associated with A[:,1] and y is associated with A[:,2].

#### 1.1.A Special Matrices

Definition 1.1.1.4.

1. A matrix in which each entry is zero is called a zero-matrix, denoted 0. For example,
2. A matrix that has the same number of rows as the number of columns, is called a square matrix. A square matrix is said to have order n if it is an n × n matrix.
3. Let A = [aij] be an n × n square matrix.
1. Then the entries a11, a22, …, ann are called the diagonal entries of A; they constitute the principal diagonal of A.
2. Then A is said to be a diagonal matrix if aij = 0 for i ≠ j, denoted diag[a11, …, ann]. For example, the zero matrix 0n is a diagonal matrix.
3. If A = diag[a11, …, ann] and aii = d for all i = 1, …, n then the diagonal matrix A is called a scalar matrix.
4. Then A = diag[1, …, 1] is called the identity matrix, denoted In, or in short I.

For example, I2 = and I3 = .

5. Then A is said to be an upper triangular matrix if aij = 0 for i > j.
6. Then A is said to be a lower triangular matrix if aij = 0 for i < j.
7. Then A is said to be triangular if it is an upper or a lower triangular matrix.

For example, is upper triangular, is lower triangular.

4. An m × n matrix A = [aij] is said to have an upper triangular form if aij = 0 for all i > j. For example, the matrices , and have upper triangular forms.
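The special matrices above are all one-liners in NumPy; the entries below are illustrative choices, not the notes' (lost) examples.

```python
import numpy as np

Z = np.zeros((2, 3))                  # zero matrix 0_{2x3}
I3 = np.eye(3)                        # identity matrix I_3
D = np.diag([2, 5, 7])                # diagonal matrix diag[2, 5, 7]
S = 4 * np.eye(3)                     # scalar matrix: all diagonal entries equal
M = np.arange(1, 10).reshape(3, 3)    # a generic 3x3 matrix
U = np.triu(M)                        # upper triangular: entries with i > j are 0
L = np.tril(M)                        # lower triangular: entries with i < j are 0
```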

### 1.2 Operations on Matrices

Definition 1.1.2.1. [Transpose of a Matrix] Let A = [aij] be an m × n matrix with real entries. Then the transpose of A, denoted AT = [bij], is an n × m matrix with bij = aji, for all i, j.

Definition 1.1.2.2. [Conjugate Transpose of a Matrix] Let A = [aij] be an m × n matrix with complex entries. Then the conjugate transpose of A, denoted A*, is an n × m matrix with (A*)ij = āji, for all i, j, where for a ∈ ℂ, ā denotes the complex conjugate of a.

Thus, if x is a column vector then xT and x* are row vectors and vice-versa. For example, if A = then A* = AT = , whereas if A = then A* = and note that A* ≠ AT.
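The difference between AT and A* is easy to see numerically; the matrices below are illustrative stand-ins for the ones lost from the example.

```python
import numpy as np

A = np.array([[1, 2 + 1j],
              [3j, 4]])

AT = A.T               # transpose: (A^T)_ij = a_ji
Astar = A.conj().T     # conjugate transpose: (A*)_ij = conj(a_ji)

# For a matrix with real entries the two coincide:
R = np.array([[1.0, 2.0], [3.0, 4.0]])
```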

Theorem 1.1.2.3. For any matrix A, (A*)* = A. Thus, (AT )T = A.

Proof. Let A = [aij], A* = [bij] and (A*)* = [cij]. Clearly, the order of A and (A*)* is the same. Also, by definition cij is the conjugate of bji, and bji is the conjugate of aij; hence cij, being the conjugate of the conjugate of aij, equals aij for all i, j, and the result follows. _

Remark 1.1.2.4. Note that the transpose is studied whenever the entries of the matrix are real. Since we are allowing the matrix entries to be complex numbers, we will state and prove the results for the conjugate transpose. The readers should separately verify that similar results hold for the transpose whenever the matrix has real entries.

Definition 1.1.2.5. [Addition of Matrices] Let A = [aij] and B = [bij] be two m × n matrices. Then, the sum of A and B, denoted A + B, is defined to be the matrix C = [cij] with cij = aij + bij.

Definition 1.1.2.6. [Multiplying a Scalar to a Matrix] Let A = [aij] be an m × n matrix and let k ∈ ℂ. Then the product of k with A, denoted kA, is defined as kA = [kaij].

For example, if A = then 5A = and (2 + i)A = .

Theorem 1.1.2.7. Let A, B and C be matrices of order m × n, and let k, ℓ ∈ ℂ. Then

1. A + B = B + A      (commutativity).
2. (A + B) + C = A + (B + C)     (associativity).
3. k(ℓA) = (kℓ)A.
4. (k + ℓ)A = kA + ℓA.

Proof. Part 1.
Let A = [aij] and B = [bij]. Then, by definition, A + B = [aij + bij] = [bij + aij] = B + A, as complex numbers commute. The reader is required to prove the other parts as all the results follow from the properties of complex numbers. _

Definition 1.1.2.8. Additive Inverse Let A be an m × n matrix.

1. Then there exists a matrix B with A + B = 0. This matrix B is called the additive inverse of A, and is denoted by -A = (-1)A.
2. Also, for the matrix 0m×n, A + 0 = 0 + A = A. Hence, the matrix 0m×n is called the additive identity.

Exercise 1.1.2.9.

1. Find a 3 × 3 non-zero matrix A with real entries satisfying
1. AT = A.
2. AT = -A.
2. Find a 3 × 3 non-zero matrix A with complex entries satisfying
1. A* = A.
2. A* = -A.
3. Find the 3 × 3 matrix A = [aij] satisfying aij = 1 if i ≠ j and 2 otherwise.
4. Find the 3 × 3 matrix A = [aij] satisfying aij = 1 if |i − j| ≤ 1 and 0 otherwise.
5. Find the 4 × 4 matrix A = [aij] satisfying aij = i + j.
6. Find the 4 × 4 matrix A = [aij] satisfying aij = 2i+j.
7. Suppose A + B = A. Then show that B = 0.
8. Suppose A + B = 0. Then show that B = (-1)A = [-aij].
9. Let A = and B = . Compute A + B* and B + A*.

#### 1.2.A Multiplication of Matrices

Definition 1.1.2.10. [Matrix Multiplication / Product] Let A = [aij] be an m × n matrix and B = [bij] be an n × r matrix. The product of A and B, denoted AB, is a matrix C = [cij] of order m × r with cij = ai1b1j + ai2b2j + ⋯ + ainbnj, for 1 ≤ i ≤ m and 1 ≤ j ≤ r.

Thus, AB is defined if and only if the number of columns of A equals the number of rows of B.

For example, if A = and B = then
 (1.1.2.1)

Thus, note that the rows of the matrix AB can be written directly as

and similarly, the columns of the matrix AB can be written directly as
 (1.1.2.3)

(AB)[:,2] = βA[:,1] + yA[:,2] + vA[:,3], …, (AB)[:,4] = δA[:,1] + tA[:,2] + sA[:,3].

Remark 1.1.2.11. Observe the following:

1. In this example, while AB is defined, the product BA is not defined. However, for square matrices A and B of the same order, both the product AB and BA are defined.
2. The product AB corresponds to operating on the rows of the matrix B (see Equation (1.1.2.2)). This is the row method for calculating the matrix product.
3. The product AB also corresponds to operating on the columns of the matrix A (see Equation (1.1.2.3)). This is the column method for calculating the matrix product.
4. Let A and B be two matrices such that the product AB is defined. Then verify that
 (1.1.2.4)
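Equation (1.1.2.4), i.e. (AB)[i,:] = A[i,:]B and (AB)[:,j] = AB[:,j], can be verified numerically; the matrices below are hypothetical stand-ins for the ones lost from the example.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])            # 2 x 3
B = np.array([[1, 0, 2, 1],
              [0, 1, 1, 0],
              [2, 1, 0, 1]])         # 3 x 4
C = A @ B                             # AB, a 2 x 4 matrix

# Row method: (AB)[i,:] = A[i,:] B
row2 = A[1, :] @ B
# Column method: (AB)[:,j] = A B[:,j]
col3 = A @ B[:, 2]
```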

Example 1.1.2.12. Let A = and B = . Use the row/column method of matrix multiplication to

1. find the second row of the matrix AB.
Solution: By Remark 1.1.2.11.4, (AB)[2,:] = A[2,:]B and hence
2. find the third column of the matrix AB.
Solution: Again, by Remark 1.1.2.11.4, (AB)[:,3] = AB[:,3] and hence

Definition 1.1.2.13. [Commutativity of Matrix Product] Two square matrices A and B are said to commute if AB = BA.

Remark 1.1.2.14. Note that if A is a square matrix of order n and if B is a scalar matrix of order n then AB = BA. In general, the matrix product is not commutative. For example, consider A = and B = . Then verify that AB ≠ BA.
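A concrete (illustrative) pair of matrices showing non-commutativity, together with a scalar matrix that does commute:

```python
import numpy as np

A = np.array([[1, 1],
              [0, 1]])
B = np.array([[1, 0],
              [1, 1]])

AB = A @ B
BA = B @ A          # differs from AB: the product is not commutative

S = 3 * np.eye(2)   # a scalar matrix commutes with every 2x2 matrix
```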

Theorem 1.1.2.15. Suppose that the matrices A,B and C are so chosen that the matrix multiplications are defined.

1. Then (AB)C = A(BC). That is, the matrix multiplication is associative.
2. For any k ∈ ℂ, (kA)B = k(AB) = A(kB).
3. Then A(B + C) = AB + AC. That is, multiplication distributes over addition.
4. If A is an n × n matrix then AIn = InA = A.
5. Now let A be a square matrix of order n and D = diag[d1, d2, …, dn]. Then
• (DA)[i,:] = diA[i,:], for 1 ≤ i ≤ n, and
• (AD)[:,j] = djA[:,j], for 1 ≤ j ≤ n.

Proof. Part 1. Let A = [aij]m×n,B = [bij]n×p and C = [cij]p×q. Then

Therefore,

Part 5. As D is a diagonal matrix, using Remark 1.1.2.11.4, we have

Using a similar argument, the next part follows. The other parts are left for the reader. _
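Part 5 — DA scales the rows of A while AD scales the columns — can be checked directly (illustrative matrices):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
D = np.diag([2, 3, 5])

DA = D @ A    # (DA)[i,:] = d_i A[i,:]: each row of A scaled by d_i
AD = A @ D    # (AD)[:,j] = d_j A[:,j]: each column of A scaled by d_j
```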

Exercise 1.1.2.16.

1. Find a 2 × 2 non-zero matrix A satisfying A2 = 0.
2. Find a 2 × 2 non-zero matrix A satisfying A2 = A and A ≠ I2.
3. Find 2 × 2 non-zero matrices A, B and C satisfying AB = AC but B ≠ C. That is, the cancellation law doesn't hold.
4. Let A = . Compute A2 and A3. Is A3 = I? Determine aA3 + bA + cA2.
5. Let A and B be two m × n matrices. Then prove that (A + B)* = A* + B*.
6. Let A be a 1×n matrix and B be an n×1 matrix. Then verify that AB is a 1×1 matrix, whereas BA has order n × n.
7. Let A and B be two matrices such that the matrix product AB is defined.
1. Prove that (AB)* = B*A*.
2. If A[1,:] = 0* then (AB)[1,:] = 0*.
3. If B[:,1] = 0 then (AB)[:,1] = 0.
4. If A[i,:] = A[j,:] for some i and j then (AB)[i,:] = (AB)[j,:].
5. If B[:,i] = B[:,j] for some i and j then (AB)[:,i] = (AB)[:,j].
8. Let A = and B = . Compute
1. A - A*,A + A*,(3AB)*- 4B*A and 3A - 2A*.
2. (AB)[1,:],(AB)[3,:],(AB)[:,1] and (AB)[:,2].
3. (B*A*)[:,1],(B*A*)[:,3],(B*A*)[1,:] and (B*A*)[2,:].
9. Let A = and B = . Guess a formula for An and Bn and prove it.
10. Let A = , B = and C = . Is it true that A2 - 2A + I = 0? What is B3 - 3B2 + 3B - I? Is C3 = 3C2?
11. Construct the matrices A and B satisfying the following statements.
1. The product AB is defined but BA is not defined.
2. The products AB and BA are defined but they have different orders.
3. The products AB and BA are defined, they have the same order but AB ≠ BA.
12. Let a, b and c be indeterminates. Then, can we find A with complex entries satisfying A = ? What if A = ? Give reasons for your answer.

#### 1.2.B Inverse of a Matrix

Definition 1.1.2.17. [Inverse of a Matrix] Let A be a square matrix of order n.

1. A square matrix B is said to be a left inverse of A if BA = In.
2. A square matrix C is called a right inverse of A, if AC = In.
3. A matrix A is said to be invertible (or is said to have an inverse) if there exists a matrix B such that AB = BA = In.

Lemma 1.1.2.18. Let A be an n × n matrix. Suppose that there exist n × n matrices B and C such that AB = In and CA = In. Then B = C.

Proof. Note that C = CIn = C(AB) = (CA)B = InB = B. _

Remark 1.1.2.19.

1. Lemma 1.1.2.18 implies that whenever A is invertible, the inverse is unique.
2. Therefore the inverse of A is denoted by A-1. That is, AA-1 = A-1A = I.

Example 1.1.2.20.

1. Let A = .
1. If ad − bc ≠ 0 then verify that A-1 = .
2. In particular, the inverse of equals .
3. If ad − bc = 0 then prove that either A[1,:] = 0* or A[:,1] = 0 or A[2,:] = αA[1,:] or A[:,2] = αA[:,1] for some α ∈ ℂ. Hence, prove that A is not invertible.
4. The matrices , and do not have inverses.
2. Let A = . Then A-1 = .
3. Prove that the matrices A = and B = are not invertible.

Solution: Suppose there exists C such that CA = AC = I. Then, using matrix product

But A[1,:] = A[2,:] and thus [1,0,0] = [0,1,0], a contradiction.

Similarly, if there exists D such that BD = DB = I then

But B[:,3] = B[:,1] + B[:,2] and hence I[:,3] = I[:,1] + I[:,2], a contradiction.
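The 2 × 2 inverse of Example 1.1.2.20.1 (whose display was lost above) is the standard formula A⁻¹ = (1/(ad − bc))[[d, −b], [−c, a]]; a sketch assuming that formula:

```python
import numpy as np

def inv2x2(A):
    """Inverse of a 2x2 matrix [[a, b], [c, d]] via the standard formula
    (1 / (ad - bc)) * [[d, -b], [-c, a]], valid when ad - bc != 0."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("ad - bc = 0: the matrix is not invertible")
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
Ainv = inv2x2(A)
```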

Theorem 1.1.2.21. Let A and B be two invertible matrices. Then

1. (A-1)-1 = A.
2. (AB)-1 = B-1A-1.
3. (A*)-1 = (A-1)*.

Proof. Proof of Part 1. Let B = A-1 be the inverse of A. Then AB = BA = I. Thus, by definition, B is invertible and B-1 = A. Or equivalently, (A-1)-1 = A.

Proof of Part 2. By associativity (AB)(B-1A-1) = A(BB-1)A-1 = I = (B-1A-1)(AB).

Proof of Part 3. As AA-1 = A-1A = I, we get (AA-1)* = (A-1A)* = I*. Or equivalently, (A-1)*A* = A*(A-1)* = I. Thus, by definition (A*)-1 = (A-1)*. _

We will again come back to the study of invertible matrices in Sections 2.2 and 2.3.A.
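Parts 2 and 3 of Theorem 1.1.2.21 can be sanity-checked numerically (illustrative matrices; np.linalg.inv stands in for a hand computation):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
B = np.array([[1.0, 2.0],
              [0.0, 1.0]])

lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)   # note the reversed order
```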


Exercise 1.1.2.22.

1. Let A be an invertible matrix. Then (A-1)r = A-r for every integer r.
2. Find the inverse of and .
3. Let A1, …, Ar be invertible matrices. Then the matrix B = A1A2⋯Ar is also invertible.
4. Let x* = [1 + i,2,3] and y* = [2,-1 + i,4]. Prove that x*y is invertible but xy* is not invertible.
5. Let A be an n × n invertible matrix. Then prove that
1. A[i,:] ≠ 0* for any i.
2. A[:,j] ≠ 0 for any j.
3. A[i,:] ≠ A[j,:] for any i ≠ j.
4. A[:,i] ≠ A[:,j] for any i ≠ j.
5. A[3,:] ≠ αA[1,:] + βA[2,:] for any α, β ∈ ℂ, whenever n ≥ 3.
6. A[:,3] ≠ αA[:,1] + βA[:,2] for any α, β ∈ ℂ, whenever n ≥ 3.
6. Determine A that satisfies (I + 3A)-1 = .
7. Determine A that satisfies (I - A)-1 = . [See Example 1.1.2.20.2].
8. Let A be a square matrix satisfying A3 + A - 2I = 0. Prove that A-1 = (A2 + I)/2.
9. Let A = [aij] be an invertible matrix. If B = [pi-jaij] for some p ∈ ℂ, p ≠ 0, then determine B-1.

### 1.3 Some More Special Matrices

Definition 1.1.3.1.

1. Let A be a square matrix with real entries. Then, A is called
1. symmetric if AT = A. For example, A = .
2. skew-symmetric if AT = -A. For example, A = .
3. orthogonal if AAT = AT A = I. For example, A = .
2. Let A be a square matrix with complex entries. Then, A is called
1. Normal if A*A = AA*. For example, is a normal matrix.
2. Hermitian if A* = A. For example, A = .
3. skew-Hermitian if A* = -A. For example, A = .
4. unitary if AA* = A*A = I. For example, A = .
3. A matrix A is said to be idempotent if A2 = A. For example, A = is idempotent.
4. A matrix that is symmetric and idempotent is called a projection matrix. For example, let u ∈ ℝn be a column vector with uT u = 1. Then A = uuT is an idempotent matrix. Moreover, A is symmetric and hence is a projection matrix. In particular, let u = (1,2)T/√5. Then uT u = 1 and for any vector x = (x1, x2)T ∈ ℝ2 note that
Thus, Ax is the foot of the perpendicular from the point x on the vector [1, 2]T.
5. A square matrix A is said to be nilpotent if there exists a positive integer n such that An = 0. The least positive integer k for which Ak = 0 is called the order of nilpotency. For example, if A = [aij] is an n × n matrix with aij equal to 1 if i − j = 1 and 0 otherwise, then An = 0 and Aℓ ≠ 0 for 1 ≤ ℓ ≤ n − 1.
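Several of these definitions can be checked mechanically; the matrices below are illustrative choices (the nilpotent one follows the subdiagonal construction in item 5):

```python
import numpy as np

n = 4
# Nilpotent matrix: a_ij = 1 if i - j = 1, else 0 (ones on the subdiagonal).
# Then N^n = 0 while N^(n-1) != 0.
N = np.diag(np.ones(n - 1), k=-1)

# A Hermitian matrix: H* = H.
H = np.array([[2, 1 - 1j],
              [1 + 1j, 3]])

# A projection matrix: u u^T with u^T u = 1 is symmetric and idempotent.
u = np.array([[1.0], [2.0]]) / np.sqrt(5)
P = u @ u.T
```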

Exercise 1.1.3.2.

1. Let A be a complex square matrix. Then S1 = (A + A*)/2 is Hermitian, S2 = (A − A*)/2 is skew-Hermitian, and A = S1 + S2.
2. Let A and B be two lower triangular matrices. Then prove that AB is a lower triangular matrix. A similar statement holds for upper triangular matrices.
3. Let A and B be Hermitian matrices. Then prove that AB is Hermitian if and only if AB = BA.
4. Show that the diagonal entries of a skew-Hermitian matrix are zero or purely imaginary.
5. Let A,B be skew-Hermitian matrices with AB = BA. Is the matrix AB Hermitian or skew-Hermitian?
6. Let A be a Hermitian matrix of order n with A2 = 0. Is it necessarily true that A = 0?
7. Let A be a nilpotent matrix. Prove that there exists a matrix B such that B(I + A) = I = (I + A)B [If Ak = 0 then look at I - A + A2 - ⋯ + (-1)k-1Ak-1].
8. Are the matrices and orthogonal, for θ ∈ [0, 2π]?
9. Let {u1, u2, u3} be three vectors in ℂ3 such that ui*ui = 1, for 1 ≤ i ≤ 3, and ui*uj = 0 whenever i ≠ j. Then prove that the 3 × 3 matrix
1. U = [u1 u2 u3] satisfies U*U = I. Thus, UU* = I.
2. A = uiui*, for 1 ≤ i ≤ 3, satisfies A2 = A. Is A symmetric?
3. A = uiui* + ujuj*, for i ≠ j, satisfies A2 = A. Is A symmetric?
10. Verify that the matrices in Exercises 9.9b and 9.9c are projection matrices.
11. Let A and B be two n × n orthogonal matrices. Then prove that AB is also an orthogonal matrix.

#### 1.3.A Sub-matrix of a Matrix

Definition 1.1.3.3. A matrix obtained by deleting some of the rows and/or columns of a matrix is said to be a sub-matrix of the given matrix.

For example, if A = then [1], [2], …, [15], …, A are a few sub-matrices of A. But the matrices and are not sub-matrices of A (the reader is advised to give reasons).

Let A be an n × m matrix and B be an m × p matrix. Suppose r < m. Then, we can decompose the matrices A and B as A = [P Q] and B = [H; K] (with H stacked on top of K), where P has order n × r and H has order r × p. That is, the matrices P and Q are sub-matrices of A, where P consists of the first r columns of A and Q consists of the last m − r columns of A. Similarly, H and K are sub-matrices of B, where H consists of the first r rows of B and K consists of the last m − r rows of B. We now prove the following important theorem.

Theorem 1.1.3.4. Let A = [aij] = [P Q] and B = [bij] = [H; K] be defined as above. Then AB = PH + QK.

Proof. The matrix products PH and QK are valid as the orders of the matrices P, H, Q and K are, respectively, n × r, r × p, n × (m − r) and (m − r) × p. Also, the matrices PH and QK are of the same order and hence their sum is justified. Now, let P = [Pij], Q = [Qij], H = [Hij], and K = [Kij]. Then, for 1 ≤ i ≤ n and 1 ≤ j ≤ p, we have

Thus, the required result follows. _

Remark 1.1.3.5. Theorem 1.1.3.4 is very useful due to the following reasons:

1. The orders of the matrices P, Q, H and K are smaller than those of A or B.
2. The matrices P,Q,H and K can be further partitioned so as to form blocks that are either identity or zero or matrices that have nice forms. This partition may be quite useful during different matrix operations.
3. If we want to prove results using induction then, after proving the initial step, one assumes the result for all r × r sub-matrices and then tries to prove it for (r+1) × (r+1) sub-matrices.

For example, if A = and B = then AB = .

Suppose A = [P Q; R S] and B = [E F; G H], writing [X Y; Z W] for the matrix with blocks X, Y in the top row and Z, W in the bottom row, where P, Q, R, S have orders n1 × m1, n1 × m2, n2 × m1, n2 × m2 and E, F, G, H have orders r1 × s1, r1 × s2, r2 × s1, r2 × s2, respectively. Then the matrices P, Q, R, S and E, F, G, H are called the blocks of the matrices A and B, respectively. Note that even if A + B is defined, the orders of P and E need not be the same. But, if the block sums are defined then A + B = [P + E  Q + F; R + G  S + H]. Similarly, if the product AB is defined, the product PE may not be defined. Again, if the block products are defined, one can verify that AB = [PE + QG  PF + QH; RE + SG  RF + SH]. That is, once a partition of A is fixed, the partition of B has to be properly chosen for purposes of block addition or multiplication.

Exercise 1.1.3.6.

1. Complete the proofs of Theorems 1.1.2.7 and 1.1.2.15.
2. Let A = ,B = and C = . Compute
1. (AC)[1,:],
2. (B(AC))[1,:], (B(AC))[2,:] and (B(AC))[3,:].
3. Note that (B(AC))[:,1] + (B(AC))[:,2] + (B(AC))[:,3] - (B(AC))[:,4] = 0.
4. Let xT = [1,1,1,-1]. Use previous result to prove Cx = 0.
3. Let x = , y = , A = and B = . Then
1. prove that y = Ax gives the counter-clockwise rotation through an angle α.
2. prove that y = Bx gives the reflection about the line y = tan(θ)x.
3. compute y = (AB)x and y = (BA)x. Do they correspond to reflection? If yes, then about which line?
4. Furthermore, if y = Cx gives the counter-clockwise rotation through β and y = Dx gives the reflection about the line y = tan(δ)x, respectively, then prove that
1. AC = CA and y = (AC)x gives the counter-clockwise rotation through α + β.
2. y = (BD)x and y = (DB)x give rotations. Which angles do they represent?
4. Fix a unit vector a ∈ ℝn and define f : ℝn → ℝn by f(y) = 2(aT y)a − y. Does this function give a reflection about the line that contains the points 0 and a?
5. Consider the two coordinate transformations x1 = a11y1 + a12y2, x2 = a21y1 + a22y2 and y1 = b11z1 + b12z2, y2 = b21z1 + b22z2.
1. Compose the two transformations to express x1,x2 in terms of z1,z2.
2. If xT = [x1,x2],yT = [y1,y2] and zT = [z1,z2] then find matrices A,B and C such that x = Ay,y = Bz and x = Cz.
3. Is C = AB? Give reasons for your answer.
6. For An×n = [aij], the trace of A, denoted Tr(A), is defined by Tr(A) = a11 + a22 + ⋯ + ann.
1. Compute Tr(A) for A = and A = .
2. Let A be a matrix with A = 2 and A = 3. If B = then compute Tr(AB).
3. Let A and B be two square matrices of the same order. Then prove that
1. Tr(A + B) = Tr(A) + Tr(B).
2. Tr(AB) = Tr(BA).
4. Prove that one cannot find matrices A and B such that AB - BA = cI for any c ≠ 0.
7. Let A and B be two m × n matrices with complex entries. Then prove that
1. Ax = 0 for every n × 1 vector x implies that A = 0, the zero matrix.
2. Ax = Bx for every n × 1 vector x implies that A = B.
8. Let A be an n × n matrix such that AB = BA for all n × n matrices B. Then prove that A is a scalar matrix. That is, A = αI for some α ∈ ℂ.
9. Let A = .
1. Find a matrix B such that AB = I2.
2. What can you say about the number of such matrices? Give reasons for your answer.
3. Does there exist a matrix C such that CA = I3? Give reasons for your answer.
10. Let A = and B = . Compute the matrix product AB using the block matrix multiplication.
11. Let A = . If P,Q and R are Hermitian, is the matrix A Hermitian?
12. Let A = , where A11 is an n × n invertible matrix and c ∈ ℂ.
1. If p = c - A21A11-1A12 is non-zero, prove that
is the inverse of A.
2. Use the above to find the inverse of and .
13. Let x be an n × 1 vector with real entries and satisfying xT x = 1.
1. Define A = In - 2xxT . Prove that A is symmetric and A2 = I. The matrix A is commonly known as the Householder matrix.
2. Let α ≠ 1 be a real number and define A = In − αxxT. Prove that A is symmetric and invertible [The inverse is also of the form In + βxxT for some value of β].
14. Let A be an invertible matrix of order n and let x and y be two n × 1 vectors with real entries. Also, let β be a real number such that α = 1 + βyT A-1x ≠ 0. Then prove the famous Sherman-Morrison formula
(A + βxyT)-1 = A-1 − (β/α) A-1xyT A-1.
This formula gives the inverse when an invertible matrix is modified by a rank-one matrix.
15. Suppose the matrices B and C are invertible and the involved partitioned products are defined, then prove that
16. Let J be an n × n matrix having each entry 1.
1. Prove that J2 = nJ.
2. Let α1, β1, α2, β2 ∈ ℝ. Prove that there exist α3, β3 ∈ ℝ such that (α1In + β1J)(α2In + β2J) = α3In + β3J.
3. Let α, β ∈ ℝ with α ≠ 0 and α + nβ ≠ 0, and define A = αIn + βJ. Prove that A is invertible.
17. Let A be an upper triangular matrix. If A*A = AA* then prove that A is a diagonal matrix. The same holds for lower triangular matrix.
18. Let A be an m × n matrix. Then a matrix G of order n × m is called a generalized inverse of A if AGA = A. For example, a generalized inverse of the matrix A = [1,2] is a matrix G = , for all α ∈ ℝ. A generalized inverse G is called a pseudo inverse or a Moore-Penrose inverse if GAG = G and the matrices AG and GA are symmetric. Check that for α = the matrix G is a pseudo inverse of A.
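As one worked instance from the list above, Exercise 13's Householder matrix A = In − 2xxT can be checked numerically (the vector x below is an arbitrary choice, normalized so that xT x = 1):

```python
import numpy as np

x = np.array([[1.0], [2.0], [2.0]])
x = x / np.linalg.norm(x)          # enforce x^T x = 1
n = len(x)

A = np.eye(n) - 2 * x @ x.T        # the Householder matrix of Exercise 13
```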

### 1.4 Summary

In this chapter, we started with the definition of a matrix and came across lots of examples. In particular, the following examples were important:

1. The zero matrix of size m × n, denoted 0m×n or 0.
2. The identity matrix of size n × n, denoted In or I.
3. Triangular matrices.
4. Hermitian/Symmetric matrices.
5. Skew-Hermitian/skew-symmetric matrices.
6. Unitary/Orthogonal matrices.
7. Idempotent matrices.
8. Nilpotent matrices.

We also learnt the product of two matrices. Even though it seemed complicated, it basically tells us that multiplying a matrix A by a matrix on the

1. left is the same as operating on the rows of A.
2. right is the same as operating on the columns of A.

## Chapter 2: System of Linear Equations over ℝ

### 2.1 Introduction

Let us look at some examples of linear systems.

1. Suppose a, b ∈ ℝ. Consider the system ax = b in the unknown x. If
1. a ≠ 0 then the system has a unique solution x = b/a.
2. a = 0 and
1. b ≠ 0 then the system has no solution.
2. b = 0 then the system has an infinite number of solutions, namely all x ∈ ℝ.
2. Consider a linear system with 2 equations in 2 unknowns. The equation ax + by = c in the unknowns x and y represents a line in ℝ2 if either a ≠ 0 or b ≠ 0. Thus the solution set of the system
is given by the points of intersection of the two lines. The different cases are illustrated by examples (see Figure 1).
1. Unique Solution
x + 2y = 1 and x + 3y = 1. The unique solution is [x,y]T = [1,0]T .
Observe that in this case, a1b2 − a2b1 ≠ 0.
2. Infinite Number of Solutions
x + 2y = 1 and 2x + 4y = 2. As both equations represent the same line, the solution set is [x,y]T = [1 - 2y,y]T = [1,0]T + y[-2,1]T with y arbitrary. Observe that
1. a1b2 - a2b1 = 0,a1c2 - a2c1 = 0 and b1c2 - b2c1 = 0.
2. the vector [1,0]T corresponds to the solution x = 1,y = 0 of the given system.
3. the vector [-2,1]T corresponds to the solution x = -2,y = 1 of the system x + 2y = 0,2x + 4y = 0.
3. No Solution
x + 2y = 1 and 2x + 4y = 3. The equations represent a pair of parallel lines and hence there is no point of intersection. Observe that in this case, a1b2 − a2b1 = 0 but a1c2 − a2c1 ≠ 0.

3. As a last example, consider 3 equations in 3 unknowns.
A linear equation ax + by + cz = d represents a plane in ℝ3 provided [a, b, c] ≠ [0, 0, 0]. Here, we have to look at the points of intersection of the three given planes.


1. Unique Solution
Consider the system x+y +z = 3,x+4y +2z = 7 and 4x+10y-z = 13. The unique solution to this system is [x,y,z]T = [1,1,1]T , i.e., the three planes intersect at a point.
2. Infinite Number of Solutions
Consider the system x + y + z = 3,x + 2y + 2z = 5 and 3x + 4y + 4z = 11. The solution set is [x,y,z]T = [1,2 - z,z]T = [1,2,0]T + z[0,-1,1]T , with z arbitrary. Observe the following:
1. Here, the three planes intersect in a line.
2. The vector [1,2,0]T corresponds to the solution x = 1,y = 2 and z = 0 of the linear system x + y + z = 3,x + 2y + 2z = 5 and 3x + 4y + 4z = 11. Also, the vector [0,-1,1]T corresponds to the solution x = 0,y = -1 and z = 1 of the linear system x + y + z = 0,x + 2y + 2z = 0 and 3x + 4y + 4z = 0.
3. No Solution
The system x + y + z = 3,2x + 2y + 2z = 5 and 3x + 3y + 3z = 3 has no solution. In this case, we have three parallel planes. The readers are advised to supply the proof.
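The three cases above can be detected computationally by comparing the ranks of the coefficient and augmented matrices — a criterion the notes develop later; the sketch below simply applies NumPy's matrix_rank to the three systems of this example.

```python
import numpy as np

def classify(A, b):
    """Classify a linear system Ax = b by comparing the rank of the
    coefficient matrix A with the rank of the augmented matrix [A | b]."""
    rA = np.linalg.matrix_rank(A)
    rAb = np.linalg.matrix_rank(np.column_stack([A, b]))
    if rA < rAb:
        return "no solution"
    if rA == A.shape[1]:
        return "unique solution"
    return "infinitely many solutions"

A1 = np.array([[1, 1, 1], [1, 4, 2], [4, 10, -1]]); b1 = [3, 7, 13]
A2 = np.array([[1, 1, 1], [1, 2, 2], [3, 4, 4]]);   b2 = [3, 5, 11]
A3 = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]]);   b3 = [3, 5, 3]
```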

Definition 2.2.1.1. [Linear System] A system of m linear equations in n unknowns x1, x2, …, xn is a set of equations of the form

a11x1 + a12x2 + ⋯ + a1nxn = b1
a21x1 + a22x2 + ⋯ + a2nxn = b2
  ⋮
am1x1 + am2x2 + ⋯ + amnxn = bm    (2.2.1.1)

where for 1 ≤ i ≤ m and 1 ≤ j ≤ n, aij, bi ∈ ℝ. Linear System (2.2.1.1) is called homogeneous if b1 = 0 = b2 = ⋯ = bm and non-homogeneous otherwise.

Let A = [aij], x = [x1, …, xn]T and b = [b1, …, bm]T. Then (2.2.1.1) can be re-written as Ax = b. In this setup, the matrix A is called the coefficient matrix and the block matrix [Ab] is called the augmented matrix of the linear system (2.2.1.1).

Remark 2.2.1.2. Consider the augmented matrix [Ab] of the linear system Ax = b, where A is an m × n matrix, and b and x are column vectors of appropriate size. If xT = [x1, …, xn] then it is important to note that

1. the unknown x1 corresponds to the column ([Ab])[:,1].
2. in general, for j = 1, 2, …, n, the unknown xj corresponds to the column ([Ab])[:,j].
3. the vector b = ([Ab])[:,n + 1].
4. for i = 1, 2, …, m, the ith equation corresponds to the row ([Ab])[i,:].

Definition 2.2.1.3. [Consistent, Inconsistent] A linear system is called consistent if it admits a solution and is called inconsistent if it admits no solution. For example, the homogeneous system Ax = 0 is always consistent as 0 is a solution whereas the system x + y = 2,2x + 2y = 1 is inconsistent.

Definition 2.2.1.4. Consider the linear system Ax = b. Then the corresponding linear system Ax = 0 is called the associated homogeneous system. As mentioned in the previous paragraph, the associated homogeneous system is always consistent.

Definition 2.2.1.5. [Solution of a Linear System] A solution of Ax = b is a vector y such that Ay indeed equals b. The set of all solutions is called the solution set of the system. For example, the solution set of Ax = b, with A = and b = equals .

The readers are advised to supply the proof of the next theorem that gives information about the solution set of a homogeneous system.

Theorem 2.2.1.6. Consider the homogeneous linear system Ax = 0.

1. Then x = 0, the zero vector, is always a solution.
2. Let u ≠ 0 be a solution of Ax = 0. Then y = cu is also a solution for all c ∈ ℝ.
3. Let u1, …, uk be solutions of Ax = 0. Then a1u1 + ⋯ + akuk is also a solution of Ax = 0, for all ai ∈ ℝ, 1 ≤ i ≤ k.

Remark 2.2.1.7. Consider the homogeneous system Ax = 0. Then

1. the vector 0 is called the trivial solution.
2. a non-zero solution is called a non-trivial solution. For example, for the system Ax = 0, where A = , the vector x = is a non-trivial solution.
3. Thus, by Theorem 2.2.1.6, the existence of a non-trivial solution of Ax = 0 is equivalent to having an infinite number of solutions for the system Ax = 0.
4. Let u,v be two distinct solutions of the non-homogeneous system Ax = b. Then xh = u-v is a solution of the homogeneous system Ax = 0. That is, any two solutions of Ax = b differ by a solution of the associated homogeneous system Ax = 0. Or equivalently, the solution set of Ax = b is of the form, {x0 + xh}, where x0 is a particular solution of Ax = b and xh is a solution of the associated homogeneous system Ax = 0.
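Item 4 — every solution of Ax = b is a particular solution plus a homogeneous solution — can be illustrated with the system x + y + z = 3, x + 2y + 2z = 5, 3x + 4y + 4z = 11 from the introduction:

```python
import numpy as np

A = np.array([[1, 1, 1],
              [1, 2, 2],
              [3, 4, 4]])
b = np.array([3, 5, 11])

x0 = np.array([1, 2, 0])     # a particular solution of Ax = b
xh = np.array([0, -1, 1])    # a solution of the associated homogeneous system

# Every x0 + c*xh, c real, again solves Ax = b.
```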

Exercise 2.2.1.8.

1. Consider a system of 2 equations in 3 unknowns. If this system is consistent then how many solutions does it have?
2. Define a linear system of 3 equations in 2 unknowns such that the system is inconsistent.
3. Define a linear system of 4 equations in 3 unknowns such that the system is inconsistent whereas for any three equations the system is consistent.
4. Let Ax = b be a system of m equations and n unknowns. Then
1. determine the possible solution set if m ≥ 3 and n = 2.
2. determine the possible solution set if m ≥ 4 and n = 3.
3. can this system have exactly two distinct solutions?
4. can it have only finitely many (greater than one) solutions?

#### 2.1.A Elementary Row Operations

Example 2.2.1.9. Solve the linear system y + z = 2,2x + 3z = 5,x + y + z = 3.

Solution: Let B0 = [Ab], the augmented matrix. Then B0 = . We now systematically proceed to get the solution.

1. Interchange 1st and 2nd equation (interchange B0[1,:] and B0[2,:] to get B1).
2. In the new system, multiply 1st equation by 1/2 (multiply B1[1,:] by 1/2 to get B2).
3. In the new system, replace 3rd equation by 3rd equation minus 1st equation (replace B2[3,:] by B2[3,:] - B2[1,:] to get B3).
4. In the new system, replace 3rd equation by 3rd equation minus 2nd equation (replace B3[3,:] by B3[3,:] - B3[2,:] to get B4).
5. In the new system, multiply 3rd equation by -2/3 (multiply B4[3,:] by -2/3 to get B5).

The last equation gives z = 1. Using this, the second equation gives y = 1. Finally, the first equation gives x = 1. Hence, the solution set is {[x,y,z]T |[x,y,z] = [1,1,1]}, a unique solution.

In Example 2.2.1.9, observe that for each operation on the system of linear equations there is a corresponding operation on the row of the augmented matrix. We use this idea to define elementary row operations and the equivalence of two linear systems.
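The five row operations of Example 2.2.1.9 translate directly into array operations — a sketch (NumPy rows are 0-indexed):

```python
import numpy as np

# Augmented matrix of y + z = 2, 2x + 3z = 5, x + y + z = 3.
B = np.array([[0.0, 1.0, 1.0, 2.0],
              [2.0, 0.0, 3.0, 5.0],
              [1.0, 1.0, 1.0, 3.0]])

B[[0, 1]] = B[[1, 0]]      # step 1: interchange rows 1 and 2
B[0] = B[0] / 2            # step 2: multiply row 1 by 1/2
B[2] = B[2] - B[0]         # step 3: row 3 minus row 1
B[2] = B[2] - B[1]         # step 4: row 3 minus row 2
B[2] = B[2] / B[2, 2]      # step 5: scale row 3 (multiply by -2/3)

# Back substitution:
z = B[2, 3]
y = B[1, 3] - B[1, 2] * z
x = B[0, 3] - B[0, 2] * z
```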

Definition 2.2.1.10. [Elementary Row Operations] Let A be an m × n matrix. Then the elementary row operations are

1. Eij: Interchange of A[i,:] and A[j,:].
2. Ek(c) for c ≠ 0: Multiply A[k,:] by c.
3. Eij(c) for c ≠ 0: Replace A[i,:] by A[i,:] + cA[j,:].

Definition 2.2.1.11. [Equivalent Linear Systems] Consider the linear systems Ax = b and Cx = d with corresponding augmented matrices, [Ab] and [Cd], respectively. Then the two linear systems are said to be equivalent if [Cd] can be obtained from [Ab] by application of a finite number of elementary row operations.

Definition 2.2.1.12. [Equivalent Matrices] Two matrices are said to be equivalent if one can be obtained from the other by a finite number of elementary row operations.

Thus, note that the linear systems at each step in Example 2.2.1.9 are equivalent to each other. We now prove that the solution sets of two equivalent linear systems are the same.

Lemma 2.2.1.13. Let Cx = d be the linear system obtained from Ax = b by application of a single elementary row operation. Then Ax = b and Cx = d have the same solution set.

Proof. We prove the result for the elementary row operation Ejk(c) with c ≠ 0. The reader is advised to prove the result for the other two elementary operations.

In this case, the systems Ax = b and Cx = d differ only in the jth equation. So, we need to show that y satisfies the jth equation of Ax = b if and only if y satisfies the jth equation of Cx = d. So, let yT = [α1,…,αn]. Then, the jth and kth equations of Ax = b are aj1α1 + ⋯ + ajnαn = bj and ak1α1 + ⋯ + aknαn = bk. Therefore, we see that the αi’s satisfy

Also, by definition the jth equation of Cx = d equals
Therefore, using Equation (2.2.1.2), we see that yT = [α1,…,αn] is also a solution of Equation (2.2.1.3). Now, use a similar argument to show that if zT = [β1,…,βn] is a solution of Cx = d then it is also a solution of Ax = b. Hence, the required result follows. _

The readers are advised to use Lemma 2.2.1.13 as an induction step to prove the main result of this subsection which is stated next.

Theorem 2.2.1.14. Let Ax = b and Cx = d be two equivalent linear systems. Then they have the same solution set.

### 2.2 System of Linear Equations

In the previous section, we saw that if one linear system can be obtained from another by a repeated application of elementary row operations then the two linear systems have the same solution set. Sometimes it helps to view an elementary row operation as multiplication on the left by an elementary matrix. In this section, we make this relationship precise and use it first to obtain results for systems of linear equations and then for the theory of square matrices.

#### 2.2.A Elementary Matrices and the Row-Reduced Echelon Form (RREF)

Definition 2.2.2.1. A square matrix E of order n is called an elementary matrix if it is obtained by applying exactly one elementary row operation to the identity matrix In.

Remark 2.2.2.2. The elementary matrices are of three types and they correspond to elementary row operations.

1. Eij: Matrix obtained by applying the elementary row operation Eij to In.
2. Ek(c) for c ≠ 0: Matrix obtained by applying the elementary row operation Ek(c) to In.
3. Eij(c) for c ≠ 0: Matrix obtained by applying the elementary row operation Eij(c) to In.
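A small sketch of this remark: each elementary matrix is the identity with one row operation applied (0-based indices; the sizes and scalars below are illustrative).

```python
import numpy as np

def E_swap(n, i, j):
    """E_ij: interchange rows i and j of I_n."""
    E = np.eye(n)
    E[[i, j]] = E[[j, i]]
    return E

def E_scale(n, k, c):
    """E_k(c) for c != 0: multiply row k of I_n by c."""
    E = np.eye(n)
    E[k, k] = c
    return E

def E_add(n, i, j, c):
    """E_ij(c) for c != 0: add c times row j of I_n to row i."""
    E = np.eye(n)
    E[i, j] = c
    return E

# Left-multiplying by an elementary matrix performs the same row operation:
A = np.array([[1., 2.], [3., 4.]])
print(E_swap(2, 0, 1) @ A)  # rows of A interchanged
```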

Example 2.2.2.3.

1. In particular, for n = 3 and a real number c ≠ 0, one has
E23 = , E1(c) = , E31(c) = and E23(c) = .
2. Let A = and B = . Then verify that
E23A = = = B.
3. Let A = . Then E21(-1)E32(-2)A = E21(-1) = .
4. Let A = . Then verify that E31(-2)E13E31(-1)A = .

Exercise 2.2.2.4.

1. Which of the following matrices are elementary?
2. Find elementary matrices E1,…,Ek such that Ek⋯E1 = I2.
3. Determine elementary matrices F1,…,Fk such that Fk⋯F1 = I3.

Remark 2.2.2.5. Observe that

1. (Eij)-1 = Eij as EijEij = I = EijEij.
2. Let c ≠ 0. Then (Ek(c))-1 = Ek(1∕c) as Ek(c)Ek(1∕c) = I = Ek(1∕c)Ek(c).
3. Let c ≠ 0. Then (Eij(c))-1 = Eij(-c) as Eij(c)Eij(-c) = I = Eij(-c)Eij(c).

Thus, each elementary matrix is invertible and the inverse is also an elementary matrix.
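A numeric check of this remark (a sketch; n = 3 and c = 5 are arbitrary choices):

```python
import numpy as np

n, c = 3, 5.0
I = np.eye(n)

E12 = I.copy(); E12[[0, 1]] = E12[[1, 0]]   # E_12
E3c = I.copy(); E3c[2, 2] = c               # E_3(c)
E13c = I.copy(); E13c[0, 2] = c             # E_13(c)

# (E_12)^{-1} = E_12, (E_3(c))^{-1} = E_3(1/c), (E_13(c))^{-1} = E_13(-c)
E3c_inv = I.copy(); E3c_inv[2, 2] = 1 / c
E13c_inv = I.copy(); E13c_inv[0, 2] = -c

assert np.allclose(E12 @ E12, I)
assert np.allclose(E3c @ E3c_inv, I) and np.allclose(E3c_inv @ E3c, I)
assert np.allclose(E13c @ E13c_inv, I) and np.allclose(E13c_inv @ E13c, I)
```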

Based on the above observation and the fact that a product of invertible matrices is invertible, the reader is advised to prove the next result.

Lemma 2.2.2.6. Applying an elementary row operation is equivalent to multiplying on the left by the corresponding elementary matrix, and vice-versa.

Proposition 2.2.2.7. Let A and B be two equivalent matrices. Then B = E1⋯EkA, for some elementary matrices E1,…,Ek.

Proof. By definition of equivalence, the matrix B can be obtained from A by a finite number of elementary row operations. But by Lemma 2.2.2.6, each elementary row operation on A corresponds to multiplying on the left of A by an elementary matrix. Thus, the required result follows. _

We now give a direct proof of Theorem 2.2.1.14. To do so, we state the theorem once again.

Theorem 2.2.2.8. Let Ax = b and Cx = d be two equivalent linear systems. Then they have the same solution set.

Proof. Let E1,…,Ek be the elementary matrices such that E1⋯Ek[Ab] = [Cd]. Then, by the definition of matrix product and Remark 2.2.2.5
 (2.2.2.1)

Now assume that Ay = b holds. Then, by Equation (2.2.2.1)
 (2.2.2.2)

On the other hand if Cz = d holds then using Equation (2.2.2.1), we have
 (2.2.2.3)

Therefore, using Equations (2.2.2.2) and (2.2.2.3) the required result follows. _

As an immediate corollary, the readers are advised to prove the following result.

Corollary 2.2.2.9. Let A and B be two equivalent matrices. Then the systems Ax = 0 and Bx = 0 have the same solution set.

Example 2.2.2.10. Are the matrices A = and B = equivalent?
Solution: No, as is a solution of Bx = 0 but it isn’t a solution of Ax = 0.

Definition 2.2.2.11. [Pivot/Leading Term] Let A be a non-zero matrix. Then, a pivot/leading term is the first (from left) nonzero element of a non-zero row in A and the column containing the pivot is called the pivot column. If aij is a pivot then we denote it by aij. For example, the entries a12 and a23 are pivots in A = . Thus, the columns 2 and 3 are pivot columns.

Definition 2.2.2.12. [Echelon Form] A matrix A is in echelon form (EF) (ladder like) if

1. the pivot of the (i + 1)-th row comes to the right of the pivot of the i-th row.
2. entries below the pivot in a pivotal column are 0.
3. the zero rows are at the bottom.

Example 2.2.2.13.

1. The following matrices are in echelon form.
, , and .
2. The following matrices are not in echelon form (determine the rule(s) that fail).
and .

Definition 2.2.2.14. [Row-Reduced Echelon Form] A matrix C is said to be in the row-reduced echelon form (RREF) if

1. C is already in echelon form,
2. pivot of each non-zero row is 1,
3. every other entry in the pivotal column is zero.

A matrix in RREF is also called a row-reduced echelon matrix.
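For computation, SymPy's `Matrix.rref()` returns the RREF together with the pivotal column indices (0-based); the matrix below is an illustrative choice, not one from the text.

```python
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [2, 4, 8],
            [3, 6, 11]])
R, pivot_cols = A.rref()
print(R)           # Matrix([[1, 2, 0], [0, 0, 1], [0, 0, 0]])
print(pivot_cols)  # (0, 2): columns 1 and 3 in the text's 1-based numbering
```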

Example 2.2.2.15.

1. The following matrices are in RREF.
, , and .
2. The following matrices are not in RREF (determine the rule(s) that fail).
, , .

The proof of the next result is beyond the scope of this book and hence is omitted.

Theorem 2.2.2.16. Let A and B be two matrices in RREF. If they are equivalent then A = B.

As an immediate corollary, we obtain the following important result.

Corollary 2.2.2.17. The RREF of a matrix is unique.

Proof. Suppose there exists a matrix A with two different RREFs, say B and C. As the RREFs are obtained by multiplication by elementary matrices, there exist elementary matrices E1,…,Ek and F1,…,Fℓ such that B = E1⋯EkA and C = F1⋯FℓA. Thus,

As inverses of elementary matrices are elementary matrices, we see that the matrices B and C are equivalent. As B and C are in RREF, using Theorem 2.2.2.16, we see that B = C. _

Theorem 2.2.2.18. Let A be an m × n matrix. If B consists of the first s columns of A then RREF(B) equals the first s columns of RREF(A).

Proof. Let us write F = RREF(A). By definition of RREF, there exist elementary matrices E1,…,Ek such that E1⋯EkA = F. Then, by matrix multiplication

Thus, E1⋯EkB = [E1⋯EkA[:,1],…,E1⋯EkA[:,s]] = [F[:,1],…,F[:,s]]. Since the matrix F is in RREF, by definition, its first s columns are also in RREF. Hence, by Corollary 2.2.2.17 we see that RREF(B) = [F[:,1],…,F[:,s]]. Thus, the required result follows. _
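Theorem 2.2.2.18 can be spot-checked numerically (an illustrative matrix, not a proof):

```python
from sympy import Matrix

A = Matrix([[1, 2, 1, 0],
            [2, 4, 3, 1],
            [1, 2, 2, 1]])
s = 2
B = A[:, :s]                 # B consists of the first s columns of A
RA, _ = A.rref()
RB, _ = B.rref()
assert RB == RA[:, :s]       # RREF(B) = first s columns of RREF(A)
```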

Let A be an m × n matrix. Then, by Corollary 2.2.2.17, its RREF is unique. We use it to define our next object.

#### 2.2.B Rank of a Matrix

Definition 2.2.2.19. [Row-Rank of a Matrix] Let A be an m×n matrix and let r be the number of pivots (number of non-zero rows) in its RREF. Then the row-rank of A, denoted row-rank(A), equals r. For example, row-rank(In) = n and row-rank(0) = 0.

Remark 2.2.2.20.

1. Even though row-rank is defined using the RREF of a matrix, we need only compute the echelon form, as the number of non-zero rows/pivots does not change when we proceed from the echelon form to the RREF.
2. Let A be an m × n matrix. Then, by the definition of RREF, the number of pivots, say r, satisfies r ≤ min{m,n}. Thus, row-rank(A) ≤ min{m,n}.
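SymPy computes the row-rank directly via `Matrix.rank()`; a sketch checking the examples in the definition and the bound in the remark (the 3 × 3 matrix is illustrative):

```python
from sympy import Matrix, eye, zeros

assert eye(4).rank() == 4          # row-rank(I_n) = n
assert zeros(3, 5).rank() == 0     # row-rank(0) = 0

A = Matrix([[1, 2, 3],
            [2, 4, 6],             # twice the first row
            [1, 1, 1]])
assert A.rank() == 2
assert A.rank() <= min(A.rows, A.cols)   # row-rank(A) <= min{m, n}
```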

Example 2.2.2.21. Determine the row-rank of the following matrices.

1. diag(d1,…,dn).
Solution: Let S = {i | 1 ≤ i ≤ n, di ≠ 0}. Then, verify that the row-rank equals the number of elements in S.
2. A = .
Solution: The echelon form of A is obtained as follows:
As the echelon form of A has 3 non-zero rows, row-rank(A) = 3.
3. A = .
Solution: row-rank(A) = 2 as the echelon form of A (given below) has two non-zero rows:

Remark 2.2.2.22. Let Ax = b be a linear system with m equations in n unknowns. Let RREF([Ab]) = [Cd], row-rank(A) = r and row-rank([Ab]) = ra.

1. Then, using Theorem 2.2.2.18, conclude that r ≤ ra.
2. If r < ra then again using Theorem 2.2.2.18, note that ra = r + 1 and ([Cd])[:,n + 1] has a pivot at the (r + 1)-th place. Hence, by definition of RREF, ([Cd])[r + 1,:] = [0T ,1].
3. If r = ra then ([Cd])[:,n + 1] has no pivot. Thus, [0T ,1] is not a row of [Cd].
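The comparison of r and ra in this remark can be sketched in code (illustrative matrices; `is_consistent` is a hypothetical helper name, not from the text):

```python
from sympy import Matrix

def is_consistent(A, b):
    """r = r_a exactly when no pivot falls in the last column of [A b]."""
    return A.rank() == A.row_join(b).rank()

A = Matrix([[1, 1],
            [2, 2]])
assert is_consistent(A, Matrix([3, 6]))       # r = r_a = 1
assert not is_consistent(A, Matrix([3, 5]))   # r = 1 < r_a = 2
```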

Now, consider an m × n matrix A and an elementary matrix E of order n. Then the product AE corresponds to applying column transformation on the matrix A. Therefore, for each elementary matrix, there is a corresponding column transformation as well. We summarize these ideas as follows.

Definition 2.2.2.23. The column transformations obtained by right multiplication of elementary matrices are called column operations. For example, if A = then AE23 = and AE14(-1) = .

Remark 2.2.2.24 (Rank of a Matrix).

1. The idea of row-rank was based on RREF and RREF was the result of systematically applying a finite number of elementary row operations. So, starting with a matrix A, we can systematically apply a finite number of elementary column operations (see Definition 2.2.2.23) to get a matrix which in some sense looks similar to RREF, call it say B, and then use the non-zero columns in that to define the column-rank. Note that B will have the following form:
1. The zero columns appear after the non-zero columns.
2. The first non-zero entry of a non-zero column is 1, called the pivot.
3. The pivots in non-zero column move down in successive columns.
2. It will be proved later that row-rank(A) = column-rank(A). Thus, we just talk of the “rank”, denoted Rank(A).

We are now ready to prove a few results associated with the rank of a matrix.

Theorem 2.2.2.25. Let A be a matrix of rank r. Then there exist a finite number of elementary matrices E1,…,Es and F1,…,Fℓ such that

Proof. Let C = RREF(A). As Rank(A) = r, by definition of RREF, there exist elementary matrices E1,…,Es such that C = E1⋯EsA. Note that C has r pivots and they appear in columns, say i1 < i2 < ⋯ < ir.

Now, let D be the matrix obtained from C by successively multiplying the elementary matrices Ejij, for 1 ≤ j ≤ r, on the right of C. Then observe that D = , where B is a matrix of an appropriate size.

As the (1,1) block of D is an identity matrix, the block (1,2) can be made the zero matrix by elementary column operations to D. Thus, the required result follows. _

Exercise 2.2.2.26.

1. Let A = and B = . Find P and Q such that B = PAQ.
2. Let A be a matrix of rank r. Then prove that there exist invertible matrices Bi,Ci such that B1A = ,AC1 = ,B2AC2 = and B3AC3 = , where the (1,1) block of each matrix is of size r × r. Also, prove that A1 is an invertible matrix.
3. Let A be an m×n matrix of rank r. Then prove that A can be written as A = BC, where both B and C have rank r and B is of size m × r and C is of size r × n.
4. Prove that if the product AB is defined and Rank(A) = Rank(AB) then A = ABX for some matrix X. Similarly, if BA is defined and Rank(A) = Rank(BA) then A = Y BA for some matrix Y . [Hint: Choose invertible matrices P,Q satisfying PAQ = , P(AB) = (PAQ)(Q-1B) = . Now find an invertible matrix R such that P(AB)R = . Now, define X = RQ-1 to get the required result.]
5. Prove that if AB is defined then Rank(AB) ≤ Rank(A) and Rank(AB) ≤ Rank(B).
6. Let P and Q be invertible matrices such that the matrix product PAQ is defined. Then prove that Rank(PAQ) = Rank(A).
7. Prove that if A + B is defined then Rank(A + B) ≤ Rank(A) + Rank(B).

#### 2.2.C Gauss-Jordan Elimination and System of Linear Equations

Let A be an m × n matrix. We now present an algorithm, commonly known as the Gauss-Jordan Elimination (GJE) method, to compute the RREF of A.

1. Input: A.
2. Output: a matrix B in RREF such that A is row equivalent to B.
3. Step 1: Put ‘Region’ = A.
4. Step 2: If all entries in the Region are 0, STOP. Else, in the Region, find the leftmost nonzero column and find its topmost non-zero entry. Suppose this non-zero entry is aij. Box it. This is a pivot.
5. Step 3: Interchange the row containing the pivot with the top row of the region. Also, make the pivot entry 1. Use this pivot to make the other entries in the pivotal column 0.
6. Step 4: Put Region = the sub-matrix below and to the right of the current pivot. Now, go to step 2.

Important: The process will stop, as we can get at most min{m,n} pivots.
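The steps above can be sketched in NumPy. One deviation from Step 2: for floating-point stability the code picks the entry of largest magnitude in the column rather than the topmost non-zero one (partial pivoting); over exact arithmetic the two choices give the same RREF.

```python
import numpy as np

def rref(A, tol=1e-12):
    """Gauss-Jordan Elimination following the four steps above (a sketch)."""
    B = A.astype(float)
    m, n = B.shape
    row = 0                                   # top of the current 'Region'
    for col in range(n):
        # Step 2: pick the entry of largest magnitude (partial pivoting)
        pivot = row + int(np.argmax(np.abs(B[row:, col])))
        if abs(B[pivot, col]) < tol:
            continue                          # column has no pivot
        B[[row, pivot]] = B[[pivot, row]]     # Step 3: bring pivot to the top
        B[row] /= B[row, col]                 # make the pivot 1
        for r in range(m):                    # zero out the rest of the column
            if r != row:
                B[r] -= B[r, col] * B[row]
        row += 1                              # Step 4: shrink the Region
        if row == m:
            break
    return B

A = np.array([[0., 2., 3.], [1., 1., 1.], [2., 2., 2.]])
print(rref(A))  # [[1, 0, -0.5], [0, 1, 1.5], [0, 0, 0]]
```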

Example 2.2.2.27. Apply GJE to

1. Region = A as A ≠ 0.
2. Then E12A = . Also, E31(-1)E12A = = B (say).
3. Now, Region = 0. Let C = .
4. Then E1()C = and E21(-2)E1()C = . Or equivalently, E32(-2)E2()B = .
5. Thus, E34E32(-2)E2()E31(-1)E12A = . Hence,
6. As the matrix A has only been multiplied by elementary matrices on the left, the resulting RREF matrix is equivalent to A.

Remark 2.2.2.28 (Gauss Elimination (GE)). GE is similar to the GJE except that

1. the pivots need not be made 1 and
2. the entries above the pivots need not be made 0.

For example, if A = then GE gives E32(-3)E21(-1)E31(-1)A = .

Thus, Gauss-Jordan Elimination may be viewed as an extension of the Gauss Elimination.

Example 2.2.2.29. Consider the system Ax = b with A a matrix of order 3 × 3 and A[:,1] ≠ 0. If [Cd] = RREF([Ab]) then the possible choices for [Cd] are given below:

1. . Here, Ax = b is consistent and with unique solution = .
2. , or . Here, Ax = b is inconsistent for any choice of α,β.
3. , or . Here, Ax = b is consistent and has infinite number of solutions for every choice of α,β.

Exercise 2.2.2.30.

1. Let Ax = b be a linear system of m equations in 2 unknowns. What are the possible choices for RREF([Ab]) if m ≥ 1?
2. Find the row-reduced echelon form of the following matrices:
3. Find the solutions of the linear system of equations using Gauss-Jordan method.

Now, using Proposition 2.2.2.7, Theorem 2.2.2.8 and the definition of RREF of a matrix, we obtain the following remark.

Remark 2.2.2.31. Let RREF([Ab]) = [Cd]. Then

1. there exist elementary matrices, say E1,…,Ek, such that E1⋯Ek[Ab] = [Cd]. Thus, the GJE (or the GE) is equivalent to multiplying by a finite number of elementary matrices on the left of [Ab].
2. by Theorem 2.2.2.18, RREF(A) = C.

Definition 2.2.2.32. [Basic, Free Variables] Consider the linear system Ax = b. If RREF([Ab]) = [Cd] then the unknowns

1. corresponding to the pivotal columns of C are called the basic variables.
2. that are not basic are called free variables.
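The split into basic and free variables can be read off from the pivotal columns that `rref()` reports (an illustrative system; 0-based column indices in code):

```python
from sympy import Matrix

A = Matrix([[1, 1, 2],
            [2, 2, 5]])
b = Matrix([4, 9])
_, pivot_cols = A.row_join(b).rref()

n = A.cols
basic = [j for j in range(n) if j in pivot_cols]       # pivotal columns of C
free  = [j for j in range(n) if j not in pivot_cols]   # the rest
print(basic, free)  # [0, 2] [1]: the 1st and 3rd unknowns are basic, the 2nd is free
```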

Example 2.2.2.33.

1. Let RREF([Ab]) = . Then, Ax = b has
as its solution set. Note that x and y are basic variables and z is the free variable.
2. Let RREF([Ab]) = . Then, the system Ax = b has no solution as RREF([Ab])[3,:] = [0,0,0,1], which corresponds to the equation 0·x + 0·y + 0·z = 1.
3. Suppose the system Ax = b is consistent and RREF(A) has r non-zero rows. Then the system has r basic variables and n - r free variables.

We now prove the main result in the theory of linear system (recall Remark 2.2.2.22).

Theorem 2.2.2.34. Let A be an m × n matrix and let RREF([Ab]) = [Cd], Rank(A) = r and Rank([Ab]) = ra. Then Ax = b

1. is inconsistent if r < ra.
2. is consistent if r = ra. Furthermore, Ax = b has
1. a unique solution if r = n.
2. an infinite number of solutions if r < n. In this case, the solution set equals
where x0,u1,…,un−r ∈ ℝn with Ax0 = b and Aui = 0, for 1 ≤ i ≤ n − r.

Proof. Part 1: As r < ra, by Remark 2.2.2.22 ([Cd])[r + 1,:] = [0T ,1]. Then, this row corresponds to the linear equation

which clearly has no solution. Thus, by definition and Theorem 2.2.1.14, Ax = b is inconsistent.

Part 2: As r = ra, by Remark 2.2.2.22, [Cd] doesn’t have a row of the form [0T ,1] and there are r pivots in C. Suppose the pivots appear in columns i1,…,ir with 1 ≤ i1 < ⋯ < ir ≤ n. Thus, the unknowns xij, for 1 ≤ j ≤ r, are basic variables and the remaining n − r variables, say xt1,…,xtn−r, are free variables with t1 < ⋯ < tn−r. Since C is in RREF, in terms of the free variables and basic variables, the ℓ-th row of [Cd], for 1 ≤ ℓ ≤ r, corresponds to the equation

Hence, the solution set of the system Cx = d is given by
 (2.2.2.4)

Thus, by Theorem 2.2.1.14 the system Ax = b is consistent. In case of Part 2a, r = n and hence there are no free variables. Thus, the unique solution equals xi = di, for 1 ≤ i ≤ n.

In case of Part 2b, define x0 = and u1 = ,…,un−r = . Then, it can be easily verified that Ax0 = b and, for 1 ≤ i ≤ n − r, Aui = 0, and by Equation (2.2.2.4) the solution set indeed has the required form, where ki corresponds to the free variable xti. As there is at least one free variable, the system has an infinite number of solutions. Thus, the proof of the theorem is complete. _

Let A be an m × n matrix. Then by Remark 2.2.2.20, Rank(A) ≤ m and hence using Theorem 2.2.2.34 the next result follows.

Corollary 2.2.2.35. Let A be a matrix of order m × n and consider Ax = 0. If

1. Rank(A) = r < min{m,n} then Ax = 0 has infinitely many solutions.
2. m < n, then Ax = 0 has infinitely many solutions.

Thus, in either case, the homogeneous system Ax = 0 has at least one non-trivial solution.
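A sketch of the corollary with m = 2 < n = 3: SymPy's `nullspace()` returns a basis for the solutions of Ax = 0, and it is non-empty exactly when there is a non-trivial solution (the matrix below is illustrative).

```python
from sympy import Matrix, zeros

A = Matrix([[1, 2, 3],
            [4, 5, 6]])          # m = 2 < n = 3
basis = A.nullspace()            # basis of the solution set of Ax = 0
assert len(basis) == A.cols - A.rank()   # n - r independent solutions
for u in basis:
    assert A * u == zeros(2, 1)  # every basis vector solves Ax = 0
```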

Remark 2.2.2.36. Let A be an m × n matrix. Then Theorem 2.2.2.34 implies that

1. the linear system Ax = b is consistent if and only if Rank(A) = Rank([Ab]).
2. the vectors associated to the free variables in Equation (2.2.2.4) are solutions to the associated homogeneous system Ax = 0.

We end this subsection with some applications.

Example 2.2.2.37.

1. Determine the equation of the line/circle that passes through the points (-1,4),(0,1) and (1,4).
Solution: The general equation of a line/circle in the Euclidean plane is given by a(x² + y²) + bx + cy + d = 0, where a, b, c and d are unknowns. Since this curve passes through the given points, we get a homogeneous system in 3 equations and 4 unknowns, namely = 0. Solving this system, we get [a,b,c,d] = [3d∕13, 0, −16d∕13, d]. Hence, taking d = 13, the equation of the required circle is 3(x² + y²) − 16y + 13 = 0.
2. Determine the equation of the plane that contains the points (1,1,1),(1,3,2) and (2,-1,2).
Solution: The general equation of a plane in space is given by ax + by + cz + d = 0, where a, b, c and d are unknowns. Since this plane passes through the 3 given points, we get a homogeneous system in 3 equations and 4 unknowns. So, it has a non-trivial solution, namely [a,b,c,d] = [−4d∕3, −d∕3, 2d∕3, d]. Hence, taking d = 3, the equation of the required plane is −4x − y + 2z + 3 = 0.
3. Let A = . Then
1. find a non-zero x ∈ ℝ3 such that Ax = 2x.
2. does there exist a non-zero vector y ∈ ℝ3 such that Ay = 4y?

Solution of Part 3a: Solving for Ax = 2x is equivalent to solving (A - 2I)x = 0 whose augmented matrix equals . Verify that xT = [1,0,0] is a non-zero solution.

Part 3b: As above, Ay = 4y is equivalent to solving (A - 4I)y = 0 whose augmented matrix equals . Now, verify that yT = [2,0,1] is a non-zero solution.
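The plane computation in item 2 above can be reproduced with a nullspace call (each point contributes one row [x, y, z, 1] of the coefficient matrix); the rescaling to d = 3 follows the text.

```python
from sympy import Matrix

# ax + by + cz + d = 0 through (1,1,1), (1,3,2) and (2,-1,2)
M = Matrix([[1,  1, 1, 1],
            [1,  3, 2, 1],
            [2, -1, 2, 1]])
sol = M.nullspace()[0]      # a non-trivial solution [a, b, c, d]^T
sol = sol / sol[3] * 3      # rescale so that d = 3
print(sol.T)                # [-4, -1, 2, 3]: coefficients of -4x - y + 2z + 3 = 0
```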

Exercise 2.2.2.38.

1. In the first part of this chapter, 3 figures (see Figure 2.1) were given to illustrate the different cases in the Euclidean plane (2 equations in 2 unknowns). It is well known that in the case of Euclidean space (3 equations in 3 unknowns) there
1. is a way to place the 3 planes so that the system has a unique solution.
2. are 4 distinct ways to place the 3 planes so that the system has no solution.
3. are 3 distinct ways to place the 3 planes so that the system has an infinite number of solutions.

Determine the positions of the planes by drawing diagrams to explain the above cases. Do these diagrams and the RREF matrices that appear in Example 2.2.2.29 have any relationship? Justify your answer.

2. Determine the equation of the curve y = ax² + bx + c that passes through the points (-1,4),(0,1) and (1,4).
3. Solve the following linear systems.
1. x + y + z + w = 0,x - y + z + w = 0 and -x + y + 3z + 3w = 0.
2. x + 2y = 1,x + y + z = 4 and 3y + 2z = 1.
3. x + y + z = 3,x + y - z = 1 and x + y + 7z = 6.
4. x + y + z = 3,x + y - z = 1 and x + y + 4z = 6.
5. x + y + z = 3,x + y - z = 1,x + y + 4z = 6 and x + y - 4z = -1.
4. For what values of c and k do the following systems have i) no solution, ii) a unique solution and iii) an infinite number of solutions?
1. x + y + z = 3,x + 2y + cz = 4,2x + 3y + 2cz = k.
2. x + y + z = 3,x + y + 2cz = 7,x + 2y + 3cz = k.
3. x + y + 2z = 3,x + 2y + cz = 5,x + 2y + 4z = k.
4. kx + y + z = 1,x + ky + z = 1,x + y + kz = 1.
5. x + 2y - z = 1,2x + 3y + kz = 3,x + ky + 3z = 2.
6. x - 2y = 1,x - y + kz = 1,ky + 4z = 6.
5. For what values of a do the following systems have i) no solution, ii) a unique solution and iii) an infinite number of solutions?
1. x + 2y + 3z = 4,2x + 5y + 5z = 6,2x + (a2 - 6)z = a + 20.
2. x + y + z = 3,2x + 5y + 4z = a,3x + (a2 - 8)z = 12.
6. Find the condition(s) on x,y,z so that the system of linear equations given below (in the unknowns a,b and c) is consistent?
1. a + 2b - 3c = x,2a + 6b - 11c = y,a - 2b + 7c = z
2. a + b + 5c = x,a + 3c = y,2a - b + 4c = z
3. a + 2b + 3c = x,2a + 4b + 6c = y,3a + 6b + 9c = z
7. Let A be an n × n matrix. If the system A²x = 0 has a non-trivial solution then show that Ax = 0 also has a non-trivial solution.
8. Prove that 5 distinct points are needed to specify a general conic in Euclidean plane.
9. Let u = (1,1,-2)T and v = (-1,2,3)T . Find condition on x,y and z such that the system cu + dv = (x,y,z)T in the unknowns c and d is consistent.
10. Consider the linear system Ax = b in m equations and 3 unknowns. Then, for each of the given solution set, determine the possible choices of m? Further, for each choice of m, determine a choice of A and b.
1. (1,1,1)T is the only solution.
2. (1,1,1)T is the only solution.
3. {(1,1,1)T + c(1,2,1)T |c } as the solution set.
4. {c(1,2,1)T |c } as the solution set.
5. {(1,1,1)T + c(1,2,1)T + d(2,2,-1)T |c,d } as the solution set.
6. {c(1,2,1)T + d(2,2,-1)T |c,d } as the solution set.

### 2.3 Square Matrices and System of Linear Equations

In this section, the coefficient matrix of the linear system Ax = b will be a square matrix. We start by proving a few equivalent conditions that relate different ideas.

Theorem 2.2.3.1. Let A be a square matrix of order n. Then the following statements are equivalent.

1. A is invertible.
2. The homogeneous system Ax = 0 has only the trivial solution.
3. Rank(A) = n.
4. The RREF of A is In.
5. A is a product of elementary matrices.

Proof. 1 ⟹ 2.   As A is invertible, A-1 exists and A-1A = In. So, let x0 be any solution of the homogeneous system Ax = 0. Then

Hence, 0 is the only solution of the homogeneous system Ax = 0.

2 ⟹ 3.   If possible, let Rank(A) = r < n. Then, by Corollary 2.2.2.35, the homogeneous system Ax = 0 has infinitely many solutions, a contradiction. Thus, A has full rank.

3 ⟹ 4.   Suppose Rank(A) = n and let B = RREF(A). As B = [bij] is a square matrix of order n, each column of B contains a pivot. Since the pivots move to the right as we go down the rows, the i-th pivot must be at position bii, for 1 ≤ i ≤ n. Thus, B = In and hence, the RREF of A is In.

4 ⟹ 5.   Suppose RREF(A) = In. Then using Proposition 2.2.2.7, there exist elementary matrices E1,…,Ek such that E1⋯EkA = In. Or equivalently,
 (2.2.3.1)

which gives the desired result as by Remark 2.2.2.5 we know that the inverse of an elementary matrix is also an elementary matrix.

5 ⟹ 1.  Suppose A = E1⋯Ek for some elementary matrices E1,…,Ek. As the elementary matrices are invertible (see Remark 2.2.2.5) and the product of invertible matrices is also invertible, we get the required result. _

As an immediate consequence of Theorem 2.2.3.1, we have the following important result, which implies that one needs to compute only a left or a right inverse to prove invertibility.

Corollary 2.2.3.2. Let A be a square matrix of order n. Suppose there exists a matrix

1. C such that CA = In. Then A-1 exists.
2. B such that AB = In. Then A-1 exists.

Proof. Part 1: Let CA = In for some matrix C and let x0 be a solution of the homogeneous system Ax = 0. Then Ax0 = 0 and

Thus, the homogeneous system Ax = 0 has only the trivial solution. Hence, using Theorem 2.2.3.1, the matrix A is invertible.

Part 2: Using the first part, B is invertible. Hence, B-1 = A or equivalently A-1 = B and thus A is invertible as well. _

Another important consequence of Theorem 2.2.3.1 is stated next which uses Equation (2.2.3.1) to get the required result. This result is used to compute the inverse of a matrix using the Gauss-Jordan Elimination.

Corollary 2.2.3.3. Let A be an invertible matrix of order n. Suppose there exist elementary matrices E1,…,Ek such that E1⋯EkA = In. Then A-1 = E1⋯Ek.

Remark 2.2.3.4. Let A be an n×n matrix. Apply GJE to [AIn] and let RREF([AIn]) = [BC]. If B = In, then A-1 = C or else A is not invertible.
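This remark translates directly into code: row-reduce [A In] and split the result (the 2 × 2 matrix below is an illustrative choice).

```python
from sympy import Matrix, eye

A = Matrix([[2, 1],
            [1, 1]])
n = A.rows
R, _ = A.row_join(eye(n)).rref()   # RREF([A I_n]) = [B C]
B, C = R[:, :n], R[:, n:]
assert B == eye(n)                 # B = I_n, so A is invertible
assert A * C == eye(n)             # and C = A^{-1}
```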

Example 2.2.3.5. Use GJE to find the inverse of A = .
Solution: Applying GJE to [A|I3] = gives

Thus, A-1 = .

Exercise 2.2.3.6.

1. Find the inverse of the following matrices using GJE.
(i) (ii) (iii)(iv).
2. Let A and B be two matrices having positive entries and of order 1 × 2 and 2 × 1, respectively. Which of BA or AB is invertible? Give reasons.
3. Let A be an n × m matrix and B be an m × n matrix. Prove that
1. I - BA is invertible if and only if I - AB is invertible [Use Theorem 2.2.3.1.2].
2. if I - AB is invertible then (I - BA)-1 = I + B(I - AB)-1A.
3. if I - AB is invertible then (I - BA)-1B = B(I - AB)-1.
4. if A,B and A + B are invertible then (A-1 + B-1)-1 = A(A + B)-1B.
4. Let A be a square matrix. Then prove that
1. A is invertible if and only if AT A is invertible.
2. A is invertible if and only if AAT is invertible.

We end this section by giving two more equivalent conditions for a matrix to be invertible.

Theorem 2.2.3.7. The following statements are equivalent for an n × n matrix A.

1. A is invertible.
2. The system Ax = b has a unique solution for every b.
3. The system Ax = b is consistent for every b.

Proof. 1 ⟹ 2.  Note that x0 = A-1b is the unique solution of Ax = b.

2 ⟹ 3.   The system is consistent as Ax = b has a solution.

3 ⟹ 1.   For 1 ≤ i ≤ n, define eiT = In[i,:]. By assumption, the linear system Ax = ei has a solution, say xi, for 1 ≤ i ≤ n. Define a matrix B = [x1,…,xn]. Then

Therefore, by Corollary 2.2.3.2, the matrix A is invertible. _

We now give an immediate application of Theorem 2.2.3.7 and Theorem 2.2.3.1 without proof.

Theorem 2.2.3.8. The following two statements cannot hold together for an n × n matrix A.

1. The system Ax = b has a unique solution for every b.
2. The system Ax = 0 has a non-trivial solution.

Exercise 2.2.3.9.

1. Let A and B be square matrices of order n with B = PA, for an invertible matrix P. Then prove that A is invertible if and only if B is invertible.
2. Let A and B be two m×n matrices. Then prove that A and B are equivalent if and only if B = PA, where P is a product of elementary matrices. When is this P unique?
3. Let bT = [1,2,-1,-2]. Suppose A is a 4 × 4 matrix such that the linear system Ax = b has no solution. Mark each of the statements given below as true or false.
1. The homogeneous system Ax = 0 has only the trivial solution.
2. The matrix A is invertible.
3. Let cT = [-1,-2,1,2]. Then the system Ax = c has no solution.
4. Let B = RREF(A). Then
1. B[4,:] = [0,0,0,0].
2. B[4,:] = [0,0,0,1].
3. B[3,:] = [0,0,0,0].
4. B[3,:] = [0,0,0,1].
5. B[3,:] = [0,0,1,α], where α is any real number.

#### 2.3.A Determinant

In this section, we associate a number with each square matrix. To start with, let A be an n × n matrix. Then, for 1 ≤ i ≤ k, 1 ≤ j ≤ ℓ and 1 ≤ αi, βj ≤ n, by A(α1,…,αk | β1,…,βℓ) we mean the sub-matrix of A obtained by deleting the rows corresponding to the αi’s and the columns corresponding to the βj’s.

Example 2.2.3.10. For A = , A(1|2) = and A(1,2|1,3) = [4].

With the notations as above, we are ready to give an inductive definition of the determinant of a square matrix. The advanced students can find the actual definition of the determinant in Appendix 7.7.1.22, where it is proved that the definition given below corresponds to the expansion of determinant along the first row.

Definition 2.2.3.11. [Determinant] Let A be a square matrix of order n. Then the determinant of A, denoted det(A) (or |A|) is defined by

Example 2.2.3.12.

1. Let A = [-2]. Then det(A) = |A| = -2.
2. Let A = . Then, det(A) = |A| = a·det(A(1|1)) − b·det(A(1|2)) = ad − bc. For example, if A = then det(A) = 1·5 − 2·3 = −1.
3. Let A = [aij] be a 3 × 3 matrix. Then,
For A = , |A| = 1 - 2 + 3 = 4 - 2(3) + 3(1) = 1.
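The inductive definition (expansion along the first row) translates into a short recursive function; a plain list-of-lists sketch, exponential in n and so only sensible for small matrices.

```python
def det(A):
    """Determinant via expansion along the first row (Definition 2.2.3.11)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # minor A(1 | j+1): delete row 1 and column j+1
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

assert det([[-2]]) == -2
assert det([[1, 2], [3, 5]]) == 1 * 5 - 2 * 3  # = -1, as in item 2 above
```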

Exercise 2.2.3.13. Find the determinant of the following matrices.
i)ii)iii).

Definition 2.2.3.14. [Singular, Non-Singular] A matrix A is said to be singular if det(A) = 0 and is called non-singular if det(A) ≠ 0.

The next result relates the determinant with row operations. For proof, see Appendix 7.2.

Theorem 2.2.3.15. Let A be an n × n matrix. If

1. B = EijA, for 1 ≤ i ≠ j ≤ n, then det(B) = -det(A).
2. B = Ei(c)A, for c ≠ 0, 1 ≤ i ≤ n, then det(B) = c·det(A).
3. B = Eij(c)A, for c ≠ 0 and 1 ≤ i ≠ j ≤ n, then det(B) = det(A).
4. A[i,:] = 0T, for some i, 1 ≤ i ≤ n, then det(A) = 0.
5. A[i,:] = A[j,:] for 1 ≤ i ≠ j ≤ n then det(A) = 0.
6. A is a triangular matrix with d1,…,dn on the diagonal then det(A) = d1⋯dn.
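Parts 1–3 can be spot-checked in SymPy (an illustrative matrix with det(A) = −3):

```python
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [4, 5, 6],
            [7, 8, 10]])
d = A.det()                                    # -3

B = A.copy(); B.row_swap(0, 2)                 # B = E_13 A
assert B.det() == -d                           # Part 1: sign flips

C = A.copy(); C[1, :] = 5 * C[1, :]            # C = E_2(5) A
assert C.det() == 5 * d                        # Part 2: scales by c

D = A.copy(); D[0, :] = D[0, :] + 7 * D[2, :]  # D = E_13(7) A
assert D.det() == d                            # Part 3: unchanged
```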

Example 2.2.3.16.

1. Since , using Theorem 2.2.3.15, we see that, for A = , det(A) = 2 (1 2 (-1)) = -4, where the first 2 appears from the elementary matrix E1().
2. For A = verify that |A|. Now, a successive application of E21(-1),E31(-1) and E41(-3) gives and then applying E32 and E43(-4), we get . Thus, by Theorem 2.2.3.15 det(A) = 2(-1)(8) = -16 as 2 gets contributed due to E1() and -1 due to E32.

Exercise 2.2.3.17. Use Theorem 2.2.3.15 to arrive at the answer.

1. Let A = ,B = and CT = for some complex numbers α and β. Prove that det(B) = α det(A) and det(C) = det(A).
2. Prove that 3 divides and = 0.

By Theorem 2.2.3.15.6 det(In) = 1. The next result about the determinant of the elementary matrices is an immediate consequence of Theorem 2.2.3.15 and hence the proof is omitted.

Corollary 2.2.3.18. Fix a positive integer n. Then

1. det(Eij) = -1.
2. For c ≠ 0, det(Ek(c)) = c.
3. For c ≠ 0, det(Eij(c)) = 1.

Remark 2.2.3.19. Theorem 2.2.3.15.1 implies that the determinant can be calculated by expanding along any row. Hence, the readers are advised to verify that

Example 2.2.3.20. Let us use Remark 2.2.3.19 to compute the determinant.

1. = (-1)2+2 + (-1)2+3 2 = (2 - 6) - 2(4 - 2) = -8.
2. = (-1)2+2 = 2 - 1 = 1.
3. = (-1)2+3 2 + (-1)2+4 = -2 1 + (-8) = -10.

Remark 2.2.3.21.

1. Let uT = [u1,u2], vT = [v1,v2] with u, v ∈ ℝ2. Now, consider the parallelogram on vertices P = [0,0]T ,Q = u,R = u + v and S = v (see Figure 3). Then Area(PQRS) = |u1v2 - u2v1|, the absolute value of .

Recall that the dot product of uT ,vT , denoted u v = u1v1 + u2v2, the length of u, denoted (u) = and cos(θ) = , where θ is the angle between u and v. Therefore

That is, in ℝ2, the determinant is ± times the area of the parallelogram.
2.
3. Consider Figure 3 again. Let u = [u1,u2,u3]T, v = [v1,v2,v3]T, w = [w1,w2,w3]T ∈ ℝ3. Then u ⋅ v = u1v1 + u2v2 + u3v3 and the cross product of u and v, denoted
Also, the vector u × v is perpendicular to the plane containing both u and v. So, if u3 = v3 = 0 then one can think of u and v as vectors in the XY -plane and in this case (u × v) = |u1v2 -u2v1| = Area(PQRS). Hence, if γ is the angle between the vector w and the vector u × v then
4. In general, for an n×n matrix A, det(A) satisfies all the properties associated with the volume of the n-dimensional parallelepiped. The actual proof is beyond the scope of this book. But, one can observe the following:
1. The volume of the n-dimensional unit cube is 1 = det(In).
2. If one vector is multiplied by c ≠ 0 then the volume either reduces or expands by a factor of |c|. The same holds for the determinant.
3. Recall that if the height and base of a parallelogram are not changed then the area remains the same. This corresponds to replacing the perpendicular vector with the perpendicular vector plus c ≠ 0 times the base vector, for different choices of c.
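The 2-dimensional case of this observation is easy to check numerically. A minimal pure-Python sketch (the vectors u and v are hypothetical, chosen so the area is obvious):

```python
# The parallelogram on P = 0, Q = u, R = u + v, S = v has
# Area(PQRS) = |u1*v2 - u2*v1|, the absolute value of the 2 x 2 determinant.
def det2(u, v):
    return u[0] * v[1] - u[1] * v[0]

u, v = (3, 0), (1, 2)        # base 3 along the x-axis, height 2
area = abs(det2(u, v))
print(area)                  # 6 = base * height
```

Swapping u and v flips the sign of the determinant but not the area, which is the "± the area" statement above.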

#### 2.3.B Adjugate (classically Adjoint) of a Matrix

Definition 2.2.3.22. [Minor, Cofactor] Let A be an n × n matrix. Then the

1. the (i,j)th minor of A, denoted Aij, equals det(A(i|j)), the determinant of the submatrix obtained by deleting the ith row and the jth column of A, for 1 ≤ i, j ≤ n.
2. the (i,j)th cofactor of A, denoted Cij, equals (-1)^{i+j} Aij.
3. the Adjugate (classically Adjoint) of A, denoted Adj(A), equals [bij] with bij = Cji, for 1 ≤ i, j ≤ n. For example, for A = , A11 = det(A(1|1)) = = 4, A12 = = 3, A13 = = 1, …, A32 = = -5 and A33 = = -1. So,
Adj(A) = = .

We now prove a very important result that relates the adjugate of a matrix to its inverse.

Theorem 2.2.3.23. Let A be an n × n matrix. Then

1. for 1 ≤ i ≤ n, ∑_{j=1}^{n} aij Cij = ∑_{j=1}^{n} aij (-1)^{i+j} Aij = det(A),
2. for i ≠ ℓ, ∑_{j=1}^{n} aij Cℓj = ∑_{j=1}^{n} aij (-1)^{ℓ+j} Aℓj = 0, and
3. A(Adj(A)) = det(A)In. Thus,
A^{-1} = (1/det(A)) Adj(A), whenever det(A) ≠ 0. (2.2.3.2)

Proof. Part 1: It follows directly from Remark 2.2.3.19 and the definition of the cofactor.

Part 2: Fix positive integers i, ℓ with 1 ≤ i, ℓ ≤ n and i ≠ ℓ, and let B = [bij] be a square matrix with B[ℓ,:] = A[i,:] and B[t,:] = A[t,:], for t ≠ ℓ. As i ≠ ℓ, B[ℓ,:] = B[i,:] and thus, by Theorem 2.2.3.15.5, det(B) = 0. As A(ℓ|j) = B(ℓ|j), for 1 ≤ j ≤ n, using Remark 2.2.3.19

This completes the proof of Part 2.

Part 3: Using Equation (2.2.3.3) and Remark 2.2.3.19, observe that

DRAFT
Thus, A(Adj(A)) = det(A)In. Therefore, if det(A) ≠ 0 then A ((1/det(A)) Adj(A)) = In. Hence, by Corollary 2.2.3.2, A^{-1} = (1/det(A)) Adj(A). _

Example 2.2.3.24. For A = , Adj(A) = and det(A) = -2. Thus, by Theorem 2.2.3.23.3, A-1 = .
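The definitions of minor, cofactor and adjugate translate directly into code. A minimal pure-Python sketch, using a hypothetical 3 × 3 matrix chosen so that det(A) = 1 (hence A^{-1} = Adj(A)):

```python
def minor(A, i, j):
    # submatrix A(i|j): delete row i and column j (0-based indices)
    return [[A[r][c] for c in range(len(A)) if c != j]
            for r in range(len(A)) if r != i]

def det(A):
    # Laplace expansion along the first row (any row works, by Remark 2.2.3.19)
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def adjugate(A):
    # Adj(A)[i][j] = C_ji = (-1)^(j+i) times the (j,i)th minor
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 3], [0, 1, 4], [5, 6, 0]]    # hypothetical matrix; det(A) = 1
adj = adjugate(A)
prod = [[sum(A[i][k] * adj[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)]
print(det(A))     # 1
print(prod)       # the identity: A * Adj(A) = det(A) * I_3
```
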

Let A be a non-singular matrix. Then, by Theorem 2.2.3.23.3, A^{-1} = (1/det(A)) Adj(A). Thus A·Adj(A) = Adj(A)·A = det(A)In and this completes the proof of the next result.

Corollary 2.2.3.25. Let A be a non-singular matrix. Then


The next result gives another equivalent condition for a square matrix to be invertible.

Theorem 2.2.3.26. A square matrix A is non-singular if and only if A is invertible.

Proof. Let A be non-singular. Then det(A) ≠ 0 and hence, by Theorem 2.2.3.23.3, A^{-1} = (1/det(A)) Adj(A). Thus, A is invertible.

Now, let us assume that A is invertible. Then, using Theorem 2.2.3.1, A = E1 ⋯ Ek, a product of elementary matrices. Also, by Corollary 2.2.3.18, det(Ei) ≠ 0 for 1 ≤ i ≤ k. Thus, a repeated application of Parts 1, 2 and 3 of Theorem 2.2.3.15 gives det(A) ≠ 0. _

The next result relates the determinant of the product of two matrices with their determinants.

Theorem 2.2.3.27. Let A and B be square matrices of order n. Then det(AB) = det(A) det(B).

Proof. Step 1. Let A be non-singular. Then, by Theorem 2.2.3.23.3, A is invertible and, by Theorem 2.2.3.1, A = E1 ⋯ Ek, a product of elementary matrices. Then a repeated application of Parts 1, 2 and 3 of Theorem 2.2.3.15 gives the desired result as

Step 2. Let A be singular. Then, by Theorem 2.2.3.26, A is not invertible. So, by Theorem 2.2.3.1 and Exercise 2.2.2.26.2, there exists an invertible matrix P such that PA = C, where C = . So, A = P^{-1}C. As P is invertible, using Part 1, we have
Thus, the proof of the theorem is complete. _
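A quick numeric spot-check of the product rule det(AB) = det(A) det(B), with hypothetical 2 × 2 matrices:

```python
def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]      # det = -2
B = [[0, 1], [5, 2]]      # det = -5
print(det2(matmul2(A, B)), det2(A) * det2(B))   # both are 10
```
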

The next result relates the determinant of a matrix with the determinant of its transpose. Thus, the determinant can be computed by expanding along any column as well.

Theorem 2.2.3.28. Let A be a square matrix. Then det(A) = det(AT ).

Proof. If A is non-singular, Corollary 2.2.3.25 gives det(A) = det(A^T).

If A is singular then, by Theorem 2.2.3.26, A is not invertible. So, AT is also not invertible and hence by Theorem 2.2.3.26, det(AT ) = 0 = det(A). _

Example 2.2.3.29. Let A be an orthogonal matrix. Then, by definition, AA^T = I. Thus, by Theorems 2.2.3.27 and 2.2.3.28,

Hence det(A) = ±1. In particular, if A = [a b; c d] then I = AA^T = [a² + b², ac + bd; ac + bd, c² + d²].
1. Thus, a² + b² = 1 and hence there exists θ ∈ [0, 2π] such that a = cos θ and b = sin θ.
2. As ac + bd = 0, we get c = r sin θ and d = -r cos θ, for some r ∈ ℝ. But, c² + d² = 1 implies that either c = sin θ and d = -cos θ or c = -sin θ and d = cos θ.
3. Thus, either A = [cos θ, sin θ; sin θ, -cos θ] or A = [cos θ, sin θ; -sin θ, cos θ].
4. In the first case, det(A) = -1. Then A represents a reflection across the line y = mx. Determine m (see Exercise 3.3b).
5. In the second case, det(A) = 1. Then A represents a rotation through the angle α. Determine α (see Exercise 3.3a).
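Both families can be checked numerically; a small sketch with a hypothetical angle θ = 0.7 (R is the rotation-type matrix, F the reflection-type one):

```python
import math

theta = 0.7
R = [[math.cos(theta), math.sin(theta)],
     [-math.sin(theta), math.cos(theta)]]   # rotation type
F = [[math.cos(theta), math.sin(theta)],
     [math.sin(theta), -math.cos(theta)]]   # reflection type

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

print(round(det2(R), 12), round(det2(F), 12))   # 1.0 -1.0
```
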

Exercise 2.2.3.30.

1. Let A be an n × n upper triangular matrix with non-zero entries on the diagonal. Then prove that A^{-1} is also an upper triangular matrix.
2. Let A be an n × n matrix. Then det(A) = 0 if either
1. A[i,:]^T = 0 or A[:,i] = 0, for some i, 1 ≤ i ≤ n, or
2. A[i,:] = cA[j,:] or A[:,i] = cA[:,j], for some c ∈ ℝ and for some i ≠ j.
3. Prove that = for some α, β ∈ ℝ.
4. Prove that 17 divides .

#### 2.3.C Cramer’s Rule

Let A be a square matrix. Then combining Theorem 2.2.3.7 and Theorem 2.2.3.26, one has the following result.

Corollary 2.2.3.31. Let A be a square matrix. Then the following statements are equivalent:

1. A is invertible.
2. The linear system Ax = b has a unique solution for every b.
3. det(A) ≠ 0.

Thus, Ax = b has a unique solution for every b if and only if det(A) ≠ 0. The next theorem gives a direct method of finding the solution of the linear system Ax = b when det(A) ≠ 0.

Theorem 2.2.3.32 (Cramer’s Rule). Let A be an n × n non-singular matrix. Then the unique solution of the linear system Ax = b with x^T = [x1, …, xn] is given by xj = det(Aj)/det(A), for 1 ≤ j ≤ n,

where Aj is the matrix obtained from A by replacing A[:,j] by b.

Proof. Since det(A) ≠ 0, A is invertible. Thus, there exists an invertible matrix P such that PA = In and P[A|b] = [I|Pb]. Let d = Pb. Then Ax = b has the unique solution xj = dj, for 1 ≤ j ≤ n. Also, [e1, …, en] = I = PA = [PA[:,1], …, PA[:,n]]. Thus,

Thus, det(PAj) = dj, for 1 ≤ j ≤ n. Also, dj = det(PAj) = det(P) det(Aj) = det(Aj)/det(A), as det(P) = 1/det(A). Hence, xj = det(Aj)/det(A) and the required result follows. _

Example 2.2.3.33. Solve Ax = b using Cramer’s rule, where A = and b = .
Solution: Check that det(A) = 1 and xT = [-1,1,0] as
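Cramer's rule is mechanical to implement. A minimal pure-Python sketch over the rationals; since the matrices of this example were lost in typesetting, the sketch is exercised on the 2 × 2 system from Example 1.1.1.3 (2x + 3y = 5, 3x + 2y = 5), whose solution is x = y = 1:

```python
from fractions import Fraction

def det(A):
    # Laplace expansion along the first row
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def cramer(A, b):
    d = det(A)
    assert d != 0, "Cramer's rule needs a non-singular A"
    xs = []
    for j in range(len(A)):
        # A_j: replace column j of A by b
        Aj = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(A)]
        xs.append(Fraction(det(Aj), d))
    return xs

print(cramer([[2, 3], [3, 2]], [5, 5]))   # x = y = 1
```
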

### 2.4 Miscellaneous Exercises

Exercise 2.2.4.1.

1. Suppose A^{-1} = B with A = and B = . Also, assume that A11 is invertible and define P = A22 - A21 A11^{-1} A12. Then prove that
1. = ,
2. P is invertible and B = .
2. Determine necessary and sufficient condition for a triangular matrix to be invertible.
3. Let A be a unitary matrix. What can you say about det(A)?
4. Let A be a 2 × 2 matrix with Tr(A) = 0 and det(A) = 0. Then A is a nilpotent matrix.
5. Let A and B be two non-singular matrices. Are the matrices A + B and A - B non-singular? Justify your answer.
6. Let A be an n × n matrix. Prove that the following statements are equivalent:
1. A is not invertible.
2. Rank(A) < n.
3. det(A) = 0.
4. A is not row-equivalent to In.
5. The homogeneous system Ax = 0 has a non-trivial solution.
6. The system Ax = b is either inconsistent or it has an infinite number of solutions.
7. A is not a product of elementary matrices.
7. For what value(s) of λ do the following systems have non-trivial solutions? Also, for each value of λ, determine a non-trivial solution.
1. (λ - 2)x + y = 0,x + (λ + 2)y = 0.
2. λx + 3y = 0,(λ + 6)y = 0.
8. Let a1, …, an ∈ ℝ and define A = [aij]n×n with aij = ai^{j-1}. Prove that det(A) = ∏_{1 ≤ i < j ≤ n} (aj - ai). This matrix is usually called the Vandermonde matrix.
9. Let A = [aij]n×n with aij = max{i,j}. Prove that det(A) = (-1)^{n-1} n.
10. Solve the following system of equations by Cramer’s rule.
i) x + y + z - w = 1, x + y - z + w = 2, 2x + y + z - w = 7, x + y + z + w = 3.
ii) x - y + z - w = 1, x + y - z + w = 2, 2x + y - z - w = 7, x - y - z + w = 3.
11. Let p ∈ ℝ, p ≠ 0. Let A = [aij] and B = [bij] be two n × n matrices with bij = p^{i-j} aij, for 1 ≤ i, j ≤ n. Then compute det(B) in terms of det(A).
12. The position of an element aij of a determinant is called even or odd according as i + j is even or odd. Prove that if all the entries in
1. odd positions are multiplied by -1 then the value of the determinant does not change.
2. even positions are multiplied by -1 then the value of the determinant
1. does not change if the matrix is of even order.
2. is multiplied by -1 if the matrix is of odd order.
13. Let A be a Hermitian matrix. Prove that det(A) is a real number.
14. Let A be an n × n matrix. Then A is invertible if and only if Adj(A) is invertible.
15. Let A and B be invertible matrices. Prove that Adj(AB) = Adj(B)Adj(A).
16. Let A be an invertible matrix and let P = . Then show that Rank(P) = n if and only if D = CA^{-1}B.

### 2.5 Summary

In this chapter, we started with a system of m linear equations in n unknowns and formally wrote it as Ax = b and, in turn, as the augmented matrix [A|b]. Then the basic operations on equations led to multiplication by elementary matrices on the left of [A|b], giving us the RREF, which in turn gave us the rank of a matrix. If Rank(A) = r and Rank([A|b]) = ra and

1. r < ra then the linear system Ax = b is inconsistent.
2. r = ra then the linear system Ax = b is consistent. Furthermore, if
1. r = n then the system Ax = b has a unique solution.
2. r < n then the system Ax = b has an infinite number of solutions.

We have also seen that the following conditions are equivalent for an n × n matrix A.

1. A is invertible.
2. The homogeneous system Ax = 0 has only the trivial solution.
3. The row reduced echelon form of A is I.
4. A is a product of elementary matrices.
5. The system Ax = b has a unique solution for every b.
6. The system Ax = b has a solution for every b.
7. rank(A) = n.
8. det(A) ≠ 0.

So, overall we have learnt to solve the following type of problems:

1. Solving the linear system Ax = b. This idea will lead to the question “is the vector b a linear combination of the columns of A”?
2. Solving the linear system Ax = 0. This will lead to the question “are the columns of A linearly independent/dependent”? In particular, if Ax = 0 has
1. a unique solution then the columns of A are linearly independent.
2. else, the columns of A are linearly dependent.


## Chapter 3 Vector Spaces

In this chapter, we will mainly be concerned with finite dimensional vector spaces over ℝ or ℂ. The last section consists of results about infinite dimensional vector spaces that are similar to, but different from, the finite dimensional case. We have given lots of examples of vector spaces that are infinite dimensional or are vector spaces over fields different from ℝ and ℂ. See the appendix for some ideas about fields different from ℝ and ℂ.

### 3.1 Vector Spaces: Definition and Examples

In this chapter, F denotes either ℝ, the set of real numbers, or ℂ, the set of complex numbers.

Let A be an m × n complex matrix and let V denote the solution set of the homogeneous system Ax = 0. Then, by Theorem 2.2.1.6, V satisfies:

1. 0 ∈ V as A0 = 0.
2. if x ∈ V then αx ∈ V, for all α ∈ ℂ. In particular, for α = -1, -x ∈ V.
3. if x, y ∈ V then, for any α, β ∈ ℂ, αx + βy ∈ V.
4. if x, y, z ∈ V then (x + y) + z = x + (y + z).

That is, the solution set of a homogeneous linear system satisfies some nice properties. The Euclidean plane ℝ² and the Euclidean space ℝ³ also satisfy the above properties. In this chapter, our aim is to understand sets that satisfy such properties. We start with the following definition.

Definition 3.3.1.1. [Vector Space] A vector space V over F, denoted V(F) or in short V (if the field F is clear from the context), is a non-empty set, satisfying the following axioms:

1. Vector Addition: To every pair u, v ∈ V there corresponds a unique element u ⊕ v ∈ V (called the addition of vectors) such that
1. u ⊕ v = v ⊕ u (Commutative law).
2. (u ⊕ v) ⊕ w = u ⊕ (v ⊕ w) (Associative law).
3. V has a unique element, denoted 0, called the zero vector that satisfies u ⊕ 0 = u, for every u ∈ V (called the additive identity).
4. for every u ∈ V there is a unique element -u ∈ V that satisfies u ⊕ (-u) = 0 (called the additive inverse).
2. Scalar Multiplication: For each u ∈ V and α ∈ F, there corresponds a unique element α ⊙ u in V (called the scalar multiplication) such that
1. α ⊙ (β ⊙ u) = (αβ) ⊙ u for every α, β ∈ F and u ∈ V (αβ is multiplication in F).
2. 1 ⊙ u = u for every u ∈ V, where 1 ∈ F.
3. Distributive Laws: relating vector addition with scalar multiplication
For any α, β ∈ F and u, v ∈ V, the following distributive laws hold:
1. α ⊙ (u ⊕ v) = (α ⊙ u) ⊕ (α ⊙ v).
2. (α + β) ⊙ u = (α ⊙ u) ⊕ (β ⊙ u) (+ is addition in F).

Definition 3.3.1.2.

1. The number 0 ∈ F, whereas the zero vector 0 ∈ V.
2. The elements of F are called scalars.
3. The elements of V are called vectors.
4. If F = ℝ then V is called a real vector space.
5. If F = ℂ then V is called a complex vector space.
6. In general, a vector space over ℝ or ℂ is called a linear space.

Some interesting consequences of Definition 3.3.1.1 are stated next. Intuitively, these results seem obvious but, for a better understanding of the axioms, it is desirable to go through the proofs.

Theorem 3.3.1.3. Let V be a vector space over F. Then

1. u ⊕ v = u implies v = 0.
2. α ⊙ u = 0 if and only if either u = 0 or α = 0.
3. (-1) ⊙ u = -u, for every u ∈ V.

Proof. Part 1: By Axiom 3.3.1.1.1d, for each u ∈ V there exists -u ∈ V such that -u ⊕ u = 0. Hence, u ⊕ v = u is equivalent to

Part 2: As 0 = 0 ⊕ 0, using Axiom 3.3.1.1.3, we have

Thus, using Part 1, α ⊙ 0 = 0 for any α ∈ F. In the same way, using Axiom 3.3.1.1.3b,
Hence, using Part 1, one has 0 ⊙ u = 0 for any u ∈ V.

Now suppose α ⊙ u = 0. If α = 0 then the proof is over. Therefore, assume that α ≠ 0, α ∈ F. Then (α)^{-1} ∈ F and

as 1 ⊙ u = u for every vector u ∈ V (see Axiom 2.2b). Thus, if α ≠ 0 and α ⊙ u = 0 then u = 0.

Part 3: As 0 = 0 ⊙ u = (1 + (-1)) ⊙ u = u ⊕ ((-1) ⊙ u), one has (-1) ⊙ u = -u. _

Example 3.3.1.4. The readers are advised to justify the statements given below.

1. Let A be an m × n matrix with complex entries with Rank(A) = r ≤ n. Then, using Theorem 2.2.2.34, V = {x | Ax = 0} is a vector space.
2. Consider ℝ with the usual addition and multiplication. That is, ⊕ ≡ + and ⊙ ≡ ·. Then, ℝ forms a real vector space.
3. Let ℝ² = {(x1,x2)^T | x1, x2 ∈ ℝ}. Then, for x1, x2, y1, y2 ∈ ℝ and α ∈ ℝ, define
Verify that ℝ² is a real vector space.
4. Let ℝⁿ = {(a1, …, an)^T | ai ∈ ℝ, 1 ≤ i ≤ n}. For u = (a1, …, an)^T, v = (b1, …, bn)^T ∈ V and α ∈ ℝ, define
(called component-wise operations). Then, V is a real vector space. The vector space ℝⁿ is called the real vector space of n-tuples.

Recall that the symbol i represents the complex number √(-1).

5. Consider ℂ = {x + iy | x, y ∈ ℝ}, the set of complex numbers. Let z1 = x1 + iy1 and z2 = x2 + iy2 and define z1 ⊕ z2 = (x1 + x2) + i(y1 + y2). For scalar multiplication,
1. let α ∈ ℝ. Then α ⊙ z1 = (αx1) + i(αy1) and we call ℂ a real vector space.
2. let α + iβ ∈ ℂ. Then (α + iβ) ⊙ (x1 + iy1) = (αx1 - βy1) + i(αy1 + βx1) and we call ℂ a complex vector space.
6. Let ℂⁿ = {(z1, …, zn)^T | zi ∈ ℂ, 1 ≤ i ≤ n}. For z = (z1, …, zn)^T, w = (w1, …, wn)^T ∈ ℂⁿ and α ∈ F, define
Then, verify that ℂⁿ forms a vector space over ℂ (called a complex vector space) as well as over ℝ (called a real vector space). In general, we assume ℂⁿ to be a complex vector space.

Remark 3.3.1.5. If F = ℂ then i(1,0) = (i,0) is allowed. Whereas, if F = ℝ then i(1,0) doesn’t make sense as i ∉ ℝ.

7. Fix m, n ∈ ℕ and let Mm,n(ℂ) = {Am×n = [aij] | aij ∈ ℂ}. For A, B ∈ Mm,n(ℂ) and α ∈ ℂ, define (A + αB)ij = aij + αbij. Then Mm,n(ℂ) is a complex vector space. If m = n, the vector space Mm,n(ℂ) is denoted by Mn(ℂ).
8. Let S be a non-empty set and let ℝ^S = {f | f is a function from S to ℝ}. For f, g ∈ ℝ^S and α ∈ ℝ, define (f + αg)(x) = f(x) + αg(x), for all x ∈ S. Then, ℝ^S is a real vector space. In particular,
1. for S = ℕ, observe that ℝ^ℕ, consisting of all real sequences, forms a real vector space.
2. Let V be the set of all bounded real sequences. Then V is a real vector space.
3. Let V be the set of all real sequences that converge to 0. Then V is a real vector space.
4. Let S be the set of all real sequences that converge to 1. Then check that S is not a vector space. Determine the conditions that fail.
9. Fix a, b ∈ ℝ with a < b and let C([a,b], ℝ) = {f : [a,b] → ℝ | f is continuous}. Then C([a,b], ℝ) with (f + αg)(x) = f(x) + αg(x), for all x ∈ [a,b], is a real vector space.
10. Let C(ℝ, ℝ) = {f : ℝ → ℝ | f is continuous}. Then C(ℝ, ℝ) with (f + αg)(x) = f(x) + αg(x), for all x ∈ ℝ, is a real vector space.
11. Fix a < b and let C²((a,b), ℝ) = {f : (a,b) → ℝ | f'' exists and f'' is continuous}. Then C²((a,b), ℝ) with (f + αg)(x) = f(x) + αg(x), for all x ∈ (a,b), is a real vector space.
12. Fix a < b and let C^∞((a,b), ℝ) = {f : (a,b) → ℝ | f is infinitely differentiable}. Then C^∞((a,b), ℝ) with (f + αg)(x) = f(x) + αg(x), for all x ∈ (a,b), is a real vector space.
13. Fix a < b ∈ ℝ. Then V = {f : (a,b) → ℝ | f'' + f' + 2f = 0} is a real vector space.
14. Let ℝ[x] = {p(x) | p(x) is a real polynomial in x}. Then, the usual addition of polynomials and α(a0 + a1x + ⋯ + anx^n) = (αa0) + ⋯ + (αan)x^n, for α ∈ ℝ, give ℝ[x] a real vector space structure.
15. Fix n ∈ ℕ and let ℝ[x; n] = {p(x) ∈ ℝ[x] | p(x) has degree ≤ n}. Then, the usual addition of polynomials and α(a0 + a1x + ⋯ + anx^n) = (αa0) + ⋯ + (αan)x^n, for α ∈ ℝ, give ℝ[x; n] a real vector space structure.
16. Let ℂ[x] = {p(x) | p(x) is a complex polynomial in x}. Then, the usual addition of polynomials and α(a0 + a1x + ⋯ + anx^n) = (αa0) + ⋯ + (αan)x^n, for α ∈ ℝ, give ℂ[x] a real vector space structure.
17. Let V = {0}. Then V is a real as well as a complex vector space.
18. Let ℝ+ = {x ∈ ℝ | x > 0}. Then
1. ℝ+ is not a vector space under the usual operations of addition and scalar multiplication.
2. ℝ+ is a real vector space with 1 as the additive identity if we define
19. For any α ∈ ℝ and x = (x1,x2)^T, y = (y1,y2)^T ∈ ℝ², define
Then ℝ² is a real vector space with (-1,3)^T as the additive identity.
20. Let V = {A = [aij] ∈ Mn(ℂ) | a11 = 0}. Then V is a complex vector space.
21. Let V = {A = [aij] ∈ Mn(ℂ) | A = A*}. Then V is a real vector space but not a complex vector space.
22. Let V and W be vector spaces over F, with operations (+, •) and (⊕, ⊙), respectively. Let V × W = {(v, w) | v ∈ V, w ∈ W}. Then V × W forms a vector space over F, if for every (v1,w1), (v2,w2) ∈ V × W and α ∈ F, we define

v1 + v2 and w1 ⊕ w2 on the right hand side mean vector addition in V and W, respectively. Similarly, α • v1 and α ⊙ w1 correspond to scalar multiplication in V and W, respectively.
23. Let ℚ be the set of scalars. Then ℝ is a vector space over ℚ. As e, π ∉ ℚ, these real numbers are vectors but not scalars in this space.
24. Similarly, ℂ is a vector space over ℚ. Since e - π, i ∉ ℚ, these complex numbers are vectors but not scalars in this space.
25. Let ℤ5 = {0,1,2,3,4} with addition and multiplication taken modulo 5.
Then, V = {(a,b) | a, b ∈ ℤ5} is a vector space over ℤ5 having 25 vectors.

Note that all our vector spaces, except the last three, are linear spaces.

From now on, we will use ‘u + v’ for ‘u ⊕ v’ and ‘αu or α · u’ for ‘α ⊙ u’.

Exercise 3.3.1.6.

1. Verify the axioms for vector spaces that appear in Example 3.3.1.4.
2. Does the set V given below form a real/complex or both real and complex vector space? Give reasons for your answer.
1. For x = (x1,x2)^T, y = (y1,y2)^T ∈ ℝ², define x + y = (x1 + y1, x2 + y2)^T and αx = (αx1, 0)^T for all α ∈ ℝ.
2. Let V = .
3. Let V = .
4. Let V = {(x,y,z)^T | x + y + z = 1}.
5. Let V = {(x,y)^T ∈ ℝ² | x · y = 0}.
6. Let V = {(x,y)^T ∈ ℝ² | x = y²}.
7. Let V = {α(1,1,1)^T + β(1,1,-1)^T | α, β ∈ ℝ}.
8. Let V = ℝ with x ⊕ y = x - y and α ⊙ x = -αx, for all x, y ∈ V and α ∈ ℝ.
9. Let V = ℝ². Define (x1,y1)^T ⊕ (x2,y2)^T = (x1 + x2, 0)^T and α ⊙ (x1,y1)^T = (αx1, 0)^T, for α, x1, x2, y1, y2 ∈ ℝ.

#### 3.1.A Subspaces

Definition 3.3.1.7. [Vector Subspace] Let V be a vector space over F. Then, a non-empty subset S of V is called a subspace of V if S is also a vector space with vector addition and scalar multiplication inherited from V.

Example 3.3.1.8.

1. If V is a vector space then V and {0} are subspaces, called trivial subspaces.
2. The real vector space ℝ has no non-trivial subspace.
3. W = {x ∈ ℝ³ | [1,2,-1]x = 0} is a plane in ℝ³ containing 0 (so a subspace).
4. W = {x 3|x = 0} is a line in 3 containing 0 (so a subspace).
5. The vector space ℝ[x; n] is a subspace of ℝ[x].
6. Prove that C²((a,b), ℝ) is a subspace of C((a,b), ℝ).
7. Prove that W = {(x,0)^T ∈ ℝ² | x ∈ ℝ} is a subspace of ℝ².
8. Is the set of sequences converging to 0 a subspace of the set of all bounded sequences?
9. Let V be the vector space of Example 3.3.1.4.19. Then
1. S = {(x,0)^T | x ∈ ℝ} is not a subspace of V as (x,0)^T ⊕ (y,0)^T = (x + y + 1, -3)^T ∉ S.
2. W = {(x,3)^T | x ∈ ℝ} is a subspace of V.
10. The vector space ℝ+ defined in Example 3.3.1.4.18 is not a subspace of ℝ.

Let V(F) be a vector space and let W ⊆ V, W ≠ ∅. We now prove a result which implies that to check that W is a subspace, we need to verify only one condition.

Theorem 3.3.1.9. Let V(F) be a vector space and let W ⊆ V, W ≠ ∅. Then W is a subspace of V if and only if αu + βv ∈ W whenever α, β ∈ F and u, v ∈ W.

Proof. Let W be a subspace of V and let u, v ∈ W. Then, for every α, β ∈ F, αu, βv ∈ W and hence αu + βv ∈ W.

Now, we assume that αu + βv ∈ W, whenever α, β ∈ F and u, v ∈ W. To show that W is a subspace of V:

1. Taking α = 1 and β = 1, we see that u + v ∈ W, for every u, v ∈ W.
2. Taking α = 0 and β = 0, we see that 0 ∈ W.
3. Taking β = 0, we see that αu ∈ W, for every α ∈ F and u ∈ W. Hence, using Theorem 3.3.1.3.3, -u = (-1)u ∈ W as well.
4. The commutative and associative laws of vector addition hold as they hold in V.
5. The axioms related with scalar multiplication and the distributive laws also hold as they hold in V.

Thus, one obtains the required result. _

Exercise 3.3.1.10.

1. Determine all the subspaces of ℝ and ℝ².
2. Prove that a line in ℝ² is a subspace if and only if it passes through (0,0) ∈ ℝ².
3. Are all the sets given below subspaces of C([-1,1], ℝ)?
1. W = {f ∈ C([-1,1]) | f(1/2) = 0}.
2. W = {f ∈ C([-1,1]) | f(-1/2) = 0, f(1/2) = 0}.
3. W = {f C([-1,1])|f() exists }.
4. Are all the sets given below subspaces of ℝ[x]?
1. W = {f(x) ∈ ℝ[x] | deg(f(x)) = 3}.
2. W = {f(x) ∈ ℝ[x] | deg(f(x)) = 0}.
3. W = {f(x) ∈ ℝ[x] | f(1) = 0}.
4. W = {f(x) ∈ ℝ[x] | f(0) = 0, f(1/2) = 0}.
5. Which of the following are subspaces of ℝⁿ?
1. {(x1, x2, …, xn)^T | x1 ≥ 0}.
2. {(x1, x2, …, xn)^T | x1 is rational}.
3. {(x1, x2, …, xn)^T | |x1| ≤ 1}.
6. Among the following, determine the subspaces of the complex vector space ℂⁿ.
1. {(z1, z2, …, zn)^T | z1 is real}.
2. {(z1, z2, …, zn)^T | z1 + z2 = z3}.
3. {(z1, z2, …, zn)^T | z1 = z2}.
7. Fix n ∈ ℕ. Then, is W a subspace of Mn(ℝ), where
1. W = {A ∈ Mn(ℝ) | A is upper triangular}?
2. W = {A ∈ Mn(ℝ) | A is symmetric}?
3. W = {A ∈ Mn(ℝ) | A is skew-symmetric}?
4. W = {A | A is a diagonal matrix}?
5. W = {A | trace(A) = 0}?
6. W = {A ∈ Mn(ℝ) | A^T = 2A}?
7. W = {A = [aij] | a11 + a21 + a34 = 0}?
8. Fix n ∈ ℕ. Then, is W = {A = [aij] | a11 + a22 = 0} a subspace of the complex vector space Mn(ℂ)? What if Mn(ℂ) is considered as a real vector space?
9. Prove that the following sets are not subspaces of Mn(ℝ).
1. G = {A ∈ Mn(ℝ) | det(A) = 0}.
2. G = {A ∈ Mn(ℝ) | det(A) ≠ 0}.
3. G = {A ∈ Mn(ℝ) | det(A) = 1}.
DRAFT

#### 3.1.B Linear Span

Definition 3.3.1.11. [Linear Combination] Let V be a vector space over F and let u1, …, un ∈ V. Then, a vector u ∈ V is said to be a linear combination of u1, …, un if we can find scalars α1, …, αn ∈ F such that u = α1u1 + ⋯ + αnun.

Example 3.3.1.12.

1. (3,4,3) = 2(1,1,1) + (1,2,1) but we cannot find a, b ∈ ℝ such that (3,4,5) = a(1,1,1) + b(1,2,1).
2. Is (4,5,5) a linear combination of (1,0,0), (2,1,0), and (3,3,1)?
Solution: (4,5,5) is a linear combination if the linear system
 (3.3.1.1)

in the unknowns a, b, c has a solution. Clearly, Equation (3.3.1.1) has the solution a = 9, b = -10 and c = 5.

3. Is (4,5,5) a linear combination of the vectors (1,2,3),(-1,1,4) and (3,3,2)?
Solution: The vector (4,5,5) is a linear combination if the linear system
 (3.3.1.2)

in the unknowns a, b, c has a solution. The RREF of the corresponding augmented matrix equals , implying an infinite number of solutions. For example, (4,5,5) = 3(1,2,3) - (-1,1,4).

4. Is (4,5,5) a linear combination of the vectors (1,2,1),(1,0,-1) and (1,1,0)?
Solution: The vector (4,5,5) is a linear combination if the linear system
 (3.3.1.3)

in the unknowns a, b, c has a solution. The RREF of the corresponding augmented matrix equals . So, Equation (3.3.1.3) has no solution. Thus, (4,5,5) is not a linear combination of the given vectors.
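Each of the examples above is a consistency question for a linear system, so it can be settled by Gauss-Jordan elimination. A pure-Python sketch over the rationals (the helper name as_combination is ours), exercised on two of the examples:

```python
from fractions import Fraction

def as_combination(vectors, target):
    """Gauss-Jordan elimination on [v1 ... vn | target]; returns one list of
    coefficients if target lies in the span of the vectors, else None."""
    m, n = len(target), len(vectors)
    M = [[Fraction(vectors[j][i]) for j in range(n)] + [Fraction(target[i])]
         for i in range(m)]
    row, pivots = 0, []
    for col in range(n):
        piv = next((r for r in range(row, m) if M[r][col] != 0), None)
        if piv is None:
            continue
        M[row], M[piv] = M[piv], M[row]
        M[row] = [x / M[row][col] for x in M[row]]
        for r in range(m):
            if r != row and M[r][col] != 0:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[row])]
        pivots.append(col)
        row += 1
    if any(M[r][n] != 0 for r in range(row, m)):
        return None                    # a row [0 ... 0 | nonzero]: inconsistent
    coeffs = [Fraction(0)] * n
    for r, col in enumerate(pivots):
        coeffs[col] = M[r][n]
    return coeffs

# item 2: (4,5,5) = 9(1,0,0) - 10(2,1,0) + 5(3,3,1)
print(as_combination([(1, 0, 0), (2, 1, 0), (3, 3, 1)], (4, 5, 5)))
# item 4: (4,5,5) is not a combination of (1,2,1), (1,0,-1), (1,1,0)
print(as_combination([(1, 2, 1), (1, 0, -1), (1, 1, 0)], (4, 5, 5)))
```
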

Exercise 3.3.1.13.

1. Let x ∈ ℝ³. Prove that x^T is a linear combination of (1,0,0), (2,1,0) and (3,3,1). Is this linear combination unique? That is, does there exist (a,b,c) ≠ (e,f,g) with x^T = a(1,0,0) + b(2,1,0) + c(3,3,1) = e(1,0,0) + f(2,1,0) + g(3,3,1)?
2. Find condition(s) on x,y,z such that (x,y,z) is a linear combination of
1. (1,2,3),(-1,1,4) and (3,3,2).
2. (1,2,1),(1,0,-1) and (1,1,0).
3. (1,1,1),(1,1,0) and (1,-1,0).

Definition 3.3.1.14. [Linear Span] Let V be a vector space over F and S = {u1, …, un} ⊆ V. Then, the linear span of S, denoted LS(S), equals

LS(S) = {α1u1 + ⋯ + αnun | αi ∈ F, 1 ≤ i ≤ n}. If S is an empty set, we define LS(S) = {0}.

Example 3.3.1.15. For the set S given below, determine LS(S).

1. S = {(1,0)^T, (0,1)^T} ⊆ ℝ².
Solution: LS(S) = {a(1,0)^T + b(0,1)^T | a, b ∈ ℝ} = {(a,b)^T | a, b ∈ ℝ} = ℝ².
2. S = {(1,1,1)^T, (2,1,3)^T}. What is the geometrical representation of LS(S)?
Solution: LS(S) = {a(1,1,1)^T + b(2,1,3)^T | a, b ∈ ℝ} = {(a + 2b, a + b, a + 3b)^T | a, b ∈ ℝ}. For the geometrical representation, we need to find conditions on x, y and z such that (a + 2b, a + b, a + 3b) = (x,y,z). Or equivalently, the system a + 2b = x, a + b = y, a + 3b = z, in the unknowns a, b, has a solution. Check that the RREF of the augmented matrix equals . Thus, we need 2x - y - z = 0. Hence, LS(S) is a plane given by
3. S = {(1,2,1)T ,(1,0,-1)T ,(1,1,0)T }. What is the geometrical representation of LS(S)?
Solution: As above, we need to find condition(s) on x,y,z such that the linear system
 (3.3.1.4)

in the unknowns a,b,c is always consistent. An application of GJE to Equation (3.3.1.4) gives . Thus,

4. S = {(1,2,3)T ,(-1,1,4)T ,(3,3,2)T }.
Solution: As above, we need to find condition(s) on x, y, z such that the linear system
in the unknowns a, b, c is always consistent. An application of the GJE method gives 5x - 7y + 3z = 0 as the required condition. Thus,
5. S = {(1,2,3,4)^T, (-1,1,4,5)^T, (3,3,2,3)^T} ⊆ ℝ⁴.
Solution: The readers are advised to show that
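The plane found in item 2 can be spot-checked: every linear combination a(1,1,1)^T + b(2,1,3)^T satisfies 2x - y - z = 0. A short sketch:

```python
# each combination gives (x, y, z) = (a + 2b, a + b, a + 3b)
for a in range(-3, 4):
    for b in range(-3, 4):
        x, y, z = a + 2 * b, a + b, a + 3 * b
        assert 2 * x - y - z == 0
print("all sampled combinations lie on the plane 2x - y - z = 0")
```
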

Exercise 3.3.1.16. For each S, determine the geometric representation of LS(S).

1. S = {-1} ⊆ ℝ.
2. S = {π} ⊆ ℝ.
3. S = {(1,0,1)^T, (0,1,0)^T, (3,0,3)^T} ⊆ ℝ³.
4. S = {(1,2,1)^T, (2,0,1)^T, (1,1,1)^T} ⊆ ℝ³.
5. S = {(1,0,1,1)^T, (0,1,0,1)^T, (3,0,3,1)^T} ⊆ ℝ⁴.

Definition 3.3.1.17. [Finite Dimensional Vector Space] Let V be a vector space over F. Then V is called finite dimensional if there exists S ⊆ V such that S has a finite number of elements and V = LS(S). If no such S exists then V is called infinite dimensional.

Example 3.3.1.18.

1. {(1,2)^T, (2,1)^T} spans ℝ². Thus, ℝ² is finite dimensional.
2. {1, 1 + x, 1 - x + x², x³, x⁴, x⁵} spans ℝ[x; 5]. Thus, ℝ[x; 5] is finite dimensional.
3. Fix n ∈ ℕ. Then, ℝ[x; n] is finite dimensional as ℝ[x; n] = LS({1, x, x², …, x^n}).
4. ℝ[x] is not finite dimensional as the degree of a polynomial can be any large positive integer. Indeed, verify that ℝ[x] = LS({1, x, x², …, x^n, …}).
5. The vector space ℝ over ℚ is infinite dimensional.
6. The vector space ℂ over ℚ is infinite dimensional.

Lemma 3.3.1.19 (Linear Span is a Subspace). Let V be a vector space over F and S ⊆ V. Then LS(S) is a subspace of V.

Proof. By definition, 0 ∈ LS(S). So, LS(S) is non-empty. Let u, v ∈ LS(S). To show: au + bv ∈ LS(S) for all a, b ∈ F. As u, v ∈ LS(S), there exist n ∈ ℕ, vectors wi ∈ S and scalars αi, βi ∈ F such that u = α1w1 + ⋯ + αnwn and v = β1w1 + ⋯ + βnwn. Hence,

as aαi + bβi ∈ F for 1 ≤ i ≤ n. Thus, by Theorem 3.3.1.9, LS(S) is a vector subspace. _

Remark 3.3.1.20. Let V be a vector space over F. If W is a subspace of V and S ⊆ W then LS(S) is a subspace of W as well.

Theorem 3.3.1.21. Let V be a vector space over F and S ⊆ V. Then LS(S) is the smallest subspace of V containing S.

Proof. For every u ∈ S, u = 1 · u ∈ LS(S). Thus, S ⊆ LS(S). We need to show that LS(S) is the smallest subspace of V containing S. So, let W be any subspace of V containing S. Then, by Remark 3.3.1.20, LS(S) ⊆ W and hence the result follows. _

Definition 3.3.1.22. Let V be a vector space over F.

1. Let S and T be two subsets of V. Then, the sum of S and T, denoted S + T, equals {s + t | s ∈ S, t ∈ T}. For example,
1. if V = ℝ, S = {0,1,2,3,4,5,6} and T = {5,10,15} then S + T = {5,6,…,21}.
2. if V = ℝ², S = and T = then S + T = .
3. if V = ℝ², S = and T = LS then S + T = .
2. Let P and Q be two subspaces of V. Then, we define their sum, denoted P + Q, as P + Q = {u + v | u ∈ P, v ∈ Q}. For example, P + Q = ℝ², if
1. P = {(x,0)^T | x ∈ ℝ} and Q = {(0,x)^T | x ∈ ℝ} as (x,y) = (x,0) + (0,y).
2. P = {(x,0)^T | x ∈ ℝ} and Q = {(x,x)^T | x ∈ ℝ} as (x,y) = (x - y, 0) + (y,y).
3. P = LS((1,2)^T) and Q = LS((2,1)^T) as (x,y) = α(1,2) + β(2,1) for suitable α, β ∈ ℝ.

We leave the proof of the next result for readers.

Lemma 3.3.1.23. Let V be a vector space over F and let P and Q be two subspaces of V. Then P + Q is the smallest subspace of V containing both P and Q.

Exercise 3.3.1.24.

1. Let a ∈ ℝ², a ≠ 0. Then show that {x ∈ ℝ² | a^T x = 0} is a non-trivial subspace of ℝ². Geometrically, what does this set represent in ℝ²?
2. Find all subspaces of ℝ³.
3. Prove that {(x,y,z)^T ∈ ℝ³ | ax + by + cz = d} is a subspace of ℝ³ if and only if d = 0.
4. Let W = {f(x) ∈ ℝ[x] | deg(f(x)) = 5}. Prove that W is not a subspace of ℝ[x].
5. Determine all subspaces of the vector space in Example 3.3.1.4.19.
6. Let U = and W = be subspaces of M2(ℝ). Determine U ∩ W. Is M2(ℝ) = U ∪ W? What is U + W?
7. Let W and U be two subspaces of a vector space V over F.
1. Prove that W ∩ U is a subspace of V.
2. Give examples of W and U such that W ∪ U is not a subspace of V.
3. Determine conditions on W and U such that W ∪ U is a subspace of V.
4. Prove that LS(W ∪ U) = W + U.
8. Let S = {x1,x2,x3,x4}, where x1 = (1,0,0)T ,x2 = (1,1,0)T ,x3 = (1,2,0)T and x4 = (1,1,1)T . Then, determine all xi such that LS(S) = LS(S \{xi}).
9. Let W = LS((1,0,0)^T, (1,1,0)^T) and U = LS((1,1,1)^T). Prove that W + U = ℝ³ and W ∩ U = {0}. If u ∈ ℝ³, determine uW ∈ W and uU ∈ U such that u = uW + uU. Is it necessary that uW and uU are unique?
10. Let W = LS((1,-1,0), (1,1,0)) and U = LS((1,1,1), (1,2,1)). Prove that W + U = ℝ³ and W ∩ U ≠ {0}. Find u ∈ ℝ³ such that when we write u = uW + uU, with uW ∈ W and uU ∈ U, the vectors uW and uU are not unique.

#### 3.1.C Fundamental Subspaces Associated with a Matrix

Definition 3.3.1.25. Let A ∈ Mm,n(ℂ). Then, we use the functions P : ℂⁿ → ℂᵐ and Q : ℂᵐ → ℂⁿ defined by P(x) = Ax and Q(y) = A*y, respectively, for x ∈ ℂⁿ and y ∈ ℂᵐ, to get the four fundamental subspaces associated with A, namely,

1. Col(A) = {Ax | x ∈ ℂⁿ} = Rng(P) is a subspace of ℂᵐ, called the Column / Range space. Observe that Col(A) is the linear span of the columns of A.
2. Null(A) = {x ∈ ℂⁿ | Ax = 0} = Null(P) is a subspace of ℂⁿ, called the Null space.
3. Col(A*) = {A*y | y ∈ ℂᵐ} = Rng(Q) is the linear span of the rows of A = [aij]. If A ∈ Mm,n(ℝ) then Col(A*) reduces to Row(A) = {x^T A | x ∈ ℝᵐ}, the row space of A.
4. Null(A*) = {y ∈ ℂᵐ | A*y = 0} = Null(Q). If A ∈ Mm,n(ℝ) then Null(A*) reduces to Null(A^T) = {x ∈ ℝᵐ | x^T A = 0^T}.

Example 3.3.1.26.

1. Let A = . Then
1. Col(A) = {x = (x1,x2,x3)^T ∈ ℝ³ | x1 - x2 + x3 = 0}.
2. Row(A) = {x = (x1,x2,x3)^T ∈ ℝ³ | x1 + x2 - 2x3 = 0}.
3. Null(A) = LS((1,1,-2)^T).
4. Null(A^T) = LS((1,-1,1)^T).
2. Let A = . Then
1. Col(A) = {x = (x1,x2,x3)^T ∈ ℝ³ | x1 + x2 - x3 = 0}.
2. Row(A) = {x = (x1,x2,x3,x4)^T ∈ ℝ⁴ | x1 - x2 - 2x3 = 0, x1 - 3x2 - 2x4 = 0}.
3. Null(A) = LS({(1,-1,-2,0)^T, (1,-3,0,-2)^T}).
4. Null(A^T) = LS((1,1,-1)^T).
3. Let A = . Then
1. Col(A) = {(x1,x2,x3) ∈ ℂ³ | (2 + i)x1 - (1 - i)x2 - x3 = 0}.
2. Col(A*) = {(x1,x2,x3) ∈ ℂ³ | ix1 - x2 + x3 = 0}.
3. Null(A) = LS((i,1,-1)^T).
4. Null(A*) = LS((-2 + i, 1 + i, 1)^T).

Remark 3.3.1.27. Let A ∈ Mm,n(ℝ). Then, in Example 3.3.1.26, observe that the direction ratios of Col(A) match the vector(s) in Null(A^T). Similarly, the direction ratios of Row(A) match the vectors in Null(A). What are the relationships in case A ∈ Mm,n(ℂ)? We will come back to these spaces again and again.
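The real case of this observation can be illustrated with a small hypothetical matrix: for A below, n = (1,1,-1)^T spans Null(A^T), and n is orthogonal to every vector of Col(A).

```python
# hypothetical 3 x 2 real matrix; columns (1,0,1)^T and (0,1,1)^T
A = [[1, 0], [0, 1], [1, 1]]
n = (1, 1, -1)                 # n^T A = 0^T, so n lies in Null(A^T)

# n is orthogonal to each column of A ...
for j in range(2):
    assert sum(n[i] * A[i][j] for i in range(3)) == 0

# ... hence to every Ax in Col(A)
for x1 in range(-2, 3):
    for x2 in range(-2, 3):
        y = [A[i][0] * x1 + A[i][1] * x2 for i in range(3)]
        assert sum(ni * yi for ni, yi in zip(n, y)) == 0
print("Col(A) is the plane with normal (1, 1, -1)")
```
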

Let V be a vector space over either or . Then, we have learnt that

1. for any S ⊆ V, LS(S) is again a vector space. Moreover, LS(S) is the smallest subspace containing S.
2. unless S = ∅ or S = {0}, LS(S) has an infinite number of vectors.

Therefore, the following questions arise:

1. Are there conditions under which LS(S1) = LS(S2) for S1 ≠ S2?
2. Is it always possible to find S so that LS(S) = V?
3. Suppose we have found S ⊆ V such that LS(S) = V. Can we find the minimum number of vectors in S?

We try to answer these questions in the subsequent sections.

### 3.2 Linear Independence

Definition 3.3.2.1. [Linear Independence and Dependence] Let S = {u1, …, um} be a non-empty subset of a vector space V over F. Then the set S is said to be linearly independent if the system of linear equations
α1u1 + α2u2 + ⋯ + αmum = 0 (3.3.2.1)

in the unknowns αi’s, 1 ≤ i ≤ m, has only the trivial solution. If the system (3.3.2.1) has a non-trivial solution then the set S is said to be linearly dependent.

If the set S has infinitely many vectors then S is said to be linearly independent if for every finite subset T of S, T is linearly independent, else S is linearly dependent.

Let V be a vector space over F and S = {u1,…,um} ⊆ V with S ≠ ∅. Then, one needs to solve the linear system of equations
α1u1 + α2u2 + ⋯ + αmum = 0 (3.3.2.2)

in the unknowns α1,…,αm ∈ F. If α1 = ⋯ = αm = 0 is the only solution of (3.3.2.2) then S is a linearly independent subset of V. Otherwise, S is a linearly dependent subset of V. Since one is solving a linear system over F, linear independence and dependence depend on F, the set of scalars.

Example 3.3.2.2. Is the set S linearly independent? Give reasons.

1. Let S = {(1,2,1)T ,(2,1,4)T ,(3,3,5)T } ⊆ ℝ³.
Solution: Consider the system a(1,2,1) + b(2,1,4) + c(3,3,5) = (0,0,0) in the unknowns a, b and c. As the rank of the coefficient matrix is 2, which is less than 3, the number of unknowns, the system has a non-trivial solution. Thus, S is a linearly dependent subset of ℝ³.
2. Let S = {(1,1,1)T ,(1,1,0)T ,(1,0,1)T } ⊆ ℝ³.
Solution: Consider the system a(1,1,1) + b(1,1,0) + c(1,0,1) = (0,0,0) in the unknowns a, b and c. As the rank of the coefficient matrix is 3, the number of unknowns, the system has only the trivial solution. Hence, S is a linearly independent subset of ℝ³.
3. Consider ℂ as a complex vector space and let S = {1,i}.
Solution: Since ℂ is a complex vector space, i·1 + (-1)·i = i - i = 0. So, S is a linearly dependent subset of the complex vector space ℂ.
4. Consider ℂ as a real vector space and let S = {1,i}.
Solution: Consider the linear system a·1 + b·i = 0 in the unknowns a, b ∈ ℝ. Since a, b ∈ ℝ, equating real and imaginary parts, we get a = b = 0. So, S is a linearly independent subset of the real vector space ℂ.
5. Let A ∈ Mm,n(ℂ). If Rank(A) < min{m,n} then the rows of A are linearly dependent.
Solution: Let B = RREF(A). Then, there exists an invertible matrix P = [pij] such that B = PA. Since Rank(A) < min{m,n}, B[m,:] = 0T . Thus, 0T = B[m,:] = ∑_{i=1}^{m} pmiA[i,:]. As P is invertible, at least one pmi ≠ 0. Thus, the required result follows.
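The rank arguments in parts 1 and 2 above can be checked numerically. A minimal sketch using NumPy (the matrices A1 and A2 below simply stack the vectors of the two sets as columns; this is our own illustration, not part of the notes):

```python
import numpy as np

# Part 1: columns are (1,2,1), (2,1,4), (3,3,5); note (3,3,5) = (1,2,1) + (2,1,4).
A1 = np.array([[1, 2, 3],
               [2, 1, 3],
               [1, 4, 5]])

# Part 2: columns are (1,1,1), (1,1,0), (1,0,1).
A2 = np.array([[1, 1, 1],
               [1, 1, 0],
               [1, 0, 1]])

# Rank < number of unknowns  =>  non-trivial solution  =>  linearly dependent.
print(np.linalg.matrix_rank(A1))  # 2, so the set in part 1 is dependent
print(np.linalg.matrix_rank(A2))  # 3, so the set in part 2 is independent
```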

#### 3.2.A Basic Results related to Linear Independence

The reader is expected to supply the proof of parts that are not given.

Proposition 3.3.2.3. Let V be a vector space over F. Then,

1. 0, the zero-vector, cannot belong to a linearly independent set.
2. every non-empty subset of a linearly independent set in V is also linearly independent.
3. a set containing a linearly dependent set of V is also linearly dependent.

Proof. Let S = {0 = u1, u2, …, un}. Then, 1·u1 + 0·u2 + ⋯ + 0·un = 0. Hence, the system α1u1 + ⋯ + αnun = 0 has a non-trivial solution [α1, α2, …, αn] = [1, 0, …, 0]. Thus, the set S is linearly dependent. _

Theorem 3.3.2.4. Let V be a vector space over F. Let S = {u1,…,uk} ⊆ V with S ≠ ∅. If T ⊆ LS(S) is such that m = |T| > k then T is a linearly dependent set.

Proof. Let T = {w1,…,wm}. As wi ∈ LS(S), there exist aij ∈ F such that wi = ai1u1 + ⋯ + aikuk, for 1 ≤ i ≤ m. So,

As m > k, using Corollary 2.2.2.35, the system AT x = 0 has a non-trivial solution, say xT = [α1,…,αm] ≠ 0T . That is, ∑_{i=1}^{m} αiAT [:,i] = 0. Or equivalently, ∑_{i=1}^{m} αiA[i,:] = 0T . Thus,
Thus, the set T is linearly dependent. _

Corollary 3.3.2.5. Fix n ∈ ℕ. Then, any set S ⊆ ℝⁿ (or ℂⁿ) with |S| ≥ n + 1 is linearly dependent.

Proof. Observe that the whole space equals LS({e1,…,en}), where ei = In[:,i], the i-th column of In. Hence, the required result follows using Theorem 3.3.2.4. _

Theorem 3.3.2.6. Let V be a vector space over F and let S be a linearly independent subset of V. Suppose v ∈ V. Then S ∪ {v} is linearly dependent if and only if v ∈ LS(S).

Proof. Let us assume that S ∪ {v} is linearly dependent. Then, there exist v1,…,vp ∈ S such that the system α1v1 + ⋯ + αpvp + αp+1v = 0 has a non-trivial solution, say αi = ci, for 1 ≤ i ≤ p + 1. As the solution is non-trivial, one of the ci’s is non-zero. We claim that cp+1 ≠ 0.

For, if cp+1 = 0 then the system α1v1 + ⋯ + αpvp = 0 in the unknowns α1,…,αp has a non-trivial solution [c1,…,cp]. This contradicts Proposition 3.3.2.3.2 as {v1,…,vp} is a subset of the linearly independent set S. Thus, cp+1 ≠ 0 and we get

v = ∑_{i=1}^{p} (-ci/cp+1) vi,

as -ci/cp+1 ∈ F, for 1 ≤ i ≤ p. Hence, v is a linear combination of v1,…,vp.

Now, assume that v ∈ LS(S). Then, there exist ci ∈ F, not all zero, and vi ∈ S such that v = ∑_{i=1}^{p} civi. Thus, the system α1v1 + ⋯ + αpvp + αp+1v = 0 in the unknowns αi’s has a non-trivial solution [c1,…,cp,-1]. Hence, S ∪ {v} is linearly dependent. _

We now state a very important corollary of Theorem 3.3.2.6 without proof.

Corollary 3.3.2.7. Let V be a vector space over F and let S = {u1,…,un} ⊆ V. If S is

1. linearly dependent then there exists k, 2 ≤ k ≤ n, with LS(u1,…,uk) = LS(u1,…,uk-1).
2. linearly independent then v ∈ V \ LS(S) if and only if S ∪ {v} = {u1,…,un,v} is also a linearly independent subset of V.
3. linearly independent then LS(S) = V if and only if each proper superset of S is linearly dependent.

#### 3.2.B Application to Matrices

We leave the proof of the next result for readers.

Theorem 3.3.2.8. The following statements are equivalent for A ∈ Mn(ℂ).

1. A is invertible.
2. The columns of A are linearly independent.
3. The rows of A are linearly independent.

A generalization of Theorem 3.3.2.8 is stated next.

Theorem 3.3.2.9. Let A ∈ Mm,n(ℂ) with B = RREF(A). Then, the rows of A corresponding to the pivotal rows of B are linearly independent. Also, the columns of A corresponding to the pivotal columns of B are linearly independent.

Proof. Pivotal rows of B are linearly independent due to the pivotal 1’s. Now, let B1 be the submatrix of B consisting of the pivotal rows of B. Let A1 be the submatrix of A which gives B1. As the RREF of a matrix is unique (see Corollary 2.2.2.17) there exists an invertible matrix Q such that QA1 = B1. So, if there exists c ≠ 0 such that cT A1 = 0T then

0T = cT A1 = cT Q-1B1 = dT B1,

with dT = cT Q-1 ≠ 0T as Q is an invertible matrix (see Theorem 2.2.3.1). This contradicts the linear independence of the pivotal rows of B.

Let B[:,i1],…,B[:,ir] be the pivotal columns of B, which are linearly independent due to the pivotal 1’s. Let B = PA. Then, if ∑_{j=1}^{r} cjA[:,ij] = 0 for some scalars cj, the corresponding columns of B satisfy

∑_{j=1}^{r} cjB[:,ij] = P(∑_{j=1}^{r} cjA[:,ij]) = 0,

so each cj = 0 by the linear independence of the pivotal columns of B. Hence, the required result follows as P is an invertible matrix. _

We give an example for better understanding.

Example 3.3.2.10. Let A = with RREF(A) = B = . Then B[:,3] = -B[:,1]+2B[:,2]. Thus, A[:,3] = -A[:,1]+2A[:,2]. As the 1-st, 2-nd and 4-th columns of B are linearly independent, the set {A[:,1],A[:,2],A[:,4]} is linearly independent. Also, note that during the application of GJE to get RREF, we have interchanged the 3-rd and 4-th rows. Hence, the rows A[1,:],A[2,:] and A[4,:] are linearly independent.
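The same computation can be reproduced interactively with SymPy’s `rref`, which returns both the RREF and the indices of the pivotal columns. The matrix M below is a small example of our own choosing (the A of Example 3.3.2.10 is not reproduced here), built so that its third column is a combination of the first two:

```python
from sympy import Matrix

# M[:,2] = M[:,0] + M[:,1], so the third column is not pivotal.
M = Matrix([[1, 2, 3, 1],
            [2, 4, 6, 3],
            [1, 3, 4, 2]])

R, pivots = M.rref()   # RREF of M and the pivotal column indices (0-based)
print(pivots)          # (0, 1, 3): columns 1, 2 and 4 of M are linearly independent
```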

#### 3.2.C Linear Independence and Uniqueness of Linear Combination

We end this section with a result that states that linear combination with respect to linearly independent set is unique.

Lemma 3.3.2.11. Let S be a linearly independent set in a vector space V over F. Then each v ∈ LS(S) is a unique linear combination of vectors from S.

Proof. On the contrary, suppose there exists v ∈ LS(S) such that v = α1v1 + ⋯ + αpvp and v = β1v1 + ⋯ + βpvp, for αi, βi ∈ F and vi ∈ S, for 1 ≤ i ≤ p. Equating the two expressions for v gives
(α1 - β1)v1 + ⋯ + (αp - βp)vp = 0 (3.3.2.3)

As {v1,…,vp} ⊆ S is a linearly independent subset in LS(S), the system c1v1 + ⋯ + cpvp = 0 in the unknowns c1,…,cp has only the trivial solution. Thus, each of the scalars αi - βi appearing in Equation (3.3.2.3) must equal 0. That is, αi - βi = 0, for 1 ≤ i ≤ p. Thus, for 1 ≤ i ≤ p, αi = βi and the result follows. _

Exercise 3.3.2.12.

1. Consider the Euclidean plane ℝ². Let u1 = (1,0)T . Determine a condition on u2 such that {u1,u2} is a linearly independent subset of ℝ².
2. Let S = {(1,1,1,1)T ,(1,-1,1,2)T ,(1,1,-1,1)T } ⊆ ℝ⁴. Does (1,1,2,1)T ∈ LS(S)? Furthermore, determine conditions on x, y, z and u such that (x,y,z,u)T ∈ LS(S).
3. Show that S = {(1,2,3)T ,(-2,1,1)T ,(8,6,10)T } ⊆ ℝ³ is linearly dependent.
4. Prove that {u1,…,un} ⊆ ℝⁿ is linearly independent if and only if {Au1,…,Aun} is linearly independent for every invertible matrix A.
5. Let V be a complex vector space and let A ∈ Mn(ℂ) be invertible. Then {u1,…,un} ⊆ V is linearly independent if and only if {w1,…,wn} ⊆ V is linearly independent, where wi = ∑_{j=1}^{n} aijuj, for 1 ≤ i ≤ n.
6. Find u, v, w ∈ ℝ⁴ such that {u,v,w} is linearly dependent whereas {u,v}, {u,w} and {v,w} are linearly independent.
7. Is {(1,0)T ,(i,0)T } a linearly independent subset of the real vector space ℂ²?
8. Suppose V is a collection of vectors such that V is a real as well as a complex vector space. Then prove that {u1,…,uk, iu1,…,iuk} is a linearly independent subset of V over ℝ if and only if {u1,…,uk} is a linearly independent subset of V over ℂ.
9. Let V be a vector space and M a subspace of V. For u, v ∈ V \ M, define K = LS(M,u) and H = LS(M,v). Then prove that v ∈ K if and only if u ∈ H.
10. Let A ∈ Mn(ℂ). Suppose x, y ∈ ℂⁿ \ {0} such that Ax = 3x and Ay = 2y. Then prove that x and y are linearly independent.
11. Let A = . Determine x, y, z ∈ ℝ³ \ {0} such that Ax = 6x, Ay = 2y and Az = -2z. Use the vectors x, y and z obtained above to prove the following.
1. A²v = 4v, where v = cy + dz for any c, d ∈ ℝ.
2. The set {x,y,z} is linearly independent.
3. Let P = [x,y,z] be the 3 × 3 matrix with columns x, y and z. Then P is invertible.
4. Let D = . Then AP = PD.
12. Prove that the rows/columns of
1. A ∈ Mn(ℂ) are linearly independent if and only if det(A) ≠ 0.
2. A ∈ Mn(ℂ) span ℂⁿ if and only if A is an invertible matrix.
3. a skew-symmetric matrix A of odd order are linearly dependent.
13. Let P and Q be subspaces of ℝⁿ such that P + Q = ℝⁿ and P ∩ Q = {0}. Prove that each u ∈ ℝⁿ is uniquely expressible as u = uP + uQ, where uP ∈ P and uQ ∈ Q.

### 3.3 Basis of a Vector Space

Definition 3.3.3.1. Let S be a subset of a set T. Then S is said to be a maximal subset of T having property P if

1. S has property P and
2. no proper superset S′ of S in T has property P.

Example 3.3.3.2. Let T = {2,3,4,7,8,10,12,13,14,15}. Then a maximal subset of T of consecutive integers is S = {2,3,4}. Other maximal subsets are {7,8},{10} and {12,13,14,15}. Note that {12,13} is not maximal. Why?

Definition 3.3.3.3. Let V be a vector space over F. Then S is called a maximal linearly independent subset of V if

1. S is linearly independent and
2. no proper superset S′ of S in V is linearly independent.

Example 3.3.3.4.

1. In ℝ³, the set S = {e1,e2} is linearly independent but not maximal as S ∪ {(1,1,1)T } is a linearly independent set containing S.
2. In ℝ³, S = {(1,0,0)T ,(1,1,0)T ,(1,1,-1)T } is a maximal linearly independent set as any collection of 4 or more vectors from ℝ³ is linearly dependent (see Corollary 3.3.2.5).
3. Let S = {v1,…,vk} ⊆ ℂⁿ. Now, form the matrix A = [v1,…,vk] and let B = RREF(A). Then, using Theorem 3.3.2.9, we see that if B[:,i1],…,B[:,ir] are the pivotal columns of B then {vi1,…,vir} is a maximal linearly independent subset of S.

Theorem 3.3.3.5. Let V be a vector space over F and S a linearly independent set in V. Then S is maximal linearly independent if and only if LS(S) = V.

Proof. Let v ∈ V. As S is linearly independent, using Corollary 3.3.2.7.2, the set S ∪ {v} is linearly independent if and only if v ∈ V \ LS(S). Thus, S is maximal if and only if V \ LS(S) = ∅, that is, if and only if LS(S) = V. _

Let V = LS(S) for some set S with |S| = k. Then, using Theorem 3.3.2.4, we see that if T ⊆ V is linearly independent then |T| ≤ k. Hence, a maximal linearly independent subset of V can have at most k vectors. Thus, we arrive at the following important result.

Theorem 3.3.3.6. Let V be a vector space over F and let S and T be two finite maximal linearly independent subsets of V. Then |S| = |T|.

Proof. By Theorem 3.3.3.5, S and T are maximal linearly independent if and only if LS(S) = V = LS(T). Now, use the previous paragraph to get the required result. _

Definition 3.3.3.7. Let V be a vector space over F with V ≠ {0}. Suppose V has a finite maximal linearly independent set S. Then |S| is called the dimension of V, denoted dim(V). By convention, dim({0}) = 0.

Example 3.3.3.8.

1. As {π} is a maximal linearly independent subset of ℝ, dim(ℝ) = 1.
2. As {(1,0,1)T ,(0,1,1)T ,(1,1,0)T } ⊆ ℝ³ is maximal linearly independent, dim(ℝ³) = 3.
3. As {e1,…,en} is a maximal linearly independent set in ℝⁿ, dim(ℝⁿ) = n.
4. As {e1,…,en} is a maximal linearly independent subset of the complex vector space ℂⁿ, dim(ℂⁿ) = n.
5. Using Exercise 3.3.2.12.8, {e1,…,en, ie1,…,ien} is a maximal linearly independent subset of the real vector space ℂⁿ. Thus, as a real vector space, dim(ℂⁿ) = 2n.
6. Let S = {v1,…,vk} ⊆ ℂⁿ. Define A = [v1,…,vk]. Then, using Example 3.3.3.4.3, we see that dim(LS(S)) = Rank(A).
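Item 6 is easy to test numerically: stack the vectors of S as the columns of A and compute the rank. A minimal sketch with vectors of our own choosing:

```python
import numpy as np

# S = {v1, v2, v3} with v3 = v1 + v2, so dim(LS(S)) = 2.
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = v1 + v2

A = np.column_stack([v1, v2, v3])  # A = [v1, v2, v3]
print(np.linalg.matrix_rank(A))    # 2 = dim(LS(S)) = Rank(A)
```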

Definition 3.3.3.9. [Basis of a Vector Space] Let V be a vector space over F with V ≠ {0}. Then a maximal linearly independent subset of V is called a basis of V. The vectors in a basis are called basis vectors. Note that a basis of {0} is either not defined or is taken to be the empty set.

Definition 3.3.3.10. Let V be a vector space over F with V ≠ {0}. Then a set S ⊆ V is called a minimal spanning set if LS(S) = V and no proper subset of S spans V.

Example 3.3.3.11. [Standard Basis] Fix n ∈ ℕ and let ei = In[:,i], the i-th column of the identity matrix. Then ℬ = {e1,…,en} is called the standard basis of ℝⁿ or ℂⁿ. In particular,

1. ℬ = {e1} = {1} is a standard basis of ℝ over ℝ.
2. ℬ = {e1,e2} with e1 = (1,0)T and e2 = (0,1)T is the standard basis of ℝ².
3. ℬ = {e1,e2,e3} = {(1,0,0)T ,(0,1,0)T ,(0,0,1)T } is the standard basis of ℝ³.
4. {1} is a basis of ℂ over ℂ.
5. {e1,…,en, ie1,…,ien} is a basis of ℂⁿ over ℝ. So, {1,i} is a basis of ℂ over ℝ.

Example 3.3.3.12.

1. Note that {-2} is a basis and a minimal spanning set of ℝ.
2. Let u1, u2, u3 ∈ ℝ². Then, {u1,u2,u3} can neither be a basis nor a minimal spanning subset of ℝ².
3. {(1,1,-1)T ,(1,-1,1)T ,(-1,1,1)T } is a basis and a minimal spanning subset of ℝ³.
4. Let V = {(x,y,0)T |x,y ∈ ℝ} ⊆ ℝ³. Then ℬ = {(1,0,0)T ,(1,3,0)T } is a basis of V.
5. Let V = {(x,y,z)T ∈ ℝ³|x + y - z = 0} ⊆ ℝ³. Each element (x,y,z)T ∈ V satisfies x + y - z = 0, or equivalently z = x + y. So we see that
(x,y,z)T = (x,y,x + y)T = x(1,0,1)T + y(0,1,1)T .
Hence, {(1,0,1)T ,(0,1,1)T } forms a basis of V.
6. Let S = {1,2,…,n} and consider the vector space ℝ^S (see Example 3.3.1.4.8). Then, for 1 ≤ i ≤ n, define ei(j) = 1 if j = i and ei(j) = 0 if j ≠ i. Prove that ℬ = {e1,…,en} is a linearly independent set. Is it a basis of ℝ^S?
7. Let S = ℝⁿ and consider the vector space ℝ^S (see Example 3.3.1.4.8). For 1 ≤ i ≤ n, define the functions ei(x) = ei(x1,…,xn) = xi. Then, verify that {e1,…,en} is a linearly independent set. Is it a basis of ℝ^S?
8. Let S = {v1,…,vk} ⊆ ℂⁿ. Define A = [v1,…,vk]. Then, using Theorem 3.3.2.9, the columns of A corresponding to the pivotal columns in RREF(A) form a basis as well as a minimal spanning subset of LS(S).
9. Let S = {a1,…,an}. Then recall that ℝ^S is a real vector space (see Example 8). For 1 ≤ i ≤ n, define fi(aj) = 1 if j = i and fi(aj) = 0 otherwise. Then, verify that {f1,…,fn} is a basis of ℝ^S. What can you say if S is a countable set?
10. Recall the vector space [a,b], where a < b . For each α [a,b] , define fα(x) = x - α, for all x [a,b]. Is the set {fα|α [a,b]} linearly independent? What if α [a,b]? Can we write any function in [a,b] as a finite linear combination? Give reasons for your answer.

#### 3.3.A Main Results associated with Bases

Theorem 3.3.3.13. Let V ≠ {0} be a vector space over F. Then the following statements are equivalent.

1. ℬ is a basis (maximal linearly independent subset) of V.
2. ℬ is linearly independent and it spans V.
3. ℬ is a minimal spanning set of V.

Proof. 1 ⇒ 2   By definition, every basis is a maximal linearly independent subset of V. Thus, using Corollary 3.3.2.7.2, we see that ℬ spans V.

2 ⇒ 3   Since ℬ is linearly independent, for any x ∈ ℬ, x ∉ LS(ℬ \ {x}). Hence LS(ℬ \ {x}) ≠ V. That is, no proper subset of ℬ spans V, and so ℬ is a minimal spanning set of V.

3 ⇒ 1   If ℬ is linearly dependent then, using Corollary 3.3.2.7.1, ℬ is not a minimal spanning set. A contradiction. Hence, ℬ is linearly independent.

We now need to show that ℬ is a maximal linearly independent set. Since ℬ spans V, for any x ∈ V \ ℬ, the set ℬ ∪ {x} is linearly dependent. That is, every proper superset of ℬ is linearly dependent. Hence, the required result follows. _

Now, using Lemma 3.3.2.11, we get the following result.

Remark 3.3.3.14. Let ℬ = {u1,…,un} be a basis of a vector space V over F. Then, for each v ∈ V, there exist unique αi ∈ F, for 1 ≤ i ≤ n, such that v = ∑_{i=1}^{n} αiui.

The next result is generally known as “every linearly independent set can be extended to form a basis in a finite dimensional vector space”.

Theorem 3.3.3.15. Let V be a vector space over F with dim(V) = n. If S is a linearly independent subset of V then there exists a basis T of V such that S ⊆ T.

Proof. If LS(S) = V, we are done. Else, choose u1 ∈ V \ LS(S). Then, by Corollary 3.3.2.7.2, the set S ∪ {u1} is linearly independent. We repeat this process until we obtain a set T with n linearly independent vectors, as dim(V) = n. By Theorem 3.3.3.13, this T is indeed a required basis. _

#### 3.3.B Constructing a Basis of a Finite Dimensional Vector Space

We end this section with an algorithm which is based on the proof of the previous theorem.

Step 1: Let v1 ∈ V with v1 ≠ 0. Then {v1} is linearly independent.
Step 2: If V = LS(v1), we have got a basis of V. Else, pick v2 ∈ V \ LS(v1). Then, by Corollary 3.3.2.7.2, {v1,v2} is linearly independent.
Step i: Either V = LS(v1,…,vi) or LS(v1,…,vi) ⊊ V. In the first case, {v1,…,vi} is a basis of V. Else, pick vi+1 ∈ V \ LS(v1,…,vi). Then, by Corollary 3.3.2.7.2, the set {v1,…,vi+1} is linearly independent.

This process will finally end as V is a finite dimensional vector space.
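For subspaces of ℝⁿ given by a list of spanning vectors, the three steps above become a greedy loop: keep a vector exactly when it lies outside the span of the vectors kept so far, which can be tested by a rank computation. This is our own illustrative sketch, not code from the notes:

```python
import numpy as np

def build_basis(vectors, tol=1e-10):
    """Greedy basis construction following Steps 1, 2 and i."""
    basis = []
    for v in vectors:
        candidate = basis + [v]
        # v lies outside LS(basis) exactly when adding it increases the rank.
        if np.linalg.matrix_rank(np.array(candidate), tol=tol) == len(candidate):
            basis.append(v)
    return basis

vecs = [np.array([1.0, 1.0, 0.0]),
        np.array([2.0, 2.0, 0.0]),   # lies in LS of the first vector, skipped
        np.array([0.0, 1.0, 1.0])]
print(len(build_basis(vecs)))        # 2
```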

Exercise 3.3.3.16.

1. Let ℬ = {u1,…,un} be a basis of a vector space V over F. Then, does the condition ∑_{i=1}^{n} αiui = 0 in the αi’s imply that αi = 0, for 1 ≤ i ≤ n?
2. Find a basis of ℝ³ containing the vector (1,1,-2)T .
3. Find a basis of ℝ³ containing the vectors (1,1,-2)T and (1,2,-1)T .
4. Is it possible to find a basis of ℝ⁴ containing the vectors (1,1,1,-2)T , (1,2,-1,1)T and (1,-2,7,-11)T ?
5. Let S = {v1,…,vp} be a subset of a vector space V over F. Suppose LS(S) = V but S is not a linearly independent set. Then does this imply that each v ∈ V is expressible in more than one way as a linear combination of vectors from S?
6. Show that ℬ = {(1,0,1)T ,(1,i,0)T ,(1,1,1 - i)T } is a basis of ℂ³ over ℂ.
7. Find a basis of ℂ³ over ℝ containing the basis ℬ given in Exercise 3.3.3.16.6.
8. Determine a basis and the dimension of W = {(x,y,z,w)T ∈ ℝ⁴|x + y - z + w = 0}.
9. Find a basis of V = {(x,y,z,u)T ∈ ℝ⁴|x - y - z = 0, x + z - u = 0}.
10. Let A = . Find a basis of V = {x ∈ ℝ⁵|Ax = 0}.
11. Prove that ℬ = {1, x, …, xⁿ, …} is a basis of ℝ[x]. ℬ is called the standard basis of ℝ[x].
12. Let uT = (1,1,-2), vT = (-1,2,3) and wT = (1,10,1). Find a basis of LS(u,v,w). Determine a geometrical representation of LS(u,v,w).
13. Let V be a vector space of dimension n. Then prove that any set
1. of n linearly independent vectors in V forms a basis of V.
2. S ⊆ V having n vectors with LS(S) = V forms a basis of V.
14. Let {v1,…,vn} be a basis of ℝⁿ. Then prove that the matrix B = [v1,…,vn], as well as the matrix whose rows are v1T ,…,vnT , is invertible.
15. Let W1 and W2 be two subspaces of a vector space V such that W1 ⊆ W2. Show that W1 = W2 if and only if dim(W1) = dim(W2).
16. Consider the vector space C([-π,π]) over ℝ. For each n ∈ ℕ, define en(x) = sin(nx). Then prove that S = {en|n ∈ ℕ} is linearly independent. [Hint: One needs to show that every finite subset of S is linearly independent. So, on the contrary, assume that there exist ℓ ∈ ℕ and functions ek1,…,ekℓ such that α1ek1 + ⋯ + αℓekℓ = 0, for some αt ≠ 0 with 1 ≤ t ≤ ℓ. The above system is equivalent to α1 sin(k1x) + ⋯ + αℓ sin(kℓx) = 0 for all x ∈ [-π,π]. Now, in the integral

∫_{-π}^{π} sin(mx)(α1 sin(k1x) + ⋯ + αℓ sin(kℓx)) dx = 0,

replace m with the ki’s and use the orthogonality of the functions sin(kx) to show that αi = 0, for all i, 1 ≤ i ≤ ℓ, to get the required contradiction.]
17. Let V be a vector space over F with dim(V) = n. If W1 is a k-dimensional subspace of V then prove that there exists a subspace W2 of V such that W1 ∩ W2 = {0}, W1 + W2 = V and dim(W2) = n - k. Also, prove that for each v ∈ V there exist unique vectors w1 ∈ W1 and w2 ∈ W2 such that v = w1 + w2. The subspace W2 is called the complementary subspace of W1 in V.
18. Is the set W = {p(x) ∈ ℝ[x;4]|p(-1) = p(1) = 0} a subspace of ℝ[x;4]? If yes, find its dimension.

### 3.4 Application to the subspaces of ℂn

In this subsection, we will study results that are intrinsic to the understanding of linear algebra from the point of view of matrices, especially the fundamental subspaces (see Definition 3.3.1.25) associated with matrices. We start with an example.

Example 3.3.4.1.

1. Compute the fundamental subspaces for A = .
Solution: Verify the following
1. Col(A*) = Row(A) = {(x,y,z,u)T ∈ ℝ⁴|3x - 2y = z, 5x - 3y + u = 0}.
2. Col(A) = Row(A*) = {(x,y,z)T ∈ ℝ³|4x - 3y - z = 0}.
3. Null(A) = {(x,y,z,u)T ∈ ℝ⁴|x + 3z - 5u = 0, y - 2z + 3u = 0}.
4. Null(A*) = {(x,y,z)T ∈ ℝ³|x + 4z = 0, y - 3z = 0}.
2. Let A = . Find a basis and the dimension of Null(A).
Solution: Writing the basic variables x1, x3 and x6 in terms of the free variables x2, x4, x5 and x7, we get x1 = x7 - x2 - x4 - x5, x3 = 2x7 - 2x4 - 3x5 and x6 = -x7. Hence, the solution set has the form
 (3.3.4.1)


Now, let u1T = (-1,1,0,0,0,0,0), u2T = (-1,0,-2,1,0,0,0), u3T = (-1,0,-3,0,1,0,0) and u4T = (1,0,2,0,0,-1,1). Then S = {u1,u2,u3,u4} is a basis of Null(A). The reasons for S to be a basis are as follows:

1. By Equation (3.3.4.1) Null(A) = LS(S).
2. For linear independence, the homogeneous system c1u1 + c2u2 + c3u3 + c4u4 = 0 in the unknowns c1, c2, c3 and c4 has only the trivial solution as
1. u4 is the only vector with a non-zero entry at the 7-th place (u4 corresponds to x7) and hence c4 = 0.
2. u3 is the only vector with a non-zero entry at the 5-th place (u3 corresponds to x5) and hence c3 = 0.
3. Similar arguments hold for the unknowns c2 and c1.
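The computation can be cross-checked with SymPy’s `nullspace`, which produces one basis vector per free variable. Since the matrix A of this example is not reproduced above, the matrix used below is our own: its rows encode exactly the stated relations between the basic variables x1, x3, x6 and the free variables x2, x4, x5, x7:

```python
from sympy import Matrix

# Rows encode x1 = x7 - x2 - x4 - x5, x3 = 2x7 - 2x4 - 3x5, x6 = -x7.
A = Matrix([[1, 1, 0, 1, 1, 0, -1],
            [0, 0, 1, 2, 3, 0, -2],
            [0, 0, 0, 0, 0, 1,  1]])

N = A.nullspace()   # one basis vector for each free variable x2, x4, x5, x7
print(len(N))       # 4 = dim(Null(A))
```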

Exercise 3.3.4.2. Let A = and B = .

1. Find RREF(A) and RREF(B).
2. Find invertible matrices P1 and P2 such that P1A = RREF(A) and P2B = RREF(B).
3. Find bases for Col(A), Col(A*), Col(B) and Col(B*).
4. Find bases of Null(A), Null(A*), Null(B) and Null(B*).
5. Find the dimensions of all the vector subspaces so obtained.
6. Does there exist a relationship between the bases of the different spaces so obtained?

Lemma 3.3.4.3. Let A ∈ Mm×n(ℂ) and let E be an elementary matrix. If

1. B = EA then Col(A*) = Col(B*) and Col(AT ) = Col(BT ). Hence dim(Col(A*)) = dim(Col(B*)) and dim(Col(AT )) = dim(Col(BT )).
2. B = AE then Col(A) = Col(B) and Col(Ā) = Col(B̄). Hence dim(Col(A)) = dim(Col(B)) and dim(Col(Ā)) = dim(Col(B̄)).

Proof. Note that B = EA if and only if B̄ = ĒĀ. As E is invertible, A and B are equivalent and hence they have the same RREF. Also, Ē is invertible as well and hence Ā and B̄ have the same RREF. Now, use Theorem 3.3.2.9 to get the required result.

For the second part, note that B* = E*A* and E* is invertible. Hence, using the first part, Col((A*)*) = Col((B*)*), or equivalently, Col(A) = Col(B). _

Let A ∈ Mm×n(ℂ) and let B = RREF(A). Then, as an immediate application of Lemma 3.3.4.3, we get dim(Col(A*)) = Row rank(A). Hence, dim(Col(A)) = Column rank(A) as well. We now prove that Row rank(A) = Column rank(A).

Theorem 3.3.4.4. Let A ∈ Mm×n(ℂ). Then Row rank(A) = Column rank(A).

Proof. Let Row rank(A) = r = dim(Col(AT )). Then there exist indices i1,…,ir such that {A[i1,:],…,A[ir,:]} forms a basis of Col(AT ). Then, the r × n matrix B with rows A[i1,:],…,A[ir,:] has its rows forming a basis of Col(AT ). Therefore, there exist αij ∈ ℂ, 1 ≤ i ≤ m, 1 ≤ j ≤ r, such that A[t,:] = [αt1,…,αtr]B, for 1 ≤ t ≤ m. So, using matrix multiplication (see Remark 1.1.2.11.4),

A = CB,

where C = [αij] is an m × r matrix. Thus, matrix multiplication implies that each column of A is a linear combination of the r columns of C. Hence, Column rank(A) = dim(Col(A)) ≤ r = Row rank(A). A similar argument gives Row rank(A) ≤ Column rank(A). Hence, we have the required result. _

Remark 3.3.4.5. The proof also shows that for every A ∈ Mm×n(ℂ) of rank r there exist matrices B of size r × n and C of size m × r, each of rank r, such that A = CB.
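Remark 3.3.4.5 can be made concrete with SymPy: take B to be the r nonzero rows of RREF(A) and C the pivotal columns of A. The matrix A below is our own small example:

```python
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [2, 4, 6],
            [1, 0, 1]])   # rank 2: the second row is twice the first

R, pivots = A.rref()
r = len(pivots)           # r = Rank(A)
B = R[:r, :]              # r x n: the nonzero rows of RREF(A)
C = A[:, list(pivots)]    # m x r: the pivotal columns of A

assert A == C * B         # the rank factorization A = CB
```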

Let W1 and W2 be two subspaces of a vector space V over F. Then recall that (see Exercise 3.3.1.24.7d) W1 + W2 = {u + v|u ∈ W1, v ∈ W2} = LS(W1 ∪ W2) is the smallest subspace of V containing both W1 and W2. We now state a result similar to the counting result for finite sets that states |A| + |B| = |A ∪ B| + |A ∩ B|, whenever the sets A and B are finite (for a proof, see Appendix 7.7.4.1).

Theorem 3.3.4.6. Let V be a finite dimensional vector space over F. If W1 and W2 are two subspaces of V then
 dim(W1) + dim(W2) = dim(W1 + W2) + dim(W1 ∩ W2) (3.3.4.2)

For better understanding, we give an example for finite subsets of ℝⁿ. The example uses Theorem 3.3.2.9 to obtain bases of LS(S), for different choices of S. The readers are advised to see Example 3.3.2.10 before proceeding further.

Example 3.3.4.7. Let V and W be two subspaces of ℝ⁵ with V = {(v,w,x,y,z)T ∈ ℝ⁵|v + x + z = 3y} and W = {(v,w,x,y,z)T ∈ ℝ⁵|w - x = z, v = y}. Find bases of V and W containing a basis of V ∩ W.
Solution: (v,w,x,y,z)T ∈ V ∩ W if v, w, x, y and z satisfy v + x - 3y + z = 0, w - x - z = 0 and v = y. The solution of the system is given by

Thus, ℬ = {(1,2,0,1,2)T ,(0,0,1,0,-1)T } is a basis of V ∩ W. Similarly, a basis of V is given by 𝒞 = {(-1,0,1,0,0)T ,(0,1,0,0,0)T ,(3,0,0,1,0)T ,(-1,0,0,0,1)T } and that of W is given by 𝒟 = {(1,0,0,1,0)T ,(0,1,1,0,0)T ,(0,1,0,0,1)T }. To find the required bases, form a matrix whose rows are the vectors in ℬ, 𝒞 and 𝒟 (see Equation (3.3.4.3)) and apply row operations other than Eij. Then, after a few row operations, we get
 (3.3.4.3)

Thus, a required basis of V is {(1,2,0,1,2)T ,(0,0,1,0,-1)T ,(0,1,0,0,0)T ,(0,0,0,1,3)T }. Similarly, a required basis of W is {(1,2,0,1,2)T ,(0,0,1,0,-1)T ,(0,1,0,0,1)T }.
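With the bases found above, Equation (3.3.4.2) can be verified numerically: dim(V) + dim(W) = 4 + 3, while dim(V + W) + dim(V ∩ W) = 5 + 2. A sketch (the rows of the arrays below are the basis vectors listed in the example):

```python
import numpy as np

V = np.array([[-1, 0, 1, 0, 0],    # basis of V from the example
              [ 0, 1, 0, 0, 0],
              [ 3, 0, 0, 1, 0],
              [-1, 0, 0, 0, 1]])
W = np.array([[1, 0, 0, 1, 0],     # basis of W from the example
              [0, 1, 1, 0, 0],
              [0, 1, 0, 0, 1]])

dim_V = np.linalg.matrix_rank(V)                     # 4
dim_W = np.linalg.matrix_rank(W)                     # 3
dim_sum = np.linalg.matrix_rank(np.vstack([V, W]))   # dim(V + W) = 5
print(dim_V + dim_W - dim_sum)                       # 2 = dim(V ∩ W)
```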

Exercise 3.3.4.8.

1. Give an example to show that if A and B are equivalent then Col(A) need not equal Col(B).
2. Let V = {(x,y,z,w)T ∈ ℝ⁴|x + y - z + w = 0, x + y + z + w = 0, x + 2y = 0} and W = {(x,y,z,w)T ∈ ℝ⁴|x - y - z + w = 0, x + 2y - w = 0} be two subspaces of ℝ⁴. Find bases and dimensions of V, W, V ∩ W and V + W.
3. Let W1 and W2 be 4-dimensional subspaces of a vector space V of dimension 7. Then prove that dim(W1 ∩ W2) ≥ 1.
4. Let W1 and W2 be two subspaces of a vector space V. If dim(W1) + dim(W2) > dim(V), then prove that dim(W1 ∩ W2) ≥ 1.
5. Let A ∈ Mm×n(ℂ) with m < n. Prove that the columns of A are linearly dependent.

We now prove the rank-nullity theorem and give some of its consequences.

Theorem 3.3.4.9 (Rank-Nullity Theorem). Let A ∈ Mm×n(ℂ). Then
 dim(Col(A)) + dim(Null(A)) = n. (3.3.4.4)

Proof. Let dim(Null(A)) = r ≤ n and let ℬ = {u1,…,ur} be a basis of Null(A). Since ℬ is a linearly independent set in ℂⁿ, extend it to get a basis 𝒞 = {u1,…,un} of ℂⁿ. Then,

Col(A) = LS(Au1,…,Aun) = LS(Aur+1,…,Aun), as Aui = 0 for 1 ≤ i ≤ r.

So, 𝒟 = {Aur+1,…,Aun} spans Col(A). We further need to show that 𝒟 is linearly independent. So, consider the linear system
α1Aur+1 + ⋯ + αn-rAun = 0 (3.3.4.5)

in the unknowns α1,…,αn-r. Thus, α1ur+1 + ⋯ + αn-run ∈ Null(A) = LS(ℬ). Therefore, there exist scalars βi, 1 ≤ i ≤ r, such that ∑_{i=1}^{n-r} αiur+i = ∑_{j=1}^{r} βjuj. Or equivalently,
β1u1 + ⋯ + βrur - α1ur+1 - ⋯ - αn-run = 0 (3.3.4.6)

As 𝒞 is a linearly independent set, the only solution of Equation (3.3.4.6) is

αi = 0, for 1 ≤ i ≤ n - r, and βj = 0, for 1 ≤ j ≤ r.

In other words, we have shown that the only solution of Equation (3.3.4.5) is the trivial solution. Hence, {Aur+1,…,Aun} is a basis of Col(A). Thus, the required result follows. _

Theorem 3.3.4.9 is part of what is known as the fundamental theorem of linear algebra (see Theorem 5.5.1.23). The following are some of the consequences of the rank-nullity theorem. The proof is left as an exercise for the reader.
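Before attempting the exercises, Equation (3.3.4.4) itself can be sanity-checked in exact arithmetic on a random integer matrix (our own illustration):

```python
import numpy as np
from sympy import Matrix

rng = np.random.default_rng(0)
A = Matrix(rng.integers(-3, 4, size=(4, 6)).tolist())  # a random 4 x 6 matrix

rank = A.rank()                  # dim(Col(A))
nullity = len(A.nullspace())     # dim(Null(A))
assert rank + nullity == A.cols  # Rank-Nullity: the two dimensions sum to n = 6
print(rank, nullity)
```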

Exercise 3.3.4.10.

1. Let A ∈ Mm,n(ℂ). If
1. n > m then the system Ax = 0 has infinitely many solutions,
2. n < m then there exists b ∈ ℂᵐ \ {0} such that Ax = b is inconsistent.
2. The following statements are equivalent for an m × n matrix A.
1. Rank(A) = k.
2. There exists a set of k rows of A that are linearly independent.
3. There exists a set of k columns of A that are linearly independent.
4. dim(Col(A)) = k.
5. There exists a k × k submatrix B of A with det(B) ≠ 0. Further, the determinant of every (k + 1) × (k + 1) submatrix of A is zero.
6. There exists a linearly independent subset {b1,…,bk} of ℂᵐ such that the system Ax = bi, for 1 ≤ i ≤ k, is consistent.
7. dim(Null(A)) = n - k.

### 3.5 Ordered Bases


Let V be a vector subspace of ℝⁿ for some n ∈ ℕ with dim(V) = k. Then, a basis of V may not look like a standard basis. Our problem may force us to look for some other basis. In such a case, it is always helpful to fix the vectors in a particular order and then concentrate only on the coefficients of the vectors, as was done for the system of linear equations where we did not worry about the unknowns. It may also happen that k is very small compared to n, and hence it is always better to work with vectors of small size.

Definition 3.3.5.1. [Ordered Basis, Basis Matrix] Let V be a vector space over F with a basis ℬ = {u1,…,un}. Then an ordered basis for V is a basis together with a one-to-one correspondence between ℬ and {1,2,…,n}. Since there is an order among the elements of ℬ, we write ℬ = [u1,…,un]. The matrix B = [u1,…,un] is called the basis matrix.

Thus, ℬ = [u1,u2,…,un] is different from 𝒞 = [u2,u3,…,un,u1] and both of them are different from 𝒟 = [un,un-1,…,u2,u1], even though they have the same set of vectors as elements. We now define the notion of coordinates of a vector with respect to an ordered basis.

Definition 3.3.5.2. [Coordinates of a Vector] Let B = [v1,…,vn] be the basis matrix corresponding to an ordered basis ℬ of V. Since ℬ is a basis of V, for each v ∈ V, there exist βi, 1 ≤ i ≤ n, such that v = ∑_{i=1}^{n} βivi = Bβ, where β = [β1,…,βn]T . The column vector β is called the coordinates of v with respect to ℬ, denoted [v]ℬ. Thus, using this notation, v = B[v]ℬ.

Example 3.3.5.3.

1. Let ℬ = be an ordered basis of ℝ² with basis matrix B. Then, for any v ∈ ℝ², v = B[v]ℬ. Thus, [v]ℬ = B⁻¹v.
2. Consider the vector space ℝ[x;2] with basis {1 - x, 1 + x, x²}. Then an ordered basis can either be ℬ = [1 - x, 1 + x, x²] or 𝒞 = [1 + x, 1 - x, x²] or .... Note that there are 3! different ordered bases. Also, for a0 + a1x + a2x² ∈ ℝ[x;2], one has
Thus, [a0 + a1x + a2x²]ℬ = , whereas [a0 + a1x + a2x²]𝒞 = .
3. Let V = {(x,y,z)T |x + y = z}. If ℬ = [(-1,1,0)T ,(1,0,1)T ] is an ordered basis of V then = and hence = .
4. Let V = {(v,w,x,y,z)T ∈ ℝ⁵|w - x = z, v = y, v + x + z = 3y}. So, if ℬ = [(1,2,0,1,2)T ,(0,0,1,0,-1)T ] is an ordered basis then [(3,6,0,3,6)T ]ℬ = (3,0)T .
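Computing the coordinate vector [v]ℬ amounts to solving Bβ = v, where B is the basis matrix. A sketch using the ordered basis of item 4; the vector v below is our own choice, taken to be 3u1 + 2u2 so that the answer is known in advance:

```python
import numpy as np

# Columns of B are the ordered basis vectors (1,2,0,1,2)^T and (0,0,1,0,-1)^T.
B = np.array([[1.0,  0.0],
              [2.0,  0.0],
              [0.0,  1.0],
              [1.0,  0.0],
              [2.0, -1.0]])
v = 3 * B[:, 0] + 2 * B[:, 1]      # v = (3, 6, 2, 3, 4)^T

# B is 5 x 2, so solve the (consistent) system in the least-squares sense.
beta, *_ = np.linalg.lstsq(B, v, rcond=None)
print(np.round(beta, 6))           # the coordinates [v]_B = (3, 2)^T
```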

Let V be a vector space over F with dim(V) = n. Let A = [v1,…,vn] and B = [u1,…,un] be basis matrices corresponding to the ordered bases ℬ and 𝒞, respectively. So, using the notation in Definition 3.3.5.2, we have

vi = B[vi]𝒞, for 1 ≤ i ≤ n, and hence A = B[[v1]𝒞,…,[vn]𝒞].

So, the matrix [A]𝒞 = [[v1]𝒞,…,[vn]𝒞], denoted [ℬ]𝒞, is called the matrix of ℬ with respect to the ordered basis 𝒞, or the change of basis matrix from ℬ to 𝒞. We now summarize the above discussion.

Theorem 3.3.5.4. Let V be a vector space over F with dim(V) = n. Let ℬ = [v1,…,vn] and 𝒞 = [u1,…,un] be two ordered bases of V.

1. Then, the matrix [ℬ]𝒞 is invertible and [w]𝒞 = [ℬ]𝒞 [w]ℬ, for all w ∈ V.
2. Similarly, verify that [𝒞]ℬ is invertible and [w]ℬ = [𝒞]ℬ [w]𝒞, for all w ∈ V.
3. Furthermore, ([ℬ]𝒞)⁻¹ = [𝒞]ℬ.

Proof. We prove all the parts together. Let A = [v1,…,vn], B = [u1,…,un], C = [ℬ]𝒞 and D = [𝒞]ℬ. Then, by the previous paragraph, A = BC. Similarly, B = AD.

But, by Exercise 3.3.3.16.14, A and B are invertible and thus C = B⁻¹A and D = A⁻¹B are invertible as well. Clearly, C⁻¹ = (B⁻¹A)⁻¹ = A⁻¹B = D, which proves the third part. For the first two parts, note that for any w ∈ V, w = A[w]ℬ and w = B[w]𝒞. Hence,

B[w]𝒞 = w = A[w]ℬ = BC[w]ℬ,

and thus [w]𝒞 = [ℬ]𝒞 [w]ℬ. Similarly, [w]ℬ = [𝒞]ℬ [w]𝒞.