Definition 1.1.1.1. [Matrix] A rectangular array of numbers is called a matrix.
The horizontal arrays of a matrix are called its rows and the vertical arrays are called its
columns. Let A be a matrix having m rows and n columns. Then, A is said to have order m × n or is called a matrix of size m × n and can be represented in either of the following
forms:
where a_{ij} is the entry at the intersection of the i^{th} row and j^{th} column. To be concise, one writes
A_{m×n} = [a_{ij}] or, in short, A = [a_{ij}]. We will also use A[i,:] to denote the i-th row of A, A[:,j] to denote
the j-th column of A. We shall mostly be concerned with matrices having complex numbers as
entries.
For example, if A = then A[1,:] = [1 3 + i 7], A[:,3] = and
a_{22} = 5. In general, commas are inserted in a row vector to differentiate between entries. Thus,
A[1,:] = [1,3 + i,7]. A matrix having only one column is called a column vector and a
matrix with only one row is called a row vector. All our vectors will be column vectors and
will be represented by bold letters. Thus, A[1,:] is a row vector and A[:,3] is a column
vector.
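For readers who wish to experiment, the notions of order, rows, columns and entries can be illustrated numerically. The following sketch assumes Python with numpy is available; it is not part of the text (note that numpy indexing is 0-based, while the text's A[i,:] notation is 1-based).

```python
import numpy as np

# A 2 x 3 matrix: 2 rows, 3 columns (entries may be complex).
A = np.array([[1, 3 + 1j, 7],
              [4, 5,      6]])

print(A.shape)   # the order (m, n) = (2, 3)
print(A[0, :])   # first row, A[1,:] in the text's notation
print(A[:, 2])   # third column, A[:,3] in the text's notation
print(A[1, 1])   # the entry a_22 = 5
```
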
Definition 1.1.1.2. [Equality of two Matrices] Two matrices A = [a_{ij}] and B = [b_{ij}] having
the same order m × n are equal if a_{ij} = b_{ij}, for each i = 1,2,…,m and j = 1,2,…,n.
In other words, two matrices are said to be equal if they have the same order and their corresponding
entries are equal.
Example 1.1.1.3. The linear system of equations 2x+3y = 5 and 3x+2y = 5 can be identified
with the matrix A = . Note that x and y are unknowns with the understanding
that x is associated with A[:,1] and y is associated with A[:,2].
1.1.A Special Matrices
Definition 1.1.1.4.
A matrix in which each entry is zero is called a zero-matrix, denoted 0. For example,
A matrix that has the same number of rows as the number of columns, is called a square
matrix. A square matrix is said to have order n if it is an n × n matrix.
Let A = [a_{ij}] be an n × n square matrix.
Then the entries a_{11},a_{22},…,a_{nn} are called the diagonal entries of A and they constitute the principal diagonal of A.
Then A is said to be a diagonal matrix if a_{ij} = 0 for i≠j, denoted diag[a_{11},…,a_{nn}].
For example, the zero matrix 0_{n} and are a few diagonal matrices.
If A = diag[a_{11},…,a_{nn}] and a_{ii} = d for all i = 1,…,n then the diagonal matrix A is
called a scalar matrix.
Then A = diag[1,…,1] is called the identity matrix, denoted I_{n}, or in short I.
For example, I_{2} = and I_{3} = .
Then A is said to be an upper triangular matrix if a_{ij} = 0 for i > j.
Then A is said to be a lower triangular matrix if a_{ij} = 0 for i < j.
Then A is said to be triangular if it is an upper or a lower triangular matrix.
For example, is upper triangular, is lower triangular.
An m × n matrix A = [a_{ij}] is said to have an upper triangular form if a_{ij} = 0 for all i > j.
For example, the matrices , and have upper
triangular forms.
1.2 Operations on Matrices
Definition 1.1.2.1. [Transpose of a Matrix] Let A = [a_{ij}] be an m × n matrix with real
entries. Then the transpose of A, denoted A^{T } = [b_{ij}], is an n×m matrix with b_{ij} = a_{ji}, for all
i,j.
Definition 1.1.2.2. [Conjugate Transpose of a Matrix] Let A = [a_{ij}] be an m × n matrix
with complex entries. Then the conjugate transpose of A, denoted A^{*}, is an n × m matrix
with (A^{*})_{ij} = \overline{a_{ji}}, for all i,j, where for a ∈ ℂ, \overline{a} denotes the complex-conjugate of a.
Thus, if x is a column vector then x^{T } and x^{*} are row vectors and vice-versa. For example, if
A = then A^{*} = A^{T } = , whereas if A = then A^{*} = and
note that A^{*}≠A^{T }.
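The distinction between A^{T} and A^{*} can be checked numerically. A minimal sketch, assuming Python with numpy (not part of the text):

```python
import numpy as np

A = np.array([[1 + 1j, 2],
              [3,      4 - 2j]])

AT    = A.T          # transpose: (A^T)_{ij} = a_{ji}
Astar = A.conj().T   # conjugate transpose: (A^*)_{ij} = conj(a_{ji})

print(AT[1, 0])      # 2, just moved
print(Astar[0, 0])   # conj(1 + i) = 1 - i, moved and conjugated

# (A^*)^* = A, mirroring Theorem 1.1.2.3
print(np.array_equal(Astar.conj().T, A))  # True
```
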
Theorem 1.1.2.3. For any matrix A, (A^{*})^{*} = A. Thus, (A^{T })^{T } = A.
Proof. Let A = [a_{ij}], A^{*} = [b_{ij}] and (A^{*})^{*} = [c_{ij}]. Clearly, the order of A and (A^{*})^{*} is the same. Also,
by definition c_{ij} = \overline{b_{ji}} = \overline{\overline{a_{ij}}} = a_{ij} for all i,j and hence the result follows. _
Remark 1.1.2.4. Note that the transpose is studied whenever the entries of the matrix are real. Since we are allowing the matrix entries to be complex numbers, we will state and prove the results for the conjugate transpose. The readers should separately verify that similar results hold for the transpose whenever the matrix has real entries.
Definition 1.1.2.5. [Addition of Matrices] Let A = [a_{ij}] and B = [b_{ij}] be two m × n
matrices. Then, the sum of A and B, denoted A + B, is defined to be the matrix C = [c_{ij}] with
c_{ij} = a_{ij} + b_{ij}.
Definition 1.1.2.6. [Multiplying a Scalar to a Matrix] Let A = [a_{ij}] be an m × n matrix.
Then the product of k ∈ ℂ with A, denoted kA, is defined as kA = [ka_{ij}].
For example, if A = then 5A = and (2 + i)A = .
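Both operations are entrywise, so they are easy to check numerically. A small sketch, assuming Python with numpy (an illustration, not part of the text):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print(A + B)         # entrywise sum [a_ij + b_ij]
print(5 * A)         # scalar multiple [5 a_ij]
print((2 + 1j) * A)  # complex scalars are allowed as well

# commutativity A + B = B + A, as in Theorem 1.1.2.7
print(np.array_equal(A + B, B + A))  # True
```
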
Theorem 1.1.2.7. Let A, B and C be matrices of order m × n, and let k, ℓ ∈ ℂ. Then
A + B = B + A (commutativity).
(A + B) + C = A + (B + C) (associativity).
k(ℓA) = (kℓ)A.
(k + ℓ)A = kA + ℓA.
Proof. Part 1. Let A = [a_{ij}] and B = [b_{ij}]. Then, by definition
A + B = [a_{ij}] + [b_{ij}] = [a_{ij} + b_{ij}] = [b_{ij} + a_{ij}] = [b_{ij}] + [a_{ij}] = B + A
as complex numbers commute. The reader is required to prove the other parts as all the results follow
from the properties of complex numbers. _
Definition 1.1.2.8. [Additive Inverse] Let A be an m × n matrix.
Then there exists a matrix B with A + B = 0. This matrix B is called the additive inverse of A, and is denoted by -A = (-1)A.
Also, for the matrix 0_{m×n}, A + 0 = 0 + A = A. Hence, the matrix 0_{m×n} is called the
additive identity.
Exercise 1.1.2.9.
Find a 3 × 3 non-zero matrix A with real entries satisfying
A^{T } = A.
A^{T } = -A.
Find a 3 × 3 non-zero matrix A with complex entries satisfying
A^{*} = A.
A^{*} = -A.
Find the 3 × 3 matrix A = [a_{ij}] satisfying a_{ij} = 1 if i≠j and 2 otherwise.
Find the 3 × 3 matrix A = [a_{ij}] satisfying a_{ij} = 1 if |i - j| ≤ 1 and 0 otherwise.
Find the 4 × 4 matrix A = [a_{ij}] satisfying a_{ij} = i + j.
Find the 4 × 4 matrix A = [a_{ij}] satisfying a_{ij} = 2^{i+j}.
Suppose A + B = A. Then show that B = 0.
Suppose A + B = 0. Then show that B = (-1)A = [-a_{ij}].
Let A = and B = . Compute A + B^{*} and B + A^{*}.
1.2.A Multiplication of Matrices
Definition 1.1.2.10. [Matrix Multiplication / Product] Let A = [a_{ij}] be an m × n matrix
and B = [b_{ij}] be an n × r matrix. The product of A and B, denoted AB, is a matrix C = [c_{ij}]
of order m × r with
c_{ij} = ∑_{k=1}^{n} a_{ik}b_{kj}, for 1 ≤ i ≤ m and 1 ≤ j ≤ r.
Thus, AB is defined if and only if the number of columns of A equals the number of rows of B.
For example, if A = and B = then
(1.1.2.1)
Remark 1.1.2.11. Thus, note that the rows of the matrix AB can be written directly as
(AB)[i,:] = a_{i1}B[1,:] + a_{i2}B[2,:] + ⋯ + a_{in}B[n,:],   (1.1.2.2)
and similarly, the columns of the matrix AB can be written directly as
(AB)[:,j] = b_{1j}A[:,1] + b_{2j}A[:,2] + ⋯ + b_{nj}A[:,n].   (1.1.2.3)
In this example, while AB is defined, the product BA is not defined. However, for square matrices A and B of the same order, both the products AB and BA are defined.
The product AB corresponds to operating on the rows of the matrix B (see Equation (1.1.2.2)). This is the row method for calculating the matrix product.
The product AB also corresponds to operating on the columns of the matrix A (see Equation (1.1.2.3)). This is the column method for calculating the matrix product.
Let A and B be two matrices such that the product AB is defined. Then verify that
(AB)[i,:] = A[i,:]B and (AB)[:,j] = AB[:,j].   (1.1.2.4)
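The row and column methods above can be checked numerically. A minimal sketch, assuming Python with numpy (an illustration, not part of the text):

```python
import numpy as np

A = np.array([[1, 2, 0],
              [3, 1, 4]])        # a 2 x 3 matrix
B = np.array([[1, 0],
              [2, 1],
              [1, 3]])           # a 3 x 2 matrix
C = A @ B                        # the 2 x 2 product AB

# Row method: (AB)[i,:] = A[i,:] B
assert np.array_equal(C[0, :], A[0, :] @ B)
# Column method: (AB)[:,j] = A B[:,j]
assert np.array_equal(C[:, 1], A @ B[:, 1])
print(C)
```
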
Example 1.1.2.12. Let A = and B = . Use the row/column method of
matrix multiplication to
find the second row of the matrix AB. Solution: By Remark 1.1.2.11.4, (AB)[2,:] = A[2,:]B and hence
find the third column of the matrix AB. Solution: Again, by Remark 1.1.2.11.4, (AB)[:,3] = AB[:,3] and hence
Definition 1.1.2.13. [Commutativity of Matrix Product] Two square matrices A and B are
said to commute if AB = BA.
Remark 1.1.2.14. Note that if A is a square matrix of order n and if B is a scalar matrix of order n then AB = BA. In general, the matrix product is not commutative. For example, consider A = and B = . Then verify that AB = ≠ = BA.
Theorem 1.1.2.15. Suppose that the matrices A, B and C are so chosen that the matrix multiplications are defined.
Then (AB)C = A(BC). That is, the matrix multiplication is associative.
For any k ∈ ℝ,(kA)B = k(AB) = A(kB).
Then A(B + C) = AB + AC. That is, multiplication distributes over addition.
If A is an n × n matrix then AI_{n} = I_{n}A = A.
Now let A be a square matrix of order n and D = diag[d_{1},d_{2},…,d_{n}]. Then
(DA)[i,:] = d_{i}A[i,:], for 1 ≤ i ≤ n, and
(AD)[:,j] = d_{j}A[:,j], for 1 ≤ j ≤ n.
Proof. Part 1. Let A = [a_{ij}]_{m×n}, B = [b_{ij}]_{n×p} and C = [c_{ij}]_{p×q}. Then
(BC)_{kj} = ∑_{ℓ=1}^{p} b_{kℓ}c_{ℓj} and (AB)_{iℓ} = ∑_{k=1}^{n} a_{ik}b_{kℓ}.
Therefore,
(A(BC))_{ij} = ∑_{k=1}^{n} a_{ik}(BC)_{kj} = ∑_{k=1}^{n} ∑_{ℓ=1}^{p} a_{ik}b_{kℓ}c_{ℓj} = ∑_{ℓ=1}^{p} (∑_{k=1}^{n} a_{ik}b_{kℓ})c_{ℓj} = ∑_{ℓ=1}^{p} (AB)_{iℓ}c_{ℓj} = ((AB)C)_{ij}.
Part 5. As D is a diagonal matrix, using Remark 1.1.2.11.4, we have
Using a similar argument, the next part follows. The other parts are left for the reader. _
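Part 5 of the theorem (left multiplication by a diagonal matrix scales rows, right multiplication scales columns) can be checked numerically. A small sketch, assuming Python with numpy (an illustration, not part of the text):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
D = np.diag([2, 3, 5])     # D = diag[2, 3, 5]

# (DA)[i,:] = d_i A[i,:]: left multiplication scales the rows
assert np.array_equal((D @ A)[1, :], 3 * A[1, :])
# (AD)[:,j] = d_j A[:,j]: right multiplication scales the columns
assert np.array_equal((A @ D)[:, 2], 5 * A[:, 2])
print(D @ A)
```
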
Exercise 1.1.2.16.
Find a 2 × 2 non-zero matrix A satisfying A^{2} = 0.
Find a 2 × 2 non-zero matrix A satisfying A^{2} = A and A≠I_{2}.
Find 2 × 2 non-zero matrices A, B and C satisfying AB = AC but B ≠ C. That is, the cancellation law doesn’t hold.
Let A = . Compute A^{2}and A^{3}. Is A^{3} = I? Determine aA^{3} + bA + cA^{2}.
Let A and B be two m × n matrices. Then prove that (A + B)^{*} = A^{*} + B^{*}.
Let A be a 1 × n matrix and B be an n × 1 matrix. Then verify that AB is a 1 × 1 matrix, whereas BA has order n × n.
Let A and B be two matrices such that the matrix product AB is defined.
Prove that (AB)^{*} = B^{*}A^{*}.
If A[1,:] = 0^{*} then (AB)[1,:] = 0^{*}.
If B[:,1] = 0 then (AB)[:,1] = 0.
If A[i,:] = A[j,:] for some i and j then (AB)[i,:] = (AB)[j,:].
If B[:,i] = B[:,j] for some i and j then (AB)[:,i] = (AB)[:,j].
Let A = and B = . Compute
A - A^{*},A + A^{*},(3AB)^{*}- 4B^{*}A and 3A - 2A^{*}.
(AB)[1,:],(AB)[3,:],(AB)[:,1] and (AB)[:,2].
(B^{*}A^{*})[:,1],(B^{*}A^{*})[:,3],(B^{*}A^{*})[1,:] and (B^{*}A^{*})[2,:].
Let A = and B = . Guess a formula for A^{n} and B^{n} and prove it.
Let A = , B = and C = . Is it true that A^{2} - 2A + I = 0? What is B^{3} - 3B^{2} + 3B - I? Is C^{3} = 3C^{2}?
Construct the matrices A and B satisfying the following statements.
The product AB is defined but BA is not defined.
The products AB and BA are defined but they have different orders.
The products AB and BA are defined, they have the same order but AB≠BA.
Let a, b and c be indeterminates. Then, can we find A with complex entries satisfying A = ? What if A = ? Give reasons for your answer.
1.2.B Inverse of a Matrix
Definition 1.1.2.17.[Inverse of a Matrix] Let A be a square matrix of order n.
A square matrix B is said to be a left inverse of A if BA = I_{n}.
A square matrix C is called a right inverse of A, if AC = I_{n}.
A matrix A is said to be invertible (or is said to have an inverse) if there exists a matrix
B such that AB = BA = I_{n}.
Lemma 1.1.2.18. Let A be an n × n matrix. Suppose that there exist n × n matrices B and C such that AB = I_{n} and CA = I_{n}. Then B = C.
Proof. Note that C = CI_{n} = C(AB) = (CA)B = I_{n}B = B. _
Remark 1.1.2.19.
Lemma 1.1.2.18 implies that whenever A is invertible, the inverse is unique.
Therefore the inverse of A is denoted by A^{-1}. That is, AA^{-1} = A^{-1}A = I.
Example 1.1.2.20.
Let A = .
If ad - bc ≠ 0 then verify that A^{-1} = .
In particular, the inverse of equals .
If ad-bc = 0 then prove that either A[1,:] = 0^{*} or A[:,1] = 0 or A[2,:] = αA[1,:] or
A[:,2] = αA[:,1] for some α ∈ ℂ. Hence, prove that A is not invertible.
The matrices , and do not have inverses.
Let A = . Then A^{-1} = .
Prove that the matrices A = and B = are not invertible.
Solution: Suppose there exists C such that CA = AC = I. Then, using matrix product
But A[1,:] = A[2,:] and thus [1,0,0] = [0,1,0], a contradiction.
Similarly, if there exists D such that BD = DB = I then
But B[:,3] = B[:,1] + B[:,2] and hence I[:,3] = I[:,1] + I[:,2], a contradiction.
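The 2 × 2 inverse formula of this example, A^{-1} = (1/(ad - bc)) [[d, -b], [-c, a]], can be implemented directly. A minimal sketch, assuming Python with numpy (an illustration, not part of the text):

```python
import numpy as np

def inverse_2x2(A):
    """Inverse of a 2 x 2 matrix [[a, b], [c, d]] when ad - bc != 0."""
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c
    if det == 0:
        # as proved in the example, A is not invertible in this case
        raise ValueError("ad - bc = 0: the matrix is not invertible")
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[1.0, 2.0], [3.0, 4.0]])
Ainv = inverse_2x2(A)
print(np.allclose(A @ Ainv, np.eye(2)))  # True
```
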
Theorem 1.1.2.21.Let A and B be two invertible matrices. Then
(A^{-1})^{-1} = A.
(AB)^{-1} = B^{-1}A^{-1}.
(A^{*})^{-1} = (A^{-1})^{*}.
Proof. Proof of Part 1. Let B = A^{-1} be the inverse of A. Then AB = BA = I. Thus, by definition, B
is invertible and B^{-1} = A. Or equivalently, (A^{-1})^{-1} = A.
Proof of Part 2. By associativity (AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = I = (B^{-1}A^{-1})(AB).
Proof of Part 3. As AA^{-1} = A^{-1}A = I, we get (AA^{-1})^{*} = (A^{-1}A)^{*} = I^{*}. Or equivalently,
(A^{-1})^{*}A^{*} = A^{*}(A^{-1})^{*} = I. Thus, by definition (A^{*})^{-1} = (A^{-1})^{*}. _
We will again come back to the study of invertible matrices in Sections 2.2 and 2.3.A.
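The identities of Theorem 1.1.2.21 are easy to check on a concrete pair of invertible matrices. A small sketch, assuming Python with numpy (an illustration, not part of the text):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 1.0]])
B = np.array([[1.0, 2.0], [0.0, 1.0]])

# (AB)^{-1} = B^{-1} A^{-1}, note the reversed order
AB_inv = np.linalg.inv(A @ B)
print(np.allclose(AB_inv, np.linalg.inv(B) @ np.linalg.inv(A)))  # True

# (A^*)^{-1} = (A^{-1})^*
print(np.allclose(np.linalg.inv(A.conj().T),
                  np.linalg.inv(A).conj().T))                    # True
```
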
Exercise1.1.2.22.
Let A be an invertible matrix. Then (A^{-1})^{r} = A^{-r} for all integers r.
Find the inverse of and .
Let A_{1},…,A_{r} be invertible matrices. Then the matrix B = A_{1}A_{2}⋯A_{r} is also invertible.
Let x^{*} = [1 + i,2,3] and y^{*} = [2,-1 + i,4]. Prove that x^{*}y is invertible but xy^{*} is not invertible.
Let A be an n × n invertible matrix. Then prove that
A[i,:] ≠ 0^{*} for any i.
A[:,j] ≠ 0 for any j.
A[i,:]≠A[j,:] for any i and j.
A[:,i]≠A[:,j] for any i and j.
A[3,:]≠αA[1,:] + βA[2,:] for any α,β ∈ ℂ, whenever n ≥ 3.
A[:,3]≠αA[:,1] + βA[:,2] for any α,β ∈ ℂ, whenever n ≥ 3.
Determine A that satisfies (I + 3A)^{-1} = .
Determine A that satisfies (I - A)^{-1} = . [See Example 1.1.2.20.2].
Let A be a square matrix satisfying A^{3} + A - 2I = 0. Prove that A^{-1} = .
Let A = [a_{ij}] be an invertible matrix. If B = [p^{i-j}a_{ij}] for some p ∈ ℂ, p ≠ 0 then determine B^{-1}.
1.3 Some More Special Matrices
Definition 1.1.3.1.
Let A be a square matrix with real entries. Then, A is called
symmetric if A^{T } = A. For example, A = .
skew-symmetric if A^{T } = -A. For example, A = .
orthogonal if AA^{T } = A^{T }A = I. For example, A = .
Let A be a square matrix with complex entries. Then, A is called
normal if A^{*}A = AA^{*}. For example, is a normal matrix.
Hermitian if A^{*} = A. For example, A = .
skew-Hermitian if A^{*} = -A. For example, A = .
unitary if AA^{*} = A^{*}A = I. For example, A = .
A matrix A is said to be idempotent if A^{2} = A. For example, A = is idempotent.
A matrix that is symmetric and idempotent is called a projection matrix. For example, let
u ∈ ℝ^{n} be a column vector with u^{T }u = 1. Then A = uu^{T } is an idempotent matrix. Moreover, A is
symmetric and hence is a projection matrix. In particular, let u = (1,2)^{T }∕√5 and A = uu^{T }. Then
u^{T }u = 1 and for any vector x = (x_{1},x_{2})^{T } ∈ ℝ^{2} note that
Ax = (u^{T }x)u = ((x_{1} + 2x_{2})∕5) [1,2]^{T }.
Thus, Ax is the foot of the perpendicular from the point x on the vector [1,2]^{T }.
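The projection A = uu^{T} can be checked numerically. A small sketch, assuming Python with numpy (an illustration, not part of the text):

```python
import numpy as np

u = np.array([1.0, 2.0]) / np.sqrt(5)  # unit vector along [1, 2]^T
A = np.outer(u, u)                     # A = u u^T

# A is symmetric and idempotent, hence a projection matrix
assert np.allclose(A, A.T)
assert np.allclose(A @ A, A)

x = np.array([3.0, 1.0])
print(A @ x)   # foot of the perpendicular from x onto the line through [1, 2]^T
```

For x = (3, 1)^T the projection is ((3 + 2)/5)[1, 2]^T = [1, 2]^T, which the printed value approximates.
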
A square matrix A is said to be nilpotent if there exists a positive integer n such that A^{n} = 0.
The least positive integer k for which A^{k} = 0 is called the order of nilpotency. For example, if
A = [a_{ij}] is an n × n matrix with a_{ij} equal to 1 if i - j = 1 and 0, otherwise then A^{n} = 0 and
A^{ℓ}≠0 for 1 ≤ ℓ ≤ n - 1.
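The sub-diagonal example of a nilpotent matrix can be built and tested directly. A minimal sketch, assuming Python with numpy (an illustration, not part of the text):

```python
import numpy as np

n = 4
# a_ij = 1 if i - j = 1 and 0 otherwise: ones on the first sub-diagonal
A = np.diag(np.ones(n - 1), k=-1)

powers = [np.linalg.matrix_power(A, k) for k in range(1, n + 1)]
# A, A^2, ..., A^{n-1} are non-zero but A^n = 0: order of nilpotency n
print([bool(np.any(P)) for P in powers])  # [True, True, True, False]
```
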
Exercise 1.1.3.2.
Let A be a complex square matrix. Then S_{1} = (A + A^{*})∕2 is Hermitian, S_{2} = (A - A^{*})∕2
is skew-Hermitian, and A = S_{1} + S_{2}.
Let A and B be two lower triangular matrices. Then prove that AB is a lower triangularmatrix. A similar statement holds for upper triangular matrices.
Let A and B be Hermitian matrices. Then prove that AB is Hermitian if and only if AB = BA.
Show that the diagonal entries of a skew-Hermitian matrix are zero or purely imaginary.
Let A, B be skew-Hermitian matrices with AB = BA. Is the matrix AB Hermitian or skew-Hermitian?
Let A be a Hermitian matrix of order n with A^{2} = 0. Is it necessarily true that A = 0?
Let A be a nilpotent matrix. Prove that there exists a matrix B such that B(I + A) = I = (I + A)B [If A^{k} = 0 then look at I - A + A^{2} - ⋯ + (-1)^{k-1}A^{k-1}].
Are the matrices and orthogonal, for θ ∈ [0,2π]?
Let {u_{1},u_{2},u_{3}} be three vectors in ℝ^{3} such that u_{i}^{*}u_{i} = 1, for 1 ≤ i ≤ 3, and u_{i}^{*}u_{j} = 0
whenever i ≠ j. Then prove that the 3 × 3 matrix
U = [u_{1} u_{2} u_{3}] satisfies U^{*}U = I. Thus, UU^{*} = I.
A = u_{i}u_{i}^{*}, for 1 ≤ i ≤ 3, satisfies A^{2} = A. Is A symmetric?
A = u_{i}u_{i}^{*} + u_{j}u_{j}^{*}, for i≠j, satisfies A^{2} = A. Is A symmetric?
Verify that the matrices in Exercises 9.9b and 9.9c are projection matrices.
Let A and B be two n × n orthogonal matrices. Then prove that AB is also an orthogonal matrix.
1.3.A Sub-matrix of a Matrix
Definition 1.1.3.3. A matrix obtained by deleting some of the rows and/or columns of a
matrix is said to be a sub-matrix of the given matrix.
For example, if A = then [1],[2],,[15],,A are a few sub-matrices of A. But the
matrices and are not sub-matrices of A (the reader is advised to give
reasons).
Let A be an n×m matrix and B be an m×p matrix. Suppose r < m. Then, we can decompose the
matrices A and B as A = [P Q] and B = , where P has order n × r and H has order r × p. That
is, the matrices P and Q are sub-matrices of A and P consists of the first r columns of A and Q
consists of the last m - r columns of A. Similarly, H and K are sub-matrices of B and H consists of
the first r rows of B and K consists of the last m-r rows of B. We now prove the following important
theorem.
Theorem 1.1.3.4. Let A = [a_{ij}] = [P Q] and B = [b_{ij}] = be defined as above. Then AB = PH + QK.
Proof. The matrix products PH and QK are valid as the order of the matrices P,H,Q and K are
respectively, n × r, r × p, n × (m-r) and (m-r) × p. Also, the matrices PH and QK are of the same
order and hence their sum is justified. Now, let P = [P_{ij}], Q = [Q_{ij}], H = [H_{ij}], and K = [K_{ij}]. Then,
for 1 ≤ i ≤ n and 1 ≤ j ≤ p, we have
(AB)_{ij} = ∑_{k=1}^{m} a_{ik}b_{kj} = ∑_{k=1}^{r} a_{ik}b_{kj} + ∑_{k=r+1}^{m} a_{ik}b_{kj} = (PH)_{ij} + (QK)_{ij}.
Thus, the required result follows. _
Remark 1.1.3.5. Theorem 1.1.3.4 is very useful due to the following reasons:
The order of the matrices P,Q,H and K are smaller than that of A or B.
The matrices P, Q, H and K can be further partitioned so as to form blocks that are either identity or zero or matrices that have nice forms. This partition may be quite useful during different matrix operations.
If we want to prove results using induction then, after proving the initial step, one assumes the result for all r × r sub-matrices and then tries to prove it for (r+1) × (r+1) sub-matrices.
For example, if A = and B = then AB = .
Suppose A = [P Q; R S] and B = [E F; G H], where the blocks P, Q, R, S have orders n_{1} × m_{1}, n_{1} × m_{2}, n_{2} × m_{1}, n_{2} × m_{2} and the blocks E, F, G, H have orders r_{1} × s_{1}, r_{1} × s_{2}, r_{2} × s_{1}, r_{2} × s_{2}, respectively.
Then the matrices P,Q,R,S and E,F,G,H, are called the blocks of the matrices A and B,
respectively. Note that even if A + B is defined, the orders of P and E need not be the
same. But, if the block sums are defined then A + B = [P + E  Q + F; R + G  S + H]. Similarly, if the
product AB is defined, the product PE may not be defined. Again, if the block products are
defined, one can verify that AB = [PE + QG  PF + QH; RE + SG  RF + SH]. That is, once a partition of A
is fixed, the partition of B has to be properly chosen for purposes of block addition or
multiplication.
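The block-product formula can be verified numerically on a compatible partition. A small sketch, assuming Python with numpy (an illustration, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 5, (4, 4))
B = rng.integers(0, 5, (4, 3))

# Partition A = [P Q; R S] and B = [E F; G H] with compatible block sizes
P, Q = A[:2, :2], A[:2, 2:]
R, S = A[2:, :2], A[2:, 2:]
E, F = B[:2, :1], B[:2, 1:]
G, H = B[2:, :1], B[2:, 1:]

# Assemble the block products and compare with the ordinary product
blocks = np.block([[P @ E + Q @ G, P @ F + Q @ H],
                   [R @ E + S @ G, R @ F + S @ H]])
print(np.array_equal(blocks, A @ B))  # True
```
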
Let x^{T } = [1,1,1,-1]. Use the previous result to prove that Cx = 0.
Let x = , y = , A = and B = . Then
prove that y = Ax gives the counter-clockwise rotation through an angle α.
prove that y = Bx gives the reflection about the line y = tan(θ)x.
compute y = (AB)x and y = (BA)x. Do they correspond to reflections? If yes, then about which line?
furthermore, if y = Cx gives the counter-clockwise rotation through β and y = Dx gives the reflection about the line y = tan(δ)x, respectively, then prove that
AC = CA and y = (AC)x gives the counter-clockwise rotation through α + β.
y = (BD)x and y = (DB)x give rotations. Which angles do they represent?
Fix a unit vector a ∈ ℝ^{n} and define f : ℝ^{n} → ℝ^{n} by f(y) = 2(a^{T }y)a - y. Does this function give a reflection about the line that contains the points 0 and a?
Consider the two coordinate transformations
x_{1} = a_{11}y_{1} + a_{12}y_{2},  x_{2} = a_{21}y_{1} + a_{22}y_{2}
and
y_{1} = b_{11}z_{1} + b_{12}z_{2},  y_{2} = b_{21}z_{1} + b_{22}z_{2}.
Compose the two transformations to express x_{1}, x_{2} in terms of z_{1}, z_{2}.
If x^{T } = [x_{1},x_{2}], y^{T } = [y_{1},y_{2}] and z^{T } = [z_{1},z_{2}] then find matrices A, B and C such that x = Ay, y = Bz and x = Cz.
Is C = AB? Give reasons for your answer.
For A_{n×n} = [a_{ij}], the trace of A, denoted Tr(A), is defined by Tr(A) = a_{11} + a_{22} + ⋯ + a_{nn}.
Compute Tr(A) for A = and A = .
Let A be a matrix with A = 2 and A = 3. If B =
then compute Tr(AB).
Let A and B be two square matrices of the same order. Then prove that
Tr(A + B) = Tr(A) + Tr(B).
Tr(AB) = Tr(BA).
Prove that one cannot find matrices A and B such that AB - BA = cI for any c ≠ 0.
Let A and B be two m × n matrices with complex entries. Then prove that
Ax = 0 for all n × 1 vector x implies that A = 0, the zero matrix.
Ax = Bx for all n × 1 vector x implies that A = B.
Let A be an n × n matrix such that AB = BA for all n × n matrices B. Then prove that A is a scalar matrix. That is, A = αI for some α ∈ ℂ.
Let A = .
Find a matrix B such that AB = I_{2}.
What can you say about the number of such matrices? Give reasons for your answer.
Does there exist a matrix C such that CA = I_{3}? Give reasons for your answer.
Let A = and B = . Compute the matrix product AB using the block matrix multiplication.
Let A = . If P,Q and R are Hermitian, is the matrix A Hermitian?
Let A = , where A_{11} is an n × n invertible matrix and c ∈ ℂ.
If p = c - A_{21}A_{11}^{-1}A_{12} is non-zero, prove that
is the inverse of A.
Use the above to find the inverse of and .
Let x be an n × 1 vector with real entries and satisfying x^{T }x = 1.
Define A = I_{n} - 2xx^{T }. Prove that A is symmetric and A^{2} = I. The matrix A is commonly known as the Householder matrix.
Let α ≠ 1 be a real number and define A = I_{n} - αxx^{T }. Prove that A is symmetric and invertible [The inverse is also of the form I_{n} + βxx^{T } for some value of β].
Let A be an invertible matrix of order n and let x and y be two n × 1 vectors with real entries. Also, let β be a real number such that α = 1 + βy^{T }A^{-1}x ≠ 0. Then prove the famous Sherman-Morrison formula
This formula gives information about the inverse when an invertible matrix is modified by a rank one matrix.
Suppose the matrices B and C are invertible and the involved partitioned products are defined, then prove that
Let J be an n × n matrix having each entry 1.
Prove that J^{2} = nJ.
Let α_{1},α_{2},β_{1},β_{2} ∈ ℝ. Prove that there exist α_{3},β_{3} ∈ ℝ such that
Let α, β ∈ ℝ with α ≠ 0 and α + nβ ≠ 0 and define A = αI_{n} + βJ. Prove that A is invertible.
Let A be an upper triangular matrix. If A^{*}A = AA^{*} then prove that A is a diagonal matrix. The same holds for a lower triangular matrix.
Let A be an m × n matrix. Then a matrix G of order n × m is called a generalized inverse of A if AGA = A. For example, a generalized inverse of the matrix A = [1,2] is a matrix G = , for all α ∈ ℝ. A generalized inverse G is called a pseudo inverse or a Moore-Penrose inverse if GAG = G and the matrices AG and GA are symmetric. Check that for α = the matrix G is a pseudo inverse of A.
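The four Moore-Penrose conditions can be checked numerically for A = [1, 2]. A small sketch, assuming Python with numpy, whose `numpy.linalg.pinv` computes the Moore-Penrose inverse (an illustration, not part of the text):

```python
import numpy as np

A = np.array([[1.0, 2.0]])     # the 1 x 2 matrix A = [1, 2]
G = np.linalg.pinv(A)          # its Moore-Penrose inverse, a 2 x 1 matrix

# The Moore-Penrose conditions from the exercise
assert np.allclose(A @ G @ A, A)          # AGA = A (generalized inverse)
assert np.allclose(G @ A @ G, G)          # GAG = G
assert np.allclose(A @ G, (A @ G).T)      # AG is symmetric
assert np.allclose(G @ A, (G @ A).T)      # GA is symmetric
print(G)
```

Here G works out to [0.2, 0.4]^T, i.e. A^T/(AA^T).
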
1.4 Summary
In this chapter, we started with the definition of a matrix and came across lots of examples. In
particular, the following examples were important:
The zero matrix of size m × n, denoted 0_{m×n} or 0.
The identity matrix of size n × n, denoted I_{n} or I.
Triangular matrices.
Hermitian/Symmetric matrices.
Skew-Hermitian/skew-symmetric matrices.
Unitary/Orthogonal matrices.
Idempotent matrices.
Nilpotent matrices.
We also learnt the product of two matrices. Even though it seemed complicated, it basically tells us that multiplying a matrix A on the left is the same as operating on the rows of A, and multiplying A on the right is the same as operating on the columns of A.
Chapter 2 System of Linear Equations over ℝ
2.1 Introduction
Let us look at some examples of linear systems.
Suppose a,b ∈ ℝ. Consider the system ax = b in the unknown x. If
a ≠ 0 then the system has a unique solution x = b∕a.
a = 0 and
b ≠ 0 then the system has no solution.
b = 0 then the system has an infinite number of solutions, namely all x ∈ ℝ.
Consider a linear system with 2 equations in 2 unknowns. The equation ax + by = c in the
unknowns x and y represents a line in ℝ^{2} if either a ≠ 0 or b ≠ 0. Thus the solution set of the system
a_{1}x + b_{1}y = c_{1},  a_{2}x + b_{2}y = c_{2}
is given by the points of intersection of the two lines. The different cases are illustrated by examples (see Figure 2.1).
Unique Solution x + 2y = 1 and x + 3y = 1. The unique solution is [x,y]^{T } = [1,0]^{T }. Observe that in this case, a_{1}b_{2}- a_{2}b_{1}≠0.
Infinite Number of Solutions x + 2y = 1 and 2x + 4y = 2. As both equations represent the same line, the
solution set is [x,y]^{T } = [1 - 2y,y]^{T } = [1,0]^{T } + y[-2,1]^{T } with y arbitrary. Observe
that
the vector [1,0]^{T } corresponds to the solution x = 1,y = 0 of the given system.
the vector [-2,1]^{T } corresponds to the solution x = -2,y = 1 of the system
x + 2y = 0,2x + 4y = 0.
No Solution x + 2y = 1 and 2x + 4y = 3. The equations represent a pair of parallel lines and hence
there is no point of intersection. Observe that in this case, a_{1}b_{2}- a_{2}b_{1} = 0 but
a_{1}c_{2}- a_{2}c_{1}≠0.
Figure 2.1: Examples in 2 dimensions
As a last example, consider 3 equations in 3 unknowns. A linear equation ax + by + cz = d represents a plane in ℝ^{3} provided [a,b,c] ≠ [0,0,0]. Here, we
have to look at the points of intersection of the three given planes.
Unique Solution Consider the system x+y +z = 3,x+4y +2z = 7 and 4x+10y-z = 13. The unique
solution to this system is [x,y,z]^{T } = [1,1,1]^{T }, i.e., the three planes intersect at a point.
Infinite Number of Solutions Consider the system x + y + z = 3, x + 2y + 2z = 5 and 3x + 4y + 4z = 11. The solution set
is [x,y,z]^{T } = [1,2 - z,z]^{T } = [1,2,0]^{T } + z[0,-1,1]^{T }, with z arbitrary. Observe the
following:
Here, the three planes intersect in a line.
The vector [1,2,0]^{T } corresponds to the solution x = 1,y = 2 and z = 0 of the
linear system x + y + z = 3,x + 2y + 2z = 5 and 3x + 4y + 4z = 11. Also, the
vector [0,-1,1]^{T } corresponds to the solution x = 0,y = -1 and z = 1 of the
linear system x + y + z = 0,x + 2y + 2z = 0 and 3x + 4y + 4z = 0.
No Solution The system x + y + z = 3,2x + 2y + 2z = 5 and 3x + 3y + 3z = 3 has no solution. In
this case, we have three parallel planes. The readers are advised to supply the
proof.
Definition 2.2.1.1. [Linear System] A system of m linear equations in n unknowns x_{1},x_{2},…,x_{n}
is a set of equations of the form
where for 1 ≤ i ≤ m and 1 ≤ j ≤ n, a_{ij}, b_{i} ∈ ℝ. Linear System (2.2.1.1) is called homogeneous if
b_{1} = b_{2} = ⋯ = b_{m} = 0 and non-homogeneous, otherwise.
Let A = , x = and b = . Then (2.2.1.1) can be re-written as
Ax = b. In this setup, the matrix A is called the coefficient matrix and the block matrix [Ab] is
called the augmented matrix of the linear system (2.2.1.1).
Remark 2.2.1.2. Consider the augmented matrix [Ab] of the linear system Ax = b, where A is an m × n matrix, b and x are column vectors of appropriate size. If x^{T } = [x_{1},…,x_{n}] then it is important to note that
the unknown x_{1}corresponds to the column ([Ab])[:,1].
in general, for j = 1,2,…,n, the unknown x_{j}corresponds to the column ([Ab])[:,j].
the vector b = ([Ab])[:,n + 1].
for i = 1,2,…,m, the i^{th}equation corresponds to the row ([Ab])[i,:].
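The coefficient and augmented matrices are easy to form in code. A small sketch for the system 2x + 3y = 5, 3x + 2y = 5 of Example 1.1.1.3, assuming Python with numpy (an illustration, not part of the text):

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [3.0, 2.0]])      # coefficient matrix: column j holds x_j's coefficients
b = np.array([5.0, 5.0])

# Augmented matrix [A b]: append b as the last column
Ab = np.column_stack([A, b])
print(Ab)
print(Ab[:, 2])   # the last column is b
print(Ab[0, :])   # the first row is the first equation
```
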
Definition 2.2.1.3. [Consistent, Inconsistent] A linear system is called consistent if it admits
a solution and is called inconsistent if it admits no solution. For example, the homogeneous
system Ax = 0 is always consistent as 0 is a solution whereas the system x + y = 2,2x + 2y = 1
is inconsistent.
Definition 2.2.1.4. Consider the linear system Ax = b. Then the corresponding linear system
Ax = 0 is called the associated homogeneous system. As mentioned in the previous
paragraph, the associated homogeneous system is always consistent.
Definition 2.2.1.5.[Solution of a Linear System] A solution of Ax = b is a vector y such
that Ay indeed equals b. The set of all solutions is called the solution set of the system. For
example, the solution set of Ax = b, with A = and b = equals .
The readers are advised to supply the proof of the next theorem that gives information about the
solution set of a homogeneous system.
Theorem 2.2.1.6. Consider the homogeneous linear system Ax = 0.
Then x = 0, the zero vector, is always a solution.
Let u≠0 be a solution of Ax = 0. Then, y = cu is also a solution for all c ∈ ℝ.
Let u_{1},…,u_{k} be solutions of Ax = 0. Then ∑_{i=1}^{k} a_{i}u_{i} is also a solution of Ax = 0, for all a_{i} ∈ ℝ, 1 ≤ i ≤ k.
Remark 2.2.1.7. Consider the homogeneous system Ax = 0. Then
the vector 0 is called the trivial solution.
a non-zero solution is called a non-trivial solution. For example, for the system Ax = 0, where A = , the vector x = is a non-trivial solution.
Thus, by Theorem 2.2.1.6, the existence of a non-trivial solution of Ax = 0 is equivalent to having an infinite number of solutions for the system Ax = 0.
Let u, v be two distinct solutions of the non-homogeneous system Ax = b. Then x_{h} = u - v is a solution of the homogeneous system Ax = 0. That is, any two solutions of Ax = b differ by a solution of the associated homogeneous system Ax = 0. Or equivalently, the solution set of Ax = b is of the form {x_{0} + x_{h}}, where x_{0} is a particular solution of Ax = b and x_{h} is a solution of the associated homogeneous system Ax = 0.
Exercise 2.2.1.8.
Consider a system of 2 equations in 3 unknowns. If this system is consistent then how many solutions does it have?
Define a linear system of 3 equations in 2 unknowns such that the system is inconsistent.
Define a linear system of 4 equations in 3 unknowns such that the system is inconsistent whereas for any three equations the system is consistent.
Let Ax = b be a system of m equations and n unknowns. Then
determine the possible solution set if m ≥ 3 and n = 2.
determine the possible solution set if m ≥ 4 and n = 3.
can this system have exactly two distinct solutions?
can this system have only finitely many (greater than 1) solutions?
2.1.A Elementary Row Operations
Example 2.2.1.9. Solve the linear system y + z = 2,2x + 3z = 5,x + y + z = 3.
Solution: Let B_{0} = [Ab], the augmented matrix. Then B_{0} = . We now systematically
proceed to get the solution.
Interchange 1^{st} and 2^{nd} equation (interchange B_{0}[1,:] and B_{0}[2,:] to get B_{1}).
In the new system, multiply 1^{st} equation by 1∕2 (multiply B_{1}[1,:] by 1∕2 to get B_{2}).
In the new system, replace 3^{rd} equation by 3^{rd} equation minus 1^{st} equation (replace
B_{2}[3,:] by B_{2}[3,:] - B_{2}[1,:] to get B_{3}).
In the new system, replace 3^{rd} equation by 3^{rd} equation minus 2^{nd} equation (replace
B_{3}[3,:] by B_{3}[3,:] - B_{3}[2,:] to get B_{4}).
In the new system, multiply 3^{rd} equation by -2∕3 (multiply B_{4}[3,:] by -2∕3 to get B_{5}).
The last equation gives z = 1. Using this, the second equation gives y = 1. Finally, the first
equation gives x = 1. Hence, the solution set is {[x,y,z]^{T } | [x,y,z] = [1,1,1]}, a unique solution.
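The unique solution of this example can be cross-checked numerically. A small sketch, assuming Python with numpy (an illustration, not part of the text):

```python
import numpy as np

# The system of Example 2.2.1.9: y + z = 2, 2x + 3z = 5, x + y + z = 3
A = np.array([[0.0, 1.0, 1.0],
              [2.0, 0.0, 3.0],
              [1.0, 1.0, 1.0]])
b = np.array([2.0, 5.0, 3.0])

x = np.linalg.solve(A, b)
print(x)  # close to [1, 1, 1], matching the unique solution found above
```
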
In Example 2.2.1.9, observe that for each operation on the system of linear equations there is a
corresponding operation on the row of the augmented matrix. We use this idea to define elementary
row operations and the equivalence of two linear systems.
Definition 2.2.1.10.[Elementary Row Operations] Let A be an m × n matrix. Then the
elementary row operations are
E_{ij}: Interchange of A[i,:] and A[j,:].
E_{k}(c) for c≠0: Multiply A[k,:] by c.
E_{ij}(c) for c≠0: Replace A[i,:] by A[i,:] + cA[j,:].
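The three elementary row operations can be implemented directly. A minimal sketch, assuming Python with numpy; the function names and 0-based row indices are illustrative choices, not the text's notation:

```python
import numpy as np

def E_interchange(A, i, j):
    """E_ij: interchange A[i,:] and A[j,:]."""
    B = A.copy()
    B[[i, j]] = B[[j, i]]
    return B

def E_scale(A, k, c):
    """E_k(c), c != 0: multiply A[k,:] by c."""
    B = A.copy()
    B[k] = c * B[k]
    return B

def E_add(A, i, j, c):
    """E_ij(c): replace A[i,:] by A[i,:] + c A[j,:]."""
    B = A.copy()
    B[i] = B[i] + c * B[j]
    return B

# First step of Example 2.2.1.9 on the augmented matrix [A b]
A = np.array([[0.0, 1.0, 1.0, 2.0],
              [2.0, 0.0, 3.0, 5.0],
              [1.0, 1.0, 1.0, 3.0]])
A = E_interchange(A, 0, 1)
print(A[0])   # the row [2. 0. 3. 5.] is now first
```
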
Definition 2.2.1.11.[Equivalent Linear Systems] Consider the linear systems Ax = b and
Cx = d with corresponding augmented matrices, [Ab] and [Cd], respectively. Then the two
linear systems are said to be equivalent if [Cd] can be obtained from [Ab] by application of a
finite number of elementary row operations.
Definition 2.2.1.12.[Equivalent Matrices] Two matrices are said to be equivalent if one can
be obtained from the other by a finite number of elementary row operations.
Thus, note that the linear systems at each step in Example 2.2.1.9 are equivalent to each other. We now prove that the solution sets of two equivalent linear systems are the same.
Lemma 2.2.1.13. Let Cx = d be the linear system obtained from Ax = b by application of a single elementary row operation. Then Ax = b and Cx = d have the same solution set.
Proof. We prove the result for the elementary row operation E_{jk}(c) with c≠0. The reader is advised to
prove the result for the other two elementary operations.
In this case, the systems Ax = b and Cx = d vary only in the j^{th} equation. So, we need to show
that y satisfies the j^{th} equation of Ax = b if and only if y satisfies the j^{th} equation of Cx = d. So, let
y^{T} = [α_{1},…,α_{n}]. Then, the j^{th} and k^{th} equations of Ax = b are a_{j1}α_{1} + ⋯ + a_{jn}α_{n} = b_{j} and
a_{k1}α_{1} + ⋯ + a_{kn}α_{n} = b_{k}. Therefore, we see that the α_{i}'s satisfy
Also, by definition the j^{th} equation of Cx = d equals
Therefore, using Equation (2.2.1.2), we see that y^{T } = [α_{1},…,α_{n}] is also a solution for Equation
(2.2.1.3). Now, use a similar argument to show that if z^{T } = [β_{1},…,β_{n}] is a solution of Cx = d then it is
also a solution of Ax = b. Hence, the required result follows. _
The readers are advised to use Lemma 2.2.1.13 as an induction step to prove the main result of this
subsection which is stated next.
Theorem 2.2.1.14. Let Ax = b and Cx = d be two equivalent linear systems. Then they have the same solution set.
2.2 System of Linear Equations
In the previous section, we saw that if one linear system can be obtained from another by a repeated
application of elementary row operations then the two linear systems have the same solution set.
Sometimes it helps to imagine an elementary row operation as multiplication on the left by an
elementary matrix. In this section, we will try to understand this relationship and use it to
first obtain results for systems of linear equations and then apply them to the theory of square
matrices.
2.2.A Elementary Matrices and the Row-Reduced Echelon Form (RREF)
Definition 2.2.2.1. A square matrix E of order n is called an elementary matrix if it is
obtained by applying exactly one elementary row operation to the identity matrix I_{n}.
Remark 2.2.2.2. The elementary matrices are of three types and they correspond to the elementary row operations.
E_{ij}: Matrix obtained by applying the elementary row operation E_{ij} to I_{n}.
E_{k}(c) for c≠0: Matrix obtained by applying the elementary row operation E_{k}(c) to I_{n}.
E_{ij}(c) for c≠0: Matrix obtained by applying the elementary row operation E_{ij}(c) to I_{n}.
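To make the three types concrete, here is a small Python sketch (not part of the text; the helper names are my own) that builds each elementary matrix by applying the corresponding row operation to I_n and checks its effect by left multiplication:

```python
from fractions import Fraction

def identity(n):
    # the identity matrix I_n as nested lists of Fractions
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def matmul(A, B):
    # plain matrix product
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def E_interchange(n, i, j):
    # E_ij: interchange rows i and j of I_n (0-based indices)
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E

def E_scale(n, k, c):
    # E_k(c), c != 0: multiply row k of I_n by c
    E = identity(n)
    E[k] = [c * x for x in E[k]]
    return E

def E_add(n, i, j, c):
    # E_ij(c), c != 0: replace row i of I_n by row i + c * row j
    E = identity(n)
    E[i] = [a + c * b for a, b in zip(E[i], E[j])]
    return E

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# left multiplication by E_interchange swaps the corresponding rows of A
print(matmul(E_interchange(3, 0, 1), A) == [[4, 5, 6], [1, 2, 3], [7, 8, 9]])  # True
```

The same helpers reproduce the other two operations: `E_scale` scales one row and `E_add` adds a multiple of one row to another.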
Example 2.2.2.3.
In particular, for n = 3 and a real number c≠0, one has E_{23} = , E_{1}(c) = , E_{31}(c) = and E_{23}(c) = .
Let A = and B = . Then verify that E_{23}A = = = B.
Let A = . Then E_{21}(-1)E_{32}(-2)A = E_{21}(-1) = .
Let A = . Then verify that E_{31}(-2)E_{13}E_{31}(-1)A = .
Exercise 2.2.2.4.
Which of the following matrices are elementary?
Find elementary matrices E_{1},…,E_{k} such that E_{k}⋯E_{1} = I_{2}.
Determine elementary matrices F_{1},…,F_{k} such that F_{k}⋯F_{1} = I_{3}.
Remark 2.2.2.5. Observe that
(E_{ij})^{-1} = E_{ij} as E_{ij}E_{ij} = I = E_{ij}E_{ij}.
Let c≠0. Then (E_{k}(c))^{-1} = E_{k}(1∕c) as E_{k}(c)E_{k}(1∕c) = I = E_{k}(1∕c)E_{k}(c).
Let c≠0. Then (E_{ij}(c))^{-1} = E_{ij}(-c) as E_{ij}(c)E_{ij}(-c) = I = E_{ij}(-c)E_{ij}(c).
Thus, each elementary matrix is invertible and the inverse is also an elementary matrix.
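The inverse relations above can be checked numerically; a minimal Python sketch (helper names are my own) using exact rational arithmetic:

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def E_scale(n, k, c):
    # E_k(c): multiply row k of I_n by c (0-based)
    E = identity(n)
    E[k] = [c * x for x in E[k]]
    return E

def E_add(n, i, j, c):
    # E_ij(c): replace row i of I_n by row i + c * row j (0-based)
    E = identity(n)
    E[i] = [a + c * b for a, b in zip(E[i], E[j])]
    return E

I = identity(3)
c = Fraction(5)
# (E_k(c))^{-1} = E_k(1/c)
assert matmul(E_scale(3, 1, c), E_scale(3, 1, 1 / c)) == I
# (E_ij(c))^{-1} = E_ij(-c)
assert matmul(E_add(3, 0, 2, c), E_add(3, 0, 2, -c)) == I
print("inverse relations verified")
```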
Based on the above observation and the fact that a product of invertible matrices is invertible, the
readers are advised to prove the next result.
Lemma 2.2.2.6. Applying an elementary row operation is equivalent to multiplying on the left by the corresponding elementary matrix, and vice versa.
Proposition 2.2.2.7. Let A and B be two equivalent matrices. Then B = E_{1}⋯E_{k}A, for some elementary matrices E_{1},…,E_{k}.
Proof. By definition of equivalence, the matrix B can be obtained from A by a finite number of
elementary row operations. But by Lemma 2.2.2.6, each elementary row operation on A
corresponds to multiplying on the left of A by an elementary matrix. Thus, the required result
follows. _
We now give a direct proof of Theorem 2.2.1.14. To do so, we state the theorem once
again.
Theorem 2.2.2.8. Let Ax = b and Cx = d be two equivalent linear systems. Then they have the same solution set.
Proof. Let E_{1},…,E_{k} be the elementary matrices such that E_{1}⋯E_{k}[Ab] = [Cd]. Then, by the definition of
matrix product and Remark 2.2.2.5,
(2.2.2.1)
Now assume that Ay = b holds. Then, by Equation (2.2.2.1)
(2.2.2.2)
On the other hand if Cz = d holds then using Equation (2.2.2.1), we have
(2.2.2.3)
Therefore, using Equations (2.2.2.2) and (2.2.2.3) the required result follows. _
As an immediate corollary, the readers are advised to prove the following result.
Corollary 2.2.2.9. Let A and B be two equivalent matrices. Then the systems Ax = 0 and Bx = 0 have the same solution set.
Example 2.2.2.10. Are the matrices A = and B = equivalent? Solution: No, as is a solution of Bx = 0 but it isn't a solution of Ax = 0.
Definition 2.2.2.11. [Pivot/Leading Term] Let A be a non-zero matrix. Then, a
pivot/leading term is the first (from the left) non-zero entry of a non-zero row of A, and the
column containing the pivot is called the pivot column. A pivot a_{ij} is marked by enclosing it
in a box. For example, the entries a_{12} and a_{23} are pivots in A = . Thus, the
columns 2 and 3 are pivot columns.
Definition 2.2.2.12. [Echelon Form] A matrix A is in echelon form (EF) (ladder like)
if
the pivot of the (i + 1)-th row lies to the right of the pivot of the i-th row,
the entries below each pivot in its pivot column are 0, and
the zero rows are at the bottom.
Example 2.2.2.13.
The following matrices are in echelon form. , , and .
The following matrices are not in echelon form (determine the rule(s) that fail). and .
Definition 2.2.2.14. [Row-Reduced Echelon Form] A matrix C is said to be in row-reduced echelon form (RREF) if
C is already in echelon form,
pivot of each non-zero row is 1,
every other entry in the pivotal column is zero.
A matrix in RREF is also called a row-reduced echelon matrix.
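The defining conditions of the RREF can be turned into a mechanical check; the following Python sketch (the function name is my own) tests the echelon rules together with the two extra RREF rules:

```python
def is_rref(M):
    """Check the RREF conditions from the definition (a sketch)."""
    rows, last_pivot = len(M), -1
    seen_zero_row = False
    for i in range(rows):
        nonzero = [j for j, x in enumerate(M[i]) if x != 0]
        if not nonzero:
            seen_zero_row = True          # zero rows must stay at the bottom
            continue
        if seen_zero_row:
            return False                  # non-zero row below a zero row
        p = nonzero[0]                    # pivot position of this row
        if p <= last_pivot:
            return False                  # pivot not to the right of the one above
        if M[i][p] != 1:
            return False                  # pivot of a non-zero row must be 1
        if any(M[r][p] != 0 for r in range(rows) if r != i):
            return False                  # every other entry in pivot column is 0
        last_pivot = p
    return True

print(is_rref([[1, 0, 2], [0, 1, 3], [0, 0, 0]]))  # True
print(is_rref([[1, 2, 0], [0, 2, 1]]))             # False: pivot 2 is not 1
```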
Example 2.2.2.15.
The following matrices are in RREF. , , and .
The following matrices are not in RREF (determine the rule(s) that fail). , , .
The proof of the next result is beyond the scope of this book and hence is omitted.
Theorem 2.2.2.16.Let A and B be two matrices in RREF. If they are equivalent then A = B.
As an immediate corollary, we obtain the following important result.
Corollary 2.2.2.17. The RREF of a matrix is unique.
Proof. Suppose there exists a matrix A with two different RREFs, say B and C. As the RREFs are
obtained by multiplication by elementary matrices, there exist elementary matrices E_{1},…,E_{k} and
F_{1},…,F_{ℓ} such that B = E_{1}⋯E_{k}A and C = F_{1}⋯F_{ℓ}A. Thus,
As the inverses of elementary matrices are themselves elementary matrices, we see that the matrices B
and C are equivalent. As B and C are in RREF, using Theorem 2.2.2.16, we see that
B = C. _
Theorem 2.2.2.18. Let A be an m × n matrix. If B consists of the first s columns of A then RREF(B) equals the first s columns of RREF(A).
Proof. Let us write F = RREF(A). By definition of RREF, there exist elementary matrices E_{1},…,E_{k}
such that E_{1}⋯E_{k}A = F. Then, by matrix multiplication
Thus, E_{1}⋯E_{k}B = [E_{1}⋯E_{k}A[:,1],…,E_{1}⋯E_{k}A[:,s]] = [F[:,1],…,F[:,s]]. Since the matrix F is in
RREF, by definition, its first s columns are also in RREF. Hence, by Corollary 2.2.2.17 we see that
RREF(B) = [F[:,1],…,F[:,s]]. Thus, the required result follows. _
Let A be an m × n matrix. Then by Corollary 2.2.2.17, its RREF is unique. We use it to define our
next object.
2.2.B Rank of a Matrix
Definition 2.2.2.19. [Row-Rank of a Matrix] Let A be an m×n matrix and let r be the number of
pivots (equivalently, the number of non-zero rows) in its RREF. Then the row-rank of A, denoted row-rank(A),
equals r. For example, row-rank(I_{n}) = n and row-rank(0) = 0.
Remark 2.2.2.20.
Even though row-rank is defined using the RREF of a matrix, we just need to compute the echelon form, as the number of non-zero rows/pivots does not change when we proceed to compute the RREF from the echelon form.
Let A be an m × n matrix. Then by the definition of RREF, the number of pivots, say r, satisfies r ≤ min{m,n}. Thus, row-rank(A) ≤ min{m,n}.
Example 2.2.2.21. Determine the row-rank of the following matrices.
diag(d_{1},…,d_{n}). Solution: Let S = {i|1 ≤ i ≤ n,d_{i}≠0}. Then, verify that row-rank equals the number of
elements in S.
A = . Solution: The echelon form of A is obtained as follows:
As the echelon form of A has 3 non-zero rows, row-rank(A) = 3.
A = . Solution: row-rank(A) = 2 as the echelon form of A (given below) has two non-zero rows:
Remark 2.2.2.22. Let Ax = b be a linear system with m equations in n unknowns. Let RREF([Ab]) = [Cd], row-rank(A) = r and row-rank([Ab]) = r_{a}.
Then, using Theorem 2.2.2.18, conclude that r ≤ r_{a}.
If r < r_{a} then, again using Theorem 2.2.2.18, note that r_{a} = r + 1 and ([Cd])[:,n + 1] has a pivot at the (r + 1)-th place. Hence, by definition of RREF, ([Cd])[r + 1,:] = [0^{T},1].
If r = r_{a} then ([Cd])[:,n + 1] has no pivot. Thus, [0^{T},1] is not a row of [Cd].
Now, consider an m × n matrix A and an elementary matrix E of order n. Then the product AE
corresponds to applying column transformation on the matrix A. Therefore, for each elementary
matrix, there is a corresponding column transformation as well. We summarize these ideas as
follows.
Definition 2.2.2.23. The column transformations obtained by right multiplication by
elementary matrices are called column operations. For example, if A = then
AE_{23} = and AE_{14}(-1) = .
Remark 2.2.2.24 (Rank of a Matrix).
The idea of row-rank was based on the RREF, and the RREF was the result of systematically applying a finite number of elementary row operations. So, starting with a matrix A, we can systematically apply a finite number of elementary column operations (see Definition 2.2.2.23) to get a matrix which in some sense looks similar to the RREF, call it B, and then use the non-zero columns in it to define the column-rank. Note that B will have the following form:
The zero columns appear after the non-zero columns.
The first non-zero entry of a non-zero column is 1, called the pivot.
The pivots move down as one passes to successive non-zero columns.
It will be proved later that row-rank(A) = column-rank(A). Thus, we just talk of the "rank", denoted Rank(A).
We are now ready to prove a few results associated with the rank of a matrix.
Theorem 2.2.2.25. Let A be a matrix of rank r. Then there exist a finite number of elementary matrices E_{1},…,E_{s} and F_{1},…,F_{ℓ} such that
Proof. Let C = RREF(A). As Rank(A) = r, by definition of RREF, there exist elementary matrices
E_{1},…,E_{s} such that C = E_{1}⋯E_{s}A. Note that C has r pivots and they appear in columns, say
i_{1} < i_{2} < ⋯ < i_{r}.
Now, let D be the matrix obtained from C by successively multiplying the elementary matrices E_{jij},
for 1 ≤ j ≤ r, on the right of C. Then observe that D = , where B is a matrix of an
appropriate size.
As the (1,1) block of D is an identity matrix, the block (1,2) can be made the zero matrix by
elementary column operations to D. Thus, the required result follows. _
Exercise 2.2.2.26.
Let A = and B = . Find P and Q such that B = PAQ.
Let A be a matrix of rank r. Then prove that there exist invertible matrices B_{i},C_{i} such that B_{1}A = , AC_{1} = , B_{2}AC_{2} = and B_{3}AC_{3} = , where the (1,1) block of each matrix is of size r × r. Also, prove that A_{1} is an invertible matrix.
Let A be an m×n matrix of rank r. Then prove that A can be written as A = BC, where both B and C have rank r and B is of size m × r and C is of size r × n.
Prove that if the product AB is defined and Rank(A) = Rank(AB) then A = ABX for some matrix X. Similarly, if BA is defined and Rank(A) = Rank(BA) then A = Y BA for some matrix Y. [Hint: Choose invertible matrices P,Q satisfying PAQ = , P(AB) = (PAQ)(Q^{-1}B) = . Now find an invertible matrix R such that P(AB)R = . Now, define X = RQ^{-1} to get the required result.]
Prove that if AB is defined then Rank(AB) ≤ Rank(A) and Rank(AB) ≤ Rank(B).
Let P and Q be invertible matrices such that the matrix product PAQ is defined. Then prove that Rank(PAQ) = Rank(A).
Prove that if A + B is defined then Rank(A + B) ≤ Rank(A) + Rank(B).
2.2.C Gauss-Jordan Elimination and System of Linear Equations
Let A be an m × n matrix. We now present an algorithm, commonly known as the Gauss-Jordan
Elimination (GJE) method, to compute the RREF of A.
Input: A.
Output: a matrix B in RREF such that A is row equivalent to B.
Step 1: Put ‘Region’ = A.
Step 2: If all entries in the Region are 0, STOP. Else, in the Region, find the leftmost
nonzero column and find its topmost non-zero entry. Suppose this non-zero entry is a_{ij}.
Box it. This is a pivot.
Step 3: Interchange the row containing the pivot with the top row of the region. Also, make
the pivot entry 1. Use this pivot to make the other entries in the pivotal column 0.
Step 4: Put Region = the sub-matrix below and to the right of the current pivot. Now,
go to step 2.
Important: The process will stop, as we can get at most min{m,n} pivots.
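Steps 1-4 above can be sketched in Python with exact rational arithmetic (`Fraction`), so no round-off occurs; the function name and 0-based indexing are my own choices:

```python
from fractions import Fraction

def gauss_jordan(A):
    """Return the RREF of A, following Steps 1-4 of the algorithm (a sketch)."""
    M = [[Fraction(x) for x in row] for row in A]
    m, n = len(M), len(M[0])
    top = 0                                   # top row of the current Region
    for col in range(n):                      # scan columns left to right
        pivot_row = next((r for r in range(top, m) if M[r][col] != 0), None)
        if pivot_row is None:
            continue                          # Step 2: column of the Region is zero
        M[top], M[pivot_row] = M[pivot_row], M[top]   # Step 3: bring pivot to the top
        p = M[top][col]
        M[top] = [x / p for x in M[top]]      # make the pivot entry 1
        for r in range(m):                    # zero out the rest of the pivot column
            if r != top and M[r][col] != 0:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[top])]
        top += 1                              # Step 4: shrink the Region
        if top == m:
            break
    return M

R = gauss_jordan([[0, 2, 4], [1, 1, 1], [2, 0, -2]])
print(R == [[1, 0, -1], [0, 1, 2], [0, 0, 0]])  # True
```

The number of non-zero rows of the output is the row-rank of Definition 2.2.2.19.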
Example 2.2.2.27. Apply GJE to
Region = A as A≠0.
Then E_{12}A = . Also, E_{31}(-1)E_{12}A = = B (say).
Now, Region = ≠0. Let C = .
Then E_{1}()C = and E_{21}(-2)E_{1}()C = . Or equivalently,
E_{32}(-2)E_{2}()B = .
As the matrix A has been multiplied on the left by elementary matrices, the resulting RREF
matrix is equivalent to A.
Remark 2.2.2.28 (Gauss Elimination (GE)). GE is similar to the GJE except that
the pivots need not be made 1 and
the entries above the pivots need not be made 0.
For example, if A = then GE gives E_{32}(-3)E_{21}(-1)E_{31}(-1)A = .
Thus, Gauss-Jordan Elimination may be viewed as an extension of the Gauss Elimination.
Example 2.2.2.29. Consider the system Ax = b with A a matrix of order 3 × 3 and A[:,1]≠0. If
[Cd] = RREF([Ab]) then the possible choices for [Cd] are given below:
. Here, Ax = b is consistent with a unique solution = .
, or . Here, Ax = b is inconsistent for any
choice of α,β.
, or . Here, Ax = b is consistent and has
an infinite number of solutions for every choice of α,β.
Exercise 2.2.2.30.
Let Ax = b be a linear system of m equations in 2 unknowns. What are the possible choices for RREF([Ab]) if m ≥ 1?
Find the row-reduced echelon form of the following matrices:
Find the solutions of the linear system of equations using the Gauss-Jordan method.
Now, using Proposition 2.2.2.7, Theorem 2.2.2.8 and the definition of RREF of a matrix, we obtain
the following remark.
Remark 2.2.2.31. Let RREF([Ab]) = [Cd]. Then
there exist elementary matrices, say E_{1},…,E_{k}, such that E_{1}⋯E_{k}[Ab] = [Cd]. Thus, the GJE (or the GE) is equivalent to multiplying on the left of [Ab] by a finite number of elementary matrices.
Definition 2.2.2.32. [Basic, Free Variables] Consider the linear system Ax = b. If RREF([Ab]) = [Cd]
then the unknowns
corresponding to the pivotal columns of C are called the basic variables.
that are not basic are called free variables.
Example 2.2.2.33.
Let RREF([Ab]) = . Then, Ax = b has
as its solution set. Note that x and y are basic variables and z is the free variable.
Let RREF([Ab]) = . Then, the system Ax = b has no solution as
RREF([Ab])[3,:] = [0,0,0,1] which corresponds to the equation 0 ⋅ x + 0 ⋅ y + 0 ⋅ z = 1.
Suppose the system Ax = b is consistent and RREF(A) has r non-zero rows. Then the
system has r basic variables and n - r free variables.
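The classification of unknowns into basic and free variables can be read off mechanically from the pivot columns; a small Python sketch (the function name and the sample RREF are my own, hypothetical choices):

```python
def basic_and_free(C, n):
    """Classify the n unknowns from C = RREF([A|b]) (a sketch).
    Returns (basic, free) as 0-based column indices of the coefficient part."""
    pivot_cols = []
    for row in C:
        nz = [j for j, x in enumerate(row[:n]) if x != 0]
        if nz:
            pivot_cols.append(nz[0])      # first non-zero entry is the pivot
    basic = pivot_cols
    free = [j for j in range(n) if j not in basic]
    return basic, free

# a hypothetical RREF([A|b]) for a consistent system in unknowns x, y, z:
C = [[1, 0, 1, 2],
     [0, 1, 2, 3],
     [0, 0, 0, 0]]
print(basic_and_free(C, 3))  # ([0, 1], [2]): x, y basic; z free
```

With r = 2 pivots and n = 3 unknowns, the system has r basic variables and n - r = 1 free variable, matching the count stated above.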
We now prove the main result in the theory of linear system (recall Remark 2.2.2.22).
Theorem 2.2.2.34. Let A be an m × n matrix and let RREF([Ab]) = [Cd], Rank(A) = r and Rank([Ab]) = r_{a}. Then Ax = b
is inconsistent if r < r_{a}
is consistent if r = r_{a}. Furthermore, Ax = b has
a unique solution if r = n.
an infinite number of solutions if r < n. In this case, the solution set equals
where x_{0},u_{1},…,u_{n-r} ∈ ℝ^{n} with Ax_{0} = b and Au_{i} = 0, for 1 ≤ i ≤ n - r.
Proof. Part 1: As r < r_{a}, by Remark 2.2.2.22, ([Cd])[r + 1,:] = [0^{T},1]. Then, this row corresponds to
the linear equation
which clearly has no solution. Thus, by definition and Theorem 2.2.1.14, Ax = b is inconsistent.
Part 2: As r = r_{a}, by Remark 2.2.2.22, [Cd] doesn't have a row of the form [0^{T},1] and there are r
pivots in C. Suppose the pivots appear in columns i_{1},…,i_{r} with 1 ≤ i_{1} < ⋯ < i_{r} ≤ n. Thus, the
unknowns x_{i_j}, for 1 ≤ j ≤ r, are basic variables and the remaining n - r variables, say
x_{t_1},…,x_{t_{n-r}}, are free variables with t_{1} < ⋯ < t_{n-r}. Since C is in RREF, in terms of the free
variables and basic variables, the ℓ-th row of [Cd], for 1 ≤ ℓ ≤ r, corresponds to the
equation
Hence, the solution set of the system Cx = d is given by
(2.2.2.4)
Thus, by Theorem 2.2.1.14 the system Ax = b is consistent. In case of Part 2a, r = n and hence there
are no free variables. Thus, the unique solution equals x_{i} = d_{i}, for 1 ≤ i ≤ n.
In case of Part 2b, define x_{0} = and u_{1} = ,…,u_{n-r} = . Then, it can be easily
verified that Ax_{0} = b and for 1 ≤ i ≤ n - r, Au_{i} = 0 and by Equation (2.2.2.4) the solution set has
indeed the required form, where k_{i} corresponds to the free variable x_{t_i}. As there is at least one free
variable, the system has an infinite number of solutions. Thus, the proof of the theorem is
complete. _
Let A be an m × n matrix. Then by Remark 2.2.2.20, Rank(A) ≤ m and hence using
Theorem 2.2.2.34 the next result follows.
Corollary 2.2.2.35.Let A be a matrix of order m × n and consider Ax = 0. If
Rank(A) = r < min{m,n} then Ax = 0 has infinitely many solutions.
m < n, then Ax = 0 has infinitely many solutions.
Thus, in either case, the homogeneous system Ax = 0 has at least one non-trivial solution.
Remark 2.2.2.36. Let A be an m × n matrix. Then Theorem 2.2.2.34 implies that
the linear system Ax = b is consistent if and only if Rank(A) = Rank([Ab]).
the vectors associated to the free variables in Equation (2.2.2.4) are solutions of the associated homogeneous system Ax = 0.
We end this subsection with some applications.
Example 2.2.2.37.
Determine the equation of the line/circle that passes through the points (-1,4),(0,1) and
(1,4). Solution: The general equation of a line/circle in Euclidean plane is given by a(x^{2} +
y^{2}) + bx + cy + d = 0, where a,b,c and d are unknowns. Since this curve passes
through the given points, we get a homogeneous system in 3 equations and 4 unknowns,
namely = 0. Solving this system, we get [a,b,c,d] =
[3d∕13, 0, -16d∕13, d]. Hence, taking d = 13, the equation of the required circle is 3(x^{2} + y^{2}) -
16y + 13 = 0.
Determine the equation of the plane that contains the points (1,1,1),(1,3,2) and
(2,-1,2). Solution: The general equation of a plane in space is given by ax + by + cz + d = 0,
where a,b,c and d are unknowns. Since this plane passes through the 3 given points, we
get a homogeneous system in 3 equations and 4 unknowns. So, it has a non-trivial solution,
namely [a,b,c,d] = [-4d∕3, -d∕3, 2d∕3, d]. Hence, taking d = 3, the equation of the required
plane is -4x - y + 2z + 3 = 0.
Let A = . Then
find a non-zero x ∈ ℝ^{3} such that Ax = 2x.
does there exist a non-zero vector y ∈ ℝ^{3} such that Ay = 4y?
Solution of Part 3a: Solving Ax = 2x is equivalent to solving (A - 2I)x = 0 whose
augmented matrix equals . Verify that x^{T } = [1,0,0] is a non-zero
solution.
Part 3b: As above, Ay = 4y is equivalent to solving (A - 4I)y = 0 whose augmented matrix
equals . Now, verify that y^{T } = [2,0,1] is a non-zero solution.
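The answers stated in the example above can be checked numerically by substituting the given points back into the equations; a quick Python sketch (the coefficients are taken from the circle 3(x² + y²) − 16y + 13 = 0 and the plane −4x − y + 2z + 3 = 0 stated in the text):

```python
# Circle from part 1: a(x^2 + y^2) + b*x + c*y + d = 0 with (a, b, c, d) = (3, 0, -16, 13)
a, b, c, d = 3, 0, -16, 13
residuals = [a * (x * x + y * y) + b * x + c * y + d
             for x, y in [(-1, 4), (0, 1), (1, 4)]]
print(residuals)  # [0, 0, 0]: every given point lies on the circle

# Plane from part 2: -4x - y + 2z + 3 = 0
plane = [(-4) * x + (-1) * y + 2 * z + 3
         for x, y, z in [(1, 1, 1), (1, 3, 2), (2, -1, 2)]]
print(plane)  # [0, 0, 0]: every given point lies on the plane
```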
Exercise 2.2.2.38.
In the first part of this chapter, 3 figures (see Figure 2.1) were given to illustrate the different cases in the Euclidean plane (2 equations in 2 unknowns). It is well known that in the case of Euclidean space (3 equations in 3 unknowns) there
is a way to place the 3 planes so that the system has a unique solution.
are 4 distinct ways to place the 3 planes so that the system has no solution.
are 3 distinct ways to place the 3 planes so that the system has an infinite number of solutions.
Determine the positions of the planes by drawing diagrams to explain the above cases. Do these diagrams and the RREF matrices that appear in Example 2.2.2.29 have any relationship? Justify your answer.
Determine the equation of the curve y = ax^{2} + bx + c that passes through the points (-1,4),(0,1)
and (1,4).
Solve the following linear systems.
x + y + z + w = 0,x - y + z + w = 0 and -x + y + 3z + 3w = 0.
x + 2y = 1,x + y + z = 4 and 3y + 2z = 1.
x + y + z = 3,x + y - z = 1 and x + y + 7z = 6.
x + y + z = 3,x + y - z = 1 and x + y + 4z = 6.
x + y + z = 3,x + y - z = 1,x + y + 4z = 6 and x + y - 4z = -1.
For what values of c and k do the following systems have i) no solution, ii) a unique solution and iii) an infinite number of solutions?
x + y + z = 3,x + 2y + cz = 4,2x + 3y + 2cz = k.
x + y + z = 3,x + y + 2cz = 7,x + 2y + 3cz = k.
x + y + 2z = 3,x + 2y + cz = 5,x + 2y + 4z = k.
kx + y + z = 1,x + ky + z = 1,x + y + kz = 1.
x + 2y - z = 1,2x + 3y + kz = 3,x + ky + 3z = 2.
x - 2y = 1,x - y + kz = 1,ky + 4z = 6.
For what values of a do the following systems have i) no solution, ii) a unique solution and iii) an infinite number of solutions?
x + 2y + 3z = 4,2x + 5y + 5z = 6,2x + (a^{2}- 6)z = a + 20.
x + y + z = 3,2x + 5y + 4z = a,3x + (a^{2}- 8)z = 12.
Find the condition(s) on x,y,z so that the system of linear equations given below (in the unknowns a,b and c) is consistent.
a + 2b - 3c = x,2a + 6b - 11c = y,a - 2b + 7c = z
a + b + 5c = x,a + 3c = y,2a - b + 4c = z
a + 2b + 3c = x,2a + 4b + 6c = y,3a + 6b + 9c = z
Let A be an n × n matrix. If the system A^{2}x = 0 has a non-trivial solution then show that Ax = 0 also has a non-trivial solution.
Prove that 5 distinct points are needed to specify a general conic in Euclidean plane.
Let u = (1,1,-2)^{T} and v = (-1,2,3)^{T}. Find conditions on x,y and z such that the system cu + dv = (x,y,z)^{T} in the unknowns c and d is consistent.
Consider the linear system Ax = b in m equations and 3 unknowns. Then, for each of the given solution sets, determine the possible choices of m. Further, for each choice of m, determine a choice of A and b.
(1,1,1)^{T }is the only solution.
{(1,1,1)^{T } + c(1,2,1)^{T }|c ∈ ℝ} as the solution set.
{c(1,2,1)^{T }|c ∈ ℝ} as the solution set.
{(1,1,1)^{T } + c(1,2,1)^{T } + d(2,2,-1)^{T }|c,d ∈ ℝ} as the solution set.
{c(1,2,1)^{T } + d(2,2,-1)^{T }|c,d ∈ ℝ} as the solution set.
2.3 Square Matrices and System of Linear Equations
In this section, the coefficient matrix of the linear system Ax = b will be a square matrix. We start
by proving a few equivalent conditions that relate different ideas.
Theorem 2.2.3.1. Let A be a square matrix of order n. Then the following statements are equivalent.
A is invertible.
The homogeneous system Ax = 0 has only the trivial solution.
Rank(A) = n.
The RREF of A is I_{n}.
A is a product of elementary matrices.
Proof. 1 ⇒ 2: As A is invertible, A^{-1} exists and A^{-1}A = I_{n}. So, if x_{0} is any solution of the
homogeneous system Ax = 0, then
Hence, 0 is the only solution of the homogeneous system Ax = 0.
2 ⇒ 3: If possible, let Rank(A) = r < n. Then, by Corollary 2.2.2.35, the homogeneous
system Ax = 0 has infinitely many solutions, a contradiction. Thus, A has full rank.
3 ⇒ 4: Suppose Rank(A) = n and let B = RREF(A). As B = [b_{ij}] is a square matrix of order
n, each column of B contains a pivot. Since the pivots move to the right as we go down the rows, the
i-th pivot must be at position b_{ii}, for 1 ≤ i ≤ n. Thus, B = I_{n} and hence, the RREF of A is
I_{n}.
4 ⇒ 5: Suppose RREF(A) = I_{n}. Then using Proposition 2.2.2.7, there exist elementary
matrices E_{1},…,E_{k} such that E_{1}⋯E_{k}A = I_{n}. Or equivalently,
(2.2.3.1)
which gives the desired result as by Remark 2.2.2.5 we know that the inverse of an elementary matrix
is also an elementary matrix.
5 ⇒ 1: Suppose A = E_{1}⋯E_{k} for some elementary matrices E_{1},…,E_{k}. As elementary
matrices are invertible (see Remark 2.2.2.5) and the product of invertible matrices is also invertible,
we get the required result. _
As an immediate consequence of Theorem 2.2.3.1, we have the following important result which
implies that one needs to compute either the left or the right inverse to prove invertibility.
Corollary 2.2.3.2. Let A be a square matrix of order n. Suppose there exists a matrix
C such that CA = I_{n}. Then A^{-1} exists.
B such that AB = I_{n}. Then A^{-1} exists.
Proof. Part 1: Let CA = I_{n} for some matrix C and let x_{0} be a solution of the homogeneous system
Ax = 0. Then Ax_{0} = 0 and
Thus, the homogeneous system Ax = 0 has only the trivial solution. Hence, using Theorem 2.2.3.1,
the matrix A is invertible.
Part 2: Using the first part, B is invertible. Hence, B^{-1} = A or equivalently A^{-1} = B and thus A is
invertible as well. _
Another important consequence of Theorem 2.2.3.1 is stated next which uses Equation (2.2.3.1) to
get the required result. This result is used to compute the inverse of a matrix using the Gauss-Jordan
Elimination.
Corollary 2.2.3.3. Let A be an invertible matrix of order n. Suppose there exist elementary matrices E_{1},…,E_{k} such that E_{1}⋯E_{k}A = I_{n}. Then A^{-1} = E_{1}⋯E_{k}.
Remark 2.2.3.4. Let A be an n×n matrix. Apply GJE to [A I_{n}] and let RREF([A I_{n}]) = [B C]. If B = I_{n}, then A^{-1} = C; otherwise A is not invertible.
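The remark's recipe — row-reduce [A I_n] and read off the right block — can be sketched in Python (the function name is mine; exact arithmetic via `Fraction`):

```python
from fractions import Fraction

def inverse_via_gje(A):
    """Row-reduce [A | I_n]; if the left block becomes I_n, the right block
    is A^{-1}; otherwise return None, i.e. A is not invertible (a sketch)."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):                       # eliminate column by column
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                        # left block cannot reach I_n
        M[col], M[pivot] = M[pivot], M[col]    # bring the pivot row up
        p = M[col][col]
        M[col] = [x / p for x in M[col]]       # make the pivot 1
        for r in range(n):                     # clear the rest of the column
            if r != col and M[r][col] != 0:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]              # the right block is A^{-1}

inv = inverse_via_gje([[2, 1], [1, 1]])
print(inv == [[1, -1], [-1, 2]])  # True
```

A singular input such as [[1, 2], [2, 4]] yields `None`, matching the "A is not invertible" branch of the remark.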
Example 2.2.3.5. Use GJE to find the inverse of A = . Solution: Applying GJE to [A|I_{3}] = gives
Thus, A^{-1} = .
Exercise 2.2.3.6.
Find the inverses of the following matrices using GJE. (i) (ii) (iii) (iv).
Let A and B be two matrices having positive entries and of orders 1 × 2 and 2 × 1, respectively. Which of BA or AB is invertible? Give reasons.
Let A be an n × m matrix and B be an m × n matrix. Prove that
I - BA is invertible if and only if I - AB is invertible [use Theorem 2.2.3.1.2].
if I - AB is invertible then (I - BA)^{-1} = I + B(I - AB)^{-1}A.
if I - AB is invertible then (I - BA)^{-1}B = B(I - AB)^{-1}.
if A,B and A + B are invertible then (A^{-1} + B^{-1})^{-1} = A(A + B)^{-1}B.
Let A be a square matrix. Then prove that
A is invertible if and only if A^{T }A is invertible.
A is invertible if and only if AA^{T }is invertible.
We end this section by giving two more equivalent conditions for a matrix to be invertible.
Theorem 2.2.3.7.The following statements are equivalent for an n × n matrix A.
A is invertible.
The system Ax = b has a unique solution for every b.
The system Ax = b is consistent for every b.
Proof. 1 ⇒ 2: Note that x_{0} = A^{-1}b is the unique solution of Ax = b.
2 ⇒ 3: The system is consistent as Ax = b has a solution.
3 ⇒ 1: For 1 ≤ i ≤ n, define e_{i}^{T} = I_{n}[i,:]. By assumption, the linear system Ax = e_{i} has a
solution, say x_{i}, for 1 ≤ i ≤ n. Define a matrix B = [x_{1},…,x_{n}]. Then
Therefore, by Corollary 2.2.3.2, the matrix A is invertible. _
We now give an immediate application of Theorem 2.2.3.7 and Theorem 2.2.3.1 without
proof.
Theorem 2.2.3.8. The following two statements cannot hold together for an n × n matrix A.
The system Ax = b has a unique solution for every b.
The system Ax = 0 has a non-trivial solution.
Exercise 2.2.3.9.
Let A and B be square matrices of order n with B = PA, for an invertible matrix P. Then prove that A is invertible if and only if B is invertible.
Let A and B be two m×n matrices. Then prove that A and B are equivalent if and only if B = PA, where P is a product of elementary matrices. When is this P unique?
Let b^{T} = [1,2,-1,-2]. Suppose A is a 4 × 4 matrix such that the linear system Ax = b has no solution. Mark each of the statements given below as true or false.
The homogeneous system Ax = 0 has only the trivial solution.
The matrix A is invertible.
Let c^{T } = [-1,-2,1,2]. Then the system Ax = c has no solution.
Let B = RREF(A). Then
B[4,:] = [0,0,0,0].
B[4,:] = [0,0,0,1].
B[3,:] = [0,0,0,0].
B[3,:] = [0,0,0,1].
B[3,:] = [0,0,1,α], where α is any real number.
2.3.A Determinant
In this section, we associate a number with each square matrix. To start with, let A be an n × n
matrix. Then, for 1 ≤ i ≤ k and 1 ≤ j ≤ ℓ with 1 ≤ α_{i},β_{j} ≤ n, by A(α_{1},…,α_{k}|β_{1},…,β_{ℓ}) we mean the sub-matrix of
A obtained by deleting the rows corresponding to the α_{i}'s and the columns corresponding to the β_{j}'s of
A.
Example 2.2.3.10. For A = , A(1|2) = and A(1,2|1,3) = [4].
With the notations as above, we are ready to give an inductive definition of the determinant of a
square matrix. The advanced students can find the actual definition of the determinant in
Appendix 7.7.1.22, where it is proved that the definition given below corresponds to the expansion of
determinant along the first row.
Definition 2.2.3.11. [Determinant] Let A be a square matrix of order n. Then the determinant
of A, denoted det(A) (or |A|), is defined by
Example 2.2.3.12.
Let A = [-2]. Then det(A) = |A| = -2.
Let A = . Then, det(A) = |A| = a det(A(1|1)) - b det(A(1|2)) = ad - bc. For
example, if A = then det(A) = = 1 ⋅ 5 - 2 ⋅ 3 = -1.
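The inductive definition — expansion along the first row — translates directly into a short recursive Python sketch (the function names are my own):

```python
def minor(A, i, j):
    # sub-matrix A(i|j): delete row i and column j (0-based indices)
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Determinant via expansion along the first row, as in the definition:
    det(A) = sum_j (-1)^j * a_{0j} * det(A(0|j))."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(n))

print(det([[1, 2], [3, 5]]))  # 1*5 - 2*3 = -1, matching the example above
```

This recursion is exponential in n; it is meant only to mirror the definition, not to be an efficient algorithm.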
Exercise 2.2.3.13. Find the determinant of the following matrices. i) ii) iii).
Definition 2.2.3.14. [Singular, Non-Singular] A matrix A is said to be singular if
det(A) = 0 and non-singular if det(A)≠0.
The next result relates the determinant with row operations. For proof, see Appendix 7.2.
Theorem 2.2.3.15. Let A be an n × n matrix. If
B = E_{ij}A, for 1 ≤ i≠j ≤ n, then det(B) = -det(A).
B = E_{i}(c)A, for c≠0, 1 ≤ i ≤ n, then det(B) = c det(A).
B = E_{ij}(c)A, for c≠0 and 1 ≤ i≠j ≤ n, then det(B) = det(A).
A[i,:]^{T} = 0, for some i, 1 ≤ i ≤ n, then det(A) = 0.
A[i,:] = A[j,:] for some 1 ≤ i≠j ≤ n, then det(A) = 0.
A is a triangular matrix with d_{1},…,d_{n} on the diagonal, then det(A) = d_{1}⋯d_{n}.
Example 2.2.3.16.
Since , using Theorem 2.2.3.15, we
see that, for A = , det(A) = 2 ⋅ (1 ⋅ 2 ⋅ (-1)) = -4, where the first 2 appears
from the elementary matrix E_{1}().
For A = verify that |A|. Now, a successive application
of E_{21}(-1),E_{31}(-1) and E_{41}(-3) gives and then applying E_{32} and
E_{43}(-4), we get . Thus, by Theorem 2.2.3.15 det(A) = 2⋅(-1)⋅(8) = -16
as 2 gets contributed due to E_{1}() and -1 due to E_{32}.
Exercise 2.2.3.17. Use Theorem 2.2.3.15 to arrive at the answer.
Let A = , B = and C^{T} = for some complex numbers α and β. Prove that det(B) = α det(A) and det(C) = det(A).
Prove that 3 divides and = 0.
By Theorem 2.2.3.15.6, det(I_{n}) = 1. The next result about the determinants of the elementary
matrices is an immediate consequence of Theorem 2.2.3.15 and hence the proof is omitted.
Corollary 2.2.3.18.Fix a positive integer n. Then
det(E_{ij}) = -1.
For c≠0, det(E_{k}(c)) = c.
For c≠0, det(E_{ij}(c)) = 1.
Remark 2.2.3.19. Theorem 2.2.3.15.1 implies that the determinant can be calculated by expanding along any row. Hence, the readers are advised to verify that
Example 2.2.3.20. Let us use Remark 2.2.3.19 to compute the determinant.
Let u^{T} = [u_{1},u_{2}], v^{T} = [v_{1},v_{2}] ∈ ℝ^{2}. Now, consider the parallelogram with vertices P =
[0,0]^{T}, Q = u, R = u + v and S = v (see Figure 2.2). Then Area(PQRS) = |u_{1}v_{2} - u_{2}v_{1}|, the absolute value of
Figure 2.2: Parallelepiped with vertices P,Q,R and S as base
Recall that the dot product of u and v, denoted u ∙ v = u_{1}v_{1} + u_{2}v_{2}, the length of u, denoted ℓ(u) = , and cos(θ) = , where θ is the angle between u and v. Therefore
That is, in ℝ^{2}, the determinant is ± times the area of the parallelogram.
Consider Figure 2.2 again. Let u = [u_{1},u_{2},u_{3}]^{T}, v = [v_{1},v_{2},v_{3}]^{T}, w = [w_{1},w_{2},w_{3}]^{T} ∈ ℝ^{3}. Then u ∙ v = u_{1}v_{1} + u_{2}v_{2} + u_{3}v_{3} and the cross product of u and v, denoted
Also, the vector u × v is perpendicular to the plane containing both u and v. So, if u_{3} = v_{3} = 0 then one can think of u and v as vectors in the XY-plane and in this case ℓ(u × v) = |u_{1}v_{2} - u_{2}v_{1}| = Area(PQRS). Hence, if γ is the angle between the vector w and the vector u × v then
In general, for an n×n matrix A, |det(A)| satisfies all the properties associated with the volume of the n-dimensional parallelepiped. The actual proof is beyond the scope of this book. But one can observe the following:
The volume of the n-dimensional unit cube is 1 = det(I_{n}).
If one vector is multiplied by c≠0 then the volume scales by the factor |c|, and the determinant by c.
Recall that if the height and base of a parallelogram are not changed then the area remains the same. This corresponds to replacing the perpendicular vector with the perpendicular vector plus c≠0 times the base vector, for different choices of c.
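These observations can be verified in the 3 × 3 case, where the determinant with rows u, v, w is the scalar triple product u ∙ (v × w); a small Python sketch (the function name is my own):

```python
def det3(u, v, w):
    # 3 x 3 determinant with rows u, v, w: the scalar triple product u . (v x w)
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

u, v, w = [1, 0, 0], [0, 1, 0], [0, 0, 1]
print(abs(det3(u, v, w)))  # 1: the unit cube has volume 1
# scaling one edge by c scales the volume by |c|
assert abs(det3([3, 0, 0], v, w)) == 3 * abs(det3(u, v, w))
# a shear (adding a multiple of one edge to another) keeps the volume
assert abs(det3([1, 5, 0], v, w)) == abs(det3(u, v, w))
```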
2.3.B Adjugate (classically Adjoint) of a Matrix
Definition 2.2.3.22. [Minor, Cofactor] Let A be an n × n matrix and let A(i|j) denote the (n-1) × (n-1) submatrix of A obtained by deleting the i-th row and the j-th column. Then the
(i,j)^{th} minor of A, denoted A_{ij}, equals det(A(i|j)), for 1 ≤ i,j ≤ n.
(i,j)^{th} cofactor of A, denoted C_{ij}, equals (-1)^{i+j}A_{ij}.
Adjugate (classically Adjoint) of A, denoted Adj(A), equals [b_{ij}] with b_{ij} = C_{ji}, for
1 ≤ i,j ≤ n. For example, for A = , A_{11} = det(A(1|1)) = 4, A_{12} = 3, A_{13} = 1, …, A_{32} = -5 and A_{33} = -1. So, Adj(A) = .
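The definition can be sketched computationally (in Python with NumPy; the function `adjugate` and the sample matrix below are our own, since the matrices of the printed example are not reproduced here). The script also checks the identity A(Adj(A)) = det(A)I_{n} of Theorem 2.2.3.23:

```python
import numpy as np

def adjugate(A):
    """Adj(A)[i, j] = C_ji, the (j, i)-th cofactor of A."""
    n = A.shape[0]
    adj = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            # minor A(j|i): delete row j and column i
            minor = np.delete(np.delete(A, j, axis=0), i, axis=1)
            adj[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return adj

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 4.0],
              [5.0, 6.0, 0.0]])  # our own sample matrix
lhs = A @ adjugate(A)
print(np.allclose(lhs, np.linalg.det(A) * np.eye(3)))  # True
```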
We now prove a very important result that relates the adjugate of a matrix with its inverse.
Theorem 2.2.3.23.Let A be an n × n matrix. Then
for 1 ≤ i ≤ n,∑_{j=1}^{n}a_{ij}C_{ij} = ∑_{j=1}^{n}a_{ij}(-1)^{i+j}A_{ij} = det(A),
for i ≠ ℓ, ∑_{j=1}^{n}a_{ij}C_{ℓj} = ∑_{j=1}^{n}a_{ij}(-1)^{ℓ+j}A_{ℓj} = 0, and
A(Adj(A)) = det(A)I_{n}. Thus, whenever det(A) ≠ 0,
A^{-1} = (1/det(A)) Adj(A).    (2.2.3.2)
Proof. Part 1: It follows directly from Remark 2.2.3.19 and the definition of the cofactor.
Part 2: Fix positive integers i, ℓ with 1 ≤ i ≠ ℓ ≤ n and let B = [b_{ij}] be the square matrix with
B[ℓ,:] = A[i,:] and B[t,:] = A[t,:], for t ≠ ℓ. As ℓ ≠ i, B[ℓ,:] = B[i,:] and thus, by Theorem 2.2.3.15.5,
det(B) = 0. As A(ℓ|j) = B(ℓ|j), for 1 ≤ j ≤ n, using Remark 2.2.3.19,
0 = det(B) = ∑_{j=1}^{n}(-1)^{ℓ+j}b_{ℓj} det(B(ℓ|j)) = ∑_{j=1}^{n}(-1)^{ℓ+j}a_{ij} det(A(ℓ|j)) = ∑_{j=1}^{n}a_{ij}C_{ℓj}.
Part 3: Using Equation (2.2.3.3) and Remark 2.2.3.19, observe that the (i,ℓ)-th entry of A(Adj(A)) equals
∑_{j=1}^{n}a_{ij}(Adj(A))_{jℓ} = ∑_{j=1}^{n}a_{ij}C_{ℓj} = det(A) if i = ℓ, and 0 if i ≠ ℓ.
Thus, A(Adj(A)) = det(A)I_{n}. Therefore, if det(A) ≠ 0 then A((1/det(A))Adj(A)) = I_{n}. Hence, by
Corollary 2.2.3.2, A^{-1} = (1/det(A))Adj(A). _
Example 2.2.3.24. For A = , Adj(A) = and det(A) = -2.
Thus, by Theorem 2.2.3.23.3, A^{-1} = .
Let A be a non-singular matrix. Then, by Theorem 2.2.3.23.3, A^{-1} = (1/det(A))Adj(A). Thus
A(Adj(A)) = Adj(A)A = det(A)I_{n} and this completes the proof of the next result.
Corollary 2.2.3.25.Let A be a non-singular matrix. Then
The next result gives another equivalent condition for a square matrix to be invertible.
Theorem 2.2.3.26.A square matrix A is non-singular if and only if A is invertible.
Proof. Let A be non-singular. Then det(A) ≠ 0 and hence, by Theorem 2.2.3.23.3, A^{-1} = (1/det(A))Adj(A) exists.
Now, let us assume that A is invertible. Then, using Theorem 2.2.3.1, A = E_{1}⋯E_{k}, a product of
elementary matrices. Also, by Corollary 2.2.3.18, det(E_{i})≠0 for 1 ≤ i ≤ k. Thus, a repeated application
of Parts 1,2 and 3 of Theorem 2.2.3.15 gives det(A)≠0. _
The next result relates the determinant of product of two matrices with their determinants.
Theorem 2.2.3.27. Let A and B be square matrices of order n. Then det(AB) = det(A) det(B).
Proof. Step 1. Let A be non-singular. Then, by Theorem 2.2.3.23.3, A is invertible and, by
Theorem 2.2.3.1, A = E_{1}⋯E_{k}, a product of elementary matrices. Then a repeated application of
Parts 1, 2 and 3 of Theorem 2.2.3.15 gives the desired result as
det(AB) = det(E_{1}⋯E_{k}B) = det(E_{1}) det(E_{2}⋯E_{k}B) = ⋯ = det(E_{1})⋯det(E_{k}) det(B) = det(A) det(B).
Step 2. Let A be singular. Then, by Theorem 2.2.3.26, A is not invertible. So, by Theorem 2.2.3.1 and
Exercise 2.2.2.26.2, there exists an invertible matrix P such that PA = C, where the last row of C is 0^{T}. So,
A = P^{-1}C. As P is invertible, using Part 1, and since the last row of C, and hence of CB, is the zero row, we have
det(AB) = det(P^{-1}CB) = det(P^{-1}) det(CB) = det(P^{-1}) ⋅ 0 = 0 = det(A) det(B).
Thus, the proof of the theorem is complete. _
The next result relates the determinant of a matrix with the determinant of its transpose. Thus, the
determinant can be computed by expanding along any column as well.
Theorem 2.2.3.28.Let A be a square matrix. Then det(A) = det(A^{T }).
Proof. If A is non-singular then Corollary 2.2.3.25 gives det(A) = det(A^{T}).
If A is singular then, by Theorem 2.2.3.26, A is not invertible. So, A^{T } is also not invertible and hence
by Theorem 2.2.3.26, det(A^{T }) = 0 = det(A). _
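Both Theorems 2.2.3.27 and 2.2.3.28 are easy to check numerically on randomly chosen matrices (a sketch in Python with NumPy; the matrices are random samples, not a proof):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# det(AB) = det(A) det(B) and det(A) = det(A^T)
print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))
print(np.isclose(np.linalg.det(A), np.linalg.det(A.T)))
```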
Example 2.2.3.29. Let A be an orthogonal matrix. Then, by definition, AA^{T} = I. Thus, by
Theorems 2.2.3.27 and 2.2.3.28,
1 = det(I) = det(AA^{T}) = det(A) det(A^{T}) = det(A)^{2}.
Hence det(A) = ±1. In particular, if A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} then I = AA^{T} gives a^{2} + b^{2} = 1, c^{2} + d^{2} = 1 and ac + bd = 0.
Thus, a^{2} + b^{2} = 1 and hence there exists θ ∈ [0,2π) such that a = cos θ and b = sin θ.
As ac + bd = 0, we get c = r sin θ and d = -r cos θ, for some r ∈ ℝ. But, c^{2} + d^{2} = 1
implies that either c = sin θ and d = -cos θ or c = -sin θ and d = cos θ.
Thus, A = \begin{bmatrix} \cos θ & \sin θ \\ \sin θ & -\cos θ \end{bmatrix} or A = \begin{bmatrix} \cos θ & \sin θ \\ -\sin θ & \cos θ \end{bmatrix}.
For A = \begin{bmatrix} \cos θ & \sin θ \\ \sin θ & -\cos θ \end{bmatrix}, det(A) = -1 and A represents a reflection across the line
y = mx. Determine m (see Exercise 3.3b).
For A = \begin{bmatrix} \cos θ & \sin θ \\ -\sin θ & \cos θ \end{bmatrix}, det(A) = 1 and A represents a rotation through the angle α.
Determine α (see Exercise 3.3a).
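A quick numerical sketch of the two families of 2 × 2 orthogonal matrices (in Python with NumPy; the value of θ is an arbitrary choice):

```python
import numpy as np

theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
reflection = np.array([[np.cos(theta),  np.sin(theta)],
                       [np.sin(theta), -np.cos(theta)]])

# Both are orthogonal; their determinants are +1 and -1, respectively
print(np.allclose(rotation @ rotation.T, np.eye(2)))
print(round(np.linalg.det(rotation), 6))
print(np.allclose(reflection @ reflection.T, np.eye(2)))
print(round(np.linalg.det(reflection), 6))
```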
Exercise 2.2.3.30.
Let A be an n × n upper triangular matrix with non-zero entries on the diagonal. Then prove that A^{-1} is also an upper triangular matrix.
Let A be an n × n matrix. Then det(A) = 0 if either
A[i,:] = 0^{T} or A[:,i] = 0, for some i, 1 ≤ i ≤ n,
or A[i,:] = cA[j,:] or A[:,i] = cA[:,j], for some c ∈ ℂ and for some i ≠ j.
Prove that = for some α,β ∈ ℂ.
Prove that 17 divides .
2.3.C Cramer’s Rule
Let A be a square matrix. Then combining Theorem 2.2.3.7 and Theorem 2.2.3.26, one has the
following result.
Corollary 2.2.3.31.Let A be a square matrix. Then the following statements are equivalent:
A is invertible.
The linear system Ax = b has a unique solution for every b.
det(A)≠0.
Thus, Ax = b has a unique solution for every b if and only if det(A)≠0. The next theorem gives a
direct method of finding the solution of the linear system Ax = b when det(A)≠0.
Theorem 2.2.3.32 (Cramer's Rule). Let A be an n × n non-singular matrix. Then the unique solution of the linear system Ax = b with x^{T} = [x_{1},…,x_{n}] is given by
x_{j} = det(A_{j})/det(A), for 1 ≤ j ≤ n,
where A_{j} is the matrix obtained from A by replacing A[:,j] by b.
Proof. Since det(A) ≠ 0, A is invertible. Thus, there exists an invertible matrix P such that PA = I_{n}
and P[A|b] = [I|Pb]. Let d = Pb. Then Ax = b has the unique solution x_{j} = d_{j}, for 1 ≤ j ≤ n. Also,
[e_{1},…,e_{n}] = I = PA = [PA[:,1],…,PA[:,n]]. Thus,
PA_{j} = P[A[:,1],…,A[:,j-1],b,A[:,j+1],…,A[:,n]] = [e_{1},…,e_{j-1},d,e_{j+1},…,e_{n}].
Thus, det(PA_{j}) = d_{j}, for 1 ≤ j ≤ n. Also, d_{j} = det(PA_{j}) = det(P) det(A_{j}) = det(A_{j})/det(A), as det(P) = 1/det(A). Hence,
x_{j} = det(A_{j})/det(A) and the required result follows. _
Example 2.2.3.33. Solve Ax = b using Cramer’s rule, where A = and b = .
Solution: Check that det(A) = 1 and x^{T } = [-1,1,0] as
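Cramer's rule translates directly into code. The sketch below (in Python with NumPy) implements the formula x_{j} = det(A_{j})/det(A); since the matrices of Example 2.2.3.33 are not reproduced here, the data A and b are our own sample:

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b for non-singular A via x_j = det(A_j) / det(A),
    where A_j is A with column j replaced by b."""
    d = np.linalg.det(A)
    x = np.empty(A.shape[1])
    for j in range(A.shape[1]):
        Aj = A.copy()
        Aj[:, j] = b
        x[j] = np.linalg.det(Aj) / d
    return x

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 0.0, 0.0]])  # our own non-singular sample
b = np.array([4.0, 5.0, 6.0])
x = cramer(A, b)
print(np.allclose(A @ x, b))  # True
```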
2.4 Miscellaneous Exercises
Exercise 2.2.4.1.
Suppose A^{-1} = B with A = and B = . Also, assume that A_{11} is invertible and define P = A_{22} - A_{21}A_{11}^{-1}A_{12}. Then prove that
= ,
P is invertible and B = .
Determine necessary and sufficient condition for a triangular matrix to be invertible.
Let A be a unitary matrix. What can you say about |det(A)|?
Let A be a 2 × 2 matrix with Tr(A) = 0 and det(A) = 0. Then A is a nilpotent matrix.
Let A and B be two non-singular matrices. Are the matrices A + B and A - B non-singular? Justify your answer.
Let A be an n × n matrix. Prove that the following statements are equivalent:
A is not invertible.
Rank(A)≠n.
det(A) = 0.
A is not row-equivalent to I_{n}.
The homogeneous system Ax = 0 has a non-trivial solution.
The system Ax = b is either inconsistent or it has an infinite number of solutions.
A is not a product of elementary matrices.
For what value(s) of λ do the following systems have non-trivial solutions? Also, for each value of λ, determine a non-trivial solution.
(λ - 2)x + y = 0,x + (λ + 2)y = 0.
λx + 3y = 0,(λ + 6)y = 0.
Let a_{1},…,a_{n} ∈ ℂ and define A = [a_{ij}]_{n×n} with a_{ij} = a_{i}^{j-1}. Prove that det(A) = ∏_{1≤i<j≤n}(a_{j} - a_{i}). This matrix is usually called the Vandermonde matrix.
Let A = [a_{ij}]_{n×n}with a_{ij} = max{i,j}. Prove that detA = (-1)^{n-1}n.
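Both of the preceding determinant formulas can be spot-checked numerically before attempting a proof (a sketch in Python with NumPy; the nodes a_{i} and the size n are arbitrary choices):

```python
import numpy as np
from itertools import combinations

# Vandermonde: a_ij = a_i^(j-1); det = prod over i < j of (a_j - a_i)
a = np.array([1.0, 2.0, 4.0, 7.0])
V = np.vander(a, increasing=True)
prod = np.prod([a[j] - a[i] for i, j in combinations(range(len(a)), 2)])
print(np.isclose(np.linalg.det(V), prod))  # True

# a_ij = max{i, j} (1-based); det = (-1)^(n-1) * n
n = 5
M = np.fromfunction(lambda i, j: np.maximum(i, j) + 1, (n, n))
print(np.isclose(np.linalg.det(M), (-1) ** (n - 1) * n))  # True
```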
Solve the following systems of equations by Cramer's rule.
i) x + y + z - w = 1, x + y - z + w = 2, 2x + y + z - w = 7, x + y + z + w = 3.
ii) x - y + z - w = 1, x + y - z + w = 2, 2x + y - z - w = 7, x - y - z + w = 3.
Let p ∈ ℂ,p≠0. Let A = [a_{ij}] and B = [b_{ij}] be two n × n matrices with b_{ij} = p^{i-j}a_{ij}, for
1 ≤ i,j ≤ n. Then compute det(B) in terms of det(A).
The position of an entry a_{ij} of a determinant is called even or odd according as i + j is even or odd. Prove that if all the entries in
odd positions are multiplied with -1 then the value of the determinant does not change.
even positions are multiplied with -1 then the value of the determinant
does not change if the matrix is of even order.
is multiplied by -1 if the matrix is of odd order.
Let A be a Hermitian matrix. Prove that detA is a real number.
Let A be an n × n matrix. Then A is invertible if and only if Adj(A) is invertible.
Let A and B be invertible matrices. Prove that Adj(AB) = Adj(B)Adj(A).
Let A be an invertible n × n matrix and let P = \begin{bmatrix} A & B \\ C & D \end{bmatrix}. Then show that Rank(P) = n if and only if D = CA^{-1}B.
2.5 Summary
In this chapter, we started with a system of m linear equations in n unknowns and formally wrote
it as Ax = b, and in turn as the augmented matrix [A|b]. Then the basic operations on
equations led to multiplication by elementary matrices on the left of [A|b], thus giving us
the RREF, which in turn gave us the rank of a matrix. If Rank(A) = r and Rank([A|b]) = r_{a}
and
r < r_{a} then the linear system Ax = b is inconsistent.
r = r_{a} then the linear system Ax = b is consistent. Furthermore, if
r = n then the system Ax = b has a unique solution.
r < n then the system Ax = b has an infinite number of solutions.
We have also seen that the following conditions are equivalent for an n × n matrix A.
A is invertible.
The homogeneous system Ax = 0 has only the trivial solution.
The row reduced echelon form of A is I.
A is a product of elementary matrices.
The system Ax = b has a unique solution for every b.
The system Ax = b has a solution for every b.
rank(A) = n.
det(A)≠0.
So, overall we have learnt to solve the following type of problems:
Solving the linear system Ax = b. This idea will lead to the question “is the vector b a
linear combination of the columns of A”?
Solving the linear system Ax = 0. This will lead to the question “are the columns of A linearly
independent/dependent”? In particular, if Ax = 0 has
a unique solution then the columns of A are linearly independent.
else, the columns of A are linearly dependent.
Chapter 3 Vector Spaces
In this chapter, we will mainly be concerned with finite dimensional vector spaces over ℝ or ℂ. The
last section will consist of results on infinite dimensional vector spaces that are similar to, but
different from, the finite dimensional case. We have given many examples of
vector spaces that are infinite dimensional or are vector spaces over fields different
from ℝ and ℂ. See the appendix for some ideas about fields other than ℝ and
ℂ.
3.1 Vector Spaces: Definition and Examples
In this chapter, F denotes either ℝ, the set of real numbers or ℂ, the set of complex numbers.
Let A be an m × n complex matrix and let V denote the solution set of the homogeneous system
Ax = 0. Then, by Theorem 2.2.1.6, V satisfies:
0 ∈ V as A0 = 0.
if x ∈ V then αx ∈ V, for all α ∈ ℂ. In particular, for α = -1, -x ∈ V.
if x,y ∈ V then, for any α,β ∈ ℂ, αx + βy ∈ V.
if x,y,z ∈ V then, (x + y) + z = x + (y + z).
That is, the solution set of a homogeneous linear system satisfies some nice properties. The Euclidean
plane, ℝ^{2} and the Euclidean space, ℝ^{3}, also satisfy the above properties. In this chapter,
our aim is to understand sets that satisfy such properties. We start with the following
definition.
Definition 3.3.1.1.[Vector Space] A vector space V over F, denoted V(F) or in short
V (if the field F is clear from the context), is a non-empty set, satisfying the following
axioms:
Vector Addition: To every pair u,v ∈ V there corresponds a unique element u⊕v ∈ V (called
the addition of vectors) such that
u ⊕ v = v ⊕ u (Commutative law).
(u ⊕ v) ⊕ w = u ⊕ (v ⊕ w) (Associative law).
V has a unique element, denoted 0, called the zero vector that satisfies u⊕0 = u,
for every u ∈ V (called the additive identity).
for every u ∈ V there is a unique element -u ∈ V that satisfies u⊕(-u) = 0 (called
the additive inverse).
Scalar Multiplication: For each u ∈ V and α ∈ F, there corresponds a unique element α ⊙ u
in V (called the scalar multiplication) such that
α ⋅ (β ⊙ u) = (αβ) ⊙ u for every α,β ∈ F and u ∈ V (⋅ is multiplication in F).
1 ⊙ u = u for every u ∈ V, where 1 ∈ F.
Distributive Laws (relating vector addition with scalar multiplication): For any α,β ∈ F and u,v ∈ V, the following distributive laws hold:
α ⊙ (u ⊕ v) = (α ⊙ u) ⊕(α ⊙ v).
(α + β) ⊙ u = (α ⊙ u) ⊕(β ⊙ u) (+ is addition in F).
Definition 3.3.1.2.
The number 0 ∈ F is a scalar, whereas 0 ∈ V is the zero vector.
The elements of F are called scalars.
The elements of V are called vectors.
If F = ℝ then V is called a real vector space.
If F = ℂ then V is called a complex vector space.
In general, a vector space over ℝ or ℂ is called a linear space.
Some interesting consequences of Definition 3.3.1.1 are stated next. Intuitively, these results
seem obvious, but for a better understanding of the axioms it is desirable to go through the
proofs.
Theorem 3.3.1.3.Let V be a vector space over F. Then
u ⊕ v = u implies v = 0.
α ⊙ u = 0 if and only if either u = 0 or α = 0.
(-1) ⊙ u = -u, for every u ∈ V.
Proof. Part 1: By Axiom 3.3.1.1.1d, for each u ∈ V there exists -u ∈ V such that -u ⊕ u = 0.
Hence, u ⊕ v = u is equivalent to
v = 0 ⊕ v = (-u ⊕ u) ⊕ v = -u ⊕ (u ⊕ v) = -u ⊕ u = 0.
Part 2: As 0 = 0 ⊕ 0, using Axiom 3.3.1.1.3, we have
α ⊙ 0 = α ⊙ (0 ⊕ 0) = (α ⊙ 0) ⊕ (α ⊙ 0).
Thus, using Part 1, α ⊙ 0 = 0 for any α ∈ F. In the same way, using Axiom 3.3.1.1.3b,
0 ⊙ u = (0 + 0) ⊙ u = (0 ⊙ u) ⊕ (0 ⊙ u).
Hence, using Part 1, one has 0 ⊙ u = 0 for any u ∈ V.
Now suppose α ⊙ u = 0. If α = 0 then the proof is over. Therefore, assume that α ≠ 0, α ∈ F. Then
α^{-1} ∈ F and
u = 1 ⊙ u = (α^{-1}α) ⊙ u = α^{-1} ⊙ (α ⊙ u) = α^{-1} ⊙ 0 = 0,
as 1 ⊙ u = u for every vector u ∈ V (see Axiom 3.3.1.1.2b). Thus, if α ≠ 0 and α ⊙ u = 0 then
u = 0.
Part 3: As 0 = 0 ⊙ u = (1 + (-1)) ⊙ u = u ⊕ ((-1) ⊙ u), one has (-1) ⊙ u = -u. _
Example 3.3.1.4. The readers are advised to justify the statements given below.
Let A be an m × n matrix with complex entries with Rank(A) = r ≤ n. Then, using
Theorem 2.2.2.34, V = {x|Ax = 0} is a vector space.
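The closure properties of the solution set of Ax = 0 are easy to witness numerically (a sketch in Python with NumPy; the rank-deficient matrix A and the particular solutions are our own choices):

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0],
              [2.0, 4.0, 2.0]])   # rank 1, so Ax = 0 has non-trivial solutions

# Two particular solutions of Ax = 0
x = np.array([2.0, -1.0, 0.0])
y = np.array([1.0, 0.0, -1.0])
print(np.allclose(A @ x, 0), np.allclose(A @ y, 0))  # True True

# Any linear combination alpha*x + beta*y is again a solution
alpha, beta = 3.0, -7.0
print(np.allclose(A @ (alpha * x + beta * y), 0))    # True
```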
Consider ℝ with the usual addition and multiplication. That is, ⊕≡ + and ⊙≡⋅. Then,
ℝ forms a real vector space.
Let ℝ^{2} = {(x_{1},x_{2})^{T} | x_{1},x_{2} ∈ ℝ}. Then, for (x_{1},x_{2})^{T}, (y_{1},y_{2})^{T} ∈ ℝ^{2} and α ∈ ℝ, define
(x_{1},x_{2})^{T} ⊕ (y_{1},y_{2})^{T} = (x_{1} + y_{1}, x_{2} + y_{2})^{T} and α ⊙ (x_{1},x_{2})^{T} = (αx_{1}, αx_{2})^{T}.
Verify that ℝ^{2} is a real vector space.
Let ℝ^{n} = {(a_{1},…,a_{n})^{T} | a_{i} ∈ ℝ, 1 ≤ i ≤ n} = V. For u = (a_{1},…,a_{n})^{T}, v = (b_{1},…,b_{n})^{T} ∈ V and
α ∈ ℝ, define
u ⊕ v = (a_{1} + b_{1},…,a_{n} + b_{n})^{T} and α ⊙ u = (αa_{1},…,αa_{n})^{T}
(called componentwise operations). Then, V is a real vector space. The vector
space ℝ^{n} is called the real vector space of n-tuples.
Recall that the symbol i represents the complex number √(-1).
Consider ℂ = {x + iy|x,y ∈ ℝ}, the set of complex numbers. Let z_{1} = x_{1} + iy_{1} and z_{2} = x_{2} + iy_{2}
and define z_{1}⊕ z_{2} = (x_{1} + x_{2}) + i(y_{1} + y_{2}). For scalar multiplication,
let α ∈ ℝ. Then α ⊙ z_{1} = (αx_{1}) + i(αy_{1}) and we call ℂ a real vector space.
let α + iβ ∈ ℂ. Then (α + iβ) ⊙ (x_{1} + iy_{1}) = (αx_{1} - βy_{1}) + i(αy_{1} + βx_{1}) and we call
ℂ a complex vector space.
Let ℂ^{n} = {(z_{1},…,z_{n})^{T} | z_{i} ∈ ℂ, 1 ≤ i ≤ n}. For z = (z_{1},…,z_{n})^{T}, w = (w_{1},…,w_{n})^{T} ∈ ℂ^{n} and α ∈ F,
define
z ⊕ w = (z_{1} + w_{1},…,z_{n} + w_{n})^{T} and α ⊙ z = (αz_{1},…,αz_{n})^{T}.
Then, verify that ℂ^{n} forms a vector space over ℂ (called complex vector space) as well as
over ℝ (called real vector space). In general, we assume ℂ^{n} to be a complex vector
space.
Remark 3.3.1.5. If F = ℂ then i(1,0) = (i,0) is allowed. Whereas, if F = ℝ then i(1,0)
doesn't make sense as i ∉ ℝ.
Fix m,n ∈ ℕ and let M_{m,n}(ℂ) = {A_{m×n} = [a_{ij}]|a_{ij}∈ ℂ}. For A,B ∈ M_{m,n}(ℂ) and α ∈ ℂ, define
(A + αB)_{ij} = a_{ij} + αb_{ij}. Then M_{m,n}(ℂ) is a complex vector space. If m = n, the vector space
M_{m,n}(ℂ) is denoted by M_{n}(ℂ).
Let S be a non-empty set and let ℝ^{S} = {f|f is a function from S to ℝ}. For f,g ∈ ℝ^{S} and
α ∈ ℝ, define (f + αg)(x) = f(x) + αg(x), for all x ∈ S. Then, ℝ^{S} is a real vector space. In
particular,
for S = ℕ, observe that ℝ^{ℕ}, consisting of all real sequences, forms a real vector space.
Let V be the set of all bounded real sequences. Then V is a real vector space.
Let V be the set of all real sequences that converge to 0. Then V is a real vector
space.
Let S be the set of all real sequences that converge to 1. Then check that S is not a
vector space. Determine the conditions that fail.
Fix a,b ∈ ℝ with a < b and let C([a,b], ℝ) = {f : [a,b] → ℝ | f is continuous}. Then C([a,b], ℝ)
with (f + αg)(x) = f(x) + αg(x), for all x ∈ [a,b], is a real vector space.
Let C(ℝ, ℝ) = {f : ℝ → ℝ | f is continuous}. Then C(ℝ, ℝ) with (f + αg)(x) = f(x) + αg(x), for
all x ∈ ℝ, is a real vector space.
Fix a < b ∈ ℝ and let C^{2}((a,b), ℝ) = {f : (a,b) → ℝ | f′′ exists and f′′ is continuous}.
Then C^{2}((a,b), ℝ) with (f + αg)(x) = f(x) + αg(x), for all x ∈ (a,b), is a real vector
space.
Fix a < b ∈ ℝ and let C^{∞}((a,b), ℝ) = {f : (a,b) → ℝ | f is infinitely differentiable}.
Then C^{∞}((a,b), ℝ) with (f + αg)(x) = f(x) + αg(x), for all x ∈ (a,b), is a real vector
space.
Fix a < b ∈ ℝ. Then V = {f : (a,b) → ℝ|f′′ + f′ + 2f = 0} is a real vector space.
Let ℝ[x] = {p(x) | p(x) is a real polynomial in x}. Then, the usual addition of polynomials
and α(a_{0} + a_{1}x + ⋯ + a_{n}x^{n}) = (αa_{0}) + (αa_{1})x + ⋯ + (αa_{n})x^{n}, for α ∈ ℝ, give ℝ[x] a real vector space
structure.
Fix n ∈ ℕ and let ℝ[x;n] = {p(x) ∈ ℝ[x] | p(x) has degree ≤ n}. Then, the usual addition of
polynomials and α(a_{0} + a_{1}x + ⋯ + a_{n}x^{n}) = (αa_{0}) + (αa_{1})x + ⋯ + (αa_{n})x^{n}, for α ∈ ℝ, give ℝ[x;n] a real
vector space structure.
Let ℂ[x] = {p(x) | p(x) is a complex polynomial in x}. Then, the usual addition of
polynomials and α(a_{0} + a_{1}x + ⋯ + a_{n}x^{n}) = (αa_{0}) + (αa_{1})x + ⋯ + (αa_{n})x^{n}, for α ∈ ℂ, give ℂ[x] a complex
vector space structure.
Let V = {0}. Then V is a real as well as a complex vector space.
Let ℝ^{+} = {x ∈ ℝ|x > 0}. Then
ℝ^{+} is not a vector space under usual operations of addition and scalar multiplication.
ℝ^{+} is a real vector space with 1 as the additive identity if we define, for x, y ∈ ℝ^{+} and α ∈ ℝ,
x ⊕ y = xy and α ⊙ x = x^{α}.
For any α ∈ ℝ and x = (x_{1},x_{2})^{T}, y = (y_{1},y_{2})^{T} ∈ ℝ^{2}, define
x ⊕ y = (x_{1} + y_{1} + 1, x_{2} + y_{2} - 3)^{T} and α ⊙ x = (αx_{1} + α - 1, αx_{2} - 3α + 3)^{T}.
Then ℝ^{2} is a real vector space with (-1,3)^{T} as the additive identity.
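Non-standard operations like these can be checked mechanically. The sketch below (in Python with NumPy) uses shifted operations consistent with the stated additive identity (-1,3)^{T}; the exact formulas are our reconstruction, chosen so that the later computation (x,0)^{T} ⊕ (y,0)^{T} = (x+y+1,-3)^{T} holds:

```python
import numpy as np

# Shifted vector-space operations on R^2
def add(x, y):
    return np.array([x[0] + y[0] + 1, x[1] + y[1] - 3])

def smul(a, x):
    return np.array([a * x[0] + a - 1, a * x[1] - 3 * a + 3])

e = np.array([-1.0, 3.0])           # candidate additive identity
x = np.array([2.0, 5.0])
y = np.array([-4.0, 1.0])

print(np.allclose(add(x, e), x))                # identity axiom
print(np.allclose(add(x, y), add(y, x)))        # commutativity
print(np.allclose(smul(2, add(x, y)),
                  add(smul(2, x), smul(2, y)))) # distributivity
```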
Let V = {A = [a_{ij}] ∈ M_{n}(ℂ)|a_{11} = 0}. Then V is a complex vector space.
Let V = {A = [a_{ij}] ∈ M_{n}(ℂ)|A = A^{*}}. Then V is a real vector space but not a complex vector
space.
Let V and W be vector spaces over F, with operations (+,∙) and (⊕,⊙), respectively. Let
V × W = {(v,w)|v ∈ V,w ∈ W}. Then V × W forms a vector space over F, if for every
(v_{1},w_{1}), (v_{2},w_{2}) ∈ V × W and α ∈ F, we define
(v_{1},w_{1}) + (v_{2},w_{2}) = (v_{1} + v_{2}, w_{1} ⊕ w_{2}) and α(v_{1},w_{1}) = (α ∙ v_{1}, α ⊙ w_{1}).
v_{1} + v_{2} and w_{1}⊕ w_{2} on the right hand side mean vector addition in V and W, respectively.
Similarly, α ∙ v_{1} and α ⊙ w_{1} correspond to scalar multiplication in V and W, respectively.
Let ℚ be the set of scalars. Then ℝ is a vector space over ℚ. As e, π ∉ ℚ, these real
numbers are vectors but not scalars in this space.
Similarly, ℂ is a vector space over ℚ. Since e - π, i ∉ ℚ, these complex numbers are
vectors but not scalars in this space.
Let ℤ_{5} = {0,1,2,3,4} with addition and multiplication carried out modulo 5.
Then, V = {(a,b) | a,b ∈ ℤ_{5}} is a vector space over ℤ_{5} having 25 vectors.
Note that all our vector spaces, except the last three, are linear spaces.
From now on, we will use ‘u + v’ for ‘u ⊕ v’ and ‘αu or α ⋅ u’ for ‘α ⊙ u’.
Exercise 3.3.1.6.
Verify the axioms for vector spaces that appear in Example3.3.1.4.
Does the set V given below form a real/complex or both real and complex vector space? Give reasons for your answer.
For x = (x_{1},x_{2})^{T}, y = (y_{1},y_{2})^{T} ∈ ℝ^{2}, define x + y = (x_{1} + y_{1}, x_{2} + y_{2})^{T} and αx = (αx_{1}, 0)^{T} for all α ∈ ℝ.
Let V = .
Let V = .
Let V = {(x,y,z)^{T }|x + y + z = 1}.
Let V = {(x,y)^{T }∈ ℝ^{2}|x ⋅ y = 0}.
Let V = {(x,y)^{T }∈ ℝ^{2}|x = y^{2}}.
Let V = {α(1,1,1)^{T } + β(1,1,-1)^{T }|α,β ∈ ℝ}.
Let V = ℝ with x ⊕ y = x - y and α ⊙ x = -αx, for all x,y ∈ V and α ∈ ℝ.
Let V = ℝ^{2}. Define (x_{1},y_{1})^{T} ⊕ (x_{2},y_{2})^{T} = (x_{1} + x_{2}, 0)^{T} and α ⊙ (x_{1},y_{1})^{T} = (αx_{1}, 0)^{T}, for α, x_{1}, x_{2}, y_{1}, y_{2} ∈ ℝ.
3.1.A Subspaces
Definition 3.3.1.7.[Vector Subspace] Let V be a vector space over F. Then, a non-empty
subset S of V is called a subspace of V if S is also a vector space with vector addition and
scalar multiplication inherited from V.
Example 3.3.1.8.
If V is a vector space then V and {0} are subspaces, called trivial subspaces.
The real vector space ℝ has no non-trivial subspace.
W = {x ∈ ℝ^{3}|[1,2,-1]x = 0} is a plane in ℝ^{3} containing 0 (so a subspace).
W = {x ∈ ℝ^{3} | x = α(1,2,-1)^{T}, α ∈ ℝ} is a line in ℝ^{3} containing 0 (so a subspace).
The vector space ℝ[x;n] is a subspace of ℝ[x].
Prove that C^{2}((a,b), ℝ) is a subspace of C((a,b), ℝ).
Prove that W = {(x,0)^{T }∈ ℝ^{2}|x ∈ ℝ} is a subspace of ℝ^{2}.
Is the set of sequences converging to 0 a subspace of the set of all bounded sequences?
Let V be the vector space of Example 3.3.1.4.19. Then
S = {(x,0)^{T }|x ∈ ℝ} is not a subspace of V as (x,0)^{T }⊕(y,0)^{T } = (x+y+1,-3)^{T }⁄∈ S.
W = {(x,3)^{T }|x ∈ ℝ} is a subspace of V.
The vector space ℝ^{+} defined in Example 3.3.1.4.18 is not a subspace of ℝ.
Let V(F) be a vector space and W ⊆ V, W ≠ ∅. We now prove a result which implies that, to check
whether W is a subspace, we need to verify only one condition.
Theorem 3.3.1.9. Let V(F) be a vector space and W ⊆ V, W ≠ ∅. Then W is a subspace of V if and only if αu + βv ∈ W whenever α,β ∈ F and u,v ∈ W.
Proof. Let W be a subspace of V and let u,v ∈ W. Then, for every α,β ∈ F, αu,βv ∈ W and hence
αu + βv ∈ W.
Now, we assume that αu + βv ∈ W, whenever α,β ∈ F and u,v ∈ W. To show, W is a subspace of
V:
Taking α = 1 and β = 1, we see that u + v ∈ W, for every u,v ∈ W.
Taking α = 0 and β = 0, we see that 0 ∈ W.
Taking β = 0, we see that αu ∈ W, for every α ∈ F and u ∈ W. Hence, using
Theorem 3.3.1.3.3, -u = (-1)u ∈ W as well.
The commutative and associative laws of vector addition hold as they hold in V.
The axioms related with scalar multiplication and the distributive laws also hold as they
hold in V.
Thus, one obtains the required result. _
Exercise 3.3.1.10.
Determine all the subspaces of ℝ and ℝ^{2}.
Prove that a line in ℝ^{2}is a subspace if and only if it passes through (0,0) ∈ ℝ^{2}.
Are all the sets given below subspaces of ([-1,1])?
W = {f ∈ C([-1,1])|f(1∕2) = 0}.
W = {f ∈ C([-1,1])|f(-1∕2) = 0,f(1∕2) = 0}.
W = {f ∈ C([-1,1])|f′() exists}.
Are all the sets given below subspaces of ℝ[x]?
W = {f(x) ∈ ℝ[x]|deg(f(x)) = 3}.
W = {f(x) ∈ ℝ[x]|deg(f(x)) = 0}.
W = {f(x) ∈ ℝ[x]|f(1) = 0}.
W = {f(x) ∈ ℝ[x]|f(0) = 0,f(1∕2) = 0}.
Which of the following are subspaces of ℝ^{n}(ℝ)?
{(x_{1},x_{2},…,x_{n})^{T }|x_{1}≥ 0}.
{(x_{1},x_{2},…,x_{n})^{T }|x_{1}is rational}.
{(x_{1},x_{2},…,x_{n})^{T }||x_{1}|≤ 1}.
Among the following, determine the subspaces of the complex vector space ℂ^{n}.
Fix n ∈ ℕ. Then, is W a subspace of M_{n}(ℝ), where
W = {A ∈ M_{n}(ℝ) | A is upper triangular}?
W = {A ∈ M_{n}(ℝ) | A is symmetric}?
W = {A ∈ M_{n}(ℝ) | A is skew-symmetric}?
W = {A | A is a diagonal matrix}?
W = {A | trace(A) = 0}?
W = {A ∈ M_{n}(ℝ) | A^{T} = 2A}?
W = {A = [a_{ij}] | a_{11} + a_{21} + a_{34} = 0}?
Fix n ∈ ℕ. Then, is W = {A = [a_{ij}] | a_{11} + a_{22} = 0} a subspace of the complex vector space M_{n}(ℂ)? What if M_{n}(ℂ) is a real vector space?
Prove that the following sets are not subspaces of M_{n}(ℝ).
G = {A ∈ M_{n}(ℝ)|det(A) = 0}.
G = {A ∈ M_{n}(ℝ)|det(A)≠0}.
G = {A ∈ M_{n}(ℝ)|det(A) = 1}.
3.1.B Linear Span
Definition 3.3.1.11. [Linear Combination] Let V be a vector space over F and let u_{1},…,u_{n} ∈ V. Then, a vector u ∈ V is said to be a linear combination of u_{1},…,u_{n} if we can find scalars
α_{1},…,α_{n} ∈ F such that u = α_{1}u_{1} + ⋯ + α_{n}u_{n}.
Example 3.3.1.12.
(3,4,3) = 2(1,1,1) + (1,2,1) but we cannot find a,b ∈ ℝ such that (3,4,5) = a(1,1,1) +
b(1,2,1).
Is (4,5,5) a linear combination of (1,0,0), (2,1,0), and (3,3,1)?
Solution: (4,5,5) is a linear combination if the linear system
a(1,0,0) + b(2,1,0) + c(3,3,1) = (4,5,5)    (3.3.1.1)
in the unknowns a,b,c ∈ ℝ has a solution. Clearly, Equation (3.3.1.1) has the solution a = 9, b = -10
and c = 5.
Is (4,5,5) a linear combination of the vectors (1,2,3), (-1,1,4) and (3,3,2)?
Solution: The vector (4,5,5) is a linear combination if the linear system
a(1,2,3) + b(-1,1,4) + c(3,3,2) = (4,5,5)    (3.3.1.2)
in the unknowns a,b,c ∈ ℝ has a solution. The RREF of the corresponding augmented
matrix equals , implying an infinite number of solutions. For example,
(4,5,5) = 3(1,2,3) - (-1,1,4).
Is (4,5,5) a linear combination of the vectors (1,2,1), (1,0,-1) and (1,1,0)?
Solution: The vector (4,5,5) is a linear combination if the linear system
a(1,2,1) + b(1,0,-1) + c(1,1,0) = (4,5,5)    (3.3.1.3)
in the unknowns a,b,c ∈ ℝ has a solution. The RREF of the corresponding augmented matrix
equals . So, Equation (3.3.1.3) has no solution. Thus, (4,5,5) is not a linear
combination of the given vectors.
Exercise3.3.1.13.
Let x ∈ ℝ^{3}. Prove that x^{T} is a linear combination of (1,0,0), (2,1,0) and (3,3,1). Is this linear combination unique? That is, does there exist (a,b,c) ≠ (e,f,g) with x^{T} =
a(1,0,0) + b(2,1,0) + c(3,3,1) = e(1,0,0) + f(2,1,0) + g(3,3,1)?
Find condition(s) on x,y,z ∈ ℝ such that (x,y,z) is a linear combination of
(1,2,3),(-1,1,4) and (3,3,2).
(1,2,1),(1,0,-1) and (1,1,0).
(1,1,1),(1,1,0) and (1,-1,0).
Definition 3.3.1.14. [Linear Span] Let V be a vector space over F and S = {u_{1},…,u_{n}} ⊆ V. Then,
the linear span of S, denoted LS(S), equals
LS(S) = {α_{1}u_{1} + ⋯ + α_{n}u_{n} | α_{i} ∈ F, 1 ≤ i ≤ n}.
If S is an empty set, we define LS(S) = {0}.
Example 3.3.1.15. For the set S given below, determine LS(S).
S = {(1,1,1)^{T}, (2,1,3)^{T}}. What is the geometrical representation of LS(S)?
Solution: LS(S) = {a(1,1,1)^{T} + b(2,1,3)^{T} | a,b ∈ ℝ} = {(a + 2b, a + b, a + 3b)^{T} | a,b ∈ ℝ}.
For the geometrical representation, we need to find the conditions on x, y and z such that
(a + 2b, a + b, a + 3b) = (x,y,z), or equivalently, such that a + 2b = x, a + b = y, a + 3b = z
has a solution in a, b. Check that the RREF of the augmented matrix yields the condition
2x - y - z = 0. Hence, LS(S) is the plane {(x,y,z)^{T} ∈ ℝ^{3} | 2x - y - z = 0}.
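The plane's normal can also be obtained directly as a cross product, which gives a quick numerical check of the condition (a sketch in Python with NumPy; the coefficients a, b are an arbitrary choice):

```python
import numpy as np

u = np.array([1.0, 1.0, 1.0])
v = np.array([2.0, 1.0, 3.0])

# Normal to the plane LS({u, v}) is the cross product u x v
n = np.cross(u, v)
print(n.astype(int).tolist())  # [2, -1, -1], i.e. the plane 2x - y - z = 0

# Every linear combination a*u + b*v satisfies the plane equation
a, b = 4.0, -9.0
w = a * u + b * v
print(np.isclose(n @ w, 0.0))  # True
```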
S = {(1,2,1)^{T },(1,0,-1)^{T },(1,1,0)^{T }}. What is the geometrical representation of LS(S)?
Solution: As above, we need to find condition(s) on x, y, z such that the linear system
a(1,2,1) + b(1,0,-1) + c(1,1,0) = (x,y,z)    (3.3.1.4)
in the unknowns a, b, c is consistent. An application of GJE to Equation (3.3.1.4) gives the
condition x - y + z = 0. Thus, LS(S) = {(x,y,z)^{T} ∈ ℝ^{3} | x - y + z = 0}.
S = {(1,2,3)^{T}, (-1,1,4)^{T}, (3,3,2)^{T}}.
Solution: As above, we need to find condition(s) on x, y, z such that the linear system
a(1,2,3) + b(-1,1,4) + c(3,3,2) = (x,y,z)
in the unknowns a, b, c is consistent. An application of the GJE method gives
5x - 7y + 3z = 0 as the required condition. Thus, LS(S) = {(x,y,z)^{T} ∈ ℝ^{3} | 5x - 7y + 3z = 0}.
S = {(1,2,3,4)^{T },(-1,1,4,5)^{T },(3,3,2,3)^{T }}⊆ ℝ^{4}. Solution: The readers are advised to show that
Exercise3.3.1.16.For each S, determine the geometric representation of LS(S).
S = {-1}⊆ ℝ.
S = {π}⊆ ℝ.
S = {(1,0,1)^{T },(0,1,0)^{T },(3,0,3)^{T }}⊆ ℝ^{3}.
S = {(1,2,1)^{T },(2,0,1)^{T },(1,1,1)^{T }}⊆ ℝ^{3}.
S = {(1,0,1,1)^{T },(0,1,0,1)^{T },(3,0,3,1)^{T }}⊆ ℝ^{4}.
Definition 3.3.1.17.[Finite Dimensional Vector Space] Let V be a vector space over F. Then
V is called finite dimensional if there exists S ⊆ V such that S has a finite number of elements
and V = LS(S). If such an S does not exist then V is called infinite dimensional.
Example 3.3.1.18.
{(1,2)^{T },(2,1)^{T }} spans ℝ^{2}. Thus, ℝ^{2} is finite dimensional.
{1,1 + x,1 - x + x^{2},x^{3},x^{4},x^{5}} spans ℂ[x;5]. Thus, ℂ[x;5] is finite dimensional.
Fix n ∈ ℕ. Then, ℝ[x;n] is finite dimensional as ℝ[x;n] = LS({1,x,x^{2},…,x^{n}}).
ℂ[x] is not finite dimensional as the degree of a polynomial can be an arbitrarily large positive
integer. Indeed, verify that ℂ[x] = LS({1, x, x^{2},…,x^{n},…}).
The vector space ℝ over ℚ is infinite dimensional.
The vector space ℂ over ℚ is infinite dimensional.
Lemma 3.3.1.19 (Linear Span is a Subspace). Let V be a vector space over F and S ⊆ V. Then LS(S) is a subspace of V.
Proof. By definition, 0 ∈ LS(S). So, LS(S) is non-empty. Let u, v ∈ LS(S). To show: au + bv ∈ LS(S)
for all a, b ∈ F. As u, v ∈ LS(S), there exist n ∈ ℕ, vectors w_{i} ∈ S and scalars α_{i}, β_{i} ∈ F such that
u = α_{1}w_{1} + ⋯ + α_{n}w_{n} and v = β_{1}w_{1} + ⋯ + β_{n}w_{n}. Hence,
au + bv = (aα_{1} + bβ_{1})w_{1} + ⋯ + (aα_{n} + bβ_{n})w_{n} ∈ LS(S),
as aα_{i} + bβ_{i} ∈ F for 1 ≤ i ≤ n. Thus, by Theorem 3.3.1.9, LS(S) is a vector subspace. _
Remark 3.3.1.20. Let V be a vector space over F. If W is a subspace of V and S ⊆ W then LS(S) is a subspace of W as well.
Theorem 3.3.1.21. Let V be a vector space over F and S ⊆ V. Then LS(S) is the smallest subspace of V containing S.
Proof. For every u ∈ S, u = 1 ⋅ u ∈ LS(S). Thus, S ⊆ LS(S). We need to show that LS(S) is the smallest
subspace of V containing S. So, let W be any subspace of V containing S. Then, by Remark 3.3.1.20,
LS(S) ⊆ W and hence the result follows. _
Definition 3.3.1.22. Let V be a vector space over F.
Let S and T be two subsets of V. Then, the sum of S and T, denoted S + T, equals
{s + t | s ∈ S, t ∈ T}. For example,
if V = ℝ, S = {0,1,2,3,4,5,6} and T = {5,10,15} then S + T = {5,6,…,21}.
if V = ℝ^{2}, S = and T = then S + T = .
if V = ℝ^{2}, S = and T = LS then S + T = .
Let P and Q be two subspaces of V. Then, we define their sum, denoted P + Q, as
P + Q = {u + v | u ∈ P, v ∈ Q}. For example, P + Q = ℝ^{2}, if
P = {(x,0)^{T }|x ∈ ℝ} and Q = {(0,x)^{T }|x ∈ ℝ} as (x,y) = (x,0) + (0,y).
P = {(x,0)^{T }|x ∈ ℝ} and Q = {(x,x)^{T }|x ∈ ℝ} as (x,y) = (x - y,0) + (y,y).
P = LS((1,2)^{T}) and Q = LS((2,1)^{T}) as (x,y) = ((2y - x)/3)(1,2) + ((2x - y)/3)(2,1).
We leave the proof of the next result to the reader.
Lemma 3.3.1.23.Let V be a vector space over F and let P and Q be two subspaces of V. ThenP + Q is the smallest subspace of V containing both P and Q.
Exercise3.3.1.24.
Let a ∈ ℝ^{2}, a ≠ 0. Then show that {x ∈ ℝ^{2} | a^{T}x = 0} is a non-trivial subspace of ℝ^{2}. Geometrically, what does this set represent in ℝ^{2}?
Find all subspaces of ℝ^{3}.
Prove that {(x,y,z)^{T }∈ ℝ^{3}|ax + by + cz = d} is a subspace of ℝ^{3}if and only if d = 0.
Let W = {f(x) ∈ ℝ[x] | deg(f(x)) = 5}. Prove that W is not a subspace of ℝ[x].
Determine all subspaces of the vector space in Example3.3.1.4.19.
Let U = and W = be subspaces of M_{2}(ℝ). Determine U ∩ W. Is M_{2}(ℝ) = U ∪ W? What is U + W?
Let W and U be two subspaces of a vector space V over F.
Prove that W ∩ U is a subspace of V.
Give examples of W and U such that W ∪ U is not a subspace of V.
Determine conditions on W and U such that W ∪ U is a subspace of V.
Prove that LS(W ∪ U) = W + U.
Let S = {x_{1},x_{2},x_{3},x_{4}}, where x_{1} = (1,0,0)^{T}, x_{2} = (1,1,0)^{T}, x_{3} = (1,2,0)^{T} and x_{4} = (1,1,1)^{T}. Then, determine all x_{i} such that LS(S) = LS(S \ {x_{i}}).
Let W = LS((1,0,0)^{T},(1,1,0)^{T}) and U = LS((1,1,1)^{T}). Prove that W + U = ℝ^{3} and W ∩ U = {0}. If u ∈ ℝ^{3}, determine u_{W} ∈ W and u_{U} ∈ U such that u = u_{W} + u_{U}. Is it necessary that u_{W} and u_{U} are unique?
Let W = LS((1,-1,0),(1,1,0)) and U = LS((1,1,1),(1,2,1)). Prove that W + U = ℝ^{3} and W ∩ U ≠ {0}. Find u ∈ ℝ^{3} such that when we write u = u_{W} + u_{U}, with u_{W} ∈ W and u_{U} ∈ U, the vectors u_{W} and u_{U} are not unique.
3.1.C Fundamental Subspaces Associated with a Matrix
Definition 3.3.1.25.Let A ∈ M_{m,n}(ℂ). Then, we use the functions P : ℂ^{n}→ ℂ^{m} and Q : ℂ^{m}→ ℂ^{n}
defined by P(x) = Ax and Q(y) = A^{*}y, respectively, for x ∈ ℂ^{n} and y ∈ ℂ^{m}, to get the four
fundamental subspaces associated with A, namely,
Col(A) = {Ax | x ∈ ℂ^{n}} = Rng(P) is a subspace of ℂ^{m}, called the Column / Range space. Observe that Col(A) is the linear span of the columns of A.
Null(A) = {x|Ax = 0,x ∈ ℂ^{n}} = Null(P) is a subspace of ℂ^{n}, called the Null space.
Col(A^{*}) = {A^{*}y | y ∈ ℂ^{m}} = Rng(Q) is a subspace of ℂ^{n}, the linear span of the conjugate transposes of the rows of A = [a_{ij}]. If
A ∈ M_{m,n}(ℝ) then Col(A^{*}) reduces to Row(A) = {x^{T}A | x ∈ ℝ^{m}}, the row space of A.
Null(A^{*}) = {y | A^{*}y = 0, y ∈ ℂ^{m}} = Null(Q) is a subspace of ℂ^{m}. If A ∈ M_{m,n}(ℝ) then Null(A^{*}) reduces
to Null(A^{T}) = {x ∈ ℝ^{m} | x^{T}A = 0^{T}}.
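The null spaces can be computed numerically via the singular value decomposition: the right-singular vectors belonging to (numerically) zero singular values span Null(A). A sketch in Python with NumPy, on a rank-1 sample matrix of our own:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])   # rank 1

# Null(A): rows of Vt past the numerical rank span the null space
_, s, Vt = np.linalg.svd(A)
null_basis = Vt[np.sum(s > 1e-10):]
print(null_basis.shape[0])              # 2 = n - rank(A)
print(np.allclose(A @ null_basis.T, 0)) # True

# Null(A^T) lives in R^m (here m = 2)
_, s2, Vt2 = np.linalg.svd(A.T)
left_null = Vt2[np.sum(s2 > 1e-10):]
print(np.allclose(A.T @ left_null.T, 0))  # True
```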
Remark 3.3.1.27. Let A ∈ M_{m,n}(ℝ). Then, in Example 3.3.1.26, observe that the direction ratios of Col(A) match vector(s) in Null(A^{T}). Similarly, the direction ratios of Row(A)
match vectors in Null(A). What are the relationships in case A ∈ M_{m,n}(ℂ)? We will come back to these spaces again and again.
Let V be a vector space over either ℝ or ℂ. Then, we have learnt that
for any S ⊆ V, LS(S) is again a vector space. Moreover, LS(S) is the smallest subspace
containing S.
unless S = ∅ or S = {0}, LS(S) has an infinite number of vectors.
Therefore, the following questions arise:
Are there conditions under which LS(S_{1}) = LS(S_{2}) for S_{1}≠S_{2}?
Is it always possible to find S so that LS(S) = V?
Suppose we have found S ⊆ V such that LS(S) = V. Can we find the minimum
number of vectors in S?
We try to answer these questions in the subsequent sections.
3.2 Linear Independence
Definition 3.3.2.1.[Linear Independence and Dependence] Let S = {u_{1},…,u_{m}} be a non-empty
subset of a vector space V over F. Then the set S is said to be linearly independent if the system of
linear equations
α_{1}u_{1} + α_{2}u_{2} + ⋯ + α_{m}u_{m} = 0    (3.3.2.1)
in the unknowns α_{i}’s, 1 ≤ i ≤ m, has only the trivial solution. If the system (3.3.2.1) has a non-trivial
solution then the set S is said to be linearly dependent.
If the set S has infinitely many vectors then S is said to be linearly independent if for every finite
subset T of S, T is linearly independent, else S is linearly dependent.
Let V be a vector space over F and S = {u_{1},…,u_{m}} ⊆ V with S≠∅. Then, one needs to solve the linear
system of equations
α_{1}u_{1} + ⋯ + α_{m}u_{m} = 0    (3.3.2.2)
in the unknowns α_{1},…,α_{m} ∈ F. If α_{1} = ⋯ = α_{m} = 0 is the only solution of (3.3.2.2) then S is a
linearly independent subset of V. Otherwise, S is a linearly dependent subset of V. Since one is
solving a linear system over F, linear independence and dependence depend on F, the set of
scalars.
Example 3.3.2.2. Is the set S a linearly independent set? Give reasons.
Let S = {(1,2,1)^{T},(2,1,4)^{T},(3,3,5)^{T}}. Solution: Consider the system a(1,2,1)^{T} + b(2,1,4)^{T} + c(3,3,5)^{T} = (0,0,0)^{T} in the unknowns
a,b and c. As the rank of the coefficient matrix is 2, which is less than 3, the number of unknowns, the system has
a non-trivial solution. Thus, S is a linearly dependent subset of ℝ^{3}.
Let S = {(1,1,1)^{T},(1,1,0)^{T},(1,0,1)^{T}}. Solution: Consider the system a(1,1,1)^{T} + b(1,1,0)^{T} + c(1,0,1)^{T} = (0,0,0)^{T} in the unknowns
a,b and c. As the rank of the coefficient matrix is 3, the number of unknowns, the system has
only the trivial solution. Hence, S is a linearly independent subset of ℝ^{3}.
Consider ℂ as a complex vector space and let S = {1,i}. Solution: Since ℂ is a complex vector space, i ⋅ 1 + (-1) ⋅ i = i - i = 0 is a non-trivial linear combination equal to 0. So, S is a linearly
dependent subset of the complex vector space ℂ.
Consider ℂ as a real vector space and let S = {1,i}. Solution: Consider the linear system a ⋅ 1 + b ⋅ i = 0 in the unknowns a,b ∈ ℝ. Since
a,b ∈ ℝ, equating real and imaginary parts, we get a = b = 0. So, S is a linearly independent
subset of the real vector space ℂ.
Let A ∈ M_{m,n}(ℂ). If Rank(A) < min{m,n} then the rows of A are linearly dependent. Solution: Let B = RREF(A). Then, there exists an invertible matrix P = [p_{ij}] such that
B = PA. Since Rank(A) < min{m,n}, B[m,:] = 0^{T}. Thus, 0^{T} = B[m,:] = ∑_{i=1}^{m}p_{mi}A[i,:]. As P is invertible, at least one p_{mi}≠0. Thus, the required result follows.
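The rank test used in the first two parts of the example is easy to automate: a finite set of vectors in ℝ^{n} is linearly independent exactly when the matrix having those vectors as columns has rank equal to the number of vectors. A small numpy sketch, reusing the two sets from the example:

```python
import numpy as np

def is_linearly_independent(vectors):
    """True when the matrix with the given vectors as columns has rank
    equal to the number of vectors."""
    A = np.column_stack(vectors)
    return bool(np.linalg.matrix_rank(A) == A.shape[1])

# The two sets from Example 3.3.2.2.
S1 = [np.array([1., 2., 1.]), np.array([2., 1., 4.]), np.array([3., 3., 5.])]
S2 = [np.array([1., 1., 1.]), np.array([1., 1., 0.]), np.array([1., 0., 1.])]
```

Here `is_linearly_independent(S1)` is False (the third vector is the sum of the first two) while `is_linearly_independent(S2)` is True.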
3.2.A Basic Results related to Linear Independence
The reader is expected to supply the proof of parts that are not given.
Proposition 3.3.2.3.Let V be a vector space over F. Then,
0, the zero-vector, cannot belong to a linearly independent set.
every non-empty subset of a linearly independent set in V is also linearly independent.
a set containing a linearly dependent set of V is also linearly dependent.
Proof. Let S = {0 = u_{1},u_{2},…,u_{n}}. Then, 1 ⋅ u_{1} + 0 ⋅ u_{2} + ⋯ + 0 ⋅ u_{n} = 0. Hence, the system
α_{1}u_{1} + ⋯ + α_{n}u_{n} = 0 has the non-trivial solution [α_{1},α_{2},…,α_{n}] = [1,0,…,0]. Thus, the set S is linearly
dependent. _
Theorem 3.3.2.4. Let V be a vector space over F. Let S = {u_{1},…,u_{k}} ⊆ V with S≠∅. If T ⊆ LS(S) is such that m = |T| > k then T is a linearly dependent set.
Proof. Let T = {w_{1},…,w_{m}}. As w_{i} ∈ LS(S), there exist a_{ij} ∈ F such that w_{i} = a_{i1}u_{1} + ⋯ + a_{ik}u_{k}, for
1 ≤ i ≤ m. So, in matrix notation, [w_{1},…,w_{m}]^{T} = A[u_{1},…,u_{k}]^{T}, where A = [a_{ij}] is an m × k matrix.
As m > k, using Corollary 2.2.2.35, the system A^{T }x = 0 has a non-trivial solution, say
x^{T } = [α_{1},…,α_{m}]≠0^{T }. That is, ∑_{i=1}^{m}α_{i}A^{T }[:,i] = 0. Or equivalently, ∑_{i=1}^{m}α_{i}A[i,:] = 0^{T }.
Thus, ∑_{i=1}^{m}α_{i}w_{i} = ∑_{i=1}^{m}α_{i}(a_{i1}u_{1} + ⋯ + a_{ik}u_{k}) = ∑_{j=1}^{k}(∑_{i=1}^{m}α_{i}a_{ij})u_{j} = 0, a non-trivial linear combination.
Thus, the set T is linearly dependent. _
Corollary 3.3.2.5. Fix n ∈ ℕ. Then, any set S ⊆ ℝ^{n} with |S| ≥ n + 1 is linearly dependent.
Proof. Observe that ℝ^{n} = LS({e_{1},…,e_{n}}), where e_{i} = I_{n}[:,i], the i-th column of I_{n}. Hence, the
required result follows using Theorem 3.3.2.4. _
Theorem 3.3.2.6. Let V be a vector space over F and let S be a linearly independent subset of V. Suppose v ∈ V. Then S ∪ {v} is linearly dependent if and only if v ∈ LS(S).
Proof. Let us assume that S ∪ {v} is linearly dependent. Then, there exist v_{1},…,v_{p} ∈ S such
that the system α_{1}v_{1} + ⋯ + α_{p}v_{p} + α_{p+1}v = 0 has a non-trivial solution, say α_{i} = c_{i},
for 1 ≤ i ≤ p + 1. As the solution is non-trivial one of the c_{i}’s is non-zero. We claim that
c_{p+1}≠0.
For, if c_{p+1} = 0 then the system α_{1}v_{1} + ⋯ + α_{p}v_{p} = 0 in the unknowns α_{1},…,α_{p} has a non-trivial
solution [c_{1},…,c_{p}]. This contradicts Proposition 3.3.2.3.2 as {v_{1},…,v_{p}} is a subset of the linearly
independent set S. Thus, c_{p+1}≠0 and we get
as
-∈ F, for 1 ≤ i ≤ p. Hence, v is a linear combination of v_{1},…,v_{p}.
Now, assume that v ∈ LS(S). Then, there exist c_{i} ∈ F and v_{i} ∈ S such that
v = ∑_{i=1}^{p}c_{i}v_{i}. Thus, the system α_{1}v_{1} + ⋯ + α_{p}v_{p} + α_{p+1}v = 0 in the unknowns α_{i}'s has the
non-trivial solution [c_{1},…,c_{p},-1]. Hence, S ∪ {v} is linearly dependent. _
We now state a very important corollary of Theorem 3.3.2.6 without proof.
Corollary 3.3.2.7. Let V be a vector space over F and let S = {u_{1},…,u_{n}} ⊆ V. If S is
linearly dependent then there exists k, 2 ≤ k ≤ n, with LS(u_{1},…,u_{k}) = LS(u_{1},…,u_{k-1}).
linearly independent then v ∈ V \ LS(S) if and only if S ∪ {v} = {u_{1},…,u_{n},v} is also a linearly independent subset of V.
linearly independent then LS(S) = V if and only if each proper superset of S is linearly dependent.
3.2.B Application to Matrices
We leave the proof of the next result for readers.
Theorem 3.3.2.8.The following statements are equivalent for A ∈ M_{n}(ℂ).
A is invertible.
The columns of A are linearly independent.
The rows of A are linearly independent.
A generalization of Theorem 3.3.2.8 is stated next.
Theorem 3.3.2.9. Let A ∈ M_{m,n}(ℂ) with B = RREF(A). Then, the rows of A corresponding to the pivotal rows of B are linearly independent. Also, the columns of A corresponding to the pivotal columns of B are linearly independent.
Proof. Pivotal rows of B are linearly independent due to the pivotal 1’s. Now, let B_{1} be the submatrix
of B consisting of the pivotal rows of B. Let A_{1} be the submatrix of A which gives B_{1}. As the RREF
of a matrix is unique (see Corollary 2.2.2.17) there exists an invertible matrix Q such that QA_{1} = B_{1}.
So, if there exists c≠0 such that c^{T}A_{1} = 0^{T} then
0^{T} = c^{T}A_{1} = c^{T}Q^{-1}B_{1} = d^{T}B_{1}, with
d^{T} = c^{T}Q^{-1}≠0^{T} as Q is an invertible matrix (see Theorem 2.2.3.1). Thus, it contradicts the linear
independence of the pivotal rows of B.
Let B[:,i_{1}],…,B[:,i_{r}] be the pivotal columns of B, which are linearly independent due to the pivotal 1's.
Let B = PA. Then, for any scalars c_{1},…,c_{r}, the corresponding columns of A satisfy ∑_{t=1}^{r}c_{t}B[:,i_{t}] = P(∑_{t=1}^{r}c_{t}A[:,i_{t}]).
Hence, the required result follows as P is an invertible matrix. _
We give an example for better understanding.
Example 3.3.2.10. Let A = with RREF(A) = B = . Then
B[:,3] = -B[:,1] + 2B[:,2]. Thus, A[:,3] = -A[:,1] + 2A[:,2]. As the 1-st, 2-nd and 4-th columns
of B are linearly independent, the set {A[:,1],A[:,2],A[:,4]} is linearly independent. Also, note
that during the application of GJE to get RREF, we have interchanged the 3-rd and 4-th rows.
Hence, the rows A[1,:],A[2,:] and A[4,:] are linearly independent.
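Theorem 3.3.2.9 translates directly into a computation: sympy's `rref()` returns both the RREF and the pivotal column indices. A sketch with a hypothetical matrix (our own, chosen so that, as in the example above, the third column equals -1 times the first plus 2 times the second):

```python
from sympy import Matrix

# Hypothetical matrix whose third column equals -1*(first) + 2*(second).
A = Matrix([[1, 1, 1, 0],
            [2, 1, 0, 1],
            [1, 2, 3, 0]])
B, pivots = A.rref()                         # RREF and pivotal column indices
independent_cols = [A[:, j] for j in pivots] # a maximal independent column set
```

The pivots come out as `(0, 1, 3)`, so columns 1, 2 and 4 of A are linearly independent while the third is a combination of the first two.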
3.2.C Linear Independence and Uniqueness of Linear Combination
We end this section with a result that states that linear combination with respect to linearly
independent set is unique.
Lemma 3.3.2.11. Let S be a linearly independent set in a vector space V over F. Then each v ∈ LS(S) is a unique linear combination of vectors from S.
Proof. On the contrary, suppose there exists v ∈ LS(S) such that v = α_{1}v_{1} + ⋯ + α_{p}v_{p} and
v = β_{1}v_{1} + ⋯ + β_{p}v_{p}, for α_{i},β_{i} ∈ F and v_{i} ∈ S, for 1 ≤ i ≤ p. Equating the two expressions for v
gives
(α_{1} - β_{1})v_{1} + ⋯ + (α_{p} - β_{p})v_{p} = 0.    (3.3.2.3)
As {v_{1},…,v_{p}} ⊆ S is a linearly independent subset of LS(S), the system c_{1}v_{1} + ⋯ + c_{p}v_{p} = 0 in the
unknowns c_{1},…,c_{p} has only the trivial solution. Thus, each of the scalars α_{i} - β_{i} appearing in
Equation (3.3.2.3) must equal 0. That is, α_{i} - β_{i} = 0, for 1 ≤ i ≤ p. Thus, for 1 ≤ i ≤ p,
α_{i} = β_{i} and the result follows. _
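Lemma 3.3.2.11 is what makes coordinates well defined: for an independent set arranged as the columns of a matrix, the coefficient vector of any element of the span is the unique solution of a linear system. A numpy sketch with a hypothetical independent set in ℝ^{3}:

```python
import numpy as np

# Columns of V form a (hypothetical) linearly independent set in R^3, so
# any vector in their span has exactly one coefficient vector.
V = np.column_stack([[1., 0., 0.], [1., 1., 0.], [1., 1., 1.]])
v = 2 * V[:, 0] - 1 * V[:, 1] + 3 * V[:, 2]   # a known linear combination
coeffs = np.linalg.solve(V, v)                 # recovers it uniquely
```

`np.linalg.solve` succeeds precisely because V has full column rank; the recovered coefficients are (2, -1, 3).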
Exercise 3.3.2.12.
Consider the Euclidean plane ℝ^{2}. Let u_{1} = (1,0)^{T}. Determine a condition on u_{2} such that {u_{1},u_{2}} is a linearly independent subset of ℝ^{2}.
Let S = {(1,1,1,1)^{T},(1,-1,1,2)^{T},(1,1,-1,1)^{T}} ⊆ ℝ^{4}. Does (1,1,2,1)^{T} ∈ LS(S)? Furthermore, determine conditions on x,y,z and u such that (x,y,z,u)^{T} ∈ LS(S).
Show that S = {(1,2,3)^{T},(-2,1,1)^{T},(8,6,10)^{T}} ⊆ ℝ^{3} is linearly dependent.
Prove that {u_{1},…,u_{n}} ⊆ ℂ^{n} is linearly independent if and only if {Au_{1},…,Au_{n}} is linearly independent for every invertible matrix A.
Let V be a complex vector space and let A ∈ M_{n}(ℂ) be invertible. Then {u_{1},…,u_{n}} ⊆ V is a linearly independent subset of V if and only if {w_{1},…,w_{n}} ⊆ V is linearly independent, where w_{i} = ∑_{j=1}^{n}a_{ij}u_{j}, for 1 ≤ i ≤ n.
Find u,v,w ∈ ℝ^{4} such that {u,v,w} is linearly dependent whereas {u,v}, {u,w} and {v,w} are linearly independent.
Is {(1,0)^{T},(i,0)^{T}} a linearly independent subset of the real vector space ℂ^{2}?
Suppose V is a collection of vectors such that V is a real as well as a complex vector space. Then prove that {u_{1},…,u_{k},iu_{1},…,iu_{k}} is a linearly independent subset of V over ℝ if and only if {u_{1},…,u_{k}} is a linearly independent subset of V over ℂ.
Let V be a vector space and M be a subspace of V. For u,v ∈ V\M, define K = LS(M,u)
and H = LS(M,v). Then prove that v ∈ K if and only if u ∈ H.
Let A ∈ M_{n}(ℝ). Suppose x,y ∈ ℝ^{n}\{0} such that Ax = 3x and Ay = 2y. Then prove that x and y are linearly independent.
Let A = . Determine x,y,z ∈ ℝ^{3}\{0} such that Ax = 6x, Ay = 2y and Az = -2z. Use the vectors x,y and z obtained above to prove the following.
A^{2}v = 4v, where v = cy + dz for any c,d ∈ ℝ.
The set {x,y,z} is linearly independent.
Let P = [x,y,z] be a 3 × 3 matrix. Then P is invertible.
Let D = . Then AP = PD.
Prove that the rows/columns of
A ∈ M_{n}(ℂ) are linearly independent if and only if det(A)≠0.
A ∈ M_{n}(ℂ) span ℂ^{n}if and only if A is an invertible matrix.
a skew-symmetric matrix A of odd order are linearly dependent.
Let P and Q be subspaces of ℝ^{n} such that P + Q = ℝ^{n} and P ∩ Q = {0}. Prove that each u ∈ ℝ^{n} is uniquely expressible as u = u_{P} + u_{Q}, where u_{P} ∈ P and u_{Q} ∈ Q.
3.3 Basis of a Vector Space
Definition 3.3.3.1. Let S be a subset of a set T. Then S is said to be a maximal subset of T
having property P if
S has property P and
no proper superset of S in T has property P.
Example 3.3.3.2. Let T = {2,3,4,7,8,10,12,13,14,15}. Then a maximal subset of T consisting of
consecutive integers is S = {2,3,4}. Other such maximal subsets are {7,8}, {10} and {12,13,14,15}.
Note that {12,13} is not maximal. Why?
Definition 3.3.3.3. Let V be a vector space over F. Then S is called a maximal linearly independent subset of V if
S is linearly independent and
no proper superset of S in V is linearly independent.
Example 3.3.3.4.
In ℝ^{3}, the set S = {e_{1},e_{2}} is linearly independent but not maximal as S ∪{(1,1,1)^{T }} is
a linearly independent set containing S.
In ℝ^{3}, S = {(1,0,0)^{T },(1,1,0)^{T },(1,1,-1)^{T }} is a maximal linearly independent set as any
collection of 4 or more vectors from ℝ^{3} is linearly dependent (see Corollary 3.3.2.5).
Let S = {v_{1},…,v_{k}}⊆ ℝ^{n}. Now, form the matrix A = [v_{1},…,v_{k}] and let B = RREF(A).
Then, using Theorem 3.3.2.9, we see that if B[:,i_{1}],…,B[:,i_{r}] are the pivotal columns of
B then {v_{i_{1}},…,v_{i_{r}}} is a maximal linearly independent subset of S.
Theorem 3.3.3.5. Let V be a vector space over F and S a linearly independent set in V. Then S is maximal linearly independent if and only if LS(S) = V.
Proof. Let v ∈ V. As S is linearly independent, using Corollary 3.3.2.7.2, the set S ∪{v} is linearly
independent if and only if v ∈ V \ LS(S). Thus, the required result follows. _
Let V = LS(S) for some set S with |S| = k. Then, using Theorem 3.3.2.4, we see that if T ⊆ V is
linearly independent then |T|≤ k. Hence, a maximal linearly independent subset of V can have at
most k vectors. Thus, we arrive at the following important result.
Theorem 3.3.3.6. Let V be a vector space over F and let S and T be two finite maximal linearly independent subsets of V. Then |S| = |T|.
Proof. By Theorem 3.3.3.5, S and T are maximal linearly independent if and only if LS(S) = V = LS(T).
Now, use the previous paragraph to get the required result. _
Definition 3.3.3.7. Let V be a vector space over F with V≠{0}. Suppose V has a finite
maximal linearly independent set S. Then |S| is called the dimension of V, denoted dim(V).
By convention, dim({0}) = 0.
Example 3.3.3.8.
As {π} is a maximal linearly independent subset of ℝ, dim(ℝ) = 1.
As {(1,0,1)^{T },(0,1,1)^{T },(1,1,0)^{T }}⊆ ℝ^{3} is maximal linearly independent, dim(ℝ^{3}) = 3.
As {e_{1},…,e_{n}} is a maximal linearly independent set in ℝ^{n}, dim(ℝ^{n}) = n.
As {e_{1},…,e_{n}} is a maximal linearly independent subset of the complex vector space ℂ^{n},
dim(ℂ^{n}) = n.
Using Exercise 3.3.2.12.8, {e_{1},…,e_{n},ie_{1},…,ie_{n}} is a maximal linearly independent subset
of the real vector space ℂ^{n}. Thus, as a real vector space dim(ℂ^{n}) = 2n.
Let S = {v_{1},…,v_{k}}⊆ ℝ^{n}. Define A = [v_{1},…,v_{k}]. Then, using Example 3.3.3.4.3, we see
that dim(LS(S)) = Rank(A).
Definition 3.3.3.9.[Basis of a Vector Space] Let V be a vector space over F with V≠{0}.
Then a maximal linearly independent subset of V is called a basis of V. The vectors in a basis
are called basis vectors. Note that a basis of {0} is either not defined or is the empty set.
Definition 3.3.3.10.Let V be a vector space over F with V≠{0}. Then a set S ⊆ V is called
minimal spanning if LS(S) = V and no proper subset of S spans V.
Example 3.3.3.11. [Standard Basis] Fix n ∈ ℕ and let e_{i} = I_{n}[:,i], the i-th column of
the identity matrix. Then ℬ = {e_{1},…,e_{n}} is called the standard basis of ℝ^{n} or ℂ^{n}. In
particular,
ℬ = {e_{1}} = {1} is the standard basis of ℝ over ℝ.
ℬ = {e_{1},e_{2}} with e_{1} = (1,0)^{T} and e_{2} = (0,1)^{T} is the standard basis of ℝ^{2}.
ℬ = {e_{1},e_{2},e_{3}} = {(1,0,0)^{T},(0,1,0)^{T},(0,0,1)^{T}} is the standard basis of ℝ^{3}.
{1} is a basis of ℂ over ℂ.
{e_{1},…,e_{n},ie_{1},…,ie_{n}} is a basis of ℂ^{n} over ℝ. So, {1,i} is a basis of ℂ over ℝ.
Example 3.3.3.12.
Note that {-2} is a basis and a minimal spanning set of ℝ.
Let u_{1},u_{2},u_{3}∈ ℝ^{2}. Then, {u_{1},u_{2},u_{3}} can neither be a basis nor a minimal spanning
subset of ℝ^{2}.
{(1,1,-1)^{T },(1,-1,1)^{T },(-1,1,1)^{T }} is a basis and a minimal spanning subset of ℝ^{3}.
Let V = {(x,y,0)^{T}|x,y ∈ ℝ} ⊆ ℝ^{3}. Then ℬ = {(1,0,0)^{T},(1,3,0)^{T}} is a basis of V.
Let V = {(x,y,z)^{T} ∈ ℝ^{3}|x + y - z = 0} ⊆ ℝ^{3}. As each element (x,y,z)^{T} ∈ V satisfies
x + y - z = 0, or equivalently z = x + y, we see that
(x,y,z)^{T} = x(1,0,1)^{T} + y(0,1,1)^{T}.
Hence, {(1,0,1)^{T},(0,1,1)^{T}} forms a basis of V.
Let S = {1,2,…,n} and consider the vector space ℝ^{S} (see Example 3.3.1.4.8). Then,
for 1 ≤ i ≤ n, define e_{i}(j) = 1 if j = i and e_{i}(j) = 0 otherwise. Prove that ℬ = {e_{1},…,e_{n}} is a linearly
independent set. Is it a basis of ℝ^{S}?
Let S = ℝ^{n} and consider the vector space ℝ^{S} (see Example 3.3.1.4.8). For 1 ≤ i ≤ n,
define the functions e_{i}(x) = e_{i}(x_{1},…,x_{n}) = x_{i}. Then, verify that {e_{1},…,e_{n}} is a linearly
independent set. Is it a basis of ℝ^{S}?
Let S = {v_{1},…,v_{k}} ⊆ ℝ^{n}. Define A = [v_{1},…,v_{k}]. Then, using Theorem 3.3.2.9, the
columns of A corresponding to the pivotal columns in RREF(A) form a basis as well as a
minimal spanning subset of LS(S).
Let S = {a_{1},…,a_{n}}. Then recall that ℝ^{S} is a real vector space (see Example 8). For
1 ≤ i ≤ n, define f_{i}(a_{j}) = 1 if j = i and f_{i}(a_{j}) = 0 otherwise. Then, verify that {f_{1},…,f_{n}} is a basis of
ℝ^{S}. What can you say if S is a countable set?
Recall the vector space C([a,b]) of continuous functions, where a < b ∈ ℝ. For each α ∈ [a,b] ∩ ℚ, define
f_{α}(x) = x - α, for all x ∈ [a,b]. Is the set {f_{α}|α ∈ [a,b] ∩ ℚ} linearly independent? What if
α ranges over all of [a,b]? Can we write every function in C([a,b]) as a finite linear combination of the f_{α}'s? Give reasons
for your answer.
3.3.A Main Results associated with Bases
Theorem 3.3.3.13. Let V≠{0} be a vector space over F. Then the following statements are equivalent.
ℬ is a basis (maximal linearly independent subset) of V.
ℬ is linearly independent and it spans V.
ℬ is a minimal spanning set of V.
Proof. 1 ⇒ 2: By definition, every basis is a maximal linearly independent subset of V. Thus,
using Corollary 3.3.2.7.2, we see that ℬ spans V.
2 ⇒ 3: Let S be a linearly independent set that spans V. As S is linearly independent, for any
x ∈ S, x ∉ LS(S \{x}). Hence LS(S \{x})≠V, so no proper subset of S spans V.
3 ⇒ 1: If ℬ is linearly dependent then, using Corollary 3.3.2.7.1, ℬ is not minimal spanning. A
contradiction. Hence, ℬ is linearly independent.
We now need to show that ℬ is a maximal linearly independent set. Since ℬ spans V, for any
x ∈ V \ ℬ, the set ℬ ∪ {x} is linearly dependent. That is, every proper superset of ℬ is linearly
dependent. Hence, the required result follows. _
Now, using Lemma 3.3.2.11, we get the following result.
Remark 3.3.3.14. Let ℬ be a basis of a vector space V over F. Then, for each v ∈ V, there exist unique u_{i} ∈ ℬ and unique α_{i} ∈ F, for 1 ≤ i ≤ n, such that v = ∑_{i=1}^{n}α_{i}u_{i}.
The next result is generally known as ”every linearly independent set can be extended to form a basis
in a finite dimensional vector space”.
Theorem 3.3.3.15. Let V be a vector space over F with dim(V) = n. If S is a linearly independent subset of V then there exists a basis T of V such that S ⊆ T.
Proof. If LS(S) = V, we are done. Else, choose u_{1} ∈ V \ LS(S). Thus, by Corollary 3.3.2.7.2, the set S ∪ {u_{1}}
is linearly independent. We repeat this process until we get n vectors, as dim(V) = n. By
Theorem 3.3.3.13, the resulting set T is indeed a required basis. _
3.3.B Constructing a Basis of a Finite Dimensional Vector Space
We end this section with an algorithm which is based on the proof of the previous theorem.
Step 1: Let v_{1}∈ V with v_{1}≠0. Then {v_{1}} is linearly independent.
Step 2: If V = LS(v_{1}), we have got a basis of V. Else, pick v_{2}∈ V \ LS(v_{1}). Then by
Corollary 3.3.2.7.2, {v_{1},v_{2}} is linearly independent.
Step i: Either V = LS(v_{1},…,v_{i}) or LS(v_{1},…,v_{i})≠V. In the first case, {v_{1},…,v_{i}} is a basis of
V. Else, pick v_{i+1}∈ V \ LS(v_{1},…,v_{i}). Then, by Corollary 3.3.2.7.2, the set {v_{1},…,v_{i+1}}
is linearly independent.
This process will finally end as V is a finite dimensional vector space.
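The steps above can be sketched in numpy, using a rank test to decide whether a candidate vector lies outside the current span; here the candidates are drawn from a hypothetical finite list rather than from all of V:

```python
import numpy as np

def grow_basis(candidates, n, tol=1e-10):
    """Greedy sketch of the algorithm: keep a candidate v whenever it lies
    outside the span of the vectors kept so far (the rank increases),
    stopping once n vectors have been collected."""
    basis = []
    for v in candidates:
        trial = np.column_stack(basis + [v])
        if np.linalg.matrix_rank(trial, tol=tol) == len(basis) + 1:
            basis.append(v)
        if len(basis) == n:
            break
    return basis

cands = [np.array([1., 2., 1.]),
         np.array([2., 4., 2.]),   # a multiple of the first: skipped
         np.array([0., 1., 0.]),
         np.array([0., 0., 1.])]
basis = grow_basis(cands, 3)
```

The second candidate is rejected because it already lies in LS of the first, and the process stops with three independent vectors, a basis of ℝ^{3}.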
Exercise 3.3.3.16.
Let ℬ = {u_{1},…,u_{n}} be a basis of a vector space V over F. Then, does the condition ∑_{i=1}^{n}α_{i}u_{i} = 0 in the α_{i}'s imply that α_{i} = 0, for 1 ≤ i ≤ n?
Find a basis of ℝ^{3}containing the vector (1,1,-2)^{T }.
Find a basis of ℝ^{3}containing the vector (1,1,-2)^{T }and (1,2,-1)^{T }.
Is it possible to find a basis of ℝ^{4}containing the vectors (1,1,1,-2)^{T }, (1,2,-1,1)^{T }and
(1,-2,7,-11)^{T}?
Let S = {v_{1},…,v_{p}} be a subset of a vector space V over F. Suppose LS(S) = V but S is not a linearly independent set. Then does this imply that each v ∈ V is expressible in more than one way as a linear combination of vectors from S?
Show that ℬ = {(1,0,1)^{T},(1,i,0)^{T},(1,1,1-i)^{T}} is a basis of ℂ^{3} over ℂ.
Find a basis of ℂ^{3} over ℝ containing the basis ℬ given in Exercise 3.3.3.16.6.
Determine a basis and dimension of W = {(x,y,z,w)^{T }∈ ℝ^{4}|x + y - z + w = 0}.
Find a basis of V = {(x,y,z,u) ∈ ℝ^{4}|x - y - z = 0,x + z - u = 0}.
Let A = . Find a basis of V = {x ∈ ℝ^{5}|Ax = 0}.
Prove that ℬ = {1,x,…,x^{n},…} is a basis of ℝ[x]. ℬ is called the standard basis of ℝ[x].
Let u^{T} = (1,1,-2), v^{T} = (-1,2,3) and w^{T} = (1,10,1). Find a basis of LS(u,v,w). Determine a geometrical representation of LS(u,v,w).
Let V be a vector space of dimension n. Then any set
consisting of n linearly independent vectors forms a basis of V.
S in V having n vectors with LS(S) = V forms a basis of V.
Let {v_{1},…,v_{n}} be a basis of ℂ^{n}. Then prove that the matrices B = [v_{1},…,v_{n}] and B^{*} are invertible.
Let W_{1} and W_{2} be two subspaces of a vector space V such that W_{1} ⊆ W_{2}. Show that W_{1} = W_{2} if and only if dim(W_{1}) = dim(W_{2}).
Consider the vector space C([-π,π]) over ℝ. For each n ∈ ℕ, define e_{n}(x) = sin(nx). Then prove that S = {e_{n}|n ∈ ℕ} is linearly independent. [Hint: We need to show that every finite subset of S is linearly independent. So, on the contrary, assume that there exist ℓ ∈ ℕ and functions e_{k_{1}},…,e_{k_{ℓ}} such that α_{1}e_{k_{1}} + ⋯ + α_{ℓ}e_{k_{ℓ}} = 0, for some α_{t}≠0 with 1 ≤ t ≤ ℓ. But, the above system is equivalent to looking at α_{1}sin(k_{1}x) + ⋯ + α_{ℓ}sin(k_{ℓ}x) = 0 for all x ∈ [-π,π]. Now, in the integral
∫_{-π}^{π} sin(mx)(α_{1}sin(k_{1}x) + ⋯ + α_{ℓ}sin(k_{ℓ}x)) dx = 0, replace m with the k_{i}'s to show that α_{i} = 0, for all i, 1 ≤ i ≤ ℓ, to get the required contradiction.]
Let V be a vector space over F with dim(V) = n. If W_{1} is a k-dimensional subspace of V then prove that there exists a subspace W_{2} of V such that W_{1} ∩ W_{2} = {0}, W_{1} + W_{2} = V and
dim(W_{2}) = n - k. Also, prove that for each v ∈ V there exist unique vectors w_{1} ∈ W_{1} and w_{2} ∈ W_{2} such that v = w_{1} + w_{2}. The subspace W_{2} is called the complementary subspace of W_{1} in V.
Is the set W = {p(x) ∈ ℝ[x;4]|p(-1) = p(1) = 0} a subspace of ℝ[x;4]? If yes, find its dimension.
3.4 Application to the subspaces of ℂ^{n}
In this subsection, we will study results that are intrinsic to the understanding of linear algebra from
the point of view of matrices, especially the fundamental subspaces (see Definition 3.3.1.25) associated
with matrices. We start with an example.
Example 3.3.4.1.
Compute the fundamental subspaces for A = . Solution: Verify the following
Let A = . Find a basis and dimension of Null(A). Solution: Writing the basic variables x_{1},x_{3} and x_{6} in terms of the free variables x_{2},x_{4},x_{5} and
x_{7}, we get x_{1} = x_{7}- x_{2}- x_{4}- x_{5}, x_{3} = 2x_{7}- 2x_{4}- 3x_{5} and x_{6} = -x_{7}. Hence, the solution set
has the form
(3.3.4.1)
Now, let u_{1}^{T } = , u_{2}^{T } = , u_{3}^{T } = and
u_{4}^{T } = . Then S = {u_{1},u_{2},u_{3},u_{4}} is a basis of Null(A). The reasons for S to
be a basis are as follows:
For linear independence, the homogeneous system c_{1}u_{1} + c_{2}u_{2} + c_{3}u_{3} + c_{4}u_{4} = 0 in the
unknowns c_{1},c_{2},c_{3} and c_{4} has only the trivial solution as
u_{4} is the only vector with a non-zero entry at the 7-th place (u_{4} corresponds to
x_{7}) and hence c_{4} = 0.
u_{3} is the only vector with a non-zero entry at the 5-th place (u_{3} corresponds to
x_{5}) and hence c_{3} = 0.
Similar arguments hold for the unknowns c_{2} and c_{1}.
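The free-variable computation above is exactly what sympy's `nullspace()` performs: it returns one basis vector per free variable. A sketch with a small hypothetical matrix (not the A of the example):

```python
from sympy import Matrix

# Hypothetical 2 x 4 matrix already in RREF: x1 and x3 are basic while
# x2 and x4 are free, so Null(A) gets one basis vector per free variable.
A = Matrix([[1, 2, 0, 1],
            [0, 0, 1, 2]])
N = A.nullspace()
```

Here `len(N)` is 2, matching the two free variables, and every returned vector is annihilated by A.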
Exercise 3.3.4.2. Let A = and B = .
Find RREF(A) and RREF(B).
Find invertible matrices P_{1} and P_{2} such that P_{1}A = RREF(A) and P_{2}B = RREF(B).
Find bases for Col(A), Col(A^{*}), Col(B) and Col(B^{*}).
Find bases of Null(A), Null(A^{*}), Null(B) and Null(B^{*}).
Find the dimensions of all the vector subspaces so obtained.
Does there exist relationship between the bases of different spaces?
Lemma 3.3.4.3.Let A ∈ M_{m×n}(ℂ) and let E be an elementary matrix. If
B = EA then Col(A^{*}) = Col(B^{*}) and Col(A^{T }) = Col(B^{T }). Hence dim(Col(A^{*})) =
dim(Col(B^{*})) and dim(Col(A^{T })) = dim(Col(B^{T })).
B = AE then Col(A) = Col(B). Hence dim(Col(A)) = dim(Col(B)).
Proof. Note that B = EA if and only if A = E^{-1}B. As E is invertible, A and B are row-equivalent and hence
they have the same RREF. Now,
use Theorem 3.3.2.9 to get the required result.
For the second part, note that B^{*} = E^{*}A^{*} and E^{*} is invertible. Hence, using the first part
Col((A^{*})^{*}) = Col((B^{*})^{*}), or equivalently, Col(A) = Col(B). _
Let A ∈ M_{m×n}(ℂ) and let B = RREF(A). Then as an immediate application of Lemma 3.3.4.3, we
get dim(Col(A^{*})) = Row rank(A). Hence, dim(Col(A)) = Column rank(A) as well. We now prove
that Row rank(A) = Column rank(A).
Theorem 3.3.4.4.Let A ∈ M_{m×n}(ℂ). ThenRow rank(A) = Column rank(A).
Proof. Let Row rank(A) = r = dim(Col(A^{T })). Then there exist i_{1},…,i_{r} such that {A[i_{1},:],…,A[i_{r},:]}
form a basis of Col(A^{T }). Then, B = is an r × n matrix and it’s rows are a basis of
Col(A^{T }). Therefore, there exist α_{ij}∈ ℂ,1 ≤ i ≤ m,1 ≤ j ≤ r such that A[t,:] = [α_{t1},…,α_{tr}]B, for
1 ≤ t ≤ m. So, using matrix multiplication (see Remark 1.1.2.11.4)
A = CB,
where C = [α_{ij}] is an m × r matrix. Thus, matrix multiplication implies that each column of A is a
linear combination of the r columns of C. Hence, Column rank(A) = dim(Col(A)) ≤ r = Row rank(A).
A similar argument gives Row rank(A) ≤ Column rank(A). Hence, we have the required
result. _
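Theorem 3.3.4.4 can be spot-checked numerically: the rank of A and of A^{T} always agree. A numpy sketch over a batch of random rectangular matrices (our own construction, purely illustrative):

```python
import numpy as np

# rank(A) == rank(A^T) for a batch of random 4 x 7 matrices.
rng = np.random.default_rng(0)
ranks_agree = all(
    np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T)
    for A in (rng.standard_normal((4, 7)) for _ in range(20))
)
```

Of course a finite sample is no proof; the theorem is what guarantees the agreement for every matrix.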
Remark 3.3.4.5. The proof also shows that for every A ∈ M_{m×n}(ℂ) of rank r there exist matrices B_{r×n} and C_{m×r}, each of rank r, such that A = CB.
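One way to realize such a factorization in floating point is through the thin SVD; the sketch below is our own construction, not the B and C built in the proof, and it folds the singular values into C:

```python
import numpy as np

def rank_factorization(A, tol=1e-10):
    """Return C (m x r) and B (r x n), each of rank r = rank(A), with
    A = C B, built from the thin SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    r = int(np.sum(s > tol))
    C = U[:, :r] * s[:r]          # scale column j of U by the j-th singular value
    B = Vt[:r, :]
    return C, B

A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.]])      # hypothetical rank-2 matrix
C, B = rank_factorization(A)
```

For this A the factors are 3 × 2 and 2 × 3, each of rank 2, and their product reconstructs A to machine precision.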
Let W_{1} and W_{2} be two subspaces of a vector space V over F. Then recall that (see
Exercise 3.3.1.24.7d) W_{1} + W_{2} = {u + v|u ∈ W_{1},v ∈ W_{2}} = LS(W_{1} ∪ W_{2}) is the smallest subspace of
V containing both W_{1} and W_{2}. We now state a result analogous to the set-theoretic identity
|A| + |B| = |A ∪ B| + |A ∩ B|, which holds whenever the sets A and B are finite (for a proof, see
Appendix 7.7.4.1).
Theorem 3.3.4.6. Let V be a finite dimensional vector space over F. If W_{1} and W_{2} are two subspaces of V then
dim(W_{1}) + dim(W_{2}) = dim(W_{1} + W_{2}) + dim(W_{1} ∩ W_{2}).    (3.3.4.2)
For a better understanding, we give an example for finite subsets of ℝ^{n}. The example uses
Theorem 3.3.2.9 to obtain bases of LS(S), for different choices of S. The readers are advised to see
Example 3.3.2.10 before proceeding further.
Example 3.3.4.7. Let V and W be two subspaces of ℝ^{5} with V = {(v,w,x,y,z)^{T} ∈ ℝ^{5}|v + x + z = 3y}
and W = {(v,w,x,y,z)^{T} ∈ ℝ^{5}|w - x = z,v = y}. Find bases of V and W containing a basis of
V ∩ W. Solution: (v,w,x,y,z)^{T} ∈ V ∩ W if v,w,x,y and z satisfy v + x - 3y + z = 0, w - x - z = 0
and v = y. The solution of the system is given by
Thus, ℬ = {(1,2,0,1,2)^{T},(0,0,1,0,-1)^{T}} is a basis of V ∩ W. Similarly, a basis of V is given by
𝒞 = {(-1,0,1,0,0)^{T},(0,1,0,0,0)^{T},(3,0,0,1,0)^{T},(-1,0,0,0,1)^{T}} and that of W is given by
𝒟 = {(1,0,0,1,0)^{T},(0,1,1,0,0)^{T},(0,1,0,0,1)^{T}}. To find the required bases, form a matrix whose rows
are the vectors in ℬ, 𝒞 and 𝒟 (see Equation (3.3.4.3)) and apply row operations other than E_{ij}. Then
after a few row operations, we get
(3.3.4.3)
Thus, a required basis of V is {(1,2,0,1,2)^{T },(0,0,1,0,-1)^{T },(0,1,0,0,0)^{T },(0,0,0,1,3)^{T }}. Similarly, a
required basis of W is {(1,2,0,1,2)^{T },(0,0,1,0,-1)^{T },(0,1,0,0,1)^{T }}.
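The first step of the example, finding a basis of V ∩ W, amounts to stacking the defining constraints of V and W and computing one null space. A sympy sketch of that step:

```python
from sympy import Matrix

# V is cut out by v + x + z = 3y and W by w - x = z and v = y; stacking
# the constraints and taking the null space gives a basis of V ∩ W.
CV = Matrix([[1, 0, 1, -3, 1]])            # coordinates ordered (v, w, x, y, z)
CW = Matrix([[0, 1, -1, 0, -1],
             [1, 0, 0, -1, 0]])
inter = Matrix.vstack(CV, CW).nullspace()  # basis of V ∩ W
```

The null space is 2-dimensional, matching the two-vector basis found in the example (the basis vectors themselves may differ, since a basis is not unique).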
Exercise 3.3.4.8.
Give an example to show that if A and B are equivalent then Col(A) need not equal Col(B).
Let V = {(x,y,z,w)^{T} ∈ ℝ^{4}|x + y - z + w = 0,x + y + z + w = 0,x + 2y = 0} and W = {(x,y,z,w)^{T} ∈ ℝ^{4}|x - y - z + w = 0,x + 2y - w = 0} be two subspaces of ℝ^{4}. Find bases and dimensions of V, W, V ∩ W and V + W.
Let W_{1} and W_{2} be 4-dimensional subspaces of a vector space V of dimension 7. Then prove that dim(W_{1} ∩ W_{2}) ≥ 1.
Let W_{1} and W_{2} be two subspaces of a vector space V. If dim(W_{1}) + dim(W_{2}) > dim(V), then prove that dim(W_{1} ∩ W_{2}) ≥ 1.
Let A ∈ M_{m×n}(ℂ) with m < n. Prove that the columns of A are linearly dependent.
We now prove the rank-nullity theorem and give some of its consequences.
Theorem 3.3.4.9 (Rank-Nullity Theorem). Let A ∈ M_{m×n}(ℂ). Then
dim(Col(A)) + dim(Null(A)) = n.    (3.3.4.4)
Proof. Let dim(Null(A)) = r ≤ n and let ℬ = {u_{1},…,u_{r}} be a basis of Null(A). Since ℬ is a
linearly independent set in ℂ^{n}, extend it to get a basis 𝒞 = {u_{1},…,u_{n}} of ℂ^{n}. Then,
Col(A) = LS(Au_{1},…,Au_{n}) = LS(Au_{r+1},…,Au_{n}), as Au_{i} = 0, for 1 ≤ i ≤ r.
So, 𝒟 = {Au_{r+1},…,Au_{n}} spans Col(A). We further need to show that 𝒟 is linearly independent. So,
consider the linear system
α_{1}(Au_{r+1}) + ⋯ + α_{n-r}(Au_{n}) = 0    (3.3.4.5)
in the unknowns α_{1},…,α_{n-r}. As A(α_{1}u_{r+1} + ⋯ + α_{n-r}u_{n}) = 0, we get α_{1}u_{r+1} + ⋯ + α_{n-r}u_{n} ∈ Null(A) = LS(ℬ). Therefore, there exist
scalars β_{j}, 1 ≤ j ≤ r, such that ∑_{i=1}^{n-r}α_{i}u_{r+i} = ∑_{j=1}^{r}β_{j}u_{j}. Or equivalently,
β_{1}u_{1} + ⋯ + β_{r}u_{r} - α_{1}u_{r+1} - ⋯ - α_{n-r}u_{n} = 0.    (3.3.4.6)
As 𝒞 is a linearly independent set, the only solution of Equation (3.3.4.6) is α_{i} = 0, for 1 ≤ i ≤ n - r, and β_{j} = 0, for 1 ≤ j ≤ r. In
other words, we have shown that the only solution of Equation (3.3.4.5) is the trivial solution. Hence,
{Au_{r+1},…,Au_{n}} is a basis of Col(A). Thus, dim(Col(A)) = n - r and the required result follows. _
Theorem 3.3.4.9 is part of what is known as the fundamental theorem of linear algebra (see
Theorem 5.5.1.23). The following are some of the consequences of the rank-nullity theorem. The proof
is left as an exercise for the reader.
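As a quick numerical companion to Theorem 3.3.4.9, one can compute the rank and a null space basis of a small matrix and check that the two dimensions add up to the number of columns. A sympy sketch with a hypothetical 3 × 4 matrix:

```python
from sympy import Matrix

# Hypothetical 3 x 4 matrix: the third row is the sum of the first two,
# so the rank is 2 and the nullity must be 4 - 2 = 2.
A = Matrix([[1, 2, 1, 0],
            [2, 4, 0, 1],
            [3, 6, 1, 1]])
rank = A.rank()
nullity = len(A.nullspace())
```

Here rank + nullity equals 4, the number of columns, as the theorem requires.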
Exercise 3.3.4.10.
Let A ∈ M_{m,n}(ℂ). If
n > m then the system Ax = 0 has infinitely many solutions,
n < m then there exists b ∈ ℝ^{m}\{0} such that Ax = b is inconsistent.
The following statements are equivalent for an m × n matrix A.
Rank (A) = k.
There exist a set of k rows of A that are linearly independent.
There exist a set of k columns of A that are linearly independent.
dim(Col(A)) = k.
There exists a k × k submatrix B of A with det(B)≠0. Further, the determinant of every (k + 1) × (k + 1) submatrix of A is zero.
There exists a linearly independent subset {b_{1},…,b_{k}} of ℝ^{m} such that the system Ax = b_{i}, for 1 ≤ i ≤ k, is consistent.
dim(Null(A)) = n - k.
3.5 Ordered Bases
Let V be a vector subspace of ℂ^{n} for some n ∈ ℕ with dim(V) = k. Then, a basis of V may not look
like a standard basis. Our problem may force us to look for some other basis. In such a case, it is
always helpful to fix the vectors in a particular order and then concentrate only on the coefficients of
the vectors as was done for the system of linear equations where we didn’t worry about the unknowns.
It may also happen that k is much smaller than n, and hence it is always better to work
with vectors of small size.
Definition 3.3.5.1. [Ordered Basis, Basis Matrix] Let V be a vector space over F with a
basis ℬ = {u_{1},…,u_{n}}. Then an ordered basis for V is a basis together with a one-to-one
correspondence between ℬ and {1,2,…,n}. Since there is an order among the elements of ℬ, we
write ℬ = [u_{1},…,u_{n}]. The matrix B = [u_{1},…,u_{n}] is called the basis matrix.
Thus, [u_{1},u_{2},…,u_{n}] is different from [u_{2},u_{3},…,u_{n},u_{1}] and both of them are
different from [u_{n},u_{n-1},…,u_{2},u_{1}], even though they have the same set of vectors as
elements. We now define the notion of coordinates of a vector with respect to an ordered
basis.
basis.
Definition 3.3.5.2. [Coordinates of a Vector] Let B = [v_{1},…,v_{n}] be the basis matrix
corresponding to an ordered basis ℬ of V. Since ℬ is a basis of V, for each v ∈ V, there exist unique
β_{i}, 1 ≤ i ≤ n, such that v = ∑_{i=1}^{n}β_{i}v_{i} = B[β_{1},…,β_{n}]^{T}. The column vector [β_{1},…,β_{n}]^{T} is called the coordinate vector of v with respect to ℬ, denoted [v]_{ℬ}. Thus, using this notation, v = B[v]_{ℬ}.
Example 3.3.5.3.
Let = be an ordered basis of ℝ^{2}. Then, = _{}. Thus,
_{} = ^{-1}.
Consider the vector space ℝ[x;2] with basis {1 -x,1 + x,x^{2}}. Then an ordered basis can
either be = [1-x,1+x,x^{2}] or = [1+x,1-x,x^{2}] or .... Note that there are 3! different
ordered bases. Also, for a_{0} + a_{1}x + a_{2}x^{2}∈ ℝ[x;2], one has
Let V = {(x,y,z)^{T}|x + y = z}. If ℬ = [(-1,1,0)^{T},(1,0,1)^{T}] is an ordered basis of V then
_{} = and hence _{} = .
Let V = {(v,w,x,y,z)^{T} ∈ ℝ^{5}|w - x = z,v = y,v + x + z = 3y}. So, if ℬ =
[(1,2,0,1,2)^{T},(0,0,1,0,-1)^{T}] is an ordered basis then [(3,6,0,3,1)^{T}]_{ℬ} = .
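Computing a coordinate vector is a single linear solve against the basis matrix: since v = B[v]_{ℬ}, the coordinates are B^{-1}v. A numpy sketch with a hypothetical ordered basis of ℝ^{2} (not one of the bases above):

```python
import numpy as np

# Hypothetical ordered basis of R^2 with basis matrix B.
B = np.column_stack([[1., 1.], [1., -1.]])   # u1 = (1,1)^T, u2 = (1,-1)^T
v = np.array([3., 1.])
coords = np.linalg.solve(B, v)               # [v]_B, since B [v]_B = v
```

Here (3,1)^{T} = 2u_{1} + 1u_{2}, so the coordinate vector is (2,1)^{T}.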
Let V be a vector space over F with dim(V) = n. Let A = [v_{1},…,v_{n}] and B = [u_{1},…,u_{n}] be basis
matrices corresponding to the ordered bases ℬ and 𝒞, respectively. So, using the notation in
Definition 3.3.5.2, we have v_{i} = B[v_{i}]_{𝒞}, for 1 ≤ i ≤ n. So,
A = [v_{1},…,v_{n}] = B[[v_{1}]_{𝒞},…,[v_{n}]_{𝒞}], and
the matrix [A]_{𝒞} = [[v_{1}]_{𝒞},…,[v_{n}]_{𝒞}], denoted [ℬ]_{𝒞}, is called the matrix of ℬ with respect to the
ordered basis 𝒞 or the change of basis matrix from ℬ to 𝒞. We now summarize the above
discussion.
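Concretely, if A and B are the two basis matrices, the change of basis matrix is B^{-1}A, and it converts coordinates with respect to the first basis into coordinates with respect to the second. A numpy sketch with hypothetical bases of ℝ^{2}:

```python
import numpy as np

# Hypothetical bases of R^2: A holds the old basis vectors as columns,
# B the new ones.  M sends old coordinates to new coordinates.
A = np.array([[1., 0.],
              [0., 1.]])                 # old basis (standard)
B = np.array([[1., 1.],
              [1., -1.]])                # new basis
M = np.linalg.solve(B, A)                # change of basis matrix B^{-1} A
w_old = np.array([3., 1.])               # [w] in the old basis
w_new = M @ w_old                        # [w] in the new basis
```

Using `solve` instead of forming B^{-1} explicitly is the standard numerically stable choice; both coordinate vectors describe the same vector w, as the final check confirms.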
Theorem 3.3.5.4. Let V be a vector space over F with dim(V) = n. Let ℬ = [v_{1},…,v_{n}] and 𝒞 = [u_{1},…,u_{n}] be two ordered bases of V.
Then, the matrix [ℬ]_{𝒞} is invertible and [w]_{𝒞} = [ℬ]_{𝒞}[w]_{ℬ}, for all w ∈ V.
Similarly, verify that [𝒞]_{ℬ} is invertible and [w]_{ℬ} = [𝒞]_{ℬ}[w]_{𝒞}, for all w ∈ V.
Furthermore, ([ℬ]_{𝒞})^{-1} = [𝒞]_{ℬ}.
Proof. Part 1: We prove all the parts together. Let A = [v_{1},…,v_{n}], B = [u_{1},…,u_{n}] and C = [ℬ]_{𝒞} and