Chapter 5
Inner Product Spaces

5.1 Definition and Basic Properties

Recall the dot product in ℝ² and ℝ³. The dot product helped us compute the length of vectors and the angle between vectors, and this enabled us to rephrase geometric problems in ℝ² and ℝ³ in the language of vectors. We now generalize the idea of the dot product to achieve a similar goal for a general vector space over ℝ or ℂ. So, in this chapter F will denote either ℝ or ℂ.

Definition 5.1.1. [Inner Product] Let V be a vector space over F. An inner product over V, denoted by ⟨·, ·⟩, is a map from V × V to F satisfying

1.
⟨au + bv, w⟩ = a⟨u, w⟩ + b⟨v, w⟩, for all u, v, w ∈ V and a, b ∈ F,
2.
⟨u, v⟩ = \overline{⟨v, u⟩}, the complex conjugate of ⟨v, u⟩, for all u, v ∈ V, and
3.
⟨u, u⟩ ≥ 0 for all u ∈ V. Furthermore, equality holds if and only if u = 0.

Remark 5.1.2. Using the definition of inner product, we immediately observe that

1.
⟨v, αw⟩ = \overline{⟨αw, v⟩} = \overline{α}·\overline{⟨w, v⟩} = \overline{α}⟨v, w⟩, for all α ∈ F and v, w ∈ V.
2.
If ⟨u, v⟩ = 0 for all v ∈ V then, in particular, ⟨u, u⟩ = 0. Hence, u = 0.

Definition 5.1.3. [Inner Product Space] Let V be a vector space with an inner product ⟨·, ·⟩. Then, (V, ⟨·, ·⟩) is called an inner product space (in short, ips).

Example 5.1.4. Examples 1 and 2 below are called the standard inner product or the dot product on ℝ^n and ℂ^n, respectively. Whenever an inner product is not clearly mentioned, it will be assumed to be the standard inner product.

1.
For u = (u1, …, un)^T, v = (v1, …, vn)^T ∈ ℝ^n define ⟨u, v⟩ = u1v1 + ⋯ + unvn = v^T u. Then, ⟨·, ·⟩ is indeed an inner product and hence (ℝ^n, ⟨·, ·⟩) is an ips.
2.
For u = (u1, …, un)*, v = (v1, …, vn)* ∈ ℂ^n define ⟨u, v⟩ = u1\overline{v1} + ⋯ + un\overline{vn} = v*u. Then, (ℂ^n, ⟨·, ·⟩) is an ips.
3.
For x = (x1, x2)^T, y = (y1, y2)^T ∈ ℝ² and A = [4 −1; −1 2], define ⟨x, y⟩ = y^T Ax. Then, ⟨·, ·⟩ is an inner product as ⟨x, x⟩ = (x1 − x2)² + 3x1² + x2².
4.
Fix A = [a b; b c] with a, c > 0 and ac > b². Then, ⟨x, y⟩ = y^T Ax is an inner product on ℝ² as ⟨x, x⟩ = ax1² + 2bx1x2 + cx2² = a(x1 + (b/a)x2)² + ((ac − b²)/a)x2².
5.
Verify that for x = (x1, x2, x3)^T, y = (y1, y2, y3)^T ∈ ℝ³, ⟨x, y⟩ = 10x1y1 + 3x1y2 + 3x2y1 + 2x2y2 + x2y3 + x3y2 + x3y3 defines an inner product.
6.
For x = (x1, x2)^T, y = (y1, y2)^T ∈ ℝ², we define three maps that satisfy at least one condition out of the three conditions for an inner product. Determine the condition which is not satisfied. Give reasons for your answer.
(a)
⟨x, y⟩ = x1y1.
(b)
⟨x, y⟩ = x1² + y1² + x2² + y2².
(c)
⟨x, y⟩ = x1y1³ + x2y2³.
7.
Let A ∈ Mn(ℂ) be a Hermitian matrix. Then, for x, y ∈ ℂ^n, define ⟨x, y⟩ = y*Ax. Then, ⟨·, ·⟩ satisfies ⟨x, y⟩ = \overline{⟨y, x⟩} and ⟨x + αz, y⟩ = ⟨x, y⟩ + α⟨z, y⟩, for all x, y, z ∈ ℂ^n and α ∈ ℂ. Does there exist a condition on A such that ⟨x, x⟩ ≥ 0 for all x ∈ ℂ^n? This will be answered in the affirmative in the chapter on eigenvalues and eigenvectors.
8.
For A, B ∈ Mn(ℝ), define ⟨A, B⟩ = tr(B^T A). Then,

⟨A + B, C⟩ = tr(C^T(A + B)) = tr(C^T A) + tr(C^T B) = ⟨A, C⟩ + ⟨B, C⟩ and
⟨A, B⟩ = tr(B^T A) = tr((B^T A)^T) = tr(A^T B) = ⟨B, A⟩.

If A = [aij] then ⟨A, A⟩ = tr(A^T A) = Σ_{i=1}^n (A^T A)_{ii} = Σ_{i,j=1}^n aij·aij = Σ_{i,j=1}^n aij², and therefore ⟨A, A⟩ > 0 for every nonzero matrix A.
9.
Consider the complex vector space C[−1, 1] and define ⟨f, g⟩ = ∫_{−1}^{1} f(x)\overline{g(x)} dx. Then,
(a)
⟨f, f⟩ = ∫_{−1}^{1} |f(x)|² dx ≥ 0 as |f(x)|² ≥ 0, and this integral is 0 if and only if f ≡ 0, as f is continuous.
(b)
⟨g, f⟩ = ∫_{−1}^{1} g(x)\overline{f(x)} dx = \overline{∫_{−1}^{1} f(x)\overline{g(x)} dx} = \overline{⟨f, g⟩}.
(c)
⟨f + g, h⟩ = ∫_{−1}^{1} (f + g)(x)\overline{h(x)} dx = ∫_{−1}^{1} [f(x)\overline{h(x)} + g(x)\overline{h(x)}] dx = ⟨f, h⟩ + ⟨g, h⟩.
(d)
⟨αf, g⟩ = ∫_{−1}^{1} (αf(x))\overline{g(x)} dx = α ∫_{−1}^{1} f(x)\overline{g(x)} dx = α⟨f, g⟩.
10.
Fix an ordered basis B = [u1, …, un] of a complex vector space V. Then, for any u, v ∈ V with [u]_B = (a1, …, an)^T and [v]_B = (b1, …, bn)^T, define ⟨u, v⟩ = Σ_{i=1}^n ai\overline{bi}. Then, ⟨·, ·⟩ is indeed an inner product on V. So, any finite dimensional vector space can be endowed with an inner product.

5.1.1 Cauchy-Schwarz Inequality

As ⟨u, u⟩ > 0 for all u ≠ 0, we can use the inner product to define the length of a vector.

Definition 5.1.5. [Length / Norm of a Vector] Let V be a vector space over F. Then, for any vector u ∈ V, we define the length (norm) of u, denoted ∥u∥ = √⟨u, u⟩, the positive square root. A vector of norm 1 is called a unit vector. Thus, u/∥u∥ is called the unit vector in the direction of u.

Example 5.1.6.

1.
Let V be an ips and u ∈ V. Then, for any scalar α, ∥αu∥ = |α|·∥u∥.
2.
Let u = (1, −1, 2, −3)^T ∈ ℝ⁴. Then, ∥u∥ = √(1 + 1 + 4 + 9) = √15. Thus, (1/√15)u and −(1/√15)u are vectors of norm 1. Moreover, (1/√15)u is a unit vector in the direction of u.

Exercise 5.1.7.

1.
Let u = (−1, 1, 2, 3, 7)^T ∈ ℝ⁵. Find all α ∈ ℝ such that ∥αu∥ = 1.
2.
Let u = (−1, 1, 2, 3, 7)^T ∈ ℂ⁵. Find all α ∈ ℂ such that ∥αu∥ = 1.
3.
Prove that ∥x + y∥² + ∥x − y∥² = 2(∥x∥² + ∥y∥²), for all x, y ∈ ℝ^n. This equality is called the Parallelogram Law as in a parallelogram the sum of the squares of the lengths of the diagonals is equal to twice the sum of the squares of the lengths of the sides.
4.
Apollonius' Identity: Let the lengths of the sides of a triangle be a, b, c and that of the median be d. If the median is drawn on the side with length a then prove that b² + c² = 2(d² + (a/2)²).
5.
Let u = (1, 2)^T, v = (2, −1)^T ∈ ℝ². Then, does there exist an inner product in ℝ² such that ∥u∥ = 1, ∥v∥ = 1 and ⟨u, v⟩ = 0? [Hint: Let A = [a b; b c] and define ⟨x, y⟩ = y^T Ax. Use the given conditions to get a linear system of 3 equations in the variables a, b, c.]
6.
Let x = (x1, x2)^T, y = (y1, y2)^T ∈ ℝ². Then, ⟨x, y⟩ = 3x1y1 − x1y2 − x2y1 + x2y2 defines an inner product. Use this inner product to find
(a)
the angle between e1 = (1, 0)^T and e2 = (0, 1)^T,
(b)
v ∈ ℝ² such that ⟨v, e1⟩ = 0,
(c)
x, y ∈ ℝ² such that ∥x∥ = ∥y∥ = 1 and ⟨x, y⟩ = 0.

A very useful and fundamental inequality concerning the inner product, commonly called the Cauchy-Schwarz inequality, is proved next.

Theorem 5.1.8 (Cauchy-Bunyakovsky-Schwarz inequality). Let V be an inner product space over F. Then, for any u, v ∈ V,

|⟨u, v⟩| ≤ ∥u∥ ∥v∥.    (5.1.1)

Moreover, equality holds in Inequality (5.1.1) if and only if u and v are linearly dependent. Furthermore, if u ≠ 0 then v = ⟨v, u/∥u∥⟩ (u/∥u∥).

Proof. If u = 0 then Inequality (5.1.1) holds. Hence, let u ≠ 0. Then, by Definition 5.1.1.3, ⟨λu + v, λu + v⟩ ≥ 0 for all λ ∈ F and v ∈ V. In particular, for λ = −⟨v, u⟩/∥u∥²,

0 ≤ ⟨λu + v, λu + v⟩ = λ\overline{λ}∥u∥² + λ⟨u, v⟩ + \overline{λ}⟨v, u⟩ + ∥v∥²
  = (⟨v, u⟩\overline{⟨v, u⟩}/∥u∥⁴)∥u∥² − (⟨v, u⟩/∥u∥²)⟨u, v⟩ − (\overline{⟨v, u⟩}/∥u∥²)⟨v, u⟩ + ∥v∥²
  = ∥v∥² − |⟨v, u⟩|²/∥u∥².

Or, in other words, |⟨v, u⟩|² ≤ ∥u∥²∥v∥² and the proof of the inequality is over.

Now, note that equality holds in Inequality (5.1.1) if and only if ⟨λu + v, λu + v⟩ = 0, or equivalently, λu + v = 0. Hence, u and v are linearly dependent. Moreover,

0 = ⟨0, u⟩ = ⟨λu + v, u⟩ = λ⟨u, u⟩ + ⟨v, u⟩

implies that v = −λu = (⟨v, u⟩/∥u∥²) u = ⟨v, u/∥u∥⟩ (u/∥u∥). □

Corollary 5.1.9. Let x, y ∈ ℝ^n. Then, (Σ_{i=1}^n xi yi)² ≤ (Σ_{i=1}^n xi²)(Σ_{i=1}^n yi²).
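A quick numerical illustration of Corollary 5.1.9 (again a Python/NumPy sketch of ours, not part of the text); equality occurs for a linearly dependent pair:

import numpy as np

rng = np.random.default_rng(1)
x, y = rng.standard_normal(10), rng.standard_normal(10)
lhs = (x @ y) ** 2
rhs = (x @ x) * (y @ y)
print(lhs <= rhs)                       # True
z = 3 * x                               # linearly dependent with x
print(np.isclose((x @ z) ** 2, (x @ x) * (z @ z)))  # True: equality case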

5.1.2 Angle between two Vectors

Let V be a real vector space. Then, for nonzero u, v ∈ V, the Cauchy-Schwarz inequality implies that −1 ≤ ⟨u, v⟩/(∥u∥∥v∥) ≤ 1. We use this together with the properties of the cosine function to define the angle between two vectors in an inner product space.

Definition 5.1.10. [Angle between Vectors] Let V be a real vector space. If θ ∈ [0, π] is the angle between u, v ∈ V \ {0} then we define

cos θ = ⟨u, v⟩ / (∥u∥ ∥v∥).

Example 5.1.11.

1.
Take (1, 0)^T, (1, 1)^T ∈ ℝ². Then, cos θ = 1/√2. So θ = π/4.
2.
Take (1, 1, 0)^T, (1, 1, 1)^T ∈ ℝ³. Then, the angle between them, say β, equals cos⁻¹(2/√6).
3.
The angle depends on the inner product. Take ⟨x, y⟩ = 2x1y1 + x1y2 + x2y1 + x2y2 on ℝ². Then, the angle between (1, 0)^T and (1, 1)^T equals cos⁻¹(3/√10).
4.
As ⟨x, y⟩ = ⟨y, x⟩ for any real vector space, the angle between x and y is the same as the angle between y and x.
5.
Let a, b ∈ ℝ with a, b > 0. Then, prove that (a + b)(1/a + 1/b) ≥ 4.
6.
For 1 ≤ i ≤ n, let ai ∈ ℝ with ai > 0. Then, use Corollary 5.1.9 to show that (Σ_{i=1}^n ai)(Σ_{i=1}^n 1/ai) ≥ n².
7.
Prove that |z1 + ⋯ + zn| ≤ √(n(|z1|² + ⋯ + |zn|²)), for z1, …, zn ∈ ℂ. When does equality hold?
8.
Let V be an ips. If u, v ∈ V with ∥u∥ = 1, ∥v∥ = 1 and ⟨u, v⟩ = 1 then prove that u = αv for some α ∈ F. Is α = 1?


[Figure 5.1: Triangle with vertices A, B and C]


We will now prove that if A, B and C are the vertices of a triangle (see Figure 5.1) and a, b and c, respectively, are the lengths of the corresponding sides then cos(A) = (b² + c² − a²)/(2bc). This in turn implies that the angle between vectors has been rightly defined.

Lemma 5.1.12. Let A, B and C be the vertices of a triangle (see Figure 5.1) with corresponding side lengths a, b and c, respectively, in a real inner product space V. Then,

cos(A) = (b² + c² − a²)/(2bc).

Proof. Let 0, u and v be the coordinates of the vertices A, B and C, respectively, of the triangle ABC. Then, ⃗AB = u, ⃗AC = v and ⃗BC = v − u. Thus, we need to prove that

cos(A) = (∥v∥² + ∥u∥² − ∥v − u∥²)/(2∥v∥∥u∥) ⇔ ∥v∥² + ∥u∥² − ∥v − u∥² = 2∥v∥∥u∥ cos(A).

Now, by definition, ∥v − u∥² = ∥v∥² + ∥u∥² − 2⟨v, u⟩ and hence ∥v∥² + ∥u∥² − ∥v − u∥² = 2⟨v, u⟩. As ⟨v, u⟩ = ∥v∥∥u∥ cos(A), the required result follows. □

Definition 5.1.13. [Orthogonality / Perpendicularity] Let V be an inner product space over F. Then,

1.
the vectors u, v ∈ V are called orthogonal/perpendicular if ⟨u, v⟩ = 0.
2.
Let S ⊆ V. Then, the orthogonal complement of S in V, denoted S⊥, equals

S⊥ = {v ∈ V : ⟨v, w⟩ = 0, for all w ∈ S}.

Example 5.1.14.

1.
0 is orthogonal to every vector as ⟨0, x⟩ = 0 for all x ∈ V.
2.
If V is a vector space over ℝ or ℂ then 0 is the only vector that is orthogonal to itself.
3.
Let V = ℝ.
(a)
S = {0}. Then, S⊥ = ℝ.
(b)
S = ℝ. Then, S⊥ = {0}.
(c)
Let S be any subset of ℝ containing a nonzero real number. Then, S⊥ = {0}.
4.
Let u = (1, 2)^T. What is u⊥ in ℝ²?
Solution: u⊥ = {(x, y)^T ∈ ℝ² | x + 2y = 0}. Is this Null(u^T)? Note that (2, −1)^T is a basis of u⊥ and, for any vector x ∈ ℝ²,

x = ⟨x, u⟩ u/∥u∥² + (x − ⟨x, u⟩ u/∥u∥²) = ((x1 + 2x2)/5)(1, 2)^T + ((2x1 − x2)/5)(2, −1)^T

is a decomposition of x into two vectors, one parallel to u and the other parallel to u⊥.
5.
Fix u = (1, 1, 1, 1)^T, v = (1, 1, −1, 0)^T ∈ ℝ⁴. Determine z, w ∈ ℝ⁴ such that u = z + w with the condition that z is parallel to v and w is orthogonal to v.
Solution: As z is parallel to v, z = kv = (k, k, −k, 0)^T, for some k ∈ ℝ. Since w is orthogonal to v, the vector w = (a, b, c, d)^T satisfies a + b − c = 0. Thus, c = a + b and

(1, 1, 1, 1)^T = u = z + w = (k, k, −k, 0)^T + (a, b, a + b, d)^T.

Comparing the corresponding coordinates gives the linear system d = 1, a + k = 1, b + k = 1 and a + b − k = 1 in the variables a, b, d and k. Thus, solving for a, b, d and k gives z = (1/3)(1, 1, −1, 0)^T and w = (1/3)(2, 2, 4, 3)^T.
6.
Let x, y ∈ ℝ^n. Then, prove that
(a)
⟨x, y⟩ = 0 ⇔ ∥x − y∥² = ∥x∥² + ∥y∥² (Pythagoras Theorem).
Solution: Use ∥x − y∥² = ∥x∥² + ∥y∥² − 2⟨x, y⟩ to get the required result.
(b)
∥x∥ = ∥y∥ ⇔ ⟨x + y, x − y⟩ = 0 (x and y form adjacent sides of a rhombus as the diagonals x + y and x − y are orthogonal).
Solution: Use ⟨x + y, x − y⟩ = ∥x∥² − ∥y∥² to get the required result.
(c)
4⟨x, y⟩ = ∥x + y∥² − ∥x − y∥² (polarization identity in ℝ^n).
Solution: Just expand the right hand side to get the required result.
(d)
∥x + y∥² + ∥x − y∥² = 2∥x∥² + 2∥y∥² (parallelogram law: the sum of squares of the diagonals of a parallelogram equals twice the sum of squares of its sides).
Solution: Just expand the left hand side to get the required result.
7.
Let P = (1, 1, 1)^T, Q = (2, 1, 3)^T and R = (−1, 1, 2)^T be three vertices of a triangle in ℝ³. Compute the angle between the sides PQ and PR.
Solution: Method 1: Note that ⃗PQ = (2, 1, 3)^T − (1, 1, 1)^T = (1, 0, 2)^T, ⃗PR = (−2, 0, 1)^T and ⃗RQ = (3, 0, 1)^T. As ⟨⃗PQ, ⃗PR⟩ = 0, the angle between the sides PQ and PR is π/2.

Method 2: ∥⃗PQ∥ = √5, ∥⃗PR∥ = √5 and ∥⃗QR∥ = √10. As ∥⃗QR∥² = ∥⃗PQ∥² + ∥⃗PR∥², by the Pythagoras theorem, the angle between the sides PQ and PR is π/2.

Exercise 5.1.15.

1.
Let V be an ips.
(a)
If S ⊆ V then S⊥ is a subspace of V and S⊥ = (LS(S))⊥.
(b)
Furthermore, if V is finite dimensional then S⊥ and LS(S) are complementary. That is, V = LS(S) + S⊥. Equivalently, ⟨u, w⟩ = 0, for all u ∈ LS(S) and w ∈ S⊥.
2.
Consider ℝ³ with the standard inner product. Find
(a)
S⊥ for S = {(1, 1, 1)^T, (0, 1, −1)^T} and S = LS((1, 1, 1)^T, (0, 1, −1)^T).
(b)
vectors v, w ∈ ℝ³ such that v, w, u = (1, 1, 1)^T are mutually orthogonal.
(c)
the line passing through (1, 1, −1)^T and parallel to (a, b, c) ≠ 0.
(d)
the plane containing (1, 1, −1)^T with (a, b, c) ≠ 0 as the normal vector.
(e)
the area of the parallelogram with three vertices 0^T, (1, 2, −2)^T and (2, 3, 0)^T.
(f)
the area of the parallelogram when ∥x∥ = 5, ∥x − y∥ = 8 and ∥x + y∥ = 14.
(g)
the plane containing (2, −2, 1)^T and perpendicular to the line with parametric equation x = t − 1, y = 3t + 2, z = t + 1.
(h)
the plane containing the lines (1, 2, −2) + t(1, 1, 0) and (1, 2, −2) + t(0, 1, 2).
(i)
k such that cos⁻¹(⟨u, v⟩) = π/3, where u = (1, −1, 1)^T and v = (1, k, 1)^T.
(j)
the plane containing (1, 1, 2)^T and orthogonal to the line with parametric equation x = 2 + t, y = 3 and z = 1 − t.
(k)
a parametric equation of a line containing (1, −2, 1)^T and orthogonal to the plane x + 3y + 2z = 1.
3.
Let P = (3, 0, 2)^T, Q = (1, 2, −1)^T and R = (2, −1, 1)^T be three points in ℝ³. Then,
(a)
find the area of the triangle with vertices P, Q and R.
(b)
find the area of the parallelogram built on the vectors ⃗PQ and ⃗QR.
(c)
find a nonzero vector orthogonal to the plane of the above triangle.
(d)
find all vectors x orthogonal to ⃗PQ and ⃗QR with ∥x∥ = √2.
(e)
find the volume of the parallelepiped built on the vectors ⃗PQ, ⃗QR and x, where x is one of the vectors found in Part 3d. Do you think the volume would be different if you choose the other vector x?
4.
Let p1 be the plane containing A = (1, 2, 3)^T and having (2, −1, 1)^T as its normal vector. Then,
(a)
find the equation of the plane p2 that is parallel to p1 and contains (−1, 2, −3)^T.
(b)
calculate the distance between the planes p1 and p2.
5.
In the parallelogram ABCD, AB ∥ DC and AD ∥ BC, with A = (−2, 1, 3)^T, B = (−1, 2, 2)^T and C = (−3, 1, 5)^T. Find the
(a)
coordinates of the point D,
(b)
cosine of the angle BCD,
(c)
area of the triangle ABC,
(d)
volume of the parallelepiped determined by ⃗AB, ⃗AD and (0, 0, −7)^T.
6.
Let W = {(x, y, z, w)^T ∈ ℝ⁴ : x + y + z − w = 0}. Find a basis of W.
7.
Recall the ips Mn(ℝ) (see Example 5.1.4.8). If W = {A ∈ Mn(ℝ) | A^T = A} then what is W⊥?

5.1.3 Normed Linear Space

To proceed further, recall that a vector space over ℝ or ℂ was called a linear space.

Definition 5.1.16. [Normed Linear Space] Let V be a linear space.

1.
Then, a norm on V is a function f(x) = ∥x∥ from V to ℝ such that
(a)
∥x∥ ≥ 0 for all x ∈ V, and if ∥x∥ = 0 then x = 0.
(b)
∥αx∥ = |α|∥x∥ for all α ∈ F and x ∈ V.
(c)
∥x + y∥ ≤ ∥x∥ + ∥y∥ for all x, y ∈ V (triangle inequality).
2.
A linear space with a norm on it is called a normed linear space (nls).

Theorem 5.1.17. Let V be a normed linear space and x, y ∈ V. Then, |∥x∥ − ∥y∥| ≤ ∥x − y∥.

Proof. As ∥x∥ = ∥x − y + y∥ ≤ ∥x − y∥ + ∥y∥, one has ∥x∥ − ∥y∥ ≤ ∥x − y∥. Similarly, one obtains ∥y∥ − ∥x∥ ≤ ∥y − x∥ = ∥x − y∥. Combining the two, the required result follows. □

Example 5.1.18.

1.
On ℝ³, ∥x∥ = √(x1² + x2² + x3²) is a norm. Also, observe that this norm corresponds to √⟨x, x⟩, where ⟨·, ·⟩ is the standard inner product.
2.
Let V be an ips. Is it true that f(x) = √⟨x, x⟩ is a norm?
Solution: Yes. The readers should verify the first two conditions. For the third condition, recalling the Cauchy-Schwarz inequality, we get

f(x + y)² = ⟨x + y, x + y⟩ = ⟨x, x⟩ + ⟨x, y⟩ + ⟨y, x⟩ + ⟨y, y⟩
          ≤ ∥x∥² + ∥x∥∥y∥ + ∥x∥∥y∥ + ∥y∥² = (f(x) + f(y))².

Thus, ∥x∥ = √⟨x, x⟩ is a norm, called the norm induced by the inner product ⟨·, ·⟩.

Exercise 5.1.19.

1.
Let V be a complex ips. Then,

4⟨x, y⟩ = ∥x + y∥² − ∥x − y∥² + i∥x + iy∥² − i∥x − iy∥²  (Polarization Identity).
2.
Consider the complex vector space ℂ^n. If x, y ∈ ℂ^n then prove that
(a)
If x ≠ 0 then ∥x + ix∥² = ∥x∥² + ∥ix∥², even though ⟨x, ix⟩ ≠ 0.
(b)
⟨x, y⟩ = 0 whenever ∥x + y∥² = ∥x∥² + ∥y∥² and ∥x + iy∥² = ∥x∥² + ∥iy∥².
3.
Let A ∈ Mn(ℝ) satisfy ∥Ax∥ ≤ ∥x∥ for all x ∈ ℝ^n. Then, prove that if α ∈ ℝ with |α| > 1 then A − αI is invertible.

The next result is stated without proof as the proof is beyond the scope of this book.

Theorem 5.1.20. Let ∥·∥ be a norm on a nls V. Then, ∥·∥ is induced by some inner product if and only if ∥·∥ satisfies the parallelogram law: ∥x + y∥² + ∥x − y∥² = 2∥x∥² + 2∥y∥².

Example 5.1.21.

1.
For x = (x1, x2)^T ∈ ℝ², we define ∥x∥₁ = |x1| + |x2|. Verify that ∥x∥₁ is indeed a norm. But, for x = e1 and y = e2, 2(∥x∥² + ∥y∥²) = 4 whereas

∥x + y∥² + ∥x − y∥² = ∥(1, 1)∥² + ∥(1, −1)∥² = (|1| + |1|)² + (|1| + |−1|)² = 8.

So, the parallelogram law fails. Thus, ∥x∥₁ is not induced by any inner product in ℝ².
2.
Does there exist an inner product in ℝ² such that ∥x∥ = max{|x1|, |x2|}?
3.
If ∥·∥ is a norm in V then d(x, y) = ∥x − y∥, for x, y ∈ V, defines a distance function as
(a)
d(x, x) = 0, for each x ∈ V.
(b)
using the triangle inequality, for any z ∈ V, we have

d(x, y) = ∥x − y∥ = ∥(x − z) − (y − z)∥ ≤ ∥x − z∥ + ∥y − z∥ = d(x, z) + d(z, y).
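Theorem 5.1.20 and Example 5.1.21.1 are easy to test numerically. The sketch below (ours, assuming NumPy) evaluates the parallelogram defect ∥x + y∥² + ∥x − y∥² − 2∥x∥² − 2∥y∥² for the Euclidean norm and for ∥·∥₁:

import numpy as np

def parallelogram_gap(norm, x, y):
    # returns ||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2
    return norm(x + y)**2 + norm(x - y)**2 - 2*norm(x)**2 - 2*norm(y)**2

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(parallelogram_gap(np.linalg.norm, e1, e2))             # 0.0
print(parallelogram_gap(lambda v: np.abs(v).sum(), e1, e2))  # 4.0 (= 8 - 4)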

5.2 Gram-Schmidt Orthonormalization Process

We start with the definition of an orthonormal set.

Definition 5.2.1. Let V be an ips. Then, a non-empty set S = {v1, …, vn} ⊆ V is called an orthogonal set if vi and vj are mutually orthogonal, for 1 ≤ i ≠ j ≤ n, i.e.,

⟨vi, vj⟩ = 0, for 1 ≤ i < j ≤ n.

Further, if ∥vi∥ = 1, for 1 ≤ i ≤ n, then S is called an orthonormal set. If S is also a basis of V then S is called an orthonormal basis of V.

Example 5.2.2.

1.
A few orthonormal sets in ℝ² are

{(1, 0)^T, (0, 1)^T}, {(1/√2)(1, 1)^T, (1/√2)(1, −1)^T} and {(1/√5)(2, 1)^T, (1/√5)(1, −2)^T}.
2.
Let S = {e1, …, en} be the standard basis of ℝ^n. Then, S is an orthonormal set as
(a)
∥ei∥ = 1, for 1 ≤ i ≤ n.
(b)
⟨ei, ej⟩ = 0, for 1 ≤ i ≠ j ≤ n.
3.
The set {(1/√3)(1, −1, 1)^T, (1/√2)(0, 1, 1)^T, (1/√6)(2, 1, −1)^T} is an orthonormal set in ℝ³.
4.
Recall that ⟨f(x), g(x)⟩ = ∫_{−π}^{π} f(x)g(x) dx defines the standard inner product in C[−π, π]. Consider S = {1} ∪ {em | m ≥ 1} ∪ {fn | n ≥ 1}, where 1(x) = 1, em(x) = cos(mx) and fn(x) = sin(nx), for all m, n ≥ 1 and for all x ∈ [−π, π]. Then,
(a)
S is a linearly independent set.
(b)
∥1∥² = 2π, ∥em∥² = π and ∥fn∥² = π.
(c)
the functions in S are orthogonal.

Hence, {1/√(2π)} ∪ {(1/√π) em | m ≥ 1} ∪ {(1/√π) fn | n ≥ 1} is an orthonormal set in C[−π, π].

To proceed further, we consider a few examples for better understanding.

Example 5.2.3. Which point on the plane P is closest to the point, say Q?




Solution: Let y be the foot of the perpendicular from Q on P. Thus, by the Pythagoras Theorem, this point is unique. So, the question arises: how do we find y?

Note that ⃗yQ gives a normal vector of the plane P. Hence, ⃗0Q = y + ⃗yQ. So, we need to decompose ⃗0Q into two vectors such that one of them lies on the plane P and the other is orthogonal to the plane.

Thus, we see that given u, v ∈ V \ {0}, we need to find two vectors, say y and z, such that y is parallel to u and z is perpendicular to u. Here, ∥y∥ = ∥v∥ cos(θ) and ∥z∥ = ∥v∥ sin(θ), where θ is the angle between u and v.


[Figure 5.2: Decomposition of vector v]


We do this as follows (see Figure 5.2). Let û = u/∥u∥ be the unit vector in the direction of u. Then, using trigonometry, cos(θ) = ∥⃗OQ∥/∥⃗OP∥. Hence ∥⃗OQ∥ = ∥⃗OP∥ cos(θ). Now, using Definition 5.1.10, ∥⃗OQ∥ = ∥v∥ |⟨v, u⟩|/(∥v∥∥u∥) = |⟨v, u⟩|/∥u∥, where the absolute value is taken as the length/norm is a positive quantity. Thus,

⃗OQ = ∥⃗OQ∥ û = ⟨v, u/∥u∥⟩ (u/∥u∥).

Hence, y = ⃗OQ = ⟨v, u/∥u∥⟩ (u/∥u∥) and z = v − ⟨v, u/∥u∥⟩ (u/∥u∥). In the literature, the vector y = ⃗OQ is called the orthogonal projection of v on u, denoted Proj_u(v). Thus,

Proj_u(v) = ⟨v, u/∥u∥⟩ (u/∥u∥) = (⟨v, u⟩/∥u∥²) u and ∥Proj_u(v)∥ = ∥⃗OQ∥ = |⟨v, u⟩|/∥u∥.    (5.2.1)

Moreover, the distance of the point P from the line containing u equals ∥⃗PQ∥ = ∥v − ⟨v, u/∥u∥⟩ (u/∥u∥)∥.

Example 5.2.4.

1.
Determine the foot of the perpendicular from the point (1, 2, 3) on the XY-plane.
Solution: Verify that the required point is (1, 2, 0).
2.
Determine the foot of the perpendicular from the point Q = (1, 2, 3, 4) on the plane generated by (1, 1, 0, 0), (1, 0, 1, 0) and (0, 1, 1, 1).

Answer: (x, y, z, w) lies on the plane ⇔ x − y − z + 2w = 0 ⇔ ⟨(1, −1, −1, 2), (x, y, z, w)⟩ = 0.

So, the required point equals

(1, 2, 3, 4) − ⟨(1, 2, 3, 4), (1/√7)(1, −1, −1, 2)⟩ (1/√7)(1, −1, −1, 2)
  = (1, 2, 3, 4) − (4/7)(1, −1, −1, 2) = (1/7)(3, 18, 25, 20).
3.
Determine the projection of v = (1, 1, 1, 1)^T on u = (1, 1, −1, 0)^T.
Solution: By Equation (5.2.1), we have Proj_u(v) = (⟨v, u⟩/∥u∥²) u = (1/3)(1, 1, −1, 0)^T, and w = (1, 1, 1, 1)^T − Proj_u(v) = (1/3)(2, 2, 4, 3)^T is orthogonal to u.
4.
Let u = (1, 1, 1, 1)^T, v = (1, 1, −1, 0)^T, w = (1, 1, 0, −1)^T ∈ ℝ⁴. Write v = v1 + v2, where v1 is parallel to u and v2 is orthogonal to u. Also, write w = w1 + w2 + w3 such that w1 is parallel to u, w2 is parallel to v2 and w3 is orthogonal to both u and v2.
Solution: Note that
(a)
v1 = Proj_u(v) = (⟨v, u⟩/∥u∥²) u = (1/4)u = (1/4)(1, 1, 1, 1)^T is parallel to u.
(b)
v2 = v − (1/4)u = (1/4)(3, 3, −5, −1)^T is orthogonal to u.

Note that Proj_u(w) is parallel to u and Proj_{v2}(w) is parallel to v2. Hence, we have

(a)
w1 = Proj_u(w) = (⟨w, u⟩/∥u∥²) u = (1/4)u = (1/4)(1, 1, 1, 1)^T is parallel to u,
(b)
w2 = Proj_{v2}(w) = (⟨w, v2⟩/∥v2∥²) v2 = (7/44)(3, 3, −5, −1)^T is parallel to v2 and
(c)
w3 = w − w1 − w2 = (3/11)(1, 1, 2, −4)^T is orthogonal to both u and v2.
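The computations in Example 5.2.4 are easily reproduced on a computer. The following sketch (ours, assuming NumPy) implements Equation (5.2.1) and verifies the decomposition of Example 5.2.4.4:

import numpy as np

def proj(u, v):
    # orthogonal projection of v on u (u nonzero), Equation (5.2.1)
    return (v @ u) / (u @ u) * u

u = np.array([1.0, 1.0, 1.0, 1.0])
v = np.array([1.0, 1.0, -1.0, 0.0])
w = np.array([1.0, 1.0, 0.0, -1.0])

v1 = proj(u, v); v2 = v - v1
w1 = proj(u, w); w2 = proj(v2, w); w3 = w - w1 - w2
print(np.allclose(v2, np.array([3, 3, -5, -1]) / 4))        # True
print(np.allclose(w2, 7 / 44 * np.array([3, 3, -5, -1])))   # True
print(np.allclose(w3, 3 / 11 * np.array([1, 1, 2, -4])))    # True
print(np.isclose(w3 @ u, 0), np.isclose(w3 @ v2, 0))        # True True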

We now prove the first main result of this section.

Theorem 5.2.5. Let S = {u1, …, un} be an orthonormal subset of an ips V(F).

1.
Then, S is a linearly independent subset of V.
2.
Suppose v ∈ LS(S) with v = Σ_{i=1}^n αi ui, for some αi's in F. Then,
(a)
αi = ⟨v, ui⟩.
(b)
∥v∥² = ∥Σ_{i=1}^n αi ui∥² = Σ_{i=1}^n |αi|².
3.
Let z ∈ V and w = Σ_{i=1}^n ⟨z, ui⟩ ui. Then, z = w + (z − w) with ⟨z − w, w⟩ = 0, i.e., (z − w) ⊥ LS(S). Further, ∥z∥² = ∥w∥² + ∥z − w∥² ≥ ∥w∥².
4.
Let dim(V) = n. Then, ⟨v, ui⟩ = 0 for all i = 1, 2, …, n if and only if v = 0.

Proof. Part 1: Consider the linear system c1u1 + ⋯ + cnun = 0 in the variables c1, …, cn. As ⟨0, u⟩ = 0 and ⟨uj, ui⟩ = 0, for all j ≠ i, we have

0 = ⟨0, ui⟩ = ⟨c1u1 + ⋯ + cnun, ui⟩ = Σ_{j=1}^n cj⟨uj, ui⟩ = ci⟨ui, ui⟩ = ci.

Hence, ci = 0, for 1 ≤ i ≤ n. Thus, the above linear system has only the trivial solution. So, the set S is linearly independent.

Part 2: Note that ⟨v, ui⟩ = ⟨Σ_{j=1}^n αj uj, ui⟩ = Σ_{j=1}^n αj⟨uj, ui⟩ = αi⟨ui, ui⟩ = αi. This completes the first sub-part. For the second sub-part, we have

∥Σ_{i=1}^n αi ui∥² = ⟨Σ_{i=1}^n αi ui, Σ_{i=1}^n αi ui⟩ = Σ_{i=1}^n αi ⟨ui, Σ_{j=1}^n αj uj⟩
  = Σ_{i=1}^n αi Σ_{j=1}^n \overline{αj} ⟨ui, uj⟩ = Σ_{i=1}^n αi \overline{αi} ⟨ui, ui⟩ = Σ_{i=1}^n |αi|².

Part 3: Note that for 1 ≤ i ≤ n,

⟨z − w, ui⟩ = ⟨z, ui⟩ − ⟨w, ui⟩ = ⟨z, ui⟩ − ⟨Σ_{j=1}^n ⟨z, uj⟩ uj, ui⟩
  = ⟨z, ui⟩ − Σ_{j=1}^n ⟨z, uj⟩⟨uj, ui⟩ = ⟨z, ui⟩ − ⟨z, ui⟩ = 0.

So, (z − w) ⊥ LS(S). Hence, ⟨z − w, w⟩ = 0 as w ∈ LS(S). Further,

∥z∥² = ∥w + (z − w)∥² = ∥w∥² + ∥z − w∥² ≥ ∥w∥².

Part 4: Follows directly using Part 2b as {u1, …, un} is a basis of V. □

A rephrasing of Theorem 5.2.5.2b gives a generalization of the Pythagoras theorem, popularly known as Parseval's formula. The proof is left as an exercise for the reader.

Theorem 5.2.6. Let V be a finite dimensional ips with an orthonormal basis {v1, …, vn}. Then, for each x, y ∈ V,

⟨x, y⟩ = Σ_{i=1}^n ⟨x, vi⟩ \overline{⟨y, vi⟩}.

Furthermore, if x = y then ∥x∥² = Σ_{i=1}^n |⟨x, vi⟩|² (generalizing the Pythagoras Theorem).

As a corollary to Theorem 5.2.5, we have the following result.

Theorem 5.2.7 (Bessel's Inequality). Let V be an ips with {v1, …, vn} as an orthogonal set of nonzero vectors. Then, Σ_{k=1}^n |⟨z, vk⟩|²/∥vk∥² ≤ ∥z∥², for each z ∈ V. Equality holds if and only if z = Σ_{k=1}^n (⟨z, vk⟩/∥vk∥²) vk.

Proof. For 1 ≤ k ≤ n, define uk = vk/∥vk∥ and use Theorems 5.2.5.2b and 5.2.5.3 to get the required result. □

Remark 5.2.8. Using Theorem 5.2.5, we see that if B = [v1, …, vn] is an ordered orthonormal basis of an ips V then, for each u ∈ V, [u]_B = (⟨u, v1⟩, …, ⟨u, vn⟩)^T. Thus, in place of solving a linear system to get the coordinates of a vector, we just need to compute the inner products with the basis vectors.
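The following sketch (ours, assuming NumPy) illustrates Remark 5.2.8 with the orthonormal basis {(1/√3)(1, 1, 1)^T, (1/√2)(1, −1, 0)^T, (1/√6)(1, 1, −2)^T} (which reappears in Exercise 5.2.9.3 below): the coordinates come from inner products, with no linear system solved.

import numpy as np

B = np.column_stack([np.array([1, 1, 1]) / np.sqrt(3),
                     np.array([1, -1, 0]) / np.sqrt(2),
                     np.array([1, 1, -2]) / np.sqrt(6)])
u = np.array([2.0, 3.0, 1.0])

coords = B.T @ u                    # i-th coordinate is <u, v_i>
print(coords)                       # [2*sqrt(3), -1/sqrt(2), 3/sqrt(6)]
print(np.allclose(B @ coords, u))   # reconstruction: u = sum coords_i v_i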

Exercise 5.2.9.

1.
Find v, w ∈ ℝ³ such that v, w, (1, −1, −2)^T are mutually orthogonal.
2.
Let B = [(1/√2)(1, 1)^T, (1/√2)(1, −1)^T] be an ordered basis of ℝ². Then, [(x, y)^T]_B = ((x + y)/√2, (x − y)/√2)^T.
3.
For the ordered basis B = [(1/√3)(1, 1, 1)^T, (1/√2)(1, −1, 0)^T, (1/√6)(1, 1, −2)^T] of ℝ³, [(2, 3, 1)^T]_B = (2√3, −1/√2, 3/√6)^T.

In view of the importance of Theorem 5.2.5, we inquire into the question of extracting an orthonormal basis from a given basis. The process of extracting an orthonormal basis from a finite linearly independent set is called the Gram-Schmidt Orthonormalization process. Note that Theorem 5.2.5 also gives us an algorithm for doing so: from the given vector, subtract all the orthogonal projections/components. If the new vector is nonzero then this vector is orthogonal to the previous ones. The proof follows directly from Theorem 5.2.5 but we give it again for the sake of completeness.

Theorem 5.2.10 (Gram-Schmidt Orthogonalization Process). Let V be an ips. If {v1, …, vn} is a set of linearly independent vectors in V then there exists an orthonormal set {w1, …, wn} in V. Furthermore, LS(w1, …, wi) = LS(v1, …, vi), for 1 ≤ i ≤ n.

Proof. Note that for orthonormality, we need ∥wi∥ = 1, for 1 ≤ i ≤ n, and ⟨wi, wj⟩ = 0, for 1 ≤ i ≠ j ≤ n. Also, by Corollary 3.3.8.2, vi ∉ LS(v1, …, v_{i−1}), for 2 ≤ i ≤ n, as {v1, …, vn} is a linearly independent set. We are now ready to prove the result by induction.

Step 1: Define w1 = v1/∥v1∥; then LS(v1) = LS(w1).

Step 2: Define u2 = v2 − ⟨v2, w1⟩w1. Then, u2 ≠ 0 as v2 ∉ LS(v1). So, let w2 = u2/∥u2∥.

Note that {w1, w2} is orthonormal and LS(w1, w2) = LS(v1, v2).

Step 3: For induction, assume that we have obtained an orthonormal set {w1, …, w_{k−1}} such that LS(v1, …, v_{k−1}) = LS(w1, …, w_{k−1}). Now, note that
uk = vk − Σ_{i=1}^{k−1} ⟨vk, wi⟩wi = vk − Σ_{i=1}^{k−1} Proj_{wi}(vk) ≠ 0 as vk ∉ LS(v1, …, v_{k−1}). So, let us put wk = uk/∥uk∥. Then, {w1, …, wk} is orthonormal as ∥wk∥ = 1 and

∥uk∥⟨wk, w1⟩ = ⟨uk, w1⟩ = ⟨vk − Σ_{i=1}^{k−1} ⟨vk, wi⟩wi, w1⟩ = ⟨vk, w1⟩ − ⟨Σ_{i=1}^{k−1} ⟨vk, wi⟩wi, w1⟩
  = ⟨vk, w1⟩ − Σ_{i=1}^{k−1} ⟨vk, wi⟩⟨wi, w1⟩ = ⟨vk, w1⟩ − ⟨vk, w1⟩ = 0.

Similarly, ⟨wk, wi⟩ = 0, for 2 ≤ i ≤ k − 1. Clearly, wk = uk/∥uk∥ ∈ LS(w1, …, w_{k−1}, vk). So, wk ∈ LS(v1, …, vk).

As vk = ∥uk∥wk + Σ_{i=1}^{k−1} ⟨vk, wi⟩wi, we get vk ∈ LS(w1, …, wk). Hence, by the principle of mathematical induction, LS(w1, …, wk) = LS(v1, …, vk) and the required result follows. □
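The proof above is constructive. Here is a minimal computational sketch of it (ours, in Python with NumPy; not part of the text) that also discards zero residuals, as is needed when the input vectors are linearly dependent:

import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    basis = []
    for v in vectors:
        u = np.array(v, dtype=float)
        for w in basis:
            u -= (u @ w) * w          # subtract the projection on w
        norm = np.linalg.norm(u)
        if norm > tol:                # keep only nonzero residuals
            basis.append(u / norm)
    return basis

S = [(1, 0, 1, 0), (0, 1, 0, 1), (1, -1, 1, 1)]
for w in gram_schmidt(S):
    print(np.round(w, 4))
# prints (1/sqrt 2)(1,0,1,0), (1/sqrt 2)(0,1,0,1), (1/sqrt 2)(0,-1,0,1),
# matching Example 5.2.11.1 below.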

We now illustrate the Gram-Schmidt process with a few examples.

Example 5.2.11.

1.
Let S = {(1, −1, 1, 1), (1, 0, 1, 0), (0, 1, 0, 1)} ⊆ ℝ⁴. Find an orthonormal set T such that LS(S) = LS(T).
Solution: Let v1 = (1, 0, 1, 0)^T, v2 = (0, 1, 0, 1)^T and v3 = (1, −1, 1, 1)^T. Then, w1 = (1/√2)(1, 0, 1, 0)^T. As ⟨v2, w1⟩ = 0, we get w2 = (1/√2)(0, 1, 0, 1)^T. For the third vector, let u3 = v3 − ⟨v3, w1⟩w1 − ⟨v3, w2⟩w2 = (0, −1, 0, 1)^T. Thus, w3 = (1/√2)(0, −1, 0, 1)^T.
2.
Let S = {v1 = (2, 0, 0)^T, v2 = (3/2, 2, 0)^T, v3 = (1/2, 3/2, 0)^T, v4 = (1, 1, 1)^T}. Find an orthonormal set T such that LS(S) = LS(T).
Solution: Take w1 = v1/∥v1∥ = (1, 0, 0)^T = e1. For the second vector, consider u2 = v2 − (3/2)w1 = (0, 2, 0)^T. So, put w2 = u2/∥u2∥ = (0, 1, 0)^T = e2.

For the third vector, let u3 = v3 − Σ_{i=1}^{2} ⟨v3, wi⟩wi = (0, 0, 0)^T. So, v3 ∈ LS(w1, w2). Or equivalently, the set {v1, v2, v3} is linearly dependent.

So, to obtain the third element of the orthonormal set, define u4 = v4 − Σ_{i=1}^{2} ⟨v4, wi⟩wi. Then, u4 = v4 − w1 − w2 = e3. So w4 = e3. Hence, T = {w1, w2, w4} = {e1, e2, e3}.

3.
Find an orthonormal set in ℝ³ containing the unit vector in the direction of (1, 2, 1)^T.
Solution: Let (x, y, z)^T ∈ ℝ³ with ⟨(1, 2, 1), (x, y, z)⟩ = 0. Thus,

(x, y, z) = (−2y − z, y, z) = y(−2, 1, 0) + z(−1, 0, 1).

Observe that (−2, 1, 0) and (−1, 0, 1) are orthogonal to (1, 2, 1) but are themselves not orthogonal.

Method 1: Apply the Gram-Schmidt process to {(1/√6)(1, 2, 1)^T, (−2, 1, 0)^T, (−1, 0, 1)^T} ⊆ ℝ³.

Method 2: Valid only in ℝ³, using the cross product of two vectors.

In either case, verify that {(1/√6)(1, 2, 1), (1/√5)(−2, 1, 0), (1/√30)(1, 2, −5)} is the required set.

We now state two immediate corollaries without proof.

Corollary 5.2.12. Let V ≠ {0} be an ips. If

1.
V is finite dimensional then V has an orthonormal basis.
2.
S is a non-empty orthonormal set and dim(V) is finite then S can be extended to form an orthonormal basis of V.

Remark 5.2.13. Let S = {v1, …, vn} ⊆ V \ {0} be a non-empty subset of a finite dimensional vector space V. If we apply the Gram-Schmidt process to

1.
S then we obtain an orthonormal basis of LS(v1, …, vn).
2.
a re-arrangement of the elements of S then we may obtain another orthonormal basis of LS(v1, …, vn). But, observe that the sizes of the two bases will be the same.

Exercise 5.2.14.

1.
Let V be an ips with B = {v1, …, vn} as a basis. Then, prove that B is orthonormal if and only if, for each x ∈ V, x = Σ_{i=1}^n ⟨x, vi⟩vi. [Hint: Since B is a basis, each x ∈ V has a unique linear combination in terms of the vi's.]
2.
Let S be a subset of V having 101 elements. Suppose that the application of the Gram-Schmidt process yields u5 = 0. Does it imply that LS(v1, …, v5) = LS(v1, …, v4)? Give reasons for your answer.
3.
Let B = {v1, …, vn} be an orthonormal set in ℝ^n. For 1 ≤ k ≤ n, define Ak = Σ_{i=1}^k vi vi^T. Then, prove that Ak^T = Ak and Ak² = Ak. Thus, the Ak's are projection matrices.
4.
Determine an orthonormal basis of ℝ⁴ containing (1, −2, 1, 3)^T and (2, 1, −3, 1)^T.
5.
Let x ∈ ℝ^n with ∥x∥ = 1.
(a)
Then, prove that {x} can be extended to form an orthonormal basis of ℝ^n.
(b)
Let the extended basis be {x, x2, …, xn} and B = [e1, …, en] the standard ordered basis of ℝ^n. Prove that A = [[x]_B, [x2]_B, …, [xn]_B] is an orthogonal matrix.
6.
Let v, w ∈ ℝ^n, n ≥ 1, with ∥v∥ = ∥w∥ = 1. Prove that there exists an orthogonal matrix A such that Av = w. Prove also that A can be chosen such that det(A) = 1.
7.
Let (V, ⟨·, ·⟩) be an n-dimensional ips. If u ∈ V with ∥u∥ = 1 then give reasons for the following statements.
(a)
Let S = {v ∈ V | ⟨v, u⟩ = 0}. Then, dim(S) = n − 1.
(b)
Let 0 ≠ β ∈ F. Then, {v ∈ V : ⟨v, u⟩ = β} is not a subspace of V.
(c)
Let v ∈ V. Then, v = v0 + ⟨v, u⟩u for a vector v0 ∈ S. That is, V = LS(u, S).

5.2.1 Application to Fundamental Spaces

We end this section by proving the fundamental theorem of linear algebra. So, the readers are advised to recall the four fundamental subspaces and also to go through Theorem 3.5.9 (the rank-nullity theorem for matrices). We start with the following result.

Lemma 5.2.15. Let A ∈ Mm,n(ℝ). Then, Null(A) = Null(A^T A).

Proof. Let x ∈ Null(A). Then, Ax = 0. So, (A^T A)x = A^T(Ax) = A^T 0 = 0. Thus, x ∈ Null(A^T A). That is, Null(A) ⊆ Null(A^T A).

Suppose that x ∈ Null(A^T A). Then, (A^T A)x = 0 and 0 = x^T 0 = x^T(A^T A)x = (Ax)^T(Ax) = ∥Ax∥². Thus, Ax = 0 and the required result follows. □

Theorem 5.2.16 (Fundamental Theorem of Linear Algebra). Let A ∈ Mn(ℂ). Then,

1.
dim(Null(A)) + dim(Col(A)) = n.
2.
Null(A) = (Col(A*))⊥ and Null(A*) = (Col(A))⊥.
3.
dim(Col(A)) = dim(Col(A*)).

Proof. Part 1: Proved in Theorem 3.5.9.

Part 2: We first prove that Null(A) ⊆ (Col(A*))⊥. Let x ∈ Null(A). Then, Ax = 0 and

0 = ⟨0, u⟩ = ⟨Ax, u⟩ = u*Ax = (A*u)*x = ⟨x, A*u⟩, for all u ∈ ℂ^n.

But Col(A*) = {A*u | u ∈ ℂ^n}. Thus, x ∈ (Col(A*))⊥ and Null(A) ⊆ (Col(A*))⊥.

We now prove that (Col(A*))⊥ ⊆ Null(A). Let x ∈ (Col(A*))⊥. Then, for every y ∈ ℂ^n,

0 = ⟨x, A*y⟩ = (A*y)*x = y*(A*)*x = y*Ax = ⟨Ax, y⟩.

In particular, for y = Ax ∈ ℂ^n, we get ∥Ax∥² = 0. Hence Ax = 0. That is, x ∈ Null(A). Thus, the proof of the first equality in Part 2 is over. We omit the second equality as it proceeds on the same lines as above.

Part 3: Use the first two parts to get the required result.

Hence, the proof of the fundamental theorem is complete. □

Remark 5.2.17. Theorem 5.2.16.2 implies that Null(A) = (Col(A*))⊥. This statement is just stating the usual fact that if x ∈ Null(A) then Ax = 0 and hence the usual dot product of every row of A with x equals 0.

As an implication of Theorem 5.2.16.2 and Theorem 5.2.16.3, we show the existence of an invertible linear map T : Col(A*) → Col(A).

Corollary 5.2.18. Let A ∈ Mn(ℂ). Then, the function T : Col(A*) → Col(A) defined by T(x) = Ax is invertible.

Proof. In view of Theorem 5.2.16.3 and the rank-nullity theorem, we just need to show that the map is one-one. So, suppose that there exist x, y ∈ Col(A*) such that T(x) = T(y), or equivalently, Ax = Ay. Thus, x − y ∈ Null(A) = (Col(A*))⊥ (by Theorem 5.2.16.2). Therefore, x − y ∈ (Col(A*))⊥ ∩ Col(A*) = {0}. Thus, x = y and hence the map is one-one. Thus, the required result follows. □

The readers should look at Example 3.2.3 and Remark 3.2.4. We give one more example.

Example 5.2.19. Let A = [1 1 0; 2 1 1; 3 2 1]. Then, verify that

1.
{(0, 1, 1)^T, (1, 1, 2)^T} is a basis of Col(A).
2.
{(1, 1, −1)^T} is a basis of Null(A^T).
3.
Null(A^T) = (Col(A))⊥.

Exercise 5.2.20.

1.
Find distinct subspaces W1 and W2
(a)
in ℝ² such that W1 and W2 are orthogonal but not orthogonal complements of each other.
(b)
in ℝ³ such that W1 ≠ {0} and W2 ≠ {0} are orthogonal, but not orthogonal complements of each other.
2.
Let A ∈ Mm,n(ℂ). Then, Null(A) = Null(A*A).
3.
Let A ∈ Mm,n(ℝ). Then, Col(A^T) = Col(A^T A).
4.
Let A ∈ Mm,n(ℝ). Then, Rank(A) = n if and only if Rank(A^T A) = n.
5.
Let A ∈ Mm,n(ℝ). Then, for every
(a)
x ∈ ℝ^n, x = u + v, where u ∈ Col(A^T) and v ∈ Null(A) are unique.
(b)
y ∈ ℝ^m, y = w + z, where w ∈ Col(A) and z ∈ Null(A^T) are unique.

For more information related to the fundamental theorem of linear algebra, the interested readers are advised to see the article: Gilbert Strang, "The Fundamental Theorem of Linear Algebra", The American Mathematical Monthly, Vol. 100, No. 9 (Nov. 1993), pp. 848-855.

5.2.2 QR Decomposition*

The next result gives the proof of the QR decomposition for real matrices. The readers are advised to prove similar results for matrices with complex entries. This decomposition and its generalizations are helpful in the numerical calculations related to eigenvalue problems (see Chapter 6).

Theorem 5.2.1 (QR Decomposition). Let A ∈ Mn(ℝ) be invertible. Then, there exist matrices Q and R such that Q is orthogonal and R is upper triangular with A = QR. Furthermore, the diagonal entries of R can be chosen to be positive and, in this case, the decomposition is unique.

Proof. As A is invertible, its columns form a basis of ℝ^n. So, an application of the Gram-Schmidt orthonormalization process to {A[:, 1], …, A[:, n]} gives an orthonormal basis {v1, …, vn} of ℝ^n satisfying

LS(A[:, 1], …, A[:, i]) = LS(v1, …, vi), for 1 ≤ i ≤ n.

Since A[:, i] ∈ LS(v1, …, vi), for 1 ≤ i ≤ n, there exist αji ∈ ℝ, 1 ≤ j ≤ i, such that A[:, i] = [v1, …, vi](α1i, …, αii)^T. Thus, if Q = [v1, …, vn] and R = [α11 α12 ⋯ α1n; 0 α22 ⋯ α2n; ⋮ ⋮ ⋱ ⋮; 0 0 ⋯ αnn] then

1.
Q is an orthogonal matrix (see Exercise 5.4.8.5),
2.
R is an upper triangular matrix, and
3.
A = QR.

Thus, this completes the proof of the first part. Note that

1.
αii ≠ 0, for 1 ≤ i ≤ n, as A[:, 1] ≠ 0 and A[:, i] ∉ LS(v1, …, v_{i−1}).
2.
if αii < 0, for some i, 1 ≤ i ≤ n, then we can replace vi in Q by −vi to get a new Q and R in which the diagonal entries of R are positive.

Uniqueness: Suppose Q1R1 = Q2R2 for some orthogonal matrices Qi's and upper triangular matrices Ri's with positive diagonal entries. As the Qi's and Ri's are invertible, we get Q2⁻¹Q1 = R2R1⁻¹. Now, using

1.
Exercises 2.3.25.1 and 1.2.15.1, the matrix R2R1⁻¹ is an upper triangular matrix.
2.
Exercise 1.3.2.3, Q2⁻¹Q1 is an orthogonal matrix.

So, the matrix R2R1⁻¹ is an orthogonal upper triangular matrix and hence, by Exercise 1.2.11.4, R2R1⁻¹ = In. So, R2 = R1 and therefore Q2 = Q1. □
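The proof is constructive: Gram-Schmidt on the columns of A produces Q, and the coefficients αij fill R, as in Remark 5.2.4.1 below. A minimal sketch of this construction (ours, assuming NumPy; the normalization step fails exactly when A is singular):

import numpy as np

def qr_gram_schmidt(A):
    A = np.array(A, dtype=float)
    n = A.shape[1]
    Q = np.zeros_like(A)
    R = np.zeros((n, n))
    for j in range(n):
        u = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]   # alpha_ij = <v_i, A[:, j]>
            u -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(u)       # positive by construction
        Q[:, j] = u / R[j, j]
    return Q, R

A = np.array([[1, 0, 1, 2], [0, 1, -1, 1], [1, 0, 1, 1], [0, 1, 1, 1]])
Q, R = qr_gram_schmidt(A)                 # the matrix of Example 5.2.3.1 below
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(4)))  # True True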

Let A be an n × k matrix with Rank(A) = r. Then, by Remark 5.2.13, an application of the Gram-Schmidt orthonormalization process to the columns of A yields an orthonormal set {v1, …, vr} ⊆ ℝ^n such that

LS(A[:, 1], …, A[:, j]) = LS(v1, …, vi), for 1 ≤ i ≤ j ≤ k.

Hence, proceeding on the lines of the above theorem, we have the following result.

Theorem 5.2.2 (Generalized QR Decomposition). Let A be an n × k matrix of rank r. Then, A = QR, where

1.
Q = [v1, …, vr] is an n × r matrix with Q^T Q = Ir,
2.
LS(A[:, 1], …, A[:, j]) = LS(v1, …, vi), for 1 ≤ i ≤ j ≤ k, and
3.
R is an r × k matrix with Rank(R) = r.

Example 5.2.3.

1.
Let A = [1 0 1 2; 0 1 −1 1; 1 0 1 1; 0 1 1 1]. Find an orthogonal matrix Q and an upper triangular matrix R such that A = QR.
Solution: From Example 5.2.11, we know that w1 = (1/√2)(1, 0, 1, 0)^T, w2 = (1/√2)(0, 1, 0, 1)^T and w3 = (1/√2)(0, −1, 0, 1)^T. We now compute w4. If v4 = (2, 1, 1, 1)^T then

u4 = v4 − ⟨v4, w1⟩w1 − ⟨v4, w2⟩w2 − ⟨v4, w3⟩w3 = (1/2)(1, 0, −1, 0)^T.

Thus, w4 = (1/√2)(1, 0, −1, 0)^T. Hence, we see that A = QR with

Q = [w1, …, w4] = [1/√2 0 0 1/√2; 0 1/√2 −1/√2 0; 1/√2 0 0 −1/√2; 0 1/√2 1/√2 0] and
R = [√2 0 √2 3/√2; 0 √2 0 √2; 0 0 √2 0; 0 0 0 1/√2].
2.
Let A = [1 1 1 0; −1 0 −2 1; 1 1 1 0; 1 0 2 1]. Find a 4 × 3 matrix Q satisfying Q^T Q = I3 and an upper triangular matrix R such that A = QR.
Solution: Let us apply the Gram-Schmidt orthonormalization process to the columns of A. As v1 = (1, −1, 1, 1)^T, we get w1 = (1/2)v1. Let v2 = (1, 0, 1, 0)^T. Then,

u2 = v2 − ⟨v2, w1⟩w1 = (1, 0, 1, 0)^T − w1 = (1/2)(1, 1, 1, −1)^T.

Hence, w2 = (1/2)(1, 1, 1, −1)^T. Let v3 = (1, −2, 1, 2)^T. Then,

u3 = v3 − ⟨v3, w1⟩w1 − ⟨v3, w2⟩w2 = v3 − 3w1 + w2 = 0.

So, we next take v3 = (0, 1, 0, 1)^T, the fourth column of A. Then,

u3 = v3 − ⟨v3, w1⟩w1 − ⟨v3, w2⟩w2 = v3 − 0·w1 − 0·w2 = v3.

So, w3 = (1/√2)(0, 1, 0, 1)^T. Hence,

Q = [w1, w2, w3] = [1/2 1/2 0; −1/2 1/2 1/√2; 1/2 1/2 0; 1/2 −1/2 1/√2] and R = [2 1 3 0; 0 1 −1 0; 0 0 0 √2].

The readers are advised to check the following:
(a)
Rank(A) = 3,
(b)
A = QR with Q^T Q = I3, and
(c)
R is a 3 × 4 upper triangular matrix with Rank(R) = 3.

Remark 5.2.4. Let A ∈ Mm,n(ℝ).

1.
If A = QR with Q = [v1, …, vn] then

R = [⟨v1, A[:, 1]⟩ ⟨v1, A[:, 2]⟩ ⟨v1, A[:, 3]⟩ ⋯; 0 ⟨v2, A[:, 2]⟩ ⟨v2, A[:, 3]⟩ ⋯; 0 0 ⟨v3, A[:, 3]⟩ ⋯; ⋮ ⋮ ⋮ ⋱].

In case Rank(A) < n, a slight modification gives the matrix R.
2.
Further, let Rank(A) = n.
(a)
Then, A^T A is invertible (see Exercise 5.2.20.4).
(b)
By Theorem 5.2.2, A = QR with Q a matrix of size m × n and R an upper triangular matrix of size n × n. Also, Q^T Q = In and Rank(R) = n.
(c)
Thus, A^T A = R^T Q^T QR = R^T R. As A^T A is invertible, the matrix R^T R is invertible. Since R is a square matrix, by Exercise 2.3.5.1, the matrix R itself is invertible. Hence, (R^T R)⁻¹ = R⁻¹(R^T)⁻¹.
(d)
So, if Q = [v1, …, vn] then

A(A^T A)⁻¹A^T = QR(R^T R)⁻¹R^T Q^T = (QR)(R⁻¹(R^T)⁻¹)R^T Q^T = QQ^T.
(e)
Hence, using Theorem 5.3.7, we see that the matrix

P = A(A^T A)⁻¹A^T = QQ^T = [v1, …, vn][v1^T; ⋮; vn^T] = Σ_{i=1}^n vi vi^T

is the orthogonal projection matrix on Col(A).
3.
Further, let Rank(A) = r < n. If j1, …, jr are the pivot columns of A then Col(A) = Col(B), where B = [A[:, j1], …, A[:, jr]] is an m × r matrix with Rank(B) = r. So, using Part 2e, we see that B(B^T B)⁻¹B^T is the orthogonal projection matrix on Col(A). So, compute the RREF of A and choose the columns of A corresponding to the pivot columns.

5.3 Orthogonal Projections and Applications

Till now, our main interest was to understand the linear system Ax = b, for A ∈ Mm,n(ℝ), x ∈ ℝ^n and b ∈ ℝ^m, from different viewpoints. But, in most practical situations the system has no solution. So, we are interested in finding a point x0 ∈ ℝ^n for which the error err = b − Ax0 is smallest. Thus, we consider the problem of finding x0 ∈ ℝ^n such that

∥err∥ = ∥b − Ax0∥ = min{∥b − Ax∥ : x ∈ ℝ^n},

i.e., we try to find the x0 for which the vector Ax0 ∈ Col(A) is nearest to b.

To begin with, recall the following result.

Theorem 5.3.1 (Decomposition). Let V be an ips having W as a finite dimensional subspace. Suppose {f1, …, fk} is an orthonormal basis of W. Then, for each b ∈ V, y = Σ_{i=1}^k ⟨b, fi⟩fi is the closest point in W to b. Furthermore, b − y ∈ W⊥.

We now give a definition and then an implication of Theorem 5.3.1.

Definition 5.3.2. [Orthogonal Projection] Let W be a finite dimensional subspace of an ips V. Then, by Theorem 5.3.1, for each v ∈ V there exist unique vectors w ∈ W and u ∈ W⊥ with v = w + u. We thus define the orthogonal projection of V onto W, denoted P_W, by

P_W : V → W with P_W(v) = w.

The vector w is called the projection of v on W.

Remark 5.3.3. Let A ∈ Mm,n(ℝ) and W = Col(A). Then, to find the orthogonal projection P_W(b), we can use either of the following ideas:

1.
Determine an orthonormal basis {f1, …, fk} of Col(A) and get P_W(b) = Σ_{i=1}^k ⟨b, fi⟩fi.
2.
By Theorem 5.2.16.2, (Col(A))⊥ = Null(A^T). Hence, for b ∈ ℝ^m there exist unique u ∈ Col(A) and v ∈ Null(A^T) such that b = u + v. Thus, using Definition 5.3.2 and Theorem 5.3.1, P_W(b) = u.

Before proceeding to projections, we give an application of Theorem 5.3.1 to a linear system.

Corollary 5.3.4. Let A ∈ Mm,n(ℝ) and b ∈ ℝ^m. Then, every least squares solution of Ax = b is a solution of the system A^T Ax = A^T b. Conversely, every solution of A^T Ax = A^T b is a least squares solution of Ax = b.

Proof. As b ∈ ℝ^m, by Remark 5.3.3, there exist y ∈ Col(A) and v ∈ Null(A^T) such that b = y + v and min{∥b − w∥ | w ∈ Col(A)} = ∥b − y∥. As y ∈ Col(A), there exists x0 ∈ ℝ^n such that Ax0 = y, i.e., x0 is a least squares solution of Ax = b. Hence,

(A^T A)x0 = A^T(Ax0) = A^T y = A^T(b − v) = A^T b − 0 = A^T b.

Conversely, let x1 ∈ ℝ^n be a solution of A^T Ax = A^T b, i.e., A^T(Ax1 − b) = 0. We need to show that

min{∥b − Ax∥ | x ∈ ℝ^n} = ∥b − Ax1∥.

Note that A^T(Ax1 − b) = 0 implies 0 = (x − x1)^T A^T(Ax1 − b) = (Ax − Ax1)^T(Ax1 − b) and hence ⟨b − Ax1, A(x − x1)⟩ = 0. Thus,

∥b − Ax∥² = ∥b − Ax1 + Ax1 − Ax∥² = ∥b − Ax1∥² + ∥Ax1 − Ax∥² ≥ ∥b − Ax1∥².

Hence, the required result follows. □

The above corollary gives the following result.

Corollary 5.3.5. Let A ∈ Mm,n(ℝ) and b ∈ ℝ^m. If

1.
A^T A is invertible then the least squares solution of Ax = b equals x = (A^T A)⁻¹A^T b.
2.
A^T A is not invertible then a least squares solution of Ax = b equals x = (A^T A)⁻A^T b, where (A^T A)⁻ is the pseudo-inverse of A^T A.

Proof. Part 1 directly follows from Corollary 5.3.4. For Part 2, let b = y + v, for y ∈ Col(A) and v ∈ Null(A^T). As y ∈ Col(A), there exists x0 ∈ ℝ^n such that Ax0 = y. Thus, by Remark 5.3.3, A^T b = A^T(y + v) = A^T y = A^T Ax0. Now, using the definition of the pseudo-inverse (see Exercise 1.3.7.14), we see that

(A^T A)((A^T A)⁻A^T b) = (A^T A)(A^T A)⁻(A^T A)x0 = (A^T A)x0 = A^T b.

Thus, we see that (A^T A)⁻A^T b is a solution of the system A^T Ax = A^T b. Hence, by Corollary 5.3.4, the required result follows. □
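A small numerical sketch of Corollary 5.3.5.1 (ours, assuming NumPy): the normal equations reproduce the least squares solution returned by numpy.linalg.lstsq.

import numpy as np

A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])   # fit a line to 3 points
b = np.array([1.0, 0.0, 2.0])

x_normal = np.linalg.solve(A.T @ A, A.T @ b)         # A^T A x = A^T b
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x_normal, x_lstsq))                # True
print(np.allclose(A.T @ (b - A @ x_normal), 0))      # residual lies in Null(A^T)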

We now give a few examples to understand projections.

Example 5.3.6. Use the fundamental theorem of linear algebra to compute the vector of the orthogonal projection.

1.
Determine the projection of (1, 1, 1, 1, 1)^T on Null([1, −1, 1, −1, 1]).
Solution: Here A = [1, −1, 1, −1, 1]. So, a basis of Col(A^T) equals {(1, −1, 1, −1, 1)^T} and that of Null(A) equals {(1, 1, 0, 0, 0)^T, (1, 0, −1, 0, 0)^T, (1, 0, 0, 1, 0)^T, (1, 0, 0, 0, −1)^T}. Then, the solution of the linear system Bx = (1, 1, 1, 1, 1)^T, where

B = [1 1 1 1 1; 1 0 0 0 −1; 0 −1 0 0 1; 0 0 1 0 −1; 0 0 0 −1 1],

equals x = (1/5)(6, −4, 6, −4, 1)^T. Thus, the projection is

(1/5)(6(1, 1, 0, 0, 0)^T − 4(1, 0, −1, 0, 0)^T + 6(1, 0, 0, 1, 0)^T − 4(1, 0, 0, 0, −1)^T) = (2/5)(2, 3, 2, 3, 2)^T.
2.
Determine the projection of (1, 1, 1)^T on Null([1, 1, −1]).
Solution: Here A = [1, 1, −1]. So, a basis of Null(A) equals {(1, −1, 0)^T, (1, 0, 1)^T} and that of Col(A^T) equals {(1, 1, −1)^T}. Then, the solution of the linear system Bx = (1, 1, 1)^T, where B = [1 1 1; −1 0 1; 0 1 −1], equals x = (1/3)(−2, 4, 1)^T. Thus, the projection is (1/3)((−2)(1, −1, 0)^T + 4(1, 0, 1)^T) = (2/3)(1, 1, 2)^T.
3.
Determine the projection of (1, 1, 1)^T on Col([1, 2, 1]^T).
Solution: Here, A^T = [1, 2, 1], a basis of Col(A) equals {(1, 2, 1)^T} and that of Null(A^T) equals {(1, 0, −1)^T, (2, −1, 0)^T}. Then, using the solution of the linear system Bx = (1, 1, 1)^T, where B = [1 2 1; 0 −1 2; −1 0 1], gives (2/3)(1, 2, 1)^T as the required vector.

To use the first idea in Remark 5.3.3, we prove the following result which helps us to get the matrix of the orthogonal projection from an orthonormal basis.

Theorem 5.3.7. Let {f1, …, fk} be an orthonormal basis of a finite dimensional subspace W of an ips V. Then, P_W = Σ_{i=1}^k fi fi*.

Proof. Let v ∈ V. Then,

P_W v = (Σ_{i=1}^k fi fi*) v = Σ_{i=1}^k fi (fi* v) = Σ_{i=1}^k ⟨v, fi⟩fi.

As P_W v is the unique closest point (see Theorem 5.3.1), the required result follows. □
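For instance, the following sketch (ours, assuming NumPy) rebuilds the projection matrices of Example 5.3.8.2 below from an orthonormal basis of W = {x + y − z = 0}:

import numpy as np

f1 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)
f2 = np.array([1.0, 1.0, 2.0]) / np.sqrt(6)
PW = np.outer(f1, f1) + np.outer(f2, f2)   # P_W = sum f_i f_i^T
print(np.round(3 * PW))                    # [[2 -1 1], [-1 2 1], [1 1 2]]

n = np.array([1.0, 1.0, -1.0])             # normal vector spans W-perp
PWp = np.outer(n, n) / (n @ n)
print(np.allclose(PW + PWp, np.eye(3)))    # P_W + P_{W-perp} = I_3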

Example 5.3.8. In each of the following, determine the matrix of the orthogonal projection. Also, verify that P_W + P_{W⊥} = I. What can you say about Rank(P_W) and Rank(P_{W⊥})? Also, verify the orthogonal projection vectors obtained in Example 5.3.6.

1.
W = {(x1, …, x5)^T ∈ ℝ⁵ | x1 − x2 + x3 − x4 + x5 = 0} = Null([1, −1, 1, −1, 1]).
Solution: An orthonormal basis of W is

{(1/√2)(1, 1, 0, 0, 0)^T, (1/√2)(0, 0, 1, 1, 0)^T, (1/√6)(1, −1, 0, 0, −2)^T, (1/√30)(−2, 2, 3, −3, −2)^T}.

Thus,

P_W = Σ_{i=1}^4 fi fi^T = (1/5)[4 1 −1 1 −1; 1 4 1 −1 1; −1 1 4 1 −1; 1 −1 1 4 1; −1 1 −1 1 4] and
P_{W⊥} = (1/5)[1 −1 1 −1 1; −1 1 −1 1 −1; 1 −1 1 −1 1; −1 1 −1 1 −1; 1 −1 1 −1 1].
2.
W = {(x, y, z)^T ∈ ℝ³ | x + y − z = 0} = Null([1, 1, −1]).
Solution: Note that {(1, 1, −1)} is a basis of W⊥ and {(1/√2)(1, −1, 0), (1/√6)(1, 1, 2)} an orthonormal basis of W. So,

P_{W⊥} = (1/3)[1 1 −1; 1 1 −1; −1 −1 1] and P_W = (1/3)[2 −1 1; −1 2 1; 1 1 2].

Verify that P_W + P_{W⊥} = I3, Rank(P_W) = 2 and Rank(P_{W⊥}) = 1.
3.
W = LS((1, 2, 1)) = Col([1, 2, 1]^T) ⊆ ℝ³.
Solution: Using Example 5.2.11.3 and Equation (5.2.1),

W⊥ = LS({(−2, 1, 0), (−1, 0, 1)}) = LS({(−2, 1, 0), (1, 2, −5)}).

So, P_W = (1/6)[1 2 1; 2 4 2; 1 2 1] and P_{W⊥} = (1/6)[5 −2 −1; −2 2 −2; −1 −2 5].

We advise the readers to give a proof of the next result.

Theorem 5.3.9. Let {f1, …, fk} be an orthonormal basis of a subspace W of ℝ^n. If {f1, …, fn} is an extended orthonormal basis of ℝ^n, P_W = Σ_{i=1}^k fi fi^T and P_{W⊥} = Σ_{i=k+1}^n fi fi^T then prove that

1.
In − P_W = P_{W⊥}.
2.
(P_W)^T = P_W and (P_{W⊥})^T = P_{W⊥}. That is, P_W and P_{W⊥} are symmetric.
3.
(P_W)² = P_W and (P_{W⊥})² = P_{W⊥}. That is, P_W and P_{W⊥} are idempotent.
4.
P_W ∘ P_{W⊥} = P_{W⊥} ∘ P_W = 0.

Exercise 5.3.10.

1.
Let W = {(x, y, z, w) ∈ ℝ⁴ : x = y, z = w} be a subspace of ℝ⁴. Determine the matrix of the orthogonal projection onto W.
2.
Let P_{W1} and P_{W2} be the orthogonal projections of ℝ² onto W1 = {(x, 0) : x ∈ ℝ} and W2 = {(x, x) : x ∈ ℝ}, respectively. Note that P_{W1} ∘ P_{W2} is a projection onto W1, but it is not an orthogonal projection. Hence or otherwise, conclude that the composition of two orthogonal projections need not be an orthogonal projection.
3.
Let A = [1 1; 0 0]. Then, A is idempotent but not symmetric. Now, define P : ℝ² → ℝ² by P(v) = Av, for all v ∈ ℝ². Then,
(a)
P is idempotent.
(b)
Null(P) ∩ Rng(P) = Null(A) ∩ Col(A) = {0}.
(c)
ℝ² = Null(P) + Rng(P). But, (Rng(P))⊥ = (Col(A))⊥ ≠ Null(A).
(d)
Since (Col(A))⊥ ≠ Null(A), the map P is not an orthogonal projector. In this case, P is called a projection of ℝ² onto Rng(P) along Null(P).
4.
Find all 2 × 2 real matrices A such that A² = A. Hence, or otherwise, determine all projection operators of ℝ².
5.
Let W be an (n − 1)-dimensional subspace of ℝ^n with ordered basis B_W = [f1, …, f_{n−1}]. Suppose B = [f1, …, f_{n−1}, fn] is an orthonormal ordered basis of ℝ^n obtained by extending B_W. Now, define a function Q : ℝ^n → ℝ^n by Q(v) = Σ_{i=1}^{n−1} ⟨v, fi⟩fi − ⟨v, fn⟩fn. Then,
(a)
Q fixes every vector in W.
(b)
Q sends every vector w ∈ W⊥ to −w.
(c)
Q ∘ Q = In.

The function Q is called the reflection operator with respect to W.

5.3.1 Orthogonal Projections as Self-Adjoint Operators*

Theorem 5.3.9 implies that the matrix of the projection operator is symmetric. We use this idea to proceed further.

Definition 5.3.11. [Self-Adjoint Operator] Let V be an ips with inner product ⟨·, ·⟩. A linear operator P : V → V is called self-adjoint if ⟨P(v), u⟩ = ⟨v, P(u)⟩, for every u, v ∈ V.

A careful understanding of the examples given below shows that self-adjoint operators and Hermitian matrices are related. It also shows that the vector spaces ℝ^n and ℂ^n can be decomposed in terms of the null space and column space of Hermitian matrices. These decompositions also follow directly from the fundamental theorem of linear algebra.

Example 5.3.12.

1.
Let A be an n × n real symmetric matrix. If P : ℝ^n → ℝ^n is defined by P(x) = Ax, for every x ∈ ℝ^n, then
(a)
P is a self-adjoint operator as A = A^T implies, for every x, y ∈ ℝ^n,

⟨P(x), y⟩ = y^T Ax = (y^T A^T)x = (Ay)^T x = ⟨x, Ay⟩ = ⟨x, P(y)⟩.
(b)
Null(P) = (Rng(P))⊥ as A = A^T. Thus, ℝ^n = Null(P) ⊕ Rng(P).
2.
Let A be an n × n Hermitian matrix. If P : ℂ^n → ℂ^n is defined by P(z) = Az, for all z ∈ ℂ^n, then using similar arguments (see Example 5.3.12.1) prove the following:
(a)
P is a self-adjoint operator.
(b)
Null(P) = (Rng(P))⊥ as A = A*. Thus, ℂ^n = Null(P) ⊕ Rng(P).

We now state and prove the main result related to orthogonal projection operators.

Theorem 5.3.13. Let V be a finite dimensional ips. If V = W ⊕ W⊥ then the orthogonal projectors P_W : V → V on W and P_{W⊥} : V → V on W⊥ satisfy

1.
Null(P_W) = {v ∈ V : P_W(v) = 0} = W⊥ = Rng(P_{W⊥}).
2.
Rng(P_W) = {P_W(v) : v ∈ V} = W = Null(P_{W⊥}).
3.
P_W ∘ P_W = P_W, P_{W⊥} ∘ P_{W⊥} = P_{W⊥} (Idempotent).
4.
P_{W⊥} ∘ P_W = 0_V and P_W ∘ P_{W⊥} = 0_V, where 0_V(v) = 0, for all v ∈ V.
5.
P_W + P_{W⊥} = I_V, where I_V(v) = v, for all v ∈ V.
6.
The operators P_W and P_{W⊥} are self-adjoint.

Proof. Part 1: As V = W ⊕ W⊥, for each u ∈ W⊥, one uniquely writes u = 0 + u, where 0 ∈ W and u ∈ W⊥. Hence, by definition, P_W(u) = 0 and P_{W⊥}(u) = u. Thus, W⊥ ⊆ Null(P_W) and W⊥ ⊆ Rng(P_{W⊥}).

Now suppose that v ∈ Null(P_W). So, P_W(v) = 0. As V = W ⊕ W⊥, v = w + u, for unique w ∈ W and unique u ∈ W⊥. So, by definition, P_W(v) = w. Thus, w = P_W(v) = 0. That is, v = 0 + u = u ∈ W⊥. Thus, Null(P_W) ⊆ W⊥.

A similar argument implies Rng(P_{W⊥}) ⊆ W⊥, thus completing the proof of the first part.

Part 2: Use an argument similar to the proof of Part 1.

Part 3, Part 4 and Part 5: Let v ∈ V. Then, v = w + u, for unique w ∈ W and unique u ∈ W⊥. Thus, by definition,

(P_W ∘ P_W)(v) = P_W(P_W(v)) = P_W(w) = w and P_W(v) = w,
(P_{W⊥} ∘ P_W)(v) = P_{W⊥}(P_W(v)) = P_{W⊥}(w) = 0, and
(P_W + P_{W⊥})(v) = P_W(v) + P_{W⊥}(v) = w + u = v = I_V(v).

Hence, P_W ∘ P_W = P_W, P_{W⊥} ∘ P_W = 0_V and I_V = P_W + P_{W⊥}.

Part 6: Let u = w1 + x1 and v = w2 + x2, for unique w1, w2 ∈ W and unique x1, x2 ∈ W⊥. Then, by definition, ⟨wi, xj⟩ = 0 for 1 ≤ i, j ≤ 2. Thus,

⟨P_W(u), v⟩ = ⟨w1, v⟩ = ⟨w1, w2⟩ = ⟨u, w2⟩ = ⟨u, P_W(v)⟩,

and the proof of the theorem is complete. □

Remark 5.3.14. Theorem 5.3.13 gives us the following:

1.
The orthogonal projectors P_W and P_{W⊥} are idempotent and self-adjoint.
2.
Let v ∈ V. Then, v − P_W(v) = (I_V − P_W)(v) = P_{W⊥}(v) ∈ W⊥. Thus, ⟨v − P_W(v), w⟩ = 0, for every v ∈ V and w ∈ W.
3.
As P_W(v) − w ∈ W, for each v ∈ V and w ∈ W, we have

∥v − w∥² = ∥v − P_W(v) + P_W(v) − w∥²
         = ∥v − P_W(v)∥² + ∥P_W(v) − w∥² + 2⟨v − P_W(v), P_W(v) − w⟩
         = ∥v − P_W(v)∥² + ∥P_W(v) − w∥².

Therefore, ∥v − w∥ ≥ ∥v − P_W(v)∥ and equality holds if and only if w = P_W(v). Since P_W(v) ∈ W, we see that

d(v, W) = inf{∥v − w∥ : w ∈ W} = ∥v − P_W(v)∥.

That is, P_W(v) is the vector in W nearest to v. This can also be stated as: the vector P_W(v) solves the minimization problem

inf_{w ∈ W} ∥v − w∥ = ∥v − P_W(v)∥.

The next theorem is a generalization of Theorem 5.3.13. We omit the proof as the arguments are similar and use the following: Let V be a finite dimensional ips with V = W1 ⊕ ⋯ ⊕ Wk, for certain subspaces Wi's of V. Then, for each v ∈ V there exist unique vectors v1, …, vk such that

1.
vi ∈ Wi, for 1 ≤ i ≤ k,
2.
⟨vi, vj⟩ = 0 for each vi ∈ Wi, vj ∈ Wj, 1 ≤ i ≠ j ≤ k, and
3.
v = v1 + ⋯ + vk.

Theorem 5.3.15. Let V be a finite dimensional ips with subspaces W1, …, Wk of V such that V = W1 ⊕ ⋯ ⊕ Wk. Then, for each i, j, 1 ≤ i ≠ j ≤ k, there exist orthogonal projectors P_{Wi} : V → V of V onto Wi satisfying the following:

1.
Null(P_{Wi}) = Wi⊥ = W1 ⊕ W2 ⊕ ⋯ ⊕ W_{i−1} ⊕ W_{i+1} ⊕ ⋯ ⊕ Wk.
2.
Rng(P_{Wi}) = Wi.
3.
P_{Wi} ∘ P_{Wi} = P_{Wi}.
4.
P_{Wi} ∘ P_{Wj} = 0_V.
5.
P_{Wi} is a self-adjoint operator, and
6.
I_V = P_{W1} + P_{W2} + ⋯ + P_{Wk}.

5.4 Orthogonal Operator and Rigid Motion

We now give the definition and a few properties of an orthogonal operator.

Definition 5.4.1. [Orthogonal Operator] Let V be an ips. Then, a linear operator T : V → V is said to be an orthogonal operator if ∥T(x)∥ = ∥x∥, for all x ∈ V.

Example 5.4.2. Each T ∈ L(V) given below is an orthogonal operator.

1.
Fix a unit vector a ∈ V and define T(x) = 2⟨x, a⟩a − x, for all x ∈ V.
Solution: Note that Proj_a(x) = ⟨x, a⟩a. So, ⟨⟨x, a⟩a, x − ⟨x, a⟩a⟩ = 0. Also, by the Pythagoras theorem, ∥x − ⟨x, a⟩a∥² = ∥x∥² − |⟨x, a⟩|². Thus,

∥T(x)∥² = ∥(⟨x, a⟩a) + (⟨x, a⟩a − x)∥² = ∥⟨x, a⟩a∥² + ∥x − ⟨x, a⟩a∥² = ∥x∥².
2.
Let n = 2, V = ℝ² and 0 ≤ θ < 2π. Now define T((x, y)^T) = [cos θ −sin θ; sin θ cos θ](x, y)^T.

We now show that an operator is orthogonal if and only if it preserves the inner product.

Theorem 5.4.3. Let T ∈ L(V). Then, the following statements are equivalent.

1.
T is an orthogonal operator.
2.
⟨T(x), T(y)⟩ = ⟨x, y⟩, for all x, y ∈ V. That is, T preserves the inner product.

Proof. 1 ⇒ 2: Let T be an orthogonal operator. Then, ∥T(x + y)∥² = ∥x + y∥². So, ∥T(x)∥² + ∥T(y)∥² + 2⟨T(x), T(y)⟩ = ∥T(x) + T(y)∥² = ∥T(x + y)∥² = ∥x + y∥² = ∥x∥² + ∥y∥² + 2⟨x, y⟩. Thus, using the definition again, ⟨T(x), T(y)⟩ = ⟨x, y⟩.

2 ⇒ 1: If ⟨T(x), T(y)⟩ = ⟨x, y⟩, for all x, y ∈ V, then T is an orthogonal operator as ∥T(x)∥² = ⟨T(x), T(x)⟩ = ⟨x, x⟩ = ∥x∥². □

As an immediate corollary, we obtain the following result.

Corollary 5.4.4. Let T ∈ L(V). Then, T is an orthogonal operator if and only if, for every orthonormal basis {u1, …, un} of V, {T(u1), …, T(un)} is an orthonormal basis of V. Thus, if B is an orthonormal ordered basis of V then T[B, B] is an orthogonal matrix.

Definition 5.4.5. [Isometry, Rigid Motion] Let V be a normed linear space. Then, a map T : V → V is said to be an isometry or a rigid motion if ∥T(x) − T(y)∥ = ∥x − y∥, for all x, y ∈ V. That is, an isometry is distance preserving.

Observe that if T and S are two rigid motions then S ∘ T is also a rigid motion. Furthermore, it is clear from the definition that every rigid motion is invertible.

Example 5.4.6. The maps given below are rigid motions/isometries.

1.
Let V be a linear space with norm ∥·∥. If a ∈ V then the translation map Ta : V → V (see Exercise 7), defined by Ta(x) = x + a for all x ∈ V, is an isometry/rigid motion as
∥Ta(x) - Ta(y)∥ = ∥(x + a) - (y + a)∥ = ∥x - y∥.
2.
Let V be an ips. Then, using Theorem 5.4.3, we see that every orthogonal operator is an isometry.

We now prove that every rigid motion that fixes the origin is an orthogonal operator.

Theorem 5.4.7. Let V be a real ips. Then, the following statements are equivalent for any map T : V → V.

1.
T is a rigid motion that fixes the origin.
2.
T is linear and ⟨T(x), T(y)⟩ = ⟨x, y⟩, for all x, y ∈ V (T preserves the inner product).
3.
T is an orthogonal operator.

Proof. We have already seen the equivalence of Part 2 and Part 3 in Theorem 5.4.3. Let us now prove the equivalence of Part 1 and Part 2/Part 3.

If T is an orthogonal operator then T(0) = 0 and ∥T(x) - T(y)∥ = ∥T(x - y)∥ = ∥x - y∥. This proves Part 3 implies Part 1.

We now prove Part 1 implies Part 2. So, let T be a rigid motion that fixes 0. Thus, T(0) = 0 and ∥T(x) - T(y)∥ = ∥x - y∥, for all x, y ∈ V. Hence, in particular for y = 0, we have ∥T(x)∥ = ∥x∥, for all x ∈ V. So,

∥T(x)∥² + ∥T(y)∥² - 2⟨T(x), T(y)⟩ = ⟨T(x) - T(y), T(x) - T(y)⟩ = ∥T(x) - T(y)∥²
                                  = ∥x - y∥² = ⟨x - y, x - y⟩
                                  = ∥x∥² + ∥y∥² - 2⟨x, y⟩.
Thus, using ∥T(x)∥ = ∥x∥, for all x ∈ V, we get ⟨T(x), T(y)⟩ = ⟨x, y⟩, for all x, y ∈ V. Now, to prove that T is linear, we use ⟨T(u), T(w)⟩ = ⟨u, w⟩, established above, in the 3rd and 4th lines below to get
∥T(x + y) - (T(x) + T(y))∥² = ⟨T(x + y) - (T(x) + T(y)), T(x + y) - (T(x) + T(y))⟩
                            = ⟨T(x + y), T(x + y)⟩ - 2⟨T(x + y), T(x)⟩ - 2⟨T(x + y), T(y)⟩
                              + ⟨T(x) + T(y), T(x) + T(y)⟩
                            = ⟨x + y, x + y⟩ - 2⟨x + y, x⟩ - 2⟨x + y, y⟩
                              + ⟨T(x), T(x)⟩ + 2⟨T(x), T(y)⟩ + ⟨T(y), T(y)⟩
                            = -⟨x + y, x + y⟩ + ⟨x, x⟩ + 2⟨x, y⟩ + ⟨y, y⟩ = 0.

Thus, T(x + y) - (T(x) + T(y)) = 0 and hence T(x + y) = T(x) + T(y). A similar calculation gives T(αx) = αT(x) and hence T is linear. □
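Theorem 5.4.7 suggests a simple numerical experiment, sketched below (Python with NumPy assumed; Q and a are arbitrary choices). A map T(x) = Qx + a with Q orthogonal is a rigid motion, and S(x) = T(x) - T(0) fixes the origin and preserves inner products, i.e., S is an orthogonal operator.

import numpy as np

rng = np.random.default_rng(3)

Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # an orthogonal matrix
a = rng.standard_normal(3)                         # a translation vector
T = lambda x: Q @ x + a                            # a rigid motion

x, y = rng.standard_normal(3), rng.standard_normal(3)

# T preserves distances ...
assert np.isclose(np.linalg.norm(T(x) - T(y)), np.linalg.norm(x - y))

# ... and S(x) = T(x) - T(0) fixes the origin and preserves inner products
S = lambda x: T(x) - T(np.zeros(3))
assert np.isclose(S(x) @ S(y), x @ y)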

Exercise 5.4.8.

1.
Let A, B ∈ Mn(ℂ). Then, A and B are said to be
(a)
Orthogonally Congruent if B = SᵀAS, for some invertible matrix S.
(b)
Unitarily Congruent if B = S*AS, for some invertible matrix S.

Prove that Orthogonal and Unitary congruences are equivalence relations on Mn(ℝ) and Mn(ℂ), respectively.

2.
Let x ∈ ℝ². Identify it with the complex number x = x1 + ix2. If we rotate x counterclockwise by an angle θ, 0 ≤ θ < 2π, then we have
xe^{iθ} = (x1 + ix2)(cos θ + i sin θ) = x1 cos θ - x2 sin θ + i[x1 sin θ + x2 cos θ].
Thus, the corresponding vector in ℝ² is
[x1 cos θ - x2 sin θ]   [cos θ  -sin θ] [x1]
[x1 sin θ + x2 cos θ] = [sin θ   cos θ] [x2].
Is the matrix [cos θ  -sin θ; sin θ  cos θ] the matrix of the corresponding rotation? Justify. (A numerical check appears in the sketch after this exercise set.)
3.
Let A ∈ M2(ℝ) and T(θ) = [cos θ  sin θ; -sin θ  cos θ], for θ ∈ ℝ. Then, A is an orthogonal matrix if and only if A = T(θ) or A = [0 1; 1 0] T(θ), for some θ ∈ ℝ.
4.
Let A ∈ Mn(ℝ). Then, the following statements are equivalent.
(a)
A is an orthogonal matrix.
(b)
A⁻¹ = Aᵀ.
(c)
Aᵀ is orthogonal.
(d)
the columns of A form an orthonormal basis of the real vector space ℝⁿ.
(e)
the rows of A form an orthonormal basis of the real vector space ℝⁿ.
(f)
for any two vectors x, y ∈ ℝⁿ, ⟨Ax, Ay⟩ = ⟨x, y⟩ (orthogonal matrices preserve angle).
(g)
for any vector x ∈ ℝⁿ, ∥Ax∥ = ∥x∥ (orthogonal matrices preserve length).
5.
Let U be an n × n matrix. Then, prove that the following statements are equivalent.
(a)
U is a unitary matrix.
(b)
U⁻¹ = U*.
(c)
U* is unitary.
(d)
the columns of U form an orthonormal basis of the complex vector space ℂⁿ.
(e)
the rows of U form an orthonormal basis of the complex vector space ℂⁿ.
(f)
for any two vectors x, y ∈ ℂⁿ, ⟨Ux, Uy⟩ = ⟨x, y⟩ (unitary matrices preserve angle).
(g)
for any vector x ∈ ℂⁿ, ∥Ux∥ = ∥x∥ (unitary matrices preserve length).
6.
Let A be an n × n orthogonal matrix. Then, prove that det(A) = ±1.
7.
Let A be an n × n upper triangular matrix. If A is also an orthogonal matrix then A is a diagonal matrix with diagonal entries ±1.
8.
Prove that in M5(ℝ), there are infinitely many orthogonal matrices of which only finitely many are diagonal (in fact, their number is just 32).
9.
Prove that permutation matrices are real orthogonal.
10.
Let A, B ∈ Mn(ℂ) be two unitary matrices. Then, prove that AB and BA are unitary matrices.
11.
If A = [aij] and B = [bij] are unitarily equivalent then prove that ∑i,j |aij|² = ∑i,j |bij|².
12.
Let U be a unitary matrix and for every x ∈ ℂⁿ, define
∥x∥₁ = max{|xi| : xᵀ = [x1, ..., xn]}.
Then, is it necessary that ∥Ux∥₁ = ∥x∥₁?
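In connection with Exercise 2 above, the following sketch (Python with NumPy assumed; θ, x1 and x2 are arbitrary test values) checks numerically that rotation by complex multiplication and the action of the 2 × 2 rotation matrix agree.

import numpy as np

theta, x1, x2 = 1.1, 0.8, -0.5

# rotation as complex multiplication
z = (x1 + 1j * x2) * np.exp(1j * theta)

# rotation as a 2 x 2 matrix acting on R^2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
v = R @ np.array([x1, x2])

assert np.allclose([z.real, z.imag], v)   # the two descriptions agree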

5.5 Summary

In the previous chapter, we learnt that if V is a vector space over F with dim(V) = n then V basically looks like Fn. Also, any subspace of Fn is either Col(A) or Null(A) or both, for some matrix A with entries from F.

So, we started this chapter with the inner product, a generalization of the dot product in ℝ³ or ℝⁿ. We used the inner product to define the length/norm of a vector. The norm has the property that “the norm of a vector is zero if and only if the vector itself is the zero vector”. We then proved the Cauchy-Bunyakovskii-Schwartz inequality, which helped us in defining the angle between two vectors. Thus, one can talk of geometrical problems in ℝⁿ and prove some geometrical results.

We then independently defined the notion of a norm in ℝⁿ and showed that a norm is induced by an inner product if and only if the norm satisfies the parallelogram law (the sum of the squares of the two diagonals of a parallelogram equals twice the sum of the squares of two adjacent sides).

The next subsection dealt with the fundamental theorem of linear algebra, where we showed that if A ∈ Mm,n(ℂ) then

1.
dim(Null(A)) + dim(Col(A)) = n.
2.
Null(A) = (Col(A*))⊥ and Null(A*) = (Col(A))⊥.
3.
dim(Col(A)) = dim(Col(A*)).
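These three statements are easy to test numerically. The following sketch (Python with NumPy assumed; A is an arbitrary complex matrix, and a basis of Null(A) is read off from the SVD) checks them for an illustrative 4 × 6 matrix; for the second statement it verifies the orthogonality Null(A) ⟂ Col(A*), and equality of the spaces then follows by comparing dimensions.

import numpy as np

rng = np.random.default_rng(5)

# an illustrative complex 4 x 6 matrix
A = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))
m, n = A.shape

U, s, Vh = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
N = Vh[rank:].conj().T                 # columns of N span Null(A)

assert N.shape[1] + rank == n          # dim(Null(A)) + dim(Col(A)) = n

C = A.conj().T                         # columns of C span Col(A*)
assert np.allclose(C.conj().T @ N, 0)  # Null(A) is orthogonal to Col(A*)

assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(C)   # equal ranks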

We then saw that having an orthonormal basis is an asset as determining the

1.
coordinates of a vector boils down to computing the inner product.
2.
projection of a vector on a subspace boils down to finding an orthonormal basis of the subspace and then summing the corresponding rank 1 matrices.

So, the question arises: how do we compute an orthonormal basis? This is where we came across the Gram-Schmidt orthonormalization process. This algorithm helps us determine an orthonormal basis of LS(S) for any finite subset S of a vector space. This also led to the QR-decomposition of a matrix.

Thus, we observe the following about the linear system Ax = b. If

1.
b ∈ Col(A) then we can use the Gauss-Jordan method to get a solution.
2.
b ∉ Col(A), then in most cases we need a vector x for which the least squares error ∥b - Ax∥ is minimum. We saw that this minimum is attained at the projection of b on Col(A). Also, this vector can be obtained either using the fundamental theorem of linear algebra or by computing the matrix B(BᵀB)⁻¹Bᵀ, where the columns of B are either the pivot columns of A or a basis of Col(A); a numerical sketch follows.
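Here is a minimal least squares sketch (Python with NumPy assumed; A and b are arbitrary choices, with A of full column rank so that B = A works): the projection B(BᵀB)⁻¹Bᵀ b agrees with Ax, where x is the least squares solution returned by np.linalg.lstsq.

import numpy as np

rng = np.random.default_rng(4)

# an inconsistent system: b is (almost surely) not in Col(A)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)

# projection of b on Col(A) via B(B^T B)^{-1} B^T with B = A
P = A @ np.linalg.inv(A.T @ A) @ A.T
b_hat = P @ b

# the least squares solution minimizes ||b - Ax||
x, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(A @ x, b_hat)       # both give the projection of b on Col(A)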