Recall the dot product on ℝ2 and ℝ3. The dot product allowed us to compute lengths of vectors and angles between vectors, and hence to rephrase geometric problems in ℝ2 and ℝ3 in the language of vectors. We now generalize the idea of the dot product to achieve a similar goal for a general vector space over ℝ or ℂ. So, in this chapter, F will denote either ℝ or ℂ.
Definition 5.1.1. [Inner Product] Let V be a vector space over F. An inner product over V,
denoted by ⟨,⟩, is a map from V × V to F satisfying
1. ⟨au + bv,w⟩ = a⟨u,w⟩ + b⟨v,w⟩, for all u,v,w ∈ V and a,b ∈ F,
2. ⟨u,v⟩ = \overline{⟨v,u⟩}, the complex conjugate of ⟨v,u⟩, for all u,v ∈ V, and
3. ⟨u,u⟩ ≥ 0 for all u ∈ V. Furthermore, equality holds if and only if u = 0.
Remark 5.1.2. Using the definition of inner product, we immediately observe that
1. ⟨v,αw⟩ = \overline{⟨αw,v⟩} = \overline{α}·\overline{⟨w,v⟩} = \overline{α}⟨v,w⟩, for all α ∈ F and v,w ∈ V.
2. If ⟨u,v⟩ = 0 for all v ∈ V then, in particular, ⟨u,u⟩ = 0. Hence, u = 0.
Definition 5.1.3. [Inner Product Space] Let V be a vector space with an inner product ⟨,⟩.
Then, (V,⟨,⟩) is called an inner product space (in short, ips).
Example 5.1.4. Examples 1 and 2 that appear below are called the standard inner product or
the dot product on ℝn and ℂn, respectively. Whenever an inner product is not clearly mentioned, it
will be assumed to be the standard inner product.
1. For u = (u1,…,un)T, v = (v1,…,vn)T ∈ ℝn define ⟨u,v⟩ = u1v1 + ⋯ + unvn = vT u. Then, ⟨,⟩ is indeed an inner product and hence (ℝn,⟨,⟩) is an ips.
2. For u = (u1,…,un)*, v = (v1,…,vn)* ∈ ℂn define ⟨u,v⟩ = u1\overline{v1} + ⋯ + un\overline{vn} = v*u. Then, (ℂn,⟨,⟩) is an ips.
3. For x = (x1,x2)T, y = (y1,y2)T ∈ ℝ2 and A = [4, -1; -1, 2], define ⟨x,y⟩ = yT Ax. Then, ⟨,⟩ is an inner product as ⟨x,x⟩ = (x1 - x2)2 + 3x12 + x22.
4. Fix A = [a, b; b, c] with a,c > 0 and ac > b2. Then, ⟨x,y⟩ = yT Ax is an inner product on ℝ2 as ⟨x,x⟩ = ax12 + 2bx1x2 + cx22 = a(x1 + (b/a)x2)2 + ((ac - b2)/a)x22.
5. Verify that for x = (x1,x2,x3)T, y = (y1,y2,y3)T ∈ ℝ3, ⟨x,y⟩ = 10x1y1 + 3x1y2 + 3x2y1 + 2x2y2 + x2y3 + x3y2 + x3y3 defines an inner product.
6. For x = (x1,x2)T, y = (y1,y2)T ∈ ℝ2, each of the three maps below satisfies at least one of the three conditions for an inner product. Determine the condition(s) that each map fails to satisfy. Give reasons for your answer.
   (a) ⟨x,y⟩ = x1y1.
   (b) ⟨x,y⟩ = x12 + y12 + x22 + y22.
   (c) ⟨x,y⟩ = x1y13 + x2y23.
7. Let A ∈ Mn(ℂ) be a Hermitian matrix. Then, for x,y ∈ ℂn, define ⟨x,y⟩ = y*Ax. Then, ⟨,⟩ satisfies ⟨x,y⟩ = \overline{⟨y,x⟩} and ⟨x + αz,y⟩ = ⟨x,y⟩ + α⟨z,y⟩, for all x,y,z ∈ ℂn and α ∈ ℂ. Does there exist a condition on A such that ⟨x,x⟩ ≥ 0 for all x ∈ ℂn? This will be answered in the affirmative in the chapter on eigenvalues and eigenvectors.
8. For A,B ∈ Mn(ℝ), define ⟨A,B⟩ = tr(BT A). Then, if A = [aij], we have ⟨A,A⟩ = tr(AT A) = ∑_{i=1}^n (AT A)ii = ∑_{i,j=1}^n aij·aij = ∑_{i,j=1}^n aij2 and therefore ⟨A,A⟩ > 0 for every nonzero matrix A.
9. Consider the complex vector space C[-1,1] of all continuous complex valued functions on [-1,1] and define ⟨f,g⟩ = ∫_{-1}^{1} f(x)\overline{g(x)} dx. Then,
   (a) ⟨f,f⟩ = ∫_{-1}^{1} |f(x)|2 dx ≥ 0 as |f(x)|2 ≥ 0, and this integral is 0 if and only if f ≡ 0, as f is continuous.
   (b) \overline{⟨g,f⟩} = \overline{∫_{-1}^{1} g(x)\overline{f(x)} dx} = ∫_{-1}^{1} \overline{g(x)} f(x) dx = ∫_{-1}^{1} f(x)\overline{g(x)} dx = ⟨f,g⟩.
   (c) ⟨f + g,h⟩ = ∫_{-1}^{1} (f + g)(x)\overline{h(x)} dx = ∫_{-1}^{1} [f(x)\overline{h(x)} + g(x)\overline{h(x)}] dx = ⟨f,h⟩ + ⟨g,h⟩.
   (d) ⟨αf,g⟩ = ∫_{-1}^{1} (αf(x))\overline{g(x)} dx = α ∫_{-1}^{1} f(x)\overline{g(x)} dx = α⟨f,g⟩.
10. Fix an ordered basis B = [u1,…,un] of a complex vector space V. Then, for any u,v ∈ V with [u]B = (a1,…,an)T and [v]B = (b1,…,bn)T, define ⟨u,v⟩ = ∑_{i=1}^n ai\overline{bi}. Then, ⟨,⟩ is indeed an inner product on V. So, any finite dimensional vector space can be endowed with an inner product. (A numerical illustration of some of the above inner products follows this example.)
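For readers who like to experiment, the finite dimensional examples above are easy to check numerically. The sketch below (Python with NumPy; the matrix A and the vectors are illustrative choices, not taken from the text) evaluates ⟨x,y⟩ = yT Ax for a symmetric matrix as in Example 4 and the trace inner product of Example 8.

```python
import numpy as np

# An illustrative symmetric matrix with a, c > 0 and ac > b^2 (as in Example 4).
A = np.array([[4.0, -1.0],
              [-1.0, 2.0]])

def ip(x, y, A):
    """Inner product <x, y> = y^T A x on R^2."""
    return y @ A @ x

x, y = np.array([1.0, 2.0]), np.array([3.0, -1.0])
print(ip(x, y, A), ip(y, x, A))      # symmetry: the two values agree
print(ip(x, x, A) > 0)               # positivity for a nonzero vector

# Trace (Frobenius) inner product on M_n(R) from Example 8.
M, N = np.random.rand(3, 3), np.random.rand(3, 3)
print(np.trace(N.T @ M), np.sum(M * N))   # the two expressions coincide
```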
As ⟨u,u⟩ > 0 for all u ≠ 0, we use the inner product to define the length of a vector.
Definition 5.1.5. [Length / Norm of a Vector] Let V be a vector space over F. Then, for any vector u ∈ V, we define the length (norm) of u as ∥u∥ = √⟨u,u⟩, the positive square root. A vector of norm 1 is called a unit vector. Thus, u/∥u∥ is called the unit vector in the direction of u.
Exercise 5.1.7.
1. Let u = (-1,1,2,3,7)T ∈ ℝ5. Find all α ∈ ℝ such that ∥αu∥ = 1.
2. Let u = (-1,1,2,3,7)T ∈ ℂ5. Find all α ∈ ℂ such that ∥αu∥ = 1.
3. Prove that ∥x + y∥2 + ∥x - y∥2 = 2(∥x∥2 + ∥y∥2), for all x,y ∈ ℝn. This equality is called the Parallelogram Law as, in a parallelogram, the sum of the squares of the lengths of the diagonals equals twice the sum of the squares of the lengths of the sides.
4. Apollonius' Identity: Let the lengths of the sides of a triangle be a,b,c ∈ ℝ and let the length of the median drawn to the side of length a be d ∈ ℝ. Then prove that b2 + c2 = 2(d2 + (a/2)2).
5. Let u = (1,2)T, v = (2,-1)T ∈ ℝ2. Then, does there exist an inner product in ℝ2 such that ∥u∥ = 1, ∥v∥ = 1 and ⟨u,v⟩ = 0? [Hint: Let A = [a, b; b, c] and define ⟨x,y⟩ = yT Ax. Use the given conditions to get a linear system of 3 equations in the variables a,b,c.]
6. Let x = (x1,x2)T, y = (y1,y2)T ∈ ℝ2. Then, ⟨x,y⟩ = 3x1y1 - x1y2 - x2y1 + x2y2 defines an inner product. Use this inner product to find
   (a) the angle between e1 = (1,0)T and e2 = (0,1)T.
   (b) v ∈ ℝ2 such that ⟨v,e1⟩ = 0.
   (c) x,y ∈ ℝ2 such that ∥x∥ = ∥y∥ = 1 and ⟨x,y⟩ = 0.
A very useful and fundamental inequality concerning the inner product, commonly called the Cauchy-Schwarz inequality, is proved next.
Theorem 5.1.8 (Cauchy-Bunyakovskii-Schwarz inequality). Let V be an inner product space over F. Then, for any u,v ∈ V,
|⟨v,u⟩| ≤ ∥u∥ ∥v∥.    (5.1.1)
Moreover, equality holds in Inequality (5.1.1) if and only if u and v are linearly dependent. Furthermore, if u ≠ 0 then v = (⟨v,u⟩/∥u∥2) u.
Proof. If u = 0 then Inequality (5.1.1) holds trivially. Hence, let u ≠ 0. Then, by Definition 5.1.1.3, ⟨λu + v, λu + v⟩ ≥ 0 for all λ ∈ F and v ∈ V. In particular, for λ = -⟨v,u⟩/∥u∥2,
0 ≤ ⟨λu + v, λu + v⟩ = |λ|2∥u∥2 + λ⟨u,v⟩ + \overline{λ}⟨v,u⟩ + ∥v∥2 = ∥v∥2 - |⟨v,u⟩|2/∥u∥2.
Or, in other words, |⟨v,u⟩|2 ≤ ∥u∥2∥v∥2 and the proof of the inequality is over.
Now, note that equality holds in Inequality (5.1.1) if and only if ⟨λu + v, λu + v⟩ = 0, or equivalently, λu + v = 0. Hence, u and v are linearly dependent. Moreover, λu + v = 0 implies that v = -λu = (⟨v,u⟩/∥u∥2) u. _
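The inequality is also easy to test numerically. A small sketch (Python/NumPy; random vectors, purely illustrative) checks |⟨v,u⟩| ≤ ∥u∥∥v∥ and the equality case for linearly dependent vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
# Check |<v, u>| <= ||u|| ||v|| for a few random complex vectors
# (an illustrative numerical check, not a proof).
for _ in range(5):
    u = rng.normal(size=4) + 1j * rng.normal(size=4)
    v = rng.normal(size=4) + 1j * rng.normal(size=4)
    lhs = abs(np.vdot(u, v))                 # absolute value of the standard inner product
    rhs = np.linalg.norm(u) * np.linalg.norm(v)
    assert lhs <= rhs + 1e-12

# Equality case: v a scalar multiple of u.
u = rng.normal(size=4)
v = 2.5 * u
print(abs(np.vdot(u, v)), np.linalg.norm(u) * np.linalg.norm(v))   # equal
```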
Let V be a real vector space. Then, for u,v ∈ V \ {0}, the Cauchy-Schwarz inequality implies that -1 ≤ ⟨u,v⟩/(∥u∥ ∥v∥) ≤ 1. We use this together with the properties of the cosine function to define the angle between two vectors in an inner product space.
Definition 5.1.10. [Angle between Vectors] Let V be a real vector space. If θ ∈ [0,π] is the angle between u,v ∈ V \ {0} then we define cos(θ) = ⟨u,v⟩/(∥u∥ ∥v∥).
Example 5.1.11.
1. Take (1,0)T, (1,1)T ∈ ℝ2. Then, cos θ = 1/√2. So θ = π/4.
2. Take (1,1,0)T, (1,1,1)T ∈ ℝ3. Then, the angle between them, say β, equals cos⁻¹(2/√6).
3. The angle depends on the inner product. Take ⟨x,y⟩ = 2x1y1 + x1y2 + x2y1 + x2y2 on ℝ2. Then, the angle between (1,0)T and (1,1)T equals cos⁻¹(3/√10).
4. As ⟨x,y⟩ = ⟨y,x⟩ for any real vector space, the angle between x and y is the same as the angle between y and x.
5. Let a,b ∈ ℝ with a,b > 0. Then, prove that (a + b)(1/a + 1/b) ≥ 4.
6. For 1 ≤ i ≤ n, let ai ∈ ℝ with ai > 0. Then, use Corollary 5.1.9 to show that (a1 + ⋯ + an)(1/a1 + ⋯ + 1/an) ≥ n2.
7. Prove that |z1 + ⋯ + zn| ≤ √n · √(|z1|2 + ⋯ + |zn|2), for z1,…,zn ∈ ℂ. When does equality hold?
8. Let V be an ips. If u,v ∈ V with ∥u∥ = 1, ∥v∥ = 1 and ⟨u,v⟩ = 1 then prove that u = αv for some α ∈ F. Is α = 1?
We will now prove that if A, B and C are the vertices of a triangle (see Figure 5.1) and a, b and c, respectively, are the lengths of the corresponding sides then cos(A) = (b2 + c2 - a2)/(2bc). This in turn implies that the angle between vectors has been rightly defined.
Lemma 5.1.12. Let A, B and C be the vertices of a triangle (see Figure 5.1) with corresponding side lengths a, b and c, respectively, in a real inner product space V. Then
a2 = b2 + c2 - 2bc cos(A).
Proof. Let 0, u and v be the coordinates of the vertices A, B and C, respectively, of the triangle ABC. Then, the vectors AB = u, AC = v and BC = v - u, so that c = ∥u∥, b = ∥v∥ and a = ∥v - u∥. Thus, we need to prove that ∥v - u∥2 = ∥v∥2 + ∥u∥2 - 2∥v∥∥u∥cos(A). Now, by definition ∥v - u∥2 = ∥v∥2 + ∥u∥2 - 2⟨v,u⟩ and hence ∥v∥2 + ∥u∥2 - ∥v - u∥2 = 2⟨u,v⟩. As ⟨v,u⟩ = ∥v∥∥u∥cos(A), the required result follows. _
Recall that two vectors u,v ∈ V are said to be orthogonal if ⟨u,v⟩ = 0 and that, for a subset S of V, S⊥ = {v ∈ V | ⟨v,s⟩ = 0 for all s ∈ S}.
Example 5.1.14.
1. 0 is orthogonal to every vector as ⟨0,x⟩ = 0 for all x ∈ V.
2. If V is a vector space over ℝ or ℂ then 0 is the only vector that is orthogonal to itself.
3. Let V = ℝ.
   (a) S = {0}. Then, S⊥ = ℝ.
   (b) S = ℝ. Then, S⊥ = {0}.
   (c) Let S be any subset of ℝ containing a nonzero real number. Then, S⊥ = {0}.
4. Let u = (1,2)T. What is u⊥ in ℝ2?
Solution: {(x,y)T ∈ ℝ2 | x + 2y = 0}. Is this Null(uT)? Note that {(2,-1)T} is a basis of u⊥ and, for any vector x = (x1,x2)T ∈ ℝ2,
x = ((x1 + 2x2)/5)(1,2)T + ((2x1 - x2)/5)(2,-1)T
is a decomposition of x into two vectors, one parallel to u and the other parallel to u⊥.
5. Fix u = (1,1,1,1)T, v = (1,1,-1,0)T ∈ ℝ4. Determine z,w ∈ ℝ4 such that u = z + w with the condition that z is parallel to v and w is orthogonal to v.
Solution: As z is parallel to v, z = kv = (k,k,-k,0)T, for some k ∈ ℝ. Since w is orthogonal to v, the vector w = (a,b,c,d)T satisfies a + b - c = 0. Thus, c = a + b and
(1,1,1,1)T = u = z + w = (k + a, k + b, -k + a + b, d)T.
Comparing the corresponding coordinates gives the linear system a + k = 1, b + k = 1, a + b - k = 1 and d = 1 in the variables a,b,d and k. Thus, solving for a,b,d and k gives z = (1/3)(1,1,-1,0)T and w = (1/3)(2,2,4,3)T.
6. Let x,y ∈ ℝn. Then prove that
   (a) ⟨x,y⟩ = 0 ⇐⇒ ∥x - y∥2 = ∥x∥2 + ∥y∥2 (Pythagoras Theorem).
   Solution: Use ∥x - y∥2 = ∥x∥2 + ∥y∥2 - 2⟨x,y⟩ to get the required result.
   (b) ∥x∥ = ∥y∥ ⇐⇒ ⟨x + y, x - y⟩ = 0 (x and y form adjacent sides of a rhombus as the diagonals x + y and x - y are orthogonal).
   Solution: Use ⟨x + y, x - y⟩ = ∥x∥2 - ∥y∥2 to get the required result.
   (c) 4⟨x,y⟩ = ∥x + y∥2 - ∥x - y∥2 (polarization identity in ℝn).
   Solution: Just expand the right hand side to get the required result.
   (d) ∥x + y∥2 + ∥x - y∥2 = 2∥x∥2 + 2∥y∥2 (parallelogram law: the sum of squares of the diagonals of a parallelogram equals twice the sum of squares of its sides).
   Solution: Just expand the left hand side to get the required result.
7. Let P = (1,1,1)T, Q = (2,1,3)T and R = (-1,1,2)T be three vertices of a triangle in ℝ3. Compute the angle between the sides PQ and PR.
Solution: Method 1: Note that PQ = (2,1,3)T - (1,1,1)T = (1,0,2)T, PR = (-2,0,1)T and QR = (-3,0,-1)T. As ⟨PQ, PR⟩ = 0, the angle between the sides PQ and PR is π/2.
Method 2: ∥PQ∥ = √5, ∥PR∥ = √5 and ∥QR∥ = √10. As ∥QR∥2 = ∥PQ∥2 + ∥PR∥2, by the Pythagoras theorem, the angle between the sides PQ and PR is π/2.
Exercise 5.1.15.
1. Let V be an ips.
   (a) If S ⊆ V then S⊥ is a subspace of V and S⊥ = (LS(S))⊥.
   (b) Furthermore, if V is finite dimensional then S⊥ and LS(S) are complementary. That is, V = LS(S) + S⊥. Equivalently, ⟨u,w⟩ = 0, for all u ∈ LS(S) and w ∈ S⊥.
2. Consider ℝ3 with the standard inner product. Find
   (a) S⊥ for S = {(1,1,1)T, (0,1,-1)T} and S = LS((1,1,1)T, (0,1,-1)T).
   (b) vectors v,w ∈ ℝ3 such that v, w and u = (1,1,1)T are mutually orthogonal.
   (c) the line passing through (1,1,-1)T and parallel to (a,b,c)T ≠ 0.
   (d) the plane containing (1,1,-1)T with (a,b,c)T ≠ 0 as the normal vector.
   (e) the area of the parallelogram with three vertices 0T, (1,2,-2)T and (2,3,0)T.
   (f) the area of the parallelogram when ∥x∥ = 5, ∥x - y∥ = 8 and ∥x + y∥ = 14.
   (g) the plane containing (2,-2,1)T and perpendicular to the line with parametric equation x = t - 1, y = 3t + 2, z = t + 1.
   (h) the plane containing the lines (1,2,-2)T + t(1,1,0)T and (1,2,-2)T + t(0,1,2)T.
   (i) k such that cos⁻¹(⟨u,v⟩/(∥u∥∥v∥)) = π/3, where u = (1,-1,1)T and v = (1,k,1)T.
   (j) the plane containing (1,1,2)T and orthogonal to the line with parametric equation x = 2 + t, y = 3 and z = 1 - t.
   (k) a parametric equation of a line containing (1,-2,1)T and orthogonal to x + 3y + 2z = 1.
3. Let P = (3,0,2)T, Q = (1,2,-1)T and R = (2,-1,1)T be three points in ℝ3. Then,
   (a) find the area of the triangle with vertices P, Q and R.
   (b) find the area of the parallelogram built on the vectors PQ and PR.
   (c) find a nonzero vector orthogonal to the plane of the above triangle.
   (d) find all vectors x orthogonal to both PQ and PR and having a given length ∥x∥.
   (e) find the volume of the parallelepiped built on the vectors PQ, PR and x, where x is one of the vectors found in Part 3d. Do you think the volume would be different if you choose the other vector x?
4. Let p1 be the plane containing the point A = (1,2,3)T and having (2,-1,1)T as its normal vector. Then,
   (a) find the equation of the plane p2 that is parallel to p1 and contains (-1,2,-3)T.
   (b) calculate the distance between the planes p1 and p2.
5. In the parallelogram ABCD, AB ∥ DC, AD ∥ BC, and A = (-2,1,3)T, B = (-1,2,2)T and C = (-3,1,5)T. Find the
   (a) coordinates of the point D,
   (b) cosine of the angle BCD,
   (c) area of the triangle ABC,
   (d) volume of the parallelepiped determined by AB, AD and (0,0,-7)T.
6. Let W = {(x,y,z,w)T ∈ ℝ4 : x + y + z - w = 0}. Find a basis of W⊥.
7. Recall the ips Mn(ℝ) (see Example 5.1.4.8). If W = {A ∈ Mn(ℝ) | AT = A} then what is W⊥?
To proceed further, recall that a vector space over ℝ or ℂ is also called a linear space.
Definition 5.1.16. [Normed Linear Space] Let V be a linear space.
1. A norm on V is a function f(x) = ∥x∥ from V to ℝ such that
   (a) ∥x∥ ≥ 0 for all x ∈ V, and if ∥x∥ = 0 then x = 0.
   (b) ∥αx∥ = |α|∥x∥ for all α ∈ F and x ∈ V.
   (c) ∥x + y∥ ≤ ∥x∥ + ∥y∥ for all x,y ∈ V (triangle inequality).
2. A linear space with a norm on it is called a normed linear space (nls).
As a consequence of the triangle inequality, we also have |∥x∥ - ∥y∥| ≤ ∥x - y∥, for all x,y ∈ V.
Proof. As ∥x∥ = ∥x - y + y∥ ≤ ∥x - y∥ + ∥y∥, one has ∥x∥ - ∥y∥ ≤ ∥x - y∥. Similarly, one obtains ∥y∥ - ∥x∥ ≤ ∥y - x∥ = ∥x - y∥. Combining the two, the required result follows. _
The next result is stated without proof as the proof is beyond the scope of this book.
Theorem 5.1.20. Let ∥⋅∥ be a norm on a nls V. Then, ∥⋅∥ is induced by some inner product
if and only if ∥⋅∥ satisfies the parallelogram law: ∥x + y∥2 + ∥x - y∥2 = 2∥x∥2 + 2∥y∥2.
Example 5.1.21.
1. For x = (x1,x2)T ∈ ℝ2, we define ∥x∥1 = |x1| + |x2|. Verify that ∥x∥1 is indeed a norm. But, for x = e1 and y = e2, 2(∥x∥1² + ∥y∥1²) = 4 whereas ∥x + y∥1² + ∥x - y∥1² = 4 + 4 = 8. So, the parallelogram law fails. Thus, ∥x∥1 is not induced by any inner product in ℝ2 (see also the sketch following this example).
2. Does there exist an inner product in ℝ2 such that ∥x∥ = max{|x1|,|x2|}?
3. If ∥⋅∥ is a norm on V then d(x,y) = ∥x - y∥, for x,y ∈ V, defines a distance function as
   (a) d(x,x) = 0, for each x ∈ V.
   (b) using the triangle inequality, for any z ∈ V, we have d(x,y) = ∥x - y∥ ≤ ∥x - z∥ + ∥z - y∥ = d(x,z) + d(z,y).
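The failure of the parallelogram law for ∥⋅∥1, and its validity for the norm induced by the standard inner product, can be checked directly; the following sketch (Python/NumPy) does exactly the computation of Example 5.1.21.1.

```python
import numpy as np

x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])   # e1 and e2

def check_parallelogram(norm):
    lhs = norm(x + y)**2 + norm(x - y)**2
    rhs = 2 * (norm(x)**2 + norm(y)**2)
    return lhs, rhs

print(check_parallelogram(lambda v: np.linalg.norm(v, 1)))  # (8.0, 4.0): law fails for the 1-norm
print(check_parallelogram(lambda v: np.linalg.norm(v, 2)))  # (4.0, 4.0): law holds for the 2-norm
```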
We start with the definition of an orthonormal set.
Definition 5.2.1. Let V be an ips. Then, a non-empty set S = {v1,…,vn} ⊆ V is called an orthogonal set if vi and vj are mutually orthogonal, for 1 ≤ i ≠ j ≤ n, i.e., ⟨vi,vj⟩ = 0, for 1 ≤ i ≠ j ≤ n. Further, if ∥vi∥ = 1, for 1 ≤ i ≤ n, then S is called an orthonormal set. If S is also a basis of V then S is called an orthonormal basis of V.
Example 5.2.2.
1. A few orthonormal sets in ℝ2 are {e1, e2}, {(1/√2)(1,1)T, (1/√2)(1,-1)T} and {(1/√5)(1,2)T, (1/√5)(2,-1)T}.
2. Let S = {e1,…,en} be the standard basis of ℝn. Then, S is an orthonormal set as
   (a) ∥ei∥ = 1, for 1 ≤ i ≤ n.
   (b) ⟨ei,ej⟩ = 0, for 1 ≤ i ≠ j ≤ n.
3. The set {(1/√6)(1,2,1)T, (1/√5)(2,-1,0)T, (1/√30)(1,2,-5)T} is an orthonormal set in ℝ3.
4. Recall that ⟨f,g⟩ = ∫_{-π}^{π} f(x)g(x)dx defines the standard inner product on C[-π,π]. Consider S = {1} ∪ {em | m ≥ 1} ∪ {fn | n ≥ 1}, where 1(x) = 1, em(x) = cos(mx) and fn(x) = sin(nx), for all m,n ≥ 1 and for all x ∈ [-π,π]. Then,
   (a) S is a linearly independent set.
   (b) ∥1∥2 = 2π, ∥em∥2 = π and ∥fn∥2 = π.
   (c) the functions in S are mutually orthogonal.
Hence, {1/√(2π)} ∪ {em/√π | m ≥ 1} ∪ {fn/√π | n ≥ 1} is an orthonormal set in C[-π,π].
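The orthogonality relations above can be verified numerically as well. A rough sketch (Python/NumPy, using a simple Riemann sum; purely illustrative):

```python
import numpy as np

# Numerically check a few of the orthogonality relations on [-pi, pi]
# with a crude Riemann sum (an illustrative check only, not a proof).
x = np.linspace(-np.pi, np.pi, 200001)
dx = x[1] - x[0]

def ip(f, g):
    return np.sum(f(x) * g(x)) * dx

print(ip(np.cos, np.sin))                          # ~0 : <e_1, f_1> = 0
print(ip(lambda t: np.cos(2 * t), np.cos))         # ~0 : <e_2, e_1> = 0
print(ip(np.cos, np.cos), ip(np.sin, np.sin))      # ~pi: ||e_1||^2 = ||f_1||^2 = pi
print(ip(lambda t: np.ones_like(t), lambda t: np.ones_like(t)))   # ~2*pi: ||1||^2
```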
To proceed further, we consider a few examples for better understanding.
Example 5.2.3. Which point on a plane P is closest to a given point, say Q?
Solution: Let y be the foot of the perpendicular from Q on P. Thus, by the Pythagoras Theorem, this point is unique. So, the question arises: how do we find y?
Note that the vector from y to Q gives a normal vector of the plane P. So, we need to decompose the vector from a point of P to Q into two vectors such that one of them lies on the plane P and the other is orthogonal to the plane.
Thus, we see that given u,v ∈ V \ {0}, we need to find two vectors, say y and z, such that y is parallel to u and z is perpendicular to u. Then, ∥y∥ = ∥v∥cos(θ) and ∥z∥ = ∥v∥sin(θ), where θ is the angle between u and v.
We do this as follows (see Figure 5.2). Let û = u/∥u∥ be the unit vector in the direction of u. Then, using trigonometry, cos(θ) = ∥y∥/∥v∥. Hence ∥y∥ = ∥v∥cos(θ). Now, using Definition 5.1.10, ∥y∥ = ∥v∥ · |⟨v,u⟩|/(∥u∥∥v∥) = |⟨v,u⟩|/∥u∥, where the absolute value is taken as the length/norm is a positive quantity. Thus,
y = (⟨v,u⟩/∥u∥2) u and z = v - (⟨v,u⟩/∥u∥2) u.
In the literature, the vector y = (⟨v,u⟩/∥u∥2) u is called the orthogonal projection of v on u, denoted Proju(v). Thus,
Proju(v) = (⟨v,u⟩/∥u∥2) u and ∥Proju(v)∥ = |⟨v,u⟩|/∥u∥.    (5.2.1)
Moreover, the distance of the point Q from the plane P equals ∥z∥ = ∥v - Proju(v)∥.
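Equation (5.2.1) translates directly into code. A minimal sketch (Python/NumPy; the vectors are those of Example 5.2.4.3):

```python
import numpy as np

def proj(v, u):
    """Orthogonal projection of v on u: Proj_u(v) = (<v,u>/||u||^2) u."""
    return (np.dot(v, u) / np.dot(u, u)) * u

u = np.array([1.0, 1.0, -1.0, 0.0])
v = np.array([1.0, 1.0, 1.0, 1.0])

y = proj(v, u)            # component of v parallel to u
z = v - y                 # component of v orthogonal to u
print(y, z, np.dot(z, u)) # <z, u> is (numerically) 0
```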
Example 5.2.4.
1. Determine the foot of the perpendicular from the point (1,2,3)T on the XY-plane.
Solution: Verify that the required point is (1,2,0)T.
2. Determine the foot of the perpendicular from the point Q = (1,2,3,4)T on the plane generated by (1,1,0,0)T, (1,0,1,0)T and (0,1,1,1)T.
Answer: A point (x,y,z,w)T lies on the plane if and only if x - y - z + 2w = 0, i.e., ⟨(1,-1,-1,2)T, (x,y,z,w)T⟩ = 0. So, the required point equals
(1,2,3,4)T - (⟨(1,2,3,4)T, (1,-1,-1,2)T⟩/7)(1,-1,-1,2)T = (1/7)(3,18,25,20)T.
3. Determine the projection of v = (1,1,1,1)T on u = (1,1,-1,0)T.
Solution: By Equation (5.2.1), we have Proju(v) = (⟨v,u⟩/∥u∥2) u = (1/3)(1,1,-1,0)T, and w = (1,1,1,1)T - Proju(v) = (1/3)(2,2,4,3)T is orthogonal to u.
4. Let u = (1,1,1,1)T, v = (1,1,-1,0)T, w = (1,1,0,-1)T ∈ ℝ4. Write v = v1 + v2, where v1 is parallel to u and v2 is orthogonal to u. Also, write w = w1 + w2 + w3 such that w1 is parallel to u, w2 is parallel to v2 and w3 is orthogonal to both u and v2.
Solution: Note that
   (a) v1 = Proju(v) = (⟨v,u⟩/∥u∥2) u = (1/4)(1,1,1,1)T is parallel to u.
   (b) v2 = v - v1 = (1/4)(3,3,-5,-1)T is orthogonal to u.
Note that Proju(w) is parallel to u and Projv2(w) is parallel to v2. Hence, we have
   (a) w1 = Proju(w) = (⟨w,u⟩/∥u∥2) u = (1/4)(1,1,1,1)T is parallel to u,
   (b) w2 = Projv2(w) = (⟨w,v2⟩/∥v2∥2) v2 = (7/44)(3,3,-5,-1)T is parallel to v2 and
   (c) w3 = w - w1 - w2 = (3/11)(1,1,2,-4)T is orthogonal to both u and v2.
We now prove the most important initial result of this section.
Theorem 5.2.5. Let S = {u1,…,un} be an orthonormal subset of an ips V(F).
1. Then, S is a linearly independent subset of V.
2. Suppose v ∈ LS(S) with v = ∑_{i=1}^n αiui, for some αi's in F. Then,
   (a) αi = ⟨v,ui⟩.
   (b) ∥v∥2 = ∥∑_{i=1}^n αiui∥2 = ∑_{i=1}^n |αi|2.
3. Let z ∈ V and w = ∑_{i=1}^n ⟨z,ui⟩ui. Then, z = w + (z - w) with ⟨z - w, w⟩ = 0, i.e., z - w ∈ LS(S)⊥. Further, ∥z∥2 = ∥w∥2 + ∥z - w∥2 ≥ ∥w∥2.
4. Let dim(V) = n. Then, ⟨v,ui⟩ = 0 for all i = 1,2,…,n if and only if v = 0.
Proof. Part 1: Consider the linear system c1u1 + ⋯ + cnun = 0 in the variables c1,…,cn. As ⟨0,u⟩ = 0 and ⟨uj,ui⟩ = 0, for all j ≠ i, we have
0 = ⟨0, ui⟩ = ⟨c1u1 + ⋯ + cnun, ui⟩ = ci⟨ui,ui⟩ = ci, for 1 ≤ i ≤ n.
Hence, ci = 0, for 1 ≤ i ≤ n. Thus, the above linear system has only the trivial solution. So, the set S is linearly independent.
Part 2: Note that ⟨v,ui⟩ = ⟨∑_{j=1}^n αjuj, ui⟩ = ∑_{j=1}^n αj⟨uj,ui⟩ = αi⟨ui,ui⟩ = αi. This completes the first sub-part. For the second sub-part, we have
∥v∥2 = ⟨∑_{i=1}^n αiui, ∑_{j=1}^n αjuj⟩ = ∑_{i,j=1}^n αi\overline{αj}⟨ui,uj⟩ = ∑_{i=1}^n αi\overline{αi} = ∑_{i=1}^n |αi|2.
Part 3: Note that for 1 ≤ i ≤ n,
⟨z - w, ui⟩ = ⟨z,ui⟩ - ⟨w,ui⟩ = ⟨z,ui⟩ - ⟨z,ui⟩ = 0.
So, z - w ∈ LS(S)⊥. Hence, ⟨z - w, w⟩ = 0 as w ∈ LS(S). Further,
∥z∥2 = ∥w + (z - w)∥2 = ∥w∥2 + ∥z - w∥2 ≥ ∥w∥2.
Part 4: Follows directly using Part 2b as {u1,…,un} is a basis of V. _
A rephrasing of Theorem 5.2.5.2b gives a generalization of the Pythagoras theorem, popularly known as Parseval's formula. The proof is left as an exercise for the reader.
Theorem 5.2.6. Let V be a finite dimensional ips with an orthonormal basis {v1,…,vn}. Then, for each x,y ∈ V,
⟨x,y⟩ = ∑_{i=1}^n ⟨x,vi⟩\overline{⟨y,vi⟩}.
Furthermore, if x = y then ∥x∥2 = ∑_{i=1}^n |⟨x,vi⟩|2 (generalizing the Pythagoras Theorem).
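As an illustration of Theorem 5.2.5.2 and Parseval's formula, the following sketch (Python/NumPy; the orthonormal basis of ℝ2 is an illustrative choice) recovers the coordinates of a vector from inner products and checks the norm identity.

```python
import numpy as np

# An orthonormal basis of R^2 (illustrative choice).
v1 = np.array([1.0, 1.0]) / np.sqrt(2)
v2 = np.array([1.0, -1.0]) / np.sqrt(2)

x = np.array([3.0, -2.0])
alpha = np.array([np.dot(x, v1), np.dot(x, v2)])   # alpha_i = <x, v_i>

print(alpha[0] * v1 + alpha[1] * v2)     # reconstructs x
print(np.sum(alpha**2), np.dot(x, x))    # Parseval: both equal ||x||^2
```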
As a corollary to Theorem 5.2.5, we have the following result: if {v1,…,vn} is a basis of V consisting of mutually orthogonal nonzero vectors, then ⟨v,vi⟩ = 0, for all i = 1,2,…,n, if and only if v = 0.
Proof. For 1 ≤ k ≤ n, define uk = vk/∥vk∥ and use Theorem 5.2.5.4 to get the required result. _
In view of the importance of Theorem 5.2.5, we inquire into the question of extracting an
orthonormal basis from a given basis. The process of extracting an orthonormal basis from a finite
linearly independent set is called the Gram-Schmidt Orthonormalization process. We first
consider a few examples. Note that Theorem 5.2.5 also gives us an algorithm for doing so, i.e., from
the given vector subtract all the orthogonal projections/components. If the new vector is nonzero then
this vector is orthogonal to the previous ones. The proof follows directly from Theorem 5.2.5 but we
give it again for the sake of completeness.
Theorem 5.2.10 (Gram-Schmidt Orthogonalization Process). Let V be an ips. If {v1,…,vn}
is a set of linearly independent vectors in V then there exists an orthonormal set {w1,…,wn}
in V. Furthermore, LS(w1,…,wi) = LS(v1,…,vi), for 1 ≤ i ≤ n.
Proof. Note that for orthonormality, we need ∥wi∥ = 1, for 1 ≤ i ≤ n, and ⟨wi,wj⟩ = 0, for 1 ≤ i ≠ j ≤ n. Also, by Corollary 3.3.8.2, vi ∉ LS(v1,…,vi-1), for 2 ≤ i ≤ n, as {v1,…,vn} is a linearly independent set. We are now ready to prove the result by induction.
Step 1: Define w1 = v1/∥v1∥. Then LS(v1) = LS(w1).
Step 2: Define u2 = v2 - ⟨v2,w1⟩w1. Then, u2 ≠ 0 as v2 ∉ LS(v1). So, let w2 = u2/∥u2∥. Note that {w1,w2} is orthonormal and LS(w1,w2) = LS(v1,v2).
Step 3: For induction, assume that we have obtained an orthonormal set {w1,…,wk-1} such that LS(v1,…,vk-1) = LS(w1,…,wk-1). Now, note that
uk = vk - ∑_{i=1}^{k-1}⟨vk,wi⟩wi = vk - ∑_{i=1}^{k-1} Projwi(vk) ≠ 0,
as vk ∉ LS(v1,…,vk-1). So, let us put wk = uk/∥uk∥. Then, {w1,…,wk} is orthonormal as ∥wk∥ = 1 and
⟨wk,w1⟩ = (1/∥uk∥)⟨vk - ∑_{i=1}^{k-1}⟨vk,wi⟩wi, w1⟩ = (1/∥uk∥)(⟨vk,w1⟩ - ⟨vk,w1⟩) = 0.
Similarly, ⟨wk,wi⟩ = 0, for 2 ≤ i ≤ k - 1. Clearly, wk = uk/∥uk∥ ∈ LS(w1,…,wk-1,vk). So, wk ∈ LS(v1,…,vk).
As vk = ∥uk∥wk + ∑_{i=1}^{k-1}⟨vk,wi⟩wi, we get vk ∈ LS(w1,…,wk). Hence, by the principle of mathematical induction, LS(w1,…,wk) = LS(v1,…,vk) and the required result follows. _
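The proof is constructive, and a direct transcription of its steps gives an algorithm. Below is a minimal sketch (Python/NumPy, standard inner product on ℝn; vectors whose residual is numerically zero are skipped, as in Example 5.2.11.2 below).

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list w with LS(w_1..w_i) = LS(v_1..v_i)."""
    basis = []
    for v in vectors:
        u = v.astype(float)
        for w in basis:                  # subtract the projections on the earlier w's
            u = u - np.dot(u, w) * w
        if np.linalg.norm(u) > tol:      # u = 0 means v depends on the earlier vectors
            basis.append(u / np.linalg.norm(u))
    return basis

S = [np.array([1.0, 0, 1, 0]), np.array([0.0, 1, 0, 1]), np.array([1.0, -1, 1, 1])]
for w in gram_schmidt(S):
    print(w)
```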
We now illustrate the Gram-Schmidt process with a few examples.
Example 5.2.11.
1. Let S = {(1,-1,1,1)T, (1,0,1,0)T, (0,1,0,1)T} ⊆ ℝ4. Find an orthonormal set T such that LS(S) = LS(T).
Solution: Let v1 = (1,0,1,0)T, v2 = (0,1,0,1)T and v3 = (1,-1,1,1)T. Then, w1 = (1/√2)(1,0,1,0)T. As ⟨v2,w1⟩ = 0, we get w2 = (1/√2)(0,1,0,1)T. For the third vector, let u3 = v3 - ⟨v3,w1⟩w1 - ⟨v3,w2⟩w2 = (0,-1,0,1)T. Thus, w3 = (1/√2)(0,-1,0,1)T.
2. Let S = {v1, v2, v3, v4} ⊆ ℝ3. Find an orthonormal set T such that LS(S) = LS(T).
Solution: Take w1 = v1/∥v1∥ = e1. For the second vector, consider u2 = v2 - ⟨v2,w1⟩w1. So, put w2 = u2/∥u2∥ = e2.
For the third vector, let u3 = v3 - ∑_{i=1}^{2}⟨v3,wi⟩wi = (0,0,0)T. So, v3 ∈ LS(w1,w2). Or equivalently, the set {v1,v2,v3} is linearly dependent.
So, for computing the third element of T, define u4 = v4 - ∑_{i=1}^{2}⟨v4,wi⟩wi. Then, u4 = e3. So w4 = e3. Hence, T = {w1,w2,w4} = {e1,e2,e3}.
3. Find an orthonormal set in ℝ3 containing (1/√6)(1,2,1)T.
Solution: Let (x,y,z)T ∈ ℝ3 with ⟨(1,2,1)T, (x,y,z)T⟩ = 0. Thus, x + 2y + z = 0.
Observe that (-2,1,0)T and (-1,0,1)T are orthogonal to (1,2,1)T but are not orthogonal to each other.
Method 1: Apply the Gram-Schmidt process to {(1,2,1)T, (-2,1,0)T, (-1,0,1)T} ⊆ ℝ3.
Method 2: Valid only in ℝ3: use the cross product of two vectors.
In either case, verify that {(1/√6)(1,2,1)T, (1/√5)(2,-1,0)T, (1/√30)(1,2,-5)T} is the required set.
We now state two immediate corollaries without proof.
Corollary 5.2.12. Let V ≠ {0} be an ips. If
1. V is finite dimensional then V has an orthonormal basis.
2. S is a non-empty orthonormal set and dim(V) is finite then S can be extended to form an orthonormal basis of V.
Remark 5.2.13. Let S = {v1,…,vn} ≠ {0} be a non-empty subset of a finite dimensional vector space V. If we apply the Gram-Schmidt process to
1. S then we obtain an orthonormal basis of LS(v1,…,vn).
2. a re-arrangement of the elements of S then we may obtain another orthonormal basis of LS(v1,…,vn). But, observe that the size of the two bases will be the same.
Exercise 5.2.14.
1. Let V be an ips with basis B = {v1,…,vn}. Then, prove that B is orthonormal if and only if, for each x ∈ V, x = ∑_{i=1}^n ⟨x,vi⟩vi. [Hint: Since B is a basis, each x ∈ V has a unique linear combination in terms of the vi's.]
2. Let S be a subset of V having 101 elements. Suppose that the application of the Gram-Schmidt process yields u5 = 0. Does it imply that LS(v1,…,v5) = LS(v1,…,v4)? Give reasons for your answer.
3. Let B = {v1,…,vn} be an orthonormal set in ℝn. For 1 ≤ k ≤ n, define Ak = ∑_{i=1}^k viviT. Then, prove that AkT = Ak and AkAk = Ak. Thus, the Ak's are projection matrices.
4. Determine an orthonormal basis of ℝ4 containing (1,-2,1,3)T and (2,1,-3,1)T.
5. Let x ∈ ℝn with ∥x∥ = 1.
   (a) Then, prove that {x} can be extended to form an orthonormal basis of ℝn.
   (b) Let the extended basis be {x,x2,…,xn} and B = [e1,…,en] the standard ordered basis of ℝn. Prove that A = [[x]B, [x2]B, …, [xn]B] is an orthogonal matrix.
6. Let v,w ∈ ℝn, n ≥ 1, with ∥v∥ = ∥w∥ = 1. Prove that there exists an orthogonal matrix A such that Av = w. Prove also that A can be chosen such that det(A) = 1.
7. Let (V,⟨,⟩) be an n-dimensional ips. If u ∈ V with ∥u∥ = 1 then give reasons for the following statements.
   (a) Let S⊥ = {v ∈ V | ⟨v,u⟩ = 0}. Then, dim(S⊥) = n - 1.
   (b) Let 0 ≠ β ∈ F. Then, S = {v ∈ V : ⟨v,u⟩ = β} is not a subspace of V.
   (c) Let v ∈ V. Then, v = v0 + ⟨v,u⟩u for a vector v0 ∈ S⊥. That is, V = LS(u, S⊥).
We end this section by proving the fundamental theorem of linear algebra. So, the readers are advised
to recall the four fundamental subspaces and also to go through Theorem 3.5.9 (the rank-nullity
theorem for matrices). We start with the following result.
Lemma 5.2.15. Let A ∈ Mm,n(ℝ). Then, Null(A) = Null(AT A).
Proof. Let x ∈ Null(A). Then, Ax = 0. So, (AT A)x = AT (Ax) = AT 0 = 0. Thus, x ∈ Null(AT A).
That is, Null(A) ⊆ Null(AT A).
Suppose that x ∈ Null(AT A). Then, (AT A)x = 0 and 0 = xT 0 = xT (AT A)x = (Ax)T (Ax) = ∥Ax∥2.
Thus, Ax = 0 and the required result follows. _
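A quick numerical illustration of Lemma 5.2.15 (Python/NumPy; the rank deficient matrix is an illustrative choice):

```python
import numpy as np

# Check Null(A) = Null(A^T A) by comparing ranks: Null(A) ⊆ Null(A^T A) always holds,
# and equal ranks with the same number of columns force equal null spaces.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])          # rank 1, nontrivial null space
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(A.T @ A))   # both 1

x = np.array([3.0, 0.0, -1.0])           # a vector in Null(A): x1 + 2*x2 + 3*x3 = 0
print(A @ x, (A.T @ A) @ x)              # both the zero vector
```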
Theorem 5.2.16 (Fundamental Theorem of Linear Algebra). Let A ∈ Mm,n(ℂ). Then,
1. dim(Null(A)) + dim(Col(A)) = n,
2. Null(A) = Col(A*)⊥ and Null(A*) = Col(A)⊥,
3. dim(Col(A)) = dim(Col(A*)).
Proof. Part 1: Proved in Theorem 3.5.9.
Part 2: We first prove that Null(A) ⊆ Col(A*)⊥. Let x ∈ Null(A). Then, Ax = 0 and hence, for every u ∈ ℂm,
⟨x, A*u⟩ = (A*u)*x = u*Ax = u*0 = 0.
But Col(A*) = {A*u | u ∈ ℂm}. Thus, x ∈ Col(A*)⊥ and Null(A) ⊆ Col(A*)⊥.
We now prove that Col(A*)⊥ ⊆ Null(A). Let x ∈ Col(A*)⊥. Then, for every y ∈ ℂm, 0 = ⟨x, A*y⟩ = (A*y)*x = y*Ax = ⟨Ax, y⟩. In particular, for y = Ax ∈ ℂm, we get ∥Ax∥2 = 0. Hence Ax = 0. That is, x ∈ Null(A). Thus, the proof of the first equality in Part 2 is over. We omit the second equality as it proceeds on the same lines as above.
Part 3: Use the first two parts to get the required result.
Hence the proof of the fundamental theorem is complete. _
Remark 5.2.17. Theorem 5.2.16.2 implies that Null(A) = Col(A*)⊥. This statement is
just stating the usual fact that if x ∈ Null(A) then Ax = 0 and hence the usual dot product of
every row of A with x equals 0.
As an implication of Theorem 5.2.16.2 and Theorem 5.2.16.3, we show the existence of an
invertible linear map T : Col(A*) → Col(A).
Corollary 5.2.18. Let A ∈ Mn(ℂ). Then, the function T : Col(A*) → Col(A) defined by
T(x) = Ax is invertible.
Proof. In view of Theorem 5.2.16.3 and the rank-nullity theorem, we just need to show that the
map is one-one. So, suppose that there exist x,y ∈ Col(A*) such that T(x) = T(y). Or
equivalently, Ax = Ay. Thus, x - y ∈ Null(A) = (Col(A*))⊥ (by Theorem 5.2.16.2). Therefore,
x-y ∈ (Col(A*))⊥∩Col(A*) = {0}. Thus, x = y and hence the map is one-one. Thus, the required
result follows. _
The readers should look at Example 3.2.3 and Remark 3.2.4. We give one more example.
Example 5.2.19. Let A = . Then, verify that
1. {(0,1,1)T, (1,1,2)T} is a basis of Col(A).
2. {(1,1,-1)T} is a basis of Null(AT).
3. Null(AT) = (Col(A))⊥.
Exercise 5.2.20.
1. Find distinct subspaces W1 and W2
   (a) in ℝ2 such that W1 and W2 are orthogonal but are not orthogonal complements of each other.
   (b) in ℝ3 such that W1 ≠ {0} and W2 ≠ {0} are orthogonal, but are not orthogonal complements of each other.
2. Let A ∈ Mm,n(ℂ). Then, Null(A) = Null(A*A).
3. Let A ∈ Mm,n(ℝ). Then, Col(AT) = Col(AT A).
4. Let A ∈ Mm,n(ℝ). Then, Rank(A) = n if and only if Rank(AT A) = n.
5. Let A ∈ Mm,n(ℝ). Then, for every
   (a) x ∈ ℝn, x = u + v, where u ∈ Col(AT) and v ∈ Null(A) are unique.
   (b) y ∈ ℝm, y = w + z, where w ∈ Col(A) and z ∈ Null(AT) are unique.
For more information related to the fundamental theorem of linear algebra, the interested readers are advised to see the article "The Fundamental Theorem of Linear Algebra, Gilbert Strang, The American Mathematical Monthly, Vol. 100, No. 9, Nov., 1993, pp. 848 - 855."
The next result gives the proof of the QR decomposition for real matrices. The readers are advised
to prove similar results for matrices with complex entries. This decomposition and its
generalizations are helpful in the numerical calculations related with eigenvalue problems (see
Chapter 6).
Theorem 5.2.1 (QR Decomposition). Let A ∈ Mn(ℝ) be invertible. Then, there exist
matrices Q and R such that Q is orthogonal and R is upper triangular with A = QR.
Furthermore, if det(A)≠0 then the diagonal entries of R can be chosen to be positive. Also, in
this case, the decomposition is unique.
Proof. As A is invertible, its columns form a basis of ℝn. So, an application of the Gram-Schmidt orthonormalization process to {A[:,1],…,A[:,n]} gives an orthonormal basis {v1,…,vn} of ℝn satisfying
LS(A[:,1],…,A[:,i]) = LS(v1,…,vi), for 1 ≤ i ≤ n.
Since A[:,i] ∈ LS(v1,…,vi), for 1 ≤ i ≤ n, there exist αji ∈ ℝ, 1 ≤ j ≤ i, such that A[:,i] = α1i v1 + ⋯ + αii vi. Thus, if Q = [v1,…,vn] and R = [αji] (with αji = 0 for j > i) then
1. Q is an orthogonal matrix (see Exercise 5.4.8.5),
2. R is an upper triangular matrix, and
3. A = QR.
This completes the proof of the first part. Note that
1. αii ≠ 0, for 1 ≤ i ≤ n, as A[:,1] ≠ 0 and A[:,i] ∉ LS(v1,…,vi-1).
2. if αii < 0, for some i, 1 ≤ i ≤ n, then we can replace vi in Q by -vi to get a new Q and R in which the diagonal entries of R are positive.
Uniqueness: Suppose Q1R1 = Q2R2 for some orthogonal matrices Qi's and upper triangular matrices Ri's with positive diagonal entries. As the Qi's and Ri's are invertible, we get Q2⁻¹Q1 = R2R1⁻¹. Now, using
1. Exercises 2.3.25.1 and 1.2.15.1, the matrix R2R1⁻¹ is an upper triangular matrix.
2. Exercise 1.3.2.3, Q2⁻¹Q1 is an orthogonal matrix.
So, the matrix R2R1⁻¹ is an orthogonal upper triangular matrix and hence, by Exercise 1.2.11.4, R2R1⁻¹ = In. So, R2 = R1 and therefore Q2 = Q1. _
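In practice one rarely performs the Gram-Schmidt computation by hand; a library routine returns the same factorization up to the signs discussed above. A small sketch (Python/NumPy; the invertible matrix is an illustrative choice):

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])          # an illustrative invertible matrix

Q, R = np.linalg.qr(A)                   # Q orthogonal, R upper triangular
print(np.allclose(Q @ R, A))             # True: A = QR
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: Q^T Q = I

# np.linalg.qr may return negative diagonal entries in R; flipping the signs of
# the corresponding columns of Q and rows of R gives the factorization with a
# positive diagonal, which Theorem 5.2.1 shows is unique.
s = np.sign(np.diag(R))
Q, R = Q * s, (R.T * s).T
print(np.diag(R))
```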
Let A be an n × k matrix with Rank(A) = r. Then, by Remark 5.2.13, an application of the Gram-Schmidt orthonormalization process to the columns of A yields an orthonormal set {v1,…,vr} ⊆ ℝn such that
LS(A[:,1],…,A[:,k]) = LS(v1,…,vr).
Hence, proceeding on the lines of the above theorem, we have the following result.
Theorem 5.2.2 (Generalized QR Decomposition). Let A be an n × k matrix of rank r. Then, A = QR, where
1. Q = [v1,…,vr] is an n × r matrix with QT Q = Ir,
2. LS(A[:,1],…,A[:,j]) = LS(v1,…,vi), for 1 ≤ i ≤ j ≤ k, and
3. R is an r × k matrix with Rank(R) = r.
Example 5.2.3.
1. Let A be the 4 × 4 matrix whose columns are the vectors v1 = (1,0,1,0)T, v2 = (0,1,0,1)T, v3 = (1,-1,1,1)T of Example 5.2.11.1 and v4 = (2,1,1,1)T. Find an orthogonal matrix Q and an upper triangular matrix R such that A = QR.
Solution: From Example 5.2.11, we know that w1 = (1/√2)(1,0,1,0)T, w2 = (1/√2)(0,1,0,1)T and w3 = (1/√2)(0,-1,0,1)T. We now compute w4. As v4 = (2,1,1,1)T,
u4 = v4 - ∑_{i=1}^{3}⟨v4,wi⟩wi = (1/2)(1,0,-1,0)T.
Thus, w4 = (1/√2)(1,0,-1,0)T. Hence, we see that A = QR with Q = [w1, w2, w3, w4] and R = QT A.
2. Let A be the 4 × 4 matrix whose columns are (1,-1,1,1)T, (1,0,1,0)T, (1,-2,1,2)T and (0,1,0,1)T. Find a 4 × 3 matrix Q satisfying QT Q = I3 and an upper triangular matrix R such that A = QR.
Solution: Let us apply the Gram-Schmidt orthonormalization process to the columns of A. As v1 = (1,-1,1,1)T, we get w1 = (1/2)(1,-1,1,1)T. Let v2 = (1,0,1,0)T. Then,
u2 = v2 - ⟨v2,w1⟩w1 = (1/2)(1,1,1,-1)T.
Hence, w2 = (1/2)(1,1,1,-1)T. Let v3 = (1,-2,1,2)T. Then,
u3 = v3 - ⟨v3,w1⟩w1 - ⟨v3,w2⟩w2 = 0.
So, we next take v3 = (0,1,0,1)T. Then,
u3 = v3 - ⟨v3,w1⟩w1 - ⟨v3,w2⟩w2 = (0,1,0,1)T.
So, w3 = (1/√2)(0,1,0,1)T. Hence, A = QR with Q = [w1, w2, w3] and R = QT A.
The readers are advised to check the following:
   (a) Rank(A) = 3,
   (b) A = QR with QT Q = I3, and
   (c) R is a 3 × 4 upper triangular matrix with Rank(R) = 3.
Remark 5.2.4. Let A ∈ Mm,n(ℝ).
1. If A = QR with Q = [v1,…,vn] then R = QT A, i.e., the (i,j)-th entry of R is ⟨A[:,j], vi⟩. In case Rank(A) < n, a slight modification gives the matrix R.
2. Further, let Rank(A) = n.
   (a) Then, AT A is invertible (see Exercise 5.2.20.4).
   (b) By Theorem 5.2.2, A = QR with Q a matrix of size m × n and R an upper triangular matrix of size n × n. Also, QT Q = In and Rank(R) = n.
   (c) Thus, AT A = RT QT QR = RT R. As AT A is invertible, the matrix RT R is invertible. Since R is a square matrix, by Exercise 2.3.5.1, the matrix R itself is invertible. Hence, (RT R)⁻¹ = R⁻¹(RT)⁻¹.
   (d) So, if Q = [v1,…,vn] then
   A(AT A)⁻¹AT = QR(RT R)⁻¹RT QT = QR R⁻¹(RT)⁻¹RT QT = QQT = ∑_{i=1}^n viviT.
   (e) Hence, using Theorem 5.3.7, we see that the matrix A(AT A)⁻¹AT = QQT is the orthogonal projection matrix on Col(A) (see the sketch following this remark).
3. Further, let Rank(A) = r < n. If j1,…,jr are the pivot columns of A then Col(A) = Col(B), where B = [A[:,j1],…,A[:,jr]] is an m × r matrix with Rank(B) = r. So, using Part 2e, we see that B(BT B)⁻¹BT is the orthogonal projection matrix on Col(A). So, compute the RREF of A and choose the columns of A corresponding to the pivot columns.
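Part 2e and Part 3 are easy to check numerically. A brief sketch (Python/NumPy; the full column rank matrix is an illustrative choice):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])                        # full column rank

P = A @ np.linalg.inv(A.T @ A) @ A.T              # orthogonal projection onto Col(A)
Q, _ = np.linalg.qr(A)                            # orthonormal basis of Col(A) in the columns of Q

print(np.allclose(P, Q @ Q.T))                    # the two constructions agree
print(np.allclose(P @ P, P), np.allclose(P.T, P)) # idempotent and symmetric
b = np.array([1.0, 2.0, 3.0])
print(A.T @ (b - P @ b))                          # residual b - Pb is orthogonal to Col(A)
```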
Till now, our main interest was to understand the linear system Ax = b, for A ∈ Mm,n(ℂ), x ∈ ℂn and b ∈ ℂm, from different viewpoints. But, in most practical situations the system has no solution. So, we are interested in finding a point x0 ∈ ℝn for which the error ∥b - Ax0∥ is the least. Thus, we consider the problem of finding x0 ∈ ℝn such that
∥b - Ax0∥ = min{∥b - Ax∥ : x ∈ ℝn},
i.e., we try to find the vector Ax0 ∈ Col(A) which is nearest to b. Such an x0 is called a least square solution of Ax = b.
To begin with, recall the following result.
Theorem 5.3.1 (Decomposition). Let V be an ips having W as a finite dimensional subspace. Suppose {f1,…,fk} is an orthonormal basis of W. Then, for each b ∈ V, y = ∑_{i=1}^k ⟨b,fi⟩fi is the closest point in W to b. Furthermore, b - y ∈ W⊥.
We now give a definition and then an implication of Theorem 5.3.1.
Definition 5.3.2. [Orthogonal Projection] Let W be a finite dimensional subspace of an ips V. Then, by Theorem 5.3.1, for each v ∈ V there exist unique vectors w ∈ W and u ∈ W⊥ with v = w + u. We thus define the orthogonal projection of V onto W, denoted PW, by PW(v) = w. The vector w is called the projection of v on W.
Remark 5.3.3. Let A ∈ Mm,n(ℝ) and W = Col(A). Then, to find the orthogonal projection PW(b), we can use either of the following ideas:
1. Determine an orthonormal basis {f1,…,fk} of Col(A) and get PW(b) = ∑_{i=1}^k ⟨b,fi⟩fi.
2. By Theorem 5.2.16.2, Col(A) = Null(AT)⊥. Hence, for b ∈ ℝm there exist unique u ∈ Col(A) and v ∈ Null(AT) such that b = u + v. Thus, using Definition 5.3.2 and Theorem 5.3.1, PW(b) = u.
Before proceeding to projections, we give an application of Theorem 5.3.1 to a linear
system.
Corollary 5.3.4. Let A ∈ Mm,n(ℝ) and b ∈ ℝm. Then, every least square solution of Ax = b
is a solution of the system AT Ax = AT b. Conversely, every solution of AT Ax = AT b is a least
square solution of Ax = b.
Proof. As b ∈ ℝm, by Remark 5.3.3, there exist y ∈ Col(A) and v ∈ Null(AT) such that b = y + v and min{∥b - w∥ | w ∈ Col(A)} = ∥b - y∥. As y ∈ Col(A), there exists x0 ∈ ℝn such that Ax0 = y, i.e., x0 is a least square solution of Ax = b. Hence,
AT Ax0 = AT y = AT(y + v) = AT b,
so that x0 is a solution of AT Ax = AT b.
Conversely, let x1 ∈ ℝn be a solution of AT Ax = AT b, i.e., AT(Ax1 - b) = 0. We need to show that ∥b - Ax1∥ ≤ ∥b - Ax∥, for all x ∈ ℝn.
Note that AT(Ax1 - b) = 0 implies 0 = (x - x1)T AT(Ax1 - b) = (Ax - Ax1)T(Ax1 - b), and hence ⟨b - Ax1, A(x - x1)⟩ = 0. Thus,
∥b - Ax∥2 = ∥(b - Ax1) - A(x - x1)∥2 = ∥b - Ax1∥2 + ∥A(x - x1)∥2 ≥ ∥b - Ax1∥2.
Hence, the required result follows. _
The above corollary gives the following result.
Corollary 5.3.5. Let A ∈ Mm,n(ℝ) and b ∈ ℝm. If
1. AT A is invertible then the least square solution of Ax = b equals x = (AT A)⁻¹AT b.
2. AT A is not invertible then a least square solution of Ax = b equals x = (AT A)⁻AT b, where (AT A)⁻ is the pseudo-inverse of AT A.
Proof. Part 1 directly follows from Corollary 5.3.4. For Part 2, let b = y + v, for y ∈ Col(A) and v ∈ Null(AT). As y ∈ Col(A), there exists x0 ∈ ℝn such that Ax0 = y. Thus, by Remark 5.3.3, AT b = AT(y + v) = AT y = AT Ax0. Now, using the definition of the pseudo-inverse (see Exercise 1.3.7.14), we see that
AT A((AT A)⁻AT b) = AT A(AT A)⁻AT Ax0 = AT Ax0 = AT b.
Thus, we see that (AT A)⁻AT b is a solution of the system AT Ax = AT b. Hence, by Corollary 5.3.4, the required result follows. _
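Corollary 5.3.4 and Corollary 5.3.5.1 can be used directly for computation. A minimal sketch (Python/NumPy; the inconsistent system is an illustrative one) solves the normal equations and compares with a library least squares routine.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 0.0, 2.0])            # b is not in Col(A): Ax = b has no solution

x0 = np.linalg.solve(A.T @ A, A.T @ b)   # solve the normal equations A^T A x = A^T b
x1, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x0, x1)                            # the two least square solutions agree
print(A.T @ (b - A @ x0))                # the residual is orthogonal to Col(A)
```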
We now give a few examples to understand projections.
Example 5.3.6. Use the fundamental theorem of linear algebra to compute the vector of the
orthogonal projection.
1. Determine the projection of (1,1,1,1,1)T on Null([1,-1,1,-1,1]).
Solution: Here A = [1,-1,1,-1,1]. So, a basis of Col(AT) equals {(1,-1,1,-1,1)T} and that of Null(A) equals {(1,1,0,0,0)T, (1,0,-1,0,0)T, (1,0,0,1,0)T, (1,0,0,0,-1)T}. Write (1,1,1,1,1)T = α(1,-1,1,-1,1)T + u, where u ∈ Null(A); equivalently, solve the linear system Bx = (1,1,1,1,1)T, where the columns of B are the above basis vectors of Col(AT) and Null(A). This gives α = 1/5, and hence the required projection equals
u = (1,1,1,1,1)T - (1/5)(1,-1,1,-1,1)T = (2/5)(2,3,2,3,2)T.
2. Determine the projection of (1,1,1)T on Null([1,1,-1]).
Solution: Here A = [1,1,-1]. So, a basis of Null(A) equals {(1,-1,0)T, (1,0,1)T} and that of Col(AT) equals {(1,1,-1)T}. Proceeding as above, the projection equals (1,1,1)T - (1/3)(1,1,-1)T = (2/3)(1,1,2)T.
3. Determine the projection of (1,1,1)T on Col((1,2,1)T).
Solution: Here, AT = [1,2,1], a basis of Col(A) equals {(1,2,1)T} and that of Null(AT) equals {(1,0,-1)T, (2,-1,0)T}. Proceeding as above gives (2/3)(1,2,1)T as the required vector.
To use the first idea in Remark 5.3.3, we prove the following result, which helps us to get the matrix of the orthogonal projection from an orthonormal basis.
Theorem 5.3.7. Let {f1,…,fk} be an orthonormal basis of a finite dimensional subspace W of an ips V. Then PW = ∑_{i=1}^k fifi*.
Proof. Let v ∈ V. Then,
PW(v) = ∑_{i=1}^k ⟨v,fi⟩fi = ∑_{i=1}^k fi(fi*v) = (∑_{i=1}^k fifi*) v.
As PW(v) is the only closest point in W (see Theorem 5.3.1), the required result follows. _
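Theorem 5.3.7 is also easy to verify numerically. A short sketch (Python/NumPy, real case; the orthonormal pair spanning W is an illustrative choice):

```python
import numpy as np

# Orthonormal basis of a plane W in R^3 (illustrative choice).
f1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
f2 = np.array([0.0, 0.0, 1.0])

P = np.outer(f1, f1) + np.outer(f2, f2)   # P_W = sum_i f_i f_i^T

v = np.array([1.0, 2.0, 3.0])
print(P @ v)                                          # projection of v on W
print(np.dot(v - P @ v, f1), np.dot(v - P @ v, f2))   # residual orthogonal to W
```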
Example 5.3.8. In each of the following, determine the matrix of the orthogonal projection. Also,
verify that PW + PW⊥ = I. What can you say about Rank(PW⊥) and Rank(PW)? Also, verify the
orthogonal projection vectors obtained in Example 5.3.6.
1. W = {(x1,…,x5)T ∈ ℝ5 | x1 - x2 + x3 - x4 + x5 = 0} = Null([1,-1,1,-1,1]).
Solution: Here W⊥ = LS((1,-1,1,-1,1)T). Thus, with n = (1,-1,1,-1,1)T,
PW⊥ = (1/5) n nT and PW = I5 - PW⊥.
An orthonormal basis {f1,…,f4} of W can also be obtained via the Gram-Schmidt process, and then PW = ∑_{i=1}^4 fifiT.
2. W = {(x,y,z)T ∈ ℝ3 | x + y - z = 0} = Null([1,1,-1]).
Solution: Note that {(1,1,-1)T} is a basis of W⊥. So, with n = (1,1,-1)T,
PW⊥ = (1/3) n nT and PW = I3 - PW⊥.
Verify that PW + PW⊥ = I3, Rank(PW) = 2 and Rank(PW⊥) = 1.
3. W = LS((1,2,1)T) = Col((1,2,1)T) ⊆ ℝ3.
Solution: Using Example 5.2.11.3 and Equation (5.2.1), with u = (1,2,1)T,
PW = (1/6) u uT and PW⊥ = I3 - PW.
We advise the readers to give a proof of the next result.
Theorem 5.3.9. Let {f1,…,fk} be an orthonormal basis of a subspace W of ℝn. If {f1,…,fn} is an extended orthonormal basis of ℝn, PW = ∑_{i=1}^k fifiT and PW⊥ = ∑_{i=k+1}^n fifiT then prove that
1. In - PW = PW⊥.
2. (PW)T = PW and (PW⊥)T = PW⊥. That is, PW and PW⊥ are symmetric.
3. (PW)2 = PW and (PW⊥)2 = PW⊥. That is, PW and PW⊥ are idempotent.
4. PW ∘ PW⊥ = PW⊥ ∘ PW = 0.
Exercise 5.3.10.
1. Let W = {(x,y,z,w) ∈ ℝ4 : x = y, z = w} be a subspace of ℝ4. Determine the matrix of the orthogonal projection onto W.
2. Let PW1 and PW2 be the orthogonal projections of ℝ2 onto W1 = {(x,0) : x ∈ ℝ} and W2 = {(x,x) : x ∈ ℝ}, respectively. Note that PW1 ∘ PW2 is a projection onto W1, but it is not an orthogonal projection. Hence or otherwise, conclude that the composition of two orthogonal projections need not be an orthogonal projection.
3. Let A be a 2 × 2 idempotent matrix that is not symmetric (for example, A = [1, 1; 0, 0]). Now, define P : ℝ2 → ℝ2 by P(v) = Av, for all v ∈ ℝ2. Then,
   (a) P is idempotent.
   (b) Null(P) ∩ Rng(P) = Null(A) ∩ Col(A) = {0}.
   (c) ℝ2 = Null(P) + Rng(P). But, (Rng(P))⊥ = (Col(A))⊥ ≠ Null(A).
   (d) Since (Col(A))⊥ ≠ Null(A), the map P is not an orthogonal projector. In this case, P is called a projection of ℝ2 onto Rng(P) along Null(P).
4. Find all 2 × 2 real matrices A such that A2 = A. Hence, or otherwise, determine all projection operators of ℝ2.
5. Let W be an (n-1)-dimensional subspace of ℝn with ordered basis BW = [f1,…,fn-1]. Suppose B = [f1,…,fn-1,fn] is an orthonormal ordered basis of ℝn obtained by extending BW. Now, define a function Q : ℝn → ℝn by Q(v) = ⟨v,fn⟩fn - ∑_{i=1}^{n-1}⟨v,fi⟩fi. Then,
   (a) Q fixes every vector in W⊥.
   (b) Q sends every vector w ∈ W to -w.
   (c) Q ∘ Q = In.
The function Q is called the reflection operator with respect to W⊥.
Theorem 5.3.9 implies that the matrix of the projection operator is symmetric. We use this idea to
proceed further.
Definition 5.3.11. [Self-Adjoint Operator] Let V be an ips with inner product ⟨,⟩. A linear
operator P : V → V is called self-adjoint if ⟨P(v),u⟩ = ⟨v,P(u)⟩, for every u,v ∈ V.
A careful understanding of the examples given below shows that self-adjoint operators and Hermitian matrices are related. It also shows that the vector spaces ℂn and ℝn can be decomposed in terms of the null space and column space of Hermitian matrices. These decompositions also follow directly from the fundamental theorem of linear algebra.
Example 5.3.12.
1. Let A be an n × n real symmetric matrix. If P : ℝn → ℝn is defined by P(x) = Ax, for every x ∈ ℝn, then
   (a) P is a self-adjoint operator as A = AT implies, for every x,y ∈ ℝn,
   ⟨P(x),y⟩ = yT Ax = (AT y)T x = (Ay)T x = ⟨x,Ay⟩ = ⟨x,P(y)⟩.
   (b) Null(P) = (Rng(P))⊥ as A = AT. Thus, ℝn = Null(P) ⊕ Rng(P).
2. Let A be an n × n Hermitian matrix. If P : ℂn → ℂn is defined by P(z) = Az, for all z ∈ ℂn, then using similar arguments (see Example 5.3.12.1) prove the following:
   (a) P is a self-adjoint operator.
   (b) Null(P) = (Rng(P))⊥ as A = A*. Thus, ℂn = Null(P) ⊕ Rng(P).
We now state and prove the main result related with orthogonal projection operators.
Theorem 5.3.13. Let V be a finite dimensional ips. If V = W ⊕ W⊥ then the orthogonal projectors
PW : V → V on W and PW⊥ : V → V on W⊥ satisfy
1. Null(PW) = {v ∈ V : PW(v) = 0} = W⊥ = Rng(PW⊥).
2. Rng(PW) = {PW(v) : v ∈ V} = W = Null(PW⊥).
3. PW ∘ PW = PW and PW⊥ ∘ PW⊥ = PW⊥ (idempotence).
4. PW⊥ ∘ PW = 0V and PW ∘ PW⊥ = 0V, where 0V(v) = 0, for all v ∈ V.
5. PW + PW⊥ = IV, where IV(v) = v, for all v ∈ V.
6. The operators PW and PW⊥ are self-adjoint.
Proof. Part 1: As V = W ⊕ W⊥, for each u ∈ W⊥, one uniquely writes u = 0 + u, where 0 ∈ W and u ∈ W⊥. Hence, by definition, PW(u) = 0 and PW⊥(u) = u. Thus, W⊥ ⊆ Null(PW) and W⊥ ⊆ Rng(PW⊥).
Now suppose that v ∈ Null(PW). So, PW(v) = 0. As V = W ⊕ W⊥, v = w + u, for unique w ∈ W and unique u ∈ W⊥. So, by definition, PW(v) = w. Thus, w = PW(v) = 0. That is, v = 0 + u = u ∈ W⊥. Thus, Null(PW) ⊆ W⊥.
A similar argument implies Rng(PW⊥) ⊆ W⊥, thus completing the proof of the first part.
Part 2: Use an argument similar to the proof of Part 1.
Part 3, Part 4 and Part 5: Let v ∈ V. Then, v = w + u, for unique w ∈ W and unique u ∈ W⊥. Thus, by definition, PW(v) = w and PW⊥(v) = u, so that
(PW ∘ PW)(v) = PW(w) = w = PW(v), (PW⊥ ∘ PW)(v) = PW⊥(w) = 0 and (PW + PW⊥)(v) = w + u = v.
Hence, PW ∘ PW = PW, PW⊥ ∘ PW = 0V and IV = PW + PW⊥.
Part 6: Let u = w1 + x1 and v = w2 + x2, for unique w1,w2 ∈ W and unique x1,x2 ∈ W⊥. Then, by definition, ⟨wi,xj⟩ = 0 for 1 ≤ i,j ≤ 2. Thus,
⟨PW(u),v⟩ = ⟨w1, w2 + x2⟩ = ⟨w1,w2⟩ = ⟨w1 + x1, w2⟩ = ⟨u, PW(v)⟩,
and similarly for PW⊥. Hence, both operators are self-adjoint and the proof of the theorem is complete. _
The next theorem is a generalization of Theorem 5.3.13. We omit the proof as the arguments are similar and use the following:
Let V be a finite dimensional ips with V = W1 ⊕ ⋯ ⊕ Wk, for certain subspaces Wi's of V. Then, for each v ∈ V there exist unique vectors v1,…,vk such that
1. vi ∈ Wi, for 1 ≤ i ≤ k,
2. ⟨vi,vj⟩ = 0 for each vi ∈ Wi, vj ∈ Wj, 1 ≤ i ≠ j ≤ k, and
3. v = v1 + ⋯ + vk.
We now give the definition and a few properties of an orthogonal operator.
Definition 5.4.1. [Orthogonal Operator] Let V be a vector space. Then, a linear operator
T : V → V is said to be an orthogonal operator if ∥T(x)∥ = ∥x∥, for all x ∈ V.
Example 5.4.2. Each T ∈ L(V) given below is an orthogonal operator (a numerical check follows this example).
1. Fix a unit vector a ∈ V and define T(x) = 2⟨x,a⟩a - x, for all x ∈ V.
Solution: Note that Proja(x) = ⟨x,a⟩a. So, ⟨⟨x,a⟩a, x - ⟨x,a⟩a⟩ = 0. Also, by the Pythagoras theorem, ∥x - ⟨x,a⟩a∥2 = ∥x∥2 - (⟨x,a⟩)2. Thus,
∥T(x)∥2 = ∥⟨x,a⟩a + (⟨x,a⟩a - x)∥2 = (⟨x,a⟩)2 + ∥x - ⟨x,a⟩a∥2 = ∥x∥2.
2. Let n = 2, V = ℝ2 and 0 ≤ θ < 2π. Now define T(x) = (x1 cos θ - x2 sin θ, x1 sin θ + x2 cos θ)T.
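Both examples can be checked numerically. A short sketch (Python/NumPy; the unit vector a, the angle θ and the test vectors are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
a = np.array([1.0, 2.0, 2.0]) / 3.0          # a unit vector in R^3

def reflect(x):
    """T(x) = 2 <x, a> a - x (Example 5.4.2.1)."""
    return 2 * np.dot(x, a) * a - x

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation of R^2 (Example 5.4.2.2)

x3, x2 = rng.normal(size=3), rng.normal(size=2)
print(np.linalg.norm(reflect(x3)), np.linalg.norm(x3))   # norms agree
print(np.linalg.norm(R @ x2), np.linalg.norm(x2))        # norms agree
```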
We now show that an operator is orthogonal if and only if it preserves the inner product.
Theorem 5.4.3. Let T ∈ L(V). Then, the following statements are equivalent.
1. T is an orthogonal operator.
2. ⟨T(x),T(y)⟩ = ⟨x,y⟩, for all x,y ∈ V. That is, T preserves the inner product.
Proof. 1 ⇒ 2: Let T be an orthogonal operator. Then, ∥T(x + y)∥2 = ∥x + y∥2. So,
∥T(x)∥2 + ∥T(y)∥2 + 2⟨T(x),T(y)⟩ = ∥T(x) + T(y)∥2 = ∥T(x + y)∥2 = ∥x∥2 + ∥y∥2 + 2⟨x,y⟩.
Thus, using the definition again, ⟨T(x),T(y)⟩ = ⟨x,y⟩.
2 ⇒ 1: If ⟨T(x),T(y)⟩ = ⟨x,y⟩, for all x,y ∈ V, then T is an orthogonal operator as ∥T(x)∥2 = ⟨T(x),T(x)⟩ = ⟨x,x⟩ = ∥x∥2. _
As an immediate corollary, we obtain the following result.
Definition 5.4.5. [Isometry, Rigid Motion] Let V be a vector space. Then, a map T : V → V
is said to be an isometry or a rigid motion if ∥T(x) - T(y)∥ = ∥x - y∥, for all x,y ∈ V.
That is, an isometry is distance preserving.
Observe that if T and S are two rigid motions then ST is also a rigid motion. Furthermore, it is
clear from the definition that every rigid motion is invertible.
Example 5.4.6. The maps given below are rigid motions/isometries.
1. Let V be a linear space with norm ∥⋅∥. If a ∈ V then the translation map Ta : V → V (see Exercise 7), defined by Ta(x) = x + a for all x ∈ V, is an isometry/rigid motion as
∥Ta(x) - Ta(y)∥ = ∥(x + a) - (y + a)∥ = ∥x - y∥, for all x,y ∈ V.
2. Let V be an ips. Then, using Theorem 5.4.3, we see that every orthogonal operator is an isometry.
We now prove that every rigid motion that fixes origin is an orthogonal operator.
Theorem 5.4.7. Let V be a real ips. Then, the following statements are equivalent for any map T : V → V.
1. T is a rigid motion that fixes the origin.
2. T is linear and ⟨T(x),T(y)⟩ = ⟨x,y⟩, for all x,y ∈ V (T preserves the inner product).
3. T is an orthogonal operator.
Proof. We have already seen the equivalence of Part 2 and Part 3 in Theorem 5.4.3. Let us now prove
the equivalence of Part 1 and Part 2/Part 3.
If T is an orthogonal operator then T(0) = 0 and ∥T(x) - T(y)∥ = ∥T(x - y)∥ = ∥x - y∥. This
proves Part 3 implies Part 1.
We now prove Part 1 implies Part 2. So, let T be a rigid motion that fixes 0. Thus, T(0) = 0 and ∥T(x) - T(y)∥ = ∥x - y∥, for all x,y ∈ V. Hence, in particular for y = 0, we have ∥T(x)∥ = ∥x∥, for all x ∈ V. So,
∥T(x)∥2 + ∥T(y)∥2 - 2⟨T(x),T(y)⟩ = ∥T(x) - T(y)∥2 = ∥x - y∥2 = ∥x∥2 + ∥y∥2 - 2⟨x,y⟩.
Thus, using ∥T(x)∥ = ∥x∥, for all x ∈ V, we get ⟨T(x),T(y)⟩ = ⟨x,y⟩, for all x,y ∈ V. Now, to prove that T is linear, we use ⟨T(x),T(y)⟩ = ⟨x,y⟩ to get
∥T(x + y) - (T(x) + T(y))∥2 = ∥T(x + y)∥2 + ∥T(x)∥2 + ∥T(y)∥2 - 2⟨T(x + y),T(x)⟩ - 2⟨T(x + y),T(y)⟩ + 2⟨T(x),T(y)⟩ = ∥x + y∥2 + ∥x∥2 + ∥y∥2 - 2⟨x + y,x⟩ - 2⟨x + y,y⟩ + 2⟨x,y⟩ = ∥(x + y) - x - y∥2 = 0.
Thus, T(x + y) - (T(x) + T(y)) = 0 and hence T(x + y) = T(x) + T(y). A similar calculation gives T(αx) = αT(x) and hence T is linear. _
Exercise 5.4.8.
1. Let A,B ∈ Mn(ℂ). Then, A and B are said to be
   (a) Orthogonally Congruent if B = ST AS, for some orthogonal matrix S.
   (b) Unitarily Congruent if B = S*AS, for some unitary matrix S.
Prove that orthogonal and unitary congruence are equivalence relations on Mn(ℝ) and Mn(ℂ), respectively.
2. Let x ∈ ℝ2. Identify it with the complex number x = x1 + ix2. If we rotate x counterclockwise by an angle θ, 0 ≤ θ < 2π, then we have
xe^{iθ} = (x1 cos θ - x2 sin θ) + i(x1 sin θ + x2 cos θ).
Thus, the corresponding vector in ℝ2 is (x1 cos θ - x2 sin θ, x1 sin θ + x2 cos θ)T.
Is the matrix [cos θ, -sin θ; sin θ, cos θ] the matrix of the corresponding rotation? Justify.
3. Let T(θ) = [cos θ, -sin θ; sin θ, cos θ] and S(θ) = [cos θ, sin θ; sin θ, -cos θ], for θ ∈ ℝ. Then, A ∈ M2(ℝ) is an orthogonal matrix if and only if A = T(θ) or A = S(θ), for some θ ∈ ℝ.
4. Let A ∈ Mn(ℝ). Then, the following statements are equivalent.
   (a) A is an orthogonal matrix.
   (b) A⁻¹ = AT.
   (c) AT is orthogonal.
   (d) the columns of A form an orthonormal basis of the real vector space ℝn.
   (e) the rows of A form an orthonormal basis of the real vector space ℝn.
   (f) for any two vectors x,y ∈ ℝn, ⟨Ax,Ay⟩ = ⟨x,y⟩ (orthogonal matrices preserve angles).
   (g) for any vector x ∈ ℝn, ∥Ax∥ = ∥x∥ (orthogonal matrices preserve length).
5. Let U be an n × n matrix. Then, prove that the following statements are equivalent.
   (a) U is a unitary matrix.
   (b) U⁻¹ = U*.
   (c) U* is unitary.
   (d) the columns of U form an orthonormal basis of the complex vector space ℂn.
   (e) the rows of U form an orthonormal basis of the complex vector space ℂn.
   (f) for any two vectors x,y ∈ ℂn, ⟨Ux,Uy⟩ = ⟨x,y⟩ (unitary matrices preserve angles).
   (g) for any vector x ∈ ℂn, ∥Ux∥ = ∥x∥ (unitary matrices preserve length).
6. Let A be an n × n orthogonal matrix. Then, prove that det(A) = ±1.
7. Let A be an n × n upper triangular matrix. If A is also an orthogonal matrix then A is a diagonal matrix with diagonal entries ±1.
8. Prove that in M5(ℝ) there are infinitely many orthogonal matrices, of which only finitely many are diagonal (in fact, their number is just 32).
9. Prove that permutation matrices are real orthogonal.
10. Let A,B ∈ Mn(ℂ) be two unitary matrices. Then, prove that AB and BA are unitary matrices.
11. If A = [aij] and B = [bij] are unitarily equivalent then prove that ∑_{i,j}|aij|2 = ∑_{i,j}|bij|2.
12. Let U be a unitary matrix and for every x ∈ ℂn, define ∥x∥1 = ∑_{i=1}^n |xi|. Then, is it necessary that ∥Ux∥1 = ∥x∥1?
In the previous chapter, we learnt that if V is a vector space over F with dim(V) = n then V basically looks like Fn. Also, any subspace of Fn is either Col(A) or Null(A) or both, for some matrix A with entries from F.
So, we started this chapter with the inner product, a generalization of the dot product in ℝ3 or ℝn. We used the inner product to define the length/norm of a vector. The norm has the property that "the norm of a vector is zero if and only if the vector itself is the zero vector". We then proved the Cauchy-Bunyakovskii-Schwarz Inequality, which helped us in defining the angle between two vectors. Thus, one can talk of geometric problems in ℝn and prove some geometric results.
We then independently defined the notion of a norm in ℝn and showed that a norm is induced by an inner product if and only if the norm satisfies the parallelogram law (the sum of squares of the diagonals equals twice the sum of squares of the two non-parallel sides).
The next subsection dealt with the fundamental theorem of linear algebra, where we showed that if A ∈ Mm,n(ℂ) then
1. dim(Null(A)) + dim(Col(A)) = n.
2. Null(A) = Col(A*)⊥ and Null(A*) = Col(A)⊥.
3. dim(Col(A)) = dim(Col(A*)).
We then saw that having an orthonormal basis is an asset as determining the
1. coordinates of a vector boils down to computing inner products.
2. projection of a vector on a subspace boils down to finding an orthonormal basis of the subspace and then summing the corresponding rank 1 matrices.
So, the question arises: how do we compute an orthonormal basis? This is where we came across the Gram-Schmidt Orthonormalization process. This algorithm helps us to determine an orthonormal basis of LS(S) for any finite subset S of a vector space. This also led to the QR decomposition of a matrix.
Thus, we observe the following about the linear system Ax = b. If
1. b ∈ Col(A) then we can use the Gauss-Jordan method to get a solution.
2. b ∉ Col(A) then in most cases we need a vector x such that the least square error between b and Ax is minimum. We saw that this minimum is attained by the projection of b on Col(A). Also, this vector can be obtained either using the fundamental theorem of linear algebra or by computing the matrix B(BT B)⁻¹BT, where the columns of B are either the pivot columns of A or a basis of Col(A).