2 Matrix Algebra
Introduction
In the study of systems of linear equations in Chapter 1, we found it convenient to manipulate the augmented matrix of the system. Our aim was to reduce it to row-echelon form (using elementary row operations) and hence to write down all solutions to the system. In the present chapter we consider matrices for their own sake. While some of the motivation comes from linear equations, it turns out that matrices can be multiplied and added and so form an algebraic system somewhat analogous to the real numbers. This “matrix algebra” is useful in ways that are quite different from the study of linear equations. For example, the geometrical transformations obtained by rotating the euclidean plane about the origin can be viewed as multiplications by certain matrices. These “matrix transformations” are an important tool in geometry and, in turn, the geometry provides a “picture” of the matrices. Furthermore, matrix algebra has many other applications, some of which will be explored in this chapter. This subject is quite old and was first studied systematically in 1858 by Arthur Cayley.
2.1 Matrix Addition, Scalar Multiplication, and Transposition
A rectangular array of numbers is called a matrix (the plural is matrices), and the numbers are called the entries of the matrix. Matrices are usually denoted by uppercase letters: $A$, $B$, $C$, and so on. Clearly matrices come in various shapes depending on the number of rows and columns. In general, a matrix with $m$ rows and $n$ columns is referred to as an $m \times n$ matrix, or as having size $m \times n$. A matrix of size $1 \times n$ is called a row matrix, whereas one of size $m \times 1$ is called a column matrix. Matrices of size $n \times n$ for some $n$ are called square matrices.
Each entry of a matrix is identified by the row and column in which it lies. The rows are numbered from the top down, and the columns are numbered from left to right. Then the $(i, j)$-entry of a matrix is the number lying simultaneously in row $i$ and column $j$. For example, the $(1, 2)$-entry lies in row 1 and column 2.

A special notation is commonly used for the entries of a matrix. If $A$ is an $m \times n$ matrix, and if the $(i, j)$-entry of $A$ is denoted as $a_{ij}$, then $A$ is displayed as follows:
$$A = \left[ \begin{array}{cccc} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{array} \right]$$
This is usually denoted simply as $A = \left[ a_{ij} \right]$. Thus $a_{ij}$ is the entry in row $i$ and column $j$ of $A$. For example, a $3 \times 4$ matrix in this notation is written
$$A = \left[ \begin{array}{cccc} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{array} \right]$$
It is worth pointing out a convention regarding rows and columns: Rows are mentioned before columns. For example:
- If a matrix has size $m \times n$, it has $m$ rows and $n$ columns.
- If we speak of the $(i, j)$-entry of a matrix, it lies in row $i$ and column $j$.
- If an entry is denoted $a_{ij}$, the first subscript $i$ refers to the row and the second subscript $j$ to the column in which $a_{ij}$ lies.

Two points $(x, y)$ and $(x_{1}, y_{1})$ in the plane are equal if and only if they have the same coordinates, that is $x = x_{1}$ and $y = y_{1}$. Similarly, two matrices $A$ and $B$ are called equal (written $A = B$) if and only if:
- They have the same size.
- Corresponding entries are equal.

If the entries of $A$ and $B$ are written in the form $A = \left[ a_{ij} \right]$, $B = \left[ b_{ij} \right]$, described earlier, then the second condition takes the following form:
$$\left[ a_{ij} \right] = \left[ b_{ij} \right] \mbox{ means } a_{ij} = b_{ij} \mbox{ for all } i \mbox{ and } j$$
Example 2.1.1
Given
$$A = \left[ \begin{array}{cc} a & b \\ c & d \end{array} \right] \qquad B = \left[ \begin{array}{rrr} 1 & 2 & -1 \\ 3 & 0 & 1 \end{array} \right] \qquad C = \left[ \begin{array}{rr} 1 & 0 \\ -1 & 2 \end{array} \right]$$
discuss the possibility that $A = B$, $B = C$, $A = C$.

Solution:

$A = B$ is impossible because $A$ and $B$ are of different sizes: $A$ is $2 \times 2$ whereas $B$ is $2 \times 3$. Similarly, $B = C$ is impossible. But $A = C$ is possible provided that corresponding entries are equal: $A = C$ means $a = 1$, $b = 0$, $c = -1$, and $d = 2$.
Matrix Addition
Definition 2.1 Matrix Addition

If $A$ and $B$ are matrices of the same size, their sum $A + B$ is the matrix formed by adding corresponding entries.

If $A = \left[ a_{ij} \right]$ and $B = \left[ b_{ij} \right]$, this takes the form
$$A + B = \left[ a_{ij} + b_{ij} \right]$$
Note that addition is not defined for matrices of different sizes.
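For readers who want to experiment numerically, here is a minimal sketch (not part of the original text) of entrywise addition; it assumes the NumPy library is available.

```python
import numpy as np

# Two matrices of the same size are added entry by entry.
A = np.array([[2, 1, 3],
              [-1, 2, 0]])
B = np.array([[1, 1, -1],
              [2, 0, 6]])

print(A + B)                          # entrywise sum, a 2 x 3 matrix
print(np.array_equal(A + B, B + A))   # commutative law: True

# Addition is not defined for matrices of different sizes;
# NumPy raises an error because the shapes (2, 3) and (2, 2) are incompatible.
C = np.array([[1, 0],
              [-1, 2]])
try:
    A + C
except ValueError as err:
    print("sizes differ:", err)
```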
Example 2.1.2
If
$$A = \left[ \begin{array}{rrr} 2 & 1 & 3 \\ -1 & 2 & 0 \end{array} \right] \quad \mbox{and} \quad B = \left[ \begin{array}{rrr} 1 & 1 & -1 \\ 2 & 0 & 6 \end{array} \right]$$
compute $A + B$.

Solution:
$$A + B = \left[ \begin{array}{rrr} 2 + 1 & 1 + 1 & 3 - 1 \\ -1 + 2 & 2 + 0 & 0 + 6 \end{array} \right] = \left[ \begin{array}{rrr} 3 & 2 & 2 \\ 1 & 2 & 6 \end{array} \right]$$

Example 2.1.3

Find $a$, $b$, and $c$ if
$$\left[ \begin{array}{ccc} a & b & c \end{array} \right] + \left[ \begin{array}{ccc} c & a & b \end{array} \right] = \left[ \begin{array}{ccc} 3 & 2 & -1 \end{array} \right]$$

Solution:

Add the matrices on the left side to obtain
$$\left[ \begin{array}{ccc} a + c & b + a & c + b \end{array} \right] = \left[ \begin{array}{ccc} 3 & 2 & -1 \end{array} \right]$$
Because corresponding entries must be equal, this gives three equations: $a + c = 3$, $b + a = 2$, and $c + b = -1$. Solving these yields $a = 3$, $b = -1$, $c = 0$.
If $A$, $B$, and $C$ are any matrices of the same size, then
$$A + B = B + A \quad \mbox{(commutative law)}$$
$$A + (B + C) = (A + B) + C \quad \mbox{(associative law)}$$
In fact, if $A = \left[ a_{ij} \right]$ and $B = \left[ b_{ij} \right]$, then the $(i, j)$-entries of $A + B$ and $B + A$ are, respectively, $a_{ij} + b_{ij}$ and $b_{ij} + a_{ij}$. Since these are equal for all $i$ and $j$, we get
$$A + B = \left[ a_{ij} + b_{ij} \right] = \left[ b_{ij} + a_{ij} \right] = B + A$$
The associative law is verified similarly.

The $m \times n$ matrix in which every entry is zero is called the $m \times n$ zero matrix and is denoted as $0$ (or $0_{mn}$ if it is important to emphasize the size). Hence,
$$0 + X = X$$
holds for all $m \times n$ matrices $X$. The negative of an $m \times n$ matrix $A$ (written $-A$) is defined to be the $m \times n$ matrix obtained by multiplying each entry of $A$ by $-1$. If $A = \left[ a_{ij} \right]$, this becomes $-A = \left[ -a_{ij} \right]$. Hence,
$$A + (-A) = 0$$
holds for all matrices $A$ where, of course, $0$ is the zero matrix of the same size as $A$.

A closely related notion is that of subtracting matrices. If $A$ and $B$ are two $m \times n$ matrices, their difference $A - B$ is defined by
$$A - B = A + (-B)$$
Note that if $A = \left[ a_{ij} \right]$ and $B = \left[ b_{ij} \right]$, then
$$A - B = \left[ a_{ij} \right] + \left[ -b_{ij} \right] = \left[ a_{ij} - b_{ij} \right]$$
is the matrix formed by subtracting corresponding entries.
Example 2.1.4
Let
$$A = \left[ \begin{array}{rrr} 3 & -1 & 0 \\ 1 & 2 & -4 \end{array} \right] \quad B = \left[ \begin{array}{rrr} 1 & -1 & 1 \\ -2 & 0 & 6 \end{array} \right] \quad C = \left[ \begin{array}{rrr} 1 & 0 & -2 \\ 3 & 1 & 1 \end{array} \right]$$
Compute $-A$, $A - B$, and $A + B - C$.

Solution:
$$-A = \left[ \begin{array}{rrr} -3 & 1 & 0 \\ -1 & -2 & 4 \end{array} \right] \qquad A - B = \left[ \begin{array}{rrr} 2 & 0 & -1 \\ 3 & 2 & -10 \end{array} \right] \qquad A + B - C = \left[ \begin{array}{rrr} 3 & -2 & 3 \\ -4 & 1 & 1 \end{array} \right]$$
Example 2.1.5
Solve
$$\left[ \begin{array}{rr} 3 & 2 \\ -1 & 1 \end{array} \right] + X = \left[ \begin{array}{rr} 1 & 0 \\ -1 & 2 \end{array} \right]$$
where $X$ is a matrix.

We solve a numerical equation $a + x = b$ by subtracting the number $a$ from both sides to obtain $x = b - a$. This also works for matrices. To solve the equation, simply subtract the matrix $\left[ \begin{array}{rr} 3 & 2 \\ -1 & 1 \end{array} \right]$ from both sides to get
$$X = \left[ \begin{array}{rr} 1 & 0 \\ -1 & 2 \end{array} \right] - \left[ \begin{array}{rr} 3 & 2 \\ -1 & 1 \end{array} \right] = \left[ \begin{array}{rr} -2 & -2 \\ 0 & 1 \end{array} \right]$$
The reader should verify that this matrix $X$ does indeed satisfy the original equation.

The solution in Example 2.1.5 solves the single matrix equation $A + X = B$ directly via matrix subtraction: $X = B - A$. This ability to work with matrices as entities lies at the heart of matrix algebra.

It is important to note that the sizes of matrices involved in some calculations are often determined by the context. For example, if a sum $A + C$ is known to be a $2 \times 3$ matrix, then $A$ and $C$ must be the same size (so that $A + C$ makes sense), and that size must be $2 \times 3$ (so that the sum is $2 \times 3$). For simplicity we shall often omit reference to such facts when they are clear from the context.
Scalar Multiplication
In gaussian elimination, multiplying a row of a matrix by a number $k$ means multiplying every entry of that row by $k$.

Definition 2.2 Matrix Scalar Multiplication

If $A$ is any matrix and $k$ is any number, the scalar multiple $kA$ is the matrix obtained from $A$ by multiplying each entry of $A$ by $k$.

If $A = \left[ a_{ij} \right]$, this takes the form
$$kA = \left[ ka_{ij} \right]$$
The term scalar arises here because the set of numbers from which the entries are drawn is usually referred to as the set of scalars. We have been using real numbers as scalars, but we could equally well have been using complex numbers.
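A short sketch (assuming NumPy; not part of the original text) of scalar multiplication and of combining it with addition:

```python
import numpy as np

A = np.array([[3, -1, 4],
              [2, 0, 1]])
B = np.array([[1, 2, -1],
              [0, 3, 2]])

# Scalar multiplication multiplies every entry by the scalar.
print(5 * A)

# Combinations such as 3A - 2B are computed entry by entry.
print(3 * A - 2 * B)

# kA = 0 exactly when k = 0 or A = 0 (see Example 2.1.7).
print(np.array_equal(0 * A, np.zeros_like(A)))  # True
```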
Example 2.1.6
If
$$A = \left[ \begin{array}{rrr} 3 & -1 & 4 \\ 2 & 0 & 1 \end{array} \right] \quad \mbox{and} \quad B = \left[ \begin{array}{rrr} 1 & 2 & -1 \\ 0 & 3 & 2 \end{array} \right],$$
compute $5A$, $\frac{1}{2}B$, and $3A - 2B$.

Solution:
$$5A = \left[ \begin{array}{rrr} 15 & -5 & 20 \\ 10 & 0 & 5 \end{array} \right] \qquad \frac{1}{2}B = \left[ \begin{array}{rrr} \frac{1}{2} & 1 & -\frac{1}{2} \\ 0 & \frac{3}{2} & 1 \end{array} \right] \qquad 3A - 2B = \left[ \begin{array}{rrr} 7 & -7 & 14 \\ 6 & -6 & -1 \end{array} \right]$$
If $A$ is any matrix, note that $kA$ is the same size as $A$ for all scalars $k$. We also have
$$0A = 0 \quad \mbox{and} \quad k0 = 0$$
because the zero matrix has every entry zero. In other words, $kA = 0$ if either $k = 0$ or $A = 0$. The converse of this statement is also true, as Example 2.1.7 shows.
Example 2.1.7

If $kA = 0$, show that either $k = 0$ or $A = 0$.

Solution:

Write $A = \left[ a_{ij} \right]$, so that $kA = 0$ means $ka_{ij} = 0$ for all $i$ and $j$. If $k = 0$, there is nothing to do. If $k \neq 0$, then $ka_{ij} = 0$ implies that $a_{ij} = 0$ for all $i$ and $j$; that is, $A = 0$.
For future reference, the basic properties of matrix addition and scalar multiplication are listed in Theorem 2.1.1.
Theorem 2.1.1
Let $A$, $B$, and $C$ denote arbitrary $m \times n$ matrices where $m$ and $n$ are fixed. Let $k$ and $p$ denote arbitrary real numbers. Then
1. $A + B = B + A$.
2. $A + (B + C) = (A + B) + C$.
3. There is an $m \times n$ matrix $0$, such that $0 + A = A$ for each $A$.
4. For each $A$ there is an $m \times n$ matrix, $-A$, such that $A + (-A) = 0$.
5. $k(A + B) = kA + kB$.
6. $(k + p)A = kA + pA$.
7. $(kp)A = k(pA)$.
8. $1A = A$.
Proof:
Properties 1–4 were given previously. To check Property 5, let $A = \left[ a_{ij} \right]$ and $B = \left[ b_{ij} \right]$ denote matrices of the same size. Then $A + B = \left[ a_{ij} + b_{ij} \right]$, as before, so the $(i, j)$-entry of $k(A + B)$ is
$$k(a_{ij} + b_{ij}) = ka_{ij} + kb_{ij}$$
But this is just the $(i, j)$-entry of $kA + kB$, and it follows that $k(A + B) = kA + kB$. The other Properties can be similarly verified; the details are left to the reader.
The Properties in Theorem 2.1.1 enable us to do calculations with matrices in much the same way that numerical calculations are carried out. To begin, Property 2 implies that the sum $(A + B) + C = A + (B + C)$ is the same no matter how it is formed and so is written as $A + B + C$. Similarly, the sum $A + B + C + D$ is independent of how it is formed; for example, it equals both $((A + B) + C) + D$ and $A + \left[(B + C) + D\right]$. Furthermore, Property 1 ensures that, for example,
$$B + D + A + C = A + B + C + D$$
In other words, the order in which the matrices are added does not matter. A similar remark applies to sums of five (or more) matrices.

Properties 5 and 6 in Theorem 2.1.1 are called distributive laws for scalar multiplication, and they extend to sums of more than two terms. For example,
$$k(A + B - C) = kA + kB - kC \quad \mbox{and} \quad (k + p - m)A = kA + pA - mA$$
Similar observations hold for more than three summands. These facts, together with Properties 7 and 8, enable us to simplify expressions by collecting like terms, expanding, and taking common factors in exactly the same way that algebraic expressions involving variables and real numbers are manipulated. The following example illustrates these techniques.
Example 2.1.8
Simplify
$$2(A + 3C) - 3(2C - B) - 3 \left[ 2(2A + B - 4C) - 4(A - 2C) \right]$$
where $A$, $B$, and $C$ are all matrices of the same size.

Solution:

The reduction proceeds as though $A$, $B$, and $C$ were variables.
$$\begin{aligned}
2(A + 3C) - 3(2C - B) - 3 \left[ 2(2A + B - 4C) - 4(A - 2C) \right]
&= 2A + 6C - 6C + 3B - 3 \left[ 4A + 2B - 8C - 4A + 8C \right] \\
&= 2A + 3B - 6B \\
&= 2A - 3B
\end{aligned}$$
Transpose of a Matrix
Many results about a matrix $A$ involve the rows of $A$, and the corresponding result for columns is derived in an analogous way, essentially by replacing the word row by the word column throughout. The following definition is made with such applications in mind.

Definition 2.3 Transpose of a Matrix

If $A$ is an $m \times n$ matrix, the transpose of $A$, written $A^{T}$, is the $n \times m$ matrix whose rows are just the columns of $A$ in the same order.

In other words, the first row of $A^{T}$ is the first column of $A$ (that is, it consists of the entries of column 1 in order). Similarly the second row of $A^{T}$ is the second column of $A$, and so on.
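As a quick computational check (a sketch assuming NumPy; not part of the original text), the transpose simply swaps rows and columns:

```python
import numpy as np

A = np.array([[3, -1, 0],
              [1, 2, -4]])      # a 2 x 3 matrix

# The rows of the transpose are the columns of A, so A.T is 3 x 2.
print(A.T)
print(A.shape, A.T.shape)        # (2, 3) (3, 2)

# Property 2 of Theorem 2.1.2: transposing twice gives A back.
print(np.array_equal(A.T.T, A))  # True
```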
Example 2.1.9
Solution:
If $A$ is a matrix, write $A^{T} = \left[ b_{ij} \right]$. Then $b_{ij}$ is the $j$th element of the $i$th row of $A^{T}$ and so is the $j$th element of the $i$th column of $A$. This means $b_{ij} = a_{ji}$, so the definition of $A^{T}$ can be stated as follows:
$$\mbox{If } A = \left[ a_{ij} \right] \mbox{, then } A^{T} = \left[ a_{ji} \right] \quad (2.1)$$
This is useful in verifying the following properties of transposition.
Theorem 2.1.2
Let $A$ and $B$ denote matrices of the same size, and let $k$ denote a scalar.
1. If $A$ is an $m \times n$ matrix, then $A^{T}$ is an $n \times m$ matrix.
2. $\left(A^{T}\right)^{T} = A$.
3. $(kA)^{T} = kA^{T}$.
4. $(A + B)^{T} = A^{T} + B^{T}$.
Proof:
Property 1 is part of the definition of $A^{T}$, and Property 2 follows from (2.1). As to Property 3: If $A = \left[ a_{ij} \right]$, then $kA = \left[ ka_{ij} \right]$, so (2.1) gives
$$(kA)^{T} = \left[ ka_{ji} \right] = k\left[ a_{ji} \right] = kA^{T}$$
Finally, if $B = \left[ b_{ij} \right]$, then $A + B = \left[ c_{ij} \right]$ where $c_{ij} = a_{ij} + b_{ij}$. Then (2.1) gives Property 4:
$$(A + B)^{T} = \left[ c_{ij} \right]^{T} = \left[ c_{ji} \right] = \left[ a_{ji} + b_{ji} \right] = \left[ a_{ji} \right] + \left[ b_{ji} \right] = A^{T} + B^{T}$$
There is another useful way to think of transposition. If $A = \left[ a_{ij} \right]$ is an $m \times n$ matrix, the elements $a_{11}, a_{22}, a_{33}, \dots$ are called the main diagonal of $A$. Hence the main diagonal extends down and to the right from the upper left corner of the matrix $A$.

Thus forming the transpose of a matrix $A$ can be viewed as "flipping" $A$ about its main diagonal, or as "rotating" $A$ through $180^{\circ}$ about the line containing the main diagonal. This makes Property 2 in Theorem 2.1.2 transparent.
Example 2.1.10

Solve for $A$ if
$$\left(2A^{T} - 3 \left[ \begin{array}{rr} 1 & 2 \\ -1 & 1 \end{array} \right] \right)^{T} = \left[ \begin{array}{rr} 2 & 3 \\ -1 & 2 \end{array} \right]$$

Solution:

Using Theorem 2.1.2, the left side of the equation is
$$\left(2A^{T} - 3 \left[ \begin{array}{rr} 1 & 2 \\ -1 & 1 \end{array} \right] \right)^{T} = 2\left(A^{T}\right)^{T} - 3\left[ \begin{array}{rr} 1 & 2 \\ -1 & 1 \end{array} \right]^{T} = 2A - 3\left[ \begin{array}{rr} 1 & -1 \\ 2 & 1 \end{array} \right]$$
Hence the equation becomes
$$2A - 3\left[ \begin{array}{rr} 1 & -1 \\ 2 & 1 \end{array} \right] = \left[ \begin{array}{rr} 2 & 3 \\ -1 & 2 \end{array} \right]$$
Thus $2A = \left[ \begin{array}{rr} 2 & 3 \\ -1 & 2 \end{array} \right] + 3\left[ \begin{array}{rr} 1 & -1 \\ 2 & 1 \end{array} \right] = \left[ \begin{array}{rr} 5 & 0 \\ 5 & 5 \end{array} \right]$, so finally $A = \frac{1}{2}\left[ \begin{array}{rr} 5 & 0 \\ 5 & 5 \end{array} \right]$.

Note that Example 2.1.10 can also be solved by first transposing both sides, then solving for $A^{T}$, and so obtaining $A = \left(A^{T}\right)^{T}$. The reader should do this.
The matrix $D$ in Example 2.1.9 has the property that $D = D^{T}$. Such matrices are important; a matrix $A$ is called symmetric if $A = A^{T}$. A symmetric matrix $A$ is necessarily square (if $A$ is $m \times n$, then $A^{T}$ is $n \times m$, so $A = A^{T}$ forces $n = m$). The name comes from the fact that these matrices exhibit a symmetry about the main diagonal. That is, entries that are directly across the main diagonal from each other are equal.

For example, $\left[ \begin{array}{ccc} a & b & c \\ b^{\prime} & d & e \\ c^{\prime} & e^{\prime} & f \end{array} \right]$ is symmetric when $b^{\prime} = b$, $c^{\prime} = c$, and $e^{\prime} = e$.
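The symmetry condition $A = A^{T}$ is easy to test numerically; here is a minimal sketch (assuming NumPy; not part of the original text):

```python
import numpy as np

def is_symmetric(M):
    """A matrix is symmetric when it is square and equals its transpose."""
    return M.shape[0] == M.shape[1] and np.array_equal(M, M.T)

S = np.array([[1, 2, 3],
              [2, 5, 4],
              [3, 4, 6]])       # entries match across the main diagonal
N = np.array([[1, 2],
              [3, 4]])

print(is_symmetric(S))  # True
print(is_symmetric(N))  # False
```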
Example 2.1.11

If $A$ and $B$ are symmetric $n \times n$ matrices, show that $A + B$ is symmetric.

Solution:

We have $A^{T} = A$ and $B^{T} = B$, so, by Theorem 2.1.2, we have $(A + B)^{T} = A^{T} + B^{T} = A + B$. Hence $A + B$ is symmetric.
Example 2.1.12

Suppose a square matrix $A$ satisfies $A = 2A^{T}$. Show that necessarily $A = 0$.

Solution:

If we iterate the given equation, Theorem 2.1.2 gives
$$A = 2A^{T} = 2\left[2A^{T}\right]^{T} = 2\left[2\left(A^{T}\right)^{T}\right] = 4A$$
Subtracting $A$ from both sides gives $3A = 0$, so $A = \frac{1}{3}(3A) = \frac{1}{3}(0) = 0$.
2.2 Matrix-Vector Multiplication
Up to now we have used matrices to solve systems of linear equations by manipulating the rows of the augmented matrix. In this section we introduce a different way of describing linear systems that makes more use of the coefficient matrix of the system and leads to a useful way of “multiplying” matrices.
Vectors
It is a well-known fact in analytic geometry that two points in the plane with coordinates $(a_{1}, a_{2})$ and $(b_{1}, b_{2})$ are equal if and only if $a_{1} = b_{1}$ and $a_{2} = b_{2}$. Moreover, a similar condition applies to points $(a_{1}, a_{2}, a_{3})$ in space. We extend this idea as follows.

An ordered sequence $(a_{1}, a_{2}, \dots, a_{n})$ of real numbers is called an ordered $n$-tuple. The word "ordered" here reflects our insistence that two ordered $n$-tuples are equal if and only if corresponding entries are the same. In other words,
$$(a_{1}, a_{2}, \dots, a_{n}) = (b_{1}, b_{2}, \dots, b_{n}) \quad \mbox{if and only if} \quad a_{1} = b_{1}, \ a_{2} = b_{2}, \ \dots, \mbox{ and } a_{n} = b_{n}$$
Thus the ordered $2$-tuples and $3$-tuples are just the ordered pairs and triples familiar from geometry.

Definition 2.4 The set $\mathbb{R}^{n}$ of ordered $n$-tuples of real numbers

Let $\mathbb{R}$ denote the set of all real numbers. The set of all ordered $n$-tuples from $\mathbb{R}$ is denoted $\mathbb{R}^{n}$.

There are two commonly used ways to denote the $n$-tuples in $\mathbb{R}^{n}$: as rows $(r_{1}, r_{2}, \dots, r_{n})$ or as columns $\left[ \begin{array}{c} r_{1} \\ r_{2} \\ \vdots \\ r_{n} \end{array} \right]$; the notation we use depends on the context. In any event they are called vectors or $n$-vectors and will be denoted using bold type such as $\textbf{x}$ or $\textbf{v}$. For example, an $m \times n$ matrix $A$ will be written as a row of columns:
$$A = \left[ \begin{array}{cccc} \textbf{a}_{1} & \textbf{a}_{2} & \cdots & \textbf{a}_{n} \end{array} \right] \quad \mbox{where } \textbf{a}_{j} \mbox{ denotes column } j \mbox{ of } A \mbox{ for each } j$$

If $\textbf{x}$ and $\textbf{y}$ are two $n$-vectors in $\mathbb{R}^{n}$, it is clear that their matrix sum $\textbf{x} + \textbf{y}$ is also in $\mathbb{R}^{n}$, as is the scalar multiple $k\textbf{x}$ for any real number $k$. We express this observation by saying that $\mathbb{R}^{n}$ is closed under addition and scalar multiplication. In particular, all the basic properties in Theorem 2.1.1 are true of these $n$-vectors. These properties are fundamental and will be used frequently below without comment. As for matrices in general, the $n \times 1$ zero matrix is called the zero $n$-vector in $\mathbb{R}^{n}$ and, if $\textbf{x}$ is an $n$-vector, the $n$-vector $-\textbf{x}$ is called the negative of $\textbf{x}$.

Of course, we have already encountered these $n$-vectors in Section 1.3 as the solutions to systems of linear equations with $n$ variables. In particular we defined the notion of a linear combination of vectors and showed that a linear combination of solutions to a homogeneous system is again a solution. Clearly, a linear combination of $n$-vectors in $\mathbb{R}^{n}$ is again in $\mathbb{R}^{n}$, a fact that we will be using.
Matrix-Vector Multiplication
Given a system of linear equations, the left sides of the equations depend only on the coefficient matrix $A$ and the column $\textbf{x}$ of variables, and not on the constants. This observation leads to a fundamental idea in linear algebra: We view the left sides of the equations as the "product" $A\textbf{x}$ of the matrix $A$ and the vector $\textbf{x}$. This simple change of perspective leads to a completely new way of viewing linear systems, one that is very useful and will occupy our attention throughout this book.

To motivate the definition of the "product" $A\textbf{x}$, consider first the following system of two equations in three variables:
$$\begin{array}{ccccccc} ax_{1} & + & bx_{2} & + & cx_{3} & = & b_{1} \\ a^{\prime}x_{1} & + & b^{\prime}x_{2} & + & c^{\prime}x_{3} & = & b_{2} \end{array} \quad (2.2)$$
and let $A = \left[ \begin{array}{ccc} a & b & c \\ a^{\prime} & b^{\prime} & c^{\prime} \end{array} \right]$, $\textbf{x} = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \end{array} \right]$, $\textbf{b} = \left[ \begin{array}{c} b_{1} \\ b_{2} \end{array} \right]$ denote the coefficient matrix, the variable matrix, and the constant matrix, respectively. The system (2.2) can be expressed as a single vector equation
$$\left[ \begin{array}{c} ax_{1} + bx_{2} + cx_{3} \\ a^{\prime}x_{1} + b^{\prime}x_{2} + c^{\prime}x_{3} \end{array} \right] = \left[ \begin{array}{c} b_{1} \\ b_{2} \end{array} \right]$$
which in turn can be written as follows:
$$x_{1}\left[ \begin{array}{c} a \\ a^{\prime} \end{array} \right] + x_{2}\left[ \begin{array}{c} b \\ b^{\prime} \end{array} \right] + x_{3}\left[ \begin{array}{c} c \\ c^{\prime} \end{array} \right] = \left[ \begin{array}{c} b_{1} \\ b_{2} \end{array} \right]$$
Now observe that the vectors appearing on the left side are just the columns
$$\textbf{a}_{1} = \left[ \begin{array}{c} a \\ a^{\prime} \end{array} \right], \quad \textbf{a}_{2} = \left[ \begin{array}{c} b \\ b^{\prime} \end{array} \right], \quad \mbox{and} \quad \textbf{a}_{3} = \left[ \begin{array}{c} c \\ c^{\prime} \end{array} \right]$$
of the coefficient matrix $A$. Hence the system (2.2) takes the form
$$x_{1}\textbf{a}_{1} + x_{2}\textbf{a}_{2} + x_{3}\textbf{a}_{3} = \textbf{b} \quad (2.3)$$
This shows that the system (2.2) has a solution if and only if the constant matrix $\textbf{b}$ is a linear combination of the columns of $A$, and that in this case the entries of the solution are the coefficients $x_{1}$, $x_{2}$, and $x_{3}$ in this linear combination.

Moreover, this holds in general. If $A$ is any $m \times n$ matrix, it is often convenient to view $A$ as a row of columns. That is, if $\textbf{a}_{1}, \textbf{a}_{2}, \dots, \textbf{a}_{n}$ are the columns of $A$, we write
$$A = \left[ \begin{array}{cccc} \textbf{a}_{1} & \textbf{a}_{2} & \cdots & \textbf{a}_{n} \end{array} \right]$$
and say that $A$ is given in terms of its columns.

Now consider any system of linear equations with $m \times n$ coefficient matrix $A$. If $\textbf{b}$ is the constant matrix of the system, and if $\textbf{x} = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array} \right]$ is the matrix of variables then, exactly as above, the system can be written as a single vector equation
$$x_{1}\textbf{a}_{1} + x_{2}\textbf{a}_{2} + \cdots + x_{n}\textbf{a}_{n} = \textbf{b} \quad (2.4)$$
Example 2.2.1

in the form given in (2.4).
Solution:
As mentioned above, we view the left side of (2.4) as the product $A\textbf{x}$ of the matrix $A$ and the vector $\textbf{x}$. This basic idea is formalized in the following definition:
Definition 2.5 Matrix-Vector Multiplication
Let $A = \left[ \begin{array}{cccc} \textbf{a}_{1} & \textbf{a}_{2} & \cdots & \textbf{a}_{n} \end{array} \right]$ be an $m \times n$ matrix, written in terms of its columns. If
$$\textbf{x} = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array} \right]$$
is any $n$-vector, the product $A\textbf{x}$ is defined to be the $m$-vector given by:
$$A\textbf{x} = x_{1}\textbf{a}_{1} + x_{2}\textbf{a}_{2} + \cdots + x_{n}\textbf{a}_{n}$$

In other words, if $A$ is $m \times n$ and $\textbf{x}$ is an $n$-vector, the product $A\textbf{x}$ is the linear combination of the columns of $A$ where the coefficients are the entries of $\textbf{x}$ (in order).
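The definition can be read off directly in code. The sketch below (assuming NumPy; not part of the original text) forms $A\textbf{x}$ both as a linear combination of the columns of $A$ and with the built-in product, and checks that the two agree:

```python
import numpy as np

A = np.array([[2, -1, 3, 5],
              [0, 2, -3, 1],
              [-3, 4, 1, 2]])
x = np.array([2, 1, 0, -2])

# Definition 2.5: Ax = x1*a1 + x2*a2 + ... + xn*an (the aj are columns of A).
Ax_by_columns = sum(x[j] * A[:, j] for j in range(A.shape[1]))

print(Ax_by_columns)                          # [-7  0 -6]
print(np.array_equal(Ax_by_columns, A @ x))   # True
```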
Note that if $A$ is an $m \times n$ matrix, the product $A\textbf{x}$ is only defined if $\textbf{x}$ is an $n$-vector, and then the vector $A\textbf{x}$ is an $m$-vector because this is true of each column $\textbf{a}_{j}$ of $A$. But in this case the system of linear equations with coefficient matrix $A$ and constant vector $\textbf{b}$ takes the form of a single matrix equation
$$A\textbf{x} = \textbf{b}$$
The following theorem combines Definition 2.5 and equation (2.4) and summarizes the above discussion. Recall that a system of linear equations is said to be consistent if it has at least one solution.
Theorem 2.2.1
1. Every system of linear equations has the form $A\textbf{x} = \textbf{b}$ where $A$ is the coefficient matrix, $\textbf{b}$ is the constant matrix, and $\textbf{x}$ is the matrix of variables.
2. The system $A\textbf{x} = \textbf{b}$ is consistent if and only if $\textbf{b}$ is a linear combination of the columns of $A$.
3. If $\textbf{a}_{1}, \textbf{a}_{2}, \dots, \textbf{a}_{n}$ are the columns of $A$ and if $\textbf{x} = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array} \right]$, then $\textbf{x}$ is a solution to the linear system $A\textbf{x} = \textbf{b}$ if and only if $x_{1}, x_{2}, \dots, x_{n}$ are a solution of the vector equation
$$x_{1}\textbf{a}_{1} + x_{2}\textbf{a}_{2} + \cdots + x_{n}\textbf{a}_{n} = \textbf{b}$$

A system of linear equations in the form $A\textbf{x} = \textbf{b}$ as in (1) of Theorem 2.2.1 is said to be written in matrix form. This is a useful way to view linear systems as we shall see.

Theorem 2.2.1 transforms the problem of solving the linear system $A\textbf{x} = \textbf{b}$ into the problem of expressing the constant matrix $\textbf{b}$ as a linear combination of the columns of the coefficient matrix $A$. Such a change in perspective is very useful because one approach or the other may be better in a particular situation; the importance of the theorem is that there is a choice.
Example 2.2.2
If
$$A = \left[ \begin{array}{rrrr} 2 & -1 & 3 & 5 \\ 0 & 2 & -3 & 1 \\ -3 & 4 & 1 & 2 \end{array} \right] \quad \mbox{and} \quad \textbf{x} = \left[ \begin{array}{r} 2 \\ 1 \\ 0 \\ -2 \end{array} \right]$$
compute $A\textbf{x}$.

Solution:

By Definition 2.5:
$$A\textbf{x} = 2\left[ \begin{array}{r} 2 \\ 0 \\ -3 \end{array} \right] + 1\left[ \begin{array}{r} -1 \\ 2 \\ 4 \end{array} \right] + 0\left[ \begin{array}{r} 3 \\ -3 \\ 1 \end{array} \right] - 2\left[ \begin{array}{r} 5 \\ 1 \\ 2 \end{array} \right] = \left[ \begin{array}{r} -7 \\ 0 \\ -6 \end{array} \right]$$
Example 2.2.3
Given columns $\textbf{a}_{1}$, $\textbf{a}_{2}$, $\textbf{a}_{3}$, and $\textbf{a}_{4}$ in $\mathbb{R}^{3}$, write $2\textbf{a}_{1} - 3\textbf{a}_{2} + 5\textbf{a}_{3} + \textbf{a}_{4}$ in the form $A\textbf{x}$ where $A$ is a matrix and $\textbf{x}$ is a vector.

Solution:

Here the column of coefficients is
$$\textbf{x} = \left[ \begin{array}{r} 2 \\ -3 \\ 5 \\ 1 \end{array} \right]$$
Hence Definition 2.5 gives
$$A\textbf{x} = 2\textbf{a}_{1} - 3\textbf{a}_{2} + 5\textbf{a}_{3} + \textbf{a}_{4}$$
where $A = \left[ \begin{array}{cccc} \textbf{a}_{1} & \textbf{a}_{2} & \textbf{a}_{3} & \textbf{a}_{4} \end{array} \right]$ is the matrix with $\textbf{a}_{1}$, $\textbf{a}_{2}$, $\textbf{a}_{3}$, and $\textbf{a}_{4}$ as its columns.
Example 2.2.4
Let $A = \left[ \begin{array}{cccc} \textbf{a}_{1} & \textbf{a}_{2} & \textbf{a}_{3} & \textbf{a}_{4} \end{array} \right]$ be the matrix given in terms of its columns $\textbf{a}_{1}$, $\textbf{a}_{2}$, $\textbf{a}_{3}$, and $\textbf{a}_{4}$.
In each case below, either express $\textbf{b}$ as a linear combination of $\textbf{a}_{1}$, $\textbf{a}_{2}$, $\textbf{a}_{3}$, and $\textbf{a}_{4}$, or show that it is not such a linear combination. Explain what your answer means for the corresponding system $A\textbf{x} = \textbf{b}$ of linear equations.
1.
2.
Solution:
By Theorem 2.2.1, $\textbf{b}$ is a linear combination of $\textbf{a}_{1}$, $\textbf{a}_{2}$, $\textbf{a}_{3}$, and $\textbf{a}_{4}$ if and only if the system $A\textbf{x} = \textbf{b}$ is consistent (that is, it has a solution). So in each case we carry the augmented matrix $\left[ \begin{array}{c|c} A & \textbf{b} \end{array} \right]$ of the system $A\textbf{x} = \textbf{b}$ to reduced form.
1. Here the reduction shows that the system $A\textbf{x} = \textbf{b}$ has no solution in this case. Hence $\textbf{b}$ is *not* a linear combination of $\textbf{a}_{1}$, $\textbf{a}_{2}$, $\textbf{a}_{3}$, and $\textbf{a}_{4}$.

2. Now the reduction shows that the system $A\textbf{x} = \textbf{b}$ is consistent.

Thus $\textbf{b}$ is a linear combination of $\textbf{a}_{1}$, $\textbf{a}_{2}$, $\textbf{a}_{3}$, and $\textbf{a}_{4}$ in this case. In fact the general solution expresses $x_{1}$, $x_{2}$, $x_{3}$, and $x_{4}$ in terms of two arbitrary parameters $s$ and $t$. Hence
$$x_{1}\textbf{a}_{1} + x_{2}\textbf{a}_{2} + x_{3}\textbf{a}_{3} + x_{4}\textbf{a}_{4} = \textbf{b}$$
for any choice of $s$ and $t$. If we take $s = 0$ and $t = 0$, this gives one such linear combination, whereas taking $s = 1 = t$ gives another.
Example 2.2.5





Example 2.2.6
If
$$I = \left[ \begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right]$$
show that $I\textbf{x} = \textbf{x}$ for any $3$-vector $\textbf{x}$.

Solution:

If $\textbf{x} = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \end{array} \right]$ then Definition 2.5 gives
$$I\textbf{x} = x_{1}\left[ \begin{array}{c} 1 \\ 0 \\ 0 \end{array} \right] + x_{2}\left[ \begin{array}{c} 0 \\ 1 \\ 0 \end{array} \right] + x_{3}\left[ \begin{array}{c} 0 \\ 0 \\ 1 \end{array} \right] = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \end{array} \right] = \textbf{x}$$

The matrix $I$ in Example 2.2.6 is called the $3 \times 3$ identity matrix, and we will encounter such matrices again in the future. Before proceeding, we develop some algebraic properties of matrix-vector multiplication that are used extensively throughout linear algebra.
Theorem 2.2.2
Let $A$ and $B$ be $m \times n$ matrices, and let $\textbf{x}$ and $\textbf{y}$ be $n$-vectors in $\mathbb{R}^{n}$. Then:
1. $A(\textbf{x} + \textbf{y}) = A\textbf{x} + A\textbf{y}$.
2. $A(a\textbf{x}) = a(A\textbf{x})$ for all scalars $a$.
3. $(A + B)\textbf{x} = A\textbf{x} + B\textbf{x}$.
Proof:
We prove (3); the other verifications are similar and are left as exercises. Let $A = \left[ \begin{array}{cccc} \textbf{a}_{1} & \textbf{a}_{2} & \cdots & \textbf{a}_{n} \end{array} \right]$ and $B = \left[ \begin{array}{cccc} \textbf{b}_{1} & \textbf{b}_{2} & \cdots & \textbf{b}_{n} \end{array} \right]$ be given in terms of their columns. Since adding two matrices is the same as adding their columns, we have
$$A + B = \left[ \begin{array}{cccc} \textbf{a}_{1} + \textbf{b}_{1} & \textbf{a}_{2} + \textbf{b}_{2} & \cdots & \textbf{a}_{n} + \textbf{b}_{n} \end{array} \right]$$
If we write $\textbf{x} = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array} \right]$, Definition 2.5 gives
$$\begin{aligned}
(A + B)\textbf{x} &= x_{1}(\textbf{a}_{1} + \textbf{b}_{1}) + x_{2}(\textbf{a}_{2} + \textbf{b}_{2}) + \cdots + x_{n}(\textbf{a}_{n} + \textbf{b}_{n}) \\
&= (x_{1}\textbf{a}_{1} + x_{2}\textbf{a}_{2} + \cdots + x_{n}\textbf{a}_{n}) + (x_{1}\textbf{b}_{1} + x_{2}\textbf{b}_{2} + \cdots + x_{n}\textbf{b}_{n}) \\
&= A\textbf{x} + B\textbf{x}
\end{aligned}$$
Theorem 2.2.2 allows matrix-vector computations to be carried out much as in ordinary arithmetic. For example, for any $m \times n$ matrices $A$ and $B$ and any $n$-vectors $\textbf{x}$ and $\textbf{y}$, we have:
$$A(2\textbf{x} - 5\textbf{y}) = 2A\textbf{x} - 5A\textbf{y} \quad \mbox{and} \quad (3A - 7B)\textbf{x} = 3A\textbf{x} - 7B\textbf{x}$$
We will use such manipulations throughout the book, often without mention.
Linear Equations
Theorem 2.2.2 also gives a useful way to describe the solutions to a system
$$A\textbf{x} = \textbf{b}$$
of linear equations. There is a related system
$$A\textbf{x} = \textbf{0}$$
called the associated homogeneous system, obtained from the original system by replacing all the constants by zeros. Suppose $\textbf{x}_{1}$ is a solution to $A\textbf{x} = \textbf{b}$ and $\textbf{x}_{0}$ is a solution to $A\textbf{x} = \textbf{0}$ (that is $A\textbf{x}_{1} = \textbf{b}$ and $A\textbf{x}_{0} = \textbf{0}$). Then $\textbf{x}_{1} + \textbf{x}_{0}$ is another solution to $A\textbf{x} = \textbf{b}$. Indeed, Theorem 2.2.2 gives
$$A(\textbf{x}_{1} + \textbf{x}_{0}) = A\textbf{x}_{1} + A\textbf{x}_{0} = \textbf{b} + \textbf{0} = \textbf{b}$$
This observation has a useful converse.
Theorem 2.2.3

Suppose $\textbf{x}_{1}$ is any particular solution to the system $A\textbf{x} = \textbf{b}$ of linear equations. Then every solution $\textbf{x}_{2}$ to $A\textbf{x} = \textbf{b}$ has the form
$$\textbf{x}_{2} = \textbf{x}_{0} + \textbf{x}_{1}$$
for some solution $\textbf{x}_{0}$ of the associated homogeneous system $A\textbf{x} = \textbf{0}$.
Proof:
Suppose $\textbf{x}_{2}$ is also a solution to $A\textbf{x} = \textbf{b}$, so that $A\textbf{x}_{2} = \textbf{b}$. Write $\textbf{x}_{0} = \textbf{x}_{2} - \textbf{x}_{1}$. Then $\textbf{x}_{2} = \textbf{x}_{0} + \textbf{x}_{1}$ and, using Theorem 2.2.2, we compute
$$A\textbf{x}_{0} = A(\textbf{x}_{2} - \textbf{x}_{1}) = A\textbf{x}_{2} - A\textbf{x}_{1} = \textbf{b} - \textbf{b} = \textbf{0}$$
Hence $\textbf{x}_{0}$ is a solution to the associated homogeneous system $A\textbf{x} = \textbf{0}$.
Note that gaussian elimination provides one such representation.
Example 2.2.7
Solution:
Gaussian elimination expresses $x_{1}$, $x_{2}$, $x_{3}$, and $x_{4}$ in terms of two arbitrary parameters $s$ and $t$. Hence the general solution can be written as a constant column plus $s$ times one fixed column plus $t$ times another. The constant column (obtained by setting $s = 0 = t$) is a particular solution, and the remaining part gives all solutions to the associated homogeneous system. (To see why this is so, carry out the gaussian elimination again but with all the constants set equal to zero.)
The following useful result is included with no proof.
Theorem 2.2.4

Let $A\textbf{x} = \textbf{b}$ be a system of linear equations with augmented matrix $\left[ \begin{array}{c|c} A & \textbf{b} \end{array}\right]$, and write $r = \mbox{rank } A$.
1. $\mbox{rank} \left[ \begin{array}{c|c} A & \textbf{b} \end{array}\right]$ is either $r$ or $r + 1$.
2. The system is consistent if and only if $\mbox{rank} \left[ \begin{array}{c|c} A & \textbf{b} \end{array}\right] = r$.
3. The system is inconsistent if and only if $\mbox{rank} \left[ \begin{array}{c|c} A & \textbf{b} \end{array}\right] = r + 1$.
The Dot Product
Definition 2.5 is not always the easiest way to compute a matrix-vector product $A\textbf{x}$ because it requires that the columns of $A$ be explicitly identified. There is another way to find such a product which uses the matrix $A$ as a whole with no reference to its columns, and hence is useful in practice. The method depends on the following notion.

Definition 2.6 Dot Product in $\mathbb{R}^{n}$

If $(a_{1}, a_{2}, \dots, a_{n})$ and $(b_{1}, b_{2}, \dots, b_{n})$ are two ordered $n$-tuples, their dot product is defined to be the number
$$a_{1}b_{1} + a_{2}b_{2} + \cdots + a_{n}b_{n}$$
obtained by multiplying corresponding entries and adding the results.
To see how this relates to matrix products, let $A$ denote a $3 \times 4$ matrix and let $\textbf{x}$ be a $4$-vector. Writing
$$\textbf{x} = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \\ x_{4} \end{array} \right] \quad \mbox{and} \quad A = \left[ \begin{array}{cccc} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{array} \right]$$
in the notation of Section 2.1, we compute
$$A\textbf{x} = x_{1}\left[ \begin{array}{c} a_{11} \\ a_{21} \\ a_{31} \end{array} \right] + x_{2}\left[ \begin{array}{c} a_{12} \\ a_{22} \\ a_{32} \end{array} \right] + x_{3}\left[ \begin{array}{c} a_{13} \\ a_{23} \\ a_{33} \end{array} \right] + x_{4}\left[ \begin{array}{c} a_{14} \\ a_{24} \\ a_{34} \end{array} \right] = \left[ \begin{array}{c} a_{11}x_{1} + a_{12}x_{2} + a_{13}x_{3} + a_{14}x_{4} \\ a_{21}x_{1} + a_{22}x_{2} + a_{23}x_{3} + a_{24}x_{4} \\ a_{31}x_{1} + a_{32}x_{2} + a_{33}x_{3} + a_{34}x_{4} \end{array} \right]$$
From this we see that each entry of $A\textbf{x}$ is the dot product of the corresponding row of $A$ with $\textbf{x}$. This computation goes through in general, and we record the result in Theorem 2.2.5.
Theorem 2.2.5 Dot Product Rule

Let $A$ be an $m \times n$ matrix and let $\textbf{x}$ be an $n$-vector. Then each entry of the vector $A\textbf{x}$ is the dot product of the corresponding row of $A$ with $\textbf{x}$.
This result is used extensively throughout linear algebra.
If $A$ is $m \times n$ and $\textbf{x}$ is an $n$-vector, the computation of $A\textbf{x}$ by the dot product rule is simpler than using Definition 2.5 because the computation can be carried out directly with no explicit reference to the columns of $A$ (as in Definition 2.5). The first entry of $A\textbf{x}$ is the dot product of row 1 of $A$ with $\textbf{x}$. In hand calculations this is computed by going across row one of $A$, going down the column $\textbf{x}$, multiplying corresponding entries, and adding the results. The other entries of $A\textbf{x}$ are computed in the same way using the other rows of $A$ with the column $\textbf{x}$.

In general, compute entry $i$ of $A\textbf{x}$ as follows:
Go across row $i$ of $A$ and down column $\textbf{x}$, multiply corresponding entries, and add the results.
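The dot product rule translates directly into a row-by-row computation. A minimal sketch (plain Python, with NumPy only for checking; not part of the original text):

```python
import numpy as np

def matvec_by_rows(A, x):
    """Compute Ax by the dot product rule: entry i is (row i of A) . x."""
    m, n = len(A), len(A[0])
    assert len(x) == n, "x must have one entry per column of A"
    return [sum(A[i][k] * x[k] for k in range(n)) for i in range(m)]

A = [[2, -1, 3, 5],
     [0, 2, -3, 1],
     [-3, 4, 1, 2]]
x = [2, 1, 0, -2]

print(matvec_by_rows(A, x))           # [-7, 0, -6]
print(np.array(A) @ np.array(x))      # same result via NumPy
```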
As an illustration, we rework Example 2.2.2 using the dot product rule instead of Definition 2.5.
Example 2.2.8
If
$$A = \left[ \begin{array}{rrrr} 2 & -1 & 3 & 5 \\ 0 & 2 & -3 & 1 \\ -3 & 4 & 1 & 2 \end{array} \right] \quad \mbox{and} \quad \textbf{x} = \left[ \begin{array}{r} 2 \\ 1 \\ 0 \\ -2 \end{array} \right]$$
compute $A\textbf{x}$.

Solution:

The entries of $A\textbf{x}$ are the dot products of the rows of $A$ with $\textbf{x}$:
$$A\textbf{x} = \left[ \begin{array}{c} 2 \cdot 2 + (-1)1 + 3 \cdot 0 + 5(-2) \\ 0 \cdot 2 + 2 \cdot 1 + (-3)0 + 1(-2) \\ (-3)2 + 4 \cdot 1 + 1 \cdot 0 + 2(-2) \end{array} \right] = \left[ \begin{array}{r} -7 \\ 0 \\ -6 \end{array} \right]$$
Of course, this agrees with the outcome in Example 2.2.2.
Example 2.2.9

Solution:
Write $A$ for the coefficient matrix, $\textbf{b}$ for the column of constants, and $\textbf{x}$ for the column of variables. Then the dot product rule gives the entries of $A\textbf{x}$ as the left sides of the equations in the linear system. Hence the system becomes $A\textbf{x} = \textbf{b}$ because matrices are equal if and only if corresponding entries are equal.
Example 2.2.10
If $A$ is the zero $m \times n$ matrix, then $A\textbf{x} = \textbf{0}$ for each $n$-vector $\textbf{x}$.

Solution:

For each $i$, entry $i$ of $A\textbf{x}$ is the dot product of row $i$ of $A$ with $\textbf{x}$, and this is zero because row $i$ of $A$ consists of zeros.
Definition 2.7 The Identity Matrix

For each $n$, the identity matrix $I_{n}$ is the $n \times n$ matrix with 1s on the main diagonal (upper left to lower right) and zeros elsewhere.

The first few identity matrices are
$$I_{1} = \left[ \begin{array}{c} 1 \end{array} \right], \quad I_{2} = \left[ \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right], \quad I_{3} = \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right]$$
In Example 2.2.6 we showed that $I_{3}\textbf{x} = \textbf{x}$ for each $3$-vector $\textbf{x}$ using Definition 2.5. The following result shows that this holds in general, and is the reason for the name.
Example 2.2.11

Show that $I_{n}\textbf{x} = \textbf{x}$ for each $n$-vector $\textbf{x}$ in $\mathbb{R}^{n}$.

Solution:

We verify the case $n = 4$. Given the $4$-vector $\textbf{x} = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \\ x_{4} \end{array} \right]$, the dot product rule gives
$$I_{4}\textbf{x} = \left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right]\left[ \begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \\ x_{4} \end{array} \right] = \left[ \begin{array}{c} x_{1} + 0 + 0 + 0 \\ 0 + x_{2} + 0 + 0 \\ 0 + 0 + x_{3} + 0 \\ 0 + 0 + 0 + x_{4} \end{array} \right] = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ x_{3} \\ x_{4} \end{array} \right] = \textbf{x}$$
In general, $I_{n}\textbf{x} = \textbf{x}$ because entry $k$ of $I_{n}\textbf{x}$ is the dot product of row $k$ of $I_{n}$ with $\textbf{x}$, and row $k$ of $I_{n}$ has $1$ in position $k$ and zeros elsewhere.
Example 2.2.12
Let $A = \left[ \begin{array}{cccc} \textbf{a}_{1} & \textbf{a}_{2} & \cdots & \textbf{a}_{n} \end{array} \right]$ be any $m \times n$ matrix, given in terms of its columns, and let $\textbf{e}_{j}$ denote column $j$ of the $n \times n$ identity matrix $I_{n}$. Show that $A\textbf{e}_{j} = \textbf{a}_{j}$ for each $j$.

Solution:

Write
$$\textbf{e}_{j} = \left[ \begin{array}{c} t_{1} \\ t_{2} \\ \vdots \\ t_{n} \end{array} \right]$$
where $t_{j} = 1$, but $t_{i} = 0$ for all $i \neq j$. Then Theorem 2.2.5 gives
$$A\textbf{e}_{j} = t_{1}\textbf{a}_{1} + \cdots + t_{j}\textbf{a}_{j} + \cdots + t_{n}\textbf{a}_{n} = 0 + \cdots + \textbf{a}_{j} + \cdots + 0 = \textbf{a}_{j}$$
Example 2.2.12 will be referred to later; for now we use it to prove:
Theorem 2.2.6

Let $A$ and $B$ be $m \times n$ matrices. If $A\textbf{x} = B\textbf{x}$ for all $\textbf{x}$ in $\mathbb{R}^{n}$, then $A = B$.
Proof:
Write $A = \left[ \begin{array}{cccc} \textbf{a}_{1} & \textbf{a}_{2} & \cdots & \textbf{a}_{n} \end{array} \right]$ and $B = \left[ \begin{array}{cccc} \textbf{b}_{1} & \textbf{b}_{2} & \cdots & \textbf{b}_{n} \end{array} \right]$ in terms of their columns. It is enough to show that $\textbf{a}_{k} = \textbf{b}_{k}$ holds for all $k$. But we are assuming that $A\textbf{e}_{k} = B\textbf{e}_{k}$, which gives $\textbf{a}_{k} = \textbf{b}_{k}$ by Example 2.2.12.
We have introduced matrix-vector multiplication as a new way to think about systems of linear equations. But it has several other uses as well. It turns out that many geometric operations can be described using matrix multiplication, and we now investigate how this happens. As a bonus, this description provides a geometric "picture" of a matrix by revealing the effect on a vector when it is multiplied by $A$. This "geometric view" of matrices is a fundamental tool in understanding them.
2.3 Matrix Multiplication
In Section 2.2 matrix-vector products were introduced. If $A$ is an $m \times n$ matrix, the product $A\textbf{x}$ was defined for any $n$-column $\textbf{x}$ in $\mathbb{R}^{n}$ as follows: If $A = \left[ \begin{array}{cccc} \textbf{a}_{1} & \textbf{a}_{2} & \cdots & \textbf{a}_{n} \end{array} \right]$ where the $\textbf{a}_{j}$ are the columns of $A$, and if $\textbf{x} = \left[ \begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array} \right]$, Definition 2.5 reads
$$A\textbf{x} = x_{1}\textbf{a}_{1} + x_{2}\textbf{a}_{2} + \cdots + x_{n}\textbf{a}_{n} \quad (2.5)$$
This was motivated as a way of describing systems of linear equations with coefficient matrix $A$. Indeed every such system has the form $A\textbf{x} = \textbf{b}$ where $\textbf{b}$ is the column of constants.
In this section we extend this matrix-vector multiplication to a way of multiplying matrices in general, and then investigate matrix algebra for its own sake. While it shares several properties of ordinary arithmetic, it will soon become clear that matrix arithmetic is different in a number of ways.
Definition 2.9 Matrix Multiplication

Let $A$ be an $m \times n$ matrix and let $B$ be an $n \times k$ matrix, so that $B = \left[ \begin{array}{cccc} \textbf{b}_{1} & \textbf{b}_{2} & \cdots & \textbf{b}_{k} \end{array} \right]$ is given in terms of its columns. The product matrix $AB$ is the $m \times k$ matrix defined as follows:
$$AB = A\left[ \begin{array}{cccc} \textbf{b}_{1} & \textbf{b}_{2} & \cdots & \textbf{b}_{k} \end{array} \right] = \left[ \begin{array}{cccc} A\textbf{b}_{1} & A\textbf{b}_{2} & \cdots & A\textbf{b}_{k} \end{array} \right]$$

Thus the product matrix $AB$ is given in terms of its columns $A\textbf{b}_{1}, A\textbf{b}_{2}, \dots, A\textbf{b}_{k}$: Column $j$ of $AB$ is the matrix-vector product $A\textbf{b}_{j}$ of $A$ and the corresponding column $\textbf{b}_{j}$ of $B$. Note that each such product $A\textbf{b}_{j}$ makes sense by Definition 2.5 because $A$ is $m \times n$ and each $\textbf{b}_{j}$ is in $\mathbb{R}^{n}$ (since $B$ has $n$ rows). Note also that if $B$ is a column matrix, this definition reduces to Definition 2.5 for matrix-vector multiplication.
Given matrices $A$ and $B$ as in Definition 2.9, the definition and the above computation give
$$A(B\textbf{x}) = (AB)\textbf{x}$$
for all $\textbf{x}$ in $\mathbb{R}^{k}$. We record this for reference.

Theorem 2.3.1

Let $A$ be an $m \times n$ matrix and let $B$ be an $n \times k$ matrix. Then the product matrix $AB$ is $m \times k$ and satisfies
$$A(B\textbf{x}) = (AB)\textbf{x} \quad \mbox{for all } \textbf{x} \mbox{ in } \mathbb{R}^{k}$$
Here is an example of how to compute the product of two matrices using Definition 2.9.
Example 2.3.1

Compute $AB$ if
$$A = \left[ \begin{array}{rrr} 2 & 3 & 5 \\ 1 & 4 & 7 \\ 0 & 1 & 8 \end{array} \right] \quad \mbox{and} \quad B = \left[\begin{array}{rr} 8 & 9 \\ 7 & 2 \\ 6 & 1 \end{array} \right]$$

Solution:

The columns of $B$ are $\textbf{b}_{1} = \left[ \begin{array}{r} 8 \\ 7 \\ 6 \end{array} \right]$ and $\textbf{b}_{2} = \left[ \begin{array}{r} 9 \\ 2 \\ 1 \end{array} \right]$, so Definition 2.5 gives
$$A\textbf{b}_{1} = \left[ \begin{array}{r} 67 \\ 78 \\ 55 \end{array} \right] \quad \mbox{and} \quad A\textbf{b}_{2} = \left[ \begin{array}{r} 29 \\ 24 \\ 10 \end{array} \right]$$
Hence Definition 2.9 above gives $AB = \left[ \begin{array}{cc} A\textbf{b}_{1} & A\textbf{b}_{2} \end{array} \right] = \left[ \begin{array}{rr} 67 & 29 \\ 78 & 24 \\ 55 & 10 \end{array} \right]$.
While Definition 2.9 is important, there is another way to compute the matrix product $AB$ that gives a way to calculate each individual entry. In Section 2.2 we defined the dot product of two $n$-tuples to be the sum of the products of corresponding entries. We went on to show (Theorem 2.2.5) that if $A$ is an $m \times n$ matrix and $\textbf{x}$ is an $n$-vector, then entry $i$ of the product $A\textbf{x}$ is the dot product of row $i$ of $A$ with $\textbf{x}$. This observation was called the "dot product rule" for matrix-vector multiplication, and the next theorem shows that it extends to matrix multiplication in general.

Theorem 2.3.2 Dot Product Rule

Let $A$ be an $m \times n$ matrix and let $B$ be an $n \times k$ matrix. Then the $(i, j)$-entry of the product matrix $AB$ is the dot product of row $i$ of $A$ with column $j$ of $B$.
Proof:
Write $B = \left[ \begin{array}{cccc} \textbf{b}_{1} & \textbf{b}_{2} & \cdots & \textbf{b}_{k} \end{array} \right]$ in terms of its columns. Then $A\textbf{b}_{j}$ is column $j$ of $AB$ for each $j$. Hence the $(i, j)$-entry of $AB$ is entry $i$ of $A\textbf{b}_{j}$, which is the dot product of row $i$ of $A$ with $\textbf{b}_{j}$. This proves the theorem.

Thus to compute the $(i, j)$-entry of $AB$, proceed as follows:
Go across row $i$ of $A$, and down column $j$ of $B$, multiply corresponding entries, and add the results.

Note that this requires that the rows of $A$ must be the same length as the columns of $B$. The following rule is useful for remembering this and for deciding the size of the product matrix $AB$.
Compatibility Rule
Let $A$ and $B$ denote matrices. If $A$ is $m \times n$ and $B$ is $n^{\prime} \times k$, the product $AB$ can be formed if and only if $n = n^{\prime}$. In this case the size of the product matrix $AB$ is $m \times k$, and we say that $AB$ is defined, or that $A$ and $B$ are compatible for multiplication.

We adopt the following convention:
Whenever a product of matrices is written, it is tacitly assumed that the sizes of the factors are such that the product is defined.
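Entry by entry, the dot product rule and the compatibility rule look like this in code (a plain-Python sketch; not part of the original text):

```python
def matmul(A, B):
    """Multiply matrices given as lists of rows, using the dot product rule."""
    m, n = len(A), len(A[0])
    n2, k = len(B), len(B[0])
    # Compatibility rule: A is m x n and B is n' x k; AB is defined only if n = n'.
    assert n == n2, "rows of A must be as long as the columns of B"
    # (i, j)-entry of AB = dot product of row i of A with column j of B.
    return [[sum(A[i][t] * B[t][j] for t in range(n)) for j in range(k)]
            for i in range(m)]

A = [[2, 3, 5], [1, 4, 7], [0, 1, 8]]
B = [[8, 9], [7, 2], [6, 1]]
print(matmul(A, B))   # [[67, 29], [78, 24], [55, 10]]
```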
To illustrate the dot product rule, we recompute the matrix product in Example 2.3.1.
Example 2.3.3

Compute $AB$ if
$$A = \left[ \begin{array}{rrr} 2 & 3 & 5 \\ 1 & 4 & 7 \\ 0 & 1 & 8 \end{array} \right] \quad \mbox{and} \quad B = \left[ \begin{array}{rr} 8 & 9 \\ 7 & 2 \\ 6 & 1 \end{array} \right]$$

Solution:

Here $A$ is $3 \times 3$ and $B$ is $3 \times 2$, so the product matrix $AB$ is defined and will be of size $3 \times 2$. Theorem 2.3.2 gives each entry of $AB$ as the dot product of the corresponding row of $A$ with the corresponding column of $B$; that is,
$$AB = \left[ \begin{array}{cc} 2 \cdot 8 + 3 \cdot 7 + 5 \cdot 6 & 2 \cdot 9 + 3 \cdot 2 + 5 \cdot 1 \\ 1 \cdot 8 + 4 \cdot 7 + 7 \cdot 6 & 1 \cdot 9 + 4 \cdot 2 + 7 \cdot 1 \\ 0 \cdot 8 + 1 \cdot 7 + 8 \cdot 6 & 0 \cdot 9 + 1 \cdot 2 + 8 \cdot 1 \end{array} \right] = \left[ \begin{array}{rr} 67 & 29 \\ 78 & 24 \\ 55 & 10 \end{array} \right]$$
Of course, this agrees with Example 2.3.1.
Example 2.3.4



Then compute $AB$.
Solution:
The $(1, 3)$-entry of $AB$ is the dot product of row 1 of $A$ and column 3 of $B$, computed by multiplying corresponding entries and adding the results. Similarly, the $(2, 4)$-entry of $AB$ involves row 2 of $A$ and column 4 of $B$. Since $A$ is $2 \times 3$ and $B$ is $3 \times 4$, the product $AB$ is $2 \times 4$.
Example 2.3.5
If $A = \left[ \begin{array}{ccc} 1 & 3 & 2\end{array}\right]$ and $B = \left[ \begin{array}{r} 5 \\ 6 \\ 4 \end{array} \right]$, compute $A^{2}$, $AB$, $BA$, and $B^{2}$ when they are defined.

Solution:

Here, $A$ is a $1 \times 3$ matrix and $B$ is a $3 \times 1$ matrix, so $A^{2}$ and $B^{2}$ are not defined. However, the compatibility rule shows that both $AB$ and $BA$ can be formed, and these are $1 \times 1$ and $3 \times 3$ matrices, respectively:
$$AB = \left[ \begin{array}{ccc} 1 & 3 & 2 \end{array} \right]\left[ \begin{array}{r} 5 \\ 6 \\ 4 \end{array} \right] = \left[ 31 \right] \qquad BA = \left[ \begin{array}{r} 5 \\ 6 \\ 4 \end{array} \right]\left[ \begin{array}{ccc} 1 & 3 & 2 \end{array} \right] = \left[ \begin{array}{ccc} 5 & 15 & 10 \\ 6 & 18 & 12 \\ 4 & 12 & 8 \end{array} \right]$$
Unlike numerical multiplication, matrix products $AB$ and $BA$ need not be equal. In fact they need not even be the same size, as Example 2.3.5 shows. It turns out to be rare that $AB = BA$ (although it is by no means impossible), and $A$ and $B$ are said to commute when this happens.
Example 2.3.6
If $A = \left[ \begin{array}{rr} 6 & 9 \\ -4 & -6 \end{array} \right]$ and $B = \left[ \begin{array}{rr} 1 & 2 \\ -1 & 0 \end{array} \right]$, compute $A^{2}$, $AB$, and $BA$.

Solution:

$A^{2} = \left[ \begin{array}{rr} 6 & 9 \\ -4 & -6 \end{array} \right]\left[ \begin{array}{rr} 6 & 9 \\ -4 & -6 \end{array} \right] = \left[ \begin{array}{rr} 0 & 0 \\ 0 & 0 \end{array} \right]$, so $A^{2} = 0$ can occur even if $A \neq 0$. Next,
$$AB = \left[ \begin{array}{rr} 6 & 9 \\ -4 & -6 \end{array} \right]\left[ \begin{array}{rr} 1 & 2 \\ -1 & 0 \end{array} \right] = \left[ \begin{array}{rr} -3 & 12 \\ 2 & -8 \end{array} \right] \qquad BA = \left[ \begin{array}{rr} 1 & 2 \\ -1 & 0 \end{array} \right]\left[ \begin{array}{rr} 6 & 9 \\ -4 & -6 \end{array} \right] = \left[ \begin{array}{rr} -2 & -3 \\ -6 & -9 \end{array} \right]$$
Hence $AB \neq BA$, even though $AB$ and $BA$ are the same size.
Example 2.3.7

If $A$ is any matrix, then $IA = A$ and $AI = A$, where $I$ denotes an identity matrix of a size such that the multiplications are defined.

Solution:

These both follow from the dot product rule as the reader should verify. For a more formal proof, write $A = \left[ \begin{array}{cccc} \textbf{a}_{1} & \textbf{a}_{2} & \cdots & \textbf{a}_{n} \end{array} \right]$ where $\textbf{a}_{j}$ is column $j$ of $A$. Then Definition 2.9 and Example 2.2.11 give
$$IA = I\left[ \begin{array}{cccc} \textbf{a}_{1} & \textbf{a}_{2} & \cdots & \textbf{a}_{n} \end{array} \right] = \left[ \begin{array}{cccc} I\textbf{a}_{1} & I\textbf{a}_{2} & \cdots & I\textbf{a}_{n} \end{array} \right] = \left[ \begin{array}{cccc} \textbf{a}_{1} & \textbf{a}_{2} & \cdots & \textbf{a}_{n} \end{array} \right] = A$$
If $\textbf{e}_{j}$ denotes column $j$ of $I$, then $A\textbf{e}_{j} = \textbf{a}_{j}$ for each $j$ by Example 2.2.12. Hence Definition 2.9 gives:
$$AI = A\left[ \begin{array}{cccc} \textbf{e}_{1} & \textbf{e}_{2} & \cdots & \textbf{e}_{n} \end{array} \right] = \left[ \begin{array}{cccc} A\textbf{e}_{1} & A\textbf{e}_{2} & \cdots & A\textbf{e}_{n} \end{array} \right] = \left[ \begin{array}{cccc} \textbf{a}_{1} & \textbf{a}_{2} & \cdots & \textbf{a}_{n} \end{array} \right] = A$$
The following theorem collects several results about matrix multiplication that are used everywhere in linear algebra.
Theorem 2.3.3
Assume that $a$ is any scalar, and that $A$, $B$, and $C$ are matrices of sizes such that the indicated matrix products are defined. Then:
1. $IA = A$ and $AI = A$ where $I$ denotes an identity matrix.
2. $A(BC) = (AB)C$.
3. $A(B + C) = AB + AC$.
4. $(B + C)A = BA + CA$.
5. $a(AB) = (aA)B = A(aB)$.
6. $(AB)^{T} = B^{T}A^{T}$.
Proof:
Condition (1) is Example 2.3.7; we prove (2), (4), and (6) and leave (3) and (5) as exercises.
2. If $C = \left[ \begin{array}{cccc} \textbf{c}_{1} & \textbf{c}_{2} & \cdots & \textbf{c}_{k} \end{array} \right]$ in terms of its columns, then $BC = \left[ \begin{array}{cccc} B\textbf{c}_{1} & B\textbf{c}_{2} & \cdots & B\textbf{c}_{k} \end{array} \right]$ by Definition 2.9, so
$$A(BC) = \left[ \begin{array}{cccc} A(B\textbf{c}_{1}) & A(B\textbf{c}_{2}) & \cdots & A(B\textbf{c}_{k}) \end{array} \right] = \left[ \begin{array}{cccc} (AB)\textbf{c}_{1} & (AB)\textbf{c}_{2} & \cdots & (AB)\textbf{c}_{k} \end{array} \right] = (AB)C$$
because $A(B\textbf{c}_{j}) = (AB)\textbf{c}_{j}$ for each $j$ by Theorem 2.3.1.

4. We know (Theorem 2.2.2) that $(B + C)\textbf{x} = B\textbf{x} + C\textbf{x}$ holds for every column $\textbf{x}$. If we write $A = \left[ \begin{array}{cccc} \textbf{a}_{1} & \textbf{a}_{2} & \cdots & \textbf{a}_{n} \end{array} \right]$ in terms of its columns, we get
$$(B + C)A = \left[ \begin{array}{ccc} (B + C)\textbf{a}_{1} & \cdots & (B + C)\textbf{a}_{n} \end{array} \right] = \left[ \begin{array}{ccc} B\textbf{a}_{1} + C\textbf{a}_{1} & \cdots & B\textbf{a}_{n} + C\textbf{a}_{n} \end{array} \right] = BA + CA$$

6. As in Section 2.1, write $A = \left[ a_{ij} \right]$ and $B = \left[ b_{ij} \right]$, so that $A^{T} = \left[ a_{ij}^{\prime} \right]$ and $B^{T} = \left[ b_{ij}^{\prime} \right]$ where $a_{ij}^{\prime} = a_{ji}$ and $b_{ij}^{\prime} = b_{ji}$ for all $i$ and $j$. If $c_{ij}$ denotes the $(i, j)$-entry of $B^{T}A^{T}$, then $c_{ij}$ is the dot product of row $i$ of $B^{T}$ with column $j$ of $A^{T}$. Hence
$$c_{ij} = b_{i1}^{\prime}a_{1j}^{\prime} + b_{i2}^{\prime}a_{2j}^{\prime} + \cdots + b_{in}^{\prime}a_{nj}^{\prime} = b_{1i}a_{j1} + b_{2i}a_{j2} + \cdots + b_{ni}a_{jn} = a_{j1}b_{1i} + a_{j2}b_{2i} + \cdots + a_{jn}b_{ni}$$
But this is the dot product of row $j$ of $A$ with column $i$ of $B$; that is, the $(j, i)$-entry of $AB$; that is, the $(i, j)$-entry of $(AB)^{T}$. This proves (6).
Property 2 in Theorem 2.3.3 is called the associative law of matrix multiplication. It asserts that the equation $A(BC) = (AB)C$ holds for all matrices (if the products are defined). Hence this product is the same no matter how it is formed, and so is written simply as $ABC$. This extends: The product $ABCD$ of four matrices can be formed several ways, for example $(AB)(CD)$, $\left[A(BC)\right]D$, and $A\left[B(CD)\right]$, but the associative law implies that they are all equal and so are written as $ABCD$. A similar remark applies in general: Matrix products can be written unambiguously with no parentheses.

However, a note of caution about matrix multiplication must be taken: The fact that $AB$ and $BA$ need not be equal means that the order of the factors is important in a product of matrices. For example $ABCD$ and $ADCB$ may not be equal.
Warning:
If the order of the factors in a product of matrices is changed, the product matrix may change (or may not be defined). Ignoring this warning is a source of many errors by students of linear algebra!
Properties 3 and 4 in Theorem 2.3.3 are called distributive laws. They assert that
$$A(B + C) = AB + AC \quad \mbox{and} \quad (B + C)A = BA + CA$$
hold whenever the sums and products are defined. These rules extend to more than two terms and, together with Property 5, ensure that many manipulations familiar from ordinary algebra extend to matrices. For example
$$A(2B - 3C + D - 5E) = 2AB - 3AC + AD - 5AE$$
Note again that the warning is in effect: For example $A(B - C)$ need not equal $AB - CA$. These rules make possible a lot of simplification of matrix expressions.
Example 2.3.8

Simplify the expression $A(BC - CD) + A(C - B)D - AB(C - D)$.

Solution:
$$\begin{aligned}
A(BC - CD) + A(C - B)D - AB(C - D) &= ABC - ACD + ACD - ABD - ABC + ABD \\
&= 0
\end{aligned}$$
Example 2.3.9 and Example 2.3.10 below show how we can use the properties in Theorem 2.3.3 to deduce other facts about matrix multiplication. Matrices $A$ and $B$ are said to commute if $AB = BA$.
Example 2.3.9

Suppose that $A$, $B$, and $C$ are $n \times n$ matrices and that both $A$ and $B$ commute with $C$; that is, $AC = CA$ and $BC = CB$. Show that $AB$ commutes with $C$.

Solution:

Showing that $AB$ commutes with $C$ means verifying that $(AB)C = C(AB)$. The computation uses the associative law several times, as well as the given facts that $AC = CA$ and $BC = CB$.
$$(AB)C = A(BC) = A(CB) = (AC)B = (CA)B = C(AB)$$
Example 2.3.10

Show that $AB = BA$ if and only if $(A - B)(A + B) = A^{2} - B^{2}$.

Solution:

The following always holds:
$$(A - B)(A + B) = A(A + B) - B(A + B) = A^{2} + AB - BA - B^{2} \quad (2.6)$$
Hence if $AB = BA$, then $(A - B)(A + B) = A^{2} - B^{2}$ follows. Conversely, if this last equation holds, then equation (2.6) becomes
$$A^{2} - B^{2} = A^{2} + AB - BA - B^{2}$$
This gives $0 = AB - BA$, and $AB = BA$ follows.
In Section 2.2 we saw (in Theorem 2.2.1) that every system of linear equations has the form $A\textbf{x} = \textbf{b}$ where $A$ is the coefficient matrix, $\textbf{x}$ is the column of variables, and $\textbf{b}$ is the constant matrix. Thus the system of linear equations becomes a single matrix equation. Matrix multiplication can yield information about such a system.
Example 2.3.11

Consider a system $A\textbf{x} = \textbf{b}$ of linear equations where $A$ is an $n \times n$ matrix. Assume that a matrix $C$ exists such that $CA = I_{n}$. If the system $A\textbf{x} = \textbf{b}$ has a solution, show that this solution must be $C\textbf{b}$. Give a condition guaranteeing that $C\textbf{b}$ is in fact a solution.

Solution:

Suppose that $\textbf{x}$ is any solution to the system, so that $A\textbf{x} = \textbf{b}$. Multiply both sides of this matrix equation by $C$ to obtain, successively,
$$C(A\textbf{x}) = C\textbf{b}, \quad (CA)\textbf{x} = C\textbf{b}, \quad I_{n}\textbf{x} = C\textbf{b}, \quad \textbf{x} = C\textbf{b}$$
This shows that if the system has a solution $\textbf{x}$, then that solution must be $\textbf{x} = C\textbf{b}$, as required. But it does not guarantee that the system has a solution. However, if we write $\textbf{x}_{1} = C\textbf{b}$, then
$$A\textbf{x}_{1} = A(C\textbf{b}) = (AC)\textbf{b}$$
Thus $\textbf{x}_{1} = C\textbf{b}$ will be a solution if the condition $AC = I_{n}$ is satisfied.
The ideas in Example 2.3.11 lead to important information about matrices; this will be pursued in the next section.
2.4 Matrix Inverse
Three basic operations on matrices, addition, multiplication, and subtraction, are analogs for matrices of the same operations for numbers. In this section we introduce the matrix analog of numerical division.
To begin, consider how a numerical equation $ax = b$ is solved when $a$ and $b$ are known numbers. If $a = 0$, there is no solution (unless $b = 0$). But if $a \neq 0$, we can multiply both sides by the inverse $a^{-1} = \frac{1}{a}$ to obtain the solution $x = a^{-1}b$. Of course multiplying by $a^{-1}$ is just dividing by $a$, and the property of $a^{-1}$ that makes this work is that $a^{-1}a = 1$. Moreover, we saw earlier that the role that $1$ plays in arithmetic is played in matrix algebra by the identity matrix $I$. This suggests the following definition.
Definition 2.11 Matrix Inverses

If $A$ is a square matrix, a matrix $B$ is called an inverse of $A$ if and only if
$$AB = I \quad \mbox{and} \quad BA = I$$
A matrix $A$ that has an inverse is called an invertible matrix.

Note that only square matrices have inverses. Even though it is plausible that nonsquare matrices $A$ and $B$ could exist such that $AB = I_{m}$ and $BA = I_{n}$, where $A$ is $m \times n$ and $B$ is $n \times m$, we claim that this forces $n = m$. Indeed, if $m < n$ there exists a nonzero column $\textbf{x}$ such that $A\textbf{x} = \textbf{0}$ (by Theorem 1.3.1), so $\textbf{x} = I_{n}\textbf{x} = (BA)\textbf{x} = B(A\textbf{x}) = B\textbf{0} = \textbf{0}$, a contradiction. Hence $m \geq n$. Similarly, the condition $AB = I_{m}$ implies that $n \geq m$. Hence $m = n$, so $A$ is square.
Example 2.4.1
Show that $B = \left[ \begin{array}{rr} -1 & 1 \\ 1 & 0 \end{array} \right]$ is an inverse of $A = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 1 \end{array} \right]$.

Solution:

Compute $AB$ and $BA$.
$$AB = \left[ \begin{array}{rr} 0 & 1 \\ 1 & 1 \end{array} \right]\left[ \begin{array}{rr} -1 & 1 \\ 1 & 0 \end{array} \right] = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right] \qquad BA = \left[ \begin{array}{rr} -1 & 1 \\ 1 & 0 \end{array} \right]\left[ \begin{array}{rr} 0 & 1 \\ 1 & 1 \end{array} \right] = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right]$$
Hence $AB = I = BA$, so $B$ is indeed an inverse of $A$.
Example 2.4.2
Show that $A = \left[ \begin{array}{rr} 0 & 0 \\ 1 & 3 \end{array} \right]$ has no inverse.

Solution:

Let
$$B = \left[ \begin{array}{cc} a & b \\ c & d \end{array} \right]$$
denote an arbitrary $2 \times 2$ matrix. Then
$$AB = \left[ \begin{array}{rr} 0 & 0 \\ 1 & 3 \end{array} \right]\left[ \begin{array}{cc} a & b \\ c & d \end{array} \right] = \left[ \begin{array}{cc} 0 & 0 \\ a + 3c & b + 3d \end{array} \right]$$
so $AB$ has a row of zeros. Hence $AB$ cannot equal $I$ for any $B$.
The argument in Example 2.4.2 shows that no zero matrix has an inverse. But Example 2.4.2 also shows that, unlike arithmetic, it is possible for a nonzero matrix to have no inverse. However, if a matrix does have an inverse, it has only one.
Theorem 2.4.1

If $B$ and $C$ are both inverses of $A$, then $B = C$.
Proof:
Since $B$ and $C$ are both inverses of $A$, we have $CA = I = AB$. Hence
$$B = IB = (CA)B = C(AB) = CI = C$$
If $A$ is an invertible matrix, the (unique) inverse of $A$ is denoted $A^{-1}$. Hence $A^{-1}$ (when it exists) is a square matrix of the same size as $A$ with the property that
$$AA^{-1} = I \quad \mbox{and} \quad A^{-1}A = I$$
These equations characterize $A^{-1}$ in the following sense:

Inverse Criterion: If somehow a matrix $B$ can be found such that $AB = I$ and $BA = I$, then $A$ is invertible and $B$ is the inverse of $A$; in symbols, $B = A^{-1}$.

This is a way to verify that the inverse of a matrix exists. Example 2.4.3 and Example 2.4.4 offer illustrations.
Example 2.4.3
If $A = \left[ \begin{array}{rr} 0 & -1 \\ 1 & -1 \end{array} \right]$, show that $A^{3} = I$ and so find $A^{-1}$.

Solution:

We have $A^{2} = \left[ \begin{array}{rr} 0 & -1 \\ 1 & -1 \end{array} \right]\left[ \begin{array}{rr} 0 & -1 \\ 1 & -1 \end{array} \right] = \left[ \begin{array}{rr} -1 & 1 \\ -1 & 0 \end{array} \right]$, and so
$$A^{3} = A^{2}A = \left[ \begin{array}{rr} -1 & 1 \\ -1 & 0 \end{array} \right]\left[ \begin{array}{rr} 0 & -1 \\ 1 & -1 \end{array} \right] = \left[ \begin{array}{rr} 1 & 0 \\ 0 & 1 \end{array} \right] = I$$
Hence $A^{3} = I$, as asserted. This can be written as $A^{2}A = I = AA^{2}$, so it shows that $A^{2}$ is the inverse of $A$. That is, $A^{-1} = A^{2} = \left[ \begin{array}{rr} -1 & 1 \\ -1 & 0 \end{array} \right]$.
The next example presents a useful formula for the inverse of a $2 \times 2$ matrix $A = \left[ \begin{array}{cc} a & b \\ c & d \end{array} \right]$ when it exists. To state it, we define the determinant $\det A$ and the adjugate $\mbox{adj} A$ of the matrix $A$ as follows:
$$\det \left[ \begin{array}{cc} a & b \\ c & d \end{array} \right] = ad - bc \quad \mbox{and} \quad \mbox{adj} \left[ \begin{array}{cc} a & b \\ c & d \end{array} \right] = \left[ \begin{array}{rr} d & -b \\ -c & a \end{array} \right]$$
Example 2.4.4
If $A = \left[ \begin{array}{cc} a & b \\ c & d \end{array} \right]$, show that $A$ has an inverse if and only if $\det A \neq 0$, and in this case
$$A^{-1} = \frac{1}{\det A} \mbox{adj} A$$

Solution:

For convenience, write $e = \det A = ad - bc$ and $B = \mbox{adj} A = \left[ \begin{array}{rr} d & -b \\ -c & a \end{array} \right]$. Then
$$AB = eI = BA$$
as the reader can verify. So if $e \neq 0$, scalar multiplication by $\frac{1}{e}$ gives
$$A\left(\frac{1}{e}B\right) = I = \left(\frac{1}{e}B\right)A$$
Hence $A$ is invertible and $A^{-1} = \frac{1}{e}B$. Thus it remains only to show that if $A^{-1}$ exists, then $e \neq 0$.

We prove this by showing that assuming $e = 0$ leads to a contradiction. In fact, if $e = 0$, then $AB = eI = 0$, so left multiplication by $A^{-1}$ gives $A^{-1}(AB) = A^{-1}0$; that is, $IB = 0$, so $B = 0$. But this implies that $a$, $b$, $c$, and $d$ are all zero, so $A = 0$, contrary to the assumption that $A^{-1}$ exists.
As an illustration, given any specific $2 \times 2$ matrix $A$ one computes $\det A$; if it is nonzero, then $A$ is invertible and $A^{-1} = \frac{1}{\det A}\mbox{adj} A$, as the reader is invited to verify on an example.
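The $2 \times 2$ formula is easy to implement; the following sketch (not part of the original text) uses the determinant and adjugate exactly as in Example 2.4.4:

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Return the inverse of [[a, b], [c, d]] via det and adjugate, or None."""
    det = a * d - b * c
    if det == 0:
        return None                       # no inverse when det A = 0
    det = Fraction(det)
    adj = [[d, -b], [-c, a]]              # adjugate of A
    return [[entry / det for entry in row] for row in adj]

# Inverse is [[-1, 1], [1, 0]] (printed as Fractions), matching Example 2.4.1.
print(inverse_2x2(0, 1, 1, 1))
# The matrix of Example 2.4.2 has determinant 0, so no inverse exists.
print(inverse_2x2(0, 0, 1, 3))            # None
```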
The determinant and adjugate will be defined in Chapter 3 for any square matrix, and the conclusions in Example 2.4.4 will be proved in full generality.
Inverse and Linear systems
Matrix inverses can be used to solve certain systems of linear equations. Recall that a system of linear equations can be written as a single matrix equation
$$A\textbf{x} = \textbf{b}$$
where $A$ and $\textbf{b}$ are known and $\textbf{x}$ is to be determined. If $A$ is invertible, we multiply each side of the equation on the left by $A^{-1}$ to get
$$A^{-1}A\textbf{x} = A^{-1}\textbf{b}, \quad I\textbf{x} = A^{-1}\textbf{b}, \quad \textbf{x} = A^{-1}\textbf{b}$$
This gives the solution to the system of equations (the reader should verify that $\textbf{x} = A^{-1}\textbf{b}$ really does satisfy $A\textbf{x} = \textbf{b}$). Furthermore, the argument shows that if $\textbf{x}$ is any solution, then necessarily $\textbf{x} = A^{-1}\textbf{b}$, so the solution is unique. Of course the technique works only when the coefficient matrix $A$ has an inverse. This proves Theorem 2.4.2.
Theorem 2.4.2

Suppose a system of $n$ equations in $n$ variables is written in matrix form as $A\textbf{x} = \textbf{b}$. If the $n \times n$ coefficient matrix $A$ is invertible, the system has the unique solution
$$\textbf{x} = A^{-1}\textbf{b}$$
Example 2.4.5

Solution:
In matrix form this is $A\textbf{x} = \textbf{b}$ where $A$ is the coefficient matrix, $\textbf{x}$ is the column of variables, and $\textbf{b}$ is the column of constants. Then $\det A \neq 0$, so $A$ is invertible and $A^{-1} = \frac{1}{\det A}\mbox{adj} A$ by Example 2.4.4. Thus Theorem 2.4.2 gives
$$\textbf{x} = A^{-1}\textbf{b}$$
which yields the unique solution of the system.
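A numerical sketch of Theorem 2.4.2 (assuming NumPy; the particular system here is illustrative and is not the one in the original example):

```python
import numpy as np

# An invertible coefficient matrix and a right-hand side (illustrative values).
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
b = np.array([3.0, 2.0])

x = np.linalg.inv(A) @ b          # x = A^{-1} b, as in Theorem 2.4.2
print(x)                          # [1. 1.]
print(np.allclose(A @ x, b))      # the solution really satisfies Ax = b

# In practice np.linalg.solve(A, b) is preferred; it gives the same unique solution.
print(np.linalg.solve(A, b))
```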
An inversion method
If a matrix $A$ is $n \times n$ and invertible, it is desirable to have an efficient technique for finding the inverse. The following procedure will be justified in Section 2.5.
Matrix Inversion Algorithm

If $A$ is an invertible (square) matrix, there exists a sequence of elementary row operations that carry $A$ to the identity matrix $I$ of the same size, written $A \to I$. This same series of row operations carries $I$ to $A^{-1}$; that is, $I \to A^{-1}$. The algorithm can be summarized as follows:
$$\left[ \begin{array}{cc} A & I \end{array} \right] \rightarrow \left[ \begin{array}{cc} I & A^{-1} \end{array} \right]$$
where the row operations on $A$ and $I$ are carried out simultaneously.
Example 2.4.6
Solution:
Apply elementary row operations to the double matrix
$$\left[ \begin{array}{cc} A & I \end{array} \right]$$
so as to carry $A$ to $I$. First interchange rows 1 and 2. Next subtract a suitable multiple of row 1 from row 2, and subtract row 1 from row 3. Continue to reduced row-echelon form. The matrix that then appears in place of $I$ is $A^{-1}$, as is readily verified.
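The inversion algorithm, carrying $[A\ I]$ to $[I\ A^{-1}]$ by row operations, can be sketched as follows (plain Python; the test matrix is illustrative and not the one used in the original example):

```python
def invert(A):
    """Gauss-Jordan inversion: carry [A | I] to [I | A^-1] by row operations."""
    n = len(A)
    # Build the double matrix [A | I].
    M = [list(map(float, A[i])) + [1.0 if j == i else 0.0 for j in range(n)]
         for i in range(n)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                       # A cannot be carried to I: not invertible
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row, then clear the other entries in this column.
        p = M[col][col]
        M[col] = [entry / p for entry in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]             # the right half is A^-1

print(invert([[0, 1], [1, 1]]))   # [[-1.0, 1.0], [1.0, 0.0]], as in Example 2.4.1
```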
Given any $n \times n$ matrix $A$, Theorem 1.2.1 shows that $A$ can be carried by elementary row operations to a matrix $R$ in reduced row-echelon form. If $R = I$, the matrix $A$ is invertible (this will be proved in the next section), so the algorithm produces $A^{-1}$. If $R \neq I$, then $R$ has a row of zeros (it is square), so no system of linear equations $A\textbf{x} = \textbf{b}$ can have a unique solution. But then $A$ is not invertible by Theorem 2.4.2. Hence, the algorithm is effective in the sense conveyed in Theorem 2.4.3.
Theorem 2.4.3

If $A$ is an $n \times n$ matrix, either $A$ can be reduced to $I$ by elementary row operations or it cannot. In the first case, the algorithm produces $A^{-1}$; in the second case, $A^{-1}$ does not exist.
Properties of inverses
The following properties of an invertible matrix are used everywhere.
Example 2.4.7: Cancellation Laws
Let $A$ be an invertible matrix. Show that:
1. If $AB = AC$, then $B = C$.
2. If $BA = CA$, then $B = C$.

Solution:

Given the equation $AB = AC$, left multiply both sides by $A^{-1}$ to obtain $A^{-1}AB = A^{-1}AC$. Thus $IB = IC$, that is $B = C$. This proves (1), and the proof of (2) is left to the reader.
Properties (1) and (2) in Example 2.4.7 are described by saying that an invertible matrix can be "left cancelled" and "right cancelled", respectively. Note however that "mixed" cancellation does not hold in general: If $A$ is invertible and $AB = CA$, then $B$ and $C$ may not be equal, even if both are $2 \times 2$; the reader is invited to find a specific example.

Sometimes the inverse of a matrix is given by a formula. Example 2.4.4 is one illustration; Example 2.4.8 and Example 2.4.9 provide two more. The idea is the Inverse Criterion: If a matrix $B$ can be found such that $AB = I = BA$, then $A$ is invertible and $A^{-1} = B$.
Example 2.4.8

If $A$ is an invertible matrix, show that the transpose $A^{T}$ is also invertible. Show further that the inverse of $A^{T}$ is just the transpose of $A^{-1}$; in symbols, $\left(A^{T}\right)^{-1} = \left(A^{-1}\right)^{T}$.

Solution:

$A^{-1}$ exists (by assumption). Its transpose $\left(A^{-1}\right)^{T}$ is the candidate proposed for the inverse of $A^{T}$. Using the inverse criterion, we test it as follows:
$$A^{T}\left(A^{-1}\right)^{T} = \left(A^{-1}A\right)^{T} = I^{T} = I \qquad \left(A^{-1}\right)^{T}A^{T} = \left(AA^{-1}\right)^{T} = I^{T} = I$$
Hence $\left(A^{-1}\right)^{T}$ is indeed the inverse of $A^{T}$; that is, $\left(A^{T}\right)^{-1} = \left(A^{-1}\right)^{T}$.
Example 2.4.9

If $A$ and $B$ are invertible $n \times n$ matrices, show that their product $AB$ is also invertible and $(AB)^{-1} = B^{-1}A^{-1}$.

Solution:

We are given a candidate for the inverse of $AB$, namely $B^{-1}A^{-1}$. We test it as follows:
$$\left(B^{-1}A^{-1}\right)(AB) = B^{-1}\left(A^{-1}A\right)B = B^{-1}IB = B^{-1}B = I \qquad (AB)\left(B^{-1}A^{-1}\right) = A\left(BB^{-1}\right)A^{-1} = AIA^{-1} = AA^{-1} = I$$
Hence $B^{-1}A^{-1}$ is the inverse of $AB$; in symbols, $(AB)^{-1} = B^{-1}A^{-1}$.
We now collect several basic properties of matrix inverses for reference.
Theorem 2.4.4
All the following matrices are square matrices of the same size.
1. $I$ is invertible and $I^{-1} = I$.
2. If $A$ is invertible, so is $A^{-1}$, and $\left(A^{-1}\right)^{-1} = A$.
3. If $A$ and $B$ are invertible, so is $AB$, and $(AB)^{-1} = B^{-1}A^{-1}$.
4. If $A_{1}, A_{2}, \dots, A_{k}$ are all invertible, so is their product $A_{1}A_{2} \cdots A_{k}$, and
$$\left(A_{1}A_{2} \cdots A_{k}\right)^{-1} = A_{k}^{-1} \cdots A_{2}^{-1}A_{1}^{-1}$$
5. If $A$ is invertible, so is $A^{k}$ for any $k \geq 1$, and $\left(A^{k}\right)^{-1} = \left(A^{-1}\right)^{k}$.
6. If $A$ is invertible and $a \neq 0$ is a number, then $aA$ is invertible and $(aA)^{-1} = \frac{1}{a}A^{-1}$.
7. If $A$ is invertible, so is its transpose $A^{T}$, and $\left(A^{T}\right)^{-1} = \left(A^{-1}\right)^{T}$.
Proof:
1. This is an immediate consequence of the fact that $I^{2} = I$.
2. The equations $AA^{-1} = I = A^{-1}A$ show that $A$ is the inverse of $A^{-1}$; in symbols, $\left(A^{-1}\right)^{-1} = A$.
3. This is Example 2.4.9.
4. Use induction on $k$. If $k = 1$, there is nothing to prove, and if $k = 2$, the result is property 3. If $k > 2$, assume inductively that $\left(A_{1}A_{2} \cdots A_{k-1}\right)^{-1} = A_{k-1}^{-1} \cdots A_{2}^{-1}A_{1}^{-1}$. We apply this fact together with property 3 as follows:
$$\left[ A_{1}A_{2} \cdots A_{k-1}A_{k} \right]^{-1} = \left[ \left(A_{1}A_{2} \cdots A_{k-1}\right)A_{k} \right]^{-1} = A_{k}^{-1}\left(A_{1}A_{2} \cdots A_{k-1}\right)^{-1} = A_{k}^{-1}\left(A_{k-1}^{-1} \cdots A_{2}^{-1}A_{1}^{-1}\right)$$
So the proof by induction is complete.
5. This is property 4 with $A_{1} = A_{2} = \cdots = A_{k} = A$.
6. The reader is invited to verify this.
7. This is Example 2.4.8.
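Properties 3 and 7 of Theorem 2.4.4 are easy to spot-check numerically (a sketch assuming NumPy; the matrices are illustrative, not from the original text):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])      # det = -2, so A is invertible
B = np.array([[0.0, 1.0],
              [1.0, 1.0]])      # det = -1, so B is invertible

inv = np.linalg.inv

# Property 3: the inverse of a product reverses the order of the factors.
print(np.allclose(inv(A @ B), inv(B) @ inv(A)))   # True

# Property 7: inverting and transposing can be done in either order.
print(np.allclose(inv(A.T), inv(A).T))            # True
```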
The reversal of the order of the inverses in properties 3 and 4 of Theorem 2.4.4 is a consequence of the fact that matrix multiplication is not commutative. Another manifestation of this comes when matrix equations are dealt with. If a matrix equation $B = C$ is given, it can be left-multiplied by a matrix $A$ to yield $AB = AC$. Similarly, right-multiplication gives $BA = CA$. However, we cannot mix the two: If $B = C$, it need not be the case that $AB = CA$, even if $A$ is invertible (the reader can find $2 \times 2$ examples).

Part 7 of Theorem 2.4.4 together with the fact that $\left(A^{T}\right)^{T} = A$ gives

Corollary 2.4.1

A square matrix $A$ is invertible if and only if $A^{T}$ is invertible.

Example 2.4.10

Find $A$ if
$$\left(A^{T} - 2I\right)^{-1} = \left[ \begin{array}{rr} 2 & 1 \\ -1 & 0 \end{array} \right]$$

Solution:

By Theorem 2.4.4(2) and Example 2.4.4, we have
$$A^{T} - 2I = \left[\left(A^{T} - 2I\right)^{-1}\right]^{-1} = \left[ \begin{array}{rr} 2 & 1 \\ -1 & 0 \end{array} \right]^{-1} = \left[ \begin{array}{rr} 0 & -1 \\ 1 & 2 \end{array} \right]$$
Hence $A^{T} = 2I + \left[ \begin{array}{rr} 0 & -1 \\ 1 & 2 \end{array} \right] = \left[ \begin{array}{rr} 2 & -1 \\ 1 & 4 \end{array} \right]$, so $A = \left[ \begin{array}{rr} 2 & 1 \\ -1 & 4 \end{array} \right]$ by Theorem 2.4.4(7).
The following important theorem collects a number of conditions all equivalent to invertibility. It will be referred to frequently below.
Theorem 2.4.5 Inverse Theorem
The following conditions are equivalent for an $n \times n$ matrix $A$:
1. $A$ is invertible.
2. The homogeneous system $A\textbf{x} = \textbf{0}$ has only the trivial solution $\textbf{x} = \textbf{0}$.
3. $A$ can be carried to the identity matrix $I_{n}$ by elementary row operations.
4. The system $A\textbf{x} = \textbf{b}$ has at least one solution $\textbf{x}$ for every choice of column $\textbf{b}$.
5. There exists an $n \times n$ matrix $C$ such that $AC = I_{n}$.
Proof:
We show that each of these conditions implies the next, and that (5) implies (1).
(1) $\Rightarrow$ (2). If $A^{-1}$ exists, then $A\textbf{x} = \textbf{0}$ gives $\textbf{x} = I_{n}\textbf{x} = A^{-1}A\textbf{x} = A^{-1}\textbf{0} = \textbf{0}$.

(2) $\Rightarrow$ (3). Assume that (2) is true. Certainly $A \to R$ by row operations where $R$ is a reduced, row-echelon matrix. It suffices to show that $R = I_{n}$. Suppose that this is not the case. Then $R$ has a row of zeros (being square). Now consider the augmented matrix $\left[ \begin{array}{c|c} A & \textbf{0} \end{array} \right]$ of the system $A\textbf{x} = \textbf{0}$. Then $\left[ \begin{array}{c|c} R & \textbf{0} \end{array} \right]$ is the reduced form, and it also has a row of zeros. Since $R$ is square there must be at least one nonleading variable, and hence at least one parameter. Hence the system $A\textbf{x} = \textbf{0}$ has infinitely many solutions, contrary to (2). So $R = I_{n}$ after all.

(3) $\Rightarrow$ (4). Consider the augmented matrix $\left[ \begin{array}{c|c} A & \textbf{b} \end{array} \right]$ of the system $A\textbf{x} = \textbf{b}$. Using (3), let $A \to I_{n}$ by a sequence of row operations. Then these same operations carry $\left[ \begin{array}{c|c} A & \textbf{b} \end{array} \right] \to \left[ \begin{array}{c|c} I_{n} & \textbf{c} \end{array} \right]$ for some column $\textbf{c}$. Hence the system $A\textbf{x} = \textbf{b}$ has a solution (in fact unique) by gaussian elimination. This proves (4).

(4) $\Rightarrow$ (5). Write $I_{n} = \left[ \begin{array}{cccc} \textbf{e}_{1} & \textbf{e}_{2} & \cdots & \textbf{e}_{n} \end{array} \right]$ where $\textbf{e}_{1}, \textbf{e}_{2}, \dots, \textbf{e}_{n}$ are the columns of $I_{n}$. For each
$j = 1, 2, \dots, n$, the system $A\textbf{x} = \textbf{e}_{j}$ has a solution $\textbf{c}_{j}$ by (4), so $A\textbf{c}_{j} = \textbf{e}_{j}$. Now let $C = \left[ \begin{array}{cccc} \textbf{c}_{1} & \textbf{c}_{2} & \cdots & \textbf{c}_{n} \end{array} \right]$ be the $n \times n$ matrix with these matrices $\textbf{c}_{j}$ as its columns. Then Definition 2.9 gives (5):
$$AC = A\left[ \begin{array}{cccc} \textbf{c}_{1} & \textbf{c}_{2} & \cdots & \textbf{c}_{n} \end{array} \right] = \left[ \begin{array}{cccc} A\textbf{c}_{1} & A\textbf{c}_{2} & \cdots & A\textbf{c}_{n} \end{array} \right] = \left[ \begin{array}{cccc} \textbf{e}_{1} & \textbf{e}_{2} & \cdots & \textbf{e}_{n} \end{array} \right] = I_{n}$$

(5) $\Rightarrow$ (1). Assume that (5) is true so that $AC = I_{n}$ for some matrix $C$. Then $C\textbf{x} = \textbf{0}$ implies $\textbf{x} = \textbf{0}$ (because $\textbf{x} = I_{n}\textbf{x} = (AC)\textbf{x} = A(C\textbf{x}) = A\textbf{0} = \textbf{0}$). Thus condition (2) holds for the matrix $C$ rather than $A$. Hence the argument above that (2) $\Rightarrow$ (3) $\Rightarrow$ (4) $\Rightarrow$ (5) (with $A$ replaced by $C$) shows that a matrix $C^{\prime}$ exists such that $CC^{\prime} = I_{n}$. But then
$$A = AI_{n} = A\left(CC^{\prime}\right) = (AC)C^{\prime} = I_{n}C^{\prime} = C^{\prime}$$
Thus $CA = CC^{\prime} = I_{n}$ which, together with $AC = I_{n}$, shows that $C$ is the inverse of $A$. This proves (1).
The proof of (5) $\Rightarrow$ (1) in Theorem 2.4.5 shows that if $AB = I$ for square matrices, then necessarily $BA = I$, and hence that $A$ and $B$ are inverses of each other. We record this important fact for reference.

Corollary 2.4.1

If $A$ and $B$ are square matrices such that $AB = I$, then also $BA = I$. In particular, both $A$ and $B$ are invertible, with $A^{-1} = B$ and $B^{-1} = A$.

Here is a quick way to remember Corollary 2.4.1. If $A$ is a square matrix, then
1. If $AB = I$ then $B = A^{-1}$.
2. If $BA = I$ then $B = A^{-1}$.
Observe that Corollary 2.4.1 is false if $A$ and $B$ are not square matrices; a nonsquare example with $AB = I$ but $BA \neq I$ is easy to construct. In fact, it can be verified that if $AB = I_{m}$ and $BA = I_{n}$, where $A$ is $m \times n$ and $B$ is $n \times m$, then $m = n$ and $A$ and $B$ are (square) inverses of each other.

An $n \times n$ matrix $A$ has rank $n$ if and only if (3) of Theorem 2.4.5 holds. Hence

Corollary 2.4.2

An $n \times n$ matrix $A$ is invertible if and only if $\mbox{rank } A = n$.