Monthly Archives: June 2011

Every matrix over a field is row equivalent to a matrix in reduced row echelon form

Let F be a field and let A be a matrix over F. Prove that A is row equivalent to a matrix in reduced row echelon form.


We begin with a discussion about row echelon and reduced row echelon forms of matrices.

We will define row echelon form for matrices recursively as follows. For the base case, a 1 \times m matrix is in row echelon form if it is 0 or if the first nonzero entry is 1. For the recursive step, an n \times m matrix A is said to be in row echelon form if we have A = 0 or A = \left[ \begin{array}{c|c|c} 0 & 1 & V \\ \hline 0 & 0 & A^\prime \end{array} \right], where 1 is the 1 \times 1 identity matrix, V has dimension 1 \times k for some k < m, and A^\prime is in row echelon form and has dimension (n-1) \times k. (Note that the leftmost submatrices have m-k-1 columns.)

To each matrix in row echelon form, we associate a (possibly empty) set of pairs (i,j) called pivots as follows. For the base case, suppose A has dimension 1 \times m. If A = 0, then A has no pivots; otherwise, if j is minimal such that the (1,j) entry of A is nonzero, then \{(1,j)\} is the set of pivots for A. For the recursive step, suppose A has dimension n \times m. If A = 0, then A has no pivots. If A = \left[ \begin{array}{c|c|c} 0 & 1 & V \\ \hline 0 & 0 & A^\prime \end{array} \right], where V has dimension 1 \times k, then the pivots of A are (1,m-k) together with the pairs (i+1,j+m-k) where (i,j) is a pivot of A^\prime.

We will say that a matrix A = [a_{i,j}] is in reduced row echelon form if it is in row echelon form and for each pivot (i,j), we have a_{k,j} = 0 for all k < i.
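To make the recursion concrete, here is a minimal sketch in Python (an illustration, not part of the proof): matrices are lists of rows with entries supporting exact equality, such as Fraction, and positions are 0-indexed, whereas the pivots above are 1-indexed.

    from fractions import Fraction

    def is_row_echelon(A):
        """A == 0, or A = [0 | 1 | V ; 0 | 0 | A'] with A' in row echelon form."""
        if len(A) == 0 or all(x == 0 for row in A for x in row):
            return True
        # leftmost column containing a nonzero entry
        j = min(k for row in A for k, x in enumerate(row) if x != 0)
        if A[0][j] != 1 or any(row[j] != 0 for row in A[1:]):
            return False
        return is_row_echelon([row[j + 1:] for row in A[1:]])

    def pivots(A, i0=0, j0=0):
        """Pivot positions of a matrix already in row echelon form."""
        if len(A) == 0 or all(x == 0 for row in A for x in row):
            return []
        j = min(k for row in A for k, x in enumerate(row) if x != 0)
        return [(i0, j0 + j)] + pivots([row[j + 1:] for row in A[1:]], i0 + 1, j0 + j + 1)

    def is_rref(A):
        """Row echelon, and every entry above a pivot is zero."""
        return is_row_echelon(A) and all(
            A[k][j] == 0 for (i, j) in pivots(A) for k in range(i))

    M = [[Fraction(x) for x in row] for row in [[0, 1, 0, 2], [0, 0, 1, 3], [0, 0, 0, 0]]]
    assert is_row_echelon(M) and is_rref(M) and pivots(M) == [(0, 1), (1, 2)]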

To prove that every matrix is row equivalent to a matrix in reduced row echelon form, we will first show that every matrix is row equivalent to a matrix in row echelon form. We will then show that every row echelon matrix is row equivalent to a reduced row echelon matrix.

Lemma 1: Every matrix is row equivalent to a matrix in row echelon form. Proof: We proceed by induction on the number of rows. For the base case, let A = [a_{i,j}] be a 1 \times m matrix. If A = 0, then A is in row echelon form. If A \neq 0, then there exists a minimal index j such that a_{1,j} \neq 0. Let P be the elementary matrix P = E_{1,1}(1/a_{1,j}). Then PA is in row echelon form. For the inductive step, let A be an (n+1) \times m matrix. If A = 0, then A is in row echelon form. Suppose now that A = [a_{i,j}] \neq 0, and let k be minimal such that a_{i,k} \neq 0 for some i. Then we have A = [0 | C | A_2], where C = [a_{i,k}] \neq 0 has dimension (n+1) \times 1. Choose an index \ell such that a_{\ell,k} \neq 0. We have a product P of elementary matrices which interchanges rows 1 and \ell. Now PA = [0 | C | A_2] (with different C and A_2) where the first entry of C, say c_1, is nonzero. Let Q = E_{1,1}(1/c_1). Then QPA = \left[ \begin{array}{c|c|c} 0 & 1 & V \\ \hline 0 & C & A_2 \end{array} \right] (again with different C and A_2, and where V has dimension 1 \times (m-k)). Let C = [c_{i,1}] be indexed from 2 in the i variable. For each nonzero c_{i,1}, let R_i = E_{i,1}(-c_{i,1}), and let R = \prod R_i. Now RQPA = \left[ \begin{array}{c|c|c} 0 & 1 & V \\ \hline 0 & 0 & A_2 \end{array} \right] (again with a different A_2). By the induction hypothesis, there exists a product T_2 of elementary matrices such that T_2A_2 is in row echelon form. Let T = \left[ \begin{array}{c|c} 1 & 0 \\ \hline 0 & T_2 \end{array} \right]. Using the lemmas from this previous exercise, we have the following.

TRQPA = \left[ \begin{array}{c|c} 1 & 0 \\ \hline 0 & T_2 \end{array} \right] \left[ \begin{array}{c|c|c} 0 & 1 & V \\ \hline 0 & 0 & A_2 \end{array} \right]
 = \left[ \begin{array}{c|c} 1[0|1] + 0[0|0] & 1V + 0A_2 \\ \hline 0[0|1] + T_2[0|0] & 0V + T_2A_2 \end{array} \right]
 = \left[ \begin{array}{c|c|c} 0 & 1 & V \\ \hline 0 & 0 & T_2A_2 \end{array} \right],

which is in row echelon form by our recursive definition. By induction, then, every matrix is row equivalent to a matrix in row echelon form.

Finally, we need to see that every row echelon matrix is row equivalent to a reduced row echelon matrix. Again, we proceed by induction on the number of rows. A 1 \times m matrix in row echelon form is already in reduced row echelon form (there is nothing to show). Suppose now that A = \left[ \begin{array}{c|c|c} 0 & 1 & V \\ \hline 0 & 0 & A_2 \end{array} \right] is in row echelon form, where A_2 is also in row echelon form. By the induction hypothesis, there is a product P_2 of elementary matrices such that P_2A_2 is in reduced row echelon form. Let P = \left[ \begin{array}{c|c} 1 & 0 \\ \hline 0 & P_2 \end{array} \right], so that PA = \left[ \begin{array}{c|c|c} 0 & 1 & V \\ \hline 0 & 0 & P_2A_2 \end{array} \right]. Let V = [v_{1,j}]. For each pivot (i,j) of P_2A_2, let Q_{i,j} = E_{1,i+1}(-v_{1,j}), and let Q = \prod Q_{i,j}. Each Q_{i,j} adds a multiple of row i+1 to row 1, clearing the entry above that pivot; hence QPA is in reduced row echelon form.

Thus every matrix is row equivalent to a matrix in reduced row echelon form.
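The proof is constructive and translates directly into an algorithm. Here is a minimal sketch in Python over \mathbb{Q} (using Fraction; an illustration, not the proof), which collapses the two lemmas into a single pass: find the leftmost nonzero column, swap a nonzero entry to the top, scale it to 1, and clear the rest of the column.

    from fractions import Fraction

    def rref(A):
        """Row-reduce a list-of-rows matrix with Fraction entries to reduced
        row echelon form, using only the three elementary row operations."""
        A = [row[:] for row in A]
        n, m = len(A), len(A[0])
        r = 0  # row where the next pivot will go
        for j in range(m):
            # find a row at or below r with a nonzero entry in column j
            i = next((i for i in range(r, n) if A[i][j] != 0), None)
            if i is None:
                continue
            A[r], A[i] = A[i], A[r]       # type 1: interchange rows r and i
            p = A[r][j]
            A[r] = [x / p for x in A[r]]  # type 3: scale the pivot to 1
            for k in range(n):            # type 2: clear column j above and below
                if k != r and A[k][j] != 0:
                    c = A[k][j]
                    A[k] = [a - c * b for a, b in zip(A[k], A[r])]
            r += 1
        return A

    M = [[Fraction(x) for x in row] for row in [[0, 2, 4], [1, 1, 1], [2, 4, 6]]]
    R = [[Fraction(x) for x in row] for row in [[1, 0, -1], [0, 1, 2], [0, 0, 0]]]
    assert rref(M) == R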

Row equivalent matrices have the same row rank

Prove that two row equivalent matrices over a field have the same row rank.


Let F be a field.

Suppose A and C are row equivalent via a product P of elementary matrices; that is, PA = C, so that A^\mathsf{T} P^\mathsf{T} = C^\mathsf{T}. Let B and E denote the standard bases, and say A^\mathsf{T} = M^E_B(\varphi), C^\mathsf{T} = M^E_B(\psi), and P^\mathsf{T} = M^B_B(\theta). In particular, we have \varphi \circ \theta = \psi. By this previous exercise, the row rank of A is the column rank of A^\mathsf{T}, which is the dimension of \mathsf{im}\ \varphi; likewise the row rank of C is the dimension of \mathsf{im}\ \psi. Since \varphi \circ \theta = \psi, we have \mathsf{im}\ \psi \subseteq \mathsf{im}\ \varphi, and since \varphi = \psi \circ \theta^{-1} (as \theta is an isomorphism), we have \mathsf{im}\ \varphi \subseteq \mathsf{im}\ \psi. Thus \mathsf{im}\ \varphi = \mathsf{im}\ \psi, so that A and C have the same row rank.
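For a quick numeric sanity check (a sketch assuming numpy is available), multiplication by an invertible matrix should leave the rank unchanged; here a random matrix stands in for a product of elementary matrices (by a later exercise these coincide).

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.integers(-3, 4, size=(3, 5)).astype(float)
    P = rng.random((3, 3))  # almost surely invertible
    assert np.linalg.matrix_rank(P @ A) == np.linalg.matrix_rank(A)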

Row equivalence is an equivalence relation

Two matrices A and B over a field are said to be row equivalent if there is a sequence of elementary row operations which transforms A into B. Prove that row equivalence is an equivalence relation.


Recall that the elementary row operations (for matrices over a field F) are as follows:

  1. Interchange two rows
  2. Add an F-multiple of one row to another
  3. Multiply a row by a nonzero element of F

We claim that (1) two of these are redundant and (2) elementary row operations can be achieved via multiplication by a certain class of invertible “elementary matrices”, which happens to be closed under matrix inversion.

First, we show that row operation 1 (interchanging two rows) can be achieved using a sequence of operations of types 2 and 3. Suppose we have a matrix M whose rows are R_1, R_2, \ldots, R_n, and we wish to interchange R_t and R_u. Let us write M^\mathsf{T} = [M_1 | R_t | M_2 | R_u | M_3], displaying the rows of M as blocks. We will perform a sequence of row operations of types 2 and 3, keeping track of the rows after each step.

Operation                      Type   Rows
Start                          --     [M_1 | R_t | M_2 | R_u | M_3]
Add -1 times row t to row u    2      [M_1 | R_t | M_2 | R_u - R_t | M_3]
Add 1 times row u to row t     2      [M_1 | R_u | M_2 | R_u - R_t | M_3]
Add -1 times row t to row u    2      [M_1 | R_u | M_2 | -R_t | M_3]
Multiply row u by -1           3      [M_1 | R_u | M_2 | R_t | M_3]

Note that we have achieved the same effect as a single application of operation 1. In addition, we only needed to use the elements 1 and -1 in F, which always exist. (In fact, we have 1 and -1 in any ring with 1, which is a useful generalization.)
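This sequence is easy to check mechanically. A minimal sketch in Python, with rows as plain lists (the entries may come from any ring with 1):

    def swap_rows(M, t, u):
        """Interchange rows t and u of M using only type 2 and 3 operations."""
        M[u] = [a - b for a, b in zip(M[u], M[t])]  # add -1 times row t to row u
        M[t] = [a + b for a, b in zip(M[t], M[u])]  # add 1 times row u to row t
        M[u] = [a - b for a, b in zip(M[u], M[t])]  # add -1 times row t to row u
        M[u] = [-a for a in M[u]]                   # multiply row u by -1
        return M

    assert swap_rows([[1, 2], [3, 4], [5, 6]], 0, 2) == [[5, 6], [3, 4], [1, 2]]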

We can think of a type 3 operation as a degenerate type 2 operation in which the same row plays both parts (this is made precise by the elementary matrices E_{s,s}(\alpha) below). Thus we can say that two matrices are row equivalent if there is a sequence of type 2 row operations, in this extended sense, turning one into the other.

Now define, over a field F, the (s,t,\alpha) elementary matrix as follows: E_{s,t}(\alpha) = [e_{i,j}], where e_{i,j} = \alpha if (i,j) = (s,t), 1 if j = i and (i,j) \neq (s,t), and 0 otherwise. Letting M = [m_{i,j}], we see that E_{s,t}(\alpha) M = [\sum_k e_{i,k} m_{k,j}]. If i \neq s, then e_{i,k} = 0 except when k = i, so that this sum is just m_{i,j}. If i = s and t \neq s, then e_{i,k} = 0 except when k \in \{i,t\}. In this case, this sum is m_{s,j} + \alpha m_{t,j}. If i = s and t = s, then e_{i,k} = 0 except when k = i, and the sum is \alpha m_{s,j}. In any case, the matrix E_{s,t}(\alpha) M is simply M after having row s replaced by row s plus \alpha times row t (when t \neq s), or after having row s multiplied by \alpha (when t = s).

That is, two matrices A and B are row equivalent if and only if there is a matrix P = E_{s_1,t_1}(\alpha_1) \cdots E_{s_k,t_k}(\alpha_k), where \alpha_i \neq 0 whenever s_i = t_i, such that PA = B.

Note that, if t \neq s, then E_{s,t}(\alpha) \cdot E_{s,t}(-\alpha) = [\sum_k e_{i,k}f_{k,j}], where [f_{i,j}] = E_{s,t}(-\alpha). In this sum, if i \neq s, then e_{i,k} = 0 unless k = i; the sum is then 0 unless j = i, in which case it is 1. If i = s, then e_{i,k} = 0 unless k \in \{i,t\}, so that \sum_k e_{i,k}f_{k,j} = e_{s,s}f_{s,j} + e_{s,t}f_{t,j}. If j = s (= i), this sum is 1. If j = t, this sum is -\alpha + \alpha = 0. Otherwise, the sum is 0. Hence E_{s,t}(\alpha) \cdot E_{s,t}(-\alpha) = I. In particular, E_{s,t}(\alpha) is invertible and E_{s,t}(\alpha)^{-1} = E_{s,t}(-\alpha).

Now note that, for \alpha \neq 0, E_{s,s}(\alpha) \cdot E_{s,s}(\alpha^{-1}) = [\sum_k e_{i,k}f_{k,j}], where [f_{i,j}] = E_{s,s}(\alpha^{-1}). In this sum, if i \neq s, then e_{i,k} = 0 unless k = i; if j = i, the sum is 1, and otherwise it is 0. If i = s, then e_{i,k} = 0 unless k = s, and the sum is 0 unless j = s, in which case it is \alpha\alpha^{-1} = 1. Thus E_{s,s}(\alpha) is invertible and we have E_{s,s}(\alpha)^{-1} = E_{s,s}(\alpha^{-1}).

Thus the matrix P in our definition of row equivalence is invertible. Moreover, P^{-1} is also a product of elementary matrices.
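These inverse identities are easy to verify concretely. A minimal sketch in Python with exact arithmetic (0-indexed, with hypothetical helper names E and matmul):

    from fractions import Fraction

    def E(n, s, t, a):
        """The elementary matrix E_{s,t}(a): the identity with entry (s,t)
        set to a (on the diagonal, this replaces the 1)."""
        M = [[Fraction(1) if i == j else Fraction(0) for j in range(n)] for i in range(n)]
        M[s][t] = Fraction(a)
        return M

    def matmul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
                 for j in range(len(B[0]))] for i in range(len(A))]

    I = E(3, 0, 0, 1)
    a = Fraction(5, 7)
    assert matmul(E(3, 0, 2, a), E(3, 0, 2, -a)) == I      # E_{s,t}(a)^{-1} = E_{s,t}(-a)
    assert matmul(E(3, 1, 1, a), E(3, 1, 1, 1 / a)) == I   # E_{s,s}(a)^{-1} = E_{s,s}(1/a)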

Now it is easy to show that row equivalence is an equivalence relation.

  1. (Reflexive) The identity matrix is elementary. Specifically, I = E_{s,s}(1). Thus every matrix is row equivalent to itself.
  2. (Symmetric) Suppose A can be row reduced to C. Then there is a product P of elementary matrices such that PA = C. Then A = P^{-1}C. Since P^{-1} is a product of elementary matrices, C can be row reduced to A.
  3. (Transitive) Suppose A can be row reduced to B and B to C. Then there exist products P and Q of elementary matrices such that PA = B and QB = C. Now QP is a product of elementary matrices, and QPA = C. Thus A can be row reduced to C.

(It turns out that GL_n(F) is generated by these elementary matrices, so that we can simply say A and C are row equivalent if there is an invertible matrix P such that PA = C, rather than specifying that P be a product of elementary matrices. We will prove this later.)

Analyze a proof of a simplification of Kummer’s Theorem

Theorem 11.10 in TAN is the following simplification of Kummer’s Theorem: if p is a regular odd prime, then x^p + y^p = z^p has no solutions with p \nmid xyz. The proof itself, however, does not explicitly appeal to the regularity of p. Where is this required?


The regularity of p is implicitly used when we appeal to Corollary 10.5 to show that [x+\zeta y] = [\delta^p]. Essentially, we show that A^p is principal for some ideal A, but since p does not divide the order of the class group, it must be that A is itself principal.

A bound on potential positive solutions of x^n + y^n = z^n

Suppose (x,y,z) is a solution in \mathbb{N}^+ of the equation x^n + y^n = z^n, where n > 2. Show that x,y > n.


Note that z > x and z > y.

If x^n + y^n = z^n, then x^n = z^n - y^n = (z-y) \sum_{k=0}^{n-1} z^{n-1-k} y^k \geq \sum_{k=0}^{n-1} z^{n-1-k} y^k > \sum_{k=0}^{n-1} y^{n-1} = ny^{n-1}, since z - y \geq 1 and z > y. Thus x^n > ny^{n-1}. Likewise, y^n > nx^{n-1}.

Suppose, without loss of generality, that x \geq y. Then y^n > nx^{n-1} \geq ny^{n-1}, and so y > n. Thus x \geq y > n, and both x and y exceed n.
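The factorization of z^n - y^n used above is easy to spot-check numerically (a tiny sketch in Python):

    # Spot-check z^n - y^n = (z - y) * sum_{k=0}^{n-1} z^(n-1-k) * y^k.
    for (z, y, n) in [(10, 7, 5), (9, 2, 3), (5, 4, 8)]:
        assert z**n - y**n == (z - y) * sum(z**(n - 1 - k) * y**k for k in range(n))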

If the roots of a monic irreducible polynomial over ZZ all have modulus at most 1, then the roots are roots of unity

Let p(x) \in \mathbb{Z}[x] be a monic irreducible polynomial and suppose that for each root \zeta of p(x), we have ||\zeta || \leq 1, where || \cdot || denotes complex modulus. Show that the roots of p(x) are roots of unity.


Note that p(x) factors over \mathbb{C} as p(x) = \prod (x - \zeta_i), where the \zeta_i range over the roots of p. Up to sign, the constant coefficient of p(x) is \prod \zeta_i, and thus ||\prod \zeta_i|| = \prod ||\zeta_i|| \leq 1. On the other hand, the constant coefficient is an integer, so its modulus is either 0 or 1. If it were 0, then x would divide p(x), forcing p(x) = x by irreducibility; setting aside this trivial case, the constant coefficient has modulus 1. Since each ||\zeta_i|| \leq 1 and their product has modulus 1, we have ||\zeta_i|| = 1 for all i.

By Lemma 11.8, the \zeta_i are all roots of unity.
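For a numeric illustration (a sketch assuming numpy; not a proof), consider the monic irreducible polynomial x^4 + 1: its roots all have modulus 1, and indeed are 8th roots of unity.

    import numpy as np

    roots = np.roots([1, 0, 0, 0, 1])     # roots of x^4 + 1
    assert np.allclose(np.abs(roots), 1)  # all on the unit circle
    assert np.allclose(roots**8, 1)       # each is an 8th root of unity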

Similarity among nonsquare matrices

Let F be a field, let V be an n-dimensional vector space over F, and let W be an m-dimensional vector space over F. Let B_1, B_2 \subseteq V and E_1, E_2 \subseteq W be bases, let \varphi : V \rightarrow W be a linear transformation, and let A = M^{E_1}_{B_1}(\varphi) and B = M^{E_2}_{B_2}(\varphi). Let P = M^{B_1}_{B_2}(1) be the matrix of the identity map on V with respect to the source basis B_2 and the target basis B_1, and likewise let Q = M^{E_1}_{E_2}(1) be the matrix of the identity map on W with respect to the source basis E_2 and the target basis E_1.

Prove that Q^{-1} = M^{E_2}_{E_1}(1) and that Q^{-1}AP = B.


Using Theorem 12 in D&F, we have Q \times M^{E_2}_{E_1}(1) = M^{E_1}_{E_2}(1) \times M^{E_2}_{E_1}(1) = M^{E_1}_{E_1}(1 \circ 1) = M^{E_1}_{E_1}(1) = I. Similarly, M^{E_2}_{E_1}(1) \times Q = M^{E_2}_{E_2}(1) = I. By the uniqueness of inverses, M^{E_2}_{E_1}(1) = Q^{-1}.

Now Q^{-1}AP = M^{E_2}_{E_1}(1) \times M^{E_1}_{B_1}(\varphi) \times M^{B_1}_{B_2}(1) = M^{E_2}_{B_2}(1 \circ \varphi \circ 1) = M^{E_2}_{B_2}(\varphi) = B.
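A numeric sanity check of this bookkeeping (a sketch assuming numpy; here basis matrices list the basis vectors as columns in standard coordinates, so that M^{E}_{B}(\varphi) = E^{-1}SB where S is the standard matrix of \varphi):

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 3, 4                                       # dim V = 3, dim W = 4
    S = rng.random((m, n))                            # phi in standard coordinates
    B1, B2 = rng.random((n, n)), rng.random((n, n))   # bases of V (almost surely invertible)
    E1, E2 = rng.random((m, m)), rng.random((m, m))   # bases of W

    A = np.linalg.inv(E1) @ S @ B1                    # M^{E1}_{B1}(phi)
    B = np.linalg.inv(E2) @ S @ B2                    # M^{E2}_{B2}(phi)
    P = np.linalg.inv(B1) @ B2                        # M^{B1}_{B2}(1)
    Q = np.linalg.inv(E1) @ E2                        # M^{E1}_{E2}(1)

    assert np.allclose(np.linalg.inv(Q), np.linalg.inv(E2) @ E1)  # Q^{-1} = M^{E2}_{E1}(1)
    assert np.allclose(np.linalg.inv(Q) @ A @ P, B)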

The conjugates of a root of unity are roots of unity

Let \zeta be a root of unity. Show that the conjugates of \zeta are also roots of unity.


If \zeta is a root of unity, then \zeta is a root of p(x) = x^n - 1 for some n. Thus the minimal polynomial of \zeta over \mathbb{Q} divides p(x), so every conjugate of \zeta is also a root of x^n - 1; that is, the conjugates of \zeta are roots of unity.
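For instance (a numeric sketch assuming numpy), the conjugates of a primitive 5th root of unity are the roots of x^4 + x^3 + x^2 + x + 1, and all of them satisfy x^5 = 1:

    import numpy as np

    conj = np.roots([1, 1, 1, 1, 1])  # roots of x^4 + x^3 + x^2 + x + 1
    assert np.allclose(conj**5, 1)    # every conjugate is a 5th root of unity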

Find all the Pythagorean triples with one parameter fixed

Find all of the positive Pythagorean triples (x,y,z) where y = 60.


First we find all of the positive primitive solutions. If (x,60,z) is a primitive Pythagorean triple, then by Theorem 11.1, we have integers r and s such that x = r^2 - s^2, 60 = 2rs, and z = r^2 + s^2, where 0 < s < r, (r,s) = (1), and r+s \equiv 1 mod 2. Now rs = 30; there are only four possibilities for (r,s) with these constraints: (r,s) \in \{(30,1),(15,2),(10,3),(6,5)\}. These yield the triples (899,60,901), (221,60,229), (91,60,109), and (11,60,61).

If (x,60,z) is not primitive, then x, 60, and z have a greatest common divisor d \neq 1, and (x/d,60/d,z/d) is a primitive Pythagorean triple. There are 11 possibilities: d \in \{2,3,4,5,6,10,12,15,20,30,60\}. In a primitive triple, exactly one leg is even; by Theorem 11.1 the even leg is 2rs and the odd leg is r^2 - s^2, so 60/d may a priori appear as either leg. First, we have a lemma.

Lemma: Let b be a positive integer. (1) If b is odd, if b = 2m with m odd, or if b = 2, then b is not the even leg of a primitive Pythagorean triple. (2) If b is even or if b = 1, then b is not the odd leg of a primitive Pythagorean triple. Proof: (1) Suppose b = 2rs with 0 < s < r, (r,s) = (1), and r+s \equiv 1 mod 2. Then b is even, ruling out b odd. Since r and s have opposite parity, rs is even, so 4 | b, ruling out b = 2m with m odd. And b = 2 forces r = s = 1, a contradiction. (2) Since r+s \equiv 1 mod 2, the odd leg r^2 - s^2 is odd, and r^2 - s^2 = (r-s)(r+s) \geq r+s \geq 3. \square

By the lemma, 60/d cannot be a leg of a primitive triple when 60/d \equiv 2 mod 4 (it is even, so it is not the odd leg, and 4 does not divide it, so it is not the even leg); this rules out d \in \{2,6,10,30\}. Similarly d = 60 is impossible, since 60/d = 1. For d \in \{3,5,15\}, 60/d is divisible by 4 and must be the even leg 2rs; for d \in \{4,12,20\}, 60/d is odd and must be the odd leg r^2 - s^2. We handle each remaining case in turn.

(d = 3) Let a = x/3 and c = z/3, and suppose (a,20,c) is a primitive Pythagorean triple. By Theorem 11.1, we have integers r and s such that a = r^2 - s^2, 20 = 2rs, c = r^2+s^2, 0 < s < r, (r,s) = (1), and r+s \equiv 1 mod 2. Now rs = 10; there are two possibilities for (r,s) with these constraints: (r,s) \in \{(10,1),(5,2)\}. These yield the solutions 297^2 + 60^2 = 303^2 and 63^2 + 60^2 = 87^2.

(d = 5) Let a = x/5 and c = z/5, and suppose (a,12,c) is a primitive Pythagorean triple. By Theorem 11.1, we have integers r and s such that a = r^2 - s^2, 12 = 2rs, c = r^2+s^2, 0 < s < r, (r,s) = (1), and r+s \equiv 1 mod 2. Now rs = 6; there are two such pairs: (r,s) \in \{(6,1),(3,2)\}. These yield the solutions 175^2 + 60^2 = 185^2 and 25^2 + 60^2 = 65^2.

(d = 15) Let a = x/15 and c = z/15, and suppose (a,4,c) is a primitive Pythagorean triple. By Theorem 11.1, we have integers r and s such that a = r^2 - s^2, 4 = 2rs, c = r^2+s^2, 0 < s < r, (r,s) = (1), and r+s \equiv 1 mod 2. Now rs = 2, so that (r,s) = (2,1). This yields the solution 45^2 + 60^2 = 75^2.

(d = 4) Let a = x/4 and c = z/4, and suppose (a,15,c) is a primitive Pythagorean triple with odd leg 15. By Theorem 11.1 (with the roles of the legs interchanged), we have a = 2rs, 15 = r^2 - s^2, and c = r^2 + s^2. Now 15 = (r-s)(r+s), so (r-s,r+s) \in \{(1,15),(3,5)\} and (r,s) \in \{(8,7),(4,1)\}; both satisfy the constraints. These give the primitive triples (112,15,113) and (8,15,17), and thus the solutions 448^2 + 60^2 = 452^2 and 32^2 + 60^2 = 68^2.

(d = 12) Now 5 = (r-s)(r+s) forces (r-s,r+s) = (1,5), so (r,s) = (3,2). This gives the primitive triple (12,5,13) and the solution 144^2 + 60^2 = 156^2.

(d = 20) Now 3 = (r-s)(r+s) forces (r-s,r+s) = (1,3), so (r,s) = (2,1). This gives the primitive triple (4,3,5) and the solution 80^2 + 60^2 = 100^2.

In total, there are 13 positive Pythagorean triples (x,60,z): x \in \{11, 25, 32, 45, 63, 80, 91, 144, 175, 221, 297, 448, 899\}.
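As a sanity check, we can enumerate these solutions by brute force (a sketch in Python; the bound x < 1800 follows from 60^2 = (z-x)(z+x) > 2x, as in the exercise below):

    from math import isqrt

    triples = []
    for x in range(1, 1800):          # 60^2 = (z - x)(z + x) > 2x bounds x
        z2 = x * x + 60 * 60
        z = isqrt(z2)
        if z * z == z2:
            triples.append((x, 60, z))

    assert [t[0] for t in triples] == [11, 25, 32, 45, 63, 80, 91, 144, 175, 221, 297, 448, 899]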

There are only finitely many Pythagorean triples (x,y,z) with any one parameter fixed

Suppose (x,y,z) is a solution to x^2 + y^2 = z^2 in \mathbb{N}^+. Show that x^2 \geq 2y+1. Deduce that if we fix any of x, y, or z, then there are only finitely many Pythagorean triples (x,y,z).


Suppose x^2 + y^2 = z^2 with x, y, and z positive. Then x^2 = z^2 - y^2 > 0, so that z > y. Now x^2 = (z+y)(z-y) \geq z+y > 2y. Since x^2 and 2y are integers, x^2 \geq 2y+1.

With x fixed, there are only finitely many possible y, as they must satisfy 2y+1\leq x^2. With x and y fixed, z is determined, so there are only finitely many solutions with fixed x. Symmetrically, there are finitely many solutions with fixed y.

With z fixed, we have z > x,y, so there are certainly only finitely many solutions.
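The finiteness argument with x fixed is effective, since candidate values of y satisfy 2y + 1 \leq x^2. A minimal sketch in Python:

    from math import isqrt

    def triples_with_leg(x):
        """All positive Pythagorean triples (x, y, z) with the leg x fixed;
        the search is finite since 2y + 1 <= x^2."""
        out = []
        for y in range(1, (x * x - 1) // 2 + 1):
            z2 = x * x + y * y
            z = isqrt(z2)
            if z * z == z2:
                out.append((x, y, z))
        return out

    # For example, there are 13 triples with x = 60, matching the previous exercise.
    assert len(triples_with_leg(60)) == 13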