Tag Archives: diagonalizable

Over \mathbb{C}, matrices of finite multiplicative order are diagonalizable

Let A be an n \times n matrix over \mathbb{C}. Show that if A^k=I for some k, then A is diagonalizable.

Let F be a field of characteristic p. Show that A = \begin{bmatrix} 1 & \alpha \\ 0 & 1 \end{bmatrix} has finite order but cannot be diagonalized over F unless \alpha = 0.


Since A^k = I, the minimal polynomial m of A over \mathbb{C} divides x^k-1. The polynomial x^k-1 is separable over \mathbb{C}, since it is relatively prime to its derivative kx^{k-1}; in particular, the roots of m are distinct. Since \mathbb{C} contains all the roots of unity, by Corollary 25 on page 494 of D&F, A is diagonalizable over \mathbb{C}.
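As a concrete illustration (a Python sketch; the rotation matrix and its eigenvector matrix below are my own example, not part of the exercise), take A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}, which satisfies A^4 = I. Its eigenvalues i and -i are distinct fourth roots of unity, and the matrix P of eigenvectors diagonalizes it; the identity AP = PD can be checked with exact complex arithmetic:

```python
def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[0, -1], [1, 0]]          # rotation by 90 degrees; A^4 = I
P = [[1, 1], [-1j, 1j]]        # columns: eigenvectors for i and -i
D = [[1j, 0], [0, -1j]]        # diagonal matrix of eigenvalues

# A has multiplicative order 4
A4 = matmul(matmul(A, A), matmul(A, A))
assert A4 == [[1, 0], [0, 1]]

# AP = PD, and det P = 2i != 0, so P^{-1} A P = D is diagonal
assert matmul(A, P) == matmul(P, D)
assert P[0][0] * P[1][1] - P[0][1] * P[1][0] != 0
```

The check AP = PD avoids inverting P: it is equivalent to P^{-1}AP = D once P is known to be invertible.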

Note that \begin{bmatrix} 1 & \alpha \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & \beta \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & \alpha+\beta \\ 0 & 1 \end{bmatrix}. By an easy inductive argument, then, \begin{bmatrix} 1 & \alpha \\ 0 & 1 \end{bmatrix}^t = \begin{bmatrix} 1 & t\alpha \\ 0 & 1 \end{bmatrix}, and in particular, A^p = I.

Suppose \alpha \neq 0. Now \frac{1}{\alpha}A is in Jordan canonical form, and is not diagonalizable. (See Corollary 24 on page 493 of D&F.) So A cannot be diagonalizable, for if it were, then so would \frac{1}{\alpha}A. (If P^{-1}AP = D is diagonal, then so is P^{-1}\frac{1}{\alpha}AP = \frac{1}{\alpha}D.)
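To see the characteristic-p phenomenon concretely, here is a brute-force check in Python (a sketch; the choices p = 3 and \alpha = 1 are mine for illustration). It verifies A^p = I and that no invertible P and diagonal D satisfy AP = PD, which is equivalent to A being similar to a diagonal matrix:

```python
from itertools import product

p = 3  # an illustrative prime

def matmul(X, Y):
    """Multiply 2x2 matrices over F_p (entries reduced mod p)."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) % p
                       for j in range(2)) for i in range(2))

def det(X):
    return (X[0][0] * X[1][1] - X[0][1] * X[1][0]) % p

A = ((1, 1), (0, 1))  # [[1, alpha], [0, 1]] with alpha = 1
I = ((1, 0), (0, 1))

# A has finite order: A^p = I, since the (1,2) entry of A^t is t*alpha
M = I
for _ in range(p):
    M = matmul(M, A)
assert M == I and A != I

# But A is not diagonalizable: A is similar to a diagonal matrix D iff
# AP = PD for some invertible P, and brute force finds no such P, D mod p.
diagonalizable = any(
    det(((a, b), (c, d))) != 0 and
    matmul(A, ((a, b), (c, d))) == matmul(((a, b), (c, d)), ((e, 0), (0, f)))
    for a, b, c, d, e, f in product(range(p), repeat=6))
assert not diagonalizable
```

The search space is only p^6 matrices, so exhaustive checking is instant for small primes.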

Any matrix A such that A^3 = A can be diagonalized

Let A be an n \times n matrix over \mathbb{C} such that A^3 = A. Show that A can be diagonalized. Is this result true if we replace \mathbb{C} by an arbitrary field F?


Note that the minimal polynomial of A divides x^3-x = x(x+1)(x-1), whose roots 0, -1, and 1 are distinct in \mathbb{C}. By Corollary 25 in D&F, A is similar to a diagonal matrix D, and moreover the diagonal entries of D are either 0, 1, or -1. The same argument works over any field F in which 0, 1, and -1 are distinct, that is, over any field of characteristic not 2. In characteristic 2, however, x^3-x = x(x+1)^2 is not squarefree and the result fails: over \mathbb{F}_2 the matrix \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} satisfies A^2 = I (hence A^3 = A) but is not diagonalizable, as in the first exercise above.
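The characteristic-2 caveat is small enough to verify exhaustively. The sketch below (pure Python, mine rather than the text's) checks that A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} over \mathbb{F}_2 satisfies A^3 = A, yet AP = PD fails for every invertible P and diagonal D:

```python
from itertools import product

p = 2  # over F_2 the roots 1 and -1 of x^3 - x coincide

def matmul(X, Y):
    """Multiply 2x2 matrices over F_2."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) % p
                       for j in range(2)) for i in range(2))

def det(X):
    return (X[0][0] * X[1][1] - X[0][1] * X[1][0]) % p

A = ((1, 1), (0, 1))
assert matmul(matmul(A, A), A) == A  # A^3 = A (indeed A^2 = I mod 2)

# Search all P and diagonal D over F_2 for AP = PD with P invertible.
found = any(
    det(((a, b), (c, d))) != 0 and
    matmul(A, ((a, b), (c, d))) == matmul(((a, b), (c, d)), ((e, 0), (0, f)))
    for a, b, c, d, e, f in product(range(p), repeat=6))
assert not found  # no diagonalization exists
```

Only 64 candidate tuples exist over \mathbb{F}_2, so the search is trivial.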

Every idempotent matrix over a field is diagonalizable

Let A be an n \times n matrix over a field F. Show that if A^2 = A, then A is similar to a diagonal matrix whose diagonal entries are either 0 or 1.


(Compare to this previous exercise.)

Short Proof that I Thought Of Seconds after Publishing the Long Proof:

The minimal polynomial of A divides x^2-x = x(x-1). By Corollary 25 in D&F, A is diagonalizable. Moreover, the elementary divisors of A are all either x or x-1, so the diagonal entries of the Jordan canonical form of A are either 0 or 1.
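As a tiny concrete check (a Python sketch; the matrices are my own illustrative choices), the idempotent A = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} is diagonalized by a basis consisting of an eigenvector for 1 and an eigenvector for 0, yielding diagonal entries in \{0,1\}:

```python
def matmul(X, Y):
    """Multiply two 2x2 integer matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 1], [0, 0]]   # a projection: A^2 = A
assert matmul(A, A) == A

P = [[1, 1], [0, -1]]  # columns: eigenvector for 1, eigenvector for 0
D = [[1, 0], [0, 0]]   # diagonal, entries in {0, 1}

# AP = PD and det P = -1 != 0, so P^{-1} A P = D
assert matmul(A, P) == matmul(P, D)
assert P[0][0] * P[1][1] - P[0][1] * P[1][0] != 0
```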

Long Proof that Missed the Point of the Whole Chapter but to which I am Too Attached to Delete Wholesale:

Let A be such a matrix, and let J be the Jordan canonical form of A; say P^{-1}AP = J. Now J^2 = (P^{-1}AP)^2 = P^{-1}APP^{-1}AP = P^{-1}A^2P = P^{-1}AP = J, so that J is also idempotent. Note also that J is upper triangular, with all entries above the first superdiagonal equal to 0. That is, if J = [t_{i,j}], then t_{i,j} = 0 if j \neq i and j \neq i+1, and t_{i,j} \in \{0,1\} if j = i+1. (The diagonal entries of J are unknown at this time.) Since J^2 = J, we have \sum_k t_{i,k}t_{k,j} = t_{i,j} for each pair (i,j).

Fix i. Now t_{i,i} = \sum_k t_{i,k}t_{k,i} = t_{i,i}t_{i,i} + \sum_{k \neq i} t_{i,k}t_{k,i} = t_{i,i}t_{i,i}. That is, t_{i,i}^2 = t_{i,i}, so that t_{i,i} is a root of x^2-x = x(x-1) = 0. So t_{i,i} (i.e. an arbitrary diagonal entry) is either 1 or 0.

Now fix i and let j = i+1. Now t_{i,j} = \sum_k t_{i,k}t_{k,i+1} = t_{i,i}t_{i,i+1} + t_{i,i+1}t_{i+1,i+1} + \sum_{k \neq i,i+1} t_{i,k}t_{k,i+1} = t_{i,i}t_{i,i+1} + t_{i,i+1}t_{i+1,i+1}. Note that if t_{i,i+1} = 1, then we must have t_{i,i} = t_{i+1,i+1}, since these entries are in the same Jordan block of J. Then 1 = 2t_{i,i}; since t_{i,i} is either 0 or 1, this gives either 1 = 0 or 1 = 2, and in either case 1 = 0 in F, a contradiction. Thus t_{i,i+1} = 0 for all appropriate i. That is, J is diagonal.

Compute the Jordan Canonical Form for a given matrix

Let A = \begin{bmatrix} -8 & -10 & -1 \\ 7 & 9 & 1 \\ 3 & 2 & 0 \end{bmatrix} and B = \begin{bmatrix} -3 & 2 & -4 \\ 4 & -1 & 4 \\ 4 & -2 & 5 \end{bmatrix}. Show that these matrices have the same characteristic polynomial, but that one is diagonalizable and the other not. Compute the Jordan canonical form for each.


Let P = \begin{bmatrix} -2 & 1 & 0 \\ 1 & -1 & 0 \\ 4 & 1 & -1 \end{bmatrix} and Q = \begin{bmatrix} 1 & 1 & 0 \\ -1 & 2 & 2 \\ -1 & 0 & 1 \end{bmatrix}. A direct computation shows that AP = P \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} and BQ = Q \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, and that \mathsf{det}\ P = -1 and \mathsf{det}\ Q = 1, so that P and Q are invertible. Thus P^{-1}AP = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} and Q^{-1}BQ = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, which are both in Jordan canonical form.

Recall that characteristic polynomials are invariant under conjugation; in particular, both A and B have characteristic polynomial (x+1)(x-1)^2. By Corollary 24 in D&F, A is not diagonalizable, since its Jordan canonical form contains a block of dimension greater than 1, while B is diagonalizable.
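These claims can be verified mechanically. The Python sketch below (function names are mine) computes the characteristic polynomial of a 3x3 matrix from its trace, principal 2x2 minors, and determinant, then compares the geometric multiplicity of the eigenvalue 1 for A and B via the rank of M - I:

```python
from fractions import Fraction

A = [[-8, -10, -1], [7, 9, 1], [3, 2, 0]]
B = [[-3, 2, -4], [4, -1, 4], [4, -2, 5]]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def charpoly(M):
    """Coefficients [1, c2, c1, c0] of det(xI - M) = x^3 + c2 x^2 + c1 x + c0."""
    tr = M[0][0] + M[1][1] + M[2][2]
    minors = sum(M[i][i] * M[j][j] - M[i][j] * M[j][i]
                 for i, j in [(0, 1), (0, 2), (1, 2)])
    return [1, -tr, minors, -det3(M)]

# Both have characteristic polynomial (x+1)(x-1)^2 = x^3 - x^2 - x + 1.
assert charpoly(A) == charpoly(B) == [1, -1, -1, 1]

def rank(M):
    """Row rank of a 3x3 matrix via Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(3):
        piv = next((i for i in range(r, 3) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, 3):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def eig_dim(M, lam):
    """Dimension of the eigenspace of M for the eigenvalue lam."""
    return 3 - rank([[M[i][j] - lam * (i == j) for j in range(3)]
                     for i in range(3)])

# Eigenvalue 1 has algebraic multiplicity 2; A falls short geometrically.
assert eig_dim(A, 1) == 1   # A is not diagonalizable
assert eig_dim(B, 1) == 2   # B is diagonalizable
```

The eigenspace dimensions match the Jordan forms: a size-2 block for A at the eigenvalue 1, and two size-1 blocks for B.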

Every square matrix over a Euclidean domain can be diagonalized by elementary row and column operations

Show that if A is a matrix over a Euclidean domain R, then there is a sequence of elementary row and column operations (as described here and here) which, when performed on A, result in a matrix \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix} such that D = [d_{i,j}] is square and diagonal and if d_{1,1},\ldots, d_{k,k} are the nonzero diagonal entries of D (k \leq n), we have d_{i,i} | d_{j,j} whenever i \leq j.

Conclude that if M is a finitely generated R-module with relations matrix A (see here for a discussion of relations matrices) then M \cong R^{n-k} \oplus \bigoplus_{i=1}^k R/(d_i).


Let N : R \rightarrow \mathbb{N} denote the Euclidean norm on R; for convenience we set N(0) = 0.

Note first that if A = 0, there is nothing to show. If A \neq 0, we proceed by induction on the number m of rows of A.

For the base case m = 1, say A = [a_{1,1}\ a_{1,2}\ \cdots\ a_{1,n}]. If n = 1 there is nothing to show, so we assume n \geq 2. We now proceed by induction on \sum_{i=2}^n N(a_{1,i}). For the base case, if \sum_{i=2}^n N(a_{1,i}) = 0, then a_{1,i} = 0 for each 2 \leq i \leq n, and so A = [a_{1,1} | 0] has the desired form. For the inductive step, suppose that every 1 \times n matrix B = [b_{1,1}\ b_{1,2}\ \cdots\ b_{1,n}] with \sum_{i=2}^n N(b_{1,i}) \leq K can be put in the desired form by some sequence of elementary row and column operations, and suppose \sum_{i=2}^n N(a_{1,i}) = K+1. Using column operation 1 (swap two columns), rearrange the columns of A so that a_{1,1} is a nonzero entry of minimal norm. By the Division Algorithm, for each 2 \leq i \leq n there exist q_i and r_i such that a_{1,i} = q_ia_{1,1} + r_i and either r_i = 0 or N(r_i) < N(a_{1,1}). Using column operation 2 (add a multiple of one column to a different one) n-1 times, once for each column after the first, we can put A in the form [a_{1,1}\ r_2\ \cdots\ r_n]. Since each nonzero r_i satisfies N(r_i) < N(a_{1,1}) \leq N(a_{1,i}), we have \sum_{i=2}^n N(r_i) \leq K. By the inductive hypothesis, there is a sequence of elementary row and column operations which puts this new matrix in the desired form, and so there is a sequence of elementary row and column operations which does the same to A. This proves our base case, where A has one row.
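Specialized to R = \mathbb{Z} with N the absolute value, this base-case procedure is short to sketch in Python (the function name is mine). It repeatedly swaps a minimal-norm entry to the front and replaces the other entries by their remainders, terminating with a greatest common divisor in position 1:

```python
def reduce_row(v):
    """Reduce a 1 x n integer row to [g, 0, ..., 0] (g = +/- gcd) using only
    the two elementary column operations from the proof."""
    v = list(v)
    while any(x != 0 for x in v[1:]):
        # column operation 1: swap a nonzero entry of minimal norm to the front
        k = min((i for i, x in enumerate(v) if x != 0), key=lambda i: abs(v[i]))
        v[0], v[k] = v[k], v[0]
        # column operation 2: subtract q_i times the first column from column i,
        # leaving the remainder r_i with |r_i| < |v[0]|
        for i in range(1, len(v)):
            v[i] -= (v[i] // v[0]) * v[0]
    return v

print(reduce_row([12, 18, 30]))  # -> [6, 0, 0]
```

Each pass strictly decreases \sum_{i \geq 2} N(v_i), which is exactly the induction measure used in the proof.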

Suppose now that the result holds for every matrix having at most m rows, and let A be an (m+1) \times n matrix. Say A = \begin{bmatrix} a_{1,1} & R_1 \\ C_1 & A_1 \end{bmatrix}, where R_1 = [r_{1,2}\ \cdots\ r_{1,n}] and C_1 = [c_{2,1}\ \cdots\ c_{m+1,1}]^\mathsf{T}.

We next claim that there is a sequence of elementary row and column operations which transforms A into a matrix of the form \begin{bmatrix} a_1 & 0 \\ 0 & A_1 \end{bmatrix}. To prove this, we proceed by induction on \sum_{i=2}^{m+1} N(c_{i,1}) + \sum_{i=2}^n N(r_{1,i}). (Note that it may be the case that n = 1. In this case the sum over the r is empty and thus 0; however, the sum over the c is not empty.) If \sum_{i=2}^{m+1} N(c_{i,1}) + \sum_{i=2}^n N(r_{1,i}) = 0, then we have c_{i,1} = r_{1,j} = 0 for each 2 \leq i \leq m+1 and 2 \leq j \leq n, and so A = \begin{bmatrix} a_{1,1} & 0 \\ 0 & A_1 \end{bmatrix} has the desired form. For the induction hypothesis, suppose that for some K \in \mathbb{N}, every matrix B can be put in the desired form provided the corresponding sum \sum_{i=2}^{m+1} N(c_{i,1}) + \sum_{i=2}^n N(r_{1,i}) is at most K, and suppose that for our A we have \sum_{i=2}^{m+1} N(c_{i,1}) + \sum_{i=2}^n N(r_{1,i}) = K+1. Now among the entries of our matrix, choose a nonzero element of minimal norm and use elementary row and column operations of type 1 (swap) to put this entry in position (1,1). (If no such entry exists, then A = 0 and there is nothing to show.) Now by the Division Algorithm in R, there exist q_i, t_j, u_i, v_j such that r_{1,i} = q_ia_{1,1} + u_i and c_{j,1} = t_ja_{1,1} + v_j, where either u_i = 0 or N(u_i) < N(a_{1,1}), and either v_j = 0 or N(v_j) < N(a_{1,1}). Using elementary row and column operations of type 2 (add a multiple of one row/column to a different row/column) we can replace each r_{1,i} with u_i and likewise each c_{j,1} with v_j. Note now that \sum_{j=2}^{m+1} N(v_j) + \sum_{i=2}^n N(u_i) < \sum_{j=2}^{m+1} N(a_{1,1}) + \sum_{i=2}^n N(a_{1,1}) \leq \sum_{i=2}^{m+1} N(c_{i,1}) + \sum_{i=2}^n N(r_{1,i}) = K+1. By the induction hypothesis, there is a sequence of elementary row and column operations which puts the new matrix in the desired form, and so (performing these one after the other) we can put the original matrix A in the desired form.

Now A can be put in the form \begin{bmatrix} a_1 & 0 \\ 0 & A_1 \end{bmatrix} by elementary row and column operations. Note that any row or column operation we perform on A_1, when performed analogously on A, does not change the entry a_1 or introduce new nonzero elements to row and column 1. Since (as we assume by the induction hypothesis) the matrix A_1 can be diagonalized, then A can also be diagonalized by elementary row and column operations.

We have thus shown by induction that every matrix A over a Euclidean domain can be diagonalized by elementary row and column operations. It remains to be seen that the diagonal entries can be ordered by divisibility. We may assume that our matrix D is square and has all nonzero entries on the main diagonal, and we proceed by induction on the dimension of D. For the base case, where D has dimension 1 \times 1, there is nothing to show. Suppose now that every square diagonal matrix of dimension n can have its diagonal entries divisibility-ordered by elementary row and column operations, and let D = \begin{bmatrix} d_1 & 0 \\ 0 & D_1 \end{bmatrix} be a square diagonal matrix of dimension n+1.

We claim that d_1 can be replaced by the greatest common divisor of all the d_i, with each remaining d_i replaced by a suitable multiple of that divisor, via elementary row and column operations. We show this for one d_i. Note that since D is diagonal, row and column operations which involve only two rows and columns do not affect the rest of the matrix, so we may focus on the 2 \times 2 matrix \begin{bmatrix} d_1 & 0 \\ 0 & d_i \end{bmatrix}. Since R is a Euclidean domain, using the extended Euclidean Algorithm we can find x and y such that \mathsf{gcd}(d_1,d_i) = xd_1 + yd_i. Adding x times column 1 to column 2 and then y times row 2 to row 1 (both operations of type 2), we obtain \begin{bmatrix} d_1 & xd_1 + yd_i \\ 0 & d_i \end{bmatrix}. Swapping the columns, we have \begin{bmatrix} \mathsf{gcd}(d_1,d_i) & d_1 \\ d_i & 0 \end{bmatrix}. Writing d_1 = a_1\mathsf{gcd}(d_1,d_i) and subtracting a_1 times column 1 from column 2, we obtain \begin{bmatrix} \mathsf{gcd}(d_1,d_i) & 0 \\ d_i & -a_1d_i \end{bmatrix}; since \mathsf{gcd}(d_1,d_i) also divides d_i, a row operation of type 2 clears the remaining d_i, leaving \begin{bmatrix} \mathsf{gcd}(d_1,d_i) & 0 \\ 0 & -a_1d_i \end{bmatrix}. (Note that -a_1d_i is, up to a unit, the least common multiple of d_1 and d_i.) Repeating this procedure for each diagonal entry after the first, and since greatest common divisor is associative, we obtain a diagonal matrix \begin{bmatrix} d_1^\prime & 0 \\ 0 & D_1^\prime \end{bmatrix} where d_1^\prime divides each diagonal entry of D_1^\prime.
By the induction hypothesis applied to D_1^\prime, D is row/column equivalent to a diagonal matrix whose diagonal entries are ordered by divisibility.
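The whole argument, specialized to R = \mathbb{Z} (so N is the absolute value), fits in a short Python sketch (the function name is mine). It diagonalizes by the swap-and-remainder process above, then enforces the divisibility ordering with the gcd/lcm trick on pairs of diagonal entries:

```python
from math import gcd

def smith_normal_form(A):
    """Diagonalize an integer matrix by elementary row/column operations and
    return its nonzero diagonal entries, ordered by divisibility."""
    A = [row[:] for row in A]
    m, n = len(A), len(A[0])
    t = 0
    while t < min(m, n):
        # choose a pivot: a nonzero entry of minimal norm in the submatrix
        entries = [(abs(A[i][j]), i, j) for i in range(t, m)
                   for j in range(t, n) if A[i][j] != 0]
        if not entries:
            break
        _, pi, pj = min(entries)
        A[t], A[pi] = A[pi], A[t]
        for row in A:
            row[t], row[pj] = row[pj], row[t]
        # clear row t and column t; a surviving remainder becomes the new,
        # strictly smaller pivot, so this loop terminates
        dirty = True
        while dirty:
            dirty = False
            for i in range(t + 1, m):
                q = A[i][t] // A[t][t]
                A[i] = [a - q * b for a, b in zip(A[i], A[t])]
                if A[i][t] != 0:
                    A[t], A[i] = A[i], A[t]
                    dirty = True
            for j in range(t + 1, n):
                q = A[t][j] // A[t][t]
                for i in range(m):
                    A[i][j] -= q * A[i][t]
                if A[t][j] != 0:
                    for row in A:
                        row[t], row[j] = row[j], row[t]
                    dirty = True
        t += 1
    d = [abs(A[i][i]) for i in range(min(m, n)) if A[i][i] != 0]
    # enforce d[i] | d[j]: replace pairs by (gcd, lcm), as in the 2x2 argument
    for i in range(len(d)):
        for j in range(i + 1, len(d)):
            g = gcd(d[i], d[j])
            d[i], d[j] = g, d[i] * d[j] // g
    return d

print(smith_normal_form([[2, 0], [0, 3]]))  # -> [1, 6]
```

For example, diag(2, 3) becomes diag(1, 6): the gcd 1 moves to the front and the lcm 6 takes its place, exactly as in the 2 \times 2 computation above.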

Finally, suppose we have a finitely generated R-module M, with (say) a generating set of cardinality n and corresponding surjective homomorphism \varphi : R^n \rightarrow M. By the preceding result, there exists a basis B = \{e_1, \ldots, e_n\} for R^n with respect to which the relations matrix has the form \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix}, where D is square of dimension k \leq n with nonzero diagonal entries d_1, \ldots, d_k ordered by divisibility, and such that \mathsf{ker}\ \varphi is generated by \{d_1e_1, \ldots, d_ke_k\}. In particular, we have \mathsf{ker}\ \varphi = Rd_1e_1 \oplus \cdots \oplus Rd_ke_k. Then by this previous exercise, M \cong R^n/\mathsf{ker}\ \varphi \cong R/(d_1) \oplus \cdots \oplus R/(d_k) \oplus R^{n-k}. Note that if any d_i is a unit, then the corresponding quotient R/(d_i) is trivial.

By the uniqueness portion of the Fundamental Theorem of Finitely Generated Modules over a PID, the (nonunit) d_i are precisely the invariant factors of M.