## Tag Archives: matrix

### Over ℂ, matrices of finite multiplicative order are diagonalizable

Let $A$ be an $n \times n$ matrix over $\mathbb{C}$. Show that if $A^k=I$ for some $k$, then $A$ is diagonalizable.

Let $F$ be a field of characteristic $p$. Show that $A = \begin{bmatrix} 1 & \alpha \\ 0 & 1 \end{bmatrix}$ has finite order but cannot be diagonalized over $F$ unless $\alpha = 0$.

Since $A^k = I$, the minimal polynomial $m$ of $A$ over $\mathbb{C}$ divides $x^k-1$. In particular, the roots of $m$ are distinct. Since $\mathbb{C}$ contains all the roots of unity, by Corollary 25 on page 494 of D&F, $A$ is diagonalizable over $\mathbb{C}$.
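As a quick sanity check (in Python, assuming sympy is available), we can test this on a cyclic permutation matrix, which satisfies $A^3 = I$; sympy's `is_diagonalizable` works over $\mathbb{C}$ by default.

```python
from sympy import Matrix, eye

# A cyclic permutation matrix: its eigenvalues are the three cube
# roots of unity, so it is diagonalizable over CC (though not over RR).
A = Matrix([[0, 1, 0],
            [0, 0, 1],
            [1, 0, 0]])

assert A**3 == eye(3)          # finite multiplicative order
assert A.is_diagonalizable()   # diagonalizable over the complex numbers
```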

Note that $\begin{bmatrix} 1 & \alpha \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & \beta \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & \alpha+\beta \\ 0 & 1 \end{bmatrix}$. By an easy inductive argument, then, $\begin{bmatrix} 1 & \alpha \\ 0 & 1 \end{bmatrix}^t = \begin{bmatrix} 1 & t\alpha \\ 0 & 1 \end{bmatrix}$, and in particular, $A^p = I$.

Suppose $\alpha \neq 0$. Now $\frac{1}{\alpha}A$ is in Jordan canonical form, and is not diagonalizable. (See Corollary 24 on page 493 of D&F.) So $A$ cannot be diagonalizable, for if it were, then so would $\frac{1}{\alpha}A$. (If $P^{-1}AP = D$ is diagonal, then so is $P^{-1}\frac{1}{\alpha}AP = \frac{1}{\alpha}D$.)
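The characteristic $p$ phenomenon is easy to see concretely. Here is a pure-Python illustration (my own sample values $p = 5$, $\alpha = 2$): the matrix has order $p$, yet $A - I$ has rank 1, so the eigenspace for the single eigenvalue $1$ is one-dimensional and $A$ is not diagonalizable.

```python
p, a = 5, 2  # sample prime and a nonzero alpha in F_p

def mat_mul(X, Y, p):
    """Multiply two 2x2 matrices with entries reduced mod p."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) % p
             for j in range(2)] for i in range(2)]

A = [[1, a], [0, 1]]
M = [[1, 0], [0, 1]]
for _ in range(p):
    M = mat_mul(M, A, p)

assert M == [[1, 0], [0, 1]]   # A^p = I, so A has finite order
# Meanwhile A - I = [[0, a], [0, 0]] has rank 1, so dim ker(A - I) = 1 < 2.
```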

### Exhibit a quadratic field as a field of matrices

Let $K = \mathbb{Q}(\sqrt{D})$, where $D$ is a squarefree integer. Let $\alpha = a+b\sqrt{D}$ be in $K$, and consider the basis $B = \{1,\sqrt{D}\}$ of $K$ over $\mathbb{Q}$. Compute the matrix of the $\mathbb{Q}$-linear transformation ‘multiplication by $\alpha$’ (described previously) with respect to $B$. Give an explicit embedding of $\mathbb{Q}(\sqrt{D})$ in the ring $\mathsf{Mat}_2(\mathbb{Q})$.

We have $\varphi_\alpha(1) = a+b\sqrt{D}$ and $\varphi_\alpha(\sqrt{D}) = bD + a\sqrt{D}$. Making these the columns of a matrix $M_\alpha$, we have $M_\alpha = \begin{bmatrix} a & bD \\ b & a \end{bmatrix}$, and this is the matrix of $\varphi_\alpha$ with respect to $B$. As we showed in the exercise linked above, $\alpha \mapsto M_\alpha$ is an embedding of $K$ in $\mathsf{Mat}_2(\mathbb{Q})$.

Compare to this previous exercise about $\mathbb{Z}[\sqrt{D}]$.
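To illustrate that $\alpha \mapsto M_\alpha$ respects multiplication, here is a small pure-Python check (the values $D = 5$, $\alpha = 2 + 3\sqrt{5}$, $\beta = -1 + 4\sqrt{5}$ are my own arbitrary choices): since $(a+b\sqrt{D})(c+d\sqrt{D}) = (ac+bdD) + (ad+bc)\sqrt{D}$, we should have $M_\alpha M_\beta = M_{\alpha\beta}$.

```python
D = 5  # a sample squarefree integer

def M(a, b):
    """Matrix of multiplication by a + b*sqrt(D) w.r.t. {1, sqrt(D)}."""
    return [[a, b * D], [b, a]]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b, c, d = 2, 3, -1, 4
# (a + b sqrt D)(c + d sqrt D) = (ac + bdD) + (ad + bc) sqrt D
assert mul(M(a, b), M(c, d)) == M(a*c + b*d*D, a*d + b*c)
```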

### Every degree n extension of a field F is embedded in the ring of n×n matrices over F

Let $F$ be a field, and let $K$ be an extension of $F$ of finite degree.

1. Fix $\alpha \in K$. Prove that the mapping ‘multiplication by $\alpha$’ is an $F$-linear transformation on $K$. (In fact an automorphism for $\alpha \neq 0$.)
2. Deduce that $K$ is isomorphically embedded in $\mathsf{Mat}_n(F)$.

Let $\varphi_\alpha(x) = \alpha x$. Certainly then we have $\varphi_\alpha(x+ry) = \alpha(x+ry) = \alpha x + r \alpha y$ $= \varphi_\alpha(x) + r \varphi_\alpha(y)$ for all $x,y \in K$ and $r \in F$; so $\varphi_\alpha$ is an $F$-linear transformation. If $\alpha \neq 0$, then evidently $\varphi_{\alpha^{-1}} \circ \varphi_\alpha = \varphi_\alpha \circ \varphi_{\alpha^{-1}} = 1$.

Fix a basis for $K$ over $F$; this yields a ring homomorphism $\Psi : K \rightarrow \mathsf{Mat}_n(F)$ which takes $\alpha$ and returns the matrix of $\varphi_\alpha$ with respect to the chosen basis. Suppose $\alpha \in \mathsf{ker}\ \Psi$; then $\varphi_\alpha(x) = 0$ for all $x \in K$, and thus $\alpha = 0$. So $\Psi$ is injective as desired.
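Here is a concrete instance of $\Psi$ for a cubic extension (my own choice): take $K = \mathbb{Q}(\sqrt[3]{2})$ with basis $\{1, t, t^2\}$ where $t = \sqrt[3]{2}$. Multiplication by $t$ sends $1 \mapsto t$, $t \mapsto t^2$, $t^2 \mapsto 2$, so $\Psi(t)$ is the companion matrix of $x^3 - 2$, and $\Psi(t)^3 = \Psi(t^3) = 2I$.

```python
# Companion matrix of x^3 - 2: the matrix of multiplication by 2^(1/3)
# with respect to the basis {1, t, t^2}.
M = [[0, 0, 2],
     [1, 0, 0],
     [0, 1, 0]]

def mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# t^3 = 2 in K, so M^3 should be 2I.
M3 = mul(mul(M, M), M)
assert M3 == [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
```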

### Solve a given nth order linear differential equation

Find bases for the vector spaces of solutions to the following differential equations.

1. $y^{(3)} - 3y^{(1)} + 2y = 0$
2. $y^{(4)} + 4y^{(3)} + 6y^{(2)} + 4y^{(1)} + y = 0$

Let $y_1 = y^{(0)}$, $y_2 = y^{(1)}$, and $y_3 = y^{(2)}$, and consider the following linear system of first order differential equations.

 $\begin{array}{rcl} \frac{d}{dt} y_1 & = & y_2 \\ \frac{d}{dt} y_2 & = & y_3 \\ \frac{d}{dt} y_3 & = & -2y_1 + 3y_2 \end{array}$

We can express this as a matrix equation:

 $\dfrac{d}{dt} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -2 & 3 & 0 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}$

Now let $A$ denote this coefficient matrix. Let $P = \frac{1}{9} \begin{bmatrix} 1 & -2 & 1 \\ -4 & 8 & 5 \\ -6 & 3 & 3 \end{bmatrix}$; evidently, $P^{-1}AP = \begin{bmatrix} -2 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$ is in Jordan canonical form. (Computations performed by WolframAlpha.) Now $P\left(\mathsf{exp}([-2]t) \oplus \mathsf{exp}\left(\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}t\right)\right) = P\begin{bmatrix} e^{-2t} & 0 & 0 \\ 0 & e^t & te^t \\ 0 & 0 & e^t \end{bmatrix}$ is a fundamental matrix of our linear system. Reading off the first row of this matrix (and multiplying by 9), we see that $e^{-2t}$, $-2e^t$, and $-2te^t + e^t$ are solutions of our original 3rd order differential equation. Moreover, it is clear that these are linearly independent.
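We can verify the three solutions symbolically (assuming sympy is available) by plugging them back into $y^{(3)} - 3y^{(1)} + 2y = 0$:

```python
from sympy import symbols, exp, diff, simplify

t = symbols('t')
# The three solutions read off above (up to the overall factor of 9).
solutions = [exp(-2*t), -2*exp(t), -2*t*exp(t) + exp(t)]

for y in solutions:
    # y''' - 3y' + 2y should vanish identically.
    assert simplify(diff(y, t, 3) - 3*diff(y, t) + 2*y) == 0
```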

For the second equation, we follow a similar strategy and see that the corresponding coefficient matrix is $A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & -4 & -6 & -4 \end{bmatrix}$. Let $Q = \begin{bmatrix} -1 & -3 & -6 & -10 \\ 1 & 2 & 3 & 4 \\ -1 & -1 & -1 & -1 \\ 1 & 0 & 0 & 0 \end{bmatrix}$. Evidently, $Q^{-1}AQ = \begin{bmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & -1 & 1 \\ 0 & 0 & 0 & -1 \end{bmatrix} = B$ is in Jordan canonical form. (Computation performed by WolframAlpha.) Taking the first row of $Q\mathsf{exp}(Bt)$ (and multiplying by $-1$), we see that $e^{-t}$, $e^{-t}(t + 3)$, $e^{-t}(\frac{1}{2}t^2 + 3t + 6)$, and $e^{-t}(\frac{1}{6}t^3 + \frac{3}{2}t^2 + 6t + 10)$ are linearly independent solutions of the original 4th order differential equation.
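Since the characteristic polynomial here is $(x+1)^4$, every function $t^k e^{-t}$ with $0 \leq k \leq 3$ should be a solution, and the four solutions above are invertible linear combinations of these. A symbolic check (assuming sympy):

```python
from sympy import symbols, exp, diff, simplify

t = symbols('t')
for k in range(4):
    y = t**k * exp(-t)
    # y'''' + 4y''' + 6y'' + 4y' + y should vanish identically.
    lhs = (diff(y, t, 4) + 4*diff(y, t, 3) + 6*diff(y, t, 2)
           + 4*diff(y, t) + y)
    assert simplify(lhs) == 0
```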

### Nonsingular, constant multiples of a fundamental matrix are fundamental

Let $\frac{d}{dt} Y = AY$ be a linear system of first order differential equations over $\mathbb{R}$ as in this previous exercise. Suppose $M$ is a fundamental matrix of this system (i.e. a matrix whose columns are linearly independent solutions) and let $Q$ be a nonsingular matrix over $\mathbb{R}$. Prove that $MQ$ is also a fundamental matrix for this differential equation.

Write $Q = [Q_1\ \ldots\ Q_n]$ as a column matrix. Now $\frac{d}{dt} MQ = \frac{d}{dt} [MQ_1\ \ldots\ MQ_n]$ $= [(\frac{d}{dt}M)Q_1\ \ldots\ (\frac{d}{dt} M)Q_n]$ by the definition of our matrix derivative and part 2 of this previous exercise. This is then equal to $(\frac{d}{dt} M)Q = AMQ$. In particular, the columns of $MQ$ are solutions of $\frac{d}{dt} Y = AY$. Moreover, since $Q$ is nonsingular, $MQ$ is nonsingular. So $MQ$ is also a fundamental matrix.
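A concrete instance (my own choice of system and $Q$, assuming sympy): $A$ rotates the plane, $M(t)$ is the standard rotation fundamental matrix, and right-multiplying by a nonsingular constant $Q$ still gives columns solving $\frac{d}{dt}Y = AY$.

```python
from sympy import Matrix, symbols, cos, sin, simplify, zeros

t = symbols('t')
A = Matrix([[0, 1], [-1, 0]])
M = Matrix([[cos(t), sin(t)], [-sin(t), cos(t)]])   # fundamental matrix

# M solves the system: dM/dt = A M.
assert (M.diff(t) - A*M).applyfunc(simplify) == zeros(2)

# Right-multiplication by a nonsingular constant matrix preserves this.
Q = Matrix([[1, 2], [3, 4]])                        # det = -2, nonsingular
assert ((M*Q).diff(t) - A*(M*Q)).applyfunc(simplify) == zeros(2)
```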

### Properties of the derivative of a matrix power series

Let $G(x)$ be a formal power series on $\mathbb{C}$ having an infinite radius of convergence and fix an $n \times n$ matrix $A$ over $\mathbb{C}$. The mapping $t \mapsto G(At)$ carries a complex number $t$ to a complex matrix $G(At)$; we can think of this as the ‘direct sum’ of $n \times n$ different functions on $\mathbb{C}$, one for each entry of $G(At)$.

We now define the derivative of $G(At)$ with respect to $t$ to be a mapping $\mathbb{C} \rightarrow \mathsf{Mat}_n(\mathbb{C})$ as follows: $\left[\frac{d}{dt} G(At)\right]_{i,j} = \frac{d}{dt} \left[ G(At)_{i,j}\right]$. In other words, thinking of $G(At)$ as a matrix of functions, $\frac{d}{dt} G(At)$ is the matrix whose entries are the derivatives of the corresponding entries of $G(At)$.

We will use the limit definition of the derivative (that is, $\frac{d}{dt} f(t) = \mathsf{lim}_{h \rightarrow 0} \dfrac{f(t+h) - f(t)}{h}$, where it doesn't matter how $h$ approaches 0 in $\mathbb{C}$) and will assume that all derivatives exist everywhere.

Prove the following properties of derivatives:

1. If $G(x) = \sum_{k \in \mathbb{N}} \alpha_kx^k$, then $\frac{d}{dt} G(At) = A \sum_{k \in \mathbb{N}} (k+1)\alpha_{k+1}(At)^k$.
2. If $V$ is an $n \times 1$ matrix with constant entries (i.e. not dependent on $t$) then $\frac{d}{dt} (G(At)V) = \left( \frac{d}{dt} G(At) \right) V$.

[My usual disclaimer about analysis applies here: as soon as I see words like ‘limit’ and ‘continuous’ I become even more confused than usual. Read the following with a healthy dose of skepticism, and please point out any errors.]

Note the following.

 $\begin{aligned} \left[ \dfrac{d}{dt} G(At) \right]_{i,j} &= \dfrac{d}{dt}\, G(At)_{i,j} \\ &= \mathsf{lim}_{h \rightarrow 0} \dfrac{G(A(t+h))_{i,j} - G(At)_{i,j}}{h} \\ &= \mathsf{lim}_{h \rightarrow 0} \left[ \dfrac{G(A(t+h)) - G(At)}{h} \right]_{i,j} \\ &= \mathsf{lim}_{h \rightarrow 0} \left[ \dfrac{\sum_k \alpha_k(A(t+h))^k - \sum_k \alpha_k(At)^k}{h} \right]_{i,j} \\ &= \mathsf{lim}_{h \rightarrow 0} \left[ \dfrac{\sum_k \alpha_k A^k ((t+h)^k - t^k)}{h} \right]_{i,j} \\ &= \mathsf{lim}_{h \rightarrow 0} \left[ \dfrac{\sum_{k > 0} \alpha_k A^k \left( \sum_{m=0}^k {k \choose m} t^m h^{k-m} - t^k \right)}{h} \right]_{i,j} \\ &= \mathsf{lim}_{h \rightarrow 0} \left[ \dfrac{\sum_{k > 0} \alpha_k A^k \sum_{m=0}^{k-1} {k \choose m} t^m h^{k-m}}{h} \right]_{i,j} \\ &= \mathsf{lim}_{h \rightarrow 0} \left[ \sum_{k > 0} \alpha_k A^k \sum_{m=0}^{k-1} {k \choose m} t^m h^{k-1-m} \right]_{i,j} \\ &= \left[ \sum_{k > 0} \alpha_k A^k {k \choose {k-1}} t^{k-1} \right]_{i,j} \quad \text{(substituting } h = 0 \text{; all terms but } m = k-1 \text{ vanish)} \\ &= \left[ \sum_{k > 0} k \alpha_k A^k t^{k-1} \right]_{i,j} \\ &= \left[ \sum_k (k+1) \alpha_{k+1}A^{k+1}t^k \right]_{i,j} \\ &= \left[ A \sum_k (k+1)\alpha_{k+1}(At)^k \right]_{i,j} \end{aligned}$

As desired.
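Part 1 can be sanity-checked symbolically for $G = \mathsf{exp}$ (where $\sum (k+1)\alpha_{k+1}x^k = G(x)$ again, so the identity reads $\frac{d}{dt}\mathsf{exp}(At) = A\,\mathsf{exp}(At)$). A small sympy illustration, with $A$ my own arbitrary diagonalizable choice:

```python
from sympy import Matrix, symbols, simplify, zeros

t = symbols('t')
A = Matrix([[0, 1], [-2, 3]])   # eigenvalues 1 and 2
E = (A*t).exp()                 # sympy's symbolic matrix exponential

# d/dt exp(At) = A exp(At), entrywise.
assert (E.diff(t) - A*E).applyfunc(simplify) == zeros(2)
```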

Now say $V = [v_i]$ and $G(At) = [c_{i,j}(t)]$; we then have the following.

 $\dfrac{d}{dt} \left( G(At)V \right) = \dfrac{d}{dt} \left( [c_{i,j}(t)][v_i] \right) = \dfrac{d}{dt} \left[\sum_k c_{i,k}(t)v_k\right] = \left[\dfrac{d}{dt} \sum_k c_{i,k}(t)v_k\right] = \left[\sum_k \left(\dfrac{d}{dt} c_{i,k}(t)\right) v_k\right] = \left[\dfrac{d}{dt} c_{i,j}(t)\right][v_i] = \left(\dfrac{d}{dt} G(At) \right)V$

As desired.

### Facts about power series of matrices

Let $G(x) = \sum_{k \in \mathbb{N}} \alpha_k x^k$ be a power series over $\mathbb{C}$ with radius of convergence $R$. Let $A$ be an $n \times n$ matrix over $\mathbb{C}$, and let $P$ be a nonsingular matrix. Prove the following.

1. If $G(A)$ converges, then $G(P^{-1}AP)$ converges, and $G(P^{-1}AP) = P^{-1}G(A)P$.
2. If $A = B \oplus C$ and $G(A)$ converges, then $G(B)$ and $G(C)$ converge and $G(B \oplus C) = G(B) \oplus G(C)$.
3. If $D$ is a diagonal matrix with diagonal entries $d_i$, then $G(D)$ converges, $G(d_i)$ converges for each $d_i$, and $G(D)$ is diagonal with diagonal entries $G(d_i)$.

Suppose $G(A)$ converges. Then (by definition) the sequence of matrices $G_N(A)$ converges entrywise. Let $G_N(A) = [a_{i,j}^N]$, $P = [p_{i,j}]$, and $P^{-1} = [q_{i,j}]$. Now $G_N(P^{-1}AP) = P^{-1}G_N(A)P$ $= [\sum_\ell \sum_k q_{i,k} a_{k,\ell}^Np_{\ell,j}]$. That is, the $(i,j)$ entry of $G_N(P^{-1}AP)$ is $\sum_\ell \sum_k q_{i,k} a_{k,\ell}^Np_{\ell,j}$. Since each sequence $a_{k,\ell}^N$ converges, this sum converges as well. In particular, $G(P^{-1}AP)$ converges (again by definition). Now since $G_N(P^{-1}AP) = P^{-1}G_N(A)P$ for each $N$, the corresponding sequences for each $(i,j)$ are equal for each term, and so have the same limit. Thus $G(P^{-1}AP) = P^{-1}G(A)P$.

Now suppose $A = B \oplus C$. We have $G_N(B \oplus C) = \sum_{k=0}^N \alpha_k (B \oplus C)^k$ $= \sum_{k=0}^N \alpha_k B^k \oplus C^k$ $= (\sum_{k = 0}^N \alpha_k B^k) \oplus (\sum_{k=0}^N \alpha_k C^k)$ $= G_N(B) \oplus G_N(C)$. Since $G_N(A)$ converges in each entry, each of $G_N(B)$ and $G_N(C)$ converge in each entry. So $G(B)$ and $G(C)$ converge. Again, because for each $(i,j)$ the corresponding sequences $G_N(A)_{i,j}$ and $(G_N(B) \oplus G_N(C))_{i,j}$ are the same, they converge to the same limit, and thus $G(B \oplus C) = G(B) \oplus G(C)$.

Finally, suppose $D$ is diagonal. Then in fact we have $D = \bigoplus_{t=1}^n [d_t]$, and so by the previous part, $G(D) = \bigoplus_{t=1}^n G(d_t)$. In particular, $G(d_t)$ converges, and $G(D)$ is diagonal with diagonal entries $G(d_t)$ as desired.
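Part 1 is easy to check numerically with partial sums (a sketch assuming numpy; the matrices and $N = 30$ are my own choices, with $G = \mathsf{exp}$): $G_N(P^{-1}AP)$ agrees with $P^{-1}G_N(A)P$ up to floating-point error.

```python
import numpy as np

def G_N(A, N=30):
    """Partial sum of exp(A): sum of A^k / k! for k = 0, ..., N-1."""
    term, total = np.eye(2), np.zeros((2, 2))
    for k in range(N):
        total = total + term
        term = term @ A / (k + 1)   # next term A^(k+1)/(k+1)!
    return total

A = np.array([[0.0, 1.0], [-2.0, 3.0]])
P = np.array([[1.0, 1.0], [1.0, 2.0]])
Pinv = np.linalg.inv(P)

assert np.allclose(G_N(Pinv @ A @ P), Pinv @ G_N(A) @ P)
```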

### On the convergence of formal power series of matrices

Let $G(x) = \sum_{k \in \mathbb{N}} \alpha_kx^k$ be a formal power series over $\mathbb{C}$ with radius of convergence $R$. Let $||A|| = \sum_{i,j} |a_{i,j}|$ be the matrix norm introduced in this previous exercise.

Given a matrix $A$ over $\mathbb{C}$, we can construct a sequence of matrices by taking the $N$th partial sum of $G(A)$. That is, $G_N(A) = \sum_{k = 0}^N \alpha_kA^k$. This gives us $n^2$ sequences $\{c_{i,j}^N\}$ where $c_{i,j}^N$ is the $(i,j)$ entry of $G_N(A)$. Suppose $c_{i,j}^N$ converges to $c_{i,j}$ for each $(i,j)$, and let $C = [c_{i,j}]$. In this situation, we say that $G_N(A)$ converges to $C$, and that $G(A) = C$. (In other words, $G_N(A)$ converges precisely when each entrywise sequence $G_N(A)_{i,j}$ converges.)

1. Prove that if $||A|| < R$, then $G_N(A)$ converges.
2. Deduce that for all matrices $A$, the following power series converge: $\mathsf{sin}(A) = \sum_{k \in \mathbb{N}} \dfrac{(-1)^k}{(2k+1)!}A^{2k+1}$, $\mathsf{cos}(A) = \sum_{k \in \mathbb{N}} \dfrac{(-1)^k}{(2k)!}A^{2k}$, and $\mathsf{exp}(A) = \sum_{k \in \mathbb{N}} \dfrac{1}{k!} A^k$.

[Disclaimer: My facility with even simple analytical concepts is laughably bad, but I’m going to give this a shot. Please let me know what’s wrong with this solution.]

We begin with a lemma.

Lemma: For all $(i,j)$, $|(A^k)_{i,j}| \leq ||A||^k$, where the subscripts denote taking the $(i,j)$ entry. Proof: By the definition of matrix multiplication, we have $(A^k)_{i,j} = \sum_{t} \prod_{m=0}^{k-1} a_{t(m), t(m+1)}$, where $t$ ranges over $\{1,\ldots,n\}^{k-1}$ (which we think of as the set of functions on $\{1,\ldots,k-1\}$), extended by $t(0) = i$ and $t(k) = j$. Now $|(A^k)_{i,j}| \leq \sum_{t} \prod_{m=0}^{k-1} |a_{t(m), t(m+1)}|$ by the triangle inequality. Note that $||A||^k$ is the sum of all possible $k$-fold products of (absolute values of) entries of $A$, and that we have $|(A^k)_{i,j}|$ bounded above by a sum of some distinct $k$-fold products of (absolute values of) entries of $A$. In particular, $|(A^k)_{i,j}| \leq ||A||^k$, since the missing terms are all nonnegative. Thus $|\alpha_k (A^k)_{i,j}| = |(\alpha_k A^k)_{i,j}| \leq |\alpha_k| \cdot ||A||^k$.

Let us define the formal power series $|G|(x)$ by $|G|(x) = \sum |\alpha_k| x^k$. What we have shown (I think) is that $\sum_{k=0}^N |\alpha_k A^k_{i,j}| \leq |G|_N(||A||)$, using the triangle inequality.

Now recall, by the Cauchy-Hadamard theorem, that the radius of convergence of $G(x)$ satisfies $1/R = \mathsf{lim\ sup}_{k \rightarrow \infty} |\alpha_k|^{1/k}$. In particular, $|G|$ has the same radius of convergence as $G$. So in fact the sequence $|G|_N(||A||)$ converges. Now the sequence $\sum_{k=0}^N |\alpha_k A^k_{i,j}|$ is bounded and monotone increasing, and so must also converge. That is, $\sum_{k=0}^N (\alpha_k A^k)_{i,j} = G_N(A)_{i,j}$ is absolutely convergent, and so is convergent. Thus $G(A)$ converges.
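The lemma is easy to spot-check numerically (assuming numpy; the complex matrix below is an arbitrary choice of mine):

```python
import numpy as np

# Check |(A^k)_{i,j}| <= ||A||^k for the entrywise-sum norm.
A = np.array([[1 + 2j, -0.5], [0.3j, -1 + 1j]])
norm = np.abs(A).sum()

Ak = np.eye(2, dtype=complex)
for k in range(1, 11):
    Ak = Ak @ A
    assert np.abs(Ak).max() <= norm**k
```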

Now we will address the convergence of the series $\mathsf{sin}$. Recall that the radius of convergence $R$ of $\mathsf{sin}$ satisfies $R^{-1} = \mathsf{lim\ sup}_{k \rightarrow \infty} |\alpha_k|^{1/k}$ $= \mathsf{lim}_{k \rightarrow \infty} \mathsf{sup}_{n \geq k} |\alpha_n|^{1/n}$. Now $\alpha_k = 0$ if $k$ is even and $(-1)^t/k!$ if $k = 2t+1$ is odd. So $|\alpha_k|^{1/k} = 0$ if $k$ is even and $|1/k!|^{1/k} = 1/(k!)^{1/k} > 0$ if $k$ is odd.

Brief aside: Suppose $k \geq 1$. Certainly $1 \leq k! < (k+1)^k$, so that $(k!)^{1/k} < k+1$. Now $1 \leq (k!)^{1 + 1/k} < (k+1)!$, so that $(k!)^{1/k} < ((k+1)!)^{1/(k+1)}$. In particular, if $n > k$, then $(k!)^{1/k} < (n!)^{1/n}$.

By our brief aside, we have that $\mathsf{sup}_{n \geq k} |\alpha_n|^{1/n} = 1/(k!)^{1/k}$ if $k$ is odd, and $1/((k+1)!)^{1/(k+1)}$ if $k$ is even. So (skipping redundant terms) the radius of convergence of $\mathsf{sin}(x)$ satisfies $R^{-1} = \mathsf{lim}_{k \rightarrow \infty} 1/(k!)^{1/k}$. We know from the brief aside that this sequence is monotone decreasing. Now let $\varepsilon > 0$, and consider the sequence $\varepsilon$, $2 \varepsilon$, $3 \varepsilon$, et cetera. Only finitely many terms of this sequence are less than or equal to 1, so there must exist some $k$ such that the product $\prod_{i=1}^k \varepsilon i$ is greater than 1. Now $\varepsilon^k k! > 1$, so that $\varepsilon > 1/(k!)^{1/k}$. Thus the sequence $1/(k!)^{1/k}$ converges to 0, and so the radius of convergence of $\mathsf{sin}(x)$ is $\infty$.

By a similar argument, the radii of convergence for $\mathsf{cos}(x)$ and $\mathsf{exp}(x)$ are also $\infty$.
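The key fact $1/(k!)^{1/k} \rightarrow 0$ is also easy to see numerically (the cutoff $0.03$ at $k = 100$ is just an illustrative bound of mine):

```python
from math import factorial

# 1/(k!)^(1/k) is decreasing (per the brief aside) and goes to 0.
seq = [1 / factorial(k) ** (1 / k) for k in range(1, 101)]

assert all(x > y for x, y in zip(seq, seq[1:]))   # monotone decreasing
assert seq[-1] < 0.03                             # already small at k = 100
```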

### Properties of a matrix norm

Given an $n \times n$ matrix $A$ over $\mathbb{C}$, define $||A|| = \sum_{i,j} |a_{i,j}|$, where bars denote the complex modulus. Prove the following for all $A,B \in \mathsf{Mat}_n(\mathbb{C})$ and all $\alpha \in \mathbb{C}$.

1. $||A+B|| \leq ||A|| + ||B||$
2. $||AB|| \leq ||A|| \cdot ||B||$
3. $||\alpha A|| = |\alpha| \cdot ||A||$

Say $A = [a_{i,j}]$ and $B = [b_{i,j}]$.

We have $||A+B|| = \sum_{i,j} |a_{i,j} + b_{i,j}|$ $\leq \sum_{i,j} (|a_{i,j}| + |b_{i,j}|)$ by the triangle inequality. Rearranging, we have $||A+B|| \leq \sum_{i,j} |a_{i,j}| + \sum_{i,j} |b_{i,j}|$ $= ||A|| + ||B||$ as desired.

Now $||AB|| = \sum_{i,j} |\sum_k a_{i,k}b_{k,j}|$ $\leq \sum_{i,j} \sum_k |a_{i,k}||b_{k,j}|$ by the triangle inequality. Now $||AB|| \leq \sum_{i,j} \sum_{k,t} |a_{i,k}| |b_{t,j}|$ since all the new terms are nonnegative, and rearranging, we have $||AB|| \leq \sum_{i,k} \sum_{j,t} |a_{i,k}||b_{t,j}|$ $= (\sum_{i,k} |a_{i,k}|)(\sum_{j,t} |b_{t,j}|)$ $= ||A|| \cdot ||B||$.

Finally, we have $||\alpha A|| = \sum_{i,j} |\alpha a_{i,j}|$ $= |\alpha| \sum_{i,j} |a_{i,j}|$ $= |\alpha| \cdot ||A||$.
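All three properties are easy to spot-check numerically (assuming numpy; the random matrices and scalar are arbitrary sample data):

```python
import numpy as np

def norm(A):
    """The entrywise-sum norm ||A|| = sum of |a_{i,j}|."""
    return np.abs(A).sum()

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
alpha = 2 - 3j

assert norm(A + B) <= norm(A) + norm(B)            # subadditivity
assert norm(A @ B) <= norm(A) * norm(B)            # submultiplicativity
assert np.isclose(norm(alpha * A), abs(alpha) * norm(A))  # homogeneity
```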

### Matrices with square roots over fields of characteristic 2

Let $F$ be a field of characteristic 2. Compute the Jordan canonical form of a Jordan block $J$ of size $n$ and eigenvalue $\lambda$ over $F$. Characterize those matrices $A$ over $F$ which are squares; that is, characterize $A$ such that $A = B^2$ for some matrix $B$.

Let $J = [b_{i,j}]$ be the Jordan block with eigenvalue $\lambda$ and size $n$. That is, $b_{i,j} = \lambda$ if $j = i$, $1$ if $j = i+1$, and $0$ otherwise. Now $J^2 = [\sum_k b_{i,k}b_{k,j}]$; if $k \neq i$ and $k \neq i+1$, then $b_{i,k} = 0$. Evidently then we have $(J^2)_{i,j} = \lambda^2$ if $j = i$, $2\lambda$ if $j = i+1$, $1$ if $j = i+2$, and $0$ otherwise; since $F$ has characteristic 2, the superdiagonal entries $2\lambda$ vanish. So $J^2 - \lambda^2 I = \begin{bmatrix} 0 & I \\ 0_2 & 0 \end{bmatrix}$, where $0_2$ is the $2 \times 2$ zero matrix and $I$ is the $(n-2) \times (n-2)$ identity matrix. Now let $v = \begin{bmatrix} V_1 \\ V_2 \end{bmatrix}$, where $V_1$ has dimension $2 \times 1$. Now $(J^2 - \lambda^2 I)v = \begin{bmatrix} V_2 \\ 0 \end{bmatrix}$. That is, $J^2 - \lambda^2 I$ ‘shifts’ the entries of $v$: $e_{i+2} \mapsto e_i$ and $e_1, e_2 \mapsto 0$. In particular, the kernel of $J^2 - \lambda^2 I$ has dimension 2, so that by this previous exercise, the Jordan canonical form of $J^2$ has two blocks (both with eigenvalue $\lambda^2$).

Now $J^2 - \lambda^2 I = (J + \lambda I)(J - \lambda I)$ $= (J - \lambda I)^2$, since $F$ has characteristic 2. Note that $J - \lambda I$ is nilpotent of index $n$, since (evidently) we have $(J - \lambda I)e_{i+1} = e_i$ and $(J - \lambda I)e_1 = 0$, so that $(J - \lambda I)^{n-1} \neq 0$ while $(J - \lambda I)^n = 0$. If $n$ is even, then $(J^2 - \lambda^2 I)^{n/2} = 0$ while $(J^2 - \lambda^2 I)^{n/2-1} \neq 0$, and if $n$ is odd, then $(J^2-\lambda^2 I)^{(n+1)/2} = 0$ while $(J^2 - \lambda^2 I)^{(n+1)/2-1} \neq 0$. So the minimal polynomial of $J^2$ is $(x-\lambda^2)^{n/2}$ if $n$ is even and $(x-\lambda^2)^{(n+1)/2}$ if $n$ is odd.

So the Jordan canonical form of $J^2$ has two Jordan blocks with eigenvalue $\lambda^2$. If $n$ is even, these have size $n/2,n/2$, and if $n$ is odd, they have size $(n+1)/2, (n-1)/2$.
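The structure of $J^2$ is easy to verify computationally over $\mathbb{F}_2$ (here with my own sample size $n = 5$ and $\lambda = 1$): squaring the block kills the superdiagonal ($2\lambda = 0$) and leaves $\lambda^2$ on the diagonal and $1$'s two steps above it.

```python
n, lam = 5, 1  # sample size and eigenvalue in F_2

# Jordan block J over F_2: lam on the diagonal, 1 on the superdiagonal.
J = [[(lam if j == i else 1 if j == i + 1 else 0) for j in range(n)]
     for i in range(n)]
# Square it, reducing mod 2.
J2 = [[sum(J[i][k] * J[k][j] for k in range(n)) % 2 for j in range(n)]
      for i in range(n)]

for i in range(n):
    for j in range(n):
        # lambda^2 = 1 on the diagonal, 1 two steps up, 0 elsewhere.
        assert J2[i][j] == (1 if j == i or j == i + 2 else 0)
```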

Now let $A$ be an arbitrary $n \times n$ matrix over $F$ (with eigenvalues in $F$). We claim that $A$ is a square if and only if the following hold.

1. The eigenvalues of $A$ are square in $F$
2. For each eigenvalue $\lambda$ of $A$, the Jordan blocks with eigenvalue $\lambda$ can be paired up so that the sizes of the blocks in each pair differ by 0 or 1.

To see the ‘if’ part, suppose $P^{-1}AP = \bigoplus (H_i \oplus K_i)$ is in Jordan canonical form, where $H_i$ and $K_i$ are Jordan blocks having the same eigenvalue $\lambda_i$ and whose sizes differ by 0 or 1. By the first half of this exercise, $H_i \oplus K_i$ is the Jordan canonical form of $J_i^2$, where $J_i$ is a Jordan block with eigenvalue $\sqrt{\lambda_i}$ (which exists in $F$ by hypothesis). Letting $J = \bigoplus J_i$, we see that $A$ is similar to $J^2$; say $Q^{-1}AQ = J^2$. Then $A = (QJQ^{-1})^2 = B^2$ is a square.

Conversely, suppose $A = B^2$ is square, and say $P^{-1}BP = J$ is in Jordan canonical form. So $P^{-1}AP = J^2$. Letting $J_i$ denote the Jordan blocks of $J$, we have $P^{-1}AP = \bigoplus J_i^2$. The Jordan canonical form of $J_i^2$ has two blocks with eigenvalue $\lambda_i^2$ and whose sizes differ by 0 or 1, by the first half of this exercise. So the Jordan blocks of $A$ all have eigenvalues which are square in $F$ and can be paired so that the sizes in each pair differ by 0 or 1.
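A small instance of the converse over $\mathbb{F}_2$ (my own choice): take $B$ a single Jordan block of size 3 with eigenvalue 1 and $A = B^2$. The Jordan form of $A$ should then have two blocks of sizes 2 and 1, i.e. $A - I \neq 0$ but $(A-I)^2 = 0$ with $\dim \ker(A-I) = 2$.

```python
n = 3
# B: Jordan block of size 3, eigenvalue 1, over F_2.
B = [[(1 if j in (i, i + 1) else 0) for j in range(n)] for i in range(n)]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) % 2 for j in range(n)]
            for i in range(n)]

A = mul(B, B)
# N = A - I over F_2; its rank is 1, so dim ker(A - I) = 2.
N = [[(A[i][j] - (i == j)) % 2 for j in range(n)] for i in range(n)]

assert any(any(row) for row in N)              # A - I != 0: a block of size >= 2
assert mul(N, N) == [[0] * n for _ in range(n)]  # (A - I)^2 = 0: blocks of size <= 2
```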