Diagonalising a quadratic Fermion Hamiltonian

It’s even easier than you probably realised.

I’m talking about a Hamiltonian such as $$H=a_1^\dagger a_1^{\phantom\dagger} + a_2^{\phantom\dagger}a_2^\dagger + 2a_1^\dagger a_2^\dagger + 2a_2^{\phantom\dagger}a_1^{\phantom\dagger},$$ where the $a_i$ satisfy the canonical anti-commutation relations $$\{a_i^{\phantom\dagger},a_j^\dagger\}=\delta_{ij},\quad \{a_i^{\phantom\dagger},a_j^{\phantom\dagger}\}=0,\quad \{a_i^\dagger,a_j^\dagger\}=0.$$ And yes, we will allow terms like $a_i^\dagger a_j^\dagger$ that are not number preserving. They appear commonly in the theory of superconductivity.

There is lots of literature on this, see below. And that’s fine, but I wanted a very short recipe, which I found in the following paper:

  • J. L. van Hemmen, Z. Physik B–Consensed Matter 38, 271–277 (1980). doi:10.1007/BF01315667.

It’s the most concise treatment I could find, so here we go.

Conventions: $a^\dagger$ is the Hermitian conjugate, or adjoint, of $a$; $\bar x$ is the complex conjugate of $x$; and $X^t$ denotes the transpose of $X$.

First, let’s write the Hamiltonian in the form $$H=\sum_{i,j=1}^n\bigl(A_{ij}a_i^\dagger a_j^{\phantom\dagger} + B_{ij}a_i^\dagger a_j^\dagger + \overline{B_{ji}}a_i^{\phantom\dagger} a_j^{\phantom\dagger} - A_{ji}a_i^{\phantom\dagger} a_j^\dagger \bigr)$$ with $$A^\dagger=A,\qquad B=-B^t.$$ (This is without loss of generality, and I’ll explain below what to do if your favourite Hamiltonian isn’t of this form yet.)

As a second preparation, we define the anti-unitary operator $$J\begin{pmatrix} u\\v \end{pmatrix}=\begin{pmatrix} \bar v\\\bar u \end{pmatrix}$$ for any $u,v\in\mathbb{C}^n$. That’s nothing but a ‘conjugate swap’.

Now all we have to do is the following:

Step 1 of 2

Diagonalise the matrix $$D=\begin{pmatrix}A&B\\-\bar B&-\bar A\end{pmatrix},$$ and consider its eigenvalues and set of orthonormal eigenvectors. (If necessary, repeat identical eigenvalues for distinct eigenvectors, so that we definitely have a total of $2n$ eigenvalues.) For the moment, assume that all eigenvalues are non-zero.

It turns out that for every eigenvector $x_i$ with eigenvalue $\omega_i>0$, $Jx_i$ is an eigenvector with eigenvalue $-\omega_i$. Therefore, we can arrange the eigenvectors in pairs $(x_i, Jx_i)$, where $x_i$ has eigenvalue $\omega_i>0$ and $Jx_i$ has eigenvalue $-\omega_i$.

Step 2 of 2

Form the matrix $$T=(x_1, \dotsc, x_n, Jx_1, \dotsc, Jx_n).$$

Done. That’s it.

(If some eigenvalues are zero and $k=\dim\ker D$ is the dimension of the corresponding eigenspace, you also need to pick $k/2$ orthonormal eigenvectors $y_1,\dotsc, y_k$ in that subspace, so that you can add the $y_i$ and $Jy_i$ as columns to $T$. In this case, some additional work might be necessary to ensure that the final $T$ matrix has orthonormal columns—but this is rarely a problem in practice.)

The matrix $T$ is unitary and diagonalises $D$, or more specifically, $$T^\dagger D T=\operatorname{diag}(\omega_1, \dotsc, \omega_n, -\omega_1, \dotsc, -\omega_n),$$ and it also leads to canonical anti-commutation relations. The transformed operators are $$b_i=\sum_{j=1}^n T^\dagger_{ij}a_j^{\phantom\dagger}+\sum_{j=n+1}^{2n} T^\dagger_{ij}a_j^\dagger,\quad b_i^\dagger=\sum_{j=1}^n T^\dagger_{i+n,j}a_j^{\phantom\dagger}+\sum_{j=n+1}^{2n} T^\dagger_{i+n,j}a_j^\dagger.$$ Conversely, by setting $$a_i=\sum_{j=1}^n T_{ij}b_j^{\phantom\dagger}+\sum_{j=n+1}^{2n} T_{ij}b_j^\dagger,\quad a_i^\dagger=\sum_{j=1}^n T_{i+n,j}b_j^{\phantom\dagger}+\sum_{j=n+1}^{2n} T_{i+n,j}b_j^\dagger,$$ we can now rewrite the Hamiltonian, up to an additive constant, in the diagonal form $$H=\sum_{i=1}^n2\omega_i b_i^\dagger b_i^{\phantom\dagger}.$$

For the details on why all of this is true, just read Section 2 and 3 of van Hemmen’s paper.

🚑 If your Hamiltonian doesn’t have the right form

Let’s say the Hamiltonian contains only the term $a^\dagger a$, but no term $aa^\dagger$. Then we can easily make one by using the anti-commutation relation: $$ a^\dagger a = \tfrac{1}{2} a^\dagger a + \tfrac{1}{2} a^\dagger a= \tfrac{1}{2} a^\dagger a - \tfrac{1}{2} aa^\dagger + \tfrac{1}{2}. $$ Look how the prefactors of $a^\dagger a$ and $aa^\dagger$ have opposite signs—just what we needed to get $H$ into the right form. Of course, we can just ignore the constant term $\frac{1}{2}$ at the end since it only offsets the whole spectrum of $H$ by a constant.

To summarise: If no $aa^\dagger$ is present, replace the $a^\dagger a$ term with $\frac{1}{2} a^\dagger a - \frac{1}{2} aa^\dagger$. Similarly for $a^\dagger a^\dagger$ and so on.

📝 An example

Back to our initial example $$H=a_1^\dagger a_1^{\phantom\dagger} + a_2^{\phantom\dagger}a_2^\dagger + 2a_1^\dagger a_2^\dagger + 2a_2^{\phantom\dagger}a_1^{\phantom\dagger}.$$ As mentioned, we have to bring it into a nice form, $$\begin{align}H&=\tfrac{1}{2}a_1^\dagger a_1^{\phantom\dagger}-\tfrac{1}{2}a_1^{\phantom\dagger} a_1^\dagger+\tfrac{1}{2}a_2^{\phantom\dagger}a_2^\dagger-\tfrac{1}{2}a_2^\dagger a_2^{\phantom\dagger}\\&\quad+a_1^\dagger a_2^\dagger - a_2^\dagger a_1^\dagger + a_2^{\phantom\dagger}a_1^{\phantom\dagger}-a_1^{\phantom\dagger}a_2^{\phantom\dagger},\end{align}$$ which differs from the original one only by an additive constant (omitted here). Now we can read off the matrices $$A=\begin{pmatrix}\frac{1}{2}&0\\0&-\frac{1}{2}\end{pmatrix},\quad B=\begin{pmatrix}0&1\\-1&0\end{pmatrix}.$$ Therefore, $$D=\begin{pmatrix}\frac{1}{2}&0&0&1\\0&-\frac{1}{2}&-1&0\\0&-1&-\frac{1}{2}&0\\1&0&0&\frac{1}{2}\end{pmatrix}.$$ This matrix has the two positive eigenvalues $\omega_1=\frac{1}{2}$ and $\omega_2=\frac{3}{2}$; the corresponding normalised eigenvectors are $$x_1=\tfrac{1}{\sqrt 2}(0,-1,1,0)^t,\quad x_2=\tfrac{1}{\sqrt 2}(1,0,0,1)^t.$$ Now the transformation matrix is $$T=(x_1,x_2,Jx_1,Jx_2)=\frac{1}{\sqrt 2}\begin{pmatrix}0&1&1&0\\-1&0&0&1\\1&0&0&1\\0&1&-1&0\end{pmatrix}.$$ This means that $$a_1=\tfrac{1}{\sqrt 2}(b_2+b_1^\dagger),\quad a_2=\tfrac{1}{\sqrt 2}(-b_1+b_2^\dagger)$$ and $$H=b_1^\dagger b_1^{\phantom\dagger} + 3 b_2^\dagger b_2^{\phantom\dagger} + \mathrm{const}.$$

🤯 Another example that initially confused me

When I wrote the first draft for this post, I randomly picked the Hamiltonian $$H=a_1^\dagger a_1^{\phantom\dagger}+a_2^{\phantom\dagger}a_2^\dagger+a_1^\dagger a_2^\dagger + a_2^{\phantom\dagger}a_1^{\phantom\dagger}$$ which you can bring into the form (up to an additive constant, as usual) $$H=\tfrac{1}{2}(a_1^\dagger a_1^{\phantom\dagger}-a_1^{\phantom\dagger} a_1^\dagger+a_2^{\phantom\dagger}a_2^\dagger-a_2^\dagger a_2^{\phantom\dagger}+a_1^\dagger a_2^\dagger - a_2^\dagger a_1^\dagger + a_2^{\phantom\dagger}a_1^{\phantom\dagger}-a_1^{\phantom\dagger}a_2^{\phantom\dagger}).$$ If you do all of the above steps—diagonalising the matrix $D$ and ordering its eigenvectors to form the matrix $T$—you eventually end up with the transformed Hamiltonian $$H=2b^\dagger b.$$ Yes. Only one mode. Where did the other one go?

It turns out, $D$ doesn’t have full rank in this case, so the original Hamiltonian was just a very unfortunate way of writing down a single number operator. In fact, you can easily see what happened if you just notice that the original Hamiltonian factors as $$H=a_1^\dagger(a_1^{\phantom\dagger}+a_2^\dagger)+a_2^{\phantom\dagger}(a_2^\dagger+a_1^{\phantom\dagger})=(a_2^{\phantom\dagger}+a_1^\dagger)(a_1^{\phantom\dagger}+a_2^\dagger).$$ So everything is fine and works as expected.

🕹️ With the computer

Let’s quickly jot this down in Mathematica. I will assume that $D$ has full rank because the case of zero eigenvalues is extremely annoying. First, two little helper functions. One to fill out the block matrix $D$, given $A$ and $B$:

DMatrix[a_, b_] := 
  ArrayFlatten[{{a, b}, {-Conjugate@b, -Conjugate@a}}]

And the other one is the operator $J$:

JSwap[x_] := Flatten@TakeList[x, {-Length[x]/2, Length[x]/2}]

The function FermiDiagonalization takes as input the eigensystem (that is, eigenvalues & eigenvectors) of $D$ as returned by Mathematica’s Eigensystem function, and outputs the matrix $T$:

FermiDiagonalization[eigensys_] := Transpose[
  Join[#, JSwap /@ #] &@
    Extract[eigensys[[2]], Position[eigensys[[1]], _?Positive, {1}]]

The idea is to filter out the indices of the positive eigenvalues in the eigenvalue list using Position; we can then pick out the corresponding eigenvectors using Extract. They are then combined in one matrix together with their negative-eigenvalue counterparts.

📚 Other sources of wisdom

In general, these kinds of transformations appear to be called canonical transformations, Bogoliubov transformations or Bogoliubov-Valatin transformations. A standard reference is Chapters 2 & 3 of

  • J. P. Blaizot, G. Ripka. Quantum Theory of Finite Systems. The MIT Press (1986).

I recommend the very readable (not peer-reviewed) notes

Nielsen claims that it is necessary to bring the transformation matrix $T$ into the appropriate block form, so that the new transformed Fermion operators also satisfy the anti-commutation relations.

The point of van Hemmen’s paper is that you basically only have to apply the $J$ operator to the eigenvectors to obtain the desired form. (Except when some eigenvalues are zero, in which case you might need to pick an appropriate basis of the kernel of $D$.)