Diagonalising a quadratic Fermion Hamiltonian

It’s even easier than you probably realised.

I’m talking about a Hamiltonian such as $$H=a_1^\dagger a_1^{\phantom\dagger} + a_2^{\phantom\dagger}a_2^\dagger + 2a_1^\dagger a_2^\dagger + 2a_2^{\phantom\dagger}a_1^{\phantom\dagger},$$ where the $a_i$ satisfy the canonical anti-commutation relations $$\{a_i^{\phantom\dagger},a_j^\dagger\}=\delta_{ij},\quad \{a_i^{\phantom\dagger},a_j^{\phantom\dagger}\}=0,\quad \{a_i^\dagger,a_j^\dagger\}=0.$$ And yes, we will allow terms like $a_i^\dagger a_j^\dagger$ that are not number preserving. They appear commonly in the theory of superconductivity.

There is lots of literature on this; see below. And that’s fine, but I wanted a very short recipe, which I found in the following paper:

  • J. L. van Hemmen, Z. Physik B – Condensed Matter 38, 271–277 (1980). doi:10.1007/BF01315667.

It’s the most concise treatment I could find, so here we go.

Conventions: $a^\dagger$ is the Hermitian conjugate, or adjoint, of $a$; $\bar x$ is the complex conjugate of $x$; and $X^t$ denotes the transpose of $X$.

First, let’s write the Hamiltonian in the form $$H=\sum_{i,j=1}^n\bigl(A_{ij}a_i^\dagger a_j^{\phantom\dagger} + B_{ij}a_i^\dagger a_j^\dagger - \overline{B_{ij}}a_i^{\phantom\dagger} a_j^{\phantom\dagger} - A_{ji}a_i^{\phantom\dagger} a_j^\dagger \bigr)$$ with $$A^\dagger=A,\qquad B=-B^t.$$ (This is without loss of generality, and I’ll explain below what to do if your favourite Hamiltonian isn’t of this form yet.)
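
(A quick sanity check that such an $H$ is Hermitian: the adjoint of the pair-creation term is $$\Bigl(\sum_{i,j}B_{ij}a_i^\dagger a_j^\dagger\Bigr)^\dagger=\sum_{i,j}\overline{B_{ij}}\,a_j^{\phantom\dagger}a_i^{\phantom\dagger}=-\sum_{i,j}\overline{B_{ij}}\,a_i^{\phantom\dagger}a_j^{\phantom\dagger},$$ which is exactly the third term, and the $a^\dagger a$ and $aa^\dagger$ terms are individually Hermitian because $A^\dagger=A$.)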

As a second preparation, we define the anti-unitary operator $$J\begin{pmatrix} u\\v \end{pmatrix}=\begin{pmatrix} \bar v\\\bar u \end{pmatrix}$$ for any $u,v\in\mathbb{C}^n$. That’s nothing but a ‘conjugate swap’.

Now all we have to do is the following:

Step 1 of 2

Diagonalise the matrix $$D=\begin{pmatrix}A&B\\-\bar B&-\bar A\end{pmatrix},$$ and collect its eigenvalues together with an orthonormal set of eigenvectors. (Count eigenvalues with multiplicity, so that there are exactly $2n$ of them, each with its own eigenvector.) For the moment, assume that all eigenvalues are non-zero.

It turns out that for every eigenvector $x_i$ with eigenvalue $\omega_i>0$, $Jx_i$ is an eigenvector with eigenvalue $-\omega_i$; this is because $J$ anti-commutes with $D$, i.e. $JD=-DJ$. Therefore, we can arrange the eigenvectors in pairs $(x_i, Jx_i)$, where $x_i$ has eigenvalue $\omega_i>0$ and $Jx_i$ has eigenvalue $-\omega_i$.
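
If you want to see this pairing in action, here is a minimal numerical sketch (the variable names are mine, not van Hemmen’s): build $D$ from a random Hermitian $A$ and a random antisymmetric $B$, and check that the spectrum is symmetric about zero.

n = 3;
m1 = RandomComplex[{-1 - I, 1 + I}, {n, n}];
m2 = RandomComplex[{-1 - I, 1 + I}, {n, n}];
a = (m1 + ConjugateTranspose[m1])/2; (* Hermitian A *)
b = (m2 - Transpose[m2])/2;          (* antisymmetric B *)
d = ArrayFlatten[{{a, b}, {-Conjugate[b], -Conjugate[a]}}];
ev = Sort[Eigenvalues[d]];
Chop[ev + Reverse[ev]] (* all zeros: eigenvalues come in ±ω pairs *)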

Step 2 of 2

Form the matrix $$T=(x_1, \dotsc, x_n, Jx_1, \dotsc, Jx_n).$$

Done. That’s it.

(If some eigenvalues are zero and $k=\dim\ker D$ is the dimension of the corresponding eigenspace, you also need to pick $k/2$ orthonormal eigenvectors $y_1,\dotsc, y_{k/2}$ in that subspace, so that you can add the $y_i$ and $Jy_i$ as columns to $T$; note that $k$ is necessarily even, since the non-zero eigenvalues come in pairs. In this case, some additional work might be necessary to ensure that the final $T$ matrix has orthonormal columns, but this is rarely a problem in practice.)


The matrix $T$ is unitary and diagonalises $D$, or more specifically, $$T^\dagger D T=\operatorname{diag}(\omega_1, \dotsc, \omega_n, -\omega_1, \dotsc, -\omega_n),$$ and the transformed operators again satisfy the canonical anti-commutation relations. They are $$b_i=\sum_{j=1}^n T^\dagger_{ij}a_j^{\phantom\dagger}+\sum_{j=n+1}^{2n} T^\dagger_{ij}a_{j-n}^\dagger,\quad b_i^\dagger=\sum_{j=1}^n T^\dagger_{i+n,j}a_j^{\phantom\dagger}+\sum_{j=n+1}^{2n} T^\dagger_{i+n,j}a_{j-n}^\dagger.$$ Conversely, by setting $$a_i=\sum_{j=1}^n T_{ij}b_j^{\phantom\dagger}+\sum_{j=n+1}^{2n} T_{ij}b_{j-n}^\dagger,\quad a_i^\dagger=\sum_{j=1}^n T_{i+n,j}b_j^{\phantom\dagger}+\sum_{j=n+1}^{2n} T_{i+n,j}b_{j-n}^\dagger,$$ we can now rewrite the Hamiltonian, up to an additive constant, in the diagonal form $$H=\sum_{i=1}^n2\omega_i b_i^\dagger b_i^{\phantom\dagger}.$$
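
One quick way to see part of this: write $\alpha=(a_1^{\phantom\dagger},\dotsc,a_n^{\phantom\dagger},a_1^\dagger,\dotsc,a_n^\dagger)^t$, so that the relations above read $b=T^\dagger\alpha$ with $b=(b_1^{\phantom\dagger},\dotsc,b_n^{\phantom\dagger},b_1^\dagger,\dotsc,b_n^\dagger)^t$. Since $\{\alpha_k^{\phantom\dagger},\alpha_l^\dagger\}=\delta_{kl}$, the unitarity of $T$ alone gives $$\{b_i^{\phantom\dagger},b_j^\dagger\}=\sum_{k,l}T^\dagger_{ik}T_{lj}\{\alpha_k^{\phantom\dagger},\alpha_l^\dagger\}=(T^\dagger T)_{ij}=\delta_{ij};$$ the remaining relation $\{b_i^{\phantom\dagger},b_j^{\phantom\dagger}\}=0$ additionally uses the paired column structure of $T$ (more on that in the last section).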

For the details on why all of this is true, just read Sections 2 and 3 of van Hemmen’s paper.

🚑 If your Hamiltonian doesn’t have the right form

Let’s say the Hamiltonian contains only the term $a^\dagger a$, but no term $aa^\dagger$. Then we can easily make one by using the anti-commutation relation: $$ a^\dagger a = \tfrac{1}{2} a^\dagger a + \tfrac{1}{2} a^\dagger a= \tfrac{1}{2} a^\dagger a - \tfrac{1}{2} aa^\dagger + \tfrac{1}{2}. $$ Note how the prefactors of $a^\dagger a$ and $aa^\dagger$ now have opposite signs: just what we need to get $H$ into the right form. And we can simply drop the constant term $\frac{1}{2}$ at the end, since it merely shifts the whole spectrum of $H$.

To summarise: If no $aa^\dagger$ is present, replace the $a^\dagger a$ term with $\frac{1}{2} a^\dagger a - \frac{1}{2} aa^\dagger$. Similarly for $a^\dagger a^\dagger$ and so on.
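
For instance, for $i\neq j$, $$a_i^\dagger a_j^\dagger = \tfrac{1}{2}a_i^\dagger a_j^\dagger - \tfrac{1}{2}a_j^\dagger a_i^\dagger,$$ this time without any constant, since $\{a_i^\dagger,a_j^\dagger\}=0$.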

📝 An example

Back to our initial example $$H=a_1^\dagger a_1^{\phantom\dagger} + a_2^{\phantom\dagger}a_2^\dagger + 2a_1^\dagger a_2^\dagger + 2a_2^{\phantom\dagger}a_1^{\phantom\dagger}.$$ As mentioned, we have to bring it into the right form, $$\begin{align}H&=\tfrac{1}{2}a_1^\dagger a_1^{\phantom\dagger}-\tfrac{1}{2}a_1^{\phantom\dagger} a_1^\dagger+\tfrac{1}{2}a_2^{\phantom\dagger}a_2^\dagger-\tfrac{1}{2}a_2^\dagger a_2^{\phantom\dagger}\\&\quad+a_1^\dagger a_2^\dagger - a_2^\dagger a_1^\dagger + a_2^{\phantom\dagger}a_1^{\phantom\dagger}-a_1^{\phantom\dagger}a_2^{\phantom\dagger},\end{align}$$ which differs from the original one only by an additive constant (omitted here). Now we can read off the matrices $$A=\begin{pmatrix}\frac{1}{2}&0\\0&-\frac{1}{2}\end{pmatrix},\quad B=\begin{pmatrix}0&1\\-1&0\end{pmatrix}.$$ Therefore, $$D=\begin{pmatrix}\frac{1}{2}&0&0&1\\0&-\frac{1}{2}&-1&0\\0&-1&-\frac{1}{2}&0\\1&0&0&\frac{1}{2}\end{pmatrix}.$$ This matrix has the two positive eigenvalues $\omega_1=\frac{1}{2}$ and $\omega_2=\frac{3}{2}$; the corresponding normalised eigenvectors are $$x_1=\tfrac{1}{\sqrt 2}(0,-1,1,0)^t,\quad x_2=\tfrac{1}{\sqrt 2}(1,0,0,1)^t.$$ Now the transformation matrix is $$T=(x_1,x_2,Jx_1,Jx_2)=\frac{1}{\sqrt 2}\begin{pmatrix}0&1&1&0\\-1&0&0&1\\1&0&0&1\\0&1&-1&0\end{pmatrix}.$$ This means that $$a_1=\tfrac{1}{\sqrt 2}(b_2+b_1^\dagger),\quad a_2=\tfrac{1}{\sqrt 2}(-b_1+b_2^\dagger)$$ and $$H=b_1^\dagger b_1^{\phantom\dagger} + 3 b_2^\dagger b_2^{\phantom\dagger} + \mathrm{const}.$$
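
If you’d rather let the computer confirm the arithmetic, a few lines of Mathematica will do (a sanity check only, not part of the recipe):

d = {{1/2, 0, 0, 1}, {0, -1/2, -1, 0}, {0, -1, -1/2, 0}, {1, 0, 0, 1/2}};
t = {{0, 1, 1, 0}, {-1, 0, 0, 1}, {1, 0, 0, 1}, {0, 1, -1, 0}}/Sqrt[2];
ConjugateTranspose[t] . d . t (* diag(1/2, 3/2, -1/2, -3/2) *)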

🤯 Another example that initially confused me

When I wrote the first draft for this post, I randomly picked the Hamiltonian $$H=a_1^\dagger a_1^{\phantom\dagger}+a_2^{\phantom\dagger}a_2^\dagger+a_1^\dagger a_2^\dagger + a_2^{\phantom\dagger}a_1^{\phantom\dagger}$$ which you can bring into the form (up to an additive constant, as usual) $$H=\tfrac{1}{2}(a_1^\dagger a_1^{\phantom\dagger}-a_1^{\phantom\dagger} a_1^\dagger+a_2^{\phantom\dagger}a_2^\dagger-a_2^\dagger a_2^{\phantom\dagger}+a_1^\dagger a_2^\dagger - a_2^\dagger a_1^\dagger + a_2^{\phantom\dagger}a_1^{\phantom\dagger}-a_1^{\phantom\dagger}a_2^{\phantom\dagger}).$$ If you do all of the above steps—diagonalising the matrix $D$ and ordering its eigenvectors to form the matrix $T$—you eventually end up with the transformed Hamiltonian $$H=2b^\dagger b.$$ Yes. Only one mode. Where did the other one go?

It turns out that $D$ doesn’t have full rank in this case, so the original Hamiltonian was just a very unfortunate way of writing down a single number operator. In fact, you can easily see what happened if you just notice that the original Hamiltonian factors as $$H=a_1^\dagger(a_1^{\phantom\dagger}+a_2^\dagger)+a_2^{\phantom\dagger}(a_2^\dagger+a_1^{\phantom\dagger})=(a_2^{\phantom\dagger}+a_1^\dagger)(a_1^{\phantom\dagger}+a_2^\dagger).$$ Indeed, setting $b=(a_1^{\phantom\dagger}+a_2^\dagger)/\sqrt 2$, which satisfies $\{b^{\phantom\dagger},b^\dagger\}=1$, gives exactly $H=2b^\dagger b$. So everything is fine and works as expected.
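
You can also see the rank deficiency directly. Reading off $A=\operatorname{diag}(\tfrac{1}{2},-\tfrac{1}{2})$ and $B_{12}=\tfrac{1}{2}=-B_{21}$ from the symmetrised form above gives

d = {{1/2, 0, 0, 1/2}, {0, -1/2, -1/2, 0}, {0, -1/2, -1/2, 0}, {1/2, 0, 0, 1/2}};
Eigenvalues[d] (* 1, -1, 0, 0 (possibly in another order): only one mode is left *)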

🕹️ With the computer

Let’s quickly jot this down in Mathematica. I will assume that $D$ has full rank because the case of zero eigenvalues is extremely annoying. First, two little helper functions. One to fill out the block matrix $D$, given $A$ and $B$:

DMatrix[a_, b_] := 
  ArrayFlatten[{{a, b}, {-Conjugate@b, -Conjugate@a}}]
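
For the first example, DMatrix[{{1/2, 0}, {0, -1/2}}, {{0, 1}, {-1, 0}}] reproduces the matrix $D$ that we computed by hand.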

And the other one is the operator $J$:

JSwap[x_] := Conjugate@Flatten@TakeList[x, {-Length[x]/2, Length[x]/2}]
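
A quick check that this matches the definition of $J$: JSwap[{1, 2 I, 3, 4}] gives {3, 4, 1, -2 I}, i.e. the two halves are swapped and conjugated.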

The function FermiDiagonalization takes as input the eigensystem (that is, eigenvalues & eigenvectors) of $D$ as returned by Mathematica’s Eigensystem function, and outputs the matrix $T$:

FermiDiagonalization[eigensys_] := Transpose[
  Join[#, JSwap /@ #] &@
    Extract[eigensys[[2]], Position[eigensys[[1]], _?Positive, {1}]]
  ]

The idea is to filter out the indices of the positive eigenvalues in the eigenvalue list using Position; we can then pick out the corresponding eigenvectors using Extract. They are then combined in one matrix together with their negative-eigenvalue counterparts.
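
Putting it all together for the first example (a quick sanity check; note the N@, since for exact input Eigensystem returns unnormalised eigenvectors, while for a numeric Hermitian matrix they come out orthonormal):

a = {{1/2, 0}, {0, -1/2}};
b = {{0, 1}, {-1, 0}};
d = DMatrix[a, b];
t = FermiDiagonalization[Eigensystem[N@d]];
Chop[ConjugateTranspose[t] . t]     (* identity: T is unitary *)
Chop[ConjugateTranspose[t] . d . t] (* diagonal, with the ±ω pairs *)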

📚 Other sources of wisdom

In general, these kinds of transformations appear to be called canonical transformations, Bogoliubov transformations or Bogoliubov-Valatin transformations. A standard reference is Chapters 2 & 3 of

  • J. P. Blaizot, G. Ripka. Quantum Theory of Finite Systems. The MIT Press (1986).

I also recommend Nielsen’s very readable (not peer-reviewed) notes on the subject.

Nielsen claims that it is necessary to bring the transformation matrix $T$ into an appropriate block form, so that the transformed Fermion operators also satisfy the canonical anti-commutation relations.

The point of van Hemmen’s paper is that the transformation matrix already has the desired form after an appropriate reordering of its columns. (Except when some eigenvalues are zero, in which case you might need to pick an appropriate eigenspace basis.)