The State of Quantum RAM: An Enlightening New Investigation

The realization of quantum RAM, aka QRAM, would be a game changer for the quantum industry. Very promising proposals from almost two decades ago initially inspired quantum algorithm engineers. But the lack of success stories has disillusioned many by now. Information about the state of QRAM is rare and hardly known. A highly interesting, recent preprint brings light into the dark: a thorough investigation of the state of quantum RAM and its fundamental challenges. Yet, the text is also long and rich in technical details. In this article you will find an overview of my lessons learned from studying the survey. I have simplified the selected highlights to help you get into the subject yourself.

Symbolic picture of QRAM

Image by Gerd Altmann from Pixabay

QRAM: The missing Game Changer

Data plays a central role in universal computation. It is data that breathes life into most algorithms. Making data memory fast and randomly accessible was a major step in the evolution of digital computers: RAM. The same holds for quantum computation. Unfortunately, here the situation for quantum RAM, aka QRAM, is even more complicated:

Imagine a superfast quantum algorithm that has to perform a lookup on a “table” $ T $ of $ N = 2^n $ bits of data. We want to store this table and make it randomly accessible for a quantum computer. In the quantum world, a QRAM lookup of the (classical) data has to perform the following operation on an address register A and an output register O

\lvert i \rangle_A \lvert 0 \rangle_O \longrightarrow \lvert i \rangle_A \lvert T_i \rangle_O

A simple circuit implementation to access the table is a series of controlled operations of the type

IF address register = i: 
   THEN add T_i to the output register

For such a lookup, a total of $ O(N) $ gates is executed. But this would destroy any superpolynomial speedup in $ N $ of the algorithm!
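To make the lookup operation concrete, here is a minimal classical simulation of the lookup unitary on a state vector (plain NumPy, one output qubit; the basis encoding and names are my own illustration, not from the survey):

```python
import numpy as np

def qram_lookup(state, table):
    """Apply |i>_A |o>_O -> |i>_A |o XOR T_i>_O to a state vector.

    Basis states are indexed as 2*i + o for address i and output bit o.
    The loop over all N addresses mirrors the O(N) gate count of the
    naive circuit implementation."""
    out = np.zeros_like(state)
    for i, t in enumerate(table):
        for o in (0, 1):
            out[2 * i + (o ^ t)] = state[2 * i + o]
    return out

# Example: uniform superposition over 4 addresses, output register |0>
table = [0, 1, 1, 0]
state = np.zeros(8)
state[::2] = 0.5          # amplitude 1/2 on |i>|0> for each address i
state = qram_lookup(state, table)
print(state)              # amplitude 1/2 now sits on each |i>|T_i>
```

Note how a single application touches all $ N $ table entries, which is exactly the cost problem discussed above.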

So, to maintain a superfast quantum speedup, you need to have a superefficient QRAM. Exactly such a QRAM was sketched in the groundbreaking “bucket-brigade” paper by Giovannetti, Lloyd and Maccone in 2008 i – more about this later. Afterwards, QRAM played a key role in various superfast quantum algorithms. The most famous among them is probably the HHL algorithm for solving linear systems of equations, which achieves a runtime of $ O(\log N) $. Along the way, QRAM was one of the concepts that helped to drive the first wave of quantum machine learning algorithms. But the problem was: the physical realization of the bucket-brigade QRAM was not in sight. This and other constraints led Scott Aaronson to demand “Quantum Machine Learning Algorithms: Read the Fine Print” in 2015 ii. In it, he also coined the term “passive” QRAM. Afterwards, not only did the first wave of quantum machine learning algorithms ebb significantly – the enthusiasm for QRAM dropped as well.

“QRAM: A Survey and Critique”: Lifting the Fog

In my impression the situation has not changed and is still … very foggy. Once in a while I hear the rather wistful question “… and what about QRAM?”, but not even the hardware teams seem to have a real answer. This is all very disappointing.

Therefore, I was very surprised to stumble upon a recent preprint from May 2023 by Samuel Jaques and Arthur Rattew (University of Oxford) with a pretty remarkable rating of 58 on Scirate iii. Its title, “QRAM: A Survey and Critique”, immediately got me hooked.

The study is a thorough investigation of the concepts, challenges and use cases for QRAM. It discusses several strategies for hardware and software implementations. The authors also propose regimes where QRAM could work and encourage scenarios for further research. The preprint has finally given me an understanding of the fundamental problems that the realization of QRAM faces. But I was also surprised to find out that there exist promising works that incorporate a realistic concept of QRAM.

As a survey of various papers, the preprint is a broad collection of technical details, and even the main text is about 30 pages long. So, if you are also curious about the subject but would like to speed things up, the following text provides a personal overview of my lessons learned from reading it. For the purpose of this article I have rearranged and simplified the selected highlights of the work. This should help you get started with the survey yourself.

The Bucket-Brigade: The Mother of all QRAM

But first, let's step back for a second to the mother of all quantum RAM: the “bucket-brigade”. The original work also sketches a “proof of principle” implementation. This is how it works:

A binary tree

A binary tree in general
Image by Rune Magnussen via Wikimedia

The implementation is a binary tree of trapped ions or atoms. Each atom is restricted to three states: “wait”, “left” and “right”. Thus, each node is a “qutrit”. In the “wait” state, the node is inactive. This is the initial state of all nodes. The excited states “left” and “right” couple to the left and right spatial directions of the next layer. The coupling is mediated through photons in the following way:

If a quantum processor needs to access an address encoded as qubits in an address register, each qubit in the address register generates a photon, from the most significant to the least significant qubit. The state of the qubit is translated into the photon’s polarization states $ \lvert left \rangle_P$ and $ \lvert right \rangle_P$. If this photon reaches a node, the following can happen:

    • $ \lvert W \rangle_A $: If the node’s atom is in the wait-state, the photon gets absorbed and generates an excited state $ \lvert L \rangle_A $ or $ \lvert R \rangle_A $ depending on its own polarization

    • $ \lvert L \rangle_A $: If the node’s atom is in the left-state, the photon is deflected to the node on the left spatial direction of the next layer, using stimulated Raman emission, without changing its own state or the state of the forwarding atom. At the next node, the same three cases are possible.

    • $ \lvert R \rangle_A $: If the atom is in the right-state, the photon is deflected to the node on the right spatial direction of the next layer without changing states as above. Again, at the next node, the same three cases are possible.

This will initialize a path of active nodes through the binary tree according to the qubits in the address register, from most significant to least significant: a bucket brigade. Once all address qubits have been sent this way, the data bit in the leaf is read out by generating a polarized photon, which gets routed in the same way but from bottom to top.
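For intuition, the routing of a single classical address can be sketched in a few lines (a toy model of my own; it ignores superposition and all the physics, and only tracks which nodes leave the wait state):

```python
def route_address(address_bits):
    """Toy bucket-brigade routing for one classical address (MSB first).

    Each incoming bit excites the first still-waiting node on the path
    (wait -> left/right); later bits are deflected according to the
    already-excited nodes. Returns the reached leaf index and the
    activated path."""
    path = []            # activated nodes as (layer, position, state)
    pos = 0              # position within the current layer
    for layer, bit in enumerate(address_bits):
        path.append((layer, pos, "R" if bit else "L"))
        pos = 2 * pos + bit
    return pos, path

leaf, path = route_address([1, 0, 1])   # address 0b101
print(leaf)                             # -> 5
```

Note that only $ n $ of the $ 2^n - 1 $ nodes get activated, which is the reason why errors propagate only along the active path.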

A nice feature of the bucket-brigade model is that an error in any node affects the outcome only logarithmically, as only the nodes in the active path are able to propagate errors all the way up. The authors state that this may be improved even further by error correcting the top layers (somehow), as errors at this stage have a greater impact.

As appealing as the bucket brigade is, it also illustrates all the problems that challenge the realization of QRAM.

Active versus Passive Gate-QRAM


In their preprint, Jaques and Rattew call a specialized hardware device like the one described above “gate-QRAM”. It is absolutely crucial that such a device executes with as few interventions as possible. Ideally, such a system evolves passively through a single time-independent Hamiltonian. The authors call this behavior “ballistic” and compare it to a Galton box:

A Galton box

Galton box, an example of a “ballistic” device
Image by Rodrigo Argenton via Wikimedia

They formulate the key argument of their work as follows:

“In the active model, there is a powerful opportunity cost argument: in many applications, one could repurpose the control hardware for the qubits in the QRAM (or the qubits themselves) to run an extremely parallel classical algorithm to achieve the same results just as fast.” If gate-QRAM needs $O(N)$ interventions, then it is qualitatively no better than “circuit-QRAM”, a software implementation like the simple one sketched above (more about this further down). For example, the original bucket-brigade paper suggests stimulated Raman emission for the atom-photon interactions, which leads Jaques and Rattew to judge it an active device. More generally, the preprint formalizes the problem of active interventions in a theorem that indicates a trade-off between the number of interventions on one side and time and energy on the other: if one is fixed, the other must grow with the QRAM size.

Note that this requirement applies not only to routing to the correct memory cell, but also to reading out the data. As the authors state: “For perspective, if we lose a single visible photon’s worth of energy for every bit in memory for each access, accessing $2^{50}$ bits costs about 1 kJ. As we move into cryptographic scales, QRAM becomes clearly absurd. A single photon’s worth of energy per bit means 1 giga-Joule per access if we have $ 2^{80} $ bits” (more about these applications and memory scales later).


A purely passive gate-QRAM that is driven by a single time-independent Hamiltonian would be reversible. Thus, any implementation faces the challenge that excited states relax to lower energy states. Nodes will need to be carefully engineered to control this without interventions. Yet, as for the bucket brigade, relaxation is also required: after each QRAM readout the nodes need to return to the initial waiting state. This may be achieved if the emission times of the atoms in the layers at the bottom are designed to be shorter than the ones further up the active path. Thus, the last atom would emit the first photon, which would be neatly routed along the remaining active path. In this sense, atom by atom would relax and emit photons that are correctly routed, and the address register would be restored. For this to happen, emission and absorption errors need to be extremely small. The preprint quantifies this further. Because of all these challenges, the authors conclude: “If this interaction problem is solved, the QRAM routing tree looks like a full-fledged trapped ion quantum computer”.

Please note that besides all of these challenges, we are not even talking about coherence times yet!

Proposals for physical QRAM-Implementations

The authors describe several proposals in the literature for implementing bucket-brigade QRAM. Unfortunately, according to Jaques and Rattew, they all fall short in one way or another. In particular, besides the original design with trapped ions, they explain approaches using

    • Transmons
    • Photonic transistors
    • Heralded Routers
    • Photonic CNOT

Among these, the transmon proposal seems most promising, as it provides enough parameters to tune the nodes towards constructing a truly passive device iv.

Besides these, the authors describe other QRAM designs in the literature that do not follow the bucket-brigade approach. Again, they explain in detail why these fail to provide a passive device:

    • Time-bins
    • Quantum Optical Fanout
    • Phase Gate Fanout
    • Derangement Codes

QRAM errors

From the previous arguments we should realize by now that error correcting the individual nodes in gate-QRAM will essentially replicate error corrected software implementations. So Jaques and Rattew propose to interpret gate-QRAM as a single physical gate and to correct its output errors. This poses certain problems, as this physical device would need to couple to logical qubits. As the authors outline, it is unlikely that the QRAM gate acts transversally with respect to the error correcting code in the quantum computer. This means we cannot just couple the noisy QRAM to the physical qubits and let the error correcting code do the rest for us. Instead, they propose a distillation of the QRAM gate, very much like the distillation of T-gates in the surface code. This means we use $ d $ calls to the noisy QRAM gate and distill these outcomes into a state of much higher fidelity. Unfortunately, as the authors argue in a theorem, to achieve a high fidelity state, $ d $ has to grow quadratically with the size of the QRAM. But this is asymptotically no better than error corrected circuit-QRAM, aka software implementations.

Applications for QRAM and their Scaling

Symbolic image of software applications

Image by Mudassar Iqbal from Pixabay

The preprint lists several interesting applications for QRAM along with numerous references. To my surprise, their focus is not on data for context (like big data, machine learning, finance, …). Instead, these works rather use QRAM to speed up quantum calculations. Some of them incorporate optimized software implementations of QRAM (Jaques and Rattew call these “circuit-QRAM”), which are used to compile detailed resource estimations of the use cases. The preprint summarizes the QRAM-specific scaling of these studies and outlines details about the QRAM implementations.

Optimizing calculations

As the authors state for the first application:

“Optimizing calculations. Many quantum algorithms perform complicated classical functions in superposition, such as modular exponentiation in Shor’s algorithm or inverse square roots in chemistry problems. Recent techniques … choose instead to classically precompute the result for a large portion of inputs, and then use … [QRAM] to look up the result.”.

For instance, in one referenced paper, Google’s Craig Gidney uses a technique called “windowed arithmetic” v. In it, he replaces arithmetic operations that are controlled by a qubit register with a QRAM lookup that uses the same register as the address register. The operations are merged into a single windowed batch lookup on the QRAM. Gidney gives the following motivation for this procedure:

“At first glance, this may seem like a bad idea. This optimization saves twenty [quantum] multiplications, but generating the lookup circuit is going to require classically computing all $ 2^{20} $ possible results and so will take millions of multiplications. But physical qubits are noisy …, and quantum error correction is expensive … . Unless huge advances are made on these problems, fault tolerant quantum computers will be at least a billion times less efficient than classical computers on an operation-by-operation basis. Given the current state of the art, trading twenty quantum multiplications for a measly million classical multiplications is a great deal.”.

By the way, see my article Does Quadratic Quantum Speedup have a problem? to find out more about the problems that Gidney is referring to.

Jaques and Rattew state the following nice aspect about windowed modular exponentiation: Shor’s algorithm requires only about 17 applications of QRAM lookups. So we could afford much higher error rates for each QRAM lookup than for other gates. Note that passive gate-QRAM would only speed up this type of optimization if the table of data does not have to be regenerated for each application. This could be the case for Google’s qubitization paper vi, which uses QRAM to look up the values of a function of the Coulomb potential.
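A purely classical sketch of the windowing idea (my own illustration, not Gidney's actual circuit construction): modular exponentiation processed in w-bit windows of the exponent, where each window's table of precomputed powers is exactly what the QRAM lookup would serve:

```python
def windowed_modexp(g, e, N, w=4):
    """Compute g^e mod N in w-bit windows of the exponent e.

    For each window, the table of 2^w classically precomputed powers
    plays the role of the QRAM contents: one batched lookup replaces
    w controlled modular multiplications."""
    result = 1
    for shift in range(0, max(e.bit_length(), 1), w):
        table = [pow(g, j << shift, N) for j in range(1 << w)]  # "QRAM"
        j = (e >> shift) & ((1 << w) - 1)   # window bits act as address
        result = result * table[j] % N      # lookup + one multiplication
    return result

print(windowed_modexp(7, 123456, 101) == pow(7, 123456, 101))  # True
```

The classical precomputation of the tables is the “measly million classical multiplications” Gidney refers to in the quote above.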

In the exemplary papers, the QRAM size $ N $ is rather manageable (around $ 2^{15} $ to $ 2^{25} $). This means that in these cases a lookup cost of $ O(N) $ is actually acceptable and the studies still achieve an overall quantum advantage.

Other applications

Another QRAM application that the authors of the preprint give is the “Dihedral Hidden Subgroup”. And again, by the way: see my general introduction to this kind of problem, From Shor to the Quantum Hidden Subgroup Problem – A jumpstart. In my article I also introduce Greg Kuperberg’s sieve algorithm for solving the dihedral hidden subgroup problem. Almost ten years later he published an improved algorithm that relies on QRAM vii. The goal of the algorithm is to construct an optimized “phase vector state” to read off the characteristics of the hidden subgroup.

The new algorithm uses a table of “phase multipliers” constructed from previous measurements. Using QRAM lookups, Kuperberg iteratively constructs improved phase vector states from the previous vector states that came along with the previous measurements. For such cryptographic applications, Jaques and Rattew give a mid-sized scaling of $ 2^{18} $ to $ 2^{51} $ bits. In these regimes, a passive gate-QRAM would start to speed up calculations significantly.

Much larger QRAM scalings are needed for attacks on quantum-safe cryptography. For these applications, the preprint gives a range of $ 2^{49} $ up to $ 2^{93} $ bits. As these attacks also tend to look for a single element in memory, the error rate needs to be extraordinarily low. This is also the case for a Grover search on a database.

These regimes also indicate at which scales context data may be loaded into the quantum computer using circuit-QRAM or active gate-QRAM without ruining the overall performance of a fast quantum algorithm. In the case of quantum linear algebra, such as the HHL algorithm, the preprint presents a detailed case study that considers all kinds of runtime costs (obvious and hidden).


Symbolic image of circuit-QRAM

Credits Hann, Connor & Lee, Gideon & Girvin, Steven & Jiang, Liang. (2020)
The resilience of quantum random access memory to generic noise. / CC BY 4.0

From the arguments in the previous sections it should be obvious that the authors of the preprint have little hope for truly passive gate-QRAM. Yet, the last section also made apparent that there exist regimes for which software implemented QRAM could be sufficient. As mentioned, the work provides detailed descriptions of various implementations, with different pros and cons regarding gate complexity, depth, T-gate count, space usage and error scaling.

A bucket-brigade implementation

For instance, a bucket brigade may be implemented by using two qubits for each node in the tree: a routing qubit and a controlling qubit. The routing qubit takes the role of the photon, which is routed along the active path. The controlling qubit manages the activation state of each node and controls the routing: for each node, the routing qubit is swapped either to the left or the right routing qubit in the next layer of the active path. This swapping is controlled by the controlling qubit of the node. If the routing qubit reaches an inactive layer, it is not swapped to the routing qubit but to the controlling qubit of the inactive node, thus making it active and letting it control the routing of the subsequent routing qubits. After the full path is activated, the data qubit of the leaf controls a CNOT on the routing qubit of the last layer. This routing qubit is then swapped layer by layer as before, yet from bottom to top. This implementation has a gate count of $ O(N \log N) $ and its depth is $ O(\log N) $. As described above for the physical implementation, the tolerable error rate per node may be relatively large compared to simpler QRAM implementations, for which one would generally expect required error rates of order $ O(1/N) $.
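The swap-based routing just described can be traced classically for a basis-state address (again a toy model of my own, tracking only the controlling qubits as bits):

```python
def circuit_bb_route(address_bits):
    """Trace the two-qubit-per-node bucket brigade for one basis state.

    Each address bit is swapped down the already-active path; at the
    first inactive node it is absorbed into that node's controlling
    qubit, activating the node. Returns the leaf selected by the full
    path."""
    control = {}                      # (layer, pos) -> controlling bit
    for bit in address_bits:          # MSB first
        layer, pos = 0, 0
        while (layer, pos) in control:            # swapped left/right
            pos = 2 * pos + control[(layer, pos)]
            layer += 1
        control[(layer, pos)] = bit               # node becomes active
    layer, pos = 0, 0                 # walk the activated path to a leaf
    while (layer, pos) in control:
        pos = 2 * pos + control[(layer, pos)]
        layer += 1
    return pos

print(circuit_bb_route([1, 0, 1]))    # -> 5, i.e. leaf 0b101
```

In the quantum circuit the same mechanics run in superposition over all addresses, which is where the $ O(N \log N) $ gate count comes from.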

Have fun reading

You will find even more aspects in the survey that I did not cover (e.g. a detailed comparison with classical RAM and the case study about quantum linear algebra) – so check out the preprint yourself. I should mention that, at first, it took me a while to get into the work due to some technical formalities: the preprint starts by proposing a general naming scheme for QRAM (following Greg Kuperberg’s paper), as the terminology is used rather inconsistently throughout the literature. Thus, the authors use the terms QRACM and QRAQM for quantumly addressable (in superposition) classical data and quantum data viii. Although Samuel Jaques was kind enough to provide further explanations to me, I decided to use the term QRAM in my article and keep things simple. The authors themselves state that in the end the difference between QRACM and QRAQM may be neglected, as the addressing and routing work the same in either case and are the main challenge for QRAM. In this sense, they also use QRAM as a generic term.


i “Quantum random access memory”, paper by Vittorio Giovannetti, Seth Lloyd, Lorenzo Maccone (2008)

ii “Quantum Machine Learning Algorithms: Read the Fine Print”, survey by Scott Aaronson (2015)

iii “QRAM: A Survey and Critique”, preprint by Samuel Jaques, Arthur G. Rattew (May 2023)

iv “A transmon based quantum switch for a quantum random access memory”, Master’s thesis by Arnau Sala Cadellans (2015)

v “Windowed quantum arithmetic”, by Craig Gidney (2019)

vi “Qubitization of Arbitrary Basis Quantum Chemistry Leveraging Sparsity and Low Rank Factorization”, by D. Berry, C. Gidney, M. Motta, J. McClean, R. Babbush (2019)

vii “Another subexponential-time quantum algorithm for the dihedral hidden subgroup problem”, by G. Kuperberg (2011)

viii Jaques and Rattew use the term QRACM for lookups in a data table $ T $

\lvert i \rangle_A \lvert 0 \rangle_O \longrightarrow \lvert i \rangle_A \lvert T_i \rangle_O
whereas QRAQM supports superpositions of tables, which I symbolically rewrite as
\lvert i \rangle_A \lvert 0 \rangle_O \lvert T \rangle_T \longrightarrow \lvert i \rangle_A \lvert T_i \rangle_O \lvert T \rangle_T
Note that this type of “quantum data memory” does not support lookups of general wavefunctions $ \psi_i $:
\lvert i \rangle_A \lvert 0 \rangle_O \longrightarrow \lvert i \rangle_A \lvert \psi_i \rangle_O
This is state preparation, which is not possible with QRAM, although most techniques for QRAM may be adapted for it as well.

From Shor to the Quantum Hidden Subgroup Problem – A jumpstart

Regarding the importance of exponential quantum advantages, it is natural to ask how far the boundaries of known, superfast quantum algorithms may be pushed. In the case of quantum computing’s most famous algorithm, Shor’s algorithm, this leads to the quantum computing framework for the “Hidden Subgroup Problem” and the generalized Fourier transform. Instances of this range from hidden shift, integer period finding and discrete logarithm up to lattice cryptosystems and graph isomorphism. Unfortunately, the search for introductions quickly leads to very demanding “LaTeX deserts” and paper cascades. This article provides a gentle and intuitive introduction and a broad overview of this fascinating and elegant subject.

An "army" of lego figures

Image by Matthias Wewering from Pixabay

The importance of Exponential Quantum Advantages

As quantum computing evolves, the community starts to focus on the next steps towards fault tolerance. These developments highlight an aspect that is often overlooked: the algorithmic overhead of fault tolerance. In my last article, Does Quadratic Quantum Speedup have a problem?, I described two papers that formulate concerns regarding central quantum algorithms with small polynomial speedups in comparison to their classical counterparts.

Also because of this outlook, leading scientists in quantum computing like Scott Aaronson emphasize the importance of exponential asymptotics to quickly reach the breakeven time with classical algorithms. In his talk at the Solvay Physics Conference (May 2022, Brussels) he gives a very informative overview of the current state of exponential quantum advantage i. Here I want to spoil just one thing from the very readable transcript: after summarizing some of the known exponential speedups, he concludes “People often ask, is that it? Well, it’s not quite it, but yes: the list of exponential speedups is smaller and more specialized than many people would like”.

In this light, it is reasonable to ask the question: How far may the boundaries of the known, superfast quantum algorithms be pushed?

Now quick! What’s the very first quantum computing algorithm that comes to your mind?

From Shor to the Hidden Subgroup Problem

Let me guess: it’s Shor’s algorithm, the mother of all quantum computing algorithms? … OK, presumably. But let’s face it: Shor’s algorithm for factoring integers not only sparked all the great attention for quantum computing – and does so until today! More importantly, it brilliantly condenses crucial aspects and subroutines of quantum computing into one single algorithm. Having done this already in the year 1994 was Peter Shor’s farsighted masterpiece for science, and he was justifiably honored with the Breakthrough Prize 2023 for it. Besides, until now it serves as a centerpiece for various (rather constructed) use cases with an exponential quantum advantage – namely use cases in optimization ii or in machine learning iii, for which you usually find rather quadratic speedups.

Taking Shor’s algorithm even further we end up with the quantum computing framework for the Hidden Subgroup Problem.

Among its instances are

  • Finding hidden shifts in bitstring functions (also known as “Simon’s Problem”)
  • Period finding over integers (the core of Shor’s factoring algorithm)
  • Discrete logarithm
  • Lattice cryptosystems (Regev’s algorithm)
  • Graph isomorphism

The hidden subgroup problem is very often referred to as one of the major applications for quantum computing. But if you want to get ready for a deeper dive into the subject, you are quickly lost in cascades of papers and very demanding LaTeX deserts: “Ehh … ok … let’s save this up for later.”

Which is really a shame, because the hidden subgroup problem is a highly fascinating subject. Also, its quantum computing framework is mathematically very elegant, and what seems rather technical in Shor’s or Simon’s algorithm has a deeper and intuitive reason. My article aims to provide a gentle and intuitive introduction and a comprehensive overview of this fascinating but challenging subject. Of course, I will also use some math – quite a bit actually – but I will emphasize the crucial points and the intuition in the main text and provide many more helpful comments than you would usually find. Purely technical details are presented in the appendices. Besides this, I will outline the most interesting examples of the hidden subgroup problem and point out some limits of the framework.

Ok, let’s start our tour.

It all began with Simon’s problem …

Simon’s Problem, a Hidden Subgroup Problem

As Peter Shor has stated, he was heavily inspired by Simon’s algorithm, which is the simplest solution of a hidden subgroup problem. Daniel Simon introduced the problem in 1994 and proved for the first time that there exists an exponential oracle separation between classical and quantum computing: a quantum algorithm is able to query a black box function exponentially fewer times than the classical counterpart to solve the problem. Recently, Simon’s problem was even adapted to the NISQ regime, proving for the first time that the exponential oracle speedup holds even for noisy near term quantum computers iv. Here I will quickly recap Simon’s algorithm and point out some crucial basic aspects that you will later recognize again, in a much more abstract form, in the quantum computing framework for the hidden subgroup problem. I will heavily make use of the very nice introduction to Simon’s algorithm in IBM’s great Qiskit textbook v.

Simon’s Algorithm to solve the Hidden Shift Problem

A bitstring

Image by Gerd Altmann from Pixabay

We are given an unknown blackbox function $f$ on bitstrings of size $n$, which is two-to-one, with the additional promise of a unique hidden bitstring $b$:

\textrm{given }x_1,x_2: \quad f(x_1) = f(x_2) \\
\textrm{it is guaranteed }: \quad x_1 \oplus b = x_2

where $ x_1 \oplus b $ is of course the bitwise addition modulo 2 (XOR) for bitstrings.

Problem: Find the hidden bitstring $ b $!

The classical solution is exponentially hard: in the worst case you might need to execute the blackbox function for half of the exponentially many bitstrings $ x_i $, plus one more, until you complete a pair. Even on average, a birthday-style argument still requires on the order of $ 2^{n/2} $ executions – still exponentially many.
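To make the classical side concrete, here is a brute-force collision search against a toy two-to-one oracle (the choice f(x) = min(x, x XOR b) is my own convenient construction, not part of the original problem statement):

```python
def make_oracle(b):
    """A two-to-one function whose collisions are exactly the pairs {x, x ^ b}."""
    return lambda x: min(x, x ^ b)

def classical_find_b(f, n):
    """Query f until two inputs collide; needs O(2^n) queries in the worst case."""
    seen = {}
    for x in range(1 << n):
        y = f(x)
        if y in seen:
            return seen[y] ^ x    # x1 XOR x2 = b
        seen[y] = x

f = make_oracle(0b110)
print(bin(classical_find_b(f, 3)))   # -> 0b110
```

Any two colliding inputs XOR to the hidden string, so the first collision already reveals $ b $ – the hard part is finding that collision classically.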

In the quantum computing version you need to execute the blackbox function only about $n$ times!

We achieve this using an iterated approach with the following steps in each iteration:

  • We initialize two $n$-qubit input registers to the zero state:\begin{gather}\lvert \psi_1 \rangle = \lvert 0 \rangle^{\otimes n} \lvert 0 \rangle^{\otimes n}\end{gather}
  • We apply a (first) Hadamard transform to the first register, which generates a uniform superposition of all possible bitstrings:\begin{gather}\lvert \psi_2 \rangle = \frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^{n} } \lvert x \rangle\lvert 0 \rangle^{\otimes n}\end{gather}
  • We apply a controlled operation which executes the query function via composed gates $\text{Q}_f$ on the second register:\begin{gather}\lvert \psi_3 \rangle = \frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^{n} } \lvert x \rangle \lvert f(x) \rangle\end{gather}
  • We measure the second register. A certain value of $f(x)$ will be observed. Because of the setting of the problem, the observed value $f(x)$ could correspond only to two possible inputs: $x_1$ and $x_2 = x_1 \oplus b $. Therefore the first register becomes:
    \begin{gather}\lvert \psi_4 \rangle = \frac{1}{\sqrt{2}} \left( \lvert x_1 \rangle + \lvert x_1 \oplus b \rangle \right)\end{gather}
    We omitted the second register since it has been measured and we don’t care for its value. All we needed was that the first register collapsed into a superposition of a single pair linked by the hidden bitstring $b$. Now, we would like to find some way to get rid of $x_1$.
  • We magically guess that a second Hadamard on the first register might help. For this it is helpful to remember the “definition” of a general Hadamard transform for a bitstring $x$:\begin{gather}H \lvert x \rangle = \frac{1}{\sqrt{2^{n}}} \sum_{z \in \{0,1\}^{n} } (-1)^{x \cdot z} \lvert z \rangle\end{gather}where $ x \cdot z $ is the inner product of the two bitstrings / binary vectors. In this case, we get
    \begin{gather}\lvert \psi_5 \rangle = \frac{1}{\sqrt{2^{n}}} \sum_{z \in \{0,1\}^{n} } \left[ (-1)^{x_1 \cdot z} + (-1)^{(x_1 \oplus b) \cdot z} \right] \lvert z \rangle\end{gather}

    We will later find out exactly why a second Hadamard transform was just the right thing for this purpose. You will be surprised!

  • Now we measure the first register. A bitstring $z$ will be observed only if its two amplitudes interfere constructively:\begin{gather}(-1)^{x_1 \cdot z} = (-1)^{(x_1 \oplus b) \cdot z}\end{gather}which means:\begin{gather}x_1 \cdot z = \left( x_1 \oplus b \right) \cdot z \\
    x_1 \cdot z = x_1 \cdot z \oplus b \cdot z \\
    b \cdot z = 0 \text{ (mod 2)}\end{gather}

    Like I promised: We got rid of the $ x_1 $!

    We see that we will measure a bitstring $z$ whose inner product with $b$ gives $0$. Thus, repeating the algorithm $\approx n$ times, we will be able to obtain $n$ different bitstrings $z_k$, and the following system of equations can be written:

    \begin{cases} b \cdot z_1 = 0 \\ b \cdot z_2 = 0 \\ \quad \vdots \\ b \cdot z_n = 0 \end{cases}

    From which $b$ can be determined, for example by Gaussian elimination.
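The whole procedure can be emulated classically: the quantum subroutine is replaced by sampling a uniform $z$ with $b \cdot z = 0$ (which is exactly its measurement statistics), followed by Gaussian elimination over GF(2). This is my own sketch, with bitstrings encoded as Python integers:

```python
import random

def measure_z(b, n, rng):
    """Emulate one run of the quantum subroutine: the final measurement
    yields a uniformly random z with b . z = 0 (mod 2)."""
    while True:
        z = rng.getrandbits(n)
        if bin(b & z).count("1") % 2 == 0:
            return z

def recover_b(b, n, rng):
    """Collect n-1 independent samples, then solve b . z = 0 over GF(2)."""
    rows = []                            # row-echelon basis of the z's
    while len(rows) < n - 1:
        z = measure_z(b, n, rng)
        for r in rows:                   # eliminate against known pivots
            if (z >> (r.bit_length() - 1)) & 1:
                z ^= r
        if z:
            rows.append(z)
            rows.sort(reverse=True)
    pivots = {r.bit_length() - 1 for r in rows}
    free = next(i for i in range(n) if i not in pivots)
    guess = 1 << free                    # set the one free variable to 1
    for r in sorted(rows):               # back-substitute, low pivots first
        p = r.bit_length() - 1
        if bin(guess & (r ^ (1 << p))).count("1") % 2:
            guess |= 1 << p
    return guess

print(bin(recover_b(0b1011, 4, random.Random(0))))   # recovers b = 0b1011
```

The nullspace of the collected $z$'s is exactly $\{0, b\}$, so any nonzero solution of the system must be the hidden bitstring.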

What is the deal with the black box function?

The use of black box functions / oracles seems mysterious at first, as these are just algorithmic subroutines. The whole point is that usually one cannot rule out an efficient classical solution if details are known about the concrete problem structure. If we limit ourselves to being ignorant, there is no way to exploit any structure and to take shortcuts. If we are forced to execute the black box function many, many times, this enables us to separate the complexity of quantum and classical solutions – relative to the oracle – and thus differentiate between the nature of quantum computing and classical computation.

Yet, in certain cases it is possible to extend the relative advantage to an absolute advantage, like in Shor’s algorithm. A necessary condition for this is that the oracle needs to be efficiently computable. This is not always possible, for the simple reason that there are vastly more possible oracles than there are efficient computations. Finding such absolute examples for Simon’s problem would make them great candidates for near term quantum supremacy experiments. Unfortunately, this is an open question.

The Connection to the Hidden Subgroup Problem

Here are some crucial remarks on Simon’s algorithm in the context of the hidden subgroup problem:

The steps 1-4 seem kind of straightforward. The magic is happening in steps 5 and 6. After the measurement of the second register we created the uniform superposition

\frac{1}{\sqrt{2}} \left( \lvert x_1 \rangle + \lvert x_1 \oplus b \rangle \right)
= \frac{1}{\sqrt{2}} \left( \lvert x_1 \oplus 0 \rangle + \lvert x_1 \oplus b \rangle \right)

We may symbolically write this as

\begin{gather} \lvert x_1 \oplus H \rangle
\quad \textrm{for the set } \quad
H = \bigl\{0, b \bigr\} \end{gather}

Notice, that

b \oplus b = 0 \quad \textrm{making H … a group!}

Of course, we will exploit this fact further down. By the way: As you might have guessed already $H$ is even a subgroup of a certain other group.

Next, we still have to find out what is the deal with the second Hadamard transform. I leave this for later, too.

Now, just for fun, let’s call

\bigl(-1\bigr)^{x \cdot z} = \chi_z(x) \quad \textrm{a “character function”.}

We notice, that after step 6, the final measurement collapses the previous superposition to only those “labels” $ z $ for which the character $ \chi_z $ is constant on the complete set $ x_1 + H $! As we will see, these $ \chi_z $ are even equal to the trivial character on the domain $ H $

\chi_z(x) = \bigl(-1\bigr)^{x \cdot z} = 1

We got rid of $ x_1 $ because every character function $ \chi_z $ has the nice property

\chi_z (x_1 + x_2) = \chi_z (x_1) \cdot \chi_z (x_2) \quad \textrm{for all } x_1, x_2

This property makes $ \chi_z $ a “group homomorphism”.

At the end we are able to extract the group $ H $ because:

  • We somehow guessed the exact form of the characters. These seemed to be kind of native to this type of problem.
  • We knew, that the final measurement only gave us those labels, for which the corresponding character is trivial on the set $ H $
  • Repeating this iteration gives us more and more information about $ H $ and we are able to efficiently compute it classically.
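The homomorphism property of these character functions is easy to verify by brute force. A tiny Python sketch (my own demo, checking $\chi_z(x_1 \oplus x_2) = \chi_z(x_1) \cdot \chi_z(x_2)$ on all of $\mathbb{Z}_2^3$):

```python
# Numeric check (sketch): chi_z(x) = (-1)^{x . z} is a group homomorphism
# on the bitstring group Z_2^n, i.e. chi_z(x1 xor x2) = chi_z(x1) * chi_z(x2).

from itertools import product

def chi(z, x):
    # inner product of two bitstrings mod 2, mapped to +/-1
    return (-1) ** (sum(zb & xb for zb, xb in zip(z, x)) % 2)

n = 3
for z in product([0, 1], repeat=n):
    for x1 in product([0, 1], repeat=n):
        for x2 in product([0, 1], repeat=n):
            x12 = tuple(a ^ b for a, b in zip(x1, x2))
            assert chi(z, x12) == chi(z, x1) * chi(z, x2)
print("chi_z is a homomorphism on Z_2^%d" % n)
```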

The general Hidden Subgroup Problem

Now we are ready to formulate the Hidden Subgroup Problem itself:

We are given a group $ G $ and a function $f$ from $ G $ to some set $ X $:

f: G \longrightarrow X

Further, we are told that $ f $ has the following special feature: there is a hidden subgroup $ H \subseteq G $ with respect to which $ f $ is “H-periodic”:

\textrm{for every $ g_1, g_2 $ with} \quad f(g_1) = f(g_2) \\
\textrm{it is guaranteed that} \quad g_1 \cdot h = g_2 \\
\textrm{for some $ h \in H $}

In other words: $ f $ is constant exactly on every “coset” $ S_i := g_i \cdot H $ with distinct values for every distinct coset $ S_i $.

Problem: Find the hidden subgroup $ H $!

In the following sections we will

  • See how to create so called “coset states” for the general problem
  • First solve the problem for abelian / commuting groups
  • Consider the hidden subgroup problem for non-abelian groups
  • Discuss some examples

The Quantum Algorithm for the Hidden Subgroup Problem

First of all, to do quantum computation, we need to encode the problem into a Hilbert space.

For this, we map the group $ G $ to a Hilbert space of dimension $ |G| $ (the number of elements in $G$) by identifying each group element with a basis vector of the standard basis:

\{ \lvert g \rangle: g \in G \}

(By the way: This also defines the “regular representation” of $G$)

All the group operations in $G$ are now encoded in a set of shifting operators (left shift and right shift, to be exact):

U_l(k): \lvert g \rangle \longrightarrow \lvert k \cdot g \rangle \quad g, k \in G

We will exploit this fact further down.

To solve the hidden subgroup problem, we start just like in Simon’s problem.

  • We initialize two input registers to the zero state:\begin{gather}\lvert \psi_1 \rangle = \lvert 0 \rangle^{\otimes n} \lvert 0 \rangle^{\otimes m}\end{gather}The actual dimensions $n$ and $m$ of the two registers are adapted to $|G|$ and the image of $ f(G) $
  • We apply a Hadamard transform to the first register, which creates a uniform superposition over all group elements,\begin{gather}\lvert \psi_2 \rangle = \frac{1}{\sqrt{|G|}} \sum_{g \in G } \lvert g \rangle\lvert 0 \rangle^{\otimes m}\end{gather}
  • We apply a controlled operation which executes the query function via composed gates $\text{Q}_f$ on the second register:\begin{gather}\lvert \psi_3 \rangle = \frac{1}{\sqrt{|G|}} \sum_{g \in G} \lvert g \rangle \lvert f(g) \rangle\end{gather}
  • We measure the second register. A certain value of $f(g)$ will be observed. Because of the setting of the problem, the observed value $f(g)$ could correspond only to inputs in the (left) coset $ S = g \cdot H $. Therefore the first register becomes:

    \begin{gather}\lvert \psi_4 \rangle = \frac{1}{\sqrt{|H|}} \sum_{h \in H} \lvert g \cdot h \rangle =: \lvert g \cdot H \rangle\end{gather}
    We omitted the second register since it has been measured and we don’t care for its value. All we needed, was that the first register collapsed into the so called “coset state”.

Now we are stuck! We don’t know how to extract $ H $ from the coset state. This is really too bad. Measuring this state won’t do, because we don’t know $ g $ and each iteration would randomly give us another one.

To solve this problem we need a few tools and rules from abstract algebra.

I promise: This will give you some neat intuition on how to tackle the coset state!

The Hidden Subgroup Problem for Abelian groups

An abelian group

Image by Jakob.scholbach, via Wikimedia

Some rules for irreducible representations and characters

Tightly connected to the hidden subgroup problem, and maybe mathematically even more fascinating, is the question of the Fourier transform on arbitrary groups. For functions (or “vectors”) on the real line (the group of real numbers under ordinary addition) we might already know the regular Fourier transform. The Fourier transform on arbitrary groups requires a few tools and rules from abstract algebra. If your knowledge about groups and representations feels a little shaky, you might want to check out my quick summary “Appendix 1: Some facts about group and representation theory” further down.

But if the following terms only put a smile of wisdom on your face, you will do just fine

  • Abelian groups
  • Group of bitstrings $ \mathbb{Z}_2^n $
  • Cyclic group $ \mathbb{Z}_m $
  • Cosets
  • Representations, irreducible representations
  • Characters

For this and the next section I will stick to the paper of Ekert & Jozsa 1998, “Quantum Algorithms: Entanglement Enhanced Information Processing” vi. Also, we will first deal with the hidden subgroup problem for finite abelian groups and later discuss adaptions to other cases.

Now to some crucial rules for characters: Let $(G, +)$ be any finite abelian group, written in additive notation, and let $|G|$ be the number of elements of $G$. For abelian groups, irreducible representations are 1-dimensional and equal to their traces, their “characters”. So both are the same thing: functions $\chi$ from $G$ to the complex numbers

\chi : G \longrightarrow \mathbb{C}


\chi(g_1 + g_2) = \chi(g_1 ) \cdot \chi(g_2)

i.e. $\chi$ is a group homomorphism. This has the following consequences

  • 1. Any value $\chi(g)$ is a $|G|$th root of unity (since adding $g$ to itself $|G|$ times gives the identity). Thus $\chi$ may be viewed as a group homomorphism to the complex circle group

\chi : G \longrightarrow S^1 = \left\{ e^{i\phi} \right\}

And in particular: $ \chi \left( g^{-1} \right) = \overline{\chi(g)} $

  • 2. Orthogonality (Schur’s lemma): If $\chi_i$ and $\chi_j$ are any two such functions then:

\frac{1}{|G|} \sum_{g \in G} \chi_i(g) \cdot \overline{\chi_j(g)} = \delta_{ij}

  • 3. There are always exactly $|G|$ different functions $\chi$ satisfying this. So we may label them by the elements $ k \in G$: $\chi_k$.
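These rules are easy to verify numerically. Here is a small Python sketch for the cyclic group $\mathbb{Z}_6$, using the explicit characters $\chi_k(g) = e^{2\pi i k g / m}$ (which will reappear further down in the article):

```python
# Numeric check (sketch) of the character rules for the cyclic group Z_m:
# chi_k(g) = exp(2*pi*i*k*g/m), and the orthonormality relation
# (1/|G|) sum_g chi_i(g) * conj(chi_j(g)) = delta_ij (Schur's lemma).

import cmath

m = 6

def chi(k, g):
    return cmath.exp(2j * cmath.pi * k * g / m)

for i in range(m):
    for j in range(m):
        s = sum(chi(i, g) * chi(j, g).conjugate() for g in range(m)) / m
        expected = 1.0 if i == j else 0.0
        assert abs(s - expected) < 1e-9
print("the %d characters of Z_%d are orthonormal" % (m, m))
```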

Quantum Fourier Transform: The centerpiece of the Hidden Subgroup Problem

Shift-Operators for $G$

Again, we use a Hilbert space of dimension $ |G| $ with the basis:

\{ \lvert g \rangle: g \in G \}

Remember that all group operations in $G$ are now encoded in a set of shifting operators:

U(g'): \lvert g \rangle \longrightarrow \lvert g+g' \rangle \quad g, g' \in G

Using all the characters $\chi_k$ of $ G $ with labels $ k \in G $, we introduce a family of very interesting states:

\lvert \chi_k \rangle =
\frac{1}{\sqrt{|G|}} \sum_{g \in G} \overline{\chi_k(g)} \cdot \lvert g \rangle

These states are orthonormal since all characters are orthonormal (see Schur’s lemma in the section above). Furthermore: These states are eigenstates of all translation operators!

U(g') \lvert \chi_k \rangle &=
\frac{1}{\sqrt{|G|}} \sum_{g \in G} \overline{\chi_k(g)} \cdot U(g') \lvert g \rangle \\
&= \frac{1}{\sqrt{|G|}} \sum_{g \in G} \overline{\chi_k(g)} \cdot \lvert g + g' \rangle


\overline{\chi_k(g)} = \overline{\chi_k(-g’ + g + g’)} = \chi_k(g’) \cdot \overline{\chi_k(g + g’)}

and we may relabel the sum over all group elements to $ g+g’$ as well. This gives

U(g') \lvert \chi_k \rangle = \chi_k(g') \cdot \lvert \chi_k \rangle
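The eigenstate relation $U(g') \lvert \chi_k \rangle = \chi_k(g') \lvert \chi_k \rangle$ can be checked numerically for a small cyclic group. A Python sketch (amplitudes stored as plain lists, a minimal encoding of my own):

```python
# Numeric check (sketch): the states |chi_k> are eigenstates of every shift
# operator U(g') on Z_m, with eigenvalue chi_k(g').

import cmath

m = 5

def chi(k, g):
    return cmath.exp(2j * cmath.pi * k * g / m)

def chi_state(k):
    # |chi_k> = (1/sqrt(m)) sum_g conj(chi_k(g)) |g>, as a list of amplitudes
    return [chi(k, g).conjugate() / m ** 0.5 for g in range(m)]

def shift(state, gp):
    # U(g'): |g> -> |g + g'>, so the amplitude at |g> comes from |g - g'>
    return [state[(g - gp) % m] for g in range(m)]

for k in range(m):
    for gp in range(m):
        lhs = shift(chi_state(k), gp)
        rhs = [chi(k, gp) * a for a in chi_state(k)]
        assert all(abs(x - y) < 1e-9 for x, y in zip(lhs, rhs))
print("each |chi_k> is a shift eigenstate on Z_%d" % m)
```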

The Fourier transform for $ G $

Maybe you have guessed this already:

\bigl( \lvert \chi_k \rangle \bigr)_{k \in G} \quad \text{is called the Fourier basis for $ G $.}

As mentioned, the Fourier states are orthonormal and there are $ |G| $ of them.

The Fourier transform is defined as the map, that extracts the label for each Fourier state:

\mathcal{F} \lvert \chi_k \rangle = \lvert k \rangle

Thus, it translates every state into its spectrum of Fourier labels. In particular we have

\mathcal{F} &= \sum_{k \in G} \lvert k \rangle \langle \chi_k \lvert \\
&= \frac{1}{\sqrt{|G|}} \cdot \sum_{k, g \in G} \chi_k(g) \lvert k \rangle \langle g \lvert

Fourier transform for $ \mathbb{Z}_m $ , $ \mathbb{Z}_2^n $ and the magical 2nd Hadamard

As we know that the values of the $ \chi_k $ are $ |G| $th roots of unity, we can start constructing the Fourier transform for different abelian groups.

For the cyclic group $ \mathbb{Z}_m $, the $ m $ characters are:

\chi_k (g) = e^{ 2\pi i \cdot \frac{k}{m} \cdot g}

In particular for $ \mathbb{Z}_2 $ we have for the two bits $g$ and $k$

\chi_k (g) = e^{ \pi i \cdot k \cdot g} = (-1)^{k \cdot g}

For $ \mathbb{Z}_2^n $ the characters just factor into $ n $ such single-bit terms and we can sum up the individual exponents. Altogether the two bits turn into two bitstrings and the product $k \cdot g$ in the exponent turns into the inner product of the two bitstrings:

\mathcal{F} = \frac{1}{\sqrt{2^{n}}} \sum_{k, g \in \{0,1\}^{n} } (-1)^{k \cdot g} \lvert k \rangle \langle g \lvert

We have just found our magical second Hadamard transform in Simon’s algorithm!
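As a sanity check, here is a Python sketch (pure standard library, the helper `kron` is my own) verifying that the $n$-fold Hadamard matrix coincides entry by entry with the $\mathbb{Z}_2^n$ Fourier matrix $\frac{1}{\sqrt{2^n}}(-1)^{k \cdot g}$:

```python
# Sketch: the n-qubit Hadamard transform is exactly the Fourier transform of
# Z_2^n, with matrix elements (1/sqrt(2^n)) * (-1)^{k . g}.

from itertools import product

def kron(A, B):
    # Kronecker product of two square matrices given as nested lists
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

H1 = [[1 / 2 ** 0.5, 1 / 2 ** 0.5],
      [1 / 2 ** 0.5, -1 / 2 ** 0.5]]

n = 3
Hn = H1
for _ in range(n - 1):
    Hn = kron(Hn, H1)

bits = list(product([0, 1], repeat=n))   # basis order |00..0>, |00..1>, ...
for r, k in enumerate(bits):
    for c, g in enumerate(bits):
        fourier = (-1) ** (sum(kb & gb for kb, gb in zip(k, g)) % 2) / 2 ** (n / 2)
        assert abs(Hn[r][c] - fourier) < 1e-9
print("the %d-qubit Hadamard equals the Z_2^%d Fourier matrix" % (n, n))
```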

The Forrelation Problem

Just as a side remark: The fact that the Fourier transform of $ \mathbb{Z}_2^n $-functions may be implemented on a quantum computer so efficiently inspired Scott Aaronson to formulate another black box problem in 2009, the “Forrelation” problem: Given two black box functions $f$ and $g$, is the $ \mathbb{Z}_2^n $-Fourier transform of $ f $ correlated with $ g $ (using a special Fourier-like metric)?

He introduced the problem while trying to find an even more spectacular black box separation: the separation between $ BQP $ and the “Polynomial Hierarchy” (PH). Ten years later R. Raz and A. Tal proved just this, which was a major breakthrough in quantum complexity theory.

There is a nice article on Quanta Magazine about this (along with further references) vii.

By the way, the Polynomial Hierarchy is like exponentiating the class NP further and further. Basically it is similar to checkmate problems in $ N \times N $ chess. In this sense, NP is like checkmate in one move. As we climb up the Polynomial Hierarchy, the problems get drastically harder: checkmate in two moves, checkmate in three moves, in four moves, … And these would remain hard, even if NP problems were efficiently solvable.

Tackling the coset state

Now we have set up everything to extract $ H $ from the coset and solve the hidden subgroup problem for finite abelian groups. Let’s recap:

We are given a finite abelian group $ (G, +) $ and a H-periodic black box function $f$ on $ G $ to some set $ X $.

We created a coset state by generating the state

\lvert \psi_3 \rangle = \frac{1}{\sqrt{|G|}} \sum_{g \in G} \lvert g \rangle \lvert f(g) \rangle

and measuring the second register, getting some value $f(g)$. In the first register this leaves:

\lvert \psi_4 \rangle = \frac{1}{\sqrt{|H|}} \sum_{h \in H} \lvert g + h \rangle =: \lvert g + H \rangle

This is the point where we got stuck before, since

\lvert g + H \rangle = U(g) \cdot \lvert H \rangle

and we are only interested in the latter state. But by now we know a G-invariant basis: The Fourier basis. And we know, that $ \lvert H \rangle $ may be written as some superposition in this basis:

\lvert H \rangle = \sum_{k \in G} \alpha_k \cdot \lvert \chi_k \rangle

As all $ \lvert \chi_k \rangle $ are eigenstates of every translation, $ U(g) $ will only add a phase factor to the amplitudes $ \alpha_k $ in this superposition

U(g) \cdot \lvert H \rangle = \sum_{k \in G} \chi_k(g) \cdot \alpha_k \cdot \lvert \chi_k \rangle

You might think: So what? Unfortunately the phase factors are relative phases, different for each $ k $! But the crucial thing is that $ \lvert g + H \rangle $ and $ \lvert H \rangle $ have the same spectrum of Fourier labels. If we “Fourier sample” the superposition above, it will collapse to a single label $k$, one that is also a label of $ \lvert H \rangle $. Plus, since $ |\chi_k(g)| = 1 $, even the measured probabilities are the same.

This means, we just need to apply a Fourier transform of $G$ on the coset state in the first register and measure the first register.

So we are done, right? Not so fast. There is another surprise waiting!

Constraints for the Fourier labels

We may actually formulate some crucial constraints for the measured Fourier labels as well:

The same argument from above also applies to every shift by an element $ h \in H $:

U(h) \cdot \lvert H \rangle = \sum_{k \in G} \chi_k(h) \cdot \alpha_k \cdot \lvert \chi_k \rangle

But since $ H $ is a group, we have $ U(h) \cdot \lvert H \rangle = \lvert H \rangle $. For each individual amplitude, this means:

\chi_k(h) \cdot \alpha_k = \alpha_k

This is only possible if

\chi_k(h) = 1 \\
\text{or} \\
\alpha_k = 0

Thus, we will only measure labels of trivial characters on $H$.

Fourier sampling the coset state

By repeating these steps over and over again we collect more and more of those labels. Since we know $ G $, we know its characters $ \chi_k $. Thus, we gain more and more information about the domain on which all the characters are trivial.

As for every homomorphism, it is guaranteed that each of those “kernels” $\{g \in G: \chi_k(g) = 1\} $ is a subgroup (think about this claim for a second). The intersection $ K $ of all kernels will also be a subgroup (think about this for another second). After only polynomially many steps, the subgroup obtained this way is the hidden subgroup $ H $ with high probability. We will see the reason for this below.

If you want to see the calculation for Fourier sampling the coset state in detail, please check out the “Appendix 2: Fourier sampling the abelian coset state in detail”. There, you will also see that the remaining amplitudes from above are even constant:

\alpha_k = \frac{1}{\sqrt{|G| \cdot |H|}} \cdot \sum_{h \in H} \chi_k(h) = \sqrt{\frac{|H|}{|G|}}

Maybe this is something one could have expected all along, since the Fourier transform is most sensitive to periodic patterns.
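As a concrete illustration of the amplitude formula, here is a Python sketch for $G = \mathbb{Z}_6$ with hidden subgroup $H = \{0, 3\}$ (an example of my own choosing): the labels whose characters are trivial on $H$ carry the constant amplitude $\sqrt{|H|/|G|}$, all others vanish.

```python
# Sketch: compute alpha_k = (1/sqrt(|G| |H|)) * sum_{h in H} chi_k(h) for
# G = Z_6 and H = {0, 3}, and check the constant-amplitude claim.

import cmath

m = 6
H = [0, 3]

def chi(k, g):
    return cmath.exp(2j * cmath.pi * k * g / m)

for k in range(m):
    alpha = sum(chi(k, h) for h in H) / (m * len(H)) ** 0.5
    trivial_on_H = all(abs(chi(k, h) - 1) < 1e-9 for h in H)
    if trivial_on_H:
        # surviving labels all carry the same amplitude sqrt(|H|/|G|)
        assert abs(alpha - (len(H) / m) ** 0.5) < 1e-9
    else:
        assert abs(alpha) < 1e-9
print("only labels trivial on H survive, each with amplitude sqrt(|H|/|G|)")
```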

In the appendix I followed the presentation in Andrew Childs’ great “Lecture Notes on Quantum Algorithms” (chapter 6) viii.

Childs’ lecture is actually my favorite course on the quantum computing algorithm landscape. If you haven’t done so already, you should definitely take a closer look at it. It is incredibly comprehensive, up to date and a great starting point for tackling the papers themselves. Yet Childs is also a demanding teacher and you should bring some time. But it’s worth it.

He also gives a tricky argument for the efficiency of Fourier sampling coset states for abelian subgroups. The claim is that each sampling will cut the intersection $ K $ of all kernels $\{g \in G: \chi_k(g) = 1\} $ at least in half with probability at least $ \frac{1}{2} $. Thus, $ K $ will shrink exponentially fast to $ H $. I simplified it a little bit with some additional remarks:

As stated above, the intersection $ K $ is a subgroup of $ G $ itself. So either we already have $ H = K $, or still $ H < K $. If $ H $ is still a proper subgroup of $ K $, Lagrange’s theorem tells us more about their sizes:

|K| = [K:H] \cdot |H|

where $ [K:H] $ is the number of distinct cosets, which is at least $ [K:H] \ge 2$. This applies to subgroups in general: A proper subgroup contains at most half of all group elements:

|H| \le \frac{1}{2} \cdot |K|

The same is true for each consecutive intersection $ K $: Each Fourier sampling will either reveal new information and $ K $ will at least be cut in half, or we don’t gain new information and $ K $ will stay the same. Now, the important question is: What’s the probability for this?

We know from above, that the probability for measuring a certain label is:

P(k) = |\alpha_k|^2 = \frac{|H|}{|G|}

This also means, there are $ \frac{|G|}{|H|} $ labels altogether. How many of those labels will reveal new information and shrink the current intersection $ K $ further? We may answer this by thinking about a different question: What would be the situation if the current intersection $ K $ was the hidden subgroup itself? Then, the number of labels defining $ K $ would be $ \frac{|G|}{|K|} $. Returning to the situation for $ H $, this means, that the probability to Fourier sample a blank (a $ k $ that is among the set of labels defining the intersection $ K $) is

P_{\text{blank}} = \frac{|H|}{|G|} \cdot \frac{|G|}{|K|} = \frac{|H|}{|K|} \le \frac{|H|}{2|H|} = \frac{1}{2}

The complement of this proves the claim above: The probability to sample a $ k $ which reveals new information is at least $ \frac{1}{2} $!

And by the way, the argumentation also shows: The greater $ K $ is compared to $ H $, the faster it will shrink (with probability even greater than $ \frac{1}{2} $).
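We can simulate this whole sampling loop classically for a small example. In this Python sketch (the group $G = \mathbb{Z}_2^6$ and the generators of the hidden subgroup are assumptions for the demo) each round draws a uniform label $k$ that is trivial on $H$ and intersects its kernel with the current $K$, until only $H$ remains:

```python
# Simulation sketch of the kernel-intersection argument for G = Z_2^n:
# repeatedly sample a label k with k . h = 0 for all h in H, and intersect
# the kernels {g : k . g = 0} until K has shrunk to the hidden subgroup H.

import random
from itertools import product

n = 6
random.seed(1)
group = list(product([0, 1], repeat=n))

def dot(a, b):
    return sum(x & y for x, y in zip(a, b)) % 2

# hidden subgroup: the span of two generator bitstrings (a demo assumption)
gens = [(1, 1, 0, 0, 0, 0), (0, 0, 1, 1, 0, 0)]
H = set()
for c1 in (0, 1):
    for c2 in (0, 1):
        H.add(tuple((c1 * a) ^ (c2 * b) for a, b in zip(*gens)))

# the labels we can Fourier sample: exactly those trivial on H
labels = [k for k in group if all(dot(k, h) == 0 for h in H)]

K = set(group)                        # current intersection of sampled kernels
rounds = 0
while K != H and rounds < 200:
    k = random.choice(labels)         # one round of Fourier sampling
    K &= {g for g in group if dot(k, g) == 0}
    rounds += 1
print("recovered H after", rounds, "rounds; |H| =", len(H))
```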

Example: Shor’s Algorithm as a Hidden Subgroup Problem

As this article also deals with Shor’s algorithm, let’s look briefly into it as an example of a hidden subgroup problem (you may find all the details in any introduction of Shor’s algorithm):

$N_0$ is an integer that we want to prime-factor, and this can famously be done by finding the integer period $ r $ of the modular exponentiation of some integer $a < N_0$:

\begin{gather} f(x) = a^x \text{ mod } N_0 \quad \text{with} \quad f(x) = f(x+r) \end{gather}

Thus, Shor’s algorithm is an integer period finding algorithm. Looking at it as a hidden subgroup problem, we choose $ G = \mathbb{Z}_N $ with $N$ being some multiple of $r$. It turns out that $N=O(N_0^2)$ is a good choice and we don’t even need a precise multiple of $r$, if $N$ is large enough. We are looking for the hidden subgroup:

\begin{gather} H = \{0, r, 2r, 3r, \ldots\} \end{gather}

By now, we are able to immediately write down the character labels $k$ that we will Fourier sample for this type of problem! Above, we saw that the characters for $ \mathbb{Z}_N $ are:

\begin{gather} \chi_k(g) = e^{ 2\pi i \cdot k \cdot \frac{g}{N}} \end{gather}

Sampling only trivial characters on $H$ will give us:

\begin{gather} \chi_k(r) = \chi_k(2r) = \chi_k(3r) = \ldots = 1 \\ \iff \\ k \text{ is a multiple of } \frac{N}{r} \end{gather}

Thus, we only need to measure a few $k_1$, $k_2$, … and find their greatest common divisor, which (with high probability) equals $ \frac{N}{r} $:

\begin{gather} \text{gcd}(k_1, k_2, \ldots) = \frac{N}{r} \qquad \Longrightarrow \qquad r = \frac{N}{\text{gcd}(k_1, k_2, \ldots)} \end{gather}

That’s it!
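The classical post-processing of this example fits in a few lines. A Python sketch (the numbers $N$, $r$ and the three measured multipliers are idealized assumptions; in particular I assume $r$ divides $N$ exactly, which the real algorithm does not need):

```python
# Sketch of the classical post-processing in Shor's period finding, viewed as
# a Z_N hidden subgroup problem: the sampled labels k are multiples of N/r,
# so the gcd of a few samples reveals N/r, and with it the period r.

import math

N, r = 64, 16                 # idealized demo: r divides N exactly
step = N // r                 # = 4; measurable labels are multiples of N/r
samples = [step * m for m in (6, 10, 15)]   # e.g. three measured labels

g = 0
for k in samples:
    g = math.gcd(g, k)        # accumulates to N/r once the multipliers are coprime

print("gcd of samples:", g, "-> recovered period r =", N // g)
```

The recovery fails only if all sampled multipliers share a common factor, which becomes exponentially unlikely with more samples.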

The Hidden Subgroup Problem for non-Abelian groups

Now, we would like to solve the hidden subgroup problem for finite non-abelian groups $ (G, \cdot) $ as well. It turns out that this is much more challenging. A very interesting application is lattice cryptosystems, which are connected to dihedral groups. I will give some details about this example at the end of this article. Another very interesting application is the problem of deciding whether two graphs are isomorphic. As a teaser I will sketch this problem first in the context of the hidden subgroup problem. Afterwards we will get into the details about the quantum computing framework of the hidden subgroup problem for non-abelian groups.

Graph Isomorphism as a Hidden Subgroup Problem

Two isomorphic graphs

Image by Booyabazooka via Wikimedia

Ok, let’s look at graph isomorphism as a hidden subgroup problem. For this I will follow Richard Jozsa’s description in “Quantum factoring, discrete logarithms and the hidden subgroup problem” ix. Again, I simplified it a little bit by adding some remarks and changed notations:

An undirected graph A with vertices labeled $ 1, 2, … , n $ may be described by an $ n \times n $ matrix $ M_A $ with entries that are either 0 or 1. In this matrix, an entry $ (i, j) $ is 1 if the graph has an edge joining vertices $i$ and $j$. Two graphs A and B are said to be “isomorphic” if B can be made identical to A by re-labeling its vertices (which is essentially the same as a map from the vertices of A to the vertices of B): Vertex 1 in graph A now has the same edges as vertex 1 in graph B, vertex 2 in graph A now has the same edges as vertex 2 in graph B, … and so on. In other words, there exists a permutation $\Pi$ such that $ M_A $ is obtained by simultaneously permuting the rows and columns of $ M_B $ by $\Pi$. If the graphs A and B happen to be equal, these permutations are called “automorphisms”. The automorphisms may be viewed as the symmetry transformations of the graph: For instance, if the square graph in the image above is rotated or reflected about an axis.

So the problem is, to check whether two graphs are isomorphic or not.

Up to now, there exists no known classical algorithm to solve this problem efficiently in general.

But it fits in the hidden subgroup problem framework. Unfortunately, it is an open question whether the solution can be extracted efficiently by a quantum computer.

To model the relabeling in our framework, we think of another graph: The simple union of both graphs $ A \cup B $ with no connection between them. With this trick, we cast the isomorphism problem to an automorphism problem. The group $ G $ in this case is the finite symmetric group of all permutations $ S_{2n}$ of the numbers $ 1,2, …, 2n$.

What is the group of symmetry transformations $ H $ of this union graph? This will become our hidden subgroup. $ H $ consists of those permutations that permute the graph $ A \cup B $ to itself. Among them are definitely all symmetry transformations of the individual graphs $ A $ and $ B $. Let’s call those $H_{A,B}$:

H_{A,B} \le S_n \times S_n

Now to the crucial part: If A and B are isomorphic, we may also make one full swap of both graphs and then apply the magical and unknown permutation that relabels both graphs into each other. Let’s call this whole transformation $ \rho $. We are left with the same graph $ A \cup B $ again. After this, again every separate symmetry transformation of $ A $ and $ B $ will give us an identical graph. This is simply the same as:

H = \begin{cases} H_{A,B} & \text{if $ A $ and $ B $ are not isomorphic} \\ H_{A,B} \,\cup\, \rho \cdot H_{A,B} & \text{if $ A $ and $ B $ are isomorphic} \end{cases}

If we could construct $ H $ as a quantum superposition and randomly sample any permutation $\Pi$ from it, it would be easy to check whether $\Pi$ is part of $ H_{A,B} \leq S_n \times S_n $ or part of $ \rho \cdot H_{A,B} $:

Just look at $ \Pi(1) $!

If $ \Pi(1) \leq n $, then the new label stayed on the same side of the union graph. If $ \Pi(1) \geq n+1 $, then the label was swapped to the other side of the union graph. So we could sample a few times until we either draw a permutation in $ \rho \cdot H_{A,B} $ or we start to get very convinced that the graphs just aren’t isomorphic. Thus, we wouldn’t even need to know $ H $ itself. We would just need to sample from it.

To construct an appropriate map $ f: G \longrightarrow X $, we use

1. $ G = S_{2n} $ and

2. $ X $ as the set of $ 2n \times 2n $ matrices with 0, 1 entries

3. $ f(\Pi) $ is the matrix, that we get, if we permute the rows and columns of the matrix $ M_{A \cup B} $ by $ \Pi $.
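The construction above can be played through on a tiny example. In this Python sketch (the two path graphs and their labeling are my own demo choice; vertices are 0-indexed, so the side test becomes $\Pi(0) \geq n$) we enumerate all of $S_{2n}$, compute the automorphisms of the union graph via $f$, and find the side-swapping ones that certify the isomorphism:

```python
# Sketch: the hidden-subgroup function f for graph isomorphism. f(P) permutes
# rows and columns of the adjacency matrix M of the union graph A u B; the
# automorphisms of the union are exactly the permutations with f(P) = f(id).

from itertools import permutations

n = 3
# Demo graphs: A is the path 0-1-2, B is the same path with relabeled
# vertices 3-5-4, so A and B are isomorphic.
edges = [(0, 1), (1, 2),      # graph A on vertices 0..2
         (3, 5), (5, 4)]      # graph B on vertices 3..5
M = [[0] * (2 * n) for _ in range(2 * n)]
for i, j in edges:
    M[i][j] = M[j][i] = 1

def f(P):
    # permute rows and columns of M by P, returned as a hashable tuple-of-tuples
    return tuple(tuple(M[P[i]][P[j]] for j in range(2 * n)) for i in range(2 * n))

M0 = f(tuple(range(2 * n)))   # image of the identity permutation
autos = [P for P in permutations(range(2 * n)) if f(P) == M0]
# vertex 0 lies on the A side; if it lands on the B side, the graphs were swapped
swapping = [P for P in autos if P[0] >= n]
print("automorphisms:", len(autos), "of which swap the sides:", len(swapping))
```

Each path has two symmetries (identity and reversal), so $H_{A,B}$ has four elements, and since the graphs are isomorphic, four more swapping elements appear.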

A Fourier-like basis for non-Abelian groups

So much for the excursion. Now, we want to tackle the non-abelian hidden subgroup problem. One challenge for non-abelian groups is the fact that we are not able to construct a translation invariant basis. Earlier, we constructed this using the orthonormality relation for irreducible representations, which takes a simple form for abelian groups. For non-abelian groups, we may see in the “Appendix 1: Some facts about group and representation theory” that the diagonal block structure of every matrix representation coarse-grains: as the number of irreducible representations $ \sigma $ decreases (let’s call their set $ \hat{G} $), their sizes as $ d_\sigma \times d_\sigma $ matrices increase (being equal to $ 1 \times 1 $ for abelian groups). But all degrees of freedom (labels $\sigma$ and index pairs $i, j$) still add up to the total number of group elements: $ \sum_{\sigma \in \hat{G}} {d_{\sigma}}^2 = |G| $.

But even for general groups, there exists a nice orthonormality relation for all irreducible representations (which is a consequence of Schur’s lemma):

\frac{d_{\sigma}}{|G|} \sum_{g \in G} \sigma(g)_{i, j} \cdot \overline{\sigma'(g)_{i', j'}} = \delta_{\sigma,\sigma'} \cdot \delta_{i,i'} \cdot \delta_{j,j'}

Using this relation we are able to construct orthonormal states by incorporating labels and matrix indices. Similar to our definition of states $ \lvert \chi_k \rangle $ for the $\mathbb{C}$-functions $ \chi_k $ on $ G $ that we used earlier, we may define states $ \lvert \sigma_{i,j} \rangle $ for the $\mathbb{C}$-functions $ \sigma_{i,j} $ (the matrix elements of the irreducible representations)

\lvert \sigma_{i, j} \rangle := \sqrt{\frac{d_{\sigma}}{|G|}} \cdot \sum_{g \in G} \overline{\sigma(g)}_{i, j} \lvert g \rangle

We know from above that there are $|G|$ such states, making $\bigl( \lvert \sigma_{i, j} \rangle \bigr) $ a basis. Note that this definition is not unique, as it depends on the choice of basis within the subspace of each individual $ \sigma $.

Unfortunately this basis is not translation invariant. To see this, we left translate this state by $ g’ $ again, use the fact that $ \sigma(g) = \sigma({g’}^{-1}) \cdot \sigma(g’ \cdot g) $ and make a substitution for the sum over all $ g $:

U(g') \lvert \sigma_{i, j} \rangle
&= \sqrt{\frac{d_{\sigma}}{|G|}} \cdot \sum_{g \in G} \overline{\sigma(g)}_{i, j} \lvert g' \cdot g \rangle \\
&= \sqrt{\frac{d_{\sigma}}{|G|}} \cdot \sum_{g \in G} \overline{\sigma({g'}^{-1}) \cdot \sigma(g)}_{i, j} \lvert g \rangle

By summing out the matrix multiplication in detail, we get an inner index $ k $ and this will rotate the state away from $ \lvert \sigma_{i, j} \rangle $:

U(g') \lvert \sigma_{i, j} \rangle &= \sum_{k=1}^{d_\sigma} \overline{\sigma(g')^{-1}}_{i, k} \cdot \lvert \sigma_{k, j} \rangle \\
&= \sum_{k=1}^{d_\sigma} \sigma(g')_{k, i} \cdot \lvert \sigma_{k, j} \rangle

At least the label $ \sigma $ and the column index $ j $ are invariant. Thus, the Fourier spectrum of the coset state will only differ from the spectrum of the hidden subgroup state in the row register $ i $. Before we look into this in detail, let’s stay with this simple example of a single state $ \lvert \sigma_{i, j} \rangle $ and analyze the effect of a measurement in this basis, to give you some intuition on how to deal with the mixing of the row index $ i $:

The translation by $ g' $ will spread a $ \lvert \sigma_{i, j} \rangle $-measurement out over the row indices $ k $, each with probability:

P = | \sigma(g')_{k, i} |^2

We are still able to get rid of $ g' $. Since the second register, the one for our black box function, will collapse into arbitrary $ g' $-images in each iteration, we practically sample uniformly over all $ g' \in G $! Thus, we actually measure the mean average $ \frac{1}{|G|} \cdot \sum_{g' \in G} $ of the probability above. Altogether this will give us the following distribution for the measurement above:

\frac{1}{|G|} \cdot \sum_{g' \in G} | \sigma(g')_{k, i} |^2 = \frac{1}{d_\sigma}

by the orthonormality relation (Schur’s lemma). This indicates, that during the execution of our hidden subgroup algorithm all information about the translation $ g’ $ and the row indices $ i $ will even out.
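This averaging claim can be verified for a concrete non-abelian example. Here is a small Python sketch using the 2-dimensional irreducible representation of $S_3$, realized as the rotations and reflections of a triangle (a standard construction; the particular matrices are my own encoding):

```python
# Numeric sketch: for the 2-dimensional irreducible representation of the
# non-abelian group S_3 (the symmetry group of a triangle), check that
# (1/|G|) * sum_g |sigma(g)_{k,i}|^2 = 1/d_sigma = 1/2 for all index pairs.

import math

c, s = math.cos(2 * math.pi / 3), math.sin(2 * math.pi / 3)
R = [[c, -s], [s, c]]          # rotation by 120 degrees
F = [[1, 0], [0, -1]]          # a reflection

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# the 6 group elements {I, R, R^2, F, RF, R^2 F}
I = [[1, 0], [0, 1]]
G = [I, R, matmul(R, R), F, matmul(R, F), matmul(matmul(R, R), F)]

for k in range(2):
    for i in range(2):
        # matrices are real here, so |entry|^2 is just the square
        avg = sum(g[k][i] ** 2 for g in G) / len(G)
        assert abs(avg - 0.5) < 1e-9
print("average |sigma(g)_{k,i}|^2 = 1/d_sigma = 1/2 over S_3")
```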

Now, let’s analyze this in detail:

Strong Fourier Sampling

Again, we define the Fourier transform by extracting labels and indices from each of these $|G|$ constructed states:

\mathcal{F} \lvert \sigma_{i, j} \rangle = \lvert \sigma, i, j \rangle

Like above, this translates every state into its spectrum of Fourier labels and indices:

\mathcal{F} = \sum_{\sigma \in \hat{G}} \sum_{i, j \leq d_{\sigma}} \lvert \sigma, i, j \rangle \langle \sigma_{i, j} \lvert

By now, we know, that a translation doesn’t change the labels in the Fourier spectrum. At the end we want to sample from these labels, so let’s analyze the Fourier transform for a specific label:

\mathcal{F}_{\sigma} = \sqrt{\frac{d_{\sigma}}{|G|}} \cdot \sum_{i, j \leq d_{\sigma}} \sum_{g \in G} \sigma(g)_{i, j} \cdot \lvert \sigma, i, j \rangle \langle g \lvert

For the coset state we have:

\mathcal{F}_{\sigma} \lvert g' \cdot H \rangle
&= \sqrt{\frac{d_{\sigma}}{|G|}} \cdot \sum_{i, j \leq d_{\sigma}} \sigma(g' \cdot H)_{i, j} \cdot \lvert \sigma, i, j \rangle \\
&= \sqrt{\frac{d_{\sigma}}{|G|}} \cdot \sum_{i, j \leq d_{\sigma}} \bigl( \sigma(g') \cdot \sigma(H)\bigr)_{i, j} \cdot \lvert \sigma, i, j \rangle

where we used the definition $ \sigma(g' \cdot H) := \frac{1}{\sqrt{|H|}} \sum_{h \in H} \sigma(g' \cdot h) $

Sampling all degrees of freedom in the Fourier spectrum (labels and indices) is called “Strong Fourier Sampling”. It will give us a probability distribution $ P(\sigma, i, j ) $. In the equation above, it seems that sampling the Fourier transform still depends on the translation $ g' $. From our argument above we have the intuition that this will cancel out, since we implicitly sample uniformly over all possible $ g' $’s. In the “Appendix 3: Strong and Weak Fourier sampling in detail” we will carry this out. The result will be

P(\sigma, i, j ) &= \frac{1}{|G|} \cdot
\sum_{k=1}^{d_\sigma} | \sigma(H)_{k, j} |^2 \\
&= \frac{1}{|G|} \cdot \lVert \sigma(H)_j \rVert^2

where $ \lVert \sigma(H)_j \rVert $ is the norm of the $j$-th column vector of $ \sigma(H) $.

This shows in detail, that the probability $ P(\sigma, i, j ) $ is constant on the row register $ i $ and independent of any translation by $ g’ $.

What makes the column index so special? It’s just, that in our definition of the hidden subgroup problem we specifically used left coset states. If we had restricted the problem to right coset states, then strong Fourier sampling would differ only by the row index instead.

Also note, that the distribution actually differs for different $ j $’s. Unlike for the abelian case, generally we won’t sample only trivial representations on $ H $. I give some reasoning for this in the appendix.

Weak Fourier Sampling

Instead of sampling all degrees of freedom in the Fourier spectrum, we may also decide to sample only the labels of the irreducible representations. This is called “Weak Fourier Sampling” and will give us coarser information about the hidden subgroup:

P(\sigma) = \sum_{i,j=1}^{d_\sigma} P(\sigma, i, j )
= \frac{d_\sigma}{|G|} \cdot \sum_{k, j=1}^{d_\sigma} | \sigma(H)_{k, j} |^2

Also in “Appendix 3: Strong and Weak Fourier sampling in detail”, I show that this equals the squared “Frobenius norm” of $ \sigma(H) $, and finally we get:

P(\sigma) = \sqrt{|H|} \cdot \frac{d_\sigma}{|G|} \cdot Tr \big( \sigma(H) \big)

This may be interpreted as a direct extension of the abelian case that we derived earlier. It is also obvious that weak Fourier sampling cannot extract all information about the hidden subgroup $ H $. Specifically, it cannot distinguish between different subgroups $ H_1 $ and $ H_2 $ with equal traces $ Tr\big( \sigma(H_1) \big) = Tr\big( \sigma(H_2) \big) $ for all $ \sigma $.

A positive case are “normal subgroups”, with $ g H g^{−1} = H $ for all $ g \in G $. One may show (and Andrew Childs does so in his mentioned lecture notes viii) that for normal subgroups weak Fourier sampling will again only yield representations $ \sigma $ that are trivial on $ H $. In this case, the same argument that we used for abelian subgroups holds regarding $ H \le Ker(\sigma) $, and each new measurement will very likely at least halve the intersection $ K $ of all kernels. This shrinks $ K $ exponentially fast down to our hidden subgroup.

Limits of Fourier Sampling

The quantum computing framework for the general hidden subgroup problem is polynomially efficient as far as the number of queries to the black-box function is concerned. This fundamental result was proven by Ettinger, Hoyer and Knill. They showed that the hidden subgroup states are pairwise statistically distinguishable. So much for the good news.

Yet this does not automatically imply that we can extract this information with an efficient algorithm. We saw earlier that, unlike in the abelian case, Fourier sampling will generally not be restricted to trivial representations $ \sigma $. Also, we won’t extract information from the row index of $ \sigma(H) $. Another question is which basis within each $ \sigma $ we should choose for strong Fourier sampling. The first guess is to use a random basis. One can show that if $ H $ is sufficiently small and $ G $ is sufficiently non-abelian, then random strong Fourier sampling does not give enough information. In particular, M. Grigni et al. showed in “Quantum mechanical algorithms for the nonabelian hidden subgroup problem” x that you need at least about $ \bigg( \frac{\sqrt{|G|}}{|H| \sqrt{c(G)}} \bigg)^{1/3} $ rounds of random strong Fourier sampling (with $ c(G) $ being the number of conjugacy classes) until you can identify $ H $. This is especially disappointing for the symmetric group $ G = S_n $ of all permutations, which is needed for graph isomorphism: here $ |G| = O \big( e^{n \cdot \ln (n)} \big) $ for large $ n $, while $ c(S_n) $ only grows as $ O\big( e^{\sqrt{n}}\big) $. Later, this result for $ S_n $ was qualitatively extended to any type of single measurement basis. The situation improves significantly if one considers entangled measurements on multiple copies of the hidden subgroup states. For the “dihedral group” we will illustrate such an entangled measurement in the next section.

For the symmetric group, S. Hallgren et al. extended this line of work and showed that entangled measurements across $ log |G| $ copies are needed to extract $ H $ xi.

But even in situations where all these constraints produce favorable results, there might be no efficient quantum circuit that implements the required measurement, or no efficient classical post-processing may be known.

In the next section I show an example of sampling outcomes for the “dihedral group”.

The Hidden Subgroup Problem for Dihedral Groups

An example of a non-abelian hidden subgroup problem that may be solved by Fourier sampling, and especially by entangled measurements, is the dihedral group $ D_N $. This group represents the symmetry transformations of a regular “N-gon” (e.g. triangle, square, pentagon, hexagon, …). The example is especially interesting because cracking certain lattice cryptosystems is equivalent to the dihedral hidden subgroup problem. We will look into this application at the very end.

The dihedral group D5

Image by FaRud via Wikimedia

The dihedral group $D_5$: Do not get confused, the notation for “s” and “r” is exactly the opposite of the one in my text

Altogether there are $ |G| = 2N $ group elements, generated by a rotation $ s $ (of order $ N $) and a reflection $ r $ about a fixed axis. Thus we can write any group element in the form

s^x r^a \quad \text{with $ x \in \mathbb{Z}_N $ and $ a \in \mathbb{Z}_2 $}

By our knowledge about 2d-rotations and reflections, we know, that the group operation is given by:

s^x r^a \cdot s^y r^b = s^{x + (-1)^a y} r^{a+b}

Equivalently, we may express this as the “semidirect product” $ \mathbb{Z}_N \rtimes \mathbb{Z}_2 $ with elements $ (x, a), (y, b) \in \mathbb{Z}_N \rtimes \mathbb{Z}_2 $ and

(x, a) \cdot (y, b) = (x + (-1)^a y, a+b) \\
\text{and} \\
(x, a)^{-1} = (−(−1)^a x, a)

where the weird $\rtimes$-sign expresses the fact that the elements combine according to the latter relation and not just as a simple (direct) product.
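The composition and inverse rules above are easy to sanity-check numerically. A minimal sketch (plain Python; the helper names `mul` and `inv` are mine, not from any library):

```python
# Sketch: the dihedral group D_N as the semidirect product Z_N ⋊ Z_2,
# elements (x, a) with the composition rule from the text.
N = 5  # D_5, the symmetries of a pentagon

def mul(g1, g2):
    """(x, a) * (y, b) = (x + (-1)^a * y mod N, a + b mod 2)"""
    (x, a), (y, b) = g1, g2
    return ((x + (-1)**a * y) % N, (a + b) % 2)

def inv(g):
    """(x, a)^-1 = (-(-1)^a * x mod N, a)"""
    x, a = g
    return ((-(-1)**a * x) % N, a)

elements = [(x, a) for x in range(N) for a in range(2)]
assert len(elements) == 2 * N                              # |D_N| = 2N
assert all(mul(g, inv(g)) == (0, 0) for g in elements)     # inverses work
assert all(inv((y, 1)) == (y, 1) for y in range(N))        # reflections are self-inverse
```

The last assertion is exactly the self-inverse property of $ (y, 1) $ used later in the Kuperberg section.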

Example: Fourier Sampling the Dihedral Group D4

Another very comprehensive investigation of the quantum computing framework for the hidden subgroup problem is the master thesis “The Hidden Subgroup Problem” by Frédéric Wang xii.

In there, Wang also presents an example of the different irreducible representations and the measurement probabilities for weak and strong Fourier sampling outcomes for subgroups of $ D_4 $.


| $ \sigma $ | $ \sigma(x, 0) $ | $ \sigma(x, 1) $ |
|---|---|---|
| $ \sigma_0 $ | $ 1 $ | $ 1 $ |
| $ \sigma_1 $ | $ 1 $ | $ -1 $ |
| $ \sigma_2 $ | $ (-1)^x $ | $ (-1)^x $ |
| $ \sigma_3 $ | $ (-1)^x $ | $ (-1)^{x+1} $ |
| $ \sigma_4 $ | $ \begin{pmatrix} (-i)^x & 0 \\ 0 & i^x \end{pmatrix} $ | $ \begin{pmatrix} 0 & (-i)^x \\ i^x & 0 \end{pmatrix} $ |
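A quick numerical cross-check may help here. The sketch below (plain Python with NumPy; all names are mine) encodes the five irreducible representations of $D_4$ as I read them from Wang's table, with the two-dimensional one as $\sigma_4$, and verifies both the homomorphism property and $ \sum_\sigma d_\sigma^2 = |G| $:

```python
import numpy as np

# Sketch: the irreps of D_4 as matrix-valued functions of (x, a), plus a
# numerical check of sigma(g1*g2) == sigma(g1) @ sigma(g2) and of
# sum(d_sigma^2) == |G| = 8.
N = 4

def mul(g1, g2):
    (x, a), (y, b) = g1, g2
    return ((x + (-1)**a * y) % N, (a + b) % 2)

irreps = [
    lambda x, a: np.array([[1.0 + 0j]]),
    lambda x, a: np.array([[(-1.0)**a + 0j]]),
    lambda x, a: np.array([[(-1.0)**x + 0j]]),
    lambda x, a: np.array([[(-1.0)**(x + a) + 0j]]),
    lambda x, a: (np.diag([(-1j)**x, 1j**x]) if a == 0
                  else np.array([[0, (-1j)**x], [1j**x, 0]])),
]

G = [(x, a) for x in range(N) for a in range(2)]
for sigma in irreps:
    for g1 in G:
        for g2 in G:
            assert np.allclose(sigma(*mul(g1, g2)), sigma(*g1) @ sigma(*g2))

dims = [sigma(0, 0).shape[0] for sigma in irreps]
assert sum(d * d for d in dims) == len(G)  # 1 + 1 + 1 + 1 + 4 == 8
```

The dimension-sum identity in the last line is the general fact $ \sum_{\sigma \in \hat{G}} d_\sigma^2 = |G| $ from Appendix 1.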

As for weak and strong Fourier sampling, Wang compares two different subgroups of $ D_4 $:

H_1 = \bigl\{(0, 0), (1, 1)\bigr\} \\
\text{and} \\
H_2 = \bigl\{(0, 0), (2, 0)\bigr\}

Weak Fourier Sampling

For weak Fourier sampling this gives:

| $ \sigma $ | $ \sigma(H_1) $ | $ P(H_1, \sigma) $ | $ \sigma(H_2) $ | $ P(H_2, \sigma) $ |
|---|---|---|---|---|
| $ \sigma_0 $ | $ \sqrt{2} $ | $ \frac{1}{4} $ | $ \sqrt{2} $ | $ \frac{1}{4} $ |
| $ \sigma_1 $ | $ 0 $ | $ 0 $ | $ \sqrt{2} $ | $ \frac{1}{4} $ |
| $ \sigma_2 $ | $ 0 $ | $ 0 $ | $ \sqrt{2} $ | $ \frac{1}{4} $ |
| $ \sigma_3 $ | $ \sqrt{2} $ | $ \frac{1}{4} $ | $ \sqrt{2} $ | $ \frac{1}{4} $ |
| $ \sigma_4 $ | $ \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -i \\ i & 1 \end{pmatrix} $ | $ \frac{1}{2} $ | $ \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} $ | $ 0 $ |

Strong Fourier Sampling

Since the first four representations are 1-dimensional, weak and strong sampling do not differ for them. Only the sampling outcomes for $ \sigma_4 $ give finer-grained information. For strong Fourier sampling this gives:

| $ (\sigma, j) $ | $ P(H_1, \sigma, j) $ | $ P(H_2, \sigma, j) $ |
|---|---|---|
| $ (\sigma_4, 1) $ | $ \frac{1}{4} $ | $ 0 $ |
| $ (\sigma_4, 2) $ | $ \frac{1}{4} $ | $ 0 $ |
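We can also reproduce the weak-sampling probabilities numerically from the formula $ P(\sigma) = \frac{d_\sigma}{|G|} \cdot \lVert \sigma(H) \rVert_F^2 $ derived in Appendix 3. A sketch (NumPy; the irreps are encoded as I read them from Wang's table, and all helper names are mine):

```python
import numpy as np

# Sketch: weak Fourier sampling probabilities for subgroups of D_4 via
# P(sigma) = (d_sigma / |G|) * ||sigma(H)||_F^2, with
# sigma(H) = (1/sqrt(|H|)) * sum_h sigma(h).
N = 4
irreps = [
    lambda x, a: np.array([[1.0 + 0j]]),
    lambda x, a: np.array([[(-1.0)**a + 0j]]),
    lambda x, a: np.array([[(-1.0)**x + 0j]]),
    lambda x, a: np.array([[(-1.0)**(x + a) + 0j]]),
    lambda x, a: (np.diag([(-1j)**x, 1j**x]) if a == 0
                  else np.array([[0, (-1j)**x], [1j**x, 0]])),
]

def sigma_H(sigma, H):
    return sum(sigma(*h) for h in H) / np.sqrt(len(H))

def weak_probs(H, G_order=2 * N):
    probs = []
    for sigma in irreps:
        m = sigma_H(sigma, H)
        d = m.shape[0]
        probs.append(d / G_order * np.linalg.norm(m) ** 2)  # Frobenius norm
    return probs

H1 = [(0, 0), (1, 1)]
H2 = [(0, 0), (2, 0)]
assert np.allclose(weak_probs(H1), [0.25, 0, 0, 0.25, 0.5])
assert np.allclose(weak_probs(H2), [0.25, 0.25, 0.25, 0.25, 0])
```

The two assertions match the probability columns of the table, and each distribution sums to one, as it must.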

Entangled measurements and the Kuperberg sieve

It can be shown that the dihedral hidden subgroup problem reduces to subgroups of the type:

H = \bigl\{(0, 0), (y, 1) \bigr\} \quad \text{for some $ y \in \mathbb{Z}_N$}

Note that $ (y, 1) $ is self-inverse by the relation above.

Thus, the hidden subgroup problem for $ D_N $ reduces to finding a hidden “slope” $ y $. For simplicity, let’s assume the case $ N = 2^n $. Setting up our quantum computing framework for the dihedral hidden subgroup problem will give us coset states of the form:

U(z) \lvert H \rangle = \frac{1}{\sqrt{2}} \big( \lvert z, 0 \rangle + \lvert z + y, 1 \rangle \big)

Solving this for $ y $ is also called the “Dihedral Coset Problem (DCP)”. Following an insight by Greg Kuperberg, this can be done a little differently than in the standard approach xiii.

First we $\mathbb{Z}_N$-Fourier transform only the first register. This gives a superposition of states of the form

\lvert k \rangle \otimes \lvert \psi_k \rangle \\
\text{with} \\
\lvert \psi_k \rangle =
\frac{1}{\sqrt{2}} \big( \lvert 0 \rangle + e^{2\pi i \cdot \frac{k}{N} y} \lvert 1 \rangle \big)

Now, we measure the $ \lvert k \rangle $-register. If we could generate / postselect states with $ k = N/2 $, we would end up with:

\lvert \psi_{N/2} \rangle =
\frac{1}{\sqrt{2}} \big( \lvert 0 \rangle + (-1)^y \lvert 1 \rangle \big)

If we measure such a state in the X-basis $ \lvert \pm \rangle = (\lvert 0 \rangle \pm \lvert 1 \rangle)/\sqrt{2} $, we learn whether $ y $ is even or odd (its “parity”). This parity is nothing but the last, “least significant” bit of $ y $. Kuperberg realized that you can use this procedure to iteratively learn the last bit, the second-to-last bit, the third-to-last bit, … until you have learned all bits of the hidden slope $ y $. This is possible because the dihedral group $ D_N $ is recursively self-similar: It contains two subgroups that are isomorphic to $ D_{N/2} $
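The parity-extraction step can be illustrated with a tiny state-vector computation. The sketch below (NumPy; all names are mine) confirms that an X-basis measurement of $ \lvert \psi_{N/2} \rangle $ reveals the parity of $ y $ with certainty:

```python
import numpy as np

# Sketch: measuring |psi_{N/2}> = (|0> + (-1)^y |1>)/sqrt(2) in the X basis
# yields |+> with certainty for even y and |-> with certainty for odd y.
def psi(k, y, N):
    """|psi_k> = (|0> + exp(2*pi*i*k*y/N) |1>) / sqrt(2)"""
    return np.array([1, np.exp(2j * np.pi * k * y / N)]) / np.sqrt(2)

def x_basis_probs(state):
    plus = np.array([1, 1]) / np.sqrt(2)
    minus = np.array([1, -1]) / np.sqrt(2)
    return abs(plus.conj() @ state) ** 2, abs(minus.conj() @ state) ** 2

N = 2 ** 6
for y in range(N):
    p_plus, p_minus = x_basis_probs(psi(N // 2, y, N))
    parity = 0 if p_plus > 0.5 else 1   # the outcome is deterministic here
    assert parity == y % 2
```

For $ k = N/2 $ the phase $ e^{2\pi i k y / N} $ collapses to $ (-1)^y $, which is exactly why the measurement is deterministic.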

\big\{(2x, 0), (2x, 1) : x \in \mathbb{Z}_{N/2} \big\} \\
\big\{(2x, 0), (2x + 1, 1) : x \in \mathbb{Z}_{N/2} \big\}

If the parity of $ y $ is even, then the hidden subgroup lies in the former, and if the parity of $ y $ is odd, then it lies in the latter. But this recursive group structure tells us even more: If we simply drop a certain bit (respectively one certain qubit) from the $\mathbb{Z}_N$-register of our coset state, we get a new dihedral $(n-1)$-bit coset state which lives entirely in the even or in the odd subgroup. And the bit we have to drop is the same parity bit that we just learned for our hidden slope $y$. The new coset state contains a new hidden subgroup with hidden slope $ y’ $: just like the new coset state, the new slope is the string of the remaining $n-1$ bits, and its new parity bit is the second-to-last bit of $y$. Thus, by repeating this procedure, we are able to learn all bits of the hidden slope $y$.

But …

… we still need to figure out, how to generate the state $ \lvert \psi_{N/2} \rangle $!

For one thing: Since we measure the $ \lvert k \rangle $-register, we may hope to simply draw $ k = N/2 $. But since $ N = 2^n $, this is exponentially unlikely.

Kuperberg’s insight was, to entangle two such states $ \lvert \psi_p \rangle $ and $ \lvert \psi_q \rangle $ via a Controlled NOT. This gives a state of the same kind:

\lvert \psi_p, \psi_q \rangle = \frac{1}{\sqrt{2}}
\big( \lvert \psi_{p+q} \rangle \lvert 0 \rangle + \alpha \cdot \lvert \psi_{p-q} \rangle \lvert 1 \rangle \big)

with some irrelevant phase $ \alpha $. After measuring the second register, we get a state of type $ \lvert \psi_{p+q} \rangle $ or $ \lvert \psi_{p-q} \rangle $. Remember that we know $ p $ and $ q $, since we measured the corresponding register right after the Fourier transform. Now we can think of a whole bunch of clever ways to pair $ p $ and $ q $ according to their bitstrings, so that after the entangled measurement we get a similar state with more zeros in its label. Such new states can be paired again, so that after yet another entangled measurement we get states with even more zeros in their labels. Thus, by collecting states, measuring their Fourier labels, pairing cleverly and repeating entangled measurements, we can generate a sequence of states that pretty soon ends in $ \lvert \psi_{N/2} \rangle $:

\lvert \psi_{k} \rangle = \lvert \psi_{******} \rangle \longrightarrow \lvert \psi_{*****0} \rangle
\longrightarrow \lvert \psi_{****00} \rangle
\longrightarrow …
\longrightarrow \lvert \psi_{*00000} \rangle = \lvert \psi_{N/2} \rangle
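To get a feeling for the sieve, here is a purely classical toy that only tracks the Fourier labels: each round pairs labels that agree on one more low bit, the CNOT combination turns a pair $ (p, q) $ into $ p+q $ or $ p-q $ mod $ N $ (each with probability 1/2), and we keep the outcomes that gained a trailing zero. This is only bookkeeping, not Kuperberg's actual subexponential algorithm (which pairs labels on larger blocks of bits), and all names are mine:

```python
import random

# Toy classical bookkeeping of the sieve for N = 2^n: drive labels towards
# N/2 = 0b100...0 by zeroing out one low bit per round.
def sieve(n, n_states=20000, seed=0):
    random.seed(seed)
    N = 1 << n
    labels = [random.randrange(N) for _ in range(n_states)]
    for m in range(n - 1):
        buckets, survivors = {}, []
        for p in labels:
            r = p % (1 << (m + 1))        # pair labels with equal low bits
            if r in buckets:
                q = buckets.pop(r)
                # entangled measurement: p+q or p-q, each with prob. 1/2
                s = (p + q) % N if random.random() < 0.5 else (p - q) % N
                if s % (1 << (m + 1)) == 0 and s != 0:
                    survivors.append(s)   # keep labels that gained a zero bit
            else:
                buckets[r] = p
        labels = survivors
    return (N // 2) in labels             # did we produce |psi_{N/2}>?
```

With a few thousand starting states, `sieve(6)` reliably produces the label $ N/2 = 32 $; the point of the toy is only to show how the pool of labels shrinks while their trailing zeros grow.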

Lattice cryptography and the Dihedral Hidden Subgroup Problem

To complete this overview of the hidden subgroup problem, we will look into a highly interesting application of the dihedral hidden subgroup problem. It was discovered by Oded Regev in 2003 xiv. In particular, he reduced certain “shortest vector problems” (SVP), that occur in lattice cryptography, to the dihedral hidden subgroup problem.

2-dimensional lattice with shortest vectors

Image by Catslash via Wikimedia

A mathematical lattice is a periodic set of vectors in $ \mathbb{R}^n $ that is integer-spanned by a few basis vectors $ v_k $. These basis vectors might not be particularly short. A very well studied problem is the question for the shortest vector in the whole lattice (SVP). The larger the dimension of the lattice gets, the harder this shortest vector problem becomes. All known classical algorithms for the SVP have an exponential runtime, making it a great candidate for cryptography. In his seminal work, Regev proved that finding this shortest vector may be reduced to finding the hidden slope in a dihedral hidden subgroup problem. The dihedral coset problem that we examined in the previous section can be extended to the “two point problem” of finding the minimal distance between two vectors $\vec{x}$ and $\vec{y}$ in a lattice, via a collection of states of the form:

\frac{1}{\sqrt{2}} \big( \lvert \vec{x}, 0 \rangle + \lvert \vec{y}, 1 \rangle \big)

This sounds surprising at first, because here we are dealing with vectors. But the problem may be cast as an integer problem by chaining the integer components $ x_i - y_i $ of the distance vector into a representation with base $ 2M $ (for a suitable bound $ M $), which is the hidden slope to be found:

\sum_{i=1}^n (x_i - y_i) \cdot (2M)^{i-1}

By knowing $M$ and this value, we are able to read off the integer components of the distance vector. Since we also know the basis of the lattice, we have solved the two point problem.
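The packing and unpacking of the distance vector can be sketched in a few lines (plain Python; `encode` and `decode` are hypothetical helper names, and the components are assumed to lie strictly between $ -M $ and $ M $):

```python
# Sketch: packing the integer components of a lattice distance vector into a
# single base-2M integer (the "hidden slope"), and unpacking them again.
def encode(diff, M):
    # map each component into 0..2M-1 and chain in base 2M
    return sum((d % (2 * M)) * (2 * M) ** i for i, d in enumerate(diff))

def decode(value, n, M):
    comps = []
    for _ in range(n):
        d = value % (2 * M)
        comps.append(d if d < M else d - 2 * M)   # map back into (-M, M)
        value //= 2 * M
    return comps

diff = [3, -2, 0, 7]
M = 8
assert decode(encode(diff, M), len(diff), M) == diff
```

Because each component fits into one base-$2M$ digit, knowing the packed integer (and $M$) recovers the whole distance vector, which is the point made in the text.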

An essential part of Regev’s solution to the shortest vector problem is to partition the lattice into labeled elementary cubes, in such a way that each cube contains either no lattice points or exactly two nearest lattice points. This can actually be done without knowing the shortest vector upfront. By storing the label of each cell in an ancillary register, the superposition of all lattice points will collapse to the points of exactly one elementary cube with two nearest points after measuring this register.

A nice introduction to this subject may be found here xv.

By the way: Just two years later Regev introduced the “Learning with Errors” Problem (LWE), which started a revolution in cryptography xvi.

In 2018, he was honored with the Gödel-Prize for his work.

Appendix 1: Some facts about group and representation theory


Of course, a “group” is a set $ G $ with an operation that maps every pair of elements into the set again: $ g_1 \cdot g_2 \in G $. For every element $ g $ there is also an “inverse” element $ g^{-1} $, which maps $ g $ to the “neutral” element $ e $, itself part of the group. The group is called “abelian” if the operation commutes for all elements. In this case the group operation is sometimes written $ + $, as we will also do.

The number of all elements in G (its “order”) is denoted as $ | G | $.


Example $ \mathbb{Z}_2^n $:

The set of bitstrings of length $ n $ forms the group $ \mathbb{Z}_2^n $ under bitwise addition modulo 2 (XOR): e.g. $ 01011 + 00001 = 01010 $ or $ 11111 + 00001 = 11110 $.

Example $ \mathbb{Z}_{2^n} $:

Of course, the same set of bitstrings may also be interpreted in other ways, e.g. as the group $ \mathbb{Z}_{2^n} $: the set of integers $ 0, 1, …, 2^n-1 $, which wrap around at $0$ just like the hours of a clock. The same bitstrings just map to a different element under the group operation: $ 01011 + 00001 = 01100 $ or e.g. $ 11111 + 00001 = 00000 $.
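In code, the two group structures on the same bitstrings are simply XOR versus modular integer addition (plain Python sketch; the function names are mine):

```python
# Sketch: two different group operations on the same 5-bit strings.
n = 5

def add_xor(a, b):
    """Z_2^n: component-wise addition mod 2, i.e. bitwise XOR."""
    return a ^ b

def add_mod(a, b):
    """Z_{2^n}: clock-like integer addition mod 2^n."""
    return (a + b) % (1 << n)

x, y = 0b01011, 0b00001
assert add_xor(x, y) == 0b01010      # 01011 + 00001 = 01010 in Z_2^n
assert add_mod(x, y) == 0b01100      # 01011 + 00001 = 01100 in Z_{2^n}
assert add_xor(0b11111, 1) == 0b11110
assert add_mod(0b11111, 1) == 0      # wraps around like a clock
```

The four assertions reproduce exactly the examples from the two paragraphs above.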


A subset $ H \subset G $ is a “subgroup” if it also has all the group properties and all of $ H \cdot H $ stays in $ H $. Note that $ H = \bigl\{0, b \bigr\} $ from Simon’s problem is a subgroup of $ \mathbb{Z}_2^n $.

A “coset” is a set of type $ g \cdot H $ or $ g + H $ for a subgroup H and any $ g \in G $.

By “Lagrange’s theorem”, there are $ | G | / | H | $ distinct cosets.

Probably you knew all of this already. But the next section about representations might be less familiar …

BTW: A nice, short and rather gentle introduction can also be found here xvii:


A “representation” $ \rho $ of a group is a way to encode the group elements as complex, invertible $N \times N$ matrices. Further, it has to be a “group homomorphism”

\rho: G \longrightarrow GL \bigl(\mathbb{C}^N \bigr) \\
\rho(g_1 \cdot g_2) = \rho(g_1) \cdot \rho(g_2)

So $\rho$ translates the group operation into ordinary matrix multiplication.

If $G$ is abelian this has some interesting consequences: Since

\rho(g_1 + g_2) = \rho(g_1) \cdot \rho(g_2) \\
= \rho(g_2 + g_1) = \rho(g_2) \cdot \rho(g_1)

all the matrices in $ \rho(G) $ commute. That means they have a common eigenbasis, and there exists an invertible matrix $ V $ such that

V \cdot \rho(g) \cdot V^{-1} \quad \text{is diagonal for all $ g \in G $.}

Irreducible Representations

In the abelian case above, the diagonal entries of $ V \cdot \rho(g) \cdot V^{-1} $ are complex numbers $ \sigma(g) $, one function $ \sigma $ per diagonal position.

Since $ V \cdot \rho \cdot V^{-1} $ is also a homomorphism and thus a representation, so are all the $ \sigma $. We see that $ V \rho V^{-1} $ is “reducible”: It may be decomposed into smaller “irreducible representations” $ \sigma $, and so may $ \rho $.

A key insight of representation theory is that any representation may be decomposed in this way into irreducible representations. For abelian groups these are just homomorphism functions

\sigma: G \longrightarrow \mathbb{C} \quad \text{abelian}

and there are $ |G| $ different ones.

In general, for non-abelian groups, the $ \sigma $ map to small $d_\sigma \times d_\sigma$ matrices, with an individual dimension $ d_\sigma $ per representation.

\sigma: G \longrightarrow GL \bigl(\mathbb{C}^{d_\sigma} \bigr) \quad \text{non-abelian}

Also, in general, there are fewer such irreducible representations, usually denoted as $ \sigma \in \hat{G} $. But all degrees of freedom (labels and matrix indices) still add up to $|G|$:

\sum_{\sigma \in \hat{G}} d_\sigma^2 = |G|

So, as the number of irreducible representations decreases, their sizes increase. In other words: The diagonal block structure of the group representations becomes coarser.


Now to the aforementioned “characters”: You may construct an interesting group function by taking the trace of a representation.

\chi(g) = Tr( \rho(g) )

Because traces are cyclic, a character has the nice symmetry property of being constant on “conjugacy classes”. Applied elementwise to a subgroup $H$, this means:

\chi(g \cdot H \cdot g^{-1} ) = \chi(H)

Also, you may extract some central properties of any group just by studying the characters, especially the characters of the irreducible representations

\chi(g) = Tr( \sigma(g) )

For abelian groups, these characters are of course just the irreducible representations themselves. So they have the homomorphism property

\chi_z(g_1 + g_2) = \chi_z(g_1) \cdot \chi_z(g_2)

Appendix 2: Fourier sampling the abelian coset state in detail

In this appendix we will carry out the calculation for the Fourier transform of an abelian coset state in detail. I will closely follow the presentation in Andrew Childs’ great “Lecture Notes on Quantum Algorithms” (chapter 6).

\lvert \psi_5 \rangle &= \mathcal{F} \lvert g + H \rangle \\
&= \frac{1}{\sqrt{|H|\cdot|G|}} \cdot \sum_{k \in G} \sum_{h \in H} \chi_k(g + h) \lvert k \rangle \\
&= \sqrt{\frac{|H|}{|G|}} \cdot \sum_{k \in G} \chi_k(g) \cdot \chi_k(H) \lvert k \rangle

using the definition:

\chi_k(H) := \frac{1}{\sqrt{|H|}} \cdot \sum_{h \in H} \chi_k(h)

Now, this latter sum is highly interesting:

I claim: If you sum up $\chi_k$ like this (over all elements of a group or a subgroup), then the sum cancels to $ 0 $ for every $\chi_k$, except for those that are trivial ($\chi_k = 1 $) on the group or subgroup!

Why so?

Let’s select an arbitrary $ h’ \in H $ and take a closer look at $ \chi_k(H) $:

\chi_k(H) &= \frac{1}{\sqrt{|H|}} \cdot \sum_{h \in H} \chi_k(h) \\
&= \frac{1}{\sqrt{|H|}} \cdot \sum_{h \in H} \chi_k(h + h’) \\
&= \frac{1}{\sqrt{|H|}} \cdot \chi_k(h’) \cdot \sum_{h \in H} \chi_k(h) \\
&= \chi_k(h’) \cdot \chi_k(H)

where the second line holds because adding $ h’ $ merely permutes the elements of $ H $, and the third line uses the homomorphism property of $ \chi_k $.

This is only possible if $ \chi_k(h’) = 1 $ or $ \chi_k(H) = 0$.

This leaves us with:

\lvert \psi_5 \rangle
= \sqrt{\frac{|H|}{|G|}} \cdot \sum_{k: \chi_k(H) = 1} \chi_k(g) \lvert k \rangle
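This result is easy to verify numerically for a small abelian group. The sketch below (NumPy; conventions as in the text, all names mine) Fourier transforms the coset states of $ H = \{0, 4\} $ in $ G = \mathbb{Z}_8 $ and checks that only labels $ k $ with $ \chi_k $ trivial on $ H $ (here: the even $ k $) survive:

```python
import numpy as np

# Sketch: the Fourier transform of any coset state |g + H> over Z_8 with
# H = {0, 4} is supported only on labels k with chi_k(4) = exp(2*pi*i*k*4/8)
# = (-1)^k = 1, i.e. on even k.
N, H = 8, [0, 4]
F = np.array([[np.exp(2j * np.pi * k * x / N) for x in range(N)]
              for k in range(N)]) / np.sqrt(N)

for g in range(N):
    coset = np.zeros(N, dtype=complex)
    for h in H:
        coset[(g + h) % N] = 1 / np.sqrt(len(H))   # |g + H>
    amplitudes = F @ coset
    support = {k for k in range(N) if abs(amplitudes[k]) > 1e-12}
    assert support == {0, 2, 4, 6}                 # same for every coset g
```

Note that the support is the same for every translation $ g $, only the phases $ \chi_k(g) $ of the surviving amplitudes differ, exactly as in the formula above.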

Appendix 3: Strong and Weak Fourier sampling in detail

In the main text we saw, that Fourier transforming a non-abelian coset state gives us:

\mathcal{F}_{\sigma} \lvert g’ \cdot H \rangle
&= \sqrt{\frac{d_{\sigma}}{|G|}} \cdot \sum_{i, j \leq d_{\sigma}} \sigma(g’ \cdot H)_{i, j} \cdot \lvert \sigma, i, j \rangle \\
&= \sqrt{\frac{d_{\sigma}}{|G|}} \cdot \sum_{i, j \leq d_{\sigma}} \big( \sigma(g’) \cdot \sigma(H)\bigr)_{i, j} \cdot \lvert \sigma, i, j \rangle

where we have used the definition $ \sigma(g’ \cdot H) := \frac{1}{\sqrt{|H|}} \sum_{h \in H} \sigma(g’ \cdot h) $

Strong Fourier sampling in detail

The probability to measure $\sigma, i, j$ for the coset state $\lvert g’ \cdot H \rangle $ still depends on $ g’ $:

P(g’, \sigma, i, j ) = \frac{d_{\sigma}}{|G|} \cdot \big| \big( \sigma(g’) \cdot \sigma(H)\bigr)_{i, j} \big|^2

This looks disappointing. But we can still get rid of $ g’$! Since the second register, the one for our black-box function, collapses to an arbitrary $ g’$-image in each iteration, we effectively sample uniformly over all $ g’ \in G $! Thus, if we just keep measuring $ \sigma, i, j $, this gives us the average $ \frac{1}{|G|} \sum_{g’ \in G} $:

P(\sigma, i, j ) = \frac{d_{\sigma}}{|G|} \cdot \frac{1}{|G|} \sum_{g’ \in G} \big| \big( \sigma(g’) \cdot \sigma(H)\bigr)_{i, j} \big|^2

If we write the matrix multiplication in detail, we get:

P(\sigma, i, j ) = \frac{d_{\sigma}}{|G|^2} \cdot \sum_{g’ \in G} \big| \sum_{k=1}^{d_\sigma} \big( \sigma(g’) \big)_{i,k} \cdot \big( \sigma(H)\bigr)_{k, j} \big|^2

Oh no, this looks even messier!

But don’t get confused by all the notation. The key to success here is the observation that for each $k$, $ \big( \sigma(g’)_{i,k} \big)_{g’} =: \big(v_k \big)_{g’} $ may be interpreted as a $ |G| $-dimensional vector $ v_k $ with index $ g’$. To simplify the sum, we ignore $ i, j $ for a second and also define the complex coefficients $ \gamma_k := \sigma(H)_{k, j} $.

This makes the sum:

P(\sigma, i, j ) = \frac{d_{\sigma}}{|G|^2} \cdot \sum_{g’ \in G} \big| \sum_{k=1}^{d_\sigma} \big(v_k \big)_{g’} \cdot \gamma_k \big|^2

Now, think about the following claim for a second: This is nothing but the squared norm of the superposition of the vectors $ (v_k) $ in the $|G|$-dimensional vector space with index $g’$! So we get rid of the sum over $ g’ $:

P(\sigma, i, j ) = \frac{d_{\sigma}}{|G|^2} \cdot
\left\lVert \sum_{k=1}^{d_\sigma} \big(v_k \big) \cdot \gamma_k \right\rVert^2

Note that these vectors are not quantum states (we just measured one). The norm of the $ |G| $-dimensional vector is just a way of interpreting the distribution. Since it is not a quantum state, we omit the bracket notation. This norm simplifies further by the observation that the vectors $ \big(v_k \big) $ are pairwise orthogonal by Schur’s lemma (the orthogonality relations for the $\sigma$’s).

Now, if you form a superposition of any set of orthogonal vectors with some coefficients, what is its squared norm? It is just the sum of the squared moduli of the coefficients $ \gamma_k $ times the inner products $ \langle v_k, v_k \rangle $ (in this case given by Schur’s lemma again: $ \langle v_k, v_k \rangle = |G| / d_\sigma $)! No matter how large the vectors are and what the vector space is, all that counts is that they are orthogonal. So the sum simplifies to:

P(\sigma, i, j ) &= \frac{d_{\sigma}}{|G|^2} \cdot
\sum_{k=1}^{d_\sigma} | \gamma_k |^2 \cdot \frac{|G|}{d_{\sigma}} \\
&= \frac{1}{|G|} \cdot
\sum_{k=1}^{d_\sigma} | \sigma(H)_{k, j} |^2 \\
&= \frac{1}{|G|} \cdot \lVert \sigma(H)_j \rVert^2

where $ \lVert \sigma(H)_j \rVert $ is the norm of the j’th column vector of $ \sigma(H) $.

This shows in detail that the probability $ P(\sigma, i, j ) $ is constant on the row register $ i $ and independent of any translation by $ g’ $.

Weak Fourier sampling in detail

If we only care about measuring the Fourier labels, we need to sum over this distribution. Since the distribution is constant on $i$, the sum $ \sum_i $ will give us a constant factor $ d_\sigma $:

P(\sigma) = \sum_{i,j=1}^{d_\sigma} P(\sigma, i, j )
= \frac{d_\sigma}{|G|} \cdot \sum_{k, j=1}^{d_\sigma} | \sigma(H)_{k, j} |^2

This sum is also the square of a canonical matrix norm, the “Frobenius norm”, for which a nice trace identity holds:

P(\sigma) = \frac{d_\sigma}{|G|} \|\sigma(H) \|_F^2 =
\frac{d_\sigma}{|G|} Tr \big( \sigma(H)^\dagger \cdot \sigma(H) \big)

This may be simplified further. Recalling our trick for abelian Fourier sampling,

\sigma(H) = \sigma(h) \cdot \sigma(h^{-1}H) = \sigma(h) \cdot \sigma(H)

we might be tempted to conclude that $ \sigma(h) = 1 $ and that we will Fourier sample only those $ \sigma $’s that are trivial on $ H $. But what is true for complex numbers does not necessarily hold for matrices as well. All that the equation above tells us is:

\sigma(H) \cdot \sigma(H) = \frac{1}{\sqrt{|H|}} \sum_{h \in H} \sigma(h) \cdot \sigma(H) = \frac{|H|}{\sqrt{|H|}} \cdot \sigma(H) = \sqrt{|H|} \cdot \sigma(H)

Thus $ \sigma(H) $ is proportional to a projection matrix (and each $ \sigma(h) $ may act non-trivially on the kernel of $ \sigma(H) $). We may use the equation above to simplify the Frobenius norm, since $ \sigma(H)^\dagger = \sigma(H^{-1}) = \sigma(H) $ (think of this as written out in sums):

P(\sigma) &= \frac{d_\sigma}{|G|} \cdot Tr \big( \sigma(H)^\dagger \cdot \sigma(H) \big) \\
&= \sqrt{|H|} \cdot \frac{d_\sigma}{|G|} \cdot Tr \big( \sigma(H) \big)
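A small numerical check of the projection property and the trace identity, using the two-dimensional irrep of $ D_4 $ from the example section and $ H_1 = \{(0,0), (1,1)\} $ (NumPy sketch; all names are mine):

```python
import numpy as np

# Sketch: verify sigma(H)^2 = sqrt(|H|) * sigma(H), that sigma(H) is
# Hermitian, and the resulting trace identity for the Frobenius norm.
def sigma(x, a):
    """The 2-dimensional irrep of D_4, as in the example section."""
    return (np.diag([(-1j)**x, 1j**x]) if a == 0
            else np.array([[0, (-1j)**x], [1j**x, 0]]))

H = [(0, 0), (1, 1)]
sH = sum(sigma(*h) for h in H) / np.sqrt(len(H))   # sigma(H)

assert np.allclose(sH @ sH, np.sqrt(len(H)) * sH)  # proportional to a projector
assert np.allclose(sH, sH.conj().T)                # sigma(H)^dagger = sigma(H)

frob = np.trace(sH.conj().T @ sH)                  # squared Frobenius norm
assert np.allclose(frob, np.sqrt(len(H)) * np.trace(sH))
```

The last assertion is exactly the step $ Tr\big( \sigma(H)^\dagger \sigma(H) \big) = \sqrt{|H|} \cdot Tr\big( \sigma(H) \big) $ from the derivation above.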


i “How Much Structure Is Needed for Huge Quantum Speedups?” Transcript of Scott Aaronson‘s talk at the Solvay Physics Conference in Brussels, May 2022.

ii “A super-polynomial quantum advantage for combinatorial optimization problems” paper by N. Pirnay, V. Ulitzsch, F. Wilde, J. Eisert, J. Seifert 2022

iii “A rigorous and robust quantum speed-up in supervised machine learning”, 2020 paper by Y. Liu, S. Arunachalam, K. Temme. It presents a very nice machine learning example with data that a classical computer can easily generate by modular exponentiation but cannot classify / distinguish from random data due to the classical hardness of discrete logarithms, whereas the quantum support vector machine can.

iv “The Complexity of NISQ”, 2022 paper by S. Chen, J. Cotler, Hsin-Yuan (Robert) Huang, J. Li, one of the highest SciRated preprints in quantum computing in the last year. It introduces a new complexity class for NISQ algorithms and proves that there exists an exponential oracle separation between BPP, NISQ and BQP (in this order).

v You will find a very similar introduction to Simon’s problem and a nice coding example in the Qiskit Textbook

vi Ekert & Jozsa 1998, “Quantum Algorithms: Entanglement Enhanced Information Processing”, which also gives details about the general Fourier transform over finite groups

vii “Finally, a Problem That Only Quantum Computers Will Ever Be Able to Solve”, a nice article on Quantamagazine about the Forrelation problem and the breakthrough in quantum complexity theory. There, you will also find a link to the preprint.

viii Andrew Childs’ truly great “Lecture Notes on Quantum Algorithms” and a very comprehensive reference for the hidden subgroup problem

ix Richard Jozsa 2000, “Quantum factoring, discrete logarithms and the hidden subgroup problem”

x M. Grigni et al. 2001, “Quantum mechanical algorithms for the nonabelian hidden subgroup problem”

xi S. Hallgren et al. 2005, “Limitations of Quantum Coset States for Graph Isomorphism”

xii F. Wang’s master thesis 2010, “The Hidden Subgroup Problem”, a very comprehensive investigation of prior work on the subject.

xiii Greg Kuperberg 2003, “A subexponential-time quantum algorithm for the dihedral hidden subgroup problem”. You will also find an introduction in Andrew Childs’ lecture notes and F. Wang’s master thesis.

xiv Oded Regev 2003, “Quantum Computation and Lattice Problems”

xv Curtis Bright 2009, “From the Shortest Vector Problem to the Dihedral Hidden Subgroup Problem”

xvi Oded Regev 2005, “On lattices, learning with errors, random linear codes, and cryptography”

xvii Ricky Roy, “Representation Theory”, a nice, short and rather gentle introduction

Does Quadratic Quantum Speedup have a problem?

As quantum computing evolves, the community focuses on the next steps towards fault tolerance. Recent estimates of the overhead of error correcting codes under reasonable hardware assumptions reach unfavorable conclusions for quadratic quantum speedups. As this might affect important use cases for quantum computing, we should take a closer look at these results.

Image by PublicDomainPictures from Pixabay

Google and the problem with quadratic quantum speedups

As quantum computing evolves, the community starts to focus on the next steps towards fault tolerance. First trial implementations of error correcting codes, such as the surface code, have been successfully executed, and companies start to work out industrially relevant use cases beyond NISQ in detail. These developments highlight an aspect that is often overlooked: the computational overhead of fault tolerance. In this light, Google’s Quantum AI team published a study in 2021, “Focus beyond quadratic speedups for error-corrected quantum advantage” i, in which they provide some “back of the envelope” calculations and arrive at rather unfavorable results for the quantum case. They roughly estimate the runtime of quantum algorithms with a small polynomial speedup and ask: How long would such an algorithm need to run in order to beat the classical runtime? They simplify the question into the following generic form:

Consider a quantum algorithm that repeatedly executes a quantum primitive $ M $ times (e.g. an oracle for optimization). If the runtime of each round is $ t_Q $, the total runtime is:

T_{total} = M \cdot t_Q

If the classical algorithm needs to execute the same primitive $ M^d $ times (e.g. $ d=2 $ for a quadratic speedup like for Grover search), the total classical runtime is:

T_{total} = M^d \cdot t_C

The central question is: How long are $ t_Q $ and $ t_C $? They argue that the runtime of such quantum algorithms is mostly governed by costly, error-corrected Toffoli gates, which are the quantum version of classical NAND gates. Toffoli gates are not “native” to the surface code and require the generation or “distillation” of so-called magic states from a number of noisy states.
Considering state-of-the-art surface code implementations and reasonable assumptions for superconducting quantum computers, they predict a Toffoli execution time of 170 µs (they also mention the case of trapped ions). How many Toffoli executions can we expect until we achieve a quantum advantage? In the case of a Grover search implementation for optimization, Google’s team reasons a count of order $ N $ Toffolis ($ N $ being the number of qubits) for a typical oracle, with at least $ N = 100 $ needed to hit quantum advantage. Thus

t_Q = 440 ms

On the classical side they give an estimate of

t_C = 33 ns

Such a quantum algorithm would have to run for 2.4 hours in order to break even with the classical algorithm. Of course, a Grover search for $ N = 100 $ would need only about 10 cycles to end, but these are just estimates to give an impression. Google’s team also explains that considering even larger $N$s (and more interesting problems) would be even less favorable for the quantum side. I guess you could also argue to sequentially execute many $ N = 100 $ problem instances until you reach the breakeven time.

Now, 2.4 hours seems like good news, until you consider multiple CPU cores and a classical algorithm that exploits parallelization (which would be easy for optimization). Even worse: In order to run the surface code that error-corrects the quantum computer, you also need classical processing. And you need lots of it: Google’s team mentions “thousands of classical CPUs”. And if you use an additional 3000 CPUs on the classical side as well, the breakeven time increases to 1 year!

BTW, they also compare annealing algorithms on Sherrington-Kirkpatrick model instances, with even worse results for quantum computers. Now, this looks very disappointing for quadratic speedups, so let’s hope for significantly better error correcting codes and quantum hardware. But Google’s team also considers the cases $ d = 3, 4 $, with drastically better results for the quantum algorithms, especially for quartic speedups. Unfortunately, there aren’t that many such algorithms, and most speedups for optimization are at most quadratic ii.
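The generic break-even logic behind these comparisons can be sketched in a few lines. Setting $ M \cdot t_Q = M^d \cdot t_C $ gives a break-even repetition count $ M^* = (t_Q / t_C)^{1/(d-1)} $. The toy below (plain Python; it ignores parallelization and all of the paper's detailed modeling, so the absolute numbers are only illustrative) shows how strongly higher-degree speedups shorten the break-even runtime:

```python
# Sketch: break-even point M* where M * t_Q = M^d * t_C, and the resulting
# quantum runtime M* * t_Q, for polynomial speedup degrees d.
def breakeven(t_q, t_c, d):
    m_star = (t_q / t_c) ** (1.0 / (d - 1))
    return m_star, m_star * t_q

t_q, t_c = 440e-3, 33e-9   # per-primitive estimates in the same spirit as above
times = {d: breakeven(t_q, t_c, d)[1] for d in (2, 3, 4)}

# higher-degree speedups reach break-even dramatically sooner
assert times[2] > times[3] > times[4]
```

Since $ t_Q / t_C $ is a huge ratio, taking its square or cube root (for $ d = 3, 4 $) collapses the break-even runtime by many orders of magnitude, which is the qualitative point of the cubic and quartic cases.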

Google’s team states that their and similar results are already “folk wisdom” in the small community that studies error correction. Indeed, Microsoft’s Matthias Troyer, for instance, formulated similar concerns in the past iii.

A detailed study by Montanaro et al

Already in 2019, Montanaro et al published a much more detailed resource estimation on the same subject: “Applying quantum algorithms to constraint satisfaction problems” iv. Again, at first sight the researchers calculate very promising prospects for best-case random instances of the classes 12-SAT, 14-SAT and graph coloring. These classes are among the hardest combinatorial optimization problems out there. Montanaro et al even compared the quantum algorithms against leading classical algorithms for the same task. In particular, they also achieved their performance results by trading space for time, resulting in a very large physical qubit overhead. You may have noticed that Google’s estimates assume that the Toffoli gates are executed sequentially. Actually, they mention the option of performing magic state distillation for Toffoli gates in parallel. They argue that, due to the runtime of the involved Clifford gates and the routing overhead, such an effect can only accelerate the quantum algorithm by a factor of 10 to 100.

But even Montanaro et al conclude, and again they formulate this in great detail: once the classical processing overhead for the surface code is taken into account, this quantum advantage vanishes. BTW, Google’s team states that error correction needs to improve by about three orders of magnitude to enable favorable results. But this would require significant breakthroughs. Let’s hope that this is not the end of the story. Nevertheless, it once more emphasizes the role of exponential quantum speedups.

Exponential Quantum Advantage: Aaronson’s state of the union

If you wonder what Scott Aaronson, one of quantum computing‘s most renowned scientists, has to say about the current state of exponential quantum speedups, you just need to look at the transcript of his talk at the Solvay Physics Conference in Brussels, May 2022. It is in quantum computing‘s top five list on SciRate for the past year and highly informative v. In it, Aaronson also mentions Google’s work from above and the importance of exponential asymptotics for quickly reaching the breakeven time compared to classical algorithms. In some sense it reminds me of his “QML – Read the fine print” paper, because it is very readable and kind of … well, relentless. But you should really read it yourself.

In this light, it is reasonable to ask how far the boundaries of known, super fast quantum algorithms can be pushed. In my next article, I will discuss this in more detail for the case of quantum computing‘s arguably most central algorithm – Shor’s algorithm:

From Shor to the Quantum Hidden Subgroup Problem – A jumpstart


i “Focus beyond quadratic speedups for error-corrected quantum advantage” paper by R. Babbush, J. McClean, M. Newman, C. Gidney, S. Boixo, and H. Neven

ii “Quantum Algorithms for Optimization” Talk by Ronald de Wolf at the Quantum Colloquium of the Simons Institute. He introduces various strategies for fault-tolerant quantum computers. In the following panel discussion, with Edward Farhi present, the case of QAOA is also discussed. BTW, de Wolf also mentions a simpler kind of argument for the topic of this section, the problem with quadratic speedups.

iii “Towards Practical Quantum Advantage” Talk by Matthias Troyer on the Quantum Colloquium of the Simons Institute in which he roughly estimates the overhead for quantum error correction by a constant factor of magnitude 10 to 12.

iv “Applying quantum algorithms to constraint satisfaction problems”, paper by E. Campbell, A. Khurana and A. Montanaro which calculates detailed resource estimates for the fault tolerant computation in comparison to best classical algorithms.

v “How Much Structure Is Needed for Huge Quantum Speedups?” Transcript of Scott Aaronson‘s talk at the Solvay Physics Conference in Brussels, May 2022.

Challenging the “Exponential Quantum Advantage Hypothesis“ for Quantum Chemistry

Ever since Feynman‘s statement “Nature isn‘t classical, dammit“, there has been general agreement in the quantum computing community that fault-tolerant quantum computers will generically provide an exponential quantum advantage in quantum chemistry for problems of practical interest. A well-known quantum chemist from Caltech now challenges this “Exponential Quantum Advantage Hypothesis“ for a most important class of problems – we would be well advised to pay attention to his insights.

Symbolic sketch for Quantum Chemistry

Image by Irene from Pixabay

Quantum computing and the “Exponential Quantum Advantage Hypothesis” (EQA) for chemistry

Quantum computing has several great use cases of tremendous public interest. Yet not for all of those use cases is the predicted benefit well established. But there is one field for which an exponential quantum advantage seemed beyond doubt – the purpose for which quantum computers were proposed in the first place: the simulation of nature at its fundamental level, and its very prominent industrial version, quantum chemistry. OK, we are not sure if this advantage can already be achieved in the current NISQ era. But even if algorithms like the variational eigensolver will not do the job, then surely fault-tolerant quantum computing will.

This is not just my expectation. It is a general agreement which has almost the status of a fact in the quantum computing community. Rock solid and absolutely hype-free. And come on, why not?! We know nature at this level mathematically lives in an exponentially large Hilbert space. We have known since Seth Lloyd’s seminal paper on “Universal Quantum Simulators” that quantum dynamics is of polynomial complexity for quantum computers and thus efficiently solvable. And even for calculating the energy levels of quantum systems, we know a generic algorithm with polynomial complexity: quantum phase estimation.

A Quantum Chemist from Caltech

Recently, I was very surprised to hear a different view on this matter at the Quantum Colloquium of the Simons Institute (organized by Umesh Vazirani, one of quantum computing’s “fathers”). In his talk, Garnet Chan from Caltech discusses the somewhat bold question “Is There Evidence of Exponential Quantum Advantage in Quantum Chemistry?” i. As often in this series, the talk is followed by a panel discussion with other leading scientists ii – but more about this later. And of course there is also a preprint accompanying Chan’s talk iii.

Now, Chan is a respected scientist from a leading university … and he knows both worlds very well: his main domains are theoretical quantum chemistry and quantum many-body physics. But he has also worked on some interesting quantum computing papers, e.g. together with Google’s and IBM’s quantum teams (iv, v). As Vazirani states in his introduction: “I am particularly looking forward to Garnet’s talk … he has been the one who has been most clear in expressing the big picture as well as the subtleties of the problems of simulating quantum chemistry”.

I find it extremely promising that such experts are starting to get really serious about quantum computing. It drives the research in an industrially relevant direction and forces quantum computing to start proving itself in practice. What I also find remarkable regarding his preprint on the EQA hypothesis is its list of co-authors: among them are leading quantum computing scientists like John Preskill himself and Ryan Babbush from Google.

So we would be well advised to pay attention to his insights and his opinion.

Quantum Chemistry: The ground state energy problem

As Chan explains, among the various questions one can ask in quantum chemistry, probably the most central one is the question of the ground state energy of an electronic Hamiltonian. The reason is that one can deduce many things from it:

      • If you calculate the energy for two different configurations of molecules, the difference will be the reaction energy

      • If you do the same for two different phases of matter, you can study thermodynamic stability

      • If you calculate the energy as a function of the nuclear coordinates, you get the potential surface for analyzing catalytic and reaction cascades

      • If you study the energy in terms of externally generated perturbations you can analyze aspects of spectroscopy

The textbook algorithm for calculating this ground state energy is quantum phase estimation (QPE). It efficiently splits any state into a sum of eigenstate/eigenvalue pairs of a Hamiltonian, weighted by the overlap of the trial state with each eigenstate. A measurement on the eigenvalue register collapses the superposition into a single pair. The probability that this is the ground state depends on the mentioned overlap.

In this procedure the exact Hamiltonian is discretized with a “basis set”, a more or less generic set of states / orbitals of size L which relates to the system size. The general expectation is that this is a generic method to calculate the ground state energy of interesting chemical systems with an exponential advantage in L compared to classical methods.

Quantum Advantage: The problem with the initial overlap

Chan points out that the weakness of QPE is certainly not its polynomial circuit size. The problem is the overlap S of the initial trial state with the true, unknown ground state. This quantity controls how often we need to execute the circuit in order to hit the ground state and not some other eigenstate of the system. Thus, the complexity of the quantum algorithm including all necessary repetitions scales like:

poly(1 / S)

(Here I leave out other dependencies as they are irrelevant for this article)

By the way, according to Chan, post-QPE-methods exist, but these show the same behavior.

If we choose some arbitrary quantum state, we should expect an overlap which shrinks as the size of the Hilbert space increases. As this size scales exponentially with the system size L, we are left with an exponentially small overlap. This destroys the efficiency of the QPE strategy.

More specifically: if the initial state is chosen as a product state of L non-interacting subsystems and each subsystem has a local overlap of $ s < 1 $, then the global overlap becomes exponentially small

S = s^L \longrightarrow 0

Thus, in order to achieve efficient, polynomial quantum performance, we need an initial state with good overlap, but which mismatches the ground state in terms of its energy. If we had a match in both respects, this would already be a good approximation of the ground state in the first place and we would not need to bother calculating.
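To get a feeling for how brutally this bites, here is a tiny illustration. The local overlap $ s = 0.9 $ is purely made up; if $ S $ denotes the overlap amplitude, the success probability per QPE run is $ S^2 $, hence roughly $ 1/S^2 $ repetitions – one particular instance of the poly(1/S) scaling above:

```python
# Illustrative only: the global overlap of a product trial state decays as
# s**L, and the expected number of QPE repetitions grows like 1/S**2
# (the probability of collapsing onto the ground state per run is S**2).
s = 0.9  # assumed local overlap per subsystem

for L in (10, 50, 100):
    S = s ** L
    print(f"L={L:3d}  overlap S={S:.2e}  ~repetitions 1/S^2={1 / S ** 2:.2e}")
```

Already at L = 100 subsystems the naive repetition count lands around a billion runs.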

So the big question for Chan is: can we generically construct such initial states for systems where classical methods fail (at estimating the ground state energy efficiently)?

The search for the right initial states in quantum chemistry

According to Chan, there is currently no indication that we can construct such states. The argument from above about initial product states also applies to single Slater determinant states. These are more or less the electron version, or fermionic version, of generic product states which respects the fermionic anti-commutation rules of multi-electron systems (hence the trick with the determinant).

Another candidate is adiabatic state preparation: slowly adding a perturbation to some initial Hamiltonian and its trial ground state. This method seems promising, but according to Chan it relies on heuristics and thus fails the initial requirement of being a generic method.

Now, you can improve any type of state preparation by exploiting known properties of the specific system (like its symmetries) and physical intuition. But as he argues, this also applies to any classical method, leading to improvements on the other side as well.

Chan backs his claims with a few numerical simulations. Specifically, he calculates the overlap of the “best” Slater determinant with a classically estimated or extrapolated ground state for different iron-sulfur (Fe-S) clusters, including the famous FeMo cofactor of nature’s nitrogenase. The qubit count for mapping the basis onto a quantum register ranges from 40 to 154 qubits. In his simulations the overlap indeed drops sharply as this count increases. This does not change significantly when the Slater determinant ansatz is improved.

Evidence for classical hardness for a Quantum Advantage?

As for the other side of the EQA hypothesis, he asks whether there is indeed evidence of exponential hardness for classical methods and gives some counterexamples. The first class of those systems has “single-reference” character, and he gives some reasoning why these are representative of large parts of chemical space (after all, the preprint spans 77 pages). In single-reference systems, a single Slater determinant qualitatively describes the ground state locally, which can be estimated well by classical methods such as coupled cluster.

The other class of systems he discusses is the Heisenberg model. This is a model for quantum magnets but, according to Chan, it also serves as a representative model for correlated materials. He names several efficient classical methods which can address these systems (like tensor networks / PEPS, sign-free quantum Monte Carlo and, to some extent, DMRG).

The Quantum Computing Colloquium of the Simons Institute

If you haven’t done so already, I definitely recommend checking out the YouTube channel of the Quantum Colloquium vi. It started in the middle of the pandemic and is the source of many highly interesting talks by well-known speakers, with a high-class audience, usually followed by a panel discussion with top scientists. I am myself a co-organizer of a Meetup group on quantum computing and I have to admit: if your name is Umesh Vazirani, I guess you have a few other options for how to organize these things.

What I found very remarkable in the discussion afterwards was the amount of agreement among the panelists with Chan’s talk. I have also learned that if you are a computer scientist, even a top scientist, you argue very cautiously when it comes to quantum chemistry. I guess this is just very respectable. The panelists agree with Chan that a solution of the overlap problem in state preparation is completely unknown.

Nathan Wiebe agrees with all the data while coming to different conclusions, because the requirements for the quantum and classical calculations need to be specified further. Indeed, as Chan and co-author Lin Lin also point out, classical heuristics run in polynomial time without necessarily achieving a specific accuracy, while the calculation error in quantum computing can be efficiently controlled.

Wiebe also doubts whether conclusions about the asymptotics can be drawn from small-size examples.

Birgitta Whaley, a chemist and Vazirani’s colleague from Berkeley, points out that multi-reference systems (for which multiple Slater determinants play a role) and conical intersection problems of potential surfaces definitely show exponential hardness for classical methods. Also, for “non-orthogonal reference states” (I guess meaning: reference states expanded in non-orthogonal orbitals), coupled cluster expansions tend to go on forever.

She, and also Isaac Chuang, advocate looking for other problems in quantum chemistry which are better suited to quantum computers. Whaley specifically names spin ordering, which may be formulated as a decision problem (yet not a binary one).

What about Quantum Advantage for quantum computing?

What is the conclusion from all of this? Despite everything, even Chan expects that quantum computers will open the door to answering lots of interesting questions, and he can’t wait to get fault-tolerant quantum computing into his hands. While he questions a generic exponential advantage, he assumes that there will be lots of room for drastic polynomial advantages. After all, every improvement in classical quantum chemistry has been a polynomial one. Among the panelists there are also doubts whether asymptotic complexity is the right tool to study quantum advantage.

In the end, I find this reassuring. Indeed, even for the ground state energy problem, there have been detailed resource estimations for the related quantum circuits in the past which calculate a quantum advantage – in wall-clock time. One example is the famous FeMo cofactor of nitrogenase (BTW: the initial paper and some further work was co-authored by panelist Nathan Wiebe) vii. Another nice example is a recent thorough investigation by Google Quantum AI and Boehringer Ingelheim of the class of cytochrome P450 enzymes viii.

So I think, there is no reason to doubt quantum advantage – especially for quantum chemistry. But perhaps we should be more careful about statements regarding exponential speed ups – even for quantum chemistry.


i “Is There Evidence of Exponential Quantum Advantage in Quantum Chemistry?“ talk by Garnet Chan / Caltech on the Quantum Colloquium of the Simons Institute.

ii Panel discussion to Garnet Chan‘s talk with various leading scientists.

iii “Is there evidence for exponential quantum advantage in quantum chemistry?“ Preprint by Garnet Chan et al.

iv “Simulating challenging correlated molecules and materials on the Sycamore quantum processor“ paper by Chan and Google AI

v “Quantum algorithms for quantum chemistry and quantum materials science“ paper by Chan and IBM‘s quantum group

vi YouTube-channel of the Quantum Colloquium / Simons Institute

vii“Elucidating Reaction Mechanisms on Quantum Computers”, the seminal FeMoCo-paper by ETH Zürich and Microsoft which has started extensive research. A more recent advance regarding the molecule was published by Google in“Even more efficient quantum computations of chemistry through tensor hypercontraction”

viii“Reliably assessing the electronic structure of cytochrome P450 on today’s classical computers and tomorrow’s quantum computers” paper by Google Quantum AI and Boehringer Ingelheim

How to visualize quantum entanglement?

In case you want to explain some central aspects of quantum information theory to curious people (for example, aspects of the recent Nobel Prize in physics), some visualizations in my free online book might help you out.

Quantum entanglement visualized by a simple arrow in 4d

In 2018 I launched my “free online book for curious people”.

I think one of the main challenges in explaining quantum computing to laymen is visualizing central aspects of quantum theory. Of course, the Bloch sphere describes a qubit perfectly. But in my opinion it needs too much further explanation to be applicable for this purpose. Plus, the Bloch sphere cannot visualize multi-qubit states.

So instead of the Bloch sphere, I have decided to use simple arrow diagrams for my online book. On the one hand these “brutally” downgrade the complex Hilbert space to ordinary real Euclidean space, but on the other hand I think they have some nice advantages, like explaining

      • Hilbert space as a vector space
      • Amplitudes and probabilities
      • Unitaries as rotations
      • Provide intuition for multi-qubit states

A qubit as a simple arrow in a plane

A qubit as a uniform superposition of the states 0 and 1 in the simplified visualization. The amplitudes of the two states are the red dashed lines at each axis. A measurement „collapses“ / rotates the arrow onto one of the axes. The amplitudes / red dashed lines are indicators of the probability that this will happen.

In this picture an H-gate is a simple rotation by 45°:

H gate rotates a qubit by 45 degrees in this plane.

An H-gate transforms the state 0 into the uniform superposition of 0 and 1.

And the X-gate is a simple reflection on just this uniform superposition:

X gate reflects a state on the uniform superposition in the plane

An X-gate is a reflection on the uniform superposition.

This, by the way, also demonstrates that H maps the computational basis to eigenstates of the X-gate with eigenvalues +1 and -1.
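If you want to check this little fact numerically, a few lines of NumPy with the standard matrix representations suffice:

```python
import numpy as np

# Standard matrix representations of the H- and X-gate
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])

plus = H @ np.array([1.0, 0.0])   # H|0> = uniform superposition |+>
minus = H @ np.array([0.0, 1.0])  # H|1> = |->

assert np.allclose(X @ plus, plus)      # |+> is an X-eigenstate, eigenvalue +1
assert np.allclose(X @ minus, -minus)   # |-> is an X-eigenstate, eigenvalue -1

# H is a reflection in the plane: applying it twice gives the identity
assert np.allclose(H @ H, np.eye(2))
```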

Once you have established this simplified picture you can visualize the measurement in different bases (like for the uncertainty relation):
Measurement in different bases

Uncertainty relation: measuring the qubit in the property / observable „gender“ requires a totally different perspective / basis than measuring the property / observable „age“. Measuring one property for sure collapses the arrow onto the corresponding axis and makes the other property completely unknown.

What I find especially handy: In the simplified picture you can even visualize entanglement – sort of. OK, we would actually need 4 dimensions for this, but we can give it a try on a flat screen using some perspective tricks:

Quantum entanglement visualized by a simple arrow in 4d

Entanglement: Measuring the 2-qubit state on the brown qubit B collapses the blue arrow onto the green dashed arrows of qubit A. If we measure 0 for qubit B, we get the green dashed arrow in the left 2d-plane, which mainly points in the green 0 direction. In contrast, measuring 1 for qubit B, we get the green dashed arrow in the bottom 2d-plane, which mainly points in the green 1 direction.


The picture above shows an entangled state which is not maximally entangled (unlike Bell pairs). I find this example more appealing, as it shows that a measurement on qubit B leaves an uncollapsed quantum state / arrow for the other qubit, which in general is still a superposition of 0 and 1 (again, unlike Bell pairs).
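This behavior is easy to reproduce numerically. The amplitudes below are made up for illustration; measuring qubit B collapses qubit A onto a renormalized, still superposed arrow:

```python
import numpy as np

# A non-maximally entangled 2-qubit state (illustrative amplitudes):
# rows index qubit A = 0/1, columns index qubit B = 0/1
psi = np.array([[0.8, 0.2],
                [0.3, np.sqrt(1 - 0.8**2 - 0.2**2 - 0.3**2)]])
assert np.isclose((np.abs(psi) ** 2).sum(), 1.0)   # normalized

for b in (0, 1):
    p_b = (np.abs(psi[:, b]) ** 2).sum()      # probability of outcome b on qubit B
    post_a = psi[:, b] / np.sqrt(p_b)         # qubit A's arrow after the collapse
    print(f"B={b}: p={p_b:.2f}, qubit A state = {post_a.round(3)}")
```

Measuring B = 0 leaves qubit A mainly pointing toward 0, measuring B = 1 mainly toward 1 – and in both cases A remains a genuine superposition.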

Is Google’s quantum supremacy milestone now really obsolete?

In August 2022, news spread through the media that a team of researchers from the Chinese Academy of Sciences had managed to implement Google‘s quantum supremacy task on a classical supercomputer. Given that Google’s 2019 milestone plays a prominent role for the quantum computing community and also for the communication with the public, I think the community needs to deal with the details of this new paper.


The Sycamore chip

The Sycamore chip (credits Google, CC BY 3.0, via Wikimedia Commons)

Google‘s quantum supremacy milestone

You know the story: in October 2019 Google published a seminal milestone in quantum computing. In a paper which Google‘s Quantum AI team managed to keep secret almost until the end, they announced their new quantum computer Sycamore with 53 qubits i. But most importantly, they also described a quantum algorithm that outperformed even the largest classical supercomputer: 200 seconds versus 10,000 years.

Despite some criticism, the work has been regarded as one of the major milestones of quantum computing in recent history. Over time, criticism grew louder, and in August 2022 news spread through the media that a team of researchers from the Chinese Academy of Sciences had finally managed to equal the performance on the same task using a classical supercomputer. BTW, the preprint dates back to November of the previous year.

In October 2019 I myself wrote a pretty enthusiastic article about quantum supremacy in my German online book, and now I have looked into the paper „Solving the sampling problem of the Sycamore quantum circuits“ by Pan, Chen and Zhang ii.

Quantum Random Circuit Sampling

The idea of Google‘s supremacy experiment is simple and pretty brilliant. Once a quantum circuit is complex enough, no classical supercomputer should be able to simulate it, due to the exponentially large Hilbert space of the resulting state. Google achieved just this by randomly dicing out a 53-qubit circuit from a set of one- and two-qubit quantum gates. The two-qubit gates are iSWAP-like gates, which are especially hard to simulate for a classical computer at a large scale. The complexity of the circuit grows exponentially with circuit depth, and at 20 cycles Google hit the supremacy regime. The result of the algorithm is a distribution of random-looking 53-bit strings, like a „speckled intensity pattern produced by light interference in laser scatter“, as the paper states.

A very central feature of the paper is the answer to the following question: how do you actually prove that the quantum computer did what it was supposed to do and didn‘t just produce random-looking trash?

For one thing, they reproduced the results of their quantum hardware for less complex circuits of the same style on classical hardware; this is also how Google‘s team got the 10,000-year estimate. But most of all, they used a reliable benchmark tool, which they had already introduced in 2017 iii, to measure the correctness of the resulting distribution …

The key point: Google‘s cross entropy benchmark

The basic idea of the cross-entropy benchmarking fidelity is very clever: if we measure the quantum circuit, we will most likely measure a bit string with a large amplitude in the resulting quantum state of the circuit. If we now calculate the ideal amplitude of just this bit string on a classical computer, we should get a value which is larger than average. Now we collect a large number of measurement results and their bit strings. If we sum up all their calculated ideal probabilities, we should definitely get a value that is „suspiciously large“ (as Scott Aaronson calls it).

BTW 1: To calculate „a few“ amplitudes of the ideal circuit is nothing compared with calculating the full exponentially large state vector as you will see later in this text.

BTW 2: Could you „spoof“ the result of the cross entropy fidelity? Let‘s say, we don‘t measure a quantum circuit but just select a series of random bit strings. What would the fidelity look like?

In detail the fidelity is defined as

F_{XEB} = \tfrac{2^n}{k} \Big( \sum^k_{i=1} |\langle 0^n | C | x_i \rangle|^2 \Big) - 1

Here, $ n $ is the number of qubits, $ k $ is the number of samples, $ C $ is the quantum circuit and $ x_i $ is the i‘th sampled bit string.

Now, $ \tfrac{1}{k} \big( \sum … \big) $ is like a Monte Carlo integration, and if the $ x_i $ were drawn from the uniform distribution, this integration would converge to the uniform average over the full computational basis, giving $ \tfrac{1}{2^n} \lVert C^\dagger \lvert 0^n \rangle \rVert^2 = \tfrac{1}{2^n} $, as $ C $ is unitary. So for randomly guessed $ x_i $ we have

F_{XEB} = 0

Question: But could you improve this by guessing better?

Answer: This is exactly what Pan, Chen and Zhang did!

An important fact about Google‘s result is: as Sycamore is a NISQ device, it produces imperfect results and the measured distribution is not the true distribution. A perfect quantum device would actually achieve a fidelity of 1 (the argument for this is much more subtle iv). But Google scored a fidelity of

F_{XEB} = 0.002

So to rival Google‘s result, you do not need a perfect result either.
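Both limits of the benchmark are easy to reproduce on a toy example. The sketch below replaces Sycamore’s circuit by a small Haar-random unitary (my own stand-in, not Google’s setup); it also illustrates the „better guessing“ idea: cherry-picking the best of a few random candidate strings already scores far above 0:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 20000
dim = 2 ** n

# Toy stand-in for the circuit C: a Haar-random unitary via QR decomposition
z = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
q, r = np.linalg.qr(z)
u = q * (np.diagonal(r) / np.abs(np.diagonal(r)))   # fix phases for Haar measure

probs = np.abs(u[:, 0]) ** 2     # ideal output probabilities p(x) = |<x|C|0^n>|^2
probs /= probs.sum()             # guard against floating-point drift

def f_xeb(samples):
    # F_XEB = (2^n / k) * sum_i p(x_i) - 1
    return dim * probs[samples].mean() - 1

ideal = rng.choice(dim, size=k, p=probs)    # sampling like a perfect device
guess = rng.integers(dim, size=k)           # uniform random guessing
cand = rng.integers(dim, size=(k, 32))      # "better guessing": best of 32 candidates
better = cand[np.arange(k), probs[cand].argmax(axis=1)]

print(f_xeb(ideal), f_xeb(guess), f_xeb(better))
# ideal ≈ 1, uniform guessing ≈ 0, best-of-32 well above 1
```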

Interpretations of „Solving the sampling problem“

The 10,000-year estimate by Google was also caused by the type of simulation they used. As there is no way to store a state of a $ 2^{53} \approx 10^{16} $-dimensional Hilbert space in RAM, they used a hybrid Schrödinger-Feynman algorithm that breaks the circuit into patches and recombines them via path integrals at the end. This is memory-efficient but becomes exponentially more computationally expensive than a pure Schrödinger simulation. But Google‘s team also mentions in the paper: „We expect that lower simulation costs than reported here will eventually be achieved, but we also expect they will be consistently outpaced by hardware improvements on larger quantum processors.“

Just a few days after the release, IBM’s quantum team proposed exactly the former: an alternative approach that aimed at using the massive hard drive of the Summit supercomputer instead of RAM. They estimated a runtime of 2 days. Now, this approach still consumes exponential resources, and probably for practical reasons they never executed the algorithm. So trying to solve the problem by executing a full amplitude simulation and sampling from it seems unrealistic, as the Chinese team now also emphasizes.

We all know a quantum computer itself does just this: a full amplitude generation. But of course we are never able to see the full amplitude result; all we see are repeated measurement results. So in this sense: if you are able to somehow generate a series of bit strings from a given quantum circuit which scores at least 0.002 in the cross-entropy benchmark, what should this be called? Is this „solving the sampling problem“ or is this just „spoofing the Sycamore test“? I think both terms are fair.

The classical sampling algorithm by Pan, Chen and Zhang

So a full amplitude simulation won‘t work. If we instead did a single-amplitude simulation of one bit string, this would vastly reduce the computational cost. Just think of a simple example like $ \langle 0^n | H^{\otimes n} | 1^n \rangle $: you never need to write down the full state vector, you just focus / project on the given bit strings. Pan, Chen and Zhang treat the supremacy circuit as a gigantic three-dimensional tensor network with certain boundary conditions for the initial and the final state. To request a single amplitude for a certain computational basis state / bit string, both boundaries are fixed. The case in their work is a little different:

They choose their 53-bit strings the following way:

      1. 6 bits are variable
      2. For the remaining 47 bits, they generate $ 2^{20} \approx 10^6 $ (about 1 million) randomly chosen strings

Altogether this covers almost $ 10^8 $ of all possible $ 10^{16} $ bit strings; i.e., the latter are about $ 10^8 $ times more numerous. It is remarkable that this proved to be enough samples. They call the collection the „sparse state“. For each of the 47-bit strings, they finally kept, among the completions of the 6 variable bits, only the most significant ones, i.e. the ones that generate the highest scores on the cross-entropy fidelity. This leaves about 1 million uncorrelated bit strings, which is comparable with the figures of Google‘s experiment.

In a sense this is similar to what Ewin Tang does in her quantum inspired algorithms: Sample first and then calculate (quantum algorithms do the opposite).
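Back to the single-amplitude idea from above: the cost gap is visible in the simplest possible case. For a product circuit like $ H^{\otimes n} $ the amplitude factorizes into $ n $ tiny 2×2 contractions (a sketch of the general idea, not of the actual contraction scheme in the paper):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
n = 53

# <0^n| H^{(x)n} |1^n> factorizes into n independent 2x2 matrix elements:
# cost O(n) instead of the O(2^n) of a full state-vector simulation.
amp = 1.0
for _ in range(n):
    amp *= H[0, 1]   # <0|H|1> = 1/sqrt(2)

assert np.isclose(amp, 2 ** (-n / 2))
print(amp)   # ≈ 1.05e-08
```

For entangled circuits like Sycamore’s the contraction no longer factorizes qubit by qubit, which is exactly where the hard tensor-network work of Pan, Chen and Zhang begins.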

There is another trick the researchers use to simplify the calculation: they drill K holes into the tensor network, which makes the computation far easier but decreases the fidelity by a factor of $ 2^{-K} $. I think this is also a fair alteration of the sampling problem, since Sycamore also introduces errors – yet unwillingly, of course.

The remaining problem is still highly difficult. The team manages to calculate all $ 10^8 $ amplitudes with a single tensor network contraction. The basic purpose of the sparse state is this: multiply each gate / contract each subtensor starting from the bottom right of the circuit (qubit no. 53 of the final state) and then iteratively work your way up and toward the initial state. Each iteration step projects the subtensor onto only those product states / bit combinations that are included in the sparse state. For instance, if we contract a two-qubit gate for qubits no. 53 and 52, and in the sparse state only the combinations 01 and 10 exist for bits no. 53 and 52, then the remaining two combinations are dropped from the computation. This decreases the complexity of the sparse-state calculation, compared to a full amplitude simulation, roughly by the ratio of the full to the sparse state size, i.e. by about eight orders of magnitude.

All in all, the Chinese team arrives at a fidelity

F_{XEB} = 0.0037

in about 15 hours using 512 GPUs. They argue that using even more advanced hardware would finally beat Google‘s 200 seconds.

The future of quantum supremacy experiments

The paper by Pan, Chen and Zhang finally dissolves the 10,000-year estimate of Google’s work for good. You could argue that there remains a quantum advantage regarding computing cycles or carbon footprint (as Scott Aaronson notes).

In my opinion Google's result is still a milestone, maybe a less spectacular one by now. The current development was more or less expected by the team and the community. Also, we should get used to the fact that newly detected quantum advantages might be caught up by some improvement on the classical side.

But what about the other part of the Google-quote from above “…but we also expect they [the classical simulations] will be consistently outpaced by hardware improvements on larger quantum processors.”?

The quantum algorithm could be improved in various ways by hardware advances. Most of all: better gate fidelity, which would allow deeper circuits and better cross entropy fidelity scores. At some point, enhanced random guessing and drilling holes would probably not be good enough on the classical side anymore. Of course, as always in quantum computing, more qubits would also work pretty well.

Might there be some news in preparation? Alongside a preprint about toric code implementations, Google Quantum AI recently mentioned, in passing, a planar 72-qubit quantum processor, Sycamore's big brother v. I very much wonder if the team is still on the quantum supremacy track …

Yet there is an intrinsic problem with random quantum circuit sampling. At some point the heart of all the arguments, the cross entropy benchmark, will hit the quantum supremacy regime itself, making it impossible to verify the measured distribution on classical hardware. So at some point conceptually new quantum supremacy experiments will probably be needed. Now, this is related to a much more interesting question:

At what tasks will we be able to observe exponential quantum advantages in the near future?

I will deal with this question in a future post. So stay tuned … 🙂


i "Quantum supremacy using a programmable superconducting processor" by Google Quantum AI

ii "Solving the sampling problem of the Sycamore quantum circuits" by Pan, Chen and Zhang from the Chinese Academy of Sciences

iii "Characterizing Quantum Supremacy in Near-Term Devices" by Google Quantum AI

iv "Mathematical aspects of Google's quantum supremacy experiment" by Ion Nechita. Since I could not come up with the explanation myself why $ F_{XEB} = 1 $ for ideal circuits, this is the best / easiest explanation I have found: For more and more samples N the distribution converges to the "Porter-Thomas distribution" $ N e^{-Nx} $, which leads to an easy integral and the result from above. If you have a simpler explanation feel free to tell me.

v "Suppressing quantum errors by scaling a surface code logical qubit" by Google Quantum AI
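Since footnote iv left me wanting to spell it out, here is the short calculation, a sketch of my own under the stated Porter-Thomas assumption: With $ N = 2^n $ and the ideal probabilities $ p $ distributed with density $ N e^{-Np} $, sampling from an ideal device weights each bit string by its own probability, so the expected sampled probability is

\langle p \rangle_{\text{samples}} = \sum_x p_x^2 \approx N \int_0^\infty p^2 \, N e^{-Np} \, dp = \frac{2}{N}

and therefore $ F_{XEB} = N \langle p \rangle_{\text{samples}} - 1 = 2 - 1 = 1 $ for an ideal circuit.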

Welcome to my blog!

First of all: A warm welcome to everybody who visits my new website. Before I start posting, I thought it would be appropriate to introduce myself and share a few words about the intention of this blog and what kind of topics you will find in here.


I have been a senior consultant for professional computing for large companies since the year 2000 and I have run my own business for many years now. Yet in my heart I have also remained a theoretical physicist all along. Since the very first news about D-Wave's quantum annealer I have been following the dawn of the quantum computing era closely. What really hit me was IBM's launch of the IBM Quantum Experience. This was the point when I started my deep dive into the subject, which lasts until now. Probably I am among the very first wave of enthusiasts who are educating themselves all the way through YouTube lectures, Mike & Ike and without any tutor or mentor from academia. In this sense, I am speaking and writing from the perspective of a new group in the quantum computing community which is (hopefully) growing fast.

I have been writing about quantum computing since 2018. My "free online book for curious people" has gained quite some attention in my target group and I am happy about the numerous pieces of positive feedback I have received about it. There are three central aspects that guided my work on the book: 1. It is written in German, 2. It addresses a broad audience with little or possibly no background in quantum computing and 3. It gives an overview (a complete overview, as I hope) of the state and the applications of quantum computers.

Yet over the years I kept stumbling upon quantum topics that I found worth writing about but which just would not fit into the scope of the book. Most of all because the topics were far too advanced for a general audience and too specific for an overview perspective. This is where this blog comes in. I am looking for true speed-ups and advances in the field, interesting events and discussions. I presume that NISQ might end up being only "a niche" (to speak with Ryan Babbush from Google AI back at Q2B 2019). But nevertheless, I am eagerly on the search to be proven otherwise.

I am also starting to write in English. I am very much aware that this is not my native language and I apologize up front for the numerous errors to come. But I think that, while I am narrowing my target audience on one side, I should widen it on the other. Most of all: As this blog is also supposed to target the quantum computing community, I should start to write in the language of this community.

If you want to get in touch, feel free to reach me via the contact page or LinkedIn!