Machine Learning for Decomposing Arbitrary Unitary Matrices of Any Size

Solving The IBM Quantum Challenge

In this post I’ll try to introduced the IBM Quantum Challenge and how I managed to finish it. It’s better if you have a machine learning background. otherwise, the quantum computing part can be addressed using the resources inside the post

On May 4, 2016, IBM has introduced the first programmable quantum computer on cloud. That enabled everyone on earth to interact with a new computational paradigm, and opened the door for new possibilities and opportunities. That announcement was exciting and groundbreaking for many related research fields to explore what we can achieve through this new technology.

Four years later, IBM has 18 quantum systems ready on the cloud serving more than 200,000 users. It managed to build a Quantum Network of over 100 clients and partners. Celebrating these achievements and in the memory of its first quantum computing announcement, on April 27, 2020, IBM has announced a new Quantum Challenge. The challenge started on May 4, 2020 and ended on May 8, 2020. If you want to know more about quantum computing please use the Qiskit textbook

The quantum challenge consisted of 4 exercises . The difficulty of the exercises increases gradually till the last one. The first challenge gives you the chance to explore the Qiskit package, to know what quantum computing means, and to build simple circuits with some intuition behind quantum gates’ identities.


Starting from the second task things starts to get a bit difficult. This task tests your understanding of what kind of noise sources you should expect from a real quantum device and how to overcome one of these sources. The challenge is based on error mitigation and how to retrieve the noise free measurements from a real device. Error mitigation is an active area of research and there are many ways to achieve it. For further information please click on the underlined word to watch Mr. Abraham Asfaw explaining this concept and you can find the code of this episode and other episodes by him here.

The Third task is a great learning opportunity to start learning about quantum communication and encryption. It presents one of the first proposed quantum communication protocols the BB48 [1]. The workload in this task is pure software programming if and only if you understood the basics of quantum circuits and quantum logic gates from the first task.

Finally the fourth challenge, The GRAND FINALE. This challenge requires us to have a solid understanding of linear algebra and unitary matrices. This challenge starts with giving you a unitary matrix with shape 16\times 16. The task simply requires you to decompose this matrix into two unitary gates using only CNOTs and U_3 gates because they are universal in the sense that you can build any quantum circuit from using only these two gates.

The CNOT can be represented as:
CNOT= \begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ \end{bmatrix}

and the U_3 gate can be represented as:
\large U_3(\theta,\phi,\lambda) = \begin{bmatrix} \cos(\frac{\theta}{2}) & -e^{i\lambda}\sin(\frac{\theta}{2})\\ e^{i\phi}\sin(\frac{\theta}{2}) & e^{i\lambda+i\phi}\cos(\frac{\theta}{2}) \end{bmatrix}

The restrictions here, besides the limited set of gates that we are supposed to use, are the following:

  • The norm of the difference between the decomposed unitary that we should find and the original one should be less than \large \leq 0.01 as described in this equation \large \left |\right |U-V\left | \right | where U is the original unitary and V is the decomposed one.
  • The cost value, which is calculated as \large C = 10\times c_x + u_3, should be less than 1600. This means that the number of CNOTs multiplied by 10 plus the number of U_3 gates must be below 1600.

In Qiskit there are several functions that can decompose single and 2-qubit unitary gates for us but unitary gates of more than 2-qubit are harder to decompose and it’s really a cumbersome problem. There is something called Isometries [2] where we can decompose any unitary circuit using CNOTs and single-qubit gates.

First Trial

My first approach to solve this problem – and i was a bit naive to think it will end that easy – was to use the Transpiler method and the iso method and that’s it. The following code shows my first attempt:

from qiskit import QuantumCircuit
from qiskit import *
from may4_challenge.ex4 import check_circuit, submit_circuit
from qiskit.compiler import transpile
from may4_challenge.ex4 import get_unitary
import numpy as np

U = get_unitary()

q = QuantumRegister(4)
qc = QuantumCircuit(q)

qc = transpile(qc, optimization_level=3,basis_gates=['u3','cx'],seed_transpiler=5)

The output from this block of code was the following:

Circuit stats:
||U-V||_2 = 4.64378762993992 \times 10^{-14}
(U is the reference unitary, V is yours, and the global phase has been removed from both of them).
Cost is 1676. Something is not right with your circuit: the cost of the circuit is too high (above 1600)

Let’s first break the code. I created a quantum circuit with 4 qubits. Then I used the isometry method to decompose that unitary. Finally I used the transpiler function to map the whole circuit into using CNOTs and U_3. I tried to set the optimization level to 3 which is the highest value. As you can see from the output the cost is high since the number of gates are (‘u3’, 186), (‘cx’, 149). Probably, the most hated number in those four days of the challenge was the 1676.

My second Trial

The challenge was full of funny stories with each task. for example the number sequence that you should submit for the first task is [104, 52, 100, 97, 109, 65, 114, 100]. If you convert these numbers to their ASCII representation you will get the word Hadamard. It’s a very famous quantum gate that is used in nearly every algorithm.

The third answer from task number 3 was a website URL that will direct you to a reddit account. In that Account you would see a post that says the answer of the first task is maybe what you are looking for to finish the fourth task.

At that exact moment i was clueless and i didn’t know what to do. I also was a bit frustrated. All I could see was that it’s a symmetric matrix and there is a repeating pattern inside of it but with no clue to use the Hadamard in this task. Then, I remembered that I do machine learning for living. So, I said to myself why not convert this whole thing into an optimization problem.

There are two ways to use machine learning for this; either in a reinforcement way or supervised way. Since I have a life and It’s also Ramadan and I’m fasting most of the day, I decided to keep it simple and go with the supervised approach. I knew I had to come up with a good variational circuit that can mimic the original unitary.

I decided to use the proposed architecture in Circuit-centric quantum classifiers paper [3]. The figure below shows a single layer of what I implemented. I used this method for another tutorial.

This circuit consists of only U_3 gates and CU_3 gates so that it can be very flexible to simulate the original Unitary and use lesser number of CNOTs as possible when the transpiler function is invoked.

I repeated this layer 7 times and I let the parameters to be updates with each optimization step. I used the Unitary simulator from Qiskit to extract the unitary matrix from the quantum circuit and my cost function was simply:

np.linalg.norm(original_U - unitary,ord=2)

With a 16\times16 matrix elements the optimization process was a bit tedious since the cost funtion was the \lVert U - V\rVert_2. This means that i’m going to optimize many parameters based on something like the mean squared error function and that’s not good at all for convergence using a gradient based method like the gradient descent.

I decided to use the Sequential Least Squares Programming (SLSQP) as the optimizer from Qiskit Aqua since it’s fast and it avoided local minima valleys in our problem. Weight initialization was very small so that the optimizer would move freely to the lowest possible point and also to make the weights as sparse as possible so that we can threshold them. Just remember this “threshold” thing

The initial result was very promising and I got a cost of 600 and the difference was less the 10^{-7}. That actually cheered me up. So, I decided to think more about the hadamard thing. I read chapter 4 from the great book by Isaac Chuang and Michael Nielsen “Quantum Computation and Quantum Information”. I learnt that diagonalizing a unitary matrix can be very beneficial to reduce it into more basic unitary gates.

Third Trial

Of course with the help of the notes on the slack channel of the challenge and and thanks to my professor Zoltán Zimborás, instead of only multiplying the hadamard gate by the original unitary which reduced the cost efficiently, I sandwiched the U with two hadamards. That reduced the cost to 148.

I was happy about that and I decided to play around with machine learning to reduce a bit further. I decided to change the places of the 2-qubit gates i.e. the Controlled Rotations CU_3 and start to decrease the number of layers. My second trail has 147 parameters but the new architecture that I used had 90 parameters. It’s a huge reduction and made the optimization faster. The number of gates is (‘u3’, 17), (‘cu3’, 13). I then used the transpiler function and it gave me the following number of gates (‘u3’, 31), (‘cx’, 17). The final result was:

Circuit stats:
||U-V||_2 = 3.7789372338197153\times10^{-7}
(U is the reference unitary, V is yours, and the global phase has been removed from both of them).
Cost is 201
Great! Your circuit meets all the constrains.
Your score is 201. The lower, the better!
Feel free to submit your answer and remember you can re-submit a new circuit at any time!

I was happy and sad at the same time because i wanted to beat the 148 I got from the hadamard sandwich. Then I decided to do a trick to check for very low weights in the optimized circuit. Do you remember the threshold thing.

I clipped the absolute value of the parameters that are less than 0.01 to be zero. That reduced the number of gates to (‘u3’, 22), (‘cx’, 11) and the output from the checking function was:

Circuit stats:
||U-V||_2 = 0.006499012124886762
(U is the reference unitary, V is yours, and the global phase has been removed from both of them).
Cost is 132
Great! Your circuit meets all the constrains.
Your score is 132. The lower, the better!
Feel free to submit your answer and remember you can re-submit a new circuit at any time!

At that particular moment I felt incredibly happy.

Software and Code

In this repo, you shall find a readme file that tells you how to setup your environment for the challenge and you will find a notebook that has all of my work. Besides that there is a bash file that you can use if you are using Ubuntu to automate everything for you.

Important Mentions & Notes

I’d like to praise one of the participants, Lukasz Cincio, a staff scientist at LANL. He used a different approach that he published [3] a while ago. He got a cost of 45! The method is also based on machine learning. A huge a applause for him on that.

I’d like to thank Elisa Bäumer who made this wonderful task that kept me scratching my head for a very long time. But I had fun and will never forget the experience. Tomorrow at 6 PM Egypt time (GMT+2), there will be a live session on Youtube hosted by Qiskit Channel. The session will be held by her and the amazing Abe Asfaw who is the Global Lead of Quantum Education at IBM Quantum.


In the end I hope that you can benefit from this challenge and try it yourself as the repo and tools are still available. I presented to you my point of view and how I solved the 4th task using machine learning. I learnt a lot about quantum circuit decompositions and I hope you shall learn that too.

This post is my made by Kareem H. El-Safty. I’m a AI engineer in DevisionX, a Research Assistant in Wigner Research Centre for Physics and a member at the Alexandria Quantum Computing Group. Please feel free to share your thoughts with me at any time.


  1. Quantum cryptography:Public key distribution and coin tossing. DOI: Online version:
  2. Quantum Circuits for Isometries. DOI: Online version:
  3. Learning the quantum algorithm for state overlap. DOI: 10.1088/1367-2630/aae94a. Online version: