Order Finding
First, let's briefly state the classical order finding problem. We are given two positive integers, $x$ and $N$, which are coprime (meaning $\gcd(x, N) = 1$). We want to find the order of $x$ modulo $N$. The order is defined as the smallest positive integer $r$ such that:
$$x^r \equiv 1 \pmod{N}$$
Remember, the quantum phase estimation algorithm takes as input a unitary operator $U$ and an eigenvector $|u\rangle$, and outputs an estimate for the phase of the corresponding eigenvalue $\theta$, where the eigenvalue $\lambda = e^{2\pi i\theta}$ with $0 \leq \theta \leq 1$.
Hence, if we are going to use the quantum phase estimation algorithm to find $r$, we need to invent a unitary operator $U$ that encodes this modular arithmetic, and we need to figure out its eigenvalues and eigenvectors.
To kick things off, let's figure out how this operator should behave.
$$U|y\rangle = |\text{?}\rangle$$
We define our unitary operator such that:
$$U|y\rangle = |xy \pmod{N}\rangle$$
(Note: to keep it strictly unitary, we usually say it does this for $0 \leq y < N$, and acts as the identity for $y \ge N$, but we only really care about the $0$ to $N-1$ subspace).
Now, let's connect this back to what we already know. The quantum phase estimation algorithm is a machine that takes a unitary operator $U$ and one of its eigenvectors $|u\rangle$, and spits out the phase $\theta$ of its corresponding eigenvalue $e^{2\pi i \theta}$.
If we want to use QPE to find the order $r$, our next mission is to figure out the eigenvalues and eigenvectors of this specific $U$.
Let's start with the eigenvalues. We established earlier that applying $U$ exactly $r$ times brings a state back to itself. In other words, on this relevant subspace:
$$U^r = I$$
$\lambda^r = e^{2\pi i \theta r} = 1$ is exactly the equation we need to solve.
Make sure you see why $\theta=\frac{s}{r}$ and $\lambda_s = e^{2\pi i \frac{s}{r}}$.
Now we have our eigenvalues. If we can feed an eigenvector corresponding to one of these eigenvalues into the quantum phase estimation algorithm, it will spit out $\frac{s}{r}$. (And once we have the fraction $\frac{s}{r}$, we can use some classical number theory to extract $r$, which is our ultimate goal).
But to do that, we need to know what these eigenvectors $|u_s\rangle$ actually look like.
We know the operator $U$ just steps us through the cycle: $U|1\rangle = |x\rangle$, $U|x\rangle = |x^2\rangle$, ..., $U|x^{r-1}\rangle = |1\rangle$ (all modulo $N$).
We want to construct an eigenvector $|u_s\rangle$ such that applying $U$ just pulls out a global phase of $e^{2\pi i \frac{s}{r}}$, meaning $U|u_s\rangle = e^{2\pi i \frac{s}{r}} |u_s\rangle$.
Make sure you see why $U|u_s\rangle=\alpha_0 U|x^0\rangle+\alpha_1 U|x^1\rangle+\dots+\alpha_{r-1} U|x^{r-1}\rangle = \alpha_{r-1} |x^0\rangle+\alpha_0|x^1\rangle+\dots+\alpha_{r-2}|x^{r-1}\rangle$ and note that we also want $U|u_s\rangle=e^{2\pi is/r}|u_s\rangle$.
Hence, we choose coefficients such that $\alpha_{r-1}=e^{2\pi is/r} \alpha_0$, $\alpha_0=e^{2\pi is/r} \alpha_1$, $\alpha_1=e^{2\pi is/r} \alpha_2$ etc.
Make sure you see why the general term is $\alpha_k=e^{-2\pi i s k / r}$.
So, we can write the eigenvector beautifully as:
$$|u_s\rangle = \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} e^{-2\pi i s k / r} |x^k \pmod{N}\rangle$$
We now have our unitary $U$ and its eigenvectors $|u_s\rangle$. If we could feed just one of these $|u_s\rangle$ states into the second register of a quantum phase estimation circuit, the top register would measure the phase $\frac{s}{r}$. From there, we could easily classical-post-process our way to finding $r$.
But we have hit a massive roadblock: To construct the state $|u_s\rangle$ to plug into the circuit, we need to know $r$ in advance. But finding $r$ is the entire reason we are running the algorithm in the first place!
We cannot prepare $|u_s\rangle$. We need a workaround.
$$\sum_{s=0}^{r-1} |u_s\rangle$$