# Binary Exponentiation by Factoring

Consider a problem of computing $$ax^y \pmod{2^d}$$, given integers $$a$$, $$x$$, $$y$$ and $$d \geq 3$$, where $$x$$ is odd.

The algorithm below allows to solve this problem with $$O(d)$$ additions and binary operations and a single multiplication by $$y$$.

Due to the structure of the multiplicative group modulo $$2^d$$, any number $$x$$ such that $$x \equiv 1 \pmod 4$$ can be represented as

$x \equiv b^{L(x)} \pmod{2^d},$

where $$b \equiv 5 \pmod 8$$. Without loss of generality we assume that $$x \equiv 1 \pmod 4$$, as we can reduce $$x \equiv 3 \pmod 4$$ to $$x \equiv 1 \pmod 4$$ by substituting $$x \mapsto -x$$ and $$a \mapsto (-1)^{y} a$$. In this notion, $$ax^y$$ is represented as

$a x^y \equiv a b^{yL(x)} \pmod{2^d}.$

The core idea of the algorithm is to simplify the computation of $$L(x)$$ and $$b^{y L(x)}$$ using the fact that we're working modulo $$2^d$$. For reasons that will be apparent later on, we'll be working with $$4L(x)$$ rather than $$L(x)$$, but taken modulo $$2^d$$ instead of $$2^{d-2}$$.

In this article, we will cover the implementation for $$32$$-bit integers. Let

• mbin_log_32(r, x) be a function that computes $$r+4L(x) \pmod{2^d}$$;
• mbin_exp_32(r, x) be a function that computes $$r b^{\frac{x}{4}} \pmod{2^d}$$;
• mbin_power_odd_32(a, x, y) be a function that computes $$ax^y \pmod{2^d}$$.

Then mbin_power_odd_32 is implemented as follows:

uint32_t mbin_power_odd_32(uint32_t rem, uint32_t base, uint32_t exp) {
if (base & 2) {
/* divider is considered negative */
base = -base;
/* check if result should be negative */
if (exp & 1) {
rem = -rem;
}
}
return (mbin_exp_32(rem, mbin_log_32(0, base) * exp));
}


## Computing 4L(x) from x

Let $$x$$ be an odd number such that $$x \equiv 1 \pmod 4$$. It can be represented as

$x \equiv (2^{a_1}+1)\dots(2^{a_k}+1) \pmod{2^d},$

where $$1 < a_1 < \dots < a_k < d$$. Here $$L(\cdot)$$ is well-defined for each multiplier, as they're equal to $$1$$ modulo $$4$$. Hence,

$4L(x) \equiv 4L(2^{a_1}+1)+\dots+4L(2^{a_k}+1) \pmod{2^{d}}.$

So, if we precompute $$t_k = 4L(2^n+1)$$ for all $$1 < k < d$$, we will be able to compute $$4L(x)$$ for any number $$x$$.

For 32-bit integers, we can use the following table:

const uint32_t mbin_log_32_table = {
0x00000000, 0x00000000, 0xd3cfd984, 0x9ee62e18,
0xe83d9070, 0xb59e81e0, 0xa17407c0, 0xce601f80,
0xf4807f00, 0xe701fe00, 0xbe07fc00, 0xfc1ff800,
0xf87ff000, 0xf1ffe000, 0xe7ffc000, 0xdfff8000,
0xffff0000, 0xfffe0000, 0xfffc0000, 0xfff80000,
0xfff00000, 0xffe00000, 0xffc00000, 0xff800000,
0xff000000, 0xfe000000, 0xfc000000, 0xf8000000,
0xf0000000, 0xe0000000, 0xc0000000, 0x80000000,
};


On practice, a slightly different approach is used than described above. Rather than finding the factorization for $$x$$, we will consequently multiply $$x$$ with $$2^n+1$$ until we turn it into $$1$$ modulo $$2^d$$. In this way, we will find the representation of $$x^{-1}$$, that is

$x (2^{a_1}+1)\dots(2^{a_k}+1) \equiv 1 \pmod {2^d}.$

To do this, we iterate over $$n$$ such that $$1 < n < d$$. If the current $$x$$ has $$n$$-th bit set, we multiply $$x$$ with $$2^n+1$$, which is conveniently done in C++ as x = x + (x << n). This won't change bits lower than $$n$$, but will turn the $$n$$-th bit to zero, because $$x$$ is odd.

With all this in mind, the function mbin_log_32(r, x) is implemented as follows:

uint32_t mbin_log_32(uint32_t r, uint32_t x) {
uint8_t n;

for (n = 2; n < 32; n++) {
if (x & (1 << n)) {
x = x + (x << n);
r -= mbin_log_32_table[n];
}
}

return r;
}


Note that $$4L(x) = -4L(x^{-1})$$, so instead of adding $$4L(2^n+1)$$, we subtract it from $$r$$, which initially equates to $$0$$.

## Computing x from 4L(x)

Note that for $$k \geq 1$$ it holds that

$(a 2^{k}+1)^2 = a^2 2^{2k} +a 2^{k+1}+1 = b2^{k+1}+1,$

from which (by repeated squaring) we can deduce that

$(2^a+1)^{2^b} \equiv 1 \pmod{2^{a+b}}.$

Applying this result to $$a=2^n+1$$ and $$b=d-k$$ we deduce that the multiplicative order of $$2^n+1$$ is a divisor of $$2^{d-n}$$.

This, in turn, means that $$L(2^n+1)$$ must be divisible by $$2^{n}$$, as the order of $$b$$ is $$2^{d-2}$$ and the order of $$b^y$$ is $$2^{d-2-v}$$, where $$2^v$$ is the highest power of $$2$$ that divides $$y$$, so we need

$2^{d-k} \equiv 0 \pmod{2^{d-2-v}},$

thus $$v$$ must be greater or equal than $$k-2$$. This is a bit ugly and to mitigate this we said in the beginning that we multiply $$L(x)$$ by $$4$$. Now if we know $$4L(x)$$, we can uniquely decomposing it into a sum of $$4L(2^n+1)$$ by consequentially checking bits in $$4L(x)$$. If the $$n$$-th bit is set to $$1$$, we will multiply the result with $$2^n+1$$ and reduce the current $$4L(x)$$ by $$4L(2^n+1)$$.

Thus, mbin_exp_32 is implemented as follows:

uint32_t mbin_exp_32(uint32_t r, uint32_t x) {
uint8_t n;

for (n = 2; n < 32; n++) {
if (x & (1 << n)) {
r = r + (r << n);
x -= mbin_log_32_table[n];
}
}

return r;
}


## Further optimizations

It is possible to halve the number of iterations if you note that $$4L(2^{d-1}+1)=2^{d-1}$$ and that for $$2k \geq d$$ it holds that

$(2^n+1)^2 \equiv 2^{2n} + 2^{n+1}+1 \equiv 2^{n+1}+1 \pmod{2^d},$

which allows to deduce that $$4L(2^n+1)=2^n$$ for $$2n \geq d$$. So, you could simplify the algorithm by only going up to $$\frac{d}{2}$$ and then use the fact above to compute the remaining part with bitwise operations:

uint32_t mbin_log_32(uint32_t r, uint32_t x) {
uint8_t n;

for (n = 2; n != 16; n++) {
if (x & (1 << n)) {
x = x + (x << n);
r -= mbin_log_32_table[n];
}
}

r -= (x & 0xFFFF0000);

return r;
}

uint32_t mbin_exp_32(uint32_t r, uint32_t x) {
uint8_t n;

for (n = 2; n != 16; n++) {
if (x & (1 << n)) {
r = r + (r << n);
x -= mbin_log_32_table[n];
}
}

r *= 1 - (x & 0xFFFF0000);

return r;
}


## Computing logarithm table

To compute log-table, one could modify the Pohlig–Hellman algorithm for the case when modulo is a power of $$2$$.

Our main task here is to compute $$x$$ such that $$g^x \equiv y \pmod{2^d}$$, where $$g=5$$ and $$y$$ is a number of kind $$2^n+1$$.

Squaring both parts $$k$$ times we arrive to

$g^{2^k x} \equiv y^{2^k} \pmod{2^d}.$

Note that the order of $$g$$ is not greater than $$2^{d}$$ (in fact, than $$2^{d-2}$$, but we will stick to $$2^d$$ for convenience), hence using $$k=d-1$$ we will have either $$g^1$$ or $$g^0$$ on the left hand side which allows us to determine the smallest bit of $$x$$ by comparing $$y^{2^k}$$ to $$g$$. Now assume that $$x=x_0 + 2^k x_1$$, where $$x_0$$ is a known part and $$x_1$$ is not yet known. Then

$g^{x_0+2^k x_1} \equiv y \pmod{2^d}.$

Multiplying both parts with $$g^{-x_0}$$, we get

$g^{2^k x_1} \equiv (g^{-x_0} y) \pmod{2^d}.$

Now, squaring both sides $$d-k-1$$ times we can obtain the next bit of $$x$$, eventually recovering all its bits.