Skip to content

Binary Exponentiation by Factoring

Consider a problem of computing \(ax^y \pmod{2^d}\), given integers \(a\), \(x\), \(y\) and \(d \geq 3\), where \(x\) is odd.

The algorithm below allows to solve this problem with \(O(d)\) additions and binary operations and a single multiplication by \(y\).

Due to the structure of the multiplicative group modulo \(2^d\), any number \(x\) such that \(x \equiv 1 \pmod 4\) can be represented as

\[ x \equiv b^{L(x)} \pmod{2^d}, \]

where \(b \equiv 5 \pmod 8\). Without loss of generality we assume that \(x \equiv 1 \pmod 4\), as we can reduce \(x \equiv 3 \pmod 4\) to \(x \equiv 1 \pmod 4\) by substituting \(x \mapsto -x\) and \(a \mapsto (-1)^{y} a\). In this notion, \(ax^y\) is represented as

\[ a x^y \equiv a b^{yL(x)} \pmod{2^d}. \]

The core idea of the algorithm is to simplify the computation of \(L(x)\) and \(b^{y L(x)}\) using the fact that we're working modulo \(2^d\). For reasons that will be apparent later on, we'll be working with \(4L(x)\) rather than \(L(x)\), but taken modulo \(2^d\) instead of \(2^{d-2}\).

In this article, we will cover the implementation for \(32\)-bit integers. Let

  • mbin_log_32(r, x) be a function that computes \(r+4L(x) \pmod{2^d}\);
  • mbin_exp_32(r, x) be a function that computes \(r b^{\frac{x}{4}} \pmod{2^d}\);
  • mbin_power_odd_32(a, x, y) be a function that computes \(ax^y \pmod{2^d}\).

Then mbin_power_odd_32 is implemented as follows:

uint32_t mbin_power_odd_32(uint32_t rem, uint32_t base, uint32_t exp) {
    if (base & 2) {
        /* divider is considered negative */
        base = -base;
        /* check if result should be negative */
        if (exp & 1) {
            rem = -rem;
        }
    }
    return (mbin_exp_32(rem, mbin_log_32(0, base) * exp));
}

Computing 4L(x) from x

Let \(x\) be an odd number such that \(x \equiv 1 \pmod 4\). It can be represented as

\[ x \equiv (2^{a_1}+1)\dots(2^{a_k}+1) \pmod{2^d}, \]

where \(1 < a_1 < \dots < a_k < d\). Here \(L(\cdot)\) is well-defined for each multiplier, as they're equal to \(1\) modulo \(4\). Hence,

\[ 4L(x) \equiv 4L(2^{a_1}+1)+\dots+4L(2^{a_k}+1) \pmod{2^{d}}. \]

So, if we precompute \(t_k = 4L(2^n+1)\) for all \(1 < k < d\), we will be able to compute \(4L(x)\) for any number \(x\).

For 32-bit integers, we can use the following table:

const uint32_t mbin_log_32_table[32] = {
    0x00000000, 0x00000000, 0xd3cfd984, 0x9ee62e18,
    0xe83d9070, 0xb59e81e0, 0xa17407c0, 0xce601f80,
    0xf4807f00, 0xe701fe00, 0xbe07fc00, 0xfc1ff800,
    0xf87ff000, 0xf1ffe000, 0xe7ffc000, 0xdfff8000,
    0xffff0000, 0xfffe0000, 0xfffc0000, 0xfff80000,
    0xfff00000, 0xffe00000, 0xffc00000, 0xff800000,
    0xff000000, 0xfe000000, 0xfc000000, 0xf8000000,
    0xf0000000, 0xe0000000, 0xc0000000, 0x80000000,
};

On practice, a slightly different approach is used than described above. Rather than finding the factorization for \(x\), we will consequently multiply \(x\) with \(2^n+1\) until we turn it into \(1\) modulo \(2^d\). In this way, we will find the representation of \(x^{-1}\), that is

\[ x (2^{a_1}+1)\dots(2^{a_k}+1) \equiv 1 \pmod {2^d}. \]

To do this, we iterate over \(n\) such that \(1 < n < d\). If the current \(x\) has \(n\)-th bit set, we multiply \(x\) with \(2^n+1\), which is conveniently done in C++ as x = x + (x << n). This won't change bits lower than \(n\), but will turn the \(n\)-th bit to zero, because \(x\) is odd.

With all this in mind, the function mbin_log_32(r, x) is implemented as follows:

uint32_t mbin_log_32(uint32_t r, uint32_t x) {
    uint8_t n;

    for (n = 2; n < 32; n++) {
        if (x & (1 << n)) {
            x = x + (x << n);
            r -= mbin_log_32_table[n];
        }
    }

    return r;
}

Note that \(4L(x) = -4L(x^{-1})\), so instead of adding \(4L(2^n+1)\), we subtract it from \(r\), which initially equates to \(0\).

Computing x from 4L(x)

Note that for \(k \geq 1\) it holds that

\[ (a 2^{k}+1)^2 = a^2 2^{2k} +a 2^{k+1}+1 = b2^{k+1}+1, \]

from which (by repeated squaring) we can deduce that

\[ (2^a+1)^{2^b} \equiv 1 \pmod{2^{a+b}}. \]

Applying this result to \(a=2^n+1\) and \(b=d-k\) we deduce that the multiplicative order of \(2^n+1\) is a divisor of \(2^{d-n}\).

This, in turn, means that \(L(2^n+1)\) must be divisible by \(2^{n}\), as the order of \(b\) is \(2^{d-2}\) and the order of \(b^y\) is \(2^{d-2-v}\), where \(2^v\) is the highest power of \(2\) that divides \(y\), so we need

\[ 2^{d-k} \equiv 0 \pmod{2^{d-2-v}}, \]

thus \(v\) must be greater or equal than \(k-2\). This is a bit ugly and to mitigate this we said in the beginning that we multiply \(L(x)\) by \(4\). Now if we know \(4L(x)\), we can uniquely decomposing it into a sum of \(4L(2^n+1)\) by consequentially checking bits in \(4L(x)\). If the \(n\)-th bit is set to \(1\), we will multiply the result with \(2^n+1\) and reduce the current \(4L(x)\) by \(4L(2^n+1)\).

Thus, mbin_exp_32 is implemented as follows:

uint32_t mbin_exp_32(uint32_t r, uint32_t x) {
    uint8_t n;

    for (n = 2; n < 32; n++) {
        if (x & (1 << n)) {
            r = r + (r << n);
            x -= mbin_log_32_table[n];
        }
    }

    return r;
}

Further optimizations

It is possible to halve the number of iterations if you note that \(4L(2^{d-1}+1)=2^{d-1}\) and that for \(2k \geq d\) it holds that

\[ (2^n+1)^2 \equiv 2^{2n} + 2^{n+1}+1 \equiv 2^{n+1}+1 \pmod{2^d}, \]

which allows to deduce that \(4L(2^n+1)=2^n\) for \(2n \geq d\). So, you could simplify the algorithm by only going up to \(\frac{d}{2}\) and then use the fact above to compute the remaining part with bitwise operations:

uint32_t mbin_log_32(uint32_t r, uint32_t x) {
    uint8_t n;

    for (n = 2; n != 16; n++) {
        if (x & (1 << n)) {
            x = x + (x << n);
            r -= mbin_log_32_table[n];
        }
    }

    r -= (x & 0xFFFF0000);

    return r;
}

uint32_t mbin_exp_32(uint32_t r, uint32_t x) {
    uint8_t n;

    for (n = 2; n != 16; n++) {
        if (x & (1 << n)) {
            r = r + (r << n);
            x -= mbin_log_32_table[n];
        }
    }

    r *= 1 - (x & 0xFFFF0000);

    return r;
}

Computing logarithm table

To compute log-table, one could modify the Pohlig–Hellman algorithm for the case when modulo is a power of \(2\).

Our main task here is to compute \(x\) such that \(g^x \equiv y \pmod{2^d}\), where \(g=5\) and \(y\) is a number of kind \(2^n+1\).

Squaring both parts \(k\) times we arrive to

\[ g^{2^k x} \equiv y^{2^k} \pmod{2^d}. \]

Note that the order of \(g\) is not greater than \(2^{d}\) (in fact, than \(2^{d-2}\), but we will stick to \(2^d\) for convenience), hence using \(k=d-1\) we will have either \(g^1\) or \(g^0\) on the left hand side which allows us to determine the smallest bit of \(x\) by comparing \(y^{2^k}\) to \(g\). Now assume that \(x=x_0 + 2^k x_1\), where \(x_0\) is a known part and \(x_1\) is not yet known. Then

\[ g^{x_0+2^k x_1} \equiv y \pmod{2^d}. \]

Multiplying both parts with \(g^{-x_0}\), we get

\[ g^{2^k x_1} \equiv (g^{-x_0} y) \pmod{2^d}. \]

Now, squaring both sides \(d-k-1\) times we can obtain the next bit of \(x\), eventually recovering all its bits.

References