Implementation: Multiplication

The multiplication of integers may be thought of as repeated addition; that is, the product of two numbers is equivalent to the following sum: $$ A \cdot B = \sum_{i=1}^{A} B $$ Following this idea, we can implement a naïve multiplication:

def naive_mul(a, b):
    # Add b to the result a times. Assumes a is a non-negative integer.
    r = 0
    for _ in range(a):
        r += b
    return r

Although we could be smart and reduce the number of loop iterations by choosing the smaller operand as the multiplier (sketched below), the number of additions still grows linearly with the size of the multiplier. Ignoring the effect of the multiplicand, this behavior is called linear scaling and is often denoted as \(\mathcal{O}(n)\).
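
A minimal sketch of that variant might look as follows (the name naive_mul_swapped is purely illustrative, and non-negative operands are assumed):

def naive_mul_swapped(a, b):
    # Illustrative variant: loop over the smaller operand.
    # Still O(n) additions overall; assumes non-negative integers.
    if a > b:
        a, b = b, a
    r = 0
    for _ in range(a):
        r += b
    return r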

Can we do better?

We have learned that multiplication by powers of 2 can be easily realized with the left shift operator <<. Since every non-negative integer can be written as a sum of powers of 2, we may try to compute the necessary products of the multiplicand with powers of 2 and sum them up. Let us work through an example: 11 * 3.

assert (2**3 + 2**1 + 2**0) * 3 == 11 * 3
assert (2**3 * 3 + 2**1 * 3 + 2**0 *3) == 11 * 3
assert (3 << 3) + (3 << 1) + 3 == 11 * 3

To implement this idea, we can start from the multiplicand (b) and check the least significant bit (LSB) of the multiplier (a). If the LSB is 1, this power of 2 is present in a, and b is added to the result. If the LSB is 0, this power is not present in a and nothing is done. In order to check the second LSB of a and perhaps add 2 * b, we can simply right shift a. In this way the second LSB becomes the new LSB, and b needs to be multiplied by 2 (a left shift). This algorithm is illustrated for the example above as:

Iteration  a     b       r       Action
0          1011  000011  000000  r += b, b <<= 1, a >>= 1
1          0101  000110  000011  r += b, b <<= 1, a >>= 1
2          0010  001100  001001  b <<= 1, a >>= 1
3          0001  011000  001001  r += b, b <<= 1, a >>= 1
4          0000  110000  100001  Only zeros in a. Stop.

An example implementation is given in the following listing:

def mul(a, b):
    # Shift-and-add multiplication. Assumes a is a non-negative integer.
    r = 0
    for _ in range(a.bit_length()):
        if a & 1 != 0:  # current LSB of a is set: this power of 2 contributes
            r += b
        b <<= 1  # multiplicand times the next power of 2
        a >>= 1  # expose the next bit of the multiplier
    return r

This new algorithm scales with the length of the binary representation of the multiplier, which grows logarithmically with its value. This is denoted as \(\mathcal{O}(\log n)\).
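
Before comparing running times, it is worth convincing ourselves that mul actually computes the right product. A quick sanity check against Python's built-in * operator might look like this (the bound and the number of trials are arbitrary):

from random import randrange

# Spot-check mul against the built-in * operator on random inputs.
for _ in range(1000):
    a = randrange(0, 10**6)
    b = randrange(0, 10**6)
    assert mul(a, b) == a * b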

To show the difference between these two algorithms, we can write a function to time their executions. The following example uses the function perf_counter from the time module:

from random import randrange
from time import perf_counter

def time_multiplication(func, n, cycles=1):
    # Average the execution time of func over `cycles` random input pairs
    # drawn from [0, n).
    total_time = 0.0
    for _ in range(cycles):
        a = randrange(0, n)
        b = randrange(0, n)

        t_start = perf_counter()
        func(a, b)
        t_stop = perf_counter()

        total_time += (t_stop - t_start)

    return total_time / cycles
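
A small driver can then produce such measurements; one possible sketch is given below (the sizes and cycle count are arbitrary, and absolute numbers depend heavily on hardware and Python version):

for n in (10, 100, 1_000, 10_000, 100_000):
    t_naive = time_multiplication(naive_mul, n, cycles=100)
    t_shift = time_multiplication(mul, n, cycles=100)
    # Convert from seconds to microseconds for readability.
    print(f"{n:>7}  {t_naive * 1e6:8.2f} μs  {t_shift * 1e6:8.2f} μs")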

The average execution times per call for different input sizes are listed below:

n          naive_mul / μs   mul / μs
10                   0.48       0.80
100                  2.83       1.56
1 000               28.83       1.88
10 000             224.33       2.02
100 000           2299.51       2.51

Although the naïve algorithm is faster for \(a, b \leq 10\), its time consumption grows rapidly when a and b become larger, and for even larger numbers it quickly becomes unusable. The second algorithm, however, scales reasonably well and can be applied to much larger numbers.