Prime Time! #1: Hunting For Primes w/ Trial Division | High Performance Computing in Python

The OG prime boss Optimus Prime.
Image credits: Transformers: The Last Knight

Okay so I recently got into something called competitive programming, which basically is competing in solving problems with programming. The types of puzzles to solve can test anything from understanding of basic language features and the standard library to knowledge in algorithms, number theory and optimization. There are several sites such as leetcode.org and codewars.com which provide practice problems to train on. A pretty common theme for these practice problems revolve around computing either sequences such as the fibonacci sequence (which I’ll cover in a future post) or computing primes in some manner or another. So that’s basically the inspiration for this post: computing primes!

In this post we will talk about what a prime is, write code for computing them and learn some maths along the way. We will become familiarized with trial division which is the simplest to understand method for verifying the primality of a number, as well as what optimizations we can add to it in order to crank up the performance. We’ll start with some small primes and work our way up to a 20-digit prime. So strap up, put your boots on and let’s get coding. Welcome, to the learning journey!

Is 66495897020462343733 A Prime? | What The Heck’s A Prime Number, Anyways?

Great question! We will answer whether 66495897020462343733 is a prime, but first we need to go on a journey. We start with trying to understand what a prime actually is.

A good way to define primes is the following: a number N is a prime number if there are no positive integer numbers a and b smaller than N that can be multiplied together to yield N. For example, 12 is not a prime number since it can be formed by multiplying 4 and 3 together. 4 is not a prime number because it can be formed by multiplying 2 and 2 together. On the other hand 3 is a prime as there are no two positive integers smaller than 3 that can be multiplied together to yield 3. Note that by the definition of primes, 1 is not a prime number either, as you cannot multiply any two positive integers smaller than 1 together to yield 1. The number 1 is instead called a unit. All non-prime numbers larger than 1 are called composite numbers.

The way I imagine prime numbers in mathematics is that they are the fundamental atoms of the numerical universe. With the primes every other positive number can be created by multiplication. In that sense the primes build up all other numbers.

Starting Out | A Super Simple Prime Verifier

A really simple algorithm to test for the primality of a number N is to simply divide N by every number f for which 1< f < N holds. By checking if the modulo (remainder) is 0 we know if f is a factor of N or not. So, let’s take what we’ve learnt so far and put it to use in code!

def primality(N):
    is_prime = True
    for f in range(2,N):
        if(N % f == 0):
            is_prime = False
    return is_prime

Great! That was easy! This is the naive approach to testing for primality, which means that it is the basic approach where we test every possible number in the range. The algorithm works. But it is very, very, very inefficient.

This type of primality testing builds on what we call trial division. As implied by the name, trial division simply means that we test the primality of N by testing to divide it by every integer number 1<f<N. There are much more efficient ways to test for primality, but we will leave those for a future post.

To begin to see how we could improve upon the first algorithm let us take a look at all the numbers it tests. We first write the function generate_numbers(N) which will return all of our numbers. Note 1: I use something called list compression in this case, but it is perfectly O.K. to use a simple for loop instead if you feel more comfortable with that. Note 2: if you were about to comment on it then yes, I know that this technically isn’t a generator yet. The name simply foreshadows the final code.

The method looks as follows:

def generate_numbers(N):
    return [i for i in range(2,N)]

We then test it in the terminal for a small N.

Input: generate_numbers(N=100)

Output: [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

Input: len(generate_numbers(N=100))

Output: 98

We now begin to see what the problem with our naive approach is. Can you see it? If N was not divisible for 2, then we know for a fact that it will not be divisible by any number that also is divisible by 2. Ergo, the only even number we need to test with is 2 and then we can skip all other even numbers. By doing so we only test half and should theoretically double the speed. Let’s write a new generate_numbers method that skips all even numbers larger than 2, and update our primality method to be able to take any method of generating numbers.

def primality(N,number_generator):
    for f in number_generator(N):
        if(N%f == 0):
            return False
    return True

def generate_numbers2(N):
    return [2] + [i for i in range(3,N,2)]

We start with 2, and then from numbers 3 and onwards we increment by 2 instead of by 1. Let’s test the numbers generated now.

Input: generate_numbers2(N=100)

Output: [2, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99]

Input: len(generate_numbers2(N=100))

Output: 50

Much better! We’ve about halved the number of divisors we test! That is sure to speed up our primality function. But we are not done yet. We can see that we test for 3, and then we also test for powers of three such as 9, 27 and 81 as well as numbers such as 11 and 99. We know for sure that by the same logic as with the 2’s: if it is not divisible by 3 then it’s not divisible by any power larger than 3 either. So we can write a function that also ignores powers of 3 as well.

The sequence we want is: “2, 3, 5, 7, 11, 13, 17, 19, 23, …”. If we start from 5, we can see that the difference between consecutive values is +2,+4,+2,+4,+2,+4 etc. This can be achieved by the following code:

def generate_numbers3(N):
    numbers = [2,3]
    f = 5
    add = 2
    while(f<N):
        numbers.append(f)
        f += add
        add = 6 - add
    return numbers

And testing how many numbers we get with this new optimization yields:

Input: generate_numbers3(N=100)

Output: [2, 3, 5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35, 37, 41, 43, 47, 49, 53, 55, 59, 61, 65, 67, 71, 73, 77, 79, 83, 85, 89, 91, 95, 97]

Input: len(generate_numbers3(N=100)

Output: 34

Great improvement! By using these two hacks we’ve gone from calculating 98 divisions to calculating only 34 divisions. There are still more optimizations we can do for our simple trial division algorithm but let’s take a moment to evaluate. Let’s see how long time it takes to test the first 100,000 numbers if they are prime or not. To our help we will use the process_time() method of the time library.

from time import process_time()
def primality_time_test(Nmax,fnc):
    start = process_time()
    for N in range(2,Nmax):
        primality(N,fnc)
    end = process_time()
    print(end-start,"seconds")

Alright, let’s test it out and see how well our improvements did!

#test 1 - naive approach 1<f<N
Input: primality_time_test(Nmax=100000,generate_numbers1)
Output: 146.1 seconds

#test 2 - only odd numbers
Input: primality_time_test(Nmax=100000,generate_numbers2)
Output: 71.9 seconds

#test 3 - only odd numbers non-divisible by 3
Input: primality_time_test(Nmax=100000,generate_numbers3)
Output: 148.1 seconds

Well okay… We see by test 2 that skipping all even numbers larger than 2 does indeed give us the expected speed up of halving the time – as was expected. But test 3 is a surprise! We compute about a third of the values in test 3 as we do in test 1, but they take the same amount of time. Why?

As always, the devil is in the details; in this case it’s in how we generate our numbers. Let’s inspect the methods generate_numbers2 and generate_numbers3. In the first we create our numbers in basically the same manner as in the naive approach only that we take half as many steps. Thus by logic it should do about half the number of operations as the naive approach. But in our other method while we create only a third of the numbers the way in which we create them is much more expensive. We compute both an incremental value as well as computing what number to add next. That means that we effectively double the number of computations for each number. In the other cases we only had a simple loop where we compute a single incrementing value. Even worse is that in the naive and odd-only methods we could use list compression which is (generally) significantly faster than using a for loop, and definitely faster than while loops. In generate_numbers3 it’s not really possible to use list compression (as far as I could figure out at least!). We could speed up the while loop by pre-allocating the list using numbers = [None]*N and then using numbers[i] = f instead of .append(f), but there is an even better way which eliminates the need for allocation all together.

We Need More Power! | Powering Up With Generators

If you guessed that we’re going to turn our generate_numbers methods into actual generators – well you guessed right!

Most people who know python (or really any programming language) knows the return function. The procedure with methods is usually something as follows: First you call the method, then do some computation inside the method which gets you a result and you then return that result with the help of the return keyword, and you continue with your program. In our case that means that we call our method and then generate all the numbers we want between 1 and N and then return the whole list of numbers. However, we only use each number in the list once. And we spend a lot of time creating it during which we don’t do anything else. Furthermore for really larger numbers N, the size of the list itself becomes a huge problem.

Luckily, for processors that have any form of multi-threading capabilities (i.e. just about every processor you would run python code on), there is the option to use the yield keyword instead. By using the yield keyword instead of the return keyword we have created a generator. Generators are a whole topic on its own, however. For now we only need to understand that generators “generate” the values we need only when we ask for (need) them. We then get the number we need and can proceed with our code. In the meantime the generator executes its own code until it finds the next value to yield. It then waits for when we ask for the next value. This means that we don’t need to allocate a whole 15,000+ digit list, instead we only compute the next number for when we need it. Let’s put it in code to see how it looks.

def generate_numbers4(N):
    yield 2 #give 2 first time we ask for a number
    yield 3 #give 3 second time we ask for a number
    f = 5
    add = 2
    while(f<N):
        yield f  #give next number when we ask for it
        f += add
        add = 6 - add

When this code first is called it executes until it first meets a yield keyword, in this case the first line at yield 2. It then waits for us to ask for the next number. When we do it is ready to give us the 2, and then goes on until the next yield keyword, in this case the second line at yield 3. It then continues to set f to 5, add to 2 and enter the while loop. It there finds a yield and waits for us to ask for the next number. When we ask for it, it gives f and then goes exactly once in the loop until it meets the yield keyword again. When the condition for the while loop isn’t meet anymore the code exits the loop. Since there is no yield after the while loop the code runs and finds that the code has ended. Thus the generator is then finished, or empty.

Okay so did that do anything? Let’s test the time for checking the first N = 100,000 numbers for primality, using the generate_numbers4 method.

#test 4 - generator skipping even numbers and numbers divisible by 3
Input: primality_time_test(Nmax=100000,generate_numbers4)
Output: 16.4 seconds

Okay wow! That is much better than we originally had expected! Since the naive approach took 150 seconds and this approach only computes a third of those numbers we should expect it to be about 50 seconds. But remember that the yield keyword negates the need for creating huge lists all together. Hence we see speed improvement not only in terms of computing fewer numbers, but also the overhead from list memory allocation as well.

Let The Hunt Begin! | Tackling Some Large Primes

from time import process_time

def primality(N,number_generator):
    for f in number_generator(N):
        if(N%f == 0):
            return False
    return True

def generate_numbers4(N):
    yield 2
    yield 3
    f = 5
    add = 2
    while(f<N):
        yield f
        f += add
        add = 6 - add

def primality_time_test(N,numgen):
    start = process_time()
    is_prime = primality(N,numgen)
    end = process_time()
    print(end-start,"seconds")
    return is_prime

Now that we use generators for finding the divisors we don’t have to worry about the lists getting too big for larger primes. We are finally ready. So let’s get hunting some primes! We’ll start with something small just to see how we fare. Let’s test N = 7919.

Input: primality_time_test(N=7919,numgen=generate_numers4)

Output: 0.000273 seconds
True

Okay so that was very fast and nice. But 7919 is an ant when it comes to hunting primes. Let’s test some slightly bigger primes to see how we fare.

Input: primality_time_test(N=1313579,numgen=generate_numers4)

Output: 0.0455 seconds
True

So it is still fast for testing a single prime number. Let’s go even further! A nice prime number is N= 982451653. An even better prime number is one of my favourites: N = 1000000999999. Let’s test them both!

Input: primality_time_test(N=982451653,numgen=generate_numers4)

Output: 33.12 seconds
True

Input: primality_time_test(N=1000000999999,numgen=generate_numers4)

Output: ???????? seconds

Wow okay. That’s no good. The 9-digit prime took over half a minute, and the 13-digit prime took waay too long. After 2 hours I just force quit the program. There is no way that we will be able to get any of the big monster primes in this way. We need more optimization!

Be Prime Or Be Square(root) | Taking It To The Next Level

So, what we have done so far will seem inconsequential compared to what comes next. Let’s begin by looking at the different ways we can factor a given number. To get the number N = 72, the ways we can multiply two integers together is:

1 * 72	9 * 8
2 * 36	12 * 6
3 * 24	18 * 4
4 * 18	24 * 3
6 * 12	36 * 2
8 * 9	72 * 1

What we can notice from this is that after a while, we are really just repeating the same multiplications but in the inverse order. This shift happens for all numbers, and always half way through. The reason for this is that half way we pass the square root of the number. For 72 the square root is ~8.5. That means that there is no use in ever testing for integers larger than the square of N as we know those numbers will just repeat previous tests.

def generate_numbers5(N):
    yield 2
    yield 3
    f = 5
    add = 2
    nsqr = int(N**0.5)+1
    while(f<nsqr):
        yield f
        f += add
        add = 6 - add

When it comes to optimizing the trial division algorithm, this is the heavy hitter. Just look at the time to test the primes N=982451653 and N=1000000999999 with this version:

Input: primality_time_test(N=982451653,numgen=generate_numers4)

Output: 0.00108 seconds
True

Input: primality_time_test(N=1000000999999,numgen=generate_numers4)

Output: 0.0422 seconds
True

This is… not even close to being in the same league as the previous algorithms! We can easily compute 13 digit primes in mere fractions of a second. Let’s test the limits of this and see how big primes we can hunt! Some other nice primes are N=100000000010001 at 15 digits and N=100055128505716009 at 18 digits, and of course – the crown jewel N=66495897020462343733 at 20-digits! If you’ve got good memory, this is the number we wanted to know if it was a prime at the beginning of the post.

Input: primality_time_test(N=1000000000100011,numgen=generate_numers4)

Output: 1.30 seconds
True

Input: primality_time_test(N=100055128505716009,numgen=generate_numers4)

Output: 12.86 seconds
True

Input: primality_time_test(N=66495897020462343733,numgen=generate_numers4)

Output: 464.2 seconds
True

YES! We caught our big fish! 66495897020462343733 is a prime! We’ve worked our from the small primes all the way up some not too trivial primes. But 20 digits is still tiny in the game of big primes. In order to go further beyond this limit, we will have to look for more advanced algorithms however. We’ll let that adventure be for another day. For now we should rest and feel good about ourselves for managing this far :)

Wrapping Up

In this post we learnt what a prime is, as well the simplest method for testing if a number is prime (trial division). We observed a few improvements that we could apply to our trial division algorithm in order to remove redundant divisors and thus speed up the program. The first improvement was to for f > 3 skip all even divisors and all divisors divisible by 3. Additionally by using a generator with the yield keyword we could both speed up the creation of these divisors and reduce the memory cost of the program. Finally, the big improvement was to only use divisors up to the square root of N. I hope you enjoyed reading it and I for sure know I enjoyed learning and writing about it. Prime numbers are a pretty interesting field of algorithms since they are so incredibly hard once you go up the numbers.

In a future post I’ll talk about the sieve of Eratosthenes and methods used for computing all primes up to any N. But these methods are all only usable for small primes. To get into the big game we will need to look at algorithms such as AKS tests and non-deterministic algorithms. Until then, stay hungry for knowledge!