Prime number generator

For questions about problems on the Project Euler web site. No spoilers. Please include the question number in the subject line of your post.

Prime number generator

Postby jimbo_jones » Wed Nov 06, 2013 10:50 pm

Hi, i've started on some Project Euler problems so I can get to learn python -- I might have a few questions about different things!

I had a question about generating prime numbers - a few of the problems so far have been about primes, so having a nice fast script would be quite useful.
I made v1, which seemed to be ok... until I needed the 10001st prime - then it took about 20 mins to crunch through to a (correct) answer
I have now coded v2, which gets to 10001st in <2 secs.

v1: Takes a 'candidate prime', checks if it is prime by checking (candidate % div == 0) against the current list of confirmed primes, primes = [2, 3, 5, 7...etc etc]". If the candidate is prime, append to primes[]; if not, take the next odd number as the new candidate (until some specified limit).

Problem: it was very slow! I guessed that, in order to loop through all the confirmed primes, when I call my isItAPrimeNumber(candidate) function, Python has to hold the *entire* list of confirmed primes[] in memory, in order to test each one... Is this correct? Using python -m profiler, it tells me all the processing time is going to this function, but not really anything more detailed.

v2: Takes the 'candidate' and simply divides by 2, 3, 5, 7, 9... etc (basically just odd numbers) until it either a) finds a divisor; or b) gets too big (sqrt of the candidate). Now, we don't hold anything in memory -- just the integer that is being used as a divisor, which is incremented by 2 at the end of each loop.

Questions: It feels like v2 is "logically inefficient" -- I shouldn't be testing 9 (= 3*3), 15 (3*5), 21 (3*7) at all, since I have already tested 3, 5 and 7...... but it must be more "computationally efficient" to simply test all the odd numbers, because the computer is faster at doing the 'stupid thing' (incrementing the divisor) than the 'smart thing' (load primes[] into memory, grab each item and use as divisors). With low numbers it probably doesn't make much difference - a lot of the odd numbers will be prime anyway. But as the range increases, we are pointlessly testing a lot of odd numbers which have prime factors themselves (altough this seems to be massively compensated for, by not having to load the massive list of primes[] into memory anyways).

Is there a 'smart way' to grab items from primes[] one at a time?? Or is that just part of the game, because of the way Python deals with list objects?
(I've been reading a bit about 'generators' but I don't really understand. It feels like this should be part of the solution, but I could be wrong. Also - i am learning python! And never really done any OOP before either!)

Thanks for reading! Hope that was relatively clear...
jimbo_jones
 
Posts: 1
Joined: Wed Nov 06, 2013 10:26 pm

Re: Prime number generator

Postby micseydel » Wed Nov 06, 2013 11:03 pm

jimbo_jones wrote:Hi, i've started on some Project Euler problems so I can get to learn python -- I might have a few questions about different things!

Welcome to the forum!

Problem: it was very slow! I guessed that, in order to loop through all the confirmed primes, when I call my isItAPrimeNumber(candidate) function, Python has to hold the *entire* list of confirmed primes[] in memory, in order to test each one... Is this correct? Using python -m profiler, it tells me all the processing time is going to this function, but not really anything more detailed.

When you have a question about code, it's usually better to post the code in question, but yes, it sounds like you're using memory proportional to the size of your prime.

v2: Takes the 'candidate' and simply divides by 2, 3, 5, 7, 9... etc (basically just odd numbers) until it either a) finds a divisor; or b) gets too big (sqrt of the candidate). Now, we don't hold anything in memory -- just the integer that is being used as a divisor, which is incremented by 2 at the end of each loop.

Questions: It feels like v2 is "logically inefficient" -- I shouldn't be testing 9 (= 3*3), 15 (3*5), 21 (3*7) at all, since I have already tested 3, 5 and 7...... but it must be more "computationally efficient" to simply test all the odd numbers, because the computer is faster at doing the 'stupid thing' (incrementing the divisor) than the 'smart thing' (load primes[] into memory, grab each item and use as divisors). With low numbers it probably doesn't make much difference - a lot of the odd numbers will be prime anyway. But as the range increases, we are pointlessly testing a lot of odd numbers which have prime factors themselves (altough this seems to be massively compensated for, by not having to load the massive list of primes[] into memory anyways).

The reason you're experiencing this change is time complexity. Your first is O(n) in time and space and the second is O(sqrt(n)). For large inputs, a square root function is always faster than linear. The idea of time complexity will come into play very often in your Project Euler journey, where you'll usually find exponential and square algorithms too slow (even linear is sometimes too slow!).

Is there a 'smart way' to grab items from primes[] one at a time?? Or is that just part of the game, because of the way Python deals with list objects?
(I've been reading a bit about 'generators' but I don't really understand. It feels like this should be part of the solution, but I could be wrong. Also - i am learning python! And never really done any OOP before either!)

Thanks for reading! Hope that was relatively clear...

The typical way to create primes is with a sieve, such as the Sieve of Eratosthenes. I'm not sure what you mean by "one at a time". If you want to check if a single number is prime, the O(sqrt(n)) solution is what you use. If you want to check up to n numbers, that's n*O(sqrt(n)) work which is O(n * sqrt(n)). Creating a sieve is O(n) in time and space, and if you store its contents in a set, then lookup is constant, so it would be 2*n work to create the sieve then do lookups, which quickly becomes faster than O(n * sqrt(n)).

I just threw a tone of stuff at you here, feel free to ask any questions you have but definitely check out the links!
Join the #python-forum IRC channel on irc.freenode.net!

Please do not PM members regarding questions which are meant to be discussed publicly. The point of the forum is so that others can benefit from it. We don't want to help you over PMs or emails.
User avatar
micseydel
 
Posts: 1355
Joined: Tue Feb 12, 2013 2:18 am
Location: Mountain View, CA


Return to Project Euler

Who is online

Users browsing this forum: No registered users and 1 guest