20090725, 04:11  #1 
Jul 2009
31 Posts 
Fast isPrime() for n < 2^32
I needed an isPrime() test for generating hash table sizes. If you google for efficient prime testing code, you usually end up at Chris Caldwell's great page.
I ended up making my own test that's pretty efficient for n<2^32. Instead of using 3 or 4 SPRP tests, I use only one test, but I choose the base of the test based on a hashed value of N itself. A 1KB pregenerated lookup table then gives the base needed for the single SPRP test to classify the primality any number < 2^32 properly. Simple code is below. It can definitely be tweaked based on your CPU, language, or compiler, especially how many test divides you want to prescreen with. I could probably speed this up slightly by replacing the test divides with an Euler GCD and then Bloom filter, but that's likely only a few percentage points speed boost. Of course I haven't tried it yet. Code:
/* * Copyright 2009 Steve Worley < m a t h g e e k @(my last name).com > * * Permission to use, copy, modify, and distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL * WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED * WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE * AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR * CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS * OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, * NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN * CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ /* isPrime() by Steve Worley Efficiently computes whether a 32 bit integer is prime or not. Strategy: Do first screening with a few trial divisions to identify easy composites. Next use a strongprobableprime test. Normally several tests would be necessary but this code hashes the number into a bin which has a precomputed table listing the SPRP test base which is sufficient for the numbers in that bin. Thus, you can prove (guaranteed 100%) whether your number is prime or not with a single SPRP test. More discussion at www.worley.com/mathgeek/isprime.html This algorithm is deliberately unpatented. The BSD license above also lets you use it even in commercial code. */ static const unsigned char SPRPlookup[1024]={5,6,11,15,76,15,11,26,10,35,29,13, 22,6,14,3,21,24,30,13,6,11,12,6,11,11,5,11,19,3,13, 14,3,6,17,26,6,13,13,13,2,6,5,2,11,15,7,3,10,37,43, 18,33,5,37,13,5,3,37,14,26,5,6,15,19,45,11,24,2,6, 23,2,11,10,2,15,5,5,15,15,15,55,5,92,10,5,18,10,2, 66,22,21,2,42,13,5,17,3,2,6,2,3,2,13,3,14,14,42,6,7, 39,6,15,3,24,5,3,11,14,22,11,5,26,22,14,2,28,6,19,7, 7,10,2,45,40,11,31,2,11,17,33,12,23,2,14,10,13,15, 13,6,3,3,3,38,7,23,18,7,6,10,10,5,11,5,6,2,2,11,6, 12,21,5,43,6,11,15,21,55,38,13,2,2,17,10,13,13,13,7, 10,5,13,3,29,14,11,23,18,5,2,7,2,14,10,2,2,20,10,7, 18,7,6,5,3,10,15,57,18,10,26,12,13,6,17,6,3,3,10,13, 15,44,35,30,26,13,2,13,2,31,37,5,10,31,2,15,5,7,17, 33,17,2,21,23,7,5,17,6,40,17,11,56,30,30,10,3,15,11, 2,15,3,5,2,2,17,22,2,10,2,7,12,13,5,22,5,20,17,7,6, 18,18,6,19,6,5,21,24,2,10,2,2,11,19,42,2,5,17,7,2,42, 7,41,5,5,23,15,11,2,34,10,24,11,12,14,11,2,15,30,28, 19,6,10,12,5,6,45,7,13,11,12,15,13,11,13,2,23,2,33, 15,15,11,10,24,26,13,31,13,7,11,23,5,12,17,14,7,29, 29,15,22,19,14,11,15,19,2,15,2,28,13,20,14,15,60,5, 2,12,69,24,20,7,18,2,13,5,6,10,14,6,24,3,11,23,5,13, 88,13,14,19,3,5,3,21,2,11,6,31,42,18,73,2,7,3,2,3,23, 11,6,6,5,6,14,11,2,11,7,10,7,38,39,5,15,2,19,5,7,22, 13,7,11,24,10,20,14,3,2,10,15,3,13,15,29,11,10,3,6,5, 30,13,26,6,6,17,29,10,19,15,19,21,46,6,17,3,14,3,2,6, 17,26,34,2,11,5,29,19,3,2,19,7,11,14,29,15,7,33,14,21, 12,7,17,15,5,5,7,14,35,21,15,11,15,3,18,3,11,17,7,21, 20,3,15,11,10,3,24,47,10,5,10,5,3,42,6,2,10,21,3,2,2, 5,14,7,3,2,14,15,11,5,12,13,15,15,5,11,10,6,5,7,6,19, 5,2,6,15,5,6,3,2,24,10,10,11,14,11,7,13,3,7,3,18,20, 30,33,17,30,17,3,35,5,5,17,46,6,26,6,20,40,5,10,12,7, 17,5,12,2,6,5,10,5,7,7,7,3,17,11,10,34,10,15,2,6,3,11, 23,13,6,2,18,14,10,7,6,10,7,10,77,6,17,10,5,5,18,17,5, 15,2,11,3,10,13,28,21,20,23,6,7,14,11,3,6,18,5,5,10,6, 23,3,3,2,6,2,15,6,10,5,15,21,7,6,3,7,12,5,3,13,17,20, 21,15,7,2,14,40,20,5,11,7,7,10,10,31,11,13,21,43,3,14, 22,10,22,23,2,10,10,3,7,3,20,7,5,15,7,10,2,30,52,10, 56,14,14,2,2,3,7,5,5,10,3,10,7,15,13,3,89,21,11,5,14, 13,13,11,40,23,21,6,3,20,14,24,6,2,28,6,5,5,6,5,20,7, 13,29,26,31,5,11,3,24,5,7,5,29,38,3,19,13,6,2,22,10,7, 22,5,29,7,11,6,3,2,37,7,7,6,19,12,2,91,7,2,6,56,52,7, 11,10,31,2,6,11,78,34,15,2,53,14,22,31,22,41,2,23,6, 39,6,31,15,15,10,7,30,2,19,6,43,11,38,6,6,3,26,7,10, 11,5,2,11,5,22,22,3,15,2,5,6,24,10,17,7,10,11,2,23,13, 5,11,11,6,21,18,3,21,10,21,2,11,5,7,44,43,31,10,13,3, 2,10,3,7,2,47,13,12,2,13,5,11,2,5,7,11,19,5,10,17,22, 26,2,2,7,2,2,6,6,3,6,22,15,13,57,13,12,2,20,3,5,13,3, 13,15,6,15,2,21,11,10,11,15,2,22,23,10,6,44,7,2,10,5, 31,6,2,5,22,15,5,17,11,2,6,6,13,23,6,5,11,7,28,29,22, 2,7,11,6,35,24,2,3,2,17,13,3,6,2,14,5,11,11,7,23,13,11, 2,13,13,10,12,23,10,18,5,60,10,5,26,5,29,3,20,7,21,47}; /* Computes a^n mod m. Classic binary power ladder evaluation. It may be possible to optimize this based on your architecture, depending especially on its integer % speed, and/or support for native 64 bit ints. */ unsigned long powmod(unsigned long a, unsigned long n, unsigned long m) { unsigned long long result=n&1 ? a : 1; unsigned long long aa=a; n=n>>1; while (n) { // note 64 bit math used for intermediate answers before mod. aa=(aa*aa)%m; if (n&1) result=(aa*result)%m; n=n>>1; } return (unsigned long)result; } static int isSPRP(unsigned long n, unsigned long a) { unsigned long d=n1; unsigned long ad; long s=0; // break down n1 into d*(2^s). Linux ffs() can be better while (0==(d&1)) { ++s; d>>=1; } ad=(powmod(a, d, n)); // (a^d) mod n if (ad==1) return 1; // 1 == a^d mod n if (s>0 && ad==(n1)) return 1; // 1 == a^d mod n (tests (a^d)^(2^0)) for (unsigned long r=1; r<s; ++r) { // ad is currently ad^(2^(r1)) so we square it to get ad^(2^r)); ad=(((unsigned long long)ad)*ad)%n; if (ad==(n1)) return 1; //tests (a^d)^(2^(r+1))) } return 0; // false, composite } int isPrime(unsigned long v) { /* this initial screening test is OPTIONAL, and on many platforms will give faster rejection of most composites without a full SPRP test. */ if (0==v%2) return v==2; if (0==v%3) return v==3; if (0==v%5) return v==5; if (0==v%7) return v==7; if (v<121) return true; // Hash the value to find the lookup table index for a single SPRP test. return isSPRP(v, (unsigned long)SPRPlookup[(0x3AC69A35*v)>>22]); } #include <stdio.h> int main(int argc, char **argv) { unsigned long i; for (i=2; i<300; ++i) if (isPrime(i)) printf("%ld is prime.\n", i); return 0; } 
20110416, 20:25  #2 
Mar 2011
2×5 Posts 
Attached is a code to extend this principle to 1e15 (using William Galway's list) with only 2 millerrabin tests.
It uses a SPRP2, and a hash function to get a value from a table of 2048 "magic" values. Just for fun, the hash uses a very pilike prime and a HHGTTGshift. Since it uses 64bit numbers, powmod is required to get into 128 bit math for each multiplication to use the basic Righttoleft Binary Method. Thus I have two powmod methods: one that uses GMP, and one that uses a mulmod to do the multiplications. The code, as written, just makes use of the GMP method (since it far faster than using powmod2/mulmod) and the powmod2 is simply ignored. This is definitely the weakest part of the entire code. Anyone have a faster way to do 64bit powmod in 64bits with basic C code? I know the entire code would be sped up significantly if I just kept all the numbers in GMP/mpz_t's, but I was trying to isolate GMP as much as possible so it could be removed if I found a short/fast 64bit powmod. The code translates directly to Java as well (using BigInteger for powmod and long for ullint). The hash results are the same in Java as in C (even though Java uses signed 64bit longs). Jim Sinclair 
20110416, 21:01  #3 
Aug 2006
1011101011011_{2} Posts 
Have you thought of extending this to 2^64 with Feitsma's data?
http://gilchrist.ca/jeff/factoring/pseudoprimes.html 
20110416, 22:02  #4  
Mar 2006
2·3^{5} Posts 
Quote:
Here is the code that I use to do mulmod64 and powmod64. I haven't actually tested it against a GMP implementation, but I think it should be faster since it just uses the native machine type. As long as you have a 64bit C compiler that supports inline asm, this should work just fine for you. Code:
u64_t mulmod64(u64_t a, u64_t b, u64_t c) { u64_t d; /* to hold the result of a*b mod c */ /* calculates a*b mod c, stores result in d */ asm ("mov %1, %%rax;" /* put a into rax */ "mul %2;" /* mul a*b > rdx:rax */ "div %3;" /* (a*b)/c > quot in rax remainder in rdx */ "mov %%rdx, %0;" /* store result in d */ :"=r"(d) /* output */ :"r"(a), "r"(b), "r"(c) /* input */ :"%rax", "%rdx" /* clobbered registers */ ); return d; }/* method mulmod64 */ /*================================================================================*/ /* powmod64 from Geoff Reynolds code at */ /* http://www.geocities.com/g_w_reynolds/srsieve/ */ u64_t powmod64(u64_t b, u64_t n, u64_t p) { u64_t a=1; goto tst; mul: a = mulmod64(a,b,p); sqr: b = mulmod64(b,b,p); n >>= 1; tst: if (n & 1) goto mul; if (n > 0) goto sqr; return a; }/* method powmod64 */ 

20110417, 00:31  #5  
Mar 2011
2·5 Posts 
Quote:
Quote:
Quote:
First step was to create the hash function (just using whatever) and verify that it distributes the 419489 SPRP2's fairly evenly regardless of what i use for bucket size (it does). Step 2, I started with 64 buckets, and ran all the numbers in bucket 1 (a random bucket) through using 3, 4, 5... up to about a million as the second witness before I killed it. For each witness I kept track of the number of SPRPs that failed. The count of SPRPs that failed varied a lot but seemed to have a bell curve of some kind. With 64 buckets there was no witness less than 1 million that "solved" bucket 1, or got anywhere close. There were usually hundreds left in bucket 1 unsolved. So I needed to break the buckets down into smaller buckets. Step 3, double the number of buckets (128...256...) and repeat step 2 until I could reasonably expect to get a solved bucket for every bucket. That is, when the average number of liars for a given witness was something like 5 to 10. With 1024 buckets it is close, but 2048 was guaranteed. If I played more with the hashing function and such, I could probably narrow it down, but I was being lazy and did this yesterday afternoon just messing around. Step 4, for all 2048 buckets of SPRPs, find the least number that "solved" the bucket (0 liars left). This is your "magic" value for each bucket. Step 5 verify that the code gets every answer correct up to the highest witness in the 2048 list (which is 1,095,558). The reason for this is that there is a small chance that a random number could hash to a value that is equal to itself or a multiple of itself. For instance, 1219 hashed to a bucket with 1219 in it; this would obviously fail. There were about 5 of these values that had to be replaced out of the 2048. I just did step 4 again and used a different value for that bucket (and reverified). After this, we are good. We know that millerrabin always returns the answer "prime" when a number is prime (assuming the number isn't tested with itself or a multiple). So all prime numbers <1e15 will return correctly. We know the composite SPRP2 liars, and I verified them all with the hashed 2048 magic numbers. Quote:
I'd still like to see a Conly implementation of mulmod though for maximum portability. That is, without basically building an entire MPlibrary in place or using an Intel/AMD specific instruction. I guess I just suspect that there must be a numbertheoretic way of doing this easily. I've tried reading some of web info on Montgomery reduction (don't know if that would even help), but it was beyond my Friday afternoon skills for number theory. Perhaps another day. 

20110417, 02:37  #6  
Aug 2006
3×1,993 Posts 
Quote:


20110417, 02:59  #7 
Mar 2011
2×5 Posts 
Found it (with some creative Googling) and downloading now. thanks! I'll take a look and play with it a little bit.

20110419, 23:00  #8 
Mar 2011
12_{8} Posts 
Fast isPrime for n < 2^64
Fast isPrime for n < 2^64 attached.
Three static SPRP/MillerRabin tests w/ a hash to find the 4th test. It seemed a little too easy to find this. I think with more computation time I could find a way to reduce either the number of tests to 3 total or the size of the hash (just a hunch). But it's a good starting point to get a 100% guaranteed test for 64bit numbers (assuming the completeness of Feitsma's data). Note: mulmod is Intel 64bit assembly. Substitute as needed for cross platform/architecture. Thanks WraithX for the mulmod implementation, and SPWorley for the inspiration. 
20110420, 01:04  #9 
Aug 2006
3×1,993 Posts 
I would really like to see a version that reduces the tests to 3  or perhaps just 2 plus some trial division? The latter would need a really big hash...

20110420, 05:18  #10 
Mar 2003
New Zealand
13·89 Posts 
Here is some SPRP code I put together for the AP26 project. You can find the full source at http://sites.google.com/site/geoffre.../programs/ap26, look in the PrimeQ_x86.h header.
The mulmod function only works up to 2^62 (or 2^52 on some machines), but it is easy to replace with another one that works up to 2^64 (There are much faster methods than the mul/div asm version). If you are doing multiple SPRP tests on the same number then it would probably be worthwhile writing a vector version of the function that tests all bases in parallel. Code:
/* Returns r = a*b (mod p) in the range 0 <= r < p. Assumes 0 <= a,b < p < 2^LDBL_MULMOD_LIMIT. Assumes inv = 1.0L/p. */ static int64_t mulmod(int64_t a, int64_t b, int64_t p, long double inv) { int64_t q, r; long double x, y; /* ab (mod p) = abpq, where q = floor(ab/p) */ x = a; y = b; q = x*y*inv; r = a*b  q*p; /* Because the default FPU rounding mode is roundtonearest the result could be out by +/p. One of these branches could be eliminated by using a nondefault rounding mode. (Roundtozero is ideal I think). */ if (r < 0) r += p; else if (r >= p) r = p; return r; } /* Returns 0 only if N is composite. Otherwise N is a strong probable prime to base a. Assumes N odd, a < N < 2^LDBL_MULMOD_LIMIT. */ static int strong_prp(int64_t a, int64_t N) { long double inv; int64_t r; uint64_t d; unsigned int s, t; /* If N is prime and N = d*2^t+1, where d is odd, then either 1. a^d = 1 (mod N), or 2. a^(d*2^s) = 1 (mod N) for some s in 0 <= s < t */ inv = 1.0L/N; #ifdef __GNUC__ t = __builtin_ctzll(N1); d = (uint64_t)N >> t; #else for (d = (uint64_t)N >> 1, t = 1; !(d & 1); d >>= 1, t++) ; #endif /* r < a^d mod N, assuming d odd */ for (r = a, d >>= 1; d > 0; d >>= 1) { a = mulmod(a,a,N,inv); if (d & 1) r = mulmod(r,a,N,inv); } /* Clause 1. and s = 0 case for clause 2. */ if (r == 1  r == N1) return 1; /* 0 < s < t cases for clause 2. */ for (s = 1; s < t; s++) if ((r = mulmod(r,r,N,inv)) == N1) return 1; return 0; } Last fiddled with by geoff on 20110420 at 05:21 Reason: fix url 
20110420, 14:06  #11  
Nov 2003
2^{2}×5×373 Posts 
Quote:
Lucas test or a FrobeniusGrantham. There is no known pseudoprime for a single MR followed by a single Lucas test. (Preda Mihailescu and I looked for one without success). If the objective is proved primes, one can just trial divide N1 and N+1 to N^(1/6), and apply the SelfridgeLehmerBrillhart test that allows one to prove primality if enough factors of N1 and N+1 are known. 

Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Do normal adults give themselves an allowance? (...to fast or not to fast  there is no question!)  jasong  jasong  35  20161211 00:57 
How to Check if NonMersenne Number isPrime?  FloatingPoint  Operation Billion Digits  39  20151021 02:15 
How fast is the dog?  Andi47  Puzzles  20  20090401 02:35 
I wonder how fast this is...  ixfd64  Hardware  1  20051121 21:28 
Fast way to square???  maheshexp  Math  2  20040529 01:54 