Lenstra elliptic curve factorization
From Wikipedia, the free encyclopedia
The Lenstra elliptic curve factorization or the elliptic curve factorization method (ECM) is a fast, sub-exponential running time algorithm for integer factorization which employs elliptic curves. For general purpose factoring, ECM is the third-fastest known factoring method. The second fastest is the multiple polynomial quadratic sieve and the fastest is the general number field sieve.
Practically speaking, ECM is considered a special purpose factoring algorithm as it is most suitable for finding small factors. Currently[update], it is still the best algorithm for divisors not greatly exceeding 20 to 25 digits (64 to 83 bits or so), as its running time is dominated by the size of the smallest factor p rather than by the size of the number n to be factored. Frequently, ECM is used to remove small factors from a very large integer with many factors; if the remaining integer is still composite, then it has only large factors and is factored using general purpose techniques. The largest factor found using ECM so far was discovered on August 24, 2006 by B. Dodson and has 67 digits [1]. Increasing the number of curves tested improves the chances of finding a factor, but they are not linear with the increase in the number of digits.
Contents |
[edit] Lenstra's elliptic curve factorization
The Lenstra elliptic curve factorization method to find a factor of the given number n works as follows:
- Pick a random elliptic curve over
, given by an equation of the form y2 = x3 + ax + b (mod n), and a non-trivial point P on it. Say, pick first a point P=(x,y) with random non-zero coordinates x,y (mod n), then pick a random non-zero a (mod n), then take b = y2 - x3 - ax (mod n). - We will compute certain multiples
(k times) of the point P using the standard addition rule
on our elliptic curve: given two points P,Q on the curve, their sum
can be computed using the formulas given in the group law section of the article on elliptic curves. These formulas contain the "slope" of the line connecting P and Q, hence involve divisions between residue classes modulo n, which can be performed using the extended Euclidean algorithm. In particular, division by some v (mod n) includes the calculation of the greatest common divisor gcd(v,n).
- If the slope is of the form u/v with gcd(u,n)=1, then v=0 (mod n) means that the result of the
-addition will be
, the point at infinity on the curve. However, if gcd(v,n) is neither 1 nor n, then the
-addition will not produce a meaningful point on the curve, which shows that our elliptic curve is not a group (mod n), but, more importantly for now, gcd(v,n) is a non-trivial factor of n.
- If the slope is of the form u/v with gcd(u,n)=1, then v=0 (mod n) means that the result of the
- Compute eP on the elliptic curve (mod n), where e is product of many small numbers: say, a product of small primes raised to small powers, as in the p − 1 algorithm, or the factorial B! for some not too large B. This can be done efficiently, one small factor at a time. Say, to get B!P, first compute 2P, then 3(2P), then 4(3!P), and so on. Of course, B should be small enough so that B-wise
-addition can be performed in reasonable time. -
- If we were able to finish all the calculations above without encountering non-invertible elements (mod n), then we need to try again with some other curve and starting point.
- If we found
at some stage (the point at infinity on the elliptic curve), then again we should start over with a new curve and starting point, since
is the zero element for
, so after this point we would be just repeating
. - If we encountered a gcd(v,n) at some stage that was neither 1 nor n, then we are done: it is a non-trivial factor of n.
The time complexity depends on the size of the factor and can be represented by O(e(1 + o(1)) √(ln p ln ln p)), where p is the smallest factor of n.
[edit] Why does the algorithm work?
If p and q are two prime divisors of n, then y2 = x3 + ax + b (mod n) implies the same equation also (mod p) and (mod q). These two smaller elliptic curves with the
-addition are now genuine groups. If these groups have Np and Nq elements, respectively, then for any point P on the original curve, by Lagrange's theorem,
on the curve (mod p) implies that k divides Np; moreover,
. The analogous statement holds for q. When the elliptic curve is chosen randomly, then Np and Nq are random numbers close to p+1 and q+1, respectively (see below). Hence it is unlikely that most of the prime factors of Np and Nq are the same, and it is quite likely that while computing eP, we will encounter some kP that is
(mod p) but not (mod q), or vice versa. When this is the case, kP does not exist on the original curve, and in the computations we found some v with either gcd(v,p)=p or gcd(v,q)=q, but not both. That is, gcd(v,n) gave a non-trivial factor of n.
ECM is at its core an improvement of the older p − 1 algorithm. The p − 1 algorithm finds prime factors p such that p − 1 is b-powersmooth for small values of b. For any e, a multiple of p − 1, and any a relatively prime to p, by Fermat's little theorem we have ae ≡ 1 (mod p). Then gcd(ae − 1, n) is likely to produce a factor of n. However, the algorithm fails when p-1 has large prime factors, as is the case for numbers containing strong primes, for example.
ECM gets around this obstacle by considering the group of a random elliptic curve over the finite field Zp, rather than considering the multiplicative group of Zp which always has order p − 1.
The order of the group of an elliptic curve over Zp varies (randomly) between p + 1 − 2√p and p + 1 + 2√p by Hasse's theorem, and is likely to be smooth for some elliptic curves. Although there is no proof that a smooth group order will be found in the Hasse-interval, by using heuristic probabilistic methods, the Canfield-Erdős-Pomerance theorem with suitably optimized parameter choices, and the L-notation, we can expect to try L[√2 / 2, √2] curves before getting a smooth group order. This heuristic estimate is very reliable in practice.
[edit] An example
The following example is from Trappe & Washington (2006), with some details added. This book contains a nice introduction into the cryptographic uses of elliptic curves.
We want to factor n=455839. Let's choose the elliptic curve y2 = x3 + 5x - 5, with the point P=(1,1) on it, and let's try to compute (10!)P.
First we compute 2P. The slope of the tangent line at P is s=(3x2+5)/(2y)=4, and then the coordinates of 2P=(x′,y′) are x′=s2-2x=14 and y′=s(x-x′)-y=4(1-14)-1=-53, all numbers understood (mod n). Just to check that this 2P is indeed on the curve: (-53)2=2809=143+5·14-5.
Then we compute 3(2P). The slope of the tangent line at 2P is s=(3·142+5)/(2(-53))=-593/106 (mod n). Using the Euclidean algorithm: 455839=4300·106+39, then 106=2·39+28, then 39=28+11, then 28=2·11+6, then 11=6+5, then 6=5+1. Hence gcd(455839,106)=1, and working backwards (a version of the extended Euclidean algorithm): 1=6-5=2·6-11=2·28-5·11=7·28-5·39=7·106-19·39=81707·106-19·455839. Hence 106-1=81707 (mod 455839), and -593/106=322522 (mod 455839). Given this s, we can compute the coordinates of 2(2P), just as we did above: 4P=(259851,116255). Just to check that this is indeed a point on the curve: y2=54514=x3+5x-5 (mod 455839). After this, we can compute
.
We can similarly compute 4!P, and so on, but 8!P requires inverting 599 (mod 455839). The Euclidean algorithm gives that 455839 is divisible by 599, and we have found a factorization 455839=599·761.
The reason that this worked is that the curve (mod 599) has 640=27·5 points, while (mod 761) it has 777=3·7·37 points. Moreover, 640 and 777 are the smallest positive integers k such that
on the curve (mod 599) and (mod 761), respectively. Since 8! is a multiple of 640 but not a multiple of 777, we have
on the curve (mod 599), but not on the curve (mod 761), hence the repeated addition broke down here, yielding the factorization.
[edit] See also
- UBASIC for practical program (ECMX).
[edit] External links
- Factoring applet that uses ECM
- GMP-ECM, an efficient implementation of ECM.
- ECMNet, an easy client-server implementation that works with several factorization projects.
- pyecm, a python implementation of ECM. Much faster with psyco and/or gmpy.
[edit] References
- Brent, Richard P. "Factorization of the tenth Fermat number." Mathematics of Computation 68 (1999), 429-451.
- Cohen, Henri. A Course in Computational Algebraic Number Theory. Springer-Verlag, New York, Berlin, Heidelberg, 1996. ISBN 0-387-55640-0.
- Lenstra, A. K. and Lenstra Jr., H. W. (editors), The development of the number field sieve. Lecture Notes in Mathematics 1554, Springer-Verlag, Berlin, 1993. MR 96m:11116.
- Lenstra Jr., H. W. "Factoring integers with elliptic curves." Annals of Mathematics (2) 126 (1987), 649-673. MR 89g:11125.
- Pomerance, Carl; Richard Crandall (2001). "Section 7.4: Elliptic curve method". Prime Numbers: A Computational Perspective (1st ed.). Springer. pp. 301–313. ISBN 0-387-94777-9.
- Pomerance, Carl. "The quadratic sieve factoring algorithm." Advances in Cryptology, Proc. Eurocrypt '84, Lecture Notes in Computer Science, Vol. 209, Springer-Verlag, Berlin, 1985, 169-182. MR 87d:11098.
- Pomerance, Carl (1996). "A Tale of Two Sieves" (PDF). Notices of the American Mathematical Society 43 (12): 1473–1485. http://www.ams.org/notices/199612/pomerance.pdf.
- Silverman, Robert D. "The Multiple Polynomial Quadratic Sieve." Mathematics of Computation 48 (1987), 329-339.
- Trappe, W.; Washington, L. C. (2006), Introduction to Cryptography with Coding Theory. Second edition, Pearson Prentice Hall, ISBN 0-13-186239-1
- Watras, Marcin. Cryptography, Number Analysis, and Very Large Numbers. Wojciechowski-Steinhagen, Bydgoszcz, 2008. PL:5324564.
|
|||||||||||||||||

