From a cryptographic point of view, there seems to be only narrow scope for using MMX technology to speed up existing cryptographic primitives. The reasons are manifold. Public-key cryptography (RSA, for example) relies on long multiplication, not on the parallel execution of many short multiplications. Most block ciphers in general use are inherently non-parallelisable because of their use of S-boxes and/or table lookups.
Some primitives, though, are outstandingly well suited to MMX. One such primitive is IDEA, as the following table demonstrates. (The speeds are scaled for a hypothetical 3200 MHz machine and were last updated on Nov 13 2000 unless explicitly stated otherwise. NB: 1 MB/s = 1024² B/s.)
Block cipher | Block size (bits) | Cycles | MBytes/s | Author | Processor | Notes |
---|---|---|---|---|---|---|
Square | 128 | 192 | 254.4 | Lipmaa | Pentium II | |
RC6 | 128 | 219 | 222.8 | Lipmaa | Pentium II/III | |
4-way IDEA | 4x64 | 440 | 222.0 | Lipmaa | Pentium III | |
Rijndael | 128 | 226 | 216.0 | Lipmaa | Pentium II/III | |
Square | 128 | 244 | 200.0 | Bosselaers | Pentium | |
4-way IDEA | 4x64 | 543 | 180.0 | Lipmaa | Pentium MMX | |
SC2000 | 128 | 270 | 180.8 | Lipmaa | Pentium II/III, gcc (no asm) | New, 04.04.2002 |
4-way IDEA | 4x64 | 554 | 176.4 | Lipmaa | AMD Athlon | New, 01.10.2003 |
Twofish | 128 | 277 | 176.4 | Aoki, Lipmaa | Pentium II/III | |
Rijndael | 128 | 300 | 162.8 | Gladman | Pentium III | New, 15.10.2001 |
Camellia | 128 | 302 | 161.6 | Aoki | Pentium II/III | |
MARS | 128 | 306 | 160.0 | Lipmaa | Pentium II/III | |
Blowfish | 64 | 158 | 154.4 | Bosselaers | Pentium | |
RC5-32/16 | 64 | 199 | 122.8 | Bosselaers | Pentium | |
CAST5 | 64 | 220 | 110.8 | Bosselaers | Pentium | |
DES | 64 | 340 | 72.0 | Bosselaers | Pentium | |
IDEA | 64 | 358 | 68.0 | Lipmaa | Pentium MMX | |
SAFER (S)K-128 | 64 | 418 | 58.4 | Bosselaers | Pentium | |
Shark | 64 | 585 | 41.6 | Bosselaers | Pentium | |
IDEA | 64 | 590 | 41.2 | Bosselaers | Pentium | |
3DES | 64 | 928 | 26.4 | Bosselaers | Pentium | |
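
The MBytes/s column can be recomputed from the cycle counts and the scaling stated before the table. The sketch below is only a rough check, under these assumptions: a 3200 MHz clock, 1 MB = 1024² bytes, and a Cycles figure that covers one block (or four blocks for the 4-way IDEA rows). The function name `mbytes_per_second` is illustrative and not part of any library.

```python
CLOCK_HZ = 3200 * 10**6   # hypothetical 3200 MHz machine used for scaling
MB = 1024 ** 2            # 1 MB = 1024^2 bytes, as stated in the text

def mbytes_per_second(block_bits, cycles, blocks_in_parallel=1):
    """Throughput in MB/s for a cipher that needs `cycles` clock cycles
    to process `blocks_in_parallel` blocks of `block_bits` bits each."""
    bytes_per_call = (block_bits // 8) * blocks_in_parallel
    return CLOCK_HZ / cycles * bytes_per_call / MB

# Square on the Pentium II: 128-bit block, 192 cycles
print(round(mbytes_per_second(128, 192), 1))
# 4-way IDEA on the Pentium III: four 64-bit blocks, 440 cycles
print(round(mbytes_per_second(64, 440, blocks_in_parallel=4), 1))
```

The computed values come out within a fraction of a percent of the tabulated ones; the small differences are presumably due to rounding in the original measurements.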
Compared to the leading AES candidates, 4-way IDEA is only a little slower than RC6 and Rijndael on the Pentium II, but faster than Twofish and MARS. On the Pentium III, 4-way IDEA is even faster than RC6 and Rijndael. If one prefers a block cipher with time-proven security margins, IDEA is definitely preferable to the AES algorithms.
However, in light of the ongoing AES process and the amount of cryptanalysis applied to the leading AES candidates, especially to the winner, it might very soon become desirable to switch over to the proposed AES, Rijndael.
To further simplify usage, the CBC1 and XORC1 modes (corresponding to the usual CBC and counter modes) have been implemented. In these modes, FastIDEA can be seen as a drop-in replacement for the popular OpenSSL library. Since CBC1 decryption and XORC1 encryption (and decryption, which is identical to encryption) use 4-way IDEA internally, the library achieves almost the same speed in these modes as in the low-level CBC4 and XORC4 modes: about 235-260 Mbit/s. CBC1 encryption is not parallelizable and therefore does not reach this speed.
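
Why CBC decryption can use a 4-way cipher while CBC encryption cannot is easiest to see in code. The following is a minimal sketch of ordinary CBC, not the FastIDEA code: `toy_encrypt` and `toy_decrypt` are a deliberately trivial, insecure stand-in for a 64-bit block cipher, and all function names are hypothetical.

```python
MASK = (1 << 64) - 1  # work on 64-bit blocks, like IDEA

def rotl(x, r):
    return ((x << r) | (x >> (64 - r))) & MASK

def rotr(x, r):
    return ((x >> r) | (x << (64 - r))) & MASK

def toy_encrypt(block, key):
    # Insecure placeholder permutation; stands in for a real block cipher.
    return rotl((block + key) & MASK, 13) ^ key

def toy_decrypt(block, key):
    return (rotr(block ^ key, 13) - key) & MASK

def cbc_encrypt(blocks, key, iv):
    # Each ciphertext block feeds into the next encryption,
    # so the blocks must be processed strictly one after another.
    out, prev = [], iv
    for p in blocks:
        prev = toy_encrypt(p ^ prev, key)
        out.append(prev)
    return out

def cbc_decrypt_4way(blocks, key, iv):
    # All chaining values are already known ciphertext blocks, so the
    # block decryptions are independent and can be done four at a time.
    prev_blocks = [iv] + blocks[:-1]
    out = []
    for i in range(0, len(blocks), 4):
        batch = blocks[i:i + 4]
        decrypted = [toy_decrypt(c, key) for c in batch]  # independent calls
        out.extend(d ^ p for d, p in zip(decrypted, prev_blocks[i:i + 4]))
    return out
```

In the decryption loop, the four `toy_decrypt` calls in a batch do not depend on each other, which is the property a 4-way implementation exploits. Counter mode has the same property in both directions, since each keystream block depends only on the counter value.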
If you want to learn more about my implementations, don't hesitate to write to lipmaa(at)cyber.ee. Please specify your interest in the mail! :-)
NEW: I have also finished fast implementations of MARS and Rijndael, two of the AES finalist ciphers. Information about them can be obtained from the AES Speed page or by sending an email to lipmaa(at)cyber.ee.
Interestingly, the fastest available IDEA hardware coprocessor runs at 40 MHz and achieves about 300 Mbit/s in the 3-way IDEA mode and about 100 Mbit/s in the standard IDEA mode. (See links.) On a 500 MHz Pentium III, my 4-way IDEA implementation achieves 290 Mbit/s, and already on a 550 MHz Pentium III my software implementation is faster than the Ascom hardware implementation. An 866 MHz Pentium III achieves 500 Mbit/s, which is in the same range as the hardware implementations of the AES candidates.
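
These Mbit/s figures are consistent with the Pentium III entry in the table above (440 cycles for four 64-bit blocks). The short calculation below is only a sanity check under two assumptions: the cycle count stays the same across clock speeds, and 1 Mbit = 10^6 bits. The function name `mbit_per_second` is illustrative.

```python
def mbit_per_second(clock_mhz, cycles=440, bits=4 * 64):
    """Throughput in Mbit/s (1 Mbit = 10^6 bits, an assumption) for a
    cipher that processes `bits` bits in `cycles` clock cycles."""
    return clock_mhz * 10**6 / cycles * bits / 10**6

for mhz in (500, 550, 866):
    print(mhz, "MHz:", round(mbit_per_second(mhz)), "Mbit/s")
```

At 550 MHz the estimate passes the roughly 300 Mbit/s quoted for the hardware coprocessor, in line with the statement above.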