AES Ciphers: speed in no-feedback mode

Cross-table: speed in no-feedback mode

The following table collects data from implementations of AES candidates in no-feedback (NFB) mode, where more than one block is encrypted in parallel, using either the multimedia extensions of contemporary processors or pipelining in hardware. It includes only those rows of the more general table in which at least one entry was measured in NFB mode. NFB implementations are marked with a comment such as 2-nfb, where 2 is the number of blocks encrypted in parallel. Where the corresponding data is available, the gain over the normal implementation by the same implementer is also given.

Note that NFB is not usable in many block cipher modes, such as CBC encryption, where each block depends on the previous ciphertext block. On the other hand, it is usable in ECB and counter modes, and many expect counter mode to become a new mandatory standard mode.
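The difference between the modes can be seen in a minimal sketch. The block function below is a hypothetical stand-in (a truncated hash, neither secure nor invertible) and is not any of the AES candidates; the names `toy_block_encrypt`, `cbc_encrypt` and `ctr_encrypt` are illustrative assumptions. The point is only the data flow: CBC encryption forms a chain of calls, while counter mode blocks are independent and can be computed in any order or several at a time.

```python
import hashlib

BLOCK = 16  # bytes; the AES candidates use 128-bit blocks

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    # Hypothetical stand-in for a block cipher: a truncated hash, NOT secure
    # and not invertible. It only models "one expensive call per block".
    return hashlib.sha256(key + block).digest()[:BLOCK]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(key, iv, blocks):
    # Each call needs the previous ciphertext block, so the calls form a chain.
    out, prev = [], iv
    for p in blocks:
        prev = toy_block_encrypt(key, xor(p, prev))
        out.append(prev)
    return out

def ctr_encrypt(key, nonce, blocks, order=None):
    # Block i depends only on (nonce, i), so blocks may be processed in any
    # order, or several at a time as an n-nfb implementation would.
    order = range(len(blocks)) if order is None else order
    out = [None] * len(blocks)
    for i in order:
        counter = nonce + i.to_bytes(BLOCK - len(nonce), "big")
        out[i] = xor(blocks[i], toy_block_encrypt(key, counter))
    return out

key, nonce, iv = b"k" * 16, b"n" * 8, bytes(BLOCK)
blocks = [bytes([i]) * BLOCK for i in range(4)]
in_order = ctr_encrypt(key, nonce, blocks)
reversed_order = ctr_encrypt(key, nonce, blocks, order=[3, 2, 1, 0])
print(in_order == reversed_order)
```

Processing the counter blocks in reverse order gives the same ciphertext, which is the property an NFB implementation exploits. In `cbc_encrypt` no such reordering is possible, because the input to each call is not known until the previous call has finished.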
Machine/compiler  Mars  RC6  Rijndael  Serpent  Twofish  Availability
32-bit software
Pentium Pro/II/III assembly (cycles) 306 (Lipmaa) 223 (Aoki, Lipmaa) 232 (Lipmaa) 569 (Gladman, PIII)
2-nfb, gain: 1.36
258 (self-modifying)
277 (Aoki, Lipmaa)
IA-64 assembly (cycles) 511 (Worley etc) 199 (Eric Young)
8-nfb, gain: 2.37
124 (Worley etc) 419 (Worley, p.c.) 182 (Worley etc)
Alpha 21264 assembly (cycles) 375 (Weiss etc)
2-nfb, gain: 1.87
210 (Weiss etc)
2-nfb, gain: 2.72
210 (Weiss etc)
2-nfb, gain: 2.09
506 (Weiss etc)
2-nfb, gain: 1.94
255 (Weiss etc)
2-nfb, gain: 1.96
FPGA
Xilinx Virtex XCV 1000
Mbit/s (CLB Slices)
Gaj etc
13100 (46900)
gain: 91.8 (41.2x CLBs)
12200 (~24600)
gain: 29.5 (~9.81x CLBs)
16800 (19700)
gain: 38.9 (4.37x CLBs)
15200 (21000)
gain: 85.7 (19.5x CLBs)
Xilinx Virtex XCV1000BG560-4 FPGA
Mbit/s (cycles per block/MHz/CLB Slices)
Elbirt etc
2397.9 (2/37.5/10856)
gain: 19.0
1937.9 (2.1/31.8/10992)
gain: 6.5
4860.2 (1/38.0/9004)
gain: 10.9
1585.3 (2/24.8/9345)
gain: 12.4
D'Crypt implementation
(Altera APEX 20KE core)
>2500
Virtex E
cycles per block/MHz/CLB Slices/BlockRAMS [Mbps]
?/?/800/10 [1750] FWeaver
Spartan II-100
cycles per block/MHz/CLB Slices/BlockRAMS [Mbps]
?/?/800/10 [1300] FWeaver
Virtex E
cycles per block/MHz/CLB Slices/BlockRAMS [Mbps]
?/?/460/10 [700] FWeaver
Spartan II-100
cycles per block/MHz/CLB Slices/BlockRAMS [Mbps]
?/?/460/10 [500] FWeaver
Hardware
NSA Hardware Test
Mbit/s (area µm², transistor count)
2189.16 (1,332,658,283; 20,661,116)
gain: 38.6
2196.67 (453,248,223; 11,611,314)
gain: 21.4
5745.06 (419,886,025; 4,292,749)
gain: 9.5
8030.11 (438,561,500; 5,741,469)
gain: 39.7
2273.53 (225,298,323; 3,783,973)
gain: 21.6

Note: Weaver's implementations have no key setup time.
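
The Mbit/s figures in the Xilinx Virtex XCV1000BG560-4 rows can be cross-checked against the listed cycles per block and clock frequency. The sketch below assumes that throughput is the 128-bit block size times the clock rate divided by the cycles per block; the table does not state this formula, and the function name `throughput_mbps` is illustrative. The listed MHz values appear to be rounded, so small differences are expected.

```python
BLOCK_BITS = 128  # block size of the AES candidates

def throughput_mbps(cycles_per_block: float, clock_mhz: float,
                    block_bits: int = BLOCK_BITS) -> float:
    # Assumed relation: bits per block * blocks per microsecond = Mbit/s.
    return block_bits * clock_mhz / cycles_per_block

# (cycles per block, MHz, listed Mbit/s), copied from the Elbirt etc rows.
rows = [
    (2, 37.5, 2397.9),
    (2.1, 31.8, 1937.9),
    (1, 38.0, 4860.2),
    (2, 24.8, 1585.3),
]

for cycles, mhz, listed in rows:
    estimate = throughput_mbps(cycles, mhz)
    print(f"{cycles} cycles at {mhz} MHz: {estimate:.1f} Mbit/s "
          f"(listed {listed})")
```

Under this assumption each estimate agrees with the listed figure to within a fraction of a percent, which suggests the listed throughputs were derived in the same way from unrounded clock rates.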


Maintained by Helger Lipmaa. Don't hesitate to email me if you have any corrections/additions/comments.