Rieks Joosten (private communication) has estimated the hardware implementation speed in a 0.35 micron technology, based on an actual Safer implementation.
I have used the compiler optimisation switches for the Pentium rather than the Pentium Pro, but the source code itself has been optimised for the latter, so the results are a long way below what is possible on the Pentium with proper optimisation for that processor. Note that these numbers were measured using his early variants of the code.
You can put the Merced/McKinley numbers up, but please add some caveats to them, as follows. The Merced/McKinley ratios versus Pentium II are still preliminary -- it's possible that there will be some changes in them, although we believe that they are reasonably close. Thus, most ratios intentionally have only one significant digit, with a few minuses indicating that the rounding was "close".
We have simulated the RC6 code fully and know that its "slow" numbers are for real. We will finish simulating the assembly code for the other algorithms in the near future.[...]
Here are the latest results, with an average of encryption and decryption times used for the ratios (Serpent and RC6 are slightly asymmetric):

            P2    PA-RISC   Merced/P2   McKinley/P2
RC6         250   547       2.5         2.1
Rijndael    283   186       0.6         0.5-
Serpent     900   610       0.8         0.8
Twofish     258   222       0.9-        0.7
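
As an illustration of how a ratio of this kind can be formed, here is a minimal Python sketch. It is not the code used to produce the figures quoted here. The function names and the sample timings are hypothetical, and it assumes that each ratio is the mean of the encryption and decryption times on one processor divided by the same mean on the Pentium II, rounded to a chosen number of significant digits.

```python
from math import floor, log10


def mean_time(encrypt, decrypt):
    """Average of the encryption and decryption times (same units)."""
    return (encrypt + decrypt) / 2.0


def round_sig(value, digits=1):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    exponent = floor(log10(abs(value)))
    return round(value, digits - 1 - exponent)


def time_ratio(target_enc, target_dec, base_enc, base_dec, digits=1):
    """Ratio of mean times, target processor over baseline processor."""
    raw = mean_time(target_enc, target_dec) / mean_time(base_enc, base_dec)
    return round_sig(raw, digits)


if __name__ == "__main__":
    # Hypothetical timings, chosen only to exercise the functions.
    print(time_ratio(150, 170, 270, 290))
```

Under this assumption a ratio above 1 means the target processor is slower than the Pentium II for that algorithm, and a ratio below 1 means it is faster.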