Performance
-----------

.. currentmodule:: numpy.random

Recommendation
**************
The recommended generator for general use is `PCG64`. It is
statistically high quality, full-featured, and fast on most platforms, but
somewhat slow when compiled for 32-bit processes.

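As a minimal sketch of this recommendation, a `Generator` can be built on an
explicitly chosen `PCG64` bit generator. The seed ``12345`` is an arbitrary
value picked for illustration.

.. code-block:: python

    from numpy.random import Generator, PCG64

    # Build a Generator on top of an explicitly seeded PCG64 bit generator.
    rng = Generator(PCG64(12345))

    uniforms = rng.random(5)           # 5 floats in [0, 1)
    normals = rng.standard_normal(5)   # 5 standard normal draws
    integers = rng.integers(0, 10, 5)  # 5 integers in [0, 10)
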
`Philox` is fairly slow, but its statistical properties are of
very high quality, and it is easy to get assuredly independent streams by using
unique keys. If that is the style you wish to use for parallel streams, or you
are porting from another system that uses that style, then
`Philox` is your choice.

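The keyed style can be sketched as follows, assuming the ``key`` argument of
`Philox` is used to give each worker its own stream. The name ``base_key``
and its value are hypothetical; any set of distinct keys would serve.

.. code-block:: python

    from numpy.random import Generator, Philox

    # Hypothetical base key; any distinct integers in [0, 2**128) would do.
    base_key = 0xC0FFEE

    # One generator per worker, each with its own unique key.
    streams = [Generator(Philox(key=base_key + i)) for i in range(4)]
    draws = [rng.random(3) for rng in streams]

    # Rebuilding a generator from the same key replays the same stream.
    replay = Generator(Philox(key=base_key)).random(3)

Because the key is set directly in the state, recording the key used by each
worker should be enough to rerun that worker's stream later.
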
`SFC64` is statistically high quality and very fast. However, it
lacks jumpability. If you are not using that capability and want lots of speed,
even on 32-bit processes, this is your choice.

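To make "jumpability" concrete, the sketch below uses the ``jumped`` method of
`PCG64` to derive a second, far-advanced bit generator, and checks whether
`SFC64` exposes a method of the same name. The seed is arbitrary, and the
``hasattr`` check is only an illustrative probe, not an official capability
query.

.. code-block:: python

    from numpy.random import Generator, PCG64, SFC64

    # jumped() returns a new bit generator whose state is advanced as if a
    # very large number of draws had already been made.
    parent = PCG64(12345)
    child = parent.jumped()

    a = Generator(parent).random(3)
    b = Generator(child).random(3)

    # SFC64 is expected to offer no equivalent method.
    sfc_is_jumpable = hasattr(SFC64(12345), "jumped")
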
`MT19937` `fails some statistical tests`_ and is not especially
fast compared to modern PRNGs. For these reasons, we mostly do not recommend
using it on its own, only through the legacy `~.RandomState` for
reproducing old results. That said, it has a very long history as a default in
many systems.

.. _`fails some statistical tests`: https://www.iro.umontreal.ca/~lecuyer/myftp/papers/testu01.pdf

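A short sketch of the legacy use case: re-creating a `RandomState` with the
same seed replays the same values. The seed ``12345`` is an arbitrary example.

.. code-block:: python

    from numpy.random import RandomState

    # Legacy interface backed by MT19937; the seed is arbitrary.
    legacy = RandomState(12345)
    first = legacy.standard_normal(3)

    # Re-creating the object with the same seed replays the same stream.
    again = RandomState(12345).standard_normal(3)
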
Timings
*******

The timings below are the time in ns to produce 1 random value from a
specific distribution. The original `MT19937` generator is
much slower since it requires two 32-bit values to equal the output of the
faster generators.

Integer performance has a similar ordering.

The pattern is similar for other, more complex generators. The normal
performance of the legacy `RandomState` generator is much
lower than the others since it uses the Box-Muller transformation rather
than the Ziggurat method. The performance gap for Exponentials is also
large due to the cost of computing the log function to invert the CDF.
The column labeled MT19937 uses the same 32-bit generator as
`RandomState` but produces random values using
`Generator`.

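A rough way to take comparable measurements on your own machine is sketched
below. This is not the benchmark script behind the published numbers; the
helper ``ns_per_value``, the sample size, and the seed are all assumptions
made for illustration.

.. code-block:: python

    import timeit

    from numpy.random import (
        Generator, RandomState, MT19937, PCG64, Philox, SFC64,
    )

    def ns_per_value(draw, size=100_000, repeat=3):
        """Best-of-``repeat`` time, in ns, to produce one value with ``draw``."""
        best = min(timeit.repeat(lambda: draw(size), number=1, repeat=repeat))
        return best / size * 1e9

    timings = {}
    for bit_generator in (MT19937, PCG64, Philox, SFC64):
        rng = Generator(bit_generator(12345))
        timings[bit_generator.__name__] = ns_per_value(rng.standard_normal)

    # Legacy path: the same MT19937 bit generator driven through RandomState.
    legacy = RandomState(MT19937(12345))
    timings["RandomState"] = ns_per_value(legacy.standard_normal)

    for name, ns in timings.items():
        print(f"{name:12s} {ns:6.1f} ns per normal")

Absolute numbers will vary with hardware, compiler, and NumPy version, so only
the ordering is likely to be comparable with the table.
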
.. csv-table::
    :header: ,MT19937,PCG64,Philox,SFC64,RandomState
    :widths: 14,14,14,14,14,14

    32-bit Unsigned Ints,3.2,2.7,4.9,2.7,3.2
    64-bit Unsigned Ints,5.6,3.7,6.3,2.9,5.7
    Uniforms,7.3,4.1,8.1,3.1,7.3
    Normals,13.1,10.2,13.5,7.8,34.6
    Exponentials,7.9,5.4,8.5,4.1,40.3
    Gammas,34.8,28.0,34.7,25.1,58.1
    Binomials,25.0,21.4,26.1,19.5,25.2
    Laplaces,45.1,40.7,45.5,38.1,45.6
    Poissons,67.6,52.4,69.2,46.4,78.1

The next table presents the performance as a percentage relative to values
generated by the legacy generator, ``RandomState(MT19937())``. The overall
performance was computed using a geometric mean.

.. csv-table::
    :header: ,MT19937,PCG64,Philox,SFC64
    :widths: 14,14,14,14,14

    32-bit Unsigned Ints,101,121,67,121
    64-bit Unsigned Ints,102,156,91,199
    Uniforms,100,179,90,235
    Normals,263,338,257,443
    Exponentials,507,752,474,985
    Gammas,167,207,167,231
    Binomials,101,118,96,129
    Laplaces,101,112,100,120
    Poissons,116,149,113,168
    Overall,144,192,132,225

.. note::

   All timings were taken using Linux on an i5-3570 processor.

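The "Overall" row of the relative-performance table can be checked by hand.
The sketch below takes the PCG64 percentages from that table and computes
their geometric mean; the helper ``geometric_mean`` is written here for
illustration only.

.. code-block:: python

    import math

    # PCG64 column of the relative-performance table (percent of RandomState).
    pcg64_percent = [121, 156, 179, 338, 752, 207, 118, 112, 149]

    def geometric_mean(values):
        """Geometric mean, computed via the mean of the logarithms."""
        return math.exp(sum(math.log(v) for v in values) / len(values))

    overall = geometric_mean(pcg64_percent)
    print(round(overall))  # 192, matching the Overall entry for PCG64

A geometric mean suits ratios like these because a single very large entry,
such as Exponentials, shifts it far less than it would shift an arithmetic
mean.
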
Performance on different Operating Systems
******************************************
Performance differs across platforms because of differences in compilers and
hardware (e.g., register width). The default bit generator has been chosen
to perform well on 64-bit platforms. Performance on 32-bit operating systems
is very different.

The values reported are normalized relative to the speed of MT19937 in
each table. A value of 100 indicates that the performance matches that of
MT19937. Higher values indicate improved performance. These values cannot be
compared across tables.

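One plausible reading of this normalization, assuming it is a simple ratio of
per-value times against the MT19937 baseline, is sketched below. The timings
are hypothetical round numbers chosen to make the arithmetic easy to follow;
they are not measurements.

.. code-block:: python

    # Hypothetical timings in ns per value (lower is faster).
    timings_ns = {"MT19937": 8.0, "PCG64": 5.0, "Philox": 10.0, "SFC64": 4.0}

    # Speed relative to MT19937, in percent: less time means a higher score.
    baseline = timings_ns["MT19937"]
    relative = {name: 100 * baseline / ns for name, ns in timings_ns.items()}
    # {'MT19937': 100.0, 'PCG64': 160.0, 'Philox': 80.0, 'SFC64': 200.0}

Because each table has its own baseline measurement, a score of 160 in one
table and 160 in another need not correspond to the same absolute speed.
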
64-bit Linux
~~~~~~~~~~~~

=================== ========= ======= ======== =======
Distribution        MT19937   PCG64   Philox   SFC64
=================== ========= ======= ======== =======
32-bit Unsigned Int 100       119.8   67.7     120.2
64-bit Unsigned Int 100       152.9   90.8     213.3
Uniforms            100       179.0   87.0     232.0
Normals             100       128.5   99.2     167.8
Exponentials        100       148.3   93.0     189.3
**Overall**         100       144.3   86.8     180.0
=================== ========= ======= ======== =======


64-bit Windows
~~~~~~~~~~~~~~
The relative performance on 64-bit Linux and 64-bit Windows is broadly similar.


=================== ========= ======= ======== =======
Distribution        MT19937   PCG64   Philox   SFC64
=================== ========= ======= ======== =======
32-bit Unsigned Int 100       129.1   35.0     135.0
64-bit Unsigned Int 100       146.9   35.7     176.5
Uniforms            100       165.0   37.0     192.0
Normals             100       128.5   48.5     158.0
Exponentials        100       151.6   39.0     172.8
**Overall**         100       143.6   38.7     165.7
=================== ========= ======= ======== =======


32-bit Windows
~~~~~~~~~~~~~~

The performance of 64-bit generators on 32-bit Windows is much lower than on 64-bit
operating systems due to register width. MT19937, the generator that has been
in NumPy since 2005, operates on 32-bit integers.

=================== ========= ======= ======== =======
Distribution        MT19937   PCG64   Philox   SFC64
=================== ========= ======= ======== =======
32-bit Unsigned Int 100       30.5    21.1     77.9
64-bit Unsigned Int 100       26.3    19.2     97.0
Uniforms            100       28.0    23.0     106.0
Normals             100       40.1    31.3     112.6
Exponentials        100       33.7    26.3     109.8
**Overall**         100       31.4    23.8     99.8
=================== ========= ======= ======== =======


.. note::

   Linux timings used Ubuntu 18.04 and GCC 7.4. Windows timings were made on
   Windows 10 using Microsoft C/C++ Optimizing Compiler Version 19 (Visual
   Studio 2015). All timings were produced on an i5-3570 processor.