
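An Efficient Random Number Algorithm Implemented in Java

baukh789 / 338 reads

Abstract: After asking around online, I learned about an efficient random number algorithm, the Mersenne Twister. Some searching showed that it is a pseudorandom number generation algorithm that quickly produces high-quality pseudorandom numbers and fixes many defects of classical random number generators.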

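Preface

It all started when someone online shared an interesting interview question:

Generate IDs made up of six digits. The numbers must be random, with no deduplication pass, must not be auto-incrementing, and must not repeat. The total number of IDs is in the hundreds of thousands.

First attempt

The first approach I thought of was:

Generate a sufficiently large ID pool (in practice, generate exactly as many IDs as are needed)

Randomly shuffle the numbers in the ID pool

Consume the numbers in the ID pool one after another

Unfortunately, this method wastes a great deal of space and performs poorly.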
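
The three steps above can be sketched roughly as follows. This is only an illustration, not code from the original question or answer: the class name ShuffledIdPool, the method nextId, and the choice of 100000 to 999999 as the six-digit range are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the "ID pool" approach: fill, shuffle, consume.
public class ShuffledIdPool {
    private static final int FIRST_ID = 100_000;   // assumed smallest six-digit ID
    private static final int MAX_COUNT = 900_000;  // 100000..999999 inclusive

    private final List<Integer> pool;
    private int cursor = 0;

    // Step 1 and 2: generate `count` distinct IDs, then shuffle them.
    public ShuffledIdPool(int count, Random rng) {
        if (count < 1 || count > MAX_COUNT) {
            throw new IllegalArgumentException("count must be in 1.." + MAX_COUNT);
        }
        pool = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            pool.add(FIRST_ID + i);
        }
        Collections.shuffle(pool, rng);
    }

    // Step 3: hand out the shuffled IDs one by one.
    public int nextId() {
        if (cursor >= pool.size()) {
            throw new IllegalStateException("ID pool exhausted");
        }
        return pool.get(cursor++);
    }

    public static void main(String[] args) {
        ShuffledIdPool ids = new ShuffledIdPool(300_000, new Random());
        for (int i = 0; i < 5; i++) {
            System.out.println(ids.nextId());
        }
    }
}
```

The sketch makes the cost visible: every ID is materialised as a boxed Integer in memory before the first one is handed out, which is the space overhead complained about above.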

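First encounter with the Mersenne Twister

Later, after asking around online, I learned about an efficient random number algorithm: the Mersenne Twister (MT). From what I found by searching:

The Mersenne Twister is a pseudorandom number generation algorithm developed in 1997 by Makoto Matsumoto and Takuji Nishimura, based on a matrix linear recurrence over a finite binary field. It quickly produces high-quality pseudorandom numbers and fixes many defects of classical random number generators.

The most widely used variant of the Mersenne Twister is MT19937, which produces a sequence of 32-bit integers.

PS: this algorithm still cannot solve the interview question perfectly, but at least I learned something new.

MT19937 implementation

Later, via Google, I found an efficient Java version of MT19937. The original code is linked at http://www.math.sci.hiroshima...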
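
One likely reason a generator alone falls short, offered here as my reading rather than something stated in the original discussion: a pseudorandom generator does not promise distinct values, and once its output is squeezed into a six-digit range, repeats are common. The sketch below counts them. It uses java.util.Random as a stand-in so that it runs on its own; the class name SixDigitCollisionDemo and the 100000 to 999999 range are assumptions for the example.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical demo: draw six-digit IDs from a PRNG and count the repeats.
public class SixDigitCollisionDemo {

    // Returns how many of the `draws` generated IDs had already been seen.
    static int countDuplicates(Random rng, int draws) {
        Set<Integer> seen = new HashSet<>();
        int duplicates = 0;
        for (int i = 0; i < draws; i++) {
            int id = 100_000 + rng.nextInt(900_000); // assumed six-digit range
            if (!seen.add(id)) {
                duplicates++;
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        int draws = 300_000;
        int duplicates = countDuplicates(new Random(), draws);
        System.out.println("draws:" + draws + " duplicates:" + duplicates);
    }
}
```

Any Random subclass, including the MTRandom class in this article, can be passed in place of java.util.Random. A better generator improves the distribution of the values, but it does not remove the need for some form of uniqueness guarantee.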

import java.util.Random;

/**
 * Java implementation of MT19937
 */
public class MTRandom extends Random {
    
    // Constants used in the original C implementation
    private final static int UPPER_MASK = 0x80000000;
    private final static int LOWER_MASK = 0x7fffffff;

    private final static int N = 624;
    private final static int M = 397;
    private final static int MAGIC[] = { 0x0, 0x9908b0df };
    private final static int MAGIC_FACTOR1 = 1812433253;
    private final static int MAGIC_FACTOR2 = 1664525;
    private final static int MAGIC_FACTOR3 = 1566083941;
    private final static int MAGIC_MASK1   = 0x9d2c5680;
    private final static int MAGIC_MASK2   = 0xefc60000;
    private final static int MAGIC_SEED    = 19650218;
    private final static long DEFAULT_SEED = 5489L;

    // Internal state
    private transient int[] mt;
    private transient int mti;
    private transient boolean compat = false;

    // Temporary buffer used during setSeed(long)
    private transient int[] ibuf;

    /**
     * The default constructor for an instance of MTRandom.  This invokes
     * the no-argument constructor for java.util.Random which will result
     * in the class being initialised with a seed value obtained by calling
     * System.currentTimeMillis().
     */
    public MTRandom() { }

    /**
     * This version of the constructor can be used to implement identical
     * behaviour to the original C code version of this algorithm including
     * exactly replicating the case where the seed value had not been set
     * prior to calling genrand_int32.
     * <p>
     * If the compatibility flag is set to true, then the algorithm will be
     * seeded with the same default value as was used in the original C
     * code.  Furthermore the setSeed() method, which must take a 64 bit
     * long value, will be limited to using only the lower 32 bits of the
     * seed to facilitate seamless migration of existing C code into Java
     * where identical behaviour is required.
     * <p>
     * Whilst useful for ensuring backwards compatibility, it is advised
     * that this feature not be used unless specifically required, due to
     * the reduction in strength of the seed value.
     *
     * @param compatible Compatibility flag for replicating original
     * behaviour.
     */
    public MTRandom(boolean compatible) {
        super(0L);
        compat = compatible;
        setSeed(compat ? DEFAULT_SEED : System.currentTimeMillis());
    }

    /**
     * This version of the constructor simply initialises the class with
     * the given 64 bit seed value.  For a better random number sequence
     * this seed value should contain as much entropy as possible.
     *
     * @param seed The seed value with which to initialise this class.
     */
    public MTRandom(long seed) {
        super(seed);
    }

    /**
     * This version of the constructor initialises the class with the
     * given byte array.  All the data will be used to initialise this
     * instance.
     *
     * @param buf The non-empty byte array of seed information.
     * @throws NullPointerException if the buffer is null.
     * @throws IllegalArgumentException if the buffer has zero length.
     */
    public MTRandom(byte[] buf) {
        super(0L);
        setSeed(buf);
    }

    /**
     * This version of the constructor initialises the class with the
     * given integer array.  All the data will be used to initialise
     * this instance.
     *
     * @param buf The non-empty integer array of seed information.
     * @throws NullPointerException if the buffer is null.
     * @throws IllegalArgumentException if the buffer has zero length.
     */
    public MTRandom(int[] buf) {
        super(0L);
        setSeed(buf);
    }

    // Initializes mt[N] with a simple integer seed. This method is
    // required as part of the Mersenne Twister algorithm but need
    // not be made public.
    private final void setSeed(int seed) {

        // Annoying runtime check for initialisation of internal data
        // caused by java.util.Random invoking setSeed() during init.
        // This is unavoidable because no fields in our instance will
        // have been initialised at this point, not even if the code
        // were placed at the declaration of the member variable.
        if (mt == null) mt = new int[N];

        // ---- Begin Mersenne Twister Algorithm ----
        mt[0] = seed;
        for (mti = 1; mti < N; mti++) {
            mt[mti] = (MAGIC_FACTOR1 * (mt[mti-1] ^ (mt[mti-1] >>> 30)) + mti);
        }
        // ---- End Mersenne Twister Algorithm ----
    }

    /**
     * This method resets the state of this instance using the 64
     * bits of seed data provided.  Note that if the same seed data
     * is passed to two different instances of MTRandom (both of
     * which share the same compatibility state) then the sequence
     * of numbers generated by both instances will be identical.
     * <p>
     * If this instance was initialised in "compatibility" mode then
     * this method will only use the lower 32 bits of any seed value
     * passed in and will match the behaviour of the original C code
     * exactly with respect to state initialisation.
     *
     * @param seed The 64 bit value used to initialise the random
     * number generator state.
     */
    public final synchronized void setSeed(long seed) {
        if (compat) {
            setSeed((int)seed);
        } else {

            // Annoying runtime check for initialisation of internal data
            // caused by java.util.Random invoking setSeed() during init.
            // This is unavoidable because no fields in our instance will
            // have been initialised at this point, not even if the code
            // were placed at the declaration of the member variable.
            if (ibuf == null) ibuf = new int[2];

            ibuf[0] = (int)seed;
            ibuf[1] = (int)(seed >>> 32);
            setSeed(ibuf);
        }
    }

    /**
     * This method resets the state of this instance using the byte
     * array of seed data provided.  Note that calling this method
     * is equivalent to calling "setSeed(pack(buf))" and in particular
     * will result in a new integer array being generated during the
     * call.  If you wish to retain this seed data to allow the pseudo
     * random sequence to be restarted then it would be more efficient
     * to use the "pack()" method to convert it into an integer array
     * first and then use that to re-seed the instance.  The behaviour
     * of the class will be the same in both cases but it will be more
     * efficient.
     *
     * @param buf The non-empty byte array of seed information.
     * @throws NullPointerException if the buffer is null.
     * @throws IllegalArgumentException if the buffer has zero length.
     */
    public final void setSeed(byte[] buf) {
        setSeed(pack(buf));
    }

    /**
     * This method resets the state of this instance using the integer
     * array of seed data provided.  This is the canonical way of
     * resetting the pseudo random number sequence.
     *
     * @param buf The non-empty integer array of seed information.
     * @throws NullPointerException if the buffer is null.
     * @throws IllegalArgumentException if the buffer has zero length.
     */
    public final synchronized void setSeed(int[] buf) {
        int length = buf.length;
        if (length == 0) throw new IllegalArgumentException("Seed buffer may not be empty");

        // ---- Begin Mersenne Twister Algorithm ----
        int i = 1, j = 0, k = (N > length ? N : length);
        setSeed(MAGIC_SEED);
        for (; k > 0; k--) {
            mt[i] = (mt[i] ^ ((mt[i-1] ^ (mt[i-1] >>> 30)) * MAGIC_FACTOR2)) + buf[j] + j;
            i++; j++;
            if (i >= N) { mt[0] = mt[N-1]; i = 1; }
            if (j >= length) j = 0;
        }
        for (k = N-1; k > 0; k--) {
            mt[i] = (mt[i] ^ ((mt[i-1] ^ (mt[i-1] >>> 30)) * MAGIC_FACTOR3)) - i;
            i++;
            if (i >= N) { mt[0] = mt[N-1]; i = 1; }
        }
        mt[0] = UPPER_MASK; // MSB is 1; assuring non-zero initial array
        // ---- End Mersenne Twister Algorithm ----
    }

    /**
     * This method forms the basis for generating a pseudo random number
     * sequence from this class.  If given a value of 32, this method
     * behaves identically to the genrand_int32 function in the original
     * C code and ensures that using the standard nextInt() function
     * (inherited from Random) we are able to replicate behaviour exactly.
     * <p>
     * Note that where the number of bits requested is not equal to 32
     * then bits will simply be masked out from the top of the returned
     * integer value.  That is to say that:
     * <pre>
     * mt.setSeed(12345);
     * int foo = mt.nextInt(16) + (mt.nextInt(16) << 16);</pre>
     * will not give the same result as
     * <pre>
     * mt.setSeed(12345);
     * int foo = mt.nextInt(32);</pre>
     *
     * @param bits The number of significant bits desired in the output.
     * @return The next value in the pseudo random sequence with the
     * specified number of bits in the lower part of the integer.
     */
    protected final synchronized int next(int bits) {
        // ---- Begin Mersenne Twister Algorithm ----
        int y, kk;
        if (mti >= N) {             // generate N words at one time

            // In the original C implementation, mti is checked here
            // to determine if initialisation has occurred; if not
            // it initialises this instance with DEFAULT_SEED (5489).
            // This is no longer necessary as initialisation of the
            // Java instance must result in initialisation occurring
            // Use the constructor MTRandom(true) to enable backwards
            // compatible behaviour.

            for (kk = 0; kk < N-M; kk++) {
                y = (mt[kk] & UPPER_MASK) | (mt[kk+1] & LOWER_MASK);
                mt[kk] = mt[kk+M] ^ (y >>> 1) ^ MAGIC[y & 0x1];
            }
            for (;kk < N-1; kk++) {
                y = (mt[kk] & UPPER_MASK) | (mt[kk+1] & LOWER_MASK);
                mt[kk] = mt[kk+(M-N)] ^ (y >>> 1) ^ MAGIC[y & 0x1];
            }
            y = (mt[N-1] & UPPER_MASK) | (mt[0] & LOWER_MASK);
            mt[N-1] = mt[M-1] ^ (y >>> 1) ^ MAGIC[y & 0x1];

            mti = 0;
        }

        y = mt[mti++];

        // Tempering
        y ^= (y >>> 11);
        y ^= (y << 7) & MAGIC_MASK1;
        y ^= (y << 15) & MAGIC_MASK2;
        y ^= (y >>> 18);
        // ---- End Mersenne Twister Algorithm ----
        return (y >>> (32-bits));
    }

    // This is a fairly obscure little code section to pack a
    // byte[] into an int[] in little endian ordering.

    /**
     * This simple utility method can be used in cases where a byte
     * array of seed data is to be used to repeatedly re-seed the
     * random number sequence.  By packing the byte array into an
     * integer array first, using this method, and then invoking
     * setSeed() with that; it removes the need to re-pack the byte
     * array each time setSeed() is called.
     * <p>
     * If the length of the byte array is not a multiple of 4 then
     * it is implicitly padded with zeros as necessary.  For example:
     * <pre>
     *     byte[] { 0x01, 0x02, 0x03, 0x04, 0x05, 0x06 }</pre>
     * becomes
     * <pre>
     *     int[]  { 0x04030201, 0x00000605 }</pre>
     * <p>
     * Note that this method will not complain if the given byte array
     * is empty and will produce an empty integer array, but the
     * setSeed() method will throw an exception if the empty integer
     * array is passed to it.
     *
     * @param buf The non-null byte array to be packed.
     * @return A non-null integer array of the packed bytes.
     * @throws NullPointerException if the given byte array is null.
     */
    public static int[] pack(byte[] buf) {
        int k, blen = buf.length, ilen = ((buf.length+3) >>> 2);
        int[] ibuf = new int[ilen];
        for (int n = 0; n < ilen; n++) {
            int m = (n+1) << 2;
            if (m > blen) m = blen;
            for (k = buf[--m]&0xff; (m & 0x3) != 0; k = (k << 8) | buf[--m]&0xff);
            ibuf[n] = k;
        }
        return ibuf;
    }
}
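
The pack() method is described in the listing itself as fairly obscure, so here is a more explicit sketch of the same little-endian packing, written for readability rather than taken from the original source. The class name PackSketch is an assumption; the expected values come from the example in the pack() Javadoc.

```java
// Hypothetical, more explicit rewrite of the little-endian byte packing.
public class PackSketch {

    // Packs bytes into ints, least significant byte first, zero-padding the tail.
    static int[] pack(byte[] buf) {
        int[] out = new int[(buf.length + 3) >>> 2];
        for (int i = 0; i < buf.length; i++) {
            // Byte i goes into word i / 4, shifted left by 8 * (i % 4) bits.
            out[i >>> 2] |= (buf[i] & 0xff) << ((i & 3) << 3);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] packed = pack(new byte[] { 0x01, 0x02, 0x03, 0x04, 0x05, 0x06 });
        // prints "04030201 00000605"
        System.out.println(String.format("%08x %08x", packed[0], packed[1]));
    }
}
```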

Test

Test code
        // Java implementation of MT19937
        MTRandom mtRandom=new MTRandom();
        Map map=new HashMap<>();
        // number of iterations
        int times=1000000;
        long startTime=System.currentTimeMillis();
        for(int i=0;i
Test results
times:1000000
num:999886
proportion:0.999886
time:374
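
The test listing above is cut off at the loop, so the exact body is not recoverable. A minimal sketch of a comparable test follows, assuming the loop stored each nextInt() result as a map key and num was the resulting map size. It uses java.util.Random as a stand-in so that it runs on its own; the class name DistinctCountTest and both method names are assumptions for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical reconstruction of a distinct-value test; not the original code.
public class DistinctCountTest {

    // Draws `times` 32-bit integers and returns how many distinct values appeared.
    static int countDistinct(Random rng, int times) {
        Map<Integer, Integer> map = new HashMap<>();
        for (int i = 0; i < times; i++) {
            map.merge(rng.nextInt(), 1, Integer::sum);
        }
        return map.size();
    }

    // Birthday-style estimate: n(n-1)/2 pairs, each colliding with probability 2^-32.
    static double expectedCollisions(long n) {
        return n * (n - 1) / 2.0 / 4294967296.0;
    }

    public static void main(String[] args) {
        int times = 1_000_000;
        long startTime = System.currentTimeMillis();
        int num = countDistinct(new Random(), times); // swap in new MTRandom() here
        long elapsed = System.currentTimeMillis() - startTime;
        System.out.println("times:" + times);
        System.out.println("num:" + num);
        System.out.println("proportion:" + (double) num / times);
        System.out.println("time:" + elapsed);
        System.out.println("expected collisions:" + expectedCollisions(times));
    }
}
```

Under that assumption the reported figures look plausible: for one million draws over 2^32 possible values the estimate is about 116 collisions, and the result above shows 1000000 - 999886 = 114 repeats.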

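Copyright of this article belongs to the author. Please do not repost without permission. If this article violates any rules, you can contact the administrator to have it removed.

When reposting, please cite the address of this article: https://www.ucloud.cn/yun/73325.html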
