随机取不重复的随机数_可重复随机数入门

随机取不重复的随机数_可重复随机数入门

随机取不重复的随机数

如果要创建任何程序,几乎可以保证在某一时刻需要随机数。 而且,如果您希望能够多次产生相同的结果,则需要随机数是可重复的。 (If you’re creating anything procedural, you’re almost guaranteed to come in need of random numbers at one point. And if you want to be able to produce the same result more than once, you’ll need the random numbers to be repeatable.)

In this article we’ll use level/world generation in games as example use cases, but the lessons are applicable to many other things, such as procedural textures, models, music, etc. They are however not meant for applications with very strict requirements, such as cryptography.

在本文中,我们将在游戏中使用关卡/世界生成作为示例用例,但这些课程适用于许多其他方面,例如过程纹理,模型,音乐等。但是,它们并不适合具有非常严格要求的应用程序,例如密码学。

Why would you want to repeat the same result more than once?

您为什么要多次重复相同的结果?

  • Ability to revisit the same level/world. For example a certain level/world can be created from a specific seed. If the same seed is used again, you will get the same level/world again. You can for example do this in Minecraft.

    能够重新访问同一级别/世界。 例如,可以从特定种子创建特定级别/世界。 如果再次使用相同的种子,您将再次获得相同的级别/世界。 例如,您可以在Minecraft中执行此操作

  • Persistent world that’s generated on the fly. If you have a world that’s generated on the fly as the player moves around in it, you may want locations to remain the same the first and subsequent times the player visit those locations (like in Minecraft, the upcoming game No Man’s Sky, and others), rather than being different each time as if driven by dream logic.

    持续不断发展的世界。 如果您有一个随玩家移动而动态生成的世界,则您可能希望位置在玩家第一次和随后访问该位置时保持相同(例如在Minecraft,即将推出的游戏《无人深空》中,以及其他),而不是每次都由梦的逻辑所驱动而有所不同。

  • Same world for everyone. Maybe you want your game world to be the same for everyone who play it, exactly as if it wasn’t procedurally generated. This is for example the case in No Man’s Sky. This is essentially the same as the ability to revisit the same level/world mentioned above, except that the same seed is always used.

    每个人的世界都一样。 也许您希望您的游戏世界对于每个玩过它的人来说都是一样的,就像不是程序产生的一样。 例如在《无人的天空》中就是这种情况 。 这与重新访问上述相同级别/世界的能力基本相同,只是始终使用相同的种子。

We’ve mentioned the word seed a few times. A seed can be a number, text string, or other data that’s used as input in order to get a random output. The defining trait for a seed is that the same seed will always produce the same output, but even the slightest change in the seed can produce a completely different output.

我们已经多次提到种子一词。 种子可以是数字,文本字符串或其他用作输入以获得随机输出的数据。 种子的定义特征是相同的种子将始终产生相同的输出,但是即使是种子中的最小变化也可以产生完全不同的输出。

In this article we’ll look into two different ways to produce random numbers – random number generators and random hash functions – and reasons for using one or the other. The things I know about this are hard earned and don’t seem to be readily available elsewhere, so I thought it would be in order to write it down and share it.

在本文中,我们将研究产生随机数的两种不同方式-随机数生成器和随机散列函数-以及使用其中一种的原因。 我所知道的事情是来之不易的,似乎在其他地方也不太容易获得,所以我认为这是为了写下来并分享。

随机数发生器 (Random number generators)

The most common way to produce random numbers is using a random number generator (or RNG for short). Many programming languages have RNG classes or methods included, and they have the word “random” in their name, so it’s the obvious go-to approach to get started with random numbers.

产生随机数的最常见方法是使用随机数生成器(或简称RNG)。 许多编程语言都包含RNG类或方法,并且它们的名称中包含“ random”一词,因此这是从随机数开始的明显入门方法。

A random number generator produces a sequence of random numbers based on an initial seed. In object-oriented languages, a random number generator is typically an object that is initialized with a seed. A method on that object can then be repeatedly called to produce random numbers.

随机数生成器基于初始种子生成一系列随机数。 在面向对象的语言中,随机数生成器通常是使用种子初始化的对象。 然后可以重复调用该对象上的方法以产生随机数。

随机取不重复的随机数_可重复随机数入门

The code in C# could look like this:

C#中的代码如下所示:

[csharp]Random randomSequence = new Random(12345); int randomNumber1 = randomSequence.Next(); int randomNumber2 = randomSequence.Next(); int randomNumber3 = randomSequence.Next();[/csharp]

[csharp] Random randomSequence = new Random(12345); int randomNumber1 = randomSequence.Next(); int randomNumber2 = randomSequence.Next(); int randomNumber3 = randomSequence.Next(); [/ csharp]

In this case we’re getting random integer values between 0 and the maximum possible integer value (2147483647), but it’s trivial to convert this to a random integer in a specific range, or a random floating point number between 0 and 1 or similar. Often methods are provided that do this out of the box.

在这种情况下,我们将获得介于0和最大可能整数值(2147483647)之间的随机整数值,但是将其转换为特定范围内的随机整数或介于0和1之间的随机浮点数或类似值很容易。 通常提供开箱即用的方法。

Here’s an image with the first 65536 numbers generated by the Random class in C# from the seed 0. Each random number is represented as a pixel with a brightness between 0 (black) and 1 (white).

这是一张由C#中的Random类从种子0生成的前65536个数字的图像。每个随机数均表示为亮度在0(黑色)到1(白色)之间的像素。

随机取不重复的随机数_可重复随机数入门

It’s important to understand here that you cannot get the third random number without first getting the first and second one. This is not just an oversight in the implementation. In its very nature, an RNG generates each random number using the previous random number as part of the calculation. Hence we talk about a random sequence.

重要的是要在这里了解到,如果不先获得第一个和第二个就不能获得第三个随机数。 这不仅仅是实施过程中的疏忽。 就其本质而言,RNG使用先前的随机数作为计算的一部分来生成每个随机数。 因此,我们谈论一个随机序列

随机取不重复的随机数_可重复随机数入门

This means that RNGs are great if you need a bunch of random numbers one after the other, but if you need to be able to get a specific random number (say, the 26th random number from the sequence), then you’re out of luck. Well, you could call Next() 26 times and use the last number but this is a very bad idea.

这意味着如果您需要一个接一个的随机数,那么RNG很好,但是如果您需要能够获得特定的随机数(例如,序列中的第26个随机数),那么您就没有了。运气。 好吧,您可以调用Next()26次并使用最后一个数字,但这是一个非常糟糕的主意。

为什么要从序列中获取特定的随机数? (Why would I want a specific random number from the sequence?)

If you generate everything at once, you probably don’t need specific random numbers from a sequence, or at least I can’t think of a reason. However, if you generate things bit by bit on the fly, then you do.

如果您一次生成所有内容,则可能不需要序列中的特定随机数,或者至少我想不出原因。 但是,如果您在运行过程中一点一点地产生东西,那么您会做到。

For example, say you have three sections in your world: A, B, and C. The player starts in section A, so section A is generated using 100 random numbers. Then the player proceeds to section B which is generated using 100 different numbers. The generated section A is destroyed at the same time to free up memory. The player proceeds to section C which is 100 yet different numbers and section B is destroyed.

例如,假设您的世界中有三个部分:A,B和C。玩家从A部分开始,因此A部分是使用100个随机数生成的。 然后,玩家进入使用100个不同数字生成的B部分。 同时破坏了生成的部分A以释放内存。 玩家进入C区,该区为100,但数字不同,B区被破坏。

However, if the player now go back to section B again, it should be generated with the same 100 random numbers as it was the first time in order for the section to look the same.

但是,如果玩家现在再次回到B部分,则应该使用与第一次相同的100个随机数生成该部分,以使该部分看起来相同。

我不能只使用具有不同种子值的随机数生成器吗? (Can’t I just use random number generators with different seed values?)

No! This is a very common misconception about RNGs. The fact is that while the different numbers in the same sequence are random in relation to each other, the same indexed numbers from different sequences are not random in relation to each other, even if it may look like it at first glance.

没有! 这是对RNG的非常普遍的误解。 事实是,尽管相同序列中的不同数字相互之间是随机的,但是即使乍一看,来自不同序列的相同索引数字也不是相互随机的。

随机取不重复的随机数_可重复随机数入门

So if you have 100 sequences and take the first number from each, those numbers will not be random in relation to each other. And it won’t be any better if you take the 10th, 100th, 1000th number from each sequence.

因此,如果您有100个序列,并从每个序列中获取第一个数字,则这些数字彼此之间将不会是随机的。 如果您从每个序列中选取第10、100、1000个数字,也不会更好。

At this point some people will be skeptical, and that’s fine. You can also look at this Stack Overflow question about RNG for procedural content if that’s more trustworthy. But for something a bit more fun and informative, let’s do some experiments and look at the results.

在这一点上,有些人会持怀疑态度,这很好。 如果更值得信赖,您还可以查看有关RNG的堆栈溢出问题以获取程序内容 。 但是,为了获得更多乐趣和更多信息,让我们进行一些实验并查看结果。

Let’s look at the numbers generated from the same sequence for reference and compare with numbers created by getting the first number in of each of 65536 sequences created from the seeds 0 to 65535.

让我们看一下从相同序列生成的数字以供参考,并将其与通过获取从种子0到65535创建的65536个序列中的每个序列的第一个数字而创建的数字进行比较。

随机取不重复的随机数_可重复随机数入门

Though the pattern is rather uniformly distributed, it isn’t quite random. In fact, I’ve shown the output of a purely linear function for comparison, and it’s apparent that using numbers from subsequent seeds is barely any better than just using a linear function.

尽管模式相当均匀地分布,但是它不是很随机。 实际上,我已经展示了纯线性函数的输出以进行比较,很明显,使用后续种子中的数字几乎比仅使用线性函数更好。

Still, is it almost random though? Is it good enough?

不过,它几乎是随机的吗? 好吗?

At this point it can be a good idea to introduce better ways to measure randomness since the naked eye is not very reliable. Why not? Isn’t it enough that the output looks random enough?

此时,最好引入更好的方法来测量随机性,因为肉眼不是很可靠。 为什么不? 输出看起来足够随机是否足够?

Well yes, in the end our goal is simply that things look sufficiently random. But the random number output can look very different depending on how it’s used. Your generation algorithms may transform the random values in all kinds of ways that will reveal clear patterns that are hidden when just inspecting the values listed in a simple sequence.

好吧,是的,最终我们的目标只是使事物看起来足够随机。 但是,随机数的输出根据使用方式的不同而可能看起来非常不同。 您的生成算法可能会以各种方式转换随机值,从而揭示仅在检查简单序列中列出的值时隐藏的清晰模式。

An alternative way to inspect the random output is to create 2D coordinates from pairs of the random numbers and plot those coordinates into an image. The more coordinates land on the same pixel, the brighter that pixel gets.

检查随机输出的另一种方法是从成对的随机数创建2D坐标并将这些坐标绘制到图像中。 坐标落在同一像素上的次数越多,该像素得到的亮度就越高。

Let’s take a look at such a coordinate plot for both a random numbers in the same sequence and for random numbers created from individual sequences with different seeds. Oh and let’s throw in the linear function too.

让我们看一下这样一个坐标图,它既针对相同序列中的随机数,也针对从具有不同种子的单个序列中创建的随机数。 哦,让我们也加入线性函数。

Perhaps surprisingly, when creating coordinates from random numbers from different seeds, the coordinates are all plotted into thin lines rather than being anything near uniformly distributed. This is again just like for a linear function.

也许令人惊讶,当根据来自不同种子的随机数创建坐标时,所有坐标都被绘制成细线,而不是几乎均匀分布的任何东西。 再次就像线性函数一样。

Imagine you created coordinates from random numbers in order to plant trees on a terrain. Now all your trees would be planted in a straight line with the remaining terrain being empty!

想象一下,您是根据随机数创建坐标的,以便在地形上种植树木。 现在,您所有的树木都将沿直线种植,而其余地形为空!

We can conclude that random number generators are only useful if you don’t need to access the numbers in a specific order. If you do, then you might want to look into random hash functions.

我们可以得出结论,仅当您不需要按特定顺序访问数字时,随机数生成器才有用。 如果这样做,则可能需要研究随机哈希函数。

随机哈希函数 (Random hash functions)

In general a hash function is any function that can be used to map data of arbitrary size to data of fixed size, with slight differences in input data producing very big differences in output data.

通常,散列函数是可用于将任意大小的数据映射到固定大小的数据的任何函数,输入数据中的细微差异会在输出数据中产生非常大的差异。

For procedural generation, typical use cases are to provide one or more integer numbers as input and get a random number as output. For example, for large worlds where only parts are generated at a time, a typical need is to get a random number associated with an input vector (such as a location in the world), and this random number should always be the same given the same input. Unlike random number generators (RNGs) there is no sequence – you can get the random numbers in any order you like.

对于过程生成,典型的用例是提供一个或多个整数作为输入,并获得一个随机数作为输出。 例如,对于一次仅生成零件的大世界,典型的需求是获取与输入向量相关联的随机数(例如世界中的位置),并且该随机数在给定的情况下应始终相同。相同的输入。 与随机数生成器(RNG)不同,它没有序列 -您可以按任意顺序获得随机数。

随机取不重复的随机数_可重复随机数入门

The code in C# could look like this – note that you can get the numbers in any order you like:

C#中的代码可能看起来像这样–请注意,您可以按照自己喜欢的任何顺序获取数字:

[csharp]RandomHash randomHashObject = new RandomHash(12345); int randomNumber2 = randomHashObject.GetHash(2); int randomNumber3 = randomHashObject.GetHash(3); int randomNumber1 = randomHashObject.GetHash(1);[/csharp]

[csharp] RandomHash randomHashObject = new RandomHash(12345); int randomNumber2 = randomHashObject.GetHash(2); int randomNumber3 = randomHashObject.GetHash(3); int randomNumber1 = randomHashObject.GetHash(1); [/ csharp]

The hash function may also take multiple inputs, which mean you can get a random number for a given 2D or 3D coordinate:

散列函数还可以接受多个输入,这意味着您可以为给定的2D或3D坐标获取一个随机数:

[csharp]RandomHash randomHashObject = new RandomHash(12345); randomNumberGrid[20, 40] = randomHashObject.GetHash(20, 40); randomNumberGrid[21, 40] = randomHashObject.GetHash(21, 40); randomNumberGrid[20, 41] = randomHashObject.GetHash(20, 41); randomNumberGrid[21, 41] = randomHashObject.GetHash(21, 41);[/csharp]

[csharp] RandomHash randomHashObject = new RandomHash(12345); randomNumberGrid [20,40] = randomHashObject.GetHash(20,40); randomNumberGrid [21,40] = randomHashObject.GetHash(21,40); randomNumberGrid [20,41] = randomHashObject.GetHash(20,41); randomNumberGrid [21,41] = randomHashObject.GetHash(21,41); [/ csharp]

Procedural generation is not the typical use of hash functions, and not all hash functions are well suited for procedural generation, as they may either not have sufficiently random distribution, or be unnecessarily expensive.

程序生成不是哈希函数的典型用法,并且并非所有哈希函数都非常适合于程序生成,因为它们要么没有足够的随机分布,要么不必要地昂贵。

One use of hash functions is as part of the implementation of data structures such as hash tables and dictionaries. These are often fast but not sufficiently random, since they are not meant for randomness but just for making algorithms perform efficiently. In theory this means they should be random as well, but in practice I haven’t found resources that compare the randomness properties of these, and the ones I’ve tested have turned out to have fairly bad randomness properties (see Appendix C for details).

哈希函数的一种用途是作为数据结构(例如哈希表和字典)实现的一部分。 这些通常是快速的,但不够随机,因为它们不是为了随机性而是为了使算法有效执行。 从理论上讲,这意味着它们也应该是随机的,但是在实践中,我没有找到可以比较它们的随机性的资源,而我测试过的资源却具有相当差的随机性(有关详细信息,请参见附录C)。 )。

Another use of hash function is for cryptography. These are often very random, but are also slow, since the requirements for cryptographically strong hash functions is much higher than for values that just looks random.

哈希函数的另一种用途是用于加密。 这些通常是非常随机的,但是也很慢,因为对密码学上强的哈希函数的要求比对看起来只是随机的值的要求高得多。

Our goal for procedural generation purposes is a random hash function that looks random but is also efficient, meaning that it’s not slower than it needs to be. Chances are there’s not a suitable one built into your programming language of choice, and that you’ll need to find one to include in your project.

我们出于过程生成目的的目标是一个随机散列函数,该函数看起来是随机的,但也是高效的,这意味着它不会比需要的慢。 可能没有一种合适的语言内置在您选择的编程语言中,因此您需要找到一个包含在项目中的语言。

I’ve tested a few different hash functions based on recommendations and information from various corners of the Internet. I’ve selected three of those for comparison here.

我已经根据来自互联网各个角落的建议和信息测试了一些不同的哈希函数。 我选择了其中三个进行比较。

  • PcgHash: I got this hash function from Adam Smith in a discussion on Google Groups forum on Procedural Content Generation. Adam proposed that with a little skill, it’s not too hard to create your own random hash function and offered his PcgHash code snippet as an example.

    PcgHash :我在Google Groups论坛上关于过程内容生成的讨论中,从Adam Smith获得了此哈希函数。 Adam提出了一点技巧,即创建自己的随机哈希函数并不难,并以他的PcgHash代码段为例。

  • MD5: This may be one of the most well-known hash functions. It’s also of cryptographic strength and more expensive than it needs to be for our purposes. On top of that, we typically just need a 32-bit int as return value, while MD5 returns a much larger hash value, most of which we’d just be throwing away. Nevertheless it’s worth including for comparison.

    MD5 :这可能是最著名的哈希函数之一。 它也具有加密强度,并且比我们所需的价格昂贵。 最重要的是,我们通常只需要一个32位的int作为返回值,而MD5返回一个更大的哈希值,其中大部分我们都将被丢弃。 不过,值得进行比较。

  • xxHash: This is a high-performing modern non-cryptographic hash function that has both very nice random properties and great performance.

    xxHash :这是一个高性能的现代非加密哈希函数,具有非常好的随机属性和出色的性能。

Apart from generating the noise sequence images and coordinate plots, I’ve also tested with a randomness testing suite called ENT – A Pseudorandom Number Sequence Test Program. I’ve included select ENT stats in the images as well as a stat I devised myself with I call the Diagonals Deviation. The latter looks at sums of diagonal lines of pixels from the coordinate plot and measures the standard deviation of these sums.

除了生成噪声序列图像和坐标图外,我还使用称为ENT的随机性测试套件进行了测试-伪随机数序列测试程序 。 我在图像中包括了选定的ENT统计数据,以及我自己设计的统计数据(称为对角线偏差)。 后者从坐标图中查看像素对角线的总和,并测量这些总和的标准偏差。

Here’s the results from the three hash functions:

这是三个哈希函数的结果:

随机取不重复的随机数_可重复随机数入门

PcgHash stands out in that while it appears very random in the noise images of sequential random values, the coordinate plot reveals clear patterns, which means it doesn’t hold up well to simple transformations. I conclude from this that rolling your own random hash function is hard and should probably be left to the experts.

PcgHash的突出之处在于,尽管它在顺序随机值的噪声图像中显得非常随机,但坐标图显示出清晰的图案,这意味着它不能很好地支持简单的变换。 由此得出的结论是,滚动自己的随机散列函数很困难,应该交给专家。

MD5 and xxHash seem to have very comparable random properties, and out of those, xxHash is around 50 times faster.

MD5和xxHash似乎具有非常可比的随机属性,其中xxHash快大约50倍。

xxHash also has the advantage that although it’s not an RNG, it still has the concept of a seed, which is not the case for all hash functions. Being able to specify a seed has clear advantages for procedural generation, since you can use different seeds for different random properties of entities, grid cells, or similar, and then just use the entity index / cell coordinate as input for the hash function as-is. Crucially, with xxHash, the numbers from differently seeded sequences are random in relation to each other (see Appendix 2 for more details).

xxHash还具有优点,尽管它不是RNG,但仍然具有种子的概念,并非所有哈希函数都如此。 能够指定种子对于程序生成具有明显的优势,因为您可以对实体,网格单元或类似对象的不同随机属性使用不同的种子,然后只需将实体索引/单元格坐标用作哈希函数的输入即可,如下所示:是。 至关重要的是,使用xxHash,来自不同种子序列的数字彼此之间是随机的(有关更多详细信息,请参见附录2)。

针对程序生成进行了优化的哈希实现 (Hash implementations optimized for procedural generation)

In my investigations of hash functions it has become clear that while it’s good to choose a hash function that’s high-performing in general-purpose hash benchmarks, it’s crucial for performance to optimize it to procedural generation needs rather than just using the hash function as-is.

在我对散列函数的研究中,很清楚地发现,尽管选择通用散列基准测试中性能较高的散列函数是好事,但对性能进行优化以使其满足程序生成需求至关重要,而不仅仅是将散列函数用作-是。

There are two important optimizations:

有两个重要的优化:

  • Avoid conversions between integers and bytes. Most general-purpose hash functions take a byte array as input and return an integer or some bytes as the hash value. However, some of the high-performing ones convert the input bytes to integers since they operate on integers internally. Since it’s most common for procedural generation to get a hash based on integer input values, the conversion to bytes is completely pointless. Refactoring the reliance on bytes away can triple the performance while leaving the output 100% identical.

    避免在整数和字节之间进行转换。 大多数通用哈希函数将字节数组作为输入,并返回整数或一些字节作为哈希值。 但是,一些高性能的字节将输入字节转换为整数,因为它们在内部对整数进行运算。 由于程序生成最常见的是基于整数输入值获取哈希,因此转换为字节是完全没有意义的。 重构对字节的依赖可以使性能提高三倍,同时使输出保持100%相同。

  • Implement no-loop methods that take just one or a few inputs. Most general-purpose hash functions take input data of variable length in the form of an array. This is useful for procedural generation too, but the most common uses are probably to get a hash based on just 1, 2 or 3 input integers. Creating optimized methods that take a fixed number of integers rather than an array can eliminate the need for a loop in the hash function, and this can dramatically improve the performance (by around 4x-5x in my tests). I’m not an expert on low level optimization, but the dramatic difference could be caused by either implicit branching in the for loop or by the need to allocate an array.

    实现仅需要一个或几个输入的无环方法。 大多数通用哈希函数以数组的形式获取长度可变的输入数据。 这对于过程生成也很有用,但是最常见的用途可能是仅基于1、2或3个输入整数来获取哈希。 创建采用固定数量的整数而不是数组的优化方法可以消除哈希函数中循环的需要,这可以显着提高性能(在我的测试中提高了4到5倍)。 我不是底层优化的专家,但是巨大的差异可能是由于for循环中的隐式分支或需要分配数组引起的。

My current recommendation for a hash function is to use an implementation of xxHash that’s optimized for procedural generation. See Appendix C for details on why.

我当前对散列函数的建议是使用针对过程生成进行了优化的xxHash实现。 有关原因的详细信息,请参见附录C。

You can get my implementations of xxHash and other hash functions in C# on BitBucket. This is maintained privately by me in my free time, not by Unity Technologies.

您可以在BitBucket的C#中获得我对xxHash和其他哈希函数的实现。 这是由我在业余时间私下维护的,而不是由Unity Technologies私下维护的。

Besides the optimizations I also added extra methods to get the output as an integer number in a specified range or as a floating point number in a specified range, which are typical needs in procedural generation.

除了优化之外,我还添加了额外的方法来获取输出,该输出是指定范围内的整数或指定范围内的浮点数,这是过程生成中的典型需求。

Note: At the time of writing I only added a single integer input optimization to xxHash and MurmurHash3. I’ll add optimized overloads for two and three integer inputs too when I get time.

注意:在撰写本文时,我仅向xxHash和MurmurHash3添加了单个整数输入优化。 我会在有时间的时候为两个和三个整数输入添加优化的重载。

结合哈希函数和RNG (Combining hash functions and RNGs)

Random hash functions and random number generators can also be combined. A sensible approach is to use random number generators with different seeds, but where the seeds have been passed through a random hash function rather than being used directly.

随机哈希函数和随机数生成器也可以组合使用。 明智的方法是使用带有不同种子的随机数生成器,但是种子是通过随机哈希函数传递的,而不是直接使用的。

Imagine you have a large maze world, possibly nearly infinite. The world has a large scale grid where each grid cell is a maze. As the player moves around in the world, mazes are generated in the grid cells surrounding the player.

想象一下,您有一个巨大的迷宫世界,可能几乎是无限的。 世界上有一个大型网格,其中每个网格单元都是一个迷宫。 当玩家在世界各地移动时,迷宫会在玩家周围的网格单元中产生。

In this case you’ll want each maze to always be generated the same way every time it’s visited, so the random numbers needed for that need to be able to be produced independently from previously generated numbers.

在这种情况下,您将希望每次访问时始终以相同的方式生成每个迷宫,因此需要能够独立于先前生成的数字来生成所需的随机数。

However, mazes are always generated one whole maze at a time, so there’s no need to have control over the order of the individual random numbers used for one maze.

但是,迷宫总是一次生成整个迷宫,因此无需控制用于一个迷宫的各个随机数的顺序。

The ideal approach here is to use a random hash function to create a seed for a maze based on the coordinate of the grid cell of the maze, and then use this seed for a random number generator sequence of random numbers.

此处的理想方法是使用随机哈希函数根据迷宫网格单元的坐标为迷宫创建种子,然后将该种子用于随机数的随机数生成器序列。

随机取不重复的随机数_可重复随机数入门

The C# code could look like this:

C#代码如下所示:

[csharp]RandomHash randomHashObject = new RandomHash(12345); int mazeSeed = randomHashObject.GetHash(cellCoord.x, cellCoord.y);

[csharp] RandomHash randomHashObject = new RandomHash(12345); int mazeSeed = randomHashObject.GetHash(cellCoord.x,cellCoord.y);

Random randomSequence = new Random(mazeSeed); int randomNumber1 = randomSequence.Next(); int randomNumber2 = randomSequence.Next(); int randomNumber3 = randomSequence.Next();[/csharp]

随机randomSequence =新的Random(mazeSeed); int randomNumber1 = randomSequence.Next(); int randomNumber2 = randomSequence.Next(); int randomNumber3 = randomSequence.Next(); [/ csharp]

结论 (Conclusions)

If you need control over the order of querying random numbers, use a suitable random hash function (such as xxHash) in an implementation that’s optimized for procedural generation.

如果您需要控制查询随机数的顺序,请在针对过程生成进行了优化的实现中使用合适的随机哈希函数(例如xxHash)。

If you just need to get a bunch of random numbers and the order doesn’t matter, the simplest way is to use a random number generator such as the System.Random class in C#. In order for all the numbers to be random in relation to each other, either only a single sequence (initialized with one seed) should be used, or if multiple seeds are used they should be passed through a random hash function (such as xxHash) first.

如果只需要获取一堆随机数,而顺序无关紧要,则最简单的方法是使用随机数生成器,例如C#中的System.Random类。 为了使所有数字彼此之间是随机的,应该只使用一个序列(用一个种子初始化),或者如果使用多个种子,则应将它们通过随机哈希函数(例如xxHash)传递第一。

The source code for the random numbers testing framework referred to in this article, as well as a variety of RNGs and hash functions, is available on BitBucket. This is maintained privately by me in my free time, not by Unity Technologies.

本文提到的随机数测试框架的源代码以及各种RNG和哈希函数可从BitBucket上获得 。 这是由我在业余时间私下维护的,而不是由Unity Technologies私下维护的。

This article originally appeared on the runevision blog which is dedicated to game development and research I do in my free time.

这篇文章最初出现在runevision博客上,该博客致力于我业余时间的游戏开发和研究。

附录A:关于连续噪声的说明 (Appendix A: A note on continuous noise)

For certain things you’ll want to be able to query noise values that are continuous, meaning that input values near each other produce output values that are also near each other. Typical uses are for terrains or textures.

对于某些事情,您将希望能够查询连续的噪声值,这意味着彼此接近的输入值会产生彼此也接近的输出值。 典型用途是用于地形或纹理。

These requirements are completely different from the ones discussed in this article. For continuous noise, look into Perlin Noise – or better – Simplex Noise.

这些要求与本文讨论的要求完全不同。 要获得连续的噪声,请查看Perlin噪声-或更好的是Simplex噪声。

However, be aware that these are only suitable for continuous noise. Querying continuous noise functions just to get random numbers unrelated to other random numbers will produce poor results since it’s not what these algorithms are optimized for. For example, I’ve found that querying a Simplex Noise function at integer positions will return 0 for every third input!

但是,请注意,这些适用于连续噪声。 仅查询连续噪声函数以获得与其他随机数无关的随机数会产生较差的结果,因为这不是针对这些算法进行优化的。 例如,我发现在整数位置查询Simplex Noise函数将为每三个输入返回0!

Additionally, continuous noise functions usually use floating point numbers in their calculations, which have worse stability and precision the further you get from the origin.

此外,连续噪声函数通常在计算中使用浮点数,如果离原点越远,它们的稳定性和精度就越差。

附录B:更多关于种子和输入值的测试结果 (Appendix B: More test results for seed and input values)

I’ve heard various misconceptions over the years and I’ll try to address a few more of them here

这些年来,我听到过各种各样的误解,在这里我将尝试解决其中的一些误解。

使用大量种子不是最好吗? (Isn’t it best to use a large number for the seed?)

No, I haven’t seen anything that indicates that. If you look at the test images throughout this article, there’s no difference between the results for low or high seed values.

不,我还没有看到任何迹象表明这一点。 如果您查看本文中的测试图像,则无论种子值低还是高,结果之间都没有区别。

随机数生成器难道不需要几个数字就可以开始吗? (Don’t random number generators take a few numbers to “get going”?)

No. Again, if you look at the test images, you can see that the sequences of random values follow the same pattern from start (upper left corner and proceeding one line after the other) to end.

不会。再次,如果您查看测试图像,您会发现随机值序列从开始(左上角开始,然后在另一行之后依次行进)到结束遵循相同的模式。

In the image below I’ve tested the 0th number in 65535 sequences as well as the 100th number in those same sequences. As can be seen, there’s no apparent significant difference in the (lack of) quality of the randomness.

在下面的图像中,我测试了65535个序列中的第0个数字以及这些相同序列中的第100个数字。 可以看出,(缺乏)随机性没有明显的区别。

随机取不重复的随机数_可重复随机数入门

某些RNG,例如Java的,不是来自不同种子序列的数字之间具有更好的随机性吗? (Doesn’t some RNGs, such as Java’s, have better randomness between numbers from differently seeded sequences?)

Maybe a tiny bit better, but not nearly enough. Unlike the Random class in C#, the Random class in Java doesn’t use the provided seed as-is, but shuffles the bits a bit before storing the seed.

也许好一点,但还远远不够。 与C#中的Random类不同,Java中的Random类不按原样使用所提供的种子,而是在存储种子之前将位稍微改组。

The resulting numbers from different sequences may be a tiny bit more random looking, and we can see from the test stats that the Serial Correlation is much better. However, it’s clear in the coordinates plot that the numbers still collapse to a single line when used for coordinates.

从不同序列得到的数字可能看起来有点随机,并且从测试统计数据中我们可以看到串行相关性要好得多。 但是,很明显,在坐标图中,当用于坐标时,数字仍然折叠为单行。

随机取不重复的随机数_可重复随机数入门

That said, there’s no reason why a RNG couldn’t apply a high-quality random hash function to the seed before using it. In fact it seems like a very good idea to do so, with no downsides I can think of. It’s just that popular RNG implementations that I’m aware of don’t do that, so you’ll have to do it yourself as described previously.

就是说,没有理由为什么RNG不能在使用之前将高质量的随机哈希函数应用于种子。 实际上,这样做似乎是一个非常好的主意,没有我能想到的缺点。 我所知道的只是流行的RNG实现不会这样做,因此您必须按照前面所述自己动手。

如果不是RNG,那么为随机散列函数使用不同的种子会有什么好处呢? (How come it’s fine to use different seeds for random hash functions when it isn’t for RNGs?)

There’s no intrinsic reason, but hash functions such as xxHash and MurmurHash3 treat the seed value similar to the inputs, meaning that it essentially applies a high quality random hash function to the seed, so to speak. Because it’s implemented that way, it’s safe to use the Nth number from differently seeded random hash objects.

没有内在的原因,但是诸如xxHash和MurmurHash3之类的哈希函数对待种子值的方式与输入类似,这意味着可以说,它本质上将高质量的随机哈希函数应用于种子。 因为它是通过这种方式实现的,所以可以安全地从不同种子的随机哈希对象中使用第N个数字。

随机取不重复的随机数_可重复随机数入门

附录C:更多哈希函数的比较 (Appendix C: Comparison of more hash functions)

In the original version of this article I compared PcgHash, MD5, and MurmurHash3 and recommended using MurmurHash3.

在本文的原始版本中,我比较了PcgHash,MD5和MurmurHash3,并建议使用MurmurHash3。

MurmurHash3 has excellent randomness properties and very decent speed. The author implemented it in parallel with a framework for testing hash functions called SMHasher which has become a widely used framework for that purpose.

MurmurHash3具有出色的随机性和非常不错的速度。 作者将其与用于测试哈希函数的框架SMHasher并行实现,该框架已被广泛用于此目的。

There is also good information on this Stack Overflow question about good hash functions for uniqueness and speed which compares a lot more hash functions and seems to paint an equally favorable picture of MurmurHash.

关于此堆栈溢出问题,也有关于良好的哈希函数的唯一性和速度的良好信息,该哈希函数比较了许多哈希函数,似乎描绘出了同样令人满意的MurmurHash图片。

After publishing the article I got recommendations from Aras Pranckevičius to look into xxHash and from Nathan Reed to look into Wang Hash which he’s written about here.

发表文章后,我得到了ArasPranckevičius的建议,以研究xxHash,并得到Nathan Reed的建议,以研究Wang Hash,这是他在这里写的

xxHash is a hash function which apparently beats MurmurHash on its own turf by scoring as high on quality in the SMHasher testing framework while having significantly better performance. Read more about xxHash on its Google Code page.

xxHash是一个哈希函数,通过在SMHasher测试框架中获得最高的质量得分,显然在其自身的领域上击败了MurmurHash,同时具有明显更好的性能。 在其Google代码页上了解有关xxHash的更多信息。

In my initial implementation of it, after I had removed byte conversions, it was slighter faster than MurmurHash3, though not nearly as much faster as shown in the SMHasher results.

在它的最初实现中,删除字节转换后,它的速度比MurmurHash3略快,尽管速度不如SMHasher结果所示。

I also implemented WangHash. The quality proved to be insufficient since it showed clear patterns in the coordinate plot, but it was over five times faster than xxHash. I tried implementing a “WangDoubleHash” where its result is fed to itself, and that had fine quality in my tests while still being over three times faster than xxHash.

我还实现了WangHash。 由于在坐标图中显示出清晰的图案,因此质量不足,但是它比xxHash快五倍以上。 我尝试实现一个“ WangDoubleHash”,将其结果反馈给自己,并且在我的测试中具有良好的质量,但仍然比xxHash**倍以上。

随机取不重复的随机数_可重复随机数入门

However, since WangHash (and WangDoubleHash) takes only a single integer as input, I opted to also implement single input versions of xxHash and MurmurHash3  to see what effect it would have on the performance. And it turned out to improve performance dramatically (around 4-5 times faster). So much in fact that xxHash was now faster than WangDoubleHash.

但是,由于WangHash(和WangDoubleHash)仅接受单个整数作为输入,因此我选择也实现xxHash和MurmurHash3的单个输入版本,以查看其对性能的影响。 事实证明,它可以显着提高性能(快4到5倍)。 实际上,xxHash现在比WangDoubleHash快得多。

随机取不重复的随机数_可重复随机数入门

As for quality, my own test framework reveals fairly obvious flaws, but is not nearly as sophisticated as the SMHasher test framework, so a hash function that scores high there can be assumed to be a better seal of quality for randomness properties than just looking fine in my own tests. In general I would say that passing the tests in my test framework may be sufficient for procedural generation purposes, but since xxHash (in its optimized version) is the fastest hash function passing my own tests anyway, it’s a no-brainer to just use that.

至于质量,我自己的测试框架显示出相当明显的缺陷,但还不及SMHasher测试框架那么复杂,因此可以认为,得分较高的哈希函数对于随机性具有更好的质量密封,而不仅仅是看起来很好。在我自己的测试中 总的来说,我会说在测试框架中通过测试对于生成程序而言可能就足够了,但是由于xxHash(处于优化版本)是通过我自己的测试最快的哈希函数,因此使用它就不费吹灰之力。

There are extremely many different hash functions out there and it would always be possible to include even more for comparison. However, I’ve focused primarily on some of the widely recognized best performing ones in terms of both randomness quality and performance, and then optimized them further for procedural generation.

那里有很多不同的哈希函数,总是可以包含更多哈希函数进行比较。 但是,我主要从随机性质量和性能两方面着眼于一些公认的最佳性能,然后针对程序生成进一步对其进行优化。

I think the results produced with this version of xxHash are fairly close to optimal and further gains by finding and using something even better are likely going to be small. That said, feel free to extend the test framework with more implementations.

我认为使用此版本的xxHash生成的结果相当接近最佳值,并且通过查找和使用更好的东西而获得的进一步收益可能很小。 就是说,随时通过更多实现扩展测试框架

翻译自: https://blogs.unity3d.com/2015/01/07/a-primer-on-repeatable-random-numbers/

随机取不重复的随机数