Skip to content

ghostproxies/blastcircuit

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 

Repository files navigation

BlastCircuit

BlastCircuit

Table of Contents

Introduction

BlastCircuit is an efficient PRNG algorithm for 64-bit output.

It's a non-cryptographic PRNG that isn't intended for cryptographic applications, although it may be suitable as a component in cryptographic applications after careful evaluation.

It's intended to replace PRNGs for 64-bit output in practical use cases when the requirements are a 2⁶⁴ minimum period, fast speed and strong empirical test results.

Author

BlastCircuit was created by William Stafford Parsons as a product of GhostProxies.

License

BlastCircuit is licensed with the BSD-3-Clause license.

The default phrase the copyright holder in the 3rd clause is replaced with GhostProxies.

Algorithm Design

BlastCircuit uses a cyclic queue design similar to SFC64.

Instead of overwriting the first queue variable, BlastCircuit preserves most of the previous value by truncating a few lower bits before adding it to the second state variable.

// BlastCircuit
b = (b >> 3) + c;
// SFC64
a = b ^ (b >> 11);

Furthermore, BlastCircuit uses a 2⁶⁴ equidistributed sequence that increases by a large number and truncates modulo 2⁶⁴ frequently for improved distribution compared to an incrementing counter.

As a result, the first queue variable is an adequate queue accumulation to mix with the equidistributed sequence instead of requiring both the first and second queue variables.

// BlastCircuit
mix = a ^ b;
a += 111111111111111;
// SFC64
tmp = a + b + counter++;

As a result, the second queue variable can move up the queue with a fast assignment operation that doesn't require additional bitwise arithmetic mixing.

// BlastCircuit
c = d;
// SFC64
b = c + (c << 3);

The remaining operations that rotate the last queue variable and add the temporary variable are almost identical to SFC64.

// BlastCircuit
d = ((d << 21) | (d >> 43)) + mix;
return mix;
// SFC64
c = ((c << 25) | (c >> 39)) + tmp;
return tmp;

Period

BlastCircuit has many possible deterministic sequences based on the seed.

The 2⁶⁴ minimum period in each sequence is proven by a mixed-in 2⁶⁴ equidistributed sequence, although the state capacity is 2²⁵⁶ and the probable period is at least 2¹²⁸ among all combinations of seeded state values.

The probable period of 2¹²⁸ was estimated after scaling BlastCircuit down to 8-bit state variables and running exhaustive tests with all combinations of seeds.

Parallelism

Each instance within a parallel set of BlastCircuit instances must replace the default additive constant 111111111111111 with an odd number that's unique among the parallel set of instances.

Each additive constant should be greater than 32 bits and less than 56 bits to reduce variability in empirical test results.

Furthermore, seeding with random values and skipping the first dozen results in each instance reduces correlations within a parallel set of BlastCircuit instances.

Reference

C Implementation

blastcircuit.c

The blastcircuit function modifies the state in a struct blastcircuit_state instance to generate and return a pseudorandom uint64_t integer.

Each state variable (a, b, c and d) in a struct blastcircuit_state instance must be seeded, preferably with random values, before generating a pseudorandom sequence.

It requires the stdint.h header to define a 64-bit, unsigned integral type for uint64_t.

C# Implementation

blastcircuit.cs

The Next function modifies the state in a BlastCircuit instance to generate and return a pseudorandom ulong integer.

Each state variable (a, b, c and d) in a BlastCircuit instance should be seeded with random values before generating a pseudorandom sequence.

Empirical Test Results

BlastCircuit passed up to a minimum of 32TB in PractRand stdin64.

For reproducibility, each of the aforementioned test results seeded each state variable with 0 and used the default additive constant 111111111111111.

Speed Test Results

The following speed test results used the C implementation on an Intel Core m3-8100Y @ 1.10GHz processor.

Each benchmark result used a loop that slowly hashes 1 billion pseudorandom number results.

while (i < 1000000000) {
  hash_a /= (hash_b >> (hash_a & 3)) | 1;
  hash_b += generate();
  hash_b /= hash_a | 1;
  i++;
}

The BlastCircuit algorithm design suggests the following 1st-place speed rankings will likely be consistent across a majority of implementations with varying 64-bit architectures, devices and programming languages.

Each Time result used the fastest process execution speed among several consecutive tests.

gcc -O3

Algorithm Ranking Time
BlastCircuit 1st 0m0.911s
2-Rotate JSF64 2nd 0m0.914s
SFC64 3rd 0m0.926s
Xoroshiro128+ 4th 0m1.032s
Xoshiro256+ 5th 0m1.047s
MWC192 6th 0m1.055s
3-Rotate JSF64 7th 0m1.068s
WOB2M 8th 0m1.149s
Fast MCG128 9th 0m1.224s
Xorshift64 10th 0m1.257s
Xorshift128+ 11th 0m1.324s
SplitMix64 12th 0m1.515s
PCG64 XSL RR RR 13th 0m1.528s
MWC128 14th 0m1.598s
PCG64 DXSM 15th 0m1.653s
PCG64 MCG XSL RR 16th 0m1.892s
PCG128 XSL RR RR 17th 0m2.305s
PCG64 XSL RR 18th 0m2.306s
PCG64 MCG XSH RR 19th 0m2.543s
PCG64 RXS M 20th 0m4.144s

gcc -O0

Algorithm Ranking Time
BlastCircuit 1st 0m24.575s
SFC64 2nd 0m24.687s
Xoshiro256+ 3rd 0m24.825s
Xorshift128+ 4th 0m24.874s
Xoroshiro128+ 5th 0m24.920s
Fast MCG128 6th 0m24.921s
WOB2M 7th 0m25.003s
2-Rotate JSF64 8th 0m25.016s
Xorshift64 9th 0m25.042s
SplitMix64 10th 0m25.110s
3-Rotate JSF64 11th 0m25.161s
MWC192 12th 0m25.450s
MWC128 13th 0m25.452s
PCG64 DXSM 14th 0m25.917s
PCG64 MCG XSL RR 15th 0m26.489s
PCG64 XSL RR RR 16th 0m27.506s
PCG64 MCG XSH RR 17th 0m28.287s
PCG64 XSL RR 18th 0m28.333s
PCG64 RXS M 19th 0m29.857s
PCG128 XSL RR RR 20th 0m33.873s

clang -O3

Algorithm Ranking Time
BlastCircuit 1st 0m0.778s
Fast MCG128 2nd 0m0.929s
SFC64 3rd 0m0.985s
MWC192 4th 0m1.005s
Xoshiro256+ 5th 0m1.050s
2-Rotate JSF64 6th 0m1.132s
Xoroshiro128+ 7th 0m1.179s
3-Rotate JSF64 8th 0m1.187s
Xorshift128+ 9th 0m1.245s
Xorshift64 10th 0m1.262s
MWC128 11th 0m1.273s
WOB2M 12th 0m1.327s
PCG64 MCG XSL RR 13th 0m1.426s
SplitMix64 14th 0m1.526s
PCG64 XSL RR RR 15th 0m1.559s
PCG64 DXSM 16th 0m1.575s
PCG64 MCG XSH RR 17th 0m1.762s
PCG128 XSL RR RR 18th 0m1.967s
PCG64 XSL RR 19th 0m2.007s
PCG64 RXS M 20th 0m3.318s

clang -O0

Algorithm Ranking Time
BlastCircuit 1st 0m24.583s
MWC192 2nd 0m24.846s
Fast MCG128 3rd 0m24.854s
Xoshiro256+ 4th 0m24.868s
SFC64 5th 0m24.919s
MWC128 6th 0m24.992s
Xorshift64 7th 0m25.142s
2-Rotate JSF64 8th 0m25.207s
Xorshift128+ 9th 0m25.298s
SplitMix64 10th 0m25.312s
PCG64 DXSM 11th 0m25.372s
3-Rotate JSF64 12th 0m25.631s
Xoroshiro128+ 13th 0m25.678s
WOB2M 14th 0m25.804s
PCG64 MCG XSL RR 15th 0m28.541s
PCG64 XSL RR 16th 0m30.129s
PCG64 MCG XSH RR 17th 0m30.148s
PCG64 RXS M 18th 0m30.898s
PCG64 XSL RR RR 19th 0m31.073s
PCG128 XSL RR RR 20th 0m38.168s