Dieharder Test Protocol

How Eormen tests 1 GiB entropy blocks using the Dieharder statistical test suite, including the full methodology, test selection rationale, and technical implementation details.

This documentation is published by Eormen and reproduced here in full. Eormen generates and certifies entropy blocks independently. ScirDom publishes this documentation so that anyone can understand the testing that was done before an entropy block was activated.

Overview

This document covers the comprehensive statistical test results from the Dieharder randomness test suite, as applied to 1 GiB blocks of entropy data generated by the Eormen entropy system. These results provide rigorous mathematical validation of the randomness quality of the entropy blocks using one of the most respected statistical test suites in the field.

What is Dieharder Testing?

The Dieharder test suite, developed by Robert G. Brown at Duke University, represents a comprehensive collection of statistical tests for evaluating random number generators and entropy sources. Building upon the original Diehard tests created by George Marsaglia, Dieharder provides a modernised and extended test battery that examines multiple aspects of randomness through sophisticated statistical analysis.

For cryptographic and security applications, rigorous randomness validation is essential. Any patterns, correlations, or biases in entropy data can create vulnerabilities that compromise system security. The Dieharder test suite provides statistical evidence that the entropy data exhibits the properties expected from truly random sequences.

The suite encompasses 31 different statistical tests (IDs 0–17, 100–102, and 200–209, as listed in the inventory below), each designed to detect specific types of non-random patterns that might exist in data. These tests examine everything from basic frequency distributions to complex mathematical relationships, spatial correlations, and information-theoretic properties.

Software Provenance

The Dieharder software used in this validation was obtained from the official distribution maintained by Robert G. Brown at Duke University (webhome.phy.duke.edu/~rgb/General/dieharder.php). This ensures Eormen utilised the authoritative version of the test suite, maintaining the highest standards of software integrity and academic credibility.

Detail | Value
Source | Duke University Physics Department
Maintainer | Robert G. Brown
Package File | dieharder-2.24.7-1.i386.rpm
Installed Version | 3.31.1 (as reported by the software)
Official URL | webhome.phy.duke.edu/~rgb/General/dieharder.php

The package filename reflects historical versioning whilst the installed software reports as version 3.31.1. This is the version used for all testing documented here. Using the official distribution guarantees that the test implementation follows the exact mathematical specifications and methodologies established by the original developers.

EORM Block Structure

EORM entropy blocks consist of two components, which together give a fixed total file size:

  • Entropy Data: 1,073,741,824 bytes (exactly 1 GiB) of random data
  • Metadata: 64 bytes containing generation information
  • Total File Size: 1,073,741,888 bytes

The Dieharder tests are applied only to the entropy data portion, excluding the metadata. This ensures that the randomness validation focuses purely on the generated entropy without influence from structured metadata.

Metadata field | Byte offset | Size | Description
Nonce | 0–15 | 16 bytes | Unique identifier for this generation session
Timestamp | 16–23 | 8 bytes (little-endian) | Generation time
Filename | 24–55 | 32 bytes (UTF-8) | Original filename
File size | 56–63 | 8 bytes (little-endian) | Total file size

The 1 GiB Testing Challenge

Standard Dieharder implementations are typically designed for smaller data files or continuous streams from random number generators. Testing 1 GiB blocks (containing over 8.5 billion bits) presents unique challenges that required careful consideration and systematic solutions.

The Core Problem

Many Dieharder tests were designed assuming either unlimited data streams or much smaller finite files. When applied to 1 GiB files, some tests need more data than the file contains. This leads to excessive file rewinding, contamination from reusing the same data for multiple statistical measurements, and loss of independence between tests that share data.

File Rewinding Impact

When a test requires more data than is available in the file, Dieharder automatically rewinds to the beginning and continues reading. Whilst this allows tests to complete, it introduces problems: tests assume independent data sources; patterns from earlier in the file may influence later measurements; and results may not accurately reflect randomness properties.

The Scale Challenge

A 1 GiB entropy block represents a substantial amount of data, but some tests in the complete Dieharder suite were designed for continuous streams or much larger datasets. These data-hungry tests can easily consume multiple gigabytes of data when run with default parameters.
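
To make the rewinding concern concrete, the sketch below estimates how often a reader wraps around for a given data demand, and what share of that demand is repeated data. The function names are hypothetical, and the byte counts passed in are illustrative inputs rather than measured Dieharder figures.

```python
BLOCK_BYTES = 1 << 30  # entropy portion of one block: 1 GiB


def rewinds_needed(bytes_requested: int, file_bytes: int = BLOCK_BYTES) -> int:
    """Number of times a reader must return to the start of the file."""
    if bytes_requested <= file_bytes:
        return 0
    # Ceiling division gives the number of passes over the file;
    # the first pass is not a rewind.
    passes = -(-bytes_requested // file_bytes)
    return passes - 1


def reused_fraction(bytes_requested: int, file_bytes: int = BLOCK_BYTES) -> float:
    """Fraction of the requested bytes that repeat data already read."""
    if bytes_requested <= file_bytes:
        return 0.0
    return (bytes_requested - file_bytes) / bytes_requested
```

A test that asks for three times the block size would rewind twice, and two thirds of what it reads would be data it has already seen.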

Test Selection Methodology

To address the 1 GiB testing challenge whilst maintaining statistical rigour, Eormen developed a systematic approach based on clear, defensible criteria. The methodology prioritises official test reliability ratings whilst ensuring practical feasibility for finite data files.

Primary Criterion: Official Reliability Ratings

The Dieharder developers have assigned reliability ratings to each test based on extensive research and validation:

  • “Good”: Tests that are statistically sound and provide reliable randomness assessment.
  • “Suspect”: Tests with known issues or questionable statistical validity.
  • “Do Not Use”: Tests that are fundamentally flawed or produce unreliable results.

Eormen's methodology gives absolute priority to these official ratings, as they represent the accumulated expertise of the academic randomness testing community.

Secondary Criterion: Data Efficiency

Among the “Good”-rated tests, each test's data consumption patterns were evaluated to identify those suitable for 1 GiB files, considering data requirements, rewinding behaviour, and whether tests provide unique statistical coverage or duplicate existing analysis.

Optimisation Philosophy

  • Include all suitable “Good” tests: No arbitrary exclusions based on convenience.
  • Exclude only when necessary: Clear justification for any exclusion.
  • Maintain statistical independence: Preserve the validity of individual tests.
  • Transparent decision-making: Document all rationale for reproducibility.

Test Selection Rationale

Included: Core Diehard Tests (Tests 0–4, 8–13, 15–17)

All rated “Good” by the Dieharder developers. Designed for finite data sources and complete within 1 GiB constraints. These tests form the backbone of the validation suite, examining collision patterns, matrix rank properties, bit-level pattern analysis, spatial distribution, information-theoretic measures, sequential pattern analysis, and number-theoretic relationships.

Excluded: Suspect Tests (Tests 5–7)

Tests 5–7: Diehard OPSO, OQSO, and DNA tests carry an official rating of “Suspect”. The Dieharder developers have identified statistical issues with these tests that make their results unreliable. Including tests known to be problematic would compromise the integrity of the validation suite.

Excluded: Problematic Test (Test 14)

Test 14: Diehard Sums Test carries an official rating of “Do Not Use”. This test is fundamentally flawed and should never be used for randomness assessment, according to the official documentation.

Excluded: Data-Intensive “Good” Tests (Tests 100–102, 200–209)

Despite being rated “Good”, thirteen tests were excluded due to excessive data requirements or duplicated coverage:

  • STS Tests (100–102): Tests 100–101 largely duplicate analysis provided by included tests. Test 102 requires excessive data, causing significant file rewinding.
  • RGB Tests (200–205): Created for continuous generator streams, not finite files. Extremely data-hungry, causing extensive file rewinding.
  • DAB Tests (206–209): Designed for much larger datasets. Would require significant file rewinding for completion.
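
The two criteria can be written as a simple filter. This is an illustrative sketch: the names `RATINGS`, `DATA_INTENSIVE`, and `select_tests` are hypothetical, and the ratings and exclusions are transcribed from this document rather than queried from Dieharder.

```python
# Official ratings by test ID, as listed in this document.
RATINGS = {
    test_id: "Good"
    for test_id in list(range(0, 18)) + list(range(100, 103)) + list(range(200, 210))
}
RATINGS.update({5: "Suspect", 6: "Suspect", 7: "Suspect", 14: "Do Not Use"})

# "Good" tests excluded for data consumption or duplicated coverage.
DATA_INTENSIVE = set(range(100, 103)) | set(range(200, 210))


def select_tests(ratings, data_intensive):
    """Primary criterion: rating is "Good". Secondary: fits a 1 GiB file."""
    return sorted(
        test_id
        for test_id, rating in ratings.items()
        if rating == "Good" and test_id not in data_intensive
    )


SELECTED = select_tests(RATINGS, DATA_INTENSIVE)
```

Applying the filter yields test IDs 0–4, 8–13, and 15–17, the same 14 tests named in the rationale above.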

Parameter Optimisation

Reduced p-value samples: Changed from the default of 100 to 20 samples per test. This maintains adequate statistical power whilst reducing data consumption by 80%, which significantly reduces file rewinding across all tests. The trade-off is slightly reduced statistical precision, but the results remain statistically valid.
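
The 80% figure follows from simple proportion, as the sketch below shows. It assumes that every p-value sample of a given test consumes the same amount of data, so that consumption scales linearly with the sample count. The function name is hypothetical.

```python
DEFAULT_PSAMPLES = 100
REDUCED_PSAMPLES = 20


def data_saving_percent(psamples: int, baseline: int = DEFAULT_PSAMPLES) -> int:
    """Percentage reduction in data consumed relative to the baseline.

    Assumption: each p-value sample of a test reads the same number of bytes.
    """
    return 100 * (baseline - psamples) // baseline


print(data_saving_percent(REDUCED_PSAMPLES))  # prints 80
```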

The 14 Selected Tests

The validation suite includes 14 carefully selected tests providing comprehensive coverage of randomness properties whilst respecting 1 GiB file constraints. The details of each test follow.

Test 0: Diehard Birthdays Test

What it measures
Collision patterns in random sequences, based on the birthday paradox.
How it works
Examines whether birthdays (represented as random integers) in groups show the expected collision frequency. The test divides data into groups and counts collisions, comparing results to theoretical expectations for random data.
Technical parameters
ntup=0
Why it matters
Collision analysis is fundamental to cryptographic applications. Non-random data often exhibits unexpected collision patterns that this test can detect.
What PASSED means
The entropy exhibits appropriate collision patterns consistent with random data, with no unexpected clustering or avoidance of collisions.

Test 1: Diehard OPERM5 Test

What it measures
Permutation patterns in sequences of 5 elements.
How it works
Analyses whether the 120 possible permutations of 5 elements occur with equal frequency in the data. Each permutation should appear with probability 1/120 in truly random sequences.
Technical parameters
ntup=0
Why it matters
Ordering bias can indicate subtle patterns in entropy generation that might not be detected by simpler frequency tests.
What PASSED means
No bias towards particular ordering patterns in the entropy data, confirming that sequential relationships appear random.
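
The idea of counting ordering patterns can be illustrated as follows. This is a deliberately simplified sketch, not Dieharder's implementation: it uses non-overlapping windows and a plain chi-square statistic, whereas OPERM5 works on overlapping windows, which requires accounting for the correlation between neighbouring windows. All function names are hypothetical.

```python
from collections import Counter
from itertools import permutations

ALL_PATTERNS = list(permutations(range(5)))  # 5! = 120 possible orderings


def ordering_pattern(window):
    """Return the indices that sort the five values in ascending order."""
    return tuple(sorted(range(5), key=lambda i: window[i]))


def pattern_counts(values):
    """Count ordering patterns over non-overlapping windows of five values."""
    counts = Counter()
    for start in range(0, len(values) - 4, 5):
        window = values[start:start + 5]
        if len(set(window)) == 5:  # skip windows containing ties
            counts[ordering_pattern(window)] += 1
    return counts


def chi_square(counts):
    """Chi-square statistic against equal frequency (1/120 per pattern)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no windows counted")
    expected = total / len(ALL_PATTERNS)
    return sum((counts.get(p, 0) - expected) ** 2 / expected for p in ALL_PATTERNS)
```

A large statistic relative to a chi-square distribution with 119 degrees of freedom would indicate that some orderings are over- or under-represented.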

Test 2: Diehard 32×32 Binary Rank Test

What it measures
Mathematical rank properties of 32×32 binary matrices formed from the data.
How it works
Creates matrices from consecutive bits and calculates their rank over GF(2) (Galois Field). The distribution of ranks is compared to theoretical expectations for random binary matrices.
Technical parameters
ntup=0
Why it matters
Matrix rank analysis can detect linear dependencies and structural patterns that might not be apparent in other tests.
What PASSED means
The data exhibits the expected linear algebra properties of random binary matrices, with no detectable linear dependencies.
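
Rank over GF(2) can be computed with XOR-based elimination, as in the sketch below. The elimination itself is standard; the way `matrix_from_bytes` packs 128 bytes into 32 rows is an assumption made for illustration and is not necessarily how Dieharder fills its matrices. Both function names are hypothetical.

```python
def gf2_rank(rows: list[int]) -> int:
    """Rank over GF(2) of a binary matrix whose rows are given as integers."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue  # an all-zero row adds nothing to the rank
        rank += 1
        low_bit = pivot & -pivot  # lowest set bit serves as the pivot column
        # Addition in GF(2) is XOR: clear the pivot column in remaining rows.
        rows = [r ^ pivot if r & low_bit else r for r in rows]
    return rank


def matrix_from_bytes(chunk: bytes) -> list[int]:
    """Pack 128 bytes into 32 rows of 32 bits (assumed little-endian rows)."""
    if len(chunk) != 128:
        raise ValueError("a 32x32 binary matrix needs exactly 128 bytes")
    return [int.from_bytes(chunk[i:i + 4], "little") for i in range(0, 128, 4)]
```

The test then compares how often each rank occurs across many such matrices with the distribution expected for random binary matrices.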

Test 3: Diehard 6×8 Binary Rank Test

What it measures
Rank properties of smaller 6×8 binary matrices, complementing Test 2.
How it works
Analyses rank distribution of smaller matrices for different scale validation. This provides analysis at a different resolution than the 32×32 test.
Technical parameters
ntup=0
Why it matters
Testing multiple matrix sizes ensures that linear dependencies are not missed due to scale effects.
What PASSED means
Confirms randomness properties at a different matrix scale, providing additional confidence in linear independence.

Test 4: Diehard Bitstream Test

What it measures
Overlapping bit patterns within the data stream.
How it works
Examines specific overlapping bit sequences for unexpected patterns, focusing on the frequency of particular bit combinations as they overlap.
Technical parameters
ntup=0
Why it matters
Overlapping pattern analysis can detect subtler correlations than non-overlapping pattern tests.
What PASSED means
No problematic bit-level patterns detected in the entropy, confirming appropriate bit-level randomness.

Test 8: Diehard Count the 1s (stream) Test

What it measures
Distribution of 1-bits in consecutive bit streams.
How it works
Counts 1-bits in overlapping windows of specific sizes and tests whether the distribution matches theoretical expectations for random data.
Technical parameters
ntup=0
Why it matters
Bit frequency distribution is fundamental to randomness assessment and can reveal bias in bit generation.
What PASSED means
Appropriate distribution of 1-bits throughout the data streams, confirming balanced bit generation.

Test 9: Diehard Count the 1s Test (byte)

What it measures
Distribution of 1-bits within individual bytes.
How it works
Analyses how many 1-bits appear in each byte value (0–8 ones per byte) and compares the distribution to theoretical expectations.
Technical parameters
ntup=0
Why it matters
Byte-level analysis can detect patterns that might be masked when analysing larger data blocks.
What PASSED means
Byte-level bit distribution matches random expectations, confirming appropriate entropy at the byte scale.
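
The per-byte counting step can be sketched as below. For a uniformly random byte, the number of 1-bits follows a binomial distribution, so exactly k ones occur with probability C(8, k)/256. The sketch covers only this counting step, not the statistic Dieharder derives from it, and the names are hypothetical.

```python
from collections import Counter
from math import comb

# Probability that a uniformly random byte contains exactly k one-bits.
EXPECTED = {k: comb(8, k) / 256 for k in range(9)}


def ones_histogram(data: bytes) -> Counter:
    """Count how many bytes contain 0, 1, ..., 8 one-bits."""
    return Counter(bin(byte).count("1") for byte in data)
```

Feeding every possible byte value once reproduces the binomial coefficients exactly: one byte with no 1-bits, eight with a single 1-bit, and 70 with four.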

Test 10: Diehard Parking Lot Test

What it measures
2D spatial distribution patterns using a parking lot analogy.
How it works
Places “cars” (data points) randomly in a 2D space and measures parking success rates. The test simulates trying to park cars of specific sizes and counts successful placements.
Technical parameters
ntup=0
Why it matters
Spatial distribution tests can detect clustering patterns that other tests might miss.
What PASSED means
The entropy exhibits appropriate 2D spatial randomness properties, with no unexpected clustering or avoidance patterns.

Test 11: Diehard 2D Sphere Test

What it measures
Distribution of points within a 2D circular space.
How it works
Maps data to 2D coordinates within a unit circle and analyses distribution uniformity, examining whether points are appropriately distributed throughout the 2D circular space.
Technical parameters
ntup=2
Why it matters
Circular spatial analysis can detect patterns that rectangular spatial tests might miss.
What PASSED means
No unexpected clustering in 2D circular representation of the data, confirming appropriate spatial distribution.

Test 12: Diehard 3D Sphere Test

What it measures
3D spatial distribution within a unit sphere.
How it works
Maps data points to 3D coordinates within a sphere and tests distribution uniformity, examining whether points are appropriately distributed throughout the 3D space.
Technical parameters
ntup=3
Why it matters
Higher-dimensional spatial analysis can detect patterns that might not be apparent in 2D tests.
What PASSED means
Appropriate 3D spatial distribution properties confirmed, demonstrating randomness in higher-dimensional space.

Test 13: Diehard Squeeze Test

What it measures
Compressibility and information-theoretic properties.
How it works
Uses a mathematical “squeeze” operation to test for hidden patterns that might make data compressible. The test examines how much data can be “compressed” using specific mathematical operations.
Technical parameters
ntup=0
Why it matters
Truly random data should be incompressible. Any compressibility suggests the presence of patterns.
What PASSED means
The entropy data resists compression as expected for random sequences, confirming high information content.

Test 15: Diehard Runs Test

What it measures
Consecutive identical bit patterns (runs of 0s and 1s).
How it works
Counts run lengths of consecutive identical bits and compares the distribution to theoretical expectations for random bit sequences.
Technical parameters
ntup=0
Why it matters
Run-length analysis can detect bias towards longer or shorter sequences of identical bits.
What PASSED means
Run-length patterns match those expected from random bit sequences, confirming appropriate bit transition behaviour.
Note
This test produces 2 results: one for runs of 0s and one for runs of 1s.

Test 16: Diehard Craps Test

What it measures
Complex sequence analysis using dice game simulation.
How it works
Simulates craps games using the entropy data and analyses win/loss patterns and the number of throws required to reach decisions.
Technical parameters
ntup=0
Why it matters
Game simulation tests complex sequential relationships that individual statistical tests might not detect.
What PASSED means
Complex sequential patterns behave as expected for random data, confirming sophisticated randomness properties.
Note
This test produces 2 results (wins and throws to decision). During execution, this test typically causes one file rewind.
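
The game logic behind this test can be sketched as follows. The rules are those of standard craps, for which the probability of winning is 244/495. How Dieharder turns raw data into dice rolls is not described here, so `dice_from_bytes` is an assumed mapping (rejection sampling on single bytes), and both function names are hypothetical.

```python
from typing import Iterator, Tuple

EXPECTED_WIN_PROBABILITY = 244 / 495  # about 0.4929 for fair dice


def dice_from_bytes(data: bytes) -> Iterator[int]:
    """Yield die rolls 1..6, discarding bytes >= 252 to avoid modulo bias."""
    for byte in data:
        if byte < 252:  # 252 = 6 * 42, so each face maps to 42 byte values
            yield byte % 6 + 1


def play_craps(dice: Iterator[int]) -> Tuple[bool, int]:
    """Play one game and return (won, number_of_throws)."""
    point = next(dice) + next(dice)
    throws = 1
    if point in (7, 11):
        return True, throws
    if point in (2, 3, 12):
        return False, throws
    while True:  # keep throwing until the point or a 7 appears
        total = next(dice) + next(dice)
        throws += 1
        if total == point:
            return True, throws
        if total == 7:
            return False, throws
```

Over many games, the two quantities the note mentions correspond to the observed win count and the distribution of `number_of_throws`. `play_craps` raises `StopIteration` if the dice source runs out mid-game.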

Test 17: Marsaglia and Tsang GCD Test

What it measures
Greatest common divisor patterns in integer sequences.
How it works
Applies number theory analysis to detect mathematical patterns by examining the GCD relationships between pairs of integers derived from the data.
Technical parameters
ntup=0
Why it matters
Number-theoretic analysis can detect mathematical relationships that other statistical tests might miss.
What PASSED means
No unexpected mathematical relationships in the entropy data, confirming randomness from a number theory perspective.
Note
This test produces 2 results covering different GCD analyses.
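
As an illustration of the quantities involved, the sketch below runs Euclid's algorithm on a pair of integers and reports both the GCD and the number of division steps taken. Reading the two results as the GCD values and the step counts is an assumption on our part, and the function name is hypothetical. For large random integers, the probability that a pair is coprime approaches 6/π².

```python
from math import pi

# Limiting probability that two random integers share no common factor.
COPRIME_PROBABILITY = 6 / pi ** 2  # about 0.6079


def gcd_with_steps(a: int, b: int) -> tuple[int, int]:
    """Euclid's algorithm, returning (gcd, number_of_division_steps)."""
    steps = 0
    while b:
        a, b = b, a % b
        steps += 1
    return a, steps
```

For example, `gcd_with_steps(48, 18)` returns `(6, 3)`: the remainders are 12, 6, and 0.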

Complete Test Suite Inventory

All 31 Dieharder tests with Eormen's inclusion and exclusion decisions: 14 included and 17 excluded, with the rationale for each.

Test ID | Test Name | Official Rating | Included | Rationale

Original Diehard Tests
0 | Diehard Birthdays Test | Good | ✓ Yes | Core randomness test, efficient for 1 GiB
1 | Diehard OPERM5 Test | Good | ✓ Yes | Permutation analysis, suitable data usage
2 | Diehard 32×32 Binary Rank Test | Good | ✓ Yes | Matrix rank analysis, efficient implementation
3 | Diehard 6×8 Binary Rank Test | Good | ✓ Yes | Complementary matrix analysis
4 | Diehard Bitstream Test | Good | ✓ Yes | Bit pattern analysis, reasonable data usage
5 | Diehard OPSO Test | Suspect | ✗ No | Official rating “Suspect”
6 | Diehard OQSO Test | Suspect | ✗ No | Official rating “Suspect”
7 | Diehard DNA Test | Suspect | ✗ No | Official rating “Suspect”
8 | Diehard Count the 1s (stream) Test | Good | ✓ Yes | Fundamental bit analysis, efficient
9 | Diehard Count the 1s Test (byte) | Good | ✓ Yes | Byte-level analysis, minimal data usage
10 | Diehard Parking Lot Test | Good | ✓ Yes | Spatial analysis, suitable for 1 GiB
11 | Diehard 2D Sphere Test | Good | ✓ Yes | 2D circular spatial analysis, efficient
12 | Diehard 3D Sphere Test | Good | ✓ Yes | 3D spatial analysis, manageable data usage
13 | Diehard Squeeze Test | Good | ✓ Yes | Information theory, efficient implementation
14 | Diehard Sums Test | Do Not Use | ✗ No | Official rating “Do Not Use”
15 | Diehard Runs Test | Good | ✓ Yes | Run analysis, fundamental test
16 | Diehard Craps Test | Good | ✓ Yes | Sequential analysis, reasonable data usage
17 | Marsaglia and Tsang GCD Test | Good | ✓ Yes | Number theory, efficient for 1 GiB

STS (Statistical Test Suite) Tests
100 | STS Monobit Test | Good | ✗ No | Duplicates analysis in included tests
101 | STS Runs Test | Good | ✗ No | Duplicates Test 15 analysis
102 | STS Serial Test (Generalised) | Good | ✗ No | Excessive data consumption for 1 GiB

RGB (Robert G. Brown) Tests
200 | RGB Bit Distribution Test | Good | ✗ No | Extremely data-intensive, causes rewinding
201 | RGB Generalised Minimum Distance Test | Good | ✗ No | Data-hungry, designed for continuous streams
202 | RGB Permutations Test | Good | ✗ No | Excessive data requirements
203 | RGB Lagged Sum Test | Good | ✗ No | Data-intensive, rewinding risk
204 | RGB Kolmogorov-Smirnov Test | Good | ✗ No | High data consumption
205 | Byte Distribution | Good | ✗ No | Data-hungry implementation

DAB (Data Analysis Battery) Tests
206 | DAB DCT | Good | ✗ No | Designed for larger datasets
207 | DAB Fill Tree Test | Good | ✗ No | Excessive data requirements
208 | DAB Fill Tree 2 Test | Good | ✗ No | High data consumption
209 | DAB Monobit 2 Test | Good | ✗ No | Data-intensive for 1 GiB files

Total available tests: 31
Tests included: 14 (52% of the 27 “Good”-rated tests)
Individual results produced: 17 (some tests produce multiple p-values)

Technical Implementation

Parameters Used

Each test execution uses the following carefully optimised parameters:

    -d [test_number]   # Specific test identifier
    -g 201             # file_input_raw generator (optimal for binary files)
    -f [filename]      # Path to the entropy block file
    -p 20              # 20 p-value samples (balanced for statistical power and data conservation)

Generator 201 (file_input_raw)

Specifically designed for binary entropy files, providing direct access to raw data without unnecessary transformations.

20 P-value Samples

Reduced from the default 100 to conserve data whilst maintaining adequate statistical power for reliable results. This prevents excessive file rewinding that could compromise test independence.

Individual Test Execution

Each test runs separately to provide detailed individual results and prevent interference between tests.
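
One way to drive such separate runs is sketched below. The flags are the ones listed under Parameters Used. Everything else is an assumption: the executable is taken to be named `dieharder` and to be on the PATH, the function names are hypothetical, and `path` is assumed to point at a file holding only the entropy data, since how the metadata is excluded is not described here.

```python
import subprocess
from typing import Dict, List

# The 14 selected test IDs from this document.
SELECTED_TESTS = [0, 1, 2, 3, 4, 8, 9, 10, 11, 12, 13, 15, 16, 17]


def build_command(test_id: int, path: str, psamples: int = 20) -> List[str]:
    """Assemble the argument list for a single Dieharder test run."""
    return [
        "dieharder",          # assumption: executable name on the PATH
        "-d", str(test_id),   # specific test identifier
        "-g", "201",          # file_input_raw generator
        "-f", path,           # file to read
        "-p", str(psamples),  # number of p-value samples
    ]


def run_suite(path: str) -> Dict[int, str]:
    """Run each selected test in its own process and collect the raw output."""
    outputs = {}
    for test_id in SELECTED_TESTS:
        completed = subprocess.run(
            build_command(test_id, path),
            capture_output=True, text=True, check=True,
        )
        outputs[test_id] = completed.stdout
    return outputs
```

Launching one process per test keeps each report separate and means that each run starts reading from the beginning of the file.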

Understanding the Test Results

Result Categories

  • PASSED: The test found no statistical evidence of non-random patterns.
  • WEAK: Borderline result, worth noting but not necessarily problematic. Even an ideal random source occasionally produces a p-value near the extremes, so an occasional WEAK result is expected across many tests.
  • FAILED: Test suggests potential non-random patterns requiring investigation.

Expected Performance

  • Runtime: Approximately 15–30 minutes for the complete 14-test suite.
  • Total results: 17 individual test results (some tests produce multiple p-values).
  • File rewinds: Typically 1 rewind (during the diehard_craps test).
  • Data usage: Efficient usage of the 1 GiB block with minimal repetition.

Overall Assessment Guidelines

  • All PASSED: Excellent randomness properties confirmed.
  • Mostly PASSED with occasional WEAK: Good randomness with minor variations.
  • Multiple FAILED: Concerning results requiring investigation of entropy source.
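
The guidelines can be expressed as a small classifier. This is a sketch with hypothetical names and return labels. The guidelines do not say how to treat exactly one FAILED result or how many WEAK results still count as occasional, so the sketch returns "unclassified" for a single FAILED and does not limit the number of WEAK results.

```python
from collections import Counter
from typing import Iterable

VALID = {"PASSED", "WEAK", "FAILED"}


def overall_assessment(results: Iterable[str]) -> str:
    """Map individual result categories to one overall label."""
    counts = Counter(results)
    if not counts:
        raise ValueError("no results supplied")
    unknown = set(counts) - VALID
    if unknown:
        raise ValueError(f"unknown result categories: {sorted(unknown)}")
    if counts["FAILED"] >= 2:
        return "multiple-failed"   # investigate the entropy source
    if counts["FAILED"] == 1:
        # Assumption: a single FAILED is not covered by the guidelines.
        return "unclassified"
    if counts["WEAK"] == 0:
        return "all-passed"
    return "passed-with-weak"      # PASSED results plus some WEAK results
```

For a full run, `results` would hold the 17 individual outcomes.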

Strengths and Limitations

Strengths

  • Systematic test selection: All practical “Good”-rated tests from the complete 31-test suite.
  • Official compliance: Uses only tests approved by Dieharder developers.
  • Comprehensive coverage: 14 tests examine all fundamental randomness properties.
  • Statistical independence: Minimal file rewinding preserves test validity.
  • Transparent methodology: Complete disclosure of test selection rationale.
  • Academic credibility: Official Duke University Dieharder distribution.

Limitations

  • 13 “Good” tests excluded: STS, RGB, and DAB tests omitted due to excessive data requirements or duplicated coverage.
  • Reduced samples: 20 p-values per test (vs. 100 default) to conserve data.
  • Finite block testing: Results are specific to the tested block, not predictive of future generations.

File Authenticity

The results file includes cryptographic verification using a three-tier hashing system (three SHA-256 hashes), recorded together with the nonce and timestamp of the generation session:

  • Nonce: Links results to the specific EORM generation session.
  • Data-only SHA-256: Verifies the entropy data independently. Remains constant for reproductions.
  • Complete file SHA-256: Verifies the entire file has not been modified.
  • Metadata-only SHA-256: Verifies the metadata section.
  • Timestamp: Unix timestamp confirming when the entropy was generated.

This three-tier approach ensures both data integrity and proper attribution to the original generation session.
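
The three hashes can be computed as in the sketch below, which uses Python's standard `hashlib`. The function names are hypothetical, and the complete file is assumed to be the entropy data followed by the metadata. The sketch takes bytes in memory for brevity. For a real 1 GiB block the data would normally be read and hashed in chunks.

```python
import hashlib
from typing import Dict


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def block_hashes(entropy: bytes, metadata: bytes) -> Dict[str, str]:
    """Compute the data-only, metadata-only, and complete-file SHA-256."""
    complete = hashlib.sha256()
    # Assumption: the file is laid out as entropy data, then metadata.
    complete.update(entropy)
    complete.update(metadata)
    return {
        "data_only": sha256_hex(entropy),
        "metadata_only": sha256_hex(metadata),
        "complete_file": complete.hexdigest(),
    }
```

Because the data-only hash ignores the metadata, two files with identical entropy data but different metadata share that hash whilst differing in the other two.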
