Eormen Full-Block Validation Protocol

v6.0.0

How Eormen’s internal validation suite tests every single byte of a 1 GiB entropy block: 8 statistical categories, zero sampling, with full methodology and scoring criteria.

This documentation is published by Eormen and reproduced here in full. Eormen generates and certifies entropy blocks independently. ScirDom publishes this documentation so that anyone can understand the testing that was done before an entropy block was activated.

Test categories: 8
Sampling: every byte analysed
Chunk uniqueness max score: 1000

What makes this suite different from Dieharder and NIST: Both the Dieharder and NIST SP 800-22 suites examine statistical samples drawn from the entropy block. The Eormen Internal Validation Suite processes every single byte of the full 1 GiB with no sampling at all. Only blocks that pass all 8 categories are delivered.

Overview

The Eormen Internal Validation Suite performs exhaustive statistical validation of the 1 GiB entropy blocks produced by the Eormen entropy generation system. It processes every byte of the block without sampling and reports statistical measurements that characterise the randomness properties of the data. It also includes a chunk uniqueness check to verify that no portion of the block repeats elsewhere within the same block.

Purpose

This validation suite accompanies each generated entropy block to provide objective statistical measurements of its randomness characteristics. The suite performs multiple independent analyses to ensure comprehensive coverage of different statistical properties that characterise high-quality random data, including verification that no significant portions of data repeat within the block.

File Requirements

The validation suite expects entropy block files with the following structure:

Section	Size	Description
Data section	Exactly 1,073,741,824 bytes (1 GiB)	Entropy data to be tested
Metadata section	64 bytes	Block identification and generation details, appended after entropy data
Total file size	1,073,741,888 bytes
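
The layout in the table can be checked before any analysis starts. The following is a minimal sketch, not the suite's own code: the function name and its parameters are hypothetical, and only the two sizes come from the table.

```python
import os

DATA_SIZE = 1_073_741_824   # 1 GiB of entropy data, per the table
METADATA_SIZE = 64          # metadata appended after the entropy data


def check_block_layout(path, data_size=DATA_SIZE, metadata_size=METADATA_SIZE):
    """Return the (start, end) byte ranges of the data and metadata sections.

    Raises ValueError when the file size does not match the expected total.
    """
    actual = os.path.getsize(path)
    expected = data_size + metadata_size
    if actual != expected:
        raise ValueError(f"expected {expected} bytes, found {actual}")
    return (0, data_size), (data_size, expected)
```

The sizes are parameters only so that the check can be exercised on small test files; with the defaults, the expected total is 1,073,741,888 bytes.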

The 8 Statistical Categories

Every byte of the 1 GiB block passes through all 8 of the following analyses. A block must pass all 8 to be delivered.

Category 1

Frequency Distribution Analysis

Examines the distribution of byte values (0–255) throughout the entire block. For a block to pass, all 256 possible byte values must appear with near-perfect uniformity.

Chi-square test: Measures deviation from the expected uniform distribution.
Uniformity index: Calculated using the Gini coefficient.
Maximum deviation: Largest deviation of any byte value's relative frequency from the expected 1/256.
Standard deviation: Variability of the 256 byte-value frequencies.
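
The four measurements above can be illustrated on an in-memory byte string. This is a sketch under assumptions: the function name and result keys are hypothetical, the source does not give the exact Gini formulation or any pass thresholds, and a production run would accumulate the counts while streaming the block.

```python
import math
from collections import Counter


def frequency_stats(data: bytes) -> dict:
    """Byte-frequency statistics for a non-empty byte string."""
    n = len(data)
    counts = Counter(data)
    observed = [counts.get(value, 0) for value in range(256)]
    expected = n / 256

    # Pearson chi-square statistic (255 degrees of freedom).
    chi_square = sum((o - expected) ** 2 / expected for o in observed)

    freqs = [o / n for o in observed]
    max_deviation = max(abs(f - 1 / 256) for f in freqs)
    # Population standard deviation; the mean frequency is always 1/256.
    std_dev = math.sqrt(sum((f - 1 / 256) ** 2 for f in freqs) / 256)

    # Gini coefficient of the counts: 0 for a perfectly even distribution,
    # 255/256 when a single byte value carries all the mass.
    ordered = sorted(observed)
    weighted = sum((i + 1) * c for i, c in enumerate(ordered))
    gini = 2 * weighted / (256 * n) - 257 / 256

    return {
        "chi_square": chi_square,
        "uniformity_gini": gini,
        "max_deviation": max_deviation,
        "std_dev": std_dev,
    }
```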

Category 2

Entropy Measurements

Calculates information-theoretic entropy at multiple scales throughout the full block. An ideal uniform source has an entropy of exactly 8 bits per byte; the empirical value measured on a finite block approaches that limit from below.

Shannon entropy: Full-block entropy measurement (bits per byte).
Multi-scale analysis: Entropy calculated for block sizes of 1 KiB, 4 KiB, 16 KiB, 64 KiB, 256 KiB, and 1 MiB.
Bit-position entropies: Entropy for each bit position (0–7).
Byte-pair entropy: Second-order entropy from consecutive byte pairs.
Conditional entropy: Measures predictability based on previous bytes.
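
As an illustration of the measurements listed above, the sketch below computes empirical entropies for an in-memory byte string. The function names and result keys are hypothetical. Two choices are assumptions rather than documented behaviour: byte pairs are taken as overlapping consecutive pairs, and conditional entropy is first-order, derived as H(pair) minus H(previous byte).

```python
import math
from collections import Counter


def shannon_entropy(symbols) -> float:
    """Empirical Shannon entropy, in bits per symbol, of an iterable."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def entropy_report(data: bytes) -> dict:
    """Byte, bit-position, byte-pair and conditional entropy of `data`."""
    pair_h = shannon_entropy(zip(data, data[1:]))
    return {
        "shannon": shannon_entropy(data),
        # One value per bit position, 0 = least significant bit.
        "bit_positions": [
            shannon_entropy((b >> k) & 1 for b in data) for k in range(8)
        ],
        "byte_pair": pair_h,
        # Chain rule: H(next | previous) = H(previous, next) - H(previous).
        "conditional": pair_h - shannon_entropy(data[:-1]),
    }
```

The multi-scale analysis would apply `shannon_entropy` to each consecutive window of the listed sizes; how the per-window values are aggregated is not stated in the source.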

Category 3

Correlation Analysis

Tests for dependencies between bytes at 17 different distances. In a truly random block, no relationship should exist between any byte and any other byte, at any distance.

Lag correlations: Calculated for lags of 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1,024, 2,048, 4,096, 8,192, 16,384, 32,768, and 65,536 bytes.
Maximum absolute correlation: Highest correlation value found across all lags.
Correlation decay rate: How quickly correlations decrease with lag distance.
Significant correlations: Correlations exceeding theoretical threshold.
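
A lag correlation of the kind listed above can be sketched as a Pearson correlation between the block and a shifted copy of itself. The names below are hypothetical, and the significance threshold is an assumption (a z-score bound for independent samples); the source does not state which theoretical threshold the suite uses.

```python
import math

# The 17 lags listed above: 1, 2, 4, ..., 65,536 bytes.
LAGS = [2 ** k for k in range(17)]


def lag_correlation(data: bytes, lag: int) -> float:
    """Pearson correlation between data[i] and data[i + lag]."""
    n = len(data) - lag
    if n < 2:
        raise ValueError("data too short for this lag")
    x, y = data[:n], data[lag:]
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return math.nan  # constant input: the correlation is undefined
    return cov / math.sqrt(var_x * var_y)


def significance_threshold(n: int, z: float = 3.0) -> float:
    """Assumed bound: for independent samples, r is roughly N(0, 1/n)."""
    return z / math.sqrt(n)
```

The maximum absolute correlation is then `max(abs(lag_correlation(data, lag)) for lag in LAGS)` for data longer than the largest lag.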

Category 4

Spectral Analysis

Examines frequency domain properties using Fast Fourier Transform. Truly random data produces a flat “white noise” spectrum with no dominant frequencies.

Total spectral power: Energy distribution across frequencies.
Spectral flatness: Measure of “whiteness” of the spectrum.
Peak-to-average ratio: Identifies any dominant frequencies.
FFT window size: 1,048,576 bytes (1 MiB).
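
The spectral measurements can be illustrated with a small pure-Python FFT. This is a sketch only: the function names are hypothetical, the recursive transform is far too slow for a 1 MiB window (an optimised FFT library would be used in practice), and removing the mean and excluding the DC bin are assumptions about preprocessing that the source does not describe.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def spectral_stats(window: bytes) -> dict:
    """Total power, spectral flatness and peak-to-average ratio of a window."""
    n = len(window)
    if n < 4 or n & (n - 1):
        raise ValueError("window length must be a power of two, at least 4")
    mean = sum(window) / n
    spectrum = fft([b - mean for b in window])
    # Real input has a symmetric spectrum, so bins 1 .. n/2 - 1 are enough.
    power = [abs(c) ** 2 for c in spectrum[1:n // 2]]
    total = sum(power)
    if total == 0:
        raise ValueError("constant window has no spectral content")
    average = total / len(power)
    # Flatness = geometric mean / arithmetic mean; the floor avoids log(0).
    log_mean = sum(math.log(max(p, 1e-300)) for p in power) / len(power)
    return {
        "total_power": total,
        "flatness": math.exp(log_mean) / average,
        "peak_to_average": max(power) / average,
    }
```

For white noise the flatness computed this way settles well above that of a signal with a dominant frequency, whose flatness falls towards zero while its peak-to-average ratio grows.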

Category 5

Pattern Analysis

Detects and quantifies patterns in the data at multiple levels. Covers both run-length behaviour and template matching across the full block.

Run length distributions: For each bit position, counts the runs of consecutive 0s or 1s by length.
Longest run: Maximum length of consecutive identical bits.
Template matching: Counts occurrences of 9-bit patterns.
Approximate entropy: Measures pattern complexity.
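
Run-length and template counting can be sketched as follows. The function names are hypothetical, and two details are assumptions: templates are counted as overlapping 9-bit windows, and the bit stream is read most significant bit first within each byte. The source specifies neither.

```python
from collections import Counter
from itertools import groupby


def bit_plane(data: bytes, position: int):
    """Yield bit `position` (0 = least significant) of every byte, in order."""
    return ((b >> position) & 1 for b in data)


def run_length_histogram(bits) -> Counter:
    """Map run length -> number of runs of identical consecutive bits."""
    return Counter(sum(1 for _ in group) for _, group in groupby(bits))


def longest_run(bits) -> int:
    """Length of the longest run of identical bits (0 for empty input)."""
    histogram = run_length_histogram(bits)
    return max(histogram) if histogram else 0


def template_counts(data: bytes, width: int = 9) -> Counter:
    """Count every overlapping `width`-bit pattern in the bit stream."""
    mask = (1 << width) - 1
    counts, window, seen = Counter(), 0, 0
    for byte in data:
        for shift in range(7, -1, -1):  # most significant bit first
            window = ((window << 1) | ((byte >> shift) & 1)) & mask
            seen += 1
            if seen >= width:
                counts[window] += 1
    return counts
```

For example, `longest_run(bit_plane(data, 0))` gives the longest run in the least significant bit position.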

Category 6

Compression Tests

Evaluates compressibility using three widely used compression algorithms, each at its maximum compression setting. Truly random data cannot be compressed, so any achieved size reduction indicates the presence of patterns.

zlib: Deflate algorithm at maximum compression (level 9).
bzip2: Burrows-Wheeler transform at maximum compression (level 9).
lzma: Lempel-Ziv-Markov chain algorithm at maximum compression (level 9).
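
All three algorithms are available in the Python standard library, so the measurement can be sketched directly. The function name, the ratio definition (compressed size divided by original size) and the single `level` parameter are assumptions; the source does not say how the suite reports or thresholds the result.

```python
import bz2
import lzma
import zlib


def compression_ratios(data: bytes, level: int = 9) -> dict:
    """Compressed size divided by original size, per algorithm.

    `level` is passed as the zlib level, the bzip2 compresslevel and the
    LZMA preset; 9 is the maximum for all three.
    """
    if not data:
        raise ValueError("data must be non-empty")
    compressed = {
        "zlib": zlib.compress(data, level),
        "bzip2": bz2.compress(data, compresslevel=level),
        "lzma": lzma.compress(data, preset=level),
    }
    return {name: len(blob) / len(data) for name, blob in compressed.items()}
```

A ratio at or slightly above 1.0 means the container overhead outweighed any saving, which is the expected outcome for random input. One-shot calls like these hold the whole input in memory; a streaming variant would use `zlib.compressobj`, `bz2.BZ2Compressor` and `lzma.LZMACompressor`.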

Category 7

Binary Matrix Rank Analysis

Tests the rank distribution of binary matrices formed from the data against theoretical predictions. Linear dependencies in the data would cause deviations from the expected rank distribution.

Matrix size: 32×32 bits over the Galois field GF(2).
Rank distribution: Counts of matrices with each possible rank.
Chi-square test: Comparison against theoretical probabilities.
Theoretical probabilities: Rank 32: 28.88% · Rank 31: 57.76% · Rank 30: 12.84% · Rank 29: 0.52% · Rank ≤28: ~0.0047%.
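
The rank computation and the theoretical probabilities can be sketched as below. The function names are hypothetical. How the suite maps bytes to matrix rows is not documented, so the row layout here (four consecutive bytes per row, big-endian, 128 bytes per matrix) is an assumption. The probability function uses the standard large-matrix limit for random square matrices over GF(2), which matches the listed values to the precision shown.

```python
def gf2_rank(rows, width=32):
    """Rank over GF(2) of a matrix given as one `width`-bit integer per row."""
    rows = list(rows)
    rank = 0
    for bit in reversed(range(width)):
        pivot = next(
            (i for i in range(rank, len(rows)) if (rows[i] >> bit) & 1), None
        )
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # XOR is addition in GF(2): clear this bit from every other row.
        for i in range(len(rows)):
            if i != rank and (rows[i] >> bit) & 1:
                rows[i] ^= rows[rank]
        rank += 1
    return rank


def matrices_from_bytes(data: bytes, size=32):
    """Yield consecutive size x size bit matrices; a partial tail is dropped."""
    row_bytes = size // 8
    block = row_bytes * size
    for start in range(0, len(data) - block + 1, block):
        yield [
            int.from_bytes(
                data[start + r * row_bytes:start + (r + 1) * row_bytes], "big"
            )
            for r in range(size)
        ]


def rank_probability(deficit: int, terms: int = 200) -> float:
    """Limiting probability that a large square matrix has rank n - deficit."""
    p = 2.0 ** -(deficit * deficit)
    for i in range(1, terms):
        factor = 1 - 2.0 ** -i
        p = p * factor if i > deficit else p / factor
    return p
```

The chi-square comparison would then set the observed rank counts against `rank_probability(k)` multiplied by the number of matrices, with the rarest ranks pooled into a single bin.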

Category 8

Chunk Uniqueness Analysis

Verifies that no significant portions of the entropy block repeat anywhere within the same block. Every chunk is hashed with SHA-256, and any two chunks that produce the same digest are treated as duplicates. This test is unique to the Eormen suite and provides direct evidence that the block contains no repeated segments.

Chunk sizes tested: 4,096 bytes (4 KiB), 16,384 bytes (16 KiB), and 65,536 bytes (64 KiB).
Uniqueness verification: Every chunk hashed using SHA-256 to detect duplicates.
Collision rate analysis: Calculates observed collision rate per million chunks.
Theoretical probability: Compares observed duplicates against birthday paradox expectations.
Spatial analysis: Measures separation distances between any duplicate chunks found.
Clustering coefficient: Detects if duplicates cluster in specific regions.
Hash distribution uniformity: Verifies SHA-256 hash prefixes distribute uniformly.
Uniqueness scoring: 0–1,000 point system where 1,000 represents perfect uniqueness.
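
Hash-based duplicate detection of this kind can be sketched as follows. The function names are hypothetical, and the sketch assumes aligned, non-overlapping chunks with any partial trailing chunk ignored; the source does not state the chunk alignment or how the 0 to 1,000 score is derived from these counts, so no scoring formula is shown.

```python
import hashlib
from collections import defaultdict

# Chunk sizes listed above, in bytes.
CHUNK_SIZES = (4096, 16384, 65536)


def find_duplicate_chunks(data: bytes, chunk_size: int) -> dict:
    """Map SHA-256 digest -> offsets, for digests seen at more than one offset."""
    offsets = defaultdict(list)
    for start in range(0, len(data) - chunk_size + 1, chunk_size):
        digest = hashlib.sha256(data[start:start + chunk_size]).digest()
        offsets[digest].append(start)
    return {d: pos for d, pos in offsets.items() if len(pos) > 1}


def collisions_per_million(data: bytes, chunk_size: int) -> float:
    """Repeated chunks (beyond the first occurrence) per million chunks."""
    total = len(data) // chunk_size
    if total == 0:
        return 0.0
    duplicates = find_duplicate_chunks(data, chunk_size)
    extra = sum(len(pos) - 1 for pos in duplicates.values())
    return 1_000_000 * extra / total


def separations(positions):
    """Byte distances between consecutive occurrences of a duplicate chunk."""
    return [b - a for a, b in zip(positions, positions[1:])]
```

At a chunk size of 4,096 bytes a 1 GiB block holds 262,144 chunks, so keeping one 32-byte digest per chunk needs about 8 MiB.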

Scoring thresholds:

Score	Assessment	Meaning
1000	Excellent	No duplicates found at any chunk size
≥ 950	Good	Negligible duplicates
≥ 900	Acceptable	Minimal duplicates within expected range
≥ 800	Concerning	Duplicates approaching threshold
< 800	Failed	Block not suitable for use
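
The threshold table maps directly to a lookup. This sketch covers only the mapping from score to assessment; the function name is hypothetical, and the way the score itself is calculated is not described in the source.

```python
def assess_uniqueness(score: int) -> str:
    """Return the assessment label for a 0-1,000 uniqueness score."""
    if not 0 <= score <= 1000:
        raise ValueError("score must be between 0 and 1000")
    if score == 1000:
        return "Excellent"
    if score >= 950:
        return "Good"
    if score >= 900:
        return "Acceptable"
    if score >= 800:
        return "Concerning"
    return "Failed"
```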

Metadata Extraction and Verification

The suite reads the 64-byte metadata section appended to each block and reports the following identification and hash values:

Nonce: 16-byte unique identifier (displayed as hexadecimal). Links results to this specific generation session.
Data-only SHA-256: Hash of entropy data section for reproduction verification.
Complete file SHA-256: Hash of entire file for integrity verification.
Metadata-only SHA-256: Hash of metadata section.
Generation timestamp: Unix timestamp and UTC datetime.
Original filename: As recorded during generation.

Hashes are reported in the production-standard order (nonce, data-only hash, complete file hash) for consistency across all Eormen documentation.
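
The three hashes can be computed in a single streaming pass over the file. This is a sketch under assumptions: the function name and result keys are hypothetical, and the internal field layout of the 64-byte metadata section is not documented in the source, so the metadata is hashed as an opaque byte string and not parsed.

```python
import hashlib

METADATA_SIZE = 64


def three_tier_hashes(path, data_size=1_073_741_824, buffer_size=1 << 20):
    """Return hex SHA-256 digests of the data, the whole file and the metadata."""
    data_h, file_h, meta_h = hashlib.sha256(), hashlib.sha256(), hashlib.sha256()
    remaining = data_size
    with open(path, "rb") as f:
        # Stream the data section in fixed-size buffers to bound memory use.
        while remaining:
            chunk = f.read(min(buffer_size, remaining))
            if not chunk:
                raise ValueError("file is shorter than the data section")
            data_h.update(chunk)
            file_h.update(chunk)
            remaining -= len(chunk)
        metadata = f.read()
    if len(metadata) != METADATA_SIZE:
        raise ValueError(f"expected {METADATA_SIZE} metadata bytes")
    meta_h.update(metadata)
    file_h.update(metadata)
    return {
        "data_sha256": data_h.hexdigest(),
        "file_sha256": file_h.hexdigest(),
        "metadata_sha256": meta_h.hexdigest(),
    }
```

Two files with identical entropy data but different metadata yield the same `data_sha256` and different `file_sha256` and `metadata_sha256` values.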

Output Format

Results are saved to a JSON file with the naming convention:

validation_results_{base_filename}.json

The JSON output contains:

Block identification with three-tier hash verification.
Validation metadata (timestamp, version, processing statistics).
Complete statistical measurements from all 8 analyses.
Computational parameters used during validation.
Comprehensive chunk uniqueness analysis results including: overall uniqueness score and assessment; per-chunk-size detailed statistics; duplicate positions if any found; statistical significance assessments; and scoring system explanation.

Processing Characteristics

Memory efficient: Uses streaming algorithms with constant memory usage throughout.
State isolation: Complete state reset between validation runs.
No sampling: Every byte of the 1 GiB block is analysed without exception.
Numerical precision: High-precision calculations using Decimal arithmetic where appropriate.
Chunk processing: Efficient sliding window approach for uniqueness verification.
Hash-based detection: SHA-256 used for reliable duplicate detection in the uniqueness analysis.

Interpretation Note

This validation suite provides objective statistical measurements only. No judgements about randomness quality are made by the suite itself. The measurements should be interpreted by qualified analysts familiar with statistical testing of random number generators. The chunk uniqueness analysis provides definitive verification that no significant data portions repeat within the block.

File Authenticity

The results file includes cryptographic verification:

Nonce: Links results to the specific EORM generation session.
Data-only SHA-256: Verifies the entropy data independently.
Complete file SHA-256: Verifies the entire file tested.
Metadata-only SHA-256: Verifies the metadata section.
Timestamp: Confirms when the entropy was generated.
GPG signature: For authentication of the results file.

The three-tier hash system allows the integrity of the entropy data to be verified separately from the metadata. This is essential for confirming that reproduced blocks contain identical entropy whilst carrying different timestamps.