ARJSON Compaction

ARJSON is a novel JSON encoding designed for permanent storage. It uses the following techniques and, in some cases, produces output more than 20x smaller than leading encodings such as MessagePack and CBOR.

  • Bit-level optimization over traditional byte-level encoding: MessagePack and other serialization formats are optimized at the byte level, which often leaves unused bits scattered throughout the encoded data, because values are packed into whole 8-bit bytes to align with standard memory structures. In contrast, ARJSON applies bit-level optimizations with variable-length data units, so that every bit carries information and none is spent on padding.

  • Columnar structure reorganization for enhanced compression: As ARJSON scans JSON data, it dynamically reorganizes it into a columnar structure. Many modern databases achieve high performance by storing values of the same field together, which naturally produces sequences of similar data. This alignment makes value sizes more uniform and predictable, increases redundancy, and minimizes the differences between neighboring values, all of which improve compression efficiency.

  • Delta packing for efficient storage of sequential and repetitive data: ARJSON uses its columnar structure to encode the differences between consecutive values and to pack repeating sequences efficiently. Meaningful data naturally forms patterns (timestamps, counters, structured records), and delta packing exploits these repetitions to reduce storage substantially.

  • Simple, deterministic, and metadata-efficient: Despite its compression capabilities, ARJSON is based on simple deterministic principles. In a single scan, it applies bit-level optimizations, columnar reorganization, and delta packing, without multiple passes or complex heuristics. Because the encoding order is deterministic, the decoder can infer it rather than read it from metadata, which further reduces storage overhead.

  • Minimal, incremental-only updates to immutable data: ARJSON is purpose-built for efficient updates to permanently stored data. It supports minimal, incremental bit-level updates to immutable data, which makes it well suited to frequently updated permanent databases. Instead of rewriting entire documents, ARJSON allows compact, append-only mutations that preserve historical integrity while minimizing storage costs.
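
To make the first three techniques concrete, here is a minimal Python sketch of how columnar regrouping, delta encoding, and bit packing can combine. It is an illustration only and does not reproduce ARJSON's actual wire format: the function names are hypothetical, and fixed-width bit slots stand in for ARJSON's variable-length data units.

```python
from typing import Any


def to_columns(records: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Group values by field name so similar values sit next to each other."""
    columns: dict[str, list[Any]] = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


def delta_encode(values: list[int]) -> list[int]:
    """Keep the first value, then store each later value as a difference."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas: list[int]) -> list[int]:
    """Invert delta_encode by accumulating a running sum."""
    out: list[int] = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def pack_bits(values: list[int], width: int) -> bytes:
    """Pack non-negative integers into `width`-bit slots, ignoring byte boundaries.

    Fixed-width slots are a simplification (an assumption of this sketch).
    """
    acc = 0
    for v in values:
        if v < 0 or v >= (1 << width):
            raise ValueError(f"{v} does not fit in {width} bits")
        acc = (acc << width) | v
    total_bits = width * len(values)
    pad = (-total_bits) % 8  # zero bits appended to fill the final byte
    return (acc << pad).to_bytes((total_bits + pad) // 8, "big")


def unpack_bits(data: bytes, width: int, count: int) -> list[int]:
    """Read `count` slots of `width` bits back out of the packed bytes."""
    acc = int.from_bytes(data, "big")
    total_bits = len(data) * 8
    mask = (1 << width) - 1
    return [(acc >> (total_bits - width * (i + 1))) & mask for i in range(count)]


records = [
    {"ts": 1700000000, "count": 5},
    {"ts": 1700000001, "count": 6},
    {"ts": 1700000002, "count": 7},
    {"ts": 1700000003, "count": 8},
]
columns = to_columns(records)
ts_deltas = delta_encode(columns["ts"])   # [1700000000, 1, 1, 1]
packed = pack_bits(ts_deltas[1:], 1)      # three 1-bit deltas fit in one byte
```

In this example the three timestamp differences occupy three bits in total, whereas a byte-aligned encoding would spend at least one byte on each.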

The WeaveDB device on HyperBEAM, along with external validators (optionally TEE-backed), attests to the correctness of each database state transition. It then bundles the ARJSON-encoded deltas into a new HTTP message while simultaneously computing the zkDB Merkle root hash.

Once the state transitions are finalized, HyperBEAM uploads the message to Arweave, making the updated database state permanently and verifiably stored. This process serves as compaction, analogous to how LSM-tree-based databases compact data into SSTables.
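
The compaction idea can be sketched as folding an append-only log of deltas into a single snapshot. The Python below is a rough illustration under stated assumptions: the `(key, value)` delta shape, the use of `None` to mean deletion, and the function names are all hypothetical, and attestation, the Merkle root computation, and the Arweave upload are left out entirely.

```python
from typing import Any

# Hypothetical delta shape: (field name, new value); a value of None deletes the field.
Delta = tuple[str, Any]


def apply_delta(state: dict[str, Any], delta: Delta) -> dict[str, Any]:
    """Return a new state with one delta applied; the input state is not modified."""
    key, value = delta
    new_state = dict(state)
    if value is None:
        new_state.pop(key, None)
    else:
        new_state[key] = value
    return new_state


def compact(base: dict[str, Any], log: list[Delta]) -> dict[str, Any]:
    """Fold an append-only delta log, in order, into a single snapshot."""
    state = base
    for delta in log:
        state = apply_delta(state, delta)
    return state


base = {"name": "alice", "age": 30}
log: list[Delta] = [("age", 31), ("city", "Tokyo"), ("age", 32), ("name", None)]
snapshot = compact(base, log)  # {"age": 32, "city": "Tokyo"}
```

The log itself is never rewritten, so earlier states remain recoverable by replaying a prefix of it, while readers can start from the compacted snapshot instead of the full history.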