ox is a modular command-line toolkit written in C for compressing, encoding, analyzing, and profiling symbolic sequences.
It includes bit-packing, entropy metrics, histogramming, finite-context modeling, and CRC-based hashing.
git clone https://github.com/cobilab/ox
cd ox/src/
make
./ox <command> [options]
Generate random symbolic sequences.
./ox generate [-s <size>] [-c <cardinality>] [-e <seed>] <filename>
-s: sequence size
-c: alphabet cardinality (0–255)
-e: random seed
Bit-pack sequences using 2-bit (ABCD) or 4-bit (A–P) encodings.
./ox pack2 pack <input> <output>
./ox pack2 unpack <input> <output>
./ox pack4 pack <input> <output>
./ox pack4 unpack <input> <output>
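To illustrate the idea behind 2-bit packing, here is a minimal Python sketch that maps A, B, C, D to 2-bit codes and stores four symbols per byte. It is not ox's implementation: the bit order, the length header, and the function names `pack2` and `unpack2` are assumptions made for this example, and the actual file format written by `ox pack2` may differ.

```python
import struct

CODES = {"A": 0, "B": 1, "C": 2, "D": 3}  # assumed symbol-to-code mapping
SYMBOLS = "ABCD"


def pack2(seq: str) -> bytes:
    """Pack a string over {A, B, C, D} into 2 bits per symbol.

    An 8-byte little-endian length header is prepended (an assumption of
    this sketch) so that padding in the last byte can be dropped on unpack.
    """
    out = bytearray(struct.pack("<Q", len(seq)))
    for i in range(0, len(seq), 4):
        byte = 0
        for j, sym in enumerate(seq[i:i + 4]):
            byte |= CODES[sym] << (6 - 2 * j)  # first symbol in the high bits
        out.append(byte)
    return bytes(out)


def unpack2(data: bytes) -> str:
    """Invert pack2: read the length header, then expand each byte."""
    (n,) = struct.unpack("<Q", data[:8])
    out = []
    for byte in data[8:]:
        for shift in (6, 4, 2, 0):
            out.append(SYMBOLS[(byte >> shift) & 3])
    return "".join(out[:n])  # discard padding symbols
```

With this layout, a sequence of n symbols takes 8 + ceil(n / 4) bytes, roughly a quarter of the one-byte-per-symbol text form. The 4-bit variant would follow the same pattern with two symbols per byte.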
Encode/decode sequences with a custom XRC-256 codec: an order-0 model followed by a range coder.
./ox xrc-256 encode <input> <output>
./ox xrc-256 decode <input> <output>
Compute Shannon entropy of binary input.
./ox entropy [-v] <filename>
-v: verbose output (byte frequencies and count)
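For reference, the order-0 Shannon entropy of a byte stream can be sketched in a few lines of Python. This is an illustration of the quantity, not ox's code; the function name `shannon_entropy` and the choice of bits per byte as the unit are assumptions, since the output normalization of `ox entropy` is not stated here.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the order-0 Shannon entropy of data, in bits per byte."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)  # byte value -> number of occurrences
    return -sum(c / n * math.log2(c / n) for c in counts.values())
```

A file with a single repeated byte value yields 0 bits per byte, while a file in which all 256 byte values are equally frequent yields the maximum of 8.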
Analyze the distribution of values in a file (supports 8-bit and 16-bit values).
./ox histogram [-h] [-t 8|16] [-w <width>] [-p] <filename>
-t: data type (8 or 16 bits)
-w: histogram width
-p: plot instead of raw values
-h: hide zero-count bins
Measure pattern distances in a sequence.
./ox distance -t <pattern> <filename>
-t: pattern (e.g., RRR, EXFGGHH)
Compute CRC32 checksum.
./ox crc32-hash <filename>
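As a cross-check, a CRC-32 can also be computed with Python's standard `zlib` module. This sketch assumes the common CRC-32 variant used by zlib and gzip; which variant `ox crc32-hash` implements is not stated here, so the values may not match. The helper names `crc32_chunks` and `crc32_of_file` are made up for this example.

```python
import zlib
from typing import Iterable


def crc32_chunks(chunks: Iterable[bytes]) -> int:
    """Fold a sequence of byte chunks into one running CRC-32 value."""
    crc = 0
    for chunk in chunks:
        crc = zlib.crc32(chunk, crc)  # second argument continues the checksum
    return crc & 0xFFFFFFFF


def crc32_of_file(path: str, chunk_size: int = 1 << 16) -> int:
    """Compute the CRC-32 of a file without loading it fully into memory."""
    with open(path, "rb") as fh:
        return crc32_chunks(iter(lambda: fh.read(chunk_size), b""))
```

Processing the file in chunks keeps memory use constant, which matters for large sequence files.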
Estimate local complexity using a finite-context model.
./ox profile [-k <ctx>] [-a <alphaDen>] [-w <window>] <filename>
-k: model context order
-a: smoothing parameter (1/a)
-w: sliding window size
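The idea of a finite-context complexity profile can be sketched as follows: an order-k model predicts each symbol from the k symbols before it, the cost of each symbol is -log2 of its estimated probability, and the costs are smoothed over a window. This Python sketch is an illustration only. The adaptive count updates, the estimator (n_s + alpha) / (n_ctx + alpha * |A|) with alpha = 1/a, the trailing moving average, and the function name `profile` are all assumptions, not a description of how `ox profile` works internally.

```python
import math
from collections import defaultdict


def profile(seq: str, k: int = 2, alpha_den: int = 16, window: int = 5):
    """Return a smoothed per-symbol complexity profile, in bits."""
    alphabet_size = len(set(seq))
    alpha = 1.0 / alpha_den
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol -> count
    totals = defaultdict(int)                        # context -> total count
    bits = []
    for i, sym in enumerate(seq):
        ctx = seq[max(0, i - k):i]  # up to k preceding symbols
        p = (counts[ctx][sym] + alpha) / (totals[ctx] + alpha * alphabet_size)
        bits.append(-math.log2(p))
        counts[ctx][sym] += 1       # update the model after predicting
        totals[ctx] += 1

    # Trailing moving average over the last `window` values.
    smoothed = []
    running = 0.0
    for i, b in enumerate(bits):
        running += b
        if i >= window:
            running -= bits[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed
```

On a highly repetitive input such as "AB" repeated many times, the profile starts near 1 bit per symbol and drops toward 0 as the model learns the pattern; low-complexity regions of a real sequence show up as similar dips.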
Print predefined analysis pipelines.
./ox pipelines
Example pipeline for DNA compression and decompression:
#!/bin/bash
grep -v '>' DNA.fa | tr -d -c 'ACGT' | tr 'ACGT' 'ABCD' > A.seq
./ox pack2 pack A.seq A.packed
./ox xrc-256 encode A.packed A.encoded
./ox xrc-256 decode A.encoded A.decoded
./ox pack2 unpack A.decoded A.unpacked
cmp A.unpacked A.seq
Print program version.
./ox version
pack2: expects a sequence containing only 'A', 'B', 'C', 'D'
pack4: expects symbols from 'A' to 'P'
# Prepare sequence
grep -v '>' input.fa | tr -d -c 'ACGT' | tr 'ACGT' 'ABCD' > seq.txt
# Pack using 2-bit
./ox pack2 pack seq.txt packed.bin
# Encode with custom codec
./ox xrc-256 encode packed.bin encoded.bin
# Decode
./ox xrc-256 decode encoded.bin decoded.bin
# Unpack to original
./ox pack2 unpack decoded.bin unpacked.txt
# Validate
cmp seq.txt unpacked.txt
GPLv3 License