Huffbit is a command-line program written in Go that implements Huffman coding for data compression.
ary82/huffbit
A Huffman encoder/decoder written in Go
About
- A Huffman code is an optimal prefix code that supports lossless data compression by assigning variable-length codes to the characters of a file.
- The algorithm assigns shorter codes to the most frequent characters and longer codes to less frequent ones, which minimizes the total number of bits needed among prefix codes that encode one character at a time.
- This is accomplished by counting how often each character occurs in the file, then constructing a Huffman tree from those frequencies and reading each character's code off its path in the tree.
- These codes are then written bit by bit to the output file.
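The steps above can be sketched in Go as follows. This is a minimal illustration, not Huffbit's actual source: the names `node`, `minHeap`, `buildTree`, `assignCodes`, and `encodedBits` are hypothetical, and the choice of `container/heap` for the priority queue is an assumption. Ties between equal weights may be broken differently from run to run, so the exact codes can differ, but the total encoded length stays the same.

```go
package main

import (
	"container/heap"
	"fmt"
)

// node is a Huffman tree node; leaves carry a byte, internal nodes only a weight.
type node struct {
	sym         byte
	weight      int
	left, right *node
}

// minHeap orders nodes by ascending weight and implements heap.Interface.
type minHeap []*node

func (h minHeap) Len() int            { return len(h) }
func (h minHeap) Less(i, j int) bool  { return h[i].weight < h[j].weight }
func (h minHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x interface{}) { *h = append(*h, x.(*node)) }
func (h *minHeap) Pop() interface{} {
	old := *h
	n := old[len(old)-1]
	*h = old[:len(old)-1]
	return n
}

// buildTree counts byte frequencies, then repeatedly merges the two
// lightest nodes until a single root remains.
func buildTree(data []byte) *node {
	freq := map[byte]int{}
	for _, b := range data {
		freq[b]++
	}
	h := &minHeap{}
	for sym, w := range freq {
		*h = append(*h, &node{sym: sym, weight: w})
	}
	heap.Init(h)
	for h.Len() > 1 {
		a := heap.Pop(h).(*node)
		b := heap.Pop(h).(*node)
		heap.Push(h, &node{weight: a.weight + b.weight, left: a, right: b})
	}
	if h.Len() == 0 {
		return nil // empty input
	}
	return heap.Pop(h).(*node)
}

// assignCodes walks the tree, appending "0" for a left edge and "1" for a right edge.
func assignCodes(n *node, prefix string, codes map[byte]string) {
	if n == nil {
		return
	}
	if n.left == nil && n.right == nil {
		if prefix == "" {
			prefix = "0" // input with a single distinct symbol
		}
		codes[n.sym] = prefix
		return
	}
	assignCodes(n.left, prefix+"0", codes)
	assignCodes(n.right, prefix+"1", codes)
}

// encodedBits returns the code table and the total number of bits
// needed to encode data with it.
func encodedBits(data []byte) (map[byte]string, int) {
	codes := map[byte]string{}
	assignCodes(buildTree(data), "", codes)
	total := 0
	for _, b := range data {
		total += len(codes[b])
	}
	return codes, total
}

func main() {
	codes, total := encodedBits([]byte("huffbit\n"))
	fmt.Println(len(codes), "symbols,", total, "bits")
}
```

For the input "huffbit\n" this reports 7 distinct symbols and 22 bits in total.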
Features
- Implements the classic Huffman coding algorithm.
- Supports compression and decompression of byte data.
- Leverages Go’s standard library packages for efficient file handling and data structures.
Working
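Take the input string “huffbit”, stored with a trailing newline as “huffbit\n”.
The program builds a Huffman tree from the character frequencies (f occurs twice, every other character once).
The codes generated for the characters are: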
Character | Code |
---|---|
f | 00 |
i | 010 |
t | 011 |
b | 100 |
h | 101 |
newline | 110 |
u | 111 |
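To see the table in action, the sketch below encodes "huffbit\n" with exactly these codes and decodes the result again. The `codes` map is copied from the table; the function names `encode` and `decode` are hypothetical, and representing bits as a string of '0' and '1' characters is a simplification for readability, not necessarily how Huffbit handles them.

```go
package main

import (
	"fmt"
	"strings"
)

// codes mirrors the table above; '\n' is the "newline" row.
var codes = map[byte]string{
	'f': "00", 'i': "010", 't': "011", 'b': "100",
	'h': "101", '\n': "110", 'u': "111",
}

// encode concatenates the code of every input byte into a string of '0'/'1'.
func encode(s string) (string, error) {
	var sb strings.Builder
	for i := 0; i < len(s); i++ {
		c, ok := codes[s[i]]
		if !ok {
			return "", fmt.Errorf("no code for byte %q", s[i])
		}
		sb.WriteString(c)
	}
	return sb.String(), nil
}

// decode reads bits left to right and emits a byte as soon as the buffered
// bits match a code. This works because no code is a prefix of another.
func decode(bits string) (string, error) {
	rev := make(map[string]byte, len(codes))
	for b, c := range codes {
		rev[c] = b
	}
	var out []byte
	cur := ""
	for _, bit := range bits {
		cur += string(bit)
		if b, ok := rev[cur]; ok {
			out = append(out, b)
			cur = ""
		}
	}
	if cur != "" {
		return "", fmt.Errorf("trailing bits %q do not form a code", cur)
	}
	return string(out), nil
}

func main() {
	bits, _ := encode("huffbit\n")
	fmt.Println(bits, len(bits)) // 1011110000100010011110 22
	text, _ := decode(bits)
	fmt.Printf("%q\n", text) // "huffbit\n"
}
```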
After writing the necessary headers, the program writes the encoded characters to the output file:
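Writing codes "bit by bit" means packing them into whole bytes. The sketch below shows one way to do that for the 22-bit encoding of "huffbit\n". The function name `packBits`, the most-significant-bit-first order, and the zero padding of the last byte are assumptions for illustration; Huffbit's actual bit order and header layout are not described here.

```go
package main

import "fmt"

// packBits packs a string of '0'/'1' characters into bytes, filling each
// byte from its most significant bit and zero-padding the final byte.
func packBits(bits string) []byte {
	out := make([]byte, (len(bits)+7)/8)
	for i := 0; i < len(bits); i++ {
		if bits[i] == '1' {
			out[i/8] |= 1 << (7 - uint(i%8))
		}
	}
	return out
}

func main() {
	// Bit string for "huffbit\n" using the codes from the table.
	packed := packBits("1011110000100010011110")
	fmt.Printf("% x (%d bytes)\n", packed, len(packed)) // bc 22 78 (3 bytes)
}
```

Because the last byte is padded, a decoder also needs to know how many bits or characters are valid, which is one reason a header is required.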
After compression, the 8 bytes of “huffbit\n” shrink to 22 bits, which fit in 3 bytes of encoded data. Excluding the headers, that is a theoretical size reduction of 62.5%.
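The figure can be checked by hand from the code table, where f (2 occurrences) has a 2-bit code and the other six characters have 3-bit codes:

$$
\text{bits} = 2 \cdot 2 + 6 \cdot 3 = 22, \qquad
\text{bytes} = \left\lceil \tfrac{22}{8} \right\rceil = 3, \qquad
1 - \tfrac{3}{8} = 0.625
$$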