Huffbit is a command-line program written in Go that implements Huffman coding for data compression.
ary82/huffbit
A huffman encoder/decoder written in Go
Go
0
0
About#
- Huffman code is a type of optimal prefix code that supports lossless data compression by assigning variable length codes to the characters of a file.
- This algorithm assigns shorter codes to the most frequent characters, while comparatively uncommon characters receive larger codes, thus reducing file size optimally.
- This is accomplished by counting all the characters in a file, then constructing a Huffman Tree for assigning values to them.
- These codes are then written bit by bit to the output file.
Features#
- Implements the classic Huffman coding algorithm.
- Supports compression and decompression of byte data.
- Leverages Go’s standard library packages for efficient file handling and data structures.
Working#
Take the input string “huffbit”:
This program makes the following Huffman tree:
The Codes for the characters are generated as:
| Character | Code |
|---|---|
| f | 00 |
| i | 010 |
| t | 011 |
| b | 100 |
| h | 101 |
| newline | 110 |
| u | 111 |
After Writing the necessary headers, the encoded characters are written to the output file:
After compression, the bytes needed get reduced from 8 in “huffbit\n” to just 3 in the compressed file, making it a theoretical compression of 62%.