Skip to main content

Huffbit

·211 words·1 min
POC Go Algorithms
Table of Contents

Huffbit is a command-line program written in Go that implements Huffman coding for data compression.

ary82/huffbit

A huffman encoder/decoder written in Go

Go
0
0

About
#

  • Huffman code is a type of optimal prefix code that supports lossless data compression by assigning variable length codes to the characters of a file.
  • This algorithm assigns shorter codes to the most frequent characters, while comparatively uncommon characters receive larger codes, thus reducing file size optimally.
  • This is accomplished by counting all the characters in a file, then constructing a Huffman Tree for assigning values to them.
  • These codes are then written bit by bit to the output file.

Features
#

  • Implements the classic Huffman coding algorithm.
  • Supports compression and decompression of byte data.
  • Leverages Go’s standard library packages for efficient file handling and data structures.

Working
#

Take the input string “huffbit”:

representation of input string huffbit

This program makes the following Huffman tree:

representation of input Huffman tree

The Codes for the characters are generated as:

CharacterCode
f00
i010
t011
b100
h101
newline110
u111

After Writing the necessary headers, the encoded characters are written to the output file:

representation of output bytes

After compression, the bytes needed get reduced from 8 in “huffbit\n” to just 3 in the compressed file, making it a theoretical compression of 62%.