Huffman

This is an implementation of Huffman's-Encoding. The compression of an arbitrary file is done losslessly through prefix codes. Any file format can be compressed through this implementation. A compressed file will be stored in a so-called Huffman file (.huff) and can be decoded with this program. There is a staggering amount of minute optimization to ensure optimal runtime and maximal compression ratio.

Compilation

Before compilation the code has to be cloned. Run the command

git clone https://github.com/vinceclifford/Huffman.git

to clone the repo onto your local machine.

Once the code is cloned, it can be compiled. Due to the complexity of this project I simplified the compiling and building process with a MAKEFILE. To compile and link the program run the command:

make

This should take care of the entire process and result in a single executable file called huffman

General Usage

One can run the program as follows:

./huffman [Options] [Input] [Output]

There are following options to run the program:

-B<int> This option runs the compression / decompression int many times on the same file and logs the (average) runtime. Default: B1
-o With this flag the output file can be specified
-b<int>, --batchsize<int> Specifies how big the batchsize is. In other words how many Bytes are read from memory at one time. Default set to 16384 Byte. Bigger batchsize results in less system calls and therefore improves performance drastically.
-s<string>, --sort<string> With this option the sorting algorithm can be chosen to compare runtimes between the algorithms. Valid inputs: radix_binary, radix_decimal, shell. Default: shell
-v, --verbose Verbose mode
-d Decompress Huffman file. If this flag is not set then the input file will be compressed. If this flag is set then the input file will be interpreted as a Huffman file and will be decoded
-V<int> Select specific program version. Default: V9. Different program versions are listed at the end of the README
-h, --help Print help page

Example Encoding

Let's assume there is a file called input.txt which should be compressed.

./huffman input.txt -o foo

The above command would compress the input file into a file foo.huff. Options can be added to this command to see which effect different implementations have on the runtime.

Example Decoding

The previously generated Huffman file foo.huff should be decoded. For that following command must be run:

./huffman -d foo.huff -o output.txt

That result is a decompressed file called output.txt which is equivalent to the file input.txt

Testing

This program was tested via system tests. These tests check for different modalities (eg. batchsize and sorting algorithm) whether the input file equals the compressed and then decompressed file. These tests can be found in the directory Testing and can be executed with the script tests.sh. The input data for this script is located in the directory test_files. Therefore anybody can execute their own individual tests and check for correctness of the program

Versions

The different Versions of the program are:
-V1 Radix sort binary Batch size 1
-V2 Radix sort decimal Batch size 1
-V3 Shell sort Batch size 1
-V4 Radix sort binary Batch size 4096
-V5 Radix sort decimal Batch size 4096
-V6 Shell sort Batch size 4096
-V7 Radix sort binary Batch size 16384
-V8 Radix sort decimal Batch size 16384
-V9 Shell sort Batch size 16384

Name		Name	Last commit message	Last commit date
Latest commit History 26 Commits
Benchmarking		Benchmarking
Data		Data
Testing		Testing
Makefile		Makefile
README.md		README.md
batch_io.c		batch_io.c
batch_io.h		batch_io.h
configuration.h		configuration.h
deserialize_tree.c		deserialize_tree.c
deserialize_tree.h		deserialize_tree.h
extract_dictionary.c		extract_dictionary.c
extract_dictionary.h		extract_dictionary.h
frequency_analysis.c		frequency_analysis.c
frequency_analysis.h		frequency_analysis.h
generator.c		generator.c
generator.h		generator.h
huffman.c		huffman.c
huffman.h		huffman.h
huffman_decode.c		huffman_decode.c
huffman_decode.h		huffman_decode.h
huffman_encode.c		huffman_encode.c
huffman_encode.h		huffman_encode.h
main.c		main.c
node.h		node.h
option_select.c		option_select.c
option_select.h		option_select.h
serialize_tree.c		serialize_tree.c
serialize_tree.h		serialize_tree.h
sort.c		sort.c
sort.h		sort.h
tuple_array_size.h		tuple_array_size.h
write_information.h		write_information.h

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Huffman

Compilation

General Usage

Example Encoding

Example Decoding

Testing

Versions

About

Releases

Packages

Languages

vinceclifford/Huffman

Folders and files

Latest commit

History

Repository files navigation

Huffman

Compilation

General Usage

Example Encoding

Example Decoding

Testing

Versions

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages