This project implements Lempel-Ziv compression and decompression for files. Two Abstract Data Types (ADTs), Tries and Word Tables, are used to compress and decompress data efficiently. Both executables, encode and decode, read and write through buffers, so they can process any data passed to them.
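As a rough illustration of how a trie can drive Lempel-Ziv compression, the sketch below encodes a byte buffer into (code, symbol) pairs in the LZ78 style. It is not the project's implementation: `TrieNode`, `Pair`, `lz78_encode_sketch`, and the values of `EMPTY_CODE` and `START_CODE` are all hypothetical, and the sketch ignores the buffering and the `MAX_CODE` limit that the real program works with.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ALPHABET 256
#define EMPTY_CODE 1 // assumed: code of the empty prefix (the trie root)
#define START_CODE 2 // assumed: first code handed to a newly seen word

// Hypothetical trie node: one child slot per possible byte value.
typedef struct TrieNode {
    struct TrieNode *children[ALPHABET];
    uint32_t code;
} TrieNode;

// Hypothetical output unit: code of the longest known prefix plus one new symbol.
typedef struct {
    uint32_t code;
    uint8_t sym;
} Pair;

static TrieNode *node_create(uint32_t code) {
    TrieNode *n = calloc(1, sizeof *n); // zeroed, so every child starts as NULL
    if (n) {
        n->code = code;
    }
    return n;
}

static void trie_delete(TrieNode *n) {
    if (!n) {
        return;
    }
    for (int i = 0; i < ALPHABET; i++) {
        trie_delete(n->children[i]);
    }
    free(n);
}

// Encodes len bytes from in. out must have room for at least len pairs.
// Returns the number of pairs written (0 if the root cannot be allocated).
size_t lz78_encode_sketch(const uint8_t *in, size_t len, Pair *out) {
    assert(in != NULL && out != NULL);
    TrieNode *root = node_create(EMPTY_CODE);
    if (!root) {
        return 0;
    }
    TrieNode *curr = root;
    uint32_t next_code = START_CODE;
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t sym = in[i];
        TrieNode *next = curr->children[sym];
        if (next && i + 1 < len) {
            curr = next; // prefix already known: keep extending it
        } else {
            // Emit the known prefix's code together with the current symbol.
            out[n++] = (Pair){ curr->code, sym };
            if (!next) {
                TrieNode *fresh = node_create(next_code);
                if (fresh) {
                    curr->children[sym] = fresh; // remember the new word
                    next_code++;
                }
            }
            curr = root;
        }
    }
    trie_delete(root);
    return n;
}
```

A decoder would rebuild the same words from the pairs, which is the role a word table typically plays: it maps each code back to the bytes it stands for.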
To build this project, use the following commands:
- $ make or $ make all: Builds both executables, encode and decode.
- $ make encode: Builds only the encode executable.
- $ make decode: Builds only the decode executable.
To delete all executables and .o files, run:
$ make clean
- Build both executables (encode and decode) as described in the Build section.
- Prepare a file with the text or data to be compressed.
Run the following command to compress a file:
$ ./encode -i <input_file_name> -o <output_file_name> [-v]
- -i <input_file_name>: Specifies the input file to be compressed.
- -o <output_file_name>: Specifies the output file where compressed data will be stored.
- -v: (Optional) Prints extra statistics about the compression.
Once you've compressed the file, run the following command to decompress it:
$ ./decode -i <output_file_name> -o <final_output_file_name> [-v]
- -i <output_file_name>: Specifies the compressed file to decode.
- -o <final_output_file_name>: Specifies the file where the decompressed data will be stored.
- -v: (Optional) Prints extra statistics about the decompression.
After these steps, <final_output_file_name> should contain the decompressed data, identical to the original file.
Note: If you don’t specify a file for -i (input) or -o (output), the program will default to stdin for input and stdout for output.
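The stdin/stdout fallback described in the note can be sketched with POSIX getopt. This is a minimal sketch, not the project's code: the `Options` struct and the functions `parse_options`, `open_input`, and `open_output` are hypothetical names, and the output file mode 0600 is an assumption.

```c
#define _POSIX_C_SOURCE 200809L // expose getopt when compiling in strict C mode
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

// Hypothetical container for the parsed options.
typedef struct {
    const char *in_path;  // NULL means "read from stdin"
    const char *out_path; // NULL means "write to stdout"
    bool verbose;         // set by -v
} Options;

// Parses -i <file>, -o <file>, and -v.
// Returns 0 on success and -1 on an unknown option or a missing argument.
int parse_options(int argc, char **argv, Options *opts) {
    assert(opts != NULL);
    opts->in_path = NULL;
    opts->out_path = NULL;
    opts->verbose = false;

    int opt;
    while ((opt = getopt(argc, argv, "i:o:v")) != -1) {
        switch (opt) {
        case 'i': opts->in_path = optarg; break;
        case 'o': opts->out_path = optarg; break;
        case 'v': opts->verbose = true; break;
        default: return -1;
        }
    }
    return 0;
}

// Returns a readable file descriptor, falling back to stdin when no path is given.
int open_input(const char *path) {
    return path ? open(path, O_RDONLY) : STDIN_FILENO;
}

// Returns a writable file descriptor, falling back to stdout when no path is given.
int open_output(const char *path) {
    return path ? open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600) : STDOUT_FILENO;
}
```

Keeping the paths as NULL until the files are opened means the default case needs no special handling anywhere else: the rest of the program only ever sees a file descriptor.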
When running $ make scan-build to detect errors, you may encounter the following warning:
word.c:48:24: warning: Result of 'malloc' is converted to a pointer of type 'WordTable', which is incompatible with sizeof operand type 'Word' [unix.MallocSizeof]
This warning occurs because a Word Table is essentially a list of Words. When initializing wt (a pointer to a Word Table), we allocate memory for a list of Words up to MAX_CODE, which is intentional and does not impact functionality.
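For readers who would rather silence the warning, one option is to size the allocation from the pointer being assigned instead of naming a type. The sketch below assumes that `WordTable` is a typedef for `Word *`, that `Word` holds a symbol array and a length, and that `MAX_CODE` is `UINT16_MAX`; none of this is confirmed by the project, and `wt_create_sketch` and `wt_delete_sketch` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_CODE UINT16_MAX // assumed value

// Assumed layout of a Word: a byte string and its length.
typedef struct Word {
    uint8_t *syms;
    uint32_t len;
} Word;

// Assumed definition: a table entry is a pointer to a Word.
typedef Word *WordTable;

// Allocates a zeroed table of MAX_CODE entries.
// sizeof *wt always matches the element type of wt, so the sizeof operand
// and the pointer type cannot disagree, which is what the checker flags.
WordTable *wt_create_sketch(void) {
    WordTable *wt = calloc(MAX_CODE, sizeof *wt);
    assert(sizeof *wt == sizeof(Word *)); // each entry is one Word pointer
    return wt;
}

// Frees the table itself; freeing the Words it points to is left out of this sketch.
void wt_delete_sketch(WordTable *wt) {
    free(wt);
}
```

Under these assumptions the table holds the same number of entries, so behavior is unchanged; only the per-entry size passed to the allocator differs.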