添加lzw压缩算法

2023-12-02 11:52:15 +08:00
parent 9e46a19283
commit a556d0a141
31 changed files with 1593 additions and 2 deletions
--- a/lzw-ab/README
+++ b/lzw-ab/README
@@ -0,0 +1,83 @@
+////////////////////////////////////////////////////////////////////////////
+//                            **** LZW-AB ****                            //
+//               Adjusted Binary LZW Compressor/Decompressor              //
+//                  Copyright (c) 2016-2020 David Bryant                  //
+//                           All Rights Reserved                          //
+//      Distributed under the BSD Software License (see license.txt)      //
+////////////////////////////////////////////////////////////////////////////
+
+This is an implementation of the Lempel-Ziv-Welch general-purpose data
+compression algorithm. It is targeted at embedded applications that require
+high speed compression or decompression facilities where lots of RAM for
+large dictionaries might not be available. I have used this in several
+projects for storing compressed firmware images, and once I even coded the
+decompressor in Z-80 assembly language for speed! Depending on the maximum
+symbol size selected, the implementation can require from 2368 to 335616
+bytes of RAM for decoding (and about half again more for encoding).
+
+This is a streaming compressor in that the data is not divided into blocks
+and no context information like dictionaries or Huffman tables are sent
+ahead of the compressed data (except for one byte to signal the maximum
+bit depth). This limits the maximum possible compression ratio compared to
+algorithms that significantly preprocess the data, but with the help of
+some enhancements to the LZW algorithm (described below) it is able to
+compress better than the UNIX "compress" utility (which is also LZW) and
+is in fact closer to and sometimes beats the compression level of "gzip".
+
+The symbols are stored in "adjusted binary" which provides somewhat better
+compression (with virtually no speed penalty) compared to the fixed word
+sizes normally used. Once the dictionary is full, the encoder returns to
+the beginning and recycles string codes that have not been used yet for
+longer strings. In this way the dictionary constantly "churns" based on the
+the incoming stream, thereby improving and adapting to optimal compression.
+The compression performance is constantly monitored and a dictionary flush
+is forced on stretches of negative compression which limits worst-case
+performance to about 8% inflation.
+
+LZW-AB consists of three standard C files: the library, a command-line
+filter demo using pipes, and a command-line test harness. Each program
+builds with a single command on most platforms. It has been designed with
+maximum portability in mind and should work correctly on big-endian as well
+as little-endian machines.
+
+Linux:
+% gcc -O3 lzwfilter.c lzwlib.c -o lzwfilter
+% gcc -O3 lzwtester.c lzwlib.c -o lzwtester
+
+Darwin/Mac:
+% clang -O3 lzwfilter.c lzwlib.c -o lzwfilter
+% clang -O3 lzwtester.c lzwlib.c -o lzwtester
+
+MS Visual Studio:
+cl -O2 lzwfilter.c lzwlib.c
+cl -O2 lzwtester.c lzwlib.c
+
+There are Windows binaries (built on MinGW) for the filter and the tester on the
+GitHub release page (v3). The "help" display for the filter looks like this:
+
+ Usage:     lzwfilter [-options] [< infile] [> outfile]
+
+ Operation: compression is default, use -d to decompress
+
+ Options:  -d     = decompress
+           -h     = display this "help" message
+           -1     = maximum symbol size = 9 bits
+           -2     = maximum symbol size = 10 bits
+           -3     = maximum symbol size = 11 bits
+           -4     = maximum symbol size = 12 bits
+           -5     = maximum symbol size = 13 bits
+           -6     = maximum symbol size = 14 bits
+           -7     = maximum symbol size = 15 bits
+           -8     = maximum symbol size = 16 bits (default)
+           -v     = verbose (display ratio and checksum)
+
+Here's the "help" display for the tester:
+
+ Usage:     lzwtester [options] file [...]
+
+ Options:   -1 ... -8 = test using only specified max symbol size (9 - 16)
+            -0        = cycle through all maximum symbol sizes (default)
+            -e        = exhaustive test (by successive truncation)
+            -f        = fuzz test (randomly corrupt compressed data)
+            -q        = quiet mode (only reports errors and summary)
+