Trixter wrote: LENGTH is 1 or 2 bytes:
1xxxxxxx - for 4...131
01xxxxxx - for 132..195
00xxxxxx xxxxxxxx - for up to 16383
If you care about decompression speed, you might want to put the flag bits at the end of the byte instead of the beginning. For example, with 1xxxxxxx you need something like this to test the flag and then strip it:
Code:
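cmp al,10000000b   ; is the top bit set?
jae handle_1       ; yes (unsigned al >= 80h), not just al == 80h
...
handle_1:
and al,01111111b   ; strip the flag bit
add al,4           ; lengths in this form start at 4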
But if you change your codes to be at the end, you can do this:
Code:
shr al,1           ; flag bit drops into carry, al = 0xxxxxxx
jc handle_1        ; carry set: short-length form
...
handle_1:
add al,4           ; no masking needed
This is both smaller and faster on 8086. It shifts the flag bit into carry and jumps on the carry flag, and when you're finished you don't need to mask the value, as the remaining bits are already in the correct position.
Hm, interesting, so it will be
xxxxxxx1 - 4...131
1xxxxxx0 - 132...195
and 0xxxxxx0 xxxxxxxx - any other number up to 16383?
Actually, I think in my case masking is NOT required to get rid of the highest 1: just add 84H instead of 04H and it's gone, because the 8-bit addition wraps around and the carry out of the top bit is discarded.
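To make the two ideas above concrete, here is a minimal Python sketch rather than the actual SHAFF decoder. `decode_length` follows the flag-at-the-end layout listed above; it assumes that in the two-byte form the first byte carries the high 6 bits and that the value is stored without an offset, neither of which is stated in the thread. The two `short_length_*` helpers (hypothetical names) model the original 1xxxxxxx form with and without masking.

```python
def decode_length(data, pos=0):
    """Decode one LENGTH field in the flag-at-the-end layout.

    Returns (length, next_pos). Assumption: in the two-byte form the
    first byte holds the high 6 bits and the value has no offset.
    """
    b = data[pos]
    if b & 1:
        # xxxxxxx1: the bit that `shr al,1` moves into carry
        return (b >> 1) + 4, pos + 1               # 4...131
    if b & 0x80:
        # 1xxxxxx0: six payload bits
        return ((b >> 1) & 0x3F) + 132, pos + 1    # 132...195
    # 0xxxxxx0 xxxxxxxx: 14-bit value, up to 16383
    return ((b >> 1) << 8) | data[pos + 1], pos + 2


def short_length_masked(b):
    """Original 1xxxxxxx form: and al,7Fh then add al,4."""
    return (b & 0x7F) + 4


def short_length_wrapped(b):
    """Same form without masking: add al,84h, keeping the low 8 bits."""
    return (b + 0x84) & 0xFF


# Both variants agree for every byte that has the top bit set.
assert all(short_length_masked(b) == short_length_wrapped(b)
           for b in range(0x80, 0x100))
```

The check at the end only covers bytes 80H...FFH, which is the only range where the 1xxxxxxx branch is taken.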

Trixter wrote: EDIT: I just thought of another optimization: Don't use FF for the code; instead, use the value that shows up the least. Scan the entire file before compressing to determine which value shows up the least, then use that as the code. Otherwise, data that has a lot of FFs in it (like most 8-bit programming!) won't compress very well...
I don't think I'll gain a lot by switching to the rarest byte - usually most of the FFs get eaten by LZ77 matches anyway - but I can at least use the output from my current design to estimate the possible impact...
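For reference, the pre-scan Trixter describes could look roughly like the Python sketch below. The function name `rarest_byte` and the tie-breaking rule (lowest value wins) are my own assumptions for illustration, not something SHAFF does.

```python
from collections import Counter


def rarest_byte(data: bytes) -> int:
    """Return the byte value (0-255) that occurs least often in data.

    Values that never occur count as 0 occurrences; ties are broken
    in favour of the lowest byte value (an arbitrary choice).
    """
    counts = Counter(data)  # missing keys read as 0
    return min(range(256), key=lambda v: (counts[v], v))
```

Note that this counts bytes in the raw input, while what actually matters is how often the value survives as a literal after LZ77 matching, so it is only a first approximation.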
P.S. So I took the results of compressing JRCARTS7.IMG and counted the number of FF literals that were coded:
Code:
> grep "\[255\]" JRCARTS7.out
[255] 0xFF ? = 218
[255] 0xFF ? = 175
[255] 0xFF ? = 140
[255] 0xFF ? = 39
[255] 0xFF ? = 62
[255] 0xFF ? = 98
[255] 0xFF ? = 159
[255] 0xFF ? = 240
[255] 0xFF ? = 140
> python
Python 2.5.2 (r252:60911, Jan 24 2010, 18:51:01)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 218+175+140+39+62+98+159+240+140
1271
>>> 112715-1271
111444
>>> 100.0*1271/112715
1.127622765381715
so by switching to the rarest byte I'd save about 1.2KB (-1.13%), and 112,715 would turn into 111,444. On the one hand that's better; on the other, it's not enough yet to convince me to get rid of FF in the name of my compression utility (it's SHAFF because of FF : ) - and it's still some way behind ZX7 and RNC method 2...