Re: LZMA inclusion - Lasse Collin

linux-embedded.vger.kernel.org archive mirror

From: Lasse Collin <lasse.collin@tukaani.org>
To: Phillip Lougher <phillip@lougher.demon.co.uk>
Cc: "Jörn Engel" <joern@logfs.org>,
	"Geert Uytterhoeven" <Geert.Uytterhoeven@sonycom.com>,
	"Bernhard Reutner-Fischer" <rep.dot.nop@gmail.com>,
	"Tim Bird" <tim.bird@am.sony.com>,
	glp@openwrt.org, linux-embedded@vger.kernel.org
Subject: Re: LZMA inclusion
Date: Mon, 8 Dec 2008 20:23:23 +0200
Message-ID: <200812082023.23450.lasse.collin@tukaani.org>
In-Reply-To: <493C5D10.1040604@lougher.demon.co.uk>

Phillip Lougher wrote:
> Currently, as mentioned above, Squashfs decompresses into a single
> contiguous output buffer.  But, due to the linux kernel mailing
> list's dislike of vmalloc, this is being changed.  In future Squashfs
> will decompress into a sequence of 4 KiB output buffers (possibly in
> the page cache).

To my understanding, using 4 KiB output buffers can make sense only if
the dictionary size is significantly smaller than the Squashfs block
size. This is because an output buffer scattered into 4 KiB pieces
cannot double as the dictionary, so the dictionary has to be vmalloced
as part of the LZMA decoder state.

For example, if the dictionary size is equal to the Squashfs block size, 
the same amount of memory that earlier Squashfs versions vmalloced for 
the output buffer is now vmalloced by the uncompression code for the 
dictionary. Plus, memcpy is needed to get the data from the dictionary 
to the 4 KiB output buffers.
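
As a rough illustration of that extra copy, here is a minimal sketch
(not taken from Squashfs or from any actual LZMA decoder) of flushing
a contiguous dictionary into 4 KiB output buffers. The names
dict_sketch, dict_flush and SKETCH_PAGE_SIZE are hypothetical, and the
sketch assumes the dictionary is as large as the block, so the write
position never wraps around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical decoder state: the dictionary is separate from the output. */
struct dict_sketch {
	uint8_t *buf;    /* contiguous buffer (vmalloced in a kernel setting) */
	size_t size;     /* dictionary size in bytes */
	size_t pos;      /* current write position of the decoder */
	size_t flushed;  /* bytes already copied to the output buffers */
};

/*
 * Copy everything decoded since the previous flush into the scattered
 * 4 KiB output buffers, starting at output offset out_pos.
 * Returns the new output offset.
 */
static size_t dict_flush(struct dict_sketch *d, uint8_t *const *pages,
			 size_t npages, size_t out_pos)
{
	assert(d->flushed <= d->pos && d->pos <= d->size);

	while (d->flushed < d->pos) {
		size_t page = out_pos / SKETCH_PAGE_SIZE;
		size_t off = out_pos % SKETCH_PAGE_SIZE;
		size_t n = d->pos - d->flushed;

		/* Never cross the end of the current output buffer. */
		if (n > SKETCH_PAGE_SIZE - off)
			n = SKETCH_PAGE_SIZE - off;

		assert(page < npages);
		memcpy(pages[page] + off, d->buf + d->flushed, n);

		d->flushed += n;
		out_pos += n;
	}

	return out_pos;
}
```

In this arrangement both the dictionary (d->buf) and the output
buffers hold a copy of the data, which is the memory and memcpy
overhead described above.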

The LZMA decoder accesses the dictionary contents quite randomly,
copying 1-273 bytes at a time from some unpredictable offset to the end
of the dictionary. The offsets are relative to the end of the
dictionary (i.e. the current write position). I suppose the dictionary
buffer has to be contiguous in the virtual address space. Otherwise the
decoder would have to emulate a contiguous buffer, translating every
offset into a piece and a position within that piece. That would
probably make things very slow.
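
To make the difference concrete, here is a minimal sketch (again not
from any real decoder) of the same match copy done against a
contiguous buffer and against a buffer scattered into 4 KiB pieces.
The function names and SKETCH_PAGE_SIZE are hypothetical, and for
simplicity the buffer is assumed to hold the whole block, so there is
no wraparound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Contiguous dictionary: the source of a match is found with a single
 * subtraction from the write position. The copy goes byte by byte
 * because source and destination overlap when dist < len.
 */
static void match_copy_contig(uint8_t *dict, size_t pos,
			      size_t dist, size_t len)
{
	assert(dist >= 1 && dist <= pos);

	for (size_t i = 0; i < len; i++)
		dict[pos + i] = dict[pos + i - dist];
}

/*
 * Scattered dictionary: every byte needs a piece lookup for both the
 * source and the destination. This is the emulation of a contiguous
 * buffer that the decoder would be forced to do.
 */
static void match_copy_paged(uint8_t *const *pages, size_t pos,
			     size_t dist, size_t len)
{
	assert(dist >= 1 && dist <= pos);

	for (size_t i = 0; i < len; i++) {
		size_t src = pos + i - dist;
		size_t dst = pos + i;

		pages[dst / SKETCH_PAGE_SIZE][dst % SKETCH_PAGE_SIZE] =
			pages[src / SKETCH_PAGE_SIZE][src % SKETCH_PAGE_SIZE];
	}
}
```

Both functions produce the same bytes. The paged variant pays for a
division and an extra pointer load per byte, or for extra bookkeeping
to split each copy at piece boundaries, and matches are short enough
(at most 273 bytes) that this overhead is hard to amortize.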

Naturally it might be OK to decrease the maximum dictionary size allowed
in the multi-call decoder. I'm not sure whether going from a 1 MiB to
e.g. a 128 KiB dictionary while keeping the 1 MiB Squashfs block size
affects the compression ratio too much. I guess it means a 2-8 % bigger
result, but someone should test it with real-world file system data.

-- 
Lasse Collin  |  IRC: Larhzu @ IRCnet & Freenode
