Re: LZMA inclusion - Phillip Lougher

From: Phillip Lougher <phillip@lougher.demon.co.uk>
To: "Jörn Engel" <joern@logfs.org>
Cc: Lasse Collin <lasse.collin@tukaani.org>,
	Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>,
	Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>,
	Tim Bird <tim.bird@am.sony.com>,
	glp@openwrt.org, linux-embedded@vger.kernel.org
Subject: Re: LZMA inclusion
Date: Sun, 07 Dec 2008 23:32:32 +0000
Message-ID: <493C5D10.1040604@lougher.demon.co.uk>
In-Reply-To: <20081207160140.GA13387@logfs.org>

Jörn Engel wrote:
> On Sat, 6 December 2008 23:56:50 +0200, Lasse Collin wrote:
>> Since you are improving the crypto API, maybe it would be a good idea to 
>> add a flag to tell the decoder that the whole output buffer will be 
>> kept available to the multi-call decoder.
> 
> I'm not convinced this is the right direction.  One of the constraints
> of kernel programming is that large contiguous allocations are hard to
> come by.  The mm subsystem makes no guarantees that you will be able to
> allocate 1MiB of contiguous memory.  On a 32bit system with highmem, it
> may even become hard to get 1MiB from vmalloc.

This is an important issue.  On the last Squashfs submission attempt,
its use of vmalloc to allocate contiguous blocks of up to 1MiB for
decompression was brought up.  Any LZMA implementation which requires
1MiB vmalloced input and output buffers will probably face similar
problems.

> 
> So another approach would be to ignore the one-shot debate and
> concentrate on taking a pagevec instead of a buffer (as in a void *
> pointer).  That would certainly be useful for other compressed
> filesystems and without checking the code (I forgot where the squashfs
> git tree was) I claim it should be useful for squashfs as well.

Squashfs doesn't use one-shot decoding with zlib, for performance and
memory reasons.  Input data is split across buffer_heads (4 KiB or less
per buffer_head).  Calling zlib repeatedly, once for each separate
buffer_head, eliminates the memcpy into a larger input buffer that would
otherwise be necessary, eliminates the memory overhead for this buffer,
and ensures only the first buffer_head needs to be waited on (for
arrival off disk) before decompression starts.
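
A minimal sketch of the multi-call pattern described above, in userspace C.
A toy run-length decoder stands in for zlib so the example stays
self-contained; the next_in/avail_in/next_out/avail_out fields are modelled
on zlib's z_stream, but rle_stream, rle_decode, chunk and decode_chunks are
hypothetical names, not Squashfs or zlib code.  The point is only that the
decoder keeps its state between calls, so each input chunk can be handed
over as it arrives without first being copied into one large buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy format standing in for zlib/LZMA: a stream of (count, value) pairs. */
struct rle_stream {
    const uint8_t *next_in;  /* current input chunk */
    size_t avail_in;         /* bytes left in that chunk */
    uint8_t *next_out;       /* where the next output byte goes */
    size_t avail_out;        /* space left in the output buffer */
    /* Decoder state carried across calls. */
    uint8_t remaining;       /* bytes of the current run still to emit */
    uint8_t value;           /* byte value of the current run */
    int need_value;          /* count was read, value byte still pending */
};

/* Consume as much of the current chunk as the output allows, then return. */
static void rle_decode(struct rle_stream *s)
{
    for (;;) {
        while (!s->need_value && s->remaining > 0) {
            if (s->avail_out == 0)
                return;      /* caller must supply more output space */
            *s->next_out++ = s->value;
            s->avail_out--;
            s->remaining--;
        }
        if (s->avail_in == 0)
            return;          /* caller must supply the next chunk */
        if (s->need_value) {
            s->value = *s->next_in++;
            s->need_value = 0;
        } else {
            s->remaining = *s->next_in++;
            s->need_value = 1;
        }
        s->avail_in--;
    }
}

/* Stand-in for one buffer_head's worth of compressed data. */
struct chunk {
    const uint8_t *data;
    size_t len;
};

/* One decoder call per input chunk; no chunk is copied anywhere first. */
static size_t decode_chunks(const struct chunk *chunks, size_t nr,
                            uint8_t *out, size_t out_len)
{
    struct rle_stream s = { .next_out = out, .avail_out = out_len };
    size_t i;

    assert(chunks != NULL && out != NULL);
    for (i = 0; i < nr; i++) {
        s.next_in = chunks[i].data;
        s.avail_in = chunks[i].len;
        rle_decode(&s);
        if (s.avail_in != 0)
            break;           /* output buffer is full */
    }
    return out_len - s.avail_out;
}
```

Note that a (count, value) pair may straddle two chunks; the need_value
flag is what lets the decoder resume in the middle of a pair on the next
call.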

Currently, as mentioned above, Squashfs decompresses into a single
contiguous output buffer.  But, due to the Linux kernel mailing list's
dislike of vmalloc, this is being changed.  In future, Squashfs will
decompress into a sequence of 4 KiB output buffers (possibly in the page
cache).

One-shot LZMA decoding therefore isn't going to work very well with
future versions of Squashfs.  An obvious solution (as is currently done
with the Squashfs-LZMA patches) is to use separately allocated
contiguous input/output buffers, and memcpy into and out of them, but
this isn't ideal.
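
The copy-in/copy-out workaround can be sketched roughly as follows, to
show where the two extra memcpy passes come from.  Everything here is
hypothetical: oneshot_fn is an assumed shape for a one-shot decoder,
copy_decoder is a stand-in that merely copies its input, and BLOCK_SIZE
assumes 4 KiB output buffers.  It is not the Squashfs-LZMA patch code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 4096u  /* assumed size of one output buffer */

struct chunk {
    const uint8_t *data;
    size_t len;
};

/* Assumed one-shot decoder shape: contiguous input in, contiguous output out. */
typedef size_t (*oneshot_fn)(const uint8_t *in, size_t in_len,
                             uint8_t *out, size_t out_cap);

/* Stand-in "decoder" that just copies; a real one would decompress. */
static size_t copy_decoder(const uint8_t *in, size_t in_len,
                           uint8_t *out, size_t out_cap)
{
    size_t n = in_len < out_cap ? in_len : out_cap;

    memcpy(out, in, n);
    return n;
}

/* Returns bytes decoded, or 0 if the data doesn't fit the supplied buffers. */
static size_t bounce_decode(oneshot_fn decode,
                            const struct chunk *in, size_t nr_in,
                            uint8_t *in_buf, size_t in_cap,
                            uint8_t *out_buf, size_t out_cap,
                            uint8_t **pages, size_t nr_pages)
{
    size_t in_len = 0, out_len, off, i;

    assert(decode != NULL);

    /* Extra copy 1: gather the input chunks into one contiguous buffer. */
    for (i = 0; i < nr_in; i++) {
        if (in[i].len > in_cap - in_len)
            return 0;
        memcpy(in_buf + in_len, in[i].data, in[i].len);
        in_len += in[i].len;
    }

    out_len = decode(in_buf, in_len, out_buf, out_cap);
    if (out_len > nr_pages * BLOCK_SIZE)
        return 0;

    /* Extra copy 2: scatter the contiguous output into the 4 KiB buffers. */
    for (off = 0; off < out_len; off += BLOCK_SIZE) {
        size_t n = out_len - off < BLOCK_SIZE ? out_len - off : BLOCK_SIZE;

        memcpy(pages[off / BLOCK_SIZE], out_buf + off, n);
    }
    return out_len;
}
```

Besides the two copies, in_buf and out_buf are exactly the large
contiguous allocations that the move to 4 KiB buffers is meant to avoid.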

The approach discussed of using the output buffer as the temporary
workspace (as it isn't touched until after decompression is completely
finished) will work with the current version of Squashfs, but it isn't
going to work with later versions unless the LZMA code can be changed to
work with a list of discontiguous output buffers (i.e. a scatter-gather
type list).
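
To make concrete what working with a list of discontiguous output
buffers would involve: an LZ77-family decoder such as LZMA copies matches
from earlier output, so every back-reference has to be resolved through
the buffer list rather than by plain pointer arithmetic.  The sketch below
assumes 4 KiB buffers; out_pages, out_byte, put_literal and copy_match are
hypothetical names for illustration, not part of any existing LZMA code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OUT_PAGE_SIZE 4096u  /* assumed size of each output buffer */

/* Output described as a list of separately allocated buffers. */
struct out_pages {
    uint8_t **pages;   /* the individual 4 KiB buffers */
    size_t nr_pages;
    size_t pos;        /* logical number of bytes written so far */
};

/* Translate a logical output offset into (buffer, offset within buffer). */
static uint8_t *out_byte(struct out_pages *o, size_t off)
{
    assert(off < o->nr_pages * OUT_PAGE_SIZE);
    return &o->pages[off / OUT_PAGE_SIZE][off % OUT_PAGE_SIZE];
}

static void put_literal(struct out_pages *o, uint8_t b)
{
    *out_byte(o, o->pos) = b;
    o->pos++;
}

/*
 * Copy len bytes from dist bytes back in the output.  The copy goes byte
 * by byte so that overlapping matches (dist < len) repeat the pattern,
 * and both source and destination may cross buffer boundaries.
 */
static void copy_match(struct out_pages *o, size_t dist, size_t len)
{
    assert(dist >= 1 && dist <= o->pos);
    while (len--) {
        *out_byte(o, o->pos) = *out_byte(o, o->pos - dist);
        o->pos++;
    }
}
```

A per-byte division is slow; a real decoder would more likely copy in
runs up to the next buffer boundary.  The sketch only shows the addressing
that a contiguous output buffer makes unnecessary.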

So it looks inevitable that a separately vmalloced workspace buffer will 
be required.

Phillip

> 
> Jörn
>

Thread overview: 25+ messages
2008-11-25  7:06 LZMA inclusion Gregers Petersen
2008-12-03 19:36 ` Tim Bird
2008-12-03 19:50   ` Florian Fainelli
2008-12-03 19:58   ` Bernhard Reutner-Fischer
2008-12-03 20:20     ` Sam Ravnborg
2008-12-03 20:45       ` Bernhard Reutner-Fischer
2008-12-03 21:16         ` Sam Ravnborg
2008-12-03 21:28           ` Bernhard Reutner-Fischer
2008-12-03 21:43             ` Sam Ravnborg
2008-12-03 21:48     ` Lasse Collin
2008-12-04 21:46       ` Jean-Christophe PLAGNIOL-VILLARD
2008-12-05  8:31       ` Geert Uytterhoeven
2008-12-06 21:56         ` Lasse Collin
2008-12-07 16:01           ` Jörn Engel
2008-12-07 23:32             ` Phillip Lougher [this message]
2008-12-08 13:46               ` Jamie Lokier
2008-12-08 18:23               ` Lasse Collin
2008-12-08 19:00                 ` Phillip Lougher
2008-12-09 10:20                   ` Lasse Collin
2008-12-09 10:37                     ` Geert Uytterhoeven
2008-12-16  8:55                       ` Lasse Collin
2008-12-08 20:17               ` Jörn Engel
2008-12-08 21:47                 ` Phillip Lougher
2008-12-08 22:15                   ` Jörn Engel
2008-12-03 20:09   ` Gregers Petersen
