From: Jeff Garzik <jgarzik@pobox.com>
To: Jörn Engel <joern@wohnheim.fh-wedel.de>
Cc: linux-kernel@vger.kernel.org, jmorris@intercode.com.au,
	davem@redhat.com, David Woodhouse <dwmw2@infradead.org>
Subject: Re: [RFC] Breaking data compatibility with userspace bzlib
Date: Fri, 20 Jun 2003 15:09:57 -0400
Message-ID: <20030620190957.GA19988@gtf.org>
In-Reply-To: <20030620185915.GD28711@wohnheim.fh-wedel.de>

On Fri, Jun 20, 2003 at 08:59:15PM +0200, Jörn Engel wrote:
> Now, the cost of the underlying BWT is O(n) in memory and O(n*ln(n))
> in time.  That given, I consider it odd to use a linear semantic of
> blockSizeXXXX and would prefer an exponential one, as the zlib uses
> here and there.  Thus blockSizeBits would now give the blockSize as
> 1 << blockSizeBits, allowing to go well below 100k, resulting in lower
> memory consumption for some and well above 900k, giving better
> compression ratios.
> 
> 
> Long intro, short question: Jay O Nay?
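
The two block-size schemes in the quoted proposal can be compared with
a minimal sketch. The function names are hypothetical; only the nominal
100k..900k range and the 1 << blockSizeBits rule come from the text
above, and the linear sizes are treated as plain multiples of 100,000
bytes for illustration.

```python
# Hypothetical helper names, not part of bzlib or the kernel patch.

def linear_block_size(level: int) -> int:
    """Existing linear scheme: level 1..9 selects level * 100,000 bytes (nominal)."""
    if not 1 <= level <= 9:
        raise ValueError("level must be in 1..9")
    return level * 100_000


def exponential_block_size(block_size_bits: int) -> int:
    """Proposed scheme: block size is 1 << block_size_bits bytes."""
    if block_size_bits < 0:
        raise ValueError("block_size_bits must be non-negative")
    return 1 << block_size_bits


if __name__ == "__main__":
    # Linear scheme is confined to 100k..900k.
    print([linear_block_size(n) for n in (1, 9)])
    # Exponential scheme reaches both below and above that range.
    print([exponential_block_size(b) for b in (16, 20)])
```

With these definitions, 16 bits gives a 64 KiB block (below the linear
minimum) and 20 bits gives a 1 MiB block (above the linear maximum).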

The big question is whether bzip2's better compression is actually
useful in a kernel context.  Patches to do bzip2 for initrd, for
example, have been around for ages:

	http://gtf.org/garzik/kernel/files/initrd-bzip2-2.2.13-2.patch.gz

But the compression and decompression overhead is _much_ larger
than gzip's.  It was so large at maximal compression that dialing back
the compression level reached a point of diminishing returns rather
quickly, when compared to gzip's memory usage and compression.
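
A rough way to check this trade-off on a given payload is sketched
below, assuming Python's standard bz2 and zlib modules as userspace
stand-ins for bzip2 and gzip. The function name compare() and the
chosen levels are illustrative assumptions; the sketch measures output
size and wall-clock time only, not memory usage, and says nothing
about in-kernel behaviour.

```python
import bz2
import time
import zlib


def compare(data: bytes) -> dict:
    """Compress and decompress `data` with each codec; record size and timings."""
    # Each entry maps a label to a (compress, decompress) pair.
    candidates = {
        "zlib-9": (lambda d: zlib.compress(d, 9), zlib.decompress),
        "bz2-1": (lambda d: bz2.compress(d, 1), bz2.decompress),  # smallest block size
        "bz2-9": (lambda d: bz2.compress(d, 9), bz2.decompress),  # largest block size
    }
    results = {}
    for name, (compress, decompress) in candidates.items():
        start = time.perf_counter()
        packed = compress(data)
        middle = time.perf_counter()
        restored = decompress(packed)
        end = time.perf_counter()
        if restored != data:
            raise RuntimeError(f"{name}: round trip did not reproduce the input")
        results[name] = {
            "size": len(packed),
            "compress_s": middle - start,
            "decompress_s": end - middle,
        }
    return results


if __name__ == "__main__":
    sample = b"initrd-like payload with some repetition\n" * 50_000
    for name, stats in compare(sample).items():
        print(name, stats)
```

Results depend heavily on the input and the machine, so the numbers
are only meaningful for the actual data one intends to compress.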

I talked a bit with the bzip2 author a while ago about memory usage.
He eventually added the capability to only require small blocks
for decompression (64K IIRC?), but there was a significant loss in
compression factor.

So... even in 2003, I really don't know of many (any?) tasks which
would benefit from bzip2, considering the additional memory and
CPU overhead.

	Jeff