linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Where's the bzip2 compressed linux-kernel patch?
@ 2003-10-18  5:18 Rob Landley
  2003-10-18  5:30 ` Nick Piggin
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Rob Landley @ 2003-10-18  5:18 UTC (permalink / raw)
  To: linux-kernel

I just rewrote bunzip2 for busybox in about 500 lines of C (and a good chunk 
of that's comments), which comiles to a bit under 7k, and I was thinking of 
redoing the bunzip-the-kernel patch with my new bunzip code, but I can't find 
the patch.  Anybody got a URL to it?

The most recent one I could find was kerneltrap's 404-error link to 
http://chrissicool.piranho.com/patch-2.4.x-bzip2-i386

Rob

P.S.  If you're curious about the micro-bunzip code, it's in busybox CVS:
http://www.busybox.net/cgi-bin/cvsweb/busybox/archival/libunarchive/decompress_bunzip2.c

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-18  5:18 Where's the bzip2 compressed linux-kernel patch? Rob Landley
@ 2003-10-18  5:30 ` Nick Piggin
  2003-10-18 11:39   ` Daniel Egger
  2003-10-18  6:05 ` Erik Andersen
  2003-10-18 16:43 ` Jörn Engel
  2 siblings, 1 reply; 18+ messages in thread
From: Nick Piggin @ 2003-10-18  5:30 UTC (permalink / raw)
  To: rob; +Cc: linux-kernel



Rob Landley wrote:

>I just rewrote bunzip2 for busybox in about 500 lines of C (and a good chunk 
>of that's comments), which comiles to a bit under 7k, and I was thinking of 
>redoing the bunzip-the-kernel patch with my new bunzip code, but I can't find 
>the patch.  Anybody got a URL to it?
>
>The most recent one I could find was kerneltrap's 404-error link to 
>http://chrissicool.piranho.com/patch-2.4.x-bzip2-i386
>


This came up on the list a while back. IIRC the conclusion was that
runtime memory usage and speed, and not so significant compression
improvement over gzip.



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-18  5:18 Where's the bzip2 compressed linux-kernel patch? Rob Landley
  2003-10-18  5:30 ` Nick Piggin
@ 2003-10-18  6:05 ` Erik Andersen
  2003-10-18 16:43 ` Jörn Engel
  2 siblings, 0 replies; 18+ messages in thread
From: Erik Andersen @ 2003-10-18  6:05 UTC (permalink / raw)
  To: Rob Landley; +Cc: linux-kernel

On Sat Oct 18, 2003 at 12:18:21AM -0500, Rob Landley wrote:
> I just rewrote bunzip2 for busybox in about 500 lines of C (and a good chunk 
> of that's comments), which comiles to a bit under 7k, and I was thinking of 
> redoing the bunzip-the-kernel patch with my new bunzip code, but I can't find 
> the patch.  Anybody got a URL to it?
> 
> The most recent one I could find was kerneltrap's 404-error link to 
> http://chrissicool.piranho.com/patch-2.4.x-bzip2-i386

http://shepard.kicks-ass.net/~cc/

 -Erik

--
Erik B. Andersen             http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-18  5:30 ` Nick Piggin
@ 2003-10-18 11:39   ` Daniel Egger
  2003-10-19  0:51     ` Nick Piggin
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Egger @ 2003-10-18 11:39 UTC (permalink / raw)
  To: Nick Piggin; +Cc: linux-kernel, rob

[-- Attachment #1: Type: text/plain, Size: 902 bytes --]

Am Sam, den 18.10.2003 schrieb Nick Piggin um 07:30:

> This came up on the list a while back. IIRC the conclusion was that
> runtime memory usage and speed, and not so significant compression
> improvement over gzip.

I quick test with a PowerPC kernel and the normal vmlinux image reveals
that this is nonsense. 

-rwxr-xr-x    1 root     root      2766490 2003-09-27 22:29 vmlinux
-rwxr-xr-x    1 root     root      1149410 2003-09-27 22:29 vmlinux.gz
-rwxr-xr-x    1 root     root      1062999 2003-09-27 22:29 vmlinux.bz2

This is a 86411 bytes or 8.1% reduction, seems significant to me...

Granted, it takes 9 times as long to decompress the kernel and ca. 900kb
more memory but considering an embedded DSL router I'm working with
which has 16MB RAM but only 4MB Flash this is certainly worth it. At
least when the target is an embedded device.

-- 
Servus,
       Daniel

[-- Attachment #2: Dies ist ein digital signierter Nachrichtenteil --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-18  5:18 Where's the bzip2 compressed linux-kernel patch? Rob Landley
  2003-10-18  5:30 ` Nick Piggin
  2003-10-18  6:05 ` Erik Andersen
@ 2003-10-18 16:43 ` Jörn Engel
  2003-10-18 20:38   ` Rob Landley
  2 siblings, 1 reply; 18+ messages in thread
From: Jörn Engel @ 2003-10-18 16:43 UTC (permalink / raw)
  To: Rob Landley; +Cc: linux-kernel

On Sat, 18 October 2003 00:18:21 -0500, Rob Landley wrote:
> 
> I just rewrote bunzip2 for busybox in about 500 lines of C (and a good chunk 
> of that's comments), which comiles to a bit under 7k.

5140 on my machine, compared to 9436 for the stock decompress.o.  Nice.

Does it survive the bzip2 testcases?

> P.S.  If you're curious about the micro-bunzip code, it's in busybox CVS:
> http://www.busybox.net/cgi-bin/cvsweb/busybox/archival/libunarchive/decompress_bunzip2.c

Not pretty with 80 columns, but it looks good at first glance.  And
surely more fun to work on than the zlib-inspired code from Julian.

Jörn

-- 
Fools ignore complexity.  Pragmatists suffer it.
Some can avoid it.  Geniuses remove it.
-- Perlis's Programming Proverb #58, SIGPLAN Notices, Sept.  1982

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-18 16:43 ` Jörn Engel
@ 2003-10-18 20:38   ` Rob Landley
  2003-10-20  8:31     ` Jörn Engel
  0 siblings, 1 reply; 18+ messages in thread
From: Rob Landley @ 2003-10-18 20:38 UTC (permalink / raw)
  To: Jörn Engel; +Cc: linux-kernel

On Saturday 18 October 2003 11:43, Jörn Engel wrote:
> On Sat, 18 October 2003 00:18:21 -0500, Rob Landley wrote:
> > I just rewrote bunzip2 for busybox in about 500 lines of C (and a good
> > chunk of that's comments), which comiles to a bit under 7k.
>
> 5140 on my machine, compared to 9436 for the stock decompress.o.  Nice.
>
> Does it survive the bzip2 testcases?

The decompression-side ones, yes.  (Modulo different command line arguments, 
and that I didn't implement the "small mode" that's slower but uses less 
memory.  That would probably only add a couple hundred bytes, and I could 
make it a compile time option, but I just haven't gotten around to it yet.  
If somebody wants to send me a patch... :)

Mostly I've been repeatedly extracting linux-2.6.0-test6.tar.bz2 as my 
testcase.  Much more of a workout. :)

> > P.S.  If you're curious about the micro-bunzip code, it's in busybox CVS:
> > http://www.busybox.net/cgi-bin/cvsweb/busybox/archival/libunarchive/decom
> >press_bunzip2.c
>
> Not pretty with 80 columns, but it looks good at first glance.

Manuel Novoa submitted a patch that sped things up over 10% (seriously cool, 
that's why we're faster than the original), but broke the 80 column thing 
(mostly a couple return statements that need to be on the next line).

I'm happy to take a patch to clean it up. :)

> And surely more fun to work on than the zlib-inspired code from Julian.

That was the original reason for doing this, yes. :)

Eric Anderson pointed me to the new home of the kernel bunzip patch, which is 
at "http://shepard.kicks-ass.net/~cc/", and I'll take a stab at porting it to 
2.6.0-test8 as the mood strikes me. :)

> Jörn

Rob

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-18 11:39   ` Daniel Egger
@ 2003-10-19  0:51     ` Nick Piggin
  2003-10-19 10:45       ` Michael Buesch
  0 siblings, 1 reply; 18+ messages in thread
From: Nick Piggin @ 2003-10-19  0:51 UTC (permalink / raw)
  To: Daniel Egger; +Cc: linux-kernel, rob



Daniel Egger wrote:

>Am Sam, den 18.10.2003 schrieb Nick Piggin um 07:30:
>
>
>>This came up on the list a while back. IIRC the conclusion was that
>>runtime memory usage and speed, and not so significant compression
>>improvement over gzip.
>>
>
>I quick test with a PowerPC kernel and the normal vmlinux image reveals
>that this is nonsense. 
>
>-rwxr-xr-x    1 root     root      2766490 2003-09-27 22:29 vmlinux
>-rwxr-xr-x    1 root     root      1149410 2003-09-27 22:29 vmlinux.gz
>-rwxr-xr-x    1 root     root      1062999 2003-09-27 22:29 vmlinux.bz2
>
>This is a 86411 bytes or 8.1% reduction, seems significant to me...
>

Sure, it might be worth it in some cases. I didn't mean improvement
wasn't measurable at all.



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-19  0:51     ` Nick Piggin
@ 2003-10-19 10:45       ` Michael Buesch
  2003-10-19 21:04         ` Willy Tarreau
  2003-10-19 22:15         ` Rob Landley
  0 siblings, 2 replies; 18+ messages in thread
From: Michael Buesch @ 2003-10-19 10:45 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Daniel Egger, linux-kernel, rob

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday 19 October 2003 02:51, Nick Piggin wrote:
> Daniel Egger wrote:
> >I quick test with a PowerPC kernel and the normal vmlinux image reveals
> >that this is nonsense.
> >
> >-rwxr-xr-x    1 root     root      2766490 2003-09-27 22:29 vmlinux
> >-rwxr-xr-x    1 root     root      1149410 2003-09-27 22:29 vmlinux.gz
> >-rwxr-xr-x    1 root     root      1062999 2003-09-27 22:29 vmlinux.bz2
> >
> >This is a 86411 bytes or 8.1% reduction, seems significant to me...
>
> Sure, it might be worth it in some cases. I didn't mean improvement
> wasn't measurable at all.

What about making it configurable? If someone want's bzip2
and if he want's to wait longer to boot, (s)he may
compile bzip2 support.
If someone dislikes it, (s)he may use gzip.

- -- 
Regards Michael Buesch  [ http://www.tuxsoft.de.vu ]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/kmtjoxoigfggmSgRAqNAAJ9isDDlHbCBTWleCMMtd7/AtaogBACcDu8o
tg6ggQ+w3nyn933Po7tRJ5I=
=jft8
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-19 10:45       ` Michael Buesch
@ 2003-10-19 21:04         ` Willy Tarreau
  2003-10-19 21:27           ` Michael Buesch
  2003-10-20  0:00           ` Rob Landley
  2003-10-19 22:15         ` Rob Landley
  1 sibling, 2 replies; 18+ messages in thread
From: Willy Tarreau @ 2003-10-19 21:04 UTC (permalink / raw)
  To: Michael Buesch; +Cc: Nick Piggin, Daniel Egger, linux-kernel, rob

On Sun, Oct 19, 2003 at 12:45:46PM +0200, Michael Buesch wrote:
 
> What about making it configurable? If someone want's bzip2
> and if he want's to wait longer to boot, (s)he may
> compile bzip2 support.
> If someone dislikes it, (s)he may use gzip.

I don't know if people have already tested LZO and NRV (used in UPX). Perhaps
LZO might already be a little better than gzip, while faster and with less
code. NRV is surely better, but is not open source IIRC. Perhaps Markus would
agree to release a free NRV decompressor if asked kindly ?

Regards,
Willy


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-19 21:04         ` Willy Tarreau
@ 2003-10-19 21:27           ` Michael Buesch
  2003-10-20  0:00           ` Rob Landley
  1 sibling, 0 replies; 18+ messages in thread
From: Michael Buesch @ 2003-10-19 21:27 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: Nick Piggin, Daniel Egger, linux-kernel, rob

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday 19 October 2003 23:04, Willy Tarreau wrote:
> I don't know if people have already tested LZO and NRV (used in UPX).
> Perhaps LZO might already be a little better than gzip, while faster and
> with less code. NRV is surely better, but is not open source IIRC. Perhaps
> Markus would agree to release a free NRV decompressor if asked kindly ?

If NRV is closed source (I don't know), then we would need a free
compressor _and_ a free decompressor. Because I don't think we should
go and compile (compress) the kernel image with some closed-source
software. 8-)

> Regards,
> Willy

- -- 
Regards Michael Buesch  [ http://www.tuxsoft.de.vu ]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/kwGuoxoigfggmSgRArzRAJ4rNgBumLFlQ9q+OincNpxmL2uKaQCePZi2
DFXJPIOdcY4IoAivT6gBmOg=
=uOq6
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-19 10:45       ` Michael Buesch
  2003-10-19 21:04         ` Willy Tarreau
@ 2003-10-19 22:15         ` Rob Landley
  1 sibling, 0 replies; 18+ messages in thread
From: Rob Landley @ 2003-10-19 22:15 UTC (permalink / raw)
  To: Michael Buesch, Nick Piggin; +Cc: Daniel Egger, linux-kernel

On Sunday 19 October 2003 05:45, Michael Buesch wrote:
> On Sunday 19 October 2003 02:51, Nick Piggin wrote:
> > Daniel Egger wrote:
> > >I quick test with a PowerPC kernel and the normal vmlinux image reveals
> > >that this is nonsense.
> > >
> > >-rwxr-xr-x    1 root     root      2766490 2003-09-27 22:29 vmlinux
> > >-rwxr-xr-x    1 root     root      1149410 2003-09-27 22:29 vmlinux.gz
> > >-rwxr-xr-x    1 root     root      1062999 2003-09-27 22:29 vmlinux.bz2
> > >
> > >This is a 86411 bytes or 8.1% reduction, seems significant to me...
> >
> > Sure, it might be worth it in some cases. I didn't mean improvement
> > wasn't measurable at all.
>
> What about making it configurable? If someone want's bzip2
> and if he want's to wait longer to boot, (s)he may
> compile bzip2 support.
> If someone dislikes it, (s)he may use gzip.

That's what the patch against 2.4 did.  I'm banging on a 2.6 version, but 
Manuel's continuing to optimize bunzip over in busybox cvs (I'm good at 
cleaning up and simplifying, but he's way better at optimizing), and I'm 
waiting ot see the results (and starting a micro-version of the compression 
side code in the meantime, which is irrelevant here...)

Rob

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-19 21:04         ` Willy Tarreau
  2003-10-19 21:27           ` Michael Buesch
@ 2003-10-20  0:00           ` Rob Landley
  2003-10-20  4:28             ` Willy Tarreau
  1 sibling, 1 reply; 18+ messages in thread
From: Rob Landley @ 2003-10-20  0:00 UTC (permalink / raw)
  To: Willy Tarreau, Michael Buesch; +Cc: Nick Piggin, Daniel Egger, linux-kernel

On Sunday 19 October 2003 16:04, Willy Tarreau wrote:
> On Sun, Oct 19, 2003 at 12:45:46PM +0200, Michael Buesch wrote:
> > What about making it configurable? If someone want's bzip2
> > and if he want's to wait longer to boot, (s)he may
> > compile bzip2 support.
> > If someone dislikes it, (s)he may use gzip.
>
> I don't know if people have already tested LZO and NRV (used in UPX).
> Perhaps LZO might already be a little better than gzip, while faster and
> with less code. NRV is surely better, but is not open source IIRC. Perhaps
> Markus would agree to release a free NRV decompressor if asked kindly ?

You know, I remember the days of arj, zoo, arc, lha, lzw, the big switchover
from pkzip 1.x to pkzip 2.x, and about 8 zillion others...  And I must say
that the depth of my lack of caring truly cannot be expressed in words.

As soon as I have to start seeing a lot of blah.tar.upx files on ftp sites,
and the tools for dealing with them come with my Linux distribution, then
maybe I'll start caring.  Until then, no.  I'm interested in bzip because
A) it reliably compresses smaller than gzip by margin big enough to care
about, B) it's in widespread use.

You're talking about something that compresses _less_ well than bzip, but
does it faster.  We have one, it's called gzip.

Ville Herva sent me this table yesterday:                                                                    
-----
original vmlinux: 2718814 bytes, 2.4.18pre3                                     
                                                                                
gzip -1 gzip    gzip -9   bzip2 -1 bzip2   bzip2 -9  upx -1   upx     upx -9  upx --best                                                             
compress     0.736   1.289   5.300     5.342    10.810  10.585    1.672    4.794   12.361  20.054                                                                 
decompress   0.157   0.278   0.210     1.360     1.920   1.948    0.156    0.158   0.153   0.159                                                                  
bytes        1247526 1151202 1145777   1118671  1101465 1101465   1311390  1180222 1179325 1178953                                                        
                                                                                
upx is upx-1.22.  
-----

So you're talking about something that compresses LESS well than gzip.  Why?
(I'm told the UPX numbers include a self-extraction decompressor, and the
gzip numbers don't.  The self-extractor would have to be 33k for this to
bring it _even_ with gzip.  And no, gzip -9 does not add anything to
decompression time, only compression time.  I ported info-zip to java 1.0
just before java 1.1 came out with deflate as part of the standard API, I
know how deflate/inflate works pretty well.  I could make deflate work in
place pretty easily.  I can make bunzip work in-place if you'd like (the
symbols occur sequentially, just stick the compressed data at the end of
a buffer the size of the decompressed data, and by the time you're
overwriting any given symbol you must already have read it).  Although bzip
would still need 2-4 megabytes of scratch space to undo the burrows-wheeler
transform, and gzip only needs something like a 64k dictionary.  And doing
either in-place assumes you know the size of the decoded data ahead of
time, but storing the size in 4k pages means 2 bytes gets you a 256
megabyte capacity, which is plenty for the kernel.  (No, I'm not bothering
to overwrite the decompressor.  You can execute that out of flash, why copy
it to dram at all?)

So why should anyone care about UPX?  (Is there really a situation where
gzip isn't fast enough anymore?)  I'm curious why more than one person has
brought it up.  Am I missing something...?

> Regards,
> Willy

Rob

(P.S.  I don't remember info-zip having as much cruft in its deflate
implementation as bzip did, but I intend to do a cleaning pass over
busybox's gzip/gunzip when I get around to it, and I'd be happy to
port the improvements (if any) to the linux kernel afterwards.)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-20  0:00           ` Rob Landley
@ 2003-10-20  4:28             ` Willy Tarreau
  2003-10-20  4:52               ` Valdis.Kletnieks
  2003-10-20  5:47               ` Rob Landley
  0 siblings, 2 replies; 18+ messages in thread
From: Willy Tarreau @ 2003-10-20  4:28 UTC (permalink / raw)
  To: Rob Landley
  Cc: Willy Tarreau, Michael Buesch, Nick Piggin, Daniel Egger,
	linux-kernel

On Sun, Oct 19, 2003 at 07:00:47PM -0500, Rob Landley wrote:
> upx is upx-1.22.  
> 
> So you're talking about something that compresses LESS well than gzip.  Why?

I'm not talking about something that compresses LESS than gzip, I'm talking
about something I almost always use to recompress my kernels and reduce their
size by 20% over gzip, and decompress way faster.

It's upx-1.90 (which is able to decompress/recompress bzimage). OK it's still
a development version, but it has already saved me megs of flash.

I don't know how the tests above were done. But what I know for sure is that
there are excessive open source zealots who would only download the OSS
version of UPX which uses the UCL library while the closed source version
uses the NRV one which is a jewel.

And you know what ? when I put a complete distro on a 8MB flash, my primary
concern is to save as much space as possible, and my ability to read the
compressor sources only comes next.

BTW, I don't remember every upx argument, but I'm sure I got best numbers
with '--best --crp-ms=100000'.

> And no, gzip -9 does not add anything to decompression time, only
> compression time.

Where did you get this interesting idea ? every decompressor needs
decompression time. You need the compressed kernel to be in memory, then you
decompress it, then you boot it. On my old 386sx which was still my home
firewall 6 months ago, the kernel would take 2-3 seconds to decompress with
gzip while it was almost unnoticeable with upx (which did it in place, BTW).

Now don't get me wrong, I'm not advertising for upx. But bzip2 was proposed
and will always be criticized because of its decompression time and cost in
terms of memory. So I simply suggested to take a look at other solutions
which seem interesting. An UPX-based implementation seems interesting to me
only if the decompression code is free and can be put in the kernel. Otherwise
it is not. If the numbers above come from the UCL lib, then we at least know
that this one doesn't interest us for this matter. Some people would also
suggest taking a look at other compression algorithms which can be of
interest, but I don't know their sources status (open/close).

> I can make bunzip work in-place if you'd like

That's very important for low-memory systems.

Regards,
Willy


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-20  4:28             ` Willy Tarreau
@ 2003-10-20  4:52               ` Valdis.Kletnieks
  2003-10-20  5:47               ` Rob Landley
  1 sibling, 0 replies; 18+ messages in thread
From: Valdis.Kletnieks @ 2003-10-20  4:52 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Rob Landley, Michael Buesch, Nick Piggin, Daniel Egger,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 863 bytes --]

On Mon, 20 Oct 2003 06:28:37 +0200, Willy Tarreau said:

> I don't know how the tests above were done. But what I know for sure is that
> there are excessive open source zealots who would only download the OSS
> version of UPX which uses the UCL library while the closed source version
> uses the NRV one which is a jewel.

Well.. unfortunately, a closed-source UPX has as much chance of making it into
the mainstream kernel as the NVidia graphics drivers.  Both will have to remain
things that are self-inflicted by end users.

> > And no, gzip -9 does not add anything to decompression time, only
> > compression time.
> 
> Where did you get this interesting idea ? every decompressor needs
> decompression time. 

I think he meant that decompressing a 'gzip -1' and a 'gzip -9' go at basically
the same Mbytes/second - the big CPU hit is at compression time.


[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-20  4:28             ` Willy Tarreau
  2003-10-20  4:52               ` Valdis.Kletnieks
@ 2003-10-20  5:47               ` Rob Landley
  2003-10-20 11:55                 ` Willy Tarreau
  1 sibling, 1 reply; 18+ messages in thread
From: Rob Landley @ 2003-10-20  5:47 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: linux-kernel

On Sunday 19 October 2003 23:28, Willy Tarreau wrote:
> I don't know how the tests above were done. But what I know for sure is
> that there are excessive open source zealots who would only download the
> OSS version of UPX which uses the UCL library while the closed source
> version uses the NRV one which is a jewel.

Good for you.  And arj beat pkzip for years, no a number of metrics.  I 
honestly do not care.

> > And no, gzip -9 does not add anything to decompression time, only
> > compression time.
>
> Where did you get this interesting idea ? every decompressor needs
> decompression time.

I mean that the -9 version of gzip does not take any more time to decompress 
than the -1 version does, it only affects how many matches are checked in the 
dictionary before it stops looking for a shorter match.  During decompression 
the offsets are all stored, it doesn't matter how they were calculated.

That's where I got the "interesting idea" that the -9 option to the gzip 
compressor does not add anything to the decompression time.

> You need the compressed kernel to be in memory, then
> you decompress it, then you boot it. On my old 386sx which was still my
> home firewall 6 months ago, the kernel would take 2-3 seconds to decompress
> with gzip while it was almost unnoticeable with upx (which did it in place,
> BTW).

Interesting.  So you're suggesting that your algorithm is optimized for a 
processor with no L1 or L2 cache whatsoever?  Or are you suggesting that an 
algorithm that takes upwards of 3 seconds on a system that maxed out at 16 
mhz and had a 16 bit data path, a system with a maximum memory throughput 
slower than modern low-end hard drives (16mhz*2 bytes is 32 megabyts per 
second, and half for read and half for write is 16 megabytes per second when 
copying data without actually PROCESSING any of it, and this glosses over the 
fact that with no instruction cache you'll never see even close to that 
theoretical throughput on any code snippet longer than "rep movsw")...  That 
this algorithm is slow enough to be a legitimate optimization target and 
worth using closed source software to optimize this bottleneck.

Are you really suggesting that gzip isn't fast enough?  (Out of morbid 
curiosity, how long did it take the bios code to boot up this straw man to 
the point where it loaded the boot sector?)

> Now don't get me wrong, I'm not advertising for upx.

No comment.

> But bzip2 was proposed

No, I posted a link to code.  And I offered to use that code to update an 
existing kernel patch if I could track it down, which I did.  (And I'm 
working on a 2.6 port with the new engine, albeit not at a particularly high 
priority seeing as we're in the middle of a code freeze...)

You're welcome to code up your own patch to do whatever you darn well please.  
I'm not interested.

> and will always be criticized because of its decompression time and cost in
> terms of memory. So I simply suggested to take a look at other solutions
> which seem interesting.

You suggested I spend my free time to code up a patch to support a closed 
source compressor I'd never heard of.  I've taken it under advisement, and 
filed it appropriately.

>  An UPX-based implementation seems interesting to me
> only if the decompression code is free and can be put in the kernel.

Good for you.  Code up a patch, so I can start ignoring code instead of just 
ignoring suggestions.

> Otherwise it is not. If the numbers above come from the UCL lib, then we at
> least know that this one doesn't interest us for this matter.

I presume you're using the editorial "we".  Because I, myself, find the 
qualifier "for this matter" to be superfluous.

> Some people
> would also suggest taking a look at other compression algorithms which can
> be of interest, but I don't know their sources status (open/close).

If some people code up a patch, then the matter moves out of the realm of pure 
philosophical arguments.

> > I can make bunzip work in-place if you'd like
>
> That's very important for low-memory systems.

Really?  Wow.

> Regards,
> Willy

Rob

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-18 20:38   ` Rob Landley
@ 2003-10-20  8:31     ` Jörn Engel
  2003-10-20  9:41       ` Rob Landley
  0 siblings, 1 reply; 18+ messages in thread
From: Jörn Engel @ 2003-10-20  8:31 UTC (permalink / raw)
  To: Rob Landley; +Cc: linux-kernel

On Sat, 18 October 2003 15:38:35 -0500, Rob Landley wrote:
> 
> The decompression-side ones, yes.  (Modulo different command line arguments, 
> and that I didn't implement the "small mode" that's slower but uses less 
> memory.  That would probably only add a couple hundred bytes, and I could 
> make it a compile time option, but I just haven't gotten around to it yet.  
> If somebody wants to send me a patch... :)

Does anyone really need the "small mode"?  If not, I consider it a
feature to not support it. :)

> > Not pretty with 80 columns, but it looks good at first glance.
> 
> Manuel Novoa submitted a patch that sped things up over 10% (seriously cool, 
> that's why we're faster than the original), but broke the 80 column thing 
> (mostly a couple return statements that need to be on the next line).
> 
> I'm happy to take a patch to clean it up. :)

Today is rainy.  Why not? ;)

> > And surely more fun to work on than the zlib-inspired code from Julian.
> 
> That was the original reason for doing this, yes. :)

You don't - by any chance - plan the same thing for the compression
side, do you?  I still have vague plans to improve the algorithm a bit,
so a clean codebase would be nice to have.

> Eric Anderson pointed me to the new home of the kernel bunzip patch, which is 
> at "http://shepard.kicks-ass.net/~cc/", and I'll take a stab at porting it to 
> 2.6.0-test8 as the mood strikes me. :)

Cool!  Maybe I should update my patches again.  They are definitely
2.7 material, but if people want to play with that stuff...

Jörn

-- 
My second remark is that our intellectual powers are rather geared to
master static relations and that our powers to visualize processes
evolving in time are relatively poorly developed.
-- Edsger W. Dijkstra

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-20  8:31     ` Jörn Engel
@ 2003-10-20  9:41       ` Rob Landley
  0 siblings, 0 replies; 18+ messages in thread
From: Rob Landley @ 2003-10-20  9:41 UTC (permalink / raw)
  To: Jörn Engel; +Cc: linux-kernel

On Monday 20 October 2003 03:31, Jörn Engel wrote:
> On Sat, 18 October 2003 15:38:35 -0500, Rob Landley wrote:
> > The decompression-side ones, yes.  (Modulo different command line
> > arguments, and that I didn't implement the "small mode" that's slower but
> > uses less memory.  That would probably only add a couple hundred bytes,
> > and I could make it a compile time option, but I just haven't gotten
> > around to it yet. If somebody wants to send me a patch... :)
>
> Does anyone really need the "small mode"?  If not, I consider it a
> feature to not support it. :)

The "fast mode" walks a 4 megabyte array (okay, 900k*4=3.6 megabytes) in a way 
that makes the cpu cache break down and cry, and actually brings dram bank 
switching latency into the picture.  (It's about the worst case scenario for 
memory access; fetching whole cache lines for randomly ordered 4 byte 
reads...)  Anything that doesn't do that seems like it's at least worth a 
look, but it's not a high priority just now...;)

Any future uber-xeon that has a 4 megabyte L2 cache is going to do bunzip 
_really_ fast. :)

> > > Not pretty with 80 columns, but it looks good at first glance.
> >
> > Manuel Novoa submitted a patch that sped things up over 10% (seriously
> > cool, that's why we're faster than the original), but broke the 80 column
> > thing (mostly a couple return statements that need to be on the next
> > line).
> >
> > I'm happy to take a patch to clean it up. :)
>
> Today is rainy.  Why not? ;)

It turns out Manuel Novoa is busy beating the source code to death with a 
profiler, which is why I'm waiting on the bzip-kernel patch.  He's sped it up 
another 10% or so since last time (extracting the kernel tarball took 22 
seconds in the original, and he's got it down under 18 now), but hasn't 
checked the result into busybox's CVS yet.  (He's still tweaking.)

No rush.  The kernel patch is on hold pending delivery of Manuel's updated 
bunzip code.  The ball's in his court at the moment.

> > > And surely more fun to work on than the zlib-inspired code from Julian.
> >
> > That was the original reason for doing this, yes. :)
>
> You don't - by any chance - plan the same thing for the compression
> side, do you?  I still have vague plans to improve the algorithm a bit,
> so a clean codebase would be nice to have.

That's what I'm working on now.  Yesterday, I glued all the bits necessary to 
bzip compress stdin to stdout into one 95k kilobyte c file (which compiled to 
a ~50k executable).  So far this evening I've trimmed that down to 79k of 
source and 48k of executable, and I'm still at the "crapectomy" stage.  Not 
actually doing serious reorganization type cleanup yet, just taking a scalpel 
(okay, vi) and removing bits that shouldn't BE there.  (The line "typedef 
BZFILE void;" should not occur in any program anywhere ever.  It just 
shouldn't.  "#define BZ_API(func) func" and "#define BZ_EXTERN extern" aren't 
really a good sign either, although they were the result of making the code 
compile on windows.  Unwinding this kind of thing is very time consuming.)

Not really relevant to the kernel, though.  Ask on the busybox list if you 
want to know more. :)

> > Eric Anderson pointed me to the new home of the kernel bunzip patch,
> > which is at "http://shepard.kicks-ass.net/~cc/", and I'll take a stab at
> > porting it to 2.6.0-test8 as the mood strikes me. :)
>
> Cool!  Maybe I should update my patches again.  They are definitely
> 2.7 material, but if people want to play with that stuff...
>
> Jörn

Which patches are you referring to?

I'm going to take the shepard patch and make a 2.6 version with the new bunzip 
code as soon as Manuel gets tired of making it faster.  But he hasn't yet, 
and I've got plenty of other things to do in the meantime...

Rob

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Where's the bzip2 compressed linux-kernel patch?
  2003-10-20  5:47               ` Rob Landley
@ 2003-10-20 11:55                 ` Willy Tarreau
  0 siblings, 0 replies; 18+ messages in thread
From: Willy Tarreau @ 2003-10-20 11:55 UTC (permalink / raw)
  To: Rob Landley; +Cc: linux-kernel

On Mon, Oct 20, 2003 at 12:47:52AM -0500, Rob Landley wrote:
 
> > > And no, gzip -9 does not add anything to decompression time, only
> > > compression time.
> >
> > Where did you get this interesting idea ? every decompressor needs
> > decompression time.
> 
> I mean that the -9 version of gzip does not take any more time to decompress 
> than the -1 version does, it only affects how many matches are checked in the 
> dictionary before it stops looking for a shorter match.  During decompression 
> the offsets are all stored, it doesn't matter how they were calculated.

OK, I misunderstood your statement. I thought you were saying that gunzip
costs nothing !

> Interesting.  So you're suggesting that your algorithm is optimized for a 
> processor with no L1 or L2 cache whatsoever?  Or are you suggesting that an 
> algorithm that takes upwards of 3 seconds on a system that maxed out at 16 
> mhz and had a 16 bit data path, a system with a maximum memory throughput 
> slower than modern low-end hard drives (16mhz*2 bytes is 32 megabyts per 
> second, and half for read and half for write is 16 megabytes per second when 
> copying data without actually PROCESSING any of it, and this glosses over the 
> fact that with no instruction cache you'll never see even close to that 
> theoretical throughput on any code snippet longer than "rep movsw")...  That 
> this algorithm is slow enough to be a legitimate optimization target and 
> worth using closed source software to optimize this bottleneck.

No, I simply meant that I *observed* consequent gains on this rather limited
platform (it's 33 Mhz, BTW), and I think that if the decompression time is
unnoticeable on it, it will also be on any newer hardware.

> Are you really suggesting that gzip isn't fast enough?  (Out of morbid 
> curiosity, how long did it take the bios code to boot up this straw man to 
> the point where it loaded the boot sector?)

I'm not saying that gzip isn't fast enough, and yes the bios takes longer.

> No, I posted a link to code.  And I offered to use that code to update an 
> existing kernel patch if I could track it down, which I did.  (And I'm 
> working on a 2.6 port with the new engine, albeit not at a particularly high 
> priority seeing as we're in the middle of a code freeze...)
> 
> You're welcome to code up your own patch to do whatever you darn well please.  
> I'm not interested.

I don't want to code it, and never asked you to code it. I stated that this
code already exists, and all that would be needed is its author to agree to
open it. Then if he did, it would even save you some time coding your bzip2
version.

> You suggested I spend my free time to code up a patch to support a closed 
> source compressor I'd never heard of.  I've taken it under advisement, and 
> filed it appropriately.

I didn't suggest to code it (since it already exists), but only to study it
among other algos. I'm sorrt you took it for advertisement while it really
was not.

> Good for you.  Code up a patch, so I can start ignoring code instead of just 
> ignoring suggestions.

I don't want to fight, you clearly stated that you don't want to hear about
this one. That's OK, period. I can still use upx as I did before if I need.
I repeat, I was only *suggesting* something which already exists, but is not
(yet?) open source.

Regards,
Willy


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2003-10-20 11:55 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-10-18  5:18 Where's the bzip2 compressed linux-kernel patch? Rob Landley
2003-10-18  5:30 ` Nick Piggin
2003-10-18 11:39   ` Daniel Egger
2003-10-19  0:51     ` Nick Piggin
2003-10-19 10:45       ` Michael Buesch
2003-10-19 21:04         ` Willy Tarreau
2003-10-19 21:27           ` Michael Buesch
2003-10-20  0:00           ` Rob Landley
2003-10-20  4:28             ` Willy Tarreau
2003-10-20  4:52               ` Valdis.Kletnieks
2003-10-20  5:47               ` Rob Landley
2003-10-20 11:55                 ` Willy Tarreau
2003-10-19 22:15         ` Rob Landley
2003-10-18  6:05 ` Erik Andersen
2003-10-18 16:43 ` Jörn Engel
2003-10-18 20:38   ` Rob Landley
2003-10-20  8:31     ` Jörn Engel
2003-10-20  9:41       ` Rob Landley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).