public inbox for linux-kernel@vger.kernel.org
From: Willy Tarreau <w@1wt.eu>
To: Drew Scott Daniels <ddaniels@UMAlumni.mb.ca>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Smaller compressed kernel source tarballs?
Date: Mon, 2 Oct 2006 05:35:31 +0200
Message-ID: <20061002033531.GA5050@1wt.eu>
In-Reply-To: <20061002033511.GB12695@zimmer>

On Sun, Oct 01, 2006 at 10:35:11PM -0500, Drew Scott Daniels wrote:
> ppmd, also in Debian had better compression than lzma. PAQ8i has even
> better compression, but isn't in Debian. See the maximumcompression web
> site or other archive comparison tests.

Interesting, but I suspect that you have not checked the compression time.
PAQ8i, for instance, is between 100 and 300 times SLOWER than bzip2, and in
exchange its output is only about 30% smaller! Given that the kernel already
takes a very long time to compress with bzip2, it would take several hours
to compress it with such tools. While they're very interesting proofs of
concept for compression research, they're not suited to any real-world usage!
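
For anyone who wants to check the time/size trade-off on their own data, here
is a minimal sketch. It is only an illustration: ppmd and PAQ8i are not in the
Python standard library, so zlib, bz2 and lzma stand in for them, and the
helper names (measure, compare) and the sample input are made up for the
example. Real figures for the kernel tree have to be measured on the real
tarball.

```python
import bz2
import lzma
import time
import zlib


def measure(name, compress, data):
    """Run one compressor and return (name, compressed size, seconds)."""
    start = time.perf_counter()
    out = compress(data)
    elapsed = time.perf_counter() - start
    return name, len(out), elapsed


def compare(data):
    """Compress the same bytes with three stdlib codecs (stand-ins only)."""
    codecs = [
        ("zlib -9", lambda d: zlib.compress(d, 9)),
        ("bzip2 -9", lambda d: bz2.compress(d, 9)),
        ("lzma", lambda d: lzma.compress(d)),
    ]
    return [measure(name, fn, data) for name, fn in codecs]


if __name__ == "__main__":
    # Hypothetical sample input; replace with the bytes of a real tarball.
    sample = b"static int foo(struct bar *b)\n{\n\treturn b->x;\n}\n" * 2000
    for name, size, secs in compare(sample):
        print(f"{name:10s} {size:10d} bytes {secs:8.3f} s")
```

Looking at both columns together is the point: a smaller output only matters
if the extra compression time is acceptable.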

> The pace of compression algorithm development is high enough that I'd
> suggest that the bar be placed quite high before switching to a new
> compression format that's not reverse compatible.

At least, ppmd takes the same time as bzip2 to achieve about 12% better
compression. But I don't think it justifies a switch.

> For those interested, I'm working on publishing a proof of concept that 
> can make most tarballs compress better. About 2-3% better in my tests 
> with bzip2/gzip on the Linux kernel source code.

A lot of improvement can be made in tar to better compress archives with a
large number of small files, such as the kernel. You just have to look at the
difference in archive size depending on the base directory name. If you
come up with something really interesting which alters neither the output
format nor the compression time, it might get a place in the git-tar-tree
command. But IMHO, it would be more interesting to further reduce the size
of patches than of tarballs, since patches might be downloaded far more often.
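
The base directory effect can be measured with a short sketch like the one
below. It builds the same set of small files into an in-memory tar under
different base directory names and reports the bzip2-compressed size. The
file contents, the names make_files and tar_bz2_size, and the directory names
are all hypothetical; the sketch makes no claim about how large the
difference is on the real kernel tree.

```python
import bz2
import io
import tarfile


def make_files(count):
    """Generate synthetic (path, content) pairs standing in for small sources."""
    return [
        (f"drivers/file{i:04d}.c", b"int f%d(void) { return %d; }\n" % (i, i))
        for i in range(count)
    ]


def tar_bz2_size(files, base_dir):
    """Return the bzip2-compressed size of a tar whose members sit under base_dir."""
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w") as tar:
        for name, content in files:
            info = tarfile.TarInfo(name=f"{base_dir}/{name}")
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return len(bz2.compress(raw.getvalue(), 9))


if __name__ == "__main__":
    files = make_files(500)
    # Same members, different base directory name in every tar header.
    for base in ("linux", "linux-2.6.18", "a-much-longer-base-directory-name"):
        print(f"{base:35s} {tar_bz2_size(files, base)} bytes")
```

Since every member's header repeats the base directory, the name is part of
what the compressor sees many thousands of times in an archive of small files.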

Regards,
Willy


