public inbox for linux-kernel@vger.kernel.org
From: Paul Jackson <pj@engr.sgi.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: cw@f00f.org, torvalds@osdl.org, davem@davemloft.net,
	andrea@suse.de, mbp@sourcefrog.net, linux-kernel@vger.kernel.org,
	dlang@digitalinsight.com
Subject: Re: Kernel SCM saga..
Date: Sun, 10 Apr 2005 10:38:05 -0700
Message-ID: <20050410103805.7eee2fea.pj@engr.sgi.com>
In-Reply-To: <20050410120331.GA8878@elte.hu>

Ingo wrote:
> With default gzip it's 3.3 seconds though,
> and that still compresses it down to 57 MB.

Interesting.  I'm surprised how much a bunch of separate, modest-sized
files can be compressed.

I'm unclear what matters most here.

Space on disk certainly isn't much of an issue.  Even with Andrew Morton
on our side, we still can't grow the kernel as fast as the disk drive
manufacturers can grow disk sizes.

Main memory size of the compressed history matters to Linus and his top
20 lieutenants doing full kernel source patching as a primary mission if
they can't fit the source _history_ in main memory.  But those people
are running 1 GByte or more of RAM - so whether it is 95, 57 or 45
MBytes, it fits fine.  The rest of us are mostly concerned with whether
a kernel build fits in memory.

Looking at an arch i386 kernel build tree I have at hand, I see the
following disk usage:

	102 MBytes - BitKeeper/*
	287 MBytes - */SCCS/* (outside of already counted BitKeeper/*)
	232 MBytes - checked out source files
	 94 MBytes - ELF and other build byproducts
	---
	715 MBytes - Total

Converting from bk to git, I guess this becomes:

	 97 MBytes - git (zlib)
	232 MBytes - checked out source files
	 94 MBytes - ELF and other build byproducts
	---
	423 MBytes - Total

Size matters when it's a two to one difference, but when we are down to a
10% to 15% difference in the Total, it's presentation that matters.  The
above numbers tell me that this is not a pure size issue for local disk
or memory usage.
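
As a rough check on the arithmetic in the two tables above, the following
Python sketch recomputes the totals and the relative saving for each
candidate history size.  The 97 MByte figure is from the table; 57 and 45
MBytes are the compressed sizes quoted earlier in this thread.  The
variable names are illustrative only.

```python
# All figures are MBytes, copied from the tables and text above.
bk_history = 102 + 287      # BitKeeper/* plus */SCCS/*
working_tree = 232 + 94     # checked out source plus build byproducts

bk_total = bk_history + working_tree

# Candidate sizes for the compressed git history, mapped to tree totals.
git_totals = {mb: mb + working_tree for mb in (97, 57, 45)}

print(f"bk total: {bk_total} MBytes")
baseline = git_totals[97]
for mb, total in git_totals.items():
    # Saving relative to the 97 MByte (zlib) case from the table.
    saving = 100 * (baseline - total) / baseline
    print(f"git history {mb:3d} -> total {total} MBytes ({saving:.1f}% saved)")
```

On these numbers the move from bk to git is the large step (715 down to
423), while the choice of compressor moves the Total by roughly 9% to 12%.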

What does matter that I can see:

 1) Linus explicitly stated he wanted "a raw zlib compressed blob,
    not a gzipped file", to encourage everyone to use the git tools to
    access this data.  He did not "want people editing repository files
    by hand."  I'm not sure what he gains here - it did annoy me for a
    couple hours before I decided fixing my supper was more important.

 2) The time to compress will be noticed by users as a delay when
    checking in changes (I'm guessing zlib compresses relatively faster).

 3) The time to copy compressed data over the internet will be
    noticed by users when upgrading kernel versions (gzip can
    compress smaller).

 4) Decompress times are smaller so don't matter as much.

 5) Zlib has a nice library, and is patent free.  I don't know about gzip.

 6) As you note, zlib has rsync-friendly, recovery-friendly Z_PARTIAL_FLUSH.
    I don't know about gzip.
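
For points (2), (3) and (5), it may help to see how the zlib library
relates the two formats.  The sketch below is a minimal illustration using
Python's standard zlib module: it compresses the same bytes as a raw
DEFLATE stream, as a zlib stream, and as a gzip stream, all with the same
compressor and level.  The sample data and the helper name `deflate` are
made up for the example, and whether this matches the gzip settings Ingo
measured is an assumption.

```python
import zlib

def deflate(data, wbits, level=6):
    """Compress data with zlib; wbits selects the container format."""
    comp = zlib.compressobj(level, zlib.DEFLATED, wbits)
    return comp.compress(data) + comp.flush()

# Hypothetical sample: a small, repetitive, source-like buffer.
sample = b"static int foo(struct bar *b)\n{\n\treturn b->baz;\n}\n" * 200

raw = deflate(sample, -15)  # raw DEFLATE, no header or trailer
zl = deflate(sample, 15)    # zlib container (RFC 1950)
gz = deflate(sample, 31)    # gzip container (RFC 1952)

for name, blob in (("raw", raw), ("zlib", zl), ("gzip", gz)):
    print(f"{name:5s} {len(blob)} bytes")

# Each stream decompresses back to the same input.
assert zlib.decompress(raw, -15) == sample
assert zlib.decompress(zl, 15) == sample
assert zlib.decompress(gz, 31) == sample
```

With the same level, the three outputs differ only by a few bytes of
header and trailer, so in this setup any larger size gap would come from
the compression level chosen rather than from the container format.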

My guess is that Linus finds (2) and (3) to balance each other, and that
(1) decides the point, in favor of zlib.  Well, that or a simpler
hypothesis, that he found the nice library (5) convenient, and (1)
sealed the deal, with the other tradeoffs passing through his
subconscious faster than he bothered to verbalize them.

You (Ingo) seem in your second message to be encouraging further
consideration of gzip, for its improved compression.

How will that matter to us, day to day?

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401

