public inbox for linux-kernel@vger.kernel.org
From: Matt Mackall <mpm@selenic.com>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Andrea Arcangeli <andrea@suse.de>,
	David Eger <eger@havoc.gtf.org>, Petr Baudis <pasky@ucw.cz>,
	"Randy.Dunlap" <rddunlap@osdl.org>,
	Ross Vandegrift <ross@jose.lug.udel.edu>,
	Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Re: more git updates..
Date: Wed, 13 Apr 2005 13:44:51 -0700
Message-ID: <20050413204451.GP25554@waste.org>
In-Reply-To: <Pine.LNX.4.58.0504121809380.4501@ppc970.osdl.org>

On Tue, Apr 12, 2005 at 06:10:27PM -0700, Linus Torvalds wrote:
> 
> 
> On Wed, 13 Apr 2005, Andrea Arcangeli wrote:
> > 
> > I wasn't suggesting to use CVS. I meant that for a newly developed SCM,
> > the CVS/SCCS format as storage may be more appealing than the current
> > git format.
> 
> Go wild. I did mine in six days, and you've been whining about other
> people's SCMs for three years.

Last week I wrote a hack to do efficient delta storage with O(1) seeks
for lookup and append; I believe it's been integrated into the latest
Bazaar-NG. I expect it'll give better compression and performance than
BK. Of course it ends up being O(revisions) for modifications or
insertions (but that is probably a non-issue for the SCM models we're
looking at).

The git model is obviously very different, but I worry about the slop
space implied. With 200k file revisions and an average of 2k slop per
file, that's 400MB of slop, or almost the size of an equivalent
delta-compressed kernel repo.
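
As a rough check of the slop estimate above, here is a minimal sketch of
the arithmetic. The 4 KB block size is an assumption (the message gives
only the 2k average); with one object per file, the last block of each
file is on average half empty, which is where a 2k figure would come from.

```python
# Back-of-the-envelope slop estimate for one-file-per-object storage.
# BLOCK_SIZE is an assumption; the message states only "2k slop per file".
FILE_REVISIONS = 200_000
BLOCK_SIZE = 4096                      # assumed filesystem block size, bytes
AVG_SLOP_PER_FILE = BLOCK_SIZE // 2    # on average half of the last block is unused

total_slop_bytes = FILE_REVISIONS * AVG_SLOP_PER_FILE
total_slop_mb = total_slop_bytes / 10**6

print(f"{total_slop_mb:.1f} MB of slop")
```

With these inputs the total is on the order of 400 MB, matching the
figure quoted above.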

Now if you can assume that blobs never change and are never deleted,
you can simply append them all onto a log, and then index them with a
separate file containing an htree of (sha1, offset, length) or the
like. Since the key is already a strong hash, this is an excellent
match and avoids rehashing in the kernel's directory lookup. And it'll
save an inode, a directory entry, and about half a data block per
entry. "Open" will also be cheaper as there's no per-revision inode to
grab.
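
The append-only layout described above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the
class name BlobLog and its methods are hypothetical, an in-memory dict
stands in for the on-disk htree, and the key is taken to be the SHA-1 of
the raw blob bytes.

```python
import hashlib
import os
import tempfile


class BlobLog:
    """Hypothetical append-only blob store: one log file plus an index."""

    def __init__(self, path):
        self.path = path
        # sha1 hex digest -> (offset, length). A plain dict stands in for
        # the separate htree index file; a real index would live on disk.
        self.index = {}
        open(path, "ab").close()  # create the log if it does not exist

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        if key in self.index:
            return key  # blobs never change, so a known key is a no-op
        with open(self.path, "ab") as log:
            offset = log.seek(0, os.SEEK_END)  # append position
            log.write(data)
        self.index[key] = (offset, len(data))
        return key

    def get(self, key: str) -> bytes:
        offset, length = self.index[key]
        with open(self.path, "rb") as log:
            log.seek(offset)  # one seek and one read, no per-blob inode
            return log.read(length)


# Usage: two blobs end up back to back in a single file.
with tempfile.TemporaryDirectory() as tmp:
    store = BlobLog(os.path.join(tmp, "blobs.log"))
    first = store.put(b"hello\n")
    second = store.put(b"world\n")
    assert store.get(first) == b"hello\n"
    assert store.get(second) == b"world\n"
```

Because blobs are packed contiguously, the per-blob slop is gone; the
cost is that deletion or modification would require rewriting the log,
which is why the "never change, never deleted" assumption matters.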

I could hack on this if you think it fits with the git model,
otherwise I'll go back to my other experiments..

-- 
Mathematics is the supreme nostalgia of our time.

