Re: Why Git is so fast (was: Re: Eric Sink's blog - notes on git, dscms and a "whole product" approach)

From: Linus Torvalds <torvalds@linux-foundation.org>
To: Jeff King <peff@peff.net>
Cc: Jakub Narebski <jnareb@gmail.com>,
	Martin Langhoff <martin.langhoff@gmail.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: Why Git is so fast (was: Re: Eric Sink's blog - notes on git, dscms and a "whole product" approach)
Date: Fri, 1 May 2009 14:37:28 -0700 (PDT)
Message-ID: <alpine.LFD.2.00.0905011420580.5379@localhost.localdomain>
In-Reply-To: <20090501190854.GA13770@coredump.intra.peff.net>

On Fri, 1 May 2009, Jeff King wrote:
> 
> Thanks for the analysis; what you said makes sense to me. However, there
> is at least one case of somebody complaining that git doesn't scale as
> well as perforce for their load:

So we definitely do have scaling issues, there's no question about that. I 
just don't think they are about enterprise network servers vs the more 
workstation-oriented OSS world.

I think they're likely about the whole git mentality of looking at the big 
picture, and then getting swamped by just how _huge_ that picture can be 
if somebody just puts the whole world in a single repository.

With perforce, repository maintenance is such a central issue that the 
whole p4 mentality seems to _encourage_ everybody to put everything into 
basically one single p4 repository. And afaik, p4 basically works mostly 
like CVS, ie it really ends up being pretty much oriented to a "one file 
at a time" model.

Which is nice in that you can have a million files, and then only check 
out a few of them - you'll never even _see_ the impact of the other 
999,995 files.

And git obviously doesn't have that kind of model at all. Git 
fundamentally never really looks at less than the whole repo. Even if you 
limit things a bit (ie check out just a portion, or have the history go 
back just a bit), git ends up still always caring about the whole thing, 
and carrying the knowledge around.

So git scales really badly if you force it to look at everything as one 
_huge_ repository. I don't think that part is really fixable, although we 
can probably improve on it.

And yes, then there's the "big file" issues. I really don't know what to 
do about huge files. We suck at them, I know. There are work-arounds (like 
not deltaing big objects at all), but they aren't necessarily that great 
either.

I bet we could probably improve git large-file behavior for many common 
cases. Do we have a good test-case of some particular suckiness that is 
actually relevant enough that people might decide to look at it? (And by 
"people", I do mean myself too - but I'd need to be somewhat motivated by 
it: a usage case that we suck at and that is available and relevant.)

			Linus
