git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Evgeny Sazhin <euguess@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Joshua Redstone <joshua.redstone@fb.com>,
	"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Git performance results on a large repository
Date: Fri, 3 Feb 2012 20:25:59 -0500	[thread overview]
Message-ID: <6E708713-3DEF-40A2-9585-690707166BDF@gmail.com> (raw)
In-Reply-To: <CACBZZX4BsFZxB6A-Hg-k37FBavgTV8SDiQTK_sVh9Mb9iskiEw@mail.gmail.com>

 

On Feb 3, 2012, at 9:56 AM, Ævar Arnfjörð Bjarmason wrote:

> On Fri, Feb 3, 2012 at 15:20, Joshua Redstone <joshua.redstone@fb.com> wrote:
> 
>> We (Facebook) have been investigating source control systems to meet our
>> growing needs.  We already use git fairly widely, but have noticed it
>> getting slower as we grow, and we want to make sure we have a good story
>> going forward.  We're debating how to proceed and would like to solicit
>> people's thoughts.
> 
> Where I work we also have a relatively large Git repository. Around
> 30k files, a couple of hundred thousand commits, clone size around
> half a GB.
> 
> You haven't supplied background info on this but it really seems to me
> like your testcase is converting something like a humongous Perforce
> repository directly to Git.
> 
> While you /can/ do this it's not a good idea, you should split up
> repositories at the boundaries code or data doesn't directly cross
> over, e.g. there's no reason why you need HipHop PHP in the same
> repository as Cassandra or the Facebook chat system, is there?
> 
> While Git could better with large repositories (in particular applying
> commits in interactive rebase seems to be to slow down on bigger
> repositories) there's only so much you can do about stat-ing 1.3
> million files.
> 
> A structure that would make more sense would be to split up that giant
> repository into a lot of other repositories, most of them probably
> have no direct dependencies on other components, but even those that
> do can sometimes just use some other repository as a submodule.
> 
> Even if you have the requirement that you'd like to roll out
> *everything* at a certain point in time you can still solve that with
> a super-repository that has all the other ones as submodules, and
> creates a tag for every rollout or something like that.
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



I concur. I'm working in the company with many years of development history with several huge CVS repos and we are slowly but surely migrating the codebase from CVS to Git. 
Split the things up. This will allow you to reorganize things better and there is IMHO no downsides. 
As for rollout - i think this job should be given to build/release system that will have an ability to gather necessary code from different repos and tag it properly.

just my 2 cents

Thanks,
Eugene

  parent reply	other threads:[~2012-02-04  1:26 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-03 14:20 Git performance results on a large repository Joshua Redstone
2012-02-03 14:56 ` Ævar Arnfjörð Bjarmason
2012-02-03 17:00   ` Joshua Redstone
2012-02-03 22:40     ` Sam Vilain
2012-02-03 22:57       ` Sam Vilain
2012-02-07  1:19       ` Nguyen Thai Ngoc Duy
2012-02-03 23:05     ` Matt Graham
2012-02-04  1:25   ` Evgeny Sazhin [this message]
2012-02-03 23:35 ` Chris Lee
2012-02-04  0:01 ` Zeki Mokhtarzada
2012-02-04  5:07 ` Joey Hess
2012-02-04  6:53 ` Nguyen Thai Ngoc Duy
2012-02-04 18:05   ` Joshua Redstone
2012-02-05  3:47     ` Nguyen Thai Ngoc Duy
2012-02-06 15:40       ` Joey Hess
2012-02-07 13:43         ` Nguyen Thai Ngoc Duy
2012-02-09 21:06           ` Joshua Redstone
2012-02-10  7:12             ` Nguyen Thai Ngoc Duy
2012-02-10  9:39               ` Christian Couder
2012-02-10 12:24                 ` Nguyen Thai Ngoc Duy
2012-02-06  7:10     ` David Mohs
2012-02-06 16:23     ` Matt Graham
2012-02-06 20:50       ` Joshua Redstone
2012-02-06 21:07         ` Greg Troxel
2012-02-07  1:28         ` david
2012-02-06 21:17     ` Sam Vilain
2012-02-04 20:05   ` Joshua Redstone
2012-02-05 15:01   ` Tomas Carnecky
2012-02-05 15:17     ` Nguyen Thai Ngoc Duy
2012-02-04  8:57 ` slinky
2012-02-04 21:42 ` Greg Troxel
2012-02-05  4:30 ` david
2012-02-05 11:24   ` David Barr
2012-02-07  8:58 ` Emanuele Zattin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6E708713-3DEF-40A2-9585-690707166BDF@gmail.com \
    --to=euguess@gmail.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=joshua.redstone@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).