From: Eric Niebler <eric@boostpro.com>
To: git@vger.kernel.org
Subject: Re: help moving boost.org to git
Date: Mon, 05 Jul 2010 19:11:10 -0400
Message-ID: <4C32668E.9040000@boostpro.com>
In-Reply-To: <20100705220443.GA23727@pvv.org>

On 7/5/2010 6:04 PM, Finn Arne Gangstad wrote:
> On Mon, Jul 05, 2010 at 10:16:36AM -0400, Eric Niebler wrote:
>> I have a question about the best approach to take for refactoring a
>> large svn project into git. The project, boost.org, is a collection of
>> C++ libraries (>100) that are mostly independent. (There may be
>> cross-library dependencies, but we plan to handle that at a higher
>> level.) After the move to git, we'd like each library to be in its own
>> git repository. Boost can then be a stitching-together of these, using
>> submodules or something (opinions welcome). It's an old project with
>> lots of history that we don't want to lose. The naive approach of simply
>> forking into N repositories for the N libraries and deleting the
>> unwanted files in each is unworkable because we'll end up with all the
>> history duplicated everywhere ... >100 repositories, each larger than 100 MB.
> 
> If the libraries are not independent (i.e. some commits are across
> multiple libraries), submodules will give you some interesting
> challenges to put it mildly.

You have correctly assessed the situation. There *are* cross-library
commits in our history. What are the implications of this for
modularization?

> The current boost 1.43 is 29344 files, is this all there is? 

Yes.

> This
> should fit easily into a single repository. The Linux kernel is much
> larger, and that is sort of the canonical single repo git project. I
> _strongly_ recommend that you go for a single repo if you can make it
> work.

It does fit into one repo, but that doesn't meet our needs for the
future. Users want to install and build library X and its dependencies,
not all of boost. This is increasingly becoming a problem as boost
grows. Imagine if a Perl programmer had to download all of CPAN to use
or hack on any one Perl module. Or if contributing to CPAN meant getting
the whole shebang, history and all. I'm sure even in the Linux kernel,
not *every* third-party driver is maintained in the master git repo.

We are aiming to make boost a clearing-house for C++ libraries (like
CPAN, or PyPI for Python), turning the official boost distribution into
little more than a well-tested collection of the libraries that have
passed our peer-review and regression test process.

In fact, the modularization has already been done, and work is well
underway on the infrastructure to support dependency tracking. But the
modularization is not history-preserving and needs to be redone.

> If you manage to create a single git repo with the history you want,
> it is trivial to split out separate repositories of subdirectories
> later (and those repos will then be comparatively small). git subtree
> allegedly automates this process more or less (I have not used it, but
> have heard good things about it). What about having a single "master
> repository", and then using subtree to create single-library repos for
> the library developers if they want a smaller repo to play around in?

This sounds like it might be ok, but I need to research it.
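
To make the idea of splitting a single repository by subdirectory concrete,
here is a minimal toy model in Python. It is only a sketch, not how git or
git subtree is implemented: the `Commit` class, the `split_history`
function, the library names `foo` and `bar`, and the `libs/<name>/` layout
are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    id: str
    paths: tuple  # files touched by this commit (hypothetical layout)


def split_history(history, prefix):
    """Keep only commits that touch `prefix`; rewrite paths relative to it."""
    result = []
    for commit in history:
        kept = tuple(p[len(prefix):] for p in commit.paths if p.startswith(prefix))
        if kept:  # commits that never touched this library are dropped
            result.append(Commit(commit.id, kept))
    return result


# Hypothetical single-repository history; c2 is a cross-library commit.
history = [
    Commit("c1", ("libs/foo/a.hpp",)),
    Commit("c2", ("libs/foo/a.hpp", "libs/bar/b.hpp")),
    Commit("c3", ("libs/bar/b.hpp",)),
]

foo_history = split_history(history, "libs/foo/")
bar_history = split_history(history, "libs/bar/")
```

In this model each per-library history contains only the commits that
touched that library, which is why the split repositories would be
comparatively small. The cross-library commit c2 shows up in both results,
but as two separate partial commits, so the fact that the two changes were
made together is no longer recorded in either per-library history.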

>> So, what are the options? Can I somehow delete from each repository the
>> history that is irrelevant? Is there some feature of git I don't know
>> about that can solve this problem for us?
> 
> How do you define "irrelevant"? Do you only require enough history for
> git annotate/blame to give correct results?  Or does this only refer
> to multiple repositories sharing the same ancient history?

If multiple repositories share the same ancient history, wouldn't that
give git annotate/blame enough information? Sorry, git newbie here.

>> At boost, we've already discussed a few possible approaches. Feel free
>> to comment and/or criticize any of the solutions suggested here:
>>
>>   http://github.com/ryppl/ryppl/issues#issue/4
> 
> It is unclear from the discussion if you will change to git, or use
> git in addition to svn? This will have some impact on how to go about
> this.

The plan is to move to git. However, we don't expect this to happen
overnight, so a way to continue to pull changes from a svn mirror while
the new git repositories are being set up would be ideal.

-- 
Eric Niebler
BoostPro Computing
http://www.boostpro.com
