From: Eric Niebler <eric@boostpro.com>
To: Avery Pennarun <apenwarr@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: help moving boost.org to git
Date: Mon, 05 Jul 2010 20:16:00 -0400
Message-ID: <4C3275C0.8000406@boostpro.com>
In-Reply-To: <AANLkTimAqL8gvgIisLpWE6xj2p0jEZD5wetdGYJnOpdr@mail.gmail.com>

On 7/5/2010 7:32 PM, Avery Pennarun wrote:
> (note: on this mailing list, you shouldn't drop names from the cc:
> line when replying to a thread)

Noted, thanks.

> On Mon, Jul 5, 2010 at 7:11 PM, Eric Niebler <eric@boostpro.com> wrote:
>> On 7/5/2010 6:04 PM, Finn Arne Gangstad wrote:
>>> This
>>> should fit easily into a single repository. The Linux kernel is much
>>> larger, and that is sort of the canonical single repo git project. I
>>> _strongly_ recommend that you go for a single repo if you can make it
>>> work.
>>
>> It does fit into one repo, but that doesn't meet our needs for the
>> future. Users want to install and build library X and its dependencies,
>> not all of boost. This is increasingly becoming a problem as boost
>> grows. Imagine if a perl programmer had to download all of CPAN to use
>> or hack on any one perl module. Or if contributing to CPAN meant getting
>> the whole shebang, history and all. I'm sure even in the Linux kernel,
>> not *every* third-party driver is maintained in the master git repo.
> 
> Actually, that's mostly not true; there are a few third-party drivers
> that don't make it into the core Linux repo
<snip discussion showing my ignorance of Linux's repository structure>

Thanks for the correction. The CPAN/PyPI analogy is still apt.

>> We are aiming to make boost a clearing-house for C++ libraries (like
>> CPAN, or PyPI for Python), turning the official boost distribution into
>> little more than a well-tested collection of the libraries that have
>> passed our peer-review and regression test process.
> 
> Of course you will want to have some kind of really excellent
> versioned dependency fetching system (exactly like CPAN or PyPI or
> ruby gems) if you want this to be nice.  git's submodules stuff is
> almost certainly not going to add any features you need/want.  On the
> other hand, cloning a separate git repo is pretty easy to write your
> CPAN-like script around.

Indeed, we are stealing the work of the python guys. Pip does most of
what we want. They've graciously been accepting our patches so it
happily clones git repos in order to satisfy dependencies now. It is
some kind of really excellent! :-)

>> In fact, the modularization has already been done, and work is well
>> underway on the infrastructure to support dependency tracking. But the
>> modularization is not history-preserving and needs to be redone.
> 
> If your code doesn't move too many files around, then splitting out
> the history is pretty easy with git-subtree (a tool I wrote that's not
> part of git):
> 
>    git subtree split --prefix=path/to/subdir
> 
> And you get a new history for just that subdir.  That might do exactly
> what you want.  It also works iteratively, so you can export your
> history from svn, then re-export the changes as they occur over time.

It looks like this is it:

  http://github.com/apenwarr/git-subtree

I'll have to read the docs. Thanks for the tip.
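
For reference, the split suggested above can be sketched on a throwaway
repository. This is only an illustration under stated assumptions: the
layout (libs/foo, libs/bar), the file names, and the branch name
foo-only are hypothetical, and it assumes the git-subtree command is
installed alongside git. Note that the prefix is given relative to the
repository root.

```shell
#!/bin/sh
# Sketch: split the history of one subdirectory into its own branch.
# The layout (libs/foo, libs/bar) and all names are hypothetical.
set -e

work=$(mktemp -d)
cd "$work"
git init -q monorepo
cd monorepo
git config user.email "dev@example.com"
git config user.name "Example Dev"

# Three commits: one touching both libraries, one per library.
mkdir -p libs/foo libs/bar
echo "foo v1" > libs/foo/foo.hpp
echo "bar v1" > libs/bar/bar.hpp
git add .
git commit -q -m "Add foo and bar"

echo "bar v2" > libs/bar/bar.hpp
git commit -q -am "Update bar only"

echo "foo v2" > libs/foo/foo.hpp
git commit -q -am "Update foo only"

# Write the history of libs/foo to a new branch.
# The prefix is relative to the repository root.
git subtree split --prefix=libs/foo -b foo-only

# The new branch carries only the commits that changed libs/foo,
# with foo.hpp at the top level of its tree.
git log --oneline foo-only
git ls-tree --name-only foo-only
```

The original branch is left untouched; the split branch could then be
pushed to a separate repository for that one library.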

>>>> So, what are the options? Can I somehow delete from each repository the
>>>> history that is irrelevant? Is there some feature of git I don't know
>>>> about that can solve this problem for us?
>>>
>>> How do you define "irrelevant"? Do you only require enough history for
>>> git annotate/blame to give correct results?  Or does this only refer
>>> to multiple repositories sharing the same ancient history?
>>
>> If multiple repositories share the same ancient history, wouldn't that
>> give git annotate/blame enough information? Sorry, git newbie here.
> 
> Yes, it would.  But how much of the ancient history do you want?  If
> you want all of it, you don't save any space in your repo.

Repos, plural. We'd save space because the history wouldn't be
duplicated in each one. Right? Or else I'm confused and this is something
that will become clear after I understand what git subtree does.

Right now, the other boost developers are pushing for a solution that
uses grafts. I'm fuzzy on what they are exactly, but it seems that we'd
freeze a svn mirror and have anybody interested in history put grafts in
their local repository pointing back at the mirror. I don't know enough
yet to say what the pros/cons of this approach might be wrt git subtree.
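
As a rough illustration of the graft idea, the sketch below joins a
repository that starts from a snapshot to an older history kept
elsewhere. Everything in it is an assumption made for the example: the
repository names (archive, modular), the file, and the commit messages
are hypothetical, and it uses `git replace --graft` from newer Git
versions in place of the older .git/info/grafts file. It is not the
setup the boost developers proposed, only a minimal demonstration of
the mechanism.

```shell
#!/bin/sh
# Sketch: attach a truncated history to an archived one with a graft.
# Repository names, files, and commit messages are hypothetical.
set -e

work=$(mktemp -d)
cd "$work"

# "archive" stands in for the frozen mirror holding the old history.
git init -q archive
cd archive
git config user.email "dev@example.com"
git config user.name "Example Dev"
echo "v1" > lib.hpp
git add lib.hpp
git commit -q -m "Old history: v1"
echo "v2" > lib.hpp
git commit -q -am "Old history: v2"
old_tip=$(git rev-parse HEAD)
cd ..

# "modular" starts from a snapshot and has no link to the old commits.
git init -q modular
cd modular
git config user.email "dev@example.com"
git config user.name "Example Dev"
echo "v2" > lib.hpp
git add lib.hpp
git commit -q -m "Import snapshot"
new_root=$(git rev-parse HEAD)
echo "v3" > lib.hpp
git commit -q -am "New work: v3"

# Before the graft, only the two new commits are reachable.
git rev-list --count HEAD

# Fetch the old objects, then record locally that the parent of the
# new root commit is the tip of the archived history.
git remote add archive ../archive
git fetch -q archive
git replace --graft "$new_root" "$old_tip"

# After the graft, log and blame walk through into the old history.
git rev-list --count HEAD
git log --oneline
```

The graft is a local replacement ref, so each developer who wants the
old history has to fetch the archive and add the graft themselves;
clones made without it see only the short history.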

>> The plan is to move to git. However, we don't expect this to happen
>> overnight, so a way to continue to pull changes from a svn mirror while
>> the new git repositories are being set up would be ideal.
> 
> This isn't too hard to do; you just need some scripts around git-svn
> and git-subtree (or whatever tool you use to do the splitting).  We've
> done this at work for a couple of years now and it's working fine.

Cool.

> The confusing part is taking *submissions* back through both channels.
> If you value your sanity, you probably want to only allow submissions
> back via svn while you're running the two in parallel; but that makes
> git's added features a lot less useful, so you probably want to run in
> parallel for only a short time.

Oh my! I don't think we'd open the git repositories for changes until
after we close down svn. This problem is hard enough.

-- 
Eric Niebler
BoostPro Computing
http://www.boostpro.com
