From: "Jānis Rukšāns" <janis.ruksans@gmail.com>
To: git@vger.kernel.org
Subject: Submodule, subtree, or something else?
Date: Sat, 22 Aug 2015 01:47:42 +0300
Message-ID: <1440197262.23145.191.camel@gmail.com>

Hello,


First of all, I apologise for the wall of text that follows; obviously I
am bad at this.

My $DAYJOB is switching from Subversion to Git, primarily because of
its distributed nature (we are scattered all across the globe), and the
ease of branching and merging.  One issue that has popped up is how to
manage code shared between multiple projects.

Our SVN setup used a shared repository for all projects, either using
externals for shared code, or, more often than not, simply merging the
code between projects as needed.  Ignoring the fact that merging with
SVN is somewhat cumbersome, overall it has worked quite well for us,
especially when combined with git-svn.

For external libraries that rarely change, submodules appear to be the
obvious choice when using Git.  On the other hand, I've found them
somewhat cumbersome to use, and subtree merging (either using git
subtree, or directly with git merge -s subtree) is closer to what we
were doing in SVN.  A major drawback of submodules in my opinion is the
inability to make a full clone from an existing one without having
access to the central repository, which is something I have to do from
time to time.
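
To make that drawback concrete, here is a minimal sketch using throwaway
repositories.  The names central-lib, project and copy, and the lib/ path,
are made up for illustration.  It only shows that a clone made from an
existing clone still records the central URL for the submodule and does
not get the submodule contents by itself; it is not meant as a full
description of how submodule cloning behaves.

```shell
#!/bin/sh
# Sketch only: repository names (central-lib, project, copy) and the
# lib/ path are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Identity used for the throwaway commits only
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the library repository on the central server
git init -q central-lib
(cd central-lib && echo lib > util.c && git add util.c &&
 git commit -q -m "library")

# Project that uses the library as a submodule
git init -q project
cd project
echo app > app.c && git add app.c && git commit -q -m "project"
# Recent Git versions refuse local-path submodule clones unless the
# file transport is explicitly allowed; older versions ignore the setting.
git -c protocol.file.allow=always submodule --quiet add "$work/central-lib" lib
git commit -q -m "add lib submodule"
cd ..

# Clone the project from the existing clone rather than from the server
git clone -q project copy

# The copy still points at the central location for the submodule,
# and lib/ in the copy is an empty directory until it is fetched.
recorded_url=$(git -C copy config -f .gitmodules submodule.lib.url)
echo "submodule URL recorded in the copy: $recorded_url"
```

One possible workaround, if I read the documentation correctly, is to run
git submodule init in the copy and then override submodule.lib.url in its
local configuration before git submodule update, but that has to be
repeated for every submodule and every clone.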

For internal libraries, the situation is even less clear.  For many of
these libraries, most of the development happens within the context of a
single project, with commits to the main project being interleaved with
commits to the subproject(s), resulting in histories resembling:

 (using git submodule)

   A---B---S1---S2---C---S3
          ,´   ,´       ,´
     N---O----P----Q---R

 (using git subtree with --rejoin)

   A---B---N---O---M1---M2---Q---C---R---M3
                  /    /                /
             N'--O'---P--------Q'------R'

 (using merge -s subtree)

   A---B---M1---M2---C---M3
          /    /        /
     N---O----P----Q---R

where A, B and C are changes to the main project, N, O, P, Q and R are
changes to library code, and Sn and Mn are submodule updates and merge
commits, respectively.
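
For reference, the third history (merge -s subtree) can be reproduced
roughly as follows.  This is a sketch in throwaway repositories, not our
actual setup: the repository names lib and main, the remote name
libremote, the lib/ prefix and the file names are all made up.  The
--allow-unrelated-histories option is an assumption about the Git version
in use; newer versions require it for the initial import, while older
versions do not know it.

```shell
#!/bin/sh
# Sketch only: all repository, remote and file names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Identity used for the throwaway commits only
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Library repository: commits N and O
git init -q lib
(cd lib && echo n > util.c && git add util.c && git commit -q -m N &&
 echo o >> util.c && git commit -q -am O)

# Main project: commits A and B
git init -q main
cd main
echo a > app.c && git add app.c && git commit -q -m A
echo b >> app.c && git commit -q -am B

# M1: record a merge with the library history and place its tree under lib/
git remote add libremote ../lib
git fetch -q libremote
libbranch=$(git -C ../lib symbolic-ref --short HEAD)
git merge -q -s ours --no-commit --allow-unrelated-histories "libremote/$libbranch"
git read-tree --prefix=lib/ -u "libremote/$libbranch"
git commit -q -m M1

# Library commit P, then M2: later updates use the subtree strategy,
# which works out by itself that the library tree lives under lib/
(cd ../lib && echo p >> util.c && git commit -q -am P)
git fetch -q libremote
git merge -q -s subtree -m M2 "libremote/$libbranch"

# Show the resulting shape: main-project commits plus merges of N, O, P
git log --graph --oneline
```

Note that every library change here is committed in the library
repository first and only then merged into the main project.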

From what I have gathered, submodules have issues with branching and
merging; therefore, unless I'm mistaken, submodules are pretty much out
of the question.  Of the remaining two options, merging directly results in a
nicer history, but requires making all changes to the library repo first
(although I am quite sure that a similar effect can be achieved with
plumbing, similarly to how git subtree split works), and is harder to
use than git subtree.  Also, all three options can result in the main
project history being cluttered with extra commits.

Lastly, there is a particularly painful 3rd party library that has an
enormous amount of local modifications that are never going to make it
upstream, essentially making it a fork; project-specific changes that
are required for one project but would break others; separate language
bindings that access the internals (often requiring bug fixes to be made
simultaneously to both); and, if that wasn't enough, it *requires*
several source files to be modified for each individual project that
uses it.  It's a complete mess, but we're stuck with it for the existing
projects, as switching to an alternative would be too time-consuming.


To sum up, I'm looking for something that would let us share code
between multiple projects and allow for:

1) separate histories with relatively easy branching and merging

2) distributed workflow without having to set up multiple repositories
everywhere (e.g. work <-> home <-> laptop)

3) working on the shared code within a project using it

4) inspection of the complete history

5) modifications that are not shared with other projects

and would not result in lots of clutter in the history.

Repository size is somewhat less of an issue, because each submodule has
to be checked out anyway.

Submodules let you have #3, and #1, #2 and #5 up to a point, after which
they become a pain.  git subtree allows #1, #2, #3 and #4, and #5 with some
pain (?), but results in duplicate commits.  Using the subtree merge
strategy directly gives everything except #3, but is harder to use than
submodules or subtree.

Are there any other options besides these three for sharing (or in some
cases, not sharing) common code between projects using Git, that would
address the above points better?  Or, alternatively, ways to work around
the drawbacks of the existing tools?

Lastly, I will be grateful for any suggestions about how to handle the
messy case described above better.

Thanks,
Jānis
