git.vger.kernel.org archive mirror
From: Avery Pennarun <apenwarr@gmail.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Jens Lehmann <Jens.Lehmann@web.de>,
	Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	"Shawn O. Pearce" <spearce@spearce.org>,
	Paul Mackerras <paulus@samba.org>,
	Heiko Voigt <hvoigt@hvoigt.net>, Lars Hjemli <hjemli@gmail.com>
Subject: Re: submodules' shortcomings, was Re: RFC: display dirty submodule  working directory in git gui and gitk
Date: Mon, 4 Jan 2010 17:53:32 -0500
Message-ID: <32541b131001041453l25409c41y3aadf749b1308f01@mail.gmail.com>
In-Reply-To: <alpine.DEB.1.00.1001042217370.4985@pacific.mpi-cbg.de>

On Mon, Jan 4, 2010 at 5:29 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> On Mon, 4 Jan 2010, Jens Lehmann wrote:
>> IMVHO using the tree sha1 for a submodule seems to be the 'natural' way
>> to include another git repo. And it gives the reproducibility i expect
>> from a scm. Or am i missing something?
>
> You do remember the discussion at the Alles wird Git about the need for
> Subversion external-like behavior, right?

I'm not sure why this is such an issue.  Basically, non-version-locked
submodules are about the easiest thing in the world; that's why CVS
and SVN supported them first.  (SVN later added version-locking like
git has.)

All you need is a .gitignore entry and a trivial script that checks
out the external.  If you want to be fancy, this operation could be
part of git, but it's such a totally different case (and an easy one,
no less) that I think it ought to be treated totally separately.
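
A minimal sketch of what such a script might look like, not a definitive
implementation.  The external's directory is assumed to be listed in the
superproject's .gitignore; the URL, the directory name and the function
names are all hypothetical, chosen only for illustration.

```python
import os
import subprocess

# Hypothetical values, for illustration only.
EXTERNAL_URL = "git://example.com/libfoo.git"
EXTERNAL_DIR = "libfoo"  # assumed to appear in .gitignore as "/libfoo/"


def external_commands(url, path, already_cloned):
    """Return the git commands that bring a non-version-locked external up to date."""
    if already_cloned:
        # Not version-locked: just follow whatever upstream currently has.
        return [["git", "-C", path, "pull", "--ff-only"]]
    return [["git", "clone", url, path]]


def update_external(url=EXTERNAL_URL, path=EXTERNAL_DIR):
    """Clone the external on first use, pull on every later run."""
    already_cloned = os.path.isdir(os.path.join(path, ".git"))
    for cmd in external_commands(url, path, already_cloned):
        subprocess.check_call(cmd)
```

Calling update_external() from the top of the superproject would be the
whole "checkout" step; nothing about the external is recorded in the
superproject's history.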

> - among other use cases, submodules are recommended for sharing content
>  between two different repositories. But it is part of the design that it
>  is _very_ easy to forget to commit, or push the changes in the submodule
>  that are required for the integrity of the superproject.
[...]
> - working directories with GIT_DIRs are a very different beast from single
>  files.  That alone leads to a _lot_ of problems.  The original design of
>  Git had only a couple of states for named content (AKA files): clean,
>  added, removed, modified.  The states that are possible with submodules
>  are for the most part not handled _at all_ by most Git commands (and it
>  is sometimes very hard to decide what would be the best way to handle
>  those states, either).  Just think of a submodule at a different
>  revision than committed in the superproject, with uncommitted changes,
>  ignored and unignored files, a few custom hooks, a bit of additional
>  metadata in the .git/config, and just for fun, a few temporary files in
>  .git/ which are used by the hooks.


I think this is primarily because checked-out submodules currently
have their own .git directories (with their own config, index, etc).
If they were considered *part* of the superproject's repo checkout, and
updated upon switching branches, etc, this whole class of problems
would go away.

> - that use case -- sharing content between different repositories -- is
>  not really supported by submodules, but rather an afterthought.  This is
>  all too obvious when you look at the restriction that the shared content
>  must be in a single subdirectory.

I haven't found the subdir requirement to be much of an issue, at
least on Unix where I can simply work around it using symlinks from
the superproject into the subproject.  It's obviously more gross on
Windows, but I've worked around it there too.  This one isn't a daily
aggravation for me, though maybe it is for others.  And any cure I can
think of sounds rather worse than the disease.

> - submodules would be a perfect way to provide a fast-forward-only media
>  subdirectory that is written to by different people (artists) than to
>  the superproject (developers).  But there is no mechanism to enforce
>  shallow fetches, which means that this use case cannot be handled
>  efficiently using Git.

I doubt you want to "enforce" shallow fetches.  And if you just want
to "allow" shallow fetches, or default to shallow fetches, I'd think
it would be pretty easy to add.  This hasn't been important to me
either.  (It seems to be not too important to git users in general, or
git's support *in general* for shallow repositories would be more
featureful.)

> - while it might be called clever that the submodules' metadata are stored
>  in .gitmodules in the superproject (and are therefore naturally tracked
>  with Git), the synchronization with .git/config is performed exactly
>  once -- when you initialize the submodule.  You are likely to miss out
>  on _every_ change you pulled into the superproject.

This could be fixed too, though I gave up on git-submodule before I
bothered to fix it myself.

The correct solution here is simply to not ever copy the settings from
.gitmodules into .git/config.  Instead, git-submodule should read
.gitmodules as defaults, and then override those defaults with
anything in .git/config.  99% of users will probably not need to ever
put any of their settings in .git/config, and so this problem
disappears.
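
The lookup order described above can be sketched as follows.  This is
only an illustration of the layering, not how git-submodule is
implemented: both files are assumed to have been parsed already into
plain dictionaries, and the function name and sample values are
hypothetical.

```python
def effective_submodule_settings(gitmodules, local_config):
    """Layer the settings: .gitmodules gives defaults, .git/config overrides.

    Both arguments map a submodule name to a dict of its settings.
    """
    merged = {}
    for name in set(gitmodules) | set(local_config):
        settings = dict(gitmodules.get(name, {}))    # tracked defaults
        settings.update(local_config.get(name, {}))  # local overrides win
        merged[name] = settings
    return merged


# Hypothetical example data.
tracked = {"libfoo": {"path": "libfoo", "url": "git://example.com/libfoo.git"}}
local = {"libfoo": {"url": "ssh://me@example.com/libfoo.git"}}
effective = effective_submodule_settings(tracked, local)
```

Because nothing is copied at init time, a later change to .gitmodules
shows up in the merged result immediately unless the user has
deliberately overridden that key locally.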

> All in all, submodules are very clumsy to work with, and you are literally
> forced to provide scripts in the superproject to actually work with the
> submodules.

Agreed; I do this in every project which uses git-submodule.  (And
from doing so, I learned that the value added by git-submodule is
nearly zero.  My script does most of the work, and it could just as
easily check out the submodule as a git repo too.  I could even choose
to version-lock or not version-lock the checked-out submodule: just
hardcode the commitid into my script!)
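
As a rough sketch of the version-locked variant, assuming the pinned
commit id is simply a constant in the script: the URL, the directory,
the commit id and the function name below are placeholders made up for
illustration, not values from any real project.

```python
# Hypothetical values, for illustration only.
EXTERNAL_URL = "git://example.com/libfoo.git"
EXTERNAL_DIR = "libfoo"
PINNED_COMMIT = "0123456789abcdef0123456789abcdef01234567"  # placeholder id


def locked_checkout_commands(url, path, commit, already_cloned):
    """Return the git commands that put the external at exactly `commit`."""
    if already_cloned:
        # Make sure the pinned commit is available locally.
        cmds = [["git", "-C", path, "fetch", "origin"]]
    else:
        cmds = [["git", "clone", "--no-checkout", url, path]]
    # Detached HEAD at the hardcoded commit: this is the "version lock".
    cmds.append(["git", "-C", path, "checkout", "--detach", commit])
    return cmds
```

Bumping the external then means editing PINNED_COMMIT and committing the
script, so the lock is versioned along with the superproject.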

> I do not think that --include-submodules is a good default.  It is just
> too expensive in terms of I/O even to check the status in a superproject
> with a lot of submodules.

I've thought about this a lot, and I think having a special case for
submodules here is the wrong line of thinking.  A big project
*without* submodules has this same problem.  The "real" solution is to
just make status checks faster.

(This is actually possible to do: in the extreme case, you just have a
daemon running with inotify or the Windows equivalent.  TortoiseSVN
reputedly does something like this.  I've thought of writing such a
daemon myself to just twiddle --assume-{un,}changed flags at the right
times, particularly since status checks in Windows are so ridiculously
slow.  But I got frustrated when it was *still* slow even after
setting --assume-unchanged on all the files in the index.  git still
scans directories to detect *unknown* files, and there seems to be no
way to turn it off or, better yet, to provide the list of unknown files
from some other source.)
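
A sketch of the flag-twiddling half of such a daemon, under the
assumption that some file watcher (inotify or similar, not shown) hands
over the set of paths it has seen change.  The function name and its
arguments are hypothetical; only the two git update-index flags are
real.

```python
def assume_flag_commands(tracked_paths, changed_paths):
    """Return git update-index commands matching the watcher's view.

    Paths the watcher saw change get --no-assume-unchanged so git looks at
    them again; every other tracked path gets --assume-unchanged.
    """
    changed = set(changed_paths)
    dirty = sorted(p for p in tracked_paths if p in changed)
    clean = sorted(p for p in tracked_paths if p not in changed)
    cmds = []
    if clean:
        cmds.append(["git", "update-index", "--assume-unchanged", "--"] + clean)
    if dirty:
        cmds.append(["git", "update-index", "--no-assume-unchanged", "--"] + dirty)
    return cmds
```

This only covers files already in the index; it does nothing about the
directory scan for unknown files mentioned above.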

Have fun,

Avery
