All of lore.kernel.org
 help / color / mirror / Atom feed
From: Emily Shaffer <emilyshaffer@google.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: Re: [PATCH v5 4/4] submodule: record superproject gitdir during 'update'
Date: Tue, 9 Nov 2021 13:46:29 -0800	[thread overview]
Message-ID: <YYrsNagtHssb7jBF@google.com> (raw)
In-Reply-To: <YYrb4oCaIXIwN/bF@google.com>

On Tue, Nov 09, 2021 at 12:36:50PM -0800, Emily Shaffer wrote:
> 
> On Tue, Nov 09, 2021 at 01:42:03AM +0100, Ævar Arnfjörð Bjarmason wrote:
> > 
> > 
> > On Mon, Nov 08 2021, Emily Shaffer wrote:
> > 
> > > On Fri, Nov 05, 2021 at 09:43:56AM +0100, Ævar Arnfjörð Bjarmason wrote:
> > >> 
> > >> 
> > >> On Thu, Nov 04 2021, Junio C Hamano wrote:
> > >> 
> > >> > Emily Shaffer <emilyshaffer@google.com> writes:
> > >> >
> > >> >> A recorded hint path to the superproject's gitdir might be added during
> > >> >> 'git submodule add', but in some cases - like submodules which were
> > >> >> created before 'git submodule add' learned to record that info - it might
> > >> >> be useful to update the hint. Let's do it during 'git submodule
> > >> >> update', when we already have a handle to the superproject while calling
> > >> >> operations on the submodules.
> > >> >
> > >> > We are hearing repeated mention of "cache" and "hint".  Do we ever
> > >> > invalidate it, or if we have such a record, do we blindly trust it
> > >> > and use it without verifying if it is still fresh?
> > >> >
> > >> > Also, this step and the previous step both say we record gitdir on
> > >> > their title, but we instead record common dir.  Whichever is the
> > >> > right choice to record, let's be consistent.
> > >> 
> > >> I had similar (AFAICT still unaddressed) feedback on the v2[1]. I'd lost
> > >> track of this series, and see one reason is that the In-Reply-Chain was
> > >> broken between v3..v4.
> > >> 
> > >> I.e. it seems to me that this whole thing started as a way to avoid
> > >> shellscript overhead by calling git-rev-parse from git-submodule.sh, but
> > >> now that the relevant bits are moved to C we could just call some
> > >> slightly adjusted code in setup.c.
> > >
> > > No, that is not the case. It is the case that `git -C .. rev-parse
> > > --git-dir` is *very* expensive in the case where `../` is not, in fact,
> > > a gitdir; when I attempted another series which relied on finding the
> > > parent superproject's gitdir in this way, our testsuite took something
> > > like 5x longer to run than before. In other words, the expensive part is
> > > not the shelling out overhead - the expensive part is searching up the
> > > entire filesystem directory structure in the worst-case ("we are not a
> > > submodule") scenario. This is still needed, even with 'git-submodule.sh'
> > > moving to C.
> > 
> > Do you have that test code somewhere?
> 
> I messed around with it a little more, rebasing the no-caches-involved
> older implementation and using an in-process lookup with
> setup_git_directory_gently_1.
> https://github.com/nasamuffin/git/tree/config-inheritance-no-cache
> 
> The recent experiments are in the tip commit, and the original series is
> in the two commits prior if you're interested.
> 
> The upshot, though, is that I think there is still not a way around a
> second subprocess. Before, we determined the superproject's gitdir like
> so:
> 
>   # Does a git project at .. think I belong to it?
>   git -C .. ls-files <args> -- path/to/submodule
>   # Where does that git project's gitdir live?
>   git -C .. rev-parse --absolute-git-dir
> 
> Even if we can do the second call in-process, we still will be
> performing this ls-files call to ensure that the parent repo is actually
> our superproject. (One good example of a time when the parent repo is
> *not*: the entire Git test suite, where '/path/to/git/t/trash directory.t1234-abcd'
> is not a submodule of '/path/to/git/.git'.)
> 
> We could reverse the checks, which will make this much less painful in
> the real world, but will still slow down our test suites (and hopefully
> you'll forgive me for combining C and bash so brazenly, but it's for
> illustration purposes only):
> 
>   # Is there a git project above us?
>   setup_git_directory_gently_1("..", out, 0);
>   # Does it think we're its submodule?
>   git -C $out rev-parse --absolute-git-dir
> 
> That will still result in an extra out-of-process call for every line in
> the Git test suite, though, because of the trash directory layout.
> 
> I looked briefly at `git ls-files` // `cmd_ls_files()` and it's fairly
> close to being callable on an arbitrary 'struct repository', but not
> quite there. But I am pretty afraid of the rabbit hole ;)

Jonathan Nieder and Glen Choo pointed out that I can read in the index
to an arbitrary struct index_state from a path, and then call
'index_dir_exists()', so this part of it is not as scary as I thought.

I'll mess around today and see if I can come up with an in-process
version of 'get_superproject_working_tree()' and
'get_superproject_gitdir()'.

Thanks for the lead - a fine example of how useful it is to receive
high-level input from someone removed from the problem ;)

 - Emily

  reply	other threads:[~2021-11-09 21:46 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-14 20:34 [PATCH v4 0/4] cache parent project's gitdir in submodules Emily Shaffer
2021-10-14 20:34 ` [PATCH v4 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-10-14 20:34 ` [PATCH v4 2/4] introduce submodule.superprojectGitDir record Emily Shaffer
2021-10-14 20:34 ` [PATCH v4 3/4] submodule: record superproject gitdir during absorbgitdirs Emily Shaffer
2021-10-18 23:18   ` Jonathan Tan
2021-10-25 16:14     ` Derrick Stolee
2021-11-04 23:22       ` Emily Shaffer
2021-11-08  1:07         ` Derrick Stolee
2021-11-04 22:09     ` Emily Shaffer
2021-10-14 20:34 ` [PATCH v4 4/4] submodule: record superproject gitdir during 'update' Emily Shaffer
2021-10-25 16:17   ` Derrick Stolee
2021-10-25 16:19 ` [PATCH v4 0/4] cache parent project's gitdir in submodules Derrick Stolee
2021-11-04 23:49 ` [PATCH v5 " Emily Shaffer
2021-11-04 23:49   ` [PATCH v5 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-11-04 23:49   ` [PATCH v5 2/4] introduce submodule.superprojectGitDir record Emily Shaffer
2021-11-05  7:50     ` Junio C Hamano
2021-11-08 23:16       ` Emily Shaffer
2021-11-04 23:49   ` [PATCH v5 3/4] submodule: record superproject gitdir during absorbgitdirs Emily Shaffer
2021-11-04 23:49   ` [PATCH v5 4/4] submodule: record superproject gitdir during 'update' Emily Shaffer
2021-11-05  4:49     ` Junio C Hamano
2021-11-05  8:43       ` Ævar Arnfjörð Bjarmason
2021-11-08 23:21         ` Emily Shaffer
2021-11-09  0:42           ` Ævar Arnfjörð Bjarmason
2021-11-09 20:36             ` Emily Shaffer
2021-11-09 21:46               ` Emily Shaffer [this message]
2021-11-05  8:51     ` Ævar Arnfjörð Bjarmason
2021-11-08 23:22       ` Emily Shaffer
2021-11-09  1:12         ` Ævar Arnfjörð Bjarmason
2021-11-08  1:24   ` [PATCH v5 0/4] cache parent project's gitdir in submodules Derrick Stolee
2021-11-08 23:11     ` Emily Shaffer
2021-11-09 15:58       ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YYrsNagtHssb7jBF@google.com \
    --to=emilyshaffer@google.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.