git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Brandon Williams <bmwill@google.com>
To: Elijah Newren <newren@gmail.com>
Cc: Duy Nguyen <pclouds@gmail.com>, Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH/RFC/BUG] unpack-trees.c: do not use "the_index"
Date: Mon, 4 Jun 2018 10:33:38 -0700	[thread overview]
Message-ID: <20180604173338.GA91449@google.com> (raw)
In-Reply-To: <CABPp-BEMUUrYd_=a1sPYV941rMCdWh_x-4j5uUiCEUFA0Fwoww@mail.gmail.com>

On 06/02, Elijah Newren wrote:
> On Fri, Jun 1, 2018 at 10:03 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> > On Fri, Jun 1, 2018 at 8:34 PM, Elijah Newren <newren@gmail.com> wrote:
> >> On Fri, Jun 1, 2018 at 9:11 AM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> >>> This is more of a bug report than an actual fix because I'm not sure
> >>> if "o->src_index" is always the correct answer instead of "the_index"
> >>> here. But this is very similar to 7db118303a (unpack_trees: fix
> >>> breakage when o->src_index != o->dst_index - 2018-04-23) and could
> >>> potentially break things again...
> 
> I'm pretty sure your patch is correct.  Adding Brandon Williams to the
> cc for comment since his patches came up in the analysis below...
> 
> >> Actually, I don't think the patch will break anything in the current
> >> code.  Currently, all callers of unpack_trees() (even merge recursive
> >> which uses different src_index and dst_index now) set src_index =
> >> &the_index.  So, these changes should continue to work as before (with
> >> a minor possible exception of merge-recursive calling into other
> >> functions from unpack-trees.c after unpack_trees() has finished..).
> >> That's not to say that your patch is bug free, just that I think any
> >> bugs shouldn't be triggerable from the current codebase.
> >
> > Ah.. I thought merge-recursive would do fancier things and used some
> > temporary index. Good to know.
> 
> Well, it does does use a temporary index, but for dst_index rather
> than src_index.  It then does some fancier things, but not until the
> call to unpack_trees() is over.  In particular, at that point, it
> swaps the_index and tmp_index, reversing their roles so that now
> tmp_index is the original index and the_index becomes the result after
> unpack_trees() is run.  That's done because I later want to use the
> original index for calling verify_uptodate().  verify_uptodate() is
> then used for was_tracked_and_matches() and was_tracked().
> 
> Anyway, the whole upshot of this is:
>   * merge-recursive uses src_index == &the_index for the unpack_trees() call.
>   * merge-recursive uses src_index == o->orig_index for subsequent
> calls to verify_uptodate(), but verify_uptodate() doesn't call into
> any of the sites you have modified.
> 
> Further:
>   * Every other existing caller of unpack-trees in the code sets
> src_index == &the_index, so this won't break any of them.
>   * There are only two callers in the codebase that set dst_index to
> something other than &the_index -- diff-lib.c (which sets it to NULL)
> and merge-recursive (which does the stuff described above).
> 
> So, having done that analysis, I am now pretty convinced your patch
> won't break anything.  That's one half...
> 
> >> Also, if any of the changes you made are wrong, what was there before
> >> was also clearly wrong.  So I think we're at least no worse off.
> >>
> >> But, I agree, it's not easy to verify what the correct thing should be
> >> in all cases.  I'll try to take a closer look in the next couple days.
> >
> > Thanks. I will also stare at this code some more in the next couple
> > days trying to remember what these functions do.
> 
> Your patch has two divisible parts:
> 
> 1) Your modifications to
>   * clear_ce_flags_1()
>   * clear_ce_flags_dir()
>   * clear_ce_flags()
>   * mark_new_skip_worktree()
> The clear_ce_flags*() functions are only called by each other and by
> mark_new_skip_worktree(), which in turn is only called from
> unpack_trees().  Also, in all of these, your change ends up only
> modifying what index_state is passed to is_excluded_from_list().
> 
> 2) Your modifications to
>   * verify_clean_subdirectory()
>   * check_ok_to_remove()
> In this case, the former is only called by the latter, and the latter
> ends up only being called (via verify_absent_1()) from verify_absent()
> and verify_absent_sparse().
> 
> I'll address each, in reverse order.
> 
> 2) The stuff that affects verify_absent()
> 
> While verify_absent() is not called from merge-recursive right now, it
> was something I wanted to use in the future for very similar reasons
> that verify_uptodate() started being called directly from
> merge-recursive.  In particular, if the rewrite of merge-recursive[A]
> I want to do sets index_only when calling unpack_trees(), then does
> the whole merge without touching the worktree, then at the end goes to
> update the working tree, it will need to do extra checks.
> verify_absent() will come in handy as one of those extra checks.  For
> that case, using the_index (the new index just created with lots of
> changes) would be wrong in all the same ways that using the_index
> caused massive problems for was_tracked() in merge-recursive (e.g. the
> blow up of when Junio merged the original directory rename detection
> series into master and subsequently reverted it); we'd instead want
> src_index (which was the index that existed when merge was called)
> instead.  So, with this patch you've fixed some important bugs that I
> would have hit later.
> 
> [A] sidenote: see
> https://public-inbox.org/git/xmqqk1ydkbx0.fsf@gitster.mtv.corp.google.com/
> for more details
> 
> 1) mark_new_skip_worktree() ... is_excluded_from_list().
> 
> Sadly, I'm not very familiar with the skip_worktree and sparse
> checkout stuff.  However, the fact that mark_new_skip_worktree()
> explicitly takes an index_state (and a different one is passed to it
> the two different times it is called), and that it is the only caller
> of the clear_ce_flags*() family of functions, and that those function
> use the cache entries from the index passed to them in all cases other
> than the calls to is_excluded_from_list() makes those two look like
> oversights.  In fact, a little more digging turns up commit
>    fba92be8f7 ("dir: convert is_excluded_from_list to take an index",
> 2017-05-05)
> and before then, those functions didn't use the_index directly.  But
> they did use it indirectly, because they called a function in dir.c
> that had it hardcoded.  So it looks like Brandon fixed part of the bug
> for us, but moved the incorrect hardcoding from dir.c to
> unpack-trees.c.  Your patch is just fixing it up.  In fact, a little
> more digging turns up:
> 
> 2c1eb10454 ("dir: convert read_directory to take an index", 2017-05-05)
> a0bba65b10 ("dir: convert is_excluded to take an index", 2017-05-05)
> 
> which appear to be the culprits behind the other two uses of the_index
> called from verify_absent().  It looks like before these commits that
> unpack_trees() was carefully using the appropriate index, except that
> functions in dir.c had use of the_index hardcoded.  Brandon fix the
> functions in dir.c for us, but ended up still hardcoding the_index in
> unpack-trees.c.  Basically, he did most of the necessary lifting, and
> your patch just finally changes them over to use the correct index.
> 
> Brandon: Does anything look off in my analysis above?

Nope your analysis is correct with regards to the changes I made to dir.
Back when I was converting ls-files and grep to recursively work on
submodules I needed to make it so that some of the dir functions
explicitly accepted an index parameter instead of implicitly relying on
the_index.  So I bubbled up the hardcoded the_index to the places I
wasn't passing in a specific index (which would be when the_index
appeared in unpack-trees.c) and allowed for arbitrary index's to be
passed in like in ls-files.

This is one of the larger issues with working in our codebase,
implicitly relying on global state.  It makes it very difficult to
reason if a section of code is correct or not, as is in this case.  Is
it a bug that we use o->src_index in some places and the_index in
others?  Well maybe not, but that's because they usually are the same
thing in the few cases that it matters (at least now its a bit more
explicit to see this, whereas before it was very implicit).  Of course I
know stefan has done work with making unpack-trees recursive so maybe
it can be a bug in the recursive case.

-- 
Brandon Williams

  reply	other threads:[~2018-06-04 17:33 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-01 16:11 [PATCH/RFC/BUG] unpack-trees.c: do not use "the_index" Nguyễn Thái Ngọc Duy
2018-06-01 18:34 ` Elijah Newren
2018-06-01 18:51   ` Stefan Beller
2018-06-02  5:01     ` Duy Nguyen
2018-06-02  5:03   ` Duy Nguyen
2018-06-03  4:58     ` Elijah Newren
2018-06-04 17:33       ` Brandon Williams [this message]
2018-06-05 15:43 ` [PATCH 0/6] Fix "the_index" usage in unpack-trees.c Nguyễn Thái Ngọc Duy
2018-06-05 15:43   ` [PATCH 1/6] unpack-trees: remove 'extern' on function declaration Nguyễn Thái Ngọc Duy
2018-06-05 15:43   ` [PATCH 2/6] unpack-trees: add a note about path invalidation Nguyễn Thái Ngọc Duy
2018-06-05 15:43   ` [PATCH 3/6] unpack-trees: don't shadow global var the_index Nguyễn Thái Ngọc Duy
2018-06-05 17:11     ` Ramsay Jones
2018-06-05 15:43   ` [PATCH 4/6] unpack-tress: convert clear_ce_flags* to avoid the_index Nguyễn Thái Ngọc Duy
2018-06-05 17:37     ` Ramsay Jones
2018-06-05 15:43   ` [PATCH 5/6] unpack-trees: avoid the_index in verify_absent() Nguyễn Thái Ngọc Duy
2018-06-05 15:43   ` [PATCH 6/6] Forbid "the_index" in dir.c and unpack-trees.c Nguyễn Thái Ngọc Duy
2018-06-05 16:58     ` Brandon Williams
2018-06-06  4:57       ` Duy Nguyen
2018-06-06  5:02   ` [PATCH v2 0/5] Fix "the_index" usage in unpack-trees.c Nguyễn Thái Ngọc Duy
2018-06-06  5:02     ` [PATCH v2 1/5] unpack-trees: remove 'extern' on function declaration Nguyễn Thái Ngọc Duy
2018-06-06  5:02     ` [PATCH v2 2/5] unpack-trees: add a note about path invalidation Nguyễn Thái Ngọc Duy
2018-06-06  5:02     ` [PATCH v2 3/5] unpack-trees: don't shadow global var the_index Nguyễn Thái Ngọc Duy
2018-06-06  5:02     ` [PATCH v2 4/5] unpack-tress: convert clear_ce_flags* to avoid the_index Nguyễn Thái Ngọc Duy
2018-06-06  5:02     ` [PATCH v2 5/5] unpack-trees: avoid the_index in verify_absent() Nguyễn Thái Ngọc Duy
2018-06-06  7:39     ` [PATCH v3 00/20] Fix incorrect use of the_index Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 01/20] unpack-trees: remove 'extern' on function declaration Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 02/20] unpack-trees: add a note about path invalidation Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 03/20] unpack-trees: don't shadow global var the_index Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 04/20] unpack-tress: convert clear_ce_flags* to avoid the_index Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 05/20] unpack-trees: avoid the_index in verify_absent() Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 06/20] attr.h: drop extern from function declaration Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 07/20] attr: remove an implicit dependency on the_index Nguyễn Thái Ngọc Duy
2018-06-06 13:35         ` Ramsay Jones
2018-06-06 16:50         ` Brandon Williams
2018-06-06 16:58           ` Duy Nguyen
2018-06-06 17:02             ` Brandon Williams
2018-06-06  7:39       ` [PATCH v3 08/20] convert.h: drop 'extern' from function declaration Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 09/20] convert.c: remove an implicit dependency on the_index Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 10/20] dir.c: remove an implicit dependency on the_index in pathspec code Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 11/20] ls-files: correct index argument to get_convert_attr_ascii() Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 12/20] pathspec.c: use the right index instead of the_index Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 13/20] submodule.c: " Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 14/20] entry.c: " Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 15/20] attr: remove index from git_attr_set_direction() Nguyễn Thái Ngọc Duy
2018-06-06 16:58         ` Brandon Williams
2018-06-06  7:39       ` [PATCH v3 16/20] preload-index.c: use the right index instead of the_index Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 17/20] cache.c: remove an implicit dependency on the_index Nguyễn Thái Ngọc Duy
2018-06-06 17:00         ` Brandon Williams
2018-06-06  7:39       ` [PATCH v3 18/20] resolve-undo.c: use the right index instead of the_index Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 19/20] grep: " Nguyễn Thái Ngọc Duy
2018-06-06  7:39       ` [PATCH v3 20/20] cache.h: make the_index part of "compatibility macros" Nguyễn Thái Ngọc Duy
2018-06-06 16:49       ` [PATCH v4 00/23] Fix incorrect use of the_index Nguyễn Thái Ngọc Duy
2018-06-06 16:49         ` [PATCH v4 01/23] unpack-trees: remove 'extern' on function declaration Nguyễn Thái Ngọc Duy
2018-06-06 16:49         ` [PATCH v4 02/23] unpack-trees: add a note about path invalidation Nguyễn Thái Ngọc Duy
2018-06-06 16:49         ` [PATCH v4 03/23] unpack-trees: don't shadow global var the_index Nguyễn Thái Ngọc Duy
2018-06-06 16:49         ` [PATCH v4 04/23] unpack-tress: convert clear_ce_flags* to avoid the_index Nguyễn Thái Ngọc Duy
2018-06-07  7:41           ` Elijah Newren
2018-06-07 15:11             ` Duy Nguyen
2018-06-08 15:58               ` Duy Nguyen
2018-06-06 16:49         ` [PATCH v4 05/23] unpack-trees: avoid the_index in verify_absent() Nguyễn Thái Ngọc Duy
2018-06-06 17:02         ` [PATCH v4 06/23] attr.h: drop extern from function declaration Nguyễn Thái Ngọc Duy
2018-06-06 17:02           ` [PATCH v4 07/23] attr: remove an implicit dependency on the_index Nguyễn Thái Ngọc Duy
2018-06-06 17:02           ` [PATCH v4 08/23] convert.h: drop 'extern' from function declaration Nguyễn Thái Ngọc Duy
2018-06-06 17:02           ` [PATCH v4 09/23] convert.c: remove an implicit dependency on the_index Nguyễn Thái Ngọc Duy
2018-06-06 17:02           ` [PATCH v4 10/23] dir.c: remove an implicit dependency on the_index in pathspec code Nguyễn Thái Ngọc Duy
2018-06-06 17:02           ` [PATCH v4 11/23] ls-files: correct index argument to get_convert_attr_ascii() Nguyễn Thái Ngọc Duy
2018-06-06 17:02           ` [PATCH v4 12/23] pathspec.c: use the right index instead of the_index Nguyễn Thái Ngọc Duy
2018-06-06 17:02           ` [PATCH v4 13/23] submodule.c: " Nguyễn Thái Ngọc Duy
2018-06-06 17:02           ` [PATCH v4 14/23] entry.c: " Nguyễn Thái Ngọc Duy
2018-06-06 17:02           ` [PATCH v4 15/23] attr: remove index from git_attr_set_direction() Nguyễn Thái Ngọc Duy
2018-06-09 17:57             ` Elijah Newren
2018-06-06 17:02           ` [PATCH v4 16/23] preload-index.c: use the right index instead of the_index Nguyễn Thái Ngọc Duy
2018-06-06 17:02           ` [PATCH v4 17/23] read-cache.c: remove an implicit dependency on the_index Nguyễn Thái Ngọc Duy
2018-06-09 18:10             ` Elijah Newren
2018-06-09 18:45               ` Duy Nguyen
2018-06-09 19:48                 ` Elijah Newren
2018-06-06 17:02           ` [PATCH v4 18/23] apply.c: use the right index instead of the_index Nguyễn Thái Ngọc Duy
2018-06-06 17:02           ` [PATCH v4 19/23] difftool: " Nguyễn Thái Ngọc Duy
2018-06-06 17:02           ` [PATCH v4 20/23] checkout: avoid the_index when possible Nguyễn Thái Ngọc Duy
2018-06-06 17:02           ` [PATCH v4 21/23] resolve-undo.c: use the right index instead of the_index Nguyễn Thái Ngọc Duy
2018-06-06 17:02           ` [PATCH v4 22/23] grep: " Nguyễn Thái Ngọc Duy
2018-06-06 17:02           ` [PATCH v4 23/23] cache.h: make the_index part of "compatibility macros" Nguyễn Thái Ngọc Duy
2018-06-07  7:44         ` [PATCH v4 00/23] Fix incorrect use of the_index Elijah Newren
2018-06-09 19:58           ` Elijah Newren
2018-06-11 16:05             ` Duy Nguyen
2018-06-11 16:11               ` Elijah Newren
2018-06-11 16:24                 ` Duy Nguyen
2018-06-11 16:44                   ` Elijah Newren
2018-06-11 16:49                     ` Duy Nguyen
2018-06-14 17:18                       ` Duy Nguyen
2018-06-14 20:57                         ` Elijah Newren
2018-06-11 18:20               ` Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180604173338.GA91449@google.com \
    --to=bmwill@google.com \
    --cc=git@vger.kernel.org \
    --cc=newren@gmail.com \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).