From: Brandon Williams <bmwill@google.com>
To: Elijah Newren <newren@gmail.com>
Cc: Duy Nguyen <pclouds@gmail.com>, Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH/RFC/BUG] unpack-trees.c: do not use "the_index"
Date: Mon, 4 Jun 2018 10:33:38 -0700 [thread overview]
Message-ID: <20180604173338.GA91449@google.com> (raw)
In-Reply-To: <CABPp-BEMUUrYd_=a1sPYV941rMCdWh_x-4j5uUiCEUFA0Fwoww@mail.gmail.com>
On 06/02, Elijah Newren wrote:
> On Fri, Jun 1, 2018 at 10:03 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> > On Fri, Jun 1, 2018 at 8:34 PM, Elijah Newren <newren@gmail.com> wrote:
> >> On Fri, Jun 1, 2018 at 9:11 AM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> >>> This is more of a bug report than an actual fix because I'm not sure
> >>> if "o->src_index" is always the correct answer instead of "the_index"
> >>> here. But this is very similar to 7db118303a (unpack_trees: fix
> >>> breakage when o->src_index != o->dst_index - 2018-04-23) and could
> >>> potentially break things again...
>
> I'm pretty sure your patch is correct. Adding Brandon Williams to the
> cc for comment since his patches came up in the analysis below...
>
> >> Actually, I don't think the patch will break anything in the current
> >> code. Currently, all callers of unpack_trees() (even merge recursive
> >> which uses different src_index and dst_index now) set src_index =
> >> &the_index. So, these changes should continue to work as before (with
> >> a minor possible exception of merge-recursive calling into other
> >> functions from unpack-trees.c after unpack_trees() has finished..).
> >> That's not to say that your patch is bug free, just that I think any
> >> bugs shouldn't be triggerable from the current codebase.
> >
> > Ah.. I thought merge-recursive would do fancier things and used some
> > temporary index. Good to know.
>
> Well, it does does use a temporary index, but for dst_index rather
> than src_index. It then does some fancier things, but not until the
> call to unpack_trees() is over. In particular, at that point, it
> swaps the_index and tmp_index, reversing their roles so that now
> tmp_index is the original index and the_index becomes the result after
> unpack_trees() is run. That's done because I later want to use the
> original index for calling verify_uptodate(). verify_uptodate() is
> then used for was_tracked_and_matches() and was_tracked().
>
> Anyway, the whole upshot of this is:
> * merge-recursive uses src_index == &the_index for the unpack_trees() call.
> * merge-recursive uses src_index == o->orig_index for subsequent
> calls to verify_uptodate(), but verify_uptodate() doesn't call into
> any of the sites you have modified.
>
> Further:
> * Every other existing caller of unpack-trees in the code sets
> src_index == &the_index, so this won't break any of them.
> * There are only two callers in the codebase that set dst_index to
> something other than &the_index -- diff-lib.c (which sets it to NULL)
> and merge-recursive (which does the stuff described above).
>
> So, having done that analysis, I am now pretty convinced your patch
> won't break anything. That's one half...
>
> >> Also, if any of the changes you made are wrong, what was there before
> >> was also clearly wrong. So I think we're at least no worse off.
> >>
> >> But, I agree, it's not easy to verify what the correct thing should be
> >> in all cases. I'll try to take a closer look in the next couple days.
> >
> > Thanks. I will also stare at this code some more in the next couple
> > days trying to remember what these functions do.
>
> Your patch has two divisible parts:
>
> 1) Your modifications to
> * clear_ce_flags_1()
> * clear_ce_flags_dir()
> * clear_ce_flags()
> * mark_new_skip_worktree()
> The clear_ce_flags*() functions are only called by each other and by
> mark_new_skip_worktree(), which in turn is only called from
> unpack_trees(). Also, in all of these, your change ends up only
> modifying what index_state is passed to is_excluded_from_list().
>
> 2) Your modifications to
> * verify_clean_subdirectory()
> * check_ok_to_remove()
> In this case, the former is only called by the latter, and the latter
> ends up only being called (via verify_absent_1()) from verify_absent()
> and verify_absent_sparse().
>
> I'll address each, in reverse order.
>
> 2) The stuff that affects verify_absent()
>
> While verify_absent() is not called from merge-recursive right now, it
> was something I wanted to use in the future for very similar reasons
> that verify_uptodate() started being called directly from
> merge-recursive. In particular, if the rewrite of merge-recursive[A]
> I want to do sets index_only when calling unpack_trees(), then does
> the whole merge without touching the worktree, then at the end goes to
> update the working tree, it will need to do extra checks.
> verify_absent() will come in handy as one of those extra checks. For
> that case, using the_index (the new index just created with lots of
> changes) would be wrong in all the same ways that using the_index
> caused massive problems for was_tracked() in merge-recursive (e.g. the
> blow up of when Junio merged the original directory rename detection
> series into master and subsequently reverted it); we'd instead want
> src_index (which was the index that existed when merge was called)
> instead. So, with this patch you've fixed some important bugs that I
> would have hit later.
>
> [A] sidenote: see
> https://public-inbox.org/git/xmqqk1ydkbx0.fsf@gitster.mtv.corp.google.com/
> for more details
>
> 1) mark_new_skip_worktree() ... is_excluded_from_list().
>
> Sadly, I'm not very familiar with the skip_worktree and sparse
> checkout stuff. However, the fact that mark_new_skip_worktree()
> explicitly takes an index_state (and a different one is passed to it
> the two different times it is called), and that it is the only caller
> of the clear_ce_flags*() family of functions, and that those function
> use the cache entries from the index passed to them in all cases other
> than the calls to is_excluded_from_list() makes those two look like
> oversights. In fact, a little more digging turns up commit
> fba92be8f7 ("dir: convert is_excluded_from_list to take an index",
> 2017-05-05)
> and before then, those functions didn't use the_index directly. But
> they did use it indirectly, because they called a function in dir.c
> that had it hardcoded. So it looks like Brandon fixed part of the bug
> for us, but moved the incorrect hardcoding from dir.c to
> unpack-trees.c. Your patch is just fixing it up. In fact, a little
> more digging turns up:
>
> 2c1eb10454 ("dir: convert read_directory to take an index", 2017-05-05)
> a0bba65b10 ("dir: convert is_excluded to take an index", 2017-05-05)
>
> which appear to be the culprits behind the other two uses of the_index
> called from verify_absent(). It looks like before these commits that
> unpack_trees() was carefully using the appropriate index, except that
> functions in dir.c had use of the_index hardcoded. Brandon fix the
> functions in dir.c for us, but ended up still hardcoding the_index in
> unpack-trees.c. Basically, he did most of the necessary lifting, and
> your patch just finally changes them over to use the correct index.
>
> Brandon: Does anything look off in my analysis above?
Nope your analysis is correct with regards to the changes I made to dir.
Back when I was converting ls-files and grep to recursively work on
submodules I needed to make it so that some of the dir functions
explicitly accepted an index parameter instead of implicitly relying on
the_index. So I bubbled up the hardcoded the_index to the places I
wasn't passing in a specific index (which would be when the_index
appeared in unpack-trees.c) and allowed for arbitrary index's to be
passed in like in ls-files.
This is one of the larger issues with working in our codebase,
implicitly relying on global state. It makes it very difficult to
reason if a section of code is correct or not, as is in this case. Is
it a bug that we use o->src_index in some places and the_index in
others? Well maybe not, but that's because they usually are the same
thing in the few cases that it matters (at least now its a bit more
explicit to see this, whereas before it was very implicit). Of course I
know stefan has done work with making unpack-trees recursive so maybe
it can be a bug in the recursive case.
--
Brandon Williams
next prev parent reply other threads:[~2018-06-04 17:33 UTC|newest]
Thread overview: 92+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-01 16:11 [PATCH/RFC/BUG] unpack-trees.c: do not use "the_index" Nguyễn Thái Ngọc Duy
2018-06-01 18:34 ` Elijah Newren
2018-06-01 18:51 ` Stefan Beller
2018-06-02 5:01 ` Duy Nguyen
2018-06-02 5:03 ` Duy Nguyen
2018-06-03 4:58 ` Elijah Newren
2018-06-04 17:33 ` Brandon Williams [this message]
2018-06-05 15:43 ` [PATCH 0/6] Fix "the_index" usage in unpack-trees.c Nguyễn Thái Ngọc Duy
2018-06-05 15:43 ` [PATCH 1/6] unpack-trees: remove 'extern' on function declaration Nguyễn Thái Ngọc Duy
2018-06-05 15:43 ` [PATCH 2/6] unpack-trees: add a note about path invalidation Nguyễn Thái Ngọc Duy
2018-06-05 15:43 ` [PATCH 3/6] unpack-trees: don't shadow global var the_index Nguyễn Thái Ngọc Duy
2018-06-05 17:11 ` Ramsay Jones
2018-06-05 15:43 ` [PATCH 4/6] unpack-tress: convert clear_ce_flags* to avoid the_index Nguyễn Thái Ngọc Duy
2018-06-05 17:37 ` Ramsay Jones
2018-06-05 15:43 ` [PATCH 5/6] unpack-trees: avoid the_index in verify_absent() Nguyễn Thái Ngọc Duy
2018-06-05 15:43 ` [PATCH 6/6] Forbid "the_index" in dir.c and unpack-trees.c Nguyễn Thái Ngọc Duy
2018-06-05 16:58 ` Brandon Williams
2018-06-06 4:57 ` Duy Nguyen
2018-06-06 5:02 ` [PATCH v2 0/5] Fix "the_index" usage in unpack-trees.c Nguyễn Thái Ngọc Duy
2018-06-06 5:02 ` [PATCH v2 1/5] unpack-trees: remove 'extern' on function declaration Nguyễn Thái Ngọc Duy
2018-06-06 5:02 ` [PATCH v2 2/5] unpack-trees: add a note about path invalidation Nguyễn Thái Ngọc Duy
2018-06-06 5:02 ` [PATCH v2 3/5] unpack-trees: don't shadow global var the_index Nguyễn Thái Ngọc Duy
2018-06-06 5:02 ` [PATCH v2 4/5] unpack-tress: convert clear_ce_flags* to avoid the_index Nguyễn Thái Ngọc Duy
2018-06-06 5:02 ` [PATCH v2 5/5] unpack-trees: avoid the_index in verify_absent() Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 00/20] Fix incorrect use of the_index Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 01/20] unpack-trees: remove 'extern' on function declaration Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 02/20] unpack-trees: add a note about path invalidation Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 03/20] unpack-trees: don't shadow global var the_index Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 04/20] unpack-tress: convert clear_ce_flags* to avoid the_index Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 05/20] unpack-trees: avoid the_index in verify_absent() Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 06/20] attr.h: drop extern from function declaration Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 07/20] attr: remove an implicit dependency on the_index Nguyễn Thái Ngọc Duy
2018-06-06 13:35 ` Ramsay Jones
2018-06-06 16:50 ` Brandon Williams
2018-06-06 16:58 ` Duy Nguyen
2018-06-06 17:02 ` Brandon Williams
2018-06-06 7:39 ` [PATCH v3 08/20] convert.h: drop 'extern' from function declaration Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 09/20] convert.c: remove an implicit dependency on the_index Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 10/20] dir.c: remove an implicit dependency on the_index in pathspec code Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 11/20] ls-files: correct index argument to get_convert_attr_ascii() Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 12/20] pathspec.c: use the right index instead of the_index Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 13/20] submodule.c: " Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 14/20] entry.c: " Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 15/20] attr: remove index from git_attr_set_direction() Nguyễn Thái Ngọc Duy
2018-06-06 16:58 ` Brandon Williams
2018-06-06 7:39 ` [PATCH v3 16/20] preload-index.c: use the right index instead of the_index Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 17/20] cache.c: remove an implicit dependency on the_index Nguyễn Thái Ngọc Duy
2018-06-06 17:00 ` Brandon Williams
2018-06-06 7:39 ` [PATCH v3 18/20] resolve-undo.c: use the right index instead of the_index Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 19/20] grep: " Nguyễn Thái Ngọc Duy
2018-06-06 7:39 ` [PATCH v3 20/20] cache.h: make the_index part of "compatibility macros" Nguyễn Thái Ngọc Duy
2018-06-06 16:49 ` [PATCH v4 00/23] Fix incorrect use of the_index Nguyễn Thái Ngọc Duy
2018-06-06 16:49 ` [PATCH v4 01/23] unpack-trees: remove 'extern' on function declaration Nguyễn Thái Ngọc Duy
2018-06-06 16:49 ` [PATCH v4 02/23] unpack-trees: add a note about path invalidation Nguyễn Thái Ngọc Duy
2018-06-06 16:49 ` [PATCH v4 03/23] unpack-trees: don't shadow global var the_index Nguyễn Thái Ngọc Duy
2018-06-06 16:49 ` [PATCH v4 04/23] unpack-tress: convert clear_ce_flags* to avoid the_index Nguyễn Thái Ngọc Duy
2018-06-07 7:41 ` Elijah Newren
2018-06-07 15:11 ` Duy Nguyen
2018-06-08 15:58 ` Duy Nguyen
2018-06-06 16:49 ` [PATCH v4 05/23] unpack-trees: avoid the_index in verify_absent() Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 06/23] attr.h: drop extern from function declaration Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 07/23] attr: remove an implicit dependency on the_index Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 08/23] convert.h: drop 'extern' from function declaration Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 09/23] convert.c: remove an implicit dependency on the_index Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 10/23] dir.c: remove an implicit dependency on the_index in pathspec code Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 11/23] ls-files: correct index argument to get_convert_attr_ascii() Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 12/23] pathspec.c: use the right index instead of the_index Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 13/23] submodule.c: " Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 14/23] entry.c: " Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 15/23] attr: remove index from git_attr_set_direction() Nguyễn Thái Ngọc Duy
2018-06-09 17:57 ` Elijah Newren
2018-06-06 17:02 ` [PATCH v4 16/23] preload-index.c: use the right index instead of the_index Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 17/23] read-cache.c: remove an implicit dependency on the_index Nguyễn Thái Ngọc Duy
2018-06-09 18:10 ` Elijah Newren
2018-06-09 18:45 ` Duy Nguyen
2018-06-09 19:48 ` Elijah Newren
2018-06-06 17:02 ` [PATCH v4 18/23] apply.c: use the right index instead of the_index Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 19/23] difftool: " Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 20/23] checkout: avoid the_index when possible Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 21/23] resolve-undo.c: use the right index instead of the_index Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 22/23] grep: " Nguyễn Thái Ngọc Duy
2018-06-06 17:02 ` [PATCH v4 23/23] cache.h: make the_index part of "compatibility macros" Nguyễn Thái Ngọc Duy
2018-06-07 7:44 ` [PATCH v4 00/23] Fix incorrect use of the_index Elijah Newren
2018-06-09 19:58 ` Elijah Newren
2018-06-11 16:05 ` Duy Nguyen
2018-06-11 16:11 ` Elijah Newren
2018-06-11 16:24 ` Duy Nguyen
2018-06-11 16:44 ` Elijah Newren
2018-06-11 16:49 ` Duy Nguyen
2018-06-14 17:18 ` Duy Nguyen
2018-06-14 20:57 ` Elijah Newren
2018-06-11 18:20 ` Duy Nguyen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180604173338.GA91449@google.com \
--to=bmwill@google.com \
--cc=git@vger.kernel.org \
--cc=newren@gmail.com \
--cc=pclouds@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.