From: Matheus Tavares <matheus.bernardino@usp.br>
To: Derrick Stolee <stolee@gmail.com>
Cc: git <git@vger.kernel.org>, Elijah Newren <newren@gmail.com>,
Junio C Hamano <gitster@pobox.com>,
vdye@github.com, Derrick Stolee <derrickstolee@github.com>,
Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
Sean Christopherson <seanjc@google.com>
Subject: Re: [PATCH v2] add, rm, mv: fix bug that prevents the update of non-sparse dirs
Date: Tue, 26 Oct 2021 19:43:13 -0300 [thread overview]
Message-ID: <CAHd-oW6w0aiFDVX1S2ttfc++H3okz2YTGf3f2p=xSbL_Bc_DNA@mail.gmail.com> (raw)
In-Reply-To: <47aec8ed-5e54-6d13-8154-0202ef0fd747@gmail.com>
Hi, Stolee
Thanks for the review!
On Tue, Oct 26, 2021 at 9:53 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 10/25/2021 5:07 PM, Matheus Tavares wrote:
> >
> > - Simplified the implementation by unifing the code path for cone mode and
> > full pattern mode. Since path_matches_pattern_list() never returns UNDECIDED
> > for cone mode, it will always execute only one iteration of the loop and then
> > find the final answer. There is no need to handle this case in a separate
> > block.
>
> This was unexpected, but makes sense. While your commit message hints at the
> fact that cone mode never returns UNDECIDED, it doesn't explicitly mention
> that cone mode will exit the loop after a single iteration. It might be nice
> to make that explicit either in the commit message or the block comment before
> the loop.
Yup, will do.
> > - Inside the loop, made sure to change dtype to DT_DIR when going to parent
> > directories. Without this, the pattern match would fail if we had a path
> > like "a/b/c" and the pattern "b/" (with trailing slash).
>
> Very good. We typically need to detect the type for the first path given,
> but we know that all parents are directories. I've used this trick elsewhere.
>
> I see in the code that the first path is used as DT_REG. It's my fault, but
> perhaps it should be made more clear that path_in_sparse_checkout() will
> consider the given path as a file, not a directory. The current users of the
> method are using it properly, but I'm suddenly worried about another caller
> misinterpreting the generality of the problem.
Yeah, I was thinking about this too... I'm afraid there might be at
least two users of this function which already pass non-regular files
to it: builtin/add.c:refresh() and
sparse-index.c:convert_to_sparse_rec().
The first calls the function passing the user-given pathspec, which
may be a directory. But this one is easy to solve: I think we don't
even need the path_in_sparse_checkout() here as the `git add
--refresh` only work on tracked files, and the previous
matches_skip_worktree() call covers both skip_worktree and
non-skip_worktree index entries (maybe we should rename this function
to matches_sparse_ce()?)
As for convert_to_sparse_rec(), it seems to call
path_in_sparse_checkout() with the directory components of paths, so
something like "dir/". Perhaps we can make path_in_sparse_checkout()
receive a dtype argument and pass DT_UNKNOWN in this case?
Another case I haven't given much thought yet is submodules. For example:
git init sub &&
test_commit -C sub file &&
git submodule add ./sub &&
git commit -m sub &&
git sparse-checkout set 'sub/' &&
git mv sub sub2
Erroneously gives:
The following paths and/or pathspecs matched paths that exist
outside of your sparse-checkout definition, so will not be
updated in the index:
sub
But it works if we change DT_REG to DT_UNKNOWN in
path_in_sparse_checkout(). So, I'm not sure, should we use DT_UNKNOWN
for all calls?
> > @@ -1504,8 +1504,9 @@ static int path_in_sparse_checkout_1(const char *path,
> > struct index_state *istate,
> > int require_cone_mode)
> > {
> > - const char *base;
> > int dtype = DT_REG;
>
> Here is where we assume a file to start.
>
> > + enum pattern_match_result match = UNDECIDED;
> > + const char *end, *slash;
> >
> > /*
> > * We default to accepting a path if there are no patterns or
> > @@ -1516,11 +1517,23 @@ static int path_in_sparse_checkout_1(const char *path,
> > !istate->sparse_checkout_patterns->use_cone_patterns))
> > return 1;
> >
> > - base = strrchr(path, '/');
> > - return path_matches_pattern_list(path, strlen(path), base ? base + 1 : path,
> > - &dtype,
> > - istate->sparse_checkout_patterns,
> > - istate) > 0;
> > + /*
> > + * If UNDECIDED, use the match from the parent dir (recursively),
> > + * or fall back to NOT_MATCHED at the topmost level.
> > + */
> > + for (end = path + strlen(path); end > path && match == UNDECIDED; end = slash) {
>
> nit: since this line is long and the sentinel is complicated, it might
> be worth splitting the different parts into their own lines:
>
> for (end = path + strlen(path);
> end > path && match == UNDECIDED;
> end = slash) {
Good idea, I will split the lines like you did.
next prev parent reply other threads:[~2021-10-26 22:43 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-25 21:07 [PATCH v2] add, rm, mv: fix bug that prevents the update of non-sparse dirs Matheus Tavares
2021-10-26 12:53 ` Derrick Stolee
2021-10-26 22:43 ` Matheus Tavares [this message]
2021-10-27 11:35 ` Derrick Stolee
2021-10-26 16:22 ` René Scharfe
2021-10-26 19:04 ` Derrick Stolee
2021-10-27 0:00 ` Matheus Tavares
2021-10-27 22:13 ` Chris Torek
2021-10-27 17:36 ` Sean Christopherson
2021-10-28 14:21 ` [PATCH v3] " Matheus Tavares
2021-10-28 15:06 ` Derrick Stolee
2021-10-28 15:58 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAHd-oW6w0aiFDVX1S2ttfc++H3okz2YTGf3f2p=xSbL_Bc_DNA@mail.gmail.com' \
--to=matheus.bernardino@usp.br \
--cc=derrickstolee@github.com \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=gitster@pobox.com \
--cc=newren@gmail.com \
--cc=seanjc@google.com \
--cc=stolee@gmail.com \
--cc=vdye@github.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).