git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matheus Tavares <matheus.bernardino@usp.br>
To: Derrick Stolee <stolee@gmail.com>
Cc: git <git@vger.kernel.org>, Elijah Newren <newren@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	vdye@github.com, Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
	Sean Christopherson <seanjc@google.com>
Subject: Re: [PATCH v2] add, rm, mv: fix bug that prevents the update of non-sparse dirs
Date: Tue, 26 Oct 2021 19:43:13 -0300	[thread overview]
Message-ID: <CAHd-oW6w0aiFDVX1S2ttfc++H3okz2YTGf3f2p=xSbL_Bc_DNA@mail.gmail.com> (raw)
In-Reply-To: <47aec8ed-5e54-6d13-8154-0202ef0fd747@gmail.com>

Hi, Stolee

Thanks for the review!

On Tue, Oct 26, 2021 at 9:53 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 10/25/2021 5:07 PM, Matheus Tavares wrote:
> >
> > - Simplified the implementation by unifing the code path for cone mode and
> >   full pattern mode. Since path_matches_pattern_list() never returns UNDECIDED
> >   for cone mode, it will always execute only one iteration of the loop and then
> >   find the final answer. There is no need to handle this case in a separate
> >   block.
>
> This was unexpected, but makes sense. While your commit message hints at the
> fact that cone mode never returns UNDECIDED, it doesn't explicitly mention
> that cone mode will exit the loop after a single iteration. It might be nice
> to make that explicit either in the commit message or the block comment before
> the loop.

Yup, will do.

> > - Inside the loop, made sure to change dtype to DT_DIR when going to parent
> >   directories. Without this, the pattern match would fail if we had a path
> >   like "a/b/c" and the pattern "b/" (with trailing slash).
>
> Very good. We typically need to detect the type for the first path given,
> but we know that all parents are directories. I've used this trick elsewhere.
>
> I see in the code that the first path is used as DT_REG. It's my fault, but
> perhaps it should be made more clear that path_in_sparse_checkout() will
> consider the given path as a file, not a directory. The current users of the
> method are using it properly, but I'm suddenly worried about another caller
> misinterpreting the generality of the problem.

Yeah, I was thinking about this too... I'm afraid there might be at
least two users of this function which already pass non-regular files
to it: builtin/add.c:refresh() and
sparse-index.c:convert_to_sparse_rec().

The first calls the function passing the user-given pathspec, which
may be a directory. But this one is easy to solve: I think we don't
even need the path_in_sparse_checkout() here as the `git add
--refresh` only work on tracked files, and the previous
matches_skip_worktree() call covers both skip_worktree and
non-skip_worktree index entries (maybe we should rename this function
to matches_sparse_ce()?)

As for convert_to_sparse_rec(), it seems to call
path_in_sparse_checkout() with the directory components of paths, so
something like "dir/". Perhaps we can make path_in_sparse_checkout()
receive a dtype argument and pass DT_UNKNOWN in this case?

Another case I haven't given much thought yet is submodules. For example:

git init sub &&
test_commit -C sub file &&
git submodule add ./sub &&
git commit -m sub &&
git sparse-checkout set 'sub/' &&
git mv sub sub2

Erroneously gives:
The following paths and/or pathspecs matched paths that exist
outside of your sparse-checkout definition, so will not be
updated in the index:
sub

But it works if we change DT_REG to DT_UNKNOWN in
path_in_sparse_checkout(). So, I'm not sure, should we use DT_UNKNOWN
for all calls?

> > @@ -1504,8 +1504,9 @@ static int path_in_sparse_checkout_1(const char *path,
> >                                    struct index_state *istate,
> >                                    int require_cone_mode)
> >  {
> > -     const char *base;
> >       int dtype = DT_REG;
>
> Here is where we assume a file to start.
>
> > +     enum pattern_match_result match = UNDECIDED;
> > +     const char *end, *slash;
> >
> >       /*
> >        * We default to accepting a path if there are no patterns or
> > @@ -1516,11 +1517,23 @@ static int path_in_sparse_checkout_1(const char *path,
> >            !istate->sparse_checkout_patterns->use_cone_patterns))
> >               return 1;
> >
> > -     base = strrchr(path, '/');
> > -     return path_matches_pattern_list(path, strlen(path), base ? base + 1 : path,
> > -                                      &dtype,
> > -                                      istate->sparse_checkout_patterns,
> > -                                      istate) > 0;
> > +     /*
> > +      * If UNDECIDED, use the match from the parent dir (recursively),
> > +      * or fall back to NOT_MATCHED at the topmost level.
> > +      */
> > +     for (end = path + strlen(path); end > path && match == UNDECIDED; end = slash) {
>
> nit: since this line is long and the sentinel is complicated, it might
> be worth splitting the different parts into their own lines:
>
>         for (end = path + strlen(path);
>              end > path && match == UNDECIDED;
>              end = slash) {

Good idea, I will split the lines like you did.

  reply	other threads:[~2021-10-26 22:43 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-25 21:07 [PATCH v2] add, rm, mv: fix bug that prevents the update of non-sparse dirs Matheus Tavares
2021-10-26 12:53 ` Derrick Stolee
2021-10-26 22:43   ` Matheus Tavares [this message]
2021-10-27 11:35     ` Derrick Stolee
2021-10-26 16:22 ` René Scharfe
2021-10-26 19:04   ` Derrick Stolee
2021-10-27  0:00     ` Matheus Tavares
2021-10-27 22:13     ` Chris Torek
2021-10-27 17:36 ` Sean Christopherson
2021-10-28 14:21 ` [PATCH v3] " Matheus Tavares
2021-10-28 15:06   ` Derrick Stolee
2021-10-28 15:58     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHd-oW6w0aiFDVX1S2ttfc++H3okz2YTGf3f2p=xSbL_Bc_DNA@mail.gmail.com' \
    --to=matheus.bernardino@usp.br \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=newren@gmail.com \
    --cc=seanjc@google.com \
    --cc=stolee@gmail.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).