All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Tamir Duberstein <tamird@gmail.com>
Cc: git@vger.kernel.org, "René Scharfe" <l.s.r@web.de>,
	"Patrick Steinhardt" <ps@pks.im>
Subject: Re: [PATCH] ls-files: filter pathspec before lstat
Date: Mon, 08 Jun 2026 06:06:33 -0700	[thread overview]
Message-ID: <xmqqa4t5yyee.fsf@gitster.g> (raw)
In-Reply-To: <20260607-ls-files-pathspec-lstat-v1-1-8cf40b730146@gmail.com> (Tamir Duberstein's message of "Sun, 07 Jun 2026 11:40:56 -0400")

On Sun, Jun 7, 2026 at 11:40, Tamir Duberstein wrote:
> show_files() checks whether each index entry is deleted or modified
> before show_ce() applies the pathspec. prune_index() avoids most of this
> work for pathspecs with a common directory prefix, but a top-level name
> or leading wildcard leaves every entry to be checked.
> 
> Match the pathspec before lstat() for the deleted and modified modes.
> Keep the later match in show_ce() so --error-unmatch is satisfied only
> by entries that are actually shown.

Adding an extra early `match_pathspec()` check before making slow
system calls like `lstat()` makes sense, especially when most of the
index entries need to be skipped.  But if most of them would match,
then we would end up doing the same match_pathspec() calls twice for
each path, and run lstat() anyway, so you may also be able to
construct a perf test that demonstrates a case where this approach
is not a clear win (or even degradation), perhaps?

> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index e1a22b41b9..702c607183 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -450,6 +450,13 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
>  			continue;
>  		if (ce_skip_worktree(ce))
>  			continue;
> +		/* Only entries shown by show_ce() satisfy --error-unmatch. */
> +		if (pathspec.nr &&
> +		    !match_pathspec(repo->index, &pathspec, fullname.buf,
> +				    fullname.len, max_prefix_len, NULL,
> +				    S_ISDIR(ce->ce_mode) ||
> +				    S_ISGITLINK(ce->ce_mode)))
> +			continue;
>  		stat_err = lstat(fullname.buf, &st);
>  		if (stat_err && (errno != ENOENT && errno != ENOTDIR))
>  			error_errno("cannot lstat '%s'", fullname.buf);

Hmph.  In the current code, because there is no such pre-filtering,
show_ce() would unconditionally recurse into active submodules when
told to with the "--recurse-submodules" flag, even if your pathspec
coes not match the submodule.  With this change, such a submodule
whose path does not match the pathspec would not even be seen by
show_ce().  Would it cause a change in behaviour?

  parent reply	other threads:[~2026-06-08 13:06 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-07 15:40 [PATCH] ls-files: filter pathspec before lstat Tamir Duberstein
2026-06-07 16:02 ` Kristoffer Haugsbakk
2026-06-07 16:07   ` Tamir Duberstein
2026-06-07 16:17     ` Kristoffer Haugsbakk
2026-06-07 18:24       ` Tamir Duberstein
2026-06-08 13:06 ` Junio C Hamano [this message]
2026-06-08 19:15   ` Tamir Duberstein
2026-06-08 23:03   ` Jeff King
2026-06-08 23:25     ` Jeff King
2026-06-09  0:13       ` Tamir Duberstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqa4t5yyee.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=l.s.r@web.de \
    --cc=ps@pks.im \
    --cc=tamird@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.