* [PATCH v3 0/4] diff: add pathspec support to --no-index
@ 2025-05-20 0:01 Jacob Keller
2025-05-20 0:01 ` [PATCH v3 1/4] prefix_path: support prefixes not ending in trailing slash Jacob Keller
` (3 more replies)
0 siblings, 4 replies; 15+ messages in thread
From: Jacob Keller @ 2025-05-20 0:01 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Jacob Keller
From: Jacob Keller <jacob.keller@gmail.com>
This series adds support for using pathspecs to limit the comparison when
using git diff --no-index. This is similar to how you can limit what is
included with pathspecs when comparing inside a repository.
This likely still needs some reworks, and I have some open questions in the
final implementation:
1) pathspecs must all come after the first two path arguments, you
can't re-arrange them to come first. I'm treating them sort of like
the treeish arguments to git diff-tree.
2) The pathspecs are interpreted relative to the provided paths, and
thus will always need to be specified as relative paths, and will be
interpreted as relative to the root of the search for each path
separately.
3) negative pathspecs have to be fully qualified from the root, i.e.
':(exclude)file' will only exclude 'a/file' and not 'a/b/file'
unless you also use '(glob)' or similar. I think this matches the
other pathspec support, but I an not 100% sure.
4) I'm not certain about exposing match_pathspec_with_flags as-is,
since DO_MATCH_EXCLUDE shouldn't be passed. I got the behavior I
expected with DO_MATCH_LEADING_PATHSPEC, but it feels a bit of a
weird API. Perhaps match_pathspec could set both flags when is_dir
is true? But would that break other callers?
However, this version now has documentation and some test cases. I found a
few issues with my original implementation in v2, which I've fixed.
I also an open to suggestions on better ways to handle the matching.
Currently I need a separate set of pathspecs for both paths, since I need to
make sure they get rooted appropriately. I don't know if there is a better
solution that would allow using the same pathspec structure for both
comparisons.
Jacob Keller (4):
prefix_path: support prefixes not ending in trailing slash
pathspec: expose match_pathspec_with_flags
pathspec: add flag to indicate operation without repository
diff --no-index: support limiting by pathspec
pathspec.h | 13 +++++
builtin/diff.c | 2 +-
diff-no-index.c | 91 ++++++++++++++++++++++++------
dir.c | 20 ++++---
pathspec.c | 6 +-
setup.c | 4 +-
Documentation/git-diff.adoc | 10 +++-
t/t0060-path-utils.sh | 18 ++++++
t/t4053-diff-no-index.sh | 107 ++++++++++++++++++++++++++++++++++++
9 files changed, 240 insertions(+), 31 deletions(-)
--
2.48.1.397.gec9d649cc640
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v3 1/4] prefix_path: support prefixes not ending in trailing slash
2025-05-20 0:01 [PATCH v3 0/4] diff: add pathspec support to --no-index Jacob Keller
@ 2025-05-20 0:01 ` Jacob Keller
2025-05-20 14:35 ` Junio C Hamano
2025-05-20 0:01 ` [PATCH v3 2/4] pathspec: expose match_pathspec_with_flags Jacob Keller
` (2 subsequent siblings)
3 siblings, 1 reply; 15+ messages in thread
From: Jacob Keller @ 2025-05-20 0:01 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Jacob Keller
From: Jacob Keller <jacob.keller@gmail.com>
The prefix_path_gently() -- and its wrapper prefix_path() -- function
normalizes the provided path and optionally adds the provided prefix to
relative paths. If the prefix does not end in a trailing slash, the
function will combine the last component of the prefix with the first
component of the relative path. This is unlikely to produce a desirable
result.
Teach prefix_path_gently() to check if the prefix ends in a slash. If it
does not, then insert a slash between the prefix and the path. Take care
to avoid inserting a slash if the prefix is empty.
Add test cases to cover the relative path behavior, which was previously
untested.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
---
setup.c | 4 +++-
t/t0060-path-utils.sh | 18 ++++++++++++++++++
2 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/setup.c b/setup.c
index f93bd6a24a5d..bd2888500817 100644
--- a/setup.c
+++ b/setup.c
@@ -139,7 +139,9 @@ char *prefix_path_gently(const char *prefix, int len,
return NULL;
}
} else {
- sanitized = xstrfmt("%.*s%s", len, len ? prefix : "", path);
+ sanitized = xstrfmt("%.*s%s%s", len, len ? prefix : "",
+ !len || prefix[len - 1] == '/' ? "" : "/",
+ path);
if (remaining_prefix)
*remaining_prefix = len;
if (normalize_path_copy_len(sanitized, sanitized, remaining_prefix)) {
diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index 8545cdfab559..9274713aea0c 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -307,6 +307,24 @@ test_expect_success SYMLINKS 'prefix_path works with absolute path to a symlink
test_cmp expect actual
'
+test_expect_success 'prefix_path works with relative path' '
+ echo "prefix/a" >expect &&
+ test-tool path-utils prefix_path "prefix/" "a" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'prefix_path works with relative path and prefix not ending in /' '
+ echo "prefix/a" >expect &&
+ test-tool path-utils prefix_path "prefix" "a" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success 'prefix_path works with relative path and empty prefix' '
+ echo "a" >expect &&
+ test-tool path-utils prefix_path "" "a" >actual &&
+ test_cmp expect actual
+'
+
relative_path /foo/a/b/c/ /foo/a/b/ c/
relative_path /foo/a/b/c/ /foo/a/b c/
relative_path /foo/a//b//c/ ///foo/a/b// c/ POSIX
--
2.48.1.397.gec9d649cc640
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v3 2/4] pathspec: expose match_pathspec_with_flags
2025-05-20 0:01 [PATCH v3 0/4] diff: add pathspec support to --no-index Jacob Keller
2025-05-20 0:01 ` [PATCH v3 1/4] prefix_path: support prefixes not ending in trailing slash Jacob Keller
@ 2025-05-20 0:01 ` Jacob Keller
2025-05-20 14:39 ` Junio C Hamano
2025-05-20 0:01 ` [PATCH v3 3/4] pathspec: add flag to indicate operation without repository Jacob Keller
2025-05-20 0:01 ` [PATCH v3 4/4] diff --no-index: support limiting by pathspec Jacob Keller
3 siblings, 1 reply; 15+ messages in thread
From: Jacob Keller @ 2025-05-20 0:01 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Jacob Keller
From: Jacob Keller <jacob.keller@gmail.com>
The do_match_pathspec() function has the DO_MATCH_LEADING_PATHSPEC
option to allow pathspecs to match when matching "src" against a
pathspec like "src/path/...". This support is not exposed by
match_pathspec, and the internal flags to do_match_pathspec are not
exposed outside of dir.c
Make match_pathspec_with_flags public, and expose the
DO_MATCH_LEADING_PATHSPEC and DO_MATCH_DIRECTORY flags. The
DO_MATCH_EXCLUDE flag is kept private in dir.c
This will be used in a an extension to support pathspec matching in git
diff --no-index.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
---
pathspec.h | 8 ++++++++
dir.c | 11 +++++------
2 files changed, 13 insertions(+), 6 deletions(-)
diff --git a/pathspec.h b/pathspec.h
index de537cff3cb6..d22d4e80248d 100644
--- a/pathspec.h
+++ b/pathspec.h
@@ -184,6 +184,14 @@ int match_pathspec(struct index_state *istate,
const char *name, int namelen,
int prefix, char *seen, int is_dir);
+#define DO_MATCH_DIRECTORY (1<<1)
+#define DO_MATCH_LEADING_PATHSPEC (1<<2)
+
+int match_pathspec_with_flags(struct index_state *istate,
+ const struct pathspec *ps,
+ const char *name, int namelen,
+ int prefix, char *seen, unsigned flags);
+
/*
* Determine whether a pathspec will match only entire index entries (non-sparse
* files and/or entire sparse directories). If the pathspec has the potential to
diff --git a/dir.c b/dir.c
index a374972b6243..2f2b654b0252 100644
--- a/dir.c
+++ b/dir.c
@@ -329,9 +329,8 @@ static int do_read_blob(const struct object_id *oid, struct oid_stat *oid_stat,
return 1;
}
+// DO_MATCH_EXCLUDE is not public
#define DO_MATCH_EXCLUDE (1<<0)
-#define DO_MATCH_DIRECTORY (1<<1)
-#define DO_MATCH_LEADING_PATHSPEC (1<<2)
/*
* Does the given pathspec match the given name? A match is found if
@@ -551,10 +550,10 @@ static int do_match_pathspec(struct index_state *istate,
return retval;
}
-static int match_pathspec_with_flags(struct index_state *istate,
- const struct pathspec *ps,
- const char *name, int namelen,
- int prefix, char *seen, unsigned flags)
+int match_pathspec_with_flags(struct index_state *istate,
+ const struct pathspec *ps,
+ const char *name, int namelen,
+ int prefix, char *seen, unsigned flags)
{
int positive, negative;
positive = do_match_pathspec(istate, ps, name, namelen,
--
2.48.1.397.gec9d649cc640
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v3 3/4] pathspec: add flag to indicate operation without repository
2025-05-20 0:01 [PATCH v3 0/4] diff: add pathspec support to --no-index Jacob Keller
2025-05-20 0:01 ` [PATCH v3 1/4] prefix_path: support prefixes not ending in trailing slash Jacob Keller
2025-05-20 0:01 ` [PATCH v3 2/4] pathspec: expose match_pathspec_with_flags Jacob Keller
@ 2025-05-20 0:01 ` Jacob Keller
2025-05-20 15:13 ` Junio C Hamano
2025-05-20 0:01 ` [PATCH v3 4/4] diff --no-index: support limiting by pathspec Jacob Keller
3 siblings, 1 reply; 15+ messages in thread
From: Jacob Keller @ 2025-05-20 0:01 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Jacob Keller
From: Jacob Keller <jacob.keller@gmail.com>
A following change will add support for pathspecs to the git diff
--no-index command. This mode of git diff does not load any repository.
Add a new PATHSPEC_NO_REPOSITORY flag indicating that we're parsing
pathspecs without a repository.
Both PATHSPEC_ATTR and PATHSPEC_FROMTOP require a repository to
function. Thus, verify that both of these are set in magic_mask to
ensure they won't be accepted when PATHSPEC_NO_REPOSITORY is set.
Check PATHSPEC_NO_REPOSITORY when warning about paths outside the
directory tree. When the flag is set, do not look for a git repository
when generating the warning message.
Finally, add a BUG in match_pathspec_item if the istate is NULL but the
pathspec has PATHSPEC_ATTR set. Callers which support PATHSPEC_ATTR
should always pass a valid istate, and callers which don't pass a valid
istate should have set PATHSPEC_ATTR in the magic_mask field to disable
support for attribute-based pathspecs.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
---
pathspec.h | 5 +++++
dir.c | 9 ++++++---
pathspec.c | 6 +++++-
3 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/pathspec.h b/pathspec.h
index d22d4e80248d..2b47f9a2a213 100644
--- a/pathspec.h
+++ b/pathspec.h
@@ -76,6 +76,11 @@ struct pathspec {
* allowed, then it will automatically set for every pathspec.
*/
#define PATHSPEC_LITERAL_PATH (1<<6)
+/*
+ * For git diff --no-index, indicate that we are operating without
+ * a repository or index.
+ */
+#define PATHSPEC_NO_REPOSITORY (1<<7)
/**
* Given command line arguments and a prefix, convert the input to
diff --git a/dir.c b/dir.c
index 2f2b654b0252..45aac0bfacab 100644
--- a/dir.c
+++ b/dir.c
@@ -396,9 +396,12 @@ static int match_pathspec_item(struct index_state *istate,
strncmp(item->match, name - prefix, item->prefix))
return 0;
- if (item->attr_match_nr &&
- !match_pathspec_attrs(istate, name - prefix, namelen + prefix, item))
- return 0;
+ if (item->attr_match_nr) {
+ if (!istate)
+ BUG("magic PATHSPEC_ATTR requires an index");
+ if (!match_pathspec_attrs(istate, name - prefix, namelen + prefix, item))
+ return 0;
+ }
/* If the match was just the prefix, we matched */
if (!*match)
diff --git a/pathspec.c b/pathspec.c
index 2b4e434bc0aa..a3ddd701c740 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -492,7 +492,7 @@ static void init_pathspec_item(struct pathspec_item *item, unsigned flags,
if (!match) {
const char *hint_path;
- if (!have_git_dir())
+ if ((flags & PATHSPEC_NO_REPOSITORY) || !have_git_dir())
die(_("'%s' is outside the directory tree"),
copyfrom);
hint_path = repo_get_work_tree(the_repository);
@@ -614,6 +614,10 @@ void parse_pathspec(struct pathspec *pathspec,
(flags & PATHSPEC_PREFER_FULL))
BUG("PATHSPEC_PREFER_CWD and PATHSPEC_PREFER_FULL are incompatible");
+ if ((flags & PATHSPEC_NO_REPOSITORY) &&
+ (~magic_mask & (PATHSPEC_ATTR | PATHSPEC_FROMTOP)))
+ BUG("PATHSPEC_NO_REPOSITORY is incompatible with PATHSPEC_ATTR and PATHSPEC_FROMTOP");
+
/* No arguments with prefix -> prefix pathspec */
if (!entry) {
if (flags & PATHSPEC_PREFER_FULL)
--
2.48.1.397.gec9d649cc640
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v3 4/4] diff --no-index: support limiting by pathspec
2025-05-20 0:01 [PATCH v3 0/4] diff: add pathspec support to --no-index Jacob Keller
` (2 preceding siblings ...)
2025-05-20 0:01 ` [PATCH v3 3/4] pathspec: add flag to indicate operation without repository Jacob Keller
@ 2025-05-20 0:01 ` Jacob Keller
2025-05-20 16:30 ` Junio C Hamano
3 siblings, 1 reply; 15+ messages in thread
From: Jacob Keller @ 2025-05-20 0:01 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Jacob Keller
From: Jacob Keller <jacob.keller@gmail.com>
The --no-index option of git-diff enables using the diff machinery from
git while operating outside of a repository. This mode of git diff is
able to compare directories and produce a diff of their contents.
When operating git diff in a repository, git has the notion of
"pathspecs" which can specify which files to compare. In particular,
when using git to diff two trees, you might invoke:
$ git diff-tree -r <treeish1> <treeish2>.
where the treeish could point to a subdirectory of the repository.
When invoked this way, users can limit the selected paths of the tree by
using a pathspec. Either by providing some list of paths to accept, or
by removing paths via a negative refspec.
The git diff --no-index mode does not support pathspecs, and cannot
limit the diff output in this way. Other diff programs such as GNU
difftools have options for excluding paths based on a pattern match.
However, using git diff as a diff replacement has several advantages
over many popular diff tools, including coloring moved lines, rename
detections, and similar.
Teach git diff --no-index how to handle pathspecs to limit the
comparisons. This will only be supported if both provided paths are
directories.
For comparisons where one path isn't a directory, the --no-index mode
already has some DWIM shortcuts implemented in the fixup_paths()
function.
Modify the fixup_paths function to return 1 if both paths are
directories. If this is the case, interpret any extra arguments to git
diff as pathspecs via parse_pathspec. Add a new PATHSPEC_NO_REPOSITORY
flag to indicate to the parser that we do not have repository. Use this
to aid in error message reporting in the event that the user happens to
invoke git diff --no-index from a repository.
Use parse_pathspec to load the remaining arguments (if any) to git diff
--no-index as pathspec items. Disable PATHSPEC_ATTR support since we do
not have a repository to do attribute lookup. Disable PATHSPEC_FROMTOP
since we do not have a repository root. All pathspecs are treated as
rooted at the provided comparison paths.
Load the pathspecs twice, once prefixed with paths[0] and once prefixed
with paths[1]. I considered trying to avoid this, but don't yet have a
workable solution that reuses the same pathspec objects. We need the
directory paths as prefixes otherwise the match_pathspec won't compare
properly.
Pass the pathspec object for both paths into queue_diff, who in-turn
passes these along to read_directory_contents.
Modify read_directory_contents to check against the pathspecs when
scanning the directory. In order to properly recurse, we need to ensure
that directories match, and that we continue iterating down the nested
directory structure:
$ git diff --no-index a b c/d
This should include all paths in a and b which match the c/d pathspec.
In particular, if there was 'a/c/d' we need to match it. But to check
'a/c/d', we need to first get to that part, which requires comparing
'a/c' first. With the match_pathspec() function, 'a/c' won't match the
'a/c/d' pathspec string.
However, if we always passed DO_MATCH_LEADING_PATHSPEC, then we will
incorrectly match in certain cases such as matching 'a/c' against
':(glob)**/d'. The match logic will see that a matches the leading part
of the **/ and accept this even tho c doesn't match. The trick seems to
be setting both DO_MATCH_LEADING_PATHSPEC and DO_MATCH_DIRECTORY when
checking directories, but set neither of them when checking files.
Some other gotchas and open questions:
1) pathspecs must all come after the first two path arguments, you
can't re-arrange them to come first. I'm treating them sort of like
the treeish arguments to git diff-tree.
2) The pathspecs are interpreted relative to the provided paths, and
thus will always need to be specified as relative paths, and will be
interpreted as relative to the root of the search for each path
separately.
3) negative pathspecs have to be fully qualified from the root, i.e.
':(exclude)file' will only exclude 'a/file' and not 'a/b/file'
unless you also use '(glob)' or similar. I think this matches the
other pathspec support, but I an not 100% sure.
4) I'm not certain about exposing match_pathspec_with_flags as-is,
since DO_MATCH_EXCLUDE shouldn't be passed. I got the behavior I
expected with DO_MATCH_LEADING_PATHSPEC, but it feels a bit of a
weird API. Perhaps match_pathspec could set both flags when is_dir
is true? But would that break other callers?
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
---
builtin/diff.c | 2 +-
diff-no-index.c | 91 ++++++++++++++++++++++++------
Documentation/git-diff.adoc | 10 +++-
t/t4053-diff-no-index.sh | 107 ++++++++++++++++++++++++++++++++++++
4 files changed, 190 insertions(+), 20 deletions(-)
diff --git a/builtin/diff.c b/builtin/diff.c
index fa963808c318..c6231edce4e8 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -35,7 +35,7 @@ static const char builtin_diff_usage[] =
" or: git diff [<options>] [--merge-base] <commit> [<commit>...] <commit> [--] [<path>...]\n"
" or: git diff [<options>] <commit>...<commit> [--] [<path>...]\n"
" or: git diff [<options>] <blob> <blob>\n"
-" or: git diff [<options>] --no-index [--] <path> <path>"
+" or: git diff [<options>] --no-index [--] <path> <path> [<pathspec>...]"
"\n"
COMMON_DIFF_OPTIONS_HELP;
diff --git a/diff-no-index.c b/diff-no-index.c
index 9739b2b268b9..e9e8006487c4 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -15,20 +15,47 @@
#include "gettext.h"
#include "revision.h"
#include "parse-options.h"
+#include "pathspec.h"
#include "string-list.h"
#include "dir.h"
-static int read_directory_contents(const char *path, struct string_list *list)
+static int read_directory_contents(const char *path, struct string_list *list,
+ const struct pathspec *pathspec)
{
+ struct strbuf match = STRBUF_INIT;
+ int len;
DIR *dir;
struct dirent *e;
if (!(dir = opendir(path)))
return error("Could not open directory %s", path);
- while ((e = readdir_skip_dot_and_dotdot(dir)))
- string_list_insert(list, e->d_name);
+ if (pathspec) {
+ strbuf_addstr(&match, path);
+ strbuf_complete(&match, '/');
+ len = match.len;
+ }
+ while ((e = readdir_skip_dot_and_dotdot(dir))) {
+ if (pathspec) {
+ int flags = 0;
+
+ strbuf_setlen(&match, len);
+ strbuf_addstr(&match, e->d_name);
+
+ if (e->d_type == DT_DIR)
+ flags = DO_MATCH_LEADING_PATHSPEC | DO_MATCH_DIRECTORY;
+
+ if (!match_pathspec_with_flags(NULL, pathspec,
+ match.buf, match.len,
+ 0, NULL, flags))
+ continue;
+ }
+
+ string_list_insert(list, e->d_name);
+ }
+
+ strbuf_release(&match);
closedir(dir);
return 0;
}
@@ -131,7 +158,8 @@ static struct diff_filespec *noindex_filespec(const struct git_hash_algo *algop,
}
static int queue_diff(struct diff_options *o, const struct git_hash_algo *algop,
- const char *name1, const char *name2, int recursing)
+ const char *name1, const char *name2, int recursing,
+ const struct pathspec *ps1, const struct pathspec *ps2)
{
int mode1 = 0, mode2 = 0;
enum special special1 = SPECIAL_NONE, special2 = SPECIAL_NONE;
@@ -171,9 +199,9 @@ static int queue_diff(struct diff_options *o, const struct git_hash_algo *algop,
int i1, i2, ret = 0;
size_t len1 = 0, len2 = 0;
- if (name1 && read_directory_contents(name1, &p1))
+ if (name1 && read_directory_contents(name1, &p1, ps1))
return -1;
- if (name2 && read_directory_contents(name2, &p2)) {
+ if (name2 && read_directory_contents(name2, &p2, ps2)) {
string_list_clear(&p1, 0);
return -1;
}
@@ -218,7 +246,7 @@ static int queue_diff(struct diff_options *o, const struct git_hash_algo *algop,
n2 = buffer2.buf;
}
- ret = queue_diff(o, algop, n1, n2, 1);
+ ret = queue_diff(o, algop, n1, n2, 1, ps1, ps2);
}
string_list_clear(&p1, 0);
string_list_clear(&p2, 0);
@@ -258,8 +286,10 @@ static void append_basename(struct strbuf *path, const char *dir, const char *fi
* DWIM "diff D F" into "diff D/F F" and "diff F D" into "diff F D/F"
* Note that we append the basename of F to D/, so "diff a/b/file D"
* becomes "diff a/b/file D/file", not "diff a/b/file D/a/b/file".
+ *
+ * Return 1 if both paths are directories, 0 otherwise.
*/
-static void fixup_paths(const char **path, struct strbuf *replacement)
+static int fixup_paths(const char **path, struct strbuf *replacement)
{
struct stat st;
unsigned int isdir0 = 0, isdir1 = 0;
@@ -282,25 +312,30 @@ static void fixup_paths(const char **path, struct strbuf *replacement)
if ((isdir0 && ispipe1) || (ispipe0 && isdir1))
die(_("cannot compare a named pipe to a directory"));
- if (isdir0 == isdir1)
- return;
+ /* if both paths are directories, we will enable pathspecs */
+ if (isdir0 && isdir1)
+ return 1;
+
if (isdir0) {
append_basename(replacement, path[0], path[1]);
path[0] = replacement->buf;
- } else {
+ } else if (isdir1) {
append_basename(replacement, path[1], path[0]);
path[1] = replacement->buf;
}
+
+ return 0;
}
static const char * const diff_no_index_usage[] = {
- N_("git diff --no-index [<options>] <path> <path>"),
+ N_("git diff --no-index [<options>] <path> <path> [<pathspec>...]"),
NULL
};
int diff_no_index(struct rev_info *revs, const struct git_hash_algo *algop,
int implicit_no_index, int argc, const char **argv)
{
+ struct pathspec pathspec1, pathspec2, *ps1 = NULL, *ps2 = NULL;
int i, no_index;
int ret = 1;
const char *paths[2];
@@ -317,13 +352,12 @@ int diff_no_index(struct rev_info *revs, const struct git_hash_algo *algop,
options = add_diff_options(no_index_options, &revs->diffopt);
argc = parse_options(argc, argv, revs->prefix, options,
diff_no_index_usage, 0);
- if (argc != 2) {
+ if (argc < 2) {
if (implicit_no_index)
warning(_("Not a git repository. Use --no-index to "
"compare two paths outside a working tree"));
usage_with_options(diff_no_index_usage, options);
}
- FREE_AND_NULL(options);
for (i = 0; i < 2; i++) {
const char *p = argv[i];
if (!strcmp(p, "-"))
@@ -337,7 +371,28 @@ int diff_no_index(struct rev_info *revs, const struct git_hash_algo *algop,
paths[i] = p;
}
- fixup_paths(paths, &replacement);
+ /* TODO: should we try to catch pathspec-like paths first and warn or
+ * error? We accepted those as valid 'paths' before so it seems
+ * unlikely we can change that behavior.
+ */
+ if (fixup_paths(paths, &replacement)) {
+ parse_pathspec(&pathspec1, PATHSPEC_FROMTOP | PATHSPEC_ATTR,
+ PATHSPEC_PREFER_FULL | PATHSPEC_NO_REPOSITORY,
+ paths[0], &argv[2]);
+ if (pathspec1.nr)
+ ps1 = &pathspec1;
+
+ parse_pathspec(&pathspec2, PATHSPEC_FROMTOP | PATHSPEC_ATTR,
+ PATHSPEC_PREFER_FULL | PATHSPEC_NO_REPOSITORY,
+ paths[1], &argv[2]);
+ if (pathspec2.nr)
+ ps2 = &pathspec2;
+ } else if (argc > 2) {
+ warning(_("Limiting comparison with pathspecs is only "
+ "supported if both paths are directories."));
+ usage_with_options(diff_no_index_usage, options);
+ }
+ FREE_AND_NULL(options);
revs->diffopt.skip_stat_unmatch = 1;
if (!revs->diffopt.output_format)
@@ -354,7 +409,7 @@ int diff_no_index(struct rev_info *revs, const struct git_hash_algo *algop,
setup_diff_pager(&revs->diffopt);
revs->diffopt.flags.exit_with_status = 1;
- if (queue_diff(&revs->diffopt, algop, paths[0], paths[1], 0))
+ if (queue_diff(&revs->diffopt, algop, paths[0], paths[1], 0, ps1, ps2))
goto out;
diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/");
diffcore_std(&revs->diffopt);
@@ -370,5 +425,9 @@ int diff_no_index(struct rev_info *revs, const struct git_hash_algo *algop,
for (i = 0; i < ARRAY_SIZE(to_free); i++)
free(to_free[i]);
strbuf_release(&replacement);
+ if (ps1)
+ clear_pathspec(ps1);
+ if (ps2)
+ clear_pathspec(ps2);
return ret;
}
diff --git a/Documentation/git-diff.adoc b/Documentation/git-diff.adoc
index dec173a34517..272331afbaec 100644
--- a/Documentation/git-diff.adoc
+++ b/Documentation/git-diff.adoc
@@ -14,7 +14,7 @@ git diff [<options>] --cached [--merge-base] [<commit>] [--] [<path>...]
git diff [<options>] [--merge-base] <commit> [<commit>...] <commit> [--] [<path>...]
git diff [<options>] <commit>...<commit> [--] [<path>...]
git diff [<options>] <blob> <blob>
-git diff [<options>] --no-index [--] <path> <path>
+git diff [<options>] --no-index [--] <path> <path> [<pathspec>...]
DESCRIPTION
-----------
@@ -31,14 +31,18 @@ files on disk.
further add to the index but you still haven't. You can
stage these changes by using linkgit:git-add[1].
-`git diff [<options>] --no-index [--] <path> <path>`::
+`git diff [<options>] --no-index [--] <path> <path> [<pathspec>...]`::
This form is to compare the given two paths on the
filesystem. You can omit the `--no-index` option when
running the command in a working tree controlled by Git and
at least one of the paths points outside the working tree,
or when running the command outside a working tree
- controlled by Git. This form implies `--exit-code`.
+ controlled by Git. This form implies `--exit-code`. If both
+ paths point to directories, additional pathspecs may be
+ provided. These will limit the files included in the
+ difference. All such pathspecs must be relative as they
+ apply to both sides of the diff.
`git diff [<options>] --cached [--merge-base] [<commit>] [--] [<path>...]`::
diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 5e5bad61ca1e..60437accfd98 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -295,4 +295,111 @@ test_expect_success PIPE,SYMLINKS 'diff --no-index reads from pipes' '
test_cmp expect actual
'
+test_expect_success 'diff --no-index F F rejects pathspecs' '
+ test_must_fail git diff --no-index -- a/1 a/2 a 2>actual.err &&
+ test_grep "usage: git diff --no-index" actual.err
+'
+
+test_expect_success 'diff --no-index D F rejects pathspecs' '
+ test_must_fail git diff --no-index -- a a/2 a 2>actual.err &&
+ test_grep "usage: git diff --no-index" actual.err
+'
+
+test_expect_success 'diff --no-index F D rejects pathspecs' '
+ test_must_fail git diff --no-index -- a/1 b b 2>actual.err &&
+ test_grep "usage: git diff --no-index" actual.err
+'
+
+test_expect_success 'diff --no-index with pathspec' '
+ test_expect_code 1 git diff --no-index a b 1 >actual &&
+ cat >expect <<-EOF &&
+ diff --git a/a/1 b/a/1
+ deleted file mode 100644
+ index d00491f..0000000
+ --- a/a/1
+ +++ /dev/null
+ @@ -1 +0,0 @@
+ -1
+ EOF
+ test_cmp expect actual
+'
+
+test_expect_success 'diff --no-index with pathspec no matches' '
+ test_expect_code 0 git diff --no-index a b missing
+'
+
+test_expect_success 'diff --no-index with negative pathspec' '
+ test_expect_code 1 git diff --no-index a b ":!2" >actual &&
+ cat >expect <<-EOF &&
+ diff --git a/a/1 b/a/1
+ deleted file mode 100644
+ index d00491f..0000000
+ --- a/a/1
+ +++ /dev/null
+ @@ -1 +0,0 @@
+ -1
+ EOF
+ test_cmp expect actual
+'
+
+test_expect_success 'setup nested' '
+ mkdir -p c/1/2 &&
+ mkdir -p d/1/2 &&
+ echo 1 >c/1/2/a &&
+ echo 2 >c/1/2/b
+'
+
+test_expect_success 'diff --no-index with pathspec nested negative pathspec' '
+ test_expect_code 0 git diff --no-index c d ":!1"
+'
+
+test_expect_success 'diff --no-index with pathspec nested pathspec' '
+ test_expect_code 1 git diff --no-index c d 1/2 >actual &&
+ cat >expect <<-EOF &&
+ diff --git a/c/1/2/a b/c/1/2/a
+ deleted file mode 100644
+ index d00491f..0000000
+ --- a/c/1/2/a
+ +++ /dev/null
+ @@ -1 +0,0 @@
+ -1
+ diff --git a/c/1/2/b b/c/1/2/b
+ deleted file mode 100644
+ index 0cfbf08..0000000
+ --- a/c/1/2/b
+ +++ /dev/null
+ @@ -1 +0,0 @@
+ -2
+ EOF
+ test_cmp expect actual
+'
+
+test_expect_success 'diff --no-index with pathspec glob' '
+ test_expect_code 1 git diff --no-index c d ":(glob)**/a" >actual &&
+ cat >expect <<-EOF &&
+ diff --git a/c/1/2/a b/c/1/2/a
+ deleted file mode 100644
+ index d00491f..0000000
+ --- a/c/1/2/a
+ +++ /dev/null
+ @@ -1 +0,0 @@
+ -1
+ EOF
+ test_cmp expect actual
+'
+
+test_expect_success 'diff --no-index with pathspec glob and exclude' '
+ test_expect_code 1 git diff --no-index c d ":(glob,exclude)**/a" >actual &&
+ cat >expect <<-EOF &&
+ diff --git a/c/1/2/b b/c/1/2/b
+ deleted file mode 100644
+ index 0cfbf08..0000000
+ --- a/c/1/2/b
+ +++ /dev/null
+ @@ -1 +0,0 @@
+ -2
+ EOF
+ test_cmp expect actual
+'
+
test_done
--
2.48.1.397.gec9d649cc640
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v3 1/4] prefix_path: support prefixes not ending in trailing slash
2025-05-20 0:01 ` [PATCH v3 1/4] prefix_path: support prefixes not ending in trailing slash Jacob Keller
@ 2025-05-20 14:35 ` Junio C Hamano
2025-05-20 22:34 ` Jacob Keller
0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2025-05-20 14:35 UTC (permalink / raw)
To: Jacob Keller; +Cc: git, Jacob Keller
Jacob Keller <jacob.e.keller@intel.com> writes:
> From: Jacob Keller <jacob.keller@gmail.com>
>
> The prefix_path_gently() -- and its wrapper prefix_path() -- function
> normalizes the provided path and optionally adds the provided prefix to
> relative paths.
I find "optionally" confusing here. The original intended use case
of this function being that "the provided path", which comes from
the end user from the command line that names a file relative to the
directory the usre started the "git" command in, needs to be
adjusted after "git" chdir's up to the top-level of the working
tree, and the way to do so is to prepend the "prefix" computed by
"git" (which always ends with a slash). Adding the prefix to
relative paths is the central part of what it has to do. When the
end-user supplied path goes up e.g., "../file", we may have to lose
a level or more of the prefix before prepending, but that does not
change the fact that the helper function is about prepending the
prefix.
And as you can guess from the above description, if the caller
passes prefix that does not end in a trailing slash, the caller is
buggy.
> If the prefix does not end in a trailing slash, the
> function will combine the last component of the prefix with the first
> component of the relative path. This is unlikely to produce a desirable
> result.
True, and I do not mind being lenient to buggy callers, but given
that the majority of callers (i.e. all the existing callers) not
being buggy, I wonder if it is better to check and append a slash to
the end by a new caller that feeds a prefix that directly comes from
the end user?
Unlike prefix that is internally generated by Git, we'd need to be a
lot more careful if we are taking end-user input directly and
passing it as a "prefix" (I take that the possible lack of trailing
slash as a signal that you are doing something like that in this
series). We may need to check and correct that they do not contain
"./", "//", or "../", for example, anyway, and adding the required
trailing slash at the end sounds like something we may want to do as
part of that input validation and massaging _before_ calling these
helper functions.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 2/4] pathspec: expose match_pathspec_with_flags
2025-05-20 0:01 ` [PATCH v3 2/4] pathspec: expose match_pathspec_with_flags Jacob Keller
@ 2025-05-20 14:39 ` Junio C Hamano
2025-05-20 22:38 ` Jacob Keller
0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2025-05-20 14:39 UTC (permalink / raw)
To: Jacob Keller; +Cc: git, Jacob Keller
Jacob Keller <jacob.e.keller@intel.com> writes:
> From: Jacob Keller <jacob.keller@gmail.com>
>
> The do_match_pathspec() function has the DO_MATCH_LEADING_PATHSPEC
> option to allow pathspecs to match when matching "src" against a
> pathspec like "src/path/...". This support is not exposed by
> match_pathspec, and the internal flags to do_match_pathspec are not
> exposed outside of dir.c
>
> Make match_pathspec_with_flags public, and expose the
> DO_MATCH_LEADING_PATHSPEC and DO_MATCH_DIRECTORY flags. The
> DO_MATCH_EXCLUDE flag is kept private in dir.c
>
> This will be used in a an extension to support pathspec matching in git
> diff --no-index.
>
> Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
> ---
> pathspec.h | 8 ++++++++
> dir.c | 11 +++++------
> 2 files changed, 13 insertions(+), 6 deletions(-)
You use diff.orderfile? Not complaining, just finding it amusing
that somebody uses the feature ;-).
> diff --git a/pathspec.h b/pathspec.h
> index de537cff3cb6..d22d4e80248d 100644
> --- a/pathspec.h
> +++ b/pathspec.h
> @@ -184,6 +184,14 @@ int match_pathspec(struct index_state *istate,
> const char *name, int namelen,
> int prefix, char *seen, int is_dir);
>
> +#define DO_MATCH_DIRECTORY (1<<1)
> +#define DO_MATCH_LEADING_PATHSPEC (1<<2)
> +
> +int match_pathspec_with_flags(struct index_state *istate,
> + const struct pathspec *ps,
> + const char *name, int namelen,
> + int prefix, char *seen, unsigned flags);
> +
> /*
> * Determine whether a pathspec will match only entire index entries (non-sparse
> * files and/or entire sparse directories). If the pathspec has the potential to
> diff --git a/dir.c b/dir.c
> index a374972b6243..2f2b654b0252 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -329,9 +329,8 @@ static int do_read_blob(const struct object_id *oid, struct oid_stat *oid_stat,
> return 1;
> }
>
> +// DO_MATCH_EXCLUDE is not public
We do not use // comments (outside borrowed code anyway).
> #define DO_MATCH_EXCLUDE (1<<0)
> -#define DO_MATCH_DIRECTORY (1<<1)
> -#define DO_MATCH_LEADING_PATHSPEC (1<<2)
>
> /*
> * Does the given pathspec match the given name? A match is found if
> @@ -551,10 +550,10 @@ static int do_match_pathspec(struct index_state *istate,
> return retval;
> }
>
> -static int match_pathspec_with_flags(struct index_state *istate,
> - const struct pathspec *ps,
> - const char *name, int namelen,
> - int prefix, char *seen, unsigned flags)
> +int match_pathspec_with_flags(struct index_state *istate,
> + const struct pathspec *ps,
> + const char *name, int namelen,
> + int prefix, char *seen, unsigned flags)
> {
> int positive, negative;
> positive = do_match_pathspec(istate, ps, name, namelen,
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 3/4] pathspec: add flag to indicate operation without repository
2025-05-20 0:01 ` [PATCH v3 3/4] pathspec: add flag to indicate operation without repository Jacob Keller
@ 2025-05-20 15:13 ` Junio C Hamano
2025-05-20 22:42 ` Jacob Keller
2025-05-21 23:05 ` Jacob Keller
0 siblings, 2 replies; 15+ messages in thread
From: Junio C Hamano @ 2025-05-20 15:13 UTC (permalink / raw)
To: Jacob Keller; +Cc: git, Jacob Keller
Jacob Keller <jacob.e.keller@intel.com> writes:
> From: Jacob Keller <jacob.keller@gmail.com>
>
> A following change will add support for pathspecs to the git diff
> --no-index command. This mode of git diff does not load any repository.
>
> Add a new PATHSPEC_NO_REPOSITORY flag indicating that we're parsing
> pathspecs without a repository.
>
> Both PATHSPEC_ATTR and PATHSPEC_FROMTOP require a repository to
> function. Thus, verify that both of these are set in magic_mask to
> ensure they won't be accepted when PATHSPEC_NO_REPOSITORY is set.
>
> Check PATHSPEC_NO_REPOSITORY when warning about paths outside the
> directory tree. When the flag is set, do not look for a git repository
> when generating the warning message.
>
> Finally, add a BUG in match_pathspec_item if the istate is NULL but the
> pathspec has PATHSPEC_ATTR set. Callers which support PATHSPEC_ATTR
> should always pass a valid istate, and callers which don't pass a valid
> istate should have set PATHSPEC_ATTR in the magic_mask field to disable
> support for attribute-based pathspecs.
All very sensible considerations.
> diff --git a/dir.c b/dir.c
> index 2f2b654b0252..45aac0bfacab 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -396,9 +396,12 @@ static int match_pathspec_item(struct index_state *istate,
> strncmp(item->match, name - prefix, item->prefix))
> return 0;
>
> - if (item->attr_match_nr &&
> - !match_pathspec_attrs(istate, name - prefix, namelen + prefix, item))
> - return 0;
> + if (item->attr_match_nr) {
> + if (!istate)
> + BUG("magic PATHSPEC_ATTR requires an index");
> + if (!match_pathspec_attrs(istate, name - prefix, namelen + prefix, item))
> + return 0;
> + }
It is a bit curious why we do not check PATHSPEC_NO_REPOSITORY here,
but it is OK, because it is a BUG for istate to be NULL when we have
a repository anyway.
> diff --git a/pathspec.c b/pathspec.c
> index 2b4e434bc0aa..a3ddd701c740 100644
> --- a/pathspec.c
> +++ b/pathspec.c
> @@ -492,7 +492,7 @@ static void init_pathspec_item(struct pathspec_item *item, unsigned flags,
> if (!match) {
> const char *hint_path;
>
> - if (!have_git_dir())
> + if ((flags & PATHSPEC_NO_REPOSITORY) || !have_git_dir())
> die(_("'%s' is outside the directory tree"),
> copyfrom);
> hint_path = repo_get_work_tree(the_repository);
This is a part of generating an error message. We die early to
avoid having to call get-work-tree when we know we are not even in
any working tree, which makes sense.
> @@ -614,6 +614,10 @@ void parse_pathspec(struct pathspec *pathspec,
> (flags & PATHSPEC_PREFER_FULL))
> BUG("PATHSPEC_PREFER_CWD and PATHSPEC_PREFER_FULL are incompatible");
>
> + if ((flags & PATHSPEC_NO_REPOSITORY) &&
> + (~magic_mask & (PATHSPEC_ATTR | PATHSPEC_FROMTOP)))
> + BUG("PATHSPEC_NO_REPOSITORY is incompatible with PATHSPEC_ATTR and PATHSPEC_FROMTOP");
Hmph, I am not sure if this change is correct. The magic_mask
parameter is passed by a caller to say "even if parsr_pathspec()
parses a pathspec using a certain set of features properly, the
caller is not prepared to handle the parsed result". If magic_mask
lacks PATHSPEC_ATTR, that does not necessarily mean that the given
pathspec contains any pathspec items that do use the attr magic. It
merely says that the caller is not prepared to handle a pathspec
item that uses the attr magic feature.
If we are going to add a call to parse_pathspec() in a code path
that is specific to diff-no-index, isn't it sufficient to pass
PATHSPEC_ATTR and PATHSPEC_FROMTOP as magic_mask without this
change?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 4/4] diff --no-index: support limiting by pathspec
2025-05-20 0:01 ` [PATCH v3 4/4] diff --no-index: support limiting by pathspec Jacob Keller
@ 2025-05-20 16:30 ` Junio C Hamano
2025-05-20 22:45 ` Jacob Keller
2025-05-20 22:47 ` Jacob Keller
0 siblings, 2 replies; 15+ messages in thread
From: Junio C Hamano @ 2025-05-20 16:30 UTC (permalink / raw)
To: Jacob Keller; +Cc: git, Jacob Keller
Jacob Keller <jacob.e.keller@intel.com> writes:
> The git diff --no-index mode does not support pathspecs, and cannot
> limit the diff output in this way. Other diff programs such as GNU
> difftools have options for excluding paths based on a pattern match.
True.
> However, using git diff as a diff replacement has several advantages
> over many popular diff tools, including coloring moved lines, rename
> detections, and similar.
I said this when we introduced "--no-index", but back when "git" was
still young, it would have helped a lot wider developer populations
if we didn't do "git diff --no-index" and instead donated these
features to "many popular diff tools". We however chose to be lazy
and selfish---those who want to use these features are better off
installing "git" even if they are not using it for version control,
only to use it as "better diff".
People are still welcome to add "coloring moved lines, rename
detections and similar" to "many popular diff tools", but given that
"git" has become fairly popular and widely used, it may not matter
all that much any more that we were lazy and selfish ;-).
> Teach git diff --no-index how to handle pathspecs to limit the
> comparisons. This will only be supported if both provided paths are
> directories.
Good. If you are giving only two files, pathspec limiting would not
make much sense. If you are giving a file and a directory, lazily
give only F and D to compare F with D/F, that is essentially giving
only two files F and D/F, so pathspec limiting would not make much
sense, either. pathspec limited comparison would make sense only
when you are talking about two sets of files.
> However, if we always passed DO_MATCH_LEADING_PATHSPEC, then we will
> incorrectly match in certain cases such as matching 'a/c' against
> ':(glob)**/d'. The match logic will see that a matches the leading part
> of the **/ and accept this even tho c doesn't match. The trick seems to
> be setting both DO_MATCH_LEADING_PATHSPEC and DO_MATCH_DIRECTORY when
> checking directories, but set neither of them when checking files.
Sounds sensible.
> Some other gotchas and open questions:
>
> 1) pathspecs must all come after the first two path arguments, you
> can't re-arrange them to come first. I'm treating them sort of like
> the treeish arguments to git diff-tree.
Exactly. "git diff" proper is about comparing two sets of files,
either comparing two tree-ishes "git diff master next", comparing a
tree-ish and the index "git diff --cached HEAD", comparing a
tree-ish and the working tree files "git diff HEAD", or comparing
the index and the working tree files "git diff". It is a natural
extension that "git --no-index dirA dirB" compares contents of the
two directories. In all of these forms, it is common that the
comparison can be pathspec limited by giving pathspec elements as
the command line arguments at the end.
> 2) The pathspecs are interpreted relative to the provided paths, and
> thus will always need to be specified as relative paths, and will be
> interpreted as relative to the root of the search for each path
> separately.
Yes, that is not anything new or something we need to point out as
if it is any different from the "normal" pathspec.
> 3) negative pathspecs have to be fully qualified from the root, i.e.
> ':(exclude)file' will only exclude 'a/file' and not 'a/b/file'
> unless you also use '(glob)' or similar. I think this matches the
> other pathspec support, but I an not 100% sure.
I think that is correct "git ls-files :(exclude)po" does not exclude
git-gui/po, for example.
> -`git diff [<options>] --no-index [--] <path> <path>`::
> +`git diff [<options>] --no-index [--] <path> <path> [<pathspec>...]`::
This is a bit unfortunate. The disambiguating "--" should ideally
be between the "things to be compared" and the pathspec, as the
former corresponds to <rev> in the normal "git diff" invocation.
> + ... If both
> + paths point to directories, additional pathspecs may be
> + provided. These will limit the files included in the
> + difference. All such pathspecs must be relative as they
> + apply to both sides of the diff.
"as they" -> "and they"?
> +test_expect_success 'diff --no-index with pathspec' '
> + test_expect_code 1 git diff --no-index a b 1 >actual &&
> + cat >expect <<-EOF &&
> + diff --git a/a/1 b/a/1
> + deleted file mode 100644
> + index d00491f..0000000
> + --- a/a/1
> + +++ /dev/null
> + @@ -1 +0,0 @@
> + -1
> + EOF
> + test_cmp expect actual
> +'
If you use --name-only or --name-status would the test become
simpler?
> +
> +test_expect_success 'diff --no-index with pathspec no matches' '
> + test_expect_code 0 git diff --no-index a b missing
> +'
OK.
> +test_expect_success 'diff --no-index with negative pathspec' '
> + test_expect_code 1 git diff --no-index a b ":!2" >actual &&
> + cat >expect <<-EOF &&
> + diff --git a/a/1 b/a/1
> + deleted file mode 100644
> + index d00491f..0000000
> + --- a/a/1
> + +++ /dev/null
> + @@ -1 +0,0 @@
> + -1
> + EOF
> + test_cmp expect actual
> +'
OK.
All other tests also look sensible.
Thanks.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 1/4] prefix_path: support prefixes not ending in trailing slash
2025-05-20 14:35 ` Junio C Hamano
@ 2025-05-20 22:34 ` Jacob Keller
0 siblings, 0 replies; 15+ messages in thread
From: Jacob Keller @ 2025-05-20 22:34 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Jacob Keller
On 5/20/2025 7:35 AM, Junio C Hamano wrote:
> Jacob Keller <jacob.e.keller@intel.com> writes:
>
>> From: Jacob Keller <jacob.keller@gmail.com>
>>
>> The prefix_path_gently() -- and its wrapper prefix_path() -- function
>> normalizes the provided path and optionally adds the provided prefix to
>> relative paths.
>
> I find "optionally" confusing here. The original intended use case
> of this function being that "the provided path", which comes from
> the end user from the command line that names a file relative to the
> directory the usre started the "git" command in, needs to be
> adjusted after "git" chdir's up to the top-level of the working
> tree, and the way to do so is to prepend the "prefix" computed by
> "git" (which always ends with a slash). Adding the prefix to
> relative paths is the central part of what it has to do. When the
> end-user supplied path goes up e.g., "../file", we may have to lose
> a level or more of the prefix before prepending, but that does not
> change the fact that the helper function is about prepending the
> prefix.
>
> And as you can guess from the above description, if the caller
> passes prefix that does not end in a trailing slash, the caller is
> buggy.
>
Sure. Part of the reason I was thinking modify this here instead of
adding the trailing slash, is because to add a trailing slash I had to
convert the path to a strbuf, where as by inserting it here thats
handled by the format specifiers.
>> If the prefix does not end in a trailing slash, the
>> function will combine the last component of the prefix with the first
>> component of the relative path. This is unlikely to produce a desirable
>> result.
>
> True, and I do not mind being lenient to buggy callers, but given
> that the majority of callers (i.e. all the existing callers) not
> being buggy, I wonder if it is better to check and append a slash to
> the end by a new caller that feeds a prefix that directly comes from
> the end user?
>
> Unlike prefix that is internally generated by Git, we'd need to be a
> lot more careful if we are taking end-user input directly and
> passing it as a "prefix" (I take that the possible lack of trailing
> slash as a signal that you are doing something like that in this
> series). We may need to check and correct that they do not contain
> "./", "//", or "../", for example, anyway, and adding the required
> trailing slash at the end sounds like something we may want to do as
> part of that input validation and massaging _before_ calling these
> helper functions.
Yea, I think it boils down to me passing the two paths from the user in
as the prefix or root of the pathspecs in parse_pathspec.
Perhaps we either need a different function to clean up the path first
or modify the pathspec code to not use prefix_path here at all and use
something else instead.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 2/4] pathspec: expose match_pathspec_with_flags
2025-05-20 14:39 ` Junio C Hamano
@ 2025-05-20 22:38 ` Jacob Keller
0 siblings, 0 replies; 15+ messages in thread
From: Jacob Keller @ 2025-05-20 22:38 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Jacob Keller
On 5/20/2025 7:39 AM, Junio C Hamano wrote:
> Jacob Keller <jacob.e.keller@intel.com> writes:
>
>> From: Jacob Keller <jacob.keller@gmail.com>
>>
>> The do_match_pathspec() function has the DO_MATCH_LEADING_PATHSPEC
>> option to allow pathspecs to match when matching "src" against a
>> pathspec like "src/path/...". This support is not exposed by
>> match_pathspec, and the internal flags to do_match_pathspec are not
>> exposed outside of dir.c
>>
>> Make match_pathspec_with_flags public, and expose the
>> DO_MATCH_LEADING_PATHSPEC and DO_MATCH_DIRECTORY flags. The
>> DO_MATCH_EXCLUDE flag is kept private in dir.c
>>
>> This will be used in a an extension to support pathspec matching in git
>> diff --no-index.
>>
>> Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
>> ---
>> pathspec.h | 8 ++++++++
>> dir.c | 11 +++++------
>> 2 files changed, 13 insertions(+), 6 deletions(-)
>
> You use diff.orderfile? Not complaining, just finding it amusing
> that somebody uses the feature ;-).
>
One of my coworkers asked me to set it up so that header files appeared
first in diffs. I kinda liked that, so I stuck with it.
>> diff --git a/pathspec.h b/pathspec.h
>> index de537cff3cb6..d22d4e80248d 100644
>> --- a/pathspec.h
>> +++ b/pathspec.h
>> @@ -184,6 +184,14 @@ int match_pathspec(struct index_state *istate,
>> const char *name, int namelen,
>> int prefix, char *seen, int is_dir);
>>
>> +#define DO_MATCH_DIRECTORY (1<<1)
>> +#define DO_MATCH_LEADING_PATHSPEC (1<<2)
>> +
>> +int match_pathspec_with_flags(struct index_state *istate,
>> + const struct pathspec *ps,
>> + const char *name, int namelen,
>> + int prefix, char *seen, unsigned flags);
>> +
>> /*
>> * Determine whether a pathspec will match only entire index entries (non-sparse
>> * files and/or entire sparse directories). If the pathspec has the potential to
>> diff --git a/dir.c b/dir.c
>> index a374972b6243..2f2b654b0252 100644
>> --- a/dir.c
>> +++ b/dir.c
>> @@ -329,9 +329,8 @@ static int do_read_blob(const struct object_id *oid, struct oid_stat *oid_stat,
>> return 1;
>> }
>>
>> +// DO_MATCH_EXCLUDE is not public
>
> We do not use // comments (outside borrowed code anyway).
>
Sure. I don't actually expect to keep this patch as-is anyways, since I
think we might want to do something else... as exposing these flags
seems incorrect to me...
>> #define DO_MATCH_EXCLUDE (1<<0)
>> -#define DO_MATCH_DIRECTORY (1<<1)
>> -#define DO_MATCH_LEADING_PATHSPEC (1<<2)
>>
I actually almost wonder if we should set both DO_MATCH_DIRECTORY and
DO_MATCH_LEADING_PATHSPEC in match_pathspec when is_dir is true.
The DO_MATCH_DIRECTORY causes pathspecs to match if we have a path like
"a/b/c/d" and a pathspec like "a/b/c".
The DO_MATCH_LEADING_PATHSPEC does the inverse: if we have a path like
"a/b/c" then we match a pathspec like "a/b/c/d"
I guess it really depends on the nature of the caller. In the normal
case, I guess we don't check intermediate directory paths and only check
the end resulting files. But in my case, we're checking "a/b/c" before
we descend into it to check its inner contents.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 3/4] pathspec: add flag to indicate operation without repository
2025-05-20 15:13 ` Junio C Hamano
@ 2025-05-20 22:42 ` Jacob Keller
2025-05-21 23:05 ` Jacob Keller
1 sibling, 0 replies; 15+ messages in thread
From: Jacob Keller @ 2025-05-20 22:42 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Jacob Keller
On 5/20/2025 8:13 AM, Junio C Hamano wrote:
> Jacob Keller <jacob.e.keller@intel.com> writes:
>
>> From: Jacob Keller <jacob.keller@gmail.com>
>>
>> A following change will add support for pathspecs to the git diff
>> --no-index command. This mode of git diff does not load any repository.
>>
>> Add a new PATHSPEC_NO_REPOSITORY flag indicating that we're parsing
>> pathspecs without a repository.
>>
>> Both PATHSPEC_ATTR and PATHSPEC_FROMTOP require a repository to
>> function. Thus, verify that both of these are set in magic_mask to
>> ensure they won't be accepted when PATHSPEC_NO_REPOSITORY is set.
>>
>> Check PATHSPEC_NO_REPOSITORY when warning about paths outside the
>> directory tree. When the flag is set, do not look for a git repository
>> when generating the warning message.
>>
>> Finally, add a BUG in match_pathspec_item if the istate is NULL but the
>> pathspec has PATHSPEC_ATTR set. Callers which support PATHSPEC_ATTR
>> should always pass a valid istate, and callers which don't pass a valid
>> istate should have set PATHSPEC_ATTR in the magic_mask field to disable
>> support for attribute-based pathspecs.
>
> All very sensible considerations.
>
>> diff --git a/dir.c b/dir.c
>> index 2f2b654b0252..45aac0bfacab 100644
>> --- a/dir.c
>> +++ b/dir.c
>> @@ -396,9 +396,12 @@ static int match_pathspec_item(struct index_state *istate,
>> strncmp(item->match, name - prefix, item->prefix))
>> return 0;
>>
>> - if (item->attr_match_nr &&
>> - !match_pathspec_attrs(istate, name - prefix, namelen + prefix, item))
>> - return 0;
>> + if (item->attr_match_nr) {
>> + if (!istate)
>> + BUG("magic PATHSPEC_ATTR requires an index");
>> + if (!match_pathspec_attrs(istate, name - prefix, namelen + prefix, item))
>> + return 0;
>> + }
>
> It is a bit curious why we do not check PATHSPEC_NO_REPOSITORY here,
> but it is OK, because it is a BUG for istate to be NULL when we have
> a repository anyway.
>
Right. We could check it here, but I actually had added this BUG first
before I added PATHSPEC_NO_REPOSITORY.
>> diff --git a/pathspec.c b/pathspec.c
>> index 2b4e434bc0aa..a3ddd701c740 100644
>> --- a/pathspec.c
>> +++ b/pathspec.c
>> @@ -492,7 +492,7 @@ static void init_pathspec_item(struct pathspec_item *item, unsigned flags,
>> if (!match) {
>> const char *hint_path;
>>
>> - if (!have_git_dir())
>> + if ((flags & PATHSPEC_NO_REPOSITORY) || !have_git_dir())
>> die(_("'%s' is outside the directory tree"),
>> copyfrom);
>> hint_path = repo_get_work_tree(the_repository);
>
> This is a part of generating an error message. We die early to
> avoid having to call get-work-tree when we know we are not even in
> any working tree, which makes sense.
>
>> @@ -614,6 +614,10 @@ void parse_pathspec(struct pathspec *pathspec,
>> (flags & PATHSPEC_PREFER_FULL))
>> BUG("PATHSPEC_PREFER_CWD and PATHSPEC_PREFER_FULL are incompatible");
>>
>> + if ((flags & PATHSPEC_NO_REPOSITORY) &&
>> + (~magic_mask & (PATHSPEC_ATTR | PATHSPEC_FROMTOP)))
>> + BUG("PATHSPEC_NO_REPOSITORY is incompatible with PATHSPEC_ATTR and PATHSPEC_FROMTOP");
>
> Hmph, I am not sure if this change is correct. The magic_mask
> parameter is passed by a caller to say "even if parsr_pathspec()
> parses a pathspec using a certain set of features properly, the
> caller is not prepared to handle the parsed result". If magic_mask
> lacks PATHSPEC_ATTR, that does not necessarily mean that the given
> pathspec contains any pathspec items that do use the attr magic. It
> merely says that the caller is not prepared to handle a pathspec
> item that uses the attr magic feature.
>
Right. The magic_mask is a "these magic types are not allowed". I'm
checking to make sure that if you set PATHSPEC_NO_REPOSITORY, you must
also set PATHSPEC_ATTR and PATHSPEC_FROMTOP, because you cannot possibly
handle these pathspecs without a repository.
> If we are going to add a call to parse_pathspec() in a code path
> that is specific to diff-no-index, isn't it sufficient to pass
> PATHSPEC_ATTR and PATHSPEC_FROMTOP as magic_mask without this
> change?
>
Strictly speaking, yes. This part is really just a "this would be a
programmer error we should catch early".
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 4/4] diff --no-index: support limiting by pathspec
2025-05-20 16:30 ` Junio C Hamano
@ 2025-05-20 22:45 ` Jacob Keller
2025-05-20 22:47 ` Jacob Keller
1 sibling, 0 replies; 15+ messages in thread
From: Jacob Keller @ 2025-05-20 22:45 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Jacob Keller
On 5/20/2025 9:30 AM, Junio C Hamano wrote:
> Jacob Keller <jacob.e.keller@intel.com> writes:
>> -`git diff [<options>] --no-index [--] <path> <path>`::
>> +`git diff [<options>] --no-index [--] <path> <path> [<pathspec>...]`::
>
> This is a bit unfortunate. The disambiguating "--" should ideally
> be between the "things to be compared" and the pathspec, as the
> former corresponds to <rev> in the normal "git diff" invocation.
>
True, but it looks like we already had it before paths. I am not sure if
we can easily change that now. :(
>> + ... If both
>> + paths point to directories, additional pathspecs may be
>> + provided. These will limit the files included in the
>> + difference. All such pathspecs must be relative as they
>> + apply to both sides of the diff.
>
> "as they" -> "and they"?
I think I meant "because they", but for documentation I think and they
makes more sense. I can clarify this a bit better in the next version.
>
>> +test_expect_success 'diff --no-index with pathspec' '
>> + test_expect_code 1 git diff --no-index a b 1 >actual &&
>> + cat >expect <<-EOF &&
>> + diff --git a/a/1 b/a/1
>> + deleted file mode 100644
>> + index d00491f..0000000
>> + --- a/a/1
>> + +++ /dev/null
>> + @@ -1 +0,0 @@
>> + -1
>> + EOF
>> + test_cmp expect actual
>> +'
>
> If you use --name-only or --name-status would the test become
> simpler?
>
That is a good idea.
>> +
>> +test_expect_success 'diff --no-index with pathspec no matches' '
>> + test_expect_code 0 git diff --no-index a b missing
>> +'
>
> OK.
>
>> +test_expect_success 'diff --no-index with negative pathspec' '
>> + test_expect_code 1 git diff --no-index a b ":!2" >actual &&
>> + cat >expect <<-EOF &&
>> + diff --git a/a/1 b/a/1
>> + deleted file mode 100644
>> + index d00491f..0000000
>> + --- a/a/1
>> + +++ /dev/null
>> + @@ -1 +0,0 @@
>> + -1
>> + EOF
>> + test_cmp expect actual
>> +'
>
> OK.
>
> All other tests also look sensible.
>
> Thanks.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 4/4] diff --no-index: support limiting by pathspec
2025-05-20 16:30 ` Junio C Hamano
2025-05-20 22:45 ` Jacob Keller
@ 2025-05-20 22:47 ` Jacob Keller
1 sibling, 0 replies; 15+ messages in thread
From: Jacob Keller @ 2025-05-20 22:47 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Jacob Keller
On 5/20/2025 9:30 AM, Junio C Hamano wrote:
> Jacob Keller <jacob.e.keller@intel.com> writes:
>
>> The git diff --no-index mode does not support pathspecs, and cannot
>> limit the diff output in this way. Other diff programs such as GNU
>> difftools have options for excluding paths based on a pattern match.
>
> True.
>
>> However, using git diff as a diff replacement has several advantages
>> over many popular diff tools, including coloring moved lines, rename
>> detections, and similar.
>
> I said this when we introduced "--no-index", but back when "git" was
> still young, it would have helped a lot wider developer populations
> if we didn't do "git diff --no-index" and instead donated these
> features to "many popular diff tools". We however chose to be lazy
> and selfish---those who want to use these features are better off
> installing "git" even if they are not using it for version control,
> only to use it as "better diff".
>
> People are still welcome to add "coloring moved lines, rename
> detections and similar" to "many popular diff tools", but given that
> "git" has become fairly popular and widely used, it may not matter
> all that much any more that we were lazy and selfish ;-).
>
>> Teach git diff --no-index how to handle pathspecs to limit the
>> comparisons. This will only be supported if both provided paths are
>> directories.
>
> Good. If you are giving only two files, pathspec limiting would not
> make much sense. If you are giving a file and a directory, lazily
> give only F and D to compare F with D/F, that is essentially giving
> only two files F and D/F, so pathspec limiting would not make much
> sense, either. pathspec limited comparison would make sense only
> when you are talking about two sets of files.
>
>> However, if we always passed DO_MATCH_LEADING_PATHSPEC, then we will
>> incorrectly match in certain cases such as matching 'a/c' against
>> ':(glob)**/d'. The match logic will see that a matches the leading part
>> of the **/ and accept this even tho c doesn't match. The trick seems to
>> be setting both DO_MATCH_LEADING_PATHSPEC and DO_MATCH_DIRECTORY when
>> checking directories, but set neither of them when checking files.
>
> Sounds sensible.
>
>> Some other gotchas and open questions:
>>
>> 1) pathspecs must all come after the first two path arguments, you
>> can't re-arrange them to come first. I'm treating them sort of like
>> the treeish arguments to git diff-tree.
>
> Exactly. "git diff" proper is about comparing two sets of files,
> either comparing two tree-ishes "git diff master next", comparing a
> tree-ish and the index "git diff --cached HEAD", comparing a
> tree-ish and the working tree files "git diff HEAD", or comparing
> the index and the working tree files "git diff". It is a natural
> extension that "git --no-index dirA dirB" compares contents of the
> two directories. In all of these forms, it is common that the
> comparison can be pathspec limited by giving pathspec elements as
> the command line arguments at the end.
>
>> 2) The pathspecs are interpreted relative to the provided paths, and
>> thus will always need to be specified as relative paths, and will be
>> interpreted as relative to the root of the search for each path
>> separately.
>
> Yes, that is not anything new or something we need to point out as
> if it is any different from the "normal" pathspec.
>
>> 3) negative pathspecs have to be fully qualified from the root, i.e.
>> ':(exclude)file' will only exclude 'a/file' and not 'a/b/file'
>> unless you also use '(glob)' or similar. I think this matches the
>> other pathspec support, but I an not 100% sure.
>
> I think that is correct "git ls-files :(exclude)po" does not exclude
> git-gui/po, for example.
>
Sounds like you answered most of my open questions. I guess my remaining
big question to you or others on the list is whether or not the overall
algorithm makes sense, especially in regards to how we handle needing
both DO_MATCH_DIRECTORY and DO_MATCH_LEADING_PATHSPEC, and in using two
separate calls to parse_pathspec, one set for each path.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v3 3/4] pathspec: add flag to indicate operation without repository
2025-05-20 15:13 ` Junio C Hamano
2025-05-20 22:42 ` Jacob Keller
@ 2025-05-21 23:05 ` Jacob Keller
1 sibling, 0 replies; 15+ messages in thread
From: Jacob Keller @ 2025-05-21 23:05 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Jacob Keller
On 5/20/2025 8:13 AM, Junio C Hamano wrote:
> Jacob Keller <jacob.e.keller@intel.com> writes:
>
>> From: Jacob Keller <jacob.keller@gmail.com>
>>
>> A following change will add support for pathspecs to the git diff
>> --no-index command. This mode of git diff does not load any repository.
>>
>> Add a new PATHSPEC_NO_REPOSITORY flag indicating that we're parsing
>> pathspecs without a repository.
>>
>> Both PATHSPEC_ATTR and PATHSPEC_FROMTOP require a repository to
>> function. Thus, verify that both of these are set in magic_mask to
>> ensure they won't be accepted when PATHSPEC_NO_REPOSITORY is set.
>>
>> Check PATHSPEC_NO_REPOSITORY when warning about paths outside the
>> directory tree. When the flag is set, do not look for a git repository
>> when generating the warning message.
>>
>> Finally, add a BUG in match_pathspec_item if the istate is NULL but the
>> pathspec has PATHSPEC_ATTR set. Callers which support PATHSPEC_ATTR
>> should always pass a valid istate, and callers which don't pass a valid
>> istate should have set PATHSPEC_ATTR in the magic_mask field to disable
>> support for attribute-based pathspecs.
>
> All very sensible considerations.
>
>> diff --git a/dir.c b/dir.c
>> index 2f2b654b0252..45aac0bfacab 100644
>> --- a/dir.c
>> +++ b/dir.c
>> @@ -396,9 +396,12 @@ static int match_pathspec_item(struct index_state *istate,
>> strncmp(item->match, name - prefix, item->prefix))
>> return 0;
>>
>> - if (item->attr_match_nr &&
>> - !match_pathspec_attrs(istate, name - prefix, namelen + prefix, item))
>> - return 0;
>> + if (item->attr_match_nr) {
>> + if (!istate)
>> + BUG("magic PATHSPEC_ATTR requires an index");
>> + if (!match_pathspec_attrs(istate, name - prefix, namelen + prefix, item))
>> + return 0;
>> + }
>
> It is a bit curious why we do not check PATHSPEC_NO_REPOSITORY here,
> but it is OK, because it is a BUG for istate to be NULL when we have
> a repository anyway.
>
match_pathspec doesn't take the same flags as parse_pathspec anyways, so
we can't check it here :)
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2025-05-21 23:07 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-20 0:01 [PATCH v3 0/4] diff: add pathspec support to --no-index Jacob Keller
2025-05-20 0:01 ` [PATCH v3 1/4] prefix_path: support prefixes not ending in trailing slash Jacob Keller
2025-05-20 14:35 ` Junio C Hamano
2025-05-20 22:34 ` Jacob Keller
2025-05-20 0:01 ` [PATCH v3 2/4] pathspec: expose match_pathspec_with_flags Jacob Keller
2025-05-20 14:39 ` Junio C Hamano
2025-05-20 22:38 ` Jacob Keller
2025-05-20 0:01 ` [PATCH v3 3/4] pathspec: add flag to indicate operation without repository Jacob Keller
2025-05-20 15:13 ` Junio C Hamano
2025-05-20 22:42 ` Jacob Keller
2025-05-21 23:05 ` Jacob Keller
2025-05-20 0:01 ` [PATCH v3 4/4] diff --no-index: support limiting by pathspec Jacob Keller
2025-05-20 16:30 ` Junio C Hamano
2025-05-20 22:45 ` Jacob Keller
2025-05-20 22:47 ` Jacob Keller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).