Git development

Git development
 help / color / mirror / Atom feed

* Bug: `git stash push ':(attr:<somevalue>)' attr magic does not get parsed correctly
From: Joanna Wang @ 2023-10-11 19:14 UTC (permalink / raw)
  To: git

What did you do before the bug happened? (Steps to reproduce your issue)
run the following with a debugger: `git stash push ':(attr:<somevalue>)'`
check the resulting pathspec after parse_pathspec()

What did you expect to happen? (Expected behavior)
the resulting pathspec should contain ':(attr:<somevalue>)'.

What happened instead? (Actual behavior)
The resulting pathspec is ':(attr,prefix:0)'
This is due to parse_pathspec in git-stash being called with
PATHSPEC_PREFIX_ORIGIN as a flag.

':(attr:<somevalue>)' gets parsed into ':(attr,prefix:0)' because prefix_magic()
https://github.com/git/git/blob/master/pathspec.c#L112 does not take
into account
that some magic can have values in addition to their magic names
specified in the pathspec.

What's different between what you expected and what actually happened?
Resulting pathspec is ':(attr,prefix:0)' when it should be
':(attr:<somevalue>,prefix:0)'

Anything else you want to add:
attr pathspec magic is currently blocked for git stash by GUARD_PATHSPEC
https://github.com/git/git/blob/master/dir.c#L2176
But, it looks like it can be unblocked, once prefix_magic() is fixed.

Also add-interactive also uses PATHSPEC_PREFIX_ORIGIN as a flag and
the resulting pathspec from parse_pathspec
shows the same bug.  But `git add -i ':(attr:<somevalue>'` behaves as
expected and I'm not sure why. It may be using different parts of the
pathspec struct down the line to filter out files.

[Enabled Hooks]
commit-msg
pre-commit
prepare-commit-msg

^ permalink raw reply

* Re: [PATCH 18/20] commit-graph: check bounds when accessing BIDX chunk
From: Taylor Blau @ 2023-10-11 19:15 UTC (permalink / raw)
  To: Jeff King; +Cc: git
In-Reply-To: <20231009210553.GR3282181@coredump.intra.peff.net>

On Mon, Oct 09, 2023 at 05:05:53PM -0400, Jeff King wrote:
> We load the bloom_filter_indexes chunk using pair_chunk(), so we have no
> idea how big it is. This can lead to out-of-bounds reads if it is
> smaller than expected, since we index it based on the number of commits
> found elsewhere in the graph file.
>
> We can check the chunk size up front, like we do for CDAT and other
> chunks with one fixed-size record per commit.
>
> The test case demonstrates the problem. It actually won't segfault,
> because we end up reading random data from the follow-on chunk (BDAT in
> this case), and the bounds checks added in the previous patch complain.
> But this is by no means assured, and you can craft a commit-graph file
> with BIDX at the end (or a smaller BDAT) that does segfault.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  commit-graph.c       | 16 ++++++++++++++--
>  t/t4216-log-bloom.sh |  9 +++++++++
>  2 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index f7a42be6d0..1f334987b5 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -360,6 +360,18 @@ static int graph_read_generation_data(const unsigned char *chunk_start,
>  	return 0;
>  }
>
> +static int graph_read_bloom_index(const unsigned char *chunk_start,
> +				  size_t chunk_size, void *data)
> +{
> +	struct commit_graph *g = data;
> +	if (chunk_size != g->num_commits * 4) {

Here we probably should wrap `g->num_commits * 4` in an st_mult() call.
Having 4B commits is probably pretty unlikely in practice, but we have
definitely seen issues in the wild at GitHub where we have more than
2^32-1/20 commits in a single network.git.

> +		warning("commit-graph changed-path index chunk is too small");

Should this be marked for translation?

> +		return -1;
> +	}
> +	g->chunk_bloom_indexes = chunk_start;
> +	return 0;
> +}
> +
>  static int graph_read_bloom_data(const unsigned char *chunk_start,
>  				  size_t chunk_size, void *data)
>  {
> @@ -470,8 +482,8 @@ struct commit_graph *parse_commit_graph(struct repo_settings *s,
>  	}
>
>  	if (s->commit_graph_read_changed_paths) {
> -		pair_chunk_unsafe(cf, GRAPH_CHUNKID_BLOOMINDEXES,
> -			   &graph->chunk_bloom_indexes);
> +		read_chunk(cf, GRAPH_CHUNKID_BLOOMINDEXES,
> +			   graph_read_bloom_index, graph);

Since we know g->num_commits by this point, this would be another spot
that would benefit from a `pair_chunk_expect()` function. But, again,
not trying to beat a dead horse here or anything ;-).

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH 19/20] commit-graph: detect out-of-order BIDX offsets
From: Taylor Blau @ 2023-10-11 19:16 UTC (permalink / raw)
  To: Jeff King; +Cc: git
In-Reply-To: <20231009210556.GS3282181@coredump.intra.peff.net>

On Mon, Oct 09, 2023 at 05:05:56PM -0400, Jeff King wrote:
> diff --git a/bloom.c b/bloom.c
> index 61abad7f8c..1474aa19fa 100644
> --- a/bloom.c
> +++ b/bloom.c
> @@ -75,6 +75,16 @@ static int load_bloom_filter_from_graph(struct commit_graph *g,
>  	    check_bloom_offset(g, lex_pos - 1, start_index) < 0)
>  		return 0;
>
> +	if (end_index < start_index) {
> +		warning("ignoring decreasing changed-path index offsets"
> +			" (%"PRIuMAX" > %"PRIuMAX") for positions"
> +			" %"PRIuMAX" and %"PRIuMAX" of %s",

Should this be marked for translation?

Otherwise this LGTM.

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH 0/20] bounds-checks for chunk-based files
From: Taylor Blau @ 2023-10-11 19:19 UTC (permalink / raw)
  To: Jeff King; +Cc: git
In-Reply-To: <20231009205544.GA3281950@coredump.intra.peff.net>

On Mon, Oct 09, 2023 at 04:55:44PM -0400, Jeff King wrote:
>  bloom.c                            |  34 +++++++++
>  chunk-format.c                     |  24 ++++--
>  chunk-format.h                     |   9 ++-
>  commit-graph.c                     | 119 ++++++++++++++++++++++++-----
>  commit-graph.h                     |   4 +
>  midx.c                             |  68 +++++++++++++----
>  midx.h                             |   3 +
>  pack-revindex.c                    |  13 +++-
>  t/lib-chunk.sh                     |  17 +++++
>  t/lib-chunk/corrupt-chunk-file.pl  |  66 ++++++++++++++++
>  t/t4216-log-bloom.sh               |  50 ++++++++++++
>  t/t5318-commit-graph.sh            |  76 +++++++++++++++++-
>  t/t5319-multi-pack-index.sh        | 102 ++++++++++++++++++++++++-
>  t/t5324-split-commit-graph.sh      |  20 ++++-
>  t/t5328-commit-graph-64bit-time.sh |  10 +++
>  15 files changed, 568 insertions(+), 47 deletions(-)
>  create mode 100644 t/lib-chunk.sh
>  create mode 100644 t/lib-chunk/corrupt-chunk-file.pl

I reviewed this carefully (well, except for the new Perl script, for
obvious[^1] reasons ;-)).

Everything mostly looks good to me, though I
had a handful of review comments throughout. Many of them are trivial
(e.g. a number of warning() and error() strings should be marked for
translation, etc.), but a couple of them I think are worth looking at.

Most notably, I think that by the end of the series, I was convinced
that having some kind of 'pair_chunk_expectsz()' or similar would be
useful and eliminate a good chunk of the boilerplate you have to write
to check the chunk size against an expected value when using
read_chunk().

Otherwise, this looks great. I appreciate the care you took in finding
and fixing these issues, as well as thoroughly documenting the process
(and the security implications, or lack thereof). Thanks for working on
this!

Thanks,
Taylor

[^1]: That I may be the world's least competent Perl programmer.

^ permalink raw reply

* Re: What's cooking in git.git (Oct 2023, #04; Tue, 10)
From: Taylor Blau @ 2023-10-11 19:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, SZEDER Gábor, Elijah Newren, git
In-Reply-To: <xmqqwmvuosf3.fsf@gitster.g>

On Tue, Oct 10, 2023 at 06:32:16PM -0700, Junio C Hamano wrote:
> * tb/path-filter-fix (2023-08-30) 15 commits
>  - bloom: introduce `deinit_bloom_filters()`
>  - commit-graph: reuse existing Bloom filters where possible
>  - object.h: fix mis-aligned flag bits table
>  - commit-graph: drop unnecessary `graph_read_bloom_data_context`
>  - commit-graph.c: unconditionally load Bloom filters
>  - t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
>  - bloom: prepare to discard incompatible Bloom filters
>  - bloom: annotate filters with hash version
>  - commit-graph: new filter ver. that fixes murmur3
>  - repo-settings: introduce commitgraph.changedPathsVersion
>  - t4216: test changed path filters with high bit paths
>  - t/helper/test-read-graph: implement `bloom-filters` mode
>  - bloom.h: make `load_bloom_filter_from_graph()` public
>  - t/helper/test-read-graph.c: extract `dump_graph_info()`
>  - gitformat-commit-graph: describe version 2 of BDAT
>
>  The Bloom filter used for path limited history traversal was broken
>  on systems whose "char" is unsigned; update the implementation and
>  bump the format version to 2.
>
>  Reroll exists, not picked up yet.
>  cf. <20230830200218.GA5147@szeder.dev>
>  cf. <20230901205616.3572722-1-jonathantanmy@google.com>
>  cf. <20230924195900.GA1156862@szeder.dev>
>  cf. <20231008143523.GA18858@szeder.dev>
>  source: <cover.1693413637.git.jonathantanmy@google.com>

Great, thanks for noting that you saw it ;-). I think that this one is
ready to go, but I'm obviously biased and I'd feel better if Jonathan or
Gábor (both CC'd) would take a look before you merge this down.

> * tb/repack-max-cruft-size (2023-10-09) 5 commits
>   (merged to 'next' on 2023-10-09 at 38f039e880)
>  + repack: free existing_cruft array after use
>   (merged to 'next' on 2023-10-06 at b3ca6df3b9)
>  + builtin/repack.c: avoid making cruft packs preferred
>  + builtin/repack.c: implement support for `--max-cruft-size`
>  + builtin/repack.c: parse `--max-pack-size` with OPT_MAGNITUDE
>  + t7700: split cruft-related tests to t7704
>
>  "git repack" learned "--max-cruft-size" to prevent cruft packs from
>  growing without bounds.
>
>  Will merge to 'master'.
>  source: <cover.1696293862.git.me@ttaylorr.com>
>  source: <20231007172031.GA1524950@coredump.intra.peff.net>
>  source: <035393935108d02aaf8927189b05102f4f74f340.1696370003.git.me@ttaylorr.com>

Great, thanks.

The only series of mine I didn't see mentioned here was the merge ORT ->
bulk-checkin stuff that I posted in [1]. The discussion there has sort
of morphed beyond the patches themselves, so I'd love some review on
these patches (perhaps from Elijah?, also CC'd) if time permits.

Thanks,
Taylor

[1]: https://lore.kernel.org/git/cover.1696629697.git.me@ttaylorr.com/

^ permalink raw reply

* Re: [REGRESSION] uninitialized value $address in git send-email when given multiple recipients separated by commas
From: Michael Strawbridge @ 2023-10-11 19:27 UTC (permalink / raw)
  To: Bagas Sanjaya, Jeff King, Todd Zullinger
  Cc: Junio C Hamano, Luben Tuikov,
	Ævar Arnfjörð Bjarmason, Taylor Blau,
	Git Mailing List
In-Reply-To: <ZSal-mQIZAUBaq6g@debian.me>


On 10/11/23 09:41, Bagas Sanjaya wrote:
> On Mon, Sep 25, 2023 at 12:17:48PM -0400, Jeff King wrote:
>> On Mon, Sep 25, 2023 at 10:48:29AM -0400, Todd Zullinger wrote:
>>
>>> From the peanut gallery... could the presence or lack of the
>>> Email::Valid perl module be a factor?
>> Ah, thanks! The thought of differing modules even occurred to me, since
>> I know we have a few optimistic dependencies, but when I looked I didn't
>> manage to find that one (somehow I thought Mail::Address was the
>> interesting one here; I think I might be getting senile).
>>
>> With Email::Valid installed, I can reproduce with just (in git.git, but
>> I think it would work in any repo):
>>
>>   $ echo "exit 0" >.git/hooks/sendemail-validate
>>   $ chmod +x .git/hooks/sendemail-validate
>>   $ git send-email --dry-run -1 --to=foo@example.com,bar@example.com
>>   error: unable to extract a valid address from: foo@example.com,bar@example.com
>>
>> Disabling the hook with "chmod -x" makes the problem go away (and this
>> is with current "master", hence the more readable error message).
>>
>> I think the issue is that a8022c5f7b ends up in extract_valid_address()
>> via this call stack:
>>
>>   $ = main::extract_valid_address('foo@example.com,bar@example.com') called from file '/home/peff/compile/git/git-send-email' line 1161
>>   $ = main::extract_valid_address_or_die('foo@example.com,bar@example.com') called from file '/home/peff/compile/git/git-send-email' line 2087
>>   @ = main::unique_email_list('foo@example.com,bar@example.com') called from file '/home/peff/compile/git/git-send-email' line 1507
>>   @ = main::gen_header() called from file '/home/peff/compile/git/git-send-email' line 2113
>>   . = main::validate_patch('/tmp/WfoPQSKCUa/0001-The-twelfth-batch.patch', 'auto') called from file '/home/peff/compile/git/git-send-email' line 815
>>
>> whereas prior to that commit, we hit it later:
>>
>>   $ = main::extract_valid_address('foo@example.com') called from file '/home/peff/compile/git/git-send-email' line 1166
>>   @ = main::validate_address('foo@example.com') called from file '/home/peff/compile/git/git-send-email' line 1189
>>   @ = main::validate_address_list('foo@example.com', 'bar@example.com') called from file '/home/peff/compile/git/git-send-email' line 1348
>>   @ = main::process_address_list('foo@example.com,bar@example.com') called from file '/home/peff/compile/git/git-send-email' line 1091
>>
>> So the issue is the call to gen_header() added in validate_patch(). We
>> won't yet have processed the address lists by that point. We can move
>> those calls up, but it requires moving a bit of extra code, too (like
>> the parts prompting for the "to" list if it isn't filled in).
>>
>> Possibly the validation checks need to be moved down, if they want to
>> see a more complete view of the emails. But now we're doing more work
>> (like asking the user to write the cover letter!) before we do
>> validation, which is probably bad.
>>
>> So I dunno. Maybe gen_header() should be lazily doing this
>> process_address_list() stuff? I'm not very familiar with the send-email
>> code, so I'm not sure what secondary effects that could have.
>>
> Michael, did you look into this since you authored the culprit?
>
I tried to repro the issue previously but didn't have luck (even with Email::Valid installed). I decided to try again today and realised I forgot to make the test sendemail-validate hook executable before.  Now that I can repro it, I can look further.

The fix may not be easy for the reasons that Jeff King states.  There is a lot of implicit order in the code because there are several places where code exists outside of any function, which has function definitions scattered through it.  My change attempted to clean a portion of that up and encapsulated it into a reusable function for generating header information so that sendemail-validate could see it.  I'll keep digging into it.


^ permalink raw reply

* Re: [PATCH v6] merge-tree: add -X strategy option
From: Junio C Hamano @ 2023-10-11 19:38 UTC (permalink / raw)
  To: Jeff King; +Cc: phillip.wood, Izzy via GitGitGadget, git, Elijah Newren, Izzy
In-Reply-To: <20231009185234.GB3270793@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> Oops, I totally missed the loop around the call to real_merge(). So
> yeah, I think this is rather tricky.
> ...
> I don't think there are any bugs with the state at the current tip of
> ty/merge-tree-strategy-options, but if we want to make it safer, I think
> we have two options:
>
>   - delay the conversion of the "xopts" list into merge_options until we
>     initialize it in real_merge(). This avoids breaking abstraction
>     boundaries, but does mean that the sanity-checking of the options
>     happens a little later (but not much in practice).
>
>   - provide a copy_merge_options() function, which makes this kind of
>     "set up a template and then copy it" pattern official. It can be a
>     struct assignment for now, but it at least alerts anybody adding new
>     options to the notion that a deep copy might be required.
>
> Option 1 looks something like this (a lot of the hunks are just
> reverting the tip of that branch, so squashed in it would be even
> smaller):

If we have no plan and intention to extend "merge-tree" even more in
the future, then option 1 would be the approach with least patch
noise, and as your "something like this" shows, it is a nice and
clean solution.  I very much like it.

But as the renovated "merge-tree" is a relatively young thing in our
toolbox, I suspect that more and more work may want to go into it.
And the other "official copy_merge_options()" approach would be a
more healthy solution in the longer run, I would think.  If we were
to go that route, we should also give an interface to free the
resources held by the copy.

It is not that much code on top of the commit that is already queued
in 'next', I suspect.  Perhaps something like this?

 builtin/merge-tree.c |  4 +++-
 merge-recursive.c    | 16 ++++++++++++++++
 merge-recursive.h    |  3 +++
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git c/builtin/merge-tree.c w/builtin/merge-tree.c
index 7024b5ce2e..a35e0452d6 100644
--- c/builtin/merge-tree.c
+++ w/builtin/merge-tree.c
@@ -425,10 +425,11 @@ static int real_merge(struct merge_tree_options *o,
 {
 	struct commit *parent1, *parent2;
 	struct commit_list *merge_bases = NULL;
-	struct merge_options opt = o->merge_options;
 	struct merge_result result = { 0 };
 	int show_messages = o->show_messages;
+	struct merge_options opt;
 
+	copy_merge_options(&opt, &o->merge_options);
 	parent1 = get_merge_parent(branch1);
 	if (!parent1)
 		help_unknown_ref(branch1, "merge-tree",
@@ -507,6 +508,7 @@ static int real_merge(struct merge_tree_options *o,
 	if (o->use_stdin)
 		putchar(line_termination);
 	merge_finalize(&opt, &result);
+	clear_merge_options(&opt);
 	return !result.clean; /* result.clean < 0 handled above */
 }
 
diff --git c/merge-recursive.c w/merge-recursive.c
index 0d7e57e2df..e3beb0801b 100644
--- c/merge-recursive.c
+++ w/merge-recursive.c
@@ -3912,6 +3912,22 @@ void init_merge_options(struct merge_options *opt,
 		opt->buffer_output = 0;
 }
 
+/*
+ * For now, members of merge_options do not need deep copying, but
+ * it may change in the future, in which case we would need to update
+ * this, and also make a matching change to clear_merge_options() to
+ * release the resources held by a copied instance.
+ */
+void copy_merge_options(struct merge_options *dst, struct merge_options *src)
+{
+	*dst = *src;
+}
+
+void clear_merge_options(struct merge_options *opt UNUSED)
+{
+	; /* no-op as our copy is shallow right now */
+}
+
 int parse_merge_opt(struct merge_options *opt, const char *s)
 {
 	const char *arg;
diff --git c/merge-recursive.h w/merge-recursive.h
index b88000e3c2..3d3b3e3c29 100644
--- c/merge-recursive.h
+++ w/merge-recursive.h
@@ -55,6 +55,9 @@ struct merge_options {
 
 void init_merge_options(struct merge_options *opt, struct repository *repo);
 
+void copy_merge_options(struct merge_options *dst, struct merge_options *src);
+void clear_merge_options(struct merge_options *opt);
+
 /* parse the option in s and update the relevant field of opt */
 int parse_merge_opt(struct merge_options *opt, const char *s);
 


^ permalink raw reply related

* Re: Bug: git stash store can create stash entries that can't be dropped
From: Erik Cervin Edin @ 2023-10-11 19:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List
In-Reply-To: <xmqqzg0pnmv5.fsf@gitster.g>

On Wed, Oct 11, 2023 at 6:29 PM Junio C Hamano <gitster@pobox.com> wrote:
> I am of two minds.  As "stash store" and "stash create" were
> invented as, and have always been ever since, pretty much
> implementation details of scripted "stash save", the user deserves
> what they get when they abuse them: garbage-in, garbage-out.

Fair. I knew what I was doing was probably not kosher. Still, the
documentation says

           git stash store [(-m | --message) <message>] [-q | --quiet] <commit>

and then later clarifying

           Store a given stash created via git stash create (which is
a dangling merge commit) in the stash ref, updating the stash reflog.
This is intended to be useful for scripts.
           It is probably not the command you want to use; see "push" above.

I knew git stash store is meant to be used with git stash create. But
I didn't expect to be able to create a stash entry that I wouldn't be
able to drop.

> Yes, this is exactly what the user deserves.
>
> Having said that, I agree that this shows an uneven UI.  The "drop"
> command cares about what it is dropping and refuses if it is not a
> stash-like thing, so it is understandable to wish "store" to also
> care to the same degree.

I can see why I deserve the mess I made for myself. But I'm also
inclined to think that if I've been allowed to make said mess, I
should similarly be allowed to clean it up, without having to manually
delete the entry from the stash reflog. (In case someone else finds
themselves in a similar mess, I just went into the
.git/logs/refs/stash and deleted line of the offending stash entry)

I think a more user-friendly behavior is to either prevent the user
from making the mess, or not prevent the user from cleaning it up.
Doing both seems a tad overzealous. Hard for me to say which is
better. For example, it's not immediately clear, to me, why git stash
drop cares about the validity of stash entries to be dropped. But
maybe there are edge cases where it's good to be defensive here.

^ permalink raw reply

* [PATCH 1/1] add: enable attr pathspec magic
From: Joanna Wang @ 2023-10-11 20:20 UTC (permalink / raw)
  To: gitster; +Cc: git, jojwang
In-Reply-To: <xmqqbkda4ubp.fsf@gitster.g>

This lets users limit added files or exclude files based on file
attributes. For example, the chromium project would like to use
this like "git add --all ':(exclude,attr:submodule)'", as submodules
are managed in a unique way and often results in submodule changes
that users do not want in their commits.

This does not change any attr magic implementation. It is only adding
attr as an allowed pathspec in git-add, which was previously
blocked by GUARD_PATHSPEC and a pathspec mask in parse_pathspec()).

With this patch, attr is supported. It is possible that when the attr
pathspec feature was first added in
b0db704652 (pathspec: allow querying for attributes, 2017-03-13),
"PATHSPEC_ATTR" was just unintentionally left out of a few
GUARD_PATHSPEC() invocations. Later, to get a more user-friendly error
message when attr was used with git-add, PATHSPEC_ATTR was added as a
mask to git-add's invocation of parse_pathspec()
84d938b732 (add: do not accept pathspec magic 'attr', 2018-09-18).

git-stash which goes through the same GUARD_PATHSPEC(), currently
does not work with attr.
https://lore.kernel.org/git/CAMmZTi-0QKtj7Q=sbC5qhipGsQxJFOY-Qkk1jfkRYwfF5FcUVg@mail.gmail.com/T/#u
So a PATHSPEC_ATTR mask has been added to its parse_pathspec and
parse_pathspec_file().

Signed-off-by: Joanna Wang <jojwang@google.com>
---

> Do you know what exactly is not ready, so that perhaps others can
> help figuring out how to make it ready for the attr magic?
I have filed a bug which describes what I have found:
https://lore.kernel.org/git/CAMmZTi-0QKtj7Q=sbC5qhipGsQxJFOY-Qkk1jfkRYwfF5FcUVg@mail.gmail.com/T/#u

> Why this reindent?
Fixed

> There is a typo here.
Fixed


 builtin/add.c                  |   7 ++-
 builtin/stash.c                |   7 ++-
 dir.c                          |   3 +-
 t/t6135-pathspec-with-attrs.sh | 106 +++++++++++++++++++++++++++++++--
 4 files changed, 112 insertions(+), 11 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index c27254a5cd..2de83964a3 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -424,7 +424,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 	 * Check the "pathspec '%s' did not match any files" block
 	 * below before enabling new magic.
 	 */
-	parse_pathspec(&pathspec, PATHSPEC_ATTR,
+	parse_pathspec(&pathspec, 0,
 		       PATHSPEC_PREFER_FULL |
 		       PATHSPEC_SYMLINK_LEADING_PATH,
 		       prefix, argv);
@@ -433,7 +433,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		if (pathspec.nr)
 			die(_("'%s' and pathspec arguments cannot be used together"), "--pathspec-from-file");
 
-		parse_pathspec_file(&pathspec, PATHSPEC_ATTR,
+		parse_pathspec_file(&pathspec, 0,
 				    PATHSPEC_PREFER_FULL |
 				    PATHSPEC_SYMLINK_LEADING_PATH,
 				    prefix, pathspec_from_file, pathspec_file_nul);
@@ -504,7 +504,8 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 			       PATHSPEC_LITERAL |
 			       PATHSPEC_GLOB |
 			       PATHSPEC_ICASE |
-			       PATHSPEC_EXCLUDE);
+			       PATHSPEC_EXCLUDE |
+			       PATHSPEC_ATTR);
 
 		for (i = 0; i < pathspec.nr; i++) {
 			const char *path = pathspec.items[i].match;
diff --git a/builtin/stash.c b/builtin/stash.c
index 1ad496985a..af1b3a7146 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -1760,7 +1760,10 @@ static int push_stash(int argc, const char **argv, const char *prefix,
 		}
 	}
 
-	parse_pathspec(&ps, 0, PATHSPEC_PREFER_FULL | PATHSPEC_PREFIX_ORIGIN,
+        // For parse_pathspec() and parse_pathspec_file() below, PATHSPEC_ATTR is blocked for git stash
+        // because the magic attr does not get properly parsed when the PATHSPEC_PREFIX_ORIGIN flag is
+        // used, resulting in incorrect file filtering for attr.
+	parse_pathspec(&ps, PATHSPEC_ATTR, PATHSPEC_PREFER_FULL | PATHSPEC_PREFIX_ORIGIN,
 		       prefix, argv);
 
 	if (pathspec_from_file) {
@@ -1773,7 +1776,7 @@ static int push_stash(int argc, const char **argv, const char *prefix,
 		if (ps.nr)
 			die(_("'%s' and pathspec arguments cannot be used together"), "--pathspec-from-file");
 
-		parse_pathspec_file(&ps, 0,
+		parse_pathspec_file(&ps, PATHSPEC_ATTR,
 				    PATHSPEC_PREFER_FULL | PATHSPEC_PREFIX_ORIGIN,
 				    prefix, pathspec_from_file, pathspec_file_nul);
 	} else if (pathspec_file_nul) {
diff --git a/dir.c b/dir.c
index 8486e4d56f..ee3f3777df 100644
--- a/dir.c
+++ b/dir.c
@@ -2179,7 +2179,8 @@ static int exclude_matches_pathspec(const char *path, int pathlen,
 		       PATHSPEC_LITERAL |
 		       PATHSPEC_GLOB |
 		       PATHSPEC_ICASE |
-		       PATHSPEC_EXCLUDE);
+		       PATHSPEC_EXCLUDE |
+		       PATHSPEC_ATTR);
 
 	for (i = 0; i < pathspec->nr; i++) {
 		const struct pathspec_item *item = &pathspec->items[i];
diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
index f70c395e75..403ee5e6b6 100755
--- a/t/t6135-pathspec-with-attrs.sh
+++ b/t/t6135-pathspec-with-attrs.sh
@@ -64,12 +64,24 @@ test_expect_success 'setup .gitattributes' '
 	fileSetLabel label
 	fileValue label=foo
 	fileWrongLabel label☺
+	newFileA* labelA
+	newFileB* labelB
 	EOF
 	echo fileSetLabel label1 >sub/.gitattributes &&
 	git add .gitattributes sub/.gitattributes &&
 	git commit -m "add attributes"
 '
 
+test_expect_success 'setup .gitignore' '
+	cat <<-\EOF >.gitignore &&
+	actual
+	expect
+	pathspec_file
+	EOF
+	git add .gitignore &&
+	git commit -m "add gitignore"
+'
+
 test_expect_success 'check specific set attr' '
 	cat <<-\EOF >expect &&
 	fileSetLabel
@@ -150,6 +162,7 @@ test_expect_success 'check specific value attr (2)' '
 test_expect_success 'check unspecified attr' '
 	cat <<-\EOF >expect &&
 	.gitattributes
+	.gitignore
 	fileA
 	fileAB
 	fileAC
@@ -175,6 +188,7 @@ test_expect_success 'check unspecified attr' '
 test_expect_success 'check unspecified attr (2)' '
 	cat <<-\EOF >expect &&
 	HEAD:.gitattributes
+	HEAD:.gitignore
 	HEAD:fileA
 	HEAD:fileAB
 	HEAD:fileAC
@@ -200,6 +214,7 @@ test_expect_success 'check unspecified attr (2)' '
 test_expect_success 'check multiple unspecified attr' '
 	cat <<-\EOF >expect &&
 	.gitattributes
+	.gitignore
 	fileC
 	fileNoLabel
 	fileWrongLabel
@@ -239,16 +254,97 @@ test_expect_success 'fail on multiple attr specifiers in one pathspec item' '
 	test_i18ngrep "Only one" actual
 '
 
-test_expect_success 'fail if attr magic is used places not implemented' '
+test_expect_success 'fail if attr magic is used in places not implemented' '
 	# The main purpose of this test is to check that we actually fail
 	# when you attempt to use attr magic in commands that do not implement
-	# attr magic. This test does not advocate git-add to stay that way,
-	# though, but git-add is convenient as it has its own internal pathspec
-	# parsing.
-	test_must_fail git add ":(attr:labelB)" 2>actual &&
+	# attr magic. This test does not advocate stash push to stay that way.
+	# When you teach the command to grok the pathspec, you need to find
+	# another command to replace it for the test.
+	test_must_fail git stash push ":(attr:labelB)" 2>actual &&
+	test_i18ngrep "magic not supported" actual
+'
+
+test_expect_success 'fail if attr magic is used in --pathspec-from-file when not implemented' '
+	# This is like the test above but for attr magic passed in via --pathspec-from-file.
+	cat <<-\EOF >pathspec_file &&
+	:(attr:labelB)
+	EOF
+	test_must_fail git stash push --pathspec-from-file=pathspec_file 2>actual &&
 	test_i18ngrep "magic not supported" actual
 '
 
+test_expect_success 'check that attr magic works for git add --all' '
+	cat <<-\EOF >expect &&
+	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git add --all ":(exclude,attr:labelB)" &&
+	git diff --name-only --cached >actual &&
+	git restore -W -S . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add -u' '
+	cat <<-\EOF >expect &&
+	sub/fileA
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	>sub/fileA &&
+	>sub/fileB &&
+	git add -u ":(exclude,attr:labelB)" &&
+	git diff --name-only --cached  >actual &&
+	git restore -S -W . && rm sub/new* &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add <path>' '
+	cat <<-\EOF >expect &&
+	fileA
+	fileB
+	sub/fileA
+	EOF
+	>fileA &&
+	>fileB &&
+	>sub/fileA &&
+	>sub/fileB &&
+	git add ":(exclude,attr:labelB)sub/*" &&
+	git diff --name-only --cached >actual &&
+	git restore -S -W . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git -add .' '
+	cat <<-\EOF >expect &&
+	sub/fileA
+	EOF
+	>fileA &&
+	>fileB &&
+	>sub/fileA &&
+	>sub/fileB &&
+	cd sub &&
+	git add . ":(exclude,attr:labelB)" &&
+	cd .. &&
+	git diff --name-only --cached >actual &&
+	git restore -S -W . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add --pathspec-from-file' '
+	cat <<-\EOF >pathspec_file &&
+	:(exclude,attr:labelB)
+	EOF
+	cat <<-\EOF >expect &&
+	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git add --all --pathspec-from-file=pathspec_file &&
+	git diff --name-only --cached >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'abort on giving invalid label on the command line' '
 	test_must_fail git ls-files . ":(attr:☺)"
 '
-- 
2.42.0.609.gbb76f46606-goog


^ permalink raw reply related

* [PATCH] send-email: move process_address_list earlier to avoid, uninitialized address error
From: Michael Strawbridge @ 2023-10-11 20:22 UTC (permalink / raw)
  To: Bagas Sanjaya, Jeff King, Todd Zullinger
  Cc: Junio C Hamano, Luben Tuikov,
	Ævar Arnfjörð Bjarmason, Taylor Blau,
	Git Mailing List
In-Reply-To: <95b9e5d5-ab07-48a6-b972-af5348f653be@amd.com>

Move processing of email address lists before the sendemail-validate
hook code.  This fixes email address validation errors when the optional
perl module Email::Valid is installed and multiple addresses are passed
in on a single to/cc argument like --to=foo@example.com,bar@example.com.
---
 git-send-email.perl | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 288ea1ae80..cfd80c9d8b 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -799,6 +799,10 @@ sub is_format_patch_arg {
 
 $time = time - scalar $#files;
 
+@initial_to = process_address_list(@initial_to);
+@initial_cc = process_address_list(@initial_cc);
+@initial_bcc = process_address_list(@initial_bcc);
+
 if ($validate) {
        # FIFOs can only be read once, exclude them from validation.
        my @real_files = ();
@@ -1099,10 +1103,6 @@ sub expand_one_alias {
        return $aliases{$alias} ? expand_aliases(@{$aliases{$alias}}) : $alias;
 }
 
-@initial_to = process_address_list(@initial_to);
-@initial_cc = process_address_list(@initial_cc);
-@initial_bcc = process_address_list(@initial_bcc);
-
 if ($thread && !defined $initial_in_reply_to && $prompting) {
        $initial_in_reply_to = ask(
                __("Message-ID to be used as In-Reply-To for the first email (if any)? "),
-- 
2.34.1

^ permalink raw reply related

* Re: [PATCH] send-email: move process_address_list earlier to avoid, uninitialized address error
From: Michael Strawbridge @ 2023-10-11 20:25 UTC (permalink / raw)
  To: Bagas Sanjaya, Jeff King, Todd Zullinger
  Cc: Junio C Hamano, Luben Tuikov,
	Ævar Arnfjörð Bjarmason, Taylor Blau,
	Git Mailing List
In-Reply-To: <7e2c92ff-b42c-4b3f-a509-9d0785448262@amd.com>

Hi Bagas,

If possible, please try the patch I just sent out and let me know if it works for your situation.

Thanks,

Michael

On 10/11/23 16:22, Michael Strawbridge wrote:
> Move processing of email address lists before the sendemail-validate
> hook code.  This fixes email address validation errors when the optional
> perl module Email::Valid is installed and multiple addresses are passed
> in on a single to/cc argument like --to=foo@example.com,bar@example.com.
> ---
>  git-send-email.perl | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index 288ea1ae80..cfd80c9d8b 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -799,6 +799,10 @@ sub is_format_patch_arg {
>  
>  $time = time - scalar $#files;
>  
> +@initial_to = process_address_list(@initial_to);
> +@initial_cc = process_address_list(@initial_cc);
> +@initial_bcc = process_address_list(@initial_bcc);
> +
>  if ($validate) {
>         # FIFOs can only be read once, exclude them from validation.
>         my @real_files = ();
> @@ -1099,10 +1103,6 @@ sub expand_one_alias {
>         return $aliases{$alias} ? expand_aliases(@{$aliases{$alias}}) : $alias;
>  }
>  
> -@initial_to = process_address_list(@initial_to);
> -@initial_cc = process_address_list(@initial_cc);
> -@initial_bcc = process_address_list(@initial_bcc);
> -
>  if ($thread && !defined $initial_in_reply_to && $prompting) {
>         $initial_in_reply_to = ask(
>                 __("Message-ID to be used as In-Reply-To for the first email (if any)? "),

^ permalink raw reply

* [PATCH] stash: be careful what we store
From: Junio C Hamano @ 2023-10-11 20:44 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Erik Cervin Edin
In-Reply-To: <xmqqzg0pnmv5.fsf@gitster.g>

"git stash store" is meant to store what "git stash create"
produces, as these two are implementation details of the end-user
facing "git stash save" command.  Even though it is clearly
documented as such, users would try silly things like "git stash
store HEAD" to render their stash unusable.

Worse yet, because "git stash drop" does not allow such a stash
entry to be removed, "git stash clear" would be the only way to
recover from such a mishap.  Reuse the logic that allows "drop" to
refrain from working on such a stash entry to teach "store" to avoid
storing an object that is not a stash entry in the first place.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
  > It may be just the matter of doing something silly like this.
  > Not even compile tested, but hopefully it is sufficient to
  > convey the idea.

  Now it is at least compile-tested.

 builtin/stash.c  | 6 ++++++
 t/t3903-stash.sh | 4 ++++
 2 files changed, 10 insertions(+)

diff --git a/builtin/stash.c b/builtin/stash.c
index 3a4f9fd566..8073ef4019 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -977,6 +977,12 @@ static int show_stash(int argc, const char **argv, const char *prefix)
 static int do_store_stash(const struct object_id *w_commit, const char *stash_msg,
 			  int quiet)
 {
+	struct stash_info info;
+	char revision[GIT_MAX_HEXSZ];
+
+	oid_to_hex_r(revision, w_commit);
+	assert_stash_like(&info, revision);
+
 	if (!stash_msg)
 		stash_msg = "Created via \"git stash store\".";
 
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 376cc8f4ab..35c8569aea 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -931,6 +931,10 @@ test_expect_success 'store called with invalid commit' '
 	test_must_fail git stash store foo
 '
 
+test_expect_success 'store called with non-stash commit' '
+	test_must_fail git stash store HEAD
+'
+
 test_expect_success 'store updates stash ref and reflog' '
 	git stash clear &&
 	git reset --hard &&
-- 
2.42.0-345-gaab89be2eb


^ permalink raw reply related

* Re: [PATCH v8 1/3] unit tests: Add a project plan document
From: Josh Steadmon @ 2023-10-11 21:14 UTC (permalink / raw)
  To: Oswald Buddenhagen
  Cc: git, phillip.wood123, linusa, calvinwan, gitster, rsbecker
In-Reply-To: <ZSUR+YdzqNTSB0XC@ugly>

On 2023.10.10 10:57, Oswald Buddenhagen wrote:
> On Mon, Oct 09, 2023 at 03:21:20PM -0700, Josh Steadmon wrote:
> > +=== Comparison
> > +
> > +[format="csv",options="header",width="33%"]
> > +|=====
> > +Framework,"<<license,License>>","<<vendorable-or-ubiquitous,Vendorable or ubiquitous>>","<<maintainable-extensible,Maintainable / extensible>>","<<major-platform-support,Major platform support>>","<<tap-support,TAP support>>","<<diagnostic-output,Diagnostic output>>","<<runtime--skippable-tests,Runtime- skippable tests>>","<<parallel-execution,Parallel execution>>","<<mock-support,Mock support>>","<<signal-error-handling,Signal & error handling>>","<<project-kloc,Project KLOC>>","<<adoption,Adoption>>"
> > 
> the redundancy seems unnecessary; asciidoc should automatically use each
> target's section title as the xreflabel.

Hmm, this doesn't seem to work for me. It only renders as
"[anchor-label]".


> > +https://lore.kernel.org/git/c902a166-98ce-afba-93f2-ea6027557176@gmail.com/[Custom Git impl.],[lime-background]#GPL v2#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[red-background]#False#,[red-background]#False#,[red-background]#False#,1,0
> > +https://github.com/silentbicycle/greatest[Greatest],[lime-background]#ISC#,[lime-background]#True#,[yellow-background]#Partial#,[lime-background]#True#,[yellow-background]#Partial#,[lime-background]#True#,[lime-background]#True#,[red-background]#False#,[red-background]#False#,[red-background]#False#,3,1400
> > +https://github.com/Snaipe/Criterion[Criterion],[lime-background]#MIT#,[red-background]#False#,[yellow-background]#Partial#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[red-background]#False#,[lime-background]#True#,19,1800
> > +https://github.com/rra/c-tap-harness/[C TAP],[lime-background]#Expat#,[lime-background]#True#,[yellow-background]#Partial#,[yellow-background]#Partial#,[lime-background]#True#,[red-background]#False#,[lime-background]#True#,[red-background]#False#,[red-background]#False#,[red-background]#False#,4,33
> > +https://libcheck.github.io/check/[Check],[lime-background]#LGPL v2.1#,[red-background]#False#,[yellow-background]#Partial#,[lime-background]#True#,[lime-background]#True#,[lime-background]#True#,[red-background]#False#,[red-background]#False#,[red-background]#False#,[lime-background]#True#,17,973
> > +|=====
> > +
> i find this totally unreadable in its raw form.
> consider user-defined document-attributes for specific cell contents.
> externalizing the urls would probably help as well (i'm not sure how to do
> that best).

Ah yeah, user-defined attributes definitely cleans this up quite a bit.
Thanks for the tip!

^ permalink raw reply

* Re: [PATCH] upload-pack: add tracing for fetches
From: Taylor Blau @ 2023-10-11 21:26 UTC (permalink / raw)
  To: Robert Coup via GitGitGadget; +Cc: git, Robert Coup
In-Reply-To: <pull.1598.git.1697040242703.gitgitgadget@gmail.com>

On Wed, Oct 11, 2023 at 04:04:02PM +0000, Robert Coup via GitGitGadget wrote:
> diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
> index d18f2823d86..bb15ac34f77 100755
> --- a/t/t5500-fetch-pack.sh
> +++ b/t/t5500-fetch-pack.sh
> @@ -132,13 +132,18 @@ test_expect_success 'single branch object count' '
>  '
>
>  test_expect_success 'single given branch clone' '
> -	git clone --single-branch --branch A "file://$(pwd)/." branch-a &&
> -	test_must_fail git --git-dir=branch-a/.git rev-parse origin/B
> +	GIT_TRACE2_EVENT="$(pwd)/branch-a/trace2_event" \
> +		git clone --single-branch --branch A "file://$(pwd)/." branch-a &&
> +	test_must_fail git --git-dir=branch-a/.git rev-parse origin/B &&
> +	grep \"fetch-info\".*\"haves\":0 branch-a/trace2_event &&
> +	grep \"fetch-info\".*\"wants\":1 branch-a/trace2_event

Not at all related to your patch here, but I feel like we have a bunch
of these greps sprinkled throughout the test suite which serve to
inspect some key(s) of a JSON object printed to the trace2 event stream.

It might be nice to have a more robust test-tool that could do this
heavy lifting for us instead of having to write these grep expressions
ourselves.

But definitely outside the scope of this patch ;-).

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH] send-email: move process_address_list earlier to avoid, uninitialized address error
From: Junio C Hamano @ 2023-10-11 21:27 UTC (permalink / raw)
  To: Michael Strawbridge
  Cc: Bagas Sanjaya, Jeff King, Todd Zullinger, Luben Tuikov,
	Ævar Arnfjörð Bjarmason, Taylor Blau,
	Git Mailing List
In-Reply-To: <7e2c92ff-b42c-4b3f-a509-9d0785448262@amd.com>

Michael Strawbridge <michael.strawbridge@amd.com> writes:

> @@ -799,6 +799,10 @@ sub is_format_patch_arg {
>  
>  $time = time - scalar $#files;
>  
> +@initial_to = process_address_list(@initial_to);
> +@initial_cc = process_address_list(@initial_cc);
> +@initial_bcc = process_address_list(@initial_bcc);
> +

This does not look OK.  If we trace how @initial_to gets its value,

 - it first gets its value from @getopt_to and @config_to

 - if that is empty, and there is no $to_cmd, the end-user is
   interactively asked.

 - then process_address_list() is applied.

But this patch just swapped the second one and the third one, so
process_address_list() does not process what the end-user gave
interactively, no?

>  if ($validate) {
>         # FIFOs can only be read once, exclude them from validation.
>         my @real_files = ();

It almost feels like what need to move is not the setting of these
address lists, but the code that calls int validation callchain that
needs access to these address lists---the block that begins with the
above "if ($validate) {" needs to move below ...

> @@ -1099,10 +1103,6 @@ sub expand_one_alias {
>         return $aliases{$alias} ? expand_aliases(@{$aliases{$alias}}) : $alias;
>  }
>  
> -@initial_to = process_address_list(@initial_to);
> -@initial_cc = process_address_list(@initial_cc);
> -@initial_bcc = process_address_list(@initial_bcc);
> -

... this point, or something, perhaps?

>  if ($thread && !defined $initial_in_reply_to && $prompting) {
>         $initial_in_reply_to = ask(
>                 __("Message-ID to be used as In-Reply-To for the first email (if any)? "),

^ permalink raw reply

* Re: [PATCH] upload-pack: add tracing for fetches
From: Taylor Blau @ 2023-10-11 21:32 UTC (permalink / raw)
  To: Robert Coup via GitGitGadget; +Cc: git, Robert Coup
In-Reply-To: <pull.1598.git.1697040242703.gitgitgadget@gmail.com>

On Wed, Oct 11, 2023 at 04:04:02PM +0000, Robert Coup via GitGitGadget wrote:
> From: Robert Coup <robert@coup.net.nz>
>
> Information on how users are accessing hosted repositories can be
> helpful to server operators. For example, being able to broadly
> differentiate between fetches and initial clones; the use of shallow
> repository features; or partial clone filters.

Indeed. One of the custom patches that GitHub is carrying in its private
fork is something similar to this that dumps information from
upload-pack into a custom logging system specific to GitHub (called
"sockstat", in case anybody was curious).

I suspect that we would still live with that patch because we depend on
some of the custom logging infrastructure provided by sockstat, but this
is definitely a good direction to be pursuing for git.git nonetheless.

> @@ -1552,6 +1553,32 @@ static int parse_have(const char *line, struct oid_array *haves)
>  	return 0;
>  }
>
> +static void trace2_fetch_info(struct upload_pack_data *data)
> +{
> +	struct json_writer jw = JSON_WRITER_INIT;
> +
> +	jw_object_begin(&jw, 0);
> +	{

Is there a reason that we have a separate scope here? I think we may
want to drop this as unnecessary, but it's entirely possible that I'm
missing something here...

> +		jw_object_intmax(&jw, "haves", data->haves.nr);
> +		jw_object_intmax(&jw, "wants", data->want_obj.nr);
> +		jw_object_intmax(&jw, "want-refs", data->wanted_refs.nr);
> +		jw_object_intmax(&jw, "depth", data->depth);
> +		jw_object_intmax(&jw, "shallows", data->shallows.nr);
> +		jw_object_bool(&jw, "deepen-since", data->deepen_since);
> +		jw_object_intmax(&jw, "deepen-not", data->deepen_not.nr);
> +		jw_object_bool(&jw, "deepen-relative", data->deepen_relative);
> +		if (data->filter_options.choice)
> +			jw_object_string(&jw, "filter", list_object_filter_config_name(data->filter_options.choice));

I'm pretty sure that list_object_filter_config_name() returns characters
that are safe for JSON-encoding, and/or that jw_object_string() does any
quoting beforehand, but worth checking nonetheless.

> +		else
> +			jw_object_null(&jw, "filter");

These all seem like useful things to have.

Thanks,
Taylor

^ permalink raw reply

* Re: [RFC] Define "precious" attribute and support it in `git clean`
From: Kristoffer Haugsbakk @ 2023-10-11 21:41 UTC (permalink / raw)
  To: Sebastian Thiel; +Cc: Josh Triplett, git
In-Reply-To: <79901E6C-9839-4AB2-9360-9EBCA1AAE549@icloud.com>

On Tue, Oct 10, 2023, at 14:37, Sebastian Thiel wrote:
> [Note: I'm collaborating with Josh Triplett (CCed) on the design.]
>
> I'd like to propose adding a new standard gitattribute "precious".  I've
> included proposed documentation at the end of this mail, and I'm happy to write
> the code.  I wanted to get feedback on the concept first.
>
> What's a 'precious' file?
>
> "Precious" files are files that are specific to a user or local configuration
> and thus not tracked by Git.  As such, a user spent time to create or generate
> them, and to tune them to fit their needs.  They are typically also ignored by
> `git` due to `.gitignore` configuration, preventing them to be tracked by
> accident.

How do people deal with these precious files today?

You could track these precious files somewhere else. Maybe a Git directory
which is a sibling of `.git`.

    .git-local

Files that are useless to the project but important to the individual.

It would get its own “excludes” file.

    .gitignore-local

Now the normal repository (`.git`) needs to ignore these things.

    # Cast a wide net: could want other siblings
    # Use another pattern if you have something like `.git-blame-ignore-revs`
    printf '.git-*\n'       >> .git/info/exclude
    printf '.gitignore-*\n' >> .git/info/exclude

You need to pass in two arguments to `git` every time you want to use
`.git-local`.

    alias gitl='git --git-dir=.git-local -c core.excludesFile=.gitignore-local'

Git Local should ignore everything by default. You should check with
`.gitignore` to make sure that it ignores the files that Git Local does
_not_ ignore.

    printf '*\n'                 >> .gitignore-local
    printf '!.gitignore-local\n' >> .gitignore-local
    printf '!.idea/**\n'         >> .gitignore-local

Now you can backup your local files.

    gitl add .gitignore-local
    gitl add .idea
    gitl commit -m'Update local'

But you can also version control them by providing real (intentional)
messages.

(Maybe `.idea/` is just an XML soup and thus hard to make a VCS narrative
around; I don't know yet.)

`git clean` won't help you. But an alias can.

Or if writing a shell-oneliner alias is too hard for you, I mean me.

    #!/bin/sh
    # git-klean
    git --git-dir=.git-local -c core.excludesFile=.gitignore-local add --all
    git --git-dir=.git-local -c core.excludesFile=.gitignore-local commit -mUpdate
    git clean -e .gitignore-local -e .git-local "$@"

The two `-e` switches protect the Git Local things from being wiped by
`-xd` (tested with `--dry-run`).

§ Sibling repositories

At first I thought that `git clean` could use a `pre-clean` hook. But
that's not very satisfying.

Maybe Git could be told about its siblings via a multi-valued
configuration variable.

    sibling=local

Then it expects there to exist `.git-<sibling name>` repository next to
it (`.git/../.git-<sibling name>`).

Then the rule becomes:

Only do destructive operations if all of the working trees[1] of the
sibling repositories are clean (cannot override with `--force`).

Additionally one could say that directories that are Git repositories
should be ignored by `git clean`, always.[2] Or siblings which
match the glob:

    .git-*

... Or if you want something longer:

    .git-sibling-*

(And also ignore `.gitignore-*` and maybe more things)

Then the regular Git repository might still blow away your precious
files. But they will be backed up by the siblings.

Or you put these other Git repositories outside of `.git/..`. Sidestepping
the issue at the cost of some path confusion (for yourself). Maybe at:

    /home/user/git-siblings/repository1/local

† 1: Worktrees not considered.
† 2: And what would that break? People who make Git repositories in their
   working trees and then delete them? (Well they can still use `rm -r`.)

⧫ ⧫

§ Worktrees

But seriously: worktrees probably makes this not work.

-- 
Kristoffer

^ permalink raw reply

* Re: [PATCH v8 2/3] unit tests: add TAP unit test framework
From: Junio C Hamano @ 2023-10-11 21:42 UTC (permalink / raw)
  To: Josh Steadmon; +Cc: git, phillip.wood123, linusa, calvinwan, rsbecker
In-Reply-To: <00d3c95a81449bf49c4ce992d862d7a858691840.1696889530.git.steadmon@google.com>

Josh Steadmon <steadmon@google.com> writes:

> +	/* Initialized strbufs should always have a non-NULL buffer */
> +	if (buf->buf == NULL)
> +		return 0;

This upsets Coccinelle (equals-null).  I'll queue this on top for
now to work around CI breakage.

Thanks.

----- >8 -----
Subject: [PATCH] SQUASH???

Coccinelle suggested style fix.
---
 t/unit-tests/t-strbuf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/unit-tests/t-strbuf.c b/t/unit-tests/t-strbuf.c
index c2fcb0cbd6..8388442426 100644
--- a/t/unit-tests/t-strbuf.c
+++ b/t/unit-tests/t-strbuf.c
@@ -28,7 +28,7 @@ static void setup_populated(void (*f)(struct strbuf*, void*), char *init_str, vo
 static int assert_sane_strbuf(struct strbuf *buf)
 {
 	/* Initialized strbufs should always have a non-NULL buffer */
-	if (buf->buf == NULL)
+	if (!buf->buf)
 		return 0;
 	/* Buffers should always be NUL-terminated */
 	if (buf->buf[buf->len] != '\0')
-- 
2.42.0-345-gaab89be2eb


^ permalink raw reply related

* Re: [PATCH v6] merge-tree: add -X strategy option
From: Jeff King @ 2023-10-11 21:43 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: phillip.wood, Izzy via GitGitGadget, git, Elijah Newren, Izzy
In-Reply-To: <xmqqlec9kl0k.fsf@gitster.g>

On Wed, Oct 11, 2023 at 12:38:03PM -0700, Junio C Hamano wrote:

> If we have no plan and intention to extend "merge-tree" even more in
> the future, then option 1 would be the approach with least patch
> noise, and as your "something like this" shows, it is a nice and
> clean solution.  I very much like it.
> 
> But as the renovated "merge-tree" is a relatively young thing in our
> toolbox, I suspect that more and more work may want to go into it.
> And the other "official copy_merge_options()" approach would be a
> more healthy solution in the longer run, I would think.  If we were
> to go that route, we should also give an interface to free the
> resources held by the copy.

I am happy with either, as they both resolve the "merge-tree knows
intimate details about merge_options" issue. The patch I showed would
require manually passing more details down to real_merge(), which is I
guess what you are getting at with the "more work may want to go into
it".

> It is not that much code on top of the commit that is already queued
> in 'next', I suspect.  Perhaps something like this?

This looks OK, though...

> +void clear_merge_options(struct merge_options *opt UNUSED)
> +{
> +	; /* no-op as our copy is shallow right now */
> +}

Clearing is generally not just about copies, but any use of the struct.
so this invites the question of whether the original non-copy struct
should have a call to clear_merge_options() in cmd_merge_tree(). And
ditto for every other user.

I do not mind adding such calls, but of course we know that they are
currently noops (and we don't have any particular plan to change that).

-Peff

^ permalink raw reply

* Re: [PATCH v4 0/2] attr: add attr.tree config
From: Junio C Hamano @ 2023-10-11 22:09 UTC (permalink / raw)
  To: John Cai via GitGitGadget
  Cc: git, Jeff King, Jonathan Tan, Eric Sunshine, John Cai
In-Reply-To: <pull.1577.v4.git.git.1697044422.gitgitgadget@gmail.com>

"John Cai via GitGitGadget" <gitgitgadget@gmail.com> writes:

> 44451a2e5e (attr: teach "--attr-source=" global option to "git", 2023-05-06)
> provided the ability to pass in a treeish as the attr source. When a
> revision does not resolve to a valid tree is passed, Git will die. At
> GitLab, we server repositories as bare repos and would like to always read
> attributes from the default branch, so we'd like to pass in HEAD as the
> treeish to read gitattributes from on every command. In this context we
> would not want Git to die if HEAD is unborn, like in the case of empty
> repositories.
>
> Instead of modifying the default behavior of --attr-source, create a new
> config attr.tree with which an admin can configure a ref for all commands to
> read gitattributes from. Also make the default tree to read from HEAD on
> bare repositories.
>
> Changes since v2:
>
>  * relax the restrictions around attr.tree so that if it does not resolve to
>    a valid treeish, ignore it.
>  * add a commit to default to HEAD in bare repositories
>
> Changes since v1:
>
>  * Added a commit to add attr.tree config

THis is v4 so there must be some changes since v3 that we are missing?

> Range-diff vs v3:
>
>  1:  cef206d47c7 ! 1:  eaa27c47810 attr: read attributes from HEAD when bare repo
>      @@ t/t0003-attributes.sh: test_expect_success 'bare repository: check that .gitattr
>       +test_expect_success 'bare repo defaults to reading .gitattributes from HEAD' '
>       +	test_when_finished rm -rf test bare_with_gitattribute &&
>       +	git init test &&
>      -+	(
>      -+		cd test &&
>      -+		test_commit gitattributes .gitattributes "f/path test=val"
>      -+	) &&
>      ++	test_commit -C test gitattributes .gitattributes "f/path test=val" &&

OK.

>  2:  dadb822da99 ! 2:  749d8a8082e attr: add attr.tree for setting the treeish to read attributes from
>      @@ Documentation/config.txt: other popular tools, and describe them in your documen
>       
>        ## Documentation/config/attr.txt (new) ##
>       @@
>      -+attr.tree:
>      -+	A <tree-ish> to read gitattributes from instead of the worktree. See
>      -+	linkgit:gitattributes[5]. If `attr.tree` does not resolve to a valid tree,
>      -+	treat it as an empty tree. --attr-source and GIT_ATTR_SOURCE take
>      -+	precedence over attr.tree.
>      ++attr.tree::
>      ++	A reference to a tree in the repository from which to read attributes,
>      ++	instead of the `.gitattributes` file in the working tree. In a bare
>      ++	repository, this defaults to `HEAD:.gitattributes`. If the value does
>      ++	not resolve to a valid tree object, an empty tree is used instead.
>      ++	When the `GIT_ATTR_SOURCE` environment variable or `--attr-source`
>      ++	command line option are used, this configuration variable has no effect.

OK.

>      -+	if (!default_attr_source_tree_object_name) {
>      ++	if (!default_attr_source_tree_object_name && git_attr_tree) {
>       +		default_attr_source_tree_object_name = git_attr_tree;
>       +		ignore_bad_attr_tree = 1;
>       +	}

Makes sense.

>      @@ t/t0003-attributes.sh: test_expect_success 'bare repository: check that .gitattr
>        
>       +bad_attr_source_err="fatal: bad --attr-source or GIT_ATTR_SOURCE"
>       +
>      ++test_expect_success '--attr-source is bad' '
>      ++	test_when_finished rm -rf empty &&
>      ++	git init empty &&
>      ++	(
>      ++		cd empty &&
>      ++		echo "$bad_attr_source_err" >expect_err &&
>      ++		test_must_fail git --attr-source=HEAD check-attr test -- f/path 2>err &&
>      ++		test_cmp expect_err err
>      ++	)
>      ++'

OK.  We fail when explicitly given a bad attr-source.

>       +test_expect_success 'attr.tree when HEAD is unborn' '
>       +	test_when_finished rm -rf empty &&
>       +	git init empty &&
>       +	(
>       +		cd empty &&
>      -+		echo $bad_attr_source_err >expect_err &&
>       +		echo "f/path: test: unspecified" >expect &&
>       +		git -c attr.tree=HEAD check-attr test -- f/path >actual 2>err &&
>       +		test_must_be_empty err &&

But we silently ignore when given via a configuration variable.

>      @@ t/t0003-attributes.sh: test_expect_success 'bare repository: check that .gitattr
>       +	git init empty &&
>       +	(
>       +		cd empty &&
>      -+		echo $bad_attr_source_err >expect_err &&
>       +		echo "f/path: test: unspecified" >expect &&
>       +		git -c attr.tree=refs/does/not/exist check-attr test -- f/path >actual 2>err &&
>       +		test_must_be_empty err &&

Ditto.  Is this any different from the above?  Both points at an
object that does not exist.  If one were pointing at an object that
does not exist (e.g., HEAD before the initial commit) and the other
were pointing at an object that is not a tree-ish (e.g., a blob),
then having two separate tests may make sense, but otherwise, I am
not sure about the value proposition of the second test.

>      @@ t/t0003-attributes.sh: test_expect_success 'bare repo defaults to reading .gitat
>        	test_cmp expect actual
>        '
>        
>      -+test_expect_success '--attr-source and GIT_ATTR_SOURCE take precedence over attr.tree' '
>      ++test_expect_success 'precedence of --attr-source, GIT_ATTR_SOURCE, then attr.tree' '
>       +	test_when_finished rm -rf empty &&
>       +	git init empty &&
>       +	(
>      @@ t/t0003-attributes.sh: test_expect_success 'bare repo defaults to reading .gitat
>       +		test_commit "val2" .gitattributes "f/path test=val2" &&
>       +		git checkout attr-source &&
>       +		echo "f/path: test: val1" >expect &&
>      -+		git -c attr.tree=attr-tree --attr-source=attr-source check-attr test -- f/path >actual &&
>      ++		GIT_ATTR_SOURCE=attr-source git -c attr.tree=attr-tree --attr-source=attr-source \
>      ++		check-attr test -- f/path >actual &&
>       +		test_cmp expect actual &&
>      -+		GIT_ATTR_SOURCE=attr-source git -c attr.tree=attr-tree check-attr test -- f/path >actual &&
>      ++		GIT_ATTR_SOURCE=attr-source git -c attr.tree=attr-tree \
>      ++		check-attr test -- f/path >actual &&
>       +		test_cmp expect actual
>       +	)
>       +'

Looking good.

Thanks.  Queued.

^ permalink raw reply

* Re: [PATCH] send-email: move process_address_list earlier to avoid, uninitialized address error
From: Jeff King @ 2023-10-11 22:18 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Michael Strawbridge, Bagas Sanjaya, Todd Zullinger, Luben Tuikov,
	Ævar Arnfjörð Bjarmason, Taylor Blau,
	Git Mailing List
In-Reply-To: <xmqq1qe0lui2.fsf@gitster.g>

On Wed, Oct 11, 2023 at 02:27:49PM -0700, Junio C Hamano wrote:

> Michael Strawbridge <michael.strawbridge@amd.com> writes:
> 
> > @@ -799,6 +799,10 @@ sub is_format_patch_arg {
> >  
> >  $time = time - scalar $#files;
> >  
> > +@initial_to = process_address_list(@initial_to);
> > +@initial_cc = process_address_list(@initial_cc);
> > +@initial_bcc = process_address_list(@initial_bcc);
> > +
> 
> This does not look OK.  If we trace how @initial_to gets its value,
> 
>  - it first gets its value from @getopt_to and @config_to
> 
>  - if that is empty, and there is no $to_cmd, the end-user is
>    interactively asked.
> 
>  - then process_address_list() is applied.
> 
> But this patch just swapped the second one and the third one, so
> process_address_list() does not process what the end-user gave
> interactively, no?

Yep, that is the issue I found when I dug into this earlier.

> >  if ($validate) {
> >         # FIFOs can only be read once, exclude them from validation.
> >         my @real_files = ();
> 
> It almost feels like what need to move is not the setting of these
> address lists, but the code that calls int validation callchain that
> needs access to these address lists---the block that begins with the
> above "if ($validate) {" needs to move below ...

Yes, though then we have the problem that we've asked the user some
interactive questions before validating if the input files are bogus or
not. Which would be annoying if they aren't valid, because when we barf
they've wasted time typing.

Which of course implies that we're not (and cannot) validate what
they're typing at this step, but I think that's OK because we feed it
through extract_valid_address_or_die(). IOW, I think there are actually
two distinct validation steps hidden here:

  1. We want to validate that the patch files we were fed are OK.

  2. We want to validate that the addresses, etc, fed by the user are
     OK.

And after Michael's original patch, we are accidentally hitting some of
that validation code for (2) while doing (1).

This is actually a weird split if you think about it. We are feeding to
the validate hook in (1), so surely it would want to see the full set of
inputs from the user, too? Which argues for pushing the "if ($validate)"
down as you suggest. And then either:

  a. We accept that the user experience is a little worse if validation
     fails after the user typed.

  b. We split (1) into "early" validation that just checks if the files
     are OK, but doesn't call the hook. And then later on we do the full
     validation.

I don't have a strong opinion myself (I don't even use send-email
myself, and if I did, I'd probably mostly be feeding it with "--to" etc
on the command line, rather than interactively).

-Peff

^ permalink raw reply

* Re: [PATCH v6] merge-tree: add -X strategy option
From: Junio C Hamano @ 2023-10-11 22:19 UTC (permalink / raw)
  To: Jeff King; +Cc: phillip.wood, Izzy via GitGitGadget, git, Elijah Newren, Izzy
In-Reply-To: <20231011214340.GA518221@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> I am happy with either, as they both resolve the "merge-tree knows
> intimate details about merge_options" issue. The patch I showed would
> require manually passing more details down to real_merge(), which is I
> guess what you are getting at with the "more work may want to go into
> it".

I was alluding more about teaching "merge-tree" various optional
behaviour merge_options represents.  In today's patch it may be
-X<options>, who knows what tomorrow's patch wants to bring
"merge-tree" to feature-parity with "merge".  And the first approach
would mean we would add xopts today to the struct, but we will be
required passing more details as we add other things.

>> It is not that much code on top of the commit that is already queued
>> in 'next', I suspect.  Perhaps something like this?
>
> This looks OK, though...
>
>> +void clear_merge_options(struct merge_options *opt UNUSED)
>> +{
>> +	; /* no-op as our copy is shallow right now */
>> +}
>
> Clearing is generally not just about copies, but any use of the struct.
> so this invites the question of whether the original non-copy struct
> should have a call to clear_merge_options() in cmd_merge_tree(). And
> ditto for every other user.

Yes, once we start leaking, somebody hopefully notice the lack of a
call to this on the original/template copy and add one.  Until then...

^ permalink raw reply

* why does git set X in LESS env var?
From: Christoph Anton Mitterer @ 2023-10-11 22:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

Hey.

I recently stumbled over the problem that mouse wheel scrolling doesn't
work with (git-)delta[0], a problem[1][2], which apparently numerous
people had before me.

Numerous solutions were given in [1] and [2], for example using --mouse
as less option.
I noticed however, that this causes a somewhat different mouse
scrolling behaviour than my less usually gives me and dug a bit
further, noticing that the problem was that `X` was set in the `LESS`
env var for the less process, wrongly assuming[3] first that delta
would set it.

After delta's upstream noticed that this must be wrong, I looked
further and found that git set it since commit
0abc0260fa3419de649fcc1444e3d256a17ca6c7, which gives however no
indication why it was added.

A somewhat later commit, b3275838d969b7ecb91aae584226fccbeb046aca,
which removes `S` from being set, mentions:
> … The FRX flags actually make sense for Git (F and X because
> sometimes the output Git pipes to less is short, and R because Git
> pipes colored output).

But I still don't get from that why X would be needed?

My less manpage documents it as:
> -X or --no‐init
>     Disables sending the termcap initialization and deinitialization
>     strings to the terminal.  This is sometimes desirable if the
>     deinitialization string does something unnecessary, like clearing
>     the screen.

Is it to avoid clearing the screen?

Not sure whether git should really do something here... I mean it could
add --mouse per default, but then users who don't want that would need
to work around it (and there is no negative option for --mouse, it
seems).

Oh, I should add that scrolling without `X` and without `--mouse` sees
to only work on some terminals (e.g. VTE based ones, but not xterm).

Thanks,
Chris.

[0] https://github.com/dandavison/delta/
[1] https://github.com/dandavison/delta/issues/58
[2] https://github.com/dandavison/delta/issues/630
[3] https://github.com/dandavison/delta/issues/58#issuecomment-1756542986

^ permalink raw reply

* Re: why does git set X in LESS env var?
From: Junio C Hamano @ 2023-10-11 22:23 UTC (permalink / raw)
  To: Christoph Anton Mitterer; +Cc: git
In-Reply-To: <3a2c362c019338ca7408b7a3bc5715b535d15b8a.camel@scientia.org>

Christoph Anton Mitterer <calestyo@scientia.org> writes:

> But I still don't get from that why X would be needed?
>
> My less manpage documents it as:
>> -X or --no‐init
>>     Disables sending the termcap initialization and deinitialization
>>     strings to the terminal.  This is sometimes desirable if the
>>     deinitialization string does something unnecessary, like clearing
>>     the screen.
>
> Is it to avoid clearing the screen?

I think that was the reason we added it back in 2005.  In any case,
asking "why" is not a useful use of anybody's time, because it is
very unlikely to change in the official version we ship, and because
it is so easy for any individual who does not like it to drop by
exporting the $LESS environment variable.

Thanks.

^ permalink raw reply

* Re: why does git set X in LESS env var?
From: Christoph Anton Mitterer @ 2023-10-11 22:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqa5sokdd3.fsf@gitster.g>

On Wed, 2023-10-11 at 15:23 -0700, Junio C Hamano wrote:
> I think that was the reason we added it back in 2005.  In any case,
> asking "why" is not a useful use of anybody's time, because it is
> very unlikely to change in the official version we ship, and because
> it is so easy for any individual who does not like it to drop by
> exporting the $LESS environment variable.

Well the other commit I've mentioned kinda read as if it was thought
that either X or both F and X were needed for the effect to exit less
immediately if the output is too short ("F and X because
> sometimes the output Git pipes to less is short").

So I thought maybe that was intended, and the no-clear was just a side-
effect no one ever really thought about.

Anyway, thanks,
Chris.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox