* Re: [PATCH] revision: use priority queue in limit_list()
From: Derrick Stolee @ 2026-05-14 19:57 UTC (permalink / raw)
To: Kristofer Karlsson via GitGitGadget, git; +Cc: Kristofer Karlsson
In-Reply-To: <pull.2114.git.1778777491939.gitgitgadget@gmail.com>
On 5/14/2026 12:51 PM, Kristofer Karlsson via GitGitGadget wrote:
> From: Kristofer Karlsson <krka@spotify.com>
>
> limit_list() maintains a date-sorted work queue of commits using a
> linked list with commit_list_insert_by_date() for insertion. Each
> insertion walks the list to find the right position — O(n) per insert.
> In repositories with merge-heavy histories, the symmetric difference
> can contain thousands of commits, making this O(n) insertion the
> dominant cost.
Linear operations are bad, especially when multiplied by linear-ish
loops, causing quadratic behavior.
> Replace the sorted linked list with a prio_queue (binary heap). This
> gives O(log n) insertion and O(log n) extraction instead of O(n)
> insertion and O(1) extraction, which is a net win when the queue is
> large.
Yes, much better.
> The still_interesting() and everybody_uninteresting() helpers are
> updated to scan the prio_queue's contiguous array instead of walking a
> linked list. process_parents() already accepts both a commit_list and
> a prio_queue parameter, so the change in limit_list() simply switches
> which one is passed.
>
> Benchmark: git rev-list --left-right --count HEAD~N...HEAD
> Repository: 2.3M commits, merge-heavy DAG (monorepo)
> Best of 5 runs, times in seconds:
>
> commits in
> symmetric diff baseline patched speedup
> -------------- -------- ------- -------
> 10 0.01 0.01 1.0x
> 50 0.01 0.01 1.0x
> 3751 21.23 8.49 2.5x
> 4524 21.70 8.29 2.6x
> 10130 20.10 6.65 3.0x
>
> No change for small traversals; 2.5-3.0x faster when the queue grows
> to thousands of commits.
This is good. Is there any chance that you could demonstrate this with
any commits in the Git repo? It does have some interesting behavior,
especially around point releases that are independent from the 'master'
branch and thus could have lopsided symmetric differences using well-
established tag names.
> Switching to a prio_queue (binary heap) reduces insertion cost to O(log
> w), bringing total cost to O(N·log w). The practical result on the same
> repository:
>
> commits in
> symmetric diff before after speedup
> -------------- -------- ------- -------
> 3751 21.2s 8.5s 2.5x
> 4524 21.7s 8.3s 2.6x
> 10130 20.1s 6.6s 3.0x
Very nice! I notice that this data is in your cover letter, but
not the commit message. Is that intentional?
> This affects any command that triggers limit_list() — i.e., when
> revs->limited is set — including --left-right, --cherry-mark,
> --cherry-pick, --ancestry-path, bisect, and rebase's fork-point
> computation. The practical trigger is git status --ahead-behind on a
> branch that has diverged from upstream in a merge-heavy repository.
This also impacts 'git log --graph' when there is no serialized
commit-graph file. We are still using limit_list() in that case.
> The change is minimal (+21/−17 lines, single file) because
> process_parents() already accepts both a commit_list and a prio_queue
> parameter — limit_list() just switches which one it passes.
The key logic is turning the initial list into the starting
points for the priority queue and everything else is about
moving types around, it seems.
> @@ -1451,6 +1447,7 @@ static int limit_list(struct rev_info *revs)
> struct commit_list *newlist = NULL;
> struct commit_list **p = &newlist;
> struct commit *interesting_cache = NULL;
> + struct prio_queue queue = { .compare = compare_commits_by_commit_date };
Here, we are _not_ using generation numbers, which is correct
for this case because we are matching the date-based sorting
of the previous list.
> while (original_list) {
> struct commit *commit = pop_commit(&original_list);
> + prio_queue_put(&queue, commit);
> + }
> +
> + while (queue.nr) {
> + struct commit *commit = prio_queue_get(&queue);
> struct object *obj = &commit->object;
This is a fun reuse of lines to take the old "drain the
list as it is being mutated" loop and turn it into "fill
the priority queue" and "drain the priority queue as it
is being mutated"
This code change looks good. No new tests are needed, since
this is a performance-only change. Do any of the tests in
t/perf/ demonstrate this improvement?
Thanks,
-Stolee
^ permalink raw reply
* Re: [PATCH 1/2] strbuf: use st_add3() in strbuf_grow()
From: René Scharfe @ 2026-05-14 20:13 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Git List
In-Reply-To: <xmqqo6ihg690.fsf@gitster.g>
On 5/14/26 9:07 PM, Junio C Hamano wrote:
> René Scharfe <l.s.r@web.de> writes:
>
>> Simplify the code by calling st_add3() to do overflow checks instead of
>> open-coding it. This changes the error message to include the offending
>> summands, which can be helpful when tracking down the cause.
>>
>> Signed-off-by: René Scharfe <l.s.r@web.de>
>> ---
>> strbuf.c | 5 +----
>> 1 file changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/strbuf.c b/strbuf.c
>> index 3e04addc22..bb04d3910e 100644
>> --- a/strbuf.c
>> +++ b/strbuf.c
>> @@ -106,12 +106,9 @@ void strbuf_attach(struct strbuf *sb, void *buf, size_t len, size_t alloc)
>> void strbuf_grow(struct strbuf *sb, size_t extra)
>> {
>> int new_buf = !sb->alloc;
>> - if (unsigned_add_overflows(extra, 1) ||
>> - unsigned_add_overflows(sb->len, extra + 1))
>> - die("you want to use way too much memory");
>> if (new_buf)
>> sb->buf = NULL;
>> - ALLOC_GROW(sb->buf, sb->len + extra + 1, sb->alloc);
>> + ALLOC_GROW(sb->buf, st_add3(sb->len, extra, 1), sb->alloc);
>
> ALLOC_GROW() being a macro that references its second argument three
> times, doesn't this rewrite rely on the compiler being clever enough
> to notice that checking for the same overflow three times is
> pointless and does it only once? I guess the original has the same
> issue already, so this may not be so bad but it makes me feel a bit
> queasy.
As long as it has no side-effect (as is the case for addition) and we
keep compiler optimization enabled we should be fine.
I guess the reason for having multiple references as opposed to
loading the value once into a private variable is to support arbitrary
types. In the end it needs to fit into a size_t, though. Something
like this could bring the reference count down to one:
--- 8< ---
diff --git a/git-compat-util.h b/git-compat-util.h
index ae1bdc90a4..ca89cfb0b3 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -812,6 +812,15 @@ static inline void move_array(void *dst, const void *src, size_t n, size_t size)
#define alloc_nr(x) (((x)+16)*3/2)
+static inline bool st_alloc_nr(size_t nr, size_t alloc, size_t *out)
+{
+ if (nr > alloc) {
+ *out = alloc_nr(alloc) < nr ? nr : alloc_nr(alloc);
+ return true;
+ }
+ return false;
+}
+
/**
* Dynamically growing an array using realloc() is error prone and boring.
*
@@ -857,12 +866,10 @@ static inline void move_array(void *dst, const void *src, size_t n, size_t size)
*/
#define ALLOC_GROW(x, nr, alloc) \
do { \
- if ((nr) > alloc) { \
- if (alloc_nr(alloc) < (nr)) \
- alloc = (nr); \
- else \
- alloc = alloc_nr(alloc); \
- REALLOC_ARRAY(x, alloc); \
+ size_t alloc_grow_new_alloc_; \
+ if (st_alloc_nr((nr), (alloc), &alloc_grow_new_alloc_)) { \
+ alloc = alloc_grow_new_alloc_; \
+ REALLOC_ARRAY(x, alloc_grow_new_alloc_); \
} \
} while (0)
--- >8 ---
Hmm, alloc_nr() doesn't do any overflow checking. It should, though,
shouldn't it?
René
^ permalink raw reply related
* Re: [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang
From: René Scharfe @ 2026-05-14 20:17 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Git List, Jeff King
In-Reply-To: <xmqqjyt5g5zr.fsf@gitster.g>
On 5/14/26 9:12 PM, Junio C Hamano wrote:
> René Scharfe <l.s.r@web.de> writes:
>
>> Provide a variant of st_add() that wraps __builtin_add_overflow() to
>> help Clang optimize it. Use it on all platforms for simplicity.
>> ...
>> +/* Help Clang; GCC generates the same code for both variants. */
>> +#if defined(__clang__)
>> +static inline size_t st_add(size_t a, size_t b)
>> +{
>> + size_t sum;
>> + if (__builtin_add_overflow(a, b, &sum))
>> + die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
>> + (uintmax_t)a, (uintmax_t)b);
>> + return sum;
>> +}
>> +#else
>> static inline size_t st_add(size_t a, size_t b)
>> {
>> if (unsigned_add_overflows(a, b))
>> @@ -621,6 +632,7 @@ static inline size_t st_add(size_t a, size_t b)
>> (uintmax_t)a, (uintmax_t)b);
>> return a + b;
>> }
>> +#endif
>
> Makes me wonder if we tweaked unsigned_add_overflows() to take an
> extra *dst parameter to match __builtin_add_overflow(), which of
> course requires us to all of 18 callsites, it might make the whole
> thing a bit simpler. New uses of unsigned_add_overflows(), if we
> ever add them, would automatically benefit, right?
Hmm. It sounds like a lot of churn, but it would make sure that
we use the checked result and not check a + b and then go on and
use x + y because the code de-synced at some point.
How to do it, though? It needs to be generic and evaluate its
arguments only once. Perhaps like this?
diff --git a/git-compat-util.h b/git-compat-util.h
index ca89cfb0b3..27fbb622d7 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -103,6 +103,21 @@ struct strbuf;
#define unsigned_add_overflows(a, b) \
((b) > maximum_unsigned_value_of_type(a) - (a))
+static bool uint_add_overflow(uintmax_t a, uintmax_t b,
+ uintmax_t *out, size_t out_size)
+{
+ if (b > UINTMAX_MAX - a)
+ return true;
+ a += b;
+ if (a > (UINTMAX_MAX >> (bitsizeof(uintmax_t) - CHAR_BIT * out_size)))
+ return true;
+ *out = a;
+ return false;
+}
+
+#define UINT_ADD_OVERFLOW(a, b, out) \
+ uint_add_overflow((a), (b), (out), sizeof(a))
+
/*
* Returns true if the multiplication of "a" and "b" will
* overflow. The types of "a" and "b" must match and must be unsigned.
@@ -616,10 +631,11 @@ int git_open_cloexec(const char *name, int flags);
static inline size_t st_add(size_t a, size_t b)
{
- if (unsigned_add_overflows(a, b))
+ size_t ret;
+ if (UINT_ADD_OVERFLOW(a, b, &ret))
die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
(uintmax_t)a, (uintmax_t)b);
- return a + b;
+ return ret;
}
#define st_add3(a,b,c) st_add(st_add((a),(b)),(c))
#define st_add4(a,b,c,d) st_add(st_add3((a),(b),(c)),(d))
^ permalink raw reply related
* Re: [PATCH v3 2/4] approxidate: alias "today" to "now"
From: Tuomas Ahola @ 2026-05-14 21:07 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Jeff King
In-Reply-To: <xmqqwlx6f1fo.fsf@gitster.g>
Junio C Hamano <gitster@pobox.com> wrote:
> Tuomas Ahola <taahol@utu.fi> writes:
>
> > Sorry, I don't know if I understood. Does the patch change the behavior of
> > that command somehow? Is there some kind of edge case I missed?
>
> No, I did not think it was a good idea to carve the behaviour in
> stone that "git log --since=today" behaves as if it were given "git
> log --since=now". My reaction would have been very different if we
> were deliberatly and explicitly saying "today is synonym for now",
> but the thing is, it is not a designed behaviour but what
> approxidate does for anything it does not understand, e.g.
>
> git log --since=decay
> git log --since=bogus
>
> all behave as if it were given --since=now.
Thanks for spelling that out. So, as there is no deliberative
decision behind the current behaviour of "today", the code has
to remain non-committed on that; we are not at liberty to codify
the status quo. Right? But perhaps we can find a least common
denominator. "Today" means the current day (well, obviously),
and this seems to be enough to get "today at noon" and such to work:
```
static void date_today(struct tm *tm, struct tm *now, int *num)
{
*num = 0;
tm->tm_mday = now->tm_mday;
}
```
It gets the job done whitout setting in stone too much anything
that is up for debate. Do you see that could work?
--Tuomas
^ permalink raw reply
* Re: [PATCH] config: suggest the correct form when key contains "="
From: Junio C Hamano @ 2026-05-14 21:26 UTC (permalink / raw)
To: Harald Nordgren via GitGitGadget; +Cc: git, Harald Nordgren
In-Reply-To: <pull.2302.git.git.1778680725459.gitgitgadget@gmail.com>
"Harald Nordgren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Harald Nordgren <haraldnordgren@gmail.com>
>
> When a user types "git config foo.bar=baz", git_config_parse_key()
> rejects the key with "error: invalid key: foo.bar=baz" but gives no
> indication of what the user should have written. The mistake is a
> common one for users who reach for INI-file syntax or for the
> "--flag=value" convention used by other command-line tools.
>
> Since "=" is never a valid character in a config key, treat its
> presence as a strong signal of this specific mistake and follow the
> error with a one-line suggestion in the "(did you mean ...)" style
> used elsewhere in git, e.g.:
>
> $ git config pull.rebase=false
> error: invalid key: pull.rebase=false
> (did you mean "git config set pull.rebase false"?)
If the command line were
git config get foo.bar=baz
git config set foo.bar=baz nitfol
we shouldn't give an extra "did you mean?" at all.
The only cases you may want to do the "did you mean?" I think are
git config foo.bar=baz
git config set foo.bar=baz
And I think git_config_parse_key() is at a way too low level to tell
in what context we are seeing this faulty key to guess end-user's
intention to limit our "did you mean?"
I also wonder if, given that "=" in anywhere other than three-level
names, is invalid, we should just start accept
git config foo.bar=baz
git config set foo.bar=baz
and interpret them as
git config set foo.bar baz
We of course need to be careful about non-invalid keys, i.e.
git config foo.bar=baz.boo
is a request to read the value of that named variable, i.e.
[foo "bar=baz"]
boo = its value
so either you start offering unsolicited "did you mean?" or accepting
tokens with '=' in them as new style "set", you need to be extra
careful not to trigger a false positive.
^ permalink raw reply
* Re: [PATCH 1/3] daemon: fix IPv6 address corruption in lookup_hostname()
From: Junio C Hamano @ 2026-05-14 21:26 UTC (permalink / raw)
To: Sebastien Tardif via GitGitGadget; +Cc: git, Sebastien Tardif
In-Reply-To: <b2d81438117a716417a031c74b678a8f91701af4.1778773592.git.gitgitgadget@gmail.com>
"Sebastien Tardif via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Sebastien Tardif <sebtardif@ncf.ca>
>
> getaddrinfo() is called with AF_UNSPEC hints, so it may return IPv6
> results. However, the code unconditionally casts ai_addr to
> sockaddr_in and passes AF_INET to inet_ntop(). On IPv6-only hosts,
> this reads from the wrong struct offset, producing garbage IP
> addresses.
>
> Fix this by checking ai_family and extracting the address pointer
> into a local variable before calling inet_ntop() once with the
> correct family. Die on unexpected address families.
>
> Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
> ---
> daemon.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/daemon.c b/daemon.c
> index 0a7b1aae44..80fa0226d8 100644
> --- a/daemon.c
> +++ b/daemon.c
> @@ -674,9 +674,20 @@ static void lookup_hostname(struct hostinfo *hi)
>
> gai = getaddrinfo(hi->hostname.buf, NULL, &hints, &ai);
> if (!gai) {
> - struct sockaddr_in *sin_addr = (void *)ai->ai_addr;
> + void *addr;
> +
> + if (ai->ai_family == AF_INET) {
> + struct sockaddr_in *sa = (void *)ai->ai_addr;
> + addr = &sa->sin_addr;
> + } else if (ai->ai_family == AF_INET6) {
> + struct sockaddr_in6 *sa6 = (void *)ai->ai_addr;
> + addr = &sa6->sin6_addr;
> + } else {
> + die("unexpected address family: %d",
> + ai->ai_family);
> + }
The previous iteration used to more explicitly cast ai->ai_addr to
the target type, but the use of (void *) here is a cute way to make
the result shorter, which makes it a bit easier to read (it may take
readers a bit of practice to convince themselves that this type
conversion using (void *) as an intermediate type is perfectly fine,
though).
>
> - inet_ntop(AF_INET, &sin_addr->sin_addr,
> + inet_ntop(ai->ai_family, addr,
> addrbuf, sizeof(addrbuf));
> strbuf_addstr(&hi->ip_address, addrbuf);
^ permalink raw reply
* Re: [PATCH] trailer: change strbuf in-place in unfold_value()
From: Ramsay Jones @ 2026-05-14 21:30 UTC (permalink / raw)
To: René Scharfe, Git List; +Cc: Jeff King
In-Reply-To: <9629b0c1-b28f-4cd2-8d59-67d909ca9052@web.de>
On 14/05/2026 7:40 pm, René Scharfe wrote:
> Avoid an allocation by doing s/\n\s*/ /g (replacing NL and any following
> whitespace with a SP) right in the strbuf instead of copying the result
> to a temporary one and swapping them in the end. We can safely do that
> because the replacement is never longer than the original string.
>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
> Formatted with --function-context for easier review.
> Inspired by https://lore.kernel.org/git/20260513185408.GA147423@coredump.intra.peff.net/
>
> trailer.c | 16 ++++++----------
> 1 file changed, 6 insertions(+), 10 deletions(-)
>
> diff --git a/trailer.c b/trailer.c
> index 470f86a4a2..b89fa12fe7 100644
> --- a/trailer.c
> +++ b/trailer.c
> @@ -988,29 +988,25 @@ static int ends_with_blank_line(const char *buf, size_t len)
>
> static void unfold_value(struct strbuf *val)
> {
> - struct strbuf out = STRBUF_INIT;
> size_t i;
> + size_t pos = 0;
>
> - strbuf_grow(&out, val->len);
> i = 0;
> while (i < val->len) {
> char c = val->buf[i++];
> if (c == '\n') {
> /* Collapse continuation down to a single space. */
> while (i < val->len && isspace(val->buf[i]))
> i++;
> - strbuf_addch(&out, ' ');
> - } else {
> - strbuf_addch(&out, c);
> + val->buf[pos++] = ' ';
> + } else if (pos != i) {
Hmm, isn't 'pos' strictly (always) less than 'i' here? (note the post update
of 'i' when setting 'c' at the head of the loop).
> + val->buf[pos++] = c;
So, this (non-newline-or-'trailing'-space char) is always copied.
Not that it matters much (depending on how long the first line is, I doubt
the difference is measurable :) ).
[Unless I'm not reading it correctly, of course - in which case, oops!]
ATB,
Ramsay Jones
> }
> }
> + strbuf_setlen(val, pos);
>
> /* Empty lines may have left us with whitespace cruft at the edges */
> - strbuf_trim(&out);
> -
> - /* output goes back to val as if we modified it in-place */
> - strbuf_swap(&out, val);
> - strbuf_release(&out);
> + strbuf_trim(val);
> }
>
> static struct trailer_block *trailer_block_new(void)
^ permalink raw reply
* [PATCH] fetch: add fetch.pruneLocalBranches config
From: Harald Nordgren @ 2026-05-14 22:16 UTC (permalink / raw)
To: gitster; +Cc: git, gitgitgadget, haraldnordgren
In-Reply-To: <xmqqqzndel8c.fsf@gitster.g>
> I also wonder if, given that "=" in anywhere other than three-level
> names, is invalid, we should just start accept
>
> git config foo.bar=baz
> git config set foo.bar=baz
>
> and interpret them as
>
> git config set foo.bar baz
That sounds good too! Probably even better.
Harald
^ permalink raw reply
* Re: [PATCH v3 2/4] approxidate: alias "today" to "now"
From: Junio C Hamano @ 2026-05-15 1:27 UTC (permalink / raw)
To: Tuomas Ahola; +Cc: git, Jeff King
In-Reply-To: <20260514210742.Yc6NZ%taahol@utu.fi>
Tuomas Ahola <taahol@utu.fi> writes:
> Junio C Hamano <gitster@pobox.com> wrote:
>
>> Tuomas Ahola <taahol@utu.fi> writes:
>>
>> > Sorry, I don't know if I understood. Does the patch change the behavior of
>> > that command somehow? Is there some kind of edge case I missed?
>>
>> No, I did not think it was a good idea to carve the behaviour in
>> stone that "git log --since=today" behaves as if it were given "git
>> log --since=now". My reaction would have been very different if we
>> were deliberatly and explicitly saying "today is synonym for now",
>> but the thing is, it is not a designed behaviour but what
>> approxidate does for anything it does not understand, e.g.
>>
>> git log --since=decay
>> git log --since=bogus
>>
>> all behave as if it were given --since=now.
>
> Thanks for spelling that out. So, as there is no deliberative
> decision behind the current behaviour of "today", the code has
> to remain non-committed on that; we are not at liberty to codify
> the status quo. Right?
Not right. It is more like "Even though we try not to change
existing behavoiur left and right without a good reason to avoid
breaking existing users' expectations, we should be able to "fix"
what is not intended behaviour but is something the code happened to
be doing, especially if the current behaviour does not make sense.
^ permalink raw reply
* Re: [PATCH] fetch: add fetch.pruneLocalBranches config
From: Junio C Hamano @ 2026-05-15 1:28 UTC (permalink / raw)
To: Harald Nordgren; +Cc: git, gitgitgadget
In-Reply-To: <20260514221602.9918-1-haraldnordgren@gmail.com>
Harald Nordgren <haraldnordgren@gmail.com> writes:
>> I also wonder if, given that "=" in anywhere other than three-level
>> names, is invalid, we should just start accept
>>
>> git config foo.bar=baz
>> git config set foo.bar=baz
>>
>> and interpret them as
>>
>> git config set foo.bar baz
>
> That sounds good too! Probably even better.
>
>
> Harald
Why do I get the above, which apparently is a response to my review
for
[PATCH] config: suggest the correct form when key contains "="
under this thread? Am I dealing with some sort of mechanical slop?
^ permalink raw reply
* Re: [PATCH v3 2/4] approxidate: alias "today" to "now"
From: Junio C Hamano @ 2026-05-15 1:38 UTC (permalink / raw)
To: Tuomas Ahola; +Cc: git, Jeff King
In-Reply-To: <xmqqik8pea39.fsf@gitster.g>
Junio C Hamano <gitster@pobox.com> writes:
> Tuomas Ahola <taahol@utu.fi> writes:
>
>> Junio C Hamano <gitster@pobox.com> wrote:
>>
>>> Tuomas Ahola <taahol@utu.fi> writes:
>>>
>>> > Sorry, I don't know if I understood. Does the patch change the behavior of
>>> > that command somehow? Is there some kind of edge case I missed?
>>>
>>> No, I did not think it was a good idea to carve the behaviour in
>>> stone that "git log --since=today" behaves as if it were given "git
>>> log --since=now". My reaction would have been very different if we
>>> were deliberatly and explicitly saying "today is synonym for now",
>>> but the thing is, it is not a designed behaviour but what
>>> approxidate does for anything it does not understand, e.g.
>>>
>>> git log --since=decay
>>> git log --since=bogus
>>>
>>> all behave as if it were given --since=now.
>>
>> Thanks for spelling that out. So, as there is no deliberative
>> decision behind the current behaviour of "today", the code has
>> to remain non-committed on that; we are not at liberty to codify
>> the status quo. Right?
>
> Not right. It is more like "Even though we try not to change
> existing behavoiur left and right without a good reason to avoid
> breaking existing users' expectations, we should be able to "fix"
> what is not intended behaviour but is something the code happened to
> be doing, especially if the current behaviour does not make sense.
And the other half of the discussion is that once we explicitly say
"today means right now" and make it official, it makes it much harder
to fix it later. So we need to be very careful in our first attempt.
^ permalink raw reply
* Re: What's cooking in git.git (May 2026, #03)
From: Junio C Hamano @ 2026-05-15 1:46 UTC (permalink / raw)
To: Pushkar Singh; +Cc: git
In-Reply-To: <CALE2CrTea19qHKbhQK8V+uQJgh5GdT+8ia1q2jwr+hf546fnaQ@mail.gmail.com>
Pushkar Singh <pushkarkumarsingh1970@gmail.com> writes:
> My thinking was mainly that git stash show normally omits untracked
> changes, while --include-untracked consults the additional untracked
> parent of the stash commit.
>
> I did not see existing coverage specifically checking that behavior,
Ah, OK. Please more explicitly state that it is filling a gap in
test coverage in the proposed log message; it would have helped to
sell the patch better. Adding even more when we already have
adequate coverage is one thing, covering the cases we had no
coverage is totally different matter.
Thanks.
^ permalink raw reply
* Re: [PATCH 1/4] strbuf: add strbuf_add_uint()
From: Jeff King @ 2026-05-15 3:53 UTC (permalink / raw)
To: René Scharfe; +Cc: git
In-Reply-To: <f51cdd89-dab1-44f3-8f63-7d34f6fbbba5@web.de>
On Thu, May 14, 2026 at 01:09:24PM +0200, René Scharfe wrote:
> > And btw, one final thing to look at if you are interested in
> > micro-optimizing strbufs: using intrinsics for overflow detection.
> >
> > Right now we use unsigned_add_overflows(), and then do the actual add.
> > Using __builtin_add_overflow() might be faster.
> Curious. Clang and GCC emit the same instructions for our
> unsigned_add_overflows() vs. __builtin_add_overflow() on x64, but clang
> on ARM64 fails to elide the comparison: https://godbolt.org/z/91d35KofM
Ah, neat. I always assumed there was low-hanging fruit to pick here, but
it sounds like the compiler is (usually) more clever than I expected.
-Peff
^ permalink raw reply
* Re: [PATCH v4 0/7] odb: add write operation to ODB transaction interface
From: Jeff King @ 2026-05-15 3:56 UTC (permalink / raw)
To: Justin Tobler; +Cc: git, ps, gitster
In-Reply-To: <20260514183740.1505171-1-jltobler@gmail.com>
On Thu, May 14, 2026 at 01:37:33PM -0500, Justin Tobler wrote:
> Changes since V3:
> - Fixed leak due to an fd not being closed when exiting prior to
> close_loose_object() being invoked.
> [...]
> 3: 11321ad607 ! 3: d53ad95712 odb: update `struct odb_write_stream` read() callback
> @@ object-file.c: int odb_source_loose_write_stream(struct odb_source *source,
> + ssize_t read_len = odb_write_stream_read(in_stream, buf,
> + sizeof(buf));
> + if (read_len < 0) {
> ++ close(fd);
> + err = -1;
> + goto cleanup;
> + }
This fix looks good to me (and I think is the best way to write it,
given the rest of the function).
I briefly wondered whether callers might care about errno being
preserved, but I couldn't find any indication that they do.
-Peff
^ permalink raw reply
* Re: [PATCH] revision: use priority queue in limit_list()
From: Jeff King @ 2026-05-15 4:16 UTC (permalink / raw)
To: Kristofer Karlsson via GitGitGadget; +Cc: git, Kristofer Karlsson
In-Reply-To: <pull.2114.git.1778777491939.gitgitgadget@gmail.com>
On Thu, May 14, 2026 at 04:51:31PM +0000, Kristofer Karlsson via GitGitGadget wrote:
> @@ -1451,6 +1447,7 @@ static int limit_list(struct rev_info *revs)
> struct commit_list *newlist = NULL;
> struct commit_list **p = &newlist;
> struct commit *interesting_cache = NULL;
> + struct prio_queue queue = { .compare = compare_commits_by_commit_date };
>
> if (revs->ancestry_path_implicit_bottoms) {
> collect_bottom_commits(original_list,
> @@ -1461,6 +1458,11 @@ static int limit_list(struct rev_info *revs)
>
> while (original_list) {
> struct commit *commit = pop_commit(&original_list);
> + prio_queue_put(&queue, commit);
> + }
> +
> + while (queue.nr) {
> + struct commit *commit = prio_queue_get(&queue);
Here we push the whole starting list into the prio-queue, which will let
us pull the commits out in date order. But is the incoming list always
in date order?
If revs->unsorted_input, then we don't sort the initial list. So we'd
now see the commits in a different order, and put them onto newlist in
that different order.
I _think_ it may not matter because we don't call limit_list() when
revs->no_walk is set, and we only have revs->unsorted_input when no_walk
is also set. If that wasn't true, it would get weird when limit_list()
calls process_parents(), which uses commit_list_insert_by_date().
I was on the lookout for this issue particularly because I have another
patch which converts revs.commits to a prio_queue totally. And I
remember running into issues (and the solution is that sometimes the
prio_queue has a NULL comparator and acts like a LIFO queue). But if my
analysis is right above, we can ignore that for now. And if we
eventually move to revs.commits as a prio_queue, then it will just slot
in nicely here (we can drop the queue generation step and just use it
directly).
The rest of the patch looks as I'd expect from what my other patch does.
-Peff
^ permalink raw reply
* Re: [PATCH 1/2] strbuf: use st_add3() in strbuf_grow()
From: Jeff King @ 2026-05-15 4:36 UTC (permalink / raw)
To: René Scharfe; +Cc: Junio C Hamano, Git List
In-Reply-To: <0c3b4e94-b56c-4c92-a4d8-0e4364f1257b@web.de>
On Thu, May 14, 2026 at 10:13:19PM +0200, René Scharfe wrote:
> Hmm, alloc_nr() doesn't do any overflow checking. It should, though,
> shouldn't it?
Yes, probably. It's a known blind spot in the overflow checking, but
I think is OK in practice because:
1. We are growing an existing buffer by ~3/2. So even with ordering
the multiplication first, an overflow implies that you have a
single buffer consuming ~1/3 of your address space.
On 64-bit systems that's impractically large, and on 32-bit systems I
think you generally run into fragmentation and address-space issues
first.
2. If alloc_nr(alloc) is less than the desired nr, we just use that nr
directly. So even if we did overflow, I think the result is
too-slow allocation, and not a buffer overflow.
But it would be nice to be less hand-wavy. One of the reasons I hadn't
dug into it further is that I wanted to start making use of intrinsics
to avoid slowdowns. But since you're already doing that (and finding
that the compiler was doing the fast thing anyway!) it might be a good
time to make the jump.
That's all assuming that no overflow happens before ALLOC_GROW() gets
the values. We also tend to do unchecked computions for the "nr" field
there, but it's usually just "nr_foo + 1", so the same logic applies:
you'd have to have an existing array consuming the entire address space
minus one byte to trigger an overflow.
-Peff
^ permalink raw reply
* Re: [PATCH 2/2] use __builtin_add_overflow() in st_add() with Clang
From: Jeff King @ 2026-05-15 4:40 UTC (permalink / raw)
To: René Scharfe; +Cc: Git List
In-Reply-To: <0ded6062-f66a-4713-af24-d1b5aa654823@web.de>
On Thu, May 14, 2026 at 05:13:46PM +0200, René Scharfe wrote:
> Clang and GCC optimize away comparisons of overflow checks by checking
> the carry flag on x64. GCC does the same on ARM64, but Clang currently
> (version 22.1) doesn't.
>
> Provide a variant of st_add() that wraps __builtin_add_overflow() to
> help Clang optimize it. Use it on all platforms for simplicity.
OK. I probably would have just used the intrinsic everywhere with
__GNUC__, but if gcc is already figuring it out, it doesn't matter in
practice.
> +/* Help Clang; GCC generates the same code for both variants. */
> +#if defined(__clang__)
> +static inline size_t st_add(size_t a, size_t b)
> +{
> + size_t sum;
> + if (__builtin_add_overflow(a, b, &sum))
> + die("size_t overflow: %"PRIuMAX" + %"PRIuMAX,
> + (uintmax_t)a, (uintmax_t)b);
> + return sum;
> +}
> +#else
> static inline size_t st_add(size_t a, size_t b)
> {
> if (unsigned_add_overflows(a, b))
It's a shame we can't share more code here, especially the die message.
I guess the ideal primitive is probably a wrapper with the same
interface as __builtin_add_overflow(), which could then be used
everywhere that unsigned_add_overflows() with some minor conversion.
But it gets awkward to do as a macro, and using an inline function runs
into type questions.
-Peff
^ permalink raw reply
* Re: [PATCH] trailer: change strbuf in-place in unfold_value()
From: Jeff King @ 2026-05-15 4:44 UTC (permalink / raw)
To: Ramsay Jones; +Cc: René Scharfe, Git List
In-Reply-To: <a4da346d-3800-40ea-8828-970b15088bf3@ramsayjones.plus.com>
On Thu, May 14, 2026 at 10:30:37PM +0100, Ramsay Jones wrote:
> > i = 0;
> > while (i < val->len) {
> > char c = val->buf[i++];
> > if (c == '\n') {
> > /* Collapse continuation down to a single space. */
> > while (i < val->len && isspace(val->buf[i]))
> > i++;
> > - strbuf_addch(&out, ' ');
> > - } else {
> > - strbuf_addch(&out, c);
> > + val->buf[pos++] = ' ';
> > + } else if (pos != i) {
>
> Hmm, isn't 'pos' strictly (always) less than 'i' here? (note the post update
> of 'i' when setting 'c' at the head of the loop).
>
> > + val->buf[pos++] = c;
>
> So, this (non-newline-or-'trailing'-space char) is always copied.
>
> Not that it matters much (depending on how long the first line is, I doubt
> the difference is measurable :) ).
>
> [Unless I'm not reading it correctly, of course - in which case, oops!]
Yeah, I think you're right. If it were a for-loop which incremented "i"
at the end then the comparison could make sense. But even then, I think
usually in such modify-in-place loops we don't bother trying to skip
self-assignment (e.g., see remove_space() in builtin/patch-id.c). In
practice I don't know which is worse: the extra branch or a pointless
memory store.
-Peff
^ permalink raw reply
* Re: [PATCH] trailer: change strbuf in-place in unfold_value()
From: Jeff King @ 2026-05-15 4:47 UTC (permalink / raw)
To: René Scharfe; +Cc: Git List
In-Reply-To: <9629b0c1-b28f-4cd2-8d59-67d909ca9052@web.de>
On Thu, May 14, 2026 at 08:40:56PM +0200, René Scharfe wrote:
> Avoid an allocation by doing s/\n\s*/ /g (replacing NL and any following
> whitespace with a SP) right in the strbuf instead of copying the result
> to a temporary one and swapping them in the end. We can safely do that
> because the replacement is never longer than the original string.
>
> [...]
>
> Inspired by https://lore.kernel.org/git/20260513185408.GA147423@coredump.intra.peff.net/
Cute. Modulo the issue raised by Ramsay, this looks correct to me. In
the discussion you referenced I was mostly expecting people to find
spots where the solution would be to just remove the strbuf_grow() call.
This one is quite a bit trickier, and I am glad to have somebody careful
looking at it. ;)
-Peff
^ permalink raw reply
* Re: [PATCH v3 2/4] approxidate: alias "today" to "now"
From: Tuomas Ahola @ 2026-05-15 5:02 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Jeff King
In-Reply-To: <xmqqa4u1e9k9.fsf@gitster.g>
Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Tuomas Ahola <taahol@utu.fi> writes:
> >
> >> Junio C Hamano <gitster@pobox.com> wrote:
> >>
> >>> Tuomas Ahola <taahol@utu.fi> writes:
> >>>
> >>> > Sorry, I don't know if I understood. Does the patch change the behavior of
> >>> > that command somehow? Is there some kind of edge case I missed?
> >>>
> >>> No, I did not think it was a good idea to carve the behaviour in
> >>> stone that "git log --since=today" behaves as if it were given "git
> >>> log --since=now". My reaction would have been very different if we
> >>> were deliberatly and explicitly saying "today is synonym for now",
> >>> but the thing is, it is not a designed behaviour but what
> >>> approxidate does for anything it does not understand, e.g.
> >>>
> >>> git log --since=decay
> >>> git log --since=bogus
> >>>
> >>> all behave as if it were given --since=now.
> >>
> >> Thanks for spelling that out. So, as there is no deliberative
> >> decision behind the current behaviour of "today", the code has
> >> to remain non-committed on that; we are not at liberty to codify
> >> the status quo. Right?
> >
> > Not right. It is more like "Even though we try not to change
> > existing behavoiur left and right without a good reason to avoid
> > breaking existing users' expectations, we should be able to "fix"
> > what is not intended behaviour but is something the code happened to
> > be doing, especially if the current behaviour does not make sense.
Well, that's not far off from what I wrote. I just meant we cannot
make the current (somewhat accidental) behaviour official *just because*
it happens to be the status quo.
>
> And the other half of the discussion is that once we explicitly say "today
> means right now" and make it official, it makes it much harder to fix it
> later. So we need to be very careful in our first attempt.
Roger that.
^ permalink raw reply
* [PATCH] connected: close err_fd in promisor fast-path
From: Ethan via GitGitGadget @ 2026-05-15 6:39 UTC (permalink / raw)
To: git; +Cc: Ethan, Ethan Dickson
From: Ethan Dickson <ethanndickson@gmail.com>
connected.h documents that err_fd is closed before check_connected()
returns. It is, on three of four exit paths. The promisor-pack fast
path added in 50033772d (connected: verify promisor-ness of partial
clone, 2020-01-30) returns 0 without closing it.
receive-pack uses err_fd as the write end of an async sideband
muxer's pipe, and the muxer thread waits for EOF. The same omission
has caused deadlocks there twice before: 49ecfa13f (receive-pack:
close sideband fd on early pack errors, 2013-04-19) and 6cdad1f13
(receive-pack: fix deadlock when we cannot create tmpdir,
2017-03-07).
Signed-off-by: Ethan Dickson <ethanndickson@gmail.com>
---
connected: close err_fd in promisor fast-path
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2303%2Fethanndickson%2Fconnected-close-err-fd-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2303/ethanndickson/connected-close-err-fd-v1
Pull-Request: https://github.com/git/git/pull/2303
connected.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/connected.c b/connected.c
index 6718503649..7e26976832 100644
--- a/connected.c
+++ b/connected.c
@@ -76,6 +76,8 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
promisor_pack_found:
;
} while ((oid = fn(cb_data)) != NULL);
+ if (opt->err_fd)
+ close(opt->err_fd);
return 0;
}
base-commit: 94f057755b7941b321fd11fec1b2e3ca5313a4e0
--
gitgitgadget
^ permalink raw reply related
* Re: [PATCH] trailer: change strbuf in-place in unfold_value()
From: René Scharfe @ 2026-05-15 6:47 UTC (permalink / raw)
To: Ramsay Jones, Git List; +Cc: Jeff King
In-Reply-To: <a4da346d-3800-40ea-8828-970b15088bf3@ramsayjones.plus.com>
On 5/14/26 11:30 PM, Ramsay Jones wrote:
>
>> diff --git a/trailer.c b/trailer.c
>> index 470f86a4a2..b89fa12fe7 100644
>> --- a/trailer.c
>> +++ b/trailer.c
>> @@ -988,29 +988,25 @@ static int ends_with_blank_line(const char *buf, size_t len)
>>
>> static void unfold_value(struct strbuf *val)
>> {
>> - struct strbuf out = STRBUF_INIT;
>> size_t i;
>> + size_t pos = 0;
>>
>> - strbuf_grow(&out, val->len);
>> i = 0;
>> while (i < val->len) {
>> char c = val->buf[i++];
>> if (c == '\n') {
>> /* Collapse continuation down to a single space. */
>> while (i < val->len && isspace(val->buf[i]))
>> i++;
>> - strbuf_addch(&out, ' ');
>> - } else {
>> - strbuf_addch(&out, c);
>> + val->buf[pos++] = ' ';
>> + } else if (pos != i) {
>
> Hmm, isn't 'pos' strictly (always) less than 'i' here? (note the post update
> of 'i' when setting 'c' at the head of the loop).
Ah, yes, good find. Initially I used a for loop which incremented i
only at the end, but converted it back to minimize the patch and
forgot to adjust this comparison.
René
^ permalink raw reply
* Re: [PATCH 0/3] daemon: fix network address handling bugs
From: Patrick Steinhardt @ 2026-05-15 7:31 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Sebastien Tardif via GitGitGadget, git, Sebastien Tardif
In-Reply-To: <xmqqfr3tg5me.fsf@gitster.g>
On Fri, May 15, 2026 at 04:20:41AM +0900, Junio C Hamano wrote:
> "Sebastien Tardif via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > Fix three related issues in daemon.c's network address handling:
>
> Thanks for separating patches so that each of them addresses one
> specific issue.
>
> It would have been better if you sent this series as [PATCH v2] as a
> reply to <pull.2299.git.git.1778291290159.gitgitgadget@gmail.com>,
> which is the previous round. That way, the mailing list archive
> will keep the related discussions together on the same page. If we
> visit the page for the cover letter I am responding to,
>
> https://lore.kernel.org/git/pull.2300.git.git.1778773592.gitgitgadget@gmail.com/
>
> nobody can see that there was a previous iteration so those who
> looked at the earlier effort cannot refer back to it and compare.
True. Other than that though I'm happy with this iteration. Thanks!
Patrick
^ permalink raw reply
* [PATCH v2] trailer: change strbuf in-place in unfold_value()
From: René Scharfe @ 2026-05-15 7:33 UTC (permalink / raw)
To: Git List; +Cc: Jeff King, Ramsay Jones
In-Reply-To: <9629b0c1-b28f-4cd2-8d59-67d909ca9052@web.de>
Avoid an allocation by doing s/\n\s*/ /g (replacing NL and any following
whitespace with a SP) right in the strbuf instead of copying the result
to a temporary one and swapping them in the end. We can safely do that
because the replacement is never longer than the original string.
Signed-off-by: René Scharfe <l.s.r@web.de>
---
Formatted with --function-context for easier review.
Changes since v1:
- Removed always-true comparison.
trailer.c | 15 +++++----------
1 file changed, 5 insertions(+), 10 deletions(-)
diff --git a/trailer.c b/trailer.c
index 470f86a4a2..6d8ec7fa8d 100644
--- a/trailer.c
+++ b/trailer.c
@@ -988,29 +988,24 @@ static int ends_with_blank_line(const char *buf, size_t len)
static void unfold_value(struct strbuf *val)
{
- struct strbuf out = STRBUF_INIT;
size_t i;
+ size_t pos = 0;
- strbuf_grow(&out, val->len);
i = 0;
while (i < val->len) {
char c = val->buf[i++];
if (c == '\n') {
/* Collapse continuation down to a single space. */
while (i < val->len && isspace(val->buf[i]))
i++;
- strbuf_addch(&out, ' ');
- } else {
- strbuf_addch(&out, c);
+ c = ' ';
}
+ val->buf[pos++] = c;
}
+ strbuf_setlen(val, pos);
/* Empty lines may have left us with whitespace cruft at the edges */
- strbuf_trim(&out);
-
- /* output goes back to val as if we modified it in-place */
- strbuf_swap(&out, val);
- strbuf_release(&out);
+ strbuf_trim(val);
}
static struct trailer_block *trailer_block_new(void)
Interdiff against v1:
diff --git a/trailer.c b/trailer.c
index b89fa12fe7..6d8ec7fa8d 100644
--- a/trailer.c
+++ b/trailer.c
@@ -989,22 +989,21 @@ static int ends_with_blank_line(const char *buf, size_t len)
static void unfold_value(struct strbuf *val)
{
size_t i;
size_t pos = 0;
i = 0;
while (i < val->len) {
char c = val->buf[i++];
if (c == '\n') {
/* Collapse continuation down to a single space. */
while (i < val->len && isspace(val->buf[i]))
i++;
- val->buf[pos++] = ' ';
- } else if (pos != i) {
- val->buf[pos++] = c;
+ c = ' ';
}
+ val->buf[pos++] = c;
}
strbuf_setlen(val, pos);
/* Empty lines may have left us with whitespace cruft at the edges */
strbuf_trim(val);
}
--
2.54.0
^ permalink raw reply related
* Re: [PATCH] http: handle absolute-path alternates from server root
From: Patrick Steinhardt @ 2026-05-15 7:41 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git, slonkazoid
In-Reply-To: <20260513185825.GB147423@coredump.intra.peff.net>
On Wed, May 13, 2026 at 02:58:25PM -0400, Jeff King wrote:
> On Wed, May 13, 2026 at 10:10:54AM +0900, Junio C Hamano wrote:
>
> > Jeff King <peff@peff.net> writes:
> >
> > > ... Probably in a way that makes it totally invalid, but
> > > if you were very unlucky you could turn something like:
> > >
> > > http://victim.com.evil.domain:8000
> > >
> > > into:
> > >
> > > http://victim.com
> > >
> > > Which looks like the start of a redirect attack, except that
> > > the attacker could just have written "http://victim.com" in
> > > the first place! Either way we feed it to
> > > is_alternate_allowed(), which is where we check redirect and
> > > protocol rules.
> >
> > Yuck. I know I am the guilty party who introduced the dumb HTTP
> > walker but I wish we could kill it off after all these years. I did
> > not even recall that we supported the alternate object store in the
> > "protocol" until I saw this patch X-<.
>
> Me too. It's been the source of many obscure bugs, and I think a couple
> of vulnerabilities (even though clients never intend to use dumb clones
> in the first place).
>
> We talked about dropping it a few years ago, but Eric countered that
> dumb clones are easier on the server in some cases (like gigantic
> public-inbox repos that are packed to keep most of the old history in
> one big pack that is never updated). The verbatim pack-reuse feature
> tries to get smart clones closer to that, but it's hard to beat serving
> a static file from the server's perspective. I haven't measured anything
> in that area in a while, though.
In theory we can get much closer with packfile URIs, too, can't we? If
the packfiles are directly accessible anyway the server could just
announce these directly and have the client fetch them. That should
significantly reduce the load on the server even further.
Of course, the big downside is that "fetch.uriProtocols" is empty by
default, so Git will not use them. Makes me wonder whether this is
something we want to eventually change, but I guess the current default
behaviour is somewhat insecure as it would allow the server to redirect
clients to arbitrary locations. It would be great if we had a mechanism
that only allowed packfile URIs that use the same host, which would make
this a lot more reasonable to enable by default.
Patrick
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox