* [PATCH] refs/ref-cache: stop seeking into empty directories
@ 2025-09-24 7:55 Karthik Nayak
2025-09-24 12:28 ` Patrick Steinhardt
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Karthik Nayak @ 2025-09-24 7:55 UTC (permalink / raw)
To: git; +Cc: Karthik Nayak
The 'cache_ref_iterator_seek()' function is used to seek the
`ref_iterator` to the desired reference in the ref-cache mechanism. We
use the seeking functionality to implement the '--start-after' flag in
'git-for-each-ref(1)'.
When using the files-backend with packed-refs, it is possible that some
of the refs directories are empty. For e.g. just after repacking, the
'refs/heads' directory would be empty. The ref-cache seek mechanism
doesn't take this into consideration, causing SEGFAULT as we try to
access entries within the directory. Fix this by breaking out of the
loop when we enter an empty directory.
Add tests which simulate this behavior and also provide coverage over
using the feature over packed-refs.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
refs/ref-cache.c | 3 ++
t/t6302-for-each-ref-filter.sh | 65 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 68 insertions(+)
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index c180e0aad7..8a260028ec 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -507,6 +507,9 @@ static int cache_ref_iterator_seek(struct ref_iterator *ref_iterator,
slash = strchr(slash, '/');
len = slash ? (size_t)(slash - refname) : strlen(refname);
+ if (dir->nr == 0)
+ break;
+
for (idx = 0; idx < dir->nr; idx++) {
cmp = strncmp(refname, dir->entries[idx]->name, len);
if (cmp <= 0)
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index 9b80ea1e3b..d14567cb62 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -754,4 +754,69 @@ test_expect_success 'start after used with custom sort order' '
test_cmp expect actual
'
+test_expect_success 'start after with packed refs' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ git update-ref --stdin <<-EOF &&
+ create refs/heads/branch @
+ create refs/heads/side @
+ create refs/odd/spot @
+ create refs/tags/one @
+ create refs/tags/two @
+ commit
+ EOF
+
+ cat >expect <<-\EOF &&
+ refs/tags/default
+ refs/tags/one
+ refs/tags/two
+ EOF
+
+ git pack-refs --all &&
+ git for-each-ref --format="%(refname)" --start-after=refs/odd/spot >actual &&
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'start after with packed refs and some loose refs' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ git update-ref --stdin <<-EOF &&
+ create refs/heads/branch @
+ create refs/heads/side @
+ create refs/odd/spot @
+ create refs/tags/one @
+ create refs/tags/two @
+ commit
+ EOF
+
+ git pack-refs --all &&
+
+ git update-ref --stdin <<-EOF &&
+ create refs/heads/foo @
+ create refs/odd/tee @
+ commit
+ EOF
+
+ cat >expect <<-\EOF &&
+ refs/odd/tee
+ refs/tags/default
+ refs/tags/one
+ refs/tags/two
+ EOF
+
+
+ git for-each-ref --format="%(refname)" --start-after=refs/odd/spot >actual &&
+ test_cmp expect actual
+ )
+'
+
test_done
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] refs/ref-cache: stop seeking into empty directories
2025-09-24 7:55 [PATCH] refs/ref-cache: stop seeking into empty directories Karthik Nayak
@ 2025-09-24 12:28 ` Patrick Steinhardt
2025-09-25 8:20 ` Karthik Nayak
2025-09-25 9:15 ` [PATCH v2] refs/ref-cache: fix SEGFAULT when seeking in " Karthik Nayak
2025-10-01 12:17 ` [PATCH v3] " Karthik Nayak
2 siblings, 1 reply; 10+ messages in thread
From: Patrick Steinhardt @ 2025-09-24 12:28 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git
On Wed, Sep 24, 2025 at 09:55:51AM +0200, Karthik Nayak wrote:
> The 'cache_ref_iterator_seek()' function is used to seek the
> `ref_iterator` to the desired reference in the ref-cache mechanism. We
> use the seeking functionality to implement the '--start-after' flag in
> 'git-for-each-ref(1)'.
>
> When using the files-backend with packed-refs, it is possible that some
> of the refs directories are empty. For e.g. just after repacking, the
> 'refs/heads' directory would be empty. The ref-cache seek mechanism
> doesn't take this into consideration, causing SEGFAULT as we try to
> access entries within the directory.
Why do we even try to access any entry in an empty directory? We have
`dir->nr`, so shouldn't we check that `idx < dir->nr` every time we try
to deref an entry?
In other words, I wonder whether we fix a symptom of a missing bounds
check instead of addressing that missing bounds check as the root cause
directly.
> Fix this by breaking out of the
> loop when we enter an empty directory.
>
> Add tests which simulate this behavior and also provide coverage over
> using the feature over packed-refs.
Nit: the commit subject might be adjusted to mention the symptom we're
fixing rather than what we're doing in the commit. At least, it feels
like the segfault is the more important thing to talk about here.
> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
> ---
> refs/ref-cache.c | 3 ++
> t/t6302-for-each-ref-filter.sh | 65 ++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 68 insertions(+)
>
> diff --git a/refs/ref-cache.c b/refs/ref-cache.c
> index c180e0aad7..8a260028ec 100644
> --- a/refs/ref-cache.c
> +++ b/refs/ref-cache.c
> @@ -507,6 +507,9 @@ static int cache_ref_iterator_seek(struct ref_iterator *ref_iterator,
> slash = strchr(slash, '/');
> len = slash ? (size_t)(slash - refname) : strlen(refname);
>
> + if (dir->nr == 0)
> + break;
Can't we break before sorting already? Avoids a couple of no-op changes.
> for (idx = 0; idx < dir->nr; idx++) {
> cmp = strncmp(refname, dir->entries[idx]->name, len);
> if (cmp <= 0)
> diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
> index 9b80ea1e3b..d14567cb62 100755
> --- a/t/t6302-for-each-ref-filter.sh
> +++ b/t/t6302-for-each-ref-filter.sh
> @@ -754,4 +754,69 @@ test_expect_success 'start after used with custom sort order' '
> test_cmp expect actual
> '
>
> +test_expect_success 'start after with packed refs' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + (
> + cd repo &&
> + test_commit default &&
> +
> + git update-ref --stdin <<-EOF &&
s/EOF/\EOF/
The same is true for the other test.
Patrick
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] refs/ref-cache: stop seeking into empty directories
2025-09-24 12:28 ` Patrick Steinhardt
@ 2025-09-25 8:20 ` Karthik Nayak
0 siblings, 0 replies; 10+ messages in thread
From: Karthik Nayak @ 2025-09-25 8:20 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git
[-- Attachment #1: Type: text/plain, Size: 3435 bytes --]
Patrick Steinhardt <ps@pks.im> writes:
> On Wed, Sep 24, 2025 at 09:55:51AM +0200, Karthik Nayak wrote:
>> The 'cache_ref_iterator_seek()' function is used to seek the
>> `ref_iterator` to the desired reference in the ref-cache mechanism. We
>> use the seeking functionality to implement the '--start-after' flag in
>> 'git-for-each-ref(1)'.
>>
>> When using the files-backend with packed-refs, it is possible that some
>> of the refs directories are empty. For e.g. just after repacking, the
>> 'refs/heads' directory would be empty. The ref-cache seek mechanism
>> doesn't take this into consideration, causing SEGFAULT as we try to
>> access entries within the directory.
>
> Why do we even try to access any entry in an empty directory? We have
> `dir->nr`, so shouldn't we check that `idx < dir->nr` every time we try
> to deref an entry?
>
> In other words, I wonder whether we fix a symptom of a missing bounds
> check instead of addressing that missing bounds check as the root cause
> directly.
>
We do a bounds check to ensure that we don't overflow the entries in a
directory with `idx = idx >= dir->nr ? dir->nr - 1 : idx;`.
The issue is that when `dir->nr = 0`, this will set the `idx = -1`. So
this is where it fails.
Taking a step back, the while loop shouldn't even operate when we hit a
directory with no entries. So I'll modify the loop condition to include
a check for `dir->nr`.
>> Fix this by breaking out of the
>> loop when we enter an empty directory.
>>
>> Add tests which simulate this behavior and also provide coverage over
>> using the feature over packed-refs.
>
> Nit: the commit subject might be adjusted to mention the symptom we're
> fixing rather than what we're doing in the commit. At least, it feels
> like the segfault is the more important thing to talk about here.
>
That's fair, let me modify that.
>> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
>> ---
>> refs/ref-cache.c | 3 ++
>> t/t6302-for-each-ref-filter.sh | 65 ++++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 68 insertions(+)
>>
>> diff --git a/refs/ref-cache.c b/refs/ref-cache.c
>> index c180e0aad7..8a260028ec 100644
>> --- a/refs/ref-cache.c
>> +++ b/refs/ref-cache.c
>> @@ -507,6 +507,9 @@ static int cache_ref_iterator_seek(struct ref_iterator *ref_iterator,
>> slash = strchr(slash, '/');
>> len = slash ? (size_t)(slash - refname) : strlen(refname);
>>
>> + if (dir->nr == 0)
>> + break;
>
> Can't we break before sorting already? Avoids a couple of no-op changes.
>
Yup, in the next iteration, I'll move it to the loop condition, so we
don't even enter the loop.
>> for (idx = 0; idx < dir->nr; idx++) {
>> cmp = strncmp(refname, dir->entries[idx]->name, len);
>> if (cmp <= 0)
>> diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
>> index 9b80ea1e3b..d14567cb62 100755
>> --- a/t/t6302-for-each-ref-filter.sh
>> +++ b/t/t6302-for-each-ref-filter.sh
>> @@ -754,4 +754,69 @@ test_expect_success 'start after used with custom sort order' '
>> test_cmp expect actual
>> '
>>
>> +test_expect_success 'start after with packed refs' '
>> + test_when_finished "rm -rf repo" &&
>> + git init repo &&
>> + (
>> + cd repo &&
>> + test_commit default &&
>> +
>> + git update-ref --stdin <<-EOF &&
>
> s/EOF/\EOF/
>
> The same is true for the other test.
>
> Patrick
This always trips me up. Will fix.
Thanks for the review.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2] refs/ref-cache: fix SEGFAULT when seeking in empty directories
2025-09-24 7:55 [PATCH] refs/ref-cache: stop seeking into empty directories Karthik Nayak
2025-09-24 12:28 ` Patrick Steinhardt
@ 2025-09-25 9:15 ` Karthik Nayak
2025-09-25 17:07 ` Junio C Hamano
2025-10-01 12:17 ` [PATCH v3] " Karthik Nayak
2 siblings, 1 reply; 10+ messages in thread
From: Karthik Nayak @ 2025-09-25 9:15 UTC (permalink / raw)
To: git; +Cc: ps, Karthik Nayak
The 'cache_ref_iterator_seek()' function is used to seek the
`ref_iterator` to the desired reference in the ref-cache mechanism. We
use the seeking functionality to implement the '--start-after' flag in
'git-for-each-ref(1)'.
When using the files-backend with packed-refs, it is possible that some
of the refs directories are empty. For e.g. just after repacking, the
'refs/heads' directory would be empty. The ref-cache seek mechanism
doesn't take this into consideration, causing SEGFAULT as we try to
access entries within the directory. Fix this by breaking out of the
loop when we enter an empty directory.
Add tests which simulate this behavior and also provide coverage over
using the feature over packed-refs.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
Changes in v2:
- Moved the `dir-nr` check to the loop to provide better bound checks.
- Modified the commit subject to talk about the issue at hand.
- Substituted EOF with \EOF since we don't do any variable parsing.
- Link to v1: https://lore.kernel.org/r/20250924-583-git-for-each-ref-start-after-v1-1-c73be2b5db5a@gmail.com
---
refs/ref-cache.c | 2 +-
t/t6302-for-each-ref-filter.sh | 65 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 66 insertions(+), 1 deletion(-)
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index c180e0aad7..e5e5df16d8 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -539,7 +539,7 @@ static int cache_ref_iterator_seek(struct ref_iterator *ref_iterator,
*/
break;
}
- } while (slash);
+ } while (slash && dir->nr);
}
return 0;
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index 9b80ea1e3b..7f060d97bf 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -754,4 +754,69 @@ test_expect_success 'start after used with custom sort order' '
test_cmp expect actual
'
+test_expect_success 'start after with packed refs' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ git update-ref --stdin <<-\EOF &&
+ create refs/heads/branch @
+ create refs/heads/side @
+ create refs/odd/spot @
+ create refs/tags/one @
+ create refs/tags/two @
+ commit
+ EOF
+
+ cat >expect <<-\EOF &&
+ refs/tags/default
+ refs/tags/one
+ refs/tags/two
+ EOF
+
+ git pack-refs --all &&
+ git for-each-ref --format="%(refname)" --start-after=refs/odd/spot >actual &&
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'start after with packed refs and some loose refs' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ git update-ref --stdin <<-\EOF &&
+ create refs/heads/branch @
+ create refs/heads/side @
+ create refs/odd/spot @
+ create refs/tags/one @
+ create refs/tags/two @
+ commit
+ EOF
+
+ git pack-refs --all &&
+
+ git update-ref --stdin <<-\EOF &&
+ create refs/heads/foo @
+ create refs/odd/tee @
+ commit
+ EOF
+
+ cat >expect <<-\EOF &&
+ refs/odd/tee
+ refs/tags/default
+ refs/tags/one
+ refs/tags/two
+ EOF
+
+
+ git for-each-ref --format="%(refname)" --start-after=refs/odd/spot >actual &&
+ test_cmp expect actual
+ )
+'
+
test_done
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v2] refs/ref-cache: fix SEGFAULT when seeking in empty directories
2025-09-25 9:15 ` [PATCH v2] refs/ref-cache: fix SEGFAULT when seeking in " Karthik Nayak
@ 2025-09-25 17:07 ` Junio C Hamano
2025-09-26 7:41 ` Karthik Nayak
0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2025-09-25 17:07 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, ps
Karthik Nayak <karthik.188@gmail.com> writes:
> diff --git a/refs/ref-cache.c b/refs/ref-cache.c
> index c180e0aad7..e5e5df16d8 100644
> --- a/refs/ref-cache.c
> +++ b/refs/ref-cache.c
> @@ -539,7 +539,7 @@ static int cache_ref_iterator_seek(struct ref_iterator *ref_iterator,
> */
> break;
> }
> - } while (slash);
> + } while (slash && dir->nr);
> }
This is at the tail of a "do { ... } while (...);" loop, but inside
the loop I see this construct:
for (idx = 0; idx < dir->nr; idx++) {
cmp = strncmp(refname, dir->entries[idx]->name, len);
if (cmp <= 0)
break;
}
/* don't overflow the index */
idx = idx >= dir->nr ? dir->nr - 1 : idx;
i.e., if we scan all the dir->entries[] elements in the innter loop
and did not find any hit, idx would become dir->nr and this inner
loop runs to the end. If (dir->nr == 0), then ?: operator [*] would
become the idx = (dir->nr - 1); And that idx is used for a while
before we get to this "while (slash && dir->nr)".
And then tha tis used like this.
if (slash)
slash = slash + 1;
level->index = idx;
if (dir->entries[idx]->flag & REF_DIR) {
...
IOW, isn't this check a bit too late? I wonder if we can leave at
the beginning of the outer loop, even before sort_ref_dir(dir), when
dir->nr is zero, or something?
[Side note]
* I found the problematic ?: extremely hard to read, given its
contrast with the terminating condnition of for loop. If you
apply the discipline to keep the textual order match the actual
order, i.e.
for (idx = 0; idx < dir->nr; idx++)
...;
idx = dir->nr <= idx ? dir->nr - 1 : idx;
people would have spotted this more easily.
But perhaps that may be just me. Anyway...
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] refs/ref-cache: fix SEGFAULT when seeking in empty directories
2025-09-25 17:07 ` Junio C Hamano
@ 2025-09-26 7:41 ` Karthik Nayak
2025-09-26 7:44 ` Karthik Nayak
2025-09-26 16:30 ` Junio C Hamano
0 siblings, 2 replies; 10+ messages in thread
From: Karthik Nayak @ 2025-09-26 7:41 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, ps
[-- Attachment #1: Type: text/plain, Size: 2498 bytes --]
Junio C Hamano <gitster@pobox.com> writes:
> Karthik Nayak <karthik.188@gmail.com> writes:
>
>> diff --git a/refs/ref-cache.c b/refs/ref-cache.c
>> index c180e0aad7..e5e5df16d8 100644
>> --- a/refs/ref-cache.c
>> +++ b/refs/ref-cache.c
>> @@ -539,7 +539,7 @@ static int cache_ref_iterator_seek(struct ref_iterator *ref_iterator,
>> */
>> break;
>> }
>> - } while (slash);
>> + } while (slash && dir->nr);
>> }
>
> This is at the tail of a "do { ... } while (...);" loop, but inside
> the loop I see this construct:
>
> for (idx = 0; idx < dir->nr; idx++) {
> cmp = strncmp(refname, dir->entries[idx]->name, len);
> if (cmp <= 0)
> break;
> }
> /* don't overflow the index */
> idx = idx >= dir->nr ? dir->nr - 1 : idx;
>
> i.e., if we scan all the dir->entries[] elements in the innter loop
> and did not find any hit, idx would become dir->nr and this inner
> loop runs to the end. If (dir->nr == 0), then ?: operator [*] would
> become the idx = (dir->nr - 1); And that idx is used for a while
> before we get to this "while (slash && dir->nr)".
>
This wouldn't happen because before the loop starts, we set:
dir = get_ref_dir(iter->cache->root);
So, `dir` will always have an entry (i.e. 'refs') in the first
iteration, after that, the checks in the while section of the 'do {}
while (...)' loop will kick in.
> And then tha tis used like this.
>
> if (slash)
> slash = slash + 1;
>
> level->index = idx;
> if (dir->entries[idx]->flag & REF_DIR) {
> ...
>
> IOW, isn't this check a bit too late? I wonder if we can leave at
> the beginning of the outer loop, even before sort_ref_dir(dir), when
> dir->nr is zero, or something?
>
>
We could add it there too. I can't see the merit of one over the other.
But if you see it being more readable. I'll happily make that change.
As the author of this code, I do find it complex already.
>
> [Side note]
>
> * I found the problematic ?: extremely hard to read, given its
> contrast with the terminating condnition of for loop. If you
> apply the discipline to keep the textual order match the actual
> order, i.e.
>
> for (idx = 0; idx < dir->nr; idx++)
> ...;
>
> idx = dir->nr <= idx ? dir->nr - 1 : idx;
>
> people would have spotted this more easily.
>
> But perhaps that may be just me. Anyway...
I'm not really sold on this being easier to read. But that could be me,
So I'll make a second commit to make this change, we can drop it if
unwanted.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] refs/ref-cache: fix SEGFAULT when seeking in empty directories
2025-09-26 7:41 ` Karthik Nayak
@ 2025-09-26 7:44 ` Karthik Nayak
2025-09-26 16:30 ` Junio C Hamano
1 sibling, 0 replies; 10+ messages in thread
From: Karthik Nayak @ 2025-09-26 7:44 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, ps
[-- Attachment #1: Type: text/plain, Size: 2145 bytes --]
Karthik Nayak <karthik.188@gmail.com> writes:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> Karthik Nayak <karthik.188@gmail.com> writes:
>>
>>> diff --git a/refs/ref-cache.c b/refs/ref-cache.c
>>> index c180e0aad7..e5e5df16d8 100644
>>> --- a/refs/ref-cache.c
>>> +++ b/refs/ref-cache.c
>>> @@ -539,7 +539,7 @@ static int cache_ref_iterator_seek(struct ref_iterator *ref_iterator,
>>> */
>>> break;
>>> }
>>> - } while (slash);
>>> + } while (slash && dir->nr);
>>> }
>>
>> This is at the tail of a "do { ... } while (...);" loop, but inside
>> the loop I see this construct:
>>
>> for (idx = 0; idx < dir->nr; idx++) {
>> cmp = strncmp(refname, dir->entries[idx]->name, len);
>> if (cmp <= 0)
>> break;
>> }
>> /* don't overflow the index */
>> idx = idx >= dir->nr ? dir->nr - 1 : idx;
>>
>> i.e., if we scan all the dir->entries[] elements in the innter loop
>> and did not find any hit, idx would become dir->nr and this inner
>> loop runs to the end. If (dir->nr == 0), then ?: operator [*] would
>> become the idx = (dir->nr - 1); And that idx is used for a while
>> before we get to this "while (slash && dir->nr)".
>>
>
> This wouldn't happen because before the loop starts, we set:
>
> dir = get_ref_dir(iter->cache->root);
>
> So, `dir` will always have an entry (i.e. 'refs') in the first
> iteration, after that, the checks in the while section of the 'do {}
> while (...)' loop will kick in.
>
>> And then tha tis used like this.
>>
>> if (slash)
>> slash = slash + 1;
>>
>> level->index = idx;
>> if (dir->entries[idx]->flag & REF_DIR) {
>> ...
>>
>> IOW, isn't this check a bit too late? I wonder if we can leave at
>> the beginning of the outer loop, even before sort_ref_dir(dir), when
>> dir->nr is zero, or something?
>>
>>
>
> We could add it there too. I can't see the merit of one over the other.
> But if you see it being more readable. I'll happily make that change.
>
Just to clarify, I will hold off on re-rolling till we make a decision
around this, as if we decide to keep the check as it currently is, there
is no reason to re-roll.
[snip]
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] refs/ref-cache: fix SEGFAULT when seeking in empty directories
2025-09-26 7:41 ` Karthik Nayak
2025-09-26 7:44 ` Karthik Nayak
@ 2025-09-26 16:30 ` Junio C Hamano
2025-09-29 14:47 ` Karthik Nayak
1 sibling, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2025-09-26 16:30 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, ps
Karthik Nayak <karthik.188@gmail.com> writes:
> This wouldn't happen because before the loop starts, we set:
>
> dir = get_ref_dir(iter->cache->root);
>
> So, `dir` will always have an entry (i.e. 'refs') in the first
> iteration, after that, the checks in the while section of the 'do {}
> while (...)' loop will kick in.
Ah, OK. So we know dir->nr != 0 when the loop is entered for the
first time. We may descend into a new subdirectory and at that
point we might hit dir->nr == 0 and that is why we want to catch it
at the end of "do {} while ()". Makes sense.
>> I wonder if we can leave at
>> the beginning of the outer loop, even before sort_ref_dir(dir), when
>> dir->nr is zero, or something?
>
> We could add it there too. I can't see the merit of one over the other.
No, I don't, either. As long as we know dir->nr != 0 upon the entry
into the loop, the check at the tail end in while() is sufficient.
Thanks for clarifying. I wonder if a bit of update to the log
message would help future readers of the code, though.
When using the files-backend with packed-refs, it is possible that some
of the refs directories are empty. For e.g. just after repacking, the
'refs/heads' directory would be empty. The ref-cache seek mechanism,
doesn't take this into consideration when descending into a subdirectory,
and makes an out of bounds access, causing SEGFAULT as we try to
access entries within the directory. Fix this by breaking out of the
loop when we enter an empty directory.
or something? I dunno.
Thanks.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] refs/ref-cache: fix SEGFAULT when seeking in empty directories
2025-09-26 16:30 ` Junio C Hamano
@ 2025-09-29 14:47 ` Karthik Nayak
0 siblings, 0 replies; 10+ messages in thread
From: Karthik Nayak @ 2025-09-29 14:47 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, ps
[-- Attachment #1: Type: text/plain, Size: 1792 bytes --]
Junio C Hamano <gitster@pobox.com> writes:
> Karthik Nayak <karthik.188@gmail.com> writes:
>
>> This wouldn't happen because before the loop starts, we set:
>>
>> dir = get_ref_dir(iter->cache->root);
>>
>> So, `dir` will always have an entry (i.e. 'refs') in the first
>> iteration, after that, the checks in the while section of the 'do {}
>> while (...)' loop will kick in.
>
> Ah, OK. So we know dir->nr != 0 when the loop is entered for the
> first time. We may descend into a new subdirectory and at that
> point we might hit dir->nr == 0 and that is why we want to catch it
> at the end of "do {} while ()". Makes sense.
>
>>> I wonder if we can leave at
>>> the beginning of the outer loop, even before sort_ref_dir(dir), when
>>> dir->nr is zero, or something?
>>
>> We could add it there too. I can't see the merit of one over the other.
>
> No, I don't, either. As long as we know dir->nr != 0 upon the entry
> into the loop, the check at the tail end in while() is sufficient.
>
> Thanks for clarifying. I wonder if a bit of update to the log
> message would help future readers of the code, though.
>
> When using the files-backend with packed-refs, it is possible that some
> of the refs directories are empty. For e.g. just after repacking, the
> 'refs/heads' directory would be empty. The ref-cache seek mechanism,
> doesn't take this into consideration when descending into a subdirectory,
> and makes an out of bounds access, causing SEGFAULT as we try to
> access entries within the directory. Fix this by breaking out of the
> loop when we enter an empty directory.
>
> or something? I dunno.
>
> Thanks.
Yeah, I think something like this would be much better, I'll amend the
commit and send in a new version soon.
Thanks!
Karthik
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 690 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v3] refs/ref-cache: fix SEGFAULT when seeking in empty directories
2025-09-24 7:55 [PATCH] refs/ref-cache: stop seeking into empty directories Karthik Nayak
2025-09-24 12:28 ` Patrick Steinhardt
2025-09-25 9:15 ` [PATCH v2] refs/ref-cache: fix SEGFAULT when seeking in " Karthik Nayak
@ 2025-10-01 12:17 ` Karthik Nayak
2 siblings, 0 replies; 10+ messages in thread
From: Karthik Nayak @ 2025-10-01 12:17 UTC (permalink / raw)
To: git; +Cc: ps, gitster, Karthik Nayak
The 'cache_ref_iterator_seek()' function is used to seek the
`ref_iterator` to the desired reference in the ref-cache mechanism. We
use the seeking functionality to implement the '--start-after' flag in
'git-for-each-ref(1)'.
When using the files-backend with packed-refs, it is possible that some
of the refs directories are empty. For e.g. just after repacking, the
'refs/heads' directory would be empty. The ref-cache seek mechanism,
doesn't take this into consideration when descending into a
subdirectory, and makes an out of bounds access, causing SEGFAULT as we
try to access entries within the directory. Fix this by breaking out of
the loop when we enter an empty directory.
Since we start with the base directory of 'refs/' which is never empty,
it is okay to perform this check after the first iteration in the
`do..while` clause.
Add tests which simulate this behavior and also provide coverage over
using the feature over packed-refs.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
Changes in v3:
- Modify the commit message to also explain why it is safe to add the
check for empty directories in a `do..while` clause.
- Link to v2: https://lore.kernel.org/r/20250925-583-git-for-each-ref-start-after-v2-1-3613b5a27ff1@gmail.com
Changes in v2:
- Moved the `dir-nr` check to the loop to provide better bound checks.
- Modified the commit subject to talk about the issue at hand.
- Substituted EOF with \EOF since we don't do any variable parsing.
- Link to v1: https://lore.kernel.org/r/20250924-583-git-for-each-ref-start-after-v1-1-c73be2b5db5a@gmail.com
---
refs/ref-cache.c | 2 +-
t/t6302-for-each-ref-filter.sh | 65 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 66 insertions(+), 1 deletion(-)
diff --git a/refs/ref-cache.c b/refs/ref-cache.c
index c180e0aad7..e5e5df16d8 100644
--- a/refs/ref-cache.c
+++ b/refs/ref-cache.c
@@ -539,7 +539,7 @@ static int cache_ref_iterator_seek(struct ref_iterator *ref_iterator,
*/
break;
}
- } while (slash);
+ } while (slash && dir->nr);
}
return 0;
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index 9b80ea1e3b..7f060d97bf 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -754,4 +754,69 @@ test_expect_success 'start after used with custom sort order' '
test_cmp expect actual
'
+test_expect_success 'start after with packed refs' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ git update-ref --stdin <<-\EOF &&
+ create refs/heads/branch @
+ create refs/heads/side @
+ create refs/odd/spot @
+ create refs/tags/one @
+ create refs/tags/two @
+ commit
+ EOF
+
+ cat >expect <<-\EOF &&
+ refs/tags/default
+ refs/tags/one
+ refs/tags/two
+ EOF
+
+ git pack-refs --all &&
+ git for-each-ref --format="%(refname)" --start-after=refs/odd/spot >actual &&
+ test_cmp expect actual
+ )
+'
+
+test_expect_success 'start after with packed refs and some loose refs' '
+ test_when_finished "rm -rf repo" &&
+ git init repo &&
+ (
+ cd repo &&
+ test_commit default &&
+
+ git update-ref --stdin <<-\EOF &&
+ create refs/heads/branch @
+ create refs/heads/side @
+ create refs/odd/spot @
+ create refs/tags/one @
+ create refs/tags/two @
+ commit
+ EOF
+
+ git pack-refs --all &&
+
+ git update-ref --stdin <<-\EOF &&
+ create refs/heads/foo @
+ create refs/odd/tee @
+ commit
+ EOF
+
+ cat >expect <<-\EOF &&
+ refs/odd/tee
+ refs/tags/default
+ refs/tags/one
+ refs/tags/two
+ EOF
+
+
+ git for-each-ref --format="%(refname)" --start-after=refs/odd/spot >actual &&
+ test_cmp expect actual
+ )
+'
+
test_done
^ permalink raw reply related [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-10-01 12:17 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-24 7:55 [PATCH] refs/ref-cache: stop seeking into empty directories Karthik Nayak
2025-09-24 12:28 ` Patrick Steinhardt
2025-09-25 8:20 ` Karthik Nayak
2025-09-25 9:15 ` [PATCH v2] refs/ref-cache: fix SEGFAULT when seeking in " Karthik Nayak
2025-09-25 17:07 ` Junio C Hamano
2025-09-26 7:41 ` Karthik Nayak
2025-09-26 7:44 ` Karthik Nayak
2025-09-26 16:30 ` Junio C Hamano
2025-09-29 14:47 ` Karthik Nayak
2025-10-01 12:17 ` [PATCH v3] " Karthik Nayak
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).