* [PATCH] Fixed --shallow-since generating descendant borders
@ 2025-11-22 10:38 Samo Pogačnik via GitGitGadget
2025-11-22 17:36 ` Junio C Hamano
2025-11-23 19:35 ` [PATCH v2] shallow: set borders which are all reachable after clone shallow since Samo Pogačnik via GitGitGadget
0 siblings, 2 replies; 10+ messages in thread
From: Samo Pogačnik via GitGitGadget @ 2025-11-22 10:38 UTC (permalink / raw)
To: git; +Cc: Samo Pogačnik, Samo Pogačnik
From: =?UTF-8?q?Samo=20Poga=C4=8Dnik?= <samo_pogacnik@t-2.net>
When shallow cloning based on a date, it happens that a list
of commits is received, where some of the list border commits
actually descend one from another. In such cases borders need
to be expanded by additional parents and excluding the child
as border.
Signed-off-by: Samo Pogačnik <samo_pogacnik@t-2.net>
---
Fixed --shallow-since generating descendant borders
When shallow cloning based on a date, it happens that a list of commits
is received, where some of the list border commits actually descend one
from another. In such cases borders need to be expanded by additional
parents and excluding the child as border.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2107%2Fspog%2Ffix-shallow-since-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2107/spog/fix-shallow-since-v1
Pull-Request: https://github.com/git/git/pull/2107
shallow.c | 35 ++++++++++++++++++++++++++++++++---
1 file changed, 32 insertions(+), 3 deletions(-)
diff --git a/shallow.c b/shallow.c
index 55b9cd9d3f..37079a0bf1 100644
--- a/shallow.c
+++ b/shallow.c
@@ -251,21 +251,50 @@ struct commit_list *get_shallow_commits_by_rev_list(struct strvec *argv,
* commit A is processed first, then commit B, whose parent is
* A, later. If NOT_SHALLOW on A is cleared at step 1, B
* itself is considered border at step 2, which is incorrect.
+ * We must also consider that B has multiple parents, some of
+ * them not being in the not_shallow_list, but must be added
+ * as border commits to the result.
+ *
+ * The general processing goes like this:
+ * 1. Above we've coloured the whole not_shallow_list of commits
+ * with 'not_shallow'.
+ * 2. For each commit from the not_shallow_list (the code below)
+ * we colour 'shallow' the commit and its parents, which are not
+ * already coloured 'not_shallow'.
+ * 3. Commits with all parents being coloured only 'shallow' remain
+ * shallow and are being added to result list.
+ * 4. Commits without all parents being coloured only 'shallow' are
+ * being excluded as borders, however their parents coloured only
+ * 'shallow' are being added to the result borders list.
*/
for (p = not_shallow_list; p; p = p->next) {
struct commit *c = p->item;
struct commit_list *parent;
+ int must_not_be_shallow = 0;
if (repo_parse_commit(the_repository, c))
die("unable to parse commit %s",
oid_to_hex(&c->object.oid));
for (parent = c->parents; parent; parent = parent->next)
- if (!(parent->item->object.flags & not_shallow_flag)) {
+ if (parent->item->object.flags & not_shallow_flag) {
+ must_not_be_shallow = 1;
+ } else {
c->object.flags |= shallow_flag;
- commit_list_insert(c, &result);
- break;
+ parent->item->object.flags |= shallow_flag;
}
+ if (must_not_be_shallow) {
+ c->object.flags &= ~shallow_flag;
+ for (parent = c->parents; parent; parent = parent->next)
+ if (parent->item->object.flags & shallow_flag) {
+ parent->item->object.flags |= not_shallow_flag;
+ commit_list_insert(parent->item, &result);
+ }
+ } else {
+ for (parent = c->parents; parent; parent = parent->next)
+ parent->item->object.flags &= ~shallow_flag;
+ commit_list_insert(c, &result);
+ }
}
free_commit_list(not_shallow_list);
base-commit: debbc87557487aa9a8ed8a35367d17f8b4081c76
--
gitgitgadget
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH] Fixed --shallow-since generating descendant borders 2025-11-22 10:38 [PATCH] Fixed --shallow-since generating descendant borders Samo Pogačnik via GitGitGadget @ 2025-11-22 17:36 ` Junio C Hamano 2025-11-23 19:35 ` [PATCH v2] shallow: set borders which are all reachable after clone shallow since Samo Pogačnik via GitGitGadget 1 sibling, 0 replies; 10+ messages in thread From: Junio C Hamano @ 2025-11-22 17:36 UTC (permalink / raw) To: Samo Pogačnik via GitGitGadget; +Cc: git, Samo Pogačnik "Samo Pogačnik via GitGitGadget" <gitgitgadget@gmail.com> writes: > Subject: Re: [PATCH] Fixed --shallow-since generating descendant borders A patch title wants maximum information density, and "Fixed" is a vague verb in that context, as it does not tell what existing behaviour was wrong or what is the right alternative behaviour. As "git log --oneline --no-merges" can show, our patches typically begins with the area the change pertains to plus a colon. "git log --oneline shallow.c" (which is the file this patch touches) shows a handful ones that are prefixed with "shallow:". What to write after the "<area>:" prefix for this patch, I am not sure, as the body of the proposed log message does not discuss the bad effect of the suboptimal or wrong (I cannot even tell which one from the proposed log message) behaviour. > From: =?UTF-8?q?Samo=20Poga=C4=8Dnik?= <samo_pogacnik@t-2.net> > > When shallow cloning based on a date, it happens that a list > of commits is received, where some of the list border commits > actually descend one from another. In such cases borders need > to be expanded by additional parents and excluding the child > as border. Missing from the above description are - received by whom? - the reason why they want such a list is to do what? - when there are multiple borders that can be "expanded", and if you leave it unexpanded (i.e., the behaviour of the current code) what happens and why is it bad? Is it breaking bad (e.g., clone would be aborted, the resulting cloned repository does not pass fsck), or is it suboptimal bad (e.g., we told the command that we do not want commits older than date X, but we end up having more commits)? - what is the cost of computing the "expansion", relative to the above "badness"? Fixing a breaking bad behaviour can of course afford to spend more cycles than a suboptimal bad behaviour. The usual way to compose a log message of this project is to - Give an observation on how the current system works in the present tense (so no need to say "Currently X is Y", or "Previously X was Y" to describe the state before your change; just "X is Y" is enough), and discuss what you perceive as a problem in it. - Propose a solution (optional---often, problem description trivially leads to an obvious solution in reader's minds). - Give commands to somebody editing the codebase to "make it so", instead of saying "This commit does X". in this order. > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2107%2Fspog%2Ffix-shallow-since-v1 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2107/spog/fix-shallow-since-v1 > Pull-Request: https://github.com/git/git/pull/2107 > > shallow.c | 35 ++++++++++++++++++++++++++++++++--- > 1 file changed, 32 insertions(+), 3 deletions(-) We'd want to protect this change from other people accidentally breaking it in the future, and the best practice we have is to write a test to observe end-user visible behaviour. The whole helper function being touched by the patch came in the 27-patch series merged at a460ea4a (Merge branch 'nd/shallow-deepen', 2016-10-10), and it seems to have added to t5500-fetch-pack.sh to cover the "--shallow-since" feature. Perhaps this should add a few tests to demonstrate existing "breakage" (i.e., a "test_expect_success" test that would fail without the change in the patch to shallow.c we see below, and would succeed when the patch to shallow.c is applied). Thanks. ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2] shallow: set borders which are all reachable after clone shallow since 2025-11-22 10:38 [PATCH] Fixed --shallow-since generating descendant borders Samo Pogačnik via GitGitGadget 2025-11-22 17:36 ` Junio C Hamano @ 2025-11-23 19:35 ` Samo Pogačnik via GitGitGadget 2025-11-25 0:43 ` Junio C Hamano 2026-01-31 16:28 ` [PATCH v3] shallow: ensure all boundary commits are reachable with --shallow-since Samo Pogačnik via GitGitGadget 1 sibling, 2 replies; 10+ messages in thread From: Samo Pogačnik via GitGitGadget @ 2025-11-23 19:35 UTC (permalink / raw) To: git; +Cc: Samo Pogačnik, Samo Pogačnik From: =?UTF-8?q?Samo=20Poga=C4=8Dnik?= <samo_pogacnik@t-2.net> When shallow cloning based on a date, it happens that not all shallow border commits are reachable. Original implementation of a generic shallow boundary finder based on rev-list sets a commit (from the initial list of border commit candidates) to be the border commit as soon as it finds one of its parentis that wasn't on the list of initial candidates. This results in a successful shallow clone, where some of its declared border commits may not be reachable and they would not actually exist in the cloned repository. Thus the result may contradict existing comment in the code, which correctly states that such commmit should not be considered border. One can inspect such case by running the added test scenario: - 'clone shallow since all borders reachable' The modified implementation of a generic shallow boundary finder based on rev-list ensures that all shallow border commits are reachable also after being grafted. This is achieved by inspecting all parents of each initial border commit candidate. The border commit candidate is set border only when all its parents wern't on the initial list of candidates. Otherwise the border commit candidate is not set as border however its parents that weren't on the list of candidates are set as borders. Signed-off-by: Samo Pogačnik <samo_pogacnik@t-2.net> --- Fixed --shallow-since generating descendant borders When shallow cloning based on a date, it happens that a list of commits is received, where some of the list border commits actually descend one from another. In such cases borders need to be expanded by additional parents and excluding the child as border. Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2107%2Fspog%2Ffix-shallow-since-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2107/spog/fix-shallow-since-v2 Pull-Request: https://github.com/git/git/pull/2107 Range-diff vs v1: 1: aaeff44f83 ! 1: 479692c386 Fixed --shallow-since generating descendant borders @@ Metadata Author: Samo Pogačnik <samo_pogacnik@t-2.net> ## Commit message ## - Fixed --shallow-since generating descendant borders + shallow: set borders which are all reachable after clone shallow since - When shallow cloning based on a date, it happens that a list - of commits is received, where some of the list border commits - actually descend one from another. In such cases borders need - to be expanded by additional parents and excluding the child - as border. + When shallow cloning based on a date, it happens that not all + shallow border commits are reachable. + + Original implementation of a generic shallow boundary finder + based on rev-list sets a commit (from the initial list of border + commit candidates) to be the border commit as soon as it finds one + of its parentis that wasn't on the list of initial candidates. This + results in a successful shallow clone, where some of its declared + border commits may not be reachable and they would not actually exist + in the cloned repository. Thus the result may contradict existing + comment in the code, which correctly states that such commmit should + not be considered border. + + One can inspect such case by running the added test scenario: + - 'clone shallow since all borders reachable' + + The modified implementation of a generic shallow boundary finder + based on rev-list ensures that all shallow border commits are reachable + also after being grafted. This is achieved by inspecting all parents + of each initial border commit candidate. The border commit candidate + is set border only when all its parents wern't on the initial list of + candidates. Otherwise the border commit candidate is not set as border + however its parents that weren't on the list of candidates are set as + borders. Signed-off-by: Samo Pogačnik <samo_pogacnik@t-2.net> @@ shallow.c: struct commit_list *get_shallow_commits_by_rev_list(struct strvec *ar * commit A is processed first, then commit B, whose parent is * A, later. If NOT_SHALLOW on A is cleared at step 1, B * itself is considered border at step 2, which is incorrect. -+ * We must also consider that B has multiple parents, some of -+ * them not being in the not_shallow_list, but must be added -+ * as border commits to the result. ++ * ++ * We must also consider that B has multiple parents which may ++ * not all be marked NOT_SHALLOW (as they weren't traversed into ++ * the not_shallow_list from revs in the first place). Because of ++ * that an additional step is required to reconsider B as border. ++ * A commit from the not_shallow_list is considered border only ++ * when ALL its parents weren't on the not_shallow_list. ++ * When one or more parents of a commit from the not_shellow_list ++ * also come from that list, the commit is not considered border, ++ * but its non-listed parents are considered border commits. + * + * The general processing goes like this: -+ * 1. Above we've coloured the whole not_shallow_list of commits -+ * with 'not_shallow'. ++ * 1. Above we've painted the whole not_shallow_list of commits ++ * NOT_SHALLOW. + * 2. For each commit from the not_shallow_list (the code below) -+ * we colour 'shallow' the commit and its parents, which are not -+ * already coloured 'not_shallow'. -+ * 3. Commits with all parents being coloured only 'shallow' remain ++ * we paint SHALLOW this commit and its parent for all its ++ * parents that had not yet been painted NOT_SHALLOW. ++ * 3. Commits with all parents being painted only SHALLOW remain + * shallow and are being added to result list. -+ * 4. Commits without all parents being coloured only 'shallow' are -+ * being excluded as borders, however their parents coloured only -+ * 'shallow' are being added to the result borders list. ++ * 4. Commits without all parents being painted only SHALLOW are ++ * being excluded as borders, however their parents painted only ++ * SHALLOW are being added to the result borders list. */ for (p = not_shallow_list; p; p = p->next) { struct commit *c = p->item; @@ shallow.c: struct commit_list *get_shallow_commits_by_rev_list(struct strvec *ar } free_commit_list(not_shallow_list); + + ## t/t5500-fetch-pack.sh ## +@@ t/t5500-fetch-pack.sh: test_expect_success 'shallow since with commit graph and already-seen commit' ' + ) + ' + ++test_expect_success 'clone shallow since all borders reachable' ' ++ test_create_repo shallow-since-all-borders-reachable && ++ ( ++ rm -rf shallow123 && ++ cd shallow-since-all-borders-reachable && ++ GIT_COMMITTER_DATE="2025-08-19 12:34:56" git commit --allow-empty -m one && ++ GIT_COMMITTER_DATE="2025-08-20 12:34:56" git switch -c branch && ++ GIT_COMMITTER_DATE="2025-08-21 12:34:56" git commit --allow-empty -m two && ++ GIT_COMMITTER_DATE="2025-08-22 12:34:56" git commit --allow-empty -m three && ++ GIT_COMMITTER_DATE="2025-08-23 12:34:56" git switch main && ++ GIT_COMMITTER_DATE="2025-08-24 12:34:56" git merge branch --no-ff && ++ GIT_COMMITTER_DATE="2025-08-26 12:34:56" git clone --shallow-since "2025-08-21 12:34:56" "file://$(pwd)/." ../shallow123 && ++ cd ../shallow123 && ++ echo "Shallow borders:" && ++ cat .git/shallow && ++ $(for commit in $(cat .git/shallow); do git rev-list $commit 1>/dev/null || exit 1; done) ++ ) ++' ++ + test_expect_success 'shallow clone exclude tag two' ' + test_create_repo shallow-exclude && + ( shallow.c | 42 +++++++++++++++++++++++++++++++++++++++--- t/t5500-fetch-pack.sh | 19 +++++++++++++++++++ 2 files changed, 58 insertions(+), 3 deletions(-) diff --git a/shallow.c b/shallow.c index 55b9cd9d3f..2929ac90ee 100644 --- a/shallow.c +++ b/shallow.c @@ -251,21 +251,57 @@ struct commit_list *get_shallow_commits_by_rev_list(struct strvec *argv, * commit A is processed first, then commit B, whose parent is * A, later. If NOT_SHALLOW on A is cleared at step 1, B * itself is considered border at step 2, which is incorrect. + * + * We must also consider that B has multiple parents which may + * not all be marked NOT_SHALLOW (as they weren't traversed into + * the not_shallow_list from revs in the first place). Because of + * that an additional step is required to reconsider B as border. + * A commit from the not_shallow_list is considered border only + * when ALL its parents weren't on the not_shallow_list. + * When one or more parents of a commit from the not_shellow_list + * also come from that list, the commit is not considered border, + * but its non-listed parents are considered border commits. + * + * The general processing goes like this: + * 1. Above we've painted the whole not_shallow_list of commits + * NOT_SHALLOW. + * 2. For each commit from the not_shallow_list (the code below) + * we paint SHALLOW this commit and its parent for all its + * parents that had not yet been painted NOT_SHALLOW. + * 3. Commits with all parents being painted only SHALLOW remain + * shallow and are being added to result list. + * 4. Commits without all parents being painted only SHALLOW are + * being excluded as borders, however their parents painted only + * SHALLOW are being added to the result borders list. */ for (p = not_shallow_list; p; p = p->next) { struct commit *c = p->item; struct commit_list *parent; + int must_not_be_shallow = 0; if (repo_parse_commit(the_repository, c)) die("unable to parse commit %s", oid_to_hex(&c->object.oid)); for (parent = c->parents; parent; parent = parent->next) - if (!(parent->item->object.flags & not_shallow_flag)) { + if (parent->item->object.flags & not_shallow_flag) { + must_not_be_shallow = 1; + } else { c->object.flags |= shallow_flag; - commit_list_insert(c, &result); - break; + parent->item->object.flags |= shallow_flag; } + if (must_not_be_shallow) { + c->object.flags &= ~shallow_flag; + for (parent = c->parents; parent; parent = parent->next) + if (parent->item->object.flags & shallow_flag) { + parent->item->object.flags |= not_shallow_flag; + commit_list_insert(parent->item, &result); + } + } else { + for (parent = c->parents; parent; parent = parent->next) + parent->item->object.flags &= ~shallow_flag; + commit_list_insert(c, &result); + } } free_commit_list(not_shallow_list); diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh index 2677cd5faa..12209887fb 100755 --- a/t/t5500-fetch-pack.sh +++ b/t/t5500-fetch-pack.sh @@ -904,6 +904,25 @@ test_expect_success 'shallow since with commit graph and already-seen commit' ' ) ' +test_expect_success 'clone shallow since all borders reachable' ' + test_create_repo shallow-since-all-borders-reachable && + ( + rm -rf shallow123 && + cd shallow-since-all-borders-reachable && + GIT_COMMITTER_DATE="2025-08-19 12:34:56" git commit --allow-empty -m one && + GIT_COMMITTER_DATE="2025-08-20 12:34:56" git switch -c branch && + GIT_COMMITTER_DATE="2025-08-21 12:34:56" git commit --allow-empty -m two && + GIT_COMMITTER_DATE="2025-08-22 12:34:56" git commit --allow-empty -m three && + GIT_COMMITTER_DATE="2025-08-23 12:34:56" git switch main && + GIT_COMMITTER_DATE="2025-08-24 12:34:56" git merge branch --no-ff && + GIT_COMMITTER_DATE="2025-08-26 12:34:56" git clone --shallow-since "2025-08-21 12:34:56" "file://$(pwd)/." ../shallow123 && + cd ../shallow123 && + echo "Shallow borders:" && + cat .git/shallow && + $(for commit in $(cat .git/shallow); do git rev-list $commit 1>/dev/null || exit 1; done) + ) +' + test_expect_success 'shallow clone exclude tag two' ' test_create_repo shallow-exclude && ( base-commit: debbc87557487aa9a8ed8a35367d17f8b4081c76 -- gitgitgadget ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v2] shallow: set borders which are all reachable after clone shallow since 2025-11-23 19:35 ` [PATCH v2] shallow: set borders which are all reachable after clone shallow since Samo Pogačnik via GitGitGadget @ 2025-11-25 0:43 ` Junio C Hamano 2026-01-20 20:59 ` Junio C Hamano 2026-01-31 16:28 ` [PATCH v3] shallow: ensure all boundary commits are reachable with --shallow-since Samo Pogačnik via GitGitGadget 1 sibling, 1 reply; 10+ messages in thread From: Junio C Hamano @ 2025-11-25 0:43 UTC (permalink / raw) To: Samo Pogačnik via GitGitGadget Cc: git, Samo Pogačnik, Patrick Steinhardt, Taylor Blau, Johannes Schindelin "Samo Pogačnik via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: =?UTF-8?q?Samo=20Poga=C4=8Dnik?= <samo_pogacnik@t-2.net> > > When shallow cloning based on a date, it happens that not all > shallow border commits are reachable. > > Original implementation of a generic shallow boundary finder > based on rev-list sets a commit (from the initial list of border > commit candidates) to be the border commit as soon as it finds one > of its parentis that wasn't on the list of initial candidates. This > results in a successful shallow clone, where some of its declared > border commits may not be reachable and they would not actually exist > in the cloned repository. Thus the result may contradict existing > comment in the code, which correctly states that such commmit should > not be considered border. > > One can inspect such case by running the added test scenario: > - 'clone shallow since all borders reachable' > > The modified implementation of a generic shallow boundary finder > based on rev-list ensures that all shallow border commits are reachable > also after being grafted. This is achieved by inspecting all parents > of each initial border commit candidate. The border commit candidate > is set border only when all its parents wern't on the initial list of > candidates. Otherwise the border commit candidate is not set as border > however its parents that weren't on the list of candidates are set as > borders. It is a minor point, but there are "boundary" and "border" used more or less interchangeably in the proposed commit log message, and would make the readers wonder if there are differences (I do not think we use the word "border" anywhere in our documentation). It is minor as we do not have such mixture in the end-user facing part of the documentation with this patch. I'll let those (cc'ed) who may be more familiar with, or, at least have more code than I have in, the shallow infrastructure to comment on the way the updated code uses the revision machinery. Thanks. > diff --git a/shallow.c b/shallow.c > index 55b9cd9d3f..2929ac90ee 100644 > --- a/shallow.c > +++ b/shallow.c > @@ -251,21 +251,57 @@ struct commit_list *get_shallow_commits_by_rev_list(struct strvec *argv, > * commit A is processed first, then commit B, whose parent is > * A, later. If NOT_SHALLOW on A is cleared at step 1, B > * itself is considered border at step 2, which is incorrect. > + * > + * We must also consider that B has multiple parents which may > + * not all be marked NOT_SHALLOW (as they weren't traversed into > + * the not_shallow_list from revs in the first place). Because of > + * that an additional step is required to reconsider B as border. > + * A commit from the not_shallow_list is considered border only > + * when ALL its parents weren't on the not_shallow_list. > + * When one or more parents of a commit from the not_shellow_list > + * also come from that list, the commit is not considered border, > + * but its non-listed parents are considered border commits. > + * > + * The general processing goes like this: > + * 1. Above we've painted the whole not_shallow_list of commits > + * NOT_SHALLOW. > + * 2. For each commit from the not_shallow_list (the code below) > + * we paint SHALLOW this commit and its parent for all its > + * parents that had not yet been painted NOT_SHALLOW. > + * 3. Commits with all parents being painted only SHALLOW remain > + * shallow and are being added to result list. > + * 4. Commits without all parents being painted only SHALLOW are > + * being excluded as borders, however their parents painted only > + * SHALLOW are being added to the result borders list. > */ > for (p = not_shallow_list; p; p = p->next) { > struct commit *c = p->item; > struct commit_list *parent; > + int must_not_be_shallow = 0; > > if (repo_parse_commit(the_repository, c)) > die("unable to parse commit %s", > oid_to_hex(&c->object.oid)); > > for (parent = c->parents; parent; parent = parent->next) > - if (!(parent->item->object.flags & not_shallow_flag)) { > + if (parent->item->object.flags & not_shallow_flag) { > + must_not_be_shallow = 1; > + } else { > c->object.flags |= shallow_flag; > - commit_list_insert(c, &result); > - break; > + parent->item->object.flags |= shallow_flag; > } > + if (must_not_be_shallow) { > + c->object.flags &= ~shallow_flag; > + for (parent = c->parents; parent; parent = parent->next) > + if (parent->item->object.flags & shallow_flag) { > + parent->item->object.flags |= not_shallow_flag; > + commit_list_insert(parent->item, &result); > + } > + } else { > + for (parent = c->parents; parent; parent = parent->next) > + parent->item->object.flags &= ~shallow_flag; > + commit_list_insert(c, &result); > + } > } > free_commit_list(not_shallow_list); > > diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh > index 2677cd5faa..12209887fb 100755 > --- a/t/t5500-fetch-pack.sh > +++ b/t/t5500-fetch-pack.sh > @@ -904,6 +904,25 @@ test_expect_success 'shallow since with commit graph and already-seen commit' ' > ) > ' > > +test_expect_success 'clone shallow since all borders reachable' ' > + test_create_repo shallow-since-all-borders-reachable && > + ( > + rm -rf shallow123 && > + cd shallow-since-all-borders-reachable && > + GIT_COMMITTER_DATE="2025-08-19 12:34:56" git commit --allow-empty -m one && > + GIT_COMMITTER_DATE="2025-08-20 12:34:56" git switch -c branch && > + GIT_COMMITTER_DATE="2025-08-21 12:34:56" git commit --allow-empty -m two && > + GIT_COMMITTER_DATE="2025-08-22 12:34:56" git commit --allow-empty -m three && > + GIT_COMMITTER_DATE="2025-08-23 12:34:56" git switch main && > + GIT_COMMITTER_DATE="2025-08-24 12:34:56" git merge branch --no-ff && > + GIT_COMMITTER_DATE="2025-08-26 12:34:56" git clone --shallow-since "2025-08-21 12:34:56" "file://$(pwd)/." ../shallow123 && > + cd ../shallow123 && > + echo "Shallow borders:" && > + cat .git/shallow && > + $(for commit in $(cat .git/shallow); do git rev-list $commit 1>/dev/null || exit 1; done) > + ) > +' > + > test_expect_success 'shallow clone exclude tag two' ' > test_create_repo shallow-exclude && > ( > > base-commit: debbc87557487aa9a8ed8a35367d17f8b4081c76 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] shallow: set borders which are all reachable after clone shallow since 2025-11-25 0:43 ` Junio C Hamano @ 2026-01-20 20:59 ` Junio C Hamano 2026-01-28 4:23 ` Samo Pogačnik 0 siblings, 1 reply; 10+ messages in thread From: Junio C Hamano @ 2026-01-20 20:59 UTC (permalink / raw) To: git Cc: Samo Pogačnik via GitGitGadget, Samo Pogačnik, Patrick Steinhardt, Taylor Blau, Johannes Schindelin Junio C Hamano <gitster@pobox.com> writes: >> ... >> The modified implementation of a generic shallow boundary finder >> based on rev-list ensures that all shallow border commits are reachable >> also after being grafted. This is achieved by inspecting all parents >> of each initial border commit candidate. The border commit candidate >> is set border only when all its parents wern't on the initial list of >> candidates. Otherwise the border commit candidate is not set as border >> however its parents that weren't on the list of candidates are set as >> borders. > > It is a minor point, but there are "boundary" and "border" used more > or less interchangeably in the proposed commit log message, and > would make the readers wonder if there are differences (I do not > think we use the word "border" anywhere in our documentation). It > is minor as we do not have such mixture in the end-user facing part > of the documentation with this patch. > > I'll let those (cc'ed) who may be more familiar with, or, at least > have more code than I have in, the shallow infrastructure to comment > on the way the updated code uses the revision machinery. After this exchange, the topic has been dormant for almost full two months. As I do not deal with shallow clones myself, even though I understand that some folks rely on it working, I'd really prefer to see somebody who are familiar with the underlying logic to review this patch if we were to move forward with it. Thanks. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] shallow: set borders which are all reachable after clone shallow since 2026-01-20 20:59 ` Junio C Hamano @ 2026-01-28 4:23 ` Samo Pogačnik 2026-02-07 5:06 ` Samo Pogačnik 0 siblings, 1 reply; 10+ messages in thread From: Samo Pogačnik @ 2026-01-28 4:23 UTC (permalink / raw) To: Junio C Hamano, git; +Cc: Patrick Steinhardt, Taylor Blau, Johannes Schindelin [-- Attachment #1: Type: text/plain, Size: 1832 bytes --] On Tue, 2026-01-20 at 12:59 -0800, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: > > > > ... > > > The modified implementation of a generic shallow boundary finder > > > based on rev-list ensures that all shallow border commits are reachable > > > also after being grafted. This is achieved by inspecting all parents > > > of each initial border commit candidate. The border commit candidate > > > is set border only when all its parents wern't on the initial list of > > > candidates. Otherwise the border commit candidate is not set as border > > > however its parents that weren't on the list of candidates are set as > > > borders. > > > > It is a minor point, but there are "boundary" and "border" used more > > or less interchangeably in the proposed commit log message, and > > would make the readers wonder if there are differences (I do not > > think we use the word "border" anywhere in our documentation). It > > is minor as we do not have such mixture in the end-user facing part > > of the documentation with this patch. > > > > I'll let those (cc'ed) who may be more familiar with, or, at least > > have more code than I have in, the shallow infrastructure to comment > > on the way the updated code uses the revision machinery. > > After this exchange, the topic has been dormant for almost full two > months. As I do not deal with shallow clones myself, even though I > understand that some folks rely on it working, I'd really prefer to > see somebody who are familiar with the underlying logic to review > this patch if we were to move forward with it. > I’m currently rewriting the patch and the commit message trying to address the boundary/border dilemma. I hope to be able to send a new version by the end of this week. Best regards, Samo [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] shallow: set borders which are all reachable after clone shallow since 2026-01-28 4:23 ` Samo Pogačnik @ 2026-02-07 5:06 ` Samo Pogačnik 2026-02-07 5:19 ` Junio C Hamano 0 siblings, 1 reply; 10+ messages in thread From: Samo Pogačnik @ 2026-02-07 5:06 UTC (permalink / raw) To: Junio C Hamano, git; +Cc: Patrick Steinhardt, Taylor Blau, Johannes Schindelin [-- Attachment #1: Type: text/plain, Size: 2119 bytes --] On Wed, 2026-01-28 at 05:23 +0100, Samo Pogačnik wrote: > On Tue, 2026-01-20 at 12:59 -0800, Junio C Hamano wrote: > > Junio C Hamano <gitster@pobox.com> writes: > > > > > > ... > > > > The modified implementation of a generic shallow boundary finder > > > > based on rev-list ensures that all shallow border commits are reachable > > > > also after being grafted. This is achieved by inspecting all parents > > > > of each initial border commit candidate. The border commit candidate > > > > is set border only when all its parents wern't on the initial list of > > > > candidates. Otherwise the border commit candidate is not set as border > > > > however its parents that weren't on the list of candidates are set as > > > > borders. > > > > > > It is a minor point, but there are "boundary" and "border" used more > > > or less interchangeably in the proposed commit log message, and > > > would make the readers wonder if there are differences (I do not > > > think we use the word "border" anywhere in our documentation). It > > > is minor as we do not have such mixture in the end-user facing part > > > of the documentation with this patch. > > > > > > I'll let those (cc'ed) who may be more familiar with, or, at least > > > have more code than I have in, the shallow infrastructure to comment > > > on the way the updated code uses the revision machinery. > > > > After this exchange, the topic has been dormant for almost full two > > months. As I do not deal with shallow clones myself, even though I > > understand that some folks rely on it working, I'd really prefer to > > see somebody who are familiar with the underlying logic to review > > this patch if we were to move forward with it. > > > > I’m currently rewriting the patch and the commit message trying to > address the boundary/border dilemma. I hope to be able to send a new > version by the end of this week. > I posted a new version of patch '[PATCH v3] shallow: ensure all boundary commits are reachable with --shallow-since' on 31st of January. I hope you've seen it. thanks, Samo [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] shallow: set borders which are all reachable after clone shallow since 2026-02-07 5:06 ` Samo Pogačnik @ 2026-02-07 5:19 ` Junio C Hamano 2026-03-07 7:13 ` Samo Pogačnik 0 siblings, 1 reply; 10+ messages in thread From: Junio C Hamano @ 2026-02-07 5:19 UTC (permalink / raw) To: Samo Pogačnik Cc: git, Patrick Steinhardt, Taylor Blau, Johannes Schindelin Samo Pogačnik <samo_pogacnik@t-2.net> writes: > On Wed, 2026-01-28 at 05:23 +0100, Samo Pogačnik wrote: >> On Tue, 2026-01-20 at 12:59 -0800, Junio C Hamano wrote: >> ... >> > After this exchange, the topic has been dormant for almost full two >> > months. As I do not deal with shallow clones myself, even though I >> > understand that some folks rely on it working, I'd really prefer to >> > see somebody who are familiar with the underlying logic to review >> > this patch if we were to move forward with it. >> > >> >> I’m currently rewriting the patch and the commit message trying to >> address the boundary/border dilemma. I hope to be able to send a new >> version by the end of this week. > > I posted a new version of patch '[PATCH v3] shallow: ensure all boundary commits > are reachable with --shallow-since' on 31st of January. I hope you've seen it. > > thanks, Samo For those of you on the original CC: list taken from v2 review thread, who may be more qualified to review this topic than I am, the v3 is found at: https://lore.kernel.org/git/pull.2107.v3.git.git.1769876930544.gitgitgadget@gmail.com/ Thanks. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] shallow: set borders which are all reachable after clone shallow since 2026-02-07 5:19 ` Junio C Hamano @ 2026-03-07 7:13 ` Samo Pogačnik 0 siblings, 0 replies; 10+ messages in thread From: Samo Pogačnik @ 2026-03-07 7:13 UTC (permalink / raw) To: Patrick Steinhardt, Taylor Blau, Johannes Schindelin; +Cc: Junio C Hamano, git [-- Attachment #1: Type: text/plain, Size: 1359 bytes --] On Fri, 2026-02-06 at 21:19 -0800, Junio C Hamano wrote: > Samo Pogačnik <samo_pogacnik@t-2.net> writes: > > > On Wed, 2026-01-28 at 05:23 +0100, Samo Pogačnik wrote: > > > On Tue, 2026-01-20 at 12:59 -0800, Junio C Hamano wrote: > > > ... > > > > After this exchange, the topic has been dormant for almost full two > > > > months. As I do not deal with shallow clones myself, even though I > > > > understand that some folks rely on it working, I'd really prefer to > > > > see somebody who are familiar with the underlying logic to review > > > > this patch if we were to move forward with it. > > > > > > > > > > I’m currently rewriting the patch and the commit message trying to > > > address the boundary/border dilemma. I hope to be able to send a new > > > version by the end of this week. > > > > I posted a new version of patch '[PATCH v3] shallow: ensure all boundary > > commits > > are reachable with --shallow-since' on 31st of January. I hope you've seen > > it. > > > > thanks, Samo > > For those of you on the original CC: list taken from v2 review > thread, who may be more qualified to review this topic than I am, > the v3 is found at: > > > https://lore.kernel.org/git/pull.2107.v3.git.git.1769876930544.gitgitgadget@gmail.com/ > Could you please check this correction. Thanks, Samo [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v3] shallow: ensure all boundary commits are reachable with --shallow-since 2025-11-23 19:35 ` [PATCH v2] shallow: set borders which are all reachable after clone shallow since Samo Pogačnik via GitGitGadget 2025-11-25 0:43 ` Junio C Hamano @ 2026-01-31 16:28 ` Samo Pogačnik via GitGitGadget 1 sibling, 0 replies; 10+ messages in thread From: Samo Pogačnik via GitGitGadget @ 2026-01-31 16:28 UTC (permalink / raw) To: git; +Cc: Samo Pogačnik, Samo Pogačnik From: =?UTF-8?q?Samo=20Poga=C4=8Dnik?= <samo_pogacnik@t-2.net> When performing a shallow clone based on a date, it is possible for some declared shallow boundary commits to be unreachable. The original implementation of the generic shallow boundary finder based on rev-list marks a commit (from the initial list of boundary candidates) as shallow as soon as it finds one parent that is not in the initial candidate list. This can result in a successful shallow clone where some declared boundary commits are not reachable and therefore do not exist in the cloned repository. In such cases, the result contradicts the existing code comment, which correctly states that boundary commit candidates with a parent in the same candidate list must not be considered boundary commits. The added test case 'clone shallow-since all shallows reachable' exposes the problem. For example: 0. Original repository Graph: * e5fbe33 (HEAD -> main) Apr_4th_13:14:15 |\ | * 72f5b73 (branch) Apr_3rd_13:14:15 |/ * 0ba76c8 Apr_2nd_13:14:15 * f58ea3a Apr_1st_13:14:15 1. Clone with --shallow-since="2005-04-03 13:14:15" Shallows: e5fbe33724032807ab2d8636d4c9161f6716882d 72f5b73c5fec9a728adc42c23de1b87bb5b3ab16 Graph: * e5fbe33 (grafted, HEAD -> main, ...) Apr_4th_13:14:15 Note that the second shallow commit, 72f5b73c5fec9a728adc42c23de1b87bb5b3ab16, is not reachable. Update the generic shallow boundary finder to ensure that all shallow boundary commits are reachable. This is done by inspecting all parents of each initial boundary candidate. A candidate is marked shallow only if all of its parents are not in the initial candidate list. Otherwise, the candidate itself is not marked shallow, but its parents that are not in the candidate list are marked as boundary commits instead. With the same example, the corrected behavior results in: 1. Clone with --shallow-since="2005-04-03 13:14:15" Shallows: 72f5b73c5fec9a728adc42c23de1b87bb5b3ab16 0ba76c8b72273477b59026bea93fcc792bf48747 Graph: * e5fbe33 (HEAD -> main, ...) Apr_4th_13:14:15 |\ | * 72f5b73 (grafted) Apr_3rd_13:14:15 * 0ba76c8 (grafted) Apr_2nd_13:14:15 Signed-off-by: Samo Pogačnik <samo_pogacnik@t-2.net> --- Fixed --shallow-since generating descendant borders When shallow cloning based on a date, it happens that a list of commits is received, where some of the list border commits actually descend one from another. In such cases borders need to be expanded by additional parents and excluding the child as border. Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2107%2Fspog%2Ffix-shallow-since-v3 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2107/spog/fix-shallow-since-v3 Pull-Request: https://github.com/git/git/pull/2107 Range-diff vs v2: 1: 479692c386 ! 1: 34df169b67 shallow: set borders which are all reachable after clone shallow since @@ Metadata Author: Samo Pogačnik <samo_pogacnik@t-2.net> ## Commit message ## - shallow: set borders which are all reachable after clone shallow since + shallow: ensure all boundary commits are reachable with --shallow-since - When shallow cloning based on a date, it happens that not all - shallow border commits are reachable. + When performing a shallow clone based on a date, it is possible for some + declared shallow boundary commits to be unreachable. - Original implementation of a generic shallow boundary finder - based on rev-list sets a commit (from the initial list of border - commit candidates) to be the border commit as soon as it finds one - of its parentis that wasn't on the list of initial candidates. This - results in a successful shallow clone, where some of its declared - border commits may not be reachable and they would not actually exist - in the cloned repository. Thus the result may contradict existing - comment in the code, which correctly states that such commmit should - not be considered border. + The original implementation of the generic shallow boundary finder based + on rev-list marks a commit (from the initial list of boundary candidates) + as shallow as soon as it finds one parent that is not in the initial + candidate list. This can result in a successful shallow clone where some + declared boundary commits are not reachable and therefore do not exist + in the cloned repository. - One can inspect such case by running the added test scenario: - - 'clone shallow since all borders reachable' + In such cases, the result contradicts the existing code comment, which + correctly states that boundary commit candidates with a parent in the + same candidate list must not be considered boundary commits. - The modified implementation of a generic shallow boundary finder - based on rev-list ensures that all shallow border commits are reachable - also after being grafted. This is achieved by inspecting all parents - of each initial border commit candidate. The border commit candidate - is set border only when all its parents wern't on the initial list of - candidates. Otherwise the border commit candidate is not set as border - however its parents that weren't on the list of candidates are set as - borders. + The added test case 'clone shallow-since all shallows reachable' exposes + the problem. For example: + + 0. Original repository + Graph: + * e5fbe33 (HEAD -> main) Apr_4th_13:14:15 + |\ + | * 72f5b73 (branch) Apr_3rd_13:14:15 + |/ + * 0ba76c8 Apr_2nd_13:14:15 + * f58ea3a Apr_1st_13:14:15 + + 1. Clone with --shallow-since="2005-04-03 13:14:15" + Shallows: + e5fbe33724032807ab2d8636d4c9161f6716882d + 72f5b73c5fec9a728adc42c23de1b87bb5b3ab16 + Graph: + * e5fbe33 (grafted, HEAD -> main, ...) Apr_4th_13:14:15 + + Note that the second shallow commit, + 72f5b73c5fec9a728adc42c23de1b87bb5b3ab16, + is not reachable. + + Update the generic shallow boundary finder to ensure that all shallow + boundary commits are reachable. This is done by inspecting all parents of + each initial boundary candidate. A candidate is marked shallow only if + all of its parents are not in the initial candidate list. Otherwise, the + candidate itself is not marked shallow, but its parents that are not in + the candidate list are marked as boundary commits instead. + + With the same example, the corrected behavior results in: + + 1. Clone with --shallow-since="2005-04-03 13:14:15" + Shallows: + 72f5b73c5fec9a728adc42c23de1b87bb5b3ab16 + 0ba76c8b72273477b59026bea93fcc792bf48747 + Graph: + * e5fbe33 (HEAD -> main, ...) Apr_4th_13:14:15 + |\ + | * 72f5b73 (grafted) Apr_3rd_13:14:15 + * 0ba76c8 (grafted) Apr_2nd_13:14:15 Signed-off-by: Samo Pogačnik <samo_pogacnik@t-2.net> ## shallow.c ## +@@ shallow.c: static void show_commit(struct commit *commit, void *data) + } + + /* +- * Given rev-list arguments, run rev-list. All reachable commits +- * except border ones are marked with not_shallow_flag. Border commits +- * are marked with shallow_flag. The list of border/shallow commits +- * are also returned. ++ * Given rev-list arguments, run rev-list. All reachable commits except ++ * shallow boundary commits are marked with not_shallow_flag. ++ * Returned is a list of boundary commits marked with shallow_flag only. + */ + struct commit_list *get_shallow_commits_by_rev_list(struct strvec *argv, + int shallow_flag, @@ shallow.c: struct commit_list *get_shallow_commits_by_rev_list(struct strvec *argv, - * commit A is processed first, then commit B, whose parent is - * A, later. If NOT_SHALLOW on A is cleared at step 1, B - * itself is considered border at step 2, which is incorrect. + if (!not_shallow_list) + die("no commits selected for shallow requests"); + +- /* Mark all reachable commits as NOT_SHALLOW */ ++ /* Mark all reachable (listed) commits as NOT_SHALLOW */ + for (p = not_shallow_list; p; p = p->next) + p->item->object.flags |= not_shallow_flag; + + /* +- * mark border commits SHALLOW + NOT_SHALLOW. +- * We cannot clear NOT_SHALLOW right now. Imagine border +- * commit A is processed first, then commit B, whose parent is +- * A, later. If NOT_SHALLOW on A is cleared at step 1, B +- * itself is considered border at step 2, which is incorrect. ++ * Mark shallow commits from the list as SHALLOW + NOT_SHALLOW. ++ * Do not clear NOT_SHALLOW flags immediately. Consider two listed ++ * commits, B and its parent A, where A is shallow. If A is processed ++ * first and its NOT_SHALLOW flag is cleared immediately, B would later ++ * be incorrectly marked SHALLOW when processed. ++ * ++ * Also, listed commits may have multiple parents, and not all parents ++ * are necessarily listed (as they were not all traversed into the ++ * not_shallow_list from the revs in the first place — not marked ++ * NOT_SHALLOW). Therefore: + * -+ * We must also consider that B has multiple parents which may -+ * not all be marked NOT_SHALLOW (as they weren't traversed into -+ * the not_shallow_list from revs in the first place). Because of -+ * that an additional step is required to reconsider B as border. -+ * A commit from the not_shallow_list is considered border only -+ * when ALL its parents weren't on the not_shallow_list. -+ * When one or more parents of a commit from the not_shellow_list -+ * also come from that list, the commit is not considered border, -+ * but its non-listed parents are considered border commits. ++ * - A listed commit is marked SHALLOW only if none of its parents are ++ * listed. ++ * - If at least one parent of a listed commit is also listed, the ++ * commit itself is not marked SHALLOW; however, any of its non-listed ++ * parents are marked SHALLOW. + * -+ * The general processing goes like this: -+ * 1. Above we've painted the whole not_shallow_list of commits -+ * NOT_SHALLOW. -+ * 2. For each commit from the not_shallow_list (the code below) -+ * we paint SHALLOW this commit and its parent for all its -+ * parents that had not yet been painted NOT_SHALLOW. -+ * 3. Commits with all parents being painted only SHALLOW remain -+ * shallow and are being added to result list. -+ * 4. Commits without all parents being painted only SHALLOW are -+ * being excluded as borders, however their parents painted only -+ * SHALLOW are being added to the result borders list. ++ * Processing overview: ++ * 1. All listed commits have already been marked NOT_SHALLOW are not ++ * cleared until all shallow commits have been identified. ++ * ++ * 2. For each listed commit: ++ * - Mark the commit SHALLOW if it has any parent that is not listed. ++ * - Mark all non-listed parents as SHALLOW. ++ * - If the commit has at least one listed parent, it is excluded ++ * from the shallow result; however its parents marked only SHALLOW ++ * are added instead. ++ * - If all parents are marked only SHALLOW, the commit remains SHALLOW ++ * and is added to the shallow result. */ for (p = not_shallow_list; p; p = p->next) { struct commit *c = p->item; @@ shallow.c: struct commit_list *get_shallow_commits_by_rev_list(struct strvec *ar if (repo_parse_commit(the_repository, c)) die("unable to parse commit %s", oid_to_hex(&c->object.oid)); ++ if (!c->parents) ++ continue; for (parent = c->parents; parent; parent = parent->next) - if (!(parent->item->object.flags & not_shallow_flag)) { @@ shallow.c: struct commit_list *get_shallow_commits_by_rev_list(struct strvec *ar - break; + parent->item->object.flags |= shallow_flag; } ++ + if (must_not_be_shallow) { + c->object.flags &= ~shallow_flag; + for (parent = c->parents; parent; parent = parent->next) -+ if (parent->item->object.flags & shallow_flag) { -+ parent->item->object.flags |= not_shallow_flag; ++ if ((parent->item->object.flags & shallow_flag) && ++ !(parent->item->object.flags & not_shallow_flag)) + commit_list_insert(parent->item, &result); -+ } + } else { + for (parent = c->parents; parent; parent = parent->next) + parent->item->object.flags &= ~shallow_flag; @@ shallow.c: struct commit_list *get_shallow_commits_by_rev_list(struct strvec *ar } free_commit_list(not_shallow_list); + /* +- * Now we can clean up NOT_SHALLOW on border commits. Having ++ * Now we can clean up NOT_SHALLOW on shallow commits. Having + * both flags set can confuse the caller. + */ + for (p = result; p; p = p->next) { ## t/t5500-fetch-pack.sh ## @@ t/t5500-fetch-pack.sh: test_expect_success 'shallow since with commit graph and already-seen commit' ' ) ' -+test_expect_success 'clone shallow since all borders reachable' ' -+ test_create_repo shallow-since-all-borders-reachable && ++test_expect_success 'clone shallow-since all shallows reachable' ' ++ test_create_repo shallow-since-all-shallows-reachable && + ( -+ rm -rf shallow123 && -+ cd shallow-since-all-borders-reachable && -+ GIT_COMMITTER_DATE="2025-08-19 12:34:56" git commit --allow-empty -m one && -+ GIT_COMMITTER_DATE="2025-08-20 12:34:56" git switch -c branch && -+ GIT_COMMITTER_DATE="2025-08-21 12:34:56" git commit --allow-empty -m two && -+ GIT_COMMITTER_DATE="2025-08-22 12:34:56" git commit --allow-empty -m three && -+ GIT_COMMITTER_DATE="2025-08-23 12:34:56" git switch main && -+ GIT_COMMITTER_DATE="2025-08-24 12:34:56" git merge branch --no-ff && -+ GIT_COMMITTER_DATE="2025-08-26 12:34:56" git clone --shallow-since "2025-08-21 12:34:56" "file://$(pwd)/." ../shallow123 && -+ cd ../shallow123 && -+ echo "Shallow borders:" && -+ cat .git/shallow && -+ $(for commit in $(cat .git/shallow); do git rev-list $commit 1>/dev/null || exit 1; done) ++ rm -rf shallow123 && ++ cd shallow-since-all-shallows-reachable && ++ GIT_COMMITTER_DATE="2005-04-01 13:14:15" git commit --allow-empty -m Apr_1st && ++ GIT_COMMITTER_DATE="2005-04-02 13:14:15" git commit --allow-empty -m Apr_2nd && ++ GIT_COMMITTER_DATE="2005-04-03 13:14:15" git switch -c branch && ++ GIT_COMMITTER_DATE="2005-04-03 13:14:15" git commit --allow-empty -m Apr_3rd && ++ GIT_COMMITTER_DATE="2005-04-04 13:14:15" git switch main && ++ GIT_COMMITTER_DATE="2005-04-04 13:14:15" git merge branch --no-ff -m Apr_4th && ++ git clone --shallow-since "2005-04-03 13:14:15" "file://$(pwd)/." ../shallow123 && ++ cd ../shallow123 && ++ for commit in $(cat .git/shallow 2> /dev/null) ++ do ++ git rev-list $commit 1>/dev/null || exit 1 ++ done + ) +' + shallow.c | 67 ++++++++++++++++++++++++++++++++++--------- t/t5500-fetch-pack.sh | 20 +++++++++++++ 2 files changed, 73 insertions(+), 14 deletions(-) diff --git a/shallow.c b/shallow.c index 55b9cd9d3f..cd99e5777f 100644 --- a/shallow.c +++ b/shallow.c @@ -208,10 +208,9 @@ static void show_commit(struct commit *commit, void *data) } /* - * Given rev-list arguments, run rev-list. All reachable commits - * except border ones are marked with not_shallow_flag. Border commits - * are marked with shallow_flag. The list of border/shallow commits - * are also returned. + * Given rev-list arguments, run rev-list. All reachable commits except + * shallow boundary commits are marked with not_shallow_flag. + * Returned is a list of boundary commits marked with shallow_flag only. */ struct commit_list *get_shallow_commits_by_rev_list(struct strvec *argv, int shallow_flag, @@ -241,36 +240,76 @@ struct commit_list *get_shallow_commits_by_rev_list(struct strvec *argv, if (!not_shallow_list) die("no commits selected for shallow requests"); - /* Mark all reachable commits as NOT_SHALLOW */ + /* Mark all reachable (listed) commits as NOT_SHALLOW */ for (p = not_shallow_list; p; p = p->next) p->item->object.flags |= not_shallow_flag; /* - * mark border commits SHALLOW + NOT_SHALLOW. - * We cannot clear NOT_SHALLOW right now. Imagine border - * commit A is processed first, then commit B, whose parent is - * A, later. If NOT_SHALLOW on A is cleared at step 1, B - * itself is considered border at step 2, which is incorrect. + * Mark shallow commits from the list as SHALLOW + NOT_SHALLOW. + * Do not clear NOT_SHALLOW flags immediately. Consider two listed + * commits, B and its parent A, where A is shallow. If A is processed + * first and its NOT_SHALLOW flag is cleared immediately, B would later + * be incorrectly marked SHALLOW when processed. + * + * Also, listed commits may have multiple parents, and not all parents + * are necessarily listed (as they were not all traversed into the + * not_shallow_list from the revs in the first place — not marked + * NOT_SHALLOW). Therefore: + * + * - A listed commit is marked SHALLOW only if none of its parents are + * listed. + * - If at least one parent of a listed commit is also listed, the + * commit itself is not marked SHALLOW; however, any of its non-listed + * parents are marked SHALLOW. + * + * Processing overview: + * 1. All listed commits have already been marked NOT_SHALLOW are not + * cleared until all shallow commits have been identified. + * + * 2. For each listed commit: + * - Mark the commit SHALLOW if it has any parent that is not listed. + * - Mark all non-listed parents as SHALLOW. + * - If the commit has at least one listed parent, it is excluded + * from the shallow result; however its parents marked only SHALLOW + * are added instead. + * - If all parents are marked only SHALLOW, the commit remains SHALLOW + * and is added to the shallow result. */ for (p = not_shallow_list; p; p = p->next) { struct commit *c = p->item; struct commit_list *parent; + int must_not_be_shallow = 0; if (repo_parse_commit(the_repository, c)) die("unable to parse commit %s", oid_to_hex(&c->object.oid)); + if (!c->parents) + continue; for (parent = c->parents; parent; parent = parent->next) - if (!(parent->item->object.flags & not_shallow_flag)) { + if (parent->item->object.flags & not_shallow_flag) { + must_not_be_shallow = 1; + } else { c->object.flags |= shallow_flag; - commit_list_insert(c, &result); - break; + parent->item->object.flags |= shallow_flag; } + + if (must_not_be_shallow) { + c->object.flags &= ~shallow_flag; + for (parent = c->parents; parent; parent = parent->next) + if ((parent->item->object.flags & shallow_flag) && + !(parent->item->object.flags & not_shallow_flag)) + commit_list_insert(parent->item, &result); + } else { + for (parent = c->parents; parent; parent = parent->next) + parent->item->object.flags &= ~shallow_flag; + commit_list_insert(c, &result); + } } free_commit_list(not_shallow_list); /* - * Now we can clean up NOT_SHALLOW on border commits. Having + * Now we can clean up NOT_SHALLOW on shallow commits. Having * both flags set can confuse the caller. */ for (p = result; p; p = p->next) { diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh index 2677cd5faa..44e274281a 100755 --- a/t/t5500-fetch-pack.sh +++ b/t/t5500-fetch-pack.sh @@ -904,6 +904,26 @@ test_expect_success 'shallow since with commit graph and already-seen commit' ' ) ' +test_expect_success 'clone shallow-since all shallows reachable' ' + test_create_repo shallow-since-all-shallows-reachable && + ( + rm -rf shallow123 && + cd shallow-since-all-shallows-reachable && + GIT_COMMITTER_DATE="2005-04-01 13:14:15" git commit --allow-empty -m Apr_1st && + GIT_COMMITTER_DATE="2005-04-02 13:14:15" git commit --allow-empty -m Apr_2nd && + GIT_COMMITTER_DATE="2005-04-03 13:14:15" git switch -c branch && + GIT_COMMITTER_DATE="2005-04-03 13:14:15" git commit --allow-empty -m Apr_3rd && + GIT_COMMITTER_DATE="2005-04-04 13:14:15" git switch main && + GIT_COMMITTER_DATE="2005-04-04 13:14:15" git merge branch --no-ff -m Apr_4th && + git clone --shallow-since "2005-04-03 13:14:15" "file://$(pwd)/." ../shallow123 && + cd ../shallow123 && + for commit in $(cat .git/shallow 2> /dev/null) + do + git rev-list $commit 1>/dev/null || exit 1 + done + ) +' + test_expect_success 'shallow clone exclude tag two' ' test_create_repo shallow-exclude && ( base-commit: debbc87557487aa9a8ed8a35367d17f8b4081c76 -- gitgitgadget ^ permalink raw reply related [flat|nested] 10+ messages in thread
end of thread, other threads:[~2026-03-07 7:21 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-11-22 10:38 [PATCH] Fixed --shallow-since generating descendant borders Samo Pogačnik via GitGitGadget 2025-11-22 17:36 ` Junio C Hamano 2025-11-23 19:35 ` [PATCH v2] shallow: set borders which are all reachable after clone shallow since Samo Pogačnik via GitGitGadget 2025-11-25 0:43 ` Junio C Hamano 2026-01-20 20:59 ` Junio C Hamano 2026-01-28 4:23 ` Samo Pogačnik 2026-02-07 5:06 ` Samo Pogačnik 2026-02-07 5:19 ` Junio C Hamano 2026-03-07 7:13 ` Samo Pogačnik 2026-01-31 16:28 ` [PATCH v3] shallow: ensure all boundary commits are reachable with --shallow-since Samo Pogačnik via GitGitGadget
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox