* [PATCH] builtin/push: call set_refspecs after validating remote
@ 2024-07-08 14:03 Karthik Nayak
2024-07-08 19:17 ` Junio C Hamano
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Karthik Nayak @ 2024-07-08 14:03 UTC (permalink / raw)
To: karthik.188; +Cc: git, peff
Since 9badf97c4 (remote: allow resetting url list), we reset the remote
URL if the provided URL is empty. This means any caller of
`remotes_remote_get()` would now get a NULL remote.
The 'builtin/push.c' code, calls 'set_refspecs' before validating the
remote. This worked earlier since we would get a remote, albeit with an
empty URL. With the new changes, we get a NULL remote and this crashes.
Do a simple fix by doing remote validation first and also add a test to
validate the bug fix.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
I noticed that this was breaking on master. We run tests on Git master
for Gitaly at GitLab and I noticed a SEFAULT. I could also reproduce the
bug by simply doing 'git push "" refs/heads/master' on master, next and
seen.
builtin/push.c | 7 ++++---
t/t5529-push-errors.sh | 8 ++++++++
2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/builtin/push.c b/builtin/push.c
index 8260c6e46a..992f603de7 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -630,10 +630,8 @@ int cmd_push(int argc, const char **argv, const char *prefix)
if (tags)
refspec_append(&rs, "refs/tags/*");
- if (argc > 0) {
+ if (argc > 0)
repo = argv[0];
- set_refspecs(argv + 1, argc - 1, repo);
- }
remote = pushremote_get(repo);
if (!remote) {
@@ -649,6 +647,9 @@ int cmd_push(int argc, const char **argv, const char *prefix)
" git push <name>\n"));
}
+ if (argc > 0)
+ set_refspecs(argv + 1, argc - 1, repo);
+
if (remote->mirror)
flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE);
diff --git a/t/t5529-push-errors.sh b/t/t5529-push-errors.sh
index 0247137cb3..771f5f8ae8 100755
--- a/t/t5529-push-errors.sh
+++ b/t/t5529-push-errors.sh
@@ -2,6 +2,9 @@
test_description='detect some push errors early (before contacting remote)'
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
@@ -38,6 +41,11 @@ test_expect_success 'detect missing sha1 expressions early' '
test_cmp expect rp-ran
'
+test_expect_success 'detect empty remote' '
+ test_must_fail git push "" main 2> stderr &&
+ grep "fatal: bad repository ''" stderr
+'
+
test_expect_success 'detect ambiguous refs early' '
git branch foo &&
git tag foo &&
--
2.45.1
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH] builtin/push: call set_refspecs after validating remote 2024-07-08 14:03 [PATCH] builtin/push: call set_refspecs after validating remote Karthik Nayak @ 2024-07-08 19:17 ` Junio C Hamano 2024-07-08 23:33 ` Jeff King 2024-07-09 9:59 ` Karthik Nayak 2024-07-08 23:32 ` Jeff King 2024-07-09 14:49 ` [PATCH v2] " Karthik Nayak 2 siblings, 2 replies; 20+ messages in thread From: Junio C Hamano @ 2024-07-08 19:17 UTC (permalink / raw) To: Karthik Nayak; +Cc: git, peff Karthik Nayak <karthik.188@gmail.com> writes: > Since 9badf97c4 (remote: allow resetting url list), Please do not be original in places where it shouldn't matter. Use "git show -s --format=reference" that includes the datestamp to help readers judge how old the problem is. > we reset the remote > URL if the provided URL is empty. This means any caller of > `remotes_remote_get()` would now get a NULL remote. "NULL remote" meaning? If you have this: [remote "multi"] url = wrong-one url = wrong-two url = and ask "remotes_remote_get()" to give you the remote "multi", you'd get a remote whose URL array has no elements. Is that what you are referring to? > The 'builtin/push.c' code, calls 'set_refspecs' before validating the > remote. There is a comment about "lazily grab remote", so it is very understandable. > This worked earlier since we would get a remote, albeit with an > empty URL. With the new changes, we get a NULL remote and this crashes. You'd really really need to clarify what you mean by "a NULL remote" if you want the proposed log message and the change to be understood. The change made by 9badf97c (remote: allow resetting url list, 2024-06-14), as far as I can tell, can make the strvecs that hold URL and pushURL in a remote structure empty, but it does not otherwise destroy the remote structure, or nullify a pointer that points at the remote structure. So I am completely lost here. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] builtin/push: call set_refspecs after validating remote 2024-07-08 19:17 ` Junio C Hamano @ 2024-07-08 23:33 ` Jeff King 2024-07-09 9:59 ` Karthik Nayak 1 sibling, 0 replies; 20+ messages in thread From: Jeff King @ 2024-07-08 23:33 UTC (permalink / raw) To: Junio C Hamano; +Cc: Karthik Nayak, git On Mon, Jul 08, 2024 at 12:17:38PM -0700, Junio C Hamano wrote: > > This worked earlier since we would get a remote, albeit with an > > empty URL. With the new changes, we get a NULL remote and this crashes. > > You'd really really need to clarify what you mean by "a NULL remote" > if you want the proposed log message and the change to be > understood. The change made by 9badf97c (remote: allow resetting > url list, 2024-06-14), as far as I can tell, can make the strvecs > that hold URL and pushURL in a remote structure empty, but it does > not otherwise destroy the remote structure, or nullify a pointer > that points at the remote structure. So I am completely lost here. I was somewhat confused how it could happen, too. ;) I left more comments elsewhere in the thread, but the gist of it is that an empty remote name confuses the "when there are no urls, fall back to the remote name as a url" code. -Peff ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] builtin/push: call set_refspecs after validating remote 2024-07-08 19:17 ` Junio C Hamano 2024-07-08 23:33 ` Jeff King @ 2024-07-09 9:59 ` Karthik Nayak 1 sibling, 0 replies; 20+ messages in thread From: Karthik Nayak @ 2024-07-09 9:59 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, peff [-- Attachment #1: Type: text/plain, Size: 2622 bytes --] Junio C Hamano <gitster@pobox.com> writes: > Karthik Nayak <karthik.188@gmail.com> writes: > >> Since 9badf97c4 (remote: allow resetting url list), > > Please do not be original in places where it shouldn't matter. Use > "git show -s --format=reference" that includes the datestamp to help > readers judge how old the problem is. > Will do. >> we reset the remote >> URL if the provided URL is empty. This means any caller of >> `remotes_remote_get()` would now get a NULL remote. > > "NULL remote" meaning? > > If you have this: > > [remote "multi"] > url = wrong-one > url = wrong-two > url = > > and ask "remotes_remote_get()" to give you the remote "multi", you'd > get a remote whose URL array has no elements. Is that what you are > referring to? I was referring to the 'struct remote *' being 'NULL'. I think Jeff explained this in his email reply to this. >> The 'builtin/push.c' code, calls 'set_refspecs' before validating the >> remote. > > There is a comment about "lazily grab remote", so it is very > understandable. > >> This worked earlier since we would get a remote, albeit with an >> empty URL. With the new changes, we get a NULL remote and this crashes. > > You'd really really need to clarify what you mean by "a NULL remote" > if you want the proposed log message and the change to be > understood. The change made by 9badf97c (remote: allow resetting > url list, 2024-06-14), as far as I can tell, can make the strvecs > that hold URL and pushURL in a remote structure empty, but it does > not otherwise destroy the remote structure, or nullify a pointer > that points at the remote structure. So I am completely lost here. I will amend the commit to the following: Since 9badf97c42 (remote: allow resetting url list, 2024-06-14), we reset the remote URL if the provided URL is empty. When a user of 'remotes_remote_get' tries to fetch a remote with an empty repo name, the function initializes the remote via 'make_remote'. But the remote is still not a valid remote, since the URL is empty, so it tries to add the URL alias using 'add_url_alias'. This in-turn will call 'add_url', but since the URL is empty we call 'strvec_clear' on the `remote->url`. Back in 'remotes_remote_get', we again check if the remote is valid, which fails, so we return 'NULL' for the 'struct remote *' value. The 'builtin/push.c' code, calls 'set_refspecs' before validating the remote. This worked earlier since we would get a remote, albeit with an empty URL. With the new changes, we get a NULL remote this causes the check for remote to fail and raises the BUG. --- I hope this makes it clearer [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] builtin/push: call set_refspecs after validating remote 2024-07-08 14:03 [PATCH] builtin/push: call set_refspecs after validating remote Karthik Nayak 2024-07-08 19:17 ` Junio C Hamano @ 2024-07-08 23:32 ` Jeff King 2024-07-09 9:05 ` Karthik Nayak 2024-07-09 14:49 ` [PATCH v2] " Karthik Nayak 2 siblings, 1 reply; 20+ messages in thread From: Jeff King @ 2024-07-08 23:32 UTC (permalink / raw) To: Karthik Nayak; +Cc: git On Mon, Jul 08, 2024 at 04:03:50PM +0200, Karthik Nayak wrote: > Since 9badf97c4 (remote: allow resetting url list), we reset the remote > URL if the provided URL is empty. This means any caller of > `remotes_remote_get()` would now get a NULL remote. > > The 'builtin/push.c' code, calls 'set_refspecs' before validating the > remote. This worked earlier since we would get a remote, albeit with an > empty URL. With the new changes, we get a NULL remote and this crashes. Interesting. I think this was always a bit buggy, in the sense that the some of the code was prepared for pushremote_get() to return NULL, but the set_refspecs() call was not. So in theory _any_ problem with the remote that caused pushremote_get() to bail out would be a problem. But in practice, I'm not sure there was a way to do so, since the remote.c code usually falls back on the given name as the url if needed, rather than returning NULL. And 9badf97c4 does something a bit unexpected here, since the fallback calls the same add_url() function that we feed the config values to, and so it respects the same "empty means reset" logic. Which means that an empty-string remote name will no longer fall back in that way. It's a little surprising that we hit the "empty means reset" logic here. I wonder if that fallback path should be avoiding it. OTOH, an empty string URL is nonsense, and it's not going to work, so maybe returning a NULL remote is a good thing. The issue here is mostly just calling BUG() for something that is bogus input. > Do a simple fix by doing remote validation first and also add a test to > validate the bug fix. OK, so we push it further down, past the "if (!remote)" check in the caller. I think that's a good fix. It does make me wonder why set_refspecs() does not simply take the remote struct in the first place? I.e.: diff --git a/builtin/push.c b/builtin/push.c index 992f603de7..ae787f1f63 100644 --- a/builtin/push.c +++ b/builtin/push.c @@ -96,9 +96,8 @@ static void refspec_append_mapped(struct refspec *refspec, const char *ref, refspec_append(refspec, ref); } -static void set_refspecs(const char **refs, int nr, const char *repo) +static void set_refspecs(const char **refs, int nr, struct remote *remote) { - struct remote *remote = NULL; struct ref *local_refs = NULL; int i; @@ -127,12 +126,6 @@ static void set_refspecs(const char **refs, int nr, const char *repo) if (count_refspec_match(ref, local_refs, &matched) != 1) { refspec_append(&rs, ref); } else { - /* lazily grab remote */ - if (!remote) - remote = remote_get(repo); - if (!remote) - BUG("must get a remote for repo '%s'", repo); - refspec_append_mapped(&rs, ref, remote, matched); } } else @@ -648,7 +641,7 @@ int cmd_push(int argc, const char **argv, const char *prefix) } if (argc > 0) - set_refspecs(argv + 1, argc - 1, repo); + set_refspecs(argv + 1, argc - 1, remote); if (remote->mirror) flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE); which is now possible after your patch. Note that set_refspecs() currently calls remote_get(), but the caller will use pushremote_get() to get the remote. I think that set_refspecs() is actually wrong here, but it doesn't matter in practice because "repo" is always non-NULL at this point. But with the patch above, this kind of error would be impossible to trigger (but again, your patch is still necessary to get the ordering right in the first place). > I noticed that this was breaking on master. We run tests on Git master > for Gitaly at GitLab and I noticed a SEFAULT. I could also reproduce the > bug by simply doing 'git push "" refs/heads/master' on master, next and > seen. I don't see a segfault, but rather a BUG() that triggers SIGABRT. Does that match what you see? > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME > + > TEST_PASSES_SANITIZE_LEAK=true > . ./test-lib.sh > > @@ -38,6 +41,11 @@ test_expect_success 'detect missing sha1 expressions early' ' > test_cmp expect rp-ran > ' > > +test_expect_success 'detect empty remote' ' > + test_must_fail git push "" main 2> stderr && > + grep "fatal: bad repository ''" stderr > +' The test makes sense. Your single-quotes are not doing what you expect, though (they are closing and re-opening the outer test body, so end up as the empty string). You can use $SQ$SQ instead (I'm also working on patches to allow you to specify the body as a here-doc to avoid exactly this kind of situation, but I don't think we should depend on that yet). I was a little surprised you needed to use the name "main" and not just "HEAD" or something neutral (avoiding the INITIAL_BRANCH_NAME stuff). But we only hit the problematic code path when we match a local name. Also, minor style nit: use "2>stderr" with no space. Anyway, thanks for catching my bug! I think your fix is the right approach, and we just need to adjust the test. I do think we should do the patch I suggested above on top. I'd be happy if you want to add it in to your series, but let me know if you want me to write a commit message or just send it separately afterwards. -Peff ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] builtin/push: call set_refspecs after validating remote 2024-07-08 23:32 ` Jeff King @ 2024-07-09 9:05 ` Karthik Nayak 2024-07-09 9:59 ` Jeff King 0 siblings, 1 reply; 20+ messages in thread From: Karthik Nayak @ 2024-07-09 9:05 UTC (permalink / raw) To: Jeff King; +Cc: git [-- Attachment #1: Type: text/plain, Size: 5927 bytes --] Jeff King <peff@peff.net> writes: > On Mon, Jul 08, 2024 at 04:03:50PM +0200, Karthik Nayak wrote: > >> Since 9badf97c4 (remote: allow resetting url list), we reset the remote >> URL if the provided URL is empty. This means any caller of >> `remotes_remote_get()` would now get a NULL remote. >> >> The 'builtin/push.c' code, calls 'set_refspecs' before validating the >> remote. This worked earlier since we would get a remote, albeit with an >> empty URL. With the new changes, we get a NULL remote and this crashes. > > Interesting. I think this was always a bit buggy, in the sense that the > some of the code was prepared for pushremote_get() to return NULL, but > the set_refspecs() call was not. So in theory _any_ problem with the > remote that caused pushremote_get() to bail out would be a problem. But > in practice, I'm not sure there was a way to do so, since the remote.c > code usually falls back on the given name as the url if needed, rather > than returning NULL. > Yup, that seems right. > And 9badf97c4 does something a bit unexpected here, since the fallback > calls the same add_url() function that we feed the config values to, and > so it respects the same "empty means reset" logic. Which means that an > empty-string remote name will no longer fall back in that way. > > It's a little surprising that we hit the "empty means reset" logic here. > I wonder if that fallback path should be avoiding it. OTOH, an empty > string URL is nonsense, and it's not going to work, so maybe returning a > NULL remote is a good thing. The issue here is mostly just calling BUG() > for something that is bogus input. > Yup, that's the explanation that I should've added, thanks. This is a good summary. >> Do a simple fix by doing remote validation first and also add a test to >> validate the bug fix. > > OK, so we push it further down, past the "if (!remote)" check in the > caller. I think that's a good fix. It does make me wonder why > set_refspecs() does not simply take the remote struct in the first > place? I.e.: > > diff --git a/builtin/push.c b/builtin/push.c > index 992f603de7..ae787f1f63 100644 > --- a/builtin/push.c > +++ b/builtin/push.c > @@ -96,9 +96,8 @@ static void refspec_append_mapped(struct refspec *refspec, const char *ref, > refspec_append(refspec, ref); > } > > -static void set_refspecs(const char **refs, int nr, const char *repo) > +static void set_refspecs(const char **refs, int nr, struct remote *remote) > { > - struct remote *remote = NULL; > struct ref *local_refs = NULL; > int i; > > @@ -127,12 +126,6 @@ static void set_refspecs(const char **refs, int nr, const char *repo) > if (count_refspec_match(ref, local_refs, &matched) != 1) { > refspec_append(&rs, ref); > } else { > - /* lazily grab remote */ > - if (!remote) > - remote = remote_get(repo); > - if (!remote) > - BUG("must get a remote for repo '%s'", repo); > - > refspec_append_mapped(&rs, ref, remote, matched); > } > } else > @@ -648,7 +641,7 @@ int cmd_push(int argc, const char **argv, const char *prefix) > } > > if (argc > 0) > - set_refspecs(argv + 1, argc - 1, repo); > + set_refspecs(argv + 1, argc - 1, remote); > > if (remote->mirror) > flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE); > > which is now possible after your patch. Note that set_refspecs() > currently calls remote_get(), but the caller will use pushremote_get() > to get the remote. I think that set_refspecs() is actually wrong here, > but it doesn't matter in practice because "repo" is always non-NULL at > this point. > > But with the patch above, this kind of error would be impossible to > trigger (but again, your patch is still necessary to get the ordering > right in the first place). > I think this is a great addition on top of my patch, I will add it in. >> I noticed that this was breaking on master. We run tests on Git master >> for Gitaly at GitLab and I noticed a SEFAULT. I could also reproduce the >> bug by simply doing 'git push "" refs/heads/master' on master, next and >> seen. > > I don't see a segfault, but rather a BUG() that triggers SIGABRT. Does > that match what you see? > Yes, it is indeed SIGABRT and not SEGFAULT. Will correct my message. >> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main >> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME >> + >> TEST_PASSES_SANITIZE_LEAK=true >> . ./test-lib.sh >> >> @@ -38,6 +41,11 @@ test_expect_success 'detect missing sha1 expressions early' ' >> test_cmp expect rp-ran >> ' >> >> +test_expect_success 'detect empty remote' ' >> + test_must_fail git push "" main 2> stderr && >> + grep "fatal: bad repository ''" stderr >> +' > > The test makes sense. Your single-quotes are not doing what you expect, > though (they are closing and re-opening the outer test body, so end up > as the empty string). You can use $SQ$SQ instead (I'm also working on > patches to allow you to specify the body as a here-doc to avoid exactly > this kind of situation, but I don't think we should depend on that yet). > Good catch. I'm wondering how it worked though, since I wrote the test before the fix and used the test to validate the failure and the fix. > I was a little surprised you needed to use the name "main" and not just > "HEAD" or something neutral (avoiding the INITIAL_BRANCH_NAME stuff). > But we only hit the problematic code path when we match a local name. > Exactly. I'll add a comment though, to make it clearer > Also, minor style nit: use "2>stderr" with no space. > Thanks, will change. > > Anyway, thanks for catching my bug! I think your fix is the right > approach, and we just need to adjust the test. I do think we should do > the patch I suggested above on top. I'd be happy if you want to add it > in to your series, but let me know if you want me to write a commit > message or just send it separately afterwards. > > -Peff I will add it in. Thanks for your help! [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] builtin/push: call set_refspecs after validating remote 2024-07-09 9:05 ` Karthik Nayak @ 2024-07-09 9:59 ` Jeff King 0 siblings, 0 replies; 20+ messages in thread From: Jeff King @ 2024-07-09 9:59 UTC (permalink / raw) To: Karthik Nayak; +Cc: git On Tue, Jul 09, 2024 at 05:05:58AM -0400, Karthik Nayak wrote: > >> +test_expect_success 'detect empty remote' ' > >> + test_must_fail git push "" main 2> stderr && > >> + grep "fatal: bad repository ''" stderr > >> +' > > > > The test makes sense. Your single-quotes are not doing what you expect, > > though (they are closing and re-opening the outer test body, so end up > > as the empty string). You can use $SQ$SQ instead (I'm also working on > > patches to allow you to specify the body as a here-doc to avoid exactly > > this kind of situation, but I don't think we should depend on that yet). > > Good catch. I'm wondering how it worked though, since I wrote the test > before the fix and used the test to validate the failure and the fix. You end up grepping for the sub-string "fatal: bad repository ", which still matches. It's just not quite as careful as what you intended. -Peff ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2] builtin/push: call set_refspecs after validating remote 2024-07-08 14:03 [PATCH] builtin/push: call set_refspecs after validating remote Karthik Nayak 2024-07-08 19:17 ` Junio C Hamano 2024-07-08 23:32 ` Jeff King @ 2024-07-09 14:49 ` Karthik Nayak 2024-07-09 18:44 ` Junio C Hamano 2024-07-11 9:39 ` [PATCH v3] " Karthik Nayak 2 siblings, 2 replies; 20+ messages in thread From: Karthik Nayak @ 2024-07-09 14:49 UTC (permalink / raw) To: karthik.188; +Cc: git, peff Since 9badf97c42 (remote: allow resetting url list, 2024-06-14), we reset the remote URL if the provided URL is empty. When a user of 'remotes_remote_get' tries to fetch a remote with an empty repo name, the function initializes the remote via 'make_remote'. But the remote is still not a valid remote, since the URL is empty, so it tries to add the URL alias using 'add_url_alias'. This in-turn will call 'add_url', but since the URL is empty we call 'strvec_clear' on the `remote->url`. Back in 'remotes_remote_get', we again check if the remote is valid, which fails, so we return 'NULL' for the 'struct remote *' value The 'builtin/push.c' code, calls 'set_refspecs' before validating the remote. This worked with empty repo names earlier since we would get a remote, albeit with an empty URL. With the new changes, we get a 'NULL' remote value, this causes the check for remote to fail and raises the BUG in 'set_refspecs'. Do a simple fix by doing remote validation first. Also add a test to validate the bug fix. With this, we can also now directly pass remote to 'set_refspecs' instead of it trying to lazily obtain it. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> --- Changes from v1: - Added more details to the commit message to clarify the issue at hand. - Fixed the quoting in the test. - Cleaned up 'set_refspecs' by now accepting a remote instead of trying to obtain one. Range-diff against v1: 1: 1bd4dc424d ! 1: fd9a9387e9 builtin/push: call set_refspecs after validating remote @@ Metadata ## Commit message ## builtin/push: call set_refspecs after validating remote - Since 9badf97c4 (remote: allow resetting url list), we reset the remote - URL if the provided URL is empty. This means any caller of - `remotes_remote_get()` would now get a NULL remote. + Since 9badf97c42 (remote: allow resetting url list, 2024-06-14), we + reset the remote URL if the provided URL is empty. When a user of + 'remotes_remote_get' tries to fetch a remote with an empty repo name, + the function initializes the remote via 'make_remote'. But the remote is + still not a valid remote, since the URL is empty, so it tries to add the + URL alias using 'add_url_alias'. This in-turn will call 'add_url', but + since the URL is empty we call 'strvec_clear' on the `remote->url`. Back + in 'remotes_remote_get', we again check if the remote is valid, which + fails, so we return 'NULL' for the 'struct remote *' value The 'builtin/push.c' code, calls 'set_refspecs' before validating the - remote. This worked earlier since we would get a remote, albeit with an - empty URL. With the new changes, we get a NULL remote and this crashes. + remote. This worked with empty repo names earlier since we would get a + remote, albeit with an empty URL. With the new changes, we get a 'NULL' + remote value, this causes the check for remote to fail and raises the + BUG in 'set_refspecs'. - Do a simple fix by doing remote validation first and also add a test to - validate the bug fix. + Do a simple fix by doing remote validation first. Also add a test to + validate the bug fix. With this, we can also now directly pass remote to + 'set_refspecs' instead of it trying to lazily obtain it. + Helped-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> ## builtin/push.c ## +@@ builtin/push.c: static void refspec_append_mapped(struct refspec *refspec, const char *ref, + refspec_append(refspec, ref); + } + +-static void set_refspecs(const char **refs, int nr, const char *repo) ++static void set_refspecs(const char **refs, int nr, struct remote *remote) + { +- struct remote *remote = NULL; + struct ref *local_refs = NULL; + int i; + +@@ builtin/push.c: static void set_refspecs(const char **refs, int nr, const char *repo) + local_refs = get_local_heads(); + + /* Does "ref" uniquely name our ref? */ +- if (count_refspec_match(ref, local_refs, &matched) != 1) { ++ if (count_refspec_match(ref, local_refs, &matched) != 1) + refspec_append(&rs, ref); +- } else { +- /* lazily grab remote */ +- if (!remote) +- remote = remote_get(repo); +- if (!remote) +- BUG("must get a remote for repo '%s'", repo); +- ++ else + refspec_append_mapped(&rs, ref, remote, matched); +- } + } else + refspec_append(&rs, ref); + } @@ builtin/push.c: int cmd_push(int argc, const char **argv, const char *prefix) if (tags) refspec_append(&rs, "refs/tags/*"); @@ builtin/push.c: int cmd_push(int argc, const char **argv, const char *prefix) } + if (argc > 0) -+ set_refspecs(argv + 1, argc - 1, repo); ++ set_refspecs(argv + 1, argc - 1, remote); + if (remote->mirror) flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE); @@ t/t5529-push-errors.sh: test_expect_success 'detect missing sha1 expressions ear test_cmp expect rp-ran ' ++# We need to use an existing local_ref so that the remote is mapped to ++# it in 'builtin/push.c:set_refspecs()'. +test_expect_success 'detect empty remote' ' + test_must_fail git push "" main 2> stderr && -+ grep "fatal: bad repository ''" stderr ++ grep "fatal: bad repository ${SQ}${SQ}" stderr +' + test_expect_success 'detect ambiguous refs early' ' builtin/push.c | 21 +++++++-------------- t/t5529-push-errors.sh | 10 ++++++++++ 2 files changed, 17 insertions(+), 14 deletions(-) diff --git a/builtin/push.c b/builtin/push.c index 8260c6e46a..7a67398124 100644 --- a/builtin/push.c +++ b/builtin/push.c @@ -96,9 +96,8 @@ static void refspec_append_mapped(struct refspec *refspec, const char *ref, refspec_append(refspec, ref); } -static void set_refspecs(const char **refs, int nr, const char *repo) +static void set_refspecs(const char **refs, int nr, struct remote *remote) { - struct remote *remote = NULL; struct ref *local_refs = NULL; int i; @@ -124,17 +123,10 @@ static void set_refspecs(const char **refs, int nr, const char *repo) local_refs = get_local_heads(); /* Does "ref" uniquely name our ref? */ - if (count_refspec_match(ref, local_refs, &matched) != 1) { + if (count_refspec_match(ref, local_refs, &matched) != 1) refspec_append(&rs, ref); - } else { - /* lazily grab remote */ - if (!remote) - remote = remote_get(repo); - if (!remote) - BUG("must get a remote for repo '%s'", repo); - + else refspec_append_mapped(&rs, ref, remote, matched); - } } else refspec_append(&rs, ref); } @@ -630,10 +622,8 @@ int cmd_push(int argc, const char **argv, const char *prefix) if (tags) refspec_append(&rs, "refs/tags/*"); - if (argc > 0) { + if (argc > 0) repo = argv[0]; - set_refspecs(argv + 1, argc - 1, repo); - } remote = pushremote_get(repo); if (!remote) { @@ -649,6 +639,9 @@ int cmd_push(int argc, const char **argv, const char *prefix) " git push <name>\n")); } + if (argc > 0) + set_refspecs(argv + 1, argc - 1, remote); + if (remote->mirror) flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE); diff --git a/t/t5529-push-errors.sh b/t/t5529-push-errors.sh index 0247137cb3..54427252a8 100755 --- a/t/t5529-push-errors.sh +++ b/t/t5529-push-errors.sh @@ -2,6 +2,9 @@ test_description='detect some push errors early (before contacting remote)' +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME + TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh @@ -38,6 +41,13 @@ test_expect_success 'detect missing sha1 expressions early' ' test_cmp expect rp-ran ' +# We need to use an existing local_ref so that the remote is mapped to +# it in 'builtin/push.c:set_refspecs()'. +test_expect_success 'detect empty remote' ' + test_must_fail git push "" main 2> stderr && + grep "fatal: bad repository ${SQ}${SQ}" stderr +' + test_expect_success 'detect ambiguous refs early' ' git branch foo && git tag foo && -- 2.45.1 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2] builtin/push: call set_refspecs after validating remote 2024-07-09 14:49 ` [PATCH v2] " Karthik Nayak @ 2024-07-09 18:44 ` Junio C Hamano 2024-07-09 18:56 ` Junio C Hamano ` (2 more replies) 2024-07-11 9:39 ` [PATCH v3] " Karthik Nayak 1 sibling, 3 replies; 20+ messages in thread From: Junio C Hamano @ 2024-07-09 18:44 UTC (permalink / raw) To: Karthik Nayak; +Cc: git, peff Karthik Nayak <karthik.188@gmail.com> writes: > - Added more details to the commit message to clarify the issue at hand. > Since 9badf97c42 (remote: allow resetting url list, 2024-06-14), we > reset the remote URL if the provided URL is empty. When a user of > 'remotes_remote_get' tries to fetch a remote with an empty repo name, > the function initializes the remote via 'make_remote'. But the remote is > still not a valid remote, since the URL is empty, so it tries to add the > URL alias using 'add_url_alias'. This in-turn will call 'add_url', but > since the URL is empty we call 'strvec_clear' on the `remote->url`. Back > in 'remotes_remote_get', we again check if the remote is valid, which > fails, so we return 'NULL' for the 'struct remote *' value That's better, but it still talks only about the implementation and control flow description without mentioning what end-user action it breaks. We see in the new test this: $ git push "" main but is that something we even want to support? Before 9badf97c (remote: allow resetting url list, 2024-06-14), the command would have failed with different way [*1*]. Is it merely "this is a nonsense request and must fail, but we do not want to hit BUG in general"? I think it is the latter, but leaving it unsaid is confusing. How about starting it more like... When an end-user runs "git push" with an empty string for the remote repository name, e.g. $ git push '' main "git push" fails with a BUG(). This is because ... or something. cmd_push() calls set_refspecs() on a repo (whose name is ""), which then calls remote.c:remote_get() on it, which calls remote.c:make_remote() to obtain a remote structure. But during this calling sequence especially down from remote_get(), there are inconsistent decisions made for an empty string as a name. It is not a NULL pointer, so it does not benefit from the default refspecs by calling get_default=remotes_remote_for_branch, valid_remote_nick() considers it is not a valid nick so we do not read remotes or branches configuration file, but we still think a name was given (even though it is an empty string) and try to do something useful by calling add_url_alias(). It is a mess and whoever designed this should be shot [*2*] ;-). In any case, an obvious additional fix on top of your change might be to do something like this: diff --git i/remote.c w/remote.c index 5fa046c8f8..d7f9ba3571 100644 --- i/remote.c +++ w/remote.c @@ -682,7 +682,7 @@ remotes_remote_get_1( struct remote *ret; int name_given = 0; - if (name) + if (name && *name) name_given = 1; else name = get_default(remote_state, remote_state->current_branch, which would give us the default remote name, and we would not call add_url_alias() with a bogus empty string to nuke the list. > - Cleaned up 'set_refspecs' by now accepting a remote instead of trying > to obtain one. The last one, which does make sense, is not mentioned in the proposed log message. Lazy remote creation does not help the updated caller because it already has one. > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME > + > TEST_PASSES_SANITIZE_LEAK=true > . ./test-lib.sh > > @@ -38,6 +41,13 @@ test_expect_success 'detect missing sha1 expressions early' ' > test_cmp expect rp-ran > ' > > +# We need to use an existing local_ref so that the remote is mapped to > +# it in 'builtin/push.c:set_refspecs()'. Hmph, it is OK to force 'main' like the above as a workaround, but wouldn't it be sufficient to do $ git push "" HEAD:refs/heads/main or something to avoid having to know what our current branch is? Thanks. [Footnote } *1* For example, before Peff's series, here is what I got: $ git push "" HEAD:master fatal: no path specified; see 'git help pull' for valid url syntax It comes from transport_push() -> get_refs_via_connect() -> parse_connect_url() code path. *2* It probably is me writing in the original in shell script, so I can safely say this without offending anybody. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] builtin/push: call set_refspecs after validating remote 2024-07-09 18:44 ` Junio C Hamano @ 2024-07-09 18:56 ` Junio C Hamano 2024-07-09 23:55 ` Jeff King 2024-07-10 13:12 ` Karthik Nayak 2 siblings, 0 replies; 20+ messages in thread From: Junio C Hamano @ 2024-07-09 18:56 UTC (permalink / raw) To: Karthik Nayak; +Cc: git, peff Junio C Hamano <gitster@pobox.com> writes: > Is it merely "this is a nonsense request and must fail, but we do > not want to hit BUG in general"? I think it is the latter, but > leaving it unsaid is confusing. How about starting it more like... A bit of clarification. "it is the latter" -> "that is what you meant". In any earlier draft I had another possibility but I removed it before sending the message. > When an end-user runs "git push" with an empty string for the > remote repository name, e.g. > > $ git push '' main > > "git push" fails with a BUG(). This is because ... > > or something. Another clarificaiton. ... with a BUG(). -> ... with a BUG(). Even though this is a nonsense request that we want to fail, we shouldn't hit a BUG(). Instead we want to give a sensible error message, e.g., 'bad repository'". Let's make a habit of not stopping at saying what is bad, and also saying what we want instead explicitly. Thanks. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] builtin/push: call set_refspecs after validating remote 2024-07-09 18:44 ` Junio C Hamano 2024-07-09 18:56 ` Junio C Hamano @ 2024-07-09 23:55 ` Jeff King 2024-07-10 1:04 ` Junio C Hamano 2024-07-10 13:12 ` Karthik Nayak 2 siblings, 1 reply; 20+ messages in thread From: Jeff King @ 2024-07-09 23:55 UTC (permalink / raw) To: Junio C Hamano; +Cc: Karthik Nayak, git On Tue, Jul 09, 2024 at 11:44:25AM -0700, Junio C Hamano wrote: > In any case, an obvious additional fix on top of your change might > be to do something like this: > > diff --git i/remote.c w/remote.c > index 5fa046c8f8..d7f9ba3571 100644 > --- i/remote.c > +++ w/remote.c > @@ -682,7 +682,7 @@ remotes_remote_get_1( > struct remote *ret; > int name_given = 0; > > - if (name) > + if (name && *name) > name_given = 1; > else > name = get_default(remote_state, remote_state->current_branch, > > which would give us the default remote name, and we would not call > add_url_alias() with a bogus empty string to nuke the list. FWIW, I almost suggested something like this earlier. The outcome will be the same (remote_get(), etc, will return NULL), but I think it removes the "this is surprising" comment from my earlier email and makes things much more explicit. (I also agree with everything else you said in your review). -Peff ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] builtin/push: call set_refspecs after validating remote 2024-07-09 23:55 ` Jeff King @ 2024-07-10 1:04 ` Junio C Hamano 0 siblings, 0 replies; 20+ messages in thread From: Junio C Hamano @ 2024-07-10 1:04 UTC (permalink / raw) To: Jeff King; +Cc: Karthik Nayak, git Jeff King <peff@peff.net> writes: > On Tue, Jul 09, 2024 at 11:44:25AM -0700, Junio C Hamano wrote: > >> In any case, an obvious additional fix on top of your change might >> be to do something like this: >> >> diff --git i/remote.c w/remote.c >> index 5fa046c8f8..d7f9ba3571 100644 >> --- i/remote.c >> +++ w/remote.c >> @@ -682,7 +682,7 @@ remotes_remote_get_1( >> struct remote *ret; >> int name_given = 0; >> >> - if (name) >> + if (name && *name) >> name_given = 1; >> else >> name = get_default(remote_state, remote_state->current_branch, >> >> which would give us the default remote name, and we would not call >> add_url_alias() with a bogus empty string to nuke the list. > > FWIW, I almost suggested something like this earlier. The outcome will > be the same (remote_get(), etc, will return NULL), but I think it > removes the "this is surprising" comment from my earlier email and makes > things much more explicit. > > (I also agree with everything else you said in your review). Heh, thanks. I should prepare to shoot myself then ;-) ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] builtin/push: call set_refspecs after validating remote 2024-07-09 18:44 ` Junio C Hamano 2024-07-09 18:56 ` Junio C Hamano 2024-07-09 23:55 ` Jeff King @ 2024-07-10 13:12 ` Karthik Nayak 2024-07-10 15:34 ` Junio C Hamano 2024-07-10 15:46 ` Jeff King 2 siblings, 2 replies; 20+ messages in thread From: Karthik Nayak @ 2024-07-10 13:12 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, peff [-- Attachment #1: Type: text/plain, Size: 5395 bytes --] Junio C Hamano <gitster@pobox.com> writes: > Karthik Nayak <karthik.188@gmail.com> writes: > >> - Added more details to the commit message to clarify the issue at hand. > >> Since 9badf97c42 (remote: allow resetting url list, 2024-06-14), we >> reset the remote URL if the provided URL is empty. When a user of >> 'remotes_remote_get' tries to fetch a remote with an empty repo name, >> the function initializes the remote via 'make_remote'. But the remote is >> still not a valid remote, since the URL is empty, so it tries to add the >> URL alias using 'add_url_alias'. This in-turn will call 'add_url', but >> since the URL is empty we call 'strvec_clear' on the `remote->url`. Back >> in 'remotes_remote_get', we again check if the remote is valid, which >> fails, so we return 'NULL' for the 'struct remote *' value > > That's better, but it still talks only about the implementation and > control flow description without mentioning what end-user action it > breaks. We see in the new test this: > > $ git push "" main > > but is that something we even want to support? Before 9badf97c > (remote: allow resetting url list, 2024-06-14), the command would > have failed with different way [*1*]. > > Is it merely "this is a nonsense request and must fail, but we do > not want to hit BUG in general"? I think it is the latter, but > leaving it unsaid is confusing. How about starting it more like... > > When an end-user runs "git push" with an empty string for the > remote repository name, e.g. > > $ git push '' main > > "git push" fails with a BUG(). This is because ... > > or something. > Thanks, I was focused on the technical aspect of it that I didn't really spill out the user interaction, you're right, it should start this way. I will modify accordingly. > cmd_push() calls set_refspecs() on a repo (whose name is ""), which > then calls remote.c:remote_get() on it, which calls > remote.c:make_remote() to obtain a remote structure. > > But during this calling sequence especially down from remote_get(), > there are inconsistent decisions made for an empty string as a name. > It is not a NULL pointer, so it does not benefit from the default > refspecs by calling get_default=remotes_remote_for_branch, > valid_remote_nick() considers it is not a valid nick so we do not > read remotes or branches configuration file, but we still think a > name was given (even though it is an empty string) and try to do > something useful by calling add_url_alias(). > > It is a mess and whoever designed this should be shot [*2*] ;-). > > In any case, an obvious additional fix on top of your change might > be to do something like this: > > diff --git i/remote.c w/remote.c > index 5fa046c8f8..d7f9ba3571 100644 > --- i/remote.c > +++ w/remote.c > @@ -682,7 +682,7 @@ remotes_remote_get_1( > struct remote *ret; > int name_given = 0; > > - if (name) > + if (name && *name) > name_given = 1; > else > name = get_default(remote_state, remote_state->current_branch, > > which would give us the default remote name, and we would not call > add_url_alias() with a bogus empty string to nuke the list. > I'm a bit skeptical of making this change. Mostly from the user's perspective. With my patch currently: $ git push "" refs/heads/master fatal: bad repository '' But with this added, we'd be doing $ git push "" refs/heads/master Everything up-to-date This is because we actually obtained the default remote here. Isn't this confusing from a user's perspective? I mean I agree that an empty repo name is something we should support, but it also shouldn't be something we simply ignore? >> - Cleaned up 'set_refspecs' by now accepting a remote instead of trying >> to obtain one. > > The last one, which does make sense, is not mentioned in the > proposed log message. Lazy remote creation does not help the > updated caller because it already has one. > >> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main >> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME >> + >> TEST_PASSES_SANITIZE_LEAK=true >> . ./test-lib.sh >> >> @@ -38,6 +41,13 @@ test_expect_success 'detect missing sha1 expressions early' ' >> test_cmp expect rp-ran >> ' >> >> +# We need to use an existing local_ref so that the remote is mapped to >> +# it in 'builtin/push.c:set_refspecs()'. > > Hmph, it is OK to force 'main' like the above as a workaround, but > wouldn't it be sufficient to do > > $ git push "" HEAD:refs/heads/main > > or something to avoid having to know what our current branch is? > > Thanks. > So if we're simply testing my patch, this is definitely enough. But I wanted to also keep in mind the state before this patch. Wherein the only way the flow would be triggered is if we provide a local_ref, providing ':' follows a different path in 'set_refspecs'. > > > [Footnote } > > *1* For example, before Peff's series, here is what I got: > > $ git push "" HEAD:master > fatal: no path specified; see 'git help pull' for valid url syntax > > It comes from transport_push() -> get_refs_via_connect() -> > parse_connect_url() code path. > > *2* It probably is me writing in the original in shell script, so I > can safely say this without offending anybody. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] builtin/push: call set_refspecs after validating remote 2024-07-10 13:12 ` Karthik Nayak @ 2024-07-10 15:34 ` Junio C Hamano 2024-07-10 15:46 ` Jeff King 1 sibling, 0 replies; 20+ messages in thread From: Junio C Hamano @ 2024-07-10 15:34 UTC (permalink / raw) To: Karthik Nayak; +Cc: git, peff Karthik Nayak <karthik.188@gmail.com> writes: >> In any case, an obvious additional fix on top of your change might >> be to do something like this: >> >> diff --git i/remote.c w/remote.c >> index 5fa046c8f8..d7f9ba3571 100644 >> --- i/remote.c >> +++ w/remote.c >> @@ -682,7 +682,7 @@ remotes_remote_get_1( >> struct remote *ret; >> int name_given = 0; >> >> - if (name) >> + if (name && *name) >> name_given = 1; >> else >> name = get_default(remote_state, remote_state->current_branch, >> >> which would give us the default remote name, and we would not call >> add_url_alias() with a bogus empty string to nuke the list. >> > > I'm a bit skeptical of making this change. Mostly from the user's > perspective. > ... > This is because we actually obtained the default remote here. Isn't this > confusing from a user's perspective? I agree with you 100%. I haven't checked the ramifications of such a change to other code paths, and callers of remote_get() are many. With your fix, the above is not necessary and it certainly can be left well outside the scope of the current topic. > I mean I agree that an empty repo > name is something we should support, but it also shouldn't be something > we simply ignore? For the sake of simplicity of the UI design, I think an empty repo name (giving an empty string explicitly where a repository nickname or URL is expected) should be forbidden. The above change lets "" stand for the default remote (which is different from "simply" ignoring), and is confusing, because we never did that historically. Unless we advertise it as a new "feature" [*], that is (and I do not believe it is such a great feature myself). > So if we're simply testing my patch, this is definitely enough. But I > wanted to also keep in mind the state before this patch. Wherein the > only way the flow would be triggered is if we provide a local_ref, > providing ':' follows a different path in 'set_refspecs'. OK. If so we may want to have both, so that we won't regress in the other code paths? Thanks. [Footnote] * When you want to interact with the remote you usually work with but want to affect a custom set of refs, with the current UI, you'd need to remember what you call that usual remote (typically "origin") and say "git [push|fetch] origin $refspec". The above change may be seen as a feature that allows you to use an empty string in place of the remote that has to be named explicitly on the command line. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] builtin/push: call set_refspecs after validating remote 2024-07-10 13:12 ` Karthik Nayak 2024-07-10 15:34 ` Junio C Hamano @ 2024-07-10 15:46 ` Jeff King 2024-07-11 9:35 ` Karthik Nayak 1 sibling, 1 reply; 20+ messages in thread From: Jeff King @ 2024-07-10 15:46 UTC (permalink / raw) To: Karthik Nayak; +Cc: Junio C Hamano, git On Wed, Jul 10, 2024 at 06:12:21AM -0700, Karthik Nayak wrote: > > In any case, an obvious additional fix on top of your change might > > be to do something like this: > > > > diff --git i/remote.c w/remote.c > > index 5fa046c8f8..d7f9ba3571 100644 > > --- i/remote.c > > +++ w/remote.c > > @@ -682,7 +682,7 @@ remotes_remote_get_1( > > struct remote *ret; > > int name_given = 0; > > > > - if (name) > > + if (name && *name) > > name_given = 1; > > else > > name = get_default(remote_state, remote_state->current_branch, > > > > which would give us the default remote name, and we would not call > > add_url_alias() with a bogus empty string to nuke the list. > > > > I'm a bit skeptical of making this change. Mostly from the user's > perspective. > > With my patch currently: > > $ git push "" refs/heads/master > fatal: bad repository '' > > But with this added, we'd be doing > > $ git push "" refs/heads/master > Everything up-to-date > > This is because we actually obtained the default remote here. Isn't this > confusing from a user's perspective? I mean I agree that an empty repo > name is something we should support, but it also shouldn't be something > we simply ignore? Oh, I misread Junio's patch in my earlier response. I was focused on not setting name_given, which I thought would result in a NULL return value, and didn't notice that it would also mean using the default remote. Something like: diff --git a/remote.c b/remote.c index 7f6406aaa2..883cf6086e 100644 --- a/remote.c +++ b/remote.c @@ -703,7 +703,7 @@ remotes_remote_get_1(struct remote_state *remote_state, const char *name, if (!valid_remote(ret)) read_branches_file(remote_state, ret); } - if (name_given && !valid_remote(ret)) + if (name_given && *name && !valid_remote(ret)) add_url_alias(remote_state, ret, name); if (!valid_remote(ret)) return NULL; was more what I was thinking. That is, inhibit the empty string explicitly rather than letting the emergent behavior of add_url_alias() do it for us. Or maybe even just: diff --git a/remote.c b/remote.c index 7f6406aaa2..a0b166131f 100644 --- a/remote.c +++ b/remote.c @@ -690,9 +690,11 @@ remotes_remote_get_1(struct remote_state *remote_state, const char *name, struct remote *ret; int name_given = 0; - if (name) + if (name) { + if (!name) + return NULL; name_given = 1; - else + } else name = get_default(remote_state, remote_state->current_branch, &name_given); to bail immediately. But all of that would be internal refactoring / cleanup on top of your patch. The user-facing behavior would be the same. -Peff ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2] builtin/push: call set_refspecs after validating remote 2024-07-10 15:46 ` Jeff King @ 2024-07-11 9:35 ` Karthik Nayak 2024-07-11 21:32 ` Jeff King 0 siblings, 1 reply; 20+ messages in thread From: Karthik Nayak @ 2024-07-11 9:35 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, git [-- Attachment #1: Type: text/plain, Size: 3098 bytes --] Jeff King <peff@peff.net> writes: > On Wed, Jul 10, 2024 at 06:12:21AM -0700, Karthik Nayak wrote: > >> > In any case, an obvious additional fix on top of your change might >> > be to do something like this: >> > >> > diff --git i/remote.c w/remote.c >> > index 5fa046c8f8..d7f9ba3571 100644 >> > --- i/remote.c >> > +++ w/remote.c >> > @@ -682,7 +682,7 @@ remotes_remote_get_1( >> > struct remote *ret; >> > int name_given = 0; >> > >> > - if (name) >> > + if (name && *name) >> > name_given = 1; >> > else >> > name = get_default(remote_state, remote_state->current_branch, >> > >> > which would give us the default remote name, and we would not call >> > add_url_alias() with a bogus empty string to nuke the list. >> > >> >> I'm a bit skeptical of making this change. Mostly from the user's >> perspective. >> >> With my patch currently: >> >> $ git push "" refs/heads/master >> fatal: bad repository '' >> >> But with this added, we'd be doing >> >> $ git push "" refs/heads/master >> Everything up-to-date >> >> This is because we actually obtained the default remote here. Isn't this >> confusing from a user's perspective? I mean I agree that an empty repo >> name is something we should support, but it also shouldn't be something >> we simply ignore? > > Oh, I misread Junio's patch in my earlier response. I was focused on not > setting name_given, which I thought would result in a NULL return value, > and didn't notice that it would also mean using the default remote. > Something like: > > diff --git a/remote.c b/remote.c > index 7f6406aaa2..883cf6086e 100644 > --- a/remote.c > +++ b/remote.c > @@ -703,7 +703,7 @@ remotes_remote_get_1(struct remote_state *remote_state, const char *name, > if (!valid_remote(ret)) > read_branches_file(remote_state, ret); > } > - if (name_given && !valid_remote(ret)) > + if (name_given && *name && !valid_remote(ret)) > add_url_alias(remote_state, ret, name); > if (!valid_remote(ret)) > return NULL; > > was more what I was thinking. That is, inhibit the empty string > explicitly rather than letting the emergent behavior of add_url_alias() > do it for us. Or maybe even just: > > diff --git a/remote.c b/remote.c > index 7f6406aaa2..a0b166131f 100644 > --- a/remote.c > +++ b/remote.c > @@ -690,9 +690,11 @@ remotes_remote_get_1(struct remote_state *remote_state, const char *name, > struct remote *ret; > int name_given = 0; > > - if (name) > + if (name) { > + if (!name) > + return NULL; > name_given = 1; > - else > + } else > name = get_default(remote_state, remote_state->current_branch, > &name_given); > > > to bail immediately. > > But all of that would be internal refactoring / cleanup on top of your > patch. The user-facing behavior would be the same. > > -Peff These should work as intended on top of my patch. But I will skip doing these changes for now. I do see the merit but I think it is also okay the way it is now. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2] builtin/push: call set_refspecs after validating remote 2024-07-11 9:35 ` Karthik Nayak @ 2024-07-11 21:32 ` Jeff King 0 siblings, 0 replies; 20+ messages in thread From: Jeff King @ 2024-07-11 21:32 UTC (permalink / raw) To: Karthik Nayak; +Cc: Junio C Hamano, git On Thu, Jul 11, 2024 at 02:35:08AM -0700, Karthik Nayak wrote: > > But all of that would be internal refactoring / cleanup on top of your > > patch. The user-facing behavior would be the same. > > These should work as intended on top of my patch. But I will skip doing > these changes for now. I do see the merit but I think it is also okay > the way it is now. Yep, that is perfectly fine with me. Thanks for working on this! -Peff ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v3] builtin/push: call set_refspecs after validating remote 2024-07-09 14:49 ` [PATCH v2] " Karthik Nayak 2024-07-09 18:44 ` Junio C Hamano @ 2024-07-11 9:39 ` Karthik Nayak 2024-07-11 15:08 ` Junio C Hamano 1 sibling, 1 reply; 20+ messages in thread From: Karthik Nayak @ 2024-07-11 9:39 UTC (permalink / raw) To: karthik.188; +Cc: git, peff, Junio C Hamano When an end-user runs "git push" with an empty string for the remote repository name, e.g. $ git push '' main "git push" fails with a BUG(). Even though this is a nonsense request that we want to fail, we shouldn't hit a BUG(). Instead we want to give a sensible error message, e.g., 'bad repository'". This is because since 9badf97c42 (remote: allow resetting url list, 2024-06-14), we reset the remote URL if the provided URL is empty. When a user of 'remotes_remote_get' tries to fetch a remote with an empty repo name, the function initializes the remote via 'make_remote'. But the remote is still not a valid remote, since the URL is empty, so it tries to add the URL alias using 'add_url_alias'. This in-turn will call 'add_url', but since the URL is empty we call 'strvec_clear' on the `remote->url`. Back in 'remotes_remote_get', we again check if the remote is valid, which fails, so we return 'NULL' for the 'struct remote *' value. The 'builtin/push.c' code, calls 'set_refspecs' before validating the remote. This worked with empty repo names earlier since we would get a remote, albeit with an empty URL. With the new changes, we get a 'NULL' remote value, this causes the check for remote to fail and raises the BUG in 'set_refspecs'. Do a simple fix by doing remote validation first. Also add a test to validate the bug fix. With this, we can also now directly pass remote to 'set_refspecs' instead of it trying to lazily obtain it. Helped-by: Jeff King <peff@peff.net> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> --- Changes from v2: - Updated the commit message to talk more about the user experience at the start. - Added another test to also check targeted refspec. Range-diff against v2: 1: fd9a9387e9 ! 1: 845be99dd6 builtin/push: call set_refspecs after validating remote @@ Metadata ## Commit message ## builtin/push: call set_refspecs after validating remote - Since 9badf97c42 (remote: allow resetting url list, 2024-06-14), we - reset the remote URL if the provided URL is empty. When a user of - 'remotes_remote_get' tries to fetch a remote with an empty repo name, - the function initializes the remote via 'make_remote'. But the remote is - still not a valid remote, since the URL is empty, so it tries to add the - URL alias using 'add_url_alias'. This in-turn will call 'add_url', but - since the URL is empty we call 'strvec_clear' on the `remote->url`. Back - in 'remotes_remote_get', we again check if the remote is valid, which - fails, so we return 'NULL' for the 'struct remote *' value + When an end-user runs "git push" with an empty string for the remote + repository name, e.g. + + $ git push '' main + + "git push" fails with a BUG(). Even though this is a nonsense request + that we want to fail, we shouldn't hit a BUG(). Instead we want to give + a sensible error message, e.g., 'bad repository'". + + This is because since 9badf97c42 (remote: allow resetting url list, + 2024-06-14), we reset the remote URL if the provided URL is empty. When + a user of 'remotes_remote_get' tries to fetch a remote with an empty + repo name, the function initializes the remote via 'make_remote'. But + the remote is still not a valid remote, since the URL is empty, so it + tries to add the URL alias using 'add_url_alias'. This in-turn will call + 'add_url', but since the URL is empty we call 'strvec_clear' on the + `remote->url`. Back in 'remotes_remote_get', we again check if the + remote is valid, which fails, so we return 'NULL' for the 'struct + remote *' value. The 'builtin/push.c' code, calls 'set_refspecs' before validating the remote. This worked with empty repo names earlier since we would get a @@ Commit message 'set_refspecs' instead of it trying to lazily obtain it. Helped-by: Jeff King <peff@peff.net> + Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> ## builtin/push.c ## @@ t/t5529-push-errors.sh: test_expect_success 'detect missing sha1 expressions ear test_cmp expect rp-ran ' -+# We need to use an existing local_ref so that the remote is mapped to -+# it in 'builtin/push.c:set_refspecs()'. -+test_expect_success 'detect empty remote' ' ++# We use an existing local_ref, since it follows a different flow in ++# 'builtin/push.c:set_refspecs()' and we want to test that regression. ++test_expect_success 'detect empty remote with existing local ref' ' + test_must_fail git push "" main 2> stderr && + grep "fatal: bad repository ${SQ}${SQ}" stderr +' ++ ++# While similar to the previous test, here we want to ensure that ++# even targeted refspecs are handled. ++test_expect_success 'detect empty remote with targeted refspec' ' ++ test_must_fail git push "" HEAD:refs/heads/main 2> stderr && ++ grep "fatal: bad repository ${SQ}${SQ}" stderr ++' + test_expect_success 'detect ambiguous refs early' ' git branch foo && builtin/push.c | 21 +++++++-------------- t/t5529-push-errors.sh | 17 +++++++++++++++++ 2 files changed, 24 insertions(+), 14 deletions(-) diff --git a/builtin/push.c b/builtin/push.c index 8260c6e46a..7a67398124 100644 --- a/builtin/push.c +++ b/builtin/push.c @@ -96,9 +96,8 @@ static void refspec_append_mapped(struct refspec *refspec, const char *ref, refspec_append(refspec, ref); } -static void set_refspecs(const char **refs, int nr, const char *repo) +static void set_refspecs(const char **refs, int nr, struct remote *remote) { - struct remote *remote = NULL; struct ref *local_refs = NULL; int i; @@ -124,17 +123,10 @@ static void set_refspecs(const char **refs, int nr, const char *repo) local_refs = get_local_heads(); /* Does "ref" uniquely name our ref? */ - if (count_refspec_match(ref, local_refs, &matched) != 1) { + if (count_refspec_match(ref, local_refs, &matched) != 1) refspec_append(&rs, ref); - } else { - /* lazily grab remote */ - if (!remote) - remote = remote_get(repo); - if (!remote) - BUG("must get a remote for repo '%s'", repo); - + else refspec_append_mapped(&rs, ref, remote, matched); - } } else refspec_append(&rs, ref); } @@ -630,10 +622,8 @@ int cmd_push(int argc, const char **argv, const char *prefix) if (tags) refspec_append(&rs, "refs/tags/*"); - if (argc > 0) { + if (argc > 0) repo = argv[0]; - set_refspecs(argv + 1, argc - 1, repo); - } remote = pushremote_get(repo); if (!remote) { @@ -649,6 +639,9 @@ int cmd_push(int argc, const char **argv, const char *prefix) " git push <name>\n")); } + if (argc > 0) + set_refspecs(argv + 1, argc - 1, remote); + if (remote->mirror) flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE); diff --git a/t/t5529-push-errors.sh b/t/t5529-push-errors.sh index 0247137cb3..17d7257892 100755 --- a/t/t5529-push-errors.sh +++ b/t/t5529-push-errors.sh @@ -2,6 +2,9 @@ test_description='detect some push errors early (before contacting remote)' +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME + TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh @@ -38,6 +41,20 @@ test_expect_success 'detect missing sha1 expressions early' ' test_cmp expect rp-ran ' +# We use an existing local_ref, since it follows a different flow in +# 'builtin/push.c:set_refspecs()' and we want to test that regression. +test_expect_success 'detect empty remote with existing local ref' ' + test_must_fail git push "" main 2> stderr && + grep "fatal: bad repository ${SQ}${SQ}" stderr +' + +# While similar to the previous test, here we want to ensure that +# even targeted refspecs are handled. +test_expect_success 'detect empty remote with targeted refspec' ' + test_must_fail git push "" HEAD:refs/heads/main 2> stderr && + grep "fatal: bad repository ${SQ}${SQ}" stderr +' + test_expect_success 'detect ambiguous refs early' ' git branch foo && git tag foo && -- 2.45.2 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v3] builtin/push: call set_refspecs after validating remote 2024-07-11 9:39 ` [PATCH v3] " Karthik Nayak @ 2024-07-11 15:08 ` Junio C Hamano 2024-07-11 21:33 ` Jeff King 0 siblings, 1 reply; 20+ messages in thread From: Junio C Hamano @ 2024-07-11 15:08 UTC (permalink / raw) To: Karthik Nayak; +Cc: git, peff Karthik Nayak <karthik.188@gmail.com> writes: > When an end-user runs "git push" with an empty string for the remote > repository name, e.g. > > $ git push '' main > > "git push" fails with a BUG(). Even though this is a nonsense request > that we want to fail, we shouldn't hit a BUG(). Instead we want to give > a sensible error message, e.g., 'bad repository'". > > This is because since 9badf97c42 (remote: allow resetting url list, > 2024-06-14), we reset the remote URL if the provided URL is empty. When > a user of 'remotes_remote_get' tries to fetch a remote with an empty > repo name, the function initializes the remote via 'make_remote'. But > the remote is still not a valid remote, since the URL is empty, so it > tries to add the URL alias using 'add_url_alias'. This in-turn will call > 'add_url', but since the URL is empty we call 'strvec_clear' on the > `remote->url`. Back in 'remotes_remote_get', we again check if the > remote is valid, which fails, so we return 'NULL' for the 'struct > remote *' value. > > The 'builtin/push.c' code, calls 'set_refspecs' before validating the > remote. This worked with empty repo names earlier since we would get a > remote, albeit with an empty URL. With the new changes, we get a 'NULL' > remote value, this causes the check for remote to fail and raises the > BUG in 'set_refspecs'. > > Do a simple fix by doing remote validation first. Also add a test to > validate the bug fix. With this, we can also now directly pass remote to > 'set_refspecs' instead of it trying to lazily obtain it. > > Helped-by: Jeff King <peff@peff.net> > Helped-by: Junio C Hamano <gitster@pobox.com> > Signed-off-by: Karthik Nayak <karthik.188@gmail.com> > --- > ... > builtin/push.c | 21 +++++++-------------- > t/t5529-push-errors.sh | 17 +++++++++++++++++ > 2 files changed, 24 insertions(+), 14 deletions(-) Everything makes sense to me. Thanks. > diff --git a/builtin/push.c b/builtin/push.c > index 8260c6e46a..7a67398124 100644 > --- a/builtin/push.c > +++ b/builtin/push.c > @@ -96,9 +96,8 @@ static void refspec_append_mapped(struct refspec *refspec, const char *ref, > refspec_append(refspec, ref); > } > > -static void set_refspecs(const char **refs, int nr, const char *repo) > +static void set_refspecs(const char **refs, int nr, struct remote *remote) > { > - struct remote *remote = NULL; > struct ref *local_refs = NULL; > int i; > > @@ -124,17 +123,10 @@ static void set_refspecs(const char **refs, int nr, const char *repo) > local_refs = get_local_heads(); > > /* Does "ref" uniquely name our ref? */ > - if (count_refspec_match(ref, local_refs, &matched) != 1) { > + if (count_refspec_match(ref, local_refs, &matched) != 1) > refspec_append(&rs, ref); > - } else { > - /* lazily grab remote */ > - if (!remote) > - remote = remote_get(repo); > - if (!remote) > - BUG("must get a remote for repo '%s'", repo); > - > + else > refspec_append_mapped(&rs, ref, remote, matched); > - } > } else > refspec_append(&rs, ref); > } > @@ -630,10 +622,8 @@ int cmd_push(int argc, const char **argv, const char *prefix) > if (tags) > refspec_append(&rs, "refs/tags/*"); > > - if (argc > 0) { > + if (argc > 0) > repo = argv[0]; > - set_refspecs(argv + 1, argc - 1, repo); > - } > > remote = pushremote_get(repo); > if (!remote) { > @@ -649,6 +639,9 @@ int cmd_push(int argc, const char **argv, const char *prefix) > " git push <name>\n")); > } > > + if (argc > 0) > + set_refspecs(argv + 1, argc - 1, remote); > + > if (remote->mirror) > flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE); > > diff --git a/t/t5529-push-errors.sh b/t/t5529-push-errors.sh > index 0247137cb3..17d7257892 100755 > --- a/t/t5529-push-errors.sh > +++ b/t/t5529-push-errors.sh > @@ -2,6 +2,9 @@ > > test_description='detect some push errors early (before contacting remote)' > > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME > + > TEST_PASSES_SANITIZE_LEAK=true > . ./test-lib.sh > > @@ -38,6 +41,20 @@ test_expect_success 'detect missing sha1 expressions early' ' > test_cmp expect rp-ran > ' > > +# We use an existing local_ref, since it follows a different flow in > +# 'builtin/push.c:set_refspecs()' and we want to test that regression. > +test_expect_success 'detect empty remote with existing local ref' ' > + test_must_fail git push "" main 2> stderr && > + grep "fatal: bad repository ${SQ}${SQ}" stderr > +' > + > +# While similar to the previous test, here we want to ensure that > +# even targeted refspecs are handled. > +test_expect_success 'detect empty remote with targeted refspec' ' > + test_must_fail git push "" HEAD:refs/heads/main 2> stderr && > + grep "fatal: bad repository ${SQ}${SQ}" stderr > +' > + > test_expect_success 'detect ambiguous refs early' ' > git branch foo && > git tag foo && ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v3] builtin/push: call set_refspecs after validating remote 2024-07-11 15:08 ` Junio C Hamano @ 2024-07-11 21:33 ` Jeff King 0 siblings, 0 replies; 20+ messages in thread From: Jeff King @ 2024-07-11 21:33 UTC (permalink / raw) To: Junio C Hamano; +Cc: Karthik Nayak, git On Thu, Jul 11, 2024 at 08:08:52AM -0700, Junio C Hamano wrote: > > Do a simple fix by doing remote validation first. Also add a test to > > validate the bug fix. With this, we can also now directly pass remote to > > 'set_refspecs' instead of it trying to lazily obtain it. > [..] > > Everything makes sense to me. Thanks. Likewise for me. -Peff ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2024-07-11 21:33 UTC | newest] Thread overview: 20+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-07-08 14:03 [PATCH] builtin/push: call set_refspecs after validating remote Karthik Nayak 2024-07-08 19:17 ` Junio C Hamano 2024-07-08 23:33 ` Jeff King 2024-07-09 9:59 ` Karthik Nayak 2024-07-08 23:32 ` Jeff King 2024-07-09 9:05 ` Karthik Nayak 2024-07-09 9:59 ` Jeff King 2024-07-09 14:49 ` [PATCH v2] " Karthik Nayak 2024-07-09 18:44 ` Junio C Hamano 2024-07-09 18:56 ` Junio C Hamano 2024-07-09 23:55 ` Jeff King 2024-07-10 1:04 ` Junio C Hamano 2024-07-10 13:12 ` Karthik Nayak 2024-07-10 15:34 ` Junio C Hamano 2024-07-10 15:46 ` Jeff King 2024-07-11 9:35 ` Karthik Nayak 2024-07-11 21:32 ` Jeff King 2024-07-11 9:39 ` [PATCH v3] " Karthik Nayak 2024-07-11 15:08 ` Junio C Hamano 2024-07-11 21:33 ` Jeff King
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).