From: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>, Elijah Newren <newren@gmail.com>,
Elijah Newren <newren@gmail.com>,
Elijah Newren <newren@gmail.com>
Subject: [PATCH v3 2/2] object-name: be more strict in parsing describe-like output
Date: Mon, 13 Jan 2025 17:13:37 +0000 [thread overview]
Message-ID: <19f84dfc9cc33616f3720d740c449b29de8d492e.1736788417.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1844.v3.git.1736788417.gitgitgadget@gmail.com>
From: Elijah Newren <newren@gmail.com>
From Documentation/revisions.txt:
'<describeOutput>', e.g. 'v1.7.4.2-679-g3bee7fb'::
Output from `git describe`; i.e. a closest tag, optionally
followed by a dash and a number of commits, followed by a dash, a
'g', and an abbreviated object name.
which means that output of the format
${REFNAME}-${INTEGER}-g${HASH}
should parse to fully expanded ${HASH}. This is fine. However, we
currently don't validate any of ${REFNAME}-${INTEGER}, we only parse
-g${HASH} and assume the rest is valid. That is problematic, since it
breaks things like
git cat-file -p branchname:path/to/file/named/i-gaffed
which, when commit (or tree or blob) affed exists, will not return us
information about the file we are looking for but will instead
erroneously tell us about object affed.
A few additional notes:
- This is a slight backward incompatibility break, because we used
to allow ${GARBAGE}-g${HASH} as a way to spell ${HASH}. However,
a backward incompatible break is necessary, because there is no
other way for someone to be more specific and disambiguate that they
want the blob master:path/to/who-gabbed instead of the object abbed.
- There is a possibility that check_refname_format() rules change in
the future. However, we can only realistically loosen the rules
for what that function accepts rather than tighten. If we were to
tighten the rules, some real world repositories may already have
refnames that suddenly become unacceptable and we break those
repositories. As such, any describe-like syntax of the form
${VALID_FOR_A_REFNAME}-${INTEGER}-g${HASH} that is valid with the
changes in this commit will remain valid in the future.
- The fact that check_refname_format() rules could loosen in the
future is probably also an important reason to make this change. If
the rules loosen, there might be additional cases within
${GARBAGE}-g${HASH} that become ambiguous in the future. While
abbreviated hashes can be disambiguated by abbreviating less, it may
well be that these alternative object names have no way of being
disambiguated (much like pathnames cannot be). Accepting all random
${GARBAGE} thus makes it difficult for us to allow future
extensions to object naming.
So, tighten up the parsing to make sure ${REFNAME} and ${INTEGER} are
present in the string, and would be considered a valid ref and
non-negative integer.
Also, add a few tests for git describe using object names of the form
${REVISION_NAME}${MODIFIERS}
since an early version of this patch failed on constructs like
git describe v2.48.0-rc2-161-g6c2274cdbc^0
Reported-by: Gabriel Amaral <gabriel-amaral@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
---
object-name.c | 55 ++++++++++++++++++++++++++++++++++++++++++++-
t/t6120-describe.sh | 24 ++++++++++++++++++++
2 files changed, 78 insertions(+), 1 deletion(-)
diff --git a/object-name.c b/object-name.c
index 8e80841acd3..cb96a0e6161 100644
--- a/object-name.c
+++ b/object-name.c
@@ -1271,6 +1271,58 @@ static int peel_onion(struct repository *r, const char *name, int len,
return 0;
}
+/*
+ * Documentation/revisions.txt says:
+ * '<describeOutput>', e.g. 'v1.7.4.2-679-g3bee7fb'::
+ * Output from `git describe`; i.e. a closest tag, optionally
+ * followed by a dash and a number of commits, followed by a dash, a
+ * 'g', and an abbreviated object name.
+ *
+ * which means that the stuff before '-g${HASH}' needs to be a valid
+ * refname, a dash, and a non-negative integer. This function verifies
+ * that.
+ *
+ * In particular, we do not want to treat
+ * branchname:path/to/file/named/i-gaffed
+ * as a request for commit affed.
+ *
+ * More generally, we should probably not treat
+ * 'refs/heads/./../.../ ~^:/?*[////\\\&}/busted.lock-g050e0ef6ead'
+ * as a request for object 050e0ef6ead either.
+ *
+ * We are called with name[len] == '-' and name[len+1] == 'g', i.e.
+ * we are verifying ${REFNAME}-{INTEGER} part of the name.
+ */
+static int ref_and_count_parts_valid(const char *name, int len)
+{
+ struct strbuf sb;
+ const char *cp;
+ int flags = REFNAME_ALLOW_ONELEVEL;
+ int ret = 1;
+
+ /* Ensure we have at least one digit */
+ if (!isxdigit(name[len-1]))
+ return 0;
+
+ /* Skip over digits backwards until we get to the dash */
+ for (cp = name + len - 2; name < cp; cp--) {
+ if (*cp == '-')
+ break;
+ if (!isxdigit(*cp))
+ return 0;
+ }
+ /* Ensure we found the leading dash */
+ if (*cp != '-')
+ return 0;
+
+ len = cp - name;
+ strbuf_init(&sb, len);
+ strbuf_add(&sb, name, len);
+ ret = !check_refname_format(sb.buf, flags);
+ strbuf_release(&sb);
+ return ret;
+}
+
static int get_describe_name(struct repository *r,
const char *name, int len,
struct object_id *oid)
@@ -1284,7 +1336,8 @@ static int get_describe_name(struct repository *r,
/* We must be looking at g in "SOMETHING-g"
* for it to be describe output.
*/
- if (ch == 'g' && cp[-1] == '-') {
+ if (ch == 'g' && cp[-1] == '-' &&
+ ref_and_count_parts_valid(name, cp - 1 - name)) {
cp++;
len -= cp - name;
return get_short_oid(r,
diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh
index 3f6160d702b..76843a61691 100755
--- a/t/t6120-describe.sh
+++ b/t/t6120-describe.sh
@@ -82,11 +82,13 @@ check_describe R-2-gHASH HEAD^^
check_describe A-3-gHASH HEAD^^2
check_describe B HEAD^^2^
check_describe R-1-gHASH HEAD^^^
+check_describe R-1-gHASH R-1-g$(git rev-parse --short HEAD^^)~1
check_describe c-7-gHASH --tags HEAD
check_describe c-6-gHASH --tags HEAD^
check_describe e-1-gHASH --tags HEAD^^
check_describe c-2-gHASH --tags HEAD^^2
+check_describe c-2-gHASH --tags c-2-g$(git rev-parse --short HEAD^^2)^0
check_describe B --tags HEAD^^2^
check_describe e --tags HEAD^^^
check_describe e --tags --exact-match HEAD^^^
@@ -725,4 +727,26 @@ test_expect_success '--exact-match does not show --always fallback' '
test_must_fail git describe --exact-match --always
'
+test_expect_success 'avoid being fooled by describe-like filename' '
+ test_when_finished rm out &&
+
+ git rev-parse --short HEAD >out &&
+ FILENAME=filename-g$(cat out) &&
+ touch $FILENAME &&
+ git add $FILENAME &&
+ git commit -m "Add $FILENAME" &&
+
+ git cat-file -t HEAD:$FILENAME >actual &&
+
+ echo blob >expect &&
+ test_cmp expect actual
+'
+
+test_expect_success 'do not be fooled by invalid describe format ' '
+ test_when_finished rm out &&
+
+ git rev-parse --short HEAD >out &&
+ test_must_fail git cat-file -t "refs/tags/super-invalid/./../...../ ~^:/?*[////\\\\\\&}/busted.lock-42-g"$(cat out)
+'
+
test_done
--
gitgitgadget
next prev parent reply other threads:[~2025-01-13 17:13 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-01 2:53 [PATCH] object-name: fix resolution of object names containing curly braces Elijah Newren via GitGitGadget
2025-01-01 17:00 ` Junio C Hamano
2025-01-03 23:34 ` Elijah Newren
2025-01-04 2:52 ` Junio C Hamano
2025-01-03 8:16 ` Patrick Steinhardt
2025-01-03 15:46 ` Junio C Hamano
2025-01-03 23:43 ` Elijah Newren
2025-01-04 0:17 ` [PATCH v2 0/2] " Elijah Newren via GitGitGadget
2025-01-04 0:17 ` [PATCH v2 1/2] " Elijah Newren via GitGitGadget
2025-01-04 17:26 ` Junio C Hamano
2025-01-04 18:54 ` Elijah Newren
2025-01-05 16:14 ` Junio C Hamano
2025-01-04 0:17 ` [PATCH v2 2/2] object-name: be more strict in parsing describe-like output Elijah Newren via GitGitGadget
2025-01-04 14:35 ` [PATCH v2 0/2] object-name: fix resolution of object names containing curly braces Junio C Hamano
2025-01-04 15:55 ` Elijah Newren
2025-01-04 17:51 ` Junio C Hamano
2025-01-04 18:55 ` Elijah Newren
2025-01-06 17:29 ` Junio C Hamano
2025-01-06 19:26 ` Elijah Newren
2025-01-06 20:38 ` Junio C Hamano
2025-01-13 17:13 ` [PATCH v3 0/2] object-name: fix a pair of object name resolution issues Elijah Newren via GitGitGadget
2025-01-13 17:13 ` [PATCH v3 1/2] object-name: fix resolution of object names containing curly braces Elijah Newren via GitGitGadget
2025-01-13 17:13 ` Elijah Newren via GitGitGadget [this message]
2025-01-13 18:15 ` [PATCH v3 0/2] object-name: fix a pair of object name resolution issues Junio C Hamano
2025-01-13 19:26 ` Elijah Newren
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=19f84dfc9cc33616f3720d740c449b29de8d492e.1736788417.git.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=git@vger.kernel.org \
--cc=newren@gmail.com \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).