All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>, Elijah Newren <newren@gmail.com>,
	Elijah Newren <newren@gmail.com>
Subject: [PATCH v3 0/2] object-name: fix a pair of object name resolution issues
Date: Mon, 13 Jan 2025 17:13:35 +0000	[thread overview]
Message-ID: <pull.1844.v3.git.1736788417.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1844.v2.git.1735949870.gitgitgadget@gmail.com>

Changes since v2:

 * Readability improvement to the first patch, which fixes object name
   resolution with refs containing a curly brace
 * Fixed the second patch for cases like OBJ-COUNT-gHASH~13 and added a
   couple test cases for that. Also, extended the commit message a bit to
   discuss the cases brought up on the list.

For the second patch, if folks want some open source examples where it could
be triggered, I found two examples:

 * lore.git: git cat-file -t master:random/path/major-gaffed
 * git.git: git cat-file -t super-invalid~///\\.....@.lock-gfd0bba94e

Elijah Newren (2):
  object-name: fix resolution of object names containing curly braces
  object-name: be more strict in parsing describe-like output

 object-name.c       | 63 ++++++++++++++++++++++++++++++++++++++++++---
 t/t1006-cat-file.sh | 31 +++++++++++++++++++++-
 t/t6120-describe.sh | 24 +++++++++++++++++
 3 files changed, 113 insertions(+), 5 deletions(-)


base-commit: 063bcebf0c917140ca0e705cbe0fdea127e90086
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1844%2Fnewren%2Fobject-name-fix-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1844/newren/object-name-fix-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1844

Range-diff vs v2:

 1:  13f68bebe90 ! 1:  b21b8fc5554 object-name: fix resolution of object names containing curly braces
     @@ Commit message
            * 'foo@@{...}'
            * 'foo^{/${SEARCH_TEXT_WITH_COLON}}:${PATH}'
      
     +    Note that we'd prefer duplicating the special logic for "@^" characters
     +    here, because if get_oid_basic() or interpret_nth_prior_checkout() or
     +    get_oid_basic() or similar gain extra methods of using curly braces,
     +    then the logic in get_oid_with_context_1() would need to be updated as
     +    well.  But it's not clear how to refactor all of these to have a simple
     +    common callpoint with the specialized logic.
     +
          Reported-by: Gabriel Amaral <gabriel-amaral@github.com>
          Helped-by: Michael Haggerty <mhagger@github.com>
          Signed-off-by: Elijah Newren <newren@gmail.com>
     @@ object-name.c: static enum get_oid_result get_oid_with_context_1(struct reposito
       	}
       	for (cp = name, bracket_depth = 0; *cp; cp++) {
      -		if (*cp == '{')
     -+		if (*(cp+1) == '{' && (*cp == '@' || *cp == '^')) {
     ++		if (strchr("@^", *cp) && cp[1] == '{') {
      +			cp++;
       			bracket_depth++;
      -		else if (bracket_depth && *cp == '}')
 2:  31f1c37b31a ! 2:  19f84dfc9cc object-name: be more strict in parsing describe-like output
     @@ Commit message
                'g', and an abbreviated object name.
          which means that output of the format
              ${REFNAME}-${INTEGER}-g${HASH}
     -    should parse to fully expand ${HASH}.  This is fine.  However, we
     +    should parse to fully expanded ${HASH}.  This is fine.  However, we
          currently don't validate any of ${REFNAME}-${INTEGER}, we only parse
          -g${HASH} and assume the rest is valid.  That is problematic, since it
          breaks things like
      
              git cat-file -p branchname:path/to/file/named/i-gaffed
      
     -    which, when commit affed exists, will not return us information about a
     -    file we are looking for but will instead tell us about commit affed.
     +    which, when commit (or tree or blob) affed exists, will not return us
     +    information about the file we are looking for but will instead
     +    erroneously tell us about object affed.
      
     -    Similarly, we should probably not treat
     -        refs/tags/invalid/./../...../// ~^:/?*\\&[}/busted.lock-g049e0ef6
     -    as a request for commit 050e0ef6 either.
     +    A few additional notes:
     +      - This is a slight backward incompatibility break, because we used
     +        to allow ${GARBAGE}-g${HASH} as a way to spell ${HASH}.  However,
     +        a backward incompatible break is necessary, because there is no
     +        other way for someone to be more specific and disambiguate that they
     +        want the blob master:path/to/who-gabbed instead of the object abbed.
     +      - There is a possibility that check_refname_format() rules change in
     +        the future.  However, we can only realistically loosen the rules
     +        for what that function accepts rather than tighten.  If we were to
     +        tighten the rules, some real world repositories may already have
     +        refnames that suddenly become unacceptable and we break those
     +        repositories.  As such, any describe-like syntax of the form
     +        ${VALID_FOR_A_REFNAME}-${INTEGER}-g${HASH} that is valid with the
     +        changes in this commit will remain valid in the future.
     +      - The fact that check_refname_format() rules could loosen in the
     +        future is probably also an important reason to make this change.  If
     +        the rules loosen, there might be additional cases within
     +        ${GARBAGE}-g${HASH} that become ambiguous in the future.  While
     +        abbreviated hashes can be disambiguated by abbreviating less, it may
     +        well be that these alternative object names have no way of being
     +        disambiguated (much like pathnames cannot be).  Accepting all random
     +        ${GARBAGE} thus makes it difficult for us to allow future
     +        extensions to object naming.
      
     -    Tighten up the parsing to make sure ${REFNAME} and ${INTEGER} are
     -    present and valid.
     +    So, tighten up the parsing to make sure ${REFNAME} and ${INTEGER} are
     +    present in the string, and would be considered a valid ref and
     +    non-negative integer.
     +
     +    Also, add a few tests for git describe using object names of the form
     +        ${REVISION_NAME}${MODIFIERS}
     +    since an early version of this patch failed on constructs like
     +        git describe v2.48.0-rc2-161-g6c2274cdbc^0
      
          Reported-by: Gabriel Amaral <gabriel-amaral@github.com>
          Signed-off-by: Elijah Newren <newren@gmail.com>
     @@ object-name.c: static int peel_onion(struct repository *r, const char *name, int
      +	len = cp - name;
      +	strbuf_init(&sb, len);
      +	strbuf_add(&sb, name, len);
     -+	ret = !check_refname_format(name, flags);
     ++	ret = !check_refname_format(sb.buf, flags);
      +	strbuf_release(&sb);
      +	return ret;
      +}
     @@ object-name.c: static int get_describe_name(struct repository *r,
       				return get_short_oid(r,
      
       ## t/t6120-describe.sh ##
     +@@ t/t6120-describe.sh: check_describe R-2-gHASH HEAD^^
     + check_describe A-3-gHASH HEAD^^2
     + check_describe B HEAD^^2^
     + check_describe R-1-gHASH HEAD^^^
     ++check_describe R-1-gHASH R-1-g$(git rev-parse --short HEAD^^)~1
     + 
     + check_describe c-7-gHASH --tags HEAD
     + check_describe c-6-gHASH --tags HEAD^
     + check_describe e-1-gHASH --tags HEAD^^
     + check_describe c-2-gHASH --tags HEAD^^2
     ++check_describe c-2-gHASH --tags c-2-g$(git rev-parse --short HEAD^^2)^0
     + check_describe B --tags HEAD^^2^
     + check_describe e --tags HEAD^^^
     + check_describe e --tags --exact-match HEAD^^^
      @@ t/t6120-describe.sh: test_expect_success '--exact-match does not show --always fallback' '
       	test_must_fail git describe --exact-match --always
       '

-- 
gitgitgadget

  parent reply	other threads:[~2025-01-13 17:13 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-01  2:53 [PATCH] object-name: fix resolution of object names containing curly braces Elijah Newren via GitGitGadget
2025-01-01 17:00 ` Junio C Hamano
2025-01-03 23:34   ` Elijah Newren
2025-01-04  2:52     ` Junio C Hamano
2025-01-03  8:16 ` Patrick Steinhardt
2025-01-03 15:46   ` Junio C Hamano
2025-01-03 23:43   ` Elijah Newren
2025-01-04  0:17 ` [PATCH v2 0/2] " Elijah Newren via GitGitGadget
2025-01-04  0:17   ` [PATCH v2 1/2] " Elijah Newren via GitGitGadget
2025-01-04 17:26     ` Junio C Hamano
2025-01-04 18:54       ` Elijah Newren
2025-01-05 16:14     ` Junio C Hamano
2025-01-04  0:17   ` [PATCH v2 2/2] object-name: be more strict in parsing describe-like output Elijah Newren via GitGitGadget
2025-01-04 14:35   ` [PATCH v2 0/2] object-name: fix resolution of object names containing curly braces Junio C Hamano
2025-01-04 15:55     ` Elijah Newren
2025-01-04 17:51       ` Junio C Hamano
2025-01-04 18:55         ` Elijah Newren
2025-01-06 17:29   ` Junio C Hamano
2025-01-06 19:26     ` Elijah Newren
2025-01-06 20:38       ` Junio C Hamano
2025-01-13 17:13   ` Elijah Newren via GitGitGadget [this message]
2025-01-13 17:13     ` [PATCH v3 1/2] " Elijah Newren via GitGitGadget
2025-01-13 17:13     ` [PATCH v3 2/2] object-name: be more strict in parsing describe-like output Elijah Newren via GitGitGadget
2025-01-13 18:15     ` [PATCH v3 0/2] object-name: fix a pair of object name resolution issues Junio C Hamano
2025-01-13 19:26       ` Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1844.v3.git.1736788417.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=newren@gmail.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.