public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed
From: Yuvraj Singh Chauhan <ysinghcin@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: "D. Ben Knoble" <ben.knoble@gmail.com>,
	git@vger.kernel.org, christian.couder@gmail.com
Subject: Re: [PATCH v1] path-walk: fix NULL pointer dereference in error message
Date: Mon, 23 Mar 2026 15:15:43 +0530	[thread overview]
Message-ID: <acELx_n-RK3qwz1Z@ThinkPad-E14-Gen-6> (raw)
In-Reply-To: <994f92e9-3576-455a-a142-0fefc559131c@gmail.com>

On Fri, Mar 20, 2026 at 12:14:19PM -0400, Derrick Stolee wrote:
> On 3/20/2026 11:16 AM, D. Ben Knoble wrote:
> > On Fri, Mar 20, 2026 at 7:50 AM Yuvraj Singh Chauhan
> > <ysinghcin@gmail.com> wrote:
> >> @@ -171,7 +171,7 @@ static int add_tree_entries(struct path_walk_context *ctx,
> >>
> >>                 if (!o) {
> >>                         error(_("failed to find object %s"),
> >> -                             oid_to_hex(&o->oid));
> >> +                             oid_to_hex(&entry.oid));
> >>                         return -1;
> >>                 }
> >>
> >> --
> >> 2.53.0.582.gca1db8a0f7
> > 
> > Interesting find. I was hoping to see an easy way to reproduce hitting
> > this code, and after grepping around a bit I found a few places that
> > end up in this code (git-backfill and git-repo being the primary
> > callers of walk_objects_by_path), but on second glance I think "!o" is
> > current dead code.
> 
> I can appreciate that the existing code is clearly incorrect, so
> tooling scanning code for defects would find this even if we can't
> easily create a test case to demonstrate it.
>  
> > Still, fixing such obviously wrong dereference is good, but I wonder
> > if we should go further?
> > 
> > You mentioned git-backfill with a tree missing from the local odb; do
> > you have a short reproduction script or test-case?
> 
> I imagine that it would be difficult to set up such a case, but maybe
> it would follow these steps (based on existing 'git backfill' tests
> that start with a partial clone):
> 
> 1. Make a bare, blobless partial clone of the server repo.
> 
> 2. Explode the client repo's object store into loose objects.
> 
> 3. Delete a loose tree object, but one that isn't a commit's root
>    tree. It must be a child tree.
> 
> In this case, we should hit this issue. Blobless partial clones
> expect all reachable trees to exist locally and so are not prepared
> to download missing trees on-demand.
> 
> Thanks,
> -Stolee

Thanks for the suggestions on how to reproduce this.
I traced through the code paths and believe the scenario described
(blobless partial clone, explode objects, delete a child tree) would
not actually reach the '!o' branch. Here is my reasoning:

When add_tree_entries() encounters a child tree entry whose OID is
missing from the ODB, it calls lookup_tree(). Since the OID was never
previously registered in the parsed object hash, lookup_object()
returns NULL, and lookup_tree() falls through to create_object() which
allocates a shell object. So 'child' is non-NULL, 'o' is non-NULL, 
and the '!o' check is not reached.
The missing object would instead fail later: when the path-walk tries
to visit the child tree via a subsequent add_tree_entries() call,
repo_parse_tree_gently() attempts to read it from the ODB, fails, and
returns the "bad tree object" error at the top of the function.

The '!o' path is reachable when the OID in a tree entry was
"previously registered" in the parsed object hash with a different
type. In that case, lookup_blob() or lookup_tree() calls
object_as_type(), which detects the type conflict and returns NULL.
This is what happens with a corrupt/confused tree that references a
commit OID as a blob entry, the commit was already registered during
the revision walk, so object_as_type(obj, OBJ_BLOB, 0) returns NULL.
For v2, I've added a test that exercises exactly this path:
    # Create a tree with a blob entry pointing to a commit object
    commit=$(git rev-parse HEAD)
    bad_tree=$(perl -e 'print "100644 confused\0" . pack("H*", $ARGV[0])' \
    "$commit" | git hash-object -w --stdin -t tree)
    git pack-objects --path-walk --all /tmp/test-pack <<< "$bad_tree"
This constructs a tree where mode 100644 (blob) has a commit OID.
During --path-walk, the commit is registered as OBJ_COMMIT by the
revision walk via --all, so when the tree entry is processed,
lookup_blob() → object_as_type() returns NULL, triggering the bug.

Please review

Sincerely,
Yuvraj

  reply	other threads:[~2026-03-23  9:45 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-20 11:48 [PATCH v1] path-walk: fix NULL pointer dereference in error message Yuvraj Singh Chauhan
2026-03-20 15:16 ` D. Ben Knoble
2026-03-20 16:14   ` Derrick Stolee
2026-03-23  9:45     ` Yuvraj Singh Chauhan [this message]
2026-03-20 17:08   ` Junio C Hamano
2026-03-20 18:54   ` René Scharfe
2026-03-22 14:21     ` D. Ben Knoble

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=acELx_n-RK3qwz1Z@ThinkPad-E14-Gen-6 \
    --to=ysinghcin@gmail.com \
    --cc=ben.knoble@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox