All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yuvraj Singh Chauhan <ysinghcin@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: "D. Ben Knoble" <ben.knoble@gmail.com>,
	git@vger.kernel.org, christian.couder@gmail.com
Subject: Re: [PATCH v1] path-walk: fix NULL pointer dereference in error message
Date: Mon, 23 Mar 2026 15:15:43 +0530	[thread overview]
Message-ID: <acELx_n-RK3qwz1Z@ThinkPad-E14-Gen-6> (raw)
In-Reply-To: <994f92e9-3576-455a-a142-0fefc559131c@gmail.com>

On Fri, Mar 20, 2026 at 12:14:19PM -0400, Derrick Stolee wrote:
> On 3/20/2026 11:16 AM, D. Ben Knoble wrote:
> > On Fri, Mar 20, 2026 at 7:50 AM Yuvraj Singh Chauhan
> > <ysinghcin@gmail.com> wrote:
> >> @@ -171,7 +171,7 @@ static int add_tree_entries(struct path_walk_context *ctx,
> >>
> >>                 if (!o) {
> >>                         error(_("failed to find object %s"),
> >> -                             oid_to_hex(&o->oid));
> >> +                             oid_to_hex(&entry.oid));
> >>                         return -1;
> >>                 }
> >>
> >> --
> >> 2.53.0.582.gca1db8a0f7
> > 
> > Interesting find. I was hoping to see an easy way to reproduce hitting
> > this code, and after grepping around a bit I found a few places that
> > end up in this code (git-backfill and git-repo being the primary
> > callers of walk_objects_by_path), but on second glance I think "!o" is
> > current dead code.
> 
> I can appreciate that the existing code is clearly incorrect, so
> tooling scanning code for defects would find this even if we can't
> easily create a test case to demonstrate it.
>  
> > Still, fixing such obviously wrong dereference is good, but I wonder
> > if we should go further?
> > 
> > You mentioned git-backfill with a tree missing from the local odb; do
> > you have a short reproduction script or test-case?
> 
> I imagine that it would be difficult to set up such a case, but maybe
> it would follow these steps (based on existing 'git backfill' tests
> that start with a partial clone):
> 
> 1. Make a bare, blobless partial clone of the server repo.
> 
> 2. Explode the client repo's object store into loose objects.
> 
> 3. Delete a loose tree object, but one that isn't a commit's root
>    tree. It must be a child tree.
> 
> In this case, we should hit this issue. Blobless partial clones
> expect all reachable trees to exist locally and so are not prepared
> to download missing trees on-demand.
> 
> Thanks,
> -Stolee

Thanks for the suggestions on how to reproduce this.
I traced through the code paths and believe the scenario described
(blobless partial clone, explode objects, delete a child tree) would
not actually reach the '!o' branch. Here is my reasoning:

When add_tree_entries() encounters a child tree entry whose OID is
missing from the ODB, it calls lookup_tree(). Since the OID was never
previously registered in the parsed object hash, lookup_object()
returns NULL, and lookup_tree() falls through to create_object() which
allocates a shell object. So 'child' is non-NULL, 'o' is non-NULL, 
and the '!o' check is not reached.
The missing object would instead fail later: when the path-walk tries
to visit the child tree via a subsequent add_tree_entries() call,
repo_parse_tree_gently() attempts to read it from the ODB, fails, and
returns the "bad tree object" error at the top of the function.

The '!o' path is reachable when the OID in a tree entry was
"previously registered" in the parsed object hash with a different
type. In that case, lookup_blob() or lookup_tree() calls
object_as_type(), which detects the type conflict and returns NULL.
This is what happens with a corrupt/confused tree that references a
commit OID as a blob entry, the commit was already registered during
the revision walk, so object_as_type(obj, OBJ_BLOB, 0) returns NULL.
For v2, I've added a test that exercises exactly this path:
    # Create a tree with a blob entry pointing to a commit object
    commit=$(git rev-parse HEAD)
    bad_tree=$(perl -e 'print "100644 confused\0" . pack("H*", $ARGV[0])' \
    "$commit" | git hash-object -w --stdin -t tree)
    git pack-objects --path-walk --all /tmp/test-pack <<< "$bad_tree"
This constructs a tree where mode 100644 (blob) has a commit OID.
During --path-walk, the commit is registered as OBJ_COMMIT by the
revision walk via --all, so when the tree entry is processed,
lookup_blob() → object_as_type() returns NULL, triggering the bug.

Please review

Sincerely,
Yuvraj

  reply	other threads:[~2026-03-23  9:45 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-20 11:48 [PATCH v1] path-walk: fix NULL pointer dereference in error message Yuvraj Singh Chauhan
2026-03-20 15:16 ` D. Ben Knoble
2026-03-20 16:14   ` Derrick Stolee
2026-03-23  9:45     ` Yuvraj Singh Chauhan [this message]
2026-03-20 17:08   ` Junio C Hamano
2026-03-20 18:54   ` René Scharfe
2026-03-22 14:21     ` D. Ben Knoble

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=acELx_n-RK3qwz1Z@ThinkPad-E14-Gen-6 \
    --to=ysinghcin@gmail.com \
    --cc=ben.knoble@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.