netdev.vger.kernel.org archive mirror
From: Dan Carpenter <dan.carpenter@linaro.org>
To: Lizhi Xu <lizhi.xu@windriver.com>
Cc: davem@davemloft.net, edumazet@google.com, horms@kernel.org,
	kuba@kernel.org, linux-hams@vger.kernel.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	pabeni@redhat.com,
	syzbot+2860e75836a08b172755@syzkaller.appspotmail.com,
	syzkaller-bugs@googlegroups.com
Subject: Re: [PATCH V2] netrom: Prevent race conditions between multiple add route
Date: Tue, 21 Oct 2025 09:36:47 +0300
Message-ID: <aPcp_xemzpDuw-MW@stanley.mountain>
In-Reply-To: <20251021020533.1234755-1-lizhi.xu@windriver.com>

On Tue, Oct 21, 2025 at 10:05:33AM +0800, Lizhi Xu wrote:
> On Mon, 20 Oct 2025 20:59:24 +0300, Dan Carpenter wrote:
> > On Mon, Oct 20, 2025 at 09:49:12PM +0800, Lizhi Xu wrote:
> > > On Mon, 20 Oct 2025 21:34:56 +0800, Lizhi Xu wrote:
> > > > > Task0					Task1						Task2
> > > > > =====					=====						=====
> > > > > [97] nr_add_node()
> > > > > [113] nr_neigh_get_dev()		[97] nr_add_node()
> > > > > 					[214] nr_node_lock()
> > > > > 					[245] nr_node->routes[2].neighbour->count--
> > > > > 					[246] nr_neigh_put(nr_node->routes[2].neighbour);
> > > > > 					[248] nr_remove_neigh(nr_node->routes[2].neighbour)
> > > > > 					[283] nr_node_unlock()
> > > > > [214] nr_node_lock()
> > > > > [253] nr_node->routes[2].neighbour = nr_neigh
> > > > > [254] nr_neigh_hold(nr_neigh);							[97] nr_add_node()
> > > > > 											[XXX] nr_neigh_put()
> > > > >                                                                                         ^^^^^^^^^^^^^^^^^^^^
> > > > >
> > > > > These charts are supposed to be chronological so [XXX] is wrong because the
> > > > > use after free happens on line [248].  Do we really need three threads to
> > > > > make this race work?
> > > > The UAF problem occurs in Task2. Task1 sets the refcount of nr_neigh to 1,
> > > > then Task0 adds it to routes[2]. Task2 releases routes[2].neighbour after
> > > > executing [XXX]nr_neigh_put().
> > > Execution Order:
> > > 1 -> Task0
> > > [113] nr_neigh_get_dev() // After execution, the refcount value is 3
> > >
> > > 2 -> Task1
> > > [246] nr_neigh_put(nr_node->routes[2].neighbour);   // After execution, the refcount value is 2
> > > [248] nr_remove_neigh(nr_node->routes[2].neighbour) // After execution, the refcount value is 1
> > >
> > > 3 -> Task0
> > > [253] nr_node->routes[2].neighbour = nr_neigh       // nr_neigh's refcount value is 1 and add it to routes[2]
> > >
> > > 4 -> Task2
> > > [XXX] nr_neigh_put(nr_node->routes[2].neighbour)    // After execution, neighbour is freed
> > > if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked)  // UAF occurs on this line when accessing neighbour->count
> > 
> > Let's step back a bit and look at the bigger picture design.  (Which is
> > completely undocumented so we're just guessing).
> > 
> > When we put nr_neigh into nr_node->routes[] we bump the nr_neigh_hold()
> > reference count and nr_neigh->count++, then when we remove it from
> > ->routes[] we drop the reference and do nr_neigh->count--.
> > 
> > If it's the last reference (and we are not holding ->locked) then we
> > remove it from the &nr_neigh_list and drop the reference count again and
> > free it.  So we drop the reference count twice.  This is a complicated
> > design with three variables: nr_neigh_hold(), nr_neigh->count and
> > ->locked.  Why can it not just be one counter, nr_neigh_hold()?  So
> > instead of setting locked = true we would just take an extra reference?
> > The nr_neigh->count++ would be replaced with nr_neigh_hold() as well.
> locked controls whether the neighbor quality can be automatically updated;

I'm not sure your patch fixes the bug because we could still race against
nr_del_node().

I'm not saying get rid of locked completely, I'm saying get rid of code like
this:
		if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked)
			nr_remove_neigh(nr_node->routes[2].neighbour);

Right now, locked serves as a special kind of reference count, because we
don't drop the reference if it's true.

> count controls the number of different routes a neighbor is linked to;

Sure, that is interesting information for the user, so keep it around to
print in the proc file, but don't use it as a reference count.

> refcount is simply used to manage the neighbor lifecycle.

The bug is caused because our reference counting is bad.

So right now what happens is we allocate nr_neigh and we put it on the
&nr_neigh_list.  Then we lock it or we add it to ->routes[] and each of
those has a different reference count.  Then when we drop those references
we do:

		if (nr_node->routes[2].neighbour->count == 0 && !nr_node->routes[2].neighbour->locked)
			nr_remove_neigh(nr_node->routes[2].neighbour);

This removes it from the list, and hopefully this is the last reference
and it frees it.
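
To make the current scheme concrete, here is a minimal single-threaded
userspace sketch of how I read it.  It is not the kernel code: the struct,
the helper names and the "freed" flag (set where the kernel would kfree())
are all hypothetical, and there is no locking, so it only shows the
bookkeeping and not the race itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the current scheme; not the kernel struct. */
struct neigh {
	int  refcount;	/* lifetime counter behind hold()/put() */
	int  count;	/* number of ->routes[] slots using this neighbour */
	bool locked;	/* when set, the neighbour is never auto-removed */
	bool on_list;	/* stands in for membership of nr_neigh_list */
	bool freed;	/* set where the kernel would kfree() */
};

/* Allocation: the list owns the initial reference. */
void neigh_init(struct neigh *n)
{
	*n = (struct neigh){ .refcount = 1, .on_list = true };
}

void neigh_hold(struct neigh *n)
{
	n->refcount++;
}

void neigh_put(struct neigh *n)
{
	assert(n->refcount > 0);
	if (--n->refcount == 0)
		n->freed = true;
}

/* Models nr_remove_neigh(): unlink, then drop the list's reference. */
void remove_neigh(struct neigh *n)
{
	n->on_list = false;
	neigh_put(n);
}

/* Adding to ->routes[] bumps both counters. */
void route_add(struct neigh *n)
{
	n->count++;
	neigh_hold(n);
}

/* Two drops on the last route: one for the slot, one via remove_neigh(). */
void route_del(struct neigh *n)
{
	n->count--;
	neigh_put(n);
	if (n->count == 0 && !n->locked)
		remove_neigh(n);
}
```

In this model neigh_init(), route_add(), route_del() takes the refcount
1 -> 2 -> 0, with the second drop coming from remove_neigh().  With locked
set, the same sequence ends at refcount 1 and the neighbour stays on the
list.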

It would be much simpler to say, we only use nr_neigh_hold()/put() for
reference counting.  When we set locked we do:

	nr_neigh_hold(nr_neigh);
	nr_neigh->locked = true;

Incrementing the refcount means it can't be freed.

Then when we remove nr_neigh from ->routes[] we wouldn't "remove it from
the list", instead we would just drop a reference.  When we dropped the
last reference, nr_neigh_put() would remove it from the list.
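
A rough userspace sketch of that idea, under my own assumptions: the struct
and helper names are hypothetical, "freed" stands in for kfree(), and
neigh_clear_locked() is my guess at the natural counterpart (the thread
only talks about setting locked).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the proposal; not kernel code. */
struct neigh {
	int  refcount;	/* the only counter that decides lifetime */
	bool locked;	/* still a flag, but now backed by a reference */
	bool on_list;	/* stands in for membership of nr_neigh_list */
	bool freed;	/* set where the kernel would kfree() */
};

void neigh_hold(struct neigh *n)
{
	n->refcount++;
}

/* The final put is the only place that unlinks and frees. */
void neigh_put(struct neigh *n)
{
	assert(n->refcount > 0);
	if (--n->refcount == 0) {
		n->on_list = false;
		n->freed = true;
	}
}

/* Locking pins the neighbour by taking a reference (at most once). */
void neigh_set_locked(struct neigh *n)
{
	if (!n->locked) {
		neigh_hold(n);
		n->locked = true;
	}
}

/* Assumed counterpart: unlocking gives that reference back. */
void neigh_clear_locked(struct neigh *n)
{
	if (n->locked) {
		n->locked = false;
		neigh_put(n);
	}
}
```

The point of the sketch is that nothing ever tests count or locked to
decide whether to free: a locked neighbour survives any number of other
puts simply because its own reference is still outstanding.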

My proposal would be a behavior change because right now what happens is:

1: allocate nr_neigh
2: add it to ->routes[]
3: remove it from ->routes[]
   (freed automatically because we drop two references)

Now it would be:
1: allocate nr_neigh
2: add it to ->routes[]
3: remove it from ->routes[]
4: needs to be freed manually with nr_del_neigh().
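
The four steps can be traced with a small userspace model.  This is only a
sketch of how I imagine it: the names are hypothetical, "freed" stands in
for kfree(), and I am assuming that nr_del_neigh() is what drops the
reference taken at allocation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the proposed lifecycle; not kernel code. */
struct neigh {
	int  refcount;	/* decides lifetime */
	int  count;	/* statistic only, e.g. for the proc file */
	bool on_list;	/* stands in for membership of nr_neigh_list */
	bool freed;	/* set where the kernel would kfree() */
};

/* Step 1: allocate; the initial reference stays until del_neigh(). */
void neigh_alloc(struct neigh *n)
{
	*n = (struct neigh){ .refcount = 1, .on_list = true };
}

/* The last put unlinks from the list and frees. */
void neigh_put(struct neigh *n)
{
	assert(n->refcount > 0);
	if (--n->refcount == 0) {
		n->on_list = false;
		n->freed = true;
	}
}

/* Step 2: add to ->routes[]; count is bumped for reporting only. */
void route_add(struct neigh *n)
{
	n->count++;
	n->refcount++;
}

/* Step 3: remove from ->routes[]; a single drop, no count/locked test. */
void route_del(struct neigh *n)
{
	n->count--;
	neigh_put(n);
}

/* Step 4: models nr_del_neigh() giving up the allocation reference. */
void del_neigh(struct neigh *n)
{
	neigh_put(n);
}
```

Running the steps in order gives refcount 1, 2, 1, 0, and the neighbour is
only freed in step 4.  If del_neigh() runs while a route still points at
the neighbour, the model keeps it alive until the matching route_del().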

regards,
dan carpenter
