* [PATCH can] can: bcm: add synchronize_rcu() in bcm_delete_rx_op() to fix UAF
@ 2026-05-26 10:23 Zhenghang Xiao
2026-05-26 10:59 ` Oliver Hartkopp
2026-06-09 18:52 ` sashiko-bot
0 siblings, 2 replies; 4+ messages in thread
From: Zhenghang Xiao @ 2026-05-26 10:23 UTC (permalink / raw)
To: Oliver Hartkopp, Marc Kleine-Budde; +Cc: linux-can, Zhenghang Xiao
bcm_delete_rx_op() calls can_rx_unregister() which does hlist_del_rcu(),
preventing new bcm_rx_handler invocations. However, already-dispatched
handlers continue running under rcu_read_lock() on other CPUs. The
subsequent bcm_remove_op() cancels the hrtimers and schedules deferred
free via call_rcu(), but a concurrent bcm_rx_handler can re-arm
op->thrtimer via hrtimer_start() after the cancel returns. The re-armed
timer fires on freed memory after the grace period.
The RX_NO_AUTOTIMER flag added by commit f1b4e32aca08 ("can: bcm: use
call_rcu() instead of costly synchronize_rcu()") only guards op->timer
(via bcm_rx_starttimer), not op->thrtimer (via bcm_rx_update_and_send).
Add synchronize_rcu() between list_del_rcu() and bcm_remove_op(),
matching the pattern in bcm_release(). This ensures all in-flight
handlers complete before timers are cancelled.
Fixes: f1b4e32aca08 ("can: bcm: use call_rcu() instead of costly synchronize_rcu()")
Signed-off-by: Zhenghang Xiao <kipreyyy@gmail.com>
---
net/can/bcm.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/can/bcm.c b/net/can/bcm.c
index a4bef2c48a55..ae083f59a9ef 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -870,6 +870,7 @@ static int bcm_delete_rx_op(struct list_head *ops, struct bcm_msg_head *mh,
bcm_rx_handler, op);
list_del_rcu(&op->list);
+ synchronize_rcu();
bcm_remove_op(op);
return 1; /* done */
}
--
2.50.1 (Apple Git-155)
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH can] can: bcm: add synchronize_rcu() in bcm_delete_rx_op() to fix UAF 2026-05-26 10:23 [PATCH can] can: bcm: add synchronize_rcu() in bcm_delete_rx_op() to fix UAF Zhenghang Xiao @ 2026-05-26 10:59 ` Oliver Hartkopp 2026-06-09 18:52 ` sashiko-bot 1 sibling, 0 replies; 4+ messages in thread From: Oliver Hartkopp @ 2026-05-26 10:59 UTC (permalink / raw) To: Zhenghang Xiao, Marc Kleine-Budde, Lee Jones; +Cc: linux-can Lee Jones is already working on a workqueue based approach: https://lore.kernel.org/linux-can/20260520080523.2513957-1-lee@kernel.org/ to not(!) re-introduce the costly synchronize_rcu(). There's only some feedback from the AI bot that has to be considered. https://netdev-ai.bots.linux.dev/ai-review.html?id=9e8842e7-59a2-4ec5-9f65-1df7973c6170 I expect the updated v2 patch this week. Best regards, Oliver On 26.05.26 12:23, Zhenghang Xiao wrote: > bcm_delete_rx_op() calls can_rx_unregister() which does hlist_del_rcu(), > preventing new bcm_rx_handler invocations. However, already-dispatched > handlers continue running under rcu_read_lock() on other CPUs. The > subsequent bcm_remove_op() cancels the hrtimers and schedules deferred > free via call_rcu(), but a concurrent bcm_rx_handler can re-arm > op->thrtimer via hrtimer_start() after the cancel returns. The re-armed > timer fires on freed memory after the grace period. > > The RX_NO_AUTOTIMER flag added by commit f1b4e32aca08 ("can: bcm: use > call_rcu() instead of costly synchronize_rcu()") only guards op->timer > (via bcm_rx_starttimer), not op->thrtimer (via bcm_rx_update_and_send). > > Add synchronize_rcu() between list_del_rcu() and bcm_remove_op(), > matching the pattern in bcm_release(). This ensures all in-flight > handlers complete before timers are cancelled. > > Fixes: f1b4e32aca08 ("can: bcm: use call_rcu() instead of costly synchronize_rcu()") > Signed-off-by: Zhenghang Xiao <kipreyyy@gmail.com> > --- > net/can/bcm.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/net/can/bcm.c b/net/can/bcm.c > index a4bef2c48a55..ae083f59a9ef 100644 > --- a/net/can/bcm.c > +++ b/net/can/bcm.c > @@ -870,6 +870,7 @@ static int bcm_delete_rx_op(struct list_head *ops, struct bcm_msg_head *mh, > bcm_rx_handler, op); > > list_del_rcu(&op->list); > + synchronize_rcu(); > bcm_remove_op(op); > return 1; /* done */ > } ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH can] can: bcm: add synchronize_rcu() in bcm_delete_rx_op() to fix UAF 2026-05-26 10:23 [PATCH can] can: bcm: add synchronize_rcu() in bcm_delete_rx_op() to fix UAF Zhenghang Xiao 2026-05-26 10:59 ` Oliver Hartkopp @ 2026-06-09 18:52 ` sashiko-bot 2026-06-10 7:29 ` Oliver Hartkopp 1 sibling, 1 reply; 4+ messages in thread From: sashiko-bot @ 2026-06-09 18:52 UTC (permalink / raw) To: Zhenghang Xiao Cc: Oleksij Rempel, linux-can, Marc Kleine-Budde, Vincent Mailhol, Oliver Hartkopp Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider: Pre-existing issues: - [High] Skipping `bcm_rx_unreg()` when `dev_get_by_index()` returns NULL leaves the CAN receiver registered in the core, leading to a memory leak and Use-After-Free. - [High] Incorrect list primitives used for RCU-traversed lists (`rx_ops` and `tx_ops`), leading to list corruption and crashes. - [High] Concurrent modification of an existing RX operation in `bcm_rx_setup()` races with `bcm_rx_handler()`, allowing readers to see partially-written data. -- commit 27083d45b99681b318907aa96e90787a12c08a58 Author: Zhenghang Xiao <kipreyyy@gmail.com> can: bcm: add synchronize_rcu() in bcm_delete_rx_op() to fix UAF This commit adds synchronize_rcu() in bcm_delete_rx_op() before bcm_remove_op() to ensure all in-flight handlers complete before timers are cancelled. This prevents a use-after-free where a timer could be re-armed by a concurrent bcm_rx_handler after being cancelled. Fixes: f1b4e32aca08 ("can: bcm: use call_rcu() instead of costly synchronize_rcu()") > diff --git a/net/can/bcm.c b/net/can/bcm.c > index a4bef2c48a559..ae083f59a9ef5 100644 > --- a/net/can/bcm.c > +++ b/net/can/bcm.c > @@ -870,6 +870,7 @@ static int bcm_delete_rx_op(struct list_head *ops, struct bcm_msg_head *mh, > bcm_rx_handler, op); > > list_del_rcu(&op->list); [Severity: High] This isn't a bug introduced by this patch, but looking at the code just above this addition in bcm_delete_rx_op(): dev = dev_get_by_index(sock_net(op->sk), op->ifindex); if (dev) { bcm_rx_unreg(dev, op); dev_put(dev); } During device unregistration, unlist_netdevice() makes dev_get_by_index() return NULL before the NETDEV_UNREGISTER notifier is dispatched. If user space concurrently closes the socket in bcm_release() or deletes the RX filter in bcm_delete_rx_op(), these functions check dev_get_by_index(), see NULL, and bypass calling bcm_rx_unreg(). Does this leave the CAN receiver registered in the core, leaking the struct receiver and causing a use-after-free when queued frames process the already freed op pointer? > + synchronize_rcu(); > bcm_remove_op(op); > return 1; /* done */ > } [Severity: High] This is a pre-existing issue, but I noticed a potential problem with the list primitives used for rx_ops and tx_ops in net/can/bcm.c. In bcm_rx_setup() and bcm_tx_setup(), the code uses list_add(): list_add(&op->list, &bo->tx_ops); However, in bcm_proc_show(), the tx_ops list is traversed locklessly: list_for_each_entry(op, &bo->tx_ops, list) { Because list_add() lacks the smp_store_release() memory barriers of list_add_rcu(), and the traversal doesn't use list_for_each_entry_rcu(), could this result in readers seeing partially initialized list pointers and dereferencing a NULL pointer on weakly ordered architectures? [Severity: High] This is another pre-existing issue, but there appears to be a race condition between bcm_rx_setup() and bcm_rx_handler() in net/can/bcm.c. When bcm_rx_setup() updates an existing RX operation under lock_sock(sk), it modifies the frames locklessly: err = memcpy_from_msg(op->frames, msg, msg_head->nframes * op->cfsiz); if (err < 0) return err; memset(op->last_frames, 0, msg_head->nframes * op->cfsiz); op->nframes = msg_head->nframes; Meanwhile, bcm_rx_handler() accesses these fields in softirq context: for (i = 1; i < op->nframes; i++) { if ((get_u64(op->frames, 0) & get_u64(rxframe, 0)) == (get_u64(op->frames, 0) & get_u64(op->frames + op->cfsiz * i, 0))) { Since memcpy_from_msg() can sleep on page faults, could bcm_rx_handler() read torn or partially overwritten CAN filters, leading to incorrect message delivery or spurious timer scheduling? -- Sashiko AI review · https://sashiko.dev/#/patchset/20260526102349.94074-1-kipreyyy@gmail.com?part=1 ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH can] can: bcm: add synchronize_rcu() in bcm_delete_rx_op() to fix UAF 2026-06-09 18:52 ` sashiko-bot @ 2026-06-10 7:29 ` Oliver Hartkopp 0 siblings, 0 replies; 4+ messages in thread From: Oliver Hartkopp @ 2026-06-10 7:29 UTC (permalink / raw) To: sashiko-reviews, Zhenghang Xiao Cc: Oleksij Rempel, linux-can, Marc Kleine-Budde, Vincent Mailhol The patch already has been rejected in favor of: [PATCH v3 1/1] can: bcm: defer rx_op deallocation to workqueue to fix thrtimer UAF https://lore.kernel.org/linux-can/20260526165257.3705239-1-lee@kernel.org/ On 09.06.26 20:52, sashiko-bot@kernel.org wrote: > Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider: > > Pre-existing issues: > - [High] Skipping `bcm_rx_unreg()` when `dev_get_by_index()` returns NULL leaves the CAN receiver registered in the core, leading to a memory leak and Use-After-Free. > - [High] Incorrect list primitives used for RCU-traversed lists (`rx_ops` and `tx_ops`), leading to list corruption and crashes. > - [High] Concurrent modification of an existing RX operation in `bcm_rx_setup()` races with `bcm_rx_handler()`, allowing readers to see partially-written data. > -- > > commit 27083d45b99681b318907aa96e90787a12c08a58 > Author: Zhenghang Xiao <kipreyyy@gmail.com> > > can: bcm: add synchronize_rcu() in bcm_delete_rx_op() to fix UAF > > This commit adds synchronize_rcu() in bcm_delete_rx_op() before > bcm_remove_op() to ensure all in-flight handlers complete before timers > are cancelled. This prevents a use-after-free where a timer could be > re-armed by a concurrent bcm_rx_handler after being cancelled. > > Fixes: f1b4e32aca08 ("can: bcm: use call_rcu() instead of costly synchronize_rcu()") > >> diff --git a/net/can/bcm.c b/net/can/bcm.c >> index a4bef2c48a559..ae083f59a9ef5 100644 >> --- a/net/can/bcm.c >> +++ b/net/can/bcm.c >> @@ -870,6 +870,7 @@ static int bcm_delete_rx_op(struct list_head *ops, struct bcm_msg_head *mh, >> bcm_rx_handler, op); >> >> list_del_rcu(&op->list); > > [Severity: High] > This isn't a bug introduced by this patch, but looking at the code just > above this addition in bcm_delete_rx_op(): > > dev = dev_get_by_index(sock_net(op->sk), op->ifindex); > if (dev) { > bcm_rx_unreg(dev, op); > dev_put(dev); > } > > During device unregistration, unlist_netdevice() makes dev_get_by_index() > return NULL before the NETDEV_UNREGISTER notifier is dispatched. > > If user space concurrently closes the socket in bcm_release() or deletes > the RX filter in bcm_delete_rx_op(), these functions check > dev_get_by_index(), see NULL, and bypass calling bcm_rx_unreg(). > > Does this leave the CAN receiver registered in the core, leaking the > struct receiver and causing a use-after-free when queued frames process > the already freed op pointer? > >> + synchronize_rcu(); >> bcm_remove_op(op); >> return 1; /* done */ >> } > > [Severity: High] > This is a pre-existing issue, but I noticed a potential problem with the > list primitives used for rx_ops and tx_ops in net/can/bcm.c. > > In bcm_rx_setup() and bcm_tx_setup(), the code uses list_add(): > > list_add(&op->list, &bo->tx_ops); > > However, in bcm_proc_show(), the tx_ops list is traversed locklessly: > > list_for_each_entry(op, &bo->tx_ops, list) { > > Because list_add() lacks the smp_store_release() memory barriers of > list_add_rcu(), and the traversal doesn't use list_for_each_entry_rcu(), > could this result in readers seeing partially initialized list pointers > and dereferencing a NULL pointer on weakly ordered architectures? > > [Severity: High] > This is another pre-existing issue, but there appears to be a race > condition between bcm_rx_setup() and bcm_rx_handler() in net/can/bcm.c. > > When bcm_rx_setup() updates an existing RX operation under lock_sock(sk), > it modifies the frames locklessly: > > err = memcpy_from_msg(op->frames, msg, msg_head->nframes * op->cfsiz); > if (err < 0) > return err; > memset(op->last_frames, 0, msg_head->nframes * op->cfsiz); > op->nframes = msg_head->nframes; > > Meanwhile, bcm_rx_handler() accesses these fields in softirq context: > > for (i = 1; i < op->nframes; i++) { > if ((get_u64(op->frames, 0) & get_u64(rxframe, 0)) == > (get_u64(op->frames, 0) & > get_u64(op->frames + op->cfsiz * i, 0))) { > > Since memcpy_from_msg() can sleep on page faults, could bcm_rx_handler() > read torn or partially overwritten CAN filters, leading to incorrect > message delivery or spurious timer scheduling? > ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-06-10 7:30 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-05-26 10:23 [PATCH can] can: bcm: add synchronize_rcu() in bcm_delete_rx_op() to fix UAF Zhenghang Xiao 2026-05-26 10:59 ` Oliver Hartkopp 2026-06-09 18:52 ` sashiko-bot 2026-06-10 7:29 ` Oliver Hartkopp
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.