* Re: [PATCH can] can: bcm: add synchronize_rcu() in bcm_delete_rx_op() to fix UAF
2026-05-26 10:23 [PATCH can] can: bcm: add synchronize_rcu() in bcm_delete_rx_op() to fix UAF Zhenghang Xiao
@ 2026-05-26 10:59 ` Oliver Hartkopp
2026-06-09 18:52 ` sashiko-bot
1 sibling, 0 replies; 3+ messages in thread
From: Oliver Hartkopp @ 2026-05-26 10:59 UTC (permalink / raw)
To: Zhenghang Xiao, Marc Kleine-Budde, Lee Jones; +Cc: linux-can
Lee Jones is already working on a workqueue based approach:
https://lore.kernel.org/linux-can/20260520080523.2513957-1-lee@kernel.org/
to not(!) re-introduce the costly synchronize_rcu().
There's only some feedback from the AI bot that has to be considered.
https://netdev-ai.bots.linux.dev/ai-review.html?id=9e8842e7-59a2-4ec5-9f65-1df7973c6170
I expect the updated v2 patch this week.
Best regards,
Oliver
On 26.05.26 12:23, Zhenghang Xiao wrote:
> bcm_delete_rx_op() calls can_rx_unregister() which does hlist_del_rcu(),
> preventing new bcm_rx_handler invocations. However, already-dispatched
> handlers continue running under rcu_read_lock() on other CPUs. The
> subsequent bcm_remove_op() cancels the hrtimers and schedules deferred
> free via call_rcu(), but a concurrent bcm_rx_handler can re-arm
> op->thrtimer via hrtimer_start() after the cancel returns. The re-armed
> timer fires on freed memory after the grace period.
>
> The RX_NO_AUTOTIMER flag added by commit f1b4e32aca08 ("can: bcm: use
> call_rcu() instead of costly synchronize_rcu()") only guards op->timer
> (via bcm_rx_starttimer), not op->thrtimer (via bcm_rx_update_and_send).
>
> Add synchronize_rcu() between list_del_rcu() and bcm_remove_op(),
> matching the pattern in bcm_release(). This ensures all in-flight
> handlers complete before timers are cancelled.
>
> Fixes: f1b4e32aca08 ("can: bcm: use call_rcu() instead of costly synchronize_rcu()")
> Signed-off-by: Zhenghang Xiao <kipreyyy@gmail.com>
> ---
> net/can/bcm.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/net/can/bcm.c b/net/can/bcm.c
> index a4bef2c48a55..ae083f59a9ef 100644
> --- a/net/can/bcm.c
> +++ b/net/can/bcm.c
> @@ -870,6 +870,7 @@ static int bcm_delete_rx_op(struct list_head *ops, struct bcm_msg_head *mh,
> bcm_rx_handler, op);
>
> list_del_rcu(&op->list);
> + synchronize_rcu();
> bcm_remove_op(op);
> return 1; /* done */
> }
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [PATCH can] can: bcm: add synchronize_rcu() in bcm_delete_rx_op() to fix UAF
2026-05-26 10:23 [PATCH can] can: bcm: add synchronize_rcu() in bcm_delete_rx_op() to fix UAF Zhenghang Xiao
2026-05-26 10:59 ` Oliver Hartkopp
@ 2026-06-09 18:52 ` sashiko-bot
1 sibling, 0 replies; 3+ messages in thread
From: sashiko-bot @ 2026-06-09 18:52 UTC (permalink / raw)
To: Zhenghang Xiao
Cc: Oleksij Rempel, linux-can, Marc Kleine-Budde, Vincent Mailhol,
Oliver Hartkopp
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
Pre-existing issues:
- [High] Skipping `bcm_rx_unreg()` when `dev_get_by_index()` returns NULL leaves the CAN receiver registered in the core, leading to a memory leak and Use-After-Free.
- [High] Incorrect list primitives used for RCU-traversed lists (`rx_ops` and `tx_ops`), leading to list corruption and crashes.
- [High] Concurrent modification of an existing RX operation in `bcm_rx_setup()` races with `bcm_rx_handler()`, allowing readers to see partially-written data.
--
commit 27083d45b99681b318907aa96e90787a12c08a58
Author: Zhenghang Xiao <kipreyyy@gmail.com>
can: bcm: add synchronize_rcu() in bcm_delete_rx_op() to fix UAF
This commit adds synchronize_rcu() in bcm_delete_rx_op() before
bcm_remove_op() to ensure all in-flight handlers complete before timers
are cancelled. This prevents a use-after-free where a timer could be
re-armed by a concurrent bcm_rx_handler after being cancelled.
Fixes: f1b4e32aca08 ("can: bcm: use call_rcu() instead of costly synchronize_rcu()")
> diff --git a/net/can/bcm.c b/net/can/bcm.c
> index a4bef2c48a559..ae083f59a9ef5 100644
> --- a/net/can/bcm.c
> +++ b/net/can/bcm.c
> @@ -870,6 +870,7 @@ static int bcm_delete_rx_op(struct list_head *ops, struct bcm_msg_head *mh,
> bcm_rx_handler, op);
>
> list_del_rcu(&op->list);
[Severity: High]
This isn't a bug introduced by this patch, but looking at the code just
above this addition in bcm_delete_rx_op():
dev = dev_get_by_index(sock_net(op->sk), op->ifindex);
if (dev) {
bcm_rx_unreg(dev, op);
dev_put(dev);
}
During device unregistration, unlist_netdevice() makes dev_get_by_index()
return NULL before the NETDEV_UNREGISTER notifier is dispatched.
If user space concurrently closes the socket in bcm_release() or deletes
the RX filter in bcm_delete_rx_op(), these functions check
dev_get_by_index(), see NULL, and bypass calling bcm_rx_unreg().
Does this leave the CAN receiver registered in the core, leaking the
struct receiver and causing a use-after-free when queued frames process
the already freed op pointer?
> + synchronize_rcu();
> bcm_remove_op(op);
> return 1; /* done */
> }
[Severity: High]
This is a pre-existing issue, but I noticed a potential problem with the
list primitives used for rx_ops and tx_ops in net/can/bcm.c.
In bcm_rx_setup() and bcm_tx_setup(), the code uses list_add():
list_add(&op->list, &bo->tx_ops);
However, in bcm_proc_show(), the tx_ops list is traversed locklessly:
list_for_each_entry(op, &bo->tx_ops, list) {
Because list_add() lacks the smp_store_release() memory barriers of
list_add_rcu(), and the traversal doesn't use list_for_each_entry_rcu(),
could this result in readers seeing partially initialized list pointers
and dereferencing a NULL pointer on weakly ordered architectures?
[Severity: High]
This is another pre-existing issue, but there appears to be a race
condition between bcm_rx_setup() and bcm_rx_handler() in net/can/bcm.c.
When bcm_rx_setup() updates an existing RX operation under lock_sock(sk),
it modifies the frames locklessly:
err = memcpy_from_msg(op->frames, msg, msg_head->nframes * op->cfsiz);
if (err < 0)
return err;
memset(op->last_frames, 0, msg_head->nframes * op->cfsiz);
op->nframes = msg_head->nframes;
Meanwhile, bcm_rx_handler() accesses these fields in softirq context:
for (i = 1; i < op->nframes; i++) {
if ((get_u64(op->frames, 0) & get_u64(rxframe, 0)) ==
(get_u64(op->frames, 0) &
get_u64(op->frames + op->cfsiz * i, 0))) {
Since memcpy_from_msg() can sleep on page faults, could bcm_rx_handler()
read torn or partially overwritten CAN filters, leading to incorrect
message delivery or spurious timer scheduling?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260526102349.94074-1-kipreyyy@gmail.com?part=1
^ permalink raw reply [flat|nested] 3+ messages in thread