From: Lianjie Wang <karin0.zst@gmail.com>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Olivia Mackall <olivia@selenic.com>,
David Laight <david.laight.linux@gmail.com>,
Jonathan McDowell <noodles@meta.com>,
linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] hwrng: core - use RCU for current_rng to fix race condition
Date: Tue, 27 Jan 2026 22:32:30 +0900 [thread overview]
Message-ID: <aXirTffsg1GWk2Yw@kator> (raw)
In-Reply-To: <aXbwM1wiPKqmC94v@gondor.apana.org.au>
On Mon, Jan 26, 2026 at 12:40:19PM +0800, Herbert Xu wrote:
> > static struct hwrng *get_current_rng(void)
> > {
> > struct hwrng *rng;
> >
> > - if (mutex_lock_interruptible(&rng_mutex))
> > - return ERR_PTR(-ERESTARTSYS);
> > + rcu_read_lock();
> > + rng = rcu_dereference(current_rng);
> > + if (rng && !kref_get_unless_zero(&rng->ref))
> > + rng = NULL;
>
> rng->ref should never be zero here as the final kref_put is delayed
> by RCU. So this should be a plain kref_get.
Thanks for the review! I will change this to kref_get() in v3.
> > static void put_rng(struct hwrng *rng)
> > {
> > - /*
> > - * Hold rng_mutex here so we serialize in case they set_current_rng
> > - * on rng again immediately.
> > - */
> > - mutex_lock(&rng_mutex);
> > - if (rng)
> > - kref_put(&rng->ref, cleanup_rng);
> > - mutex_unlock(&rng_mutex);
> > + kref_put(&rng->ref, cleanup_rng);
> > }
>
> I think the mutex needs to be kept here as otherwise there is
> a risk of a slow cleanup_rng racing against a subsequent hwrng_init
> on the same RNG.
It is true that cleanup_rng() can race with hwrng_init(). However,
currently put_rng() is also called in hwrng_fillfn(), where we want to
avoid holding rng_mutex to prevent deadlock in hwrng_unregister().
To solve this race, should we introduce a separate lock (e.g.,
init_mutex) to serialize only hwrng_init() and cleanup_rng()?
Alternatively, I think we could stop the thread also in
set_current_rng() before switching current_rng, so that each lifetime of
hwrng_fillfn() thread strictly holds a single RNG instance, avoiding the
need to call get_current_rng() or put_rng() inside hwrng_fillfn().
> > @@ -371,11 +385,10 @@ static ssize_t rng_current_show(struct device *dev,
> > struct hwrng *rng;
> >
> > rng = get_current_rng();
> > - if (IS_ERR(rng))
> > - return PTR_ERR(rng);
> >
> > ret = sysfs_emit(buf, "%s\n", rng ? rng->name : "none");
> > - put_rng(rng);
> > + if (rng)
> > + put_rng(rng);
>
> I don't think this NULL check is necessary as put_rng can handle
> rng == NULL.
I removed the NULL check in put_rng() and moved it to rng_current_show()
in v2, since all the other callers of put_rng() already check for NULL
before calling put_rng(). I can restore the NULL check in put_rng() in
v3 if preferred.
> > @@ -489,8 +502,17 @@ static int hwrng_fillfn(void *unused)
> > struct hwrng *rng;
> >
> > rng = get_current_rng();
> > - if (IS_ERR(rng) || !rng)
> > + if (!rng) {
> > + /* This is only possible within drop_current_rng(),
> > + * so just wait until we are stopped.
> > + */
> > + while (!kthread_should_stop()) {
> > + set_current_state(TASK_INTERRUPTIBLE);
> > + schedule();
> > + }
> > break;
> > + }
> > +
>
> Is the schedule necessary? Shouldn't the break just work as it
> did before?
With the break alone, the task_struct might get freed before
kthread_stop() is called, which can still cause use-after-free
sometimes:
refcount_t: addition on 0; use-after-free.
WARNING: lib/refcount.c:25 at refcount_warn_saturate+0x103/0x130
...
WARNING: kernel/fork.c:778 at __put_task_struct+0x2ea/0x510
...
Call Trace:
<TASK>
kthread_stop+0x347/0x3b0
hwrng_unregister+0x2d4/0x360
remove_common+0x1d3/0x230
virtio_dev_remove+0xb6/0x200
...
</TASK>
As in the description of kthread_create_on_node() from kernel/kthread.c,
it seems we cannot return directly if we plan to call kthread_stop():
* ... @threadfn() can either return directly if it is a
* standalone thread for which no one will call kthread_stop(), or
* return when 'kthread_should_stop()' is true (which means
* kthread_stop() has been called). ...
And in kthread_stop():
* If threadfn() may call kthread_exit() itself, the caller must ensure
* task_struct can't go away.
So I intended to wait until kthread_should_stop() is set. Otherwise, we
might need to call get_task_struct() and put_task_struct() manually to
hold the task_struct and ensure kthread_stop() works.
Alternatively, we could stop the thread *before* clearing current_rng in
drop_current_rng(), so that get_current_rng() will never return NULL
inside hwrng_fillfn(). What do you think?
Best regards,
--
Lianjie Wang
next prev parent reply other threads:[~2026-01-27 13:32 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-24 19:55 [PATCH v2] hwrng: core - use RCU for current_rng to fix race condition Lianjie Wang
2026-01-26 4:40 ` Herbert Xu
2026-01-27 13:32 ` Lianjie Wang [this message]
2026-01-28 1:43 ` Herbert Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aXirTffsg1GWk2Yw@kator \
--to=karin0.zst@gmail.com \
--cc=david.laight.linux@gmail.com \
--cc=herbert@gondor.apana.org.au \
--cc=linux-crypto@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=noodles@meta.com \
--cc=olivia@selenic.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox