public inbox for linux-crypto@vger.kernel.org
 help / color / mirror / Atom feed
From: Lianjie Wang <karin0.zst@gmail.com>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Olivia Mackall <olivia@selenic.com>,
	David Laight <david.laight.linux@gmail.com>,
	Jonathan McDowell <noodles@meta.com>,
	linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] hwrng: core - use RCU for current_rng to fix race condition
Date: Tue, 27 Jan 2026 22:32:30 +0900	[thread overview]
Message-ID: <aXirTffsg1GWk2Yw@kator> (raw)
In-Reply-To: <aXbwM1wiPKqmC94v@gondor.apana.org.au>

On Mon, Jan 26, 2026 at 12:40:19PM +0800, Herbert Xu wrote:
> >  static struct hwrng *get_current_rng(void)
> >  {
> >  	struct hwrng *rng;
> > 
> > -	if (mutex_lock_interruptible(&rng_mutex))
> > -		return ERR_PTR(-ERESTARTSYS);
> > +	rcu_read_lock();
> > +	rng = rcu_dereference(current_rng);
> > +	if (rng && !kref_get_unless_zero(&rng->ref))
> > +		rng = NULL;
> 
> rng->ref should never be zero here as the final kref_put is delayed
> by RCU.  So this should be a plain kref_get.

Thanks for the review! I will change this to kref_get() in v3.

> >  static void put_rng(struct hwrng *rng)
> >  {
> > -	/*
> > -	 * Hold rng_mutex here so we serialize in case they set_current_rng
> > -	 * on rng again immediately.
> > -	 */
> > -	mutex_lock(&rng_mutex);
> > -	if (rng)
> > -		kref_put(&rng->ref, cleanup_rng);
> > -	mutex_unlock(&rng_mutex);
> > +	kref_put(&rng->ref, cleanup_rng);
> >  }
> 
> I think the mutex needs to be kept here as otherwise there is
> a risk of a slow cleanup_rng racing against a subsequent hwrng_init
> on the same RNG.

It is true that cleanup_rng() can race with hwrng_init(). However,
currently put_rng() is also called in hwrng_fillfn(), where we want to
avoid holding rng_mutex to prevent deadlock in hwrng_unregister().

To solve this race, should we introduce a separate lock (e.g.,
init_mutex) to serialize only hwrng_init() and cleanup_rng()?

Alternatively, I think we could stop the thread also in
set_current_rng() before switching current_rng, so that each lifetime of
hwrng_fillfn() thread strictly holds a single RNG instance, avoiding the
need to call get_current_rng() or put_rng() inside hwrng_fillfn().

> > @@ -371,11 +385,10 @@ static ssize_t rng_current_show(struct device *dev,
> >  	struct hwrng *rng;
> > 
> >  	rng = get_current_rng();
> > -	if (IS_ERR(rng))
> > -		return PTR_ERR(rng);
> > 
> >  	ret = sysfs_emit(buf, "%s\n", rng ? rng->name : "none");
> > -	put_rng(rng);
> > +	if (rng)
> > +		put_rng(rng);
> 
> I don't think this NULL check is necessary as put_rng can handle
> rng == NULL.

I removed the NULL check in put_rng() and moved it to rng_current_show()
in v2, since all the other callers of put_rng() already check for NULL
before calling put_rng(). I can restore the NULL check in put_rng() in
v3 if preferred.

> > @@ -489,8 +502,17 @@ static int hwrng_fillfn(void *unused)
> >  		struct hwrng *rng;
> > 
> >  		rng = get_current_rng();
> > -		if (IS_ERR(rng) || !rng)
> > +		if (!rng) {
> > +			/* This is only possible within drop_current_rng(),
> > +			 * so just wait until we are stopped.
> > +			 */
> > +			while (!kthread_should_stop()) {
> > +				set_current_state(TASK_INTERRUPTIBLE);
> > +				schedule();
> > +			}
> >  			break;
> > +		}
> > +
> 
> Is the schedule necessary? Shouldn't the break just work as it
> did before?

With the break alone, the task_struct might get freed before
kthread_stop() is called, which can still cause use-after-free
sometimes:

refcount_t: addition on 0; use-after-free.
WARNING: lib/refcount.c:25 at refcount_warn_saturate+0x103/0x130
...
WARNING: kernel/fork.c:778 at __put_task_struct+0x2ea/0x510
...
Call Trace:
 <TASK>
 kthread_stop+0x347/0x3b0
 hwrng_unregister+0x2d4/0x360
 remove_common+0x1d3/0x230
 virtio_dev_remove+0xb6/0x200
 ...
 </TASK>

As in the description of kthread_create_on_node() from kernel/kthread.c,
it seems we cannot return directly if we plan to call kthread_stop():

 *  ... @threadfn() can either return directly if it is a
 * standalone thread for which no one will call kthread_stop(), or
 * return when 'kthread_should_stop()' is true (which means
 * kthread_stop() has been called). ...

And in kthread_stop():

 * If threadfn() may call kthread_exit() itself, the caller must ensure
 * task_struct can't go away.

So I intended to wait until kthread_should_stop() is set. Otherwise, we
might need to call get_task_struct() and put_task_struct() manually to
hold the task_struct and ensure kthread_stop() works.

Alternatively, we could stop the thread *before* clearing current_rng in
drop_current_rng(), so that get_current_rng() will never return NULL
inside hwrng_fillfn(). What do you think?

Best regards,
-- 
Lianjie Wang

  reply	other threads:[~2026-01-27 13:32 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-24 19:55 [PATCH v2] hwrng: core - use RCU for current_rng to fix race condition Lianjie Wang
2026-01-26  4:40 ` Herbert Xu
2026-01-27 13:32   ` Lianjie Wang [this message]
2026-01-28  1:43     ` Herbert Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aXirTffsg1GWk2Yw@kator \
    --to=karin0.zst@gmail.com \
    --cc=david.laight.linux@gmail.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=noodles@meta.com \
    --cc=olivia@selenic.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox