public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Jason A. Donenfeld" <Jason@zx2c4.com>
To: "Andy Lutomirski" <luto@amacapital.net>,
	"Jonathan Neuschäfer" <j.neuschaefer@gmx.net>,
	"Theodore Ts'o" <tytso@mit.edu>,
	linux-kernel@vger.kernel.org,
	"Sebastian Andrzej Siewior" <bigeasy@linutronix.de>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Ingo Molnar" <mingo@redhat.com>, "Will Deacon" <will@kernel.org>,
	"Waiman Long" <longman@redhat.com>,
	"Boqun Feng" <boqun.feng@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>,
	stable@vger.kernel.org, "Jason A . Donenfeld" <Jason@zx2c4.com>
Subject: [PATCH] random: remove batched entropy locking
Date: Fri, 28 Jan 2022 16:33:44 +0100	[thread overview]
Message-ID: <20220128153344.34211-1-Jason@zx2c4.com> (raw)
In-Reply-To: <CAHmME9pb9A4SN6TTjNvvxKqw1L3gXVOX7KKihfEH4AgKGNGZ2A@mail.gmail.com>

From: Andy Lutomirski <luto@kernel.org>

We don't need spinlocks to protect batched entropy -- all we need
is a little bit of care. This should fix up the following splat that
Jonathan received with a PROVE_LOCKING=y/PROVE_RAW_LOCK_NESTING=y
kernel:

[    2.500000] [ BUG: Invalid wait context ]
[    2.500000] 5.17.0-rc1 #563 Not tainted
[    2.500000] -----------------------------
[    2.500000] swapper/1 is trying to lock:
[    2.500000] c0b0e9cc (batched_entropy_u32.lock){....}-{3:3}, at: invalidate_batched_entropy+0x18/0x4c
[    2.500000] other info that might help us debug this:
[    2.500000] context-{2:2}
[    2.500000] 3 locks held by swapper/1:
[    2.500000]  #0: c0ae86ac (event_mutex){+.+.}-{4:4}, at: event_trace_init+0x4c/0xd8
[    2.500000]  #1: c0ae81b8 (trace_event_sem){+.+.}-{4:4}, at: event_trace_init+0x68/0xd8
[    2.500000]  #2: c19b05cc (&sb->s_type->i_mutex_key#2){+.+.}-{4:4}, at: start_creating+0x40/0xc4
[    2.500000] stack backtrace:
[    2.500000] CPU: 0 PID: 1 Comm: swapper Not tainted 5.17.0-rc1 #563
[    2.500000] Hardware name: WPCM450 chip
[    2.500000] [<c00100a8>] (unwind_backtrace) from [<c000db2c>] (show_stack+0x10/0x14)
[    2.500000] [<c000db2c>] (show_stack) from [<c0054eec>] (__lock_acquire+0x3f0/0x189c)
[    2.500000] [<c0054eec>] (__lock_acquire) from [<c0054478>] (lock_acquire+0x2b8/0x354)
[    2.500000] [<c0054478>] (lock_acquire) from [<c0568028>] (_raw_spin_lock_irqsave+0x60/0x74)
[    2.500000] [<c0568028>] (_raw_spin_lock_irqsave) from [<c030b6f4>] (invalidate_batched_entropy+0x18/0x4c)
[    2.500000] [<c030b6f4>] (invalidate_batched_entropy) from [<c030e7fc>] (crng_fast_load+0xf0/0x110)
[    2.500000] [<c030e7fc>] (crng_fast_load) from [<c030e954>] (add_interrupt_randomness+0x138/0x200)
[    2.500000] [<c030e954>] (add_interrupt_randomness) from [<c0061b34>] (handle_irq_event_percpu+0x18/0x38)
[    2.500000] [<c0061b34>] (handle_irq_event_percpu) from [<c0061b8c>] (handle_irq_event+0x38/0x5c)
[    2.500000] [<c0061b8c>] (handle_irq_event) from [<c0065b28>] (handle_fasteoi_irq+0x9c/0x114)
[    2.500000] [<c0065b28>] (handle_fasteoi_irq) from [<c0061178>] (handle_irq_desc+0x24/0x34)
[    2.500000] [<c0061178>] (handle_irq_desc) from [<c056214c>] (generic_handle_arch_irq+0x28/0x3c)
[    2.500000] [<c056214c>] (generic_handle_arch_irq) from [<c0008eb4>] (__irq_svc+0x54/0x80)
[    2.500000] Exception stack(0xc1485d48 to 0xc1485d90)
[    2.500000] 5d40:                   9780e804 00000001 c09413d4 200000d3 60000053 c016af54
[    2.500000] 5d60: 00000000 c0afa5b8 c14194e0 c19a1d48 c0789ce0 00000000 c1490480 c1485d98
[    2.500000] 5d80: c0168970 c0168984 20000053 ffffffff
[    2.500000] [<c0008eb4>] (__irq_svc) from [<c0168984>] (read_seqbegin.constprop.0+0x6c/0x90)
[    2.500000] [<c0168984>] (read_seqbegin.constprop.0) from [<c016af54>] (d_lookup+0x14/0x40)
[    2.500000] [<c016af54>] (d_lookup) from [<c015cecc>] (lookup_dcache+0x18/0x50)
[    2.500000] [<c015cecc>] (lookup_dcache) from [<c015d868>] (lookup_one_len+0x90/0xe0)
[    2.500000] [<c015d868>] (lookup_one_len) from [<c01e33e4>] (start_creating+0x68/0xc4)
[    2.500000] [<c01e33e4>] (start_creating) from [<c01e398c>] (tracefs_create_file+0x30/0x11c)
[    2.500000] [<c01e398c>] (tracefs_create_file) from [<c00c42f8>] (trace_create_file+0x14/0x38)
[    2.500000] [<c00c42f8>] (trace_create_file) from [<c00cc854>] (event_create_dir+0x310/0x420)
[    2.500000] [<c00cc854>] (event_create_dir) from [<c00cc9d8>] (__trace_early_add_event_dirs+0x28/0x50)
[    2.500000] [<c00cc9d8>] (__trace_early_add_event_dirs) from [<c07c8d64>] (event_trace_init+0x70/0xd8)
[    2.500000] [<c07c8d64>] (event_trace_init) from [<c07c8560>] (tracer_init_tracefs+0x14/0x284)
[    2.500000] [<c07c8560>] (tracer_init_tracefs) from [<c000a330>] (do_one_initcall+0xdc/0x288)
[    2.500000] [<c000a330>] (do_one_initcall) from [<c07bd1e8>] (kernel_init_freeable+0x1c4/0x20c)
[    2.500000] [<c07bd1e8>] (kernel_init_freeable) from [<c05629c0>] (kernel_init+0x10/0x110)
[    2.500000] [<c05629c0>] (kernel_init) from [<c00084f8>] (ret_from_fork+0x14/0x3c)
[    2.500000] Exception stack(0xc1485fb0 to 0xc1485ff8)
[    2.500000] 5fa0:                                     00000000 00000000 00000000 00000000
[    2.500000] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    2.500000] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000

Signed-off-by: Andy Lutomirski <luto@kernel.org>
[Jason: I extracted this from a larger in-progress series of Andy's that
 also unifies the two batches into one and does other performance
 things. Since that's still under development, but because we need this
 part to fix the CONFIG_PROVE_RAW_LOCK_NESTING issue, I've extracted it
 out and applied it to the current setup. This will also make it easier
 to backport to old kernels that also need the fix. I've also amended
 Andy's original commit message.]
Reported-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Link: https://lore.kernel.org/lkml/YfMa0QgsjCVdRAvJ@latitude/
Fixes: b7d5dc21072c ("random: add a spinlock_t to struct batched_entropy")
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
Andy - could you take a look at this and let me know if it's still
correct after I've ported it out of your series and into a standalone
thing here? I'd prefer to hold off on moving forward on this until I
receive our green light. I'm also still a bit uncertain about your NB:
comment regarding the acceptable race. If you could elaborate on that
argument, it might save me a few cycles with my thinking cap on.

 drivers/char/random.c | 57 ++++++++++++++++++++++++-------------------
 1 file changed, 32 insertions(+), 25 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index b411182df6f6..22c190aecbe8 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -2063,7 +2063,6 @@ struct batched_entropy {
 		u32 entropy_u32[CHACHA_BLOCK_SIZE / sizeof(u32)];
 	};
 	unsigned int position;
-	spinlock_t batch_lock;
 };
 
 /*
@@ -2075,7 +2074,7 @@ struct batched_entropy {
  * point prior.
  */
 static DEFINE_PER_CPU(struct batched_entropy, batched_entropy_u64) = {
-	.batch_lock = __SPIN_LOCK_UNLOCKED(batched_entropy_u64.lock),
+	.position = ARRAY_SIZE(((struct batched_entropy *)0)->entropy_u64)
 };
 
 u64 get_random_u64(void)
@@ -2083,42 +2082,55 @@ u64 get_random_u64(void)
 	u64 ret;
 	unsigned long flags;
 	struct batched_entropy *batch;
+	size_t position;
 	static void *previous;
 
 	warn_unseeded_randomness(&previous);
 
-	batch = raw_cpu_ptr(&batched_entropy_u64);
-	spin_lock_irqsave(&batch->batch_lock, flags);
-	if (batch->position % ARRAY_SIZE(batch->entropy_u64) == 0) {
+	local_irq_save(flags);
+	batch = this_cpu_ptr(&batched_entropy_u64);
+	position = READ_ONCE(batch->position);
+	/* NB: position can change to ARRAY_SIZE(batch->entropy_u64) out
+	 * from under us -- see invalidate_batched_entropy().  If this,
+	 * happens it's okay if we still return the data in the batch. */
+	if (unlikely(position + 1 > ARRAY_SIZE(batch->entropy_u64))) {
 		extract_crng((u8 *)batch->entropy_u64);
-		batch->position = 0;
+		position = 0;
 	}
-	ret = batch->entropy_u64[batch->position++];
-	spin_unlock_irqrestore(&batch->batch_lock, flags);
+	ret = batch->entropy_u64[position++];
+	WRITE_ONCE(batch->position, position);
+	local_irq_restore(flags);
 	return ret;
 }
 EXPORT_SYMBOL(get_random_u64);
 
 static DEFINE_PER_CPU(struct batched_entropy, batched_entropy_u32) = {
-	.batch_lock = __SPIN_LOCK_UNLOCKED(batched_entropy_u32.lock),
+	.position = ARRAY_SIZE(((struct batched_entropy *)0)->entropy_u32)
 };
+
 u32 get_random_u32(void)
 {
 	u32 ret;
 	unsigned long flags;
 	struct batched_entropy *batch;
+	size_t position;
 	static void *previous;
 
 	warn_unseeded_randomness(&previous);
 
-	batch = raw_cpu_ptr(&batched_entropy_u32);
-	spin_lock_irqsave(&batch->batch_lock, flags);
-	if (batch->position % ARRAY_SIZE(batch->entropy_u32) == 0) {
+	local_irq_save(flags);
+	batch = this_cpu_ptr(&batched_entropy_u32);
+	position = READ_ONCE(batch->position);
+	/* NB: position can change to ARRAY_SIZE(batch->entropy_u32) out
+	 * from under us -- see invalidate_batched_entropy().  If this,
+	 * happens it's okay if we still return the data in the batch. */
+	if (unlikely(position + 1 > ARRAY_SIZE(batch->entropy_u32))) {
 		extract_crng((u8 *)batch->entropy_u32);
-		batch->position = 0;
+		position = 0;
 	}
-	ret = batch->entropy_u32[batch->position++];
-	spin_unlock_irqrestore(&batch->batch_lock, flags);
+	ret = batch->entropy_u64[position++];
+	WRITE_ONCE(batch->position, position);
+	local_irq_restore(flags);
 	return ret;
 }
 EXPORT_SYMBOL(get_random_u32);
@@ -2130,20 +2142,15 @@ EXPORT_SYMBOL(get_random_u32);
 static void invalidate_batched_entropy(void)
 {
 	int cpu;
-	unsigned long flags;
 
 	for_each_possible_cpu(cpu) {
-		struct batched_entropy *batched_entropy;
+		struct batched_entropy *batch;
 
-		batched_entropy = per_cpu_ptr(&batched_entropy_u32, cpu);
-		spin_lock_irqsave(&batched_entropy->batch_lock, flags);
-		batched_entropy->position = 0;
-		spin_unlock(&batched_entropy->batch_lock);
+		batch = per_cpu_ptr(&batched_entropy_u32, cpu);
+		WRITE_ONCE(batch->position, ARRAY_SIZE(batch->entropy_u32));
 
-		batched_entropy = per_cpu_ptr(&batched_entropy_u64, cpu);
-		spin_lock(&batched_entropy->batch_lock);
-		batched_entropy->position = 0;
-		spin_unlock_irqrestore(&batched_entropy->batch_lock, flags);
+		batch = per_cpu_ptr(&batched_entropy_u64, cpu);
+		WRITE_ONCE(batch->position, ARRAY_SIZE(batch->entropy_u64));
 	}
 }
 
-- 
2.35.0


  parent reply	other threads:[~2022-01-28 15:34 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-27 22:21 "BUG: Invalid wait context" in invalidate_batched_entropy Jonathan Neuschäfer
2022-01-27 22:26 ` Jason A. Donenfeld
2022-01-28  8:34   ` Sebastian Andrzej Siewior
2022-01-28 16:04     ` Jason A. Donenfeld
2022-01-28 16:19       ` Sebastian Andrzej Siewior
2022-01-28 16:28         ` Jason A. Donenfeld
2022-01-28 17:02           ` Sebastian Andrzej Siewior
2022-01-28 15:33   ` Jason A. Donenfeld [this message]
2022-01-28 15:44     ` [PATCH] random: remove batched entropy locking Sebastian Andrzej Siewior
2022-01-28 15:54       ` Jason A. Donenfeld
2022-01-28 16:15         ` Sebastian Andrzej Siewior
2022-01-28 16:36           ` Jason A. Donenfeld
2022-01-28 15:48     ` Jason A. Donenfeld
2022-01-28 22:35       ` [PATCH v2] " Jason A. Donenfeld
2022-01-29 21:03         ` Jonathan Neuschäfer
2022-02-04  0:27         ` Jason A. Donenfeld
2022-02-04 11:10           ` Sebastian Andrzej Siewior
2022-02-04 13:42             ` Jason A. Donenfeld
2022-02-04 14:01               ` Sebastian Andrzej Siewior
2022-02-04 14:11                 ` Jason A. Donenfeld
2022-02-04 14:30                   ` Sebastian Andrzej Siewior
2022-02-04 15:39                     ` Jason A. Donenfeld
2022-02-04 15:51                       ` [PATCH v3] " Jason A. Donenfeld
2022-02-04 15:57                         ` Sebastian Andrzej Siewior
2022-02-04 16:12                           ` Jason A. Donenfeld
2022-02-16 20:01                           ` Jann Horn
2022-02-16 20:58                             ` Jason A. Donenfeld
2022-02-17 17:33                               ` Sebastian Andrzej Siewior
2022-01-28 18:05     ` [PATCH] " Jonathan Neuschäfer
2022-01-29 18:22       ` Jason A. Donenfeld
2022-01-29  7:10     ` [random] 1e1724f9dd: UBSAN:array-index-out-of-bounds_in_drivers/char/random.c kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220128153344.34211-1-Jason@zx2c4.com \
    --to=jason@zx2c4.com \
    --cc=bigeasy@linutronix.de \
    --cc=boqun.feng@gmail.com \
    --cc=j.neuschaefer@gmx.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=stable@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox