From: Ard Biesheuvel <ardb+git@google.com>
To: linux-hardening@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, Ard Biesheuvel <ardb@kernel.org>,
Kees Cook <kees@kernel.org>, Ryan Roberts <ryan.roberts@arm.com>,
Will Deacon <will@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
Jeremy Linton <jeremy.linton@arm.com>,
Catalin Marinas <Catalin.Marinas@arm.com>,
Mark Rutland <mark.rutland@arm.com>,
"Jason A. Donenfeld" <Jason@zx2c4.com>
Subject: [RFC/RFT PATCH 4/6] random: Use a lockless fast path for get_random_uXX()
Date: Thu, 27 Nov 2025 10:22:31 +0100 [thread overview]
Message-ID: <20251127092226.1439196-12-ardb+git@google.com> (raw)
In-Reply-To: <20251127092226.1439196-8-ardb+git@google.com>
From: Ard Biesheuvel <ardb@kernel.org>
Currently, the implementations of the get_random_uXX() API protect their
critical section by taking a local lock and disabling interrupts, to
ensure that the code does not race with itself when called from
interrupt context.
Given that the fast path does nothing more than read a single uXX
quantity from a linear buffer and bump the position counter, poking the
hardware registers to disable and re-enable interrupts is
disproportionately costly, and best avoided.
There are two conditions under which the batched entropy buffer is
replenished, and that replenishment is what forms the critical section:
- the buffer is exhausted;
- the base_crng generation counter has been incremented.
By combining the position and generation counters into a single u64, we
can use compare and exchange to implement the fast path without taking
the local lock or disabling interrupts. By constructing the expected and
next values carefully, the compare and exchange will only succeed if
- we did not race with ourselves, i.e., the compare and exchange
  increments the position counter by exactly 1;
- the buffer is not exhausted;
- the generation counter equals the base_crng generation counter.
(A simplified sketch of this scheme follows.)
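For illustration only (and not part of the patch), here is a minimal
user-space sketch of the scheme, using a plain C11 atomic in place of
the kernel's cmpxchg64_local(); BATCH_SIZE, base_generation and the
entropy contents are made-up stand-ins for the real per-CPU state:

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define BATCH_SIZE 8u

static uint8_t entropy[BATCH_SIZE] = { 21, 42, 63, 84, 105, 126, 147, 168 };
static _Atomic uint64_t posgen;		/* generation in bits 63:32, position in bits 31:0 */
static uint32_t base_generation;	/* stand-in for base_crng.generation */

/* Returns 1 and stores a value on fast path success, 0 if the slow path is needed. */
static int try_fast_path(uint8_t *out)
{
	uint32_t position = (uint32_t)atomic_load(&posgen);
	uint64_t next = (uint64_t)base_generation << 32;
	uint64_t expected;

	if (position < BATCH_SIZE) {
		next |= position + 1;
		*out = entropy[position];
	}

	/*
	 * expected == next - 1 encodes "same generation, position exactly
	 * one below next". If the buffer was exhausted, next - 1 underflows
	 * into a value posgen can never hold, so the exchange is guaranteed
	 * to fail and the caller must take the slow path instead.
	 */
	expected = next - 1;
	return atomic_compare_exchange_strong(&posgen, &expected, next);
}

int main(void)
{
	uint8_t val = 0;

	while (try_fast_path(&val))
		printf("fast path returned %u\n", val);
	printf("slow path needed (buffer exhausted or generation changed)\n");
	return 0;
}

The point of the single 64-bit compare and exchange is that it checks
the generation and the position in one go, so the fast path needs
neither the local lock nor an interrupt disable.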
Only if the compare and exchange fails is the original slow path taken,
and only in that case do we take the local lock. This results in a
considerable speedup (3-5x) when benchmarking get_random_u8() in a tight
loop.
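(As an aside, and purely to illustrate what "a tight loop" refers to: a
hypothetical throwaway module along the following lines could be used to
time the call; this is not the benchmark that produced the quoted
numbers, and the result will be noisy.)

#include <linux/module.h>
#include <linux/ktime.h>
#include <linux/random.h>

static int __init u8_bench_init(void)
{
	volatile u8 sink;	/* keep the compiler from eliding the loop body */
	u64 t0, t1;
	int i;

	t0 = ktime_get_ns();
	for (i = 0; i < 1000000; i++)
		sink = get_random_u8();
	t1 = ktime_get_ns();

	pr_info("1000000 calls to get_random_u8() took %llu ns\n", t1 - t0);
	return 0;
}

static void __exit u8_bench_exit(void)
{
}

module_init(u8_bench_init);
module_exit(u8_bench_exit);
MODULE_LICENSE("GPL");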
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
drivers/char/random.c | 44 ++++++++++++++------
1 file changed, 31 insertions(+), 13 deletions(-)
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 0e04bc60d034..71bd74871540 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -496,6 +496,12 @@ static ssize_t get_random_bytes_user(struct iov_iter *iter)
* should be called and return 0 at least once at any point prior.
*/
+#ifdef __LITTLE_ENDIAN
+#define LOHI(lo, hi) lo, hi
+#else
+#define LOHI(lo, hi) hi, lo
+#endif
+
#define DEFINE_BATCHED_ENTROPY(type) \
struct batch_ ##type { \
/* \
@@ -507,8 +513,12 @@ struct batch_ ##type { \
*/ \
type entropy[CHACHA_BLOCK_SIZE * 3 / (2 * sizeof(type))]; \
local_lock_t lock; \
- unsigned int generation; \
- unsigned int position; \
+ union { \
+ struct { \
+ unsigned int LOHI(position, generation); \
+ }; \
+ u64 posgen; \
+ }; \
}; \
\
static DEFINE_PER_CPU(struct batch_ ##type, batched_entropy_ ##type) = { \
@@ -522,6 +532,7 @@ type get_random_ ##type(void) \
unsigned long flags; \
struct batch_ ##type *batch; \
unsigned int next_gen; \
+ u64 next; \
\
warn_unseeded_randomness(); \
\
@@ -530,21 +541,28 @@ type get_random_ ##type(void) \
return ret; \
} \
\
- local_lock_irqsave(&batched_entropy_ ##type.lock, flags); \
- batch = raw_cpu_ptr(&batched_entropy_##type); \
+ batch = &get_cpu_var(batched_entropy_##type); \
\
next_gen = (unsigned int)READ_ONCE(base_crng.generation); \
- if (batch->position >= ARRAY_SIZE(batch->entropy) || \
- next_gen != batch->generation) { \
- _get_random_bytes(batch->entropy, sizeof(batch->entropy)); \
- batch->position = 0; \
- batch->generation = next_gen; \
+ next = (u64)next_gen << 32; \
+ if (likely(batch->position < ARRAY_SIZE(batch->entropy))) { \
+ next |= batch->position + 1; /* next-1 is bogus otherwise */ \
+ ret = batch->entropy[batch->position]; \
+ } \
+ if (cmpxchg64_local(&batch->posgen, next - 1, next) != next - 1) { \
+ local_lock_irqsave(&batched_entropy_ ##type.lock, flags); \
+ if (batch->position >= ARRAY_SIZE(batch->entropy) || \
+ next_gen != batch->generation) { \
+ _get_random_bytes(batch->entropy, sizeof(batch->entropy));\
+ batch->position = 0; \
+ batch->generation = next_gen; \
+ } \
+ ret = batch->entropy[batch->position++]; \
+ local_unlock_irqrestore(&batched_entropy_ ##type.lock, flags); \
} \
\
- ret = batch->entropy[batch->position]; \
- batch->entropy[batch->position] = 0; \
- ++batch->position; \
- local_unlock_irqrestore(&batched_entropy_ ##type.lock, flags); \
+ batch->entropy[batch->position - 1] = 0; \
+ put_cpu_var(batched_entropy_##type); \
return ret; \
} \
EXPORT_SYMBOL(get_random_ ##type);
--
2.52.0.107.ga0afd4fd5b-goog