From: George Spelvin <lkml@SDF.ORG>
To: Kees Cook <keescook@chromium.org>
Cc: Dan Williams <dan.j.williams@intel.com>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Alexander Duyck <alexander.duyck@gmail.com>,
Randy Dunlap <rdunlap@infradead.org>,
lkml@sdf.org
Subject: [PATCH v3] mm/shuffle.c: Fix races in add_to_free_area_random()
Date: Wed, 18 Mar 2020 20:39:14 +0000 [thread overview]
Message-ID: <20200318203914.GA16083@SDF.ORG> (raw)
In-Reply-To: <20200318014410.GA2281@SDF.ORG>

The old code had separate "rand" and "rand_bits" variables,
which could get out of sync, with bad results. In the worst
case, two threads would both see rand_bits = 1 and decrement
it, wrapping the u8 around to 255, so the already-exhausted
rand would supply all-zero "random" bits for the next 255 calls.

Instead, pack them both into a single, atomically updatable
variable. This makes it a lot easier to reason about race
conditions. They are still there (the code deliberately
eschews locking) but are basically harmless on the rare
occasions that they happen.
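
For illustration, the packed-state trick looks like this in plain
userspace C. This is only a sketch of the scheme, not the kernel code:
fake_random() and next_random_bit() are made-up names, and the
READ_ONCE/WRITE_ONCE concerns discussed below are ignored here.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Sketch of the packed-state scheme: the buffered random bits live in
 * the msbits of a single word, followed by a 1 "sentinel" bit, then
 * zero padding in the lsbits.  When the sentinel finally shifts out,
 * the word reads zero, which signals that a refill is needed.
 */
static unsigned long rand_state;

/* Stand-in for get_random_long(); any source of random bits would do. */
static unsigned long fake_random(void)
{
	return (unsigned long)rand() << 16 ^ (unsigned long)rand();
}

/* Return one random bit, refilling the buffer when it runs empty. */
static int next_random_bit(void)
{
	unsigned long r = rand_state, rshift = r << 1;

	if (rshift == 0) {		/* sentinel shifted out: refill */
		r = fake_random();
		rshift = r << 1 | 1;	/* append a fresh sentinel bit */
	}
	rand_state = rshift;
	return (long)r < 0;		/* consume the msbit of r */
}
```

Note that the state word is never zero after a call: the refill always
ORs in the sentinel, so "state == 0" unambiguously means "empty".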

Second, use READ_ONCE and WRITE_ONCE. Without them, we are deep
in the land of nasal demons. The compiler would be free to spill
temporaries to the static variables in arbitrarily perverse ways
and create hard-to-find bugs.

(Alternatively, we could declare the static variable "volatile",
one of the few places in the Linux kernel that would be correct,
but it would probably annoy Linus.)

Third, use long rather than u64. This not only keeps the state
atomically updatable, it also speeds up the fast path on 32-bit
machines. Saving at least three instructions on the fast path (one
load, one add-with-carry, and one store) is worth a second call
to get_random_u*() per 64 bits. That second call's fast path costs
less than the 3*64 = 192 instructions saved, and the slow path of
get_random_u*() is taken only once per 64 bytes of output (when the
batched entropy runs out), so it is unaffected by the change.

Fourth, use left-shift rather than right. Testing the sign bit
produces slightly smaller/faster code than testing the lsbit.
I've tried shifting both ways, and copying the desired bit to
a boolean before shifting rather than keeping separate full-width
r and rshift variables, but both produce larger code:

                   x86_64   i386
   This code           94     95
   Explicit bool      103     99
   Lsbits              99    101
   Both                96    100
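
To see that the sign-bit test consumes the same bits as the lsbit
test, only in the opposite order, here is a small illustrative
userspace sketch (not kernel code) that collects every bit of a word
both ways:

```c
#include <assert.h>
#include <limits.h>

#define LONG_BITS	(sizeof(unsigned long) * CHAR_BIT)

/*
 * Walk all the bits of x two ways: msb-first via the sign-bit test
 * after a left shift, and lsb-first via the lsbit test after a right
 * shift.  The two sequences are the same bits, reversed.
 */
static void collect_bits(unsigned long x, int msb_first[], int lsb_first[])
{
	for (unsigned int i = 0; i < LONG_BITS; i++) {
		msb_first[i] = (long)(x << i) < 0;	/* sign-bit test */
		lsb_first[i] = (x >> i) & 1;		/* lsbit test */
	}
}
```

This is why the refill path stores r << 1 | 1 and the in-line test is
(long)r < 0: the buffer is simply consumed msb-first.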

In a perfect world, the x86-64 object code would be the following,
tiny and dominated by the unavoidable unpredictable branch; but
this function doesn't warrant arch-specific optimization:

add_to_free_area_random:
        shlq    rand(%rip)
        jz      3f
1:
        jnc     2f
        jmp     add_to_free_area        # tail call
2:
        jmp     add_to_free_area_tail
3:
        pushq   %rdx
        pushq   %rsi
        pushq   %rdi
        call    get_random_u64
        popq    %rdi
        popq    %rsi
        popq    %rdx
        stc
        adcq    %rax,%rax               # not lea because we need carry out
        movq    %rax, rand(%rip)
        jmp     1b

Signed-off-by: George Spelvin <lkml@sdf.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
---
v2: Rewrote commit message to explain existing races better.
    Made local variables unsigned to avoid (technically undefined)
    signed overflow.
v3: Typos fixed, Acked-by, expanded commit message.

 mm/shuffle.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/mm/shuffle.c b/mm/shuffle.c
index e0ed247f8d90..16c5fddc292f 100644
--- a/mm/shuffle.c
+++ b/mm/shuffle.c
@@ -186,22 +186,28 @@ void __meminit __shuffle_free_memory(pg_data_t *pgdat)
 void add_to_free_area_random(struct page *page, struct free_area *area,
                int migratetype)
 {
-        static u64 rand;
-        static u8 rand_bits;
+        static unsigned long rand;      /* buffered random bits */
+        unsigned long r = READ_ONCE(rand), rshift = r << 1;
 
         /*
-         * The lack of locking is deliberate. If 2 threads race to
-         * update the rand state it just adds to the entropy.
+         * rand holds 0..BITS_PER_LONG-1 random msbits, followed by a
+         * 1 bit, then zero-padding in the lsbits.  This allows us to
+         * maintain the pre-generated bits and the count of bits in a
+         * single, atomically updatable, variable.
+         *
+         * The lack of locking is deliberate. If two threads race to
+         * update the rand state it just adds to the entropy.  The
+         * worst that can happen is a random bit is used twice, or
+         * get_random_long is called redundantly.
          */
-        if (rand_bits == 0) {
-                rand_bits = 64;
-                rand = get_random_u64();
+        if (unlikely(rshift == 0)) {
+                r = get_random_long();
+                rshift = r << 1 | 1;
         }
+        WRITE_ONCE(rand, rshift);
 
-        if (rand & 1)
+        if ((long)r < 0)
                 add_to_free_area(page, area, migratetype);
         else
                 add_to_free_area_tail(page, area, migratetype);
-        rand_bits--;
-        rand >>= 1;
 }
--
2.26.0.rc2
2020-03-17 13:50 [PATCH] mm/shuffle.c: optimize add_to_free_area_random() George Spelvin
2020-03-17 21:44 ` Kees Cook
2020-03-17 23:06 ` George Spelvin
2020-03-17 23:38 ` Kees Cook
2020-03-18 1:44 ` [PATCH v2] mm/shuffle.c: Fix races in add_to_free_area_random() George Spelvin
2020-03-18 1:49 ` Randy Dunlap
2020-03-18 3:53 ` Dan Williams
2020-03-18 8:20 ` George Spelvin
2020-03-18 17:36 ` Dan Williams
2020-03-18 19:29 ` George Spelvin
2020-03-18 19:40 ` Dan Williams
2020-03-18 21:02 ` George Spelvin
2020-03-18 3:58 ` Kees Cook
2020-03-18 15:26 ` Alexander Duyck
2020-03-18 18:35 ` George Spelvin
2020-03-18 19:17 ` Alexander Duyck
2020-03-18 20:06 ` George Spelvin
2020-03-18 20:39 ` George Spelvin [this message]
2020-03-18 21:34 ` [PATCH v3] " Alexander Duyck
2020-03-18 22:49 ` George Spelvin
2020-03-18 22:57 ` Dan Williams
2020-03-18 23:18 ` George Spelvin
2020-03-19 12:05 ` [PATCH v4] " George Spelvin
2020-03-19 17:49 ` Alexander Duyck
2020-03-20 17:58 ` Kees Cook