Date: Tue, 17 Mar 2020 13:50:35 +0000
From: George Spelvin
To: Dan Williams, linux-mm@kvack.org
Cc: Kees Cook, Andrew Morton, lkml@sdf.org
Subject: [PATCH] mm/shuffle.c: optimize add_to_free_area_random()
Message-ID: <20200317135035.GA19442@SDF.ORG>

First, use long rather than u64 for the bit buffer type, which is
significantly more efficient on 32-bit processors.

Second, avoid the need for a separate rand_bits counter.  rand_bits
is never more than 63, so there's always room in rand for a bit to
mark the end of the available bits.  This makes the shared state
atomically updatable, which makes it a lot easier to reason about
race conditions.

Third, use READ_ONCE and WRITE_ONCE.  Without them, the compiler may
spill to the shared static in arbitrarily perverse ways, and combined
with the fact that the code eschews locking, that is a recipe for
hard-to-find bugs.  Now, a race might cause a bit to be used twice,
or get_random_long() to be called redundantly, but it can't summon
nasal daemons.

I've tried a few variants.  Keeping random lsbits with a
most-significant end marker, or using an explicit bool variable
rather than testing r, both increase code size slightly:
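The end-marker trick can be sketched in userspace C.  This is a
minimal illustration, not the patch itself: fake_random_long() is a
deterministic stand-in for the kernel's get_random_long(), next_bit()
stands in for the patched function's bit handling, and an unsigned
type is used so the demo shift is well-defined portable C.

```c
#include <assert.h>
#include <limits.h>

#define LONG_BITS (sizeof(long) * CHAR_BIT)

static unsigned long rand_buf;	/* random msbits, end marked by a 1 bit */
static int refills;		/* counts buffer refills, for the demo */

/* Deterministic stand-in for the kernel's get_random_long(). */
static unsigned long fake_random_long(void)
{
	refills++;
	return ~0UL / 3;	/* 0x5555...: bits 0,1,0,1,... msb-first */
}

/* Return one bit, msb-first; refill when only the end marker remains. */
static int next_bit(void)
{
	unsigned long r = rand_buf, rshift = r << 1;

	if (rshift == 0) {		/* empty (or never filled) */
		r = fake_random_long();
		rshift = (r << 1) | 1;	/* consume msb now, plant end marker */
	}
	rand_buf = rshift;

	return r >> (LONG_BITS - 1);	/* the bit being consumed */
}
```

Because the marker starts at bit 0 and is shifted out (never
returned), each word still yields all LONG_BITS random bits, and the
common path is just a shift, a compare, and a store.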
                x86_64  i386
This code           94    95
Explicit bool      103    99
Lsbits              99   101
Both                96   100

Signed-off-by: George Spelvin
Cc: Dan Williams
Cc: Kees Cook
Cc: Andrew Morton
Cc: linux-mm@kvack.org
---
 mm/shuffle.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/mm/shuffle.c b/mm/shuffle.c
index b3fe97fd6654..0e4bf6a8da52 100644
--- a/mm/shuffle.c
+++ b/mm/shuffle.c
@@ -186,22 +186,25 @@ void __meminit __shuffle_free_memory(pg_data_t *pgdat)
 void add_to_free_area_random(struct page *page, struct free_area *area,
 		int migratetype)
 {
-	static u64 rand;
-	static u8 rand_bits;
+	static long rand;	/* 0..BITS_PER_LONG-1 buffered random bits */
+	long r = READ_ONCE(rand), rshift = r << 1;
 
 	/*
-	 * The lack of locking is deliberate. If 2 threads race to
+	 * rand holds some random msbits, with the end marked by a 1 bit.
+	 * This allows us to maintain the pre-generated bits and the
+	 * count of bits in a single, atomically updatable, variable.
+	 *
+	 * The lack of locking is deliberate. If two threads race to
 	 * update the rand state it just adds to the entropy.
 	 */
-	if (rand_bits == 0) {
-		rand_bits = 64;
-		rand = get_random_u64();
+	if (unlikely(rshift == 0)) {
+		r = get_random_long();
+		rshift = r << 1 | 1;
 	}
+	WRITE_ONCE(rand, rshift);
 
-	if (rand & 1)
+	if (r < 0)
 		add_to_free_area(page, area, migratetype);
 	else
 		add_to_free_area_tail(page, area, migratetype);
-	rand_bits--;
-	rand >>= 1;
 }
-- 
2.25.1