Date: Sat, 28 Sep 2024 10:49:45 -0400
From: Alan Stern
To: Mathieu Desnoyers
Cc: Linus Torvalds, linux-kernel@vger.kernel.org, Greg Kroah-Hartman,
	Sebastian Andrzej Siewior, "Paul E. McKenney", Will Deacon,
	Peter Zijlstra, Boqun Feng, John Stultz, Neeraj Upadhyay,
	Frederic Weisbecker, Joel Fernandes, Josh Triplett,
	Uladzislau Rezki, Steven Rostedt, Lai Jiangshan, Zqiang,
	Ingo Molnar, Waiman Long, Mark Rutland, Thomas Gleixner,
	Vlastimil Babka, maged.michael@gmail.com, Mateusz Guzik,
	Gary Guo, Jonas Oberhauser, rcu@vger.kernel.org,
	linux-mm@kvack.org, lkmm@lists.linux.dev
Subject: Re: [PATCH 1/2] compiler.h: Introduce ptr_eq() to preserve address dependency
Message-ID: <02c63e79-ec8c-4d6a-9fcf-75f0e67ea242@rowland.harvard.edu>
References: <20240928135128.991110-1-mathieu.desnoyers@efficios.com>
 <20240928135128.991110-2-mathieu.desnoyers@efficios.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20240928135128.991110-2-mathieu.desnoyers@efficios.com>

On Sat, Sep 28, 2024 at 09:51:27AM -0400, Mathieu Desnoyers wrote:
> Compiler CSE and SSA GVN optimizations can cause the address dependency
> of addresses returned by rcu_dereference to be lost when comparing those
> pointers with either constants or previously loaded pointers.
>
> Introduce ptr_eq() to compare two addresses while preserving the address
> dependencies for later use of the address. It should be used when
> comparing an address returned by rcu_dereference().
>
> This is needed to prevent the compiler CSE and SSA GVN optimizations
> from replacing the registers holding @a or @b based on their

"Replacing" isn't the right word.  What the compiler does is use one
rather than the other.  Furthermore, the compiler can play these games
even with values that aren't in registers.  You should just say:

"... from using @a (or @b) in places where the source refers to @b (or
@a) (based on the fact that after the comparison, the two are known to
be equal), which does not ..."

> equality, which does not preserve address dependencies and allows the
> following misordering speculations:
>
> - If @b is a constant, the compiler can issue the loads which depend
>   on @a before loading @a.
> - If @b is a register populated by a prior load, weakly-ordered
>   CPUs can speculate loads which depend on @a before loading @a.

It shouldn't matter whether @a and @b are constants, registers, or
anything else.  All that matters is that the compiler uses the wrong
one, which allows weakly ordered CPUs to speculate loads you wouldn't
expect them to, based on the source code alone.

> The same logic applies with @a and @b swapped.
>
> The compiler barrier() is ineffective at fixing this issue.
> It does not prevent the compiler CSE from losing the address dependency:
>
> int fct_2_volatile_barriers(void)
> {
>         int *a, *b;
>
>         do {
>                 a = READ_ONCE(p);
>                 asm volatile ("" : : : "memory");
>                 b = READ_ONCE(p);
>         } while (a != b);
>         asm volatile ("" : : : "memory");  <----- barrier()
>         return *b;
> }
>
> With gcc 14.2 (arm64):
>
> fct_2_volatile_barriers:
>         adrp    x0, .LANCHOR0
>         add     x0, x0, :lo12:.LANCHOR0
> .L2:
>         ldr     x1, [x0]  <------ x1 populated by first load.
>         ldr     x2, [x0]
>         cmp     x1, x2
>         bne     .L2
>         ldr     w0, [x1]  <------ x1 is used for access which should depend on b.
> ret > > On weakly-ordered architectures, this lets CPU speculation use the > result from the first load to speculate "ldr w0, [x1]" before > "ldr x2, [x0]". > Based on the RCU documentation, the control dependency does not prevent > the CPU from speculating loads. > > Suggested-by: Linus Torvalds > Suggested-by: Boqun Feng > Signed-off-by: Mathieu Desnoyers > Reviewed-by: Boqun Feng > Acked-by: "Paul E. McKenney" > Cc: Greg Kroah-Hartman > Cc: Sebastian Andrzej Siewior > Cc: "Paul E. McKenney" > Cc: Will Deacon > Cc: Peter Zijlstra > Cc: Boqun Feng > Cc: Alan Stern > Cc: John Stultz > Cc: Neeraj Upadhyay > Cc: Linus Torvalds > Cc: Boqun Feng > Cc: Frederic Weisbecker > Cc: Joel Fernandes > Cc: Josh Triplett > Cc: Uladzislau Rezki > Cc: Steven Rostedt > Cc: Lai Jiangshan > Cc: Zqiang > Cc: Ingo Molnar > Cc: Waiman Long > Cc: Mark Rutland > Cc: Thomas Gleixner > Cc: Vlastimil Babka > Cc: maged.michael@gmail.com > Cc: Mateusz Guzik > Cc: Gary Guo > Cc: Jonas Oberhauser > Cc: rcu@vger.kernel.org > Cc: linux-mm@kvack.org > Cc: lkmm@lists.linux.dev > --- > include/linux/compiler.h | 62 ++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 62 insertions(+) > > diff --git a/include/linux/compiler.h b/include/linux/compiler.h > index 2df665fa2964..f26705c267e8 100644 > --- a/include/linux/compiler.h > +++ b/include/linux/compiler.h > @@ -186,6 +186,68 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val, > __asm__ ("" : "=r" (var) : "0" (var)) > #endif > > +/* > + * Compare two addresses while preserving the address dependencies for > + * later use of the address. It should be used when comparing an address > + * returned by rcu_dereference(). 
> + *
> + * This is needed to prevent the compiler CSE and SSA GVN optimizations
> + * from replacing the registers holding @a or @b based on their
> + * equality, which does not preserve address dependencies and allows the
> + * following misordering speculations:
> + *
> + * - If @b is a constant, the compiler can issue the loads which depend
> + *   on @a before loading @a.
> + * - If @b is a register populated by a prior load, weakly-ordered
> + *   CPUs can speculate loads which depend on @a before loading @a.
> + *
> + * The same logic applies with @a and @b swapped.

This could be more concise, and it should be more general (along the
same lines as the description above).

> + *
> + * Return value: true if pointers are equal, false otherwise.
> + *
> + * The compiler barrier() is ineffective at fixing this issue. It does
> + * not prevent the compiler CSE from losing the address dependency:
> + *
> + * int fct_2_volatile_barriers(void)
> + * {
> + *         int *a, *b;
> + *
> + *         do {
> + *                 a = READ_ONCE(p);
> + *                 asm volatile ("" : : : "memory");
> + *                 b = READ_ONCE(p);
> + *         } while (a != b);
> + *         asm volatile ("" : : : "memory");  <-- barrier()
> + *         return *b;
> + * }
> + *
> + * With gcc 14.2 (arm64):
> + *
> + * fct_2_volatile_barriers:
> + *         adrp    x0, .LANCHOR0
> + *         add     x0, x0, :lo12:.LANCHOR0
> + * .L2:
> + *         ldr     x1, [x0]  <-- x1 populated by first load.
> + *         ldr     x2, [x0]
> + *         cmp     x1, x2
> + *         bne     .L2
> + *         ldr     w0, [x1]  <-- x1 is used for access which should depend on b.
> + *         ret
> + *
> + * On weakly-ordered architectures, this lets CPU speculation use the
> + * result from the first load to speculate "ldr w0, [x1]" before
> + * "ldr x2, [x0]".
> + * Based on the RCU documentation, the control dependency does not
> + * prevent the CPU from speculating loads.

IMO, this lengthy explanation is not needed in the source code.  Just
refer interested readers to the commit description.  You're repeating
the same text verbatim, after all.
(Or if you firmly believe that this explanation _does_ belong in the
code, then omit it from the commit description.  There's no need to say
everything twice.)

Alan Stern

> + */
> +static __always_inline
> +int ptr_eq(const volatile void *a, const volatile void *b)
> +{
> +	OPTIMIZER_HIDE_VAR(a);
> +	OPTIMIZER_HIDE_VAR(b);
> +	return a == b;
> +}
> +
>  #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__)
>
>  /**
> --
> 2.39.2
>