public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Marco Elver <elver@google.com>,
	Alexander Potapenko <glider@google.com>,
	Dmitry Vyukov <dvyukov@google.com>, Jann Horn <jannh@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 5.15 13/26] kfence: default to dynamic branch instead of static keys mode
Date: Wed, 10 Nov 2021 19:44:12 +0100	[thread overview]
Message-ID: <20211110182004.120292963@linuxfoundation.org> (raw)
In-Reply-To: <20211110182003.700594531@linuxfoundation.org>

From: Marco Elver <elver@google.com>

commit 4f612ed3f748962cbef1316ff3d323e2b9055b6e upstream.

We have observed that on very large machines with newer CPUs, the static
key/branch switching delay is on the order of milliseconds.  This is due
to the required broadcast IPIs, which simply does not scale well to
hundreds of CPUs (cores).  If done too frequently, this can adversely
affect tail latencies of various workloads.

One workaround is to increase the sample interval to several seconds,
while decreasing sampled allocation coverage, but the problem still
exists and could still increase tail latencies.

As already noted in the Kconfig help text, there are trade-offs: at
lower sample intervals the dynamic branch results in better performance;
however, at very large sample intervals, the static keys mode can result
in better performance -- careful benchmarking is recommended.

Our initial benchmarking showed that with large enough sample intervals
and workloads stressing the allocator, the static keys mode was slightly
better.  Evaluating and observing the possible system-wide side-effects
of the static-key-switching induced broadcast IPIs, however, was a blind
spot (in particular on large machines with 100s of cores).

Therefore, a major downside of the static keys mode is, unfortunately,
that it is hard to predict performance on new system architectures and
topologies, but also making conclusions about performance of new
workloads based on a limited set of benchmarks.

Most distributions will simply select the defaults, while targeting a
large variety of different workloads and system architectures.  As such,
the better default is CONFIG_KFENCE_STATIC_KEYS=n, and re-enabling it is
only recommended after careful evaluation.

For reference, on x86-64 the condition in kfence_alloc() generates
exactly
2 instructions in the kmem_cache_alloc() fast-path:

 | ...
 | cmpl   $0x0,0x1a8021c(%rip)  # ffffffff82d560d0 <kfence_allocation_gate>
 | je     ffffffff812d6003      <kmem_cache_alloc+0x243>
 | ...

which, given kfence_allocation_gate is infrequently modified, should be
well predicted by most CPUs.

Link: https://lkml.kernel.org/r/20211019102524.2807208-2-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/dev-tools/kfence.rst |   12 ++++++++----
 lib/Kconfig.kfence                 |   26 +++++++++++++++-----------
 2 files changed, 23 insertions(+), 15 deletions(-)

--- a/Documentation/dev-tools/kfence.rst
+++ b/Documentation/dev-tools/kfence.rst
@@ -231,10 +231,14 @@ Guarded allocations are set up based on
 of the sample interval, the next allocation through the main allocator (SLAB or
 SLUB) returns a guarded allocation from the KFENCE object pool (allocation
 sizes up to PAGE_SIZE are supported). At this point, the timer is reset, and
-the next allocation is set up after the expiration of the interval. To "gate" a
-KFENCE allocation through the main allocator's fast-path without overhead,
-KFENCE relies on static branches via the static keys infrastructure. The static
-branch is toggled to redirect the allocation to KFENCE.
+the next allocation is set up after the expiration of the interval.
+
+When using ``CONFIG_KFENCE_STATIC_KEYS=y``, KFENCE allocations are "gated"
+through the main allocator's fast-path by relying on static branches via the
+static keys infrastructure. The static branch is toggled to redirect the
+allocation to KFENCE. Depending on sample interval, target workloads, and
+system architecture, this may perform better than the simple dynamic branch.
+Careful benchmarking is recommended.
 
 KFENCE objects each reside on a dedicated page, at either the left or right
 page boundaries selected at random. The pages to the left and right of the
--- a/lib/Kconfig.kfence
+++ b/lib/Kconfig.kfence
@@ -25,17 +25,6 @@ menuconfig KFENCE
 
 if KFENCE
 
-config KFENCE_STATIC_KEYS
-	bool "Use static keys to set up allocations"
-	default y
-	depends on JUMP_LABEL # To ensure performance, require jump labels
-	help
-	  Use static keys (static branches) to set up KFENCE allocations. Using
-	  static keys is normally recommended, because it avoids a dynamic
-	  branch in the allocator's fast path. However, with very low sample
-	  intervals, or on systems that do not support jump labels, a dynamic
-	  branch may still be an acceptable performance trade-off.
-
 config KFENCE_SAMPLE_INTERVAL
 	int "Default sample interval in milliseconds"
 	default 100
@@ -56,6 +45,21 @@ config KFENCE_NUM_OBJECTS
 	  pages are required; with one containing the object and two adjacent
 	  ones used as guard pages.
 
+config KFENCE_STATIC_KEYS
+	bool "Use static keys to set up allocations" if EXPERT
+	depends on JUMP_LABEL
+	help
+	  Use static keys (static branches) to set up KFENCE allocations. This
+	  option is only recommended when using very large sample intervals, or
+	  performance has carefully been evaluated with this option.
+
+	  Using static keys comes with trade-offs that need to be carefully
+	  evaluated given target workloads and system architectures. Notably,
+	  enabling and disabling static keys invoke IPI broadcasts, the latency
+	  and impact of which is much harder to predict than a dynamic branch.
+
+	  Say N if you are unsure.
+
 config KFENCE_STRESS_TEST_FAULTS
 	int "Stress testing of fault handling and error reporting" if EXPERT
 	default 0



  parent reply	other threads:[~2021-11-10 18:56 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-10 18:43 [PATCH 5.15 00/26] 5.15.2-rc1 review Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 01/26] KVM: x86: avoid warning with -Wbitwise-instead-of-logical Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 02/26] Revert "x86/kvm: fix vcpu-id indexed array sizes" Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 03/26] usb: ehci: handshake CMD_RUN instead of STS_HALT Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 04/26] usb: gadget: Mark USB_FSL_QE broken on 64-bit Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 05/26] usb: musb: Balance list entry in musb_gadget_queue Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 06/26] usb-storage: Add compatibility quirk flags for iODD 2531/2541 Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 07/26] Revert "proc/wchan: use printk format instead of lookup_symbol_name()" Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 08/26] binder: use euid from cred instead of using task Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 09/26] binder: use cred instead of task for selinux checks Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 10/26] binder: use cred instead of task for getsecid Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 11/26] binder: dont detect sender/target during buffer cleanup Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 12/26] kfence: always use static branches to guard kfence_alloc() Greg Kroah-Hartman
2021-11-10 18:44 ` Greg Kroah-Hartman [this message]
2021-11-10 18:44 ` [PATCH 5.15 14/26] btrfs: fix lzo_decompress_bio() kmap leakage Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 15/26] staging: rtl8712: fix use-after-free in rtl8712_dl_fw Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 16/26] isofs: Fix out of bound access for corrupted isofs image Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 17/26] comedi: dt9812: fix DMA buffers on stack Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 18/26] comedi: ni_usb6501: fix NULL-deref in command paths Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 19/26] comedi: vmk80xx: fix transfer-buffer overflows Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 20/26] comedi: vmk80xx: fix bulk-buffer overflow Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 21/26] comedi: vmk80xx: fix bulk and interrupt message timeouts Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 22/26] staging: r8712u: fix control-message timeout Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 23/26] staging: rtl8192u: fix control-message timeouts Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 24/26] staging: r8188eu: fix memleak in rtw_wx_set_enc_ext Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 25/26] media: staging/intel-ipu3: css: Fix wrong size comparison imgu_css_fw_init Greg Kroah-Hartman
2021-11-10 18:44 ` [PATCH 5.15 26/26] rsi: fix control-message timeout Greg Kroah-Hartman
2021-11-10 23:43 ` [PATCH 5.15 00/26] 5.15.2-rc1 review Florian Fainelli
2021-11-11  9:57 ` Naresh Kamboju
2021-11-11 16:26 ` Shuah Khan
2021-11-11 16:37 ` Fox Chen
2021-11-12  0:59 ` Guenter Roeck
2021-11-12 15:46 ` Jon Hunter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211110182004.120292963@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=dvyukov@google.com \
    --cc=elver@google.com \
    --cc=glider@google.com \
    --cc=jannh@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox