stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Sandipan Das <sandipan@linux.ibm.com>,
	Michael Roth <mdroth@linux.vnet.ibm.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Alexei Starovoitov <ast@kernel.org>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.19 80/90] bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K
Date: Mon,  8 Jul 2019 17:13:47 +0200	[thread overview]
Message-ID: <20190708150526.419263775@linuxfoundation.org> (raw)
In-Reply-To: <20190708150521.829733162@linuxfoundation.org>

[ Upstream commit fdadd04931c2d7cd294dc5b2b342863f94be53a3 ]

Michael and Sandipan report:

  Commit ede95a63b5 introduced a bpf_jit_limit tuneable to limit BPF
  JIT allocations. At compile time it defaults to PAGE_SIZE * 40000,
  and is adjusted again at init time if MODULES_VADDR is defined.

  For ppc64 kernels, MODULES_VADDR isn't defined, so we're stuck with
  the compile-time default at boot-time, which is 0x9c400000 when
  using 64K page size. This overflows the signed 32-bit bpf_jit_limit
  value:

  root@ubuntu:/tmp# cat /proc/sys/net/core/bpf_jit_limit
  -1673527296

  and can cause various unexpected failures throughout the network
  stack. In one case `strace dhclient eth0` reported:

  setsockopt(5, SOL_SOCKET, SO_ATTACH_FILTER, {len=11, filter=0x105dd27f8},
             16) = -1 ENOTSUPP (Unknown error 524)

  and similar failures can be seen with tools like tcpdump. This doesn't
  always reproduce however, and I'm not sure why. The more consistent
  failure I've seen is an Ubuntu 18.04 KVM guest booted on a POWER9
  host would time out on systemd/netplan configuring a virtio-net NIC
  with no noticeable errors in the logs.

Given this and also given that in near future some architectures like
arm64 will have a custom area for BPF JIT image allocations we should
get rid of the BPF_JIT_LIMIT_DEFAULT fallback / default entirely. For
4.21, we have an overridable bpf_jit_alloc_exec(), bpf_jit_free_exec()
so therefore add another overridable bpf_jit_alloc_exec_limit() helper
function which returns the possible size of the memory area for deriving
the default heuristic in bpf_jit_charge_init().

Like bpf_jit_alloc_exec() and bpf_jit_free_exec(), the new
bpf_jit_alloc_exec_limit() assumes that module_alloc() is the default
JIT memory provider, and therefore in case archs implement their custom
module_alloc() we use MODULES_{END,_VADDR} for limits and otherwise for
vmalloc_exec() cases like on ppc64 we use VMALLOC_{END,_START}.

Additionally, for archs supporting large page sizes, we should change
the sysctl to be handled as long to not run into sysctl restrictions
in future.

Fixes: ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict unpriv allocations")
Reported-by: Sandipan Das <sandipan@linux.ibm.com>
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/filter.h     |  2 +-
 kernel/bpf/core.c          | 21 +++++++++++++++------
 net/core/sysctl_net_core.c | 20 +++++++++++++++++---
 3 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index d52a7484aeb2..3705c6f10b17 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -837,7 +837,7 @@ bpf_run_sk_reuseport(struct sock_reuseport *reuse, struct sock *sk,
 extern int bpf_jit_enable;
 extern int bpf_jit_harden;
 extern int bpf_jit_kallsyms;
-extern int bpf_jit_limit;
+extern long bpf_jit_limit;
 
 typedef void (*bpf_jit_fill_hole_t)(void *area, unsigned int size);
 
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index bad9985b8a08..36be400c3e65 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -366,13 +366,11 @@ void bpf_prog_kallsyms_del_all(struct bpf_prog *fp)
 }
 
 #ifdef CONFIG_BPF_JIT
-# define BPF_JIT_LIMIT_DEFAULT	(PAGE_SIZE * 40000)
-
 /* All BPF JIT sysctl knobs here. */
 int bpf_jit_enable   __read_mostly = IS_BUILTIN(CONFIG_BPF_JIT_ALWAYS_ON);
 int bpf_jit_harden   __read_mostly;
 int bpf_jit_kallsyms __read_mostly;
-int bpf_jit_limit    __read_mostly = BPF_JIT_LIMIT_DEFAULT;
+long bpf_jit_limit   __read_mostly;
 
 static __always_inline void
 bpf_get_prog_addr_region(const struct bpf_prog *prog,
@@ -583,16 +581,27 @@ int bpf_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
 
 static atomic_long_t bpf_jit_current;
 
+/* Can be overridden by an arch's JIT compiler if it has a custom,
+ * dedicated BPF backend memory area, or if neither of the two
+ * below apply.
+ */
+u64 __weak bpf_jit_alloc_exec_limit(void)
+{
 #if defined(MODULES_VADDR)
+	return MODULES_END - MODULES_VADDR;
+#else
+	return VMALLOC_END - VMALLOC_START;
+#endif
+}
+
 static int __init bpf_jit_charge_init(void)
 {
 	/* Only used as heuristic here to derive limit. */
-	bpf_jit_limit = min_t(u64, round_up((MODULES_END - MODULES_VADDR) >> 2,
-					    PAGE_SIZE), INT_MAX);
+	bpf_jit_limit = min_t(u64, round_up(bpf_jit_alloc_exec_limit() >> 2,
+					    PAGE_SIZE), LONG_MAX);
 	return 0;
 }
 pure_initcall(bpf_jit_charge_init);
-#endif
 
 static int bpf_jit_charge_modmem(u32 pages)
 {
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 37b4667128a3..d67ec17f2cc8 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -28,6 +28,8 @@ static int two __maybe_unused = 2;
 static int min_sndbuf = SOCK_MIN_SNDBUF;
 static int min_rcvbuf = SOCK_MIN_RCVBUF;
 static int max_skb_frags = MAX_SKB_FRAGS;
+static long long_one __maybe_unused = 1;
+static long long_max __maybe_unused = LONG_MAX;
 
 static int net_msg_warn;	/* Unused, but still a sysctl */
 
@@ -289,6 +291,17 @@ proc_dointvec_minmax_bpf_restricted(struct ctl_table *table, int write,
 
 	return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
 }
+
+static int
+proc_dolongvec_minmax_bpf_restricted(struct ctl_table *table, int write,
+				     void __user *buffer, size_t *lenp,
+				     loff_t *ppos)
+{
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	return proc_doulongvec_minmax(table, write, buffer, lenp, ppos);
+}
 #endif
 
 static struct ctl_table net_core_table[] = {
@@ -398,10 +411,11 @@ static struct ctl_table net_core_table[] = {
 	{
 		.procname	= "bpf_jit_limit",
 		.data		= &bpf_jit_limit,
-		.maxlen		= sizeof(int),
+		.maxlen		= sizeof(long),
 		.mode		= 0600,
-		.proc_handler	= proc_dointvec_minmax_bpf_restricted,
-		.extra1		= &one,
+		.proc_handler	= proc_dolongvec_minmax_bpf_restricted,
+		.extra1		= &long_one,
+		.extra2		= &long_max,
 	},
 #endif
 	{
-- 
2.20.1




  parent reply	other threads:[~2019-07-08 15:39 UTC|newest]

Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-08 15:12 [PATCH 4.19 00/90] 4.19.58-stable review Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 01/90] Bluetooth: Fix faulty expression for minimum encryption key size check Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 02/90] block: Fix a NULL pointer dereference in generic_make_request() Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 03/90] md/raid0: Do not bypass blocking queue entered for raid0 bios Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 04/90] netfilter: nf_flow_table: ignore DF bit setting Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 05/90] netfilter: nft_flow_offload: set liberal tracking mode for tcp Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 06/90] netfilter: nft_flow_offload: dont offload when sequence numbers need adjustment Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 07/90] netfilter: nft_flow_offload: IPCB is only valid for ipv4 family Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 08/90] ASoC : cs4265 : readable register too low Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 09/90] ASoC: ak4458: add return value for ak4458_probe Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 10/90] ASoC: soc-pcm: BE dai needs prepare when pause release after resume Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 11/90] ASoC: ak4458: rstn_control - return a non-zero on error only Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 12/90] spi: bitbang: Fix NULL pointer dereference in spi_unregister_master Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 13/90] drm/mediatek: fix unbind functions Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 14/90] drm/mediatek: unbind components in mtk_drm_unbind() Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 15/90] drm/mediatek: call drm_atomic_helper_shutdown() when unbinding driver Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 16/90] drm/mediatek: clear num_pipes when unbind driver Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 17/90] drm/mediatek: call mtk_dsi_stop() after mtk_drm_crtc_atomic_disable() Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 18/90] ASoC: max98090: remove 24-bit format support if RJ is 0 Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 19/90] ASoC: sun4i-i2s: Fix sun8i tx channel offset mask Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 20/90] ASoC: sun4i-i2s: Add offset to RX channel select Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 21/90] x86/CPU: Add more Icelake model numbers Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 22/90] usb: gadget: fusb300_udc: Fix memory leak of fusb300->ep[i] Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 23/90] usb: gadget: udc: lpc32xx: allocate descriptor with GFP_ATOMIC Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 24/90] ALSA: hdac: fix memory release for SST and SOF drivers Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 25/90] SoC: rt274: Fix internal jack assignment in set_jack callback Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 26/90] scsi: hpsa: correct ioaccel2 chaining Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 27/90] drm: panel-orientation-quirks: Add quirk for GPD pocket2 Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 28/90] drm: panel-orientation-quirks: Add quirk for GPD MicroPC Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 29/90] platform/x86: asus-wmi: Only Tell EC the OS will handle display hotkeys from asus_nb_wmi Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 30/90] platform/x86: intel-vbtn: Report switch events when event wakes device Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 31/90] platform/x86: mlx-platform: Fix parent device in i2c-mux-reg device registration Greg Kroah-Hartman
2019-07-08 15:12 ` [PATCH 4.19 32/90] platform/mellanox: mlxreg-hotplug: Add devm_free_irq call to remove flow Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 33/90] i2c: pca-platform: Fix GPIO lookup code Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 34/90] cpuset: restore sanity to cpuset_cpus_allowed_fallback() Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 35/90] scripts/decode_stacktrace.sh: prefix addr2line with $CROSS_COMPILE Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 36/90] mm/mlock.c: change count_mm_mlocked_page_nr return type Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 37/90] tracing: avoid build warning with HAVE_NOP_MCOUNT Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 38/90] module: Fix livepatch/ftrace module text permissions race Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 39/90] ftrace: Fix NULL pointer dereference in free_ftrace_func_mapper() Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 40/90] drm/i915/dmc: protect against reading random memory Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 41/90] ptrace: Fix ->ptracer_cred handling for PTRACE_TRACEME Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 42/90] crypto: user - prevent operating on larval algorithms Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 43/90] crypto: cryptd - Fix skcipher instance memory leak Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 44/90] ALSA: seq: fix incorrect order of dest_client/dest_ports arguments Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 45/90] ALSA: firewire-lib/fireworks: fix miss detection of received MIDI messages Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 46/90] ALSA: line6: Fix write on zero-sized buffer Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 47/90] ALSA: usb-audio: fix sign unintended sign extension on left shifts Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 48/90] ALSA: hda/realtek: Add quirks for several Clevo notebook barebones Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 49/90] ALSA: hda/realtek - Change front mic location for Lenovo M710q Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 50/90] lib/mpi: Fix karactx leak in mpi_powm Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 51/90] fs/userfaultfd.c: disable irqs for fault_pending and event locks Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 52/90] tracing/snapshot: Resize spare buffer if size changed Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 53/90] ARM: dts: armada-xp-98dx3236: Switch to armada-38x-uart serial node Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 54/90] arm64: kaslr: keep modules inside module region when KASAN is enabled Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 55/90] drm/amd/powerplay: use hardware fan control if no powerplay fan table Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 56/90] drm/amdgpu/gfx9: use reset default for PA_SC_FIFO_SIZE Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 57/90] drm/etnaviv: add missing failure path to destroy suballoc Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 58/90] drm/imx: notify drm core before sending event during crtc disable Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 59/90] drm/imx: only send event on crtc disable if kept disabled Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 60/90] ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code() Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 61/90] mm/vmscan.c: prevent useless kswapd loops Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 62/90] btrfs: Ensure replaced device doesnt have pending chunk allocation Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 63/90] tty: rocket: fix incorrect forward declaration of rp_init() Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 64/90] mlxsw: spectrum: Handle VLAN device unlinking Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 65/90] net/smc: move unhash before release of clcsock Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 66/90] media: s5p-mfc: fix incorrect bus assignment in virtual child device Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 67/90] drm/fb-helper: generic: Dont take module ref for fbcon Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 68/90] f2fs: dont access node/meta inode mapping after iput Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 69/90] mac80211: mesh: fix missing unlock on error in table_path_del() Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 70/90] scsi: tcmu: fix use after free Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 71/90] selftests: fib_rule_tests: Fix icmp proto with ipv6 Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 72/90] x86/boot/compressed/64: Do not corrupt EDX on EFER.LME=1 setting Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 73/90] net: hns: Fixes the missing put_device in positive leg for roce reset Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 74/90] ALSA: hda: Initialize power_state field properly Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 75/90] rds: Fix warning Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 76/90] ip6: fix skb leak in ip6frag_expire_frag_queue() Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 77/90] netfilter: ipv6: nf_defrag: fix leakage of unqueued fragments Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 78/90] sc16is7xx: move label err_spi to correct section Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 79/90] net: hns: fix unsigned comparison to less than zero Greg Kroah-Hartman
2019-07-08 15:13 ` Greg Kroah-Hartman [this message]
2019-07-08 15:13 ` [PATCH 4.19 81/90] netfilter: ipv6: nf_defrag: accept duplicate fragments again Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 82/90] KVM: x86: degrade WARN to pr_warn_ratelimited Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 83/90] KVM: LAPIC: Fix pending interrupt in IRR blocked by software disable LAPIC Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 84/90] nfsd: Fix overflow causing non-working mounts on 1 TB machines Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 85/90] svcrdma: Ignore source port when computing DRC hash Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 86/90] MIPS: Fix bounds check virt_addr_valid Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 87/90] MIPS: Add missing EHB in mtc0 -> mfc0 sequence Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 88/90] MIPS: have "plain" make calls build dtbs for selected platforms Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 89/90] dmaengine: qcom: bam_dma: Fix completed descriptors count Greg Kroah-Hartman
2019-07-08 15:13 ` [PATCH 4.19 90/90] dmaengine: imx-sdma: remove BD_INTR for channel0 Greg Kroah-Hartman
2019-07-08 17:31 ` [PATCH 4.19 00/90] 4.19.58-stable review Phong Tran
2019-07-08 19:12 ` kernelci.org bot
2019-07-09  0:54 ` shuah
2019-07-09  4:24 ` Naresh Kamboju
2019-07-09 15:44 ` Amol Surati
2019-07-09 18:41 ` Guenter Roeck
2019-07-10  6:13 ` Jon Hunter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190708150526.419263775@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=ast@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mdroth@linux.vnet.ibm.com \
    --cc=sandipan@linux.ibm.com \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).