stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Allan Zhang <allanzhang@google.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Stanislav Fomichev <sdf@google.com>,
	Eric Dumazet <edumazet@google.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Sasha Levin <sashal@kernel.org>,
	netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: [PATCH AUTOSEL 5.3 55/71] bpf: Fix bpf_event_output re-entry issue
Date: Tue,  1 Oct 2019 12:39:05 -0400	[thread overview]
Message-ID: <20191001163922.14735-55-sashal@kernel.org> (raw)
In-Reply-To: <20191001163922.14735-1-sashal@kernel.org>

From: Allan Zhang <allanzhang@google.com>

[ Upstream commit 768fb61fcc13b2acaca758275d54c09a65e2968b ]

BPF_PROG_TYPE_SOCK_OPS program can reenter bpf_event_output because it
can be called from atomic and non-atomic contexts since we don't have
bpf_prog_active to prevent it happen.

This patch enables 3 levels of nesting to support normal, irq and nmi
context.

We can easily reproduce the issue by running netperf crr mode with 100
flows and 10 threads from netperf client side.

Here is the whole stack dump:

[  515.228898] WARNING: CPU: 20 PID: 14686 at kernel/trace/bpf_trace.c:549 bpf_event_output+0x1f9/0x220
[  515.228903] CPU: 20 PID: 14686 Comm: tcp_crr Tainted: G        W        4.15.0-smp-fixpanic #44
[  515.228904] Hardware name: Intel TBG,ICH10/Ikaria_QC_1b, BIOS 1.22.0 06/04/2018
[  515.228905] RIP: 0010:bpf_event_output+0x1f9/0x220
[  515.228906] RSP: 0018:ffff9a57ffc03938 EFLAGS: 00010246
[  515.228907] RAX: 0000000000000012 RBX: 0000000000000001 RCX: 0000000000000000
[  515.228907] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffffffff836b0f80
[  515.228908] RBP: ffff9a57ffc039c8 R08: 0000000000000004 R09: 0000000000000012
[  515.228908] R10: ffff9a57ffc1de40 R11: 0000000000000000 R12: 0000000000000002
[  515.228909] R13: ffff9a57e13bae00 R14: 00000000ffffffff R15: ffff9a57ffc1e2c0
[  515.228910] FS:  00007f5a3e6ec700(0000) GS:ffff9a57ffc00000(0000) knlGS:0000000000000000
[  515.228910] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  515.228911] CR2: 0000537082664fff CR3: 000000061fed6002 CR4: 00000000000226f0
[  515.228911] Call Trace:
[  515.228913]  <IRQ>
[  515.228919]  [<ffffffff82c6c6cb>] bpf_sockopt_event_output+0x3b/0x50
[  515.228923]  [<ffffffff8265daee>] ? bpf_ktime_get_ns+0xe/0x10
[  515.228927]  [<ffffffff8266fda5>] ? __cgroup_bpf_run_filter_sock_ops+0x85/0x100
[  515.228930]  [<ffffffff82cf90a5>] ? tcp_init_transfer+0x125/0x150
[  515.228933]  [<ffffffff82cf9159>] ? tcp_finish_connect+0x89/0x110
[  515.228936]  [<ffffffff82cf98e4>] ? tcp_rcv_state_process+0x704/0x1010
[  515.228939]  [<ffffffff82c6e263>] ? sk_filter_trim_cap+0x53/0x2a0
[  515.228942]  [<ffffffff82d90d1f>] ? tcp_v6_inbound_md5_hash+0x6f/0x1d0
[  515.228945]  [<ffffffff82d92160>] ? tcp_v6_do_rcv+0x1c0/0x460
[  515.228947]  [<ffffffff82d93558>] ? tcp_v6_rcv+0x9f8/0xb30
[  515.228951]  [<ffffffff82d737c0>] ? ip6_route_input+0x190/0x220
[  515.228955]  [<ffffffff82d5f7ad>] ? ip6_protocol_deliver_rcu+0x6d/0x450
[  515.228958]  [<ffffffff82d60246>] ? ip6_rcv_finish+0xb6/0x170
[  515.228961]  [<ffffffff82d5fb90>] ? ip6_protocol_deliver_rcu+0x450/0x450
[  515.228963]  [<ffffffff82d60361>] ? ipv6_rcv+0x61/0xe0
[  515.228966]  [<ffffffff82d60190>] ? ipv6_list_rcv+0x330/0x330
[  515.228969]  [<ffffffff82c4976b>] ? __netif_receive_skb_one_core+0x5b/0xa0
[  515.228972]  [<ffffffff82c497d1>] ? __netif_receive_skb+0x21/0x70
[  515.228975]  [<ffffffff82c4a8d2>] ? process_backlog+0xb2/0x150
[  515.228978]  [<ffffffff82c4aadf>] ? net_rx_action+0x16f/0x410
[  515.228982]  [<ffffffff830000dd>] ? __do_softirq+0xdd/0x305
[  515.228986]  [<ffffffff8252cfdc>] ? irq_exit+0x9c/0xb0
[  515.228989]  [<ffffffff82e02de5>] ? smp_call_function_single_interrupt+0x65/0x120
[  515.228991]  [<ffffffff82e020e1>] ? call_function_single_interrupt+0x81/0x90
[  515.228992]  </IRQ>
[  515.228996]  [<ffffffff82a11ff0>] ? io_serial_in+0x20/0x20
[  515.229000]  [<ffffffff8259c040>] ? console_unlock+0x230/0x490
[  515.229003]  [<ffffffff8259cbaa>] ? vprintk_emit+0x26a/0x2a0
[  515.229006]  [<ffffffff8259cbff>] ? vprintk_default+0x1f/0x30
[  515.229008]  [<ffffffff8259d9f5>] ? vprintk_func+0x35/0x70
[  515.229011]  [<ffffffff8259d4bb>] ? printk+0x50/0x66
[  515.229013]  [<ffffffff82637637>] ? bpf_event_output+0xb7/0x220
[  515.229016]  [<ffffffff82c6c6cb>] ? bpf_sockopt_event_output+0x3b/0x50
[  515.229019]  [<ffffffff8265daee>] ? bpf_ktime_get_ns+0xe/0x10
[  515.229023]  [<ffffffff82c29e87>] ? release_sock+0x97/0xb0
[  515.229026]  [<ffffffff82ce9d6a>] ? tcp_recvmsg+0x31a/0xda0
[  515.229029]  [<ffffffff8266fda5>] ? __cgroup_bpf_run_filter_sock_ops+0x85/0x100
[  515.229032]  [<ffffffff82ce77c1>] ? tcp_set_state+0x191/0x1b0
[  515.229035]  [<ffffffff82ced10e>] ? tcp_disconnect+0x2e/0x600
[  515.229038]  [<ffffffff82cecbbb>] ? tcp_close+0x3eb/0x460
[  515.229040]  [<ffffffff82d21082>] ? inet_release+0x42/0x70
[  515.229043]  [<ffffffff82d58809>] ? inet6_release+0x39/0x50
[  515.229046]  [<ffffffff82c1f32d>] ? __sock_release+0x4d/0xd0
[  515.229049]  [<ffffffff82c1f3e5>] ? sock_close+0x15/0x20
[  515.229052]  [<ffffffff8273b517>] ? __fput+0xe7/0x1f0
[  515.229055]  [<ffffffff8273b66e>] ? ____fput+0xe/0x10
[  515.229058]  [<ffffffff82547bf2>] ? task_work_run+0x82/0xb0
[  515.229061]  [<ffffffff824086df>] ? exit_to_usermode_loop+0x7e/0x11f
[  515.229064]  [<ffffffff82408171>] ? do_syscall_64+0x111/0x130
[  515.229067]  [<ffffffff82e0007c>] ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Fixes: a5a3a828cd00 ("bpf: add perf event notificaton support for sock_ops")
Signed-off-by: Allan Zhang <allanzhang@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20190925234312.94063-2-allanzhang@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/trace/bpf_trace.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index ca1255d145766..3e38a010003c9 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -500,14 +500,17 @@ static const struct bpf_func_proto bpf_perf_event_output_proto = {
 	.arg5_type	= ARG_CONST_SIZE_OR_ZERO,
 };
 
-static DEFINE_PER_CPU(struct pt_regs, bpf_pt_regs);
-static DEFINE_PER_CPU(struct perf_sample_data, bpf_misc_sd);
+static DEFINE_PER_CPU(int, bpf_event_output_nest_level);
+struct bpf_nested_pt_regs {
+	struct pt_regs regs[3];
+};
+static DEFINE_PER_CPU(struct bpf_nested_pt_regs, bpf_pt_regs);
+static DEFINE_PER_CPU(struct bpf_trace_sample_data, bpf_misc_sds);
 
 u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
 		     void *ctx, u64 ctx_size, bpf_ctx_copy_t ctx_copy)
 {
-	struct perf_sample_data *sd = this_cpu_ptr(&bpf_misc_sd);
-	struct pt_regs *regs = this_cpu_ptr(&bpf_pt_regs);
+	int nest_level = this_cpu_inc_return(bpf_event_output_nest_level);
 	struct perf_raw_frag frag = {
 		.copy		= ctx_copy,
 		.size		= ctx_size,
@@ -522,12 +525,25 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
 			.data	= meta,
 		},
 	};
+	struct perf_sample_data *sd;
+	struct pt_regs *regs;
+	u64 ret;
+
+	if (WARN_ON_ONCE(nest_level > ARRAY_SIZE(bpf_misc_sds.sds))) {
+		ret = -EBUSY;
+		goto out;
+	}
+	sd = this_cpu_ptr(&bpf_misc_sds.sds[nest_level - 1]);
+	regs = this_cpu_ptr(&bpf_pt_regs.regs[nest_level - 1]);
 
 	perf_fetch_caller_regs(regs);
 	perf_sample_data_init(sd, 0, 0);
 	sd->raw = &raw;
 
-	return __bpf_perf_event_output(regs, map, flags, sd);
+	ret = __bpf_perf_event_output(regs, map, flags, sd);
+out:
+	this_cpu_dec(bpf_event_output_nest_level);
+	return ret;
 }
 
 BPF_CALL_0(bpf_get_current_task)
-- 
2.20.1


  parent reply	other threads:[~2019-10-01 17:00 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-01 16:38 [PATCH AUTOSEL 5.3 01/71] drivers: thermal: qcom: tsens: Fix memory leak from qfprom read Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 02/71] ima: always return negative code for error Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 03/71] ima: fix freeing ongoing ahash_request Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 04/71] fs: nfs: Fix possible null-pointer dereferences in encode_attrs() Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 05/71] xprtrdma: Toggle XPRT_CONGESTED in xprtrdma's slot methods Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 06/71] xprtrdma: Send Queue size grows after a reconnect Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 07/71] 9p: Transport error uninitialized Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 08/71] 9p: avoid attaching writeback_fid on mmap with type PRIVATE Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 09/71] 9p/cache.c: Fix memory leak in v9fs_cache_session_get_cookie Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 10/71] xen/pci: reserve MCFG areas earlier Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 11/71] fuse: fix request limit Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 12/71] ceph: fix directories inode i_blkbits initialization Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 13/71] ceph: fetch cap_gen under spinlock in ceph_add_cap Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 14/71] ceph: reconnect connection if session hang in opening state Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 15/71] rbd: fix response length parameter for encoded strings Sasha Levin
2019-10-01 17:15   ` Ilya Dryomov
2019-10-08 21:29     ` Sasha Levin
2019-10-09  3:45       ` Jens Axboe
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 16/71] SUNRPC: RPC level errors should always set task->tk_rpc_status Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 17/71] watchdog: aspeed: Add support for AST2600 Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 18/71] netfilter: nf_tables: allow lookups in dynamic sets Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 19/71] drm/amdgpu: Fix KFD-related kernel oops on Hawaii Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 20/71] drm/amdgpu: Check for valid number of registers to read Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 21/71] perf probe: Fix to clear tev->nargs in clear_probe_trace_event() Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 22/71] pNFS: Ensure we do clear the return-on-close layout stateid on fatal errors Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 23/71] SUNRPC: Don't try to parse incomplete RPC messages Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 24/71] net/sched: act_sample: don't push mac header on ip6gre ingress Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 25/71] pwm: stm32-lp: Add check in case requested period cannot be achieved Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 26/71] cdc_ncm: fix divide-by-zero caused by invalid wMaxPacketSize Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 27/71] usbnet: ignore endpoints with " Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 28/71] net/phy: fix DP83865 10 Mbps HDX loopback disable function Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 29/71] net_sched: add max len check for TCA_KIND Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 30/71] selftests/seccomp: fix build on older kernels Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 31/71] x86/purgatory: Disable the stackleak GCC plugin for the purgatory Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 32/71] ntb: point to right memory window index Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 33/71] thermal: Fix use-after-free when unregistering thermal zone device Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 34/71] thermal_hwmon: Sanitize thermal_zone type Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 35/71] iommu/amd: Fix downgrading default page-sizes in alloc_pte() Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 36/71] libnvdimm/region: Initialize bad block for volatile namespaces Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 37/71] net/mlx5e: Fix matching on tunnel addresses type Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 38/71] net/mlx5e: Fix traffic duplication in ethtool steering Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 39/71] KVM: hyperv: Fix Direct Synthetic timers assert an interrupt w/o lapic_in_kernel Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 40/71] libnvdimm: Fix endian conversion issues Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 41/71] fuse: fix memleak in cuse_channel_open Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 42/71] arcnet: provide a buffer big enough to actually receive packets Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 43/71] libnvdimm/nfit_test: Fix acpi_handle redefinition Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 44/71] ppp: Fix memory leak in ppp_write Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 45/71] sched/membarrier: Call sync_core only before usermode for same mm Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 46/71] sched/membarrier: Fix private expedited registration check Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 47/71] sched/core: Fix migration to invalid CPU in __set_cpus_allowed_ptr() Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 48/71] perf build: Add detection of java-11-openjdk-devel package Sasha Levin
2019-10-01 16:38 ` [PATCH AUTOSEL 5.3 49/71] include/trace/events/writeback.h: fix -Wstringop-truncation warnings Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 50/71] selftests/bpf: adjust strobemeta loop to satisfy latest clang Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 51/71] kernel/elfcore.c: include proper prototypes Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 52/71] libbpf: fix false uninitialized variable warning Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 53/71] kexec: bail out upon SIGKILL when allocating memory Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 54/71] blk-mq: move lockdep_assert_held() into elevator_exit Sasha Levin
2019-10-01 16:39 ` Sasha Levin [this message]
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 56/71] macsec: drop skb sk before calling gro_cells_receive Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 57/71] net: dsa: microchip: Always set regmap stride to 1 Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 58/71] i2c: qcom-geni: Disable DMA processing on the Lenovo Yoga C630 Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 59/71] perf unwind: Fix libunwind build failure on i386 systems Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 60/71] nfp: flower: fix memory leak in nfp_flower_spawn_vnic_reprs Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 61/71] net: stmmac: Fix page pool size Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 62/71] net: phy: micrel: add Asym Pause workaround for KSZ9021 Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 63/71] mlxsw: spectrum_flower: Fail in case user specifies multiple mirror actions Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 64/71] vrf: Do not attempt to create IPv6 mcast rule if IPv6 is disabled Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 65/71] nfp: flower: prevent memory leak in nfp_flower_spawn_phy_reprs Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 66/71] nfp: abm: fix memory leak in nfp_abm_u32_knode_replace Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 67/71] fuse: fix deadlock with aio poll and fuse_iqueue::waitq.lock Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 68/71] drm/radeon: Bail earlier when radeon.cik_/si_support=0 is passed Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 69/71] usbnet: sanity checking of packet sizes and device mtu Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 70/71] sch_netem: fix a divide by zero in tabledist() Sasha Levin
2019-10-01 16:39 ` [PATCH AUTOSEL 5.3 71/71] Btrfs: fix selftests failure due to uninitialized i_mode in test inodes Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191001163922.14735-55-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=allanzhang@google.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=edumazet@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=sdf@google.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).