From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Chuck Lever <chuck.lever@oracle.com>,
Sasha Levin <sashal@kernel.org>,
linux-nfs@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH AUTOSEL 5.4 22/57] svcrdma: Fix trace point use-after-free race
Date: Thu, 30 Apr 2020 09:51:43 -0400 [thread overview]
Message-ID: <20200430135218.20372-22-sashal@kernel.org> (raw)
In-Reply-To: <20200430135218.20372-1-sashal@kernel.org>
From: Chuck Lever <chuck.lever@oracle.com>
[ Upstream commit e28b4fc652c1830796a4d3e09565f30c20f9a2cf ]
I hit this while testing nfsd-5.7 with kernel memory debugging
enabled on my server:
Mar 30 13:21:45 klimt kernel: BUG: unable to handle page fault for address: ffff8887e6c279a8
Mar 30 13:21:45 klimt kernel: #PF: supervisor read access in kernel mode
Mar 30 13:21:45 klimt kernel: #PF: error_code(0x0000) - not-present page
Mar 30 13:21:45 klimt kernel: PGD 3601067 P4D 3601067 PUD 87c519067 PMD 87c3e2067 PTE 800ffff8193d8060
Mar 30 13:21:45 klimt kernel: Oops: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
Mar 30 13:21:45 klimt kernel: CPU: 2 PID: 1933 Comm: nfsd Not tainted 5.6.0-rc6-00040-g881e87a3c6f9 #1591
Mar 30 13:21:45 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
Mar 30 13:21:45 klimt kernel: RIP: 0010:svc_rdma_post_chunk_ctxt+0xab/0x284 [rpcrdma]
Mar 30 13:21:45 klimt kernel: Code: c1 83 34 02 00 00 29 d0 85 c0 7e 72 48 8b bb a0 02 00 00 48 8d 54 24 08 4c 89 e6 48 8b 07 48 8b 40 20 e8 5a 5c 2b e1 41 89 c6 <8b> 45 20 89 44 24 04 8b 05 02 e9 01 00 85 c0 7e 33 e9 5e 01 00 00
Mar 30 13:21:45 klimt kernel: RSP: 0018:ffffc90000dfbdd8 EFLAGS: 00010286
Mar 30 13:21:45 klimt kernel: RAX: 0000000000000000 RBX: ffff8887db8db400 RCX: 0000000000000030
Mar 30 13:21:45 klimt kernel: RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000246
Mar 30 13:21:45 klimt kernel: RBP: ffff8887e6c27988 R08: 0000000000000000 R09: 0000000000000004
Mar 30 13:21:45 klimt kernel: R10: ffffc90000dfbdd8 R11: 00c068ef00000000 R12: ffff8887eb4e4a80
Mar 30 13:21:45 klimt kernel: R13: ffff8887db8db634 R14: 0000000000000000 R15: ffff8887fc931000
Mar 30 13:21:45 klimt kernel: FS: 0000000000000000(0000) GS:ffff88885bd00000(0000) knlGS:0000000000000000
Mar 30 13:21:45 klimt kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 30 13:21:45 klimt kernel: CR2: ffff8887e6c279a8 CR3: 000000081b72e002 CR4: 00000000001606e0
Mar 30 13:21:45 klimt kernel: Call Trace:
Mar 30 13:21:45 klimt kernel: ? svc_rdma_vec_to_sg+0x7f/0x7f [rpcrdma]
Mar 30 13:21:45 klimt kernel: svc_rdma_send_write_chunk+0x59/0xce [rpcrdma]
Mar 30 13:21:45 klimt kernel: svc_rdma_sendto+0xf9/0x3ae [rpcrdma]
Mar 30 13:21:45 klimt kernel: ? nfsd_destroy+0x51/0x51 [nfsd]
Mar 30 13:21:45 klimt kernel: svc_send+0x105/0x1e3 [sunrpc]
Mar 30 13:21:45 klimt kernel: nfsd+0xf2/0x149 [nfsd]
Mar 30 13:21:45 klimt kernel: kthread+0xf6/0xfb
Mar 30 13:21:45 klimt kernel: ? kthread_queue_delayed_work+0x74/0x74
Mar 30 13:21:45 klimt kernel: ret_from_fork+0x3a/0x50
Mar 30 13:21:45 klimt kernel: Modules linked in: ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue ib_umad ib_ipoib mlx4_ib sb_edac x86_pkg_temp_thermal iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel glue_helper crypto_simd cryptd pcspkr rpcrdma i2c_i801 rdma_ucm lpc_ich mfd_core ib_iser rdma_cm iw_cm ib_cm mei_me raid0 libiscsi mei sg scsi_transport_iscsi ioatdma wmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter nfsd nfs_acl lockd auth_rpcgss grace sunrpc ip_tables xfs libcrc32c mlx4_en sd_mod sr_mod cdrom mlx4_core crc32c_intel igb nvme i2c_algo_bit ahci i2c_core libahci nvme_core dca libata t10_pi qedr dm_mirror dm_region_hash dm_log dm_mod dax qede qed crc8 ib_uverbs ib_core
Mar 30 13:21:45 klimt kernel: CR2: ffff8887e6c279a8
Mar 30 13:21:45 klimt kernel: ---[ end trace 87971d2ad3429424 ]---
It's absolutely not safe to use resources pointed to by the @send_wr
argument of ib_post_send() _after_ that function returns. Those
resources are typically freed by the Send completion handler, which
can run before ib_post_send() returns.
Thus the trace points currently around ib_post_send() in the
server's RPC/RDMA transport are a hazard, even when they are
disabled. Rearrange them so that they touch the Work Request only
_before_ ib_post_send() is invoked.
Fixes: bd2abef33394 ("svcrdma: Trace key RDMA API events")
Fixes: 4201c7464753 ("svcrdma: Introduce svc_rdma_send_ctxt")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
include/trace/events/rpcrdma.h | 50 +++++++++++++++++++--------
net/sunrpc/xprtrdma/svc_rdma_rw.c | 3 +-
net/sunrpc/xprtrdma/svc_rdma_sendto.c | 16 +++++----
3 files changed, 46 insertions(+), 23 deletions(-)
diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
index 7fd11ec1c9a42..2464311b03896 100644
--- a/include/trace/events/rpcrdma.h
+++ b/include/trace/events/rpcrdma.h
@@ -1638,17 +1638,15 @@ DECLARE_EVENT_CLASS(svcrdma_sendcomp_event,
TRACE_EVENT(svcrdma_post_send,
TP_PROTO(
- const struct ib_send_wr *wr,
- int status
+ const struct ib_send_wr *wr
),
- TP_ARGS(wr, status),
+ TP_ARGS(wr),
TP_STRUCT__entry(
__field(const void *, cqe)
__field(unsigned int, num_sge)
__field(u32, inv_rkey)
- __field(int, status)
),
TP_fast_assign(
@@ -1656,12 +1654,11 @@ TRACE_EVENT(svcrdma_post_send,
__entry->num_sge = wr->num_sge;
__entry->inv_rkey = (wr->opcode == IB_WR_SEND_WITH_INV) ?
wr->ex.invalidate_rkey : 0;
- __entry->status = status;
),
- TP_printk("cqe=%p num_sge=%u inv_rkey=0x%08x status=%d",
+ TP_printk("cqe=%p num_sge=%u inv_rkey=0x%08x",
__entry->cqe, __entry->num_sge,
- __entry->inv_rkey, __entry->status
+ __entry->inv_rkey
)
);
@@ -1726,26 +1723,23 @@ TRACE_EVENT(svcrdma_wc_receive,
TRACE_EVENT(svcrdma_post_rw,
TP_PROTO(
const void *cqe,
- int sqecount,
- int status
+ int sqecount
),
- TP_ARGS(cqe, sqecount, status),
+ TP_ARGS(cqe, sqecount),
TP_STRUCT__entry(
__field(const void *, cqe)
__field(int, sqecount)
- __field(int, status)
),
TP_fast_assign(
__entry->cqe = cqe;
__entry->sqecount = sqecount;
- __entry->status = status;
),
- TP_printk("cqe=%p sqecount=%d status=%d",
- __entry->cqe, __entry->sqecount, __entry->status
+ TP_printk("cqe=%p sqecount=%d",
+ __entry->cqe, __entry->sqecount
)
);
@@ -1841,6 +1835,34 @@ DECLARE_EVENT_CLASS(svcrdma_sendqueue_event,
DEFINE_SQ_EVENT(full);
DEFINE_SQ_EVENT(retry);
+TRACE_EVENT(svcrdma_sq_post_err,
+ TP_PROTO(
+ const struct svcxprt_rdma *rdma,
+ int status
+ ),
+
+ TP_ARGS(rdma, status),
+
+ TP_STRUCT__entry(
+ __field(int, avail)
+ __field(int, depth)
+ __field(int, status)
+ __string(addr, rdma->sc_xprt.xpt_remotebuf)
+ ),
+
+ TP_fast_assign(
+ __entry->avail = atomic_read(&rdma->sc_sq_avail);
+ __entry->depth = rdma->sc_sq_depth;
+ __entry->status = status;
+ __assign_str(addr, rdma->sc_xprt.xpt_remotebuf);
+ ),
+
+ TP_printk("addr=%s sc_sq_avail=%d/%d status=%d",
+ __get_str(addr), __entry->avail, __entry->depth,
+ __entry->status
+ )
+);
+
#endif /* _TRACE_RPCRDMA_H */
#include <trace/define_trace.h>
diff --git a/net/sunrpc/xprtrdma/svc_rdma_rw.c b/net/sunrpc/xprtrdma/svc_rdma_rw.c
index 48fe3b16b0d9c..a59912e2666d3 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_rw.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c
@@ -323,8 +323,6 @@ static int svc_rdma_post_chunk_ctxt(struct svc_rdma_chunk_ctxt *cc)
if (atomic_sub_return(cc->cc_sqecount,
&rdma->sc_sq_avail) > 0) {
ret = ib_post_send(rdma->sc_qp, first_wr, &bad_wr);
- trace_svcrdma_post_rw(&cc->cc_cqe,
- cc->cc_sqecount, ret);
if (ret)
break;
return 0;
@@ -337,6 +335,7 @@ static int svc_rdma_post_chunk_ctxt(struct svc_rdma_chunk_ctxt *cc)
trace_svcrdma_sq_retry(rdma);
} while (1);
+ trace_svcrdma_sq_post_err(rdma, ret);
set_bit(XPT_CLOSE, &xprt->xpt_flags);
/* If even one was posted, there will be a completion. */
diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
index 6fdba72f89f47..d4d9844ae85b0 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
@@ -306,15 +306,17 @@ int svc_rdma_send(struct svcxprt_rdma *rdma, struct ib_send_wr *wr)
}
svc_xprt_get(&rdma->sc_xprt);
+ trace_svcrdma_post_send(wr);
ret = ib_post_send(rdma->sc_qp, wr, NULL);
- trace_svcrdma_post_send(wr, ret);
- if (ret) {
- set_bit(XPT_CLOSE, &rdma->sc_xprt.xpt_flags);
- svc_xprt_put(&rdma->sc_xprt);
- wake_up(&rdma->sc_send_wait);
- }
- break;
+ if (ret)
+ break;
+ return 0;
}
+
+ trace_svcrdma_sq_post_err(rdma, ret);
+ set_bit(XPT_CLOSE, &rdma->sc_xprt.xpt_flags);
+ svc_xprt_put(&rdma->sc_xprt);
+ wake_up(&rdma->sc_send_wait);
return ret;
}
--
2.20.1
next prev parent reply other threads:[~2020-04-30 14:07 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-30 13:51 [PATCH AUTOSEL 5.4 01/57] drm/bridge: analogix_dp: Split bind() into probe() and real bind() Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 02/57] iio:ad7797: Use correct attribute_group Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 03/57] ASoC: topology: Check return value of soc_tplg_create_tlv Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 04/57] ASoC: topology: Check return value of soc_tplg_*_create Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 05/57] ASoC: topology: Check soc_tplg_add_route return value Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 06/57] ASoC: topology: Check return value of pcm_new_ver Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 07/57] ASoC: topology: Check return value of soc_tplg_dai_config Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 08/57] nfsd: memory corruption in nfsd4_lock() Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 09/57] selftests/ipc: Fix test failure seen after initial test run Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 10/57] drivers: soc: xilinx: fix firmware driver Kconfig dependency Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 11/57] ASoC: sgtl5000: Fix VAG power-on handling Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 12/57] ASoC: q6dsp6: q6afe-dai: add missing channels to MI2S DAIs Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 13/57] ASoC: topology: Fix endianness issue Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 14/57] usb: dwc3: gadget: Properly set maxpacket limit Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 15/57] usb: dwc3: gadget: Do link recovery for SS and SSP Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 16/57] ASoC: rsnd: Fix parent SSI start/stop in multi-SSI mode Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 17/57] ASoC: rsnd: Fix HDMI channel mapping for " Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 18/57] ASoC: codecs: hdac_hdmi: Fix incorrect use of list_for_each_entry Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 19/57] ARM: dts: bcm283x: Disable dsi0 node Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 20/57] remoteproc: qcom_q6v5_mss: fix a bug in q6v5_probe() Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 21/57] usb: gadget: udc: atmel: Fix vbus disconnect handling Sasha Levin
2020-04-30 13:51 ` Sasha Levin [this message]
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 23/57] ASoC: stm32: sai: fix sai probe Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 24/57] ASoC: SOF: Intel: add min/max channels for SSP on Baytrail/Broadwell Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 25/57] drm/amdgpu: Correctly initialize thermal controller for GPUs with Powerplay table v0 (e.g Hawaii) Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 26/57] wimax/i2400m: Fix potential urb refcnt leak Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 27/57] net: stmmac: fix enabling socfpga's ptp_ref_clock Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 28/57] net: stmmac: Fix sub-second increment Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 29/57] netfilter: nat: fix error handling upon registering inet hook Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 30/57] counter: 104-quad-8: Add lock guards - generic interface Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 31/57] ASoC: meson: axg-card: fix codec-to-codec link setup Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 32/57] ASoC: rsnd: Don't treat master SSI in multi SSI setup as parent Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 33/57] ASoC: rsnd: Fix "status check failed" spam for multi-SSI Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 34/57] net/mlx5: Fix failing fw tracer allocation on s390 Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 35/57] net/mlx5e: Don't trigger IRQ multiple times on XSK wakeup to avoid WQ overruns Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 36/57] net/mlx5e: Get the latest values from counters in switchdev mode Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 37/57] cpumap: Avoid warning when CONFIG_DEBUG_PER_CPU_MAPS is enabled Sasha Levin
2020-04-30 13:51 ` [PATCH AUTOSEL 5.4 38/57] bpf: Forbid XADD on spilled pointers for unprivileged users Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 39/57] ASoC: wm8960: Fix wrong clock after suspend & resume Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 40/57] cifs: protect updating server->dstaddr with a spinlock Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 41/57] blk-iocost: Fix error on iocost_ioc_vrate_adj Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 42/57] s390/ftrace: fix potential crashes when switching tracers Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 43/57] scripts/config: allow colons in option strings for sed Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 44/57] sched/core: Fix reset-on-fork from RT with uclamp Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 45/57] perf/core: fix parent pid/tid in task exit events Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 46/57] cifs: do not share tcons with DFS Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 47/57] tracing: Fix memory leaks in trace_events_hist.c Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 48/57] um: ensure `make ARCH=um mrproper` removes arch/$(SUBARCH)/include/generated/ Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 49/57] lib/mpi: Fix building for powerpc with clang Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 50/57] mac80211: sta_info: Add lockdep condition for RCU list usage Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 51/57] afs: Fix to actually set AFS_SERVER_FL_HAVE_EPOCH Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 52/57] afs: Make record checking use TASK_UNINTERRUPTIBLE when appropriate Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 53/57] net: bcmgenet: suppress warnings on failed Rx SKB allocations Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 54/57] net: systemport: " Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 55/57] bpf, x86_32: Fix incorrect encoding in BPF_LDX zero-extension Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 56/57] bpf, x86_32: Fix clobbering of dst for BPF_JSET Sasha Levin
2020-04-30 13:52 ` [PATCH AUTOSEL 5.4 57/57] bpf, x86_32: Fix logic error in BPF_LDX zero-extension Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200430135218.20372-22-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=chuck.lever@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).