* [PATCH AUTOSEL 4.19 13/30] svcrdma: Fix trace point use-after-free race
[not found] <20200430135325.20762-1-sashal@kernel.org>
@ 2020-04-30 13:53 ` Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 15/30] wimax/i2400m: Fix potential urb refcnt leak Sasha Levin
` (7 subsequent siblings)
8 siblings, 0 replies; 9+ messages in thread
From: Sasha Levin @ 2020-04-30 13:53 UTC (permalink / raw)
To: linux-kernel, stable; +Cc: Chuck Lever, Sasha Levin, linux-nfs, netdev
From: Chuck Lever <chuck.lever@oracle.com>
[ Upstream commit e28b4fc652c1830796a4d3e09565f30c20f9a2cf ]
I hit this while testing nfsd-5.7 with kernel memory debugging
enabled on my server:
Mar 30 13:21:45 klimt kernel: BUG: unable to handle page fault for address: ffff8887e6c279a8
Mar 30 13:21:45 klimt kernel: #PF: supervisor read access in kernel mode
Mar 30 13:21:45 klimt kernel: #PF: error_code(0x0000) - not-present page
Mar 30 13:21:45 klimt kernel: PGD 3601067 P4D 3601067 PUD 87c519067 PMD 87c3e2067 PTE 800ffff8193d8060
Mar 30 13:21:45 klimt kernel: Oops: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
Mar 30 13:21:45 klimt kernel: CPU: 2 PID: 1933 Comm: nfsd Not tainted 5.6.0-rc6-00040-g881e87a3c6f9 #1591
Mar 30 13:21:45 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
Mar 30 13:21:45 klimt kernel: RIP: 0010:svc_rdma_post_chunk_ctxt+0xab/0x284 [rpcrdma]
Mar 30 13:21:45 klimt kernel: Code: c1 83 34 02 00 00 29 d0 85 c0 7e 72 48 8b bb a0 02 00 00 48 8d 54 24 08 4c 89 e6 48 8b 07 48 8b 40 20 e8 5a 5c 2b e1 41 89 c6 <8b> 45 20 89 44 24 04 8b 05 02 e9 01 00 85 c0 7e 33 e9 5e 01 00 00
Mar 30 13:21:45 klimt kernel: RSP: 0018:ffffc90000dfbdd8 EFLAGS: 00010286
Mar 30 13:21:45 klimt kernel: RAX: 0000000000000000 RBX: ffff8887db8db400 RCX: 0000000000000030
Mar 30 13:21:45 klimt kernel: RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000246
Mar 30 13:21:45 klimt kernel: RBP: ffff8887e6c27988 R08: 0000000000000000 R09: 0000000000000004
Mar 30 13:21:45 klimt kernel: R10: ffffc90000dfbdd8 R11: 00c068ef00000000 R12: ffff8887eb4e4a80
Mar 30 13:21:45 klimt kernel: R13: ffff8887db8db634 R14: 0000000000000000 R15: ffff8887fc931000
Mar 30 13:21:45 klimt kernel: FS: 0000000000000000(0000) GS:ffff88885bd00000(0000) knlGS:0000000000000000
Mar 30 13:21:45 klimt kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 30 13:21:45 klimt kernel: CR2: ffff8887e6c279a8 CR3: 000000081b72e002 CR4: 00000000001606e0
Mar 30 13:21:45 klimt kernel: Call Trace:
Mar 30 13:21:45 klimt kernel: ? svc_rdma_vec_to_sg+0x7f/0x7f [rpcrdma]
Mar 30 13:21:45 klimt kernel: svc_rdma_send_write_chunk+0x59/0xce [rpcrdma]
Mar 30 13:21:45 klimt kernel: svc_rdma_sendto+0xf9/0x3ae [rpcrdma]
Mar 30 13:21:45 klimt kernel: ? nfsd_destroy+0x51/0x51 [nfsd]
Mar 30 13:21:45 klimt kernel: svc_send+0x105/0x1e3 [sunrpc]
Mar 30 13:21:45 klimt kernel: nfsd+0xf2/0x149 [nfsd]
Mar 30 13:21:45 klimt kernel: kthread+0xf6/0xfb
Mar 30 13:21:45 klimt kernel: ? kthread_queue_delayed_work+0x74/0x74
Mar 30 13:21:45 klimt kernel: ret_from_fork+0x3a/0x50
Mar 30 13:21:45 klimt kernel: Modules linked in: ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue ib_umad ib_ipoib mlx4_ib sb_edac x86_pkg_temp_thermal iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel glue_helper crypto_simd cryptd pcspkr rpcrdma i2c_i801 rdma_ucm lpc_ich mfd_core ib_iser rdma_cm iw_cm ib_cm mei_me raid0 libiscsi mei sg scsi_transport_iscsi ioatdma wmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter nfsd nfs_acl lockd auth_rpcgss grace sunrpc ip_tables xfs libcrc32c mlx4_en sd_mod sr_mod cdrom mlx4_core crc32c_intel igb nvme i2c_algo_bit ahci i2c_core libahci nvme_core dca libata t10_pi qedr dm_mirror dm_region_hash dm_log dm_mod dax qede qed crc8 ib_uverbs ib_core
Mar 30 13:21:45 klimt kernel: CR2: ffff8887e6c279a8
Mar 30 13:21:45 klimt kernel: ---[ end trace 87971d2ad3429424 ]---
It's absolutely not safe to use resources pointed to by the @send_wr
argument of ib_post_send() _after_ that function returns. Those
resources are typically freed by the Send completion handler, which
can run before ib_post_send() returns.
Thus the trace points currently around ib_post_send() in the
server's RPC/RDMA transport are a hazard, even when they are
disabled. Rearrange them so that they touch the Work Request only
_before_ ib_post_send() is invoked.
Fixes: bd2abef33394 ("svcrdma: Trace key RDMA API events")
Fixes: 4201c7464753 ("svcrdma: Introduce svc_rdma_send_ctxt")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
include/trace/events/rpcrdma.h | 50 +++++++++++++++++++--------
net/sunrpc/xprtrdma/svc_rdma_rw.c | 3 +-
net/sunrpc/xprtrdma/svc_rdma_sendto.c | 16 +++++----
3 files changed, 46 insertions(+), 23 deletions(-)
diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
index 4c91cadd1871d..4a0b18921e06a 100644
--- a/include/trace/events/rpcrdma.h
+++ b/include/trace/events/rpcrdma.h
@@ -1322,17 +1322,15 @@ DECLARE_EVENT_CLASS(svcrdma_sendcomp_event,
TRACE_EVENT(svcrdma_post_send,
TP_PROTO(
- const struct ib_send_wr *wr,
- int status
+ const struct ib_send_wr *wr
),
- TP_ARGS(wr, status),
+ TP_ARGS(wr),
TP_STRUCT__entry(
__field(const void *, cqe)
__field(unsigned int, num_sge)
__field(u32, inv_rkey)
- __field(int, status)
),
TP_fast_assign(
@@ -1340,12 +1338,11 @@ TRACE_EVENT(svcrdma_post_send,
__entry->num_sge = wr->num_sge;
__entry->inv_rkey = (wr->opcode == IB_WR_SEND_WITH_INV) ?
wr->ex.invalidate_rkey : 0;
- __entry->status = status;
),
- TP_printk("cqe=%p num_sge=%u inv_rkey=0x%08x status=%d",
+ TP_printk("cqe=%p num_sge=%u inv_rkey=0x%08x",
__entry->cqe, __entry->num_sge,
- __entry->inv_rkey, __entry->status
+ __entry->inv_rkey
)
);
@@ -1410,26 +1407,23 @@ TRACE_EVENT(svcrdma_wc_receive,
TRACE_EVENT(svcrdma_post_rw,
TP_PROTO(
const void *cqe,
- int sqecount,
- int status
+ int sqecount
),
- TP_ARGS(cqe, sqecount, status),
+ TP_ARGS(cqe, sqecount),
TP_STRUCT__entry(
__field(const void *, cqe)
__field(int, sqecount)
- __field(int, status)
),
TP_fast_assign(
__entry->cqe = cqe;
__entry->sqecount = sqecount;
- __entry->status = status;
),
- TP_printk("cqe=%p sqecount=%d status=%d",
- __entry->cqe, __entry->sqecount, __entry->status
+ TP_printk("cqe=%p sqecount=%d",
+ __entry->cqe, __entry->sqecount
)
);
@@ -1525,6 +1519,34 @@ DECLARE_EVENT_CLASS(svcrdma_sendqueue_event,
DEFINE_SQ_EVENT(full);
DEFINE_SQ_EVENT(retry);
+TRACE_EVENT(svcrdma_sq_post_err,
+ TP_PROTO(
+ const struct svcxprt_rdma *rdma,
+ int status
+ ),
+
+ TP_ARGS(rdma, status),
+
+ TP_STRUCT__entry(
+ __field(int, avail)
+ __field(int, depth)
+ __field(int, status)
+ __string(addr, rdma->sc_xprt.xpt_remotebuf)
+ ),
+
+ TP_fast_assign(
+ __entry->avail = atomic_read(&rdma->sc_sq_avail);
+ __entry->depth = rdma->sc_sq_depth;
+ __entry->status = status;
+ __assign_str(addr, rdma->sc_xprt.xpt_remotebuf);
+ ),
+
+ TP_printk("addr=%s sc_sq_avail=%d/%d status=%d",
+ __get_str(addr), __entry->avail, __entry->depth,
+ __entry->status
+ )
+);
+
#endif /* _TRACE_RPCRDMA_H */
#include <trace/define_trace.h>
diff --git a/net/sunrpc/xprtrdma/svc_rdma_rw.c b/net/sunrpc/xprtrdma/svc_rdma_rw.c
index dc1951759a8ef..4fc0ce1270894 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_rw.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c
@@ -331,8 +331,6 @@ static int svc_rdma_post_chunk_ctxt(struct svc_rdma_chunk_ctxt *cc)
if (atomic_sub_return(cc->cc_sqecount,
&rdma->sc_sq_avail) > 0) {
ret = ib_post_send(rdma->sc_qp, first_wr, &bad_wr);
- trace_svcrdma_post_rw(&cc->cc_cqe,
- cc->cc_sqecount, ret);
if (ret)
break;
return 0;
@@ -345,6 +343,7 @@ static int svc_rdma_post_chunk_ctxt(struct svc_rdma_chunk_ctxt *cc)
trace_svcrdma_sq_retry(rdma);
} while (1);
+ trace_svcrdma_sq_post_err(rdma, ret);
set_bit(XPT_CLOSE, &xprt->xpt_flags);
/* If even one was posted, there will be a completion. */
diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
index e8ad7ddf347ad..840bd352ef6ec 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
@@ -310,15 +310,17 @@ int svc_rdma_send(struct svcxprt_rdma *rdma, struct ib_send_wr *wr)
}
svc_xprt_get(&rdma->sc_xprt);
+ trace_svcrdma_post_send(wr);
ret = ib_post_send(rdma->sc_qp, wr, NULL);
- trace_svcrdma_post_send(wr, ret);
- if (ret) {
- set_bit(XPT_CLOSE, &rdma->sc_xprt.xpt_flags);
- svc_xprt_put(&rdma->sc_xprt);
- wake_up(&rdma->sc_send_wait);
- }
- break;
+ if (ret)
+ break;
+ return 0;
}
+
+ trace_svcrdma_sq_post_err(rdma, ret);
+ set_bit(XPT_CLOSE, &rdma->sc_xprt.xpt_flags);
+ svc_xprt_put(&rdma->sc_xprt);
+ wake_up(&rdma->sc_send_wait);
return ret;
}
--
2.20.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH AUTOSEL 4.19 15/30] wimax/i2400m: Fix potential urb refcnt leak
[not found] <20200430135325.20762-1-sashal@kernel.org>
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 13/30] svcrdma: Fix trace point use-after-free race Sasha Levin
@ 2020-04-30 13:53 ` Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 16/30] net: stmmac: fix enabling socfpga's ptp_ref_clock Sasha Levin
` (6 subsequent siblings)
8 siblings, 0 replies; 9+ messages in thread
From: Sasha Levin @ 2020-04-30 13:53 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Xiyu Yang, Xin Tan, David S . Miller, Sasha Levin, netdev
From: Xiyu Yang <xiyuyang19@fudan.edu.cn>
[ Upstream commit 7717cbec172c3554d470023b4020d5781961187e ]
i2400mu_bus_bm_wait_for_ack() invokes usb_get_urb(), which increases the
refcount of the "notif_urb".
When i2400mu_bus_bm_wait_for_ack() returns, local variable "notif_urb"
becomes invalid, so the refcount should be decreased to keep refcount
balanced.
The issue happens in all paths of i2400mu_bus_bm_wait_for_ack(), which
forget to decrease the refcnt increased by usb_get_urb(), causing a
refcnt leak.
Fix this issue by calling usb_put_urb() before the
i2400mu_bus_bm_wait_for_ack() returns.
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/wimax/i2400m/usb-fw.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/wimax/i2400m/usb-fw.c b/drivers/net/wimax/i2400m/usb-fw.c
index 529ebca1e9e13..1f7709d24f352 100644
--- a/drivers/net/wimax/i2400m/usb-fw.c
+++ b/drivers/net/wimax/i2400m/usb-fw.c
@@ -354,6 +354,7 @@ out:
usb_autopm_put_interface(i2400mu->usb_iface);
d_fnend(8, dev, "(i2400m %p ack %p size %zu) = %ld\n",
i2400m, ack, ack_size, (long) result);
+ usb_put_urb(¬if_urb);
return result;
error_exceeded:
--
2.20.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH AUTOSEL 4.19 16/30] net: stmmac: fix enabling socfpga's ptp_ref_clock
[not found] <20200430135325.20762-1-sashal@kernel.org>
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 13/30] svcrdma: Fix trace point use-after-free race Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 15/30] wimax/i2400m: Fix potential urb refcnt leak Sasha Levin
@ 2020-04-30 13:53 ` Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 17/30] net: stmmac: Fix sub-second increment Sasha Levin
` (5 subsequent siblings)
8 siblings, 0 replies; 9+ messages in thread
From: Sasha Levin @ 2020-04-30 13:53 UTC (permalink / raw)
To: linux-kernel, stable; +Cc: Julien Beraud, David S . Miller, Sasha Levin, netdev
From: Julien Beraud <julien.beraud@orolia.com>
[ Upstream commit 15ce30609d1e88d42fb1cd948f453e6d5f188249 ]
There are 2 registers to write to enable a ptp ref clock coming from the
fpga.
One that enables the usage of the clock from the fpga for emac0 and emac1
as a ptp ref clock, and the other to allow signals from the fpga to reach
emac0 and emac1.
Currently, if the dwmac-socfpga has phymode set to PHY_INTERFACE_MODE_MII,
PHY_INTERFACE_MODE_GMII, or PHY_INTERFACE_MODE_SGMII, both registers will
be written and the ptp ref clock will be set as coming from the fpga.
Separate the 2 register writes to only enable signals from the fpga to
reach emac0 or emac1 when ptp ref clock is not coming from the fpga.
Signed-off-by: Julien Beraud <julien.beraud@orolia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c
index 5b3b06a0a3bf5..33407df6bea69 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c
@@ -274,16 +274,19 @@ static int socfpga_dwmac_set_phy_mode(struct socfpga_dwmac *dwmac)
phymode == PHY_INTERFACE_MODE_MII ||
phymode == PHY_INTERFACE_MODE_GMII ||
phymode == PHY_INTERFACE_MODE_SGMII) {
- ctrl |= SYSMGR_EMACGRP_CTRL_PTP_REF_CLK_MASK << (reg_shift / 2);
regmap_read(sys_mgr_base_addr, SYSMGR_FPGAGRP_MODULE_REG,
&module);
module |= (SYSMGR_FPGAGRP_MODULE_EMAC << (reg_shift / 2));
regmap_write(sys_mgr_base_addr, SYSMGR_FPGAGRP_MODULE_REG,
module);
- } else {
- ctrl &= ~(SYSMGR_EMACGRP_CTRL_PTP_REF_CLK_MASK << (reg_shift / 2));
}
+ if (dwmac->f2h_ptp_ref_clk)
+ ctrl |= SYSMGR_EMACGRP_CTRL_PTP_REF_CLK_MASK << (reg_shift / 2);
+ else
+ ctrl &= ~(SYSMGR_EMACGRP_CTRL_PTP_REF_CLK_MASK <<
+ (reg_shift / 2));
+
regmap_write(sys_mgr_base_addr, reg_offset, ctrl);
/* Deassert reset for the phy configuration to be sampled by
--
2.20.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH AUTOSEL 4.19 17/30] net: stmmac: Fix sub-second increment
[not found] <20200430135325.20762-1-sashal@kernel.org>
` (2 preceding siblings ...)
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 16/30] net: stmmac: fix enabling socfpga's ptp_ref_clock Sasha Levin
@ 2020-04-30 13:53 ` Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 20/30] net/mlx5: Fix failing fw tracer allocation on s390 Sasha Levin
` (4 subsequent siblings)
8 siblings, 0 replies; 9+ messages in thread
From: Sasha Levin @ 2020-04-30 13:53 UTC (permalink / raw)
To: linux-kernel, stable; +Cc: Julien Beraud, David S . Miller, Sasha Levin, netdev
From: Julien Beraud <julien.beraud@orolia.com>
[ Upstream commit 91a2559c1dc5b0f7e1256d42b1508935e8eabfbf ]
In fine adjustement mode, which is the current default, the sub-second
increment register is the number of nanoseconds that will be added to
the clock when the accumulator overflows. At each clock cycle, the
value of the addend register is added to the accumulator.
Currently, we use 20ns = 1e09ns / 50MHz as this value whatever the
frequency of the ptp clock actually is.
The adjustment is then done on the addend register, only incrementing
every X clock cycles X being the ratio between 50MHz and ptp_clock_rate
(addend = 2^32 * 50MHz/ptp_clock_rate).
This causes the following issues :
- In case the frequency of the ptp clock is inferior or equal to 50MHz,
the addend value calculation will overflow and the default
addend value will be set to 0, causing the clock to not work at
all. (For instance, for ptp_clock_rate = 50MHz, addend = 2^32).
- The resolution of the timestamping clock is limited to 20ns while it
is not needed, thus limiting the accuracy of the timestamping to
20ns.
Fix this by setting sub-second increment to 2e09ns / ptp_clock_rate.
It will allow to reach the minimum possible frequency for
ptp_clk_ref, which is 5MHz for GMII 1000Mps Full-Duplex by setting the
sub-second-increment to a higher value. For instance, for 25MHz, it
gives ssinc = 80ns and default_addend = 2^31.
It will also allow to use a lower value for sub-second-increment, thus
improving the timestamping accuracy with frequencies higher than
100MHz, for instance, for 200MHz, ssinc = 10ns and default_addend =
2^31.
v1->v2:
- Remove modifications to the calculation of default addend, which broke
compatibility with clock frequencies for which 2000000000 / ptp_clk_freq
is not an integer.
- Modify description according to discussions.
Signed-off-by: Julien Beraud <julien.beraud@orolia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
.../net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
index 7423262ce5907..e1fbd7c81bfa9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c
@@ -36,12 +36,16 @@ static void config_sub_second_increment(void __iomem *ioaddr,
unsigned long data;
u32 reg_value;
- /* For GMAC3.x, 4.x versions, convert the ptp_clock to nano second
- * formula = (1/ptp_clock) * 1000000000
- * where ptp_clock is 50MHz if fine method is used to update system
+ /* For GMAC3.x, 4.x versions, in "fine adjustement mode" set sub-second
+ * increment to twice the number of nanoseconds of a clock cycle.
+ * The calculation of the default_addend value by the caller will set it
+ * to mid-range = 2^31 when the remainder of this division is zero,
+ * which will make the accumulator overflow once every 2 ptp_clock
+ * cycles, adding twice the number of nanoseconds of a clock cycle :
+ * 2000000000ULL / ptp_clock.
*/
if (value & PTP_TCR_TSCFUPDT)
- data = (1000000000ULL / 50000000);
+ data = (2000000000ULL / ptp_clock);
else
data = (1000000000ULL / ptp_clock);
--
2.20.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH AUTOSEL 4.19 20/30] net/mlx5: Fix failing fw tracer allocation on s390
[not found] <20200430135325.20762-1-sashal@kernel.org>
` (3 preceding siblings ...)
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 17/30] net: stmmac: Fix sub-second increment Sasha Levin
@ 2020-04-30 13:53 ` Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 21/30] cpumap: Avoid warning when CONFIG_DEBUG_PER_CPU_MAPS is enabled Sasha Levin
` (3 subsequent siblings)
8 siblings, 0 replies; 9+ messages in thread
From: Sasha Levin @ 2020-04-30 13:53 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Niklas Schnelle, Saeed Mahameed, Sasha Levin, netdev, linux-rdma
From: Niklas Schnelle <schnelle@linux.ibm.com>
[ Upstream commit a019b36123aec9700b21ae0724710f62928a8bc1 ]
On s390 FORCE_MAX_ZONEORDER is 9 instead of 11, thus a larger kzalloc()
allocation as done for the firmware tracer will always fail.
Looking at mlx5_fw_tracer_save_trace(), it is actually the driver itself
that copies the debug data into the trace array and there is no need for
the allocation to be contiguous in physical memory. We can therefor use
kvzalloc() instead of kzalloc() and get rid of the large contiguous
allcoation.
Fixes: f53aaa31cce7 ("net/mlx5: FW tracer, implement tracer logic")
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
index d4ec93bde4ded..2266c09b741a2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c
@@ -796,7 +796,7 @@ struct mlx5_fw_tracer *mlx5_fw_tracer_create(struct mlx5_core_dev *dev)
return NULL;
}
- tracer = kzalloc(sizeof(*tracer), GFP_KERNEL);
+ tracer = kvzalloc(sizeof(*tracer), GFP_KERNEL);
if (!tracer)
return ERR_PTR(-ENOMEM);
@@ -842,7 +842,7 @@ destroy_workqueue:
tracer->dev = NULL;
destroy_workqueue(tracer->work_queue);
free_tracer:
- kfree(tracer);
+ kvfree(tracer);
return ERR_PTR(err);
}
@@ -919,7 +919,7 @@ void mlx5_fw_tracer_destroy(struct mlx5_fw_tracer *tracer)
mlx5_fw_tracer_destroy_log_buf(tracer);
flush_workqueue(tracer->work_queue);
destroy_workqueue(tracer->work_queue);
- kfree(tracer);
+ kvfree(tracer);
}
void mlx5_fw_tracer_event(struct mlx5_core_dev *dev, struct mlx5_eqe *eqe)
--
2.20.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH AUTOSEL 4.19 21/30] cpumap: Avoid warning when CONFIG_DEBUG_PER_CPU_MAPS is enabled
[not found] <20200430135325.20762-1-sashal@kernel.org>
` (4 preceding siblings ...)
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 20/30] net/mlx5: Fix failing fw tracer allocation on s390 Sasha Levin
@ 2020-04-30 13:53 ` Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 28/30] net: bcmgenet: suppress warnings on failed Rx SKB allocations Sasha Levin
` (2 subsequent siblings)
8 siblings, 0 replies; 9+ messages in thread
From: Sasha Levin @ 2020-04-30 13:53 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Toke Høiland-Jørgensen, Xiumei Mu, Alexei Starovoitov,
Jesper Dangaard Brouer, Song Liu, Sasha Levin, netdev
From: Toke Høiland-Jørgensen <toke@redhat.com>
[ Upstream commit bc23d0e3f717ced21fbfacab3ab887d55e5ba367 ]
When the kernel is built with CONFIG_DEBUG_PER_CPU_MAPS, the cpumap code
can trigger a spurious warning if CONFIG_CPUMASK_OFFSTACK is also set. This
happens because in this configuration, NR_CPUS can be larger than
nr_cpumask_bits, so the initial check in cpu_map_alloc() is not sufficient
to guard against hitting the warning in cpumask_check().
Fix this by explicitly checking the supplied key against the
nr_cpumask_bits variable before calling cpu_possible().
Fixes: 6710e1126934 ("bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Xiumei Mu <xmu@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200416083120.453718-1-toke@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
kernel/bpf/cpumap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index 3c18260403dde..61fbcae82f0a1 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -455,7 +455,7 @@ static int cpu_map_update_elem(struct bpf_map *map, void *key, void *value,
return -EOVERFLOW;
/* Make sure CPU is a valid possible cpu */
- if (!cpu_possible(key_cpu))
+ if (key_cpu >= nr_cpumask_bits || !cpu_possible(key_cpu))
return -ENODEV;
if (qsize == 0) {
--
2.20.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH AUTOSEL 4.19 28/30] net: bcmgenet: suppress warnings on failed Rx SKB allocations
[not found] <20200430135325.20762-1-sashal@kernel.org>
` (5 preceding siblings ...)
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 21/30] cpumap: Avoid warning when CONFIG_DEBUG_PER_CPU_MAPS is enabled Sasha Levin
@ 2020-04-30 13:53 ` Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 29/30] net: systemport: " Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 30/30] bpf, x86_32: Fix incorrect encoding in BPF_LDX zero-extension Sasha Levin
8 siblings, 0 replies; 9+ messages in thread
From: Sasha Levin @ 2020-04-30 13:53 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Doug Berger, Florian Fainelli, David S . Miller, Sasha Levin,
netdev
From: Doug Berger <opendmb@gmail.com>
[ Upstream commit ecaeceb8a8a145d93c7e136f170238229165348f ]
The driver is designed to drop Rx packets and reclaim the buffers
when an allocation fails, and the network interface needs to safely
handle this packet loss. Therefore, an allocation failure of Rx
SKBs is relatively benign.
However, the output of the warning message occurs with a high
scheduling priority that can cause excessive jitter/latency for
other high priority processing.
This commit suppresses the warning messages to prevent scheduling
problems while retaining the failure count in the statistics of
the network interface.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/broadcom/genet/bcmgenet.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 789c206b515ed..89cc146d2c5c8 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -1699,7 +1699,8 @@ static struct sk_buff *bcmgenet_rx_refill(struct bcmgenet_priv *priv,
dma_addr_t mapping;
/* Allocate a new Rx skb */
- skb = netdev_alloc_skb(priv->dev, priv->rx_buf_len + SKB_ALIGNMENT);
+ skb = __netdev_alloc_skb(priv->dev, priv->rx_buf_len + SKB_ALIGNMENT,
+ GFP_ATOMIC | __GFP_NOWARN);
if (!skb) {
priv->mib.alloc_rx_buff_failed++;
netif_err(priv, rx_err, priv->dev,
--
2.20.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH AUTOSEL 4.19 29/30] net: systemport: suppress warnings on failed Rx SKB allocations
[not found] <20200430135325.20762-1-sashal@kernel.org>
` (6 preceding siblings ...)
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 28/30] net: bcmgenet: suppress warnings on failed Rx SKB allocations Sasha Levin
@ 2020-04-30 13:53 ` Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 30/30] bpf, x86_32: Fix incorrect encoding in BPF_LDX zero-extension Sasha Levin
8 siblings, 0 replies; 9+ messages in thread
From: Sasha Levin @ 2020-04-30 13:53 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Doug Berger, Florian Fainelli, David S . Miller, Sasha Levin,
netdev
From: Doug Berger <opendmb@gmail.com>
[ Upstream commit 3554e54a46125030c534820c297ed7f6c3907e24 ]
The driver is designed to drop Rx packets and reclaim the buffers
when an allocation fails, and the network interface needs to safely
handle this packet loss. Therefore, an allocation failure of Rx
SKBs is relatively benign.
However, the output of the warning message occurs with a high
scheduling priority that can cause excessive jitter/latency for
other high priority processing.
This commit suppresses the warning messages to prevent scheduling
problems while retaining the failure count in the statistics of
the network interface.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/broadcom/bcmsysport.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 3fdf135bad56e..6b761f6b8fd56 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -677,7 +677,8 @@ static struct sk_buff *bcm_sysport_rx_refill(struct bcm_sysport_priv *priv,
dma_addr_t mapping;
/* Allocate a new SKB for a new packet */
- skb = netdev_alloc_skb(priv->netdev, RX_BUF_LENGTH);
+ skb = __netdev_alloc_skb(priv->netdev, RX_BUF_LENGTH,
+ GFP_ATOMIC | __GFP_NOWARN);
if (!skb) {
priv->mib.alloc_rx_buff_failed++;
netif_err(priv, rx_err, ndev, "SKB alloc failed\n");
--
2.20.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH AUTOSEL 4.19 30/30] bpf, x86_32: Fix incorrect encoding in BPF_LDX zero-extension
[not found] <20200430135325.20762-1-sashal@kernel.org>
` (7 preceding siblings ...)
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 29/30] net: systemport: " Sasha Levin
@ 2020-04-30 13:53 ` Sasha Levin
8 siblings, 0 replies; 9+ messages in thread
From: Sasha Levin @ 2020-04-30 13:53 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Luke Nelson, Xi Wang, Luke Nelson, Alexei Starovoitov,
H . Peter Anvin, Wang YanQing, Sasha Levin, netdev
From: Luke Nelson <lukenels@cs.washington.edu>
[ Upstream commit 5fa9a98fb10380e48a398998cd36a85e4ef711d6 ]
The current JIT uses the following sequence to zero-extend into the
upper 32 bits of the destination register for BPF_LDX BPF_{B,H,W},
when the destination register is not on the stack:
EMIT3(0xC7, add_1reg(0xC0, dst_hi), 0);
The problem is that C7 /0 encodes a MOV instruction that requires a 4-byte
immediate; the current code emits only 1 byte of the immediate. This
means that the first 3 bytes of the next instruction will be treated as
the rest of the immediate, breaking the stream of instructions.
This patch fixes the problem by instead emitting "xor dst_hi,dst_hi"
to clear the upper 32 bits. This fixes the problem and is more efficient
than using MOV to load a zero immediate.
This bug may not be currently triggerable as BPF_REG_AX is the only
register not stored on the stack and the verifier uses it in a limited
way, and the verifier implements a zero-extension optimization. But the
JIT should avoid emitting incorrect encodings regardless.
Fixes: 03f5781be2c7b ("bpf, x86_32: add eBPF JIT compiler for ia32")
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Acked-by: Wang YanQing <udknight@gmail.com>
Link: https://lore.kernel.org/bpf/20200422173630.8351-1-luke.r.nels@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/x86/net/bpf_jit_comp32.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
index 24d573bc550d9..21df0b6d7be6e 100644
--- a/arch/x86/net/bpf_jit_comp32.c
+++ b/arch/x86/net/bpf_jit_comp32.c
@@ -1830,7 +1830,9 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
STACK_VAR(dst_hi));
EMIT(0x0, 4);
} else {
- EMIT3(0xC7, add_1reg(0xC0, dst_hi), 0);
+ /* xor dst_hi,dst_hi */
+ EMIT2(0x33,
+ add_2reg(0xC0, dst_hi, dst_hi));
}
break;
case BPF_DW:
--
2.20.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2020-04-30 14:02 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20200430135325.20762-1-sashal@kernel.org>
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 13/30] svcrdma: Fix trace point use-after-free race Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 15/30] wimax/i2400m: Fix potential urb refcnt leak Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 16/30] net: stmmac: fix enabling socfpga's ptp_ref_clock Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 17/30] net: stmmac: Fix sub-second increment Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 20/30] net/mlx5: Fix failing fw tracer allocation on s390 Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 21/30] cpumap: Avoid warning when CONFIG_DEBUG_PER_CPU_MAPS is enabled Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 28/30] net: bcmgenet: suppress warnings on failed Rx SKB allocations Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 29/30] net: systemport: " Sasha Levin
2020-04-30 13:53 ` [PATCH AUTOSEL 4.19 30/30] bpf, x86_32: Fix incorrect encoding in BPF_LDX zero-extension Sasha Levin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).