From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
Jack Morgenstein <jackm@dev.mellanox.co.il>,
Tariq Toukan <tariqt@mellanox.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.14 19/34] net/mlx4_core: Fix reset flow when in command polling mode
Date: Mon, 18 Mar 2019 10:25:43 +0100 [thread overview]
Message-ID: <20190318084147.344341912@linuxfoundation.org> (raw)
In-Reply-To: <20190318084144.657740413@linuxfoundation.org>
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jack Morgenstein <jackm@dev.mellanox.co.il>
[ Upstream commit e15ce4b8d11227007577e6dc1364d288b8874fbe ]
As part of unloading a device, the driver switches from
FW command event mode to FW command polling mode.
Part of switching over to polling mode is freeing the command context array
memory (unfortunately, currently, without NULLing the command context array
pointer).
The reset flow calls "complete" to complete all outstanding fw commands
(if we are in event mode). The check for event vs. polling mode here
is to test if the command context array pointer is NULL.
If the reset flow is activated after the switch to polling mode, it will
attempt (incorrectly) to complete all the commands in the context array --
because the pointer was not NULLed when the driver switched over to polling
mode.
As a result, we have a use-after-free situation, which results in a
kernel crash.
For example:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff876c4a8e>] __wake_up_common+0x2e/0x90
PGD 0
Oops: 0000 [#1] SMP
Modules linked in: netconsole nfsv3 nfs_acl nfs lockd grace ...
CPU: 2 PID: 940 Comm: kworker/2:3 Kdump: loaded Not tainted 3.10.0-862.el7.x86_64 #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 04/28/2016
Workqueue: events hv_eject_device_work [pci_hyperv]
task: ffff8d1734ca0fd0 ti: ffff8d17354bc000 task.ti: ffff8d17354bc000
RIP: 0010:[<ffffffff876c4a8e>] [<ffffffff876c4a8e>] __wake_up_common+0x2e/0x90
RSP: 0018:ffff8d17354bfa38 EFLAGS: 00010082
RAX: 0000000000000000 RBX: ffff8d17362d42c8 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff8d17362d42c8
RBP: ffff8d17354bfa70 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000298 R11: ffff8d173610e000 R12: ffff8d17362d42d0
R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000003
FS: 0000000000000000(0000) GS:ffff8d1802680000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000000f16d8000 CR4: 00000000001406e0
Call Trace:
[<ffffffff876c7adc>] complete+0x3c/0x50
[<ffffffffc04242f0>] mlx4_cmd_wake_completions+0x70/0x90 [mlx4_core]
[<ffffffffc041e7b1>] mlx4_enter_error_state+0xe1/0x380 [mlx4_core]
[<ffffffffc041fa4b>] mlx4_comm_cmd+0x29b/0x360 [mlx4_core]
[<ffffffffc041ff51>] __mlx4_cmd+0x441/0x920 [mlx4_core]
[<ffffffff877f62b1>] ? __slab_free+0x81/0x2f0
[<ffffffff87951384>] ? __radix_tree_lookup+0x84/0xf0
[<ffffffffc043a8eb>] mlx4_free_mtt_range+0x5b/0xb0 [mlx4_core]
[<ffffffffc043a957>] mlx4_mtt_cleanup+0x17/0x20 [mlx4_core]
[<ffffffffc04272c7>] mlx4_free_eq+0xa7/0x1c0 [mlx4_core]
[<ffffffffc042803e>] mlx4_cleanup_eq_table+0xde/0x130 [mlx4_core]
[<ffffffffc0433e08>] mlx4_unload_one+0x118/0x300 [mlx4_core]
[<ffffffffc0434191>] mlx4_remove_one+0x91/0x1f0 [mlx4_core]
The fix is to set the command context array pointer to NULL after freeing
the array.
Fixes: f5aef5aa3506 ("net/mlx4_core: Activate reset flow upon fatal command cases")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/net/ethernet/mellanox/mlx4/cmd.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -2686,6 +2686,7 @@ void mlx4_cmd_use_polling(struct mlx4_de
down(&priv->cmd.event_sem);
kfree(priv->cmd.context);
+ priv->cmd.context = NULL;
up(&priv->cmd.poll_sem);
up_write(&priv->cmd.switch_sem);
next prev parent reply other threads:[~2019-03-18 9:40 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-03-18 9:25 [PATCH 4.14 00/34] 4.14.107-stable review Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 01/34] ACPICA: Reference Counts: increase max to 0x4000 for large servers Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 02/34] perf tools: Fix compile error with libunwind x86 Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 03/34] gro_cells: make sure device is up in gro_cells_receive() Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 04/34] ipv4/route: fail early when inet dev is missing Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 05/34] l2tp: fix infoleak in l2tp_ip6_recvmsg() Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 06/34] net: hsr: fix memory leak in hsr_dev_finalize() Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 07/34] net/hsr: fix possible crash in add_timer() Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 08/34] net: sit: fix UBSAN Undefined behaviour in check_6rd Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 09/34] net/x25: fix use-after-free in x25_device_event() Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 10/34] net/x25: reset state in x25_connect() Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 11/34] pptp: dst_release sk_dst_cache in pptp_sock_destruct Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 12/34] ravb: Decrease TxFIFO depth of Q3 and Q2 to one Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 13/34] route: set the deleted fnhe fnhe_daddr to 0 in ip_del_fnhe to fix a race Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 14/34] rxrpc: Fix client call queueing, waiting for channel Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 15/34] tcp: Dont access TCP_SKB_CB before initializing it Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 16/34] tcp: handle inet_csk_reqsk_queue_add() failures Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 17/34] vxlan: Fix GRO cells race condition between receive and link delete Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 18/34] vxlan: test dev->flags & IFF_UP before calling gro_cells_receive() Greg Kroah-Hartman
2019-03-18 9:25 ` Greg Kroah-Hartman [this message]
2019-03-18 9:25 ` [PATCH 4.14 20/34] net/mlx4_core: Fix locking in SRIOV mode when switching between events and polling Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 21/34] net/mlx4_core: Fix qp mtt size calculation Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 22/34] net/x25: fix a race in x25_bind() Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 23/34] mdio_bus: Fix use-after-free on device_register fails Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 24/34] net: Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255 Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 25/34] bonding: fix PACKET_ORIGDEV regression Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 26/34] missing barriers in some of unix_sock ->addr and ->path accesses Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 27/34] ipvlan: disallow userns cap_net_admin to change global mode/flags Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 28/34] perf/x86: Fixup typo in stub functions Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 29/34] ALSA: bebob: use more identical mod_alias for Saffire Pro 10 I/O against Liquid Saffire 56 Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 30/34] ALSA: firewire-motu: fix construction of PCM frame for capture direction Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 31/34] perf/x86/intel: Fix memory corruption Greg Kroah-Hartman
2019-03-18 18:20 ` DSouza, Nelson
2019-03-18 20:29 ` DSouza, Nelson
2019-03-19 12:20 ` Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 32/34] perf/x86/intel: Make dev_attr_allow_tsx_force_abort static Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 33/34] Its wrong to add len to sector_nr in raid10 reshape twice Greg Kroah-Hartman
2019-03-18 9:25 ` [PATCH 4.14 34/34] vhost/vsock: fix vhost vsock cid hashing inconsistent Greg Kroah-Hartman
2019-03-18 13:22 ` [PATCH 4.14 00/34] 4.14.107-stable review kernelci.org bot
2019-03-18 16:27 ` Naresh Kamboju
2019-03-19 2:25 ` Guenter Roeck
2019-03-19 10:33 ` Jon Hunter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190318084147.344341912@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=davem@davemloft.net \
--cc=jackm@dev.mellanox.co.il \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=tariqt@mellanox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox