From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Tariq Toukan <tariqt@nvidia.com>,
	Moshe Shemesh <moshe@nvidia.com>,
	Saeed Mahameed <saeedm@nvidia.com>,
	Jakub Kicinski <kuba@kernel.org>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.4 62/64] net/mlx5: Fix possible use-after-free in async command interface
Date: Wed,  2 Nov 2022 03:34:28 +0100
Message-ID: <20221102022053.820256280@linuxfoundation.org>
In-Reply-To: <20221102022051.821538553@linuxfoundation.org>

From: Tariq Toukan <tariqt@nvidia.com>

[ Upstream commit bacd22df95147ed673bec4692ab2d4d585935241 ]

mlx5_cmd_cleanup_async_ctx should return only after all its callback
handlers have completed. Before this patch, the following race between
mlx5_cmd_cleanup_async_ctx and mlx5_cmd_exec_cb_handler was possible and
could lead to a use-after-free:

1. mlx5_cmd_cleanup_async_ctx is called while num_inflight is 2 (i.e.
   elevated by 1, a single inflight callback).
2. mlx5_cmd_cleanup_async_ctx decreases num_inflight to 1.
3. mlx5_cmd_exec_cb_handler is called, decreases num_inflight to 0 and
   is about to call wake_up().
4. mlx5_cmd_cleanup_async_ctx calls wait_event, which returns
   immediately as the condition (num_inflight == 0) holds.
5. mlx5_cmd_cleanup_async_ctx returns.
6. The caller of mlx5_cmd_cleanup_async_ctx frees the mlx5_async_ctx
   object.
7. mlx5_cmd_exec_cb_handler goes on and calls wake_up() on the freed
   object.

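Fix it by synchronizing with a completion object instead of a wait
queue. The completion is signaled when num_inflight reaches 0, and
mlx5_cmd_cleanup_async_ctx waits on it only if its own decrement did
not bring the counter to 0.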
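
For illustration only, the same handoff can be sketched in userspace C
with pthreads. This is not the driver code: the demo_* names, NUM_CMDS
and run_demo() are hypothetical, and a mutex plus condition variable
stands in for the kernel's struct completion. The point it tries to
show is that whichever side drops the counter to 0 is the only one
that touches the context afterwards, and the cleanup side cannot
return while the signaling side still holds the completion's lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define NUM_CMDS 4	/* hypothetical number of async commands */

/* Userspace stand-in for the kernel's struct completion. */
struct demo_completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

struct demo_async_ctx {
	atomic_int num_inflight;	/* starts at 1, like the driver's counter */
	struct demo_completion inflight_done;
};

static atomic_int callbacks_run;	/* lives outside ctx on purpose */

static void demo_complete(struct demo_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void demo_wait_for_completion(struct demo_completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* True only for the caller whose decrement brings the counter to 0. */
static bool demo_dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) == 1;
}

/* Plays the role of the callback handler. */
static void *demo_handler(void *arg)
{
	struct demo_async_ctx *ctx = arg;

	atomic_fetch_add(&callbacks_run, 1);	/* the "user callback" */
	if (demo_dec_and_test(&ctx->num_inflight))
		demo_complete(&ctx->inflight_done);
	return NULL;	/* ctx is not touched past this point */
}

/* Plays the role of the cleanup function: drop the base reference,
 * and wait only if some handler still has to run. */
static void demo_cleanup(struct demo_async_ctx *ctx)
{
	if (!demo_dec_and_test(&ctx->num_inflight))
		demo_wait_for_completion(&ctx->inflight_done);
}

/* Returns the number of callbacks that ran before cleanup returned. */
int run_demo(void)
{
	struct demo_async_ctx *ctx = malloc(sizeof(*ctx));
	pthread_t threads[NUM_CMDS];
	int started = 0;

	assert(ctx != NULL);
	atomic_store(&callbacks_run, 0);
	atomic_init(&ctx->num_inflight, 1);
	pthread_mutex_init(&ctx->inflight_done.lock, NULL);
	pthread_cond_init(&ctx->inflight_done.cond, NULL);
	ctx->inflight_done.done = false;

	for (int i = 0; i < NUM_CMDS; i++) {
		atomic_fetch_add(&ctx->num_inflight, 1);
		if (pthread_create(&threads[started], NULL, demo_handler, ctx) == 0)
			started++;
		else	/* cannot reach 0 here: the base reference is still held */
			atomic_fetch_sub(&ctx->num_inflight, 1);
	}

	demo_cleanup(ctx);
	pthread_cond_destroy(&ctx->inflight_done.cond);
	pthread_mutex_destroy(&ctx->inflight_done.lock);
	free(ctx);	/* no handler references ctx any more */

	for (int i = 0; i < started; i++)
		pthread_join(threads[i], NULL);
	return atomic_load(&callbacks_run);
}
```

Calling run_demo() from a small main() and building with -pthread
(and, if available, a thread or address sanitizer) is one way to
exercise the ordering; the sketch frees the context before joining the
threads precisely to mimic the caller in step 6 above.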

Trace:

BUG: KASAN: use-after-free in do_raw_spin_lock+0x23d/0x270
Read of size 4 at addr ffff888139cd12f4 by task swapper/5/0

CPU: 5 PID: 0 Comm: swapper/5 Not tainted 6.0.0-rc3_for_upstream_debug_2022_08_30_13_10 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
 <IRQ>
 dump_stack_lvl+0x57/0x7d
 print_report.cold+0x2d5/0x684
 ? do_raw_spin_lock+0x23d/0x270
 kasan_report+0xb1/0x1a0
 ? do_raw_spin_lock+0x23d/0x270
 do_raw_spin_lock+0x23d/0x270
 ? rwlock_bug.part.0+0x90/0x90
 ? __delete_object+0xb8/0x100
 ? lock_downgrade+0x6e0/0x6e0
 _raw_spin_lock_irqsave+0x43/0x60
 ? __wake_up_common_lock+0xb9/0x140
 __wake_up_common_lock+0xb9/0x140
 ? __wake_up_common+0x650/0x650
 ? destroy_tis_callback+0x53/0x70 [mlx5_core]
 ? kasan_set_track+0x21/0x30
 ? destroy_tis_callback+0x53/0x70 [mlx5_core]
 ? kfree+0x1ba/0x520
 ? do_raw_spin_unlock+0x54/0x220
 mlx5_cmd_exec_cb_handler+0x136/0x1a0 [mlx5_core]
 ? mlx5_cmd_cleanup_async_ctx+0x220/0x220 [mlx5_core]
 ? mlx5_cmd_cleanup_async_ctx+0x220/0x220 [mlx5_core]
 mlx5_cmd_comp_handler+0x65a/0x12b0 [mlx5_core]
 ? dump_command+0xcc0/0xcc0 [mlx5_core]
 ? lockdep_hardirqs_on_prepare+0x400/0x400
 ? cmd_comp_notifier+0x7e/0xb0 [mlx5_core]
 cmd_comp_notifier+0x7e/0xb0 [mlx5_core]
 atomic_notifier_call_chain+0xd7/0x1d0
 mlx5_eq_async_int+0x3ce/0xa20 [mlx5_core]
 atomic_notifier_call_chain+0xd7/0x1d0
 ? irq_release+0x140/0x140 [mlx5_core]
 irq_int_handler+0x19/0x30 [mlx5_core]
 __handle_irq_event_percpu+0x1f2/0x620
 handle_irq_event+0xb2/0x1d0
 handle_edge_irq+0x21e/0xb00
 __common_interrupt+0x79/0x1a0
 common_interrupt+0x78/0xa0
 </IRQ>
 <TASK>
 asm_common_interrupt+0x22/0x40
RIP: 0010:default_idle+0x42/0x60
Code: c1 83 e0 07 48 c1 e9 03 83 c0 03 0f b6 14 11 38 d0 7c 04 84 d2 75 14 8b 05 eb 47 22 02 85 c0 7e 07 0f 00 2d e0 9f 48 00 fb f4 <c3> 48 c7 c7 80 08 7f 85 e8 d1 d3 3e fe eb de 66 66 2e 0f 1f 84 00
RSP: 0018:ffff888100dbfdf0 EFLAGS: 00000242
RAX: 0000000000000001 RBX: ffffffff84ecbd48 RCX: 1ffffffff0afe110
RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff835cc9bc
RBP: 0000000000000005 R08: 0000000000000001 R09: ffff88881dec4ac3
R10: ffffed1103bd8958 R11: 0000017d0ca571c9 R12: 0000000000000005
R13: ffffffff84f024e0 R14: 0000000000000000 R15: dffffc0000000000
 ? default_idle_call+0xcc/0x450
 default_idle_call+0xec/0x450
 do_idle+0x394/0x450
 ? arch_cpu_idle_exit+0x40/0x40
 ? do_idle+0x17/0x450
 cpu_startup_entry+0x19/0x20
 start_secondary+0x221/0x2b0
 ? set_cpu_sibling_map+0x2070/0x2070
 secondary_startup_64_no_verify+0xcd/0xdb
 </TASK>

Allocated by task 49502:
 kasan_save_stack+0x1e/0x40
 __kasan_kmalloc+0x81/0xa0
 kvmalloc_node+0x48/0xe0
 mlx5e_bulk_async_init+0x35/0x110 [mlx5_core]
 mlx5e_tls_priv_tx_list_cleanup+0x84/0x3e0 [mlx5_core]
 mlx5e_ktls_cleanup_tx+0x38f/0x760 [mlx5_core]
 mlx5e_cleanup_nic_tx+0xa7/0x100 [mlx5_core]
 mlx5e_detach_netdev+0x1ca/0x2b0 [mlx5_core]
 mlx5e_suspend+0xdb/0x140 [mlx5_core]
 mlx5e_remove+0x89/0x190 [mlx5_core]
 auxiliary_bus_remove+0x52/0x70
 device_release_driver_internal+0x40f/0x650
 driver_detach+0xc1/0x180
 bus_remove_driver+0x125/0x2f0
 auxiliary_driver_unregister+0x16/0x50
 mlx5e_cleanup+0x26/0x30 [mlx5_core]
 cleanup+0xc/0x4e [mlx5_core]
 __x64_sys_delete_module+0x2b5/0x450
 do_syscall_64+0x3d/0x90
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

Freed by task 49502:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 kasan_set_free_info+0x20/0x30
 ____kasan_slab_free+0x11d/0x1b0
 kfree+0x1ba/0x520
 mlx5e_tls_priv_tx_list_cleanup+0x2e7/0x3e0 [mlx5_core]
 mlx5e_ktls_cleanup_tx+0x38f/0x760 [mlx5_core]
 mlx5e_cleanup_nic_tx+0xa7/0x100 [mlx5_core]
 mlx5e_detach_netdev+0x1ca/0x2b0 [mlx5_core]
 mlx5e_suspend+0xdb/0x140 [mlx5_core]
 mlx5e_remove+0x89/0x190 [mlx5_core]
 auxiliary_bus_remove+0x52/0x70
 device_release_driver_internal+0x40f/0x650
 driver_detach+0xc1/0x180
 bus_remove_driver+0x125/0x2f0
 auxiliary_driver_unregister+0x16/0x50
 mlx5e_cleanup+0x26/0x30 [mlx5_core]
 cleanup+0xc/0x4e [mlx5_core]
 __x64_sys_delete_module+0x2b5/0x450
 do_syscall_64+0x3d/0x90
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fixes: e355477ed9e4 ("net/mlx5: Make mlx5_cmd_exec_cb() a safe API")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-8-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 10 +++++-----
 include/linux/mlx5/driver.h                   |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 6c7b364d0bf0..4fdc97304f69 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1850,7 +1850,7 @@ void mlx5_cmd_init_async_ctx(struct mlx5_core_dev *dev,
 	ctx->dev = dev;
 	/* Starts at 1 to avoid doing wake_up if we are not cleaning up */
 	atomic_set(&ctx->num_inflight, 1);
-	init_waitqueue_head(&ctx->wait);
+	init_completion(&ctx->inflight_done);
 }
 EXPORT_SYMBOL(mlx5_cmd_init_async_ctx);
 
@@ -1864,8 +1864,8 @@ EXPORT_SYMBOL(mlx5_cmd_init_async_ctx);
  */
 void mlx5_cmd_cleanup_async_ctx(struct mlx5_async_ctx *ctx)
 {
-	atomic_dec(&ctx->num_inflight);
-	wait_event(ctx->wait, atomic_read(&ctx->num_inflight) == 0);
+	if (!atomic_dec_and_test(&ctx->num_inflight))
+		wait_for_completion(&ctx->inflight_done);
 }
 EXPORT_SYMBOL(mlx5_cmd_cleanup_async_ctx);
 
@@ -1876,7 +1876,7 @@ static void mlx5_cmd_exec_cb_handler(int status, void *_work)
 
 	work->user_callback(status, work);
 	if (atomic_dec_and_test(&ctx->num_inflight))
-		wake_up(&ctx->wait);
+		complete(&ctx->inflight_done);
 }
 
 int mlx5_cmd_exec_cb(struct mlx5_async_ctx *ctx, void *in, int in_size,
@@ -1892,7 +1892,7 @@ int mlx5_cmd_exec_cb(struct mlx5_async_ctx *ctx, void *in, int in_size,
 	ret = cmd_exec(ctx->dev, in, in_size, out, out_size,
 		       mlx5_cmd_exec_cb_handler, work, false);
 	if (ret && atomic_dec_and_test(&ctx->num_inflight))
-		wake_up(&ctx->wait);
+		complete(&ctx->inflight_done);
 
 	return ret;
 }
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 2b65ffb3bd76..3a19b9202a12 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -904,7 +904,7 @@ void mlx5_cmd_allowed_opcode(struct mlx5_core_dev *dev, u16 opcode);
 struct mlx5_async_ctx {
 	struct mlx5_core_dev *dev;
 	atomic_t num_inflight;
-	struct wait_queue_head wait;
+	struct completion inflight_done;
 };
 
 struct mlx5_async_work;
-- 
2.35.1



