From: "Michael S. Tsirkin" <mst@redhat.com>
To: Dragos Tatulea <dtatulea@nvidia.com>
Cc: Jason Wang <jasowang@redhat.com>,
Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
Saeed Mahameed <saeedm@nvidia.com>,
stable@vger.kernel.org,
virtualization@lists.linux-foundation.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] vdpa/mlx5: Fix crash on shutdown for when no ndev exists
Date: Wed, 26 Jul 2023 15:26:02 -0400
Message-ID: <20230726152258-mutt-send-email-mst@kernel.org>
In-Reply-To: <20230726190744.14143-1-dtatulea@nvidia.com>
On Wed, Jul 26, 2023 at 10:07:38PM +0300, Dragos Tatulea wrote:
> The ndev was accessed on shutdown without a check if it actually exists.
> This triggered the crash pasted below. This patch simply adds a check
> before using ndev.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000300
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP
> CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 6.5.0-rc2_for_upstream_min_debug_2023_07_17_15_05 #1
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> RIP: 0010:mlx5v_shutdown+0xe/0x50 [mlx5_vdpa]
> RSP: 0018:ffff8881003bfdc0 EFLAGS: 00010286
> RAX: ffff888103befba0 RBX: ffff888109d28008 RCX: 0000000000000017
> RDX: 0000000000000001 RSI: 0000000000000212 RDI: ffff888109d28000
> RBP: 0000000000000000 R08: 0000000d3a3a3882 R09: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff888109d28000
> R13: ffff888109d28080 R14: 00000000fee1dead R15: 0000000000000000
> FS: 00007f4969e0be40(0000) GS:ffff88852c800000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000300 CR3: 00000001051cd006 CR4: 0000000000370eb0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> ? __die+0x20/0x60
> ? page_fault_oops+0x14c/0x3c0
> ? exc_page_fault+0x75/0x140
> ? asm_exc_page_fault+0x22/0x30
> ? mlx5v_shutdown+0xe/0x50 [mlx5_vdpa]
> device_shutdown+0x13e/0x1e0
> kernel_restart+0x36/0x90
> __do_sys_reboot+0x141/0x210
> ? vfs_writev+0xcd/0x140
> ? handle_mm_fault+0x161/0x260
> ? do_writev+0x6b/0x110
> do_syscall_64+0x3d/0x90
> entry_SYSCALL_64_after_hwframe+0x46/0xb0
> RIP: 0033:0x7f496990fb56
> RSP: 002b:00007fffc7bdde88 EFLAGS: 00000206 ORIG_RAX: 00000000000000a9
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f496990fb56
> RDX: 0000000001234567 RSI: 0000000028121969 RDI: fffffffffee1dead
> RBP: 00007fffc7bde1d0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
> R13: 00007fffc7bddf10 R14: 0000000000000000 R15: 00007fffc7bde2b8
> </TASK>
> CR2: 0000000000000300
> ---[ end trace 0000000000000000 ]---
>
> Fixes: bc9a2b3e686e ("vdpa/mlx5: Support interrupt bypassing")
> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> ---
> drivers/vdpa/mlx5/net/mlx5_vnet.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 9138ef2fb2c8..e2e7ebd71798 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -3556,7 +3556,8 @@ static void mlx5v_shutdown(struct auxiliary_device *auxdev)
>  	mgtdev = auxiliary_get_drvdata(auxdev);
>  	ndev = mgtdev->ndev;
>  
> -	free_irqs(ndev);
> +	if (ndev)
> +		free_irqs(ndev);
>  }
>
Something I don't get:
IRQs are allocated in mlx5_vdpa_dev_add,
so why are they not freed in mlx5_vdpa_dev_del?
This is what's creating all this mess.
> static const struct auxiliary_device_id mlx5v_id_table[] = {
> --
> 2.41.0
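
The question above is about pairing: whatever the add path allocates, the delete path would normally release, leaving shutdown with little to do. A minimal userspace sketch of that idea follows. It is not the driver's code: the `model_*` names and the struct layouts are hypothetical stand-ins, and the sketch only assumes what the thread states (ndev may be absent at shutdown, and IRQs are allocated when the device is added).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's structures, not the real layout. */
struct model_ndev {
	int nirqs;	/* number of IRQ entries currently held */
	int *irqs;	/* stand-in for the IRQ pool */
};

struct model_mgtdev {
	struct model_ndev *ndev;	/* NULL until a device has been added */
};

/* Release the IRQ pool. Resetting the fields makes a second call harmless. */
static void model_free_irqs(struct model_ndev *ndev)
{
	free(ndev->irqs);
	ndev->irqs = NULL;
	ndev->nirqs = 0;
}

/* Models the add path: create ndev and allocate its IRQs. */
static int model_dev_add(struct model_mgtdev *mgtdev, int nirqs)
{
	struct model_ndev *ndev;

	if (mgtdev->ndev || nirqs <= 0)
		return -1;

	ndev = calloc(1, sizeof(*ndev));
	if (!ndev)
		return -1;

	ndev->irqs = calloc((size_t)nirqs, sizeof(*ndev->irqs));
	if (!ndev->irqs) {
		free(ndev);
		return -1;
	}

	ndev->nirqs = nirqs;
	mgtdev->ndev = ndev;
	return 0;
}

/* Models the delete path as suggested: free what the add path allocated. */
static void model_dev_del(struct model_mgtdev *mgtdev)
{
	struct model_ndev *ndev = mgtdev->ndev;

	if (!ndev)
		return;

	model_free_irqs(ndev);
	free(ndev);
	mgtdev->ndev = NULL;
}

/*
 * Models shutdown with the guard from the patch: without the NULL check,
 * model_free_irqs() would dereference a NULL ndev when no device exists.
 * Returns true if IRQs were released, false if there was no ndev.
 */
static bool model_shutdown(struct model_mgtdev *mgtdev)
{
	struct model_ndev *ndev = mgtdev->ndev;

	if (!ndev)
		return false;

	model_free_irqs(ndev);
	return true;
}
```

In this model, calling `model_shutdown()` before any `model_dev_add()` returns false instead of faulting, and calling it after `model_dev_del()` does the same, because the delete path already released the IRQs and cleared the pointer. Whether the real driver can adopt that pairing is exactly what the question asks; the sketch takes no position on it.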