From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eli Cohen <elic@nvidia.com>
Cc: eperezma@redhat.com, virtualization@lists.linux-foundation.org
Subject: Re: [PATCH] virtio_net: Abort driver initialization if device fails
Date: Sun, 28 Aug 2022 16:07:48 -0400 [thread overview]
Message-ID: <20220828160106-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20220828154820.458007-1-elic@nvidia.com>
On Sun, Aug 28, 2022 at 06:48:20PM +0300, Eli Cohen wrote:
> Read the status bit after virtio_device_ready() to check if device
> initialization was successful. If it was not, abort driver
> initialization to avoid further attempts to access device resources.
>
> Abort is required per virtio spec v1.1
>
> 3.1.1
> ...
> If any of these steps go irrecoverably wrong, the driver SHOULD set the
> FAILED status bit to indicate that it has given up on the device (it can
> reset the device later to restart if desired). The driver MUST NOT
> continue initialization in that case.
I don't see a requirement to read the status bit though
which is what the patch does.
> This fixes an issue that was discovered when mlx5_vdpa initialization
> failed due to firmware error
A bit more detail would be nice. which function failed exactly?
> and subsequent attempts to send control VQ
> commands failed with a call trace:
>
> watchdog: BUG: soft lockup - CPU#62 stuck for 26s! [systemd-udevd:2610]
> Modules linked in: virtio_net(+) net_failover failover virtio_vdpa mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib ib_uverbs ib_core mlx5_core mlxfw psample tls pci>
> drm ghash_clmulni_intel serio_raw usb_storage scsi_transport_sas hpwdt wmi target_core_mod [last unloaded: ib_core]
> CPU: 62 PID: 2610 Comm: systemd-udevd Tainted: G I 6.0.0-rc2+ #7
> Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 04/08/2020
> RIP: 0010:virtnet_send_command+0xfa/0x140 [virtio_net]
> Code: ec f0 80 e0 85 c0 0f 88 0c 68 00 00 48 8b 7b 08 e8 9b d6 80 e0 84 c0 75 11 eb 43 48 8b 7b 08 e8 7c c3 80 e0 84 c0 75 15 f3 90 <48> 8b 7b 08 48 8d 74 24 >
> RSP: 0018:ffffb6fba2037b78 EFLAGS: 00000246
> RAX: 0000000000000000 RBX: ffff99c6043a99c0 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffb6fba2037b7c RDI: ffff99c6042c6500
> RBP: ffffb6fba2037bc0 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000003 R11: 0000000000000002 R12: 0000000000000002
> R13: 0000000000000004 R14: 0000000000000000 R15: ffff99f634258400
> FS: 00007f5894eefb40(0000) GS:ffff99dd9fb80000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000055acd4a80000 CR3: 00000018605a4002 CR4: 00000000007706e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
> <TASK>
> _virtnet_set_queues+0x7f/0xc0 [virtio_net]
> virtnet_probe+0x989/0xae0 [virtio_net]
> virtio_dev_probe+0x1ab/0x260
> really_probe+0xde/0x390
> ? pm_runtime_barrier+0x50/0x90
> __driver_probe_device+0x78/0x180
> driver_probe_device+0x1e/0x90
> __driver_attach+0xc4/0x1e0
> ? __device_attach_driver+0xe0/0xe0
> ? __device_attach_driver+0xe0/0xe0
> bus_for_each_dev+0x61/0x90
> bus_add_driver+0x1a9/0x200
> driver_register+0x8f/0xf0
> ? 0xffffffffc0fdb000
> virtio_net_driver_init+0x70/0x1000 [virtio_net]
> do_one_initcall+0x41/0x210
> ? kmem_cache_alloc_trace+0x16d/0x2c0
> do_init_module+0x4c/0x1f0
> __do_sys_finit_module+0x9f/0x100
> do_syscall_64+0x38/0x90
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7f589510aecd
> Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 >
> RSP: 002b:00007ffde61fb398 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> RAX: ffffffffffffffda RBX: 000055acd48ac950 RCX: 00007f589510aecd
> RDX: 0000000000000000 RSI: 00007f589584332c RDI: 0000000000000012
> RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
> R10: 0000000000000012 R11: 0000000000000246 R12: 00007f589584332c
> R13: 000055acd4867c70 R14: 0000000000000007 R15: 000055acd48980a0
> </TASK>
>
> Fixes: commit 4baf1e33d084 ("virtio_net: enable VQs early")
> Signed-off-by: Eli Cohen <elic@nvidia.com>
> ---
> drivers/net/virtio_net.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 9cce7dec7366..4698d9a28a6f 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3900,6 +3900,11 @@ static int virtnet_probe(struct virtio_device *vdev)
> }
>
> virtio_device_ready(vdev);
> + if (vdev->config->get_status(vdev) & VIRTIO_CONFIG_S_FAILED) {
> + err = -EINVAL;
> + rtnl_unlock();
> + goto unregister_ndev;
> + }
>
> rtnl_unlock();
>
I don't get it. What set the failed status?
> @@ -3934,7 +3939,7 @@ static int virtnet_probe(struct virtio_device *vdev)
>
> free_unregister_netdev:
> virtio_reset_device(vdev);
> -
> +unregister_ndev:
> unregister_netdev(dev);
> free_failover:
> net_failover_destroy(vi->failover);
> --
> 2.35.1
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
next parent reply other threads:[~2022-08-28 20:08 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20220828154820.458007-1-elic@nvidia.com>
2022-08-28 20:07 ` Michael S. Tsirkin [this message]
2022-08-29 0:33 ` [PATCH] virtio_net: Abort driver initialization if device fails Jason Wang
[not found] ` <DM8PR12MB54001542CF00A017738C6BC3AB769@DM8PR12MB5400.namprd12.prod.outlook.com>
2022-08-31 7:03 ` Jason Wang
[not found] ` <DM8PR12MB54007A124E19FAF1AEA1F9C3AB769@DM8PR12MB5400.namprd12.prod.outlook.com>
2022-08-31 12:27 ` Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220828160106-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=elic@nvidia.com \
--cc=eperezma@redhat.com \
--cc=virtualization@lists.linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).