From mboxrd@z Thu Jan 1 00:00:00 1970
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
MIME-Version: 1.0
References: <20240115012918.3081203-1-yanjun.zhu@intel.com> <667a9520-a53f-40a2-810a-6c1e45146589@linux.dev> <7dd89fc0-f31e-4f83-9c02-58ee67c2d436@linux.alibaba.com>
In-Reply-To: <7dd89fc0-f31e-4f83-9c02-58ee67c2d436@linux.alibaba.com>
From: Jason Wang
Date: Mon, 22 Jan 2024 11:08:24 +0800
Subject: Re: [PATCH 1/1] virtio_net: Add timeout handler to avoid kernel hang
To: Heng Qi
Cc: Zhu Yanjun, Paolo Abeni, Zhu Yanjun, mst@redhat.com, xuanzhuo@linux.alibaba.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, virtualization@lists.linux.dev, netdev@vger.kernel.org
Content-Type: text/plain; charset="UTF-8"

On Fri, Jan 19, 2024 at 10:27 PM Heng Qi wrote:
>
>
>
> On 2024/1/18 at 8:01 PM, Zhu Yanjun wrote:
> >
> > On 2024/1/16 at 20:04, Paolo Abeni wrote:
> >> On Mon, 2024-01-15 at 09:29 +0800, Zhu Yanjun wrote:
> >>> From: Zhu Yanjun
> >>>
> >>> Some devices emulate the virtio_net hardware. When the virtio_net
> >>> driver sends commands to the emulated hardware, normally the
> >>> hardware needs time to respond. Sometimes the time is very
> >>> long. Thus, the following will appear. Then the whole system
> >>> will hang.
> >>> Similar problems also occur in Intel NICs and Mellanox NICs.
> >>> As such, a similar solution is borrowed from them. A timeout
> >>> value is added and the timeout value as large as possible is set
> >>> to ensure that the driver gets the maximum possible response from
> >>> the hardware.
> >>>
> >>> "
> >>> [ 213.795860] watchdog: BUG: soft lockup - CPU#108 stuck for 26s!
> >>> [(udev-worker):3157] > >>> [ 213.796114] Modules linked in: virtio_net(+) net_failover > >>> failover qrtr rfkill sunrpc intel_rapl_msr intel_rapl_common > >>> intel_uncore_frequency intel_uncore_frequency_common intel_ifs > >>> i10nm_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp > >>> coretemp iTCO_wdt rapl intel_pmc_bxt dax_hmem iTCO_vendor_support > >>> vfat cxl_acpi intel_cstate pmt_telemetry pmt_class intel_sdsi joydev > >>> intel_uncore cxl_core fat pcspkr mei_me isst_if_mbox_pci > >>> isst_if_mmio idxd i2c_i801 isst_if_common mei intel_vsec idxd_bus > >>> i2c_smbus i2c_ismt ipmi_ssif acpi_ipmi ipmi_si ipmi_devintf > >>> ipmi_msghandler acpi_pad acpi_power_meter pfr_telemetry pfr_update > >>> fuse loop zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel > >>> polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 > >>> bnxt_en sha256_ssse3 sha1_ssse3 nvme ast nvme_core i2c_algo_bit wmi > >>> pinctrl_emmitsburg scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath > >>> [ 213.796194] irq event stamp: 67740 > >>> [ 213.796195] hardirqs last enabled at (67739): > >>> [] asm_sysvec_apic_timer_interrupt+0x1a/0x20 > >>> [ 213.796203] hardirqs last disabled at (67740): > >>> [] sysvec_apic_timer_interrupt+0xe/0x90 > >>> [ 213.796208] softirqs last enabled at (67686): > >>> [] __irq_exit_rcu+0xbe/0xe0 > >>> [ 213.796214] softirqs last disabled at (67681): > >>> [] __irq_exit_rcu+0xbe/0xe0 > >>> [ 213.796217] CPU: 108 PID: 3157 Comm: (udev-worker) Kdump: loaded > >>> Not tainted 6.7.0+ #9 > >>> [ 213.796220] Hardware name: Intel Corporation > >>> M50FCP2SBSTD/M50FCP2SBSTD, BIOS SE5C741.86B.01.01.0001.2211140926 > >>> 11/14/2022 > >>> [ 213.796221] RIP: 0010:virtqueue_get_buf_ctx_split+0x8d/0x110 > >>> [ 213.796228] Code: 89 df e8 26 fe ff ff 0f b7 43 50 83 c0 01 66 89 > >>> 43 50 f6 43 78 01 75 12 80 7b 42 00 48 8b 4b 68 8b 53 58 74 0f 66 87 > >>> 44 51 04 <48> 89 e8 5b 5d c3 cc cc cc cc 66 89 44 51 04 0f ae f0 48 > >>> 89 e8 5b > >>> [ 
213.796230] RSP: 0018:ff4bbb362306f9b0 EFLAGS: 00000246 > >>> [ 213.796233] RAX: 0000000000000000 RBX: ff2f15095896f000 RCX: > >>> 0000000000000001 > >>> [ 213.796235] RDX: 0000000000000000 RSI: ff4bbb362306f9cc RDI: > >>> ff2f15095896f000 > >>> [ 213.796236] RBP: 0000000000000000 R08: 0000000000000000 R09: > >>> 0000000000000000 > >>> [ 213.796237] R10: 0000000000000003 R11: ff2f15095893cc40 R12: > >>> 0000000000000002 > >>> [ 213.796239] R13: 0000000000000004 R14: 0000000000000000 R15: > >>> ff2f1509534f3000 > >>> [ 213.796240] FS: 00007f775847d0c0(0000) GS:ff2f1528bac00000(0000) > >>> knlGS:0000000000000000 > >>> [ 213.796242] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >>> [ 213.796243] CR2: 0000557f987b6e70 CR3: 0000002098602006 CR4: > >>> 0000000000f71ef0 > >>> [ 213.796245] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > >>> 0000000000000000 > >>> [ 213.796246] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: > >>> 0000000000000400 > >>> [ 213.796247] PKRU: 55555554 > >>> [ 213.796249] Call Trace: > >>> [ 213.796250] > >>> [ 213.796252] ? watchdog_timer_fn+0x1c0/0x220 > >>> [ 213.796258] ? __pfx_watchdog_timer_fn+0x10/0x10 > >>> [ 213.796261] ? __hrtimer_run_queues+0x1af/0x380 > >>> [ 213.796269] ? hrtimer_interrupt+0xf8/0x230 > >>> [ 213.796274] ? __sysvec_apic_timer_interrupt+0x64/0x1a0 > >>> [ 213.796279] ? sysvec_apic_timer_interrupt+0x6d/0x90 > >>> [ 213.796282] > >>> [ 213.796284] > >>> [ 213.796285] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 > >>> [ 213.796293] ? virtqueue_get_buf_ctx_split+0x8d/0x110 > >>> [ 213.796297] virtnet_send_command+0x18a/0x1f0 [virtio_net] > >>> [ 213.796310] _virtnet_set_queues+0xc6/0x120 [virtio_net] > >>> [ 213.796319] virtnet_probe+0xa06/0xd50 [virtio_net] > >>> [ 213.796328] virtio_dev_probe+0x195/0x230 > >>> [ 213.796333] really_probe+0x19f/0x400 > >>> [ 213.796338] ? 
__pfx___driver_attach+0x10/0x10 > >>> [ 213.796340] __driver_probe_device+0x78/0x160 > >>> [ 213.796343] driver_probe_device+0x1f/0x90 > >>> [ 213.796346] __driver_attach+0xd6/0x1d0 > >>> [ 213.796349] bus_for_each_dev+0x8c/0xe0 > >>> [ 213.796355] bus_add_driver+0x119/0x220 > >>> [ 213.796359] driver_register+0x59/0x100 > >>> [ 213.796362] ? __pfx_virtio_net_driver_init+0x10/0x10 [virtio_net] > >>> [ 213.796369] virtio_net_driver_init+0x8e/0xff0 [virtio_net] > >>> [ 213.796375] do_one_initcall+0x6f/0x380 > >>> [ 213.796384] do_init_module+0x60/0x240 > >>> [ 213.796388] init_module_from_file+0x86/0xc0 > >>> [ 213.796396] idempotent_init_module+0x129/0x2c0 > >>> [ 213.796406] __x64_sys_finit_module+0x5e/0xb0 > >>> [ 213.796409] do_syscall_64+0x60/0xe0 > >>> [ 213.796415] ? do_syscall_64+0x6f/0xe0 > >>> [ 213.796418] ? lockdep_hardirqs_on_prepare+0xe4/0x1a0 > >>> [ 213.796424] ? do_syscall_64+0x6f/0xe0 > >>> [ 213.796427] ? do_syscall_64+0x6f/0xe0 > >>> [ 213.796431] entry_SYSCALL_64_after_hwframe+0x6e/0x76 > >>> [ 213.796435] RIP: 0033:0x7f7758f279cd > >>> [ 213.796465] Code: 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e > >>> fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 > >>> 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 33 e4 0c 00 f7 d8 64 > >>> 89 01 48 > >>> [ 213.796467] RSP: 002b:00007ffe2cad8738 EFLAGS: 00000246 ORIG_RAX: > >>> 0000000000000139 > >>> [ 213.796469] RAX: ffffffffffffffda RBX: 0000557f987a8180 RCX: > >>> 00007f7758f279cd > >>> [ 213.796471] RDX: 0000000000000000 RSI: 00007f77593e5453 RDI: > >>> 000000000000000f > >>> [ 213.796472] RBP: 00007f77593e5453 R08: 0000000000000000 R09: > >>> 00007ffe2cad8860 > >>> [ 213.796473] R10: 000000000000000f R11: 0000000000000246 R12: > >>> 0000000000020000 > >>> [ 213.796475] R13: 0000557f9879f8e0 R14: 0000000000000000 R15: > >>> 0000557f98783aa0 > >>> [ 213.796482] > >>> " > >>> > >>> Signed-off-by: Zhu Yanjun > >>> --- > >>> drivers/net/virtio_net.c | 10 ++++++++-- > >>> 1 file 
changed, 8 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >>> index 51b1868d2f22..28b7dd917a43 100644
> >>> --- a/drivers/net/virtio_net.c
> >>> +++ b/drivers/net/virtio_net.c
> >>> @@ -2468,7 +2468,7 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
> >>>  {
> >>>      struct scatterlist *sgs[4], hdr, stat;
> >>>      unsigned out_num = 0, tmp;
> >>> -    int ret;
> >>> +    int ret, timeout = 200;
> >>>      /* Caller should know better */
> >>>      BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ));
> >>> @@ -2502,8 +2502,14 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
> >>>       * into the hypervisor, so the request should be handled immediately.
> >>>       */
> >>>      while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> >>> -           !virtqueue_is_broken(vi->cvq))
> >>> +           !virtqueue_is_broken(vi->cvq)) {
> >>> +        if (timeout)
> >>> +            timeout--;
> >> This is not really a timeout, just a loop counter. 200 iterations could
> >> be a very short time on reasonable H/W. I guess this avoids the soft
> >> lockup, but possibly (likely?) breaks the functionality when we need to
> >> loop for some non-negligible time.
> >>
> >> I fear we need a more complex solution, as mentioned by Michael in the
> >> thread you quoted.
> >
> > Got it. I also look forward to the more complex solution to this problem.
>
> Can we add a device capability (new feature bit) such as
> ctrq_wait_timeout to get a reasonable timeout?

This adds another kind of complexity for migration compatibility.

And we need to make it more general, e.g.

1) it should not be cvq specific
2) or we can have a timeout that works for all queues?

Thanks

>
> Thanks,
> Heng
>
> >
> > Zhu Yanjun
> >
> >>
> >> Cheers,
> >>
> >> Paolo
> >>
>
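
To illustrate Paolo's point that a loop counter is not a timeout, here is a minimal userspace sketch. It is not the proposed kernel fix and does not touch the virtio API. All names in it (fake_dev, fake_dev_done, poll_by_count, poll_by_deadline) are hypothetical, and the "device" is just a timestamp after which it counts as having responded. The counter variant gives up after a fixed number of polls, however little time that took; the deadline variant gives up only after a fixed amount of monotonic time has passed.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical stand-in for a slow device: it has "responded"
 * once the monotonic clock reaches ready_at_ns. */
struct fake_dev {
	uint64_t ready_at_ns;
};

static bool fake_dev_done(const struct fake_dev *d)
{
	return now_ns() >= d->ready_at_ns;
}

/* Counter-based wait: at most max_iters polls. On fast hardware
 * this can be only a few microseconds, so a slow but healthy
 * device is reported as failed. */
static bool poll_by_count(const struct fake_dev *d, unsigned int max_iters)
{
	for (unsigned int i = 0; i < max_iters; i++) {
		if (fake_dev_done(d))
			return true;
	}
	return false;
}

/* Deadline-based wait: keeps polling until timeout_ns of real time
 * has elapsed, sleeping briefly between polls instead of spinning. */
static bool poll_by_deadline(const struct fake_dev *d, uint64_t timeout_ns)
{
	const struct timespec nap = { .tv_sec = 0, .tv_nsec = 100000 }; /* 100 us */
	uint64_t deadline = now_ns() + timeout_ns;

	while (!fake_dev_done(d)) {
		if (now_ns() >= deadline)
			return false;
		nanosleep(&nap, NULL);
	}
	return true;
}
```

With a fake device that becomes ready after 50 ms, poll_by_count(&dev, 200) would be expected to give up long before the device responds, while poll_by_deadline(&dev, 1000000000ull) waits it out. In the kernel the same idea is usually written with jiffies and time_after(). Choosing the timeout value, and the migration and generality concerns raised in this reply, are outside the scope of the sketch.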