virtualization.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* Re: upstream kernel crashes
       [not found]                 ` <3df6bb82-1951-455d-a768-e9e1513eb667@www.fastmail.com>
@ 2022-08-15  8:02                   ` Michael S. Tsirkin
  0 siblings, 0 replies; 17+ messages in thread
From: Michael S. Tsirkin @ 2022-08-15  8:02 UTC (permalink / raw)
  To: Andres Freund
  Cc: Jens Axboe, Martin K. Petersen, Greg Kroah-Hartman, linux-kernel,
	virtualization, James Bottomley, netdev, Linus Torvalds,
	Guenter Roeck

On Mon, Aug 15, 2022 at 12:46:36AM -0700, Andres Freund wrote:
> Hi,
> 
> On Mon, Aug 15, 2022, at 00:29, Michael S. Tsirkin wrote:
> > On Mon, Aug 15, 2022 at 12:11:43AM -0700, Andres Freund wrote:
> >> Hi,
> >> 
> >> On 2022-08-14 20:18:44 -0700, Linus Torvalds wrote:
> >> > On Sun, Aug 14, 2022 at 6:36 PM Andres Freund <andres@anarazel.de> wrote:
> >> > >
> >> > > Some of the symptoms could be related to the issue in this thread, hence
> >> > > listing them here
> >> > 
> >> > Smells like slab corruption to me, and the problems may end up being
> >> > then largely random just depending on who ends up using the allocation
> >> > that gets trampled on.
> >> > 
> >> > I wouldn't be surprised if it's all the same thing - including your
> >> > network issue.
> >> 
> >> Yea. As I just wrote in
> >> https://postgr.es/m/20220815070203.plwjx7b3cyugpdt7%40awork3.anarazel.de I
> >> bisected it down to one commit (762faee5a267). With that commit I only see the
> >> networking issue across a few reboots, but with ebcce4926365 some boots oops
> >> badly and other times it' "just" network not working.
> >> 
> >> 
> >> [oopses]
> 
> >> If somebody knowledgeable staring at 762faee5a267 doesn't surface somebody I
> >> can create a kernel with some more debugging stuff enabled, if somebody tells
> >> me what'd work best here.
> >> 
> >> 
> >> Greetings,
> >> 
> >> Andres Freund
> >
> > Thanks a lot for the work!
> > Just a small clarification:
> >
> > So IIUC you see several issues, right?
> 
> Yes, although they might be related, as theorized by Linus upthread.
> 
> > With 762faee5a2678559d3dc09d95f8f2c54cd0466a7 you see networking issues.
> 
> Yes.
> 
> 
> > With ebcce492636506443e4361db6587e6acd1a624f9 you see crashes.
> 
> Changed between rebooting. Sometimes the network issue, sometimes the crashes in the email you're replying to.
>

OK just adding:

    Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
	L: virtualization@lists.linux-foundation.org
	L: netdev@vger.kernel.org

I think we can drop the original Cc list:

Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Jens Axboe <axboe@kernel.dk>,
	James Bottomley <James.Bottomley@hansenpartnership.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Guenter Roeck <linux@roeck-us.net>, linux-kernel@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>

but I'm not sure, maybe they want to be informed.

> 
> > MST

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
       [not found]                     ` <FCDC5DDE-3CDD-4B8A-916F-CA7D87B547CE@anarazel.de>
@ 2022-08-15 15:40                       ` Michael S. Tsirkin
  2022-08-15 16:45                         ` Andres Freund
  0 siblings, 1 reply; 17+ messages in thread
From: Michael S. Tsirkin @ 2022-08-15 15:40 UTC (permalink / raw)
  To: Xuan Zhuo, Jason Wang, Andres Freund, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, virtualization, netdev,
	Linus Torvalds, Jens Axboe, James Bottomley, Martin K. Petersen,
	Guenter Roeck, linux-kernel, Greg KH, c

On Mon, Aug 15, 2022 at 01:34:41AM -0700, Andres Freund wrote:
> Hi, 
> 
> On August 15, 2022 1:28:29 AM PDT, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >On Mon, Aug 15, 2022 at 01:15:27AM -0700, Andres Freund wrote:
> >> Hi,
> >> 
> >> On 2022-08-15 03:51:34 -0400, Michael S. Tsirkin wrote:
> >> > It is possible that GCP gets confused if ring size is smaller than the
> >> > device maximum simply because no one did it in the past.
> >> > 
> >> > So I pushed just the revert of 762faee5a267 to the test branch.
> >> > Could you give it a spin?
> >> 
> >> Seems to fix the issue, at least to the extent I can determine at 1am... :)
> >> 
> >> Greetings,
> >> 
> >> Andres Freund
> >
> >So you tested this:
> >
> >commit 13df5a7eaeb22561d39354b576bc98a7e2c389f9 (HEAD, kernel.org/test)
> >Author: Michael S. Tsirkin <mst@redhat.com>
> >Date:   Mon Aug 15 03:44:38 2022 -0400
> >
> >    Revert "virtio_net: set the default max ring size by find_vqs()"
> >    
> >    This reverts commit 762faee5a2678559d3dc09d95f8f2c54cd0466a7.
> >    
> >    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >
> >and it fixes both issues right? No crashes no networking issue?
> 
> Correct. I only did limited testing, but it's survived far longer / more reboots than anything since the commit.
> 
> Andres
> -- 
> Sent from my Android device with K-9 Mail. Please excuse my brevity.


OK so this gives us a quick revert as a solution for now.
Next, I would appreciate it if you just try this simple hack.
If it crashes we either have a long standing problem in virtio
code or more likely a gcp bug where it can't handle smaller
rings than what device requestes.
Thanks!

diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index f7965c5dd36b..bdd5f481570b 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -314,6 +314,9 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
 	if (!size || size > num)
 		size = num;
 
+	if (size > 1024)
+		size = 1024;
+
 	if (size & (size - 1)) {
 		dev_warn(&vp_dev->pci_dev->dev, "bad queue size %u", size);
 		return ERR_PTR(-EINVAL);


-- 
MST

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
  2022-08-15 15:40                       ` Michael S. Tsirkin
@ 2022-08-15 16:45                         ` Andres Freund
  2022-08-15 16:50                           ` Michael S. Tsirkin
  0 siblings, 1 reply; 17+ messages in thread
From: Andres Freund @ 2022-08-15 16:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Martin K. Petersen, netdev, linux-kernel,
	virtualization, James Bottomley, Eric Dumazet, Greg KH, c,
	Jakub Kicinski, Paolo Abeni, Linus Torvalds, David S. Miller,
	Guenter Roeck

Hi,

On 2022-08-15 11:40:59 -0400, Michael S. Tsirkin wrote:
> OK so this gives us a quick revert as a solution for now.
> Next, I would appreciate it if you just try this simple hack.
> If it crashes we either have a long standing problem in virtio
> code or more likely a gcp bug where it can't handle smaller
> rings than what device requestes.
> Thanks!

I applied the below and the problem persists.

> diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
> index f7965c5dd36b..bdd5f481570b 100644
> --- a/drivers/virtio/virtio_pci_modern.c
> +++ b/drivers/virtio/virtio_pci_modern.c
> @@ -314,6 +314,9 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
>  	if (!size || size > num)
>  		size = num;
>  
> +	if (size > 1024)
> +		size = 1024;
> +
>  	if (size & (size - 1)) {
>  		dev_warn(&vp_dev->pci_dev->dev, "bad queue size %u", size);
>  		return ERR_PTR(-EINVAL);
> 
> 

[    1.165162] virtio_net virtio1 enp0s4: renamed from eth0
[    1.177815] general protection fault, probably for non-canonical address 0xffff000000000400: 0000 [#1] PREEMPT SMP PTI
[    1.179565] CPU: 1 PID: 125 Comm: systemd-udevd Not tainted 6.0.0-rc1-bisect14-dirty #14
[    1.180785] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/29/2022
[    1.182475] RIP: 0010:__kmalloc_node_track_caller+0x19e/0x380
[    1.183365] Code: 2b 04 25 28 00 00 00 0f 85 f8 01 00 00 48 83 c4 18 48 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 8b 4d 28 48 8b 7d 00 <48> 8b 1c 08 48 8d 4a 40 65 48 0f c7 0f 0f 94 c0 84 c0 0f 84 0b ff
[    1.186208] RSP: 0018:ffff9c470021b860 EFLAGS: 00010246
[    1.187194] RAX: ffff000000000000 RBX: 00000000000928c0 RCX: 0000000000000400
[    1.188634] RDX: 0000000000005781 RSI: 00000000000928c0 RDI: 000000000002e0f0
[    1.190177] RBP: ffff908380042c00 R08: 0000000000000600 R09: ffff908380b665e4
[    1.191256] R10: 0000000000000003 R11: 0000000000000002 R12: 00000000000928c0
[    1.192269] R13: 0000000000000740 R14: 00000000ffffffff R15: 0000000000000000
[    1.193368] FS:  00007f746702a8c0(0000) GS:ffff9084b7d00000(0000) knlGS:0000000000000000
[    1.194846] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.195661] CR2: 00007ffc010df980 CR3: 0000000103826005 CR4: 00000000003706e0
[    1.196912] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.198216] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    1.199367] Call Trace:
[    1.199815]  <TASK>
[    1.200138]  ? netlink_trim+0x85/0xb0
[    1.200754]  pskb_expand_head+0x92/0x340
[    1.202512]  netlink_trim+0x85/0xb0
[    1.203069]  netlink_unicast+0x54/0x390
[    1.203630]  rtnl_getlink+0x366/0x410
[    1.204155]  ? __d_alloc+0x24/0x1d0
[    1.204668]  rtnetlink_rcv_msg+0x146/0x3b0
[    1.205256]  ? _raw_spin_unlock+0xd/0x30
[    1.205867]  ? __d_add+0xf2/0x1b0
[    1.206600]  ? rtnl_calcit.isra.0+0x130/0x130
[    1.207221]  netlink_rcv_skb+0x49/0xf0
[    1.207904]  netlink_unicast+0x23a/0x390
[    1.208585]  netlink_sendmsg+0x23b/0x4b0
[    1.209203]  sock_sendmsg+0x57/0x60
[    1.210118]  __sys_sendto+0x117/0x170
[    1.210694]  ? __wake_up_common_lock+0x83/0xc0
[    1.211420]  __x64_sys_sendto+0x1b/0x30
[    1.211992]  do_syscall_64+0x37/0x90
[    1.212497]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[    1.213407] RIP: 0033:0x7f74677404e6
[    1.213973] Code: 69 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 72 c3 90 41 54 48 83 ec 30 44 89 4c 24 2c 4c
[    1.217098] RSP: 002b:00007ffc010daa78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[    1.219539] RAX: ffffffffffffffda RBX: 000000000011bc98 RCX: 00007f74677404e6
[    1.220552] RDX: 0000000000000020 RSI: 0000563160679570 RDI: 0000000000000005
[    1.222378] RBP: 00005631606796b0 R08: 00007ffc010daaf0 R09: 0000000000000080
[    1.223692] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
[    1.224793] R13: 0000000000000000 R14: 0000000000000000 R15: 00005631606794b0
[    1.226228]  </TASK>
[    1.226775] Modules linked in:
[    1.227414] ---[ end trace 0000000000000000 ]---

Greetings,

Andres Freund
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
  2022-08-15 16:45                         ` Andres Freund
@ 2022-08-15 16:50                           ` Michael S. Tsirkin
  2022-08-15 17:46                             ` Andres Freund
  2022-08-15 20:45                             ` Guenter Roeck
  0 siblings, 2 replies; 17+ messages in thread
From: Michael S. Tsirkin @ 2022-08-15 16:50 UTC (permalink / raw)
  To: Andres Freund
  Cc: Jens Axboe, Martin K. Petersen, netdev, linux-kernel,
	virtualization, James Bottomley, Eric Dumazet, Greg KH, c,
	Jakub Kicinski, Paolo Abeni, Linus Torvalds, David S. Miller,
	Guenter Roeck

On Mon, Aug 15, 2022 at 09:45:03AM -0700, Andres Freund wrote:
> Hi,
> 
> On 2022-08-15 11:40:59 -0400, Michael S. Tsirkin wrote:
> > OK so this gives us a quick revert as a solution for now.
> > Next, I would appreciate it if you just try this simple hack.
> > If it crashes we either have a long standing problem in virtio
> > code or more likely a gcp bug where it can't handle smaller
> > rings than what device requestes.
> > Thanks!
> 
> I applied the below and the problem persists.
> 
> > diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
> > index f7965c5dd36b..bdd5f481570b 100644
> > --- a/drivers/virtio/virtio_pci_modern.c
> > +++ b/drivers/virtio/virtio_pci_modern.c
> > @@ -314,6 +314,9 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
> >  	if (!size || size > num)
> >  		size = num;
> >  
> > +	if (size > 1024)
> > +		size = 1024;
> > +
> >  	if (size & (size - 1)) {
> >  		dev_warn(&vp_dev->pci_dev->dev, "bad queue size %u", size);
> >  		return ERR_PTR(-EINVAL);
> > 
> > 
> 
> [    1.165162] virtio_net virtio1 enp0s4: renamed from eth0
> [    1.177815] general protection fault, probably for non-canonical address 0xffff000000000400: 0000 [#1] PREEMPT SMP PTI
> [    1.179565] CPU: 1 PID: 125 Comm: systemd-udevd Not tainted 6.0.0-rc1-bisect14-dirty #14
> [    1.180785] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/29/2022
> [    1.182475] RIP: 0010:__kmalloc_node_track_caller+0x19e/0x380
> [    1.183365] Code: 2b 04 25 28 00 00 00 0f 85 f8 01 00 00 48 83 c4 18 48 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 8b 4d 28 48 8b 7d 00 <48> 8b 1c 08 48 8d 4a 40 65 48 0f c7 0f 0f 94 c0 84 c0 0f 84 0b ff
> [    1.186208] RSP: 0018:ffff9c470021b860 EFLAGS: 00010246
> [    1.187194] RAX: ffff000000000000 RBX: 00000000000928c0 RCX: 0000000000000400
> [    1.188634] RDX: 0000000000005781 RSI: 00000000000928c0 RDI: 000000000002e0f0
> [    1.190177] RBP: ffff908380042c00 R08: 0000000000000600 R09: ffff908380b665e4
> [    1.191256] R10: 0000000000000003 R11: 0000000000000002 R12: 00000000000928c0
> [    1.192269] R13: 0000000000000740 R14: 00000000ffffffff R15: 0000000000000000
> [    1.193368] FS:  00007f746702a8c0(0000) GS:ffff9084b7d00000(0000) knlGS:0000000000000000
> [    1.194846] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    1.195661] CR2: 00007ffc010df980 CR3: 0000000103826005 CR4: 00000000003706e0
> [    1.196912] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    1.198216] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [    1.199367] Call Trace:
> [    1.199815]  <TASK>
> [    1.200138]  ? netlink_trim+0x85/0xb0
> [    1.200754]  pskb_expand_head+0x92/0x340
> [    1.202512]  netlink_trim+0x85/0xb0
> [    1.203069]  netlink_unicast+0x54/0x390
> [    1.203630]  rtnl_getlink+0x366/0x410
> [    1.204155]  ? __d_alloc+0x24/0x1d0
> [    1.204668]  rtnetlink_rcv_msg+0x146/0x3b0
> [    1.205256]  ? _raw_spin_unlock+0xd/0x30
> [    1.205867]  ? __d_add+0xf2/0x1b0
> [    1.206600]  ? rtnl_calcit.isra.0+0x130/0x130
> [    1.207221]  netlink_rcv_skb+0x49/0xf0
> [    1.207904]  netlink_unicast+0x23a/0x390
> [    1.208585]  netlink_sendmsg+0x23b/0x4b0
> [    1.209203]  sock_sendmsg+0x57/0x60
> [    1.210118]  __sys_sendto+0x117/0x170
> [    1.210694]  ? __wake_up_common_lock+0x83/0xc0
> [    1.211420]  __x64_sys_sendto+0x1b/0x30
> [    1.211992]  do_syscall_64+0x37/0x90
> [    1.212497]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [    1.213407] RIP: 0033:0x7f74677404e6
> [    1.213973] Code: 69 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 72 c3 90 41 54 48 83 ec 30 44 89 4c 24 2c 4c
> [    1.217098] RSP: 002b:00007ffc010daa78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
> [    1.219539] RAX: ffffffffffffffda RBX: 000000000011bc98 RCX: 00007f74677404e6
> [    1.220552] RDX: 0000000000000020 RSI: 0000563160679570 RDI: 0000000000000005
> [    1.222378] RBP: 00005631606796b0 R08: 00007ffc010daaf0 R09: 0000000000000080
> [    1.223692] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
> [    1.224793] R13: 0000000000000000 R14: 0000000000000000 R15: 00005631606794b0
> [    1.226228]  </TASK>
> [    1.226775] Modules linked in:
> [    1.227414] ---[ end trace 0000000000000000 ]---
> 
> Greetings,
> 
> Andres Freund

Okay! And just to be 100% sure, can you try the following on top of 5.19:


diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index 623906b4996c..6f4e54a618bc 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -208,6 +208,9 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
 		return ERR_PTR(-EINVAL);
 	}
 
+	if (num > 1024)
+		num = 1024;
+
 	info->msix_vector = msix_vec;
 
 	/* create the vring */

-- 
MST

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
  2022-08-15 16:50                           ` Michael S. Tsirkin
@ 2022-08-15 17:46                             ` Andres Freund
  2022-08-15 20:21                               ` Michael S. Tsirkin
  2022-08-15 20:45                             ` Guenter Roeck
  1 sibling, 1 reply; 17+ messages in thread
From: Andres Freund @ 2022-08-15 17:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Martin K. Petersen, netdev, linux-kernel,
	virtualization, James Bottomley, Eric Dumazet, Greg KH, c,
	Jakub Kicinski, Paolo Abeni, Linus Torvalds, David S. Miller,
	Guenter Roeck

Hi,

On 2022-08-15 12:50:52 -0400, Michael S. Tsirkin wrote:
> On Mon, Aug 15, 2022 at 09:45:03AM -0700, Andres Freund wrote:
> > Hi,
> > 
> > On 2022-08-15 11:40:59 -0400, Michael S. Tsirkin wrote:
> > > OK so this gives us a quick revert as a solution for now.
> > > Next, I would appreciate it if you just try this simple hack.
> > > If it crashes we either have a long standing problem in virtio
> > > code or more likely a gcp bug where it can't handle smaller
> > > rings than what device requestes.
> > > Thanks!
> > 
> > I applied the below and the problem persists.
> > 
> > [...]
>
> Okay!

Just checking - I applied and tested this atop 6.0-rc1, correct? Or did you
want me to test it with the 762faee5a267 reverted? I guess what you're trying
to test if a smaller queue than what's requested you'd want to do so without
the problematic patch applied...


> And just to be 100% sure, can you try the following on top of 5.19:

> diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
> index 623906b4996c..6f4e54a618bc 100644
> --- a/drivers/virtio/virtio_pci_modern.c
> +++ b/drivers/virtio/virtio_pci_modern.c
> @@ -208,6 +208,9 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
>  		return ERR_PTR(-EINVAL);
>  	}
>  
> +	if (num > 1024)
> +		num = 1024;
> +
>  	info->msix_vector = msix_vec;
>  
>  	/* create the vring */
> 
> -- 

Either way, I did this, and there are no issues that I could observe. No
oopses, no broken networking. But:

To make sure it does something I added a debugging printk - which doesn't show
up. I assume this is at a point at least earlyprintk should work (which I see
getting enabled via serial)?

Greetings,

Andres Freund
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
  2022-08-15 17:46                             ` Andres Freund
@ 2022-08-15 20:21                               ` Michael S. Tsirkin
  2022-08-15 20:53                                 ` Andres Freund
  0 siblings, 1 reply; 17+ messages in thread
From: Michael S. Tsirkin @ 2022-08-15 20:21 UTC (permalink / raw)
  To: Andres Freund
  Cc: Jens Axboe, Martin K. Petersen, netdev, linux-kernel,
	virtualization, James Bottomley, Eric Dumazet, Greg KH, c,
	Jakub Kicinski, Paolo Abeni, Linus Torvalds, David S. Miller,
	Guenter Roeck

On Mon, Aug 15, 2022 at 10:46:17AM -0700, Andres Freund wrote:
> Hi,
> 
> On 2022-08-15 12:50:52 -0400, Michael S. Tsirkin wrote:
> > On Mon, Aug 15, 2022 at 09:45:03AM -0700, Andres Freund wrote:
> > > Hi,
> > > 
> > > On 2022-08-15 11:40:59 -0400, Michael S. Tsirkin wrote:
> > > > OK so this gives us a quick revert as a solution for now.
> > > > Next, I would appreciate it if you just try this simple hack.
> > > > If it crashes we either have a long standing problem in virtio
> > > > code or more likely a gcp bug where it can't handle smaller
> > > > rings than what device requestes.
> > > > Thanks!
> > > 
> > > I applied the below and the problem persists.
> > > 
> > > [...]
> >
> > Okay!
> 
> Just checking - I applied and tested this atop 6.0-rc1, correct? Or did you
> want me to test it with the 762faee5a267 reverted? I guess what you're trying
> to test if a smaller queue than what's requested you'd want to do so without
> the problematic patch applied...
> 
> 
> > And just to be 100% sure, can you try the following on top of 5.19:
> 
> > diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
> > index 623906b4996c..6f4e54a618bc 100644
> > --- a/drivers/virtio/virtio_pci_modern.c
> > +++ b/drivers/virtio/virtio_pci_modern.c
> > @@ -208,6 +208,9 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
> >  		return ERR_PTR(-EINVAL);
> >  	}
> >  
> > +	if (num > 1024)
> > +		num = 1024;
> > +
> >  	info->msix_vector = msix_vec;
> >  
> >  	/* create the vring */
> > 
> > -- 
> 
> Either way, I did this, and there are no issues that I could observe. No
> oopses, no broken networking. But:
> 
> To make sure it does something I added a debugging printk - which doesn't show
> up. I assume this is at a point at least earlyprintk should work (which I see
> getting enabled via serial)?
> 
> Greetings,
> 
> Andres Freund


Sorry if I was unclear.  I wanted to know whether the change somehow
exposes a driver bug or a GCP bug. So what I wanted to do is to test
this patch on top of *5.19*, not on top of the revert.
The idea is if we reduce the size and it starts crashing then
we know it's GCP fault, if not then GCP can handle smaller sizes
and it's one of the driver changes.

It will apply on top of the revert but won't do much.

Yes I think printk should work here.

-- 
MST

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
  2022-08-15 16:50                           ` Michael S. Tsirkin
  2022-08-15 17:46                             ` Andres Freund
@ 2022-08-15 20:45                             ` Guenter Roeck
  1 sibling, 0 replies; 17+ messages in thread
From: Guenter Roeck @ 2022-08-15 20:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Martin K. Petersen, netdev, linux-kernel,
	virtualization, James Bottomley, Linus Torvalds, Eric Dumazet,
	Greg KH, c, Jakub Kicinski, Paolo Abeni, Andres Freund,
	David S. Miller

Michael,

On Mon, Aug 15, 2022 at 12:50:52PM -0400, Michael S. Tsirkin wrote:
[ ...]
> 
> Okay! And just to be 100% sure, can you try the following on top of 5.19:
> 

You should now be able to test any patches using the syzkaller
infrastructure. Pick any (or all) of the now-published syzkaller
reports from the linux-kernel mailing list, reply with:

#syz test git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git v5.19

and provide your patch as attachment.

Guenter
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
  2022-08-15 20:21                               ` Michael S. Tsirkin
@ 2022-08-15 20:53                                 ` Andres Freund
  2022-08-15 21:04                                   ` Andres Freund
  2022-08-15 21:32                                   ` Michael S. Tsirkin
  0 siblings, 2 replies; 17+ messages in thread
From: Andres Freund @ 2022-08-15 20:53 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Martin K. Petersen, netdev, linux-kernel,
	virtualization, James Bottomley, Eric Dumazet, Greg KH, c,
	Jakub Kicinski, Paolo Abeni, Linus Torvalds, David S. Miller,
	Guenter Roeck

Hi,

On 2022-08-15 16:21:51 -0400, Michael S. Tsirkin wrote:
> On Mon, Aug 15, 2022 at 10:46:17AM -0700, Andres Freund wrote:
> > Hi,
> >
> > On 2022-08-15 12:50:52 -0400, Michael S. Tsirkin wrote:
> > > On Mon, Aug 15, 2022 at 09:45:03AM -0700, Andres Freund wrote:
> > > > Hi,
> > > >
> > > > On 2022-08-15 11:40:59 -0400, Michael S. Tsirkin wrote:
> > > > > OK so this gives us a quick revert as a solution for now.
> > > > > Next, I would appreciate it if you just try this simple hack.
> > > > > If it crashes we either have a long standing problem in virtio
> > > > > code or more likely a gcp bug where it can't handle smaller
> > > > > rings than what device requestes.
> > > > > Thanks!
> > > >
> > > > I applied the below and the problem persists.
> > > >
> > > > [...]
> > >
> > > Okay!
> >
> > Just checking - I applied and tested this atop 6.0-rc1, correct? Or did you
> > want me to test it with the 762faee5a267 reverted? I guess what you're trying
> > to test if a smaller queue than what's requested you'd want to do so without
> > the problematic patch applied...
> >
> >
> > Either way, I did this, and there are no issues that I could observe. No
> > oopses, no broken networking. But:
> >
> > To make sure it does something I added a debugging printk - which doesn't show
> > up. I assume this is at a point at least earlyprintk should work (which I see
> > getting enabled via serial)?
> >

> Sorry if I was unclear.  I wanted to know whether the change somehow
> exposes a driver bug or a GCP bug. So what I wanted to do is to test
> this patch on top of *5.19*, not on top of the revert.

Right, the 5.19 part was clear, just the earlier test:

> > > > On 2022-08-15 11:40:59 -0400, Michael S. Tsirkin wrote:
> > > > > OK so this gives us a quick revert as a solution for now.
> > > > > Next, I would appreciate it if you just try this simple hack.
> > > > > If it crashes we either have a long standing problem in virtio
> > > > > code or more likely a gcp bug where it can't handle smaller
> > > > > Thanks!

I wasn't sure about.

After I didn't see any effect on 5.19 + your patch, I grew a bit suspicious
and added the printks.


> Yes I think printk should work here.

The reason the debug patch didn't change anything, and that my debug printk
didn't show, is that gcp uses the legacy paths...

If there were a bug in the legacy path, it'd explain why the problem only
shows on gcp, and not in other situations.

I'll queue testing the legacy path with the equivalent change.

- Andres


Greetings,

Andres Freund
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
  2022-08-15 20:53                                 ` Andres Freund
@ 2022-08-15 21:04                                   ` Andres Freund
  2022-08-15 21:10                                     ` Andres Freund
  2022-08-15 21:32                                   ` Michael S. Tsirkin
  1 sibling, 1 reply; 17+ messages in thread
From: Andres Freund @ 2022-08-15 21:04 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Martin K. Petersen, netdev, linux-kernel,
	virtualization, James Bottomley, Eric Dumazet, Greg KH, c,
	Jakub Kicinski, Paolo Abeni, Linus Torvalds, David S. Miller,
	Guenter Roeck

On 2022-08-15 13:53:31 -0700, Andres Freund wrote:
> The reason the debug patch didn't change anything, and that my debug printk
> didn't show, is that gcp uses the legacy paths...
> 
> If there were a bug in the legacy path, it'd explain why the problem only
> shows on gcp, and not in other situations.
> 
> I'll queue testing the legacy path with the equivalent change.

Booting with the equivalent change, atop 5.19, in the legacy setup_vq()
reliably causes boot to hang:

[    0.718768] ACPI: button: Sleep Button [SLPF]
[    0.721989] ACPI: \_SB_.LNKC: Enabled at IRQ 11
[    0.722688] adebug: use legacy: 0
[    0.722724] virtio-pci 0000:00:03.0: virtio_pci: leaving for legacy driver
[    0.724286] adebug: probe modern: -19
[    0.727353] ACPI: \_SB_.LNKD: Enabled at IRQ 10
[    0.728719] adebug: use legacy: 0
[    0.728766] virtio-pci 0000:00:04.0: virtio_pci: leaving for legacy driver
[    0.730422] adebug: probe modern: -19
[    0.733552] ACPI: \_SB_.LNKA: Enabled at IRQ 10
[    0.734923] adebug: use legacy: 0
[    0.734957] virtio-pci 0000:00:05.0: virtio_pci: leaving for legacy driver
[    0.736426] adebug: probe modern: -19
[    0.739039] ACPI: \_SB_.LNKB: Enabled at IRQ 11
[    0.740350] adebug: use legacy: 0
[    0.740390] virtio-pci 0000:00:06.0: virtio_pci: leaving for legacy driver
[    0.742142] adebug: probe modern: -19
[    0.747627] adebug: legacy setup_vq
[    0.748243] virtio-pci 0000:00:05.0: adebug: legacy: not limiting queue size, only 256
[    0.751081] adebug: legacy setup_vq
[    0.751110] virtio-pci 0000:00:05.0: adebug: legacy: not limiting queue size, only 256
[    0.754028] adebug: legacy setup_vq
[    0.754059] virtio-pci 0000:00:05.0: adebug: legacy: not limiting queue size, only 1
[    0.757760] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    0.759135] 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[    0.760399] 00:04: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 16550A
[    0.761610] 00:05: ttyS2 at I/O 0x3e8 (irq = 6, base_baud = 115200) is a 16550A
[    0.762923] 00:06: ttyS3 at I/O 0x2e8 (irq = 7, base_baud = 115200) is a 16550A
[    0.764222] Non-volatile memory driver v1.3
[    0.768857] adebug: legacy setup_vq
[    0.768882] virtio-pci 0000:00:06.0: adebug: legacy: not limiting queue size, only 256
[    0.773002] Linux agpgart interface v0.103
[    0.775424] loop: module loaded
[    0.780513] adebug: legacy setup_vq
[    0.780538] virtio-pci 0000:00:03.0: adebug: legacy: limiting queue size from 8192 to 1024
[    0.784075] adebug: legacy setup_vq
[    0.784104] virtio-pci 0000:00:03.0: adebug: legacy: limiting queue size from 8192 to 1024
[    0.787073] adebug: legacy setup_vq
[    0.787101] virtio-pci 0000:00:03.0: adebug: legacy: limiting queue size from 8192 to 1024
[    0.790379] scsi host0: Virtio SCSI HBA
[    0.795968] Freeing initrd memory: 7236K

Greetings,

Andres Freund
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
  2022-08-15 21:04                                   ` Andres Freund
@ 2022-08-15 21:10                                     ` Andres Freund
  0 siblings, 0 replies; 17+ messages in thread
From: Andres Freund @ 2022-08-15 21:10 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Martin K. Petersen, netdev, linux-kernel,
	virtualization, James Bottomley, Eric Dumazet, Greg KH, c,
	Jakub Kicinski, Paolo Abeni, Linus Torvalds, David S. Miller,
	Guenter Roeck

Hi,

On 2022-08-15 14:04:37 -0700, Andres Freund wrote:
> Booting with the equivalent change, atop 5.19, in the legacy setup_vq()
> reliably causes boot to hang:

I don't know much virtio, so take this with a rock of salt:

Legacy setup_vq() doesn't tell the host about the queue size. The modern one
does:
	vp_modern_set_queue_size(mdev, index, virtqueue_get_vring_size(vq));
but the legacy one doesn't.

I assume this means the host will assume the queue is of the size suggested by
vp_legacy_get_queue_size(). If the host continues to write into the space
after the "assumed end" of the queue, but the guest puts other stuff in that
space, well, I'd expect fun roughly like the stuff we've been seeing in this
and related threads.

Greetings,

Andres Freund
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
  2022-08-15 20:53                                 ` Andres Freund
  2022-08-15 21:04                                   ` Andres Freund
@ 2022-08-15 21:32                                   ` Michael S. Tsirkin
  2022-08-16  2:45                                     ` Xuan Zhuo
  2022-08-17  6:13                                     ` Dmitry Vyukov via Virtualization
  1 sibling, 2 replies; 17+ messages in thread
From: Michael S. Tsirkin @ 2022-08-15 21:32 UTC (permalink / raw)
  To: Andres Freund
  Cc: Jens Axboe, Martin K. Petersen, netdev, linux-kernel,
	virtualization, James Bottomley, Eric Dumazet, Greg KH, c,
	Jakub Kicinski, Paolo Abeni, Linus Torvalds, David S. Miller,
	Guenter Roeck

On Mon, Aug 15, 2022 at 01:53:30PM -0700, Andres Freund wrote:
> Hi,
> 
> On 2022-08-15 16:21:51 -0400, Michael S. Tsirkin wrote:
> > On Mon, Aug 15, 2022 at 10:46:17AM -0700, Andres Freund wrote:
> > > Hi,
> > >
> > > On 2022-08-15 12:50:52 -0400, Michael S. Tsirkin wrote:
> > > > On Mon, Aug 15, 2022 at 09:45:03AM -0700, Andres Freund wrote:
> > > > > Hi,
> > > > >
> > > > > On 2022-08-15 11:40:59 -0400, Michael S. Tsirkin wrote:
> > > > > > OK so this gives us a quick revert as a solution for now.
> > > > > > Next, I would appreciate it if you just try this simple hack.
> > > > > > If it crashes we either have a long standing problem in virtio
> > > > > > code or more likely a gcp bug where it can't handle smaller
> > > > > > rings than what device requestes.
> > > > > > Thanks!
> > > > >
> > > > > I applied the below and the problem persists.
> > > > >
> > > > > [...]
> > > >
> > > > Okay!
> > >
> > > Just checking - I applied and tested this atop 6.0-rc1, correct? Or did you
> > > want me to test it with the 762faee5a267 reverted? I guess what you're trying
> > > to test if a smaller queue than what's requested you'd want to do so without
> > > the problematic patch applied...
> > >
> > >
> > > Either way, I did this, and there are no issues that I could observe. No
> > > oopses, no broken networking. But:
> > >
> > > To make sure it does something I added a debugging printk - which doesn't show
> > > up. I assume this is at a point at least earlyprintk should work (which I see
> > > getting enabled via serial)?
> > >
> 
> > Sorry if I was unclear.  I wanted to know whether the change somehow
> > exposes a driver bug or a GCP bug. So what I wanted to do is to test
> > this patch on top of *5.19*, not on top of the revert.
> 
> Right, the 5.19 part was clear, just the earlier test:
> 
> > > > > On 2022-08-15 11:40:59 -0400, Michael S. Tsirkin wrote:
> > > > > > OK so this gives us a quick revert as a solution for now.
> > > > > > Next, I would appreciate it if you just try this simple hack.
> > > > > > If it crashes we either have a long standing problem in virtio
> > > > > > code or more likely a gcp bug where it can't handle smaller
> > > > > > Thanks!
> 
> I wasn't sure about.
> 
> After I didn't see any effect on 5.19 + your patch, I grew a bit suspicious
> and added the printks.
> 
> 
> > Yes I think printk should work here.
> 
> The reason the debug patch didn't change anything, and that my debug printk
> didn't show, is that gcp uses the legacy paths...

Wait a second. Eureka I think!

So I think GCP is not broken.
I think what's broken is this patch:

commit cdb44806fca2d0ad29ca644cbf1505433902ee0c
Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date:   Mon Aug 1 14:38:54 2022 +0800

    virtio_pci: support the arg sizes of find_vqs()


Specifically:

diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
index 2257f1b3d8ae..d75e5c4e637f 100644
--- a/drivers/virtio/virtio_pci_legacy.c
+++ b/drivers/virtio/virtio_pci_legacy.c
@@ -112,6 +112,7 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
                                  unsigned int index,
                                  void (*callback)(struct virtqueue *vq),
                                  const char *name,
+                                 u32 size,
                                  bool ctx,
                                  u16 msix_vec)
 {
@@ -125,10 +126,13 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
        if (!num || vp_legacy_get_queue_enable(&vp_dev->ldev, index))
                return ERR_PTR(-ENOENT);
 
+       if (!size || size > num)
+               size = num;
+
        info->msix_vector = msix_vec;
 
        /* create the vring */
-       vq = vring_create_virtqueue(index, num,
+       vq = vring_create_virtqueue(index, size,
                                    VIRTIO_PCI_VRING_ALIGN, &vp_dev->vdev,
                                    true, false, ctx,
                                    vp_notify, callback, name);

   

So if you pass the size parameter for a legacy device it will
try to make the ring smaller and that is not legal with
legacy at all. But the driver treats legacy and modern
the same, it allocates a smaller queue anyway.


Lo and behold, I pass disable-modern=on to qemu and it happily
corrupts memory exactly the same as GCP does.


So the new find_vqs API is actually completely broken, it can not work for
legacy at all and for added fun there's no way to find out
that it's legacy. Maybe we should interpret the patch

So I think I will also revert

04ca0b0b16f11faf74fa92468dab51b8372586cd..fe3dc04e31aa51f91dc7f741a5f76cc4817eb5b4







> If there were a bug in the legacy path, it'd explain why the problem only
> shows on gcp, and not in other situations.
> 
> I'll queue testing the legacy path with the equivalent change.
> 
> - Andres
> 
> 
> Greetings,
> 
> Andres Freund

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
  2022-08-15 21:32                                   ` Michael S. Tsirkin
@ 2022-08-16  2:45                                     ` Xuan Zhuo
  2022-08-17  6:13                                     ` Dmitry Vyukov via Virtualization
  1 sibling, 0 replies; 17+ messages in thread
From: Xuan Zhuo @ 2022-08-16  2:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Martin K. Petersen, netdev, linux-kernel,
	virtualization, James Bottomley, Eric Dumazet, Andres Freund,
	Greg KH, c, Jakub Kicinski, Paolo Abeni, Linus Torvalds,
	David S. Miller, Guenter Roeck

On Mon, 15 Aug 2022 17:32:06 -0400, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Mon, Aug 15, 2022 at 01:53:30PM -0700, Andres Freund wrote:
> > Hi,
> >
> > On 2022-08-15 16:21:51 -0400, Michael S. Tsirkin wrote:
> > > On Mon, Aug 15, 2022 at 10:46:17AM -0700, Andres Freund wrote:
> > > > Hi,
> > > >
> > > > On 2022-08-15 12:50:52 -0400, Michael S. Tsirkin wrote:
> > > > > On Mon, Aug 15, 2022 at 09:45:03AM -0700, Andres Freund wrote:
> > > > > > Hi,
> > > > > >
> > > > > > On 2022-08-15 11:40:59 -0400, Michael S. Tsirkin wrote:
> > > > > > > OK so this gives us a quick revert as a solution for now.
> > > > > > > Next, I would appreciate it if you just try this simple hack.
> > > > > > > If it crashes we either have a long standing problem in virtio
> > > > > > > code or more likely a gcp bug where it can't handle smaller
> > > > > > > rings than what device requestes.
> > > > > > > Thanks!
> > > > > >
> > > > > > I applied the below and the problem persists.
> > > > > >
> > > > > > [...]
> > > > >
> > > > > Okay!
> > > >
> > > > Just checking - I applied and tested this atop 6.0-rc1, correct? Or did you
> > > > want me to test it with the 762faee5a267 reverted? I guess what you're trying
> > > > to test if a smaller queue than what's requested you'd want to do so without
> > > > the problematic patch applied...
> > > >
> > > >
> > > > Either way, I did this, and there are no issues that I could observe. No
> > > > oopses, no broken networking. But:
> > > >
> > > > To make sure it does something I added a debugging printk - which doesn't show
> > > > up. I assume this is at a point at least earlyprintk should work (which I see
> > > > getting enabled via serial)?
> > > >
> >
> > > Sorry if I was unclear.  I wanted to know whether the change somehow
> > > exposes a driver bug or a GCP bug. So what I wanted to do is to test
> > > this patch on top of *5.19*, not on top of the revert.
> >
> > Right, the 5.19 part was clear, just the earlier test:
> >
> > > > > > On 2022-08-15 11:40:59 -0400, Michael S. Tsirkin wrote:
> > > > > > > OK so this gives us a quick revert as a solution for now.
> > > > > > > Next, I would appreciate it if you just try this simple hack.
> > > > > > > If it crashes we either have a long standing problem in virtio
> > > > > > > code or more likely a gcp bug where it can't handle smaller
> > > > > > > Thanks!
> >
> > I wasn't sure about.
> >
> > After I didn't see any effect on 5.19 + your patch, I grew a bit suspicious
> > and added the printks.
> >
> >
> > > Yes I think printk should work here.
> >
> > The reason the debug patch didn't change anything, and that my debug printk
> > didn't show, is that gcp uses the legacy paths...
>
> Wait a second. Eureka I think!
>
> So I think GCP is not broken.
> I think what's broken is this patch:
>
> commit cdb44806fca2d0ad29ca644cbf1505433902ee0c
> Author: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Date:   Mon Aug 1 14:38:54 2022 +0800
>
>     virtio_pci: support the arg sizes of find_vqs()
>
>
> Specifically:
>
> diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
> index 2257f1b3d8ae..d75e5c4e637f 100644
> --- a/drivers/virtio/virtio_pci_legacy.c
> +++ b/drivers/virtio/virtio_pci_legacy.c
> @@ -112,6 +112,7 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
>                                   unsigned int index,
>                                   void (*callback)(struct virtqueue *vq),
>                                   const char *name,
> +                                 u32 size,
>                                   bool ctx,
>                                   u16 msix_vec)
>  {
> @@ -125,10 +126,13 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
>         if (!num || vp_legacy_get_queue_enable(&vp_dev->ldev, index))
>                 return ERR_PTR(-ENOENT);
>
> +       if (!size || size > num)
> +               size = num;
> +
>         info->msix_vector = msix_vec;
>
>         /* create the vring */
> -       vq = vring_create_virtqueue(index, num,
> +       vq = vring_create_virtqueue(index, size,
>                                     VIRTIO_PCI_VRING_ALIGN, &vp_dev->vdev,
>                                     true, false, ctx,
>                                     vp_notify, callback, name);
>
>
>
> So if you pass the size parameter for a legacy device it will
> try to make the ring smaller and that is not legal with
> legacy at all. But the driver treats legacy and modern
> the same, it allocates a smaller queue anyway.
>
>
> Lo and behold, I pass disable-modern=on to qemu and it happily
> corrupts memory exactly the same as GCP does.

Yes, I think you are right.

Thank you very much.

>
>
> So the new find_vqs API is actually completely broken, it can not work for
> legacy at all and for added fun there's no way to find out
> that it's legacy. Maybe we should interpret the patch
>
> So I think I will also revert
>
> 04ca0b0b16f11faf74fa92468dab51b8372586cd..fe3dc04e31aa51f91dc7f741a5f76cc4817eb5b4
>
>
>
>
>
>
>
> > If there were a bug in the legacy path, it'd explain why the problem only
> > shows on gcp, and not in other situations.
> >
> > I'll queue testing the legacy path with the equivalent change.
> >
> > - Andres
> >
> >
> > Greetings,
> >
> > Andres Freund
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
  2022-08-15 21:32                                   ` Michael S. Tsirkin
  2022-08-16  2:45                                     ` Xuan Zhuo
@ 2022-08-17  6:13                                     ` Dmitry Vyukov via Virtualization
  2022-08-17  6:36                                       ` Xuan Zhuo
  1 sibling, 1 reply; 17+ messages in thread
From: Dmitry Vyukov via Virtualization @ 2022-08-17  6:13 UTC (permalink / raw)
  To: mst
  Cc: axboe, martin.petersen, gregkh, linux-kernel, kasan-dev,
	virtualization, James.Bottomley, torvalds, edumazet, netdev, c,
	kuba, pabeni, andres, davem, linux

On Mon, 15 Aug 2022 17:32:06 -0400, Michael wrote:
> So if you pass the size parameter for a legacy device it will
> try to make the ring smaller and that is not legal with
> legacy at all. But the driver treats legacy and modern
> the same, it allocates a smaller queue anyway.
>
> Lo and behold, I pass disable-modern=on to qemu and it happily
> corrupts memory exactly the same as GCP does.

Ouch!

I understand that the host does the actual corruption,
but could you think of any additional debug checking in the guest
that would caught this in future? Potentially only when KASAN
is enabled which can verify validity of memory ranges.
Some kind of additional layer of sanity checking.

This caused a bit of a havoc for syzbot with almost 100 unique
crash signatures, so would be useful to catch such issues more
reliably in future.

Thanks
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
  2022-08-17  6:13                                     ` Dmitry Vyukov via Virtualization
@ 2022-08-17  6:36                                       ` Xuan Zhuo
  2022-08-17 10:53                                         ` Michael S. Tsirkin
  2022-08-17 15:58                                         ` Linus Torvalds
  0 siblings, 2 replies; 17+ messages in thread
From: Xuan Zhuo @ 2022-08-17  6:36 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: axboe, martin.petersen, mst, gregkh, linux-kernel, kasan-dev,
	virtualization, James.Bottomley, torvalds, edumazet, netdev, c,
	kuba, pabeni, andres, davem, linux

On Wed, 17 Aug 2022 08:13:59 +0200, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Mon, 15 Aug 2022 17:32:06 -0400, Michael wrote:
> > So if you pass the size parameter for a legacy device it will
> > try to make the ring smaller and that is not legal with
> > legacy at all. But the driver treats legacy and modern
> > the same, it allocates a smaller queue anyway.
> >
> > Lo and behold, I pass disable-modern=on to qemu and it happily
> > corrupts memory exactly the same as GCP does.
>
> Ouch!
>
> I understand that the host does the actual corruption,
> but could you think of any additional debug checking in the guest
> that would caught this in future? Potentially only when KASAN
> is enabled which can verify validity of memory ranges.
> Some kind of additional layer of sanity checking.
>
> This caused a bit of a havoc for syzbot with almost 100 unique
> crash signatures, so would be useful to catch such issues more
> reliably in future.

We can add a check to vring size before calling vp_legacy_set_queue_address().
Checking the memory range directly is a bit cumbersome.

Thanks.

diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
index 2257f1b3d8ae..0673831f45b6 100644
--- a/drivers/virtio/virtio_pci_legacy.c
+++ b/drivers/virtio/virtio_pci_legacy.c
@@ -146,6 +146,8 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
                goto out_del_vq;
        }

+       BUG_ON(num != virtqueue_get_vring_size(vq));
+
        /* activate the queue */
        vp_legacy_set_queue_address(&vp_dev->ldev, index, q_pfn);


>
> Thanks
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
  2022-08-17  6:36                                       ` Xuan Zhuo
@ 2022-08-17 10:53                                         ` Michael S. Tsirkin
  2022-08-17 15:58                                         ` Linus Torvalds
  1 sibling, 0 replies; 17+ messages in thread
From: Michael S. Tsirkin @ 2022-08-17 10:53 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: axboe, martin.petersen, gregkh, linux-kernel, kasan-dev,
	virtualization, James.Bottomley, torvalds, edumazet, linux,
	netdev, c, kuba, pabeni, andres, davem, Dmitry Vyukov

On Wed, Aug 17, 2022 at 02:36:31PM +0800, Xuan Zhuo wrote:
> On Wed, 17 Aug 2022 08:13:59 +0200, Dmitry Vyukov <dvyukov@google.com> wrote:
> > On Mon, 15 Aug 2022 17:32:06 -0400, Michael wrote:
> > > So if you pass the size parameter for a legacy device it will
> > > try to make the ring smaller and that is not legal with
> > > legacy at all. But the driver treats legacy and modern
> > > the same, it allocates a smaller queue anyway.
> > >
> > > Lo and behold, I pass disable-modern=on to qemu and it happily
> > > corrupts memory exactly the same as GCP does.
> >
> > Ouch!
> >
> > I understand that the host does the actual corruption,
> > but could you think of any additional debug checking in the guest
> > that would caught this in future? Potentially only when KASAN
> > is enabled which can verify validity of memory ranges.
> > Some kind of additional layer of sanity checking.
> >
> > This caused a bit of a havoc for syzbot with almost 100 unique
> > crash signatures, so would be useful to catch such issues more
> > reliably in future.
> 
> We can add a check to vring size before calling vp_legacy_set_queue_address().
> Checking the memory range directly is a bit cumbersome.
> 
> Thanks.

With a comment along the lines of

/* Legacy virtio pci has no way to communicate a change in vq size to
 * the hypervisor. If ring sizes don't match hypervisor will happily
 * corrupt memory.
 */


> diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
> index 2257f1b3d8ae..0673831f45b6 100644
> --- a/drivers/virtio/virtio_pci_legacy.c
> +++ b/drivers/virtio/virtio_pci_legacy.c
> @@ -146,6 +146,8 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
>                 goto out_del_vq;
>         }
> 
> +       BUG_ON(num != virtqueue_get_vring_size(vq));
> +
>         /* activate the queue */
>         vp_legacy_set_queue_address(&vp_dev->ldev, index, q_pfn);
> 
> 
> >
> > Thanks

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
  2022-08-17  6:36                                       ` Xuan Zhuo
  2022-08-17 10:53                                         ` Michael S. Tsirkin
@ 2022-08-17 15:58                                         ` Linus Torvalds
  2022-08-18  1:55                                           ` Xuan Zhuo
  1 sibling, 1 reply; 17+ messages in thread
From: Linus Torvalds @ 2022-08-17 15:58 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: axboe, martin.petersen, mst, gregkh, linux-kernel, kasan-dev,
	virtualization, James.Bottomley, edumazet, linux, netdev, c, kuba,
	pabeni, andres, davem, Dmitry Vyukov

On Tue, Aug 16, 2022 at 11:47 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> +       BUG_ON(num != virtqueue_get_vring_size(vq));
> +

Please, no more BUG_ON.

Add a WARN_ON_ONCE() and return an  error.

           Linus
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: upstream kernel crashes
  2022-08-17 15:58                                         ` Linus Torvalds
@ 2022-08-18  1:55                                           ` Xuan Zhuo
  0 siblings, 0 replies; 17+ messages in thread
From: Xuan Zhuo @ 2022-08-18  1:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: axboe, martin.petersen, mst, gregkh, linux-kernel, kasan-dev,
	virtualization, James.Bottomley, edumazet, linux, netdev, c, kuba,
	pabeni, andres, davem, Dmitry Vyukov

On Wed, 17 Aug 2022 08:58:20 -0700, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Tue, Aug 16, 2022 at 11:47 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > +       BUG_ON(num != virtqueue_get_vring_size(vq));
> > +
>
> Please, no more BUG_ON.
>
> Add a WARN_ON_ONCE() and return an  error.

OK, I will post v2 with WARN_ON_ONCE().

Thanks.


>
>            Linus
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-08-18  1:57 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20220814212610.GA3690074@roeck-us.net>
     [not found] ` <CAHk-=wgf2EfLHui6A5NbWoaVBB2f8t-XBUiOMkyjN2NU41t6eA@mail.gmail.com>
     [not found]   ` <20220814223743.26ebsbnrvrjien4f@awork3.anarazel.de>
     [not found]     ` <CAHk-=wi6raoJE-1cyRU0YxJ+9ReO1eXmOAq0FwKAyZS7nhvk9w@mail.gmail.com>
     [not found]       ` <1c057afa-92df-ee3c-5978-3731d3db9345@kernel.dk>
     [not found]         ` <20220815013651.mrm7qgklk6sgpkbb@awork3.anarazel.de>
     [not found]           ` <CAHk-=wikzU4402P-FpJRK_QwfVOS+t-3p1Wx5awGHTvr-s_0Ew@mail.gmail.com>
     [not found]             ` <20220815071143.n2t5xsmifnigttq2@awork3.anarazel.de>
     [not found]               ` <20220815031549-mutt-send-email-mst@kernel.org>
     [not found]                 ` <3df6bb82-1951-455d-a768-e9e1513eb667@www.fastmail.com>
2022-08-15  8:02                   ` upstream kernel crashes Michael S. Tsirkin
     [not found]               ` <20220815034532-mutt-send-email-mst@kernel.org>
     [not found]                 ` <20220815081527.soikyi365azh5qpu@awork3.anarazel.de>
     [not found]                   ` <20220815042623-mutt-send-email-mst@kernel.org>
     [not found]                     ` <FCDC5DDE-3CDD-4B8A-916F-CA7D87B547CE@anarazel.de>
2022-08-15 15:40                       ` Michael S. Tsirkin
2022-08-15 16:45                         ` Andres Freund
2022-08-15 16:50                           ` Michael S. Tsirkin
2022-08-15 17:46                             ` Andres Freund
2022-08-15 20:21                               ` Michael S. Tsirkin
2022-08-15 20:53                                 ` Andres Freund
2022-08-15 21:04                                   ` Andres Freund
2022-08-15 21:10                                     ` Andres Freund
2022-08-15 21:32                                   ` Michael S. Tsirkin
2022-08-16  2:45                                     ` Xuan Zhuo
2022-08-17  6:13                                     ` Dmitry Vyukov via Virtualization
2022-08-17  6:36                                       ` Xuan Zhuo
2022-08-17 10:53                                         ` Michael S. Tsirkin
2022-08-17 15:58                                         ` Linus Torvalds
2022-08-18  1:55                                           ` Xuan Zhuo
2022-08-15 20:45                             ` Guenter Roeck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).