* Re: remove function pointer casts and constify function tables
From: bfields @ 2017-05-26 15:09 UTC
To: Michael S. Tsirkin
Cc: linux-nfs@vger.kernel.org, virtualization, anna.schumaker@netapp.com,
    Trond Myklebust, jlayton@poochiereds.net, hch

Probably should have cc'd virtualization@lists.linux-foundation.org too.

On Fri, May 26, 2017 at 11:08:39AM -0400, bfields@fieldses.org wrote:
> On Tue, May 23, 2017 at 08:23:34AM -0400, bfields@fieldses.org wrote:
> > Unfortunately I can't get anything through testing. It's not your
> > patches, it's something in -rc1. My server VM stops responding to
> > any network traffic randomly in the middle of a run. If I log in from a
> > serial console, I see the interface is up and everything looks OK. I
> > haven't had the chance to do much more, and I'm not sure where to
> > start.... I started a git-bisect attempt, but there are several
> > unrelated problems, and I'm not sure this one is 100% reproducible.
>
> It looks like it may be due to something pulled in with virtio updates.
> I've reproduced the problem on c8b0d7290657 "s390/virtio: change
> maintainership" but not on v4.11. Are there any known issues with those
> commits?
>
> I've just been doing this long-running bisect while working on other
> stuff. My reproducer (basically just running a bunch of NFS
> connectathon tests over a variety of protocol versions and security
> flavors) doesn't hit the bug reliably, and I've had to restart a couple
> times probably due to false negatives. But this looks pretty promising,
> and there's only 17 commits in that range, so I'll keep bisecting.
>
> --b.

* Re: remove function pointer casts and constify function tables
From: bfields @ 2017-05-26 19:31 UTC
To: Michael S. Tsirkin
Cc: linux-nfs@vger.kernel.org, virtualization, anna.schumaker@netapp.com,
    Trond Myklebust, jlayton@poochiereds.net, hch

Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
support for small rings".

After that patch, my NFS server VM stops responding to packets after a
few minutes of testing. Before that patch, my server keeps working.

--b.

* Re: remove function pointer casts and constify function tables
From: Michael S. Tsirkin @ 2017-05-30 16:26 UTC
To: bfields@fieldses.org
Cc: linux-nfs@vger.kernel.org, virtualization, anna.schumaker@netapp.com,
    Trond Myklebust, jlayton@poochiereds.net, hch

On Fri, May 26, 2017 at 03:31:33PM -0400, bfields@fieldses.org wrote:
> Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
> support for small rings".
>
> After that patch, my NFS server VM stops responding to packets after a
> few minutes of testing. Before that patch, my server keeps working.
>
> --b.

Others complained about that too.
I'm still trying to reproduce though.

Meanwhile, could you please locate this line of code:
+               vi->rq[i].min_buf_len = mergeable_min_buf_len(vi, vi->rq[i].vq);

and add something like
        printk(KERN_ERR, "min buf = 0x%x expected 0x%x size 0x%x big %d\n",
                vi->rq[i].min_buf_len, GOOD_PACKET_LEN,
                virtqueue_get_vring_size(vi->rq[i].vq),
                (int)vi->big_packets);

after it?
Then boot and capture the output.

Thanks!

--
MST

* Re: remove function pointer casts and constify function tables
From: Michael S. Tsirkin @ 2017-05-30 16:58 UTC
To: bfields@fieldses.org
Cc: linux-nfs@vger.kernel.org, virtualization, anna.schumaker@netapp.com,
    Trond Myklebust, jlayton@poochiereds.net, hch

On Tue, May 30, 2017 at 07:26:37PM +0300, Michael S. Tsirkin wrote:
> On Fri, May 26, 2017 at 03:31:33PM -0400, bfields@fieldses.org wrote:
> > Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
> > support for small rings".
> >
> > After that patch, my NFS server VM stops responding to packets after a
> > few minutes of testing. Before that patch, my server keeps working.
> >
> > --b.
>
> Others complained about that too.
> I'm still trying to reproduce though.
>
> Meanwhile, could you please locate this line of code:
> +               vi->rq[i].min_buf_len = mergeable_min_buf_len(vi, vi->rq[i].vq);
>
> and add something like
>         printk(KERN_ERR, "min buf = 0x%x expected 0x%x size 0x%x big %d\n",
>                 vi->rq[i].min_buf_len, GOOD_PACKET_LEN,
>                 virtqueue_get_vring_size(vi->rq[i].vq),
>                 (int)vi->big_packets);
>
> after it?
> Then boot and capture the output.
>
> Thanks!
>

Also, can you pls print the mergeable_rx_buffer_size attribute from
sysfs for this device?

> --
> MST

* Re: remove function pointer casts and constify function tables
From: bfields @ 2017-05-31 20:57 UTC
To: Michael S. Tsirkin
Cc: linux-nfs@vger.kernel.org, virtualization, anna.schumaker@netapp.com,
    Trond Myklebust, jlayton@poochiereds.net, hch

On Tue, May 30, 2017 at 07:58:06PM +0300, Michael S. Tsirkin wrote:
> On Tue, May 30, 2017 at 07:26:37PM +0300, Michael S. Tsirkin wrote:
> > On Fri, May 26, 2017 at 03:31:33PM -0400, bfields@fieldses.org wrote:
> > > Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
> > > support for small rings".
> > >
> > > After that patch, my NFS server VM stops responding to packets after a
> > > few minutes of testing. Before that patch, my server keeps working.
> > >
> > > --b.
> >
> > Others complained about that too.
> > I'm still trying to reproduce though.
> >
> > Meanwhile, could you please locate this line of code:
> > +               vi->rq[i].min_buf_len = mergeable_min_buf_len(vi, vi->rq[i].vq);
> >
> > and add something like
> >         printk(KERN_ERR, "min buf = 0x%x expected 0x%x size 0x%x big %d\n",
> >                 vi->rq[i].min_buf_len, GOOD_PACKET_LEN,
> >                 virtqueue_get_vring_size(vi->rq[i].vq),
> >                 (int)vi->big_packets);
> >
> > after it?
> > Then boot and capture the output.
> >
> > Thanks!
> >
>
> Also, can you pls print the mergeable_rx_buffer_size attribute from
> sysfs for this device?

On 4.12-rc1:

# cat /sys/devices/pci0000:00/0000:00:03.0/virtio0/net/eth0/queues/rx-0/virtio_net/mergeable_rx_buffer_size
320

--b.

* Re: remove function pointer casts and constify function tables
From: bfields @ 2017-05-31 21:09 UTC
To: Michael S. Tsirkin
Cc: linux-nfs@vger.kernel.org, virtualization, anna.schumaker@netapp.com,
    Trond Myklebust, jlayton@poochiereds.net, hch

On Tue, May 30, 2017 at 07:26:37PM +0300, Michael S. Tsirkin wrote:
> On Fri, May 26, 2017 at 03:31:33PM -0400, bfields@fieldses.org wrote:
> > Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
> > support for small rings".
> >
> > After that patch, my NFS server VM stops responding to packets after a
> > few minutes of testing. Before that patch, my server keeps working.
> >
> > --b.
>
> Others complained about that too.
> I'm still trying to reproduce though.
>
> Meanwhile, could you please locate this line of code:
> +               vi->rq[i].min_buf_len = mergeable_min_buf_len(vi, vi->rq[i].vq);
>
> and add something like
>         printk(KERN_ERR, "min buf = 0x%x expected 0x%x size 0x%x big %d\n",
>                 vi->rq[i].min_buf_len, GOOD_PACKET_LEN,
>                 virtqueue_get_vring_size(vi->rq[i].vq),
>                 (int)vi->big_packets);
>
> after it?
> Then boot and capture the output.

Doesn't look like that code's run on boot; apply the below, boot, and:

	$ dmesg|grep expected

gives no output.

--b.

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 9320d96a1632..b10014f7b480 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2212,6 +2212,10 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
 	for (i = 0; i < vi->max_queue_pairs; i++) {
 		vi->rq[i].vq = vqs[rxq2vq(i)];
 		vi->rq[i].min_buf_len = mergeable_min_buf_len(vi, vi->rq[i].vq);
+		printk(KERN_ERR, "min buf = 0x%x expected 0x%x size 0x%x big %d\n",
+			vi->rq[i].min_buf_len, GOOD_PACKET_LEN,
+			virtqueue_get_vring_size(vi->rq[i].vq),
+			(int)vi->big_packets);
 		vi->sq[i].vq = vqs[txq2vq(i)];
 	}

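A follow-up in this thread puts the empty grep down to "a typo in the printk" without saying which one. One plausible reading, which is an assumption and not confirmed anywhere in the thread, is the comma after KERN_ERR: the log level is a string literal meant to be concatenated with the format, so with the comma it becomes the entire format string and the intended format is only an ignored argument. A userspace sketch of that effect, where format_line() and line_mentions_expected() are hypothetical helpers (not kernel APIs), KERN_ERR is a local stand-in, and the numbers are sample values:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Local stand-in for the kernel's KERN_ERR string literal. */
#define KERN_ERR "\001" "3"

/*
 * Hypothetical userspace stand-in for printk(): the first argument after
 * `len` is treated as the format string, as printk treats its first argument.
 */
static int format_line(char *out, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(out != NULL && len > 0);
	va_start(ap, fmt);
	n = vsnprintf(out, len, fmt, ap);
	va_end(ap);
	return n;
}

/* Returns 1 if the formatted line would match `grep expected`, else 0. */
static int line_mentions_expected(int with_comma)
{
	char line[128];

	if (with_comma)
		/* Assumed typo: KERN_ERR alone is the format, the rest is ignored. */
		format_line(line, sizeof(line), KERN_ERR,
			    "min buf = 0x%x expected 0x%x\n", 0x101, 0x5ee);
	else
		/* Intended form: adjacent literals concatenate into one format. */
		format_line(line, sizeof(line),
			    KERN_ERR "min buf = 0x%x expected 0x%x\n", 0x101, 0x5ee);

	return strstr(line, "expected") != NULL;
}
```

With the comma the formatted line holds only the level prefix, so line_mentions_expected(1) is 0, which would be consistent with grep finding nothing; without it, line_mentions_expected(0) is 1.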
* Re: remove function pointer casts and constify function tables
From: bfields @ 2017-05-31 21:15 UTC
To: Michael S. Tsirkin
Cc: linux-nfs@vger.kernel.org, virtualization, anna.schumaker@netapp.com,
    Trond Myklebust, jlayton@poochiereds.net, hch

On Wed, May 31, 2017 at 05:09:23PM -0400, bfields@fieldses.org wrote:
> On Tue, May 30, 2017 at 07:26:37PM +0300, Michael S. Tsirkin wrote:
> > On Fri, May 26, 2017 at 03:31:33PM -0400, bfields@fieldses.org wrote:
> > > Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
> > > support for small rings".
> > >
> > > After that patch, my NFS server VM stops responding to packets after a
> > > few minutes of testing. Before that patch, my server keeps working.
> > >
> > > --b.
> >
> > Others complained about that too.
> > I'm still trying to reproduce though.
> >
> > Meanwhile, could you please locate this line of code:
> > +               vi->rq[i].min_buf_len = mergeable_min_buf_len(vi, vi->rq[i].vq);
> >
> > and add something like
> >         printk(KERN_ERR, "min buf = 0x%x expected 0x%x size 0x%x big %d\n",
> >                 vi->rq[i].min_buf_len, GOOD_PACKET_LEN,
> >                 virtqueue_get_vring_size(vi->rq[i].vq),
> >                 (int)vi->big_packets);
> >
> > after it?
> > Then boot and capture the output.
>
> Doesn't look like that code's run on boot; apply the below, boot, and:

Whoops, no, just a typo in the printk. Here you go:

	min buf = 0x101 expected 0x5ee size 0x100 big 1

--b.

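For reference, the reported values are min buf 0x101 = 257 bytes, expected 0x5ee = 1518 bytes, and ring size 0x100 = 256 entries. The sketch below shows one way 257 could come out of a 256-entry ring, and how many such buffers a 1518-byte frame would span. The formula and the constants in min_buf_len_sketch() are assumptions chosen because they reproduce the reported number, not code quoted from the driver, and nothing in this thread establishes that this is the failure mechanism.

```c
#include <assert.h>

/* Round-up integer division, in the spirit of the kernel's DIV_ROUND_UP(). */
static unsigned int div_round_up(unsigned int n, unsigned int d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}

/*
 * Hypothetical reconstruction (an assumption, not taken from the driver):
 * spread one maximum-sized packet plus its headers across every ring entry.
 */
static unsigned int min_buf_len_sketch(unsigned int ring_size)
{
	const unsigned int hdr_len = 12;       /* assumed mergeable rx header size */
	const unsigned int eth_hlen = 14;      /* Ethernet header */
	const unsigned int vlan_hlen = 4;      /* VLAN tag */
	const unsigned int max_packet = 65535; /* assumed limit with big packets */

	return div_round_up(hdr_len + eth_hlen + vlan_hlen + max_packet,
			    ring_size);
}

/* Number of buf_len-byte buffers needed to hold one frame_len-byte frame. */
static unsigned int buffers_per_frame(unsigned int frame_len,
				      unsigned int buf_len)
{
	return div_round_up(frame_len, buf_len);
}
```

Under these assumptions min_buf_len_sketch(0x100) gives 0x101, and buffers_per_frame(0x5ee, 0x101) gives 6, so a full-sized frame would no longer fit in a single minimum-sized buffer.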
* Re: remove function pointer casts and constify function tables
From: Michael S. Tsirkin @ 2017-05-30 17:03 UTC
To: bfields@fieldses.org
Cc: linux-nfs@vger.kernel.org, virtualization, anna.schumaker@netapp.com,
    Trond Myklebust, jlayton@poochiereds.net, hch

On Fri, May 26, 2017 at 03:31:33PM -0400, bfields@fieldses.org wrote:
> Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
> support for small rings".
>
> After that patch, my NFS server VM stops responding to packets after a
> few minutes of testing. Before that patch, my server keeps working.
>
> --b.

So I think I know what caused this: looks like some hypervisors
aren't prepared to deal with a situation where packet size
becomes very small.

But which hypervisors exactly? I'd like to know in order to detect these
and decide whether I blacklist bad ones or whitelist known-good ones.

Thanks!

--
MST

* Re: remove function pointer casts and constify function tables
From: bfields @ 2017-05-31 21:00 UTC
To: Michael S. Tsirkin
Cc: linux-nfs@vger.kernel.org, virtualization, anna.schumaker@netapp.com,
    Trond Myklebust, jlayton@poochiereds.net, hch

On Tue, May 30, 2017 at 08:03:12PM +0300, Michael S. Tsirkin wrote:
> On Fri, May 26, 2017 at 03:31:33PM -0400, bfields@fieldses.org wrote:
> > Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
> > support for small rings".
> >
> > After that patch, my NFS server VM stops responding to packets after a
> > few minutes of testing. Before that patch, my server keeps working.
> >
> > --b.
>
>
> So I think I know what caused this: looks like some hypervisors
> aren't prepared to deal with a situation where packet size
> becomes very small.
>
> But which hypervisors exactly? I'd like to know in order to detect these
> and decide whether I blacklist bad ones or whitelist known-good ones.

I'm running this under KVM on a Fedora 25 host
(4.10.15-200.fc25.x86_64).

Let me know if any more details about the setup would be useful.

--b.