* Re: remove function pointer casts and constify function tables
From: bfields @ 2017-05-26 15:09 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: linux-nfs@vger.kernel.org, virtualization,
anna.schumaker@netapp.com, Trond Myklebust,
jlayton@poochiereds.net, hch
Probably should have cc'd virtualization@lists.linux-foundation.org too.
On Fri, May 26, 2017 at 11:08:39AM -0400, bfields@fieldses.org wrote:
> On Tue, May 23, 2017 at 08:23:34AM -0400, bfields@fieldses.org wrote:
> > Unfortunately I can't get anything through testing. It's not your
> > patches, it's something in -rc1. My server VM stops responding to
> > any network traffic randomly in the middle of a run. If I log in from a
> > serial console, I see the interface is up and everything looks OK. I
> > haven't had the chance to do much more, and I'm not sure where to
> > start.... I started a git-bisect attempt, but there are several
> > unrelated problems, and I'm not sure this one is 100% reproducible.
>
> It looks like it may be due to something pulled in with virtio updates.
> I've reproduced the problem on c8b0d7290657 "s390/virtio: change
> maintainership" but not on v4.11. Are there any known issues with those
> commits?
>
> I've just been doing this long-running bisect while working on other
> stuff. My reproducer (basically just running a bunch of NFS
> connectathon tests over a variety of protocol versions and security
> flavors) doesn't hit the bug reliably, and I've had to restart a couple
> times probably due to false negatives. But this looks pretty promising,
> and there are only 17 commits in that range, so I'll keep bisecting.
>
> --b.
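On the false negatives mentioned above: if one run of the reproducer hits the hang with probability p, then n clean runs in a row still miss a bad commit with probability (1 - p)^n. The sketch below counts how many clean runs are needed before a commit can be marked good at a chosen residual risk. The thread gives no value for p; the function name and the numbers in the usage note are hypothetical and only illustrate the idea.

```c
/*
 * Hypothetical helper: smallest number of consecutive clean runs n such
 * that (1 - p_hit)^n <= risk, assuming each run independently hits the
 * bug with probability p_hit. Returns -1 for inputs outside (0, 1).
 */
static int clean_runs_needed(double p_hit, double risk)
{
	double miss = 1.0;	/* chance that every run so far missed the bug */
	int n = 0;

	if (p_hit <= 0.0 || p_hit >= 1.0 || risk <= 0.0 || risk >= 1.0)
		return -1;

	while (miss > risk) {
		miss *= 1.0 - p_hit;
		n++;
	}
	return n;
}
```

For example, clean_runs_needed(0.5, 0.05) returns 5: with a reproducer that fires half the time, five clean runs leave about a 3% chance of a wrong "good" verdict.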
* Re: remove function pointer casts and constify function tables
From: bfields @ 2017-05-26 19:31 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: linux-nfs@vger.kernel.org, virtualization,
anna.schumaker@netapp.com, Trond Myklebust,
jlayton@poochiereds.net, hch
Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
support for small rings".
After that patch, my NFS server VM stops responding to packets after a
few minutes of testing. Before that patch, my server keeps working.
--b.
* Re: remove function pointer casts and constify function tables
From: Michael S. Tsirkin @ 2017-05-30 16:26 UTC (permalink / raw)
To: bfields@fieldses.org
Cc: linux-nfs@vger.kernel.org, virtualization,
anna.schumaker@netapp.com, Trond Myklebust,
jlayton@poochiereds.net, hch
On Fri, May 26, 2017 at 03:31:33PM -0400, bfields@fieldses.org wrote:
> Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
> support for small rings".
>
> After that patch, my NFS server VM stops responding to packets after a
> few minutes of testing. Before that patch, my server keeps working.
>
> --b.
Others complained about that too.
I'm still trying to reproduce though.
Meanwhile, could you please locate this line of code:
+ vi->rq[i].min_buf_len = mergeable_min_buf_len(vi, vi->rq[i].vq);
and add something like
printk(KERN_ERR, "min buf = 0x%x expected 0x%x size 0x%x big %d\n",
vi->rq[i].min_buf_len, GOOD_PACKET_LEN,
virtqueue_get_vring_size(vi->rq[i].vq),
(int)vi->big_packets);
after it?
Then boot and capture the output.
Thanks!
--
MST
* Re: remove function pointer casts and constify function tables
From: Michael S. Tsirkin @ 2017-05-30 16:58 UTC (permalink / raw)
To: bfields@fieldses.org
Cc: linux-nfs@vger.kernel.org, virtualization,
anna.schumaker@netapp.com, Trond Myklebust,
jlayton@poochiereds.net, hch
On Tue, May 30, 2017 at 07:26:37PM +0300, Michael S. Tsirkin wrote:
> On Fri, May 26, 2017 at 03:31:33PM -0400, bfields@fieldses.org wrote:
> > Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
> > support for small rings".
> >
> > After that patch, my NFS server VM stops responding to packets after a
> > few minutes of testing. Before that patch, my server keeps working.
> >
> > --b.
>
> Others complained about that too.
> I'm still trying to reproduce though.
>
> Meanwhile, could you please locate this line of code:
> + vi->rq[i].min_buf_len = mergeable_min_buf_len(vi, vi->rq[i].vq);
>
> and add something like
> printk(KERN_ERR, "min buf = 0x%x expected 0x%x size 0x%x big %d\n",
> vi->rq[i].min_buf_len, GOOD_PACKET_LEN,
> virtqueue_get_vring_size(vi->rq[i].vq),
> (int)vi->big_packets);
>
> after it?
> Then boot and capture the output.
>
> Thanks!
>
Also, can you pls print the mergeable_rx_buffer_size attribute from
sysfs for this device?
> --
> MST
* Re: remove function pointer casts and constify function tables
From: Michael S. Tsirkin @ 2017-05-30 17:03 UTC (permalink / raw)
To: bfields@fieldses.org
Cc: linux-nfs@vger.kernel.org, virtualization,
anna.schumaker@netapp.com, Trond Myklebust,
jlayton@poochiereds.net, hch
On Fri, May 26, 2017 at 03:31:33PM -0400, bfields@fieldses.org wrote:
> Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
> support for small rings".
>
> After that patch, my NFS server VM stops responding to packets after a
> few minutes of testing. Before that patch, my server keeps working.
>
> --b.
So I think I know what caused this: looks like some hypervisors
aren't prepared to deal with a situation where packet size
becomes very small.
But which hypervisors exactly? I'd like to know in order to detect these
and decide whether I blacklist bad ones or whitelist known-good ones.
Thanks!
--
MST
* Re: remove function pointer casts and constify function tables
From: bfields @ 2017-05-31 20:57 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: linux-nfs@vger.kernel.org, virtualization,
anna.schumaker@netapp.com, Trond Myklebust,
jlayton@poochiereds.net, hch
On Tue, May 30, 2017 at 07:58:06PM +0300, Michael S. Tsirkin wrote:
> On Tue, May 30, 2017 at 07:26:37PM +0300, Michael S. Tsirkin wrote:
> > On Fri, May 26, 2017 at 03:31:33PM -0400, bfields@fieldses.org wrote:
> > > Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
> > > support for small rings".
> > >
> > > After that patch, my NFS server VM stops responding to packets after a
> > > few minutes of testing. Before that patch, my server keeps working.
> > >
> > > --b.
> >
> > Others complained about that too.
> > I'm still trying to reproduce though.
> >
> > Meanwhile, could you please locate this line of code:
> > + vi->rq[i].min_buf_len = mergeable_min_buf_len(vi, vi->rq[i].vq);
> >
> > and add something like
> > printk(KERN_ERR, "min buf = 0x%x expected 0x%x size 0x%x big %d\n",
> > vi->rq[i].min_buf_len, GOOD_PACKET_LEN,
> > virtqueue_get_vring_size(vi->rq[i].vq),
> > (int)vi->big_packets);
> >
> > after it?
> > Then boot and capture the output.
> >
> > Thanks!
> >
>
> Also, can you pls print the mergeable_rx_buffer_size attribute from
> sysfs for this device?
On 4.12-rc1:
# cat /sys/devices/pci0000:00/0000:00:03.0/virtio0/net/eth0/queues/rx-0/virtio_net/mergeable_rx_buffer_size
320
--b.
* Re: remove function pointer casts and constify function tables
From: bfields @ 2017-05-31 21:00 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: linux-nfs@vger.kernel.org, virtualization,
anna.schumaker@netapp.com, Trond Myklebust,
jlayton@poochiereds.net, hch
On Tue, May 30, 2017 at 08:03:12PM +0300, Michael S. Tsirkin wrote:
> On Fri, May 26, 2017 at 03:31:33PM -0400, bfields@fieldses.org wrote:
> > Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
> > support for small rings".
> >
> > After that patch, my NFS server VM stops responding to packets after a
> > few minutes of testing. Before that patch, my server keeps working.
> >
> > --b.
>
>
> So I think I know what caused this: looks like some hypervisors
> aren't prepared to deal with a situation where packet size
> becomes very small.
>
> But which hypervisors exactly? I'd like to know in order to detect these
> and decide whether I blacklist bad ones or whitelist known-good ones.
I'm running this under KVM on a Fedora 25 host
(4.10.15-200.fc25.x86_64). Let me know if any more details about the
setup would be useful.
--b.
* Re: remove function pointer casts and constify function tables
From: bfields @ 2017-05-31 21:09 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: linux-nfs@vger.kernel.org, virtualization,
anna.schumaker@netapp.com, Trond Myklebust,
jlayton@poochiereds.net, hch
On Tue, May 30, 2017 at 07:26:37PM +0300, Michael S. Tsirkin wrote:
> On Fri, May 26, 2017 at 03:31:33PM -0400, bfields@fieldses.org wrote:
> > Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
> > support for small rings".
> >
> > After that patch, my NFS server VM stops responding to packets after a
> > few minutes of testing. Before that patch, my server keeps working.
> >
> > --b.
>
> Others complained about that too.
> I'm still trying to reproduce though.
>
> Meanwhile, could you please locate this line of code:
> + vi->rq[i].min_buf_len = mergeable_min_buf_len(vi, vi->rq[i].vq);
>
> and add something like
> printk(KERN_ERR, "min buf = 0x%x expected 0x%x size 0x%x big %d\n",
> vi->rq[i].min_buf_len, GOOD_PACKET_LEN,
> virtqueue_get_vring_size(vi->rq[i].vq),
> (int)vi->big_packets);
>
> after it?
> Then boot and capture the output.
Doesn't look like that code's run on boot; apply the below, boot, and:
$ dmesg|grep expected
gives no output.
--b.
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 9320d96a1632..b10014f7b480 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2212,6 +2212,10 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
 	for (i = 0; i < vi->max_queue_pairs; i++) {
 		vi->rq[i].vq = vqs[rxq2vq(i)];
 		vi->rq[i].min_buf_len = mergeable_min_buf_len(vi, vi->rq[i].vq);
+		printk(KERN_ERR, "min buf = 0x%x expected 0x%x size 0x%x big %d\n",
+		       vi->rq[i].min_buf_len, GOOD_PACKET_LEN,
+		       virtqueue_get_vring_size(vi->rq[i].vq),
+		       (int)vi->big_packets);
 		vi->sq[i].vq = vqs[txq2vq(i)];
 	}
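
The empty grep result is most likely explained by the comma after KERN_ERR in the printk() call. In the kernel, KERN_ERR is a string literal ("\001" "3", defined in include/linux/kern_levels.h) that is meant to be joined to the format string by literal concatenation. With the comma, KERN_ERR alone becomes the format string and the intended format is passed as an unused argument, so the word "expected" never reaches the log. The userspace sketch below mimics this with snprintf(); the macro is redefined locally, the helper names are made up, and the format is shortened, so it illustrates the C behaviour and is not kernel code. A compiler with format checking enabled will typically warn about the extra arguments in the first helper.

```c
#include <stdio.h>

/* Local stand-in for the kernel's KERN_ERR ("\001" "3" in kern_levels.h). */
#define KERN_ERR "\001" "3"

/* With the comma: KERN_ERR is the whole format, the other arguments are ignored. */
static int format_with_comma(char *buf, size_t len, unsigned int min_buf)
{
	return snprintf(buf, len, KERN_ERR, "min buf = 0x%x expected\n", min_buf);
}

/* Without the comma: the adjacent literals concatenate into one format string. */
static int format_without_comma(char *buf, size_t len, unsigned int min_buf)
{
	return snprintf(buf, len, KERN_ERR "min buf = 0x%x expected\n", min_buf);
}
```

Only the second helper leaves "min buf = ... expected" in the buffer; the first writes just the two-byte level prefix.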
* Re: remove function pointer casts and constify function tables
From: bfields @ 2017-05-31 21:15 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: linux-nfs@vger.kernel.org, virtualization,
anna.schumaker@netapp.com, Trond Myklebust,
jlayton@poochiereds.net, hch
On Wed, May 31, 2017 at 05:09:23PM -0400, bfields@fieldses.org wrote:
> On Tue, May 30, 2017 at 07:26:37PM +0300, Michael S. Tsirkin wrote:
> > On Fri, May 26, 2017 at 03:31:33PM -0400, bfields@fieldses.org wrote:
> > > Looks like the culprit is very likely d85b758f72b0 "virtio_net: fix
> > > support for small rings".
> > >
> > > After that patch, my NFS server VM stops responding to packets after a
> > > few minutes of testing. Before that patch, my server keeps working.
> > >
> > > --b.
> >
> > Others complained about that too.
> > I'm still trying to reproduce though.
> >
> > Meanwhile, could you please locate this line of code:
> > + vi->rq[i].min_buf_len = mergeable_min_buf_len(vi, vi->rq[i].vq);
> >
> > and add something like
> > printk(KERN_ERR, "min buf = 0x%x expected 0x%x size 0x%x big %d\n",
> > vi->rq[i].min_buf_len, GOOD_PACKET_LEN,
> > virtqueue_get_vring_size(vi->rq[i].vq),
> > (int)vi->big_packets);
> >
> > after it?
> > Then boot and capture the output.
>
> Doesn't look like that code's run on boot; apply the below, boot, and:
Whoops, no, just a typo in the printk. Here you go:
min buf = 0x101 expected 0x5ee size 0x100 big 1
--b.
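
The numbers in that line can be reproduced by hand. The sketch below assumes that mergeable_min_buf_len() in d85b758f72b0 divides the largest possible receive buffer (header + Ethernet header + VLAN tag + maximum packet length) by the ring size, rounding up. That formula is written from memory of the commit and is not quoted anywhere in this thread, so check it against the source before relying on it. The constants (12-byte mergeable header, IP_MAX_MTU of 0xFFFF, 14-byte Ethernet header, 4-byte VLAN tag, 1500-byte payload) are likewise assumptions based on the usual kernel definitions, and the function names are hypothetical.

```c
/* Assumed values of the usual kernel constants; not taken from the thread. */
#define HDR_LEN      12u	/* sizeof(struct virtio_net_hdr_mrg_rxbuf) */
#define ETH_HLEN     14u
#define VLAN_HLEN    4u
#define ETH_DATA_LEN 1500u
#define IP_MAX_MTU   0xFFFFu
#define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)

/* Round-up integer division, like the kernel's DIV_ROUND_UP(). */
static unsigned int div_round_up(unsigned int n, unsigned int d)
{
	return (n + d - 1) / d;
}

/*
 * Sketch of the assumed computation for a device with big_packets set:
 * spread the largest buffer across the ring, never going below HDR_LEN.
 */
static unsigned int sketch_min_buf_len(unsigned int ring_size)
{
	unsigned int buf_len = HDR_LEN + ETH_HLEN + VLAN_HLEN + IP_MAX_MTU;
	unsigned int min_len = div_round_up(buf_len, ring_size);

	return min_len > HDR_LEN ? min_len : HDR_LEN;
}
```

Under these assumptions, sketch_min_buf_len(0x100) gives 0x101 and GOOD_PACKET_LEN is 0x5ee, matching the reported line: the minimum buffer length (257 bytes) is far below a full-size Ethernet frame (1518 bytes).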