* Re: [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)
From: 黄杰 @ 2023-03-28 9:09 UTC
To: Jason Wang
Cc: Dominique Martinet, Michael S. Tsirkin, virtualization,
linux-kernel, Luis Chamberlain, v9fs-developer,
Eric Van Hensbergen, Christian Schoenebeck
Jason Wang <jasowang@redhat.com> wrote on Tue, Mar 28, 2023 at 11:40:
>
> On Tue, Mar 28, 2023 at 11:09 AM 黄杰 <huangjie.albert@bytedance.com> wrote:
> >
> > Jason Wang <jasowang@redhat.com> wrote on Tue, Mar 28, 2023 at 10:59:
> > >
> > > On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
> > > <asmadeus@codewreck.org> wrote:
> > > >
> > > > Hi Michael, Albert,
> > > >
> > > > Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > > > > > in virtio_net, if we disable the napi_tx, when we trigger a tx interrupt,
> > > > > the vq->event_triggered will be set to true. It will no longer be set to
> > > > > false. Unless we explicitly call virtqueue_enable_cb_delayed or
> > > > > virtqueue_enable_cb_prepare.
> > > >
> > > > > This patch (committed as 35395770f803 ("virtio_ring: don't update event
> > > > idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
> > > > Luis in https://lkml.kernel.org/r/ZCI+7Wg5OclSlE8c@bombadil.infradead.org
> > > >
> > > > > I've just had a look at recent patches[1] and reverted this to test
> > > > and I can mount again, so I'm pretty sure this is the culprit, but I
> > > > didn't look at the content at all yet so cannot advise further.
> > > > It might very well be that we need some extra handling for 9p
> > > > specifically that can be added separately if required.
> > > >
> > > > [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
> > > >
> > > >
> > > > This can be reproduced with a simple mount, run qemu with some -virtfs
> > > > argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> > > > these messages:
> > > > 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> > > > 9pnet: -- p9_virtio_request (83): virtio request kicked
> > > >
> > > > So I suspect we're just not getting a callback.
> > >
> > > I think so. The patch assumes the driver will call
> > > virtqueue_disable/enable_cb() which is not the case of the 9p driver.
> > >
> > > So after the first interrupt, event_triggered will be set to true forever.
> > >
> > > Thanks
> > >
> >
> > Hi: Wang
> >
> > Yes, this patch assumes that all virtio-related drivers will call
> > virtqueue_disable/enable_cb().
> > Thank you for raising this issue.
> >
> > It seems that napi_tx is only related to virtio_net. I'm wondering whether
> > we need to refactor napi_tx instead of implementing it inside virtio_ring.
>
> We can hear from others.
>
> I think it's better not to work around virtio_ring issues in a specific
> driver. It might just add more hacks. We should correctly set
> VRING_AVAIL_F_NO_INTERRUPT.
>
> Do you think the following might work (not even a compile test)?
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 41144b5246a8..12f4efb6dc54 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -852,16 +852,16 @@ static void virtqueue_disable_cb_split(struct
> virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> - vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> - if (vq->event)
> - /* TODO: this is a hack. Figure out a cleaner
> value to write. */
> - vring_used_event(&vq->split.vring) = 0x0;
> - else
> - vq->split.vring.avail->flags =
> - cpu_to_virtio16(_vq->vdev,
> - vq->split.avail_flags_shadow);
> - }
> + if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> + vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> +
> + if (vq->event && !vq->event_triggered)
> + /* TODO: this is a hack. Figure out a cleaner value to write. */
> + vring_used_event(&vq->split.vring) = 0x0;
> + else
> + vq->split.vring.avail->flags =
> + cpu_to_virtio16(_vq->vdev,
> + vq->split.avail_flags_shadow);
> }
>
> static unsigned int virtqueue_enable_cb_prepare_split(struct virtqueue *_vq)
> @@ -1697,8 +1697,10 @@ static void virtqueue_disable_cb_packed(struct
> virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> + if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE))
> vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> +
> + if (vq->event_triggered)
> vq->packed.vring.driver->flags =
> cpu_to_le16(vq->packed.event_flags_shadow);
> }
> @@ -2330,12 +2332,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - /* If device triggered an event already it won't trigger one again:
> - * no need to disable.
> - */
> - if (vq->event_triggered)
> - return;
> -
> if (vq->packed_ring)
> virtqueue_disable_cb_packed(_vq);
> else
>
> Thanks
>
Hi, this patch seems to address the issue I initially raised and also
avoids the problem with virtio-9P.
But maybe this is a better choice:
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 307e139cb11d..6784d155c781 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -812,6 +812,10 @@ static void virtqueue_disable_cb_split(struct virtqueue *_vq)

 	if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
 		vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
+
+		if (vq->event_triggered)
+			return;
+
 		if (vq->event)
 			/* TODO: this is a hack. Figure out a cleaner value to write. */
 			vring_used_event(&vq->split.vring) = 0x0;
@@ -1546,6 +1550,10 @@ static void virtqueue_disable_cb_packed(struct virtqueue *_vq)

 	if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
 		vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
+
+		if (vq->event_triggered)
+			return;
+
 		vq->packed.vring.driver->flags =
 			cpu_to_le16(vq->packed.event_flags_shadow);
 	}
@@ -2063,12 +2071,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);

-	/* If device triggered an event already it won't trigger one again:
-	 * no need to disable.
-	 */
-	if (vq->event_triggered)
-		return;
-
 	if (vq->packed_ring)
 		virtqueue_disable_cb_packed(_vq);
 	else
Does Michael have any other suggestions?
Thanks.
> >
> > Thanks
> >
> > > >
> > > >
> > > > I'll have a closer look after work, but any advice meanwhile will be
> > > > appreciated!
> > > > (I'm sure Luis would also like a temporary drop from -next until
> > > > this is figured out, but I'll leave this up to you)
> > > >
> > > >
> > > > >
> > > > > If we disable the napi_tx, it will only be called when the tx ring
> > > > > buffer is relatively small.
> > > > >
> > > > > Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> > > > > VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > > every time we call virtqueue_get_buf_ctx. This will bring more interruptions.
> > > > >
> > > > > To summarize:
> > > > > 1) event_triggered was set to true in vring_interrupt()
> > > > > 2) after this nothing will happen for virtqueue_disable_cb() so
> > > > > VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> > > > > 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> > > > > then it tries to publish new event
> > > > >
> > > > > To fix, if event_triggered is set to true, do not update
> > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > >
> > > > > Tested with iperf:
> > > > > iperf3 tcp stream:
> > > > > vm1 -----------------> vm2
> > > > > vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> > > > > there are many tx interrupts in vm2.
> > > > > but without event_triggered there are just a few tx interrupts.
> > > > >
> > > > > Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> > > > > Signed-off-by: Albert Huang <huangjie.albert@bytedance.com>
> > > > > Message-Id: <20230321085953.24949-1-huangjie.albert@bytedance.com>
> > > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > > ---
> > > > > drivers/virtio/virtio_ring.c | 6 ++++--
> > > > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index cbeeea1b0439..1c36fa477966 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > > > > /* If we expect an interrupt for the next entry, tell host
> > > > > * by writing event index and flush out the write before
> > > > > * the read in the next get_buf call. */
> > > > > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > > > > + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> > > > > + !vq->event_triggered))
> > > > > virtio_store_mb(vq->weak_barriers,
> > > > > &vring_used_event(&vq->split.vring),
> > > > > cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> > > > > @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > > > * by writing event index and flush out the write before
> > > > > * the read in the next get_buf call.
> > > > > */
> > > > > - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> > > > > + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> > > > > + !vq->event_triggered))
> > > > > virtio_store_mb(vq->weak_barriers,
> > > > > &vq->packed.vring.driver->off_wrap,
> > > > > cpu_to_le16(vq->last_used_idx));
> > > >
> > >
> >
>
* Re: [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)
From: Michael S. Tsirkin @ 2023-03-28 14:35 UTC
To: 黄杰
Cc: Jason Wang, Dominique Martinet, virtualization, linux-kernel,
Luis Chamberlain, v9fs-developer, Eric Van Hensbergen,
Christian Schoenebeck
On Tue, Mar 28, 2023 at 05:09:19PM +0800, 黄杰 wrote:
> Hi, this patch seems to address the issue I initially raised and also
> avoids the problem with virtio-9P.
>
> But maybe this is a better choice:
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 307e139cb11d..6784d155c781 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -812,6 +812,10 @@ static void virtqueue_disable_cb_split(struct
> virtqueue *_vq)
>
> if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> +
> + if (vq->event_triggered)
> + return;
> +
> if (vq->event)
> /* TODO: this is a hack. Figure out a cleaner
> value to write. */
> vring_used_event(&vq->split.vring) = 0x0;
> @@ -1546,6 +1550,10 @@ static void virtqueue_disable_cb_packed(struct
> virtqueue *_vq)
>
> if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> +
> + if (vq->event_triggered)
> + return;
> +
> vq->packed.vring.driver->flags =
> cpu_to_le16(vq->packed.event_flags_shadow);
> }
> @@ -2063,12 +2071,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - /* If device triggered an event already it won't trigger one again:
> - * no need to disable.
> - */
> - if (vq->event_triggered)
> - return;
> -
> if (vq->packed_ring)
> virtqueue_disable_cb_packed(_vq);
> else
>
> Does Michael have any other suggestions?
>
> Thanks.
Hmm, what bothers me is that this breaks the underlying assumption that the
shadow is an exact match of the shared memory. Need to check that this does
not cascade. But I am still trying to get my head around what
the issue is.
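
One way to read this concern is through a toy userspace model of the
packed-ring hunk quoted above. This is only a sketch, not kernel code:
struct toy_packed, toy_disable_cb_packed() and toy_shadow_matches_shared()
are hypothetical names, and the TOY_EVENT_* values merely stand in for the
VRING_PACKED_EVENT_FLAG_* constants. It illustrates that once
event_triggered is set, the shadow is updated while the copy the device
can see is left as it was.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for VRING_PACKED_EVENT_FLAG_{ENABLE,DISABLE,DESC}. */
enum { TOY_EVENT_ENABLE = 0, TOY_EVENT_DISABLE = 1, TOY_EVENT_DESC = 2 };

/* Hypothetical, reduced view of the driver-side packed ring state. */
struct toy_packed {
	bool event_triggered;        /* set once an interrupt has fired */
	uint16_t event_flags_shadow; /* driver-private copy */
	uint16_t driver_flags;       /* stands in for the flags in shared memory */
};

/* Same shape as the proposed virtqueue_disable_cb_packed() change. */
static void toy_disable_cb_packed(struct toy_packed *p)
{
	if (p->event_flags_shadow != TOY_EVENT_DISABLE) {
		p->event_flags_shadow = TOY_EVENT_DISABLE;

		if (p->event_triggered)
			return; /* shared copy is not written */

		p->driver_flags = p->event_flags_shadow;
	}
}

/* Returns true if shadow and shared copy agree after one disable_cb call. */
bool toy_shadow_matches_shared(bool event_triggered)
{
	struct toy_packed p = {
		.event_triggered = event_triggered,
		.event_flags_shadow = TOY_EVENT_DESC,
		.driver_flags = TOY_EVENT_DESC,
	};

	toy_disable_cb_packed(&p);
	assert(p.event_flags_shadow == TOY_EVENT_DISABLE); /* shadow always changes */
	return p.event_flags_shadow == p.driver_flags;
}
```

In this model toy_shadow_matches_shared(false) holds, while
toy_shadow_matches_shared(true) does not: the two copies diverge until
something writes the shared flags again.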
* Re: [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)
From: Michael S. Tsirkin @ 2023-03-29 5:34 UTC
To: 黄杰
Cc: Jason Wang, Dominique Martinet, virtualization, linux-kernel,
Luis Chamberlain, v9fs-developer, Eric Van Hensbergen,
Christian Schoenebeck
On Tue, Mar 28, 2023 at 05:09:19PM +0800, 黄杰 wrote:
> Hi, this patch seems to address the issue I initially raised and also
> avoids the problem with virtio-9P.
>
> But maybe this is a better choice:
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 307e139cb11d..6784d155c781 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -812,6 +812,10 @@ static void virtqueue_disable_cb_split(struct
> virtqueue *_vq)
>
> if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> +
> + if (vq->event_triggered)
> + return;
> +
> if (vq->event)
> /* TODO: this is a hack. Figure out a cleaner
> value to write. */
> vring_used_event(&vq->split.vring) = 0x0;
> @@ -1546,6 +1550,10 @@ static void virtqueue_disable_cb_packed(struct
> virtqueue *_vq)
>
> if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> +
> + if (vq->event_triggered)
> + return;
> +
> vq->packed.vring.driver->flags =
> cpu_to_le16(vq->packed.event_flags_shadow);
> }
> @@ -2063,12 +2071,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - /* If device triggered an event already it won't trigger one again:
> - * no need to disable.
> - */
> - if (vq->event_triggered)
> - return;
> -
> if (vq->packed_ring)
> virtqueue_disable_cb_packed(_vq);
> else
>
> Does Michael have any other suggestions?
>
> Thanks.
Oh, I think I finally understand. The issue arises with event index
enabled: the interrupt fires once and we set the event index to 0x0.
Since the shadow is not set, we never update it again.

I agree, your patch is a good fix. Just please copy the comment
you are removing to the two places where we now check event_triggered.
I dropped patch v2 from my tree; please post v3
with this fixup squashed.

Thanks a lot!

Apropos: if we fixed all drivers to call disable/enable_cb
explicitly, we could remove the branch on the data path, but it's a lot of work
and it's easy to miss some drivers.
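
The 9p hang under v2 can be illustrated with a small userspace model,
assuming a driver that (as described for 9p earlier in the thread) only
fetches buffers from its callback and never calls disable/enable_cb.
Everything prefixed toy_ is hypothetical; only the wrap-safe comparison
in toy_need_event() follows the vring_need_event() helper from the virtio
ring UAPI header. It is a sketch of the mechanism, not the actual driver
or device code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of a split ring with event index enabled. */
struct toy_vq {
	bool event_triggered;   /* set by the interrupt handler */
	uint16_t used_event;    /* index published by the driver */
	uint16_t used_idx;      /* device's used index */
	uint16_t last_used_idx; /* how far the driver has consumed */
};

/* Same comparison as vring_need_event(): did new_idx just pass event_idx? */
static bool toy_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

/* Device completes one request; returns true if it would interrupt. */
static bool toy_device_complete(struct toy_vq *vq)
{
	uint16_t old_idx = vq->used_idx;

	vq->used_idx++;
	return toy_need_event(vq->used_event, vq->used_idx, old_idx);
}

/* Interrupt handler plus get_buf; no disable/enable_cb calls at all. */
static void toy_driver_irq(struct toy_vq *vq, bool v2_check)
{
	vq->event_triggered = true;
	vq->last_used_idx++;

	/* v2 added the event_triggered test before republishing the index. */
	if (!v2_check || !vq->event_triggered)
		vq->used_event = vq->last_used_idx;
}

/* Count how many of `requests` completions are signalled to the driver. */
unsigned int toy_count_interrupts(unsigned int requests, bool v2_check)
{
	struct toy_vq vq = { 0 };
	unsigned int irqs = 0;

	assert(requests < 65536); /* keep the model away from index wrap */

	for (unsigned int i = 0; i < requests; i++) {
		if (toy_device_complete(&vq)) {
			irqs++;
			toy_driver_irq(&vq, v2_check);
		}
	}
	return irqs;
}
```

Under these assumptions toy_count_interrupts(4, false) yields 4, while
toy_count_interrupts(4, true) yields 1: after the first interrupt the
event index is never republished, so later completions are not signalled.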
* [PATCH v3] virtio_ring: interrupt disable flag updated to vq even with event_triggered is set
2023-03-29 5:34 ` Michael S. Tsirkin
@ 2023-03-29 7:21 ` Albert Huang
2023-03-29 7:27 ` Xuan Zhuo
2023-03-29 16:26 ` Michael S. Tsirkin
0 siblings, 2 replies; 19+ messages in thread
From: Albert Huang @ 2023-03-29 7:21 UTC (permalink / raw)
To: Jason Wang, Michael S . Tsirkin
Cc: huangjie.albert, virtualization, linux-kernel
From: "huangjie.albert" <huangjie.albert@bytedance.com>
In virtio_net, if we disable napi_tx, then when we trigger a tx interrupt,
vq->event_triggered will be set to true. It stays true until we explicitly
call virtqueue_enable_cb_delayed or virtqueue_enable_cb_prepare.

If we disable napi_tx, they will only be called when the tx ring
buffer is relatively small.

Because event_triggered is true, VRING_AVAIL_F_NO_INTERRUPT or
VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
every time we call virtqueue_get_buf_ctx. This brings more interrupts.

To summarize:
1) event_triggered was set to true in vring_interrupt()
2) after this nothing will happen for virtqueue_disable_cb() so
   VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
   then it tries to publish new event

To fix:
make virtqueue_disable_cb set VRING_AVAIL_F_NO_INTERRUPT or
VRING_PACKED_EVENT_FLAG_DISABLE in the flags shadow even when
event_triggered is true.

Tested with iperf:
iperf3 tcp stream:
vm1 -----------------> vm2
vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
there are many tx interrupts in vm2.
but without event_triggered there are just a few tx interrupts.

v2->v3:
- update the interrupt disable flag even when event_triggered is set,
  instead of checking event_triggered in
  virtqueue_get_buf_ctx_{packed/split}; that check caused drivers which
  never call virtqueue_{enable/disable}_cb to miss notifications.

Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
Signed-off-by: huangjie.albert <huangjie.albert@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
drivers/virtio/virtio_ring.c | 24 +++++++++++++++++-------
1 file changed, 17 insertions(+), 7 deletions(-)
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 307e139cb11d..ad74463a48ee 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -812,6 +812,14 @@ static void virtqueue_disable_cb_split(struct virtqueue *_vq)
if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
+
+ /*
+ * If device triggered an event already it won't trigger one again:
+ * no need to disable.
+ */
+ if (vq->event_triggered)
+ return;
+
if (vq->event)
/* TODO: this is a hack. Figure out a cleaner value to write. */
vring_used_event(&vq->split.vring) = 0x0;
@@ -1544,8 +1552,16 @@ static void virtqueue_disable_cb_packed(struct virtqueue *_vq)
{
struct vring_virtqueue *vq = to_vvq(_vq);
- if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
+ if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE)) {
vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
+
+ /*
+ * If device triggered an event already it won't trigger one again:
+ * no need to disable.
+ */
+ if (vq->event_triggered)
+ return;
+
vq->packed.vring.driver->flags =
cpu_to_le16(vq->packed.event_flags_shadow);
}
@@ -2063,12 +2079,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
{
struct vring_virtqueue *vq = to_vvq(_vq);
- /* If device triggered an event already it won't trigger one again:
- * no need to disable.
- */
- if (vq->event_triggered)
- return;
-
if (vq->packed_ring)
virtqueue_disable_cb_packed(_vq);
else
--
2.31.1
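
The effect described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct model_vq, MODEL_F_NO_INTERRUPT and the model_* functions are hypothetical names made up for illustration, not kernel code, and only the split-ring shadow flag is modelled. It counts how often get_buf would publish a new event index after disable_cb has been called with event_triggered already set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for VRING_AVAIL_F_NO_INTERRUPT. */
#define MODEL_F_NO_INTERRUPT 1u

/* Hypothetical model of the few vring_virtqueue fields that matter here. */
struct model_vq {
	uint16_t flags_shadow;          /* models vq->split.avail_flags_shadow */
	bool event_triggered;           /* models vq->event_triggered */
	uint16_t last_used_idx;
	unsigned int used_event_writes; /* times get_buf published an event index */
};

/* Before the patch: the early return also skips the shadow update. */
static void model_disable_cb_old(struct model_vq *vq)
{
	if (vq->event_triggered)
		return;
	vq->flags_shadow |= MODEL_F_NO_INTERRUPT;
}

/*
 * With the v3 patch: the shadow is updated first; event_triggered only
 * skips the write to the ring itself, which this model does not track.
 */
static void model_disable_cb_v3(struct model_vq *vq)
{
	if (!(vq->flags_shadow & MODEL_F_NO_INTERRUPT))
		vq->flags_shadow |= MODEL_F_NO_INTERRUPT;
}

/* get_buf publishes a new event index only if the shadow says cb is enabled. */
static void model_get_buf(struct model_vq *vq)
{
	vq->last_used_idx++;
	if (!(vq->flags_shadow & MODEL_F_NO_INTERRUPT))
		vq->used_event_writes++;
}

/* Interrupt already fired, driver disables the cb, then reaps nbufs buffers. */
static unsigned int model_run(bool use_v3, unsigned int nbufs)
{
	struct model_vq vq = { .event_triggered = true };
	unsigned int i;

	if (use_v3)
		model_disable_cb_v3(&vq);
	else
		model_disable_cb_old(&vq);

	for (i = 0; i < nbufs; i++)
		model_get_buf(&vq);

	return vq.used_event_writes;
}
```

In this model, model_run(false, n) publishes an event index on every get_buf call, while model_run(true, n) publishes none, which matches the interrupt reduction the commit message reports.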
* Re: [PATCH v3] virtio_ring: interrupt disable flag updated to vq even with event_triggered is set
2023-03-29 7:21 ` [PATCH v3] virtio_ring: interrupt disable flag updated to vq even with event_triggered is set Albert Huang
@ 2023-03-29 7:27 ` Xuan Zhuo
2023-03-29 16:28 ` Michael S. Tsirkin
2023-03-29 16:26 ` Michael S. Tsirkin
1 sibling, 1 reply; 19+ messages in thread
From: Xuan Zhuo @ 2023-03-29 7:27 UTC (permalink / raw)
To: Albert Huang
Cc: huangjie.albert, virtualization, linux-kernel, Jason Wang,
Michael S . Tsirkin
Maybe one new thread is better.
Thanks.
On Wed, 29 Mar 2023 15:21:35 +0800, Albert Huang <huangjie.albert@bytedance.com> wrote:
> From: "huangjie.albert" <huangjie.albert@bytedance.com>
>
> in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> the vq->event_triggered will be set to true. It will no longer be set to
> false. Unless we explicitly call virtqueue_enable_cb_delayed or
> virtqueue_enable_cb_prepare.
>
> If we disable the napi_tx, it will only be called when the tx ring
> buffer is relatively small.
>
> Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> every time we call virtqueue_get_buf_ctx.This bring more interruptions.
>
> To summarize:
> 1) event_triggered was set to true in vring_interrupt()
> 2) after this nothing will happen for virtqueue_disable_cb() so
> VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> then it tries to publish new event
>
> To fix:
> update VRING_AVAIL_F_NO_INTERRUPT or VRING_PACKED_EVENT_FLAG_DISABLE to vq
> when we call virtqueue_disable_cb even the event_triggered is set to true.
>
> Tested with iperf:
> iperf3 tcp stream:
> vm1 -----------------> vm2
> vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> there are many tx interrupts in vm2.
> but without event_triggered there are just a few tx interrupts.
>
> v2->v3:
> -update the interrupt disable flag even with the event_triggered is set,
> -instead of checking whether event_triggered is set in
> -virtqueue_get_buf_ctx_{packed/split}, will cause the drivers which have
> -not called virtqueue_{enable/disable}_cb to miss notifications.
>
> Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> Signed-off-by: huangjie.albert <huangjie.albert@bytedance.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> drivers/virtio/virtio_ring.c | 24 +++++++++++++++++-------
> 1 file changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 307e139cb11d..ad74463a48ee 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -812,6 +812,14 @@ static void virtqueue_disable_cb_split(struct virtqueue *_vq)
>
> if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> +
> + /*
> + * If device triggered an event already it won't trigger one again:
> + * no need to disable.
> + */
> + if (vq->event_triggered)
> + return;
> +
> if (vq->event)
> /* TODO: this is a hack. Figure out a cleaner value to write. */
> vring_used_event(&vq->split.vring) = 0x0;
> @@ -1544,8 +1552,16 @@ static void virtqueue_disable_cb_packed(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> + if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE)) {
> vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> +
> + /*
> + * If device triggered an event already it won't trigger one again:
> + * no need to disable.
> + */
> + if (vq->event_triggered)
> + return;
> +
> vq->packed.vring.driver->flags =
> cpu_to_le16(vq->packed.event_flags_shadow);
> }
> @@ -2063,12 +2079,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - /* If device triggered an event already it won't trigger one again:
> - * no need to disable.
> - */
> - if (vq->event_triggered)
> - return;
> -
> if (vq->packed_ring)
> virtqueue_disable_cb_packed(_vq);
> else
> --
> 2.31.1
>
* Re: [PATCH v3] virtio_ring: interrupt disable flag updated to vq even with event_triggered is set
2023-03-29 7:27 ` Xuan Zhuo
@ 2023-03-29 16:28 ` Michael S. Tsirkin
0 siblings, 0 replies; 19+ messages in thread
From: Michael S. Tsirkin @ 2023-03-29 16:28 UTC (permalink / raw)
To: Xuan Zhuo; +Cc: Albert Huang, virtualization, linux-kernel, Jason Wang
On Wed, Mar 29, 2023 at 03:27:03PM +0800, Xuan Zhuo wrote:
> Maybe one new thread is better.
>
> Thanks.
I don't know, but please do not post the same message twice
without explanation. If you repost, put "PATCH repost" in
the subject.
* Re: [PATCH v3] virtio_ring: interrupt disable flag updated to vq even with event_triggered is set
2023-03-29 7:21 ` [PATCH v3] virtio_ring: interrupt disable flag updated to vq even with event_triggered is set Albert Huang
2023-03-29 7:27 ` Xuan Zhuo
@ 2023-03-29 16:26 ` Michael S. Tsirkin
1 sibling, 0 replies; 19+ messages in thread
From: Michael S. Tsirkin @ 2023-03-29 16:26 UTC (permalink / raw)
To: Albert Huang; +Cc: Jason Wang, virtualization, linux-kernel
I can't parse the subject at all. I think the subject from v2
was fine - we are skipping event index updates on get buf.
Or maybe go higher level and describe the effect of the patch:
virtio_ring: reduce interrupt rate with event idx enabled
On Wed, Mar 29, 2023 at 03:21:35PM +0800, Albert Huang wrote:
> From: "huangjie.albert" <huangjie.albert@bytedance.com>
>
> in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> the vq->event_triggered will be set to true. It will no longer be set to
> false.
this last sentence is redundant
> Unless we explicitly call virtqueue_enable_cb_delayed or
> virtqueue_enable_cb_prepare.
>
> If we disable the napi_tx, it will only be called when the tx ring
> buffer is relatively small.
>
> Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> every time we call virtqueue_get_buf_ctx.This bring more interruptions.
Should read "This will bring more interrupts." And please add a space after ".".
>
> To summarize:
> 1) event_triggered was set to true in vring_interrupt()
> 2) after this nothing will happen for virtqueue_disable_cb() so
> VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> then it tries to publish new event
>
> To fix:
> update VRING_AVAIL_F_NO_INTERRUPT or VRING_PACKED_EVENT_FLAG_DISABLE to vq
> when we call virtqueue_disable_cb even the event_triggered is set to true.
Bad grammar here, and way too much detail. Better:
make disable_cb set VRING_AVAIL_F_NO_INTERRUPT or
VRING_PACKED_EVENT_FLAG_DISABLE in flags shadow to make get_buf
correctly detect that callbacks are disabled.
>
> Tested with iperf:
> iperf3 tcp stream:
> vm1 -----------------> vm2
> vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> there are many tx interrupts in vm2.
> but without event_triggered there are just a few tx interrupts.
>
Please put the changelog after the --- line.
> v2->v3:
> -update the interrupt disable flag even with the event_triggered is set,
> -instead of checking whether event_triggered is set in
> -virtqueue_get_buf_ctx_{packed/split}, will cause the drivers which have
> -not called virtqueue_{enable/disable}_cb to miss notifications.
>
> Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> Signed-off-by: huangjie.albert <huangjie.albert@bytedance.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> drivers/virtio/virtio_ring.c | 24 +++++++++++++++++-------
> 1 file changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 307e139cb11d..ad74463a48ee 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -812,6 +812,14 @@ static void virtqueue_disable_cb_split(struct virtqueue *_vq)
>
> if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> +
> + /*
> + * If device triggered an event already it won't trigger one again:
> + * no need to disable.
> + */
> + if (vq->event_triggered)
> + return;
> +
> if (vq->event)
> /* TODO: this is a hack. Figure out a cleaner value to write. */
> vring_used_event(&vq->split.vring) = 0x0;
> @@ -1544,8 +1552,16 @@ static void virtqueue_disable_cb_packed(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> + if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE)) {
Why are you making this change? It's not great, because it
only works because VRING_PACKED_EVENT_FLAG_DISABLE happens
to equal 0x1. Doesn't the patch work with the original
if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE)
?
Besides, if you are making unrelated changes, the commit log should
describe them.
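
To make the objection concrete, here is a minimal sketch comparing the two checks. The three flag values are the ones I believe include/uapi/linux/virtio_ring.h defines (the message above confirms only that DISABLE is 0x1, so treat the other two as an assumption); the helper names needs_disable_eq, needs_disable_mask and checks_agree are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match include/uapi/linux/virtio_ring.h. */
#define VRING_PACKED_EVENT_FLAG_ENABLE  0x0
#define VRING_PACKED_EVENT_FLAG_DISABLE 0x1
#define VRING_PACKED_EVENT_FLAG_DESC    0x2

/* Original check: the shadow holds one of several values, compare for equality. */
static bool needs_disable_eq(uint16_t shadow)
{
	return shadow != VRING_PACKED_EVENT_FLAG_DISABLE;
}

/* Check from the v3 patch: treats the value as if it were a bitmask. */
static bool needs_disable_mask(uint16_t shadow)
{
	return !(shadow & VRING_PACKED_EVENT_FLAG_DISABLE);
}

/* True when both checks reach the same decision for a given shadow value. */
static bool checks_agree(uint16_t shadow)
{
	return needs_disable_eq(shadow) == needs_disable_mask(shadow);
}
```

For the three values the driver stores (0x0, 0x1, 0x2) the two checks agree, but only because DISABLE is the sole value with bit 0 set. For a hypothetical shadow value of 0x3 they would disagree, which is why the equality test is the more robust form.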
> vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> +
> + /*
> + * If device triggered an event already it won't trigger one again:
> + * no need to disable.
> + */
> + if (vq->event_triggered)
> + return;
> +
> vq->packed.vring.driver->flags =
> cpu_to_le16(vq->packed.event_flags_shadow);
> }
> @@ -2063,12 +2079,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - /* If device triggered an event already it won't trigger one again:
> - * no need to disable.
> - */
> - if (vq->event_triggered)
> - return;
> -
> if (vq->packed_ring)
> virtqueue_disable_cb_packed(_vq);
> else
> --
> 2.31.1
* Re: [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)
2023-03-28 3:39 ` Jason Wang
2023-03-28 9:09 ` 黄杰
@ 2023-03-29 5:21 ` Michael S. Tsirkin
2023-03-29 5:42 ` Michael S. Tsirkin
2 siblings, 0 replies; 19+ messages in thread
From: Michael S. Tsirkin @ 2023-03-29 5:21 UTC (permalink / raw)
To: Jason Wang
Cc: 黄杰, Dominique Martinet, virtualization,
linux-kernel, Luis Chamberlain, v9fs-developer,
Eric Van Hensbergen, Christian Schoenebeck
On Tue, Mar 28, 2023 at 11:39:59AM +0800, Jason Wang wrote:
> On Tue, Mar 28, 2023 at 11:09 AM 黄杰 <huangjie.albert@bytedance.com> wrote:
> >
> > Jason Wang <jasowang@redhat.com> 于2023年3月28日周二 10:59写道:
> > >
> > > On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
> > > <asmadeus@codewreck.org> wrote:
> > > >
> > > > Hi Michael, Albert,
> > > >
> > > > Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > > > > in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> > > > > the vq->event_triggered will be set to true. It will no longer be set to
> > > > > false. Unless we explicitly call virtqueue_enable_cb_delayed or
> > > > > virtqueue_enable_cb_prepare.
> > > >
> > > > This patch (commited as 35395770f803 ("virtio_ring: don't update event
> > > > idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
> > > > Luis in https://lkml.kernel.org/r/ZCI+7Wg5OclSlE8c@bombadil.infradead.org
> > > >
> > > > I've just hit had a look at recent patches[1] and reverted this to test
> > > > and I can mount again, so I'm pretty sure this is the culprit, but I
> > > > didn't look at the content at all yet so cannot advise further.
> > > > It might very well be that we need some extra handling for 9p
> > > > specifically that can be added separately if required.
> > > >
> > > > [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
> > > >
> > > >
> > > > This can be reproduced with a simple mount, run qemu with some -virtfs
> > > > argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> > > > these messages:
> > > > 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> > > > 9pnet: -- p9_virtio_request (83): virtio request kicked
> > > >
> > > > So I suspect we're just not getting a callback.
> > >
> > > I think so. The patch assumes the driver will call
> > > virtqueue_disable/enable_cb() which is not the case of the 9p driver.
> > >
> > > So after the first interrupt, event_triggered will be set to true forever.
> > >
> > > Thanks
> > >
> >
> > Hi: Wang
> >
> > Yes, This patch assumes that all virtio-related drivers will call
> > virtqueue_disable/enable_cb().
> > Thank you for raising this issue.
> >
> > It seems that napi_tx is only related to virtue_net. I'm thinking if
> > we need to refactor
> > napi_tx instead of implementing it inside virtio_ring.
>
> We can hear from others.
>
> I think it's better not to workaround virtio_ring issues in a specific
> driver. It might just add more hacks. We should correctly set
> VRING_AVAIL_F_NO_INTERRUPT,
I am still stuck trying to understand why we don't set it.
How does event_triggered end up getting set without
event index support?
Thanks!
* Re: [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)
2023-03-28 3:39 ` Jason Wang
2023-03-28 9:09 ` 黄杰
2023-03-29 5:21 ` [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf) Michael S. Tsirkin
@ 2023-03-29 5:42 ` Michael S. Tsirkin
2023-03-30 2:52 ` Jason Wang
2 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2023-03-29 5:42 UTC (permalink / raw)
To: Jason Wang
Cc: 黄杰, Dominique Martinet, virtualization,
linux-kernel, Luis Chamberlain, v9fs-developer,
Eric Van Hensbergen, Christian Schoenebeck
On Tue, Mar 28, 2023 at 11:39:59AM +0800, Jason Wang wrote:
> On Tue, Mar 28, 2023 at 11:09 AM 黄杰 <huangjie.albert@bytedance.com> wrote:
> >
> > Jason Wang <jasowang@redhat.com> 于2023年3月28日周二 10:59写道:
> > >
> > > On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
> > > <asmadeus@codewreck.org> wrote:
> > > >
> > > > Hi Michael, Albert,
> > > >
> > > > Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > > > > in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> > > > > the vq->event_triggered will be set to true. It will no longer be set to
> > > > > false. Unless we explicitly call virtqueue_enable_cb_delayed or
> > > > > virtqueue_enable_cb_prepare.
> > > >
> > > > This patch (commited as 35395770f803 ("virtio_ring: don't update event
> > > > idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
> > > > Luis in https://lkml.kernel.org/r/ZCI+7Wg5OclSlE8c@bombadil.infradead.org
> > > >
> > > > I've just hit had a look at recent patches[1] and reverted this to test
> > > > and I can mount again, so I'm pretty sure this is the culprit, but I
> > > > didn't look at the content at all yet so cannot advise further.
> > > > It might very well be that we need some extra handling for 9p
> > > > specifically that can be added separately if required.
> > > >
> > > > [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
> > > >
> > > >
> > > > This can be reproduced with a simple mount, run qemu with some -virtfs
> > > > argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> > > > these messages:
> > > > 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> > > > 9pnet: -- p9_virtio_request (83): virtio request kicked
> > > >
> > > > So I suspect we're just not getting a callback.
> > >
> > > I think so. The patch assumes the driver will call
> > > virtqueue_disable/enable_cb() which is not the case of the 9p driver.
> > >
> > > So after the first interrupt, event_triggered will be set to true forever.
> > >
> > > Thanks
> > >
> >
> > Hi: Wang
> >
> > Yes, This patch assumes that all virtio-related drivers will call
> > virtqueue_disable/enable_cb().
> > Thank you for raising this issue.
> >
> > It seems that napi_tx is only related to virtue_net. I'm thinking if
> > we need to refactor
> > napi_tx instead of implementing it inside virtio_ring.
>
> We can hear from others.
>
> I think it's better not to workaround virtio_ring issues in a specific
> driver. It might just add more hacks. We should correctly set
> VRING_AVAIL_F_NO_INTERRUPT,
>
> Do you think the following might work (not even a compile test)?
ok but:
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 41144b5246a8..12f4efb6dc54 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -852,16 +852,16 @@ static void virtqueue_disable_cb_split(struct
> virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> - vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> - if (vq->event)
> - /* TODO: this is a hack. Figure out a cleaner
> value to write. */
> - vring_used_event(&vq->split.vring) = 0x0;
> - else
> - vq->split.vring.avail->flags =
> - cpu_to_virtio16(_vq->vdev,
> - vq->split.avail_flags_shadow);
> - }
> + if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> + vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> +
> + if (vq->event && !vq->event_triggered)
> + /* TODO: this is a hack. Figure out a cleaner value to write. */
> + vring_used_event(&vq->split.vring) = 0x0;
> + else
> + vq->split.vring.avail->flags =
> + cpu_to_virtio16(_vq->vdev,
> + vq->split.avail_flags_shadow);
> }
>
> static unsigned int virtqueue_enable_cb_prepare_split(struct virtqueue *_vq)
> @@ -1697,8 +1697,10 @@ static void virtqueue_disable_cb_packed(struct
> virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> + if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE))
> vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> +
> + if (vq->event_triggered)
I don't get this one. If event_triggered is set, why do you still want to
write into driver flags? It won't trigger again anytime soon.
> vq->packed.vring.driver->flags =
> cpu_to_le16(vq->packed.event_flags_shadow);
> }
> @@ -2330,12 +2332,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
>
> - /* If device triggered an event already it won't trigger one again:
> - * no need to disable.
> - */
> - if (vq->event_triggered)
> - return;
> -
> if (vq->packed_ring)
> virtqueue_disable_cb_packed(_vq);
> else
>
> Thanks
I think I prefer Huang Albert's other patch - are you ok with it?
> >
> > Thanks
> >
> > > >
> > > >
> > > > I'll have a closer look after work, but any advice meanwhile will be
> > > > appreciated!
> > > > (I'm sure Luis would also like a temporary drop from -next until
> > > > this is figured out, but I'll leave this up to you)
> > > >
> > > >
> > > > >
> > > > > If we disable the napi_tx, it will only be called when the tx ring
> > > > > buffer is relatively small.
> > > > >
> > > > > Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> > > > > VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > > every time we call virtqueue_get_buf_ctx. This will bring more interruptions.
> > > > >
> > > > > To summarize:
> > > > > 1) event_triggered was set to true in vring_interrupt()
> > > > > 2) after this nothing will happen for virtqueue_disable_cb() so
> > > > > VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> > > > > 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> > > > > then it tries to publish new event
> > > > >
> > > > > To fix, if event_triggered is set to true, do not update
> > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > >
> > > > > Tested with iperf:
> > > > > iperf3 tcp stream:
> > > > > vm1 -----------------> vm2
> > > > > vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> > > > > there are many tx interrupts in vm2.
> > > > > but without event_triggered there are just a few tx interrupts.
> > > > >
> > > > > Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> > > > > Signed-off-by: Albert Huang <huangjie.albert@bytedance.com>
> > > > > Message-Id: <20230321085953.24949-1-huangjie.albert@bytedance.com>
> > > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > > ---
> > > > > drivers/virtio/virtio_ring.c | 6 ++++--
> > > > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index cbeeea1b0439..1c36fa477966 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > > > > /* If we expect an interrupt for the next entry, tell host
> > > > > * by writing event index and flush out the write before
> > > > > * the read in the next get_buf call. */
> > > > > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > > > > + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> > > > > + !vq->event_triggered))
> > > > > virtio_store_mb(vq->weak_barriers,
> > > > > &vring_used_event(&vq->split.vring),
> > > > > cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> > > > > @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > > > * by writing event index and flush out the write before
> > > > > * the read in the next get_buf call.
> > > > > */
> > > > > - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> > > > > + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> > > > > + !vq->event_triggered))
> > > > > virtio_store_mb(vq->weak_barriers,
> > > > > &vq->packed.vring.driver->off_wrap,
> > > > > cpu_to_le16(vq->last_used_idx));
> > > >
> > >
> >
* Re: [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)
2023-03-29 5:42 ` Michael S. Tsirkin
@ 2023-03-30 2:52 ` Jason Wang
0 siblings, 0 replies; 19+ messages in thread
From: Jason Wang @ 2023-03-30 2:52 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: 黄杰, Dominique Martinet, virtualization,
linux-kernel, Luis Chamberlain, v9fs-developer,
Eric Van Hensbergen, Christian Schoenebeck
On Wed, Mar 29, 2023 at 1:42 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Mar 28, 2023 at 11:39:59AM +0800, Jason Wang wrote:
> > On Tue, Mar 28, 2023 at 11:09 AM 黄杰 <huangjie.albert@bytedance.com> wrote:
> > >
> > > Jason Wang <jasowang@redhat.com> 于2023年3月28日周二 10:59写道:
> > > >
> > > > On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
> > > > <asmadeus@codewreck.org> wrote:
> > > > >
> > > > > Hi Michael, Albert,
> > > > >
> > > > > Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > > > > > in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> > > > > > the vq->event_triggered will be set to true. It will no longer be set to
> > > > > > false. Unless we explicitly call virtqueue_enable_cb_delayed or
> > > > > > virtqueue_enable_cb_prepare.
> > > > >
> > > > > This patch (commited as 35395770f803 ("virtio_ring: don't update event
> > > > > idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
> > > > > Luis in https://lkml.kernel.org/r/ZCI+7Wg5OclSlE8c@bombadil.infradead.org
> > > > >
> > > > > I've just hit had a look at recent patches[1] and reverted this to test
> > > > > and I can mount again, so I'm pretty sure this is the culprit, but I
> > > > > didn't look at the content at all yet so cannot advise further.
> > > > > It might very well be that we need some extra handling for 9p
> > > > > specifically that can be added separately if required.
> > > > >
> > > > > [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
> > > > >
> > > > >
> > > > > This can be reproduced with a simple mount, run qemu with some -virtfs
> > > > > argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> > > > > these messages:
> > > > > 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> > > > > 9pnet: -- p9_virtio_request (83): virtio request kicked
> > > > >
> > > > > So I suspect we're just not getting a callback.
> > > >
> > > > I think so. The patch assumes the driver will call
> > > > virtqueue_disable/enable_cb() which is not the case of the 9p driver.
> > > >
> > > > So after the first interrupt, event_triggered will be set to true forever.
> > > >
> > > > Thanks
> > > >
> > >
> > > Hi: Wang
> > >
> > > Yes, This patch assumes that all virtio-related drivers will call
> > > virtqueue_disable/enable_cb().
> > > Thank you for raising this issue.
> > >
> > > It seems that napi_tx is only related to virtue_net. I'm thinking if
> > > we need to refactor
> > > napi_tx instead of implementing it inside virtio_ring.
> >
> > We can hear from others.
> >
> > I think it's better not to workaround virtio_ring issues in a specific
> > driver. It might just add more hacks. We should correctly set
> > VRING_AVAIL_F_NO_INTERRUPT,
> >
> > Do you think the following might work (not even a compile test)?
>
>
> ok but:
>
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 41144b5246a8..12f4efb6dc54 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -852,16 +852,16 @@ static void virtqueue_disable_cb_split(struct
> > virtqueue *_vq)
> > {
> > struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> > - vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> > - if (vq->event)
> > - /* TODO: this is a hack. Figure out a cleaner
> > value to write. */
> > - vring_used_event(&vq->split.vring) = 0x0;
> > - else
> > - vq->split.vring.avail->flags =
> > - cpu_to_virtio16(_vq->vdev,
> > - vq->split.avail_flags_shadow);
> > - }
> > + if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > + vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> > +
> > + if (vq->event && !vq->event_triggered)
> > + /* TODO: this is a hack. Figure out a cleaner value to write. */
> > + vring_used_event(&vq->split.vring) = 0x0;
> > + else
> > + vq->split.vring.avail->flags =
> > + cpu_to_virtio16(_vq->vdev,
> > + vq->split.avail_flags_shadow);
> > }
> >
> > static unsigned int virtqueue_enable_cb_prepare_split(struct virtqueue *_vq)
> > @@ -1697,8 +1697,10 @@ static void virtqueue_disable_cb_packed(struct
> > virtqueue *_vq)
> > {
> > struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > - if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> > + if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE))
> > vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> > +
> > + if (vq->event_triggered)
>
> I don't get this one. if event_triggered why do you still want to
> write into driver flags? it won't trigger again anytime soon.
Should be a typo.
>
> > vq->packed.vring.driver->flags =
> > cpu_to_le16(vq->packed.event_flags_shadow);
> > }
> > @@ -2330,12 +2332,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> > {
> > struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > - /* If device triggered an event already it won't trigger one again:
> > - * no need to disable.
> > - */
> > - if (vq->event_triggered)
> > - return;
> > -
> > if (vq->packed_ring)
> > virtqueue_disable_cb_packed(_vq);
> > else
> >
> > Thanks
>
> I think I prefer Huang Albert's other patch - are you ok with it?
Yes.
Thanks
>
> > >
> > > Thanks
> > >
> > > > >
> > > > >
> > > > > I'll have a closer look after work, but any advice meanwhile will be
> > > > > appreciated!
> > > > > (I'm sure Luis would also like a temporary drop from -next until
> > > > > this is figured out, but I'll leave this up to you)
> > > > >
> > > > >
> > > > > >
> > > > > > If we disable the napi_tx, it will only be called when the tx ring
> > > > > > buffer is relatively small.
> > > > > >
> > > > > > > Because event_triggered is true, VRING_AVAIL_F_NO_INTERRUPT or
> > > > > > > VRING_PACKED_EVENT_FLAG_DISABLE will not be set, so we update
> > > > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > > > > every time we call virtqueue_get_buf_ctx. This causes more interrupts.
> > > > > >
> > > > > > To summarize:
> > > > > > 1) event_triggered was set to true in vring_interrupt()
> > > > > > 2) after this nothing will happen for virtqueue_disable_cb() so
> > > > > > VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> > > > > > > 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled,
> > > > > > > so it tries to publish a new event index
> > > > > >
> > > > > > To fix, if event_triggered is set to true, do not update
> > > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
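
To make the three steps above concrete, here is a rough userspace sketch
of the split-ring case. It is only a model, not kernel code: struct
vq_model, the model_* helpers and the event_idx_writes counter are all
made up for illustration, and the with_fix switch stands in for the
!vq->event_triggered check the patch adds to
virtqueue_get_buf_ctx_split(). VRING_AVAIL_F_NO_INTERRUPT is redefined
locally with the value from the virtio spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VRING_AVAIL_F_NO_INTERRUPT 1 /* same value as in the virtio spec */

/* Hypothetical model of the split-ring state discussed in this thread. */
struct vq_model {
	uint16_t avail_flags_shadow;   /* driver's cached copy of avail->flags */
	uint16_t last_used_idx;        /* next used entry the driver will reclaim */
	uint16_t used_event;           /* stands in for vring_used_event() */
	bool event_triggered;          /* set by the interrupt handler */
	unsigned int event_idx_writes; /* illustration only: counts publishes */
};

/* Step 1: the interrupt handler marks the event as triggered. */
static void model_interrupt(struct vq_model *vq)
{
	vq->event_triggered = true;
}

/* Step 2: disable_cb returns early, so the shadow flag is never set. */
static void model_disable_cb(struct vq_model *vq)
{
	if (vq->event_triggered)
		return;
	vq->avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
}

/* Step 3: get_buf publishes a new event index if it thinks the cb is on. */
static void model_get_buf(struct vq_model *vq, bool with_fix)
{
	vq->last_used_idx++;
	if (!(vq->avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
	    !(with_fix && vq->event_triggered)) {
		vq->used_event = vq->last_used_idx;
		vq->event_idx_writes++;
	}
}

/* Run interrupt -> disable_cb -> nbufs x get_buf, return publish count. */
unsigned int model_run(bool with_fix, unsigned int nbufs)
{
	struct vq_model vq = {0};
	unsigned int i;

	model_interrupt(&vq);
	model_disable_cb(&vq);
	for (i = 0; i < nbufs; i++)
		model_get_buf(&vq, with_fix);
	return vq.event_idx_writes;
}

/* Example use: compare the two behaviours for eight reclaimed buffers. */
void model_selfcheck(void)
{
	assert(model_run(false, 8) == 8); /* one publish per buffer */
	assert(model_run(true, 8) == 0);  /* nothing published */
}
```

In this model the unpatched path rewrites the event index once per
reclaimed buffer after the first interrupt, which would line up with the
extra tx interrupts described above. With the check in place nothing is
published until the callback is explicitly re-enabled.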
> > > > > >
> > > > > > Tested with iperf:
> > > > > > iperf3 tcp stream:
> > > > > > vm1 -----------------> vm2
> > > > > > vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> > > > > > there are many tx interrupts in vm2.
> > > > > > but without event_triggered there are just a few tx interrupts.
> > > > > >
> > > > > > Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> > > > > > Signed-off-by: Albert Huang <huangjie.albert@bytedance.com>
> > > > > > Message-Id: <20230321085953.24949-1-huangjie.albert@bytedance.com>
> > > > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > > > ---
> > > > > > drivers/virtio/virtio_ring.c | 6 ++++--
> > > > > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > index cbeeea1b0439..1c36fa477966 100644
> > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > > > > > /* If we expect an interrupt for the next entry, tell host
> > > > > > * by writing event index and flush out the write before
> > > > > > * the read in the next get_buf call. */
> > > > > > - if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > > > > > + if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> > > > > > + !vq->event_triggered))
> > > > > > virtio_store_mb(vq->weak_barriers,
> > > > > > &vring_used_event(&vq->split.vring),
> > > > > > cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> > > > > > @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > > > > * by writing event index and flush out the write before
> > > > > > * the read in the next get_buf call.
> > > > > > */
> > > > > > - if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> > > > > > + if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> > > > > > + !vq->event_triggered))
> > > > > > virtio_store_mb(vq->weak_barriers,
> > > > > > &vq->packed.vring.driver->off_wrap,
> > > > > > cpu_to_le16(vq->last_used_idx));
> > > > >
> > > >
> > >
>