virtualization.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: 黄杰 <huangjie.albert@bytedance.com>
Cc: Eric Van Hensbergen <ericvh@gmail.com>,
	Christian Schoenebeck <linux_oss@crudebyte.com>,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	Luis Chamberlain <mcgrof@kernel.org>,
	v9fs-developer@lists.sourceforge.net
Subject: Re: [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf)
Date: Tue, 28 Mar 2023 10:35:35 -0400	[thread overview]
Message-ID: <20230328103322-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CABKxMyN0598wA6wHv5GkZC14znwp=OPo7u71_BizJfR+gUx4_w@mail.gmail.com>

On Tue, Mar 28, 2023 at 05:09:19PM +0800, 黄杰 wrote:
> Jason Wang <jasowang@redhat.com> 于2023年3月28日周二 11:40写道:
> >
> > On Tue, Mar 28, 2023 at 11:09 AM 黄杰 <huangjie.albert@bytedance.com> wrote:
> > >
> > > Jason Wang <jasowang@redhat.com> 于2023年3月28日周二 10:59写道:
> > > >
> > > > On Tue, Mar 28, 2023 at 10:13 AM Dominique Martinet
> > > > <asmadeus@codewreck.org> wrote:
> > > > >
> > > > > Hi Michael, Albert,
> > > > >
> > > > > Albert Huang wrote on Sat, Mar 25, 2023 at 06:56:33PM +0800:
> > > > > > in virtio_net, if we disable the napi_tx, when we triger a tx interrupt,
> > > > > > the vq->event_triggered will be set to true. It will no longer be set to
> > > > > > false. Unless we explicitly call virtqueue_enable_cb_delayed or
> > > > > > virtqueue_enable_cb_prepare.
> > > > >
> > > > > This patch (commited as 35395770f803 ("virtio_ring: don't update event
> > > > > idx on get_buf") in next-20230327 apparently breaks 9p, as reported by
> > > > > Luis in https://lkml.kernel.org/r/ZCI+7Wg5OclSlE8c@bombadil.infradead.org
> > > > >
> > > > > I've just hit had a look at recent patches[1] and reverted this to test
> > > > > and I can mount again, so I'm pretty sure this is the culprit, but I
> > > > > didn't look at the content at all yet so cannot advise further.
> > > > > It might very well be that we need some extra handling for 9p
> > > > > specifically that can be added separately if required.
> > > > >
> > > > > [1] git log 0ec57cfa721fbd36b4c4c0d9ccc5d78a78f7fa35..HEAD drivers/virtio/
> > > > >
> > > > >
> > > > > This can be reproduced with a simple mount, run qemu with some -virtfs
> > > > > argument and `mount -t 9p -o debug=65535 tag mountpoint` will hang after
> > > > > these messages:
> > > > > 9pnet: -- p9_virtio_request (83): 9p debug: virtio request
> > > > > 9pnet: -- p9_virtio_request (83): virtio request kicked
> > > > >
> > > > > So I suspect we're just not getting a callback.
> > > >
> > > > I think so. The patch assumes the driver will call
> > > > virtqueue_disable/enable_cb() which is not the case of the 9p driver.
> > > >
> > > > So after the first interrupt, event_triggered will be set to true forever.
> > > >
> > > > Thanks
> > > >
> > >
> > > Hi: Wang
> > >
> > > Yes,  This patch assumes that all virtio-related drivers will call
> > > virtqueue_disable/enable_cb().
> > > Thank you for raising this issue.
> > >
> > > It seems that napi_tx is only related to virtue_net. I'm thinking if
> > > we need to refactor
> > > napi_tx instead of implementing it inside virtio_ring.
> >
> > We can hear from others.
> >
> > I think it's better not to workaround virtio_ring issues in a specific
> > driver. It might just add more hacks. We should correctly set
> > VRING_AVAIL_F_NO_INTERRUPT,
> >
> > Do you think the following might work (not even a compile test)?
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 41144b5246a8..12f4efb6dc54 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -852,16 +852,16 @@ static void virtqueue_disable_cb_split(struct
> > virtqueue *_vq)
> >  {
> >         struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > -       if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> > -               vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> > -               if (vq->event)
> > -                       /* TODO: this is a hack. Figure out a cleaner
> > value to write. */
> > -                       vring_used_event(&vq->split.vring) = 0x0;
> > -               else
> > -                       vq->split.vring.avail->flags =
> > -                               cpu_to_virtio16(_vq->vdev,
> > -                                               vq->split.avail_flags_shadow);
> > -       }
> > +       if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > +               vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> > +
> > +       if (vq->event && !vq->event_triggered)
> > +               /* TODO: this is a hack. Figure out a cleaner value to write. */
> > +               vring_used_event(&vq->split.vring) = 0x0;
> > +       else
> > +               vq->split.vring.avail->flags =
> > +                       cpu_to_virtio16(_vq->vdev,
> > +                                       vq->split.avail_flags_shadow);
> >  }
> >
> >  static unsigned int virtqueue_enable_cb_prepare_split(struct virtqueue *_vq)
> > @@ -1697,8 +1697,10 @@ static void virtqueue_disable_cb_packed(struct
> > virtqueue *_vq)
> >  {
> >         struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > -       if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
> > +       if (!(vq->packed.event_flags_shadow & VRING_PACKED_EVENT_FLAG_DISABLE))
> >                 vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> > +
> > +       if (vq->event_triggered)
> >                 vq->packed.vring.driver->flags =
> >                         cpu_to_le16(vq->packed.event_flags_shadow);
> >         }
> > @@ -2330,12 +2332,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
> >  {
> >         struct vring_virtqueue *vq = to_vvq(_vq);
> >
> > -       /* If device triggered an event already it won't trigger one again:
> > -        * no need to disable.
> > -        */
> > -       if (vq->event_triggered)
> > -               return;
> > -
> >         if (vq->packed_ring)
> >                 virtqueue_disable_cb_packed(_vq);
> >         else
> >
> > Thanks
> >
> 
> Hi, This patch seems to address the issue I initially raised and also
> avoids the problem with virtio-9P.
> 
> but maybe this is a better choice:
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 307e139cb11d..6784d155c781 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -812,6 +812,10 @@ static void virtqueue_disable_cb_split(struct
> virtqueue *_vq)
> 
>         if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
>                 vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> +
> +               if (vq->event_triggered)
> +                       return;
> +
>                 if (vq->event)
>                         /* TODO: this is a hack. Figure out a cleaner
> value to write. */
>                         vring_used_event(&vq->split.vring) = 0x0;
> @@ -1546,6 +1550,10 @@ static void virtqueue_disable_cb_packed(struct
> virtqueue *_vq)
> 
>         if (vq->packed.event_flags_shadow != VRING_PACKED_EVENT_FLAG_DISABLE) {
>                 vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
> +
> +               if (vq->event_triggered)
> +                       return;
> +
>                 vq->packed.vring.driver->flags =
>                         cpu_to_le16(vq->packed.event_flags_shadow);
>         }
> @@ -2063,12 +2071,6 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
>  {
>         struct vring_virtqueue *vq = to_vvq(_vq);
> 
> -       /* If device triggered an event already it won't trigger one again:
> -        * no need to disable.
> -        */
> -       if (vq->event_triggered)
> -               return;
> -
>         if (vq->packed_ring)
>                 virtqueue_disable_cb_packed(_vq);
>         else
> 
> Does Michael have any other suggestions?
> 
> Thanks.

Hmm what bothers me is this breaks the underlying assumption that
shadow is an exact match of the shared memory. Need to check this does
not cascade. But  I am still trying to get my head around what
the issue is.




> > >
> > > Thanks
> > >
> > > > >
> > > > >
> > > > > I'll have a closer look after work, but any advice meanwhile will be
> > > > > appreciated!
> > > > > (I'm sure Luis would also like a temporary drop from -next until
> > > > > this is figured out, but I'll leave this up to you)
> > > > >
> > > > >
> > > > > >
> > > > > > If we disable the napi_tx, it will only be called when the tx ring
> > > > > > buffer is relatively small.
> > > > > >
> > > > > > Because event_triggered is true. Therefore, VRING_AVAIL_F_NO_INTERRUPT or
> > > > > > VRING_PACKED_EVENT_FLAG_DISABLE will not be set. So we update
> > > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > > > every time we call virtqueue_get_buf_ctx. This will bring more interruptions.
> > > > > >
> > > > > > To summarize:
> > > > > > 1) event_triggered was set to true in vring_interrupt()
> > > > > > 2) after this nothing will happen for virtqueue_disable_cb() so
> > > > > >    VRING_AVAIL_F_NO_INTERRUPT is not set in avail_flags_shadow
> > > > > > 3) virtqueue_get_buf_ctx_split() will still think the cb is enabled
> > > > > >    then it tries to publish new event
> > > > > >
> > > > > > To fix, if event_triggered is set to true, do not update
> > > > > > vring_used_event(&vq->split.vring) or vq->packed.vring.driver->off_wrap
> > > > > >
> > > > > > Tested with iperf:
> > > > > > iperf3 tcp stream:
> > > > > > vm1 -----------------> vm2
> > > > > > vm2 just receives tcp data stream from vm1, and sends the ack to vm1,
> > > > > > there are many tx interrupts in vm2.
> > > > > > but without event_triggered there are just a few tx interrupts.
> > > > > >
> > > > > > Fixes: 8d622d21d248 ("virtio: fix up virtio_disable_cb")
> > > > > > Signed-off-by: Albert Huang <huangjie.albert@bytedance.com>
> > > > > > Message-Id: <20230321085953.24949-1-huangjie.albert@bytedance.com>
> > > > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > > > ---
> > > > > >  drivers/virtio/virtio_ring.c | 6 ++++--
> > > > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > index cbeeea1b0439..1c36fa477966 100644
> > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > @@ -914,7 +914,8 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
> > > > > >       /* If we expect an interrupt for the next entry, tell host
> > > > > >        * by writing event index and flush out the write before
> > > > > >        * the read in the next get_buf call. */
> > > > > > -     if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT))
> > > > > > +     if (unlikely(!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT) &&
> > > > > > +                  !vq->event_triggered))
> > > > > >               virtio_store_mb(vq->weak_barriers,
> > > > > >                               &vring_used_event(&vq->split.vring),
> > > > > >                               cpu_to_virtio16(_vq->vdev, vq->last_used_idx));
> > > > > > @@ -1744,7 +1745,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
> > > > > >        * by writing event index and flush out the write before
> > > > > >        * the read in the next get_buf call.
> > > > > >        */
> > > > > > -     if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
> > > > > > +     if (unlikely(vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC &&
> > > > > > +                  !vq->event_triggered))
> > > > > >               virtio_store_mb(vq->weak_barriers,
> > > > > >                               &vq->packed.vring.driver->off_wrap,
> > > > > >                               cpu_to_le16(vq->last_used_idx));
> > > > >
> > > >
> > >
> >

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

  parent reply	other threads:[~2023-03-28 14:35 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20230325105633.58592-1-huangjie.albert@bytedance.com>
2023-03-27  4:02 ` [PATCH v2] virtio_ring: don't update event idx on get_buf Xuan Zhuo
2023-03-27 12:43 ` Michael S. Tsirkin
2023-03-28  2:13 ` 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf) Dominique Martinet
2023-03-28  2:58   ` Jason Wang
     [not found]     ` <CABKxMyPwuRb6p-oHxcQDhRtJv04=NDWvosNAp=epgvdrfCeveg@mail.gmail.com>
2023-03-28  3:39       ` [External] " Jason Wang
2023-03-29  5:21         ` Michael S. Tsirkin
     [not found]         ` <CABKxMyN0598wA6wHv5GkZC14znwp=OPo7u71_BizJfR+gUx4_w@mail.gmail.com>
2023-03-28 14:35           ` Michael S. Tsirkin [this message]
2023-03-29  5:34           ` Michael S. Tsirkin
     [not found]             ` <20230329072135.44757-1-huangjie.albert@bytedance.com>
2023-03-29  7:27               ` [PATCH v3] virtio_ring: interrupt disable flag updated to vq even with event_triggered is set Xuan Zhuo
2023-03-29 16:28                 ` Michael S. Tsirkin
2023-03-29 16:26               ` Michael S. Tsirkin
2023-03-29  5:42         ` [External] Re: 9p regression (Was: [PATCH v2] virtio_ring: don't update event idx on get_buf) Michael S. Tsirkin
2023-03-30  2:52           ` Jason Wang
2023-03-28  4:49   ` Luis Chamberlain
2023-03-29  5:36 ` [PATCH v2] virtio_ring: don't update event idx on get_buf Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230328103322-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=ericvh@gmail.com \
    --cc=huangjie.albert@bytedance.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux_oss@crudebyte.com \
    --cc=mcgrof@kernel.org \
    --cc=v9fs-developer@lists.sourceforge.net \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).