All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: xuanzhuo@linux.alibaba.com, netdev <netdev@vger.kernel.org>,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, eperezma@redhat.com,
	edumazet@google.com, maxime.coquelin@redhat.com, kuba@kernel.org,
	pabeni@redhat.com, david.marchand@redhat.com,
	davem@davemloft.net
Subject: Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
Date: Tue, 16 May 2023 01:56:22 -0400	[thread overview]
Message-ID: <20230516015514-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CACGkMEvCHQLFbtB2fbF27oCd5fNSjUtUOS0q-Lx7=MeYR8KzRA@mail.gmail.com>

On Tue, May 16, 2023 at 12:17:50PM +0800, Jason Wang wrote:
> On Tue, May 16, 2023 at 12:13 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, May 16, 2023 at 10:44:45AM +0800, Jason Wang wrote:
> > > On Mon, May 15, 2023 at 6:17 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, May 15, 2023 at 01:13:33PM +0800, Jason Wang wrote:
> > > > > On Mon, May 15, 2023 at 12:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, May 15, 2023 at 09:05:54AM +0800, Jason Wang wrote:
> > > > > > > On Wed, May 10, 2023 at 1:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Apr 17, 2023 at 11:40:58AM +0800, Jason Wang wrote:
> > > > > > > > > On Fri, Apr 14, 2023 at 3:21 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Fri, Apr 14, 2023 at 01:04:15PM +0800, Jason Wang wrote:
> > > > > > > > > > > Forget to cc netdev, adding.
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
> > > > > > > > > > > > > This patch convert rx mode setting to be done in a workqueue, this is
> > > > > > > > > > > > > a must for allow to sleep when waiting for the cvq command to
> > > > > > > > > > > > > response since current code is executed under addr spin lock.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > > >
> > > > > > > > > > > > I don't like this frankly. This means that setting RX mode which would
> > > > > > > > > > > > previously be reliable, now becomes unreliable.
> > > > > > > > > > >
> > > > > > > > > > > It is "unreliable" by design:
> > > > > > > > > > >
> > > > > > > > > > >       void                    (*ndo_set_rx_mode)(struct net_device *dev);
> > > > > > > > > > >
> > > > > > > > > > > > - first of all configuration is no longer immediate
> > > > > > > > > > >
> > > > > > > > > > > Is immediate a hard requirement? I can see a workqueue is used at least:
> > > > > > > > > > >
> > > > > > > > > > > mlx5e, ipoib, efx, ...
> > > > > > > > > > >
> > > > > > > > > > > >   and there is no way for driver to find out when
> > > > > > > > > > > >   it actually took effect
> > > > > > > > > > >
> > > > > > > > > > > But we know rx mode is best effort e.g it doesn't support vhost and we
> > > > > > > > > > > survive from this for years.
> > > > > > > > > > >
> > > > > > > > > > > > - second, if device fails command, this is also not
> > > > > > > > > > > >   propagated to driver, again no way for driver to find out
> > > > > > > > > > > >
> > > > > > > > > > > > VDUSE needs to be fixed to do tricks to fix this
> > > > > > > > > > > > without breaking normal drivers.
> > > > > > > > > > >
> > > > > > > > > > > It's not specific to VDUSE. For example, when using virtio-net in the
> > > > > > > > > > > UP environment with any software cvq (like mlx5 via vDPA or cma
> > > > > > > > > > > transport).
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > Hmm. Can we differentiate between these use-cases?
> > > > > > > > >
> > > > > > > > > It doesn't look easy since we are drivers for virtio bus. Underlayer
> > > > > > > > > details were hidden from virtio-net.
> > > > > > > > >
> > > > > > > > > Or do you have any ideas on this?
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > >
> > > > > > > > I don't know, pass some kind of flag in struct virtqueue?
> > > > > > > >         "bool slow; /* This vq can be very slow sometimes. Don't wait for it! */"
> > > > > > > >
> > > > > > > > ?
> > > > > > > >
> > > > > > >
> > > > > > > So if it's slow, sleep, otherwise poll?
> > > > > > >
> > > > > > > I feel setting this flag might be tricky, since the driver doesn't
> > > > > > > know whether or not it's really slow. E.g smartNIC vendor may allow
> > > > > > > virtio-net emulation over PCI.
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > > driver will have the choice, depending on whether
> > > > > > vq is deterministic or not.
> > > > >
> > > > > Ok, but the problem is, such booleans are only useful for virtio ring
> > > > > codes. But in this case, virtio-net knows what to do for cvq. So I'm
> > > > > not sure who the user is.
> > > > >
> > > > > Thanks
> > > >
> > > > Circling back, what exactly does the architecture you are trying
> > > > to fix look like? Who is going to introduce unbounded latency?
> > > > The hypervisor?
> > >
> > > Hypervisor is one of the possible reason, we have many more:
> > >
> > > Hardware device that provides virtio-pci emulation.
> > > Userspace devices like VDUSE.
> >
> > So let's start by addressing VDUSE maybe?
> 
> It's reported by at least one hardware vendor as well. I remember it
> was Alvaro who reported this first in the past.
> 
> >
> > > > If so do we not maybe want a new feature bit
> > > > that documents this? Hypervisor then can detect old guests
> > > > that spin and decide what to do, e.g. prioritise cvq more,
> > > > or fail FEATURES_OK.
> > >
> > > We suffer from this for bare metal as well.
> > >
> > > But a question is what's wrong with the approach that is used in this
> > > patch? I've answered that set_rx_mode is not reliable, so it should be
> > > fine to use workqueue. Except for this, any other thing that worries
> > > you?
> > >
> > > Thanks
> >
> > It's not reliable for other drivers but has been reliable for virtio.
> > I worry some software relied on this.
> 
> It's probably fine since some device like vhost doesn't support this
> at all and we manage to survive for several years.

vhost is often connected to a clever learning backend
such as a bridge which will DTRT without guest configuring
anything at all though, this could be why it works.



> > You are making good points though ... could we get some
> > maintainer's feedback on this?
> 
> That would be helpful. Jakub, any input on this?
> 
> Thanks
> 
> >
> > > >
> > > > > >
> > > > > >
> > > > > > > > --
> > > > > > > > MST
> > > > > > > >
> > > > > >
> > > >
> >

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

WARNING: multiple messages have this Message-ID (diff)
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: kuba@kernel.org, davem@davemloft.net, edumazet@google.com,
	pabeni@redhat.com, virtualization@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, maxime.coquelin@redhat.com,
	alvaro.karsz@solid-run.com, eperezma@redhat.com,
	xuanzhuo@linux.alibaba.com, david.marchand@redhat.com,
	netdev <netdev@vger.kernel.org>
Subject: Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
Date: Tue, 16 May 2023 01:56:22 -0400	[thread overview]
Message-ID: <20230516015514-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CACGkMEvCHQLFbtB2fbF27oCd5fNSjUtUOS0q-Lx7=MeYR8KzRA@mail.gmail.com>

On Tue, May 16, 2023 at 12:17:50PM +0800, Jason Wang wrote:
> On Tue, May 16, 2023 at 12:13 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, May 16, 2023 at 10:44:45AM +0800, Jason Wang wrote:
> > > On Mon, May 15, 2023 at 6:17 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, May 15, 2023 at 01:13:33PM +0800, Jason Wang wrote:
> > > > > On Mon, May 15, 2023 at 12:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, May 15, 2023 at 09:05:54AM +0800, Jason Wang wrote:
> > > > > > > On Wed, May 10, 2023 at 1:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Apr 17, 2023 at 11:40:58AM +0800, Jason Wang wrote:
> > > > > > > > > On Fri, Apr 14, 2023 at 3:21 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Fri, Apr 14, 2023 at 01:04:15PM +0800, Jason Wang wrote:
> > > > > > > > > > > Forget to cc netdev, adding.
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
> > > > > > > > > > > > > This patch convert rx mode setting to be done in a workqueue, this is
> > > > > > > > > > > > > a must for allow to sleep when waiting for the cvq command to
> > > > > > > > > > > > > response since current code is executed under addr spin lock.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > > >
> > > > > > > > > > > > I don't like this frankly. This means that setting RX mode which would
> > > > > > > > > > > > previously be reliable, now becomes unreliable.
> > > > > > > > > > >
> > > > > > > > > > > It is "unreliable" by design:
> > > > > > > > > > >
> > > > > > > > > > >       void                    (*ndo_set_rx_mode)(struct net_device *dev);
> > > > > > > > > > >
> > > > > > > > > > > > - first of all configuration is no longer immediate
> > > > > > > > > > >
> > > > > > > > > > > Is immediate a hard requirement? I can see a workqueue is used at least:
> > > > > > > > > > >
> > > > > > > > > > > mlx5e, ipoib, efx, ...
> > > > > > > > > > >
> > > > > > > > > > > >   and there is no way for driver to find out when
> > > > > > > > > > > >   it actually took effect
> > > > > > > > > > >
> > > > > > > > > > > But we know rx mode is best effort e.g it doesn't support vhost and we
> > > > > > > > > > > survive from this for years.
> > > > > > > > > > >
> > > > > > > > > > > > - second, if device fails command, this is also not
> > > > > > > > > > > >   propagated to driver, again no way for driver to find out
> > > > > > > > > > > >
> > > > > > > > > > > > VDUSE needs to be fixed to do tricks to fix this
> > > > > > > > > > > > without breaking normal drivers.
> > > > > > > > > > >
> > > > > > > > > > > It's not specific to VDUSE. For example, when using virtio-net in the
> > > > > > > > > > > UP environment with any software cvq (like mlx5 via vDPA or cma
> > > > > > > > > > > transport).
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > Hmm. Can we differentiate between these use-cases?
> > > > > > > > >
> > > > > > > > > It doesn't look easy since we are drivers for virtio bus. Underlayer
> > > > > > > > > details were hidden from virtio-net.
> > > > > > > > >
> > > > > > > > > Or do you have any ideas on this?
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > >
> > > > > > > > I don't know, pass some kind of flag in struct virtqueue?
> > > > > > > >         "bool slow; /* This vq can be very slow sometimes. Don't wait for it! */"
> > > > > > > >
> > > > > > > > ?
> > > > > > > >
> > > > > > >
> > > > > > > So if it's slow, sleep, otherwise poll?
> > > > > > >
> > > > > > > I feel setting this flag might be tricky, since the driver doesn't
> > > > > > > know whether or not it's really slow. E.g smartNIC vendor may allow
> > > > > > > virtio-net emulation over PCI.
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > > driver will have the choice, depending on whether
> > > > > > vq is deterministic or not.
> > > > >
> > > > > Ok, but the problem is, such booleans are only useful for virtio ring
> > > > > codes. But in this case, virtio-net knows what to do for cvq. So I'm
> > > > > not sure who the user is.
> > > > >
> > > > > Thanks
> > > >
> > > > Circling back, what exactly does the architecture you are trying
> > > > to fix look like? Who is going to introduce unbounded latency?
> > > > The hypervisor?
> > >
> > > Hypervisor is one of the possible reason, we have many more:
> > >
> > > Hardware device that provides virtio-pci emulation.
> > > Userspace devices like VDUSE.
> >
> > So let's start by addressing VDUSE maybe?
> 
> It's reported by at least one hardware vendor as well. I remember it
> was Alvaro who reported this first in the past.
> 
> >
> > > > If so do we not maybe want a new feature bit
> > > > that documents this? Hypervisor then can detect old guests
> > > > that spin and decide what to do, e.g. prioritise cvq more,
> > > > or fail FEATURES_OK.
> > >
> > > We suffer from this for bare metal as well.
> > >
> > > But a question is what's wrong with the approach that is used in this
> > > patch? I've answered that set_rx_mode is not reliable, so it should be
> > > fine to use workqueue. Except for this, any other thing that worries
> > > you?
> > >
> > > Thanks
> >
> > It's not reliable for other drivers but has been reliable for virtio.
> > I worry some software relied on this.
> 
> It's probably fine since some device like vhost doesn't support this
> at all and we manage to survive for several years.

vhost is often connected to a clever learning backend
such as a bridge which will DTRT without guest configuring
anything at all though, this could be why it works.



> > You are making good points though ... could we get some
> > maintainer's feedback on this?
> 
> That would be helpful. Jakub, any input on this?
> 
> Thanks
> 
> >
> > > >
> > > > > >
> > > > > >
> > > > > > > > --
> > > > > > > > MST
> > > > > > > >
> > > > > >
> > > >
> >


  reply	other threads:[~2023-05-16  5:56 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-13  6:40 [PATCH net-next V2 0/2] virtio-net: don't busy poll for cvq command Jason Wang
2023-04-13  6:40 ` Jason Wang
2023-04-13  6:40 ` [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue Jason Wang
2023-04-13  6:40   ` Jason Wang
2023-04-13 16:25   ` Michael S. Tsirkin
2023-04-13 16:25     ` Michael S. Tsirkin
2023-04-14  5:04     ` Jason Wang
2023-04-14  5:04       ` Jason Wang
2023-04-14  7:21       ` Michael S. Tsirkin
2023-04-14  7:21         ` Michael S. Tsirkin
2023-04-17  3:40         ` Jason Wang
2023-04-17  3:40           ` Jason Wang
2023-05-05  3:46           ` Jason Wang
2023-05-05  3:46             ` Jason Wang
2023-05-10  5:32           ` Michael S. Tsirkin
2023-05-10  5:32             ` Michael S. Tsirkin
2023-05-15  1:05             ` Jason Wang
2023-05-15  1:05               ` Jason Wang
2023-05-15  4:45               ` Michael S. Tsirkin
2023-05-15  4:45                 ` Michael S. Tsirkin
2023-05-15  5:13                 ` Jason Wang
2023-05-15  5:13                   ` Jason Wang
2023-05-15 10:17                   ` Michael S. Tsirkin
2023-05-15 10:17                     ` Michael S. Tsirkin
2023-05-16  2:44                     ` Jason Wang
2023-05-16  2:44                       ` Jason Wang
2023-05-16  4:12                       ` Michael S. Tsirkin
2023-05-16  4:12                         ` Michael S. Tsirkin
2023-05-16  4:17                         ` Jason Wang
2023-05-16  4:17                           ` Jason Wang
2023-05-16  5:56                           ` Michael S. Tsirkin [this message]
2023-05-16  5:56                             ` Michael S. Tsirkin
2023-05-16 19:36                           ` Jakub Kicinski
2023-04-13  6:40 ` [PATCH net-next V2 2/2] virtio-net: sleep instead of busy waiting for cvq command Jason Wang
2023-04-13  6:40   ` Jason Wang
2023-04-13  7:27   ` Xuan Zhuo
2023-04-13  7:27     ` Xuan Zhuo
2023-04-14  5:09     ` Jason Wang
2023-04-14  5:09       ` Jason Wang
2023-04-14  5:10       ` Jason Wang
2023-04-14  5:10         ` Jason Wang
2023-05-16 20:54   ` Michael S. Tsirkin
2023-05-16 20:54     ` Michael S. Tsirkin
2023-05-17  0:50     ` Jason Wang
2023-05-17  0:50       ` Jason Wang
2023-04-13 13:02 ` [PATCH net-next V2 0/2] virtio-net: don't busy poll " Maxime Coquelin
2023-04-13 13:02   ` Maxime Coquelin
2023-04-13 13:05   ` Maxime Coquelin
2023-04-13 13:05     ` Maxime Coquelin
2023-04-13 14:04 ` Jakub Kicinski
2023-04-14  4:53   ` Jason Wang
2023-04-14  4:53     ` Jason Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230516015514-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=davem@davemloft.net \
    --cc=david.marchand@redhat.com \
    --cc=edumazet@google.com \
    --cc=eperezma@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maxime.coquelin@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.