* Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
[not found] ` <20230413121525-mutt-send-email-mst@kernel.org>
@ 2023-04-14 5:04 ` Jason Wang
2023-04-14 7:21 ` Michael S. Tsirkin
0 siblings, 1 reply; 15+ messages in thread
From: Jason Wang @ 2023-04-14 5:04 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: davem, edumazet, kuba, pabeni, virtualization, linux-kernel,
maxime.coquelin, alvaro.karsz, eperezma, xuanzhuo, david.marchand,
netdev
Forgot to cc netdev, adding.
On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
> > This patch converts rx mode setting to be done in a workqueue. This is
> > a must to allow sleeping while waiting for the cvq command to
> > respond, since the current code is executed under the addr spin lock.
> >
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
>
> I don't like this frankly. This means that setting RX mode which would
> previously be reliable, now becomes unreliable.
It is "unreliable" by design:
void (*ndo_set_rx_mode)(struct net_device *dev);
> - first of all configuration is no longer immediate
Is immediate a hard requirement? I can see that a workqueue is used by at least:
mlx5e, ipoib, efx, ...
> and there is no way for driver to find out when
> it actually took effect
But we know rx mode is best effort, e.g. it isn't supported by vhost, and
we have survived with this for years.
> - second, if device fails command, this is also not
> propagated to driver, again no way for driver to find out
>
> VDUSE needs to be fixed to do tricks to fix this
> without breaking normal drivers.
It's not specific to VDUSE. For example, the same problem shows up when
using virtio-net in a UP environment with any software cvq (like mlx5 via
vDPA or cma transport).
Thanks
>
>
> > ---
> > Changes since V1:
> > - use RTNL to synchronize rx mode worker
> > ---
> > drivers/net/virtio_net.c | 55 +++++++++++++++++++++++++++++++++++++---
> > 1 file changed, 52 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index e2560b6f7980..2e56bbf86894 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -265,6 +265,12 @@ struct virtnet_info {
> > /* Work struct for config space updates */
> > struct work_struct config_work;
> >
> > + /* Work struct for config rx mode */
> > + struct work_struct rx_mode_work;
> > +
> > + /* Is rx mode work enabled? */
> > + bool rx_mode_work_enabled;
> > +
> > /* Does the affinity hint is set for virtqueues? */
> > bool affinity_hint_set;
> >
> > @@ -388,6 +394,20 @@ static void disable_delayed_refill(struct virtnet_info *vi)
> > spin_unlock_bh(&vi->refill_lock);
> > }
> >
> > +static void enable_rx_mode_work(struct virtnet_info *vi)
> > +{
> > + rtnl_lock();
> > + vi->rx_mode_work_enabled = true;
> > + rtnl_unlock();
> > +}
> > +
> > +static void disable_rx_mode_work(struct virtnet_info *vi)
> > +{
> > + rtnl_lock();
> > + vi->rx_mode_work_enabled = false;
> > + rtnl_unlock();
> > +}
> > +
> > static void virtqueue_napi_schedule(struct napi_struct *napi,
> > struct virtqueue *vq)
> > {
> > @@ -2310,9 +2330,11 @@ static int virtnet_close(struct net_device *dev)
> > return 0;
> > }
> >
> > -static void virtnet_set_rx_mode(struct net_device *dev)
> > +static void virtnet_rx_mode_work(struct work_struct *work)
> > {
> > - struct virtnet_info *vi = netdev_priv(dev);
> > + struct virtnet_info *vi =
> > + container_of(work, struct virtnet_info, rx_mode_work);
> > + struct net_device *dev = vi->dev;
> > struct scatterlist sg[2];
> > struct virtio_net_ctrl_mac *mac_data;
> > struct netdev_hw_addr *ha;
> > @@ -2325,6 +2347,8 @@ static void virtnet_set_rx_mode(struct net_device *dev)
> > if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_RX))
> > return;
> >
> > + rtnl_lock();
> > +
> > vi->ctrl->promisc = ((dev->flags & IFF_PROMISC) != 0);
> > vi->ctrl->allmulti = ((dev->flags & IFF_ALLMULTI) != 0);
> >
> > @@ -2342,14 +2366,19 @@ static void virtnet_set_rx_mode(struct net_device *dev)
> > dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n",
> > vi->ctrl->allmulti ? "en" : "dis");
> >
> > + netif_addr_lock_bh(dev);
> > +
> > uc_count = netdev_uc_count(dev);
> > mc_count = netdev_mc_count(dev);
> > /* MAC filter - use one buffer for both lists */
> > buf = kzalloc(((uc_count + mc_count) * ETH_ALEN) +
> > (2 * sizeof(mac_data->entries)), GFP_ATOMIC);
> > mac_data = buf;
> > - if (!buf)
> > + if (!buf) {
> > + netif_addr_unlock_bh(dev);
> > + rtnl_unlock();
> > return;
> > + }
> >
> > sg_init_table(sg, 2);
> >
> > @@ -2370,6 +2399,8 @@ static void virtnet_set_rx_mode(struct net_device *dev)
> > netdev_for_each_mc_addr(ha, dev)
> > memcpy(&mac_data->macs[i++][0], ha->addr, ETH_ALEN);
> >
> > + netif_addr_unlock_bh(dev);
> > +
> > sg_set_buf(&sg[1], mac_data,
> > sizeof(mac_data->entries) + (mc_count * ETH_ALEN));
> >
> > @@ -2377,9 +2408,19 @@ static void virtnet_set_rx_mode(struct net_device *dev)
> > VIRTIO_NET_CTRL_MAC_TABLE_SET, sg))
> > dev_warn(&dev->dev, "Failed to set MAC filter table.\n");
> >
> > + rtnl_unlock();
> > +
> > kfree(buf);
> > }
> >
> > +static void virtnet_set_rx_mode(struct net_device *dev)
> > +{
> > + struct virtnet_info *vi = netdev_priv(dev);
> > +
> > + if (vi->rx_mode_work_enabled)
> > + schedule_work(&vi->rx_mode_work);
> > +}
> > +
> > static int virtnet_vlan_rx_add_vid(struct net_device *dev,
> > __be16 proto, u16 vid)
> > {
> > @@ -3150,6 +3191,8 @@ static void virtnet_freeze_down(struct virtio_device *vdev)
> >
> > /* Make sure no work handler is accessing the device */
> > flush_work(&vi->config_work);
> > + disable_rx_mode_work(vi);
> > + flush_work(&vi->rx_mode_work);
> >
> > netif_tx_lock_bh(vi->dev);
> > netif_device_detach(vi->dev);
>
> So now configuration is not propagated to device.
> Won't device later wake up in wrong state?
>
>
> > @@ -3172,6 +3215,7 @@ static int virtnet_restore_up(struct virtio_device *vdev)
> > virtio_device_ready(vdev);
> >
> > enable_delayed_refill(vi);
> > + enable_rx_mode_work(vi);
> >
> > if (netif_running(vi->dev)) {
> > err = virtnet_open(vi->dev);
> > @@ -3969,6 +4013,7 @@ static int virtnet_probe(struct virtio_device *vdev)
> > vdev->priv = vi;
> >
> > INIT_WORK(&vi->config_work, virtnet_config_changed_work);
> > + INIT_WORK(&vi->rx_mode_work, virtnet_rx_mode_work);
> > spin_lock_init(&vi->refill_lock);
> >
> > if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF)) {
> > @@ -4077,6 +4122,8 @@ static int virtnet_probe(struct virtio_device *vdev)
> > if (vi->has_rss || vi->has_rss_hash_report)
> > virtnet_init_default_rss(vi);
> >
> > + enable_rx_mode_work(vi);
> > +
> > /* serialize netdev register + virtio_device_ready() with ndo_open() */
> > rtnl_lock();
> >
> > @@ -4174,6 +4221,8 @@ static void virtnet_remove(struct virtio_device *vdev)
> >
> > /* Make sure no work handler is accessing the device. */
> > flush_work(&vi->config_work);
> > + disable_rx_mode_work(vi);
> > + flush_work(&vi->rx_mode_work);
> >
> > unregister_netdev(vi->dev);
> >
> > --
> > 2.25.1
>
^ permalink raw reply [flat|nested] 15+ messages in thread
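The deferral pattern debated in this message can be modelled in plain userspace C. This is a minimal sketch under stated assumptions, not the driver's code: the callback side stands in for ndo_set_rx_mode (which cannot sleep), the worker side stands in for the work item, and every name below (model_dev, model_set_rx_mode, model_run_pending) is hypothetical rather than a kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_dev {
	bool work_enabled;   /* plays the role of rx_mode_work_enabled */
	bool work_pending;   /* a queued work item that has not run yet */
	int requested_mode;  /* latest mode requested by the stack */
	int device_mode;     /* mode the device has actually been given */
	int worker_runs;     /* number of times the worker executed */
};

/* Atomic-context side: record the request and queue the worker, never sleep. */
static void model_set_rx_mode(struct model_dev *d, int mode)
{
	d->requested_mode = mode;
	if (d->work_enabled)
		d->work_pending = true; /* queuing again while pending is a no-op */
}

/* Process-context side: allowed to sleep; applies the latest requested mode. */
static void model_run_pending(struct model_dev *d)
{
	if (!d->work_pending)
		return;
	d->work_pending = false;
	d->device_mode = d->requested_mode;
	d->worker_runs++;
}
```

In this model, two back-to-back requests collapse into a single worker run that applies only the latest mode, and a request made while work is disabled never reaches the device. The second case appears to be the situation the freeze-path question above is asking about.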
* Re: [PATCH net-next V2 2/2] virtio-net: sleep instead of busy waiting for cvq command
[not found] ` <CACGkMEuJuZKGMhVwFmD0ZMa7V7TdGu6qaXF24Gg67TzMbs8ANA@mail.gmail.com>
@ 2023-04-14 5:10 ` Jason Wang
0 siblings, 0 replies; 15+ messages in thread
From: Jason Wang @ 2023-04-14 5:10 UTC (permalink / raw)
To: Xuan Zhuo
Cc: davem, edumazet, kuba, pabeni, virtualization, linux-kernel,
maxime.coquelin, alvaro.karsz, eperezma, david.marchand, mst,
netdev
Adding netdev.
On Fri, Apr 14, 2023 at 1:09 PM Jason Wang <jasowang@redhat.com> wrote:
>
> On Thu, Apr 13, 2023 at 3:31 PM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Thu, 13 Apr 2023 14:40:27 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > We used to busy wait on the cvq command. This tends to be
> > > problematic since there is no way to schedule another process which
> > > may serve the control virtqueue. This might be the case when the
> > > control virtqueue is emulated by software. This patch switches to a
> > > completion to allow the CPU to sleep instead of busy waiting for the
> > > cvq command.
> > >
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > > Changes since V1:
> > > - use completion for simplicity
> > > - don't try to harden the CVQ command which requires more thought
> > > Changes since RFC:
> > > - break the device when timeout
> > > - get buffer manually since the virtio core check more_used() instead
> > > ---
> > > drivers/net/virtio_net.c | 21 ++++++++++++++-------
> > > 1 file changed, 14 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index 2e56bbf86894..d3eb8fd6c9dc 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -19,6 +19,7 @@
> > > #include <linux/average.h>
> > > #include <linux/filter.h>
> > > #include <linux/kernel.h>
> > > +#include <linux/completion.h>
> > > #include <net/route.h>
> > > #include <net/xdp.h>
> > > #include <net/net_failover.h>
> > > @@ -295,6 +296,8 @@ struct virtnet_info {
> > >
> > > /* failover when STANDBY feature enabled */
> > > struct failover *failover;
> > > +
> > > + struct completion completion;
> > > };
> > >
> > > struct padded_vnet_hdr {
> > > @@ -1709,6 +1712,13 @@ static bool try_fill_recv(struct virtnet_info *vi, struct receive_queue *rq,
> > > return !oom;
> > > }
> > >
> > > +static void virtnet_cvq_done(struct virtqueue *cvq)
> > > +{
> > > + struct virtnet_info *vi = cvq->vdev->priv;
> > > +
> > > + complete(&vi->completion);
> > > +}
> > > +
> > > static void skb_recv_done(struct virtqueue *rvq)
> > > {
> > > struct virtnet_info *vi = rvq->vdev->priv;
> > > @@ -2169,12 +2179,8 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
> > > if (unlikely(!virtqueue_kick(vi->cvq)))
> > > return vi->ctrl->status == VIRTIO_NET_OK;
> > >
> > > - /* Spin for a response, the kick causes an ioport write, trapping
> > > - * into the hypervisor, so the request should be handled immediately.
> > > - */
> > > - while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > > - !virtqueue_is_broken(vi->cvq))
> > > - cpu_relax();
> > > + wait_for_completion(&vi->completion);
> > > + virtqueue_get_buf(vi->cvq, &tmp);
> > >
> > > return vi->ctrl->status == VIRTIO_NET_OK;
> > > }
> > > @@ -3672,7 +3678,7 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
> > >
> > > /* Parameters for control virtqueue, if any */
> > > if (vi->has_cvq) {
> > > - callbacks[total_vqs - 1] = NULL;
> > > + callbacks[total_vqs - 1] = virtnet_cvq_done;
> >
> > This depends on the interrupt, right?
>
> Not necessarily, we have ISR for at least PCI:
>
> static irqreturn_t vp_interrupt(int irq, void *opaque)
> {
> struct virtio_pci_device *vp_dev = opaque;
> u8 isr;
>
> /* reading the ISR has the effect of also clearing it so it's very
> * important to save off the value. */
> isr = ioread8(vp_dev->isr);
> ...
> }
>
> >
> > I worry that there may be some devices that do not support interrupts on cvq.
>
> Is the device using INTX or MSI-X?
>
> > Although this may not be in line with the spec, it may cause problems on
> > devices that work normally at present.
>
> Then the implementation is buggy, and it might not work for drivers other
> than Linux. Working around such a buggy implementation is suboptimal.
>
> Thanks
>
> >
> > Thanks.
> >
> >
> > > names[total_vqs - 1] = "control";
> > > }
> > >
> > > @@ -4122,6 +4128,7 @@ static int virtnet_probe(struct virtio_device *vdev)
> > > if (vi->has_rss || vi->has_rss_hash_report)
> > > virtnet_init_default_rss(vi);
> > >
> > > + init_completion(&vi->completion);
> > > enable_rx_mode_work(vi);
> > >
> > > /* serialize netdev register + virtio_device_ready() with ndo_open() */
> > > --
> > > 2.25.1
> > >
> >
^ permalink raw reply [flat|nested] 15+ messages in thread
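The switch from spinning to sleeping that this patch proposes can be illustrated with a small userspace analogue of a completion, built on a pthread mutex and condition variable. This is only a sketch under assumptions: model_completion and its helpers are hypothetical names that loosely mirror init_completion(), complete() and wait_for_completion(), and they are not the kernel implementation.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace analogue of a kernel completion. */
struct model_completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	unsigned int done;   /* number of completions not yet consumed */
};

static void model_init_completion(struct model_completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = 0;
}

/* Called from the "callback" side, like virtnet_cvq_done() in the patch. */
static void model_complete(struct model_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done++;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Called from the "command" side: sleeps instead of spinning on the CPU. */
static void model_wait_for_completion(struct model_completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (c->done == 0)
		pthread_cond_wait(&c->cond, &c->lock);
	c->done--;
	pthread_mutex_unlock(&c->lock);
}
```

The waiter in this sketch has no timeout and no equivalent of the virtqueue_is_broken() check from the removed spin loop, so it blocks for as long as nothing calls model_complete(). That corresponds to the concern raised above about devices that never signal the control virtqueue.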
* Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
2023-04-14 5:04 ` [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue Jason Wang
@ 2023-04-14 7:21 ` Michael S. Tsirkin
2023-04-17 3:40 ` Jason Wang
0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2023-04-14 7:21 UTC (permalink / raw)
To: Jason Wang
Cc: davem, edumazet, kuba, pabeni, virtualization, linux-kernel,
maxime.coquelin, alvaro.karsz, eperezma, xuanzhuo, david.marchand,
netdev
On Fri, Apr 14, 2023 at 01:04:15PM +0800, Jason Wang wrote:
> Forget to cc netdev, adding.
>
> On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
> > > This patch convert rx mode setting to be done in a workqueue, this is
> > > a must for allow to sleep when waiting for the cvq command to
> > > response since current code is executed under addr spin lock.
> > >
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> >
> > I don't like this frankly. This means that setting RX mode which would
> > previously be reliable, now becomes unreliable.
>
> It is "unreliable" by design:
>
> void (*ndo_set_rx_mode)(struct net_device *dev);
>
> > - first of all configuration is no longer immediate
>
> Is immediate a hard requirement? I can see a workqueue is used at least:
>
> mlx5e, ipoib, efx, ...
>
> > and there is no way for driver to find out when
> > it actually took effect
>
> But we know rx mode is best effort e.g it doesn't support vhost and we
> survive from this for years.
>
> > - second, if device fails command, this is also not
> > propagated to driver, again no way for driver to find out
> >
> > VDUSE needs to be fixed to do tricks to fix this
> > without breaking normal drivers.
>
> It's not specific to VDUSE. For example, when using virtio-net in the
> UP environment with any software cvq (like mlx5 via vDPA or cma
> transport).
>
> Thanks
Hmm. Can we differentiate between these use-cases?
> >
> >
> > > ---
> > > Changes since V1:
> > > - use RTNL to synchronize rx mode worker
> > > ---
> > > drivers/net/virtio_net.c | 55 +++++++++++++++++++++++++++++++++++++---
> > > 1 file changed, 52 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index e2560b6f7980..2e56bbf86894 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -265,6 +265,12 @@ struct virtnet_info {
> > > /* Work struct for config space updates */
> > > struct work_struct config_work;
> > >
> > > + /* Work struct for config rx mode */
> > > + struct work_struct rx_mode_work;
> > > +
> > > + /* Is rx mode work enabled? */
> > > + bool rx_mode_work_enabled;
> > > +
> > > /* Does the affinity hint is set for virtqueues? */
> > > bool affinity_hint_set;
> > >
> > > @@ -388,6 +394,20 @@ static void disable_delayed_refill(struct virtnet_info *vi)
> > > spin_unlock_bh(&vi->refill_lock);
> > > }
> > >
> > > +static void enable_rx_mode_work(struct virtnet_info *vi)
> > > +{
> > > + rtnl_lock();
> > > + vi->rx_mode_work_enabled = true;
> > > + rtnl_unlock();
> > > +}
> > > +
> > > +static void disable_rx_mode_work(struct virtnet_info *vi)
> > > +{
> > > + rtnl_lock();
> > > + vi->rx_mode_work_enabled = false;
> > > + rtnl_unlock();
> > > +}
> > > +
> > > static void virtqueue_napi_schedule(struct napi_struct *napi,
> > > struct virtqueue *vq)
> > > {
> > > @@ -2310,9 +2330,11 @@ static int virtnet_close(struct net_device *dev)
> > > return 0;
> > > }
> > >
> > > -static void virtnet_set_rx_mode(struct net_device *dev)
> > > +static void virtnet_rx_mode_work(struct work_struct *work)
> > > {
> > > - struct virtnet_info *vi = netdev_priv(dev);
> > > + struct virtnet_info *vi =
> > > + container_of(work, struct virtnet_info, rx_mode_work);
> > > + struct net_device *dev = vi->dev;
> > > struct scatterlist sg[2];
> > > struct virtio_net_ctrl_mac *mac_data;
> > > struct netdev_hw_addr *ha;
> > > @@ -2325,6 +2347,8 @@ static void virtnet_set_rx_mode(struct net_device *dev)
> > > if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_RX))
> > > return;
> > >
> > > + rtnl_lock();
> > > +
> > > vi->ctrl->promisc = ((dev->flags & IFF_PROMISC) != 0);
> > > vi->ctrl->allmulti = ((dev->flags & IFF_ALLMULTI) != 0);
> > >
> > > @@ -2342,14 +2366,19 @@ static void virtnet_set_rx_mode(struct net_device *dev)
> > > dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n",
> > > vi->ctrl->allmulti ? "en" : "dis");
> > >
> > > + netif_addr_lock_bh(dev);
> > > +
> > > uc_count = netdev_uc_count(dev);
> > > mc_count = netdev_mc_count(dev);
> > > /* MAC filter - use one buffer for both lists */
> > > buf = kzalloc(((uc_count + mc_count) * ETH_ALEN) +
> > > (2 * sizeof(mac_data->entries)), GFP_ATOMIC);
> > > mac_data = buf;
> > > - if (!buf)
> > > + if (!buf) {
> > > + netif_addr_unlock_bh(dev);
> > > + rtnl_unlock();
> > > return;
> > > + }
> > >
> > > sg_init_table(sg, 2);
> > >
> > > @@ -2370,6 +2399,8 @@ static void virtnet_set_rx_mode(struct net_device *dev)
> > > netdev_for_each_mc_addr(ha, dev)
> > > memcpy(&mac_data->macs[i++][0], ha->addr, ETH_ALEN);
> > >
> > > + netif_addr_unlock_bh(dev);
> > > +
> > > sg_set_buf(&sg[1], mac_data,
> > > sizeof(mac_data->entries) + (mc_count * ETH_ALEN));
> > >
> > > @@ -2377,9 +2408,19 @@ static void virtnet_set_rx_mode(struct net_device *dev)
> > > VIRTIO_NET_CTRL_MAC_TABLE_SET, sg))
> > > dev_warn(&dev->dev, "Failed to set MAC filter table.\n");
> > >
> > > + rtnl_unlock();
> > > +
> > > kfree(buf);
> > > }
> > >
> > > +static void virtnet_set_rx_mode(struct net_device *dev)
> > > +{
> > > + struct virtnet_info *vi = netdev_priv(dev);
> > > +
> > > + if (vi->rx_mode_work_enabled)
> > > + schedule_work(&vi->rx_mode_work);
> > > +}
> > > +
> > > static int virtnet_vlan_rx_add_vid(struct net_device *dev,
> > > __be16 proto, u16 vid)
> > > {
> > > @@ -3150,6 +3191,8 @@ static void virtnet_freeze_down(struct virtio_device *vdev)
> > >
> > > /* Make sure no work handler is accessing the device */
> > > flush_work(&vi->config_work);
> > > + disable_rx_mode_work(vi);
> > > + flush_work(&vi->rx_mode_work);
> > >
> > > netif_tx_lock_bh(vi->dev);
> > > netif_device_detach(vi->dev);
> >
> > So now configuration is not propagated to device.
> > Won't device later wake up in wrong state?
> >
> >
> > > @@ -3172,6 +3215,7 @@ static int virtnet_restore_up(struct virtio_device *vdev)
> > > virtio_device_ready(vdev);
> > >
> > > enable_delayed_refill(vi);
> > > + enable_rx_mode_work(vi);
> > >
> > > if (netif_running(vi->dev)) {
> > > err = virtnet_open(vi->dev);
> > > @@ -3969,6 +4013,7 @@ static int virtnet_probe(struct virtio_device *vdev)
> > > vdev->priv = vi;
> > >
> > > INIT_WORK(&vi->config_work, virtnet_config_changed_work);
> > > + INIT_WORK(&vi->rx_mode_work, virtnet_rx_mode_work);
> > > spin_lock_init(&vi->refill_lock);
> > >
> > > if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF)) {
> > > @@ -4077,6 +4122,8 @@ static int virtnet_probe(struct virtio_device *vdev)
> > > if (vi->has_rss || vi->has_rss_hash_report)
> > > virtnet_init_default_rss(vi);
> > >
> > > + enable_rx_mode_work(vi);
> > > +
> > > /* serialize netdev register + virtio_device_ready() with ndo_open() */
> > > rtnl_lock();
> > >
> > > @@ -4174,6 +4221,8 @@ static void virtnet_remove(struct virtio_device *vdev)
> > >
> > > /* Make sure no work handler is accessing the device. */
> > > flush_work(&vi->config_work);
> > > + disable_rx_mode_work(vi);
> > > + flush_work(&vi->rx_mode_work);
> > >
> > > unregister_netdev(vi->dev);
> > >
> > > --
> > > 2.25.1
> >
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
2023-04-14 7:21 ` Michael S. Tsirkin
@ 2023-04-17 3:40 ` Jason Wang
2023-05-05 3:46 ` Jason Wang
2023-05-10 5:32 ` Michael S. Tsirkin
0 siblings, 2 replies; 15+ messages in thread
From: Jason Wang @ 2023-04-17 3:40 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: davem, edumazet, kuba, pabeni, virtualization, linux-kernel,
maxime.coquelin, alvaro.karsz, eperezma, xuanzhuo, david.marchand,
netdev
On Fri, Apr 14, 2023 at 3:21 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Apr 14, 2023 at 01:04:15PM +0800, Jason Wang wrote:
> > Forget to cc netdev, adding.
> >
> > On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
> > > > This patch convert rx mode setting to be done in a workqueue, this is
> > > > a must for allow to sleep when waiting for the cvq command to
> > > > response since current code is executed under addr spin lock.
> > > >
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > >
> > > I don't like this frankly. This means that setting RX mode which would
> > > previously be reliable, now becomes unreliable.
> >
> > It is "unreliable" by design:
> >
> > void (*ndo_set_rx_mode)(struct net_device *dev);
> >
> > > - first of all configuration is no longer immediate
> >
> > Is immediate a hard requirement? I can see a workqueue is used at least:
> >
> > mlx5e, ipoib, efx, ...
> >
> > > and there is no way for driver to find out when
> > > it actually took effect
> >
> > But we know rx mode is best effort e.g it doesn't support vhost and we
> > survive from this for years.
> >
> > > - second, if device fails command, this is also not
> > > propagated to driver, again no way for driver to find out
> > >
> > > VDUSE needs to be fixed to do tricks to fix this
> > > without breaking normal drivers.
> >
> > It's not specific to VDUSE. For example, when using virtio-net in the
> > UP environment with any software cvq (like mlx5 via vDPA or cma
> > transport).
> >
> > Thanks
>
> Hmm. Can we differentiate between these use-cases?
It doesn't look easy since we are a driver on the virtio bus. The details
of the underlying layer are hidden from virtio-net.
Or do you have any ideas on this?
Thanks
>
> > >
> > >
> > > > ---
> > > > Changes since V1:
> > > > - use RTNL to synchronize rx mode worker
> > > > ---
> > > > drivers/net/virtio_net.c | 55 +++++++++++++++++++++++++++++++++++++---
> > > > 1 file changed, 52 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > index e2560b6f7980..2e56bbf86894 100644
> > > > --- a/drivers/net/virtio_net.c
> > > > +++ b/drivers/net/virtio_net.c
> > > > @@ -265,6 +265,12 @@ struct virtnet_info {
> > > > /* Work struct for config space updates */
> > > > struct work_struct config_work;
> > > >
> > > > + /* Work struct for config rx mode */
> > > > + struct work_struct rx_mode_work;
> > > > +
> > > > + /* Is rx mode work enabled? */
> > > > + bool rx_mode_work_enabled;
> > > > +
> > > > /* Does the affinity hint is set for virtqueues? */
> > > > bool affinity_hint_set;
> > > >
> > > > @@ -388,6 +394,20 @@ static void disable_delayed_refill(struct virtnet_info *vi)
> > > > spin_unlock_bh(&vi->refill_lock);
> > > > }
> > > >
> > > > +static void enable_rx_mode_work(struct virtnet_info *vi)
> > > > +{
> > > > + rtnl_lock();
> > > > + vi->rx_mode_work_enabled = true;
> > > > + rtnl_unlock();
> > > > +}
> > > > +
> > > > +static void disable_rx_mode_work(struct virtnet_info *vi)
> > > > +{
> > > > + rtnl_lock();
> > > > + vi->rx_mode_work_enabled = false;
> > > > + rtnl_unlock();
> > > > +}
> > > > +
> > > > static void virtqueue_napi_schedule(struct napi_struct *napi,
> > > > struct virtqueue *vq)
> > > > {
> > > > @@ -2310,9 +2330,11 @@ static int virtnet_close(struct net_device *dev)
> > > > return 0;
> > > > }
> > > >
> > > > -static void virtnet_set_rx_mode(struct net_device *dev)
> > > > +static void virtnet_rx_mode_work(struct work_struct *work)
> > > > {
> > > > - struct virtnet_info *vi = netdev_priv(dev);
> > > > + struct virtnet_info *vi =
> > > > + container_of(work, struct virtnet_info, rx_mode_work);
> > > > + struct net_device *dev = vi->dev;
> > > > struct scatterlist sg[2];
> > > > struct virtio_net_ctrl_mac *mac_data;
> > > > struct netdev_hw_addr *ha;
> > > > @@ -2325,6 +2347,8 @@ static void virtnet_set_rx_mode(struct net_device *dev)
> > > > if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_RX))
> > > > return;
> > > >
> > > > + rtnl_lock();
> > > > +
> > > > vi->ctrl->promisc = ((dev->flags & IFF_PROMISC) != 0);
> > > > vi->ctrl->allmulti = ((dev->flags & IFF_ALLMULTI) != 0);
> > > >
> > > > @@ -2342,14 +2366,19 @@ static void virtnet_set_rx_mode(struct net_device *dev)
> > > > dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n",
> > > > vi->ctrl->allmulti ? "en" : "dis");
> > > >
> > > > + netif_addr_lock_bh(dev);
> > > > +
> > > > uc_count = netdev_uc_count(dev);
> > > > mc_count = netdev_mc_count(dev);
> > > > /* MAC filter - use one buffer for both lists */
> > > > buf = kzalloc(((uc_count + mc_count) * ETH_ALEN) +
> > > > (2 * sizeof(mac_data->entries)), GFP_ATOMIC);
> > > > mac_data = buf;
> > > > - if (!buf)
> > > > + if (!buf) {
> > > > + netif_addr_unlock_bh(dev);
> > > > + rtnl_unlock();
> > > > return;
> > > > + }
> > > >
> > > > sg_init_table(sg, 2);
> > > >
> > > > @@ -2370,6 +2399,8 @@ static void virtnet_set_rx_mode(struct net_device *dev)
> > > > netdev_for_each_mc_addr(ha, dev)
> > > > memcpy(&mac_data->macs[i++][0], ha->addr, ETH_ALEN);
> > > >
> > > > + netif_addr_unlock_bh(dev);
> > > > +
> > > > sg_set_buf(&sg[1], mac_data,
> > > > sizeof(mac_data->entries) + (mc_count * ETH_ALEN));
> > > >
> > > > @@ -2377,9 +2408,19 @@ static void virtnet_set_rx_mode(struct net_device *dev)
> > > > VIRTIO_NET_CTRL_MAC_TABLE_SET, sg))
> > > > dev_warn(&dev->dev, "Failed to set MAC filter table.\n");
> > > >
> > > > + rtnl_unlock();
> > > > +
> > > > kfree(buf);
> > > > }
> > > >
> > > > +static void virtnet_set_rx_mode(struct net_device *dev)
> > > > +{
> > > > + struct virtnet_info *vi = netdev_priv(dev);
> > > > +
> > > > + if (vi->rx_mode_work_enabled)
> > > > + schedule_work(&vi->rx_mode_work);
> > > > +}
> > > > +
> > > > static int virtnet_vlan_rx_add_vid(struct net_device *dev,
> > > > __be16 proto, u16 vid)
> > > > {
> > > > @@ -3150,6 +3191,8 @@ static void virtnet_freeze_down(struct virtio_device *vdev)
> > > >
> > > > /* Make sure no work handler is accessing the device */
> > > > flush_work(&vi->config_work);
> > > > + disable_rx_mode_work(vi);
> > > > + flush_work(&vi->rx_mode_work);
> > > >
> > > > netif_tx_lock_bh(vi->dev);
> > > > netif_device_detach(vi->dev);
> > >
> > > So now configuration is not propagated to device.
> > > Won't device later wake up in wrong state?
> > >
> > >
> > > > @@ -3172,6 +3215,7 @@ static int virtnet_restore_up(struct virtio_device *vdev)
> > > > virtio_device_ready(vdev);
> > > >
> > > > enable_delayed_refill(vi);
> > > > + enable_rx_mode_work(vi);
> > > >
> > > > if (netif_running(vi->dev)) {
> > > > err = virtnet_open(vi->dev);
> > > > @@ -3969,6 +4013,7 @@ static int virtnet_probe(struct virtio_device *vdev)
> > > > vdev->priv = vi;
> > > >
> > > > INIT_WORK(&vi->config_work, virtnet_config_changed_work);
> > > > + INIT_WORK(&vi->rx_mode_work, virtnet_rx_mode_work);
> > > > spin_lock_init(&vi->refill_lock);
> > > >
> > > > if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF)) {
> > > > @@ -4077,6 +4122,8 @@ static int virtnet_probe(struct virtio_device *vdev)
> > > > if (vi->has_rss || vi->has_rss_hash_report)
> > > > virtnet_init_default_rss(vi);
> > > >
> > > > + enable_rx_mode_work(vi);
> > > > +
> > > > /* serialize netdev register + virtio_device_ready() with ndo_open() */
> > > > rtnl_lock();
> > > >
> > > > @@ -4174,6 +4221,8 @@ static void virtnet_remove(struct virtio_device *vdev)
> > > >
> > > > /* Make sure no work handler is accessing the device. */
> > > > flush_work(&vi->config_work);
> > > > + disable_rx_mode_work(vi);
> > > > + flush_work(&vi->rx_mode_work);
> > > >
> > > > unregister_netdev(vi->dev);
> > > >
> > > > --
> > > > 2.25.1
> > >
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
2023-04-17 3:40 ` Jason Wang
@ 2023-05-05 3:46 ` Jason Wang
2023-05-10 5:32 ` Michael S. Tsirkin
1 sibling, 0 replies; 15+ messages in thread
From: Jason Wang @ 2023-05-05 3:46 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: davem, edumazet, kuba, pabeni, virtualization, linux-kernel,
maxime.coquelin, alvaro.karsz, eperezma, xuanzhuo, david.marchand,
netdev
On 2023/4/17 11:40, Jason Wang wrote:
> On Fri, Apr 14, 2023 at 3:21 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>> On Fri, Apr 14, 2023 at 01:04:15PM +0800, Jason Wang wrote:
>>> Forget to cc netdev, adding.
>>>
>>> On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>> On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
>>>>> This patch convert rx mode setting to be done in a workqueue, this is
>>>>> a must for allow to sleep when waiting for the cvq command to
>>>>> response since current code is executed under addr spin lock.
>>>>>
>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>> I don't like this frankly. This means that setting RX mode which would
>>>> previously be reliable, now becomes unreliable.
>>> It is "unreliable" by design:
>>>
>>> void (*ndo_set_rx_mode)(struct net_device *dev);
>>>
>>>> - first of all configuration is no longer immediate
>>> Is immediate a hard requirement? I can see a workqueue is used at least:
>>>
>>> mlx5e, ipoib, efx, ...
>>>
>>>> and there is no way for driver to find out when
>>>> it actually took effect
>>> But we know rx mode is best effort e.g it doesn't support vhost and we
>>> survive from this for years.
>>>
>>>> - second, if device fails command, this is also not
>>>> propagated to driver, again no way for driver to find out
>>>>
>>>> VDUSE needs to be fixed to do tricks to fix this
>>>> without breaking normal drivers.
>>> It's not specific to VDUSE. For example, when using virtio-net in the
>>> UP environment with any software cvq (like mlx5 via vDPA or cma
>>> transport).
>>>
>>> Thanks
>> Hmm. Can we differentiate between these use-cases?
> It doesn't look easy since we are drivers for virtio bus. Underlayer
> details were hidden from virtio-net.
>
> Or do you have any ideas on this?
Michael, any thoughts on this?
Thanks
>
> Thanks
>
>>>>
>>>>> ---
>>>>> Changes since V1:
>>>>> - use RTNL to synchronize rx mode worker
>>>>> ---
>>>>> drivers/net/virtio_net.c | 55 +++++++++++++++++++++++++++++++++++++---
>>>>> 1 file changed, 52 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>>> index e2560b6f7980..2e56bbf86894 100644
>>>>> --- a/drivers/net/virtio_net.c
>>>>> +++ b/drivers/net/virtio_net.c
>>>>> @@ -265,6 +265,12 @@ struct virtnet_info {
>>>>> /* Work struct for config space updates */
>>>>> struct work_struct config_work;
>>>>>
>>>>> + /* Work struct for config rx mode */
>>>>> + struct work_struct rx_mode_work;
>>>>> +
>>>>> + /* Is rx mode work enabled? */
>>>>> + bool rx_mode_work_enabled;
>>>>> +
>>>>> /* Does the affinity hint is set for virtqueues? */
>>>>> bool affinity_hint_set;
>>>>>
>>>>> @@ -388,6 +394,20 @@ static void disable_delayed_refill(struct virtnet_info *vi)
>>>>> spin_unlock_bh(&vi->refill_lock);
>>>>> }
>>>>>
>>>>> +static void enable_rx_mode_work(struct virtnet_info *vi)
>>>>> +{
>>>>> + rtnl_lock();
>>>>> + vi->rx_mode_work_enabled = true;
>>>>> + rtnl_unlock();
>>>>> +}
>>>>> +
>>>>> +static void disable_rx_mode_work(struct virtnet_info *vi)
>>>>> +{
>>>>> + rtnl_lock();
>>>>> + vi->rx_mode_work_enabled = false;
>>>>> + rtnl_unlock();
>>>>> +}
>>>>> +
>>>>> static void virtqueue_napi_schedule(struct napi_struct *napi,
>>>>> struct virtqueue *vq)
>>>>> {
>>>>> @@ -2310,9 +2330,11 @@ static int virtnet_close(struct net_device *dev)
>>>>> return 0;
>>>>> }
>>>>>
>>>>> -static void virtnet_set_rx_mode(struct net_device *dev)
>>>>> +static void virtnet_rx_mode_work(struct work_struct *work)
>>>>> {
>>>>> - struct virtnet_info *vi = netdev_priv(dev);
>>>>> + struct virtnet_info *vi =
>>>>> + container_of(work, struct virtnet_info, rx_mode_work);
>>>>> + struct net_device *dev = vi->dev;
>>>>> struct scatterlist sg[2];
>>>>> struct virtio_net_ctrl_mac *mac_data;
>>>>> struct netdev_hw_addr *ha;
>>>>> @@ -2325,6 +2347,8 @@ static void virtnet_set_rx_mode(struct net_device *dev)
>>>>> if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_RX))
>>>>> return;
>>>>>
>>>>> + rtnl_lock();
>>>>> +
>>>>> vi->ctrl->promisc = ((dev->flags & IFF_PROMISC) != 0);
>>>>> vi->ctrl->allmulti = ((dev->flags & IFF_ALLMULTI) != 0);
>>>>>
>>>>> @@ -2342,14 +2366,19 @@ static void virtnet_set_rx_mode(struct net_device *dev)
>>>>> dev_warn(&dev->dev, "Failed to %sable allmulti mode.\n",
>>>>> vi->ctrl->allmulti ? "en" : "dis");
>>>>>
>>>>> + netif_addr_lock_bh(dev);
>>>>> +
>>>>> uc_count = netdev_uc_count(dev);
>>>>> mc_count = netdev_mc_count(dev);
>>>>> /* MAC filter - use one buffer for both lists */
>>>>> buf = kzalloc(((uc_count + mc_count) * ETH_ALEN) +
>>>>> (2 * sizeof(mac_data->entries)), GFP_ATOMIC);
>>>>> mac_data = buf;
>>>>> - if (!buf)
>>>>> + if (!buf) {
>>>>> + netif_addr_unlock_bh(dev);
>>>>> + rtnl_unlock();
>>>>> return;
>>>>> + }
>>>>>
>>>>> sg_init_table(sg, 2);
>>>>>
>>>>> @@ -2370,6 +2399,8 @@ static void virtnet_set_rx_mode(struct net_device *dev)
>>>>> netdev_for_each_mc_addr(ha, dev)
>>>>> memcpy(&mac_data->macs[i++][0], ha->addr, ETH_ALEN);
>>>>>
>>>>> + netif_addr_unlock_bh(dev);
>>>>> +
>>>>> sg_set_buf(&sg[1], mac_data,
>>>>> sizeof(mac_data->entries) + (mc_count * ETH_ALEN));
>>>>>
>>>>> @@ -2377,9 +2408,19 @@ static void virtnet_set_rx_mode(struct net_device *dev)
>>>>> VIRTIO_NET_CTRL_MAC_TABLE_SET, sg))
>>>>> dev_warn(&dev->dev, "Failed to set MAC filter table.\n");
>>>>>
>>>>> + rtnl_unlock();
>>>>> +
>>>>> kfree(buf);
>>>>> }
>>>>>
>>>>> +static void virtnet_set_rx_mode(struct net_device *dev)
>>>>> +{
>>>>> + struct virtnet_info *vi = netdev_priv(dev);
>>>>> +
>>>>> + if (vi->rx_mode_work_enabled)
>>>>> + schedule_work(&vi->rx_mode_work);
>>>>> +}
>>>>> +
>>>>> static int virtnet_vlan_rx_add_vid(struct net_device *dev,
>>>>> __be16 proto, u16 vid)
>>>>> {
>>>>> @@ -3150,6 +3191,8 @@ static void virtnet_freeze_down(struct virtio_device *vdev)
>>>>>
>>>>> /* Make sure no work handler is accessing the device */
>>>>> flush_work(&vi->config_work);
>>>>> + disable_rx_mode_work(vi);
>>>>> + flush_work(&vi->rx_mode_work);
>>>>>
>>>>> netif_tx_lock_bh(vi->dev);
>>>>> netif_device_detach(vi->dev);
>>>> So now configuration is not propagated to device.
>>>> Won't device later wake up in wrong state?
>>>>
>>>>
>>>>> @@ -3172,6 +3215,7 @@ static int virtnet_restore_up(struct virtio_device *vdev)
>>>>> virtio_device_ready(vdev);
>>>>>
>>>>> enable_delayed_refill(vi);
>>>>> + enable_rx_mode_work(vi);
>>>>>
>>>>> if (netif_running(vi->dev)) {
>>>>> err = virtnet_open(vi->dev);
>>>>> @@ -3969,6 +4013,7 @@ static int virtnet_probe(struct virtio_device *vdev)
>>>>> vdev->priv = vi;
>>>>>
>>>>> INIT_WORK(&vi->config_work, virtnet_config_changed_work);
>>>>> + INIT_WORK(&vi->rx_mode_work, virtnet_rx_mode_work);
>>>>> spin_lock_init(&vi->refill_lock);
>>>>>
>>>>> if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF)) {
>>>>> @@ -4077,6 +4122,8 @@ static int virtnet_probe(struct virtio_device *vdev)
>>>>> if (vi->has_rss || vi->has_rss_hash_report)
>>>>> virtnet_init_default_rss(vi);
>>>>>
>>>>> + enable_rx_mode_work(vi);
>>>>> +
>>>>> /* serialize netdev register + virtio_device_ready() with ndo_open() */
>>>>> rtnl_lock();
>>>>>
>>>>> @@ -4174,6 +4221,8 @@ static void virtnet_remove(struct virtio_device *vdev)
>>>>>
>>>>> /* Make sure no work handler is accessing the device. */
>>>>> flush_work(&vi->config_work);
>>>>> + disable_rx_mode_work(vi);
>>>>> + flush_work(&vi->rx_mode_work);
>>>>>
>>>>> unregister_netdev(vi->dev);
>>>>>
>>>>> --
>>>>> 2.25.1
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
2023-04-17 3:40 ` Jason Wang
2023-05-05 3:46 ` Jason Wang
@ 2023-05-10 5:32 ` Michael S. Tsirkin
2023-05-15 1:05 ` Jason Wang
1 sibling, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2023-05-10 5:32 UTC (permalink / raw)
To: Jason Wang
Cc: davem, edumazet, kuba, pabeni, virtualization, linux-kernel,
maxime.coquelin, alvaro.karsz, eperezma, xuanzhuo, david.marchand,
netdev
On Mon, Apr 17, 2023 at 11:40:58AM +0800, Jason Wang wrote:
> On Fri, Apr 14, 2023 at 3:21 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Fri, Apr 14, 2023 at 01:04:15PM +0800, Jason Wang wrote:
> > > Forget to cc netdev, adding.
> > >
> > > On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
> > > > > This patch convert rx mode setting to be done in a workqueue, this is
> > > > > a must for allow to sleep when waiting for the cvq command to
> > > > > response since current code is executed under addr spin lock.
> > > > >
> > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > >
> > > > I don't like this frankly. This means that setting RX mode which would
> > > > previously be reliable, now becomes unreliable.
> > >
> > > It is "unreliable" by design:
> > >
> > > void (*ndo_set_rx_mode)(struct net_device *dev);
> > >
> > > > - first of all configuration is no longer immediate
> > >
> > > Is immediate a hard requirement? I can see a workqueue is used at least:
> > >
> > > mlx5e, ipoib, efx, ...
> > >
> > > > and there is no way for driver to find out when
> > > > it actually took effect
> > >
> > > But we know rx mode is best effort e.g it doesn't support vhost and we
> > > survive from this for years.
> > >
> > > > - second, if device fails command, this is also not
> > > > propagated to driver, again no way for driver to find out
> > > >
> > > > VDUSE needs to be fixed to do tricks to fix this
> > > > without breaking normal drivers.
> > >
> > > It's not specific to VDUSE. For example, when using virtio-net in the
> > > UP environment with any software cvq (like mlx5 via vDPA or cma
> > > transport).
> > >
> > > Thanks
> >
> > Hmm. Can we differentiate between these use-cases?
>
> It doesn't look easy since we are drivers for virtio bus. Underlayer
> details were hidden from virtio-net.
>
> Or do you have any ideas on this?
>
> Thanks
I don't know, pass some kind of flag in struct virtqueue?
"bool slow; /* This vq can be very slow sometimes. Don't wait for it! */"
?
--
MST
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
2023-05-10 5:32 ` Michael S. Tsirkin
@ 2023-05-15 1:05 ` Jason Wang
2023-05-15 4:45 ` Michael S. Tsirkin
0 siblings, 1 reply; 15+ messages in thread
From: Jason Wang @ 2023-05-15 1:05 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: davem, edumazet, kuba, pabeni, virtualization, linux-kernel,
maxime.coquelin, alvaro.karsz, eperezma, xuanzhuo, david.marchand,
netdev
On Wed, May 10, 2023 at 1:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Apr 17, 2023 at 11:40:58AM +0800, Jason Wang wrote:
> > On Fri, Apr 14, 2023 at 3:21 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Fri, Apr 14, 2023 at 01:04:15PM +0800, Jason Wang wrote:
> > > > Forget to cc netdev, adding.
> > > >
> > > > On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
> > > > > > This patch convert rx mode setting to be done in a workqueue, this is
> > > > > > a must for allow to sleep when waiting for the cvq command to
> > > > > > response since current code is executed under addr spin lock.
> > > > > >
> > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > >
> > > > > I don't like this frankly. This means that setting RX mode which would
> > > > > previously be reliable, now becomes unreliable.
> > > >
> > > > It is "unreliable" by design:
> > > >
> > > > void (*ndo_set_rx_mode)(struct net_device *dev);
> > > >
> > > > > - first of all configuration is no longer immediate
> > > >
> > > > Is immediate a hard requirement? I can see a workqueue is used at least:
> > > >
> > > > mlx5e, ipoib, efx, ...
> > > >
> > > > > and there is no way for driver to find out when
> > > > > it actually took effect
> > > >
> > > > But we know rx mode is best effort e.g it doesn't support vhost and we
> > > > survive from this for years.
> > > >
> > > > > - second, if device fails command, this is also not
> > > > > propagated to driver, again no way for driver to find out
> > > > >
> > > > > VDUSE needs to be fixed to do tricks to fix this
> > > > > without breaking normal drivers.
> > > >
> > > > It's not specific to VDUSE. For example, when using virtio-net in the
> > > > UP environment with any software cvq (like mlx5 via vDPA or cma
> > > > transport).
> > > >
> > > > Thanks
> > >
> > > Hmm. Can we differentiate between these use-cases?
> >
> > It doesn't look easy since we are drivers for virtio bus. Underlayer
> > details were hidden from virtio-net.
> >
> > Or do you have any ideas on this?
> >
> > Thanks
>
> I don't know, pass some kind of flag in struct virtqueue?
> "bool slow; /* This vq can be very slow sometimes. Don't wait for it! */"
>
> ?
>
So if it's slow, sleep, otherwise poll?
I feel setting this flag might be tricky, since the driver doesn't
know whether or not the vq is really slow. E.g. a smartNIC vendor may
offer virtio-net emulation over PCI.
Thanks
> --
> MST
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
2023-05-15 1:05 ` Jason Wang
@ 2023-05-15 4:45 ` Michael S. Tsirkin
2023-05-15 5:13 ` Jason Wang
0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2023-05-15 4:45 UTC (permalink / raw)
To: Jason Wang
Cc: davem, edumazet, kuba, pabeni, virtualization, linux-kernel,
maxime.coquelin, alvaro.karsz, eperezma, xuanzhuo, david.marchand,
netdev
On Mon, May 15, 2023 at 09:05:54AM +0800, Jason Wang wrote:
> On Wed, May 10, 2023 at 1:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Apr 17, 2023 at 11:40:58AM +0800, Jason Wang wrote:
> > > On Fri, Apr 14, 2023 at 3:21 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Fri, Apr 14, 2023 at 01:04:15PM +0800, Jason Wang wrote:
> > > > > Forget to cc netdev, adding.
> > > > >
> > > > > On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
> > > > > > > This patch convert rx mode setting to be done in a workqueue, this is
> > > > > > > a must for allow to sleep when waiting for the cvq command to
> > > > > > > response since current code is executed under addr spin lock.
> > > > > > >
> > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > >
> > > > > > I don't like this frankly. This means that setting RX mode which would
> > > > > > previously be reliable, now becomes unreliable.
> > > > >
> > > > > It is "unreliable" by design:
> > > > >
> > > > > void (*ndo_set_rx_mode)(struct net_device *dev);
> > > > >
> > > > > > - first of all configuration is no longer immediate
> > > > >
> > > > > Is immediate a hard requirement? I can see a workqueue is used at least:
> > > > >
> > > > > mlx5e, ipoib, efx, ...
> > > > >
> > > > > > and there is no way for driver to find out when
> > > > > > it actually took effect
> > > > >
> > > > > But we know rx mode is best effort e.g it doesn't support vhost and we
> > > > > survive from this for years.
> > > > >
> > > > > > - second, if device fails command, this is also not
> > > > > > propagated to driver, again no way for driver to find out
> > > > > >
> > > > > > VDUSE needs to be fixed to do tricks to fix this
> > > > > > without breaking normal drivers.
> > > > >
> > > > > It's not specific to VDUSE. For example, when using virtio-net in the
> > > > > UP environment with any software cvq (like mlx5 via vDPA or cma
> > > > > transport).
> > > > >
> > > > > Thanks
> > > >
> > > > Hmm. Can we differentiate between these use-cases?
> > >
> > > It doesn't look easy since we are drivers for virtio bus. Underlayer
> > > details were hidden from virtio-net.
> > >
> > > Or do you have any ideas on this?
> > >
> > > Thanks
> >
> > I don't know, pass some kind of flag in struct virtqueue?
> > "bool slow; /* This vq can be very slow sometimes. Don't wait for it! */"
> >
> > ?
> >
>
> So if it's slow, sleep, otherwise poll?
>
> I feel setting this flag might be tricky, since the driver doesn't
> know whether or not it's really slow. E.g smartNIC vendor may allow
> virtio-net emulation over PCI.
>
> Thanks
The driver will have the choice, depending on whether
the vq is deterministic or not.
> > --
> > MST
> >
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
2023-05-15 4:45 ` Michael S. Tsirkin
@ 2023-05-15 5:13 ` Jason Wang
2023-05-15 10:17 ` Michael S. Tsirkin
0 siblings, 1 reply; 15+ messages in thread
From: Jason Wang @ 2023-05-15 5:13 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: davem, edumazet, kuba, pabeni, virtualization, linux-kernel,
maxime.coquelin, alvaro.karsz, eperezma, xuanzhuo, david.marchand,
netdev
On Mon, May 15, 2023 at 12:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, May 15, 2023 at 09:05:54AM +0800, Jason Wang wrote:
> > On Wed, May 10, 2023 at 1:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Apr 17, 2023 at 11:40:58AM +0800, Jason Wang wrote:
> > > > On Fri, Apr 14, 2023 at 3:21 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Fri, Apr 14, 2023 at 01:04:15PM +0800, Jason Wang wrote:
> > > > > > Forget to cc netdev, adding.
> > > > > >
> > > > > > On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
> > > > > > > > This patch convert rx mode setting to be done in a workqueue, this is
> > > > > > > > a must for allow to sleep when waiting for the cvq command to
> > > > > > > > response since current code is executed under addr spin lock.
> > > > > > > >
> > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > >
> > > > > > > I don't like this frankly. This means that setting RX mode which would
> > > > > > > previously be reliable, now becomes unreliable.
> > > > > >
> > > > > > It is "unreliable" by design:
> > > > > >
> > > > > > void (*ndo_set_rx_mode)(struct net_device *dev);
> > > > > >
> > > > > > > - first of all configuration is no longer immediate
> > > > > >
> > > > > > Is immediate a hard requirement? I can see a workqueue is used at least:
> > > > > >
> > > > > > mlx5e, ipoib, efx, ...
> > > > > >
> > > > > > > and there is no way for driver to find out when
> > > > > > > it actually took effect
> > > > > >
> > > > > > But we know rx mode is best effort e.g it doesn't support vhost and we
> > > > > > survive from this for years.
> > > > > >
> > > > > > > - second, if device fails command, this is also not
> > > > > > > propagated to driver, again no way for driver to find out
> > > > > > >
> > > > > > > VDUSE needs to be fixed to do tricks to fix this
> > > > > > > without breaking normal drivers.
> > > > > >
> > > > > > It's not specific to VDUSE. For example, when using virtio-net in the
> > > > > > UP environment with any software cvq (like mlx5 via vDPA or cma
> > > > > > transport).
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > Hmm. Can we differentiate between these use-cases?
> > > >
> > > > It doesn't look easy since we are drivers for virtio bus. Underlayer
> > > > details were hidden from virtio-net.
> > > >
> > > > Or do you have any ideas on this?
> > > >
> > > > Thanks
> > >
> > > I don't know, pass some kind of flag in struct virtqueue?
> > > "bool slow; /* This vq can be very slow sometimes. Don't wait for it! */"
> > >
> > > ?
> > >
> >
> > So if it's slow, sleep, otherwise poll?
> >
> > I feel setting this flag might be tricky, since the driver doesn't
> > know whether or not it's really slow. E.g smartNIC vendor may allow
> > virtio-net emulation over PCI.
> >
> > Thanks
>
> driver will have the choice, depending on whether
> vq is deterministic or not.
OK, but the problem is that such a boolean is only useful to the virtio
ring code. In this case virtio-net already knows what to do for the
cvq, so I'm not sure who the user would be.
Thanks
>
>
> > > --
> > > MST
> > >
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
2023-05-15 5:13 ` Jason Wang
@ 2023-05-15 10:17 ` Michael S. Tsirkin
2023-05-16 2:44 ` Jason Wang
0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2023-05-15 10:17 UTC (permalink / raw)
To: Jason Wang
Cc: davem, edumazet, kuba, pabeni, virtualization, linux-kernel,
maxime.coquelin, alvaro.karsz, eperezma, xuanzhuo, david.marchand,
netdev
On Mon, May 15, 2023 at 01:13:33PM +0800, Jason Wang wrote:
> On Mon, May 15, 2023 at 12:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, May 15, 2023 at 09:05:54AM +0800, Jason Wang wrote:
> > > On Wed, May 10, 2023 at 1:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, Apr 17, 2023 at 11:40:58AM +0800, Jason Wang wrote:
> > > > > On Fri, Apr 14, 2023 at 3:21 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Fri, Apr 14, 2023 at 01:04:15PM +0800, Jason Wang wrote:
> > > > > > > Forget to cc netdev, adding.
> > > > > > >
> > > > > > > On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
> > > > > > > > > This patch convert rx mode setting to be done in a workqueue, this is
> > > > > > > > > a must for allow to sleep when waiting for the cvq command to
> > > > > > > > > response since current code is executed under addr spin lock.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > >
> > > > > > > > I don't like this frankly. This means that setting RX mode which would
> > > > > > > > previously be reliable, now becomes unreliable.
> > > > > > >
> > > > > > > It is "unreliable" by design:
> > > > > > >
> > > > > > > void (*ndo_set_rx_mode)(struct net_device *dev);
> > > > > > >
> > > > > > > > - first of all configuration is no longer immediate
> > > > > > >
> > > > > > > Is immediate a hard requirement? I can see a workqueue is used at least:
> > > > > > >
> > > > > > > mlx5e, ipoib, efx, ...
> > > > > > >
> > > > > > > > and there is no way for driver to find out when
> > > > > > > > it actually took effect
> > > > > > >
> > > > > > > But we know rx mode is best effort e.g it doesn't support vhost and we
> > > > > > > survive from this for years.
> > > > > > >
> > > > > > > > - second, if device fails command, this is also not
> > > > > > > > propagated to driver, again no way for driver to find out
> > > > > > > >
> > > > > > > > VDUSE needs to be fixed to do tricks to fix this
> > > > > > > > without breaking normal drivers.
> > > > > > >
> > > > > > > It's not specific to VDUSE. For example, when using virtio-net in the
> > > > > > > UP environment with any software cvq (like mlx5 via vDPA or cma
> > > > > > > transport).
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > > Hmm. Can we differentiate between these use-cases?
> > > > >
> > > > > It doesn't look easy since we are drivers for virtio bus. Underlayer
> > > > > details were hidden from virtio-net.
> > > > >
> > > > > Or do you have any ideas on this?
> > > > >
> > > > > Thanks
> > > >
> > > > I don't know, pass some kind of flag in struct virtqueue?
> > > > "bool slow; /* This vq can be very slow sometimes. Don't wait for it! */"
> > > >
> > > > ?
> > > >
> > >
> > > So if it's slow, sleep, otherwise poll?
> > >
> > > I feel setting this flag might be tricky, since the driver doesn't
> > > know whether or not it's really slow. E.g smartNIC vendor may allow
> > > virtio-net emulation over PCI.
> > >
> > > Thanks
> >
> > driver will have the choice, depending on whether
> > vq is deterministic or not.
>
> Ok, but the problem is, such booleans are only useful for virtio ring
> codes. But in this case, virtio-net knows what to do for cvq. So I'm
> not sure who the user is.
>
> Thanks
Circling back, what exactly does the architecture you are trying
to fix look like? Who is going to introduce unbounded latency?
The hypervisor? If so do we not maybe want a new feature bit
that documents this? Hypervisor then can detect old guests
that spin and decide what to do, e.g. prioritise cvq more,
or fail FEATURES_OK.
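
A rough sketch of that decision from the device side. The bit number
`SLOW_CVQ_BIT_MODEL`, the policy names and the `strict` knob are all
hypothetical; no such feature bit is defined, and only the general
offer/acknowledge flow is assumed here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit number, chosen only for illustration. */
#define SLOW_CVQ_BIT_MODEL 63

enum cvq_policy {
	CVQ_NORMAL,     /* bit not offered, or guest acked it and will not spin */
	CVQ_PRIORITISE, /* old guest that spins: service the cvq with priority */
	CVQ_REJECT,     /* old guest that spins: refuse FEATURES_OK */
};

static bool has_bit(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return ((features >> bit) & 1) != 0;
}

/* Decide what the device does once the guest has written its features. */
static enum cvq_policy device_cvq_policy(uint64_t offered, uint64_t acked,
					 bool strict)
{
	if (!has_bit(offered, SLOW_CVQ_BIT_MODEL))
		return CVQ_NORMAL;
	if (has_bit(acked, SLOW_CVQ_BIT_MODEL))
		return CVQ_NORMAL;
	/* Bit offered but not acked: treat the guest as an old one that spins. */
	return strict ? CVQ_REJECT : CVQ_PRIORITISE;
}
```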
> >
> >
> > > > --
> > > > MST
> > > >
> >
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
2023-05-15 10:17 ` Michael S. Tsirkin
@ 2023-05-16 2:44 ` Jason Wang
2023-05-16 4:12 ` Michael S. Tsirkin
0 siblings, 1 reply; 15+ messages in thread
From: Jason Wang @ 2023-05-16 2:44 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: davem, edumazet, kuba, pabeni, virtualization, linux-kernel,
maxime.coquelin, alvaro.karsz, eperezma, xuanzhuo, david.marchand,
netdev
On Mon, May 15, 2023 at 6:17 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, May 15, 2023 at 01:13:33PM +0800, Jason Wang wrote:
> > On Mon, May 15, 2023 at 12:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, May 15, 2023 at 09:05:54AM +0800, Jason Wang wrote:
> > > > On Wed, May 10, 2023 at 1:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Apr 17, 2023 at 11:40:58AM +0800, Jason Wang wrote:
> > > > > > On Fri, Apr 14, 2023 at 3:21 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Fri, Apr 14, 2023 at 01:04:15PM +0800, Jason Wang wrote:
> > > > > > > > Forget to cc netdev, adding.
> > > > > > > >
> > > > > > > > On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
> > > > > > > > > > This patch convert rx mode setting to be done in a workqueue, this is
> > > > > > > > > > a must for allow to sleep when waiting for the cvq command to
> > > > > > > > > > response since current code is executed under addr spin lock.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > >
> > > > > > > > > I don't like this frankly. This means that setting RX mode which would
> > > > > > > > > previously be reliable, now becomes unreliable.
> > > > > > > >
> > > > > > > > It is "unreliable" by design:
> > > > > > > >
> > > > > > > > void (*ndo_set_rx_mode)(struct net_device *dev);
> > > > > > > >
> > > > > > > > > - first of all configuration is no longer immediate
> > > > > > > >
> > > > > > > > Is immediate a hard requirement? I can see a workqueue is used at least:
> > > > > > > >
> > > > > > > > mlx5e, ipoib, efx, ...
> > > > > > > >
> > > > > > > > > and there is no way for driver to find out when
> > > > > > > > > it actually took effect
> > > > > > > >
> > > > > > > > But we know rx mode is best effort e.g it doesn't support vhost and we
> > > > > > > > survive from this for years.
> > > > > > > >
> > > > > > > > > - second, if device fails command, this is also not
> > > > > > > > > propagated to driver, again no way for driver to find out
> > > > > > > > >
> > > > > > > > > VDUSE needs to be fixed to do tricks to fix this
> > > > > > > > > without breaking normal drivers.
> > > > > > > >
> > > > > > > > It's not specific to VDUSE. For example, when using virtio-net in the
> > > > > > > > UP environment with any software cvq (like mlx5 via vDPA or cma
> > > > > > > > transport).
> > > > > > > >
> > > > > > > > Thanks
> > > > > > >
> > > > > > > Hmm. Can we differentiate between these use-cases?
> > > > > >
> > > > > > It doesn't look easy since we are drivers for virtio bus. Underlayer
> > > > > > details were hidden from virtio-net.
> > > > > >
> > > > > > Or do you have any ideas on this?
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > I don't know, pass some kind of flag in struct virtqueue?
> > > > > "bool slow; /* This vq can be very slow sometimes. Don't wait for it! */"
> > > > >
> > > > > ?
> > > > >
> > > >
> > > > So if it's slow, sleep, otherwise poll?
> > > >
> > > > I feel setting this flag might be tricky, since the driver doesn't
> > > > know whether or not it's really slow. E.g smartNIC vendor may allow
> > > > virtio-net emulation over PCI.
> > > >
> > > > Thanks
> > >
> > > driver will have the choice, depending on whether
> > > vq is deterministic or not.
> >
> > Ok, but the problem is, such booleans are only useful for virtio ring
> > codes. But in this case, virtio-net knows what to do for cvq. So I'm
> > not sure who the user is.
> >
> > Thanks
>
> Circling back, what exactly does the architecture you are trying
> to fix look like? Who is going to introduce unbounded latency?
> The hypervisor?
The hypervisor is one of the possible reasons; there are others:
A hardware device that provides virtio-pci emulation.
Userspace devices like VDUSE.
> If so do we not maybe want a new feature bit
> that documents this? Hypervisor then can detect old guests
> that spin and decide what to do, e.g. prioritise cvq more,
> or fail FEATURES_OK.
We suffer from this on bare metal as well.
But the question is: what's wrong with the approach used in this
patch? As I've explained, set_rx_mode is not reliable anyway, so it
should be fine to use a workqueue. Apart from that, is there anything
else that worries you?
Thanks
>
> > >
> > >
> > > > > --
> > > > > MST
> > > > >
> > >
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
2023-05-16 2:44 ` Jason Wang
@ 2023-05-16 4:12 ` Michael S. Tsirkin
2023-05-16 4:17 ` Jason Wang
0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2023-05-16 4:12 UTC (permalink / raw)
To: Jason Wang
Cc: davem, edumazet, kuba, pabeni, virtualization, linux-kernel,
maxime.coquelin, alvaro.karsz, eperezma, xuanzhuo, david.marchand,
netdev
On Tue, May 16, 2023 at 10:44:45AM +0800, Jason Wang wrote:
> On Mon, May 15, 2023 at 6:17 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, May 15, 2023 at 01:13:33PM +0800, Jason Wang wrote:
> > > On Mon, May 15, 2023 at 12:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, May 15, 2023 at 09:05:54AM +0800, Jason Wang wrote:
> > > > > On Wed, May 10, 2023 at 1:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Apr 17, 2023 at 11:40:58AM +0800, Jason Wang wrote:
> > > > > > > On Fri, Apr 14, 2023 at 3:21 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Fri, Apr 14, 2023 at 01:04:15PM +0800, Jason Wang wrote:
> > > > > > > > > Forget to cc netdev, adding.
> > > > > > > > >
> > > > > > > > > On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
> > > > > > > > > > > This patch convert rx mode setting to be done in a workqueue, this is
> > > > > > > > > > > a must for allow to sleep when waiting for the cvq command to
> > > > > > > > > > > response since current code is executed under addr spin lock.
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > >
> > > > > > > > > > I don't like this frankly. This means that setting RX mode which would
> > > > > > > > > > previously be reliable, now becomes unreliable.
> > > > > > > > >
> > > > > > > > > It is "unreliable" by design:
> > > > > > > > >
> > > > > > > > > void (*ndo_set_rx_mode)(struct net_device *dev);
> > > > > > > > >
> > > > > > > > > > - first of all configuration is no longer immediate
> > > > > > > > >
> > > > > > > > > Is immediate a hard requirement? I can see a workqueue is used at least:
> > > > > > > > >
> > > > > > > > > mlx5e, ipoib, efx, ...
> > > > > > > > >
> > > > > > > > > > and there is no way for driver to find out when
> > > > > > > > > > it actually took effect
> > > > > > > > >
> > > > > > > > > But we know rx mode is best effort e.g it doesn't support vhost and we
> > > > > > > > > survive from this for years.
> > > > > > > > >
> > > > > > > > > > - second, if device fails command, this is also not
> > > > > > > > > > propagated to driver, again no way for driver to find out
> > > > > > > > > >
> > > > > > > > > > VDUSE needs to be fixed to do tricks to fix this
> > > > > > > > > > without breaking normal drivers.
> > > > > > > > >
> > > > > > > > > It's not specific to VDUSE. For example, when using virtio-net in the
> > > > > > > > > UP environment with any software cvq (like mlx5 via vDPA or cma
> > > > > > > > > transport).
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > >
> > > > > > > > Hmm. Can we differentiate between these use-cases?
> > > > > > >
> > > > > > > It doesn't look easy since we are drivers for virtio bus. Underlayer
> > > > > > > details were hidden from virtio-net.
> > > > > > >
> > > > > > > Or do you have any ideas on this?
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > > I don't know, pass some kind of flag in struct virtqueue?
> > > > > > "bool slow; /* This vq can be very slow sometimes. Don't wait for it! */"
> > > > > >
> > > > > > ?
> > > > > >
> > > > >
> > > > > So if it's slow, sleep, otherwise poll?
> > > > >
> > > > > I feel setting this flag might be tricky, since the driver doesn't
> > > > > know whether or not it's really slow. E.g smartNIC vendor may allow
> > > > > virtio-net emulation over PCI.
> > > > >
> > > > > Thanks
> > > >
> > > > driver will have the choice, depending on whether
> > > > vq is deterministic or not.
> > >
> > > Ok, but the problem is, such booleans are only useful for virtio ring
> > > codes. But in this case, virtio-net knows what to do for cvq. So I'm
> > > not sure who the user is.
> > >
> > > Thanks
> >
> > Circling back, what exactly does the architecture you are trying
> > to fix look like? Who is going to introduce unbounded latency?
> > The hypervisor?
>
> Hypervisor is one of the possible reason, we have many more:
>
> Hardware device that provides virtio-pci emulation.
> Userspace devices like VDUSE.
So let's start by addressing VDUSE maybe?
> > If so do we not maybe want a new feature bit
> > that documents this? Hypervisor then can detect old guests
> > that spin and decide what to do, e.g. prioritise cvq more,
> > or fail FEATURES_OK.
>
> We suffer from this for bare metal as well.
>
> But a question is what's wrong with the approach that is used in this
> patch? I've answered that set_rx_mode is not reliable, so it should be
> fine to use workqueue. Except for this, any other thing that worries
> you?
>
> Thanks
It's not reliable for other drivers but has been reliable for virtio.
I worry some software relied on this.
You are making good points though ... could we get some
maintainer's feedback on this?
> >
> > > >
> > > >
> > > > > > --
> > > > > > MST
> > > > > >
> > > >
> >
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
2023-05-16 4:12 ` Michael S. Tsirkin
@ 2023-05-16 4:17 ` Jason Wang
2023-05-16 5:56 ` Michael S. Tsirkin
2023-05-16 19:36 ` Jakub Kicinski
0 siblings, 2 replies; 15+ messages in thread
From: Jason Wang @ 2023-05-16 4:17 UTC (permalink / raw)
To: Michael S. Tsirkin, kuba
Cc: davem, edumazet, pabeni, virtualization, linux-kernel,
maxime.coquelin, alvaro.karsz, eperezma, xuanzhuo, david.marchand,
netdev
On Tue, May 16, 2023 at 12:13 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, May 16, 2023 at 10:44:45AM +0800, Jason Wang wrote:
> > On Mon, May 15, 2023 at 6:17 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, May 15, 2023 at 01:13:33PM +0800, Jason Wang wrote:
> > > > On Mon, May 15, 2023 at 12:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, May 15, 2023 at 09:05:54AM +0800, Jason Wang wrote:
> > > > > > On Wed, May 10, 2023 at 1:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Mon, Apr 17, 2023 at 11:40:58AM +0800, Jason Wang wrote:
> > > > > > > > On Fri, Apr 14, 2023 at 3:21 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Fri, Apr 14, 2023 at 01:04:15PM +0800, Jason Wang wrote:
> > > > > > > > > > Forget to cc netdev, adding.
> > > > > > > > > >
> > > > > > > > > > On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
> > > > > > > > > > > > This patch convert rx mode setting to be done in a workqueue, this is
> > > > > > > > > > > > a must for allow to sleep when waiting for the cvq command to
> > > > > > > > > > > > response since current code is executed under addr spin lock.
> > > > > > > > > > > >
> > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > >
> > > > > > > > > > > I don't like this frankly. This means that setting RX mode which would
> > > > > > > > > > > previously be reliable, now becomes unreliable.
> > > > > > > > > >
> > > > > > > > > > It is "unreliable" by design:
> > > > > > > > > >
> > > > > > > > > > void (*ndo_set_rx_mode)(struct net_device *dev);
> > > > > > > > > >
> > > > > > > > > > > - first of all configuration is no longer immediate
> > > > > > > > > >
> > > > > > > > > > Is immediate a hard requirement? I can see a workqueue is used at least:
> > > > > > > > > >
> > > > > > > > > > mlx5e, ipoib, efx, ...
> > > > > > > > > >
> > > > > > > > > > > and there is no way for driver to find out when
> > > > > > > > > > > it actually took effect
> > > > > > > > > >
> > > > > > > > > > But we know rx mode is best effort e.g it doesn't support vhost and we
> > > > > > > > > > survive from this for years.
> > > > > > > > > >
> > > > > > > > > > > - second, if device fails command, this is also not
> > > > > > > > > > > propagated to driver, again no way for driver to find out
> > > > > > > > > > >
> > > > > > > > > > > VDUSE needs to be fixed to do tricks to fix this
> > > > > > > > > > > without breaking normal drivers.
> > > > > > > > > >
> > > > > > > > > > It's not specific to VDUSE. For example, when using virtio-net in the
> > > > > > > > > > UP environment with any software cvq (like mlx5 via vDPA or cma
> > > > > > > > > > transport).
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > Hmm. Can we differentiate between these use-cases?
> > > > > > > >
> > > > > > > > It doesn't look easy since we are drivers for virtio bus. Underlayer
> > > > > > > > details were hidden from virtio-net.
> > > > > > > >
> > > > > > > > Or do you have any ideas on this?
> > > > > > > >
> > > > > > > > Thanks
> > > > > > >
> > > > > > > I don't know, pass some kind of flag in struct virtqueue?
> > > > > > > "bool slow; /* This vq can be very slow sometimes. Don't wait for it! */"
> > > > > > >
> > > > > > > ?
> > > > > > >
> > > > > >
> > > > > > So if it's slow, sleep, otherwise poll?
> > > > > >
> > > > > > I feel setting this flag might be tricky, since the driver doesn't
> > > > > > know whether or not it's really slow. E.g smartNIC vendor may allow
> > > > > > virtio-net emulation over PCI.
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > driver will have the choice, depending on whether
> > > > > vq is deterministic or not.
> > > >
> > > > Ok, but the problem is, such booleans are only useful for virtio ring
> > > > codes. But in this case, virtio-net knows what to do for cvq. So I'm
> > > > not sure who the user is.
> > > >
> > > > Thanks
> > >
> > > Circling back, what exactly does the architecture you are trying
> > > to fix look like? Who is going to introduce unbounded latency?
> > > The hypervisor?
> >
> > Hypervisor is one of the possible reason, we have many more:
> >
> > Hardware device that provides virtio-pci emulation.
> > Userspace devices like VDUSE.
>
> So let's start by addressing VDUSE maybe?

It's reported by at least one hardware vendor as well. I remember it
was Alvaro who reported this first in the past.

>
> > > If so do we not maybe want a new feature bit
> > > that documents this? Hypervisor then can detect old guests
> > > that spin and decide what to do, e.g. prioritise cvq more,
> > > or fail FEATURES_OK.
> >
> > We suffer from this for bare metal as well.
> >
> > But a question is what's wrong with the approach that is used in this
> > patch? I've answered that set_rx_mode is not reliable, so it should be
> > fine to use workqueue. Except for this, any other thing that worries
> > you?
> >
> > Thanks
>
> It's not reliable for other drivers but has been reliable for virtio.
> I worry some software relied on this.

It's probably fine since some device like vhost doesn't support this
at all and we manage to survive for several years.

> You are making good points though ... could we get some
> maintainer's feedback on this?

That would be helpful. Jakub, any input on this?

Thanks

>
> > >
> > > > >
> > > > >
> > > > > > > --
> > > > > > > MST
> > > > > > >
> > > > >
> > >
>
* Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
2023-05-16 4:17 ` Jason Wang
@ 2023-05-16 5:56 ` Michael S. Tsirkin
2023-05-16 19:36 ` Jakub Kicinski
1 sibling, 0 replies; 15+ messages in thread
From: Michael S. Tsirkin @ 2023-05-16 5:56 UTC
To: Jason Wang
Cc: kuba, davem, edumazet, pabeni, virtualization, linux-kernel,
    maxime.coquelin, alvaro.karsz, eperezma, xuanzhuo, david.marchand,
    netdev

On Tue, May 16, 2023 at 12:17:50PM +0800, Jason Wang wrote:
> On Tue, May 16, 2023 at 12:13 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, May 16, 2023 at 10:44:45AM +0800, Jason Wang wrote:
> > > On Mon, May 15, 2023 at 6:17 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, May 15, 2023 at 01:13:33PM +0800, Jason Wang wrote:
> > > > > On Mon, May 15, 2023 at 12:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, May 15, 2023 at 09:05:54AM +0800, Jason Wang wrote:
> > > > > > > On Wed, May 10, 2023 at 1:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Apr 17, 2023 at 11:40:58AM +0800, Jason Wang wrote:
> > > > > > > > > On Fri, Apr 14, 2023 at 3:21 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Fri, Apr 14, 2023 at 01:04:15PM +0800, Jason Wang wrote:
> > > > > > > > > > > Forget to cc netdev, adding.
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Apr 14, 2023 at 12:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Apr 13, 2023 at 02:40:26PM +0800, Jason Wang wrote:
> > > > > > > > > > > > > This patch convert rx mode setting to be done in a workqueue, this is
> > > > > > > > > > > > > a must for allow to sleep when waiting for the cvq command to
> > > > > > > > > > > > > response since current code is executed under addr spin lock.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > > >
> > > > > > > > > > > > I don't like this frankly. This means that setting RX mode which would
> > > > > > > > > > > > previously be reliable, now becomes unreliable.
> > > > > > > > > > >
> > > > > > > > > > > It is "unreliable" by design:
> > > > > > > > > > >
> > > > > > > > > > > void (*ndo_set_rx_mode)(struct net_device *dev);
> > > > > > > > > > >
> > > > > > > > > > > > - first of all configuration is no longer immediate
> > > > > > > > > > >
> > > > > > > > > > > Is immediate a hard requirement? I can see a workqueue is used at least:
> > > > > > > > > > >
> > > > > > > > > > > mlx5e, ipoib, efx, ...
> > > > > > > > > > >
> > > > > > > > > > > > and there is no way for driver to find out when
> > > > > > > > > > > > it actually took effect
> > > > > > > > > > >
> > > > > > > > > > > But we know rx mode is best effort e.g it doesn't support vhost and we
> > > > > > > > > > > survive from this for years.
> > > > > > > > > > >
> > > > > > > > > > > > - second, if device fails command, this is also not
> > > > > > > > > > > > propagated to driver, again no way for driver to find out
> > > > > > > > > > > >
> > > > > > > > > > > > VDUSE needs to be fixed to do tricks to fix this
> > > > > > > > > > > > without breaking normal drivers.
> > > > > > > > > > >
> > > > > > > > > > > It's not specific to VDUSE. For example, when using virtio-net in the
> > > > > > > > > > > UP environment with any software cvq (like mlx5 via vDPA or cma
> > > > > > > > > > > transport).
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > Hmm. Can we differentiate between these use-cases?
> > > > > > > > >
> > > > > > > > > It doesn't look easy since we are drivers for virtio bus. Underlayer
> > > > > > > > > details were hidden from virtio-net.
> > > > > > > > >
> > > > > > > > > Or do you have any ideas on this?
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > >
> > > > > > > > I don't know, pass some kind of flag in struct virtqueue?
> > > > > > > > "bool slow; /* This vq can be very slow sometimes. Don't wait for it! */"
> > > > > > > >
> > > > > > > > ?
> > > > > > > >
> > > > > > >
> > > > > > > So if it's slow, sleep, otherwise poll?
> > > > > > >
> > > > > > > I feel setting this flag might be tricky, since the driver doesn't
> > > > > > > know whether or not it's really slow. E.g smartNIC vendor may allow
> > > > > > > virtio-net emulation over PCI.
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > > driver will have the choice, depending on whether
> > > > > > vq is deterministic or not.
> > > > >
> > > > > Ok, but the problem is, such booleans are only useful for virtio ring
> > > > > codes. But in this case, virtio-net knows what to do for cvq. So I'm
> > > > > not sure who the user is.
> > > > >
> > > > > Thanks
> > > >
> > > > Circling back, what exactly does the architecture you are trying
> > > > to fix look like? Who is going to introduce unbounded latency?
> > > > The hypervisor?
> > >
> > > Hypervisor is one of the possible reason, we have many more:
> > >
> > > Hardware device that provides virtio-pci emulation.
> > > Userspace devices like VDUSE.
> >
> > So let's start by addressing VDUSE maybe?
>
> It's reported by at least one hardware vendor as well. I remember it
> was Alvaro who reported this first in the past.
>
> >
> > > > If so do we not maybe want a new feature bit
> > > > that documents this? Hypervisor then can detect old guests
> > > > that spin and decide what to do, e.g. prioritise cvq more,
> > > > or fail FEATURES_OK.
> > >
> > > We suffer from this for bare metal as well.
> > >
> > > But a question is what's wrong with the approach that is used in this
> > > patch? I've answered that set_rx_mode is not reliable, so it should be
> > > fine to use workqueue. Except for this, any other thing that worries
> > > you?
> > >
> > > Thanks
> >
> > It's not reliable for other drivers but has been reliable for virtio.
> > I worry some software relied on this.
>
> It's probably fine since some device like vhost doesn't support this
> at all and we manage to survive for several years.

vhost is often connected to a clever learning backend, such as a
bridge, which will DTRT without the guest configuring anything at
all, though. This could be why it works.

> > You are making good points though ... could we get some
> > maintainer's feedback on this?
>
> That would be helpful. Jakub, any input on this?
>
> Thanks
>
> >
> > > >
> > > > > >
> > > > > >
> > > > > > > > --
> > > > > > > > MST
> > > > > > > >
> > > > > >
> > > >
> >
* Re: [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue
2023-05-16 4:17 ` Jason Wang
2023-05-16 5:56 ` Michael S. Tsirkin
@ 2023-05-16 19:36 ` Jakub Kicinski
1 sibling, 0 replies; 15+ messages in thread
From: Jakub Kicinski @ 2023-05-16 19:36 UTC
To: Jason Wang
Cc: Michael S. Tsirkin, davem, edumazet, pabeni, virtualization,
    linux-kernel, maxime.coquelin, alvaro.karsz, eperezma, xuanzhuo,
    david.marchand, netdev

On Tue, 16 May 2023 12:17:50 +0800 Jason Wang wrote:
> > It's not reliable for other drivers but has been reliable for virtio.
> > I worry some software relied on this.
>
> It's probably fine since some device like vhost doesn't support this
> at all and we manage to survive for several years.
>
> > You are making good points though ... could we get some
> > maintainer's feedback on this?
>
> That would be helpful. Jakub, any input on this?

AFAIU the question is whether .ndo_set_rx_mode needs to be reliable
and instantaneous? I haven't heard any complaints about it not being
immediate, and most 10G+ NICs do the config via a workqueue.
I even have an "intern task" to implement a workqueue in the core
for this, to save the boilerplate code in the drivers.
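
For readers following the thread, here is a minimal userspace sketch of the deferral pattern being debated. It is not code from the patch or from the kernel: `fake_netdev`, `set_rx_mode`, `rx_mode_work` and `send_rx_mode_command` are hypothetical names chosen for illustration, and the "workqueue" is reduced to a pending flag that a worker checks later. The sketch assumes the fast path may only record the request (it stands in for a callback running under a spinlock), while the worker runs in a context that is allowed to sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a network device; not a kernel structure. */
struct fake_netdev {
    unsigned int flags;     /* rx mode most recently requested by the stack */
    unsigned int applied;   /* rx mode last sent to the "device" */
    bool work_pending;      /* true when the worker has something to do */
    int commands_sent;      /* number of slow control commands issued */
};

/*
 * Slow path: stands in for a control-queue command that may take an
 * unbounded time to complete, so it must not run with a spinlock held.
 */
static void send_rx_mode_command(struct fake_netdev *dev)
{
    dev->applied = dev->flags;
    dev->commands_sent++;
}

/*
 * Fast path: stands in for the rx-mode callback. It only records the
 * request and marks work as pending; it never waits for the device.
 */
static void set_rx_mode(struct fake_netdev *dev, unsigned int flags)
{
    assert(dev);
    dev->flags = flags;
    dev->work_pending = true;
}

/*
 * Worker: stands in for the deferred work item. Returns true if it
 * actually sent a command, false if there was nothing to do.
 */
static bool rx_mode_work(struct fake_netdev *dev)
{
    if (!dev->work_pending)
        return false;
    dev->work_pending = false;
    send_rx_mode_command(dev);
    return true;
}
```

Two properties of this sketch map onto the arguments above: the requested mode is not applied until the worker runs (the loss of immediacy raised earlier in the thread), and several requests made before the worker runs collapse into a single command carrying the latest flags.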

Thread overview: 15+ messages
[not found] <20230413064027.13267-1-jasowang@redhat.com>
[not found] ` <20230413064027.13267-2-jasowang@redhat.com>
[not found] ` <20230413121525-mutt-send-email-mst@kernel.org>
2023-04-14 5:04 ` [PATCH net-next V2 1/2] virtio-net: convert rx mode setting to use workqueue Jason Wang
2023-04-14 7:21 ` Michael S. Tsirkin
2023-04-17 3:40 ` Jason Wang
2023-05-05 3:46 ` Jason Wang
2023-05-10 5:32 ` Michael S. Tsirkin
2023-05-15 1:05 ` Jason Wang
2023-05-15 4:45 ` Michael S. Tsirkin
2023-05-15 5:13 ` Jason Wang
2023-05-15 10:17 ` Michael S. Tsirkin
2023-05-16 2:44 ` Jason Wang
2023-05-16 4:12 ` Michael S. Tsirkin
2023-05-16 4:17 ` Jason Wang
2023-05-16 5:56 ` Michael S. Tsirkin
2023-05-16 19:36 ` Jakub Kicinski
[not found] ` <20230413064027.13267-3-jasowang@redhat.com>
[not found] ` <1681370820.0675354-2-xuanzhuo@linux.alibaba.com>
[not found] ` <CACGkMEuJuZKGMhVwFmD0ZMa7V7TdGu6qaXF24Gg67TzMbs8ANA@mail.gmail.com>
2023-04-14 5:10 ` [PATCH net-next V2 2/2] virtio-net: sleep instead of busy waiting for cvq command Jason Wang