virtualization.lists.linux-foundation.org archive mirror
* Re: [PATCH net-next 0/5] virtio-net: support dynamic coalescing moderation
       [not found] <cover.1697093455.git.hengqi@linux.alibaba.com>
@ 2023-10-12  8:29 ` Jason Wang
  2023-10-13  1:53   ` Jason Wang
       [not found]   ` <d215566f-8185-463b-aa0b-5925f2a0853c@linux.alibaba.com>
       [not found] ` <dc171e2288d2755b1805afde6b394d2d443a134d.1697093455.git.hengqi@linux.alibaba.com>
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 15+ messages in thread
From: Jason Wang @ 2023-10-12  8:29 UTC (permalink / raw)
  To: Heng Qi
  Cc: Xuan Zhuo, Liu, Yujie, Jesper Dangaard Brouer, Michael S. Tsirkin,
	netdev, John Fastabend, Alexei Starovoitov, virtualization,
	Eric Dumazet, Simon Horman, Jakub Kicinski, Paolo Abeni,
	David S. Miller

On Thu, Oct 12, 2023 at 3:44 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
> virtio-net now supports per-queue moderation parameter settings. Based
> on this, we use the Linux netdim library to support dynamic coalescing
> moderation for virtio-net.
>
> Due to hardware scheduling issues, we only tested rx dim.

Do you have PPS numbers? And TX numbers are also important as the
throughput could be misleading due to various reasons.
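
To illustrate the point about throughput being misleading (a hypothetical sketch, not from this thread): at the same byte throughput, larger messages mean orders of magnitude fewer packets per second, so per-packet and interrupt-moderation effects can be hidden by a throughput-only comparison.

```python
# Hypothetical sketch: identical throughput implies very different PPS
# depending on message size. All numbers below are illustrative.

def pps(bytes_per_sec: float, msg_size_bytes: int) -> float:
    """Packets (messages) per second for a given throughput and message size."""
    return bytes_per_sec / msg_size_bytes

# 1 GB/s of 64B messages is ~15.6M msgs/s; of 4096B messages, ~244K msgs/s.
small = pps(1e9, 64)
large = pps(1e9, 4096)
print(f"64B: {small:,.0f} msgs/s  4096B: {large:,.0f} msgs/s")
```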

Thanks

>
> @Test env
> rxq0 has affinity to cpu0.
>
> @Test cmd
> client: taskset -c 0 sockperf tp -i ${IP} -t 30 --tcp -m ${msg_size}
> server: taskset -c 0 sockperf sr --tcp
>
> @Test res
> The second column shows the ratio of the client-reported result with
> rx dim enabled to the result with rx dim disabled.
>         --------------------------------------
>         | msg_size |  rx_dim=on / rx_dim=off |
>         --------------------------------------
>         |   14B    |         + 3%            |
>         --------------------------------------
>         |   100B   |         + 16%           |
>         --------------------------------------
>         |   500B   |         + 25%           |
>         --------------------------------------
>         |   1400B  |         + 28%           |
>         --------------------------------------
>         |   2048B  |         + 22%           |
>         --------------------------------------
>         |   4096B  |         + 5%            |
>         --------------------------------------
>
> ---
> This patch set was part of the previous netdim patch set[1].
> [1] was split into a merged bugfix set[2] and the current set.
> The relevant commenters from previous versions have been Cc'ed.
>
> [1] https://lore.kernel.org/all/20230811065512.22190-1-hengqi@linux.alibaba.com/
> [2] https://lore.kernel.org/all/cover.1696745452.git.hengqi@linux.alibaba.com/
>
> Heng Qi (5):
>   virtio-net: returns whether napi is complete
>   virtio-net: separate rx/tx coalescing moderation cmds
>   virtio-net: extract virtqueue coalescig cmd for reuse
>   virtio-net: support rx netdim
>   virtio-net: support tx netdim
>
>  drivers/net/virtio_net.c | 394 ++++++++++++++++++++++++++++++++-------
>  1 file changed, 322 insertions(+), 72 deletions(-)
>
> --
> 2.19.1.6.gb485710b
>
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

* Re: [PATCH net-next 0/5] virtio-net: support dynamic coalescing moderation
  2023-10-12  8:29 ` [PATCH net-next 0/5] virtio-net: support dynamic coalescing moderation Jason Wang
@ 2023-10-13  1:53   ` Jason Wang
       [not found]   ` <d215566f-8185-463b-aa0b-5925f2a0853c@linux.alibaba.com>
  1 sibling, 0 replies; 15+ messages in thread
From: Jason Wang @ 2023-10-13  1:53 UTC (permalink / raw)
  To: Heng Qi
  Cc: Xuan Zhuo, Liu, Yujie, Jesper Dangaard Brouer, Michael S. Tsirkin,
	netdev, John Fastabend, Alexei Starovoitov, virtualization,
	Eric Dumazet, Simon Horman, Jakub Kicinski, Paolo Abeni,
	David S. Miller

On Thu, Oct 12, 2023 at 4:29 PM Jason Wang <jasowang@redhat.com> wrote:
>
> On Thu, Oct 12, 2023 at 3:44 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >
> > virtio-net now supports per-queue moderation parameter settings. Based
> > on this, we use the Linux netdim library to support dynamic coalescing
> > moderation for virtio-net.
> >
> > Due to hardware scheduling issues, we only tested rx dim.
>
> Do you have PPS numbers? And TX numbers are also important as the
> throughput could be misleading due to various reasons.

Another consideration:

We currently have two TX modes: NAPI (the default) and orphan. A simple
pktgen test shows NAPI achieves only about 30% of orphan's performance.
So we need to make sure:

1) dim helps NAPI, and if NAPI can compete with orphan, we can drop the
orphan code completely, which would be a great simplification of the
code. It also means we could easily add BQL and other good stuff on top.
2) dim doesn't cause a regression for orphan
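
The two conditions above amount to a simple acceptance check. A minimal sketch (all PPS figures below are hypothetical placeholders, not measurements from this thread):

```python
# Illustrative acceptance check for the two conditions above.
# PPS numbers are hypothetical placeholders, not measurements.

def dim_acceptable(napi_pps, napi_dim_pps, orphan_pps, orphan_dim_pps,
                   regression_tolerance=0.02):
    # 1) dim should improve NAPI throughput
    napi_gains = napi_dim_pps > napi_pps
    # 2) dim must not regress orphan (within a small tolerance)
    orphan_ok = orphan_dim_pps >= orphan_pps * (1 - regression_tolerance)
    return napi_gains and orphan_ok

# Example: NAPI at 30% of orphan without dim, improved with dim,
# orphan essentially unchanged.
print(dim_acceptable(napi_pps=3.0e6, napi_dim_pps=4.5e6,
                     orphan_pps=10.0e6, orphan_dim_pps=9.9e6))  # True
```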

Thanks

>
> Thanks
>
> >
> > @Test env
> > rxq0 has affinity to cpu0.
> >
> > @Test cmd
> > client: taskset -c 0 sockperf tp -i ${IP} -t 30 --tcp -m ${msg_size}
> > server: taskset -c 0 sockperf sr --tcp
> >
> > @Test res
> > The second column shows the ratio of the client-reported result with
> > rx dim enabled to the result with rx dim disabled.
> >         --------------------------------------
> >         | msg_size |  rx_dim=on / rx_dim=off |
> >         --------------------------------------
> >         |   14B    |         + 3%            |
> >         --------------------------------------
> >         |   100B   |         + 16%           |
> >         --------------------------------------
> >         |   500B   |         + 25%           |
> >         --------------------------------------
> >         |   1400B  |         + 28%           |
> >         --------------------------------------
> >         |   2048B  |         + 22%           |
> >         --------------------------------------
> >         |   4096B  |         + 5%            |
> >         --------------------------------------
> >
> > ---
> > This patch set was part of the previous netdim patch set[1].
> > [1] was split into a merged bugfix set[2] and the current set.
> > The relevant commenters from previous versions have been Cc'ed.
> >
> > [1] https://lore.kernel.org/all/20230811065512.22190-1-hengqi@linux.alibaba.com/
> > [2] https://lore.kernel.org/all/cover.1696745452.git.hengqi@linux.alibaba.com/
> >
> > Heng Qi (5):
> >   virtio-net: returns whether napi is complete
> >   virtio-net: separate rx/tx coalescing moderation cmds
> >   virtio-net: extract virtqueue coalescig cmd for reuse
> >   virtio-net: support rx netdim
> >   virtio-net: support tx netdim
> >
> >  drivers/net/virtio_net.c | 394 ++++++++++++++++++++++++++++++++-------
> >  1 file changed, 322 insertions(+), 72 deletions(-)
> >
> > --
> > 2.19.1.6.gb485710b
> >
> >

* Re: [PATCH net-next 2/5] virtio-net: separate rx/tx coalescing moderation cmds
       [not found]     ` <06d90cc8-ccc0-4b2f-ad42-2db4a6fb229f@linux.alibaba.com>
@ 2023-10-16  7:51       ` Michael S. Tsirkin
  0 siblings, 0 replies; 15+ messages in thread
From: Michael S. Tsirkin @ 2023-10-16  7:51 UTC (permalink / raw)
  To: Heng Qi
  Cc: Xuan Zhuo, Liu, Yujie, Jesper Dangaard Brouer, netdev,
	John Fastabend, Alexei Starovoitov, virtualization, Eric Dumazet,
	Simon Horman, Jakub Kicinski, Paolo Abeni, David S. Miller

On Mon, Oct 16, 2023 at 03:45:38PM +0800, Heng Qi wrote:
> 
> 
> On 2023/10/14 9:11 AM, Jakub Kicinski wrote:
> > On Thu, 12 Oct 2023 15:44:06 +0800 Heng Qi wrote:
> > > +
> > > +static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi,
> > > +					  struct ethtool_coalesce *ec)
> > > +{
> > > +	struct scatterlist sgs_rx;
> > > +
> > ../drivers/net/virtio_net.c: In function ‘virtnet_send_rx_notf_coal_cmds’:
> > ../drivers/net/virtio_net.c:3306:14: error: ‘i’ undeclared (first use in this function); did you mean ‘vi’?
> >   3306 |         for (i = 0; i < vi->max_queue_pairs; i++) {
> >        |              ^
> >        |              vi
> 
> Will fix in the next version.
> 
> Thanks!

OK, however pls do test individual patches as well as the whole
patchset.

-- 
MST

* Re: [PATCH net-next 0/5] virtio-net: support dynamic coalescing moderation
       [not found]   ` <d215566f-8185-463b-aa0b-5925f2a0853c@linux.alibaba.com>
@ 2023-10-25  1:18     ` Jason Wang
  2023-10-25  5:53       ` Michael S. Tsirkin
       [not found]       ` <753ac6da-f7f1-4acb-9184-e59271809c6d@linux.alibaba.com>
  0 siblings, 2 replies; 15+ messages in thread
From: Jason Wang @ 2023-10-25  1:18 UTC (permalink / raw)
  To: Heng Qi
  Cc: Xuan Zhuo, Liu, Yujie, Jesper Dangaard Brouer, Michael S. Tsirkin,
	netdev, John Fastabend, Alexei Starovoitov, virtualization,
	Eric Dumazet, Simon Horman, Jakub Kicinski, Paolo Abeni,
	David S. Miller

On Tue, Oct 24, 2023 at 8:03 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
>
>
> On 2023/10/12 4:29 PM, Jason Wang wrote:
> > On Thu, Oct 12, 2023 at 3:44 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >> virtio-net now supports per-queue moderation parameter settings. Based
> >> on this, we use the Linux netdim library to support dynamic coalescing
> >> moderation for virtio-net.
> >>
> >> Due to hardware scheduling issues, we only tested rx dim.
> > Do you have PPS numbers? And TX numbers are also important as the
> > throughput could be misleading due to various reasons.
>
> Hi Jason!
>
> The comparison of rx netdim performance is as follows:
> (the backend supporting tx dim is not yet ready)

Thanks a lot for the numbers.

I'd still expect TX results, as I experimented with tx interrupt
coalescing about 10 years ago.

I will start to review the series but let's try to have some TX numbers as well.

Btw, it would be more convenient to have a raw PPS benchmark, e.g. you
can try to use a software or hardware packet generator.

Thanks
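
As a side note on reading the sockperf UDP figures quoted below: raw PPS drops slightly with dim on, but the CPU usage drops much more, so per-CPU efficiency improves noticeably. A minimal sketch computing this from the reported numbers:

```python
# Efficiency sketch computed from the sockperf UDP figures quoted below:
# dim off: 1143277 rxpps at 100% CPU; dim on: 1124161 rxpps at 83.5% CPU.

off_pps, off_cpu = 1143277.0, 100.0
on_pps, on_cpu = 1124161.0, 83.5

off_eff = off_pps / off_cpu   # pps per CPU-percent, dim off
on_eff = on_pps / on_cpu      # pps per CPU-percent, dim on

print(f"pps delta: {(on_pps / off_pps - 1) * 100:+.1f}%")
print(f"efficiency delta: {(on_eff / off_eff - 1) * 100:+.1f}%")
```

So dim trades roughly 1.7% of raw PPS for about 17.8% better PPS per CPU-percent in this test.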

>
>
> I. Sockperf UDP
> =================================================
> 1. Env
> rxq_0 has affinity to cpu_0
>
> 2. Cmd
> client:  taskset -c 0 sockperf tp -p 8989 -i $IP -t 10 -m 16B
> server: taskset -c 0 sockperf sr -p 8989
>
> 3. Result
> dim off: 1143277.00 rxpps, throughput 17.844 MBps, CPU usage 100%.
> dim on: 1124161.00 rxpps, throughput 17.610 MBps, CPU usage 83.5%.
> =================================================
>
>
> II. Redis
> =================================================
> 1. Env
> There are 8 rxqs and rxq_i has affinity to cpu_i.
>
> 2. Result
> When all CPUs are at 100%, the ops/sec of the memtier_benchmark client is
> dim off:   978437.23
> dim on: 1143638.28
> =================================================
>
>
> III. Nginx
> =================================================
> 1. Env
> There are 8 rxqs and rxq_i has affinity to cpu_i.
>
> 2. Result
> When all CPUs are at 100%, the requests/sec of the wrk client is
> dim off:   877931.67
> dim on: 1019160.31
> =================================================
>
> Thanks!
>
> >
> > Thanks
> >
> >> @Test env
> >> rxq0 has affinity to cpu0.
> >>
> >> @Test cmd
> >> client: taskset -c 0 sockperf tp -i ${IP} -t 30 --tcp -m ${msg_size}
> >> server: taskset -c 0 sockperf sr --tcp
> >>
> >> @Test res
> >> The second column shows the ratio of the client-reported result with
> >> rx dim enabled to the result with rx dim disabled.
> >>          --------------------------------------
> >>          | msg_size |  rx_dim=on / rx_dim=off |
> >>          --------------------------------------
> >>          |   14B    |         + 3%            |
> >>          --------------------------------------
> >>          |   100B   |         + 16%           |
> >>          --------------------------------------
> >>          |   500B   |         + 25%           |
> >>          --------------------------------------
> >>          |   1400B  |         + 28%           |
> >>          --------------------------------------
> >>          |   2048B  |         + 22%           |
> >>          --------------------------------------
> >>          |   4096B  |         + 5%            |
> >>          --------------------------------------
> >>
> >> ---
> >> This patch set was part of the previous netdim patch set[1].
> >> [1] was split into a merged bugfix set[2] and the current set.
> >> The relevant commenters from previous versions have been Cc'ed.
> >>
> >> [1] https://lore.kernel.org/all/20230811065512.22190-1-hengqi@linux.alibaba.com/
> >> [2] https://lore.kernel.org/all/cover.1696745452.git.hengqi@linux.alibaba.com/
> >>
> >> Heng Qi (5):
> >>    virtio-net: returns whether napi is complete
> >>    virtio-net: separate rx/tx coalescing moderation cmds
> >>    virtio-net: extract virtqueue coalescig cmd for reuse
> >>    virtio-net: support rx netdim
> >>    virtio-net: support tx netdim
> >>
> >>   drivers/net/virtio_net.c | 394 ++++++++++++++++++++++++++++++++-------
> >>   1 file changed, 322 insertions(+), 72 deletions(-)
> >>
> >> --
> >> 2.19.1.6.gb485710b
> >>
> >>
>
>

* Re: [PATCH net-next 1/5] virtio-net: returns whether napi is complete
       [not found] ` <ca2cef5c582bea958e300b39eb508d08675d1106.1697093455.git.hengqi@linux.alibaba.com>
@ 2023-10-25  2:43   ` Jason Wang
  0 siblings, 0 replies; 15+ messages in thread
From: Jason Wang @ 2023-10-25  2:43 UTC (permalink / raw)
  To: Heng Qi
  Cc: Xuan Zhuo, Liu, Yujie, Jesper Dangaard Brouer, Michael S. Tsirkin,
	netdev, John Fastabend, Alexei Starovoitov, virtualization,
	Eric Dumazet, Simon Horman, Jakub Kicinski, Paolo Abeni,
	David S. Miller

On Thu, Oct 12, 2023 at 3:44 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
> rx netdim needs to count the traffic during a complete napi process,
> and start updating and comparing samples to make decisions after the
> napi ends. Let virtqueue_napi_complete() return true if napi is
> complete and false otherwise.
>
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>

Acked-by: Jason Wang <jasowang@redhat.com>

Thanks

> ---
>  drivers/net/virtio_net.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index a52fd743504a..cf5d2ef4bd24 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -431,7 +431,7 @@ static void virtqueue_napi_schedule(struct napi_struct *napi,
>         }
>  }
>
> -static void virtqueue_napi_complete(struct napi_struct *napi,
> +static bool virtqueue_napi_complete(struct napi_struct *napi,
>                                     struct virtqueue *vq, int processed)
>  {
>         int opaque;
> @@ -440,9 +440,13 @@ static void virtqueue_napi_complete(struct napi_struct *napi,
>         if (napi_complete_done(napi, processed)) {
>                 if (unlikely(virtqueue_poll(vq, opaque)))
>                         virtqueue_napi_schedule(napi, vq);
> +               else
> +                       return true;
>         } else {
>                 virtqueue_disable_cb(vq);
>         }
> +
> +       return false;
>  }
>
>  static void skb_xmit_done(struct virtqueue *vq)
> --
> 2.19.1.6.gb485710b
>

* Re: [PATCH net-next 3/5] virtio-net: extract virtqueue coalescig cmd for reuse
       [not found] ` <5b99db97dd4b26636f306f70c8158d9d3297b4a0.1697093455.git.hengqi@linux.alibaba.com>
@ 2023-10-25  3:03   ` Jason Wang
  0 siblings, 0 replies; 15+ messages in thread
From: Jason Wang @ 2023-10-25  3:03 UTC (permalink / raw)
  To: Heng Qi
  Cc: Xuan Zhuo, Liu, Yujie, Jesper Dangaard Brouer, Michael S. Tsirkin,
	netdev, John Fastabend, Alexei Starovoitov, virtualization,
	Eric Dumazet, Simon Horman, Jakub Kicinski, Paolo Abeni,
	David S. Miller

On Thu, Oct 12, 2023 at 3:44 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
> Extract commands to set virtqueue coalescing parameters for reuse
> by ethtool -Q, vq resize and netdim.
>
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>

Acked-by: Jason Wang <jasowang@redhat.com>

Thanks

> ---
>  drivers/net/virtio_net.c | 106 +++++++++++++++++++++++----------------
>  1 file changed, 64 insertions(+), 42 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 54b3fb8d0384..caef78bb3963 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -2846,6 +2846,58 @@ static void virtnet_cpu_notif_remove(struct virtnet_info *vi)
>                                             &vi->node_dead);
>  }
>
> +static int virtnet_send_ctrl_coal_vq_cmd(struct virtnet_info *vi,
> +                                        u16 vqn, u32 max_usecs, u32 max_packets)
> +{
> +       struct scatterlist sgs;
> +
> +       vi->ctrl->coal_vq.vqn = cpu_to_le16(vqn);
> +       vi->ctrl->coal_vq.coal.max_usecs = cpu_to_le32(max_usecs);
> +       vi->ctrl->coal_vq.coal.max_packets = cpu_to_le32(max_packets);
> +       sg_init_one(&sgs, &vi->ctrl->coal_vq, sizeof(vi->ctrl->coal_vq));
> +
> +       if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL,
> +                                 VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
> +                                 &sgs))
> +               return -EINVAL;
> +
> +       return 0;
> +}
> +
> +static int virtnet_send_rx_ctrl_coal_vq_cmd(struct virtnet_info *vi,
> +                                           u16 queue, u32 max_usecs,
> +                                           u32 max_packets)
> +{
> +       int err;
> +
> +       err = virtnet_send_ctrl_coal_vq_cmd(vi, rxq2vq(queue),
> +                                           max_usecs, max_packets);
> +       if (err)
> +               return err;
> +
> +       vi->rq[queue].intr_coal.max_usecs = max_usecs;
> +       vi->rq[queue].intr_coal.max_packets = max_packets;
> +
> +       return 0;
> +}
> +
> +static int virtnet_send_tx_ctrl_coal_vq_cmd(struct virtnet_info *vi,
> +                                           u16 queue, u32 max_usecs,
> +                                           u32 max_packets)
> +{
> +       int err;
> +
> +       err = virtnet_send_ctrl_coal_vq_cmd(vi, txq2vq(queue),
> +                                           max_usecs, max_packets);
> +       if (err)
> +               return err;
> +
> +       vi->sq[queue].intr_coal.max_usecs = max_usecs;
> +       vi->sq[queue].intr_coal.max_packets = max_packets;
> +
> +       return 0;
> +}
> +
>  static void virtnet_get_ringparam(struct net_device *dev,
>                                   struct ethtool_ringparam *ring,
>                                   struct kernel_ethtool_ringparam *kernel_ring,
> @@ -2903,14 +2955,11 @@ static int virtnet_set_ringparam(struct net_device *dev,
>                          * through the VIRTIO_NET_CTRL_NOTF_COAL_TX_SET command, or, if the driver
>                          * did not set any TX coalescing parameters, to 0.
>                          */
> -                       err = virtnet_send_ctrl_coal_vq_cmd(vi, txq2vq(i),
> -                                                           vi->intr_coal_tx.max_usecs,
> -                                                           vi->intr_coal_tx.max_packets);
> +                       err = virtnet_send_tx_ctrl_coal_vq_cmd(vi, i,
> +                                                              vi->intr_coal_tx.max_usecs,
> +                                                              vi->intr_coal_tx.max_packets);
>                         if (err)
>                                 return err;
> -
> -                       vi->sq[i].intr_coal.max_usecs = vi->intr_coal_tx.max_usecs;
> -                       vi->sq[i].intr_coal.max_packets = vi->intr_coal_tx.max_packets;
>                 }
>
>                 if (ring->rx_pending != rx_pending) {
> @@ -2919,14 +2968,11 @@ static int virtnet_set_ringparam(struct net_device *dev,
>                                 return err;
>
>                         /* The reason is same as the transmit virtqueue reset */
> -                       err = virtnet_send_ctrl_coal_vq_cmd(vi, rxq2vq(i),
> -                                                           vi->intr_coal_rx.max_usecs,
> -                                                           vi->intr_coal_rx.max_packets);
> +                       err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, i,
> +                                                              vi->intr_coal_rx.max_usecs,
> +                                                              vi->intr_coal_rx.max_packets);
>                         if (err)
>                                 return err;
> -
> -                       vi->rq[i].intr_coal.max_usecs = vi->intr_coal_rx.max_usecs;
> -                       vi->rq[i].intr_coal.max_packets = vi->intr_coal_rx.max_packets;
>                 }
>         }
>
> @@ -3327,48 +3373,24 @@ static int virtnet_send_notf_coal_cmds(struct virtnet_info *vi,
>         return 0;
>  }
>
> -static int virtnet_send_ctrl_coal_vq_cmd(struct virtnet_info *vi,
> -                                        u16 vqn, u32 max_usecs, u32 max_packets)
> -{
> -       struct scatterlist sgs;
> -
> -       vi->ctrl->coal_vq.vqn = cpu_to_le16(vqn);
> -       vi->ctrl->coal_vq.coal.max_usecs = cpu_to_le32(max_usecs);
> -       vi->ctrl->coal_vq.coal.max_packets = cpu_to_le32(max_packets);
> -       sg_init_one(&sgs, &vi->ctrl->coal_vq, sizeof(vi->ctrl->coal_vq));
> -
> -       if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL,
> -                                 VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
> -                                 &sgs))
> -               return -EINVAL;
> -
> -       return 0;
> -}
> -
>  static int virtnet_send_notf_coal_vq_cmds(struct virtnet_info *vi,
>                                           struct ethtool_coalesce *ec,
>                                           u16 queue)
>  {
>         int err;
>
> -       err = virtnet_send_ctrl_coal_vq_cmd(vi, rxq2vq(queue),
> -                                           ec->rx_coalesce_usecs,
> -                                           ec->rx_max_coalesced_frames);
> +       err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, queue,
> +                                              ec->rx_coalesce_usecs,
> +                                              ec->rx_max_coalesced_frames);
>         if (err)
>                 return err;
>
> -       vi->rq[queue].intr_coal.max_usecs = ec->rx_coalesce_usecs;
> -       vi->rq[queue].intr_coal.max_packets = ec->rx_max_coalesced_frames;
> -
> -       err = virtnet_send_ctrl_coal_vq_cmd(vi, txq2vq(queue),
> -                                           ec->tx_coalesce_usecs,
> -                                           ec->tx_max_coalesced_frames);
> +       err = virtnet_send_tx_ctrl_coal_vq_cmd(vi, queue,
> +                                              ec->tx_coalesce_usecs,
> +                                              ec->tx_max_coalesced_frames);
>         if (err)
>                 return err;
>
> -       vi->sq[queue].intr_coal.max_usecs = ec->tx_coalesce_usecs;
> -       vi->sq[queue].intr_coal.max_packets = ec->tx_max_coalesced_frames;
> -
>         return 0;
>  }
>
> --
> 2.19.1.6.gb485710b
>

* Re: [PATCH net-next 4/5] virtio-net: support rx netdim
       [not found] ` <b4656b1a14fea432bf8493a7e2f1976c08f2e63c.1697093455.git.hengqi@linux.alibaba.com>
@ 2023-10-25  3:34   ` Jason Wang
       [not found]     ` <3bc31a3b-a022-4816-a854-7f6b41d2e351@linux.alibaba.com>
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Wang @ 2023-10-25  3:34 UTC (permalink / raw)
  To: Heng Qi
  Cc: Xuan Zhuo, Liu, Yujie, Jesper Dangaard Brouer, Michael S. Tsirkin,
	netdev, John Fastabend, Alexei Starovoitov, virtualization,
	Eric Dumazet, Simon Horman, Jakub Kicinski, Paolo Abeni,
	David S. Miller

On Thu, Oct 12, 2023 at 3:44 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
> By comparing traffic information across complete napi processes, the
> virtio-net driver automatically adjusts the coalescing moderation
> parameters of each receive queue.
>
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> ---
>  drivers/net/virtio_net.c | 147 +++++++++++++++++++++++++++++++++------
>  1 file changed, 126 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index caef78bb3963..6ad2890a7909 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -19,6 +19,7 @@
>  #include <linux/average.h>
>  #include <linux/filter.h>
>  #include <linux/kernel.h>
> +#include <linux/dim.h>
>  #include <net/route.h>
>  #include <net/xdp.h>
>  #include <net/net_failover.h>
> @@ -172,6 +173,17 @@ struct receive_queue {
>
>         struct virtnet_rq_stats stats;
>
> +       /* The number of rx notifications */
> +       u16 calls;
> +
> +       /* Is dynamic interrupt moderation enabled? */
> +       bool dim_enabled;
> +
> +       /* Dynamic Interrupt Moderation */
> +       struct dim dim;
> +
> +       u32 packets_in_napi;
> +
>         struct virtnet_interrupt_coalesce intr_coal;
>
>         /* Chain pages by the private ptr. */
> @@ -305,6 +317,9 @@ struct virtnet_info {
>         u8 duplex;
>         u32 speed;
>
> +       /* Is rx dynamic interrupt moderation enabled? */
> +       bool rx_dim_enabled;
> +
>         /* Interrupt coalescing settings */
>         struct virtnet_interrupt_coalesce intr_coal_tx;
>         struct virtnet_interrupt_coalesce intr_coal_rx;
> @@ -2001,6 +2016,7 @@ static void skb_recv_done(struct virtqueue *rvq)
>         struct virtnet_info *vi = rvq->vdev->priv;
>         struct receive_queue *rq = &vi->rq[vq2rxq(rvq)];
>
> +       rq->calls++;
>         virtqueue_napi_schedule(&rq->napi, rvq);
>  }
>
> @@ -2138,6 +2154,25 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
>         }
>  }
>
> +static void virtnet_rx_dim_work(struct work_struct *work);
> +
> +static void virtnet_rx_dim_update(struct virtnet_info *vi, struct receive_queue *rq)
> +{
> +       struct virtnet_rq_stats *stats = &rq->stats;
> +       struct dim_sample cur_sample = {};
> +
> +       if (!rq->packets_in_napi)
> +               return;
> +
> +       u64_stats_update_begin(&rq->stats.syncp);
> +       dim_update_sample(rq->calls, stats->packets,
> +                         stats->bytes, &cur_sample);
> +       u64_stats_update_end(&rq->stats.syncp);
> +
> +       net_dim(&rq->dim, cur_sample);
> +       rq->packets_in_napi = 0;
> +}
> +
>  static int virtnet_poll(struct napi_struct *napi, int budget)
>  {
>         struct receive_queue *rq =
> @@ -2146,17 +2181,22 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
>         struct send_queue *sq;
>         unsigned int received;
>         unsigned int xdp_xmit = 0;
> +       bool napi_complete;
>
>         virtnet_poll_cleantx(rq);
>
>         received = virtnet_receive(rq, budget, &xdp_xmit);
> +       rq->packets_in_napi += received;
>
>         if (xdp_xmit & VIRTIO_XDP_REDIR)
>                 xdp_do_flush();
>
>         /* Out of packets? */
> -       if (received < budget)
> -               virtqueue_napi_complete(napi, rq->vq, received);
> +       if (received < budget) {
> +               napi_complete = virtqueue_napi_complete(napi, rq->vq, received);
> +               if (napi_complete && rq->dim_enabled)
> +                       virtnet_rx_dim_update(vi, rq);
> +       }
>
>         if (xdp_xmit & VIRTIO_XDP_TX) {
>                 sq = virtnet_xdp_get_sq(vi);
> @@ -2176,6 +2216,7 @@ static void virtnet_disable_queue_pair(struct virtnet_info *vi, int qp_index)
>         virtnet_napi_tx_disable(&vi->sq[qp_index].napi);
>         napi_disable(&vi->rq[qp_index].napi);
>         xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq);
> +       cancel_work_sync(&vi->rq[qp_index].dim.work);
>  }
>
>  static int virtnet_enable_queue_pair(struct virtnet_info *vi, int qp_index)
> @@ -2193,6 +2234,9 @@ static int virtnet_enable_queue_pair(struct virtnet_info *vi, int qp_index)
>         if (err < 0)
>                 goto err_xdp_reg_mem_model;
>
> +       INIT_WORK(&vi->rq[qp_index].dim.work, virtnet_rx_dim_work);
> +       vi->rq[qp_index].dim.mode = DIM_CQ_PERIOD_MODE_START_FROM_EQE;
> +
>         virtnet_napi_enable(vi->rq[qp_index].vq, &vi->rq[qp_index].napi);
>         virtnet_napi_tx_enable(vi, vi->sq[qp_index].vq, &vi->sq[qp_index].napi);
>
> @@ -3335,23 +3379,42 @@ static int virtnet_send_tx_notf_coal_cmds(struct virtnet_info *vi,
>  static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi,
>                                           struct ethtool_coalesce *ec)
>  {
> +       bool rx_ctrl_dim_on = !!ec->use_adaptive_rx_coalesce;
>         struct scatterlist sgs_rx;
> +       int i;
>
> -       vi->ctrl->coal_rx.rx_usecs = cpu_to_le32(ec->rx_coalesce_usecs);
> -       vi->ctrl->coal_rx.rx_max_packets = cpu_to_le32(ec->rx_max_coalesced_frames);
> -       sg_init_one(&sgs_rx, &vi->ctrl->coal_rx, sizeof(vi->ctrl->coal_rx));
> -
> -       if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL,
> -                                 VIRTIO_NET_CTRL_NOTF_COAL_RX_SET,
> -                                 &sgs_rx))
> +       if (rx_ctrl_dim_on && (ec->rx_coalesce_usecs != vi->intr_coal_rx.max_usecs ||
> +                              ec->rx_max_coalesced_frames != vi->intr_coal_rx.max_packets))

Any reason we need to add a check for usecs/packets? I think it might
confuse the user, since the value could actually be modified by netdim.

>                 return -EINVAL;
>
> -       /* Save parameters */
> -       vi->intr_coal_rx.max_usecs = ec->rx_coalesce_usecs;
> -       vi->intr_coal_rx.max_packets = ec->rx_max_coalesced_frames;
> -       for (i = 0; i < vi->max_queue_pairs; i++) {
> -               vi->rq[i].intr_coal.max_usecs = ec->rx_coalesce_usecs;
> -               vi->rq[i].intr_coal.max_packets = ec->rx_max_coalesced_frames;
> +       if (rx_ctrl_dim_on) {
> +               if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL)) {
> +                       vi->rx_dim_enabled = true;
> +                       for (i = 0; i < vi->max_queue_pairs; i++)
> +                               vi->rq[i].dim_enabled = true;
> +               } else {
> +                       return -EOPNOTSUPP;
> +               }
> +       } else {
> +               vi->rx_dim_enabled = false;
> +               for (i = 0; i < vi->max_queue_pairs; i++)
> +                       vi->rq[i].dim_enabled = false;
> +
> +               vi->ctrl->coal_rx.rx_usecs = cpu_to_le32(ec->rx_coalesce_usecs);
> +               vi->ctrl->coal_rx.rx_max_packets = cpu_to_le32(ec->rx_max_coalesced_frames);
> +               sg_init_one(&sgs_rx, &vi->ctrl->coal_rx, sizeof(vi->ctrl->coal_rx));
> +
> +               if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL,
> +                                         VIRTIO_NET_CTRL_NOTF_COAL_RX_SET,
> +                                         &sgs_rx))
> +                       return -EINVAL;
> +
> +               vi->intr_coal_rx.max_usecs = ec->rx_coalesce_usecs;
> +               vi->intr_coal_rx.max_packets = ec->rx_max_coalesced_frames;
> +               for (i = 0; i < vi->max_queue_pairs; i++) {
> +                       vi->rq[i].intr_coal.max_usecs = ec->rx_coalesce_usecs;
> +                       vi->rq[i].intr_coal.max_packets = ec->rx_max_coalesced_frames;
> +               }
>         }
>
>         return 0;
> @@ -3377,13 +3440,27 @@ static int virtnet_send_notf_coal_vq_cmds(struct virtnet_info *vi,
>                                           struct ethtool_coalesce *ec,
>                                           u16 queue)
>  {
> +       bool rx_ctrl_dim_on;
> +       u32 max_usecs, max_packets;
>         int err;
>
> -       err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, queue,
> -                                              ec->rx_coalesce_usecs,
> -                                              ec->rx_max_coalesced_frames);
> -       if (err)
> -               return err;
> +       rx_ctrl_dim_on = !!ec->use_adaptive_rx_coalesce;
> +       max_usecs = vi->rq[queue].intr_coal.max_usecs;
> +       max_packets = vi->rq[queue].intr_coal.max_packets;
> +       if (rx_ctrl_dim_on && (ec->rx_coalesce_usecs != max_usecs ||
> +                              ec->rx_max_coalesced_frames != max_packets))
> +               return -EINVAL;
> +
> +       if (rx_ctrl_dim_on) {
> +               vi->rq[queue].dim_enabled = true;
> +       } else {
> +               vi->rq[queue].dim_enabled = false;
> +               err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, queue,
> +                                                      ec->rx_coalesce_usecs,
> +                                                      ec->rx_max_coalesced_frames);
> +               if (err)
> +                       return err;
> +       }
>
>         err = virtnet_send_tx_ctrl_coal_vq_cmd(vi, queue,
>                                                ec->tx_coalesce_usecs,
> @@ -3394,6 +3471,32 @@ static int virtnet_send_notf_coal_vq_cmds(struct virtnet_info *vi,
>         return 0;
>  }
>
> +static void virtnet_rx_dim_work(struct work_struct *work)
> +{
> +       struct dim *dim = container_of(work, struct dim, work);
> +       struct receive_queue *rq = container_of(dim,
> +                       struct receive_queue, dim);
> +       struct virtnet_info *vi = rq->vq->vdev->priv;
> +       struct net_device *dev = vi->dev;
> +       struct dim_cq_moder update_moder;
> +       int qnum = rq - vi->rq, err;
> +
> +       update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix);
> +       if (update_moder.usec != vi->rq[qnum].intr_coal.max_usecs ||
> +           update_moder.pkts != vi->rq[qnum].intr_coal.max_packets) {

Is this safe across, e.g., a vq reset?

Thanks

> +               rtnl_lock();
> +               err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum,
> +                                                      update_moder.usec,
> +                                                      update_moder.pkts);
> +               if (err)
> +                       pr_debug("%s: Failed to send dim parameters on rxq%d\n",
> +                                dev->name, (int)(rq - vi->rq));
> +               rtnl_unlock();
> +       }
> +
> +       dim->state = DIM_START_MEASURE;
> +}
> +
>  static int virtnet_coal_params_supported(struct ethtool_coalesce *ec)
>  {
>         /* usecs coalescing is supported only if VIRTIO_NET_F_NOTF_COAL
> @@ -3475,6 +3578,7 @@ static int virtnet_get_coalesce(struct net_device *dev,
>                 ec->tx_coalesce_usecs = vi->intr_coal_tx.max_usecs;
>                 ec->tx_max_coalesced_frames = vi->intr_coal_tx.max_packets;
>                 ec->rx_max_coalesced_frames = vi->intr_coal_rx.max_packets;
> +               ec->use_adaptive_rx_coalesce = vi->rx_dim_enabled;
>         } else {
>                 ec->rx_max_coalesced_frames = 1;
>
> @@ -3532,6 +3636,7 @@ static int virtnet_get_per_queue_coalesce(struct net_device *dev,
>                 ec->tx_coalesce_usecs = vi->sq[queue].intr_coal.max_usecs;
>                 ec->tx_max_coalesced_frames = vi->sq[queue].intr_coal.max_packets;
>                 ec->rx_max_coalesced_frames = vi->rq[queue].intr_coal.max_packets;
> +               ec->use_adaptive_rx_coalesce = vi->rq[queue].dim_enabled;
>         } else {
>                 ec->rx_max_coalesced_frames = 1;
>
> @@ -3657,7 +3762,7 @@ static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
>
>  static const struct ethtool_ops virtnet_ethtool_ops = {
>         .supported_coalesce_params = ETHTOOL_COALESCE_MAX_FRAMES |
> -               ETHTOOL_COALESCE_USECS,
> +               ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_USE_ADAPTIVE_RX,
>         .get_drvinfo = virtnet_get_drvinfo,
>         .get_link = ethtool_op_get_link,
>         .get_ringparam = virtnet_get_ringparam,
> --
> 2.19.1.6.gb485710b
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 5/5] virtio-net: support tx netdim
       [not found] ` <ef5d159875745040e406473bd5c03e9875742ff5.1697093455.git.hengqi@linux.alibaba.com>
@ 2023-10-25  3:35   ` Jason Wang
  2023-10-25  5:50     ` Michael S. Tsirkin
  0 siblings, 1 reply; 15+ messages in thread
From: Jason Wang @ 2023-10-25  3:35 UTC (permalink / raw)
  To: Heng Qi
  Cc: Xuan Zhuo, Liu, Yujie, Jesper Dangaard Brouer, Michael S. Tsirkin,
	netdev, John Fastabend, Alexei Starovoitov, virtualization,
	Eric Dumazet, Simon Horman, Jakub Kicinski, Paolo Abeni,
	David S. Miller

On Thu, Oct 12, 2023 at 3:44 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
> Similar to rx netdim, this patch supports adaptive tx
> coalescing moderation for the virtio-net.
>
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> ---
>  drivers/net/virtio_net.c | 143 ++++++++++++++++++++++++++++++++-------
>  1 file changed, 119 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 6ad2890a7909..1c680cb09d48 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -154,6 +154,15 @@ struct send_queue {
>
>         struct virtnet_sq_stats stats;
>
> +       /* The number of tx notifications */
> +       u16 calls;
> +
> +       /* Is dynamic interrupt moderation enabled? */
> +       bool dim_enabled;
> +
> +       /* Dynamic Interrupt Moderation */
> +       struct dim dim;
> +
>         struct virtnet_interrupt_coalesce intr_coal;
>
>         struct napi_struct napi;
> @@ -317,8 +326,9 @@ struct virtnet_info {
>         u8 duplex;
>         u32 speed;
>
> -       /* Is rx dynamic interrupt moderation enabled? */
> +       /* Is dynamic interrupt moderation enabled? */
>         bool rx_dim_enabled;
> +       bool tx_dim_enabled;
>
>         /* Interrupt coalescing settings */
>         struct virtnet_interrupt_coalesce intr_coal_tx;
> @@ -464,19 +474,40 @@ static bool virtqueue_napi_complete(struct napi_struct *napi,
>         return false;
>  }
>
> +static void virtnet_tx_dim_work(struct work_struct *work);
> +
> +static void virtnet_tx_dim_update(struct virtnet_info *vi, struct send_queue *sq)
> +{
> +       struct virtnet_sq_stats *stats = &sq->stats;
> +       struct dim_sample cur_sample = {};
> +
> +       u64_stats_update_begin(&sq->stats.syncp);
> +       dim_update_sample(sq->calls, stats->packets,
> +                         stats->bytes, &cur_sample);
> +       u64_stats_update_end(&sq->stats.syncp);
> +
> +       net_dim(&sq->dim, cur_sample);
> +}
> +
>  static void skb_xmit_done(struct virtqueue *vq)
>  {
>         struct virtnet_info *vi = vq->vdev->priv;
> -       struct napi_struct *napi = &vi->sq[vq2txq(vq)].napi;
> +       struct send_queue *sq = &vi->sq[vq2txq(vq)];
> +       struct napi_struct *napi = &sq->napi;
> +
> +       sq->calls++;

I wonder what the impact of this counter is for netdim, since we
have a mode that orphans skbs in xmit.

We need to test to see.

Thanks


* Re: [PATCH net-next 0/5] virtio-net: support dynamic coalescing moderation
       [not found] <cover.1697093455.git.hengqi@linux.alibaba.com>
                   ` (5 preceding siblings ...)
       [not found] ` <ef5d159875745040e406473bd5c03e9875742ff5.1697093455.git.hengqi@linux.alibaba.com>
@ 2023-10-25  5:49 ` Michael S. Tsirkin
       [not found]   ` <707be7fa-3bb7-46c5-bb34-ef2900fe473f@linux.alibaba.com>
  6 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2023-10-25  5:49 UTC (permalink / raw)
  To: Heng Qi
  Cc: Xuan Zhuo, Liu, Yujie, Jesper Dangaard Brouer, netdev,
	John Fastabend, Alexei Starovoitov, virtualization, Eric Dumazet,
	Simon Horman, Jakub Kicinski, Paolo Abeni, David S. Miller

On Thu, Oct 12, 2023 at 03:44:04PM +0800, Heng Qi wrote:
> Now, virtio-net already supports per-queue moderation parameter
> setting. Based on this, we use the netdim library of linux to support
> dynamic coalescing moderation for virtio-net.
> 
> Due to hardware scheduling issues, we only tested rx dim.

So patches 1 to 4 look ok but patch 5 is untested - we should
probably wait until it's tested properly.


> @Test env
> rxq0 has affinity to cpu0.
> 
> @Test cmd
> client: taskset -c 0 sockperf tp -i ${IP} -t 30 --tcp -m ${msg_size}
> server: taskset -c 0 sockperf sr --tcp
> 
> @Test res
> The second column is the ratio of the result returned by client
> when rx dim is enabled to the result returned by client when
> rx dim is disabled.
> 	--------------------------------------
> 	| msg_size |  rx_dim=on / rx_dim=off |
> 	--------------------------------------
> 	|   14B    |         + 3%            |   
> 	--------------------------------------
> 	|   100B   |         + 16%           |
> 	--------------------------------------
> 	|   500B   |         + 25%           |
> 	--------------------------------------
> 	|   1400B  |         + 28%           |
> 	--------------------------------------
> 	|   2048B  |         + 22%           |
> 	--------------------------------------
> 	|   4096B  |         + 5%            |
> 	--------------------------------------
> 
> ---
> This patch set was part of the previous netdim patch set[1].
> [1] was split into a merged bugfix set[2] and the current set.
> The previous relevant commentators have been Cced.
> 
> [1] https://lore.kernel.org/all/20230811065512.22190-1-hengqi@linux.alibaba.com/
> [2] https://lore.kernel.org/all/cover.1696745452.git.hengqi@linux.alibaba.com/
> 
> Heng Qi (5):
>   virtio-net: returns whether napi is complete
>   virtio-net: separate rx/tx coalescing moderation cmds
>   virtio-net: extract virtqueue coalescig cmd for reuse
>   virtio-net: support rx netdim
>   virtio-net: support tx netdim
> 
>  drivers/net/virtio_net.c | 394 ++++++++++++++++++++++++++++++++-------
>  1 file changed, 322 insertions(+), 72 deletions(-)
> 
> -- 
> 2.19.1.6.gb485710b


* Re: [PATCH net-next 5/5] virtio-net: support tx netdim
  2023-10-25  3:35   ` [PATCH net-next 5/5] virtio-net: support tx netdim Jason Wang
@ 2023-10-25  5:50     ` Michael S. Tsirkin
  0 siblings, 0 replies; 15+ messages in thread
From: Michael S. Tsirkin @ 2023-10-25  5:50 UTC (permalink / raw)
  To: Jason Wang
  Cc: Xuan Zhuo, Liu, Yujie, Jesper Dangaard Brouer, netdev,
	John Fastabend, Alexei Starovoitov, virtualization, Eric Dumazet,
	Heng Qi, Simon Horman, Jakub Kicinski, Paolo Abeni,
	David S. Miller

On Wed, Oct 25, 2023 at 11:35:43AM +0800, Jason Wang wrote:
> On Thu, Oct 12, 2023 at 3:44 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >
> > Similar to rx netdim, this patch supports adaptive tx
> > coalescing moderation for the virtio-net.
> >
> > Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> > ---
> >  drivers/net/virtio_net.c | 143 ++++++++++++++++++++++++++++++++-------
> >  1 file changed, 119 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 6ad2890a7909..1c680cb09d48 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -154,6 +154,15 @@ struct send_queue {
> >
> >         struct virtnet_sq_stats stats;
> >
> > +       /* The number of tx notifications */
> > +       u16 calls;
> > +
> > +       /* Is dynamic interrupt moderation enabled? */
> > +       bool dim_enabled;
> > +
> > +       /* Dynamic Interrupt Moderation */
> > +       struct dim dim;
> > +
> >         struct virtnet_interrupt_coalesce intr_coal;
> >
> >         struct napi_struct napi;
> > @@ -317,8 +326,9 @@ struct virtnet_info {
> >         u8 duplex;
> >         u32 speed;
> >
> > -       /* Is rx dynamic interrupt moderation enabled? */
> > +       /* Is dynamic interrupt moderation enabled? */
> >         bool rx_dim_enabled;
> > +       bool tx_dim_enabled;
> >
> >         /* Interrupt coalescing settings */
> >         struct virtnet_interrupt_coalesce intr_coal_tx;
> > @@ -464,19 +474,40 @@ static bool virtqueue_napi_complete(struct napi_struct *napi,
> >         return false;
> >  }
> >
> > +static void virtnet_tx_dim_work(struct work_struct *work);
> > +
> > +static void virtnet_tx_dim_update(struct virtnet_info *vi, struct send_queue *sq)
> > +{
> > +       struct virtnet_sq_stats *stats = &sq->stats;
> > +       struct dim_sample cur_sample = {};
> > +
> > +       u64_stats_update_begin(&sq->stats.syncp);
> > +       dim_update_sample(sq->calls, stats->packets,
> > +                         stats->bytes, &cur_sample);
> > +       u64_stats_update_end(&sq->stats.syncp);
> > +
> > +       net_dim(&sq->dim, cur_sample);
> > +}
> > +
> >  static void skb_xmit_done(struct virtqueue *vq)
> >  {
> >         struct virtnet_info *vi = vq->vdev->priv;
> > -       struct napi_struct *napi = &vi->sq[vq2txq(vq)].napi;
> > +       struct send_queue *sq = &vi->sq[vq2txq(vq)];
> > +       struct napi_struct *napi = &sq->napi;
> > +
> > +       sq->calls++;
> 
> I wonder what's the impact of this counters for netdim. As we have a
> mode of orphan skb in xmit.
> 
> We need to test to see.
> 
> Thanks

Agreed, performance patches should come with performance results.

-- 
MST


* Re: [PATCH net-next 0/5] virtio-net: support dynamic coalescing moderation
  2023-10-25  1:18     ` Jason Wang
@ 2023-10-25  5:53       ` Michael S. Tsirkin
       [not found]         ` <d3b9e9e8-1ef4-48ac-8a2f-4fa647ae4372@linux.alibaba.com>
       [not found]       ` <753ac6da-f7f1-4acb-9184-e59271809c6d@linux.alibaba.com>
  1 sibling, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2023-10-25  5:53 UTC (permalink / raw)
  To: Jason Wang
  Cc: Xuan Zhuo, Liu, Yujie, Jesper Dangaard Brouer, netdev,
	John Fastabend, Alexei Starovoitov, virtualization, Eric Dumazet,
	Heng Qi, Simon Horman, Jakub Kicinski, Paolo Abeni,
	David S. Miller

On Wed, Oct 25, 2023 at 09:18:27AM +0800, Jason Wang wrote:
> On Tue, Oct 24, 2023 at 8:03 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >
> >
> >
> > On 2023/10/12 4:29 PM, Jason Wang wrote:
> > > On Thu, Oct 12, 2023 at 3:44 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> > >> Now, virtio-net already supports per-queue moderation parameter
> > >> setting. Based on this, we use the netdim library of linux to support
> > >> dynamic coalescing moderation for virtio-net.
> > >>
> > >> Due to hardware scheduling issues, we only tested rx dim.
> > > Do you have PPS numbers? And TX numbers are also important as the
> > > throughput could be misleading due to various reasons.
> >
> > Hi Jason!
> >
> > The comparison of rx netdim performance is as follows:
> > (the backend supporting tx dim is not yet ready)
> 
> Thanks a lot for the numbers.
> 
I'd still expect TX results, as I played with tx interrupt coalescing
about 10 years ago.
> 
> I will start to review the series but let's try to have some TX numbers as well.
> 
Btw, it would be more convenient to have a raw PPS benchmark. E.g. you
can try to use a software or hardware packet generator.
> 
> Thanks

Latency results are also kind of interesting.
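
As a quick sanity check on the throughput-vs-CPU tradeoff, the sockperf UDP
numbers quoted below can be normalized by CPU utilization. A back-of-the-envelope
calculation (values copied verbatim from Heng Qi's report):

```python
# Normalize the reported sockperf UDP results by CPU utilization to see
# the efficiency gain rx dim buys, even though raw pps drops slightly.
dim_off_pps, dim_off_cpu = 1143277.0, 100.0   # rx dim off
dim_on_pps,  dim_on_cpu  = 1124161.0, 83.5    # rx dim on

off_eff = dim_off_pps / dim_off_cpu   # packets per 1% of a CPU
on_eff  = dim_on_pps / dim_on_cpu

gain = (on_eff / off_eff - 1) * 100
print(f"pps per %cpu: off={off_eff:.0f} on={on_eff:.0f} (+{gain:.1f}%)")
```

So the ~1.7% pps loss buys back roughly 18% more packets delivered per unit
of CPU, which is the usual shape of a win for interrupt moderation.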


> >
> >
> > I. Sockperf UDP
> > =================================================
> > 1. Env
> > rxq_0 is affinity to cpu_0
> >
> > 2. Cmd
> > client:  taskset -c 0 sockperf tp -p 8989 -i $IP -t 10 -m 16B
> > server: taskset -c 0 sockperf sr -p 8989
> >
> > 3. Result
> > dim off: 1143277.00 rxpps, throughput 17.844 MBps, cpu is 100%.
> > dim on: 1124161.00 rxpps, throughput 17.610 MBps, cpu is 83.5%.
> > =================================================
> >
> >
> > II. Redis
> > =================================================
> > 1. Env
> > There are 8 rxqs and rxq_i is affinity to cpu_i.
> >
> > 2. Result
> > When all cpus are 100%, ops/sec of memtier_benchmark client is
> > dim off:   978437.23
> > dim on: 1143638.28
> > =================================================
> >
> >
> > III. Nginx
> > =================================================
> > 1. Env
> > There are 8 rxqs and rxq_i is affinity to cpu_i.
> >
> > 2. Result
> > When all cpus are 100%, requests/sec of wrk client is
> > dim off:   877931.67
> > dim on: 1019160.31
> > =================================================
> >
> > Thanks!
> >
> > >
> > > Thanks
> > >
> > >> @Test env
> > >> rxq0 has affinity to cpu0.
> > >>
> > >> @Test cmd
> > >> client: taskset -c 0 sockperf tp -i ${IP} -t 30 --tcp -m ${msg_size}
> > >> server: taskset -c 0 sockperf sr --tcp
> > >>
> > >> @Test res
> > >> The second column is the ratio of the result returned by client
> > >> when rx dim is enabled to the result returned by client when
> > >> rx dim is disabled.
> > >>          --------------------------------------
> > >>          | msg_size |  rx_dim=on / rx_dim=off |
> > >>          --------------------------------------
> > >>          |   14B    |         + 3%            |
> > >>          --------------------------------------
> > >>          |   100B   |         + 16%           |
> > >>          --------------------------------------
> > >>          |   500B   |         + 25%           |
> > >>          --------------------------------------
> > >>          |   1400B  |         + 28%           |
> > >>          --------------------------------------
> > >>          |   2048B  |         + 22%           |
> > >>          --------------------------------------
> > >>          |   4096B  |         + 5%            |
> > >>          --------------------------------------
> > >>
> > >> ---
> > >> This patch set was part of the previous netdim patch set[1].
> > >> [1] was split into a merged bugfix set[2] and the current set.
> > >> The previous relevant commentators have been Cced.
> > >>
> > >> [1] https://lore.kernel.org/all/20230811065512.22190-1-hengqi@linux.alibaba.com/
> > >> [2] https://lore.kernel.org/all/cover.1696745452.git.hengqi@linux.alibaba.com/
> > >>
> > >> Heng Qi (5):
> > >>    virtio-net: returns whether napi is complete
> > >>    virtio-net: separate rx/tx coalescing moderation cmds
> > >>    virtio-net: extract virtqueue coalescig cmd for reuse
> > >>    virtio-net: support rx netdim
> > >>    virtio-net: support tx netdim
> > >>
> > >>   drivers/net/virtio_net.c | 394 ++++++++++++++++++++++++++++++++-------
> > >>   1 file changed, 322 insertions(+), 72 deletions(-)
> > >>
> > >> --
> > >> 2.19.1.6.gb485710b
> > >>
> > >>
> >
> >


* Re: [PATCH net-next 0/5] virtio-net: support dynamic coalescing moderation
       [not found]   ` <707be7fa-3bb7-46c5-bb34-ef2900fe473f@linux.alibaba.com>
@ 2023-11-01 10:44     ` Michael S. Tsirkin
  0 siblings, 0 replies; 15+ messages in thread
From: Michael S. Tsirkin @ 2023-11-01 10:44 UTC (permalink / raw)
  To: Heng Qi
  Cc: Xuan Zhuo, Liu, Yujie, Jesper Dangaard Brouer, netdev,
	John Fastabend, Alexei Starovoitov, virtualization, Eric Dumazet,
	Simon Horman, Jakub Kicinski, Paolo Abeni, David S. Miller

On Wed, Nov 01, 2023 at 05:40:30PM +0800, Heng Qi wrote:
> 
> 
> On 2023/10/25 1:49 PM, Michael S. Tsirkin wrote:
> > On Thu, Oct 12, 2023 at 03:44:04PM +0800, Heng Qi wrote:
> > > Now, virtio-net already supports per-queue moderation parameter
> > > setting. Based on this, we use the netdim library of linux to support
> > > dynamic coalescing moderation for virtio-net.
> > > 
> > > Due to hardware scheduling issues, we only tested rx dim.
> > So patches 1 to 4 look ok but patch 5 is untested - we should
> > probably wait until it's tested properly.
> 
> Hi, Michael.
> 
> For a few reasons (reply to Jason's thread), I won't be trying to push tx
> dim any more in the short term.
> 
> Please review the remaining patches.
> 
> Thanks a lot!


You got a bunch of comments from Jason - want to address them
in a new version then, and I'll review that?

> > 
> > 
> > > @Test env
> > > rxq0 has affinity to cpu0.
> > > 
> > > @Test cmd
> > > client: taskset -c 0 sockperf tp -i ${IP} -t 30 --tcp -m ${msg_size}
> > > server: taskset -c 0 sockperf sr --tcp
> > > 
> > > @Test res
> > > The second column is the ratio of the result returned by client
> > > when rx dim is enabled to the result returned by client when
> > > rx dim is disabled.
> > > 	--------------------------------------
> > > 	| msg_size |  rx_dim=on / rx_dim=off |
> > > 	--------------------------------------
> > > 	|   14B    |         + 3%            |
> > > 	--------------------------------------
> > > 	|   100B   |         + 16%           |
> > > 	--------------------------------------
> > > 	|   500B   |         + 25%           |
> > > 	--------------------------------------
> > > 	|   1400B  |         + 28%           |
> > > 	--------------------------------------
> > > 	|   2048B  |         + 22%           |
> > > 	--------------------------------------
> > > 	|   4096B  |         + 5%            |
> > > 	--------------------------------------
> > > 
> > > ---
> > > This patch set was part of the previous netdim patch set[1].
> > > [1] was split into a merged bugfix set[2] and the current set.
> > > The previous relevant commentators have been Cced.
> > > 
> > > [1] https://lore.kernel.org/all/20230811065512.22190-1-hengqi@linux.alibaba.com/
> > > [2] https://lore.kernel.org/all/cover.1696745452.git.hengqi@linux.alibaba.com/
> > > 
> > > Heng Qi (5):
> > >    virtio-net: returns whether napi is complete
> > >    virtio-net: separate rx/tx coalescing moderation cmds
> > >    virtio-net: extract virtqueue coalescig cmd for reuse
> > >    virtio-net: support rx netdim
> > >    virtio-net: support tx netdim
> > > 
> > >   drivers/net/virtio_net.c | 394 ++++++++++++++++++++++++++++++++-------
> > >   1 file changed, 322 insertions(+), 72 deletions(-)
> > > 
> > > -- 
> > > 2.19.1.6.gb485710b


* Re: [PATCH net-next 4/5] virtio-net: support rx netdim
       [not found]     ` <3bc31a3b-a022-4816-a854-7f6b41d2e351@linux.alibaba.com>
@ 2023-11-02  4:31       ` Jason Wang
  0 siblings, 0 replies; 15+ messages in thread
From: Jason Wang @ 2023-11-02  4:31 UTC (permalink / raw)
  To: Heng Qi
  Cc: Xuan Zhuo, Liu, Yujie, Jesper Dangaard Brouer, Michael S. Tsirkin,
	netdev, John Fastabend, Alexei Starovoitov, virtualization,
	Eric Dumazet, Simon Horman, Jakub Kicinski, Paolo Abeni,
	David S. Miller

On Wed, Nov 1, 2023 at 6:55 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
>
>
> On 2023/10/25 11:34 AM, Jason Wang wrote:
> > On Thu, Oct 12, 2023 at 3:44 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >> By comparing the traffic information in the complete napi processes,
> >> let the virtio-net driver automatically adjust the coalescing
> >> moderation parameters of each receive queue.
> >>
> >> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> >> ---
> >>   drivers/net/virtio_net.c | 147 +++++++++++++++++++++++++++++++++------
> >>   1 file changed, 126 insertions(+), 21 deletions(-)
> >>
> >> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >> index caef78bb3963..6ad2890a7909 100644
> >> --- a/drivers/net/virtio_net.c
> >> +++ b/drivers/net/virtio_net.c
> >> @@ -19,6 +19,7 @@
> >>   #include <linux/average.h>
> >>   #include <linux/filter.h>
> >>   #include <linux/kernel.h>
> >> +#include <linux/dim.h>
> >>   #include <net/route.h>
> >>   #include <net/xdp.h>
> >>   #include <net/net_failover.h>
> >> @@ -172,6 +173,17 @@ struct receive_queue {
> >>
> >>          struct virtnet_rq_stats stats;
> >>
> >> +       /* The number of rx notifications */
> >> +       u16 calls;
> >> +
> >> +       /* Is dynamic interrupt moderation enabled? */
> >> +       bool dim_enabled;
> >> +
> >> +       /* Dynamic Interrupt Moderation */
> >> +       struct dim dim;
> >> +
> >> +       u32 packets_in_napi;
> >> +
> >>          struct virtnet_interrupt_coalesce intr_coal;
> >>
> >>          /* Chain pages by the private ptr. */
> >> @@ -305,6 +317,9 @@ struct virtnet_info {
> >>          u8 duplex;
> >>          u32 speed;
> >>
> >> +       /* Is rx dynamic interrupt moderation enabled? */
> >> +       bool rx_dim_enabled;
> >> +
> >>          /* Interrupt coalescing settings */
> >>          struct virtnet_interrupt_coalesce intr_coal_tx;
> >>          struct virtnet_interrupt_coalesce intr_coal_rx;
> >> @@ -2001,6 +2016,7 @@ static void skb_recv_done(struct virtqueue *rvq)
> >>          struct virtnet_info *vi = rvq->vdev->priv;
> >>          struct receive_queue *rq = &vi->rq[vq2rxq(rvq)];
> >>
> >> +       rq->calls++;
> >>          virtqueue_napi_schedule(&rq->napi, rvq);
> >>   }
> >>
> >> @@ -2138,6 +2154,25 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
> >>          }
> >>   }
> >>
> >> +static void virtnet_rx_dim_work(struct work_struct *work);
> >> +
> >> +static void virtnet_rx_dim_update(struct virtnet_info *vi, struct receive_queue *rq)
> >> +{
> >> +       struct virtnet_rq_stats *stats = &rq->stats;
> >> +       struct dim_sample cur_sample = {};
> >> +
> >> +       if (!rq->packets_in_napi)
> >> +               return;
> >> +
> >> +       u64_stats_update_begin(&rq->stats.syncp);
> >> +       dim_update_sample(rq->calls, stats->packets,
> >> +                         stats->bytes, &cur_sample);
> >> +       u64_stats_update_end(&rq->stats.syncp);
> >> +
> >> +       net_dim(&rq->dim, cur_sample);
> >> +       rq->packets_in_napi = 0;
> >> +}
> >> +
> >>   static int virtnet_poll(struct napi_struct *napi, int budget)
> >>   {
> >>          struct receive_queue *rq =
> >> @@ -2146,17 +2181,22 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
> >>          struct send_queue *sq;
> >>          unsigned int received;
> >>          unsigned int xdp_xmit = 0;
> >> +       bool napi_complete;
> >>
> >>          virtnet_poll_cleantx(rq);
> >>
> >>          received = virtnet_receive(rq, budget, &xdp_xmit);
> >> +       rq->packets_in_napi += received;
> >>
> >>          if (xdp_xmit & VIRTIO_XDP_REDIR)
> >>                  xdp_do_flush();
> >>
> >>          /* Out of packets? */
> >> -       if (received < budget)
> >> -               virtqueue_napi_complete(napi, rq->vq, received);
> >> +       if (received < budget) {
> >> +               napi_complete = virtqueue_napi_complete(napi, rq->vq, received);
> >> +               if (napi_complete && rq->dim_enabled)
> >> +                       virtnet_rx_dim_update(vi, rq);
> >> +       }
> >>
> >>          if (xdp_xmit & VIRTIO_XDP_TX) {
> >>                  sq = virtnet_xdp_get_sq(vi);
> >> @@ -2176,6 +2216,7 @@ static void virtnet_disable_queue_pair(struct virtnet_info *vi, int qp_index)
> >>          virtnet_napi_tx_disable(&vi->sq[qp_index].napi);
> >>          napi_disable(&vi->rq[qp_index].napi);
> >>          xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq);
> >> +       cancel_work_sync(&vi->rq[qp_index].dim.work);
> >>   }
> >>
> >>   static int virtnet_enable_queue_pair(struct virtnet_info *vi, int qp_index)
> >> @@ -2193,6 +2234,9 @@ static int virtnet_enable_queue_pair(struct virtnet_info *vi, int qp_index)
> >>          if (err < 0)
> >>                  goto err_xdp_reg_mem_model;
> >>
> >> +       INIT_WORK(&vi->rq[qp_index].dim.work, virtnet_rx_dim_work);
> >> +       vi->rq[qp_index].dim.mode = DIM_CQ_PERIOD_MODE_START_FROM_EQE;
> >> +
> >>          virtnet_napi_enable(vi->rq[qp_index].vq, &vi->rq[qp_index].napi);
> >>          virtnet_napi_tx_enable(vi, vi->sq[qp_index].vq, &vi->sq[qp_index].napi);
> >>
> >> @@ -3335,23 +3379,42 @@ static int virtnet_send_tx_notf_coal_cmds(struct virtnet_info *vi,
> >>   static int virtnet_send_rx_notf_coal_cmds(struct virtnet_info *vi,
> >>                                            struct ethtool_coalesce *ec)
> >>   {
> >> +       bool rx_ctrl_dim_on = !!ec->use_adaptive_rx_coalesce;
> >>          struct scatterlist sgs_rx;
> >> +       int i;
> >>
> >> -       vi->ctrl->coal_rx.rx_usecs = cpu_to_le32(ec->rx_coalesce_usecs);
> >> -       vi->ctrl->coal_rx.rx_max_packets = cpu_to_le32(ec->rx_max_coalesced_frames);
> >> -       sg_init_one(&sgs_rx, &vi->ctrl->coal_rx, sizeof(vi->ctrl->coal_rx));
> >> -
> >> -       if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL,
> >> -                                 VIRTIO_NET_CTRL_NOTF_COAL_RX_SET,
> >> -                                 &sgs_rx))
> >> +       if (rx_ctrl_dim_on && (ec->rx_coalesce_usecs != vi->intr_coal_rx.max_usecs ||
> >> +                              ec->rx_max_coalesced_frames != vi->intr_coal_rx.max_packets))
> > Any reason we need to stick a check for usecs/packets? I think it
> > might confuse the user since the value could be modified by netdim
> > actually.
>
> Yes, that's exactly what's done here.
>
> When dim is enabled, the user is prohibited from manually configuring
> parameters because dim may modify the parameters.

So it should be something like

if (rx_ctrl_dim_on)
      return -EINVAL;

without checking whether it matches the current parameters?

>
> >
> >>                  return -EINVAL;
> >>
> >> -       /* Save parameters */
> >> -       vi->intr_coal_rx.max_usecs = ec->rx_coalesce_usecs;
> >> -       vi->intr_coal_rx.max_packets = ec->rx_max_coalesced_frames;
> >> -       for (i = 0; i < vi->max_queue_pairs; i++) {
> >> -               vi->rq[i].intr_coal.max_usecs = ec->rx_coalesce_usecs;
> >> -               vi->rq[i].intr_coal.max_packets = ec->rx_max_coalesced_frames;
> >> +       if (rx_ctrl_dim_on) {
> >> +               if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL)) {
> >> +                       vi->rx_dim_enabled = true;
> >> +                       for (i = 0; i < vi->max_queue_pairs; i++)
> >> +                               vi->rq[i].dim_enabled = true;
> >> +               } else {
> >> +                       return -EOPNOTSUPP;
> >> +               }
> >> +       } else {
> >> +               vi->rx_dim_enabled = false;
> >> +               for (i = 0; i < vi->max_queue_pairs; i++)
> >> +                       vi->rq[i].dim_enabled = false;
> >> +
> >> +               vi->ctrl->coal_rx.rx_usecs = cpu_to_le32(ec->rx_coalesce_usecs);
> >> +               vi->ctrl->coal_rx.rx_max_packets = cpu_to_le32(ec->rx_max_coalesced_frames);
> >> +               sg_init_one(&sgs_rx, &vi->ctrl->coal_rx, sizeof(vi->ctrl->coal_rx));
> >> +
> >> +               if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL,
> >> +                                         VIRTIO_NET_CTRL_NOTF_COAL_RX_SET,
> >> +                                         &sgs_rx))
> >> +                       return -EINVAL;
> >> +
> >> +               vi->intr_coal_rx.max_usecs = ec->rx_coalesce_usecs;
> >> +               vi->intr_coal_rx.max_packets = ec->rx_max_coalesced_frames;
> >> +               for (i = 0; i < vi->max_queue_pairs; i++) {
> >> +                       vi->rq[i].intr_coal.max_usecs = ec->rx_coalesce_usecs;
> >> +                       vi->rq[i].intr_coal.max_packets = ec->rx_max_coalesced_frames;
> >> +               }
> >>          }
> >>
> >>          return 0;
> >> @@ -3377,13 +3440,27 @@ static int virtnet_send_notf_coal_vq_cmds(struct virtnet_info *vi,
> >>                                            struct ethtool_coalesce *ec,
> >>                                            u16 queue)
> >>   {
> >> +       bool rx_ctrl_dim_on;
> >> +       u32 max_usecs, max_packets;
> >>          int err;
> >>
> >> -       err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, queue,
> >> -                                              ec->rx_coalesce_usecs,
> >> -                                              ec->rx_max_coalesced_frames);
> >> -       if (err)
> >> -               return err;
> >> +       rx_ctrl_dim_on = !!ec->use_adaptive_rx_coalesce;
> >> +       max_usecs = vi->rq[queue].intr_coal.max_usecs;
> >> +       max_packets = vi->rq[queue].intr_coal.max_packets;
> >> +       if (rx_ctrl_dim_on && (ec->rx_coalesce_usecs != max_usecs ||
> >> +                              ec->rx_max_coalesced_frames != max_packets))
> >> +               return -EINVAL;
> >> +
> >> +       if (rx_ctrl_dim_on) {
> >> +               vi->rq[queue].dim_enabled = true;
> >> +       } else {
> >> +               vi->rq[queue].dim_enabled = false;
> >> +               err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, queue,
> >> +                                                      ec->rx_coalesce_usecs,
> >> +                                                      ec->rx_max_coalesced_frames);
> >> +               if (err)
> >> +                       return err;
> >> +       }
> >>
> >>          err = virtnet_send_tx_ctrl_coal_vq_cmd(vi, queue,
> >>                                                 ec->tx_coalesce_usecs,
> >> @@ -3394,6 +3471,32 @@ static int virtnet_send_notf_coal_vq_cmds(struct virtnet_info *vi,
> >>          return 0;
> >>   }
> >>
> >> +static void virtnet_rx_dim_work(struct work_struct *work)
> >> +{
> >> +       struct dim *dim = container_of(work, struct dim, work);
> >> +       struct receive_queue *rq = container_of(dim,
> >> +                       struct receive_queue, dim);
> >> +       struct virtnet_info *vi = rq->vq->vdev->priv;
> >> +       struct net_device *dev = vi->dev;
> >> +       struct dim_cq_moder update_moder;
> >> +       int qnum = rq - vi->rq, err;
> >> +
> >> +       update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix);
> >> +       if (update_moder.usec != vi->rq[qnum].intr_coal.max_usecs ||
> >> +           update_moder.pkts != vi->rq[qnum].intr_coal.max_packets) {
> > Is this safe across the e.g vq reset?
>
> I think it might not be. This will be avoided in the next version by:
> 1. cancelling virtnet_rx_dim_work before vq reset.
> 2. restoring virtnet_rx_dim_work after vq re-enable.

Ok.

Thanks
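
The update-only-when-changed check in virtnet_rx_dim_work() being discussed
can be sketched in isolation. This is toy user-space C, not the kernel code;
the names are illustrative:

```c
#include <assert.h>
#include <stdbool.h>

struct moder {
	unsigned int usec;
	unsigned int pkts;
};

/* Apply a netdim-suggested profile only when it differs from the
 * queue's current parameters; returns whether the driver would have
 * issued a coalescing ctrl command for this update. */
static bool maybe_update_moder(struct moder *cur, struct moder next)
{
	if (next.usec == cur->usec && next.pkts == cur->pkts)
		return false;	/* unchanged: skip the device command */
	*cur = next;		/* the driver would send the coal vq cmd here */
	return true;
}
```

The point of the comparison is to avoid flooding the control virtqueue with
redundant commands when netdim repeatedly lands on the same profile.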

>
> Thanks a lot!
>
> >
> > Thanks
> >
> >> +               rtnl_lock();
> >> +               err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum,
> >> +                                                      update_moder.usec,
> >> +                                                      update_moder.pkts);
> >> +               if (err)
> >> +                       pr_debug("%s: Failed to send dim parameters on rxq%d\n",
> >> +                                dev->name, (int)(rq - vi->rq));
> >> +               rtnl_unlock();
> >> +       }
> >> +
> >> +       dim->state = DIM_START_MEASURE;
> >> +}
> >> +
> >>   static int virtnet_coal_params_supported(struct ethtool_coalesce *ec)
> >>   {
> >>          /* usecs coalescing is supported only if VIRTIO_NET_F_NOTF_COAL
> >> @@ -3475,6 +3578,7 @@ static int virtnet_get_coalesce(struct net_device *dev,
> >>                  ec->tx_coalesce_usecs = vi->intr_coal_tx.max_usecs;
> >>                  ec->tx_max_coalesced_frames = vi->intr_coal_tx.max_packets;
> >>                  ec->rx_max_coalesced_frames = vi->intr_coal_rx.max_packets;
> >> +               ec->use_adaptive_rx_coalesce = vi->rx_dim_enabled;
> >>          } else {
> >>                  ec->rx_max_coalesced_frames = 1;
> >>
> >> @@ -3532,6 +3636,7 @@ static int virtnet_get_per_queue_coalesce(struct net_device *dev,
> >>                  ec->tx_coalesce_usecs = vi->sq[queue].intr_coal.max_usecs;
> >>                  ec->tx_max_coalesced_frames = vi->sq[queue].intr_coal.max_packets;
> >>                  ec->rx_max_coalesced_frames = vi->rq[queue].intr_coal.max_packets;
> >> +               ec->use_adaptive_rx_coalesce = vi->rq[queue].dim_enabled;
> >>          } else {
> >>                  ec->rx_max_coalesced_frames = 1;
> >>
> >> @@ -3657,7 +3762,7 @@ static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
> >>
> >>   static const struct ethtool_ops virtnet_ethtool_ops = {
> >>          .supported_coalesce_params = ETHTOOL_COALESCE_MAX_FRAMES |
> >> -               ETHTOOL_COALESCE_USECS,
> >> +               ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_USE_ADAPTIVE_RX,
> >>          .get_drvinfo = virtnet_get_drvinfo,
> >>          .get_link = ethtool_op_get_link,
> >>          .get_ringparam = virtnet_get_ringparam,
> >> --
> >> 2.19.1.6.gb485710b
> >>
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 0/5] virtio-net: support dynamic coalescing moderation
       [not found]         ` <d3b9e9e8-1ef4-48ac-8a2f-4fa647ae4372@linux.alibaba.com>
@ 2023-11-02  4:33           ` Jason Wang
  0 siblings, 0 replies; 15+ messages in thread
From: Jason Wang @ 2023-11-02  4:33 UTC (permalink / raw)
  To: Heng Qi
  Cc: Xuan Zhuo, Liu, Yujie, Jesper Dangaard Brouer, Michael S. Tsirkin,
	netdev, John Fastabend, Alexei Starovoitov, virtualization,
	Eric Dumazet, Simon Horman, Jakub Kicinski, Paolo Abeni,
	David S. Miller

On Wed, Nov 1, 2023 at 7:03 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
>
>
> > On 2023/10/25 1:53 PM, Michael S. Tsirkin wrote:
> > On Wed, Oct 25, 2023 at 09:18:27AM +0800, Jason Wang wrote:
> >> On Tue, Oct 24, 2023 at 8:03 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>>
> >>>
> >>> On 2023/10/12 4:29 PM, Jason Wang wrote:
> >>>> On Thu, Oct 12, 2023 at 3:44 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>>>> Now, virtio-net already supports per-queue moderation parameter
> >>>>> setting. Based on this, we use the netdim library of linux to support
> >>>>> dynamic coalescing moderation for virtio-net.
> >>>>>
> >>>>> Due to hardware scheduling issues, we only tested rx dim.
> >>>> Do you have PPS numbers? And TX numbers are also important as the
> >>>> throughput could be misleading due to various reasons.
> >>> Hi Jason!
> >>>
> >>> The comparison of rx netdim performance is as follows:
> >>> (the backend supporting tx dim is not yet ready)
> >> Thanks a lot for the numbers.
> >>
> >> I'd still expect the TX results, as I did play with tx interrupt
> >> coalescing about 10 years ago.
> >>
> >> I will start to review the series but let's try to have some TX numbers as well.
> >>
> >> Btw, it would be more convenient to have a raw PPS benchmark. E.g. you
> >> can try to use a software or hardware packet generator.
> >>
> >> Thanks
> > Latency results are also kind of interesting.
>
> I test the latency using sockperf pp:
>
> @Rx cmd
> taskset -c 0 sockperf sr -p 8989
>
> @Tx cmd
> taskset -c 0 sockperf pp -i ${ip} -p 8989 -t 10
>
> After running this cmd 5 times and averaging the results,
> we get the following data:
>
> dim off: 17.7735 usec
> dim on: 18.0110 usec

Let's add those numbers to the changelog of the next version.

Thanks

>
> Thanks!
>
> >
> >
> >>>
> >>> I. Sockperf UDP
> >>> =================================================
> >>> 1. Env
> >>> rxq_0 is affinity to cpu_0
> >>>
> >>> 2. Cmd
> >>> client:  taskset -c 0 sockperf tp -p 8989 -i $IP -t 10 -m 16B
> >>> server: taskset -c 0 sockperf sr -p 8989
> >>>
> >>> 3. Result
> >>> dim off: 1143277.00 rxpps, throughput 17.844 MBps, cpu is 100%.
> >>> dim on: 1124161.00 rxpps, throughput 17.610 MBps, cpu is 83.5%.
> >>> =================================================
> >>>
> >>>
> >>> II. Redis
> >>> =================================================
> >>> 1. Env
> >>> There are 8 rxqs and rxq_i is affinity to cpu_i.
> >>>
> >>> 2. Result
> >>> When all cpus are 100%, ops/sec of memtier_benchmark client is
> >>> dim off:   978437.23
> >>> dim on: 1143638.28
> >>> =================================================
> >>>
> >>>
> >>> III. Nginx
> >>> =================================================
> >>> 1. Env
> >>> There are 8 rxqs and rxq_i is affinity to cpu_i.
> >>>
> >>> 2. Result
> >>> When all cpus are 100%, requests/sec of wrk client is
> >>> dim off:   877931.67
> >>> dim on: 1019160.31
> >>> =================================================
> >>>
> >>> Thanks!
> >>>
> >>>> Thanks
> >>>>
> >>>>> @Test env
> >>>>> rxq0 has affinity to cpu0.
> >>>>>
> >>>>> @Test cmd
> >>>>> client: taskset -c 0 sockperf tp -i ${IP} -t 30 --tcp -m ${msg_size}
> >>>>> server: taskset -c 0 sockperf sr --tcp
> >>>>>
> >>>>> @Test res
> >>>>> The second column is the ratio of the result returned by client
> >>>>> when rx dim is enabled to the result returned by client when
> >>>>> rx dim is disabled.
> >>>>>           --------------------------------------
> >>>>>           | msg_size |  rx_dim=on / rx_dim=off |
> >>>>>           --------------------------------------
> >>>>>           |   14B    |         + 3%            |
> >>>>>           --------------------------------------
> >>>>>           |   100B   |         + 16%           |
> >>>>>           --------------------------------------
> >>>>>           |   500B   |         + 25%           |
> >>>>>           --------------------------------------
> >>>>>           |   1400B  |         + 28%           |
> >>>>>           --------------------------------------
> >>>>>           |   2048B  |         + 22%           |
> >>>>>           --------------------------------------
> >>>>>           |   4096B  |         + 5%            |
> >>>>>           --------------------------------------
> >>>>>
> >>>>> ---
> >>>>> This patch set was part of the previous netdim patch set[1].
> >>>>> [1] was split into a merged bugfix set[2] and the current set.
> >>>>> The previous relevant commentators have been Cced.
> >>>>>
> >>>>> [1] https://lore.kernel.org/all/20230811065512.22190-1-hengqi@linux.alibaba.com/
> >>>>> [2] https://lore.kernel.org/all/cover.1696745452.git.hengqi@linux.alibaba.com/
> >>>>>
> >>>>> Heng Qi (5):
> >>>>>     virtio-net: returns whether napi is complete
> >>>>>     virtio-net: separate rx/tx coalescing moderation cmds
> >>>>>     virtio-net: extract virtqueue coalescing cmd for reuse
> >>>>>     virtio-net: support rx netdim
> >>>>>     virtio-net: support tx netdim
> >>>>>
> >>>>>    drivers/net/virtio_net.c | 394 ++++++++++++++++++++++++++++++++-------
> >>>>>    1 file changed, 322 insertions(+), 72 deletions(-)
> >>>>>
> >>>>> --
> >>>>> 2.19.1.6.gb485710b
> >>>>>
> >>>>>
> >>>
>


* Re: [PATCH net-next 0/5] virtio-net: support dynamic coalescing moderation
       [not found]       ` <753ac6da-f7f1-4acb-9184-e59271809c6d@linux.alibaba.com>
@ 2023-11-02  4:34         ` Jason Wang
  0 siblings, 0 replies; 15+ messages in thread
From: Jason Wang @ 2023-11-02  4:34 UTC (permalink / raw)
  To: Heng Qi
  Cc: Xuan Zhuo, Liu, Yujie, Jesper Dangaard Brouer, Michael S. Tsirkin,
	netdev, John Fastabend, Alexei Starovoitov, virtualization,
	Eric Dumazet, Simon Horman, Jakub Kicinski, Paolo Abeni,
	David S. Miller

On Wed, Nov 1, 2023 at 5:38 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
>
>
> > On 2023/10/25 9:18 AM, Jason Wang wrote:
> > On Tue, Oct 24, 2023 at 8:03 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>
> >>
> >> On 2023/10/12 4:29 PM, Jason Wang wrote:
> >>> On Thu, Oct 12, 2023 at 3:44 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>>> Now, virtio-net already supports per-queue moderation parameter
> >>>> setting. Based on this, we use the netdim library of linux to support
> >>>> dynamic coalescing moderation for virtio-net.
> >>>>
> >>>> Due to hardware scheduling issues, we only tested rx dim.
> >>> Do you have PPS numbers? And TX numbers are also important as the
> >>> throughput could be misleading due to various reasons.
> >> Hi Jason!
> >>
> >> The comparison of rx netdim performance is as follows:
> >> (the backend supporting tx dim is not yet ready)
> > Thanks a lot for the numbers.
> >
> > I'd still expect the TX results, as I did play with tx interrupt coalescing
>
> Hi, Jason.
>
> Sorry for the late reply to this! Our team has been blocked by other
> priorities the past few days.
>
> For tx dim, we use a fixed empirical value internally. This value
> performs better overall than manually adjusting the tx timer register,
> so I don't have tx numbers. :( In the short term I will no longer try
> to push the [5/5] patch for tx dim and will return -EOPNOTSUPP for it
> instead, sorry for this.
>
> > about 10 years ago.
> >
> > I will start to review the series but let's try to have some TX numbers as well.
> >
> > Btw, it would be more convenient to have a raw PPS benchmark. E.g. you
>
> I got some raw pps data using pktgen from the kernel's samples/pktgen:
>
> 1. tx cmd
> ./pktgen_sample02_multiqueue.sh -i eth1 -s 44 -d ${dst_ip} -m ${dst_mac}
> -t 8 -f 0 -n 0
>
> This uses 8 kpktgend threads to inject data into eth1.
>
> 2. Rx side loads a simple xdp prog which drops all received udp packets.
>
> 3. Data
> pps: ~1000w

For "w" did you mean 10 million? Looks too huge to me?

> rx dim off: cpu idle= ~35%
> rx dim on: cpu idle= ~76%

This looks promising.

Thanks

>
> Thanks!
>
> > can try to use a software or hardware packet generator.
> >
> > Thanks
> >
> >>
> >> I. Sockperf UDP
> >> =================================================
> >> 1. Env
> >> rxq_0 is affinity to cpu_0
> >>
> >> 2. Cmd
> >> client:  taskset -c 0 sockperf tp -p 8989 -i $IP -t 10 -m 16B
> >> server: taskset -c 0 sockperf sr -p 8989
> >>
> >> 3. Result
> >> dim off: 1143277.00 rxpps, throughput 17.844 MBps, cpu is 100%.
> >> dim on: 1124161.00 rxpps, throughput 17.610 MBps, cpu is 83.5%.
> >> =================================================
> >>
> >>
> >> II. Redis
> >> =================================================
> >> 1. Env
> >> There are 8 rxqs and rxq_i is affinity to cpu_i.
> >>
> >> 2. Result
> >> When all cpus are 100%, ops/sec of memtier_benchmark client is
> >> dim off:   978437.23
> >> dim on: 1143638.28
> >> =================================================
> >>
> >>
> >> III. Nginx
> >> =================================================
> >> 1. Env
> >> There are 8 rxqs and rxq_i is affinity to cpu_i.
> >>
> >> 2. Result
> >> When all cpus are 100%, requests/sec of wrk client is
> >> dim off:   877931.67
> >> dim on: 1019160.31
> >> =================================================
> >>
> >> Thanks!
> >>
> >>> Thanks
> >>>
> >>>> @Test env
> >>>> rxq0 has affinity to cpu0.
> >>>>
> >>>> @Test cmd
> >>>> client: taskset -c 0 sockperf tp -i ${IP} -t 30 --tcp -m ${msg_size}
> >>>> server: taskset -c 0 sockperf sr --tcp
> >>>>
> >>>> @Test res
> >>>> The second column is the ratio of the result returned by client
> >>>> when rx dim is enabled to the result returned by client when
> >>>> rx dim is disabled.
> >>>>           --------------------------------------
> >>>>           | msg_size |  rx_dim=on / rx_dim=off |
> >>>>           --------------------------------------
> >>>>           |   14B    |         + 3%            |
> >>>>           --------------------------------------
> >>>>           |   100B   |         + 16%           |
> >>>>           --------------------------------------
> >>>>           |   500B   |         + 25%           |
> >>>>           --------------------------------------
> >>>>           |   1400B  |         + 28%           |
> >>>>           --------------------------------------
> >>>>           |   2048B  |         + 22%           |
> >>>>           --------------------------------------
> >>>>           |   4096B  |         + 5%            |
> >>>>           --------------------------------------
> >>>>
> >>>> ---
> >>>> This patch set was part of the previous netdim patch set[1].
> >>>> [1] was split into a merged bugfix set[2] and the current set.
> >>>> The previous relevant commentators have been Cced.
> >>>>
> >>>> [1] https://lore.kernel.org/all/20230811065512.22190-1-hengqi@linux.alibaba.com/
> >>>> [2] https://lore.kernel.org/all/cover.1696745452.git.hengqi@linux.alibaba.com/
> >>>>
> >>>> Heng Qi (5):
> >>>>     virtio-net: returns whether napi is complete
> >>>>     virtio-net: separate rx/tx coalescing moderation cmds
> >>>>     virtio-net: extract virtqueue coalescing cmd for reuse
> >>>>     virtio-net: support rx netdim
> >>>>     virtio-net: support tx netdim
> >>>>
> >>>>    drivers/net/virtio_net.c | 394 ++++++++++++++++++++++++++++++++-------
> >>>>    1 file changed, 322 insertions(+), 72 deletions(-)
> >>>>
> >>>> --
> >>>> 2.19.1.6.gb485710b
> >>>>
> >>>>
> >>
>


end of thread, other threads:[~2023-11-02  4:34 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <cover.1697093455.git.hengqi@linux.alibaba.com>
2023-10-12  8:29 ` [PATCH net-next 0/5] virtio-net: support dynamic coalescing moderation Jason Wang
2023-10-13  1:53   ` Jason Wang
     [not found]   ` <d215566f-8185-463b-aa0b-5925f2a0853c@linux.alibaba.com>
2023-10-25  1:18     ` Jason Wang
2023-10-25  5:53       ` Michael S. Tsirkin
     [not found]         ` <d3b9e9e8-1ef4-48ac-8a2f-4fa647ae4372@linux.alibaba.com>
2023-11-02  4:33           ` Jason Wang
     [not found]       ` <753ac6da-f7f1-4acb-9184-e59271809c6d@linux.alibaba.com>
2023-11-02  4:34         ` Jason Wang
     [not found] ` <dc171e2288d2755b1805afde6b394d2d443a134d.1697093455.git.hengqi@linux.alibaba.com>
     [not found]   ` <20231013181148.3fd252dc@kernel.org>
     [not found]     ` <06d90cc8-ccc0-4b2f-ad42-2db4a6fb229f@linux.alibaba.com>
2023-10-16  7:51       ` [PATCH net-next 2/5] virtio-net: separate rx/tx coalescing moderation cmds Michael S. Tsirkin
     [not found] ` <ca2cef5c582bea958e300b39eb508d08675d1106.1697093455.git.hengqi@linux.alibaba.com>
2023-10-25  2:43   ` [PATCH net-next 1/5] virtio-net: returns whether napi is complete Jason Wang
     [not found] ` <5b99db97dd4b26636f306f70c8158d9d3297b4a0.1697093455.git.hengqi@linux.alibaba.com>
2023-10-25  3:03   ` [PATCH net-next 3/5] virtio-net: extract virtqueue coalescing cmd for reuse Jason Wang
     [not found] ` <b4656b1a14fea432bf8493a7e2f1976c08f2e63c.1697093455.git.hengqi@linux.alibaba.com>
2023-10-25  3:34   ` [PATCH net-next 4/5] virtio-net: support rx netdim Jason Wang
     [not found]     ` <3bc31a3b-a022-4816-a854-7f6b41d2e351@linux.alibaba.com>
2023-11-02  4:31       ` Jason Wang
     [not found] ` <ef5d159875745040e406473bd5c03e9875742ff5.1697093455.git.hengqi@linux.alibaba.com>
2023-10-25  3:35   ` [PATCH net-next 5/5] virtio-net: support tx netdim Jason Wang
2023-10-25  5:50     ` Michael S. Tsirkin
2023-10-25  5:49 ` [PATCH net-next 0/5] virtio-net: support dynamic coalescing moderation Michael S. Tsirkin
     [not found]   ` <707be7fa-3bb7-46c5-bb34-ef2900fe473f@linux.alibaba.com>
2023-11-01 10:44     ` Michael S. Tsirkin
