Re: [virtio-comment] Re: [PATCH RESEND] virtio-net: support setting coalescing params for multiple vqs

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: Heng Qi <hengqi@linux.alibaba.com>,
	virtio-comment@lists.oasis-open.org,
	Yuri Benditovich <yuri.benditovich@daynix.com>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	Alvaro Karsz <alvaro.karsz@solid-run.com>,
	Parav Pandit <parav@nvidia.com>
Subject: Re: [virtio-comment] Re: [PATCH RESEND] virtio-net: support setting coalescing params for multiple vqs
Date: Tue, 26 Dec 2023 04:18:17 -0500	[thread overview]
Message-ID: <20231226040940-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CACGkMEtCtFvY0rMO0nvpBRL6qQa94jZRJC66YTyRHDoDbSgJ_g@mail.gmail.com>

On Tue, Dec 26, 2023 at 11:32:12AM +0800, Jason Wang wrote:
> On Fri, Dec 22, 2023 at 2:45 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >
> >
> >
> > 在 2023/12/22 上午10:40, Jason Wang 写道:
> > > On Thu, Dec 21, 2023 at 12:55 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> > >>
> > >>
> > >> 在 2023/12/21 上午11:37, Jason Wang 写道:
> > >>> On Wed, Dec 20, 2023 at 10:40 PM Heng Qi <hengqi@linux.alibaba.com> wrote:
> > >>>> Currently, when each time the driver attempts to update the coalescing parameters
> > >>>> for a vq, it needs to kick the device and wait for the ctrlq response to return.
> > >>>>
> > >>>> If a command can only update one vq parameters, when the parameters are updated
> > >>>> frequently (such as netdim), ctrlq on the device is kicked frequently, which will
> > >>>> increase the device CPU scheduling overhead, and the number and overhead of device
> > >>>> DMA will also increase.
> > >>>>
> > >>>> Merging multiple vq updated parameters into one command can effectively reduce
> > >>>> the number of kick devices and device DMA times.
> > >>>>
> > >>>> Test results show that this greatly improves the efficiency of the ctrlq in
> > >>>> responding to multiple vq coalescing parameter updates issued by the driver.
> > >>> So netdim is per virtqueue, to make use of this, you need to batch the
> > >>> netdim requests first.
> > >> Yes. I think the code looks more intuitive.
> > >> Please view the following PoC test code combined with the current code
> > >> of upstream virtio netdim
> > >>
> > >> +struct ctrl_coal_vq_entry {
> > >> +       u16 vqn;
> > >> +       u16 reserved;
> > >> +       int usecs;
> > >> +       int packets;
> > >> +};
> > >> +
> > >> +struct virtio_net_ctrl_mrg_coal_vq {
> > >> +       int num_entries;
> > >> +       struct ctrl_coal_vq_entry coal_vqs[];
> > >> +};
> > >> +
> > >>
> > >> @@ -313,12 +335,18 @@ struct virtnet_info {
> > >>
> > >>           struct control_buf *ctrl;
> > >>
> > >> +       struct virtio_net_ctrl_mrg_coal_vq *ctrl_coal_vqs;
> > >> +
> > >>
> > >>
> > >> @@ -4390,6 +4586,7 @@ static int virtnet_alloc_queues(struct
> > >> virtnet_info *vi)
> > >>           if (!vi->rq)
> > >>                   goto err_rq;
> > >>
> > >> +       vi->ctrl_coal_vqs = kzalloc(sizeof(struct
> > >> virtio_net_ctrl_mrg_coal_vq) + vi->max_queue_pairs * sizeof(struct
> > >> ctrl_coal_vq_entry), GFP_KERNEL);
> > >>
> > >>
> > >> +       struct scatterlist sg;
> > >> +       for (i = 0; i < vi->curr_queue_pairs; i++) {
> > >> +                vi->ctrl_coal_vqs->coal_vqs[i].vqn = rxq2vq(i);
> > >> +                vi->ctrl_coal_vqs->coal_vqs[i].usecs = 8;
> > >> +                vi->ctrl_coal_vqs->coal_vqs[i].packets = 16;
> > >> +        }
> > >> +        vi->ctrl_coal_vqs->num_entries = i;
> > >> +        sg_init_one(&sg, vi->ctrl_coal_vqs, sizeof(struct
> > >> virtio_net_ctrl_mrg_coal_vq) + vi->ctrl_coal_vqs->num_entries *
> > >> sizeof(struct ctrl_coal_vq_entry));
> > >> +        vi->req_num++;
> > >> +        if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL,
> > >> +                              VIRTIO_NET_CTRL_NOTF_COAL_*VQS*_SET, &sg))
> > >>
> > >>    #define VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET               3
> > >> +#define VIRTIO_NET_CTRL_NOTF_COAL_VQS_SET              4
> > >>
> > >>>    And if you do that, you can batch the commands
> > >>> as well.
> > >>
> > >> Batch commands we have tried:
> > >>
> > >> +               for (i = 0; i < vi->curr_queue_pairs; i++) {
> > >> +                       err = virtnet_pre_send_cmd(vi, &vi->vq_ctrls[i],
> > >> 8, 16);
> > >> +               }
> > >> +
> > >> +               if (unlikely(!virtqueue_kick(vi->cvq)))
> > >> +                       return vi->ctrl->status == VIRTIO_NET_OK;
> > >> +
> > >> +               for (i = 0; i < vi->curr_queue_pairs; i++) {
> > >> +                       while (!virtqueue_is_broken(vi->cvq) &&
> > >> !virtqueue_get_buf(vi->cvq, &tmp))
> > >> +                               cpu_relax();
> > >> +               }
> > >> +       }
> > >>
> > >>
> > >> +
> > >> +static bool virtnet_pre_send_cmd(struct virtnet_info *vi, struct
> > >> vq_coal *vq_ctrls,
> > >> +                                u32 max_usecs, u32 max_packets)
> > >> +{
> > >> +       struct scatterlist hdr, out, stat;
> > >> +       struct scatterlist *sgs[4];
> > >> +       unsigned out_num = 0;
> > >> +       int ret;
> > >> +
> > >> +       BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ));
> > >> +
> > >> +       vq_ctrls->hdr.class = VIRTIO_NET_CTRL_NOTF_COAL;
> > >> +       vq_ctrls->hdr.cmd = VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET;
> > >> +       sg_init_one(&hdr, &vq_ctrls->hdr, sizeof(vq_ctrls->hdr));
> > >> +       sgs[out_num++] = &hdr;
> > >> +
> > >> +       vq_ctrls->coal_vq.vqn = cpu_to_le16(0);
> > >> +       vq_ctrls->coal_vq.coal.max_usecs = cpu_to_le32(max_usecs);
> > >> +       vq_ctrls->coal_vq.coal.max_packets = cpu_to_le32(max_packets);
> > >> +       sg_init_one(&out, &vq_ctrls->ho, sizeof(vq_ctrls->ho));
> > >> +
> > >> +       if (&out)
> > >> +               sgs[out_num++] = &out;
> > >> +
> > >> +       vq_ctrls->status = ~0;
> > >> +       sg_init_one(&stat, &vq_ctrls->status, sizeof(vq_ctrls->status));
> > >> +       sgs[out_num] = &stat;
> > >> +
> > >> +       BUG_ON(out_num + 1 > ARRAY_SIZE(sgs));
> > >> +       ret = virtqueue_add_sgs(vi->cvq, sgs, out_num, 1, vi, GFP_ATOMIC);
> > >> +       if (ret < 0) {
> > >> +               dev_warn(&vi->vdev->dev,
> > >> +                        "Failed to add sgs for command vq: %d\n.", ret);
> > >> +               return false;
> > >> +       }
> > >>
> > >>
> > >> Overall, batch reqs are sufficient. Because the current major overhead
> > >> is the number of DMAs.
> > >> For example, for a device with 256 queues,
> > >>
> > >> For the current upstream code, the overhead is 256 kicks + 256*8 DMA times.
> > >> The overhead of batch cmds is 1 kick + 256*8 DMA times.
> > > This can furtherly optimized by using IN_ORDER. The DMA can be batched as well.
> >
> > DMA batch reduces the number of DMA requests, but not the actual number
> > of DMAs.
> 
> Well, if we do IN_ORDER, then
> 
> 1 for avail idx.
> 
> At most 2 DMA is needed for all descriptors.
> 
> Driver can allocate an array of cvq commands. So CVQ commands could be
> fetched at most 2 DMA.
> 
> etc.

For most users, all this happens maybe once per VM start.  It's not
reasonable to start micro-optimizing this until it is taking so much
time it slows VM start measureably.  With each DMA taking at most
of an extra microsecond I just do not see the benefit, even with 1000s
of DMAs, this will take at most milliseconds.


> > For out DPU, how much to batch and when to end the batch also need to be
> > evaluated in detail.
> >
> > >
> > > Or even using packed virtqueue.
> > >
> > >> The overhead of batch reqs is 1 kick + 8 DMA times.
> > > If this is a real issue for your architecture, we should not make it
> > > only work for coalescing. Any per vq tweaking may suffer from that.
> >
> > YES.
> >
> > > E.g if you want to do ARFS in the future. So it should a e.g a
> > > container command for cvq instead of just for coalescing command
> > > itself.
> > >
> > > We need to make sure if the existing facility is sufficient or not.
> >
> > We considered to update the generic ctrlq batch command. But so far we
> > have only seen
> > this requirement for netdim and aRFS.
> 
> Two use cases should be sufficient to justify.
> 
> We do have others, for example the vlan/mac filters.
> 
> 
> > At the same time, aRFS has not yet
> > entered the community.
> > We will try to add a batch req interface for it and it may have a
> > dedicated flow vq to ensure that
> > the number of command issuances larger than netdim is responded to in time.
> > So we temporarily plan to advance the batch of coalescing parameter
> > requests first.
> 
> Thanks


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/

next prev parent reply	other threads:[~2023-12-26  9:18 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-20 14:40 [virtio-comment] [PATCH RESEND] virtio-net: support setting coalescing params for multiple vqs Heng Qi
2023-12-21  3:37 ` [virtio-comment] " Jason Wang
2023-12-21  4:55   ` Heng Qi
2023-12-22  2:40     ` Jason Wang
2023-12-22  6:45       ` Heng Qi
2023-12-22  8:20         ` Michael S. Tsirkin
2023-12-26  2:17           ` Heng Qi
2023-12-26  3:32         ` Jason Wang
2023-12-26  9:18           ` Michael S. Tsirkin [this message]
2023-12-27  2:32             ` Jason Wang
2023-12-22  8:26   ` Michael S. Tsirkin
2023-12-21  6:56 ` Michael S. Tsirkin
2023-12-21  7:01   ` Heng Qi
2023-12-21  7:08     ` Michael S. Tsirkin
2023-12-21  7:19       ` Heng Qi
2023-12-21  7:48         ` Heng Qi
2023-12-22  8:24         ` Michael S. Tsirkin
2023-12-26  2:04           ` Heng Qi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231226040940-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=alvaro.karsz@solid-run.com \
    --cc=hengqi@linux.alibaba.com \
    --cc=jasowang@redhat.com \
    --cc=parav@nvidia.com \
    --cc=virtio-comment@lists.oasis-open.org \
    --cc=xuanzhuo@linux.alibaba.com \
    --cc=yuri.benditovich@daynix.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.