All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: Xie Yongji <xieyongji@bytedance.com>,
	tglx@linutronix.de, hch@lst.de, linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org
Subject: Re: [PATCH v4 03/11] virtio-vdpa: Support interrupt affinity spreading mechanism
Date: Fri, 24 Mar 2023 05:12:26 -0400	[thread overview]
Message-ID: <20230324051153-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CACGkMEtH0=vr6JQrqWFZqf4p8bcgeKCr4ipqdBc9nv-st3Pfiw@mail.gmail.com>

On Fri, Mar 24, 2023 at 02:27:52PM +0800, Jason Wang wrote:
> On Thu, Mar 23, 2023 at 1:31 PM Xie Yongji <xieyongji@bytedance.com> wrote:
> >
> > To support interrupt affinity spreading mechanism,
> > this makes use of group_cpus_evenly() to create
> > an irq callback affinity mask for each virtqueue
> > of vdpa device. Then we will unify set_vq_affinity
> > callback to pass the affinity to the vdpa device driver.
> >
> > Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
> 
> Thinking hard of all the logics, I think I've found something interesting.
> 
> Commit ad71473d9c437 ("virtio_blk: use virtio IRQ affinity") tries to
> pass irq_affinity to transport specific find_vqs().  This seems a
> layer violation since driver has no knowledge of
> 
> 1) whether or not the callback is based on an IRQ
> 2) whether or not the device is a PCI or not (the details are hided by
> the transport driver)
> 3) how many vectors could be used by a device
> 
> This means the driver can't actually pass a real affinity masks so the
> commit passes a zero irq affinity structure as a hint in fact, so the
> PCI layer can build a default affinity based that groups cpus evenly
> based on the number of MSI-X vectors (the core logic is the
> group_cpus_evenly). I think we should fix this by replacing the
> irq_affinity structure with
> 
> 1) a boolean like auto_cb_spreading
> 
> or
> 
> 2) queue to cpu mapping
> 
> So each transport can do its own logic based on that. Then virtio-vDPA
> can pass that policy to VDUSE where we only need a group_cpus_evenly()
> and avoid duplicating irq_create_affinity_masks()?
> 
> Thanks

I don't really understand what you propose. Care to post a patch?
Also does it have to block this patchset or can it be done on top?

> > ---
> >  drivers/virtio/virtio_vdpa.c | 68 ++++++++++++++++++++++++++++++++++++
> >  1 file changed, 68 insertions(+)
> >
> > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > index f72696b4c1c2..f3826f42b704 100644
> > --- a/drivers/virtio/virtio_vdpa.c
> > +++ b/drivers/virtio/virtio_vdpa.c
> > @@ -13,6 +13,7 @@
> >  #include <linux/kernel.h>
> >  #include <linux/slab.h>
> >  #include <linux/uuid.h>
> > +#include <linux/group_cpus.h>
> >  #include <linux/virtio.h>
> >  #include <linux/vdpa.h>
> >  #include <linux/virtio_config.h>
> > @@ -272,6 +273,66 @@ static void virtio_vdpa_del_vqs(struct virtio_device *vdev)
> >                 virtio_vdpa_del_vq(vq);
> >  }
> >
> > +static void default_calc_sets(struct irq_affinity *affd, unsigned int affvecs)
> > +{
> > +       affd->nr_sets = 1;
> > +       affd->set_size[0] = affvecs;
> > +}
> > +
> > +static struct cpumask *
> > +create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd)
> > +{
> > +       unsigned int affvecs = 0, curvec, usedvecs, i;
> > +       struct cpumask *masks = NULL;
> > +
> > +       if (nvecs > affd->pre_vectors + affd->post_vectors)
> > +               affvecs = nvecs - affd->pre_vectors - affd->post_vectors;
> > +
> > +       if (!affd->calc_sets)
> > +               affd->calc_sets = default_calc_sets;
> > +
> > +       affd->calc_sets(affd, affvecs);
> > +
> > +       if (!affvecs)
> > +               return NULL;
> > +
> > +       masks = kcalloc(nvecs, sizeof(*masks), GFP_KERNEL);
> > +       if (!masks)
> > +               return NULL;
> > +
> > +       /* Fill out vectors at the beginning that don't need affinity */
> > +       for (curvec = 0; curvec < affd->pre_vectors; curvec++)
> > +               cpumask_setall(&masks[curvec]);
> > +
> > +       for (i = 0, usedvecs = 0; i < affd->nr_sets; i++) {
> > +               unsigned int this_vecs = affd->set_size[i];
> > +               int j;
> > +               struct cpumask *result = group_cpus_evenly(this_vecs);
> > +
> > +               if (!result) {
> > +                       kfree(masks);
> > +                       return NULL;
> > +               }
> > +
> > +               for (j = 0; j < this_vecs; j++)
> > +                       cpumask_copy(&masks[curvec + j], &result[j]);
> > +               kfree(result);
> > +
> > +               curvec += this_vecs;
> > +               usedvecs += this_vecs;
> > +       }
> > +
> > +       /* Fill out vectors at the end that don't need affinity */
> > +       if (usedvecs >= affvecs)
> > +               curvec = affd->pre_vectors + affvecs;
> > +       else
> > +               curvec = affd->pre_vectors + usedvecs;
> > +       for (; curvec < nvecs; curvec++)
> > +               cpumask_setall(&masks[curvec]);
> > +
> > +       return masks;
> > +}
> > +
> >  static int virtio_vdpa_find_vqs(struct virtio_device *vdev, unsigned int nvqs,
> >                                 struct virtqueue *vqs[],
> >                                 vq_callback_t *callbacks[],
> > @@ -282,9 +343,15 @@ static int virtio_vdpa_find_vqs(struct virtio_device *vdev, unsigned int nvqs,
> >         struct virtio_vdpa_device *vd_dev = to_virtio_vdpa_device(vdev);
> >         struct vdpa_device *vdpa = vd_get_vdpa(vdev);
> >         const struct vdpa_config_ops *ops = vdpa->config;
> > +       struct irq_affinity default_affd = { 0 };
> > +       struct cpumask *masks;
> >         struct vdpa_callback cb;
> >         int i, err, queue_idx = 0;
> >
> > +       masks = create_affinity_masks(nvqs, desc ? desc : &default_affd);
> > +       if (!masks)
> > +               return -ENOMEM;
> > +
> >         for (i = 0; i < nvqs; ++i) {
> >                 if (!names[i]) {
> >                         vqs[i] = NULL;
> > @@ -298,6 +365,7 @@ static int virtio_vdpa_find_vqs(struct virtio_device *vdev, unsigned int nvqs,
> >                         err = PTR_ERR(vqs[i]);
> >                         goto err_setup_vq;
> >                 }
> > +               ops->set_vq_affinity(vdpa, i, &masks[i]);
> >         }
> >
> >         cb.callback = virtio_vdpa_config_cb;
> > --
> > 2.20.1
> >

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

WARNING: multiple messages have this Message-ID (diff)
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: Xie Yongji <xieyongji@bytedance.com>,
	tglx@linutronix.de, hch@lst.de,
	virtualization@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 03/11] virtio-vdpa: Support interrupt affinity spreading mechanism
Date: Fri, 24 Mar 2023 05:12:26 -0400	[thread overview]
Message-ID: <20230324051153-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <CACGkMEtH0=vr6JQrqWFZqf4p8bcgeKCr4ipqdBc9nv-st3Pfiw@mail.gmail.com>

On Fri, Mar 24, 2023 at 02:27:52PM +0800, Jason Wang wrote:
> On Thu, Mar 23, 2023 at 1:31 PM Xie Yongji <xieyongji@bytedance.com> wrote:
> >
> > To support interrupt affinity spreading mechanism,
> > this makes use of group_cpus_evenly() to create
> > an irq callback affinity mask for each virtqueue
> > of vdpa device. Then we will unify set_vq_affinity
> > callback to pass the affinity to the vdpa device driver.
> >
> > Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
> 
> Thinking hard of all the logics, I think I've found something interesting.
> 
> Commit ad71473d9c437 ("virtio_blk: use virtio IRQ affinity") tries to
> pass irq_affinity to transport specific find_vqs().  This seems a
> layer violation since driver has no knowledge of
> 
> 1) whether or not the callback is based on an IRQ
> 2) whether or not the device is a PCI or not (the details are hided by
> the transport driver)
> 3) how many vectors could be used by a device
> 
> This means the driver can't actually pass a real affinity masks so the
> commit passes a zero irq affinity structure as a hint in fact, so the
> PCI layer can build a default affinity based that groups cpus evenly
> based on the number of MSI-X vectors (the core logic is the
> group_cpus_evenly). I think we should fix this by replacing the
> irq_affinity structure with
> 
> 1) a boolean like auto_cb_spreading
> 
> or
> 
> 2) queue to cpu mapping
> 
> So each transport can do its own logic based on that. Then virtio-vDPA
> can pass that policy to VDUSE where we only need a group_cpus_evenly()
> and avoid duplicating irq_create_affinity_masks()?
> 
> Thanks

I don't really understand what you propose. Care to post a patch?
Also does it have to block this patchset or can it be done on top?

> > ---
> >  drivers/virtio/virtio_vdpa.c | 68 ++++++++++++++++++++++++++++++++++++
> >  1 file changed, 68 insertions(+)
> >
> > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> > index f72696b4c1c2..f3826f42b704 100644
> > --- a/drivers/virtio/virtio_vdpa.c
> > +++ b/drivers/virtio/virtio_vdpa.c
> > @@ -13,6 +13,7 @@
> >  #include <linux/kernel.h>
> >  #include <linux/slab.h>
> >  #include <linux/uuid.h>
> > +#include <linux/group_cpus.h>
> >  #include <linux/virtio.h>
> >  #include <linux/vdpa.h>
> >  #include <linux/virtio_config.h>
> > @@ -272,6 +273,66 @@ static void virtio_vdpa_del_vqs(struct virtio_device *vdev)
> >                 virtio_vdpa_del_vq(vq);
> >  }
> >
> > +static void default_calc_sets(struct irq_affinity *affd, unsigned int affvecs)
> > +{
> > +       affd->nr_sets = 1;
> > +       affd->set_size[0] = affvecs;
> > +}
> > +
> > +static struct cpumask *
> > +create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd)
> > +{
> > +       unsigned int affvecs = 0, curvec, usedvecs, i;
> > +       struct cpumask *masks = NULL;
> > +
> > +       if (nvecs > affd->pre_vectors + affd->post_vectors)
> > +               affvecs = nvecs - affd->pre_vectors - affd->post_vectors;
> > +
> > +       if (!affd->calc_sets)
> > +               affd->calc_sets = default_calc_sets;
> > +
> > +       affd->calc_sets(affd, affvecs);
> > +
> > +       if (!affvecs)
> > +               return NULL;
> > +
> > +       masks = kcalloc(nvecs, sizeof(*masks), GFP_KERNEL);
> > +       if (!masks)
> > +               return NULL;
> > +
> > +       /* Fill out vectors at the beginning that don't need affinity */
> > +       for (curvec = 0; curvec < affd->pre_vectors; curvec++)
> > +               cpumask_setall(&masks[curvec]);
> > +
> > +       for (i = 0, usedvecs = 0; i < affd->nr_sets; i++) {
> > +               unsigned int this_vecs = affd->set_size[i];
> > +               int j;
> > +               struct cpumask *result = group_cpus_evenly(this_vecs);
> > +
> > +               if (!result) {
> > +                       kfree(masks);
> > +                       return NULL;
> > +               }
> > +
> > +               for (j = 0; j < this_vecs; j++)
> > +                       cpumask_copy(&masks[curvec + j], &result[j]);
> > +               kfree(result);
> > +
> > +               curvec += this_vecs;
> > +               usedvecs += this_vecs;
> > +       }
> > +
> > +       /* Fill out vectors at the end that don't need affinity */
> > +       if (usedvecs >= affvecs)
> > +               curvec = affd->pre_vectors + affvecs;
> > +       else
> > +               curvec = affd->pre_vectors + usedvecs;
> > +       for (; curvec < nvecs; curvec++)
> > +               cpumask_setall(&masks[curvec]);
> > +
> > +       return masks;
> > +}
> > +
> >  static int virtio_vdpa_find_vqs(struct virtio_device *vdev, unsigned int nvqs,
> >                                 struct virtqueue *vqs[],
> >                                 vq_callback_t *callbacks[],
> > @@ -282,9 +343,15 @@ static int virtio_vdpa_find_vqs(struct virtio_device *vdev, unsigned int nvqs,
> >         struct virtio_vdpa_device *vd_dev = to_virtio_vdpa_device(vdev);
> >         struct vdpa_device *vdpa = vd_get_vdpa(vdev);
> >         const struct vdpa_config_ops *ops = vdpa->config;
> > +       struct irq_affinity default_affd = { 0 };
> > +       struct cpumask *masks;
> >         struct vdpa_callback cb;
> >         int i, err, queue_idx = 0;
> >
> > +       masks = create_affinity_masks(nvqs, desc ? desc : &default_affd);
> > +       if (!masks)
> > +               return -ENOMEM;
> > +
> >         for (i = 0; i < nvqs; ++i) {
> >                 if (!names[i]) {
> >                         vqs[i] = NULL;
> > @@ -298,6 +365,7 @@ static int virtio_vdpa_find_vqs(struct virtio_device *vdev, unsigned int nvqs,
> >                         err = PTR_ERR(vqs[i]);
> >                         goto err_setup_vq;
> >                 }
> > +               ops->set_vq_affinity(vdpa, i, &masks[i]);
> >         }
> >
> >         cb.callback = virtio_vdpa_config_cb;
> > --
> > 2.20.1
> >


  reply	other threads:[~2023-03-24  9:12 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-23  5:30 [PATCH v4 00/11] VDUSE: Improve performance Xie Yongji
2023-03-23  5:30 ` [PATCH v4 01/11] lib/group_cpus: Export group_cpus_evenly() Xie Yongji
2023-04-04 18:20   ` Michael S. Tsirkin
2023-04-04 18:20     ` Michael S. Tsirkin
2023-03-23  5:30 ` [PATCH v4 02/11] vdpa: Add set/get_vq_affinity callbacks in vdpa_config_ops Xie Yongji
2023-03-23  5:30 ` [PATCH v4 03/11] virtio-vdpa: Support interrupt affinity spreading mechanism Xie Yongji
2023-03-24  6:27   ` Jason Wang
2023-03-24  6:27     ` Jason Wang
2023-03-24  9:12     ` Michael S. Tsirkin [this message]
2023-03-24  9:12       ` Michael S. Tsirkin
2023-03-28  6:12       ` Jason Wang
2023-03-28  6:12         ` Jason Wang
2023-03-28  3:03     ` Yongji Xie
2023-03-28  3:14       ` Jason Wang
2023-03-28  3:14         ` Jason Wang
2023-03-28  3:33         ` Yongji Xie
2023-03-28  3:44           ` Jason Wang
2023-03-28  3:44             ` Jason Wang
2023-03-28  4:04             ` Yongji Xie
2023-03-28  6:07               ` Jason Wang
2023-03-28  6:07                 ` Jason Wang
2023-03-23  5:30 ` [PATCH v4 04/11] vduse: Refactor allocation for vduse virtqueues Xie Yongji
2023-03-23  5:30 ` [PATCH v4 05/11] vduse: Support set_vq_affinity callback Xie Yongji
2023-03-28  6:14   ` Jason Wang
2023-03-28  6:14     ` Jason Wang
2023-03-23  5:30 ` [PATCH v4 06/11] vduse: Support get_vq_affinity callback Xie Yongji
2023-03-28  6:15   ` Jason Wang
2023-03-28  6:15     ` Jason Wang
2023-03-23  5:30 ` [PATCH v4 07/11] vduse: Add sysfs interface for irq callback affinity Xie Yongji
2023-03-28  6:16   ` Jason Wang
2023-03-28  6:16     ` Jason Wang
2023-03-23  5:30 ` [PATCH v4 08/11] vdpa: Add eventfd for the vdpa callback Xie Yongji
2023-03-28  6:17   ` Jason Wang
2023-03-28  6:17     ` Jason Wang
2023-03-23  5:30 ` [PATCH v4 09/11] vduse: Signal vq trigger eventfd directly if possible Xie Yongji
2023-03-23  5:30 ` [PATCH v4 10/11] vduse: Delay iova domain creation Xie Yongji
2023-03-23  5:30 ` [PATCH v4 11/11] vduse: Support specifying bounce buffer size via sysfs Xie Yongji

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230324051153-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=hch@lst.de \
    --cc=jasowang@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=xieyongji@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.