From: Leon Romanovsky <leon@kernel.org>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: linux-rdma@vger.kernel.org,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v1] rdma: Enable ib_alloc_cq to spread work over a device's comp_vectors
Date: Thu, 25 Jul 2019 16:17:15 +0300
Message-ID: <20190725131715.GF4674@mtr-leonro.mtl.com>
In-Reply-To: <8AC10EC7-5203-4D82-A455-2589DA6ADB9D@oracle.com>

On Wed, Jul 24, 2019 at 10:01:36AM -0400, Chuck Lever wrote:
> Hi Leon, thanks for taking a look. Responses below.
>
>
> > On Jul 24, 2019, at 1:47 AM, Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Tue, Jul 23, 2019 at 03:13:37PM -0400, Chuck Lever wrote:
> >> Send and Receive completion is handled on a single CPU selected at
> >> the time each Completion Queue is allocated. Typically this is when
> >> an initiator instantiates an RDMA transport, or when a target
> >> accepts an RDMA connection.
> >>
> >> Some ULPs cannot open a connection per CPU to spread completion
> >> workload across available CPUs. For these ULPs, allow the RDMA core
> >> to select a completion vector based on the device's complement of
> >> available comp_vecs.
> >>
> >> When a ULP elects to use RDMA_CORE_ANY_COMPVEC, if multiple CPUs are
> >> available, a different CPU will be selected for each Completion
> >> Queue. For the moment, a simple round-robin mechanism is used.
> >>
> >> Suggested-by: Håkon Bugge <haakon.bugge@oracle.com>
> >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> >
> > It makes me wonder why we need comp_vector as an argument to
> > ib_alloc_cq at all. From what I see, callers either internally
> > implement logic similar to what is proposed here, or they don't
> > care (they set 0).
>
> The goal of this patch is to deduplicate that "similar logic".
> Callers that implement this logic already can use
> RDMA_CORE_ANY_COMPVEC and get rid of their own copy.

Can you please send the removal patches together with this API
proposal? That will ensure that ULP authors notice the change, and
that we don't end up with a special function used by only one ULP.
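
For illustration, a conversion at a ULP call site could look roughly
like this (my_pick_vector() is a hypothetical stand-in for the
per-ULP spreading logic being removed; only RDMA_CORE_ANY_COMPVEC
comes from the patch):

	/* Before: the ULP rolls its own comp_vector selection. */
	cq = ib_alloc_cq(dev, ctx, nr_cqe, my_pick_vector(dev),
			 IB_POLL_WORKQUEUE);

	/* After: defer the selection to the RDMA core. */
	cq = ib_alloc_cq(dev, ctx, nr_cqe, RDMA_CORE_ANY_COMPVEC,
			 IB_POLL_WORKQUEUE);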

>
>
> > Can we enable this comp_vector selection for everyone and simplify our API?
>
> We could create a new CQ allocation API that does not take a
> comp vector. That might be cleaner than passing in a -1.

+1
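
Something along these lines, as a sketch only (the wrapper name
ib_alloc_cq_any is illustrative, not taken from the patch):

static inline struct ib_cq *ib_alloc_cq_any(struct ib_device *dev,
					    void *private, int nr_cqe,
					    enum ib_poll_context poll_ctx)
{
	/* Let the core spread CQs over the device's comp_vectors. */
	return ib_alloc_cq(dev, private, nr_cqe,
			   RDMA_CORE_ANY_COMPVEC, poll_ctx);
}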

>
> But I think some ULPs still want to use the existing API to
> allocate one CQ for each of a device's comp vectors.

That could be kept as a "legacy implementation", which is probably
not really needed, but I don't know those use cases well enough to say.

>
>
> >> ---
> >> drivers/infiniband/core/cq.c             |   20 +++++++++++++++++++-
> >> include/rdma/ib_verbs.h                  |    3 +++
> >> net/sunrpc/xprtrdma/svc_rdma_transport.c |    6 ++++--
> >> net/sunrpc/xprtrdma/verbs.c              |    5 ++---
> >> 4 files changed, 28 insertions(+), 6 deletions(-)
> >>
> >> Jason-
> >>
> >> If this patch is acceptable to all, then I would expect you to take
> >> it through the RDMA tree.
> >>
> >>
> >> diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
> >> index 7c599878ccf7..a89d549490c4 100644
> >> --- a/drivers/infiniband/core/cq.c
> >> +++ b/drivers/infiniband/core/cq.c
> >> @@ -165,12 +165,27 @@ static void ib_cq_completion_workqueue(struct ib_cq *cq, void *private)
> >> 	queue_work(cq->comp_wq, &cq->work);
> >> }
> >>
> >> +/*
> >> + * Attempt to spread ULP completion queues over a device's completion
> >> + * vectors so that all available CPU cores can help service the device's
> >> + * interrupt workload. This mechanism may be improved at a later point
> >> + * to dynamically take into account the system's actual workload.
> >> + */
> >> +static int ib_get_comp_vector(struct ib_device *dev)
> >> +{
> >> +	static atomic_t cv;
> >> +
> >> +	if (dev->num_comp_vectors > 1)
> >> +		return atomic_inc_return(&cv) % dev->num_comp_vectors;
> >
> > It is worth taking num_online_cpus() into account,
>
> I don't believe it is.
>
> Håkon has convinced me that assigning interrupt vectors to
> CPUs is in the domain of user space (i.e., driven by policy).
> In addition, one assumes that taking a CPU offline properly
> will also involve re-assigning interrupt vectors that point
> to that core.
>
> In any event, this code can be modified after it is merged
> if it is necessary to accommodate such requirements.

It is a simple change, and it is worth making now, while we have
interested parties involved here.
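
As a sketch, the simplest form of such a change might be the
following (assuming capping at num_online_cpus() is the intended
semantics):

static int ib_get_comp_vector(struct ib_device *dev)
{
	static atomic_t cv;
	/* Don't select vectors beyond the number of online CPUs. */
	int n = min_t(int, dev->num_comp_vectors, num_online_cpus());

	if (n > 1)
		return atomic_inc_return(&cv) % n;
	return 0;
}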

>
> --
> Chuck Lever
>
>
>

Thread overview: 6+ messages
2019-07-23 19:13 [PATCH v1] rdma: Enable ib_alloc_cq to spread work over a device's comp_vectors Chuck Lever
2019-07-24  5:47 ` Leon Romanovsky
2019-07-24 14:01   ` Chuck Lever
2019-07-25 13:17     ` Leon Romanovsky [this message]
2019-07-25 14:03       ` Chuck Lever
2019-07-28 15:05         ` Leon Romanovsky
