public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
From: Leon Romanovsky <leon@kernel.org>
To: Konstantin Taranov <kotaranov@microsoft.com>
Cc: Konstantin Taranov <kotaranov@linux.microsoft.com>,
	Shiraz Saleem <shirazsaleem@microsoft.com>,
	Long Li <longli@microsoft.com>, "jgg@ziepe.ca" <jgg@ziepe.ca>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH rdma-next v2 1/1] RDMA/mana: Provide a modern CQ creation interface
Date: Wed, 4 Mar 2026 17:59:13 +0200	[thread overview]
Message-ID: <20260304155913.GH12611@unreal> (raw)
In-Reply-To: <DU8PR83MB0975A4114E1CFE0B6BFA2DBBB47CA@DU8PR83MB0975.EURPRD83.prod.outlook.com>

On Wed, Mar 04, 2026 at 02:06:21PM +0000, Konstantin Taranov wrote:
> > > > > The uverbs CQ creation UAPI allows users to supply their own umem
> > > > > for a
> > > > CQ.
> > > > > Update mana to support this workflow while preserving support for
> > > > > creating umem through the legacy interface.
> > > > >
> > > > > To support RDMA objects that own umem, extend
> > > > mana_ib_create_queue()
> > > > > to return the umem to the caller and do not allocate umem if it
> > > > > was allocted by the caller.
> > > > >
> > > > > Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> > > > > ---
> > > > > v2: It is a rework of the patch proposed by Leon
> > > >
> > > > I am curious to know what changes were introduced?
> > >
> > > It is like your patch, but I kept get_umem in mana_ib_create_queue and
> > > introduced ownership.
> > > It made the code simpler and extendable. In your proposal, it was hard
> > > to track the changes and it led to double free of the umem. With new
> > > mana_ib_create_queue() it is clear from the caller what happens and no
> > > special changes in the caller required.
> > >
> > > >
> > > > >  drivers/infiniband/hw/mana/cq.c      | 125 +++++++++++++++++----------
> > > > >  drivers/infiniband/hw/mana/device.c  |   1 +
> > > > >  drivers/infiniband/hw/mana/main.c    |  30 +++++--
> > > > >  drivers/infiniband/hw/mana/mana_ib.h |   5 +-
> > > > >  drivers/infiniband/hw/mana/qp.c      |   5 +-
> > > > >  drivers/infiniband/hw/mana/wq.c      |   3 +-
> > > > >  6 files changed, 111 insertions(+), 58 deletions(-)
> > > > >
> > > > > diff --git a/drivers/infiniband/hw/mana/cq.c
> > > > > b/drivers/infiniband/hw/mana/cq.c index b2749f971..fa951732a
> > > > > 100644
> > > > > --- a/drivers/infiniband/hw/mana/cq.c
> > > > > +++ b/drivers/infiniband/hw/mana/cq.c
> > > > > @@ -8,12 +8,8 @@
> > > > >  int mana_ib_create_cq(struct ib_cq *ibcq, const struct
> > > > > ib_cq_init_attr
> > > > *attr,
> > > > >  		      struct uverbs_attr_bundle *attrs)  {
> > > > > -	struct ib_udata *udata = &attrs->driver_udata;
> > > > >  	struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
> > > > > -	struct mana_ib_create_cq_resp resp = {};
> > > > > -	struct mana_ib_ucontext *mana_ucontext;
> > > > >  	struct ib_device *ibdev = ibcq->device;
> > > > > -	struct mana_ib_create_cq ucmd = {};
> > > > >  	struct mana_ib_dev *mdev;
> > > > >  	bool is_rnic_cq;
> > > > >  	u32 doorbell;
> > > > > @@ -26,48 +22,91 @@ int mana_ib_create_cq(struct ib_cq *ibcq,
> > > > > const
> > > > struct ib_cq_init_attr *attr,
> > > > >  	cq->cq_handle = INVALID_MANA_HANDLE;
> > > > >  	is_rnic_cq = mana_ib_is_rnic(mdev);
> > > > >
> > > > > -	if (udata) {
> > > > > -		if (udata->inlen < offsetof(struct mana_ib_create_cq, flags))
> > > > > -			return -EINVAL;
> > > > > -
> > > > > -		err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd),
> > > > udata->inlen));
> > > > > -		if (err) {
> > > > > -			ibdev_dbg(ibdev, "Failed to copy from udata for
> > > > create cq, %d\n", err);
> > > > > -			return err;
> > > > > -		}
> > > > > +	if (attr->cqe > U32_MAX / COMP_ENTRY_SIZE / 2 + 1)
> > > > > +		return -EINVAL;
> > > >
> > > > We are talking about kernel verbs. ULPs are not designed to provide
> > > > attributes and recover from random driver limitations.
> > >
> > > I understand, but there was an ask before to add that check as some
> > > automated code verifier detected overflow. So if we remote it, I guess
> > > we get again an ask to fix the potential overflow.
> > >
> > > >
> > > > >
> > > > > -		if ((!is_rnic_cq && attr->cqe > mdev-
> > > > >adapter_caps.max_qp_wr) ||
> > > > > -		    attr->cqe > U32_MAX / COMP_ENTRY_SIZE) {
> > > > > -			ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr-
> > > > >cqe);
> > > > > -			return -EINVAL;
> > > > > -		}
> > > > > +	buf_size = MANA_PAGE_ALIGN(roundup_pow_of_two(attr->cqe *
> > > > COMP_ENTRY_SIZE));
> > > > > +	cq->cqe = buf_size / COMP_ENTRY_SIZE;
> > > > > +	err = mana_ib_create_kernel_queue(mdev, buf_size, GDMA_CQ,
> > > > &cq->queue);
> > > > > +	if (err) {
> > > > > +		ibdev_dbg(ibdev, "Failed to create kernel queue for create cq,
> > > > %d\n", err);
> > > > > +		return err;
> > > > > +	}
> > > > > +	doorbell = mdev->gdma_dev->doorbell;
> > > > >
> > > > > -		cq->cqe = attr->cqe;
> > > > > -		err = mana_ib_create_queue(mdev, ucmd.buf_addr, cq->cqe
> > > > * COMP_ENTRY_SIZE,
> > > > > -					   &cq->queue);
> > > > > +	if (is_rnic_cq) {
> > > > > +		err = mana_ib_gd_create_cq(mdev, cq, doorbell);
> > > > >  		if (err) {
> > > > > -			ibdev_dbg(ibdev, "Failed to create queue for create
> > > > cq, %d\n", err);
> > > > > -			return err;
> > > > > +			ibdev_dbg(ibdev, "Failed to create RNIC cq, %d\n",
> > > > err);
> > > > > +			goto err_destroy_queue;
> > > > >  		}
> > > > >
> > > > > -		mana_ucontext = rdma_udata_to_drv_context(udata, struct
> > > > mana_ib_ucontext,
> > > > > -							  ibucontext);
> > > > > -		doorbell = mana_ucontext->doorbell;
> > > > > -	} else {
> > > > > -		if (attr->cqe > U32_MAX / COMP_ENTRY_SIZE / 2 + 1) {
> > > > > -			ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr-
> > > > >cqe);
> > > > > -			return -EINVAL;
> > > > > -		}
> > > > > -		buf_size = MANA_PAGE_ALIGN(roundup_pow_of_two(attr-
> > > > >cqe * COMP_ENTRY_SIZE));
> > > > > -		cq->cqe = buf_size / COMP_ENTRY_SIZE;
> > > > > -		err = mana_ib_create_kernel_queue(mdev, buf_size,
> > > > GDMA_CQ, &cq->queue);
> > > > > +		err = mana_ib_install_cq_cb(mdev, cq);
> > > > >  		if (err) {
> > > > > -			ibdev_dbg(ibdev, "Failed to create kernel queue for
> > > > create cq, %d\n", err);
> > > > > -			return err;
> > > > > +			ibdev_dbg(ibdev, "Failed to install cq callback, %d\n",
> > > > err);
> > > > > +			goto err_destroy_rnic_cq;
> > > > >  		}
> > > > > -		doorbell = mdev->gdma_dev->doorbell;
> > > > >  	}
> > > > >
> > > > > +	spin_lock_init(&cq->cq_lock);
> > > > > +	INIT_LIST_HEAD(&cq->list_send_qp);
> > > > > +	INIT_LIST_HEAD(&cq->list_recv_qp);
> > > > > +
> > > > > +	return 0;
> > > > > +
> > > > > +err_destroy_rnic_cq:
> > > > > +	mana_ib_gd_destroy_cq(mdev, cq);
> > > > > +err_destroy_queue:
> > > > > +	mana_ib_destroy_queue(mdev, &cq->queue);
> > > > > +
> > > > > +	return err;
> > > > > +}
> > > > > +
> > > > > +int mana_ib_create_user_cq(struct ib_cq *ibcq, const struct
> > > > ib_cq_init_attr *attr,
> > > > > +			   struct uverbs_attr_bundle *attrs) {
> > > > > +	struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
> > > > > +	struct ib_udata *udata = &attrs->driver_udata;
> > > > > +	struct mana_ib_create_cq_resp resp = {};
> > > > > +	struct mana_ib_ucontext *mana_ucontext;
> > > > > +	struct ib_device *ibdev = ibcq->device;
> > > > > +	struct mana_ib_create_cq ucmd = {};
> > > > > +	struct mana_ib_dev *mdev;
> > > > > +	bool is_rnic_cq;
> > > > > +	u32 doorbell;
> > > > > +	int err;
> > > > > +
> > > > > +	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
> > > > > +
> > > > > +	cq->comp_vector = attr->comp_vector % ibdev->num_comp_vectors;
> > > > > +	cq->cq_handle = INVALID_MANA_HANDLE;
> > > > > +	is_rnic_cq = mana_ib_is_rnic(mdev);
> > > > > +
> > > > > +	if (udata->inlen < offsetof(struct mana_ib_create_cq, flags))
> > > > > +		return -EINVAL;
> > > > > +
> > > > > +	err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata-
> > > > >inlen));
> > > > > +	if (err) {
> > > > > +		ibdev_dbg(ibdev, "Failed to copy from udata for create cq,
> > > > %d\n", err);
> > > > > +		return err;
> > > > > +	}
> > > > > +
> > > > > +	if ((!is_rnic_cq && attr->cqe > mdev->adapter_caps.max_qp_wr) ||
> > > > > +	    attr->cqe > U32_MAX / COMP_ENTRY_SIZE)
> > > > > +		return -EINVAL;
> > > > > +
> > > > > +	cq->cqe = attr->cqe;
> > > > > +	err = mana_ib_create_queue(mdev, ucmd.buf_addr, cq->cqe *
> > > > COMP_ENTRY_SIZE,
> > > > > +				   &cq->queue, &ibcq->umem);
> > 
> > I just realized that I forgot to handle the case when ibcq->umem == NULL and
> > mana fails later after this call. I need to clean ibcq->umem in this case.
> > I will address that in v3. I am sorry.
> > 
> 
> Hi Leon,
> After re-reading the code, I see that there is no bug in v2 as the umem gets deallocated
> on failure inside the handler of UVERBS_METHOD_CQ_CREATE. I also see that you also had
> the same logic in v1. So, what is your recommendation? Leave v2 logic as is, so mana would
> immediately give ownership of umem to cq->umem, and if mana_ib_create_user_cq() fails at later stage
> it should not clean cq->umem and leave it to the caller handle (i.e., UVERBS_METHOD_CQ_CREATE)
> to clean cq->umem regardless of who created it.
> 
> Or should I make v3, where I will assign umem to cq->umem right before return 0, so that if
> mana_ib_create_user_cq() fails it does not change cq->umem at all.

My suggestion is to stick with my original patch and remove
ib_umem_release(queue->umem) from mana_ib_destroy_queue().

Thanks

> 
> > - Konstantin
> > 
> 
> 

  reply	other threads:[~2026-03-04 15:59 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-03 12:48 [PATCH rdma-next v2 1/1] RDMA/mana: Provide a modern CQ creation interface Konstantin Taranov
2026-03-04 11:05 ` Leon Romanovsky
2026-03-04 11:41   ` [EXTERNAL] " Konstantin Taranov
2026-03-04 13:23     ` Konstantin Taranov
2026-03-04 14:06       ` Konstantin Taranov
2026-03-04 15:59         ` Leon Romanovsky [this message]
2026-03-05  9:20           ` Konstantin Taranov
2026-03-11 13:29             ` Konstantin Taranov
2026-03-11 18:55               ` Leon Romanovsky
2026-03-17 14:05                 ` Konstantin Taranov
2026-03-17 16:27                   ` Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260304155913.GH12611@unreal \
    --to=leon@kernel.org \
    --cc=jgg@ziepe.ca \
    --cc=kotaranov@linux.microsoft.com \
    --cc=kotaranov@microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=longli@microsoft.com \
    --cc=shirazsaleem@microsoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox