From: Leon Romanovsky <leon@kernel.org>
To: Konstantin Taranov <kotaranov@microsoft.com>
Cc: Konstantin Taranov <kotaranov@linux.microsoft.com>,
Shiraz Saleem <shirazsaleem@microsoft.com>,
Long Li <longli@microsoft.com>, "jgg@ziepe.ca" <jgg@ziepe.ca>,
"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH rdma-next v2 1/1] RDMA/mana: Provide a modern CQ creation interface
Date: Wed, 4 Mar 2026 17:59:13 +0200 [thread overview]
Message-ID: <20260304155913.GH12611@unreal> (raw)
In-Reply-To: <DU8PR83MB0975A4114E1CFE0B6BFA2DBBB47CA@DU8PR83MB0975.EURPRD83.prod.outlook.com>
On Wed, Mar 04, 2026 at 02:06:21PM +0000, Konstantin Taranov wrote:
> > > > > The uverbs CQ creation UAPI allows users to supply their own umem
> > > > > for a
> > > > CQ.
> > > > > Update mana to support this workflow while preserving support for
> > > > > creating umem through the legacy interface.
> > > > >
> > > > > To support RDMA objects that own umem, extend
> > > > mana_ib_create_queue()
> > > > > to return the umem to the caller and do not allocate umem if it
> > > > > was allocted by the caller.
> > > > >
> > > > > Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> > > > > ---
> > > > > v2: It is a rework of the patch proposed by Leon
> > > >
> > > > I am curious to know what changes were introduced?
> > >
> > > It is like your patch, but I kept get_umem in mana_ib_create_queue and
> > > introduced ownership.
> > > It made the code simpler and extendable. In your proposal, it was hard
> > > to track the changes and it led to double free of the umem. With new
> > > mana_ib_create_queue() it is clear from the caller what happens and no
> > > special changes in the caller required.
> > >
> > > >
> > > > > drivers/infiniband/hw/mana/cq.c | 125 +++++++++++++++++----------
> > > > > drivers/infiniband/hw/mana/device.c | 1 +
> > > > > drivers/infiniband/hw/mana/main.c | 30 +++++--
> > > > > drivers/infiniband/hw/mana/mana_ib.h | 5 +-
> > > > > drivers/infiniband/hw/mana/qp.c | 5 +-
> > > > > drivers/infiniband/hw/mana/wq.c | 3 +-
> > > > > 6 files changed, 111 insertions(+), 58 deletions(-)
> > > > >
> > > > > diff --git a/drivers/infiniband/hw/mana/cq.c
> > > > > b/drivers/infiniband/hw/mana/cq.c index b2749f971..fa951732a
> > > > > 100644
> > > > > --- a/drivers/infiniband/hw/mana/cq.c
> > > > > +++ b/drivers/infiniband/hw/mana/cq.c
> > > > > @@ -8,12 +8,8 @@
> > > > > int mana_ib_create_cq(struct ib_cq *ibcq, const struct
> > > > > ib_cq_init_attr
> > > > *attr,
> > > > > struct uverbs_attr_bundle *attrs) {
> > > > > - struct ib_udata *udata = &attrs->driver_udata;
> > > > > struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
> > > > > - struct mana_ib_create_cq_resp resp = {};
> > > > > - struct mana_ib_ucontext *mana_ucontext;
> > > > > struct ib_device *ibdev = ibcq->device;
> > > > > - struct mana_ib_create_cq ucmd = {};
> > > > > struct mana_ib_dev *mdev;
> > > > > bool is_rnic_cq;
> > > > > u32 doorbell;
> > > > > @@ -26,48 +22,91 @@ int mana_ib_create_cq(struct ib_cq *ibcq,
> > > > > const
> > > > struct ib_cq_init_attr *attr,
> > > > > cq->cq_handle = INVALID_MANA_HANDLE;
> > > > > is_rnic_cq = mana_ib_is_rnic(mdev);
> > > > >
> > > > > - if (udata) {
> > > > > - if (udata->inlen < offsetof(struct mana_ib_create_cq, flags))
> > > > > - return -EINVAL;
> > > > > -
> > > > > - err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd),
> > > > udata->inlen));
> > > > > - if (err) {
> > > > > - ibdev_dbg(ibdev, "Failed to copy from udata for
> > > > create cq, %d\n", err);
> > > > > - return err;
> > > > > - }
> > > > > + if (attr->cqe > U32_MAX / COMP_ENTRY_SIZE / 2 + 1)
> > > > > + return -EINVAL;
> > > >
> > > > We are talking about kernel verbs. ULPs are not designed to provide
> > > > attributes and recover from random driver limitations.
> > >
> > > I understand, but there was an ask before to add that check as some
> > > automated code verifier detected overflow. So if we remote it, I guess
> > > we get again an ask to fix the potential overflow.
> > >
> > > >
> > > > >
> > > > > - if ((!is_rnic_cq && attr->cqe > mdev-
> > > > >adapter_caps.max_qp_wr) ||
> > > > > - attr->cqe > U32_MAX / COMP_ENTRY_SIZE) {
> > > > > - ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr-
> > > > >cqe);
> > > > > - return -EINVAL;
> > > > > - }
> > > > > + buf_size = MANA_PAGE_ALIGN(roundup_pow_of_two(attr->cqe *
> > > > COMP_ENTRY_SIZE));
> > > > > + cq->cqe = buf_size / COMP_ENTRY_SIZE;
> > > > > + err = mana_ib_create_kernel_queue(mdev, buf_size, GDMA_CQ,
> > > > &cq->queue);
> > > > > + if (err) {
> > > > > + ibdev_dbg(ibdev, "Failed to create kernel queue for create cq,
> > > > %d\n", err);
> > > > > + return err;
> > > > > + }
> > > > > + doorbell = mdev->gdma_dev->doorbell;
> > > > >
> > > > > - cq->cqe = attr->cqe;
> > > > > - err = mana_ib_create_queue(mdev, ucmd.buf_addr, cq->cqe
> > > > * COMP_ENTRY_SIZE,
> > > > > - &cq->queue);
> > > > > + if (is_rnic_cq) {
> > > > > + err = mana_ib_gd_create_cq(mdev, cq, doorbell);
> > > > > if (err) {
> > > > > - ibdev_dbg(ibdev, "Failed to create queue for create
> > > > cq, %d\n", err);
> > > > > - return err;
> > > > > + ibdev_dbg(ibdev, "Failed to create RNIC cq, %d\n",
> > > > err);
> > > > > + goto err_destroy_queue;
> > > > > }
> > > > >
> > > > > - mana_ucontext = rdma_udata_to_drv_context(udata, struct
> > > > mana_ib_ucontext,
> > > > > - ibucontext);
> > > > > - doorbell = mana_ucontext->doorbell;
> > > > > - } else {
> > > > > - if (attr->cqe > U32_MAX / COMP_ENTRY_SIZE / 2 + 1) {
> > > > > - ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr-
> > > > >cqe);
> > > > > - return -EINVAL;
> > > > > - }
> > > > > - buf_size = MANA_PAGE_ALIGN(roundup_pow_of_two(attr-
> > > > >cqe * COMP_ENTRY_SIZE));
> > > > > - cq->cqe = buf_size / COMP_ENTRY_SIZE;
> > > > > - err = mana_ib_create_kernel_queue(mdev, buf_size,
> > > > GDMA_CQ, &cq->queue);
> > > > > + err = mana_ib_install_cq_cb(mdev, cq);
> > > > > if (err) {
> > > > > - ibdev_dbg(ibdev, "Failed to create kernel queue for
> > > > create cq, %d\n", err);
> > > > > - return err;
> > > > > + ibdev_dbg(ibdev, "Failed to install cq callback, %d\n",
> > > > err);
> > > > > + goto err_destroy_rnic_cq;
> > > > > }
> > > > > - doorbell = mdev->gdma_dev->doorbell;
> > > > > }
> > > > >
> > > > > + spin_lock_init(&cq->cq_lock);
> > > > > + INIT_LIST_HEAD(&cq->list_send_qp);
> > > > > + INIT_LIST_HEAD(&cq->list_recv_qp);
> > > > > +
> > > > > + return 0;
> > > > > +
> > > > > +err_destroy_rnic_cq:
> > > > > + mana_ib_gd_destroy_cq(mdev, cq);
> > > > > +err_destroy_queue:
> > > > > + mana_ib_destroy_queue(mdev, &cq->queue);
> > > > > +
> > > > > + return err;
> > > > > +}
> > > > > +
> > > > > +int mana_ib_create_user_cq(struct ib_cq *ibcq, const struct
> > > > ib_cq_init_attr *attr,
> > > > > + struct uverbs_attr_bundle *attrs) {
> > > > > + struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
> > > > > + struct ib_udata *udata = &attrs->driver_udata;
> > > > > + struct mana_ib_create_cq_resp resp = {};
> > > > > + struct mana_ib_ucontext *mana_ucontext;
> > > > > + struct ib_device *ibdev = ibcq->device;
> > > > > + struct mana_ib_create_cq ucmd = {};
> > > > > + struct mana_ib_dev *mdev;
> > > > > + bool is_rnic_cq;
> > > > > + u32 doorbell;
> > > > > + int err;
> > > > > +
> > > > > + mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
> > > > > +
> > > > > + cq->comp_vector = attr->comp_vector % ibdev->num_comp_vectors;
> > > > > + cq->cq_handle = INVALID_MANA_HANDLE;
> > > > > + is_rnic_cq = mana_ib_is_rnic(mdev);
> > > > > +
> > > > > + if (udata->inlen < offsetof(struct mana_ib_create_cq, flags))
> > > > > + return -EINVAL;
> > > > > +
> > > > > + err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata-
> > > > >inlen));
> > > > > + if (err) {
> > > > > + ibdev_dbg(ibdev, "Failed to copy from udata for create cq,
> > > > %d\n", err);
> > > > > + return err;
> > > > > + }
> > > > > +
> > > > > + if ((!is_rnic_cq && attr->cqe > mdev->adapter_caps.max_qp_wr) ||
> > > > > + attr->cqe > U32_MAX / COMP_ENTRY_SIZE)
> > > > > + return -EINVAL;
> > > > > +
> > > > > + cq->cqe = attr->cqe;
> > > > > + err = mana_ib_create_queue(mdev, ucmd.buf_addr, cq->cqe *
> > > > COMP_ENTRY_SIZE,
> > > > > + &cq->queue, &ibcq->umem);
> >
> > I just realized that I forgot to handle the case when ibcq->umem == NULL and
> > mana fails later after this call. I need to clean ibcq->umem in this case.
> > I will address that in v3. I am sorry.
> >
>
> Hi Leon,
> After re-reading the code, I see that there is no bug in v2 as the umem gets deallocated
> on failure inside the handler of UVERBS_METHOD_CQ_CREATE. I also see that you also had
> the same logic in v1. So, what is your recommendation? Leave v2 logic as is, so mana would
> immediately give ownership of umem to cq->umem, and if mana_ib_create_user_cq() fails at later stage
> it should not clean cq->umem and leave it to the caller handle (i.e., UVERBS_METHOD_CQ_CREATE)
> to clean cq->umem regardless of who created it.
>
> Or should I make v3, where I will assign umem to cq->umem right before return 0, so that if
> mana_ib_create_user_cq() fails it does not change cq->umem at all.
My suggestion is to stick with my original patch and remove
ib_umem_release(queue->umem) from mana_ib_destroy_queue().
Thanks
>
> > - Konstantin
> >
>
>
next prev parent reply other threads:[~2026-03-04 15:59 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-03 12:48 [PATCH rdma-next v2 1/1] RDMA/mana: Provide a modern CQ creation interface Konstantin Taranov
2026-03-04 11:05 ` Leon Romanovsky
2026-03-04 11:41 ` [EXTERNAL] " Konstantin Taranov
2026-03-04 13:23 ` Konstantin Taranov
2026-03-04 14:06 ` Konstantin Taranov
2026-03-04 15:59 ` Leon Romanovsky [this message]
2026-03-05 9:20 ` Konstantin Taranov
2026-03-11 13:29 ` Konstantin Taranov
2026-03-11 18:55 ` Leon Romanovsky
2026-03-17 14:05 ` Konstantin Taranov
2026-03-17 16:27 ` Leon Romanovsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260304155913.GH12611@unreal \
--to=leon@kernel.org \
--cc=jgg@ziepe.ca \
--cc=kotaranov@linux.microsoft.com \
--cc=kotaranov@microsoft.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=longli@microsoft.com \
--cc=shirazsaleem@microsoft.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.