From: Leon Romanovsky <leon@kernel.org>
To: Konstantin Taranov <kotaranov@microsoft.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>,
Konstantin Taranov <kotaranov@linux.microsoft.com>,
Wei Hu <weh@microsoft.com>,
"sharmaajay@microsoft.com" <sharmaajay@microsoft.com>,
Long Li <longli@microsoft.com>,
"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH rdma-next 1/1] RDMA/mana_ib: indicate that inline data is not supported
Date: Sun, 21 Jul 2024 09:56:21 +0300
Message-ID: <20240721065621.GD1265781@unreal>
In-Reply-To: <PAXPR83MB0559FD4684B40F51A67D6AC9B4AD2@PAXPR83MB0559.EURPRD83.prod.outlook.com>
On Fri, Jul 19, 2024 at 10:51:58AM +0000, Konstantin Taranov wrote:
> > > > > > > Yes, you are. If user asked for specific functionality
> > > > > > > (max_inline_data != 0) and your device doesn't support it, you
> > > > > > > should return an error.
> > > > > > >
> > > > > > > pvrdma, mlx4 and rvt are not good examples, they should return
> > > > > > > an error as well, but because of being legacy code, we won't
> > > > > > > change them.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > >
> > > > > > I see. So I guess we can return a larger value, but not smaller. Right?
> > > > > > I will send v2 that fails QP creation then.
> > > > > >
> > > > > > In this case, may I submit a patch to rdma-core that queries
> > > > > > device caps before trying to create a qp in rdma_client.c and
> > > > > > rdma_server.c? As that code violates what you described.
> > > > >
> > > > > Let's ask Jason, why is that? Do we allow to ignore max_inline_data?
> > > > >
> > > > > librdmacm/examples/rdma_client.c
> > > > > >     memset(&attr, 0, sizeof attr);
> > > > > >     attr.cap.max_send_wr = attr.cap.max_recv_wr = 1;
> > > > > >     attr.cap.max_send_sge = attr.cap.max_recv_sge = 1;
> > > > > >     attr.cap.max_inline_data = 16;
> > > > > >     attr.qp_context = id;
> > > > > >     attr.sq_sig_all = 1;
> > > > > >     ret = rdma_create_ep(&id, res, NULL, &attr);
> > > > > >     // Check to see if we got inline data allowed or not
> > > > > >     if (attr.cap.max_inline_data >= 16)
> > > > > >         send_flags = IBV_SEND_INLINE;
> > > > > >     else
> > > > > >         printf("rdma_client: device doesn't support IBV_SEND_INLINE, "
> > > > > >                "using sge sends\n");
> > > >
> > > > I think the idea expressed in this code is that if max_inline_data
> > > > requested too much it would be limited to the device capability.
> > > >
> > > > ie qp creation should limit the requests values to what the HW can
> > > > do, similar to how entries and other work.
> > > >
> > > > If the HW has no support it should return - for max_inline_data not
> > > > an error, I guess?
> > >
> > > Yes, this code implies that max_inline_data can be ignored at creation,
> > > while the manual of ibv_create_qp says:
> > > "The function ibv_create_qp() will update the qp_init_attr->cap struct
> > > with the actual QP values of the QP that was created; the values will
> > > be **greater than or equal to** the values requested."
> >
> > Ah, well that seems to be some misunderstandings then, yes.
> >
> > > I see two options:
> > > 1) Remove code from rdma examples that rely on ignoring max_inline; add
> > > a warning to libibverbs when drivers ignore that value.
> > > 2) Add to manual that max_inline_data might be ignored by drivers; and
> > > allow my current patch that ignores max_inline_data in mana_ib.
> >
> > I don't know, what do the majority of drivers do? If enough are already doing
> > 1 then lets force everyone into 1, otherwise we have to document 2.
> >
> > And a pyverbs test should be added to cover this weirdness
>
> I quickly read create_qp code of all providers and it seems that max_inline_data is ignored by hw/pvrdma and sw/rvt.
> Other providers fail the creation when they cannot satisfy the inline_data cap.
> Some drivers ignore it for GSI, but I think it is reasonable.
>
> Then I guess the option 1 is better. Regarding pyverbs, should I add a test for the option 1?
> If yes, what should it test?

Probably the test should read the max_inline_data value from the device caps and try to create
a QP with a higher value. If the QP creation fails, the test passes. For hw/pvrdma and sw/rvt, the QP
should be created successfully despite the requested value.
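
The pass/fail rule above can be sketched roughly as follows. This is only a
model of the logic, not pyverbs code: every name in it (QPCreationError, the
two make_*_create_qp helpers, inline_test_passes, the legacy_driver flag) is
hypothetical, and the create_qp callables stand in for whatever the real test
would use to create a QP.

```python
# Hypothetical model of the test rule; none of these names exist in pyverbs
# or rdma-core. A real test would create QPs on an actual device instead.


class QPCreationError(Exception):
    """Raised by the stand-in create function when a request cannot be met."""


def make_strict_create_qp(device_max_inline):
    """Stand-in for a driver that rejects requests above its inline limit."""
    def create_qp(requested_inline):
        if requested_inline > device_max_inline:
            raise QPCreationError("max_inline_data not supported")
        return device_max_inline  # actual value, >= requested
    return create_qp


def make_lenient_create_qp(device_max_inline):
    """Stand-in for the legacy behaviour: the requested value is ignored."""
    def create_qp(requested_inline):
        return device_max_inline
    return create_qp


def inline_test_passes(create_qp, device_max_inline, legacy_driver):
    """Request more inline data than the device reports and check the outcome."""
    requested = device_max_inline + 1
    try:
        create_qp(requested)
    except QPCreationError:
        created = False
    else:
        created = True
    # Legacy drivers are expected to create the QP anyway; all others must fail.
    return created if legacy_driver else not created
```

Here legacy_driver would be set only for hw/pvrdma and sw/rvt; how a real test
detects the driver is left open.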

Thanks

>
> >
> > Jason