* ib_create_qp failing
From: Joel Nider @ 2018-01-29 8:50 UTC
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Hi,
I have been trying to implement a kernel module (kernel 4.12) that opens
an InfiniBand connection, but have run into difficulties. The call to
ib_create_qp() fails with -EINVAL. I have successfully set up the
completion queues and protection domain. I have called ib_query_port()
beforehand, which helps (I get farther along) but the ib_create_qp() call
still fails.
From userspace, I have no problems moving data over RoCE, so I'm assuming
my hardware, kernel, OFED, etc. are all OK.
I have used the net/smc module extensively as a reference, but there seems
to be something fundamental that I'm missing. How can I find out what's
going wrong here?
Thanks,
Joel
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* RE: ib_create_qp failing
From: Ilya Lesokhin @ 2018-01-29 19:04 UTC
To: Joel Nider, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> -----Original Message-----
> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Joel Nider
> Sent: Monday, January 29, 2018 10:51 AM
> To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Subject: ib_create_qp failing
>
...
>
> I have used the net/smc module extensively as a reference, but there seems
> to be something fundamental that I'm missing. How can I find out what's
> going wrong here?
Hi Joel,
What RoCE device are you using?
There are many debug prints in the Mellanox drivers.
You should try enabling dynamic debug for mlx4_ib or mlx5_ib; it might give you more information.
Good luck,
Ilya
* RE: ib_create_qp failing
From: Joel Nider @ 2018-01-29 20:18 UTC
To: Ilya Lesokhin; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Hi Ilya,
I am using ConnectX5 EN with special firmware (I just mention it so you
know, but I'm pretty sure that's not the problem). I have used dynamic
debug - it shows me that the call is failing like this:
mlx5_core 0000:01:00.0: mlx5_cmd_check:714:(pid 120772): CREATE_QP(0x500) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x48b5c0)
Is there any way for me to look up the syndrome?
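For anyone who wants to pull the individual fields out of a failure line like the one above, here is a minimal userspace sketch in C. The struct and function names are hypothetical, and the line layout is assumed from this single log message rather than taken from the driver source. It only extracts the numbers; the syndrome itself is a firmware-specific code and is not decoded here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields of one mlx5_core command failure
 * line; the field names are chosen for this sketch, not taken from the
 * driver. */
struct cmd_failure {
	char name[32];          /* command name, e.g. "CREATE_QP" */
	unsigned int opcode;    /* e.g. 0x500 */
	unsigned int op_mod;    /* e.g. 0x0 */
	unsigned int status;    /* e.g. 0x3 (bad parameter) */
	unsigned int syndrome;  /* e.g. 0x48b5c0 */
};

/* Returns 0 on success, -1 if the line does not match the assumed layout. */
static int parse_cmd_failure(const char *line, struct cmd_failure *out)
{
	const char *p;

	/* The command name and opcode follow the "(pid N): " prefix. */
	p = strstr(line, "): ");
	if (!p || sscanf(p + 3, "%31[A-Z0-9_](0x%x", out->name, &out->opcode) != 2)
		return -1;

	p = strstr(line, "op_mod(0x");
	if (!p || sscanf(p, "op_mod(0x%x", &out->op_mod) != 1)
		return -1;

	/* The status text varies, so skip ahead to the parenthesised code. */
	p = strstr(line, "status ");
	if (p)
		p = strchr(p, '(');
	if (!p || sscanf(p, "(0x%x", &out->status) != 1)
		return -1;

	p = strstr(line, "syndrome (0x");
	if (!p || sscanf(p, "syndrome (0x%x", &out->syndrome) != 1)
		return -1;

	return 0;
}
```

Passing the full line (joined onto one line, as dmesg prints it) to parse_cmd_failure() fills in the command name, opcode, op_mod, status and syndrome, which makes it easier to quote the exact values when asking the vendor what a syndrome means.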
* Re: ib_create_qp failing
From: Mark Bloch @ 2018-01-29 20:47 UTC
To: Joel Nider, Ilya Lesokhin
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On 29/01/2018 22:18, Joel Nider wrote:
> Hi Ilya,
>
> I am using ConnectX5 EN with special firmware (I just mention it so you
> know, but I'm pretty sure that's not the problem). I have used dynamic
> debug - it shows me that the call is failing like this:
>
> mlx5_core 0000:01:00.0: mlx5_cmd_check:714:(pid 120772): CREATE_QP(0x500)
> op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x48b5c0)
The syndrome means: "create_qp: invalid service type for RoCE".
What QP type are you trying to create, and with what parameters?
>
> Is there any way for me to look up the syndrome?
Mark
* Re: ib_create_qp failing
From: Joel Nider @ 2018-01-30 8:30 UTC
To: Mark Bloch
Cc: Ilya Lesokhin, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Hi Mark,
Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote on 29/01/2018 22:47:04:
> > I am using ConnectX5 EN with special firmware (I just mention it so you
> > know, but I'm pretty sure that's not the problem). I have used dynamic
> > debug - it shows me that the call is failing like this:
> >
> > mlx5_core 0000:01:00.0: mlx5_cmd_check:714:(pid 120772): CREATE_QP(0x500)
> > op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x48b5c0)
>
> The syndrome means: "create_qp: invalid service type for RoCE"
> what QP type are you trying to create and with what parameters?
I'm trying to create an RC connection. Here is an outline of my code:
In module_init(): ib_register_client() -> registers the callback on_device_add()
In on_device_add(): I only have one device (mlx5_0), and I save the ib_device*
to a linked list
From sysfs, a user writes the name (string) of the device to use; this
starts the setup procedure:
ib_register_event_handler(rpf_ib_qp_event_handler)
ib_query_port(dev, 1) -> state=4 max_mtu=4096 active_mtu=1024 gid_len=256 caps=0x4010000
ib_create_cq() -> rx completion queue
ib_create_cq() -> tx completion queue
ib_alloc_pd() -> the protection domain
ib_create_qp() with:

struct ib_qp_init_attr qp_attr = {
	.event_handler = rpf_ib_qp_event_handler,
	.qp_context = res,
	.send_cq = res->tx_cq,
	.recv_cq = res->rx_cq,
	.srq = NULL,
	.cap = {
		.max_send_wr = 16,
		.max_recv_wr = 48,
		.max_send_sge = 2,
		.max_recv_sge = 1,
	},
	.sq_sig_type = IB_SIGNAL_REQ_WR,
	.qp_type = IB_QPT_RC,
};

This fails with the error code I mentioned earlier. It used to fail even earlier
until I discovered that ib_query_port() has some interesting side effects
and is mandatory. Maybe there is some other function that I need to call
to set up some state first?
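As a side note on the ib_query_port() output above: state=4 matches IB_PORT_ACTIVE in the kernel's enum ib_port_state (include/rdma/ib_verbs.h), so the port itself reports an active link. A small sketch of that mapping follows. The enum below is a local mirror written for this illustration so that it builds in userspace, and port_state_name() is a hypothetical helper, not a kernel function.

```c
/* Local mirror of the numeric values of the kernel's enum ib_port_state;
 * redefined under different names here so the sketch is self-contained. */
enum port_state {
	PORT_NOP          = 0,
	PORT_DOWN         = 1,
	PORT_INIT         = 2,
	PORT_ARMED        = 3,
	PORT_ACTIVE       = 4,
	PORT_ACTIVE_DEFER = 5,
};

/* Hypothetical helper: map a numeric port state to a readable name. */
static const char *port_state_name(int state)
{
	switch (state) {
	case PORT_NOP:          return "NOP";
	case PORT_DOWN:         return "DOWN";
	case PORT_INIT:         return "INIT";
	case PORT_ARMED:        return "ARMED";
	case PORT_ACTIVE:       return "ACTIVE";
	case PORT_ACTIVE_DEFER: return "ACTIVE_DEFER";
	default:                return "UNKNOWN";
	}
}
```

Logging the name rather than the raw number makes it easier to rule out a link problem at a glance before looking at the QP parameters.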
* Re: ib_create_qp failing
From: Leon Romanovsky @ 2018-01-31 13:58 UTC
To: Joel Nider
Cc: Mark Bloch, Ilya Lesokhin,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Tue, Jan 30, 2018 at 10:30:13AM +0200, Joel Nider wrote:
> Hi Mark,
>
> Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote on 29/01/2018 22:47:04:
>
> > > I am using ConnectX5 EN with special firmware (I just mention it so you
> > > know, but I'm pretty sure that's not the problem). I have used dynamic
> > > debug - it shows me that the call is failing like this:
> > >
> > > mlx5_core 0000:01:00.0: mlx5_cmd_check:714:(pid 120772): CREATE_QP(0x500)
> > > op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x48b5c0)
> >
> > The syndrome means: "create_qp: invalid service type for RoCE"
> > what QP type are you trying to create and with what parameters?
>
> I'm trying to create an RC connection - here is an outline of my code:
> In module_init(): ib_register_client() -> registers callback
> on_device_add()
> On_device_add(): I only have one device (mlx5_0) and I save the ib_device*
> to a linked list
> From sysfs, a user writes the name (string) of the device to use - this
> starts the setup procedure
> ib_register_event_handler(rpf_ib_qp_event_handler)
> ib_query_port(dev, 1) -> state=4 max_mtu=4096 active_mtu=1024 gid_len=256
> caps=0x4010000
> ib_create_cq() -> rx completion queue
> ib_create_cq() -> tx completion queue
> ib_alloc_pd() -> the protection domain
> ib_create_qp() with:
>
> struct ib_qp_init_attr qp_attr = {
> .event_handler = rpf_ib_qp_event_handler,
> .qp_context = res,
> .send_cq = res->tx_cq,
> .recv_cq = res->rx_cq,
> .srq = NULL,
> .cap = {
> .max_send_wr = 16,
> .max_recv_wr = 48,
> .max_send_sge = 2,
> .max_recv_sge = 1,
> },
> .sq_sig_type = IB_SIGNAL_REQ_WR,
> .qp_type = IB_QPT_RC,
> };
>
> Fails with the code I mentioned earlier. This used to fail even earlier
> until I discovered that ib_query_port() has some interesting side effects,
> and is mandatory. Maybe there is some other function that I need to call
> to set up some state first?
We strongly recommend that you contact your customer support
representative, because you are getting a FW error from custom FW.
Thanks