* ib_create_qp failing
@ 2018-01-29 8:50 Joel Nider
[not found] ` <OF089AD255.99D19AC9-ONC2258224.002F8CF2-C2258224.00309558-8eTO7WVQ4XIsd+ienQ86orlN3bxYEBpz@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Joel Nider @ 2018-01-29 8:50 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Hi,
I have been trying to implement a kernel module (kernel 4.12) that opens
an InfiniBand connection, but have run into difficulties. The call to
ib_create_qp() fails with -EINVAL. I have successfully set up the
completion queues and protection domain. I have called ib_query_port()
beforehand, which helps (I get farther along) but the ib_create_qp() call
still fails.
From userspace, I have no problems moving data over RoCE, so I'm assuming
my hardware, kernel, OFED, etc. are all OK.
I have used the net/smc module extensively as a reference, but there seems
to be something fundamental that I'm missing. How can I find out what's
going wrong here?
Thanks,
Joel
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <OF089AD255.99D19AC9-ONC2258224.002F8CF2-C2258224.00309558-8eTO7WVQ4XIsd+ienQ86orlN3bxYEBpz@public.gmane.org>]
* RE: ib_create_qp failing
[not found] ` <OF089AD255.99D19AC9-ONC2258224.002F8CF2-C2258224.00309558-8eTO7WVQ4XIsd+ienQ86orlN3bxYEBpz@public.gmane.org>
@ 2018-01-29 19:04 ` Ilya Lesokhin
[not found] ` <AM4PR0501MB27239EF18E2C07A64B045674D4E50-dp/nxUn679gUhbbWvQuVF8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Ilya Lesokhin @ 2018-01-29 19:04 UTC (permalink / raw)
To: Joel Nider, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

> -----Original Message-----
> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Joel Nider
> Sent: Monday, January 29, 2018 10:51 AM
> To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Subject: ib_create_qp failing
> ...
>
> I have used the net/smc module extensively as a reference, but there seems
> to be something fundamental that I'm missing. How can I find out what's
> going wrong here?

Hi Joel,
What RoCE device are you using?
There are many debug prints in the Mellanox drivers.
You should try enabling dynamic debug for mlx4_ib or mlx5_ib; it might give
you more information.

Good luck,
Ilya
^ permalink raw reply [flat|nested] 6+ messages in thread
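For readers who have not used dynamic debug before, the suggestion above can be sketched in user-space C. This is a minimal sketch, not part of the original thread: the helper names `dyndbg_build_query` and `dyndbg_apply` are hypothetical, and it assumes a kernel built with CONFIG_DYNAMIC_DEBUG, debugfs mounted at /sys/kernel/debug, and root privileges. The control file path and the `module <name> +p` query syntax are the ones described in the kernel's dynamic debug documentation.

```c
#include <stdio.h>

/* Usual debugfs location of the dynamic debug control file
 * (assumes debugfs is mounted at /sys/kernel/debug). */
#define DYNDBG_CONTROL "/sys/kernel/debug/dynamic_debug/control"

/* Hypothetical helper: build a query such as "module mlx5_ib +p" (enable)
 * or "module mlx5_ib -p" (disable). Returns 0 on success, -1 if the
 * buffer is too small. */
static int dyndbg_build_query(char *buf, size_t len, const char *module,
                              int enable)
{
    int n = snprintf(buf, len, "module %s %cp", module, enable ? '+' : '-');

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: write the query to the given control file.
 * The path is a parameter so that the function can be exercised
 * without touching the real debugfs file. Returns 0 on success. */
static int dyndbg_apply(const char *control_path, const char *module,
                        int enable)
{
    char query[128];
    FILE *f;
    int rc = 0;

    if (dyndbg_build_query(query, sizeof(query), module, enable) != 0)
        return -1;

    f = fopen(control_path, "w");
    if (!f)
        return -1;              /* not root, or debugfs not mounted */
    if (fputs(query, f) == EOF)
        rc = -1;
    if (fclose(f) != 0)         /* the kernel reports a bad query here */
        rc = -1;
    return rc;
}
```

Calling `dyndbg_apply(DYNDBG_CONTROL, "mlx5_ib", 1)` as root should make the driver's debug prints show up in the kernel log (for example via dmesg); passing 0 as the last argument turns them off again.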
[parent not found: <AM4PR0501MB27239EF18E2C07A64B045674D4E50-dp/nxUn679gUhbbWvQuVF8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>]
* RE: ib_create_qp failing
[not found] ` <AM4PR0501MB27239EF18E2C07A64B045674D4E50-dp/nxUn679gUhbbWvQuVF8DSnupUy6xnnBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2018-01-29 20:18 ` Joel Nider
[not found] ` <OF09394B7D.58CD59F8-ONC2258224.006BDAF8-C2258224.006F854D-8eTO7WVQ4XIsd+ienQ86orlN3bxYEBpz@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Joel Nider @ 2018-01-29 20:18 UTC (permalink / raw)
To: Ilya Lesokhin; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

Hi Ilya,

I am using ConnectX5 EN with special firmware (I just mention it so you
know, but I'm pretty sure that's not the problem). I have used dynamic
debug - it shows me that the call is failing like this:

mlx5_core 0000:01:00.0: mlx5_cmd_check:714:(pid 120772): CREATE_QP(0x500) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x48b5c0)

Is there any way for me to look up the syndrome?
^ permalink raw reply [flat|nested] 6+ messages in thread
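When reporting a failure like the one above, it can help to pull the numeric fields out of the log line so they can be quoted exactly. The following is a rough user-space sketch, not something from the thread: the struct and function names are hypothetical, and the line layout is assumed from the single sample shown in this thread, so other kernel versions may format the message differently. It only extracts the fields; the meaning of a syndrome value is not something this code can look up.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields of one mlx5_cmd_check line. */
struct mlx5_cmd_err {
    char cmd[32];           /* command name, e.g. "CREATE_QP" */
    unsigned int opcode;    /* e.g. 0x500 */
    unsigned int op_mod;    /* e.g. 0x0 */
    unsigned int status;    /* e.g. 0x3 ("bad parameter") */
    unsigned int syndrome;  /* firmware-specific code, e.g. 0x48b5c0 */
};

/* Parse a line of the form seen in this thread:
 *   "...(pid N): CMD(0xOP) op_mod(0xM) failed, status TEXT(0xS), syndrome (0xY)"
 * Returns 0 on success, -1 if the line does not match that layout. */
static int parse_mlx5_cmd_err(const char *line, struct mlx5_cmd_err *out)
{
    const char *p;

    /* The command name and opcode follow the "(pid N): " prefix. */
    p = strstr(line, "): ");
    if (!p)
        return -1;
    if (sscanf(p + 3, "%31[A-Za-z0-9_](%x) op_mod(%x)",
               out->cmd, &out->opcode, &out->op_mod) != 3)
        return -1;

    /* Skip the human-readable status text up to the parenthesised code. */
    p = strstr(p, "status ");
    if (!p)
        return -1;
    p = strchr(p, '(');
    if (!p || sscanf(p, "(%x)", &out->status) != 1)
        return -1;

    p = strstr(p, "syndrome (");
    if (!p || sscanf(p, "syndrome (%x)", &out->syndrome) != 1)
        return -1;

    return 0;
}
```

Fed the line quoted above, the parser fills in the command name CREATE_QP, opcode 0x500, op_mod 0x0, status 0x3 and syndrome 0x48b5c0.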
[parent not found: <OF09394B7D.58CD59F8-ONC2258224.006BDAF8-C2258224.006F854D-8eTO7WVQ4XIsd+ienQ86orlN3bxYEBpz@public.gmane.org>]
* Re: ib_create_qp failing
[not found] ` <OF09394B7D.58CD59F8-ONC2258224.006BDAF8-C2258224.006F854D-8eTO7WVQ4XIsd+ienQ86orlN3bxYEBpz@public.gmane.org>
@ 2018-01-29 20:47 ` Mark Bloch
[not found] ` <fd693fbd-1387-0298-49d0-035e224f07d8-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Mark Bloch @ 2018-01-29 20:47 UTC (permalink / raw)
To: Joel Nider, Ilya Lesokhin
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On 29/01/2018 22:18, Joel Nider wrote:
> Hi Ilya,
>
> I am using ConnectX5 EN with special firmware (I just mention it so you
> know, but I'm pretty sure that's not the problem). I have used dynamic
> debug - it shows me that the call is failing like this:
>
> mlx5_core 0000:01:00.0: mlx5_cmd_check:714:(pid 120772): CREATE_QP(0x500)
> op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x48b5c0)

The syndrome means: "create_qp: invalid service type for RoCE"
what QP type are you trying to create and with what parameters?

>
> Is there any way for me to look up the syndrome?

Mark
^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <fd693fbd-1387-0298-49d0-035e224f07d8-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: ib_create_qp failing
[not found] ` <fd693fbd-1387-0298-49d0-035e224f07d8-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2018-01-30 8:30 ` Joel Nider
[not found] ` <OF85410752.DF2BA5BF-ONC2258225.002CFF83-C2258225.002EB65B-8eTO7WVQ4XIsd+ienQ86orlN3bxYEBpz@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Joel Nider @ 2018-01-30 8:30 UTC (permalink / raw)
To: Mark Bloch
Cc: Ilya Lesokhin, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

Hi Mark,

Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote on 29/01/2018 22:47:04:
> > I am using ConnectX5 EN with special firmware (I just mention it so you
> > know, but I'm pretty sure that's not the problem). I have used dynamic
> > debug - it shows me that the call is failing like this:
> >
> > mlx5_core 0000:01:00.0: mlx5_cmd_check:714:(pid 120772): CREATE_QP(0x500)
> > op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x48b5c0)
>
> The syndrome means: "create_qp: invalid service type for RoCE"
> what QP type are you trying to create and with what parameters?

I'm trying to create an RC connection - here is an outline of my code:
In module_init(): ib_register_client() -> registers callback on_device_add()
on_device_add(): I only have one device (mlx5_0) and I save the ib_device* to a linked list
From sysfs, a user writes the name (string) of the device to use - this starts the setup procedure
ib_register_event_handler(rpf_ib_qp_event_handler)
ib_query_port(dev, 1) -> state=4 max_mtu=4096 active_mtu=1024 gid_len=256 caps=0x4010000
ib_create_cq() -> rx completion queue
ib_create_cq() -> tx completion queue
ib_alloc_pd() -> the protection domain
ib_create_qp() with:

struct ib_qp_init_attr qp_attr = {
	.event_handler = rpf_ib_qp_event_handler,
	.qp_context = res,
	.send_cq = res->tx_cq,
	.recv_cq = res->rx_cq,
	.srq = NULL,
	.cap = {
		.max_send_wr = 16,
		.max_recv_wr = 48,
		.max_send_sge = 2,
		.max_recv_sge = 1,
	},
	.sq_sig_type = IB_SIGNAL_REQ_WR,
	.qp_type = IB_QPT_RC,
};

Fails with the code I mentioned earlier. This used to fail even earlier
until I discovered that ib_query_port() has some interesting side effects,
and is mandatory. Maybe there is some other function that I need to call
to set up some state first?
^ permalink raw reply [flat|nested] 6+ messages in thread
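As a side note on the ib_query_port() output quoted above, the numeric state can be decoded by hand. The sketch below is a small user-space illustration, not code from the thread: `port_state_name` is a hypothetical helper, and its table mirrors the values of enum ib_port_state in the kernel's include/rdma/ib_verbs.h (NOP=0 through ACTIVE_DEFER=5). With that mapping, state=4 corresponds to an active port.

```c
#include <stddef.h>

/* Hypothetical helper: map a numeric port state, as printed from
 * struct ib_port_attr, to a short name. The order mirrors
 * enum ib_port_state in include/rdma/ib_verbs.h. */
static const char *port_state_name(int state)
{
    static const char *const names[] = {
        "NOP",          /* 0: IB_PORT_NOP */
        "DOWN",         /* 1: IB_PORT_DOWN */
        "INIT",         /* 2: IB_PORT_INIT */
        "ARMED",        /* 3: IB_PORT_ARMED */
        "ACTIVE",       /* 4: IB_PORT_ACTIVE */
        "ACTIVE_DEFER", /* 5: IB_PORT_ACTIVE_DEFER */
    };

    if (state < 0 || (size_t)state >= sizeof(names) / sizeof(names[0]))
        return "UNKNOWN";
    return names[state];
}
```

An active port is consistent with the observation earlier in the thread that user-space traffic over RoCE works, which suggests the failure is not a link-state problem.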
[parent not found: <OF85410752.DF2BA5BF-ONC2258225.002CFF83-C2258225.002EB65B-8eTO7WVQ4XIsd+ienQ86orlN3bxYEBpz@public.gmane.org>]
* Re: ib_create_qp failing
[not found] ` <OF85410752.DF2BA5BF-ONC2258225.002CFF83-C2258225.002EB65B-8eTO7WVQ4XIsd+ienQ86orlN3bxYEBpz@public.gmane.org>
@ 2018-01-31 13:58 ` Leon Romanovsky
0 siblings, 0 replies; 6+ messages in thread
From: Leon Romanovsky @ 2018-01-31 13:58 UTC (permalink / raw)
To: Joel Nider
Cc: Mark Bloch, Ilya Lesokhin, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Tue, Jan 30, 2018 at 10:30:13AM +0200, Joel Nider wrote:
> Fails with the code I mentioned earlier. This used to fail even earlier
> until I discovered that ib_query_port() has some interesting side effects,
> and is mandatory. Maybe there is some other function that I need to call
> to set up some state first?

We strongly recommend that you contact your customer support representative,
because you are getting a FW error from custom FW.

Thanks
^ permalink raw reply [flat|nested] 6+ messages in thread