public inbox for linux-rdma@vger.kernel.org
* ib_create_qp failing
From: Joel Nider @ 2018-01-29  8:50 UTC
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

Hi,

I have been trying to implement a kernel module (kernel 4.12) that opens 
an Infiniband connection, but have run into difficulties.  The call to 
ib_create_qp() fails with -EINVAL.  I have successfully set up the 
completion queues and protection domain. I have called ib_query_port() 
beforehand, which helps (I get farther along) but the ib_create_qp() call 
still fails.
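
For readers following along: ib_create_qp() reports failure through an error pointer rather than a plain integer, so the -EINVAL above is presumably what PTR_ERR() yields on the returned pointer. The sketch below shows that idiom in user-space C so that it compiles on its own. The helpers are simplified stand-ins for the kernel's ERR_PTR(), IS_ERR() and PTR_ERR() from include/linux/err.h, and fake_qp, fake_create_qp() and setup_qp() are hypothetical names, not part of the RDMA API or of the module discussed here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel helpers in
 * include/linux/err.h; re-implemented only so this sketch builds
 * outside the kernel. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
    /* Error codes live in the top MAX_ERRNO values of the address space. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for a queue pair; NOT the real struct ib_qp. */
struct fake_qp {
    int qp_num;
};

/* Hypothetical creation function that mimics a failing ib_create_qp()
 * when `fail` is non-zero. */
struct fake_qp *fake_create_qp(int fail)
{
    static struct fake_qp qp = { .qp_num = 42 };

    if (fail)
        return ERR_PTR(-EINVAL);
    return &qp;
}

/* Returns 0 and stores the QP on success, or the negative errno that
 * was encoded in the returned pointer. */
long setup_qp(int fail, struct fake_qp **out)
{
    struct fake_qp *qp = fake_create_qp(fail);

    if (IS_ERR(qp))
        return PTR_ERR(qp);

    *out = qp;
    return 0;
}
```

In a real module the same IS_ERR()/PTR_ERR() check would wrap the ib_create_qp() call itself, and the extracted code could be logged before cleaning up the CQs and PD.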

From userspace, I have no problems moving data over RoCE, so I'm assuming 
my hardware, kernel, OFED, etc. are all OK.

I have used the net/smc module extensively as a reference, but there seems 
to be something fundamental that I'm missing.  How can I find out what's 
going wrong here?

Thanks,
Joel


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* RE: ib_create_qp failing
From: Ilya Lesokhin @ 2018-01-29 19:04 UTC
  To: Joel Nider, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

> -----Original Message-----
> From: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-rdma-
> owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Joel Nider
> Sent: Monday, January 29, 2018 10:51 AM
> To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Subject: ib_create_qp failing
> 
...
> 
> I have used the net/smc module extensively as a reference, but there seems
> to be something fundamental that I'm missing.  How can I find out what's
> going wrong here?

Hi Joel,
What RoCE device are you using?
There are many debug prints in the Mellanox drivers.
Try enabling dynamic debug for mlx4_ib or mlx5_ib; it might give you more information.

Good luck,
Ilya

* RE: ib_create_qp failing
From: Joel Nider @ 2018-01-29 20:18 UTC
  To: Ilya Lesokhin; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

Hi Ilya,

I am using ConnectX5 EN with special firmware (I just mention it so you 
know, but I'm pretty sure that's not the problem).  I have used dynamic 
debug - it shows me that the call is failing like this:

mlx5_core 0000:01:00.0: mlx5_cmd_check:714:(pid 120772): CREATE_QP(0x500) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x48b5c0)

Is there any way for me to look up the syndrome?
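
A line like the one above carries four numeric fields: the command opcode, op_mod, status and syndrome. As a rough illustration, the user-space C sketch below extracts them so they can be quoted precisely when asking about a syndrome. The layout is assumed from this single log line only; struct mlx5_cmd_failure and parse_mlx5_cmd_failure() are hypothetical names, not part of the mlx5 driver, and the sketch does not translate the syndrome into text.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the numeric fields of one failure line. */
struct mlx5_cmd_failure {
    unsigned long opcode;   /* e.g. 0x500 for CREATE_QP */
    unsigned long op_mod;
    unsigned long status;   /* e.g. 0x3, printed as "bad parameter" */
    unsigned long syndrome; /* firmware-specific detail code */
};

/* Find `key` in `s`, then parse the hex number inside the next pair of
 * parentheses. Returns 0 on success, -1 if the pattern is missing. */
static int hex_in_parens_after(const char *s, const char *key,
                               unsigned long *out)
{
    const char *p = strstr(s, key);
    char *end;

    if (!p)
        return -1;
    p = strchr(p + strlen(key), '(');
    if (!p)
        return -1;
    p++;
    /* Base 16 accepts an optional "0x" prefix. */
    *out = strtoul(p, &end, 16);
    return (end == p) ? -1 : 0;
}

/* Parse one complete (unwrapped) log line. The "): " key skips past the
 * "(pid N): " prefix so the first parenthesis found belongs to the
 * command name. Returns 0 on success, -1 if any field is missing. */
int parse_mlx5_cmd_failure(const char *line, struct mlx5_cmd_failure *f)
{
    if (hex_in_parens_after(line, "): ", &f->opcode) ||
        hex_in_parens_after(line, " op_mod", &f->op_mod) ||
        hex_in_parens_after(line, "status ", &f->status) ||
        hex_in_parens_after(line, "syndrome ", &f->syndrome))
        return -1;
    return 0;
}
```

The line has to be passed in unwrapped; mail clients often break it in two, as happened in this message.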

* Re: ib_create_qp failing
From: Mark Bloch @ 2018-01-29 20:47 UTC
  To: Joel Nider, Ilya Lesokhin
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org



On 29/01/2018 22:18, Joel Nider wrote:
> Hi Ilya,
> 
> I am using ConnectX5 EN with special firmware (I just mention it so you 
> know, but I'm pretty sure that's not the problem).  I have used dynamic 
> debug - it shows me that the call is failing like this:
> 
> mlx5_core 0000:01:00.0: mlx5_cmd_check:714:(pid 120772): CREATE_QP(0x500) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x48b5c0)

The syndrome means: "create_qp: invalid service type for RoCE".
What QP type are you trying to create, and with what parameters?

Mark

* Re: ib_create_qp failing
From: Joel Nider @ 2018-01-30  8:30 UTC
  To: Mark Bloch
  Cc: Ilya Lesokhin, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

Hi Mark,

Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote on 29/01/2018 22:47:04:

> > I am using ConnectX5 EN with special firmware (I just mention it so you
> > know, but I'm pretty sure that's not the problem).  I have used dynamic
> > debug - it shows me that the call is failing like this:
> >
> > mlx5_core 0000:01:00.0: mlx5_cmd_check:714:(pid 120772): CREATE_QP(0x500) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x48b5c0)
>
> The syndrome means: "create_qp: invalid service type for RoCE"
> what QP type are you trying to create and with what parameters?

I'm trying to create an RC connection. Here is an outline of my code:

- In module_init(): ib_register_client() -> registers the callback
  on_device_add()
- In on_device_add(): I only have one device (mlx5_0), and I save the
  ib_device* to a linked list
- From sysfs, a user writes the name (string) of the device to use; this
  starts the setup procedure
- ib_register_event_handler(rpf_ib_qp_event_handler)
- ib_query_port(dev, 1) -> state=4 max_mtu=4096 active_mtu=1024
  gid_len=256 caps=0x4010000
- ib_create_cq() -> rx completion queue
- ib_create_cq() -> tx completion queue
- ib_alloc_pd() -> the protection domain
- ib_create_qp() with:

        struct ib_qp_init_attr qp_attr = {
                .event_handler = rpf_ib_qp_event_handler,
                .qp_context = res,
                .send_cq = res->tx_cq,
                .recv_cq = res->rx_cq,
                .srq = NULL,
                .cap = {
                        .max_send_wr = 16,
                        .max_recv_wr = 48,
                        .max_send_sge = 2,
                        .max_recv_sge = 1,
                },
                .sq_sig_type = IB_SIGNAL_REQ_WR,
                .qp_type = IB_QPT_RC,
        };

This fails with the error I mentioned earlier.  It used to fail even earlier 
until I discovered that ib_query_port() has some interesting side effects 
and is mandatory.  Maybe there is some other function that I need to call 
to set up some state first?
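
As an aside on the ib_query_port() output in the outline above: the state field is an enum ib_port_state value, where 4 is IB_PORT_ACTIVE, and caps (presumably port_cap_flags) is a bit mask. The user-space C sketch below names the state and lists which bits are set, so they can be compared by hand against enum ib_port_cap_flags in include/rdma/ib_verbs.h. port_state_name() and cap_bits() are hypothetical helper names, and the state table is reproduced by hand from the kernel header, so it should be checked against the kernel version in use.

```c
#include <stddef.h>

/* Names indexed by the numeric values of enum ib_port_state
 * (IB_PORT_NOP = 0 ... IB_PORT_ACTIVE_DEFER = 5). Reproduced by hand;
 * verify against include/rdma/ib_verbs.h. */
static const char *const port_state_names[] = {
    "NOP", "DOWN", "INIT", "ARMED", "ACTIVE", "ACTIVE_DEFER",
};

/* Map a numeric port state to a short name, or "UNKNOWN". */
const char *port_state_name(unsigned int state)
{
    size_t n = sizeof(port_state_names) / sizeof(port_state_names[0]);

    return state < n ? port_state_names[state] : "UNKNOWN";
}

/* Store the positions of the set bits of `caps` in `bits` (at most
 * `max` entries) and return how many were stored. */
size_t cap_bits(unsigned int caps, unsigned int *bits, size_t max)
{
    size_t n = 0;
    unsigned int i;

    for (i = 0; i < 32 && n < max; i++) {
        if (caps & (1u << i))
            bits[n++] = i;
    }
    return n;
}
```

For the values reported here, state 4 maps to ACTIVE and caps=0x4010000 has bits 16 and 26 set, which suggests the port itself is up and the failure lies elsewhere.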

* Re: ib_create_qp failing
From: Leon Romanovsky @ 2018-01-31 13:58 UTC
  To: Joel Nider
  Cc: Mark Bloch, Ilya Lesokhin,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Tue, Jan 30, 2018 at 10:30:13AM +0200, Joel Nider wrote:
> Hi Mark,
>
> Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote on 29/01/2018 22:47:04:
>
> > > I am using ConnectX5 EN with special firmware (I just mention it so you
> > > know, but I'm pretty sure that's not the problem).  I have used dynamic
> > > debug - it shows me that the call is failing like this:
> > >
> > > mlx5_core 0000:01:00.0: mlx5_cmd_check:714:(pid 120772): CREATE_QP(0x500) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x48b5c0)
> >
> > The syndrome means: "create_qp: invalid service type for RoCE"
> > what QP type are you trying to create and with what parameters?
>
> I'm trying to create an RC connection - here is an outline of my code:
> In module_init(): ib_register_client() -> registers callback
> on_device_add()
> On_device_add(): I only have one device (mlx5_0) and I save the ib_device*
> to a linked list
> From sysfs, a user writes the name (string) of the device to use - this
> starts the setup procedure
> ib_register_event_handler(rpf_ib_qp_event_handler)
> ib_query_port(dev, 1) -> state=4 max_mtu=4096 active_mtu=1024 gid_len=256
> caps=0x4010000
> ib_create_cq() -> rx completion queue
> ib_create_cq() -> tx completion queue
> ib_alloc_pd() -> the protection domain
> ib_create_qp() with:
>
>         struct ib_qp_init_attr qp_attr = {
>                 .event_handler = rpf_ib_qp_event_handler,
>                 .qp_context = res,
>                 .send_cq = res->tx_cq,
>                 .recv_cq = res->rx_cq,
>                 .srq = NULL,
>                 .cap = {
>                         .max_send_wr = 16,
>                         .max_recv_wr = 48,
>                         .max_send_sge = 2,
>                         .max_recv_sge = 1,
>                 },
>                 .sq_sig_type = IB_SIGNAL_REQ_WR,
>                 .qp_type = IB_QPT_RC,
>         };
>
> Fails with the code I mentioned earlier.  This used to fail even earlier
> until I discovered that ib_query_port() has some interesting side effects,
> and is mandatory. Maybe there is some other function that I need to call
> to set up some state first?

We strongly recommend that you contact your customer support
representative, because you are getting a FW error from custom FW.

Thanks
