From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sagi Grimberg
Subject: Re: [PATCH v4 09/14] IB/cm: Expose BTH P_Key in CM and SIDR request events
Date: Mon, 31 Aug 2015 10:41:21 +0300
Message-ID: <55E40521.60403@dev.mellanox.co.il>
References: <1438267826-32155-1-git-send-email-haggaie@mellanox.com> <1438267826-32155-10-git-send-email-haggaie@mellanox.com> <55E34A05.8040205@dev.mellanox.co.il> <55E3F93D.6000400@mellanox.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <55E3F93D.6000400-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Haggai Eran, Doug Ledford
Cc: Liran Liss, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Jason Gunthorpe, Eli Cohen
List-Id: linux-rdma@vger.kernel.org

On 8/31/2015 9:50 AM, Haggai Eran wrote:
> On 30/08/2015 21:23, Sagi Grimberg wrote:
>>
>> Looks like for some reason cm_get_bth_pkey got pkey_index of 0xffff
>> instead of 0 (working on the default pkey 0xffff at entry 0).
>
> It looks like the mlx5 driver doesn't interpret the completion format
> correctly. It takes a field defined in the programmer reference manual
> as pkey, and interprets it as pkey_index [1].

You're right! I wonder how this ever used to work (and it did...).

So the driver needs to look up a pkey_index on each GSI packet?

>
>> log:
>> infiniband mlx5_0: ib_cm: Couldn't retrieve pkey for incoming request (port 1, pkey index 65535). -22
>> ib_srpt Received SRP_LOGIN_REQ with i_port_id 0x0:0x2c90300ed0960, t_port_id 0x2c90300ed0950:0x2c90300ed0950 and it_iu_len 260 on port 1 (guid=0xfe80000000000000:0x2c90300ed0950)
>> ib_srpt Session : kernel thread ib_srpt_compl (PID 8584) started
>> infiniband mlx5_0: ib_cm: Couldn't retrieve pkey for incoming request (port 1, pkey index 65535). -22
>> ib_srpt Received SRP_LOGIN_REQ with i_port_id 0x0:0x2c90300ed0960, t_port_id 0x2c90300ed0950:0x2c90300ed0950 and it_iu_len 260 on port 1 (guid=0xfe80000000000000:0x2c90300ed0950)
>> ib_srpt Session : kernel thread ib_srpt_compl (PID 8585) started
>> mlx5_0:dump_cqe:238:(pid 8584): dump error cqe
>> 00000000 00000000 00000000 00000000
>> 00000000 00000000 00000000 00000000
>> 0000002b 00000000 00000000 00000000
>> 00000000 94003004 0000002c 0000b8e0
>> ib_srpt receiving failed for idx 0 with status 4
>> 0000:04:00.0:poll_health:151:(pid 0): device's health compromised
>> assert_var[0] 0x00000094
>> assert_var[1] 0x00000000
>> assert_var[2] 0x00000000
>> assert_var[3] 0x00000000
>> assert_var[4] 0x00000000
>> assert_exit_ptr 0x0061d35c
>> assert_callra 0x0067a5f4
>> fw_ver 0xa0641900
>> hw_id 0x000001ff
>> irisc_index 2
>> synd 0x1: firmware internal error
>> ext_sync 0x0000
>> 0000:04:00.0:health_care:76:(pid 7943): handling bad device here
>> ib_srpt Received DREQ and sent DREP for session 0x00000000000000000002c90300ed0960.
>> ib_srpt Received DREQ and sent DREP for session 0x00000000000000000002c90300ed0960.
>> ib_srpt Received IB TimeWait exit for cm_id ffff88046d1fb200.
>> ib_srpt Received IB TimeWait exit for cm_id ffff880454ffa000.
>> ib_srpt Session 0x00000000000000000002c90300ed0960: kernel thread ib_srpt_compl (PID 8585) stopped
>
> I don't know how that can cause all the other errors though.

Me neither...
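
For illustration only, here is a minimal userspace sketch of the kind of lookup the question above implies: translating a P_Key value taken from a completion into an index into the port's P_Key table. It is not the mlx5 driver's code. The function name find_pkey_index, the flat table array, and the preference for full-member entries are assumptions made for this example; in the kernel, a lookup of this sort would normally go through ib_find_pkey().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PKEY_BASE_MASK   0x7fffu  /* low 15 bits identify the partition */
#define PKEY_FULL_MEMBER 0x8000u  /* top bit marks full membership */

/*
 * Hypothetical helper: map a P_Key value to its index in a port's
 * P_Key table. A full-member entry is preferred; a limited-member
 * entry with the same base value is used as a fallback.
 * Returns 0 and stores the index on success, -1 if nothing matches.
 */
int find_pkey_index(const uint16_t *table, size_t len, uint16_t pkey,
                    uint16_t *index)
{
    int partial = -1;
    size_t i;

    assert(table != NULL && index != NULL);

    for (i = 0; i < len && i <= UINT16_MAX; i++) {
        if ((table[i] & PKEY_BASE_MASK) == 0)
            continue;               /* empty slot */
        if ((table[i] & PKEY_BASE_MASK) != (pkey & PKEY_BASE_MASK))
            continue;               /* different partition */
        if (table[i] & PKEY_FULL_MEMBER) {
            *index = (uint16_t)i;   /* full member: best match */
            return 0;
        }
        if (partial < 0)
            partial = (int)i;       /* remember first limited match */
    }

    if (partial >= 0) {
        *index = (uint16_t)partial;
        return 0;
    }
    return -1;
}
```

With a table whose entry 0 holds the default P_Key 0xffff, looking up 0xffff stores index 0, which is the index cm_get_bth_pkey was expected to receive in the report above.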