From: Laurence Oberman <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Max Gurtovoy <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Bart Van Assche
	<bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>,
	Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>,
	Israel Rukshin <israelr-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH, untested] mlx5: Avoid that mlx5_ib_sg_to_klms() overflows the klms[] array
Date: Wed, 26 Apr 2017 09:28:37 -0400 (EDT)
Message-ID: <2122831810.2341766.1493213317484.JavaMail.zimbra@redhat.com>
In-Reply-To: <16ea1371-84a5-c055-5b0c-fdc6d355276a-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>



----- Original Message -----
> From: "Max Gurtovoy" <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> To: "Laurence Oberman" <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: "Leon Romanovsky" <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>, "Bart Van Assche" <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>, "Doug Ledford"
> <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>, "Sagi Grimberg" <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>, "Israel Rukshin" <israelr-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
> linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Sent: Wednesday, April 26, 2017 8:25:30 AM
> Subject: Re: [PATCH, untested] mlx5: Avoid that mlx5_ib_sg_to_klms() overflows the klms[] array
> 
> 
> 
> On 4/26/2017 3:18 PM, Laurence Oberman wrote:
> >
> >
> > ----- Original Message -----
> >> From: "Laurence Oberman" <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> >> To: "Max Gurtovoy" <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> >> Cc: "Leon Romanovsky" <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>, "Bart Van Assche"
> >> <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>, "Doug Ledford"
> >> <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>, "Sagi Grimberg" <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>, "Israel
> >> Rukshin" <israelr-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
> >> linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> >> Sent: Wednesday, April 26, 2017 7:47:37 AM
> >> Subject: Re: [PATCH, untested] mlx5: Avoid that mlx5_ib_sg_to_klms()
> >> overflows the klms[] array
> >>
> >>
> >>
> >> ----- Original Message -----
> >>> From: "Max Gurtovoy" <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> >>> To: "Laurence Oberman" <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>, "Leon Romanovsky"
> >>> <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> >>> Cc: "Bart Van Assche" <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>, "Doug Ledford"
> >>> <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>, "Sagi Grimberg"
> >>> <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>, "Israel Rukshin" <israelr-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
> >>> linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> >>> Sent: Wednesday, April 26, 2017 4:31:57 AM
> >>> Subject: Re: [PATCH, untested] mlx5: Avoid that mlx5_ib_sg_to_klms()
> >>> overflows the klms[] array
> >>>
> >>>
> >>>
> >>> On 4/25/2017 11:37 PM, Laurence Oberman wrote:
> >>>>
> >>>>
> >>>> ----- Original Message -----
> >>>>> From: "Leon Romanovsky" <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> >>>>> To: "Bart Van Assche" <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> >>>>> Cc: "Doug Ledford" <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>, "Max Gurtovoy"
> >>>>> <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>, "Sagi Grimberg" <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>,
> >>>>> "Israel Rukshin" <israelr-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>, "Laurence Oberman"
> >>>>> <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> >>>>> Sent: Tuesday, April 25, 2017 1:58:49 PM
> >>>>> Subject: Re: [PATCH, untested] mlx5: Avoid that mlx5_ib_sg_to_klms()
> >>>>> overflows the klms[] array
> >>>>>
> >>>>> On Mon, Apr 24, 2017 at 03:15:28PM -0700, Bart Van Assche wrote:
> >>>>>> ib_map_mr_sg() can pass an SG-list to .map_mr_sg() that is larger
> >>>>>> than what fits into a single MR. .map_mr_sg() must not attempt to
> >>>>>> map more SG-list elements than what fits into a single MR.
> >>>>>> Hence make sure that mlx5_ib_sg_to_klms() does not write outside
> >>>>>> the MR klms[] array.
> >>>>>>
> >>>>>> Fixes: b005d3164713 ("mlx5: Add arbitrary sg list support")
> >>>>>> Signed-off-by: Bart Van Assche <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> >>>>>> Reviewed-by: Max Gurtovoy <maxg-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> >>>>>> Cc: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
> >>>>>> Cc: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> >>>>>> Cc: Israel Rukshin <israelr-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> >>>>>> Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
> >>>>>> ---
> >>>>>>  drivers/infiniband/hw/mlx5/mr.c | 2 +-
> >>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>
> >>>>>
> >>>>> Bart,
> >>>>>
> >>>>> Thanks a lot, it indeed looks right.
> >>>>> Acked-by: Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
> >>>>>
> >>>>> Thanks
> >>>>>
> >>>>
> >>>>
> >>>> Hello Bart, Leon, Max and Israel.
> >>>>
> >>>> I cloned off Bart's tree.
> >>>>
> >>>> git clone https://github.com/bvanassche/linux
> >>>> cd linux
> >>>> git checkout block-scsi-for-next
> >>>>
> >>>> I checked all patches were in for this test.
> >>>>
> >>>> a83e404 IB/srp: Reenable IB_MR_TYPE_SG_GAPS
> >>>> dfa5a2b mlx5: Avoid that mlx5_ib_sg_to_klms() overflows the klms[] array
> >>>> f759c80 mlx5: Fix mlx5_ib_map_mr_sg mr lengt
> >>>
> >>> Hi,
> >>> copying Sagi's request from different thread:
> >>>
> >>> "
> >>> Can you please enable srp_add_one debug:
> >>>
> >>> echo "func srp_add_one +p" > /sys/kernel/debug/dynamic_debug/control
> >>>
> >>> In addition apply the following:
> >>> --
> >>> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> >>> index d9c6c0ea750b..040fbc387e4f 100644
> >>> --- a/drivers/infiniband/hw/mlx5/mr.c
> >>> +++ b/drivers/infiniband/hw/mlx5/mr.c
> >>> @@ -1403,6 +1403,8 @@ mlx5_alloc_priv_descs(struct ib_device *device,
> >>>          int add_size;
> >>>          int ret;
> >>>
> >>> +       WARN_ON_ONCE(ndescs > device->attr.max_fast_reg_page_list_len);
> >>> +
> >>>          add_size = max_t(int, MLX5_UMR_ALIGN - ARCH_KMALLOC_MINALIGN, 0);
> >>>
> >>>          mr->descs_alloc = kzalloc(size + add_size, GFP_KERNEL);
> >>>
> >>> "
> >>>
> >>> Max.
> >>>
> >>>>
> >>>> Built and tested the kernel.
> >>>>
> >>>> However this issue is not resolved :(
> >>>>
> >>>> [ 2707.931909] scsi host1: ib_srp: failed RECV status WR flushed (5) for
> >>>> CQE ffff8817edca86b0
> >>>> [ 2708.089806] mlx5_0:dump_cqe:262:(pid 20129): dump error cqe
> >>>> [ 2708.121342] 00000000 00000000 00000000 00000000
> >>>> [ 2708.147104] 00000000 00000000 00000000 00000000
> >>>> [ 2708.172633] 00000000 00000000 00000000 00000000
> >>>> [ 2708.198702] 00000000 0f007806 2500002a 14a527d0
> >>>> [ 2732.434127] scsi host1: ib_srp: reconnect succeeded
> >>>> [ 2733.048023] scsi host1: ib_srp: failed RECV status WR flushed (5) for
> >>>> CQE ffff8817ed0a9c30
> >>>>
> >>>> [root@localhost ~]# [ 2746.413277] mlx5_0:dump_cqe:262:(pid 15877): dump
> >>>> error cqe
> >>>> [ 2746.443240] 00000000 00000000 00000000 00000000
> >>>> [ 2746.469323] 00000000 00000000 00000000 00000000
> >>>> [ 2746.495310] 00000000 00000000 00000000 00000000
> >>>> [ 2746.521407] 00000000 0f007806 25000032 003c7ad0
> >>>> [ 2752.445899] scsi host1: ib_srp: reconnect succeeded
> >>>> [ 2752.481835] scsi host1: ib_srp: failed RECV status WR flushed (5) for
> >>>> CQE ffff8817ed0a9cf0
> >>>> [ 2763.267386] mlx5_0:dump_cqe:262:(pid 15877): dump error cqe
> >>>> [ 2763.297826] 00000000 00000000 00000000 00000000
> >>>> [ 2763.323352] 00000000 00000000 00000000 00000000
> >>>> [ 2763.348722] 00000000 00000000 00000000 00000000
> >>>> [ 2763.374681] 00000000 0f007806 2500003a 00084bd0
> >>>>
> >>>> [root@localhost ~]# [ 2769.385203] fast_io_fail_tmo expired for SRP
> >>>> port-1:1 / host1.
> >>>> [ 2769.415956] scsi host1: ib_srp: reconnect succeeded
> >>>> [ 2769.450258] scsi host1: ib_srp: failed RECV status WR flushed (5) for
> >>>> CQE ffff8817ed0a9cf0
> >>>> [ 2780.064627] mlx5_0:dump_cqe:262:(pid 18771): dump error cqe
> >>>> [ 2780.093520] 00000000 00000000 00000000 00000000
> >>>> [ 2780.120067] 00000000 00000000 00000000 00000000
> >>>> [ 2780.145575] 00000000 00000000 00000000 00000000
> >>>> [ 2780.171153] 00000000 0f007806 25000042 000833d0
> >>>> [ 2785.923399] scsi host1: ib_srp: reconnect succeeded
> >>>> [ 2785.957504] scsi host1: ib_srp: failed RECV status WR flushed (5) for
> >>>> CQE ffff8817ed0a9cf0
> >>>> [ 2796.463426] mlx5_0:dump_cqe:262:(pid 18771): dump error cqe
> >>>> [ 2796.495257] 00000000 00000000 00000000 00000000
> >>>> [ 2796.521506] 00000000 00000000 00000000 00000000
> >>>> [ 2796.547640] 00000000 00000000 00000000 00000000
> >>>> [ 2796.573120] 00000000 0f007806 2500004a 00083bd0
> >>>> [ 2802.562578] scsi host1: ib_srp: reconnect succeeded
> >>>> [ 2802.596880] scsi host1: ib_srp: failed RECV status WR flushed (5) for
> >>>> CQE ffff8817ed0a9cf0
> >>>>
> >>>> Regards
> >>>> Laurence
> >>>>
> >>>
> >> Doing this now
> >> Thanks
> >> Laurence
> >
> > Max
> >
> > The Patch is not correct.
> >
> > drivers/infiniband/hw/mlx5/mr.c: In function 'mlx5_alloc_priv_descs':
> > drivers/infiniband/hw/mlx5/mr.c:1406:30: error: 'struct ib_device' has no
> > member named 'attr'
> >   WARN_ON_ONCE(ndescs > device->attr.max_fast_reg_page_list_len);
> >                               ^
> > ./include/asm-generic/bug.h:117:27: note: in definition of macro
> > 'WARN_ON_ONCE'
> >   int __ret_warn_once = !!(condition);   \
> >
> > I think you meant to give me
> >
> > WARN_ON_ONCE(ndescs > ib_device_attr->attr.max_fast_reg_page_list_len);
> >
> > Can you confirm?
> 
> Hi Laurence,
> should be device->attrs.max_fast_reg_page_list_len.
> 
> please check this one that might solve the issue (on top of everything):
> 
> 
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index b8f9382..063d116 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -1559,7 +1559,7 @@ struct ib_mr *mlx5_ib_alloc_mr(struct ib_pd *pd,
>                  mr->max_descs = ndescs;
>          } else if (mr_type == IB_MR_TYPE_SG_GAPS) {
>                  mr->access_mode = MLX5_MKC_ACCESS_MODE_KLMS;
> -
> +               MLX5_SET(mkc, mkc, translations_octword_size, ALIGN(max_num_sg + 1, 4));
>                  err = mlx5_alloc_priv_descs(pd->device, mr,
>                                              ndescs, sizeof(struct mlx5_klm));
>                  if (err)
> 
> thanks,
> Max.
> 
> >
> > Thanks
> > Laurence
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

Hello Max

I applied the corrected WARN_ON_ONCE patch and the patch above, on top of everything else as it was in Bart's tree.
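
For reference, here is a small user-space sketch of the arithmetic in the ALIGN(max_num_sg + 1, 4) expression from the patch quoted above. It only shows how the value is rounded up to a multiple of 4. The helper names align_up() and klm_octwords() are made up for this note and are not kernel functions, and any max_num_sg values you try are illustrative, not taken from the failing setup.

```c
#include <assert.h>

/*
 * User-space stand-in for the kernel's ALIGN(x, a) macro:
 * round x up to the next multiple of a, where a is a power of two.
 */
static unsigned int align_up(unsigned int x, unsigned int a)
{
	assert(a != 0 && (a & (a - 1)) == 0);	/* a must be a power of two */
	return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical helper: the value the quoted patch would pass for
 * translations_octword_size given max_num_sg.
 */
static unsigned int klm_octwords(unsigned int max_num_sg)
{
	return align_up(max_num_sg + 1, 4);
}
```

With this, klm_octwords(3) gives 4, klm_octwords(4) gives 8, and klm_octwords(256) gives 260.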

Still fails.

For a baseline, I can revert
a83e404 IB/srp: Reenable IB_MR_TYPE_SG_GAPS

and then test again to make sure we are starting from a good place.

Initiator log

[  280.481951] scsi host1: ib_srp: failed FAST REG status memory management operation error (6) for CQE ffff8817d9a881b8
[  301.149106] scsi host1: ib_srp: reconnect succeeded
[  301.280635] scsi host1: ib_srp: failed RECV status WR flushed (5) for CQE ffff8817ed32f2f0
[  334.596420] scsi host2: ib_srp: failed RECV status WR flushed (5) for CQE ffff8817c592c970
[  334.599689] mlx5_1:dump_cqe:262:(pid 20): dump error cqe
[  334.599691] 00000000 00000000 00000000 00000000
[  334.599692] 00000000 00000000 00000000 00000000
[  334.599692] 00000000 00000000 00000000 00000000
[  334.599693] 00000000 0f007806 2500002d 067b48d0
[  334.599697] scsi host2: ib_srp: failed FAST REG status memory management operation error (6) for CQE ffff8817c6e30078
[  336.117248] mlx5_0:dump_cqe:262:(pid 130): dump error cqe
[  336.145840] 00000000 00000000 00000000 00000000
[  336.171830] 00000000 00000000 00000000 00000000
[  336.197688] 00000000 00000000 00000000 00000000
[  336.223720] 00000000 0f007806 25000032 005408d0
[  339.712706] fast_io_fail_tmo expired for SRP port-1:1 / host1.
[  341.453634] scsi host1: ib_srp: reconnect succeeded
[  341.481600] mlx5_0:dump_cqe:262:(pid 130): dump error cqe
[  341.482145] scsi host1: ib_srp: failed RECV status WR flushed (5) for CQE ffff8817ecaf6970
[  341.559359] 00000000 00000000 00000000 00000000
[  341.585397] 00000000 00000000 00000000 00000000
[  341.610948] 00000000 00000000 00000000 00000000
[  341.637515] 00000000 0f007806 2500003d 000046d0
[  342.297598] sd 1:0:0:9: rejecting I/O to offline device
[  342.297936] sd 1:0:0:9: [sdg] tag#28 FAILED Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
[  342.297941] sd 1:0:0:9: [sdg] tag#28 CDB: Write(10) 2a 00 00 00 40 00 00 40 00 00
[  342.297943] blk_update_request: recoverable transport error, dev sdg, sector 16384
[  342.297951] sd 1:0:0:20: [sdar] tag#5 FAILED Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
[  342.297952] sd 1:0:0:20: [sdar] tag#15 FAILED Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
[  342.297956] sd 1:0:0:20: [sdar] tag#5 CDB: Write(10) 2a 00 00 03 c0 00 00 40 00 00
[  342.297956] sd 1:0:0:20: [sdar] tag#15 CDB: Write(10) 2a 00 00 2c c0 00 00 40 00 00
[  342.297958] blk_update_request: recoverable transport error, dev sdar, sector 245760
[  342.297959] blk_update_request: recoverable transport error, dev sdar, sector 2932736
[  342.298119] device-mapper: multipath: Failing path 8:96.
[  342.298266] sd 1:0:0:9: [sdg] tag#29 FAILED Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
[  342.298268] sd 1:0:0:9: [sdg] tag#29 CDB: Write(10) 2a 00 00 00 c0 00 00 40 00 00
[  342.298269] blk_update_request: recoverable transport error, dev sdg, sector 49152
[  342.298300] device-mapper: multipath: Failing path 66:176.
[  342.298486] sd 1:0:0:20: [sdar] tag#16 FAILED Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
[  342.298488] sd 1:0:0:20: [sdar] tag#6 FAILED Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
[  342.298489] sd 1:0:0:20: [sdar] tag#16 CDB: Write(10) 2a 00 00 2d 40 00 00 40 00 00
[  342.298490] sd 1:0:0:20: [sdar] tag#6 CDB: Write(10) 2a 00 00 04 40 00 00 40 00 00
[  342.298491] blk_update_request: recoverable transport error, dev sdar, sector 2965504
[  342.298492] blk_update_request: recoverable transport error, dev sdar, sector 278528
[  342.298582] sd 1:0:0:9: [sdg] tag#30 FAILED Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
[  342.298584] sd 1:0:0:9: [sdg] tag#30 CDB: Write(10) 2a 00 00 01 40 00 00 40 00 00
[  342.298585] blk_update_request: recoverable transport error, dev sdg, sector 81920
[  342.298889] sd 1:0:0:9: [sdg] tag#31 FAILED Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
[  342.298890] sd 1:0:0:9: [sdg] tag#31 CDB: Write(10) 2a 00 00 01 c0 00 00 40 00 00
[  342.298891] blk_update_request: recoverable transport error, dev sdg, sector 114688
[  342.298981] sd 1:0:0:20: [sdar] tag#7 FAILED Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
[  342.298983] sd 1:0:0:20: [sdar] tag#7 CDB: Write(10) 2a 00 00 04 c0 00 00 40 00 00
[  342.298985] blk_update_request: recoverable transport error, dev sdar, sector 311296
[  342.299004] sd 1:0:0:20: [sdar] tag#17 FAILED Result: hostbyte=DID_TRANSPORT_FAILFAST driverbyte=DRIVER_OK
[  342.299007] sd 1:0:0:20: [sdar] tag#17 CDB: Write(10) 2a 00 00 34 c0 00 00 40 00 00
[  342.299009] blk_update_request: recoverable transport error, dev sdar, sector 3457024
[  342.356353] device-mapper: multipath: Failing path 8:64.
[  342.356489] device-mapper: multipath: Failing path 8:128.
[  342.356628] device-mapper: multipath: Failing path 8:160.
[  342.356699] device-mapper: multipath: Failing path 8:176.
[  342.356767] device-mapper: multipath: Failing path 8:240.
[  342.356834] device-mapper: multipath: Failing path 8:208.
[  342.356900] device-mapper: multipath: Failing path 65:16.
[  342.356967] device-mapper: multipath: Failing path 65:64.
[  342.357035] device-mapper: multipath: Failing path 65:96.
[  342.357103] device-mapper: multipath: Failing path 65:128.
[  342.357169] device-mapper: multipath: Failing path 65:176.
[  342.357237] device-mapper: multipath: Failing path 65:208.
[  342.357303] device-mapper: multipath: Failing path 65:224.
[  342.357371] device-mapper: multipath: Failing path 66:0.
[  342.357454] device-mapper: multipath: Failing path 66:32.
[  342.357521] device-mapper: multipath: Failing path 66:48.
[  342.357647] device-mapper: multipath: Failing path 66:80.
[  342.357714] device-mapper: multipath: Failing path 66:112.
[  342.357781] device-mapper: multipath: Failing path 66:144.
[  342.357936] device-mapper: multipath: Failing path 66:208.
[  342.358019] device-mapper: multipath: Failing path 66:240.
[  342.358115] device-mapper: multipath: Failing path 67:16.
[  342.358183] device-mapper: multipath: Failing path 67:48.
[  342.358264] device-mapper: multipath: Failing path 67:80.
[  342.358359] device-mapper: multipath: Failing path 67:128.
[  342.358442] device-mapper: multipath: Failing path 67:160.
[  342.358594] device-mapper: multipath: Failing path 67:224.
[  342.358671] device-mapper: multipath: Failing path 67:208.
[  350.157728] scsi host2: ib_srp: reconnect succeeded
[  350.189605] mlx5_1:dump_cqe:262:(pid 4756): dump error cqe
[  350.193180] mlx5_1:dump_cqe:262:(pid 1275): dump error cqe
[  350.193182] 00000000 00000000 00000000 00000000
[  350.193182] 00000000 00000000 00000000 00000000
[  350.193183] 00000000 00000000 00000000 00000000
[  350.193183] 00000000 0f007806 25000035 04f569d0
[  350.193187] scsi host2: ib_srp: failed FAST REG status memory management operation error (6) for CQE ffff8817c6e30078
[  350.412637] 00000000 00000000 00000000 00000000
[  350.436431] 00000000 00000000 00000000 00000000
[  350.461871] 00000000 00000000 00000000 00000000
[  350.487549] 00000000 0f007806 25000032 000843d0
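
As a reading aid for the failed Write(10) commands in the log above, the following sketch decodes the CDB bytes using the standard WRITE(10) layout: opcode 0x2a in byte 0, a big-endian LBA in bytes 2..5 and a big-endian transfer length in bytes 7..8. The struct and function names are made up for this note. The 8 MiB figure in the note after the code assumes 512-byte logical blocks, which the matching sector numbers in the blk_update_request lines suggest but which is not stated in the log.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded fields of a WRITE(10) CDB (names are illustrative only). */
struct write10 {
	uint32_t lba;		/* starting logical block address */
	uint16_t nblocks;	/* transfer length in logical blocks */
};

/*
 * Decode a 10-byte WRITE(10) CDB.
 * Returns 0 on success, -1 if the opcode is not WRITE(10) (0x2a).
 */
static int decode_write10(const uint8_t cdb[10], struct write10 *out)
{
	assert(cdb && out);

	if (cdb[0] != 0x2a)
		return -1;

	/* Bytes 2..5: LBA, most significant byte first. */
	out->lba = ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
		   ((uint32_t)cdb[4] << 8) | (uint32_t)cdb[5];

	/* Bytes 7..8: transfer length, most significant byte first. */
	out->nblocks = (uint16_t)(((unsigned int)cdb[7] << 8) | cdb[8]);

	return 0;
}
```

For the first failed command, "2a 00 00 00 40 00 00 40 00 00", this gives LBA 16384 (matching "sector 16384") and a transfer length of 16384 blocks, which would be 8 MiB per write under the 512-byte assumption.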

Target log

These events happened after the first failures on the initiator.

[ 1111.029847] ib_srpt Received CM TimeWait exit for ch 0x4f6e72000390fe7c7cfe900300726ed3-49.
[ 1111.078815] ib_srpt Received CM TimeWait exit for ch 0x4f6e72000390fe7c7cfe900300726ed3-48.
[ 1111.127420] ib_srpt Received CM TimeWait exit for ch 0x4f6e72000390fe7c7cfe900300726ed3-47.
[ 1111.175801] ib_srpt Received CM TimeWait exit for ch 0x4f6e72000390fe7c7cfe900300726ed3-46.
[ 1111.223725] ib_srpt Received CM TimeWait exit for ch 0x4f6e72000390fe7c7cfe900300726ed3-45.
[ 1111.271957] ib_srpt Received CM TimeWait exit for ch 0x4f6e72000390fe7c7cfe900300726ed3-44.
[ 1111.319494] ib_srpt Received CM TimeWait exit for ch 0x4f6e72000390fe7c7cfe900300726ed3-43.
[ 1111.365795] ib_srpt Received CM TimeWait exit for ch 0x4f6e72000390fe7c7cfe900300726ed3-42.


Thread overview: 26+ messages
2017-04-24 22:15 [PATCH, untested] mlx5: Avoid that mlx5_ib_sg_to_klms() overflows the klms[] array Bart Van Assche
     [not found] ` <8992bd28-667f-94b1-e582-106e6b41aa4b-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2017-04-24 22:39   ` Laurence Oberman
     [not found]     ` <1726285260.1422143.1493073573791.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-04-24 22:46       ` Bart Van Assche
     [not found]         ` <1493073989.3394.24.camel-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2017-04-24 22:59           ` Laurence Oberman
2017-04-25 17:58   ` Leon Romanovsky
     [not found]     ` <20170425175849.GS14088-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-04-25 20:37       ` Laurence Oberman
     [not found]         ` <438230391.2090966.1493152655709.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-04-26  3:39           ` Bart Van Assche
     [not found]             ` <1493177952.3503.1.camel-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
2017-04-26 11:46               ` Laurence Oberman
     [not found]                 ` <1801288254.2280763.1493207193850.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-04-26 15:05                   ` Bart Van Assche
2017-04-26  6:16           ` Leon Romanovsky
     [not found]             ` <20170426061640.GV14088-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2017-04-26 10:30               ` Max Gurtovoy
2017-05-03  8:18               ` Sagi Grimberg
     [not found]                 ` <bcd56de8-0f17-f2bb-b079-bf22c1b92ca2-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2017-05-03 14:15                   ` Laurence Oberman
     [not found]                     ` <501334895.4531615.1493820950718.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-05-03 14:58                       ` Sagi Grimberg
     [not found]                         ` <374fcc74-4b84-610b-b55e-d385563bef6f-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
2017-05-05 16:31                           ` Laurence Oberman
     [not found]                             ` <1072634318.5542006.1494001866306.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-05-10 14:06                               ` Laurence Oberman
2017-04-26  8:31           ` Max Gurtovoy
     [not found]             ` <896e9a9e-43b6-7a21-e41b-861e4f795436-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2017-04-26 11:47               ` Laurence Oberman
     [not found]                 ` <288883138.2280971.1493207257218.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-04-26 12:18                   ` Laurence Oberman
     [not found]                     ` <497950649.2287440.1493209093092.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-04-26 12:20                       ` Laurence Oberman
2017-04-26 12:25                       ` Max Gurtovoy
     [not found]                         ` <16ea1371-84a5-c055-5b0c-fdc6d355276a-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2017-04-26 13:28                           ` Laurence Oberman [this message]
     [not found]                             ` <2122831810.2341766.1493213317484.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-04-26 13:50                               ` Laurence Oberman
     [not found]                                 ` <1879402127.2348907.1493214625254.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-04-26 15:10                                   ` Laurence Oberman
     [not found]                                     ` <1477402175.2378198.1493219418826.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2017-05-02 23:28                                       ` Max Gurtovoy
2017-04-26 14:45   ` Sagi Grimberg
