From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bart Van Assche
Subject: Re: [PATCH 5/8] IB/srp: Remove stale connection retry mechanism
Date: Fri, 03 Oct 2014 10:51:17 +0200
Message-ID: <542E6385.5060009@acm.org>
References: <541C27BF.6070609@acm.org> <541C287D.1050900@acm.org> <542D2A3C.2080009@acm.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------010204020203070708040708"
Return-path:
In-Reply-To: <542D2A3C.2080009-HInyCGIudOg@public.gmane.org>
Sender: linux-rdma-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Or Gerlitz, sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org
Cc: linux-rdma, Christoph Hellwig, Jens Axboe, Robert Elliott, Ming Lei
List-Id: linux-rdma@vger.kernel.org

This is a multi-part message in MIME format.
--------------010204020203070708040708
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

On 10/02/14 12:34, Bart Van Assche wrote:
> On 09/20/14 19:45, Or Gerlitz wrote:
>> On Fri, Sep 19, 2014 at 3:58 PM, Bart Van Assche wrote:
>>> Attempting to connect three times may be insufficient after an
>>> initiator system that was using multiple RDMA channels tries to
>>> relogin. Additionally, this login retry mechanism is a workaround
>>> for particular behavior of the IB/CM.
>>
>> Can you be more specific re the particular behavior of the IB CM?
>> added Sean, the CM maintainer.
>
> Let's focus on the software behavior instead of the people who are
> involved. What I have observed several times is that after a power cycle
> of the initiator system the first few login attempts are rejected. I was
> assuming that this was due to the IB/CM implementation but now that I
> have had another look at the logs I see that there is not enough
> information in the system logs to draw this conclusion. I will add
> additional logging statements in the initiator and target kernel code
> such that I can determine the root cause of this behavior.

(replying to my own e-mail / removed linux-scsi from CC-list)

So far I have been able to reproduce this behavior once after pushing
the reset button of the initiator system while it was in the connected
state. After the initiator system had finished rebooting I started
ibdump on both IB ports of the target system (attached to this e-mail).
What surprised me is that I found all the messages I expected in the
ibdump output (e.g. IB MAD device management query) but no CM messages.
Both sides were running FW 2.32.5100. The following messages were logged
at the initiator side while ibdump was running at the target side:

Oct 02 17:43:42 msi kernel: scsi host14: ib_srp: REJ received
Oct 02 17:43:42 msi kernel: scsi host14: REJ reason: stale connection
Oct 02 17:43:42 msi kernel: scsi host14: ib_srp: giving up on stale connection
Oct 02 17:43:42 msi kernel: scsi host14: ib_srp: Connection 0/12 failed
Oct 02 17:43:42 msi kernel: scsi host15: ib_srp: REJ received
Oct 02 17:43:42 msi kernel: scsi host15: REJ reason: stale connection
Oct 02 17:43:42 msi kernel: scsi host15: ib_srp: giving up on stale connection
Oct 02 17:43:42 msi kernel: scsi host15: ib_srp: Connection 0/12 failed

After a few more login attempts SRP login succeeded.

Bart.
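A minimal sketch, assuming kernel log lines in the format shown above,
of how the stale-connection rejections could be tallied per SCSI host
when comparing several reboot attempts. The function name and the
regular expression are illustrative assumptions derived only from the
log excerpt; they are not part of ib_srp or of any existing tool.

```python
import re
from collections import Counter

# Hypothetical pattern inferred from the log excerpt above, not from the
# ib_srp source: "scsi hostN: [ib_srp: ]<message>".
HOST_RE = re.compile(r"scsi (host\d+): (?:ib_srp: )?(.*)$")


def count_stale_rejects(lines):
    """Count 'REJ reason: stale connection' messages per SCSI host."""
    counts = Counter()
    for line in lines:
        m = HOST_RE.search(line)
        if m and m.group(2).strip() == "REJ reason: stale connection":
            counts[m.group(1)] += 1
    return dict(counts)


# The eight lines logged at the initiator side, copied from the message.
LOG = """\
Oct 02 17:43:42 msi kernel: scsi host14: ib_srp: REJ received
Oct 02 17:43:42 msi kernel: scsi host14: REJ reason: stale connection
Oct 02 17:43:42 msi kernel: scsi host14: ib_srp: giving up on stale connection
Oct 02 17:43:42 msi kernel: scsi host14: ib_srp: Connection 0/12 failed
Oct 02 17:43:42 msi kernel: scsi host15: ib_srp: REJ received
Oct 02 17:43:42 msi kernel: scsi host15: REJ reason: stale connection
Oct 02 17:43:42 msi kernel: scsi host15: ib_srp: giving up on stale connection
Oct 02 17:43:42 msi kernel: scsi host15: ib_srp: Connection 0/12 failed
""".splitlines()

if __name__ == "__main__":
    print(count_stale_rejects(LOG))
```

Feeding the same function a longer journal excerpt would show how many
login attempts were rejected on each host before SRP login succeeded.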
--------------010204020203070708040708 Content-Type: application/vnd.tcpdump.pcap; name="p1.pcap" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="p1.pcap" --------------010204020203070708040708 Content-Type: application/vnd.tcpdump.pcap; name="p2.pcap" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="p2.pcap" 1MOyoQIABAAAAAAAAAAAAP//AADFAAAArnItVMIrCAAyAQAAMgEAAM4jFomuci1UFQQBMgAA ASIAAgADAEgABGQA//8AAAABAAAAFYABAAAAAAABAQMCAQAAAAAAAAAXAAAAAQABAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAE1eKZyUUK5yLVQcLAgAMgEAADIBAADECRyJrnItVBUEATIAAAEi AAIABABIAANkAP//AAAAAQAAA7yAAQAAAAAAAQEDAoEAAAAAAAAAFwAAAAEAAQAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQImAgAAfRAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAABgAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAQAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAACvxB5jr/iuci1ULCwIADIBAAAyAQAANBYdia5yLVQVBAEyAAABIgAC AAMASAAEZAD//wAAAAEAAAAWgAEAAAAAAAEBAwISAAAAAAAAABcAAAACABIAAIAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAI AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAUP2gQYgWrnItVF4sCAAyAQAAMgEAABJdIImuci1UFQQBMgAAASIAAgAE AEgAA2QA//8AAAABAAADvYABAAAAAAABAQMCkgAAAAAAAAAXAAAAAgASAACAAAAAAQEHAAAA AAEAAABcAAAAAAAAAAAACQAAAAAAAAAAAIAAAwIAAAAAAAAAAAD+gAAAAAAAAAADAAMCWUhq AAAAAAIDAwJ0UgBHUEAECAgFH0AAAAAAAACAEBCIAAAAAAAAAAARAQAAAAAAAAAAAAAAAAAA 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAChryc2ZtK5yLVRmLAgAMgEAADIBAABK4yCJrnItVBUEATIAAAEiAAIAAwBI AARkAP//AAAAAQAAABeAAQAAAAAAAQEDAhIAAAAAAAAAFwAAAAIAEgAAgAAAAAECAQAAAAAB AAAAAQAAAAAAAAAAAAkAAAAAAAAAAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAANoVkLUYSuci1UaywIADIBAAAyAQAALTchia5yLVQVBAEyAAABIgACAAMASAAE ZAD//wAAAAEAAAAYgAEAAAAAAAEBAwIBAAAAAAAAABcAAAADABEAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAQADAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAwLG+1dMnrnItVI0sCAAyAQAAMgEAAJtxI4muci1UFQQBMgAAASIAAgAEAEgAA2QA //8AAAABAAADvoABAAAAAAABAQMCgQAAAAAAAAAXAAAAAwARAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAADgAAAAAAAAAAAAEAAwAAAQEBAgACyQMAo0JzAALJAwCjQnAAAskDAKNCcgCA EAMAAAAAAgACyWFzdXMgSENBLTEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAGIyBRgYV65yLVRRLQgAMgEAADIBAAD3STCJrnItVBUEATIAAAEiAAIAAwBIAARkAP// AAAAAQAAABmAAQAAAAAAAQEDAgEAAAAAAAAAFwAAAAQANQAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAACAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAADAAQAAAAAAAD//wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAhPNIjximuci1Uxy0IADIBAAAyAQAAsQU4ia5yLVQVBAEyAAABIgACAAQASAADZAD//wAA 
AAEAAAO/gAEAAAAAAAEBAwKBAAAAAAAAABcAAAAEADUAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAIAAAAAAAAAAAgMAAAAAAAAAAA/oAAAAAAAAAAAskDAKNCcv6AAAAAAAAAAALJAwD6 t/IAAwAEAAAAAACA//8AAIWMkAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 1oc4Q1JWrnItVJRqCAAyAQAAMgEAAAwZNI2uci1UFQQBMgAAASIAAgADAEgABGQA//8AAAAB AAAAGoABAAAAAAABAQYBAQAAAAAAAAAXAAAABQAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAPVm 7h75Kq5yLVSbaggAMgEAADIBAAB9jjSNrnItVBUEATIAAAEiAAIABABIAANkAP//AAAAAQAA A8CAAQAAAAAAAQEGAYEAAAAAAAAAFwAAAAUAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAARAAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABZXzuA 81Wuci1Uw2oIADIBAAAyAQAAlS03ja5yLVQVBAEyAAABIgACAAMASAAEZAD//wAAAAEAAAAb gAEAAAAAAAEBBgEBAAAAAAAAABcAAAAGABEAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA9xix+pe1 rnItVMlqCAAyAQAAMgEAAD+SN42uci1UFQQBMgAAASIAAgAEAEgAA2QA//8AAAABAAADwYAB AAAAAAABAQYBgQAAAAAAAAAXAAAABgARAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAACyQMAo0JwAAACyQAAEAMAAAAAAAACyQAAAAABAGCeAQgAAQAA AAAP/wAEAAAgUAABAAArAAEAAAAAAAAAAABGVVNJT05JTyBTUlAgdGFyZ2V0AA== 
--------------010204020203070708040708--