linux-rdma.vger.kernel.org archive mirror
From: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
To: Or Gerlitz <gerlitz.or-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org
Cc: linux-rdma <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>,
	Robert Elliott <Elliott-VXdhtT5mjnY@public.gmane.org>,
	Ming Lei <ming.lei-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Subject: Re: [PATCH 5/8] IB/srp: Remove stale connection retry mechanism
Date: Fri, 03 Oct 2014 10:51:17 +0200
Message-ID: <542E6385.5060009@acm.org>
In-Reply-To: <542D2A3C.2080009-HInyCGIudOg@public.gmane.org>

On 10/02/14 12:34, Bart Van Assche wrote:
> On 09/20/14 19:45, Or Gerlitz wrote:
>> On Fri, Sep 19, 2014 at 3:58 PM, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> wrote:
>>> Attempting to connect three times may be insufficient after an
>>> initiator system that was using multiple RDMA channels tries to
>>> relogin. Additionally, this login retry mechanism is a workaround
>>> for particular behavior of the IB/CM.
>>
>> Can you be more specific re the particular behavior of the IB CM?
>> added Sean, the CM maintainer.
> 
> Let's focus on the software behavior instead of the people who are 
> involved. What I have observed several times is that after a power cycle 
> of the initiator system the first few login attempts are rejected. I was 
> assuming that this was due to the IB/CM implementation but now that I 
> have had another look at the logs I see that there is not enough 
> information in the system logs to draw this conclusion. I will add 
> additional logging statements in the initiator and target kernel code 
> such that I can determine the root cause of this behavior.

(replying to my own e-mail / removed linux-scsi from CC-list)

So far I have been able to reproduce this behavior once, after pushing
the reset button of the initiator system while it was in the connected
state. After the initiator system had finished rebooting I started
ibdump on both IB ports of the target system (the captures are attached
to this e-mail). What surprised me is that the ibdump output contains
all the messages I expected (e.g. IB MAD device management query) but
no CM messages. Both sides were running FW 2.32.5100. The following
messages were logged at the initiator side while ibdump was running at
the target side:

Oct 02 17:43:42 msi kernel: scsi host14: ib_srp: REJ received
Oct 02 17:43:42 msi kernel: scsi host14:   REJ reason: stale connection
Oct 02 17:43:42 msi kernel: scsi host14: ib_srp: giving up on stale connection
Oct 02 17:43:42 msi kernel: scsi host14: ib_srp: Connection 0/12 failed
Oct 02 17:43:42 msi kernel: scsi host15: ib_srp: REJ received
Oct 02 17:43:42 msi kernel: scsi host15:   REJ reason: stale connection
Oct 02 17:43:42 msi kernel: scsi host15: ib_srp: giving up on stale connection
Oct 02 17:43:42 msi kernel: scsi host15: ib_srp: Connection 0/12 failed

After a few more login attempts the SRP login succeeded.
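
For longer logs, the initiator-side messages above can be tallied per
SCSI host with a short script. The following is a minimal sketch in
Python and is not taken from the thread: the function name summarize(),
the LOG constant and the regular expression are illustrative
assumptions, written only against the eight log lines shown above. It
counts "stale connection" rejections and failed connections per host.

```python
import re
from collections import Counter

# Sample input: the eight initiator-side lines from the log excerpt.
LOG = """\
Oct 02 17:43:42 msi kernel: scsi host14: ib_srp: REJ received
Oct 02 17:43:42 msi kernel: scsi host14:   REJ reason: stale connection
Oct 02 17:43:42 msi kernel: scsi host14: ib_srp: giving up on stale connection
Oct 02 17:43:42 msi kernel: scsi host14: ib_srp: Connection 0/12 failed
Oct 02 17:43:42 msi kernel: scsi host15: ib_srp: REJ received
Oct 02 17:43:42 msi kernel: scsi host15:   REJ reason: stale connection
Oct 02 17:43:42 msi kernel: scsi host15: ib_srp: giving up on stale connection
Oct 02 17:43:42 msi kernel: scsi host15: ib_srp: Connection 0/12 failed
"""

# Hypothetical pattern: "scsi hostN:" followed by the message text.
# The "ib_srp: " prefix is optional because the "REJ reason" lines omit it.
LINE_RE = re.compile(r"scsi host(\d+):\s+(?:ib_srp:\s+)?(.*)$")


def summarize(log_text):
    """Return (stale, failed): per-host counts of stale-connection
    rejections and of 'Connection x/y failed' messages."""
    stale = Counter()
    failed = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue  # not an ib_srp / scsi host line
        host = int(match.group(1))
        msg = match.group(2).strip()
        if msg == "REJ reason: stale connection":
            stale[host] += 1
        elif re.fullmatch(r"Connection \d+/\d+ failed", msg):
            failed[host] += 1
    return stale, failed


if __name__ == "__main__":
    stale, failed = summarize(LOG)
    for host in sorted(stale):
        print(f"host{host}: {stale[host]} stale REJ, {failed[host]} failed")
```

Feeding a complete journal into summarize() would show how many login
attempts each host needed before the stale-connection rejections
stopped; the sketch assumes the message wording matches the lines above
exactly.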

Bart.

[-- Attachment #2: p1.pcap --]
[-- Type: application/vnd.tcpdump.pcap, Size: 0 bytes --]

[-- Attachment #3: p2.pcap --]
[-- Type: application/vnd.tcpdump.pcap, Size: 4096 bytes --]
