linux-rdma.vger.kernel.org archive mirror
From: Vu Pham <vu-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
To: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
Cc: Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	David Dillow <dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>,
	Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Oren Duer <oren-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Sagi Grimberg <sagig-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Subject: Re: [PATCH for 3.8 v3, resend 0/3] IB/SRP patches for kernel 3.8
Date: Wed, 6 Feb 2013 13:42:56 -0800
Message-ID: <5112CE60.2030607@mellanox.com>
In-Reply-To: <5112049B.8030406-HInyCGIudOg@public.gmane.org>

Bart Van Assche wrote:
> On 02/05/13 21:54, Or Gerlitz wrote:
>> On Tue, Feb 5, 2013 at 6:25 PM, Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org> 
>> wrote:
>>> On 02/04/13 22:11, Or Gerlitz wrote:
>> Bart, I'd like to sharpen the point: could you please clarify if the
>> series posted to linux-rdma stands for itself in the sense that SRP HA
>> scheme X (please state it) now works/better when the patches applied
>> on top of the latest 3.8-rc cut? OR for X to do better/work, one needs
>> this series AND the one you posted to linux-scsi.
>
> Hello Or,
>
> A huge number of patches have been taken upstream between 3.8-rc1 and 
> 3.8-rc6. I have retested these three patches with 3.8-rc6 and would 
> appreciate if you would also repeat your tests.
>
> Thanks,
>
> Bart.
Hello Bart,

I tested your 3.8 v3 patchset. I did the following:
- cloned and checked out the for-next branch of Roland's ib tree
- applied your 3.8 v3 patchset
- applied the "save & restore host_scribble during error handling" patch - 
http://www.mail-archive.com/linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org/msg17809.html

I have two paths to the target, through ports 1 and 2 (scsi_host host9 and host10).

- ran I/Os
- disabled port 1 @ 19:11:30
- error recovery for host9 kicked in @ 19:12:04
- multipath removed the path, I/Os failed over @ 19:12:51
- error recovery was still going on for host9 (the sysfs entry for host9 
was still intact)
- enabled port 1 @ 19:15:00
- host9 reconnected to the target through error recovery and multipathd 
reinstated the path in the kernel; then host9 was REMOVED, and user-mode 
"multipath -l" did not show the reinstated path through host9

Feb  6 19:15:04 vsa30 kernel: scsi host9: SRP abort called
Feb  6 19:15:05 vsa30 multipathd: overflow in attribute 
'/sys/devices/pci0000:00/0000:00:02.0/0000:02:00.0/host9/target9:0:0/9:0:0:2/state'
Feb  6 19:15:14 vsa30 kernel: scsi host9: SRP abort called
Feb  6 19:15:14 vsa30 kernel: scsi host9: SRP reset_device called
Feb  6 19:15:14 vsa30 kernel: scsi host9: ib_srp: SRP reset_host called
Feb  6 19:15:14 vsa30 kernel: scsi host9: ib_srp: reconnect succeeded
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180003: sdd 
- tur checker reports path is up
Feb  6 19:15:26 vsa30 multipathd: 8:48: reinstated
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180003: 
remaining active paths: 2
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180002: sdc 
- tur checker reports path is up
Feb  6 19:15:26 vsa30 multipathd: 8:32: reinstated
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180002: 
remaining active paths: 2
Feb  6 19:15:26 vsa30 multipathd: sdc: remove path (uevent)
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180002: 
load table [0 409600 multipath 0 0 1 1 round-robin 0 1 1 8:80 1]
Feb  6 19:15:26 vsa30 multipathd: sdc: path removed from map 
3600144f0665c4400000050a522180002
Feb  6 19:15:26 vsa30 kernel: sd 9:0:0:1: [sdc] Synchronizing SCSI cache
Feb  6 19:15:26 vsa30 multipathd: sdd: remove path (uevent)
Feb  6 19:15:26 vsa30 multipathd: 3600144f0665c4400000050a522180003: 
load table [0 409600 multipath 0 0 1 1 round-robin 0 1 1 8:96 1]
Feb  6 19:15:26 vsa30 multipathd: sdd: path removed from map 
3600144f0665c4400000050a522180003
Feb  6 19:15:26 vsa30 kernel: sd 9:0:0:2: [sdd] Synchronizing SCSI cache

- disabled port 2 @ 19:22:50
- error recovery kicked in on host10 @ 19:23:40
- I/Os failed with NO path to the target @ 19:24:27
- with port 2 still disabled, error recovery kept running on host10 
until 19:57:52 and then stopped
- host10 was still present in sysfs (/sys/class/scsi_host/host10) and 
was still holding a reference on the ib_srp module
- enabled port 2 - nothing happened
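
The intervals implied by the timestamps in the two sequences above can be worked out with a short script. This is only a sketch: the `elapsed` helper and the event labels are made up for illustration, and it assumes that all timestamps fall on the same day.

```python
from datetime import datetime


def elapsed(start: str, end: str) -> int:
    """Return the number of seconds between two same-day HH:MM:SS timestamps."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


# Timestamps copied from the test sequences above; the labels are illustrative.
events = {
    "host9: port down -> error recovery starts": ("19:11:30", "19:12:04"),
    "host9: port down -> multipath fail-over": ("19:11:30", "19:12:51"),
    "host9: port down -> port re-enabled": ("19:11:30", "19:15:00"),
    "host10: port down -> error recovery starts": ("19:22:50", "19:23:40"),
    "host10: port down -> I/Os fail": ("19:22:50", "19:24:27"),
    "host10: port down -> error recovery stops": ("19:22:50", "19:57:52"),
}

for label, (start, end) in events.items():
    secs = elapsed(start, end)
    print(f"{label}: {secs // 60} min {secs % 60} s")
```

The last interval comes out at 35 min 2 s, which is where the ">35 minutes" figure comes from, and port 1 was re-enabled 3 min 30 s after it went down.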

Conclusion:
1. If the port/path stays disabled long enough (>35 minutes), we are left 
with a dangling scsi host.
2. If the port is enabled within 30 minutes, the scsi host re-establishes 
the connection and the path is reinstated, but then the scsi_host is 
removed (no entry in sysfs).
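
Both outcomes come down to whether the scsi_host entry is still present under /sys/class/scsi_host. A minimal sketch of that check follows; the function name `scsi_hosts_present` and its `sysfs_root` parameter are hypothetical and only there so the check can be pointed at a different directory. It is not part of ib_srp or multipath-tools.

```python
import os


def scsi_hosts_present(names, sysfs_root="/sys/class/scsi_host"):
    """Map each host name to True if an entry for it exists under sysfs_root."""
    return {
        name: os.path.exists(os.path.join(sysfs_root, name))
        for name in names
    }


if __name__ == "__main__":
    # host9 and host10 are the two SCSI hosts from the test above.
    for name, present in scsi_hosts_present(["host9", "host10"]).items():
        print(f"{name}: {'present' if present else 'missing'}")
```

In case 1 this would be expected to report host10 as present long after error recovery has stopped; in case 2 it would report host9 as missing once the host has been removed.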

I have attached a log that shows the events described above.

thanks,
-vu

[-- Attachment #2: messages.bz2 --]
[-- Type: application/octet-stream, Size: 10661 bytes --]
