linux-scsi.vger.kernel.org archive mirror
From: Bart Van Assche <bvanassche@acm.org>
To: Jack Wang <jinpu.wang@profitbricks.com>
Cc: David Dillow <dillowda@ornl.gov>, Vu Pham <vuhuong@mellanox.com>,
	Sebastian Riemer <sebastian.riemer@profitbricks.com>,
	linux-rdma <linux-rdma@vger.kernel.org>,
	linux-scsi <linux-scsi@vger.kernel.org>,
	James Bottomley <jbottomley@parallels.com>,
	Roland Dreier <roland@kernel.org>
Subject: Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
Date: Wed, 19 Jun 2013 17:27:04 +0200
Message-ID: <51C1CDC8.4070103@acm.org>
In-Reply-To: <51C1B5CA.2030302@profitbricks.com>

On 06/19/13 15:44, Jack Wang wrote:
>> +		/*
>> +		 * It can occur that after fast_io_fail_tmo expired and before
>> +		 * dev_loss_tmo expired that the SCSI error handler has
>> +		 * offlined one or more devices. scsi_target_unblock() doesn't
>> +		 * change the state of these devices into running, so do that
>> +		 * explicitly.
>> +		 */
>> +		spin_lock_irq(shost->host_lock);
>> +		__shost_for_each_device(sdev, shost)
>> +			if (sdev->sdev_state == SDEV_OFFLINE)
>> +				sdev->sdev_state = SDEV_RUNNING;
>> +		spin_unlock_irq(shost->host_lock);
>
> Do you have test case to verify this behaviour?

Hello Jack,

This is what I came up with after analyzing why a so-called "port
flapping" test failed. The concept of that test is simple: use
ibportstate to disable and re-enable the proper IB port on the switch
at random intervals and check whether I/O starts running again if the
path remains operational long enough. When running such a test for a few
days, with random intervals between a few seconds and a few minutes,
sooner or later scsi_try_host_reset() will succeed and
scsi_eh_test_devices() will fail. That causes the SCSI error handler to
offline devices. Hence the above code, which changes the offline state
into running after a reconnect succeeds. I'm not proud of that code but
I couldn't find a better solution. Maybe the above code won't be
necessary anymore once we switch to Hannes' new SCSI error handler.
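
To make the intended state transition easier to reason about outside the
kernel, here is a minimal userspace sketch of the loop quoted above. It
is only a model: struct mock_sdev, the reduced enum and
mock_online_offlined_devices() are hypothetical stand-ins, not kernel
APIs, and the host_lock locking from the patch is left out on purpose.

```c
#include <stddef.h>

/*
 * Hypothetical userspace stand-ins. The real enum scsi_device_state has
 * more states than these three; only the names are borrowed.
 */
enum sdev_state { SDEV_RUNNING, SDEV_BLOCK, SDEV_OFFLINE };

struct mock_sdev {
	enum sdev_state sdev_state;
};

/*
 * Model of the loop in the patch: after a reconnect succeeded, move every
 * device that the error handler offlined back to running. Devices in any
 * other state are left untouched. Returns the number of devices changed.
 */
static size_t mock_online_offlined_devices(struct mock_sdev *devs, size_t n)
{
	size_t changed = 0;

	for (size_t i = 0; i < n; i++) {
		if (devs[i].sdev_state == SDEV_OFFLINE) {
			devs[i].sdev_state = SDEV_RUNNING;
			changed++;
		}
	}
	return changed;
}
```

Calling the helper a second time on the same array changes nothing,
which matches the expectation that the fixup is harmless when no device
was offlined between fast_io_fail_tmo and the reconnect.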

Bart.
