From: Bart Van Assche <bvanassche-HInyCGIudOg@public.gmane.org>
To: Mike Christie <michaelc-hcNo3dDEHLuVc3sceRu5cw@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
David Dillow <dillowda-1Heg1YXhbW8@public.gmane.org>,
Vu Pham <vuhuong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
Sebastian Riemer
<sebastian.riemer-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>,
linux-rdma <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
linux-scsi <linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
James Bottomley
<jbottomley-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
Subject: Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling
Date: Mon, 24 Jun 2013 09:37:16 +0200
Message-ID: <51C7F72C.2050004@acm.org>
In-Reply-To: <51C764FB.6070207-hcNo3dDEHLuVc3sceRu5cw@public.gmane.org>
On 06/23/13 23:13, Mike Christie wrote:
> On 06/12/2013 08:28 AM, Bart Van Assche wrote:
>> +	/*
>> +	 * It can occur that after fast_io_fail_tmo expired and before
>> +	 * dev_loss_tmo expired that the SCSI error handler has
>> +	 * offlined one or more devices. scsi_target_unblock() doesn't
>> +	 * change the state of these devices into running, so do that
>> +	 * explicitly.
>> +	 */
>> +	spin_lock_irq(shost->host_lock);
>> +	__shost_for_each_device(sdev, shost)
>> +		if (sdev->sdev_state == SDEV_OFFLINE)
>> +			sdev->sdev_state = SDEV_RUNNING;
>> +	spin_unlock_irq(shost->host_lock);
>
> Is it possible for this to race with scsi_eh_offline_sdevs? Can it be
> looping over cmds offlining devices while this is looping over devices
> onlining them?
>
> It seems this can also happen for all transports/drivers. Maybe a scsi
> eh/lib helper function that synchronizes with the scsi eh completion
> would be better.
I'm not sure such a race can be avoided without introducing a new
mutex. How about something like the (untested) SCSI core patch below,
combined with invoking scsi_block_eh() and scsi_unblock_eh() around any
reconnect activity that is not initiated from the SCSI EH thread?
[PATCH] Add scsi_block_eh() and scsi_unblock_eh()
---
drivers/scsi/hosts.c | 1 +
drivers/scsi/scsi_error.c | 10 ++++++++++
include/scsi/scsi_host.h | 1 +
3 files changed, 12 insertions(+)
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 17e2ccb..0df3ec8 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -360,6 +360,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	init_waitqueue_head(&shost->host_wait);
 	mutex_init(&shost->scan_mutex);
+	mutex_init(&shost->block_eh_mutex);
 	/*
 	 * subtract one because we increment first then return, but we need to
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index ab16930..566daaa 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -551,6 +551,10 @@ static int scsi_begin_eh(struct Scsi_Host *host)
 {
 	int res;
+	res = mutex_lock_interruptible(&host->block_eh_mutex);
+	if (res)
+		goto out;
+
 	spin_lock_irq(host->host_lock);
 	switch (host->shost_state) {
 	case SHOST_DEL:
@@ -565,6 +569,10 @@ static int scsi_begin_eh(struct Scsi_Host *host)
 	}
 	spin_unlock_irq(host->host_lock);
+	if (res)
+		mutex_unlock(&host->block_eh_mutex);
+
+out:
 	return res;
 }
@@ -579,6 +587,8 @@ static void scsi_end_eh(struct Scsi_Host *host)
 	if (host->eh_active == 0)
 		wake_up(&host->host_wait);
 	spin_unlock_irq(host->host_lock);
+
+	mutex_unlock(&host->block_eh_mutex);
 }
/**
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 9785e51..d7ce065 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -573,6 +573,7 @@ struct Scsi_Host {
 	spinlock_t *host_lock;
 	struct mutex scan_mutex;/* serialize scanning activity */
+	struct mutex block_eh_mutex; /* block ML LLD EH calls */
 	struct list_head eh_cmd_q;
 	struct task_struct * ehandler; /* Error recovery thread. */
--
1.7.10.4
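
The patch above only adds block_eh_mutex and takes it in scsi_begin_eh() / scsi_end_eh(); the bodies of scsi_block_eh() and scsi_unblock_eh() are not shown. As a rough illustration of the intended serialization, here is a userspace model that assumes those two helpers would simply lock and unlock block_eh_mutex. That assumption, the pthread mutex standing in for the kernel mutex, and every model_* name are hypothetical and not part of the patch.

```c
#include <pthread.h>

/* Hypothetical userspace stand-in for struct Scsi_Host. */
struct model_host {
	pthread_mutex_t block_eh_mutex;	/* models shost->block_eh_mutex */
	int dev_offline;		/* 1 ~ SDEV_OFFLINE, 0 ~ SDEV_RUNNING */
};

/* Assumed shape of scsi_block_eh(): keep the EH thread from starting. */
static void model_block_eh(struct model_host *h)
{
	pthread_mutex_lock(&h->block_eh_mutex);
}

/* Assumed shape of scsi_unblock_eh(): let the EH thread run again. */
static void model_unblock_eh(struct model_host *h)
{
	pthread_mutex_unlock(&h->block_eh_mutex);
}

/* Models one EH pass: the mutex is held from begin_eh until end_eh. */
static void *model_eh_thread(void *arg)
{
	struct model_host *h = arg;

	pthread_mutex_lock(&h->block_eh_mutex);		/* ~ scsi_begin_eh() */
	h->dev_offline = 1;				/* EH offlines a device */
	pthread_mutex_unlock(&h->block_eh_mutex);	/* ~ scsi_end_eh() */
	return NULL;
}

/* Models a reconnect that is not initiated from the EH thread. */
static void model_reconnect(struct model_host *h)
{
	model_block_eh(h);
	h->dev_offline = 0;	/* bring offlined devices back to running */
	model_unblock_eh(h);
}
```

In this model a caller would start model_eh_thread() with pthread_create() and call model_reconnect() from another thread. Because both paths hold the same mutex, the offlining loop and the onlining loop can no longer interleave, which is the race raised in the review comment.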