From: David Dillow <dillowda@ornl.gov>
To: Bart Van Assche <bvanassche@acm.org>
Cc: "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
Fujita Tomonori <fujita.tomonori@lab.ntt.co.jp>,
Brian King <brking@linux.vnet.ibm.com>,
Roland Dreier <roland@purestorage.com>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH 12/14] ib_srp: Rework error handling
Date: Mon, 19 Dec 2011 17:51:23 -0500
Message-ID: <1324335083.7043.66.camel@lap75545.ornl.gov>
In-Reply-To: <CAO+b5-q332RKkLrUV7X_D73SHBrLpt=6Xz=W7ZtOwZ+9iSOyDw@mail.gmail.com>
On Mon, 2011-12-19 at 05:38 -0500, Bart Van Assche wrote:
> On Mon, Dec 19, 2011 at 3:36 AM, David Dillow <dillowda@ornl.gov> wrote:
> > On Thu, 2011-12-01 at 20:10 +0100, Bart Van Assche wrote:
> >> Rework ib_srp transport layer error handling. Instead of letting SCSI
> >> commands time out if a transport layer error occurs,
> >
> > This is good, but should probably be part of the initial disconnect. We
> > want to run the receive queue dry, processing responses to avoid
> > unnecessarily resetting a command that was successful.
>
> Blocking the SCSI host doesn't prevent ib_srp from continuing to
> process IB completion notifications.
In your series, it doesn't -- and I was wrong about the message spam,
you check for the proper state and the work being queued.
I haven't parsed it all out from your changes just yet, but I think part
of the reason you may have had problems with req->scmd being null in
srp_handle_recv() is due to a new race between the tear down of the
connection and continuing to process completion notifications.
> >> block the SCSI
> >> host and try to reconnect until the reconnect timeout elapses or until
> >> the maximum number of reconnect attempts has been exceeded, whichever
> >> happens first.
> >
> > When we're blocked for a disconnected target, we may want to call the
> > state something other than SRP_TARGET_BLOCKED. I'd like to eventually
> > handle the case where the target leaves the fabric temporarily -- we
> > don't necessarily disconnect in that case, but we need to block commands
> > until it comes back or we give up on it.
>
> The case where the target leaves the fabric temporarily should already
> be covered by this patch series.
Only once a command is sent to it and times out or we get a QP error. The
idea here would be to start the dev_loss_tmo timer once it was detected
that it left the fabric. This was handled in OFED via sysfs attributes,
but it seems like we could register for the events ourselves.
This isn't something I expect you to implement, but I'd like to leave
room for it in the future.
> >> +static void srp_remove_target(struct srp_target_port *target)
> >> {
> >
> >> srp_del_scsi_host_attr(target->scsi_host);
> >> + cancel_work_sync(&target->block_work);
> >> + mutex_lock(&target->mutex);
> >> + mutex_unlock(&target->mutex);
> >
> > You lock and unlock here without doing anything in the critical section.
>
> That's on purpose.
I'm assuming you do this to ensure that everyone has seen the
appropriate state and exited the critical section, then? Best to add a
comment to that effect. Actually, is there any reason to unlock it again
since we're removing the target? I don't have the code in front of me
ATM, so I'm assuming that you don't call any other routines that need
the lock after this. sparse may complain, though.
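To make the idiom I'm assuming concrete, here is a minimal userspace
sketch. It uses pthreads rather than kernel mutexes, and struct target,
worker(), remove_target() and run_demo() are made-up names for
illustration, not ib_srp code. The remover publishes the new state and
then takes and drops the mutex with nothing in between, purely to wait
for anyone still inside the critical section:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct srp_target_port. */
struct target {
	pthread_mutex_t mutex;
	atomic_bool removed;	/* state flag, checked under mutex */
	int work_done;		/* written only under mutex */
};

/* Plays the role of any path that does work under target->mutex. */
static void *worker(void *arg)
{
	struct target *t = arg;

	pthread_mutex_lock(&t->mutex);
	if (!atomic_load(&t->removed))
		t->work_done++;
	pthread_mutex_unlock(&t->mutex);
	return NULL;
}

static int remove_target(struct target *t)
{
	atomic_store(&t->removed, true);

	/*
	 * Empty critical section on purpose: once we have held the mutex,
	 * every worker that entered before the flag was set has left, and
	 * every later worker will observe removed == true.
	 */
	pthread_mutex_lock(&t->mutex);
	pthread_mutex_unlock(&t->mutex);

	/* No worker can modify work_done from here on. */
	return t->work_done;
}

/* Returns 1 if no work was done after remove_target() returned. */
int run_demo(void)
{
	enum { NWORKERS = 8 };
	struct target t = { .work_done = 0 };
	pthread_t tid[NWORKERS];
	int i, rc, at_removal;

	pthread_mutex_init(&t.mutex, NULL);
	atomic_init(&t.removed, false);

	for (i = 0; i < NWORKERS; i++) {
		rc = pthread_create(&tid[i], NULL, worker, &t);
		assert(rc == 0);
	}

	at_removal = remove_target(&t);

	for (i = 0; i < NWORKERS; i++)
		pthread_join(tid[i], NULL);
	pthread_mutex_destroy(&t.mutex);

	return at_removal == t.work_done;
}
```

If that is the intent, a short comment like the one in remove_target()
above would save the next reader from wondering about it.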
--
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office