From: Vu Pham <vuhuong-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
To: Roland Dreier <rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Cc: Linux RDMA list <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [ofa-general][PATCH 3/4] SRP fail-over faster
Date: Wed, 14 Oct 2009 13:37:21 -0700 [thread overview]
Message-ID: <4AD63681.6080901@mellanox.com> (raw)
In-Reply-To: <ada1vl5alqh.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Roland Dreier wrote:
> > +static int srp_dev_loss_tmo = 60;
>
> I don't think the name needs to be this abbreviated. We don't
> necessarily need the srp_ prefix, but probably "device_loss_timeout" is
> much clearer without being too much longer.
>
>
OK
> > +
> > +module_param(srp_dev_loss_tmo, int, 0444);
> > +MODULE_PARM_DESC(srp_dev_loss_tmo,
> > + "Default number of seconds that srp transport should \
> > + insulate the lost of a remote port (default is 60 secs");
>
> I can't understand this description. What does "insulate the lost" of a
> port mean?
>
>
I should change "remote port" to just "port". It means that multipath
driver won't know about port offline event (pulling cable, power
cycling switch, target...) and won't act/fail-over because srp won't
return error code until this timeout expired
> > +static void srp_reconnect_work(struct work_struct *work)
> > +{
> > + struct srp_target_port *target =
> > + container_of(work, struct srp_target_port, work);
> > +
> > + srp_reconnect_target(target);
> > + target->work_in_progress = 0;
>
> surely this is racy... isn't it possible for a context to see
> work_in_progress as 1, decide not to schedule the work, and then have it
> set to 0 immediately afterwards by the workqueue context?
>
Yes, it is racy. It should be in lock_irq scsi host_lock
> > + target->qp_err_timer.expires = time * HZ + jiffies;
>
> given that this is only with 1 second resolution, probably makes sense
> to either make it a deferrable timer or round the timeout to avoid extra
> wakeups.
>
OK - I'll round the timeout.
> > + add_timer(&target->qp_err_timer);
>
> I don't see anywhere that this is canceled on module unload etc?
>
>
My mistake. Bart also pointed it out. I'll fix this.
> > + srp_qp_err_add_timer(target,
> > + srp_dev_loss_tmo - 55);
>
> > + if (srp_dev_loss_tmo < 60)
> > + srp_dev_loss_tmo = 60;
>
> I don't understand the 55 and the 60 here... what are these magic
> numbers? Wouldn't it make sense for the user to specify the actual
> timeout that is used (value - 55) rather than the value and then
> secretly subtracting 55?
>
> - R.
>
First it does not make sense for user to set it below 60; therefore, it
is forced to have 60 and above
With async event handler, srp can detect local port offline and set
timer exact device_loss_timeout; however, it does not have mechanism to
detect remote port offline (srp_daemon need to register trap and
communicate remote port in/out fabric down to srp driver)
I should just add timer (X seconds) instead of (device_loss_tmo - 55) in
case receiving cqe error and/or connection close event
-vu
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2009-10-14 20:37 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-10-12 22:57 [ofa-general][PATCH 3/4] SRP fail-over faster Vu Pham
[not found] ` <4AD3B453.3030109-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2009-10-13 11:09 ` Bart Van Assche
2009-10-14 18:12 ` Roland Dreier
[not found] ` <ada1vl5alqh.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2009-10-14 20:37 ` Vu Pham [this message]
[not found] ` <4AD63681.6080901-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2009-10-14 20:52 ` Roland Dreier
[not found] ` <adaljjd8zrj.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2009-10-14 21:08 ` Vu Pham
[not found] ` <4AD63DB1.3060906-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2009-10-14 22:47 ` Roland Dreier
[not found] ` <adahbu18uf5.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2009-10-14 23:59 ` Vu Pham
2009-10-15 1:39 ` David Dillow
[not found] ` <1255570760.13845.4.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
2009-10-15 16:23 ` Vu Pham
[not found] ` <4AD74C88.8030604-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2009-10-15 19:25 ` David Dillow
[not found] ` <1255634715.29829.9.camel-FqX9LgGZnHWDB2HL1qBt2PIbXMQ5te18@public.gmane.org>
2009-10-15 21:35 ` Jason Gunthorpe
[not found] ` <20091015213512.GW5191-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2009-10-22 23:13 ` Vu Pham
[not found] ` <4AE0E71E.20309-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2009-10-22 23:33 ` David Dillow
[not found] ` <1256254394.1579.86.camel-FqX9LgGZnHWDB2HL1qBt2PIbXMQ5te18@public.gmane.org>
2009-10-22 23:34 ` David Dillow
[not found] ` <1256254459.1579.87.camel-FqX9LgGZnHWDB2HL1qBt2PIbXMQ5te18@public.gmane.org>
2009-10-22 23:38 ` David Dillow
[not found] ` <1256254692.1579.89.camel-FqX9LgGZnHWDB2HL1qBt2PIbXMQ5te18@public.gmane.org>
2009-10-23 0:04 ` Vu Pham
[not found] ` <4AE0F309.5040201-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2009-10-23 0:16 ` David Dillow
[not found] ` <1256256984.1579.105.camel-FqX9LgGZnHWDB2HL1qBt2PIbXMQ5te18@public.gmane.org>
2009-10-23 0:24 ` Vu Pham
[not found] ` <4AE0F7DA.20100-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2009-10-23 0:34 ` David Dillow
[not found] ` <1256258049.1598.8.camel-FqX9LgGZnHWDB2HL1qBt2PIbXMQ5te18@public.gmane.org>
2009-10-23 16:50 ` Vu Pham
[not found] ` <4AE1DEEF.5070205-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2009-10-23 22:08 ` David Dillow
[not found] ` <1256335698.10273.62.camel-FqX9LgGZnHWDB2HL1qBt2PIbXMQ5te18@public.gmane.org>
2009-10-24 7:35 ` Vu Pham
[not found] ` <4AE2AE54.5020004-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2009-10-28 15:09 ` David Dillow
2009-10-29 18:42 ` Vladislav Bolkhovitin
2009-10-23 6:13 ` Bart Van Assche
[not found] ` <e2e108260910222313o27c8b97dh483d846b6c98e480-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-10-23 16:52 ` Vu Pham
2009-10-28 18:00 ` Roland Dreier
[not found] ` <adavdhzs8iv.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2009-10-29 16:37 ` Vu Pham
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4AD63681.6080901@mellanox.com \
--to=vuhuong-vpraknaxozvwk0htik3j/w@public.gmane.org \
--cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox