From: Roland Dreier <rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
To: Bart Van Assche <bart.vanassche-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: OFED mailing list
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Chris Worley <worleys-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [PATCH] IB/srp: Fix initiator lockup
Date: Wed, 06 Jan 2010 13:37:43 -0800 [thread overview]
Message-ID: <ada4omyud7c.fsf@roland-alpha.cisco.com> (raw)
In-Reply-To: <e2e108261001020419l36319156hb9d625edc2e15d06-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> (Bart Van Assche's message of "Sat, 2 Jan 2010 13:19:13 +0100")
> When the SRP initiator is communicating with an SRP target under load it can
> happen that the SRP initiator locks up. The communication pattern that causes
> the lockup is as follows:
> * SRP initiator sends (req_lim - 2) SRP_CMD requests to the target.
> * The REQUEST LIMIT DELTA field of each SRP_RSP response is zero.
> * The target sends an SRP_CRED_REQ information unit with non-zero REQUEST
> LIMIT DELTA.
>
> The above communication pattern brings the initiator in the following state:
> * srp_queuecommand() always returns SCSI_MLQUEUE_HOST_BUSY.
> * The per-session variable zero_req_lim keeps increasing.
> The initiator never leaves this state because it ignores SRP_CRED_REQ
> information units.
This is all a bit obfuscated. The problem is that the initiator runs
out of credits and stops sending commands; because we don't process
SRP_CRED_REQ messages from the target, we never get more credits.
I'm wondering why this took so long to come up? Does SCST send
SRP_CRED_REQ only under unusual circumstances? Also I'm wondering why
the "Unhandled SRP opcode" message didn't show up in the kernel log and
help debug this?
Some specific comments:
> +/* Similar to is_power_of_2(), but can be evaluated at compile time. */
> +#define IS_POWER_OF_2(n) ((n) != 0 && (((n) & ((n) - 1)) == 0))
I don't think this level of ugliness is really required -- we can just
document carefully at the definition that we need things to be powers of 2.
> + for (i = 0; i < ARRAY_SIZE(target->txp_ring); ++i)
> + srp_free_iu(target->srp_host, target->txp_ring[i]);
Not sure I understand why we need two TX rings -- why can't we just have
one bigger TX ring that handles both requests and responses?
> + * Obtain an information unit for sending a request to the target.
I think there's a bit of overcommenting in a few places. Does it really
help anyone to repeat that what "get_tx_iu" does is "get an information
unit for sending"?
> - return target->tx_ring[target->tx_head & SRP_SQ_SIZE];
> + return target->tx_ring[target->tx_head
> + & (ARRAY_SIZE(target->tx_ring) - 1)];
is this an improvement?
> + * Send a request to the target.
> +static int __srp_post_send_req(struct srp_target_port *target,
same -- does the comment add anything?
> + /* Completion queue. */
> + SRP_CQ_SIZE = SRP_SQ_SIZE + SRP_TXP_SIZE + SRP_RQ_SIZE,
etc... all these comments are mostly taking up vertical space and
visually jarring, without adding much info.
> +/*
> + * SRP_CRED_REQ information unit, as defined in section 6.10 of the
> + * T10 SRP r16a document.
> + */
> +struct srp_cred_req {
> + u8 opcode;
> + u8 sol_not;
> + u8 reserved[2];
> + __be32 req_lim_delta;
> + u64 tag;
> +} __attribute__((packed));
again... does it help anyone to say "struct srp_cred_req" corresponds to
SRP_CRED_REQ? <scsi/srp.h> already refers to the SRP document, and I
would think anyone looking up SRP_CRED_REQ could find it faster by
looking for the string itself rather than by section number.
The existing comments in the file are actually useful -- they explain
why some structures need to be packed, and as far as I can tell neither
srp_cred_req nor srp_cred_rsp need to be packed.
- R.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2010-01-06 21:37 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-01-02 12:19 [PATCH] IB/srp: Fix initiator lockup Bart Van Assche
[not found] ` <e2e108261001020419l36319156hb9d625edc2e15d06-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-02 16:05 ` Fwd: " Chris Worley
[not found] ` <f3177b9e1001020805k4dce1991u3733c5a7f6d46aaa-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-02 17:52 ` Bart Van Assche
2010-01-04 1:34 ` David Dillow
[not found] ` <1262568846.13289.4.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
2010-01-04 1:43 ` [RFC PATCH 1/3] IB/srp: differentiate the uses of SRP_SQ_SIZE David Dillow
2010-01-04 1:43 ` [RFC PATCH 2/3] IB/srp: minimal support for SRP_CRED_REQ and SRP_AER_REQ David Dillow
[not found] ` <1262569417-20341-2-git-send-email-dave-i1Mk8JYDVaaSihdK6806/g@public.gmane.org>
2010-01-06 21:48 ` Roland Dreier
[not found] ` <adar5q2sy5q.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
2010-01-06 23:02 ` David Dillow
[not found] ` <1262818979.2265.13.camel-FqX9LgGZnHWDB2HL1qBt2PIbXMQ5te18@public.gmane.org>
2010-01-06 23:20 ` Roland Dreier
2010-01-04 1:43 ` [RFC PATCH 3/3] IB/srp: export req_limit via sysfs David Dillow
2010-01-04 7:13 ` [PATCH] IB/srp: Fix initiator lockup Bart Van Assche
[not found] ` <e2e108261001032313v472d4b9dm75c3d0b1a9268adc-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-04 8:12 ` David Dillow
[not found] ` <1262592761.13289.13.camel-1q1vX8mYZiGLUyTwlgNVppKKF0rrzTr+@public.gmane.org>
2010-01-04 8:32 ` Bart Van Assche
2010-01-06 21:40 ` Roland Dreier
[not found] ` <adavdfesyi4.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
2010-01-07 7:59 ` Bart Van Assche
[not found] ` <e2e108261001062359l66384219r650294b4d6e9f3f1-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-07 16:26 ` David Dillow
2010-01-06 21:38 ` Roland Dreier
2010-01-06 21:37 ` Roland Dreier [this message]
[not found] ` <ada4omyud7c.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
2010-01-07 7:52 ` Bart Van Assche
[not found] ` <e2e108261001062352l1925f4e1i24bdbdde08056c4c-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-07 7:57 ` Roland Dreier
[not found] ` <adak4vuqrcw.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
2010-01-07 8:03 ` Bart Van Assche
[not found] ` <e2e108261001070003y6881eac7g3e8129e44b78a475-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-07 10:01 ` Bart Van Assche
2010-01-10 6:41 ` Roland Dreier
[not found] ` <adavdfapimb.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
2010-01-10 10:34 ` Bart Van Assche
[not found] ` <e2e108261001100234k7dcb45cescb9ae133c0cdc256-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-12 17:13 ` Roland Dreier
[not found] ` <ada3a2bp7ow.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
2010-01-12 17:23 ` Bart Van Assche
[not found] ` <e2e108261001120923w4b405736v8b8e8ae41322823b-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-12 22:57 ` Roland Dreier
[not found] ` <adavdf7nd7k.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
2010-01-12 23:24 ` Jason Gunthorpe
[not found] ` <20100112232421.GA16490-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2010-01-13 7:23 ` Bart Van Assche
[not found] ` <e2e108261001122323p695d8a9cscd992eda25fdba89-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-13 18:16 ` Jason Gunthorpe
[not found] ` <20100113181615.GC16490-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2010-01-13 18:57 ` Bart Van Assche
[not found] ` <e2e108261001131057g339fc80fn638109f2d441f13a-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-13 19:52 ` Jason Gunthorpe
2010-01-13 9:32 ` Bart Van Assche
2010-01-13 7:29 ` Bart Van Assche
[not found] ` <e2e108261001122329u698f13ecibf42c1dc60b5cc04-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-01-13 16:58 ` Roland Dreier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ada4omyud7c.fsf@roland-alpha.cisco.com \
--to=rdreier-fyb4gu1cfyuavxtiumwx3w@public.gmane.org \
--cc=bart.vanassche-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
--cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=worleys-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox