From: Roland Dreier <rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
To: Bart Van Assche <bart.vanassche-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: OFED mailing list
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Chris Worley <worleys-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [PATCH] IB/srp: Fix initiator lockup
Date: Wed, 06 Jan 2010 13:37:43 -0800
Message-ID: <ada4omyud7c.fsf@roland-alpha.cisco.com>
In-Reply-To: <e2e108261001020419l36319156hb9d625edc2e15d06-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> (Bart Van Assche's message of "Sat, 2 Jan 2010 13:19:13 +0100")


 > When the SRP initiator is communicating with an SRP target under load it can
 > happen that the SRP initiator locks up. The communication pattern that causes
 > the lockup is as follows:
 > * SRP initiator sends (req_lim - 2) SRP_CMD requests to the target.
 > * The REQUEST LIMIT DELTA field of each SRP_RSP response is zero.
 > * The target sends an SRP_CRED_REQ information unit with non-zero REQUEST
 >   LIMIT DELTA.
 > 
 > The above communication pattern brings the initiator in the following state:
 > * srp_queuecommand() always returns SCSI_MLQUEUE_HOST_BUSY.
 > * The per-session variable zero_req_lim keeps increasing.
 > The initiator never leaves this state because it ignores SRP_CRED_REQ
 > information units.

This is all a bit obfuscated.  The problem is that the initiator runs
out of credits and stops sending commands; because we don't process
SRP_CRED_REQ messages from the target, we never get more credits.
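
To make the credit flow concrete, here is a minimal user-space sketch.
Everything in it (struct demo_session, demo_send_cmd(),
demo_add_credits(), demo_run()) is made up for illustration and is not
the ib_srp code; it also ignores the credits the real driver holds in
reserve.  It only shows that once req_lim reaches zero, sends keep
failing unless the delta carried by SRP_CRED_REQ is applied.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified session state; not the driver's structure. */
struct demo_session {
	int32_t  req_lim;      /* credits currently available */
	unsigned zero_req_lim; /* sends refused for lack of credits */
};

/* Try to send one command: 0 on success, -1 when out of credits. */
static int demo_send_cmd(struct demo_session *s)
{
	if (s->req_lim <= 0) {
		s->zero_req_lim++;
		return -1;
	}
	s->req_lim--;
	return 0;
}

/* Apply a REQUEST LIMIT DELTA taken from SRP_RSP or SRP_CRED_REQ. */
static void demo_add_credits(struct demo_session *s, int32_t delta)
{
	s->req_lim += delta;
}

/*
 * Replay the pattern from the report: use up all credits, receive
 * responses whose delta is zero, then receive an SRP_CRED_REQ with a
 * non-zero delta.  Returns the result of the next send attempt.
 */
static int demo_run(int handle_cred_req)
{
	struct demo_session s = { .req_lim = 4, .zero_req_lim = 0 };
	int i;

	for (i = 0; i < 4; i++)
		demo_send_cmd(&s);       /* exhaust the credits */
	for (i = 0; i < 4; i++)
		demo_add_credits(&s, 0); /* SRP_RSP with delta 0 */
	if (handle_cred_req)
		demo_add_credits(&s, 4); /* SRP_CRED_REQ with delta 4 */

	return demo_send_cmd(&s);
}
```

With these assumptions, demo_run(0) models an initiator that ignores
SRP_CRED_REQ and returns -1, while demo_run(1) applies the delta and
returns 0.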

Why did this take so long to come up?  Does SCST send SRP_CRED_REQ only
under unusual circumstances?  Also, why didn't the "Unhandled SRP
opcode" message show up in the kernel log and help debug this?

Some specific comments:

 > +/* Similar to is_power_of_2(), but can be evaluated at compile time. */
 > +#define IS_POWER_OF_2(n) ((n) != 0 && (((n) & ((n) - 1)) == 0))

I don't think this level of ugliness is really required -- we can just
document carefully at the definition that we need things to be powers of 2.
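
A rough sketch of what documenting the constraint at the definition
could look like.  DEMO_RING_SIZE and demo_ring_index() are hypothetical
names and the value 64 is arbitrary; none of this comes from the patch.

```c
#include <assert.h>

enum {
	/*
	 * Must be a power of 2: ring slots are selected with
	 * "index & (DEMO_RING_SIZE - 1)", which only wraps correctly
	 * for power-of-2 sizes.
	 */
	DEMO_RING_SIZE = 64,
};

/* Map a free-running head counter onto a ring slot. */
static unsigned demo_ring_index(unsigned head)
{
	return head & (DEMO_RING_SIZE - 1);
}
```

The comment carries the requirement, so no compile-time macro is needed
as long as whoever changes the size reads the definition.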

 > +	for (i = 0; i < ARRAY_SIZE(target->txp_ring); ++i)
 > +		srp_free_iu(target->srp_host, target->txp_ring[i]);

Not sure I understand why we need two TX rings -- why can't we just have
one bigger TX ring that handles both requests and responses?

 > + * Obtain an information unit for sending a request to the target.

I think there's a bit of overcommenting in a few places.  Does it really
help anyone to repeat that what "get_tx_iu" does is "get an information
unit for sending"?

 > -	return target->tx_ring[target->tx_head & SRP_SQ_SIZE];
 > +	return target->tx_ring[target->tx_head
 > +			       & (ARRAY_SIZE(target->tx_ring) - 1)];

Is this an improvement?

 > + * Send a request to the target.
 > +static int __srp_post_send_req(struct srp_target_port *target,

Same -- does the comment add anything?

 > +	/* Completion queue. */
 > +	SRP_CQ_SIZE		= SRP_SQ_SIZE + SRP_TXP_SIZE + SRP_RQ_SIZE,

And so on: all these comments mostly take up vertical space and are
visually jarring, without adding much info.

 > +/*
 > + * SRP_CRED_REQ information unit, as defined in section 6.10 of the
 > + * T10 SRP r16a document.
 > + */
 > +struct srp_cred_req {
 > +	u8	opcode;
 > +	u8	sol_not;
 > +	u8	reserved[2];
 > +	__be32	req_lim_delta;
 > +	u64	tag;
 > +} __attribute__((packed));

Again, does it help anyone to say "struct srp_cred_req" corresponds to
SRP_CRED_REQ?  <scsi/srp.h> already refers to the SRP document, and I
would think anyone looking up SRP_CRED_REQ could find it faster by
searching for the string itself rather than by section number.

The existing comments in the file are actually useful -- they explain
why some structures need to be packed, and as far as I can tell neither
srp_cred_req nor srp_cred_rsp needs to be packed.
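
One way to check the packing claim for srp_cred_req is to lay out the
same fields without the attribute and look at the offsets.  The struct
below is a user-space stand-in (demo_cred_req is a hypothetical name,
and uint8_t/uint32_t/uint64_t replace u8/__be32/u64); the asserts
assume a common ABI where these types have their usual sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field order as the quoted struct, but without "packed". */
struct demo_cred_req {
	uint8_t  opcode;
	uint8_t  sol_not;
	uint8_t  reserved[2];
	uint32_t req_lim_delta;
	uint64_t tag;
};

/* Every field already sits on its natural boundary, so no padding. */
static_assert(offsetof(struct demo_cred_req, req_lim_delta) == 4,
	      "req_lim_delta follows the four leading bytes directly");
static_assert(offsetof(struct demo_cred_req, tag) == 8,
	      "tag starts at byte 8");
static_assert(sizeof(struct demo_cred_req) == 16,
	      "no tail padding");

/* Helper so the size can also be checked at run time. */
static size_t demo_cred_req_size(void)
{
	return sizeof(struct demo_cred_req);
}
```

If these asserts hold on the target ABI, the attribute changes nothing
about the layout of this structure.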

 - R.

