From: Alex Netes <alexne-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
To: Jim Foraker <foraker1-i2BcT+NCU+M@public.gmane.org>
Cc: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"Weiny, Ira K." <weiny2-i2BcT+NCU+M@public.gmane.org>
Subject: Re: [PATCH 5/8] opensm: Signal subnet init errors on SubnGet timeouts
Date: Sun, 29 Jul 2012 19:29:33 +0300
Message-ID: <20120729162933.GE5195@calypso>
In-Reply-To: <1343081989.29792.12.camel-mxTxeWJot8FliZ7u+bvwcg@public.gmane.org>

Hi Jim,

On 15:19 Mon 23 Jul     , Jim Foraker wrote:
> 
> On Mon, 2012-07-23 at 08:43 -0700, Alex Netes wrote:
> > Hi Jim,
> > 
> > On 17:55 Mon 25 Jun     , Jim Foraker wrote:
> > > A subnet should not be listed as cleanly initialized if CAs
> > > fail to respond to SubnGet requests.
> > > 
> > > Signed-off-by: Jim Foraker <foraker1-i2BcT+NCU+M@public.gmane.org>
> > > ---
> > >  opensm/osm_sm_mad_ctrl.c |    9 +++++++++
> > >  1 file changed, 9 insertions(+)
> > > 
> > > diff --git a/opensm/osm_sm_mad_ctrl.c b/opensm/osm_sm_mad_ctrl.c
> > > index f0bcff2..464b6b0 100644
> > > --- a/opensm/osm_sm_mad_ctrl.c
> > > +++ b/opensm/osm_sm_mad_ctrl.c
> > > @@ -741,6 +741,15 @@ static void sm_mad_ctrl_send_err_cb(IN void *context, IN osm_madw_t * p_madw)
> > >  			cl_ntoh16(p_smp->attr_id),
> > >  			ib_get_sm_attr_str(p_smp->attr_id));
> > >  		p_ctrl->p_subn->subnet_initialization_error = TRUE;
> > > +	} else if (p_madw->status == IB_TIMEOUT &&
> > > +		   p_smp->method == IB_MAD_METHOD_GET) {
> > 
> > It's pretty common to see timeouts in fabrics without m_key support (e.g.
> > switch reboots) and it's not desirable to start another heavy sweep because
> > of that. So I guess it would be better if we could initiate heavy sweep only
> > when m_key is set and protection level is 2 or 3.
>      This was done primarily to ensure that "SUBNET UP" doesn't get
> displayed/logged while there are unconfigured HCAs due to misset mkeys.
> I'm reasonably sure (I will re-test to verify) that future light sweeps
> will catch HCAs whose mkeys time out, presuming the timeout is set.  So we
> could also just log the error and not worry about setting
> subnet_initialization_error.

Flagging TIMEOUTs on Get() is fine when an M_Key is set, but in the general
case we don't want to run into heavy sweep loops because of TIMEOUTs on
Get(), so I suggest the following:

+       } else if (p_ctrl->p_subn->opt.m_key &&
+                  p_ctrl->p_subn->opt.m_key_protect_bits > 1 &&
+                  p_madw->status == IB_TIMEOUT &&
+                  p_smp->method == IB_MAD_METHOD_GET) {
+               /* Timeouts on SubnGet may be an indication of an mkey
+                  error at protection levels 2/3 */
+               OSM_LOG(p_ctrl->p_log, OSM_LOG_ERROR, "ERR 3120 "
+                       "Timeout while getting attribute 0x%X (%s)\n",
+                       cl_ntoh16(p_smp->attr_id),
+                       ib_get_sm_attr_str(p_smp->attr_id));
+               p_ctrl->p_subn->subnet_initialization_error = TRUE;
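
As a rough sketch of the condition above in isolation: a Get() timeout counts as a subnet initialization error only when an M_Key is configured and the protection level is 2 or 3. The type, enum, and function names below are hypothetical stand-ins chosen for illustration, not the OpenSM definitions; only the m_key and m_key_protect_bits field names are taken from the snippet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the OpenSM status and method constants. */
enum sketch_status { SKETCH_SUCCESS, SKETCH_TIMEOUT };
enum sketch_method { SKETCH_METHOD_GET, SKETCH_METHOD_SET };

/* Hypothetical subset of the subnet options used by the check. */
struct sketch_opts {
	uint64_t m_key;             /* 0 means no M_Key is configured */
	uint8_t m_key_protect_bits; /* protection level, 0 through 3 */
};

/*
 * Returns true when a failed send should mark subnet initialization
 * as failed: a timed-out Get() with an M_Key set at protection
 * level 2 or 3. Any other timeout is left to the normal sweep logic.
 */
static bool sketch_timeout_is_init_error(const struct sketch_opts *opt,
					  enum sketch_status status,
					  enum sketch_method method)
{
	return opt->m_key != 0 &&
	       opt->m_key_protect_bits > 1 &&
	       status == SKETCH_TIMEOUT &&
	       method == SKETCH_METHOD_GET;
}
```

With these stand-ins, a fabric without an M_Key (m_key == 0) never takes this branch, so a switch reboot that causes Get() timeouts would not by itself trigger another heavy sweep through this path.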


-- Alex