From: Alex Netes <alexne-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
To: Jim Foraker <foraker1-i2BcT+NCU+M@public.gmane.org>
Cc: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"Weiny, Ira K." <weiny2-i2BcT+NCU+M@public.gmane.org>
Subject: Re: [PATCH 5/8] opensm: Signal subnet init errors on SubnGet timeouts
Date: Sun, 29 Jul 2012 19:29:33 +0300
Message-ID: <20120729162933.GE5195@calypso>
In-Reply-To: <1343081989.29792.12.camel-mxTxeWJot8FliZ7u+bvwcg@public.gmane.org>

Hi Jim,

On 15:19 Mon 23 Jul     , Jim Foraker wrote:
> 
> On Mon, 2012-07-23 at 08:43 -0700, Alex Netes wrote:
> > Hi Jim,
> > 
> > On 17:55 Mon 25 Jun     , Jim Foraker wrote:
> > > A subnet should not be listed as cleanly initialized if CAs
> > > fail to respond to SubnGet requests.
> > > 
> > > Signed-off-by: Jim Foraker <foraker1-i2BcT+NCU+M@public.gmane.org>
> > > ---
> > >  opensm/osm_sm_mad_ctrl.c |    9 +++++++++
> > >  1 file changed, 9 insertions(+)
> > > 
> > > diff --git a/opensm/osm_sm_mad_ctrl.c b/opensm/osm_sm_mad_ctrl.c
> > > index f0bcff2..464b6b0 100644
> > > --- a/opensm/osm_sm_mad_ctrl.c
> > > +++ b/opensm/osm_sm_mad_ctrl.c
> > > @@ -741,6 +741,15 @@ static void sm_mad_ctrl_send_err_cb(IN void *context, IN osm_madw_t * p_madw)
> > >  			cl_ntoh16(p_smp->attr_id),
> > >  			ib_get_sm_attr_str(p_smp->attr_id));
> > >  		p_ctrl->p_subn->subnet_initialization_error = TRUE;
> > > +	} else if (p_madw->status == IB_TIMEOUT &&
> > > +		   p_smp->method == IB_MAD_METHOD_GET) {
> > 
> > It's pretty common to see timeouts in fabrics without m_key support (e.g.
> > switch reboots) and it's not desirable to start another heavy sweep because
> > of that. So I guess it would be better if we could initiate heavy sweep only
> > when m_key is set and protection level is 2 or 3.
>      This was done primarily to ensure that "SUBNET UP" doesn't get
> displayed/logged while there are unconfigured HCAs due to misset mkeys.
> I'm reasonably sure (I will re-test to verify) that future light sweeps
> will catch HCAs whose mkeys time out, presuming the timeout is set.  So we
> could also just log the error and not worry about setting
> subnet_initialization_error.

Flagging TIMEOUTs on Get() is fine when an M_Key is set, but in the general
case we don't want to end up in heavy sweep loops because of TIMEOUTs on
Get(), so I suggest the following:

+       } else if (p_ctrl->p_subn->opt.m_key &&
+                  p_ctrl->p_subn->opt.m_key_protect_bits > 1 &&
+                  p_madw->status == IB_TIMEOUT &&
+                  p_smp->method == IB_MAD_METHOD_GET) {
+               /* Timeouts on SubnGet may be an indication of an mkey
+                  error at protection levels 2/3 */
+               OSM_LOG(p_ctrl->p_log, OSM_LOG_ERROR, "ERR 3120 "
+                       "Timeout while getting attribute 0x%X (%s)\n",
+                       cl_ntoh16(p_smp->attr_id),
+                       ib_get_sm_attr_str(p_smp->attr_id));
+               p_ctrl->p_subn->subnet_initialization_error = TRUE;
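
The condition above can be read as a single predicate. Below is a minimal,
self-contained sketch of that predicate, not the OpenSM code itself: the
demo_* types, constants, and the function name are hypothetical stand-ins
for illustration, and only the field names m_key and m_key_protect_bits are
taken from the snippet. It assumes an m_key of 0 means no M_Key is configured.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real OpenSM status and method constants
 * (IB_TIMEOUT, IB_MAD_METHOD_GET) come from its own headers. */
enum demo_status { DEMO_SUCCESS, DEMO_TIMEOUT, DEMO_ERROR };
enum demo_method { DEMO_METHOD_GET, DEMO_METHOD_SET };

/* Only the two options the suggested condition looks at. */
struct demo_opts {
	uint64_t m_key;              /* assumed: 0 means no M_Key configured */
	uint8_t m_key_protect_bits;  /* protection level, 0 through 3 */
};

/* Returns true when a send error should mark subnet initialization as
 * failed: an M_Key is set, the protection level is 2 or 3, and the failed
 * MAD was a Get() that timed out. */
static bool demo_timeout_signals_init_error(const struct demo_opts *opt,
					    enum demo_status status,
					    enum demo_method method)
{
	return opt->m_key != 0 &&
	       opt->m_key_protect_bits > 1 &&
	       status == DEMO_TIMEOUT &&
	       method == DEMO_METHOD_GET;
}
```

With this shape, a Get() timeout on a fabric with no M_Key, or with
protection level 0 or 1, does not set the error flag, so an ordinary switch
reboot would not by itself trigger another heavy sweep.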


-- Alex
