From: Hal Rosenstock <hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
To: Albert Chu <chu11-i2BcT+NCU+M@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: Node Description mismatch between saquery & smpquery
Date: Tue, 18 Jun 2013 07:13:11 -0400
Message-ID: <51C040C7.9070109@dev.mellanox.co.il>
In-Reply-To: <1371505093.19017.76.camel-akkeaxHeDKRliZ7u+bvwcg@public.gmane.org>
On 6/17/2013 5:38 PM, Albert Chu wrote:
> We've recently noticed that the Node Description for a node can
> mismatch between the output of smpquery and saquery. For example:
>
> # smpquery NodeDesc 427
> Node Description:.................sierra1932 qib0
>
> # saquery NodeRecord 427 | grep NodeDesc
> NodeDescription.........QLogic Infiniband HCA
>
> A restart of OpenSM is the current solution to resolve this.
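
To find affected nodes without eyeballing each one, the two outputs quoted above can be compared mechanically. This is a minimal sketch, assuming both tools print a label, a run of dot leaders, and then the value, exactly as in the two sample lines above; the function names are made up for illustration and are not part of infiniband-diags.

```python
import re

# Matches both "Node Description:.....value" (smpquery sample above)
# and "NodeDescription.....value" (saquery sample above).
_NODE_DESC_RE = re.compile(r"\s*Node\s?Description:?\.{2,}(.*)$")


def parse_node_desc(output):
    """Return the node description found in tool output, or None if absent."""
    for line in output.splitlines():
        match = _NODE_DESC_RE.match(line)
        if match:
            return match.group(1).strip()
    return None


def node_desc_mismatch(smp_output, sa_output):
    """True when both outputs contain a description and the two differ."""
    smp_desc = parse_node_desc(smp_output)
    sa_desc = parse_node_desc(sa_output)
    return smp_desc is not None and sa_desc is not None and smp_desc != sa_desc
```

Feeding it the captured text of "smpquery NodeDesc <lid>" and "saquery NodeRecord <lid>" for each LID would flag the nodes where the SA copy is stale.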
>
> We've noticed it occurring more often on our larger clusters than our
> smaller clusters, leading to a speculation about why it is happening.
>
> The speculation is that when a node comes up, there is a window of time in
> which the HCA is up and can be scanned by OpenSM, but does not yet have its
> node descriptor set (in RHEL it appears to be set via /etc/init.d/rdma).
> During this window, OpenSM reads/stores the non-desired node descriptor
> (in the above case the non-desired "QLogic Infiniband HCA").
>
> When the node descriptor is changed, a trap should be sent to opensm
> indicating the change. Normally OpenSM gets the trap and reads the new
> node descriptor.

Are you sure the trap is being issued by those devices when the
NodeDescription is changed locally?

Also, if so, do these devices implement timeout/retry when sending the
trap (e.g. waiting to receive a TrapRepress before giving up on the
trap)?

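
For reference, the timeout/retry behavior asked about here can be sketched as follows. This is only an illustration of the idea, not code from any HCA driver or firmware: send_trap and wait_for_repress are hypothetical callables supplied by the caller, and the attempt count and timeout are arbitrary.

```python
def send_trap_with_retry(send_trap, wait_for_repress, max_attempts=3, timeout_s=1.0):
    """Resend a trap until it is acknowledged or the attempts run out.

    send_trap() transmits the trap once (hypothetical callable).
    wait_for_repress(timeout_s) returns True if a TrapRepress arrived
    within timeout_s seconds (hypothetical callable).

    Returns the number of attempts used, or None if no repress was seen.
    """
    for attempt in range(1, max_attempts + 1):
        send_trap()
        if wait_for_repress(timeout_s):
            return attempt
    return None
```

A device that sends the trap once and never checks for the repress corresponds to max_attempts=1, in which case a single dropped MAD is enough to leave the SM with the old NodeDescription.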
> On our large clusters all nodes are typically brought up at the same
> time, so there are probably a ton of node descriptor change traps
> happening at the exact same time. We speculate a number of these are
> dropped/lost, and subsequently OpenSM never realizes that the node
> descriptor has changed.

Do you see any evidence that traps are being dropped? Have you
correlated any VL15Dropped counters in the subnet with this? Also,
there is a module parameter in the MAD kernel module that might help with
any unsolicited MAD bursts. You might try increasing that on your SM
node(s).

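
One way to do that correlation is to snapshot VL15Dropped on the ports of interest before and after a mass node bring-up and look at which ones moved. A minimal sketch, assuming the counter appears in perfquery-style output as "VL15Dropped:" followed by dot leaders and an integer; that layout and the function names are assumptions here, so adjust the pattern to whatever your tools actually print.

```python
import re

# Assumed layout: "VL15Dropped:.....................<integer>"
_VL15_RE = re.compile(r"\s*VL15Dropped:?\.+(\d+)\s*$")


def vl15_dropped(counter_output):
    """Return the VL15Dropped value found in counter output, or None."""
    for line in counter_output.splitlines():
        match = _VL15_RE.match(line)
        if match:
            return int(match.group(1))
    return None


def ports_with_new_drops(before, after):
    """Return {port: increase} for ports whose count grew between snapshots.

    before and after map a caller-chosen port label to a VL15Dropped count.
    """
    return {
        port: after[port] - before[port]
        for port in after
        if port in before and after[port] > before[port]
    }
```

If the ports on the path to the SM show increases that line up with the times the stale descriptions appear, that would support the dropped-trap theory.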
> I don't know if the speculation sounds reasonable or not. Regardless,
> we're not sure of the best fix.
>
> A trivial fix would be to just make OpenSM re-scan the node descriptor
> of an HCA, perhaps during a heavy sweep. But I don't know if this is
> optimal. It'll introduce more MADs on the wire. However if the present
> solution is to restart OpenSM, we figure this can't be any worse.

Yes, but adding the additional queries there is O(n) and has been
resisted in the past.

> Just wondering what people's thoughts are, or if there's another obvious
> solution we're not seeing.

I think this issue needs better understanding first.

-- Hal

> Al
>