From: Yevgeny Kliteynik <kliteyn-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
To: Aaron Knister <aaron.knister-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Or Gerlitz <ogerlitz-smomgflXvOZWk0Htik3J/w@public.gmane.org>,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: OpenSM Failover
Date: Wed, 14 Oct 2009 10:46:37 +0200
Message-ID: <4AD58FED.1060800@dev.mellanox.co.il>
In-Reply-To: <eafd71280910131426g1cd68d7k1c28aee185ac3b8d-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

Aaron,

Aaron Knister wrote:
>>> As I said, the older opensms
>>> on the older Mellanox model HCAs fail over and fail back instantly.
>> The instant failback is expected, and this is the bug that
>> we're discussing. As for the instant failover - I'll check
>> how things are supposed to work and get back to you.

After checking this, I don't understand how
instant failover is possible. The only case where it can
work is if you don't have a switch in your subnet,
just two HCAs connected directly to each other.
Is this the case?

If not, then I'd like to see opensm logs.
Please run opensm as before (-V -s 0 -e).
Start OSM on node A with high priority.
Start OSM on node B with low priority.
Kill OSM on node A, and see that OSM on
node B becomes master.
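
The test setup above can be sketched as a small helper that builds the
opensm command line for each node. This is only an illustration: the
-V -s 0 -e flags come from this thread, while the -p option (SM
priority, range 0 to 15) and the concrete values 15 and 0 are
assumptions based on the opensm man page, not something stated in this
message. The function name is hypothetical.

```python
import shlex
from typing import List

# Flags quoted in this thread: "-V -s 0 -e".
BASE_FLAGS = ["-V", "-s", "0", "-e"]


def opensm_command(priority: int) -> List[str]:
    """Build an opensm argv for the failover test.

    Assumption: "-p" sets the SM priority and accepts 0 (lowest)
    through 15 (highest), as described in the opensm man page.
    """
    if not 0 <= priority <= 15:
        raise ValueError("SM priority must be in the range 0..15")
    return ["opensm", *BASE_FLAGS, "-p", str(priority)]


if __name__ == "__main__":
    # Hypothetical priorities: node A high, node B low.
    for node, priority in (("A", 15), ("B", 0)):
        print(f"node {node}: {shlex.join(opensm_command(priority))}")
```

Running the printed command on node A first, then on node B, and then
killing the opensm process on node A reproduces the sequence described
above. The helper only prints the commands; it does not start opensm.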

I need only the log of the opensm on node B.
It would be best to attach it to the bugzilla
issue, but if you can't, you can mail it to me.

-- Yevgeny

>> -- Yevgeny
>>
>>> On Tue, Oct 13, 2009 at 11:32 AM, Yevgeny Kliteynik
>>> <kliteyn-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>>>> Aaron,
>>>>
>>>> Thanks for the logs, this was really helpful.
>>>> Looks like there is a handover race in the OSM -
>>>> SM on node A misses the fact that SM on node B
>>>> has given up its mastership.
>>>>
>>>> There is a bugzilla issue that describes all the
>>>> details of this race:
>>>>
>>>> https://bugs.openfabrics.org/show_bug.cgi?id=1499
>>>>
>>>> I've updated the issue form with your case, and we will continue
>>>> following this bug there.
>>>>
>>>> -- Yevgeny
>>>>
>>>> Aaron Knister wrote:
>>>>> While the adapters have Mellanox chipsets, they're actually IBM OEM
>>>>> branded and IBM hasn't released the 2.7 fw yet. I'm a little hesitant
>>>>> to apply the generic Mellanox FW.
>>>>>
>>>>> On Mon, Oct 12, 2009 at 4:22 AM, Yevgeny Kliteynik
>>>>> <kliteyn-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>>>>>> Or Gerlitz wrote:
>>>>>>> Yevgeny Kliteynik wrote:
>>>>>>>> There was a hand-over problem in OFED 1.4, but later it turned out to be
>>>>>>>> FW issue. The thing is, FW version 2.6.648 doesn't have this bug any
>>>>>>>> more...
>>>>>>> so things should work fine with the newly released 2.7 firmware?
>>>>>> Yes
>>>>>>
>>>>>>> if this is still under question, Aaron, I suggest you open a bugzilla case
>>>>>>> @ https://bugs.openfabrics.org and we can track from there.
>>>>>> Good idea.
>>>>>>
>>>>>> -- Yevgeny
>>>>>>
>>>>>>> Or.
>>>>>>>
>>>>>>>
>>
> 

