From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
To: Jeff Roberson <jroberson-gUAg20sWgfgcWVvVuXF20w@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: LID reconfiguration
Date: Mon, 9 Nov 2009 17:20:47 -0700
Message-ID: <20091110002047.GJ6188@obsidianresearch.com>
In-Reply-To: <alpine.BSF.2.00.0911091348360.1226@desktop>
On Mon, Nov 09, 2009 at 01:56:49PM -1000, Jeff Roberson wrote:
>>> Is there anything I can do other than restart the discovery and
>>> connection process? Shouldn't we have enough information with the GID to
>>> retain and reroute the connection?
>>
>> With a GID you can go back to the SM and get an updated set of
>> path records with the new LID data.
>
> Ok, so the QPs will be held in an error state but I can restart them once
> I re-initialize the paths right? I can query the path using umad and get
> path record? So we'll have a minor hiccup in communication but previously
> buffered data will be sent as soon as the QP is valid again?
I've never heard of anyone recovering QPs once they reach the error
state; I think they are pretty much done at that point. You have to
start again.

To get hitless switching to the passive backup path you need to use
the IB APM (Automatic Path Migration) feature.

Otherwise, you could detect failure of the QP, issue a new PR query
for the GID using umad, and then try to connect again, depending on
how your home-grown connection process works.
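
A rough sketch of that detect-and-retry loop, in Python for brevity.
Everything named here is hypothetical: `query_path_record` stands in for
whatever umad-based PR query you write, `connect` for your own connection
setup, and the exception names, retry count and backoff values are
assumptions rather than part of any real library.

```python
import time
from typing import Callable


class PathLookupError(Exception):
    """Hypothetical: the path-record query got no usable answer from the SM."""


class ConnectError(Exception):
    """Hypothetical: the connection handshake over the new path failed."""


def reconnect_by_gid(
    gid: str,
    query_path_record: Callable[[str], dict],
    connect: Callable[[dict], object],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
):
    """Re-resolve the path for `gid` and build a fresh connection.

    Call this after the old QP has been seen to fail. The old QP is not
    reused; `connect` is expected to create a new one from the fresh
    path record (which carries the current LID).
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            path = query_path_record(gid)  # ask the SM again; LIDs may have changed
            return connect(path)           # start the connection from scratch
        except (PathLookupError, ConnectError) as exc:
            last_error = exc
            if attempt + 1 < max_attempts:
                # Exponential backoff: the SM may still be reconfiguring.
                sleep(base_delay * (2 ** attempt))
    raise ConnectionError(f"could not reconnect to {gid}") from last_error
```

Data that was queued on the failed QP is not resent by this loop; the
application would have to repost it on the new connection.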
> We are not using IPoIB at the moment. This is for an appliance type
> device and the customers will be responsible for their own switches. At
> present everything simply stops working when we re-lid so I just need to
> add the correct failure handling code.
Detecting failure and starting again from scratch is what pretty much
everyone does today, AFAIK.
>> rdmacm when combined with IPoIB bonding will give you a kind of
>> active/passive HA type multi-path.
>
> That is essentially what we're looking for. We discover the devices
> automatically but transparent multi-path would've saved a lot of work.
Yes, you probably could have used the bonding feature, but note that it
does not save you from errored QPs in the failover case, and I've had
problems with IPoIB PR caching in LID-change cases in the past.
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html