public inbox for linux-rdma@vger.kernel.org
From: Sumeet Lahorani <Sumeet.Lahorani-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: gratuitous arps lost during IB switch failure
Date: Tue, 21 Sep 2010 16:42:43 -0700
Message-ID: <4C9942F3.5090206@oracle.com>


Hi All,

We are using dual-ported HCAs, with each port connected to a different 
IB switch, so that we can tolerate the failure of either switch. We are 
trying to cut down the time it takes for traffic (TCP & RDS) to resume 
when an IB switch fails and the hosts fail over from one port to the 
other.

We have the bonding driver configured in active-backup mode and set up 
to send out 100 gratuitous arps at intervals of 100ms whenever there is 
a failover. In most cases, traffic resumes within a few seconds after a 
failover because these gratuitous arps take care of updating all the 
nodes with the new IP:GID mapping.
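
For anyone who wants to compare settings, here is a minimal Python sketch 
that reads a few bonding parameters from sysfs and works out how long a 
burst of gratuitous arps lasts. The bond name `bond0`, the helper names, 
and the choice of parameters are assumptions for illustration; files that 
are absent on a given kernel are reported as None.

```python
from pathlib import Path


def read_bonding_params(bond="bond0", sysfs_root="/sys/class/net"):
    """Return selected bonding parameters as strings (None if a file is absent).

    The bond name and the list of parameters are illustrative assumptions.
    """
    base = Path(sysfs_root) / bond / "bonding"
    params = {}
    for name in ("mode", "miimon", "num_grat_arp", "active_slave"):
        path = base / name
        # sysfs values are short text lines, e.g. mode typically reads
        # "active-backup 1"; they are returned unparsed here.
        params[name] = path.read_text().strip() if path.is_file() else None
    return params


def grat_arp_span_ms(count, interval_ms):
    """Time in ms from the first to the last gratuitous arp of one burst."""
    if count < 1:
        return 0
    return (count - 1) * interval_ms


if __name__ == "__main__":
    print(read_bonding_params())
    # 100 arps sent 100 ms apart span 99 intervals.
    print(grat_arp_span_ms(100, 100))  # 9900
```

With the settings described above, the burst covers roughly the first 10 
seconds after a failover, so a node that misses every one of them was not 
receiving these broadcasts for about that long.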

The problem we are seeing is that sometimes one or more nodes on the 
fabric do not receive even one of these gratuitous arps. Re-establishing 
communication with those nodes then takes much longer (over 40 seconds), 
because it depends on various arp cache timeouts. Does anyone know why 
all of these gratuitous arps might be lost?
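
To make the arp cache timeouts concrete, the following Python sketch 
dumps the per-interface IPv4 neighbour-table settings from procfs on a 
peer node. It assumes the IPoIB traffic runs over an interface named 
`bond0` and that these are the timeouts involved; both are assumptions, 
and the helper name is hypothetical.

```python
from pathlib import Path

# Standard Linux per-interface neighbour (arp cache) sysctls.
NEIGH_TUNABLES = (
    "base_reachable_time_ms",
    "gc_stale_time",
    "delay_first_probe_time",
    "retrans_time_ms",
    "ucast_solicit",
    "mcast_solicit",
)


def read_neigh_tunables(dev="bond0", proc_root="/proc/sys/net/ipv4/neigh"):
    """Return the neighbour-table tunables for one interface as integers.

    The interface name is an assumption; entries that do not exist on the
    running kernel are returned as None.
    """
    base = Path(proc_root) / dev
    values = {}
    for name in NEIGH_TUNABLES:
        path = base / name
        values[name] = int(path.read_text().strip()) if path.is_file() else None
    return values


if __name__ == "__main__":
    for name, value in read_neigh_tunables().items():
        print(f"{name} = {value}")
```

Comparing these values on a node that recovers quickly against one that 
takes over 40 seconds may help show which timeout dominates.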

Besides the gratuitous arp settings, are there any other tunables to 
look at to minimize the time it takes for IPoIB traffic to resume?

- Sumeet

