public inbox for linux-rdma@vger.kernel.org
From: Venkat Venkatsubra <venkat.x.venkatsubra-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
To: linux-rdma <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Cc: pierre orzechowski
	<pierre.e.orzechowski-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
	Michael Nowak
	<michael.nowak-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Subject: Question about RDMA_CM_EVENT_ROUTE_ERROR
Date: Thu, 24 May 2012 17:48:56 -0500
Message-ID: <4FBEBAD8.5010507@oracle.com>

Hello,

We are trying to determine the cause of RDMA_CM_EVENT_ROUTE_ERROR
events after a bonding driver failover.
The event status returned is -EINVAL. To find out when this -EINVAL is
returned, I added some debug output, which showed a value of 3 for
mad_hdr.status in the function below, from
drivers/infiniband/core/sa_query.c.

[drivers/infiniband/core/sa_query.c]
static void recv_handler(struct ib_mad_agent *mad_agent,
                         struct ib_mad_recv_wc *mad_recv_wc)
{
         struct ib_sa_query *query;
         struct ib_mad_send_buf *mad_buf;

         mad_buf = (void *) (unsigned long) mad_recv_wc->wc->wr_id;
         query = mad_buf->context[0];

         if (query->callback) {
                 if (mad_recv_wc->wc->status == IB_WC_SUCCESS) {
                         query->callback(query,
                                         mad_recv_wc->recv_buf.mad->mad_hdr.status ? -EINVAL : 0,
                                         (struct ib_sa_mad *) mad_recv_wc->recv_buf.mad);

How can I find out what the value 3 in
mad_recv_wc->recv_buf.mad->mad_hdr.status means?
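
For reference, here is a minimal user-space sketch of how the 16-bit MAD
status word could be decoded. It assumes the common MAD status layout from
the InfiniBand specification (bit 0 busy, bit 1 redirect required, bits 2-4
invalid-field code, bits 8-15 class-specific status) and that mad_hdr.status
is stored in network byte order. The helper names (decode_mad_status,
swab16, sa_class_status_name) are hypothetical, and the SA class-specific
strings are listed from memory and should be checked against the
specification. Whether the printed 3 is a raw big-endian value or an already
converted one is not known from the debug output, so both readings are worth
checking.

```c
#include <stdint.h>
#include <stdio.h>

/* Fields of the common MAD status word, in host byte order. */
struct mad_status {
    unsigned busy;            /* bit 0 */
    unsigned redirect;        /* bit 1 */
    unsigned invalid_field;   /* bits 2-4 */
    unsigned class_specific;  /* bits 8-15 */
};

/* Split a host-order status word into its fields. */
static struct mad_status decode_mad_status(uint16_t host_status)
{
    struct mad_status s;

    s.busy = host_status & 0x1;
    s.redirect = (host_status >> 1) & 0x1;
    s.invalid_field = (host_status >> 2) & 0x7;
    s.class_specific = (host_status >> 8) & 0xff;
    return s;
}

/* Unconditional byte swap: what be16_to_cpu() does on a little-endian host. */
static uint16_t swab16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Assumed SA class-specific status names; verify against the spec. */
static const char *sa_class_status_name(unsigned code)
{
    switch (code) {
    case 0: return "none";
    case 1: return "no resources";
    case 2: return "request invalid";
    case 3: return "no records";
    case 4: return "too many records";
    case 5: return "invalid GID";
    case 6: return "insufficient components";
    case 7: return "request denied";
    default: return "unknown";
    }
}

/* Print the fields of a status word under one byte-order interpretation. */
static void print_mad_status(const char *label, uint16_t host_status)
{
    struct mad_status s = decode_mad_status(host_status);

    printf("%s: busy=%u redirect=%u invalid_field=%u class_specific=%u (%s)\n",
           label, s.busy, s.redirect, s.invalid_field, s.class_specific,
           sa_class_status_name(s.class_specific));
}
```

Calling print_mad_status("as printed", 3) and
print_mad_status("byte-swapped", swab16(3)) shows the two readings: taken as
a host-order value, 3 sets the busy and redirect bits, while after a byte
swap it becomes 0x0300, a class-specific status of 3.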

To test RDS reconnect time, we reboot one of the switches connected to
one port of the bonding driver.
The bonding driver then fails over to the other port, and RDMA CM is
notified, which in turn notifies RDS.
RDS initiates a reconnect, and rdma_resolve_route results in these errors.
About 25 connections try to fail over at the same time.
We get this error for a couple of seconds, and eventually
rdma_resolve_route succeeds.
Some of the connections succeed right away, so the errors may be due to
the load generated by too many concurrent rdma_resolve_route calls.

Thanks for your help.

Venkat
