public inbox for linux-rdma@vger.kernel.org
* [PATCH] Init ipoib_neigh.dgid
@ 2009-11-16 17:36 David J. Wilder
       [not found] ` <1258392964.29051.5.camel-XfwDJb4SXxnMbYB6QlFGEg@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: David J. Wilder @ 2009-11-16 17:36 UTC (permalink / raw)
  To: rdreir-FYB4Gu1CFyUAvxtiuMwx3w, eli-VPRAkNaXOzVS1MOuV/RT9w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

Roland, Eli

Ipoib can miss a change in dgid under some conditions.  The problem
occurs when ipoib_neigh->dgid contains a stale address.  The fix is to
set ipoib_neigh->dgid to zero in ipoib_neigh_alloc().

Detailed description: A system using bonding on its ipoib interfaces has
switched its active slave interface from interface A to B and back to A,
setting up the situation for this bug.  The system that fails will not
correctly process the second address change.

Each neighbor has an associated ipoib_neigh.  ipoib_neigh->dgid holds a
copy of the remote node's hardware address.  When an address changes,
neighbor->ha is updated by the network layer (arp code) with the new
address.  Ipoib detects this change in ipoib_start_xmit() by comparing
neighbor->ha with ipoib_neigh->dgid.  The bug is that ipoib_neigh->dgid
already contains the new address (A), so the change from B to A is missed
by ipoib.  Here is the sequence of events:

ipoib_neigh->dgid = A, neighbor->ha = A

The address is switched to B (the first switch)

neighbor->ha=B

The change is seen in ipoib_start_xmit(), because neighbor->ha !=
ipoib_neigh->dgid.

The ipoib_neigh is released, and a new one is allocated.

The memory allocator returns the same chunk of memory that was just
released, so ipoib_neigh->dgid still contains A at this point.

ipoib_neigh->dgid should be updated in neigh_add_path(), but dgid is not
updated if the following conditions are true:

        1) __path_find() returns a path

        2) path->ah is NULL

The remote system now switches from address B to A, and neighbor->ha is
updated to A.

Now we have: ipoib_neigh->dgid = A, neighbor->ha = A

Since the addresses are the same, ipoib won't process the change in address.
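
For illustration only, the sequence above can be modelled in user space.
This is a rough sketch, not the driver code: the model_* types, the
one-slot chunk cache and second_change_detected() are all hypothetical
names invented for this example, and the "path found, path->ah is NULL"
case is modelled simply by never refreshing dgid after reallocation.

```c
#include <stdlib.h>
#include <string.h>

#define GID_LEN 16

/* Hypothetical stand-ins for struct neighbour and struct ipoib_neigh. */
struct model_neighbour { unsigned char ha[GID_LEN]; };
struct model_ipoib_neigh { unsigned char dgid[GID_LEN]; };

/* One-slot cache: models an allocator that hands back the chunk
 * that was just released, with its old contents intact. */
static struct model_ipoib_neigh *cached;

static struct model_ipoib_neigh *model_neigh_alloc(int zero_dgid)
{
	struct model_ipoib_neigh *n = cached;

	cached = NULL;
	if (!n)
		n = malloc(sizeof(*n));
	if (n && zero_dgid)	/* the proposed fix */
		memset(n->dgid, 0, sizeof(n->dgid));
	return n;
}

static void model_neigh_free(struct model_ipoib_neigh *n)
{
	cached = n;
}

/* Models the comparison done on the transmit path. */
static int address_changed(const struct model_neighbour *nb,
			   const struct model_ipoib_neigh *n)
{
	return memcmp(n->dgid, nb->ha, GID_LEN) != 0;
}

/* Returns 1 if the second switch (B -> A) is noticed, 0 if it is
 * missed, -1 on allocation failure. */
static int second_change_detected(int zero_dgid)
{
	unsigned char A[GID_LEN], B[GID_LEN];
	struct model_neighbour nb;
	struct model_ipoib_neigh *n;
	int detected;

	memset(A, 0xaa, GID_LEN);
	memset(B, 0xbb, GID_LEN);

	/* Initial state: dgid = A, ha = A. */
	n = model_neigh_alloc(zero_dgid);
	if (!n)
		return -1;
	memcpy(n->dgid, A, GID_LEN);
	memcpy(nb.ha, A, GID_LEN);

	/* First switch: ha = B, mismatch is seen, neigh is replaced. */
	memcpy(nb.ha, B, GID_LEN);
	if (address_changed(&nb, n)) {
		model_neigh_free(n);
		n = model_neigh_alloc(zero_dgid); /* same chunk again */
		if (!n)
			return -1;
		/* dgid is deliberately not refreshed here. */
	}

	/* Second switch: ha goes back to A. */
	memcpy(nb.ha, A, GID_LEN);
	detected = address_changed(&nb, n);

	free(n);
	return detected;
}
```

In this model, second_change_detected(0) returns 0 (the stale A in dgid
hides the change back to A), while second_change_detected(1) returns 1
because the zeroed dgid no longer matches neighbor->ha.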

Signed-off-by: David Wilder <dwilder-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>

------------------------------------------------------
 drivers/infiniband/ulp/ipoib/ipoib_main.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 2bf5116..25ef50b 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -884,6 +884,7 @@ struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour,
 
 	neigh->neighbour = neighbour;
 	neigh->dev = dev;
+	memset(&neigh->dgid.raw, 0, sizeof(union ib_gid));
 	*to_ipoib_neigh(neighbour) = neigh;
 	skb_queue_head_init(&neigh->queue);
 	ipoib_cm_set(neigh, NULL);


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH] Init ipoib_neigh.dgid
       [not found] ` <1258392964.29051.5.camel-XfwDJb4SXxnMbYB6QlFGEg@public.gmane.org>
@ 2009-11-16 17:40   ` Roland Dreier
       [not found]     ` <ada3a4ecqrg.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: Roland Dreier @ 2009-11-16 17:40 UTC (permalink / raw)
  To: David J. Wilder
  Cc: rdreir-FYB4Gu1CFyUAvxtiuMwx3w, eli-VPRAkNaXOzVS1MOuV/RT9w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

Is this any different from the previous patch you sent?  Which is still
in my review queue...

 - R.


* patchwork for tracking patch status (was: [PATCH] Init ipoib_neigh.dgid)
       [not found]     ` <ada3a4ecqrg.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
@ 2009-11-16 17:44       ` Roland Dreier
  0 siblings, 0 replies; 3+ messages in thread
From: Roland Dreier @ 2009-11-16 17:44 UTC (permalink / raw)
  To: David J. Wilder
  Cc: eli-VPRAkNaXOzVS1MOuV/RT9w, linux-rdma-u79uwXL29TY76Z2rM5mHXA

By the way, you (and anyone else) can look at
<http://patchwork.kernel.org/project/linux-rdma/list/> to see the status
of patches.  If a patch is in the "New" state then it's not lost and
I'll get to it eventually.

 - R.

