* [PATCH/RFC] IPoIB: Free ipoib neigh on path record failure so path rec queries are retried
@ 2013-02-25 17:46 Roland Dreier
[not found] ` <1361814409-6704-1-git-send-email-roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Roland Dreier @ 2013-02-25 17:46 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Roland Dreier
From: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
If IPoIB fails to look up a path record (eg if it tries during an SM
failover when one SM is dead but the new one hasn't taken over yet), the
driver ends up with a neighbour structure but no address handle (AH).
There's no mechanism to recover from this: any further packets sent to
this destination will be silently dumped in ipoib_start_xmit().
Fix this by freeing the neighbour structures when a path rec query
fails, so that the next packet queued to be sent will trigger a new path
record query.
Signed-off-by: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
---
We actually hit this while testing SM failover. Let me know if there
are any objections to merging this for 3.9.
drivers/infiniband/ulp/ipoib/ipoib_main.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 66d6da9..8534afd 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -510,6 +510,9 @@ static void path_rec_completion(int status,
spin_unlock_irqrestore(&priv->lock, flags);
+ if (IS_ERR_OR_NULL(ah))
+ ipoib_del_neighs_by_gid(dev, path->pathrec.dgid.raw);
+
if (old_ah)
ipoib_put_ah(old_ah);
--
1.8.1.2
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related [flat|nested] 4+ messages in thread[parent not found: <1361814409-6704-1-git-send-email-roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>]
* Re: [PATCH/RFC] IPoIB: Free ipoib neigh on path record failure so path rec queries are retried [not found] ` <1361814409-6704-1-git-send-email-roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> @ 2013-02-26 10:32 ` Or Gerlitz [not found] ` <512C8F4F.9000606-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> 0 siblings, 1 reply; 4+ messages in thread From: Or Gerlitz @ 2013-02-26 10:32 UTC (permalink / raw) To: Roland Dreier Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Roland Dreier, Shlomo Pongratz, Erez Shitrit On 25/02/2013 19:46, Roland Dreier wrote: > If IPoIB fails to look up a path record (eg if it tries during an SM > failover when one SM is dead but the new one hasn't taken over yet), the > driver ends up with a neighbour structure but no address handle (AH). > There's no mechanism to recover from this: any further packets sent to > this destination will be silently dumped in ipoib_start_xmit(). Looking on the flow of sending ARP probes, I see that in unicast_arp_send, if there's no AH for the path, a path query is initiated, if (path->ah) { ipoib_dbg(priv, "Send unicast ARP to %04x\n", be16_to_cpu(path->pathrec.dlid)); spin_unlock_irqrestore(&priv->lock, flags); ipoib_send(dev, skb, path->ah, IPOIB_QPN(cb->hwaddr)); return; } else if ((path->query || !path_rec_start(dev, path)) && skb_queue_len(&path->queue) < IPOIB_MAX_PATH_REC_QUEUE) { __skb_queue_tail(&path->queue, skb); } else { ++dev->stats.tx_dropped; dev_kfree_skb_any(skb); } so eventually the traffic should resume once the ND state machine sends a probe, agree? did you only wanted to make that faster? Or. > Fix this by freeing the neighbour structures when a path rec query fails, so that the next packet queued to be sent will trigger a new path record query. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 4+ messages in thread
[parent not found: <512C8F4F.9000606-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH/RFC] IPoIB: Free ipoib neigh on path record failure so path rec queries are retried [not found] ` <512C8F4F.9000606-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> @ 2013-02-26 16:55 ` Roland Dreier [not found] ` <CAG4TOxMyTEMKLUF1+YLej7Txj4P5MLZ49cVqAhQP4CW8qi0AUA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 4+ messages in thread From: Roland Dreier @ 2013-02-26 16:55 UTC (permalink / raw) To: Or Gerlitz Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Shlomo Pongratz, Erez Shitrit On Tue, Feb 26, 2013 at 2:32 AM, Or Gerlitz <ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org> wrote: > Looking on the flow of sending ARP probes, I see that in unicast_arp_send, > if there's no AH for the path, a path query is initiated, > > if (path->ah) { > ipoib_dbg(priv, "Send unicast ARP to %04x\n", > be16_to_cpu(path->pathrec.dlid)); > > spin_unlock_irqrestore(&priv->lock, flags); > ipoib_send(dev, skb, path->ah, IPOIB_QPN(cb->hwaddr)); > return; > } else if ((path->query || !path_rec_start(dev, path)) && > skb_queue_len(&path->queue) < IPOIB_MAX_PATH_REC_QUEUE) { > __skb_queue_tail(&path->queue, skb); > } else { > ++dev->stats.tx_dropped; > dev_kfree_skb_any(skb); > } > > > so eventually the traffic should resume once the ND state machine sends a > probe, agree? did you only wanted to make that faster? Well, look at ipoib_start_xmit(): /* unicast, arrange "switch" according to probability */ switch (header->proto) { case htons(ETH_P_IP): case htons(ETH_P_IPV6): neigh = ipoib_neigh_get(dev, cb->hwaddr); if (unlikely(!neigh)) { neigh_add_path(skb, cb->hwaddr, dev); return NETDEV_TX_OK; } break; case htons(ETH_P_ARP): case htons(ETH_P_RARP): /* for unicast ARP and RARP should always perform path find */ unicast_arp_send(skb, dev, cb); return NETDEV_TX_OK; etc. So although you mention ND, we only call unicast_arp_send() for IPv4 ARP. And in our case we were actually using IPv6, which follows the main flow with neighbours even for neighbour discovery packets. So there's no way to ever recover from having the failed path record. In fact I bet this is why the bug has been there as long as it has been: almost no one is using IPv6 on IPoIB seriously, and IPv4 should work OK as you point out. - R. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 4+ messages in thread
[parent not found: <CAG4TOxMyTEMKLUF1+YLej7Txj4P5MLZ49cVqAhQP4CW8qi0AUA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH/RFC] IPoIB: Free ipoib neigh on path record failure so path rec queries are retried [not found] ` <CAG4TOxMyTEMKLUF1+YLej7Txj4P5MLZ49cVqAhQP4CW8qi0AUA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-02-27 10:32 ` Sebastian Riemer 0 siblings, 0 replies; 4+ messages in thread From: Sebastian Riemer @ 2013-02-27 10:32 UTC (permalink / raw) To: Roland Dreier Cc: Or Gerlitz, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Shlomo Pongratz, Erez Shitrit On 26.02.2013 17:55, Roland Dreier wrote: [...] > In fact I bet this is why the bug has been there as long as it has > been: almost no one is using IPv6 on IPoIB seriously, and IPv4 should > work OK as you point out. Thanks a lot,.... Unfortunately, we are using IPoIB with IPv6 in production for the inter-VM and the internet gateway traffic. We had an issue last Friday were one of our gateway machines wasn't reachable anymore and pings didn't come through. We had to reload the ib_ipoib modules of nearly every server connected to that gateway. I'm so glad that we use SRP - all SRP connections survived. I don't have the time to care for IPoIB as well at ProfitBricks so I will encourage our networking teams to get their hands on IPoIB and linux-rdma communication. How to verify if we've hit this bug? How to reproduce this bug? Cheers, Sebastian -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2013-02-27 10:32 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-02-25 17:46 [PATCH/RFC] IPoIB: Free ipoib neigh on path record failure so path rec queries are retried Roland Dreier
[not found] ` <1361814409-6704-1-git-send-email-roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2013-02-26 10:32 ` Or Gerlitz
[not found] ` <512C8F4F.9000606-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2013-02-26 16:55 ` Roland Dreier
[not found] ` <CAG4TOxMyTEMKLUF1+YLej7Txj4P5MLZ49cVqAhQP4CW8qi0AUA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-02-27 10:32 ` Sebastian Riemer
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox