* [ofa-general] [PATCH 2/2] IPoIB: Code cleanup
2007-09-18 11:18 [ofa-general] [PATCH 1/2] IPoIB: Fix unregister_netdev hang Krishna Kumar
@ 2007-09-18 11:18 ` Krishna Kumar
2007-09-18 14:27 ` [PATCH 1/2] IPoIB: Fix unregister_netdev hang Roland Dreier
2007-09-18 17:58 ` [PATCH net-2.6.24] Fix refcounting problem with netif_rx_reschedule() Roland Dreier
2 siblings, 0 replies; 12+ messages in thread
From: Krishna Kumar @ 2007-09-18 11:18 UTC (permalink / raw)
To: rdreier; +Cc: netdev, davem, general
Follow-up cleanup and "while loop" optimization in the poll handler.
net_rx_action guarantees that 'budget' is atleast 1.
Note: This could also be done for poll handlers of other drivers.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
ipoib_ib.c | 22 ++++++++--------------
1 files changed, 8 insertions(+), 14 deletions(-)
diff -ruNp new1/drivers/infiniband/ulp/ipoib/ipoib_ib.c new2/drivers/infiniband/ulp/ipoib/ipoib_ib.c
--- new1/drivers/infiniband/ulp/ipoib/ipoib_ib.c 2007-09-18 16:14:20.000000000 +0530
+++ new2/drivers/infiniband/ulp/ipoib/ipoib_ib.c 2007-09-18 16:31:42.000000000 +0530
@@ -285,19 +285,16 @@ int ipoib_poll(struct napi_struct *napi,
{
struct ipoib_dev_priv *priv = container_of(napi, struct ipoib_dev_priv, napi);
struct net_device *dev = priv->dev;
- int done;
- int t;
- int n, i;
+ int num_wc, max_wc;
+ int done = 0;
- done = 0;
-
- while (done < budget) {
- int max = (budget - done);
+ do {
+ int i;
- t = min(IPOIB_NUM_WC, max);
- n = ib_poll_cq(priv->cq, t, priv->ibwc);
+ max_wc = min(IPOIB_NUM_WC, budget - done);
+ num_wc = ib_poll_cq(priv->cq, max_wc, priv->ibwc);
- for (i = 0; i < n; i++) {
+ for (i = 0; i < num_wc; i++) {
struct ib_wc *wc = priv->ibwc + i;
if (wc->wr_id & IPOIB_CM_OP_SRQ) {
@@ -309,10 +306,7 @@ int ipoib_poll(struct napi_struct *napi,
} else
ipoib_ib_handle_tx_wc(dev, wc);
}
-
- if (n != t)
- break;
- }
+ } while (num_wc == max_wc && done < budget);
if (done < budget) {
if (likely(!ib_req_notify_cq(priv->cq,
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH 1/2] IPoIB: Fix unregister_netdev hang
2007-09-18 11:18 [ofa-general] [PATCH 1/2] IPoIB: Fix unregister_netdev hang Krishna Kumar
2007-09-18 11:18 ` [ofa-general] [PATCH 2/2] IPoIB: Code cleanup Krishna Kumar
@ 2007-09-18 14:27 ` Roland Dreier
2007-09-19 3:23 ` Krishna Kumar2
2007-09-18 17:58 ` [PATCH net-2.6.24] Fix refcounting problem with netif_rx_reschedule() Roland Dreier
2 siblings, 1 reply; 12+ messages in thread
From: Roland Dreier @ 2007-09-18 14:27 UTC (permalink / raw)
To: Krishna Kumar; +Cc: netdev, general, davem
Thanks for testing on ehca...
> While using IPoIB over EHCA (rc6 bits), unregister_netdev hangs with
I don't think you're actually using rc6 bits, since in your patch you have:
> -poll_more:
and I think that is only in Dave's net-2.6.24 tree now, right?
> The problem is that the poll handler does netif_rx_complete (which
> does a dev_put) followed by netif_rx_reschedule() to schedule for
> more receives (which again does a dev_put). This reduces refcount to
> < 0 (depending on how many times netif_rx_complete followed by
> netif_rx_reschedule was called).
Dave, the real problem seems to be that netif_rx_recschedule() calls
__napi_schedule() rather than __netif_rx_schedule(), so it misses the
call to dev_hold() that is needed to balance the dev_put() in
netif_rx_complete(). The current netif_rx_reschedule() looks like it
really should be napi_reschedule(), and we need a new function that
takes a netdev too. Or am I misunderstanding the refcounting?
I'll send a patch once I've had some breakfast and had a chance to at
least compile it...
Krishna, unfortunately your proposed fix has a race:
> - netif_rx_complete(dev, napi);
> - if (unlikely(ib_req_notify_cq(priv->cq,
> - IB_CQ_NEXT_COMP |
> - IB_CQ_REPORT_MISSED_EVENTS)) &&
> - netif_rx_reschedule(napi))
> - goto poll_more;
> + if (likely(!ib_req_notify_cq(priv->cq,
> + IB_CQ_NEXT_COMP |
> + IB_CQ_REPORT_MISSED_EVENTS)))
It is possible for an interrupt to happen immediately right here,
before the netif_rx_complete(), so that netif_rx_schedule() gets
called while we are still on the poll list.
> + netif_rx_complete(dev, napi);
- R.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] IPoIB: Fix unregister_netdev hang
2007-09-18 14:27 ` [PATCH 1/2] IPoIB: Fix unregister_netdev hang Roland Dreier
@ 2007-09-19 3:23 ` Krishna Kumar2
2007-09-19 3:30 ` [ofa-general] " Roland Dreier
0 siblings, 1 reply; 12+ messages in thread
From: Krishna Kumar2 @ 2007-09-19 3:23 UTC (permalink / raw)
To: Roland Dreier; +Cc: davem, general, netdev
Hi Roland,
Roland Dreier <rdreier@cisco.com> wrote on 09/18/2007 07:57:24 PM:
> > While using IPoIB over EHCA (rc6 bits), unregister_netdev hangs with
>
> I don't think you're actually using rc6 bits, since in your patch you
have:
>
> > -poll_more:
>
> and I think that is only in Dave's net-2.6.24 tree now, right?
Nope, that was what I downloaded yesterday:
VERSION = 2
PATCHLEVEL = 6
SUBLEVEL = 23
EXTRAVERSION =-rc6
NAME = Pink Farting Weasel
> > + if (likely(!ib_req_notify_cq(priv->cq,
> > +
IB_CQ_NEXT_COMP |
> > +
IB_CQ_REPORT_MISSED_EVENTS)))
>
> It is possible for an interrupt to happen immediately right here,
> before the netif_rx_complete(), so that netif_rx_schedule() gets
> called while we are still on the poll list.
>
> > + netif_rx_complete(dev, napi);
To be clear, netif_rx_schedule while we are still in the poll list will not
do any harm as it does nothing since NAPI_STATE_SCHED is still set (cleared
by netif_rx_complete which has not yet run). Effectively we lost/delayed
processing an interrupt, if I understood the code right.
I agree with you on the new patch.
thanks,
- KK
^ permalink raw reply [flat|nested] 12+ messages in thread
* [ofa-general] Re: [PATCH 1/2] IPoIB: Fix unregister_netdev hang
2007-09-19 3:23 ` Krishna Kumar2
@ 2007-09-19 3:30 ` Roland Dreier
2007-09-19 4:24 ` Krishna Kumar2
0 siblings, 1 reply; 12+ messages in thread
From: Roland Dreier @ 2007-09-19 3:30 UTC (permalink / raw)
To: Krishna Kumar2; +Cc: netdev, davem, general
> > and I think that is only in Dave's net-2.6.24 tree now, right?
>
> Nope, that was what I downloaded yesterday:
>
> VERSION = 2
> PATCHLEVEL = 6
> SUBLEVEL = 23
> EXTRAVERSION =-rc6
> NAME = Pink Farting Weasel
Please double check your tree. I just very carefully looked at my
trees, and the poll_more: label is added in commit 6b460a71 ("[NET]:
Make NAPI polling independent of struct net_device objects.") which is
only in the net-2.6.24 tree. Of course Dave did not change the
version information in the Makefile since he wouldn't want Linus to
pick up any extra strange changes when he pulls, so a net-2.6.24 tree
will look like 2.6.23-rc6 as you quoted.
And the refcounting bug I fixed is only in net-2.6.24.
> To be clear, netif_rx_schedule while we are still in the poll list will not
> do any harm as it does nothing since NAPI_STATE_SCHED is still set (cleared
> by netif_rx_complete which has not yet run). Effectively we lost/delayed
> processing an interrupt, if I understood the code right.
Right, we lose an interrupt, and since the CQ events are one-shot, we
never get another one, and the interface is effectively dead.
- R.
^ permalink raw reply [flat|nested] 12+ messages in thread* [ofa-general] Re: [PATCH 1/2] IPoIB: Fix unregister_netdev hang
2007-09-19 3:30 ` [ofa-general] " Roland Dreier
@ 2007-09-19 4:24 ` Krishna Kumar2
0 siblings, 0 replies; 12+ messages in thread
From: Krishna Kumar2 @ 2007-09-19 4:24 UTC (permalink / raw)
To: Roland Dreier; +Cc: netdev, davem, general
Hi Roland,
> Please double check your tree. I just very carefully looked at my
> trees, and the poll_more: label is added in commit 6b460a71 ("[NET]:
> Make NAPI polling independent of struct net_device objects.") which is
> only in the net-2.6.24 tree. Of course Dave did not change the
> version information in the Makefile since he wouldn't want Linus to
> pick up any extra strange changes when he pulls, so a net-2.6.24 tree
> will look like 2.6.23-rc6 as you quoted.
>
> And the refcounting bug I fixed is only in net-2.6.24.
You are absolutely right. My wording was incorrect, I should have said
net-2.6.24 (which is *at* rev rc6).
> > To be clear, netif_rx_schedule while we are still in the poll list
will not
> > do any harm as it does nothing since NAPI_STATE_SCHED is still set
(cleared
> > by netif_rx_complete which has not yet run). Effectively we
lost/delayed
> > processing an interrupt, if I understood the code right.
>
> Right, we lose an interrupt, and since the CQ events are one-shot, we
> never get another one, and the interface is effectively dead.
Thanks,
- KK
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net-2.6.24] Fix refcounting problem with netif_rx_reschedule()
2007-09-18 11:18 [ofa-general] [PATCH 1/2] IPoIB: Fix unregister_netdev hang Krishna Kumar
2007-09-18 11:18 ` [ofa-general] [PATCH 2/2] IPoIB: Code cleanup Krishna Kumar
2007-09-18 14:27 ` [PATCH 1/2] IPoIB: Fix unregister_netdev hang Roland Dreier
@ 2007-09-18 17:58 ` Roland Dreier
2007-09-18 18:04 ` [ofa-general] [PATCH net-2.6.24] Fix documentation for dev_put()/dev_hold() Roland Dreier
2007-09-18 20:15 ` [ofa-general] Re: [PATCH net-2.6.24] Fix refcounting problem with netif_rx_reschedule() David Miller
2 siblings, 2 replies; 12+ messages in thread
From: Roland Dreier @ 2007-09-18 17:58 UTC (permalink / raw)
To: davem; +Cc: Krishna Kumar, netdev, general
netif_rx_complete() takes a netdev parameter and does dev_put() on
that netdev, so netif_rx_reschedule() needs to also take a netdev
parameter and do dev_hold() on it to avoid reference counts from
getting becoming negative because of unbalanced dev_put()s.
This should fix the problem reported by Krishna Kumar
<krkumar2@in.ibm.com> with IPoIB waiting forever for netdev refcounts
to become 0 during module unload.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
---
Dave, feel free to roll this up into earlier NAPI conversion patches
(assuming I'm understanding things correctly and this patch actually
makes sense!).
BTW, it looks like drivers/net/ibm_emac/ibm_emac_mal.c would not have
built in the current net-2.6.24 tree, since its call to
netif_rx_reschedule() was left with the netdev parameter. So that
file does not need to be touched in this patch.
drivers/infiniband/ulp/ipoib/ipoib_ib.c | 2 +-
drivers/net/arm/ep93xx_eth.c | 2 +-
drivers/net/ehea/ehea_main.c | 2 +-
drivers/net/ibmveth.c | 2 +-
include/linux/netdevice.h | 21 +++++++++++----------
5 files changed, 15 insertions(+), 14 deletions(-)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 6a2bff4..481e4b6 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -320,7 +320,7 @@ poll_more:
if (unlikely(ib_req_notify_cq(priv->cq,
IB_CQ_NEXT_COMP |
IB_CQ_REPORT_MISSED_EVENTS)) &&
- netif_rx_reschedule(napi))
+ netif_rx_reschedule(dev, napi))
goto poll_more;
}
diff --git a/drivers/net/arm/ep93xx_eth.c b/drivers/net/arm/ep93xx_eth.c
index f3858d1..7f016f3 100644
--- a/drivers/net/arm/ep93xx_eth.c
+++ b/drivers/net/arm/ep93xx_eth.c
@@ -309,7 +309,7 @@ poll_some_more:
}
spin_unlock_irq(&ep->rx_lock);
- if (more && netif_rx_reschedule(napi))
+ if (more && netif_rx_reschedule(dev, napi))
goto poll_some_more;
}
diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
index 4a5ab4a..9a499f4 100644
--- a/drivers/net/ehea/ehea_main.c
+++ b/drivers/net/ehea/ehea_main.c
@@ -636,7 +636,7 @@ static int ehea_poll(struct napi_struct *napi, int budget)
if (!cqe && !cqe_skb)
return 0;
- if (!netif_rx_reschedule(napi))
+ if (!netif_rx_reschedule(dev, napi))
return 0;
}
diff --git a/drivers/net/ibmveth.c b/drivers/net/ibmveth.c
index b8d7cec..b94f266 100644
--- a/drivers/net/ibmveth.c
+++ b/drivers/net/ibmveth.c
@@ -973,7 +973,7 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
netif_rx_complete(netdev, napi);
if (ibmveth_rxq_pending_buffer(adapter) &&
- netif_rx_reschedule(napi)) {
+ netif_rx_reschedule(netdev, napi)) {
lpar_rc = h_vio_signal(adapter->vdev->unit_address,
VIO_IRQ_DISABLE);
goto restart_poll;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index be5fe05..0dbf185 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1198,16 +1198,6 @@ static inline u32 netif_msg_init(int debug_value, int default_msg_enable_bits)
return (1 << debug_value) - 1;
}
-/* Try to reschedule poll. Called by dev->poll() after netif_rx_complete(). */
-static inline int netif_rx_reschedule(struct napi_struct *n)
-{
- if (napi_schedule_prep(n)) {
- __napi_schedule(n);
- return 1;
- }
- return 0;
-}
-
/* Test if receive needs to be scheduled but only if up */
static inline int netif_rx_schedule_prep(struct net_device *dev,
struct napi_struct *napi)
@@ -1234,6 +1224,17 @@ static inline void netif_rx_schedule(struct net_device *dev,
__netif_rx_schedule(dev, napi);
}
+/* Try to reschedule poll. Called by dev->poll() after netif_rx_complete(). */
+static inline int netif_rx_reschedule(struct net_device *dev,
+ struct napi_struct *napi)
+{
+ if (napi_schedule_prep(napi)) {
+ __netif_rx_schedule(dev, napi);
+ return 1;
+ }
+ return 0;
+}
+
/* same as netif_rx_complete, except that local_irq_save(flags)
* has already been issued
*/
^ permalink raw reply related [flat|nested] 12+ messages in thread* [ofa-general] [PATCH net-2.6.24] Fix documentation for dev_put()/dev_hold()
2007-09-18 17:58 ` [PATCH net-2.6.24] Fix refcounting problem with netif_rx_reschedule() Roland Dreier
@ 2007-09-18 18:04 ` Roland Dreier
2007-09-18 20:16 ` David Miller
2007-09-18 20:15 ` [ofa-general] Re: [PATCH net-2.6.24] Fix refcounting problem with netif_rx_reschedule() David Miller
1 sibling, 1 reply; 12+ messages in thread
From: Roland Dreier @ 2007-09-18 18:04 UTC (permalink / raw)
To: davem; +Cc: netdev, general
It looks like the comments for dev_put() and dev_hold() got reversed somehow.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
---
include/linux/netdevice.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index be5fe05..239ae68 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1030,7 +1030,7 @@ extern int netdev_budget;
extern void netdev_run_todo(void);
/**
- * dev_put - get reference to device
+ * dev_put - release reference to device
* @dev: network device
*
* Hold reference to device to keep it from being freed.
@@ -1041,7 +1041,7 @@ static inline void dev_put(struct net_device *dev)
}
/**
- * dev_hold - release reference to device
+ * dev_hold - get reference to device
* @dev: network device
*
* Release reference to device to allow it to be freed.
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [ofa-general] Re: [PATCH net-2.6.24] Fix refcounting problem with netif_rx_reschedule()
2007-09-18 17:58 ` [PATCH net-2.6.24] Fix refcounting problem with netif_rx_reschedule() Roland Dreier
2007-09-18 18:04 ` [ofa-general] [PATCH net-2.6.24] Fix documentation for dev_put()/dev_hold() Roland Dreier
@ 2007-09-18 20:15 ` David Miller
2007-09-18 22:46 ` Roland Dreier
1 sibling, 1 reply; 12+ messages in thread
From: David Miller @ 2007-09-18 20:15 UTC (permalink / raw)
To: rdreier; +Cc: netdev, general
From: Roland Dreier <rdreier@cisco.com>
Date: Tue, 18 Sep 2007 10:58:37 -0700
> netif_rx_complete() takes a netdev parameter and does dev_put() on
> that netdev, so netif_rx_reschedule() needs to also take a netdev
> parameter and do dev_hold() on it to avoid reference counts from
> getting becoming negative because of unbalanced dev_put()s.
>
> This should fix the problem reported by Krishna Kumar
> <krkumar2@in.ibm.com> with IPoIB waiting forever for netdev refcounts
> to become 0 during module unload.
>
> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Applied to net-2.6.24, thanks Roland.
> BTW, it looks like drivers/net/ibm_emac/ibm_emac_mal.c would not have
> built in the current net-2.6.24 tree, since its call to
> netif_rx_reschedule() was left with the netdev parameter. So that
> file does not need to be touched in this patch.
Yes, I know, this is the one NAPI driver that hasn't been converted.
It's a complicated conversion because of how the driver and the data
structures have been arranged (in short, a mess) which makes it
insanely difficult to get from a queue instance back up to a network
device or similar.
Further complicating things is that you need to setup a ppc32
cross-build environment to even build test a conversion, and I'm not
comfortable doing the surgery until I can test build the thing.
And this may be hard to believe, but other things have been more
pressing than setting up a ppc32 cross-build environment :-)
This is a hint of anyone looking for something to do that it'd
be much appreciated for someone to tackle the ibm_emac conversion.
Thanks.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [ofa-general] Re: [PATCH net-2.6.24] Fix refcounting problem with netif_rx_reschedule()
2007-09-18 20:15 ` [ofa-general] Re: [PATCH net-2.6.24] Fix refcounting problem with netif_rx_reschedule() David Miller
@ 2007-09-18 22:46 ` Roland Dreier
2007-09-18 22:50 ` David Miller
0 siblings, 1 reply; 12+ messages in thread
From: Roland Dreier @ 2007-09-18 22:46 UTC (permalink / raw)
To: David Miller; +Cc: netdev, general
> Further complicating things is that you need to setup a ppc32
> cross-build environment to even build test a conversion, and I'm not
> comfortable doing the surgery until I can test build the thing.
OK, I actually have a system with a ppc 440 SoC that uses this driver,
so I'll try to get things to the stage where I can boot net-2.6.24 on
it and see if I can get the driver working...
^ permalink raw reply [flat|nested] 12+ messages in thread
* [ofa-general] Re: [PATCH net-2.6.24] Fix refcounting problem with netif_rx_reschedule()
2007-09-18 22:46 ` Roland Dreier
@ 2007-09-18 22:50 ` David Miller
0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2007-09-18 22:50 UTC (permalink / raw)
To: rdreier; +Cc: netdev, general
From: Roland Dreier <rdreier@cisco.com>
Date: Tue, 18 Sep 2007 15:46:36 -0700
> OK, I actually have a system with a ppc 440 SoC that uses this driver,
> so I'll try to get things to the stage where I can boot net-2.6.24 on
> it and see if I can get the driver working...
Thanks a lot Roland.
^ permalink raw reply [flat|nested] 12+ messages in thread