* Receive processing stops when dev->poll returns 1
@ 2010-08-05 14:20 Usha Srinivasan
2010-08-05 16:04 ` Stephen Hemminger
0 siblings, 1 reply; 8+ messages in thread
From: Usha Srinivasan @ 2010-08-05 14:20 UTC (permalink / raw)
To: netdev@vger.kernel.org
Hello,
I have run into an interesting and frustrating problem which I've not been able to resolve. I am hoping someone can help me.
I have a network driver which sets its dev->weight to 100 (like ipoib) and when it processes 100 received packets, following the rules, it decrements dev->quota and *budget and returns 1 without calling netif_rx_complete. When my driver does that, all processing of incoming packets for all interfaces comes to a halt.
How do I know this? Because, as soon as my driver returns 1 from dev->poll, I lose my putty session and eth0 stops working; eth0 counters show that it stops receiving packets, though it is still able to transmit. My own device stops receiving packets too. I have scoured the code for ipoib and other network devices and I see no difference in what my driver does. I have tried lowering the weight for ipoib and eth0, hoping to reproduce the problem with those devices, but no luck.
One guess is that net_rx_action spent more than 1 tick processing all the incoming packets for all the interfaces it polled; I verified that my driver by itself does not spend that much. When this happens, net_rx_action exits after marking NET_RX_SOFTIRQ as pending. So one would expect it to be called again later, but my guess is that this doesn't happen, which results in a stoppage of incoming packets. Is that possible and, if so, what is the fix?
static void net_rx_action(struct softirq_action *h)
{
    struct softnet_data *queue = &__get_cpu_var(softnet_data);
    unsigned long start_time = jiffies;
    int budget = netdev_budget;
    void *have;

    local_irq_disable();

    while (!list_empty(&queue->poll_list)) {
        struct net_device *dev;

        if (budget <= 0 || jiffies - start_time > 1)
            goto softnet_break;

        local_irq_enable();

        dev = list_entry(queue->poll_list.next,
                         struct net_device, poll_list);
        have = netpoll_poll_lock(dev);

        if (dev->quota <= 0 || dev->poll(dev, &budget)) {
            netpoll_poll_unlock(have);
            local_irq_disable();
            list_move_tail(&dev->poll_list, &queue->poll_list);
            if (dev->quota < 0)
                dev->quota += dev->weight;
            else
                dev->quota = dev->weight;
        } else {
            netpoll_poll_unlock(have);
            dev_put(dev);
            local_irq_disable();
        }
    }
out:
    local_irq_enable();
    return;

softnet_break:
    __get_cpu_var(netdev_rx_stat).time_squeeze++;
    __raise_softirq_irqoff(NET_RX_SOFTIRQ);
    goto out;
}

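For readers following the listing, here is a minimal user-space sketch of the requeue branch, assuming a simplified device model. The names `model_dev`, `model_old_poll` and `model_dispatch_one` are hypothetical and are not kernel symbols; the sketch only mirrors the quota and budget bookkeeping visible in the listing and says nothing about locking, interrupts, or the time check.

```c
#include <assert.h>

/* Hypothetical model of the fields the pre-2.6.24 loop touches; not struct net_device. */
struct model_dev {
    int weight;        /* per-device limit, e.g. 100 */
    int quota;         /* remaining allowance for this device */
    int pending;       /* packets waiting in the device */
    int on_poll_list;  /* 1 while the device is queued for polling */
};

/* Old-style poll: returns 1 if the allowance was used up, 0 after completing. */
static int model_old_poll(struct model_dev *dev, int *budget)
{
    int max = *budget < dev->quota ? *budget : dev->quota;
    int done = dev->pending < max ? dev->pending : max;

    dev->pending -= done;
    dev->quota -= done;
    *budget -= done;

    if (done < max) {
        dev->on_poll_list = 0;  /* stands in for netif_rx_complete() */
        return 0;
    }
    return 1;
}

/* Mirrors the if/else in the listing: a nonzero return keeps the device
 * queued and refills its quota; a zero return leaves it off the list. */
static void model_dispatch_one(struct model_dev *dev, int *budget)
{
    if (dev->quota <= 0 || model_old_poll(dev, budget)) {
        if (dev->quota < 0)
            dev->quota += dev->weight;
        else
            dev->quota = dev->weight;
    }
}
```

In this model, a device with weight 100 and 250 pending packets stays on the poll list for two dispatches and completes on the third, which is the behavior the listing suggests for a poll routine that returns 1.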
I have run into this problem on four systems running RHEL5, SLES10, or SLES11. The above describes what happens on RHEL5/SLES10. SLES11 is different: dev->poll has been replaced by a poll function registered with netif_napi_add, which returns done without any quota/budget manipulation; yet I run into the same behavior.
Any help appreciated! Thanks in advance!
Usha
___________________
Usha Srinivasan
Software Engineer
QLogic Corporation
780 5th Ave, Suite A
King of Prussia, PA 19406
(610) 233-4844
(610) 233-4777 (Fax)
(610) 233-4838 (Main Desk)
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Receive processing stops when dev->poll returns 1
2010-08-05 14:20 Receive processing stops when dev->poll returns 1 Usha Srinivasan
@ 2010-08-05 16:04 ` Stephen Hemminger
2010-08-05 16:11 ` Usha Srinivasan
0 siblings, 1 reply; 8+ messages in thread
From: Stephen Hemminger @ 2010-08-05 16:04 UTC (permalink / raw)
To: Usha Srinivasan; +Cc: netdev@vger.kernel.org
On Thu, 5 Aug 2010 09:20:03 -0500
Usha Srinivasan <usha.srinivasan@qlogic.com> wrote:
> Hello,
> I have run into an interesting and frustrating problem which I've not been able to resolve. I am hoping someone can help me.
>
> I have a network driver which sets its dev->weight to 100 (like ipoib) and when it processes 100 received packets, following the rules, it decrements dev->quota and *budget and returns 1 without calling netif_rx_complete. When my driver does that, all processing of incoming packets for all interfaces comes to a halt.
>
> How do I know this? Because, as soon as my driver returns 1 to dev->poll, I lose my putty session and eth0 stops working; eth0 counters show that it stops receiving packets, though it is able to transmit. My own device stops receiving packets. I have scoured the code for ipoib and other network devices and I see no difference in what my driver does. I have tried to lower weight for ipoib & eth0 hoping to reproduce with those device it but no luck.
You may be looking at old documentation on how NAPI works.
In NAPI <= 2.6.23, the driver changed dev->quota and budget
and returned 0 or 1.
For current kernels, the NAPI poll has changed.
Using your example,
dev->weight = 100
budget would be 100
if your network driver processes 100 packets, it should return 100
and call napi_complete().
* RE: Receive processing stops when dev->poll returns 1
2010-08-05 16:04 ` Stephen Hemminger
@ 2010-08-05 16:11 ` Usha Srinivasan
2010-08-05 16:16 ` Stephen Hemminger
2010-08-05 16:22 ` Stephen Hemminger
0 siblings, 2 replies; 8+ messages in thread
From: Usha Srinivasan @ 2010-08-05 16:11 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev@vger.kernel.org
Thanks for your response. What you said is exactly what my driver is doing:
For kernels <= 2.6.23:
Calls netif_rx_complete if done < budget; decrements quota and *budget by done; returns 0 if done < budget and 1 otherwise.
When 1 is returned, I encounter the problem I described.
For kernels > 2.6.23:
Calls napi_complete if done < budget; returns done.
When done == budget, I encounter the problem I described.
Any ideas?
* Re: Receive processing stops when dev->poll returns 1
2010-08-05 16:11 ` Usha Srinivasan
@ 2010-08-05 16:16 ` Stephen Hemminger
2010-08-05 16:22 ` Stephen Hemminger
1 sibling, 0 replies; 8+ messages in thread
From: Stephen Hemminger @ 2010-08-05 16:16 UTC (permalink / raw)
To: Usha Srinivasan; +Cc: netdev@vger.kernel.org
On Thu, 5 Aug 2010 11:11:51 -0500
Usha Srinivasan <usha.srinivasan@qlogic.com> wrote:
> Thanks for your response. What you said is exactly what my driver is doing:
>
>
> <= 2.6.23
> Calls netif_rx_complete if done < budget; decrements quota & *budget by done; returns 0 if done < budget and 1 otherwise.
>
> When 1 is returned, I encounter the problem I described)
>
> > 2.6.23
> Calls napi-complete if done < budget; returns done.
>
> When done==budget, I encounter the problem I described.
>
Your driver did not call napi_complete (and re-enable interrupts).
* Re: Receive processing stops when dev->poll returns 1
2010-08-05 16:11 ` Usha Srinivasan
2010-08-05 16:16 ` Stephen Hemminger
@ 2010-08-05 16:22 ` Stephen Hemminger
2010-08-05 16:36 ` Usha Srinivasan
1 sibling, 1 reply; 8+ messages in thread
From: Stephen Hemminger @ 2010-08-05 16:22 UTC (permalink / raw)
To: Usha Srinivasan; +Cc: netdev@vger.kernel.org
On Thu, 5 Aug 2010 11:11:51 -0500
Usha Srinivasan <usha.srinivasan@qlogic.com> wrote:
> Thanks for your response. What you said is exactly what my driver is doing:
>
>
> <= 2.6.23
> Calls netif_rx_complete if done < budget; decrements quota & *budget by done; returns 0 if done < budget and 1 otherwise.
>
> When 1 is returned, I encounter the problem I described)
>
> > 2.6.23
> Calls napi-complete if done < budget; returns done.
>
> When done==budget, I encounter the problem I described.
>
> Any ideas?
Ignore last mail...
If done == budget, the poll will be called again (after other drivers).
If the quantum is exhausted, polling gets deferred to the ksoftirqd
thread.
One possibility is that the driver is looking at the wrong parameter
for budget and is exceeding the requested value. Please post your code.
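The contract described here can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `model_napi`, `model_poll` and `model_rx_pass` are hypothetical names, a single pass over an array stands in for the kernel's poll list, and the assertion stands in for whatever check the dispatcher applies when a poll routine reports more work than it was offered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one NAPI context; not the kernel's struct napi_struct. */
struct model_napi {
    int weight;      /* per-poll limit, e.g. 100 */
    int pending;     /* packets waiting in the device */
    bool scheduled;  /* true while the context is waiting to be polled */
};

/* Model poll routine: handle up to budget packets and complete only
 * when fewer than budget packets were handled. */
static int model_poll(struct model_napi *n, int budget)
{
    int work = n->pending < budget ? n->pending : budget;

    n->pending -= work;
    if (work < budget)
        n->scheduled = false;  /* stands in for napi_complete() */
    return work;
}

/* One simplified dispatcher pass; returns the total work done.
 * A context reporting work == weight stays scheduled for a later pass;
 * reporting work > weight would violate the contract. */
static int model_rx_pass(struct model_napi *list, int count, int budget)
{
    int total = 0;

    for (int i = 0; i < count && budget > 0; i++) {
        int work;

        if (!list[i].scheduled)
            continue;
        work = model_poll(&list[i], list[i].weight);
        assert(work <= list[i].weight);
        budget -= work;
        total += work;
    }
    return total;
}
```

With two contexts of weight 100, one holding 250 packets and one holding 30, the first pass handles 130 packets and leaves only the busy context scheduled; it is polled again on later passes until it reports less than its weight.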
* RE: Receive processing stops when dev->poll returns 1
2010-08-05 16:22 ` Stephen Hemminger
@ 2010-08-05 16:36 ` Usha Srinivasan
2010-08-05 17:37 ` Stephen Hemminger
0 siblings, 1 reply; 8+ messages in thread
From: Usha Srinivasan @ 2010-08-05 16:36 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev@vger.kernel.org
I have compared the code in my driver to code in other drivers and they are quite similar. Here is my code:
int vnic_napi_poll(struct napi_struct *napi, int budget)
{
    done = 0;
poll_more:
    while (done < budget) {
        int max = (budget - done);
        t = min(<max-supported-by-driver>, max);
        n = get-completions(comp_list);
        for (i = 0; i < n; i++, done++)
            handle_completions(complist[i]);
        if (n != t)
            break;
    }
    if (done < budget) {
        netif_rx_complete(dev, napi);
        /* check again just to be sure */
        if (more-completions()) {
            if (netif_rx_reschedule(dev, napi))
                goto poll_more;
        }
    }
    return done;
}
***********************
BACKPORTED version:
***********************
int vnic_poll(struct net_device *dev, int *budget)
{
    int max = min(*budget, dev->quota);
    done = 0;
poll_more:
    while (max) {
        t = min(<max-supported-by-driver>, max);
        n = get-completions(comp_list);
        for (i = 0; i < n; i++, --max, done++)
            handle_completions(complist[i]);
        if (n != t)
            break;
    }
    if (max) {
        netif_rx_complete(dev);
        /* check again just to be sure */
        if (more-completions()) {
            if (netif_rx_reschedule(dev, napi))
                goto poll_more;
        }
        ret = 0;
    } else
        ret = 1;
    dev->quota -= done;
    *budget -= done;
    return ret;
}
***********************
* Re: Receive processing stops when dev->poll returns 1
2010-08-05 16:36 ` Usha Srinivasan
@ 2010-08-05 17:37 ` Stephen Hemminger
2010-08-05 18:11 ` Usha Srinivasan
0 siblings, 1 reply; 8+ messages in thread
From: Stephen Hemminger @ 2010-08-05 17:37 UTC (permalink / raw)
To: Usha Srinivasan; +Cc: netdev@vger.kernel.org
On Thu, 5 Aug 2010 11:36:26 -0500
Usha Srinivasan <usha.srinivasan@qlogic.com> wrote:
> int max = (budget - done);
> t = min(<max-supported-by-driver>, max);
> n = get-completions(comp_list);
You need to handle all completions pending in the poll; the code will
not call you back. So this min() is the problem.
* RE: Receive processing stops when dev->poll returns 1
2010-08-05 17:37 ` Stephen Hemminger
@ 2010-08-05 18:11 ` Usha Srinivasan
0 siblings, 0 replies; 8+ messages in thread
From: Usha Srinivasan @ 2010-08-05 18:11 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev@vger.kernel.org
Stephen,
The min is inside a while loop; it is used purely to limit the number of completions that are retrieved at a time. The outer while loop ensures that all the completions are handled until the budget is reached or no completions are left. Please look again at the code I sent you.
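The loop structure described above can be checked with a small user-space sketch. Everything here is an assumption made for illustration: `comp_queue`, `get_completions`, `poll_chunked` and the value of `CHUNK_MAX` are hypothetical stand-ins for the driver's completion queue and its per-call retrieval limit, and incrementing a counter stands in for handling each completion.

```c
#include <assert.h>

/* Hypothetical stand-in for the hardware completion queue. */
struct comp_queue {
    int pending;  /* completions waiting to be reaped */
};

/* Hypothetical per-call retrieval limit (the driver's maximum batch size). */
#define CHUNK_MAX 16

static int min_int(int a, int b)
{
    return a < b ? a : b;
}

/* Reap up to want completions; returns how many were actually available. */
static int get_completions(struct comp_queue *q, int want)
{
    int n = min_int(q->pending, want);

    q->pending -= n;
    return n;
}

/* Chunked poll body: fetch at most CHUNK_MAX completions per call and keep
 * looping until the budget is used up or a short read shows the queue is
 * empty. Returns the number handled, which never exceeds budget. */
static int poll_chunked(struct comp_queue *q, int budget)
{
    int done = 0;

    assert(budget > 0);
    while (done < budget) {
        int t = min_int(CHUNK_MAX, budget - done);
        int n = get_completions(q, t);

        done += n;   /* stands in for handling each completion */
        if (n != t)  /* short read: nothing left to reap */
            break;
    }
    return done;
}
```

Starting from 250 pending completions and a budget of 100, three successive calls handle 100, 100 and 50 completions, so in this model the min() only sets the batch size and does not leave work behind within a poll.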
Usha