netdev.vger.kernel.org archive mirror
* NAPI poll behavior in various Intel drivers
@ 2008-01-04 11:40 David Miller
  2008-01-04 20:10 ` James Chapman
  2008-01-07  8:24 ` Jarek Poplawski
  0 siblings, 2 replies; 8+ messages in thread
From: David Miller @ 2008-01-04 11:40 UTC (permalink / raw)
  To: netdev; +Cc: auke-jan.h.kok


Several Intel networking drivers such as e1000, e1000e
and e100 all do this to exit NAPI polling:

	if ((!tx_cleaned && (work_done == 0)) ||
 	   !netif_running(poll_dev)) {

In the NAPI rework I tried to change this to:

	if ((!tx_cleaned && (work_done < budget)) ||
 	   !netif_running(poll_dev)) {

But that got reverted by:

	commit f7bbb9098315d712351aba7861a8c9fcf6bf0213

	e1000: Fix NAPI state bug when Rx complete
    
	Don't exit polling when we have not yet used our budget, this causes
	the NAPI system to end up with a messed up poll list.
    
	Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
	Signed-off-by: Jeff Garzik <jeff@garzik.org>

I definitely would not have signed off on that :-)

That "tx_cleaned" thing clouds the logic in all of these drivers'
poll routines.

The one necessary precondition is that when work_done < budget
we exit polling and return a value less than budget.

If the ->poll() returns a value less than budget, net_rx_action()
assumes that the device has been removed from the poll() list.

		/* Drivers must not modify the NAPI state if they
		 * consume the entire weight.  In such cases this code
		 * still "owns" the NAPI instance and therefore can
		 * move the instance around on the list at-will.
		 */
		if (unlikely(work == weight))
			list_move_tail(&n->poll_list, list);

The "work_done == 0" test in these drivers is thus wrong.  It
should be "work_done < budget", and the whole tx_cleaned thing needs to
be removed.
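
To make the proposed exit rule concrete, here is a rough user-space
sketch.  It is not driver code: struct mock_napi, mock_clean_rx() and
mock_poll() are hypothetical stand-ins, clearing on_poll_list stands in
for the driver's netif_rx_complete() call, and the !netif_running()
check is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a NAPI instance; not the kernel's struct. */
struct mock_napi {
	int  pending_rx;    /* packets waiting in the RX ring */
	bool on_poll_list;  /* true while the core may poll us again */
};

/* Handle up to budget packets and return how many were handled. */
static int mock_clean_rx(struct mock_napi *n, int budget)
{
	int done = n->pending_rx < budget ? n->pending_rx : budget;

	n->pending_rx -= done;
	return done;
}

/* Exit rule argued for above: leave polling whenever the budget
 * was not used up, and report the work done to the caller.
 */
static int mock_poll(struct mock_napi *n, int budget)
{
	int work_done;

	assert(budget > 0);
	work_done = mock_clean_rx(n, budget);
	if (work_done < budget)
		n->on_poll_list = false;  /* stand-in for netif_rx_complete() */
	return work_done;
}
```

With 3 packets pending and a budget of 16, mock_poll() returns 3 and
takes the instance off the list in the same call.  With 40 pending it
returns 16 and leaves the list state alone, which is the case the
net_rx_action() comment quoted above describes.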

It happens to work because net_rx_action() then loops again and
polls the same NAPI struct a second time.

As a result, E1000 devices get polled TWICE every time they
process at least one RX packet, but do not consume the whole
quota.
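
The double poll can be reproduced with a heavily reduced user-space
model.  This is only a sketch under stated assumptions: the names
(struct mock_napi, enum exit_rule, mock_poll(), mock_rx_action()) are
made up for illustration, tx_cleaned is assumed to be false, and the
loop ignores the global budget and time limit of the real
net_rx_action().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a NAPI instance; not the kernel's struct. */
struct mock_napi {
	int  pending_rx;    /* packets waiting in the RX ring */
	bool on_poll_list;  /* true while the core will poll us again */
	int  polls;         /* how many times ->poll() was invoked */
};

enum exit_rule {
	EXIT_WHEN_IDLE,     /* "work_done == 0", as in the drivers above */
	EXIT_BELOW_BUDGET   /* "work_done < budget", the proposed rule */
};

static int mock_poll(struct mock_napi *n, int budget, enum exit_rule rule)
{
	int work_done = n->pending_rx < budget ? n->pending_rx : budget;
	bool exit_polling;

	n->pending_rx -= work_done;
	n->polls++;

	if (rule == EXIT_WHEN_IDLE)
		exit_polling = (work_done == 0);
	else
		exit_polling = (work_done < budget);

	if (exit_polling)
		n->on_poll_list = false;  /* stand-in for netif_rx_complete() */
	return work_done;
}

/* Reduced model of the core loop: keep polling the instance for as
 * long as it stays on the poll list, and return the number of polls.
 */
static int mock_rx_action(struct mock_napi *n, int weight, enum exit_rule rule)
{
	assert(weight > 0);
	n->polls = 0;
	n->on_poll_list = true;
	while (n->on_poll_list)
		mock_poll(n, weight, rule);
	return n->polls;
}
```

With 3 packets pending and a weight of 16, mock_rx_action() returns 2
under EXIT_WHEN_IDLE (one poll that handles the packets, one more that
finds nothing and exits) and 1 under EXIT_BELOW_BUDGET.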

I smell a performance hack, and if so this is wrong and against
all of the principles of NAPI.  Either that or it's a workaround
for the "!netif_running()" case.

I noticed this while trying to work on a generic fix for the
"->poll() does not exit when device is brought down while being
bombed with packets" bug.


end of thread, other threads:[~2008-01-07  8:18 UTC | newest]

Thread overview: 8+ messages
2008-01-04 11:40 NAPI poll behavior in various Intel drivers David Miller
2008-01-04 20:10 ` James Chapman
2008-01-04 21:24   ` David Miller
2008-01-05  0:18     ` James Chapman
2008-01-05  7:25       ` David Miller
2008-01-05 13:29         ` Andi Kleen
2008-01-06  4:15           ` David Miller
2008-01-07  8:24 ` Jarek Poplawski
