* NAPI Race?
@ 2003-10-08 3:07 Marko Rauhamaa
2003-10-08 11:34 ` P
2003-10-08 19:57 ` kuznet
0 siblings, 2 replies; 5+ messages in thread
From: Marko Rauhamaa @ 2003-10-08 3:07 UTC (permalink / raw)
To: linux-kernel; +Cc: Alexey Kuznetsov, Jamal Hadi Salim, Robert Olsson
It looks to me like net_rx_action() might suffer from a race, which in
turn might explain some weirdness in my driver test results.
Here's the essence of the function from net/core/dev.c:
net_rx_action()
{
        local_irq_disable();
        while (!list_empty(&queue->poll_list)) {
                local_irq_enable();
                /* do stuff */
                local_irq_disable();
        }
        local_irq_enable();
}
Say I receive a packet. net_rx_action() processes it in the while loop
and reenables interrupts. But just before net_rx_action() returns, I
receive another packet, and __netif_rx_schedule() gets called from the
driver. Then the soft irq is raised from within itself. If I'm not
interrupted for some other reason, the packet will get processed only at
the next jiffy when the soft irq is invoked again.
Am I mistaken?
As an aside, it looks also as though the design might technically allow
the network driver to starve the CPU (the very situation NAPI was
designed to protect against). If I receive a new packet always right
after returning from net_rx_action(), the interrupt will cause the soft
irq to be executed immediately. It's true that this scenario would
require a very accurately calibrated packet stream, but in my business
that just might take place.
Marko
--
Marko Rauhamaa mailto:marko@pacujo.net http://pacujo.net/marko/
* Re: NAPI Race?
From: P @ 2003-10-08 11:34 UTC (permalink / raw)
To: Marko Rauhamaa
Cc: linux-kernel, Alexey Kuznetsov, Jamal Hadi Salim, Robert Olsson
Marko Rauhamaa wrote:
> It looks to me like net_rx_action() might suffer from a race, which in
> turn might explain some weirdness in my driver test results.
>
> Here's the essence of the function from net/core/dev.c:
>
> net_rx_action()
> {
>         local_irq_disable();
>         while (!list_empty(&queue->poll_list)) {
>                 local_irq_enable();
>                 /* do stuff */
>                 local_irq_disable();
>         }
>         local_irq_enable();
> }
>
> Say I receive a packet. net_rx_action() processes it in the while loop
> and reenables interrupts. But just before net_rx_action() returns, I
> receive another packet, and __netif_rx_schedule() gets called from the
> driver. Then the soft irq is raised from within itself. If I'm not
> interrupted for some other reason, the packet will get processed only at
> the next jiffy when the soft irq is invoked again.
>
> Am I mistaken?
Probably not, as I tested the reception timing
accuracy against an independent hardware "packet
timestamper", and out of 2 million packets,
3 were delayed by up to 5ms on the Linux box
(e100 NAPI). There were about 10 packets delayed
between 1ms and 5ms.
Pádraig.
* Re: NAPI Race?
From: kuznet @ 2003-10-08 19:57 UTC (permalink / raw)
To: Marko Rauhamaa; +Cc: linux-kernel, Jamal Hadi Salim, Robert Olsson
Hello!
> interrupted for some other reason, the packet will get processed only at
> the next jiffy when the soft irq is invoked again.
>
> Am I mistaken?
Yes, you are wrong. It is processed as soon as possible.
> As an aside, it looks also as though the design might technically allow
> the network driver to starve the CPU (the very situation NAPI was
> designed to protect against).
Nope. NAPI is not expected to cure starvation caused by softirqs.
Alexey
* Re: NAPI Race?
From: Marko Rauhamaa @ 2003-10-08 21:17 UTC (permalink / raw)
To: kuznet; +Cc: linux-kernel, Jamal Hadi Salim, Robert Olsson
kuznet@ms2.inr.ac.ru:
> > interrupted for some other reason, the packet will get processed only at
> > the next jiffy when the soft irq is invoked again.
> >
> > Am I mistaken?
>
> Yes, you are wrong. It is processed as soon as possible.
If I receive a packet at the tail end of net_rx_action(), we will
schedule the softirq again. But do_softirq() explicitly refuses to run
the same softirq right away. The softirq will be invoked at the next
interrupt, timer tick, system call (?) or when ksoftirqd is scheduled.
It may happen that none of these events occur for milliseconds.
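The dispatch behaviour described above can be sketched in user space. This
is a toy model, not kernel code: struct rx_model and every model_* function
are hypothetical names invented for illustration, and the late packet is
simulated by calling the interrupt stand-in from inside the handler. It
assumes a dispatcher that runs each pending softirq at most once per pass.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model. None of these names are kernel APIs. */
struct rx_model {
        bool pending;       /* stands in for the NET_RX softirq pending bit */
        int  backlog;       /* packets queued for polling */
        int  late_arrivals; /* packets that land just as the handler exits */
        int  handler_runs;  /* how many times the handler body has run */
};

/* Stands in for the driver's interrupt path calling __netif_rx_schedule(). */
static void model_rx_interrupt(struct rx_model *m)
{
        m->backlog++;
        m->pending = true;
}

/*
 * Stands in for net_rx_action(): drain the queue, then let one late
 * packet arrive in the window just before the handler returns.
 */
static void model_net_rx_action(struct rx_model *m)
{
        assert(m->late_arrivals >= 0);
        m->handler_runs++;
        m->backlog = 0;
        if (m->late_arrivals > 0) {
                m->late_arrivals--;
                model_rx_interrupt(m); /* softirq raised from within itself */
        }
}

/* Assumed dispatcher: each pending softirq runs at most once per pass. */
static void model_do_softirq_once(struct rx_model *m)
{
        if (!m->pending)
                return;
        m->pending = false;
        model_net_rx_action(m);
        /* Anything raised during the handler stays pending until the
         * next event that triggers softirq processing. */
}
```

With late_arrivals set to 1, one call to model_rx_interrupt() followed by
model_do_softirq_once() leaves pending set and backlog at 1: the late
packet waits for whatever next triggers softirq processing.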
> > As an aside, it looks also as though the design might technically
> > allow the network driver to starve the CPU (the very situation NAPI
> > was designed to protect against).
>
> Nope. NAPI is not expected to cure starvation caused by softirqs.
Well, it almost does. You can blast a NAPI driver with a packet flood,
and the system is happy and responsive -- no interrupts are generated,
and packets are polled by ksoftirqd. However, you can find a packet rate
that will cause the CPU to spend virtually all of its time in NAPI.
Marko
--
Marko Rauhamaa mailto:marko@pacujo.net http://pacujo.net/marko/
* Re: NAPI Race?
From: David S. Miller @ 2003-10-08 21:31 UTC (permalink / raw)
To: Marko Rauhamaa; +Cc: kuznet, linux-kernel, hadi, Robert.Olsson
On 08 Oct 2003 14:17:31 -0700
Marko Rauhamaa <marko@pacujo.net> wrote:
> But do_softirq() explicitly refuses to run
> the same softirq right away.
Check current 2.6.x sources, it does loop a certain number of times
even for the same softirq type.
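A bounded restart loop of the kind described here might look like the
following user-space sketch. It is not the kernel's do_softirq(): the
struct and the model_* names are hypothetical, and the limit
MODEL_MAX_RESTART (10) is an assumed value chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_RESTART 10 /* assumed bound, for illustration only */

/* Toy user-space model. None of these names are kernel APIs. */
struct rx_model {
        bool pending;       /* stands in for the NET_RX softirq pending bit */
        int  backlog;       /* packets queued for polling */
        int  late_arrivals; /* packets that land just as the handler exits */
        int  handler_runs;  /* how many times the handler body has run */
};

/* Stands in for the driver's interrupt path scheduling a poll. */
static void model_rx_interrupt(struct rx_model *m)
{
        m->backlog++;
        m->pending = true;
}

/* Stands in for net_rx_action(); one late packet may arrive per run. */
static void model_net_rx_action(struct rx_model *m)
{
        assert(m->late_arrivals >= 0);
        m->handler_runs++;
        m->backlog = 0;
        if (m->late_arrivals > 0) {
                m->late_arrivals--;
                model_rx_interrupt(m);
        }
}

/*
 * Re-run the handler while it keeps re-raising itself, but give up
 * after MODEL_MAX_RESTART passes. Returns true if work is still
 * pending and must be picked up later.
 */
static bool model_do_softirq_bounded(struct rx_model *m)
{
        int restarts = MODEL_MAX_RESTART;

        while (m->pending && restarts-- > 0) {
                m->pending = false;
                model_net_rx_action(m);
        }
        return m->pending;
}
```

With three late arrivals the handler runs four times and nothing is left
pending. With a sustained stream the loop gives up after
MODEL_MAX_RESTART passes and reports leftover work, which has to be
picked up later (for example by ksoftirqd, mentioned earlier in the
thread).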
> Well, it almost does. You can blast a NAPI driver with a packet flood,
> and the system is happy and responsive -- no interrupts are generated,
> and packets are polled by ksoftirqd. However, you can find a packet rate
> that will cause the CPU to spend virtually all of its time in NAPI.
This situation can be created with non-NAPI drivers too.
Alexey is trying to explain to you what the true cause of the
problem is, and it's not NAPI, it's softirq starvation.