David Miller wrote:
> From: Eric Dumazet
> Date: Thu, 21 Feb 2008 19:51:52 +0100
>
>> Following patch directly calls netif_receive_skb() and avoids lot of
>> atomic operations.
>> (atomic_inc(&dev->refcnt), test_and_set_bit(NAPI_STATE_SCHED, &n->state), ...
>> atomic_dec(&dev->refcnt)...), cache line ping-pongs on device refcnt,
>> but also softirq overhead.
>>
>> This gives a nice boost on tbench for example (5 % on my machine)
>
> My only concern is stack usage.
>
> Note that packet reception can elicit a response and go all the way
> back into this driver and all the way down into netif_receive_skb()
> again. And so on and so forth.
>
> If there is some bug in the stack (ACK'ing ACKs, stuff like that) we
> could get into a loop and overrun the kernel stack in no time at all.
>
> So, if anything, this change could make inconvenient errors become
> catastrophic and hard to diagnose.

You are absolutely right. We should guard against recursion, using a new
field in "pcpu_lstats" (cheap to access, since it sits in a hot cache
line that we have to touch anyway to update the stats).

Thank you

[PATCH] loopback: calls netif_receive_skb() instead of netif_rx()

The loopback transmit function loopback_xmit() currently calls netif_rx()
to queue an skb on the softnet queue, and arms a softirq so that this skb
can be handled later.

This has a cost on SMP, because we need to hold a reference on the
device, and release this reference when the softirq dequeues the packet.

The following patch calls netif_receive_skb() directly and avoids a lot
of atomic operations
(atomic_inc(&dev->refcnt), test_and_set_bit(NAPI_STATE_SCHED, &n->state), ...
atomic_dec(&dev->refcnt)...), cache line ping-pongs on the device refcnt,
and also the softirq overhead.

This gives a nice boost on tbench, for example (5 % on my machine).

We want to limit recursion, in case the network stack wants to re-enter
loopback_xmit(). We use a per-cpu depth field so that, once the limit is
reached, we avoid a stack overflow by queueing the packet (as before)
instead of trying to handle it directly.

Signed-off-by: Eric Dumazet
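
To illustrate the recursion guard described above, here is a minimal
userspace sketch, not the patch itself. All names in it (struct
pcpu_state, sim_xmit(), sim_receive(), MAX_DEPTH, QUEUE_CAP) are
hypothetical stand-ins: the real change keeps the depth in
"pcpu_lstats" and falls back to netif_rx(), and the limit of 3 is only
an assumption for the demonstration.

```c
#include <assert.h>

#define MAX_DEPTH 3    /* hypothetical nesting limit, not taken from the patch */
#define QUEUE_CAP 64   /* capacity of the simulated deferred queue */

/* Stand-in for the per-cpu data: one instance per CPU, so no locking here. */
struct pcpu_state {
	int depth;             /* current nesting level of sim_xmit() */
	int max_depth_seen;    /* deepest nesting reached, for inspection */
	int direct;            /* packets handled inline */
	int queued;            /* packets deferred to the queue */
	int queue[QUEUE_CAP];  /* ids of the deferred packets */
};

static void sim_xmit(struct pcpu_state *s, int pkt, int replies);

/*
 * Stand-in for netif_receive_skb(): handling a packet may make the stack
 * answer it, which re-enters the transmit path on the same CPU.
 */
static void sim_receive(struct pcpu_state *s, int pkt, int replies)
{
	if (replies > 0)
		sim_xmit(s, pkt + 1, replies - 1);
}

/* Stand-in for loopback_xmit() with the depth guard. */
static void sim_xmit(struct pcpu_state *s, int pkt, int replies)
{
	if (s->depth >= MAX_DEPTH) {
		/* Too deep: defer the packet (the netif_rx() path). */
		assert(s->queued < QUEUE_CAP);
		s->queue[s->queued++] = pkt;
		return;
	}

	s->depth++;
	if (s->depth > s->max_depth_seen)
		s->max_depth_seen = s->depth;
	s->direct++;
	sim_receive(s, pkt, replies);  /* direct delivery, may recurse */
	s->depth--;
}
```

Calling sim_xmit() on a zeroed struct pcpu_state with a large reply
count models the "ACK'ing ACKs" bug: the nesting stops at MAX_DEPTH and
the next packet is queued instead of growing the stack further.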