David Miller wrote:
> From: "Ilpo Järvinen"
> Date: Fri, 31 Oct 2008 11:40:16 +0200 (EET)
>
>> Let me remind that it is just a single process, so no ping-pong & other
>> lock related cache effects should play any significant role here, no? (I'm
>> no expert though :-)).
>
> Not locks or ping-pongs perhaps, I guess. So it just sends and
> receives over a socket, implementing both ends of the communication
> in the same process?
>
> If hash chain conflicts do happen for those 2 sockets, just traversing
> the chain 2 entries deep could show up.

tbench is very sensitive to cache line ping-pongs (on SMP machines, of course).

Just to prove my point, I coded the following patch and tried it on an HP BL460c G1.
This machine has two quad-core CPUs (Intel(R) Xeon(R) CPU E5450 @3.00GHz).

tbench 8 went from 2240 MB/s to 2310 MB/s after this patch was applied.

[PATCH] net: Introduce netif_set_last_rx() helper

On SMP machines, the loopback device (and possibly other net devices) should
try to avoid dirtying the memory cache line containing the "last_rx" field.

Got a 3% increase on tbench on an 8-CPU machine.

Signed-off-by: Eric Dumazet
---
 drivers/net/loopback.c    |    2 +-
 include/linux/netdevice.h |   16 ++++++++++++++++
 2 files changed, 17 insertions(+), 1 deletion(-)
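
For readers who want the idea in isolation, here is a minimal userspace sketch of the "check before store" pattern the changelog describes. It is not the patch itself: struct fake_net_device, set_last_rx() and the explicit `now` argument are hypothetical stand-ins for the kernel's struct net_device, the helper named in the subject line, and the kernel's tick counter. The assumption is that when the stored value already matches, skipping the store keeps the cache line from being dirtied and bounced between CPUs.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's struct net_device.
 * Only the field discussed in the changelog is modeled.
 */
struct fake_net_device {
	unsigned long last_rx;	/* time of last receive, in ticks */
};

/*
 * Update dev->last_rx only when the value actually changes.
 * A plain load leaves the cache line shared between CPUs; an
 * unconditional store would take it exclusive every time, even
 * when the stored value is identical.
 *
 * Returns true if a store was performed (for illustration only).
 */
static inline bool set_last_rx(struct fake_net_device *dev, unsigned long now)
{
	if (dev->last_rx == now)
		return false;	/* already up to date: no write, line stays clean */

	dev->last_rx = now;
	return true;
}
```

With many packets arriving within the same tick, most calls take the read-only path, which is where the saving would come from under this assumption.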