From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Miller
Subject: Re: [PATCH net-next-2.6] net: __alloc_skb() speedup
Date: Wed, 05 May 2010 14:52:40 -0700 (PDT)
Message-ID: <20100505.145240.28817983.davem@davemloft.net>
References: <1273047734.2367.3.camel@edumazet-laptop>
	<20100505.012647.260083711.davem@davemloft.net>
	<1273060809.2367.67.camel@edumazet-laptop>
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: netdev@vger.kernel.org, hadi@cyberus.ca, therbert@google.com
To: eric.dumazet@gmail.com
Return-path:
Received: from 74-93-104-97-Washington.hfc.comcastbusiness.net
	([74.93.104.97]:41195 "EHLO sunset.davemloft.net"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750869Ab0EEVwd (ORCPT ); Wed, 5 May 2010 17:52:33 -0400
In-Reply-To: <1273060809.2367.67.camel@edumazet-laptop>
Sender: netdev-owner@vger.kernel.org
List-ID:

From: Eric Dumazet
Date: Wed, 05 May 2010 14:00:09 +0200

> Sorry, I was thinking about the shinfo part :
>
> memset(shinfo, 0, offsetof(struct skb_shared_info, dataref));
>
> offsetof(struct skb_shared_info, dataref) is small enough and we dont
> dirty a full cache line, so maybe I can keep prefetchw(data + size) ?

You do dirty a full line on sparc64, the prefetch invalidate goes
an L1 cache line at a time, so 32 bytes.  And this memset() is 40
bytes.

The call to the memset symbol is still generated by gcc for this
case.  I think the cutoff for doing it inline is something like 16
bytes on sparc64, four 64-bit loads and stores.

Unlike x86 these risc chips don't have string-op instructions, and
for sparc64 and powerpc the instructions are fixed in size (4 bytes)
so the inline cost is "(memset_size / word_size) * 4".  Whereas
on x86 the inlining cost is more-or-less fixed.

> If not, in which cases can we use prefetchw() in kernel, if some arches
> dont handle it well ?

It has to be looked at on a case-by-case basis.  There is no simple
answer here.
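
To make the arithmetic above concrete, here is a rough userspace
sketch, not kernel code.  The helper names inline_cost_bytes() and
lines_touched() are made up for illustration, and the 8-byte word
size (one 64-bit store per word) is an assumption; the 32-byte line,
the 40-byte memset and the 4-byte instruction size are the figures
quoted in this mail.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed instruction size on sparc64/powerpc, as stated in the mail. */
#define INSN_BYTES 4

/*
 * Code size of a fully inlined memset, following the formula
 * "(memset_size / word_size) * 4": one fixed-size store per word.
 * word_size is an assumption supplied by the caller (8 for 64-bit stores).
 */
static size_t inline_cost_bytes(size_t memset_size, size_t word_size)
{
	assert(word_size > 0);
	return (memset_size / word_size) * INSN_BYTES;
}

/*
 * Number of cache lines touched by a write of len bytes starting at
 * byte offset 'offset', for a cache with line_bytes-sized lines.
 */
static size_t lines_touched(size_t offset, size_t len, size_t line_bytes)
{
	size_t first, last;

	assert(line_bytes > 0);
	if (len == 0)
		return 0;

	first = offset / line_bytes;
	last = (offset + len - 1) / line_bytes;
	return last - first + 1;
}
```

With these assumptions, inline_cost_bytes(40, 8) gives 20 bytes of
store instructions, and lines_touched(0, 40, 32) gives 2: a 40-byte
memset is larger than one 32-byte line, so it touches at least two
lines whatever its alignment (offset 0 here is only an example).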