From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Hansen
Subject: RE: e1000 performance hack for ppc64 (Power4)
Date: 13 Jun 2003 09:21:03 -0700
Sender: netdev-bounce@oss.sgi.com
Message-ID: <1055521263.3531.2055.camel@nighthawk>
References:
Mime-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Cc: "Feldman, Scott" , David Gibson , Linux Kernel Mailing List , Anton Blanchard , Nancy J Milliner , Ricardo C Gonzalez , Brian Twichell , netdev@oss.sgi.com
Return-path:
To: Herman Dierks
In-Reply-To:
Errors-to: netdev-bounce@oss.sgi.com
List-Id: netdev.vger.kernel.org

Too long to quote:
http://marc.theaimsgroup.com/?t=105538879600001&r=1&w=2

Wouldn't you get most of the benefit of copying that stuff around in
the driver if you allocated the skb->data aligned in the first place?

There's already code to align them on CPU cache boundaries:

#define SKB_DATA_ALIGN(X)       (((X) + (SMP_CACHE_BYTES - 1)) & \
                                 ~(SMP_CACHE_BYTES - 1))

So, do something like this:

#ifdef ARCH_ALIGN_SKB_BYTES
#define SKB_ALIGN_BYTES ARCH_ALIGN_SKB_BYTES
#else
#define SKB_ALIGN_BYTES SMP_CACHE_BYTES
#endif
#define SKB_DATA_ALIGN(X)       (((X) + (SKB_ALIGN_BYTES - 1)) & \
                                 ~(SKB_ALIGN_BYTES - 1))

You could easily make this adaptive, so that it does not align to the
arch size when the request is bigger than that, just like in the e1000
patch you posted.

-- 
Dave Hansen
haveblue@us.ibm.com
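
For illustration only, here is a rough userspace sketch of what the adaptive variant might look like. It is untested against any real tree. The values 32 and 128 are placeholders rather than real Power4 numbers, ALIGN_UP and skb_data_align() are hypothetical names, and falling back to cache-line alignment for larger requests is just one reading of "adaptive":

```c
#include <stddef.h>

/* Assumption: placeholder value; in the kernel this comes from the arch headers. */
#ifndef SMP_CACHE_BYTES
#define SMP_CACHE_BYTES 32
#endif

/* Assumption: hypothetical per-arch override, value chosen only for the example. */
#ifndef ARCH_ALIGN_SKB_BYTES
#define ARCH_ALIGN_SKB_BYTES 128
#endif

#ifdef ARCH_ALIGN_SKB_BYTES
#define SKB_ALIGN_BYTES ARCH_ALIGN_SKB_BYTES
#else
#define SKB_ALIGN_BYTES SMP_CACHE_BYTES
#endif

/* Round x up to the next multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a) (((size_t)(x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/*
 * Hypothetical adaptive helper: requests no larger than the arch
 * alignment are rounded up to it; larger requests only get the
 * usual cache-line rounding.
 */
static inline size_t skb_data_align(size_t len)
{
    size_t align = (len > (size_t)SKB_ALIGN_BYTES)
                       ? (size_t)SMP_CACHE_BYTES
                       : (size_t)SKB_ALIGN_BYTES;

    return ALIGN_UP(len, align);
}
```

With those placeholder values, a 100-byte request rounds up to 128, while a 200-byte request only rounds up to the next multiple of 32.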