From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stephen Hemminger
Subject: Re: [RFC] changing value of NETDEV_ALIGN to cacheline size
Date: Mon, 15 May 2006 14:58:28 -0700
Message-ID: <20060515145828.662bba1f@localhost.localdomain>
References: <200605151408.29688.borntrae@de.ibm.com> <20060515080258.bdcfba5f.rdunlap@xenotime.net> <20060515.143011.36635008.davem@davemloft.net> <4468F50B.9000102@hp.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: "David S. Miller" , rdunlap@xenotime.net, borntrae@de.ibm.com, netdev@vger.kernel.org
Return-path:
Received: from smtp.osdl.org ([65.172.181.4]:29135 "EHLO smtp.osdl.org") by vger.kernel.org with ESMTP id S964955AbWEOV6l (ORCPT ); Mon, 15 May 2006 17:58:41 -0400
To: Rick Jones
In-Reply-To: <4468F50B.9000102@hp.com>
Sender: netdev-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

On Mon, 15 May 2006 14:39:23 -0700
Rick Jones wrote:

> David S. Miller wrote:
> > From: "Randy.Dunlap"
> > Date: Mon, 15 May 2006 08:02:58 -0700
> >
> >
> >>>-#define NETDEV_ALIGN 32
> >>>+#define NETDEV_ALIGN L1_CACHE_BYTES
> >>> #define NETDEV_ALIGN_CONST (NETDEV_ALIGN - 1)
> >>
> >>I don't know about the fixed value of 32, but if this patch is
> >>accepted, I'd prefer NETDEV_ALIGN_MASK instead of NETDEV_ALIGN_CONST.
> >
> >
> > The reason it's 32 is that old drivers depended on the
> > struct being at least 32-byte aligned because they would
> > embed structures DMA'd to/from the card in their private
> > area and just assumed that would be aligned enough for
> > the card's restrictions.
> >
> > So setting it to L1_CACHE_BYTES would be wrong, because if
> > that happens to be less than 32 it would violate said
> > assumption which we are catering to.
>
> How about:
>
> #define NETDEV_ALIGN_MIN 32
> #if L1_CACHE_BYTES > NETDEV_ALIGN_MIN
> # define NETDEV_ALIGN L1_CACHE_BYTES
> #else
> # define NETDEV_ALIGN NETDEV_ALIGN_MIN
> #endif
>

I can't see how adding more padding could help performance here.
Most drivers reference both parts (netdevice and netdev_priv) in most
code paths. So having the stuff on separate cache lines couldn't help
that much.

Alternatively, there might be a performance gain if the netdevice stats
structure was rearranged so the rx and tx stats were on different
cache lines.
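
To make the last suggestion concrete, here is a rough userspace sketch of
what "rx and tx stats on different cache lines" could look like. It is not
the kernel's net_device_stats layout: the struct names, the helper
functions, and the 64-byte CACHELINE_BYTES value are all assumptions made
for illustration (kernel code would use L1_CACHE_BYTES or similar).

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines. The kernel would use L1_CACHE_BYTES. */
#define CACHELINE_BYTES 64

/* Hypothetical counter groups, not the real net_device_stats fields. */
struct rx_counters {
	unsigned long packets;
	unsigned long bytes;
	unsigned long errors;
};

struct tx_counters {
	unsigned long packets;
	unsigned long bytes;
	unsigned long errors;
};

/*
 * Each group starts on its own cache-line boundary, so a CPU updating
 * rx counters does not dirty the line holding the tx counters.
 */
struct split_stats {
	alignas(CACHELINE_BYTES) struct rx_counters rx;
	alignas(CACHELINE_BYTES) struct tx_counters tx;
};

/* Compile-time check that tx really begins on a cache-line boundary. */
static_assert(offsetof(struct split_stats, tx) % CACHELINE_BYTES == 0,
	      "tx counters must start on a cache-line boundary");

/* Cache-line index of a byte offset, relative to the struct start. */
static inline size_t line_of(size_t offset)
{
	return offset / CACHELINE_BYTES;
}

/* Receive path: touches only the rx line. */
static inline void count_rx(struct split_stats *s, size_t len)
{
	s->rx.packets++;
	s->rx.bytes += len;
}

/* Transmit path: touches only the tx line. */
static inline void count_tx(struct split_stats *s, size_t len)
{
	s->tx.packets++;
	s->tx.bytes += len;
}
```

The cost is the padding after each group, so the struct grows to two full
cache lines. Whether that pays off would depend on rx and tx actually
running on different CPUs at the same time, which would need measuring.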