From mboxrd@z Thu Jan 1 00:00:00 1970 From: Eric Dumazet Subject: Re: [PATCH] net: make sure struct dst_entry refcount is aligned on 64 bytes Date: Fri, 14 Nov 2008 12:43:06 +0100 Message-ID: <491D644A.4040309@cosmosbay.com> References: <491D323B.9030802@cosmosbay.com> <20081114.005437.09284570.davem@davemloft.net> <491D3F18.5030505@cosmosbay.com> <20081114093613.GA2834@x200.localdomain> <491D5725.50006@cosmosbay.com> <20081114113558.GA9692@x200.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: David Miller , netdev@vger.kernel.org, shemminger@vyatta.com, "Zhang, Yanmin" To: Alexey Dobriyan Return-path: Received: from gw1.cosmosbay.com ([86.65.150.130]:33570 "EHLO gw1.cosmosbay.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751132AbYKNLnR convert rfc822-to-8bit (ORCPT ); Fri, 14 Nov 2008 06:43:17 -0500 In-Reply-To: <20081114113558.GA9692@x200.localdomain> Sender: netdev-owner@vger.kernel.org List-ID: Alexey Dobriyan a =E9crit : > On Fri, Nov 14, 2008 at 11:47:01AM +0100, Eric Dumazet wrote: >> Alexey Dobriyan a =E9crit : >>> On Fri, Nov 14, 2008 at 10:04:24AM +0100, Eric Dumazet wrote: >>>> David Miller a =E9crit : >>>>> From: Eric Dumazet >>>>> Date: Fri, 14 Nov 2008 09:09:31 +0100 >>>>> >>>>>> During tbench/oprofile sessions, I found that dst_release() was = in third position. >>>>> ... >>>>>> Instead of first checking the refcount value, then decrement it, >>>>>> we use atomic_dec_return() to help CPU to make the right memory = transaction >>>>>> (ie getting the cache line in exclusive mode) >>>>> ... >>>>>> Signed-off-by: Eric Dumazet >>>>> This looks great, applied, thanks Eric. >>>>> >>>> Thanks David >>>> >>>> >>>> I think I understood some regressions here on 32bits=20 >>>> >>>> offsetof(struct dst_entry, __refcnt) is 0x7c again !!! >>>> >>>> This is really really bad for performance >>>> >>>> I believe this comes from a patch from Alexey Dobriyan >>>> (commit def8b4faff5ca349beafbbfeb2c51f3602a6ef3a >>>> net: reduce structures when XFRM=3Dn) >>> Ick. >> Well, your patch is a good thing, we only need to make adjustments. >> >>>> This kills effort from Zhang Yanmin (and me...) >>>> >>>> (commit f1dd9c379cac7d5a76259e7dffcd5f8edc697d17 >>>> [NET]: Fix tbench regression in 2.6.25-rc1) >>>> >>>> >>>> Really we must find something so that this damned __refcnt is star= ting at 0x80 >>> Make it last member? >> Yes, it will help tbench, but not machines that stress IP route cach= e >> >> (dst_use() must dirty the three fields "refcnt, __use , lastuse" ) >> >> Also, 'next' pointer should be in the same cache line, to speedup ro= ute >> cache lookups. >=20 > Knowledge taken. >=20 >> Next problem is that offsets depend on architecture being 32 or 64 b= its. >> >> On 64bit, offsetof(struct dst_entry, __refcnt) is 0xb0 : not very go= od... >=20 > I think all these constraints can be satisfied with clever rearrangin= g of dst_entry. > Let me come up with alternative patch which still reduces dst slab si= ze. You cannot reduce size, and it doesnt matter, since we use dst_entry in= side rtable and rtable is using SLAB_HWCACHE_ALIGN kmem_cachep : we have many bytes= available. After patch on 32 bits sizeof(struct rtable)=3D244 (12 bytes left) Same for other containers.