From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753903AbYJWPRg (ORCPT ); Thu, 23 Oct 2008 11:17:36 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751221AbYJWPR0 (ORCPT ); Thu, 23 Oct 2008 11:17:26 -0400
Received: from smtp2e.orange.fr ([80.12.242.113]:40016 "EHLO smtp2e.orange.fr" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751144AbYJWPRZ convert rfc822-to-8bit (ORCPT ); Thu, 23 Oct 2008 11:17:25 -0400
X-ME-UUID: 20081023151722309.4BA057000112@mwinf2e18.orange.fr
Message-ID: <49009575.60004@cosmosbay.com>
Date: Thu, 23 Oct 2008 17:17:09 +0200
From: Eric Dumazet
User-Agent: Thunderbird 2.0.0.17 (Windows/20080914)
MIME-Version: 1.0
To: Christoph Lameter
Cc: Pekka Enberg, Miklos Szeredi, nickpiggin@yahoo.com.au, hugh@veritas.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org
Subject: Re: SLUB defrag pull request?
References: <1223883004.31587.15.camel@penberg-laptop> <84144f020810221348j536f0d84vca039ff32676e2cc@mail.gmail.com> <1224745831.25814.21.camel@penberg-laptop> <84144f020810230658o7c6b3651k2d671aab09aa71fb@mail.gmail.com> <84144f020810230714g7f5d36bas812ad691140ee453@mail.gmail.com>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Christoph Lameter wrote:
> On Thu, 23 Oct 2008, Pekka Enberg wrote:
>
>>> The problem looks like it's freeing objects on a different processor than
>>> where it was used last. With the pointer array it is only necessary
>>> to touch the objects that contain the arrays.
>>
>> Interesting. SLAB gets away with this because of per-cpu caches or
>> because it uses the bufctls instead of a freelist?
>
> Exactly. Slab adds a special management structure to each slab page that
> contains the freelist and other stuff. Freeing first occurs to a per cpu
> queue that contains an array of pointers. Then later the objects are
> moved from the pointer array into the management structure for the slab.
>
> What we could do for SLUB is to generate a linked list of pointer arrays
> in the free objects of a slab page. If all objects are allocated then no
> pointer array is needed. The first object freed would become the first
> pointer array. If that is found to be exhausted then the object
> currently being freed becomes the next pointer array and we put a
> link to the old one into the object as well.
>

This idea is very nice, especially considering that many objects are freed
by RCU, and their rcu_head (which is hot at kfree() time) might be far away
from the linked list anchor actually used in SLUB.

At alloc time, I remember I added a prefetchw() call in SLAB in
__cache_alloc(); this could explain some differences between SLUB and SLAB
too, since SLAB gives a hint to the processor to warm its cache.
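
For readers following the thread, here is a minimal userspace sketch of the
chained pointer-array idea Christoph describes, with a write-prefetch hint
on the alloc path in the spirit of the prefetchw() call mentioned above.
It is not SLUB or SLAB code. The names OBJ_SIZE, SLOTS, struct ptr_array,
struct slab_page, slab_free_obj(), slab_alloc_obj() and
prefetch_for_write() are all hypothetical, the 64-byte object size is an
assumption, and locking and per-cpu handling are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object size for this sketch; real caches vary. */
#define OBJ_SIZE 64
/* Pointer slots that fit in one object after the link and the count. */
#define SLOTS ((OBJ_SIZE - 2 * sizeof(void *)) / sizeof(void *))

/* Write-prefetch hint; a no-op where the GCC/Clang builtin is missing. */
#if defined(__GNUC__)
#define prefetch_for_write(p) __builtin_prefetch((p), 1)
#else
#define prefetch_for_write(p) ((void)(p))
#endif

/* A free object reused as an array of pointers to other free objects. */
struct ptr_array {
    struct ptr_array *prev;   /* older pointer array, already full */
    size_t count;             /* slots currently in use */
    void *slot[SLOTS];
};

/* Per-slab-page state: NULL head means no free object is recorded. */
struct slab_page {
    struct ptr_array *head;   /* newest pointer array */
};

void slab_free_obj(struct slab_page *page, void *obj)
{
    struct ptr_array *pa;

    assert(page != NULL && obj != NULL);
    pa = page->head;

    if (pa == NULL || pa->count == SLOTS) {
        /* No array yet, or the current one is exhausted: the object
         * being freed becomes the next array and links to the old one. */
        struct ptr_array *next = obj;

        next->prev = pa;
        next->count = 0;
        page->head = next;
        return;
    }

    /* Common case: only the array object is written, obj is not touched. */
    pa->slot[pa->count++] = obj;
}

void *slab_alloc_obj(struct slab_page *page)
{
    struct ptr_array *pa = page->head;
    void *obj;

    if (pa == NULL)
        return NULL;          /* no free object recorded */

    if (pa->count > 0) {
        obj = pa->slot[--pa->count];
    } else {
        /* Array is empty: unlink it and hand out the array object itself. */
        page->head = pa->prev;
        obj = pa;
    }

    /* Hint the CPU to pull the object's cache line in for writing. */
    prefetch_for_write(obj);
    return obj;
}
```

In this sketch a free writes to the freed object only when that object
starts a new array; every other free touches just the current array
object, which is the cache behaviour the quoted proposal is after.
Objects come back in reverse order of freeing.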