All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@suse.de>
To: Christoph Lameter <cl@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>,
	Linux-Netdev <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	David Miller <davem@davemloft.net>, Neil Brown <neilb@suse.de>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Pekka Enberg <penberg@cs.helsinki.fi>
Subject: Re: [PATCH 02/15] mm: sl[au]b: Add knowledge of PFMEMALLOC reserve pages
Date: Wed, 8 Feb 2012 21:23:23 +0000	[thread overview]
Message-ID: <20120208212323.GM5938@suse.de> (raw)
In-Reply-To: <alpine.DEB.2.00.1202081338210.32060@router.home>

On Wed, Feb 08, 2012 at 01:49:05PM -0600, Christoph Lameter wrote:
> On Wed, 8 Feb 2012, Mel Gorman wrote:
> 
> > Ok, I looked into what is necessary to replace these with checking a page
> > flag and the cost shifts quite a bit and ends up being more expensive.
> 
> That is only true if you go the slab route.

Well, yes but both slab and slub have to be supported. I see no reason
why I would choose to make this a slab-only or slub-only feature. Slob is
not supported because it's not expected that a platform using slob is also
going to use network-based swap.

> Slab suffers from not having
> the page struct pointer readily available. The changes are likely already
> impacting slab performance without the virt_to_page patch.
> 

The performance impact only comes into play when swap is on a network
device and pfmemalloc reserves are in use. The rest of the time the check
on ac avoids all the cost and there is a micro-optimisation later to avoid
calling a function (patch 12).

> > In slub, it's sufficient to check kmem_cache_cpu to know whether the
> > objects in the list are pfmemalloc or not.
> 
> We try to minimize the size of kmem_cache_cpu. The page pointer is readily
> available. We just removed the node field from kmem_cache_cpu because it
> was less expensive to get the node number from the struct page field.
> 
> The same is certainly true for a PFMEMALLOC flag.
> 

Ok, are you asking that I use the page flag for slub and leave kmem_cache_cpu
alone in the slub case? I can certainly check it out if that's what you
are asking for.

> > Yeah, you're right on the button there. I did my checking assuming that
> > PG_active+PG_slab were safe to use. The following is an untested patch that
> > I probably got details wrong in but it illustrates where virt_to_page()
> > starts cropping up.
> 
> Yes you need to come up with a way to not use virt_to_page otherwise slab
> performance is significantly impacted.

I did come up with a way: the necessary information is in ac and slabp
on slab :/ . There are not exactly many ways that the information can
be recorded.

> On NUMA we are already doing a page struct lookup on free in slab.
> If you would save the page struct pointer
> there and reuse it then you would not have an issue at least on free.
> 

That information is only available on NUMA and only when there is more than
one node. Having cache_free_alien return the page for passing to ac_put_obj()
would also be ugly. The biggest downfall by far is that single-node machines
incur the cost of virt_to_page() where they did not have to before. This
is not a solution and it is not better than the current simply check on
a struct field.

> You still would need to determine which "struct slab" pointer is in use
> which will also require similar lookups in varous places.
> 
> Transfer of the pfmemalloc flags (guess you must have a pfmemalloc
> field in struct slab then) in slab is best be done when allocating and
> freeing a slab page from the page allocator.
> 

The page->pfmemalloc is already been transferred to the slab in
cache_grow.

> I think its rather trivial to add the support you want in a non intrusive
> way to slub. Slab would require some more thought and discussion.
> 

I'm slightly confused by this sentence. Support for slub is already in the
patch and as you say, it's fairly straight-forward. Supporting a page flag
and leaving kmem_cache_cpu alone may also be easier as kmem_cache_cpu->page
can be used instead of a kmem_cache_cpu->pfmemalloc field.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Mel Gorman <mgorman@suse.de>
To: Christoph Lameter <cl@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>,
	Linux-Netdev <netdev@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	David Miller <davem@davemloft.net>, Neil Brown <neilb@suse.de>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Pekka Enberg <penberg@cs.helsinki.fi>
Subject: Re: [PATCH 02/15] mm: sl[au]b: Add knowledge of PFMEMALLOC reserve pages
Date: Wed, 8 Feb 2012 21:23:23 +0000	[thread overview]
Message-ID: <20120208212323.GM5938@suse.de> (raw)
In-Reply-To: <alpine.DEB.2.00.1202081338210.32060@router.home>

On Wed, Feb 08, 2012 at 01:49:05PM -0600, Christoph Lameter wrote:
> On Wed, 8 Feb 2012, Mel Gorman wrote:
> 
> > Ok, I looked into what is necessary to replace these with checking a page
> > flag and the cost shifts quite a bit and ends up being more expensive.
> 
> That is only true if you go the slab route.

Well, yes but both slab and slub have to be supported. I see no reason
why I would choose to make this a slab-only or slub-only feature. Slob is
not supported because it's not expected that a platform using slob is also
going to use network-based swap.

> Slab suffers from not having
> the page struct pointer readily available. The changes are likely already
> impacting slab performance without the virt_to_page patch.
> 

The performance impact only comes into play when swap is on a network
device and pfmemalloc reserves are in use. The rest of the time the check
on ac avoids all the cost and there is a micro-optimisation later to avoid
calling a function (patch 12).

> > In slub, it's sufficient to check kmem_cache_cpu to know whether the
> > objects in the list are pfmemalloc or not.
> 
> We try to minimize the size of kmem_cache_cpu. The page pointer is readily
> available. We just removed the node field from kmem_cache_cpu because it
> was less expensive to get the node number from the struct page field.
> 
> The same is certainly true for a PFMEMALLOC flag.
> 

Ok, are you asking that I use the page flag for slub and leave kmem_cache_cpu
alone in the slub case? I can certainly check it out if that's what you
are asking for.

> > Yeah, you're right on the button there. I did my checking assuming that
> > PG_active+PG_slab were safe to use. The following is an untested patch that
> > I probably got details wrong in but it illustrates where virt_to_page()
> > starts cropping up.
> 
> Yes you need to come up with a way to not use virt_to_page otherwise slab
> performance is significantly impacted.

I did come up with a way: the necessary information is in ac and slabp
on slab :/ . There are not exactly many ways that the information can
be recorded.

> On NUMA we are already doing a page struct lookup on free in slab.
> If you would save the page struct pointer
> there and reuse it then you would not have an issue at least on free.
> 

That information is only available on NUMA and only when there is more than
one node. Having cache_free_alien return the page for passing to ac_put_obj()
would also be ugly. The biggest downfall by far is that single-node machines
incur the cost of virt_to_page() where they did not have to before. This
is not a solution and it is not better than the current simply check on
a struct field.

> You still would need to determine which "struct slab" pointer is in use
> which will also require similar lookups in varous places.
> 
> Transfer of the pfmemalloc flags (guess you must have a pfmemalloc
> field in struct slab then) in slab is best be done when allocating and
> freeing a slab page from the page allocator.
> 

The page->pfmemalloc is already been transferred to the slab in
cache_grow.

> I think its rather trivial to add the support you want in a non intrusive
> way to slub. Slab would require some more thought and discussion.
> 

I'm slightly confused by this sentence. Support for slub is already in the
patch and as you say, it's fairly straight-forward. Supporting a page flag
and leaving kmem_cache_cpu alone may also be easier as kmem_cache_cpu->page
can be used instead of a kmem_cache_cpu->pfmemalloc field.

-- 
Mel Gorman
SUSE Labs

  reply	other threads:[~2012-02-08 21:23 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-06 22:56 [PATCH 00/15] Swap-over-NBD without deadlocking V8 Mel Gorman
2012-02-06 22:56 ` Mel Gorman
2012-02-06 22:56 ` [PATCH 01/15] mm: Serialize access to min_free_kbytes Mel Gorman
2012-02-06 22:56   ` Mel Gorman
2012-02-08 18:47   ` Rik van Riel
2012-02-08 18:47     ` Rik van Riel
2012-02-06 22:56 ` [PATCH 02/15] mm: sl[au]b: Add knowledge of PFMEMALLOC reserve pages Mel Gorman
2012-02-06 22:56   ` Mel Gorman
2012-02-07 16:27   ` Christoph Lameter
2012-02-07 16:27     ` Christoph Lameter
2012-02-08 14:45     ` Mel Gorman
2012-02-08 14:45       ` Mel Gorman
2012-02-08 15:14       ` Christoph Lameter
2012-02-08 15:14         ` Christoph Lameter
2012-02-08 16:34         ` Mel Gorman
2012-02-08 16:34           ` Mel Gorman
2012-02-08 19:49           ` Christoph Lameter
2012-02-08 19:49             ` Christoph Lameter
2012-02-08 21:23             ` Mel Gorman [this message]
2012-02-08 21:23               ` Mel Gorman
2012-02-08 22:13               ` Christoph Lameter
2012-02-08 22:13                 ` Christoph Lameter
2012-02-09 12:50                 ` Mel Gorman
2012-02-09 12:50                   ` Mel Gorman
2012-02-09 19:53                   ` Christoph Lameter
2012-02-09 19:53                     ` Christoph Lameter
2012-02-10 10:26                     ` Mel Gorman
2012-02-10 10:26                       ` Mel Gorman
2012-02-10 21:01                       ` Christoph Lameter
2012-02-10 21:01                         ` Christoph Lameter
2012-02-10 22:07                         ` Christoph Lameter
2012-02-10 22:07                           ` Christoph Lameter
2012-02-13 10:12                           ` Mel Gorman
2012-02-13 10:12                             ` Mel Gorman
2012-02-13 11:10                         ` Mel Gorman
2012-02-13 11:10                           ` Mel Gorman
2012-02-06 22:56 ` [PATCH 03/15] mm: Introduce __GFP_MEMALLOC to allow access to emergency reserves Mel Gorman
2012-02-06 22:56   ` Mel Gorman
2012-02-06 22:56 ` [PATCH 04/15] mm: allow PF_MEMALLOC from softirq context Mel Gorman
2012-02-06 22:56   ` Mel Gorman
2012-02-06 22:56 ` [PATCH 05/15] mm: Ignore mempolicies when using ALLOC_NO_WATERMARK Mel Gorman
2012-02-06 22:56   ` Mel Gorman
2012-02-06 22:56 ` [PATCH 06/15] net: Introduce sk_allocation() to allow addition of GFP flags depending on the individual socket Mel Gorman
2012-02-06 22:56   ` Mel Gorman
2012-02-06 22:56 ` [PATCH 07/15] netvm: Allow the use of __GFP_MEMALLOC by specific sockets Mel Gorman
2012-02-06 22:56   ` Mel Gorman
2012-02-06 22:56 ` [PATCH 08/15] netvm: Allow skb allocation to use PFMEMALLOC reserves Mel Gorman
2012-02-06 22:56   ` Mel Gorman
2012-02-06 22:56 ` [PATCH 09/15] netvm: Propagate page->pfmemalloc to skb Mel Gorman
2012-02-06 22:56   ` Mel Gorman
2012-02-06 22:56 ` [PATCH 10/15] netvm: Propagate page->pfmemalloc from netdev_alloc_page " Mel Gorman
2012-02-06 22:56   ` Mel Gorman
2012-02-07 23:38   ` Alexander Duyck
2012-02-07 23:38     ` Alexander Duyck
2012-02-08 15:23     ` Mel Gorman
2012-02-08 15:23       ` Mel Gorman
2012-02-06 22:56 ` [PATCH 11/15] netvm: Set PF_MEMALLOC as appropriate during SKB processing Mel Gorman
2012-02-06 22:56   ` Mel Gorman
2012-02-06 22:56 ` [PATCH 12/15] mm: Micro-optimise slab to avoid a function call Mel Gorman
2012-02-06 22:56   ` Mel Gorman
2012-02-06 22:56 ` [PATCH 13/15] nbd: Set SOCK_MEMALLOC for access to PFMEMALLOC reserves Mel Gorman
2012-02-06 22:56   ` Mel Gorman
2012-02-06 22:56 ` [PATCH 14/15] mm: Throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage Mel Gorman
2012-02-06 22:56   ` Mel Gorman
2012-02-06 22:56 ` [PATCH 15/15] mm: Account for the number of times direct reclaimers get throttled Mel Gorman
2012-02-06 22:56   ` Mel Gorman
2012-02-07 12:45 ` [PATCH 00/15] Swap-over-NBD without deadlocking V8 Hillf Danton
2012-02-07 12:45   ` Hillf Danton
2012-02-07 13:27   ` Mel Gorman
2012-02-07 13:27     ` Mel Gorman
2012-02-07 13:27     ` Mel Gorman
2012-02-08 12:51     ` Hillf Danton
2012-02-08 12:51       ` Hillf Danton
2012-02-08 15:26       ` Mel Gorman
2012-02-08 15:26         ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120208212323.GM5938@suse.de \
    --to=mgorman@suse.de \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=neilb@suse.de \
    --cc=netdev@vger.kernel.org \
    --cc=penberg@cs.helsinki.fi \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.