netdev.vger.kernel.org archive mirror
* [PATCH 04/31] mm: tag reserve pages
@ 2009-10-01 14:05 Suresh Jayaraman
  2009-10-01 21:09 ` David Rientjes
  0 siblings, 1 reply; 4+ messages in thread
From: Suresh Jayaraman @ 2009-10-01 14:05 UTC
  To: Linus Torvalds, Andrew Morton, linux-kernel, linux-mm
  Cc: netdev, Neil Brown, Miklos Szeredi, Wouter Verhelst,
	Peter Zijlstra, trond.myklebust, Suresh Jayaraman

From: Peter Zijlstra <a.p.zijlstra@chello.nl> 

Tag pages allocated from the reserves with a non-zero page->reserve.
This allows us to distinguish and account reserve pages.

Since low-memory situations are transient, and unrelated to the actual
page (any page can be on the freelist when we run low), don't mark the
page in any permanent way - just pass along the information to the
allocatee.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
---
 include/linux/mm_types.h |    1 +
 mm/page_alloc.c          |    4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

Index: mmotm/include/linux/mm_types.h
===================================================================
--- mmotm.orig/include/linux/mm_types.h
+++ mmotm/include/linux/mm_types.h
@@ -77,6 +77,7 @@ struct page {
 	union {
 		pgoff_t index;		/* Our offset within mapping. */
 		void *freelist;		/* SLUB: freelist req. slab lock */
+		int reserve;		/* page_alloc: page is a reserve page */
 	};
 	struct list_head lru;		/* Pageout list, eg. active_list
 					 * protected by zone->lru_lock !
Index: mmotm/mm/page_alloc.c
===================================================================
--- mmotm.orig/mm/page_alloc.c
+++ mmotm/mm/page_alloc.c
@@ -1501,8 +1501,10 @@ zonelist_scan:
 try_this_zone:
 		page = buffered_rmqueue(preferred_zone, zone, order,
 						gfp_mask, migratetype);
-		if (page)
+		if (page) {
+			page->reserve = !!(alloc_flags & ALLOC_NO_WATERMARKS);
 			break;
+		}
 this_zone_full:
 		if (NUMA_BUILD)
 			zlc_mark_zone_full(zonelist, z);
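
Because the new field shares a union with index and freelist, the tag can only be read until the page's owner reuses that storage. The following is a minimal user-space sketch of that constraint, not kernel code: struct toy_page, struct toy_owner, toy_alloc() and toy_take_page() are hypothetical names invented for illustration, and the union is named u only to keep the sketch portable.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page: 'reserve' shares storage with 'index',
 * mirroring the union the patch extends. Not the kernel's definition. */
struct toy_page {
	union {
		unsigned long index;	/* long-term use, set later by the owner */
		int reserve;		/* transient: valid only right after allocation */
	} u;
};

/* Hypothetical per-caller bookkeeping that outlives the transient tag. */
struct toy_owner {
	bool got_reserve_page;
};

/* Hypothetical allocator stub: tags the page the way the patch does. */
static void toy_alloc(struct toy_page *page, bool no_watermarks)
{
	assert(page);
	page->u.reserve = no_watermarks ? 1 : 0;
}

/* Read the tag first, then reuse the union for its long-term purpose. */
static void toy_take_page(struct toy_owner *owner, struct toy_page *page,
			  unsigned long index)
{
	assert(owner && page);
	owner->got_reserve_page = (page->u.reserve != 0);
	page->u.index = index;	/* overwrites the storage that held 'reserve' */
}
```

In this model the allocatee has to copy the flag into its own state before touching index, which is one way to read "pass along the information to the allocatee".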


* Re: [PATCH 04/31] mm: tag reserve pages
  2009-10-01 14:05 [PATCH 04/31] mm: tag reserve pages Suresh Jayaraman
@ 2009-10-01 21:09 ` David Rientjes
  2009-10-02  4:43   ` Neil Brown
  0 siblings, 1 reply; 4+ messages in thread
From: David Rientjes @ 2009-10-01 21:09 UTC
  To: Suresh Jayaraman
  Cc: Linus Torvalds, Andrew Morton, linux-kernel, linux-mm, netdev,
	Neil Brown, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
	trond.myklebust

On Thu, 1 Oct 2009, Suresh Jayaraman wrote:

> Index: mmotm/mm/page_alloc.c
> ===================================================================
> --- mmotm.orig/mm/page_alloc.c
> +++ mmotm/mm/page_alloc.c
> @@ -1501,8 +1501,10 @@ zonelist_scan:
>  try_this_zone:
>  		page = buffered_rmqueue(preferred_zone, zone, order,
>  						gfp_mask, migratetype);
> -		if (page)
> +		if (page) {
> +			page->reserve = !!(alloc_flags & ALLOC_NO_WATERMARKS);
>  			break;
> +		}
>  this_zone_full:
>  		if (NUMA_BUILD)
>  			zlc_mark_zone_full(zonelist, z);

page->reserve won't necessarily indicate that access to reserves was
_necessary_ for the allocation to succeed, though.  This will mark any
page being allocated under PF_MEMALLOC as reserve when all zones may be
well above their min watermarks.
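
The concern can be illustrated with a small user-space sketch, assuming a toy zone with only a free-page count and a min watermark. The TOY_ flag value, struct toy_zone, toy_reserve_tag() and toy_above_min() are hypothetical and exist only for this illustration; the tag expression itself is the one from the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag value for this sketch only. */
enum {
	TOY_ALLOC_NO_WATERMARKS = 0x04,
};

/* Toy zone: just enough state to compare against a min watermark. */
struct toy_zone {
	long free_pages;
	long wmark_min;
};

/* The tag as computed in the patch: a function of alloc_flags alone. */
static int toy_reserve_tag(int alloc_flags)
{
	return !!(alloc_flags & TOY_ALLOC_NO_WATERMARKS);
}

/* Would an order-0 allocation have passed the min watermark anyway? */
static bool toy_above_min(const struct toy_zone *zone)
{
	assert(zone);
	return zone->free_pages > zone->wmark_min;
}
```

With a zone holding 1000 free pages and a min watermark of 10, toy_above_min() is true while toy_reserve_tag(TOY_ALLOC_NO_WATERMARKS) is still 1: the tag reflects the allocation context, not the state of the zone.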

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH 04/31] mm: tag reserve pages
  2009-10-01 21:09 ` David Rientjes
@ 2009-10-02  4:43   ` Neil Brown
  2009-10-02  9:50     ` David Rientjes
  0 siblings, 1 reply; 4+ messages in thread
From: Neil Brown @ 2009-10-02  4:43 UTC
  To: David Rientjes
  Cc: Suresh Jayaraman, Linus Torvalds, Andrew Morton, linux-kernel,
	linux-mm, netdev, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
	trond.myklebust

On Thursday October 1, rientjes@google.com wrote:
> On Thu, 1 Oct 2009, Suresh Jayaraman wrote:
> 
> > Index: mmotm/mm/page_alloc.c
> > ===================================================================
> > --- mmotm.orig/mm/page_alloc.c
> > +++ mmotm/mm/page_alloc.c
> > @@ -1501,8 +1501,10 @@ zonelist_scan:
> >  try_this_zone:
> >  		page = buffered_rmqueue(preferred_zone, zone, order,
> >  						gfp_mask, migratetype);
> > -		if (page)
> > +		if (page) {
> > +			page->reserve = !!(alloc_flags & ALLOC_NO_WATERMARKS);
> >  			break;
> > +		}
> >  this_zone_full:
> >  		if (NUMA_BUILD)
> >  			zlc_mark_zone_full(zonelist, z);
> 
> page->reserve won't necessarily indicate that access to reserves was
> _necessary_ for the allocation to succeed, though.  This will mark any
> page being allocated under PF_MEMALLOC as reserve when all zones may be
> well above their min watermarks.

Normally if zones are above their watermarks, page->reserve will not
be set.
This is because __alloc_pages_nodemask (which seems to be the main
non-inline entrypoint) first calls get_page_from_freelist with
alloc_flags set to ALLOC_WMARK_LOW|ALLOC_CPUSET.
Only if this fails does __alloc_pages_nodemask call
__alloc_pages_slowpath which potentially sets ALLOC_NO_WATERMARKS in
alloc_flags.

So page->reserve being set actually tells us:
  PF_MEMALLOC or GFP_MEMALLOC were used, and
  a WMARK_LOW allocation attempt failed very recently

which is close enough to "the emergency reserves were used" I think.

Thanks,
NeilBrown
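
The two-stage flow described above can be modelled roughly as follows. This is a user-space sketch under simplifying assumptions, not the kernel's allocator: one toy zone, order-0 pages only, and none of the intermediate slow-path steps (retrying against the min watermark, reclaim, and so on). All TOY_/toy_ names and the flag values are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values for this sketch only. */
enum {
	TOY_ALLOC_WMARK_LOW     = 0x01,
	TOY_ALLOC_NO_WATERMARKS = 0x04,
	TOY_ALLOC_CPUSET        = 0x40,
};

struct toy_zone { long free_pages; long wmark_low; };
struct toy_page { int reserve; };

/* Stand-in for get_page_from_freelist(): honours the low watermark
 * unless TOY_ALLOC_NO_WATERMARKS is set, then tags the page. */
static bool toy_get_page(struct toy_zone *zone, int alloc_flags,
			 struct toy_page *page)
{
	assert(zone && page);
	if (zone->free_pages <= 0)
		return false;
	if (!(alloc_flags & TOY_ALLOC_NO_WATERMARKS) &&
	    zone->free_pages <= zone->wmark_low)
		return false;
	zone->free_pages--;
	page->reserve = !!(alloc_flags & TOY_ALLOC_NO_WATERMARKS);
	return true;
}

/* Fast path first; only a failed fast path can lead to a tagged page. */
static bool toy_alloc_page(struct toy_zone *zone, bool may_use_reserves,
			   struct toy_page *page)
{
	if (toy_get_page(zone, TOY_ALLOC_WMARK_LOW | TOY_ALLOC_CPUSET, page))
		return true;
	if (may_use_reserves)	/* models PF_MEMALLOC-style context */
		return toy_get_page(zone, TOY_ALLOC_NO_WATERMARKS, page);
	return false;
}
```

In this model a zone with 100 free pages and a low watermark of 10 always yields an untagged page, even for a caller that may use reserves; the tag appears only once the fast path has failed.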


* Re: [PATCH 04/31] mm: tag reserve pages
  2009-10-02  4:43   ` Neil Brown
@ 2009-10-02  9:50     ` David Rientjes
  0 siblings, 0 replies; 4+ messages in thread
From: David Rientjes @ 2009-10-02  9:50 UTC
  To: Neil Brown
  Cc: Suresh Jayaraman, Linus Torvalds, Andrew Morton, linux-kernel,
	linux-mm, netdev, Miklos Szeredi, Wouter Verhelst, Peter Zijlstra,
	trond.myklebust

On Fri, 2 Oct 2009, Neil Brown wrote:

> Normally if zones are above their watermarks, page->reserve will not
> be set.
> This is because __alloc_pages_nodemask (which seems to be the main
> non-inline entrypoint) first calls get_page_from_freelist with
> alloc_flags set to ALLOC_WMARK_LOW|ALLOC_CPUSET.
> Only if this fails does __alloc_pages_nodemask call
> __alloc_pages_slowpath which potentially sets ALLOC_NO_WATERMARKS in
> alloc_flags.
> 
> So page->reserve being set actually tells us:
>   PF_MEMALLOC or GFP_MEMALLOC were used, and
>   a WMARK_LOW allocation attempt failed very recently
> 
> which is close enough to "the emergency reserves were used" I think.
> 

There are a couple of corner cases for GFP_ATOMIC, though:

 - it isn't restricted by cpuset, so ALLOC_CPUSET will never get set for 
   the slowpath allocs and may very well allow the allocation to succeed 
   in zones far above their min watermark.

 - it allows for allocating beyond the min watermark in allowed zones
   simply by setting ALLOC_HARDER; these types of "reserve" allocations
   wouldn't be marked as page->reserve with your patches if
   ALLOC_NO_WATERMARKS wasn't set because of the allocation context.

Whether the second case fits your definition of reserve is debatable, but
if it doesn't, there's an inconsistency: the allocation may succeed in
"no watermark" context (for example, in hard irq context) even though that
privilege wasn't necessary for it to succeed; perhaps ALLOC_HARDER alone
would have been enough.
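
The second corner case can be sketched in user space as below. This is an illustration under stated assumptions rather than the kernel's watermark logic: the quarter-of-min discount for TOY_ALLOC_HARDER is an assumed stand-in for "allocating beyond the min watermark", and every TOY_/toy_ name and flag value is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values for this sketch only. */
enum {
	TOY_ALLOC_NO_WATERMARKS = 0x04,
	TOY_ALLOC_HARDER        = 0x10,
};

struct toy_zone { long free_pages; long wmark_min; };
struct toy_page { int reserve; };

/* Min-watermark check; TOY_ALLOC_HARDER lowers the bar (assumed: by 1/4). */
static bool toy_watermark_ok(const struct toy_zone *zone, int alloc_flags)
{
	long min = zone->wmark_min;

	if (alloc_flags & TOY_ALLOC_HARDER)
		min -= min / 4;
	return zone->free_pages > min;
}

/* Allocate one page and tag it the way the patch does. */
static bool toy_get_page(struct toy_zone *zone, int alloc_flags,
			 struct toy_page *page)
{
	assert(zone && page);
	if (zone->free_pages <= 0)
		return false;
	if (!(alloc_flags & TOY_ALLOC_NO_WATERMARKS) &&
	    !toy_watermark_ok(zone, alloc_flags))
		return false;
	zone->free_pages--;
	page->reserve = !!(alloc_flags & TOY_ALLOC_NO_WATERMARKS);
	return true;
}
```

With 90 free pages and a min watermark of 100, a TOY_ALLOC_HARDER request succeeds below the min watermark yet leaves reserve at 0, while the same request made with TOY_ALLOC_NO_WATERMARKS is tagged even though TOY_ALLOC_HARDER would have sufficed.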

