Re: [PATCH 12/20] mm: Extended batches for generic mmu_gather

From: Andrew Morton <akpm@linux-foundation.org>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	Avi Kivity <avi@redhat.com>, Thomas Gleixner <tglx@linutronix.de>,
	Rik van Riel <riel@redhat.com>, Ingo Molnar <mingo@elte.hu>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-mm@kvack.org,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	David Miller <davem@davemloft.net>,
	Hugh Dickins <hugh.dickins@tiscali.co.uk>,
	Mel Gorman <mel@csn.ul.ie>, Nick Piggin <npiggin@kernel.dk>,
	Paul McKenney <paulmck@linux.vnet.ibm.com>,
	Yanmin Zhang <yanmin_zhang@linux.intel.com>,
	Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH 12/20] mm: Extended batches for generic mmu_gather
Date: Tue, 19 Apr 2011 13:06:33 -0700
Message-ID: <20110419130633.3d8cd5ae.akpm@linux-foundation.org>
In-Reply-To: <20110401121725.892956392@chello.nl>

On Fri, 01 Apr 2011 14:13:10 +0200
Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> Instead of using a single batch (the small on-stack, or an allocated
> page), try and extend the batch every time it runs out and only flush
> once either the extend fails or we're done.

why?

>
> ...
>
> @@ -86,22 +86,48 @@ struct mmu_gather {
>  #ifdef CONFIG_HAVE_RCU_TABLE_FREE
>  	struct mmu_table_batch	*batch;
>  #endif
> -	unsigned int		nr;	/* set to ~0U means fast mode */
> -	unsigned int		max;	/* nr < max */
> -	unsigned int		need_flush;/* Really unmapped some ptes? */
> -	unsigned int		fullmm; /* non-zero means full mm flush */
> -	struct page		**pages;
> -	struct page		*local[MMU_GATHER_BUNDLE];
> +	unsigned int		need_flush : 1,	/* Did free PTEs */
> +				fast_mode  : 1; /* No batching   */

mmu_gather.fast_mode gets modified in several places apparently without
locking to protect itself.  I don't think that these modifications will
accidentally trash need_flush, mainly by luck.

Please review the concurrency issues here and document them clearly.
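
To illustrate the concern: adjacent bit-fields share one storage unit, and C11
treats a run of adjacent non-zero-width bit-fields as a single memory location,
so a store to one is a read-modify-write of the word that also holds the other.
A minimal userspace sketch follows. The struct and helper names are
hypothetical stand-ins, not the kernel's definitions, and whether this matters
in practice depends on whether an mmu_gather is ever touched by more than one
context, which the patch does not say.

```c
#include <stddef.h>

/* Hypothetical stand-in for the two flag bits introduced by the patch. */
struct gather_flags_packed {
	unsigned int need_flush : 1,	/* shares a word with fast_mode */
		     fast_mode  : 1;
};

/* Alternative layout: each flag is its own memory location. */
struct gather_flags_split {
	unsigned int need_flush;
	unsigned int fast_mode;
};

/*
 * Writing one bit-field rewrites the shared word: the compiler loads it,
 * masks in the new bit and stores it back.  Single-threaded this keeps
 * need_flush intact; with a concurrent writer to need_flush it could not
 * be relied upon.
 */
static inline void set_fast_mode(struct gather_flags_packed *f, unsigned int on)
{
	f->fast_mode = on;
}
```

If the structure is only ever used by the task that owns it, a one-line
comment saying so would be enough.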

> +	unsigned int		fullmm;
> +
> +	struct mmu_gather_batch *active;
> +	struct mmu_gather_batch	local;
> +	struct page		*__pages[MMU_GATHER_BUNDLE];
>  };
>  
> -static inline void __tlb_alloc_page(struct mmu_gather *tlb)
> +/*
> + * For UP we don't need to worry about TLB flush
> + * and page free order so much..
> + */
> +#ifdef CONFIG_SMP
> +  #define tlb_fast_mode(tlb) (tlb->fast_mode)
> +#else
> +  #define tlb_fast_mode(tlb) 1
> +#endif

Mutter.

Could have been written in C.

Will cause a compile error with, for example, tlb_fast_mode(tlb + 1).
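
The macro expands tlb_fast_mode(tlb + 1) to (tlb + 1->fast_mode) because the
argument is not parenthesized. A sketch of the same helper as an inline
function, which avoids that and type-checks its argument, is below. The struct
here is a cut-down stand-in with only the fields needed, and CONFIG_SMP is
defined by hand purely so the example builds in userspace.

```c
#include <stdbool.h>

/* Cut-down stand-in for struct mmu_gather; only the flag bits are modelled. */
struct mmu_gather {
	unsigned int need_flush : 1,
		     fast_mode  : 1;
};

#define CONFIG_SMP 1	/* assumption: pretend this is an SMP build */

/*
 * For UP we don't need to worry about TLB flush and page free order,
 * so fast mode is unconditionally on there.
 */
static inline bool tlb_fast_mode(const struct mmu_gather *tlb)
{
#ifdef CONFIG_SMP
	return tlb->fast_mode;
#else
	return true;
#endif
}
```

With this form, tlb_fast_mode(tlb + 1) evaluates the pointer expression first
and then dereferences it, as a caller would expect.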

> +static inline int tlb_next_batch(struct mmu_gather *tlb)
>  {
> -	unsigned long addr = __get_free_pages(GFP_NOWAIT | __GFP_NOWARN, 0);
> +	struct mmu_gather_batch *batch;
>  
> -	if (addr) {
> -		tlb->pages = (void *)addr;
> -		tlb->max = PAGE_SIZE / sizeof(struct page *);
> +	batch = tlb->active;
> +	if (batch->next) {
> +		tlb->active = batch->next;
> +		return 1;
>  	}
> +
> +	batch = (void *)__get_free_pages(GFP_NOWAIT | __GFP_NOWARN, 0);

A comment explaining the gfp_t decision would be useful.

> +	if (!batch)
> +		return 0;
> +
> +	batch->next = NULL;
> +	batch->nr   = 0;
> +	batch->max  = MAX_GATHER_BATCH;
> +
> +	tlb->active->next = batch;
> +	tlb->active = batch;
> +
> +	return 1;
>  }
>  
>  /* tlb_gather_mmu
> @@ -114,16 +140,13 @@ tlb_gather_mmu(struct mmu_gather *tlb, s
>  {
>  	tlb->mm = mm;
>  
> -	tlb->max = ARRAY_SIZE(tlb->local);
> -	tlb->pages = tlb->local;
> -
> -	if (num_online_cpus() > 1) {
> -		tlb->nr = 0;
> -		__tlb_alloc_page(tlb);
> -	} else /* Use fast mode if only one CPU is online */
> -		tlb->nr = ~0U;
> -
> -	tlb->fullmm = fullmm;
> +	tlb->fullmm     = fullmm;
> +	tlb->need_flush = 0;
> +	tlb->fast_mode  = (num_possible_cpus() == 1);

The changelog didn't tell us why we switched from num_online_cpus() to
num_possible_cpus().

> +	tlb->local.next = NULL;
> +	tlb->local.nr   = 0;
> +	tlb->local.max  = ARRAY_SIZE(tlb->__pages);
> +	tlb->active     = &tlb->local;
>  
>  #ifdef CONFIG_HAVE_RCU_TABLE_FREE
>  	tlb->batch = NULL;
>
> ...
>
> @@ -177,15 +205,24 @@ tlb_finish_mmu(struct mmu_gather *tlb, u
>   */
>  static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
>  {
> +	struct mmu_gather_batch *batch;
> +
>  	tlb->need_flush = 1;
> +
>  	if (tlb_fast_mode(tlb)) {
>  		free_page_and_swap_cache(page);
>  		return 1; /* avoid calling tlb_flush_mmu() */
>  	}
> -	tlb->pages[tlb->nr++] = page;
> -	VM_BUG_ON(tlb->nr > tlb->max);
>  
> -	return tlb->max - tlb->nr;
> +	batch = tlb->active;
> +	batch->pages[batch->nr++] = page;
> +	VM_BUG_ON(batch->nr > batch->max);
> +	if (batch->nr == batch->max) {
> +		if (!tlb_next_batch(tlb))
> +			return 0;
> +	}

Moving the VM_BUG_ON() down to after the if() would save a few cycles.

> +	return batch->max - batch->nr;
>  }
>  
>  /* tlb_remove_page
> 

