[PATCH] mm: rearrange exit_mmap() to unlock before arch_exit

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] mm: rearrange exit_mmap() to unlock before arch_exit_mmap
@ 2009-02-09 17:29 Lee Schermerhorn
  2009-02-10 21:31 ` Andrew Morton
  0 siblings, 1 reply; 3+ messages in thread
From: Lee Schermerhorn @ 2009-02-09 17:29 UTC (permalink / raw)
  To: linux-kernel, stable, Andrew Morton
  Cc: Jeremy Fitzhardinge, Keir Fraser, Christophe Saout,
	Alex Williamson, Nick Piggin

From: Jeremy Fitzhardinge <jeremy@goop.org>

Subject: mm: rearrange exit_mmap() to unlock before arch_exit_mmap

Applicable to 29-rc4 and 28-stable

Christophe Saout reported [in precursor to:
http://marc.info/?l=linux-kernel&m=123209902707347&w=4]:

> Note that I also some a different issue with CONFIG_UNEVICTABLE_LRU.
> Seems like Xen tears down current->mm early on process termination, so
> that __get_user_pages in exit_mmap causes nasty messages when the
> process had any mlocked pages.  (in fact, it somehow manages to get into
> the swapping code and produces a null pointer dereference trying to get
> a swap token)

Jeremy explained:

Yes.  In the normal case under Xen, an in-use pagetable is "pinned", 
meaning that it is RO to the kernel, and all updates must go via 
hypercall (or writes are trapped and emulated, which is much the same 
thing).  An unpinned pagetable is not currently in use by any process, 
and can be directly accessed as normal RW pages.

As an optimisation at process exit time, we unpin the pagetable as early 
as possible (switching the process to init_mm), so that all the normal 
pagetable teardown can happen with direct memory accesses.

This happens in exit_mmap() -> arch_exit_mmap().  The munlocking happens 
a few lines below.  The obvious thing to do would be to move 
arch_exit_mmap() to below the munlock code, but I think we'd want to 
call it even if mm->mmap is NULL, just to be on the safe side.

Thus, this patch:

exit_mmap() needs to unlock any locked vmas before calling
arch_exit_mmap, as the latter may switch the current mm to init_mm,
which would cause the former to fail.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by:  Lee Schermerhorn <lee.schermerhorn@hp.com>

---
 mm/mmap.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

===================================================================
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2078,12 +2078,8 @@
 	unsigned long end;

 	/* mm's last user has gone, and its about to be pulled down */
-	arch_exit_mmap(mm);
 	mmu_notifier_release(mm);

-	if (!mm->mmap)	/* Can happen if dup_mmap() received an OOM */
-		return;
-
 	if (mm->locked_vm) {
 		vma = mm->mmap;
 		while (vma) {
@@ -2092,7 +2088,13 @@
 			vma = vma->vm_next;
 		}
 	}
+
+	arch_exit_mmap(mm);
+
 	vma = mm->mmap;
+	if (!vma)	/* Can happen if dup_mmap() received an OOM */
+		return;
+
 	lru_add_drain();
 	flush_cache_mm(mm);
 	tlb = tlb_gather_mmu(mm, 1);

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] mm: rearrange exit_mmap() to unlock before arch_exit_mmap
  2009-02-09 17:29 [PATCH] mm: rearrange exit_mmap() to unlock before arch_exit_mmap Lee Schermerhorn
@ 2009-02-10 21:31 ` Andrew Morton
  2009-02-10 21:45   ` Lee Schermerhorn
  0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2009-02-10 21:31 UTC (permalink / raw)
  To: Lee Schermerhorn
  Cc: linux-kernel, stable, jeremy, keir.fraser, christophe,
	alex.williamson, npiggin

On Mon, 09 Feb 2009 12:29:48 -0500
Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote:

> From: Jeremy Fitzhardinge <jeremy@goop.org>
> 
> Subject: mm: rearrange exit_mmap() to unlock before arch_exit_mmap
> 
> Applicable to 29-rc4 and 28-stable
> 
> Christophe Saout reported [in precursor to:
> http://marc.info/?l=linux-kernel&m=123209902707347&w=4]:
> 
> > Note that I also some a different issue with CONFIG_UNEVICTABLE_LRU.
> > Seems like Xen tears down current->mm early on process termination, so
> > that __get_user_pages in exit_mmap causes nasty messages when the
> > process had any mlocked pages.  (in fact, it somehow manages to get into
> > the swapping code and produces a null pointer dereference trying to get
> > a swap token)
> 
> Jeremy explained:
> 
> Yes.  In the normal case under Xen, an in-use pagetable is "pinned", 
> meaning that it is RO to the kernel, and all updates must go via 
> hypercall (or writes are trapped and emulated, which is much the same 
> thing).  An unpinned pagetable is not currently in use by any process, 
> and can be directly accessed as normal RW pages.
> 
> As an optimisation at process exit time, we unpin the pagetable as early 
> as possible (switching the process to init_mm), so that all the normal 
> pagetable teardown can happen with direct memory accesses.
> 
> This happens in exit_mmap() -> arch_exit_mmap().  The munlocking happens 
> a few lines below.  The obvious thing to do would be to move 
> arch_exit_mmap() to below the munlock code, but I think we'd want to 
> call it even if mm->mmap is NULL, just to be on the safe side.
> 
> Thus, this patch:
> 
> exit_mmap() needs to unlock any locked vmas before calling
> arch_exit_mmap, as the latter may switch the current mm to init_mm,
> which would cause the former to fail.
> 
> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
> Acked-by:  Lee Schermerhorn <lee.schermerhorn@hp.com>
> 
> ---
>  mm/mmap.c |   10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> ===================================================================
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2078,12 +2078,8 @@
>  	unsigned long end;
>  
>  	/* mm's last user has gone, and its about to be pulled down */
> -	arch_exit_mmap(mm);
>  	mmu_notifier_release(mm);
>  
> -	if (!mm->mmap)	/* Can happen if dup_mmap() received an OOM */
> -		return;
> -
>  	if (mm->locked_vm) {
>  		vma = mm->mmap;
>  		while (vma) {
> @@ -2092,7 +2088,13 @@
>  			vma = vma->vm_next;
>  		}
>  	}
> +
> +	arch_exit_mmap(mm);
> +
>  	vma = mm->mmap;
> +	if (!vma)	/* Can happen if dup_mmap() received an OOM */
> +		return;
> +
>  	lru_add_drain();
>  	flush_cache_mm(mm);
>  	tlb = tlb_gather_mmu(mm, 1);

The patch as it stands doesn't apply cleanly to 2.6.28.  I didn't look
into what needs to be done to fix it up.  Presumably the stable beavers
would like a fixed-up and tested version for backporting sometime.



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] mm: rearrange exit_mmap() to unlock before arch_exit_mmap
  2009-02-10 21:31 ` Andrew Morton
@ 2009-02-10 21:45   ` Lee Schermerhorn
  0 siblings, 0 replies; 3+ messages in thread
From: Lee Schermerhorn @ 2009-02-10 21:45 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, stable, jeremy, keir.fraser, christophe,
	alex.williamson, npiggin

On Tue, 2009-02-10 at 13:31 -0800, Andrew Morton wrote:
> On Mon, 09 Feb 2009 12:29:48 -0500
> Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote:
> 
> > From: Jeremy Fitzhardinge <jeremy@goop.org>
> > 
> > Subject: mm: rearrange exit_mmap() to unlock before arch_exit_mmap
> > 
> > Applicable to 29-rc4 and 28-stable
> > 
> > Christophe Saout reported [in precursor to:
> > http://marc.info/?l=linux-kernel&m=123209902707347&w=4]:
> > 
> > > Note that I also some a different issue with CONFIG_UNEVICTABLE_LRU.
> > > Seems like Xen tears down current->mm early on process termination, so
> > > that __get_user_pages in exit_mmap causes nasty messages when the
> > > process had any mlocked pages.  (in fact, it somehow manages to get into
> > > the swapping code and produces a null pointer dereference trying to get
> > > a swap token)
> > 
> > Jeremy explained:
> > 
> > Yes.  In the normal case under Xen, an in-use pagetable is "pinned", 
> > meaning that it is RO to the kernel, and all updates must go via 
> > hypercall (or writes are trapped and emulated, which is much the same 
> > thing).  An unpinned pagetable is not currently in use by any process, 
> > and can be directly accessed as normal RW pages.
> > 
> > As an optimisation at process exit time, we unpin the pagetable as early 
> > as possible (switching the process to init_mm), so that all the normal 
> > pagetable teardown can happen with direct memory accesses.
> > 
> > This happens in exit_mmap() -> arch_exit_mmap().  The munlocking happens 
> > a few lines below.  The obvious thing to do would be to move 
> > arch_exit_mmap() to below the munlock code, but I think we'd want to 
> > call it even if mm->mmap is NULL, just to be on the safe side.
> > 
> > Thus, this patch:
> > 
> > exit_mmap() needs to unlock any locked vmas before calling
> > arch_exit_mmap, as the latter may switch the current mm to init_mm,
> > which would cause the former to fail.
> > 
> > Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
> > Acked-by:  Lee Schermerhorn <lee.schermerhorn@hp.com>
> > 
> > ---
> >  mm/mmap.c |   10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> > 
> > ===================================================================
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -2078,12 +2078,8 @@
> >  	unsigned long end;
> >  
> >  	/* mm's last user has gone, and its about to be pulled down */
> > -	arch_exit_mmap(mm);
> >  	mmu_notifier_release(mm);
> >  
> > -	if (!mm->mmap)	/* Can happen if dup_mmap() received an OOM */
> > -		return;
> > -
> >  	if (mm->locked_vm) {
> >  		vma = mm->mmap;
> >  		while (vma) {
> > @@ -2092,7 +2088,13 @@
> >  			vma = vma->vm_next;
> >  		}
> >  	}
> > +
> > +	arch_exit_mmap(mm);
> > +
> >  	vma = mm->mmap;
> > +	if (!vma)	/* Can happen if dup_mmap() received an OOM */
> > +		return;
> > +
> >  	lru_add_drain();
> >  	flush_cache_mm(mm);
> >  	tlb = tlb_gather_mmu(mm, 1);
> 
> The patch as it stands doesn't apply cleanly to 2.6.28.  I didn't look
> into what needs to be done to fix it up.  Presumably the stable beavers
> would like a fixed-up and tested version for backporting sometime.

The only difference I can see in this area, looking at a 28.2 tree I
have, the 28.4 patch on kernel.org and the context in the patch, is that
"flush_cache_mm(mm);" line.

Lee




^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2009-02-10 21:45 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-02-09 17:29 [PATCH] mm: rearrange exit_mmap() to unlock before arch_exit_mmap Lee Schermerhorn
2009-02-10 21:31 ` Andrew Morton
2009-02-10 21:45   ` Lee Schermerhorn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox