[PATCH] mm: call pte_unmap() against a proper pte (Re: [PATCH 7/9] swap

public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed

* [PATCH] mm: call pte_unmap() against a proper pte (Re: [PATCH 7/9] swap_info: swap count continuations)
  2009-10-15  0:56 ` [PATCH 7/9] swap_info: swap count continuations Hugh Dickins
@ 2009-10-16  6:30   ` Daisuke Nishimura
  2009-10-16  8:01     ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 3+ messages in thread
From: Daisuke Nishimura @ 2009-10-16  6:30 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Nitin Gupta, KAMEZAWA Hiroyuki, hongshin,
	linux-kernel, linux-mm, Daisuke Nishimura

Hi.

> @@ -645,6 +648,7 @@ static int copy_pte_range(struct mm_stru
>  	spinlock_t *src_ptl, *dst_ptl;
>  	int progress = 0;
>  	int rss[2];
> +	swp_entry_t entry = (swp_entry_t){0};
>  
>  again:
>  	rss[1] = rss[0] = 0;
> @@ -671,7 +675,10 @@ again:
>  			progress++;
>  			continue;
>  		}
> -		copy_one_pte(dst_mm, src_mm, dst_pte, src_pte, vma, addr, rss);
> +		entry.val = copy_one_pte(dst_mm, src_mm, dst_pte, src_pte,
> +							vma, addr, rss);
> +		if (entry.val)
> +			break;
>  		progress += 8;
>  	} while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr != end);
>  
It isn't the fault of only this patch, but I think breaking the loop without incrementing
dst_pte(and src_pte) would be bad behavior because we do unmap_pte(dst_pte - 1) later.
(current copy_pte_range() already does it though... and this is only problematic
when we break the first loop, IIUC.)

> @@ -681,6 +688,12 @@ again:
>  	add_mm_rss(dst_mm, rss[0], rss[1]);
>  	pte_unmap_unlock(dst_pte - 1, dst_ptl);
>  	cond_resched();
> +
> +	if (entry.val) {
> +		if (add_swap_count_continuation(entry, GFP_KERNEL) < 0)
> +			return -ENOMEM;
> +		progress = 0;
> +	}
>  	if (addr != end)
>  		goto again;
>  	return 0;

I've searched other places where we break a similar loop and do pte_unmap(pte - 1).
Current copy_pte_range() and apply_to_pte_range() has the same problem.

How about a patch like this ?
===
From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>

There are some places where we do like:

	pte = pte_map();
	do {
		(do break in some conditions)
	} while (pte++, ...);
	pte_unmap(pte - 1);

But if the loop breaks at the first loop, pte_unmap() unmaps invalid pte.

This patch is a fix for this problem.

Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
---
 mm/memory.c |   11 +++++++----
 1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 72a2494..492de38 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -641,6 +641,7 @@ static int copy_pte_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		pmd_t *dst_pmd, pmd_t *src_pmd, struct vm_area_struct *vma,
 		unsigned long addr, unsigned long end)
 {
+	pte_t *orig_src_pte, *orig_dst_pte;
 	pte_t *src_pte, *dst_pte;
 	spinlock_t *src_ptl, *dst_ptl;
 	int progress = 0;
@@ -654,6 +655,8 @@ again:
 	src_pte = pte_offset_map_nested(src_pmd, addr);
 	src_ptl = pte_lockptr(src_mm, src_pmd);
 	spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
+	orig_src_pte = src_pte;
+	orig_dst_pte = dst_pte;
 	arch_enter_lazy_mmu_mode();
 
 	do {
@@ -677,9 +680,9 @@ again:
 
 	arch_leave_lazy_mmu_mode();
 	spin_unlock(src_ptl);
-	pte_unmap_nested(src_pte - 1);
+	pte_unmap_nested(orig_src_pte);
 	add_mm_rss(dst_mm, rss[0], rss[1]);
-	pte_unmap_unlock(dst_pte - 1, dst_ptl);
+	pte_unmap_unlock(orig_dst_pte, dst_ptl);
 	cond_resched();
 	if (addr != end)
 		goto again;
@@ -1822,10 +1825,10 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
 	token = pmd_pgtable(*pmd);
 
 	do {
-		err = fn(pte, token, addr, data);
+		err = fn(pte++, token, addr, data);
 		if (err)
 			break;
-	} while (pte++, addr += PAGE_SIZE, addr != end);
+	} while (addr += PAGE_SIZE, addr != end);
 
 	arch_leave_lazy_mmu_mode();
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] mm: call pte_unmap() against a proper pte (Re: [PATCH 7/9] swap_info: swap count continuations)
  2009-10-16  6:30   ` [PATCH] mm: call pte_unmap() against a proper pte (Re: [PATCH 7/9] swap_info: swap count continuations) Daisuke Nishimura
@ 2009-10-16  8:01     ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 3+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-10-16  8:01 UTC (permalink / raw)
  To: Daisuke Nishimura
  Cc: Hugh Dickins, Andrew Morton, Nitin Gupta, hongshin, linux-kernel,
	linux-mm

On Fri, 16 Oct 2009 15:30:56 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:

> Hi.
> 
> > @@ -645,6 +648,7 @@ static int copy_pte_range(struct mm_stru
> >  	spinlock_t *src_ptl, *dst_ptl;
> >  	int progress = 0;
> >  	int rss[2];
> > +	swp_entry_t entry = (swp_entry_t){0};
> >  
> >  again:
> >  	rss[1] = rss[0] = 0;
> > @@ -671,7 +675,10 @@ again:
> >  			progress++;
> >  			continue;
> >  		}
> > -		copy_one_pte(dst_mm, src_mm, dst_pte, src_pte, vma, addr, rss);
> > +		entry.val = copy_one_pte(dst_mm, src_mm, dst_pte, src_pte,
> > +							vma, addr, rss);
> > +		if (entry.val)
> > +			break;
> >  		progress += 8;
> >  	} while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr != end);
> >  
> It isn't the fault of only this patch, but I think breaking the loop without incrementing
> dst_pte(and src_pte) would be bad behavior because we do unmap_pte(dst_pte - 1) later.
> (current copy_pte_range() already does it though... and this is only problematic
> when we break the first loop, IIUC.)
> 

oh, yes. nice catch!

> > @@ -681,6 +688,12 @@ again:
> >  	add_mm_rss(dst_mm, rss[0], rss[1]);
> >  	pte_unmap_unlock(dst_pte - 1, dst_ptl);
> >  	cond_resched();
> > +
> > +	if (entry.val) {
> > +		if (add_swap_count_continuation(entry, GFP_KERNEL) < 0)
> > +			return -ENOMEM;
> > +		progress = 0;
> > +	}
> >  	if (addr != end)
> >  		goto again;
> >  	return 0;
> 
> I've searched other places where we break a similar loop and do pte_unmap(pte - 1).
> Current copy_pte_range() and apply_to_pte_range() has the same problem.
> 

> How about a patch like this ?
> ===
> From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> 
> There are some places where we do like:
> 
> 	pte = pte_map();
> 	do {
> 		(do break in some conditions)
> 	} while (pte++, ...);
> 	pte_unmap(pte - 1);
> 
> But if the loop breaks at the first loop, pte_unmap() unmaps invalid pte.
> 
> This patch is a fix for this problem.
> 
> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>

seems correct.

Reviewd-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

> ---
>  mm/memory.c |   11 +++++++----
>  1 files changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 72a2494..492de38 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -641,6 +641,7 @@ static int copy_pte_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
>  		pmd_t *dst_pmd, pmd_t *src_pmd, struct vm_area_struct *vma,
>  		unsigned long addr, unsigned long end)
>  {
> +	pte_t *orig_src_pte, *orig_dst_pte;
>  	pte_t *src_pte, *dst_pte;
>  	spinlock_t *src_ptl, *dst_ptl;
>  	int progress = 0;
> @@ -654,6 +655,8 @@ again:
>  	src_pte = pte_offset_map_nested(src_pmd, addr);
>  	src_ptl = pte_lockptr(src_mm, src_pmd);
>  	spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
> +	orig_src_pte = src_pte;
> +	orig_dst_pte = dst_pte;
>  	arch_enter_lazy_mmu_mode();
>  
>  	do {
> @@ -677,9 +680,9 @@ again:
>  
>  	arch_leave_lazy_mmu_mode();
>  	spin_unlock(src_ptl);
> -	pte_unmap_nested(src_pte - 1);
> +	pte_unmap_nested(orig_src_pte);
>  	add_mm_rss(dst_mm, rss[0], rss[1]);
> -	pte_unmap_unlock(dst_pte - 1, dst_ptl);
> +	pte_unmap_unlock(orig_dst_pte, dst_ptl);
>  	cond_resched();
>  	if (addr != end)
>  		goto again;
> @@ -1822,10 +1825,10 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
>  	token = pmd_pgtable(*pmd);
>  
>  	do {
> -		err = fn(pte, token, addr, data);
> +		err = fn(pte++, token, addr, data);
>  		if (err)
>  			break;
> -	} while (pte++, addr += PAGE_SIZE, addr != end);
> +	} while (addr += PAGE_SIZE, addr != end);
>  
>  	arch_leave_lazy_mmu_mode();
>  
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] mm: call pte_unmap() against a proper pte (Re: [PATCH 7/9] swap_info: swap count continuations)
@ 2009-10-17 22:44 hugh.dickins
  0 siblings, 0 replies; 3+ messages in thread
From: hugh.dickins @ 2009-10-17 22:44 UTC (permalink / raw)
  To: nishimura
  Cc: Andrew Morton, Nitin Gupta, KAMEZAWA Hiroyuki, hongshin,
	linux-kernel, linux-mm



>----Original Message----
>From: nishimura@mxp.nes.nec.co.jp
>Date: 16/10/2009 7:30 
>To: "Hugh Dickins"<hugh.dickins@tiscali.co.uk>
>Cc: "Andrew Morton"<akpm@linux-foundation.org>, "Nitin Gupta"<ngupta@vflare.
org>, "KAMEZAWA Hiroyuki"<kamezawa.hiroyu@jp.fujitsu.com>, <hongshin@gmail.
com>, <linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>, "Daisuke Nishimura"
<nishimura@mxp.nes.nec.co.jp>
>Subj: [PATCH] mm: call pte_unmap() against a proper pte (Re: [PATCH 7/9] 
swap_info: swap count continuations)
>
>Hi.
>
>> @@ -645,6 +648,7 @@ static int copy_pte_range(struct mm_stru
>>  	spinlock_t *src_ptl, *dst_ptl;
>>  	int progress = 0;
>>  	int rss[2];
>> +	swp_entry_t entry = (swp_entry_t){0};
>>  
>>  again:
>>  	rss[1] = rss[0] = 0;
>> @@ -671,7 +675,10 @@ again:
>>  			progress++;
>>  			continue;
>>  		}
>> -		copy_one_pte(dst_mm, src_mm, dst_pte, src_pte, vma, addr, rss);
>> +		entry.val = copy_one_pte(dst_mm, src_mm, dst_pte, src_pte,
>> +							vma, addr, rss);
>> +		if (entry.val)
>> +			break;
>>  		progress += 8;
>>  	} while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr != end);
>>  
>It isn't the fault of only this patch, but I think breaking the loop without 
incrementing
>dst_pte(and src_pte) would be bad behavior because we do unmap_pte(dst_pte - 
1) later.
>(current copy_pte_range() already does it though... and this is only 
problematic
>when we break the first loop, IIUC.)

Good catch, thanks a lot for finding that.  I believe this is entirely a fault 
in my 7/9, the existing code
takes care not to break before it has made some progress (in part because of 
this unmap issue).

>
>> @@ -681,6 +688,12 @@ again:
>>  	add_mm_rss(dst_mm, rss[0], rss[1]);
>>  	pte_unmap_unlock(dst_pte - 1, dst_ptl);
>>  	cond_resched();
>> +
>> +	if (entry.val) {
>> +		if (add_swap_count_continuation(entry, GFP_KERNEL) < 0)
>> +			return -ENOMEM;
>> +		progress = 0;
>> +	}
>>  	if (addr != end)
>>  		goto again;
>>  	return 0;
>
>I've searched other places where we break a similar loop and do pte_unmap(pte 
- 1).
>Current copy_pte_range() and apply_to_pte_range() has the same problem.

And thank you for taking the trouble to look further afield: yes, 
apply_to_pte_range()
was already wrong.

>
>How about a patch like this ?
>===
>From: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
>
>There are some places where we do like:
>
>	pte = pte_map();
>	do {
>		(do break in some conditions)
>	} while (pte++, ...);
>	pte_unmap(pte - 1);
>
>But if the loop breaks at the first loop, pte_unmap() unmaps invalid pte.
>
>This patch is a fix for this problem.
>
>Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>

Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>





Forget the rest, get the best - http://www.tiscali.co.uk/music

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2009-10-17 22:44 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-17 22:44 [PATCH] mm: call pte_unmap() against a proper pte (Re: [PATCH 7/9] swap_info: swap count continuations) hugh.dickins
  -- strict thread matches above, loose matches on Subject: below --
2009-10-15  0:44 [PATCH 0/9] swap_info and swap_map patches Hugh Dickins
2009-10-15  0:56 ` [PATCH 7/9] swap_info: swap count continuations Hugh Dickins
2009-10-16  6:30   ` [PATCH] mm: call pte_unmap() against a proper pte (Re: [PATCH 7/9] swap_info: swap count continuations) Daisuke Nishimura
2009-10-16  8:01     ` KAMEZAWA Hiroyuki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox