* [PATCH 02/14] mm: fix major/minor fault accounting on retried fault
2009-04-07 11:50 [PATCH 00/14] filemap and readahead fixes Wu Fengguang
@ 2009-04-07 11:50 ` Wu Fengguang
0 siblings, 0 replies; 7+ messages in thread
From: Wu Fengguang @ 2009-04-07 11:50 UTC (permalink / raw)
To: Andrew Morton
Cc: Benjamin Herrenschmidt, Ying Han, Wu Fengguang, David Rientjes,
Hugh Dickins, Ingo Molnar, Lee Schermerhorn, Mike Waychison,
Nick Piggin, Peter Zijlstra, Rohit Seth, Edwin, H. Peter Anvin,
LKML, linux-mm, linux-fsdevel
[-- Attachment #1: filemap-major-fault-retry.patch --]
[-- Type: text/plain, Size: 1558 bytes --]
VM_FAULT_RETRY does make major/minor faults accounting a bit twisted..
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
arch/x86/mm/fault.c | 2 ++
mm/memory.c | 22 ++++++++++++++--------
2 files changed, 16 insertions(+), 8 deletions(-)
--- mm.orig/arch/x86/mm/fault.c
+++ mm/arch/x86/mm/fault.c
@@ -1160,6 +1160,8 @@ good_area:
if (fault & VM_FAULT_RETRY) {
if (retry_flag) {
retry_flag = 0;
+ tsk->maj_flt++;
+ tsk->min_flt--;
goto retry;
}
BUG();
--- mm.orig/mm/memory.c
+++ mm/mm/memory.c
@@ -2882,26 +2882,32 @@ int handle_mm_fault(struct mm_struct *mm
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
+ int ret;
__set_current_state(TASK_RUNNING);
- count_vm_event(PGFAULT);
-
- if (unlikely(is_vm_hugetlb_page(vma)))
- return hugetlb_fault(mm, vma, address, write_access);
+ if (unlikely(is_vm_hugetlb_page(vma))) {
+ ret = hugetlb_fault(mm, vma, address, write_access);
+ goto out;
+ }
+ ret = VM_FAULT_OOM;
pgd = pgd_offset(mm, address);
pud = pud_alloc(mm, pgd, address);
if (!pud)
- return VM_FAULT_OOM;
+ goto out;
pmd = pmd_alloc(mm, pud, address);
if (!pmd)
- return VM_FAULT_OOM;
+ goto out;
pte = pte_alloc_map(mm, pmd, address);
if (!pte)
- return VM_FAULT_OOM;
+ goto out;
- return handle_pte_fault(mm, vma, address, pte, pmd, write_access);
+ ret = handle_pte_fault(mm, vma, address, pte, pmd, write_access);
+out:
+ if (!(ret & VM_FAULT_RETRY))
+ count_vm_event(PGFAULT);
+ return ret;
}
#ifndef __PAGETABLE_PUD_FOLDED
--
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 02/14] mm: fix major/minor fault accounting on retried fault
[not found] ` <20090407072132.943283183@intel.com>
@ 2009-04-07 19:58 ` Ying Han
2009-04-07 22:45 ` Wu Fengguang
0 siblings, 1 reply; 7+ messages in thread
From: Ying Han @ 2009-04-07 19:58 UTC (permalink / raw)
To: Wu Fengguang; +Cc: Andrew Morton, LKML, linux-fsdevel, linux-mm
On Tue, Apr 7, 2009 at 12:17 AM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> VM_FAULT_RETRY does make major/minor faults accounting a bit twisted..
>
> Cc: Ying Han <yinghan@google.com>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> ---
> arch/x86/mm/fault.c | 2 ++
> mm/memory.c | 22 ++++++++++++++--------
> 2 files changed, 16 insertions(+), 8 deletions(-)
>
> --- mm.orig/arch/x86/mm/fault.c
> +++ mm/arch/x86/mm/fault.c
> @@ -1160,6 +1160,8 @@ good_area:
> if (fault & VM_FAULT_RETRY) {
> if (retry_flag) {
> retry_flag = 0;
> + tsk->maj_flt++;
> + tsk->min_flt--;
> goto retry;
> }
> BUG();
sorry, little bit confuse here. are we assuming the retry path will
return min_flt as always?
> --- mm.orig/mm/memory.c
> +++ mm/mm/memory.c
> @@ -2882,26 +2882,32 @@ int handle_mm_fault(struct mm_struct *mm
> pud_t *pud;
> pmd_t *pmd;
> pte_t *pte;
> + int ret;
>
> __set_current_state(TASK_RUNNING);
>
> - count_vm_event(PGFAULT);
> -
> - if (unlikely(is_vm_hugetlb_page(vma)))
> - return hugetlb_fault(mm, vma, address, write_access);
> + if (unlikely(is_vm_hugetlb_page(vma))) {
> + ret = hugetlb_fault(mm, vma, address, write_access);
> + goto out;
> + }
>
> + ret = VM_FAULT_OOM;
> pgd = pgd_offset(mm, address);
> pud = pud_alloc(mm, pgd, address);
> if (!pud)
> - return VM_FAULT_OOM;
> + goto out;
> pmd = pmd_alloc(mm, pud, address);
> if (!pmd)
> - return VM_FAULT_OOM;
> + goto out;
> pte = pte_alloc_map(mm, pmd, address);
> if (!pte)
> - return VM_FAULT_OOM;
> + goto out;
>
> - return handle_pte_fault(mm, vma, address, pte, pmd, write_access);
> + ret = handle_pte_fault(mm, vma, address, pte, pmd, write_access);
> +out:
> + if (!(ret & VM_FAULT_RETRY))
> + count_vm_event(PGFAULT);
> + return ret;
> }
>
> #ifndef __PAGETABLE_PUD_FOLDED
>
> --
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 03/14] mm: remove FAULT_FLAG_RETRY dead code
[not found] ` <20090407072133.053995305@intel.com>
@ 2009-04-07 20:03 ` Ying Han
2009-04-07 23:27 ` Wu Fengguang
0 siblings, 1 reply; 7+ messages in thread
From: Ying Han @ 2009-04-07 20:03 UTC (permalink / raw)
To: Wu Fengguang; +Cc: Andrew Morton, LKML, linux-fsdevel, linux-mm
On Tue, Apr 7, 2009 at 12:17 AM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> Cc: Ying Han <yinghan@google.com>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> ---
> mm/memory.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> --- mm.orig/mm/memory.c
> +++ mm/mm/memory.c
> @@ -2766,10 +2766,8 @@ static int do_linear_fault(struct mm_str
> {
> pgoff_t pgoff = (((address & PAGE_MASK)
> - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
> - int write = write_access & ~FAULT_FLAG_RETRY;
> - unsigned int flags = (write ? FAULT_FLAG_WRITE : 0);
> + unsigned int flags = (write_access ? FAULT_FLAG_WRITE : 0);
>
> - flags |= (write_access & FAULT_FLAG_RETRY);
> pte_unmap(page_table);
> return __do_fault(mm, vma, address, pmd, pgoff, flags, orig_pte);
> }
So, we got rid of FAULT_FLAG_RETRY flag?
> --
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 02/14] mm: fix major/minor fault accounting on retried fault
2009-04-07 19:58 ` [PATCH 02/14] mm: fix major/minor fault accounting on retried fault Ying Han
@ 2009-04-07 22:45 ` Wu Fengguang
0 siblings, 0 replies; 7+ messages in thread
From: Wu Fengguang @ 2009-04-07 22:45 UTC (permalink / raw)
To: Ying Han
Cc: Andrew Morton, LKML, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org
On Wed, Apr 08, 2009 at 03:58:16AM +0800, Ying Han wrote:
> On Tue, Apr 7, 2009 at 12:17 AM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> > VM_FAULT_RETRY does make major/minor faults accounting a bit twisted..
> >
> > Cc: Ying Han <yinghan@google.com>
> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> > ---
> > arch/x86/mm/fault.c | 2 ++
> > mm/memory.c | 22 ++++++++++++++--------
> > 2 files changed, 16 insertions(+), 8 deletions(-)
> >
> > --- mm.orig/arch/x86/mm/fault.c
> > +++ mm/arch/x86/mm/fault.c
> > @@ -1160,6 +1160,8 @@ good_area:
> > if (fault & VM_FAULT_RETRY) {
> > if (retry_flag) {
> > retry_flag = 0;
> > + tsk->maj_flt++;
> > + tsk->min_flt--;
> > goto retry;
> > }
> > BUG();
> sorry, little bit confuse here. are we assuming the retry path will
> return min_flt as always?
Sure - except for some really exceptional ftruncate cases.
The page was there ready, and we'll retry immediately.
maj_flt/min_flt are not _exact_ numbers by their nature, so 99.9%
accuracy shall be fine.
Thanks,
Fengguang
> > --- mm.orig/mm/memory.c
> > +++ mm/mm/memory.c
> > @@ -2882,26 +2882,32 @@ int handle_mm_fault(struct mm_struct *mm
> > pud_t *pud;
> > pmd_t *pmd;
> > pte_t *pte;
> > + int ret;
> >
> > __set_current_state(TASK_RUNNING);
> >
> > - count_vm_event(PGFAULT);
> > -
> > - if (unlikely(is_vm_hugetlb_page(vma)))
> > - return hugetlb_fault(mm, vma, address, write_access);
> > + if (unlikely(is_vm_hugetlb_page(vma))) {
> > + ret = hugetlb_fault(mm, vma, address, write_access);
> > + goto out;
> > + }
> >
> > + ret = VM_FAULT_OOM;
> > pgd = pgd_offset(mm, address);
> > pud = pud_alloc(mm, pgd, address);
> > if (!pud)
> > - return VM_FAULT_OOM;
> > + goto out;
> > pmd = pmd_alloc(mm, pud, address);
> > if (!pmd)
> > - return VM_FAULT_OOM;
> > + goto out;
> > pte = pte_alloc_map(mm, pmd, address);
> > if (!pte)
> > - return VM_FAULT_OOM;
> > + goto out;
> >
> > - return handle_pte_fault(mm, vma, address, pte, pmd, write_access);
> > + ret = handle_pte_fault(mm, vma, address, pte, pmd, write_access);
> > +out:
> > + if (!(ret & VM_FAULT_RETRY))
> > + count_vm_event(PGFAULT);
> > + return ret;
> > }
> >
> > #ifndef __PAGETABLE_PUD_FOLDED
> >
> > --
> >
> >
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 03/14] mm: remove FAULT_FLAG_RETRY dead code
2009-04-07 20:03 ` [PATCH 03/14] mm: remove FAULT_FLAG_RETRY dead code Ying Han
@ 2009-04-07 23:27 ` Wu Fengguang
2009-04-08 1:17 ` Ying Han
0 siblings, 1 reply; 7+ messages in thread
From: Wu Fengguang @ 2009-04-07 23:27 UTC (permalink / raw)
To: Ying Han
Cc: Andrew Morton, LKML, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org
On Wed, Apr 08, 2009 at 04:03:36AM +0800, Ying Han wrote:
> On Tue, Apr 7, 2009 at 12:17 AM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> > Cc: Ying Han <yinghan@google.com>
> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> > ---
> > mm/memory.c | 4 +---
> > 1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > --- mm.orig/mm/memory.c
> > +++ mm/mm/memory.c
> > @@ -2766,10 +2766,8 @@ static int do_linear_fault(struct mm_str
> > {
> > pgoff_t pgoff = (((address & PAGE_MASK)
> > - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
> > - int write = write_access & ~FAULT_FLAG_RETRY;
> > - unsigned int flags = (write ? FAULT_FLAG_WRITE : 0);
> > + unsigned int flags = (write_access ? FAULT_FLAG_WRITE : 0);
> >
> > - flags |= (write_access & FAULT_FLAG_RETRY);
> > pte_unmap(page_table);
> > return __do_fault(mm, vma, address, pmd, pgoff, flags, orig_pte);
> > }
> So, we got rid of FAULT_FLAG_RETRY flag?
Seems yes for the current mm tree, see the following two commits.
I did this patch on seeing 761fe7bc8193b7. But a closer look
indicates that the following two patches disable the filemap
VM_FAULT_RETRY part totally...
Anyway, if these two patches are to be reverted somehow(I guess yes),
this patch shall be _ignored_.
btw, do you have any test case and performance numbers for
FAULT_FLAG_RETRY? And possible overheads for (the worst case)
sparse random mmap reads on a sparse file? I cannot find any
in your changelogs..
Thanks,
Fengguang
commit 761fe7bc8193b7858b7dc7eb4a026dc66e49fe1f
Author: Andrew Morton <akpm@linux-foundation.org>
Date: Mon Feb 9 21:08:50 2009 +0100
A shot in the dark :(
Cc: Mike Waychison <mikew@google.com>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index bac7d7a..1c6736d 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1139,8 +1139,6 @@ good_area:
return;
}
- write |= retry_flag;
-
/*
* If for any reason at all we couldn't handle the fault,
* make sure we exit gracefully rather than endlessly redo
commit f01ca7a68c37680a4eee22a8722a713c5102b3bb
Author: Andrew Morton <akpm@linux-foundation.org>
Date: Mon Feb 9 21:08:50 2009 +0100
Untangle the `write' boolean from the FAULT_FLAG_foo non-boolean field.
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mike Waychison <mikew@google.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rohit Seth <rohitseth@google.com>
Cc: T<F6>r<F6>k Edwin <edwintorok@gmail.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index b2cc88f..bac7d7a 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -978,7 +978,7 @@ do_page_fault(struct pt_regs *regs, unsigned long error_code)
struct mm_struct *mm;
int write;
int fault;
- unsigned int retry_flag = FAULT_FLAG_RETRY;
+ int retry_flag = 1;
tsk = current;
mm = tsk->mm;
@@ -1140,6 +1140,7 @@ good_area:
}
write |= retry_flag;
+
/*
* If for any reason at all we couldn't handle the fault,
* make sure we exit gracefully rather than endlessly redo
@@ -1159,8 +1160,8 @@ good_area:
* be removed or changed after the retry.
*/
if (fault & VM_FAULT_RETRY) {
- if (write & FAULT_FLAG_RETRY) {
- retry_flag &= ~FAULT_FLAG_RETRY;
+ if (retry_flag) {
+ retry_flag = 0;
goto retry;
}
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 03/14] mm: remove FAULT_FLAG_RETRY dead code
2009-04-07 23:27 ` Wu Fengguang
@ 2009-04-08 1:17 ` Ying Han
2009-04-08 2:29 ` Wu Fengguang
0 siblings, 1 reply; 7+ messages in thread
From: Ying Han @ 2009-04-08 1:17 UTC (permalink / raw)
To: Wu Fengguang
Cc: Andrew Morton, LKML, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org
On Tue, Apr 7, 2009 at 4:27 PM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> On Wed, Apr 08, 2009 at 04:03:36AM +0800, Ying Han wrote:
>> On Tue, Apr 7, 2009 at 12:17 AM, Wu Fengguang <fengguang.wu@intel.com> wrote:
>> > Cc: Ying Han <yinghan@google.com>
>> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
>> > ---
>> > mm/memory.c | 4 +---
>> > 1 file changed, 1 insertion(+), 3 deletions(-)
>> >
>> > --- mm.orig/mm/memory.c
>> > +++ mm/mm/memory.c
>> > @@ -2766,10 +2766,8 @@ static int do_linear_fault(struct mm_str
>> > {
>> > pgoff_t pgoff = (((address & PAGE_MASK)
>> > - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
>> > - int write = write_access & ~FAULT_FLAG_RETRY;
>> > - unsigned int flags = (write ? FAULT_FLAG_WRITE : 0);
>> > + unsigned int flags = (write_access ? FAULT_FLAG_WRITE : 0);
>> >
>> > - flags |= (write_access & FAULT_FLAG_RETRY);
>> > pte_unmap(page_table);
>> > return __do_fault(mm, vma, address, pmd, pgoff, flags, orig_pte);
>> > }
>> So, we got rid of FAULT_FLAG_RETRY flag?
>
> Seems yes for the current mm tree, see the following two commits.
>
> I did this patch on seeing 761fe7bc8193b7. But a closer look
> indicates that the following two patches disable the filemap
> VM_FAULT_RETRY part totally...
>
> Anyway, if these two patches are to be reverted somehow(I guess yes),
> this patch shall be _ignored_.
>
> btw, do you have any test case and performance numbers for
> FAULT_FLAG_RETRY? And possible overheads for (the worst case)
> sparse random mmap reads on a sparse file? I cannot find any
> in your changelogs..
here is the benchmark i posted on [V1] but somehow missed in [V2] describtion
Benchmarks:
case 1. one application has a high count of threads each faulting in
different pages of a hugefile. Benchmark indicate that this double data
structure walking in case of major fault results in << 1% performance hit.
case 2. add another thread in the above application which in a tight loop of
mmap()/munmap(). Here we measure loop count in the new thread while other
threads doing the same amount of work as case one. we got << 3% performance
hit on the Complete Time(benchmark value for case one) and 10% performance
improvement on the mmap()/munmap() counter.
This patch helps a lot in cases we have writer which is waitting behind all
readers, so it could execute much faster.
--Ying
>
> Thanks,
> Fengguang
>
>
> commit 761fe7bc8193b7858b7dc7eb4a026dc66e49fe1f
> Author: Andrew Morton <akpm@linux-foundation.org>
> Date: Mon Feb 9 21:08:50 2009 +0100
>
> A shot in the dark :(
>
> Cc: Mike Waychison <mikew@google.com>
> Cc: Ying Han <yinghan@google.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index bac7d7a..1c6736d 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -1139,8 +1139,6 @@ good_area:
> return;
> }
>
> - write |= retry_flag;
> -
> /*
> * If for any reason at all we couldn't handle the fault,
> * make sure we exit gracefully rather than endlessly redo
>
>
> commit f01ca7a68c37680a4eee22a8722a713c5102b3bb
> Author: Andrew Morton <akpm@linux-foundation.org>
> Date: Mon Feb 9 21:08:50 2009 +0100
>
> Untangle the `write' boolean from the FAULT_FLAG_foo non-boolean field.
>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Hugh Dickins <hugh@veritas.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
> Cc: Mike Waychison <mikew@google.com>
> Cc: Nick Piggin <npiggin@suse.de>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Rohit Seth <rohitseth@google.com>
> Cc: T<F6>r<F6>k Edwin <edwintorok@gmail.com>
> Cc: Valdis.Kletnieks@vt.edu
> Cc: Ying Han <yinghan@google.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index b2cc88f..bac7d7a 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -978,7 +978,7 @@ do_page_fault(struct pt_regs *regs, unsigned long error_code)
> struct mm_struct *mm;
> int write;
> int fault;
> - unsigned int retry_flag = FAULT_FLAG_RETRY;
> + int retry_flag = 1;
>
> tsk = current;
> mm = tsk->mm;
> @@ -1140,6 +1140,7 @@ good_area:
> }
>
> write |= retry_flag;
> +
> /*
> * If for any reason at all we couldn't handle the fault,
> * make sure we exit gracefully rather than endlessly redo
> @@ -1159,8 +1160,8 @@ good_area:
> * be removed or changed after the retry.
> */
> if (fault & VM_FAULT_RETRY) {
> - if (write & FAULT_FLAG_RETRY) {
> - retry_flag &= ~FAULT_FLAG_RETRY;
> + if (retry_flag) {
> + retry_flag = 0;
> goto retry;
> }
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 03/14] mm: remove FAULT_FLAG_RETRY dead code
2009-04-08 1:17 ` Ying Han
@ 2009-04-08 2:29 ` Wu Fengguang
0 siblings, 0 replies; 7+ messages in thread
From: Wu Fengguang @ 2009-04-08 2:29 UTC (permalink / raw)
To: Ying Han
Cc: Andrew Morton, LKML, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org
Hi Ying Han,
On Wed, Apr 08, 2009 at 09:17:26AM +0800, Ying Han wrote:
> On Tue, Apr 7, 2009 at 4:27 PM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> > On Wed, Apr 08, 2009 at 04:03:36AM +0800, Ying Han wrote:
> >> On Tue, Apr 7, 2009 at 12:17 AM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> >> > Cc: Ying Han <yinghan@google.com>
> >> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> >> > ---
> >> > mm/memory.c | 4 +---
> >> > 1 file changed, 1 insertion(+), 3 deletions(-)
> >> >
> >> > --- mm.orig/mm/memory.c
> >> > +++ mm/mm/memory.c
> >> > @@ -2766,10 +2766,8 @@ static int do_linear_fault(struct mm_str
> >> > {
> >> > pgoff_t pgoff = (((address & PAGE_MASK)
> >> > - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
> >> > - int write = write_access & ~FAULT_FLAG_RETRY;
> >> > - unsigned int flags = (write ? FAULT_FLAG_WRITE : 0);
> >> > + unsigned int flags = (write_access ? FAULT_FLAG_WRITE : 0);
> >> >
> >> > - flags |= (write_access & FAULT_FLAG_RETRY);
> >> > pte_unmap(page_table);
> >> > return __do_fault(mm, vma, address, pmd, pgoff, flags, orig_pte);
> >> > }
> >> So, we got rid of FAULT_FLAG_RETRY flag?
> >
> > Seems yes for the current mm tree, see the following two commits.
> >
> > I did this patch on seeing 761fe7bc8193b7. But a closer look
> > indicates that the following two patches disable the filemap
> > VM_FAULT_RETRY part totally...
> >
> > Anyway, if these two patches are to be reverted somehow(I guess yes),
> > this patch shall be _ignored_.
> >
> > btw, do you have any test case and performance numbers for
> > FAULT_FLAG_RETRY? And possible overheads for (the worst case)
> > sparse random mmap reads on a sparse file? I cannot find any
> > in your changelogs..
>
> here is the benchmark i posted on [V1] but somehow missed in [V2] describtion
>
> Benchmarks:
> case 1. one application has a high count of threads each faulting in
> different pages of a hugefile. Benchmark indicate that this double data
> structure walking in case of major fault results in << 1% performance hit.
>
> case 2. add another thread in the above application which in a tight loop of
> mmap()/munmap(). Here we measure loop count in the new thread while other
> threads doing the same amount of work as case one. we got << 3% performance
> hit on the Complete Time(benchmark value for case one) and 10% performance
> improvement on the mmap()/munmap() counter.
>
> This patch helps a lot in cases we have writer which is waitting behind all
> readers, so it could execute much faster.
>
Just tested the sparse-random-read-on-sparse-file case, and found the
performance impact to be 0.4% (8.706s vs 8.744s). Kind of acceptable.
without FAULT_FLAG_RETRY:
iotrace.rb --load stride-100 --mplay /mnt/btrfs-ram/sparse 3.28s user 5.39s system 99% cpu 8.692 total
iotrace.rb --load stride-100 --mplay /mnt/btrfs-ram/sparse 3.17s user 5.54s system 99% cpu 8.742 total
iotrace.rb --load stride-100 --mplay /mnt/btrfs-ram/sparse 3.18s user 5.48s system 99% cpu 8.684 total
FAULT_FLAG_RETRY:
iotrace.rb --load stride-100 --mplay /mnt/btrfs-ram/sparse 3.18s user 5.63s system 99% cpu 8.825 total
iotrace.rb --load stride-100 --mplay /mnt/btrfs-ram/sparse 3.22s user 5.47s system 99% cpu 8.718 total
iotrace.rb --load stride-100 --mplay /mnt/btrfs-ram/sparse 3.13s user 5.55s system 99% cpu 8.690 total
In the above faked workload, the mmap read page offsets are loaded from
stride-100 and performed on /mnt/btrfs-ram/sparse, which are created by:
seq 0 100 1000000 > stride-100
dd if=/dev/zero of=/mnt/btrfs-ram/sparse bs=1M count=1 seek=1024000
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2009-04-08 2:30 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20090407071729.233579162@intel.com>
[not found] ` <20090407072132.943283183@intel.com>
2009-04-07 19:58 ` [PATCH 02/14] mm: fix major/minor fault accounting on retried fault Ying Han
2009-04-07 22:45 ` Wu Fengguang
[not found] ` <20090407072133.053995305@intel.com>
2009-04-07 20:03 ` [PATCH 03/14] mm: remove FAULT_FLAG_RETRY dead code Ying Han
2009-04-07 23:27 ` Wu Fengguang
2009-04-08 1:17 ` Ying Han
2009-04-08 2:29 ` Wu Fengguang
2009-04-07 11:50 [PATCH 00/14] filemap and readahead fixes Wu Fengguang
2009-04-07 11:50 ` [PATCH 02/14] mm: fix major/minor fault accounting on retried fault Wu Fengguang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox