From: "Török Edwin" <edwintorok@gmail.com>
To: Ying Han <yinghan@google.com>
Cc: linux-mm@kvack.org, linux-kernel <linux-kernel@vger.kernel.org>,
	akpm <akpm@linux-foundation.org>, Ingo Molnar <mingo@elte.hu>,
	Mike Waychison <mikew@google.com>,
	David Rientjes <rientjes@google.com>,
	Rohit Seth <rohitseth@google.com>,
	Hugh Dickins <hugh@veritas.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Lee Schermerhorn <lee.schermerhorn@hp.com>,
	Nick Piggin <npiggin@suse.de>
Subject: Re: [RFC v2][PATCH]page_fault retry with NOPAGE_RETRY
Date: Sat, 06 Dec 2008 11:52:08 +0200
Message-ID: <493A4B48.1050706@gmail.com>
In-Reply-To: <604427e00812051140s67b2a89dm35806c3ee3b6ed7a@mail.gmail.com>

On 2008-12-05 21:40, Ying Han wrote:
> changelog[v2]:
> - reduce the runtime overhead by extending the 'write' flag of
>   handle_mm_fault() to indicate the retry hint.
> - add another two branches in filemap_fault with retry logic.
> - replace find_lock_page with find_lock_page_retry to make the code
>   cleaner.
>
> todo:
> - there is potentially a starvation hole with the retry. By the time the
>   retry returns, the pages might be released. We can fix this by holding
>   a page reference as well as remembering what the page "was" (in case the
>   file was truncated). Any suggestions here are welcome.
>
> I also made patches for all other archs. I am posting x86_64 here first and
> I will post the others once everyone feels comfortable with this patch.
>
> Edwin, please test this patch with your testcase and check if you get any
> performance improvement of mmap over read. I added two more places
> in filemap_fault with retry logic which you might have hit in your previous
> experiment.
>   

I get much better results with this patch than with v1, thanks!

mmap now scales almost as well as read does (there is a small ~5%
overhead), which is a significant improvement over not scaling at all!

Here are the results when running my testcase:

Kernel version: 2.6.28-rc7-tip

Threads    read    mmap   mixed
      1   27.55   26.18   27.06
      2   16.18   16.97   16.10
      4   11.06   11.64   11.41
      8    9.38    9.97    9.31
     16    9.37    9.82    9.3
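
As a rough cross-check of the ~5% figure, the mmap overhead relative to read
can be recomputed from the numbers above. This is only a minimal sketch: it
assumes the values are elapsed times where lower is better (the message does
not state the unit), and the names `results` and `mmap_overhead_percent` are
made up for illustration.

```python
# Timings copied from the results above for 2.6.28-rc7-tip.
# Assumption: lower is better; the unit is not stated in the message.
results = {
    1: {"read": 27.55, "mmap": 26.18, "mixed": 27.06},
    2: {"read": 16.18, "mmap": 16.97, "mixed": 16.10},
    4: {"read": 11.06, "mmap": 11.64, "mixed": 11.41},
    8: {"read": 9.38, "mmap": 9.97, "mixed": 9.31},
    16: {"read": 9.37, "mmap": 9.82, "mixed": 9.3},
}


def mmap_overhead_percent(row):
    """Relative difference of mmap versus read, in percent."""
    return (row["mmap"] - row["read"]) / row["read"] * 100.0


# Positive values mean mmap took longer than read at that thread count.
overheads = {n: mmap_overhead_percent(row) for n, row in results.items()}

for n, pct in overheads.items():
    print(f"{n:2d} threads: mmap vs read {pct:+.1f}%")
```

With one thread mmap comes out slightly ahead of read; from two threads
upward the difference stays in the single-digit percent range.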


Here is the /proc/lock_stat output when running my testcase. Contention
is lower (34911+10462 vs 58590+7231), and waittime-total is better
(57 601 464 vs 234 170 024):

lock_stat version 0.3
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                              class name    con-bounces    contentions   waittime-min   waittime-max waittime-total    acq-bounces   acquisitions   holdtime-min   holdtime-max holdtime-total
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                         &mm->mmap_sem-W:          5843          10462           2.89      138824.72    14217159.52          18965          84205           1.81        5031.07      725293.65
                         &mm->mmap_sem-R:         20208          34911           4.87      136797.26    57601464.49          55797        1110394           1.89      164918.52    30551371.71
                         ---------------
                           &mm->mmap_sem           5341          [<ffffffff802bf9d7>] sys_munmap+0x47/0x80
                           &mm->mmap_sem          28579          [<ffffffff805d1c62>] do_page_fault+0x172/0xab0
                           &mm->mmap_sem           5030          [<ffffffff80211161>] sys_mmap+0xf1/0x140
                           &mm->mmap_sem           6331          [<ffffffff802a675e>] find_lock_page_retry+0xde/0xf0
                         ---------------
                           &mm->mmap_sem          13558          [<ffffffff802a675e>] find_lock_page_retry+0xde/0xf0
                           &mm->mmap_sem           4694          [<ffffffff802bf9d7>] sys_munmap+0x47/0x80
                           &mm->mmap_sem           3681          [<ffffffff80211161>] sys_mmap+0xf1/0x140
                           &mm->mmap_sem          23374          [<ffffffff805d1c62>] do_page_fault+0x172/0xab0
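
The comparison quoted before the lock_stat output can be worked through
explicitly. A minimal sketch, using only the figures given in this message;
the variable names are illustrative, and the "old" values are whatever earlier
run the message compares against (it does not say which kernel that was).

```python
# mmap_sem contentions: read side + write side, as quoted in the message.
contentions_new = 34911 + 10462   # this run (mmap_sem-R + mmap_sem-W)
contentions_old = 58590 + 7231    # earlier run the message compares against

# waittime-total figures as quoted; the new value matches the mmap_sem-R row.
waittime_new = 57_601_464
waittime_old = 234_170_024

# Fraction by which total contentions dropped, and how many times
# smaller the quoted waittime-total became.
contention_drop = 1 - contentions_new / contentions_old
waittime_ratio = waittime_old / waittime_new

print(f"contentions: {contentions_old} -> {contentions_new} "
      f"({contention_drop:.1%} fewer)")
print(f"waittime-total: {waittime_ratio:.2f}x lower")
```

So total contentions drop by roughly a third, while the quoted waittime-total
shrinks by about a factor of four.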


On clamd:

Here holdtime-total is better (1 493 154 + 2 395 987 vs
2 087 538 + 2 514 673), and the number of contentions on read
(458 052 vs 5851


lock_stat version 0.3
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                              class name    con-bounces    contentions   waittime-min   waittime-max waittime-total    acq-bounces   acquisitions   holdtime-min   holdtime-max holdtime-total
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

                         &mm->mmap_sem-W:        346769         533541           1.62       99819.40   454843342.63         411259         588486           1.33        6719.62     2395987.75
                         &mm->mmap_sem-R:        197856         458052           1.59       99800.28   313508721.01         338158         653427           1.71       25421.10     1493154.95
                         ---------------
                           &mm->mmap_sem         427857          [<ffffffff805d1c62>] do_page_fault+0x172/0xab0
                           &mm->mmap_sem         266464          [<ffffffff802bf9d7>] sys_munmap+0x47/0x80
                           &mm->mmap_sem         251689          [<ffffffff802110d6>] sys_mmap+0x66/0x140
                           &mm->mmap_sem          15187          [<ffffffff80211161>] sys_mmap+0xf1/0x140
                         ---------------
                           &mm->mmap_sem         226908          [<ffffffff802110d6>] sys_mmap+0x66/0x140
                           &mm->mmap_sem         483909          [<ffffffff805d1c62>] do_page_fault+0x172/0xab0
                           &mm->mmap_sem         229404          [<ffffffff802bf9d7>] sys_munmap+0x47/0x80
                           &mm->mmap_sem          13229          [<ffffffff80211161>] sys_mmap+0xf1/0x140

..............................................................................................................................................................................................

                         &sem->wait_lock:        112617         114394           0.41         111.20      225590.14        1517470        6300681           0.27        4103.77     3814684.55
                         ---------------
                         &sem->wait_lock           5634          [<ffffffff8043a608>] __up_write+0x28/0x170
                         &sem->wait_lock          13595          [<ffffffff805ce4dc>] __down_read+0x1c/0xbc
                         &sem->wait_lock          38882          [<ffffffff8043a4a0>] __down_read_trylock+0x20/0x60
                         &sem->wait_lock          30718          [<ffffffff8043a773>] __up_read+0x23/0xc0
                         ---------------
                         &sem->wait_lock          21389          [<ffffffff8043a4a0>] __down_read_trylock+0x20/0x60
                         &sem->wait_lock          48959          [<ffffffff8043a608>] __up_write+0x28/0x170
                         &sem->wait_lock          24330          [<ffffffff8043a773>] __up_read+0x23/0xc0
                         &sem->wait_lock           9000          [<ffffffff805ce4dc>] __down_read+0x1c/0xbc


> @@ -694,6 +694,7 @@ static inline int page_mapped(struct page *page)
>  #define VM_FAULT_SIGBUS	0x0002
>  #define VM_FAULT_MAJOR	0x0004
>  #define VM_FAULT_WRITE	0x0008	/* Special case for get_user_pages */
> +#define VM_FAULT_RETRY	0x0010
>
>  #define VM_FAULT_NOPAGE	0x0100	/* ->fault installed the pte, not return page
>   

The patch got damaged here and failed to apply. I added the missing */,
and then git-am -3 applied it.

Best regards,
--Edwin

