All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Waychison <mikew@google.com>
To: "Török Edwin" <edwintorok@gmail.com>
Cc: Nick Piggin <npiggin@suse.de>, Ying Han <yinghan@google.com>,
	Ingo Molnar <mingo@elte.hu>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Rohit Seth <rohitseth@google.com>,
	Hugh Dickins <hugh@veritas.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [RFC v1][PATCH]page_fault retry with NOPAGE_RETRY
Date: Sun, 30 Nov 2008 20:50:46 -0800	[thread overview]
Message-ID: <49336D26.2060607@google.com> (raw)
In-Reply-To: <4932EF90.9070601@gmail.com>

Török Edwin wrote:
> On 2008-11-29 01:02, Mike Waychison wrote:
>> Nick Piggin wrote:
>>> On Thu, Nov 27, 2008 at 11:03:40AM -0800, Mike Waychison wrote:
>>>> Nick Piggin wrote:
>>>>> On Thu, Nov 27, 2008 at 01:28:41AM -0800, Mike Waychison wrote:
>>>>>> Török however identified mmap taking on the order of several
>>>>>> milliseconds due to this exact problem:
>>>>>>
>>>>>> http://lkml.org/lkml/2008/9/12/185
>>>>> Turns out to be a different problem.
>>>>>
>>>> What do you mean?
>>> His is just contending on the write side. The retry patch doesn't help.
>>>
>> I disagree.  How do you get 'write contention' from the following
>> paragraph:
>>
>> "Just to confirm that the problem is with pagefaults and mmap, I dropped
>> the mmap_sem in filemap_fault, and then
>> I got same performance in my testprogram for mmap and read. Of course
>> this is totally unsafe, because the mapping could change at any time."
>>
>> It reads to me that the writers were held off by the readers sleeping
>> in IO.
> 
> It is true that I have a write/write contention too, but do_page_fault
> shows up too on lock_stat.
> 
> This is my guess at what happens:
> * filemap_fault used to sleep with mmap_sem held while waiting for the
> page lock.
> * the google patch avoids that, which is fine: if page lock can't be
> taken, it drops mmap_sem, waits, then retries the fault once
> * however after we acquired the page lock, mapping->a_ops->readpage is
> invoked, mmap_sem is NOT dropped here:
> 
>     error = mapping->a_ops->readpage(file, page);
>     if (!error) {
>         wait_on_page_locked(page);
> 
> If my understanding is correct ->readpage does the actual disk I/O, and
> it keeps the page locked, when the lock is released we know it has finished.
> So wait_on_page_locked(page)  holds mmap_sem locked for read during the
> disk I/O, preventing sys_mmap/sys_munmap from making progress.
> 
> I don't know how to prove/disprove my guess above, suggestions welcome.
> 
> Could the patch be changed to also release the mmap_sem after readpage,
> and before wait_on_page_locked?

Ya, my suspicion is that there is still some other code path where we 
are waiting on the locked page with mmap_sem still held.  Ying and I 
will take a closer look this week.

WARNING: multiple messages have this Message-ID (diff)
From: Mike Waychison <mikew@google.com>
To: "Török Edwin" <edwintorok@gmail.com>
Cc: Nick Piggin <npiggin@suse.de>, Ying Han <yinghan@google.com>,
	Ingo Molnar <mingo@elte.hu>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Rohit Seth <rohitseth@google.com>,
	Hugh Dickins <hugh@veritas.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [RFC v1][PATCH]page_fault retry with NOPAGE_RETRY
Date: Sun, 30 Nov 2008 20:50:46 -0800	[thread overview]
Message-ID: <49336D26.2060607@google.com> (raw)
In-Reply-To: <4932EF90.9070601@gmail.com>

Torok Edwin wrote:
> On 2008-11-29 01:02, Mike Waychison wrote:
>> Nick Piggin wrote:
>>> On Thu, Nov 27, 2008 at 11:03:40AM -0800, Mike Waychison wrote:
>>>> Nick Piggin wrote:
>>>>> On Thu, Nov 27, 2008 at 01:28:41AM -0800, Mike Waychison wrote:
>>>>>> Torok however identified mmap taking on the order of several
>>>>>> milliseconds due to this exact problem:
>>>>>>
>>>>>> http://lkml.org/lkml/2008/9/12/185
>>>>> Turns out to be a different problem.
>>>>>
>>>> What do you mean?
>>> His is just contending on the write side. The retry patch doesn't help.
>>>
>> I disagree.  How do you get 'write contention' from the following
>> paragraph:
>>
>> "Just to confirm that the problem is with pagefaults and mmap, I dropped
>> the mmap_sem in filemap_fault, and then
>> I got same performance in my testprogram for mmap and read. Of course
>> this is totally unsafe, because the mapping could change at any time."
>>
>> It reads to me that the writers were held off by the readers sleeping
>> in IO.
> 
> It is true that I have a write/write contention too, but do_page_fault
> shows up too on lock_stat.
> 
> This is my guess at what happens:
> * filemap_fault used to sleep with mmap_sem held while waiting for the
> page lock.
> * the google patch avoids that, which is fine: if page lock can't be
> taken, it drops mmap_sem, waits, then retries the fault once
> * however after we acquired the page lock, mapping->a_ops->readpage is
> invoked, mmap_sem is NOT dropped here:
> 
>     error = mapping->a_ops->readpage(file, page);
>     if (!error) {
>         wait_on_page_locked(page);
> 
> If my understanding is correct ->readpage does the actual disk I/O, and
> it keeps the page locked, when the lock is released we know it has finished.
> So wait_on_page_locked(page)  holds mmap_sem locked for read during the
> disk I/O, preventing sys_mmap/sys_munmap from making progress.
> 
> I don't know how to prove/disprove my guess above, suggestions welcome.
> 
> Could the patch be changed to also release the mmap_sem after readpage,
> and before wait_on_page_locked?

Ya, my suspicion is that there is still some other code path where we 
are waiting on the locked page with mmap_sem still held.  Ying and I 
will take a closer look this week.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2008-12-01  4:52 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-22  6:47 [RFC v1][PATCH]page_fault retry with NOPAGE_RETRY Ying Han
2008-11-22  6:47 ` Ying Han
2008-11-22  7:15 ` Andrew Morton
2008-11-22  7:15   ` Andrew Morton
2008-11-23  9:18 ` Ingo Molnar
2008-11-23  9:18   ` Ingo Molnar
2008-11-23 18:24   ` Andrew Morton
2008-11-23 18:24     ` Andrew Morton
2008-11-25 18:42   ` Ying Han
2008-11-25 18:42     ` Ying Han
2008-11-26 12:32     ` Nick Piggin
2008-11-26 12:32       ` Nick Piggin
2008-11-26 19:57       ` Mike Waychison
2008-11-26 19:57         ` Mike Waychison
2008-11-27  8:55         ` Nick Piggin
2008-11-27  8:55           ` Nick Piggin
2008-11-27  9:28           ` Mike Waychison
2008-11-27  9:28             ` Mike Waychison
2008-11-27 10:00             ` Peter Zijlstra
2008-11-27 10:00               ` Peter Zijlstra
2008-11-27 10:14               ` Nick Piggin
2008-11-27 10:14                 ` Nick Piggin
2008-11-27 19:22                 ` Mike Waychison
2008-11-27 19:22                   ` Mike Waychison
2008-11-28  9:41                   ` Nick Piggin
2008-11-28  9:41                     ` Nick Piggin
2008-11-28 22:46                     ` Mike Waychison
2008-11-28 22:46                       ` Mike Waychison
2008-11-27 11:08               ` KOSAKI Motohiro
2008-11-27 11:08                 ` KOSAKI Motohiro
2008-11-27 19:10               ` Mike Waychison
2008-11-27 19:10                 ` Mike Waychison
2008-11-27 11:39             ` Török Edwin
2008-11-27 11:39               ` Török Edwin
2008-11-27 12:03               ` Nick Piggin
2008-11-27 12:03                 ` Nick Piggin
2008-11-27 12:21                 ` Török Edwin
2008-11-27 12:21                   ` Török Edwin
2008-11-27 12:32                   ` Peter Zijlstra
2008-11-27 12:32                     ` Peter Zijlstra
2008-11-27 12:39                   ` Nick Piggin
2008-11-27 12:39                     ` Nick Piggin
2008-11-27 12:52                     ` Török Edwin
2008-11-27 12:52                       ` Török Edwin
2008-11-27 13:05                       ` Nick Piggin
2008-11-27 13:05                         ` Nick Piggin
2008-11-27 13:10                         ` Török Edwin
2008-11-27 13:10                           ` Török Edwin
2008-11-27 13:12                           ` Nick Piggin
2008-11-27 13:12                             ` Nick Piggin
2008-11-27 13:23                             ` Török Edwin
2008-11-27 13:23                               ` Török Edwin
2008-11-28 12:10                               ` Nick Piggin
2008-11-28 12:10                                 ` Nick Piggin
2008-11-30 19:38                                 ` Török Edwin
2008-11-30 19:38                                   ` Török Edwin
2008-12-01  8:52                                   ` Nick Piggin
2008-12-01  8:52                                     ` Nick Piggin
2008-12-01 11:13                                   ` Nick Piggin
2008-12-01 11:13                                     ` Nick Piggin
2008-12-01 11:37                                     ` Török Edwin
2008-12-01 11:37                                       ` Török Edwin
2008-12-04 22:27                       ` Ying Han
2008-12-04 22:27                         ` Ying Han
2008-12-05  6:50                         ` Török Edwin
2008-12-05  6:50                           ` Török Edwin
2008-11-27 13:08             ` Nick Piggin
2008-11-27 13:08               ` Nick Piggin
2008-11-27 19:03               ` Mike Waychison
2008-11-27 19:03                 ` Mike Waychison
2008-11-28  9:37                 ` Nick Piggin
2008-11-28  9:37                   ` Nick Piggin
2008-11-28 23:02                   ` Mike Waychison
2008-11-28 23:02                     ` Mike Waychison
2008-11-30 19:54                     ` Török Edwin
2008-11-30 19:54                       ` Török Edwin
2008-12-01  4:50                       ` Mike Waychison [this message]
2008-12-01  4:50                         ` Mike Waychison
2008-12-01  8:58                       ` Nick Piggin
2008-12-01  8:58                         ` Nick Piggin
2008-12-01 11:45                     ` Nick Piggin
2008-12-01 11:45                       ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=49336D26.2060607@google.com \
    --to=mikew@google.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=edwintorok@gmail.com \
    --cc=hpa@zytor.com \
    --cc=hugh@veritas.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@elte.hu \
    --cc=npiggin@suse.de \
    --cc=rientjes@google.com \
    --cc=rohitseth@google.com \
    --cc=yinghan@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.