public inbox for linux-kernel@vger.kernel.org
From: Andrew Morton <akpm@osdl.org>
To: Christoph Lameter <clameter@sgi.com>
Cc: linux-kernel@vger.kernel.org, linux-ia64@vger.kernel.org
Subject: Re: Page fault scalability patch V18: Drop first acquisition of ptl
Date: Wed, 2 Mar 2005 20:14:25 -0800
Message-ID: <20050302201425.2b994195.akpm@osdl.org>
In-Reply-To: <Pine.LNX.4.58.0503021856380.3365@schroedinger.engr.sgi.com>

Christoph Lameter <clameter@sgi.com> wrote:
>
> On Wed, 2 Mar 2005, Andrew Morton wrote:
> 
> > > This is a related change discussed during V16 with Nick.
> >
> > It's worth retaining a paragraph for the changelog.
> 
> There have been extensive discussions on all aspects of this patch.
> This issue was discussed in
> http://marc.theaimsgroup.com/?t=110694497200004&r=1&w=2

This is a difficult, intrusive and controversial patch.  Things like the
above should be done in a separate patch.  Not only does this aid
maintainability, it also allows the change to be performance tested in
isolation.

If the change gets folded into other changes then it would be best to draw
attention to, and fully explain/justify the change within the changelog.

> >
> > > The page is protected from munmap because of the down_read(mmap_sem) in
> > > the arch specific code before calling handle_mm_fault.
> >
> > We don't take mmap_sem during page reclaim.  What prevents the page from
> > being freed by, say, kswapd?
> 
> The cmpxchg will fail if that happens.

How about if someone does remap_file_pages() against that virtual address
and that syscall happens to pick the same physical page?  We have the same
physical page at the same pte slot with different contents, and the cmpxchg
will succeed.
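
To make the concern concrete, here is a minimal userspace sketch of that
sequence.  It uses C11 atomics rather than the kernel's pte_cmpxchg, and
fake_pte_t, commit_if_unchanged(), aba_demo() and the pte values are all
hypothetical names and numbers chosen for illustration:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a pte: one 64-bit word (pfn | flag bits). */
typedef _Atomic uint64_t fake_pte_t;

/*
 * Lockless commit: install new_val only if the pte word still equals
 * the snapshot taken earlier.  This compares bits, not page contents.
 */
static bool commit_if_unchanged(fake_pte_t *pte, uint64_t snapshot,
                                uint64_t new_val)
{
    return atomic_compare_exchange_strong(pte, &snapshot, new_val);
}

/* Returns true if the stale commit was accepted (the A-B-A case). */
static bool aba_demo(void)
{
    fake_pte_t pte = 0x1234001;            /* made-up pfn | present bit */
    uint64_t snapshot = atomic_load(&pte); /* fault path reads the pte  */

    /*
     * Meanwhile the mapping is torn down and re-established, and the
     * new mapping happens to reuse the same physical page, so the pte
     * ends up with identical bits although the page contents differ.
     */
    atomic_store(&pte, 0);
    atomic_store(&pte, 0x1234001);

    /* The compare sees the same bits and cannot tell anything changed. */
    return commit_if_unchanged(&pte, snapshot, 0x5678003);
}
```

The compare only proves that the pte word is bit-identical to the
snapshot, not that nothing happened in between.  Whether the real fault
path can reach this state depends on what mmap_sem excludes.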

Maybe mmap_sem will save us, maybe not.  Either way, this change needs a
ton of analysis, justification and documentation, please.

Plus if the page gets freed under our feet, CONFIG_DEBUG_PAGEALLOC will
oops during the copy.

> > I forget.  I do recall that we decided that the change was OK, but briefly
> > looking at it now, it seems that we'll fail to move a
> > PageReferenced,!PageActive onto the active list?
> 
> See http://marc.theaimsgroup.com/?l=bk-commits-head&m=110481975332117&w=2
> 
> and
> 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=110272296503539&w=2

Those are different cases.  I still don't see why the change is justified in
do_swap_page().

> > > That is up to the arch maintainers.  Add something to arch/xx/Kconfig to
> > > allow atomic operations for an arch.  Out of the box it only works for
> > > x86_64, ia64 and ia32.
> >
> > Feedback from s390, sparc64 and ppc64 people would help in making a merge
> > decision.
>
> These architectures do not have atomic pte's enabled.  It would require
> them to submit a patch to activate atomic pte's for these architectures.


But if the approach which these patches take is not suitable for these
architectures then they have no solution to the scalability problem.  The
machines will perform suboptimally and more (perhaps conflicting)
development will be needed.

> > > Earlier releases back in September 2004 had some pte locking code (and
> > > AFAIK Nick also played around with pte locking) but that
> > > was less efficient than atomic operations.
> >
> > How much less efficient?
> > Does anyone else have that code around?
> 
> Nick may have some data. It got far too complicated too fast when I tried
> to introduce locking for individual ptes. It required bit
> spinlocks for the pte meaning multiple atomic operations.

One could add a spinlock to the pageframe, or use hashed spinlocking.
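
As a rough userspace illustration of the hashed variant: a small fixed
table of locks, indexed by a hash of the pte's address, so that no
per-pte lock bit is needed.  This is a sketch only; the table size, the
hash constant and the hashed_pte_lock()/hashed_pte_unlock() names are
assumptions made for the example, not existing kernel interfaces:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define PTE_LOCK_BITS 6                      /* assumed table size: 64 */
#define NR_PTE_LOCKS  (1u << PTE_LOCK_BITS)

/* Zero-initialized: 0 means unlocked, 1 means locked. */
static atomic_int pte_locks[NR_PTE_LOCKS];

/* Map a pte address to a lock slot using a multiplicative hash. */
static unsigned pte_lock_index(const void *pte)
{
    uint64_t key = (uint64_t)(uintptr_t)pte >> 3;  /* drop alignment bits */

    return (unsigned)((key * 0x9E3779B97F4A7C15ull) >> (64 - PTE_LOCK_BITS));
}

static void hashed_pte_lock(const void *pte)
{
    atomic_int *l = &pte_locks[pte_lock_index(pte)];

    /* Spin until we are the one who flips the slot from 0 to 1. */
    while (atomic_exchange_explicit(l, 1, memory_order_acquire))
        ;
}

static void hashed_pte_unlock(const void *pte)
{
    atomic_int *l = &pte_locks[pte_lock_index(pte)];

    assert(atomic_load(l) == 1);    /* must be held by the caller */
    atomic_store_explicit(l, 0, memory_order_release);
}
```

Unrelated ptes that hash to the same slot contend with each other, so
the table size trades memory against false sharing of locks.  Each
lock/unlock pair is still one atomic operation plus a releasing store.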

> One
> would have to check for the lock being active leading to significant code
> changes.

Why?

> This would include the arch specific low level fault handlers to
> update bits, walk the page table etc etc.



