linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: tony.luck@intel.com, Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Shan Hai <haishan.bai@gmail.com>,
	linux-kernel@vger.kernel.org, cmetcalf@tilera.com,
	dhowells@redhat.com, paulus@samba.org, tglx@linutronix.de,
	walken@google.com, linuxppc-dev@lists.ozlabs.org,
	akpm@linux-foundation.org
Subject: Re: [PATCH 1/1] Fixup write permission of TLB on powerpc e500 core
Date: Mon, 18 Jul 2011 00:29:50 +1000	[thread overview]
Message-ID: <1310912990.25044.203.camel@pasglop> (raw)
In-Reply-To: <4b337921-d430-4b63-bc36-ad31753cf800@email.android.com>

On Sun, 2011-07-17 at 11:38 +0200, Peter Zijlstra wrote:

> Whats this talk about interrup context? There is non of that involved.

Ok, let's not mix things here. I was going to fast and may have murkied
the waters. First let's get back to the futex code issue with gup:

> Furthermore, I still dont see the problem. The futex code is
> optimistically trying to poke at user memory while holding spinlocks.
> 
> We fully expect that to fail, hence the error path that drops all
> locks and does the gup(.write=1) to fix up the mapping after which we
> try again.

See below

> This has worked for years, its by no means new code. Nor do I see how
> its broken.

No it hasn't worked for years, it's been broken for years on some archs
but nobody noticed :-)

The problem I see is that gup doesn't set dirty (or young) itself. It
requires the caller to set dirty before releasing the pages basically
and there's no provision for young. Afaik, callers set it directly in
the struct page and not the PTE too (which means a spurrious fault on
subsequent access for archs that do dirty tracking).

On archs where dirty and young are SW tracked, pages that don't have
them set in the PTE will fault on access. The lack of dirty means the
page is effectively going to be read-only and the lack of young means
the page will be inaccessible.

gup itself isn't mean to fix those, at least the way it's been used so
far, the caller of gup is.

Thus the only "proper" way to fix that is to have the futex code itself
perform dirty and accessed updates, which sucks (means going back down
the page tables, taking the PTL and doing the deed).

Having the actual fault handlers do the fixup even when in atomic isn't
a good option for us:

I don't know what other archs that rely on that SW tracking do, but in
the case of powerpc, that would be problematic due to the fact that
those archs have been written with the assumption that all changes to
PTEs are done under the PTL (which allows to simplify code and thus make
things faster).

Among others, that also means no changes at interrupt time. Enabling our
fault handlers to update the dirty & young bits even while in "atomic"
context would potentially open the door to things like interrupt-time
perf backtraces to cause PTE updates, etc... (in addition to generally
breaking the locking rules for PTE modifications).

Now, I suppose we -could- differentiate preempt disabled from real
interrupt time in the fault handlers, tho that's somewhat sucky. It
might require moving the dirty/young updates from generic code to arch
code, I suppose.

I'm also not sure how risky it would be to take the PTL in that case...
code doing user accesses within pagefault_disable() might be written
with the assumption that the PTL won't be taken (it might be already
held ? I don't know what all the users are at this point, too late to
grep, I can have a look tomorrow). It makes me a bit nervous too.

A better approach might be a flag to pass to gup (via the "write"
argument ? top bits ?) to tell it to immediately perform dirty/young
updates.

Cheers,
Ben.

  reply	other threads:[~2011-07-17 14:37 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-15  8:07 [PATCH 0/1] Fixup write permission of TLB on powerpc e500 core Shan Hai
2011-07-15  8:07 ` [PATCH 1/1] " Shan Hai
2011-07-15 10:23   ` Peter Zijlstra
2011-07-15 15:18     ` Shan Hai
2011-07-15 15:24       ` Peter Zijlstra
2011-07-16 15:36         ` Shan Hai
2011-07-16 14:50     ` Shan Hai
2011-07-16 23:49       ` Benjamin Herrenschmidt
2011-07-17  9:38         ` Peter Zijlstra
2011-07-17 14:29           ` Benjamin Herrenschmidt [this message]
2011-07-17 23:14             ` Benjamin Herrenschmidt
2011-07-18  3:53               ` Benjamin Herrenschmidt
2011-07-18  4:02                 ` Benjamin Herrenschmidt
2011-07-18  4:01               ` Benjamin Herrenschmidt
2011-07-18  6:48                 ` Shan Hai
2011-07-18  7:01                   ` Benjamin Herrenschmidt
2011-07-18  7:26                     ` Shan Hai
2011-07-18  7:36                       ` Benjamin Herrenschmidt
2011-07-18  7:50                         ` Shan Hai
2011-07-19  3:30                         ` Shan Hai
2011-07-19  4:20                           ` Benjamin Herrenschmidt
2011-07-19  4:29                           ` [RFC/PATCH] mm/futex: Fix futex writes on archs with SW tracking of dirty & young Benjamin Herrenschmidt
2011-07-19  4:55                             ` Shan Hai
2011-07-19  5:17                             ` Shan Hai
2011-07-19  5:24                               ` Benjamin Herrenschmidt
2011-07-19  5:38                                 ` Shan Hai
2011-07-19  7:46                                   ` Benjamin Herrenschmidt
2011-07-19  8:24                                     ` Shan Hai
2011-07-19  8:26                                       ` [RFC/PATCH] mm/futex: Fix futex writes on archs with SW trackingof " David Laight
2011-07-19  8:45                                         ` Benjamin Herrenschmidt
2011-07-19  8:45                                         ` Shan Hai
2011-07-19 11:10                             ` [RFC/PATCH] mm/futex: Fix futex writes on archs with SW tracking of " Peter Zijlstra
2011-07-20 14:39                             ` Darren Hart
2011-07-21 22:36                             ` Andrew Morton
2011-07-21 22:52                               ` Benjamin Herrenschmidt
2011-07-21 22:57                                 ` Benjamin Herrenschmidt
2011-07-21 22:59                                 ` Andrew Morton
2011-07-22  1:40                                   ` Benjamin Herrenschmidt
2011-07-22  1:54                                   ` Shan Hai
2011-07-27  6:50                             ` Mike Frysinger
2011-07-27  7:58                               ` Benjamin Herrenschmidt
2011-07-27  8:59                                 ` Peter Zijlstra
2011-07-27 10:09                                 ` David Howells
2011-07-27 10:17                                   ` Peter Zijlstra
2011-07-27 10:20                                     ` Benjamin Herrenschmidt
2011-07-28  0:12                                       ` Mike Frysinger
2011-08-08  2:31                                     ` Mike Frysinger
2011-07-28 10:55                                   ` David Howells
2011-07-17 11:02         ` [PATCH 1/1] Fixup write permission of TLB on powerpc e500 core Peter Zijlstra
2011-07-17 13:33           ` Shan Hai
2011-07-17 14:48             ` Benjamin Herrenschmidt
2011-07-17 15:40               ` Shan Hai
2011-07-17 22:34                 ` Benjamin Herrenschmidt
2011-07-17 14:34           ` Benjamin Herrenschmidt
2011-07-15  8:20 ` [PATCH 0/1] " Peter Zijlstra
2011-07-15  8:38   ` MailingLists
2011-07-15  8:44     ` Peter Zijlstra
2011-07-15  9:08       ` Shan Hai
2011-07-15  9:12         ` Benjamin Herrenschmidt
2011-07-15  9:50         ` Peter Zijlstra
2011-07-15 10:06           ` Shan Hai
2011-07-15 10:32             ` David Laight
2011-07-15 10:39               ` Peter Zijlstra
2011-07-15 15:32               ` Shan Hai
2011-07-16  0:20                 ` Benjamin Herrenschmidt
2011-07-16 15:03                   ` Shan Hai
2011-07-15 23:47               ` Benjamin Herrenschmidt
2011-07-15  9:07     ` Benjamin Herrenschmidt
2011-07-15  9:05   ` Benjamin Herrenschmidt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1310912990.25044.203.camel@pasglop \
    --to=benh@kernel.crashing.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=cmetcalf@tilera.com \
    --cc=dhowells@redhat.com \
    --cc=haishan.bai@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=walken@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).