Re: [PATCH 0/5] Volatile Ranges (v12) & LSF-MM discussion fodder

From: John Stultz <john.stultz@linaro.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Hansen <dave@sr71.net>, "H. Peter Anvin" <hpa@zytor.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Android Kernel Team <kernel-team@android.com>,
	Robert Love <rlove@google.com>, Mel Gorman <mel@csn.ul.ie>,
	Hugh Dickins <hughd@google.com>, Rik van Riel <riel@redhat.com>,
	Dmitry Adamushko <dmitry.adamushko@gmail.com>,
	Neil Brown <neilb@suse.de>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Mike Hommey <mh@glandium.org>, Taras Glek <tglek@mozilla.com>,
	Jan Kara <jack@suse.cz>,
	KOSAKI Motohiro <kosaki.motohiro@gmail.com>,
	Michel Lespinasse <walken@google.com>,
	Minchan Kim <minchan@kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH 0/5] Volatile Ranges (v12) & LSF-MM discussion fodder
Date: Wed, 02 Apr 2014 13:13:34 -0700
Message-ID: <533C6F6E.4080601@linaro.org>
In-Reply-To: <20140402194708.GV14688@cmpxchg.org>

On 04/02/2014 12:47 PM, Johannes Weiner wrote:
> On Wed, Apr 02, 2014 at 12:01:00PM -0700, John Stultz wrote:
>> On Wed, Apr 2, 2014 at 10:58 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
>>> On Wed, Apr 02, 2014 at 10:40:16AM -0700, John Stultz wrote:
>>>> That point beside, I think the other problem with the page-cleaning
>>>> volatility approach is that there are other awkward side effects. For
>>>> example: Say an application marks a range as volatile. One page in the
>>>> range is then purged. The application, due to a bug or otherwise,
>>>> reads the volatile range. This causes the page to be zero-filled in,
>>>> and the application silently uses the corrupted data (which isn't
>>>> great). More problematic though, is that by faulting the page in,
>>>> they've in effect lost the purge state for that page. When the
>>>> application then goes to mark the range as non-volatile, all pages are
>>>> present, so we'd return that no pages were purged.  From an
>>>> application perspective this is pretty ugly.
>>>>
>>>> Johannes: Any thoughts on this potential issue with your proposal? Am
>>>> I missing something else?
>>> No, this is accurate.  However, I don't really see how this is
>>> different than any other use-after-free bug.  If you access malloc
>>> memory after free(), you might receive a SIGSEGV, you might see random
>>> data, you might corrupt somebody else's data.  This certainly isn't
>>> nice, but it's not exactly new behavior, is it?
>> The part that troubles me is that I see the purged state as kernel
>> data being corrupted by userland in this case. The kernel will tell
>> userspace that no pages were purged, even though they were. Only
>> because userspace made an errant read of a page, and got garbage data
>> back.
> That sounds overly dramatic to me.  First of all, this data still
> reflects accurately the actions of userspace in this situation.  And
> secondly, the kernel does not rely on this data to be meaningful from
> a userspace perspective to function correctly.
<insert dramatic-chipmunk video w/ text overlay "errant read corrupted
volatile page purge state!!!!1">

Maybe you're right, but I feel this is the sort of thing application
developers would be surprised and annoyed by.


> It's really nothing but a use-after-free bug that has consequences for
> no-one but the faulty application.  The thing that IS new is that even
> a read is enough to corrupt your data in this case.
>
> MADV_REVIVE could return 0 if all pages in the specified range were
> present, -Esomething if otherwise.  That would be semantically sound
> even if userspace messes up.

So it's semantically just a combined mincore+dirty operation, and
nothing more?
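
To make the scenario concrete, here is a minimal userspace toy model of
the zero-fill semantics under discussion. It is a sketch only, not kernel
code: the class ToyRange, its methods, and the -1 return value standing in
for "-Esomething" are all made up for illustration. The only thing taken
from the thread is the idea that a MADV_REVIVE-style call reports on page
presence alone.

```python
class ToyRange:
    """Toy model of a volatile range under zero-fill semantics (hypothetical)."""

    def __init__(self, npages):
        # Every page starts out present with recognizable contents.
        self.present = [True] * npages
        self.data = [0xAB] * npages
        self.volatile = False

    def mark_volatile(self):
        self.volatile = True

    def purge(self, page):
        # Stand-in for the kernel reclaiming one page under memory pressure.
        if self.volatile and self.present[page]:
            self.present[page] = False
            self.data[page] = None

    def read(self, page):
        # Zero-fill on fault: reading a purged page makes it present again.
        if not self.present[page]:
            self.present[page] = True
            self.data[page] = 0
        return self.data[page]

    def revive(self):
        # MADV_REVIVE-style check that looks only at page presence:
        # 0 if every page is present, -1 (standing in for -Esomething) otherwise.
        self.volatile = False
        return 0 if all(self.present) else -1


def demo(errant_read):
    """Purge one page, optionally read it by mistake, then revive the range."""
    r = ToyRange(4)
    r.mark_volatile()
    r.purge(2)
    if errant_read:
        r.read(2)  # buggy access while the range is still volatile
    return r.revive()
```

In this model demo(False) reports the purge, while demo(True) reports
success even though page 2 now holds zeros instead of the original data.
That is the "read clears purged state" effect described above.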

What are other folks thinking about this? Although I don't particularly
like it, I probably could go along with Johannes' approach, forgoing
SIGBUS for zero-fill and adopting semantics that are, to my mind, a
bit stranger. This would allow for ashmem-style behavior w/ the
additional write-clears-volatile-state and read-clears-purged-state
constraints (which I don't think would be problematic for Android, but
I am not totally sure).

But I do worry that these semantics are easier for kernel-mm-developers
to grasp, but are much much harder for application developers to
understand.

Additionally, unless we could really leave access-after-volatile as
totally undefined behavior, this would lock us into O(page) behavior and
would remove the possibility of the O(log(ranges)) behavior Minchan and I
were able to get (admittedly with more complicated code, but something
I was hoping we'd be able to get back to after the base semantics and
interface behavior were understood and merged). Since applications will
have bugs and will access after volatile, we won't be able to get away
with that sort of behavioral flexibility.
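
As a rough illustration of the cost difference, the sketch below contrasts
a per-page presence scan with a lookup in a sorted table of ranges. It is
a toy under stated assumptions, not the vrange implementation: RangeTable,
its methods, and per_page_was_purged are hypothetical names, ranges are
assumed not to overlap, and the purged flag is assumed to be kept per
range. The bisect lookup is O(log(ranges)); the list insert here is
linear, so a real implementation would want a balanced tree.

```python
import bisect


def per_page_was_purged(present, start, end):
    # O(pages): every page in [start, end) has to be inspected.
    return any(not present[p] for p in range(start, end))


class RangeTable:
    """Sorted table of non-overlapping [start, end) ranges (hypothetical)."""

    def __init__(self):
        self.starts = []  # sorted start pages, used for binary search
        self.ranges = []  # parallel list of [start, end, purged]

    def add(self, start, end):
        i = bisect.bisect_left(self.starts, start)
        self.starts.insert(i, start)
        self.ranges.insert(i, [start, end, False])

    def find(self, page):
        # O(log(ranges)): binary search for the range containing the page.
        i = bisect.bisect_right(self.starts, page) - 1
        if i >= 0 and self.ranges[i][0] <= page < self.ranges[i][1]:
            return self.ranges[i]
        return None

    def purge(self, page):
        # The purged flag belongs to the range, not to the page.
        r = self.find(page)
        if r is not None:
            r[2] = True

    def was_purged(self, page):
        r = self.find(page)
        return r is not None and r[2]
```

Because the flag lives on the range rather than on page presence, a later
fault on a purged page does not erase it. That only holds together if the
result of such an access is allowed to stay loosely specified.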

thanks
-john
