From: Minchan Kim <minchan@kernel.org>
To: Paul Turner <pjt@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
John Stultz <john.stultz@linaro.org>,
Christoph Lameter <cl@linux.com>,
Android Kernel Team <kernel-team@android.com>,
Robert Love <rlove@google.com>, Mel Gorman <mel@csn.ul.ie>,
Hugh Dickins <hughd@google.com>,
Dave Hansen <dave@linux.vnet.ibm.com>,
Rik van Riel <riel@redhat.com>,
Dave Chinner <david@fromorbit.com>, Neil Brown <neilb@suse.de>,
Mike Hommey <mh@glandium.org>, Taras Glek <tglek@mozilla.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
sanjay@google.com, David Rientjes <rientjes@google.com>
Subject: Re: [RFC v2] Support volatile range for anon vma
Date: Thu, 1 Nov 2012 10:46:04 +0900
Message-ID: <20121101014604.GE26256@bbox>
In-Reply-To: <CAPM31RJwrM2f8fg0--Xcea+tHYcB2C_khXy3k-h=O2x4MMfwmw@mail.gmail.com>
On Wed, Oct 31, 2012 at 06:15:33PM -0700, Paul Turner wrote:
> On Wed, Oct 31, 2012 at 3:56 PM, KOSAKI Motohiro
> <kosaki.motohiro@gmail.com> wrote:
> >>> > The allocator should call madvise(MADV_NOVOLATILE) before handing that
> >>> > area back out to the user. Otherwise, an access to the volatile range
> >>> > will raise SIGBUS.
> >>>
> >>> Well, why? It would be easy enough for the fault handler to give
> >>> userspace a new, zeroed page at that address.
> >>
> >> Note: MADV_DONTNEED already has this (nice) property.
> >
> > I don't think I fully understand this patch, but maybe I can explain why
> > userland and malloc folks don't like MADV_DONTNEED.
> >
> > glibc malloc discards freed memory with MADV_DONTNEED, as tcmalloc does,
> > and that is often a source of large performance loss. MADV_DONTNEED
> > discards the memory immediately, so a malloc() call made right afterwards
> > falls into the page fault and page-size memset() path. Using DONTNEED
> > therefore increases zero-fill work and the cache miss rate.
> >
> > At free() time, malloc does not know when the next big malloc() will be
> > called, so discarding immediately may or may not give a performance
> > gain. (Ah, ok, the ratio is not 5:5, so usually it is worth it, but not every time.)
> >
>
> Ah; In tcmalloc allocations (and their associated free-lists) are
> binned into separate lists as a function of object-size which helps to
> mitigate this.
>
> I'd make a separate more general argument here:
> If I'm allocating a large (multi-kilobyte) object, the cost of what I'm
> about to do with that object is likely fairly large. The fault/zero
> cost is probably a fairly small proportion of it, which limits the
> optimization value.
While looking at Rik's earlier attempt, which had the same goal but a
different implementation, I found these numbers:
https://lkml.org/lkml/2007/4/20/390
I believe the optimization is valuable. Of course, I need to run similar
tests to prove it.
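
To make the fault/zero cost discussed above concrete, here is a minimal sketch, assuming Linux and glibc. It dirties an anonymous mapping, drops it with MADV_DONTNEED, touches it again, and reports how many minor faults the re-touch took and whether the pages came back zero-filled. The helpers minor_faults() and dontneed_refault_cost() are hypothetical names used only for illustration. They are not part of the patch, and this is not a benchmark.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: minor page faults taken by this process so far. */
static long minor_faults(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt;
}

/*
 * Hypothetical helper: dirty npages anonymous pages, discard them with
 * MADV_DONTNEED, then touch every page again.
 * Returns the number of minor faults taken while re-touching, or -1 on
 * error. *all_zero is set to 1 if every page read back as zero.
 */
static long dontneed_refault_cost(size_t npages, int *all_zero)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * page;
    unsigned char *p;
    long before, after;

    assert(npages > 0);
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAA, len);                 /* make every page resident and dirty */
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    before = minor_faults();
    *all_zero = 1;
    for (size_t i = 0; i < npages; i++) {
        volatile unsigned char *q = p + i * page;

        if (*q != 0)                      /* old contents are gone after DONTNEED */
            *all_zero = 0;
        *q = 1;                           /* the write forces a private zeroed page */
    }
    after = minor_faults();

    munmap(p, len);
    return after - before;
}
```

Calling dontneed_refault_cost(64, &z) and comparing the result with the same loop run without the madvise() call gives a rough feel for the extra faults an allocator pays when it reuses a discarded range immediately.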
>
> >
> > In the past, several developers tried to avoid this situation, for example by
> >
> > - adding a zero page daemon to avoid the page-size zero fill at page fault time
> > - adding new vma or page flags that mark memory as discardable without swap,
> > and having vmscan handle it (like this patch and/or MADV_FREE)
> > - adding a new process option that skips page zero fill in the page fault path
> > (yes, it is a big incompatibility and insecure, but some embedded folks thought
> > that was an acceptable downside)
> > - etc
> >
> >
> > btw, I'm not sure this patch is better for malloc, because the current MADV_DONTNEED
> > doesn't need mmap_sem and works very effectively when there are a lot of threads.
> > Taking mmap_sem might give worse performance than DONTNEED. dunno.
>
> MADV_VOLATILE also seems to end up looking quite similar to a
> user-visible (range-based) cleancache.
>
> A second popular use-case for such semantics is the case of
> discardable cache elements (e.g. web browser). I suspect we'd want to
> at least mention these in the changelog. (Alternatively, what does a
> cleancache-backed-fs exposing these semantics look like?)
>
That is John Stultz's work (http://lwn.net/Articles/518130/; there was another
attempt a long time ago, https://lkml.org/lkml/2005/11/1/384). I want to
extend the concept from file-backed pages to anonymous pages, so this patch
is an attempt at that for anonymous pages. The use case for my patch has
therefore focused on the malloc/free case.
I hope the two can be unified.
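
For the malloc/free case, a minimal sketch of what the allocator side might look like, assuming Linux and using MADV_FREE (mentioned in the list above, and available in mainline Linux since 4.5) as a stand-in, because MADV_VOLATILE/MADV_NOVOLATILE from this RFC are not in any released kernel. The hook names span_release() and span_reuse() are hypothetical. With the semantics proposed in this patch, span_reuse() would instead have to call madvise(MADV_NOVOLATILE) before returning the range.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical allocator hook: called when a page-aligned span of
 * anonymous memory goes onto the free list.
 * Returns 0 on success, -1 on error.
 */
static int span_release(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);
#ifdef MADV_FREE
    /* Lazy free: the kernel may drop the pages later, under memory pressure. */
    if (madvise(addr, len, MADV_FREE) == 0)
        return 0;
    if (errno != EINVAL)
        return -1;
    /* EINVAL: the running kernel does not accept MADV_FREE here; fall back. */
#endif
    /* Immediate discard: the next touch faults in a zeroed page. */
    return madvise(addr, len, MADV_DONTNEED);
}

/*
 * Hypothetical allocator hook: called when the span is handed out again.
 * Neither MADV_FREE nor MADV_DONTNEED needs a second madvise() call here:
 * the first write either cancels the lazy free or faults in a zeroed page.
 */
static void *span_reuse(void *addr, size_t len)
{
    (void)len;
    return addr;
}
```

The point of the sketch is the asymmetry: both stand-ins need no call on reuse, whereas the volatile-range design trades an extra call on reuse for the ability to tell the allocator whether the old contents survived.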
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
--
Kind regards,
Minchan Kim