From: Eric B Munson <emunson-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
To: Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org>
Cc: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>,
Rik van Riel <riel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Andrew Morton
<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>,
Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org>,
Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
Mel Gorman <mgorman-l3A5Bk7waGM@public.gmane.org>,
David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Linux API <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [PATCH V5] Allow compaction of unevictable pages
Date: Mon, 16 Mar 2015 09:49:56 -0400 [thread overview]
Message-ID: <20150316134956.GA15324@akamai.com> (raw)
In-Reply-To: <5506ACEC.9010403-AlSwsSmVLrQ@public.gmane.org>
[-- Attachment #1: Type: text/plain, Size: 4935 bytes --]
On Mon, 16 Mar 2015, Vlastimil Babka wrote:
> [CC += linux-api@]
>
> Since this is a kernel-user-space API change, please CC linux-api@.
> The kernel source file Documentation/SubmitChecklist notes that all
> Linux kernel patches that change userspace interfaces should be CCed
> to linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, so that the various parties who are
> interested in API changes are informed. For further information, see
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.kernel.org_doc_man-2Dpages_linux-2Dapi-2Dml.html&d=AwIC-g&c=96ZbZZcaMF4w0F4jpN6LZg&r=aUmMDRRT0nx4IfILbQLv8xzE0wB9sQxTHI3QrQ2lkBU&m=GUotTNnv26L0HxtXrBgiHqu6kwW3ufx2_TQpXIA216c&s=IFFYQ7Zr-4SIaF3slOZqiSP_noyva42kCwVRxxDm5wo&e=
Added to the Cc list, thanks.
>
>
> On 03/13/2015 09:19 PM, Michal Hocko wrote:
> >On Fri 13-03-15 15:09:15, Eric B Munson wrote:
> >>On Fri, 13 Mar 2015, Rik van Riel wrote:
> >>
> >>>On 03/13/2015 01:26 PM, Eric B Munson wrote:
> >>>
> >>>>--- a/mm/compaction.c
> >>>>+++ b/mm/compaction.c
> >>>>@@ -1046,6 +1046,8 @@ typedef enum {
> >>>> ISOLATE_SUCCESS, /* Pages isolated, migrate */
> >>>> } isolate_migrate_t;
> >>>>
> >>>>+int sysctl_compact_unevictable;
>
> A comment here would be useful I think, as well as explicit default
> value. Maybe also __read_mostly although I don't know how much that
> matters.
I am going to sit on V6 for a couple of days incase anyone from rt wants
to chime in. But these will be in V6.
>
> I also wonder if it might be confusing that "compact_memory" is a
> write-only trigger that doesn't even show under "sysctl -a", while
> "compact_unevictable" is a read/write setting. But I don't have a
> better suggestion right now.
Does allow_unevictable_compaction sound better? It feels too much like
variable naming conventions from other languages which seems to
encourage verbosity to me, but does indicate a difference from
compact_memory.
>
> >>>>+
> >>>> /*
> >>>> * Isolate all pages that can be migrated from the first suitable block,
> >>>> * starting at the block pointed to by the migrate scanner pfn within
> >>>
> >>>I suspect that the use cases where users absolutely do not want
> >>>unevictable pages migrated are special cases, and it may make
> >>>sense to enable sysctl_compact_unevictable by default.
> >>
> >>Given that sysctl_compact_unevictable=0 is the way the kernel behaves
> >>now and the push back against always enabling compaction on unevictable
> >>pages, I left the default to be the behavior as it is today.
> >
> >The question is _why_ we have this behavior now. Is it intentional?
>
> It's there since 748446bb6 ("mm: compaction: memory compaction
> core"). Commit c53919adc0 ("mm: vmscan: remove lumpy reclaim")
> changes the comment in __isolate_lru_page() handling of unevictable
> pages to mention compaction explicitly. It could have been
> accidental in 748446bb6 though, maybe it just reused
> __isolate_lru_page() for compaction - it seems that the skipping of
> unevictable was initially meant to optimize lumpy reclaim.
>
> >e46a28790e59 (CMA: migrate mlocked pages) is a precedence in that
>
> Well, CMA and realtime kernels are probably mutually exclusive enough.
>
> >direction. Vlastimil has then changed that by edc2ca612496 (mm,
> >compaction: move pageblock checks up from isolate_migratepages_range()).
> >There is no mention about mlock pages so I guess it was more an
> >unintentional side effect of the patch. At least that is my current
> >understanding. I might be wrong here.
>
> Although that commit did change unintentionally more details that I
> would have liked (unfortunately), I think you are wrong on this one.
> ISOLATE_UNEVICTABLE is still passed from
> isolate_migratepages_range() which is used by CMA, while the
> compaction variant isolate_migratepages() does not pass it. So it's
> kept CMA-specific as before.
>
> >The thing about RT is that it is not usable with the upstream kernel
> >without the RT patchset AFAIU. So the default should be reflect what is
> >better for the standard kernel. RT loads have to tune the system anyway
> >so it is not so surprising they would disable this option as well. We
> >should help those guys and do not require them to touch the code but the
> >knob is reasonable IMHO.
> >
> >Especially when your changelog suggests that having this enabled by
> >default is beneficial for the standard kernel.
>
> I agree, but if there's a danger of becoming too of a bikeshed
> topic, I'm fine with keeping the default same as current behavior
> and changing it later. Or maybe we should ask some -rt mailing list
> instead of just Peter and Thomas?
According to the rt wiki, there is no -rt development list so lkml is
it. I will change the default to 1 for V6 if I don't hear otherwise by
the time I get back around to spinning V6.
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
WARNING: multiple messages have this Message-ID (diff)
From: Eric B Munson <emunson@akamai.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>, Rik van Riel <riel@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
Christoph Lameter <cl@linux.com>,
Peter Zijlstra <peterz@infradead.org>,
Mel Gorman <mgorman@suse.de>,
David Rientjes <rientjes@google.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCH V5] Allow compaction of unevictable pages
Date: Mon, 16 Mar 2015 09:49:56 -0400 [thread overview]
Message-ID: <20150316134956.GA15324@akamai.com> (raw)
In-Reply-To: <5506ACEC.9010403@suse.cz>
[-- Attachment #1: Type: text/plain, Size: 4911 bytes --]
On Mon, 16 Mar 2015, Vlastimil Babka wrote:
> [CC += linux-api@]
>
> Since this is a kernel-user-space API change, please CC linux-api@.
> The kernel source file Documentation/SubmitChecklist notes that all
> Linux kernel patches that change userspace interfaces should be CCed
> to linux-api@vger.kernel.org, so that the various parties who are
> interested in API changes are informed. For further information, see
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.kernel.org_doc_man-2Dpages_linux-2Dapi-2Dml.html&d=AwIC-g&c=96ZbZZcaMF4w0F4jpN6LZg&r=aUmMDRRT0nx4IfILbQLv8xzE0wB9sQxTHI3QrQ2lkBU&m=GUotTNnv26L0HxtXrBgiHqu6kwW3ufx2_TQpXIA216c&s=IFFYQ7Zr-4SIaF3slOZqiSP_noyva42kCwVRxxDm5wo&e=
Added to the Cc list, thanks.
>
>
> On 03/13/2015 09:19 PM, Michal Hocko wrote:
> >On Fri 13-03-15 15:09:15, Eric B Munson wrote:
> >>On Fri, 13 Mar 2015, Rik van Riel wrote:
> >>
> >>>On 03/13/2015 01:26 PM, Eric B Munson wrote:
> >>>
> >>>>--- a/mm/compaction.c
> >>>>+++ b/mm/compaction.c
> >>>>@@ -1046,6 +1046,8 @@ typedef enum {
> >>>> ISOLATE_SUCCESS, /* Pages isolated, migrate */
> >>>> } isolate_migrate_t;
> >>>>
> >>>>+int sysctl_compact_unevictable;
>
> A comment here would be useful I think, as well as explicit default
> value. Maybe also __read_mostly although I don't know how much that
> matters.
I am going to sit on V6 for a couple of days incase anyone from rt wants
to chime in. But these will be in V6.
>
> I also wonder if it might be confusing that "compact_memory" is a
> write-only trigger that doesn't even show under "sysctl -a", while
> "compact_unevictable" is a read/write setting. But I don't have a
> better suggestion right now.
Does allow_unevictable_compaction sound better? It feels too much like
variable naming conventions from other languages which seems to
encourage verbosity to me, but does indicate a difference from
compact_memory.
>
> >>>>+
> >>>> /*
> >>>> * Isolate all pages that can be migrated from the first suitable block,
> >>>> * starting at the block pointed to by the migrate scanner pfn within
> >>>
> >>>I suspect that the use cases where users absolutely do not want
> >>>unevictable pages migrated are special cases, and it may make
> >>>sense to enable sysctl_compact_unevictable by default.
> >>
> >>Given that sysctl_compact_unevictable=0 is the way the kernel behaves
> >>now and the push back against always enabling compaction on unevictable
> >>pages, I left the default to be the behavior as it is today.
> >
> >The question is _why_ we have this behavior now. Is it intentional?
>
> It's there since 748446bb6 ("mm: compaction: memory compaction
> core"). Commit c53919adc0 ("mm: vmscan: remove lumpy reclaim")
> changes the comment in __isolate_lru_page() handling of unevictable
> pages to mention compaction explicitly. It could have been
> accidental in 748446bb6 though, maybe it just reused
> __isolate_lru_page() for compaction - it seems that the skipping of
> unevictable was initially meant to optimize lumpy reclaim.
>
> >e46a28790e59 (CMA: migrate mlocked pages) is a precedence in that
>
> Well, CMA and realtime kernels are probably mutually exclusive enough.
>
> >direction. Vlastimil has then changed that by edc2ca612496 (mm,
> >compaction: move pageblock checks up from isolate_migratepages_range()).
> >There is no mention about mlock pages so I guess it was more an
> >unintentional side effect of the patch. At least that is my current
> >understanding. I might be wrong here.
>
> Although that commit did change unintentionally more details that I
> would have liked (unfortunately), I think you are wrong on this one.
> ISOLATE_UNEVICTABLE is still passed from
> isolate_migratepages_range() which is used by CMA, while the
> compaction variant isolate_migratepages() does not pass it. So it's
> kept CMA-specific as before.
>
> >The thing about RT is that it is not usable with the upstream kernel
> >without the RT patchset AFAIU. So the default should be reflect what is
> >better for the standard kernel. RT loads have to tune the system anyway
> >so it is not so surprising they would disable this option as well. We
> >should help those guys and do not require them to touch the code but the
> >knob is reasonable IMHO.
> >
> >Especially when your changelog suggests that having this enabled by
> >default is beneficial for the standard kernel.
>
> I agree, but if there's a danger of becoming too of a bikeshed
> topic, I'm fine with keeping the default same as current behavior
> and changing it later. Or maybe we should ask some -rt mailing list
> instead of just Peter and Thomas?
According to the rt wiki, there is no -rt development list so lkml is
it. I will change the default to 1 for V6 if I don't hear otherwise by
the time I get back around to spinning V6.
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
next prev parent reply other threads:[~2015-03-16 13:49 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-13 17:26 [PATCH V5] Allow compaction of unevictable pages Eric B Munson
2015-03-13 17:26 ` Eric B Munson
2015-03-13 18:56 ` Rik van Riel
2015-03-13 18:56 ` Rik van Riel
2015-03-13 19:09 ` Eric B Munson
2015-03-13 20:19 ` Michal Hocko
2015-03-13 20:19 ` Michal Hocko
2015-03-16 10:14 ` Vlastimil Babka
2015-03-16 10:14 ` Vlastimil Babka
[not found] ` <5506ACEC.9010403-AlSwsSmVLrQ@public.gmane.org>
2015-03-16 13:49 ` Eric B Munson [this message]
2015-03-16 13:49 ` Eric B Munson
[not found] ` <20150316134956.GA15324-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org>
2015-03-18 14:40 ` Vlastimil Babka
2015-03-18 14:40 ` Vlastimil Babka
2015-03-18 14:40 ` Vlastimil Babka
2015-03-18 15:40 ` Steven Rostedt
2015-03-18 15:40 ` Steven Rostedt
2015-03-13 23:18 ` David Rientjes
2015-03-13 23:18 ` David Rientjes
2015-03-13 23:43 ` Rik van Riel
2015-03-13 23:43 ` Rik van Riel
2015-03-16 15:39 ` Christoph Lameter
2015-03-16 15:39 ` Christoph Lameter
2015-03-13 20:02 ` Michal Hocko
2015-03-13 20:02 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150316134956.GA15324@akamai.com \
--to=emunson-jqffy2xvxfxqt0dzr+alfa@public.gmane.org \
--cc=akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
--cc=cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org \
--cc=linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org \
--cc=mgorman-l3A5Bk7waGM@public.gmane.org \
--cc=mhocko-AlSwsSmVLrQ@public.gmane.org \
--cc=peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org \
--cc=riel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
--cc=tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org \
--cc=vbabka-AlSwsSmVLrQ@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.