linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vladimir Davydov <vdavydov-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
To: Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Andres Lagar-Cavilla
	<andreslc-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Minchan Kim <minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Raghavendra K T
	<raghavendra.kt-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
	Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
	Greg Thelen <gthelen-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Michel Lespinasse
	<walken-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Pavel Emelyanov <xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>,
	Cyrill Gorcunov
	<gorcunov-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>,
	Jonathan Corbet <corbet-T1hC0tSOHrs@public.gmane.org>,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-doc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH -mm v9 0/8] idle memory tracking
Date: Thu, 30 Jul 2015 12:31:11 +0300	[thread overview]
Message-ID: <20150730093110.GB8100@esperanza> (raw)
In-Reply-To: <20150730090708.GE9387-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>

On Thu, Jul 30, 2015 at 11:07:09AM +0200, Michal Hocko wrote:
> On Wed 29-07-15 19:29:08, Vladimir Davydov wrote:
> > On Wed, Jul 29, 2015 at 05:47:18PM +0200, Michal Hocko wrote:
> [...]
> > > If you use the low limit for isolating an important load then you do not
> > > have to care about the others that much. All you care about is to set
> > > the reasonable protection level and let others to compete for the rest.
> > 
> > That's a use case, you're right. Well, it's a natural limitation of this
> > API - you just have to perform a full PFN scan then. You can avoid
> > costly rmap walks for the cgroups you are not interested in by filtering
> > them out using /proc/kpagecgroup though.
> 
> You still have to read through the whole memory and that is inherent to
> the API and there no way for a better implementation later on other than
> a new exported file.

I don't deny that. Nevertheless, PFN-walk is something that will always
be useful, simply because PFN-range is an invariant - it will always
exist. If one day a better page iterator appear (e.g. LRU walk) and the
need for it is justified well enough, we can add one more file. Note, it
won't deprecate the original PFN map - they both can be used for
different use cases then. If we move kpageidle to /sys/kernel/mm attr
group, which I'm doing now, it will be trivial to do and won't pollute
/proc.

> 
> [...]
> 
> > > > Because there is too much to be taken care of in the kernel with such an
> > > > approach and chances are high that it won't satisfy everyone. What
> > > > should the scan period be equal too?
> > > 
> > > No, just gather the data on the read request and let the userspace
> > > to decide when/how often etc. If we are clever enough we can cache
> > > the numbers and prevent from the walk. Write to the file and do the
> > > mark_idle stuff.
> > 
> > Still, scan rate limiting would be an issue IMO.
> 
> Not sure what you mean here. Scan rate would be defined by the userspace
> by reading/writing to the knob. No background kernel thread is really
> necessary.

Nevertheless, it means more logic in the kernel (rate limiter) and a
wider interface (+ rate limit value).

> 
> > > > Knob. How many kthreads do we want?
> > > > Knob. I want to keep history for last N intervals (this was a part of
> > > > Michel's implementation), what should N be equal to? Knob.
> > > 
> > > This all relates to the kernel thread implementation which I wasn't
> > > suggesting. I was referring to Michel's work which might induce that.
> > > I was merely referring to a single number output. Sorry about the
> > > confusion.
> > 
> > Still, what about idle stats history? I mean having info about how many
> > pages were idle for N scans. It might be useful for more robust/accurate
> > wss estimation.
> 
> Why cannot userspace remember those numbers?

Because they must be per-page - you have to remember for how many
periods *each particular* page has been idle. To achieve this, Michel
had to introduce a byte array referenced by PFN in his work. With
kpageidle file one can store this array in the userspace.

> 
> > > > I want to be
> > > > able to choose between an instant scan and a scan distributed in time.
> > > > Knob. I want to see stats for anon/locked/file/dirty memory separately,
> > > 
> > > Why is this useful for the memcg limits setting or the wss estimation? I
> > > can imagine that a further drop down numbers might be interesting
> > > from the debugging POV but I fail to see what kind of decisions from
> > > userspace you would do based on them.
> > 
> > A couple examples that pop up in my mind:
> > 
> > It's difficult to make wss estimation perfect. By mlocking pages, a
> > workload might give a hint to the system that it will be really unhappy
> > if they are evicted.
> > 
> > One might want to consider anon pages and/or dirty pages as not idle in
> > order to protect them and hence avoid expensive pageout/swapout.
> 
> I still seem to miss the point. How do you do that via the proposed
> interface which doesn't influence the reclaim AFAIU and you do not have
> means to achieve the above (except for swappiness). What am I missing?

You can consider idle only those pages that are clean, and then set the
low limit appropriately for your workload. You can find out which pages
are clean by reading /proc/kpageflags. Of course, this won't stop the
reclaimer from evicting them, but it will make the reclaimer less
aggressive with respect to your workload.

Thanks,
Vladimir

  parent reply	other threads:[~2015-07-30  9:31 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-19 12:31 [PATCH -mm v9 0/8] idle memory tracking Vladimir Davydov
2015-07-19 12:31 ` [PATCH -mm v9 1/8] memcg: add page_cgroup_ino helper Vladimir Davydov
2015-07-21 23:34   ` Andrew Morton
2015-07-22  9:21     ` Vladimir Davydov
2015-07-19 12:31 ` [PATCH -mm v9 2/8] hwpoison: use page_cgroup_ino for filtering by memcg Vladimir Davydov
2015-07-21 23:34   ` Andrew Morton
     [not found]     ` <20150721163412.1b44e77f5ac3b742734d1ce6-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2015-07-22  9:45       ` Vladimir Davydov
2015-07-19 12:31 ` [PATCH -mm v9 3/8] memcg: zap try_get_mem_cgroup_from_page Vladimir Davydov
2015-07-19 12:31 ` [PATCH -mm v9 4/8] proc: add kpagecgroup file Vladimir Davydov
2015-07-21 23:34   ` Andrew Morton
     [not found]     ` <20150721163433.618855e1f61536a09dfac30b-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2015-07-22 10:33       ` Vladimir Davydov
2015-07-19 12:31 ` [PATCH -mm v9 5/8] mmu-notifier: add clear_young callback Vladimir Davydov
2015-07-20 18:34   ` Andres Lagar-Cavilla
2015-07-21  8:51     ` Vladimir Davydov
2015-07-22 16:33       ` Vladimir Davydov
2015-07-19 12:31 ` [PATCH -mm v9 6/8] proc: add kpageidle file Vladimir Davydov
2015-07-21 23:34   ` Andrew Morton
     [not found]     ` <20150721163452.c1e4075a2b193bcd325fad56-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2015-07-22 15:20       ` Vladimir Davydov
     [not found]   ` <d7a78b72053cf529c0c9ff6cbc02ffbb3d58fe35.1437303956.git.vdavydov-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2015-07-24 14:08     ` Paul Gortmaker
     [not found]       ` <CAP=VYLqiNfQJ6oyQg2GszeHwdOmeY_uD3XPvw=++weJOKdx4_g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-07-24 14:17         ` Vladimir Davydov
2015-07-19 12:31 ` [PATCH -mm v9 7/8] proc: export idle flag via kpageflags Vladimir Davydov
2015-07-21 23:35   ` Andrew Morton
     [not found]     ` <20150721163500.528bd39bbbc71abc3c8d429b-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2015-07-22 16:25       ` Vladimir Davydov
2015-07-22 19:44         ` Andrew Morton
2015-07-22 20:46           ` Andres Lagar-Cavilla
2015-07-23  7:57             ` Vladimir Davydov
2015-07-19 12:31 ` [PATCH -mm v9 8/8] proc: add cond_resched to /proc/kpage* read/write loop Vladimir Davydov
     [not found] ` <cover.1437303956.git.vdavydov-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2015-07-19 12:37   ` [PATCH -mm v9 0/8] idle memory tracking Vladimir Davydov
2015-07-21 21:39 ` Andres Lagar-Cavilla
2015-07-21 23:34 ` Andrew Morton
2015-07-22 16:23   ` Vladimir Davydov
2015-07-25 16:24     ` Vladimir Davydov
2015-07-27 19:18   ` Kees Cook
2015-07-27 19:25     ` Andrew Morton
2015-07-29 12:36 ` Michal Hocko
     [not found]   ` <20150729123629.GI15801-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2015-07-29 13:59     ` Vladimir Davydov
2015-07-29 14:12       ` Michel Lespinasse
     [not found]         ` <CANN689HJX2ZL891uOd8TW9ct4PNH9d5odQZm86WMxkpkCWhA-w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-07-29 14:13           ` Michel Lespinasse
2015-07-29 14:45           ` Vladimir Davydov
2015-07-29 15:08             ` Michel Lespinasse
     [not found]               ` <CANN689Euq3Y-CHQo8q88vzFAYZX4S6rK+rZRfbuSKfS74u=gcg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-07-29 15:31                 ` Vladimir Davydov
2015-07-29 15:34                   ` Michel Lespinasse
2015-07-29 15:08             ` Michal Hocko
     [not found]               ` <20150729150855.GM15801-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2015-07-29 15:36                 ` Vladimir Davydov
2015-07-29 15:58                   ` Michal Hocko
2015-07-29 14:26       ` Michal Hocko
2015-07-29 15:28         ` Vladimir Davydov
2015-07-29 15:47           ` Michal Hocko
     [not found]             ` <20150729154718.GN15801-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2015-07-29 16:29               ` Vladimir Davydov
2015-07-29 21:30                 ` Andrew Morton
2015-07-30  9:12                   ` Vladimir Davydov
2015-07-30 13:01                     ` Vladimir Davydov
2015-07-31  9:34                       ` Vladimir Davydov
2015-07-30  9:07                 ` Michal Hocko
     [not found]                   ` <20150730090708.GE9387-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2015-07-30  9:31                     ` Vladimir Davydov [this message]
2015-07-29 15:55           ` Andres Lagar-Cavilla
     [not found]             ` <CAJu=L59RdowYjTyVM0Vhz79A4d=d8=ZmU7PB59CmEj5B0_c48Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-07-29 16:37               ` Vladimir Davydov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150730093110.GB8100@esperanza \
    --to=vdavydov-bzqdu9zft3wakbo8gow8eq@public.gmane.org \
    --cc=akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    --cc=andreslc-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=corbet-T1hC0tSOHrs@public.gmane.org \
    --cc=gorcunov-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org \
    --cc=gthelen-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org \
    --cc=linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-doc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org \
    --cc=mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=raghavendra.kt-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org \
    --cc=rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=walken-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).