From: Vladimir Davydov <vdavydov-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
To: Michal Hocko <mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Michel Lespinasse <walken-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
Andres Lagar-Cavilla <andreslc-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Minchan Kim <minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Raghavendra K T <raghavendra.kt-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
Greg Thelen <gthelen-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Pavel Emelyanov <xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>,
Cyrill Gorcunov <gorcunov-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>,
Jonathan Corbet <corbet-T1hC0tSOHrs@public.gmane.org>,
linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-doc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH -mm v9 0/8] idle memory tracking
Date: Wed, 29 Jul 2015 18:36:40 +0300
Message-ID: <20150729153640.GX8100@esperanza>
In-Reply-To: <20150729150855.GM15801-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
On Wed, Jul 29, 2015 at 05:08:55PM +0200, Michal Hocko wrote:
> On Wed 29-07-15 17:45:39, Vladimir Davydov wrote:
> > On Wed, Jul 29, 2015 at 07:12:13AM -0700, Michel Lespinasse wrote:
> > > On Wed, Jul 29, 2015 at 6:59 AM, Vladimir Davydov <vdavydov-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
> > > wrote:
> > > >> I guess the primary reason to rely on the pfn rather than the LRU walk,
> > > >> which would be more targeted (especially for memcg cases), is that we
> > > >> cannot hold lru lock for the whole LRU walk and we cannot continue
> > > >> walking after the lock is dropped. Maybe we can try to address that
> > > >> instead? I do not think this is easy to achieve but have you considered
> > > >> that as an option?
> > > >
> > > > Yes, I have, and I've come to a conclusion it's not doable, because LRU
> > > > lists can be constantly rotating at an arbitrary rate. If you have an
> > > > idea in mind how this could be done, please share.
> > > >
> > > > Speaking of LRU-vs-PFN walk, iterating over PFNs has its own advantages:
> > > > - You can distribute a walk in time to avoid CPU bursts.
> > > > - You are free to parallelize the scanner as you wish to decrease the
> > > > scan time.
> > >
> > > There is a third way: one could go through every MM in the system and scan
> > > their page tables. Doing things that way turns out to be generally faster
> > > than scanning by physical address, because you don't have to go through
> > > RMAP for every page. But, you end up needing to take the mmap_sem lock of
> > > every MM (in turn) while scanning them, and that degrades quickly under
> > > memory load, which is exactly when you most need this feature. So, scan by
> > > address is still what we use here.
> >
> > Page table scan approach has the inherent problem - it ignores unmapped
> > page cache. If a workload does a lot of read/write or map-access-unmap
> > operations, we won't be able to even roughly estimate its wss.
>
> That page cache is trivially reclaimable if it is clean. If it needs
> writeback then it is non-idle only until the next writeback. So why does
> it matter for the estimation?
Because it might be a part of a workload's working set, in which case
evicting it will make the workload lag.

Thanks,
Vladimir
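
The two advantages of a PFN walk quoted above (spreading the scan over time, and splitting it across workers) can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not code from the patch set: `is_idle` is a hypothetical callback standing in for whatever reports a page's idle state, and the function names, chunk size and pause length are all made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def pfn_chunks(start_pfn, end_pfn, chunk_pages):
    """Yield half-open (lo, hi) PFN ranges that cover [start_pfn, end_pfn)."""
    for lo in range(start_pfn, end_pfn, chunk_pages):
        yield lo, min(lo + chunk_pages, end_pfn)


def count_idle(is_idle, lo, hi):
    """Count pages in [lo, hi) that the hypothetical is_idle callback reports idle."""
    return sum(1 for pfn in range(lo, hi) if is_idle(pfn))


def scan_paced(is_idle, start_pfn, end_pfn, chunk_pages, pause=0.01, sleep=time.sleep):
    """Walk the PFN range one chunk at a time, sleeping between chunks.

    Because a PFN range is a fixed array index space, the walk can stop
    after any chunk and resume later at the next PFN, which is what lets
    the work be distributed in time instead of done in one CPU burst.
    """
    total = 0
    for lo, hi in pfn_chunks(start_pfn, end_pfn, chunk_pages):
        total += count_idle(is_idle, lo, hi)
        sleep(pause)
    return total


def scan_parallel(is_idle, start_pfn, end_pfn, chunk_pages, workers=4):
    """Hand disjoint PFN chunks to a pool of workers and sum their counts."""
    chunks = list(pfn_chunks(start_pfn, end_pfn, chunk_pages))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda c: count_idle(is_idle, c[0], c[1]), chunks))
```

Both functions rely on the same property: the PFN space can be cut into disjoint, stable ranges ahead of time. An LRU list offers no such stable index, which matches the objection in the thread that the lists can rotate at an arbitrary rate while the lock is dropped.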