* [patch] mm: memcg: remove incorrect VM_BUG_ON for swap cache pages in uncharge
@ 2013-05-08 13:28 Johannes Weiner
2013-05-08 19:34 ` Hugh Dickins
0 siblings, 1 reply; 2+ messages in thread
From: Johannes Weiner @ 2013-05-08 13:28 UTC (permalink / raw)
To: Andrew Morton
Cc: Heiko Carstens, Lingzhu Xiang, Hugh Dickins, Michal Hocko,
linux-mm, linux-kernel
0c59b89 "mm: memcg: push down PageSwapCache check into uncharge entry
functions" added a VM_BUG_ON() on PageSwapCache in the uncharge path
after checking that page flag once, assuming that the state is stable
in all paths, but this is not the case and the condition triggers in
user environments. An uncharge after the last page table reference to
the page goes away can race with reclaim adding the page to swap
cache.
Swap cache pages are usually uncharged when they are freed after
swapout, from a path that also handles swap usage accounting and memcg
lifetime management. However, since the last page table reference is
gone and thus no references to the swap slot left, the swap slot will
be freed shortly when reclaim attempts to write the page to disk. The
whole swap accounting is not even necessary.
So while the race condition for which this VM_BUG_ON was added is real
and actually existed all along, there are no negative effects. Remove
the VM_BUG_ON again.
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reported-by: Lingzhu Xiang <lxiang@redhat.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: stable@vger.kernel.org
---
mm/memcontrol.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cb1c9de..010d6c1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4108,8 +4108,6 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype,
if (mem_cgroup_disabled())
return NULL;
- VM_BUG_ON(PageSwapCache(page));
-
if (PageTransHuge(page)) {
nr_pages <<= compound_order(page);
VM_BUG_ON(!PageTransHuge(page));
@@ -4205,6 +4203,18 @@ void mem_cgroup_uncharge_page(struct page *page)
if (page_mapped(page))
return;
VM_BUG_ON(page->mapping && !PageAnon(page));
+ /*
+ * If the page is in swap cache, uncharge should be deferred
+ * to the swap path, which also properly accounts swap usage
+ * and handles memcg lifetime.
+ *
+ * Note that this check is not stable and reclaim may add the
+ * page to swap cache at any time after this. However, if the
+ * page is not in swap cache by the time page->mapcount hits
+ * 0, there won't be any page table references to the swap
+ * slot, and reclaim will free it and not actually write the
+ * page to disk.
+ */
if (PageSwapCache(page))
return;
__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON, false);
--
1.7.11.7
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 2+ messages in thread* Re: [patch] mm: memcg: remove incorrect VM_BUG_ON for swap cache pages in uncharge
2013-05-08 13:28 [patch] mm: memcg: remove incorrect VM_BUG_ON for swap cache pages in uncharge Johannes Weiner
@ 2013-05-08 19:34 ` Hugh Dickins
0 siblings, 0 replies; 2+ messages in thread
From: Hugh Dickins @ 2013-05-08 19:34 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Heiko Carstens, Lingzhu Xiang, Michal Hocko,
linux-mm, linux-kernel
On Wed, 8 May 2013, Johannes Weiner wrote:
> 0c59b89 "mm: memcg: push down PageSwapCache check into uncharge entry
> functions" added a VM_BUG_ON() on PageSwapCache in the uncharge path
> after checking that page flag once, assuming that the state is stable
> in all paths, but this is not the case and the condition triggers in
> user environments. An uncharge after the last page table reference to
> the page goes away can race with reclaim adding the page to swap
> cache.
>
> Swap cache pages are usually uncharged when they are freed after
> swapout, from a path that also handles swap usage accounting and memcg
> lifetime management. However, since the last page table reference is
> gone and thus no references to the swap slot left, the swap slot will
> be freed shortly when reclaim attempts to write the page to disk. The
> whole swap accounting is not even necessary.
>
> So while the race condition for which this VM_BUG_ON was added is real
> and actually existed all along, there are no negative effects. Remove
> the VM_BUG_ON again.
>
> Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
> Reported-by: Lingzhu Xiang <lxiang@redhat.com>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Hugh Dickins <hughd@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: stable@vger.kernel.org
> ---
> mm/memcontrol.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index cb1c9de..010d6c1 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4108,8 +4108,6 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype,
> if (mem_cgroup_disabled())
> return NULL;
>
> - VM_BUG_ON(PageSwapCache(page));
> -
> if (PageTransHuge(page)) {
> nr_pages <<= compound_order(page);
> VM_BUG_ON(!PageTransHuge(page));
> @@ -4205,6 +4203,18 @@ void mem_cgroup_uncharge_page(struct page *page)
> if (page_mapped(page))
> return;
> VM_BUG_ON(page->mapping && !PageAnon(page));
> + /*
> + * If the page is in swap cache, uncharge should be deferred
> + * to the swap path, which also properly accounts swap usage
> + * and handles memcg lifetime.
> + *
> + * Note that this check is not stable and reclaim may add the
> + * page to swap cache at any time after this. However, if the
> + * page is not in swap cache by the time page->mapcount hits
> + * 0, there won't be any page table references to the swap
> + * slot, and reclaim will free it and not actually write the
> + * page to disk.
> + */
> if (PageSwapCache(page))
> return;
> __mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON, false);
> --
> 1.7.11.7
>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2013-05-08 19:34 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-05-08 13:28 [patch] mm: memcg: remove incorrect VM_BUG_ON for swap cache pages in uncharge Johannes Weiner
2013-05-08 19:34 ` Hugh Dickins
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).