All of lore.kernel.org
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>,
	Jet Chen <jet.chen@intel.com>,
	Yuanhan Liu <yuanhan.liu@intel.com>, LKP <lkp@01.org>,
	cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [memcontrol] WARNING: CPU: 0 PID: 1 at kernel/res_counter.c:28 res_counter_uncharge_locked()
Date: Fri, 20 Jun 2014 11:42:10 -0400	[thread overview]
Message-ID: <20140620154209.GI7331@cmpxchg.org> (raw)
In-Reply-To: <20140620102704.GA8912@localhost>

On Fri, Jun 20, 2014 at 06:27:04PM +0800, Fengguang Wu wrote:
> Greetings,
> 
> 0day kernel testing robot got the below dmesg and the first bad commit is
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master

Thanks for the bisect.

> commit ddc5bfec501f4be3f9e89084c2db270c0c45d1d6
> Author:     Johannes Weiner <hannes@cmpxchg.org>
> AuthorDate: Fri Jun 20 10:27:58 2014 +1000
> Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
> CommitDate: Fri Jun 20 10:27:58 2014 +1000
> 
>     mm: memcontrol: rewrite uncharge API
>     
>     The memcg uncharging code that is involved towards the end of a page's
>     lifetime - truncation, reclaim, swapout, migration - is impressively
>     complicated and fragile.
>     
>     Because anonymous and file pages were always charged before they had their
>     page->mapping established, uncharges had to happen when the page type
>     could still be known from the context; as in unmap for anonymous, page
>     cache removal for file and shmem pages, and swap cache truncation for swap
>     pages.  However, these operations happen well before the page is actually
>     freed, and so a lot of synchronization is necessary:
>     
>     - Charging, uncharging, page migration, and charge migration all need
>       to take a per-page bit spinlock as they could race with uncharging.
>     
>     - Swap cache truncation happens during both swap-in and swap-out, and
>       possibly repeatedly before the page is actually freed.  This means
>       that the memcg swapout code is called from many contexts that make
>       no sense and it has to figure out the direction from page state to
>       make sure memory and memory+swap are always correctly charged.
>     
>     - On page migration, the old page might be unmapped but then reused,
>       so memcg code has to prevent untimely uncharging in that case.
>       Because this code - which should be a simple charge transfer - is so
>       special-cased, it is not reusable for replace_page_cache().
>     
>     But now that charged pages always have a page->mapping, introduce
>     mem_cgroup_uncharge(), which is called after the final put_page(), when we
>     know for sure that nobody is looking at the page anymore.
>     
>     For page migration, introduce mem_cgroup_migrate(), which is called after
>     the migration is successful and the new page is fully rmapped.  Because
>     the old page is no longer uncharged after migration, prevent double
>     charges by decoupling the page's memcg association (PCG_USED and
>     pc->mem_cgroup) from the page holding an actual charge.  The new bits
>     PCG_MEM and PCG_MEMSW represent the respective charges and are transferred
>     to the new page during migration.
>     
>     mem_cgroup_migrate() is suitable for replace_page_cache() as well, which
>     gets rid of mem_cgroup_replace_page_cache().
>     
>     Swap accounting is massively simplified: because the page is no longer
>     uncharged as early as swap cache deletion, a new mem_cgroup_swapout() can
>     transfer the page's memory+swap charge (PCG_MEMSW) to the swap entry
>     before the final put_page() in page reclaim.
>     
>     Finally, page_cgroup changes are now protected by whatever protection the
>     page itself offers: anonymous pages are charged under the page table lock,
>     whereas page cache insertions, swapin, and migration hold the page lock.
>     Uncharging happens under full exclusion with no outstanding references.
>     Charging and uncharging also ensure that the page is off-LRU, which
>     serializes against charge migration.  Remove the very costly page_cgroup
>     lock and set pc->flags non-atomically.
>     
>     Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>     Cc: Michal Hocko <mhocko@suse.cz>
>     Cc: Hugh Dickins <hughd@google.com>
>     Cc: Tejun Heo <tj@kernel.org>
>     Cc: Vladimir Davydov <vdavydov@parallels.com>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> 
> +-----------------------------------------------------------------------+------------+------------+---------------+
> |                                                                       | 5b647620c6 | ddc5bfec50 | next-20140620 |
> +-----------------------------------------------------------------------+------------+------------+---------------+
> | boot_successes                                                        | 60         | 0          | 0             |
> | boot_failures                                                         | 0          | 20         | 13            |
> | WARNING:CPU:PID:at_kernel/res_counter.c:res_counter_uncharge_locked() | 0          | 20         | 13            |
> | backtrace:vm_munmap                                                   | 0          | 20         | 13            |
> | backtrace:SyS_munmap                                                  | 0          | 20         | 13            |
> | backtrace:do_sys_open                                                 | 0          | 20         | 13            |
> | backtrace:SyS_open                                                    | 0          | 20         | 13            |
> | backtrace:do_execve                                                   | 0          | 20         | 13            |
> | backtrace:SyS_execve                                                  | 0          | 20         | 13            |
> | backtrace:do_group_exit                                               | 0          | 20         | 13            |
> | backtrace:SyS_exit_group                                              | 0          | 20         | 13            |
> | backtrace:SYSC_renameat2                                              | 0          | 11         | 8             |
> | backtrace:SyS_rename                                                  | 0          | 11         | 8             |
> | backtrace:do_munmap                                                   | 0          | 11         | 8             |
> | backtrace:SyS_brk                                                     | 0          | 11         | 8             |
> | Out_of_memory:Kill_process                                            | 0          | 1          |               |
> | backtrace:do_unlinkat                                                 | 0          | 9          | 5             |
> | backtrace:SyS_unlink                                                  | 0          | 9          | 5             |
> | backtrace:SYSC_umount                                                 | 0          | 9          |               |
> | backtrace:SyS_umount                                                  | 0          | 9          |               |
> | backtrace:cleanup_mnt_work                                            | 0          | 0          | 5             |
> +-----------------------------------------------------------------------+------------+------------+---------------+
> 
> [    2.747397] debug: unmapping init [mem 0xffff880001a3a000-0xffff880001bfffff]
> [    2.748630] debug: unmapping init [mem 0xffff8800021ad000-0xffff8800021fffff]
> [    2.752857] ------------[ cut here ]------------
> [    2.753355] WARNING: CPU: 0 PID: 1 at kernel/res_counter.c:28 res_counter_uncharge_locked+0x48/0x74()
> [    2.753355] CPU: 0 PID: 1 Comm: init Not tainted 3.16.0-rc1-00238-gddc5bfe #1
> [    2.753355] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [    2.753355]  0000000000000000 ffff880012073c50 ffffffff81a23b9d ffff880012073c88
> [    2.753355]  ffffffff810bc765 ffffffff8111fac8 0000000000001000 ffff88001200fa50
> [    2.753355]  0000000000000001 ffff88001200fa01 ffff880012073c98 ffffffff810bc84b
> [    2.753355] Call Trace:
> [    2.753355]  [<ffffffff81a23b9d>] dump_stack+0x19/0x1b
> [    2.753355]  [<ffffffff810bc765>] warn_slowpath_common+0x73/0x8c
> [    2.753355]  [<ffffffff8111fac8>] ? res_counter_uncharge_locked+0x48/0x74
> [    2.753355]  [<ffffffff810bc84b>] warn_slowpath_null+0x1a/0x1c
> [    2.753355]  [<ffffffff8111fac8>] res_counter_uncharge_locked+0x48/0x74
> [    2.753355]  [<ffffffff8111fd02>] res_counter_uncharge_until+0x4e/0xa9
> [    2.753355]  [<ffffffff8111fd70>] res_counter_uncharge+0x13/0x15
> [    2.753355]  [<ffffffff8119499c>] mem_cgroup_uncharge_end+0x73/0x8d
> [    2.753355]  [<ffffffff8115735e>] release_pages+0x1f2/0x20d
> [    2.753355]  [<ffffffff8116cc3a>] tlb_flush_mmu_free+0x28/0x43
> [    2.753355]  [<ffffffff8116d5e5>] tlb_flush_mmu+0x20/0x23
> [    2.753355]  [<ffffffff8116d5fc>] tlb_finish_mmu+0x14/0x39
> [    2.753355]  [<ffffffff811730c1>] unmap_region+0xcd/0xdf
> [    2.753355]  [<ffffffff81172b0e>] ? vma_gap_callbacks_propagate+0x18/0x33
> [    2.753355]  [<ffffffff81174bf1>] do_munmap+0x252/0x2e0
> [    2.753355]  [<ffffffff81174cc3>] vm_munmap+0x44/0x5c
> [    2.753355]  [<ffffffff81174cfe>] SyS_munmap+0x23/0x29
> [    2.753355]  [<ffffffff81a31567>] system_call_fastpath+0x16/0x1b
> [    2.753355] ---[ end trace cfeb07101f6fbdfb ]---
> [    2.780913] ------------[ cut here ]------------

This is an underflow that happens with memcg enabled but memcg-swap
disabled - the memsw counter is not accounted, but then unaccounted.

Andrew, can you please put this in to fix the uncharge rewrite patch
mentioned above?

---

From 29bcfcf54494467008aaf9d4e37771d3b2e2c2c7 Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Fri, 20 Jun 2014 11:09:14 -0400
Subject: [patch] mm: memcontrol: rewrite uncharge API fix

It's not entirely clear whether do_swap_account or PCG_MEMSW is the
authoritative answer to whether a page is swap-accounted or not.  This
currently leads to the following memsw counter underflow when swap
accounting is disabled:

[    2.753355] WARNING: CPU: 0 PID: 1 at kernel/res_counter.c:28 res_counter_uncharge_locked+0x48/0x74()
[    2.753355] CPU: 0 PID: 1 Comm: init Not tainted 3.16.0-rc1-00238-gddc5bfe #1
[    2.753355] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    2.753355]  0000000000000000 ffff880012073c50 ffffffff81a23b9d ffff880012073c88
[    2.753355]  ffffffff810bc765 ffffffff8111fac8 0000000000001000 ffff88001200fa50
[    2.753355]  0000000000000001 ffff88001200fa01 ffff880012073c98 ffffffff810bc84b
[    2.753355] Call Trace:
[    2.753355]  [<ffffffff81a23b9d>] dump_stack+0x19/0x1b
[    2.753355]  [<ffffffff810bc765>] warn_slowpath_common+0x73/0x8c
[    2.753355]  [<ffffffff8111fac8>] ? res_counter_uncharge_locked+0x48/0x74
[    2.753355]  [<ffffffff810bc84b>] warn_slowpath_null+0x1a/0x1c
[    2.753355]  [<ffffffff8111fac8>] res_counter_uncharge_locked+0x48/0x74
[    2.753355]  [<ffffffff8111fd02>] res_counter_uncharge_until+0x4e/0xa9
[    2.753355]  [<ffffffff8111fd70>] res_counter_uncharge+0x13/0x15
[    2.753355]  [<ffffffff8119499c>] mem_cgroup_uncharge_end+0x73/0x8d
[    2.753355]  [<ffffffff8115735e>] release_pages+0x1f2/0x20d
[    2.753355]  [<ffffffff8116cc3a>] tlb_flush_mmu_free+0x28/0x43
[    2.753355]  [<ffffffff8116d5e5>] tlb_flush_mmu+0x20/0x23
[    2.753355]  [<ffffffff8116d5fc>] tlb_finish_mmu+0x14/0x39
[    2.753355]  [<ffffffff811730c1>] unmap_region+0xcd/0xdf
[    2.753355]  [<ffffffff81172b0e>] ? vma_gap_callbacks_propagate+0x18/0x33
[    2.753355]  [<ffffffff81174bf1>] do_munmap+0x252/0x2e0
[    2.753355]  [<ffffffff81174cc3>] vm_munmap+0x44/0x5c
[    2.753355]  [<ffffffff81174cfe>] SyS_munmap+0x23/0x29
[    2.753355]  [<ffffffff81a31567>] system_call_fastpath+0x16/0x1b
[    2.753355] ---[ end trace cfeb07101f6fbdfb ]---

Don't set PCG_MEMSW when swap accounting is disabled, so that
uncharging only has to look at this per-page flag.

mem_cgroup_swapout() could also fully rely on this flag, but as it can
bail out before even looking up the page_cgroup, check do_swap_account
as a performance optimization and only sanity test for PCG_MEMSW.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 94d7c40b9f26..d6a20935f9c4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2740,7 +2740,7 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg,
 	 *   have the page locked
 	 */
 	pc->mem_cgroup = memcg;
-	pc->flags = PCG_USED | PCG_MEM | PCG_MEMSW;
+	pc->flags = PCG_USED | PCG_MEM | (do_swap_account ? PCG_MEMSW : 0);
 
 	if (lrucare) {
 		if (was_on_lru) {
@@ -6598,7 +6598,7 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage,
 		return;
 
 	VM_BUG_ON_PAGE(!(pc->flags & PCG_MEM), oldpage);
-	VM_BUG_ON_PAGE(!(pc->flags & PCG_MEMSW), oldpage);
+	VM_BUG_ON_PAGE(do_swap_account && !(pc->flags & PCG_MEMSW), oldpage);
 	pc->flags &= ~(PCG_MEM | PCG_MEMSW);
 
 	if (PageTransHuge(oldpage)) {
-- 
2.0.0

WARNING: multiple messages have this Message-ID (diff)
From: Johannes Weiner <hannes@cmpxchg.org>
To: lkp@lists.01.org
Subject: Re: [memcontrol] WARNING: CPU: 0 PID: 1 at kernel/res_counter.c:28 res_counter_uncharge_locked()
Date: Fri, 20 Jun 2014 11:42:10 -0400	[thread overview]
Message-ID: <20140620154209.GI7331@cmpxchg.org> (raw)
In-Reply-To: <20140620102704.GA8912@localhost>

[-- Attachment #1: Type: text/plain, Size: 12983 bytes --]

On Fri, Jun 20, 2014 at 06:27:04PM +0800, Fengguang Wu wrote:
> Greetings,
> 
> 0day kernel testing robot got the below dmesg and the first bad commit is
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master

Thanks for the bisect.

> commit ddc5bfec501f4be3f9e89084c2db270c0c45d1d6
> Author:     Johannes Weiner <hannes@cmpxchg.org>
> AuthorDate: Fri Jun 20 10:27:58 2014 +1000
> Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
> CommitDate: Fri Jun 20 10:27:58 2014 +1000
> 
>     mm: memcontrol: rewrite uncharge API
>     
>     The memcg uncharging code that is involved towards the end of a page's
>     lifetime - truncation, reclaim, swapout, migration - is impressively
>     complicated and fragile.
>     
>     Because anonymous and file pages were always charged before they had their
>     page->mapping established, uncharges had to happen when the page type
>     could still be known from the context; as in unmap for anonymous, page
>     cache removal for file and shmem pages, and swap cache truncation for swap
>     pages.  However, these operations happen well before the page is actually
>     freed, and so a lot of synchronization is necessary:
>     
>     - Charging, uncharging, page migration, and charge migration all need
>       to take a per-page bit spinlock as they could race with uncharging.
>     
>     - Swap cache truncation happens during both swap-in and swap-out, and
>       possibly repeatedly before the page is actually freed.  This means
>       that the memcg swapout code is called from many contexts that make
>       no sense and it has to figure out the direction from page state to
>       make sure memory and memory+swap are always correctly charged.
>     
>     - On page migration, the old page might be unmapped but then reused,
>       so memcg code has to prevent untimely uncharging in that case.
>       Because this code - which should be a simple charge transfer - is so
>       special-cased, it is not reusable for replace_page_cache().
>     
>     But now that charged pages always have a page->mapping, introduce
>     mem_cgroup_uncharge(), which is called after the final put_page(), when we
>     know for sure that nobody is looking at the page anymore.
>     
>     For page migration, introduce mem_cgroup_migrate(), which is called after
>     the migration is successful and the new page is fully rmapped.  Because
>     the old page is no longer uncharged after migration, prevent double
>     charges by decoupling the page's memcg association (PCG_USED and
>     pc->mem_cgroup) from the page holding an actual charge.  The new bits
>     PCG_MEM and PCG_MEMSW represent the respective charges and are transferred
>     to the new page during migration.
>     
>     mem_cgroup_migrate() is suitable for replace_page_cache() as well, which
>     gets rid of mem_cgroup_replace_page_cache().
>     
>     Swap accounting is massively simplified: because the page is no longer
>     uncharged as early as swap cache deletion, a new mem_cgroup_swapout() can
>     transfer the page's memory+swap charge (PCG_MEMSW) to the swap entry
>     before the final put_page() in page reclaim.
>     
>     Finally, page_cgroup changes are now protected by whatever protection the
>     page itself offers: anonymous pages are charged under the page table lock,
>     whereas page cache insertions, swapin, and migration hold the page lock.
>     Uncharging happens under full exclusion with no outstanding references.
>     Charging and uncharging also ensure that the page is off-LRU, which
>     serializes against charge migration.  Remove the very costly page_cgroup
>     lock and set pc->flags non-atomically.
>     
>     Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>     Cc: Michal Hocko <mhocko@suse.cz>
>     Cc: Hugh Dickins <hughd@google.com>
>     Cc: Tejun Heo <tj@kernel.org>
>     Cc: Vladimir Davydov <vdavydov@parallels.com>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> 
> +-----------------------------------------------------------------------+------------+------------+---------------+
> |                                                                       | 5b647620c6 | ddc5bfec50 | next-20140620 |
> +-----------------------------------------------------------------------+------------+------------+---------------+
> | boot_successes                                                        | 60         | 0          | 0             |
> | boot_failures                                                         | 0          | 20         | 13            |
> | WARNING:CPU:PID:at_kernel/res_counter.c:res_counter_uncharge_locked() | 0          | 20         | 13            |
> | backtrace:vm_munmap                                                   | 0          | 20         | 13            |
> | backtrace:SyS_munmap                                                  | 0          | 20         | 13            |
> | backtrace:do_sys_open                                                 | 0          | 20         | 13            |
> | backtrace:SyS_open                                                    | 0          | 20         | 13            |
> | backtrace:do_execve                                                   | 0          | 20         | 13            |
> | backtrace:SyS_execve                                                  | 0          | 20         | 13            |
> | backtrace:do_group_exit                                               | 0          | 20         | 13            |
> | backtrace:SyS_exit_group                                              | 0          | 20         | 13            |
> | backtrace:SYSC_renameat2                                              | 0          | 11         | 8             |
> | backtrace:SyS_rename                                                  | 0          | 11         | 8             |
> | backtrace:do_munmap                                                   | 0          | 11         | 8             |
> | backtrace:SyS_brk                                                     | 0          | 11         | 8             |
> | Out_of_memory:Kill_process                                            | 0          | 1          |               |
> | backtrace:do_unlinkat                                                 | 0          | 9          | 5             |
> | backtrace:SyS_unlink                                                  | 0          | 9          | 5             |
> | backtrace:SYSC_umount                                                 | 0          | 9          |               |
> | backtrace:SyS_umount                                                  | 0          | 9          |               |
> | backtrace:cleanup_mnt_work                                            | 0          | 0          | 5             |
> +-----------------------------------------------------------------------+------------+------------+---------------+
> 
> [    2.747397] debug: unmapping init [mem 0xffff880001a3a000-0xffff880001bfffff]
> [    2.748630] debug: unmapping init [mem 0xffff8800021ad000-0xffff8800021fffff]
> [    2.752857] ------------[ cut here ]------------
> [    2.753355] WARNING: CPU: 0 PID: 1 at kernel/res_counter.c:28 res_counter_uncharge_locked+0x48/0x74()
> [    2.753355] CPU: 0 PID: 1 Comm: init Not tainted 3.16.0-rc1-00238-gddc5bfe #1
> [    2.753355] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [    2.753355]  0000000000000000 ffff880012073c50 ffffffff81a23b9d ffff880012073c88
> [    2.753355]  ffffffff810bc765 ffffffff8111fac8 0000000000001000 ffff88001200fa50
> [    2.753355]  0000000000000001 ffff88001200fa01 ffff880012073c98 ffffffff810bc84b
> [    2.753355] Call Trace:
> [    2.753355]  [<ffffffff81a23b9d>] dump_stack+0x19/0x1b
> [    2.753355]  [<ffffffff810bc765>] warn_slowpath_common+0x73/0x8c
> [    2.753355]  [<ffffffff8111fac8>] ? res_counter_uncharge_locked+0x48/0x74
> [    2.753355]  [<ffffffff810bc84b>] warn_slowpath_null+0x1a/0x1c
> [    2.753355]  [<ffffffff8111fac8>] res_counter_uncharge_locked+0x48/0x74
> [    2.753355]  [<ffffffff8111fd02>] res_counter_uncharge_until+0x4e/0xa9
> [    2.753355]  [<ffffffff8111fd70>] res_counter_uncharge+0x13/0x15
> [    2.753355]  [<ffffffff8119499c>] mem_cgroup_uncharge_end+0x73/0x8d
> [    2.753355]  [<ffffffff8115735e>] release_pages+0x1f2/0x20d
> [    2.753355]  [<ffffffff8116cc3a>] tlb_flush_mmu_free+0x28/0x43
> [    2.753355]  [<ffffffff8116d5e5>] tlb_flush_mmu+0x20/0x23
> [    2.753355]  [<ffffffff8116d5fc>] tlb_finish_mmu+0x14/0x39
> [    2.753355]  [<ffffffff811730c1>] unmap_region+0xcd/0xdf
> [    2.753355]  [<ffffffff81172b0e>] ? vma_gap_callbacks_propagate+0x18/0x33
> [    2.753355]  [<ffffffff81174bf1>] do_munmap+0x252/0x2e0
> [    2.753355]  [<ffffffff81174cc3>] vm_munmap+0x44/0x5c
> [    2.753355]  [<ffffffff81174cfe>] SyS_munmap+0x23/0x29
> [    2.753355]  [<ffffffff81a31567>] system_call_fastpath+0x16/0x1b
> [    2.753355] ---[ end trace cfeb07101f6fbdfb ]---
> [    2.780913] ------------[ cut here ]------------

This is an underflow that happens with memcg enabled but memcg-swap
disabled - the memsw counter is not accounted, but then unaccounted.

Andrew, can you please put this in to fix the uncharge rewrite patch
mentioned above?

---

>From 29bcfcf54494467008aaf9d4e37771d3b2e2c2c7 Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Fri, 20 Jun 2014 11:09:14 -0400
Subject: [patch] mm: memcontrol: rewrite uncharge API fix

It's not entirely clear whether do_swap_account or PCG_MEMSW is the
authoritative answer to whether a page is swap-accounted or not.  This
currently leads to the following memsw counter underflow when swap
accounting is disabled:

[    2.753355] WARNING: CPU: 0 PID: 1 at kernel/res_counter.c:28 res_counter_uncharge_locked+0x48/0x74()
[    2.753355] CPU: 0 PID: 1 Comm: init Not tainted 3.16.0-rc1-00238-gddc5bfe #1
[    2.753355] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    2.753355]  0000000000000000 ffff880012073c50 ffffffff81a23b9d ffff880012073c88
[    2.753355]  ffffffff810bc765 ffffffff8111fac8 0000000000001000 ffff88001200fa50
[    2.753355]  0000000000000001 ffff88001200fa01 ffff880012073c98 ffffffff810bc84b
[    2.753355] Call Trace:
[    2.753355]  [<ffffffff81a23b9d>] dump_stack+0x19/0x1b
[    2.753355]  [<ffffffff810bc765>] warn_slowpath_common+0x73/0x8c
[    2.753355]  [<ffffffff8111fac8>] ? res_counter_uncharge_locked+0x48/0x74
[    2.753355]  [<ffffffff810bc84b>] warn_slowpath_null+0x1a/0x1c
[    2.753355]  [<ffffffff8111fac8>] res_counter_uncharge_locked+0x48/0x74
[    2.753355]  [<ffffffff8111fd02>] res_counter_uncharge_until+0x4e/0xa9
[    2.753355]  [<ffffffff8111fd70>] res_counter_uncharge+0x13/0x15
[    2.753355]  [<ffffffff8119499c>] mem_cgroup_uncharge_end+0x73/0x8d
[    2.753355]  [<ffffffff8115735e>] release_pages+0x1f2/0x20d
[    2.753355]  [<ffffffff8116cc3a>] tlb_flush_mmu_free+0x28/0x43
[    2.753355]  [<ffffffff8116d5e5>] tlb_flush_mmu+0x20/0x23
[    2.753355]  [<ffffffff8116d5fc>] tlb_finish_mmu+0x14/0x39
[    2.753355]  [<ffffffff811730c1>] unmap_region+0xcd/0xdf
[    2.753355]  [<ffffffff81172b0e>] ? vma_gap_callbacks_propagate+0x18/0x33
[    2.753355]  [<ffffffff81174bf1>] do_munmap+0x252/0x2e0
[    2.753355]  [<ffffffff81174cc3>] vm_munmap+0x44/0x5c
[    2.753355]  [<ffffffff81174cfe>] SyS_munmap+0x23/0x29
[    2.753355]  [<ffffffff81a31567>] system_call_fastpath+0x16/0x1b
[    2.753355] ---[ end trace cfeb07101f6fbdfb ]---

Don't set PCG_MEMSW when swap accounting is disabled, so that
uncharging only has to look at this per-page flag.

mem_cgroup_swapout() could also fully rely on this flag, but as it can
bail out before even looking up the page_cgroup, check do_swap_account
as a performance optimization and only sanity test for PCG_MEMSW.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 94d7c40b9f26..d6a20935f9c4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2740,7 +2740,7 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg,
 	 *   have the page locked
 	 */
 	pc->mem_cgroup = memcg;
-	pc->flags = PCG_USED | PCG_MEM | PCG_MEMSW;
+	pc->flags = PCG_USED | PCG_MEM | (do_swap_account ? PCG_MEMSW : 0);
 
 	if (lrucare) {
 		if (was_on_lru) {
@@ -6598,7 +6598,7 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage,
 		return;
 
 	VM_BUG_ON_PAGE(!(pc->flags & PCG_MEM), oldpage);
-	VM_BUG_ON_PAGE(!(pc->flags & PCG_MEMSW), oldpage);
+	VM_BUG_ON_PAGE(do_swap_account && !(pc->flags & PCG_MEMSW), oldpage);
 	pc->flags &= ~(PCG_MEM | PCG_MEMSW);
 
 	if (PageTransHuge(oldpage)) {
-- 
2.0.0

WARNING: multiple messages have this Message-ID (diff)
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>,
	Jet Chen <jet.chen@intel.com>,
	Yuanhan Liu <yuanhan.liu@intel.com>, LKP <lkp@01.org>,
	cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [memcontrol] WARNING: CPU: 0 PID: 1 at kernel/res_counter.c:28 res_counter_uncharge_locked()
Date: Fri, 20 Jun 2014 11:42:10 -0400	[thread overview]
Message-ID: <20140620154209.GI7331@cmpxchg.org> (raw)
In-Reply-To: <20140620102704.GA8912@localhost>

On Fri, Jun 20, 2014 at 06:27:04PM +0800, Fengguang Wu wrote:
> Greetings,
> 
> 0day kernel testing robot got the below dmesg and the first bad commit is
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master

Thanks for the bisect.

> commit ddc5bfec501f4be3f9e89084c2db270c0c45d1d6
> Author:     Johannes Weiner <hannes@cmpxchg.org>
> AuthorDate: Fri Jun 20 10:27:58 2014 +1000
> Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
> CommitDate: Fri Jun 20 10:27:58 2014 +1000
> 
>     mm: memcontrol: rewrite uncharge API
>     
>     The memcg uncharging code that is involved towards the end of a page's
>     lifetime - truncation, reclaim, swapout, migration - is impressively
>     complicated and fragile.
>     
>     Because anonymous and file pages were always charged before they had their
>     page->mapping established, uncharges had to happen when the page type
>     could still be known from the context; as in unmap for anonymous, page
>     cache removal for file and shmem pages, and swap cache truncation for swap
>     pages.  However, these operations happen well before the page is actually
>     freed, and so a lot of synchronization is necessary:
>     
>     - Charging, uncharging, page migration, and charge migration all need
>       to take a per-page bit spinlock as they could race with uncharging.
>     
>     - Swap cache truncation happens during both swap-in and swap-out, and
>       possibly repeatedly before the page is actually freed.  This means
>       that the memcg swapout code is called from many contexts that make
>       no sense and it has to figure out the direction from page state to
>       make sure memory and memory+swap are always correctly charged.
>     
>     - On page migration, the old page might be unmapped but then reused,
>       so memcg code has to prevent untimely uncharging in that case.
>       Because this code - which should be a simple charge transfer - is so
>       special-cased, it is not reusable for replace_page_cache().
>     
>     But now that charged pages always have a page->mapping, introduce
>     mem_cgroup_uncharge(), which is called after the final put_page(), when we
>     know for sure that nobody is looking at the page anymore.
>     
>     For page migration, introduce mem_cgroup_migrate(), which is called after
>     the migration is successful and the new page is fully rmapped.  Because
>     the old page is no longer uncharged after migration, prevent double
>     charges by decoupling the page's memcg association (PCG_USED and
>     pc->mem_cgroup) from the page holding an actual charge.  The new bits
>     PCG_MEM and PCG_MEMSW represent the respective charges and are transferred
>     to the new page during migration.
>     
>     mem_cgroup_migrate() is suitable for replace_page_cache() as well, which
>     gets rid of mem_cgroup_replace_page_cache().
>     
>     Swap accounting is massively simplified: because the page is no longer
>     uncharged as early as swap cache deletion, a new mem_cgroup_swapout() can
>     transfer the page's memory+swap charge (PCG_MEMSW) to the swap entry
>     before the final put_page() in page reclaim.
>     
>     Finally, page_cgroup changes are now protected by whatever protection the
>     page itself offers: anonymous pages are charged under the page table lock,
>     whereas page cache insertions, swapin, and migration hold the page lock.
>     Uncharging happens under full exclusion with no outstanding references.
>     Charging and uncharging also ensure that the page is off-LRU, which
>     serializes against charge migration.  Remove the very costly page_cgroup
>     lock and set pc->flags non-atomically.
>     
>     Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>     Cc: Michal Hocko <mhocko@suse.cz>
>     Cc: Hugh Dickins <hughd@google.com>
>     Cc: Tejun Heo <tj@kernel.org>
>     Cc: Vladimir Davydov <vdavydov@parallels.com>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> 
> +-----------------------------------------------------------------------+------------+------------+---------------+
> |                                                                       | 5b647620c6 | ddc5bfec50 | next-20140620 |
> +-----------------------------------------------------------------------+------------+------------+---------------+
> | boot_successes                                                        | 60         | 0          | 0             |
> | boot_failures                                                         | 0          | 20         | 13            |
> | WARNING:CPU:PID:at_kernel/res_counter.c:res_counter_uncharge_locked() | 0          | 20         | 13            |
> | backtrace:vm_munmap                                                   | 0          | 20         | 13            |
> | backtrace:SyS_munmap                                                  | 0          | 20         | 13            |
> | backtrace:do_sys_open                                                 | 0          | 20         | 13            |
> | backtrace:SyS_open                                                    | 0          | 20         | 13            |
> | backtrace:do_execve                                                   | 0          | 20         | 13            |
> | backtrace:SyS_execve                                                  | 0          | 20         | 13            |
> | backtrace:do_group_exit                                               | 0          | 20         | 13            |
> | backtrace:SyS_exit_group                                              | 0          | 20         | 13            |
> | backtrace:SYSC_renameat2                                              | 0          | 11         | 8             |
> | backtrace:SyS_rename                                                  | 0          | 11         | 8             |
> | backtrace:do_munmap                                                   | 0          | 11         | 8             |
> | backtrace:SyS_brk                                                     | 0          | 11         | 8             |
> | Out_of_memory:Kill_process                                            | 0          | 1          |               |
> | backtrace:do_unlinkat                                                 | 0          | 9          | 5             |
> | backtrace:SyS_unlink                                                  | 0          | 9          | 5             |
> | backtrace:SYSC_umount                                                 | 0          | 9          |               |
> | backtrace:SyS_umount                                                  | 0          | 9          |               |
> | backtrace:cleanup_mnt_work                                            | 0          | 0          | 5             |
> +-----------------------------------------------------------------------+------------+------------+---------------+
> 
> [    2.747397] debug: unmapping init [mem 0xffff880001a3a000-0xffff880001bfffff]
> [    2.748630] debug: unmapping init [mem 0xffff8800021ad000-0xffff8800021fffff]
> [    2.752857] ------------[ cut here ]------------
> [    2.753355] WARNING: CPU: 0 PID: 1 at kernel/res_counter.c:28 res_counter_uncharge_locked+0x48/0x74()
> [    2.753355] CPU: 0 PID: 1 Comm: init Not tainted 3.16.0-rc1-00238-gddc5bfe #1
> [    2.753355] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [    2.753355]  0000000000000000 ffff880012073c50 ffffffff81a23b9d ffff880012073c88
> [    2.753355]  ffffffff810bc765 ffffffff8111fac8 0000000000001000 ffff88001200fa50
> [    2.753355]  0000000000000001 ffff88001200fa01 ffff880012073c98 ffffffff810bc84b
> [    2.753355] Call Trace:
> [    2.753355]  [<ffffffff81a23b9d>] dump_stack+0x19/0x1b
> [    2.753355]  [<ffffffff810bc765>] warn_slowpath_common+0x73/0x8c
> [    2.753355]  [<ffffffff8111fac8>] ? res_counter_uncharge_locked+0x48/0x74
> [    2.753355]  [<ffffffff810bc84b>] warn_slowpath_null+0x1a/0x1c
> [    2.753355]  [<ffffffff8111fac8>] res_counter_uncharge_locked+0x48/0x74
> [    2.753355]  [<ffffffff8111fd02>] res_counter_uncharge_until+0x4e/0xa9
> [    2.753355]  [<ffffffff8111fd70>] res_counter_uncharge+0x13/0x15
> [    2.753355]  [<ffffffff8119499c>] mem_cgroup_uncharge_end+0x73/0x8d
> [    2.753355]  [<ffffffff8115735e>] release_pages+0x1f2/0x20d
> [    2.753355]  [<ffffffff8116cc3a>] tlb_flush_mmu_free+0x28/0x43
> [    2.753355]  [<ffffffff8116d5e5>] tlb_flush_mmu+0x20/0x23
> [    2.753355]  [<ffffffff8116d5fc>] tlb_finish_mmu+0x14/0x39
> [    2.753355]  [<ffffffff811730c1>] unmap_region+0xcd/0xdf
> [    2.753355]  [<ffffffff81172b0e>] ? vma_gap_callbacks_propagate+0x18/0x33
> [    2.753355]  [<ffffffff81174bf1>] do_munmap+0x252/0x2e0
> [    2.753355]  [<ffffffff81174cc3>] vm_munmap+0x44/0x5c
> [    2.753355]  [<ffffffff81174cfe>] SyS_munmap+0x23/0x29
> [    2.753355]  [<ffffffff81a31567>] system_call_fastpath+0x16/0x1b
> [    2.753355] ---[ end trace cfeb07101f6fbdfb ]---
> [    2.780913] ------------[ cut here ]------------

This is an underflow that happens with memcg enabled but memcg-swap
disabled - the memsw counter is not accounted, but then unaccounted.

Andrew, can you please put this in to fix the uncharge rewrite patch
mentioned above?

---

WARNING: multiple messages have this Message-ID (diff)
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>,
	Jet Chen <jet.chen@intel.com>,
	Yuanhan Liu <yuanhan.liu@intel.com>, LKP <lkp@01.org>,
	cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [memcontrol] WARNING: CPU: 0 PID: 1 at kernel/res_counter.c:28 res_counter_uncharge_locked()
Date: Fri, 20 Jun 2014 11:42:10 -0400	[thread overview]
Message-ID: <20140620154209.GI7331@cmpxchg.org> (raw)
In-Reply-To: <20140620102704.GA8912@localhost>

On Fri, Jun 20, 2014 at 06:27:04PM +0800, Fengguang Wu wrote:
> Greetings,
> 
> 0day kernel testing robot got the below dmesg and the first bad commit is
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master

Thanks for the bisect.

> commit ddc5bfec501f4be3f9e89084c2db270c0c45d1d6
> Author:     Johannes Weiner <hannes@cmpxchg.org>
> AuthorDate: Fri Jun 20 10:27:58 2014 +1000
> Commit:     Stephen Rothwell <sfr@canb.auug.org.au>
> CommitDate: Fri Jun 20 10:27:58 2014 +1000
> 
>     mm: memcontrol: rewrite uncharge API
>     
>     The memcg uncharging code that is involved towards the end of a page's
>     lifetime - truncation, reclaim, swapout, migration - is impressively
>     complicated and fragile.
>     
>     Because anonymous and file pages were always charged before they had their
>     page->mapping established, uncharges had to happen when the page type
>     could still be known from the context; as in unmap for anonymous, page
>     cache removal for file and shmem pages, and swap cache truncation for swap
>     pages.  However, these operations happen well before the page is actually
>     freed, and so a lot of synchronization is necessary:
>     
>     - Charging, uncharging, page migration, and charge migration all need
>       to take a per-page bit spinlock as they could race with uncharging.
>     
>     - Swap cache truncation happens during both swap-in and swap-out, and
>       possibly repeatedly before the page is actually freed.  This means
>       that the memcg swapout code is called from many contexts that make
>       no sense and it has to figure out the direction from page state to
>       make sure memory and memory+swap are always correctly charged.
>     
>     - On page migration, the old page might be unmapped but then reused,
>       so memcg code has to prevent untimely uncharging in that case.
>       Because this code - which should be a simple charge transfer - is so
>       special-cased, it is not reusable for replace_page_cache().
>     
>     But now that charged pages always have a page->mapping, introduce
>     mem_cgroup_uncharge(), which is called after the final put_page(), when we
>     know for sure that nobody is looking at the page anymore.
>     
>     For page migration, introduce mem_cgroup_migrate(), which is called after
>     the migration is successful and the new page is fully rmapped.  Because
>     the old page is no longer uncharged after migration, prevent double
>     charges by decoupling the page's memcg association (PCG_USED and
>     pc->mem_cgroup) from the page holding an actual charge.  The new bits
>     PCG_MEM and PCG_MEMSW represent the respective charges and are transferred
>     to the new page during migration.
>     
>     mem_cgroup_migrate() is suitable for replace_page_cache() as well, which
>     gets rid of mem_cgroup_replace_page_cache().
>     
>     Swap accounting is massively simplified: because the page is no longer
>     uncharged as early as swap cache deletion, a new mem_cgroup_swapout() can
>     transfer the page's memory+swap charge (PCG_MEMSW) to the swap entry
>     before the final put_page() in page reclaim.
>     
>     Finally, page_cgroup changes are now protected by whatever protection the
>     page itself offers: anonymous pages are charged under the page table lock,
>     whereas page cache insertions, swapin, and migration hold the page lock.
>     Uncharging happens under full exclusion with no outstanding references.
>     Charging and uncharging also ensure that the page is off-LRU, which
>     serializes against charge migration.  Remove the very costly page_cgroup
>     lock and set pc->flags non-atomically.
>     
>     Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>     Cc: Michal Hocko <mhocko@suse.cz>
>     Cc: Hugh Dickins <hughd@google.com>
>     Cc: Tejun Heo <tj@kernel.org>
>     Cc: Vladimir Davydov <vdavydov@parallels.com>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> 
> +-----------------------------------------------------------------------+------------+------------+---------------+
> |                                                                       | 5b647620c6 | ddc5bfec50 | next-20140620 |
> +-----------------------------------------------------------------------+------------+------------+---------------+
> | boot_successes                                                        | 60         | 0          | 0             |
> | boot_failures                                                         | 0          | 20         | 13            |
> | WARNING:CPU:PID:at_kernel/res_counter.c:res_counter_uncharge_locked() | 0          | 20         | 13            |
> | backtrace:vm_munmap                                                   | 0          | 20         | 13            |
> | backtrace:SyS_munmap                                                  | 0          | 20         | 13            |
> | backtrace:do_sys_open                                                 | 0          | 20         | 13            |
> | backtrace:SyS_open                                                    | 0          | 20         | 13            |
> | backtrace:do_execve                                                   | 0          | 20         | 13            |
> | backtrace:SyS_execve                                                  | 0          | 20         | 13            |
> | backtrace:do_group_exit                                               | 0          | 20         | 13            |
> | backtrace:SyS_exit_group                                              | 0          | 20         | 13            |
> | backtrace:SYSC_renameat2                                              | 0          | 11         | 8             |
> | backtrace:SyS_rename                                                  | 0          | 11         | 8             |
> | backtrace:do_munmap                                                   | 0          | 11         | 8             |
> | backtrace:SyS_brk                                                     | 0          | 11         | 8             |
> | Out_of_memory:Kill_process                                            | 0          | 1          |               |
> | backtrace:do_unlinkat                                                 | 0          | 9          | 5             |
> | backtrace:SyS_unlink                                                  | 0          | 9          | 5             |
> | backtrace:SYSC_umount                                                 | 0          | 9          |               |
> | backtrace:SyS_umount                                                  | 0          | 9          |               |
> | backtrace:cleanup_mnt_work                                            | 0          | 0          | 5             |
> +-----------------------------------------------------------------------+------------+------------+---------------+
> 
> [    2.747397] debug: unmapping init [mem 0xffff880001a3a000-0xffff880001bfffff]
> [    2.748630] debug: unmapping init [mem 0xffff8800021ad000-0xffff8800021fffff]
> [    2.752857] ------------[ cut here ]------------
> [    2.753355] WARNING: CPU: 0 PID: 1 at kernel/res_counter.c:28 res_counter_uncharge_locked+0x48/0x74()
> [    2.753355] CPU: 0 PID: 1 Comm: init Not tainted 3.16.0-rc1-00238-gddc5bfe #1
> [    2.753355] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [    2.753355]  0000000000000000 ffff880012073c50 ffffffff81a23b9d ffff880012073c88
> [    2.753355]  ffffffff810bc765 ffffffff8111fac8 0000000000001000 ffff88001200fa50
> [    2.753355]  0000000000000001 ffff88001200fa01 ffff880012073c98 ffffffff810bc84b
> [    2.753355] Call Trace:
> [    2.753355]  [<ffffffff81a23b9d>] dump_stack+0x19/0x1b
> [    2.753355]  [<ffffffff810bc765>] warn_slowpath_common+0x73/0x8c
> [    2.753355]  [<ffffffff8111fac8>] ? res_counter_uncharge_locked+0x48/0x74
> [    2.753355]  [<ffffffff810bc84b>] warn_slowpath_null+0x1a/0x1c
> [    2.753355]  [<ffffffff8111fac8>] res_counter_uncharge_locked+0x48/0x74
> [    2.753355]  [<ffffffff8111fd02>] res_counter_uncharge_until+0x4e/0xa9
> [    2.753355]  [<ffffffff8111fd70>] res_counter_uncharge+0x13/0x15
> [    2.753355]  [<ffffffff8119499c>] mem_cgroup_uncharge_end+0x73/0x8d
> [    2.753355]  [<ffffffff8115735e>] release_pages+0x1f2/0x20d
> [    2.753355]  [<ffffffff8116cc3a>] tlb_flush_mmu_free+0x28/0x43
> [    2.753355]  [<ffffffff8116d5e5>] tlb_flush_mmu+0x20/0x23
> [    2.753355]  [<ffffffff8116d5fc>] tlb_finish_mmu+0x14/0x39
> [    2.753355]  [<ffffffff811730c1>] unmap_region+0xcd/0xdf
> [    2.753355]  [<ffffffff81172b0e>] ? vma_gap_callbacks_propagate+0x18/0x33
> [    2.753355]  [<ffffffff81174bf1>] do_munmap+0x252/0x2e0
> [    2.753355]  [<ffffffff81174cc3>] vm_munmap+0x44/0x5c
> [    2.753355]  [<ffffffff81174cfe>] SyS_munmap+0x23/0x29
> [    2.753355]  [<ffffffff81a31567>] system_call_fastpath+0x16/0x1b
> [    2.753355] ---[ end trace cfeb07101f6fbdfb ]---
> [    2.780913] ------------[ cut here ]------------

This is an underflow that happens with memcg enabled but memcg-swap
disabled - the memsw counter is not accounted, but then unaccounted.

Andrew, can you please put this in to fix the uncharge rewrite patch
mentioned above?

---

>From 29bcfcf54494467008aaf9d4e37771d3b2e2c2c7 Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Fri, 20 Jun 2014 11:09:14 -0400
Subject: [patch] mm: memcontrol: rewrite uncharge API fix

It's not entirely clear whether do_swap_account or PCG_MEMSW is the
authoritative answer to whether a page is swap-accounted or not.  This
currently leads to the following memsw counter underflow when swap
accounting is disabled:

[    2.753355] WARNING: CPU: 0 PID: 1 at kernel/res_counter.c:28 res_counter_uncharge_locked+0x48/0x74()
[    2.753355] CPU: 0 PID: 1 Comm: init Not tainted 3.16.0-rc1-00238-gddc5bfe #1
[    2.753355] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    2.753355]  0000000000000000 ffff880012073c50 ffffffff81a23b9d ffff880012073c88
[    2.753355]  ffffffff810bc765 ffffffff8111fac8 0000000000001000 ffff88001200fa50
[    2.753355]  0000000000000001 ffff88001200fa01 ffff880012073c98 ffffffff810bc84b
[    2.753355] Call Trace:
[    2.753355]  [<ffffffff81a23b9d>] dump_stack+0x19/0x1b
[    2.753355]  [<ffffffff810bc765>] warn_slowpath_common+0x73/0x8c
[    2.753355]  [<ffffffff8111fac8>] ? res_counter_uncharge_locked+0x48/0x74
[    2.753355]  [<ffffffff810bc84b>] warn_slowpath_null+0x1a/0x1c
[    2.753355]  [<ffffffff8111fac8>] res_counter_uncharge_locked+0x48/0x74
[    2.753355]  [<ffffffff8111fd02>] res_counter_uncharge_until+0x4e/0xa9
[    2.753355]  [<ffffffff8111fd70>] res_counter_uncharge+0x13/0x15
[    2.753355]  [<ffffffff8119499c>] mem_cgroup_uncharge_end+0x73/0x8d
[    2.753355]  [<ffffffff8115735e>] release_pages+0x1f2/0x20d
[    2.753355]  [<ffffffff8116cc3a>] tlb_flush_mmu_free+0x28/0x43
[    2.753355]  [<ffffffff8116d5e5>] tlb_flush_mmu+0x20/0x23
[    2.753355]  [<ffffffff8116d5fc>] tlb_finish_mmu+0x14/0x39
[    2.753355]  [<ffffffff811730c1>] unmap_region+0xcd/0xdf
[    2.753355]  [<ffffffff81172b0e>] ? vma_gap_callbacks_propagate+0x18/0x33
[    2.753355]  [<ffffffff81174bf1>] do_munmap+0x252/0x2e0
[    2.753355]  [<ffffffff81174cc3>] vm_munmap+0x44/0x5c
[    2.753355]  [<ffffffff81174cfe>] SyS_munmap+0x23/0x29
[    2.753355]  [<ffffffff81a31567>] system_call_fastpath+0x16/0x1b
[    2.753355] ---[ end trace cfeb07101f6fbdfb ]---

Don't set PCG_MEMSW when swap accounting is disabled, so that
uncharging only has to look at this per-page flag.

mem_cgroup_swapout() could also fully rely on this flag, but as it can
bail out before even looking up the page_cgroup, check do_swap_account
as a performance optimization and only sanity test for PCG_MEMSW.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 94d7c40b9f26..d6a20935f9c4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2740,7 +2740,7 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg,
 	 *   have the page locked
 	 */
 	pc->mem_cgroup = memcg;
-	pc->flags = PCG_USED | PCG_MEM | PCG_MEMSW;
+	pc->flags = PCG_USED | PCG_MEM | (do_swap_account ? PCG_MEMSW : 0);
 
 	if (lrucare) {
 		if (was_on_lru) {
@@ -6598,7 +6598,7 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage,
 		return;
 
 	VM_BUG_ON_PAGE(!(pc->flags & PCG_MEM), oldpage);
-	VM_BUG_ON_PAGE(!(pc->flags & PCG_MEMSW), oldpage);
+	VM_BUG_ON_PAGE(do_swap_account && !(pc->flags & PCG_MEMSW), oldpage);
 	pc->flags &= ~(PCG_MEM | PCG_MEMSW);
 
 	if (PageTransHuge(oldpage)) {
-- 
2.0.0

  reply	other threads:[~2014-06-20 15:42 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-20 10:24 [memcontrol] WARNING: CPU: 0 PID: 1 at kernel/res_counter.c:28 res_counter_uncharge_locked() Fengguang Wu
2014-06-20 10:27 ` Fengguang Wu
2014-06-20 15:42 ` Johannes Weiner [this message]
2014-06-20 15:42   ` Johannes Weiner
2014-06-20 15:42   ` Johannes Weiner
2014-06-20 15:42   ` Johannes Weiner
2014-06-21 13:26   ` Jet Chen
2014-06-21 13:26     ` Jet Chen
2014-06-21 13:26     ` Jet Chen
2014-06-24 16:07   ` Felipe Balbi
2014-06-24 16:10     ` Felipe Balbi
2014-06-24 16:10     ` Felipe Balbi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140620154209.GI7331@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=fengguang.wu@intel.com \
    --cc=jet.chen@intel.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lkp@01.org \
    --cc=yuanhan.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.