From: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
To: Vladimir Davydov <vdavydov@parallels.com>
Cc: Tejun Heo <tj@kernel.org>, Johannes Weiner <hannes@cmpxchg.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Will Deacon <Will.Deacon@arm.com>,
mhocko@suse.cz, akpm@linux-foundation.org
Subject: Re: [Regression] 3.19-rc3 : memcg: Hang in mount memcg
Date: Mon, 19 Jan 2015 12:51:27 +0000
Message-ID: <54BCFDCF.9090603@arm.com>
In-Reply-To: <20150110085525.GD2110@esperanza>

On 10/01/15 08:55, Vladimir Davydov wrote:
> On Fri, Jan 09, 2015 at 05:43:17PM +0000, Suzuki K. Poulose wrote:
>> Hi
>>
>> We have hit a hang on ARM64 defconfig while running LTP tests on
>> 3.19-rc3. We are in the process of a git bisect and will update the
>> results as and when we find the commit.
>>
>> During the ksm LTP run, the test hangs trying to mount memcg, with
>> the following strace output:
>>
>> mount("memcg", "/dev/cgroup", "cgroup", 0, "memory") = ? ERESTARTNOINTR (To be restarted)
>> mount("memcg", "/dev/cgroup", "cgroup", 0, "memory") = ? ERESTARTNOINTR (To be restarted)
>> [ ... repeated forever ... ]
>>
>> At this point, one can try mounting the memcg to verify the problem.
>> # mount -t cgroup -o memory memcg memcg_dir
>> --hangs--
>>
>> Strangely, if we run the mount command from a cold boot (i.e.
>> without running LTP first), then it succeeds.
>>
>> Upon a quick look we are hitting the following code :
>> kernel/cgroup.c: cgroup_mount() :
>>
>>     for_each_subsys(ss, i) {
>>         if (!(opts.subsys_mask & (1 << i)) ||
>>             ss->root == &cgrp_dfl_root)
>>             continue;
>>
>>         if (!percpu_ref_tryget_live(&ss->root->cgrp.self.refcnt)) {
>>             mutex_unlock(&cgroup_mutex);
>>             msleep(10);
>>             ret = restart_syscall();    <=====
>>             goto out_free;
>>         }
>>         cgroup_put(&ss->root->cgrp);
>>     }
>>
>> with ss->root->cgrp.self.refcnt.percpu_count_ptr == __PERCPU_REF_ATOMIC_DEAD
>>
>> Any ideas?
>
> The problem is that the memory cgroup controller takes a css reference
> for each charged page and does not reparent charged pages on css
> offline, while cgroup_mount/cgroup_kill_sb expect all css references to
> offline cgroups to be gone soon, restarting the syscall if the ref count
> != 0. As a result, if you create a memory cgroup, charge some page cache
> to it, and then remove it, unmount/mount will hang forever.
>
> Maybe we should kill the ref counter of the memory controller root in
> cgroup_kill_sb only if there are no children at all, neither online nor
> offline.
>
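
The hang described above can be modelled in userspace. What follows is
only a sketch, not kernel code: struct toy_root and the toy_* helpers are
made up for illustration, the per-page references are counted directly on
the root rather than through the child css, and the retry loop is bounded
so that the model terminates instead of spinning forever.

```c
#include <stdbool.h>

/* Toy stand-in for a hierarchy root's reference counter (not kernel code). */
struct toy_root {
    long refs;      /* outstanding references, including the base reference */
    bool dead;      /* set by toy_kill(); new "live" references are refused */
    bool released;  /* set once refs reaches zero after the kill */
};

static void toy_init(struct toy_root *r)
{
    r->refs = 1;
    r->dead = false;
    r->released = false;
}

/* Roughly what percpu_ref_tryget_live() reports: fails once killed. */
static bool toy_tryget_live(struct toy_root *r)
{
    if (r->dead)
        return false;
    r->refs++;
    return true;
}

static void toy_put(struct toy_root *r)
{
    if (--r->refs == 0 && r->dead)
        r->released = true;
}

/* Unmount: mark the root dead and drop the base reference. */
static void toy_kill(struct toy_root *r)
{
    r->dead = true;
    toy_put(r);
}

/* Assumption: every charged page pins the root with one reference. */
static void toy_charge_pages(struct toy_root *r, long n)
{
    r->refs += n;
}

static void toy_uncharge_pages(struct toy_root *r, long n)
{
    while (n-- > 0)
        toy_put(r);
}

/*
 * Bounded version of the retry loop in cgroup_mount().
 * Returns the number of attempts used, or -1 if still stuck.
 */
static int toy_mount(struct toy_root *r, int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (r->released)
            return attempt;     /* old root is gone, nothing to wait for */
        if (toy_tryget_live(r)) {
            toy_put(r);
            return attempt;
        }
        /* the kernel does msleep(10) and restart_syscall() here */
    }
    return -1;
}
```

With three pages charged and the root killed, toy_mount(&r, 5) gives up
after five attempts. Once the pages are uncharged, the same call succeeds
on the first try, which matches the observation that nothing short of
dropping the page references lets the mount proceed.
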
Still reproducible on 3.19-rc5 with the same setup. From git bisect, the
last good commit is:

commit 8df0c2dcf61781d2efa8e6e5b06870f6c6785735
Author: Pranith Kumar <bobby.prani@gmail.com>
Date:   Wed Dec 10 15:42:28 2014 -0800

    slab: replace smp_read_barrier_depends() with lockless_dereference()

Thanks
Suzuki