From mboxrd@z Thu Jan 1 00:00:00 1970 From: Li Zefan Subject: Re: [PATCH cgroup/for-3.15-fixes 2/2] cgroup_freezer: replace freezer->lock with freezer_mutex Date: Thu, 8 May 2014 09:23:14 +0800 Message-ID: <536ADC82.20405@huawei.com> References: <20140505223557.GU11231@htj.dyndns.org> <20140505223941.GV11231@htj.dyndns.org> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20140505223941.GV11231-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> Sender: cgroups-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org List-ID: Content-Type: text/plain; charset="us-ascii" To: Tejun Heo Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, "Rafael J. Wysocki" > After 96d365e0b86e ("cgroup: make css_set_lock a rwsem and rename it > to css_set_rwsem"), css task iterators requires sleepable context as > it may block on css_set_rwsem. I missed that cgroup_freezer was > iterating tasks under IRQ-safe spinlock freezer->lock. This leads to > errors like the following on freezer state reads and transitions. > > BUG: sleeping function called from invalid context at /work > /os/work/kernel/locking/rwsem.c:20 > in_atomic(): 0, irqs_disabled(): 0, pid: 462, name: bash > 5 locks held by bash/462: > #0: (sb_writers#7){.+.+.+}, at: [] vfs_write+0x1a3/0x1c0 > #1: (&of->mutex){+.+.+.}, at: [] kernfs_fop_write+0xbb/0x170 > #2: (s_active#70){.+.+.+}, at: [] kernfs_fop_write+0xc3/0x170 > #3: (freezer_mutex){+.+...}, at: [] freezer_write+0x61/0x1e0 > #4: (rcu_read_lock){......}, at: [] freezer_write+0x53/0x1e0 > Preemption disabled at:[] console_unlock+0x1e4/0x460 > > CPU: 3 PID: 462 Comm: bash Not tainted 3.15.0-rc1-work+ #10 > Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > ffff88000916a6d0 ffff88000e0a3da0 ffffffff81cf8c96 0000000000000000 > ffff88000e0a3dc8 ffffffff810cf4f2 ffffffff82388040 ffff880013aaf740 > 0000000000000002 ffff88000e0a3de8 ffffffff81d05974 0000000000000246 > Call Trace: > [] dump_stack+0x4e/0x7a > [] __might_sleep+0x162/0x260 > [] down_read+0x24/0x60 > [] css_task_iter_start+0x27/0x70 > [] freezer_apply_state+0x5d/0x130 > [] freezer_write+0xf6/0x1e0 > [] cgroup_file_write+0xd8/0x230 > [] kernfs_fop_write+0xe7/0x170 > [] vfs_write+0xb6/0x1c0 > [] SyS_write+0x4d/0xc0 > [] system_call_fastpath+0x16/0x1b > > freezer->lock used to be used in hot paths but that time is long gone > and there's no reason for the lock to be IRQ-safe spinlock or even > per-cgroup. In fact, given the fact that a cgroup may contain large > number of tasks, it's not a good idea to iterate over them while > holding IRQ-safe spinlock. > > Let's simplify locking by replacing per-cgroup freezer->lock with > global freezer_mutex. This also makes the comments explaining the > intricacies of policy inheritance and the locking around it as the > states are protected by a common mutex. > > The conversion is mostly straight-forward. The followings are worth > mentioning. > > * freezer_css_online() no longer needs double locking. > > * freezer_attach() now performs propagation simply while holding > freezer_mutex. update_if_frozen() race no longer exists and the > comment is removed. > > * freezer_fork() now tests whether the task is in root cgroup using > the new task_css_is_root() without doing rcu_read_lock/unlock(). If > not, it grabs freezer_mutex and performs the operation. > > * freezer_read() and freezer_change_state() grab freezer_mutex across > the whole operation and pin the css while iterating so that each > descendant processing happens in sleepable context. > > Fixes: 96d365e0b86e ("cgroup: make css_set_lock a rwsem and rename it to css_set_rwsem") > Signed-off-by: Tejun Heo Acked-by: Li Zefan