From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wm0-f50.google.com ([74.125.82.50]:38160 "EHLO
	mail-wm0-f50.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751328AbdGZMFZ (ORCPT );
	Wed, 26 Jul 2017 08:05:25 -0400
Received: by mail-wm0-f50.google.com with SMTP id m85so65568758wma.1
	for ; Wed, 26 Jul 2017 05:05:25 -0700 (PDT)
Date: Wed, 26 Jul 2017 13:05:23 +0100
From: Matt Fleming
To: Greg KH
Cc: stable@vger.kernel.org, linux-kernel@vger.kernel.org,
	Konstantin Khlebnikov, Peter Zijlstra, Linus Torvalds,
	Tejun Heo, Thomas Gleixner, Ingo Molnar
Subject: Re: [PATCH v4.4.y] sched/cgroup: Move sched_online_group() back into css_online() to fix crash
Message-ID: <20170726120523.GA3414@codeblueprint.co.uk>
References: <20170720135309.24182-1-matt@codeblueprint.co.uk>
	<20170725180439.GC24730@kroah.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170725180439.GC24730@kroah.com>
Sender: stable-owner@vger.kernel.org
List-ID:

On Tue, 25 Jul, at 11:04:39AM, Greg KH wrote:
> On Thu, Jul 20, 2017 at 02:53:09PM +0100, Matt Fleming wrote:
> > From: Konstantin Khlebnikov
> >
> > commit 96b777452d8881480fd5be50112f791c17db4b6b upstream.
> >
> > Commit:
> >
> >   2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init")
> >
> > .. moved sched_online_group() from css_online() to css_alloc().
> > It exposes half-baked task group into global lists before initializing
> > generic cgroup stuff.
> >
> > LTP testcase (third in cgroup_regression_test) written for testing
> > similar race in kernels 2.6.26-2.6.28 easily triggers this oops:
> >
> >   BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> >   IP: kernfs_path_from_node_locked+0x260/0x320
> >   CPU: 1 PID: 30346 Comm: cat Not tainted 4.10.0-rc5-test #4
> >   Call Trace:
> >   ? kernfs_path_from_node+0x4f/0x60
> >   kernfs_path_from_node+0x3e/0x60
> >   print_rt_rq+0x44/0x2b0
> >   print_rt_stats+0x7a/0xd0
> >   print_cpu+0x2fc/0xe80
> >   ? __might_sleep+0x4a/0x80
> >   sched_debug_show+0x17/0x30
> >   seq_read+0xf2/0x3b0
> >   proc_reg_read+0x42/0x70
> >   __vfs_read+0x28/0x130
> >   ? security_file_permission+0x9b/0xc0
> >   ? rw_verify_area+0x4e/0xb0
> >   vfs_read+0xa5/0x170
> >   SyS_read+0x46/0xa0
> >   entry_SYSCALL_64_fastpath+0x1e/0xad
> >
> > Here the task group is already linked into the global RCU-protected 'task_groups'
> > list, but the css->cgroup pointer is still NULL.
> >
> > This patch reverts this chunk and moves online back to css_online().
> >
> > Signed-off-by: Konstantin Khlebnikov
> > Signed-off-by: Peter Zijlstra (Intel)
> > Cc: Linus Torvalds
> > Cc: Peter Zijlstra
> > Cc: Tejun Heo
> > Cc: Thomas Gleixner
> > Fixes: 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init")
> > Link: http://lkml.kernel.org/r/148655324740.424917.5302984537258726349.stgit@buzz
> > Signed-off-by: Ingo Molnar
> > Signed-off-by: Matt Fleming
> > ---
> >  kernel/sched/core.c | 14 ++++++++++++--
> >  1 file changed, 12 insertions(+), 2 deletions(-)
>
> What about 4.9-stable, this should go there too, right?

Yes, good catch. Would you like me to send a separate patch?