sched/autogroup: race if !sysctl_sched_autogroup_enabled ?

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Oleg Nesterov <oleg@redhat.com>
To: Ingo Molnar <mingo@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Mike Galbraith <efault@gmx.de>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: hartsjc@redhat.com, vbendel@redhat.com, vlovejoy@redhat.com,
	linux-kernel@vger.kernel.org
Subject: sched/autogroup: race if !sysctl_sched_autogroup_enabled ?
Date: Wed, 9 Nov 2016 17:59:33 +0100	[thread overview]
Message-ID: <20161109165933.GA26071@redhat.com> (raw)

I am trying to investigate a bug-report which looks as an autogroup bug,
and it seems I found the race which _might_ explain the problem. I'll
try to make the fix tomorrow but could you confirm I got it right and
answer the question below?

Let's look at task_wants_autogroup()

	/*
	 * We can only assume the task group can't go away on us if
	 * autogroup_move_group() can see us on ->thread_group list.
	 */
	if (p->flags & PF_EXITING)
		return false;

Firstly, I think that this PF_EXITING check is no longer needed.
sched_change_group() can be called by autogroup or cgroups code.
autogroup is obviously fine, cgroup-attach will never try to migrate
the exiting task (cgroup_threadgroup_rwsem helps), cpu_cgroup_fork()
is obviously fine too.

But!!! at the same time the comment is _correct_ even if very cryptic ;)

We need to ensure that autogroup/tg returned by autogroup_task_group()
can't go away if we race with autogroup_move_group(), and unless the
caller holds ->siglock we rely on fact that autogroup_move_group()
will a) see this task and b) do sched_move_task() which needs the same
same rq->lock.

However. autogroup_move_group() skips for_each_thread/sched_move_task
if sysctl_sched_autogroup_enabled == 0.

So. Doesn't this mean that cgroup migration to the root cgroup can race
with autogroup_move_group() and use the soon-to-be-freed autogroup->tg?

Or, even simpler, cgroup_post_fork()->cpu_cgroup_fork() can hit the
same race if CLONE_TRHEAD?

Or I am totally confused?

---------------------------------------------------------------------
Now the qustions. Can't we simplify autogroup_task_get() and avoid
lock_task_sighand() ?

	struct autogroup *autogroup_task_get(struct task_struct *p)
	{
		struct autogroup *ag

		rcu_read_lock();
		for (;;) {
			// it is freed by sched_free_group_rcu() path
			// and thus ->autogroup is rcu-safe too.
			ag = READ_ONCE(p->signal->autogroup);
			if (kref_get_unless_zero(&ag->kref))
				break;
		}
		rcu_read_unlock();

		return ag;
	}

although this is a bit off-topic. Another question is that I fail to
understand why sched_autogroup_create_attach() does autogroup_create()
and changes signal->autogroup even if !sysctl_sched_autogroup_enabled.

IOW, even ignoring the problem above, what is wrong with this patch?

	--- x/kernel/sched/auto_group.c
	+++ x/kernel/sched/auto_group.c
	@@ -152,8 +152,12 @@ out:
	 /* Allocates GFP_KERNEL, cannot be called under any spinlock */
	 void sched_autogroup_create_attach(struct task_struct *p)
	 {
	-	struct autogroup *ag = autogroup_create();
	+	struct autogroup *ag;
	+
	+	if (!sysctl_sched_autogroup_enabled)
	+		return;

	+	ag = autogroup_create();
		autogroup_move_group(p, ag);
		/* drop extra reference added by autogroup_create() */
		autogroup_kref_put(ag);

or even

	--- x/kernel/sched/auto_group.c
	+++ x/kernel/sched/auto_group.c
	@@ -64,9 +64,13 @@ static inline struct autogroup *autogrou

	 static inline struct autogroup *autogroup_create(void)
	 {
	-	struct autogroup *ag = kzalloc(sizeof(*ag), GFP_KERNEL);
	+	struct autogroup *ag;
		struct task_group *tg;

	+	if (!sysctl_sched_autogroup_enabled)
	+		goto xxx;
	+
	+	ag = kzalloc(sizeof(*ag), GFP_KERNEL);
		if (!ag)
			goto out_fail;

	@@ -103,7 +107,7 @@ out_fail:
			printk(KERN_WARNING "autogroup_create: %s failure.\n",
				ag ? "sched_create_group()" : "kmalloc()");
		}
	-
	+xxx:
		return autogroup_kref_get(&autogroup_default);
	 }

Oleg.

next             reply	other threads:[~2016-11-09 16:59 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-09 16:59 Oleg Nesterov [this message]
2016-11-09 17:50 ` sched/autogroup: race if !sysctl_sched_autogroup_enabled ? Peter Zijlstra
2016-11-10 13:09   ` Oleg Nesterov
2016-11-11 16:57     ` Oleg Nesterov
2016-11-13 13:59       ` Mike Galbraith
2016-11-14 15:14         ` Oleg Nesterov
2016-11-12 12:12   ` Mike Galbraith

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161109165933.GA26071@redhat.com \
    --to=oleg@redhat.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=efault@gmx.de \
    --cc=hartsjc@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vbendel@redhat.com \
    --cc=vlovejoy@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.