All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Kirill Tkhai <ktkhai@parallels.com>
Cc: linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Vladimir Davydov <vdavydov@parallels.com>,
	Kirill Tkhai <tkhai@yandex.ru>
Subject: Re: [PATCH v3] sched/numa: fix unsafe get_task_struct() in task_numa_assign()
Date: Mon, 20 Oct 2014 16:47:57 +0200	[thread overview]
Message-ID: <20141020144757.GA10939@redhat.com> (raw)
In-Reply-To: <1413800145.19914.23.camel@tkhai>

Kirill,

I leave this to you and Peter, but...

On 10/20, Kirill Tkhai wrote:
>
> @@ -259,10 +259,15 @@ void __init fork_init(unsigned long mempages)
>  #ifndef ARCH_MIN_TASKALIGN
>  #define ARCH_MIN_TASKALIGN	L1_CACHE_BYTES
>  #endif
> -	/* create a slab on which task_structs can be allocated */
> +	/*
> +	 * Create a slab on which task_structs can be allocated.
> +	 * Note, we need SLAB_DESTROY_BY_RCU flag, when we access
> +	 * rq::curr under RCU read lock. See scheduler code.
> +	 */
>  	task_struct_cachep =
>  		kmem_cache_create("task_struct", sizeof(struct task_struct),
> -			ARCH_MIN_TASKALIGN, SLAB_PANIC | SLAB_NOTRACK, NULL);
> +			ARCH_MIN_TASKALIGN,
> +			SLAB_PANIC | SLAB_NOTRACK | SLAB_DESTROY_BY_RCU, NULL);

to me this change needs more justification.

Again, perhaps we will need to change the lifetime rules for task_struct
anyway, if we have more problems like this. But until then this looks like
an overkill to me. Plus rq_curr_if_not_put() looks too subtle, and it is
not generic.

May be we should start with something simple and stupid?

(it seems we can remove rcu_read_lock() with this patch, but I am not
 sure).

Oleg.


--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1087,9 +1087,6 @@ static void task_numa_assign(struct task_numa_env *env,
 {
 	if (env->best_task)
 		put_task_struct(env->best_task);
-	if (p)
-		get_task_struct(p);
-
 	env->best_task = p;
 	env->best_imp = imp;
 	env->best_cpu = env->dst_cpu;
@@ -1139,6 +1136,18 @@ static bool load_too_imbalanced(long src_load, long dst_load,
 	return (imb > old_imb);
 }
 
+struct task_struct *get_rq_curr(struct rq *rq)
+{
+	struct task_struct *curr;
+
+	raw_spin_lock_irq(&rq->lock);
+	curr = rq->curr;
+	get_task_struct(curr);
+	raw_spin_unlock_irq(&rq->lock);
+
+	return curr;
+}
+
 /*
  * This checks if the overall compute and NUMA accesses of the system would
  * be improved if the source tasks was migrated to the target dst_cpu taking
@@ -1156,11 +1165,9 @@ static void task_numa_compare(struct task_numa_env *env,
 	long imp = env->p->numa_group ? groupimp : taskimp;
 	long moveimp = imp;
 
-	rcu_read_lock();
-	cur = ACCESS_ONCE(dst_rq->curr);
-	if (cur->pid == 0) /* idle */
-		cur = NULL;
+	cur = get_rq_curr(dst_rq);
 
+	rcu_read_lock();
 	/*
 	 * "imp" is the fault differential for the source task between the
 	 * source and destination node. Calculate the total differential for
@@ -1235,6 +1242,7 @@ balance:
 		 */
 		if (!load_too_imbalanced(src_load, dst_load, env)) {
 			imp = moveimp - 1;
+			put_task_struct(cur);
 			cur = NULL;
 			goto assign;
 		}
@@ -1254,8 +1262,12 @@ balance:
 
 assign:
 	task_numa_assign(env, cur, imp);
+	cur = NULL;
 unlock:
 	rcu_read_unlock();
+
+	if (cur)
+		put_task_struct(cur);
 }
 
 static void task_numa_find_cpu(struct task_numa_env *env,


  reply	other threads:[~2014-10-20 14:51 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-20 10:15 [PATCH v3] sched/numa: fix unsafe get_task_struct() in task_numa_assign() Kirill Tkhai
2014-10-20 14:47 ` Oleg Nesterov [this message]
2014-10-20 16:56   ` Oleg Nesterov
2014-10-20 18:27     ` Oleg Nesterov
2014-10-20 20:18       ` Kirill Tkhai
2014-10-20 20:50         ` Oleg Nesterov
2014-10-20 21:05           ` Kirill Tkhai
2014-10-20 21:34             ` Oleg Nesterov
2014-10-20 22:57               ` Peter Zijlstra
2014-10-21  9:45           ` Peter Zijlstra
2014-10-21 19:03             ` Oleg Nesterov
2014-10-21 20:03               ` Kirill Tkhai
2014-10-21 20:10                 ` Kirill Tkhai
2014-10-22  9:09               ` Peter Zijlstra
2014-10-22 16:14                 ` Oleg Nesterov
2014-10-22 16:37                   ` Peter Zijlstra
2014-10-22 18:14                     ` introduce probe_slab_address? (Was: sched/numa: fix unsafe get_task_struct() in task_numa_assign()) Oleg Nesterov
2014-10-22 18:59                       ` introduce probe_slab_address? David Miller
2014-10-22 19:42                         ` Oleg Nesterov
2014-10-22 20:08                           ` David Miller
2014-10-22 20:20                             ` Oleg Nesterov
2014-10-24  9:44                         ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141020144757.GA10939@redhat.com \
    --to=oleg@redhat.com \
    --cc=ktkhai@parallels.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tkhai@yandex.ru \
    --cc=vdavydov@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.