From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>, Lai Jiangshan <laijs@cn.fujitsu.com>,
	Tejun Heo <tj@kernel.org>,
	linux-kernel@vger.kernel.org
Subject: Re: Q: select_fallback_rq() && cpuset_lock()
Date: Thu, 11 Mar 2010 17:29:13 +0100
Message-ID: <1268324953.5037.124.camel@laptop>
In-Reply-To: <20100311161909.GA16008@redhat.com>

On Thu, 2010-03-11 at 17:19 +0100, Oleg Nesterov wrote:

> > > How can we fix this later? Perhaps we can change
> > > cpuset_track_online_cpus(CPU_DEAD) to scan all affected cpusets and
> > > fixup the tasks with the wrong ->cpus_allowed == cpu_possible_mask.
> >
> > Problem is, we can't really fix up tasks, wakeup must be able to find a
> > suitable cpu.
> 
> Yes sure. I meant, wakeup()->select_fallback_rq() sets cpus_allowed =
> cpu_possible_map as we discussed. Then cpuset_track_online_cpus(CPU_DEAD)
> fixes the affected tasks.

Ah, so have that re-validate p->cpus_allowed for all cpuset tasks. OK,
that might work.
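
For illustration only, the re-validation idea could be modelled in user
space roughly as below. This is a sketch under assumptions, not kernel
code: cpumask64, model_task, model_cpuset and revalidate_cpuset() are
hypothetical names, a mask is a single 64-bit word, and the case where
the cpuset has no online CPU left is simply skipped.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpumask64;

struct model_task {
	cpumask64 allowed;		/* stands in for p->cpus_allowed */
};

struct model_cpuset {
	cpumask64 cpus;			/* CPUs this cpuset grants */
	struct model_task *tasks;	/* tasks attached to the cpuset */
	size_t ntasks;
};

/*
 * After a CPU went away, narrow every task whose mask strays outside
 * its cpuset (for example because the wakeup fallback widened it to
 * all possible CPUs). Returns the number of tasks that were fixed.
 */
static size_t revalidate_cpuset(struct model_cpuset *cs, cpumask64 online)
{
	size_t fixed = 0;

	/* Assumption: with no online CPU in the cpuset, leave tasks alone. */
	if (!(cs->cpus & online))
		return 0;

	for (size_t i = 0; i < cs->ntasks; i++) {
		struct model_task *t = &cs->tasks[i];

		if (t->allowed & ~cs->cpus) {
			t->allowed = cs->cpus;
			fixed++;
		}
	}
	return fixed;
}
```

The point of the model is only that the wakeup path never has to take
the cpuset lock: it widens the mask, and the hotplug callback narrows
it again later.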

> > > At first glance this should work in try_to_wake_up(p) case, we can't
> > > race with cpuset_change_cpumask()/etc because of TASK_WAKING logic.
> >
> > Well, cs->cpus_possible can still go funny on us.
> 
> What do you mean? Afaics, cpusets always uses set_cpus_allowed() to
> change task->cpus_allowed.

Confusion^2 ;-), I failed to grasp your fixup idea and got confused,
which confused you.

> > > But I am not sure how can we fix move_task_off_dead_cpu(). I think
> > > __migrate_task_irq() itself is fine, but if select_fallback_rq() is
> > > called from move_task_off_dead_cpu() nothing protects ->cpus_allowed.
> >
> > It has that retry loop in case the migration fails, right?
> >
> > > We can race with cpusets, or even with the plain set_cpus_allowed().
> > > Probably nothing really bad can happen, if the resulting cpumask
> > > doesn't have online cpus due to the racing memcpys, we should retry
> > > after __migrate_task_irq() fails. Or we can take cpu_rq(cpu)-lock
> > > around cpumask_copy(p->cpus_allowed, cpu_possible_mask).
> >
> > It does the retry thing.
> 
> Yes, I mentioned retry logic too. But it can't always help, even without
> cpusets.
> 
> Suppose a task T is bound to the dead CPU, and move_task_off_dead_cpu()
> races with set_cpus_allowed(new_mask). I think it is fine if T gets
> either new_mask or cpu_possible_map in ->cpus_allowed. But, it can get
> a "random" mix if 2 memcpy() run in parallel. And it is possible that
> __migrate_task_irq() will not fail if dest_cpu falls into resulting mask.

Ah indeed. One would almost construct a cpumask_assign that uses RCU
atomic pointer assignment for all this stupid cpumask juggling :/
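
To make the "random mix" and the pointer-assignment idea concrete, here
is a small user-space sketch. It is not kernel code and there is no
cpumask_assign() in the tree; struct mask, torn_copy(), mask_assign()
and mask_read() are hypothetical names. C11 atomics stand in for
rcu_assign_pointer()/rcu_dereference(), and freeing the old mask after
a grace period is left out.

```c
#include <stdatomic.h>
#include <stdint.h>

#define MASK_WORDS 2	/* assumption: a mask wider than one machine word */

struct mask {
	uint64_t bits[MASK_WORDS];
};

/* Copy a single word, so two writers can be interleaved by hand. */
static void copy_word(struct mask *dst, const struct mask *src, int word)
{
	dst->bits[word] = src->bits[word];
}

/*
 * Deterministic replay of two racing whole-mask copies into the same
 * destination. Each writer copies both words, but the steps interleave.
 */
static struct mask torn_copy(const struct mask *a, const struct mask *b)
{
	struct mask dst = { { 0 } };

	copy_word(&dst, a, 0);	/* writer A, word 0 */
	copy_word(&dst, b, 0);	/* writer B, word 0 */
	copy_word(&dst, b, 1);	/* writer B, word 1: B is done */
	copy_word(&dst, a, 1);	/* writer A, word 1: A is done */
	return dst;		/* word 0 from B, word 1 from A */
}

/* Alternative: the task holds a pointer to an immutable mask. */
struct task {
	_Atomic(struct mask *) allowed;
};

/* Publish a new mask in one atomic step; returns the old pointer. */
static struct mask *mask_assign(struct task *t, struct mask *new_mask)
{
	return atomic_exchange_explicit(&t->allowed, new_mask,
					memory_order_acq_rel);
}

/* A reader sees either the old mask or the new one, never a blend. */
static struct mask mask_read(struct task *t)
{
	struct mask *m = atomic_load_explicit(&t->allowed,
					      memory_order_acquire);
	return *m;
}
```

With the in-place copy the result is neither new_mask nor
cpu_possible_map, yet it may still contain dest_cpu, which is why the
retry logic cannot catch it. With the pointer swap the last writer
wins as a whole.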

> > > @@ -2289,10 +2289,9 @@ static int select_fallback_rq(int cpu, s
> > >
> > >  	/* No more Mr. Nice Guy. */
> > >  	if (dest_cpu >= nr_cpu_ids) {
> > > -		rcu_read_lock();
> > > -		cpuset_cpus_allowed_locked(p, &p->cpus_allowed);
> > > -		rcu_read_unlock();
> > > -		dest_cpu = cpumask_any_and(cpu_active_mask, &p->cpus_allowed);
> > > +		// XXX: take cpu_rq(cpu)->lock ???
> > > +		cpumask_copy(&p->cpus_allowed, cpu_possible_mask);
> > > +		dest_cpu = cpumask_any(cpu_active_mask);
> >
> >
> > Right, this seems safe.
> 
> OK, I'll try to read this code a bit more and then send this patch.

Thanks!
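
For reference, the behaviour of the patched fallback path quoted above
can be modelled in user space as follows. This is a simplified sketch
under assumptions: cpumask64, first_cpu() and pick_fallback() are
hypothetical names, a mask is one 64-bit word, and the node-local
search that the real select_fallback_rq() does first is left out.

```c
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpumask64;

#define NO_CPU (-1)

/* Index of the lowest set bit, or NO_CPU if the mask is empty. */
static int first_cpu(cpumask64 m)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (m & (UINT64_C(1) << cpu))
			return cpu;
	return NO_CPU;
}

/*
 * Try the task's own mask first. If it contains no active CPU, widen
 * the mask to every possible CPU and take any active one, without
 * consulting the cpuset.
 */
static int pick_fallback(cpumask64 *allowed, cpumask64 active,
			 cpumask64 possible)
{
	int dest = first_cpu(*allowed & active);

	if (dest != NO_CPU)
		return dest;

	/* No more Mr. Nice Guy. */
	*allowed = possible;
	return first_cpu(active);
}
```

A task bound only to a CPU that is no longer active ends up with the
full possible mask and lands on some active CPU; a task that still has
an active CPU in its mask is left untouched.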

