From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757684AbbE3IUt (ORCPT ); Sat, 30 May 2015 04:20:49 -0400
Received: from casper.infradead.org ([85.118.1.10]:41393 "EHLO
	casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1757571AbbE3IUo (ORCPT );
	Sat, 30 May 2015 04:20:44 -0400
Date: Sat, 30 May 2015 10:20:35 +0200
From: Peter Zijlstra
To: pang.xunlei@zte.com.cn
Cc: Juri Lelli , linux-kernel@vger.kernel.org, Ingo Molnar ,
	Xunlei Pang , Steven Rostedt , Xunlei Pang
Subject: Re: [PATCH v3 1/4] sched/rt: Check to push the task away after its affinity was changed
Message-ID: <20150530082035.GL19282@twins.programming.kicks-ass.net>
References: <1431442004-18716-1-git-send-email-xlpang@126.com>
	<20150529131626.GK19282@twins.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, May 29, 2015 at 10:04:36PM +0800, pang.xunlei@zte.com.cn wrote:
> Hi Peter,
>
> Peter Zijlstra wrote 2015-05-29 PM 09:16:26:
> >
> > Re: [PATCH v3 1/4] sched/rt: Check to push the task away after its
> > affinity was changed
> >
> > On Tue, May 12, 2015 at 10:46:41PM +0800, Xunlei Pang wrote:
> > > @@ -2278,6 +2279,20 @@ static void set_cpus_allowed_rt(struct task_struct *p,
> > >  	}
> > >
> > >  	update_rt_migration(&rq->rt);
> > > +
> > > +check_push:
> > > +	if (weight > 1 &&
> > > +	    !task_running(rq, p) &&
> > > +	    !test_tsk_need_resched(rq->curr) &&
> > > +	    !cpumask_subset(new_mask, &p->cpus_allowed)) {
> > > +		/* Update new affinity and try to push. */
> > > +		cpumask_copy(&p->cpus_allowed, new_mask);
> > > +		p->nr_cpus_allowed = weight;
> > > +		push_rt_tasks(rq);
> > > +		return true;
> > > +	}
> > > +
> > > +	return false;
> > >  }
> >
> > I think this is broken; push_rt_tasks() will do double_rq_lock() which
> > will drop rq->lock.
> >
> > This means load-balancing can come in and move our task p; in fact,
> > push_rt_task() can do exactly that -- after all that was the point of
> > this patch.
> >
> > _However_ this means that after calling ->set_cpus_allowed() we must not
> > assume @p is on @rt, yet we do. Look at __set_cpus_allowed_ptr(), we'll
> > call move_queued_task() if (!running || waking) && on_rq, and
> > move_queued_task() happily calls dequeue_task(rq, p), which will go
> > *boom*.
>
> I can't see why this can happen?
>
> After finishing set_cpus_allowed_rt(), if there happens a successful
> load-balancing (pull or push) action, new task_cpu(@p) will be set,
> so we will definitely get the following true condition:
>
>         /* Can the task run on the task's current CPU? If so, we're done */
>         if (cpumask_test_cpu(task_cpu(p), new_mask))
>                 goto out;
>
> So I think the whole function will simply go out and return normally.

Humm, yes. Missed that.

That makes it work by accident; because you didn't document/Changelog
any of this.

Makes me like the thing even less though..
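
For readers following the control-flow argument in this exchange, here is a
minimal userspace sketch of it. It is a toy model, not kernel code: every
`toy_*` name is hypothetical, the affinity mask is a plain 32-bit word, and the
"push" simply moves the task to the first other CPU in its updated mask. It
only illustrates why the cpumask_test_cpu(task_cpu(p), new_mask) check ends up
covering the case where the push already migrated the task.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all names below are hypothetical, none are kernel APIs. */
typedef unsigned int toy_cpumask;	/* bit n set => CPU n is allowed */

struct toy_task {
	int cpu;		/* stands in for task_cpu(p) */
	toy_cpumask allowed;	/* stands in for p->cpus_allowed */
};

/* Stand-in for cpumask_test_cpu(): is @cpu set in @mask? */
static bool toy_mask_test_cpu(int cpu, toy_cpumask mask)
{
	assert(cpu >= 0 && cpu < 32);
	return ((mask >> cpu) & 1u) != 0;
}

/*
 * Stand-in for a successful push: move the task to the first other CPU
 * in its (already updated) affinity mask. If there is none, it stays put.
 */
static void toy_push(struct toy_task *p)
{
	for (int cpu = 0; cpu < 32; cpu++) {
		if (cpu != p->cpu && toy_mask_test_cpu(cpu, p->allowed)) {
			p->cpu = cpu;
			return;
		}
	}
}

/*
 * Stand-in for the tail of __set_cpus_allowed_ptr() as discussed above.
 * Returns true if the caller would still go on to migrate the task,
 * false if it would take the early "goto out" path.
 */
static bool toy_needs_move_after_set(struct toy_task *p, toy_cpumask new_mask,
				     bool pushed)
{
	p->allowed = new_mask;		/* affinity updated first */
	if (pushed)
		toy_push(p);		/* task may now be on another CPU */

	/* Can the task run on its current CPU? If so, we're done. */
	if (toy_mask_test_cpu(p->cpu, new_mask))
		return false;

	return true;
}
```

In this model a task on CPU 0 whose mask changes to CPUs 1-2 takes the early
exit when the push moved it, and falls through to the migration path when no
push happened. That the early exit is what keeps the real code safe is exactly
the undocumented dependency the reply objects to.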