From mboxrd@z Thu Jan 1 00:00:00 1970
From: George Dunlap
Subject: Re: [PATCH] sched: fix race between sched_move_domain() and vcpu_wake()
Date: Fri, 11 Oct 2013 15:39:44 +0100
Message-ID: <52580DB0.3070002@eu.citrix.com>
References: <5257BEBA.2070701@citrix.com> <5257E1CA02000078000FA7D3@nat28.tlf.novell.com> <5258094502000078000FA917@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta14.messagelabs.com ([193.109.254.103]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1VUdsl-0008Td-H2 for xen-devel@lists.xenproject.org; Fri, 11 Oct 2013 14:39:59 +0000
In-Reply-To: <5258094502000078000FA917@nat28.tlf.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich, Andrew Cooper
Cc: xen-devel, Juergen Gross, Keir Fraser, David Vrabel
List-Id: xen-devel@lists.xenproject.org

On 11/10/13 13:20, Jan Beulich wrote:
>>>> On 11.10.13 at 11:32, "Jan Beulich" wrote:
>> I suppose you scanned the code for other cases like this, and
>> there are none?
> Actually I did just now, and I think there's a similar issue in
> credit2's init_pcpu(): After taking pcpu_schedule_lock(cpu) it
> alters schedule_lock and hence effectively drops the locking,
> yet continues to do other stuff before in fact releasing it.
>
> What is being done prior to unlocking, however, looks to be
> unrelated to the lock being held, and rather independently
> (of the effective releasing) wanting &rqd->lock held.

I can't quite make out what you mean in the last sentence; but setting the cpu in rqd->idle and rqd->active should certainly be protected by rqd->lock, and it certainly looks like it's not being grabbed at the moment.
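
To make the concern concrete, here is a toy, single-threaded model of the pattern being discussed. It is not the credit2 code: the toy_* names, the flag-based "lock", and the bitmask fields are all made up for illustration, and the only thing it tries to show is that re-pointing schedule_lock while holding the old lock leaves the runqueue updates uncovered by rqd->lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock: a plain flag rather than a real spinlock, so that the
 * "is it held?" state can be inspected. Hypothetical, not a Xen type. */
struct toy_lock { bool held; };

static void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = true; }
static void toy_lock_release(struct toy_lock *l) { assert(l->held); l->held = false; }

/* Stand-in for a runqueue: its own lock plus two cpu bitmasks. */
struct toy_runqueue {
    struct toy_lock lock;
    unsigned long idle;
    unsigned long active;
};

/* Stand-in for per-cpu scheduler data: a default lock and a pointer
 * naming whichever lock currently guards this cpu. */
struct toy_pcpu {
    struct toy_lock default_lock;
    struct toy_lock *schedule_lock;
};

/* Models the suspected pattern. Returns whether rqd->lock was held at
 * the point where rqd->idle and rqd->active were modified. */
static bool toy_init_pcpu_unsafe(struct toy_pcpu *pc, struct toy_runqueue *rqd,
                                 unsigned int cpu)
{
    struct toy_lock *old = pc->schedule_lock;
    bool rqd_locked;

    toy_lock_acquire(old);

    /* From here on, anyone locking "this cpu's schedule lock" takes
     * rqd->lock, so the lock held above no longer excludes them. */
    pc->schedule_lock = &rqd->lock;

    rqd_locked = rqd->lock.held;   /* nobody took rqd->lock */
    rqd->idle |= 1UL << cpu;
    rqd->active |= 1UL << cpu;

    toy_lock_release(old);
    return rqd_locked;
}
```

In this model the function reports that rqd->lock was not held during the two mask updates, which is the situation described above.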

Hmm -- I think we may need to do some kind of fancy looping thing like we do in vcpu_migrate, to lock both the current schedule lock and rqd->lock; with the difference, I suppose, that rqd lock won't change (since the assignment of cpu->runqueue at the moment is static). Let me put this on my list of things to do before the release.

 -George
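
A rough user-space sketch of the "looping" idea, for reference. This is not Xen's vcpu_migrate and not a proposed patch: the sketch_* names, the C11 atomic_flag spinlock, and the bitmask fields are assumptions chosen so the example compiles on its own. The idea it illustrates is: snapshot the schedule lock pointer, take it and rqd->lock in address order, then re-check the pointer and retry if it moved in the meantime.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Minimal spinlock built on atomic_flag (hypothetical, not Xen's spinlock_t). */
struct sketch_lock { atomic_flag flag; };

static void sketch_lock_acquire(struct sketch_lock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire))
        ; /* spin */
}

static void sketch_lock_release(struct sketch_lock *l)
{
    atomic_flag_clear_explicit(&l->flag, memory_order_release);
}

struct sketch_runqueue {
    struct sketch_lock lock;
    unsigned long idle, active;
};

struct sketch_pcpu {
    struct sketch_lock default_lock;
    _Atomic(struct sketch_lock *) schedule_lock; /* may be re-pointed */
};

static void sketch_runqueue_init(struct sketch_runqueue *rqd)
{
    atomic_flag_clear(&rqd->lock.flag);
    rqd->idle = rqd->active = 0;
}

static void sketch_pcpu_init(struct sketch_pcpu *pc)
{
    atomic_flag_clear(&pc->default_lock.flag);
    atomic_init(&pc->schedule_lock, &pc->default_lock);
}

/* Take both the cpu's current schedule lock and rqd_lock. Returns the
 * schedule lock that ended up held, so the caller can release exactly it. */
static struct sketch_lock *sketch_lock_both(struct sketch_pcpu *pc,
                                            struct sketch_lock *rqd_lock)
{
    for (;;) {
        struct sketch_lock *sched = atomic_load(&pc->schedule_lock);

        if (sched == rqd_lock) {
            sketch_lock_acquire(sched);
        } else if ((uintptr_t)sched < (uintptr_t)rqd_lock) {
            /* Fixed address order avoids ABBA deadlock between callers. */
            sketch_lock_acquire(sched);
            sketch_lock_acquire(rqd_lock);
        } else {
            sketch_lock_acquire(rqd_lock);
            sketch_lock_acquire(sched);
        }

        /* rqd_lock is fixed; only the schedule lock pointer can have moved. */
        if (sched == atomic_load(&pc->schedule_lock))
            return sched;

        sketch_lock_release(sched);
        if (sched != rqd_lock)
            sketch_lock_release(rqd_lock);
    }
}

static void sketch_unlock_both(struct sketch_lock *sched,
                               struct sketch_lock *rqd_lock)
{
    if (sched != rqd_lock)
        sketch_lock_release(rqd_lock);
    sketch_lock_release(sched);
}

/* Add cpu to the runqueue masks and switch its schedule lock to rqd->lock,
 * with both locks held throughout. */
static void sketch_assign_pcpu(struct sketch_pcpu *pc,
                               struct sketch_runqueue *rqd, unsigned int cpu)
{
    struct sketch_lock *sched;

    assert(cpu < 32); /* keep the shift within unsigned long on any ABI */
    sched = sketch_lock_both(pc, &rqd->lock);
    rqd->idle |= 1UL << cpu;
    rqd->active |= 1UL << cpu;
    atomic_store(&pc->schedule_lock, &rqd->lock);
    sketch_unlock_both(sched, &rqd->lock); /* release what was actually taken */
}
```

Note that the unlock uses the pointer saved at lock time rather than re-reading schedule_lock, since by then the pointer already names rqd->lock.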