From mboxrd@z Thu Jan 1 00:00:00 1970 From: George Dunlap Subject: Re: [PATCH] xen: Always ask the scheduler to re-place the vcpu when the affinity changes Date: Mon, 4 Mar 2013 14:03:24 +0000 Message-ID: <5134A9AC.2070004@eu.citrix.com> References: <1362399557-20214-1-git-send-email-george.dunlap@eu.citrix.com> <5134A33402000078000C2CF1@nat28.tlf.novell.com> <5134A570.7050604@eu.citrix.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <5134A570.7050604@eu.citrix.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: Dario Faggioli , "Keir (Xen.org)" , "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org On 04/03/13 13:45, George Dunlap wrote: > On 04/03/13 12:35, Jan Beulich wrote: >>>>> On 04.03.13 at 13:19, George Dunlap >>>>> wrote: >>> It's probably a good idea to re-evaluate placement whenever the >>> affinity changes. >>> >>> This additionally has the benefit of removing scheduler-specific >>> exceptions introduced in git c/s e6a6fd63. >>> >>> Signed-off-by: George Dunlap >>> --- >>> xen/common/schedule.c | 7 ++++--- >>> 1 file changed, 4 insertions(+), 3 deletions(-) >>> >>> diff --git a/xen/common/schedule.c b/xen/common/schedule.c >>> index de11110..dbef5af 100644 >>> --- a/xen/common/schedule.c >>> +++ b/xen/common/schedule.c >>> @@ -613,9 +613,10 @@ int vcpu_set_affinity(struct vcpu *v, const >>> cpumask_t >>> *affinity) >>> vcpu_schedule_lock_irq(v); >>> cpumask_copy(v->cpu_affinity, affinity); >>> - if ( VCPU2OP(v)->sched_id == XEN_SCHEDULER_SEDF || >>> - !cpumask_test_cpu(v->processor, v->cpu_affinity) ) >>> - set_bit(_VPF_migrating, &v->pause_flags); >>> + >>> + /* Always ask the scheduler to re-evaluate placement >>> + * when changing the affinity */ >>> + set_bit(_VPF_migrating, &v->pause_flags); >>> vcpu_schedule_unlock_irq(v); >> The code right below the context visible here is >> >> if ( test_bit(_VPF_migrating, &v->pause_flags) ) >> { >> vcpu_sleep_nosync(v); >> vcpu_migrate(v); >> } >> >> and I think the conditional could (and should) now be removed. > > Yeah, I wasn't sure what to make of that one -- it looked as though it > was coded such that someone else might be able to clear _VPF_migrating > between releasing the lock and this test. But if that can happen, > it's racy anyway. vcpu_force_reschedule() has this "set, unlock, > re-check" pattern. > > It looks like there actually is a way that it could conceivably be > cleared: if the vcpu is running on another pcpu, if after we release > the lock the vcpu is de-scheduled and context_saved() is called, it > will check for _VPF_migrating and call vcpu_migrate(), which can clear > the flag. > > But that's probably a rare enough occurrence that it's better overall > to take the occasional double-migrate. Hmm -- but thinking it further, it actually seems likely that a different double-migrate race will happen: 1. vcpu is running on pcpu A 2. pcpu B runs set_affinity, setting VPF_migrate 3. pcpu B calls vcpu_sleep_nosync 4. pcpu A wakes up and grabs the schedule lock 5. pcpu A notices that VPF_migrate is set, and calls vcpu_migrate() 6. pcpu B calls vcpu_migrate() Either that, or 6 happens before 4, but 4 still happens before pcpu B clears VPF_migrate. It seems like we should really only call if (!v->is_running || v->processor == this_cpu). Dario, any thoughts? -George > > -George