From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: [PATCH] fix locking in cpu_disable_scheduler()
Date: Mon, 28 Oct 2013 16:24:44 +0100
Message-ID: 
References: <526E994E02000078000FD571@nat28.tlf.novell.com>
In-Reply-To: <526E994E02000078000FD571@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: Jan Beulich , xen-devel
Cc: George Dunlap
List-Id: xen-devel@lists.xenproject.org

On 28/10/2013 17:05, "Jan Beulich" wrote:

> So commit eedd6039 ("scheduler: adjust internal locking interface")
> uncovered - by now using proper spin lock constructs - a bug after all:
> When bringing down a CPU, cpu_disable_scheduler() gets called with
> interrupts disabled, and hence the use of vcpu_schedule_lock_irq() was
> never really correct (i.e. the caller ended up with interrupts enabled
> despite having disabled them explicitly).
>
> Fixing this however surfaced another problem: the call path
> vcpu_migrate() -> evtchn_move_pirqs() wants to acquire the event lock,
> which however is a non-IRQ-safe one, and hence check_lock() doesn't
> like this lock to be acquired when interrupts are already off. As we're
> in stop-machine context here, getting things wrong wrt interrupt state
> management during lock acquire/release is out of the question, so the
> simple solution appears to be to suppress spin lock debugging for the
> period of time while the stop machine callback gets run.
>
> Signed-off-by: Jan Beulich

Acked-by: Keir Fraser

> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -601,7 +601,8 @@ int cpu_disable_scheduler(unsigned int c
>      {
>          for_each_vcpu ( d, v )
>          {
> -            spinlock_t *lock = vcpu_schedule_lock_irq(v);
> +            unsigned long flags;
> +            spinlock_t *lock = vcpu_schedule_lock_irqsave(v, &flags);
>
>              cpumask_and(&online_affinity, v->cpu_affinity, c->cpu_valid);
>              if ( cpumask_empty(&online_affinity) &&
> @@ -622,14 +623,12 @@ int cpu_disable_scheduler(unsigned int c
>              if ( v->processor == cpu )
>              {
>                  set_bit(_VPF_migrating, &v->pause_flags);
> -                vcpu_schedule_unlock_irq(lock, v);
> +                vcpu_schedule_unlock_irqrestore(lock, flags, v);
>                  vcpu_sleep_nosync(v);
>                  vcpu_migrate(v);
>              }
>              else
> -            {
> -                vcpu_schedule_unlock_irq(lock, v);
> -            }
> +                vcpu_schedule_unlock_irqrestore(lock, flags, v);
>
>              /*
>               * A vcpu active in the hypervisor will not be migratable.
> --- a/xen/common/stop_machine.c
> +++ b/xen/common/stop_machine.c
> @@ -110,6 +110,7 @@ int stop_machine_run(int (*fn)(void *),
>      local_irq_disable();
>      stopmachine_set_state(STOPMACHINE_DISABLE_IRQ);
>      stopmachine_wait_state();
> +    spin_debug_disable();
>
>      stopmachine_set_state(STOPMACHINE_INVOKE);
>      if ( (cpu == smp_processor_id()) || (cpu == NR_CPUS) )
> @@ -117,6 +118,7 @@ int stop_machine_run(int (*fn)(void *),
>      stopmachine_wait_state();
>      ret = stopmachine_data.fn_result;
>
> +    spin_debug_enable();
>      stopmachine_set_state(STOPMACHINE_EXIT);
>      stopmachine_wait_state();
>      local_irq_enable();