Re: [PATCH 3/4] xen: sched: reorganize cpu_disable_scheduler()

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

From: George Dunlap <george.dunlap@eu.citrix.com>
To: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Juergen Gross <jgross@suse.com>,
	xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH 3/4] xen: sched: reorganize cpu_disable_scheduler()
Date: Thu, 9 Jul 2015 11:33:41 +0100	[thread overview]
Message-ID: <559E4E05.6000800@eu.citrix.com> (raw)
In-Reply-To: <1436373444.22672.256.camel@citrix.com>

On 07/08/2015 05:37 PM, Dario Faggioli wrote:
> On Wed, 2015-07-08 at 17:01 +0100, George Dunlap wrote:
>> On Fri, Jul 3, 2015 at 4:49 PM, Dario Faggioli
> 
>>> --- a/xen/common/schedule.c
>>> +++ b/xen/common/schedule.c
>>> @@ -455,8 +455,8 @@ void vcpu_unblock(struct vcpu *v)
>>>   * Do the actual movemet of a vcpu from old to new CPU. Locks for *both*
>>>   * CPUs needs to have been taken already when calling this!
>>>   */
>>> -static void vcpu_move(struct vcpu *v, unsigned int old_cpu,
>>> -                      unsigned int new_cpu)
>>> +static void _vcpu_move(struct vcpu *v, unsigned int old_cpu,
>>> +                       unsigned int new_cpu)
>>>  {
>>>      /*
>>>       * Transfer urgency status to new CPU before switching CPUs, as
>>> @@ -479,6 +479,35 @@ static void vcpu_move(struct vcpu *v, unsigned int old_cpu,
>>>          v->processor = new_cpu;
>>>  }
>>>
>>> +static void vcpu_move(struct vcpu *v, unsigned int new_cpu)
>>
>> Not quite a fan of the naming scheme here.  Perhaps vcpu_move and
>> vcpu_move_locked?
>>
> I'm fine with that.
> 
>> Actually -- looking at this again, is there a reason to pass old_cpu
>> into _vcpu_move?  It looks like old_vcpu should always be equal to
>> v->processor.  That would make the function prototypes the same.
>>
> It should yes, I think I can get rid of it.
> 
>>> +{
>>> +    unsigned long flags;
>>> +    unsigned int cpu = v->processor;
>>> +    spinlock_t *lock, *new_lock;
>>> +
>>> +    /*
>>> +     * When here, the vcpu should be ready for being moved. This means:
>>> +     *  - both its original and target processor must be quiet;
>>> +     *  - it must not be marked as currently running;
>>> +     *  - the proper flag must be set (i.e., no one must have had any
>>> +     *    chance to reset it).
>>> +     */
>>> +    ASSERT(is_idle_vcpu(curr_on_cpu(cpu)) &&
>>> +           is_idle_vcpu(curr_on_cpu(new_cpu)));
>>> +    ASSERT(!v->is_running && test_bit(_VPF_migrating, &v->pause_flags));
>>> +
>>> +    lock = per_cpu(schedule_data, v->processor).schedule_lock;
>>> +    new_lock = per_cpu(schedule_data, new_cpu).schedule_lock;
>>> +
>>> +    sched_spin_lock_double(lock, new_lock, &flags);
>>> +    ASSERT(new_cpu != v->processor);
>>
>> Don't we need to do the "schedule lock dance" here as well --
>> double-check to make sure that v->processor hasn't changed since the
>> time we grabbed the lock?
>>
> This is intended to be called pretty much only from the place where it's
> called, i.e., during system teardown, when already is already quite
> quiet.
> 
> So, no, I don't think we need that, but I probably could have made this
> _a_lot_ more clear both with comments and ASSERT()-s. Would that be ok?

Yes, a comment and an assert are important.  We might give it a more
descriptive name too -- vcpu_move_teardown()?  I can't immediately think
of something that I like though. :-/

Anyway, I guess use your best judgement on the name.

> 
>>> @@ -645,25 +675,72 @@ int cpu_disable_scheduler(unsigned int cpu)
>>>                  cpumask_setall(v->cpu_hard_affinity);
>>>              }
>>>
>>> -            if ( v->processor == cpu )
>>> +            if ( v->processor != cpu )
>>>              {
>>> -                set_bit(_VPF_migrating, &v->pause_flags);
>>> +                /* This vcpu is not on cpu, so we can move on. */
>>>                  vcpu_schedule_unlock_irqrestore(lock, flags, v);
>>> -                vcpu_sleep_nosync(v);
>>> -                vcpu_migrate(v);
>>> +                continue;
>>>              }
>>> -            else
>>> -                vcpu_schedule_unlock_irqrestore(lock, flags, v);
>>>
>>>              /*
>>> -             * A vcpu active in the hypervisor will not be migratable.
>>> -             * The caller should try again after releasing and reaquiring
>>> -             * all locks.
>>> +             * If we're here, it means that the vcpu is on cpu. Let's see how
>>> +             * it's best to send it away, depending on whether we are shutting
>>> +             * down/suspending, or doing cpupool manipulations.
>>>               */
>>> -            if ( v->processor == cpu )
>>> -                ret = -EAGAIN;
>>> -        }
>>> +            set_bit(_VPF_migrating, &v->pause_flags);
>>> +            vcpu_schedule_unlock_irqrestore(lock, flags, v);
>>> +            vcpu_sleep_nosync(v);
>>
>> I don't quite understand this.  By calling _nosync(), you're not
>> guaranteed that the vcpu has actually stopped running when you call
>> vcpu_move() down below; and yet inside vcpu_move(), you assert
>> !v->is_running.
>>
>> So either at this point, when doing suspend, the vcpu has already been
>> paused; in which case this is unnecessary; or it hasn't been paused,
>> in which case we're potentially racing the IPI / deschedule, and will
>> trip over the ASSERT if we "win".
>>
> The former: if we're are suspending, at this stage, everything is paused
> already. My aim was to minimize the code being special cased basing on
> system_state, and I left this out because it is required for the
> !SYS_STATE_suspend case, and does not harm in the SYS_STATE_suspended
> case.
> 
> However, I see how you find it confusing, and agree it could trip people
> over. I'll restructure the code in such a way that we go through this
> only in the non-suspending case (and change vcpu_move() accordingly).

Great, thanks.

 -George

next prev parent reply	other threads:[~2015-07-09 10:34 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-03 15:49 [PATCH 0/4] xen: sched/cpupool: more fixing of (corner?) cases Dario Faggioli
2015-07-03 15:49 ` [PATCH 1/4] xen: sched: factor the code for taking two runq locks in a function Dario Faggioli
2015-07-08 14:47   ` George Dunlap
2015-07-03 15:49 ` [PATCH 2/4] xen: sched: factor code that moves a vcpu to a new pcpu " Dario Faggioli
2015-07-08 14:56   ` George Dunlap
2015-07-08 15:09     ` Dario Faggioli
2015-07-03 15:49 ` [PATCH 3/4] xen: sched: reorganize cpu_disable_scheduler() Dario Faggioli
2015-07-07 11:16   ` Juergen Gross
2015-07-08 15:13     ` Dario Faggioli
2015-07-09 10:24       ` Dario Faggioli
2015-07-09 10:45         ` Juergen Gross
2015-07-15 14:54           ` Dario Faggioli
2015-07-15 15:07             ` Juergen Gross
2015-07-08 16:01   ` George Dunlap
2015-07-08 16:37     ` Dario Faggioli
2015-07-09 10:33       ` George Dunlap [this message]
2015-07-03 15:49 ` [PATCH 4/4] xen: sched/cpupool: properly update affinity when removing a cpu from a cpupool Dario Faggioli
2015-07-07 11:30   ` Juergen Gross
2015-07-08 16:19   ` George Dunlap

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=559E4E05.6000800@eu.citrix.com \
    --to=george.dunlap@eu.citrix.com \
    --cc=dario.faggioli@citrix.com \
    --cc=jgross@suse.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).