xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Dario Faggioli <dario.faggioli@citrix.com>
To: George Dunlap <george.dunlap@citrix.com>, xen-devel@lists.xenproject.org
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	Anshul Makkar <anshul.makkar@citrix.com>,
	Jan Beulich <JBeulich@suse.com>
Subject: Re: [PATCH 05/24] xen: credit2: make tickling more deterministic
Date: Fri, 30 Sep 2016 04:22:09 +0200	[thread overview]
Message-ID: <1475202129.5315.70.camel@citrix.com> (raw)
In-Reply-To: <2cf45a94-2e9a-3cbb-4bde-9850dc0af139@citrix.com>


[-- Attachment #1.1: Type: text/plain, Size: 4698 bytes --]

On Tue, 2016-09-13 at 12:28 +0100, George Dunlap wrote:
> On 17/08/16 18:18, Dario Faggioli wrote:
> > 
> diff --git a/xen/common/sched_credit2.c
> > @@ -2233,7 +2241,8 @@ void __dump_execstate(void *unused);
> >  static struct csched2_vcpu *
> >  runq_candidate(struct csched2_runqueue_data *rqd,
> >                 struct csched2_vcpu *scurr,
> > -               int cpu, s_time_t now)
> > +               int cpu, s_time_t now,
> > +               unsigned int *pos)
> 
> I think I'd prefer if this were called "skipped" or something like
> that
> -- to indicate how many vcpus in the runqueue had been skipped before
> coming to this one.
> 
Done.

> > @@ -2298,6 +2340,7 @@ csched2_schedule(
> >      struct csched2_runqueue_data *rqd;
> >      struct csched2_vcpu * const scurr = CSCHED2_VCPU(current);
> >      struct csched2_vcpu *snext = NULL;
> > +    unsigned int snext_pos = 0;
> >      struct task_slice ret;
> >  
> >      SCHED_STAT_CRANK(schedule);
> > @@ -2347,7 +2390,7 @@ csched2_schedule(
> >          snext = CSCHED2_VCPU(idle_vcpu[cpu]);
> >      }
> >      else
> > -        snext=runq_candidate(rqd, scurr, cpu, now);
> > +        snext = runq_candidate(rqd, scurr, cpu, now, &snext_pos);
> >  
> >      /* If switching from a non-idle runnable vcpu, put it
> >       * back on the runqueue. */
> > @@ -2371,8 +2414,21 @@ csched2_schedule(
> >              __set_bit(__CSFLAG_scheduled, &snext->flags);
> >          }
> >  
> > -        /* Check for the reset condition */
> > -        if ( snext->credit <= CSCHED2_CREDIT_RESET )
> > +        /*
> > +         * The reset condition is "has a scheduler epoch come to
> > an end?".
> > +         * The way this is enforced is checking whether the vcpu
> > at the top
> > +         * of the runqueue has negative credits. This means the
> > epochs have
> > +         * variable lenght, as in one epoch expores when:
> > +         *  1) the vcpu at the top of the runqueue has executed
> > for
> > +         *     around 10 ms (with default parameters);
> > +         *  2) no other vcpu with higher credits wants to run.
> > +         *
> > +         * Here, where we want to check for reset, we need to make
> > sure the
> > +         * proper vcpu is being used. In fact,
> > runqueue_candidate() may have
> > +         * not returned the first vcpu in the runqueue, for
> > various reasons
> > +         * (e.g., affinity). Only trigger a reset when it does.
> > +         */
> > +        if ( snext_pos == 0 && snext->credit <=
> > CSCHED2_CREDIT_RESET )
> 
> This bit wasn't mentioned in the description. :-)
> 
You're right. Actually, I think this change deserves to be in its own
patch, so in v2 I'm splitting this patch in two.

> There's a certain amount of sense to the idea here, but it's the kind
> of
> thing that may have strange side effects.  Did you look at traces
> before
> and after this change?  And does the behavior seem more rational?
> 
I have. It's not like it was happening a lot of times that we were
resetting upon the wrong vcpus, but I indeed have caught a couple of
examples.

And yes, the trace looked more 'regular' with this patch. Or, IOW,
without this patch, there were some of the reset events that were
suspiciously closer between each other.

TBH, in the vast majority of the cases, even when a "spurious reset"
was involved, the difference was rather hard to tell, but please,
consider that the combination of hard-affinity, this patch and soft-
affinity will potentially make things much worse (and in fact, I saw
the most severe occurrences when using hard-affinity).

It's also rather hard to measure the effect, but I think what is
implemented here is the right thing to do. And even if it may be hard
to measure the performance impact, I claim that this is a 'correctness'
issue, or at least a matter of adhering as much as possible to the
algorithm theory and idea.

> If so, I'm happy to trust your judgement -- just want to check to
> make
> sure. :-)
> 
Ah, thanks. :-)

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  reply	other threads:[~2016-09-30  2:22 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-17 17:17 [PATCH 00/24] sched: Credit1 and Credit2 improvements... and soft-affinity for Credit2! Dario Faggioli
2016-08-17 17:17 ` [PATCH 01/24] xen: credit1: small optimization in Credit1's tickling logic Dario Faggioli
2016-09-12 15:01   ` George Dunlap
2016-08-17 17:17 ` [PATCH 02/24] xen: credit1: fix mask to be used for tickling in Credit1 Dario Faggioli
2016-08-17 23:42   ` Dario Faggioli
2016-09-12 15:04     ` George Dunlap
2016-08-17 17:17 ` [PATCH 03/24] xen: credit1: return the 'time remaining to the limit' as next timeslice Dario Faggioli
2016-09-12 15:14   ` George Dunlap
2016-09-12 17:00     ` Dario Faggioli
2016-09-14  9:34       ` George Dunlap
2016-09-14 13:54         ` Dario Faggioli
2016-08-17 17:18 ` [PATCH 04/24] xen: credit2: properly schedule migration of a running vcpu Dario Faggioli
2016-09-12 17:11   ` George Dunlap
2016-08-17 17:18 ` [PATCH 05/24] xen: credit2: make tickling more deterministic Dario Faggioli
2016-08-31 17:10   ` anshul makkar
2016-09-05 13:47     ` Dario Faggioli
2016-09-07 12:25       ` anshul makkar
2016-09-13 11:13       ` George Dunlap
2016-09-29 15:24         ` Dario Faggioli
2016-09-13 11:28   ` George Dunlap
2016-09-30  2:22     ` Dario Faggioli [this message]
2016-08-17 17:18 ` [PATCH 06/24] xen: credit2: implement yield() Dario Faggioli
2016-09-13 13:33   ` George Dunlap
2016-09-29 16:05     ` Dario Faggioli
2016-09-20 13:25   ` George Dunlap
2016-09-20 13:37     ` George Dunlap
2016-08-17 17:18 ` [PATCH 07/24] xen: sched: don't rate limit context switches in case of yields Dario Faggioli
2016-09-20 13:32   ` George Dunlap
2016-09-29 16:46     ` Dario Faggioli
2016-08-17 17:18 ` [PATCH 08/24] xen: tracing: add trace records for schedule and rate-limiting Dario Faggioli
2016-08-18  0:57   ` Meng Xu
2016-08-18  9:41     ` Dario Faggioli
2016-09-20 13:50   ` George Dunlap
2016-08-17 17:18 ` [PATCH 09/24] xen/tools: tracing: improve tracing of context switches Dario Faggioli
2016-09-20 14:08   ` George Dunlap
2016-08-17 17:18 ` [PATCH 10/24] xen: tracing: improve Credit2's tickle_check and burn_credits records Dario Faggioli
2016-09-20 14:35   ` George Dunlap
2016-09-29 17:23     ` Dario Faggioli
2016-09-29 17:28       ` George Dunlap
2016-09-29 20:53         ` Dario Faggioli
2016-08-17 17:18 ` [PATCH 11/24] tools: tracing: handle more scheduling related events Dario Faggioli
2016-09-20 14:37   ` George Dunlap
2016-08-17 17:18 ` [PATCH 12/24] xen: libxc: allow to set the ratelimit value online Dario Faggioli
2016-09-20 14:43   ` George Dunlap
2016-09-20 14:45     ` Wei Liu
2016-09-28 15:44   ` George Dunlap
2016-08-17 17:19 ` [PATCH 13/24] libxc: improve error handling of xc Credit1 and Credit2 helpers Dario Faggioli
2016-09-20 15:10   ` Wei Liu
2016-08-17 17:19 ` [PATCH 14/24] libxl: allow to set the ratelimit value online for Credit2 Dario Faggioli
2016-08-22  9:21   ` Ian Jackson
2016-09-05 14:02     ` Dario Faggioli
2016-08-22  9:28   ` Ian Jackson
2016-09-28 15:37     ` George Dunlap
2016-09-30  1:03     ` Dario Faggioli
2016-09-28 15:39   ` George Dunlap
2016-08-17 17:19 ` [PATCH 15/24] xl: " Dario Faggioli
2016-09-28 15:46   ` George Dunlap
2016-08-17 17:19 ` [PATCH 16/24] xen: sched: factor affinity helpers out of sched_credit.c Dario Faggioli
2016-09-28 15:49   ` George Dunlap
2016-08-17 17:19 ` [PATCH 17/24] xen: credit2: soft-affinity awareness in runq_tickle() Dario Faggioli
2016-09-01 10:52   ` anshul makkar
2016-09-05 14:55     ` Dario Faggioli
2016-09-07 13:24       ` anshul makkar
2016-09-07 13:31         ` Dario Faggioli
2016-09-28 20:44   ` George Dunlap
2016-08-17 17:19 ` [PATCH 18/24] xen: credit2: soft-affinity awareness fallback_cpu() and cpu_pick() Dario Faggioli
2016-09-01 11:08   ` anshul makkar
2016-09-05 13:26     ` Dario Faggioli
2016-09-07 12:52       ` anshul makkar
2016-09-29 11:11   ` George Dunlap
2016-08-17 17:19 ` [PATCH 19/24] xen: credit2: soft-affinity awareness in load balancing Dario Faggioli
2016-09-02 11:46   ` anshul makkar
2016-09-05 12:49     ` Dario Faggioli
2016-08-17 17:19 ` [PATCH 20/24] xen: credit2: kick away vcpus not running within their soft-affinity Dario Faggioli
2016-08-17 17:20 ` [PATCH 21/24] xen: credit2: optimize runq_candidate() a little bit Dario Faggioli
2016-08-17 17:20 ` [PATCH 22/24] xen: credit2: "relax" CSCHED2_MAX_TIMER Dario Faggioli
2016-09-30 15:30   ` George Dunlap
2016-08-17 17:20 ` [PATCH 23/24] xen: credit2: optimize runq_tickle() a little bit Dario Faggioli
2016-09-02 12:38   ` anshul makkar
2016-09-05 12:52     ` Dario Faggioli
2016-08-17 17:20 ` [PATCH 24/24] xen: credit2: try to avoid tickling cpus subject to ratelimiting Dario Faggioli
2016-08-18  0:11 ` [PATCH 00/24] sched: Credit1 and Credit2 improvements... and soft-affinity for Credit2! Dario Faggioli
2016-08-18 11:49 ` Dario Faggioli
2016-08-18 11:53 ` Dario Faggioli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1475202129.5315.70.camel@citrix.com \
    --to=dario.faggioli@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=anshul.makkar@citrix.com \
    --cc=george.dunlap@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).