From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: [PATCH v4] xen: sched: convert RTDS from time to event driven model Date: Tue, 2 Feb 2016 16:08:25 +0100 Message-ID: <1454425705.9227.176.camel@citrix.com> References: <1454301176-7131-1-git-send-email-tiche@seas.upenn.edu> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============4923990612523484960==" Return-path: Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1aQcZH-0002aA-VU for xen-devel@lists.xenproject.org; Tue, 02 Feb 2016 15:08:36 +0000 In-Reply-To: <1454301176-7131-1-git-send-email-tiche@seas.upenn.edu> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Tianyang Chen , xen-devel@lists.xenproject.org Cc: george.dunlap@citrix.com, Dagaen Golomb , Meng Xu List-Id: xen-devel@lists.xenproject.org --===============4923990612523484960== Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-E7gVb2yvKkbtjqF1oyRI" --=-E7gVb2yvKkbtjqF1oyRI Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Sun, 2016-01-31 at 23:32 -0500, Tianyang Chen wrote: > v4 is meant for discussion on the addition of replq. >=20 In which cases, you should always put something like RFC in the subject, so people knows what the intent is even by just skimming their inboxes. > Changes since v3: > removed running queue. > added repl queue to keep track of repl events. > timer is now per scheduler. > timer is init on a valid cpu in a cpupool. >=20 > Bugs to be fixed: Cpupool and locks. When a pcpu is removed from a > pool and added to another, the lock equality assert in free_pdata() > fails when Pool-0 is using rtds. >=20 Known issue. I will fix it for both Credit2 and RTDS, so just ignore it. > diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c > index 2e5430f..c36e5de 100644 > --- a/xen/common/sched_rt.c > +++ b/xen/common/sched_rt.c > @@ -16,6 +16,7 @@ > =C2=A0#include > =C2=A0#include > =C2=A0#include > +#include > =C2=A0#include > =C2=A0#include > =C2=A0#include > @@ -87,7 +88,7 @@ > =C2=A0#define RTDS_DEFAULT_BUDGET=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0(MICROSECS= (4000)) > =C2=A0 > =C2=A0#define UPDATE_LIMIT_SHIFT=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A010 > -#define MAX_SCHEDULE=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0(MILLISECS(1)) > + > Unrelated to the topic. Can we have a dedicated patch that adds a comment above=C2=A0UPDATE_LIMIT_SHIFT, explainig what it is and how it's used. This is something low priority, of course. > @@ -160,6 +166,7 @@ struct rt_private { > =C2=A0 */ > =C2=A0struct rt_vcpu { > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0struct list_head q_elem;=C2=A0=C2=A0=C2=A0= =C2=A0/* on the runq/depletedq list */ > +=C2=A0=C2=A0=C2=A0=C2=A0struct list_head p_elem;=C2=A0=C2=A0=C2=A0=C2=A0= /* on the repl event list */ > =C2=A0 Mmm... 'p_elem' because the previous one was 'q_elem', and 'p' follows 'q', I guess. It feels a bit obscure, though, and I'd prefer a more 'talking' name. For now, and at this level, I guess I can live with it. But in other places, things need to be improved (see below). > @@ -213,8 +220,14 @@ static inline struct list_head > *rt_depletedq(const struct scheduler *ops) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return &rt_priv(ops)->depletedq; > =C2=A0} > =C2=A0 > +static inline struct list_head *rt_replq(const struct scheduler > *ops) > +{ > +=C2=A0=C2=A0=C2=A0=C2=A0return &rt_priv(ops)->replq; > +} > + > =C2=A0/* > - * Queue helper functions for runq and depletedq > + * Queue helper functions for runq, depletedq > + * and repl event q > =C2=A0 */ > So, in comments, can we use full words as much as possible. It makes them a lot easier to read (so, e.g., "replenishment events queue" or "queue of the replenishment events"). I know this very file's style is not like that, from this point of view... and, in fact, this is something I would like to change, when we get the chance (i.e., when touching the code for other reasons, like in this case). > @@ -228,6 +241,18 @@ __q_elem(struct list_head *elem) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return list_entry(elem, struct rt_vcpu, q_e= lem); > =C2=A0} > =C2=A0 > +static struct rt_vcpu * > +__p_elem(struct list_head *elem) > +{ > +=C2=A0=C2=A0=C2=A0=C2=A0return list_entry(elem, struct rt_vcpu, p_elem); > +} > + So, while this may well still be fine... > +static int > +__vcpu_on_p(const struct rt_vcpu *svc) > +{ > +=C2=A0=C2=A0=C2=A0return !list_empty(&svc->p_elem); > +} > + > ...here I really would like to go for something that will make it much more obvious, just by giving a look at the code, where we are actually checking the vcpu to be, without having to remember that p is the replenishment queue. So, I don't know, maybe something like vcpu_on_replq(), or vcpu_needs_replenish() ? > @@ -387,6 +412,13 @@ __q_remove(struct rt_vcpu *svc) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0list_del_init(&svc-= >q_elem); > =C2=A0} > =C2=A0 > +static inline void > +__p_remove(struct rt_vcpu *svc) > +{ > +=C2=A0=C2=A0=C2=A0=C2=A0if ( __vcpu_on_p(svc) ) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0list_del_init(&svc->p_el= em); > +} > + replq_remove(), or vcpu_repl_cancel() (or something else) ? > @@ -421,6 +453,32 @@ __runq_insert(const struct scheduler *ops, > struct rt_vcpu *svc) > =C2=A0} > =C2=A0 > =C2=A0/* > + * Insert svc into the repl even list: > + * vcpus that needs to be repl earlier go first. > + */ > +static void > +__replq_insert(const struct scheduler *ops, struct rt_vcpu *svc) > +{ > This is exactly what I've been trying to say above: __replq_insert() is ok, much better than __q_insert()! Do the same everywhere else. :-) > +=C2=A0=C2=A0=C2=A0=C2=A0struct rt_private *prv =3D rt_priv(ops); > +=C2=A0=C2=A0=C2=A0=C2=A0struct list_head *replq =3D rt_replq(ops); > +=C2=A0=C2=A0=C2=A0=C2=A0struct list_head *iter; > + > +=C2=A0=C2=A0=C2=A0=C2=A0ASSERT( spin_is_locked(&prv->lock) ); > + Is this really useful? Considering where this is called, I don't think it is. > +=C2=A0=C2=A0=C2=A0=C2=A0ASSERT( !__vcpu_on_p(svc) ); > + > +=C2=A0=C2=A0=C2=A0=C2=A0list_for_each(iter, replq) > +=C2=A0=C2=A0=C2=A0=C2=A0{ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0struct rt_vcpu * iter_sv= c =3D __p_elem(iter); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( svc->cur_deadline <= =3D iter_svc->cur_deadline ) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0break; > +=C2=A0=C2=A0=C2=A0=C2=A0} > + > +=C2=A0=C2=A0=C2=A0=C2=A0list_add_tail(&svc->p_elem, iter); > +} This looks ok. The list_for_each()+list_ad_tail() part would be basically identical to what we have inside __runq_insert(), thgough, wouldn't it? It may make sense to define an _ordered_queue_insert() (or _deadline_queue_insert(), since it's alwas the deadline that is compared) utility function that we can then call in both cases. > @@ -449,6 +507,7 @@ rt_init(struct scheduler *ops) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0INIT_LIST_HEAD(&prv->sdom); > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0INIT_LIST_HEAD(&prv->runq); > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0INIT_LIST_HEAD(&prv->depletedq); > +=C2=A0=C2=A0=C2=A0=C2=A0INIT_LIST_HEAD(&prv->replq); > =C2=A0 Is there a reason why you don't do at least the allocation of the timer here? Because, to me, this looks the best place for that. > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0cpumask_clear(&prv->tickled); > =C2=A0 > @@ -473,6 +532,9 @@ rt_deinit(const struct scheduler *ops) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0xfree(_cpumask_scra= tch); > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0_cpumask_scratch = =3D NULL; > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > + > +=C2=A0=C2=A0=C2=A0=C2=A0kill_timer(prv->repl_timer); > + > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0xfree(prv); > And here, you're freeing prv, without having freed prv->timer, which is therefore leaked. =C2=A0 > @@ -632,6 +699,17 @@ rt_vcpu_insert(const struct scheduler *ops, > struct vcpu *vc) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0vcpu_schedule_unlock_irq(lock, vc); > =C2=A0 > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0SCHED_STAT_CRANK(vcpu_insert); > + > +=C2=A0=C2=A0=C2=A0=C2=A0if( prv->repl_timer =3D=3D NULL ) > +=C2=A0=C2=A0=C2=A0=C2=A0{=C2=A0=C2=A0=C2=A0 > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* vc->processor has bee= n set in schedule.c */ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0cpu =3D vc->processor; > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0repl_timer =3D xzalloc(s= truct timer); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0prv->repl_timer =3D repl= _timer; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0init_timer(repl_timer, r= epl_handler, (void *)ops, cpu); > +=C2=A0=C2=A0=C2=A0=C2=A0} > =C2=A0} > =C2=A0 This belong to rt_init, IMO. > @@ -841,7 +881,6 @@ rt_schedule(const struct scheduler *ops, s_time_t > now, bool_t tasklet_work_sched > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* burn_budget would return for IDLE VCPU *= / > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0burn_budget(ops, scurr, now); > =C2=A0 > -=C2=A0=C2=A0=C2=A0=C2=A0__repl_update(ops, now); > =C2=A0 __repl_update() going away is of course fine. But we need to enact the budget enforcement logic somewhere in here, don't we? > @@ -924,6 +967,10 @@ rt_vcpu_sleep(const struct scheduler *ops, > struct vcpu *vc) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0__q_remove(svc); > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0else if ( svc->flags & RTDS_delayed_runq_ad= d ) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0clear_bit(__RTDS_de= layed_runq_add, &svc->flags); > + > +=C2=A0=C2=A0=C2=A0=C2=A0/* stop tracking the repl time of this vcpu */ > +=C2=A0=C2=A0=C2=A0/* if( __vcpu_on_p(svc) ) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0__p_remove(svc); */ > =C2=A0} > =C2=A0 Is it ok to kill the replenishment in this case? This is a genuine question. What does the dynamic DS algorithm that you're trying to implement here says about this? Meng, maybe you can help here? Is it ok to do this _because_ you then handle the situation of a replenishment having to had happened while the vcpu was asleep in rt_vcpu_wake (the '(now>=3Dsvc->cur_deadline)' thing)? Yes... it probably is ok for that reason... > @@ -1027,23 +1074,21 @@ rt_vcpu_wake(const struct scheduler *ops, > struct vcpu *vc) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0struct rt_vcpu * const svc =3D rt_vcpu(vc); > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0s_time_t now =3D NOW(); > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0struct rt_private *prv =3D rt_priv(ops); > -=C2=A0=C2=A0=C2=A0=C2=A0struct rt_vcpu *snext =3D NULL; /* highest prior= ity on RunQ */ > -=C2=A0=C2=A0=C2=A0=C2=A0struct rt_dom *sdom =3D NULL; > -=C2=A0=C2=A0=C2=A0=C2=A0cpumask_t *online; > +=C2=A0=C2=A0=C2=A0=C2=A0struct timer *repl_timer =3D prv->repl_timer; > =C2=A0 > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0BUG_ON( is_idle_vcpu(vc) ); > =C2=A0 > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( unlikely(curr_on_cpu(vc->processor) = =3D=3D vc) ) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0{ > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0SCHED_STAT_CRANK(vc= pu_wake_running); > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0goto out; > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > =C2=A0 > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* on RunQ/DepletedQ, just update info is o= k */ > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( unlikely(__vcpu_on_q(svc)) ) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0{ > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0SCHED_STAT_CRANK(vc= pu_wake_onrunq); > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0goto out; > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > =C2=A0 > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( likely(vcpu_runnable(vc)) ) > Mmm.. no. At least the first two cases, they should really just be 'return'-s. As you say yourself in the comment further down, right below the 'out:' label "a newly waken-up vcpu could xxx". If this is running on a pCPU, or it is in a runqueue already, it is not a newly woken-up vcpu. In fact, we need to check for this cases, but let's at least made them act like NOPs, as all other schedulers do and as it's correct. > @@ -1058,25 +1103,39 @@ rt_vcpu_wake(const struct scheduler *ops, > struct vcpu *vc) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( unlikely(svc->flags & RTDS_scheduled) = ) > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0{ > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0set_bit(__RTDS_dela= yed_runq_add, &svc->flags); > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0goto out; > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > In this case, I'm less sure. I think this should just return as well, but it may depend on how other stuff are done (like budget enforcement), so it's hard to think at it right now, but consider this when working on next version. > +=C2=A0=C2=A0=C2=A0=C2=A0/* budget repl here is needed before inserting b= ack to runq. If > so, > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* it should be taken out of replq and put = back. This can > potentially > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* cause the timer handler to replenish no = vcpu when it triggers > if > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* the replenishment is done here already. > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0*/ > Coding style for long comments wants them to have both "wings" (you're missing the opening one, '/*'). > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( now >=3D svc->cur_deadline) > +=C2=A0=C2=A0=C2=A0=C2=A0{=C2=A0=C2=A0=C2=A0 > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0rt_update_deadline(= now, svc); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if( __vcpu_on_p(svc) ) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= __p_remove(svc); > +=C2=A0=C2=A0=C2=A0=C2=A0} > =C2=A0 Well, then maybe the __p_remove() function can be made smart enough to: =C2=A0- check whether the one being removed is the first element of the =C2=A0 =C2=A0queue; =C2=A0- if it is, disarm the timer =C2=A0- if there still are elements in the queue, rearm the timer to fire at=C2=A0 =C2=A0 =C2=A0the new first's deadline This will be useful at least in another case (rt_vcpu_sleep()). > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* insert svc to runq/depletedq because svc= is not in queue now > */ > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0__runq_insert(ops, svc); > =C2=A0 > -=C2=A0=C2=A0=C2=A0=C2=A0__repl_update(ops, now); > - > -=C2=A0=C2=A0=C2=A0=C2=A0ASSERT(!list_empty(&prv->sdom)); > -=C2=A0=C2=A0=C2=A0=C2=A0sdom =3D list_entry(prv->sdom.next, struct rt_do= m, sdom_elem); > -=C2=A0=C2=A0=C2=A0=C2=A0online =3D cpupool_domain_cpumask(sdom->dom); > -=C2=A0=C2=A0=C2=A0=C2=A0snext =3D __runq_pick(ops, online); /* pick snex= t from ALL valid > cpus */ > - > -=C2=A0=C2=A0=C2=A0=C2=A0runq_tickle(ops, snext); > +=C2=A0=C2=A0=C2=A0=C2=A0runq_tickle(ops, svc); > =C2=A0 > -=C2=A0=C2=A0=C2=A0=C2=A0return; > +out: > +=C2=A0=C2=A0=C2=A0=C2=A0/* a newly waken-up vcpu could have an earlier r= elease time=C2=A0 > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* or it could be the first to program the = timer > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0*/ > +=C2=A0=C2=A0=C2=A0=C2=A0if( repl_timer->expires =3D=3D 0 || repl_timer->= expires > svc- > >cur_deadline ) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0set_timer(repl_timer,svc= ->cur_deadline); > + And again, can't this logic be part of __rqplq_insert()? I.e.: =C2=A0- do the insertion =C2=A0- if the inserted element is the new head of the queue, reprogram the= =C2=A0 =C2=A0 =C2=A0timer > +=C2=A0=C2=A0=C2=A0=C2=A0/* start tracking the repl time of this vcpu her= e=C2=A0 > +=C2=A0=C2=A0=C2=A0=C2=A0* it stays on the replq unless it goes to sleep > +=C2=A0=C2=A0=C2=A0=C2=A0* or marked as not-runnable > +=C2=A0=C2=A0=C2=A0=C2=A0*/ > +=C2=A0=C2=A0=C2=A0=C2=A0if( !__vcpu_on_p(svc) ) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0__replq_insert(ops, svc); > =C2=A0} > @@ -1168,6 +1216,80 @@ rt_dom_cntl( > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return rc; > =C2=A0} > =C2=A0 > +/* The replenishment timer handler picks vcpus > + * from the replq and does the actual replenishment > + * the replq only keeps track of runnable vcpus > + */ > +static void repl_handler(void *data){ > +=C2=A0=C2=A0=C2=A0=C2=A0unsigned long flags; > +=C2=A0=C2=A0=C2=A0=C2=A0s_time_t now =3D NOW(); > +=C2=A0=C2=A0=C2=A0=C2=A0s_time_t t_next =3D LONG_MAX; /* the next time t= imer should be > triggered */ > +=C2=A0=C2=A0=C2=A0=C2=A0struct scheduler *ops =3D data;=C2=A0 > +=C2=A0=C2=A0=C2=A0=C2=A0struct rt_private *prv =3D rt_priv(ops); > +=C2=A0=C2=A0=C2=A0=C2=A0struct list_head *replq =3D rt_replq(ops); > +=C2=A0=C2=A0=C2=A0=C2=A0struct timer *repl_timer =3D prv->repl_timer; > +=C2=A0=C2=A0=C2=A0=C2=A0struct list_head *iter, *tmp; > +=C2=A0=C2=A0=C2=A0=C2=A0struct rt_vcpu *svc =3D NULL; > + > +=C2=A0=C2=A0=C2=A0=C2=A0stop_timer(repl_timer); > + > +=C2=A0=C2=A0=C2=A0=C2=A0spin_lock_irqsave(&prv->lock, flags); > +=C2=A0 > +=C2=A0=C2=A0=C2=A0=C2=A0list_for_each_safe(iter, tmp, replq) > +=C2=A0=C2=A0=C2=A0=C2=A0{ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0svc =3D __p_elem(iter); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( now >=3D svc->cur_d= eadline ) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0{ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= rt_update_deadline(now, svc); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= if( t_next > svc->cur_deadline ) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0t_next =3D svc->cur_deadline; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Why do you need to do this? The next time at which the timer needs to be reprogrammed is, after you'll have done all the replenishments that needed to be done, here in this loop, the deadline of the new head of the replenishment queue (if there still are elements in there). > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= /* when the replenishment happens > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0* svc is either on a pcpu or on > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0* runq/depletedq > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0*/ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= if( __vcpu_on_q(svc) ) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= { > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0/* put back to runq */ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0__q_remove(svc); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0__runq_insert(ops, svc); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0runq_tickle(ops, svc); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= } > + Aha! So, it looks the budget enforcement logic ended up here? Mmm... no, I think that doing it in rt_schedule(), as we agreed, would make things simpler and better looking, both here and there. This is the replenishment timer handler, and it really should do one thins: handle replenishment events. That's it! ;-P > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= /* resort replq, keep track of this > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0* vcpu if it's still runnable. If=C2=A0 > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0* at this point no vcpu are, then > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0* the timer waits for wake() to > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0* program it. > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0*/ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= __p_remove(svc); > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= if( vcpu_runnable(svc->vcpu) ) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0__replq_insert(ops, svc); > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > + > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0else > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0{ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= /* Break out of the loop if a vcpu's > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0* cur_deadline is bigger than now. > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0* Also when some replenishment is done > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0* in wake(), the code could end up here > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0* if the time fires earlier than it should > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= */ > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= if( t_next > svc->cur_deadline ) > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0t_next =3D svc->cur_deadline; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= break; > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0} > +=C2=A0=C2=A0=C2=A0=C2=A0} > + > +=C2=A0=C2=A0=C2=A0=C2=A0set_timer(repl_timer, t_next); > + Well, what if there are no more replenishments to be performed? You're still reprogramming the timer, either at t_next or at cur_deadline, which looks arbitrary. If there are no replenishments, let's just keep the timer off. So, all in all, this is a very very good first attempt at implementing what we agreed upon in previous email... Good work, can't wait to see v5! :-D Let me know if there is anything unclear with any of the comments. Thanks and Regards, Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-E7gVb2yvKkbtjqF1oyRI Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iEYEABECAAYFAlawxmsACgkQk4XaBE3IOsS7xQCgm+Sf6yhViSpLGbdUxAyQTSdC 2vkAn3D/oI3FHI+m/peB+XjykawMaF3G =Dgkf -----END PGP SIGNATURE----- --=-E7gVb2yvKkbtjqF1oyRI-- --===============4923990612523484960== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel --===============4923990612523484960==--