From: Ryan Harper <ryanh@us.ibm.com>
To: Stephan Diestelhorst <sd386@cl.cam.ac.uk>
Cc: xen-devel@lists.xensource.com
Subject: Re: scheduler independent forced vcpu selection
Date: Wed, 18 May 2005 13:03:07 -0500
Message-ID: <20050518180307.GK7305@us.ibm.com>
In-Reply-To: <428B30BC.8070602@cl.cam.ac.uk>
* Stephan Diestelhorst <sd386@cl.cam.ac.uk> [2005-05-18 09:04]:
> The timer assertion might be the old scheduling timer, which gets
> probably reset, but not deleted beforehand... And the on runqueue
> assertion suggests that you are 'stealing' the domain from the
> schedulers queues without giving it a chance to notice.
Looking at both bvt and sedf, each runqueue is ordered by a
scheduler-specific metric (evt and deadline, respectively). What I think
we need is a way to swap positions in the runqueues. That is, if the
lock holder is runnable, I want the holder to run instead of current.
Is there some way to do this in a scheduler-independent manner with the
current set of scheduler ops defined in sched-if.h?
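To make the "swap positions" idea concrete, here is a toy sketch in
standalone C. It is not Xen code: struct rq_entry, rq_find() and
rq_swap_slots() are made-up names, and it assumes the sort metric stays
with the queue slot so the queue remains ordered after the swap. Whether
bvt or sedf could tolerate that is exactly the open question.

```c
#include <assert.h>
#include <stddef.h>

/* Toy runqueue slot: which vcpu sits here and the metric the queue is
 * sorted by (think evt for bvt, deadline for sedf). Hypothetical layout. */
struct rq_entry {
    int  vcpu;
    long metric;
};

/* Return the index of vcpu in the queue, or -1 if it is not queued. */
static int rq_find(const struct rq_entry *rq, size_t n, int vcpu)
{
    assert(rq != NULL);
    for (size_t i = 0; i < n; i++)
        if (rq[i].vcpu == vcpu)
            return (int)i;
    return -1;
}

/* Swap the slots occupied by vcpus a and b. The metrics stay with the
 * slots (an assumption of this sketch), so the sort order is preserved.
 * Returns 0 on success, -1 if either vcpu is not on the queue. */
static int rq_swap_slots(struct rq_entry *rq, size_t n, int a, int b)
{
    int ia = rq_find(rq, n, a);
    int ib = rq_find(rq, n, b);

    if (ia < 0 || ib < 0)
        return -1;

    int tmp = rq[ia].vcpu;
    rq[ia].vcpu = rq[ib].vcpu;
    rq[ib].vcpu = tmp;
    return 0;
}
```

With current at the head and the holder further back, one call to
rq_swap_slots() puts the holder at the head without the scheduler having
to recompute any metric.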
I noticed that neither bvt nor sedf implements the rem_task function.
I thought it could help with the 'stealing' by notifying the scheduler
that prev was going away (removing it from the runqueue), but just
removing the exec_domain from the runqueue didn't help.
I'm including a patch that I'm currently using so you can get a better
idea of the modifications to schedule.c I'm making.
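Roughly, the checks the patch makes in do_confer() and the way the
target vcpu is packed into the conferred field can be modelled as below.
This is a standalone sketch, not the patch itself: every MODEL_* value
and model_* function is assumed for illustration (the real
VCPU_CONFERRING and MAX_VIRT_CPUS values are not shown in this mail).
Only the 0xffff mask and the even/odd yield_count convention are taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_VCPUS   32u        /* assumed stand-in for MAX_VIRT_CPUS */
#define MODEL_CONFERRING  0x10000u   /* assumed flag bit above the vcpu id */
#define MODEL_VCPU_MASK   0xffffu    /* same mask the patch uses */

/* Even yield_count: vcpu is running. Odd: preempted or conferred. */
static bool model_is_preempted(uint32_t yield_count)
{
    return (yield_count & 1u) != 0;
}

/* A confer is only worthwhile if the target index is in bounds, the
 * target is not running, and its yield_count has not moved since the
 * caller sampled it. */
static bool model_confer_ok(unsigned int vcpu, uint32_t seen, uint32_t actual)
{
    if (vcpu >= MODEL_MAX_VCPUS)
        return false;
    if (!model_is_preempted(actual))
        return false;
    return seen == actual;
}

/* Pack the target vcpu into the conferred word, as in
 * current->conferred = (VCPU_CONFERRING|vcpu). */
static uint32_t model_encode(unsigned int vcpu)
{
    assert(vcpu <= MODEL_VCPU_MASK);
    return MODEL_CONFERRING | vcpu;
}

/* Recover the target vcpu, as in holder_vcpu = 0xffff & prev->conferred. */
static unsigned int model_holder(uint32_t conferred)
{
    return conferred & MODEL_VCPU_MASK;
}
```

The yield_count comparison is what guards against a stale request: if
the holder has been scheduled since the caller sampled the count, the
values differ and the confer is dropped.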
--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253 T/L: 678-9253
ryanh@us.ibm.com
---
--- b/xen/common/schedule.c 2005-05-17 22:16:55.000000000 -0500
+++ c/xen/common/schedule.c 2005-05-18 12:42:44.765691872 -0500
@@ -273,6 +273,49 @@
return 0;
}
+/* Confer control to another vcpu */
+long do_confer(unsigned int vcpu, unsigned int yield_count)
+{
+ struct domain *d = current->domain;
+
+ /* count hcalls */
+ current->confercnt++;
+
+ /* Validate CONFER prereqs:
+ * - vcpu is within bounds
+ * - vcpu is valid in this domain
+ * - current has not already conferred its slice to vcpu
+ * - vcpu is not already running
+ * - designated vcpu's yield_count matches value from call
+ *
+ * if all of the above are ok, set conferred value and enter scheduler
+ */
+
+ if (vcpu >= MAX_VIRT_CPUS)
+ return 0;
+
+ if (d->exec_domain[vcpu] == NULL)
+ return 0;
+
+ if (current->conferred != VCPU_CANCONFER)
+ return 0;
+
+ /* even counts indicate a running vcpu, odd is preempted/conferred */
+ if ((d->exec_domain[vcpu]->vcpu_info->yield_count & 1) == 0)
+ return 0;
+
+ if (d->exec_domain[vcpu]->vcpu_info->yield_count != yield_count)
+ return 0;
+
+ /*
+ * set which vcpu should run in conferred state, request scheduling
+ */
+ current->conferred = (VCPU_CONFERRING|vcpu);
+ raise_softirq(SCHEDULE_SOFTIRQ);
+
+ return 0;
+}
+
/*
* Demultiplex scheduler-related hypercalls.
*/
@@ -412,8 +455,9 @@
*/
static void __enter_scheduler(void)
{
- struct exec_domain *prev = current, *next = NULL;
+ struct exec_domain *prev = current, *next = NULL, *holder = NULL;
int cpu = prev->processor;
+ unsigned int holder_vcpu;
s_time_t now;
struct task_slice next_slice;
s32 r_time; /* time for new dom to run */
@@ -436,12 +480,39 @@
prev->cpu_time += now - prev->lastschd;
- /* get policy-specific decision on scheduling... */
- next_slice = ops.do_schedule(now);
+ /* get ed pointer to holder vcpu */
+ holder_vcpu = 0xffff & prev->conferred;
+ holder = prev->domain->exec_domain[holder_vcpu];
+
+ if (unlikely(prev->conferred & VCPU_CONFERRING) &&
+ domain_runnable(holder))
+ {
+ /* run holder next */
+ next = holder;
+
+ /* run for the remainder of prev's slice */
+ r_time = schedule_data[cpu].s_timer.expires - now;
+
+ /* increment confer counters */
+ prev->confer_out++;
+ next->confer_in++;
+
+ /* change prev's confer state to prevent re-entrance */
+ prev->conferred = VCPU_CONFERRED;
+
+ } else {
+ /* get policy-specific decision on scheduling... */
+ next_slice = ops.do_schedule(now);
+
+ r_time = next_slice.time;
+ next = next_slice.task;
+ }
+
+ /*
+ * always clear conferred state so this vcpu can confer during its slice
+ */
+ next->conferred = 0;
- r_time = next_slice.time;
- next = next_slice.task;
-
schedule_data[cpu].curr = next;
next->lastschd = now;
@@ -455,6 +526,12 @@
spin_unlock_irq(&schedule_data[cpu].schedule_lock);
+ /* bump vcpu yield_count when controlling domain is not-idle */
+ if ( !is_idle_task(prev->domain) )
+ prev->vcpu_info->yield_count++;
+ if ( !is_idle_task(next->domain) )
+ next->vcpu_info->yield_count++;
+
if ( unlikely(prev == next) ) {
#ifdef ADV_SCHED_HISTO
adv_sched_hist_to_stop(cpu);