From: George Dunlap <George.Dunlap@eu.citrix.com>
To: Keir Fraser <keir.fraser@eu.citrix.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Re: Re: [PATCH] [RFC] Credit2 scheduler prototype
Date: Wed, 13 Jan 2010 14:48:11 +0000 [thread overview]
Message-ID: <de76405a1001130648u50ccf3ebg3bde1b0c79840366@mail.gmail.com> (raw)
In-Reply-To: <C7444985.3E6A%keir.fraser@eu.citrix.com>
[-- Attachment #1: Type: text/plain, Size: 1699 bytes --]
Keir,
What do you think of the attached patches?
The first implements something like what you suggest below, but
instead of using the VPF_migrating "hack", it adds a proper
"context_saved" SCHED_OP callback.
The second addresses the fact that when sharing runqueues,
v->processor may change quickly without an explicit migrate.
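
The underlying issue is that a vcpu's schedule lock is chosen via
v->processor, so when a shared runqueue hands a vcpu to a different pcpu,
the field has to be retargeted while still holding the lock it currently
points at.  The pattern, simplified (this is a fragment of the new
__vcpu_processor_sync() helper; the full code is in the second patch below):

    /* Take the lock that v->processor currently designates, retarget the
     * vcpu to this pcpu, then release the old pcpu's lock. */
    vcpu_schedule_lock_irqsave(next, flags);
    old_cpu = next->processor;
    next->processor = smp_processor_id();
    spin_unlock_irqrestore(
        &per_cpu(schedule_data, old_cpu).schedule_lock, flags);
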
The last two are the credit2 hypervisor and tool patches, which use
these two changes (for reference).
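
For reference, the libxc call added by the tool patch is used along these
lines (sketch only: the xc_handle, domid, and the weight of 512 are
placeholders, and error handling is omitted):

    struct xen_domctl_sched_credit2 sdom = { .weight = 512 };

    if ( xc_sched_credit2_domain_set(xc_handle, domid, &sdom) != 0 )
        /* hypercall failed */;
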
I think these patches should be basically a no-op for the existing
schedulers, so as far as I'm concerned they're ready to be merged as
soon as you're happy with them.
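
(The reason they are unaffected: the SCHED_OP() wrapper in schedule.c only
invokes a hook the scheduler actually defines, roughly

    #define SCHED_OP(fn, ...)                                 \
        (( ops.fn != NULL ) ? ops.fn( __VA_ARGS__ )           \
          : (typeof(ops.fn(__VA_ARGS__)))0 )

so credit1 and sedf, which leave .context_saved NULL, never see the new
callback.)
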
Peace,
-George
On Tue, Dec 8, 2009 at 6:20 PM, Keir Fraser <keir.fraser@eu.citrix.com> wrote:
> On 08/12/2009 14:48, "George Dunlap" <George.Dunlap@eu.citrix.com> wrote:
>
>> My main concern is that sharing the runqueue between cores requires
>> some changes to the core context switch code. The kinks aren't 100%
>> worked out yet, so there's a risk that there will be an impact on the
>> correctness of the credit1 scheduler.
>
> Ah, if that's the problem with selecting a vcpu which happens to still be
> 'is_running' then I had some ideas how you could deal with that within the
> credit2 scheduler. If you see such a vcpu when searching the runqueue,
> ignore it, but set VPF_migrating. You'll then get a 'pick_cpu' callback when
> descheduling of the vcpu is completed. That should play nice with the lazy
> context switch logic while keeping things work conserving.
>
> -- Keir
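
For comparison, the VPF_migrating approach you describe would look roughly
like the hypothetical sketch below from inside do_schedule, with iter, runq
and snext declared as in csched_schedule() (this is not code from any of
the attached patches):

    list_for_each( iter, runq )
    {
        struct csched_vcpu *svc = __runq_elem(iter);

        if ( svc->vcpu->is_running )
        {
            /* Still context-switching on another pcpu: skip it, but ask
             * for a pick_cpu callback once the deschedule completes. */
            set_bit(_VPF_migrating, &svc->vcpu->pause_flags);
            continue;
        }

        snext = svc;
        break;
    }

The context_saved callback in the first patch makes that extra
migrate/pick_cpu round trip unnecessary.
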
[-- Attachment #2: context_switch-scheduler-callback.diff --]
[-- Type: text/x-patch, Size: 1198 bytes --]
Add context_saved scheduler callback.
Because credit2 shares a runqueue between several cpus, it needs
to know when a scheduled-out vcpu has finally been context-switched
away so that it can be added to the runqueue again. (Otherwise it may
be grabbed by another processor before the context has been properly
saved.)
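
For illustration, here is roughly how the credit2 scheduler (in the later
attachment) uses this hook; a simplified sketch of csched_context_saved(),
with the runqueue locking omitted:

    static void csched_context_saved(struct vcpu *vc)
    {
        struct csched_vcpu * const svc = CSCHED_VCPU(vc);

        /* Only now is it safe for another pcpu to pick this vcpu up. */
        clear_bit(__CSFLAG_scheduled, &svc->flags);

        /* A wake that arrived while the vcpu was still scheduled only set
         * a flag; finish the runqueue insertion here. */
        if ( test_bit(__CSFLAG_delayed_runq_add, &svc->flags) )
        {
            clear_bit(__CSFLAG_delayed_runq_add, &svc->flags);
            runq_insert(vc->processor, svc);
            runq_tickle(vc->processor, svc, NOW());
        }
    }
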
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
diff -r c44b7b9b6306 xen/common/schedule.c
--- a/xen/common/schedule.c Wed Jan 13 13:33:57 2010 +0000
+++ b/xen/common/schedule.c Wed Jan 13 13:36:37 2010 +0000
@@ -877,6 +877,8 @@
/* Check for migration request /after/ clearing running flag. */
smp_mb();
+ SCHED_OP(context_saved, prev);
+
if ( unlikely(test_bit(_VPF_migrating, &prev->pause_flags)) )
vcpu_migrate(prev);
}
diff -r c44b7b9b6306 xen/include/xen/sched-if.h
--- a/xen/include/xen/sched-if.h Wed Jan 13 13:33:57 2010 +0000
+++ b/xen/include/xen/sched-if.h Wed Jan 13 13:36:37 2010 +0000
@@ -69,6 +69,7 @@
void (*sleep) (struct vcpu *);
void (*wake) (struct vcpu *);
+ void (*context_saved) (struct vcpu *);
struct task_slice (*do_schedule) (s_time_t);
[-- Attachment #3: context_switch-vcpu-processor-sync.diff --]
[-- Type: text/x-patch, Size: 1615 bytes --]
Safely change next->processor if necessary.
Credit2's shared runqueue means that a vcpu may switch from one
pcpu to another without an explicit migration. We need to
change v->processor to match. However, this must be done while holding
the schedule lock that v->processor currently designates. To avoid
deadlock, we do this only after releasing this processor's own schedule lock.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
diff -r 669448fb9d0c xen/common/schedule.c
--- a/xen/common/schedule.c Wed Jan 13 13:47:16 2010 +0000
+++ b/xen/common/schedule.c Wed Jan 13 14:15:51 2010 +0000
@@ -305,6 +305,23 @@
vcpu_wake(v);
}
+/* Safely change v->processor when running on a different cpu sharing the same runqueue */
+static void __vcpu_processor_sync(struct vcpu *next)
+{
+ unsigned long flags;
+ int old_cpu;
+ int this_cpu = smp_processor_id();
+
+ vcpu_schedule_lock_irqsave(next, flags);
+
+ /* Switch to new CPU, then unlock old CPU. */
+ old_cpu = next->processor;
+ next->processor = this_cpu;
+
+ spin_unlock_irqrestore(
+ &per_cpu(schedule_data, old_cpu).schedule_lock, flags);
+}
+
/*
* Force a VCPU through a deschedule/reschedule path.
* For example, using this when setting the periodic timer period means that
@@ -852,6 +869,11 @@
spin_unlock_irq(&sd->schedule_lock);
+ /* Safely change v->processor if necessary. Do this after
+ * releasing this cpu's lock to avoid deadlock. */
+ if ( next->processor != smp_processor_id() )
+ __vcpu_processor_sync(next);
+
perfc_incr(sched_ctx);
stop_timer(&prev->periodic_timer);
[-- Attachment #4: credit2-hypervisor.diff --]
[-- Type: text/x-patch, Size: 29453 bytes --]
diff -r 7bd1dd9fb30f xen/common/Makefile
--- a/xen/common/Makefile Wed Jan 13 14:15:51 2010 +0000
+++ b/xen/common/Makefile Wed Jan 13 14:36:58 2010 +0000
@@ -13,6 +13,7 @@
obj-y += page_alloc.o
obj-y += rangeset.o
obj-y += sched_credit.o
+obj-y += sched_credit2.o
obj-y += sched_sedf.o
obj-y += schedule.o
obj-y += shutdown.o
diff -r 7bd1dd9fb30f xen/common/sched_credit2.c
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/xen/common/sched_credit2.c Wed Jan 13 14:36:58 2010 +0000
@@ -0,0 +1,1037 @@
+
+/****************************************************************************
+ * (C) 2009 - George Dunlap - Citrix Systems R&D UK, Ltd
+ ****************************************************************************
+ *
+ * File: common/sched_credit2.c
+ * Author: George Dunlap
+ *
+ * Description: Credit-based SMP CPU scheduler
+ * Based on an earlier version by Emmanuel Ackaouy.
+ */
+
+#include <xen/config.h>
+#include <xen/init.h>
+#include <xen/lib.h>
+#include <xen/sched.h>
+#include <xen/domain.h>
+#include <xen/delay.h>
+#include <xen/event.h>
+#include <xen/time.h>
+#include <xen/perfc.h>
+#include <xen/sched-if.h>
+#include <xen/softirq.h>
+#include <asm/atomic.h>
+#include <xen/errno.h>
+#include <xen/trace.h>
+
+#if __i386__
+#define PRI_stime "lld"
+#else
+#define PRI_stime "ld"
+#endif
+
+#define d2printk(x...)
+//#define d2printk printk
+
+#define TRC_CSCHED2_TICK TRC_SCHED_CLASS + 1
+#define TRC_CSCHED2_RUNQ_POS TRC_SCHED_CLASS + 2
+#define TRC_CSCHED2_CREDIT_BURN TRC_SCHED_CLASS + 3
+#define TRC_CSCHED2_CREDIT_ADD TRC_SCHED_CLASS + 4
+#define TRC_CSCHED2_TICKLE_CHECK TRC_SCHED_CLASS + 5
+
+/*
+ * WARNING: This is still in an experimental phase. Status and work can be found at the
+ * credit2 wiki page:
+ * http://wiki.xensource.com/xenwiki/Credit2_Scheduler_Development
+ */
+
+/*
+ * Design:
+ *
+ * VMs "burn" credits based on their weight; higher weight means credits burn
+ * more slowly.
+ *
+ * vcpus are inserted into the runqueue by credit order.
+ *
+ * Credits are "reset" when the credit of the next vcpu in the runqueue drops to zero or below. At that
+ * point, everyone's credit is "clipped" to a small carry-over value, and a fixed amount is added to everyone.
+ *
+ * The plan is for all cores that share an L2 to share the same runqueue. At the moment, there is
+ * one global runqueue for all cores.
+ */
+
+/*
+ * Basic constants
+ */
+#define CSCHED_DEFAULT_WEIGHT 256
+#define CSCHED_MIN_TIMER MICROSECS(500)
+#define CSCHED_CARRYOVER_MAX CSCHED_MIN_TIMER
+#define CSCHED_CREDIT_RESET 0
+#define CSCHED_CREDIT_INIT MILLISECS(10)
+#define CSCHED_MAX_TIMER MILLISECS(2)
+
+#define CSCHED_IDLE_CREDIT (-(1<<30))
+
+/*
+ * Flags
+ */
+/* CSFLAG_scheduled: Is this vcpu either running on, or context-switching off,
+ * a physical cpu?
+ * + Accessed only with runqueue lock held
+ * + Set when chosen as next in csched_schedule().
+ * + Cleared in csched_context_saved(), after the context switch has been saved
+ * + Checked in vcpu_wake to see if we can add to the runqueue, or if we should
+ * set CSFLAG_delayed_runq_add
+ * + Checked to be false in runq_insert.
+ */
+#define __CSFLAG_scheduled 1
+#define CSFLAG_scheduled (1<<__CSFLAG_scheduled)
+/* CSFLAG_delayed_runq_add: Do we need to add this to the runqueue once it's done
+ * being context switched out?
+ * + Set when scheduling out in csched_schedule() if prev is runnable
+ * + Set in csched_vcpu_wake if it finds CSFLAG_scheduled set
+ * + Read in csched_context_saved(). If set, it adds prev to the runqueue and
+ * clears the bit.
+ */
+#define __CSFLAG_delayed_runq_add 2
+#define CSFLAG_delayed_runq_add (1<<__CSFLAG_delayed_runq_add)
+
+
+/*
+ * Useful macros
+ */
+#define CSCHED_PCPU(_c) \
+ ((struct csched_pcpu *)per_cpu(schedule_data, _c).sched_priv)
+#define CSCHED_VCPU(_vcpu) ((struct csched_vcpu *) (_vcpu)->sched_priv)
+#define CSCHED_DOM(_dom) ((struct csched_dom *) (_dom)->sched_priv)
+//#define RUNQ(_cpu) (&(CSCHED_GROUP(_cpu)->runq))
+#define RUNQ(_cpu) (&csched_priv.runq)
+
+/*
+ * System-wide private data
+ */
+struct csched_private {
+ spinlock_t lock;
+ uint32_t ncpus;
+ struct domain *idle_domain;
+
+ /* Per-runqueue info */
+ struct list_head runq; /* Global runqueue */
+ int max_weight;
+ struct list_head sdom;
+ struct list_head svc; /* List of all vcpus */
+};
+
+struct csched_pcpu {
+ int _dummy;
+};
+
+/*
+ * Virtual CPU
+ */
+struct csched_vcpu {
+ struct list_head global_elem; /* On the global vcpu list */
+ struct list_head sdom_elem; /* On the domain vcpu list */
+ struct list_head runq_elem; /* On the runqueue */
+
+ /* Up-pointers */
+ struct csched_dom *sdom;
+ struct vcpu *vcpu;
+
+ int weight;
+
+ int credit;
+ s_time_t start_time; /* When we were scheduled (used for credit) */
+ unsigned flags; /* 16 bits doesn't seem to play well with clear_bit() */
+
+};
+
+/*
+ * Domain
+ */
+struct csched_dom {
+ struct list_head vcpu;
+ struct list_head sdom_elem;
+ struct domain *dom;
+ uint16_t weight;
+ uint16_t nr_vcpus;
+};
+
+
+/*
+ * Global variables
+ */
+static struct csched_private csched_priv;
+
+/*
+ * Time-to-credit, credit-to-time.
+ * FIXME: Do pre-calculated division?
+ */
+static s_time_t t2c(s_time_t time, struct csched_vcpu *svc)
+{
+ return time * csched_priv.max_weight / svc->weight;
+}
+
+static s_time_t c2t(s_time_t credit, struct csched_vcpu *svc)
+{
+ return credit * svc->weight / csched_priv.max_weight;
+}
+
+/*
+ * Runqueue related code
+ */
+
+static /*inline*/ int
+__vcpu_on_runq(struct csched_vcpu *svc)
+{
+ return !list_empty(&svc->runq_elem);
+}
+
+static /*inline*/ struct csched_vcpu *
+__runq_elem(struct list_head *elem)
+{
+ return list_entry(elem, struct csched_vcpu, runq_elem);
+}
+
+static int
+__runq_insert(struct list_head *runq, struct csched_vcpu *svc)
+{
+ struct list_head *iter;
+ int pos = 0;
+
+ d2printk("rqi d%dv%d\n",
+ svc->vcpu->domain->domain_id,
+ svc->vcpu->vcpu_id);
+
+ /* Idle vcpus not allowed on the runqueue anymore */
+ BUG_ON(is_idle_vcpu(svc->vcpu));
+ BUG_ON(svc->vcpu->is_running);
+ BUG_ON(test_bit(__CSFLAG_scheduled, &svc->flags));
+
+ list_for_each( iter, runq )
+ {
+ struct csched_vcpu * iter_svc = __runq_elem(iter);
+
+ if ( svc->credit > iter_svc->credit )
+ {
+ d2printk(" p%d d%dv%d\n",
+ pos,
+ iter_svc->vcpu->domain->domain_id,
+ iter_svc->vcpu->vcpu_id);
+ break;
+ }
+ pos++;
+ }
+
+ list_add_tail(&svc->runq_elem, iter);
+
+ return pos;
+}
+
+static void
+runq_insert(unsigned int cpu, struct csched_vcpu *svc)
+{
+ struct list_head * runq = RUNQ(cpu);
+ int pos = 0;
+
+ /* FIXME: Runqueue per L2 */
+ ASSERT( spin_is_locked(&csched_priv.lock) );
+
+ BUG_ON( __vcpu_on_runq(svc) );
+ /* FIXME: Check runqueue handles this cpu */
+ //BUG_ON( cpu != svc->vcpu->processor );
+
+ pos = __runq_insert(runq, svc);
+
+ {
+ struct {
+ unsigned dom:16,vcpu:16;
+ unsigned pos;
+ } d;
+ d.dom = svc->vcpu->domain->domain_id;
+ d.vcpu = svc->vcpu->vcpu_id;
+ d.pos = pos;
+ trace_var(TRC_CSCHED2_RUNQ_POS, 1,
+ sizeof(d),
+ (unsigned char *)&d);
+ }
+
+ return;
+}
+
+static inline void
+__runq_remove(struct csched_vcpu *svc)
+{
+ BUG_ON( !__vcpu_on_runq(svc) );
+ list_del_init(&svc->runq_elem);
+}
+
+void burn_credits(struct csched_vcpu *, s_time_t);
+
+/* Check to see if the item on the runqueue is higher priority than what's
+ * currently running; if so, wake up the processor */
+static /*inline*/ void
+runq_tickle(unsigned int cpu, struct csched_vcpu *new, s_time_t now)
+{
+ int i, ipid=-1;
+ s_time_t lowest=(1<<30);
+
+ d2printk("rqt d%dv%d cd%dv%d\n",
+ new->vcpu->domain->domain_id,
+ new->vcpu->vcpu_id,
+ current->domain->domain_id,
+ current->vcpu_id);
+
+ /* Find the cpu in this queue group that has the lowest credits */
+ /* FIXME: separate runqueues */
+ for_each_online_cpu ( i )
+ {
+ struct csched_vcpu * const cur =
+ CSCHED_VCPU(per_cpu(schedule_data, i).curr);
+
+ /* FIXME: keep track of idlers, choose from the mask */
+ if ( is_idle_vcpu(cur->vcpu) )
+ {
+ ipid = i;
+ lowest = CSCHED_IDLE_CREDIT;
+ break;
+ }
+ else
+ {
+ /* Update credits for current to see if we want to preempt */
+ burn_credits(cur, now);
+
+ if ( cur->credit < lowest )
+ {
+ ipid = i;
+ lowest = cur->credit;
+ }
+
+ /* TRACE */ {
+ struct {
+ unsigned dom:16,vcpu:16;
+ unsigned credit;
+ } d;
+ d.dom = cur->vcpu->domain->domain_id;
+ d.vcpu = cur->vcpu->vcpu_id;
+ d.credit = cur->credit;
+ trace_var(TRC_CSCHED2_TICKLE_CHECK, 1,
+ sizeof(d),
+ (unsigned char *)&d);
+ }
+ }
+ }
+
+ if ( ipid != -1 )
+ {
+ int cdiff = lowest - new->credit;
+
+ if ( lowest == CSCHED_IDLE_CREDIT || cdiff < 0 ) {
+ d2printk("si %d\n", ipid);
+ cpu_raise_softirq(ipid, SCHEDULE_SOFTIRQ);
+ }
+ else
+ /* FIXME: Wake up later? */;
+ }
+}
+
+/*
+ * Credit-related code
+ */
+static void reset_credit(int cpu, s_time_t now)
+{
+ struct list_head *iter;
+
+ list_for_each( iter, &csched_priv.svc )
+ {
+ struct csched_vcpu * svc = list_entry(iter, struct csched_vcpu, global_elem);
+ s_time_t cmax;
+
+ BUG_ON( is_idle_vcpu(svc->vcpu) );
+
+ /* Maximum amount of credit that can be carried over */
+ cmax = CSCHED_CARRYOVER_MAX;
+
+ if ( svc->credit > cmax )
+ svc->credit = cmax;
+ svc->credit += CSCHED_CREDIT_INIT; /* Find a better name */
+ svc->start_time = now;
+
+ /* Trace credit */
+ }
+
+ /* No need to re-sort the runqueue, as everyone's relative order stays the same. */
+}
+
+void burn_credits(struct csched_vcpu *svc, s_time_t now)
+{
+ s_time_t delta;
+
+ /* Assert svc is current */
+ ASSERT(svc==CSCHED_VCPU(per_cpu(schedule_data, svc->vcpu->processor).curr));
+
+ if ( is_idle_vcpu(svc->vcpu) )
+ {
+ BUG_ON(svc->credit != CSCHED_IDLE_CREDIT);
+ return;
+ }
+
+ delta = now - svc->start_time;
+
+ if ( delta > 0 ) {
+ /* This will round down; should we consider rounding up...? */
+ svc->credit -= t2c(delta, svc);
+ svc->start_time = now;
+
+ d2printk("b d%dv%d c%d\n",
+ svc->vcpu->domain->domain_id,
+ svc->vcpu->vcpu_id,
+ svc->credit);
+ } else {
+ d2printk("%s: Time went backwards? now %"PRI_stime" start %"PRI_stime"\n",
+ __func__, now, svc->start_time);
+ }
+
+ /* TRACE */
+ {
+ struct {
+ unsigned dom:16,vcpu:16;
+ unsigned credit;
+ int delta;
+ } d;
+ d.dom = svc->vcpu->domain->domain_id;
+ d.vcpu = svc->vcpu->vcpu_id;
+ d.credit = svc->credit;
+ d.delta = delta;
+ trace_var(TRC_CSCHED2_CREDIT_BURN, 1,
+ sizeof(d),
+ (unsigned char *)&d);
+ }
+}
+
+/* Track the maximum weight across all domains; called whenever a domain's weight changes. */
+void update_max_weight(int new_weight, int old_weight)
+{
+ if ( new_weight > csched_priv.max_weight )
+ {
+ csched_priv.max_weight = new_weight;
+ printk("%s: Max weight %d\n", __func__, csched_priv.max_weight);
+ }
+ else if ( old_weight == csched_priv.max_weight )
+ {
+ struct list_head *iter;
+ int max_weight = 1;
+
+ list_for_each( iter, &csched_priv.sdom )
+ {
+ struct csched_dom * sdom = list_entry(iter, struct csched_dom, sdom_elem);
+
+ if ( sdom->weight > max_weight )
+ max_weight = sdom->weight;
+ }
+
+ csched_priv.max_weight = max_weight;
+ printk("%s: Max weight %d\n", __func__, csched_priv.max_weight);
+ }
+}
+
+/*
+ * Initialization code
+ */
+static int
+csched_pcpu_init(int cpu)
+{
+ unsigned long flags;
+ struct csched_pcpu *spc;
+
+ /* Allocate per-PCPU info */
+ spc = xmalloc(struct csched_pcpu);
+ if ( spc == NULL )
+ return -1;
+
+ spin_lock_irqsave(&csched_priv.lock, flags);
+
+ /* Initialize/update system-wide config */
+ per_cpu(schedule_data, cpu).sched_priv = spc;
+
+ csched_priv.ncpus++;
+
+ /* Start off idling... */
+ BUG_ON(!is_idle_vcpu(per_cpu(schedule_data, cpu).curr));
+
+ spin_unlock_irqrestore(&csched_priv.lock, flags);
+
+ return 0;
+}
+
+#ifndef NDEBUG
+static /*inline*/ void
+__csched_vcpu_check(struct vcpu *vc)
+{
+ struct csched_vcpu * const svc = CSCHED_VCPU(vc);
+ struct csched_dom * const sdom = svc->sdom;
+
+ BUG_ON( svc->vcpu != vc );
+ BUG_ON( sdom != CSCHED_DOM(vc->domain) );
+ if ( sdom )
+ {
+ BUG_ON( is_idle_vcpu(vc) );
+ BUG_ON( sdom->dom != vc->domain );
+ }
+ else
+ {
+ BUG_ON( !is_idle_vcpu(vc) );
+ }
+}
+#define CSCHED_VCPU_CHECK(_vc) (__csched_vcpu_check(_vc))
+#else
+#define CSCHED_VCPU_CHECK(_vc)
+#endif
+
+static int
+csched_vcpu_init(struct vcpu *vc)
+{
+ struct domain * const dom = vc->domain;
+ struct csched_dom *sdom = CSCHED_DOM(dom);
+ struct csched_vcpu *svc;
+
+ printk("%s: Initializing d%dv%d\n",
+ __func__, dom->domain_id, vc->vcpu_id);
+
+ /* Allocate per-VCPU info */
+ svc = xmalloc(struct csched_vcpu);
+ if ( svc == NULL )
+ return -1;
+
+ INIT_LIST_HEAD(&svc->global_elem);
+ INIT_LIST_HEAD(&svc->sdom_elem);
+ INIT_LIST_HEAD(&svc->runq_elem);
+
+ svc->sdom = sdom;
+ svc->vcpu = vc;
+ svc->flags = 0U;
+ vc->sched_priv = svc;
+
+ if ( ! is_idle_vcpu(vc) )
+ {
+ BUG_ON( sdom == NULL );
+
+ svc->credit = CSCHED_CREDIT_INIT;
+ svc->weight = sdom->weight;
+
+ list_add_tail(&svc->sdom_elem, &sdom->vcpu);
+ list_add_tail(&svc->global_elem, &csched_priv.svc);
+ sdom->nr_vcpus++;
+ }
+ else
+ {
+ BUG_ON( sdom != NULL );
+ svc->credit = CSCHED_IDLE_CREDIT;
+ svc->weight = 0;
+ if ( csched_priv.idle_domain == NULL )
+ csched_priv.idle_domain = dom;
+ }
+
+ /* Allocate per-PCPU info */
+ if ( unlikely(!CSCHED_PCPU(vc->processor)) )
+ {
+ if ( csched_pcpu_init(vc->processor) != 0 )
+ return -1;
+ }
+
+ CSCHED_VCPU_CHECK(vc);
+ return 0;
+}
+
+static void
+csched_vcpu_destroy(struct vcpu *vc)
+{
+ struct csched_vcpu * const svc = CSCHED_VCPU(vc);
+ struct csched_dom * const sdom = svc->sdom;
+ unsigned long flags;
+
+ BUG_ON( sdom == NULL );
+ BUG_ON( !list_empty(&svc->runq_elem) );
+
+ spin_lock_irqsave(&csched_priv.lock, flags);
+
+ /* Remove from sdom list */
+ list_del_init(&svc->global_elem);
+ list_del_init(&svc->sdom_elem);
+
+ sdom->nr_vcpus--;
+
+ spin_unlock_irqrestore(&csched_priv.lock, flags);
+
+ xfree(svc);
+}
+
+static void
+csched_vcpu_sleep(struct vcpu *vc)
+{
+ struct csched_vcpu * const svc = CSCHED_VCPU(vc);
+
+ BUG_ON( is_idle_vcpu(vc) );
+
+ if ( per_cpu(schedule_data, vc->processor).curr == vc )
+ cpu_raise_softirq(vc->processor, SCHEDULE_SOFTIRQ);
+ else if ( __vcpu_on_runq(svc) )
+ __runq_remove(svc);
+}
+
+static void
+csched_vcpu_wake(struct vcpu *vc)
+{
+ struct csched_vcpu * const svc = CSCHED_VCPU(vc);
+ const unsigned int cpu = vc->processor;
+ s_time_t now = 0;
+ unsigned long flags;
+
+ d2printk("w d%dv%d\n", vc->domain->domain_id, vc->vcpu_id);
+
+ BUG_ON( is_idle_vcpu(vc) );
+
+ /* FIXME: Runqueue per L2 */
+ spin_lock_irqsave(&csched_priv.lock, flags);
+
+
+ /* Make sure svc priority mod happens before runq check */
+ if ( unlikely(per_cpu(schedule_data, cpu).curr == vc) )
+ {
+ goto out;
+ }
+
+ if ( unlikely(__vcpu_on_runq(svc)) )
+ {
+ /* If we've boosted someone that's already on a runqueue, prioritize
+ * it and inform the cpu in question. */
+ goto out;
+ }
+
+ /* If the context hasn't been saved for this vcpu yet, we can't put it on
+ * another runqueue. Instead, we set a flag so that it will be put on the runqueue
+ * after the context has been saved. */
+ if ( unlikely (test_bit(__CSFLAG_scheduled, &svc->flags) ) )
+ {
+ set_bit(__CSFLAG_delayed_runq_add, &svc->flags);
+ goto out;
+ }
+
+ now = NOW();
+
+ /* Put the VCPU on the runq */
+ runq_insert(cpu, svc);
+ runq_tickle(cpu, svc, now);
+
+out:
+ spin_unlock_irqrestore(&csched_priv.lock, flags);
+ d2printk("w-\n");
+ return;
+}
+
+static void
+csched_context_saved(struct vcpu *vc)
+{
+ unsigned long flags;
+ struct csched_vcpu * const svc = CSCHED_VCPU(vc);
+
+ spin_lock_irqsave(&csched_priv.lock, flags);
+
+ /* This vcpu is now eligible to be put on the runqueue again */
+ clear_bit(__CSFLAG_scheduled, &svc->flags);
+
+ /* If someone wants it there, put it there */
+ if ( test_bit(__CSFLAG_delayed_runq_add, &svc->flags) )
+ {
+ const unsigned int cpu = vc->processor;
+
+ clear_bit(__CSFLAG_delayed_runq_add, &svc->flags);
+
+ BUG_ON(__vcpu_on_runq(svc));
+
+ runq_insert(cpu, svc);
+ runq_tickle(cpu, svc, NOW());
+ }
+
+ spin_unlock_irqrestore(&csched_priv.lock, flags);
+}
+
+static int
+csched_cpu_pick(struct vcpu *vc)
+{
+ /* FIXME: Choose a schedule group based on load */
+ return 0;
+}
+
+static int
+csched_dom_cntl(
+ struct domain *d,
+ struct xen_domctl_scheduler_op *op)
+{
+ struct csched_dom * const sdom = CSCHED_DOM(d);
+ unsigned long flags;
+
+ if ( op->cmd == XEN_DOMCTL_SCHEDOP_getinfo )
+ {
+ op->u.credit2.weight = sdom->weight;
+ }
+ else
+ {
+ ASSERT(op->cmd == XEN_DOMCTL_SCHEDOP_putinfo);
+
+ if ( op->u.credit2.weight != 0 )
+ {
+ struct list_head *iter;
+ int old_weight;
+
+ spin_lock_irqsave(&csched_priv.lock, flags);
+
+ old_weight = sdom->weight;
+
+ sdom->weight = op->u.credit2.weight;
+
+ /* Update max weight */
+ update_max_weight(sdom->weight, old_weight);
+
+ /* Update weights for vcpus */
+ list_for_each ( iter, &sdom->vcpu )
+ {
+ struct csched_vcpu *svc = list_entry(iter, struct csched_vcpu, sdom_elem);
+
+ svc->weight = sdom->weight;
+ }
+
+ spin_unlock_irqrestore(&csched_priv.lock, flags);
+ }
+ }
+
+ return 0;
+}
+
+static int
+csched_dom_init(struct domain *dom)
+{
+ struct csched_dom *sdom;
+ unsigned long flags;
+
+ printk("%s: Initializing domain %d\n", __func__, dom->domain_id);
+
+ if ( is_idle_domain(dom) )
+ return 0;
+
+ sdom = xmalloc(struct csched_dom);
+ if ( sdom == NULL )
+ return -ENOMEM;
+
+ /* Initialize credit and weight */
+ INIT_LIST_HEAD(&sdom->vcpu);
+ INIT_LIST_HEAD(&sdom->sdom_elem);
+ sdom->dom = dom;
+ sdom->weight = CSCHED_DEFAULT_WEIGHT;
+ sdom->nr_vcpus = 0;
+
+ dom->sched_priv = sdom;
+
+ spin_lock_irqsave(&csched_priv.lock, flags);
+
+ update_max_weight(sdom->weight, 0);
+ list_add_tail(&sdom->sdom_elem, &csched_priv.sdom);
+
+ spin_unlock_irqrestore(&csched_priv.lock, flags);
+
+ return 0;
+}
+
+static void
+csched_dom_destroy(struct domain *dom)
+{
+ struct csched_dom *sdom = CSCHED_DOM(dom);
+ unsigned long flags;
+
+ BUG_ON(!list_empty(&sdom->vcpu));
+
+ spin_lock_irqsave(&csched_priv.lock, flags);
+
+ list_del_init(&sdom->sdom_elem);
+
+ update_max_weight(0, sdom->weight);
+
+ spin_unlock_irqrestore(&csched_priv.lock, flags);
+
+ xfree(CSCHED_DOM(dom));
+}
+
+#if 0
+static void csched_load_balance(int cpu)
+{
+ /* FIXME: Do something. */
+}
+#endif
+
+/* How long should we let this vcpu run for? */
+static s_time_t
+csched_runtime(int cpu, struct csched_vcpu *snext)
+{
+ s_time_t time = CSCHED_MAX_TIMER;
+ struct list_head *runq = RUNQ(cpu);
+
+ if ( is_idle_vcpu(snext->vcpu) )
+ return CSCHED_MAX_TIMER;
+
+ /* Basic time */
+ time = c2t(snext->credit, snext);
+
+ /* Next guy on runqueue */
+ if ( ! list_empty(runq) )
+ {
+ struct csched_vcpu *svc = __runq_elem(runq->next);
+ s_time_t ntime;
+
+ if ( ! is_idle_vcpu(svc->vcpu) )
+ {
+ ntime = c2t(snext->credit - svc->credit, snext);
+
+ if ( time > ntime )
+ time = ntime;
+ }
+ }
+
+ /* Check limits */
+ if ( time < CSCHED_MIN_TIMER )
+ time = CSCHED_MIN_TIMER;
+ else if ( time > CSCHED_MAX_TIMER )
+ time = CSCHED_MAX_TIMER;
+
+ return time;
+}
+
+void __dump_execstate(void *unused);
+
+/*
+ * This function is in the critical path. It is designed to be simple and
+ * fast for the common case.
+ */
+static struct task_slice
+csched_schedule(s_time_t now)
+{
+ const int cpu = smp_processor_id();
+ struct list_head * const runq = RUNQ(cpu);
+ //struct csched_pcpu *spc = CSCHED_PCPU(cpu);
+ struct csched_vcpu * const scurr = CSCHED_VCPU(current);
+ struct csched_vcpu *snext = NULL;
+ struct task_slice ret;
+ unsigned long flags;
+
+ CSCHED_VCPU_CHECK(current);
+
+ d2printk("sc p%d c d%dv%d now %"PRI_stime"\n",
+ cpu,
+ scurr->vcpu->domain->domain_id,
+ scurr->vcpu->vcpu_id,
+ now);
+
+
+ /* FIXME: Runqueue per L2 */
+ spin_lock_irqsave(&csched_priv.lock, flags);
+
+ /* Update credits */
+ burn_credits(scurr, now);
+
+ /*
+ * Select next runnable local VCPU (ie top of local runq).
+ *
+ * If the current vcpu is runnable, and has higher credit than
+ * the next guy on the queue (or there is no one else), we want to run him again.
+ *
+ * If the current vcpu is runnable, and the next guy on the queue
+ * has higher credit, we want to mark current for delayed runqueue
+ * add, and remove the next guy from the queue.
+ *
+ * If the current vcpu is not runnable, we want to choose the idle
+ * vcpu for this processor.
+ */
+ if ( list_empty(runq) )
+ snext = CSCHED_VCPU(csched_priv.idle_domain->vcpu[cpu]);
+ else
+ snext = __runq_elem(runq->next);
+
+ if ( !is_idle_vcpu(current) && vcpu_runnable(current) )
+ {
+ /* If the current vcpu is runnable, and has higher credit
+ * than the next on the runqueue, run him again.
+ * Otherwise, set him for delayed runq add. */
+ if ( scurr->credit > snext->credit)
+ snext = scurr;
+ else
+ set_bit(__CSFLAG_delayed_runq_add, &scurr->flags);
+ }
+
+ if ( snext != scurr && !is_idle_vcpu(snext->vcpu) )
+ {
+ __runq_remove(snext);
+ if ( snext->vcpu->is_running )
+ {
+ printk("p%d: snext d%dv%d running on p%d! scurr d%dv%d\n",
+ cpu,
+ snext->vcpu->domain->domain_id, snext->vcpu->vcpu_id,
+ snext->vcpu->processor,
+ scurr->vcpu->domain->domain_id,
+ scurr->vcpu->vcpu_id);
+ BUG();
+ }
+ set_bit(__CSFLAG_scheduled, &snext->flags);
+ }
+
+ if ( !is_idle_vcpu(snext->vcpu) && snext->credit <= CSCHED_CREDIT_RESET )
+ reset_credit(cpu, now);
+
+ spin_unlock_irqrestore(&csched_priv.lock, flags);
+
+#if 0
+ /*
+ * Update idlers mask if necessary. When we're idling, other CPUs
+ * will tickle us when they get extra work.
+ */
+ if ( is_idle_vcpu(snext->vcpu) )
+ {
+ if ( !cpu_isset(cpu, csched_priv.idlers) )
+ cpu_set(cpu, csched_priv.idlers);
+ }
+ else if ( cpu_isset(cpu, csched_priv.idlers) )
+ {
+ cpu_clear(cpu, csched_priv.idlers);
+ }
+#endif
+
+ if ( !is_idle_vcpu(snext->vcpu) )
+ snext->start_time = now;
+ /*
+ * Return task to run next...
+ */
+ ret.time = csched_runtime(cpu, snext);
+ ret.task = snext->vcpu;
+
+ CSCHED_VCPU_CHECK(ret.task);
+ return ret;
+}
+
+static void
+csched_dump_vcpu(struct csched_vcpu *svc)
+{
+ printk("[%i.%i] flags=%x cpu=%i",
+ svc->vcpu->domain->domain_id,
+ svc->vcpu->vcpu_id,
+ svc->flags,
+ svc->vcpu->processor);
+
+ printk(" credit=%" PRIi32" [w=%u]", svc->credit, svc->weight);
+
+ printk("\n");
+}
+
+static void
+csched_dump_pcpu(int cpu)
+{
+ struct list_head *runq, *iter;
+ //struct csched_pcpu *spc;
+ struct csched_vcpu *svc;
+ int loop;
+ char cpustr[100];
+
+ //spc = CSCHED_PCPU(cpu);
+ runq = RUNQ(cpu);
+
+ cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_sibling_map,cpu));
+ printk(" sibling=%s, ", cpustr);
+ cpumask_scnprintf(cpustr, sizeof(cpustr), per_cpu(cpu_core_map,cpu));
+ printk("core=%s\n", cpustr);
+
+ /* current VCPU */
+ svc = CSCHED_VCPU(per_cpu(schedule_data, cpu).curr);
+ if ( svc )
+ {
+ printk("\trun: ");
+ csched_dump_vcpu(svc);
+ }
+
+ loop = 0;
+ list_for_each( iter, runq )
+ {
+ svc = __runq_elem(iter);
+ if ( svc )
+ {
+ printk("\t%3d: ", ++loop);
+ csched_dump_vcpu(svc);
+ }
+ }
+}
+
+static void
+csched_dump(void)
+{
+ struct list_head *iter_sdom, *iter_svc;
+ int loop;
+
+ printk("info:\n"
+ "\tncpus = %u\n"
+ "\tdefault-weight = %d\n",
+ csched_priv.ncpus,
+ CSCHED_DEFAULT_WEIGHT);
+
+ printk("active vcpus:\n");
+ loop = 0;
+ list_for_each( iter_sdom, &csched_priv.sdom )
+ {
+ struct csched_dom *sdom;
+ sdom = list_entry(iter_sdom, struct csched_dom, sdom_elem);
+
+ list_for_each( iter_svc, &sdom->vcpu )
+ {
+ struct csched_vcpu *svc;
+ svc = list_entry(iter_svc, struct csched_vcpu, sdom_elem);
+
+ printk("\t%3d: ", ++loop);
+ csched_dump_vcpu(svc);
+ }
+ }
+}
+
+static void
+csched_init(void)
+{
+ spin_lock_init(&csched_priv.lock);
+ INIT_LIST_HEAD(&csched_priv.sdom);
+ INIT_LIST_HEAD(&csched_priv.svc);
+
+ csched_priv.ncpus = 0;
+
+ /* FIXME: Runqueue per l2 */
+ csched_priv.max_weight = 1;
+ INIT_LIST_HEAD(&csched_priv.runq);
+}
+
+struct scheduler sched_credit2_def = {
+ .name = "SMP Credit Scheduler rev2",
+ .opt_name = "credit2",
+ .sched_id = XEN_SCHEDULER_CREDIT2,
+
+ .init_domain = csched_dom_init,
+ .destroy_domain = csched_dom_destroy,
+
+ .init_vcpu = csched_vcpu_init,
+ .destroy_vcpu = csched_vcpu_destroy,
+
+ .sleep = csched_vcpu_sleep,
+ .wake = csched_vcpu_wake,
+
+ .adjust = csched_dom_cntl,
+
+ .pick_cpu = csched_cpu_pick,
+ .do_schedule = csched_schedule,
+ .context_saved = csched_context_saved,
+
+ .dump_cpu_state = csched_dump_pcpu,
+ .dump_settings = csched_dump,
+ .init = csched_init,
+};
diff -r 7bd1dd9fb30f xen/common/schedule.c
--- a/xen/common/schedule.c Wed Jan 13 14:15:51 2010 +0000
+++ b/xen/common/schedule.c Wed Jan 13 14:36:58 2010 +0000
@@ -58,9 +58,11 @@
extern const struct scheduler sched_sedf_def;
extern const struct scheduler sched_credit_def;
+extern const struct scheduler sched_credit2_def;
static const struct scheduler *__initdata schedulers[] = {
&sched_sedf_def,
&sched_credit_def,
+ &sched_credit2_def,
NULL
};
diff -r 7bd1dd9fb30f xen/include/public/domctl.h
--- a/xen/include/public/domctl.h Wed Jan 13 14:15:51 2010 +0000
+++ b/xen/include/public/domctl.h Wed Jan 13 14:36:58 2010 +0000
@@ -295,6 +295,7 @@
/* Scheduler types. */
#define XEN_SCHEDULER_SEDF 4
#define XEN_SCHEDULER_CREDIT 5
+#define XEN_SCHEDULER_CREDIT2 6
/* Set or get info? */
#define XEN_DOMCTL_SCHEDOP_putinfo 0
#define XEN_DOMCTL_SCHEDOP_getinfo 1
@@ -313,6 +314,9 @@
uint16_t weight;
uint16_t cap;
} credit;
+ struct xen_domctl_sched_credit2 {
+ uint16_t weight;
+ } credit2;
} u;
};
typedef struct xen_domctl_scheduler_op xen_domctl_scheduler_op_t;
diff -r 7bd1dd9fb30f xen/include/public/trace.h
--- a/xen/include/public/trace.h Wed Jan 13 14:15:51 2010 +0000
+++ b/xen/include/public/trace.h Wed Jan 13 14:36:58 2010 +0000
@@ -53,6 +53,7 @@
#define TRC_HVM_HANDLER 0x00082000 /* various HVM handlers */
#define TRC_SCHED_MIN 0x00021000 /* Just runstate changes */
+#define TRC_SCHED_CLASS 0x00022000 /* Scheduler-specific */
#define TRC_SCHED_VERBOSE 0x00028000 /* More inclusive scheduling */
/* Trace events per class */
[-- Attachment #5: credit2-tools.diff --]
[-- Type: text/x-patch, Size: 15599 bytes --]
diff -r 63531e640828 tools/libxc/Makefile
--- a/tools/libxc/Makefile Mon Dec 07 17:01:11 2009 +0000
+++ b/tools/libxc/Makefile Mon Dec 21 11:45:00 2009 +0000
@@ -17,6 +17,7 @@
CTRL_SRCS-y += xc_private.c
CTRL_SRCS-y += xc_sedf.c
CTRL_SRCS-y += xc_csched.c
+CTRL_SRCS-y += xc_csched2.c
CTRL_SRCS-y += xc_tbuf.c
CTRL_SRCS-y += xc_pm.c
CTRL_SRCS-y += xc_cpu_hotplug.c
diff -r 63531e640828 tools/libxc/xc_csched2.c
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/libxc/xc_csched2.c Mon Dec 21 11:45:00 2009 +0000
@@ -0,0 +1,50 @@
+/****************************************************************************
+ * (C) 2006 - Emmanuel Ackaouy - XenSource Inc.
+ ****************************************************************************
+ *
+ * File: xc_csched2.c
+ * Author: Emmanuel Ackaouy
+ *
+ * Description: XC Interface to the credit2 scheduler
+ *
+ */
+#include "xc_private.h"
+
+
+int
+xc_sched_credit2_domain_set(
+ int xc_handle,
+ uint32_t domid,
+ struct xen_domctl_sched_credit2 *sdom)
+{
+ DECLARE_DOMCTL;
+
+ domctl.cmd = XEN_DOMCTL_scheduler_op;
+ domctl.domain = (domid_t) domid;
+ domctl.u.scheduler_op.sched_id = XEN_SCHEDULER_CREDIT2;
+ domctl.u.scheduler_op.cmd = XEN_DOMCTL_SCHEDOP_putinfo;
+ domctl.u.scheduler_op.u.credit2 = *sdom;
+
+ return do_domctl(xc_handle, &domctl);
+}
+
+int
+xc_sched_credit2_domain_get(
+ int xc_handle,
+ uint32_t domid,
+ struct xen_domctl_sched_credit2 *sdom)
+{
+ DECLARE_DOMCTL;
+ int err;
+
+ domctl.cmd = XEN_DOMCTL_scheduler_op;
+ domctl.domain = (domid_t) domid;
+ domctl.u.scheduler_op.sched_id = XEN_SCHEDULER_CREDIT2;
+ domctl.u.scheduler_op.cmd = XEN_DOMCTL_SCHEDOP_getinfo;
+
+ err = do_domctl(xc_handle, &domctl);
+ if ( err == 0 )
+ *sdom = domctl.u.scheduler_op.u.credit2;
+
+ return err;
+}
diff -r 63531e640828 tools/libxc/xenctrl.h
--- a/tools/libxc/xenctrl.h Mon Dec 07 17:01:11 2009 +0000
+++ b/tools/libxc/xenctrl.h Mon Dec 21 11:45:00 2009 +0000
@@ -469,6 +469,14 @@
uint32_t domid,
struct xen_domctl_sched_credit *sdom);
+int xc_sched_credit2_domain_set(int xc_handle,
+ uint32_t domid,
+ struct xen_domctl_sched_credit2 *sdom);
+
+int xc_sched_credit2_domain_get(int xc_handle,
+ uint32_t domid,
+ struct xen_domctl_sched_credit2 *sdom);
+
/**
* This function sends a trigger to a domain.
*
diff -r 63531e640828 tools/python/xen/lowlevel/xc/xc.c
--- a/tools/python/xen/lowlevel/xc/xc.c Mon Dec 07 17:01:11 2009 +0000
+++ b/tools/python/xen/lowlevel/xc/xc.c Mon Dec 21 11:45:00 2009 +0000
@@ -1374,6 +1374,45 @@
"cap", sdom.cap);
}
+static PyObject *pyxc_sched_credit2_domain_set(XcObject *self,
+ PyObject *args,
+ PyObject *kwds)
+{
+ uint32_t domid;
+ uint16_t weight;
+ static char *kwd_list[] = { "domid", "weight", NULL };
+ static char kwd_type[] = "I|H";
+ struct xen_domctl_sched_credit2 sdom;
+
+ weight = 0;
+ if( !PyArg_ParseTupleAndKeywords(args, kwds, kwd_type, kwd_list,
+ &domid, &weight) )
+ return NULL;
+
+ sdom.weight = weight;
+
+ if ( xc_sched_credit2_domain_set(self->xc_handle, domid, &sdom) != 0 )
+ return pyxc_error_to_exception();
+
+ Py_INCREF(zero);
+ return zero;
+}
+
+static PyObject *pyxc_sched_credit2_domain_get(XcObject *self, PyObject *args)
+{
+ uint32_t domid;
+ struct xen_domctl_sched_credit2 sdom;
+
+ if( !PyArg_ParseTuple(args, "I", &domid) )
+ return NULL;
+
+ if ( xc_sched_credit2_domain_get(self->xc_handle, domid, &sdom) != 0 )
+ return pyxc_error_to_exception();
+
+ return Py_BuildValue("{s:H}",
+ "weight", sdom.weight);
+}
+
static PyObject *pyxc_domain_setmaxmem(XcObject *self, PyObject *args)
{
uint32_t dom;
@@ -1912,6 +1951,24 @@
"Returns: [dict]\n"
" weight [short]: domain's scheduling weight\n"},
+ { "sched_credit2_domain_set",
+ (PyCFunction)pyxc_sched_credit2_domain_set,
+ METH_KEYWORDS, "\n"
+ "Set the scheduling parameters for a domain when running with the\n"
+ "SMP credit2 scheduler.\n"
+ " domid [int]: domain id to set\n"
+ " weight [short]: domain's scheduling weight\n"
+ "Returns: [int] 0 on success; -1 on error.\n" },
+
+ { "sched_credit2_domain_get",
+ (PyCFunction)pyxc_sched_credit2_domain_get,
+ METH_VARARGS, "\n"
+ "Get the scheduling parameters for a domain when running with the\n"
+ "SMP credit2 scheduler.\n"
+ " domid [int]: domain id to get\n"
+ "Returns: [dict]\n"
+ " weight [short]: domain's scheduling weight\n"},
+
{ "evtchn_alloc_unbound",
(PyCFunction)pyxc_evtchn_alloc_unbound,
METH_VARARGS | METH_KEYWORDS, "\n"
@@ -2272,6 +2329,7 @@
/* Expose some libxc constants to Python */
PyModule_AddIntConstant(m, "XEN_SCHEDULER_SEDF", XEN_SCHEDULER_SEDF);
PyModule_AddIntConstant(m, "XEN_SCHEDULER_CREDIT", XEN_SCHEDULER_CREDIT);
+ PyModule_AddIntConstant(m, "XEN_SCHEDULER_CREDIT2", XEN_SCHEDULER_CREDIT2);
}
diff -r 63531e640828 tools/python/xen/xend/XendAPI.py
--- a/tools/python/xen/xend/XendAPI.py Mon Dec 07 17:01:11 2009 +0000
+++ b/tools/python/xen/xend/XendAPI.py Mon Dec 21 11:45:00 2009 +0000
@@ -1626,8 +1626,7 @@
if 'weight' in xeninfo.info['vcpus_params'] \
and 'cap' in xeninfo.info['vcpus_params']:
weight = xeninfo.info['vcpus_params']['weight']
- cap = xeninfo.info['vcpus_params']['cap']
- xendom.domain_sched_credit_set(xeninfo.getDomid(), weight, cap)
+ xendom.domain_sched_credit2_set(xeninfo.getDomid(), weight)
def VM_set_VCPUs_number_live(self, _, vm_ref, num):
dom = XendDomain.instance().get_vm_by_uuid(vm_ref)
diff -r 63531e640828 tools/python/xen/xend/XendDomain.py
--- a/tools/python/xen/xend/XendDomain.py Mon Dec 07 17:01:11 2009 +0000
+++ b/tools/python/xen/xend/XendDomain.py Mon Dec 21 11:45:00 2009 +0000
@@ -1757,6 +1757,60 @@
log.exception(ex)
raise XendError(str(ex))
+ def domain_sched_credit2_get(self, domid):
+ """Get credit2 scheduler parameters for a domain.
+
+ @param domid: Domain ID or Name
+ @type domid: int or string.
+ @rtype: dict with keys 'weight'
+ @return: credit2 scheduler parameters
+ """
+ dominfo = self.domain_lookup_nr(domid)
+ if not dominfo:
+ raise XendInvalidDomain(str(domid))
+
+ if dominfo._stateGet() in (DOM_STATE_RUNNING, DOM_STATE_PAUSED):
+ try:
+ return xc.sched_credit2_domain_get(dominfo.getDomid())
+ except Exception, ex:
+ raise XendError(str(ex))
+ else:
+ return {'weight' : dominfo.getWeight()}
+
+ def domain_sched_credit2_set(self, domid, weight = None):
+ """Set credit2 scheduler parameters for a domain.
+
+ @param domid: Domain ID or Name
+ @type domid: int or string.
+ @type weight: int
+ @rtype: 0
+ """
+ set_weight = False
+ dominfo = self.domain_lookup_nr(domid)
+ if not dominfo:
+ raise XendInvalidDomain(str(domid))
+ try:
+ if weight is None:
+ weight = int(0)
+ elif weight < 1 or weight > 65535:
+ raise XendError("weight is out of range")
+ else:
+ set_weight = True
+
+ assert type(weight) == int
+
+ rc = 0
+ if dominfo._stateGet() in (DOM_STATE_RUNNING, DOM_STATE_PAUSED):
+ rc = xc.sched_credit2_domain_set(dominfo.getDomid(), weight)
+ if rc == 0:
+ if set_weight:
+ dominfo.setWeight(weight)
+ self.managed_config_save(dominfo)
+ return rc
+ except Exception, ex:
+ log.exception(ex)
+ raise XendError(str(ex))
+
def domain_maxmem_set(self, domid, mem):
"""Set the memory limit for a domain.
diff -r 63531e640828 tools/python/xen/xend/XendDomainInfo.py
--- a/tools/python/xen/xend/XendDomainInfo.py Mon Dec 07 17:01:11 2009 +0000
+++ b/tools/python/xen/xend/XendDomainInfo.py Mon Dec 21 11:45:00 2009 +0000
@@ -2719,6 +2719,10 @@
XendDomain.instance().domain_sched_credit_set(self.getDomid(),
self.getWeight(),
self.getCap())
+ elif XendNode.instance().xenschedinfo() == 'credit2':
+ from xen.xend import XendDomain
+ XendDomain.instance().domain_sched_credit2_set(self.getDomid(),
+ self.getWeight())
def _initDomain(self):
log.debug('XendDomainInfo.initDomain: %s %s',
diff -r 63531e640828 tools/python/xen/xend/XendNode.py
--- a/tools/python/xen/xend/XendNode.py Mon Dec 07 17:01:11 2009 +0000
+++ b/tools/python/xen/xend/XendNode.py Mon Dec 21 11:45:00 2009 +0000
@@ -760,6 +760,8 @@
return 'sedf'
elif sched_id == xen.lowlevel.xc.XEN_SCHEDULER_CREDIT:
return 'credit'
+ elif sched_id == xen.lowlevel.xc.XEN_SCHEDULER_CREDIT2:
+ return 'credit2'
else:
return 'unknown'
@@ -961,6 +963,8 @@
return 'sedf'
elif sched_id == xen.lowlevel.xc.XEN_SCHEDULER_CREDIT:
return 'credit'
+ elif sched_id == xen.lowlevel.xc.XEN_SCHEDULER_CREDIT2:
+ return 'credit2'
else:
return 'unknown'
diff -r 63531e640828 tools/python/xen/xend/XendVMMetrics.py
--- a/tools/python/xen/xend/XendVMMetrics.py Mon Dec 07 17:01:11 2009 +0000
+++ b/tools/python/xen/xend/XendVMMetrics.py Mon Dec 21 11:45:00 2009 +0000
@@ -129,6 +129,7 @@
params_live['cpumap%i' % i] = \
",".join(map(str, info['cpumap']))
+ # FIXME: credit2??
params_live.update(xc.sched_credit_domain_get(domid))
return params_live
diff -r 63531e640828 tools/python/xen/xend/server/SrvDomain.py
--- a/tools/python/xen/xend/server/SrvDomain.py Mon Dec 07 17:01:11 2009 +0000
+++ b/tools/python/xen/xend/server/SrvDomain.py Mon Dec 21 11:45:00 2009 +0000
@@ -163,6 +163,20 @@
val = fn(req.args, {'dom': self.dom.getName()})
return val
+ def op_domain_sched_credit2_get(self, _, req):
+ fn = FormFn(self.xd.domain_sched_credit2_get,
+ [['dom', 'str']])
+ val = fn(req.args, {'dom': self.dom.getName()})
+ return val
+
+
+ def op_domain_sched_credit2_set(self, _, req):
+ fn = FormFn(self.xd.domain_sched_credit2_set,
+ [['dom', 'str'],
+ ['weight', 'int']])
+ val = fn(req.args, {'dom': self.dom.getName()})
+ return val
+
def op_maxmem_set(self, _, req):
return self.call(self.dom.setMemoryMaximum,
[['memory', 'int']],
diff -r 63531e640828 tools/python/xen/xm/main.py
--- a/tools/python/xen/xm/main.py Mon Dec 07 17:01:11 2009 +0000
+++ b/tools/python/xen/xm/main.py Mon Dec 21 11:45:00 2009 +0000
@@ -150,6 +150,8 @@
'sched-sedf' : ('<Domain> [options]', 'Get/set EDF parameters.'),
'sched-credit': ('[-d <Domain> [-w[=WEIGHT]|-c[=CAP]]]',
'Get/set credit scheduler parameters.'),
+ 'sched-credit2': ('[-d <Domain> [-w[=WEIGHT]]]',
+ 'Get/set credit2 scheduler parameters.'),
'sysrq' : ('<Domain> <letter>', 'Send a sysrq to a domain.'),
'debug-keys' : ('<Keys>', 'Send debug keys to Xen.'),
'trigger' : ('<Domain> <nmi|reset|init|s3resume|power> [<VCPU>]',
@@ -265,6 +267,10 @@
('-w WEIGHT', '--weight=WEIGHT', 'Weight (int)'),
('-c CAP', '--cap=CAP', 'Cap (int)'),
),
+ 'sched-credit2': (
+ ('-d DOMAIN', '--domain=DOMAIN', 'Domain to modify'),
+ ('-w WEIGHT', '--weight=WEIGHT', 'Weight (int)'),
+ ),
'list': (
('-l', '--long', 'Output all VM details in SXP'),
('', '--label', 'Include security labels'),
@@ -406,6 +412,7 @@
]
scheduler_commands = [
+ "sched-credit2",
"sched-credit",
"sched-sedf",
]
@@ -1720,6 +1727,80 @@
if result != 0:
err(str(result))
+def xm_sched_credit2(args):
+ """Get/Set options for Credit2 Scheduler."""
+
+ check_sched_type('credit2')
+
+ try:
+ opts, params = getopt.getopt(args, "d:w:",
+ ["domain=", "weight="])
+ except getopt.GetoptError, opterr:
+ err(opterr)
+ usage('sched-credit2')
+
+ domid = None
+ weight = None
+
+ for o, a in opts:
+ if o in ["-d", "--domain"]:
+ domid = a
+ elif o in ["-w", "--weight"]:
+ weight = int(a)
+
+ doms = filter(lambda x : domid_match(domid, x),
+ [parse_doms_info(dom)
+ for dom in getDomains(None, 'all')])
+
+ if weight is None:
+ if domid is not None and doms == []:
+ err("Domain '%s' does not exist." % domid)
+ usage('sched-credit2')
+ # print header if we aren't setting any parameters
+ print '%-33s %4s %6s' % ('Name','ID','Weight')
+
+ for d in doms:
+ try:
+ if serverType == SERVER_XEN_API:
+ info = server.xenapi.VM_metrics.get_VCPUs_params(
+ server.xenapi.VM.get_metrics(
+ get_single_vm(d['name'])))
+ else:
+ info = server.xend.domain.sched_credit2_get(d['name'])
+ except xmlrpclib.Fault:
+ pass
+
+ if 'weight' not in info:
+ # domain does not support sched-credit2?
+ info = {'weight': -1}
+
+ info['weight'] = int(info['weight'])
+
+ info['name'] = d['name']
+ info['domid'] = str(d['domid'])
+ print( ("%(name)-32s %(domid)5s %(weight)6d") % info)
+ else:
+ if domid is None:
+ # placeholder for system-wide scheduler parameters
+ err("No domain given.")
+ usage('sched-credit2')
+
+ if serverType == SERVER_XEN_API:
+ if doms[0]['domid']:
+ server.xenapi.VM.add_to_VCPUs_params_live(
+ get_single_vm(domid),
+ "weight",
+ weight)
+ else:
+ server.xenapi.VM.add_to_VCPUs_params(
+ get_single_vm(domid),
+ "weight",
+ weight)
+ else:
+ result = server.xend.domain.sched_credit2_set(domid, weight)
+ if result != 0:
+ err(str(result))
+
def xm_info(args):
arg_check(args, "info", 0, 1)
@@ -3341,6 +3422,7 @@
# scheduler
"sched-sedf": xm_sched_sedf,
"sched-credit": xm_sched_credit,
+ "sched-credit2": xm_sched_credit2,
# block
"block-attach": xm_block_attach,
"block-detach": xm_block_detach,