* Scheduler problem in XEN 3.4.0
From: Pankaj Parakh @ 2009-10-12 17:15 UTC
To: xen-devel
Hi All,
I am trying to learn about Xen's schedulers. As a start, I am using
Xen 3.4.0 together with the book "The Definitive Guide to the Xen
Hypervisor" by David Chisnall, and I have followed the steps in
Chapter 12 to add a scheduler.
I don't know what the problem is, but I am unable to boot with that
scheduler selected. I have been trying to debug this, but I am stuck.
The scheduler given in the book is a trivial round-robin scheduler.
I also don't know whether the problem is with the code or with the
procedure.
Please help me out with this.
Thanks
Pankaj Parakh
* Re: Scheduler problem in XEN 3.4.0
From: George Dunlap @ 2009-10-13 11:47 UTC
To: Pankaj Parakh; +Cc: xen-devel
Pankaj,
I haven't used the round-robin scheduler code in that book, but
another guy named Ananth tried to use it unsuccessfully as well. You
can see some of that thread here (not sure why I can't find the
original post):
http://lists.xensource.com/archives/html/xen-devel/2009-05/msg00004.html
Most of us are not so interested in finding the bug in the book's
code, but we are interested in helping *you* find it. If you continue
to do hypervisor work (and especially if you do anything with the
scheduler), you're going to have to learn how to debug a hypervisor,
which is often rather a pain in the neck.
There is some advice in the thread linked to above about setting up a
serial console. You can add printk()'s around to find out what the
sched_rr code is doing and where it's going wrong, and ask more
questions on the list if you get stuck. (Feel free to cc me to bring
it to my attention, but always send it to the whole list.)
Good luck,
-George
* Re: Scheduler problem in XEN 3.4.0
From: Pankaj Parakh @ 2009-10-23 15:38 UTC
To: xen-devel; +Cc: George Dunlap
Hi George,
Thanks for showing interest. First of all, I'll try to set up a
debugging environment for Xen. I also read Ananth's thread on the
mailing list; I know him personally, and he left the work somewhere in
the middle.
I used printk()s and locking to find out which functions are being
called in schedule.c, and found the following sequence:
schedule_init()
schedule_domain_init()
schedule_vcpu_init()
schedule_domain_init()
schedule_vcpu_init()
(from here on no function in schedule.c is called, and the system hangs)
I can't say what the problem is. I am only changing the sched_priv
field of the vcpu struct, and my rr scheduler has only three functions,
namely init_vcpu, destroy_vcpu and do_schedule.
I have attached the code as well.
Following are my doubts:
1. Is the above function sequence right? Why is init_domain called
twice during boot?
2. Does a scheduler policy need to manipulate other parts of the vcpu
struct (other than sched_priv)?
3. Is it necessary to maintain domain information inside the scheduler
policy?
--
Pankaj Parakh
[-- Attachment #2: sched_trivial.c --]
[-- Type: text/x-csrc, Size: 3628 bytes --]
#include <xen/config.h>
#include <xen/init.h>
#include <xen/lib.h>
#include <xen/sched.h>
#include <xen/domain.h>
#include <xen/delay.h>
#include <xen/event.h>
#include <xen/time.h>
#include <xen/perfc.h>
#include <xen/sched-if.h>
#include <xen/softirq.h>
#include <asm/atomic.h>
#include <xen/errno.h>
#include <xen/timer.h>
#define BEEP "inb $97, %al;\n\
outb %al, $0x80;\n\
movb $3, %al;\n\
outb %al, $97;\n\
outb %al, $0x80;\n\
movb $-74, %al;\n\
outb %al, $67;\n\
outb %al, $0x80;\n\
movb $-119, %al;\n\
outb %al, $66;\n\
outb %al, $0x80;\n\
movb $15, %al;\n\
outb %al, $66;"
/* CPU Run Queue */
static struct vcpu *vcpu_list_head = NULL ;
static struct vcpu *vcpu_list_tail = NULL ;
static spinlock_t lock;
unsigned int vcpus = 0 ;
static void trivial_init_func(void)
{
    // unsigned long flags;
    spin_lock_init(&lock);
    // printk("t_init_func:Wait\n");
    // spin_lock_irqsave(&lock, flags);
}
/* Add a VCPU */
static int trivial_init_vcpu(struct vcpu *v)
{
    //asm volatile(BEEP);
    unsigned long flags;
    printk("t_init:Wait\n");
    spin_lock_irqsave(&lock, flags);
    if ( vcpu_list_head == NULL )
    {
        vcpu_list_head = vcpu_list_tail = v;
    }
    else
    {
        vcpu_list_tail->sched_priv = (void *)v;
        vcpu_list_tail = v;
    }
    spin_unlock_irqrestore(&lock, flags);
    v->sched_priv = NULL;
    printk("t_init:free");
    return 0;
}
/* Remove a VCPU */
static void trivial_destroy_vcpu(struct vcpu *v)
{
    struct vcpu *curr = NULL;
    struct vcpu *last = NULL;
    //asm volatile(BEEP);
    unsigned long flags;
    printk("t_destroy:wait\n");
    spin_lock_irqsave(&lock, flags);
    if ( v == vcpu_list_head )
    {
        vcpu_list_head = (struct vcpu *)(vcpu_list_head->sched_priv);
    }
    else
    {
        last = NULL;
        curr = vcpu_list_head;
        while ( curr != NULL )
        {
            if ( curr == v )
            {
                last = (struct vcpu *)(curr->sched_priv);
                break;
            }
            last = curr;
            curr = (struct vcpu *)(curr->sched_priv);
        }
        if ( curr != NULL )
        {
            last->sched_priv = curr->sched_priv;
        }
    }
    spin_unlock_irqrestore(&lock, flags);
    printk("destroy:free");
}
static inline void increment_run_queue(void)
{
    //asm volatile(BEEP);
    unsigned long flags;
    printk("t_irq:wait\n");
    spin_lock_irqsave(&lock, flags);
    vcpu_list_tail->sched_priv = (void *)vcpu_list_head;
    vcpu_list_tail = vcpu_list_head;
    vcpu_list_head = (struct vcpu *)(vcpu_list_tail->sched_priv);
    vcpu_list_tail->sched_priv = NULL;
    spin_unlock_irqrestore(&lock, flags);
    printk("irq:free");
}
/* Pick a VCPU to run */
static struct task_slice trivial_do_schedule(s_time_t now)
{
    //int cpu = smp_processor_id();
    struct task_slice ret;
    /* Fixed-size quantum */
    struct vcpu *head = vcpu_list_head;
    // asm volatile(BEEP);
    ret.time = MILLISECS(10);
    do
    {
        /* Find a runnable VCPU */
        increment_run_queue();
        if ( vcpu_runnable(vcpu_list_head) )
        {
            ret.task = vcpu_list_head;
            return ret;
        }
    } while ( head != vcpu_list_head );
    /* Return the idle task if there isn't one */
    ret.task = ((struct vcpu *)(per_cpu(schedule_data, vcpus).idle));
    return ret;
}
struct scheduler sched_trivial_def = {
    .name           = "Trivial Round Robin Scheduler",
    .opt_name       = "trivial",
    .sched_id       = 6,
    .init           = trivial_init_func,
    .init_domain    = NULL,
    .destroy_domain = NULL,
    .init_vcpu      = trivial_init_vcpu,
    .destroy_vcpu   = trivial_destroy_vcpu,
    .sleep          = NULL,
    .wake           = NULL,
    .do_schedule    = trivial_do_schedule,
    .pick_cpu       = NULL,
    .adjust         = NULL,
    .dump_settings  = NULL,
    .dump_cpu_state = NULL,
    .tick_suspend   = NULL,
    .tick_resume    = NULL,
};
* Re: Scheduler problem in XEN 3.4.0
From: George Dunlap @ 2009-10-26 15:26 UTC
To: Pankaj Parakh; +Cc: xen-devel@lists.xensource.com
Pankaj Parakh wrote:
> Following are my doubts:
>
[Language point: In many English dialects, such as US and UK, "doubt"
implies something negative. To avoid being misunderstood by speakers of
those dialects, "questions" might be a better word to use.]
> 1. Is the above function sequence right? Why is init_domain called
> twice during boot?
>
I suspect that the idle domain (32767) and domain 0 are the two domains
being initialized. You could easily find out by adding the domain id to
the printk.
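A minimal sketch of that kind of trace message, written as a stand-alone
user-space model rather than hypervisor code. The two structs are
cut-down stand-ins that model only the id fields and the vcpu's
back-pointer to its domain, and format_trace() is a hypothetical helper;
inside the hypervisor the same format string would go to printk()
instead of snprintf().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Cut-down stand-ins for the hypervisor's domain and vcpu structs;
 * only the fields needed for the trace message are modelled. */
struct domain { unsigned int domain_id; };
struct vcpu   { int vcpu_id; struct domain *domain; };

/* Hypothetical helper: writes "<hook>: d<domid> v<vcpuid>" into buf
 * and returns the number of characters snprintf() reports. */
static int format_trace(char *buf, size_t len, const char *hook,
                        const struct vcpu *v)
{
    assert(v != NULL && v->domain != NULL);
    return snprintf(buf, len, "%s: d%u v%d", hook,
                    v->domain->domain_id, v->vcpu_id);
}
```

Tagging each hook's message with both ids makes it possible to tell
from the console log which domain each init call belongs to.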
> 2. Does a scheduler policy need to manipulate other parts of the vcpu
> struct (other than sched_priv)?
>
I don't believe so. You could look through other schedulers (like the
credit scheduler) to see if this is so.
> 3. Is it necessary to maintain domain information inside the scheduler policy?
>
I don't know.
A quick scan through the attached file turns up a couple of points:
+ It appears that the algorithm adds *all* vcpus to a single global
runqueue, and scans through them looking for "runnable" vcpus on
schedule. This seems pretty pointless: why not add them to the global
runqueue on wake?
+ Furthermore, I'm not sure your linked-list implementation is sound;
for example, v->sched_priv is only set to NULL after the vcpu has been
linked into the list and the lock has been dropped. (Not going to spend
a lot of time trying to figure out if that's OK or not.)
+ There is some missing logic regarding the v->processor field and
sync_vcpu_execstate(). Xen is designed expecting a per-cpu runqueue
with explicit migrations of vcpus between pcpus. One of the reasons for
this is so that when switching between a vcpu and the idle domain, it
doesn't actually need to do a full context switch. As a result, if you
run a vcpu on one cpu, and then run it on another without an explicit
sync, you may have stale data in the vcpu struct (and Xen will throw a BUG).
+ You don't implement a .wake() method. I think that Xen will wake()
dom0 when it's ready, expecting the .wake() method to raise a SCHEDULE
softirq on the appropriate pcpu (from which the .do_schedule() method is
called). So no wake method means no schedule(), and no schedule means
it just runs the idle domain.
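
As an illustration of the first two points, here is a user-space sketch
of a runqueue that only ever holds runnable vcpus and keeps its own
link pointer. All names (rr_vcpu, rr_runq, rr_enqueue, rr_dequeue) are
hypothetical, locking is left out, and in the hypervisor the per-vcpu
struct would hang off v->sched_priv rather than reusing that pointer as
the list link.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-vcpu scheduler data for the model. */
struct rr_vcpu {
    int id;                 /* stand-in for the vcpu identity */
    int queued;             /* non-zero while on the runqueue */
    struct rr_vcpu *next;   /* explicit link, NULL when not queued */
};

/* FIFO runqueue containing only runnable vcpus. */
struct rr_runq {
    struct rr_vcpu *head;
    struct rr_vcpu *tail;
};

/* Initialise every field, including the link, before first use. */
static void rr_vcpu_init(struct rr_vcpu *v, int id)
{
    v->id = id;
    v->queued = 0;
    v->next = NULL;
}

/* Wake path: append v unless it is already queued. */
static void rr_enqueue(struct rr_runq *q, struct rr_vcpu *v)
{
    assert(q != NULL && v != NULL);
    if (v->queued)
        return;
    v->next = NULL;
    if (q->tail != NULL)
        q->tail->next = v;
    else
        q->head = v;
    q->tail = v;
    v->queued = 1;
}

/* Schedule path: pop the next vcpu, or return NULL if the queue is
 * empty (the caller would then pick the idle vcpu). */
static struct rr_vcpu *rr_dequeue(struct rr_runq *q)
{
    struct rr_vcpu *v = q->head;

    if (v == NULL)
        return NULL;
    q->head = v->next;
    if (q->head == NULL)
        q->tail = NULL;
    v->next = NULL;
    v->queued = 0;
    return v;
}
```

With this shape the schedule path never has to scan past blocked vcpus,
and a vcpu that is still runnable at the end of its slice is simply
enqueued again at the tail.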
It might be best to start with just one pcpu, and add multiple cpus
once you get things running. Try adding a wake() method that will check
to see if cpu 0 is idle; if it is, raise SCHEDULE_SOFTIRQ on that cpu,
and see what you get. After that, try adding some logic to figure out
which other cpu to wake instead; but be advised that you'll probably hit
the BUG() relating to sync_vcpu_execstate() not being properly called.
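
The first step could look roughly like this. It is a user-space model
only: cpu_state, model_raise_schedule() and model_wake() are
hypothetical names, a boolean flag stands in for a raised
SCHEDULE_SOFTIRQ, and the real method would also have to put the woken
vcpu somewhere do_schedule can find it.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4   /* arbitrary size for the model */

/* Stand-in for per-cpu scheduler state. */
struct cpu_state {
    bool running_idle;      /* true while the idle vcpu is running */
    bool schedule_pending;  /* models a raised SCHEDULE_SOFTIRQ */
};

static struct cpu_state cpus[NR_CPUS];

/* Stand-in for raising SCHEDULE_SOFTIRQ on the given cpu. */
static void model_raise_schedule(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    cpus[cpu].schedule_pending = true;
}

/* On wake, poke cpu 0 only if it is currently idle.
 * Returns true if a reschedule was requested. */
static bool model_wake(void)
{
    if (!cpus[0].running_idle)
        return false;
    model_raise_schedule(0);
    return true;
}
```

If cpu 0 is busy, the woken vcpu just waits for the current timeslice
to expire, which is acceptable for a first single-cpu experiment.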
-George
* Re: Scheduler problem in XEN 3.4.0
From: Pankaj Parakh @ 2009-10-28 12:01 UTC
To: George Dunlap; +Cc: xen-devel@lists.xensource.com
Hi George,
First of all, I want to thank you: I am now able to run the rr
scheduler. The missing wake function was the only problem stopping it
from running, though I still need to improve it. Your review was very
helpful. I now also have a two-machine serial console setup, which lets
me see printk() output from the Xen code.
Now I want to change a parameter of the policy, say the timeslice
quantum, at runtime. I am unable to find much reference material for
this. I know that I have to make an entry in domctl.h, but what next?
Please guide me through.
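
For orientation, the attached scheduler definition leaves its .adjust
hook set to NULL, which looks like the natural place to handle such a
request. The sketch below is a user-space model of what a handler of
that kind might do, not the actual Xen interface: the command codes,
the rr_params struct, the bounds and rr_adjust() are all assumptions
made for illustration, and locking is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command codes and parameter block; the real ones would
 * be whatever gets added to domctl.h for the new scheduler. */
enum rr_op { RR_OP_GET, RR_OP_PUT };

struct rr_params {
    uint32_t quantum_ms;    /* timeslice length in milliseconds */
};

/* Arbitrary sanity bounds chosen for the model. */
#define RR_QUANTUM_MIN_MS 1u
#define RR_QUANTUM_MAX_MS 1000u

/* Value the schedule path would read instead of a hard-coded 10 ms. */
static uint32_t rr_quantum_ms = 10;

/* Model of an adjust handler: returns 0 on success, -1 on bad input. */
static int rr_adjust(enum rr_op op, struct rr_params *p)
{
    assert(p != NULL);
    switch (op) {
    case RR_OP_GET:
        p->quantum_ms = rr_quantum_ms;
        return 0;
    case RR_OP_PUT:
        if (p->quantum_ms < RR_QUANTUM_MIN_MS ||
            p->quantum_ms > RR_QUANTUM_MAX_MS)
            return -1;
        rr_quantum_ms = p->quantum_ms;
        return 0;
    }
    return -1;
}
```

The schedule path would then compute its timeslice from rr_quantum_ms
rather than from a constant, so a successful put takes effect at the
next scheduling decision.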
Thanks a lot again.
Regards,
Pankaj Parakh.
--
Pankaj Parakh
end of thread, other threads:[~2009-10-28 12:01 UTC | newest]
2009-10-12 17:15 Scheduler problem in XEN 3.4.0 Pankaj Parakh
2009-10-13 11:47 ` George Dunlap
2009-10-23 15:38 ` Pankaj Parakh
2009-10-26 15:26 ` George Dunlap
2009-10-28 12:01 ` Pankaj Parakh