* deadlock in the credit2
@ 2011-10-14 9:08 Eunbyung Park
2011-10-14 11:47 ` George Dunlap
0 siblings, 1 reply; 4+ messages in thread
From: Eunbyung Park @ 2011-10-14 9:08 UTC (permalink / raw)
To: george.dunlap; +Cc: xen-devel
IMHO, it seems to be deadlock when changing dom0's weight in credit2
scheduler.
when the sched_adjust() in schedule.c is called, it grabs the
schedule_lock after pausing all of the vcpus
and then, csched_dom_cntl in sched_credit2.c, it also grab the
schedule_lock by using vcpu_schedule_lock_irq().
In the credit2, all of the percpu schedule_lock points out same runqueue
lock if they belong to same runqueue.
Eventually, all of vcpu are paused except for itself running the code,
and it try to grab schedule_lock that was grabbed by itself.
Am I right? If I was wrong, please tell me my misunderstanding.
And, I have a question about the code, where are in sched_adjust() in
schedule.c
if ( d == current->domain )
vcpu_schedule_lock_irq(current);
It was very hard to understan for me..:) What does it exactly mean?
I would be very grateful for any comments
--
Best Regards,
Eunbyung Park
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: deadlock in the credit2
2011-10-14 9:08 deadlock in the credit2 Eunbyung Park
@ 2011-10-14 11:47 ` George Dunlap
2011-10-14 12:11 ` Keir Fraser
0 siblings, 1 reply; 4+ messages in thread
From: George Dunlap @ 2011-10-14 11:47 UTC (permalink / raw)
To: Eunbyung Park, Keir Fraser; +Cc: xen-devel
2011/10/14 Eunbyung Park <silverbottlep@gmail.com>:
> IMHO, it seems to be deadlock when changing dom0's weight in credit2
> scheduler.
>
> when the sched_adjust() in schedule.c is called, it grabs the
> schedule_lock after pausing all of the vcpus
>
> and then, csched_dom_cntl in sched_credit2.c, it also grab the
> schedule_lock by using vcpu_schedule_lock_irq().
>
> In the credit2, all of the percpu schedule_lock points out same runqueue
> lock if they belong to same runqueue.
>
> Eventually, all of vcpu are paused except for itself running the code,
> and it try to grab schedule_lock that was grabbed by itself.
>
> Am I right? If I was wrong, please tell me my misunderstanding.
Hmm, I think you may have discovered the source of a bug that people
have been reporting but I haven't had time to look into yet.
Keir, I think that lock in schedule.c around SCHED(adjust) must be
wrong. By the time we grab that lock, grabbing it will be completely
pointless. What are we going to be racing against? In any case, the
actual scheduler should be responsible for grabbing locks; there's no
reason that the scheduler can't grab whatever lock it needs inside
that function. I haven't done a deep analysis, but my initial
instinct is to just get rid of it. What do you think?
> if ( d == current->domain )
> vcpu_schedule_lock_irq(current);
>
> It was very hard to understan for me..:) What does it exactly mean?
You're asking what "current" means? "current" is a macro that always
resolves to the vcpu which is running on the current processor.
sched_adjust() seems to be trying to avoid scheduling races in general
by pausing all vcpus before calling the per-scheduler function. But
if a VM is calling the op on itself, the vcpu making the hypercall
can't pause itself. So in that case (current->domain == d) will be
true, so sched_adjust() grab the schedule lock of that vm instead.
But really all that locking should be handled in the scheduler
function, not by the generic code. It knows best what needs to be
locked when.
-George
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: deadlock in the credit2
2011-10-14 11:47 ` George Dunlap
@ 2011-10-14 12:11 ` Keir Fraser
2011-10-15 4:48 ` Eunbyung Park
0 siblings, 1 reply; 4+ messages in thread
From: Keir Fraser @ 2011-10-14 12:11 UTC (permalink / raw)
To: George Dunlap, Eunbyung Park; +Cc: xen-devel
On 14/10/2011 12:47, "George Dunlap" <George.Dunlap@eu.citrix.com> wrote:
> 2011/10/14 Eunbyung Park <silverbottlep@gmail.com>:
>> IMHO, it seems to be deadlock when changing dom0's weight in credit2
>> scheduler.
>>
>> when the sched_adjust() in schedule.c is called, it grabs the
>> schedule_lock after pausing all of the vcpus
>>
>> and then, csched_dom_cntl in sched_credit2.c, it also grab the
>> schedule_lock by using vcpu_schedule_lock_irq().
>>
>> In the credit2, all of the percpu schedule_lock points out same runqueue
>> lock if they belong to same runqueue.
>>
>> Eventually, all of vcpu are paused except for itself running the code,
>> and it try to grab schedule_lock that was grabbed by itself.
>>
>> Am I right? If I was wrong, please tell me my misunderstanding.
>
> Hmm, I think you may have discovered the source of a bug that people
> have been reporting but I haven't had time to look into yet.
>
> Keir, I think that lock in schedule.c around SCHED(adjust) must be
> wrong. By the time we grab that lock, grabbing it will be completely
> pointless. What are we going to be racing against? In any case, the
> actual scheduler should be responsible for grabbing locks; there's no
> reason that the scheduler can't grab whatever lock it needs inside
> that function. I haven't done a deep analysis, but my initial
> instinct is to just get rid of it. What do you think?
Fine by me. The synchronisation in that function looks pretty fragile. It's
probably outdated too.
-- Keir
>> if ( d == current->domain )
>> vcpu_schedule_lock_irq(current);
>>
>> It was very hard to understan for me..:) What does it exactly mean?
>
> You're asking what "current" means? "current" is a macro that always
> resolves to the vcpu which is running on the current processor.
>
> sched_adjust() seems to be trying to avoid scheduling races in general
> by pausing all vcpus before calling the per-scheduler function. But
> if a VM is calling the op on itself, the vcpu making the hypercall
> can't pause itself. So in that case (current->domain == d) will be
> true, so sched_adjust() grab the schedule lock of that vm instead.
>
> But really all that locking should be handled in the scheduler
> function, not by the generic code. It knows best what needs to be
> locked when.
>
> -George
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: deadlock in the credit2
2011-10-14 12:11 ` Keir Fraser
@ 2011-10-15 4:48 ` Eunbyung Park
0 siblings, 0 replies; 4+ messages in thread
From: Eunbyung Park @ 2011-10-15 4:48 UTC (permalink / raw)
To: Keir Fraser; +Cc: George Dunlap, xen-devel
>>> if ( d == current->domain )
>>> vcpu_schedule_lock_irq(current);
>>>
>>> It was very hard to understan for me..:) What does it exactly mean?
>> You're asking what "current" means? "current" is a macro that always
>> resolves to the vcpu which is running on the current processor.
>>
>> sched_adjust() seems to be trying to avoid scheduling races in general
>> by pausing all vcpus before calling the per-scheduler function. But
>> if a VM is calling the op on itself, the vcpu making the hypercall
>> can't pause itself. So in that case (current->domain == d) will be
>> true, so sched_adjust() grab the schedule lock of that vm instead.
>>
>> But really all that locking should be handled in the scheduler
>> function, not by the generic code. It knows best what needs to be
>> locked when.
This was what I really wanted to ask about.
Now I can understand what the generic scheduler code was going to say.
Thanks for your kindness.
--
Best Regards,
Eunbyung Park
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2011-10-15 4:48 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-14 9:08 deadlock in the credit2 Eunbyung Park
2011-10-14 11:47 ` George Dunlap
2011-10-14 12:11 ` Keir Fraser
2011-10-15 4:48 ` Eunbyung Park
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.