* [PATCH] Fix deadlock in schedule.c at TRACE mode
@ 2008-04-24 4:34 NISHIGUCHI Naoki
2008-04-24 5:42 ` Atsushi SAKAI
0 siblings, 1 reply; 4+ messages in thread
From: NISHIGUCHI Naoki @ 2008-04-24 4:34 UTC (permalink / raw)
To: xen-devel
[-- Attachment #1: Type: text/plain, Size: 455 bytes --]
Hi,
In schedule.c, schedule() and sched_adjust() call trace functions during
acquiring lock of schedule_lock in each cpu's schedule_data. When trace
buffers are enabled, the trace function (__trace_var()) may call
vcpu_wake() by calling send_guest_global_virq(). In the case, a deadlock
occurs when acquiring lock of schedule_lock.
Attached patch fixes this problem.
Signed-off-by: Naoki Nishiguchi <nisiguti@jp.fujitsu.com>
Regards,
Naoki Nishiguchi
[-- Attachment #2: fix_deadlock.patch --]
[-- Type: text/plain, Size: 2165 bytes --]
diff -r 77dec8732cde xen/common/schedule.c
--- a/xen/common/schedule.c Wed Apr 23 16:58:44 2008 +0100
+++ b/xen/common/schedule.c Thu Apr 24 11:19:25 2008 +0900
@@ -605,11 +605,13 @@ long sched_adjust(struct domain *d, stru
if ( d == current->domain )
vcpu_schedule_lock_irq(current);
- if ( (ret = SCHED_OP(adjust, d, op)) == 0 )
- TRACE_1D(TRC_SCHED_ADJDOM, d->domain_id);
+ ret = SCHED_OP(adjust, d, op);
if ( d == current->domain )
vcpu_schedule_unlock_irq(current);
+
+ if ( ret == 0 )
+ TRACE_1D(TRC_SCHED_ADJDOM, d->domain_id);
for_each_vcpu ( d, v )
{
@@ -654,6 +656,7 @@ static void schedule(void)
struct schedule_data *sd;
struct task_slice next_slice;
s32 r_time; /* time for new dom to run */
+ uint64_t prev_state_time, next_state_time;
ASSERT(!in_irq());
ASSERT(this_cpu(mc_state).flags == 0);
@@ -682,14 +685,10 @@ static void schedule(void)
return continue_running(prev);
}
- TRACE_2D(TRC_SCHED_SWITCH_INFPREV,
- prev->domain->domain_id,
- now - prev->runstate.state_entry_time);
- TRACE_3D(TRC_SCHED_SWITCH_INFNEXT,
- next->domain->domain_id,
- (next->runstate.state == RUNSTATE_runnable) ?
- (now - next->runstate.state_entry_time) : 0,
- r_time);
+ /* Temporarily save the period of previous runstate. */
+ prev_state_time = now - prev->runstate.state_entry_time;
+ next_state_time = (next->runstate.state == RUNSTATE_runnable) ?
+ (now - next->runstate.state_entry_time) : 0;
ASSERT(prev->runstate.state == RUNSTATE_running);
vcpu_runstate_change(
@@ -705,6 +704,12 @@ static void schedule(void)
next->is_running = 1;
spin_unlock_irq(&sd->schedule_lock);
+
+ /* Avoid deadlock by calling the trace function after unlock. */
+ TRACE_2D(TRC_SCHED_SWITCH_INFPREV,
+ prev->domain->domain_id, prev_state_time);
+ TRACE_3D(TRC_SCHED_SWITCH_INFNEXT,
+ next->domain->domain_id, next_state_time, r_time);
perfc_incr(sched_ctx);
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH] Fix deadlock in schedule.c at TRACE mode
2008-04-24 4:34 [PATCH] Fix deadlock in schedule.c at TRACE mode NISHIGUCHI Naoki
@ 2008-04-24 5:42 ` Atsushi SAKAI
2008-04-24 7:03 ` NISHIGUCHI Naoki
0 siblings, 1 reply; 4+ messages in thread
From: Atsushi SAKAI @ 2008-04-24 5:42 UTC (permalink / raw)
To: NISHIGUCHI Naoki; +Cc: xen-devel
Hi, Naoki
I have two questions about this.
1)How to reproduce your deadlock ?
Would you give me your test environment to reproduce this deadlock?
Is it easily reproduced by running xenmon.py or xentrace
with one or two guest domain(s)?
or Any additional condition needed?
2)About fixing code,
I think __trace_var() should fix for this issue not schedule()
This issue cannot be fixed by modify the __trace_var()?
Thanks
Atsushi SAKAI
NISHIGUCHI Naoki <nisiguti@jp.fujitsu.com> wrote:
> Hi,
>
> In schedule.c, schedule() and sched_adjust() call trace functions during
> acquiring lock of schedule_lock in each cpu's schedule_data. When trace
> buffers are enabled, the trace function (__trace_var()) may call
> vcpu_wake() by calling send_guest_global_virq(). In the case, a deadlock
> occurs when acquiring lock of schedule_lock.
>
> Attached patch fixes this problem.
>
> Signed-off-by: Naoki Nishiguchi <nisiguti@jp.fujitsu.com>
>
> Regards,
> Naoki Nishiguchi
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] Fix deadlock in schedule.c at TRACE mode
2008-04-24 5:42 ` Atsushi SAKAI
@ 2008-04-24 7:03 ` NISHIGUCHI Naoki
2008-04-24 7:29 ` Keir Fraser
0 siblings, 1 reply; 4+ messages in thread
From: NISHIGUCHI Naoki @ 2008-04-24 7:03 UTC (permalink / raw)
To: Atsushi SAKAI, xen-devel
[-- Attachment #1: Type: text/plain, Size: 859 bytes --]
Hi, Atsushi
Atsushi SAKAI wrote:
> I have two questions about this.
>
> 1)How to reproduce your deadlock ?
> Would you give me your test environment to reproduce this deadlock?
> Is it easily reproduced by running xenmon.py or xentrace
> with one or two guest domain(s)?
> or Any additional condition needed?
This deadlock can be easily reproduced by running xenmon.py without
guest domain.
Furthermore, this deadlock occurs easier by applying my patch to xenbaked.c.
Subject of my patch is "[PATCH] Fix access to trace buffer after
xentrace changes".
> 2)About fixing code,
> I think __trace_var() should fix for this issue not schedule()
> This issue cannot be fixed by modify the __trace_var()?
Thanks for your advise.
I agree with you.
I fixed this deadlock using tasklet in trace.c.
Here is the patch.
Thanks,
Naoki Nishiguchi
[-- Attachment #2: trace.patch --]
[-- Type: text/plain, Size: 890 bytes --]
diff -r 77dec8732cde xen/common/trace.c
--- a/xen/common/trace.c Wed Apr 23 16:58:44 2008 +0100
+++ b/xen/common/trace.c Thu Apr 24 15:56:37 2008 +0900
@@ -69,6 +69,13 @@ static cpumask_t tb_cpu_mask = CPU_MASK_
/* which tracing events are enabled */
static u32 tb_event_mask = TRC_ALL;
+static void trace_notify_guest(unsigned long unused)
+{
+ send_guest_global_virq(dom0, VIRQ_TBUF);
+}
+
+static DECLARE_TASKLET(trace_tasklet, trace_notify_guest, 0);
+
/**
* alloc_trace_bufs - performs initialization of the per-cpu trace buffers.
*
@@ -506,7 +513,7 @@ void __trace_var(u32 event, int cycles,
/* Notify trace buffer consumer that we've crossed the high water mark. */
if ( started_below_highwater &&
(calc_unconsumed_bytes(buf) >= t_buf_highwater) )
- send_guest_global_virq(dom0, VIRQ_TBUF);
+ tasklet_schedule(&trace_tasklet);
}
/*
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH] Fix deadlock in schedule.c at TRACE mode
2008-04-24 7:03 ` NISHIGUCHI Naoki
@ 2008-04-24 7:29 ` Keir Fraser
0 siblings, 0 replies; 4+ messages in thread
From: Keir Fraser @ 2008-04-24 7:29 UTC (permalink / raw)
To: NISHIGUCHI Naoki, Atsushi SAKAI, xen-devel
On 24/4/08 08:03, "NISHIGUCHI Naoki" <nisiguti@jp.fujitsu.com> wrote:
>> 2)About fixing code,
>> I think __trace_var() should fix for this issue not schedule()
>> This issue cannot be fixed by modify the __trace_var()?
>
> Thanks for your advise.
> I agree with you.
> I fixed this deadlock using tasklet in trace.c.
>
> Here is the patch.
Thanks. This was my bad as I removed the old softirq and couldn't see a
reason not to call wake() directly. I'll add a comment when I apply your
patch.
-- Keir
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2008-04-24 7:29 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-24 4:34 [PATCH] Fix deadlock in schedule.c at TRACE mode NISHIGUCHI Naoki
2008-04-24 5:42 ` Atsushi SAKAI
2008-04-24 7:03 ` NISHIGUCHI Naoki
2008-04-24 7:29 ` Keir Fraser
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.