* issue with PLE and/or scheduler
@ 2011-12-19 23:10 andrew thomas
2011-12-23 2:44 ` Zhang, Xiantao
0 siblings, 1 reply; 2+ messages in thread
From: andrew thomas @ 2011-12-19 23:10 UTC (permalink / raw)
To: xen-devel@lists.xensource.com; +Cc: andrew.thomas
This is with xen-4.1-testing cs 23201:1c89f7d29fbb
and using the default "credit" scheduler.
I've run into an interesting issue with HVM guests that
make use of Pause Loop Exiting (i.e. on Westmere systems,
and also on Romley systems): after yielding the CPU, guests
don't seem to receive timer interrupts correctly.
Some background: for historical reasons (i.e. old templates) we boot
OL/RHEL guests with the following settings:
kernel parameters: clock=pit nohpet nopmtimer
vm.cfg: timer_mode = 2
With PLE enabled, 2.6.32 guests will crash early on with:
..MP-BIOS bug: 8254 timer not connected to IO-APIC
# a few lines omitted..
Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug
2.6.18-238 (i.e. OL/RHEL5u6), by contrast, fails to find the timer but
continues, and then locks up in the serial line initialization:
..MP-BIOS bug: 8254 timer not connected to IO-APIC
# continues until lock up here:
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
Instrumenting the 2.6.32 code (i.e. timer_irq_works()) shows that jiffies
isn't advancing (or that only 1 or 2 ticks are being received, which is
insufficient for "working"). This is on a "quiet" system with no other
activity.
So, even though the guest has voluntarily yielded the CPU (through PLE),
I would still expect it to receive every clock tick (even with
timer_mode=2), as there is no other work to do on the system.
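For readers unfamiliar with the check mentioned above, here is a rough user-space model of it. This is a sketch, not the kernel source: it assumes the guest busy-waits for about ten tick periods with interrupts enabled and then requires jiffies to have advanced by more than four ticks. The names timer_irq_works_model, ticks_after, WAIT_TICKS and MIN_TICKS_SEEN are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed parameters of the boot-time check (illustrative names). */
#define WAIT_TICKS     10UL /* busy-wait window, in tick periods */
#define MIN_TICKS_SEEN 4UL  /* strictly more than this many ticks must arrive */

/* Wrap-safe "a is after b", in the spirit of the kernel's time_after(). */
static bool ticks_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/*
 * Model of the check: jiffies_before is the counter value sampled before
 * the wait, ticks_delivered is how many timer interrupts actually reached
 * the guest during the window.
 */
static bool timer_irq_works_model(unsigned long jiffies_before,
                                  unsigned long ticks_delivered)
{
    unsigned long jiffies_after = jiffies_before + ticks_delivered;

    return ticks_after(jiffies_after, jiffies_before + MIN_TICKS_SEEN);
}

/* Usage example: the cases described in the report. */
static void self_check(void)
{
    assert(timer_irq_works_model(100UL, WAIT_TICKS)); /* all ticks arrive   */
    assert(!timer_irq_works_model(100UL, 2UL));       /* only 1 or 2 ticks  */
    assert(!timer_irq_works_model(100UL, 4UL));       /* exactly threshold  */
    assert(timer_irq_works_model(100UL, 5UL));        /* just above it      */
}
```

Under this model, a guest that sees only one or two ticks during the window fails the check, which matches the panic shown earlier.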
Disabling PLE allows both 2.6.18 and 2.6.32 guests to boot. [As an
aside, so does setting ple_gap to 41 (i.e. as it was prior to
21355:727ccaaa6cce) -- the perf counters show no exits happening,
so this is equivalent to disabling PLE.]
I'm hoping someone who knows the scheduler well will be able to quickly
decide whether this is a bug or a feature...
Andrew
* Re: issue with PLE and/or scheduler
2011-12-19 23:10 issue with PLE and/or scheduler andrew thomas
@ 2011-12-23 2:44 ` Zhang, Xiantao
0 siblings, 0 replies; 2+ messages in thread
From: Zhang, Xiantao @ 2011-12-23 2:44 UTC (permalink / raw)
To: andrew thomas, xen-devel@lists.xensource.com
Andrew,
Can you try this patch to see whether it fixes your issue?
Xiantao
diff -r 381ab77db71a xen/arch/x86/hvm/vpt.c
--- a/xen/arch/x86/hvm/vpt.c Mon Apr 18 10:10:02 2011 +0100
+++ b/xen/arch/x86/hvm/vpt.c Thu Dec 22 11:35:36 2011 +0800
@@ -185,7 +185,7 @@
     list_for_each_entry ( pt, head, list )
     {
-        if ( pt->pending_intr_nr == 0 )
+        if ( pt->pending_intr_nr == 0 && !pt->do_not_freeze)
         {
             pt_process_missed_ticks(pt);
             set_timer(&pt->timer, pt->scheduled);
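
To make the effect of the one-line change easier to see, the condition before and after the patch can be modelled in isolation. This is only a sketch: struct pt_model is a hypothetical stand-in that carries just the two fields the patch tests, and the reading of do_not_freeze as "this timer is left running while the vCPU is descheduled" is an assumption, not something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reduced model; only the fields used by the patch. */
struct pt_model {
    unsigned int pending_intr_nr; /* ticks waiting to be injected */
    bool do_not_freeze;           /* assumed: timer keeps running when descheduled */
};

/* Condition guarding the re-arm before the patch. */
static bool rearm_before_patch(const struct pt_model *pt)
{
    return pt->pending_intr_nr == 0;
}

/* Condition guarding the re-arm after the patch. */
static bool rearm_after_patch(const struct pt_model *pt)
{
    return pt->pending_intr_nr == 0 && !pt->do_not_freeze;
}

/* True when the patch changes what happens for this timer state. */
static bool patch_changes_decision(const struct pt_model *pt)
{
    return rearm_before_patch(pt) != rearm_after_patch(pt);
}

/* Usage example: enumerate the interesting states. */
static void self_check(void)
{
    struct pt_model idle_frozen   = { 0, false };
    struct pt_model idle_unfrozen = { 0, true };
    struct pt_model pending       = { 3, true };

    assert(!patch_changes_decision(&idle_frozen));  /* still re-armed      */
    assert(patch_changes_decision(&idle_unfrozen)); /* no longer re-armed  */
    assert(!patch_changes_decision(&pending));      /* skipped either way  */
}
```

The only state whose handling differs is a timer with no pending ticks and do_not_freeze set: before the patch it is reprocessed and re-armed on restore, after the patch it is left alone.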