* [PATCH] x86: fix NULL function call in timer_softirq_action()
@ 2008-04-22 2:42 NISHIGUCHI Naoki
2008-04-22 10:29 ` Keir Fraser
0 siblings, 1 reply; 2+ messages in thread
From: NISHIGUCHI Naoki @ 2008-04-22 2:42 UTC (permalink / raw)
To: xen-devel
[-- Attachment #1: Type: text/plain, Size: 1053 bytes --]
Hi,
On an SMP machine with VT-d enabled, starting an HVM guest that has an
assigned device, such as "pci = ['01:00.0']", sometimes causes a panic.
The panic occurs because timer_softirq_action() calls a NULL function
pointer. The attached patch fixes this problem.
The cause of the panic is find_first_bit() in vmx_dirq_assist().
In find_first_bit() (__find_first_bit), the "repe; scas" instruction
and the "bsf" instruction each read the bitmap from memory. If
clear_bit() clears a bit of the bitmap between these two instructions,
so that the word read by "bsf" has become zero, the eax register's value
is zero after "bsf" executes. As a result, find_first_bit() returns 0,
64, 128 or 192 (on x86_64), even though that bit is not set.
In this case, vmx_dirq_assist() calls set_timer() for a bit that is not
set. If the hvm_timer (timer structure) for that bit has not been
initialized, timer_softirq_action() will call address zero.
clear_bit() is called from pt_irq_time_out() on another CPU only on an
SMP machine with VT-d enabled.
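The property the scan needs can be sketched in portable C, as an illustration only. This is not the Xen implementation: find_first_bit_sketch is a hypothetical name, and __builtin_ctzl is a GCC/Clang builtin assumed to be available. Instead of restarting the scan as the patch does, the sketch reads each word once and uses that single snapshot for both the zero test and the bit scan, so an index is never computed from a word that has become zero.

```c
#include <limits.h>

#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Hypothetical portable sketch, not the Xen code.
 * Returns the index of the first set bit, or `size` if no bit is set.
 */
static unsigned int find_first_bit_sketch(const volatile unsigned long *addr,
                                          unsigned int size)
{
    unsigned int nwords = (size + BITS_PER_LONG - 1) / BITS_PER_LONG;

    for (unsigned int i = 0; i < nwords; i++) {
        /* Read the word once; the test and the bit scan share this value. */
        unsigned long word = addr[i];

        if (word == 0)
            continue; /* empty, or already cleared by another CPU */

        unsigned int bit = i * BITS_PER_LONG
                           + (unsigned int)__builtin_ctzl(word);
        return bit < size ? bit : size;
    }
    return size;
}
```

A caller still has to cope with the bit being cleared after the function returns; the sketch only guarantees that the returned bit was set at the moment its word was read.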
Signed-off-by: Naoki Nishiguchi <nisiguti@jp.fujitsu.com>
Regards,
Naoki Nishiguchi
[-- Attachment #2: bitops.patch --]
[-- Type: text/plain, Size: 876 bytes --]
diff -r 08e010c3f251 xen/arch/x86/bitops.c
--- a/xen/arch/x86/bitops.c Tue Apr 15 16:39:00 2008 +0100
+++ b/xen/arch/x86/bitops.c Wed Apr 16 09:38:06 2008 +0900
@@ -8,12 +8,15 @@ unsigned int __find_first_bit(
unsigned long d0, d1, res;
asm volatile (
- " xor %%eax,%%eax\n\t" /* also ensures ZF==1 if size==0 */
+ "1: xor %%eax,%%eax\n\t" /* also ensures ZF==1 if size==0 */
" repe; scas"__OS"\n\t"
- " je 1f\n\t"
+ " je 2f\n\t"
" lea -"STR(BITS_PER_LONG/8)"(%2),%2\n\t"
- " bsf (%2),%0\n"
- "1: sub %%ebx,%%edi\n\t"
+ " bsf (%2),%0\n\t"
+ " jnz 2f\n\t"
+ " lea "STR(BITS_PER_LONG/8)"(%2),%2\n\t"
+ " jmp 1b\n\t"
+ "2: sub %%ebx,%%edi\n\t"
" shl $3,%%edi\n\t"
" add %%edi,%%eax"
: "=&a" (res), "=&c" (d0), "=&D" (d1)
* Re: [PATCH] x86: fix NULL function call in timer_softirq_action()
2008-04-22 2:42 [PATCH] x86: fix NULL function call in timer_softirq_action() NISHIGUCHI Naoki
@ 2008-04-22 10:29 ` Keir Fraser
0 siblings, 0 replies; 2+ messages in thread
From: Keir Fraser @ 2008-04-22 10:29 UTC (permalink / raw)
To: NISHIGUCHI Naoki, xen-devel
On 22/4/08 03:42, "NISHIGUCHI Naoki" <nisiguti@jp.fujitsu.com> wrote:
> This panic's cause was find_first_bit() in vmx_dirq_assist().
> In find_first_bit(__find_first_bit) function, "repe; scas" instruction
> and "bsf" instruction refer addresses of a bitmap. If clear_bit() is
> called to clear a bit of the bitmap between above instructions, eax
> register's value is zero after execution of "bsf" instruction. As a
> result, the return value of find_first_bit() will be 0, 64, 128 or
> 192(on x86_64 arch).
> In this case, vmx_dirq_assist() calls set_timer() about the bit not to
> be set. If hvm_timer(timer structure) about the bit is not initialized,
> timer_softirq_action() will call zero address.
Good catch. Actually our usage of BSF is generally bad because Intel does
not guarantee the contents of the destination register when the source is
zero (we are currently assuming the destination is left intact in that
case). I will fix that up too.
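As a rough illustration of the zero-source point, a bit-scan wrapper can test for zero itself rather than rely on what the instruction leaves in the destination. The helper name first_set_bit_or_neg1 is hypothetical, and __builtin_ctzl is a GCC/Clang builtin whose result is likewise undefined for a zero argument, so the same guard applies to it.

```c
/*
 * Hypothetical helper, not Xen code: returns the index of the lowest
 * set bit in `word`, or -1 when `word` is zero.
 */
static int first_set_bit_or_neg1(unsigned long word)
{
    /* Never hand a zero source to the bit scan; its result is undefined. */
    if (word == 0)
        return -1;

    return __builtin_ctzl(word);
}
```

Returning an explicit sentinel makes the "no bit found" case visible to the caller instead of leaving it encoded in a stale register value.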
Also I think vmx_dirq_assist() is broken because it assumes that it 'owns'
the bit it finds. But actually two VCPUs can race to clear_bit() and both
increment, for example, the mirq[irq].pending field. I think the loop body
should start with if ( !test_and_clear_bit(...) ) continue. I will make that
change also.
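The suggested loop shape might look like the sketch below. It is an assumption-laden illustration, not the actual vmx_dirq_assist() change: test_and_clear_bit_sketch and drain_pending_sketch are hypothetical names, the atomic operation is built on C11 <stdatomic.h> rather than Xen's bitops, and a counter stands in for the per-IRQ work.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical stand-in for test_and_clear_bit(), using C11 atomics. */
static bool test_and_clear_bit_sketch(unsigned int nr,
                                      _Atomic unsigned long *addr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_LONG);
    /* Atomically clear the bit and fetch the word's previous value. */
    unsigned long old = atomic_fetch_and(&addr[nr / BITS_PER_LONG], ~mask);

    return (old & mask) != 0;
}

/*
 * Hypothetical loop: each pending bit is claimed by exactly one caller.
 * Returns how many bits this caller claimed.
 */
static unsigned int drain_pending_sketch(_Atomic unsigned long *map,
                                         unsigned int nbits)
{
    unsigned int handled = 0;

    for (unsigned int irq = 0; irq < nbits; irq++) {
        if (!test_and_clear_bit_sketch(irq, map))
            continue; /* not set, or another CPU claimed it first */

        handled++;    /* stand-in for the real per-IRQ work */
    }
    return handled;
}
```

Because the test and the clear are one atomic step, two CPUs racing on the same bit cannot both see it set, so the per-IRQ work runs once per pending bit.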
-- Keir