* [Qemu-devel] irq problems after live migration with 0.12.4
@ 2010-05-23 9:55 Peter Lieven
2010-05-23 10:38 ` [Qemu-devel] " Michael Tokarev
0 siblings, 1 reply; 6+ messages in thread
From: Peter Lieven @ 2010-05-23 9:55 UTC (permalink / raw)
To: qemu-devel, kvm
Hi,
after live migrating Ubuntu 9.10 server (2.6.31-14-server) and SUSE Linux 10.1 (2.6.16.13-4-smp)
guests, it sometimes happens that the guest runs into IRQ problems. I mention these two guest OSes
because I have seen the error there; there are likely others affected by the same problem.
On the host I run 2.6.33.3 (kernel+modules) and qemu-kvm 0.12.4.
I started the VM with:
/usr/bin/qemu-kvm-0.12.4 -net tap,vlan=141,script=no,downscript=no,ifname=tap0 -net nic,vlan=141,model=e1000,macaddr=52:54:00:ff:00:72 -drive file=/dev/sdb,if=ide,boot=on,cache=none,aio=native -m 1024 -cpu qemu64,model_id='Intel(R) Xeon(R) CPU E5430 @ 2.66GHz' -monitor tcp:0:4001,server,nowait -vnc :1 -name 'migration-test-9-10' -boot order=dc,menu=on -k de -incoming tcp:172.21.55.22:5001 -pidfile /var/run/qemu/vm-155.pid -mem-path /hugepages -mem-prealloc -rtc base=utc,clock=host -usb -usbdevice tablet
For testing I have a clean Ubuntu 9.10 server 64-bit install and created a small script which fetches a DVD ISO from a local server and checks its md5sum in an endless loop.
The download performance is approx. 50 MB/s in that VM.
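The test load was essentially a loop like the following sketch (the URL and checksum are placeholders, not the actual values used):

```shell
#!/bin/sh
# Download-and-verify loop to keep steady network and disk load on the
# guest. URL and EXPECTED_MD5 are hypothetical placeholders.
URL="${URL:-http://server.example/dvd.iso}"
EXPECTED_MD5="${EXPECTED_MD5:-0123456789abcdef0123456789abcdef}"

check_once() {
    wget -q -O /tmp/dvd.iso "$URL" || return 1
    sum=$(md5sum /tmp/dvd.iso | awk '{print $1}')
    [ "$sum" = "$EXPECTED_MD5" ] || { echo "md5 mismatch: $sum" >&2; return 1; }
}

# Run the endless loop only when invoked with "run", so the function can
# be sourced or tested without looping forever.
if [ "${1:-}" = "run" ]; then
    while :; do check_once; done
fi
```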
To trigger the error I did several migrations of the VM over the last days. Finally I ended up with the following oops in the guest:
[64442.298521] irq 10: nobody cared (try booting with the "irqpoll" option)
[64442.299175] Pid: 0, comm: swapper Not tainted 2.6.31-14-server #48-Ubuntu
[64442.299179] Call Trace:
[64442.299185] <IRQ> [<ffffffff810b4b96>] __report_bad_irq+0x26/0xa0
[64442.299227] [<ffffffff810b4d9c>] note_interrupt+0x18c/0x1d0
[64442.299232] [<ffffffff810b5415>] handle_fasteoi_irq+0xd5/0x100
[64442.299244] [<ffffffff81014bdd>] handle_irq+0x1d/0x30
[64442.299246] [<ffffffff810140b7>] do_IRQ+0x67/0xe0
[64442.299249] [<ffffffff810129d3>] ret_from_intr+0x0/0x11
[64442.299266] [<ffffffff810b3234>] ? handle_IRQ_event+0x24/0x160
[64442.299269] [<ffffffff810b529f>] ? handle_edge_irq+0xcf/0x170
[64442.299271] [<ffffffff81014bdd>] ? handle_irq+0x1d/0x30
[64442.299273] [<ffffffff810140b7>] ? do_IRQ+0x67/0xe0
[64442.299275] [<ffffffff810129d3>] ? ret_from_intr+0x0/0x11
[64442.299290] [<ffffffff81526b14>] ? _spin_unlock_irqrestore+0x14/0x20
[64442.299302] [<ffffffff8133257c>] ? scsi_dispatch_cmd+0x16c/0x2d0
[64442.299307] [<ffffffff8133963a>] ? scsi_request_fn+0x3aa/0x500
[64442.299322] [<ffffffff8125fafc>] ? __blk_run_queue+0x6c/0x150
[64442.299324] [<ffffffff8125fcbb>] ? blk_run_queue+0x2b/0x50
[64442.299327] [<ffffffff8133899f>] ? scsi_run_queue+0xcf/0x2a0
[64442.299336] [<ffffffff81339a0d>] ? scsi_next_command+0x3d/0x60
[64442.299338] [<ffffffff8133a21b>] ? scsi_end_request+0xab/0xb0
[64442.299340] [<ffffffff8133a50e>] ? scsi_io_completion+0x9e/0x4d0
[64442.299348] [<ffffffff81036419>] ? default_spin_lock_flags+0x9/0x10
[64442.299351] [<ffffffff8133224d>] ? scsi_finish_command+0xbd/0x130
[64442.299353] [<ffffffff8133aa95>] ? scsi_softirq_done+0x145/0x170
[64442.299356] [<ffffffff81264e6d>] ? blk_done_softirq+0x7d/0x90
[64442.299368] [<ffffffff810651fd>] ? __do_softirq+0xbd/0x200
[64442.299370] [<ffffffff810131ac>] ? call_softirq+0x1c/0x30
[64442.299372] [<ffffffff81014b85>] ? do_softirq+0x55/0x90
[64442.299374] [<ffffffff81064f65>] ? irq_exit+0x85/0x90
[64442.299376] [<ffffffff810140c0>] ? do_IRQ+0x70/0xe0
[64442.299379] [<ffffffff810129d3>] ? ret_from_intr+0x0/0x11
[64442.299380] <EOI> [<ffffffff810356f6>] ? native_safe_halt+0x6/0x10
[64442.299390] [<ffffffff8101a20c>] ? default_idle+0x4c/0xe0
[64442.299395] [<ffffffff815298f5>] ? atomic_notifier_call_chain+0x15/0x20
[64442.299398] [<ffffffff81010e02>] ? cpu_idle+0xb2/0x100
[64442.299406] [<ffffffff815123c6>] ? rest_init+0x66/0x70
[64442.299424] [<ffffffff81838047>] ? start_kernel+0x352/0x35b
[64442.299427] [<ffffffff8183759a>] ? x86_64_start_reservations+0x125/0x129
[64442.299429] [<ffffffff81837698>] ? x86_64_start_kernel+0xfa/0x109
[64442.299433] handlers:
[64442.299840] [<ffffffffa0000b80>] (e1000_intr+0x0/0x190 [e1000])
[64442.300046] Disabling IRQ #10
After this the guest is still alive, but download performance is down to approx. 500 KB/s.
This error is definitely not triggerable with the option -no-kvm-irqchip. I have seen it occasionally
since my first experiments with qemu-kvm-88, and also without hugetlbfs.
Help appreciated.
BR,
Peter
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Qemu-devel] Re: irq problems after live migration with 0.12.4
2010-05-23 9:55 [Qemu-devel] irq problems after live migration with 0.12.4 Peter Lieven
@ 2010-05-23 10:38 ` Michael Tokarev
2010-05-23 10:44 ` Peter Lieven
2010-05-25 11:03 ` Peter Lieven
0 siblings, 2 replies; 6+ messages in thread
From: Michael Tokarev @ 2010-05-23 10:38 UTC (permalink / raw)
To: Peter Lieven; +Cc: qemu-devel, kvm
23.05.2010 13:55, Peter Lieven wrote:
> Hi,
>
> after live migrating Ubuntu 9.10 server (2.6.31-14-server) and SUSE Linux 10.1 (2.6.16.13-4-smp)
> guests, it sometimes happens that the guest runs into IRQ problems.
> []
> [64442.298521] irq 10: nobody cared (try booting with the "irqpoll" option)
> []
> [64442.299433] handlers:
> [64442.299840] [<ffffffffa0000b80>] (e1000_intr+0x0/0x190 [e1000])
> [64442.300046] Disabling IRQ #10
See also LP bug #584131 (https://bugs.launchpad.net/bugs/584131)
and original Debian bug#580649 (http://bugs.debian.org/580649)
Not sure if they're related...
/mjt
* Re: [Qemu-devel] Re: irq problems after live migration with 0.12.4
2010-05-23 10:38 ` [Qemu-devel] " Michael Tokarev
@ 2010-05-23 10:44 ` Peter Lieven
2010-05-25 11:03 ` Peter Lieven
1 sibling, 0 replies; 6+ messages in thread
From: Peter Lieven @ 2010-05-23 10:44 UTC (permalink / raw)
To: Michael Tokarev; +Cc: qemu-devel, kvm
On 23.05.2010 at 12:38, Michael Tokarev wrote:
> 23.05.2010 13:55, Peter Lieven wrote:
>> Hi,
>> []
>> [64442.298521] irq 10: nobody cared (try booting with the "irqpoll" option)
>> []
>> [64442.299433] handlers:
>> [64442.299840] [<ffffffffa0000b80>] (e1000_intr+0x0/0x190 [e1000])
>> [64442.300046] Disabling IRQ #10
>
> See also LP bug #584131 (https://bugs.launchpad.net/bugs/584131)
> and original Debian bug#580649 (http://bugs.debian.org/580649)
>
> Not sure if they're related...
>
> /mjt
>
>
Hi, thanks for the pointers.
I have seen them. The reporters of those bugs think the problem is
caused by the virtio subsystem; at least the Debian bug reporter
says it does not occur with virtio disabled.
Here no virtio is involved, but of course the cause could be the same.
I have a test platform here and I'm willing to make any modifications
to the host kernel, kvm-kmod, qemu-kvm, or the guest kernel to debug the problem.
Peter
* Re: [Qemu-devel] Re: irq problems after live migration with 0.12.4
2010-05-23 10:38 ` [Qemu-devel] " Michael Tokarev
2010-05-23 10:44 ` Peter Lieven
@ 2010-05-25 11:03 ` Peter Lieven
2010-05-25 19:27 ` Michael Tokarev
1 sibling, 1 reply; 6+ messages in thread
From: Peter Lieven @ 2010-05-25 11:03 UTC (permalink / raw)
To: Michael Tokarev; +Cc: qemu-devel, kvm
Michael Tokarev wrote:
> 23.05.2010 13:55, Peter Lieven wrote:
>> Hi,
>> []
>> [64442.298521] irq 10: nobody cared (try booting with the "irqpoll" option)
>> []
>> [64442.299433] handlers:
>> [64442.299840] [<ffffffffa0000b80>] (e1000_intr+0x0/0x190 [e1000])
>> [64442.300046] Disabling IRQ #10
>
> See also LP bug #584131 (https://bugs.launchpad.net/bugs/584131)
> and original Debian bug#580649 (http://bugs.debian.org/580649)
>
> Not sure if they're related...
>
> /mjt
Michael, do you have any ideas what I could do to debug what's happening?
Looking at the Launchpad and Debian bug trackers, I found other bugs
with possibly related problems, so this issue might be more widespread...
Thanks
Peter
* Re: [Qemu-devel] Re: irq problems after live migration with 0.12.4
2010-05-25 11:03 ` Peter Lieven
@ 2010-05-25 19:27 ` Michael Tokarev
2010-05-26 11:47 ` Peter Lieven
0 siblings, 1 reply; 6+ messages in thread
From: Michael Tokarev @ 2010-05-25 19:27 UTC (permalink / raw)
To: Peter Lieven; +Cc: qemu-devel, kvm
25.05.2010 15:03, Peter Lieven wrote:
> Michael Tokarev wrote:
>> 23.05.2010 13:55, Peter Lieven wrote:
[]
>>> [64442.298521] irq 10: nobody cared (try booting with the "irqpoll" option)
[]
>>> [64442.299433] handlers:
>>> [64442.299840] [<ffffffffa0000b80>] (e1000_intr+0x0/0x190 [e1000])
>>> [64442.300046] Disabling IRQ #10
Apparently, for some reason, e1000_intr decided the IRQ was not
interesting, or somehow wrong, or not meant for that NIC. I don't
know. But something fishy is going on with IRQs here.
>> See also LP bug #584131 (https://bugs.launchpad.net/bugs/584131)
>> and original Debian bug#580649 (http://bugs.debian.org/580649)
>> Not sure if they're related...
It looks like they are actually the same thing, just occurring with
different devices and/or IRQs: a spurious, unwanted, or otherwise
unrecognized IRQ that the handler does not claim, which results in
the kernel disabling that IRQ line. That is a bad thing (in your
case the guest keeps working because e1000 can operate in two
modes, interrupts and polling).
> michael, do you have any ideas what i got do to debug whats happening?
Unfortunately, no idea. I know neither the kernel nor the kvm
internals.
> looking at launchpad and debian bug tracker i found other bugs also
> with a maybe related problem. so this issue might be greater...
Can you share your findings? I don't know of other Debian bugs
similar to this one.
Thanks!
/mjt
* Re: [Qemu-devel] Re: irq problems after live migration with 0.12.4
2010-05-25 19:27 ` Michael Tokarev
@ 2010-05-26 11:47 ` Peter Lieven
0 siblings, 0 replies; 6+ messages in thread
From: Peter Lieven @ 2010-05-26 11:47 UTC (permalink / raw)
To: Michael Tokarev; +Cc: qemu-devel, kvm
Michael Tokarev wrote:
> 25.05.2010 15:03, Peter Lieven wrote:
>> Michael Tokarev wrote:
>>> 23.05.2010 13:55, Peter Lieven wrote:
> []
>>>> [64442.298521] irq 10: nobody cared (try booting with the "irqpoll"
>>>> option)
> []
>>>> [64442.299433] handlers:
>>>> [64442.299840] [<ffffffffa0000b80>] (e1000_intr+0x0/0x190 [e1000])
>>>> [64442.300046] Disabling IRQ #10
>
> Apparently, for some reason, e1000_intr decided the IRQ was not
> interesting, or somehow wrong, or not meant for that NIC. I don't
> know. But something fishy is going on with IRQs here.
>
>>> See also LP bug #584131 (https://bugs.launchpad.net/bugs/584131)
>>> and original Debian bug#580649 (http://bugs.debian.org/580649)
>
>>> Not sure if they're related...
>
> It looks like they are actually the same thing, just occurring with
> different devices and/or IRQs: a spurious, unwanted, or otherwise
> unrecognized IRQ that the handler does not claim, which results in
> the kernel disabling that IRQ line. That is a bad thing (in your
> case the guest keeps working because e1000 can operate in two
> modes, interrupts and polling).
>
>> michael, do you have any ideas what i got do to debug whats happening?
>
> Unfortunately, no idea. I know neither the kernel nor the kvm
> internals.
I would be very grateful if someone with deeper knowledge of the
internals would chime in. I'm not familiar with them either, unfortunately.
>
>> looking at launchpad and debian bug tracker i found other bugs also
>> with a maybe related problem. so this issue might be greater...
>
> Can you share your findings? I don't know of other Debian bugs
> similar to this one.
I suspect that other reports of VMs crashing after migration might be
related. If I take my test VM, with which I can trigger the bug, and
change the network adapter from e1000 to rtl8139, leaving everything
else untouched, the VM hangs at 100% CPU.
>
> Thanks!
>
> /mjt
>
>
end of thread, other threads: [~2010-05-26 11:47 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
2010-05-23 9:55 [Qemu-devel] irq problems after live migration with 0.12.4 Peter Lieven
2010-05-23 10:38 ` [Qemu-devel] " Michael Tokarev
2010-05-23 10:44 ` Peter Lieven
2010-05-25 11:03 ` Peter Lieven
2010-05-25 19:27 ` Michael Tokarev
2010-05-26 11:47 ` Peter Lieven