From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=43036 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1OGrum-00087T-Hs for qemu-devel@nongnu.org; Tue, 25 May 2010 07:03:17 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1OGrul-0007vH-1n for qemu-devel@nongnu.org; Tue, 25 May 2010 07:03:16 -0400
Received: from zion.dlh.net ([91.198.192.1]:33538 helo=mail.dlh.net) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OGruk-0007v6-MU for qemu-devel@nongnu.org; Tue, 25 May 2010 07:03:15 -0400
Message-ID: <4BFBAE6D.4010507@dlh.net>
Date: Tue, 25 May 2010 13:03:09 +0200
From: Peter Lieven
MIME-Version: 1.0
Subject: Re: [Qemu-devel] Re: irq problems after live migration with 0.12.4
References: <4BF905B7.2040003@msgid.tls.msk.ru>
In-Reply-To: <4BF905B7.2040003@msgid.tls.msk.ru>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
To: Michael Tokarev
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org

Michael Tokarev wrote:
> 23.05.2010 13:55, Peter Lieven wrote:
>> Hi,
>>
>> after live migrating ubuntu 9.10 server (2.6.31-14-server) and suse
>> linux 10.1 (2.6.16.13-4-smp)
>> it happens sometimes that the guest runs into irq problems. i mention
>> these 2 guest oss
>> since i have seen the error there. there are likely others around
>> with the same problem.
>>
>> on the host i run 2.6.33.3 (kernel+mod) and qemu-kvm 0.12.4.
>>
>> i started a vm with:
>> /usr/bin/qemu-kvm-0.12.4 -net
>> tap,vlan=141,script=no,downscript=no,ifname=tap0 -net
>> nic,vlan=141,model=e1000,macaddr=52:54:00:ff:00:72 -drive
>> file=/dev/sdb,if=ide,boot=on,cache=none,aio=native -m 1024 -cpu
>> qemu64,model_id='Intel(R) Xeon(R) CPU E5430 @ 2.66GHz'
>> -monitor tcp:0:4001,server,nowait -vnc :1 -name
>> 'migration-test-9-10' -boot order=dc,menu=on -k de -incoming
>> tcp:172.21.55.22:5001 -pidfile /var/run/qemu/vm-155.pid -mem-path
>> /hugepages -mem-prealloc -rtc base=utc,clock=host -usb -usbdevice
>> tablet
>>
>> for testing i have a clean ubuntu 9.10 server 64-bit install and
>> created a small script with fetches a dvd iso from a local server and
>> checking md5sum in an endless loop.
>>
>> the download performance is approx. 50MB/s on that vm.
>>
>> to trigger the error i did several migrations of the vm throughout
>> the last days. finally I ended up in the following oops in the guest:
>>
>> [64442.298521] irq 10: nobody cared (try booting with the "irqpoll"
>> option)
>> [64442.299175] Pid: 0, comm: swapper Not tainted 2.6.31-14-server
>> #48-Ubuntu
>> [64442.299179] Call Trace:
>> [64442.299185] [] __report_bad_irq+0x26/0xa0
>> [64442.299227] [] note_interrupt+0x18c/0x1d0
>> [64442.299232] [] handle_fasteoi_irq+0xd5/0x100
>> [64442.299244] [] handle_irq+0x1d/0x30
>> [64442.299246] [] do_IRQ+0x67/0xe0
>> [64442.299249] [] ret_from_intr+0x0/0x11
>> [64442.299266] [] ? handle_IRQ_event+0x24/0x160
>> [64442.299269] [] ? handle_edge_irq+0xcf/0x170
>> [64442.299271] [] ? handle_irq+0x1d/0x30
>> [64442.299273] [] ? do_IRQ+0x67/0xe0
>> [64442.299275] [] ? ret_from_intr+0x0/0x11
>> [64442.299290] [] ? _spin_unlock_irqrestore+0x14/0x20
>> [64442.299302] [] ? scsi_dispatch_cmd+0x16c/0x2d0
>> [64442.299307] [] ? scsi_request_fn+0x3aa/0x500
>> [64442.299322] [] ? __blk_run_queue+0x6c/0x150
>> [64442.299324] [] ? blk_run_queue+0x2b/0x50
>> [64442.299327] [] ? scsi_run_queue+0xcf/0x2a0
>> [64442.299336] [] ?
scsi_next_command+0x3d/0x60
>> [64442.299338] [] ? scsi_end_request+0xab/0xb0
>> [64442.299340] [] ? scsi_io_completion+0x9e/0x4d0
>> [64442.299348] [] ? default_spin_lock_flags+0x9/0x10
>> [64442.299351] [] ? scsi_finish_command+0xbd/0x130
>> [64442.299353] [] ? scsi_softirq_done+0x145/0x170
>> [64442.299356] [] ? blk_done_softirq+0x7d/0x90
>> [64442.299368] [] ? __do_softirq+0xbd/0x200
>> [64442.299370] [] ? call_softirq+0x1c/0x30
>> [64442.299372] [] ? do_softirq+0x55/0x90
>> [64442.299374] [] ? irq_exit+0x85/0x90
>> [64442.299376] [] ? do_IRQ+0x70/0xe0
>> [64442.299379] [] ? ret_from_intr+0x0/0x11
>> [64442.299380] [] ? native_safe_halt+0x6/0x10
>> [64442.299390] [] ? default_idle+0x4c/0xe0
>> [64442.299395] [] ?
>> atomic_notifier_call_chain+0x15/0x20
>> [64442.299398] [] ? cpu_idle+0xb2/0x100
>> [64442.299406] [] ? rest_init+0x66/0x70
>> [64442.299424] [] ? start_kernel+0x352/0x35b
>> [64442.299427] [] ?
>> x86_64_start_reservations+0x125/0x129
>> [64442.299429] [] ? x86_64_start_kernel+0xfa/0x109
>> [64442.299433] handlers:
>> [64442.299840] [] (e1000_intr+0x0/0x190 [e1000])
>> [64442.300046] Disabling IRQ #10
>
> See also LP bug #584131 (https://bugs.launchpad.net/bugs/584131)
> and original Debian bug#580649 (http://bugs.debian.org/580649)
>
> Not sure if they're related...
>
> /mjt

michael,

do you have any ideas what i can do to debug what's happening?
looking at the launchpad and debian bug trackers i found other bugs
with a possibly related problem, so this issue might be more widespread...

thanks
peter