From: Tomasz Chmielewski
Subject: I/O errors after migration - why?
Date: Fri, 27 Mar 2009 17:34:28 +0100
Message-ID: <49CD0014.5020503@wpkg.org>
To: "kvm@vger.kernel.org"

I'm trying to perform live migration by following the instructions on
http://www.linux-kvm.org/page/Migration.

Unfortunately, it doesn't work very well - the guest is migrated, but loses
access to its disk.

On the destination host, I'm starting the guest with exactly the same
options as on the source host, plus "-incoming tcp:0:4444".

On the source host, I start the migration with "migrate -d tcp:B:4444".

Both hosts use the same iSCSI device and can access it.

Looks like the destination host can't really access the iSCSI device
after all? No - after I reboot the guest (echo b > /proc/sysrq-trigger),
it boots just fine from its disk. Also, lsof on the host shows that the
kvm process accesses the correct /dev/sdX device.

Both hosts use kvm-84.

This is what the kernel says on the guest after migration:

sd 0:0:0:0: ABORT operation started.
sd 0:0:0:0: ABORT operation timed-out.
sd 0:0:0:0: ABORT operation started.
sd 0:0:0:0: ABORT operation timed-out.
sd 0:0:0:0: ABORT operation started.
sd 0:0:0:0: ABORT operation timed-out.
sd 0:0:0:0: ABORT operation started.
sd 0:0:0:0: ABORT operation timed-out.
sd 0:0:0:0: ABORT operation started.
sd 0:0:0:0: ABORT operation timed-out.
sd 0:0:0:0: DEVICE RESET operation started.
sd 0:0:0:0: DEVICE RESET operation timed-out.
sd 0:0:0:0: BUS RESET operation started.
sym0: suspicious SCSI data while resetting the BUS.
sym0: dp1,d15-8,dp0,d7-0,rst,req,ack,bsy,sel,atn,msg,c/d,i/o = 0x0, expecting 0x100
sd 0:0:0:0: BUS RESET operation timed-out.
sd 0:0:0:0: HOST RESET operation started.
sym0: suspicious SCSI data while resetting the BUS.
sym0: dp1,d15-8,dp0,d7-0,rst,req,ack,bsy,sel,atn,msg,c/d,i/o = 0x0, expecting 0x100
sym0: the chip cannot lock the frequency
sym0: SCSI BUS has been reset.
sd 0:0:0:0: HOST RESET operation timed-out.
sd 0:0:0:0: scsi: Device offlined - not ready after error recovery

(...)

Buffer I/O error on device sda1, logical block 1
lost page write due to I/O error on sda1
sd 0:0:0:0: rejecting I/O to offline device

-- 
Tomasz Chmielewski
http://wpkg.org