From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55313) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XrWWy-00044j-Ty for qemu-devel@nongnu.org; Thu, 20 Nov 2014 13:32:43 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XrWWs-0007BN-Nd for qemu-devel@nongnu.org; Thu, 20 Nov 2014 13:32:36 -0500 From: Don Slutz Message-ID: <546E339E.4030202@terremark.com> Date: Thu, 20 Nov 2014 13:31:58 -0500 MIME-Version: 1.0 References: <1416259239-13281-1-git-send-email-dslutz@verizon.com> <87k32r8jc8.fsf@blackfin.pond.sub.org> In-Reply-To: <87k32r8jc8.fsf@blackfin.pond.sub.org> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [BUGFIX][PATCH for 2.2 1/1] hw/ide/core.c: Prevent SIGSEGV during migration List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Markus Armbruster , Don Slutz Cc: Kevin Wolf , Stefano Stabellini , qemu-devel@nongnu.org, Stefan Hajnoczi , qemu-stable@nongnu.org On 11/19/14 07:29, Markus Armbruster wrote: > Don Slutz writes: > >> The other callers to blk_set_enable_write_cache() in this file >> already check for s->blk == NULL. >> >> Signed-off-by: Don Slutz >> --- >> >> I think this is a bugfix that should be back ported to stable >> releases. >> >> I also think this should be done in xen's copy of QEMU for 4.5 with >> back port(s) to active stable releases. >> >> Note: In 2.1 and earlier the routine is >> bdrv_set_enable_write_cache(); variable is s->bs. > Got a reproducer? yes. Migrating a guest from xen 4.2 or 4.3 to xen 4.4 (or 4.5-unstable) on CentOS 6.3 with xen_emul_unplug=unnecessary and no cdrom defined. > > I'm asking because I believe s->identify_set implies s->blk. > s->identify_set is initialized to zero, and gets set to non-zero exactly > on the first successful IDENTIFY DEVICE or IDENTIFY PACKET DEVICE, in > ide_identify(), ide_atapi_identify() or ide_cfata_identify(), > respectively. Only called via cmd_identify() / cmd_identify_packet() > via ide_exec_cmd(). The latter immediately fails when !s->blk: > > s = idebus_active_if(bus); > /* ignore commands to non existent slave */ > if (s != bus->ifs && !s->blk) { > return; > } I do think that you are right. I have now spent more time on why I am seeing this. > Even if I'm right, your patch is fine, because it makes this spot more > obviously correct, and consistent with the other uses of > blk_set_enable_write_cache(). The case for stable is weak, though. > I had not fully tracked down what is happening before sending the bugfix. I have now done more debugging, and have tracked it down to xen 4.4 now using "-nodefaults" with QEMU. I needed to add output to QEMU to track this down because I have long command lines... (all I get for ps -ef): root 14864 1 82 16:42 ? 00:00:09 /usr/lib/xen/bin/qemu-system-i386 -xen-domid 20 -chardev socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-20,server,nowait -mon chardev=libxl-cmd,mode=control -name C63-min-tools -machine xenfv,vmware_hw=7,xen_platform_pci=240 -hostbridge 82443 -device agp-bridge,id=agp,msi=off,msi-x=off,addr=0x12.0 -monitor pty -monitor vc -boot menu=on -device vmware-pci-bridge,msi=on,msi-x=on,id=pciBridge1.0,addr=0x11.0 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge5.0,multifunction=on,addr=0x15.0 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge5.1,multifunction=on,addr=0x15.1 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge5.2,multifunction=on,addr=0x15.2 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge5.3,multifunction=on,addr=0x15.3 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge5.4,multifunction=on,addr=0x15.4 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge5.5,multifunction=on,addr=0x15.5 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge5.6,multifunction=on,addr=0x15.6 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge5.7,multifunction=on,addr=0x15.7 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge6.0,multifunction=on,addr=0x16.0 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge6.1,multifunction=on,addr=0x16.1 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge6.2,multifunction=on,addr=0x16.2 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge6.3,multifunction=on,addr=0x16.3 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge6.4,multifunction=on,addr=0x16.4 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge6.5,multifunction=on,addr=0x16.5 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge6.6,multifunction=on,addr=0x16.6 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge6.7,multifunction=on,addr=0x16.7 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge7.0,multifunction=on,addr=0x17.0 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge7.1,multifunction=on,addr=0x17.1 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge7.2,multifunction=on,addr=0x17.2 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge7.3,multifunction=on,addr=0x17.3 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge7.4,multifunction=on,addr=0x17.4 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge7.5,multifunction=on,addr=0x17.5 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge7.6,multifunction=on,addr=0x17.6 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge7.7,multifunction=on,addr=0x17.7 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge8.0,multifunction=on,addr=0x18.0 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge8.1,multifunction=on,addr=0x18.1 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge8.2,multifunction=on,addr=0x18.2 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge8.3,multifunction=on,addr=0x18.3 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge8.4,multifunction=on,addr=0x18.4 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge8.5,multifunction=on,addr=0x18.5 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge8.6,multifunction=on,addr=0x18.6 -device vmware-pcie-bridge,msi=on,msi-x=on,id=pciBridge8.7,multifunction=on,addr=0x18.7 -vnc 0.0.0.0:7,to=99 -serial pty -vga vmware -global vmware-svga.vgamem_mb=32 -boot order=cda -device vmxnet3,id=nic0,netdev=net0,mac=00:0c:29:86:44:a0,addr=0x3.0x0 -netdev type=tap,id=net0,ifname=vif20.0-emu,script=no,downscript=no -device e1000_vmw,id=nic1,netdev=net1,mac=00:0c:29:86:44:aa,addr=0x4.0x0 -netdev type=tap,id=net1,ifname=vif20.1-emu,script=no,downscript=no -device vmxnet3,id=nic2,netdev=net2,mac=00:0c:29:86:44:b4,addr=0x5.0x0 -netdev type=tap,id=net2,ifname=vif20.2-emu,script=no,downscript=no -device vmxnet3,id=nic3,netdev=net3,mac=00:0c:29:86:44:be,addr=0x6.0x0 -netdev type=tap,id=net3,ifname=vif20.3-emu,script=no,downscript=no -device vmxnet3,id=nic4,netdev=net4,mac=00:0c:29:86:44:c8,addr=0x8.0x0 -netdev type=tap,id=net4,ifname=vif20.4-emu,script=no,downscript=no -device vmxnet3,id=nic5,netdev=net5,mac=00:0c:29:86:44:d2,addr=0x9.0x0 -netdev t Which is missing that option. The ide that was aborting in this case is the cdrom at hdc that is added if you do not specify "-nodefaults". Since this is a "changed" machine config, I am no longer as sure as what versions this needs to be in. If I put my QEMU hat on, it does not look like a back port is needed. However for xen it would be nice. I do not know how the QEMU community feels about migration from a config without "-nodefaults" to one with "-nodefaults" as the only difference. -Don Slutz >> hw/ide/core.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/hw/ide/core.c b/hw/ide/core.c >> index 00e21cf..d4af5e2 100644 >> --- a/hw/ide/core.c >> +++ b/hw/ide/core.c >> @@ -2401,7 +2401,7 @@ static int ide_drive_post_load(void *opaque, int version_id) >> { >> IDEState *s = opaque; >> >> - if (s->identify_set) { >> + if (s->blk && s->identify_set) { >> blk_set_enable_write_cache(s->blk, !!(s->identify_data[85] & (1 << 5))); >> } >> return 0;