From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:36269)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1UhFXq-0005q0-Fy for qemu-devel@nongnu.org;
	Tue, 28 May 2013 04:46:16 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1UhFXp-0000HT-4b for qemu-devel@nongnu.org;
	Tue, 28 May 2013 04:46:14 -0400
Message-ID: <51A46ED1.7060503@suse.de>
Date: Tue, 28 May 2013 10:46:09 +0200
From: Andreas Färber
MIME-Version: 1.0
References: <1369729127-24499-1-git-send-email-afaerber@suse.de>
	<20130528082743.GC2854@dhcp-200-207.str.redhat.com>
In-Reply-To: <20130528082743.GC2854@dhcp-200-207.str.redhat.com>
Subject: Re: [Qemu-devel] [PATCH] ide: Set BSY bit during FLUSH
To: Kevin Wolf
Cc: stefano.stabellini@eu.citrix.com, Stefan Hajnoczi, Heiko Rommel,
	qemu-devel@nongnu.org, Bruce Rogers, "Gonglei (Arei)", Paolo Bonzini,
	qemu-stable

On 28.05.2013 10:27, Kevin Wolf wrote:
> On 28.05.2013 at 10:18, Andreas Färber wrote:
>> The implementation of the ATA FLUSH command invokes a flush at the block
>> layer, which may on raw files on POSIX entail a synchronous fdatasync().
>> This may in some cases take so long that the SLES 11 SP1 guest driver
>> reports I/O errors and filesystems get corrupted or remounted read-only.
>>
>> Avoid this by setting BUSY_STAT, so that the guest is made aware we are
>> in the middle of an operation and no ATA commands are attempted to be
>> processed concurrently.
>>
>> Addresses BNC#637297.
>>
>> Suggested-by: Gonglei (Arei)
>> Signed-off-by: Andreas Färber
>> ---
>>  hw/ide/core.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/hw/ide/core.c b/hw/ide/core.c
>> index c7a8041..bf1ff18 100644
>> --- a/hw/ide/core.c
>> +++ b/hw/ide/core.c
>> @@ -795,6 +795,8 @@ static void ide_flush_cb(void *opaque, int ret)
>>  {
>>      IDEState *s = opaque;
>>
>> +    s->status &= ~BUSY_STAT;
>> +
>
> This part is unnecessary, the status is already reset.

Only in the ret >= 0 case though AFAICS?

>>      if (ret < 0) {
>>          /* XXX: What sector number to set here? */
>>          if (ide_handle_rw_error(s, -ret, BM_STATUS_RETRY_FLUSH)) {
>> @@ -814,6 +816,7 @@ void ide_flush_cache(IDEState *s)
>>          return;
>>      }
>>
>> +    s->status |= BUSY_STAT;
>>      bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
>>      bdrv_aio_flush(s->bs, ide_flush_cb, s);
>>  }
>
> This should fix the bug, however in an one-off way. I was planning to
> fix it by setting BSY for all commands and having an explicit command
> completion everywhere. This part is a mess currently in IDE.

That's a valid idea, but I had backporting to 0.15 in mind. ;)
And doh, I forgot qemu-stable.

> The other part why I haven't sent a fix yet is that I don't have a test
> case for it.

Temporarily add a sleep(31) in qemu_fdatasync()?

I was lazy in testing with -snapshot to not corrupt my disk image, which
would not trigger the same issue since qcow2-backed AFAIU.

> I guess I need to extend blkdebug first before this can be
> reliably tested by qtest.

It can't, since it's not a pure device emulation issue but depends on
the relative timing of filesystem operations and subsequent commands.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg