From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:36198) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z9bCe-0002gh-NC for qemu-devel@nongnu.org; Mon, 29 Jun 2015 11:42:37 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Z9bCd-0001ax-Ls for qemu-devel@nongnu.org; Mon, 29 Jun 2015 11:42:36 -0400 Message-ID: <55916763.5000209@redhat.com> Date: Mon, 29 Jun 2015 11:42:27 -0400 From: John Snow MIME-Version: 1.0 References: <1435018875-22527-1-git-send-email-jsnow@redhat.com> <1435018875-22527-7-git-send-email-jsnow@redhat.com> <20150626153514.GC31186@stefanha-thinkpad.redhat.com> <558D999B.9090001@redhat.com> <20150629142446.GH32151@stefanha-thinkpad.redhat.com> In-Reply-To: <20150629142446.GH32151@stefanha-thinkpad.redhat.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [Qemu-block] [PATCH 06/16] ahci: record ncq failures List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Stefan Hajnoczi Cc: kwolf@redhat.com, pbonzini@redhat.com, qemu-block@nongnu.org, qemu-devel@nongnu.org, Stefan Hajnoczi On 06/29/2015 10:24 AM, Stefan Hajnoczi wrote: > On Fri, Jun 26, 2015 at 02:27:39PM -0400, John Snow wrote: >> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 >>=20 >>=20 >>=20 >> On 06/26/2015 11:35 AM, Stefan Hajnoczi wrote: >>> On Mon, Jun 22, 2015 at 08:21:05PM -0400, John Snow wrote: >>>> Handle NCQ failures for cases where we want to halt the VM on >>>> IO errors. >>>>=20 >>>> Signed-off-by: John Snow --- >>>> hw/ide/ahci.c | 17 +++++++++++++++-- hw/ide/ahci.h | 1 + >>>> hw/ide/internal.h | 1 + 3 files changed, 17 insertions(+), 2 >>>> deletions(-) >>>>=20 >>>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index >>>> 71b5085..a838317 100644 --- a/hw/ide/ahci.c +++ >>>> b/hw/ide/ahci.c @@ -959,13 +959,25 @@ static void ncq_cb(void >>>> *opaque, int ret) return; } >>>>=20 >>>> + ncq_tfs->halt =3D false; >>>=20 >>> Why does halt need to be cleared here? >>>=20 I remember my thinking now. I didn't want to leave it dangling after a successful command, so I wanted to clear it on the callback. It's harmless, but it's weird to have it set afterwards and I wanted it to be very crystal clear what it meant when it was set. (With this usage case, 'halt' being set ALWAYS means that there's a command to retry -- you don't have to test other variables like 'used' to tell if it's a left over flag.) I actually want to leave it here, now. >>=20 >> Might make more sense to clear it just on the beginning of every=20 >> command, in execute(). >>=20 >> There's no strong reason here other than "If there's an error and >> it should be set, it'll get reset again pretty soon." It's just a >> default state. >>=20 >> I could move it from process to execute. >=20 > By the way, does ->halt need to be cleared in ahci_reset_port()? >=20 > I'm thinking of a scenario where requests have failed and the port > is reset. We should not try to reissue those commands after > reset. >=20 If I keep it like I have it now (where it clears itself after a successful command), we can clear it alongside the 'used' flag to protect against the case where the guest issues a reset almost simultaneously with an errored read command, where it might be that the NCQ command halts the VM, then we try to reset immediately after boot= . (Perilously tangential side-note: what's the default error action if you don't set rerror=3Dstop or werror=3Dstop? the ide code didn't make it particularly clear to me if it was IGNORE or REPORT.)