From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:38398) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z9bHa-0007q5-5R for qemu-devel@nongnu.org; Mon, 29 Jun 2015 11:47:43 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Z9bHZ-0004Fc-7W for qemu-devel@nongnu.org; Mon, 29 Jun 2015 11:47:42 -0400 Message-ID: <55916894.7030501@redhat.com> Date: Mon, 29 Jun 2015 11:47:32 -0400 From: John Snow MIME-Version: 1.0 References: <1435018875-22527-1-git-send-email-jsnow@redhat.com> <1435018875-22527-7-git-send-email-jsnow@redhat.com> <20150626153514.GC31186@stefanha-thinkpad.redhat.com> <558D999B.9090001@redhat.com> <20150629142446.GH32151@stefanha-thinkpad.redhat.com> <55916763.5000209@redhat.com> In-Reply-To: <55916763.5000209@redhat.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [Qemu-block] [PATCH 06/16] ahci: record ncq failures List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Stefan Hajnoczi Cc: kwolf@redhat.com, pbonzini@redhat.com, qemu-block@nongnu.org, qemu-devel@nongnu.org, Stefan Hajnoczi On 06/29/2015 11:42 AM, John Snow wrote: > > > On 06/29/2015 10:24 AM, Stefan Hajnoczi wrote: >> On Fri, Jun 26, 2015 at 02:27:39PM -0400, John Snow wrote: >>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 >>> >>> >>> >>> On 06/26/2015 11:35 AM, Stefan Hajnoczi wrote: >>>> On Mon, Jun 22, 2015 at 08:21:05PM -0400, John Snow wrote: >>>>> Handle NCQ failures for cases where we want to halt the VM on >>>>> IO errors. >>>>> >>>>> Signed-off-by: John Snow --- >>>>> hw/ide/ahci.c | 17 +++++++++++++++-- hw/ide/ahci.h | 1 + >>>>> hw/ide/internal.h | 1 + 3 files changed, 17 insertions(+), 2 >>>>> deletions(-) >>>>> >>>>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index >>>>> 71b5085..a838317 100644 --- a/hw/ide/ahci.c +++ >>>>> b/hw/ide/ahci.c @@ -959,13 +959,25 @@ static void ncq_cb(void >>>>> *opaque, int ret) return; } >>>>> >>>>> + ncq_tfs->halt = false; >>>> >>>> Why does halt need to be cleared here? >>>> > > I remember my thinking now. I didn't want to leave it dangling after a > successful command, so I wanted to clear it on the callback. > > It's harmless, but it's weird to have it set afterwards and I wanted > it to be very crystal clear what it meant when it was set. (With this > usage case, 'halt' being set ALWAYS means that there's a command to > retry -- you don't have to test other variables like 'used' to tell if > it's a left over flag.) > > I actually want to leave it here, now. > I am literally not awake: if I clear it in execute_ncq_command like I offered, it will get cleared on retry and we won't leave it hanging. It takes hitting the 'send' button to realize this. >>> >>> Might make more sense to clear it just on the beginning of every >>> command, in execute(). >>> >>> There's no strong reason here other than "If there's an error and >>> it should be set, it'll get reset again pretty soon." It's just a >>> default state. >>> >>> I could move it from process to execute. >> >> By the way, does ->halt need to be cleared in ahci_reset_port()? >> >> I'm thinking of a scenario where requests have failed and the port >> is reset. We should not try to reissue those commands after >> reset. >> > > If I keep it like I have it now (where it clears itself after a > successful command), we can clear it alongside the 'used' flag to > protect against the case where the guest issues a reset almost > simultaneously with an errored read command, where it might be that > the NCQ command halts the VM, then we try to reset immediately after boot. > > > (Perilously tangential side-note: what's the default error action if > you don't set rerror=stop or werror=stop? the ide code didn't make it > particularly clear to me if it was IGNORE or REPORT.) > Still curious about this bit, though.