From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: John Snow <jsnow@redhat.com>
Cc: pbonzini@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 2/2] atapi migration: Throw recoverable error to avoid recovery
Date: Wed, 10 Dec 2014 12:05:42 +0000
Message-ID: <20141210120541.GC2311@work-vm>
In-Reply-To: <5487E4C8.4050102@redhat.com>
* John Snow (jsnow@redhat.com) wrote:
>
>
> On 12/09/2014 01:15 PM, Dr. David Alan Gilbert (git) wrote:
> >From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >
> >(With the previous atapi_dma flag recovery)
> >If migration happens between the ATAPI command being written and the
> >bmdma being started, the DMA is dropped. Eventually the guest times
> >out and recovers, but that can take many seconds.
> >(This is rare, on a pingpong reading the CD continuously I hit
> >this about ~1/30-1/50 migrates)
> >
> >I don't think we've got enough state to be able to recover safely
> >at this point, so I throw a 'medium error, no seek complete'
> >that I'm assuming guests will try and recover from an apparently
> >dirty CD.
> >
> >OK, it's a hack, the real solution is probably to push a lot of
> >ATAPI state into the migration stream, but this is a fix that
> >works with no stream changes. Tested only on Linux (both RHEL5
> >(pre-libata) and RHEL7).
> >
> >Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >---
> > hw/ide/atapi.c | 17 +++++++++++++++++
> > hw/ide/internal.h | 2 ++
> > hw/ide/pci.c | 11 +++++++++++
> > 3 files changed, 30 insertions(+)
> >
> >diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
> >index c63b7e5..e17799c 100644
> >--- a/hw/ide/atapi.c
> >+++ b/hw/ide/atapi.c
> >@@ -394,6 +394,23 @@ static void ide_atapi_cmd_read(IDEState *s, int lba, int nb_sectors,
> > }
> > }
> >
> >+
> >+/* Called by *_restart_bh when the transfer function points
> >+ * to ide_atapi_cmd
> >+ */
> >+void ide_atapi_dma_restart(IDEState *s)
> >+{
> >+ /*
> >+ * I'm not sure we have enough stored to restart the command
> >+ * safely, so give the guest an error it should recover from.
> >+ * I'm assuming most guests will try to recover from something
> >+ * listed as a medium error on a CD; it seems to work on Linux.
> >+ * This would be more of a problem if we did any other type of
> >+ * DMA operation.
> >+ */
> >+ ide_atapi_cmd_error(s, MEDIUM_ERROR, ASC_NO_SEEK_COMPLETE);
> >+}
> >+
>
> Is this safe for non-data commands? Can we even get there in such a case?
See below.
> > static inline uint8_t ide_atapi_set_profile(uint8_t *buf, uint8_t *index,
> > uint16_t profile)
> > {
> >diff --git a/hw/ide/internal.h b/hw/ide/internal.h
> >index 8a3eca4..8b65285 100644
> >--- a/hw/ide/internal.h
> >+++ b/hw/ide/internal.h
> >@@ -289,6 +289,7 @@ typedef struct IDEDMAOps IDEDMAOps;
> > #define ATAPI_INT_REASON_TAG 0xf8
> >
> > /* same constants as bochs */
> >+#define ASC_NO_SEEK_COMPLETE 0x02
> > #define ASC_ILLEGAL_OPCODE 0x20
> > #define ASC_LOGICAL_BLOCK_OOR 0x21
> > #define ASC_INV_FIELD_IN_CMD_PACKET 0x24
> >@@ -529,6 +530,7 @@ void ide_dma_error(IDEState *s);
> >
> > void ide_atapi_cmd_ok(IDEState *s);
> > void ide_atapi_cmd_error(IDEState *s, int sense_key, int asc);
> >+void ide_atapi_dma_restart(IDEState *s);
> > void ide_atapi_io_error(IDEState *s, int ret);
> >
> > void ide_ioport_write(void *opaque, uint32_t addr, uint32_t val);
> >diff --git a/hw/ide/pci.c b/hw/ide/pci.c
> >index bee5ad3..e3f2054 100644
> >--- a/hw/ide/pci.c
> >+++ b/hw/ide/pci.c
> >@@ -235,6 +235,17 @@ static void bmdma_restart_bh(void *opaque)
> > }
> > } else if (error_status & IDE_RETRY_FLUSH) {
> > ide_flush_cache(bmdma_active_if(bm));
> >+ } else {
> >+ IDEState *s = bmdma_active_if(bm);
> >+
> >+ /*
> >+ * We've not got any bits to tell us about ATAPI - but
> >+ * we do have the end_transfer_func that tells us what
> >+ * we're trying to do.
> >+ */
> >+ if (s->end_transfer_func == ide_atapi_cmd) {
> >+ ide_atapi_dma_restart(s);
> >+ }
>
> OK, so when the restart routines get invoked we add a hook to see if we were
> in the middle of an ATAPI command and acknowledge that we don't know how to
> properly handle this.
As to your question above about non-data commands; hmm probably - but how
do I guard it? I guess I could check for the atapi_dma flag the previous
patch fixed.
(This is all probably still broken for non-DMA atapi transfers)
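A rough sketch of what that guard might look like, using simplified stand-in
types rather than the real IDEState (every Mock* name and the errors_raised
counter are made up for illustration; only end_transfer_func and atapi_dma
correspond to fields discussed in this thread):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for illustration only; not the real QEMU types. */
typedef struct MockIDEState MockIDEState;
typedef void MockEndTransferFunc(MockIDEState *s);

struct MockIDEState {
    MockEndTransferFunc *end_transfer_func; /* what the device was doing */
    bool atapi_dma;                         /* flag restored by patch 1/2 */
    int errors_raised;                      /* hypothetical counter */
};

/* Placeholders standing in for ide_atapi_cmd / ide_atapi_cmd_reply_end. */
static void mock_atapi_cmd(MockIDEState *s) { (void)s; }
static void mock_atapi_cmd_reply_end(MockIDEState *s) { (void)s; }

/* Stand-in for ide_atapi_dma_restart(): just records that an error was raised. */
static void mock_atapi_dma_restart(MockIDEState *s)
{
    s->errors_raised++;
}

/*
 * Only raise the recoverable error when we were waiting on an ATAPI
 * command AND that command was a DMA transfer. Returns true if the
 * error path was taken.
 */
static bool mock_restart_check(MockIDEState *s)
{
    assert(s != NULL);
    if (s->end_transfer_func == mock_atapi_cmd && s->atapi_dma) {
        mock_atapi_dma_restart(s);
        return true;
    }
    return false;
}
```

With that, a non-DMA command sitting at the same end_transfer_func would be
left alone rather than getting the medium error.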
> Isn't this going to run on every vmstate change, though?
There aren't many - only starting/stopping the CPU does it; and bmdma_restart_cb
guards it with an early 'if (!running)' return, so it'll only do it when the CPU
starts running again.
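To illustrate the shape of that guard, here is a minimal stand-alone sketch.
MockBMDMA, mock_restart_cb and schedule_count are invented for the example and
only mimic the early return; the real callback has a different signature and
schedules a bottom half instead of bumping a counter:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the bus-master DMA state. */
typedef struct MockBMDMA {
    int schedule_count; /* how many times the restart work was queued */
} MockBMDMA;

/*
 * Mimics a VM state change handler: it is invoked on both stop and
 * start, but only queues the restart work on the transition to running.
 */
static void mock_restart_cb(MockBMDMA *bm, bool running)
{
    assert(bm != NULL);
    if (!running) {
        return; /* VM is stopping: nothing to restart */
    }
    bm->schedule_count++; /* stands in for scheduling the restart bh */
}
```

So a stop followed by a start only queues the restart once, on the start.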
> I think we don't
> clear out end_transfer_func on success, so this might fire off more than we
> want it to, although I guess end_transfer_func is usually going to get set
> to ide_atapi_cmd_reply_end if it finishes normally ...
Right, or if ide_transfer_stop is called.
> > }
> > }
> >
> >
>
> Indeed a hack, but it's probably appropriate: if our code cannot in fact
> handle ATAPI migration, throwing an error or disabling migration is the
> correct thing to do, but I don't think users would be very happy with the
> second option. I feel that this is an OK workaround because it should not
> introduce spurious errors or retries for cases where we manage to avoid
> migrating in the middle of the loop. This will at least let the currently
> broken case limp along until we fix it more properly.
>
> What makes me the most curious is how this plays out in Windows if this case
> is triggered. Throw a trace around the fake error and see if you can't
> observe it getting called during a pingpong test while Windows reads a CD.
Yeh, I'm going to figure out how to try that.
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Thread overview: 15+ messages
2014-12-09 18:15 [Qemu-devel] [PATCH 0/2] ATAPI migration fix/hack Dr. David Alan Gilbert (git)
2014-12-09 18:15 ` [Qemu-devel] [PATCH 1/2] Restore atapi_dma flag across migration Dr. David Alan Gilbert (git)
2014-12-10 5:04 ` John Snow
2014-12-09 18:15 ` [Qemu-devel] [PATCH 2/2] atapi migration: Throw recoverable error to avoid recovery Dr. David Alan Gilbert (git)
2014-12-10 6:14 ` John Snow
2014-12-10 12:05 ` Dr. David Alan Gilbert [this message]
2014-12-10 20:09 ` Dr. David Alan Gilbert
2014-12-10 22:04 ` Paolo Bonzini
2014-12-11 19:45 ` Dr. David Alan Gilbert
2014-12-18 19:39 ` Dr. David Alan Gilbert
2014-12-18 23:42 ` John Snow
2015-01-16 17:28 ` John Snow
2015-02-02 12:11 ` Kevin Wolf
2015-01-07 16:26 ` [Qemu-devel] [PATCH 0/2] ATAPI migration fix/hack Dr. David Alan Gilbert
2015-01-30 16:07 ` Dr. David Alan Gilbert