qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: John Snow <jsnow@redhat.com>
Cc: pbonzini@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 2/2] atapi migration: Throw recoverable error to avoid recovery
Date: Wed, 10 Dec 2014 12:05:42 +0000	[thread overview]
Message-ID: <20141210120541.GC2311@work-vm> (raw)
In-Reply-To: <5487E4C8.4050102@redhat.com>

* John Snow (jsnow@redhat.com) wrote:
> 
> 
> On 12/09/2014 01:15 PM, Dr. David Alan Gilbert (git) wrote:
> >From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >
> >(With the previous atapi_dma flag recovery)
> >If migration happens between the ATAPI command being written and the
> >bmdma being started, the DMA is dropped.  Eventually the guest times
> >out and recovers, but that can take many seconds.
> >(This is rare, on a pingpong reading the CD continuously I hit
> >this about ~1/30-1/50 migrates)
> >
> >I don't think we've got enough state to be able to recover safely
> >at this point, so I throw a 'medium error, no seek complete'
> >that I'm assuming guests will try and recover from an apparently
> >dirty CD.
> >
> >OK, it's a hack, the real solution is probably to push a lot of
> >ATAPI state into the migration stream, but this is a fix that
> >works with no stream changes. Tested only on Linux (both RHEL5
> >(pre-libata) and RHEL7).
> >
> >Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >---
> >  hw/ide/atapi.c    | 17 +++++++++++++++++
> >  hw/ide/internal.h |  2 ++
> >  hw/ide/pci.c      | 11 +++++++++++
> >  3 files changed, 30 insertions(+)
> >
> >diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
> >index c63b7e5..e17799c 100644
> >--- a/hw/ide/atapi.c
> >+++ b/hw/ide/atapi.c
> >@@ -394,6 +394,23 @@ static void ide_atapi_cmd_read(IDEState *s, int lba, int nb_sectors,
> >      }
> >  }
> >
> >+
> >+/* Called by *_restart_bh when the transfer function points
> >+ * to ide_atapi_cmd
> >+ */
> >+void ide_atapi_dma_restart(IDEState *s)
> >+{
> >+    /*
> >+     * I'm not sure we have enough stored to restart the command
> >+     * safely, so give the guest an error it should recover from.
> >+     * I'm assuming most guests will try to recover from something
> >+     * listed as a medium error on a CD; it seems to work on Linux.
> >+     * This would be more of a problem if we did any other type of
> >+     * DMA operation.
> >+     */
> >+    ide_atapi_cmd_error(s, MEDIUM_ERROR, ASC_NO_SEEK_COMPLETE);
> >+}
> >+
> 
> Is this safe for non-data commands? Can we even get there in such a case?

See below.

> >  static inline uint8_t ide_atapi_set_profile(uint8_t *buf, uint8_t *index,
> >                                              uint16_t profile)
> >  {
> >diff --git a/hw/ide/internal.h b/hw/ide/internal.h
> >index 8a3eca4..8b65285 100644
> >--- a/hw/ide/internal.h
> >+++ b/hw/ide/internal.h
> >@@ -289,6 +289,7 @@ typedef struct IDEDMAOps IDEDMAOps;
> >  #define ATAPI_INT_REASON_TAG            0xf8
> >
> >  /* same constants as bochs */
> >+#define ASC_NO_SEEK_COMPLETE                 0x02
> >  #define ASC_ILLEGAL_OPCODE                   0x20
> >  #define ASC_LOGICAL_BLOCK_OOR                0x21
> >  #define ASC_INV_FIELD_IN_CMD_PACKET          0x24
> >@@ -529,6 +530,7 @@ void ide_dma_error(IDEState *s);
> >
> >  void ide_atapi_cmd_ok(IDEState *s);
> >  void ide_atapi_cmd_error(IDEState *s, int sense_key, int asc);
> >+void ide_atapi_dma_restart(IDEState *s);
> >  void ide_atapi_io_error(IDEState *s, int ret);
> >
> >  void ide_ioport_write(void *opaque, uint32_t addr, uint32_t val);
> >diff --git a/hw/ide/pci.c b/hw/ide/pci.c
> >index bee5ad3..e3f2054 100644
> >--- a/hw/ide/pci.c
> >+++ b/hw/ide/pci.c
> >@@ -235,6 +235,17 @@ static void bmdma_restart_bh(void *opaque)
> >          }
> >      } else if (error_status & IDE_RETRY_FLUSH) {
> >          ide_flush_cache(bmdma_active_if(bm));
> >+    } else {
> >+        IDEState *s = bmdma_active_if(bm);
> >+
> >+        /*
> >+         * We've not got any bits to tell us about ATAPI - but
> >+         * we do have the end_transfer_func that tells us what
> >+         * we're trying to do.
> >+         */
> >+        if (s->end_transfer_func == ide_atapi_cmd) {
> >+            ide_atapi_dma_restart(s);
> >+        }
> 
> OK, so when the restart routines get invoked we add a hook to see if we were
> in the middle of an ATAPI command and acknowledge that we don't know how to
> properly handle this.

As to your qeustion above about non-data commands; hmm probably - but how
do I guard it? I guess I could check for the atapi_dma flag the previous
patch fixed.
(This is all probably still broken for non-DMA atapi transfers)

> Isn't this going to run on every vmstate change, though?

There aren't many - only starting/stopping the CPU does it; and bmdma_restart_cb
guards it by 'if (!running)' exit, so it'll only do it when the CPU starts
running again.

> I think we don't
> clear out end_transfer_func on success, so this might fire off more than we
> want it to, although I guess end_transfer_func is usually going to get set
> to ide_atapi_cmd_reply_end if it finishes normally ...

Right, or if ide_transfer_stop is called.

> >      }
> >  }
> >
> >
> 
> Indeed a hack, but it's probably appropriate: if our code cannot in fact
> handle ATAPI migration, throwing an error or disabling migration is the
> correct thing to do, but I don't think users would be very happy with the
> second option. I feel that this is an OK workaround because it should not
> introduce spurious errors or retries for cases where we manage to avoid
> migrating in the middle of the loop. This will at least let the currently
> broken case limp along until we fix it more properly.
> 
> What makes me the most curious is how this plays out in Windows if this case
> is triggered. Throw a trace around the fake error and see if you can't
> observe it getting called during a pingpong test while Windows reads a CD.

Yeh, I'm going to figure out how to try that.

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

  reply	other threads:[~2014-12-10 12:19 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-09 18:15 [Qemu-devel] [PATCH 0/2] ATAPI migration fix/hack Dr. David Alan Gilbert (git)
2014-12-09 18:15 ` [Qemu-devel] [PATCH 1/2] Restore atapi_dma flag across migration Dr. David Alan Gilbert (git)
2014-12-10  5:04   ` John Snow
2014-12-09 18:15 ` [Qemu-devel] [PATCH 2/2] atapi migration: Throw recoverable error to avoid recovery Dr. David Alan Gilbert (git)
2014-12-10  6:14   ` John Snow
2014-12-10 12:05     ` Dr. David Alan Gilbert [this message]
2014-12-10 20:09     ` Dr. David Alan Gilbert
2014-12-10 22:04   ` Paolo Bonzini
2014-12-11 19:45     ` Dr. David Alan Gilbert
2014-12-18 19:39       ` Dr. David Alan Gilbert
2014-12-18 23:42         ` John Snow
2015-01-16 17:28   ` John Snow
2015-02-02 12:11     ` Kevin Wolf
2015-01-07 16:26 ` [Qemu-devel] [PATCH 0/2] ATAPI migration fix/hack Dr. David Alan Gilbert
2015-01-30 16:07   ` Dr. David Alan Gilbert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141210120541.GC2311@work-vm \
    --to=dgilbert@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).