qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Anthony Liguori <anthony@codemonkey.ws>
To: David Gibson <david@gibson.dropbear.id.au>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] RFC: Partial workaround for buggy guest virtio-balloon driver
Date: Thu, 08 Nov 2012 07:11:13 -0600	[thread overview]
Message-ID: <87a9us1bpq.fsf@codemonkey.ws> (raw)
In-Reply-To: <20121108044522.GU23553@truffula.fritz.box>

David Gibson <david@gibson.dropbear.id.au> writes:

> Linux kernel commits 1a87228f5f1d316002c7c161316f5524592be766
> "virtio_balloon: Fix endian bug" and
> 3ccc9372ed0fab33d20f10be3c1efd5776ff5913 "virtio_balloon: fix handling
> of PAGE_SIZE != 4k" fixed two serious bugs in their (guest side)
> handling of the virtio balloon.  In practice, these bugs only affected
> powerpc guests, which is big-endian and frequently configured for 64k
> base page size.  Attempting to use the balloon with the buggy guest
> would usually result in an immediate guest crash.

You should create a new feature VIRTIO_BALLOON_F_ENDIAN_SAFE,
advertise it in the host, and add a guest kernel patch to ack it in
newer kernels.

Older kernels won't ack this feature which gives you a safe way to to
disable the driver on a big endian host.

You won't get support for 3.4 kernels but it's much nicer to handle it
this way.

Regards,

Anthony Liguori

>
> The bugs are fixed now, of course and the balloon works fine with
> kernels v3.4 and later, but unfortunately there are many distro
> releases still in use which still have buggy kernels.
>
> The nature of the page size bug makes it impossible to work around
> from the host side (there simply isn't enough information supplied to
> operate the balloon correctly).  However, it *is* possible with some
> fiddling to safely detect the endian bug in the guest kernel, and
> disable the balloon in this case.  The two fixes were applied to the
> mainline kernel almost consecutively, so there are no released kernels
> with one fix but not the other, meaning we can use the presence of the
> endian bug as a proxy for the presence of the page size bug.
>
> This patch implements such a test, working as follows.
>
> For a fixed guest kernel:
>   1. qemu sets a state variable to "TESTING" and the initial balloon
>      target size to 16 (4k pages).
>   2. When the guest balloon driver starts, it sees the target, finds
>      either 16 unused 4k pages or 1 unused 64k page (depending on
>      guest config) and submits the 16 resulting 4k pfns to the
>      host. qemu, in TESTING state, ignores the submitted pages for
>      now.
>   4. The guest then updates the 'actual' field in the balloon config
>      space to 16.  qemu sees this and determines that the guest is not
>      buggy, it moves to CLEANUP state, and sets the target balloon
>      size back to 0.
>   5. The guest sees the target go back to 0, and reclaims its page(s)
>      from the balloon.  qemu continues to ignore the page addresses
>      for now in CLEANUP state.
>   6. The guest updates the actual field to 0.  qemu now considers
>      cleanup complete and moves to GOOD state.  The balloon now
>      operates normally.
>
> For a buggy kernel:
>   1. qemu sets a state variable to "TESTING" and the initial balloon
>      target size to 16 (4k pages).
>   2. When the guest balloon driver starts it sees the non-zero target,
>      and misinterprets it as 268435456 (endian bug).  It starts trying
>      to find pages to free.
>   3. The guest is probably newly booted, so it almost certainly finds
>      256 pages easily and submits incorrect addresses for them (page
>      size bug) to the host. qemu, in TESTING state ignores the (wrong)
>      addresses for now.
>   4. The guest updates the actual field in config space to 256.  qemu
>      sees this, and determines that the guest is buggy.  It moves to
>      BUGGY state and sets the balloon target back to zero.
>   5. The guest, before attempting to find the next batch of pages to
>      free, rechecks the target and discovers it is now zero.  It
>      reclaims the pages by submitting more wrong addresses, which qemu
>      ignores.
>   6. The balloon is now disabled, if the user attempts to change the
>      balloon size, qemu print an error message and otherwise ignore
>      it.
>
>
> I'm aware that this patch needs a bunch more comments (largely based
> on the info above), but otherwise do people think this is a reasonable
> approach.  It doesn't (and can't) fix the balloon for buggy kernels,
> but it does make the failure mode a lot less ugly.
>
> From dbc721f5e50a39ca3b40d81f060d8bb0e6312995 Mon Sep 17 00:00:00 2001
> From: David Gibson <david@gibson.dropbear.id.au>
> Date: Thu, 8 Nov 2012 14:49:38 +1100
> Subject: [PATCH] Detection of buggy guest balloon
>
> ---
>  hw/virtio-balloon.c |   77 +++++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 72 insertions(+), 5 deletions(-)
>
> diff --git a/hw/virtio-balloon.c b/hw/virtio-balloon.c
> index dd1a650..6a9cd3f 100644
> --- a/hw/virtio-balloon.c
> +++ b/hw/virtio-balloon.c
> @@ -37,8 +37,16 @@ typedef struct VirtIOBalloon
>      VirtQueueElement stats_vq_elem;
>      size_t stats_vq_offset;
>      DeviceState *qdev;
> +
> +    int guest_bug_state;
> +#define GUEST_BUG_TESTING       0
> +#define GUEST_BUG_CLEANUP       1
> +#define GUEST_BUG_BUGGY         2
> +#define GUEST_BUG_GOOD          3
>  } VirtIOBalloon;
>  
> +#define GUEST_BUG_TARGET        16
> +
>  static VirtIOBalloon *to_virtio_balloon(VirtIODevice *vdev)
>  {
>      return (VirtIOBalloon *)vdev;
> @@ -84,6 +92,11 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>              pa = (ram_addr_t)ldl_p(&pfn) << VIRTIO_BALLOON_PFN_SHIFT;
>              offset += 4;
>  
> +            if (s->guest_bug_state != GUEST_BUG_GOOD) {
> +                /* Still bug testing, not ready to use balloon yet */
> +                continue;
> +            }
> +
>              /* FIXME: remove get_system_memory(), but how? */
>              section = memory_region_find(get_system_memory(), pa, 1);
>              if (!section.size || !memory_region_is_ram(section.mr))
> @@ -134,8 +147,23 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>  {
>      VirtIOBalloon *dev = to_virtio_balloon(vdev);
>      struct virtio_balloon_config config;
> +    uint32_t num_pages;
> +
> +    switch (dev->guest_bug_state) {
> +    case GUEST_BUG_TESTING:
> +        num_pages = GUEST_BUG_TARGET;
> +        break;
>  
> -    config.num_pages = cpu_to_le32(dev->num_pages);
> +    case GUEST_BUG_CLEANUP:
> +    case GUEST_BUG_BUGGY:
> +        num_pages = 0;
> +        break;
> +
> +    default:
> +        num_pages = dev->num_pages;
> +    }
> +
> +    config.num_pages = cpu_to_le32(num_pages);
>      config.actual = cpu_to_le32(dev->actual);
>  
>      memcpy(config_data, &config, 8);
> @@ -147,11 +175,41 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
>      VirtIOBalloon *dev = to_virtio_balloon(vdev);
>      struct virtio_balloon_config config;
>      uint32_t oldactual = dev->actual;
> +
>      memcpy(&config, config_data, 8);
>      dev->actual = le32_to_cpu(config.actual);
> -    if (dev->actual != oldactual) {
> -        qemu_balloon_changed(ram_size -
> -                             (dev->actual << VIRTIO_BALLOON_PFN_SHIFT));
> +
> +    switch (dev->guest_bug_state) {
> +    case GUEST_BUG_TESTING:
> +        if (dev->actual == 0) {
> +            /* Both buggy and non-buggy guests write a 0 before going
> +             * on to write a meaningful value */
> +            break;
> +        }
> +
> +        if (dev->actual > GUEST_BUG_TARGET) {
> +            fprintf(stderr, "virtio-balloon: Buggy guest detected, disabling balloon\n");
> +            dev->guest_bug_state = GUEST_BUG_BUGGY;
> +        } else {
> +            dev->guest_bug_state = GUEST_BUG_CLEANUP;
> +        }
> +        /* Changing bug state implicitly alters the config */
> +        virtio_notify_config(&dev->vdev);
> +        break;
> +
> +    case GUEST_BUG_CLEANUP:
> +        if (dev->actual == 0) {
> +            /* Cleanup completed, proceed with normal operation */
> +            dev->guest_bug_state = GUEST_BUG_GOOD;
> +            virtio_notify_config(&dev->vdev);
> +        }
> +        break;
> +
> +    default:
> +        if (dev->actual != oldactual) {
> +            qemu_balloon_changed(ram_size -
> +                                 (dev->actual << VIRTIO_BALLOON_PFN_SHIFT));
> +        }
>      }
>  }
>  
> @@ -194,12 +252,21 @@ static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
>  {
>      VirtIOBalloon *dev = opaque;
>  
> +    if (dev->guest_bug_state == GUEST_BUG_BUGGY) {
> +        fprintf(stderr, "Guest is buggy, cannot use balloon\n");
> +    }
> +
>      if (target > ram_size) {
>          target = ram_size;
>      }
>      if (target) {
>          dev->num_pages = (ram_size - target) >> VIRTIO_BALLOON_PFN_SHIFT;
> -        virtio_notify_config(&dev->vdev);
> +
> +        /* If we're still testing for guest bugs, delay the change
> +         * interrupt until we've finished that */
> +        if (dev->guest_bug_state == GUEST_BUG_GOOD) {
> +            virtio_notify_config(&dev->vdev);
> +        }
>      }
>  }
>  
> -- 
> 1.7.10.4
>
>
>
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson

  reply	other threads:[~2012-11-08 13:11 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-08  4:45 [Qemu-devel] RFC: Partial workaround for buggy guest virtio-balloon driver David Gibson
2012-11-08 13:11 ` Anthony Liguori [this message]
2012-11-09  0:57   ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a9us1bpq.fsf@codemonkey.ws \
    --to=anthony@codemonkey.ws \
    --cc=david@gibson.dropbear.id.au \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).