qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	QEMU Developers <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] exec.c:invalidate_and_set_dirty() only checks whether first page in its range is dirty...
Date: Tue, 18 Nov 2014 14:53:18 +0000	[thread overview]
Message-ID: <20141118145318.GG7837@stefanha-thinkpad.redhat.com> (raw)
In-Reply-To: <CAFEAcA8op+VPnVw3XGn8XGoc6=DRmaJDs1cbtFMNF9x7EM2swg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2026 bytes --]

On Sun, Nov 16, 2014 at 06:11:48PM +0000, Peter Maydell wrote:
> I'm trying to track down a bug in ARM TCG where we:
>  * boot a guest
>  * run 'shutdown -r now' to trigger a reboot
>  * on reboot, crash when running userspace because the contents
>    of physical RAM have changed but the translated code from
>    before the shutdown was never invalidated
> 
> This is with a virtio-mmio block device as the disk.
> 
> Debugging indicates that when the post-reboot guest reloads
> binaries from disk into ram we fail to invalidate the cached
> translations. For the specific case I looked at, we have a
> translation of code at ramaddr_t 0x806e000. The disk load
> pulls 0x16000 bytes of data off disk to address 0x806a000.
> Virtio correctly calls address_space_unmap(), which is supposed
> to be what marks the ram range as dirty. It in turn calls
> invalidate_and_set_clean(). However invalidate_and_set_clean()
> just does this:
> 
>     if (cpu_physical_memory_is_clean(addr)) {
>         /* invalidate code */
>         tb_invalidate_phys_page_range(addr, addr + length, 0);
>         /* set dirty bit */
>         cpu_physical_memory_set_dirty_range_nocode(addr, length);
>     }
> 
> So if the first page in the range (here 0x806a000) happens
> to be dirty then we won't do anything, even if later pages
> in the range do need to be invalidated. Also, we'll call
> tb_invalidate_phys_page_range() with a start/end which may
> be in different physical pages, which is forbidden by that
> function's API.
> 
> I guess invalidate_and_set_clean() really needs to be
> fixed to loop through each page in the range; does anybody
> know how this is supposed to work (or why nobody's noticed
> this bug before :-)) ?

Not directly but I don't like this code because it's not atomic.  I'll
send patches soon for atomic test-and-set and test-and-clear.  Hopefully
it won't impact performance too much.

What you've discovered seems like a plain old bug.  It needs a loop.

Stefan

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

      reply	other threads:[~2014-11-18 14:53 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-16 18:11 [Qemu-devel] exec.c:invalidate_and_set_dirty() only checks whether first page in its range is dirty Peter Maydell
2014-11-18 14:53 ` Stefan Hajnoczi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141118145318.GG7837@stefanha-thinkpad.redhat.com \
    --to=stefanha@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).