From: Stefan Hajnoczi <stefanha@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
QEMU Developers <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] exec.c:invalidate_and_set_dirty() only checks whether first page in its range is dirty...
Date: Tue, 18 Nov 2014 14:53:18 +0000
Message-ID: <20141118145318.GG7837@stefanha-thinkpad.redhat.com>
In-Reply-To: <CAFEAcA8op+VPnVw3XGn8XGoc6=DRmaJDs1cbtFMNF9x7EM2swg@mail.gmail.com>
On Sun, Nov 16, 2014 at 06:11:48PM +0000, Peter Maydell wrote:
> I'm trying to track down a bug in ARM TCG where we:
> * boot a guest
> * run 'shutdown -r now' to trigger a reboot
> * on reboot, crash when running userspace because the contents
> of physical RAM have changed but the translated code from
> before the shutdown was never invalidated
>
> This is with a virtio-mmio block device as the disk.
>
> Debugging indicates that when the post-reboot guest reloads
> binaries from disk into ram we fail to invalidate the cached
> translations. For the specific case I looked at, we have a
> translation of code at ram_addr_t 0x806e000. The disk load
> pulls 0x16000 bytes of data off disk to address 0x806a000.
> Virtio correctly calls address_space_unmap(), which is supposed
> to be what marks the ram range as dirty. It in turn calls
> invalidate_and_set_dirty(). However invalidate_and_set_dirty()
> just does this:
>
>     if (cpu_physical_memory_is_clean(addr)) {
>         /* invalidate code */
>         tb_invalidate_phys_page_range(addr, addr + length, 0);
>         /* set dirty bit */
>         cpu_physical_memory_set_dirty_range_nocode(addr, length);
>     }
>
> So if the first page in the range (here 0x806a000) happens
> to be dirty then we won't do anything, even if later pages
> in the range do need to be invalidated. Also, we'll call
> tb_invalidate_phys_page_range() with a start/end which may
> be in different physical pages, which is forbidden by that
> function's API.
>
> I guess invalidate_and_set_dirty() really needs to be
> fixed to loop through each page in the range; does anybody
> know how this is supposed to work (or why nobody's noticed
> this bug before :-)) ?
Not directly, but I don't like this code because it's not atomic. I'll
send patches soon for atomic test-and-set and test-and-clear. Hopefully
it won't impact performance too much.
What you've discovered seems like a plain old bug. It needs a loop.
Stefan