Re: [RFC PATCH] accel/tcg: clear all TBs from a page when it is written to

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: "Nicholas Piggin" <npiggin@gmail.com>
To: "Richard Henderson" <richard.henderson@linaro.org>,
	<qemu-devel@nongnu.org>, <qemu-ppc@nongnu.org>
Cc: "Paolo Bonzini" <pbonzini@redhat.com>
Subject: Re: [RFC PATCH] accel/tcg: clear all TBs from a page when it is written to
Date: Wed, 14 Aug 2024 16:09:36 +1000	[thread overview]
Message-ID: <D3FESSMCNW22.3S68NMF7NI5KR@gmail.com> (raw)
In-Reply-To: <c098afe9-d09d-400d-8f2b-9278744a4ad4@linaro.org>

On Mon Aug 12, 2024 at 11:25 AM AEST, Richard Henderson wrote:
> On 8/9/24 17:47, Nicholas Piggin wrote:
> > This is not a clean patch, but does fix a problem I hit with TB
> > invalidation due to the target software writing to memory with TBs.
> > 
> > Lockup messages are triggering in Linux due to page clearing taking a
> > long time when a code page has been freed, because it takes a lot of
> > notdirty notifiers, which massively slows things down. Linux might
> > possibly have a bug here too because it seems to hang indefinitely in
> > some cases, but even if it didn't, the latency of clearing these pages
> > is very high.
> > 
> > This showed when running KVM on the emulated machine, starting and
> > stopping guests. That causes lots of instruction pages to be freed.
> > Usually if you're just running Linux, executable pages remain in
> > pagecache so you get fewer of these bombs in the kernel memory
> > allocator. But page reclaim, JITs, deleting executable files, etc.,
> > could trigger it too.
> > 
> > Invalidating all TBs from the page on any hit seems to avoid the problem
> > and generally speeds things up.
> > 
> > How important is the precise invalidation? These days I assume the
> > tricky kind of SMC that frequently writes code close to where it's
> > executing is pretty rare and might not be something we really care about
> > for performance. Could we remove sub-page TB invalidation entirely?
>
> Happens on x86 and s390 regularly enough, so we can't remove it.
>
> > @@ -1107,6 +1107,9 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
> >       TranslationBlock *current_tb = retaddr ? tcg_tb_lookup(retaddr) : NULL;
> >   #endif /* TARGET_HAS_PRECISE_SMC */
> >   
> > +    start &= TARGET_PAGE_MASK;
> > +    last |= ~TARGET_PAGE_MASK;
> > +
> >       /* Range may not cross a page. */
> >       tcg_debug_assert(((start ^ last) & TARGET_PAGE_MASK) == 0);
>
> This would definitely break SMC.

They can't invalidate the instruction currently being executed?
I'll experiment a bit more.

> However, there's a better solution.  We're already iterating over all of the TBs on the 
> current page only.  Move *everything* except the tb_phys_invalidate__locked call into the 
> SMC ifdef, and unconditionally invalidate every TB selected in the loop.

Okay. I suspect *most* of the time even the strict SMC archs would
not be writing to the same page they're executing either. But I can
start with the !SMC.

> We experimented with something like this for aarch64, which used to spend a lot of the 
> kernel startup time invalidating code pages from the (somewhat bloated) EDK2 bios.  But it 
> turned out the bigger problem was address space randomization, and with CF_PCREL the 
> problem appeared to go away.

Interesting.

> I don't think we've done any kvm-under-tcg performance testing, but lockup messages would 
> certainly be something to look for...

Yeah, actually Linux is throwing the messages a bit more recently
at least on distros that enable page clearing at alloc for security,
because that clearing is a big chunk that can happen in critical
sections.

Thanks for the suggestion, I'll give it a try.

Thanks,
Nick

next prev parent reply	other threads:[~2024-08-14  6:10 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-09  7:47 [RFC PATCH] accel/tcg: clear all TBs from a page when it is written to Nicholas Piggin
2024-08-09  8:26 ` Philippe Mathieu-Daudé
2024-08-12  1:25 ` Richard Henderson
2024-08-14  6:09   ` Nicholas Piggin [this message]
2024-08-20 23:16     ` Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D3FESSMCNW22.3S68NMF7NI5KR@gmail.com \
    --to=npiggin@gmail.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).