From: Ahmad Fatoum <a.fatoum@pengutronix.de>
To: Marc Zyngier <maz@kernel.org>
Cc: Peter Maydell <peter.maydell@linaro.org>,
qemu-arm@nongnu.org, kvmarm@lists.linux.dev, kvm@vger.kernel.org,
Pengutronix Kernel Team <kernel@pengutronix.de>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>,
Enrico Joerns <ejo@pengutronix.de>
Subject: Re: [BUG] ARM64 KVM: Data abort executing post-indexed LDR on MMIO address
Date: Wed, 9 Oct 2024 08:11:52 +0200 [thread overview]
Message-ID: <f096876b-f364-4291-88e0-ac189b4f26fe@pengutronix.de> (raw)
In-Reply-To: <8634l97cfs.wl-maz@kernel.org>
Hello Marc,
On 06.10.24 12:28, Marc Zyngier wrote:
> On Sun, 06 Oct 2024 08:59:56 +0100,
> Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
>> On 05.10.24 23:35, Marc Zyngier wrote:
>>> On Sat, 05 Oct 2024 19:38:23 +0100,
>>> Ahmad Fatoum <a.fatoum@pengutronix.de> wrote:
>> One more question: This upgrading of DC IVAC to DC CIVAC is because
>> the code is run under virtualization, right?
>
> Not necessarily. Virtualisation mandates the upgrade, but CIVAC is
> also a perfectly valid implementation of both IVAC and CVAC. And it
> isn't uncommon that CPUs implement everything the same way.
Makes sense. After all, software should expect cache lines to
be evicted at any time due to capacity misses anyway.
>> I think following fix on the barebox side may work:
>>
>> - Walk all pages about to be remapped
>> - Execute the AT instruction on the page's base address
>
> Why do you need AT if you are walking the PTs? If you walk, you
> already have access to the memory attributes. In general, AT can be
> slower than an actual walk.
>
> Or did you actually mean iterating over the VA range? Even in that
> case, AT can be a bad idea, as you are likely to iterate in page-size
> increments even if you have a block mapping. Walking the PTs tells you
> immediately how much a leaf is mapping (assuming you don't have any
> other tracking).
There's no other tracking and I hoped that using AT (which is already
being used for the mmuinfo shell command) would be easier.
I see now that it would be too suboptimal to do it this way and have
implemented a revised arch_remap_range[1] for barebox, which I just
Cc'd you on.
[1]: https://lore.kernel.org/barebox/20241009060511.4121157-5-a.fatoum@pengutronix.de/T/#u
>> - Only if the page was previously mapped cacheable, clean + invalidate
>> the cache
>> - Remove the current cache invalidation after remap
>>
>> Does that sound sensible?
>
> This looks reasonable (apart from the AT thingy).
I have two (hopefully the last!) questions about remaining differing
behavior with KVM and without:
1) Unaligned stack accesses crash in KVM:
start: /* This will be mapped at 0x40080000 */
ldr x0, =0x4007fff0
mov sp, x0
stp x0, x1, [sp] // This is ok
ldr x0, =0x4007fff8
mov sp, x0
stp x0, x1, [sp] // This crashes
I know that the stack should be 16 byte aligned, but why does it crash
only under KVM?
Context: The barebox Image used for Qemu has a Linux ARM64 "Image" header,
so it's loaded at an offset and grows the stack down into this memory region
until the FDT's /memory could be decoded and a proper stack is set up.
A regression introduced earlier this year, caused the stack to grow down
from a non-16-byte address, which is fixed in [2].
[2]: https://lore.kernel.org/barebox/20241009060511.4121157-5-a.fatoum@pengutronix.de/T/#ma381512862d22530382aff1662caadad2c8bc182
2) Using uncached memory for Virt I/O queues with KVM enabled is considerably
slower. My guess is that these accesses keep getting trapped, but what I wonder
about is the performance discrepancy between the big.LITTLE cores
(measurement of barebox copying 1MiB using `time cp -v /dev/virtioblk0 /tmp`):
KVM && !CACHED && 1x Cortex-A53: 0.137s
KVM && !CACHED && 1x Cortex-A72: 54.030s
KVM && CACHED && 1x Cortex-A53: 0.120s
KVM && CACHED && 1x Cortex-A72: 0.035s
The A53s are CPUs 0-1 and the A72 are 2-5.
Any idea why accessing uncached memory from the big core is so much worse?
Thank you!
Ahmad
>
> Thanks,
>
> M.
>
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
next prev parent reply other threads:[~2024-10-09 6:13 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-04 9:47 [BUG] ARM64 KVM: Data abort executing post-indexed LDR on MMIO address Ahmad Fatoum
2024-10-04 10:40 ` Peter Maydell
2024-10-04 11:51 ` Ahmad Fatoum
2024-10-04 12:10 ` Peter Maydell
2024-10-04 15:52 ` Oliver Upton
2024-10-04 15:57 ` Peter Maydell
2024-10-04 16:21 ` Oliver Upton
2024-10-04 19:50 ` Ahmad Fatoum
2024-10-05 10:31 ` Marc Zyngier
2024-10-05 18:38 ` Ahmad Fatoum
2024-10-05 21:35 ` Marc Zyngier
2024-10-06 7:59 ` Ahmad Fatoum
2024-10-06 10:28 ` Marc Zyngier
2024-10-09 6:11 ` Ahmad Fatoum [this message]
2024-10-09 8:05 ` Marc Zyngier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f096876b-f364-4291-88e0-ac189b4f26fe@pengutronix.de \
--to=a.fatoum@pengutronix.de \
--cc=ejo@pengutronix.de \
--cc=kernel@pengutronix.de \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=maz@kernel.org \
--cc=peter.maydell@linaro.org \
--cc=qemu-arm@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).