linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: "Alex Bennée" <alex.bennee@linaro.org>
To: D Scott Phillips <scott@os.amperecomputing.com>
Cc: linux-kernel@vger.kernel.org,  kvm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,  kvmarm@lists.linux.dev,
	maz@kernel.org,  arnd@linaro.org
Subject: Re: [PATCH 1/3] ampere/arm64: Add a fixup handler for alignment faults in aarch64 code
Date: Wed, 28 Aug 2024 10:47:02 +0100	[thread overview]
Message-ID: <87plpt3rop.fsf@draig.linaro.org> (raw)
In-Reply-To: <86frqpk6d7.fsf@scott-ph-mail.amperecomputing.com> (D. Scott Phillips's message of "Tue, 27 Aug 2024 14:23:16 -0700")

D Scott Phillips <scott@os.amperecomputing.com> writes:

> Alex Bennée <alex.bennee@linaro.org> writes:
>
>> From: D Scott Phillips <scott@os.amperecomputing.com>
>>
>> A later patch will hand out Device memory in some cases to code
>> which expects a Normal memory type, as an errata workaround.
>> Unaligned accesses to Device memory will fault though, so here we
>> add a fixup handler to emulate faulting accesses, at a performance
>> penalty.
>>
>> Many of the instructions in the Loads and Stores group are supported,
>> but these groups are not handled here:
>>
>>  * Advanced SIMD load/store multiple structures
>>  * Advanced SIMD load/store multiple structures (post-indexed)
>>  * Advanced SIMD load/store single structure
>>  * Advanced SIMD load/store single structure (post-indexed)
>
> Hi Alex, I'm keeping my version of these patches here:
>
> https://github.com/AmpereComputing/linux-ampere-altra-erratum-pcie-65
>
> It looks like the difference to the version you've harvested is that
> I've since added handling for these load/store types:
>
> Advanced SIMD load/store multiple structure
> Advanced SIMD load/store multiple structure (post-indexed)
> Advanced SIMD load/store single structure
> Advanced SIMD load/store single structure (post-indexed)

Are you going to roll in the fixes I added or should I re-spin with your
additional handling?

> I've never sent these patches because in my opinion there's too much
> complexity to maintain upstream for this workaround, though now they're
> here so we can have that conversation.

It's not totally out of the scope of the kernel to do instruction
decoding to workaround things that can't be executed directly. There is
already a bunch of instruction decode logic to handle stepping over
instruction probes. The 32 bit ARM code even has a complete user-space
alignment fixup handler driver by procfs.

It might make sense to share some of the logic although of course the
probe handler and the misaligned handler are targeting different sets of
instructions.

The core kernel code also has a bunch of unaligned load/store helper
functions that could probably be re-used as well to further reduce the
code delta.

> Finally, I think a better approach overall would have been to have
> device memory mapping in the stage 2 page table, then booting with pkvm
> would have this workaround for both the host and guests. I don't think
> that approach changes the fact that there's too much complexity here.

That would be a cleaner solution for pKVM although we would like to see
it ported to Xen as well. There is a tension there between having a
generic fixup library and something tightly integrated into a given
kernel/hypervisor.

I don't think instruction decoding is fundamentally too complicated for
a kernel - although I may be biased as a QEMU developer ;-). However if
it is to be taken forward I think it should probably come with an
exhaustive test case to exercise the decoder and fixup handler. The
fixes so far were found by repeatedly iterating on vkmark and seeing
were things failed and fixing when they came up.

I will leave it to the kernel maintainers to decide if this is an
acceptable workaround or not.

I do have two remaining questions:

  - Why do AMD GPUs trigger this and not nVidia? Do they just have their
    own fixup code hidden in the binary blob? Would Nouveau suffer
    similar problems?

  - Will Arm SoC PCI implementations continue to see these edge cases
    that don't affect the x86 world? This is not the first Arm machine
    with PCI to see issues. In fact of the 3 machines I have (SynQuacer,
    MachiatoBin and AVA) all have some measure of PCI "fun" to deal
    with.

Thanks,

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


  parent reply	other threads:[~2024-08-28  9:48 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-27 13:08 [PATCH 0/3] altra workarounds for PCIE_64 with instrumentation Alex Bennée
2024-08-27 13:08 ` [PATCH 1/3] ampere/arm64: Add a fixup handler for alignment faults in aarch64 code Alex Bennée
2024-08-27 21:23   ` D Scott Phillips
2024-08-28  8:29     ` Marc Zyngier
2024-08-28  9:47     ` Alex Bennée [this message]
2024-08-30  1:43   ` kernel test robot
2024-08-27 13:08 ` [PATCH 2/3] ampere/arm64: Work around Ampere Altra erratum #82288 PCIE_65 Alex Bennée
2024-08-27 13:08 ` [PATCH 3/3] ampere/arm64: instrument the altra workarounds Alex Bennée

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87plpt3rop.fsf@draig.linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=arnd@linaro.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=scott@os.amperecomputing.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).