Re: [Xen-devel] PV guest with PCI passthrough crash on Xen 4.8.3 inside KVM when booted through OVMF


From: Juergen Gross <jgross@suse.com>
To: xen-devel@lists.xenproject.org
Subject: Re: [Xen-devel] PV guest with PCI passthrough crash on Xen 4.8.3 inside KVM when booted through OVMF
Date: Mon, 27 Nov 2023 17:05:52 +0100
Message-ID: <42f02cd6-9aa2-43d3-a352-3abfc5b25ee0@suse.com>
In-Reply-To: <CAKf6xpvBE7VnziXYBpbh4iPw+sJi9bjLcZupUgrt_Pw6qUtffg@mail.gmail.com>



On 27.11.23 16:56, Jason Andryuk wrote:
> On Mon, Nov 27, 2023 at 6:27 AM Marek Marczykowski-Górecki
> <marmarek@invisiblethingslab.com> wrote:
>>
>> On Mon, Nov 27, 2023 at 11:20:36AM +0000, Frediano Ziglio wrote:
>>> On Sun, Nov 26, 2023 at 2:51 PM Marek Marczykowski-Górecki
>>> <marmarek@invisiblethingslab.com> wrote:
>>>>
>>>> On Mon, Feb 19, 2018 at 06:30:14PM +0100, Juergen Gross wrote:
>>>>> On 16/02/18 20:02, Andrew Cooper wrote:
>>>>>> On 16/02/18 18:51, Marek Marczykowski-Górecki wrote:
>>>>>>> On Fri, Feb 16, 2018 at 05:52:50PM +0000, Andrew Cooper wrote:
>>>>>>>> On 16/02/18 17:48, Marek Marczykowski-Górecki wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> As in the subject, the guest crashes on boot, before the kernel outputs
>>>>>>>>> anything. I've isolated this to the conditions below:
>>>>>>>>>   - PV guest has a PCI device assigned (e1000e emulated by QEMU in this case);
>>>>>>>>>     without the PCI device it works
>>>>>>>>>   - Xen (in KVM) is started through OVMF; with seabios it works
>>>>>>>>>   - nested HVM is disabled in KVM
>>>>>>>>>   - AMD IOMMU emulation is disabled in KVM; when enabled qemu crashes on
>>>>>>>>>     boot (looks like qemu bug, unrelated to this one)
>>>>>>>>>
>>>>>>>>> Version info:
>>>>>>>>>   - KVM host: OpenSUSE 42.3, qemu 2.9.1, ovmf-2017+git1492060560.b6d11d7c46-4.1, AMD
>>>>>>>>>   - Xen host: Xen 4.8.3, dom0: Linux 4.14.13
>>>>>>>>>   - Xen domU: Linux 4.14.13, direct boot
>>>>>>>>>
>>>>>>>>> Not sure if relevant, but initially I've tried booting xen.efi /mapbs
>>>>>>>>> /noexitboot and then dom0 kernel crashed saying something about conflict
>>>>>>>>> between e820 and kernel mapping. But now those options are disabled.
>>>>>>>>>
>>>>>>>>> The crash message:
>>>>>>>>> (XEN) d1v0 Unhandled invalid opcode fault/trap [#6, ec=0000]
>>>>>>>>> (XEN) domain_crash_sync called from entry.S: fault at ffff82d080218720 entry.o#create_bounce_frame+0x137/0x146
>>>>>>>>> (XEN) Domain 1 (vcpu#0) crashed on cpu#1:
>>>>>>>>> (XEN) ----[ Xen-4.8.3  x86_64  debug=n   Not tainted ]----
>>>>>>>>> (XEN) CPU:    1
>>>>>>>>> (XEN) RIP:    e033:[<ffffffff826d9156>]
>>>>>>>> This is #UD, which is most probably hitting a BUG().  addr2line this ^
>>>>>>>> to find some code to look at.
>>>>>>> addr2line failed me
>>>>>>
>>>>>> By default, vmlinux is stripped and compressed.  Ideally you want to
>>>>>> addr2line the vmlinux artefact in the root of your kernel build, which
>>>>>> is the plain elf with debugging symbols.
>>>>>>
>>>>>> Alternatively, use scripts/extract-vmlinux on the binary you actually
>>>>>> booted, which might get you somewhere.
>>>>>>
>>>>>>> , but System.map says it's xen_memory_setup. And it
>>>>>>> looks like the BUG() is the same as I had in dom0 before:
>>>>>>> "Xen hypervisor allocated kernel memory conflicts with E820 map".
>>>>>>
>>>>>> Juergen: Is there anything we can do to try and insert some dummy
>>>>>> exception handlers right at PV start, so we could at least print out a
>>>>>> oneliner to the host console which is a little more helpful than Xen
>>>>>> saying "something unknown went wrong" ?
>>>>>
>>>>> You mean something like commit 42b3a4cb5609de757f5445fcad18945ba9239a07
>>>>> added to kernel 4.15?
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Disabling e820_host in guest config solved the problem. Thanks!
>>>>>>>
>>>>>>> Is this some bug in Xen or OVMF, or is it expected behavior and e820_host
>>>>>>> should be avoided?
>>>>>>
>>>>>> I don't really know.  e820_host is a gross hack which shouldn't really
>>>>>> be present.  The actual problem is that Linux can't cope with the
>>>>>> memory layout it was given (and I can't recall if there is anything
>>>>>> Linux could potentially do to cope).  OTOH, the toolstack, which knew
>>>>>> about e820_host and chose to lay the guest out in an overlapping way is
>>>>>> probably also at fault.
>>>>>
>>>>> The kernel can cope with lots of E820 scenarios (e.g. by relocating
>>>>> initrd or the p2m map), but moving itself out of the way is not
>>>>> possible.
>>>>
>>>> I'm afraid I need to resurrect this thread...
>>>>
>>>> With recent kernels (6.6+), the e820_host=0 workaround is not an option
>>>> anymore. It makes Linux not initialize xen-swiotlb (due to
>>>> f9a38ea5172a3365f4594335ed5d63e15af2fd18), so PCI passthrough doesn't
>>>> work at all. While I can add yet another layer of workaround (force
>>>> xen-swiotlb with iommu=soft), that's getting unwieldy.
>>>>
>>>> Furthermore, I don't get the crash message anymore, even with debug
>>>> hypervisor and guest_loglvl=all. Not even "Domain X crashed" in `xl
>>>> dmesg`. It looks like the "crash" shutdown reason doesn't reach Xen, and
>>>> it's considered clean shutdown (I can confirm it by changing various
>>>> `on_*` settings (via libvirt) and observing which gets applied).
>>>>
>>>> Most tests I've done with 6.7-rc1, but the issue I observed on 6.6.1
>>>> already.
>>>>
>>>> This is on Xen 4.17.2. And the L0 is running Linux 6.6.1, and then uses
>>>> QEMU 8.1.2 + OVMF 202308 to run Xen as L1.
>>>>
>>>
>>> So basically you start the domain and it looks like it's shutting down
>>> cleanly from logs.
>>> Can you see anything from the guest? Can you turn on some more
>>> debugging at guest level?
>>
>> No, it crashes before printing anything to the console, also with
>> earlyprintk=xen.
>>
>>> I tried to get some more information from the initial crash but I
>>> could not understand which guest code triggered the bug.
>>
>> I'm not sure which one it is this time (because I don't have Xen
>> reporting guest crash...) but last time it was here:
>> https://github.com/torvalds/linux/blob/master/arch/x86/xen/setup.c#L873-L874
> 
> Hi Marek,
> 
> I too have run into this "Xen hypervisor allocated kernel memory
> conflicts with E820 map" error when running Xen under KVM & OVMF with
> SecureBoot.  OVMF built without SecureBoot did not trip over the
> issue.  It was a little while back - I have some notes though.
> 
> Non-SecureBoot
> (XEN)  [0000000000810000, 00000000008fffff] (ACPI NVS)
> (XEN)  [0000000000900000, 000000007f8eefff] (usable)
> 
> SecureBoot
> (XEN)  [0000000000810000, 000000000170ffff] (ACPI NVS)
> (XEN)  [0000000001710000, 000000007f0edfff] (usable)
> 
> Linux (under Xen) is checking that _pa(_text) (= 0x1000000) is RAM,
> but it is not.  Looking at the E820 map, there is a type 4 (NVS) region
> defined:
> [0000000000810000, 000000000170ffff] (ACPI NVS)
> 
> When OVMF is built with SMM (for SecureBoot) and S3Supported is true,
> the memory range 0x900000-0x170ffff is additionally marked ACPI NVS
> and Linux trips over this.  It becomes usable RAM under Non-SecureBoot
> so Linux boots fine.
> 
> What I don't understand is why there is even a check that _pa(_text)
> is RAM.  Xen logs that it places dom0 way up high in memory, so the
> physical address of the kernel pages are much higher than 0x1000000.
> The value 0x1000000 for _pa(_text) doesn't match reality.  Maybe there
> are some expectations for the ACPI NVS and other reserved regions to
> be 1-1 mapped?  I tried removing the BUG mentioned above, but it still
> failed to boot.  I think I also removed a second BUG, but
> unfortunately I don't have notes on either.

The _guest_ physical address is what matters here.

When the host E820 map is used, the PV kernel tries to rearrange its guest
physical memory layout to match that map. A non-RAM GPA at the location
where the kernel is placed triggers the BUG.
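
As a rough illustration of that condition (a simplified sketch, not the
kernel's actual code), the lookup below uses the two E820 excerpts quoted
above and the 0x1000000 value reported for _pa(_text). The names
region_type and kernel_start_is_ram are hypothetical, and only the start
address is checked here, whereas the real check in arch/x86/xen/setup.c is
not reproduced.

```python
# E820 entries copied from the Xen logs quoted above (inclusive ranges).
NON_SECUREBOOT = [
    (0x00810000, 0x008FFFFF, "ACPI NVS"),
    (0x00900000, 0x7F8EEFFF, "usable"),
]
SECUREBOOT = [
    (0x00810000, 0x0170FFFF, "ACPI NVS"),
    (0x01710000, 0x7F0EDFFF, "usable"),
]

# Guest physical address of the kernel text as reported in the thread.
KERNEL_TEXT_GPA = 0x1000000


def region_type(e820, addr):
    """Return the type of the E820 entry covering addr, or None if no entry does."""
    for start, end, kind in e820:
        if start <= addr <= end:
            return kind
    return None


def kernel_start_is_ram(e820, addr=KERNEL_TEXT_GPA):
    """True if addr falls in a 'usable' entry (hypothetical helper, start address only)."""
    return region_type(e820, addr) == "usable"


if __name__ == "__main__":
    for name, e820 in (("Non-SecureBoot", NON_SECUREBOOT), ("SecureBoot", SECUREBOOT)):
        print(f"{name}: {KERNEL_TEXT_GPA:#x} is {region_type(e820, KERNEL_TEXT_GPA)}")
```

With the SecureBoot map the address lands in the ACPI NVS entry, which is
the non-RAM case described above; with the Non-SecureBoot map it lands in
usable RAM.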


Juergen



Thread overview: 16+ messages
2018-02-16 17:48 PV guest with PCI passthrough crash on Xen 4.8.3 inside KVM when booted through OVMF Marek Marczykowski-Górecki
2018-02-16 17:52 ` Andrew Cooper
2018-02-16 18:51   ` Marek Marczykowski-Górecki
2018-02-16 19:02     ` Andrew Cooper
2018-02-16 19:54       ` Marek Marczykowski-Górecki
2018-02-19 17:23         ` Juergen Gross
2018-02-19 17:29           ` Marek Marczykowski-Górecki
2018-02-19 17:46             ` Juergen Gross
2018-02-19 17:49               ` Andrew Cooper
2018-02-16 21:35       ` Rich Persaud
2018-02-19 17:13         ` Roger Pau Monné
2018-02-19 17:30       ` Juergen Gross
2023-11-26 14:51         ` [Xen-devel] " Marek Marczykowski-Górecki
     [not found]           ` <CACHz=ZiWufUenyw_wg+QuK86+gU5RZNkuJNzX9-K1UM5P3m8+Q@mail.gmail.com>
2023-11-27 11:26             ` Marek Marczykowski-Górecki
2023-11-27 15:56               ` Jason Andryuk
2023-11-27 16:05                 ` Juergen Gross [this message]
