Date: Mon, 27 Nov 2023 12:26:49 +0100
From: Marek Marczykowski-Górecki
To: Frediano Ziglio
Cc: xen-devel
Subject: Re: [Xen-devel] PV guest with PCI passthrough crash on Xen 4.8.3 inside KVM when booted through OVMF
References: <20180216174835.GJ4302@mail-itl>
 <3b6ce245-626d-a6db-b9fa-77dcf26a4ad6@citrix.com>
 <20180216185122.GK4302@mail-itl>
 <01e7d219-5a2f-58cb-bb30-59f31749f019@suse.com>

On Mon, Nov 27, 2023 at 11:20:36AM +0000, Frediano Ziglio wrote:
> On Sun, Nov 26, 2023 at 2:51 PM Marek Marczykowski-Górecki
> wrote:
> >
> > On Mon, Feb 19, 2018 at 06:30:14PM +0100, Juergen Gross wrote:
> > > On 16/02/18 20:02, Andrew Cooper wrote:
> > > > On 16/02/18 18:51, Marek Marczykowski-Górecki wrote:
> > > >> On Fri, Feb 16, 2018 at 05:52:50PM +0000, Andrew Cooper wrote:
> > > >>> On 16/02/18 17:48, Marek Marczykowski-Górecki wrote:
> > > >>>> Hi,
> > > >>>>
> > > >>>> As in the subject, the guest crashes on boot, before kernel output
> > > >>>> anything. I've isolated this to the conditions below:
> > > >>>>  - PV guest have PCI device assigned (e1000e emulated by QEMU in this case),
> > > >>>>    without PCI device it works
> > > >>>>  - Xen (in KVM) is started through OVMF; with seabios it works
> > > >>>>  - nested HVM is disabled in KVM
> > > >>>>  - AMD IOMMU emulation is disabled in KVM; when enabled qemu crashes on
> > > >>>>    boot (looks like qemu bug, unrelated to this one)
> > > >>>>
> > > >>>> Version info:
> > > >>>>  - KVM host: OpenSUSE 42.3, qemu 2.9.1, ovmf-2017+git1492060560.b6d11d7c46-4.1, AMD
> > > >>>>  - Xen host: Xen 4.8.3, dom0: Linux 4.14.13
> > > >>>>  - Xen domU: Linux 4.14.13, direct boot
> > > >>>>
> > > >>>> Not sure if relevant, but initially I've tried booting xen.efi /mapbs
> > > >>>> /noexitboot and then dom0 kernel crashed saying something about conflict
> > > >>>> between e820 and kernel mapping. But now those options are disabled.
> > > >>>>
> > > >>>> The crash message:
> > > >>>> (XEN) d1v0 Unhandled invalid opcode fault/trap [#6, ec=0000]
> > > >>>> (XEN) domain_crash_sync called from entry.S: fault at ffff82d080218720 entry.o#create_bounce_frame+0x137/0x146
> > > >>>> (XEN) Domain 1 (vcpu#0) crashed on cpu#1:
> > > >>>> (XEN) ----[ Xen-4.8.3 x86_64 debug=n Not tainted ]----
> > > >>>> (XEN) CPU: 1
> > > >>>> (XEN) RIP: e033:[]
> > > >>> This is #UD, which is most probably hitting a BUG().  addr2line this ^
> > > >>> to find some code to look at.
> > > >> addr2line failed me
> > > >
> > > > By default, vmlinux is stripped and compressed.  Ideally you want to
> > > > addr2line the vmlinux artefact in the root of your kernel build, which
> > > > is the plain elf with debugging symbols.
> > > >
> > > > Alternatively, use scripts/extract-vmlinux on the binary you actually
> > > > booted, which might get you somewhere.
> > > >
> > > >> , but System.map says its xen_memory_setup. And it
> > > >> looks like the BUG() is the same as I had in dom0 before:
> > > >> "Xen hypervisor allocated kernel memory conflicts with E820 map".
> > > >
> > > > Juergen: Is there anything we can do to try and insert some dummy
> > > > exception handlers right at PV start, so we could at least print out a
> > > > oneliner to the host console which is a little more helpful than Xen
> > > > saying "something unknown went wrong" ?
> > >
> > > You mean something like commit 42b3a4cb5609de757f5445fcad18945ba9239a07
> > > added to kernel 4.15?
> > >
> > > >
> > > >>
> > > >> Disabling e820_host in guest config solved the problem. Thanks!
> > > >>
> > > >> Is this some bug in Xen or OVMF, or is it expected behavior and e820_host
> > > >> should be avoided?
> > > >
> > > > I don't really know.  e820_host is a gross hack which shouldn't really
> > > > be present.  The actual problem is that Linux can't cope with the
> > > > memory layout it was given (and I can't recall if there is anything
> > > > Linux could potentially do to cope).  OTOH, the toolstack, which knew
> > > > about e820_host and chose to lay the guest out in an overlapping way is
> > > > probably also at fault.
> > >
> > > The kernel can cope with lots of E820 scenarios (e.g. by relocating
> > > initrd or the p2m map), but moving itself out of the way is not
> > > possible.
> >
> > I'm afraid I need to resurrect this thread...
> >
> > With recent kernel (6.6+), the e820_host=0 workaround is not an option
> > anymore. It makes Linux not initialize xen-swiotlb (due to
> > f9a38ea5172a3365f4594335ed5d63e15af2fd18), so PCI passthrough doesn't
> > work at all. While I can add yet another layer of workaround (force
> > xen-swiotlb with iommu=soft), that's getting unwieldy.
> >
> > Furthermore, I don't get the crash message anymore, even with debug
> > hypervisor and guest_loglvl=all. Not even "Domain X crashed" in `xl
> > dmesg`. It looks like the "crash" shutdown reason doesn't reach Xen, and
> > it's considered clean shutdown (I can confirm it by changing various
> > `on_*` settings (via libvirt) and observing which gets applied).
> >
> > Most tests I've done with 6.7-rc1, but the issue I observed on 6.6.1
> > already.
> >
> > This is on Xen 4.17.2. And the L0 is running Linux 6.6.1, and then uses
> > QEMU 8.1.2 + OVMF 202308 to run Xen as L1.
> >
>
> So basically you start the domain and it looks like it's shutting down
> cleanly from logs.
> Can you see anything from the guest? Can you turn on some more
> debugging at guest level?

No, it crashes before printing anything to the console, also with
earlyprintk=xen.

> I tried to get some more information from the initial crash but I
> could not understand which guest code triggered the bug.

I'm not sure which one it is this time (because I don't have Xen
reporting guest crash...) but last time it was here:
https://github.com/torvalds/linux/blob/master/arch/x86/xen/setup.c#L873-L874

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
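
For readers following the thread, the condition behind the "Xen hypervisor
allocated kernel memory conflicts with E820 map" BUG() can be sketched
roughly as below. This is an illustration in Python, not the kernel code.
It assumes the check amounts to "the kernel's physical range must sit
entirely inside a single RAM entry of the E820 map". The map entries, the
kernel range and the names E820Entry and range_is_usable are all made up
for the example; the real logic is in arch/x86/xen/setup.c.

```python
from dataclasses import dataclass

# E820 type codes used in this sketch.
E820_RAM = 1
E820_RESERVED = 2


@dataclass(frozen=True)
class E820Entry:
    addr: int  # physical start address
    size: int  # length in bytes
    type: int  # E820_RAM, E820_RESERVED, ...


def range_is_usable(e820, start, size):
    """Return True if [start, start+size) lies fully inside one RAM entry."""
    end = start + size
    return any(
        e.type == E820_RAM and e.addr <= start and end <= e.addr + e.size
        for e in e820
    )


# Hypothetical map with a small reserved hole at 8 MiB, standing in for a
# firmware-provided layout passed through to the guest with e820_host.
host_like_map = [
    E820Entry(0x0000_0000, 0x0009_F000, E820_RAM),
    E820Entry(0x0010_0000, 0x0070_0000, E820_RAM),       # 1 MiB .. 8 MiB
    E820Entry(0x0080_0000, 0x0000_8000, E820_RESERVED),  # hole at 8 MiB
    E820Entry(0x0080_8000, 0x3F7F_8000, E820_RAM),       # up to 1 GiB
]

# Hypothetical flat map, standing in for a guest without e820_host.
flat_map = [
    E820Entry(0x0000_0000, 0x0009_F000, E820_RAM),
    E820Entry(0x0010_0000, 0x3FF0_0000, E820_RAM),       # 1 MiB .. 1 GiB
]

# Hypothetical kernel placement: 16 MiB starting at 1 MiB.
KERNEL_START = 0x0010_0000
KERNEL_SIZE = 0x0100_0000

conflict_with_hole = not range_is_usable(host_like_map, KERNEL_START, KERNEL_SIZE)
conflict_flat = not range_is_usable(flat_map, KERNEL_START, KERNEL_SIZE)
```

With these made-up numbers the kernel range crosses the reserved hole in the
first map and fits inside one RAM entry in the second, which matches the
observation that the guest boots once the host-derived layout is not used.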