* [PATCH v1 0/6] xen/arm: Add Virtio-PCI for dom0less on ARM
@ 2024-09-24 16:23 Edgar E. Iglesias
2024-09-24 16:23 ` [PATCH v1 1/6] xen/arm: Decrease size of the 2nd ram bank Edgar E. Iglesias
` (5 more replies)
0 siblings, 6 replies; 32+ messages in thread
From: Edgar E. Iglesias @ 2024-09-24 16:23 UTC (permalink / raw)
To: xen-devel
Cc: sstabellini, julien, bertrand.marquis, michal.orzel,
Volodymyr_Babchuk, dpsmith, edgar.iglesias
From: "Edgar E. Iglesias" <edgar.iglesias@amd.com>
This series adds support for Virtio-PCI for dom0less on ARM.
Three parts:
1. Decrease size of second RAM bank to free up space in <40bit address range.
2. Add generation of virtio-pci FDT nodes for domU.
3. Add a background ECAM region always responding 0xFFFFFFFF (nothing attached).
Decreasing the size of the second RAM bank is probably a bit controversial
since it breaks backwards compatibility for guests with huge amounts of RAM.
There are other options we could consider but the problem we've got is that
we're running out of address-space under <40bits. There're some fragmented
holes under <32bit but nothing between 32 - 40bits and there are ARM systems
out there that only have 40-bit physical address space (Cortex-A53 on the
Xilinx ZynqMP to name one).
For virtio-pci the holes <32-bit are not large enough to fit the ECAM range
nor a 4G area to map pretechable-memory BARs.
Another option is to leave the second bank unmodified and default virtio-pci
to >40bit addresses. Users that create guests with less RAM than the maximum
can use the fdt bindings to relocate virtio-pci down to <40bit range if they
need to.
I've published some instructions on how to try this including the work in
progress QEMU side of the ARM PVH PCI support:
https://github.com/edgarigl/docs/blob/master/xen/pvh/virtio-pci-dom0less.md
Cheers,
Edgar
Edgar E. Iglesias (5):
xen/arm: Decrease size of the 2nd ram bank
xen/arm: Reserve resources for virtio-pci
xen/arm: io: Add support for mmio background regions
xen/arm: io: Add a read-const writes-ignored mmio handler
xen/arm: dom0less: Add a background PCI ECAM mmio region
Stewart Hildebrand (1):
xen/arm: create dom0less virtio-pci DT node
docs/misc/arm/device-tree/booting.txt | 21 +++
xen/arch/arm/dom0less-build.c | 248 ++++++++++++++++++++++++++
xen/arch/arm/include/asm/kernel.h | 15 ++
xen/arch/arm/include/asm/mmio.h | 13 +-
xen/arch/arm/io.c | 39 +++-
xen/include/public/arch-arm.h | 21 ++-
6 files changed, 348 insertions(+), 9 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 32+ messages in thread* [PATCH v1 1/6] xen/arm: Decrease size of the 2nd ram bank 2024-09-24 16:23 [PATCH v1 0/6] xen/arm: Add Virtio-PCI for dom0less on ARM Edgar E. Iglesias @ 2024-09-24 16:23 ` Edgar E. Iglesias 2024-09-24 16:30 ` Julien Grall 2024-09-24 16:23 ` [PATCH v1 2/6] xen/arm: Reserve resources for virtio-pci Edgar E. Iglesias ` (4 subsequent siblings) 5 siblings, 1 reply; 32+ messages in thread From: Edgar E. Iglesias @ 2024-09-24 16:23 UTC (permalink / raw) To: xen-devel Cc: sstabellini, julien, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> The address range between 4G (32bit) and 1TB (40bit) is fully allocated. There's no more room for devices on ARM systems with 40-bit physicall address width. This decreases the size of the second RAM bank to free up space in preparation for virtio-pci and for future use-cases. In the future we may need to add a third RAM bank in higher address ranges. Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> --- xen/include/public/arch-arm.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h index e2412a1747..e19f0251a6 100644 --- a/xen/include/public/arch-arm.h +++ b/xen/include/public/arch-arm.h @@ -491,8 +491,8 @@ typedef uint64_t xen_callback_t; #define GUEST_VPCI_PREFETCH_MEM_ADDR xen_mk_ullong(0x100000000) #define GUEST_VPCI_PREFETCH_MEM_SIZE xen_mk_ullong(0x100000000) -#define GUEST_RAM1_BASE xen_mk_ullong(0x0200000000) /* 1016GB of RAM @ 8GB */ -#define GUEST_RAM1_SIZE xen_mk_ullong(0xfe00000000) +#define GUEST_RAM1_BASE xen_mk_ullong(0x0200000000) /* 952GB of RAM @ 8GB */ +#define GUEST_RAM1_SIZE xen_mk_ullong(0xee00000000) #define GUEST_RAM_BASE GUEST_RAM0_BASE /* Lowest RAM address */ /* Largest amount of actual RAM, not including holes */ -- 2.43.0 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v1 1/6] xen/arm: Decrease size of the 2nd ram bank 2024-09-24 16:23 ` [PATCH v1 1/6] xen/arm: Decrease size of the 2nd ram bank Edgar E. Iglesias @ 2024-09-24 16:30 ` Julien Grall 2024-09-24 23:34 ` Stefano Stabellini 0 siblings, 1 reply; 32+ messages in thread From: Julien Grall @ 2024-09-24 16:30 UTC (permalink / raw) To: Edgar E. Iglesias, xen-devel Cc: sstabellini, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias Hi Edgar, On 24/09/2024 17:23, Edgar E. Iglesias wrote: > From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> > > The address range between 4G (32bit) and 1TB (40bit) is fully > allocated. There's no more room for devices on ARM systems with > 40-bit physicall address width. > > This decreases the size of the second RAM bank to free up space > in preparation for virtio-pci and for future use-cases. I don't think we should reduce the amount of RAM supported in the default case. Instead, I think it is time to support a more dynamic layout so we still allow 1TB guest when QEMU is not emulated a virtual PCI hostbridge. Cheers, -- Julien Grall ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 1/6] xen/arm: Decrease size of the 2nd ram bank 2024-09-24 16:30 ` Julien Grall @ 2024-09-24 23:34 ` Stefano Stabellini 2024-09-24 23:40 ` Edgar E. Iglesias 0 siblings, 1 reply; 32+ messages in thread From: Stefano Stabellini @ 2024-09-24 23:34 UTC (permalink / raw) To: Julien Grall Cc: Edgar E. Iglesias, xen-devel, sstabellini, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias On Tue, 24 Sep 2024, Julien Grall wrote: > Hi Edgar, > > On 24/09/2024 17:23, Edgar E. Iglesias wrote: > > From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> > > > > The address range between 4G (32bit) and 1TB (40bit) is fully > > allocated. There's no more room for devices on ARM systems with > > 40-bit physicall address width. > > > This decreases the size of the second RAM bank to free up space > > in preparation for virtio-pci and for future use-cases. > > I don't think we should reduce the amount of RAM supported in the default > case. Instead, I think it is time to support a more dynamic layout so we still > allow 1TB guest when QEMU is not emulated a virtual PCI hostbridge. Edgar, do you think it would be possible for QEMU to take the virtio-pci address ranges and SPIs on the command line? If yes, I think that would solve the problem on the QEMU side. On the Xen side, you have already added "virtio-pci-ranges" as dom0less property and that's all we need as far as I can tell. Then you can remove patch #1 and patch #2 from this series? ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 1/6] xen/arm: Decrease size of the 2nd ram bank 2024-09-24 23:34 ` Stefano Stabellini @ 2024-09-24 23:40 ` Edgar E. Iglesias 0 siblings, 0 replies; 32+ messages in thread From: Edgar E. Iglesias @ 2024-09-24 23:40 UTC (permalink / raw) To: Stefano Stabellini Cc: Julien Grall, xen-devel, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias On Tue, Sep 24, 2024 at 04:34:16PM -0700, Stefano Stabellini wrote: > On Tue, 24 Sep 2024, Julien Grall wrote: > > Hi Edgar, > > > > On 24/09/2024 17:23, Edgar E. Iglesias wrote: > > > From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> > > > > > > The address range between 4G (32bit) and 1TB (40bit) is fully > > > allocated. There's no more room for devices on ARM systems with > > > 40-bit physicall address width. > > > > This decreases the size of the second RAM bank to free up space > > > in preparation for virtio-pci and for future use-cases. > > > > I don't think we should reduce the amount of RAM supported in the default > > case. Instead, I think it is time to support a more dynamic layout so we still > > allow 1TB guest when QEMU is not emulated a virtual PCI hostbridge. > > Edgar, do you think it would be possible for QEMU to take the virtio-pci > address ranges and SPIs on the command line? If yes, I think that would > solve the problem on the QEMU side. Yes, that is already done on the QEMU side, all addresses and sizes are passed on the command-line. > On the Xen side, you have already added "virtio-pci-ranges" as dom0less > property and that's all we need as far as I can tell. > > Then you can remove patch #1 and patch #2 from this series? I didn't think of that but yes, that should actually work, good point. Cheers, Edgar ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v1 2/6] xen/arm: Reserve resources for virtio-pci 2024-09-24 16:23 [PATCH v1 0/6] xen/arm: Add Virtio-PCI for dom0less on ARM Edgar E. Iglesias 2024-09-24 16:23 ` [PATCH v1 1/6] xen/arm: Decrease size of the 2nd ram bank Edgar E. Iglesias @ 2024-09-24 16:23 ` Edgar E. Iglesias 2024-09-24 16:35 ` Julien Grall 2024-09-24 16:23 ` [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node Edgar E. Iglesias ` (3 subsequent siblings) 5 siblings, 1 reply; 32+ messages in thread From: Edgar E. Iglesias @ 2024-09-24 16:23 UTC (permalink / raw) To: xen-devel Cc: sstabellini, julien, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> Reserve memory ranges and interrupt lines for an externally emulated PCI controller (e.g by QEMU) dedicated to hosting Virtio devices and potentially other emulated devices. Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> --- xen/include/public/arch-arm.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h index e19f0251a6..654b827715 100644 --- a/xen/include/public/arch-arm.h +++ b/xen/include/public/arch-arm.h @@ -494,6 +494,20 @@ typedef uint64_t xen_callback_t; #define GUEST_RAM1_BASE xen_mk_ullong(0x0200000000) /* 952GB of RAM @ 8GB */ #define GUEST_RAM1_SIZE xen_mk_ullong(0xee00000000) +/* Virtio PCI - Ordered by decreasing size to keep things aligned */ +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_TYPE xen_mk_ullong(0x43000000) +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE xen_mk_ullong(0x0f000000000) +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE xen_mk_ullong(0x100000000) + +#define GUEST_VIRTIO_PCI_ECAM_BASE (GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE + \ + GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE) +#define GUEST_VIRTIO_PCI_ECAM_SIZE xen_mk_ullong(0x10000000) + +#define GUEST_VIRTIO_PCI_MEM_TYPE xen_mk_ullong(0x02000000) +#define GUEST_VIRTIO_PCI_MEM_BASE (GUEST_VIRTIO_PCI_ECAM_BASE + \ + GUEST_VIRTIO_PCI_ECAM_SIZE) +#define GUEST_VIRTIO_PCI_MEM_SIZE xen_mk_ullong(0x00002000000) + #define GUEST_RAM_BASE GUEST_RAM0_BASE /* Lowest RAM address */ /* Largest amount of actual RAM, not including holes */ #define GUEST_RAM_MAX (GUEST_RAM0_SIZE + GUEST_RAM1_SIZE) @@ -529,6 +543,9 @@ typedef uint64_t xen_callback_t; #define GUEST_FFA_NOTIF_PEND_INTR_ID 8 #define GUEST_FFA_SCHEDULE_RECV_INTR_ID 9 +#define GUEST_VIRTIO_PCI_SPI_FIRST 44 +#define GUEST_VIRTIO_PCI_SPI_LAST 48 + /* PSCI functions */ #define PSCI_cpu_suspend 0 #define PSCI_cpu_off 1 -- 2.43.0 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v1 2/6] xen/arm: Reserve resources for virtio-pci 2024-09-24 16:23 ` [PATCH v1 2/6] xen/arm: Reserve resources for virtio-pci Edgar E. Iglesias @ 2024-09-24 16:35 ` Julien Grall 2024-09-24 17:11 ` Edgar E. Iglesias 0 siblings, 1 reply; 32+ messages in thread From: Julien Grall @ 2024-09-24 16:35 UTC (permalink / raw) To: Edgar E. Iglesias, xen-devel Cc: sstabellini, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias Hi Edgar, On 24/09/2024 17:23, Edgar E. Iglesias wrote: > From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> > > Reserve memory ranges and interrupt lines for an externally > emulated PCI controller (e.g by QEMU) dedicated to hosting > Virtio devices and potentially other emulated devices. > > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> > --- > xen/include/public/arch-arm.h | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h > index e19f0251a6..654b827715 100644 > --- a/xen/include/public/arch-arm.h > +++ b/xen/include/public/arch-arm.h > @@ -494,6 +494,20 @@ typedef uint64_t xen_callback_t; > #define GUEST_RAM1_BASE xen_mk_ullong(0x0200000000) /* 952GB of RAM @ 8GB */ > #define GUEST_RAM1_SIZE xen_mk_ullong(0xee00000000) > > +/* Virtio PCI - Ordered by decreasing size to keep things aligned */ > +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_TYPE xen_mk_ullong(0x43000000) > +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE xen_mk_ullong(0x0f000000000) > +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE xen_mk_ullong(0x100000000) > + > +#define GUEST_VIRTIO_PCI_ECAM_BASE (GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE + \ > + GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE) > +#define GUEST_VIRTIO_PCI_ECAM_SIZE xen_mk_ullong(0x10000000) > + > +#define GUEST_VIRTIO_PCI_MEM_TYPE xen_mk_ullong(0x02000000) > +#define GUEST_VIRTIO_PCI_MEM_BASE (GUEST_VIRTIO_PCI_ECAM_BASE + \ > + GUEST_VIRTIO_PCI_ECAM_SIZE) > +#define GUEST_VIRTIO_PCI_MEM_SIZE xen_mk_ullong(0x00002000000) Why is this specific to virtio PCI? However, I am not entirely convinced we should have a second PCI hostbridge exposed to the guest for a few reasons: 1. This require to reserve yet another range in the address space (could be solved with a more dynamic layout) 2. From your instructions, the guest needs to explicitly do a PCI rescan. So rather than having a second hostbridge, have you considered to extend the existing hostbridge (implemented in Xen) to support a mix of physical PCI device and virtual one? Cheers, -- Julien Grall ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 2/6] xen/arm: Reserve resources for virtio-pci 2024-09-24 16:35 ` Julien Grall @ 2024-09-24 17:11 ` Edgar E. Iglesias 2024-09-24 17:24 ` Julien Grall 0 siblings, 1 reply; 32+ messages in thread From: Edgar E. Iglesias @ 2024-09-24 17:11 UTC (permalink / raw) To: Julien Grall Cc: Edgar E. Iglesias, xen-devel, sstabellini, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith On Tue, Sep 24, 2024 at 05:35:20PM +0100, Julien Grall wrote: > Hi Edgar, > > On 24/09/2024 17:23, Edgar E. Iglesias wrote: > > From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> > > > > Reserve memory ranges and interrupt lines for an externally > > emulated PCI controller (e.g by QEMU) dedicated to hosting > > Virtio devices and potentially other emulated devices. > > > > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> > > --- > > xen/include/public/arch-arm.h | 17 +++++++++++++++++ > > 1 file changed, 17 insertions(+) > > > > diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h > > index e19f0251a6..654b827715 100644 > > --- a/xen/include/public/arch-arm.h > > +++ b/xen/include/public/arch-arm.h > > @@ -494,6 +494,20 @@ typedef uint64_t xen_callback_t; > > #define GUEST_RAM1_BASE xen_mk_ullong(0x0200000000) /* 952GB of RAM @ 8GB */ > > #define GUEST_RAM1_SIZE xen_mk_ullong(0xee00000000) > > +/* Virtio PCI - Ordered by decreasing size to keep things aligned */ > > +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_TYPE xen_mk_ullong(0x43000000) > > +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE xen_mk_ullong(0x0f000000000) > > +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE xen_mk_ullong(0x100000000) > > + > > +#define GUEST_VIRTIO_PCI_ECAM_BASE (GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE + \ > > + GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE) > > +#define GUEST_VIRTIO_PCI_ECAM_SIZE xen_mk_ullong(0x10000000) > > + > > +#define GUEST_VIRTIO_PCI_MEM_TYPE xen_mk_ullong(0x02000000) > > +#define GUEST_VIRTIO_PCI_MEM_BASE (GUEST_VIRTIO_PCI_ECAM_BASE + \ > > + GUEST_VIRTIO_PCI_ECAM_SIZE) > > +#define GUEST_VIRTIO_PCI_MEM_SIZE xen_mk_ullong(0x00002000000) > > Why is this specific to virtio PCI? However, I am not entirely convinced we > should have a second PCI hostbridge exposed to the guest for a few reasons: > 1. This require to reserve yet another range in the address space (could > be solved with a more dynamic layout) > 2. From your instructions, the guest needs to explicitly do a PCI rescan. > > So rather than having a second hostbridge, have you considered to extend the > existing hostbridge (implemented in Xen) to support a mix of physical PCI > device and virtual one? > Thanks Julien, It's briefly come up in a couple of discussions but I haven't looked carefully at it. It is a good idea and it's probably worth prototyping to see what the gaps are in hypercall interfaces, QEMU support etc. Cheers, Edgar ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 2/6] xen/arm: Reserve resources for virtio-pci 2024-09-24 17:11 ` Edgar E. Iglesias @ 2024-09-24 17:24 ` Julien Grall 2024-09-24 23:16 ` Stefano Stabellini 0 siblings, 1 reply; 32+ messages in thread From: Julien Grall @ 2024-09-24 17:24 UTC (permalink / raw) To: Edgar E. Iglesias Cc: Edgar E. Iglesias, xen-devel, sstabellini, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith Hi Edgar, On 24/09/2024 18:11, Edgar E. Iglesias wrote: > On Tue, Sep 24, 2024 at 05:35:20PM +0100, Julien Grall wrote: >> Hi Edgar, >> >> On 24/09/2024 17:23, Edgar E. Iglesias wrote: >>> From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> >>> >>> Reserve memory ranges and interrupt lines for an externally >>> emulated PCI controller (e.g by QEMU) dedicated to hosting >>> Virtio devices and potentially other emulated devices. >>> >>> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> >>> --- >>> xen/include/public/arch-arm.h | 17 +++++++++++++++++ >>> 1 file changed, 17 insertions(+) >>> >>> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h >>> index e19f0251a6..654b827715 100644 >>> --- a/xen/include/public/arch-arm.h >>> +++ b/xen/include/public/arch-arm.h >>> @@ -494,6 +494,20 @@ typedef uint64_t xen_callback_t; >>> #define GUEST_RAM1_BASE xen_mk_ullong(0x0200000000) /* 952GB of RAM @ 8GB */ >>> #define GUEST_RAM1_SIZE xen_mk_ullong(0xee00000000) >>> +/* Virtio PCI - Ordered by decreasing size to keep things aligned */ >>> +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_TYPE xen_mk_ullong(0x43000000) >>> +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE xen_mk_ullong(0x0f000000000) >>> +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE xen_mk_ullong(0x100000000) >>> + >>> +#define GUEST_VIRTIO_PCI_ECAM_BASE (GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE + \ >>> + GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE) >>> +#define GUEST_VIRTIO_PCI_ECAM_SIZE xen_mk_ullong(0x10000000) >>> + >>> +#define GUEST_VIRTIO_PCI_MEM_TYPE xen_mk_ullong(0x02000000) >>> +#define GUEST_VIRTIO_PCI_MEM_BASE (GUEST_VIRTIO_PCI_ECAM_BASE + \ >>> + GUEST_VIRTIO_PCI_ECAM_SIZE) >>> +#define GUEST_VIRTIO_PCI_MEM_SIZE xen_mk_ullong(0x00002000000) >> >> Why is this specific to virtio PCI? However, I am not entirely convinced we >> should have a second PCI hostbridge exposed to the guest for a few reasons: >> 1. This require to reserve yet another range in the address space (could >> be solved with a more dynamic layout) >> 2. From your instructions, the guest needs to explicitly do a PCI rescan. Another big advantage I forgot to mention is disaggregation. In a world where the hostbridge is handled in Xen, you could have a PCI device emulated by dom0 while the other is emulated by a stub domain. >> >> So rather than having a second hostbridge, have you considered to extend the >> existing hostbridge (implemented in Xen) to support a mix of physical PCI >> device and virtual one? >> > > Thanks Julien, > > It's briefly come up in a couple of discussions but I haven't looked > carefully at it. It is a good idea and it's probably worth prototyping > to see what the gaps are in hypercall interfaces, QEMU support etc. I also vaguely recall to discuss it on xen-devel. But I couldn't find the discussion... :(. I think all the hypercalls should be there but will require some plumbing in Xen on Arm. QEMU should be able to request Xen to forward configuration access for a given PCI device (see XEN_DMOP_IO_RANGE_PCI). They will then be forwarded to QEMU using IOREQ_TYPE_PCI_CONFIG. We also have an hypercall to be able to inject interrupts from QEMU (see XEN_DMOP_set_irq_level). Cheers, -- Julien Grall ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 2/6] xen/arm: Reserve resources for virtio-pci 2024-09-24 17:24 ` Julien Grall @ 2024-09-24 23:16 ` Stefano Stabellini 2024-09-25 7:36 ` Julien Grall 0 siblings, 1 reply; 32+ messages in thread From: Stefano Stabellini @ 2024-09-24 23:16 UTC (permalink / raw) To: Julien Grall Cc: Edgar E. Iglesias, Edgar E. Iglesias, xen-devel, sstabellini, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith [-- Attachment #1: Type: text/plain, Size: 4728 bytes --] On Tue, 24 Sep 2024, Julien Grall wrote: > On 24/09/2024 18:11, Edgar E. Iglesias wrote: > > On Tue, Sep 24, 2024 at 05:35:20PM +0100, Julien Grall wrote: > > > Hi Edgar, > > > > > > On 24/09/2024 17:23, Edgar E. Iglesias wrote: > > > > From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> > > > > > > > > Reserve memory ranges and interrupt lines for an externally > > > > emulated PCI controller (e.g by QEMU) dedicated to hosting > > > > Virtio devices and potentially other emulated devices. > > > > > > > > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> > > > > --- > > > > xen/include/public/arch-arm.h | 17 +++++++++++++++++ > > > > 1 file changed, 17 insertions(+) > > > > > > > > diff --git a/xen/include/public/arch-arm.h > > > > b/xen/include/public/arch-arm.h > > > > index e19f0251a6..654b827715 100644 > > > > --- a/xen/include/public/arch-arm.h > > > > +++ b/xen/include/public/arch-arm.h > > > > @@ -494,6 +494,20 @@ typedef uint64_t xen_callback_t; > > > > #define GUEST_RAM1_BASE xen_mk_ullong(0x0200000000) /* 952GB of RAM > > > > @ 8GB */ > > > > #define GUEST_RAM1_SIZE xen_mk_ullong(0xee00000000) > > > > +/* Virtio PCI - Ordered by decreasing size to keep things aligned */ > > > > +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_TYPE xen_mk_ullong(0x43000000) > > > > +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE > > > > xen_mk_ullong(0x0f000000000) > > > > +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE xen_mk_ullong(0x100000000) > > > > + > > > > +#define GUEST_VIRTIO_PCI_ECAM_BASE > > > > (GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE + \ > > > > + > > > > GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE) > > > > +#define GUEST_VIRTIO_PCI_ECAM_SIZE xen_mk_ullong(0x10000000) > > > > + > > > > +#define GUEST_VIRTIO_PCI_MEM_TYPE xen_mk_ullong(0x02000000) > > > > +#define GUEST_VIRTIO_PCI_MEM_BASE (GUEST_VIRTIO_PCI_ECAM_BASE + > > > > \ > > > > + GUEST_VIRTIO_PCI_ECAM_SIZE) > > > > +#define GUEST_VIRTIO_PCI_MEM_SIZE xen_mk_ullong(0x00002000000) > > > > > > Why is this specific to virtio PCI? However, I am not entirely convinced > > > we > > > should have a second PCI hostbridge exposed to the guest for a few > > > reasons: > > > 1. This require to reserve yet another range in the address space > > > (could > > > be solved with a more dynamic layout) > > > 2. From your instructions, the guest needs to explicitly do a PCI > > > rescan. > > Another big advantage I forgot to mention is disaggregation. In a world where > the hostbridge is handled in Xen, you could have a PCI device emulated by dom0 > while the other is emulated by a stub domain. > > > > > > > So rather than having a second hostbridge, have you considered to extend > > > the > > > existing hostbridge (implemented in Xen) to support a mix of physical PCI > > > device and virtual one? > > > > > > > Thanks Julien, > > > > It's briefly come up in a couple of discussions but I haven't looked > > carefully at it. It is a good idea and it's probably worth prototyping > > to see what the gaps are in hypercall interfaces, QEMU support etc. > > I also vaguely recall to discuss it on xen-devel. But I couldn't find the > discussion... :(. > > I think all the hypercalls should be there but will require some plumbing in > Xen on Arm. QEMU should be able to request Xen to forward configuration access > for a given PCI device (see XEN_DMOP_IO_RANGE_PCI). They will then be > forwarded to QEMU using IOREQ_TYPE_PCI_CONFIG. > > We also have an hypercall to be able to inject interrupts from QEMU (see > XEN_DMOP_set_irq_level). Hi Julien, Yes, I remember a thread on xen-devel too about this topic when EPAM suggested a similar two-hostbridges approach. I was one of the people suggesting to use a single hostbridge at the time. However, when we looked at the implementation more closely, the two-hostbridge approach was easier to get up and running. It works (almost) out of the box. Currently, we have the two-hostbridge solution working on both ARM and x86 to enable virtio-pci to work alongside vPCI in Dom0less/Hyperlaunch configurations. While I think that a single hostbridge is better architecturally, it is important to consider that virtio is moving toward a new transport (virtio-msg, Bertrand is also involved) which does not require a hostbridge. This new transport is key for all our use-cases as it addresses safety requirements and supports AMP configurations without a shared hypervisor between the frontend and backend. Edgar is one of the top contributors to virtio-msg. Given this, I don't think it's worthwhile to invest much effort in virtio-pci, as it will be replaced soon in embedded applications. Cheers, Stefano ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 2/6] xen/arm: Reserve resources for virtio-pci 2024-09-24 23:16 ` Stefano Stabellini @ 2024-09-25 7:36 ` Julien Grall 2024-09-25 18:09 ` Stefano Stabellini 0 siblings, 1 reply; 32+ messages in thread From: Julien Grall @ 2024-09-25 7:36 UTC (permalink / raw) To: Stefano Stabellini Cc: Edgar E. Iglesias, Edgar E. Iglesias, xen-devel, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith Hi Stefano, On 25/09/2024 00:16, Stefano Stabellini wrote: > On Tue, 24 Sep 2024, Julien Grall wrote: >> On 24/09/2024 18:11, Edgar E. Iglesias wrote: >>> On Tue, Sep 24, 2024 at 05:35:20PM +0100, Julien Grall wrote: >>>> Hi Edgar, >>>> >>>> On 24/09/2024 17:23, Edgar E. Iglesias wrote: >>>>> From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> >>>>> >>>>> Reserve memory ranges and interrupt lines for an externally >>>>> emulated PCI controller (e.g by QEMU) dedicated to hosting >>>>> Virtio devices and potentially other emulated devices. >>>>> >>>>> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> >>>>> --- >>>>> xen/include/public/arch-arm.h | 17 +++++++++++++++++ >>>>> 1 file changed, 17 insertions(+) >>>>> >>>>> diff --git a/xen/include/public/arch-arm.h >>>>> b/xen/include/public/arch-arm.h >>>>> index e19f0251a6..654b827715 100644 >>>>> --- a/xen/include/public/arch-arm.h >>>>> +++ b/xen/include/public/arch-arm.h >>>>> @@ -494,6 +494,20 @@ typedef uint64_t xen_callback_t; >>>>> #define GUEST_RAM1_BASE xen_mk_ullong(0x0200000000) /* 952GB of RAM >>>>> @ 8GB */ >>>>> #define GUEST_RAM1_SIZE xen_mk_ullong(0xee00000000) >>>>> +/* Virtio PCI - Ordered by decreasing size to keep things aligned */ >>>>> +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_TYPE xen_mk_ullong(0x43000000) >>>>> +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE >>>>> xen_mk_ullong(0x0f000000000) >>>>> +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE xen_mk_ullong(0x100000000) >>>>> + >>>>> +#define GUEST_VIRTIO_PCI_ECAM_BASE >>>>> (GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE + \ >>>>> + >>>>> GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE) >>>>> +#define GUEST_VIRTIO_PCI_ECAM_SIZE xen_mk_ullong(0x10000000) >>>>> + >>>>> +#define GUEST_VIRTIO_PCI_MEM_TYPE xen_mk_ullong(0x02000000) >>>>> +#define GUEST_VIRTIO_PCI_MEM_BASE (GUEST_VIRTIO_PCI_ECAM_BASE + >>>>> \ >>>>> + GUEST_VIRTIO_PCI_ECAM_SIZE) >>>>> +#define GUEST_VIRTIO_PCI_MEM_SIZE xen_mk_ullong(0x00002000000) >>>> >>>> Why is this specific to virtio PCI? However, I am not entirely convinced >>>> we >>>> should have a second PCI hostbridge exposed to the guest for a few >>>> reasons: >>>> 1. This require to reserve yet another range in the address space >>>> (could >>>> be solved with a more dynamic layout) >>>> 2. From your instructions, the guest needs to explicitly do a PCI >>>> rescan. >> >> Another big advantage I forgot to mention is disaggregation. In a world where >> the hostbridge is handled in Xen, you could have a PCI device emulated by dom0 >> while the other is emulated by a stub domain. >> >>>> >>>> So rather than having a second hostbridge, have you considered to extend >>>> the >>>> existing hostbridge (implemented in Xen) to support a mix of physical PCI >>>> device and virtual one? >>>> >>> >>> Thanks Julien, >>> >>> It's briefly come up in a couple of discussions but I haven't looked >>> carefully at it. It is a good idea and it's probably worth prototyping >>> to see what the gaps are in hypercall interfaces, QEMU support etc. >> >> I also vaguely recall to discuss it on xen-devel. But I couldn't find the >> discussion... :(. >> >> I think all the hypercalls should be there but will require some plumbing in >> Xen on Arm. QEMU should be able to request Xen to forward configuration access >> for a given PCI device (see XEN_DMOP_IO_RANGE_PCI). They will then be >> forwarded to QEMU using IOREQ_TYPE_PCI_CONFIG. >> >> We also have an hypercall to be able to inject interrupts from QEMU (see >> XEN_DMOP_set_irq_level). > > Hi Julien, > > Yes, I remember a thread on xen-devel too about this topic when EPAM > suggested a similar two-hostbridges approach. I was one of the people > suggesting to use a single hostbridge at the time. > > However, when we looked at the implementation more closely, the > two-hostbridge approach was easier to get up and running. It works > (almost) out of the box. Currently, we have the two-hostbridge solution > working on both ARM and x86 to enable virtio-pci to work alongside vPCI > in Dom0less/Hyperlaunch configurations. I understand this is the easiest solution... However, this requires code in Xen that I am not yet convinced it is good to have. I am not too concerned about the MMIO range part. This can be (easily) solved. I am more concerned about the support of background region and the fact the OS needs to be able to rescan. I am definitely not an expert of PCI, but AFAIK, it is possible to have the guest to be notified when a PCI device is hotplug. Why can't we use it? > > While I think that a single hostbridge is better architecturally, it is > important to consider that virtio is moving toward a new transport > (virtio-msg, Bertrand is also involved) which does not require a > hostbridge. This new transport is key for all our use-cases as it > addresses safety requirements and supports AMP configurations without a > shared hypervisor between the frontend and backend. Edgar is one of the > top contributors to virtio-msg. Given this, I don't think it's > worthwhile to invest much effort in virtio-pci, as it will be replaced > soon in embedded applications. To me this raises the question why we should have a temporary solution upstream then? Cheers, -- Julien Grall ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 2/6] xen/arm: Reserve resources for virtio-pci 2024-09-25 7:36 ` Julien Grall @ 2024-09-25 18:09 ` Stefano Stabellini 0 siblings, 0 replies; 32+ messages in thread From: Stefano Stabellini @ 2024-09-25 18:09 UTC (permalink / raw) To: Julien Grall Cc: Stefano Stabellini, Edgar E. Iglesias, Edgar E. Iglesias, xen-devel, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith [-- Attachment #1: Type: text/plain, Size: 6666 bytes --] On Wed, 25 Sep 2024, Julien Grall wrote: > On 25/09/2024 00:16, Stefano Stabellini wrote: > > On Tue, 24 Sep 2024, Julien Grall wrote: > > > On 24/09/2024 18:11, Edgar E. Iglesias wrote: > > > > On Tue, Sep 24, 2024 at 05:35:20PM +0100, Julien Grall wrote: > > > > > Hi Edgar, > > > > > > > > > > On 24/09/2024 17:23, Edgar E. Iglesias wrote: > > > > > > From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> > > > > > > > > > > > > Reserve memory ranges and interrupt lines for an externally > > > > > > emulated PCI controller (e.g by QEMU) dedicated to hosting > > > > > > Virtio devices and potentially other emulated devices. > > > > > > > > > > > > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> > > > > > > --- > > > > > > xen/include/public/arch-arm.h | 17 +++++++++++++++++ > > > > > > 1 file changed, 17 insertions(+) > > > > > > > > > > > > diff --git a/xen/include/public/arch-arm.h > > > > > > b/xen/include/public/arch-arm.h > > > > > > index e19f0251a6..654b827715 100644 > > > > > > --- a/xen/include/public/arch-arm.h > > > > > > +++ b/xen/include/public/arch-arm.h > > > > > > @@ -494,6 +494,20 @@ typedef uint64_t xen_callback_t; > > > > > > #define GUEST_RAM1_BASE xen_mk_ullong(0x0200000000) /* 952GB > > > > > > of RAM > > > > > > @ 8GB */ > > > > > > #define GUEST_RAM1_SIZE xen_mk_ullong(0xee00000000) > > > > > > +/* Virtio PCI - Ordered by decreasing size to keep things aligned > > > > > > */ > > > > > > +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_TYPE > > > > > > xen_mk_ullong(0x43000000) > > > > > > +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE > > > > > > xen_mk_ullong(0x0f000000000) > > > > > > +#define GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE > > > > > > xen_mk_ullong(0x100000000) > > > > > > + > > > > > > +#define GUEST_VIRTIO_PCI_ECAM_BASE > > > > > > (GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE + \ > > > > > > + > > > > > > GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE) > > > > > > +#define GUEST_VIRTIO_PCI_ECAM_SIZE xen_mk_ullong(0x10000000) > > > > > > + > > > > > > +#define GUEST_VIRTIO_PCI_MEM_TYPE xen_mk_ullong(0x02000000) > > > > > > +#define GUEST_VIRTIO_PCI_MEM_BASE > > > > > > (GUEST_VIRTIO_PCI_ECAM_BASE + > > > > > > \ > > > > > > + > > > > > > GUEST_VIRTIO_PCI_ECAM_SIZE) > > > > > > +#define GUEST_VIRTIO_PCI_MEM_SIZE > > > > > > xen_mk_ullong(0x00002000000) > > > > > > > > > > Why is this specific to virtio PCI? However, I am not entirely > > > > > convinced > > > > > we > > > > > should have a second PCI hostbridge exposed to the guest for a few > > > > > reasons: > > > > > 1. This require to reserve yet another range in the address space > > > > > (could > > > > > be solved with a more dynamic layout) > > > > > 2. From your instructions, the guest needs to explicitly do a PCI > > > > > rescan. > > > > > > Another big advantage I forgot to mention is disaggregation. In a world > > > where > > > the hostbridge is handled in Xen, you could have a PCI device emulated by > > > dom0 > > > while the other is emulated by a stub domain. > > > > > > > > > > > > > So rather than having a second hostbridge, have you considered to > > > > > extend > > > > > the > > > > > existing hostbridge (implemented in Xen) to support a mix of physical > > > > > PCI > > > > > device and virtual one? > > > > > > > > > > > > > Thanks Julien, > > > > > > > > It's briefly come up in a couple of discussions but I haven't looked > > > > carefully at it. It is a good idea and it's probably worth prototyping > > > > to see what the gaps are in hypercall interfaces, QEMU support etc. > > > > > > I also vaguely recall to discuss it on xen-devel. But I couldn't find the > > > discussion... :(. > > > > > > I think all the hypercalls should be there but will require some plumbing > > > in > > > Xen on Arm. QEMU should be able to request Xen to forward configuration > > > access > > > for a given PCI device (see XEN_DMOP_IO_RANGE_PCI). They will then be > > > forwarded to QEMU using IOREQ_TYPE_PCI_CONFIG. > > > > > > We also have an hypercall to be able to inject interrupts from QEMU (see > > > XEN_DMOP_set_irq_level). > > > > Hi Julien, > > > > Yes, I remember a thread on xen-devel too about this topic when EPAM > > suggested a similar two-hostbridges approach. I was one of the people > > suggesting to use a single hostbridge at the time. > > > > However, when we looked at the implementation more closely, the > > two-hostbridge approach was easier to get up and running. It works > > (almost) out of the box. Currently, we have the two-hostbridge solution > > working on both ARM and x86 to enable virtio-pci to work alongside vPCI > > in Dom0less/Hyperlaunch configurations. > > I understand this is the easiest solution... However, this requires code in > Xen that I am not yet convinced it is good to have. > > I am not too concerned about the MMIO range part. This can be (easily) solved. > I am more concerned about the support of background region and the fact the OS > needs to be able to rescan. > > I am definitely not an expert of PCI, but AFAIK, it is possible to have the > guest to be notified when a PCI device is hotplug. Why can't we use it? Yes, that is the cleanest solution and Xenia has been working on that in the last couple of weeks. I am not sure if she has it fully functional yet. PCI rescan is just a crude but effective way to solve the problem. We also have a prototype of a special "flag" on the PCI root complex to tell the guest if the bus is ready, or whether it should wait. In any case I am confident this issue can be solved well, and we were already aiming for pci hotplug as the final solution. > > While I think that a single hostbridge is better architecturally, it is > > important to consider that virtio is moving toward a new transport > > (virtio-msg, Bertrand is also involved) which does not require a > > hostbridge. This new transport is key for all our use-cases as it > > addresses safety requirements and supports AMP configurations without a > > shared hypervisor between the frontend and backend. Edgar is one of the > > top contributors to virtio-msg. Given this, I don't think it's > > worthwhile to invest much effort in virtio-pci, as it will be replaced > > soon in embedded applications. > > To me this raises the question why we should have a temporary solution > upstream then? Having virtio-pci support as a stopgap seems beneficial, especially since it's reasonable to expect it will be needed for 1-2 years, which is a considerable period. This would put Xen in a good position in respect to other hypervisors in the same space. However, I also recognize that this implementation isn't ideal. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node 2024-09-24 16:23 [PATCH v1 0/6] xen/arm: Add Virtio-PCI for dom0less on ARM Edgar E. Iglesias 2024-09-24 16:23 ` [PATCH v1 1/6] xen/arm: Decrease size of the 2nd ram bank Edgar E. Iglesias 2024-09-24 16:23 ` [PATCH v1 2/6] xen/arm: Reserve resources for virtio-pci Edgar E. Iglesias @ 2024-09-24 16:23 ` Edgar E. Iglesias 2024-09-24 23:55 ` Stefano Stabellini ` (3 more replies) 2024-09-24 16:23 ` [PATCH v1 4/6] xen/arm: io: Add support for mmio background regions Edgar E. Iglesias ` (2 subsequent siblings) 5 siblings, 4 replies; 32+ messages in thread From: Edgar E. Iglesias @ 2024-09-24 16:23 UTC (permalink / raw) To: xen-devel Cc: sstabellini, julien, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias, Stewart Hildebrand From: Stewart Hildebrand <stewart.hildebrand@amd.com> When virtio-pci is specified in the dom0less domU properties, create a virtio-pci node in the guest's device tree. Set up an mmio handler with a register for the guest to poll when the backend has connected and virtio-pci bus is ready to be probed. Grant tables may be used by specifying virtio-pci = "grants";. [Edgar: Use GPEX PCI INTX interrupt swizzling (from PCI specs). Make grants iommu-map cover the entire PCI bus. Add virtio-pci-ranges to specify memory-map for direct-mapped guests. Document virtio-pci dom0less fdt bindings.] Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> --- docs/misc/arm/device-tree/booting.txt | 21 +++ xen/arch/arm/dom0less-build.c | 238 ++++++++++++++++++++++++++ xen/arch/arm/include/asm/kernel.h | 15 ++ 3 files changed, 274 insertions(+) diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt index 3a04f5c57f..82f3bd7026 100644 --- a/docs/misc/arm/device-tree/booting.txt +++ b/docs/misc/arm/device-tree/booting.txt @@ -276,6 +276,27 @@ with the following properties: passed through. This option is the default if this property is missing and the user does not provide the device partial device tree for the domain. +- virtio-pci + + A string property specifying whether virtio-pci is enabled for the + domain and if grant table mappings should be used. If no value is set + this property is treated as a boolean and the same way as if set to + "enabled". + Possible property values are: + + - "enabled" + Virtio-pci is enabled for the domain. + + - "grants" + Virtio-pci is enabled for the domain and an grants IOMMU node will be + generated in the domains device-tree. + +- virtio-pci-ranges + + An optional array of 6 u32 values specifying the 2 cells base addresses of + the ECAM, Memory and Prefetchable-Memory regions for virtio-pci. This is + useful to avoid memory-map collisions when using direct-mapped guests. + Under the "xen,domain" compatible node, one or more sub-nodes are present for the DomU kernel and ramdisk. diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c index 09b65e44ae..dab24fa9e2 100644 --- a/xen/arch/arm/dom0less-build.c +++ b/xen/arch/arm/dom0less-build.c @@ -586,6 +586,189 @@ static int __init domain_handle_dtb_bootmodule(struct domain *d, return res; } +static int __init make_virtio_pci_domU_node(const struct kernel_info *kinfo) +{ + void *fdt = kinfo->fdt; + /* reg is sized to be used for all the needed properties below */ + __be32 reg[(1 + (GUEST_ROOT_ADDRESS_CELLS * 2) + GUEST_ROOT_SIZE_CELLS) + * 2]; + __be32 irq_map[4 * 4 * 8]; + __be32 *cells; + char buf[22]; /* pcie@ + max 16 char address + '\0' */ + int res; + int devfn, intx_pin; + static const char compat[] = "pci-host-ecam-generic"; + static const char reg_names[] = "ecam"; + + if ( p2m_ipa_bits <= 40 ) { + printk("PA bits %d is too small!\nvirtio-pci is only supported " + "on platforms with PA bits > 40\n", p2m_ipa_bits); + return -EINVAL; + } + + snprintf(buf, sizeof(buf), "pcie@%lx", kinfo->virtio_pci.ecam.base); + dt_dprintk("Create virtio-pci node\n"); + res = fdt_begin_node(fdt, buf); + if ( res ) + return res; + + res = fdt_property(fdt, "compatible", compat, sizeof(compat)); + if ( res ) + return res; + + res = fdt_property_string(fdt, "device_type", "pci"); + if ( res ) + return res; + + /* Create reg property */ + cells = ®[0]; + dt_child_set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS, + kinfo->virtio_pci.ecam.base, GUEST_VIRTIO_PCI_ECAM_SIZE); + + res = fdt_property(fdt, "reg", reg, + (GUEST_ROOT_ADDRESS_CELLS + + GUEST_ROOT_SIZE_CELLS) * sizeof(*reg)); + if ( res ) + return res; + + res = fdt_property(fdt, "reg-names", reg_names, sizeof(reg_names)); + if ( res ) + return res; + + /* Create bus-range property */ + cells = ®[0]; + dt_set_cell(&cells, 1, 0); + dt_set_cell(&cells, 1, 0xff); + res = fdt_property(fdt, "bus-range", reg, 2 * sizeof(*reg)); + if ( res ) + return res; + + res = fdt_property_cell(fdt, "#address-cells", 3); + if ( res ) + return res; + + res = fdt_property_cell(fdt, "#size-cells", 2); + if ( res ) + return res; + + res = fdt_property_string(fdt, "status", "okay"); + if ( res ) + return res; + + /* + * Create ranges property as: + * <(PCI bitfield) (PCI address) (CPU address) (Size)> + */ + cells = ®[0]; + dt_set_cell(&cells, 1, GUEST_VIRTIO_PCI_MEM_TYPE); + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.mem.base); + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.mem.base); + dt_set_cell(&cells, GUEST_ROOT_SIZE_CELLS, GUEST_VIRTIO_PCI_MEM_SIZE); + dt_set_cell(&cells, 1, GUEST_VIRTIO_PCI_PREFETCH_MEM_TYPE); + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.pf_mem.base); + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.pf_mem.base); + dt_set_cell(&cells, GUEST_ROOT_SIZE_CELLS, GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE); + res = fdt_property(fdt, "ranges", reg, 14 * sizeof(*reg)); + if ( res ) + return res; + + res = fdt_property(fdt, "dma-coherent", "", 0); + if ( res ) + return res; + + res = fdt_property_cell(fdt, "#interrupt-cells", 1); + if ( res ) + return res; + + /* + * PCI-to-PCI bridge specification + * 9.1: Interrupt routing. Table 9-1 + * + * the PCI Express Base Specification, Revision 2.1 + * 2.2.8.1: INTx interrupt signaling - Rules + * the Implementation Note + * Table 2-20 + */ + cells = &irq_map[0]; + for (devfn = 0; devfn <= 0x18; devfn += 8) { + for (intx_pin = 0; intx_pin < 4; intx_pin++) { + int irq = GUEST_VIRTIO_PCI_SPI_FIRST - 32; + irq += ((intx_pin + PCI_SLOT(devfn)) % 4); + + dt_set_cell(&cells, 1, devfn << 8); + dt_set_cell(&cells, 1, 0); + dt_set_cell(&cells, 1, 0); + dt_set_cell(&cells, 1, intx_pin + 1); + dt_set_cell(&cells, 1, kinfo->phandle_gic); + /* 3 GIC cells. */ + dt_set_cell(&cells, 1, 0); + dt_set_cell(&cells, 1, irq); + dt_set_cell(&cells, 1, DT_IRQ_TYPE_LEVEL_HIGH); + } + } + + /* Assert we've sized irq_map correctly. */ + BUG_ON(cells - &irq_map[0] != ARRAY_SIZE(irq_map)); + + res = fdt_property(fdt, "interrupt-map", irq_map, sizeof(irq_map)); + if ( res ) + return res; + + cells = ®[0]; + dt_set_cell(&cells, 1, cpu_to_be16(PCI_DEVFN(3, 0))); + dt_set_cell(&cells, 1, 0x0); + dt_set_cell(&cells, 1, 0x0); + dt_set_cell(&cells, 1, 0x7); + res = fdt_property(fdt, "interrupt-map-mask", reg, 4 * sizeof(*reg)); + if ( res ) + return res; + + if ( kinfo->virtio_pci.mode == VIRTIO_PCI_GRANTS ) + { + cells = ®[0]; + dt_set_cell(&cells, 1, 0x0); + dt_set_cell(&cells, 1, GUEST_PHANDLE_IOMMU); + dt_set_cell(&cells, 1, 0x0); + dt_set_cell(&cells, 1, 0x10000); + res = fdt_property(fdt, "iommu-map", reg, 4 * sizeof(*reg)); + if ( res ) + return res; + } + + res = fdt_property_cell(fdt, "linux,pci-domain", 1); + if ( res ) + return res; + + res = fdt_end_node(fdt); + if ( res ) + return res; + + if ( kinfo->virtio_pci.mode == VIRTIO_PCI_GRANTS ) + { + snprintf(buf, sizeof(buf), "xen_iommu"); + + res = fdt_begin_node(fdt, buf); + if ( res ) + return res; + + res = fdt_property_string(fdt, "compatible", "xen,grant-dma"); + if ( res ) + return res; + + res = fdt_property_cell(fdt, "#iommu-cells", 1); + if ( res ) + return res; + + res = fdt_property_cell(fdt, "phandle", GUEST_PHANDLE_IOMMU); + if ( res ) + return res; + + res = fdt_end_node(fdt); + } + + return res; +} + /* * The max size for DT is 2MB. However, the generated DT is small (not including * domU passthrough DT nodes whose size we account separately), 4KB are enough @@ -693,6 +876,13 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo) goto err; } + if ( kinfo->virtio_pci.mode ) + { + ret = make_virtio_pci_domU_node(kinfo); + if ( ret ) + goto err; + } + ret = fdt_end_node(kinfo->fdt); if ( ret < 0 ) goto err; @@ -744,11 +934,24 @@ static int __init alloc_xenstore_evtchn(struct domain *d) return 0; } +static u64 combine_u64(u32 v[2]) +{ + u64 v64; + + v64 = v[0]; + v64 <<= 32; + v64 |= v[1]; + return v64; +} + static int __init construct_domU(struct domain *d, const struct dt_device_node *node) { struct kernel_info kinfo = KERNEL_INFO_INIT; const char *dom0less_enhanced; + const char *virtio_pci; + /* virtio-pci ECAM, MEM, PF-MEM each carrying 2 x Address cells. */ + u32 virtio_pci_ranges[3 * 2]; int rc; u64 mem; u32 p2m_mem_mb; @@ -779,6 +982,41 @@ static int __init construct_domU(struct domain *d, kinfo.vpl011 = dt_property_read_bool(node, "vpl011"); + rc = dt_property_read_string(node, "virtio-pci", &virtio_pci); + if ( !rc ) + { + if ( !strcmp(virtio_pci, "enabled") ) + kinfo.virtio_pci.mode = VIRTIO_PCI; + else if ( !strcmp(virtio_pci, "grants") ) + kinfo.virtio_pci.mode = VIRTIO_PCI_GRANTS; + else + { + printk("Invalid \"virtio-pci\" property value (%s)\n", virtio_pci); + return -EINVAL; + } + } + else if ( rc == -ENODATA ) + { + /* Handle missing property value */ + kinfo.virtio_pci.mode = dt_property_read_bool(node, "virtio-pci"); + } + + if ( kinfo.virtio_pci.mode != VIRTIO_PCI_NONE ) + { + rc = dt_property_read_u32_array(node, "virtio-pci-ranges", + virtio_pci_ranges, + ARRAY_SIZE(virtio_pci_ranges)); + if ( rc == 0 ) { + kinfo.virtio_pci.ecam.base = combine_u64(&virtio_pci_ranges[0]); + kinfo.virtio_pci.mem.base = combine_u64(&virtio_pci_ranges[2]); + kinfo.virtio_pci.pf_mem.base = combine_u64(&virtio_pci_ranges[4]); + } else { + kinfo.virtio_pci.ecam.base = GUEST_VIRTIO_PCI_ECAM_BASE; + kinfo.virtio_pci.mem.base = GUEST_VIRTIO_PCI_MEM_BASE; + kinfo.virtio_pci.pf_mem.base = GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE; + } + } + rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced); if ( rc == -EILSEQ || rc == -ENODATA || diff --git a/xen/arch/arm/include/asm/kernel.h b/xen/arch/arm/include/asm/kernel.h index 7e6e3c82a4..2dab2ac88f 100644 --- a/xen/arch/arm/include/asm/kernel.h +++ b/xen/arch/arm/include/asm/kernel.h @@ -29,6 +29,13 @@ #define DOM0LESS_XENSTORE BIT(1, U) #define DOM0LESS_ENHANCED (DOM0LESS_ENHANCED_NO_XS | DOM0LESS_XENSTORE) +/* virtio-pci types */ +enum virtio_pci_type { + VIRTIO_PCI_NONE, + VIRTIO_PCI, + VIRTIO_PCI_GRANTS, +}; + struct kernel_info { #ifdef CONFIG_ARM_64 enum domain_type type; @@ -62,6 +69,14 @@ struct kernel_info { /* Enable/Disable PV drivers interfaces */ uint16_t dom0less_feature; + struct { + enum virtio_pci_type mode; + struct { + u64 base; + } ecam, mem, pf_mem; + u32 pci_intx_irq_base; + } virtio_pci; + /* GIC phandle */ uint32_t phandle_gic; -- 2.43.0 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node 2024-09-24 16:23 ` [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node Edgar E. Iglesias @ 2024-09-24 23:55 ` Stefano Stabellini 2024-09-25 13:03 ` Stewart Hildebrand 2024-09-26 22:15 ` Edgar E. Iglesias 2024-09-25 2:48 ` Stewart Hildebrand ` (2 subsequent siblings) 3 siblings, 2 replies; 32+ messages in thread From: Stefano Stabellini @ 2024-09-24 23:55 UTC (permalink / raw) To: Edgar E. Iglesias Cc: xen-devel, sstabellini, julien, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias, Stewart Hildebrand On Tue, 24 Sep 2024, Edgar E. Iglesias wrote: > From: Stewart Hildebrand <stewart.hildebrand@amd.com> > > When virtio-pci is specified in the dom0less domU properties, create a > virtio-pci node in the guest's device tree. Set up an mmio handler with > a register for the guest to poll when the backend has connected and > virtio-pci bus is ready to be probed. Grant tables may be used by > specifying virtio-pci = "grants";. > > [Edgar: Use GPEX PCI INTX interrupt swizzling (from PCI specs). > Make grants iommu-map cover the entire PCI bus. > Add virtio-pci-ranges to specify memory-map for direct-mapped guests. > Document virtio-pci dom0less fdt bindings.] > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> > --- > docs/misc/arm/device-tree/booting.txt | 21 +++ > xen/arch/arm/dom0less-build.c | 238 ++++++++++++++++++++++++++ > xen/arch/arm/include/asm/kernel.h | 15 ++ > 3 files changed, 274 insertions(+) > > diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt > index 3a04f5c57f..82f3bd7026 100644 > --- a/docs/misc/arm/device-tree/booting.txt > +++ b/docs/misc/arm/device-tree/booting.txt > @@ -276,6 +276,27 @@ with the following properties: > passed through. This option is the default if this property is missing > and the user does not provide the device partial device tree for the domain. > > +- virtio-pci > + > + A string property specifying whether virtio-pci is enabled for the > + domain and if grant table mappings should be used. If no value is set > + this property is treated as a boolean and the same way as if set to > + "enabled". > + Possible property values are: > + > + - "enabled" > + Virtio-pci is enabled for the domain. > + > + - "grants" > + Virtio-pci is enabled for the domain and an grants IOMMU node will be > + generated in the domains device-tree. > + > +- virtio-pci-ranges > + > + An optional array of 6 u32 values specifying the 2 cells base addresses of > + the ECAM, Memory and Prefetchable-Memory regions for virtio-pci. This is > + useful to avoid memory-map collisions when using direct-mapped guests. > + > Under the "xen,domain" compatible node, one or more sub-nodes are present > for the DomU kernel and ramdisk. > > diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c > index 09b65e44ae..dab24fa9e2 100644 > --- a/xen/arch/arm/dom0less-build.c > +++ b/xen/arch/arm/dom0less-build.c > @@ -586,6 +586,189 @@ static int __init domain_handle_dtb_bootmodule(struct domain *d, > return res; > } > > +static int __init make_virtio_pci_domU_node(const struct kernel_info *kinfo) > +{ > + void *fdt = kinfo->fdt; > + /* reg is sized to be used for all the needed properties below */ > + __be32 reg[(1 + (GUEST_ROOT_ADDRESS_CELLS * 2) + GUEST_ROOT_SIZE_CELLS) > + * 2]; > + __be32 irq_map[4 * 4 * 8]; > + __be32 *cells; > + char buf[22]; /* pcie@ + max 16 char address + '\0' */ > + int res; > + int devfn, intx_pin; > + static const char compat[] = "pci-host-ecam-generic"; > + static const char reg_names[] = "ecam"; > + > + if ( p2m_ipa_bits <= 40 ) { > + printk("PA bits %d is too small!\nvirtio-pci is only supported " > + "on platforms with PA bits > 40\n", p2m_ipa_bits); > + return -EINVAL; > + } > + > + snprintf(buf, sizeof(buf), "pcie@%lx", kinfo->virtio_pci.ecam.base); > + dt_dprintk("Create virtio-pci node\n"); > + res = fdt_begin_node(fdt, buf); > + if ( res ) > + return res; > + > + res = fdt_property(fdt, "compatible", compat, sizeof(compat)); > + if ( res ) > + return res; > + > + res = fdt_property_string(fdt, "device_type", "pci"); > + if ( res ) > + return res; > + > + /* Create reg property */ > + cells = ®[0]; > + dt_child_set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS, > + kinfo->virtio_pci.ecam.base, GUEST_VIRTIO_PCI_ECAM_SIZE); > + > + res = fdt_property(fdt, "reg", reg, > + (GUEST_ROOT_ADDRESS_CELLS + > + GUEST_ROOT_SIZE_CELLS) * sizeof(*reg)); > + if ( res ) > + return res; > + > + res = fdt_property(fdt, "reg-names", reg_names, sizeof(reg_names)); > + if ( res ) > + return res; > + > + /* Create bus-range property */ > + cells = ®[0]; > + dt_set_cell(&cells, 1, 0); > + dt_set_cell(&cells, 1, 0xff); > + res = fdt_property(fdt, "bus-range", reg, 2 * sizeof(*reg)); > + if ( res ) > + return res; > + > + res = fdt_property_cell(fdt, "#address-cells", 3); > + if ( res ) > + return res; > + > + res = fdt_property_cell(fdt, "#size-cells", 2); > + if ( res ) > + return res; > + > + res = fdt_property_string(fdt, "status", "okay"); > + if ( res ) > + return res; > + > + /* > + * Create ranges property as: > + * <(PCI bitfield) (PCI address) (CPU address) (Size)> > + */ > + cells = ®[0]; > + dt_set_cell(&cells, 1, GUEST_VIRTIO_PCI_MEM_TYPE); > + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.mem.base); > + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.mem.base); > + dt_set_cell(&cells, GUEST_ROOT_SIZE_CELLS, GUEST_VIRTIO_PCI_MEM_SIZE); > + dt_set_cell(&cells, 1, GUEST_VIRTIO_PCI_PREFETCH_MEM_TYPE); > + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.pf_mem.base); > + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.pf_mem.base); > + dt_set_cell(&cells, GUEST_ROOT_SIZE_CELLS, GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE); > + res = fdt_property(fdt, "ranges", reg, 14 * sizeof(*reg)); > + if ( res ) > + return res; > + > + res = fdt_property(fdt, "dma-coherent", "", 0); > + if ( res ) > + return res; > + > + res = fdt_property_cell(fdt, "#interrupt-cells", 1); > + if ( res ) > + return res; > + > + /* > + * PCI-to-PCI bridge specification > + * 9.1: Interrupt routing. Table 9-1 > + * > + * the PCI Express Base Specification, Revision 2.1 > + * 2.2.8.1: INTx interrupt signaling - Rules > + * the Implementation Note > + * Table 2-20 > + */ > + cells = &irq_map[0]; > + for (devfn = 0; devfn <= 0x18; devfn += 8) { > + for (intx_pin = 0; intx_pin < 4; intx_pin++) { > + int irq = GUEST_VIRTIO_PCI_SPI_FIRST - 32; > + irq += ((intx_pin + PCI_SLOT(devfn)) % 4); > + > + dt_set_cell(&cells, 1, devfn << 8); > + dt_set_cell(&cells, 1, 0); > + dt_set_cell(&cells, 1, 0); > + dt_set_cell(&cells, 1, intx_pin + 1); > + dt_set_cell(&cells, 1, kinfo->phandle_gic); > + /* 3 GIC cells. */ > + dt_set_cell(&cells, 1, 0); > + dt_set_cell(&cells, 1, irq); > + dt_set_cell(&cells, 1, DT_IRQ_TYPE_LEVEL_HIGH); > + } > + } > + > + /* Assert we've sized irq_map correctly. */ > + BUG_ON(cells - &irq_map[0] != ARRAY_SIZE(irq_map)); > + > + res = fdt_property(fdt, "interrupt-map", irq_map, sizeof(irq_map)); > + if ( res ) > + return res; > + > + cells = ®[0]; > + dt_set_cell(&cells, 1, cpu_to_be16(PCI_DEVFN(3, 0))); Hi Edgar, what is this PCI_DEVFN(3, 0) device? > + dt_set_cell(&cells, 1, 0x0); > + dt_set_cell(&cells, 1, 0x0); > + dt_set_cell(&cells, 1, 0x7); > + res = fdt_property(fdt, "interrupt-map-mask", reg, 4 * sizeof(*reg)); > + if ( res ) > + return res; > + > + if ( kinfo->virtio_pci.mode == VIRTIO_PCI_GRANTS ) > + { > + cells = ®[0]; > + dt_set_cell(&cells, 1, 0x0); > + dt_set_cell(&cells, 1, GUEST_PHANDLE_IOMMU); > + dt_set_cell(&cells, 1, 0x0); > + dt_set_cell(&cells, 1, 0x10000); > + res = fdt_property(fdt, "iommu-map", reg, 4 * sizeof(*reg)); > + if ( res ) > + return res; > + } > + > + res = fdt_property_cell(fdt, "linux,pci-domain", 1); > + if ( res ) > + return res; > + > + res = fdt_end_node(fdt); > + if ( res ) > + return res; > + > + if ( kinfo->virtio_pci.mode == VIRTIO_PCI_GRANTS ) > + { > + snprintf(buf, sizeof(buf), "xen_iommu"); I don't think underscores are allowed in device tree node names > + res = fdt_begin_node(fdt, buf); > + if ( res ) > + return res; > + > + res = fdt_property_string(fdt, "compatible", "xen,grant-dma"); > + if ( res ) > + return res; > + > + res = fdt_property_cell(fdt, "#iommu-cells", 1); > + if ( res ) > + return res; > + > + res = fdt_property_cell(fdt, "phandle", GUEST_PHANDLE_IOMMU); > + if ( res ) > + return res; > + > + res = fdt_end_node(fdt); > + } > + > + return res; > +} > + > /* > * The max size for DT is 2MB. However, the generated DT is small (not including > * domU passthrough DT nodes whose size we account separately), 4KB are enough > @@ -693,6 +876,13 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo) > goto err; > } > > + if ( kinfo->virtio_pci.mode ) > + { > + ret = make_virtio_pci_domU_node(kinfo); > + if ( ret ) > + goto err; > + } > + > ret = fdt_end_node(kinfo->fdt); > if ( ret < 0 ) > goto err; > @@ -744,11 +934,24 @@ static int __init alloc_xenstore_evtchn(struct domain *d) > return 0; > } > > +static u64 combine_u64(u32 v[2]) Let's make this a static inline or a macro. I can't believe we don't have one already. > +{ > + u64 v64; > + > + v64 = v[0]; > + v64 <<= 32; > + v64 |= v[1]; > + return v64; > +} > + > static int __init construct_domU(struct domain *d, > const struct dt_device_node *node) > { > struct kernel_info kinfo = KERNEL_INFO_INIT; > const char *dom0less_enhanced; > + const char *virtio_pci; > + /* virtio-pci ECAM, MEM, PF-MEM each carrying 2 x Address cells. */ > + u32 virtio_pci_ranges[3 * 2]; > int rc; > u64 mem; > u32 p2m_mem_mb; > @@ -779,6 +982,41 @@ static int __init construct_domU(struct domain *d, > > kinfo.vpl011 = dt_property_read_bool(node, "vpl011"); > > + rc = dt_property_read_string(node, "virtio-pci", &virtio_pci); > + if ( !rc ) > + { > + if ( !strcmp(virtio_pci, "enabled") ) > + kinfo.virtio_pci.mode = VIRTIO_PCI; > + else if ( !strcmp(virtio_pci, "grants") ) > + kinfo.virtio_pci.mode = VIRTIO_PCI_GRANTS; > + else > + { > + printk("Invalid \"virtio-pci\" property value (%s)\n", virtio_pci); > + return -EINVAL; > + } > + } > + else if ( rc == -ENODATA ) > + { > + /* Handle missing property value */ > + kinfo.virtio_pci.mode = dt_property_read_bool(node, "virtio-pci"); > + } > + > + if ( kinfo.virtio_pci.mode != VIRTIO_PCI_NONE ) > + { > + rc = dt_property_read_u32_array(node, "virtio-pci-ranges", > + virtio_pci_ranges, > + ARRAY_SIZE(virtio_pci_ranges)); > + if ( rc == 0 ) { > + kinfo.virtio_pci.ecam.base = combine_u64(&virtio_pci_ranges[0]); > + kinfo.virtio_pci.mem.base = combine_u64(&virtio_pci_ranges[2]); > + kinfo.virtio_pci.pf_mem.base = combine_u64(&virtio_pci_ranges[4]); > + } else { > + kinfo.virtio_pci.ecam.base = GUEST_VIRTIO_PCI_ECAM_BASE; > + kinfo.virtio_pci.mem.base = GUEST_VIRTIO_PCI_MEM_BASE; > + kinfo.virtio_pci.pf_mem.base = GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE; > + } > + } > + > rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced); > if ( rc == -EILSEQ || > rc == -ENODATA || > diff --git a/xen/arch/arm/include/asm/kernel.h b/xen/arch/arm/include/asm/kernel.h > index 7e6e3c82a4..2dab2ac88f 100644 > --- a/xen/arch/arm/include/asm/kernel.h > +++ b/xen/arch/arm/include/asm/kernel.h > @@ -29,6 +29,13 @@ > #define DOM0LESS_XENSTORE BIT(1, U) > #define DOM0LESS_ENHANCED (DOM0LESS_ENHANCED_NO_XS | DOM0LESS_XENSTORE) > > +/* virtio-pci types */ > +enum virtio_pci_type { > + VIRTIO_PCI_NONE, > + VIRTIO_PCI, > + VIRTIO_PCI_GRANTS, > +}; > + > struct kernel_info { > #ifdef CONFIG_ARM_64 > enum domain_type type; > @@ -62,6 +69,14 @@ struct kernel_info { > /* Enable/Disable PV drivers interfaces */ > uint16_t dom0less_feature; > > + struct { > + enum virtio_pci_type mode; > + struct { > + u64 base; > + } ecam, mem, pf_mem; > + u32 pci_intx_irq_base; > + } virtio_pci; > + > /* GIC phandle */ > uint32_t phandle_gic; > > -- > 2.43.0 > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node 2024-09-24 23:55 ` Stefano Stabellini @ 2024-09-25 13:03 ` Stewart Hildebrand 2024-09-26 22:15 ` Edgar E. Iglesias 1 sibling, 0 replies; 32+ messages in thread From: Stewart Hildebrand @ 2024-09-25 13:03 UTC (permalink / raw) To: Stefano Stabellini, Edgar E. Iglesias Cc: xen-devel, julien, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias On 9/24/24 19:55, Stefano Stabellini wrote: > On Tue, 24 Sep 2024, Edgar E. Iglesias wrote: >> diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c >> index 09b65e44ae..dab24fa9e2 100644 >> --- a/xen/arch/arm/dom0less-build.c >> +++ b/xen/arch/arm/dom0less-build.c >> @@ -744,11 +934,24 @@ static int __init alloc_xenstore_evtchn(struct domain *d) >> return 0; >> } >> >> +static u64 combine_u64(u32 v[2]) > > Let's make this a static inline or a macro. I can't believe we don't > have one already. We have dt_read_number(). You'd need to use it with dt_get_property(). In this case, it might look something like: __be32 *val = dt_get_property(node, "virtio-pci-ranges", &len); /* TODO: check len */ ...ecam.base = dt_read_number(&val[0], 2); ...mem.base = dt_read_number(&val[2], 2); ...pf_mem.base = dt_read_number(&val[4], 2); >> +{ >> + u64 v64; >> + >> + v64 = v[0]; >> + v64 <<= 32; >> + v64 |= v[1]; >> + return v64; >> +} >> + >> static int __init construct_domU(struct domain *d, >> const struct dt_device_node *node) >> { >> struct kernel_info kinfo = KERNEL_INFO_INIT; >> const char *dom0less_enhanced; >> + const char *virtio_pci; >> + /* virtio-pci ECAM, MEM, PF-MEM each carrying 2 x Address cells. */ >> + u32 virtio_pci_ranges[3 * 2]; >> int rc; >> u64 mem; >> u32 p2m_mem_mb; >> @@ -779,6 +982,41 @@ static int __init construct_domU(struct domain *d, >> >> kinfo.vpl011 = dt_property_read_bool(node, "vpl011"); >> >> + rc = dt_property_read_string(node, "virtio-pci", &virtio_pci); >> + if ( !rc ) >> + { >> + if ( !strcmp(virtio_pci, "enabled") ) >> + kinfo.virtio_pci.mode = VIRTIO_PCI; >> + else if ( !strcmp(virtio_pci, "grants") ) >> + kinfo.virtio_pci.mode = VIRTIO_PCI_GRANTS; >> + else >> + { >> + printk("Invalid \"virtio-pci\" property value (%s)\n", virtio_pci); >> + return -EINVAL; >> + } >> + } >> + else if ( rc == -ENODATA ) >> + { >> + /* Handle missing property value */ >> + kinfo.virtio_pci.mode = dt_property_read_bool(node, "virtio-pci"); >> + } >> + >> + if ( kinfo.virtio_pci.mode != VIRTIO_PCI_NONE ) >> + { >> + rc = dt_property_read_u32_array(node, "virtio-pci-ranges", >> + virtio_pci_ranges, >> + ARRAY_SIZE(virtio_pci_ranges)); >> + if ( rc == 0 ) { >> + kinfo.virtio_pci.ecam.base = combine_u64(&virtio_pci_ranges[0]); >> + kinfo.virtio_pci.mem.base = combine_u64(&virtio_pci_ranges[2]); >> + kinfo.virtio_pci.pf_mem.base = combine_u64(&virtio_pci_ranges[4]); >> + } else { >> + kinfo.virtio_pci.ecam.base = GUEST_VIRTIO_PCI_ECAM_BASE; >> + kinfo.virtio_pci.mem.base = GUEST_VIRTIO_PCI_MEM_BASE; >> + kinfo.virtio_pci.pf_mem.base = GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE; >> + } >> + } >> + >> rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced); >> if ( rc == -EILSEQ || >> rc == -ENODATA || ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node 2024-09-24 23:55 ` Stefano Stabellini 2024-09-25 13:03 ` Stewart Hildebrand @ 2024-09-26 22:15 ` Edgar E. Iglesias 1 sibling, 0 replies; 32+ messages in thread From: Edgar E. Iglesias @ 2024-09-26 22:15 UTC (permalink / raw) To: Stefano Stabellini Cc: Edgar E. Iglesias, xen-devel, julien, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, Stewart Hildebrand On Tue, Sep 24, 2024 at 04:55:15PM -0700, Stefano Stabellini wrote: > On Tue, 24 Sep 2024, Edgar E. Iglesias wrote: > > From: Stewart Hildebrand <stewart.hildebrand@amd.com> > > > > When virtio-pci is specified in the dom0less domU properties, create a > > virtio-pci node in the guest's device tree. Set up an mmio handler with > > a register for the guest to poll when the backend has connected and > > virtio-pci bus is ready to be probed. Grant tables may be used by > > specifying virtio-pci = "grants";. > > > > [Edgar: Use GPEX PCI INTX interrupt swizzling (from PCI specs). > > Make grants iommu-map cover the entire PCI bus. > > Add virtio-pci-ranges to specify memory-map for direct-mapped guests. > > Document virtio-pci dom0less fdt bindings.] > > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> > > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> > > --- > > docs/misc/arm/device-tree/booting.txt | 21 +++ > > xen/arch/arm/dom0less-build.c | 238 ++++++++++++++++++++++++++ > > xen/arch/arm/include/asm/kernel.h | 15 ++ > > 3 files changed, 274 insertions(+) > > > > diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt > > index 3a04f5c57f..82f3bd7026 100644 > > --- a/docs/misc/arm/device-tree/booting.txt > > +++ b/docs/misc/arm/device-tree/booting.txt > > @@ -276,6 +276,27 @@ with the following properties: > > passed through. This option is the default if this property is missing > > and the user does not provide the device partial device tree for the domain. > > > > +- virtio-pci > > + > > + A string property specifying whether virtio-pci is enabled for the > > + domain and if grant table mappings should be used. If no value is set > > + this property is treated as a boolean and the same way as if set to > > + "enabled". > > + Possible property values are: > > + > > + - "enabled" > > + Virtio-pci is enabled for the domain. > > + > > + - "grants" > > + Virtio-pci is enabled for the domain and an grants IOMMU node will be > > + generated in the domains device-tree. > > + > > +- virtio-pci-ranges > > + > > + An optional array of 6 u32 values specifying the 2 cells base addresses of > > + the ECAM, Memory and Prefetchable-Memory regions for virtio-pci. This is > > + useful to avoid memory-map collisions when using direct-mapped guests. > > + > > Under the "xen,domain" compatible node, one or more sub-nodes are present > > for the DomU kernel and ramdisk. > > > > diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c > > index 09b65e44ae..dab24fa9e2 100644 > > --- a/xen/arch/arm/dom0less-build.c > > +++ b/xen/arch/arm/dom0less-build.c > > @@ -586,6 +586,189 @@ static int __init domain_handle_dtb_bootmodule(struct domain *d, > > return res; > > } > > > > +static int __init make_virtio_pci_domU_node(const struct kernel_info *kinfo) > > +{ > > + void *fdt = kinfo->fdt; > > + /* reg is sized to be used for all the needed properties below */ > > + __be32 reg[(1 + (GUEST_ROOT_ADDRESS_CELLS * 2) + GUEST_ROOT_SIZE_CELLS) > > + * 2]; > > + __be32 irq_map[4 * 4 * 8]; > > + __be32 *cells; > > + char buf[22]; /* pcie@ + max 16 char address + '\0' */ > > + int res; > > + int devfn, intx_pin; > > + static const char compat[] = "pci-host-ecam-generic"; > > + static const char reg_names[] = "ecam"; > > + > > + if ( p2m_ipa_bits <= 40 ) { > > + printk("PA bits %d is too small!\nvirtio-pci is only supported " > > + "on platforms with PA bits > 40\n", p2m_ipa_bits); > > + return -EINVAL; > > + } > > + > > + snprintf(buf, sizeof(buf), "pcie@%lx", kinfo->virtio_pci.ecam.base); > > + dt_dprintk("Create virtio-pci node\n"); > > + res = fdt_begin_node(fdt, buf); > > + if ( res ) > > + return res; > > + > > + res = fdt_property(fdt, "compatible", compat, sizeof(compat)); > > + if ( res ) > > + return res; > > + > > + res = fdt_property_string(fdt, "device_type", "pci"); > > + if ( res ) > > + return res; > > + > > + /* Create reg property */ > > + cells = ®[0]; > > + dt_child_set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS, > > + kinfo->virtio_pci.ecam.base, GUEST_VIRTIO_PCI_ECAM_SIZE); > > + > > + res = fdt_property(fdt, "reg", reg, > > + (GUEST_ROOT_ADDRESS_CELLS + > > + GUEST_ROOT_SIZE_CELLS) * sizeof(*reg)); > > + if ( res ) > > + return res; > > + > > + res = fdt_property(fdt, "reg-names", reg_names, sizeof(reg_names)); > > + if ( res ) > > + return res; > > + > > + /* Create bus-range property */ > > + cells = ®[0]; > > + dt_set_cell(&cells, 1, 0); > > + dt_set_cell(&cells, 1, 0xff); > > + res = fdt_property(fdt, "bus-range", reg, 2 * sizeof(*reg)); > > + if ( res ) > > + return res; > > + > > + res = fdt_property_cell(fdt, "#address-cells", 3); > > + if ( res ) > > + return res; > > + > > + res = fdt_property_cell(fdt, "#size-cells", 2); > > + if ( res ) > > + return res; > > + > > + res = fdt_property_string(fdt, "status", "okay"); > > + if ( res ) > > + return res; > > + > > + /* > > + * Create ranges property as: > > + * <(PCI bitfield) (PCI address) (CPU address) (Size)> > > + */ > > + cells = ®[0]; > > + dt_set_cell(&cells, 1, GUEST_VIRTIO_PCI_MEM_TYPE); > > + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.mem.base); > > + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.mem.base); > > + dt_set_cell(&cells, GUEST_ROOT_SIZE_CELLS, GUEST_VIRTIO_PCI_MEM_SIZE); > > + dt_set_cell(&cells, 1, GUEST_VIRTIO_PCI_PREFETCH_MEM_TYPE); > > + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.pf_mem.base); > > + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.pf_mem.base); > > + dt_set_cell(&cells, GUEST_ROOT_SIZE_CELLS, GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE); > > + res = fdt_property(fdt, "ranges", reg, 14 * sizeof(*reg)); > > + if ( res ) > > + return res; > > + > > + res = fdt_property(fdt, "dma-coherent", "", 0); > > + if ( res ) > > + return res; > > + > > + res = fdt_property_cell(fdt, "#interrupt-cells", 1); > > + if ( res ) > > + return res; > > + > > + /* > > + * PCI-to-PCI bridge specification > > + * 9.1: Interrupt routing. Table 9-1 > > + * > > + * the PCI Express Base Specification, Revision 2.1 > > + * 2.2.8.1: INTx interrupt signaling - Rules > > + * the Implementation Note > > + * Table 2-20 > > + */ > > + cells = &irq_map[0]; > > + for (devfn = 0; devfn <= 0x18; devfn += 8) { > > + for (intx_pin = 0; intx_pin < 4; intx_pin++) { > > + int irq = GUEST_VIRTIO_PCI_SPI_FIRST - 32; > > + irq += ((intx_pin + PCI_SLOT(devfn)) % 4); > > + > > + dt_set_cell(&cells, 1, devfn << 8); > > + dt_set_cell(&cells, 1, 0); > > + dt_set_cell(&cells, 1, 0); > > + dt_set_cell(&cells, 1, intx_pin + 1); > > + dt_set_cell(&cells, 1, kinfo->phandle_gic); > > + /* 3 GIC cells. */ > > + dt_set_cell(&cells, 1, 0); > > + dt_set_cell(&cells, 1, irq); > > + dt_set_cell(&cells, 1, DT_IRQ_TYPE_LEVEL_HIGH); > > + } > > + } > > + > > + /* Assert we've sized irq_map correctly. */ > > + BUG_ON(cells - &irq_map[0] != ARRAY_SIZE(irq_map)); > > + > > + res = fdt_property(fdt, "interrupt-map", irq_map, sizeof(irq_map)); > > + if ( res ) > > + return res; > > + > > + cells = ®[0]; > > + dt_set_cell(&cells, 1, cpu_to_be16(PCI_DEVFN(3, 0))); > > Hi Edgar, what is this PCI_DEVFN(3, 0) device? Hi Stefano, This is for the interrupt-map-mask, since the swizzling pattern repeats itself for every fourth device, we only need to describe entries 0 - 3 and can mask out the rest. I the next version (which will be quite different) I'll try to make this clearer. > > > > + dt_set_cell(&cells, 1, 0x0); > > + dt_set_cell(&cells, 1, 0x0); > > + dt_set_cell(&cells, 1, 0x7); > > + res = fdt_property(fdt, "interrupt-map-mask", reg, 4 * sizeof(*reg)); > > + if ( res ) > > + return res; > > + > > + if ( kinfo->virtio_pci.mode == VIRTIO_PCI_GRANTS ) > > + { > > + cells = ®[0]; > > + dt_set_cell(&cells, 1, 0x0); > > + dt_set_cell(&cells, 1, GUEST_PHANDLE_IOMMU); > > + dt_set_cell(&cells, 1, 0x0); > > + dt_set_cell(&cells, 1, 0x10000); > > + res = fdt_property(fdt, "iommu-map", reg, 4 * sizeof(*reg)); > > + if ( res ) > > + return res; > > + } > > + > > + res = fdt_property_cell(fdt, "linux,pci-domain", 1); > > + if ( res ) > > + return res; > > + > > + res = fdt_end_node(fdt); > > + if ( res ) > > + return res; > > + > > + if ( kinfo->virtio_pci.mode == VIRTIO_PCI_GRANTS ) > > + { > > + snprintf(buf, sizeof(buf), "xen_iommu"); > > I don't think underscores are allowed in device tree node names Will fix, thanks. > > > > > + res = fdt_begin_node(fdt, buf); > > + if ( res ) > > + return res; > > + > > + res = fdt_property_string(fdt, "compatible", "xen,grant-dma"); > > + if ( res ) > > + return res; > > + > > + res = fdt_property_cell(fdt, "#iommu-cells", 1); > > + if ( res ) > > + return res; > > + > > + res = fdt_property_cell(fdt, "phandle", GUEST_PHANDLE_IOMMU); > > + if ( res ) > > + return res; > > + > > + res = fdt_end_node(fdt); > > + } > > + > > + return res; > > +} > > + > > /* > > * The max size for DT is 2MB. However, the generated DT is small (not including > > * domU passthrough DT nodes whose size we account separately), 4KB are enough > > @@ -693,6 +876,13 @@ static int __init prepare_dtb_domU(struct domain *d, struct kernel_info *kinfo) > > goto err; > > } > > > > + if ( kinfo->virtio_pci.mode ) > > + { > > + ret = make_virtio_pci_domU_node(kinfo); > > + if ( ret ) > > + goto err; > > + } > > + > > ret = fdt_end_node(kinfo->fdt); > > if ( ret < 0 ) > > goto err; > > @@ -744,11 +934,24 @@ static int __init alloc_xenstore_evtchn(struct domain *d) > > return 0; > > } > > > > +static u64 combine_u64(u32 v[2]) > > Let's make this a static inline or a macro. I can't believe we don't > have one already. Yes, I'll try what Stewart suggested. Cheers, Edgar ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node 2024-09-24 16:23 ` [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node Edgar E. Iglesias 2024-09-24 23:55 ` Stefano Stabellini @ 2024-09-25 2:48 ` Stewart Hildebrand 2024-09-25 7:44 ` Julien Grall 2024-10-01 19:30 ` Stewart Hildebrand 3 siblings, 0 replies; 32+ messages in thread From: Stewart Hildebrand @ 2024-09-25 2:48 UTC (permalink / raw) To: Edgar E. Iglesias, xen-devel Cc: sstabellini, julien, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias, Sergiy Kibrik, Oleksandr Tyshchenko On 9/24/24 12:23, Edgar E. Iglesias wrote: > diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c > index 09b65e44ae..dab24fa9e2 100644 > --- a/xen/arch/arm/dom0less-build.c > +++ b/xen/arch/arm/dom0less-build.c > @@ -586,6 +586,189 @@ static int __init domain_handle_dtb_bootmodule(struct domain *d, > return res; > } > > +static int __init make_virtio_pci_domU_node(const struct kernel_info *kinfo) > +{ > + void *fdt = kinfo->fdt; > + /* reg is sized to be used for all the needed properties below */ > + __be32 reg[(1 + (GUEST_ROOT_ADDRESS_CELLS * 2) + GUEST_ROOT_SIZE_CELLS) > + * 2]; > + __be32 irq_map[4 * 4 * 8]; > + __be32 *cells; > + char buf[22]; /* pcie@ + max 16 char address + '\0' */ > + int res; > + int devfn, intx_pin; > + static const char compat[] = "pci-host-ecam-generic"; > + static const char reg_names[] = "ecam"; > + > + if ( p2m_ipa_bits <= 40 ) { > + printk("PA bits %d is too small!\nvirtio-pci is only supported " > + "on platforms with PA bits > 40\n", p2m_ipa_bits); > + return -EINVAL; > + } > + > + snprintf(buf, sizeof(buf), "pcie@%lx", kinfo->virtio_pci.ecam.base); > + dt_dprintk("Create virtio-pci node\n"); > + res = fdt_begin_node(fdt, buf); > + if ( res ) > + return res; > + > + res = fdt_property(fdt, "compatible", compat, sizeof(compat)); > + if ( res ) > + return res; > + > + res = fdt_property_string(fdt, "device_type", "pci"); > + if ( res ) > + return res; > + > + /* Create reg property */ > + cells = ®[0]; > + dt_child_set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS, > + kinfo->virtio_pci.ecam.base, GUEST_VIRTIO_PCI_ECAM_SIZE); > + > + res = fdt_property(fdt, "reg", reg, > + (GUEST_ROOT_ADDRESS_CELLS + > + GUEST_ROOT_SIZE_CELLS) * sizeof(*reg)); > + if ( res ) > + return res; > + > + res = fdt_property(fdt, "reg-names", reg_names, sizeof(reg_names)); > + if ( res ) > + return res; > + > + /* Create bus-range property */ > + cells = ®[0]; > + dt_set_cell(&cells, 1, 0); > + dt_set_cell(&cells, 1, 0xff); > + res = fdt_property(fdt, "bus-range", reg, 2 * sizeof(*reg)); > + if ( res ) > + return res; > + > + res = fdt_property_cell(fdt, "#address-cells", 3); > + if ( res ) > + return res; > + > + res = fdt_property_cell(fdt, "#size-cells", 2); > + if ( res ) > + return res; > + > + res = fdt_property_string(fdt, "status", "okay"); > + if ( res ) > + return res; > + > + /* > + * Create ranges property as: > + * <(PCI bitfield) (PCI address) (CPU address) (Size)> > + */ > + cells = ®[0]; > + dt_set_cell(&cells, 1, GUEST_VIRTIO_PCI_MEM_TYPE); > + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.mem.base); > + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.mem.base); > + dt_set_cell(&cells, GUEST_ROOT_SIZE_CELLS, GUEST_VIRTIO_PCI_MEM_SIZE); > + dt_set_cell(&cells, 1, GUEST_VIRTIO_PCI_PREFETCH_MEM_TYPE); > + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.pf_mem.base); > + dt_set_cell(&cells, GUEST_ROOT_ADDRESS_CELLS, kinfo->virtio_pci.pf_mem.base); > + dt_set_cell(&cells, GUEST_ROOT_SIZE_CELLS, GUEST_VIRTIO_PCI_PREFETCH_MEM_SIZE); > + res = fdt_property(fdt, "ranges", reg, 14 * sizeof(*reg)); > + if ( res ) > + return res; > + > + res = fdt_property(fdt, "dma-coherent", "", 0); > + if ( res ) > + return res; > + > + res = fdt_property_cell(fdt, "#interrupt-cells", 1); > + if ( res ) > + return res; > + > + /* > + * PCI-to-PCI bridge specification > + * 9.1: Interrupt routing. Table 9-1 > + * > + * the PCI Express Base Specification, Revision 2.1 > + * 2.2.8.1: INTx interrupt signaling - Rules > + * the Implementation Note > + * Table 2-20 > + */ > + cells = &irq_map[0]; > + for (devfn = 0; devfn <= 0x18; devfn += 8) { > + for (intx_pin = 0; intx_pin < 4; intx_pin++) { > + int irq = GUEST_VIRTIO_PCI_SPI_FIRST - 32; > + irq += ((intx_pin + PCI_SLOT(devfn)) % 4); > + > + dt_set_cell(&cells, 1, devfn << 8); > + dt_set_cell(&cells, 1, 0); > + dt_set_cell(&cells, 1, 0); > + dt_set_cell(&cells, 1, intx_pin + 1); > + dt_set_cell(&cells, 1, kinfo->phandle_gic); Here we will also want a parent unit address (GIC unit address). The number of cells is determined by the vGIC node's #address-cells. See section 2.4.3 in [1]. Also take a look at EPAM's libxl implementation, in the function create_virtio_pci_irq_map() [2]. [1] https://github.com/devicetree-org/devicetree-specification/releases/download/v0.4/devicetree-specification-v0.4.pdf [2] https://lore.kernel.org/xen-devel/20231115112611.3865905-4-Sergiy_Kibrik@epam.com/ > + /* 3 GIC cells. */ > + dt_set_cell(&cells, 1, 0); > + dt_set_cell(&cells, 1, irq); > + dt_set_cell(&cells, 1, DT_IRQ_TYPE_LEVEL_HIGH); > + } > + } > + > + /* Assert we've sized irq_map correctly. */ > + BUG_ON(cells - &irq_map[0] != ARRAY_SIZE(irq_map)); > + > + res = fdt_property(fdt, "interrupt-map", irq_map, sizeof(irq_map)); > + if ( res ) > + return res; > + > + cells = ®[0]; > + dt_set_cell(&cells, 1, cpu_to_be16(PCI_DEVFN(3, 0))); > + dt_set_cell(&cells, 1, 0x0); > + dt_set_cell(&cells, 1, 0x0); > + dt_set_cell(&cells, 1, 0x7); > + res = fdt_property(fdt, "interrupt-map-mask", reg, 4 * sizeof(*reg)); > + if ( res ) > + return res; > + > + if ( kinfo->virtio_pci.mode == VIRTIO_PCI_GRANTS ) > + { > + cells = ®[0]; > + dt_set_cell(&cells, 1, 0x0); > + dt_set_cell(&cells, 1, GUEST_PHANDLE_IOMMU); > + dt_set_cell(&cells, 1, 0x0); > + dt_set_cell(&cells, 1, 0x10000); > + res = fdt_property(fdt, "iommu-map", reg, 4 * sizeof(*reg)); > + if ( res ) > + return res; > + } > + > + res = fdt_property_cell(fdt, "linux,pci-domain", 1); > + if ( res ) > + return res; > + > + res = fdt_end_node(fdt); > + if ( res ) > + return res; > + > + if ( kinfo->virtio_pci.mode == VIRTIO_PCI_GRANTS ) > + { > + snprintf(buf, sizeof(buf), "xen_iommu"); > + > + res = fdt_begin_node(fdt, buf); > + if ( res ) > + return res; > + > + res = fdt_property_string(fdt, "compatible", "xen,grant-dma"); > + if ( res ) > + return res; > + > + res = fdt_property_cell(fdt, "#iommu-cells", 1); > + if ( res ) > + return res; > + > + res = fdt_property_cell(fdt, "phandle", GUEST_PHANDLE_IOMMU); > + if ( res ) > + return res; > + > + res = fdt_end_node(fdt); > + } > + > + return res; > +} ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node 2024-09-24 16:23 ` [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node Edgar E. Iglesias 2024-09-24 23:55 ` Stefano Stabellini 2024-09-25 2:48 ` Stewart Hildebrand @ 2024-09-25 7:44 ` Julien Grall 2024-09-25 16:34 ` Edgar E. Iglesias 2024-10-01 19:30 ` Stewart Hildebrand 3 siblings, 1 reply; 32+ messages in thread From: Julien Grall @ 2024-09-25 7:44 UTC (permalink / raw) To: Edgar E. Iglesias, xen-devel Cc: sstabellini, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias, Stewart Hildebrand Hi, On 24/09/2024 17:23, Edgar E. Iglesias wrote: > From: Stewart Hildebrand <stewart.hildebrand@amd.com> > > When virtio-pci is specified in the dom0less domU properties, create a > virtio-pci node in the guest's device tree. Set up an mmio handler with > a register for the guest to poll when the backend has connected and > virtio-pci bus is ready to be probed. Grant tables may be used by > specifying virtio-pci = "grants";. > > [Edgar: Use GPEX PCI INTX interrupt swizzling (from PCI specs). > Make grants iommu-map cover the entire PCI bus. > Add virtio-pci-ranges to specify memory-map for direct-mapped guests. > Document virtio-pci dom0less fdt bindings.] > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> > --- > docs/misc/arm/device-tree/booting.txt | 21 +++ > xen/arch/arm/dom0less-build.c | 238 ++++++++++++++++++++++++++ > xen/arch/arm/include/asm/kernel.h | 15 ++ > 3 files changed, 274 insertions(+) > > diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt > index 3a04f5c57f..82f3bd7026 100644 > --- a/docs/misc/arm/device-tree/booting.txt > +++ b/docs/misc/arm/device-tree/booting.txt > @@ -276,6 +276,27 @@ with the following properties: > passed through. This option is the default if this property is missing > and the user does not provide the device partial device tree for the domain. > > +- virtio-pci Similar question to the other patches, why is this specific to virtio PCI? QEMU (or another device module) is free to emulate whatever it wants behind the PCI hosbtridge. > + > + A string property specifying whether virtio-pci is enabled for the > + domain and if grant table mappings should be used. If no value is set > + this property is treated as a boolean and the same way as if set to > + "enabled". > + Possible property values are: > + > + - "enabled" > + Virtio-pci is enabled for the domain. > + > + - "grants" > + Virtio-pci is enabled for the domain and an grants IOMMU node will be > + generated in the domains device-tree. > + > +- virtio-pci-ranges > + > + An optional array of 6 u32 values specifying the 2 cells base addresses of > + the ECAM, Memory and Prefetchable-Memory regions for virtio-pci. This is > + useful to avoid memory-map collisions when using direct-mapped guests. > + > Under the "xen,domain" compatible node, one or more sub-nodes are present > for the DomU kernel and ramdisk. > > diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c > index 09b65e44ae..dab24fa9e2 100644 > --- a/xen/arch/arm/dom0less-build.c > +++ b/xen/arch/arm/dom0less-build.c > @@ -586,6 +586,189 @@ static int __init domain_handle_dtb_bootmodule(struct domain *d, > return res; > } > > +static int __init make_virtio_pci_domU_node(const struct kernel_info *kinfo) > +{ > + void *fdt = kinfo->fdt; > + /* reg is sized to be used for all the needed properties below */ > + __be32 reg[(1 + (GUEST_ROOT_ADDRESS_CELLS * 2) + GUEST_ROOT_SIZE_CELLS) > + * 2]; > + __be32 irq_map[4 * 4 * 8]; > + __be32 *cells; > + char buf[22]; /* pcie@ + max 16 char address + '\0' */ > + int res; > + int devfn, intx_pin; > + static const char compat[] = "pci-host-ecam-generic"; > + static const char reg_names[] = "ecam"; > + > + if ( p2m_ipa_bits <= 40 ) { > + printk("PA bits %d is too small!\nvirtio-pci is only supported " > + "on platforms with PA bits > 40\n", p2m_ipa_bits); > + return -EINVAL; > + } Please add a comment explaining where does this requires come from. If this is the Address layout, then probably be to avoid relying on hardcoded number of bits. Cheers, -- Julien Grall ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node 2024-09-25 7:44 ` Julien Grall @ 2024-09-25 16:34 ` Edgar E. Iglesias 2024-09-25 16:38 ` Julien Grall 0 siblings, 1 reply; 32+ messages in thread From: Edgar E. Iglesias @ 2024-09-25 16:34 UTC (permalink / raw) To: Julien Grall Cc: Edgar E. Iglesias, xen-devel, sstabellini, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, Stewart Hildebrand On Wed, Sep 25, 2024 at 08:44:41AM +0100, Julien Grall wrote: > Hi, > Hi Julien, > On 24/09/2024 17:23, Edgar E. Iglesias wrote: > > From: Stewart Hildebrand <stewart.hildebrand@amd.com> > > > > When virtio-pci is specified in the dom0less domU properties, create a > > virtio-pci node in the guest's device tree. Set up an mmio handler with > > a register for the guest to poll when the backend has connected and > > virtio-pci bus is ready to be probed. Grant tables may be used by > > specifying virtio-pci = "grants";. > > > > [Edgar: Use GPEX PCI INTX interrupt swizzling (from PCI specs). > > Make grants iommu-map cover the entire PCI bus. > > Add virtio-pci-ranges to specify memory-map for direct-mapped guests. > > Document virtio-pci dom0less fdt bindings.] > > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> > > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> > > --- > > docs/misc/arm/device-tree/booting.txt | 21 +++ > > xen/arch/arm/dom0less-build.c | 238 ++++++++++++++++++++++++++ > > xen/arch/arm/include/asm/kernel.h | 15 ++ > > 3 files changed, 274 insertions(+) > > > > diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt > > index 3a04f5c57f..82f3bd7026 100644 > > --- a/docs/misc/arm/device-tree/booting.txt > > +++ b/docs/misc/arm/device-tree/booting.txt > > @@ -276,6 +276,27 @@ with the following properties: > > passed through. This option is the default if this property is missing > > and the user does not provide the device partial device tree for the domain. > > +- virtio-pci > > Similar question to the other patches, why is this specific to virtio PCI? > QEMU (or another device module) is free to emulate whatever it wants behind > the PCI hosbtridge. There's no hard limitatino to only virtio-pci devices it's more of a recommendation that PVH guests should not use "emulated" devices but there's nothing stopping it. Perhaps the names of these properties are missleading, I'm happy to change them if there are better suggestions! I can also clarify it in the documentation. > > > + > > + A string property specifying whether virtio-pci is enabled for the > > + domain and if grant table mappings should be used. If no value is set > > + this property is treated as a boolean and the same way as if set to > > + "enabled". > > + Possible property values are: > > + > > + - "enabled" > > + Virtio-pci is enabled for the domain. > > + > > + - "grants" > > + Virtio-pci is enabled for the domain and an grants IOMMU node will be > > + generated in the domains device-tree. > > + > > +- virtio-pci-ranges > > + > > + An optional array of 6 u32 values specifying the 2 cells base addresses of > > + the ECAM, Memory and Prefetchable-Memory regions for virtio-pci. This is > > + useful to avoid memory-map collisions when using direct-mapped guests. > > + > > Under the "xen,domain" compatible node, one or more sub-nodes are present > > for the DomU kernel and ramdisk. > > diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c > > index 09b65e44ae..dab24fa9e2 100644 > > --- a/xen/arch/arm/dom0less-build.c > > +++ b/xen/arch/arm/dom0less-build.c > > @@ -586,6 +586,189 @@ static int __init domain_handle_dtb_bootmodule(struct domain *d, > > return res; > > } > > +static int __init make_virtio_pci_domU_node(const struct kernel_info *kinfo) > > +{ > > + void *fdt = kinfo->fdt; > > + /* reg is sized to be used for all the needed properties below */ > > + __be32 reg[(1 + (GUEST_ROOT_ADDRESS_CELLS * 2) + GUEST_ROOT_SIZE_CELLS) > > + * 2]; > > + __be32 irq_map[4 * 4 * 8]; > > + __be32 *cells; > > + char buf[22]; /* pcie@ + max 16 char address + '\0' */ > > + int res; > > + int devfn, intx_pin; > > + static const char compat[] = "pci-host-ecam-generic"; > > + static const char reg_names[] = "ecam"; > > + > > + if ( p2m_ipa_bits <= 40 ) { > > + printk("PA bits %d is too small!\nvirtio-pci is only supported " > > + "on platforms with PA bits > 40\n", p2m_ipa_bits); > > + return -EINVAL; > > + } > > Please add a comment explaining where does this requires come from. If this > is the Address layout, then probably be to avoid relying on hardcoded number > of bits. > Oops, sorry, I had actually removed this part from my git branch but I forgot to regenerated the patch series. I'll remove it. Best regards, Edgar ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node 2024-09-25 16:34 ` Edgar E. Iglesias @ 2024-09-25 16:38 ` Julien Grall 2024-09-25 16:44 ` Edgar E. Iglesias 0 siblings, 1 reply; 32+ messages in thread From: Julien Grall @ 2024-09-25 16:38 UTC (permalink / raw) To: Edgar E. Iglesias Cc: Edgar E. Iglesias, xen-devel, sstabellini, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, Stewart Hildebrand Hi Edgar, On 25/09/2024 17:34, Edgar E. Iglesias wrote: > On Wed, Sep 25, 2024 at 08:44:41AM +0100, Julien Grall wrote: >> Hi, >> On 24/09/2024 17:23, Edgar E. Iglesias wrote: >>> From: Stewart Hildebrand <stewart.hildebrand@amd.com> >>> >>> When virtio-pci is specified in the dom0less domU properties, create a >>> virtio-pci node in the guest's device tree. Set up an mmio handler with >>> a register for the guest to poll when the backend has connected and >>> virtio-pci bus is ready to be probed. Grant tables may be used by >>> specifying virtio-pci = "grants";. >>> >>> [Edgar: Use GPEX PCI INTX interrupt swizzling (from PCI specs). >>> Make grants iommu-map cover the entire PCI bus. >>> Add virtio-pci-ranges to specify memory-map for direct-mapped guests. >>> Document virtio-pci dom0less fdt bindings.] >>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> >>> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> >>> --- >>> docs/misc/arm/device-tree/booting.txt | 21 +++ >>> xen/arch/arm/dom0less-build.c | 238 ++++++++++++++++++++++++++ >>> xen/arch/arm/include/asm/kernel.h | 15 ++ >>> 3 files changed, 274 insertions(+) >>> >>> diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt >>> index 3a04f5c57f..82f3bd7026 100644 >>> --- a/docs/misc/arm/device-tree/booting.txt >>> +++ b/docs/misc/arm/device-tree/booting.txt >>> @@ -276,6 +276,27 @@ with the following properties: >>> passed through. This option is the default if this property is missing >>> and the user does not provide the device partial device tree for the domain. >>> +- virtio-pci >> >> Similar question to the other patches, why is this specific to virtio PCI? >> QEMU (or another device module) is free to emulate whatever it wants behind >> the PCI hosbtridge. > > There's no hard limitatino to only virtio-pci devices it's more of a > recommendation that PVH guests should not use "emulated" devices but > there's nothing stopping it. Could you provide a bit more details where this requirement is coming from? For instance, I would expect we would need to do some emulation to boot Windows on Arm. Cheers, -- Julien Grall ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node 2024-09-25 16:38 ` Julien Grall @ 2024-09-25 16:44 ` Edgar E. Iglesias 2024-09-25 16:49 ` Edgar E. Iglesias 0 siblings, 1 reply; 32+ messages in thread From: Edgar E. Iglesias @ 2024-09-25 16:44 UTC (permalink / raw) To: Julien Grall Cc: Edgar E. Iglesias, xen-devel, sstabellini, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, Stewart Hildebrand On Wed, Sep 25, 2024 at 05:38:13PM +0100, Julien Grall wrote: > Hi Edgar, > > On 25/09/2024 17:34, Edgar E. Iglesias wrote: > > On Wed, Sep 25, 2024 at 08:44:41AM +0100, Julien Grall wrote: > > > Hi, > > > On 24/09/2024 17:23, Edgar E. Iglesias wrote: > > > > From: Stewart Hildebrand <stewart.hildebrand@amd.com> > > > > > > > > When virtio-pci is specified in the dom0less domU properties, create a > > > > virtio-pci node in the guest's device tree. Set up an mmio handler with > > > > a register for the guest to poll when the backend has connected and > > > > virtio-pci bus is ready to be probed. Grant tables may be used by > > > > specifying virtio-pci = "grants";. > > > > > > > > [Edgar: Use GPEX PCI INTX interrupt swizzling (from PCI specs). > > > > Make grants iommu-map cover the entire PCI bus. > > > > Add virtio-pci-ranges to specify memory-map for direct-mapped guests. > > > > Document virtio-pci dom0less fdt bindings.] > > > > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> > > > > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> > > > > --- > > > > docs/misc/arm/device-tree/booting.txt | 21 +++ > > > > xen/arch/arm/dom0less-build.c | 238 ++++++++++++++++++++++++++ > > > > xen/arch/arm/include/asm/kernel.h | 15 ++ > > > > 3 files changed, 274 insertions(+) > > > > > > > > diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt > > > > index 3a04f5c57f..82f3bd7026 100644 > > > > --- a/docs/misc/arm/device-tree/booting.txt > > > > +++ b/docs/misc/arm/device-tree/booting.txt > > > > @@ -276,6 +276,27 @@ with the following properties: > > > > passed through. This option is the default if this property is missing > > > > and the user does not provide the device partial device tree for the domain. > > > > +- virtio-pci > > > > > > Similar question to the other patches, why is this specific to virtio PCI? > > > QEMU (or another device module) is free to emulate whatever it wants behind > > > the PCI hosbtridge. > > > > There's no hard limitatino to only virtio-pci devices it's more of a > > recommendation that PVH guests should not use "emulated" devices but > > there's nothing stopping it. > > Could you provide a bit more details where this requirement is coming from? > For instance, I would expect we would need to do some emulation to boot > Windows on Arm. > I see. I guess it just came from my mental model, I thought part of the philosophy behind PVH was to avoid emulated devices and use paravirualized (virtio or something else) or passthrough whereever possible (except for the basic set of devices needed like vGIC, vuart, MMU). Cheers, Edgar ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node 2024-09-25 16:44 ` Edgar E. Iglesias @ 2024-09-25 16:49 ` Edgar E. Iglesias 2024-09-25 17:45 ` Julien Grall 0 siblings, 1 reply; 32+ messages in thread From: Edgar E. Iglesias @ 2024-09-25 16:49 UTC (permalink / raw) To: Edgar E. Iglesias Cc: Julien Grall, xen-devel, sstabellini, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, Stewart Hildebrand [-- Attachment #1: Type: text/plain, Size: 3415 bytes --] On Wed, Sep 25, 2024 at 10:44 AM Edgar E. Iglesias <edgar.iglesias@amd.com> wrote: > On Wed, Sep 25, 2024 at 05:38:13PM +0100, Julien Grall wrote: > > Hi Edgar, > > > > On 25/09/2024 17:34, Edgar E. Iglesias wrote: > > > On Wed, Sep 25, 2024 at 08:44:41AM +0100, Julien Grall wrote: > > > > Hi, > > > > On 24/09/2024 17:23, Edgar E. Iglesias wrote: > > > > > From: Stewart Hildebrand <stewart.hildebrand@amd.com> > > > > > > > > > > When virtio-pci is specified in the dom0less domU properties, > create a > > > > > virtio-pci node in the guest's device tree. Set up an mmio handler > with > > > > > a register for the guest to poll when the backend has connected and > > > > > virtio-pci bus is ready to be probed. Grant tables may be used by > > > > > specifying virtio-pci = "grants";. > > > > > > > > > > [Edgar: Use GPEX PCI INTX interrupt swizzling (from PCI specs). > > > > > Make grants iommu-map cover the entire PCI bus. > > > > > Add virtio-pci-ranges to specify memory-map for direct-mapped > guests. > > > > > Document virtio-pci dom0less fdt bindings.] > > > > > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> > > > > > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> > > > > > --- > > > > > docs/misc/arm/device-tree/booting.txt | 21 +++ > > > > > xen/arch/arm/dom0less-build.c | 238 > ++++++++++++++++++++++++++ > > > > > xen/arch/arm/include/asm/kernel.h | 15 ++ > > > > > 3 files changed, 274 insertions(+) > > > > > > > > > > diff --git a/docs/misc/arm/device-tree/booting.txt > b/docs/misc/arm/device-tree/booting.txt > > > > > index 3a04f5c57f..82f3bd7026 100644 > > > > > --- a/docs/misc/arm/device-tree/booting.txt > > > > > +++ b/docs/misc/arm/device-tree/booting.txt > > > > > @@ -276,6 +276,27 @@ with the following properties: > > > > > passed through. This option is the default if this property > is missing > > > > > and the user does not provide the device partial device > tree for the domain. > > > > > +- virtio-pci > > > > > > > > Similar question to the other patches, why is this specific to > virtio PCI? > > > > QEMU (or another device module) is free to emulate whatever it wants > behind > > > > the PCI hosbtridge. > > > > > > There's no hard limitatino to only virtio-pci devices it's more of a > > > recommendation that PVH guests should not use "emulated" devices but > > > there's nothing stopping it. > > > > Could you provide a bit more details where this requirement is coming > from? > > For instance, I would expect we would need to do some emulation to boot > > Windows on Arm. > > > > I see. I guess it just came from my mental model, I thought part of the > philosophy behind PVH was to avoid emulated devices and use > paravirualized (virtio or something else) or passthrough whereever > possible (except for the basic set of devices needed like vGIC, vuart, > MMU). > For example, we would recommend users to use virtio-net in favor of an emulated eepro1000 or whatever other NIC models available in QEMU. But there is no hard requirement nor limitation, a user can connect any available PCI device from the QEMU set. Another thing we're looking to do is to minimize the QEMU build (Kconfig + configure flags) to create a small build with only the stuff needed for virtio-pci. Best regards, Edgar [-- Attachment #2: Type: text/html, Size: 4560 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node 2024-09-25 16:49 ` Edgar E. Iglesias @ 2024-09-25 17:45 ` Julien Grall 2024-09-25 18:42 ` Edgar E. Iglesias 2024-09-25 22:20 ` Stefano Stabellini 0 siblings, 2 replies; 32+ messages in thread From: Julien Grall @ 2024-09-25 17:45 UTC (permalink / raw) To: Edgar E. Iglesias, Edgar E. Iglesias Cc: xen-devel, sstabellini, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, Stewart Hildebrand Hi Edgar, On 25/09/2024 17:49, Edgar E. Iglesias wrote: > On Wed, Sep 25, 2024 at 10:44 AM Edgar E. Iglesias <edgar.iglesias@amd.com> > wrote: > >> On Wed, Sep 25, 2024 at 05:38:13PM +0100, Julien Grall wrote: >>> Hi Edgar, >>> >>> On 25/09/2024 17:34, Edgar E. Iglesias wrote: >>>> On Wed, Sep 25, 2024 at 08:44:41AM +0100, Julien Grall wrote: >>>>> Hi, >>>>> On 24/09/2024 17:23, Edgar E. Iglesias wrote: >>>>>> From: Stewart Hildebrand <stewart.hildebrand@amd.com> >>>>>> >>>>>> When virtio-pci is specified in the dom0less domU properties, >> create a >>>>>> virtio-pci node in the guest's device tree. Set up an mmio handler >> with >>>>>> a register for the guest to poll when the backend has connected and >>>>>> virtio-pci bus is ready to be probed. Grant tables may be used by >>>>>> specifying virtio-pci = "grants";. >>>>>> >>>>>> [Edgar: Use GPEX PCI INTX interrupt swizzling (from PCI specs). >>>>>> Make grants iommu-map cover the entire PCI bus. >>>>>> Add virtio-pci-ranges to specify memory-map for direct-mapped >> guests. >>>>>> Document virtio-pci dom0less fdt bindings.] >>>>>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> >>>>>> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> >>>>>> --- >>>>>> docs/misc/arm/device-tree/booting.txt | 21 +++ >>>>>> xen/arch/arm/dom0less-build.c | 238 >> ++++++++++++++++++++++++++ >>>>>> xen/arch/arm/include/asm/kernel.h | 15 ++ >>>>>> 3 files changed, 274 insertions(+) >>>>>> >>>>>> diff --git a/docs/misc/arm/device-tree/booting.txt >> b/docs/misc/arm/device-tree/booting.txt >>>>>> index 3a04f5c57f..82f3bd7026 100644 >>>>>> --- a/docs/misc/arm/device-tree/booting.txt >>>>>> +++ b/docs/misc/arm/device-tree/booting.txt >>>>>> @@ -276,6 +276,27 @@ with the following properties: >>>>>> passed through. This option is the default if this property >> is missing >>>>>> and the user does not provide the device partial device >> tree for the domain. >>>>>> +- virtio-pci >>>>> >>>>> Similar question to the other patches, why is this specific to >> virtio PCI? >>>>> QEMU (or another device module) is free to emulate whatever it wants >> behind >>>>> the PCI hosbtridge. >>>> >>>> There's no hard limitatino to only virtio-pci devices it's more of a >>>> recommendation that PVH guests should not use "emulated" devices but >>>> there's nothing stopping it. >>> >>> Could you provide a bit more details where this requirement is coming >> from? >>> For instance, I would expect we would need to do some emulation to boot >>> Windows on Arm. >>> >> >> I see. I guess it just came from my mental model, I thought part of the >> philosophy behind PVH was to avoid emulated devices and use >> paravirualized (virtio or something else) or passthrough whereever >> possible (except for the basic set of devices needed like vGIC, vuart, >> MMU). >> > > For example, we would recommend users to use virtio-net in favor of an > emulated eepro1000 or whatever other NIC models available in QEMU. Indeed. I would always recommend user to use virtio-net over eepro1000. > But there is no hard requirement nor limitation, a user can connect any > available PCI device from the QEMU set. We need to be clear about what we are exposing to the guest. With this patch we will describe a PCI hostbridge in Device Tree (well it is an empty region we hope the Device Model to emulate at some point). But the hypervisor will not create the device model. Instead, you expect the user/integrator to have extra script to launch a Device Model (So it may not even be a hostbridge). > > Another thing we're looking to do is to minimize the QEMU build (Kconfig + > configure flags) to create a small build with only the stuff needed for > virtio-pci. It is nice to have a cut down version of QEMU :). However, Xen doesn't care about the device model used for the emulation. I have seen some specialized DM in the wild (and used them while I was working on disaggregating the DM). Anyway, while I understand this approach works in tailored environment, I am not convinced this works for a more general approach. The two options I would rather consider are: 1. Allow the device model to receive access for a single PCI device (IOW hook into vPCI). 2. Find a way to let the user provide the binding (maybe in a partial device-tree) + the list of Interrupts/MMIO that would be emulated by QEMU. The second approach might be another way to get a second hostbridge in your use case while giving a bit more flexibility in what can be done (thinking about disagreggated environment). Cheers, -- Julien Grall ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node 2024-09-25 17:45 ` Julien Grall @ 2024-09-25 18:42 ` Edgar E. Iglesias 2024-09-25 22:20 ` Stefano Stabellini 1 sibling, 0 replies; 32+ messages in thread From: Edgar E. Iglesias @ 2024-09-25 18:42 UTC (permalink / raw) To: Julien Grall Cc: Edgar E. Iglesias, xen-devel, sstabellini, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, Stewart Hildebrand On Wed, Sep 25, 2024 at 06:45:19PM +0100, Julien Grall wrote: > Hi Edgar, > > On 25/09/2024 17:49, Edgar E. Iglesias wrote: > > On Wed, Sep 25, 2024 at 10:44 AM Edgar E. Iglesias <edgar.iglesias@amd.com> > > wrote: > > > > > On Wed, Sep 25, 2024 at 05:38:13PM +0100, Julien Grall wrote: > > > > Hi Edgar, > > > > > > > > On 25/09/2024 17:34, Edgar E. Iglesias wrote: > > > > > On Wed, Sep 25, 2024 at 08:44:41AM +0100, Julien Grall wrote: > > > > > > Hi, > > > > > > On 24/09/2024 17:23, Edgar E. Iglesias wrote: > > > > > > > From: Stewart Hildebrand <stewart.hildebrand@amd.com> > > > > > > > > > > > > > > When virtio-pci is specified in the dom0less domU properties, > > > create a > > > > > > > virtio-pci node in the guest's device tree. Set up an mmio handler > > > with > > > > > > > a register for the guest to poll when the backend has connected and > > > > > > > virtio-pci bus is ready to be probed. Grant tables may be used by > > > > > > > specifying virtio-pci = "grants";. > > > > > > > > > > > > > > [Edgar: Use GPEX PCI INTX interrupt swizzling (from PCI specs). > > > > > > > Make grants iommu-map cover the entire PCI bus. > > > > > > > Add virtio-pci-ranges to specify memory-map for direct-mapped > > > guests. > > > > > > > Document virtio-pci dom0less fdt bindings.] > > > > > > > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> > > > > > > > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> > > > > > > > --- > > > > > > > docs/misc/arm/device-tree/booting.txt | 21 +++ > > > > > > > xen/arch/arm/dom0less-build.c | 238 > > > ++++++++++++++++++++++++++ > > > > > > > xen/arch/arm/include/asm/kernel.h | 15 ++ > > > > > > > 3 files changed, 274 insertions(+) > > > > > > > > > > > > > > diff --git a/docs/misc/arm/device-tree/booting.txt > > > b/docs/misc/arm/device-tree/booting.txt > > > > > > > index 3a04f5c57f..82f3bd7026 100644 > > > > > > > --- a/docs/misc/arm/device-tree/booting.txt > > > > > > > +++ b/docs/misc/arm/device-tree/booting.txt > > > > > > > @@ -276,6 +276,27 @@ with the following properties: > > > > > > > passed through. This option is the default if this property > > > is missing > > > > > > > and the user does not provide the device partial device > > > tree for the domain. > > > > > > > +- virtio-pci > > > > > > > > > > > > Similar question to the other patches, why is this specific to > > > virtio PCI? > > > > > > QEMU (or another device module) is free to emulate whatever it wants > > > behind > > > > > > the PCI hosbtridge. > > > > > > > > > > There's no hard limitatino to only virtio-pci devices it's more of a > > > > > recommendation that PVH guests should not use "emulated" devices but > > > > > there's nothing stopping it. > > > > > > > > Could you provide a bit more details where this requirement is coming > > > from? > > > > For instance, I would expect we would need to do some emulation to boot > > > > Windows on Arm. > > > > > > > > > > I see. I guess it just came from my mental model, I thought part of the > > > philosophy behind PVH was to avoid emulated devices and use > > > paravirualized (virtio or something else) or passthrough whereever > > > possible (except for the basic set of devices needed like vGIC, vuart, > > > MMU). > > > > > > > For example, we would recommend users to use virtio-net in favor of an > > emulated eepro1000 or whatever other NIC models available in QEMU. > > Indeed. I would always recommend user to use virtio-net over eepro1000. > > > But there is no hard requirement nor limitation, a user can connect any > > available PCI device from the QEMU set. > > We need to be clear about what we are exposing to the guest. With this patch > we will describe a PCI hostbridge in Device Tree (well it is an empty region > we hope the Device Model to emulate at some point). But the hypervisor will > not create the device model. Instead, you expect the user/integrator to have > extra script to launch a Device Model (So it may not even be a hostbridge). Right OK, I'll try make it clearer in the next versions allthough the interface may change a bit if we go with your second suggestion below. > > > > > Another thing we're looking to do is to minimize the QEMU build (Kconfig + > > configure flags) to create a small build with only the stuff needed for > > virtio-pci. > > It is nice to have a cut down version of QEMU :). However, Xen doesn't care > about the device model used for the emulation. I have seen some specialized > DM in the wild (and used them while I was working on disaggregating the DM). > > Anyway, while I understand this approach works in tailored environment, I am > not convinced this works for a more general approach. The two options I > would rather consider are: > 1. Allow the device model to receive access for a single PCI device (IOW > hook into vPCI). > 2. Find a way to let the user provide the binding (maybe in a partial > device-tree) + the list of Interrupts/MMIO that would be emulated by QEMU. > > The second approach might be another way to get a second hostbridge in your > use case while giving a bit more flexibility in what can be done (thinking > about disagreggated environment). > Thanks, it makes sense and I like that it makes the FDT setup more explicit, i.e user providing an FDT fragment rather than Xen implicitly generating it behind the scene. I'll wait for more comments but if folks don't disagree I will explore this second option. Cheers, Edgar ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node 2024-09-25 17:45 ` Julien Grall 2024-09-25 18:42 ` Edgar E. Iglesias @ 2024-09-25 22:20 ` Stefano Stabellini 1 sibling, 0 replies; 32+ messages in thread From: Stefano Stabellini @ 2024-09-25 22:20 UTC (permalink / raw) To: Julien Grall Cc: Edgar E. Iglesias, Edgar E. Iglesias, xen-devel, sstabellini, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, Stewart Hildebrand [-- Attachment #1: Type: text/plain, Size: 9110 bytes --] On Wed, 25 Sep 2024, Julien Grall wrote: > Hi Edgar, > > On 25/09/2024 17:49, Edgar E. Iglesias wrote: > > On Wed, Sep 25, 2024 at 10:44 AM Edgar E. Iglesias <edgar.iglesias@amd.com> > > wrote: > > > > > On Wed, Sep 25, 2024 at 05:38:13PM +0100, Julien Grall wrote: > > > > Hi Edgar, > > > > > > > > On 25/09/2024 17:34, Edgar E. Iglesias wrote: > > > > > On Wed, Sep 25, 2024 at 08:44:41AM +0100, Julien Grall wrote: > > > > > > Hi, > > > > > > On 24/09/2024 17:23, Edgar E. Iglesias wrote: > > > > > > > From: Stewart Hildebrand <stewart.hildebrand@amd.com> > > > > > > > > > > > > > > When virtio-pci is specified in the dom0less domU properties, > > > create a > > > > > > > virtio-pci node in the guest's device tree. Set up an mmio handler > > > with > > > > > > > a register for the guest to poll when the backend has connected > > > > > > > and > > > > > > > virtio-pci bus is ready to be probed. Grant tables may be used by > > > > > > > specifying virtio-pci = "grants";. > > > > > > > > > > > > > > [Edgar: Use GPEX PCI INTX interrupt swizzling (from PCI specs). > > > > > > > Make grants iommu-map cover the entire PCI bus. > > > > > > > Add virtio-pci-ranges to specify memory-map for direct-mapped > > > guests. > > > > > > > Document virtio-pci dom0less fdt bindings.] > > > > > > > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> > > > > > > > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> > > > > > > > --- > > > > > > > docs/misc/arm/device-tree/booting.txt | 21 +++ > > > > > > > xen/arch/arm/dom0less-build.c | 238 > > > ++++++++++++++++++++++++++ > > > > > > > xen/arch/arm/include/asm/kernel.h | 15 ++ > > > > > > > 3 files changed, 274 insertions(+) > > > > > > > > > > > > > > diff --git a/docs/misc/arm/device-tree/booting.txt > > > b/docs/misc/arm/device-tree/booting.txt > > > > > > > index 3a04f5c57f..82f3bd7026 100644 > > > > > > > --- a/docs/misc/arm/device-tree/booting.txt > > > > > > > +++ b/docs/misc/arm/device-tree/booting.txt > > > > > > > @@ -276,6 +276,27 @@ with the following properties: > > > > > > > passed through. This option is the default if this > > > > > > > property > > > is missing > > > > > > > and the user does not provide the device partial device > > > tree for the domain. > > > > > > > +- virtio-pci > > > > > > > > > > > > Similar question to the other patches, why is this specific to > > > virtio PCI? > > > > > > QEMU (or another device module) is free to emulate whatever it wants > > > behind > > > > > > the PCI hosbtridge. > > > > > > > > > > There's no hard limitatino to only virtio-pci devices it's more of a > > > > > recommendation that PVH guests should not use "emulated" devices but > > > > > there's nothing stopping it. > > > > > > > > Could you provide a bit more details where this requirement is coming > > > from? > > > > For instance, I would expect we would need to do some emulation to boot > > > > Windows on Arm. > > > > > > > > > > I see. I guess it just came from my mental model, I thought part of the > > > philosophy behind PVH was to avoid emulated devices and use > > > paravirualized (virtio or something else) or passthrough whereever > > > possible (except for the basic set of devices needed like vGIC, vuart, > > > MMU). > > > > > > > For example, we would recommend users to use virtio-net in favor of an > > emulated eepro1000 or whatever other NIC models available in QEMU. > > Indeed. I would always recommend user to use virtio-net over eepro1000. > > > But there is no hard requirement nor limitation, a user can connect any > > available PCI device from the QEMU set. > > We need to be clear about what we are exposing to the guest. With this patch > we will describe a PCI hostbridge in Device Tree (well it is an empty region > we hope the Device Model to emulate at some point). But the hypervisor will > not create the device model. Instead, you expect the user/integrator to have > extra script to launch a Device Model (So it may not even be a hostbridge). > > > > > Another thing we're looking to do is to minimize the QEMU build (Kconfig + > > configure flags) to create a small build with only the stuff needed for > > virtio-pci. > > It is nice to have a cut down version of QEMU :). However, Xen doesn't care > about the device model used for the emulation. I have seen some specialized DM > in the wild (and used them while I was working on disaggregating the DM). > > Anyway, while I understand this approach works in tailored environment, I am > not convinced this works for a more general approach. The two options I would > rather consider are: > 1. Allow the device model to receive access for a single PCI device (IOW > hook into vPCI). > 2. Find a way to let the user provide the binding (maybe in a partial > device-tree) + the list of Interrupts/MMIO that would be emulated by QEMU. > > The second approach might be another way to get a second hostbridge in your > use case while giving a bit more flexibility in what can be done (thinking > about disagreggated environment). Thank you for the suggestion on the second option, I think that is close to what we intended. Let me add a few more details. There has been a significant trend toward using virtio for all virtual interfaces in automotive and other industries for several years now. While I'm not entirely sure about Windows, all the operating systems we work with (e.g. Android, RTOSes) are optimizing for virtio interfaces. The expectation is that guests will either access physical devices or virtio devices. I mention this in response to the specialized vs. general approach - virtio is becoming (or has already become) the standard, at least in automotive and embedded sectors. This is why we have introduced the new specialized QEMU machine for virtio only on both ARM and x86. However, you are right that the solution is somewhat dependent on the QEMU emulation provided, meaning it isn't fully generalized and may not work with other device models. Let's see if we can improve this. I agree that a single PCI root complex is the cleanest solution from a Xen perspective. However, aside from the level of effort required, it's also important to consider QEMU integration. The separate root complex integrates very well into QEMU's own view of the world, and that is important too because the more we deviate the more we are at risk of triggering unwanted bugs in QEMU. Bugs that would only show up in a Xen configuration and we would responsible to fix. The two PCI RCs approach is simple because it is low complexity from a QEMU point of view. The trade-off is having the two PCI RCs exposed to the VM instead of one, but in our tests two PCI RCs work well on both ARM and even x86. So I think the two PCI RCs approach is viable. (Also I believe that technically is a single PCI RC with two host bridges.) For the second option, I'll let Edgar investigate but I think that would work, thank you for being flexible. We would still need patches 4-6 from this series. Let's assume we'll proceed with patches 4-6 and, as agreed, skip patches 1-2. Then my first thought would be to rely on ImageBuilder to generate the complete virtio DT node. While I usually like using ImageBuilder, in this case, I lean toward having Xen generate the domU nodes. There are a few reasons for this: the partial DTB is typically used for passthrough and related information, which isn't the case here. Although ImageBuilder can merge multiple partial DTBs, I think it's best not to depend on that more delicate feature, for scenarios where a user wants both a passthrough device and a virtio device. But we could change the DT properties to be more explicitly related to an emulated PCI root complex, which could be provided by any device model and not only QEMU. Also we can avoid saying "virtio" in the property name because although our use-case is virtio, as you wrote there is nothing that ties this to virtio today. So what about the following dom0less device tree properties instead? secondary-emulated-pci-host-bridge = <ecam_address ecam_size memory_address memory_size prefetch_address prefetch_size irq_start, irq_how_many, flags>; one of the special flags could be grants enabled/disabled. This way: - The list of interrupts and MMIOs is explicit - The fact that we are talking about a seconday emulated host bridge is explicit, in the description we can say we expect it to be provided by an ioreq device model. We can call it secondary-ioreq-pci-host-bridge - The Xen-generated domU device tree description is still generic and reusable. Let's say that someone comes up with a different use-case and a different device model but still wants a PCI host bridge, they can reuse this. The DomU DT is standard for a generic PCI host bridge. - We don't need ImageBuilder to generate/edit complex device tree nodes and merge partial device trees ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node 2024-09-24 16:23 ` [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node Edgar E. Iglesias ` (2 preceding siblings ...) 2024-09-25 7:44 ` Julien Grall @ 2024-10-01 19:30 ` Stewart Hildebrand 3 siblings, 0 replies; 32+ messages in thread From: Stewart Hildebrand @ 2024-10-01 19:30 UTC (permalink / raw) To: Edgar E. Iglesias, xen-devel Cc: sstabellini, julien, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias On 9/24/24 12:23, Edgar E. Iglesias wrote: > diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c > index 09b65e44ae..dab24fa9e2 100644 > --- a/xen/arch/arm/dom0less-build.c > +++ b/xen/arch/arm/dom0less-build.c > @@ -586,6 +586,189 @@ static int __init domain_handle_dtb_bootmodule(struct domain *d, > return res; > } > > +static int __init make_virtio_pci_domU_node(const struct kernel_info *kinfo) > +{ > + void *fdt = kinfo->fdt; > + /* reg is sized to be used for all the needed properties below */ > + __be32 reg[(1 + (GUEST_ROOT_ADDRESS_CELLS * 2) + GUEST_ROOT_SIZE_CELLS) > + * 2]; > + __be32 irq_map[4 * 4 * 8]; Why two separate arrays on the stack? It looks like they're used for the same purpose, so I think we may as well keep the bigger one and get rid of the smaller. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v1 4/6] xen/arm: io: Add support for mmio background regions 2024-09-24 16:23 [PATCH v1 0/6] xen/arm: Add Virtio-PCI for dom0less on ARM Edgar E. Iglesias ` (2 preceding siblings ...) 2024-09-24 16:23 ` [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node Edgar E. Iglesias @ 2024-09-24 16:23 ` Edgar E. Iglesias 2024-09-24 23:59 ` Stefano Stabellini 2024-09-24 16:23 ` [PATCH v1 5/6] xen/arm: io: Add a read-const writes-ignored mmio handler Edgar E. Iglesias 2024-09-24 16:23 ` [PATCH v1 6/6] xen/arm: dom0less: Add a background PCI ECAM mmio region Edgar E. Iglesias 5 siblings, 1 reply; 32+ messages in thread From: Edgar E. Iglesias @ 2024-09-24 16:23 UTC (permalink / raw) To: xen-devel Cc: sstabellini, julien, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> Add support for mmio background regions. These regions can be overlayed by IOREQ handlers and thus act as fallback handlers while IOREQ clients haven't registered. Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> --- xen/arch/arm/include/asm/mmio.h | 11 ++++++++++- xen/arch/arm/io.c | 18 ++++++++++++------ 2 files changed, 22 insertions(+), 7 deletions(-) diff --git a/xen/arch/arm/include/asm/mmio.h b/xen/arch/arm/include/asm/mmio.h index b22cfdac5b..7da542cd79 100644 --- a/xen/arch/arm/include/asm/mmio.h +++ b/xen/arch/arm/include/asm/mmio.h @@ -70,6 +70,7 @@ struct mmio_handler_ops { struct mmio_handler { paddr_t addr; paddr_t size; + bool background; const struct mmio_handler_ops *ops; void *priv; }; @@ -83,9 +84,17 @@ struct vmmio { enum io_state try_handle_mmio(struct cpu_user_regs *regs, mmio_info_t *info); +void register_mmio_bg_handler(struct domain *d, + bool background, + const struct mmio_handler_ops *ops, + paddr_t addr, paddr_t size, void *priv); +static inline void register_mmio_handler(struct domain *d, const struct mmio_handler_ops *ops, - paddr_t addr, paddr_t size, void *priv); + paddr_t addr, paddr_t size, void *priv) +{ + register_mmio_bg_handler(d, false, ops, addr, size, priv); +} int domain_io_init(struct domain *d, unsigned int max_count); void domain_io_free(struct domain *d); diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c index 96c740d563..934a2ad2b9 100644 --- a/xen/arch/arm/io.c +++ b/xen/arch/arm/io.c @@ -159,6 +159,7 @@ enum io_state try_handle_mmio(struct cpu_user_regs *regs, { struct vcpu *v = current; const struct mmio_handler *handler = NULL; + bool has_background; int rc; ASSERT(info->dabt.ec == HSR_EC_DATA_ABORT_LOWER_EL); @@ -170,13 +171,16 @@ enum io_state try_handle_mmio(struct cpu_user_regs *regs, } handler = find_mmio_handler(v->domain, info->gpa); - if ( !handler ) + has_background = handler && handler->background; + if ( !handler || has_background ) { rc = try_fwd_ioserv(regs, v, info); if ( rc == IO_HANDLED ) return handle_ioserv(regs, v); - - return rc; + else if ( !(rc == IO_UNHANDLED && has_background) ) { + /* Only return failure if there's no background handler. */ + return rc; + } } /* @@ -197,9 +201,10 @@ enum io_state try_handle_mmio(struct cpu_user_regs *regs, return handle_read(handler, v, info); } -void register_mmio_handler(struct domain *d, - const struct mmio_handler_ops *ops, - paddr_t addr, paddr_t size, void *priv) +void register_mmio_bg_handler(struct domain *d, + bool background, + const struct mmio_handler_ops *ops, + paddr_t addr, paddr_t size, void *priv) { struct vmmio *vmmio = &d->arch.vmmio; struct mmio_handler *handler; @@ -213,6 +218,7 @@ void register_mmio_handler(struct domain *d, handler->ops = ops; handler->addr = addr; handler->size = size; + handler->background = background; handler->priv = priv; vmmio->num_entries++; -- 2.43.0 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v1 4/6] xen/arm: io: Add support for mmio background regions 2024-09-24 16:23 ` [PATCH v1 4/6] xen/arm: io: Add support for mmio background regions Edgar E. Iglesias @ 2024-09-24 23:59 ` Stefano Stabellini 0 siblings, 0 replies; 32+ messages in thread From: Stefano Stabellini @ 2024-09-24 23:59 UTC (permalink / raw) To: Edgar E. Iglesias Cc: xen-devel, sstabellini, julien, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias On Tue, 24 Sep 2024, Edgar E. Iglesias wrote: > From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> > > Add support for mmio background regions. These regions > can be overlayed by IOREQ handlers and thus act as > fallback handlers while IOREQ clients haven't registered. > > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> > --- > xen/arch/arm/include/asm/mmio.h | 11 ++++++++++- > xen/arch/arm/io.c | 18 ++++++++++++------ > 2 files changed, 22 insertions(+), 7 deletions(-) > > diff --git a/xen/arch/arm/include/asm/mmio.h b/xen/arch/arm/include/asm/mmio.h > index b22cfdac5b..7da542cd79 100644 > --- a/xen/arch/arm/include/asm/mmio.h > +++ b/xen/arch/arm/include/asm/mmio.h > @@ -70,6 +70,7 @@ struct mmio_handler_ops { > struct mmio_handler { > paddr_t addr; > paddr_t size; > + bool background; > const struct mmio_handler_ops *ops; > void *priv; > }; > @@ -83,9 +84,17 @@ struct vmmio { > > enum io_state try_handle_mmio(struct cpu_user_regs *regs, > mmio_info_t *info); > +void register_mmio_bg_handler(struct domain *d, > + bool background, > + const struct mmio_handler_ops *ops, > + paddr_t addr, paddr_t size, void *priv); > +static inline > void register_mmio_handler(struct domain *d, > const struct mmio_handler_ops *ops, > - paddr_t addr, paddr_t size, void *priv); > + paddr_t addr, paddr_t size, void *priv) > +{ > + register_mmio_bg_handler(d, false, ops, addr, size, priv); > +} > int domain_io_init(struct domain *d, unsigned int max_count); > void domain_io_free(struct domain *d); > > diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c > index 96c740d563..934a2ad2b9 100644 > --- a/xen/arch/arm/io.c > +++ b/xen/arch/arm/io.c > @@ -159,6 +159,7 @@ enum io_state try_handle_mmio(struct cpu_user_regs *regs, > { > struct vcpu *v = current; > const struct mmio_handler *handler = NULL; > + bool has_background; > int rc; > > ASSERT(info->dabt.ec == HSR_EC_DATA_ABORT_LOWER_EL); > @@ -170,13 +171,16 @@ enum io_state try_handle_mmio(struct cpu_user_regs *regs, > } > > handler = find_mmio_handler(v->domain, info->gpa); > - if ( !handler ) > + has_background = handler && handler->background; > + if ( !handler || has_background ) > { > rc = try_fwd_ioserv(regs, v, info); > if ( rc == IO_HANDLED ) > return handle_ioserv(regs, v); > - > - return rc; > + else if ( !(rc == IO_UNHANDLED && has_background) ) { > + /* Only return failure if there's no background handler. */ > + return rc; > + } > } > > /* > @@ -197,9 +201,10 @@ enum io_state try_handle_mmio(struct cpu_user_regs *regs, > return handle_read(handler, v, info); > } > > -void register_mmio_handler(struct domain *d, > - const struct mmio_handler_ops *ops, > - paddr_t addr, paddr_t size, void *priv) > +void register_mmio_bg_handler(struct domain *d, > + bool background, > + const struct mmio_handler_ops *ops, > + paddr_t addr, paddr_t size, void *priv) > { > struct vmmio *vmmio = &d->arch.vmmio; > struct mmio_handler *handler; > @@ -213,6 +218,7 @@ void register_mmio_handler(struct domain *d, > handler->ops = ops; > handler->addr = addr; > handler->size = size; > + handler->background = background; > handler->priv = priv; > > vmmio->num_entries++; > -- > 2.43.0 > ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v1 5/6] xen/arm: io: Add a read-const writes-ignored mmio handler 2024-09-24 16:23 [PATCH v1 0/6] xen/arm: Add Virtio-PCI for dom0less on ARM Edgar E. Iglesias ` (3 preceding siblings ...) 2024-09-24 16:23 ` [PATCH v1 4/6] xen/arm: io: Add support for mmio background regions Edgar E. Iglesias @ 2024-09-24 16:23 ` Edgar E. Iglesias 2024-09-25 0:02 ` Stefano Stabellini 2024-09-24 16:23 ` [PATCH v1 6/6] xen/arm: dom0less: Add a background PCI ECAM mmio region Edgar E. Iglesias 5 siblings, 1 reply; 32+ messages in thread From: Edgar E. Iglesias @ 2024-09-24 16:23 UTC (permalink / raw) To: xen-devel Cc: sstabellini, julien, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> Add a read-const writes-ignored mmio handler. This is useful to for example register background regions that return a fixed value instead of raising data aborts. Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> --- xen/arch/arm/include/asm/mmio.h | 2 ++ xen/arch/arm/io.c | 21 +++++++++++++++++++++ 2 files changed, 23 insertions(+) diff --git a/xen/arch/arm/include/asm/mmio.h b/xen/arch/arm/include/asm/mmio.h index 7da542cd79..605620a2f4 100644 --- a/xen/arch/arm/include/asm/mmio.h +++ b/xen/arch/arm/include/asm/mmio.h @@ -82,6 +82,8 @@ struct vmmio { struct mmio_handler *handlers; }; +extern const struct mmio_handler_ops mmio_read_const_writes_ignored; + enum io_state try_handle_mmio(struct cpu_user_regs *regs, mmio_info_t *info); void register_mmio_bg_handler(struct domain *d, diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c index 934a2ad2b9..8ab0435afc 100644 --- a/xen/arch/arm/io.c +++ b/xen/arch/arm/io.c @@ -20,6 +20,27 @@ #include "decode.h" +/* + * Reusable mmio handler useful as background while waiting for IOREQ. + * Register with priv as default read value. Writes ignored. + */ +static int bg_read(struct vcpu *v, mmio_info_t *info, register_t *r, void *priv) +{ + *r = (uintptr_t) priv; + return 1; +} + +static int bg_write(struct vcpu *v, mmio_info_t *info, register_t r, void *priv) +{ + return 1; +} + +/* Read const value (from priv), writes ignored. */ +const struct mmio_handler_ops mmio_read_const_writes_ignored = { + .read = bg_read, + .write = bg_write, +}; + static enum io_state handle_read(const struct mmio_handler *handler, struct vcpu *v, mmio_info_t *info) -- 2.43.0 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v1 5/6] xen/arm: io: Add a read-const writes-ignored mmio handler 2024-09-24 16:23 ` [PATCH v1 5/6] xen/arm: io: Add a read-const writes-ignored mmio handler Edgar E. Iglesias @ 2024-09-25 0:02 ` Stefano Stabellini 0 siblings, 0 replies; 32+ messages in thread From: Stefano Stabellini @ 2024-09-25 0:02 UTC (permalink / raw) To: Edgar E. Iglesias Cc: xen-devel, sstabellini, julien, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias On Tue, 24 Sep 2024, Edgar E. Iglesias wrote: > From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> > > Add a read-const writes-ignored mmio handler. This is useful > to for example register background regions that return a fixed > value instead of raising data aborts. > > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> > --- > xen/arch/arm/include/asm/mmio.h | 2 ++ > xen/arch/arm/io.c | 21 +++++++++++++++++++++ > 2 files changed, 23 insertions(+) > > diff --git a/xen/arch/arm/include/asm/mmio.h b/xen/arch/arm/include/asm/mmio.h > index 7da542cd79..605620a2f4 100644 > --- a/xen/arch/arm/include/asm/mmio.h > +++ b/xen/arch/arm/include/asm/mmio.h > @@ -82,6 +82,8 @@ struct vmmio { > struct mmio_handler *handlers; > }; > > +extern const struct mmio_handler_ops mmio_read_const_writes_ignored; > + > enum io_state try_handle_mmio(struct cpu_user_regs *regs, > mmio_info_t *info); > void register_mmio_bg_handler(struct domain *d, > diff --git a/xen/arch/arm/io.c b/xen/arch/arm/io.c > index 934a2ad2b9..8ab0435afc 100644 > --- a/xen/arch/arm/io.c > +++ b/xen/arch/arm/io.c > @@ -20,6 +20,27 @@ > > #include "decode.h" > > +/* > + * Reusable mmio handler useful as background while waiting for IOREQ. > + * Register with priv as default read value. Writes ignored. > + */ > +static int bg_read(struct vcpu *v, mmio_info_t *info, register_t *r, void *priv) > +{ > + *r = (uintptr_t) priv; > + return 1; > +} > + > +static int bg_write(struct vcpu *v, mmio_info_t *info, register_t r, void *priv) > +{ > + return 1; > +} > + > +/* Read const value (from priv), writes ignored. */ > +const struct mmio_handler_ops mmio_read_const_writes_ignored = { > + .read = bg_read, > + .write = bg_write, > +}; > + > static enum io_state handle_read(const struct mmio_handler *handler, > struct vcpu *v, > mmio_info_t *info) > -- > 2.43.0 > ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v1 6/6] xen/arm: dom0less: Add a background PCI ECAM mmio region 2024-09-24 16:23 [PATCH v1 0/6] xen/arm: Add Virtio-PCI for dom0less on ARM Edgar E. Iglesias ` (4 preceding siblings ...) 2024-09-24 16:23 ` [PATCH v1 5/6] xen/arm: io: Add a read-const writes-ignored mmio handler Edgar E. Iglesias @ 2024-09-24 16:23 ` Edgar E. Iglesias 2024-09-25 0:07 ` Stefano Stabellini 5 siblings, 1 reply; 32+ messages in thread From: Edgar E. Iglesias @ 2024-09-24 16:23 UTC (permalink / raw) To: xen-devel Cc: sstabellini, julien, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> Add a background PCI ECAM mmio region always reading as all ones. This indicates to the OS that there are no PCI devices on the bus. Once the device-model's IOREQ client connects, the OS can rescan the bus and find PV and emulated devices. This avoids a race where domU's come up before the device models, causing domU to crash into a data-abort when accessing ECAM. Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> --- xen/arch/arm/dom0less-build.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c index dab24fa9e2..bc5285e7fa 100644 --- a/xen/arch/arm/dom0less-build.c +++ b/xen/arch/arm/dom0less-build.c @@ -1015,6 +1015,16 @@ static int __init construct_domU(struct domain *d, kinfo.virtio_pci.mem.base = GUEST_VIRTIO_PCI_MEM_BASE; kinfo.virtio_pci.pf_mem.base = GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE; } + + /* + * Register a background PCI ECAM region returning ~0. This indicates + * to the OS that there are no PCI devices on the bus. Once an IOREQ + * client connects, the OS can rescan the bus and find devices. + */ + register_mmio_bg_handler(d, true, &mmio_read_const_writes_ignored, + kinfo.virtio_pci.ecam.base, + GUEST_VIRTIO_PCI_ECAM_SIZE, + (void *) ULONG_MAX); } rc = dt_property_read_string(node, "xen,enhanced", &dom0less_enhanced); -- 2.43.0 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v1 6/6] xen/arm: dom0less: Add a background PCI ECAM mmio region 2024-09-24 16:23 ` [PATCH v1 6/6] xen/arm: dom0less: Add a background PCI ECAM mmio region Edgar E. Iglesias @ 2024-09-25 0:07 ` Stefano Stabellini 0 siblings, 0 replies; 32+ messages in thread From: Stefano Stabellini @ 2024-09-25 0:07 UTC (permalink / raw) To: Edgar E. Iglesias Cc: xen-devel, sstabellini, julien, bertrand.marquis, michal.orzel, Volodymyr_Babchuk, dpsmith, edgar.iglesias On Tue, 24 Sep 2024, Edgar E. Iglesias wrote: > From: "Edgar E. Iglesias" <edgar.iglesias@amd.com> > > Add a background PCI ECAM mmio region always reading as all ones. > This indicates to the OS that there are no PCI devices on the bus. > Once the device-model's IOREQ client connects, the OS can rescan > the bus and find PV and emulated devices. > > This avoids a race where domU's come up before the device models, > causing domU to crash into a data-abort when accessing ECAM. > > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com> > --- > xen/arch/arm/dom0less-build.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c > index dab24fa9e2..bc5285e7fa 100644 > --- a/xen/arch/arm/dom0less-build.c > +++ b/xen/arch/arm/dom0less-build.c > @@ -1015,6 +1015,16 @@ static int __init construct_domU(struct domain *d, > kinfo.virtio_pci.mem.base = GUEST_VIRTIO_PCI_MEM_BASE; > kinfo.virtio_pci.pf_mem.base = GUEST_VIRTIO_PCI_PREFETCH_MEM_BASE; > } > + > + /* > + * Register a background PCI ECAM region returning ~0. This indicates > + * to the OS that there are no PCI devices on the bus. Once an IOREQ > + * client connects, the OS can rescan the bus and find devices. > + */ > + register_mmio_bg_handler(d, true, &mmio_read_const_writes_ignored, > + kinfo.virtio_pci.ecam.base, > + GUEST_VIRTIO_PCI_ECAM_SIZE, > + (void *) ULONG_MAX); > } I think GUEST_VIRTIO_PCI_ECAM_SIZE should be added to the description of virtio-pci-ranges in patch #3 to make it clear. Also any other "size" too. The ULONG_MAX -> void* -> uintptr_t -> register_t conversion works OK on both Xen 64-bit and 32-bit as far as I can tell, so: Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2024-10-01 19:31 UTC | newest] Thread overview: 32+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-09-24 16:23 [PATCH v1 0/6] xen/arm: Add Virtio-PCI for dom0less on ARM Edgar E. Iglesias 2024-09-24 16:23 ` [PATCH v1 1/6] xen/arm: Decrease size of the 2nd ram bank Edgar E. Iglesias 2024-09-24 16:30 ` Julien Grall 2024-09-24 23:34 ` Stefano Stabellini 2024-09-24 23:40 ` Edgar E. Iglesias 2024-09-24 16:23 ` [PATCH v1 2/6] xen/arm: Reserve resources for virtio-pci Edgar E. Iglesias 2024-09-24 16:35 ` Julien Grall 2024-09-24 17:11 ` Edgar E. Iglesias 2024-09-24 17:24 ` Julien Grall 2024-09-24 23:16 ` Stefano Stabellini 2024-09-25 7:36 ` Julien Grall 2024-09-25 18:09 ` Stefano Stabellini 2024-09-24 16:23 ` [PATCH v1 3/6] xen/arm: create dom0less virtio-pci DT node Edgar E. Iglesias 2024-09-24 23:55 ` Stefano Stabellini 2024-09-25 13:03 ` Stewart Hildebrand 2024-09-26 22:15 ` Edgar E. Iglesias 2024-09-25 2:48 ` Stewart Hildebrand 2024-09-25 7:44 ` Julien Grall 2024-09-25 16:34 ` Edgar E. Iglesias 2024-09-25 16:38 ` Julien Grall 2024-09-25 16:44 ` Edgar E. Iglesias 2024-09-25 16:49 ` Edgar E. Iglesias 2024-09-25 17:45 ` Julien Grall 2024-09-25 18:42 ` Edgar E. Iglesias 2024-09-25 22:20 ` Stefano Stabellini 2024-10-01 19:30 ` Stewart Hildebrand 2024-09-24 16:23 ` [PATCH v1 4/6] xen/arm: io: Add support for mmio background regions Edgar E. Iglesias 2024-09-24 23:59 ` Stefano Stabellini 2024-09-24 16:23 ` [PATCH v1 5/6] xen/arm: io: Add a read-const writes-ignored mmio handler Edgar E. Iglesias 2024-09-25 0:02 ` Stefano Stabellini 2024-09-24 16:23 ` [PATCH v1 6/6] xen/arm: dom0less: Add a background PCI ECAM mmio region Edgar E. Iglesias 2024-09-25 0:07 ` Stefano Stabellini
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.