From: Julien Grall <julien.grall@linaro.org>
To: Manish Jaggi <mjaggi@caviumnetworks.com>,
	Stefano Stabellini <sstabellini@kernel.org>
Cc: edgar.iglesias@xilinx.com, okaya@qti.qualcomm.com,
	Wei Chen <Wei.Chen@arm.com>, Steve Capper <Steve.Capper@arm.com>,
	Andre Przywara <andre.przywara@arm.com>,
	manish.jaggi@caviumnetworks.com, punit.agrawal@arm.com,
	vikrams@qti.qualcomm.com, "Goel, Sameer" <sgoel@qti.qualcomm.com>,
	xen-devel <xen-devel@lists.xenproject.org>,
	Dave P Martin <Dave.Martin@arm.com>,
	Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>,
	roger.pau@citrix.com
Subject: Re: [RFC] ARM PCI Passthrough design document
Date: Mon, 29 May 2017 19:14:55 +0100	[thread overview]
Message-ID: <928d5f49-b40b-56ee-f955-0b7122d529e2@linaro.org> (raw)
In-Reply-To: <61af41e6-a549-7930-efd4-705718d174f8@caviumnetworks.com>



On 05/29/2017 03:30 AM, Manish Jaggi wrote:
> Hi Julien,

Hello Manish,

> On 5/26/2017 10:44 PM, Julien Grall wrote:
>> PCI pass-through allows the guest to receive full control of physical PCI
>> devices. This means the guest will have full and direct access to the PCI
>> device.
>>
>> ARM supports a kind of guest that exploits as much of the hardware
>> virtualization support as possible. The guest will rely on PV drivers
>> only for I/O (e.g. block, network), and interrupts will come through
>> the virtualized interrupt controller, therefore there are no big
>> changes required within the kernel.
>>
>> As a consequence, it would be possible to replace PV drivers by
>> assigning real devices to the guest for I/O access. Xen on ARM would
>> therefore be able to run unmodified operating systems.
>>
>> To achieve this goal, it looks more sensible to go towards emulating the
>> host bridge (there will be more details later).
> IIUC this means that domU would have an emulated host bridge and dom0
> will see the actual host bridge?

You don't want the hardware domain and Xen accessing the configuration 
space at the same time. So if Xen is in charge of the host bridge, then 
an emulated host bridge should be exposed to the hardware domain.

Although, this depends on who is in charge of the host bridge. As you 
may have noticed, this design document proposes two ways to handle 
configuration space accesses. At the moment, any generic host bridge 
(see the definition in the design document) will be handled in Xen and 
the hardware domain will see an emulated host bridge.

If your host bridge is not a generic one, then the hardware domain will 
be in charge of it, and any configuration access from Xen will be 
forwarded to the hardware domain.

At the moment, as part of the first implementation, we are only looking 
to implement a generic host bridge in Xen. For all other host bridges 
we will decide on a case-by-case basis whether we want the driver in 
Xen.
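
To give an idea of why the generic case is attractive: for an 
ECAM-compliant bridge, the configuration space accessor is essentially a 
fixed offset computation from the SBDF. A rough sketch below (the type 
and function names are made up for illustration, this is not actual Xen 
code; reg is assumed to be a naturally aligned offset below 4K):

/*
 * Illustrative ECAM accessor only; ecam_bridge/ecam_read32 are
 * hypothetical names, not part of Xen.
 */
#include <stdint.h>

struct ecam_bridge {
    volatile uint8_t *cfg_base; /* mapped ECAM window */
    uint8_t start_bus;          /* first bus covered by the window */
};

/* ECAM: 4K of config space per function, indexed by bus:dev:fn. */
static volatile void *ecam_addr(struct ecam_bridge *b, uint8_t bus,
                                uint8_t devfn, uint16_t reg)
{
    return b->cfg_base + (((uint32_t)(bus - b->start_bus) << 20) |
                          ((uint32_t)devfn << 12) | reg);
}

static uint32_t ecam_read32(struct ecam_bridge *b, uint8_t bus,
                            uint8_t devfn, uint16_t reg)
{
    return *(volatile uint32_t *)ecam_addr(b, bus, devfn, reg);
}

Anything that cannot be expressed like this needs a dedicated driver, 
which is exactly the case-by-case decision mentioned above.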

[...]

>> ## IOMMU
>>
>> The IOMMU will be used to isolate the PCI device when accessing memory
>> (e.g. DMA and MSI doorbells). Often the IOMMU will be configured using
>> a MasterID (aka StreamID for the ARM SMMU) that can be deduced from the
>> SBDF with the help of the firmware tables (see below).
>>
>> Whilst in theory all the memory transactions issued by a PCI device
>> should go through the IOMMU, on certain platforms some of the memory
>> transactions may not reach the IOMMU because they are interpreted by
>> the host bridge. For instance, this could happen if the MSI doorbell is
>> built into the PCI host bridge, or for P2P traffic. See [6] for more
>> details.
>>
>> XXX: I think this could be solved by using direct mapping (e.g. GFN ==
>> MFN); this would mean the guest memory layout would be similar to the
>> host one when PCI devices are passed through => Detail it.
> In the example given in the IORT spec, for PCI devices not behind an SMMU,
> how would the writes from the device be protected?

I realize the XXX paragraph is quite confusing. I am not trying to solve 
the problem where PCI devices are not protected by an SMMU, but rather 
platforms where some transactions (e.g. P2P or MSI doorbell accesses) 
bypass the SMMU.

You may still want to allow PCI passthrough in that case, because you 
know that P2P cannot be done (or can potentially be disabled) and the MSI 
doorbell access is protected (for instance, a write to the ITS doorbell 
will be tagged with the device ID by the hardware). In order to support 
such platforms, you need to direct map the doorbell (e.g. GFN == MFN) and 
carve out the P2P regions from the guest memory map. Hence the suggestion 
to re-use the host memory layout for the guest.

Note that it does not mean the RAM regions will be direct mapped. The 
layout is only re-used to ease carving out the memory regions bypassed 
by the SMMU.
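
To make "direct map the doorbell" a bit more concrete: the stage-2 
mapping for that region would simply use the same frame number on both 
sides. A rough sketch, assuming the Xen 4.9-era map_mmio_regions() 
prototype (which may differ in other releases), with the helper name 
made up:

/*
 * Sketch only: direct-map (GFN == MFN) an MSI doorbell region into a
 * guest's stage-2 page tables, so that SMMU-bypassing writes still hit
 * the real doorbell. Assumes the prototype
 *   int map_mmio_regions(struct domain *d, gfn_t start_gfn,
 *                        unsigned long nr, mfn_t mfn);
 */
#include <xen/sched.h>
#include <xen/p2m-common.h>

static int map_doorbell_direct(struct domain *d, paddr_t doorbell,
                               unsigned long nr_frames)
{
    unsigned long frame = doorbell >> PAGE_SHIFT;

    /* Same frame number on both sides: the guest sees the doorbell at
     * its host physical address. */
    return map_mmio_regions(d, _gfn(frame), nr_frames, _mfn(frame));
}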

[...]

>> ## ACPI
>>
>> ### Host bridges
>>
>> The static table MCFG (see 4.2 in [1]) will describe the host bridges
>> available at boot that support ECAM. Unfortunately, there are platforms
>> out there (see [2]) that re-use MCFG to describe host bridges that are
>> not fully ECAM compatible.
>>
>> This means that Xen needs to account for possible quirks in the host
>> bridge. The Linux community is working on a patch series for this, see
>> [2] and [3], where quirks will be detected with:
>>      * OEM ID
>>      * OEM Table ID
>>      * OEM Revision
>>      * PCI Segment
>>      * PCI bus number range (wildcard allowed)
>>
>> Based on what Linux is currently doing, there are two kinds of quirks:
>>      * Accesses to the configuration space of certain sizes are not
>>        allowed
>>      * A specific driver is necessary for driving the host bridge
>>
>> The former is straightforward to solve but the latter will require
>> more thought. Instantiation of a specific driver for the host
>> controller can be easily done if Xen has the information to detect it.
> So Xen would parse the MCFG to find a host bridge, then map the config
> space into dom0's stage-2, and then provide the same MCFG to dom0?

These are implementation details. I have been really careful so far to 
leave the implementation open, as it does not matter at this stage how 
we are going to implement it in Xen.
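
That said, purely as an illustration of the matching described in the 
quoted text, a quirk table could look something like the sketch below, 
loosely modelled on what the Linux series does. Every name in it is 
hypothetical; nothing here is existing Xen code:

#include <stdint.h>
#include <string.h>

/* Non-ECAM accessors for a quirky bridge (details omitted). */
struct pci_config_ops;

struct mcfg_quirk {
    char oem_id[7];              /* 6 characters + NUL, as in the ACPI header */
    char oem_table_id[9];        /* 8 characters + NUL */
    uint32_t oem_revision;
    uint16_t segment;
    uint8_t start_bus, end_bus;  /* 0x00-0xff covers every bus (wildcard) */
    const struct pci_config_ops *ops;
};

static const struct mcfg_quirk *
mcfg_match_quirk(const struct mcfg_quirk *table, size_t n,
                 const char *oem_id, const char *oem_table_id,
                 uint32_t oem_rev, uint16_t segment, uint8_t bus)
{
    for (size_t i = 0; i < n; i++) {
        const struct mcfg_quirk *q = &table[i];

        /* All OEM fields and the segment must match exactly... */
        if (strncmp(q->oem_id, oem_id, 6) ||
            strncmp(q->oem_table_id, oem_table_id, 8) ||
            q->oem_revision != oem_rev || q->segment != segment)
            continue;
        /* ...and the bus must fall inside the quirk's bus range. */
        if (bus >= q->start_bus && bus <= q->end_bus)
            return q;
    }
    return NULL;
}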

[...]

>> ## Discovering and registering host bridge
>>
>> The approach taken in the document will require communication between
>> Xen and the hardware domain. In this case, they would need to agree on
>> the segment number associated with a host bridge. However, this number
>> is not available in the Device Tree case.
>>
>> The hardware domain will register new host bridges using the existing
>> hypercall PHYSDEVOP_pci_mmcfg_reserved:
>>
>> #define XEN_PCI_MMCFG_RESERVED 1
>>
>> struct physdev_pci_mmcfg_reserved {
>>      /* IN */
>>      uint64_t    address;
>>      uint16_t    segment;
>>      /* Range of bus supported by the host bridge */
>>      uint8_t     start_bus;
>>      uint8_t     end_bus;
>>
>>      uint32_t    flags;
>> }
> So this hypercall is not required for ACPI?

This is not DT specific, as even with ACPI there are platforms that are 
not fully ECAM compliant. As I said above, we will need to decide whether 
we want to support non-ECAM compliant host bridges (i.e. host bridges 
that need a specific driver) in Xen. Likely this will be on a 
case-by-case basis.
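
For reference, the hardware domain side of this registration already 
exists in Linux for x86 (see xen_mcfg_late() in drivers/xen/pci.c), and 
the ARM path would presumably look much the same. A sketch using the 
existing interface names, with error handling omitted:

/*
 * Hardware domain side, modelled on the existing x86 code in Linux
 * (drivers/xen/pci.c, xen_mcfg_late()).
 */
#include <linux/types.h>
#include <xen/interface/physdev.h>
#include <asm/xen/hypercall.h>

static int register_host_bridge(u64 ecam_base, u16 segment,
                                u8 start_bus, u8 end_bus)
{
    struct physdev_pci_mmcfg_reserved r = {
        .address   = ecam_base,
        .segment   = segment,
        .start_bus = start_bus,
        .end_bus   = end_bus,
        .flags     = XEN_PCI_MMCFG_RESERVED,
    };

    return HYPERVISOR_physdev_op(PHYSDEVOP_pci_mmcfg_reserved, &r);
}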

[...]

>> ## Discovering and registering PCI devices
>>
>> The hardware domain will scan the host bridge to find the list of PCI
>> devices available and then report it to Xen using the existing
>> hypercall PHYSDEVOP_pci_device_add:
>>
>> #define XEN_PCI_DEV_EXTFN   0x1
>> #define XEN_PCI_DEV_VIRTFN  0x2
>> #define XEN_PCI_DEV_PXM     0x3
>>
>> struct physdev_pci_device_add {
>>      /* IN */
>>      uint16_t    seg;
>>      uint8_t     bus;
>>      uint8_t     devfn;
>>      uint32_t    flags;
>>      struct {
>>          uint8_t bus;
>>          uint8_t devfn;
>>      } physfn;
>>      /*
>>       * Optional parameters array.
>>       * First element ([0]) is PXM domain associated with the device (if
>>       * XEN_PCI_DEV_PXM is set)
>>       */
>>      uint32_t optarr[0];
>> }
> For mapping the MMIO space of the device in Stage2, we need to add
> support in Xen / via a map hypercall in linux/drivers/xen/pci.c

Mapping MMIO space in stage-2 is not PCI specific and is already 
addressed in Xen 4.9 (see commit 80f9c31 "xen/arm: acpi: Map MMIO on 
fault in stage-2 page table for the hardware domain"). So I don't 
understand why we should care about that here...
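
For completeness, reporting the device itself is already plumbed in 
Linux (xen_add_device() in drivers/xen/pci.c). A stripped-down sketch of 
that call, with PXM and virtual function handling left out:

/*
 * Stripped-down version of what xen_add_device() in Linux
 * drivers/xen/pci.c already does for each enumerated PCI device.
 */
#include <linux/pci.h>
#include <xen/interface/physdev.h>
#include <asm/xen/hypercall.h>

static int report_pci_device(struct pci_dev *pdev)
{
    struct physdev_pci_device_add add = {
        .seg   = pci_domain_nr(pdev->bus),
        .bus   = pdev->bus->number,
        .devfn = pdev->devfn,
    };

    return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_add, &add);
}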

Regards,

-- 
Julien Grall

