From: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
To: Stefano Stabellini <sstabellini@kernel.org>
Cc: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>,
	"Punit Agrawal" <punit.agrawal@arm.com>,
	"Wei Chen" <Wei.Chen@arm.com>,
	"Steve Capper" <Steve.Capper@arm.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Jiandi An" <anjiandi@codeaurora.org>,
	"Julien Grall" <julien.grall@linaro.org>,
	alistair.francis@xilinx.com,
	"Campbell Sean" <scampbel@codeaurora.org>,
	xen-devel <xen-devel@lists.xenproject.org>,
	"manish.jaggi@caviumnetworks.com"
	<manish.jaggi@caviumnetworks.com>,
	"Shanker Donthineni" <shankerd@codeaurora.org>,
	"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [early RFC] ARM PCI Passthrough design document
Date: Fri, 3 Feb 2017 00:44:52 +0100
Message-ID: <20170202234452.GN9606@toto>
In-Reply-To: <alpine.DEB.2.10.1702021507580.17946@sstabellini-ThinkPad-X260>

On Thu, Feb 02, 2017 at 03:12:52PM -0800, Stefano Stabellini wrote:
> On Thu, 2 Feb 2017, Edgar E. Iglesias wrote:
> > On Wed, Feb 01, 2017 at 07:04:43PM +0000, Julien Grall wrote:
> > > Hi Edgar,
> > > 
> > > On 31/01/2017 19:06, Edgar E. Iglesias wrote:
> > > >On Tue, Jan 31, 2017 at 05:09:53PM +0000, Julien Grall wrote:
> > > >>On 31/01/17 16:53, Edgar E. Iglesias wrote:
> > > >>>On Wed, Jan 25, 2017 at 06:53:20PM +0000, Julien Grall wrote:
> > > >>>>On 24/01/17 20:07, Stefano Stabellini wrote:
> > > >>>>>On Tue, 24 Jan 2017, Julien Grall wrote:
> > > > >>>>For a generic host bridge, no initialization is needed. However, some host
> > > > >>>>bridges (e.g. xgene, xilinx) may require specific setup, including
> > > > >>>>configuring clocks. Given that Xen only needs to access the configuration
> > > > >>>>space, I was thinking of letting DOM0 initialize the host bridge. This would
> > > > >>>>avoid importing a lot of code into Xen; however, it means that we need to
> > > > >>>>know when the host bridge has been initialized before accessing the
> > > > >>>>configuration space.
> > > >>>
> > > >>>
> > > >>>Yes, that's correct.
> > > > >>>There's a sequence on the ZynqMP that involves assigning Gigabit Transceivers
> > > >>>to PCI (GTs are shared among PCIe, USB, SATA and the Display Port),
> > > >>>enabling clocks and configuring a few registers to enable ECAM and MSI.
> > > >>>
> > > >>>I'm not sure if this could be done prior to starting Xen. Perhaps.
> > > > >>>If so, bootloaders would have to know ahead of time what devices
> > > >>>the GTs are supposed to be configured for.
> > > >>
> > > >>I've got further questions regarding the Gigabit Transceivers. You mention
> > > >>they are shared, do you mean that multiple devices can use a GT at the same
> > > >>time? Or the software is deciding at startup which device will use a given
> > > >>GT? If so, how does the software make this decision?
> > > >
> > > >Software will decide at startup. AFAIK, the allocation is normally done
> > > >once but I guess that in theory you could design boards that could switch
> > > >at runtime. I'm not sure we need to worry about that use-case though.
> > > >
> > > >The details can be found here:
> > > >https://www.xilinx.com/support/documentation/user_guides/ug1085-zynq-ultrascale-trm.pdf
> > > >
> > > >I suggest looking at pages 672 and 733.
> > > 
> > > Thank you for the documentation. I am trying to understand if we could move
> > > initialization in Xen as suggested by Stefano. I looked at the driver in
> > > > Linux and the code looks simple, without many dependencies. However, I was not
> > > able to find where the Gigabit Transceivers are configured. Do you have any
> > > link to the code for that?
> > 
> > Hi Julien,
> > 
> > I suspect that this setup has previously been done by the initial bootloader
> > auto-generated from design configuration tools.
> > 
> > Now, this is moving into Linux.
> > There's a specific driver that does that but AFAICS, it has not been upstreamed yet.
> > You can see it here:
> > https://github.com/Xilinx/linux-xlnx/blob/master/drivers/phy/phy-zynqmp.c
> > 
> > DTS nodes that need a PHY can then just refer to it, here's an example from SATA:
> > &sata {
> >         phy-names = "sata-phy";
> >         phys = <&lane3 PHY_TYPE_SATA 1 3 150000000>;
> > };
> > 
> > I'll see if I can find working examples for PCIe on the ZCU102. Then I'll share
> > DTS, Kernel etc.
> > 
> > If you are looking for a platform to get started, an option could be if I get you a build of
> > our QEMU that includes models for the PCIe controller, MSI and SMMU connections.
> > These models are friendly wrt. PHY configs and initialization sequences, it will
> > accept pretty much any sequence and still work. This would allow you to focus on
> > architectural issues rather than exact details of init sequences (which we can
> > deal with later).
> > 
> > 
> > 
> > > 
> > > This would also mean that the MSI interrupt controller will be moved in Xen.
> > > Which I think is a more sensible design (see more below).
> > > 
> > > >>
> > > >>>>	- For all other host bridges => I don't know if there are host bridges
> > > >>>>falling under this category. I also don't have any idea how to handle this.
> > > >>>>
> > > >>>>>
> > > >>>>>Otherwise, if Dom0 is the only one to drive the physical host bridge,
> > > >>>>>and Xen is the one to provide the emulated host bridge, how are DomU PCI
> > > >>>>>config reads and writes supposed to work in details?
> > > >>>>
> > > >>>>I think I have answered to this question with my explanation above. Let me
> > > >>>>know if it is not the case.
> > > >>>>
> > > >>>>>How is MSI configuration supposed to work?
> > > >>>>
> > > > >>>>For GICv3 ITS, the MSI will be configured with the eventID (it is unique
> > > >>>>per-device) and the address of the doorbell. The linkage between the LPI and
> > > >>>>"MSI" will be done through the ITS.
> > > >>>>
> > > > >>>>For GICv2m, the MSI will be configured with an SPI (or offset on some
> > > >>>>GICv2m) and the address of the doorbell. Note that for DOM0 SPIs are mapped
> > > >>>>1:1.
> > > >>>>
> > > > >>>>So in both cases, I don't think it is necessary to trap MSI configuration for
> > > > >>>>DOM0. This may not be true if we want to handle other MSI controllers.
> > > >>>>
> > > >>>>I have in mind the xilinx MSI controller (embedded in the host bridge? [4])
> > > >>>>and xgene MSI controller ([5]). But I have no idea how they work and if we
> > > >>>>need to support them. Maybe Edgar could share details on the Xilinx one?
> > > >>>
> > > >>>
> > > >>>The Xilinx controller has 2 dedicated SPIs and pages for MSIs. AFAIK, there's no
> > > > >>>way to protect the MSI doorbells from misconfigured end-points raising malicious EventIDs.
> > > >>>So perhaps trapped config accesses from domUs can help by adding this protection
> > > >>>as drivers configure the device.
> > > >>>
> > > > >>>On Linux, once MSIs hit, the kernel takes the SPI interrupts, reads
> > > >>>out the EventID from a FIFO in the controller and injects a new IRQ into
> > > >>>the kernel.
> > > >>
> > > > >>It might be early to ask, but how do you expect MSI to work with DOMU on
> > > > >>your hardware? Does your MSI controller support virtualization? Or are you
> > > >>looking for a different way to inject MSI?
> > > >
> > > >MSI support in HW is quite limited to support domU and will require SW hacks :-(
> > > >
> > > >Anyway, something along the lines of this might work:
> > > >
> > > >* Trap domU CPU writes to MSI descriptors in config space.
> > > >  Force real MSI descriptors to the address of the door bell area.
> > > >  Force real MSI descriptors to use a specific device unique Event ID allocated by Xen.
> > > >  Remember what EventID domU requested per device and descriptor.
> > > >
> > > >* Xen or Dom0 take the real SPI generated when device writes into the doorbell area.
> > > >  At this point, we can read out the EventID from the MSI FIFO and map it to the one requested from domU.
> > > >  Xen or Dom0 inject the expected EventID into domU
> > > >
> > > >Do you have any good ideas? :-)
> > > 
> > > From my understanding your MSI controller is embedded in the hostbridge,
> > > right? If so, the MSIs would need to be handled where the host bridge will
> > > be initialized (e.g either Xen or DOM0).
> > 
> > Yes, it is.
> > 
> > > 
> > > From a design point of view, it would make more sense to have the MSI
> > > controller driver in Xen as the hostbridge emulation for guest will also
> > > live there.
> > > 
> > > So if we receive MSI in Xen, we need to figure out a way for DOM0 and guest
> > > to receive MSI. The same way would be the best, and I guess non-PV if
> > > possible. I know you are looking to boot unmodified OS in a VM. This would
> > > mean we need to emulate the MSI controller and potentially xilinx PCI
> > > controller. How much are you willing to modify the OS?
> > 
> > Today, we have not yet implemented PCIe drivers for our baremetal SDK. So
> > things are very open and we could design with pretty much anything in mind.
> > 
> > Yes, we could perhaps include a very small model with most registers dummied.
> > Implementing the MSI read FIFO would allow us to:
> > 
> > 1. Inject the MSI doorbell SPI into guests. The guest will then see the same
> >    IRQ as on real HW.
> > 
> > 2. Guest reads host-controller registers (MSI FIFO) to get the signaled MSI.
> > 
> > 
> > 
> > > Regarding the MSI doorbell, I have seen it is configured by the software
> > > using a physical address of a page allocated in the RAM. When a PCI
> > > device writes into the doorbell, does the access go through the SMMU?
> > 
> > That's a good question. On our QEMU model it does, but I'll have to dig a little to see if that is the case on real HW as well.
> > 
> > > Regardless of the answer, I think we would need to map the MSI doorbell page in
> > > the guest. Meaning that even if we trap MSI configuration access, a guest
> > > could DMA into the page. So if I am not mistaken, MSI would be insecure in
> > > this case :/.
> > > 
> > > Or maybe we could avoid mapping the doorbell in the guest and let Xen
> > > receive an SMMU abort. When receiving the SMMU abort, Xen could sanitize the
> > > value and write into the real MSI doorbell. Not sure if it would work
> > > though.
> > 
> > Yeah, this is a problem.
> > I'm not sure if SMMU aborts would work because I don't think we know the value of the data written when we take the abort.
> > Without the data, I'm not sure how we would distinguish between different MSIs from the same device.
> > 
> > Also, even if the MSI doorbell would be protected by the SMMU, all PCI devices are presented with the same AXI Master ID.
> 
> Does that mean that from the SMMU perspective you can only assign them
> all or none?

Unfortunately yes.


> > BTW, this master-ID SMMU limitation is a showstopper for domU guests isn't it?
> > Or do you have ideas around that? Perhaps some PV way to request mappings for DMA?
> 
> No, we don't have anything like that. There are too many device specific
> ways to request DMAs to do that. For devices that cannot be effectively
> protected by IOMMU, (on x86) we support assignment but only in an
> insecure fashion.

OK, I see.

A possible hack could be to allocate a chunk of DDR dedicated for PCI DMA.
PCI DMA devs could be locked in to only be able to access this mem + MSI doorbell.
Guests can still screw each other up but at least it becomes harder to read/write directly from each other's OS memory.
It may not be worth the effort though....
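
To make the idea a bit more concrete, here is a minimal sketch in C of the policy it implies: a PCI DMA access is acceptable only if it falls entirely inside the dedicated DDR chunk or inside the MSI doorbell page. All names here (struct addr_range, struct pci_dma_policy, pci_dma_allowed) are hypothetical and do not come from Xen or Linux. On real hardware the restriction would presumably be enforced by the SMMU mappings set up for the shared master ID rather than by a software check, so this only models which ranges would be mapped.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one physical address range. */
struct addr_range {
    uint64_t base;
    uint64_t size;
};

/*
 * Hypothetical policy for all PCI devices behind the shared master ID:
 * one DDR chunk reserved for PCI DMA plus the MSI doorbell page.
 */
struct pci_dma_policy {
    struct addr_range dma_pool;
    struct addr_range msi_doorbell;
};

/* True if [addr, addr + len) lies entirely inside r, without overflowing. */
static bool range_contains(const struct addr_range *r,
                           uint64_t addr, uint64_t len)
{
    uint64_t off;

    if (len == 0 || addr < r->base)
        return false;

    off = addr - r->base;
    return off < r->size && len <= r->size - off;
}

/* True if a PCI DMA access is confined to the pool or the doorbell page. */
static bool pci_dma_allowed(const struct pci_dma_policy *p,
                            uint64_t addr, uint64_t len)
{
    return range_contains(&p->dma_pool, addr, len) ||
           range_contains(&p->msi_doorbell, addr, len);
}
```

Since every PCI device shares the same policy, this does nothing to isolate guests' DMA buffers from each other inside the pool. It only keeps PCI DMA away from the rest of guest and hypervisor memory.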

Cheers,
Edgar