From: Chao Gao <chao.gao@intel.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Lan Tianyu <tianyu.lan@intel.com>,
Kevin Tian <kevin.tian@intel.com>,
Stefano Stabellini <sstabellini@kernel.org>,
Wei Liu <wei.liu2@citrix.com>,
George Dunlap <george.dunlap@eu.citrix.com>,
Ian Jackson <ian.jackson@eu.citrix.com>, Tim Deegan <tim@xen.org>,
xen-devel@lists.xen.org, Jan Beulich <jbeulich@suse.com>,
Andrew Cooper <andrew.cooper3@citrix.com>
Subject: Re: [PATCH v4 01/28] Xen/doc: Add Xen virtual IOMMU doc
Date: Fri, 9 Feb 2018 23:53:59 +0800
Message-ID: <20180209155358.GA30322@skl-4s-chao.sh.intel.com>
In-Reply-To: <20180209125411.xpi6unodokr2o72e@MacBook-Pro-de-Roger.local>
On Fri, Feb 09, 2018 at 12:54:11PM +0000, Roger Pau Monné wrote:
>On Fri, Nov 17, 2017 at 02:22:08PM +0800, Chao Gao wrote:
>> From: Lan Tianyu <tianyu.lan@intel.com>
>>
>> This patch is to add Xen virtual IOMMU doc to introduce motivation,
>> framework, vIOMMU hypercall and xl configuration.
>>
>> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>> docs/misc/viommu.txt | 120 +++++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 120 insertions(+)
>> create mode 100644 docs/misc/viommu.txt
>>
>> diff --git a/docs/misc/viommu.txt b/docs/misc/viommu.txt
>> new file mode 100644
>> index 0000000..472d2b5
>> --- /dev/null
>> +++ b/docs/misc/viommu.txt
>> @@ -0,0 +1,120 @@
>> +Xen virtual IOMMU
>> +
>> +Motivation
>> +==========
>> +Enable more than 128 vcpu support
>> +
>> +HPC cloud services currently require VMs with a high
>> +number of CPUs in order to achieve high performance in parallel
>> +computing.
>> +
>> +To support >128 vcpus, x2APIC mode in the guest is necessary because legacy
>> +APIC (xAPIC) only supports 8-bit APIC IDs. The APIC ID used by Xen is
>> +CPU ID * 2 (ie: CPU 127 has APIC ID 254, which is the last one available
>> +in xAPIC mode) and so it can only support 128 vcpus at most. x2APIC mode
>> +supports 32-bit APIC IDs and it requires the interrupt remapping functionality
>> +of a vIOMMU if the guest wishes to route interrupts to all available vCPUs.
>> +
>> +PCI MSI/IOAPIC can only send interrupt messages containing an 8-bit APIC ID,
>> +which cannot address CPUs with an APIC ID > 254. Interrupt remapping supports
>> +32-bit APIC IDs and so it's necessary for >128 vcpus support.
>> +
>> +vIOMMU Architecture
>> +===================
>> +The vIOMMU device model lives inside the Xen hypervisor for the following reasons:
>> + 1) Avoid round trips between Qemu and Xen hypervisor
>> + 2) Ease of integration with the rest of hypervisor
>> + 3) PVH doesn't use Qemu
>> +
>> +* Interrupt remapping overview.
>> +Interrupts from virtual devices and physical devices are delivered
>> +to the vLAPIC from the vIOAPIC and vMSI. The vIOMMU needs to remap
>> +interrupts during this procedure.
>> +
>> ++---------------------------------------------------+
>> +|Qemu |VM |
>> +| | +----------------+ |
>> +| | | Device driver | |
>> +| | +--------+-------+ |
>> +| | ^ |
>> +| +----------------+ | +--------+-------+ |
>> +| | Virtual device | | | IRQ subsystem | |
>> +| +-------+--------+ | +--------+-------+ |
>> +| | | ^ |
>> +| | | | |
>> ++---------------------------+-----------------------+
>> +|hypervisor | | VIRQ |
>> +| | +---------+--------+ |
>> +| | | vLAPIC | |
>> +| |VIRQ +---------+--------+ |
>> +| | ^ |
>> +| | | |
>> +| | +---------+--------+ |
>> +| | | vIOMMU | |
>> +| | +---------+--------+ |
>> +| | ^ |
>> +| | | |
>> +| | +---------+--------+ |
>> +| | | vIOAPIC/vMSI | |
>> +| | +----+----+--------+ |
>> +| | ^ ^ |
>> +| +-----------------+ | |
>> +| | |
>> ++---------------------------------------------------+
>> +HW |IRQ
>> + +-------------------+
>> + | PCI Device |
>> + +-------------------+
>> +
>> +
>> +vIOMMU hypercall
>> +================
>> +Introduce a new domctl hypercall "xen_domctl_viommu_op" to create
>> +vIOMMU instances in the hypervisor. A vIOMMU instance is destroyed
>> +when its domain is destroyed.
>> +
>> +* vIOMMU hypercall parameter structure
>> +
>> +/* vIOMMU type - specify vendor vIOMMU device model */
>> +#define VIOMMU_TYPE_INTEL_VTD 0
>> +
>> +/* vIOMMU capabilities */
>> +#define VIOMMU_CAP_IRQ_REMAPPING (1u << 0)
>> +
>> +struct xen_domctl_viommu_op {
>> + uint32_t cmd;
>> +#define XEN_DOMCTL_viommu_create 0
>> + union {
>> + struct {
>> + /* IN - vIOMMU type */
>> + uint8_t type;
>> + /* IN - MMIO base address of vIOMMU. */
>> + uint64_t base_address;
>> + /* IN - Capabilities with which we want to create */
>> + uint64_t capabilities;
>> + /* OUT - vIOMMU identity */
>> + uint32_t id;
>> + } create;
>> + } u;
>> +};
>> +
>> +- XEN_DOMCTL_viommu_create
>> + Create a vIOMMU device with the given type, capabilities and MMIO base address.
>> +The hypervisor allocates a viommu_id for the new vIOMMU instance and returns it.
>> +The vIOMMU device model in the hypervisor should check whether it can
>> +support the requested capabilities and return an error if not.
>> +
>> +The vIOMMU domctl and the vIOMMU option in the configuration file allow for
>> +multiple vIOMMUs per VM (e.g. the create parameters include a vIOMMU id),
>> +but the current implementation only supports one vIOMMU per VM.
>> +
>> +xl x86 vIOMMU configuration
>> +============================
>> +viommu = [
>> + 'type=intel_vtd,intremap=1',
>> + ...
>> +]
>> +
>> +"type" - Specify the vIOMMU device model type. Currently only the Intel VT-d
>> +device model is supported.
>
>Although I see the point in being able to specify the vIOMMU type, is
>this really helpful from an admin PoV?
>
>What would happen for example if you try to add an Intel vIOMMU to a
>guest running on an AMD CPU? I guess the guest OSes would be quite
>surprised about that...
>
>I think the most common way to use this option would be:
>
>viommu = [
> 'intremap=1',
> ...
>]
Agreed.
>
>And vIOMMUs should automatically be added to guests with > 128 vCPUs?
>IIRC Linux requires a vIOMMU in order to run with > 128 vCPUs (which
>is quite arbitrary, but anyway...).
I think Linux will only use 128 CPUs in this case on bare metal.
Since a sane VM configuration shouldn't have > 128 vCPUs but no vIOMMU,
adding vIOMMUs automatically when needed is fine with me.
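
To make the threshold concrete, here is a rough sketch of the check the
toolstack could do before deciding to add a vIOMMU on its own. It only
encodes the arithmetic from the doc (APIC ID = vCPU ID * 2, highest usable
xAPIC ID is 254). The names XAPIC_MAX_APIC_ID and guest_needs_viommu() are
made up for illustration and are not existing Xen or libxl code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Highest APIC ID usable in xAPIC mode, per the doc text (assumed name). */
#define XAPIC_MAX_APIC_ID 254u

/* Xen assigns APIC ID = vCPU ID * 2, as stated in the Motivation section. */
static uint32_t vcpu_to_apic_id(uint32_t vcpu_id)
{
    return vcpu_id * 2;
}

/*
 * Hypothetical helper: returns true when the highest vCPU of a guest with
 * nr_vcpus vCPUs gets an APIC ID that no longer fits in xAPIC mode, i.e.
 * when interrupt remapping (and hence a vIOMMU) would be needed.
 */
static bool guest_needs_viommu(uint32_t nr_vcpus)
{
    if (nr_vcpus == 0)
        return false;
    return vcpu_to_apic_id(nr_vcpus - 1) > XAPIC_MAX_APIC_ID;
}
```

With this, a 128-vCPU guest stays below the limit (vCPU 127 has APIC ID 254)
and a 129-vCPU guest is the first one that would get a vIOMMU automatically.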
Thanks
Chao