kvm.vger.kernel.org archive mirror
From: Don Dutile <ddutile-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>,
	Auger Eric <eric.auger-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: drjones-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	christoffer.dall-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org,
	jason-NLaQJdtUoK4Be96aLqz0jA@public.gmane.org,
	kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	marc.zyngier-5wv7dgnIgG8@public.gmane.org,
	benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org,
	punit.agrawal-5wv7dgnIgG8@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	pranav.sawargaonkar-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	arnd-r2nGTMty4D4@public.gmane.org,
	dwmw-vV1OtcyAfmbQXOPxS62xeg@public.gmane.org,
	jcm-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
	eric.auger.pro-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
Subject: Re: Summary of LPC guest MSI discussion in Santa Fe
Date: Tue, 8 Nov 2016 14:02:39 -0500
Message-ID: <5822214F.2070500@redhat.com>
In-Reply-To: <20161108175457.GK20591-5wv7dgnIgG8@public.gmane.org>

On 11/08/2016 12:54 PM, Will Deacon wrote:
> On Tue, Nov 08, 2016 at 03:27:23PM +0100, Auger Eric wrote:
>> On 08/11/2016 03:45, Will Deacon wrote:
>>> Rather than treat these as separate problems, a better interface is to
>>> tell userspace about a set of reserved regions, and have this include
>>> the MSI doorbell, irrespective of whether or not it can be remapped.
>>> Don suggested that we statically pick an address for the doorbell in a
>>> similar way to x86, and have the kernel map it there. We could even pick
>>> 0xfee00000. If it conflicts with a reserved region on the platform (due
>>> to (4)), then we'd obviously have to (deterministically?) allocate it
>>> somewhere else, but probably within the bottom 4G.
>> This is tentatively achieved now with
>> [1] [RFC v2 0/8] KVM PCIe/MSI passthrough on ARM/ARM64 - Alt II
>> (http://www.mail-archive.com/linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org/msg1264506.html)
> Yup, I saw that fly by. Hopefully some of the internals can be reused
> with the current thinking on user ABI.
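
A minimal sketch of the doorbell placement described above: try the x86-style address first, and fall back deterministically if it collides with a reserved region. The function names, the 1 MiB window size, and the top-down scan order are illustrative assumptions; the thread only says the fallback should be deterministic and probably below 4G.

```python
DEFAULT_DOORBELL_BASE = 0xFEE00000  # address floated in the discussion
FOUR_GIB = 1 << 32


def overlaps(a_start, a_size, b_start, b_size):
    """True if [a_start, a_start + a_size) intersects [b_start, b_start + b_size)."""
    return a_start < b_start + b_size and b_start < a_start + a_size


def pick_doorbell_base(reserved, size=0x100000, align=0x100000):
    """Pick an IOVA window for the MSI doorbell.

    reserved: list of (start, size) tuples for fixed reserved regions.
    size/align: assumed window size and alignment (hypothetical values).
    Returns the chosen base, or None if nothing below 4G is free.
    """
    def is_free(base):
        return not any(overlaps(base, size, s, n) for s, n in reserved)

    if is_free(DEFAULT_DOORBELL_BASE):
        return DEFAULT_DOORBELL_BASE

    # Assumed fallback policy: scan aligned slots downward from 4G, so the
    # same set of reserved regions always yields the same answer.
    base = FOUR_GIB - size
    base -= base % align
    while base >= 0:
        if is_free(base):
            return base
        base -= align
    return None
```

With no reserved regions this returns the default base; with a reserved region covering the default, the result depends only on the region list, which is the deterministic property asked for.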
>
>>> The next question is how to tell userspace about all of the reserved
>>> regions. Initially, the idea was to extend VFIO, however Alex pointed
>>> out a horrible scenario:
>>>
>>>    1. QEMU spawns a VM on system 0
>>>    2. VM is migrated to system 1
>>>    3. QEMU attempts to passthrough a device using PCI hotplug
>>>
>>> In this scenario, the guest memory map is chosen at step (1), yet there
>>> is no VFIO fd available to determine the reserved regions. Furthermore,
>>> the reserved regions may vary between system 0 and system 1. This pretty
>>> much rules out using VFIO to determine the reserved regions. Alex suggested
>>> that the SMMU driver can advertise the regions via /sys/class/iommu/. This
>>> would solve part of the problem, but migration between systems with
>>> different memory maps can still cause problems if the reserved regions
>>> of the new system conflict with the guest memory map chosen by QEMU.
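
The migration problem above can be sketched as a plain interval check that userspace would run against each host's reserved regions. All addresses below are made-up values for illustration, not taken from any real platform.

```python
def find_conflicts(guest_ram, reserved):
    """Return every (guest_range, reserved_range) pair that overlaps.

    guest_ram: list of (start, size) guest-physical RAM ranges.
    reserved:  list of (start, size) host reserved IOVA ranges.
    """
    hits = []
    for g_start, g_size in guest_ram:
        for r_start, r_size in reserved:
            if g_start < r_start + r_size and r_start < g_start + g_size:
                hits.append(((g_start, g_size), (r_start, r_size)))
    return hits


# Hypothetical guest memory map chosen at step (1): 1 GiB of RAM at 1 GiB.
guest_ram = [(0x40000000, 0x40000000)]

# Hypothetical reserved regions on the two hosts.
system0_reserved = [(0xFEE00000, 0x100000)]
system1_reserved = [(0x60000000, 0x10000000)]

conflicts_on_system0 = find_conflicts(guest_ram, system0_reserved)
conflicts_on_system1 = find_conflicts(guest_ram, system1_reserved)
```

In this made-up case the map is clean on system 0 but overlaps a reserved range on system 1, which is the situation that checking only at VM creation time cannot catch.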
>>
>> OK, so I understand we no longer want the VFIO capability chain API
>> (patch 5 of the above series) and prefer a sysfs approach instead.
> Right.
>
>> I understand the sysfs approach, which allows userspace to get the
>> info earlier and independently of VFIO. Keep in mind that current QEMU
>> virt - which is not the only userspace - will not do much with this info
>> until we overhaul virt address space management. So if I am not wrong,
>> at the moment the main action to be undertaken is the rejection of the
>> PCI hotplug in case we detect a collision.
> I don't think so; it should be up to userspace to reject the hotplug.
> If userspace doesn't have support for the regions, then that's fine --
> you just end up in a situation where the CPU page table maps memory
> somewhere that the device can't see. In other words, you'll end up with
> spurious DMA failures, but that's exactly what happens with current systems
> if you passthrough an overlapping region (Robin demonstrated this on Juno).
>
> Additionally, you can imagine some future support where you can tell the
> guest not to use certain regions of its memory for DMA. In this case, you
> wouldn't want to refuse the hotplug in the case of overlapping regions.
>
> Really, I think the kernel side just needs to enumerate the fixed reserved
> regions, place the doorbell at a fixed address and then advertise these
> via sysfs.
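
As a sketch of what consuming such a sysfs entry might look like from userspace, the parser below assumes one region per line in the form "<start> <end> <type>", with hexadecimal inclusive bounds. That format, the type names, and the sample text are assumptions for illustration; the thread does not settle on a format.

```python
def parse_reserved_regions(text):
    """Parse an assumed '<start> <end> <type>' listing into tuples.

    Returns a list of (start, end, kind) with integer inclusive bounds.
    Blank lines are skipped; malformed lines raise ValueError.
    """
    regions = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        start, end, kind = line.split()
        regions.append((int(start, 16), int(end, 16), kind))
    return regions


# Hypothetical file contents: one fixed reserved window and the MSI doorbell.
sample = """\
0x0000000008000000 0x00000000080fffff reserved
0x00000000fee00000 0x00000000feefffff msi
"""

regions = parse_reserved_regions(sample)
```

Keeping the same line format across SMMU drivers would let one parser like this serve every userspace consumer.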
>
>> I can respin [1]
>> - studying and taking into account Robin's comments about dm_regions
>> similarities
>> - removing the VFIO capability chain and replacing this by a sysfs API
> Ideally, this would be reusable between different SMMU drivers so the sysfs
> entries have the same format etc.
>
>> Would that be OK?
> Sounds good to me. Are you in a position to prototype something on the qemu
> side once we've got kernel-side agreement?
>
>> What about Alex's comment that he wanted to report the usable memory
>> ranges instead of the unusable ones?
>>
>> Also did you have a chance to discuss the following items:
>> 1) the VFIO irq safety assessment
> The discussion really focussed on system topology, as opposed to properties
> of the doorbell. Regardless of how the device talks to the doorbell, if
> the doorbell can't protect against things like MSI spoofing, then it's
> unsafe. My opinion is that we shouldn't allow passthrough by default on
> systems with unsafe doorbells (we could piggyback on allow_unsafe_interrupts
> cmdline option to VFIO).
>
> A first step would be making all this opt-in, and only supporting GICv3
> ITS for now.
Are you trying to support a config that is older than GICv3 and has no ITS?
That would be the equivalent of x86 before interrupt remapping, and that is
why the allow_unsafe_interrupts hook was created: to enable development and
kick-the-tires testing.
>> 2) the MSI reserved size computation (is an arbitrary size OK?)
> If we fix the base address, we could fix a size too. However, we'd still
> need to enumerate the doorbells to check that they fit in the region we
> have. If not, then we can warn during boot and treat it the same way as
> a resource conflict (that is, reallocate the region in some deterministic
> way).
>
> Will
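
The size question above (fix a window, then check at boot that the doorbells fit) can be sketched as follows. It assumes each distinct doorbell page needs one IOMMU page of IOVA space; the function name and the example addresses are hypothetical.

```python
def doorbells_fit(doorbells, page_size, window_size):
    """Check whether all doorbells can be mapped inside a fixed window.

    doorbells:   physical addresses of the MSI doorbell registers.
    page_size:   IOMMU page granule in bytes.
    window_size: size of the fixed reserved MSI window in bytes.
    Doorbells sharing a physical page share one mapping.
    """
    pages = {addr // page_size for addr in doorbells}
    return len(pages) * page_size <= window_size


# Two hypothetical doorbells on separate 4 KiB pages, 1 MiB window.
fits_small = doorbells_fit([0x08080000, 0x08090000], 0x1000, 0x100000)

# Three hypothetical doorbells with a 64 KiB granule and a 128 KiB window.
fits_large = doorbells_fit([0x08080000, 0x08090000, 0x080A0000],
                           0x10000, 0x20000)
```

A False result is the case where the kernel would warn during boot and reallocate the region, as described above.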
