From: David Hildenbrand <david@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Le Tan <tamlokveer@gmail.com>,
	Pankaj Gupta <pankaj.gupta.linux@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	wei.huang2@amd.com, qemu-devel@nongnu.org,
	Luiz Capitulino <lcapitulino@redhat.com>,
	Auger Eric <eric.auger@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Wei Yang <richardw.yang@linux.intel.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions
Date: Wed, 18 Nov 2020 18:37:42 +0100
Message-ID: <1f1602db-748a-4e9d-dfde-950b573592fb@redhat.com>
In-Reply-To: <20201118170143.GC29639@xz-x1>

>> virtio-mem + vfio + iommu seems to work. More testing to be done.
>>
>> However, malicious guests can play nasty tricks like
>>
>> a) Unplugging plugged virtio-mem blocks while they are mapped via an
>>     IOMMU
>>
>> 1. Guest: map memory location X located on a virtio-mem device inside a
>>     plugged block into the IOMMU
>>     -> QEMU IOMMU notifier: create vfio DMA mapping
>>     -> VFIO pins memory of the plugged block (populating memory)
>> 2. Guest: Request to unplug memory location X via virtio-mem device
>>     -> QEMU virtio-mem: discards the memory.
>>     -> VFIO still has the memory pinned
> 
> When unplugging some memory, does the user need to first do something to
> notify the guest kernel that "this memory is going to be unplugged soon"
> (e.g., echo "offline" to some sysfs file)?  Then the kernel should be
> responsible for preparing for that before it really happens, e.g., by
> migrating anonymous pages out of this memory block.  I don't know what
> would happen if some pages on the memory block were used for DMA like this
> and we want to unplug it.  Ideally, I'd think it should fail the "echo
> offline" operation with something like EBUSY if it can't notify the device
> driver about this, or if that's hard to do.

In the very simple case (without resizable RAMBlocks/allocations/memory 
regions) as implemented right now, a virtio-mem device really just 
consists of a static RAM memory region that's mapped into guest physical 
address space. The size of that region corresponds to the "maximum" size 
the virtio-mem device can provide.

How much memory the VM should consume via such a device is expressed via 
a "requested size". When the hypervisor wants the VM to consume less/more 
memory, it adjusts the "requested size".
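
For illustration only - a hypothetical invocation could look like this 
(property names as I recall them for the virtio-mem-pci device; the 
exact syntax may differ):

  qemu-system-x86_64 ... \
    -m 4G,maxmem=20G \
    -object memory-backend-ram,id=mem0,size=16G \
    -device virtio-mem-pci,id=vmem0,memdev=mem0,requested-size=8G

The hypervisor could then adjust the requested size at runtime (e.g., 
via qom-set on vmem0's "requested-size" property) to make the guest 
plug or unplug device blocks.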

It is up to the device driver in the guest to plug/unplug memory blocks 
(e.g., at 4 MiB granularity) in order to reach the requested size. The 
device driver selects memory blocks within the device-assigned memory 
region and requests the hypervisor to (un)plug them - think of it as 
something similar (but not identical) to memory ballooning.

When requested by the hypervisor to unplug memory, the device driver in 
Linux will try to find memory blocks (e.g., 4 MiB) within the 
device-assigned memory region that it can free up. This involves 
migrating pages away etc. Once that has succeeded - meaning nobody in 
the guest is using that memory anymore - the guest requests the 
hypervisor to unplug that block, resulting in QEMU discarding the 
memory. The guest agrees not to touch that memory again before 
officially requesting to "plug" it again via the virtio-mem device.
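
As a rough sketch of that guest-side flow (illustrative C only; all 
names below are hypothetical and don't claim to match the real 
drivers/virtio/virtio_mem.c):

  /* Hypothetical helpers; not the real virtio_mem driver API. */
  struct virtio_mem;
  int guest_offline_and_isolate(struct virtio_mem *vm, unsigned long mb_id);
  int send_unplug_request(struct virtio_mem *vm, unsigned long mb_id);
  void guest_reonline(struct virtio_mem *vm, unsigned long mb_id);

  static int virtio_mem_try_unplug_block(struct virtio_mem *vm,
                                         unsigned long mb_id)
  {
      int rc;

      /*
       * Free the block up first (migrate pages away etc.).  If some
       * page cannot be freed, fail (e.g., -EBUSY) so the driver can
       * try a different block instead.
       */
      rc = guest_offline_and_isolate(vm, mb_id);
      if (rc)
          return rc;

      /*
       * Nobody in the guest uses the block anymore: request the
       * hypervisor to unplug it, which makes QEMU discard the
       * backing memory.
       */
      rc = send_unplug_request(vm, mb_id);
      if (rc)
          guest_reonline(vm, mb_id); /* roll back on failure */
      return rc;
  }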

There is no further action inside the guest required. A sane guest will 
never request to unplug memory that is still in use (similar to memory 
ballooning, where we don't inflate memory that is still in use).

But of course, a malicious guest could try doing that just to cause 
trouble.

> 
> IMHO this question is not really related to the vIOMMU, but a general
> question for unplugging. Say, what would happen if we unplug some memory
> with DMA buffers without a vIOMMU at all?  The buffer would be invalid
> right after unplugging, so the guest kernel should either fail the
> operation trying to unplug, or at least tell the device drivers about
> this somehow?

A sane guest will never do that. The way memory is removed from Linux 
makes sure that there are no remaining users; otherwise, it would be a BUG.

> 
>>
>> We consume more memory than intended. In case the virtio-mem memory
>> would get replugged and used, we would have an inconsistency. IOMMU
>> device resets fix it (whereby all VFIO mappings are removed via the
>> IOMMU notifier).
>>
>>
>> b) Mapping unplugged virtio-mem blocks via an IOMMU
>>
>> 1. Guest: map memory location X located on a virtio-mem device inside an
>>     unplugged block
>>     -> QEMU IOMMU notifier: create vfio DMA mapping
>>     -> VFIO pins memory of unplugged blocks (populating memory)
> 
> For this case, I would expect vfio_get_xlat_addr() to fail directly if the
> guest driver forces a mapping of some IOVA onto an invalid range of the
> virtio-mem device.  Even before that, since the guest should know that this
> region of virtio-mem is not valid (it's unplugged), shouldn't the guest
> kernel directly fail the dma_map() on such a region, even before the
> mapping request reaches QEMU?

Again, sane guests will never do that, for the very reason you mentioned: 
"the guest should know that this region of virtio-mem is not valid since 
unplugged". But a malicious guest could try doing that to cause trouble :)

The memory region managed by a virtio-mem device is always fully mapped 
into the system address space: one reason being that fragmenting it at 
2 MiB granularity or similar would not be feasible (e.g., the KVM memory 
slot limit, migration, ...), but there are other reasons. (Again, this is 
similar to how memory ballooning works.)

vfio_get_xlat_addr() only checks whether such a mapping exists. It would 
be easy to additionally ask the virtio-mem device (similar to what is 
done in this prototype) whether that part of the identified memory region 
may be mapped by VFIO right now - see the sketch below.
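
Something like the following could work (all names are hypothetical, 
loosely modeled on the SparseRAMHandler interface from this prototype; 
a sketch, not the actual implementation):

  /*
   * Before creating a VFIO DMA mapping, ask the memory region's
   * sparse RAM handler (e.g., virtio-mem) whether the target range
   * is currently plugged.
   */
  static bool vfio_range_is_plugged(MemoryRegion *mr, hwaddr offset,
                                    hwaddr size)
  {
      SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);

      if (!srh) {
          return true; /* ordinary RAM is always "plugged" */
      }
      return sparse_ram_handler_is_plugged(srh, mr, offset, size);
  }

vfio_get_xlat_addr() could then fail the translation when this returns 
false, so a malicious guest mapping an unplugged block would simply get 
an error instead of VFIO pinning discarded memory.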

-- 
Thanks,

David / dhildenb


