From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
"Zhao, Yan Y" <yan.y.zhao@intel.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"intel-gvt-dev@lists.freedesktop.org"
<intel-gvt-dev@lists.freedesktop.org>,
"Zhengxiao.zx@Alibaba-inc.com" <Zhengxiao.zx@alibaba-inc.com>,
"Liu, Yi L" <yi.l.liu@intel.com>,
"eskultet@redhat.com" <eskultet@redhat.com>,
"Yang, Ziye" <ziye.yang@intel.com>,
"cohuck@redhat.com" <cohuck@redhat.com>,
"shuangtai.tst@alibaba-inc.com" <shuangtai.tst@alibaba-inc.com>,
"Wang, Zhi A" <zhi.a.wang@intel.com>,
"mlevitsk@redhat.com" <mlevitsk@redhat.com>,
"pasic@linux.ibm.com" <pasic@linux.ibm.com>,
"aik@ozlabs.ru" <aik@ozlabs.ru>,
"eauger@redhat.com" <eauger@redhat.com>,
"felipe@nutanix.com" <felipe@nutanix.com>,
"jonathan.davies@nutanix.com" <jonathan.davies@nutanix.com>,
"Liu, Changpeng" <changpeng.liu@intel.com>,
"Ken.Xue@amd.com" <Ken.Xue@amd.com>,
"kwankhede@nvidia.com" <kwankhede@nvidia.com>,
"cjia@nvidia.com" <cjia@nvidia.com>,
"arei.gonglei@huawei.com" <arei.gonglei@huawei.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: [Qemu-devel] [PATCH 0/5] QEMU VFIO live migration
Date: Fri, 8 Mar 2019 16:21:46 +0000 [thread overview]
Message-ID: <20190308162146.GD2834@work-vm> (raw)
In-Reply-To: <20190308091133.3073f5db@x1.home>
* Alex Williamson (alex.williamson@redhat.com) wrote:
> On Thu, 7 Mar 2019 23:20:36 +0000
> "Tian, Kevin" <kevin.tian@intel.com> wrote:
>
> > > From: Alex Williamson [mailto:alex.williamson@redhat.com]
> > > Sent: Friday, March 8, 2019 1:44 AM
> > > > >
> > > > > > This kind of data needs to be saved / loaded in pre-copy and
> > > > > > stop-and-copy phase.
> > > > > > The data of device memory is held in device memory region.
> > > > > > Size of devie memory is usually larger than that of device
> > > > > > memory region. qemu needs to save/load it in chunks of size of
> > > > > > device memory region.
> > > > > > Not all device has device memory. Like IGD only uses system
> > > memory.
> > > > >
> > > > > It seems a little gratuitous to me that this is a separate region or
> > > > > that this data is handled separately. All of this data is opaque to
> > > > > QEMU, so why do we need to separate it?
> > > > hi Alex,
> > > > as the device state interfaces are provided by kernel, it is expected to
> > > > meet as general needs as possible. So, do you think there are such use
> > > > cases from user space that user space knows well of the device, and
> > > > it wants kernel to return desired data back to it.
> > > > E.g. It just wants to get whole device config data including all mmios,
> > > > page tables, pci config data...
> > > > or, It just wants to get current device memory snapshot, not including any
> > > > dirty data.
> > > > Or, It just needs the dirty pages in device memory or system memory.
> > > > With all this accurate query, quite a lot of useful features can be
> > > > developped in user space.
> > > >
> > > > If all of this data is opaque to user app, seems the only use case is
> > > > for live migration.
> > >
> > > I can certainly appreciate a more versatile interface, but I think
> > > we're also trying to create the most simple interface we can, with the
> > > primary target being live migration. As soon as we start defining this
> > > type of device memory and that type of device memory, we're going to
> > > have another device come along that needs yet another because they have
> > > a slightly different requirement. Even without that, we're going to
> > > have vendor drivers implement it differently, so what works for one
> > > device for a more targeted approach may not work for all devices. Can
> > > you enumerate some specific examples of the use cases you imagine your
> > > design to enable?
> > >
> >
> > Do we want to consider an use case where user space would like to
> > selectively introspect a portion of the device state (including implicit
> > state which are not available through PCI regions), and may ask for
> > capability of direct mapping of selected portion for scanning (e.g.
> > device memory) instead of always turning on dirty logging on all
> > device state?
>
> I don't see that a migration interface necessarily lends itself to this
> use case. A migration data stream has no requirement to be user
> consumable as anything other than opaque data, there's also no
> requirement that it expose state in a form that directly represents the
> internal state of the device. In fact I'm not sure we want to encourage
> introspection via this data stream. If a user knows how to interpret
> the data, what prevents them from modifying the data in-flight? I've
> raised the question previously regarding how the vendor driver can
> validate the integrity of the migration data stream. Using the
> migration interface to introspect the device certainly suggests an
> interface ripe for exploiting any potential weakness in the vendor
> driver reassembling that migration stream. If the user has an mmap to
> the actual live working state of the vendor driver, protection in the
> hardware seems like the only way you could protect against a malicious
> user. Please be defensive in what is directly exposed to the user and
> what safeguards are in place within the vendor driver for validating
> incoming data. Thanks,
Hmm; that sounds like a security-by-obscurity answer!
The scripts/analyze-migration.py scripts will actually dump the
migration stream data in an almost readable format.
So if you properly define the VMState definitions it should be almost
readable; it's occasionally been useful.
I agree that you should be very very careful to validate the incoming
migration stream against:
a) Corruption
b) Wrong driver versions
c) Malicious intent
c.1) Especially by the guest
c.2) Or by someone trying to feed you a duff stream
d) Someone trying load the VFIO stream into completely the wrong
device.
Whether the migration interface is the right thing to use for that
inspection hmm; well it might be - if you're trying to debug
your device and need a dump of it's state, then why not?
(I guess you end up with something not dissimilar to what things
like intek_reg_snapshot in intel-gpu-tools does).
Dave
> Alex
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
next prev parent reply other threads:[~2019-03-08 16:21 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-02-19 8:50 [Qemu-devel] [PATCH 0/5] QEMU VFIO live migration Yan Zhao
2019-02-19 8:52 ` [Qemu-devel] [PATCH 1/5] vfio/migration: define kernel interfaces Yan Zhao
2019-02-19 13:09 ` Cornelia Huck
2019-02-20 7:36 ` Zhao Yan
2019-02-20 17:08 ` Cornelia Huck
2019-02-21 1:47 ` Zhao Yan
2019-02-19 8:52 ` [Qemu-devel] [PATCH 2/5] vfio/migration: support device of device config capability Yan Zhao
2019-02-19 11:01 ` Dr. David Alan Gilbert
2019-02-20 5:12 ` Zhao Yan
2019-02-20 10:57 ` Dr. David Alan Gilbert
2019-02-19 14:37 ` Cornelia Huck
2019-02-20 22:54 ` Zhao Yan
2019-02-21 10:56 ` Cornelia Huck
2019-02-19 8:52 ` [Qemu-devel] [PATCH 3/5] vfio/migration: tracking of dirty page in system memory Yan Zhao
2019-02-19 8:52 ` [Qemu-devel] [PATCH 4/5] vfio/migration: turn on migration Yan Zhao
2019-02-19 8:53 ` [Qemu-devel] [PATCH 5/5] vfio/migration: support device memory capability Yan Zhao
2019-02-19 11:25 ` Dr. David Alan Gilbert
2019-02-20 5:17 ` Zhao Yan
2019-02-19 14:42 ` Christophe de Dinechin
2019-02-20 7:58 ` Zhao Yan
2019-02-20 10:14 ` Christophe de Dinechin
2019-02-21 0:07 ` Zhao Yan
2019-02-19 11:32 ` [Qemu-devel] [PATCH 0/5] QEMU VFIO live migration Dr. David Alan Gilbert
2019-02-20 5:28 ` Zhao Yan
2019-02-20 11:01 ` Dr. David Alan Gilbert
2019-02-20 11:28 ` Gonglei (Arei)
2019-02-20 11:42 ` Cornelia Huck
2019-02-20 12:07 ` Gonglei (Arei)
[not found] ` <20190327063509.GD14681@joy-OptiPlex-7040>
[not found] ` <20190327201854.GG2636@work-vm>
[not found] ` <20190327161020.1c013e65@x1.home>
2019-04-01 8:14 ` Cornelia Huck
2019-04-01 8:40 ` Yan Zhao
2019-04-01 14:15 ` Alex Williamson
2019-02-21 0:31 ` Zhao Yan
2019-02-21 9:15 ` Dr. David Alan Gilbert
2019-02-20 11:56 ` Gonglei (Arei)
2019-02-21 0:24 ` Zhao Yan
2019-02-21 1:35 ` Gonglei (Arei)
2019-02-21 1:58 ` Zhao Yan
2019-02-21 3:33 ` Gonglei (Arei)
2019-02-21 4:08 ` Zhao Yan
2019-02-21 5:46 ` Gonglei (Arei)
2019-02-21 2:04 ` Zhao Yan
2019-02-21 3:16 ` Gonglei (Arei)
2019-02-21 4:21 ` Zhao Yan
2019-02-21 5:56 ` Gonglei (Arei)
2019-02-21 20:40 ` Alex Williamson
2019-02-25 2:22 ` Zhao Yan
2019-03-06 0:22 ` Zhao Yan
2019-03-07 17:44 ` Alex Williamson
2019-03-07 23:20 ` Tian, Kevin
2019-03-08 16:11 ` Alex Williamson
2019-03-08 16:21 ` Dr. David Alan Gilbert [this message]
2019-03-08 22:02 ` Alex Williamson
2019-03-11 2:33 ` Tian, Kevin
2019-03-11 20:19 ` Alex Williamson
2019-03-12 2:48 ` Tian, Kevin
2019-03-12 2:57 ` Zhao Yan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190308162146.GD2834@work-vm \
--to=dgilbert@redhat.com \
--cc=Ken.Xue@amd.com \
--cc=Zhengxiao.zx@alibaba-inc.com \
--cc=aik@ozlabs.ru \
--cc=alex.williamson@redhat.com \
--cc=arei.gonglei@huawei.com \
--cc=changpeng.liu@intel.com \
--cc=cjia@nvidia.com \
--cc=cohuck@redhat.com \
--cc=eauger@redhat.com \
--cc=eskultet@redhat.com \
--cc=felipe@nutanix.com \
--cc=intel-gvt-dev@lists.freedesktop.org \
--cc=jonathan.davies@nutanix.com \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=kwankhede@nvidia.com \
--cc=mlevitsk@redhat.com \
--cc=pasic@linux.ibm.com \
--cc=qemu-devel@nongnu.org \
--cc=shuangtai.tst@alibaba-inc.com \
--cc=yan.y.zhao@intel.com \
--cc=yi.l.liu@intel.com \
--cc=zhi.a.wang@intel.com \
--cc=ziye.yang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).