From: Si-Wei Liu <si-wei.liu@oracle.com>
To: Jason Wang <jasowang@redhat.com>
Cc: qemu-devel@nongnu.org, eblake@redhat.com, armbru@redhat.com,
eperezma@redhat.com, Cindy Lu <lulu@redhat.com>,
Shuo Wang <shuo.s.wang@oracle.com>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>
Subject: Re: [PATCH v2] vhost-vdpa: allow passing opened vhostfd to vhost-vdpa
Date: Thu, 27 Oct 2022 14:56:17 -0700 [thread overview]
Message-ID: <dca26485-162a-6c61-33af-94e062503e11@oracle.com> (raw)
In-Reply-To: <d625202e-9729-a050-db31-da2a5d89d58c@oracle.com>
Hi Jason,
Sorry for top posting, but are you going to queue this patch? It looks
like the discussion has been settled and no further comment I got for 2
weeks for this patch.
Thanks,
-Siwei
On 10/13/2022 4:12 PM, Si-Wei Liu wrote:
> Jason,
>
> On 10/12/2022 10:02 PM, Jason Wang wrote:
>>
>> 在 2022/10/12 13:59, Si-Wei Liu 写道:
>>>
>>>
>>> On 10/11/2022 8:09 PM, Jason Wang wrote:
>>>> On Tue, Oct 11, 2022 at 1:18 AM Si-Wei Liu<si-wei.liu@oracle.com>
>>>> wrote:
>>>>> On 10/8/2022 10:43 PM, Jason Wang wrote:
>>>>>
>>>>> On Sat, Oct 8, 2022 at 5:04 PM Si-Wei Liu<si-wei.liu@oracle.com>
>>>>> wrote:
>>>>>
>>>>> Similar to other vhost backends, vhostfd can be passed to vhost-vdpa
>>>>> backend as another parameter to instantiate vhost-vdpa net client.
>>>>> This would benefit the use case where only open file descriptors, as
>>>>> opposed to raw vhost-vdpa device paths, are accessible from the QEMU
>>>>> process.
>>>>>
>>>>> (qemu) netdev_add type=vhost-vdpa,vhostfd=61,id=vhost-vdpa1
>>>>>
>>>>> Adding Cindy.
>>>>>
>>>>> This has been discussed before, we've already had
>>>>> vhostdev=/dev/fdset/$fd which should be functional equivalent to what
>>>>> has been proposed here. (And this is how libvirt works if I
>>>>> understand
>>>>> correctly).
>>>>>
>>>>> Yes, I was aware of that discussion. However, our implementation
>>>>> of the management software is a bit different from libvirt, in
>>>>> which the paths in /dev/fdset/NNN can't be dynamically passed to
>>>>> the container where QEMU is running. By using a specific vhostfd
>>>>> property with existing code, it would allow our mgmt software
>>>>> smooth adaption without having to add too much infra code to
>>>>> support the /dev/fdset/NNN trick.
>>>> I think fdset has extra flexibility in e.g hot-plug to allow the file
>>>> descriptor to be passed with SCM_RIGHTS.
>>> Yes, that's exactly the use case we'd like to support. Though the
>>> difference in our mgmt software stack from libvirt is that any
>>> dynamic path in /dev (like /dev/fdset/ABC or /dev/vhost-vdpa-XYZ)
>>> can't be allowed to get passed through to the container running QEMU
>>> on the fly for security reasons. fd passing is allowed, though, with
>>> very strict security checks.
>>
>>
>> Interesting, any reason for disallowing fd passing?
> For our mgmt software stack, QEMU is running in a secured container
> with its own namespace(s) with minimally well known and trusted
> devices from root ns exposed (only) at the time when QEMU is being
> started. Direct fd passing via SCM_RIGHTS is allowed, but fdset
> device node exposure is not allowed and not even considered useful to
> us, as it adds an unwarranted attack surface to the QEMU's secured
> container unnecessarily. This has been the case and our security model
> for a while now w.r.t hot plugging vhost-net/tap and vhost-scsi
> devices, so will do for vhost-vdpa with vhostfd. It's not an open
> source project, though what I can share is that it's not a simple
> script that can be easily changed, and allow passing extra devices
> e.g. fdset especially on the fly is not even in consideration per
> suggested security guideline. I think we don't do anything special
> here as with other secured containers that disallow dynamic device
> injection on the fly.
>
>> I'm asking since it's the way that libvirt work and it seems to me we
>> don't get any complaints in the past.
> I guess it was because libvirt doesn't run QEMU in a container with
> very limited device exposure, otherwise this sort of constraints would
> pop up. Anyway the point and the way I see it is that passing vhostfd
> is proved to be working well and secure with other vhost devices, I
> don't see why vhost-vdpa is treated special here that would need to
> enforce the fdset usage. It's an edge case for libvirt maybe, but
> supporting QEMU's vhost-vdpa device to run in a securely contained
> environment with no dynamic device injection shouldn't be an odd or
> bizarre use case.
>
>
> Thanks,
> -Siwei
>
>>
>>
>>> That's the main motivation for this direct vhostfd passing support
>>> (noted fdset doesn't need to be used along with /dev/fdset node).
>>>
>>> Having it said, I found there's also nuance in the
>>> vhostdev=/dev/fdset/XyZ interface besides the /dev node limitation:
>>> the fd to open has to be dup'ed from the original one passed via
>>> SCM_RIGHTS. This also has implication on security that any ioctl
>>> call from QEMU can't be audited through the original fd.
>>
>>
>> I'm not sure I get this, but management layer can enforce a ioctl
>> whiltelist for safety.
>>
>> Thanks
>>
>>
>>> With this regard, I think vhostfd offers more flexibility than work
>>> around those qemu_open() specifics. Would these justify the use case
>>> of concern?
>>>
>>> Thanks,
>>> -Siwei
>>>
>>>> It would still be good to add
>>>> the support.
>>>>
>>>>> On the other hand, the other vhost backends, e.g. tap (via
>>>>> vhost-net), vhost-scsi and vhost-vsock all accept vhostfd as
>>>>> parameter to instantiate device, although the /dev/fdset trick
>>>>> also works there. I think vhost-vdpa is not unprecedented in this
>>>>> case?
>>>> Yes.
>>>>
>>>> Thanks
>>>>
>>>>> Thanks,
>>>>> -Siwei
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Signed-off-by: Si-Wei Liu<si-wei.liu@oracle.com>
>>>>> Acked-by: Eugenio Pérez<eperezma@redhat.com>
>>>>>
>>>>> ---
>>>>> v2:
>>>>> - fixed typo in commit message
>>>>> - s/fd's/file descriptors/
>>>>> ---
>>>>> net/vhost-vdpa.c | 25 ++++++++++++++++++++-----
>>>>> qapi/net.json | 3 +++
>>>>> qemu-options.hx | 6 ++++--
>>>>> 3 files changed, 27 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>>>> index 182b3a1..366b070 100644
>>>>> --- a/net/vhost-vdpa.c
>>>>> +++ b/net/vhost-vdpa.c
>>>>> @@ -683,14 +683,29 @@ int net_init_vhost_vdpa(const Netdev
>>>>> *netdev, const char *name,
>>>>>
>>>>> assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>>>>> opts = &netdev->u.vhost_vdpa;
>>>>> - if (!opts->vhostdev) {
>>>>> - error_setg(errp, "vdpa character device not specified
>>>>> with vhostdev");
>>>>> + if (!opts->has_vhostdev && !opts->has_vhostfd) {
>>>>> + error_setg(errp,
>>>>> + "vhost-vdpa: neither vhostdev= nor vhostfd=
>>>>> was specified");
>>>>> return -1;
>>>>> }
>>>>>
>>>>> - vdpa_device_fd = qemu_open(opts->vhostdev, O_RDWR, errp);
>>>>> - if (vdpa_device_fd == -1) {
>>>>> - return -errno;
>>>>> + if (opts->has_vhostdev && opts->has_vhostfd) {
>>>>> + error_setg(errp,
>>>>> + "vhost-vdpa: vhostdev= and vhostfd= are
>>>>> mutually exclusive");
>>>>> + return -1;
>>>>> + }
>>>>> +
>>>>> + if (opts->has_vhostdev) {
>>>>> + vdpa_device_fd = qemu_open(opts->vhostdev, O_RDWR, errp);
>>>>> + if (vdpa_device_fd == -1) {
>>>>> + return -errno;
>>>>> + }
>>>>> + } else if (opts->has_vhostfd) {
>>>>> + vdpa_device_fd = monitor_fd_param(monitor_cur(),
>>>>> opts->vhostfd, errp);
>>>>> + if (vdpa_device_fd == -1) {
>>>>> + error_prepend(errp, "vhost-vdpa: unable to parse
>>>>> vhostfd: ");
>>>>> + return -1;
>>>>> + }
>>>>> }
>>>>>
>>>>> r = vhost_vdpa_get_features(vdpa_device_fd, &features, errp);
>>>>> diff --git a/qapi/net.json b/qapi/net.json
>>>>> index dd088c0..926ecc8 100644
>>>>> --- a/qapi/net.json
>>>>> +++ b/qapi/net.json
>>>>> @@ -442,6 +442,8 @@
>>>>> # @vhostdev: path of vhost-vdpa device
>>>>> # (default:'/dev/vhost-vdpa-0')
>>>>> #
>>>>> +# @vhostfd: file descriptor of an already opened vhost vdpa device
>>>>> +#
>>>>> # @queues: number of queues to be created for multiqueue vhost-vdpa
>>>>> # (default: 1)
>>>>> #
>>>>> @@ -456,6 +458,7 @@
>>>>> { 'struct': 'NetdevVhostVDPAOptions',
>>>>> 'data': {
>>>>> '*vhostdev': 'str',
>>>>> + '*vhostfd': 'str',
>>>>> '*queues': 'int',
>>>>> '*x-svq': {'type': 'bool', 'features' : [ 'unstable']
>>>>> } } }
>>>>>
>>>>> diff --git a/qemu-options.hx b/qemu-options.hx
>>>>> index 913c71e..c040f74 100644
>>>>> --- a/qemu-options.hx
>>>>> +++ b/qemu-options.hx
>>>>> @@ -2774,8 +2774,10 @@ DEF("netdev", HAS_ARG, QEMU_OPTION_netdev,
>>>>> " configure a vhost-user network, backed by a
>>>>> chardev 'dev'\n"
>>>>> #endif
>>>>> #ifdef __linux__
>>>>> - "-netdev vhost-vdpa,id=str,vhostdev=/path/to/dev\n"
>>>>> + "-netdev
>>>>> vhost-vdpa,id=str[,vhostdev=/path/to/dev][,vhostfd=h]\n"
>>>>> " configure a vhost-vdpa network,Establish a
>>>>> vhost-vdpa netdev\n"
>>>>> + " use 'vhostdev=/path/to/dev' to open a vhost
>>>>> vdpa device\n"
>>>>> + " use 'vhostfd=h' to connect to an already
>>>>> opened vhost vdpa device\n"
>>>>> #endif
>>>>> #ifdef CONFIG_VMNET
>>>>> "-netdev vmnet-host,id=str[,isolated=on|off][,net-uuid=uuid]\n"
>>>>> @@ -3280,7 +3282,7 @@ SRST
>>>>> -netdev type=vhost-user,id=net0,chardev=chr0 \
>>>>> -device virtio-net-pci,netdev=net0
>>>>>
>>>>> -``-netdev vhost-vdpa,vhostdev=/path/to/dev``
>>>>> +``-netdev vhost-vdpa[,vhostdev=/path/to/dev][,vhostfd=h]``
>>>>> Establish a vhost-vdpa netdev.
>>>>>
>>>>> vDPA device is a device that uses a datapath which complies
>>>>> with
>>>>> --
>>>>> 1.8.3.1
>>>>>
>>>>>
>>>
>>
>
next prev parent reply other threads:[~2022-10-27 21:58 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-08 7:58 [PATCH v2] vhost-vdpa: allow passing opened vhostfd to vhost-vdpa Si-Wei Liu
2022-10-09 5:43 ` Jason Wang
2022-10-10 17:18 ` Si-Wei Liu
2022-10-12 3:09 ` Jason Wang
2022-10-12 5:59 ` Si-Wei Liu
2022-10-13 5:02 ` Jason Wang
2022-10-13 23:12 ` Si-Wei Liu
2022-10-27 21:56 ` Si-Wei Liu [this message]
2022-10-28 1:50 ` Jason Wang
2022-10-28 15:31 ` Si-Wei Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=dca26485-162a-6c61-33af-94e062503e11@oracle.com \
--to=si-wei.liu@oracle.com \
--cc=armbru@redhat.com \
--cc=boris.ostrovsky@oracle.com \
--cc=eblake@redhat.com \
--cc=eperezma@redhat.com \
--cc=jasowang@redhat.com \
--cc=lulu@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=shuo.s.wang@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).