Re: [PATCH v2] vhost-vdpa: allow passing opened vhostfd to vhost-vdpa

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Si-Wei Liu <si-wei.liu@oracle.com>
To: Jason Wang <jasowang@redhat.com>
Cc: qemu-devel@nongnu.org, eblake@redhat.com, armbru@redhat.com,
	eperezma@redhat.com, Cindy Lu <lulu@redhat.com>,
	Shuo Wang <shuo.s.wang@oracle.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>
Subject: Re: [PATCH v2] vhost-vdpa: allow passing opened vhostfd to vhost-vdpa
Date: Thu, 27 Oct 2022 14:56:17 -0700	[thread overview]
Message-ID: <dca26485-162a-6c61-33af-94e062503e11@oracle.com> (raw)
In-Reply-To: <d625202e-9729-a050-db31-da2a5d89d58c@oracle.com>

Hi Jason,

Sorry for top posting, but are you going to queue this patch? It looks 
like the discussion has been settled and no further comment I got for 2 
weeks for this patch.

Thanks,
-Siwei

On 10/13/2022 4:12 PM, Si-Wei Liu wrote:
> Jason,
>
> On 10/12/2022 10:02 PM, Jason Wang wrote:
>>
>> 在 2022/10/12 13:59, Si-Wei Liu 写道:
>>>
>>>
>>> On 10/11/2022 8:09 PM, Jason Wang wrote:
>>>> On Tue, Oct 11, 2022 at 1:18 AM Si-Wei Liu<si-wei.liu@oracle.com> 
>>>> wrote:
>>>>> On 10/8/2022 10:43 PM, Jason Wang wrote:
>>>>>
>>>>> On Sat, Oct 8, 2022 at 5:04 PM Si-Wei Liu<si-wei.liu@oracle.com> 
>>>>> wrote:
>>>>>
>>>>> Similar to other vhost backends, vhostfd can be passed to vhost-vdpa
>>>>> backend as another parameter to instantiate vhost-vdpa net client.
>>>>> This would benefit the use case where only open file descriptors, as
>>>>> opposed to raw vhost-vdpa device paths, are accessible from the QEMU
>>>>> process.
>>>>>
>>>>> (qemu) netdev_add type=vhost-vdpa,vhostfd=61,id=vhost-vdpa1
>>>>>
>>>>> Adding Cindy.
>>>>>
>>>>> This has been discussed before, we've already had
>>>>> vhostdev=/dev/fdset/$fd which should be functional equivalent to what
>>>>> has been proposed here. (And this is how libvirt works if I 
>>>>> understand
>>>>> correctly).
>>>>>
>>>>> Yes, I was aware of that discussion. However, our implementation 
>>>>> of the management software is a bit different from libvirt, in 
>>>>> which the paths in /dev/fdset/NNN can't be dynamically passed to 
>>>>> the container where QEMU is running. By using a specific vhostfd 
>>>>> property with existing code, it would allow our mgmt software 
>>>>> smooth adaption without having to add too much infra code to 
>>>>> support the /dev/fdset/NNN trick.
>>>> I think fdset has extra flexibility in e.g hot-plug to allow the file
>>>> descriptor to be passed with SCM_RIGHTS.
>>> Yes, that's exactly the use case we'd like to support. Though the 
>>> difference in our mgmt software stack from libvirt is that any 
>>> dynamic path in /dev (like /dev/fdset/ABC or /dev/vhost-vdpa-XYZ) 
>>> can't be allowed to get passed through to the container running QEMU 
>>> on the fly for security reasons. fd passing is allowed, though, with 
>>> very strict security checks.
>>
>>
>> Interesting, any reason for disallowing fd passing?
> For our mgmt software stack, QEMU is running in a secured container 
> with its own namespace(s) with minimally well known and trusted 
> devices from root ns exposed (only) at the time when QEMU is being 
> started.  Direct fd passing via SCM_RIGHTS is allowed, but fdset 
> device node exposure is not allowed and not even considered useful to 
> us, as it adds an unwarranted attack surface to the QEMU's secured 
> container unnecessarily. This has been the case and our security model 
> for a while now w.r.t hot plugging vhost-net/tap and vhost-scsi 
> devices, so will do for vhost-vdpa with vhostfd. It's not an open 
> source project, though what I can share is that it's not a simple 
> script that can be easily changed, and allow passing extra devices 
> e.g. fdset especially on the fly is not even in consideration per 
> suggested security guideline. I think we don't do anything special 
> here as with other secured containers that disallow dynamic device 
> injection on the fly.
>
>> I'm asking since it's the way that libvirt work and it seems to me we 
>> don't get any complaints in the past.
> I guess it was because libvirt doesn't run QEMU in a container with 
> very limited device exposure, otherwise this sort of constraints would 
> pop up. Anyway the point and the way I see it is that passing vhostfd 
> is proved to be working well and secure with other vhost devices, I 
> don't see why vhost-vdpa is treated special here that would need to 
> enforce the fdset usage. It's an edge case for libvirt maybe, but 
> supporting QEMU's vhost-vdpa device to run in a securely contained 
> environment with no dynamic device injection shouldn't be an odd or 
> bizarre use case.
>
>
> Thanks,
> -Siwei
>
>>
>>
>>> That's the main motivation for this direct vhostfd passing support 
>>> (noted fdset doesn't need to be used along with /dev/fdset node).
>>>
>>> Having it said, I found there's also nuance in the 
>>> vhostdev=/dev/fdset/XyZ interface besides the /dev node limitation: 
>>> the fd to open has to be dup'ed from the original one passed via 
>>> SCM_RIGHTS. This also has implication on security that any ioctl 
>>> call from QEMU can't be audited through the original fd.
>>
>>
>> I'm not sure I get this, but management layer can enforce a ioctl 
>> whiltelist for safety.
>>
>> Thanks
>>
>>
>>> With this regard, I think vhostfd offers more flexibility than work 
>>> around those qemu_open() specifics. Would these justify the use case 
>>> of concern?
>>>
>>> Thanks,
>>> -Siwei
>>>
>>>>   It would still be good to add
>>>> the support.
>>>>
>>>>> On the other hand, the other vhost backends, e.g. tap (via 
>>>>> vhost-net), vhost-scsi and vhost-vsock all accept vhostfd as 
>>>>> parameter to instantiate device, although the /dev/fdset trick 
>>>>> also works there. I think vhost-vdpa is not  unprecedented in this 
>>>>> case?
>>>> Yes.
>>>>
>>>> Thanks
>>>>
>>>>> Thanks,
>>>>> -Siwei
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Signed-off-by: Si-Wei Liu<si-wei.liu@oracle.com>
>>>>> Acked-by: Eugenio Pérez<eperezma@redhat.com>
>>>>>
>>>>> ---
>>>>> v2:
>>>>>    - fixed typo in commit message
>>>>>    - s/fd's/file descriptors/
>>>>> ---
>>>>>   net/vhost-vdpa.c | 25 ++++++++++++++++++++-----
>>>>>   qapi/net.json    |  3 +++
>>>>>   qemu-options.hx  |  6 ++++--
>>>>>   3 files changed, 27 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
>>>>> index 182b3a1..366b070 100644
>>>>> --- a/net/vhost-vdpa.c
>>>>> +++ b/net/vhost-vdpa.c
>>>>> @@ -683,14 +683,29 @@ int net_init_vhost_vdpa(const Netdev 
>>>>> *netdev, const char *name,
>>>>>
>>>>>       assert(netdev->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>>>>>       opts = &netdev->u.vhost_vdpa;
>>>>> -    if (!opts->vhostdev) {
>>>>> -        error_setg(errp, "vdpa character device not specified 
>>>>> with vhostdev");
>>>>> +    if (!opts->has_vhostdev && !opts->has_vhostfd) {
>>>>> +        error_setg(errp,
>>>>> +                   "vhost-vdpa: neither vhostdev= nor vhostfd= 
>>>>> was specified");
>>>>>           return -1;
>>>>>       }
>>>>>
>>>>> -    vdpa_device_fd = qemu_open(opts->vhostdev, O_RDWR, errp);
>>>>> -    if (vdpa_device_fd == -1) {
>>>>> -        return -errno;
>>>>> +    if (opts->has_vhostdev && opts->has_vhostfd) {
>>>>> +        error_setg(errp,
>>>>> +                   "vhost-vdpa: vhostdev= and vhostfd= are 
>>>>> mutually exclusive");
>>>>> +        return -1;
>>>>> +    }
>>>>> +
>>>>> +    if (opts->has_vhostdev) {
>>>>> +        vdpa_device_fd = qemu_open(opts->vhostdev, O_RDWR, errp);
>>>>> +        if (vdpa_device_fd == -1) {
>>>>> +            return -errno;
>>>>> +        }
>>>>> +    } else if (opts->has_vhostfd) {
>>>>> +        vdpa_device_fd = monitor_fd_param(monitor_cur(), 
>>>>> opts->vhostfd, errp);
>>>>> +        if (vdpa_device_fd == -1) {
>>>>> +            error_prepend(errp, "vhost-vdpa: unable to parse 
>>>>> vhostfd: ");
>>>>> +            return -1;
>>>>> +        }
>>>>>       }
>>>>>
>>>>>       r = vhost_vdpa_get_features(vdpa_device_fd, &features, errp);
>>>>> diff --git a/qapi/net.json b/qapi/net.json
>>>>> index dd088c0..926ecc8 100644
>>>>> --- a/qapi/net.json
>>>>> +++ b/qapi/net.json
>>>>> @@ -442,6 +442,8 @@
>>>>>   # @vhostdev: path of vhost-vdpa device
>>>>>   #            (default:'/dev/vhost-vdpa-0')
>>>>>   #
>>>>> +# @vhostfd: file descriptor of an already opened vhost vdpa device
>>>>> +#
>>>>>   # @queues: number of queues to be created for multiqueue vhost-vdpa
>>>>>   #          (default: 1)
>>>>>   #
>>>>> @@ -456,6 +458,7 @@
>>>>>   { 'struct': 'NetdevVhostVDPAOptions',
>>>>>     'data': {
>>>>>       '*vhostdev':     'str',
>>>>> +    '*vhostfd':      'str',
>>>>>       '*queues':       'int',
>>>>>       '*x-svq':        {'type': 'bool', 'features' : [ 'unstable'] 
>>>>> } } }
>>>>>
>>>>> diff --git a/qemu-options.hx b/qemu-options.hx
>>>>> index 913c71e..c040f74 100644
>>>>> --- a/qemu-options.hx
>>>>> +++ b/qemu-options.hx
>>>>> @@ -2774,8 +2774,10 @@ DEF("netdev", HAS_ARG, QEMU_OPTION_netdev,
>>>>>       "                configure a vhost-user network, backed by a 
>>>>> chardev 'dev'\n"
>>>>>   #endif
>>>>>   #ifdef __linux__
>>>>> -    "-netdev vhost-vdpa,id=str,vhostdev=/path/to/dev\n"
>>>>> +    "-netdev 
>>>>> vhost-vdpa,id=str[,vhostdev=/path/to/dev][,vhostfd=h]\n"
>>>>>       "                configure a vhost-vdpa network,Establish a 
>>>>> vhost-vdpa netdev\n"
>>>>> +    "                use 'vhostdev=/path/to/dev' to open a vhost 
>>>>> vdpa device\n"
>>>>> +    "                use 'vhostfd=h' to connect to an already 
>>>>> opened vhost vdpa device\n"
>>>>>   #endif
>>>>>   #ifdef CONFIG_VMNET
>>>>>       "-netdev vmnet-host,id=str[,isolated=on|off][,net-uuid=uuid]\n"
>>>>> @@ -3280,7 +3282,7 @@ SRST
>>>>>                -netdev type=vhost-user,id=net0,chardev=chr0 \
>>>>>                -device virtio-net-pci,netdev=net0
>>>>>
>>>>> -``-netdev vhost-vdpa,vhostdev=/path/to/dev``
>>>>> +``-netdev vhost-vdpa[,vhostdev=/path/to/dev][,vhostfd=h]``
>>>>>       Establish a vhost-vdpa netdev.
>>>>>
>>>>>       vDPA device is a device that uses a datapath which complies 
>>>>> with
>>>>> -- 
>>>>> 1.8.3.1
>>>>>
>>>>>
>>>
>>
>

next prev parent reply	other threads:[~2022-10-27 21:58 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-08  7:58 [PATCH v2] vhost-vdpa: allow passing opened vhostfd to vhost-vdpa Si-Wei Liu
2022-10-09  5:43 ` Jason Wang
2022-10-10 17:18   ` Si-Wei Liu
2022-10-12  3:09     ` Jason Wang
2022-10-12  5:59       ` Si-Wei Liu
2022-10-13  5:02         ` Jason Wang
2022-10-13 23:12           ` Si-Wei Liu
2022-10-27 21:56             ` Si-Wei Liu [this message]
2022-10-28  1:50               ` Jason Wang
2022-10-28 15:31                 ` Si-Wei Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dca26485-162a-6c61-33af-94e062503e11@oracle.com \
    --to=si-wei.liu@oracle.com \
    --cc=armbru@redhat.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=eblake@redhat.com \
    --cc=eperezma@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=lulu@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=shuo.s.wang@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).