netdev.vger.kernel.org archive mirror
From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Stefano Garzarella <sgarzare@redhat.com>
Cc: "Bobby Eshleman" <bobbyeshleman@gmail.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"K. Y. Srinivasan" <kys@microsoft.com>,
	"Haiyang Zhang" <haiyangz@microsoft.com>,
	"Wei Liu" <wei.liu@kernel.org>,
	"Dexuan Cui" <decui@microsoft.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Jason Wang" <jasowang@redhat.com>,
	"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
	"Eugenio Pérez" <eperezma@redhat.com>,
	"Bryan Tan" <bryan-bt.tan@broadcom.com>,
	"Vishnu Dasa" <vishnu.dasa@broadcom.com>,
	"Broadcom internal kernel review list"
	<bcm-kernel-feedback-list@broadcom.com>,
	"David S. Miller" <davem@davemloft.net>,
	virtualization@lists.linux.dev, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org,
	kvm@vger.kernel.org
Subject: Re: [PATCH v2 0/3] vsock: add namespace support to vhost-vsock
Date: Tue, 1 Apr 2025 20:05:16 +0100
Message-ID: <Z-w47H3qUXZe4seQ@redhat.com>
In-Reply-To: <r6a6ihjw3etlb5chqsb65u7uhcav6q6pjxu65iqpp76423w2wd@kmctvoaywmbu>

On Fri, Mar 28, 2025 at 06:03:19PM +0100, Stefano Garzarella wrote:
> CCing Daniel
> 
> On Wed, Mar 12, 2025 at 01:59:34PM -0700, Bobby Eshleman wrote:
> > Picking up Stefano's v1 [1], this series adds netns support to
> > vhost-vsock. Unlike v1, this series does not address guest-to-host (g2h)
> > namespaces, deferring that for future implementation and discussion.
> > 
> > Any vsock created with /dev/vhost-vsock is a global vsock, accessible
> > from any namespace. Any vsock created with /dev/vhost-vsock-netns is a
> > "scoped" vsock, accessible only to sockets in its namespace. If a global
> > vsock and a scoped vsock share the same CID, the scoped vsock takes
> > precedence.
> > 
> > If a socket in a namespace connects with a global vsock, the CID becomes
> > unavailable to any VMM in that namespace when creating new vsocks. If
> > disconnected, the CID becomes available again.
> 
> I was talking about this feature with Daniel and he pointed out something
> interesting (Daniel please feel free to correct me):
> 
>     If we have a process in the host that does a listen(AF_VSOCK) in a
> namespace, can this receive connections from guests connected to
> /dev/vhost-vsock in any namespace?
> 
>     Should we provide something (e.g. sysctl/sysfs entry) to disable
> this behaviour, preventing a process in a namespace from receiving
> connections from the global vsock address space (i.e. /dev/vhost-vsock
> VMs)?

I think my concern goes a bit beyond that, to the general conceptual
idea of sharing the CID space between the global vsocks and namespace
vsocks. So I'm not sure a sysctl would be sufficient; details are further
below.

> I understand that by default maybe we should allow this behaviour in order
> to not break current applications, but in some cases the user may want to
> isolate sockets in a namespace also from being accessed by VMs running in
> the global vsock address space.
> 
> Indeed in this series we have talked mostly about the host -> guest path (as
> the direction of the connection), but little about the guest -> host path,
> maybe we should explain it better in the cover/commit
> descriptions/documentation.

> > Testing
> > 
> > QEMU with /dev/vhost-vsock-netns support:
> > 	https://github.com/beshleman/qemu/tree/vsock-netns
> > 
> > Test: Scoped vsocks isolated by namespace
> > 
> >  host# ip netns add ns1
> >  host# ip netns add ns2
> >  host# ip netns exec ns1 \
> > 				  qemu-system-x86_64 \
> > 					  -m 8G -smp 4 -cpu host -enable-kvm \
> > 					  -serial mon:stdio \
> > 					  -drive if=virtio,file=${IMAGE1} \
> > 					  -device vhost-vsock-pci,netns=on,guest-cid=15
> >  host# ip netns exec ns2 \
> > 				  qemu-system-x86_64 \
> > 					  -m 8G -smp 4 -cpu host -enable-kvm \
> > 					  -serial mon:stdio \
> > 					  -drive if=virtio,file=${IMAGE2} \
> > 					  -device vhost-vsock-pci,netns=on,guest-cid=15
> > 
> >  host# socat - VSOCK-CONNECT:15:1234
> >  2025/03/10 17:09:40 socat[255741] E connect(5, AF=40 cid:15 port:1234, 16): No such device
> > 
> >  host# echo foobar1 | sudo ip netns exec ns1 socat - VSOCK-CONNECT:15:1234
> >  host# echo foobar2 | sudo ip netns exec ns2 socat - VSOCK-CONNECT:15:1234
> > 
> >  vm1# socat - VSOCK-LISTEN:1234
> >  foobar1
> >  vm2# socat - VSOCK-LISTEN:1234
> >  foobar2
> > 
> > Test: Global vsocks accessible to any namespace
> > 
> >  host# qemu-system-x86_64 \
> > 	  -m 8G -smp 4 -cpu host -enable-kvm \
> > 	  -serial mon:stdio \
> > 	  -drive if=virtio,file=${IMAGE2} \
> > 	  -device vhost-vsock-pci,guest-cid=15,netns=off
> > 
> >  host# echo foobar | sudo ip netns exec ns1 socat - VSOCK-CONNECT:15:1234
> > 
> >  vm# socat - VSOCK-LISTEN:1234
> >  foobar
> > 
> > Test: Connecting to global vsock makes CID unavailable to namespace
> > 
> >  host# qemu-system-x86_64 \
> > 	  -m 8G -smp 4 -cpu host -enable-kvm \
> > 	  -serial mon:stdio \
> > 	  -drive if=virtio,file=${IMAGE2} \
> > 	  -device vhost-vsock-pci,guest-cid=15,netns=off
> > 
> >  vm# socat - VSOCK-LISTEN:1234
> > 
> >  host# sudo ip netns exec ns1 socat - VSOCK-CONNECT:15:1234
> >  host# ip netns exec ns1 \
> > 				  qemu-system-x86_64 \
> > 					  -m 8G -smp 4 -cpu host -enable-kvm \
> > 					  -serial mon:stdio \
> > 					  -drive if=virtio,file=${IMAGE1} \
> > 					  -device vhost-vsock-pci,netns=on,guest-cid=15
> > 
> >  qemu-system-x86_64: -device vhost-vsock-pci,netns=on,guest-cid=15: vhost-vsock: unable to set guest cid: Address already in use

I find it conceptually quite unsettling that the VSOCK CID address
space for AF_VSOCK is shared between the host and the namespace.
That feels contrary to how namespaces are more commonly used for
deterministically isolating resources between the namespace and the
host.

Naively I would expect that in a namespace, all VSOCK CIDs are
free for use, without having to concern yourself with what CIDs
are in use in the host now, or in future.

What happens if we reverse the QEMU launch order above, to get the
following scenario:

   # Launch VM1 inside the NS
   host# ip netns exec ns1 \
  				  qemu-system-x86_64 \
  					  -m 8G -smp 4 -cpu host -enable-kvm \
  					  -serial mon:stdio \
  					  -drive if=virtio,file=${IMAGE1} \
  					  -device vhost-vsock-pci,netns=on,guest-cid=15
   # Launch VM2
   host# qemu-system-x86_64 \
  	  -m 8G -smp 4 -cpu host -enable-kvm \
  	  -serial mon:stdio \
  	  -drive if=virtio,file=${IMAGE2} \
  	  -device vhost-vsock-pci,guest-cid=15,netns=off
  
   vm1# socat - VSOCK-LISTEN:1234
   vm2# socat - VSOCK-LISTEN:1234

   host# socat - VSOCK-CONNECT:15:1234
     => Presume this connects to "VM2" running outside the NS

   host# sudo ip netns exec ns1 socat - VSOCK-CONNECT:15:1234

     => Does this connect to "VM1" inside the NS, or "VM2"
        outside the NS ?
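
Going by the cover letter's rule that a scoped vsock takes precedence
over a global one with the same CID, the second connect would be expected
to reach "VM1". The toy Python model below spells out that lookup rule.
It is a sketch of the stated semantics only, not of the kernel code, and
CidTable, register and lookup are hypothetical names. Whether v2 really
lets both VMs hold CID 15 at the same time is part of what needs
confirming.

```python
GLOBAL = None  # marks a vsock created via /dev/vhost-vsock (no namespace)


class CidTable:
    """Toy CID resolution: scoped entries shadow global ones."""

    def __init__(self):
        self._vms = {}  # (netns or GLOBAL, cid) -> VM name

    def register(self, cid, vm, netns=GLOBAL):
        key = (netns, cid)
        if key in self._vms:
            raise OSError("Address already in use")
        self._vms[key] = vm

    def lookup(self, cid, caller_netns=GLOBAL):
        # A caller inside a namespace first tries that namespace's scoped vsock.
        if caller_netns is not GLOBAL:
            scoped = self._vms.get((caller_netns, cid))
            if scoped is not None:
                return scoped
        # Otherwise fall back to the global vsock; None means "No such device".
        return self._vms.get((GLOBAL, cid))


table = CidTable()
table.register(15, "VM1", netns="ns1")  # launched first, inside ns1
table.register(15, "VM2")               # launched second, global

print(table.lookup(15))                      # VM2
print(table.lookup(15, caller_netns="ns1"))  # VM1
print(table.lookup(15, caller_netns="ns2"))  # VM2
```

Under this model the launch order does not change which VM a connect
reaches, but a process in ns1 can no longer reach the global CID 15 once
a scoped VM with the same CID exists.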



With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

