* [LSF/MM/BPF TOPIC] Inter-VM Shared Memory Communications with eBPF
@ 2024-02-23 23:05 Cong Wang
2024-03-04 9:59 ` Dust Li
0 siblings, 1 reply; 5+ messages in thread
From: Cong Wang @ 2024-02-23 23:05 UTC (permalink / raw)
To: lsf-pc; +Cc: bpf, a.mehrab@bytedance.com
Hi, all
We would like to discuss our inter-VM shared memory communications
proposal with the BPF community.
First, a VMM (virtual machine monitor) offers significant advantages
over native machines when the VMs co-resident on the same physical
host do not compete for network and computing resources. However,
when co-resident VMs do compete for resources under heavy workloads,
VM performance degrades significantly compared to that of native
machines, due to the high overhead of switches and events in the
host/guest domains and the VMM. Second, the communication overhead
between co-resident VMs can be as high as the communication cost
between VMs located on separate physical machines, because the VM
abstraction provided by the VMM does not distinguish whether a data
request comes from a co-resident VM or not. More importantly, when
TCP/IP is used as the communication method, the overhead of the
Linux networking stack itself is also significant.
Although vsock already offers an optimized alternative for inter-VM
communication, we argue that its lack of transparency to applications
is why it has not been widely adopted. Instead of introducing yet
another socket family, we propose a novel solution that uses shared
memory with eBPF to bypass the TCP/IP stack completely and
transparently, bringing co-resident VM communication close to optimal.
We would like to discuss:
- How to design a new eBPF map based on IVSHMEM (Inter-VM Shared Memory)?
- How to reuse the existing eBPF ring buffer?
- How to leverage the socket map to replace tcp_sendmsg() and
tcp_recvmsg() with shared-memory logic? (See the sketch below.)
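To make the last item concrete, here is a minimal sketch of how the
sockmap/sk_msg pieces could fit together (illustrative only: the
port-based keying is a placeholder, and the actual hand-off into the
shared-memory ring is elided):

/* Sketch: a sockops program records established TCP sockets in a
 * sockmap; an sk_msg program then intercepts sendmsg() payloads,
 * which is where the shared-memory hand-off for co-resident peers
 * would live. The port-based keying is a placeholder. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
        __uint(type, BPF_MAP_TYPE_SOCKMAP);
        __uint(max_entries, 1024);
        __type(key, __u32);
        __type(value, __u64);
} sock_map SEC(".maps");

SEC("sockops")
int track_established(struct bpf_sock_ops *skops)
{
        __u32 key = skops->local_port;

        switch (skops->op) {
        case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB:
        case BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB:
                /* Remember the socket so sk_msg can act on it later. */
                bpf_sock_map_update(skops, &sock_map, &key, BPF_ANY);
                break;
        }
        return 0;
}

SEC("sk_msg")
int redirect_msg(struct sk_msg_md *msg)
{
        __u32 key = msg->local_port;

        /* For a co-resident peer, the payload would be placed into the
         * IVSHMEM ring here; redirecting inside the sockmap is just the
         * simplest stand-in for that hand-off. */
        return bpf_msg_redirect_map(msg, &sock_map, key, BPF_F_INGRESS);
}

char _license[] SEC("license") = "GPL";

The same sockmap entry also gives us a natural place to fall back to
regular TCP whenever the peer turns out not to be co-resident.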
Thanks.
Cong
* Re: [LSF/MM/BPF TOPIC] Inter-VM Shared Memory Communications with eBPF
2024-02-23 23:05 [LSF/MM/BPF TOPIC] Inter-VM Shared Memory Communications with eBPF Cong Wang
@ 2024-03-04 9:59 ` Dust Li
2024-03-08 3:52 ` Cong Wang
0 siblings, 1 reply; 5+ messages in thread
From: Dust Li @ 2024-03-04 9:59 UTC (permalink / raw)
To: Cong Wang, lsf-pc; +Cc: bpf, Xuan Zhuo, a.mehrab@bytedance.com
On Fri, Feb 23, 2024 at 03:05:59PM -0800, Cong Wang wrote:
Hi Cong,
This is a good topic!
We have proposed another solution to accelerate inter-VM TCP/IP
communication transparently within the same host, based on SMC-D +
virtio-ism:
https://lists.oasis-open.org/archives/virtio-comment/202212/msg00030.html
I wonder: can we do better with your proposal?
Best regards,
Dust
>Hi, all
>
>We would like to discuss our inter-VM shared memory communications
>proposal with the BPF community.
>
>First, a VMM (virtual machine monitor) offers significant advantages
>over native machines when the VMs co-resident on the same physical
>host do not compete for network and computing resources. However,
>when co-resident VMs do compete for resources under heavy workloads,
>VM performance degrades significantly compared to that of native
>machines, due to the high overhead of switches and events in the
>host/guest domains and the VMM. Second, the communication overhead
>between co-resident VMs can be as high as the communication cost
>between VMs located on separate physical machines, because the VM
>abstraction provided by the VMM does not distinguish whether a data
>request comes from a co-resident VM or not. More importantly, when
>TCP/IP is used as the communication method, the overhead of the
>Linux networking stack itself is also significant.
>
>Although vsock already offers an optimized alternative for inter-VM
>communication, we argue that its lack of transparency to applications
>is why it has not been widely adopted. Instead of introducing yet
>another socket family, we propose a novel solution that uses shared
>memory with eBPF to bypass the TCP/IP stack completely and
>transparently, bringing co-resident VM communication close to optimal.
>
>We would like to discuss:
>- How to design a new eBPF map based on IVSHMEM (Inter-VM Shared Memory)?
>- How to reuse the existing eBPF ring buffer?
>- How to leverage the socket map to replace tcp_sendmsg() and
>tcp_recvmsg() with shared-memory logic?
>
>
>Thanks.
>Cong
* Re: [LSF/MM/BPF TOPIC] Inter-VM Shared Memory Communications with eBPF
2024-03-04 9:59 ` Dust Li
@ 2024-03-08 3:52 ` Cong Wang
2024-03-11 9:54 ` Dust Li
0 siblings, 1 reply; 5+ messages in thread
From: Cong Wang @ 2024-03-08 3:52 UTC (permalink / raw)
To: dust.li; +Cc: lsf-pc, bpf, Xuan Zhuo, a.mehrab@bytedance.com
On Mon, Mar 4, 2024 at 1:59 AM Dust Li <dust.li@linux.alibaba.com> wrote:
>
> On Fri, Feb 23, 2024 at 03:05:59PM -0800, Cong Wang wrote:
>
> Hi Cong,
>
> This is a good topic!
> We have proposed another solution to accelerate inter-VM TCP/IP
> communication transparently within the same host, based on SMC-D +
> virtio-ism:
> https://lists.oasis-open.org/archives/virtio-comment/202212/msg00030.html
>
> I wonder: can we do better with your proposal?
We know about SMC, and it _is_ actually why I came up with this
eBPF-based proposal. Sorry for not providing more details here; I just
want to keep this proposal brief, and we will certainly have all the
details in our presentation if our proposal gets accepted.
The main problem with SMC is that it is not fully transparent: LD_PRELOAD
works for most cases, but not all. Therefore, I don't think introducing
any new socket family is the right direction at all.
(There are other problems with SMC too; for instance, it requires more
than a 3-way handshake.)
And I don't think there is any conflict or overlap here at all. Our
eBPF-based solution relies on existing inter-VM shared memory, whether
it is ivshmem or virtio-ism. We don't propose any new way of sharing
memory; what we propose is merely using an existing mechanism and
building our solution on top.
In fact, we believe our solution can sit on top of your virtio-ism,
since it is just another flat memory region from our point of view.
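As a minimal illustration of what we mean by a flat memory region
(the PCI device address below is an example, not fixed): ivshmem
exposes its shared memory as BAR 2 of the device, which a guest can
simply mmap:

/* Sketch: map the ivshmem shared-memory BAR in the guest. ivshmem
 * exposes the shared memory as PCI BAR 2; the device address and the
 * mapping size below are examples, not fixed values. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
        const char *bar = "/sys/bus/pci/devices/0000:00:04.0/resource2";
        int fd = open(bar, O_RDWR);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        size_t len = 4096;      /* first page; real size comes from the BAR */
        void *shm = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (shm == MAP_FAILED) {
                perror("mmap");
                close(fd);
                return 1;
        }

        /* From here on it is ordinary memory: a ring buffer placed in it
         * is visible to every co-resident VM mapping the same region. */
        munmap(shm, len);
        close(fd);
        return 0;
}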
Hope this helps.
Thanks.
* Re: [LSF/MM/BPF TOPIC] Inter-VM Shared Memory Communications with eBPF
2024-03-08 3:52 ` Cong Wang
@ 2024-03-11 9:54 ` Dust Li
2024-05-07 17:18 ` Cong Wang
0 siblings, 1 reply; 5+ messages in thread
From: Dust Li @ 2024-03-11 9:54 UTC (permalink / raw)
To: Cong Wang; +Cc: lsf-pc, bpf, Xuan Zhuo, a.mehrab@bytedance.com
On Thu, Mar 07, 2024 at 07:52:52PM -0800, Cong Wang wrote:
>On Mon, Mar 4, 2024 at 1:59 AM Dust Li <dust.li@linux.alibaba.com> wrote:
>>
>> On Fri, Feb 23, 2024 at 03:05:59PM -0800, Cong Wang wrote:
>>
>> Hi Cong,
>>
>> This is a good topic!
>> We have proposed another solution to accelerate inter-VM TCP/IP
>> communication transparently within the same host, based on SMC-D +
>> virtio-ism:
>> https://lists.oasis-open.org/archives/virtio-comment/202212/msg00030.html
>>
>> I wonder: can we do better with your proposal?
>
>We know about SMC, and it _is_ actually why I came up with this
>eBPF-based proposal. Sorry for not providing more details here; I just
>want to keep this proposal brief, and we will certainly have all the
>details in our presentation if our proposal gets accepted.
>
>The main problem with SMC is that it is not fully transparent: LD_PRELOAD
>works for most cases, but not all. Therefore, I don't think introducing
>any new socket family is the right direction at all.
Actually, this is not really true. We have introduced several ways to
solve this. The best way, I think, is to support IPPROTO_SMC [1] in SMC
and use the same eBPF infrastructure that MPTCP has already
contributed [2].
[1] https://lore.kernel.org/netdev/20231113045758.GB121324@linux.alibaba.com
[2] https://lore.kernel.org/all/cover.1692147782.git.geliang.tang@suse.com
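For illustration, with IPPROTO_SMC an application keeps using AF_INET
and only the protocol number changes; a minimal sketch, assuming the
support proposed in [1] (the protocol value comes from that proposal):

/* Sketch: the application keeps AF_INET/SOCK_STREAM and only the
 * protocol number changes; no new socket family is exposed. The
 * IPPROTO_SMC value is the one used in the proposal in [1]. */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

#ifndef IPPROTO_SMC
#define IPPROTO_SMC 256
#endif

int main(void)
{
        int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);
        if (fd < 0) {
                perror("socket");       /* kernel without IPPROTO_SMC support */
                return 1;
        }
        /* bind()/connect() then proceed exactly as with plain TCP. */
        close(fd);
        return 0;
}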
>
>(There are other problems with SMC too; for instance, it requires more
>than a 3-way handshake.)
Right, but I don't see much of a performance penalty from the extra
handshake; setting up the shared memory is always the slowest part of
a shared-memory communication model.
>
>And I don't think there is any conflict or overlap here at all. Our
>eBPF-based solution relies on existing inter-VM shared memory, whether
>it is ivshmem or virtio-ism. We don't propose any new way of sharing
>memory; what we propose is merely using an existing mechanism and
>building our solution on top.
>
>In fact, we believe our solution can sit on top of your virtio-ism,
>since it is just another flat memory region from our point of view.
>
>Hope this helps.
Don't get me wrong, I'm not against your proposal at all; I like it.
I just hope the different solutions can be seen.
Best regards,
Dust
>
>Thanks.
* Re: [LSF/MM/BPF TOPIC] Inter-VM Shared Memory Communications with eBPF
2024-03-11 9:54 ` Dust Li
@ 2024-05-07 17:18 ` Cong Wang
0 siblings, 0 replies; 5+ messages in thread
From: Cong Wang @ 2024-05-07 17:18 UTC (permalink / raw)
To: Dust Li; +Cc: lsf-pc, bpf, Xuan Zhuo, a.mehrab@bytedance.com
On Mon, Mar 11, 2024 at 05:54:56PM +0800, Dust Li wrote:
> On Thu, Mar 07, 2024 at 07:52:52PM -0800, Cong Wang wrote:
> >On Mon, Mar 4, 2024 at 1:59 AM Dust Li <dust.li@linux.alibaba.com> wrote:
> >>
> >> On Fri, Feb 23, 2024 at 03:05:59PM -0800, Cong Wang wrote:
> >>
> >> Hi Cong,
> >>
> >> This is a good topic!
> >> We have proposed another solution to accelerate inter-VM TCP/IP
> >> communication transparently within the same host, based on SMC-D +
> >> virtio-ism:
> >> https://lists.oasis-open.org/archives/virtio-comment/202212/msg00030.html
> >>
> >> I wonder: can we do better with your proposal?
> >
> >We know about SMC, and it _is_ actually why I came up with this
> >eBPF-based proposal. Sorry for not providing more details here; I just
> >want to keep this proposal brief, and we will certainly have all the
> >details in our presentation if our proposal gets accepted.
> >
> >The main problem with SMC is that it is not fully transparent: LD_PRELOAD
> >works for most cases, but not all. Therefore, I don't think introducing
> >any new socket family is the right direction at all.
>
> Actually, this is not really true. We have introduced several ways to
> solve this. The best way, I think, is to support IPPROTO_SMC [1] in SMC
> and use the same eBPF infrastructure that MPTCP has already
> contributed [2].
>
> [1] https://lore.kernel.org/netdev/20231113045758.GB121324@linux.alibaba.com
> [2] https://lore.kernel.org/all/cover.1692147782.git.geliang.tang@suse.com
(Sorry for missing your email.)
I think this is wrong. Basically and literally speaking, it amounts to
saying "you want to use a kernel module to replace another kernel
module, with eBPF as a trigger". The trigger itself could not function
at all without the actual module that provides the implementation. Nor
does it work for kernel sockets; you need to think about NVMe-oF, which
is a very legitimate case since it supports both TCP and RDMA.
Unlike SMC, all the eBPF components we need here can easily be used
independently for other purposes. Neither sockmap nor sockops (nor even
ivshmem) is designed for this specific case; we just combine and reuse
them. I hope you can now see how and why this flexibility matters. We
prefer eBPF not because it is cool or new, but because of this kind of
flexibility.
BTW, the granularity of IPPROTO_SMC is also less ideal than that of
sockops, which can be applied per container.
Thanks.