* [PATCH net-next v4 0/2] vsock: avoid queuing on intermediate queue if possible
@ 2024-07-30 19:47 Luigi Leonardi via B4 Relay
  2024-07-30 19:47 ` [PATCH net-next v4 1/2] vsock/virtio: refactor virtio_transport_send_pkt_work Luigi Leonardi via B4 Relay
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Luigi Leonardi via B4 Relay @ 2024-07-30 19:47 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: kvm, virtualization, netdev, linux-kernel, Marco Pinna,
	Luigi Leonardi

This series introduces an optimization for vsock/virtio to reduce latency
and increase throughput: when the guest sends a packet to the host and the
intermediate queue (send_pkt_queue) is empty, the packet is put directly
into the virtqueue if there is enough space.
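
In short, the fast path added in patch 2/2 boils down to something like
this (a simplified sketch based on the patches below, not the exact code):

	/* in virtio_transport_send_pkt(), under rcu_read_lock() */
	if (!skb_queue_empty_lockless(&vsock->send_pkt_queue) ||
	    virtio_transport_send_skb_fast_path(vsock, skb)) {
		/* Slow path, as before: intermediate queue + worker. */
		if (virtio_vsock_skb_reply(skb))
			atomic_inc(&vsock->queued_replies);
		virtio_vsock_skb_queue_tail(&vsock->send_pkt_queue, skb);
		queue_work(virtio_vsock_workqueue, &vsock->send_pkt_work);
	}

virtio_transport_send_skb_fast_path() only trylocks tx_lock (we are in an
RCU read-side section), puts the skb on the TX virtqueue and kicks it,
returning a negative error if the virtqueue is full or the lock is
contended.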

v3->v4
While running experiments on fio with 64B payload, I realized that there
was a mistake in my fio configuration, so I re-ran all the experiments
and now the latency numbers are indeed lower with the patch applied.
I also noticed that I was kicking the host without the lock.

- Fixed a configuration mistake on fio and re-ran all experiments.
- Fio latency measurement using 64B payload.
- virtio_transport_send_skb_fast_path sends kick with the tx_lock acquired
- Addressed all minor style changes requested by maintainer.
- Rebased on latest net-next
- Link to v3: https://lore.kernel.org/r/20240711-pinna-v3-0-697d4164fe80@outlook.com

v2->v3
- Performed more experiments using iperf3 using multiple streams
- Handling of reply packets removed from virtio_transport_send_skb,
  as it is needed only by the worker.
- Removed atomic_inc/atomic_sub when queuing directly to the vq.
- Introduced virtio_transport_send_skb_fast_path that handles the
  steps for sending on the vq.
- Fixed a missing mutex_unlock in error path.
- Changed authorship of the second commit
- Rebased on latest net-next

v1->v2
In this v2 I replaced a mutex_lock with a mutex_trylock because it was
inside an RCU critical section. I also added a check on tx_run, so if the
module is being removed the packet is not queued. I'd like to thank Stefano
for reporting the tx_run issue.

Applied all Stefano's suggestions:
    - Minor code style changes
    - Minor commit text rewrite
Performed more experiments:
     - Check if all the packets go directly to the vq (Matias' suggestion)
     - Used iperf3 to see if there is any improvement in overall throughput
      from guest to host
     - Pinned the vhost process to a pCPU.
     - Run fio using 512B payload
Rebased on latest net-next

---
Luigi Leonardi (1):
      vsock/virtio: avoid queuing packets when intermediate queue is empty

Marco Pinna (1):
      vsock/virtio: refactor virtio_transport_send_pkt_work

 net/vmw_vsock/virtio_transport.c | 144 +++++++++++++++++++++++++--------------
 1 file changed, 94 insertions(+), 50 deletions(-)
---
base-commit: 1722389b0d863056d78287a120a1d6cadb8d4f7b
change-id: 20240730-pinna-db8cc1b8b037

Best regards,
-- 
Luigi Leonardi <luigi.leonardi@outlook.com>




* [PATCH net-next v4 1/2] vsock/virtio: refactor virtio_transport_send_pkt_work
  2024-07-30 19:47 [PATCH net-next v4 0/2] vsock: avoid queuing on intermediate queue if possible Luigi Leonardi via B4 Relay
@ 2024-07-30 19:47 ` Luigi Leonardi via B4 Relay
  2024-07-30 19:47 ` [PATCH net-next v4 2/2] vsock/virtio: avoid queuing packets when intermediate queue is empty Luigi Leonardi via B4 Relay
  2024-08-05  8:39 ` [PATCH net-next v4 0/2] vsock: avoid queuing on intermediate queue if possible Stefano Garzarella
  2 siblings, 0 replies; 11+ messages in thread
From: Luigi Leonardi via B4 Relay @ 2024-07-30 19:47 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: kvm, virtualization, netdev, linux-kernel, Marco Pinna,
	Luigi Leonardi

From: Marco Pinna <marco.pinn95@gmail.com>

Preliminary patch to introduce an optimization to the
enqueue system.

All the code used to enqueue a packet into the virtqueue
is removed from virtio_transport_send_pkt_work()
and moved to the new virtio_transport_send_skb() function.

Co-developed-by: Luigi Leonardi <luigi.leonardi@outlook.com>
Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com>
Signed-off-by: Marco Pinna <marco.pinn95@gmail.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
---
 net/vmw_vsock/virtio_transport.c | 105 ++++++++++++++++++++++-----------------
 1 file changed, 59 insertions(+), 46 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 64a07acfef12..f641e906f351 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -94,6 +94,63 @@ static u32 virtio_transport_get_local_cid(void)
 	return ret;
 }
 
+/* Caller need to hold vsock->tx_lock on vq */
+static int virtio_transport_send_skb(struct sk_buff *skb, struct virtqueue *vq,
+				     struct virtio_vsock *vsock)
+{
+	int ret, in_sg = 0, out_sg = 0;
+	struct scatterlist **sgs;
+
+	sgs = vsock->out_sgs;
+	sg_init_one(sgs[out_sg], virtio_vsock_hdr(skb),
+		    sizeof(*virtio_vsock_hdr(skb)));
+	out_sg++;
+
+	if (!skb_is_nonlinear(skb)) {
+		if (skb->len > 0) {
+			sg_init_one(sgs[out_sg], skb->data, skb->len);
+			out_sg++;
+		}
+	} else {
+		struct skb_shared_info *si;
+		int i;
+
+		/* If skb is nonlinear, then its buffer must contain
+		 * only header and nothing more. Data is stored in
+		 * the fragged part.
+		 */
+		WARN_ON_ONCE(skb_headroom(skb) != sizeof(*virtio_vsock_hdr(skb)));
+
+		si = skb_shinfo(skb);
+
+		for (i = 0; i < si->nr_frags; i++) {
+			skb_frag_t *skb_frag = &si->frags[i];
+			void *va;
+
+			/* We will use 'page_to_virt()' for the userspace page
+			 * here, because virtio or dma-mapping layers will call
+			 * 'virt_to_phys()' later to fill the buffer descriptor.
+			 * We don't touch memory at "virtual" address of this page.
+			 */
+			va = page_to_virt(skb_frag_page(skb_frag));
+			sg_init_one(sgs[out_sg],
+				    va + skb_frag_off(skb_frag),
+				    skb_frag_size(skb_frag));
+			out_sg++;
+		}
+	}
+
+	ret = virtqueue_add_sgs(vq, sgs, out_sg, in_sg, skb, GFP_KERNEL);
+	/* Usually this means that there is no more space available in
+	 * the vq
+	 */
+	if (ret < 0)
+		return ret;
+
+	virtio_transport_deliver_tap_pkt(skb);
+	return 0;
+}
+
 static void
 virtio_transport_send_pkt_work(struct work_struct *work)
 {
@@ -111,66 +168,22 @@ virtio_transport_send_pkt_work(struct work_struct *work)
 	vq = vsock->vqs[VSOCK_VQ_TX];
 
 	for (;;) {
-		int ret, in_sg = 0, out_sg = 0;
-		struct scatterlist **sgs;
 		struct sk_buff *skb;
 		bool reply;
+		int ret;
 
 		skb = virtio_vsock_skb_dequeue(&vsock->send_pkt_queue);
 		if (!skb)
 			break;
 
 		reply = virtio_vsock_skb_reply(skb);
-		sgs = vsock->out_sgs;
-		sg_init_one(sgs[out_sg], virtio_vsock_hdr(skb),
-			    sizeof(*virtio_vsock_hdr(skb)));
-		out_sg++;
-
-		if (!skb_is_nonlinear(skb)) {
-			if (skb->len > 0) {
-				sg_init_one(sgs[out_sg], skb->data, skb->len);
-				out_sg++;
-			}
-		} else {
-			struct skb_shared_info *si;
-			int i;
-
-			/* If skb is nonlinear, then its buffer must contain
-			 * only header and nothing more. Data is stored in
-			 * the fragged part.
-			 */
-			WARN_ON_ONCE(skb_headroom(skb) != sizeof(*virtio_vsock_hdr(skb)));
-
-			si = skb_shinfo(skb);
 
-			for (i = 0; i < si->nr_frags; i++) {
-				skb_frag_t *skb_frag = &si->frags[i];
-				void *va;
-
-				/* We will use 'page_to_virt()' for the userspace page
-				 * here, because virtio or dma-mapping layers will call
-				 * 'virt_to_phys()' later to fill the buffer descriptor.
-				 * We don't touch memory at "virtual" address of this page.
-				 */
-				va = page_to_virt(skb_frag_page(skb_frag));
-				sg_init_one(sgs[out_sg],
-					    va + skb_frag_off(skb_frag),
-					    skb_frag_size(skb_frag));
-				out_sg++;
-			}
-		}
-
-		ret = virtqueue_add_sgs(vq, sgs, out_sg, in_sg, skb, GFP_KERNEL);
-		/* Usually this means that there is no more space available in
-		 * the vq
-		 */
+		ret = virtio_transport_send_skb(skb, vq, vsock);
 		if (ret < 0) {
 			virtio_vsock_skb_queue_head(&vsock->send_pkt_queue, skb);
 			break;
 		}
 
-		virtio_transport_deliver_tap_pkt(skb);
-
 		if (reply) {
 			struct virtqueue *rx_vq = vsock->vqs[VSOCK_VQ_RX];
 			int val;

-- 
2.45.2




* [PATCH net-next v4 2/2] vsock/virtio: avoid queuing packets when intermediate queue is empty
  2024-07-30 19:47 [PATCH net-next v4 0/2] vsock: avoid queuing on intermediate queue if possible Luigi Leonardi via B4 Relay
  2024-07-30 19:47 ` [PATCH net-next v4 1/2] vsock/virtio: refactor virtio_transport_send_pkt_work Luigi Leonardi via B4 Relay
@ 2024-07-30 19:47 ` Luigi Leonardi via B4 Relay
  2024-07-31  7:39   ` Stefano Garzarella
  2024-08-05  8:39 ` [PATCH net-next v4 0/2] vsock: avoid queuing on intermediate queue if possible Stefano Garzarella
  2 siblings, 1 reply; 11+ messages in thread
From: Luigi Leonardi via B4 Relay @ 2024-07-30 19:47 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: kvm, virtualization, netdev, linux-kernel, Marco Pinna,
	Luigi Leonardi

From: Luigi Leonardi <luigi.leonardi@outlook.com>

When the driver needs to send new packets to the device, it always
queues the new sk_buffs into an intermediate queue (send_pkt_queue)
and schedules a worker (send_pkt_work) to then queue them into the
virtqueue exposed to the device.

This increases the chance of batching, but also introduces a lot of
latency into the communication. So we can optimize this path by
adding a fast path to be taken when there is no element in the
intermediate queue, there is space available in the virtqueue,
and no other process is sending packets (i.e. the tx_lock is not
already held).

The following benchmarks were run to check improvements in latency and
throughput. The test bed is a host with Intel i7-10700KF CPU @ 3.80GHz
and L1 guest running on QEMU/KVM with vhost process and all vCPUs
pinned individually to pCPUs.

- Latency
   Tool: Fio version 3.37-56
   Mode: pingpong (h-g-h)
   Test runs: 50
   Runtime-per-test: 50s
   Type: SOCK_STREAM

In the following fio benchmark (pingpong mode) the host sends
a payload to the guest and waits for the same payload back.

fio process pinned both inside the host and the guest system.

Before: Linux 6.9.8

Payload 64B:

	1st perc.	overall		99th perc.
Before	12.91		16.78		42.24		us
After	9.77		13.57		39.17		us

Payload 512B:

	1st perc.	overall		99th perc.
Before	13.35		17.35		41.52		us
After	10.25		14.11		39.58		us

Payload 4K:

	1st perc.	overall		99th perc.
Before	14.71		19.87		41.52		us
After	10.51		14.96		40.81		us

- Throughput
   Tool: iperf-vsock

The size represents the buffer length (-l) to read/write
P represents the number of parallel streams

P=1
	4K	64K	128K
Before	6.87	29.3	29.5 Gb/s
After	10.5	39.4	39.9 Gb/s

P=2
	4K	64K	128K
Before	10.5	32.8	33.2 Gb/s
After	17.8	47.7	48.5 Gb/s

P=4
	4K	64K	128K
Before	12.7	33.6	34.2 Gb/s
After	16.9	48.1	50.5 Gb/s

The performance improvement is related to this optimization:
I used an eBPF kretprobe on virtio_transport_send_skb to check
that each packet was sent directly to the virtqueue.
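
For reference, a similar check can be done with a plain kretprobe module
(a minimal, untested sketch; I actually used eBPF, so everything below is
just illustrative):

	#include <linux/module.h>
	#include <linux/kprobes.h>
	#include <linux/ptrace.h>
	#include <linux/atomic.h>

	static atomic_t sent = ATOMIC_INIT(0);
	static atomic_t failed = ATOMIC_INIT(0);

	static int vts_ret(struct kretprobe_instance *ri, struct pt_regs *regs)
	{
		/* virtio_transport_send_skb() returns 0 when the skb was
		 * placed on the virtqueue, a negative error otherwise.
		 * Probing virtio_transport_send_skb_fast_path instead (if
		 * not inlined) isolates the fast path from the worker.
		 */
		if (!regs_return_value(regs))
			atomic_inc(&sent);
		else
			atomic_inc(&failed);
		return 0;
	}

	static struct kretprobe vts_kretprobe = {
		.handler	= vts_ret,
		.kp.symbol_name	= "virtio_transport_send_skb",
		.maxactive	= 32,
	};

	static int __init vts_init(void)
	{
		return register_kretprobe(&vts_kretprobe);
	}

	static void __exit vts_exit(void)
	{
		unregister_kretprobe(&vts_kretprobe);
		pr_info("sent: %d, failed: %d\n",
			atomic_read(&sent), atomic_read(&failed));
	}

	module_init(vts_init);
	module_exit(vts_exit);
	MODULE_LICENSE("GPL");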

Co-developed-by: Marco Pinna <marco.pinn95@gmail.com>
Signed-off-by: Marco Pinna <marco.pinn95@gmail.com>
Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com>
---
 net/vmw_vsock/virtio_transport.c | 39 +++++++++++++++++++++++++++++++++++----
 1 file changed, 35 insertions(+), 4 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index f641e906f351..f992f9a216f0 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -208,6 +208,28 @@ virtio_transport_send_pkt_work(struct work_struct *work)
 		queue_work(virtio_vsock_workqueue, &vsock->rx_work);
 }
 
+/* Caller need to hold RCU for vsock.
+ * Returns 0 if the packet is successfully put on the vq.
+ */
+static int virtio_transport_send_skb_fast_path(struct virtio_vsock *vsock, struct sk_buff *skb)
+{
+	struct virtqueue *vq = vsock->vqs[VSOCK_VQ_TX];
+	int ret;
+
+	/* Inside RCU, can't sleep! */
+	ret = mutex_trylock(&vsock->tx_lock);
+	if (unlikely(ret == 0))
+		return -EBUSY;
+
+	ret = virtio_transport_send_skb(skb, vq, vsock);
+	if (ret == 0)
+		virtqueue_kick(vq);
+
+	mutex_unlock(&vsock->tx_lock);
+
+	return ret;
+}
+
 static int
 virtio_transport_send_pkt(struct sk_buff *skb)
 {
@@ -231,11 +253,20 @@ virtio_transport_send_pkt(struct sk_buff *skb)
 		goto out_rcu;
 	}
 
-	if (virtio_vsock_skb_reply(skb))
-		atomic_inc(&vsock->queued_replies);
+	/* If send_pkt_queue is empty, we can safely bypass this queue
+	 * because packet order is maintained and (try) to put the packet
+	 * on the virtqueue using virtio_transport_send_skb_fast_path.
+	 * If this fails we simply put the packet on the intermediate
+	 * queue and schedule the worker.
+	 */
+	if (!skb_queue_empty_lockless(&vsock->send_pkt_queue) ||
+	    virtio_transport_send_skb_fast_path(vsock, skb)) {
+		if (virtio_vsock_skb_reply(skb))
+			atomic_inc(&vsock->queued_replies);
 
-	virtio_vsock_skb_queue_tail(&vsock->send_pkt_queue, skb);
-	queue_work(virtio_vsock_workqueue, &vsock->send_pkt_work);
+		virtio_vsock_skb_queue_tail(&vsock->send_pkt_queue, skb);
+		queue_work(virtio_vsock_workqueue, &vsock->send_pkt_work);
+	}
 
 out_rcu:
 	rcu_read_unlock();

-- 
2.45.2




* Re: [PATCH net-next v4 2/2] vsock/virtio: avoid queuing packets when intermediate queue is empty
  2024-07-30 19:47 ` [PATCH net-next v4 2/2] vsock/virtio: avoid queuing packets when intermediate queue is empty Luigi Leonardi via B4 Relay
@ 2024-07-31  7:39   ` Stefano Garzarella
  0 siblings, 0 replies; 11+ messages in thread
From: Stefano Garzarella @ 2024-07-31  7:39 UTC (permalink / raw)
  To: luigi.leonardi
  Cc: Stefan Hajnoczi, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, kvm, virtualization, netdev, linux-kernel,
	Marco Pinna

On Tue, Jul 30, 2024 at 09:47:32PM GMT, Luigi Leonardi via B4 Relay wrote:
>From: Luigi Leonardi <luigi.leonardi@outlook.com>
>
>When the driver needs to send new packets to the device, it always
>queues the new sk_buffs into an intermediate queue (send_pkt_queue)
>and schedules a worker (send_pkt_work) to then queue them into the
>virtqueue exposed to the device.
>
>This increases the chance of batching, but also introduces a lot of
>latency into the communication. So we can optimize this path by
>adding a fast path to be taken when there is no element in the
>intermediate queue, there is space available in the virtqueue,
>and no other process is sending packets (i.e. the tx_lock is not
>already held).
>
>The following benchmarks were run to check improvements in latency and
>throughput. The test bed is a host with Intel i7-10700KF CPU @ 3.80GHz
>and L1 guest running on QEMU/KVM with vhost process and all vCPUs
>pinned individually to pCPUs.
>
>- Latency
>   Tool: Fio version 3.37-56
>   Mode: pingpong (h-g-h)
>   Test runs: 50
>   Runtime-per-test: 50s
>   Type: SOCK_STREAM
>
>In the following fio benchmark (pingpong mode) the host sends
>a payload to the guest and waits for the same payload back.
>
>fio process pinned both inside the host and the guest system.
>
>Before: Linux 6.9.8
>
>Payload 64B:
>
>	1st perc.	overall		99th perc.
>Before	12.91		16.78		42.24		us
>After	9.77		13.57		39.17		us
>
>Payload 512B:
>
>	1st perc.	overall		99th perc.
>Before	13.35		17.35		41.52		us
>After	10.25		14.11		39.58		us
>
>Payload 4K:
>
>	1st perc.	overall		99th perc.
>Before	14.71		19.87		41.52		us
>After	10.51		14.96		40.81		us
>
>- Throughput
>   Tool: iperf-vsock
>
>The size represents the buffer length (-l) to read/write
>P represents the number of parallel streams
>
>P=1
>	4K	64K	128K
>Before	6.87	29.3	29.5 Gb/s
>After	10.5	39.4	39.9 Gb/s
>
>P=2
>	4K	64K	128K
>Before	10.5	32.8	33.2 Gb/s
>After	17.8	47.7	48.5 Gb/s
>
>P=4
>	4K	64K	128K
>Before	12.7	33.6	34.2 Gb/s
>After	16.9	48.1	50.5 Gb/s

Great improvement! Thanks again for this work!

>
>The performance improvement is related to this optimization:
>I used an eBPF kretprobe on virtio_transport_send_skb to check
>that each packet was sent directly to the virtqueue.
>
>Co-developed-by: Marco Pinna <marco.pinn95@gmail.com>
>Signed-off-by: Marco Pinna <marco.pinn95@gmail.com>
>Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com>
>---
> net/vmw_vsock/virtio_transport.c | 39 +++++++++++++++++++++++++++++++++++----
> 1 file changed, 35 insertions(+), 4 deletions(-)

All my comments have been resolved. I let iperf run bidirectionally for 
a long time and saw no problems, so:

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>


>
>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>index f641e906f351..f992f9a216f0 100644
>--- a/net/vmw_vsock/virtio_transport.c
>+++ b/net/vmw_vsock/virtio_transport.c
>@@ -208,6 +208,28 @@ virtio_transport_send_pkt_work(struct work_struct *work)
> 		queue_work(virtio_vsock_workqueue, &vsock->rx_work);
> }
>
>+/* Caller need to hold RCU for vsock.
>+ * Returns 0 if the packet is successfully put on the vq.
>+ */
>+static int virtio_transport_send_skb_fast_path(struct virtio_vsock *vsock, struct sk_buff *skb)
>+{
>+	struct virtqueue *vq = vsock->vqs[VSOCK_VQ_TX];
>+	int ret;
>+
>+	/* Inside RCU, can't sleep! */
>+	ret = mutex_trylock(&vsock->tx_lock);
>+	if (unlikely(ret == 0))
>+		return -EBUSY;
>+
>+	ret = virtio_transport_send_skb(skb, vq, vsock);
>+	if (ret == 0)
>+		virtqueue_kick(vq);
>+
>+	mutex_unlock(&vsock->tx_lock);
>+
>+	return ret;
>+}
>+
> static int
> virtio_transport_send_pkt(struct sk_buff *skb)
> {
>@@ -231,11 +253,20 @@ virtio_transport_send_pkt(struct sk_buff *skb)
> 		goto out_rcu;
> 	}
>
>-	if (virtio_vsock_skb_reply(skb))
>-		atomic_inc(&vsock->queued_replies);
>+	/* If send_pkt_queue is empty, we can safely bypass this queue
>+	 * because packet order is maintained and (try) to put the packet
>+	 * on the virtqueue using virtio_transport_send_skb_fast_path.
>+	 * If this fails we simply put the packet on the intermediate
>+	 * queue and schedule the worker.
>+	 */
>+	if (!skb_queue_empty_lockless(&vsock->send_pkt_queue) ||
>+	    virtio_transport_send_skb_fast_path(vsock, skb)) {
>+		if (virtio_vsock_skb_reply(skb))
>+			atomic_inc(&vsock->queued_replies);
>
>-	virtio_vsock_skb_queue_tail(&vsock->send_pkt_queue, skb);
>-	queue_work(virtio_vsock_workqueue, &vsock->send_pkt_work);
>+		virtio_vsock_skb_queue_tail(&vsock->send_pkt_queue, skb);
>+		queue_work(virtio_vsock_workqueue, &vsock->send_pkt_work);
>+	}
>
> out_rcu:
> 	rcu_read_unlock();
>
>-- 
>2.45.2
>
>



* Re: [PATCH net-next v4 0/2] vsock: avoid queuing on intermediate queue if possible
  2024-07-30 19:47 [PATCH net-next v4 0/2] vsock: avoid queuing on intermediate queue if possible Luigi Leonardi via B4 Relay
  2024-07-30 19:47 ` [PATCH net-next v4 1/2] vsock/virtio: refactor virtio_transport_send_pkt_work Luigi Leonardi via B4 Relay
  2024-07-30 19:47 ` [PATCH net-next v4 2/2] vsock/virtio: avoid queuing packets when intermediate queue is empty Luigi Leonardi via B4 Relay
@ 2024-08-05  8:39 ` Stefano Garzarella
  2024-08-06 16:02   ` Jakub Kicinski
  2024-08-29 11:00   ` Luigi Leonardi
  2 siblings, 2 replies; 11+ messages in thread
From: Stefano Garzarella @ 2024-08-05  8:39 UTC (permalink / raw)
  To: luigi.leonardi, mst
  Cc: Stefan Hajnoczi, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, kvm, virtualization, netdev, linux-kernel,
	Marco Pinna

Hi Michael,
this series is marked as "Not Applicable" for the net-next tree:
https://patchwork.kernel.org/project/netdevbpf/patch/20240730-pinna-v4-2-5c9179164db5@outlook.com/

Actually this is more about the virtio-vsock driver, so can you queue 
this on your tree?

Thanks,
Stefano

On Tue, Jul 30, 2024 at 09:47:30PM GMT, Luigi Leonardi via B4 Relay wrote:
>This series introduces an optimization for vsock/virtio to reduce latency
>and increase throughput: when the guest sends a packet to the host and the
>intermediate queue (send_pkt_queue) is empty, the packet is put directly
>into the virtqueue if there is enough space.
>
>v3->v4
>While running experiments on fio with 64B payload, I realized that there
>was a mistake in my fio configuration, so I re-ran all the experiments
>and now the latency numbers are indeed lower with the patch applied.
>I also noticed that I was kicking the host without the lock.
>
>- Fixed a configuration mistake on fio and re-ran all experiments.
>- Fio latency measurement using 64B payload.
>- virtio_transport_send_skb_fast_path sends kick with the tx_lock acquired
>- Addressed all minor style changes requested by maintainer.
>- Rebased on latest net-next
>- Link to v3: https://lore.kernel.org/r/20240711-pinna-v3-0-697d4164fe80@outlook.com
>
>v2->v3
>- Performed more experiments using iperf3 using multiple streams
>- Handling of reply packets removed from virtio_transport_send_skb,
>  as it is needed only by the worker.
>- Removed atomic_inc/atomic_sub when queuing directly to the vq.
>- Introduced virtio_transport_send_skb_fast_path that handles the
>  steps for sending on the vq.
>- Fixed a missing mutex_unlock in error path.
>- Changed authorship of the second commit
>- Rebased on latest net-next
>
>v1->v2
>In this v2 I replaced a mutex_lock with a mutex_trylock because it was
>inside an RCU critical section. I also added a check on tx_run, so if the
>module is being removed the packet is not queued. I'd like to thank Stefano
>for reporting the tx_run issue.
>
>Applied all Stefano's suggestions:
>    - Minor code style changes
>    - Minor commit text rewrite
>Performed more experiments:
>     - Check if all the packets go directly to the vq (Matias' suggestion)
>     - Used iperf3 to see if there is any improvement in overall throughput
>      from guest to host
>     - Pinned the vhost process to a pCPU.
>     - Run fio using 512B payload
>Rebased on latest net-next
>
>---
>Luigi Leonardi (1):
>      vsock/virtio: avoid queuing packets when intermediate queue is empty
>
>Marco Pinna (1):
>      vsock/virtio: refactor virtio_transport_send_pkt_work
>
> net/vmw_vsock/virtio_transport.c | 144 +++++++++++++++++++++++++--------------
> 1 file changed, 94 insertions(+), 50 deletions(-)
>---
>base-commit: 1722389b0d863056d78287a120a1d6cadb8d4f7b
>change-id: 20240730-pinna-db8cc1b8b037
>
>Best regards,
>-- 
>Luigi Leonardi <luigi.leonardi@outlook.com>
>
>
>



* Re: [PATCH net-next v4 0/2] vsock: avoid queuing on intermediate queue if possible
  2024-08-05  8:39 ` [PATCH net-next v4 0/2] vsock: avoid queuing on intermediate queue if possible Stefano Garzarella
@ 2024-08-06 16:02   ` Jakub Kicinski
  2024-08-06 16:45     ` Stefano Garzarella
  2024-08-29 11:00   ` Luigi Leonardi
  1 sibling, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2024-08-06 16:02 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: luigi.leonardi, mst, Stefan Hajnoczi, David S. Miller,
	Eric Dumazet, Paolo Abeni, kvm, virtualization, netdev,
	linux-kernel, Marco Pinna

On Mon, 5 Aug 2024 10:39:23 +0200 Stefano Garzarella wrote:
> this series is marked as "Not Applicable" for the net-next tree:
> https://patchwork.kernel.org/project/netdevbpf/patch/20240730-pinna-v4-2-5c9179164db5@outlook.com/
> 
> Actually this is more about the virtio-vsock driver, so can you queue 
> this on your tree?

We can revive it in our patchwork, too, if that's easier.
Not entirely sure why it was discarded, seems borderline.


* Re: [PATCH net-next v4 0/2] vsock: avoid queuing on intermediate queue if possible
  2024-08-06 16:02   ` Jakub Kicinski
@ 2024-08-06 16:45     ` Stefano Garzarella
  0 siblings, 0 replies; 11+ messages in thread
From: Stefano Garzarella @ 2024-08-06 16:45 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: luigi.leonardi, mst, Stefan Hajnoczi, David S. Miller,
	Eric Dumazet, Paolo Abeni, kvm, virtualization, netdev,
	linux-kernel, Marco Pinna

On Tue, Aug 06, 2024 at 09:02:57AM GMT, Jakub Kicinski wrote:
>On Mon, 5 Aug 2024 10:39:23 +0200 Stefano Garzarella wrote:
>> this series is marked as "Not Applicable" for the net-next tree:
>> https://patchwork.kernel.org/project/netdevbpf/patch/20240730-pinna-v4-2-5c9179164db5@outlook.com/
>>
>> Actually this is more about the virtio-vsock driver, so can you queue
>> this on your tree?
>
>We can revive it in our patchwork, too, if that's easier.

That's perfectly fine with me, if Michael hasn't already queued it.

>Not entirely sure why it was discarded, seems borderline.
>

Yes, even to me it's not super clear when to expect net and when virtio.
Usually the other vsock transports (VMCI and HyperV) go with net, so 
virtio-vsock is a bit of an exception.

I don't have any particular preferences, so how it works best for you 
and Michael is fine with me.

Thanks,
Stefano



* Re: [PATCH net-next v4 0/2] vsock: avoid queuing on intermediate queue if possible
  2024-08-05  8:39 ` [PATCH net-next v4 0/2] vsock: avoid queuing on intermediate queue if possible Stefano Garzarella
  2024-08-06 16:02   ` Jakub Kicinski
@ 2024-08-29 11:00   ` Luigi Leonardi
  2024-08-29 12:19     ` Michael S. Tsirkin
  1 sibling, 1 reply; 11+ messages in thread
From: Luigi Leonardi @ 2024-08-29 11:00 UTC (permalink / raw)
  To: sgarzare, mst
  Cc: davem, edumazet, kuba, kvm, linux-kernel, luigi.leonardi,
	marco.pinn95, netdev, pabeni, stefanha, virtualization

Hi All,

It has been a while since the last email and this patch has not been merged yet.
This is just a gentle ping :)

Thanks,
Luigi

>Hi Michael,
>this series is marked as "Not Applicable" for the net-next tree:
>https://patchwork.kernel.org/project/netdevbpf/patch/20240730-pinna-v4-2-5c9179164db5@outlook.com/

>Actually this is more about the virtio-vsock driver, so can you queue
>this on your tree?

>Thanks,
>Stefano


* Re: [PATCH net-next v4 0/2] vsock: avoid queuing on intermediate queue if possible
  2024-08-29 11:00   ` Luigi Leonardi
@ 2024-08-29 12:19     ` Michael S. Tsirkin
  2024-08-29 12:33       ` Stefano Garzarella
  0 siblings, 1 reply; 11+ messages in thread
From: Michael S. Tsirkin @ 2024-08-29 12:19 UTC (permalink / raw)
  To: Luigi Leonardi
  Cc: sgarzare, davem, edumazet, kuba, kvm, linux-kernel, marco.pinn95,
	netdev, pabeni, stefanha, virtualization

On Thu, Aug 29, 2024 at 01:00:37PM +0200, Luigi Leonardi wrote:
> Hi All,
> 
> It has been a while since the last email and this patch has not been merged yet.
> This is just a gentle ping :)
> 
> Thanks,
> Luigi


ok I can queue it for next. Next time pls remember to CC all
maintainers. Thanks!


> >Hi Michael,
> >this series is marked as "Not Applicable" for the net-next tree:
> >https://patchwork.kernel.org/project/netdevbpf/patch/20240730-pinna-v4-2-5c9179164db5@outlook.com/
> 
> >Actually this is more about the virtio-vsock driver, so can you queue
> >this on your tree?
> 
> >Thanks,
> >Stefano



* Re: [PATCH net-next v4 0/2] vsock: avoid queuing on intermediate queue if possible
  2024-08-29 12:19     ` Michael S. Tsirkin
@ 2024-08-29 12:33       ` Stefano Garzarella
  2024-08-29 12:37         ` Michael S. Tsirkin
  0 siblings, 1 reply; 11+ messages in thread
From: Stefano Garzarella @ 2024-08-29 12:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Luigi Leonardi, davem, edumazet, kuba, kvm, linux-kernel,
	marco.pinn95, netdev, pabeni, stefanha, virtualization

On Thu, Aug 29, 2024 at 08:19:31AM GMT, Michael S. Tsirkin wrote:
>On Thu, Aug 29, 2024 at 01:00:37PM +0200, Luigi Leonardi wrote:
>> Hi All,
>>
>> It has been a while since the last email and this patch has not been merged yet.
>> This is just a gentle ping :)
>>
>> Thanks,
>> Luigi
>
>
>ok I can queue it for next. Next time pls remember to CC all
>maintainers. Thanks!

Thanks for queueing it!

BTW, it looks like the virtio-vsock driver is listed in
"VIRTIO AND VHOST VSOCK DRIVER" but not listed under
"VIRTIO CORE AND NET DRIVERS", so running get_maintainer.pl I have this
list:

$ ./scripts/get_maintainer.pl -f net/vmw_vsock/virtio_transport.c
Stefan Hajnoczi <stefanha@redhat.com> (maintainer:VIRTIO AND VHOST VSOCK DRIVER)
Stefano Garzarella <sgarzare@redhat.com> (maintainer:VIRTIO AND VHOST VSOCK DRIVER)
"David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING [GENERAL])
Eric Dumazet <edumazet@google.com> (maintainer:NETWORKING [GENERAL])
Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING [GENERAL])
Paolo Abeni <pabeni@redhat.com> (maintainer:NETWORKING [GENERAL])
kvm@vger.kernel.org (open list:VIRTIO AND VHOST VSOCK DRIVER)
virtualization@lists.linux.dev (open list:VIRTIO AND VHOST VSOCK DRIVER)
netdev@vger.kernel.org (open list:VIRTIO AND VHOST VSOCK DRIVER)
linux-kernel@vger.kernel.org (open list)

Should we add net/vmw_vsock/virtio_transport.c and related files also 
under "VIRTIO CORE AND NET DRIVERS" ?

Thanks,
Stefano

>
>
>> >Hi Michael,
>> >this series is marked as "Not Applicable" for the net-next tree:
>> >https://patchwork.kernel.org/project/netdevbpf/patch/20240730-pinna-v4-2-5c9179164db5@outlook.com/
>>
>> >Actually this is more about the virtio-vsock driver, so can you queue
>> >this on your tree?
>>
>> >Thanks,
>> >Stefano
>



* Re: [PATCH net-next v4 0/2] vsock: avoid queuing on intermediate queue if possible
  2024-08-29 12:33       ` Stefano Garzarella
@ 2024-08-29 12:37         ` Michael S. Tsirkin
  0 siblings, 0 replies; 11+ messages in thread
From: Michael S. Tsirkin @ 2024-08-29 12:37 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: Luigi Leonardi, davem, edumazet, kuba, kvm, linux-kernel,
	marco.pinn95, netdev, pabeni, stefanha, virtualization

On Thu, Aug 29, 2024 at 02:33:11PM +0200, Stefano Garzarella wrote:
> On Thu, Aug 29, 2024 at 08:19:31AM GMT, Michael S. Tsirkin wrote:
> > On Thu, Aug 29, 2024 at 01:00:37PM +0200, Luigi Leonardi wrote:
> > > Hi All,
> > > 
> > > It has been a while since the last email and this patch has not been merged yet.
> > > This is just a gentle ping :)
> > > 
> > > Thanks,
> > > Luigi
> > 
> > 
> > ok I can queue it for next. Next time pls remember to CC all
> > maintainers. Thanks!
> 
> Thank for queueing it!
> 
> BTW, it looks like the virtio-vsock driver is listed in
> "VIRTIO AND VHOST VSOCK DRIVER" but not listed under
> "VIRTIO CORE AND NET DRIVERS", so running get_maintainer.pl I have this
> list:
> 
> $ ./scripts/get_maintainer.pl -f net/vmw_vsock/virtio_transport.c
> Stefan Hajnoczi <stefanha@redhat.com> (maintainer:VIRTIO AND VHOST VSOCK DRIVER)
> Stefano Garzarella <sgarzare@redhat.com> (maintainer:VIRTIO AND VHOST VSOCK DRIVER)
> "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING [GENERAL])
> Eric Dumazet <edumazet@google.com> (maintainer:NETWORKING [GENERAL])
> Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING [GENERAL])
> Paolo Abeni <pabeni@redhat.com> (maintainer:NETWORKING [GENERAL])
> kvm@vger.kernel.org (open list:VIRTIO AND VHOST VSOCK DRIVER)
> virtualization@lists.linux.dev (open list:VIRTIO AND VHOST VSOCK DRIVER)
> netdev@vger.kernel.org (open list:VIRTIO AND VHOST VSOCK DRIVER)
> linux-kernel@vger.kernel.org (open list)
> 
> Should we add net/vmw_vsock/virtio_transport.c and related files also under
> "VIRTIO CORE AND NET DRIVERS" ?
> 
> Thanks,
> Stefano

ok

> > 
> > 
> > > >Hi Michael,
> > > >this series is marked as "Not Applicable" for the net-next tree:
> > > >https://patchwork.kernel.org/project/netdevbpf/patch/20240730-pinna-v4-2-5c9179164db5@outlook.com/
> > > 
> > > >Actually this is more about the virtio-vsock driver, so can you queue
> > > >this on your tree?
> > > 
> > > >Thanks,
> > > >Stefano
> > 

