linux-kernel.vger.kernel.org archive mirror
* [PATCH net 0/5] hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2"
@ 2025-05-13  0:05 mhkelley58
  2025-05-13  0:06 ` [PATCH net 1/5] Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges mhkelley58
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: mhkelley58 @ 2025-05-13  0:05 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, James.Bottomley, martin.petersen
  Cc: linux-hyperv, linux-kernel, netdev, linux-scsi, stable

From: Michael Kelley <mhklinux@outlook.com>

Starting with commit dca5161f9bd0 in the 6.3 kernel, the Linux driver
for Hyper-V synthetic networking (netvsc) occasionally reports
"nvsp_rndis_pkt_complete error status: 2".[1] This error indicates
that Hyper-V has rejected a network packet transmit request from the
guest, and the outgoing network packet is dropped. Higher level
network protocols presumably recover and resend the packet so there is
no functional error, but performance is slightly impacted. Commit
dca5161f9bd0 is not the cause of the error -- it only added reporting
of an error that was already happening without any notice. The error
has presumably been present since the netvsc driver was originally
introduced into Linux.

This patch set fixes the root cause of the problem, which is that the
netvsc driver in Linux may send an incorrectly formatted VMBus message
to Hyper-V when transmitting the network packet. The incorrect
formatting occurs when the rndis header of the VMBus message crosses a
page boundary due to how the Linux skb head memory is aligned. In such
a case, two PFNs are required to describe the location of the rndis
header, even though they are contiguous in guest physical address
(GPA) space. Hyper-V requires that two PFNs be in a single "GPA range"
data structure, but current netvsc code puts each PFN in its own GPA
range, which Hyper-V rejects as an error in the case of the rndis
header.
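
To make the formatting requirement concrete, here is a rough sketch of
how a 300-byte rndis header that starts near the end of a page should be
described. The field names match the hv_mpb_array layout used later in
this series; "range" and "pfn" are placeholders, and the offset/length
values are made up for illustration:

	/* Correct: one GPA range whose pfn_array covers both pages */
	range->offset = 3900;		/* offset into the first page */
	range->len = 300;		/* crosses the 4096-byte page boundary */
	range->pfn_array[0] = pfn;	/* first page of the rndis header */
	range->pfn_array[1] = pfn + 1;	/* second, contiguous page */

	/* Incorrect (current netvsc behavior): two GPA ranges, each holding
	 * one of the two PFNs. Hyper-V rejects this for the rndis header.
	 */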

The incorrect formatting occurs only for larger packets that netvsc
must transmit via a VMBus "GPA Direct" message. There's no problem
when netvsc transmits a smaller packet by copying it into a pre-
allocated send buffer slot because the pre-allocated slots don't have
page crossing issues.

After commit 14ad6ed30a10 in the 6.14 kernel, the error occurs much
more frequently in VMs with 16 or more vCPUs. It may occur every few
seconds, or even more frequently, in an ssh session that outputs a lot
of text. Commit 14ad6ed30a10 subtly changes how skb head memory is
allocated, making it much more likely that the rndis header will cross
a page boundary when the vCPU count is 16 or more.  The changes in
commit 14ad6ed30a10 are perfectly valid -- they just had the side
effect of making the netvsc bug more prominent.

One fix is to check for adjacent PFNs in vmbus_sendpacket_pagebuffer()
and just combine them into a single GPA range. Such a fix is very
contained. But conceptually it is fixing the problem at the wrong
level. So this patch set takes the broader approach of maintaining
the already known grouping of contiguous PFNs at a higher level in
the netvsc driver code, and propagating that grouping down to the
creation of the VMBus message to send to Hyper-V. Maintaining the
grouping fixes this problem, and has the added benefit of allowing
netvsc_dma_map() to make fewer calls to dma_map_single() to do bounce
buffering in CoCo VMs.
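
As a rough illustration of that side benefit (this is a sketch only, not
the literal netvsc_dma_map() code; hvpfn_to_virt() is a hypothetical
helper used here for brevity):

	/* One bounce-buffer mapping per pb[] entry. When each entry is a
	 * contiguous multi-page range instead of a single PFN, there are
	 * fewer, larger dma_map_single() calls and less swiotlb pressure.
	 */
	for (i = 0; i < packet->page_buf_cnt; i++) {
		void *src = hvpfn_to_virt(pb[i].pfn) + pb[i].offset;

		dma_addrs[i] = dma_map_single(dev, src, pb[i].len,
					      DMA_TO_DEVICE);
	}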

Patch 1 is a preparatory change to allow vmbus_sendpacket_mpb_desc()
to specify multiple GPA ranges. In current code
vmbus_sendpacket_mpb_desc() is used only by the storvsc synthetic SCSI
driver, and it always creates a single GPA range.

Patch 2 updates the netvsc driver to use vmbus_sendpacket_mpb_desc()
instead of vmbus_sendpacket_pagebuffer(). Because the higher levels of
netvsc still don't group contiguous PFNs, this patch is functionally
neutral. The VMBus message to Hyper-V still has many GPA ranges, each
with a single PFN. But it lays the groundwork for the next patch.

Patch 3 changes the higher levels of netvsc to preserve the already
known grouping of contiguous PFNs. When the contiguous groupings are
passed to vmbus_sendpacket_mpb_desc(), GPA ranges containing multiple
PFNs are produced, as expected by Hyper-V. This is the point at which the
core problem is fixed.

Patches 4 and 5 remove code that is no longer necessary after the
previous patches.

These changes provide a net reduction of about 65 lines of code, which
is an added benefit.

These changes have been tested in normal VMs, in SEV-SNP and TDX CoCo
VMs, and in Dv6-series VMs where the netvsp implementation is in the
OpenHCL paravisor instead of the Hyper-V host.

These changes are built against kernel version 6.15-rc6.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=217503

Michael Kelley (5):
  Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple
    ranges
  hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages
  hv_netvsc: Preserve contiguous PFN grouping in the page buffer array
  hv_netvsc: Remove rmsg_pgcnt
  Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer()

 drivers/hv/channel.c              | 65 ++-----------------------------
 drivers/net/hyperv/hyperv_net.h   | 13 ++++++-
 drivers/net/hyperv/netvsc.c       | 57 ++++++++++++++++++++++-----
 drivers/net/hyperv/netvsc_drv.c   | 62 +++++++----------------------
 drivers/net/hyperv/rndis_filter.c | 24 +++---------
 drivers/scsi/storvsc_drv.c        |  1 +
 include/linux/hyperv.h            |  7 ----
 7 files changed, 83 insertions(+), 146 deletions(-)

-- 
2.25.1



* [PATCH net 1/5] Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges
  2025-05-13  0:05 [PATCH net 0/5] hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2" mhkelley58
@ 2025-05-13  0:06 ` mhkelley58
  2025-05-15 10:51   ` Simon Horman
  2025-05-13  0:06 ` [PATCH net 2/5] hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages mhkelley58
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: mhkelley58 @ 2025-05-13  0:06 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, James.Bottomley, martin.petersen
  Cc: linux-hyperv, linux-kernel, netdev, linux-scsi, stable

From: Michael Kelley <mhklinux@outlook.com>

vmbus_sendpacket_mpb_desc() is currently used only by the storvsc driver
and is hardcoded to create a single GPA range. To allow it to also be
used by the netvsc driver to create multiple GPA ranges, remove the
hardcoded assumption of a single GPA range, and instead have the calling
driver specify the rangecount in the supplied descriptor.

Update the storvsc driver to reflect this new approach.
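
As a minimal sketch of the new calling convention (only rangecount and
the vmbus_sendpacket_mpb_desc() signature are from this patch; the other
names are placeholders):

	desc->range.offset = offset;
	desc->range.len = len;
	desc->rangecount = 1;		/* now set by the caller */

	ret = vmbus_sendpacket_mpb_desc(channel, desc, desc_size,
					&msg, sizeof(msg), req_id);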

Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---
 drivers/hv/channel.c       | 6 +++---
 drivers/scsi/storvsc_drv.c | 1 +
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index fb8cd8469328..4ffd5eaa7817 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -1136,9 +1136,10 @@ int vmbus_sendpacket_pagebuffer(struct vmbus_channel *channel,
 EXPORT_SYMBOL_GPL(vmbus_sendpacket_pagebuffer);
 
 /*
- * vmbus_sendpacket_multipagebuffer - Send a multi-page buffer packet
+ * vmbus_sendpacket_mpb_desc - Send one or more multi-page buffer packets
  * using a GPADL Direct packet type.
- * The buffer includes the vmbus descriptor.
+ * The desc argument must include space for the VMBus descriptor. The
+ * rangecount field must already be set.
  */
 int vmbus_sendpacket_mpb_desc(struct vmbus_channel *channel,
 			      struct vmbus_packet_mpb_array *desc,
@@ -1160,7 +1161,6 @@ int vmbus_sendpacket_mpb_desc(struct vmbus_channel *channel,
 	desc->length8 = (u16)(packetlen_aligned >> 3);
 	desc->transactionid = VMBUS_RQST_ERROR; /* will be updated in hv_ringbuffer_write() */
 	desc->reserved = 0;
-	desc->rangecount = 1;
 
 	bufferlist[0].iov_base = desc;
 	bufferlist[0].iov_len = desc_size;
diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 35db061ae3ec..2e6b2412d2c9 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1819,6 +1819,7 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd)
 				return SCSI_MLQUEUE_DEVICE_BUSY;
 		}
 
+		payload->rangecount = 1;
 		payload->range.len = length;
 		payload->range.offset = offset_in_hvpg;
 
-- 
2.25.1



* [PATCH net 2/5] hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages
  2025-05-13  0:05 [PATCH net 0/5] hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2" mhkelley58
  2025-05-13  0:06 ` [PATCH net 1/5] Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges mhkelley58
@ 2025-05-13  0:06 ` mhkelley58
  2025-05-14  9:37   ` Simon Horman
  2025-05-13  0:06 ` [PATCH net 3/5] hv_netvsc: Preserve contiguous PFN grouping in the page buffer array mhkelley58
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: mhkelley58 @ 2025-05-13  0:06 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, James.Bottomley, martin.petersen
  Cc: linux-hyperv, linux-kernel, netdev, linux-scsi, stable

From: Michael Kelley <mhklinux@outlook.com>

netvsc currently uses vmbus_sendpacket_pagebuffer() to send VMBus
messages. This function creates a series of GPA ranges, each of which
contains a single PFN. However, if the rndis header in the VMBus
message crosses a page boundary, the netvsc protocol with the host
requires that both PFNs for the rndis header must be in a single "GPA
range" data structure, which isn't possible with
vmbus_sendpacket_pagebuffer(). As the first step in fixing this, add a
new function netvsc_build_mpb_array() to build a VMBus message with
multiple GPA ranges, each of which may contain multiple PFNs. Use
vmbus_sendpacket_mpb_desc() to send this VMBus message to the host.

There's no functional change since higher levels of netvsc don't
maintain or propagate knowledge of contiguous PFNs. Based on its
input, netvsc_build_mpb_array() still produces a separate GPA range
for each PFN and the behavior is the same as with
vmbus_sendpacket_pagebuffer(). But the groundwork is laid for a
subsequent patch to provide the necessary grouping.
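
For reference, a rough sketch of how the size of each variable-length
entry built by netvsc_build_mpb_array() works out, assuming the
hv_mpb_array layout used in the function (HVPFN_UP() gives the number of
Hyper-V pages needed to cover a byte count):

	/* One entry is the offset/len header plus one u64 PFN for every
	 * Hyper-V page touched by the range that starts at "offset".
	 */
	entry_size = sizeof(struct hv_mpb_array) +
		     HVPFN_UP(offset + len) * sizeof(u64);

	/* desc_size is the descriptor header plus the sum of all entries,
	 * which is what the function returns through *desc_size.
	 */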

Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---
 drivers/net/hyperv/netvsc.c | 50 +++++++++++++++++++++++++++++++++----
 1 file changed, 45 insertions(+), 5 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index d6f5b9ea3109..6d1705f87682 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -1055,6 +1055,42 @@ static int netvsc_dma_map(struct hv_device *hv_dev,
 	return 0;
 }
 
+/* Build an "array" of mpb entries describing the data to be transferred
+ * over VMBus. After the desc header fields, each "array" entry is variable
+ * size, and each entry starts after the end of the previous entry. The
+ * "offset" and "len" fields for each entry imply the size of the entry.
+ *
+ * The pfns are in HV_HYP_PAGE_SIZE, because all communication with Hyper-V
+ * uses that granularity, even if the system page size of the guest is larger.
+ * Each entry in the input "pb" array must describe a contiguous range of
+ * guest physical memory so that the pfns are sequential if the range crosses
+ * a page boundary. The offset field must be < HV_HYP_PAGE_SIZE.
+ */
+static inline void netvsc_build_mpb_array(struct hv_page_buffer *pb,
+				u32 page_buffer_count,
+				struct vmbus_packet_mpb_array *desc,
+				u32 *desc_size)
+{
+	struct hv_mpb_array *mpb_entry = &desc->range;
+	int i, j;
+
+	for (i = 0; i < page_buffer_count; i++) {
+		u32 offset = pb[i].offset;
+		u32 len = pb[i].len;
+
+		mpb_entry->offset = offset;
+		mpb_entry->len = len;
+
+		for (j = 0; j < HVPFN_UP(offset + len); j++)
+			mpb_entry->pfn_array[j] = pb[i].pfn + j;
+
+		mpb_entry = (struct hv_mpb_array *)&mpb_entry->pfn_array[j];
+	}
+
+	desc->rangecount = page_buffer_count;
+	*desc_size = (char *)mpb_entry - (char *)desc;
+}
+
 static inline int netvsc_send_pkt(
 	struct hv_device *device,
 	struct hv_netvsc_packet *packet,
@@ -1097,6 +1133,9 @@ static inline int netvsc_send_pkt(
 
 	packet->dma_range = NULL;
 	if (packet->page_buf_cnt) {
+		struct vmbus_channel_packet_page_buffer desc;
+		u32 desc_size;
+
 		if (packet->cp_partial)
 			pb += packet->rmsg_pgcnt;
 
@@ -1106,11 +1145,12 @@ static inline int netvsc_send_pkt(
 			goto exit;
 		}
 
-		ret = vmbus_sendpacket_pagebuffer(out_channel,
-						  pb, packet->page_buf_cnt,
-						  &nvmsg, sizeof(nvmsg),
-						  req_id);
-
+		netvsc_build_mpb_array(pb, packet->page_buf_cnt,
+				(struct vmbus_packet_mpb_array *)&desc,
+				 &desc_size);
+		ret = vmbus_sendpacket_mpb_desc(out_channel,
+				(struct vmbus_packet_mpb_array *)&desc,
+				desc_size, &nvmsg, sizeof(nvmsg), req_id);
 		if (ret)
 			netvsc_dma_unmap(ndev_ctx->device_ctx, packet);
 	} else {
-- 
2.25.1



* [PATCH net 3/5] hv_netvsc: Preserve contiguous PFN grouping in the page buffer array
  2025-05-13  0:05 [PATCH net 0/5] hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2" mhkelley58
  2025-05-13  0:06 ` [PATCH net 1/5] Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges mhkelley58
  2025-05-13  0:06 ` [PATCH net 2/5] hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages mhkelley58
@ 2025-05-13  0:06 ` mhkelley58
  2025-05-14  9:34   ` Simon Horman
  2025-05-13  0:06 ` [PATCH net 4/5] hv_netvsc: Remove rmsg_pgcnt mhkelley58
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: mhkelley58 @ 2025-05-13  0:06 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, James.Bottomley, martin.petersen
  Cc: linux-hyperv, linux-kernel, netdev, linux-scsi, stable

From: Michael Kelley <mhklinux@outlook.com>

Starting with commit dca5161f9bd0 ("hv_netvsc: Check status in
SEND_RNDIS_PKT completion message") in the 6.3 kernel, the Linux
driver for Hyper-V synthetic networking (netvsc) occasionally reports
"nvsp_rndis_pkt_complete error status: 2".[1] This error indicates
that Hyper-V has rejected a network packet transmit request from the
guest, and the outgoing network packet is dropped. Higher level
network protocols presumably recover and resend the packet so there is
no functional error, but performance is slightly impacted. Commit
dca5161f9bd0 is not the cause of the error -- it only added reporting
of an error that was already happening without any notice. The error
has presumably been present since the netvsc driver was originally
introduced into Linux.

The root cause of the problem is that the netvsc driver in Linux may
send an incorrectly formatted VMBus message to Hyper-V when
transmitting the network packet. The incorrect formatting occurs when
the rndis header of the VMBus message crosses a page boundary due to
how the Linux skb head memory is aligned. In such a case, two PFNs are
required to describe the location of the rndis header, even though
they are contiguous in guest physical address (GPA) space. Hyper-V
requires that two rndis header PFNs be in a single "GPA range" data
structure, but current netvsc code puts each PFN in its own GPA range,
which Hyper-V rejects as an error.

The incorrect formatting occurs only for larger packets that netvsc
must transmit via a VMBus "GPA Direct" message. There's no problem
when netvsc transmits a smaller packet by copying it into a pre-
allocated send buffer slot because the pre-allocated slots don't have
page crossing issues.

After commit 14ad6ed30a10 ("net: allow small head cache usage with
large MAX_SKB_FRAGS values") in the 6.14-rc4 kernel, the error occurs
much more frequently in VMs with 16 or more vCPUs. It may occur every
few seconds, or even more frequently, in an ssh session that outputs a
lot of text. Commit 14ad6ed30a10 subtly changes how skb head memory is
allocated, making it much more likely that the rndis header will cross
a page boundary when the vCPU count is 16 or more. The changes in
commit 14ad6ed30a10 are perfectly valid -- they just had the side
effect of making the netvsc bug more prominent.

Current code in init_page_array() creates a separate page buffer array
entry for each PFN required to identify the data to be transmitted.
Contiguous PFNs get separate entries in the page buffer array, and any
information about contiguity is lost.

Fix the core issue by having init_page_array() construct the page
buffer array to represent contiguous ranges rather than individual
pages. When these ranges are subsequently passed to
netvsc_build_mpb_array(), it can build GPA ranges that contain
multiple PFNs, as required to avoid the error "nvsp_rndis_pkt_complete
error status: 2". If instead the network packet is sent by copying
into a pre-allocated send buffer slot, the copy proceeds using the
contiguous ranges rather than individual pages, but the result of the
copying is the same. Also fix rndis_filter_send_request() to construct
a contiguous range, since it has its own page buffer array.
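
For illustration, the page buffer array that init_page_array() builds
after this change looks roughly like the following. The offsets, lengths,
and the hdr_pfn/data_pfn names are made-up placeholders; only the slot
layout is the point:

	pb[0].offset = 3900;		/* rndis header; may span two pages */
	pb[0].len = 300;
	pb[0].pfn = hdr_pfn;

	pb[1].offset = 128;		/* skb linear data */
	pb[1].len = skb_headlen(skb);
	pb[1].pfn = data_pfn;

	/* pb[2] .. pb[frags + 1]: one entry per skb fragment, each a
	 * contiguous range, so the function returns frags + 2.
	 */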

This change has a side benefit in CoCo VMs in that netvsc_dma_map()
calls dma_map_single() on each contiguous range instead of on each
page. This results in fewer calls to dma_map_single() but on larger
chunks of memory, which should reduce contention on the swiotlb.

Since the page buffer array now contains one entry for each contiguous
range instead of for each individual page, the number of entries in
the array can be reduced, saving 208 bytes of stack space in
netvsc_xmit() when MAX_SKB_FRAGS has the default value of 17.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=217503

Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217503
Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---
 drivers/net/hyperv/hyperv_net.h   | 12 ++++++
 drivers/net/hyperv/netvsc_drv.c   | 63 ++++++++-----------------------
 drivers/net/hyperv/rndis_filter.c | 24 +++---------
 3 files changed, 32 insertions(+), 67 deletions(-)

diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 70f7cb383228..76725f25abd5 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -893,6 +893,18 @@ struct nvsp_message {
 				 sizeof(struct nvsp_message))
 #define NETVSC_MIN_IN_MSG_SIZE sizeof(struct vmpacket_descriptor)
 
+/* Maximum # of contiguous data ranges that can make up a transmitted packet.
+ * Typically it's the max SKB fragments plus 2 for the rndis packet and the
+ * linear portion of the SKB. But if MAX_SKB_FRAGS is large, the value may
+ * need to be limited to MAX_PAGE_BUFFER_COUNT, which is the max # of entries
+ * in a GPA direct packet sent to netvsp over VMBus.
+ */
+#if MAX_SKB_FRAGS + 2 < MAX_PAGE_BUFFER_COUNT
+#define MAX_DATA_RANGES (MAX_SKB_FRAGS + 2)
+#else
+#define MAX_DATA_RANGES MAX_PAGE_BUFFER_COUNT
+#endif
+
 /* Estimated requestor size:
  * out_ring_size/min_out_msg_size + in_ring_size/min_in_msg_size
  */
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index c51b318b8a72..929f6b3de768 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -326,43 +326,10 @@ static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb,
 	return txq;
 }
 
-static u32 fill_pg_buf(unsigned long hvpfn, u32 offset, u32 len,
-		       struct hv_page_buffer *pb)
-{
-	int j = 0;
-
-	hvpfn += offset >> HV_HYP_PAGE_SHIFT;
-	offset = offset & ~HV_HYP_PAGE_MASK;
-
-	while (len > 0) {
-		unsigned long bytes;
-
-		bytes = HV_HYP_PAGE_SIZE - offset;
-		if (bytes > len)
-			bytes = len;
-		pb[j].pfn = hvpfn;
-		pb[j].offset = offset;
-		pb[j].len = bytes;
-
-		offset += bytes;
-		len -= bytes;
-
-		if (offset == HV_HYP_PAGE_SIZE && len) {
-			hvpfn++;
-			offset = 0;
-			j++;
-		}
-	}
-
-	return j + 1;
-}
-
 static u32 init_page_array(void *hdr, u32 len, struct sk_buff *skb,
 			   struct hv_netvsc_packet *packet,
 			   struct hv_page_buffer *pb)
 {
-	u32 slots_used = 0;
-	char *data = skb->data;
 	int frags = skb_shinfo(skb)->nr_frags;
 	int i;
 
@@ -371,28 +338,28 @@ static u32 init_page_array(void *hdr, u32 len, struct sk_buff *skb,
 	 * 2. skb linear data
 	 * 3. skb fragment data
 	 */
-	slots_used += fill_pg_buf(virt_to_hvpfn(hdr),
-				  offset_in_hvpage(hdr),
-				  len,
-				  &pb[slots_used]);
 
+	pb[0].offset = offset_in_hvpage(hdr);
+	pb[0].len = len;
+	pb[0].pfn = virt_to_hvpfn(hdr);
 	packet->rmsg_size = len;
-	packet->rmsg_pgcnt = slots_used;
+	packet->rmsg_pgcnt = 1;
 
-	slots_used += fill_pg_buf(virt_to_hvpfn(data),
-				  offset_in_hvpage(data),
-				  skb_headlen(skb),
-				  &pb[slots_used]);
+	pb[1].offset = offset_in_hvpage(skb->data);
+	pb[1].len = skb_headlen(skb);
+	pb[1].pfn = virt_to_hvpfn(skb->data);
 
 	for (i = 0; i < frags; i++) {
 		skb_frag_t *frag = skb_shinfo(skb)->frags + i;
+		struct hv_page_buffer *cur_pb = &pb[i + 2];
+		u64 pfn = page_to_hvpfn(skb_frag_page(frag));
+		u32 offset = skb_frag_off(frag);
 
-		slots_used += fill_pg_buf(page_to_hvpfn(skb_frag_page(frag)),
-					  skb_frag_off(frag),
-					  skb_frag_size(frag),
-					  &pb[slots_used]);
+		cur_pb->offset = offset_in_hvpage(offset);
+		cur_pb->len = skb_frag_size(frag);
+		cur_pb->pfn = pfn + (offset >> HV_HYP_PAGE_SHIFT);
 	}
-	return slots_used;
+	return frags + 2;
 }
 
 static int count_skb_frag_slots(struct sk_buff *skb)
@@ -483,7 +450,7 @@ static int netvsc_xmit(struct sk_buff *skb, struct net_device *net, bool xdp_tx)
 	struct net_device *vf_netdev;
 	u32 rndis_msg_size;
 	u32 hash;
-	struct hv_page_buffer pb[MAX_PAGE_BUFFER_COUNT];
+	struct hv_page_buffer pb[MAX_DATA_RANGES];
 
 	/* If VF is present and up then redirect packets to it.
 	 * Skip the VF if it is marked down or has no carrier.
diff --git a/drivers/net/hyperv/rndis_filter.c b/drivers/net/hyperv/rndis_filter.c
index 82747dfacd70..9e73959e61ee 100644
--- a/drivers/net/hyperv/rndis_filter.c
+++ b/drivers/net/hyperv/rndis_filter.c
@@ -225,8 +225,7 @@ static int rndis_filter_send_request(struct rndis_device *dev,
 				  struct rndis_request *req)
 {
 	struct hv_netvsc_packet *packet;
-	struct hv_page_buffer page_buf[2];
-	struct hv_page_buffer *pb = page_buf;
+	struct hv_page_buffer pb;
 	int ret;
 
 	/* Setup the packet to send it */
@@ -235,27 +234,14 @@ static int rndis_filter_send_request(struct rndis_device *dev,
 	packet->total_data_buflen = req->request_msg.msg_len;
 	packet->page_buf_cnt = 1;
 
-	pb[0].pfn = virt_to_phys(&req->request_msg) >>
-					HV_HYP_PAGE_SHIFT;
-	pb[0].len = req->request_msg.msg_len;
-	pb[0].offset = offset_in_hvpage(&req->request_msg);
-
-	/* Add one page_buf when request_msg crossing page boundary */
-	if (pb[0].offset + pb[0].len > HV_HYP_PAGE_SIZE) {
-		packet->page_buf_cnt++;
-		pb[0].len = HV_HYP_PAGE_SIZE -
-			pb[0].offset;
-		pb[1].pfn = virt_to_phys((void *)&req->request_msg
-			+ pb[0].len) >> HV_HYP_PAGE_SHIFT;
-		pb[1].offset = 0;
-		pb[1].len = req->request_msg.msg_len -
-			pb[0].len;
-	}
+	pb.pfn = virt_to_phys(&req->request_msg) >> HV_HYP_PAGE_SHIFT;
+	pb.len = req->request_msg.msg_len;
+	pb.offset = offset_in_hvpage(&req->request_msg);
 
 	trace_rndis_send(dev->ndev, 0, &req->request_msg);
 
 	rcu_read_lock_bh();
-	ret = netvsc_send(dev->ndev, packet, NULL, pb, NULL, false);
+	ret = netvsc_send(dev->ndev, packet, NULL, &pb, NULL, false);
 	rcu_read_unlock_bh();
 
 	return ret;
-- 
2.25.1



* [PATCH net 4/5] hv_netvsc: Remove rmsg_pgcnt
  2025-05-13  0:05 [PATCH net 0/5] hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2" mhkelley58
                   ` (2 preceding siblings ...)
  2025-05-13  0:06 ` [PATCH net 3/5] hv_netvsc: Preserve contiguous PFN grouping in the page buffer array mhkelley58
@ 2025-05-13  0:06 ` mhkelley58
  2025-05-15 10:55   ` Simon Horman
  2025-05-13  0:06 ` [PATCH net 5/5] Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer() mhkelley58
  2025-05-15  3:00 ` [PATCH net 0/5] hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2" patchwork-bot+netdevbpf
  5 siblings, 1 reply; 15+ messages in thread
From: mhkelley58 @ 2025-05-13  0:06 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, James.Bottomley, martin.petersen
  Cc: linux-hyperv, linux-kernel, netdev, linux-scsi, stable

From: Michael Kelley <mhklinux@outlook.com>

init_page_array() now always creates a single page buffer array entry
for the rndis message, even if the rndis message crosses a page
boundary. As such, the number of page buffer array entries used for
the rndis message must no longer be tracked -- it is always just 1.
Remove the rmsg_pgcnt field and use "1" where the value is needed.

Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---
 drivers/net/hyperv/hyperv_net.h | 1 -
 drivers/net/hyperv/netvsc.c     | 7 +++----
 drivers/net/hyperv/netvsc_drv.c | 1 -
 3 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 76725f25abd5..cb6f5482d203 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -158,7 +158,6 @@ struct hv_netvsc_packet {
 	u8 cp_partial; /* partial copy into send buffer */
 
 	u8 rmsg_size; /* RNDIS header and PPI size */
-	u8 rmsg_pgcnt; /* page count of RNDIS header and PPI */
 	u8 page_buf_cnt;
 
 	u16 q_idx;
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 6d1705f87682..41661d5c61ab 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -953,8 +953,7 @@ static void netvsc_copy_to_send_buf(struct netvsc_device *net_device,
 		     + pend_size;
 	int i;
 	u32 padding = 0;
-	u32 page_count = packet->cp_partial ? packet->rmsg_pgcnt :
-		packet->page_buf_cnt;
+	u32 page_count = packet->cp_partial ? 1 : packet->page_buf_cnt;
 	u32 remain;
 
 	/* Add padding */
@@ -1137,7 +1136,7 @@ static inline int netvsc_send_pkt(
 		u32 desc_size;
 
 		if (packet->cp_partial)
-			pb += packet->rmsg_pgcnt;
+			pb++;
 
 		ret = netvsc_dma_map(ndev_ctx->device_ctx, packet, pb);
 		if (ret) {
@@ -1299,7 +1298,7 @@ int netvsc_send(struct net_device *ndev,
 		packet->send_buf_index = section_index;
 
 		if (packet->cp_partial) {
-			packet->page_buf_cnt -= packet->rmsg_pgcnt;
+			packet->page_buf_cnt--;
 			packet->total_data_buflen = msd_len + packet->rmsg_size;
 		} else {
 			packet->page_buf_cnt = 0;
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 929f6b3de768..d8b169ac0343 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -343,7 +343,6 @@ static u32 init_page_array(void *hdr, u32 len, struct sk_buff *skb,
 	pb[0].len = len;
 	pb[0].pfn = virt_to_hvpfn(hdr);
 	packet->rmsg_size = len;
-	packet->rmsg_pgcnt = 1;
 
 	pb[1].offset = offset_in_hvpage(skb->data);
 	pb[1].len = skb_headlen(skb);
-- 
2.25.1



* [PATCH net 5/5] Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer()
  2025-05-13  0:05 [PATCH net 0/5] hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2" mhkelley58
                   ` (3 preceding siblings ...)
  2025-05-13  0:06 ` [PATCH net 4/5] hv_netvsc: Remove rmsg_pgcnt mhkelley58
@ 2025-05-13  0:06 ` mhkelley58
  2025-05-15  3:00 ` [PATCH net 0/5] hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2" patchwork-bot+netdevbpf
  5 siblings, 0 replies; 15+ messages in thread
From: mhkelley58 @ 2025-05-13  0:06 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, James.Bottomley, martin.petersen
  Cc: linux-hyperv, linux-kernel, netdev, linux-scsi, stable

From: Michael Kelley <mhklinux@outlook.com>

With the netvsc driver changed to use vmbus_sendpacket_mpb_desc()
instead of vmbus_sendpacket_pagebuffer(), the latter has no remaining
callers. Remove it.

Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---
 drivers/hv/channel.c   | 59 ------------------------------------------
 include/linux/hyperv.h |  7 -----
 2 files changed, 66 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 4ffd5eaa7817..35f26fa1ffe7 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -1076,65 +1076,6 @@ int vmbus_sendpacket(struct vmbus_channel *channel, void *buffer,
 }
 EXPORT_SYMBOL(vmbus_sendpacket);
 
-/*
- * vmbus_sendpacket_pagebuffer - Send a range of single-page buffer
- * packets using a GPADL Direct packet type. This interface allows you
- * to control notifying the host. This will be useful for sending
- * batched data. Also the sender can control the send flags
- * explicitly.
- */
-int vmbus_sendpacket_pagebuffer(struct vmbus_channel *channel,
-				struct hv_page_buffer pagebuffers[],
-				u32 pagecount, void *buffer, u32 bufferlen,
-				u64 requestid)
-{
-	int i;
-	struct vmbus_channel_packet_page_buffer desc;
-	u32 descsize;
-	u32 packetlen;
-	u32 packetlen_aligned;
-	struct kvec bufferlist[3];
-	u64 aligned_data = 0;
-
-	if (pagecount > MAX_PAGE_BUFFER_COUNT)
-		return -EINVAL;
-
-	/*
-	 * Adjust the size down since vmbus_channel_packet_page_buffer is the
-	 * largest size we support
-	 */
-	descsize = sizeof(struct vmbus_channel_packet_page_buffer) -
-			  ((MAX_PAGE_BUFFER_COUNT - pagecount) *
-			  sizeof(struct hv_page_buffer));
-	packetlen = descsize + bufferlen;
-	packetlen_aligned = ALIGN(packetlen, sizeof(u64));
-
-	/* Setup the descriptor */
-	desc.type = VM_PKT_DATA_USING_GPA_DIRECT;
-	desc.flags = VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED;
-	desc.dataoffset8 = descsize >> 3; /* in 8-bytes granularity */
-	desc.length8 = (u16)(packetlen_aligned >> 3);
-	desc.transactionid = VMBUS_RQST_ERROR; /* will be updated in hv_ringbuffer_write() */
-	desc.reserved = 0;
-	desc.rangecount = pagecount;
-
-	for (i = 0; i < pagecount; i++) {
-		desc.range[i].len = pagebuffers[i].len;
-		desc.range[i].offset = pagebuffers[i].offset;
-		desc.range[i].pfn	 = pagebuffers[i].pfn;
-	}
-
-	bufferlist[0].iov_base = &desc;
-	bufferlist[0].iov_len = descsize;
-	bufferlist[1].iov_base = buffer;
-	bufferlist[1].iov_len = bufferlen;
-	bufferlist[2].iov_base = &aligned_data;
-	bufferlist[2].iov_len = (packetlen_aligned - packetlen);
-
-	return hv_ringbuffer_write(channel, bufferlist, 3, requestid, NULL);
-}
-EXPORT_SYMBOL_GPL(vmbus_sendpacket_pagebuffer);
-
 /*
  * vmbus_sendpacket_mpb_desc - Send one or more multi-page buffer packets
  * using a GPADL Direct packet type.
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index d6ffe01962c2..b52ac40d5830 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1167,13 +1167,6 @@ extern int vmbus_sendpacket(struct vmbus_channel *channel,
 				  enum vmbus_packet_type type,
 				  u32 flags);
 
-extern int vmbus_sendpacket_pagebuffer(struct vmbus_channel *channel,
-					    struct hv_page_buffer pagebuffers[],
-					    u32 pagecount,
-					    void *buffer,
-					    u32 bufferlen,
-					    u64 requestid);
-
 extern int vmbus_sendpacket_mpb_desc(struct vmbus_channel *channel,
 				     struct vmbus_packet_mpb_array *mpb,
 				     u32 desc_size,
-- 
2.25.1



* Re: [PATCH net 3/5] hv_netvsc: Preserve contiguous PFN grouping in the page buffer array
  2025-05-13  0:06 ` [PATCH net 3/5] hv_netvsc: Preserve contiguous PFN grouping in the page buffer array mhkelley58
@ 2025-05-14  9:34   ` Simon Horman
  2025-05-14 15:42     ` Michael Kelley
  0 siblings, 1 reply; 15+ messages in thread
From: Simon Horman @ 2025-05-14  9:34 UTC (permalink / raw)
  To: mhklinux
  Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, James.Bottomley, martin.petersen, linux-hyperv,
	linux-kernel, netdev, linux-scsi, stable

On Mon, May 12, 2025 at 05:06:02PM -0700, mhkelley58@gmail.com wrote:
> From: Michael Kelley <mhklinux@outlook.com>
> 
> Starting with commit dca5161f9bd0 ("hv_netvsc: Check status in
> SEND_RNDIS_PKT completion message") in the 6.3 kernel, the Linux
> driver for Hyper-V synthetic networking (netvsc) occasionally reports
> "nvsp_rndis_pkt_complete error status: 2".[1] This error indicates
> that Hyper-V has rejected a network packet transmit request from the
> guest, and the outgoing network packet is dropped. Higher level
> network protocols presumably recover and resend the packet so there is
> no functional error, but performance is slightly impacted. Commit
> dca5161f9bd0 is not the cause of the error -- it only added reporting
> of an error that was already happening without any notice. The error
> has presumably been present since the netvsc driver was originally
> introduced into Linux.
> 
> The root cause of the problem is that the netvsc driver in Linux may
> send an incorrectly formatted VMBus message to Hyper-V when
> transmitting the network packet. The incorrect formatting occurs when
> the rndis header of the VMBus message crosses a page boundary due to
> how the Linux skb head memory is aligned. In such a case, two PFNs are
> required to describe the location of the rndis header, even though
> they are contiguous in guest physical address (GPA) space. Hyper-V
> requires that two rndis header PFNs be in a single "GPA range" data
> structure, but current netvsc code puts each PFN in its own GPA range,
> which Hyper-V rejects as an error.
> 
> The incorrect formatting occurs only for larger packets that netvsc
> must transmit via a VMBus "GPA Direct" message. There's no problem
> when netvsc transmits a smaller packet by copying it into a pre-
> allocated send buffer slot because the pre-allocated slots don't have
> page crossing issues.
> 
> After commit 14ad6ed30a10 ("net: allow small head cache usage with
> large MAX_SKB_FRAGS values") in the 6.14-rc4 kernel, the error occurs
> much more frequently in VMs with 16 or more vCPUs. It may occur every
> few seconds, or even more frequently, in an ssh session that outputs a
> lot of text. Commit 14ad6ed30a10 subtly changes how skb head memory is
> allocated, making it much more likely that the rndis header will cross
> a page boundary when the vCPU count is 16 or more. The changes in
> commit 14ad6ed30a10 are perfectly valid -- they just had the side
> effect of making the netvsc bug more prominent.
> 
> Current code in init_page_array() creates a separate page buffer array
> entry for each PFN required to identify the data to be transmitted.
> Contiguous PFNs get separate entries in the page buffer array, and any
> information about contiguity is lost.
> 
> Fix the core issue by having init_page_array() construct the page
> buffer array to represent contiguous ranges rather than individual
> pages. When these ranges are subsequently passed to
> netvsc_build_mpb_array(), it can build GPA ranges that contain
> multiple PFNs, as required to avoid the error "nvsp_rndis_pkt_complete
> error status: 2". If instead the network packet is sent by copying
> into a pre-allocated send buffer slot, the copy proceeds using the
> contiguous ranges rather than individual pages, but the result of the
> copying is the same. Also fix rndis_filter_send_request() to construct
> a contiguous range, since it has its own page buffer array.
> 
> This change has a side benefit in CoCo VMs in that netvsc_dma_map()
> calls dma_map_single() on each contiguous range instead of on each
> page. This results in fewer calls to dma_map_single() but on larger
> chunks of memory, which should reduce contention on the swiotlb.
> 
> Since the page buffer array now contains one entry for each contiguous
> range instead of for each individual page, the number of entries in
> the array can be reduced, saving 208 bytes of stack space in
> netvsc_xmit() when MAX_SKB_FRAGS has the default value of 17.
> 
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=217503
> 
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217503
> Cc: <stable@vger.kernel.org> # 6.1.x
> Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> ---
>  drivers/net/hyperv/hyperv_net.h   | 12 ++++++
>  drivers/net/hyperv/netvsc_drv.c   | 63 ++++++++-----------------------
>  drivers/net/hyperv/rndis_filter.c | 24 +++---------
>  3 files changed, 32 insertions(+), 67 deletions(-)
> 
> diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
> index 70f7cb383228..76725f25abd5 100644
> --- a/drivers/net/hyperv/hyperv_net.h
> +++ b/drivers/net/hyperv/hyperv_net.h
> @@ -893,6 +893,18 @@ struct nvsp_message {
>  				 sizeof(struct nvsp_message))
>  #define NETVSC_MIN_IN_MSG_SIZE sizeof(struct vmpacket_descriptor)
>  
> +/* Maximum # of contiguous data ranges that can make up a transmitted packet.
> + * Typically it's the max SKB fragments plus 2 for the rndis packet and the
> + * linear portion of the SKB. But if MAX_SKB_FRAGS is large, the value may
> + * need to be limited to MAX_PAGE_BUFFER_COUNT, which is the max # of entries
> + * in a GPA direct packet sent to netvsp over VMBus.
> + */
> +#if MAX_SKB_FRAGS + 2 < MAX_PAGE_BUFFER_COUNT
> +#define MAX_DATA_RANGES (MAX_SKB_FRAGS + 2)
> +#else
> +#define MAX_DATA_RANGES MAX_PAGE_BUFFER_COUNT
> +#endif
> +
>  /* Estimated requestor size:
>   * out_ring_size/min_out_msg_size + in_ring_size/min_in_msg_size
>   */

...

> @@ -371,28 +338,28 @@ static u32 init_page_array(void *hdr, u32 len, struct sk_buff *skb,
>  	 * 2. skb linear data
>  	 * 3. skb fragment data
>  	 */
> -	slots_used += fill_pg_buf(virt_to_hvpfn(hdr),
> -				  offset_in_hvpage(hdr),
> -				  len,
> -				  &pb[slots_used]);
>  
> +	pb[0].offset = offset_in_hvpage(hdr);
> +	pb[0].len = len;
> +	pb[0].pfn = virt_to_hvpfn(hdr);
>  	packet->rmsg_size = len;
> -	packet->rmsg_pgcnt = slots_used;
> +	packet->rmsg_pgcnt = 1;
>  
> -	slots_used += fill_pg_buf(virt_to_hvpfn(data),
> -				  offset_in_hvpage(data),
> -				  skb_headlen(skb),
> -				  &pb[slots_used]);
> +	pb[1].offset = offset_in_hvpage(skb->data);
> +	pb[1].len = skb_headlen(skb);
> +	pb[1].pfn = virt_to_hvpfn(skb->data);
>  
>  	for (i = 0; i < frags; i++) {
>  		skb_frag_t *frag = skb_shinfo(skb)->frags + i;
> +		struct hv_page_buffer *cur_pb = &pb[i + 2];

Hi Michael,

If I got things right then pb is allocated on the stack
in netvsc_xmit and has MAX_DATA_RANGES elements.

If MAX_SKB_FRAGS is large and MAX_DATA_RANGES has been limited to
MAX_PAGE_BUFFER_COUNT, and frags is large, is it possible to overrun pb here?

> +		u64 pfn = page_to_hvpfn(skb_frag_page(frag));
> +		u32 offset = skb_frag_off(frag);
>  
> -		slots_used += fill_pg_buf(page_to_hvpfn(skb_frag_page(frag)),
> -					  skb_frag_off(frag),
> -					  skb_frag_size(frag),
> -					  &pb[slots_used]);
> +		cur_pb->offset = offset_in_hvpage(offset);
> +		cur_pb->len = skb_frag_size(frag);
> +		cur_pb->pfn = pfn + (offset >> HV_HYP_PAGE_SHIFT);
>  	}
> -	return slots_used;
> +	return frags + 2;
>  }
>  
>  static int count_skb_frag_slots(struct sk_buff *skb)
> @@ -483,7 +450,7 @@ static int netvsc_xmit(struct sk_buff *skb, struct net_device *net, bool xdp_tx)
>  	struct net_device *vf_netdev;
>  	u32 rndis_msg_size;
>  	u32 hash;
> -	struct hv_page_buffer pb[MAX_PAGE_BUFFER_COUNT];
> +	struct hv_page_buffer pb[MAX_DATA_RANGES];
>  
>  	/* If VF is present and up then redirect packets to it.
>  	 * Skip the VF if it is marked down or has no carrier.

...


* Re: [PATCH net 2/5] hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages
  2025-05-13  0:06 ` [PATCH net 2/5] hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages mhkelley58
@ 2025-05-14  9:37   ` Simon Horman
  2025-05-14 15:44     ` Michael Kelley
  0 siblings, 1 reply; 15+ messages in thread
From: Simon Horman @ 2025-05-14  9:37 UTC (permalink / raw)
  To: mhklinux
  Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, James.Bottomley, martin.petersen, linux-hyperv,
	linux-kernel, netdev, linux-scsi, stable

On Mon, May 12, 2025 at 05:06:01PM -0700, mhkelley58@gmail.com wrote:
> From: Michael Kelley <mhklinux@outlook.com>
> 
> netvsc currently uses vmbus_sendpacket_pagebuffer() to send VMBus
> messages. This function creates a series of GPA ranges, each of which
> contains a single PFN. However, if the rndis header in the VMBus
> message crosses a page boundary, the netvsc protocol with the host
> requires that both PFNs for the rndis header must be in a single "GPA
> range" data structure, which isn't possible with
> vmbus_sendpacket_pagebuffer(). As the first step in fixing this, add a
> new function netvsc_build_mpb_array() to build a VMBus message with
> multiple GPA ranges, each of which may contain multiple PFNs. Use
> vmbus_sendpacket_mpb_desc() to send this VMBus message to the host.
> 
> There's no functional change since higher levels of netvsc don't
> maintain or propagate knowledge of contiguous PFNs. Based on its
> input, netvsc_build_mpb_array() still produces a separate GPA range
> for each PFN and the behavior is the same as with
> vmbus_sendpacket_pagebuffer(). But the groundwork is laid for a
> subsequent patch to provide the necessary grouping.
> 
> Cc: <stable@vger.kernel.org> # 6.1.x
> Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> ---
>  drivers/net/hyperv/netvsc.c | 50 +++++++++++++++++++++++++++++++++----
>  1 file changed, 45 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
> index d6f5b9ea3109..6d1705f87682 100644
> --- a/drivers/net/hyperv/netvsc.c
> +++ b/drivers/net/hyperv/netvsc.c
> @@ -1055,6 +1055,42 @@ static int netvsc_dma_map(struct hv_device *hv_dev,
>  	return 0;
>  }
>  
> +/* Build an "array" of mpb entries describing the data to be transferred
> + * over VMBus. After the desc header fields, each "array" entry is variable
> + * size, and each entry starts after the end of the previous entry. The
> + * "offset" and "len" fields for each entry imply the size of the entry.
> + *
> + * The pfns are in HV_HYP_PAGE_SIZE, because all communication with Hyper-V
> + * uses that granularity, even if the system page size of the guest is larger.
> + * Each entry in the input "pb" array must describe a contiguous range of
> + * guest physical memory so that the pfns are sequential if the range crosses
> + * a page boundary. The offset field must be < HV_HYP_PAGE_SIZE.

Hi Michael,

Is there a guarantee that this constraint is met? And moreover, is there a
guarantee that all of the entries will fit in desc? I am slightly concerned
that there may be an overrun lurking here.

...


* RE: [PATCH net 3/5] hv_netvsc: Preserve contiguous PFN grouping in the page buffer array
  2025-05-14  9:34   ` Simon Horman
@ 2025-05-14 15:42     ` Michael Kelley
  2025-05-15 10:40       ` Simon Horman
  0 siblings, 1 reply; 15+ messages in thread
From: Michael Kelley @ 2025-05-14 15:42 UTC (permalink / raw)
  To: Simon Horman
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, andrew+netdev@lunn.ch, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	James.Bottomley@hansenpartnership.com, martin.petersen@oracle.com,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-scsi@vger.kernel.org,
	stable@vger.kernel.org

From: Simon Horman <horms@kernel.org> Sent: Wednesday, May 14, 2025 2:35 AM
> 
> On Mon, May 12, 2025 at 05:06:02PM -0700, mhkelley58@gmail.com wrote:
> > From: Michael Kelley <mhklinux@outlook.com>
> >
> > Starting with commit dca5161f9bd0 ("hv_netvsc: Check status in
> > SEND_RNDIS_PKT completion message") in the 6.3 kernel, the Linux
> > driver for Hyper-V synthetic networking (netvsc) occasionally reports
> > "nvsp_rndis_pkt_complete error status: 2".[1] This error indicates
> > that Hyper-V has rejected a network packet transmit request from the
> > guest, and the outgoing network packet is dropped. Higher level
> > network protocols presumably recover and resend the packet so there is
> > no functional error, but performance is slightly impacted. Commit
> > dca5161f9bd0 is not the cause of the error -- it only added reporting
> > of an error that was already happening without any notice. The error
> > has presumably been present since the netvsc driver was originally
> > introduced into Linux.
> >
> > The root cause of the problem is that the netvsc driver in Linux may
> > send an incorrectly formatted VMBus message to Hyper-V when
> > transmitting the network packet. The incorrect formatting occurs when
> > the rndis header of the VMBus message crosses a page boundary due to
> > how the Linux skb head memory is aligned. In such a case, two PFNs are
> > required to describe the location of the rndis header, even though
> > they are contiguous in guest physical address (GPA) space. Hyper-V
> > requires that two rndis header PFNs be in a single "GPA range" data
> > structure, but current netvsc code puts each PFN in its own GPA range,
> > which Hyper-V rejects as an error.
> >
> > The incorrect formatting occurs only for larger packets that netvsc
> > must transmit via a VMBus "GPA Direct" message. There's no problem
> > when netvsc transmits a smaller packet by copying it into a pre-
> > allocated send buffer slot because the pre-allocated slots don't have
> > page crossing issues.
> >
> > After commit 14ad6ed30a10 ("net: allow small head cache usage with
> > large MAX_SKB_FRAGS values") in the 6.14-rc4 kernel, the error occurs
> > much more frequently in VMs with 16 or more vCPUs. It may occur every
> > few seconds, or even more frequently, in an ssh session that outputs a
> > lot of text. Commit 14ad6ed30a10 subtly changes how skb head memory is
> > allocated, making it much more likely that the rndis header will cross
> > a page boundary when the vCPU count is 16 or more. The changes in
> > commit 14ad6ed30a10 are perfectly valid -- they just had the side
> > effect of making the netvsc bug more prominent.
> >
> > Current code in init_page_array() creates a separate page buffer array
> > entry for each PFN required to identify the data to be transmitted.
> > Contiguous PFNs get separate entries in the page buffer array, and any
> > information about contiguity is lost.
> >
> > Fix the core issue by having init_page_array() construct the page
> > buffer array to represent contiguous ranges rather than individual
> > pages. When these ranges are subsequently passed to
> > netvsc_build_mpb_array(), it can build GPA ranges that contain
> > multiple PFNs, as required to avoid the error "nvsp_rndis_pkt_complete
> > error status: 2". If instead the network packet is sent by copying
> > into a pre-allocated send buffer slot, the copy proceeds using the
> > contiguous ranges rather than individual pages, but the result of the
> > copying is the same. Also fix rndis_filter_send_request() to construct
> > a contiguous range, since it has its own page buffer array.
> >
> > This change has a side benefit in CoCo VMs in that netvsc_dma_map()
> > calls dma_map_single() on each contiguous range instead of on each
> > page. This results in fewer calls to dma_map_single() but on larger
> > chunks of memory, which should reduce contention on the swiotlb.
> >
> > Since the page buffer array now contains one entry for each contiguous
> > range instead of for each individual page, the number of entries in
> > the array can be reduced, saving 208 bytes of stack space in
> > netvsc_xmit() when MAX_SKB_FRAGS has the default value of 17.
> >
> > [1] https://bugzilla.kernel.org/show_bug.cgi?id=217503
> >
> > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217503
> > Cc: <stable@vger.kernel.org> # 6.1.x
> > Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> > ---
> >  drivers/net/hyperv/hyperv_net.h   | 12 ++++++
> >  drivers/net/hyperv/netvsc_drv.c   | 63 ++++++++-----------------------
> >  drivers/net/hyperv/rndis_filter.c | 24 +++---------
> >  3 files changed, 32 insertions(+), 67 deletions(-)
> >
> > diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
> > index 70f7cb383228..76725f25abd5 100644
> > --- a/drivers/net/hyperv/hyperv_net.h
> > +++ b/drivers/net/hyperv/hyperv_net.h
> > @@ -893,6 +893,18 @@ struct nvsp_message {
> >  				 sizeof(struct nvsp_message))
> >  #define NETVSC_MIN_IN_MSG_SIZE sizeof(struct vmpacket_descriptor)
> >
> > +/* Maximum # of contiguous data ranges that can make up a transmitted packet.
> > + * Typically it's the max SKB fragments plus 2 for the rndis packet and the
> > + * linear portion of the SKB. But if MAX_SKB_FRAGS is large, the value may
> > + * need to be limited to MAX_PAGE_BUFFER_COUNT, which is the max # of entries
> > + * in a GPA direct packet sent to netvsp over VMBus.
> > + */
> > +#if MAX_SKB_FRAGS + 2 < MAX_PAGE_BUFFER_COUNT
> > +#define MAX_DATA_RANGES (MAX_SKB_FRAGS + 2)
> > +#else
> > +#define MAX_DATA_RANGES MAX_PAGE_BUFFER_COUNT
> > +#endif
> > +
> >  /* Estimated requestor size:
> >   * out_ring_size/min_out_msg_size + in_ring_size/min_in_msg_size
> >   */
> 
> ...
> 
> > @@ -371,28 +338,28 @@ static u32 init_page_array(void *hdr, u32 len, struct
> sk_buff *skb,
> >  	 * 2. skb linear data
> >  	 * 3. skb fragment data
> >  	 */
> > -	slots_used += fill_pg_buf(virt_to_hvpfn(hdr),
> > -				  offset_in_hvpage(hdr),
> > -				  len,
> > -				  &pb[slots_used]);
> >
> > +	pb[0].offset = offset_in_hvpage(hdr);
> > +	pb[0].len = len;
> > +	pb[0].pfn = virt_to_hvpfn(hdr);
> >  	packet->rmsg_size = len;
> > -	packet->rmsg_pgcnt = slots_used;
> > +	packet->rmsg_pgcnt = 1;
> >
> > -	slots_used += fill_pg_buf(virt_to_hvpfn(data),
> > -				  offset_in_hvpage(data),
> > -				  skb_headlen(skb),
> > -				  &pb[slots_used]);
> > +	pb[1].offset = offset_in_hvpage(skb->data);
> > +	pb[1].len = skb_headlen(skb);
> > +	pb[1].pfn = virt_to_hvpfn(skb->data);
> >
> >  	for (i = 0; i < frags; i++) {
> >  		skb_frag_t *frag = skb_shinfo(skb)->frags + i;
> > +		struct hv_page_buffer *cur_pb = &pb[i + 2];
> 
> Hi Michael,
> 
> > If I got things right then pb is allocated on the stack
> in netvsc_xmit and has MAX_DATA_RANGES elements.

Correct.

> 
> > If MAX_SKB_FRAGS is large and MAX_DATA_RANGES has been limited to
> > MAX_PAGE_BUFFER_COUNT, and frags is large, is it possible to overrun pb here?

I don't think it's possible. Near the top of netvsc_xmit() there's a call
to netvsc_get_slots(), along with code ensuring that all the data in the skb
(and its frags) exists on no more than MAX_PAGE_BUFFER_COUNT (i.e., 32)
pages. There can't be more frags than pages, so it should not be possible to
overrun the pb array even if the frag count is large.

If the kernel is built with CONFIG_MAX_SKB_FRAGS greater than 30, and
there are more than 30 frags in the skb (allowing for 2 pages for the rndis
header), netvsc_xmit() tries to linearize the skb to reduce the frag count.
But if that doesn't work, netvsc_xmit() drops the xmit request, which isn't
a great outcome. But that's a limitation of the existing code, and this patch
set doesn't change that limitation.
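
Paraphrasing (this isn't the literal source), the checks described above
are along these lines:

	num_data_pgs = netvsc_get_slots(skb) + 2;
	if (unlikely(num_data_pgs > MAX_PAGE_BUFFER_COUNT)) {
		if (skb_linearize(skb))
			goto drop;	/* couldn't reduce the page count */

		num_data_pgs = netvsc_get_slots(skb) + 2;
		if (num_data_pgs > MAX_PAGE_BUFFER_COUNT)
			goto drop;
	}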

Michael

> 
> > +		u64 pfn = page_to_hvpfn(skb_frag_page(frag));
> > +		u32 offset = skb_frag_off(frag);
> >
> > -		slots_used += fill_pg_buf(page_to_hvpfn(skb_frag_page(frag)),
> > -					  skb_frag_off(frag),
> > -					  skb_frag_size(frag),
> > -					  &pb[slots_used]);
> > +		cur_pb->offset = offset_in_hvpage(offset);
> > +		cur_pb->len = skb_frag_size(frag);
> > +		cur_pb->pfn = pfn + (offset >> HV_HYP_PAGE_SHIFT);
> >  	}
> > -	return slots_used;
> > +	return frags + 2;
> >  }
> >
> >  static int count_skb_frag_slots(struct sk_buff *skb)
> > @@ -483,7 +450,7 @@ static int netvsc_xmit(struct sk_buff *skb, struct net_device
> *net, bool xdp_tx)
> >  	struct net_device *vf_netdev;
> >  	u32 rndis_msg_size;
> >  	u32 hash;
> > -	struct hv_page_buffer pb[MAX_PAGE_BUFFER_COUNT];
> > +	struct hv_page_buffer pb[MAX_DATA_RANGES];
> >
> >  	/* If VF is present and up then redirect packets to it.
> >  	 * Skip the VF if it is marked down or has no carrier.
> 
> ...



* RE: [PATCH net 2/5] hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages
  2025-05-14  9:37   ` Simon Horman
@ 2025-05-14 15:44     ` Michael Kelley
  2025-05-15 10:50       ` Simon Horman
  0 siblings, 1 reply; 15+ messages in thread
From: Michael Kelley @ 2025-05-14 15:44 UTC (permalink / raw)
  To: Simon Horman
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, andrew+netdev@lunn.ch, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	James.Bottomley@hansenpartnership.com, martin.petersen@oracle.com,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-scsi@vger.kernel.org,
	stable@vger.kernel.org

From: Simon Horman <horms@kernel.org> Sent: Wednesday, May 14, 2025 2:38 AM
> 
> On Mon, May 12, 2025 at 05:06:01PM -0700, mhkelley58@gmail.com wrote:
> > From: Michael Kelley <mhklinux@outlook.com>
> >
> > netvsc currently uses vmbus_sendpacket_pagebuffer() to send VMBus
> > messages. This function creates a series of GPA ranges, each of which
> > contains a single PFN. However, if the rndis header in the VMBus
> > message crosses a page boundary, the netvsc protocol with the host
> > requires that both PFNs for the rndis header must be in a single "GPA
> > range" data structure, which isn't possible with
> > vmbus_sendpacket_pagebuffer(). As the first step in fixing this, add a
> > new function netvsc_build_mpb_array() to build a VMBus message with
> > multiple GPA ranges, each of which may contain multiple PFNs. Use
> > vmbus_sendpacket_mpb_desc() to send this VMBus message to the host.
> >
> > There's no functional change since higher levels of netvsc don't
> > maintain or propagate knowledge of contiguous PFNs. Based on its
> > input, netvsc_build_mpb_array() still produces a separate GPA range
> > for each PFN and the behavior is the same as with
> > vmbus_sendpacket_pagebuffer(). But the groundwork is laid for a
> > subsequent patch to provide the necessary grouping.
> >
> > Cc: <stable@vger.kernel.org> # 6.1.x
> > Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> > ---
> >  drivers/net/hyperv/netvsc.c | 50 +++++++++++++++++++++++++++++++++----
> >  1 file changed, 45 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
> > index d6f5b9ea3109..6d1705f87682 100644
> > --- a/drivers/net/hyperv/netvsc.c
> > +++ b/drivers/net/hyperv/netvsc.c
> > @@ -1055,6 +1055,42 @@ static int netvsc_dma_map(struct hv_device *hv_dev,
> >  	return 0;
> >  }
> >
> > +/* Build an "array" of mpb entries describing the data to be transferred
> > + * over VMBus. After the desc header fields, each "array" entry is variable
> > + * size, and each entry starts after the end of the previous entry. The
> > + * "offset" and "len" fields for each entry imply the size of the entry.
> > + *
> > + * The pfns are in HV_HYP_PAGE_SIZE, because all communication with Hyper-V
> > + * uses that granularity, even if the system page size of the guest is larger.
> > + * Each entry in the input "pb" array must describe a contiguous range of
> > + * guest physical memory so that the pfns are sequential if the range crosses
> > + * a page boundary. The offset field must be < HV_HYP_PAGE_SIZE.
> 
> Hi Michael,
> 
> Is there a guarantee that this constraint is met? And moreover, is there a
> guarantee that all of the entries will fit in desc? I am slightly concerned
> that there may be an overrun lurking here.
> 

It is indeed up to the caller to ensure that the pb array is properly
constructed. netvsc_build_mpb_array() doesn't do additional validation.
There are only two sources of the pb array, both of which do the right
thing, so additional validation seemed redundant.

An overrun is a concern, but again the callers do the right thing. As
described in my response to Patch 3 of the series, netvsc_xmit()
counts the number of pages ahead of time, and makes sure the count is
within the limit of the amount of space allocated in the "desc" argument
to netvsc_build_mpb_array().

Michael


* Re: [PATCH net 0/5] hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2"
  2025-05-13  0:05 [PATCH net 0/5] hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2" mhkelley58
                   ` (4 preceding siblings ...)
  2025-05-13  0:06 ` [PATCH net 5/5] Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer() mhkelley58
@ 2025-05-15  3:00 ` patchwork-bot+netdevbpf
  5 siblings, 0 replies; 15+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-05-15  3:00 UTC (permalink / raw)
  To: Michael Kelley
  Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, James.Bottomley, martin.petersen, linux-hyperv,
	linux-kernel, netdev, linux-scsi, stable

Hello:

This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 12 May 2025 17:05:59 -0700 you wrote:
> From: Michael Kelley <mhklinux@outlook.com>
> 
> Starting with commit dca5161f9bd0 in the 6.3 kernel, the Linux driver
> for Hyper-V synthetic networking (netvsc) occasionally reports
> "nvsp_rndis_pkt_complete error status: 2".[1] This error indicates
> that Hyper-V has rejected a network packet transmit request from the
> guest, and the outgoing network packet is dropped. Higher level
> network protocols presumably recover and resend the packet so there is
> no functional error, but performance is slightly impacted. Commit
> dca5161f9bd0 is not the cause of the error -- it only added reporting
> of an error that was already happening without any notice. The error
> has presumably been present since the netvsc driver was originally
> introduced into Linux.
> 
> [...]

Here is the summary with links:
  - [net,1/5] Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges
    https://git.kernel.org/netdev/net/c/380b75d30786
  - [net,2/5] hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages
    https://git.kernel.org/netdev/net/c/4f98616b855c
  - [net,3/5] hv_netvsc: Preserve contiguous PFN grouping in the page buffer array
    https://git.kernel.org/netdev/net/c/41a6328b2c55
  - [net,4/5] hv_netvsc: Remove rmsg_pgcnt
    https://git.kernel.org/netdev/net/c/5bbc644bbf4e
  - [net,5/5] Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer()
    https://git.kernel.org/netdev/net/c/45a442fe369e

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net 3/5] hv_netvsc: Preserve contiguous PFN grouping in the page buffer array
  2025-05-14 15:42     ` Michael Kelley
@ 2025-05-15 10:40       ` Simon Horman
  0 siblings, 0 replies; 15+ messages in thread
From: Simon Horman @ 2025-05-15 10:40 UTC (permalink / raw)
  To: Michael Kelley
  Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, James.Bottomley, martin.petersen, linux-hyperv,
	linux-kernel, netdev, linux-scsi, stable

On Wed, May 14, 2025 at 03:42:19PM +0000, Michael Kelley wrote:
> From: Simon Horman <horms@kernel.org> Sent: Wednesday, May 14, 2025 2:35 AM
> > 
> > On Mon, May 12, 2025 at 05:06:02PM -0700, mhkelley58@gmail.com wrote:
> > > From: Michael Kelley <mhklinux@outlook.com>

...

> > >  	for (i = 0; i < frags; i++) {
> > >  		skb_frag_t *frag = skb_shinfo(skb)->frags + i;
> > > +		struct hv_page_buffer *cur_pb = &pb[i + 2];
> > 
> > Hi Michael,
> > 
> > If I got things right then then pb is allocated on the stack
> > in netvsc_xmit and has MAX_DATA_RANGES elements.
> 
> Correct.
> 
> > 
> > If MAX_SKB_FRAGS is large, pb has been limited to MAX_DATA_RANGES
> > elements, and frags is large, is it possible to overrun pb here?
> 
> I don't think it's possible. Near the top of netvsc_xmit() there's a call
> to netvsc_get_slots(), along with code ensuring that all the data in the skb
> (and its frags) exists on no more than MAX_PAGE_BUFFER_COUNT (i.e., 32)
> pages. There can't be more frags than pages, so it should not be possible to
> overrun the pb array even if the frag count is large.
> 
> If the kernel is built with CONFIG_MAX_SKB_FRAGS greater than 30, and
> there are more than 30 frags in the skb (allowing for 2 pages for the rndis
> header), netvsc_xmit() tries to linearize the skb to reduce the frag count.
> But if that doesn't work, netvsc_xmit() drops the xmit request, which isn't
> a great outcome. But that's a limitation of the existing code, and this patch
> set doesn't change that limitation.
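
To make that concrete, the guard being described looks roughly like the
following (a paraphrase for illustration, not the exact code in
netvsc_drv.c; the helper and constant names are the ones mentioned in
this thread):

	/* Reserve 2 slots for the rndis header; the skb data gets the
	 * rest. The total may not exceed MAX_PAGE_BUFFER_COUNT (32).
	 */
	num_data_pgs = netvsc_get_slots(skb) + 2;

	if (unlikely(num_data_pgs > MAX_PAGE_BUFFER_COUNT)) {
		if (skb_linearize(skb))
			goto drop;

		num_data_pgs = netvsc_get_slots(skb) + 2;
		if (num_data_pgs > MAX_PAGE_BUFFER_COUNT)
			goto drop;	/* the "not great outcome" above */
	}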

Hi Michael,

Thanks for addressing my concern. I do now see that the check
in netvsc_xmit() prevents an overrun.

And I agree that, while not ideal, dropping oversize packets is not
strictly related to this patch set (and is indeed much better than
an overrun).

With the above in mind I'm now happy with this patch.

Reviewed-by: Simon Horman <horms@kernel.org>

...

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net 2/5] hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages
  2025-05-14 15:44     ` Michael Kelley
@ 2025-05-15 10:50       ` Simon Horman
  0 siblings, 0 replies; 15+ messages in thread
From: Simon Horman @ 2025-05-15 10:50 UTC (permalink / raw)
  To: Michael Kelley
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, andrew+netdev@lunn.ch, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	James.Bottomley@hansenpartnership.com, martin.petersen@oracle.com,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-scsi@vger.kernel.org,
	stable@vger.kernel.org

On Wed, May 14, 2025 at 03:44:35PM +0000, Michael Kelley wrote:
> From: Simon Horman <horms@kernel.org> Sent: Wednesday, May 14, 2025 2:38 AM
> > 
> > On Mon, May 12, 2025 at 05:06:01PM -0700, mhkelley58@gmail.com wrote:
> > > From: Michael Kelley <mhklinux@outlook.com>
> > >
> > > netvsc currently uses vmbus_sendpacket_pagebuffer() to send VMBus
> > > messages. This function creates a series of GPA ranges, each of which
> > > contains a single PFN. However, if the rndis header in the VMBus
> > > message crosses a page boundary, the netvsc protocol with the host
> > > requires that both PFNs for the rndis header must be in a single "GPA
> > > range" data structure, which isn't possible with
> > > vmbus_sendpacket_pagebuffer(). As the first step in fixing this, add a
> > > new function netvsc_build_mpb_array() to build a VMBus message with
> > > multiple GPA ranges, each of which may contain multiple PFNs. Use
> > > vmbus_sendpacket_mpb_desc() to send this VMBus message to the host.
> > >
> > > There's no functional change since higher levels of netvsc don't
> > > maintain or propagate knowledge of contiguous PFNs. Based on its
> > > input, netvsc_build_mpb_array() still produces a separate GPA range
> > > for each PFN and the behavior is the same as with
> > > vmbus_sendpacket_pagebuffer(). But the groundwork is laid for a
> > > subsequent patch to provide the necessary grouping.
> > >
> > > Cc: <stable@vger.kernel.org> # 6.1.x
> > > Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> > > ---
> > >  drivers/net/hyperv/netvsc.c | 50 +++++++++++++++++++++++++++++++++----
> > >  1 file changed, 45 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
> > > index d6f5b9ea3109..6d1705f87682 100644
> > > --- a/drivers/net/hyperv/netvsc.c
> > > +++ b/drivers/net/hyperv/netvsc.c
> > > @@ -1055,6 +1055,42 @@ static int netvsc_dma_map(struct hv_device *hv_dev,
> > >  	return 0;
> > >  }
> > >
> > > +/* Build an "array" of mpb entries describing the data to be transferred
> > > + * over VMBus. After the desc header fields, each "array" entry is variable
> > > + * size, and each entry starts after the end of the previous entry. The
> > > + * "offset" and "len" fields for each entry imply the size of the entry.
> > > + *
> > > + * The pfns are in HV_HYP_PAGE_SIZE, because all communication with Hyper-V
> > > + * uses that granularity, even if the system page size of the guest is larger.
> > > + * Each entry in the input "pb" array must describe a contiguous range of
> > > + * guest physical memory so that the pfns are sequential if the range crosses
> > > + * a page boundary. The offset field must be < HV_HYP_PAGE_SIZE.
> > 
> > Hi Michael,
> > 
> > Is there a guarantee that this constraint is met? And moreover, is there a
> > guarantee that all of the entries will fit in desc? I am slightly concerned
> > that there may be an overrun lurking here.
> > 
> 
> It is indeed up to the caller to ensure that the pb array is properly
> constructed. netvsc_build_mpb_array() doesn't do additional validation.
> There are only two sources of the pb array, both of which do the right
> thing, so additional validation seemed redundant.
> 
> An overrun is a concern, but again the callers do the right thing. As
> described in my response to Patch 3 of the series, netvsc_xmit()
> counts the number of pages ahead of time, and makes sure the count is
> within the limit of the amount of space allocated in the "desc" argument
> to netvsc_build_mpb_array().

Thanks Michael,

I agree that it is entirely reasonable for callers to be responsible
for correctly constructing the pb array, and that it's not necessary
to add validation to netvsc_build_mpb_array().

Also, based on the above, I'm satisfied that the callers are correctly
constructing the pb array.

With the above clarified in my mind I'm now happy with this patch.

Reviewed-by: Simon Horman <horms@kernel.org>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net 1/5] Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges
  2025-05-13  0:06 ` [PATCH net 1/5] Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges mhkelley58
@ 2025-05-15 10:51   ` Simon Horman
  0 siblings, 0 replies; 15+ messages in thread
From: Simon Horman @ 2025-05-15 10:51 UTC (permalink / raw)
  To: mhklinux
  Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, James.Bottomley, martin.petersen, linux-hyperv,
	linux-kernel, netdev, linux-scsi, stable

On Mon, May 12, 2025 at 05:06:00PM -0700, mhkelley58@gmail.com wrote:
> From: Michael Kelley <mhklinux@outlook.com>
> 
> vmbus_sendpacket_mpb_desc() is currently used only by the storvsc driver
> and is hardcoded to create a single GPA range. To allow it to also be
> used by the netvsc driver to create multiple GPA ranges, no longer
> hardcode as having a single GPA range. Allow the calling driver to
> specify the rangecount in the supplied descriptor.
> 
> Update the storvsc driver to reflect this new approach.
> 
> Cc: <stable@vger.kernel.org> # 6.1.x
> Signed-off-by: Michael Kelley <mhklinux@outlook.com>
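
For readers skimming the series, the practical effect on a caller is
small: it now fills in rangecount itself before calling
vmbus_sendpacket_mpb_desc(). A minimal sketch, assuming the
vmbus_packet_mpb_array layout from include/linux/hyperv.h and using
placeholder variable names:

	struct vmbus_packet_mpb_array *desc = payload;

	/* Previously forced to 1 inside vmbus_sendpacket_mpb_desc();
	 * after this patch the caller states how many GPA ranges follow
	 * the header. storvsc still sends a single range.
	 */
	desc->rangecount = 1;

	ret = vmbus_sendpacket_mpb_desc(channel, desc, payload_sz,
					packet, sizeof(*packet),
					(unsigned long)request);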

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net 4/5] hv_netvsc: Remove rmsg_pgcnt
  2025-05-13  0:06 ` [PATCH net 4/5] hv_netvsc: Remove rmsg_pgcnt mhkelley58
@ 2025-05-15 10:55   ` Simon Horman
  0 siblings, 0 replies; 15+ messages in thread
From: Simon Horman @ 2025-05-15 10:55 UTC (permalink / raw)
  To: mhklinux
  Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, James.Bottomley, martin.petersen, linux-hyperv,
	linux-kernel, netdev, linux-scsi, stable

On Mon, May 12, 2025 at 05:06:03PM -0700, mhkelley58@gmail.com wrote:
> From: Michael Kelley <mhklinux@outlook.com>
> 
> init_page_array() now always creates a single page buffer array entry
> for the rndis message, even if the rndis message crosses a page
> boundary. As such, the number of page buffer array entries used for
> the rndis message no longer needs to be tracked -- it is always just 1.
> Remove the rmsg_pgcnt field and use "1" where the value is needed.
> 
> Cc: <stable@vger.kernel.org> # 6.1.x
> Signed-off-by: Michael Kelley <mhklinux@outlook.com>
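
A concrete illustration of why the count is always 1 now (made-up
values, assuming the hv_page_buffer layout in include/linux/hyperv.h):

	/* One pb entry can describe an rndis header that crosses a
	 * Hyper-V page boundary, because the backing memory is
	 * contiguous and the second PFN is implied.
	 */
	struct hv_page_buffer rndis_pb = {
		.offset	= 4000,		/* starts near the end of a 4K page */
		.len	= 200,		/* last 104 bytes spill into the next page */
		.pfn	= 0x12345,	/* 0x12346 is implied by contiguity */
	};

	/* netvsc_build_mpb_array() expands this into a single GPA range
	 * holding both PFNs, so no separate per-page count is needed.
	 */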

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2025-05-15 10:55 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-05-13  0:05 [PATCH net 0/5] hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2" mhkelley58
2025-05-13  0:06 ` [PATCH net 1/5] Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges mhkelley58
2025-05-15 10:51   ` Simon Horman
2025-05-13  0:06 ` [PATCH net 2/5] hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages mhkelley58
2025-05-14  9:37   ` Simon Horman
2025-05-14 15:44     ` Michael Kelley
2025-05-15 10:50       ` Simon Horman
2025-05-13  0:06 ` [PATCH net 3/5] hv_netvsc: Preserve contiguous PFN grouping in the page buffer array mhkelley58
2025-05-14  9:34   ` Simon Horman
2025-05-14 15:42     ` Michael Kelley
2025-05-15 10:40       ` Simon Horman
2025-05-13  0:06 ` [PATCH net 4/5] hv_netvsc: Remove rmsg_pgcnt mhkelley58
2025-05-15 10:55   ` Simon Horman
2025-05-13  0:06 ` [PATCH net 5/5] Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer() mhkelley58
2025-05-15  3:00 ` [PATCH net 0/5] hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2" patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).