* [PATCH] hv: account for packet descriptor in maximum packet size
@ 2021-12-12 12:13 Yanming Liu
2021-12-13 1:47 ` Andrea Parri
0 siblings, 1 reply; 10+ messages in thread
From: Yanming Liu @ 2021-12-12 12:13 UTC (permalink / raw)
To: linux-hyperv
Cc: Andrea Parri (Microsoft), Andres Beltran, Dexuan Cui, Wei Liu,
Stephen Hemminger, Haiyang Zhang, K. Y. Srinivasan, Yanming Liu
Commit adae1e931acd ("Drivers: hv: vmbus: Copy packets sent by Hyper-V
out of the ring buffer") introduced a notion of maximum packet size and
used that size to initialize a buffer holding all incoming packets along
with their vmpacket_descriptor header. All vmbus drivers set the maximum
packet size to the size of their receive buffer. However, most drivers
expect this maximum packet size to be the packet payload size, because
vmbus_recvpacket() strips the packet descriptor off. This leads to
corruption of the ring buffer state when receiving a maximum-sized
packet.
Specifically, in hv_balloon I have observed a dm_unballoon_request
message of 4096 bytes being truncated to 4080 bytes. When the driver
tries to read the next packet it starts from the wrong read_index,
receives garbage and prints a lot of "Unhandled message: type:
<garbage>" in dmesg.
Allocate the packet buffer with 'sizeof(struct vmpacket_descriptor)'
more bytes to make room for the descriptor. This is still flawed as
'desc->offset8' may not match the packet descriptor size, but this is
impossible to handle correctly without re-designing the original patch
and I have not observed such packets sent by Hyper-V in practice.
Fixes: adae1e931acd ("Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer")
Signed-off-by: Yanming Liu <yanminglr@gmail.com>
---
drivers/hv/ring_buffer.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c
index 71efacb90965..e403ed4755ed 100644
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -256,6 +256,7 @@ int hv_ringbuffer_init(struct hv_ring_buffer_info *ring_info,
/* Initialize buffer that holds copies of incoming packets */
if (max_pkt_size) {
+ max_pkt_size += sizeof(struct vmpacket_descriptor);
ring_info->pkt_buffer = kzalloc(max_pkt_size, GFP_KERNEL);
if (!ring_info->pkt_buffer)
return -ENOMEM;
--
2.33.0
^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] hv: account for packet descriptor in maximum packet size
  2021-12-12 12:13 [PATCH] hv: account for packet descriptor in maximum packet size Yanming Liu
@ 2021-12-13  1:47 ` Andrea Parri
  2021-12-13  6:44   ` Yanming Liu
  2021-12-13 17:01   ` Yanming Liu
  0 siblings, 2 replies; 10+ messages in thread
From: Andrea Parri @ 2021-12-13 1:47 UTC (permalink / raw)
To: Yanming Liu
Cc: linux-hyperv, Andres Beltran, Dexuan Cui, Wei Liu,
	Stephen Hemminger, Haiyang Zhang, K. Y. Srinivasan, Michael Kelley

Yanming,

[...]

> Specifically, in hv_balloon I have observed of a dm_unballoon_request
> message of 4096 bytes being truncated to 4080 bytes. When the driver
> tries to read next packet it starts from the wrong read_index, receives
> garbage and prints a lot of "Unhandled message: type: <garbage>" in
> dmesg.

To make sure I understand your observations: Can you please print/share the
values of (desc->len8 << 3) and (desc->offset8 << 3) for such a "truncated"
packet, say, right after the

	desc = hv_pkt_iter_first(channel);

in hv_ringbuffer_read()? Also, it'd be interesting to know whether any of
the two validations on pkt_len and pkt_offset in hv_pkt_iter_first() fails
(so that pkt_len/pkt_offset get updated in there).

  Thanks,
  Andrea

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] hv: account for packet descriptor in maximum packet size
  2021-12-13  1:47 ` Andrea Parri
@ 2021-12-13  6:44   ` Yanming Liu
  2021-12-13 17:01   ` Yanming Liu
  1 sibling, 0 replies; 10+ messages in thread
From: Yanming Liu @ 2021-12-13 6:44 UTC (permalink / raw)
To: Andrea Parri
Cc: linux-hyperv, Andres Beltran, Dexuan Cui, Wei Liu,
	Stephen Hemminger, Haiyang Zhang, K. Y. Srinivasan, Michael Kelley

On Mon, Dec 13, 2021 at 9:47 AM Andrea Parri <parri.andrea@gmail.com> wrote:
>
> Yanming,
>
> [...]
>
> > Specifically, in hv_balloon I have observed of a dm_unballoon_request
> > message of 4096 bytes being truncated to 4080 bytes. When the driver
> > tries to read next packet it starts from the wrong read_index, receives
> > garbage and prints a lot of "Unhandled message: type: <garbage>" in
> > dmesg.
>
> To make sure I understand your observations: Can you please print/share the
> values of (desc->len8 << 3) and (desc->offset8 << 3) for such a "truncated"
> packet, say, right after the
>
> 	desc = hv_pkt_iter_first(channel);
>
> in hv_ringbuffer_read()? Also, it'd be interesting to know whether any of

Sure, unfortunately I have not printed desc->len8 before, and reproducing
the bug takes some time (the only reliable way I know of seems to be
booting a bad kernel, using it normally for a few hours, then running my
daily backup script); I'll reply when I have these values ready.

Meanwhile, I'd like to share how I observed a dm_unballoon_request
message being truncated.
I attached the following systemtap script to a bad kernel:

probe module("hv_balloon").statement("*@drivers/hv/hv_balloon.c:1493") {
	printf("balloon_onchannelcallback: recvlen = %d, dm_hdr->type = %d\n",
	       $recvlen, $dm_hdr->type);
	printf("%*.*M\n", $recvlen * 2, $recvlen, $recv_buffer);
}

This is the trace printed when the bug happens:

balloon_onchannelcallback: recvlen = 16, dm_hdr->type = 6
06001000000000006625000000000000
balloon_onchannelcallback: recvlen = 24, dm_hdr->type = 8
080018000000000000000000010000000012000000005600
balloon_onchannelcallback: recvlen = 32, dm_hdr->type = 8
080020000000000000000000020000000068000000009000001c010000008c01
balloon_onchannelcallback: recvlen = 40, dm_hdr->type = 8
0800280000000000000000000300000000a802000000c40000700300000002000074030000001800
balloon_onchannelcallback: recvlen = 4080, dm_hdr->type = 8
080000100000000001000000fe0100002885100000d9020003881000000900000d8810000007000016881000000200001a881000000300001f88100000010000238810000001000028881000000200002d881000000300003388100000020000378810000001000039881000000100003c8810000001000062881000000[...]
balloon_onchannelcallback: recvlen = 3968, dm_hdr->type = 16138
0a3f1000000300000e3f1000000a0000503f1000000e0000603f100000230000843f100000030000883f100000100000b83f100000100000d03f100000220000f33f1000001900000d401000000100000f4010000001000012401000000400001c401000000100001e40100000010000234010000001000025401000000[...]
balloon_onchannelcallback: recvlen = 3968, dm_hdr->type = 8792
5822100000070000602210000001000063221000001200007622100000090000802210000004000085221000000300008b221000000800009a22100000160000f02210000008000010231000000100001223100000020000182310000004000029231000000e000038231000000100003a231000000100003c231000000[...]
balloon_onchannelcallback: recvlen = 1544, dm_hdr->type = 2728
a80a100000020000fa0b100000010000130c100000010000150c1000000200001b0c100000040000200c100000010000240c100000010000260c100000010000290c1000000100002e0c100000010000310c100000010000330c100000010000350c100000020000380c1000000d0000490c100000060000560c1000000[...]
balloon_onchannelcallback: recvlen = 4080, dm_hdr->type = 0
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000[...]
balloon_onchannelcallback: recvlen = 3968, dm_hdr->type = 41981
fda3100000010000ffa31000000100000ea41000000d00001ca410000001000020a410000001000024a410000001000027a41000000100002aa410000007000032a41000000e000043a41000000100004aa410000001000054a410000002000057a410000003000060a4100000200000c0a4100000010000c4a41000002[...]
balloon_onchannelcallback: recvlen = 3968, dm_hdr->type = 20914
b2511000000a0000be51100000080000cc51100000020000d151100000090000dc51100000020000e0511000001000000052100000040000125210000003000016521000000200002052100000020000245210000001000026521000000400002c5210000006000038521000000200003e521000001200007e521000003[...]
balloon_onchannelcallback: recvlen = 1920, dm_hdr->type = 11279
0f2c100000130000232c100000010000252c100000280000502c100000140000712c100000010000742c100000020000782c1000000800008c2c100000080000962c1000000400009e2c100000020000a42c100000040000a92c100000030000ad2c1000000c0000ba2c1000000b0000c62c100000030000ca2c1000000[...]
balloon_onchannelcallback: recvlen = 1880, dm_hdr->type = 15848
e83d100000050000ee3d100000010000f03d100000090000fa3d1000001400000f3e100000050000153e100000020000183e1000000400001d3e1000000100001f3e100000070000273e100000080000313e100000010000333e100000050000393e100000080000423e1000000700004a3e1000000f00005a3e1000000[...]
struct dm_header conveniently has a u16 size field described as "Size of
the message in bytes; including the header." in hv_balloon.c; please
note how the first 4080-byte message has size = 0x1000, mismatching
recvlen. On a good kernel I'm seeing multiple recvlen = 4096, type = 8
messages under similar memory pressure.
I then ran a (painful) bisect which pointed to adae1e931acd ("Drivers:
hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer").

> the two validations on pkt_len and pkt_offset in hv_pkt_iter_first() fails
> (so that pkt_len/pkt_offset get updated in there).

My hypothesis is that in hv_pkt_iter_first, 'bytes_avail' was set to
'rbi->pkt_buffer_size', which was later used to update pkt_len. Please
suggest better fixes as I'm not familiar with drivers/hv, thanks!

>
>   Thanks,
>   Andrea

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] hv: account for packet descriptor in maximum packet size
  2021-12-13  1:47 ` Andrea Parri
  2021-12-13  6:44   ` Yanming Liu
@ 2021-12-13 17:01   ` Yanming Liu
  2021-12-14  2:06     ` Andrea Parri
  1 sibling, 1 reply; 10+ messages in thread
From: Yanming Liu @ 2021-12-13 17:01 UTC (permalink / raw)
To: Andrea Parri
Cc: linux-hyperv, Andres Beltran, Dexuan Cui, Wei Liu,
	Stephen Hemminger, Haiyang Zhang, K. Y. Srinivasan, Michael Kelley

On Mon, Dec 13, 2021 at 9:47 AM Andrea Parri <parri.andrea@gmail.com> wrote:
>
> Yanming,
>
> [...]
>
> > Specifically, in hv_balloon I have observed of a dm_unballoon_request
> > message of 4096 bytes being truncated to 4080 bytes. When the driver
> > tries to read next packet it starts from the wrong read_index, receives
> > garbage and prints a lot of "Unhandled message: type: <garbage>" in
> > dmesg.
>
> To make sure I understand your observations: Can you please print/share the
> values of (desc->len8 << 3) and (desc->offset8 << 3) for such a "truncated"
> packet, say, right after the
>
> 	desc = hv_pkt_iter_first(channel);
>
> in hv_ringbuffer_read()? Also, it'd be interesting to know whether any of

Truncated packet:
module("hv_vmbus").statement("hv_pkt_iter_first@drivers/hv/ring_buffer.c:457"):
desc->offset8 = 2, desc->len8 = 514, rbi->pkt_buffer_size = 4096
module("hv_vmbus").statement("hv_ringbuffer_read@drivers/hv/ring_buffer.c:382"):
desc->offset8 = 2, desc->len8 = 512
balloon_onchannelcallback: recvlen = 4080, dm_hdr->type = 8

First garbage packet:
module("hv_vmbus").statement("hv_pkt_iter_first@drivers/hv/ring_buffer.c:457"):
desc->offset8 = 21, desc->len8 = 16640, rbi->pkt_buffer_size = 4096
module("hv_vmbus").statement("hv_ringbuffer_read@drivers/hv/ring_buffer.c:382"):
desc->offset8 = 21, desc->len8 = 512
balloon_onchannelcallback: recvlen = 3928, dm_hdr->type = 63886

The trace proved my hypothesis above.
> the two validations on pkt_len and pkt_offset in hv_pkt_iter_first() fails
> (so that pkt_len/pkt_offset get updated in there).
>
>   Thanks,
>   Andrea

Regards,
Yanming

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] hv: account for packet descriptor in maximum packet size
  2021-12-13 17:01   ` Yanming Liu
@ 2021-12-14  2:06     ` Andrea Parri
  2021-12-14  4:28       ` Andrea Parri
  0 siblings, 1 reply; 10+ messages in thread
From: Andrea Parri @ 2021-12-14 2:06 UTC (permalink / raw)
To: Yanming Liu
Cc: linux-hyperv, Andres Beltran, Dexuan Cui, Wei Liu,
	Stephen Hemminger, Haiyang Zhang, K. Y. Srinivasan, Michael Kelley

> Truncated packet:
> module("hv_vmbus").statement("hv_pkt_iter_first@drivers/hv/ring_buffer.c:457"):
> desc->offset8 = 2, desc->len8 = 514, rbi->pkt_buffer_size = 4096
> module("hv_vmbus").statement("hv_ringbuffer_read@drivers/hv/ring_buffer.c:382"):
> desc->offset8 = 2, desc->len8 = 512
> balloon_onchannelcallback: recvlen = 4080, dm_hdr->type = 8
>
> First garbage packet:
> module("hv_vmbus").statement("hv_pkt_iter_first@drivers/hv/ring_buffer.c:457"):
> desc->offset8 = 21, desc->len8 = 16640, rbi->pkt_buffer_size = 4096
> module("hv_vmbus").statement("hv_ringbuffer_read@drivers/hv/ring_buffer.c:382"):
> desc->offset8 = 21, desc->len8 = 512
> balloon_onchannelcallback: recvlen = 3928, dm_hdr->type = 63886
>
> The trace proved my hypothesis above.

Thanks for the confirmation.

(Back to "how to fix this" now.)
I think that the patch properly addresses the "mismatch" between the
(maximum) size of the receive buffer (bufferlen in vmbus_recvpacket())
and max_pkt_size:

We've discussed hv_balloon in some detail:

1) drivers/hv/hv_balloon.c:balloon_onchannelcallback()
   (bufferlen = HV_HYP_PAGE_SIZE, max_pkt_size = VMBUS_DEFAULT_MAX_PKT_SIZE)

But I understand that the same mismatch is present *and* addressed by
your patch in the following cases:

2) drivers/hv/hv_util.c:{shutdown,timesync,heartbeat}_onchannelcallback()
   (bufferlen = HV_HYP_PAGE_SIZE, max_pkt_size = VMBUS_DEFAULT_MAX_PKT_SIZE)

3) drivers/hv/hv_fcopy.c:hv_fcopy_onchannelcallback()
   (bufferlen = 2 * HV_HYP_PAGE_SIZE, max_pkt_size = 2 * HV_HYP_PAGE_SIZE)

4) drivers/hv/hv_snapshot.c:hv_vss_onchannelcallback()
   (bufferlen = 2 * HV_HYP_PAGE_SIZE, max_pkt_size = 2 * HV_HYP_PAGE_SIZE)

5) drivers/hv/hv_kvp.c:hv_kvp_onchannelcallback()
   (bufferlen = 4 * HV_HYP_PAGE_SIZE, max_pkt_size = 4 * HV_HYP_PAGE_SIZE)

In fact, IIUC, similar considerations would apply to hv_fb and hv_drm
*if* it were not for the fact that those drivers seem to have bogus
values for max_pkt_size:

6) drivers/video/fbdev/hyperv_fb.c
   (bufferlen = MAX_VMBUS_PKT_SIZE, max_pkt_size = VMBUS_DEFAULT_MAX_PKT_SIZE)

7) drivers/gpu/drm/hyperv/hyperv_drm_proto.c
   (bufferlen = MAX_VMBUS_PKT_SIZE, max_pkt_size = VMBUS_DEFAULT_MAX_PKT_SIZE)

So, IIUC, some separate patch is needed in order to "adjust" those
values (say, by appropriately setting max_pkt_size in
synthvid_connect_vsp() and in hyperv_connect_vsp()), but I digress.

Other comments on your patch:

a) You mentioned the problem that "pkt_offset may not match the packet
   descriptor size". I think this is a real problem. To address this
   problem, we can *presumably* consider HV_HYP_PAGE_SIZE to be a valid
   upper bound for pkt_offset (and sizeof(struct vmpacket_descriptor))
   *and* increase the value of max_pkt_size by HV_HYP_PAGE_SIZE (rather
   than sizeof(struct vmpacket_descriptor)), like in:

@@ -256,6 +256,7 @@ int hv_ringbuffer_init(struct hv_ring_buffer_info *ring_info,

	/* Initialize buffer that holds copies of incoming packets */
	if (max_pkt_size) {
+		max_pkt_size += HV_HYP_PAGE_SIZE;
		ring_info->pkt_buffer = kzalloc(max_pkt_size, GFP_KERNEL);
		if (!ring_info->pkt_buffer)
			return -ENOMEM;

b) We may then want to "enforce"/check that that bound on pkt_offset,

      pkt_offset <= HV_HYP_PAGE_SIZE,

   is met by adding a corresponding check to the (previously discussed)
   validation of pkt_offset included in hv_pkt_iter_first(), say:

@@ -498,7 +498,8 @@ struct vmpacket_descriptor *hv_pkt_iter_first(struct vmbus_channel *channel)
	 * If pkt_offset is invalid, arbitrarily set it to
	 * the size of vmpacket_descriptor
	 */
-	if (pkt_offset < sizeof(struct vmpacket_descriptor) || pkt_offset > pkt_len)
+	if (pkt_offset < sizeof(struct vmpacket_descriptor) || pkt_offset > pkt_len ||
+	    pkt_offset > HV_HYP_PAGE_SIZE)
		pkt_offset = sizeof(struct vmpacket_descriptor);

	/* Copy the Hyper-V packet out of the ring buffer */

   While there (since I have noticed that such validation, as well as the
   validation on pkt_len in hv_pkt_iter_first(), tends to be the object
   of a somehow recurring discussion ;/), I wouldn't mind adding some
   "explicit" debug, pr_err_ratelimited() say, there.

c) Last but not least, I'd recommend pairing the above changes or any
   other change with some inline explanation/comments; these comments
   could be snippets from an (updated) patch description, for example.

  Andrea

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] hv: account for packet descriptor in maximum packet size
  2021-12-14  2:06     ` Andrea Parri
@ 2021-12-14  4:28       ` Andrea Parri
  2021-12-14 16:28         ` Yanming Liu
  0 siblings, 1 reply; 10+ messages in thread
From: Andrea Parri @ 2021-12-14 4:28 UTC (permalink / raw)
To: Yanming Liu
Cc: linux-hyperv, Andres Beltran, Dexuan Cui, Wei Liu,
	Stephen Hemminger, Haiyang Zhang, K. Y. Srinivasan, Michael Kelley

On Tue, Dec 14, 2021 at 03:06:58AM +0100, Andrea Parri wrote:
> > Truncated packet:
> > module("hv_vmbus").statement("hv_pkt_iter_first@drivers/hv/ring_buffer.c:457"):
> > desc->offset8 = 2, desc->len8 = 514, rbi->pkt_buffer_size = 4096
> > module("hv_vmbus").statement("hv_ringbuffer_read@drivers/hv/ring_buffer.c:382"):
> > desc->offset8 = 2, desc->len8 = 512
> > balloon_onchannelcallback: recvlen = 4080, dm_hdr->type = 8
> >
> > First garbage packet:
> > module("hv_vmbus").statement("hv_pkt_iter_first@drivers/hv/ring_buffer.c:457"):
> > desc->offset8 = 21, desc->len8 = 16640, rbi->pkt_buffer_size = 4096
> > module("hv_vmbus").statement("hv_ringbuffer_read@drivers/hv/ring_buffer.c:382"):
> > desc->offset8 = 21, desc->len8 = 512
> > balloon_onchannelcallback: recvlen = 3928, dm_hdr->type = 63886
> >
> > The trace proved my hypothesis above.
>
> Thanks for the confirmation.
>
> (Back to "how to fix this" now.)  I think that the patch properly addresses
> the "mismatch" between the (maximum) size of the receive buffer (bufferlen
> in vmbus_recvpacket()) and max_pkt_size:
>
> We've discussed hv_ballon in some:
>
> 1) drivers/hv/hv_balloon.c:balloon_onchannelcallback()
> (bufferlen = HV_HYP_PAGE_SIZE, max_pkt_size = VMBUS_DEFAULT_MAX_PKT_SIZE)
>
> But I understand that the same mismatch is present *and addressed by your
> patch in the following cases:
>
> 2) drivers/hv/hv_util.c:{shutdown,timesync,heartbeat}_onchannelcallback()
> (bufferlen = HV_HYP_PAGE_SIZE, max_pkt_size = VMBUS_DEFAULT_MAX_PKT_SIZE)
>
> 3) drivers/hv/hv_fcopy.c:hv_fcopy_onchannelcallback()
> (bufferlen = 2 * HV_HYP_PAGE_SIZE, max_pkt_size = 2 * HV_HYP_PAGE_SIZE)
>
> 4) drivers/hv/hv_snapshot.c:hv_vss_onchannelcallback()
> (bufferlen = 2 * HV_HYP_PAGE_SIZE, max_pkt_size = 2 * HV_HYP_PAGE_SIZE)
>
> 5) drivers/hv/hv_kvp.c:hv_kvp_onchannelcallback()
> (bufferlen = 4 * HV_HYP_PAGE_SIZE, max_pkt_size = 4 * HV_HYP_PAGE_SIZE)
>
> In fact, IIUC, similar considerations would apply to hv_fb and hv_drm *if
> it were not for the fact that those drivers seem to have bogus values for
> max_pkt_size:
>
> 6) drivers/video/fbdev/hyperv_fb.c
> (bufferlen = MAX_VMBUS_PKT_SIZE, max_pkt_size=VMBUS_DEFAULT_MAX_PKT_SIZE)
>
> 7) drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> (bufferlen = MAX_VMBUS_PKT_SIZE, max_pkt_size=VMBUS_DEFAULT_MAX_PKT_SIZE)
>
> So, IIUC, some separate patch is needed in order to "adjust" those values
> (say, by appropriately setting max_pkt_size in synthvid_connect_vsp() and
> in hyperv_connect_vsp()), but I digress.
>
> Other comments on your patch:
>
> a) You mentioned the problem that "pkt_offset may not match the packet
> descriptor size". I think this is a real problem. To address this
> problem, we can *presumably consider HV_HYP_PAGE_SIZE to be a valid
> upper bound for pkt_offset (and sizeof(struct vmpacket_descriptor))
> *and increase the value of max_pkt_size by HV_HYP_PAGE_SIZE (rather
> than sizeof(struct vmpacket_descriptor)), like in:
>
> @@ -256,6 +256,7 @@ int hv_ringbuffer_init(struct hv_ring_buffer_info *ring_info,
>
> 	/* Initialize buffer that holds copies of incoming packets */
> 	if (max_pkt_size) {
> +		max_pkt_size += HV_HYP_PAGE_SIZE;
> 		ring_info->pkt_buffer = kzalloc(max_pkt_size, GFP_KERNEL);
> 		if (!ring_info->pkt_buffer)
> 			return -ENOMEM;
>
> b) We may then want to "enforce"/check that that bound on pkt_offset,
>
>       pkt_offset <= HV_HYP_PAGE_SIZE,
>
> is met by adding a corresponding check to the (previously discussed)
> validation of pkt_offset included in hv_pkt_iter_first(), say:
>
> @@ -498,7 +498,8 @@ struct vmpacket_descriptor *hv_pkt_iter_first(struct vmbus_channel *channel)
> 	 * If pkt_offset is invalid, arbitrarily set it to
> 	 * the size of vmpacket_descriptor
> 	 */
> -	if (pkt_offset < sizeof(struct vmpacket_descriptor) || pkt_offset > pkt_len)
> +	if (pkt_offset < sizeof(struct vmpacket_descriptor) || pkt_offset > pkt_len ||
> +	    pkt_offset > HV_HYP_PAGE_SIZE)
> 		pkt_offset = sizeof(struct vmpacket_descriptor);
>
> 	/* Copy the Hyper-V packet out of the ring buffer */
>
> While there (since I have noticed that such validation as well the
> validation on pkt_len in hv_pkt_iter_first() tend to be the object
> of a somehow recurring discussion ;/), I wouldn't mind to add some
> "explicit" debug, pr_err_ratelimited() say, there.
>
> c) Last but not least, I'd recommend to pair the above changes or any
> other change with some inline explanation/comments; these comments
> could be snippets from an (updated) patch description for example.
>
> Andrea

One more thought.  The additional HV_HYP_PAGE_SIZE seems unnecessary for
drivers such as hv_netvsc and hv_storvsc, which explicitly account for
pkt_offset in their setting of max_pkt_size, as well as for drivers such
as hv_pci, which uses vmbus_recvpacket_raw().

This suggests an alternative approach: do not increase max_pkt_size in
the generic vmbus code; instead, set/adjust max_pkt_size (only) for the
drivers, in (1-7) above, which require the additional
HV_HYP_PAGE_SIZE/pkt_offset.  While putting some additional "burden" on
the driver(s), this approach has the advantage of saving some memory
(and keeping the generic code relatively simpler).

Thoughts?

  Andrea

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] hv: account for packet descriptor in maximum packet size
  2021-12-14  4:28       ` Andrea Parri
@ 2021-12-14 16:28         ` Yanming Liu
  2021-12-14 21:36           ` Andrea Parri
  0 siblings, 1 reply; 10+ messages in thread
From: Yanming Liu @ 2021-12-14 16:28 UTC (permalink / raw)
To: Andrea Parri
Cc: linux-hyperv, Andres Beltran, Dexuan Cui, Wei Liu,
	Stephen Hemminger, Haiyang Zhang, K. Y. Srinivasan, Michael Kelley

Hi Andrea,

Thank you for your very detailed reply! I'm going to send a v2 which
should address all your comments.

On Tue, Dec 14, 2021 at 12:28 PM Andrea Parri <parri.andrea@gmail.com> wrote:
>
> On Tue, Dec 14, 2021 at 03:06:58AM +0100, Andrea Parri wrote:

[...]

> > In fact, IIUC, similar considerations would apply to hv_fb and hv_drm *if
> > it were not for the fact that those drivers seem to have bogus values for
> > max_pkt_size:
> >
> > 6) drivers/video/fbdev/hyperv_fb.c
> > (bufferlen = MAX_VMBUS_PKT_SIZE, max_pkt_size=VMBUS_DEFAULT_MAX_PKT_SIZE)
> >
> > 7) drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> > (bufferlen = MAX_VMBUS_PKT_SIZE, max_pkt_size=VMBUS_DEFAULT_MAX_PKT_SIZE)

Nice catch! The reason why I tried to apply a delta in the vmbus code is
that I found VMBUS_DEFAULT_MAX_PKT_SIZE helps hide where we should set
max_pkt_size (so that drivers like hv_balloon are not changed), but I
failed to realize there are more.

> > So, IIUC, some separate patch is needed in order to "adjust" those values
> > (say, by appropriately setting max_pkt_size in synthvid_connect_vsp() and
> > in hyperv_connect_vsp()), but I digress.
> >
> > Other comments on your patch:
> >
> > a) You mentioned the problem that "pkt_offset may not match the packet
> > descriptor size". I think this is a real problem. To address this
> > problem, we can *presumably consider HV_HYP_PAGE_SIZE to be a valid
> > upper bound for pkt_offset (and sizeof(struct vmpacket_descriptor))
> > *and increase the value of max_pkt_size by HV_HYP_PAGE_SIZE (rather
> > than sizeof(struct vmpacket_descriptor)), like in:

IIUC, pkt_offset is used for being forward-compatible with a future
Hyper-V which may expand vmpacket_descriptor. If so, isn't a whole page
a little bit too much? Anyway, I'm going to introduce a new #define for
that, presumably VMBUS_MAX_PKT_DESCR_SIZE, set to HV_HYP_PAGE_SIZE for
now.

> > @@ -256,6 +256,7 @@ int hv_ringbuffer_init(struct hv_ring_buffer_info *ring_info,
> >
> > 	/* Initialize buffer that holds copies of incoming packets */
> > 	if (max_pkt_size) {
> > +		max_pkt_size += HV_HYP_PAGE_SIZE;
> > 		ring_info->pkt_buffer = kzalloc(max_pkt_size, GFP_KERNEL);
> > 		if (!ring_info->pkt_buffer)
> > 			return -ENOMEM;
> >
> > b) We may then want to "enforce"/check that that bound on pkt_offset,
> >
> >       pkt_offset <= HV_HYP_PAGE_SIZE,
> >
> > is met by adding a corresponding check to the (previously discussed)
> > validation of pkt_offset included in hv_pkt_iter_first(), say:
> >
> > @@ -498,7 +498,8 @@ struct vmpacket_descriptor *hv_pkt_iter_first(struct vmbus_channel *channel)
> > 	 * If pkt_offset is invalid, arbitrarily set it to
> > 	 * the size of vmpacket_descriptor
> > 	 */
> > -	if (pkt_offset < sizeof(struct vmpacket_descriptor) || pkt_offset > pkt_len)
> > +	if (pkt_offset < sizeof(struct vmpacket_descriptor) || pkt_offset > pkt_len ||
> > +	    pkt_offset > HV_HYP_PAGE_SIZE)
> > 		pkt_offset = sizeof(struct vmpacket_descriptor);
> >
> > 	/* Copy the Hyper-V packet out of the ring buffer */
> >
> > While there (since I have noticed that such validation as well the
> > validation on pkt_len in hv_pkt_iter_first() tend to be the object
> > of a somehow recurring discussion ;/), I wouldn't mind to add some
> > "explicit" debug, pr_err_ratelimited() say, there.

Good idea!

> >
> > c) Last but not least, I'd recommend to pair the above changes or any
> > other change with some inline explanation/comments; these comments
> > could be snippets from an (updated) patch description for example.

Sure, thanks for the detailed guide here!

> >
> > Andrea
>
> One more thought. The additional HV_HYP_PAGE_SIZE seems unnecessary for
> drivers such as hv_netvsc and hv_storvsc, which explicitly account for
> pkt_offset in their setting of max_pkt_size, as well as for drivers such
> as hv_pci, which uses vmbus_recvpacket_raw().
>
> This suggests an alternative approach: do not increase max_pkt_size in
> the generic vmbus code, instead set/adjust max_pkt_size (only) for the
> drivers, in (1-7) above, which require the additional HV_HYP_PAGE_SIZE
> /pkt_offset. While putting on the driver(s) some additional "burden",
> this approach has the advantage of saving some memory (and keeping the
> generic code relatively simpler).
>
> Thoughts?

Provided that there are indeed drivers (hv_storvsc and hv_netvsc) which
explicitly account for the vmpacket_descriptor header, changing
max_pkt_size for individual drivers makes more sense. However, in this
case I'm not sure about our reasoning on 'pkt_offset' above. In
drivers/scsi/storvsc_drv.c:

#define STORVSC_MAX_PKT_SIZE (sizeof(struct vmpacket_descriptor) +\
			      sizeof(struct vstor_packet))

Should I also change this 'sizeof(struct vmpacket_descriptor)' to
VMBUS_MAX_PKT_DESCR_SIZE? Otherwise this would not match the check in
hv_pkt_iter_first.

>
> Andrea

Regards,
Yanming

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] hv: account for packet descriptor in maximum packet size
  2021-12-14 16:28         ` Yanming Liu
@ 2021-12-14 21:36           ` Andrea Parri
  2021-12-15 12:30             ` Yanming Liu
  0 siblings, 1 reply; 10+ messages in thread
From: Andrea Parri @ 2021-12-14 21:36 UTC (permalink / raw)
To: Yanming Liu
Cc: linux-hyperv, Andres Beltran, Dexuan Cui, Wei Liu,
	Stephen Hemminger, Haiyang Zhang, K. Y. Srinivasan, Michael Kelley

> Thank you for your very detailed reply! I'm going to send a V2 which
> should address all your comments.

Appreciated.  (Well, it might be worth giving other people/reviewers
some more time to process v1 and this discussion...  ;) )

> Provided that there are indeed drivers (hv_storvsc and hv_netvsc)
> which explicitly account for vmpacket_descriptor header, changing
> max_pkt_size for individual drivers makes more sense.
> However in this case I'm not sure about our reasoning of 'pkt_offset'
> above. In drivers/scsi/storvsc_drv.c:
>
> #define STORVSC_MAX_PKT_SIZE (sizeof(struct vmpacket_descriptor) +\
> 			      sizeof(struct vstor_packet))
>
> Should I also change this 'sizeof(struct vmpacket_descriptor)' to
> VMBUS_MAX_PKT_DESCR_SIZE? Otherwise this would not match the check in
> hv_pkt_iter_first.

AFAICT, the above #define is fine, i.e., it represents an upper bound
on pkt_len as used in hv_pkt_iter_first() (this is all that is required
of max_pkt_size, cf. the memcpy() in hv_pkt_iter_first()).

The same consideration, AFAICT, holds for NETVSC_MAX_PKT_SIZE.

The remarks about pkt_offset targeted the cases, such as hv_balloon,
where we can somehow upper bound

	(pkt_len - pkt_offset)

(the "packet payload"), since then an upper bound on pkt_offset would
give us an upper bound on pkt_len "for free" (associativity):

	pkt_len = (pkt_len - pkt_offset) + pkt_offset

  Andrea

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH] hv: account for packet descriptor in maximum packet size 2021-12-14 21:36 ` Andrea Parri @ 2021-12-15 12:30 ` Yanming Liu 2021-12-15 14:05 ` Andrea Parri 0 siblings, 1 reply; 10+ messages in thread From: Yanming Liu @ 2021-12-15 12:30 UTC (permalink / raw) To: Andrea Parri Cc: linux-hyperv, Andres Beltran, Dexuan Cui, Wei Liu, Stephen Hemminger, Haiyang Zhang, K. Y. Srinivasan, Michael Kelley > > Provided that there are indeed drivers (hv_storvsc and hv_netvsc) > > which explicitly account for vmpacket_descriptor header, changing > > max_pkt_size for individual drivers makes more sense. > > However in this case I'm not sure about our reasoning of 'pkt_offset' > > above. In drivers/scsi/storvsc_drv.c: > > > > #define STORVSC_MAX_PKT_SIZE (sizeof(struct vmpacket_descriptor) +\ > > sizeof(struct vstor_packet)) > > > > Should I also change this 'sizeof(struct vmpacket_descriptor)' to > > VMBUS_MAX_PKT_DESCR_SIZE? Otherwise this would not match the check in > > hv_pkt_iter_first. > > AFAICT, the above #define is fine, i.e., it represents an upper bound > on pkt_len as used in hv_pkt_iter_first() (this is all is required on > max_pkt_size, cf. the memcpy() in hv_pkt_iter_first()). > > The same consideration, AFAICT, holds for NETVSC_MAX_PKT_SIZE. > > The remarks about pkt_offset targetted the cases, such as hv_balloon, > where we can somehow upper bound > > (pkt_len - pkt_offset) > > (the "packet payload"), since then an upper bound on pkt_offset would I don't get it. Isn't it the same for storvsc? For storvsc we just have an upper bound of ("the packet payload") (pkt_len - pkt_offset) == sizeof(struct vstor_packet). 
With more details, in drivers/scsi/storvsc_drv.c,
storvsc_on_channel_callback() does:

	foreach_vmbus_pkt(desc, channel) {
		struct vstor_packet *packet = hv_pkt_data(desc);

where foreach_vmbus_pkt is a macro calling hv_pkt_iter_first, and
hv_pkt_data is defined as:

	/* Get data payload associated with descriptor */
	static inline void *hv_pkt_data(const struct vmpacket_descriptor *desc)
	{
		return (void *)((unsigned long)desc + (desc->offset8 << 3));
	}

i.e., it expects 'desc' to point to a buffer at least
'(desc->offset8 << 3) + sizeof(struct vstor_packet)' bytes long.

As Hyper-V is proprietary, I can only guess at the purpose of
desc->offset8 (forward compatibility), so I agree with you that this
is a real problem.  Currently, Hyper-V only sends vmbus packets with
offset8 == 2, so the expression above equals STORVSC_MAX_PKT_SIZE.
If a future Hyper-V somehow sends a packet with offset8 == 3,
hv_storvsc certainly breaks.

Or, is it guaranteed that desc->offset8 will always be 2 and never
change in future Hyper-V?

> give us an upper bound on pkt_len "for free" (associativity):
>
> 	pkt_len = (pkt_len - pkt_offset) + pkt_offset
>
>   Andrea

Regards,
Yanming
* Re: [PATCH] hv: account for packet descriptor in maximum packet size
  2021-12-15 12:30 ` Yanming Liu
@ 2021-12-15 14:05 ` Andrea Parri
  0 siblings, 0 replies; 10+ messages in thread
From: Andrea Parri @ 2021-12-15 14:05 UTC (permalink / raw)
To: Yanming Liu
Cc: linux-hyperv, Andres Beltran, Dexuan Cui, Wei Liu,
	Stephen Hemminger, Haiyang Zhang, K. Y. Srinivasan, Michael Kelley

> > AFAICT, the above #define is fine, i.e., it represents an upper bound
> > on pkt_len as used in hv_pkt_iter_first() (this is all that is
> > required of max_pkt_size, cf. the memcpy() in hv_pkt_iter_first()).
> >
> > The same consideration, AFAICT, holds for NETVSC_MAX_PKT_SIZE.
> >
> > The remarks about pkt_offset targeted the cases, such as hv_balloon,
> > where we can somehow upper bound
> >
> > 	(pkt_len - pkt_offset)
> >
> > (the "packet payload"), since then an upper bound on pkt_offset would
>
> I don't get it.  Isn't it the same for storvsc?  For storvsc we just
> have an upper bound on the "packet payload": (pkt_len - pkt_offset)
> == sizeof(struct vstor_packet).
>
> With more details, in drivers/scsi/storvsc_drv.c,
> storvsc_on_channel_callback() does:
>
> 	foreach_vmbus_pkt(desc, channel) {
> 		struct vstor_packet *packet = hv_pkt_data(desc);
>
> where foreach_vmbus_pkt is a macro calling hv_pkt_iter_first, and
> hv_pkt_data is defined as:
>
> 	/* Get data payload associated with descriptor */
> 	static inline void *hv_pkt_data(const struct vmpacket_descriptor *desc)
> 	{
> 		return (void *)((unsigned long)desc + (desc->offset8 << 3));
> 	}
>
> i.e., it expects 'desc' to point to a buffer at least
> '(desc->offset8 << 3) + sizeof(struct vstor_packet)' bytes long.
>
> As Hyper-V is proprietary, I can only guess at the purpose of
> desc->offset8 (forward compatibility), so I agree with you that this
> is a real problem.  Currently, Hyper-V only sends vmbus packets with
> offset8 == 2, so the expression above equals STORVSC_MAX_PKT_SIZE.
> If a future Hyper-V somehow sends a packet with offset8 == 3,
> hv_storvsc certainly breaks.

It actually looks to me like we're on the same page.  ;)  IOW,
pkt_offset is expected to be <= sizeof(struct vmpacket_descriptor)
(= 16 now) *for storvsc*.

With the risk of adding to the confusion, ;) pkt_offset is expected to
be > sizeof(struct vmpacket_descriptor) for netvsc (cf. the validation
of offset8 performed in netvsc_receive()).

  Andrea

> Or, is it guaranteed that desc->offset8 will always be 2 and never
> change in future Hyper-V?
>
> > give us an upper bound on pkt_len "for free" (associativity):
> >
> > 	pkt_len = (pkt_len - pkt_offset) + pkt_offset
> >
> >   Andrea
>
> Regards,
> Yanming
end of thread, other threads:[~2021-12-15 14:05 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-12 12:13 [PATCH] hv: account for packet descriptor in maximum packet size Yanming Liu
2021-12-13  1:47 ` Andrea Parri
2021-12-13  6:44   ` Yanming Liu
2021-12-13 17:01     ` Yanming Liu
2021-12-14  2:06       ` Andrea Parri
2021-12-14  4:28       ` Andrea Parri
2021-12-14 16:28         ` Yanming Liu
2021-12-14 21:36           ` Andrea Parri
2021-12-15 12:30             ` Yanming Liu
2021-12-15 14:05               ` Andrea Parri