From: Paolo Abeni <pabeni@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: virtio-comment@lists.linux.dev, maxime.coquelin@redhat.com,
Eelco Chaudron <echaudro@redhat.com>,
Stefano Garzarella <sgarzare@redhat.com>,
Willem de Bruijn <willemb@google.com>,
kshankar@marvell.com
Subject: Re: [PATCH v9 1/2] virtio-net: define UDP tunnel segmentation offload feature
Date: Fri, 11 Oct 2024 09:50:25 +0200 [thread overview]
Message-ID: <582a4bc8-5e90-4e10-9c4a-288abbc088af@redhat.com> (raw)
In-Reply-To: <CACGkMEu5GWZrYVRN==F=vrRQQGnxj_WqX86YZ+diqC3s3m2m_A@mail.gmail.com>
On 10/11/24 04:08, Jason Wang wrote:
> On Thu, Oct 10, 2024 at 3:40 PM Paolo Abeni <pabeni@redhat.com> wrote:
>> The above means that an hypervisor with a S/W virtio implementation need
>> to deal with all sort of strange/unexpected combinations.
>
> Driver and device doesn't trust each other, they must do necessary
> validation no matter what features were negotiated.
The problem is that a non restrictive specification allows for
conflicting/confusing layout, hard to check hence the bugs. The goal
here is to try to enforce a restrictive specification for the new feature.
i.e. if we could replace the whole 'else' statement here:
https://elixir.bootlin.com/linux/v6.12-rc2/source/include/linux/virtio_net.h#L113
with a simpler:
if (gso_type)
return -EINVAL;
we would have avoid a lot of serious kernel bugs.
Note that the PoC code enforces the above, but just for GSO over UDP
tunnel packets.
>> Side note: as a consequence, sice we will not have anymore DATA_VALID
>> together with VIRTIO_NET_HDR_F_GSO_UDP_TUNNEL_IPV{4,6}, the above will
>> additional avoid the 'DATA_VALID with GSO over UDP tunnel means inner
>> header is valid' strangeness.
>
> Well, then we will still have the issue with NEEDS_CSUM? For example,
> as I ask in patch 2, if we want to offload outer checksum where do we
> put the out csum_start/csum_offset?
I think we don't have any issue there. For GSO over UDP tunnel packets,
csum_start/csum_offset always refer to the inner checksum. For GSO
without UDP tunnel, csum_start/csum_offset refers to the only checksum.
The outer checksum is offloaded with checksum partial only for GSO over
UDP tunnel packets, when the header carries and set the outer_th_offset
field:
"""
If the VIRTIO_NET_F_GUEST_UDP_TUNNEL_GSO_CSUM feature has been negotiated,
and the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM bit is set in \field{flags},
both the outer UDP checksum and the inner transport checksum
have been validated, otherwise only one level of checksums (the inner one
in case of tunnels) has been validated.
"""
Note that i.e. performing USO aggregation on the outer UDP for GSO over
UDP tunnel packets, will cause data stream corruption, must be avoided
regardless of virtio.
> Just to make sure we are talking the same thing. Actually, I meant for
> TX you propose the following text:
>
> """
> +The valid complete checksum of the outer UDP header of individual segments
> +can be computed by the driver prior to segmentation only if the GSO packet
> +size is a multiple of \field{gso_size}, because then all segments
> +have the same size and thus all data included in the outer UDP
> +checksum is the same for every segment. These pre-computed segment
> +length and checksum fields are different from those of the GSO
> +packet.
> """
>
> If I understand correctly, you are talking about the precomputed
> checksum by the driver which should be TX. If no checksum offload, I
> don't get how such pre-computing can help.
The above refers to GSO over UDP tunnel, so by the spec change requires
that there is csum offload for the inner transport. Precomputing the
outer csum basically leads to checksum partial, and allows a
device/driver pair not supporting outer header csum offload to handle
GSO over UDP tunnel with outer csum enabled.
>>>> The idea is that GSO over UDP tunnel specs are new, so we can be more
>>>> strict and we try to mandate only 'sane' flag/features/layout combinations.
>>>
>>> But I don't see how mandating device to set csum stuffs can help.
>>>
>>> 1) There's no way for the driver to know if device sets csum stuffs or not
>>> 2) Driver should not trust the device, malicious device can set
>>> arbitrary value, so driver needs its own validation
>>
>> We are discussing only about csum validy, right?
>
> Kind of, actually the context is whether mandating the device to set
> csum_start/offset for DATA_VALID can help.
>
> So I meant there's no difference in the following two cases:
>
> 1) Device doesn't set csum_start/offset, driver need to probe and
> validate the packet
> 2) Device set the csum_start/offset, driver still need to probe and
> validate the packet
The current implementation is not doing (rightful) any probing when
csum_start/offset are available, just sanity checks, which are way
simpler and safer.
> And as replied before, there's no way for the driver to know if device
> set csum_start/offset or not.
I don't understand this statement. According the the spec, if the device
has set NEEDS_CSUM it also must have set csum_start/offset. If a broken
or malicious device set random/bad values, sanity checks should catch it.
Note that the real problems are actually in the opposite direction: the
driver sends bad/malicious data to the device and potentially crashing
the host. Strict sanity checks should avoid that.
>> I hope and believe that this part should have been clarified by a
>> previously discussed point: no DATA_VALID for GSO over UDP tunnel, no
>> more differentiate/confusing meaning for the DATA VALID bit.
>
> Rethink of the design, the root cause is that we're trying to re-use
> existing fields for inner which works for outer before:
>
> For example. consider the case for non GSO tunneling, all
> csum_start/offset/hdr_flags etc were for the outer, and we change the
> semantic when GSO tunnel is supported. Such asymmetry might end up of
> a lot of complexity in both the spec and the implementation.
Have you looked at the actual PoC implementation? the proposed
specification allows re-using the existing code to process the inner
headers seamlessly:
https://github.com/pabeni/linux-devel/blob/virtio_tunnel_gso_v9/include/linux/virtio_net.h#L302
https://github.com/pabeni/linux-devel/blob/virtio_tunnel_gso_v9/include/linux/virtio_net.h#L359
For non tunneling GSO, the fields refer to the _only_ transport header
available. You can call it 'outer' but it's just an arbitrary choice.
I must admit all this discussion is quite confusing to me, as keep
touching consolidated and agreed stuff. My understanding was you were ok
with V8:
https://lore.kernel.org/virtio-comment/CA+FuTSf8274uQVEiJ-R5nnqAzDR9+N3WA0YoSbC-Q7MF5+O7Mw@mail.gmail.com/T/#mb455dd5f07ac98f1ccf79ba6edbd17a75f72df0b
and the only significant delta between v8 and v9 is a more restrictive
and defined usage of the DATA_VALID flag with GSO over UDP tunnel
packets (as also reported in the changelog).
I think we should restrict the discussion to that point. I agree the
conflicting definition of DATA_VALID GSO over tunnel and GSO without
tunnel could be confusing.
AFAICS there are 2 options to make the DATA_VALID usage with GSO over
UDP tunnel not contradictory with the non tunnel GSO:
1) forbid DATA_VALID with GSO over UDP tunnel
or
2) forbid DATA_VALID with GSO over UDP tunnel without UDP tunnel
checksum offload. That means that DATA_VALID for GSO over UDP tunnel
packets could only be set when both the inner and outher header are
offloaded.
I hope opt 1) would be simpler to define in details, opt 2 would still
allow for GSO over UDP tunnel HW GRO with DATA_VALID.
Paolo
next prev parent reply other threads:[~2024-10-11 7:50 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-04 8:13 [PATCH v9 0/2] virtio-net: define UDP tunnel offload Paolo Abeni
2024-10-04 8:13 ` [PATCH v9 1/2] virtio-net: define UDP tunnel segmentation offload feature Paolo Abeni
2024-10-09 7:18 ` Jason Wang
2024-10-09 8:37 ` Paolo Abeni
2024-10-10 3:17 ` Jason Wang
2024-10-10 7:40 ` Paolo Abeni
2024-10-11 2:08 ` Jason Wang
2024-10-11 7:50 ` Paolo Abeni [this message]
2024-10-14 7:20 ` Paolo Abeni
2024-10-17 6:47 ` Jason Wang
2024-10-17 15:34 ` Paolo Abeni
2024-10-18 4:26 ` Jason Wang
2024-10-18 10:10 ` Paolo Abeni
2024-10-20 22:28 ` Willem de Bruijn
2024-10-21 15:47 ` Paolo Abeni
2024-10-22 7:54 ` Jason Wang
2024-10-23 20:57 ` Willem de Bruijn
2024-10-25 8:41 ` Jason Wang
2024-10-21 6:54 ` Jason Wang
2024-10-21 16:27 ` Paolo Abeni
2024-10-22 7:42 ` Jason Wang
2024-10-22 16:56 ` Paolo Abeni
2024-10-25 8:28 ` Jason Wang
2024-10-25 11:50 ` Paolo Abeni
2024-10-25 13:28 ` Willem de Bruijn
2024-10-25 14:35 ` Paolo Abeni
2024-10-25 15:47 ` Willem de Bruijn
2024-10-28 3:27 ` Jason Wang
2024-10-28 12:08 ` Paolo Abeni
2024-10-28 12:26 ` Willem de Bruijn
2024-10-28 14:23 ` Paolo Abeni
2024-10-29 7:32 ` Jason Wang
2024-10-04 8:13 ` [PATCH v9 2/2] virtio-net: define UDP tunnel checksum " Paolo Abeni
2024-10-09 7:18 ` Jason Wang
2024-10-09 9:39 ` Paolo Abeni
2024-10-10 4:22 ` Jason Wang
2024-10-04 16:53 ` [PATCH v9 0/2] virtio-net: define UDP tunnel offload Paolo Abeni
2024-10-09 7:24 ` Jason Wang
2024-10-09 8:08 ` Paolo Abeni
2024-10-10 2:29 ` Jason Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=582a4bc8-5e90-4e10-9c4a-288abbc088af@redhat.com \
--to=pabeni@redhat.com \
--cc=echaudro@redhat.com \
--cc=jasowang@redhat.com \
--cc=kshankar@marvell.com \
--cc=maxime.coquelin@redhat.com \
--cc=sgarzare@redhat.com \
--cc=virtio-comment@lists.linux.dev \
--cc=willemb@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox