From: Hangbin Liu <liuhangbin@gmail.com>
To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: netdev@vger.kernel.org, "Michael S . Tsirkin" <mst@redhat.com>,
Jason Wang <jasowang@redhat.com>,
"David S . Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Maxim Mikityanskiy <maximmi@mellanox.com>,
virtualization@lists.linux-foundation.org,
Balazs Nemeth <bnemeth@redhat.com>,
Mike Pattrick <mpattric@redhat.com>,
Eric Dumazet <edumazet@google.com>
Subject: Re: [PATCH net-next] net/af_packet: add VLAN support for AF_PACKET SOCK_RAW GSO
Date: Thu, 21 Apr 2022 10:31:56 +0800 [thread overview]
Message-ID: <YmDCHI330AUfcYKa@Laptop-X1> (raw)
In-Reply-To: <CA+FuTScyF4BKEcNSCYOv8SBA_EmB806YtKA17jb3F+fymVF-Pg@mail.gmail.com>
On Wed, Apr 20, 2022 at 09:45:15AM -0400, Willem de Bruijn wrote:
> On Wed, Apr 20, 2022 at 4:28 AM Hangbin Liu <liuhangbin@gmail.com> wrote:
> >
> > Currently, the kernel drops GSO VLAN tagged packet if it's created with
> > socket(AF_PACKET, SOCK_RAW, 0) plus virtio_net_hdr.
> >
> > The reason is AF_PACKET doesn't adjust the skb network header if there is
> > a VLAN tag. Then after virtio_net_hdr_set_proto() called, the skb->protocol
> > will be set to ETH_P_IP/IPv6. And in later inet/ipv6_gso_segment() the skb
> > is dropped as network header position is invalid.
> >
> > Let's handle VLAN packets by adjusting network header position in
> > packet_parse_headers(), and move the function in packet_snd() before
> > calling virtio_net_hdr_set_proto().
>
> The network header is set in
>
> skb_reset_network_header(skb);
>
> err = -EINVAL;
> if (sock->type == SOCK_DGRAM) {
> offset = dev_hard_header(skb, dev, ntohs(proto), addr,
> NULL, len);
> if (unlikely(offset < 0))
> goto out_free;
> } else if (reserve) {
> skb_reserve(skb, -reserve);
> if (len < reserve + sizeof(struct ipv6hdr) &&
> dev->min_header_len != dev->hard_header_len)
> skb_reset_network_header(skb);
> }
>
> If all that is needed is to move the network header beyond an optional
> VLAN tag in the case of SOCK_RAW, then this can be done in the else
> for Ethernet packets.
Before we set network position, we need to check the skb->protocol to make
sure it's a VLAN packet.
If we set skb->protocol and adjust network header here, like
packet_parse_headers() does. How should we do with later
skb->protocol = proto;
skb->dev = dev;
settings?
If you are afraid that skb_probe_transport_header(skb) would affect the
virtio_net_hdr operation. How about split the skb_probe_transport_header()
from packet_parse_headers()? Something like
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1924,13 +1924,19 @@ static int packet_rcv_spkt(struct sk_buff *skb, struct net_device *dev,
static void packet_parse_headers(struct sk_buff *skb, struct socket *sock)
{
+ int depth;
+
if ((!skb->protocol || skb->protocol == htons(ETH_P_ALL)) &&
sock->type == SOCK_RAW) {
skb_reset_mac_header(skb);
skb->protocol = dev_parse_header_protocol(skb);
}
- skb_probe_transport_header(skb);
+ /* Move network header to the right position for VLAN tagged packets */
+ if (likely(skb->dev->type == ARPHRD_ETHER) &&
+ eth_type_vlan(skb->protocol) &&
+ __vlan_get_protocol(skb, skb->protocol, &depth) != 0)
+ skb_set_network_header(skb, depth);
}
/*
@@ -3047,6 +3055,8 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
skb->mark = sockc.mark;
skb->tstamp = sockc.transmit_time;
+ packet_parse_headers(skb, sock);
+
if (has_vnet_hdr) {
err = virtio_net_hdr_to_skb(skb, &vnet_hdr, vio_le());
if (err)
@@ -3055,7 +3065,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
virtio_net_hdr_set_proto(skb, &vnet_hdr);
}
- packet_parse_headers(skb, sock);
+ skb_probe_transport_header(skb);
if (unlikely(extra_len == 4))
skb->no_fcs = 1;
>
> It is not safe to increase reserve, as that would eat into the
> reserved hlen LL_RESERVED_SPACE(dev), which does not account for
> optional VLAN headers.
>
> Instead of setting here first, then patching up again later in
> packet_parse_headers.
>
> This change affects all packets with VLAN headers, not just those with
> GSO. I imagine that moving the network header is safe for all, but
> don't know that code well enough to verify that it does not have
> unintended side effects. Does dev_queue_xmit expect the network header
> to point to the start of the VLAN headers or after, for instance?
I think adjust the network position should be safe, as tap device also did that.
Thanks
Hangbin
next prev parent reply other threads:[~2022-04-21 2:32 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-20 8:27 [PATCH net-next] net/af_packet: add VLAN support for AF_PACKET SOCK_RAW GSO Hangbin Liu
2022-04-20 13:45 ` Willem de Bruijn
2022-04-21 2:31 ` Hangbin Liu [this message]
2022-04-21 14:15 ` Willem de Bruijn
2022-04-22 2:08 ` Hangbin Liu
2022-04-22 21:39 ` Willem de Bruijn
2022-04-24 2:29 ` Hangbin Liu
2022-04-24 13:46 ` Willem de Bruijn
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YmDCHI330AUfcYKa@Laptop-X1 \
--to=liuhangbin@gmail.com \
--cc=bnemeth@redhat.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=jasowang@redhat.com \
--cc=kuba@kernel.org \
--cc=maximmi@mellanox.com \
--cc=mpattric@redhat.com \
--cc=mst@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=virtualization@lists.linux-foundation.org \
--cc=willemdebruijn.kernel@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox