From: Guillaume Nault <gnault@redhat.com>
To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: Aleksey Shumnik <ashumnik9@gmail.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
Jakub Kicinski <kuba@kernel.org>,
waltje@uwalt.nl.mugnet.org, gw4pts@gw4pts.ampr.org, xeb@mail.ru,
kuznet@ms2.inr.ac.ru, rzsfl@rz.uni-sb.de
Subject: Re: [BUG] In af_packet.c::dev_parse_header() skb.network_header does not point to the network header
Date: Thu, 20 Apr 2023 18:19:19 +0200
Message-ID: <ZEFmB07IcyhjiqTC@debian>
In-Reply-To: <64362359316d5_1b9cfb29415@willemb.c.googlers.com.notmuch>

On Tue, Apr 11, 2023 at 11:19:53PM -0400, Willem de Bruijn wrote:
> Aleksey Shumnik wrote:
> > but in ip6_gre.c all
> > skb_mac_header(), skb_network_header(), skb_transport_header() return
> > a pointer to the payload (skb.data).
> > This function is called when receiving a packet and parsing it in
> > af_packet.c::packet_rcv() in dev_parse_header().
> > The problem is that there is no way to accurately determine the
> > beginning of the ip header.
>
> The issue happens when comparing packet_rcv on an ip_gre tunnel vs an
> ip6_gre tunnel.
>
> The packet_rcv call does the same in both cases, e.g., setting
> skb->data at mac or network header depending on SOCK_DGRAM or
> SOCK_RAW.
>
> The issue then is likely with a difference in tunnel implementations.
> Both implement header_ops and header_ops.create (which is used on
> receive by dev_has_header, but on transmit by dev_hard_header). They
> return different lengths: one with and one without the IP header.

The problem is that, upon reception on an af_packet socket, ip_gre
wants to set the outer source IP address in sll->sll_addr. That is, it
considers the outer IP header as the mac header of the gre device.
As far as I know, ip_gre is the only tunnel that does that.

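For illustration, here is a minimal user-space sketch of what this looks
like to an application reading from an AF_PACKET socket bound to an
ip_gre device. It assumes the device's parse callback reported a 4-byte
address, so that sll_halen is 4 and sll_addr holds the outer IPv4 source
address. sll_outer_saddr() is a hypothetical helper name, not an
existing API.

```c
#include <arpa/inet.h>
#include <linux/if_packet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not a kernel or libc API): format the address
 * found in a sockaddr_ll, as filled by recvfrom() on an AF_PACKET socket.
 * Assumption: the socket is bound to an ip_gre device, so sll_addr holds
 * the 4-byte outer IPv4 source address.
 * Returns buf on success, NULL if sll_halen is not 4 or on format error.
 */
static const char *sll_outer_saddr(const struct sockaddr_ll *sll,
				   char *buf, socklen_t len)
{
	struct in_addr addr;

	if (sll->sll_halen != sizeof(addr))
		return NULL;

	/* Copy out of sll_addr to avoid any alignment assumption. */
	memcpy(&addr, sll->sll_addr, sizeof(addr));
	return inet_ntop(AF_INET, &addr, buf, len);
}
```

A caller would pass the sockaddr_ll returned by recvfrom() and a buffer
of at least INET_ADDRSTRLEN bytes. On a device without a parse callback,
sll_halen is expected to be 0 and the helper returns NULL.
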
> We've seen inconsistency in this before between tunnels. See also
> commit aab1e898c26c. ipgre_xmit has special logic to optionally pull
> the headers, but only if header_ops is set, which it isn't for all
> variants of GRE tunnels.
>
> Probably particularly relevant is this section in __ipgre_rcv:
>
> 	/* Special case for ipgre_header_parse(), which expects the
> 	 * mac_header to point to the outer IP header.
> 	 */
> 	if (tunnel->dev->header_ops == &ipgre_header_ops)
> 		skb_pop_mac_header(skb);
> 	else
> 		skb_reset_mac_header(skb);
>
> and see this comment in the mentioned commit:
>
> ipgre_header_parse() seems to be the only case that requires mac_header
> to point to the outer header. We can detect this case accurately by
> checking ->header_ops. For all other cases, we can reset mac_header.

That series was about unifying the different IP tunnel behaviours, as
described in its cover letter (merge commit 8eb517a2a4ae
("Merge branch 'reset-mac'") has all the details).

The idea is to make all tunnel devices consistently set ->mac_header
and ->network_header to the corresponding inner headers. For tunnels
that directly transport network protocols, ->mac_header equals
->network_header (that is, the mac header length is 0).

But there's a problem with ip_gre, as it wants to access the outer
headers again, even though it has already pulled them. To do that,
ip_gre saves the offset of the outer IP header in ->mac_header, so
that ipgre_header_parse() can find it again later. That's why ip_gre
can't properly set ->mac_header to the inner mac header offset, as the
other tunnels do.

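The difference can be sketched with a toy user-space model. This is not
kernel code: struct toy_skb and all toy_*() names are hypothetical, the
receive path is collapsed into a single function, and the header sizes
assume a 20-byte outer IPv4 header and a 4-byte GRE header without
options. Only the three offset assignments are modelled on
skb_reset_mac_header(), skb_reset_network_header() and
skb_pop_mac_header().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for struct sk_buff: offsets are relative to buf[0]. */
struct toy_skb {
	uint8_t buf[64];
	size_t data;		/* plays the role of skb->data - skb->head */
	size_t mac_header;
	size_t network_header;
};

/* Assumed sizes: IPv4 without options, GRE without optional fields. */
enum { OUTER_IP_LEN = 20, GRE_LEN = 4, IP_SADDR_OFF = 12 };

static void toy_reset_mac_header(struct toy_skb *skb)
{
	skb->mac_header = skb->data;
}

static void toy_reset_network_header(struct toy_skb *skb)
{
	skb->network_header = skb->data;
}

static void toy_pop_mac_header(struct toy_skb *skb)
{
	skb->mac_header = skb->network_header;
}

/* Build an outer IPv4 + GRE packet; only the outer source address is set. */
static struct toy_skb toy_make_packet(const uint8_t saddr[4])
{
	struct toy_skb skb;

	memset(&skb, 0, sizeof(skb));
	memcpy(skb.buf + IP_SADDR_OFF, saddr, 4);
	/* On entry to the GRE receive path, network_header is the outer IP. */
	skb.network_header = 0;
	skb.data = 0;
	return skb;
}

/* Simplified receive path: pull the outer headers, then set the offsets. */
static void toy_gre_rcv(struct toy_skb *skb, int has_ipgre_header_ops)
{
	skb->data += OUTER_IP_LEN + GRE_LEN;	/* pull outer IP + GRE */

	if (has_ipgre_header_ops)
		toy_pop_mac_header(skb);	/* keep the outer IP offset */
	else
		toy_reset_mac_header(skb);	/* inner header, as other tunnels */

	toy_reset_network_header(skb);		/* inner network header */
}

/* Modelled on ipgre_header_parse(): read saddr through mac_header. */
static int toy_header_parse(const struct toy_skb *skb, uint8_t *haddr)
{
	memcpy(haddr, skb->buf + skb->mac_header + IP_SADDR_OFF, 4);
	return 4;
}
```

With has_ipgre_header_ops set, mac_header stays at offset 0 (the outer IP
header) while network_header moves to 24, so toy_header_parse() still
finds the outer source address. Otherwise both offsets are 24, which is
the "mac header length is 0" case.
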
I personally find this use of ->mac_header a bit hacky, but it's used
to implement a feature that some users require (see commit
0e3da5bb8da4 ("ip_gre: fix msg_name parsing for recvfrom/recvmsg")). We
could probably store the outer IP header offset elsewhere and reset
->mac_header the way all other tunnels do. But I didn't find a
satisfying solution, so I just kept ip_gre as an exception.

> > diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
> > index 90565b8..0d0c37b 100644
> > --- a/net/ipv6/ip6_gre.c
> > +++ b/net/ipv6/ip6_gre.c
> > @@ -1404,8 +1404,16 @@ static int ip6gre_header(struct sk_buff *skb, struct net_device *dev,
> >  	return -t->hlen;
> >  }
> >
> > +static int ip6gre_header_parse(const struct sk_buff *skb, unsigned char *haddr)
> > +{
> > +	const struct ipv6hdr *ipv6h = (const struct ipv6hdr *) skb_mac_header(skb);
> > +	memcpy(haddr, &ipv6h->saddr, 16);
> > +	return 16;
> > +}
> > +
> >  static const struct header_ops ip6gre_header_ops = {
> >  	.create = ip6gre_header,
> > +	.parse = ip6gre_header_parse,
> >  };
> >
> >  static const struct net_device_ops ip6gre_netdev_ops = {
> >
> > Would you answer whether this behavior is an error and why the
> > behavior in ip_gre.c and ip6_gre.c differs?
> >
> > Regards,
> > Aleksey
>
>