From: Benjamin LaHaise <bcrl@kvack.org>
To: Tom Parkin <tparkin@katalix.com>
Cc: netdev@vger.kernel.org, jchapman@katalix.com
Subject: Re: [RFC] l2tp: avoid checksum offload for fragmented packets
Date: Mon, 27 May 2013 14:58:29 -0400
Message-ID: <20130527185829.GO1942@kvack.org>
In-Reply-To: <1368723860-22749-2-git-send-email-tparkin@katalix.com>
On Thu, May 16, 2013 at 06:04:20PM +0100, Tom Parkin wrote:
> Hardware offload for UDP datagram checksum calculation doesn't work with
> fragmented IP packets -- the device will note the fragmentation and leave the
> UDP checksum well alone.
> As such, if we expect the L2TP packet to be fragmented by the IP layer we need
> to perform the UDP checksum ourselves in software (ref: net/ipv4/udp.c).
Hrm, indeed.
> This change modifies the L2TP xmit path to fallback to software checksum
> calculation if the L2TP packet + IP header exceeds the tunnel device MTU.
> Since we don't know what the IP header length will be a priori, we assume the
> worst-case of 60b. This will likely result in unnecessary software
> checksumming when packet sizes approach the MTU since it's probably not common
> to be using the full IP header.
Using the worst-case value of 60 is a poor choice for many users of L2TP --
plenty of the wholesale ISP services in the world use PPPoE transport
sessions to ISPs using frames with headers of ethernet(14) + IP(20) + UDP(8) +
L2TP(6) = 48 (this setup is used by a number of large telcos here in Canada).
This will result in spurious use of software checksumming over links that
are provisioned with the minimum usable MTU (which is common with this kind
of link). Please make the code calculate the correct size of the added
headers to avoid unexpected CPU overhead.
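As a rough illustration of the size check being asked for, here is a userspace C sketch, not the kernel change itself. The helper name `l2tp_needs_sw_csum` and its parameters are made up for this example, and the MTU is assumed to be the IP-level MTU (so the 14-byte ethernet header is not counted against it).

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative constants only. */
enum {
    IPV4_HDR_MIN = 20, /* IPv4 header with no options */
    IPV4_HDR_MAX = 60, /* worst case assumed by the RFC patch */
    UDP_HDR_LEN  = 8,
};

/*
 * Hypothetical helper: returns true if an L2TP/UDP packet with the given
 * payload and L2TP header length would exceed mtu once an IP header of
 * ip_hdr_len bytes is added. In that case IP would fragment the packet and
 * the UDP checksum would have to be computed in software.
 */
static bool l2tp_needs_sw_csum(size_t payload_len, size_t l2tp_hdr_len,
                               size_t ip_hdr_len, size_t mtu)
{
    return ip_hdr_len + UDP_HDR_LEN + l2tp_hdr_len + payload_len > mtu;
}
```

With a 1500-byte MTU, a 6-byte L2TP header and a 1450-byte payload, passing IPV4_HDR_MIN gives 1484 bytes (no fragmentation), while passing IPV4_HDR_MAX gives 1524 bytes and would force software checksumming for a packet that never gets fragmented.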
> An alternative approach is to mimic UDP and use socket corking to allow us to
> pass the skb to the IP layer prior to finally pushing the button on xmit.
> This lets IP do his fragmentation before we authorise the packet send,
> allowing us to check whether the packet was actually fragmented by IP or not.
That is probably undesirable from a CPU usage point of view. Ideally, the
kernel's L2TP stack should generate ICMP frag needed messages for such
frames to avoid the fragmentation overhead (ipip is one such tunnelling
protocol that does this; there are others).
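The decision behind a "frag needed" reply can be sketched roughly as follows. This is a simplified userspace C illustration, not what ipip or any kernel tunnel actually implements; `tunnel_frag_needed_mtu` and its parameters are hypothetical, and PPP framing inside the L2TP session is ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical decision helper. Returns the MTU that would be advertised in
 * an ICMP "fragmentation needed" message, or 0 if no such message is called
 * for (the packet fits, or DF is clear so it may be fragmented).
 */
static size_t tunnel_frag_needed_mtu(size_t inner_len, bool inner_df,
                                     size_t path_mtu, size_t encap_overhead)
{
    size_t inner_mtu;

    if (path_mtu <= encap_overhead)
        return 0; /* degenerate setup; nothing sensible to advertise */

    /* Largest inner packet that still fits after encapsulation. */
    inner_mtu = path_mtu - encap_overhead;

    if (inner_df && inner_len > inner_mtu)
        return inner_mtu;
    return 0;
}
```

For a 1500-byte path MTU and 34 bytes of IP + UDP + L2TP overhead, a 1500-byte inner packet with DF set would be answered with an advertised MTU of 1466, so the sender shrinks its packets instead of the tunnel fragmenting them.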
> Signed-off-by: Tom Parkin <tparkin@katalix.com>
Nacked-by: Benjamin LaHaise, at least until the IPv6 issue (see below) is
fixed.
> ---
> net/l2tp/l2tp_core.c | 53 ++++++++++++++++++++++++++++++--------------------
> 1 file changed, 32 insertions(+), 21 deletions(-)
>
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 6984c3a..bc10658 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
...
> @@ -1197,30 +1224,14 @@ int l2tp_xmit_skb(struct l2tp_session *session, struct sk_buff *skb, int hdr_len
> uh->check = 0;
>
> /* Calculate UDP checksum if configured to do so */
> + if (sk->sk_no_check == UDP_CSUM_NOXMIT)
> + skb->ip_summed = CHECKSUM_NONE;
> #if IS_ENABLED(CONFIG_IPV6)
> - if (sk->sk_family == PF_INET6)
> + else if (sk->sk_family == PF_INET6)
> l2tp_xmit_ipv6_csum(sk, skb, udp_len);
> - else
...
The last time I checked, for IPv6 UDP packets, the checksum MUST always be
calculated (RFC 2460). If this has changed, you'll also need to update the
IPv6 UDP receive path to allow rx packets with a zero checksum, as I believe
they are noisily dropped at present.
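For reference, the RFC 2460 rule can be sketched in userspace C as below. This is an illustration of the pseudo-header checksum, not the kernel's implementation; the function names are invented for the example, and udp_len is assumed to be at most 65535 (no jumbograms). Per RFC 2460, a computed result of zero is sent as 0xFFFF, so zero never appears on the wire for UDP over IPv6.

```c
#include <stddef.h>
#include <stdint.h>

/* Add len bytes to a running ones' complement sum as 16-bit big-endian words. */
static uint32_t csum_add(uint32_t sum, const uint8_t *data, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8; /* pad odd trailing byte with zero */
    return sum;
}

/*
 * Checksum of a UDP segment (header + payload, checksum field zeroed) carried
 * over IPv6, using the RFC 2460 pseudo-header. The result is in host byte
 * order and is never 0.
 */
static uint16_t udp6_checksum(const uint8_t src[16], const uint8_t dst[16],
                              const uint8_t *udp, uint32_t udp_len)
{
    /* Pseudo-header tail: 32-bit upper-layer length, 3 zero bytes, next header. */
    uint8_t tail[8] = {
        (uint8_t)(udp_len >> 24), (uint8_t)(udp_len >> 16),
        (uint8_t)(udp_len >> 8), (uint8_t)udp_len,
        0, 0, 0, 17 /* next header: UDP */
    };
    uint32_t sum = 0;
    uint16_t csum;

    sum = csum_add(sum, src, 16);
    sum = csum_add(sum, dst, 16);
    sum = csum_add(sum, tail, sizeof(tail));
    sum = csum_add(sum, udp, udp_len);

    /* Fold carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    csum = (uint16_t)~sum;
    return csum ? csum : 0xffff; /* zero result is transmitted as 0xFFFF */
}
```

A sender that skips this step and leaves the field at zero produces packets that an RFC 2460 receiver is required to discard, which is the concern with the CHECKSUM_NONE branch above.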
-ben
--
"Thought is the essence of where you are now."