From: "Jonathan M. McCune" <jonmccune@cmu.edu>
To: xen-devel@lists.xensource.com
Cc: caceres@us.ibm.com, jaegert@us.ibm.com, sailer@us.ibm.com
Subject: Xen checksumming bug with IPsec ESP packets
Date: Wed, 03 Aug 2005 12:27:00 -0400
Message-ID: <42F0F054.9030003@cmu.edu>
System setup:
We have a domU sending packets through DOM0 to an external
host machine. We have set up an IPsec tunnel between DOM0 and
the host machine, through which the packets between domU and
the host machine are routed. We switched off MAC bridging for
this purpose and configured all interfaces statically.
+ xen-unstable.hg from 7.28.2005.
+ Static IP configuration of DOM0 and one domU (NO MAC bridging).
+ IPsec enabled in DOM0 by editing linux-2.6.12-xen0/.config as
appropriate and recompiling.
+ IPsec using ESP in tunnel mode configured to tunnel all traffic
between domU and a particular external host (running linux-2.6.13-rc3
with IPsec enabled).
Problem:
Code for a checksum optimization imposed by Xen (basically, don't
checksum packets between domU and DOM0 since there is no real wire on
which they can become garbled) is not placed correctly. As-is, ESP
packets encapsulating IP packets from domU will be silently dropped by
DOM0. Using tcpdump on DOM0, the packets from domU can be seen arriving
in DOM0, but no ESP packet leaves DOM0 for the external host. It turns
out that an ESP packet is being created, but it gets dropped in
net/core/dev.c:dev_queue_xmit() in the switch(skb->nh.iph->protocol)
statement. It gets dropped here because the protocol is IPPROTO_ESP,
and that switch statement can only handle IPPROTO_[TCP|UDP]. The errno
returned is -ENOMEM. Debugging would have been significantly easier
with a more specific errno.
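The drop described above can be modelled in a few lines of C. This is
only a sketch under stated assumptions, not the actual Xen patch to
net/core/dev.c: struct pkt_model, fixup_blank_csum() and the PROTO_*
constants are hypothetical names invented for illustration, and the
real code operates on struct sk_buff.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* IANA protocol numbers, defined locally to keep the sketch portable. */
enum { PROTO_TCP = 6, PROTO_UDP = 17, PROTO_ESP = 50 };

/* Hypothetical stand-in for the sk_buff fields that matter here. */
struct pkt_model {
    int protocol;          /* models skb->nh.iph->protocol */
    int proto_csum_blank;  /* models skb->proto_csum_blank */
};

/*
 * Models the checksum fixup step: returns 0 if the packet may be
 * transmitted, or a negative errno if it is dropped.
 */
static int fixup_blank_csum(struct pkt_model *p)
{
    int rc = -ENOMEM;  /* default error code, as observed in the report */

    assert(p != NULL);
    if (!p->proto_csum_blank)
        return 0;      /* nothing to fix up */

    switch (p->protocol) {
    case PROTO_TCP:
    case PROTO_UDP:
        /* The real code would compute the TCP/UDP checksum here. */
        p->proto_csum_blank = 0;
        rc = 0;
        break;
    default:
        /* ESP (protocol 50) lands here and the packet is dropped. */
        break;
    }
    return rc;
}
```

In this model, a packet with protocol PROTO_ESP and
proto_csum_blank set comes back with -ENOMEM, which matches the
misleading errno we saw.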
More info:
Xen gives its virtual network interfaces in domU domains the
NETIF_F_IP_CSUM feature flag, which is defined in
include/linux/netdevice.h to mean the interface is capable only of
checksumming TCP/UDP over IPv4. The expectation is that one can then
get away with not checksumming TCP/UDP packets at all as they pass
between domU and DOM0. This looks to me like a common-case optimization
and saves CPU cycles. Some code is then inserted in
net/core/dev.c:dev_queue_xmit() on DOM0 which puts in the checksum for
packets that are actually going on to the rest of the world. This
manifested itself as a problem for us in two ways:
1. The code in net/core/dev.c:dev_queue_xmit() (activated when
skb->proto_csum_blank == 1) can only handle TCP and UDP packets destined
for the rest of the world. ESP packets activate the `default:` case in
the switch() statement, and thus fail with the default errno in that
function: -ENOMEM.
2. I changed net/core/dev.c:dev_queue_xmit() to allow ESP through
unmolested just because I was curious. The ESP packets then went all
the way from DOM0 to the external host, where they were decrypted. Once
the tunneled packet was exposed, it was dropped on the remote system
because it did not have a valid checksum. In other words, the logic in
DOM0 (the switch() statement in net/core/dev.c:dev_queue_xmit()) that is
supposed to insert the needed checksum into the original packet from
domU runs too late: by then the packet is already inside the ESP
payload.
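To illustrate why the decrypted inner packet was rejected in case 2,
here is a minimal sketch of the Internet checksum (RFC 1071) that
receivers verify. It is an illustration only: inet_csum(), csum_ok()
and csum_fill() are hypothetical helper names, and the TCP/UDP
pseudo-header that the real checksum also covers is omitted for
brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 ones' complement checksum over len bytes (big-endian words). */
static uint16_t inet_csum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(data != NULL);
    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (len & 1)                       /* pad a trailing odd byte */
        sum += (uint32_t)(data[len - 1] << 8);
    while (sum >> 16)                  /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* A segment verifies when the checksum over all its bytes is zero. */
static int csum_ok(const uint8_t *seg, size_t len)
{
    return inet_csum(seg, len) == 0;
}

/* Fill the 2-byte checksum field at even offset off. */
static void csum_fill(uint8_t *seg, size_t len, size_t off)
{
    uint16_t c;

    seg[off] = 0;
    seg[off + 1] = 0;
    c = inet_csum(seg, len);
    seg[off] = (uint8_t)(c >> 8);
    seg[off + 1] = (uint8_t)(c & 0xff);
}
```

A segment whose checksum field was left blank by domU fails csum_ok()
on the receiver. Calling csum_fill() fixes that, but only if it runs
on the plaintext packet before ESP encapsulation; once the packet is
encrypted, the fixup code in dev_queue_xmit() can no longer reach the
field.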
Problem summarized:
The original packet from domU did not get the checksum it needed, and
the ESP packet created in DOM0 wasn't allowed to leave because the
checksum fixup code, which runs too late, doesn't know how to handle
ESP packets.
Temporary Solution:
We worked around this by no longer setting the NETIF_F_IP_CSUM flag in
drivers/xen/netfront/netfront.c:create_netdev(). I believe this tells
the kernel to just always do the checksum in software. Thus, the broken
optimization for TCP/UDP packets gets bypassed.
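The effect of the workaround can be pictured with a simplified model
of the decision the stack makes per packet. This is a sketch based on
my understanding stated above, not the kernel's actual logic:
struct netdev_model, MODEL_F_IP_CSUM (with an illustrative bit value)
and tx_csum_path() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature bit standing in for NETIF_F_IP_CSUM. */
enum { MODEL_F_IP_CSUM = 2 };

/* Hypothetical stand-in for the features field of a net device. */
struct netdev_model {
    unsigned int features;
};

enum csum_path {
    CSUM_SOFTWARE,  /* stack computes the checksum before transmit */
    CSUM_DEFERRED   /* checksum left blank for a later stage to fill */
};

/* Decide who computes the checksum for an outgoing packet. */
static enum csum_path tx_csum_path(const struct netdev_model *dev,
                                   int is_tcp_or_udp_over_ipv4)
{
    assert(dev != NULL);
    if ((dev->features & MODEL_F_IP_CSUM) && is_tcp_or_udp_over_ipv4)
        return CSUM_DEFERRED;
    return CSUM_SOFTWARE;
}
```

With the flag cleared, every packet takes the CSUM_SOFTWARE path in
this model, so domU hands DOM0 packets that already carry valid
checksums, at the cost of the CPU cycles the optimization was meant to
save.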
Permanent Solution:
???
That's why I posted this message... :-)
Cheers,
-Jonathan McCune
jonmccune@cmu.edu