* [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook
@ 2004-02-14 18:37 James Morris
2004-02-14 18:37 ` Harald Welte
2004-02-14 19:07 ` Mika Penttilä
0 siblings, 2 replies; 18+ messages in thread
From: James Morris @ 2004-02-14 18:37 UTC (permalink / raw)
To: David S. Miller, Harald Welte; +Cc: netdev, Stephen Smalley
This patch fixes a bug with with multicast/broadcast packets,
Netfilter, and NICs which do hardware checksums.
Outgoing multicast and broadcast packets are cloned prior to being fed
into the postrouting hook and looped back. A problem is caused when the
shared packet data is modified by the netfilter core code when updating
the checksum, but the skb->ip_summed field in the header of the original
skb is not updated. The NIC then tries to do a hardware checksum on an
already correct checksum, and we end up transmitting the wrong thing.
This bug stops things like DHCP from working, and was noted under SELinux
which uses the postrouting hook alone.
The proposed solution below is to copy the skb rather than clone it, to
ensure that the original and looped back packets are independent.
Please review.
(A similar problem seems to exist in the IPv6 code, although not
verified yet).
- James
--
James Morris
<jmorris@redhat.com>
diff -urN -X dontdiff linux-2.6.3-rc2-mm1.o/net/ipv4/ip_output.c linux-2.6.3-rc2-mm1.w2/net/ipv4/ip_output.c
--- linux-2.6.3-rc2-mm1.o/net/ipv4/ip_output.c 2004-02-03 22:45:00.000000000 -0500
+++ linux-2.6.3-rc2-mm1.w2/net/ipv4/ip_output.c 2004-02-14 13:04:20.880941816 -0500
@@ -254,7 +254,7 @@
&& ((rt->rt_flags&RTCF_LOCAL) || !(IPCB(skb)->flags&IPSKB_FORWARDED))
#endif
) {
- struct sk_buff *newskb = skb_clone(skb, GFP_ATOMIC);
+ struct sk_buff *newskb = skb_copy(skb, GFP_ATOMIC);
if (newskb)
NF_HOOK(PF_INET, NF_IP_POST_ROUTING, newskb, NULL,
newskb->dev,
@@ -270,7 +270,7 @@
}
if (rt->rt_flags&RTCF_BROADCAST) {
- struct sk_buff *newskb = skb_clone(skb, GFP_ATOMIC);
+ struct sk_buff *newskb = skb_copy(skb, GFP_ATOMIC);
if (newskb)
NF_HOOK(PF_INET, NF_IP_POST_ROUTING, newskb, NULL,
newskb->dev, ip_dev_loopback_xmit);
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-14 18:37 [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook James Morris @ 2004-02-14 18:37 ` Harald Welte 2004-02-14 19:07 ` Mika Penttilä 1 sibling, 0 replies; 18+ messages in thread From: Harald Welte @ 2004-02-14 18:37 UTC (permalink / raw) To: James Morris; +Cc: David S. Miller, netdev, Stephen Smalley [-- Attachment #1: Type: text/plain, Size: 727 bytes --] On Sat, Feb 14, 2004 at 01:37:23PM -0500, James Morris wrote: > This patch fixes a bug with with multicast/broadcast packets, > Netfilter, and NICs which do hardware checksums. > > Please review. I'm fine with that patch, James. Maybe it's worth adding a short one-line comment stating why a clone is not sufficient here. > James Morris -- - Harald Welte <laforge@netfilter.org> http://www.netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." -- Paul Vixie [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-14 18:37 [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook James Morris 2004-02-14 18:37 ` Harald Welte @ 2004-02-14 19:07 ` Mika Penttilä 2004-02-14 23:00 ` David S. Miller 2004-02-15 6:09 ` James Morris 1 sibling, 2 replies; 18+ messages in thread From: Mika Penttilä @ 2004-02-14 19:07 UTC (permalink / raw) To: James Morris; +Cc: David S. Miller, Harald Welte, netdev, Stephen Smalley James Morris wrote: >This patch fixes a bug with with multicast/broadcast packets, >Netfilter, and NICs which do hardware checksums. > >Outgoing multicast and broadcast packets are cloned prior to being fed >into the postrouting hook and looped back. A problem is caused when the >shared packet data is modified by the netfilter core code when updating >the checksum, but the skb->ip_summed field in the header of the original >skb is not updated. The NIC then tries to do a hardware checksum on an >already correct checksum, and we end up transmitting the wrong thing. > >This bug stops things like DHCP from working, and was noted under SELinux >which uses the postrouting hook alone. > >The proposed solution below is to copy the skb rather than clone it, to >ensure that the original and looped back packets are independent. > >Please review. > >(A similar problem seems to exist in the IPv6 code, although not >verified yet). > > >- James > This is unneeded overhead for the common case. The right fix is to make sure the modifier (netfilter etc) makes the copy if needed. Actually, this is what skb_ip_make_writable() is doing. --Mika ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-14 19:07 ` Mika Penttilä @ 2004-02-14 23:00 ` David S. Miller 2004-02-15 6:09 ` James Morris 1 sibling, 0 replies; 18+ messages in thread From: David S. Miller @ 2004-02-14 23:00 UTC (permalink / raw) To: Mika Penttilä; +Cc: jmorris, laforge, netdev, sds On Sat, 14 Feb 2004 21:07:14 +0200 Mika Penttilä <mika.penttila@kolumbus.fi> wrote: > This is unneeded overhead for the common case. The right fix is to make > sure the modifier (netfilter etc) makes the copy if needed. Actually, > this is what skb_ip_make_writable() is doing. I totally agree. In postrouting hook, the handler must unshare the SKB if it wishes to modify the packet contents. It sounds to me like the selinux hooks are not doing this, and as suggested they should look at using the routine skb_ip_make_writable() which was designed by Rusty for this. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-14 19:07 ` Mika Penttilä 2004-02-14 23:00 ` David S. Miller @ 2004-02-15 6:09 ` James Morris 2004-02-15 9:34 ` Mika Penttilä 2004-02-17 15:54 ` Harald Welte 1 sibling, 2 replies; 18+ messages in thread From: James Morris @ 2004-02-15 6:09 UTC (permalink / raw) To: Mika Penttilä; +Cc: David S. Miller, Harald Welte, netdev, Stephen Smalley On Sat, 14 Feb 2004, Mika Penttilä wrote: > James Morris wrote: > > >The proposed solution below is to copy the skb rather than clone it, to > >ensure that the original and looped back packets are independent. > > > > This is unneeded overhead for the common case. The right fix is to make > sure the modifier (netfilter etc) makes the copy if needed. Actually, > this is what skb_ip_make_writable() is doing. The common case here will be only for locally generated multicast and broadcast packets. If the netfilter core code is modified instead, we will end up adding skb_ip_make_writable() to nf_hook_slow() which will be called for every packet with an output device which uses hardware checksums. Not sure which is worse, but here's a proposed patch which does this. - James -- James Morris <jmorris@redhat.com> diff -urN -X dontdiff linux-2.6.3-rc2-mm1.o/include/linux/netfilter.h linux-2.6.3-rc2-mm1.w2/include/linux/netfilter.h --- linux-2.6.3-rc2-mm1.o/include/linux/netfilter.h 2004-02-03 22:43:48.000000000 -0500 +++ linux-2.6.3-rc2-mm1.w2/include/linux/netfilter.h 2004-02-15 00:16:30.000000000 -0500 @@ -162,6 +162,12 @@ /* FIXME: Before cache is ever used, this must be implemented for real. */ extern void nf_invalidate_cache(int pf); +/* Call this before modifying an existing IP packet: ensures it is + modifiable and linear to the point you care about (writable_len). + Returns true or false. */ +extern int skb_ip_make_writable(struct sk_buff **pskb, + unsigned int writable_len); + #else /* !CONFIG_NETFILTER */ #define NF_HOOK(pf, hook, skb, indev, outdev, okfn) (okfn)(skb) #endif /*CONFIG_NETFILTER*/ diff -urN -X dontdiff linux-2.6.3-rc2-mm1.o/include/linux/netfilter_ipv4.h linux-2.6.3-rc2-mm1.w2/include/linux/netfilter_ipv4.h --- linux-2.6.3-rc2-mm1.o/include/linux/netfilter_ipv4.h 2004-02-03 22:44:26.000000000 -0500 +++ linux-2.6.3-rc2-mm1.w2/include/linux/netfilter_ipv4.h 2004-02-15 00:16:13.000000000 -0500 @@ -78,11 +78,6 @@ extern int ip_route_me_harder(struct sk_buff **pskb); -/* Call this before modifying an existing IP packet: ensures it is - modifiable and linear to the point you care about (writable_len). - Returns true or false. */ -extern int skb_ip_make_writable(struct sk_buff **pskb, - unsigned int writable_len); #endif /*__KERNEL__*/ #endif /*__LINUX_IP_NETFILTER_H*/ diff -urN -X dontdiff linux-2.6.3-rc2-mm1.o/net/core/netfilter.c linux-2.6.3-rc2-mm1.w2/net/core/netfilter.c --- linux-2.6.3-rc2-mm1.o/net/core/netfilter.c 2004-02-03 22:43:46.000000000 -0500 +++ linux-2.6.3-rc2-mm1.w2/net/core/netfilter.c 2004-02-15 00:57:05.000000000 -0500 @@ -509,6 +509,13 @@ if (outdev == NULL) { skb->ip_summed = CHECKSUM_NONE; } else { + int wlen = skb->h.raw - skb->data + skb->csum; + + /* packet is about to be modified */ + if (!skb_ip_make_writable(&skb, wlen)) { + kfree(skb); + return -ENOMEM; + } skb_checksum_help(skb); } } ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-15 6:09 ` James Morris @ 2004-02-15 9:34 ` Mika Penttilä 2004-02-15 13:03 ` James Morris 2004-02-17 15:54 ` Harald Welte 1 sibling, 1 reply; 18+ messages in thread From: Mika Penttilä @ 2004-02-15 9:34 UTC (permalink / raw) To: James Morris; +Cc: David S. Miller, Harald Welte, netdev, Stephen Smalley James Morris wrote: >On Sat, 14 Feb 2004, Mika Penttilä wrote: > > > >>James Morris wrote: >> >> >> >>>The proposed solution below is to copy the skb rather than clone it, to >>>ensure that the original and looped back packets are independent. >>> >>> >>> >>This is unneeded overhead for the common case. The right fix is to make >>sure the modifier (netfilter etc) makes the copy if needed. Actually, >>this is what skb_ip_make_writable() is doing. >> >> > >The common case here will be only for locally generated multicast and >broadcast packets. > >If the netfilter core code is modified instead, we will end up adding >skb_ip_make_writable() to nf_hook_slow() which will be called for every >packet with an output device which uses hardware checksums. > >Not sure which is worse, but here's a proposed patch which does this. > > >- James > I don't see the context here. Where is the packet mangled? Why isn't that instance doing skb_ip_make_writable()? selinux? Not everyone generating locally multicast/broadcast packets is using selinux... --Mika ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-15 9:34 ` Mika Penttilä @ 2004-02-15 13:03 ` James Morris 2004-02-15 13:40 ` Mika Penttilä 0 siblings, 1 reply; 18+ messages in thread From: James Morris @ 2004-02-15 13:03 UTC (permalink / raw) To: Mika Penttilä; +Cc: David S. Miller, Harald Welte, netdev, Stephen Smalley On Sun, 15 Feb 2004, Mika Penttilä wrote: > James Morris wrote: > > >On Sat, 14 Feb 2004, Mika Penttilä wrote: > >>> > >>This is unneeded overhead for the common case. The right fix is to make > >>sure the modifier (netfilter etc) makes the copy if needed. Actually, > >>this is what skb_ip_make_writable() is doing. > >> > >> > > > >The common case here will be only for locally generated multicast and > >broadcast packets. > > > >If the netfilter core code is modified instead, we will end up adding > >skb_ip_make_writable() to nf_hook_slow() which will be called for every > >packet with an output device which uses hardware checksums. > > > >Not sure which is worse, but here's a proposed patch which does this. > > > > > >- James > > > > I don't see the context here. Where is the packet mangled? Why isn't > that instance doing skb_ip_make_writable()? selinux? Not everyone > generating locally multicast/broadcast packets is using selinux... > [my previous post didn't seem to make it to the list, newer patch is attached below again] The packet is mangled in skb_checksum_help(), which is called by the Netfilter core code. It is not being mangled by SELinux. - James -- James Morris <jmorris@redhat.com> diff -urN -X dontdiff linux-2.6.3-rc2-mm1.o/include/linux/netfilter.h linux-2.6.3-rc2-mm1.w2/include/linux/netfilter.h --- linux-2.6.3-rc2-mm1.o/include/linux/netfilter.h 2004-02-03 22:43:48.000000000 -0500 +++ linux-2.6.3-rc2-mm1.w2/include/linux/netfilter.h 2004-02-15 00:16:30.000000000 -0500 @@ -162,6 +162,12 @@ /* FIXME: Before cache is ever used, this must be implemented for real. */ extern void nf_invalidate_cache(int pf); +/* Call this before modifying an existing IP packet: ensures it is + modifiable and linear to the point you care about (writable_len). + Returns true or false. */ +extern int skb_ip_make_writable(struct sk_buff **pskb, + unsigned int writable_len); + #else /* !CONFIG_NETFILTER */ #define NF_HOOK(pf, hook, skb, indev, outdev, okfn) (okfn)(skb) #endif /*CONFIG_NETFILTER*/ diff -urN -X dontdiff linux-2.6.3-rc2-mm1.o/include/linux/netfilter_ipv4.h linux-2.6.3-rc2-mm1.w2/include/linux/netfilter_ipv4.h --- linux-2.6.3-rc2-mm1.o/include/linux/netfilter_ipv4.h 2004-02-03 22:44:26.000000000 -0500 +++ linux-2.6.3-rc2-mm1.w2/include/linux/netfilter_ipv4.h 2004-02-15 00:16:13.000000000 -0500 @@ -78,11 +78,6 @@ extern int ip_route_me_harder(struct sk_buff **pskb); -/* Call this before modifying an existing IP packet: ensures it is - modifiable and linear to the point you care about (writable_len). - Returns true or false. */ -extern int skb_ip_make_writable(struct sk_buff **pskb, - unsigned int writable_len); #endif /*__KERNEL__*/ #endif /*__LINUX_IP_NETFILTER_H*/ diff -urN -X dontdiff linux-2.6.3-rc2-mm1.o/net/core/netfilter.c linux-2.6.3-rc2-mm1.w2/net/core/netfilter.c --- linux-2.6.3-rc2-mm1.o/net/core/netfilter.c 2004-02-03 22:43:46.000000000 -0500 +++ linux-2.6.3-rc2-mm1.w2/net/core/netfilter.c 2004-02-15 00:57:05.000000000 -0500 @@ -509,6 +509,13 @@ if (outdev == NULL) { skb->ip_summed = CHECKSUM_NONE; } else { + int wlen = skb->h.raw - skb->data + skb->csum; + + /* packet is about to be modified */ + if (!skb_ip_make_writable(&skb, wlen)) { + kfree(skb); + return -ENOMEM; + } skb_checksum_help(skb); } } ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-15 13:03 ` James Morris @ 2004-02-15 13:40 ` Mika Penttilä 2004-02-15 14:03 ` James Morris 0 siblings, 1 reply; 18+ messages in thread From: Mika Penttilä @ 2004-02-15 13:40 UTC (permalink / raw) To: James Morris; +Cc: David S. Miller, Harald Welte, netdev, Stephen Smalley James Morris wrote: >On Sun, 15 Feb 2004, Mika Penttilä wrote: > > > >>James Morris wrote: >> >> >> >>>On Sat, 14 Feb 2004, Mika Penttilä wrote: >>> >>> >>>>This is unneeded overhead for the common case. The right fix is to make >>>>sure the modifier (netfilter etc) makes the copy if needed. Actually, >>>>this is what skb_ip_make_writable() is doing. >>>> >>>> >>>> >>>> >>>The common case here will be only for locally generated multicast and >>>broadcast packets. >>> >>>If the netfilter core code is modified instead, we will end up adding >>>skb_ip_make_writable() to nf_hook_slow() which will be called for every >>>packet with an output device which uses hardware checksums. >>> >>>Not sure which is worse, but here's a proposed patch which does this. >>> >>> >>>- James >>> >>> >>> >>I don't see the context here. Where is the packet mangled? Why isn't >>that instance doing skb_ip_make_writable()? selinux? Not everyone >>generating locally multicast/broadcast packets is using selinux... >> >> >> > >[my previous post didn't seem to make it to the list, newer patch is >attached below again] > >The packet is mangled in skb_checksum_help(), which is called by the >Netfilter core code. It is not being mangled by SELinux. > > >- James > skb_checksum_help() updates skb->ip_summed to CHECKSUM_NONE in the original skb, and this should be seen by the driver. With your change the checksum is calculated twice, once for the looping back packet, and once for the outgoing. --Mika ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-15 13:40 ` Mika Penttilä @ 2004-02-15 14:03 ` James Morris 2004-02-15 16:00 ` Mika Penttilä 0 siblings, 1 reply; 18+ messages in thread From: James Morris @ 2004-02-15 14:03 UTC (permalink / raw) To: Mika Penttilä; +Cc: David S. Miller, Harald Welte, netdev, Stephen Smalley On Sun, 15 Feb 2004, Mika Penttilä wrote: > skb_checksum_help() updates skb->ip_summed to CHECKSUM_NONE in the > original skb, and this should be seen by the driver. With your change > the checksum is calculated twice, once for the looping back packet, and > once for the outgoing. [Looks like my posts are not making it to netdev] It updates ip_summed in the cloned packet header, but not in the original packet, which gets sent to the driver with an already completed checksum. See the original post. - James -- James Morris <jmorris@redhat.com> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-15 14:03 ` James Morris @ 2004-02-15 16:00 ` Mika Penttilä 2004-02-16 1:50 ` James Morris 0 siblings, 1 reply; 18+ messages in thread From: Mika Penttilä @ 2004-02-15 16:00 UTC (permalink / raw) To: James Morris; +Cc: David S. Miller, Harald Welte, netdev, Stephen Smalley James Morris wrote: >On Sun, 15 Feb 2004, Mika Penttilä wrote: > > > >>skb_checksum_help() updates skb->ip_summed to CHECKSUM_NONE in the >>original skb, and this should be seen by the driver. With your change >>the checksum is calculated twice, once for the looping back packet, and >>once for the outgoing. >> >> > >[Looks like my posts are not making it to netdev] > >It updates ip_summed in the cloned packet header, but not in the original >packet, which gets sent to the driver with an already completed checksum. > >See the original post. > > >- James > ah, sorry, I see your point now. Maybe this copy should be in skb_checksum_help() ? --Mika ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-15 16:00 ` Mika Penttilä @ 2004-02-16 1:50 ` James Morris 2004-02-16 6:43 ` Mika Penttilä 0 siblings, 1 reply; 18+ messages in thread From: James Morris @ 2004-02-16 1:50 UTC (permalink / raw) To: Mika Penttilä; +Cc: David S. Miller, Harald Welte, netdev, Stephen Smalley On Sun, 15 Feb 2004, Mika Penttilä wrote: > ah, sorry, I see your point now. Maybe this copy should be in > skb_checksum_help() ? No, it's not needed by all callers, and is just a does-one-thing helper function. - James -- James Morris <jmorris@redhat.com> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-16 1:50 ` James Morris @ 2004-02-16 6:43 ` Mika Penttilä 2004-02-16 13:45 ` James Morris 0 siblings, 1 reply; 18+ messages in thread From: Mika Penttilä @ 2004-02-16 6:43 UTC (permalink / raw) To: James Morris; +Cc: David S. Miller, Harald Welte, netdev, Stephen Smalley James Morris wrote: >On Sun, 15 Feb 2004, Mika Penttilä wrote: > > > >>ah, sorry, I see your point now. Maybe this copy should be in >>skb_checksum_help() ? >> >> > >No, it's not needed by all callers, and is just a does-one-thing helper >function. > > >- James > It is needed for cloned skbs, so check for this should go in skb_checksum_help(). --Mika ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-16 6:43 ` Mika Penttilä @ 2004-02-16 13:45 ` James Morris 2004-02-19 1:24 ` David S. Miller 0 siblings, 1 reply; 18+ messages in thread From: James Morris @ 2004-02-16 13:45 UTC (permalink / raw) To: Mika Penttilä; +Cc: David S. Miller, Harald Welte, netdev, Stephen Smalley On Mon, 16 Feb 2004, Mika Penttilä wrote: > James Morris wrote: > > >No, it's not needed by all callers, and is just a does-one-thing helper > >function. > > > > > It is needed for cloned skbs, so check for this should go in > skb_checksum_help(). No, there is no need to bloat a simple helper function out with this (and then need to handle OOM internally etc.), and it is not appropriate for all callers. - James -- James Morris <jmorris@redhat.com> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-16 13:45 ` James Morris @ 2004-02-19 1:24 ` David S. Miller 2004-02-23 22:19 ` James Morris 0 siblings, 1 reply; 18+ messages in thread From: David S. Miller @ 2004-02-19 1:24 UTC (permalink / raw) To: James Morris; +Cc: mika.penttila, laforge, netdev, sds On Mon, 16 Feb 2004 08:45:47 -0500 (EST) James Morris <jmorris@redhat.com> wrote: > On Mon, 16 Feb 2004, Mika Penttilä wrote: > > > James Morris wrote: > > > > >No, it's not needed by all callers, and is just a does-one-thing helper > > >function. > > > > > > > > It is needed for cloned skbs, so check for this should go in > > skb_checksum_help(). > > No, there is no need to bloat a simple helper function out with this > (and then need to handle OOM internally etc.), and it is not appropriate > for all callers. I think Mika is in a way correct James. Nobody may mangle packet data contents without unsharing SKB on output path. But things are not so simple. Just mucking with the checksum works because things which usually clone such packets (TCP retransmit queue) regenerate all headers and thus checksum fields upon each further use of such clones. I am tempted just to punish nf_hook_slow()'s invocation of skb_checksum_help() but that is not the right answer. Let us look at some of the other users of skb_checksum_help(). The one in dev_queue_xmit() and the IPSEC encapsulator xfrm drivers are doing so just to plug up a short lived race where a route lookup points to a non-IPSEC checksumming hardware path, but we end up in the IPSEC path or to a device which does not support hw checksumming. One could argue that what we could do in this case is just drop the packet and let the route relookup on retransmit get things right. This reminds me that we should audit and if necessary fix handling of the ethtool operations that turn on/off checksumming support, they would need to flush all of the cached checksumming capability state if they not do so already. I'm as confused as when we began this thread :-) I believe it is possible to replace all skb_checksum_help() calls with a packet drop, except for the nf_hook_slow() case. It may be possible to instead push the nf_hook_slow() instance simply deeper into the netfilter code so that the iterate/parse/match phase need not invoke such a heavy operation. Only if netfilter will actually do something with the packet would we skb_checksum_help() or whatever. And I even believe that most cases where the packet is mangled do not need to keep from still letting the card checksum the packet. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-19 1:24 ` David S. Miller @ 2004-02-23 22:19 ` James Morris 2004-02-29 5:50 ` David S. Miller 0 siblings, 1 reply; 18+ messages in thread From: James Morris @ 2004-02-23 22:19 UTC (permalink / raw) To: David S. Miller; +Cc: mika.penttila, Harald Welte, netdev, Stephen Smalley On Wed, 18 Feb 2004, David S. Miller wrote: > Let us look at some of the other users of skb_checksum_help(). The one in dev_queue_xmit() > and the IPSEC encapsulator xfrm drivers are doing so just to plug up a short lived race where > a route lookup points to a non-IPSEC checksumming hardware path, but we end up in the IPSEC > path or to a device which does not support hw checksumming. > > One could argue that what we could do in this case is just drop the packet and let the route > relookup on retransmit get things right. This does not seem to be the correct approach. If you drop packets in the ipsec cases at least, the retransmits just keep feeding packets back in with hardware checksumming set. I've included the entire patch so far below (which pushes the checksum handling further up into Netfilter) for reference. No data will flow on e.g. an AH connection with this patch. include/linux/netdevice.h | 1 include/linux/netfilter.h | 2 + include/linux/netfilter_ipv4.h | 11 ++++++++ include/linux/netfilter_ipv4/ip_nat_helper.h | 3 +- include/linux/netfilter_ipv4/ip_nat_protocol.h | 2 - include/linux/skbuff.h | 6 ++++ net/core/dev.c | 33 +++---------------------- net/core/netfilter.c | 30 ++++++++++++++++------ net/ipv4/ah4.c | 4 +-- net/ipv4/esp4.c | 6 ++-- net/ipv4/ipcomp.c | 4 +-- net/ipv4/netfilter/ip_fw_compat_redir.c | 5 +++ net/ipv4/netfilter/ip_nat_core.c | 12 ++++----- net/ipv4/netfilter/ip_nat_helper.c | 13 ++++++++- net/ipv4/netfilter/ip_nat_proto_icmp.c | 3 +- net/ipv4/netfilter/ip_nat_proto_tcp.c | 3 +- net/ipv4/netfilter/ip_nat_proto_udp.c | 6 +++- net/ipv4/netfilter/ip_nat_proto_unknown.c | 4 +-- net/ipv4/netfilter/ip_nat_snmp_basic.c | 5 +++ net/ipv4/netfilter/ipt_ECN.c | 10 ++++++- net/ipv4/netfilter/ipt_REJECT.c | 2 + net/ipv4/netfilter/ipt_TCPMSS.c | 1 net/ipv6/ah6.c | 4 +-- net/ipv6/esp6.c | 6 ++-- net/ipv6/ipcomp6.c | 4 +-- 25 files changed, 109 insertions(+), 71 deletions(-) - James -- James Morris <jmorris@redhat.com> diff -urN -X dontdiff linux-2.6.3-mm3.o/include/linux/netdevice.h linux-2.6.3-mm3.w/include/linux/netdevice.h --- linux-2.6.3-mm3.o/include/linux/netdevice.h 2004-02-23 10:52:58.000000000 -0500 +++ linux-2.6.3-mm3.w/include/linux/netdevice.h 2004-02-23 15:09:42.000000000 -0500 @@ -929,7 +929,6 @@ extern unsigned long netdev_fc_xoff; extern atomic_t netdev_dropping; extern int netdev_set_master(struct net_device *dev, struct net_device *master); -extern struct sk_buff * skb_checksum_help(struct sk_buff *skb); #ifdef CONFIG_NET_FASTROUTE extern int netdev_fastroute; extern int netdev_fastroute_obstacles; diff -urN -X dontdiff linux-2.6.3-mm3.o/include/linux/netfilter.h linux-2.6.3-mm3.w/include/linux/netfilter.h --- linux-2.6.3-mm3.o/include/linux/netfilter.h 2003-09-27 20:50:20.000000000 -0400 +++ linux-2.6.3-mm3.w/include/linux/netfilter.h 2004-02-23 15:09:42.000000000 -0500 @@ -159,6 +159,8 @@ extern void nf_dump_skb(int pf, struct sk_buff *skb); #endif +struct sk_buff *nf_skb_checksum_help(struct sk_buff *skb); + /* FIXME: Before cache is ever used, this must be implemented for real. */ extern void nf_invalidate_cache(int pf); diff -urN -X dontdiff linux-2.6.3-mm3.o/include/linux/netfilter_ipv4/ip_nat_helper.h linux-2.6.3-mm3.w/include/linux/netfilter_ipv4/ip_nat_helper.h --- linux-2.6.3-mm3.o/include/linux/netfilter_ipv4/ip_nat_helper.h 2003-09-27 20:50:14.000000000 -0400 +++ linux-2.6.3-mm3.w/include/linux/netfilter_ipv4/ip_nat_helper.h 2004-02-23 15:35:13.000000000 -0500 @@ -60,5 +60,6 @@ unsigned int rep_len); extern int ip_nat_seq_adjust(struct sk_buff **pskb, struct ip_conntrack *ct, - enum ip_conntrack_info ctinfo); + enum ip_conntrack_info ctinfo, + int hooknum); #endif diff -urN -X dontdiff linux-2.6.3-mm3.o/include/linux/netfilter_ipv4/ip_nat_protocol.h linux-2.6.3-mm3.w/include/linux/netfilter_ipv4/ip_nat_protocol.h --- linux-2.6.3-mm3.o/include/linux/netfilter_ipv4/ip_nat_protocol.h 2003-09-27 20:50:26.000000000 -0400 +++ linux-2.6.3-mm3.w/include/linux/netfilter_ipv4/ip_nat_protocol.h 2004-02-23 15:09:42.000000000 -0500 @@ -22,7 +22,7 @@ int (*manip_pkt)(struct sk_buff **pskb, unsigned int hdroff, const struct ip_conntrack_manip *manip, - enum ip_nat_manip_type maniptype); + enum ip_nat_manip_type maniptype, int hooknum); /* Is the manipable part of the tuple between min and max incl? */ int (*in_range)(const struct ip_conntrack_tuple *tuple, diff -urN -X dontdiff linux-2.6.3-mm3.o/include/linux/netfilter_ipv4.h linux-2.6.3-mm3.w/include/linux/netfilter_ipv4.h --- linux-2.6.3-mm3.o/include/linux/netfilter_ipv4.h 2004-01-09 09:41:35.000000000 -0500 +++ linux-2.6.3-mm3.w/include/linux/netfilter_ipv4.h 2004-02-23 15:09:42.000000000 -0500 @@ -83,6 +83,17 @@ Returns true or false. */ extern int skb_ip_make_writable(struct sk_buff **pskb, unsigned int writable_len); + +static inline void nf_ip_skb_checksum_help(struct sk_buff *skb, int hooknum) +{ + if (skb->ip_summed == CHECKSUM_HW) { + if (hooknum == NF_IP_PRE_ROUTING || hooknum == NF_IP_LOCAL_IN) + skb->ip_summed = CHECKSUM_NONE; + else + nf_skb_checksum_help(skb); + } +} + #endif /*__KERNEL__*/ #endif /*__LINUX_IP_NETFILTER_H*/ diff -urN -X dontdiff linux-2.6.3-mm3.o/include/linux/skbuff.h linux-2.6.3-mm3.w/include/linux/skbuff.h --- linux-2.6.3-mm3.o/include/linux/skbuff.h 2004-02-23 10:52:59.000000000 -0500 +++ linux-2.6.3-mm3.w/include/linux/skbuff.h 2004-02-23 16:35:28.233857264 -0500 @@ -1220,5 +1220,11 @@ #endif +static inline void skb_invalidate_hw_checksum(struct sk_buff *skb) +{ + if (skb->ip_summed == CHECKSUM_HW) + skb->ip_summed = CHECKSUM_NONE; +} + #endif /* __KERNEL__ */ #endif /* _LINUX_SKBUFF_H */ diff -urN -X dontdiff linux-2.6.3-mm3.o/net/core/dev.c linux-2.6.3-mm3.w/net/core/dev.c --- linux-2.6.3-mm3.o/net/core/dev.c 2004-02-23 10:52:59.000000000 -0500 +++ linux-2.6.3-mm3.w/net/core/dev.c 2004-02-23 15:09:42.000000000 -0500 @@ -99,7 +99,6 @@ #include <linux/divert.h> #include <net/dst.h> #include <net/pkt_sched.h> -#include <net/checksum.h> #include <linux/highmem.h> #include <linux/init.h> #include <linux/kmod.h> @@ -1196,30 +1195,6 @@ rcu_read_unlock(); } -/* Calculate csum in the case, when packet is misrouted. - * If it failed by some reason, ignore and send skb with wrong - * checksum. - */ -struct sk_buff *skb_checksum_help(struct sk_buff *skb) -{ - unsigned int csum; - int offset = skb->h.raw - skb->data; - - if (offset > (int)skb->len) - BUG(); - csum = skb_checksum(skb, offset, skb->len-offset, 0); - - offset = skb->tail - skb->h.raw; - if (offset <= 0) - BUG(); - if (skb->csum + 2 > offset) - BUG(); - - *(u16*)(skb->h.raw + skb->csum) = csum_fold(csum); - skb->ip_summed = CHECKSUM_NONE; - return skb; -} - #ifdef CONFIG_HIGHMEM /* Actually, we should eliminate this check as soon as we know, that: * 1. IOMMU is present and allows to map all the memory. @@ -1337,14 +1312,15 @@ goto out_kfree_skb; /* If packet is not checksummed and device does not support - * checksumming for this protocol, complete checksumming here. + * checksumming for this protocol. Let route lookup on retransmit + * fix things. */ if (skb->ip_summed == CHECKSUM_HW && (!(dev->features & (NETIF_F_HW_CSUM | NETIF_F_NO_CSUM)) && (!(dev->features & NETIF_F_IP_CSUM) || skb->protocol != htons(ETH_P_IP)))) { - if ((skb = skb_checksum_help(skb)) == NULL) - goto out; + rc = -EAGAIN; + goto out_kfree_skb; } /* Grab device queue */ @@ -3321,7 +3297,6 @@ EXPORT_SYMBOL(register_gifconf); EXPORT_SYMBOL(register_netdevice); EXPORT_SYMBOL(register_netdevice_notifier); -EXPORT_SYMBOL(skb_checksum_help); EXPORT_SYMBOL(synchronize_net); EXPORT_SYMBOL(unregister_netdevice); EXPORT_SYMBOL(unregister_netdevice_notifier); diff -urN -X dontdiff linux-2.6.3-mm3.o/net/core/netfilter.c linux-2.6.3-mm3.w/net/core/netfilter.c --- linux-2.6.3-mm3.o/net/core/netfilter.c 2003-10-15 08:53:19.000000000 -0400 +++ linux-2.6.3-mm3.w/net/core/netfilter.c 2004-02-23 16:27:39.580103464 -0500 @@ -26,6 +26,7 @@ #include <net/sock.h> #include <net/route.h> #include <linux/ip.h> +#include <net/checksum.h> #define __KERNEL_SYSCALLS__ #include <linux/unistd.h> @@ -505,14 +506,6 @@ unsigned int verdict; int ret = 0; - if (skb->ip_summed == CHECKSUM_HW) { - if (outdev == NULL) { - skb->ip_summed = CHECKSUM_NONE; - } else { - skb_checksum_help(skb); - } - } - /* We may already have this, but read-locks nest anyway */ rcu_read_lock(); @@ -743,6 +736,26 @@ EXPORT_SYMBOL(skb_ip_make_writable); #endif /*CONFIG_INET*/ +struct sk_buff *nf_skb_checksum_help(struct sk_buff *skb) +{ + unsigned int csum; + int offset = skb->h.raw - skb->data; + + if (offset > (int)skb->len) + BUG(); + csum = skb_checksum(skb, offset, skb->len-offset, 0); + + offset = skb->tail - skb->h.raw; + if (offset <= 0) + BUG(); + if (skb->csum + 2 > offset) + BUG(); + + *(u16*)(skb->h.raw + skb->csum) = csum_fold(csum); + skb->ip_summed = CHECKSUM_NONE; + return skb; +} + /* This does not belong here, but ipt_REJECT needs it if connection tracking in use: without this, connection may not be in hash table, @@ -773,3 +786,4 @@ EXPORT_SYMBOL(nf_unregister_hook); EXPORT_SYMBOL(nf_unregister_queue_handler); EXPORT_SYMBOL(nf_unregister_sockopt); +EXPORT_SYMBOL(nf_skb_checksum_help); diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv4/ah4.c linux-2.6.3-mm3.w/net/ipv4/ah4.c --- linux-2.6.3-mm3.o/net/ipv4/ah4.c 2004-02-04 08:39:07.000000000 -0500 +++ linux-2.6.3-mm3.w/net/ipv4/ah4.c 2004-02-23 15:09:42.000000000 -0500 @@ -67,8 +67,8 @@ char buf[60]; } tmp_iph; - if (skb->ip_summed == CHECKSUM_HW && skb_checksum_help(skb) == NULL) { - err = -EINVAL; + if (skb->ip_summed == CHECKSUM_HW) { + err = -EAGAIN; goto error_nolock; } diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv4/esp4.c linux-2.6.3-mm3.w/net/ipv4/esp4.c --- linux-2.6.3-mm3.o/net/ipv4/esp4.c 2003-09-27 20:50:29.000000000 -0400 +++ linux-2.6.3-mm3.w/net/ipv4/esp4.c 2004-02-23 15:09:42.000000000 -0500 @@ -41,9 +41,9 @@ char buf[60]; } tmp_iph; - /* First, if the skb is not checksummed, complete checksum. */ - if (skb->ip_summed == CHECKSUM_HW && skb_checksum_help(skb) == NULL) { - err = -EINVAL; + /* First, if the skb is not checksummed, allow retrans to fix . */ + if (skb->ip_summed == CHECKSUM_HW) { + err = -EAGAIN; goto error_nolock; } diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv4/ipcomp.c linux-2.6.3-mm3.w/net/ipv4/ipcomp.c --- linux-2.6.3-mm3.o/net/ipv4/ipcomp.c 2004-02-04 08:39:07.000000000 -0500 +++ linux-2.6.3-mm3.w/net/ipv4/ipcomp.c 2004-02-23 15:09:42.000000000 -0500 @@ -157,8 +157,8 @@ } tmp_iph; int hdr_len = 0; - if (skb->ip_summed == CHECKSUM_HW && skb_checksum_help(skb) == NULL) { - err = -EINVAL; + if (skb->ip_summed == CHECKSUM_HW) { + err = -EAGAIN; goto error_nolock; } diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_fw_compat_redir.c linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_fw_compat_redir.c --- linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_fw_compat_redir.c 2004-02-18 00:17:09.000000000 -0500 +++ linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_fw_compat_redir.c 2004-02-23 16:33:19.715395040 -0500 @@ -109,6 +109,7 @@ struct tcphdr *tcph = (struct tcphdr *)((u_int32_t *)iph + iph->ihl); + skb_invalidate_hw_checksum(skb); tcph->check = cheat_check(~redir->core.orig_dstip, redir->core.new_dstip, cheat_check(redir->core.orig_dport ^ 0xFFFF, @@ -197,11 +198,13 @@ if (skb->len < iph->ihl*4 + sizeof(*udph)) return NF_DROP; - if (udph->check) /* 0 is a special case meaning no checksum */ + if (udph->check) /* 0 is a special case meaning no checksum */ { + skb_invalidate_hw_checksum(skb); udph->check = cheat_check(~iph->daddr, newdst, cheat_check(udph->dest ^ 0xFFFF, redirpt, udph->check)); + } iph->check = cheat_check(~iph->daddr, newdst, iph->check); udph->dest = redirpt; iph->daddr = newdst; diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_nat_core.c linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_nat_core.c --- linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_nat_core.c 2004-02-18 00:17:09.000000000 -0500 +++ linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_nat_core.c 2004-02-23 15:34:41.000000000 -0500 @@ -721,7 +721,7 @@ struct sk_buff **pskb, unsigned int iphdroff, const struct ip_conntrack_manip *manip, - enum ip_nat_manip_type maniptype) + enum ip_nat_manip_type maniptype, int hooknum) { struct iphdr *iph; @@ -734,7 +734,7 @@ /* Manipulate protcol part. */ if (!find_nat_proto(proto)->manip_pkt(pskb, iphdroff + iph->ihl*4, - manip, maniptype)) + manip, maniptype, hooknum)) return 0; iph = (void *)(*pskb)->data + iphdroff; @@ -793,7 +793,7 @@ htons(info->manips[i].manip.u.all)); if (!manip_pkt(proto, pskb, 0, &info->manips[i].manip, - info->manips[i].maniptype)) { + info->manips[i].maniptype, hooknum)) { READ_UNLOCK(&ip_nat_lock); return NF_DROP; } @@ -858,7 +858,7 @@ DEBUGP("ip_nat_core: adjusting sequence number\n"); /* future: put this in a l4-proto specific function, * and call this function here. */ - if (!ip_nat_seq_adjust(pskb, ct, ctinfo)) + if (!ip_nat_seq_adjust(pskb, ct, ctinfo, hooknum)) ret = NF_DROP; } @@ -952,7 +952,7 @@ (*pskb)->nh.iph->ihl*4 + sizeof(inside->icmp), &info->manips[i].manip, - !info->manips[i].maniptype)) + !info->manips[i].maniptype, hooknum)) goto unlock_fail; /* Outer packet needs to have IP header NATed like @@ -966,7 +966,7 @@ NIPQUAD(info->manips[i].manip.ip)); if (!manip_pkt(0, pskb, 0, &info->manips[i].manip, - info->manips[i].maniptype)) + info->manips[i].maniptype, hooknum)) goto unlock_fail; } } diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_nat_helper.c linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_nat_helper.c --- linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_nat_helper.c 2004-02-18 00:17:09.000000000 -0500 +++ linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_nat_helper.c 2004-02-23 16:32:06.392541808 -0500 @@ -187,6 +187,7 @@ mangle_contents(*pskb, iph->ihl*4 + tcph->doff*4, match_offset, match_len, rep_buffer, rep_len); + skb_invalidate_hw_checksum(*pskb); datalen = (*pskb)->len - iph->ihl*4; tcph->check = 0; tcph->check = tcp_v4_check(tcph, datalen, iph->saddr, iph->daddr, @@ -245,6 +246,8 @@ /* fix udp checksum if udp checksum was previously calculated */ if (udph->check) { int datalen = (*pskb)->len - iph->ihl * 4; + + skb_invalidate_hw_checksum(*pskb); udph->check = 0; udph->check = csum_tcpudp_magic(iph->saddr, iph->daddr, datalen, IPPROTO_UDP, @@ -255,7 +258,11 @@ return 1; } -/* Adjust one found SACK option including checksum correction */ +/* Adjust one found SACK option including checksum correction. + * + * NOTE: if this is called by a path other than ip_nat_seq_adjust(), it wil + * need to be updated to take hardware checksumming into account + */ static void sack_adjust(struct sk_buff *skb, struct tcphdr *tcph, @@ -350,7 +357,8 @@ int ip_nat_seq_adjust(struct sk_buff **pskb, struct ip_conntrack *ct, - enum ip_conntrack_info ctinfo) + enum ip_conntrack_info ctinfo, + int hooknum) { struct tcphdr *tcph; int dir, newseq, newack; @@ -383,6 +391,7 @@ newack = ntohl(tcph->ack_seq) - other_way->offset_before; newack = htonl(newack); + nf_ip_skb_checksum_help(*pskb, hooknum); tcph->check = ip_nat_cheat_check(~tcph->seq, newseq, ip_nat_cheat_check(~tcph->ack_seq, newack, diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_nat_proto_icmp.c linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_nat_proto_icmp.c --- linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_nat_proto_icmp.c 2004-02-18 00:17:09.000000000 -0500 +++ linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_nat_proto_icmp.c 2004-02-23 16:21:11.785057272 -0500 @@ -55,7 +55,7 @@ icmp_manip_pkt(struct sk_buff **pskb, unsigned int hdroff, const struct ip_conntrack_manip *manip, - enum ip_nat_manip_type maniptype) + enum ip_nat_manip_type maniptype, int hooknum) { struct icmphdr *hdr; @@ -64,6 +64,7 @@ hdr = (void *)(*pskb)->data + hdroff; + nf_ip_skb_checksum_help(*pskb, hooknum); hdr->checksum = ip_nat_cheat_check(hdr->un.echo.id ^ 0xFFFF, manip->u.icmp.id, hdr->checksum); diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_nat_proto_tcp.c linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_nat_proto_tcp.c --- linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_nat_proto_tcp.c 2004-02-18 00:17:09.000000000 -0500 +++ linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_nat_proto_tcp.c 2004-02-23 16:21:06.464866064 -0500 @@ -86,7 +86,7 @@ tcp_manip_pkt(struct sk_buff **pskb, unsigned int hdroff, const struct ip_conntrack_manip *manip, - enum ip_nat_manip_type maniptype) + enum ip_nat_manip_type maniptype, int hooknum) { struct tcphdr *hdr; u_int32_t oldip; @@ -120,6 +120,7 @@ if (hdrsize < sizeof(*hdr)) return 1; + nf_ip_skb_checksum_help(*pskb, hooknum); hdr->check = ip_nat_cheat_check(~oldip, manip->ip, ip_nat_cheat_check(oldport ^ 0xFFFF, manip->u.tcp.port, diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_nat_proto_udp.c linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_nat_proto_udp.c --- linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_nat_proto_udp.c 2004-02-18 00:17:09.000000000 -0500 +++ linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_nat_proto_udp.c 2004-02-23 16:21:00.952704040 -0500 @@ -85,7 +85,7 @@ udp_manip_pkt(struct sk_buff **pskb, unsigned int hdroff, const struct ip_conntrack_manip *manip, - enum ip_nat_manip_type maniptype) + enum ip_nat_manip_type maniptype, int hooknum) { struct udphdr *hdr; u_int32_t oldip; @@ -104,11 +104,13 @@ oldip = (*pskb)->nh.iph->daddr; portptr = &hdr->dest; } - if (hdr->check) /* 0 is a special case meaning no checksum */ + if (hdr->check) /* 0 is a special case meaning no checksum */ { + nf_ip_skb_checksum_help(*pskb, hooknum); hdr->check = ip_nat_cheat_check(~oldip, manip->ip, ip_nat_cheat_check(*portptr ^ 0xFFFF, manip->u.udp.port, hdr->check)); + } *portptr = manip->u.udp.port; return 1; } diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_nat_proto_unknown.c linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_nat_proto_unknown.c --- linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_nat_proto_unknown.c 2004-02-18 00:17:09.000000000 -0500 +++ linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_nat_proto_unknown.c 2004-02-23 16:41:13.425380176 -0500 @@ -41,7 +41,7 @@ unknown_manip_pkt(struct sk_buff **pskb, unsigned int hdroff, const struct ip_conntrack_manip *manip, - enum ip_nat_manip_type maniptype) + enum ip_nat_manip_type maniptype, int hooknum) { return 1; } @@ -49,7 +49,7 @@ static unsigned int unknown_print(char *buffer, const struct ip_conntrack_tuple *match, - const struct ip_conntrack_tuple *mask) + const struct ip_conntrack_tuple *mask, int hooknum) { return 0; } diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_nat_snmp_basic.c linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_nat_snmp_basic.c --- linux-2.6.3-mm3.o/net/ipv4/netfilter/ip_nat_snmp_basic.c 2003-09-27 20:50:48.000000000 -0400 +++ linux-2.6.3-mm3.w/net/ipv4/netfilter/ip_nat_snmp_basic.c 2004-02-23 15:09:42.000000000 -0500 @@ -1231,6 +1231,8 @@ if (map.from == map.to) return NF_ACCEPT; + nf_ip_skb_checksum_help(*pskb, hooknum); + if (!snmp_parse_mangle((unsigned char *)udph + sizeof(struct udphdr), paylen, &map, &udph->check)) { printk(KERN_WARNING "bsalg: parser failed\n"); @@ -1252,6 +1254,9 @@ int dir = CTINFO2DIR(ctinfo); struct iphdr *iph = (*pskb)->nh.iph; struct udphdr *udph = (struct udphdr *)((u_int32_t *)iph + iph->ihl); + + if (!skb_ip_make_writable(pskb, (*pskb)->len)) + return NF_DROP; spin_lock_bh(&snmp_lock); diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv4/netfilter/ipt_ECN.c linux-2.6.3-mm3.w/net/ipv4/netfilter/ipt_ECN.c --- linux-2.6.3-mm3.o/net/ipv4/netfilter/ipt_ECN.c 2004-02-18 00:17:09.000000000 -0500 +++ linux-2.6.3-mm3.w/net/ipv4/netfilter/ipt_ECN.c 2004-02-23 15:09:42.000000000 -0500 @@ -50,7 +50,7 @@ /* Return 0 if there was an error. */ static inline int -set_ect_tcp(struct sk_buff **pskb, const struct ipt_ECN_info *einfo) +set_ect_tcp(struct sk_buff **pskb, const struct ipt_ECN_info *einfo, int inward) { struct tcphdr tcph; u_int16_t diffs[2]; @@ -74,6 +74,12 @@ if (!skb_ip_make_writable(pskb, (*pskb)->nh.iph->ihl*4+sizeof(tcph))) return 0; + if ((*pskb)->ip_summed == CHECKSUM_HW) { + if (inward) + (*pskb)->ip_summed = CHECKSUM_NONE; + else + nf_skb_checksum_help(*pskb); + } tcph.check = csum_fold(csum_partial((char *)diffs, sizeof(diffs), tcph.check^0xFFFF)); @@ -100,7 +106,7 @@ if (einfo->operation & (IPT_ECN_OP_SET_ECE | IPT_ECN_OP_SET_CWR) && (*pskb)->nh.iph->protocol == IPPROTO_TCP) - if (!set_ect_tcp(pskb, einfo)) + if (!set_ect_tcp(pskb, einfo, out == NULL)) return NF_DROP; return IPT_CONTINUE; diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv4/netfilter/ipt_REJECT.c linux-2.6.3-mm3.w/net/ipv4/netfilter/ipt_REJECT.c --- linux-2.6.3-mm3.o/net/ipv4/netfilter/ipt_REJECT.c 2004-02-18 00:17:09.000000000 -0500 +++ linux-2.6.3-mm3.w/net/ipv4/netfilter/ipt_REJECT.c 2004-02-23 16:33:50.132770896 -0500 @@ -374,6 +374,8 @@ if ((*pskb)->nh.iph->ihl<<2 != sizeof(struct iphdr)) return NF_DROP; + skb_invalidate_hw_checksum(*pskb); + /* WARNING: This code causes reentry within iptables. This means that the iptables jump stack is now crap. We must return an absolute verdict. --RR */ diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv4/netfilter/ipt_TCPMSS.c linux-2.6.3-mm3.w/net/ipv4/netfilter/ipt_TCPMSS.c --- linux-2.6.3-mm3.o/net/ipv4/netfilter/ipt_TCPMSS.c 2004-02-18 00:17:09.000000000 -0500 +++ linux-2.6.3-mm3.w/net/ipv4/netfilter/ipt_TCPMSS.c 2004-02-23 16:34:27.815042320 -0500 @@ -114,6 +114,7 @@ opt[i+2] = (newmss & 0xff00) >> 8; opt[i+3] = (newmss & 0x00ff); + skb_invalidate_hw_checksum(*pskb); tcph->check = cheat_check(htons(oldmss)^0xFFFF, htons(newmss), tcph->check); diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv6/ah6.c linux-2.6.3-mm3.w/net/ipv6/ah6.c --- linux-2.6.3-mm3.o/net/ipv6/ah6.c 2004-02-04 08:39:07.000000000 -0500 +++ linux-2.6.3-mm3.w/net/ipv6/ah6.c 2004-02-23 15:09:42.000000000 -0500 @@ -156,8 +156,8 @@ u16 nh_offset = 0; u8 nexthdr; - if (skb->ip_summed == CHECKSUM_HW && skb_checksum_help(skb) == NULL) { - err = -EINVAL; + if (skb->ip_summed == CHECKSUM_HW) { + err = -EAGAIN; goto error_nolock; } diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv6/esp6.c linux-2.6.3-mm3.w/net/ipv6/esp6.c --- linux-2.6.3-mm3.o/net/ipv6/esp6.c 2003-09-27 20:50:30.000000000 -0400 +++ linux-2.6.3-mm3.w/net/ipv6/esp6.c 2004-02-23 15:09:42.000000000 -0500 @@ -58,9 +58,9 @@ u8 *prevhdr; u8 nexthdr = 0; - /* First, if the skb is not checksummed, complete checksum. */ - if (skb->ip_summed == CHECKSUM_HW && skb_checksum_help(skb) == NULL) { - err = -EINVAL; + /* First, if the skb is not checksummed, allow retrans to fix. */ + if (skb->ip_summed == CHECKSUM_HW) { + err = -EAGAIN; goto error_nolock; } diff -urN -X dontdiff linux-2.6.3-mm3.o/net/ipv6/ipcomp6.c linux-2.6.3-mm3.w/net/ipv6/ipcomp6.c --- linux-2.6.3-mm3.o/net/ipv6/ipcomp6.c 2003-12-01 15:27:04.000000000 -0500 +++ linux-2.6.3-mm3.w/net/ipv6/ipcomp6.c 2004-02-23 15:09:42.000000000 -0500 @@ -132,8 +132,8 @@ int plen, dlen; u8 *start, *scratch = ipcd->scratch; - if (skb->ip_summed == CHECKSUM_HW && skb_checksum_help(skb) == NULL) { - err = -EINVAL; + if (skb->ip_summed == CHECKSUM_HW) { + err = -EAGAIN; goto error_nolock; } ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-23 22:19 ` James Morris @ 2004-02-29 5:50 ` David S. Miller 0 siblings, 0 replies; 18+ messages in thread From: David S. Miller @ 2004-02-29 5:50 UTC (permalink / raw) To: James Morris; +Cc: mika.penttila, laforge, netdev, sds On Mon, 23 Feb 2004 17:19:03 -0500 (EST) James Morris <jmorris@redhat.com> wrote: > This does not seem to be the correct approach. If you drop packets in > the ipsec cases at least, the retransmits just keep feeding packets back > in with hardware checksumming set. Ok, so TCP retransmits do this. There is a check missing somewhere in the TCP output path then. Or maybe, it refuses to mess with checksum state after the SKB has been created, even for further retransmits. More brainpower needed... ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-15 6:09 ` James Morris 2004-02-15 9:34 ` Mika Penttilä @ 2004-02-17 15:54 ` Harald Welte 2004-02-17 20:35 ` James Morris 1 sibling, 1 reply; 18+ messages in thread From: Harald Welte @ 2004-02-17 15:54 UTC (permalink / raw) To: James Morris; +Cc: netdev [-- Attachment #1: Type: text/plain, Size: 1450 bytes --] On Sun, Feb 15, 2004 at 01:09:23AM -0500, James Morris wrote: > On Sat, 14 Feb 2004, Mika Penttilä wrote: > > > James Morris wrote: > > > > >The proposed solution below is to copy the skb rather than clone it, to > > >ensure that the original and looped back packets are independent. > > > > > > > This is unneeded overhead for the common case. The right fix is to make > > sure the modifier (netfilter etc) makes the copy if needed. Actually, > > this is what skb_ip_make_writable() is doing. > > The common case here will be only for locally generated multicast and > broadcast packets. > > If the netfilter core code is modified instead, we will end up adding > skb_ip_make_writable() to nf_hook_slow() which will be called for every > packet with an output device which uses hardware checksums. > > Not sure which is worse, but here's a proposed patch which does this. Why can't we somehow check inside the netfilter hook if the packet is at least multicast/broadcast (or even better: also locally generated)? > - James > James Morris -- - Harald Welte <laforge@netfilter.org> http://www.netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." -- Paul Vixie [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook 2004-02-17 15:54 ` Harald Welte @ 2004-02-17 20:35 ` James Morris 0 siblings, 0 replies; 18+ messages in thread From: James Morris @ 2004-02-17 20:35 UTC (permalink / raw) To: Harald Welte; +Cc: netdev On Tue, 17 Feb 2004, Harald Welte wrote: > On Sun, Feb 15, 2004 at 01:09:23AM -0500, James Morris wrote: > > If the netfilter core code is modified instead, we will end up adding > > skb_ip_make_writable() to nf_hook_slow() which will be called for every > > packet with an output device which uses hardware checksums. > > > > Not sure which is worse, but here's a proposed patch which does this. > > Why can't we somehow check inside the netfilter hook if the packet is > at least multicast/broadcast (or even better: also locally generated)? Well, we need to use skb_ip_make_writable() in any case (not just for mcast/bcast packets), as we are modifying a part of the packet which may not be linear. Note that this is only called for packets with an outdev, and does nothing if the transport layer checksum is writeable. - James -- James Morris <jmorris@redhat.com> ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2004-02-29 5:50 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2004-02-14 18:37 [PATCH] Fix checksum bug for multicast/broadcast packets on postrouting hook James Morris 2004-02-14 18:37 ` Harald Welte 2004-02-14 19:07 ` Mika Penttilä 2004-02-14 23:00 ` David S. Miller 2004-02-15 6:09 ` James Morris 2004-02-15 9:34 ` Mika Penttilä 2004-02-15 13:03 ` James Morris 2004-02-15 13:40 ` Mika Penttilä 2004-02-15 14:03 ` James Morris 2004-02-15 16:00 ` Mika Penttilä 2004-02-16 1:50 ` James Morris 2004-02-16 6:43 ` Mika Penttilä 2004-02-16 13:45 ` James Morris 2004-02-19 1:24 ` David S. Miller 2004-02-23 22:19 ` James Morris 2004-02-29 5:50 ` David S. Miller 2004-02-17 15:54 ` Harald Welte 2004-02-17 20:35 ` James Morris
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).