public inbox for linux-kernel@vger.kernel.org
* iptables and tcpdump
@ 2001-10-29  1:10 Rolf Fokkens
  2001-10-30  4:28 ` Rusty Russell
  0 siblings, 1 reply; 8+ messages in thread
From: Rolf Fokkens @ 2001-10-29  1:10 UTC
  To: linux-kernel

Hi!

I've been "tcpdumping" traffic that passes through a NAT box based on
netfilter. Everything works wonderfully, but tcpdump presents confusing data.
With the help of google I found out that tcpdump sees the data right after
the NF_IP_PRE_ROUTING and the NF_IP_POST_ROUTING hooks. This explains it all,
but results in a new question: why does tcpdump "see" the data after the
NF_IP_PRE_ROUTING hook instead of before, which more accurately reflects the
data that's on the wire?

I can imagine this has been explained before, but I haven't found the full
explanation. Could someone enlighten me?

Another thing is /proc/net/ip_conntrack. It also shows some confusing
information, like this:

icmp 1 29 src=145.66.17.200 dst=10.13.92.231 ... [UNREPLIED]
src=130.130.92.231 dst=145.66.17.200 ...

One half shows an unNATted dst, the second half shows the NATted src.
Logically speaking they should match, but they don't.

So everything works fine, but it's presented in a confusing way (tcpdump,
ip_conntrack). This may be intentional, but it seems a little accidental
to me.

Rolf

-------------------------------------------------------


* Re: iptables and tcpdump
  2001-10-29  1:10 iptables and tcpdump Rolf Fokkens
@ 2001-10-30  4:28 ` Rusty Russell
  2001-10-30  5:31   ` David S. Miller
  2001-10-30 17:31   ` kuznet
  0 siblings, 2 replies; 8+ messages in thread
From: Rusty Russell @ 2001-10-30  4:28 UTC
  To: Rolf Fokkens; +Cc: linux-kernel, kuznet

On Sun, 28 Oct 2001 17:10:41 -0800
Rolf Fokkens <fokkensr@linux06.vertis.nl> wrote:

> Hi!
> 
> I've been "tcpdumping" traffic that passes through a NAT box based on
> netfilter. Everything works wonderfully, but tcpdump presents confusing data.
> With the help of google I found out that tcpdump sees the data right after
> the NF_IP_PRE_ROUTING and the NF_IP_POST_ROUTING hooks. This explains it all,
> but results in a new question: why does tcpdump "see" the data after the
> NF_IP_PRE_ROUTING hook instead of before, which more accurately reflects the
> data that's on the wire?

It should see the packets on the wire (they are grabbed by tcpdump before
IP processing), but IIRC they are cloned (not copied) for tcpdump's use.

Alexey, should the NAT layer be doing skb_unshare() before altering the packet?

> icmp 1 29 src=145.66.17.200 dst=10.13.92.231 ... [UNREPLIED]
> src=130.130.92.231 dst=145.66.17.200 ...
> 
> One half shows an unNATted dst, the second half shows the NATted src.
> Logically speaking they should match but now they don't.

No, that's what the connection tracking will actually see.  If there is
no NAT, they will match.

Hope that clarifies,
Rusty.
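
[Editor's illustration, not part of the original mail.] The two halves of the
ip_conntrack entry can be read as the tuple seen for the original direction and
the tuple expected for replies. The following is a minimal sketch of that idea,
assuming the rule in question rewrites the destination address (the post does
not say so outright). struct tuple, tuple_invert() and expected_reply() are
hypothetical names for this example, not the kernel's conntrack structures.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified tuple: a real conntrack tuple also carries
 * protocol and port / ICMP id information. */
struct tuple {
    char src[16];   /* dotted-quad IPv4 address */
    char dst[16];
};

/* The reply direction of a flow is the same tuple with endpoints swapped. */
static struct tuple tuple_invert(const struct tuple *t)
{
    struct tuple r;

    assert(t != NULL);
    snprintf(r.src, sizeof(r.src), "%s", t->dst);
    snprintf(r.dst, sizeof(r.dst), "%s", t->src);
    return r;
}

/*
 * Tuple expected for reply packets. nat_dst is the address that a
 * destination-rewriting rule maps dst to, or NULL when no NAT applies.
 */
static struct tuple expected_reply(const struct tuple *orig,
                                   const char *nat_dst)
{
    struct tuple after_nat;

    assert(orig != NULL);
    after_nat = *orig;
    if (nat_dst != NULL)
        snprintf(after_nat.dst, sizeof(after_nat.dst), "%s", nat_dst);

    /* Replies come from the rewritten destination, so invert after NAT. */
    return tuple_invert(&after_nat);
}
```

With the addresses from the post, expected_reply() on src=145.66.17.200
dst=10.13.92.231 with nat_dst "130.130.92.231" gives src=130.130.92.231
dst=145.66.17.200, which is the second line of the entry. With nat_dst NULL the
reply tuple is just the mirror image of the original, the "no NAT" case where
the two halves match.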


* Re: iptables and tcpdump
  2001-10-30  4:28 ` Rusty Russell
@ 2001-10-30  5:31   ` David S. Miller
  2001-10-31  5:45     ` Rolf Fokkens
  2001-10-31  6:28     ` Rusty Russell
  2001-10-30 17:31   ` kuznet
  1 sibling, 2 replies; 8+ messages in thread
From: David S. Miller @ 2001-10-30  5:31 UTC
  To: rusty; +Cc: fokkensr, linux-kernel, kuznet

   From: Rusty Russell <rusty@rustcorp.com.au>
   Date: Tue, 30 Oct 2001 15:28:12 +1100
   
   should the NAT layer be doing skb_unshare() before altering the packet?

I think it should.

Look, if you are messing with packets before they go back out, and
tcpdump could have sniffed it on the way in, you can't change its
contents blindly.

Franks a lot,
David S. Miller
davem@redhat.com


* Re: iptables and tcpdump
  2001-10-30  4:28 ` Rusty Russell
  2001-10-30  5:31   ` David S. Miller
@ 2001-10-30 17:31   ` kuznet
  1 sibling, 0 replies; 8+ messages in thread
From: kuznet @ 2001-10-30 17:31 UTC
  To: Rusty Russell; +Cc: fokkensr, linux-kernel

Hello!

> Alexey, should the NAT layer be doing skb_unshare() before altering the packet?

MUST. Cloned skbs are read-only.

I did not expect such a question from you. :-)

Alexey
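
[Editor's illustration, not part of the original mail.] Why a clone must be
treated as read-only can be shown with a small userspace model. This is only a
sketch under stated assumptions: struct buf, buf_clone() and buf_unshare() are
hypothetical stand-ins for an skb handle, skb_clone() and skb_unshare(), and
the choice to free the handle when the copy fails is a decision of this model.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the packet data shared between clones. */
struct buf_data {
    int users;                  /* number of handles pointing here */
    size_t len;
    unsigned char bytes[64];
};

/* Hypothetical stand-in for an skb: a handle onto possibly shared data. */
struct buf {
    struct buf_data *data;
};

static struct buf *buf_alloc(const void *src, size_t len)
{
    struct buf *b = malloc(sizeof(*b));
    struct buf_data *d = malloc(sizeof(*d));

    if (!b || !d || len > sizeof(d->bytes)) {
        free(b);
        free(d);
        return NULL;
    }
    d->users = 1;
    d->len = len;
    memcpy(d->bytes, src, len);
    b->data = d;
    return b;
}

/* Clone: new handle, same data. Cheap, but writes are visible to both. */
static struct buf *buf_clone(struct buf *orig)
{
    struct buf *c = malloc(sizeof(*c));

    if (!c)
        return NULL;
    c->data = orig->data;
    c->data->users++;
    return c;
}

static void buf_free(struct buf *b)
{
    if (!b)
        return;
    assert(b->data->users > 0);
    if (--b->data->users == 0)
        free(b->data);
    free(b);
}

/* Unshare: give the handle private data if anyone else shares it.
 * In this model the handle is freed and NULL returned if the copy fails. */
static struct buf *buf_unshare(struct buf *b)
{
    struct buf_data *d;

    if (b->data->users == 1)
        return b;               /* already private, nothing to do */
    d = malloc(sizeof(*d));
    if (!d) {
        buf_free(b);
        return NULL;
    }
    memcpy(d, b->data, sizeof(*d));
    d->users = 1;
    b->data->users--;           /* drop our reference to the shared data */
    b->data = d;
    return b;
}

/* Returns 1 if the sniffer's clone keeps the on-the-wire bytes.
 * With use_unshare == 0 the "NAT" write leaks into the clone. */
static int sniffer_sees_wire_data(int use_unshare)
{
    const unsigned char wire[4] = { 10, 13, 92, 231 };
    struct buf *pkt = buf_alloc(wire, sizeof(wire));
    struct buf *sniffer;
    int intact;

    if (!pkt)
        return -1;
    sniffer = buf_clone(pkt);
    if (!sniffer) {
        buf_free(pkt);
        return -1;
    }
    if (use_unshare) {
        pkt = buf_unshare(pkt);
        if (!pkt) {
            buf_free(sniffer);
            return -1;
        }
    }
    pkt->data->bytes[0] = 130;  /* the "NAT" rewrite */
    pkt->data->bytes[1] = 130;
    intact = memcmp(sniffer->data->bytes, wire, sizeof(wire)) == 0;
    buf_free(sniffer);
    buf_free(pkt);
    return intact;
}
```

Calling sniffer_sees_wire_data(0) models the behaviour Rolf observed: the
sniffer's clone shows the rewritten address. Calling it with 1 models the fix
under discussion, where the writer takes a private copy first and the clone
keeps the data as it arrived.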


* Re: iptables and tcpdump
  2001-10-30  5:31   ` David S. Miller
@ 2001-10-31  5:45     ` Rolf Fokkens
  2001-10-31  6:28     ` Rusty Russell
  1 sibling, 0 replies; 8+ messages in thread
From: Rolf Fokkens @ 2001-10-31  5:45 UTC
  To: David S. Miller, rusty; +Cc: fokkensr, linux-kernel, kuznet

Hi!

I may have missed something, but I'm not on the mailing lists, which would
explain why. And the archives don't contain the email messages (yet) between
my initial question and this part of the discussion.

Apparently my question triggered a discussion about some deep NAT details at
the skb level. As far as I understand it, something goes wrong with the skb
cloning in the NAT layer: NAT changes read-only copies.

Is this the cause of the weird data that shows up with tcpdump?

Or in other words: does tcpdump show something buggy?

Rolf

On Tuesday 30 October 2001 09:31, you wrote:
> Hello!
>
> > Alexey, should the NAT layer be doing skb_unshare() before altering the
> > packet?
>
> MUST. Cloned skbs are read-only.
>
> I did not expect such a question from you. :-)
>
> Alexey


* Re: iptables and tcpdump
  2001-10-30  5:31   ` David S. Miller
  2001-10-31  5:45     ` Rolf Fokkens
@ 2001-10-31  6:28     ` Rusty Russell
  2001-10-31 13:34       ` kuznet
  2001-11-06 23:40       ` David S. Miller
  1 sibling, 2 replies; 8+ messages in thread
From: Rusty Russell @ 2001-10-31  6:28 UTC
  To: David S. Miller; +Cc: fokkensr, linux-kernel, kuznet

On Mon, 29 Oct 2001 21:31:57 -0800 (PST)
"David S. Miller" <davem@redhat.com> wrote:

>    From: Rusty Russell <rusty@rustcorp.com.au>
>    Date: Tue, 30 Oct 2001 15:28:12 +1100
>    
>    should the NAT layer be doing skb_unshare() before altering the packet?
> 
> I think it should.

Agreed.  The 2.2 masq code didn't do this, and hence the "don't tcpdump on masq host"
recommendation.

Please try this patch (compiles at least),
Rusty.

diff -urN -I \$.*\$ --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.4.13-official/net/ipv4/netfilter/ip_fw_compat.c working-2.4.13-nfunshare/net/ipv4/netfilter/ip_fw_compat.c
--- linux-2.4.13-official/net/ipv4/netfilter/ip_fw_compat.c	Sat Apr 28 07:15:01 2001
+++ working-2.4.13-nfunshare/net/ipv4/netfilter/ip_fw_compat.c	Wed Oct 31 17:05:53 2001
@@ -78,11 +78,19 @@
 {
 	int ret = FW_BLOCK;
 	u_int16_t redirpt;
+	struct sk_buff *nskb;
 
 	/* Assume worse case: any hook could change packet */
 	(*pskb)->nfcache |= NFC_UNKNOWN | NFC_ALTERED;
 	if ((*pskb)->ip_summed == CHECKSUM_HW)
 		(*pskb)->ip_summed = CHECKSUM_NONE;
+
+	/* Firewall rules can alter TOS: raw socket may have clone of
+           skb: don't disturb it --RR */
+	nskb = skb_unshare(*pskb, GFP_ATOMIC);
+	if (!nskb)
+		return NF_DROP;
+	*pskb = nskb;
 
 	switch (hooknum) {
 	case NF_IP_PRE_ROUTING:
diff -urN -I \$.*\$ --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.4.13-official/net/ipv4/netfilter/ip_nat_core.c working-2.4.13-nfunshare/net/ipv4/netfilter/ip_nat_core.c
--- linux-2.4.13-official/net/ipv4/netfilter/ip_nat_core.c	Thu May 17 03:31:27 2001
+++ working-2.4.13-nfunshare/net/ipv4/netfilter/ip_nat_core.c	Wed Oct 31 16:52:06 2001
@@ -734,6 +734,15 @@
 	   synchronize_bh()) can vanish. */
 	READ_LOCK(&ip_nat_lock);
 	for (i = 0; i < info->num_manips; i++) {
+		struct sk_buff *nskb;
+		/* raw socket may have clone of skb: don't disturb it --RR */
+		nskb = skb_unshare(*pskb, GFP_ATOMIC);
+		if (!nskb) {
+			READ_UNLOCK(&ip_nat_lock);
+			return NF_DROP;
+		}
+		*pskb = nskb;
+
 		if (info->manips[i].direction == dir
 		    && info->manips[i].hooknum == hooknum) {
 			DEBUGP("Mangling %p: %s to %u.%u.%u.%u %u\n",
diff -urN -I \$.*\$ --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.4.13-official/net/ipv4/netfilter/ipt_TCPMSS.c working-2.4.13-nfunshare/net/ipv4/netfilter/ipt_TCPMSS.c
--- linux-2.4.13-official/net/ipv4/netfilter/ipt_TCPMSS.c	Mon Oct  1 05:26:08 2001
+++ working-2.4.13-nfunshare/net/ipv4/netfilter/ipt_TCPMSS.c	Wed Oct 31 17:00:42 2001
@@ -48,6 +48,13 @@
 	u_int16_t tcplen, newtotlen, oldval, newmss;
 	unsigned int i;
 	u_int8_t *opt;
+	struct sk_buff *nskb;
+
+	/* raw socket may have clone of skb: don't disturb it --RR */
+	nskb = skb_unshare(*pskb, GFP_ATOMIC);
+	if (!nskb)
+		return NF_DROP;
+	*pskb = nskb;
 
 	tcplen = (*pskb)->len - iph->ihl*4;
 
diff -urN -I \$.*\$ --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.4.13-official/net/ipv4/netfilter/ipt_TOS.c working-2.4.13-nfunshare/net/ipv4/netfilter/ipt_TOS.c
--- linux-2.4.13-official/net/ipv4/netfilter/ipt_TOS.c	Mon Oct  1 05:26:08 2001
+++ working-2.4.13-nfunshare/net/ipv4/netfilter/ipt_TOS.c	Wed Oct 31 17:03:11 2001
@@ -19,7 +19,14 @@
 	const struct ipt_tos_target_info *tosinfo = targinfo;
 
 	if ((iph->tos & IPTOS_TOS_MASK) != tosinfo->tos) {
+		struct sk_buff *nskb;
 		u_int16_t diffs[2];
+
+		/* raw socket may have clone of skb: don't disturb it --RR */
+		nskb = skb_unshare(*pskb, GFP_ATOMIC);
+		if (!nskb)
+			return NF_DROP;
+		*pskb = nskb;
 
 		diffs[0] = htons(iph->tos) ^ 0xFFFF;
 		iph->tos = (iph->tos & IPTOS_PREC_MASK) | tosinfo->tos;


* Re: iptables and tcpdump
  2001-10-31  6:28     ` Rusty Russell
@ 2001-10-31 13:34       ` kuznet
  2001-11-06 23:40       ` David S. Miller
  1 sibling, 0 replies; 8+ messages in thread
From: kuznet @ 2001-10-31 13:34 UTC
  To: Rusty Russell; +Cc: davem, fokkensr, linux-kernel

Hello!

> Agreed.  The 2.2 masq code didn't do this, and hence the "don't tcpdump on masq host"
> recommendation.

Paul, it is quite possible that I smoked/drank something wrong
and saw this in a dream, but I really do remember that this bug
was fixed in some 2.1.x. :-)

Only the function is different: at that time skb_unshare() did some
unintelligible thing and was used only by AX.25 for an unknown purpose.
So the function that did the work was called skb_cow().

Alexey


* Re: iptables and tcpdump
  2001-10-31  6:28     ` Rusty Russell
  2001-10-31 13:34       ` kuznet
@ 2001-11-06 23:40       ` David S. Miller
  1 sibling, 0 replies; 8+ messages in thread
From: David S. Miller @ 2001-11-06 23:40 UTC
  To: rusty; +Cc: fokkensr, linux-kernel, kuznet

   From: Rusty Russell <rusty@rustcorp.com.au>
   Date: Wed, 31 Oct 2001 17:28:35 +1100

   On Mon, 29 Oct 2001 21:31:57 -0800 (PST)
   "David S. Miller" <davem@redhat.com> wrote:
   
   >    From: Rusty Russell <rusty@rustcorp.com.au>
   >    Date: Tue, 30 Oct 2001 15:28:12 +1100
   >    
   >    should the NAT layer be doing skb_unshare() before altering the packet?
   > 
   > I think it should.
   
   Agreed.  The 2.2 masq code didn't do this, and hence the "don't
   tcpdump on masq host" recommendation.
   
   Please try this patch (compiles at least),

Applied to my sources...

Franks a lot,
David S. Miller
davem@redhat.com
