From: Patrick McHardy <kaber@trash.net>
To: Thomas Poehnitzsch <thomas.poehnitzsch@informatik.tu-chemnitz.de>
Cc: netfilter developer mailinglist <netfilter-devel@lists.netfilter.org>
Subject: Re: [RFC]: ip_conntrack breaks UDP PMTU
Date: Sat, 15 Feb 2003 21:50:59 +0100
Message-ID: <3E4EA833.5050504@trash.net>
In-Reply-To: <20030215175841.GL30133@calix.csn.tu-chemnitz.de>

Thomas Poehnitzsch wrote:

>>>ip_conntrack defrags packets at PRE_ROUTING and LOCAL_OUT and
>>>refragments them at POST_ROUTING without caring about IP_DF. packets
>>>      
>>>
>
>What has IP_DF (I hope you mean the "Don't Fragment" bit in the
>IP-header) to do with _de_-fragmentation? As far as I understood RFC791
>
nothing.

>Could somebody please explain the notation: "IP_DF|IP_MF" to me? Does
>this mean at least one of both flags is set? And if so this is against
>my understanding of the above mentioned RFC791. If IP_DF is set, the
>packet _must not_ be fragmented, so IP_MF can't be set.
>
Both are set. "|" is a bitwise OR. NFS (always?) generates packets bigger
than the MTU, so they are fragmented and every fragment except the last one
has IP_MF set. If Linux wants to know the path MTU, it also sets IP_DF on
these, so the fragments may not be fragmented _further_.
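
To illustrate how both flags can sit in the same header, here is a minimal
sketch of decoding the 16-bit flags/fragment-offset field. The mask values
match the kernel's IP_DF, IP_MF and IP_OFFSET; the helper function names are
made up for this example, and frag_off is assumed to be in host byte order
already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Masks for the flags/fragment-offset field of the IPv4 header,
 * in host byte order. */
#define IP_DF     0x4000u /* Don't Fragment */
#define IP_MF     0x2000u /* More Fragments */
#define IP_OFFSET 0x1FFFu /* fragment offset, in units of 8 octets */

/* Hypothetical helper: true if the packet is one fragment of a larger
 * datagram (more fragments follow, or it has a non-zero offset). */
bool is_fragment(uint16_t frag_off)
{
    return (frag_off & (IP_MF | IP_OFFSET)) != 0;
}

/* Hypothetical helper: true if a router may split this packet further. */
bool may_fragment_further(uint16_t frag_off)
{
    return (frag_off & IP_DF) == 0;
}

/* Hypothetical helper: byte offset of this fragment in the datagram. */
unsigned int fragment_byte_offset(uint16_t frag_off)
{
    return (unsigned int)(frag_off & IP_OFFSET) * 8u;
}
```

A fragment with frag_off == (IP_DF | IP_MF) is therefore a first fragment
that must not be split any further, which is the case discussed here.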

>
>Let me go through an example of PMTUD and correct me if I am wrong with
>my view of this protocol:
>
>Assume host A wants to send a packet to host B and pmtud is enabled 
>(/proc/sys/net/ipv4/ip_no_pmtu_disc = 0) the IP_DF flag will be set in
>the packet sent. In case this packet will not pass through the eye of a
>needle further down the line, it will be dropped and an ICMP message
>(type 3 code 4: fragmentation needed and DF set) will be sent to A.
>Host A will then resend smaller packets (again with IP_DF set) until the
>packet reaches host B. 
>The way I understand it, there won't be any fragmented packet on the
>line in this connection, so iptables will not break anything.
>
>If on the other hand the IP_DF bit is not set, any host on the route
>from A to B is allowed to refragment the packet to fit the MTU of the
>next connection.
>
>Now I can't see any reason why iptables should not be allowed to
>reassemble and refragment a packet with IP_DF not set.
>

see above.
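
The PMTUD walk-through quoted above can be sketched as a small simulation.
This is only an illustration, not kernel code: discover_pmtu and link_mtus
are hypothetical names, and the sketch assumes that every ICMP type 3 code 4
message reports the next-hop MTU as described in RFC 1191.

```c
#include <assert.h>
#include <stddef.h>

/* Simulate a sender that always sets DF. Each link on the path drops a
 * packet larger than its MTU and reports that MTU back; the sender then
 * retries with the reported size. Returns the packet size that finally
 * reaches the destination. If attempts is not NULL, it receives the number
 * of packets sent. */
unsigned int discover_pmtu(unsigned int initial_size,
                           const unsigned int *link_mtus, size_t n_links,
                           unsigned int *attempts)
{
    unsigned int size = initial_size;
    unsigned int tries = 0;

    for (;;) {
        size_t i;

        tries++;
        for (i = 0; i < n_links; i++) {
            if (size > link_mtus[i]) {
                /* Dropped here: "fragmentation needed and DF set". */
                size = link_mtus[i];
                break;
            }
        }
        if (i == n_links)
            break; /* the packet passed every link */
    }

    if (attempts != NULL)
        *attempts = tries;
    return size;
}
```

The loop terminates because the size strictly decreases after every drop.
Note that it models unfragmented packets only; the NFS case above differs
because the sender itself emits fragments that carry DF.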

>
>  
>
>>>The problem is that we _need_ to defragment at NF_IP_PRE_ROUTING in
>>>order to be able to do connection tracking.  So at this point
>>>we would need to save the sizes of all individual fragments.  This
>>>would enable us to re-fragment to exactly the same size at
>>>POST_ROUTING. 
>>>      
>>>
>
>Do we really have to re-fragment to exactly the same size? Wouldn't it
>be sufficient to re-fragment to fragments not bigger in size than the
>biggest incoming fragment of this connection?
>
>  
>
>>>And then, what happens if NAT has to resize (enlarge/shrink) a packet.
>>>How should we deal with this while re-fragmenting? 
>>>      
>>>
>
>In my opinion we should just refragment it, as any router would do it.
>
That router is broken too. Think about a host doing path MTU discovery: the
packet doesn't fit the interface MTU, but NAT shrinks the packet so that it
does fit, and the host gets a wrong idea of the PMTU. Unfortunately I don't
know of a way to fix it, except maybe to also count the removed bytes when
deciding whether a packet needs to be fragmented.
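
A minimal sketch of that idea, assuming the NAT code can tell the
fragmentation check how many bytes it removed. The function name and the
removed_bytes parameter are hypothetical; nothing like this exists in the
code under discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Decide whether a packet counts as too big for the outgoing MTU.
 * len_after_nat is the current packet length; removed_bytes is how much
 * NAT shrank it (0 if the length is unchanged). Comparing the pre-NAT
 * length keeps the sender's view of the path MTU consistent, because the
 * sender only knows the size it originally transmitted. */
bool needs_fragmentation(unsigned int len_after_nat,
                         unsigned int removed_bytes,
                         unsigned int mtu)
{
    return len_after_nat + removed_bytes > mtu;
}
```

With an MTU of 1500, a packet sent at 1502 bytes and shrunk by NAT to 1498
would still be treated as oversized, so a sender using DF would get the
ICMP error it expects.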

>>And if we go for my first proposal, how/where would we store the
>>list-of-fragment-sizes?  We certainly don't want it to be dynamically
>>allocated... but according to RFC791 there can be 8192 fragments of 8
>>octets each...
>>    
>>
>
>I think we have to store fragment sizes of each connection, but storing
>
Even worse, we need to store the fragment sizes of each reassembled packet.
If we consider the case where not all fragments have DF set, and we want to
handle NAT resizing correctly, then besides the fragment sizes we also need
the fragment boundaries and fragment flags (-> iph->frag_off).
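
To make the storage question concrete, here is a rough sketch of what would
have to be recorded per reassembled packet. All names are hypothetical, and
the fixed cap of 16 entries is an arbitrary assumption for illustration; it
does not answer the worst case of 8192 fragments mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FRAG_LAYOUT_MAX 16 /* arbitrary cap, assumed for this sketch */

/* One entry per received fragment. */
struct frag_record {
    uint16_t frag_off; /* flags + offset as in iph->frag_off, host order */
    uint16_t len;      /* payload length of this fragment in bytes */
};

/* Original fragment layout of one reassembled packet. */
struct frag_layout {
    size_t count;
    struct frag_record frags[FRAG_LAYOUT_MAX];
};

/* Record one fragment; returns false if the fixed table is full. */
bool frag_layout_add(struct frag_layout *l, uint16_t frag_off, uint16_t len)
{
    assert(l != NULL);
    if (l->count >= FRAG_LAYOUT_MAX)
        return false;
    l->frags[l->count].frag_off = frag_off;
    l->frags[l->count].len = len;
    l->count++;
    return true;
}

/* Largest recorded fragment payload, or 0 if nothing was recorded. */
uint16_t frag_layout_max_len(const struct frag_layout *l)
{
    uint16_t max = 0;

    for (size_t i = 0; i < l->count; i++)
        if (l->frags[i].len > max)
            max = l->frags[i].len;
    return max;
}
```

Keeping frag_off per entry preserves both the boundary (offset) and the
DF/MF flags of each fragment, which is what refragmenting to the original
layout would need.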

Bye,
Patrick
