Re: [RFC]: ip_conntrack breaks UDP PMTU (Patrick McHardy)

* Re: [RFC]: ip_conntrack breaks UDP PMTU (Patrick McHardy)
       [not found] <20030215232635.25928.78900.Mailman@kashyyyk>
@ 2003-02-16  1:43 ` Don Cohen
  2003-02-16  3:41   ` Patrick McHardy
  2003-02-16 20:01   ` [RFC]: ip_conntrack breaks UDP PMTU (Patrick McHardy) Harald Welte
  0 siblings, 2 replies; 10+ messages in thread
From: Don Cohen @ 2003-02-16  1:43 UTC
  To: netfilter-devel

 > I guess some cruel decisions have to be made here, and we haven't even
 > started to think about mangling nat helpers ..

Clearly when you alter packet sizes PMTU can only be approximate,
with an error of the amount that a packet size could change.
On the other hand, internet routes are supposed to be somewhat
dynamic, so it's clear that PMTU discovery must be treated as only
heuristic, i.e., you have to be prepared for a size that worked before
to not work now.

 > I hope this discussion is not already over. Sorry, but it took me a
 > while to understand all the implications and to skip through some RFC's.

I'm also more than a little confused.
My understanding is that the UDP user can't receive fragments, just a 
defragmented "datagram", from which he can't tell whether or how it
was fragmented.  Is this correct?
The only thing the program can do to determine PMTU then is to set the
DF flag and find out whether the data(*) is delivered (either from a
reply that indicates it was or an ICMP that indicates otherwise).
Is that still correct?  I don't even see the word "fragment" in rfc
768 (UDP).
(*) the term data here is intentionally ambiguous, trying to not yet
deal with issues below

Is there some other RFC I should be looking for?

 > > >>ip_conntrack defrags packets at PRE_ROUTING and LOCAL_OUT and
 > > >>refragments them at POST_ROUTING without caring about IP_DF. packets
 > > >>with IP_DF|IP_MF can be refragmented with a different size, so path
 > > >>mtu discovery is broken.  Linux nfs itself sends out packets with
 > > >>IP_DF|IP_MF.

 > both are set. "|" is bitwise or. nfs (always?) generates packets bigger
 > than mtu so they are fragmented and have IP_MF set (except last one).
 > If linux wants to know path mtu it sets IP_DF on these, so the
 > fragments may not be _further_ fragmented.

This already seems suspicious to me.  Is there any RFC that
specifically says this is allowed or does this fall into the area
that violates the robustness principle (i.e., not being conservative 
in what you send to others)?

Is a user program allowed to specify what fragments to send?
I guess you're saying it's the system that's doing this, perhaps in
order to decide what size fragments to send.  In that case you might
even argue that the system on the other side should react differently
to different size fragments that arrive.  

So the scenario is:
You want to test PMTU <=1000
You send a "datagram" of size 2000 in fragments of size 1000 and 1000
both with DF set, expecting to get an icmp complaint only if PMTU
<1000.
A conntracking firewall (FW) defragments to a single datagram of size
2000.
At this point it has to refragment to forward the datagram.
Several possibilities:
- FW notices DF and rejects the datagram, telling you that fragmentation
was needed when DF was set
This might be considered incorrect since fragmentation was not needed
for the packets you sent.  On the other hand, the IP header in the icmp 
packet shows that the size that was too big was 2000.
- FW refragments to sizes 1500 and 500.  It's entirely unclear to me
whether those should be marked DF.  What if some of the incoming
fragments were DF and others were not?
In any case, the danger again is that one has to be fragmented later
on, but again the icmp reply contains the header showing the size of
the packet that was rejected.

In other words, when you do this DF PMTU discovery you ought to look
at the size of the packet that was rejected and draw conclusions from
that.  
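
To make the scenario above concrete, here is a minimal sketch of how a
sender or forwarder might split a datagram into fragments for a given
MTU. It assumes a 20-byte IPv4 header without options, and the helper
name ip_fragment_sizes is hypothetical (it is not a kernel function).
Every fragment except the last must carry a multiple of 8 payload bytes,
so the round numbers used above are only approximate.

```c
#include <assert.h>
#include <stddef.h>

#define IP_HDR_LEN 20u  /* assumption: IPv4 header without options */

/*
 * Hypothetical helper: split `payload_len` bytes of IP payload into
 * fragments that fit `mtu`.  Writes the on-wire length (header plus
 * payload) of each fragment into `out` and returns the fragment count.
 * Returns 0 if `out` is too small, the MTU leaves no room for payload,
 * or there is no payload at all.
 */
size_t ip_fragment_sizes(unsigned payload_len, unsigned mtu,
                         unsigned *out, size_t out_cap)
{
    size_t n = 0;
    unsigned max_payload, per_frag;

    assert(out != NULL);
    if (mtu < IP_HDR_LEN + 8u)
        return 0;

    max_payload = mtu - IP_HDR_LEN;
    /* Non-final fragments must carry a multiple of 8 payload bytes. */
    per_frag = (max_payload / 8u) * 8u;

    while (payload_len > 0) {
        /* The final fragment may use any length that still fits. */
        unsigned chunk = payload_len <= max_payload ? payload_len : per_frag;

        if (n == out_cap)
            return 0;
        out[n++] = IP_HDR_LEN + chunk;
        payload_len -= chunk;
    }
    return n;
}
```

For a 2000-byte datagram (1980 bytes of payload) this gives fragments of
1500 and 520 bytes at MTU 1500, and 996, 996 and 48 bytes at MTU 1000,
so "1000 and 1000" cannot actually occur on the wire.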

 > What about storing the biggest fragment size of a packet at
 > defragmentation and refragmenting the packet with that size at
 > POST_ROUTING if MTU is not smaller.
So at this point I gather the goal is the other side of the
robustness principle - trying to make something work that probably 
shouldn't have been tried to begin with.  Are there lots of other
things out there that play the same (questionable?) tricks?

The suggestion above seems fair enough, but if the problem that
we're trying to fix is specifically in linux, perhaps the other 
end should also be fixed?
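
The quoted suggestion could be sketched roughly as follows: remember the
largest fragment seen while reassembling, then cap refragmentation at
that size unless the outgoing MTU is smaller. The struct and function
names are hypothetical and do not correspond to actual ip_conntrack code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-datagram reassembly state. */
struct reasm_state {
    unsigned max_frag_len;  /* largest on-wire fragment seen so far */
};

/* Record one incoming fragment of on-wire length `frag_len`. */
void reasm_note_fragment(struct reasm_state *st, unsigned frag_len)
{
    assert(st != NULL);
    if (frag_len > st->max_frag_len)
        st->max_frag_len = frag_len;
}

/*
 * Size limit to use when refragmenting at POST_ROUTING: the largest
 * original fragment, unless the outgoing MTU is smaller.
 */
unsigned reasm_refrag_limit(const struct reasm_state *st, unsigned out_mtu)
{
    assert(st != NULL);
    if (st->max_frag_len == 0 || st->max_frag_len > out_mtu)
        return out_mtu;  /* not fragmented on arrival, or MTU is smaller */
    return st->max_frag_len;
}
```

This only preserves an upper bound on fragment size, not the original
fragment boundaries or the per-fragment DF flags.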

 > >I think we have to store fragment sizes of each connection, but storing
 > >
 > even worse we need to store the fragment sizes of each reassembled
 > packet. if we consider the case not all fragments have DF set and we
 > would want to handle nat resizing correctly, besides fragment sizes
 > we also need fragment boundaries and fragment flags (-> iph->frag_off).
Keeping in mind that PMTU can change as path changes, it seems 
reasonable to do this on a per-datagram basis rather than
per-connection.  That still doesn't seem so expensive.


* Re: [RFC]: ip_conntrack breaks UDP PMTU (Patrick McHardy)
  2003-02-16  1:43 ` [RFC]: ip_conntrack breaks UDP PMTU (Patrick McHardy) Don Cohen
@ 2003-02-16  3:41   ` Patrick McHardy
  2003-02-16  6:00     ` Don Cohen
  2003-02-16 20:01   ` [RFC]: ip_conntrack breaks UDP PMTU (Patrick McHardy) Harald Welte
  1 sibling, 1 reply; 10+ messages in thread
From: Patrick McHardy @ 2003-02-16  3:41 UTC
  To: Don Cohen; +Cc: netfilter-devel

Don Cohen wrote:

> > I guess some cruel decisions have to be made here, and we haven't even
> > started to think about mangling nat helpers ..
>
>Clearly when you alter packet sizes PMTU can only be approximate,
>with an error of the amount that a packet size could change.
>On the other hand, internet routes are supposed to be somewhat
>dynamic, so it's clear that PMTU discovery must be treated as only
>heuristic, i.e., you have to be prepared for a size that worked before
>to not work now.
>  
>

if you don't set DF there's no problem. pmtu discovery is there to minimize
header overhead; one byte off and you have the worst case. since the sender
of an icmp fragmentation required message includes the max. mtu it can
handle without fragmenting, it is clear it is not to be treated as only
heuristic.
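
For reference, the MTU mentioned here is the "next-hop MTU" field that
RFC 1191 places in bytes 6 and 7 of an ICMP destination unreachable
message with code 4. A minimal sketch of reading it from a raw ICMP
message follows; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ICMP_TYPE_DEST_UNREACH 3
#define ICMP_CODE_FRAG_NEEDED  4

/*
 * Returns the next-hop MTU from an ICMP "fragmentation needed and DF
 * set" message laid out as in RFC 1191, or -1 if `msg` is not such a
 * message.  A result of 0 means the router did not fill in the field
 * (it predates RFC 1191).
 */
int icmp_next_hop_mtu(const uint8_t *msg, size_t len)
{
    assert(msg != NULL);
    if (len < 8)
        return -1;
    if (msg[0] != ICMP_TYPE_DEST_UNREACH || msg[1] != ICMP_CODE_FRAG_NEEDED)
        return -1;
    /* Bytes 6..7 carry the next-hop MTU in network byte order. */
    return (msg[6] << 8) | msg[7];
}
```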

> > I hope this discussion is not already over. Sorry, but it took me a
> > while to understand all the implications and to skip through some RFC's.
>
>I'm also more than a little confused.
>My understanding is that the UDP user can't receive fragments, just a 
>defragmented "datagram", from which he can't tell whether or how it
>was fragmented.  Is this correct?
>
yes.

>The only thing the program can do to determine PMTU then is to set the
>DF flag and find out whether the data(*) is delivered (either from a
>reply that indicates it was or an ICMP that indicates otherwise).
>Is that still correct?  I don't even see the word "fragment" in rfc
>768 (UDP).
>
the application doesn't set the DF bit, the kernel does (i don't know if
it is possible at all for the application without raw sockets). with linux
the application doesn't see the icmp error, the kernel handles it and saves
the received pmtu in the destination data.
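
As a side note on the open question above: Linux does offer a per-socket
knob for this, the IP_MTU_DISCOVER socket option described in ip(7),
which works on ordinary UDP sockets. The sketch below is a minimal
illustration; the helper name udp_socket_with_df is hypothetical, and
whether the option behaved identically on the kernels discussed in this
thread has not been checked.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a UDP socket whose datagrams are sent
 * with DF set (Linux-specific IP_MTU_DISCOVER option).  Returns the
 * socket descriptor, or -1 on error.
 */
int udp_socket_with_df(void)
{
    int val = IP_PMTUDISC_DO;  /* always set DF, never fragment locally */
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &val, sizeof(val)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

With this setting, a send that exceeds the known path MTU fails with
EMSGSIZE instead of being fragmented.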

>(*) the term data here is intentionally ambiguous, trying to not yet
>deal with issues below
>
>Is there some other RFC I should be looking for?
>
> > > >>ip_conntrack defrags packets at PRE_ROUTING and LOCAL_OUT and
> > > >>refragments them at POST_ROUTING without caring about IP_DF. packets
> > > >>with IP_DF|IP_MF can be refragmented with a different size, so path
> > > >>mtu discovery is broken.  Linux nfs itself sends out packets with
> > > >>IP_DF|IP_MF.
>
> > both are set. "|" is bitwise or. nfs (always?) generates packets bigger
> > than mtu so they are fragmented and have IP_MF set (except last one).
> > If linux wants to know path mtu it sets IP_DF on these, so the
> > fragments may not be _further_ fragmented.
>
>This already seems suspicious to me.  Is there any RFC that
>specifically says this is allowed or does this fall into the area
>that violates the robustness principle (i.e., not being conservative 
>in what you send to others)?
>

Two citations from kerneltrap.com:
---------
In defense of the Linux NFS design, David Miller contends, "/RFCs are 
not laws that cannot be broken when common sense must prevail. [...] 
common sense here dictates that without being able to set DF on 
fragmented frames, UDP path mtu discovery is basically impossible and at 
best useless./"
---------
Author: dhartmei <http://kerneltrap.com/module.php?mod=user&op=view&id=1848>
Date: Wednesday, 02/12/2003 - 13:33

Yes, but the question that was not answered before was, *why* (for what 
purpose) the sender would set DF on a fragment. PMTU was mentioned, but 
the "normal" way a stack does PMTU discovery is by sending complete 
packets with DF to find the PMTU, then send complete packets of that 
size. Even UDP can do that, in general.

The missing piece in the puzzle was the fact that certain protocols like 
NFS can't split transactions/operations into smaller packets, they need 
to send the entire transaction in one single (complete) IP packet. This 
size might exceed any real MTU, so it will get fragmented first. And 
only afterwards PMTU discovery gets applied to the fragments. Hence, DF 
on fragments. This scheme is not explicitly covered by the RFCs, but I 
agree that it's a logical conclusion.
---------

>Is a user program allowed to specify what fragments to send?
>I guess you're saying it's the system that's doing this, perhaps in
>order to decide what size fragments to send.  In that case you might
>even argue that the system on the other side should react differently
>to different size fragments that arrive.  
>
>So the scenario is:
>You want to test PMTU <=1000
>You send a "datagram" of size 2000 in fragments of size 1000 and 1000
>both with DF set, expecting to get an icmp complaint only if PMTU
><1000.
>A conntracking firewall (FW) defragments to a single datagram of size
>2000.
>At this point it has to refragment to forward the datagram.
>Several possibilities:
>- FW notices DF and rejects the datagram, telling you that fragmentation
>was needed when DF was set
>This might be considered incorrect since fragmentation was not needed
>for the packets you sent.  On the other hand, the IP header in the icmp 
>packet shows that the size that was too big was 2000.
>
the mtu is included in the icmp message. i guess every sane implementation
uses this value and not values from ip header. otoh, rfc 1122 states:

Every ICMP error message includes the Internet header and at least the 
first 8 data octets of the datagram that triggered the error; more than 
8 octets MAY be sent; this header and data MUST be unchanged from the 
received datagram.


>- FW refragments to sizes 1500 and 500.  It's entirely unclear to me
>whether those should be marked DF.  What if some of the incoming
>fragments were DF and others were not?
>
you already broke pmtu discovery so you can safely not set DF. if DF is set,
the only correct way is to refragment them to the same sizes; unfortunately
this is not possible if the nat shrinks the packet. if the packet got bigger,
the new data can be sent in fragments with the size of any of the other
fragments.

>In any case, the danger again is that one has to be fragmented later
>on, but again the icmp reply contains the header showing the size of
>the packet that was rejected.
>
so fragment sizes need to be preserved and maybe even mangling of icmp
errors is required.

>
>In other words, when you do this DF PMTU discovery you ought to look
>at the size of the packet that was rejected and draw conclusions from
>that.  
>
> > What about storing the biggest fragment size of a packet at
> > defragmentation and refragmenting the packet with that size at
> > POST_ROUTING if MTU is not smaller.
>So at this point I gather the goal is the other side of the
>robustness principle - trying to make something work that probably 
>shouldn't have been tried to begin with.  Are there lots of other
>things out there that play the same (questionable?) tricks?
>
>The suggestion above seems fair enough, but if the problem that
>we're trying to fix is specifically in linux, perhaps the other 
>end should also be fixed?
>
read Dave Miller's statement on this at 
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=58084

>
> > >I think we have to store fragment sizes of each connection, but storing
> > >
> > even worse we need to store the fragment sizes of each reassembled
> > packet. if we consider the case not all fragments have DF set and we
> > would want to handle nat resizing correctly, besides fragment sizes
> > we also need fragment boundaries and fragment flags (-> iph->frag_off).
>Keeping in mind that PMTU can change as path changes, it seems 
>reasonable to do this on a per-datagram basis rather than
>per-connection.  That still doesn't seem so expensive.
>  
>
patrick


* Re: [RFC]: ip_conntrack breaks UDP PMTU (Patrick McHardy)
  2003-02-16  3:41   ` Patrick McHardy
@ 2003-02-16  6:00     ` Don Cohen
  2003-02-16 12:38       ` Patrick McHardy
  0 siblings, 1 reply; 10+ messages in thread
From: Don Cohen @ 2003-02-16  6:00 UTC
  To: Patrick McHardy; +Cc: netfilter-devel

Patrick McHardy writes:
 > ... since the sender of an icmp fragmentation required includes the
 > max. mtu it can handle without fragmenting it is clear it is not to
 > be treated as only heuristic.
This doesn't affect my argument of why it must be considered heuristic.

 > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=58084

After reading this it seems to me that this NFS code (or possibly the
kernel code it ends up using) is really to blame.  I don't even
understand why it uses UDP, but that reference sounds like NFS is
really just not prepared for packet loss.  Use of DF for PMTU
discovery is fine but you have to be prepared for loss, including 
the loss of the ICMP packets and you also have to be prepared for
change of network conditions, inc. change of path or MTU.
(Here we could get into another discussion of the role of ICMP in DoS 
attacks, what people do about that, what they should do, etc.  Recall
that ICMP replies are often rate limited, so an attacker can
effectively block almost all of your ICMP replies.)
I think all code that tries to optimize packet size really has to
view it as an ongoing (requiring continual monitoring, since things
do change) heuristic hill climbing type of problem.

As you point out, PMTU is a situation where a value slightly too low 
is fine, slightly too high is terrible (like the old TV show "The
price is right").  That already suggests appropriate heuristics.


* Re: [RFC]: ip_conntrack breaks UDP PMTU (Patrick McHardy)
  2003-02-16  6:00     ` Don Cohen
@ 2003-02-16 12:38       ` Patrick McHardy
  2003-02-16 20:11         ` Possible ip_defrag DoS ? Harald Welte
  0 siblings, 1 reply; 10+ messages in thread
From: Patrick McHardy @ 2003-02-16 12:38 UTC
  To: Don Cohen; +Cc: netfilter-devel

Don Cohen wrote:

>Patrick McHardy writes:
> > ... since the sender of an icmp fragmentation required includes the
> > max. mtu it can handle without fragmenting it is clear it is not to
> > be treated as only heuristic.
>This doesn't affect my argument of why it must be considered heuristic.
>
> > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=58084
>
>After reading this it seems to me that this NFS code (or possibly the
>kernel code it ends up using) is really to blame.  I don't even
>understand why it uses UDP, but that reference sounds like NFS is
>really just not prepared for packet loss.  Use of DF for PMTU
>discovery is fine but you have to be prepared for loss, including 
>the loss of the ICMP packets and you also have to be prepared for
>change of network conditions, inc. change of path or MTU.
>(Here we could get into another discussion of the role of ICMP in DoS 
>attacks, what people do about that, what they should do, etc.  Recall
>that ICMP replies are often rate limited, so an attacker can
>effectively block almost all of your ICMP replies.)
>
rate-limiting icmp fragmentation required is just braindead.
interestingly, it seems linux defragmentation is vulnerable to a dos attack.
the evictor (called before defragmentation) just kills the oldest entry
of each hash slot, starting with 0, until memory is below
sysctl_ipfrag_low_thresh. by sending enough fragments
(>sysctl_ipfrag_high_thresh) which hash to the highest bucket you can
stop reassembly of valid packets.

> 
>I think all code that tries to optimize packet size really has to
>view it as an ongoing (requiring continual monitoring, since things
>do change) heuristic hill climbing type of problem.
>
it does. nevertheless if a router sends "fragmentation required, max mtu xyz"
it is not the host's duty to guess what the sender actually meant!

>
>As you point out, PMTU is a situation where a value slightly too low 
>is fine, slightly too high is terrible (like the old TV show "The
>price is right").  That already suggests appropriate heuristics.
>  
>
so what should the sender do ? send a few bytes less than announced ?
maybe also a few bytes more .. why not also treat sequence/ack numbers as
heuristic, after all they may also be changed by nat ... ?

Patrick

PS: lets not get into further discussion about this.


* Re: [RFC]: ip_conntrack breaks UDP PMTU (Patrick McHardy)
  2003-02-16  1:43 ` [RFC]: ip_conntrack breaks UDP PMTU (Patrick McHardy) Don Cohen
  2003-02-16  3:41   ` Patrick McHardy
@ 2003-02-16 20:01   ` Harald Welte
  2003-02-17  9:14     ` Jozsef Kadlecsik
  1 sibling, 1 reply; 10+ messages in thread
From: Harald Welte @ 2003-02-16 20:01 UTC
  To: Don Cohen; +Cc: netfilter-devel

On Sat, Feb 15, 2003 at 05:43:55PM -0800, Don Cohen wrote:
> I'm also more than a little confused.
> My understanding is that the UDP user can't receive fragments, just a 
> defragmented "datagram", from which he can't tell whether or how it
> was fragmented.  Is this correct?

yes, at least as long as we are talking about a user using the standard
socket API and the UDP/IP stack - not talking about raw sockets.

>  > What about storing the biggest fragment size of a packet at
>  > defragmentation and refragmenting the packet with that size at
>  > POST_ROUTING if MTU is not smaller.
> So at this point I gather the goal is the other side of the
> robustness principle - trying to make something work that probably 
> shouldn't have been tried to begin with.  Are there lots of other
> things out there that play the same (questionable?) tricks?

if you want to do stateful IP filtering with _transparent_ fragment
behaviour, I see two possible options:

1) hold back all individual fragments, once you receive them. once you
have all fragments, defragment to an internal buffer, but keep the
fragments.  using the defragmented copy you can consult your ruleset and
then forward all fragments in case the packet is allowed to pass.

2) do what I have been proposing: defragment, save the fragment
sizes/offsets and re-fragment in exactly the same way.

I don't know what other (proprietary) competitors do.  But I cannot
think of a different 'perfect' solution.
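
Option 2 could look roughly like the sketch below: record offset, length
and DF flag of every fragment during reassembly, then check whether the
same layout can be replayed on the outgoing link. All names are
hypothetical, a 20-byte IP header is assumed, and the payload is assumed
not to have been resized by a NAT helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FRAGS 64  /* arbitrary bound for this sketch */

/* One recorded fragment: where it started, how long, and its DF flag. */
struct frag_rec {
    uint16_t offset;  /* payload offset in bytes */
    uint16_t len;     /* payload length in bytes */
    uint8_t  df;      /* 1 if the fragment arrived with DF set */
};

/* Hypothetical per-datagram record of the original fragment layout. */
struct frag_layout {
    struct frag_rec rec[MAX_FRAGS];
    size_t count;
};

/* Remember one incoming fragment; returns 0 on success, -1 if full. */
int layout_record(struct frag_layout *l, uint16_t offset, uint16_t len,
                  uint8_t df)
{
    assert(l != NULL);
    if (l->count == MAX_FRAGS)
        return -1;
    l->rec[l->count].offset = offset;
    l->rec[l->count].len = len;
    l->rec[l->count].df = df;
    l->count++;
    return 0;
}

/*
 * Returns 1 if every recorded fragment (20-byte header plus payload)
 * still fits `out_mtu`, so the layout can be replayed unchanged;
 * returns 0 otherwise.
 */
int layout_replayable(const struct frag_layout *l, unsigned out_mtu)
{
    size_t i;

    assert(l != NULL);
    for (i = 0; i < l->count; i++)
        if (20u + l->rec[i].len > out_mtu)
            return 0;
    return 1;
}
```

The record is kept per reassembled datagram, so its cost is bounded by
the number of fragments in flight rather than by the number of
connections.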

-- 
- Harald Welte <laforge@netfilter.org>             http://www.netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie


* Possible ip_defrag DoS ?
  2003-02-16 12:38       ` Patrick McHardy
@ 2003-02-16 20:11         ` Harald Welte
  2003-02-16 20:26           ` Patrick McHardy
  2003-02-16 20:31           ` Patrick McHardy
  0 siblings, 2 replies; 10+ messages in thread
From: Harald Welte @ 2003-02-16 20:11 UTC
  To: Patrick McHardy; +Cc: Don Cohen, netfilter-devel, netdev

On Sun, Feb 16, 2003 at 01:38:56PM +0100, Patrick McHardy wrote:

> interestingly, it seems linux defragmentation is vulnerable to dos attack.
> the evictor (called before defragmentation) just kills the oldest entry
> of each hash slot, starting with 0 until memory is below
> sysctl_ipfrag_low_thresh. by sending enough fragments 
> (>sysctl_ipfrag_high_thresh) which hash to the highest bucket you can
> stop reassembly of valid packets.

I'm forwarding this (from netfilter-devel) to the linux networking
developers at netdev@oss.sgi.com.  If your assumption is valid, they
might want to have a look at this...

thanks.

> Patrick

-- 
- Harald Welte <laforge@netfilter.org>             http://www.netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie


* Re: Possible ip_defrag DoS ?
  2003-02-16 20:11         ` Possible ip_defrag DoS ? Harald Welte
@ 2003-02-16 20:26           ` Patrick McHardy
  2003-02-16 20:31           ` Patrick McHardy
  1 sibling, 0 replies; 10+ messages in thread
From: Patrick McHardy @ 2003-02-16 20:26 UTC
  To: Harald Welte; +Cc: Don Cohen, netfilter-devel, netdev

Harald Welte wrote:

>On Sun, Feb 16, 2003 at 01:38:56PM +0100, Patrick McHardy wrote:
>
>  
>
>>interestingly, it seems linux defragmentation is vulnerable to dos attack.
>>the evictor (called before defragmentation) just kills the oldest entry
>>of each hash slot, starting with 0 until memory is below
>>sysctl_ipfrag_low_thresh. by sending enough fragments 
>>(>sysctl_ipfrag_high_thresh) which hash to the highest bucket you can
>>stop reassembly of valid packets.
>>    
>>
>
>I'm forwarding this (from netfilter-devel) to the linux networking
>developers at netdev@oss.sgi.com.  If your assumption is valid, they
>might want to have a look at this...
>
>thanks.
>
>
>  
>
Hi Harald, it seems this was not (entirely) correct, the evictor only
kills the last member of each hash slot and then moves on. still, assuming
the hash function is good there is a good chance we can disturb reassembly
noticeably.
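
A toy model of the behaviour described in this mail (not the kernel's
actual evictor) may help illustrate the concern. It assumes a tiny hash
table, a fixed memory cost per reassembly queue, and eviction of one
queue per slot in slot order until memory reaches the low threshold.
All names and sizes are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 4  /* toy hash table size, not the kernel's */

/* Toy model: each bucket only tracks how many queues it holds. */
struct toy_frag_table {
    unsigned queues[NBUCKETS];
    unsigned mem;         /* total memory charged, in bytes */
    unsigned queue_cost;  /* assumed fixed cost per queue */
};

/*
 * Evict one queue per non-empty bucket, starting at bucket 0 and moving
 * on, until memory is at or below `low_thresh` or the table is empty.
 * Returns the number of queues evicted.
 */
unsigned toy_evictor(struct toy_frag_table *t, unsigned low_thresh)
{
    unsigned evicted = 0;
    size_t i = 0;
    size_t empty_run = 0;  /* consecutive empty buckets seen */

    assert(t != NULL);
    while (t->mem > low_thresh && empty_run < NBUCKETS) {
        if (t->queues[i] > 0) {
            t->queues[i]--;
            t->mem = t->mem > t->queue_cost ? t->mem - t->queue_cost : 0;
            evicted++;
            empty_run = 0;
        } else {
            empty_run++;
        }
        i = (i + 1) % NBUCKETS;
    }
    return evicted;
}
```

In this model, if five queues of junk sit in the last bucket and one
valid queue sits in each of the first three, the valid queues are
evicted first and the junk survives.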

Patrick


* Re: Possible ip_defrag DoS ?
  2003-02-16 20:11         ` Possible ip_defrag DoS ? Harald Welte
  2003-02-16 20:26           ` Patrick McHardy
@ 2003-02-16 20:31           ` Patrick McHardy
  1 sibling, 0 replies; 10+ messages in thread
From: Patrick McHardy @ 2003-02-16 20:31 UTC
  To: Harald Welte; +Cc: netfilter-devel

PS: i like your new signature ;)

Harald Welte wrote:
-- 

- Harald Welte <laforge@netfilter.org>             http://www.netfilter.org/
============================================================================
  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie


* Re: [RFC]: ip_conntrack breaks UDP PMTU (Patrick McHardy)
  2003-02-16 20:01   ` [RFC]: ip_conntrack breaks UDP PMTU (Patrick McHardy) Harald Welte
@ 2003-02-17  9:14     ` Jozsef Kadlecsik
  2003-02-17 22:08       ` Don Cohen
  0 siblings, 1 reply; 10+ messages in thread
From: Jozsef Kadlecsik @ 2003-02-17  9:14 UTC
  To: Harald Welte; +Cc: Don Cohen, netfilter-devel

On Sun, 16 Feb 2003, Harald Welte wrote:

> On Sat, Feb 15, 2003 at 05:43:55PM -0800, Don Cohen wrote:
> > I'm also more than a little confused.
> > My understanding is that the UDP user can't receive fragments, just a
> > defragmented "datagram", from which he can't tell whether or how it
> > was fragmented.  Is this correct?
>
> yes, at least as long as we are talking about a user using the standard
> socket API and the UDP/IP stack - not talking about raw sockets.

We may re-fragment the packets as we wish if we don't break PMTU.
I don't think the actual fragments have importance even for the users
listening to a raw socket.

> >  > What about storing the biggest fragment size of a packet at
> >  > defragmentation and refragmenting the packet with that size at
> >  > POST_ROUTING if MTU is not smaller.
> > So at this point I gather the goal is the other side of the
> > robustness principle - trying to make something work that probably
> > shouldn't have been tried to begin with.  Are there lots of other
> > things out there that play the same (questionable?) tricks?
>
> if you want to do stateful IP filtering with _transparent_ fragment
> behaviour, I see two possible options:
>
> 1) hold back all individual fragments, once you receive them. once you
> have all fragments, defragment to an internal buffer, but keep the
> fragments.  using the defragmented copy you can consult your ruleset and
> then forward all fragments in case the packet is allowed to pass.
>
> 2) do what I have been proposing: defragment, save the fragment
> sizes/offsets and re-fragment in exactly the same way.

Maybe I'm blind but why should we re-fragment in exactly the same way?
The receiver doesn't care about it. What we should ensure is that if a
fragment is bigger than the MTU we may send and it's flagged with DF,
then the whole defragmented packet should be discarded and an ICMP error
should be sent back.

The ICMP error sent back is tricky. What should we send back: the
offending fragment? Or would the header of the defragmented packet, with
the size of the too-big fragment as packet length, be just fine?
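
The forwarding rule proposed above could be sketched as follows. The
struct and function names are hypothetical, and the question of what the
ICMP error should quote is left open, as in the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one original fragment. */
struct frag_info {
    unsigned wire_len;  /* header plus payload, in bytes */
    int df;             /* non-zero if DF was set */
};

/*
 * Returns the index of the first DF-flagged fragment that exceeds
 * `out_mtu`, or -1 if none does.  A non-negative result means: discard
 * the whole defragmented packet and send an ICMP "fragmentation needed"
 * error.  A result of -1 means the packet may be refragmented in any
 * convenient way.
 */
int first_offending_fragment(const struct frag_info *f, size_t n,
                             unsigned out_mtu)
{
    size_t i;

    assert(f != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (f[i].df && f[i].wire_len > out_mtu)
            return (int)i;
    return -1;
}
```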

Regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary


* Re: [RFC]: ip_conntrack breaks UDP PMTU (Patrick McHardy)
  2003-02-17  9:14     ` Jozsef Kadlecsik
@ 2003-02-17 22:08       ` Don Cohen
  0 siblings, 0 replies; 10+ messages in thread
From: Don Cohen @ 2003-02-17 22:08 UTC
  To: Jozsef Kadlecsik; +Cc: Harald Welte, netfilter-devel


 > The ICMP error sent back is tricky. What should we send back: the
 > offending fragment? Or header of the defragmented packet with the
 > size of the too big fragment as packet length would be just fine?
You might be interested in the trick I've been using:
if you expand the packet by n bytes and have to send an icmp complaint
then reduce the size of what you claim to be the MTU by n.

near end of ip_forward:
        /* n = number of bytes by which the packet was expanded */
        icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
                  htonl(mtu - n));

I suspect this has never been an issue in existing kernel code because
changing packet length has been rare and probably only occurred in
short packets.
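
As a standalone illustration of the arithmetic, here is a small sketch
of the adjusted MTU computation. The function name is hypothetical, and
clamping the result at 68 bytes (the datagram size every IPv4 hop must
forward without fragmenting, per RFC 791) is an added assumption, not
part of the trick as described.

```c
#include <assert.h>

#define IPV4_MIN_MTU 68u  /* every hop must forward this unfragmented */

/*
 * MTU to announce in the ICMP error when the forwarder grows each
 * packet by `added` bytes before sending it on a link with `link_mtu`.
 */
unsigned advertised_mtu(unsigned link_mtu, unsigned added)
{
    assert(link_mtu >= IPV4_MIN_MTU);
    if (added >= link_mtu || link_mtu - added < IPV4_MIN_MTU)
        return IPV4_MIN_MTU;
    return link_mtu - added;
}
```

A sender that honours the announced value then produces packets that
still fit the link after being expanded.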
