[RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment

* [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
@ 2010-02-27  6:39 Shan Wei
  2010-03-10 17:13 ` YOSHIFUJI Hideaki
  0 siblings, 1 reply; 21+ messages in thread
From: Shan Wei @ 2010-02-27  6:39 UTC (permalink / raw)
  To: Patrick McHardy, David Miller, Alexey Dobriyan, Yasuyuki KOZAKAI
  Cc: netdev@vger.kernel.org, netfilter-devel

 This patch set fixes the problem that an end host with IPv6 connection tracking enabled
cannot send an ICMPv6 "Fragment Reassembly Timeout" message when defragmentation times out.
It also adds support for the fragment reassembly MIB counters, e.g. Ip6ReasmTimeout, Ip6ReasmReqds,
Ip6ReasmOKs and Ip6ReasmFails.

patch-1,2,3: Introduce net namespace support to conntrack and share netns_frags with the IPv6 stack.
          However, IPv6 conntrack and the IPv6 stack still keep separate fragment queues.
          As in IPv4, the proc parameters ip6frag_low_thresh, ip6frag_time and ip6frag_high_thresh
          govern the timeout and memory thresholds of both the IPv6 conntrack fragment queue and
          the IPv6 stack fragment queue.

patch-4: Send an ICMPv6 "Fragment Reassembly Timeout" message and record the MIB counter
         when defragmentation times out.

patch-5,6,7: Record the fragment reassembly MIB counters according to RFC 4293.
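
To make the intent of patches 4-7 easier to follow, here is a minimal
user-space sketch in C. It is not the kernel code from this series: the
struct and function names are hypothetical, the counters are plain integers
rather than per-namespace SNMP tables, and completeness is modelled by a
simple fragment count instead of offsets. It only illustrates which counter
is meant to move at which event, and that the ICMPv6 message is meant for
the local-delivery case only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the MIB counters named in the cover letter. */
struct reasm_mib {
    unsigned long reasm_reqds;   /* Ip6ReasmReqds: fragments received */
    unsigned long reasm_oks;     /* Ip6ReasmOKs: datagrams reassembled */
    unsigned long reasm_fails;   /* Ip6ReasmFails: reassembly failures */
    unsigned long reasm_timeout; /* Ip6ReasmTimeout: queues that timed out */
};

/* Hypothetical, simplified model of one fragment queue. */
struct frag_queue_model {
    bool for_local_host;   /* datagram is destined for this host */
    unsigned int received; /* fragments gathered so far */
    unsigned int expected; /* fragments needed to complete the datagram */
};

/* A fragment arrives: count it, and count a success once the queue is full. */
static void frag_received(struct reasm_mib *mib, struct frag_queue_model *q)
{
    assert(q->received < q->expected);
    mib->reasm_reqds++;
    q->received++;
    if (q->received == q->expected)
        mib->reasm_oks++;
}

/*
 * The reassembly timer fires on an incomplete queue: count the timeout and
 * the failure. Returns true when an ICMPv6 "Fragment Reassembly Timeout"
 * message would be sent, which in this model is only on the end host.
 */
static bool frag_queue_expired(struct reasm_mib *mib,
                               const struct frag_queue_model *q)
{
    mib->reasm_timeout++;
    mib->reasm_fails++;
    return q->for_local_host;
}
```

In this model a router that gathers fragments for conntrack still counts the
failure but never answers the sender, which matches the behaviour described
for the forwarding path later in this thread.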


This patch set has been tested with the IPv6 Ready Logo Phase-2 tool in both host and router mode.

---
Shan Wei <shanwei@cn.fujitsu.com> (7):
      IPv6:netfilter: defrag: Handle sysctls about IPv6 conntrack defragment per-netns
      IPv6:netfilter: defrag: Introduce per-netns to conntrack and kill nf_init_frags
      IPv6:netfilter: defrag: Disable bottom half when reassembling a fragment
      IPv6:netfilter: Send an ICMPv6 "Fragment Reassembly Timeout" message when enabling connection track
      IPv6:netfilter: Record MIB counter when reassembling all fragments
      IPv6:netfilter: Record MIB counter after a fragment reached
      IPv6:netfilter: Add IPSTATS_MIB_REASMFAILS MIB counter value when evicting fragment queue

 Documentation/feature-removal-schedule.txt     |   19 ++
 include/linux/skbuff.h                         |    5 +
 include/net/netns/ipv6.h                       |    1 +
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c |    7 +-
 net/ipv6/netfilter/nf_conntrack_reasm.c        |  221 +++++++++++++++++++-----
 net/ipv6/route.c                               |    1 +
 6 files changed, 208 insertions(+), 46 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-02-27  6:39 [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment Shan Wei
@ 2010-03-10 17:13 ` YOSHIFUJI Hideaki
  2010-03-11  9:16   ` Shan Wei
  0 siblings, 1 reply; 21+ messages in thread
From: YOSHIFUJI Hideaki @ 2010-03-10 17:13 UTC (permalink / raw)
  To: Shan Wei
  Cc: Patrick McHardy, David Miller, Alexey Dobriyan, Yasuyuki KOZAKAI,
	netdev@vger.kernel.org, netfilter-devel,
	yoshfuji@linux-ipv6.org >> YOSHIFUJI Hideaki

Hi,

Shan Wei wrote:
>  This patch-set solves the problem that an end host with IPv6 connection track enable
> can't send an ICMP "Fragment Reassembly Timeout" message when defaging timeout.
> And supports MIB counter about fragments reassembly e.g. Ip6ReasmTimeout, Ip6ReasmReqds,
> Ip6ReasmOKs, Ip6ReasmFails.

Well, because the context of defragmentation here differs
from the standard one (e.g., in netfilter, defragmentation can
happen even on the forwarding path, and the result is always
thrown away anyway), I think it is not a good idea to
touch the standard MIB here. However, I'm okay with incrementing
other stats like InDiscards, OutDiscards and netfilter-specific
stats.
On the other hand, I'd even say we should NOT send
ICMP here (at least by default) because standard routers
never send such packets.

Regards,

--yoshfuji

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-10 17:13 ` YOSHIFUJI Hideaki
@ 2010-03-11  9:16   ` Shan Wei
  2010-03-13 13:47     ` YOSHIFUJI Hideaki
  2010-03-23 15:05     ` Patrick McHardy
  0 siblings, 2 replies; 21+ messages in thread
From: Shan Wei @ 2010-03-11  9:16 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki
  Cc: Patrick McHardy, David Miller, Alexey Dobriyan, Yasuyuki KOZAKAI,
	netdev@vger.kernel.org, netfilter-devel,
	yoshfuji@linux-ipv6.org >> YOSHIFUJI Hideaki

yoshifuji-san:

YOSHIFUJI Hideaki wrote, at 03/11/2010 01:13 AM:
> Well, because the context of defragment are different
> from standard ones (e.g., In netfilter, defragment can
> happen even on forwarding path, and the result is always
> thrown away anyway), I think it is not a good idea to
> touch standard MIB here. However I'm okay to increment
> other stats like InDiscards, OurDiscards and netfilter
> specific stats.

Not only on a router but also on a host, if conntrack fails to reassemble
the fragments, they are not passed on to the IPv4/IPv6 stack.
So these fragments cannot be traced through the MIB counters.

Also, IPv4 conntrack does record these fragments.
Is the context of IPv4 defragmentation different from that of IPv6?

> On the other hand, I'd even say we should NOT send
> icmp here (at least by default) because standard routers
> never send such packet.

Yes, for routers, the patch set does not send an ICMP message to
the source host. It only does so on a destination host with IPv6 connection
tracking enabled.

-- 
Best Regards
-----
Shan Wei

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-11  9:16   ` Shan Wei
@ 2010-03-13 13:47     ` YOSHIFUJI Hideaki
  2010-03-15 16:27       ` Patrick McHardy
  2010-03-23 15:05     ` Patrick McHardy
  1 sibling, 1 reply; 21+ messages in thread
From: YOSHIFUJI Hideaki @ 2010-03-13 13:47 UTC (permalink / raw)
  To: Shan Wei
  Cc: YOSHIFUJI Hideaki, Patrick McHardy, David Miller, Alexey Dobriyan,
	Yasuyuki KOZAKAI, netdev@vger.kernel.org, netfilter-devel,
	yoshfuji

Hi.

(2010/03/11 18:16), Shan Wei wrote:
> yoshifuji-san:
>
> YOSHIFUJI Hideaki wrote, at 03/11/2010 01:13 AM:
>> Well, because the context of defragment are different
>> from standard ones (e.g., In netfilter, defragment can
>> happen even on forwarding path, and the result is always
>> thrown away anyway), I think it is not a good idea to
>> touch standard MIB here. However I'm okay to increment
>> other stats like InDiscards, OurDiscards and netfilter
>> specific stats.
>
> Not only on router, but also on host, if conntrack fails to reassemble
> fragments, the fragments will not be forwarded to IPv4/IPv6 stack.
> So, these fragments can't be traced from MIB counter.
>
> And, IPv4 conntrack records these fragments.
> Is the context of IPv4 defragment different from IPv6?

Yes, it is different.

As you know, defragmentation cannot happen on routers in IPv6.
Because we do want to preserve hop-by-hop options etc.,
we preserve the original packets in the netfilter code.

In IPv6, defragmentation in netfilter is temporary, just
for conntrack.  The state (including the defragmented packet)
is not preserved, and the original fragments are used in further
processing (including local processing or forwarding).

So, please treat a defragmentation failure the same as any other
reason for which the netfilter code decides to discard packets.  Of course,
you can introduce nf-specific counters that show the reasons
why packets are discarded in the netfilter module.
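
The point about temporary defragmentation can be pictured with a small
user-space model in C. This is only a sketch under stated assumptions, not
the code in nf_conntrack_reasm.c: the verdict names, the counter struct and
the function are hypothetical, and reassembly is reduced to adding up
fragment lengths. What it shows is that the reassembled view exists only for
tracking, that the original fragments are what continue on success, and that
a failure touches a netfilter-specific counter rather than the standard MIB.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical verdicts; the real hook returns netfilter verdicts. */
enum defrag_verdict {
    PASS_ORIGINAL_FRAGMENTS, /* originals continue to local input/forward */
    DROP_FRAGMENTS           /* reassembly failed, fragments are discarded */
};

/* Hypothetical netfilter-specific counter, as suggested in the mail. */
struct nf_defrag_stats {
    unsigned long reasm_failed;
};

/*
 * Gather the fragments only to give conntrack a complete view.
 * On success *tracked_len holds the length conntrack would inspect;
 * the caller then passes on the untouched original fragments.
 */
static enum defrag_verdict
conntrack_defrag_model(const size_t *frag_len, size_t nfrags,
                       size_t datagram_len, struct nf_defrag_stats *stats,
                       size_t *tracked_len)
{
    size_t gathered = 0;

    assert(frag_len != NULL && stats != NULL && tracked_len != NULL);

    for (size_t i = 0; i < nfrags; i++)
        gathered += frag_len[i];

    if (gathered != datagram_len) {
        /* Incomplete: count it in netfilter's own statistics only. */
        stats->reasm_failed++;
        *tracked_len = 0;
        return DROP_FRAGMENTS;
    }

    *tracked_len = gathered;
    return PASS_ORIGINAL_FRAGMENTS;
}
```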

>> On the other hand, I'd even say we should NOT send
>> icmp here (at least by default) because standard routers
>> never send such packet.
>
> Yes，for routers, the patch-set does not send icmp message to
> source host. It only does on destination host with IPv6 connection
> track enable.

Please make it optional (via parameter) at least.

Regards,

--yoshfuji

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-13 13:47     ` YOSHIFUJI Hideaki
@ 2010-03-15 16:27       ` Patrick McHardy
  2010-03-23 16:28         ` YOSHIFUJI Hideaki
  0 siblings, 1 reply; 21+ messages in thread
From: Patrick McHardy @ 2010-03-15 16:27 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki
  Cc: Shan Wei, YOSHIFUJI Hideaki, David Miller, Alexey Dobriyan,
	Yasuyuki KOZAKAI, netdev@vger.kernel.org, netfilter-devel

YOSHIFUJI Hideaki wrote:
> (2010/03/11 18:16), Shan Wei wrote:
>>> On the other hand, I'd even say we should NOT send
>>> icmp here (at least by default) because standard routers
>>> never send such packet.
>>
>> Yes，for routers, the patch-set does not send icmp message to
>> source host. It only does on destination host with IPv6 connection
>> track enable.
> 
> Please make it optional (via parameter) at least.

The ICMP messages are only sent if the packet is destined for the
local host, similar to what IPv6 defrag would do if conntrack weren't
used. So this patch increases consistency; why should we make this
optional?

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-11  9:16   ` Shan Wei
  2010-03-13 13:47     ` YOSHIFUJI Hideaki
@ 2010-03-23 15:05     ` Patrick McHardy
  2010-03-25  2:28       ` Shan Wei
  1 sibling, 1 reply; 21+ messages in thread
From: Patrick McHardy @ 2010-03-23 15:05 UTC (permalink / raw)
  To: Shan Wei
  Cc: YOSHIFUJI Hideaki, David Miller, Alexey Dobriyan,
	Yasuyuki KOZAKAI, netdev@vger.kernel.org, netfilter-devel,
	yoshfuji@linux-ipv6.org >> YOSHIFUJI Hideaki

Shan Wei wrote:
>> On the other hand, I'd even say we should NOT send
>> icmp here (at least by default) because standard routers
>> never send such packet.
>>     
>
> Yes，for routers, the patch-set does not send icmp message to
> source host. It only does on destination host with IPv6 connection 
> track enable.
>   

The nf-next tree is open again, now would be a good time to resubmit
these patches.
Thanks!

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-15 16:27       ` Patrick McHardy
@ 2010-03-23 16:28         ` YOSHIFUJI Hideaki
  2010-03-23 17:16           ` Patrick McHardy
  0 siblings, 1 reply; 21+ messages in thread
From: YOSHIFUJI Hideaki @ 2010-03-23 16:28 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Shan Wei, YOSHIFUJI Hideaki, David Miller, Alexey Dobriyan,
	Yasuyuki KOZAKAI, netdev@vger.kernel.org, netfilter-devel,
	YOSHIFUJI Hideaki

Hello.

Sorry for my slow response.

(2010/03/16 1:27), Patrick McHardy wrote:
> YOSHIFUJI Hideaki wrote:
>> (2010/03/11 18:16), Shan Wei wrote:
>>>> On the other hand, I'd even say we should NOT send
>>>> icmp here (at least by default) because standard routers
>>>> never send such packet.
>>>
>>> Yes，for routers, the patch-set does not send icmp message to
>>> source host. It only does on destination host with IPv6 connection
>>> track enable.
>>
>> Please make it optional (via parameter) at least.
>
> The ICMP messages are only sent if the packet is destined for the
> local host, similar to what IPv6 defrag would do if conntrack wouldn't
> be used. So this patch increases consistency, why should we make this
> optional?

Well, in the first place, I do think conntrack should be
as transparent as possible.  And I cannot find other
netfilter conntrack code (IPv4 or IPv6) that sends ICMP, e.g.
parameter problem etc.

As I said before, I agree that netfilter may drop packets
for any reason, but I do think it should be done silently.
It can increment netfilter's own statistics counters etc.,
but it should not increment the core's (especially
specific) statistics counters.

The same goes for reassembly.  We should NOT send ICMP, and
if ever desired, we might optionally send ICMP (in another
module maybe).

Regards,

--yoshfuji

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-23 16:28         ` YOSHIFUJI Hideaki
@ 2010-03-23 17:16           ` Patrick McHardy
  2010-03-23 18:58             ` YOSHIFUJI Hideaki
  0 siblings, 1 reply; 21+ messages in thread
From: Patrick McHardy @ 2010-03-23 17:16 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki
  Cc: Shan Wei, YOSHIFUJI Hideaki, David Miller, Alexey Dobriyan,
	Yasuyuki KOZAKAI, netdev@vger.kernel.org, netfilter-devel

YOSHIFUJI Hideaki wrote:
> Hello.
>
> Sorry for my slow response.
>
> (2010/03/16 1:27), Patrick McHardy wrote:
>> YOSHIFUJI Hideaki wrote:
>>> (2010/03/11 18:16), Shan Wei wrote:
>>>>> On the other hand, I'd even say we should NOT send
>>>>> icmp here (at least by default) because standard routers
>>>>> never send such packet.
>>>>
>>>> Yes，for routers, the patch-set does not send icmp message to
>>>> source host. It only does on destination host with IPv6 connection
>>>> track enable.
>>>
>>> Please make it optional (via parameter) at least.
>>
>> The ICMP messages are only sent if the packet is destined for the
>> local host, similar to what IPv6 defrag would do if conntrack wouldn't
>> be used. So this patch increases consistency, why should we make this
>> optional?
>
> Well, in the first place, I do think conntrack should be
> transparent as much as possible.  And, I cannot find other
> netfilter conntrack code (ipv4 or ipv6) sending icmp e.g.
> parameter problem etc.

Agreed on the transparent part, however I consider silently dropping
packets not transparent. In fact conntrack itself should never drop
packets except under some very special circumstances when there's
no other choice in order to operate correctly. Dropping packets is
supposed to be a policy decision made by the user.

In this case without conntrack, IPv6 would send an ICMPv6 message,
so in my opinion the transparent thing to do would be to still send
them. Of course only if reassembly is done on an end host.

There's really no difference between sending these packets from conntrack
and passing the incomplete fragments upwards to IPv6 and
waiting for another timeout, except that it's easier to implement
consistently by generating the packets within conntrack.

> As I said before, I agree that netfilter may drop packets
> by any reasons, but I do think it should be done silently.
> It can increment netfilter's own statistic counting etc.
> but it should not increment the core's (especially,
> specific) statistic counting.

It really depends on what you define as "transparent".

>
> Reassembling processes are the same.  We should NOT send icmp, and
> if ever desired, we might optionally send icmp (in other
> module maybe). 

Please see above for my reasoning. There's also the matter of consistency
between IPv4 and IPv6 conntrack.

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-23 17:16           ` Patrick McHardy
@ 2010-03-23 18:58             ` YOSHIFUJI Hideaki
  2010-03-23 20:10               ` Jozsef Kadlecsik
  2010-03-25  2:22               ` Shan Wei
  0 siblings, 2 replies; 21+ messages in thread
From: YOSHIFUJI Hideaki @ 2010-03-23 18:58 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Shan Wei, YOSHIFUJI Hideaki, David Miller, Alexey Dobriyan,
	Yasuyuki KOZAKAI, netdev@vger.kernel.org, netfilter-devel,
	YOSHIFUJI Hideaki

Hello.

(2010/03/24 2:16), Patrick McHardy wrote:
> YOSHIFUJI Hideaki wrote:
>> Hello.
>>
>> Sorry for my slow response.
>>
>> (2010/03/16 1:27), Patrick McHardy wrote:
>>> YOSHIFUJI Hideaki wrote:
>>>> (2010/03/11 18:16), Shan Wei wrote:
>>>>>> On the other hand, I'd even say we should NOT send
>>>>>> icmp here (at least by default) because standard routers
>>>>>> never send such packet.
>>>>>
>>>>> Yes，for routers, the patch-set does not send icmp message to
>>>>> source host. It only does on destination host with IPv6 connection
>>>>> track enable.
>>>>
>>>> Please make it optional (via parameter) at least.
>>>
>>> The ICMP messages are only sent if the packet is destined for the
>>> local host, similar to what IPv6 defrag would do if conntrack wouldn't
>>> be used. So this patch increases consistency, why should we make this
>>> optional?
>>
>> Well, in the first place, I do think conntrack should be
>> transparent as much as possible.  And, I cannot find other
>> netfilter conntrack code (ipv4 or ipv6) sending icmp e.g.
>> parameter problem etc.
>
> Agreed on the transparent part, however I consider silently dropping
> packets not transparent. In fact conntrack itself should never drop
> packets except under some very special circumstances when there's
> no other choice in order to operate correctly. Dropping packets is
> supposed to be a policy decision made by the user.

Definitely right.
  
> In this case without conntrack, IPv6 would send an ICMPv6 message,
> so in my opinion the transparent thing to do would be to still send
> them. Of course only if reassembly is done on an end host.

Well, no.  conntrack should just forward even incomplete fragments
to the next stage (e.g. the core ipv6 code), and then the core would send
an ICMP error back.  ICMP should be sent by the core ipv6 code according
to its own decision, not according to netfilter's.
  
> There's really no difference in sending these packets from conntrack
> compared to passing the incomplete fragments upwards to IPv6 and
> waiting for another timeout, except that its easier to implement
> consistently by generating the packets within conntrack.

It should never be sent on netfilter's decision because
the semantics and code paths are different in the two cases.

>> As I said before, I agree that netfilter may drop packets
>> by any reasons, but I do think it should be done silently.
>> It can increment netfilter's own statistic counting etc.
>> but it should not increment the core's (especially,
>> specific) statistic counting.
>
> It really depends on what you define as "transparent".

I mean, netfilter conntrack should neither drop nor modify any
packets, and it should not generate any additional packets.

>> Reassembling processes are the same.  We should NOT send icmp, and
>> if ever desired, we might optionally send icmp (in other
>> module maybe).
>
> Please see above for my reasoning. There's also the matter of consistency
> between IPv4 and IPv6 conntrack.

Would you please explain more about what you mean by consistency
between IPv4 and IPv6 conntrack?

I do think it is rather different anyway (because the original packets
are to be preserved in IPv6, but not in IPv4).

--yoshfuji

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-23 18:58             ` YOSHIFUJI Hideaki
@ 2010-03-23 20:10               ` Jozsef Kadlecsik
  2010-03-25  4:20                 ` YOSHIFUJI Hideaki
  2010-03-25  8:38                 ` Pascal Hambourg
  2010-03-25  2:22               ` Shan Wei
  1 sibling, 2 replies; 21+ messages in thread
From: Jozsef Kadlecsik @ 2010-03-23 20:10 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki
  Cc: Patrick McHardy, Shan Wei, YOSHIFUJI Hideaki, David Miller,
	Alexey Dobriyan, Yasuyuki KOZAKAI, netdev@vger.kernel.org,
	netfilter-devel

Hi,

On Wed, 24 Mar 2010, YOSHIFUJI Hideaki wrote:

> > In this case without conntrack, IPv6 would send an ICMPv6 message,
> > so in my opinion the transparent thing to do would be to still send
> > them. Of course only if reassembly is done on an end host.
> 
> Well, no.  conntrack should just forward even uncompleted fragments
> to next process (e.g. core ipv6 code), and then the core would send
> ICMP error back.  ICMP should be sent by the core ipv6 code according
> to decision of itself, not according to netfilter.

But what state could be associated by conntrack to the uncompleted 
fragments but the INVALID state? In consequence, in any sane setup, the 
uncompleted fragments will be dropped silently by a filter table rule
and no ICMP error message will be sent back.

Therefore I think iff the destination of the fragments is the host 
itself, then conntrack should generate an ICMP error message. But that 
error message must be processed by conntrack to set its state and then 
the fate of the generated packet can be decided by a rule.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-23 18:58             ` YOSHIFUJI Hideaki
  2010-03-23 20:10               ` Jozsef Kadlecsik
@ 2010-03-25  2:22               ` Shan Wei
  1 sibling, 0 replies; 21+ messages in thread
From: Shan Wei @ 2010-03-25  2:22 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki
  Cc: Patrick McHardy, YOSHIFUJI Hideaki, David Miller, Alexey Dobriyan,
	Yasuyuki KOZAKAI, netdev@vger.kernel.org, netfilter-devel

yoshifuji-san:

YOSHIFUJI Hideaki wrote, at 03/24/2010 02:58 AM:
>> In this case without conntrack, IPv6 would send an ICMPv6 message,
>> so in my opinion the transparent thing to do would be to still send
>> them. Of course only if reassembly is done on an end host.
> 
> Well, no.  conntrack should just forward even uncompleted fragments
> to next process (e.g. core ipv6 code), and then the core would send
> ICMP error back.  ICMP should be sent by the core ipv6 code according
> to decision of itself, not according to netfilter.

It's bad to forward incomplete fragments to the IPv4/IPv6 stack.

On the one hand, conntrack helper modules analyze application data
in packets. They need to parse the whole segment or datagram. If packets are
fragmented, conntrack needs to reassemble them.

On the other hand, if incomplete fragments are forwarded to the IPv4/IPv6 stack,
they will be reassembled a second time, and the result will again be a failure.
So conntrack drops incomplete fragments after the reassembly timeout.

> Would you please explain more about what you mean by consistency
> between IPv4 and IPv6 conntrack?
> 
> I do think it is rather different, anyway (because original packets
> is to be preserved in IPv6, but not in IPv4).

Yes, the defragmentation implementation of IPv6 conntrack is completely different from that of IPv4 conntrack.
But the handling after a reassembly timeout should be consistent.
For IPv4 conntrack, an end host with conntrack enabled must send an ICMP fragment reassembly timeout
message to the source host. For details, see commit e9017b
(Title: IP: Send an ICMP "Fragment Reassembly Timeout" message when enabling connection track).

-- 
Best Regards
-----
Shan Wei

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-23 15:05     ` Patrick McHardy
@ 2010-03-25  2:28       ` Shan Wei
  2010-03-25  4:19         ` YOSHIFUJI Hideaki
  0 siblings, 1 reply; 21+ messages in thread
From: Shan Wei @ 2010-03-25  2:28 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: YOSHIFUJI Hideaki, David Miller, Alexey Dobriyan,
	Yasuyuki KOZAKAI, netdev@vger.kernel.org, netfilter-devel,
	yoshfuji@linux-ipv6.org >> YOSHIFUJI Hideaki

Patrick McHardy wrote, at 03/23/2010 11:05 PM:
> Shan Wei wrote:
>>> On the other hand, I'd even say we should NOT send
>>> icmp here (at least by default) because standard routers
>>> never send such packet.
>>>     
>> Yes，for routers, the patch-set does not send icmp message to
>> source host. It only does on destination host with IPv6 connection 
>> track enable.
>>   
> 
> The nf-next tree is open again, now would be a good time to resubmit
> these patches.
> Thanks!

If nobody objects, I will resubmit these patches as v3.

-- 
Best Regards
-----
Shan Wei

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-25  2:28       ` Shan Wei
@ 2010-03-25  4:19         ` YOSHIFUJI Hideaki
  0 siblings, 0 replies; 21+ messages in thread
From: YOSHIFUJI Hideaki @ 2010-03-25  4:19 UTC (permalink / raw)
  To: Shan Wei
  Cc: Patrick McHardy, YOSHIFUJI Hideaki, David Miller, Alexey Dobriyan,
	Yasuyuki KOZAKAI, netdev@vger.kernel.org, netfilter-devel,
	YOSHIFUJI Hideaki

(2010/03/25 11:28), Shan Wei wrote:
> Patrick McHardy wrote, at 03/23/2010 11:05 PM:
>> Shan Wei wrote:
>>>> On the other hand, I'd even say we should NOT send
>>>> icmp here (at least by default) because standard routers
>>>> never send such packet.
>>>>
>>> Yes，for routers, the patch-set does not send icmp message to
>>> source host. It only does on destination host with IPv6 connection
>>> track enable.
>>>
>>
>> The nf-next tree is open again, now would be a good time to resubmit
>> these patches.
>> Thanks!
>
> If no body opposes them, i will resubmit these patches with v3.
>

I still disagree with patches 4-7.

--yoshfuji

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-23 20:10               ` Jozsef Kadlecsik
@ 2010-03-25  4:20                 ` YOSHIFUJI Hideaki
  2010-03-25  9:23                   ` Jozsef Kadlecsik
  2010-03-25 10:25                   ` Patrick McHardy
  2010-03-25  8:38                 ` Pascal Hambourg
  1 sibling, 2 replies; 21+ messages in thread
From: YOSHIFUJI Hideaki @ 2010-03-25  4:20 UTC (permalink / raw)
  To: Jozsef Kadlecsik
  Cc: Patrick McHardy, Shan Wei, YOSHIFUJI Hideaki, David Miller,
	Alexey Dobriyan, Yasuyuki KOZAKAI, netdev@vger.kernel.org,
	netfilter-devel, YOSHIFUJI Hideaki

(2010/03/24 5:10), Jozsef Kadlecsik wrote:

> On Wed, 24 Mar 2010, YOSHIFUJI Hideaki wrote:
>
>>> In this case without conntrack, IPv6 would send an ICMPv6 message,
>>> so in my opinion the transparent thing to do would be to still send
>>> them. Of course only if reassembly is done on an end host.
>>
>> Well, no.  conntrack should just forward even uncompleted fragments
>> to next process (e.g. core ipv6 code), and then the core would send
>> ICMP error back.  ICMP should be sent by the core ipv6 code according
>> to decision of itself, not according to netfilter.
>
> But what state could be associated by conntrack to the uncompleted
> fragments but the INVALID state? In consequence, in any sane setup, the
> uncompleted fragments will be dropped silently by a filter table rule
> and no ICMP error message will be sent back.
>
> Therefore I think iff the destination of the fragments is the host
> itself, then conntrack should generate an ICMP error message. But that
> error message must be processed by conntrack to set its state and then
> the fate of the generated packet can be decided by a rule.

Well.... no.

First of all, in a "sane" setup, people should configure things according
to their own requirements.  They may or may not want to send back
ICMP packets.  And even if the core is to send ICMP back, the
state would be correctly assigned.

We cannot (and should not) do something "clever" (other than
dropping packets) in conntrack in the PRE_ROUTING chain.

One reason is that in the PRE_ROUTING context, we can NOT determine
whether the address we see in the IP header is really the final
destination.  The overall situation is the same even if the
routing entry corresponding to the "current" destination points
to the node itself, or even if the node is configured as a host.

It might seem that we could do something in the "filter"
table in LOCAL_IN, FORWARD or LOCAL_OUT (after routing header
processing).

But well, we unfortunately cannot do this (at least
automatically) because even in LOCAL_IN, the final
destination has not been decided until we have processed all
of the extension headers.
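
One concrete case behind this argument is the IPv6 Routing header: while
its Segments Left field is non-zero, the address in the IPv6 header is only
an intermediate hop. The following C sketch illustrates that check. It is an
assumption that this is the main case meant here, and struct ext_hdr_model
and dst_is_final() are hypothetical names operating on an already-parsed
header chain, not kernel interfaces. Only the protocol number 43 for the
Routing header is taken from the IPv6 specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NEXTHDR_ROUTING 43 /* protocol number of the IPv6 Routing header */

/* Hypothetical, already-parsed view of one extension header. */
struct ext_hdr_model {
    unsigned char type;          /* protocol number of this header */
    unsigned char segments_left; /* meaningful only for Routing headers */
};

/*
 * Returns true if the destination address in the IPv6 header can be taken
 * as the final destination, i.e. no Routing header in the chain still has
 * segments left. The whole chain has to be walked before this is known.
 */
static bool dst_is_final(const struct ext_hdr_model *hdrs, size_t n)
{
    assert(hdrs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (hdrs[i].type == NEXTHDR_ROUTING && hdrs[i].segments_left > 0)
            return false;
    }
    return true;
}
```

A hook that runs before the extension headers are processed cannot call
anything like dst_is_final() with a complete chain, which is why a decision
to send ICMP at that point can be wrong.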

Sending ICMP in netfilter (especially in conntrack) is too
patchy, and is not right.  If we do the right thing (and
I believe we should do so), I'd propose to have hooks
around the handlers inside ip6_input_finish().

...I remember that I was thinking about this before.

In conclusion, the first option is just to drop
incomplete fragments as we do today.  The second option
would be to forward them to the next stage so that
the core code could send ICMPv6 etc.; or we could add
new code to send ICMPV6_TIME_EXCEED in the REJECT target.
In the longer term, I think it is better to introduce
per-exthdr hooks.

Regards,

--yoshfuji

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-23 20:10               ` Jozsef Kadlecsik
  2010-03-25  4:20                 ` YOSHIFUJI Hideaki
@ 2010-03-25  8:38                 ` Pascal Hambourg
  2010-03-25  9:13                   ` Shan Wei
  1 sibling, 1 reply; 21+ messages in thread
From: Pascal Hambourg @ 2010-03-25  8:38 UTC (permalink / raw)
  To: Jozsef Kadlecsik
  Cc: YOSHIFUJI Hideaki, Patrick McHardy, Shan Wei, David Miller,
	Alexey Dobriyan, Yasuyuki KOZAKAI, netdev@vger.kernel.org,
	netfilter-devel

Hello,

Jozsef Kadlecsik wrote:
> 
> On Wed, 24 Mar 2010, YOSHIFUJI Hideaki wrote:
> 
>>> In this case without conntrack, IPv6 would send an ICMPv6 message,
>>> so in my opinion the transparent thing to do would be to still send
>>> them. Of course only if reassembly is done on an end host.
>> Well, no.  conntrack should just forward even uncompleted fragments
>> to next process (e.g. core ipv6 code), and then the core would send
>> ICMP error back.  ICMP should be sent by the core ipv6 code according
>> to decision of itself, not according to netfilter.
> 
> But what state could be associated by conntrack to the uncompleted 
> fragments but the INVALID state? In consequence, in any sane setup, the 
> uncompleted fragments will be dropped silently by a filter table rule
> and no ICMP error message will be sent back.

AFAIK, in the IPv4 stack the reassembly takes place before the INPUT
chains (NF_IP_LOCAL_IN hook). Is it different in the IPv6 stack?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-25  8:38                 ` Pascal Hambourg
@ 2010-03-25  9:13                   ` Shan Wei
  2010-03-25 10:07                     ` Jozsef Kadlecsik
  0 siblings, 1 reply; 21+ messages in thread
From: Shan Wei @ 2010-03-25  9:13 UTC (permalink / raw)
  To: Pascal Hambourg
  Cc: Jozsef Kadlecsik, YOSHIFUJI Hideaki, Patrick McHardy,
	David Miller, Alexey Dobriyan, Yasuyuki KOZAKAI,
	netdev@vger.kernel.org, netfilter-devel

Pascal Hambourg wrote, at 03/25/2010 04:38 PM:
> Hello,
> 
> Jozsef Kadlecsik wrote:
>> On Wed, 24 Mar 2010, YOSHIFUJI Hideaki wrote:
>>
>>>> In this case without conntrack, IPv6 would send an ICMPv6 message,
>>>> so in my opinion the transparent thing to do would be to still send
>>>> them. Of course only if reassembly is done on an end host.
>>> Well, no.  conntrack should just forward even uncompleted fragments
>>> to next process (e.g. core ipv6 code), and then the core would send
>>> ICMP error back.  ICMP should be sent by the core ipv6 code according
>>> to decision of itself, not according to netfilter.
>> But what state could be associated by conntrack to the uncompleted 
>> fragments but the INVALID state? In consequence, in any sane setup, the 
>> uncompleted fragments will be dropped silently by a filter table rule
>> and no ICMP error message will be sent back.
> 
> AFAIK, in the IPv4 stack the reassembly takes place before the INPUT
> chains (NF_IP_LOCAL_IN hook). Is it different in the IPv6 stack?

Yes, they are different.

In the IPv4 stack, for an end host, ip_local_deliver() reassembles
fragments before the LOCAL_IN hook.

But in the IPv6 stack, ip6_input_finish() handles fragment extension headers
and tries to reassemble them *after* the LOCAL_IN hook.
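
The difference in ordering can be pictured with a small model. This is only an illustrative sketch: the stage strings are simplified descriptions, not the kernel's real call graph, and the helper names `stage_index` and `reassembles_before_local_in` are made up for this example.

```c
#include <assert.h>
#include <string.h>

/* Simplified local-delivery stages, in the order they run.
 * These lists are an illustration, not the real call graph. */
static const char *const ipv4_local_path[] = {
    "ip_local_deliver: reassemble fragments",
    "NF_INET_LOCAL_IN hook",
    "ip_local_deliver_finish: transport handler",
};

static const char *const ipv6_local_path[] = {
    "ip6_input",
    "NF_INET_LOCAL_IN hook",
    "ip6_input_finish: extension headers, reassembly",
};

/* Return the index of the first stage whose text contains needle, or -1. */
static int stage_index(const char *const *path, int n, const char *needle)
{
    assert(path != NULL && needle != NULL);
    for (int i = 0; i < n; i++)
        if (strstr(path[i], needle) != NULL)
            return i;
    return -1;
}

/* Return 1 if reassembly runs before the LOCAL_IN hook in this path. */
static int reassembles_before_local_in(const char *const *path, int n)
{
    return stage_index(path, n, "reassembl") < stage_index(path, n, "LOCAL_IN");
}
```

Calling `reassembles_before_local_in(ipv4_local_path, 3)` yields 1, while the same call on `ipv6_local_path` yields 0, which matches the ordering described above.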

-- 
Best Regards
-----
Shan Wei

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-25  4:20                 ` YOSHIFUJI Hideaki
@ 2010-03-25  9:23                   ` Jozsef Kadlecsik
  2010-03-25 14:14                     ` YOSHIFUJI Hideaki
  2010-03-25 10:25                   ` Patrick McHardy
  1 sibling, 1 reply; 21+ messages in thread
From: Jozsef Kadlecsik @ 2010-03-25  9:23 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki
  Cc: Patrick McHardy, Shan Wei, YOSHIFUJI Hideaki, David Miller,
	Alexey Dobriyan, Yasuyuki KOZAKAI, netdev@vger.kernel.org,
	netfilter-devel

On Thu, 25 Mar 2010, YOSHIFUJI Hideaki wrote:

> (2010/03/24 5:10), Jozsef Kadlecsik wrote:
> 
> > On Wed, 24 Mar 2010, YOSHIFUJI Hideaki wrote:
> > 
> > > > In this case without conntrack, IPv6 would send an ICMPv6 message,
> > > > so in my opinion the transparent thing to do would be to still send
> > > > them. Of course only if reassembly is done on an end host.
> > > 
> > > Well, no.  conntrack should just forward even uncompleted fragments
> > > to next process (e.g. core ipv6 code), and then the core would send
> > > ICMP error back.  ICMP should be sent by the core ipv6 code according
> > > to decision of itself, not according to netfilter.
> > 
> > But what state could be associated by conntrack to the uncompleted
> > fragments but the INVALID state? In consequence, in any sane setup, the
> > uncompleted fragments will be dropped silently by a filter table rule
> > and no ICMP error message will be sent back.
> > 
> > Therefore I think iff the destination of the fragments is the host
> > itself, then conntrack should generate an ICMP error message. But that
> > error message must be processed by conntrack to set its state and then
> > the fate of the generated packet can be decided by a rule.
> 
> Well.... no.
> 
> First of all, in a "sane" setup, people should configure according
> to their own requirements.  They may or may not want to send back
> an ICMP packet.  And, even if the core is to send ICMP back, the
> state would be correctly assigned.

I meant the state of the fragmented packets. If we let the uncompleted 
fragments enter conntrack, as far as I can see their state will be INVALID. 
Or should we add an exception and set their state to UNTRACKED in 
conntrack?
 
> We cannot (and should not) do something "clever" (other than
> packet drops) in conntrack in the PRE_ROUTING chain.

Actually, I have to agree with you.
 
> One reason is that in the PRE_ROUTING context, we can NOT determine
> whether the address we see in the IP header is really the final
> destination.  The overall situation is the same even if the
> routing entry corresponding to the "current" destination points
> to the node itself, or even if the node is configured as a host.
> 
> It might seem that we could do something in the "filter"
> table in LOCAL_IN, FORWARD or LOCAL_OUT (after routing header
> processing).
> 
> But well, we unfortunately cannot do this (at least
> automatically) because even in LOCAL_IN, the final
> destination has not been decided until we process all
> of the extension headers.
> 
> Sending ICMP in netfilter (especially in conntrack) is too
> patchy, and is not right.  If we do the right thing (and
> I believe we should do so),  I'd propose to have hooks
> around handlers inside ip6_input_finish().
> 
> ...I remember that I was thinking about this before.
> 
> In conclusion, the first option is just to drop
> incomplete fragments as we do today.  The second option
> would be to forward them to the next stage so that the
> core code could send ICMPv6 etc.  Or, we could add
> new code to send ICMPV6_TIME_EXCEED in the REJECT target.
> In the longer term, I think it is better to introduce
> per-exthdr hooks.

I agree with your conclusion too, except for a few questions.

It is unclear to me how you can forward the packets to the next stage: 
above you pointed out that in defrag/reassembly before conntrack we do not 
know yet whether the packets are destined to the host or not. So again, 
how would you let the fragments through conntrack then?

I don't see how the REJECT target could help in any way.

This is not a simple case at all and I have to think that the "best" way 
is just to drop the packets as we currently do.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-25  9:13                   ` Shan Wei
@ 2010-03-25 10:07                     ` Jozsef Kadlecsik
  2010-03-25 10:20                       ` Patrick McHardy
  0 siblings, 1 reply; 21+ messages in thread
From: Jozsef Kadlecsik @ 2010-03-25 10:07 UTC (permalink / raw)
  To: Shan Wei
  Cc: Pascal Hambourg, YOSHIFUJI Hideaki, Patrick McHardy, David Miller,
	Alexey Dobriyan, Yasuyuki KOZAKAI, netdev@vger.kernel.org,
	netfilter-devel

On Thu, 25 Mar 2010, Shan Wei wrote:

> Pascal Hambourg wrote, at 03/25/2010 04:38 PM:
> > 
> > Jozsef Kadlecsik wrote:
> >> On Wed, 24 Mar 2010, YOSHIFUJI Hideaki wrote:
> >>
> >>>> In this case without conntrack, IPv6 would send an ICMPv6 message,
> >>>> so in my opinion the transparent thing to do would be to still send
> >>>> them. Of course only if reassembly is done on an end host.
> >>> Well, no.  conntrack should just forward even uncompleted fragments
> >>> to next process (e.g. core ipv6 code), and then the core would send
> >>> ICMP error back.  ICMP should be sent by the core ipv6 code according
> >>> to decision of itself, not according to netfilter.
> >> But what state could be associated by conntrack to the uncompleted 
> >> fragments but the INVALID state? In consequence, in any sane setup, the 
> >> uncompleted fragments will be dropped silently by a filter table rule
> >> and no ICMP error message will be sent back.
> > 
> > AFAIK, in the IPv4 stack the reassembly takes place before the INPUT
> > chains (NF_IP_LOCAL_IN hook). Is it different in the IPv6 stack?
> 
> Yes, they are different.
> 
> In the IPv4 stack, for an end host, ip_local_deliver() reassembles
> fragments before the LOCAL_IN hook.
> 
> But in the IPv6 stack, ip6_input_finish() handles fragment extension headers
> and tries to reassemble them *after* the LOCAL_IN hook.

But we are discussing netfilter and (de)fragmentation: what should happen 
when the packet reassembly in netfilter times out and the destination is 
the host itself.

In IPv4 the very first subsystem is ipv4_conntrack_defrag, called from 
NF_INET_PRE_ROUTING. Then comes the raw table and after that conntrack.

In IPv6 the very first is the raw table, then comes ipv6_defrag and then 
conntrack.

Why is the order of the raw table and defragmentation reversed for IPv6?

That makes it impossible to use the NOTRACK target in IPv6: for example if 
someone enters

ip6tables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK

and if we receive fragmented packets then the first fragment will be 
untracked and thus skip nf_ct_frag6_gather (and conntrack), while all 
subsequent fragments enter nf_ct_frag6_gather and reassembly will never 
successfully be finished.
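
The failure mode can be modelled in a few lines. This is a rough sketch under stated assumptions, not kernel code: `struct frag`, `notrack_matches` and `reassembly_completes` are hypothetical names, and the model assumes that only the first fragment exposes the TCP ports, so only it can match the `--dport 80` rule.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one IPv6 fragment. */
struct frag {
    int has_tcp_header;  /* assumed: only the first fragment carries it */
    int dport;           /* meaningful only when has_tcp_header is set */
};

/* Models "-p tcp --dport 80 -j NOTRACK": the port match can only
 * succeed on a fragment where the TCP header is visible. */
static int notrack_matches(const struct frag *f)
{
    return f->has_tcp_header && f->dport == 80;
}

/* Return 1 if every fragment reaches the reassembly queue, else 0.
 * raw_before_defrag selects the hook order being modelled. */
static int reassembly_completes(const struct frag *frags, size_t n,
                                int raw_before_defrag)
{
    size_t gathered = 0;

    assert(frags != NULL);
    for (size_t i = 0; i < n; i++) {
        /* An untracked fragment skips the gather step entirely. */
        if (raw_before_defrag && notrack_matches(&frags[i]))
            continue;
        gathered++;
    }
    return gathered == n;
}
```

With a three-fragment packet whose first fragment targets port 80, the model reports failure when the raw table runs first and success when defragmentation runs first, because in the latter case the rule only ever sees the reassembled packet.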

IMHO this is a bug and should be fixed. Patrick, please consider applying 
the patch below.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

diff --git a/include/linux/netfilter_ipv6.h b/include/linux/netfilter_ipv6.h
index d654873..1f7e300 100644
--- a/include/linux/netfilter_ipv6.h
+++ b/include/linux/netfilter_ipv6.h
@@ -59,6 +59,7 @@
 enum nf_ip6_hook_priorities {
 	NF_IP6_PRI_FIRST = INT_MIN,
 	NF_IP6_PRI_CONNTRACK_DEFRAG = -400,
+	NF_IP6_PRI_RAW = -300,
 	NF_IP6_PRI_SELINUX_FIRST = -225,
 	NF_IP6_PRI_CONNTRACK = -200,
 	NF_IP6_PRI_MANGLE = -150,
diff --git a/net/ipv6/netfilter/ip6table_raw.c b/net/ipv6/netfilter/ip6table_raw.c
index ed1a118..3d8c6f0 100644
--- a/net/ipv6/netfilter/ip6table_raw.c
+++ b/net/ipv6/netfilter/ip6table_raw.c
@@ -70,14 +70,14 @@ static struct nf_hook_ops ip6t_ops[] __read_mostly = {
 	  .hook = ip6t_pre_routing_hook,
 	  .pf = NFPROTO_IPV6,
 	  .hooknum = NF_INET_PRE_ROUTING,
-	  .priority = NF_IP6_PRI_FIRST,
+	  .priority = NF_IP6_PRI_RAW,
 	  .owner = THIS_MODULE,
 	},
 	{
 	  .hook = ip6t_local_out_hook,
 	  .pf = NFPROTO_IPV6,
 	  .hooknum = NF_INET_LOCAL_OUT,
-	  .priority = NF_IP6_PRI_FIRST,
+	  .priority = NF_IP6_PRI_RAW,
 	  .owner = THIS_MODULE,
 	},
 };
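
The effect of the priority change can be checked with a small standalone model. This is a sketch only: it assumes that hooks on the same chain run in ascending priority order, uses the numeric values from the enum in the diff above, and the names `struct hook`, `by_priority` and `position_of` are invented for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a registered hook: lower priority runs first. */
struct hook {
    const char *name;
    int priority;
};

/* qsort comparator; avoids subtraction so INT_MIN cannot overflow. */
static int by_priority(const void *a, const void *b)
{
    const struct hook *x = a, *y = b;
    return (x->priority > y->priority) - (x->priority < y->priority);
}

/* Sort a copy of the hooks and return the run position of name, or -1. */
static int position_of(const struct hook *hooks, size_t n, const char *name)
{
    struct hook sorted[8];

    assert(n <= 8);
    memcpy(sorted, hooks, n * sizeof(*hooks));
    qsort(sorted, n, sizeof(sorted[0]), by_priority);
    for (size_t i = 0; i < n; i++)
        if (strcmp(sorted[i].name, name) == 0)
            return (int)i;
    return -1;
}

/* PRE_ROUTING hooks before the patch: raw uses NF_IP6_PRI_FIRST. */
static const struct hook before_patch[] = {
    { "raw",       INT_MIN },
    { "defrag",    -400 },
    { "conntrack", -200 },
};

/* After the patch: raw uses NF_IP6_PRI_RAW = -300. */
static const struct hook after_patch[] = {
    { "raw",       -300 },
    { "defrag",    -400 },
    { "conntrack", -200 },
};
```

In this model `position_of(before_patch, 3, "raw")` is 0, whereas with `after_patch` defrag takes position 0 and raw moves to position 1, ahead of conntrack.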


Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-25 10:07                     ` Jozsef Kadlecsik
@ 2010-03-25 10:20                       ` Patrick McHardy
  0 siblings, 0 replies; 21+ messages in thread
From: Patrick McHardy @ 2010-03-25 10:20 UTC (permalink / raw)
  To: Jozsef Kadlecsik
  Cc: Shan Wei, Pascal Hambourg, YOSHIFUJI Hideaki, David Miller,
	Alexey Dobriyan, Yasuyuki KOZAKAI, netdev@vger.kernel.org,
	netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 719 bytes --]

Jozsef Kadlecsik wrote:
> Why is the order of the raw table and defragmentation reversed for IPv6?
> 
> That makes it impossible to use the NOTRACK target in IPv6: for example if 
> someone enters
> 
> ip6tables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK
> 
> and if we receive fragmented packets then the first fragment will be 
> untracked and thus skip nf_ct_frag6_gather (and conntrack), while all 
> subsequent fragments enter nf_ct_frag6_gather and reassembly will never 
> successfully be finished.
> 
> IMHO this is a bug and should be fixed. Patrick, please consider applying 
> the patch below.

Indeed. I've applied your patch with a minor fixup (attached) to
apply cleanly to the current tree, thanks.


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 1657 bytes --]

commit 9c13886665c43600bd0af4b38e33c654e648e078
Author: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Date:   Thu Mar 25 11:17:26 2010 +0100

    netfilter: ip6table_raw: fix table priority
    
    The order of the IPv6 raw table is currently reversed, that makes impossible
    to use the NOTRACK target in IPv6: for example if someone enters
    
    ip6tables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK
    
    and if we receive fragmented packets then the first fragment will be
    untracked and thus skip nf_ct_frag6_gather (and conntrack), while all
    subsequent fragments enter nf_ct_frag6_gather and reassembly will never
    successfully be finished.
    
    Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/include/linux/netfilter_ipv6.h b/include/linux/netfilter_ipv6.h
index d654873..1f7e300 100644
--- a/include/linux/netfilter_ipv6.h
+++ b/include/linux/netfilter_ipv6.h
@@ -59,6 +59,7 @@
 enum nf_ip6_hook_priorities {
 	NF_IP6_PRI_FIRST = INT_MIN,
 	NF_IP6_PRI_CONNTRACK_DEFRAG = -400,
+	NF_IP6_PRI_RAW = -300,
 	NF_IP6_PRI_SELINUX_FIRST = -225,
 	NF_IP6_PRI_CONNTRACK = -200,
 	NF_IP6_PRI_MANGLE = -150,
diff --git a/net/ipv6/netfilter/ip6table_raw.c b/net/ipv6/netfilter/ip6table_raw.c
index aef31a2..b9cf7cd 100644
--- a/net/ipv6/netfilter/ip6table_raw.c
+++ b/net/ipv6/netfilter/ip6table_raw.c
@@ -13,7 +13,7 @@ static const struct xt_table packet_raw = {
 	.valid_hooks = RAW_VALID_HOOKS,
 	.me = THIS_MODULE,
 	.af = NFPROTO_IPV6,
-	.priority = NF_IP6_PRI_FIRST,
+	.priority = NF_IP6_PRI_RAW,
 };
 
 /* The work comes in here from netfilter.c. */

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-25  4:20                 ` YOSHIFUJI Hideaki
  2010-03-25  9:23                   ` Jozsef Kadlecsik
@ 2010-03-25 10:25                   ` Patrick McHardy
  1 sibling, 0 replies; 21+ messages in thread
From: Patrick McHardy @ 2010-03-25 10:25 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki
  Cc: Jozsef Kadlecsik, Shan Wei, YOSHIFUJI Hideaki, David Miller,
	Alexey Dobriyan, Yasuyuki KOZAKAI, netdev@vger.kernel.org,
	netfilter-devel

YOSHIFUJI Hideaki wrote:
> (2010/03/24 5:10), Jozsef Kadlecsik wrote:
> 
>> On Wed, 24 Mar 2010, YOSHIFUJI Hideaki wrote:
>>
>>>> In this case without conntrack, IPv6 would send an ICMPv6 message,
>>>> so in my opinion the transparent thing to do would be to still send
>>>> them. Of course only if reassembly is done on an end host.
>>>
>>> Well, no.  conntrack should just forward even uncompleted fragments
>>> to next process (e.g. core ipv6 code), and then the core would send
>>> ICMP error back.  ICMP should be sent by the core ipv6 code according
>>> to decision of itself, not according to netfilter.
>>
>> But what state could be associated by conntrack to the uncompleted
>> fragments but the INVALID state? In consequence, in any sane setup, the
>> uncompleted fragments will be dropped silently by a filter table rule
>> and no ICMP error message will be sent back.
>>
>> Therefore I think iff the destination of the fragments is the host
>> itself, then conntrack should generate an ICMP error message. But that
>> error message must be processed by conntrack to set its state and then
>> the fate of the generated packet can be decided by a rule.
> 
> Well.... no.
> 
> First of all, in a "sane" setup, people should configure according
> to their own requirements.  They may or may not want to send back
> an ICMP packet.  And, even if the core is to send ICMP back, the
> state would be correctly assigned.
> 
> We cannot (and should not) do something "clever" (other than
> packet drops) in conntrack in the PRE_ROUTING chain.
> 
> One reason is that in the PRE_ROUTING context, we can NOT determine
> whether the address we see in the IP header is really the final
> destination.  The overall situation is the same even if the
> routing entry corresponding to the "current" destination points
> to the node itself, or even if the node is configured as a host.

Agreed, that is a problem.

> It might seem that we could do something in the "filter"
> table in LOCAL_IN, FORWARD or LOCAL_OUT (after routing header
> processing).
> 
> But well, we unfortunately cannot do this (at least
> automatically) because even in LOCAL_IN, the final
> destination has not been decided until we process all
> of the extension headers.
> 
> Sending ICMP in netfilter (especially in conntrack) is too
> patchy, and is not right.  If we do the right thing (and
> I believe we should do so),  I'd propose to have hooks
> around handlers inside ip6_input_finish().
> 
> ...I remember that I was thinking about this before.
> 
> In conclusion, the first option is just to drop
> incomplete fragments as we do today.  The second option
> would be to forward them to the next stage so that the
> core code could send ICMPv6 etc.  Or, we could add
> new code to send ICMPV6_TIME_EXCEED in the REJECT target.
> In the longer term, I think it is better to introduce
> per-exthdr hooks.

We'd need something that allows the incomplete fragments to be
processed long enough that they can actually reach the IPv6 core
(people usually don't allow incoming fragments when using conntrack).

But for now you've convinced me that this patch is wrong.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment
  2010-03-25  9:23                   ` Jozsef Kadlecsik
@ 2010-03-25 14:14                     ` YOSHIFUJI Hideaki
  0 siblings, 0 replies; 21+ messages in thread
From: YOSHIFUJI Hideaki @ 2010-03-25 14:14 UTC (permalink / raw)
  To: Jozsef Kadlecsik
  Cc: Patrick McHardy, Shan Wei, YOSHIFUJI Hideaki, David Miller,
	Alexey Dobriyan, Yasuyuki KOZAKAI, netdev@vger.kernel.org,
	netfilter-devel, YOSHIFUJI Hideaki

(2010/03/25 18:23), Jozsef Kadlecsik wrote:

>> First of all, in a "sane" setup, people should configure according
>> to their own requirements.  They may or may not want to send back
>> an ICMP packet.  And, even if the core is to send ICMP back, the
>> state would be correctly assigned.
>
> I meant the state of the fragmented packets. If we let the uncompleted
> fragments enter conntrack, as far as I can see their state will be INVALID.
> Or should we add an exception and set their state to UNTRACKED in
> conntrack?

Got it.  INVALID seems fine to me so far
while further consideration might be needed.

>> In conclusion, the first option is just to drop
>> incomplete fragments as we do today.  The second option
>> would be to forward them to the next stage so that the
>> core code could send ICMPv6 etc.  Or, we could add
>> new code to send ICMPV6_TIME_EXCEED in the REJECT target.
>> In the longer term, I think it is better to introduce
>> per-exthdr hooks.
>
> I agree with your conclusion too, except for a few questions.
>
> It is unclear to me how you can forward the packets to the next stage:
> above you pointed out that in defrag/reassembly before conntrack we do not
> know yet whether the packets are destined to the host or not. So again,
> how would you let the fragments through conntrack then?
>
> I don't see how the REJECT target could help in any way.

Argh, more explanation.

If you unconditionally send ICMPv6, behavior would be
broken.

If you really do want to send ICMP in netfilter,
you could add extra rules to the filter table.
For example, ICMP would be sent back only if no other exthdrs
exist, and other packets would be silently dropped.
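
One possible reading of that rule, written as a standalone decision function. This is a hedged sketch, not a proposed implementation: the function name `on_reassembly_timeout`, the `enum verdict` values and the idea of passing the extension-header chain as an array are all assumptions made for illustration. The numeric next-header values are the standard IPv6 ones.

```c
#include <assert.h>
#include <stddef.h>

/* Standard IPv6 next-header values for a few extension headers. */
enum {
    NEXTHDR_HOP      = 0,   /* hop-by-hop options */
    NEXTHDR_ROUTING  = 43,  /* routing header */
    NEXTHDR_FRAGMENT = 44,  /* fragment header */
    NEXTHDR_DEST     = 60,  /* destination options */
};

/* Hypothetical verdicts for a fragment queue that timed out. */
enum verdict { SILENT_DROP, SEND_TIME_EXCEEDED };

/* chain holds the extension headers seen before the upper-layer
 * header. ICMP is sent only when the fragment header is the sole
 * extension header; anything else is dropped silently. */
static enum verdict on_reassembly_timeout(const int *chain, size_t n)
{
    assert(chain != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (chain[i] != NEXTHDR_FRAGMENT)
            return SILENT_DROP;
    return SEND_TIME_EXCEEDED;
}
```

A chain containing only `NEXTHDR_FRAGMENT` gives `SEND_TIME_EXCEEDED`; a chain with a routing header in front of the fragment header gives `SILENT_DROP`, since the final destination is then not known from the IP header alone.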

This cannot be our clean/final/full answer, so I said it'd
be better to introduce per-exthdr hooks in longer term.

Regards,

--yoshfuji

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2010-03-25 14:14 UTC | newest]

Thread overview: 21+ messages
2010-02-27  6:39 [RFC PATCH net-next 0/7 v2]IPv6:netfilter: defragment Shan Wei
2010-03-10 17:13 ` YOSHIFUJI Hideaki
2010-03-11  9:16   ` Shan Wei
2010-03-13 13:47     ` YOSHIFUJI Hideaki
2010-03-15 16:27       ` Patrick McHardy
2010-03-23 16:28         ` YOSHIFUJI Hideaki
2010-03-23 17:16           ` Patrick McHardy
2010-03-23 18:58             ` YOSHIFUJI Hideaki
2010-03-23 20:10               ` Jozsef Kadlecsik
2010-03-25  4:20                 ` YOSHIFUJI Hideaki
2010-03-25  9:23                   ` Jozsef Kadlecsik
2010-03-25 14:14                     ` YOSHIFUJI Hideaki
2010-03-25 10:25                   ` Patrick McHardy
2010-03-25  8:38                 ` Pascal Hambourg
2010-03-25  9:13                   ` Shan Wei
2010-03-25 10:07                     ` Jozsef Kadlecsik
2010-03-25 10:20                       ` Patrick McHardy
2010-03-25  2:22               ` Shan Wei
2010-03-23 15:05     ` Patrick McHardy
2010-03-25  2:28       ` Shan Wei
2010-03-25  4:19         ` YOSHIFUJI Hideaki
