* [PATCH] Prevent crash on ip_conntrack removal
@ 2004-08-18 9:13 Olaf Kirch
2004-08-19 10:11 ` Harald Welte
0 siblings, 1 reply; 23+ messages in thread
From: Olaf Kirch @ 2004-08-18 9:13 UTC (permalink / raw)
To: netdev; +Cc: netfilter-devel
[-- Attachment #1: Type: text/plain, Size: 1301 bytes --]
Hi,
here's a patch that keeps us from crashing on removal of ip_conntrack.
This problem came up during IBM's testing of SLES.
I'm not sure if this issue has been submitted already.
Problem description courtesy of David Stevens:
It appears that conntrack, when loaded, is queueing the fragments
for reassembly pre-routing (ie, when skb->dst is 0) and giving
the fully reassembled packet to the pre-routing code which will
set skb->dst before using it.
IP without conntrack does the queueing of fragments and reassembly
post-routing, so skb->dst in that case is set for all fragments
and the reassembled packet.
In the failure scenario, it appears that conntrack has queued
some of the fragments (w/ skb->dst=0, esp. in the offset=0 first
fragment) and then the conntrack module is removed. Arrival
of a fragment afterward will queue and reassemble the entire
packet post-routing, but the first frag still has skb->dst 0,
so it'll blow up
To fix this, the patch below simply drops such skbs. A different fix
could be to change the conntrack module to flush out all unassembled
fragments when unloaded; an alternative patch for this is attached as
well (this one is completely untested).
Cheers
Olaf
--
Olaf Kirch | The Hardware Gods hate me.
okir@suse.de |
---------------+
[-- Attachment #2: netfilter-unload-crash --]
[-- Type: text/plain, Size: 971 bytes --]
Index: v2.6.8/net/ipv4/ip_input.c
===================================================================
--- v2.6.8.orig/net/ipv4/ip_input.c
+++ v2.6.8/net/ipv4/ip_input.c
@@ -177,6 +177,13 @@ int ip_call_ra_chain(struct sk_buff *skb
read_unlock(&ip_ra_lock);
return 1;
}
+ /* When ip_conntrack gets unloaded, we may be
+ * left with fragment chains where the first
+ * fragment has skb->dst = NULL. */
+ if (skb->dst == NULL) {
+ kfree_skb(skb);
+ return 1;
+ }
}
if (last) {
struct sk_buff *skb2 = skb_clone(skb, GFP_ATOMIC);
@@ -277,6 +284,13 @@ int ip_local_deliver(struct sk_buff *skb
skb = ip_defrag(skb);
if (!skb)
return 0;
+ /* When ip_conntrack gets unloaded, we may be
+ * left with fragment chains where the first
+ * fragment has skb->dst = NULL. */
+ if (skb->dst == NULL) {
+ kfree_skb(skb);
+ return 0;
+ }
}
return NF_HOOK(PF_INET, NF_IP_LOCAL_IN, skb, skb->dev, NULL,
[-- Attachment #3: conntrack-flush-fragments --]
[-- Type: text/plain, Size: 2250 bytes --]
Alternative fix for the crash on conntrack unload. Simply flush all
fragment queues when unloading conntrack_standalone so that there are
no partially assembled fragments left with skb->dst == NULL.
Index: v2.6.8/include/net/ip.h
===================================================================
--- v2.6.8.orig/include/net/ip.h
+++ v2.6.8/include/net/ip.h
@@ -255,6 +255,7 @@ extern int ip_call_ra_chain(struct sk_bu
*/
struct sk_buff *ip_defrag(struct sk_buff *skb);
+extern void ipfrag_flush(void);
extern int ip_frag_nqueues;
extern atomic_t ip_frag_mem;
Index: v2.6.8/net/ipv4/ip_fragment.c
===================================================================
--- v2.6.8.orig/net/ipv4/ip_fragment.c
+++ v2.6.8/net/ipv4/ip_fragment.c
@@ -239,13 +239,13 @@ static void ipq_kill(struct ipq *ipq)
/* Memory limiting on fragments. Evictor trashes the oldest
* fragment queue until we are back under the low threshold.
*/
-static void ip_evictor(void)
+static void __ip_evictor(int threshold)
{
struct ipq *qp;
struct list_head *tmp;
for(;;) {
- if (atomic_read(&ip_frag_mem) <= sysctl_ipfrag_low_thresh)
+ if (atomic_read(&ip_frag_mem) <= threshold)
return;
read_lock(&ipfrag_lock);
if (list_empty(&ipq_lru_list)) {
@@ -267,6 +267,11 @@ static void ip_evictor(void)
}
}
+static inline void ip_evictor(void)
+{
+ __ip_evictor(sysctl_ipfrag_low_thresh);
+}
+
/*
* Oops, a fragment queue timed out. Kill it and send an ICMP reply.
*/
@@ -677,4 +682,10 @@ void ipfrag_init(void)
add_timer(&ipfrag_secret_timer);
}
+void ipfrag_flush(void)
+{
+ __ip_evictor(0);
+}
+
EXPORT_SYMBOL(ip_defrag);
+EXPORT_SYMBOL(ipfrag_flush);
Index: v2.6.8/net/ipv4/netfilter/ip_conntrack_standalone.c
===================================================================
--- v2.6.8.orig/net/ipv4/netfilter/ip_conntrack_standalone.c
+++ v2.6.8/net/ipv4/netfilter/ip_conntrack_standalone.c
@@ -562,6 +562,8 @@ static int init_or_cleanup(int init)
nf_unregister_hook(&ip_conntrack_defrag_local_out_ops);
cleanup_defragops:
nf_unregister_hook(&ip_conntrack_defrag_ops);
+ /* Frag queues may hold fragments with skb->dst == NULL */
+ ipfrag_flush();
cleanup_proc:
proc_net_remove("ip_conntrack");
cleanup_init:
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-18 9:13 [PATCH] Prevent crash on ip_conntrack removal Olaf Kirch
@ 2004-08-19 10:11 ` Harald Welte
2004-08-19 14:18 ` David S. Miller
0 siblings, 1 reply; 23+ messages in thread
From: Harald Welte @ 2004-08-19 10:11 UTC (permalink / raw)
To: Olaf Kirch; +Cc: netdev, netfilter-devel, David Miller
[-- Attachment #1: Type: text/plain, Size: 1427 bytes --]
On Wed, Aug 18, 2004 at 11:13:52AM +0200, Olaf Kirch wrote:
> Hi,
>
> here's a patch that keeps us from crashing on removal of ip_conntrack.
> This problem came up during IBM's testing of SLES.
Thanks for this detailed bugreport and fix.
> I'm not sure if this issue has been submitted already.
Not that I'm aware of.
> To fix this, the patch below simply drops such skbs. A different fix
> could be to change the conntrack module to flush out all unassembled
> fragments when unloaded; an alternative patch for this is attached as
> well (this one is completely untested).
Since I don't want to put any more conntrack-specific code into the core
network stack, I'd rather go for the 'alternative patch'.
I'm not sure whether it's worth the effort to combine the two, i.e. only
flush entries with skb->dst == NULL.
But especially since module unloading is EXPERIMENTAL anyway, I think
it's ok when we completely flush the fragemnt queue.
Dave, is this fine with you? What solution would you prefer?
> Cheers
> Olaf
--
- Harald Welte <laforge@netfilter.org> http://www.netfilter.org/
============================================================================
"Fragmentation is like classful addressing -- an interesting early
architectural error that shows how much experimentation was going
on while IP was being designed." -- Paul Vixie
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-19 10:11 ` Harald Welte
@ 2004-08-19 14:18 ` David S. Miller
2004-08-19 14:55 ` Patrick McHardy
0 siblings, 1 reply; 23+ messages in thread
From: David S. Miller @ 2004-08-19 14:18 UTC (permalink / raw)
To: Harald Welte; +Cc: okir, netdev, netfilter-devel
On Thu, 19 Aug 2004 12:11:59 +0200
Harald Welte <laforge@netfilter.org> wrote:
> Dave, is this fine with you? What solution would you prefer?
I haven't been shown the patches so I can't generate an
opinion. :-)
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-19 14:18 ` David S. Miller
@ 2004-08-19 14:55 ` Patrick McHardy
2004-08-19 15:14 ` David S. Miller
0 siblings, 1 reply; 23+ messages in thread
From: Patrick McHardy @ 2004-08-19 14:55 UTC (permalink / raw)
To: David S. Miller; +Cc: Harald Welte, okir, netdev, netfilter-devel
[-- Attachment #1: Type: text/plain, Size: 417 bytes --]
David S. Miller wrote:
>On Thu, 19 Aug 2004 12:11:59 +0200
>Harald Welte <laforge@netfilter.org> wrote:
>
>
>
>>Dave, is this fine with you? What solution would you prefer?
>>
>>
>
>I haven't been shown the patches so I can't generate an
>opinion. :-)
>
>
These are Olaf's patches. I agree with Harald that the second
patch is better. I've fixed it up so it applies with the recent
ip_fragment.c changes.
[-- Attachment #2: netfilter-unload-crash --]
[-- Type: text/plain, Size: 971 bytes --]
Index: v2.6.8/net/ipv4/ip_input.c
===================================================================
--- v2.6.8.orig/net/ipv4/ip_input.c
+++ v2.6.8/net/ipv4/ip_input.c
@@ -177,6 +177,13 @@ int ip_call_ra_chain(struct sk_buff *skb
read_unlock(&ip_ra_lock);
return 1;
}
+ /* When ip_conntrack gets unloaded, we may be
+ * left with fragment chains where the first
+ * fragment has skb->dst = NULL. */
+ if (skb->dst == NULL) {
+ kfree_skb(skb);
+ return 1;
+ }
}
if (last) {
struct sk_buff *skb2 = skb_clone(skb, GFP_ATOMIC);
@@ -277,6 +284,13 @@ int ip_local_deliver(struct sk_buff *skb
skb = ip_defrag(skb);
if (!skb)
return 0;
+ /* When ip_conntrack gets unloaded, we may be
+ * left with fragment chains where the first
+ * fragment has skb->dst = NULL. */
+ if (skb->dst == NULL) {
+ kfree_skb(skb);
+ return 0;
+ }
}
return NF_HOOK(PF_INET, NF_IP_LOCAL_IN, skb, skb->dev, NULL,
[-- Attachment #3: conntrack-flush-fragments --]
[-- Type: text/plain, Size: 2250 bytes --]
Alternative fix for the crash on conntrack unload. Simply flush all
fragment queues when unloading conntrack_standalone so that there are
no partially assembled fragments left with skb->dst == NULL.
Index: v2.6.8/include/net/ip.h
===================================================================
--- v2.6.8.orig/include/net/ip.h
+++ v2.6.8/include/net/ip.h
@@ -255,6 +255,7 @@ extern int ip_call_ra_chain(struct sk_bu
*/
struct sk_buff *ip_defrag(struct sk_buff *skb);
+extern void ipfrag_flush(void);
extern int ip_frag_nqueues;
extern atomic_t ip_frag_mem;
Index: v2.6.8/net/ipv4/ip_fragment.c
===================================================================
--- v2.6.8.orig/net/ipv4/ip_fragment.c
+++ v2.6.8/net/ipv4/ip_fragment.c
@@ -239,13 +239,13 @@ static void ipq_kill(struct ipq *ipq)
/* Memory limiting on fragments. Evictor trashes the oldest
* fragment queue until we are back under the low threshold.
*/
-static void ip_evictor(void)
+static void __ip_evictor(int threshold)
{
struct ipq *qp;
struct list_head *tmp;
for(;;) {
- if (atomic_read(&ip_frag_mem) <= sysctl_ipfrag_low_thresh)
+ if (atomic_read(&ip_frag_mem) <= threshold)
return;
read_lock(&ipfrag_lock);
if (list_empty(&ipq_lru_list)) {
@@ -267,6 +267,11 @@ static void ip_evictor(void)
}
}
+static inline void ip_evictor(void)
+{
+ __ip_evictor(sysctl_ipfrag_low_thresh);
+}
+
/*
* Oops, a fragment queue timed out. Kill it and send an ICMP reply.
*/
@@ -677,4 +682,10 @@ void ipfrag_init(void)
add_timer(&ipfrag_secret_timer);
}
+void ipfrag_flush(void)
+{
+ __ip_evictor(0);
+}
+
EXPORT_SYMBOL(ip_defrag);
+EXPORT_SYMBOL(ipfrag_flush);
Index: v2.6.8/net/ipv4/netfilter/ip_conntrack_standalone.c
===================================================================
--- v2.6.8.orig/net/ipv4/netfilter/ip_conntrack_standalone.c
+++ v2.6.8/net/ipv4/netfilter/ip_conntrack_standalone.c
@@ -562,6 +562,8 @@ static int init_or_cleanup(int init)
nf_unregister_hook(&ip_conntrack_defrag_local_out_ops);
cleanup_defragops:
nf_unregister_hook(&ip_conntrack_defrag_ops);
+ /* Frag queues may hold fragments with skb->dst == NULL */
+ ipfrag_flush();
cleanup_proc:
proc_net_remove("ip_conntrack");
cleanup_init:
[-- Attachment #4: x --]
[-- Type: text/plain, Size: 2629 bytes --]
# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
# 2004/08/19 16:51:10+02:00 kaber@coreworks.de
# [NETFILTER]: Flush ip fragment queue on conntrack module unload
#
# Signed-off-by: Patrick McHardy <kaber@trash.net>
#
# net/ipv4/netfilter/ip_conntrack_standalone.c
# 2004/08/19 16:50:47+02:00 kaber@coreworks.de +2 -0
# [NETFILTER]: Flush ip fragment queue on conntrack module unload
#
# net/ipv4/ip_fragment.c
# 2004/08/19 16:50:47+02:00 kaber@coreworks.de +14 -3
# [NETFILTER]: Flush ip fragment queue on conntrack module unload
#
# include/net/ip.h
# 2004/08/19 16:50:47+02:00 kaber@coreworks.de +1 -0
# [NETFILTER]: Flush ip fragment queue on conntrack module unload
#
diff -Nru a/include/net/ip.h b/include/net/ip.h
--- a/include/net/ip.h 2004-08-19 16:53:39 +02:00
+++ b/include/net/ip.h 2004-08-19 16:53:39 +02:00
@@ -255,6 +255,7 @@
*/
struct sk_buff *ip_defrag(struct sk_buff *skb);
+extern void ipfrag_flush(void);
extern int ip_frag_nqueues;
extern atomic_t ip_frag_mem;
diff -Nru a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
--- a/net/ipv4/ip_fragment.c 2004-08-19 16:53:38 +02:00
+++ b/net/ipv4/ip_fragment.c 2004-08-19 16:53:38 +02:00
@@ -241,15 +241,15 @@
}
/* Memory limiting on fragments. Evictor trashes the oldest
- * fragment queue until we are back under the low threshold.
+ * fragment queue until we are back under the threshold.
*/
-static void ip_evictor(void)
+static void __ip_evictor(int threshold)
{
struct ipq *qp;
struct list_head *tmp;
int work;
- work = atomic_read(&ip_frag_mem) - sysctl_ipfrag_low_thresh;
+ work = atomic_read(&ip_frag_mem) - threshold;
if (work <= 0)
return;
@@ -274,6 +274,11 @@
}
}
+static inline void ip_evictor(void)
+{
+ __ip_evictor(sysctl_ipfrag_low_thresh);
+}
+
/*
* Oops, a fragment queue timed out. Kill it and send an ICMP reply.
*/
@@ -684,4 +689,10 @@
add_timer(&ipfrag_secret_timer);
}
+void ipfrag_flush(void)
+{
+ __ip_evictor(0);
+}
+
EXPORT_SYMBOL(ip_defrag);
+EXPORT_SYMBOL(ipfrag_flush);
diff -Nru a/net/ipv4/netfilter/ip_conntrack_standalone.c b/net/ipv4/netfilter/ip_conntrack_standalone.c
--- a/net/ipv4/netfilter/ip_conntrack_standalone.c 2004-08-19 16:53:38 +02:00
+++ b/net/ipv4/netfilter/ip_conntrack_standalone.c 2004-08-19 16:53:39 +02:00
@@ -806,6 +806,8 @@
nf_unregister_hook(&ip_conntrack_defrag_local_out_ops);
cleanup_defragops:
nf_unregister_hook(&ip_conntrack_defrag_ops);
+ /* Frag queues may hold fragments with skb->dst == NULL */
+ ipfrag_flush();
cleanup_proc_stat:
proc_net_remove("ip_conntrack_stat");
cleanup_proc_exp:
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-19 14:55 ` Patrick McHardy
@ 2004-08-19 15:14 ` David S. Miller
2004-08-21 15:10 ` Patrick McHardy
0 siblings, 1 reply; 23+ messages in thread
From: David S. Miller @ 2004-08-19 15:14 UTC (permalink / raw)
To: Patrick McHardy; +Cc: laforge, okir, netdev, netfilter-devel
On Thu, 19 Aug 2004 16:55:58 +0200
Patrick McHardy <kaber@trash.net> wrote:
> These are Olaf's patches. I agree with Harald that the second
> patch is better. I've fixed it up so it applies with the recent
> ip_fragment.c changes.
I have a better idea.
Instead of setting skb->dst to NULL, it should set it to some
NULL destination entry which just frees up the packets. Then
no special case handling. skb->dst==NULL packets should never
get into the fragment queue to begin with.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-19 15:14 ` David S. Miller
@ 2004-08-21 15:10 ` Patrick McHardy
2004-08-22 5:13 ` David S. Miller
0 siblings, 1 reply; 23+ messages in thread
From: Patrick McHardy @ 2004-08-21 15:10 UTC (permalink / raw)
To: David S. Miller; +Cc: laforge, okir, netdev, netfilter-devel
David S. Miller wrote:
>I have a better idea.
>
>Instead of setting skb->dst to NULL, it should set it to some
>NULL destination entry which just frees up the packets. Then
>no special case handling. skb->dst==NULL packets should never
>get into the fragment queue to begin with.
>
The problem is that conntrack unload can cause packets without a
dst_entry to appear in ip_local_deliver, which is already after
the call to dst_input.
Regards
Patrick
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-21 15:10 ` Patrick McHardy
@ 2004-08-22 5:13 ` David S. Miller
2004-08-22 12:58 ` Patrick McHardy
0 siblings, 1 reply; 23+ messages in thread
From: David S. Miller @ 2004-08-22 5:13 UTC (permalink / raw)
To: Patrick McHardy; +Cc: laforge, okir, netdev, netfilter-devel
On Sat, 21 Aug 2004 17:10:20 +0200
Patrick McHardy <kaber@trash.net> wrote:
> David S. Miller wrote:
>
> >I have a better idea.
> >
> >Instead of setting skb->dst to NULL, it should set it to some
> >NULL destination entry which just frees up the packets. Then
> >no special case handling. skb->dst==NULL packets should never
> >get into the fragment queue to begin with.
> >
> The problem is that conntrack unload can cause packets without a
> dst_entry to appear in ip_local_deliver, which is already after
> the call to dst_input.
How can it call ip_local_deliver() without a valid skb->dst?
That function is only invoked via skb->dst->input(skb) which
by implication means that skb->dst is non-NULL.
Actually there is a call via ip_mr_input() but that code also
has a precondition that skb->dst is non-NULL too. I say this
due to the unchecked skb->dst accesses it makes early on.
Please explain. I don't question that it happens, just show
me how :-)
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-22 5:13 ` David S. Miller
@ 2004-08-22 12:58 ` Patrick McHardy
2004-08-23 5:03 ` David S. Miller
2004-08-23 21:18 ` David Stevens
0 siblings, 2 replies; 23+ messages in thread
From: Patrick McHardy @ 2004-08-22 12:58 UTC (permalink / raw)
To: David S. Miller; +Cc: laforge, okir, netdev, netfilter-devel
David S. Miller wrote:
>On Sat, 21 Aug 2004 17:10:20 +0200
>Patrick McHardy <kaber@trash.net> wrote:
>
>>The problem is that conntrack unload can cause packets without a
>>dst_entry to appear in ip_local_deliver, which is already after
>>the call to dst_input.
>>
>>
>
>How can it call ip_local_deliver() without a valid skb->dst?
>That function is only invoked via skb->dst->input(skb) which
>by implication means that skb->dst is non-NULL.
>
>Actually there is a call via ip_mr_input() but that code also
>has a precondition that skb->dst is non-NULL too. I say this
>due to the unchecked skb->dst accesses it makes early on.
>
>Please explain. I don't question that it happens, just show
>me how :-)
>
>
The first fragment (offset=0) is given to ip_defrag by conntrack
at PRE_ROUTING, without a dst_entry. Then conntrack is unloaded.
Further fragments are now queued in ip_local_deliver. When the
packet is reassembled and "continues" its way from
ip_local_deliver, it doesn't have a dst_entry.
The opposite way is of course also possible, packets queued in
ip_local_deliver can jump and appear in the PRE_ROUTING hook
when conntrack is loaded, but that way doesn't seem to cause
problems.
Regards
Patrick
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-22 12:58 ` Patrick McHardy
@ 2004-08-23 5:03 ` David S. Miller
2004-08-23 21:18 ` David Stevens
1 sibling, 0 replies; 23+ messages in thread
From: David S. Miller @ 2004-08-23 5:03 UTC (permalink / raw)
To: Patrick McHardy; +Cc: laforge, okir, netdev, netfilter-devel
On Sun, 22 Aug 2004 14:58:01 +0200
Patrick McHardy <kaber@trash.net> wrote:
> The first fragment (offset=0) is given to ip_defrag by conntrack
> at PRE_ROUTING, without a dst_entry. Then conntrack is unloaded.
> Further fragments are now queued in ip_local_deliver. When the
> packet is reassembled and "continues" its way from
> ip_local_deliver, it doesn't have a dst_entry.
>
> The opposite way is of course also possible, packets queued in
> ip_local_deliver can jump and appear in the PRE_ROUTING hook
> when conntrack is loaded, but that way doesn't seem to cause
> problems.
Thanks for the explanation Patrick.
Let me brain storm on this on Monday (tomorrow).
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-22 12:58 ` Patrick McHardy
2004-08-23 5:03 ` David S. Miller
@ 2004-08-23 21:18 ` David Stevens
2004-08-24 0:45 ` Nivedita Singhvi
2004-08-24 0:45 ` Patrick McHardy
1 sibling, 2 replies; 23+ messages in thread
From: David Stevens @ 2004-08-23 21:18 UTC (permalink / raw)
To: Patrick McHardy
Cc: David S. Miller, laforge, netdev, netdev-bounce, netfilter-devel,
okir
BTW, since some of the frags (esp. the one that triggers the problem)
are added post-routing, a valid dst is available. It just isn't the first
frag in the particular scenario.
So, one solution would be to set skb->dst for the head (if NULL) based
on a non-null fragment skb->dst. I believe that would prevent the problem
case without dropping the fragment, since it'll be processed post-routing
only if one of the frags is.
When I was looking at it, I wondered if conntrack really has a need to
reassemble itself, though. Couldn't it let IP do the reassembling and
just ignore offset != 0 frags? The offset==0 frags will have enough
protocol header to identify by port (a requirement for ICMP). But I don't
know this code well enough to know if conntrack does actually need
to reassemble for some good reason. Superficially, I wouldn't think
there'd be a reason for it.
+-DLS
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-23 21:18 ` David Stevens
@ 2004-08-24 0:45 ` Nivedita Singhvi
2004-08-24 0:45 ` Patrick McHardy
1 sibling, 0 replies; 23+ messages in thread
From: Nivedita Singhvi @ 2004-08-24 0:45 UTC (permalink / raw)
To: David Stevens
Cc: Patrick McHardy, David S. Miller, laforge, netdev, netdev-bounce,
netfilter-devel, okir
David Stevens wrote:
> So, one solution would be to set skb->dst for the head (if NULL) based
> on a non-null fragment skb->dst. I believe that would prevent the problem
> case without dropping the fragment, since it'll be processed post-routing
> only if one of the frags is.
This would be more performant than dropping the frags, and
requiring a retransmit (or lack thereof, depending on protocol).
> When I was looking at it, I wondered if conntrack really has a need to
> reassemble itself, though. Couldn't it let IP do the reassembling and
I asked Harald this when I met him last, and he said it
does need to. So I don't think this (having conntrack
reassemble) is avoidable, unfortunately.
Of course, fragmentation, on the other hand, :), ...
thanks,
Nivedita
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-23 21:18 ` David Stevens
2004-08-24 0:45 ` Nivedita Singhvi
@ 2004-08-24 0:45 ` Patrick McHardy
2004-08-24 21:28 ` David Stevens
1 sibling, 1 reply; 23+ messages in thread
From: Patrick McHardy @ 2004-08-24 0:45 UTC (permalink / raw)
To: David Stevens
Cc: David S. Miller, laforge, netdev, netdev-bounce, netfilter-devel,
okir
David Stevens wrote:
>BTW, since some of the frags (esp. the one that triggers the problem)
>are added post-routing, a valid dst is available. It just isn't the first
>frag in the particular scenario.
>
>So, one solution would be to set skb->dst for the head (if NULL) based
>on a non-null fragment skb->dst. I believe that would prevent the problem
>case without dropping the fragment, since it'll be processed post-routing
>only if one of the frags is.
>
>
The fragments which jumped from PRE_ROUTING to ip_local_deliver will miss
ip options processing.
>When I was looking at it, I wondered if conntrack really has a need to
>reassemble itself, though. Couldn't it let IP do the reassembling and
>just ignore offset != 0 frags? The offset==0 frags will have enough
>protocol header to identify by port (a requirement for ICMP). But I don't
>know this code well enough to know if conntrack does actually need
>to reassemble for some good reason. Superficially, I wouldn't think
>there'd be a reason for it.
>
>
The NAT code needs to handle all fragments, so they can't be skipped.
Handling fragments in conntrack and NAT would be possible without helpers,
but to scan for patterns in fragments you need state for each fragmented
packet for each connection.
Regards
Patrick
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-24 0:45 ` Patrick McHardy
@ 2004-08-24 21:28 ` David Stevens
2004-08-29 6:15 ` David S. Miller
0 siblings, 1 reply; 23+ messages in thread
From: David Stevens @ 2004-08-24 21:28 UTC (permalink / raw)
To: Patrick McHardy
Cc: David S. Miller, laforge, netdev, netdev-bounce, netfilter-devel,
okir
Then it appears that simply dropping the packet when the
skb->dst == 0 isn't quite right, since per-frag option processing
wouldn't be done in the case where conntrack is removed, but
the first frag does have skb->dst set (but not some of the others).
In that case, it appears that conntrack needs to flush the entire
frag queue when it's unloaded. That shouldn't happen much,
so maybe that's not such a bad idea.
+-DLS
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-24 21:28 ` David Stevens
@ 2004-08-29 6:15 ` David S. Miller
2004-08-29 19:36 ` Patrick McHardy
0 siblings, 1 reply; 23+ messages in thread
From: David S. Miller @ 2004-08-29 6:15 UTC (permalink / raw)
To: David Stevens
Cc: kaber, davem, laforge, netdev, netdev-bounce, netfilter-devel,
okir
On Tue, 24 Aug 2004 15:28:07 -0600
David Stevens <dlstevens@us.ibm.com> wrote:
> In that case, it appears that conntrack needs to flush the entire
> frag queue when it's unloaded. That shouldn't happen much,
> so maybe that's not such a bad idea.
I think I agree with David now that I've read through this a few
times. Can someone send me a patch which does this?
Does 2.4.x have this problem too? I thought it didn't.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-29 6:15 ` David S. Miller
@ 2004-08-29 19:36 ` Patrick McHardy
2004-08-29 19:57 ` David S. Miller
` (2 more replies)
0 siblings, 3 replies; 23+ messages in thread
From: Patrick McHardy @ 2004-08-29 19:36 UTC (permalink / raw)
To: David S. Miller
Cc: David Stevens, davem, laforge, netdev, netdev-bounce,
netfilter-devel, okir
[-- Attachment #1: Type: text/plain, Size: 424 bytes --]
David S. Miller wrote:
>I think I agree with David now that I've read through this a few
>times. Can someone send me a patch which does this?
>
>
Attached. The first patch still crashed, we need to prevent new
fragments from getting queued after the queue is flushed until the
hook in unregistered.
>Does 2.4.x have this problem too? I thought it didn't.
>
>
I'll have a look, but I think it does.
Regards
Patrick
[-- Attachment #2: x --]
[-- Type: text/plain, Size: 4121 bytes --]
# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
# 2004/08/29 20:24:07+02:00 kaber@coreworks.de
# [NETFILTER]: Flush fragment queue on conntrack unload
#
# Signed-off-by: Patrick McHardy <kaber@trash.net>
#
# net/ipv4/netfilter/ip_conntrack_standalone.c
# 2004/08/29 20:22:57+02:00 kaber@coreworks.de +6 -0
# [NETFILTER]: Flush fragment queue on conntrack unload
#
# net/ipv4/netfilter/ip_conntrack_core.c
# 2004/08/29 20:22:57+02:00 kaber@coreworks.de +8 -0
# [NETFILTER]: Flush fragment queue on conntrack unload
#
# net/ipv4/ip_fragment.c
# 2004/08/29 20:22:57+02:00 kaber@coreworks.de +14 -3
# [NETFILTER]: Flush fragment queue on conntrack unload
#
# include/net/ip.h
# 2004/08/29 20:22:57+02:00 kaber@coreworks.de +1 -0
# [NETFILTER]: Flush fragment queue on conntrack unload
#
# include/linux/netfilter_ipv4/ip_conntrack.h
# 2004/08/29 20:22:57+02:00 kaber@coreworks.de +1 -0
# [NETFILTER]: Flush fragment queue on conntrack unload
#
diff -Nru a/include/linux/netfilter_ipv4/ip_conntrack.h b/include/linux/netfilter_ipv4/ip_conntrack.h
--- a/include/linux/netfilter_ipv4/ip_conntrack.h 2004-08-29 20:55:13 +02:00
+++ b/include/linux/netfilter_ipv4/ip_conntrack.h 2004-08-29 20:55:13 +02:00
@@ -275,6 +275,7 @@
/* Fake conntrack entry for untracked connections */
extern struct ip_conntrack ip_conntrack_untracked;
+extern int ip_ct_no_defrag;
/* Returns new sk_buff, or NULL */
struct sk_buff *
ip_ct_gather_frags(struct sk_buff *skb);
diff -Nru a/include/net/ip.h b/include/net/ip.h
--- a/include/net/ip.h 2004-08-29 20:55:13 +02:00
+++ b/include/net/ip.h 2004-08-29 20:55:13 +02:00
@@ -255,6 +255,7 @@
*/
struct sk_buff *ip_defrag(struct sk_buff *skb);
+extern void ipfrag_flush(void);
extern int ip_frag_nqueues;
extern atomic_t ip_frag_mem;
diff -Nru a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
--- a/net/ipv4/ip_fragment.c 2004-08-29 20:55:13 +02:00
+++ b/net/ipv4/ip_fragment.c 2004-08-29 20:55:13 +02:00
@@ -241,15 +241,15 @@
}
/* Memory limiting on fragments. Evictor trashes the oldest
- * fragment queue until we are back under the low threshold.
+ * fragment queue until we are back under the threshold.
*/
-static void ip_evictor(void)
+static void __ip_evictor(int threshold)
{
struct ipq *qp;
struct list_head *tmp;
int work;
- work = atomic_read(&ip_frag_mem) - sysctl_ipfrag_low_thresh;
+ work = atomic_read(&ip_frag_mem) - threshold;
if (work <= 0)
return;
@@ -274,6 +274,11 @@
}
}
+static inline void ip_evictor(void)
+{
+ __ip_evictor(sysctl_ipfrag_low_thresh);
+}
+
/*
* Oops, a fragment queue timed out. Kill it and send an ICMP reply.
*/
@@ -684,4 +689,10 @@
add_timer(&ipfrag_secret_timer);
}
+void ipfrag_flush(void)
+{
+ __ip_evictor(0);
+}
+
EXPORT_SYMBOL(ip_defrag);
+EXPORT_SYMBOL(ipfrag_flush);
diff -Nru a/net/ipv4/netfilter/ip_conntrack_core.c b/net/ipv4/netfilter/ip_conntrack_core.c
--- a/net/ipv4/netfilter/ip_conntrack_core.c 2004-08-29 20:55:13 +02:00
+++ b/net/ipv4/netfilter/ip_conntrack_core.c 2004-08-29 20:55:13 +02:00
@@ -1173,6 +1173,8 @@
}
}
+int ip_ct_no_defrag;
+
/* Returns new sk_buff, or NULL */
struct sk_buff *
ip_ct_gather_frags(struct sk_buff *skb)
@@ -1181,6 +1183,12 @@
#ifdef CONFIG_NETFILTER_DEBUG
unsigned int olddebug = skb->nf_debug;
#endif
+
+ if (unlikely(ip_ct_no_defrag)) {
+ kfree_skb(skb);
+ return NULL;
+ }
+
if (sk) {
sock_hold(sk);
skb_orphan(skb);
diff -Nru a/net/ipv4/netfilter/ip_conntrack_standalone.c b/net/ipv4/netfilter/ip_conntrack_standalone.c
--- a/net/ipv4/netfilter/ip_conntrack_standalone.c 2004-08-29 20:55:13 +02:00
+++ b/net/ipv4/netfilter/ip_conntrack_standalone.c 2004-08-29 20:55:13 +02:00
@@ -805,6 +805,12 @@
cleanup_defraglocalops:
nf_unregister_hook(&ip_conntrack_defrag_local_out_ops);
cleanup_defragops:
+ /* Frag queues may hold fragments with skb->dst == NULL */
+ ip_ct_no_defrag = 1;
+ smp_wmb();
+ local_bh_disable();
+ ipfrag_flush();
+ local_bh_enable();
nf_unregister_hook(&ip_conntrack_defrag_ops);
cleanup_proc_stat:
proc_net_remove("ip_conntrack_stat");
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-29 19:36 ` Patrick McHardy
@ 2004-08-29 19:57 ` David S. Miller
2004-08-29 20:06 ` Patrick McHardy
2004-08-29 21:58 ` Patrick McHardy
2004-08-29 21:48 ` Patrick McHardy
2004-08-30 7:57 ` Olaf Kirch
2 siblings, 2 replies; 23+ messages in thread
From: David S. Miller @ 2004-08-29 19:57 UTC (permalink / raw)
To: Patrick McHardy
Cc: dlstevens, davem, laforge, netdev, netdev-bounce, netfilter-devel,
okir
On Sun, 29 Aug 2004 21:36:28 +0200
Patrick McHardy <kaber@trash.net> wrote:
> David S. Miller wrote:
>
> >I think I agree with David now that I've read through this a few
> >times. Can someone send me a patch which does this?
> >
> >
> Attached. The first patch still crashed, we need to prevent new
> fragments from getting queued after the queue is flushed until the
> hook in unregistered.
While we're doing this, is your patch similar to one of the two
original ones that Olaf Kirch posted? I want to give him proper
attribution, in whatever form is reasonable, that's all.
> >Does 2.4.x have this problem too? I thought it didn't.
> >
> >
> I'll have a look, but I think it does.
Thanks a lot for helping out with this Patrick.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-29 19:57 ` David S. Miller
@ 2004-08-29 20:06 ` Patrick McHardy
2004-08-29 21:58 ` Patrick McHardy
1 sibling, 0 replies; 23+ messages in thread
From: Patrick McHardy @ 2004-08-29 20:06 UTC (permalink / raw)
To: David S. Miller
Cc: dlstevens, davem, laforge, netdev, netdev-bounce, netfilter-devel,
okir
David S. Miller wrote:
>On Sun, 29 Aug 2004 21:36:28 +0200
>Patrick McHardy <kaber@trash.net> wrote:
>
>>Attached. The first patch still crashed, we need to prevent new
>>fragments from getting queued after the queue is flushed until the
>>hook in unregistered.
>>
>>
>
>While we're doing this, is your patch similar to one of the two
>original ones that Olaf Kirch posted? I want to give him proper
>attribution, in whatever form is reasonable, that's all.
>
>
It is based on his second patch. He didn't sign off, otherwise I just
would have kept his Signed-off-by: line.
Regards
Patrick
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-29 19:57 ` David S. Miller
2004-08-29 20:06 ` Patrick McHardy
@ 2004-08-29 21:58 ` Patrick McHardy
2004-08-29 23:38 ` David S. Miller
1 sibling, 1 reply; 23+ messages in thread
From: Patrick McHardy @ 2004-08-29 21:58 UTC (permalink / raw)
To: David S. Miller
Cc: dlstevens, davem, laforge, netdev, netdev-bounce, netfilter-devel,
okir
David S. Miller wrote:
>>>Does 2.4.x have this problem too? I thought it didn't.
>>>
>>>
>>I'll have a look, but I think it does.
>>
2.4 has the same problem. Before I post the patch, 2.4 seems to be
missing this patch, do you already have it queued or should I send
a 2.4 version first ?
ChangeSet@1.1853, 2004-08-18 14:28:05-07:00, davem@nuts.davemloft.net
[IPV4]: Fix theoretical loop on SMP in ip_evictor().
Snapshot the amount of work to do, and just do it.
In this way we avoid a theoretical loop whereby
one cpu sits in ip_evictor() tossing fragments
while another keeps adding a fragment just as we
bring ip_frag_mem down below the low threshold.
Signed-off-by: David S. Miller <davem@redhat.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-29 21:58 ` Patrick McHardy
@ 2004-08-29 23:38 ` David S. Miller
2004-08-30 0:50 ` Patrick McHardy
0 siblings, 1 reply; 23+ messages in thread
From: David S. Miller @ 2004-08-29 23:38 UTC (permalink / raw)
To: Patrick McHardy
Cc: dlstevens, davem, laforge, netdev, netdev-bounce, netfilter-devel,
okir
On Sun, 29 Aug 2004 23:58:06 +0200
Patrick McHardy <kaber@trash.net> wrote:
> Before I post the patch, 2.4 seems to be
> missing this patch, do you already have it queued or should I send
> a 2.4 version first ?
>
> ChangeSet@1.1853, 2004-08-18 14:28:05-07:00, davem@nuts.davemloft.net
> [IPV4]: Fix theoretical loop on SMP in ip_evictor().
I pushed this off to Marcelo, he just didn't pull from my
tree yet, which is at:
bk://kernel.bkbits.net/davem/net-2.4
Where you'll find those fixes as:
ChangeSet@1.1498.1.2, 2004-08-18 14:26:09-07:00, davem@nuts.davemloft.net
[IPV4]: Fix theoretical loop on SMP in ip_evictor().
Snapshot the amount of work to do, and just do it.
In this way we avoid a theoretical loop whereby
one cpu sits in ip_evictor() tossing fragments
while another keeps adding a fragment just as we
bring ip_frag_mem down below the low threshold.
Signed-off-by: David S. Miller <davem@redhat.com>
ChangeSet@1.1498.1.3, 2004-08-18 14:31:35-07:00, davem@nuts.davemloft.net
[IPV6]: ip6_evictor() has same problem as ip_evictor().
Signed-off-by: David S. Miller <davem@redhat.com>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-29 23:38 ` David S. Miller
@ 2004-08-30 0:50 ` Patrick McHardy
2004-08-30 4:28 ` David S. Miller
0 siblings, 1 reply; 23+ messages in thread
From: Patrick McHardy @ 2004-08-30 0:50 UTC (permalink / raw)
To: David S. Miller
Cc: dlstevens, davem, laforge, netdev, netdev-bounce, netfilter-devel,
okir
[-- Attachment #1: Type: text/plain, Size: 751 bytes --]
David S. Miller wrote:
>On Sun, 29 Aug 2004 23:58:06 +0200
>Patrick McHardy <kaber@trash.net> wrote:
>
>
>>Before I post the patch, 2.4 seems to be
>>missing this patch, do you already have it queued or should I send
>>a 2.4 version first ?
>>
>>ChangeSet@1.1853, 2004-08-18 14:28:05-07:00, davem@nuts.davemloft.net
>> [IPV4]: Fix theoretical loop on SMP in ip_evictor().
>>
>>
>
>I pushed this off to Marcelo, he just didn't pull from my
>tree yet, which is at:
>
> bk://kernel.bkbits.net/davem/net-2.4
>
>Where you'll find those fixes as:
>
>ChangeSet@1.1498.1.2, 2004-08-18 14:26:09-07:00, davem@nuts.davemloft.net
> [IPV4]: Fix theoretical loop on SMP in ip_evictor().
>
>
Great, here is the patch for 2.4 for the conntrack problem.
[-- Attachment #2: x --]
[-- Type: text/plain, Size: 4797 bytes --]
# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
# 2004/08/30 02:22:35+02:00 kaber@coreworks.de
# [NETFILTER]: Flush fragment queue on conntrack unload
#
# Based on patch from Olaf Kirch <okir@suse.de>
#
# Signed-off-by: Patrick McHardy <kaber@trash.net>
#
# net/netsyms.c
# 2004/08/30 02:22:30+02:00 kaber@coreworks.de +1 -0
# [NETFILTER]: Flush fragment queue on conntrack unload
#
# net/ipv4/netfilter/ip_conntrack_standalone.c
# 2004/08/30 02:22:30+02:00 kaber@coreworks.de +7 -0
# [NETFILTER]: Flush fragment queue on conntrack unload
#
# net/ipv4/netfilter/ip_conntrack_core.c
# 2004/08/30 02:22:30+02:00 kaber@coreworks.de +8 -0
# [NETFILTER]: Flush fragment queue on conntrack unload
#
# net/ipv4/ip_fragment.c
# 2004/08/30 02:22:30+02:00 kaber@coreworks.de +13 -3
# [NETFILTER]: Flush fragment queue on conntrack unload
#
# include/net/ip.h
# 2004/08/30 02:22:30+02:00 kaber@coreworks.de +1 -0
# [NETFILTER]: Flush fragment queue on conntrack unload
#
# include/linux/netfilter_ipv4/ip_conntrack.h
# 2004/08/30 02:22:30+02:00 kaber@coreworks.de +1 -0
# [NETFILTER]: Flush fragment queue on conntrack unload
#
diff -Nru a/include/linux/netfilter_ipv4/ip_conntrack.h b/include/linux/netfilter_ipv4/ip_conntrack.h
--- a/include/linux/netfilter_ipv4/ip_conntrack.h 2004-08-30 02:24:47 +02:00
+++ b/include/linux/netfilter_ipv4/ip_conntrack.h 2004-08-30 02:24:47 +02:00
@@ -249,6 +249,7 @@
/* Call me when a conntrack is destroyed. */
extern void (*ip_conntrack_destroyed)(struct ip_conntrack *conntrack);
+extern int ip_ct_no_defrag;
/* Returns new sk_buff, or NULL */
struct sk_buff *
ip_ct_gather_frags(struct sk_buff *skb);
diff -Nru a/include/net/ip.h b/include/net/ip.h
--- a/include/net/ip.h 2004-08-30 02:24:47 +02:00
+++ b/include/net/ip.h 2004-08-30 02:24:47 +02:00
@@ -228,6 +228,7 @@
*/
struct sk_buff *ip_defrag(struct sk_buff *skb);
+extern void ipfrag_flush(void);
extern int ip_frag_nqueues;
extern atomic_t ip_frag_mem;
diff -Nru a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
--- a/net/ipv4/ip_fragment.c 2004-08-30 02:24:47 +02:00
+++ b/net/ipv4/ip_fragment.c 2004-08-30 02:24:47 +02:00
@@ -240,15 +240,15 @@
}
/* Memory limiting on fragments. Evictor trashes the oldest
- * fragment queue until we are back under the low threshold.
+ * fragment queue until we are back under the threshold.
*/
-static void ip_evictor(void)
+static void __ip_evictor(int threshold)
{
struct ipq *qp;
struct list_head *tmp;
int work;
- work = atomic_read(&ip_frag_mem) - sysctl_ipfrag_low_thresh;
+ work = atomic_read(&ip_frag_mem) - threshold;
if (work <= 0)
return;
@@ -273,6 +273,11 @@
}
}
+static inline void ip_evictor(void)
+{
+ __ip_evictor(sysctl_ipfrag_low_thresh);
+}
+
/*
* Oops, a fragment queue timed out. Kill it and send an ICMP reply.
*/
@@ -681,4 +686,9 @@
ipfrag_secret_timer.function = ipfrag_secret_rebuild;
ipfrag_secret_timer.expires = jiffies + sysctl_ipfrag_secret_interval;
add_timer(&ipfrag_secret_timer);
+}
+
+void ipfrag_flush(void)
+{
+ __ip_evictor(0);
}
diff -Nru a/net/ipv4/netfilter/ip_conntrack_core.c b/net/ipv4/netfilter/ip_conntrack_core.c
--- a/net/ipv4/netfilter/ip_conntrack_core.c 2004-08-30 02:24:47 +02:00
+++ b/net/ipv4/netfilter/ip_conntrack_core.c 2004-08-30 02:24:47 +02:00
@@ -1183,6 +1183,8 @@
WRITE_UNLOCK(&ip_conntrack_lock);
}
+int ip_ct_no_defrag;
+
/* Returns new sk_buff, or NULL */
struct sk_buff *
ip_ct_gather_frags(struct sk_buff *skb)
@@ -1191,6 +1193,12 @@
#ifdef CONFIG_NETFILTER_DEBUG
unsigned int olddebug = skb->nf_debug;
#endif
+
+ if (unlikely(ip_ct_no_defrag)) {
+ kfree_skb(skb);
+ return NULL;
+ }
+
if (sk) {
sock_hold(sk);
skb_orphan(skb);
diff -Nru a/net/ipv4/netfilter/ip_conntrack_standalone.c b/net/ipv4/netfilter/ip_conntrack_standalone.c
--- a/net/ipv4/netfilter/ip_conntrack_standalone.c 2004-08-30 02:24:47 +02:00
+++ b/net/ipv4/netfilter/ip_conntrack_standalone.c 2004-08-30 02:24:47 +02:00
@@ -393,6 +393,13 @@
cleanup_inandlocalops:
nf_unregister_hook(&ip_conntrack_local_out_ops);
cleanup_inops:
+ /* Frag queues may hold fragments with skb->dst == NULL */
+ ip_ct_no_defrag = 1;
+ local_bh_disable();
+ br_write_lock(BR_NETPROTO_LOCK);
+ br_write_unlock(BR_NETPROTO_LOCK);
+ ipfrag_flush();
+ local_bh_enable();
nf_unregister_hook(&ip_conntrack_in_ops);
cleanup_proc:
proc_net_remove("ip_conntrack");
diff -Nru a/net/netsyms.c b/net/netsyms.c
--- a/net/netsyms.c 2004-08-30 02:24:47 +02:00
+++ b/net/netsyms.c 2004-08-30 02:24:47 +02:00
@@ -283,6 +283,7 @@
EXPORT_SYMBOL(inetdev_by_index);
EXPORT_SYMBOL(in_dev_finish_destroy);
EXPORT_SYMBOL(ip_defrag);
+EXPORT_SYMBOL(ipfrag_flush);
/* Route manipulation */
EXPORT_SYMBOL(ip_rt_ioctl);
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-30 0:50 ` Patrick McHardy
@ 2004-08-30 4:28 ` David S. Miller
0 siblings, 0 replies; 23+ messages in thread
From: David S. Miller @ 2004-08-30 4:28 UTC (permalink / raw)
To: Patrick McHardy
Cc: dlstevens, davem, laforge, netdev, netdev-bounce, netfilter-devel,
okir
On Mon, 30 Aug 2004 02:50:10 +0200
Patrick McHardy <kaber@trash.net> wrote:
> Great, here is the patch for 2.4 for the conntrack problem.
Applied, along with the 2.6.x variant plus the synchronize_net()
fix for 2.6.x
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-29 19:36 ` Patrick McHardy
2004-08-29 19:57 ` David S. Miller
@ 2004-08-29 21:48 ` Patrick McHardy
2004-08-30 7:57 ` Olaf Kirch
2 siblings, 0 replies; 23+ messages in thread
From: Patrick McHardy @ 2004-08-29 21:48 UTC (permalink / raw)
To: Patrick McHardy
Cc: David S. Miller, David Stevens, davem, laforge, netdev,
netdev-bounce, netfilter-devel, okir
[-- Attachment #1: Type: text/plain, Size: 1373 bytes --]
Patrick McHardy wrote:
> Attached. The first patch still crashed, we need to prevent new
> fragments from getting queued after the queue is flushed until the
> hook in unregistered.
The patch is racy, another CPU could already have passed the check
for ip_ct_no_defrag and queue the packet after __ip_evictor calculated
the work to do, so the packet (or another one) will not get evicted.
This patch on top calls synchronize_net() to prevent this.
>
>@@ -1181,6 +1183,12 @@
> #ifdef CONFIG_NETFILTER_DEBUG
> unsigned int olddebug = skb->nf_debug;
> #endif
>+
>+ if (unlikely(ip_ct_no_defrag)) {
>+ kfree_skb(skb);
>+ return NULL;
>+ }
>+
> if (sk) {
> sock_hold(sk);
> skb_orphan(skb);
>diff -Nru a/net/ipv4/netfilter/ip_conntrack_standalone.c b/net/ipv4/netfilter/ip_conntrack_standalone.c
>--- a/net/ipv4/netfilter/ip_conntrack_standalone.c 2004-08-29 20:55:13 +02:00
>+++ b/net/ipv4/netfilter/ip_conntrack_standalone.c 2004-08-29 20:55:13 +02:00
>@@ -805,6 +805,12 @@
> cleanup_defraglocalops:
> nf_unregister_hook(&ip_conntrack_defrag_local_out_ops);
> cleanup_defragops:
>+ /* Frag queues may hold fragments with skb->dst == NULL */
>+ ip_ct_no_defrag = 1;
>+ smp_wmb();
>+ local_bh_disable();
>+ ipfrag_flush();
>+ local_bh_enable();
> nf_unregister_hook(&ip_conntrack_defrag_ops);
> cleanup_proc_stat:
> proc_net_remove("ip_conntrack_stat");
>
>
[-- Attachment #2: x --]
[-- Type: text/plain, Size: 875 bytes --]
# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
# 2004/08/29 23:36:17+02:00 kaber@coreworks.de
# [NETFILTER]: Fix race when flushing fragment queue
#
# Signed-off-by: Patrick McHardy <kaber@trash.net>
#
# net/ipv4/netfilter/ip_conntrack_standalone.c
# 2004/08/29 23:35:54+02:00 kaber@coreworks.de +1 -1
# [NETFILTER]: Fix race when flushing fragment queue
#
diff -Nru a/net/ipv4/netfilter/ip_conntrack_standalone.c b/net/ipv4/netfilter/ip_conntrack_standalone.c
--- a/net/ipv4/netfilter/ip_conntrack_standalone.c 2004-08-29 23:39:07 +02:00
+++ b/net/ipv4/netfilter/ip_conntrack_standalone.c 2004-08-29 23:39:07 +02:00
@@ -807,7 +807,7 @@
cleanup_defragops:
/* Frag queues may hold fragments with skb->dst == NULL */
ip_ct_no_defrag = 1;
- smp_wmb();
+ synchronize_net();
local_bh_disable();
ipfrag_flush();
local_bh_enable();
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: [PATCH] Prevent crash on ip_conntrack removal
2004-08-29 19:36 ` Patrick McHardy
2004-08-29 19:57 ` David S. Miller
2004-08-29 21:48 ` Patrick McHardy
@ 2004-08-30 7:57 ` Olaf Kirch
2 siblings, 0 replies; 23+ messages in thread
From: Olaf Kirch @ 2004-08-30 7:57 UTC (permalink / raw)
To: Patrick McHardy
Cc: David S. Miller, David Stevens, davem, laforge, netdev,
netdev-bounce, netfilter-devel
On Sun, Aug 29, 2004 at 09:36:28PM +0200, Patrick McHardy wrote:
> David S. Miller wrote:
>
> >I think I agree with David now that I've read through this a few
> >times. Can someone send me a patch which does this?
> >
> Attached. The first patch still crashed, we need to prevent new
> fragments from getting queued after the queue is flushed until the
> hook in unregistered.
I was actually going to test-drive my patch today, but you were
faster. Thanks!
Olad
--
Olaf Kirch | The Hardware Gods hate me.
okir@suse.de |
---------------+
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2004-08-30 7:57 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-08-18 9:13 [PATCH] Prevent crash on ip_conntrack removal Olaf Kirch
2004-08-19 10:11 ` Harald Welte
2004-08-19 14:18 ` David S. Miller
2004-08-19 14:55 ` Patrick McHardy
2004-08-19 15:14 ` David S. Miller
2004-08-21 15:10 ` Patrick McHardy
2004-08-22 5:13 ` David S. Miller
2004-08-22 12:58 ` Patrick McHardy
2004-08-23 5:03 ` David S. Miller
2004-08-23 21:18 ` David Stevens
2004-08-24 0:45 ` Nivedita Singhvi
2004-08-24 0:45 ` Patrick McHardy
2004-08-24 21:28 ` David Stevens
2004-08-29 6:15 ` David S. Miller
2004-08-29 19:36 ` Patrick McHardy
2004-08-29 19:57 ` David S. Miller
2004-08-29 20:06 ` Patrick McHardy
2004-08-29 21:58 ` Patrick McHardy
2004-08-29 23:38 ` David S. Miller
2004-08-30 0:50 ` Patrick McHardy
2004-08-30 4:28 ` David S. Miller
2004-08-29 21:48 ` Patrick McHardy
2004-08-30 7:57 ` Olaf Kirch
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).