* [patch 2/16] list_head debugging
@ 2002-06-01 8:40 Andrew Morton
2002-06-01 17:19 ` Arnaldo Carvalho de Melo
2002-06-03 13:55 ` Jan Harkes
0 siblings, 2 replies; 10+ messages in thread
From: Andrew Morton @ 2002-06-01 8:40 UTC (permalink / raw)
To: Linus Torvalds; +Cc: lkml
A common and very subtle bug is to use list_heads which aren't on any
lists. It causes kernel memory corruption which is observed long after
the offending code has executed.
The patch nulls out the dangling pointers so we get a nice oops at the
site of the buggy code.
=====================================
--- 2.5.19/include/linux/list.h~list-debug Sat Jun 1 01:18:05 2002
+++ 2.5.19-akpm/include/linux/list.h Sat Jun 1 01:18:05 2002
@@ -94,6 +94,11 @@ static __inline__ void __list_del(struct
static __inline__ void list_del(struct list_head *entry)
{
__list_del(entry->prev, entry->next);
+ /*
+ * This is debug. Remove it when the kernel has no bugs ;)
+ */
+ entry->next = 0;
+ entry->prev = 0;
}
/**
-
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch 2/16] list_head debugging
2002-06-01 8:40 Andrew Morton
@ 2002-06-01 17:19 ` Arnaldo Carvalho de Melo
2002-06-03 13:55 ` Jan Harkes
1 sibling, 0 replies; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2002-06-01 17:19 UTC (permalink / raw)
To: Andrew Morton; +Cc: Linus Torvalds, lkml
Em Sat, Jun 01, 2002 at 01:40:03AM -0700, Andrew Morton escreveu:
> A common and very subtle bug is to use list_heads which aren't on any
> lists. It causes kernel memory corruption which is observed long after
> the offending code has executed.
> The patch nulls out the dangling pointers so we get a nice oops at the
> site of the buggy code.
> =====================================
>
> --- 2.5.19/include/linux/list.h~list-debug Sat Jun 1 01:18:05 2002
> +++ 2.5.19-akpm/include/linux/list.h Sat Jun 1 01:18:05 2002
> @@ -94,6 +94,11 @@ static __inline__ void __list_del(struct
> static __inline__ void list_del(struct list_head *entry)
> {
> __list_del(entry->prev, entry->next);
#ifdef CONFIG_DEBUG_LIST_DEL_NULLIFY
> + entry->next = 0;
> + entry->prev = 0;
#endif
> }
8) And get this configured in the Debug section of make *config
The kernel will always have bugs ;)
- Arnaldo
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch 2/16] list_head debugging
2002-06-01 8:40 Andrew Morton
2002-06-01 17:19 ` Arnaldo Carvalho de Melo
@ 2002-06-03 13:55 ` Jan Harkes
2002-06-03 20:16 ` Andrew Morton
2002-06-03 20:41 ` Rik van Riel
1 sibling, 2 replies; 10+ messages in thread
From: Jan Harkes @ 2002-06-03 13:55 UTC (permalink / raw)
To: lkml
On Sat, Jun 01, 2002 at 01:40:03AM -0700, Andrew Morton wrote:
> The patch nulls out the dangling pointers so we get a nice oops at the
> site of the buggy code.
...
> static __inline__ void list_del(struct list_head *entry)
> {
> __list_del(entry->prev, entry->next);
> + /*
> + * This is debug. Remove it when the kernel has no bugs ;)
> + */
> + entry->next = 0;
> + entry->prev = 0;
> }
We've had this before, and it breaks some code that removes items from
lists as follows,
list_for_each(p, list)
if (condition)
list_del(p);
These would have to either use __list_del, or need to do,
for(p = list.next; p != &list;) {
struct list_head *n = p->next;
if (condition)
list_del(p);
p = n;
}
I'm not sure how many places are using the list_for_each/list_del
construction, but there were a couple when this was in the tree
previously. Converting most places that use list_del to use
list_del_init should fix the same bugs, but not cause problems for
existing code.
Just did a grep for list_del and didn't find any obvious places where we
are doing the above construction, except for drivers/isdn/capi/capilib.c
and maybe drivers/hotplug/pcihp_skeleton.c, but it could be hidden in
many more places by a macro or function call (or a larger loop than my 3
line context was showing). But there are not that many places where
we're calling list_del while not immediately re-initializing or adding
the unlinked list_head to another list.
You could probably also add list_move
list_move(entry, head)
{
if (!list_empty(entry))
__list_del(entry->prev, entry->next);
list_add(entry, head);
}
And just delete list_del completely, because all existing places where
list_del is currently used should probably use either list_del_init, or
list_move.
Jan
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch 2/16] list_head debugging
2002-06-03 13:55 ` Jan Harkes
@ 2002-06-03 20:16 ` Andrew Morton
2002-06-03 20:41 ` Rik van Riel
1 sibling, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2002-06-03 20:16 UTC (permalink / raw)
To: Jan Harkes; +Cc: lkml
Jan Harkes wrote:
>
> On Sat, Jun 01, 2002 at 01:40:03AM -0700, Andrew Morton wrote:
> > The patch nulls out the dangling pointers so we get a nice oops at the
> > site of the buggy code.
> ...
> > static __inline__ void list_del(struct list_head *entry)
> > {
> > __list_del(entry->prev, entry->next);
> > + /*
> > + * This is debug. Remove it when the kernel has no bugs ;)
> > + */
> > + entry->next = 0;
> > + entry->prev = 0;
> > }
>
> We've had this before, and it breaks some code that removes items from
> lists as follows,
>
> list_for_each(p, list)
> if (condition)
> list_del(p);
hmm. I suppose that's sane.
> These would have to either use __list_del, or need to do,
>
> for(p = list.next; p != &list;) {
> struct list_head *n = p->next;
> if (condition)
> list_del(p);
> p = n;
> }
list_for_each_safe() does this.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch 2/16] list_head debugging
2002-06-03 13:55 ` Jan Harkes
2002-06-03 20:16 ` Andrew Morton
@ 2002-06-03 20:41 ` Rik van Riel
2002-06-10 16:36 ` Jan Harkes
1 sibling, 1 reply; 10+ messages in thread
From: Rik van Riel @ 2002-06-03 20:41 UTC (permalink / raw)
To: Jan Harkes; +Cc: lkml
On Mon, 3 Jun 2002, Jan Harkes wrote:
> > static __inline__ void list_del(struct list_head *entry)
> > {
> > __list_del(entry->prev, entry->next);
> > + /*
> > + * This is debug. Remove it when the kernel has no bugs ;)
> > + */
> > + entry->next = 0;
> > + entry->prev = 0;
> > }
>
> We've had this before, and it breaks some code that removes items from
> lists as follows,
Such code is probably not SMP safe anyway.
regards,
Rik
--
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch 2/16] list_head debugging
@ 2002-06-07 14:17 Bernd Jendrissek
2002-06-07 18:30 ` Andrew Morton
0 siblings, 1 reply; 10+ messages in thread
From: Bernd Jendrissek @ 2002-06-07 14:17 UTC (permalink / raw)
To: linux-kernel; +Cc: akpm
[sorry for the nonexistent In-Reply-To/whatever headers - cutting&pasting]
Andrew Morton wrote:
> A common and very subtle bug is to use list_heads which aren't on any
> lists. It causes kernel memory corruption which is observed long after
> the offending code has executed.
>
> The patch nulls out the dangling pointers so we get a nice oops at the
> site of the buggy code.
I'm not current with the kernel tree, but will one such oops occur in
netfilter? See
http://lists.samba.org/pipermail/netfilter-announce/2002/000010.html
Hmm, no. A DoS maybe?
> --- 2.5.19/include/linux/list.h~list-debug Sat Jun 1 01:18:05 2002
> +++ 2.5.19-akpm/include/linux/list.h Sat Jun 1 01:18:05 2002
> @@ -94,6 +94,11 @@ static __inline__ void __list_del(struct
> static __inline__ void list_del(struct list_head *entry)
> {
> __list_del(entry->prev, entry->next);
> + /*
> + * This is debug. Remove it when the kernel has no bugs ;)
> + */
> + entry->next = 0;
> + entry->prev = 0;
> }
>
> /**
Bernd Jendrissek
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch 2/16] list_head debugging
2002-06-07 14:17 [patch 2/16] list_head debugging Bernd Jendrissek
@ 2002-06-07 18:30 ` Andrew Morton
2002-06-14 12:07 ` Jozsef Kadlecsik
0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2002-06-07 18:30 UTC (permalink / raw)
To: Bernd Jendrissek; +Cc: linux-kernel, netfilter
Bernd Jendrissek wrote:
> [sorry for the nonexistent In-Reply-To/whatever headers - cutting&pasting]
>
> Andrew Morton wrote:
>
>> A common and very subtle bug is to use list_heads which aren't on any
>> lists. It causes kernel memory corruption which is observed long after
>> the offending code has executed.
>>
>> The patch nulls out the dangling pointers so we get a nice oops at the
>> site of the buggy code.
>
>
> I'm not current with the kernel tree, but will one such oops occur in
> netfilter? See
>
> http://lists.samba.org/pipermail/netfilter-announce/2002/000010.html
>
> Hmm, no. A DoS maybe?
>
An oops, actually. This code:
/* Remove from both hash lists: must not NULL out next ptrs,
otherwise we'll look unconfirmed. Fortunately, LIST_DELETE
doesn't do this. --RR */
LIST_DELETE(&ip_conntrack_hash
[hash_conntrack(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple)],
&ct->tuplehash[IP_CT_DIR_ORIGINAL]);
LIST_DELETE(&ip_conntrack_hash
[hash_conntrack(&ct->tuplehash[IP_CT_DIR_REPLY].tuple)],
&ct->tuplehash[IP_CT_DIR_REPLY]);
I think what is needed is:
--- 2.5.20/net/ipv4/netfilter/ip_conntrack_core.c~ipconntrack-lists Fri Jun 7 11:26:38 2002
+++ 2.5.20-akpm/net/ipv4/netfilter/ip_conntrack_core.c Fri Jun 7 11:26:42 2002
@@ -210,17 +210,22 @@ static void destroy_expectations(struct
static void
clean_from_lists(struct ip_conntrack *ct)
{
+
struct list_head *l1;
+
struct list_head *l2;
+
DEBUGP("clean_from_lists(%p)\n", ct);
MUST_BE_WRITE_LOCKED(&ip_conntrack_lock);
-
/* Remove from both hash lists: must not NULL out next ptrs,
- otherwise we'll look unconfirmed. Fortunately, LIST_DELETE
- doesn't do this. --RR */
+
+
l1 = &ct->tuplehash[IP_CT_DIR_ORIGINAL];
+
l2 = &ct->tuplehash[IP_CT_DIR_REPLY];
+
LIST_DELETE(&ip_conntrack_hash
[hash_conntrack(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple)],
-
&ct->tuplehash[IP_CT_DIR_ORIGINAL]);
-
LIST_DELETE(&ip_conntrack_hash
-
[hash_conntrack(&ct->tuplehash[IP_CT_DIR_REPLY].tuple)],
-
&ct->tuplehash[IP_CT_DIR_REPLY]);
+
l1);
+
if (l1 != l2)
+
LIST_DELETE(&ip_conntrack_hash
+
[hash_conntrack(&ct->tuplehash[IP_CT_DIR_REPLY].tuple)],
+
l2);
/* Destroy all un-established, pending expectations */
destroy_expectations(ct);
-
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch 2/16] list_head debugging
2002-06-03 20:41 ` Rik van Riel
@ 2002-06-10 16:36 ` Jan Harkes
2002-06-14 9:22 ` Rik van Riel
0 siblings, 1 reply; 10+ messages in thread
From: Jan Harkes @ 2002-06-10 16:36 UTC (permalink / raw)
To: Rik van Riel; +Cc: linux-kernel
On Mon, Jun 03, 2002 at 05:41:39PM -0300, Rik van Riel wrote:
> > We've had this before, and it breaks some code that removes items from
> > lists as follows,
>
> Such code is probably not SMP safe anyway.
Where are you coming from with that comment?
down(&semaphore);
list_for_each(p, list)
if (condition)
list_del(p);
up(&semaphore);
Should be completely SMP safe, or we have more serious problems than I
even imagined. And it worked just fine before the 'zero pointers in
list_del'-patch was included and is used in _at least_ two places in the
existing kernel, probably more.
And list_for_each_safe might work when list_del is zero-ing pointers,
but imho has an ugly interface, it still doesn't fix SMP issues. You
still need to prevent concurrent modifications, otherwise someone could
just as well remove the curr->next object and you corrupt/crash on
dereferencing the saved next pointer.
In fact this saved next pointer will far more likely to lead to obscure
and hard to debug crashes compared to leaving prev/next as they are in
the existing list_del function, just because people will think it is a
'safe' list iterator and forget about correctly locking their lists down.
Jan
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch 2/16] list_head debugging
2002-06-10 16:36 ` Jan Harkes
@ 2002-06-14 9:22 ` Rik van Riel
0 siblings, 0 replies; 10+ messages in thread
From: Rik van Riel @ 2002-06-14 9:22 UTC (permalink / raw)
To: Jan Harkes; +Cc: linux-kernel
On Mon, 10 Jun 2002, Jan Harkes wrote:
> On Mon, Jun 03, 2002 at 05:41:39PM -0300, Rik van Riel wrote:
> > > We've had this before, and it breaks some code that removes items from
> > > lists as follows,
> >
> > Such code is probably not SMP safe anyway.
>
> Where are you coming from with that comment?
>
> down(&semaphore);
>
> list_for_each(p, list)
> if (condition)
> list_del(p);
>
> up(&semaphore);
>
> Should be completely SMP safe,
Not if 'p' comes from the slab cache.
In that case 'p' can be re-allocated on another CPU
before we dereference ->next ...
regards,
Rik
--
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch 2/16] list_head debugging
2002-06-07 18:30 ` Andrew Morton
@ 2002-06-14 12:07 ` Jozsef Kadlecsik
0 siblings, 0 replies; 10+ messages in thread
From: Jozsef Kadlecsik @ 2002-06-14 12:07 UTC (permalink / raw)
To: Andrew Morton; +Cc: Bernd Jendrissek, linux-kernel, netfilter
Hi,
On Fri, 7 Jun 2002, Andrew Morton wrote:
> Bernd Jendrissek wrote:
> > [sorry for the nonexistent In-Reply-To/whatever headers - cutting&pasting]
> >
> > Andrew Morton wrote:
> >
> >> A common and very subtle bug is to use list_heads which aren't on any
> >> lists. It causes kernel memory corruption which is observed long after
> >> the offending code has executed.
> >>
> >> The patch nulls out the dangling pointers so we get a nice oops at the
> >> site of the buggy code.
> >
> >
> > I'm not current with the kernel tree, but will one such oops occur in
> > netfilter? See
> >
> > http://lists.samba.org/pipermail/netfilter-announce/2002/000010.html
> >
> > Hmm, no. A DoS maybe?
> >
>
> An oops, actually. This code:
>
>
> /* Remove from both hash lists: must not NULL out next ptrs,
> otherwise we'll look unconfirmed. Fortunately, LIST_DELETE
> doesn't do this. --RR */
> LIST_DELETE(&ip_conntrack_hash
> [hash_conntrack(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple)],
> &ct->tuplehash[IP_CT_DIR_ORIGINAL]);
> LIST_DELETE(&ip_conntrack_hash
> [hash_conntrack(&ct->tuplehash[IP_CT_DIR_REPLY].tuple)],
> &ct->tuplehash[IP_CT_DIR_REPLY]);
>
>
> I think what is needed is:
>
> --- 2.5.20/net/ipv4/netfilter/ip_conntrack_core.c~ipconntrack-lists Fri Jun 7 11:26:38 2002
> +++ 2.5.20-akpm/net/ipv4/netfilter/ip_conntrack_core.c Fri Jun 7 11:26:42 2002
> @@ -210,17 +210,22 @@ static void destroy_expectations(struct
> static void
> clean_from_lists(struct ip_conntrack *ct)
> {
> +
> struct list_head *l1;
> +
> struct list_head *l2;
> +
> DEBUGP("clean_from_lists(%p)\n", ct);
> MUST_BE_WRITE_LOCKED(&ip_conntrack_lock);
> -
> /* Remove from both hash lists: must not NULL out next ptrs,
> - otherwise we'll look unconfirmed. Fortunately, LIST_DELETE
> - doesn't do this. --RR */
> +
> +
> l1 = &ct->tuplehash[IP_CT_DIR_ORIGINAL];
> +
> l2 = &ct->tuplehash[IP_CT_DIR_REPLY];
> +
> LIST_DELETE(&ip_conntrack_hash
> [hash_conntrack(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple)],
> -
> &ct->tuplehash[IP_CT_DIR_ORIGINAL]);
> -
> LIST_DELETE(&ip_conntrack_hash
> -
> [hash_conntrack(&ct->tuplehash[IP_CT_DIR_REPLY].tuple)],
> -
> &ct->tuplehash[IP_CT_DIR_REPLY]);
> +
> l1);
> +
> if (l1 != l2)
> +
> LIST_DELETE(&ip_conntrack_hash
> +
> [hash_conntrack(&ct->tuplehash[IP_CT_DIR_REPLY].tuple)],
> +
> l2);
>
> /* Destroy all un-established, pending expectations */
> destroy_expectations(ct);
>
The two codes actually identical, because the condition is always true.
There is no connection where the ORIGINAL and REPLY tuples would be equal.
Regards,
Jozsef
-
E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu
WWW-Home: http://www.kfki.hu/~kadlec
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2002-06-14 12:07 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-06-07 14:17 [patch 2/16] list_head debugging Bernd Jendrissek
2002-06-07 18:30 ` Andrew Morton
2002-06-14 12:07 ` Jozsef Kadlecsik
-- strict thread matches above, loose matches on Subject: below --
2002-06-01 8:40 Andrew Morton
2002-06-01 17:19 ` Arnaldo Carvalho de Melo
2002-06-03 13:55 ` Jan Harkes
2002-06-03 20:16 ` Andrew Morton
2002-06-03 20:41 ` Rik van Riel
2002-06-10 16:36 ` Jan Harkes
2002-06-14 9:22 ` Rik van Riel
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox