From: Olaf Kirch <okir@suse.de>
To: netdev@oss.sgi.com
Subject: [PATCH] arp_queue: serializing unlink + kfree_skb
Date: Mon, 31 Jan 2005 11:29:20 +0100 [thread overview]
Message-ID: <20050131102920.GC4170@suse.de> (raw)
[-- Attachment #1: Type: text/plain, Size: 2575 bytes --]
Hi,
I'm just looking into a problem that may be related to the
new neighbor discovery code in core/neighbour.c.
The problem was originally reported against our SLES kernel, but the
code in 2.6.11-rc2 is almost identical; code and patches shown below
are relative to 2.6.11-rc2
The problem is that IBM testing was hitting the assertion in kfree_skb
that checks that the skb has been removed from any list it was on
("kfree_skb passed an skb still on a list").
It seems the problem is a bad interaction between neigh_event_send and
neigh_timer_handler:
In neigh_event_send:
if (skb_queue_len(&neigh->arp_queue) >=
neigh->parms->queue_len) {
struct sk_buff *buff;
buff = neigh->arp_queue.next;
__skb_unlink(buff, &neigh->arp_queue);
kfree_skb(buff);
}
If the ARP queue overflows, we're trying to remove the first skb
from the list. While we do this, the neighbor is locked.
The corresponding piece of code in neigh_timer_handler is this:
struct sk_buff *skb = skb_peek(&neigh->arp_queue);
/* keep skb alive even if arp_queue overflows */
if (skb)
skb_get(skb);
write_unlock(&neigh->lock);
neigh->ops->solicit(neigh, skb);
atomic_inc(&neigh->probes);
if (skb)
kfree_skb(skb); /* <== this is where we BUG() */
The use of skb_get/kfree_skb makes sure the skb remains live while
we're calling solicit(). Note that the neighbor isn't locked when we
call kfree_skb.
Now what happens in the bug that was reported to us (this was on a
PPC SMP machine) is this, I think:
- On CPU 0, neigh_timer_handler grabs the first skb on the queue,
incrementing is reference count to 2,
unlocks the neighbor, calls solicit()
- On CPU 1, we enter neigh_event_send, find the ARP
queue is full and remove the first skb. The call to kfree_skb
will decrement the refcount to 1. This change is visible
on all CPUs immediately, because it's an atomic_dec.
The effects of __skb_unlink are not immediately visible
on the other CPUs however.
- On the first CPU, we return from solicit, call kfree_skb.
Refcount goes to zero, but __kfree_skb still sees the
skb->list pointer => barf.
One possible fix here would be to add an smp_wmb after the __skb_unlink
and a smp_rmb before the assertion in __kfree_skb, as in the attached
patch.
Does this make sense?
Olaf
--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
okir@suse.de | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax
[-- Attachment #2: arp-skb-unlink-serialize --]
[-- Type: text/plain, Size: 1103 bytes --]
Index: linux-2.6.10/net/core/neighbour.c
===================================================================
--- linux-2.6.10.orig/net/core/neighbour.c 2005-01-21 11:04:23.000000000 +0100
+++ linux-2.6.10/net/core/neighbour.c 2005-01-31 11:20:38.000000000 +0100
@@ -876,6 +876,9 @@ int __neigh_event_send(struct neighbour
struct sk_buff *buff;
buff = neigh->arp_queue.next;
__skb_unlink(buff, &neigh->arp_queue);
+ /* Make sure the change of skb->head is
+ * visible on all CPUs */
+ smp_wmb();
kfree_skb(buff);
}
__skb_queue_tail(&neigh->arp_queue, skb);
Index: linux-2.6.10/net/core/skbuff.c
===================================================================
--- linux-2.6.10.orig/net/core/skbuff.c 2005-01-21 11:04:15.000000000 +0100
+++ linux-2.6.10/net/core/skbuff.c 2005-01-31 11:20:52.000000000 +0100
@@ -275,6 +275,7 @@ void kfree_skbmem(struct sk_buff *skb)
void __kfree_skb(struct sk_buff *skb)
{
+ smp_rmb();
if (skb->list) {
printk(KERN_WARNING "Warning: kfree_skb passed an skb still "
"on a list (from %p).\n", NET_CALLER(skb));
next reply other threads:[~2005-01-31 10:29 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-01-31 10:29 Olaf Kirch [this message]
2005-01-31 11:33 ` [PATCH] arp_queue: serializing unlink + kfree_skb Herbert Xu
2005-02-03 0:20 ` David S. Miller
2005-02-03 11:12 ` Herbert Xu
2005-02-04 0:34 ` David S. Miller
2005-02-03 14:27 ` Anton Blanchard
2005-02-03 18:14 ` David S. Miller
2005-02-03 20:30 ` Herbert Xu
2005-02-03 22:19 ` David S. Miller
2005-02-03 23:50 ` Herbert Xu
2005-02-04 0:49 ` David S. Miller
2005-02-04 1:20 ` Herbert Xu
2005-02-04 1:23 ` David S. Miller
2005-02-04 1:55 ` Herbert Xu
2005-02-04 11:16 ` Olaf Kirch
2005-02-06 1:14 ` David S. Miller
2005-02-03 23:08 ` David S. Miller
2005-02-04 11:33 ` Herbert Xu
2005-02-04 23:48 ` David S. Miller
2005-02-05 6:24 ` David S. Miller
2005-02-05 6:50 ` Herbert Xu
2005-02-10 4:23 ` Werner Almesberger
2005-02-10 4:56 ` Herbert Xu
2005-02-11 3:46 ` David S. Miller
2005-02-11 3:50 ` David S. Miller
2005-02-11 4:27 ` Werner Almesberger
2005-02-11 5:04 ` Dmitry Torokhov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20050131102920.GC4170@suse.de \
--to=okir@suse.de \
--cc=netdev@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.