From: Olaf Kirch <okir@suse.de>
To: netdev@oss.sgi.com
Subject: [PATCH] arp_queue: serializing unlink + kfree_skb
Date: Mon, 31 Jan 2005 11:29:20 +0100 [thread overview]
Message-ID: <20050131102920.GC4170@suse.de> (raw)
[-- Attachment #1: Type: text/plain, Size: 2575 bytes --]
Hi,
I'm just looking into a problem that may be related to the
new neighbor discovery code in core/neighbour.c.
The problem was originally reported against our SLES kernel, but the
code in 2.6.11-rc2 is almost identical; code and patches shown below
are relative to 2.6.11-rc2
The problem is that IBM testing was hitting the assertion in kfree_skb
that checks that the skb has been removed from any list it was on
("kfree_skb passed an skb still on a list").
It seems the problem is a bad interaction between neigh_event_send and
neigh_timer_handler:
In neigh_event_send:
if (skb_queue_len(&neigh->arp_queue) >=
neigh->parms->queue_len) {
struct sk_buff *buff;
buff = neigh->arp_queue.next;
__skb_unlink(buff, &neigh->arp_queue);
kfree_skb(buff);
}
If the ARP queue overflows, we're trying to remove the first skb
from the list. While we do this, the neighbor is locked.
The corresponding piece of code in neigh_timer_handler is this:
struct sk_buff *skb = skb_peek(&neigh->arp_queue);
/* keep skb alive even if arp_queue overflows */
if (skb)
skb_get(skb);
write_unlock(&neigh->lock);
neigh->ops->solicit(neigh, skb);
atomic_inc(&neigh->probes);
if (skb)
kfree_skb(skb); /* <== this is where we BUG() */
The use of skb_get/kfree_skb makes sure the skb remains live while
we're calling solicit(). Note that the neighbor isn't locked when we
call kfree_skb.
Now what happens in the bug that was reported to us (this was on a
PPC SMP machine) is this, I think:
- On CPU 0, neigh_timer_handler grabs the first skb on the queue,
incrementing is reference count to 2,
unlocks the neighbor, calls solicit()
- On CPU 1, we enter neigh_event_send, find the ARP
queue is full and remove the first skb. The call to kfree_skb
will decrement the refcount to 1. This change is visible
on all CPUs immediately, because it's an atomic_dec.
The effects of __skb_unlink are not immediately visible
on the other CPUs however.
- On the first CPU, we return from solicit, call kfree_skb.
Refcount goes to zero, but __kfree_skb still sees the
skb->list pointer => barf.
One possible fix here would be to add an smp_wmb after the __skb_unlink
and a smp_rmb before the assertion in __kfree_skb, as in the attached
patch.
Does this make sense?
Olaf
--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
okir@suse.de | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax
[-- Attachment #2: arp-skb-unlink-serialize --]
[-- Type: text/plain, Size: 1103 bytes --]
Index: linux-2.6.10/net/core/neighbour.c
===================================================================
--- linux-2.6.10.orig/net/core/neighbour.c 2005-01-21 11:04:23.000000000 +0100
+++ linux-2.6.10/net/core/neighbour.c 2005-01-31 11:20:38.000000000 +0100
@@ -876,6 +876,9 @@ int __neigh_event_send(struct neighbour
struct sk_buff *buff;
buff = neigh->arp_queue.next;
__skb_unlink(buff, &neigh->arp_queue);
+ /* Make sure the change of skb->head is
+ * visible on all CPUs */
+ smp_wmb();
kfree_skb(buff);
}
__skb_queue_tail(&neigh->arp_queue, skb);
Index: linux-2.6.10/net/core/skbuff.c
===================================================================
--- linux-2.6.10.orig/net/core/skbuff.c 2005-01-21 11:04:15.000000000 +0100
+++ linux-2.6.10/net/core/skbuff.c 2005-01-31 11:20:52.000000000 +0100
@@ -275,6 +275,7 @@ void kfree_skbmem(struct sk_buff *skb)
void __kfree_skb(struct sk_buff *skb)
{
+ smp_rmb();
if (skb->list) {
printk(KERN_WARNING "Warning: kfree_skb passed an skb still "
"on a list (from %p).\n", NET_CALLER(skb));
next reply other threads:[~2005-01-31 10:29 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-01-31 10:29 Olaf Kirch [this message]
2005-01-31 11:33 ` [PATCH] arp_queue: serializing unlink + kfree_skb Herbert Xu
2005-02-03 0:20 ` David S. Miller
2005-02-03 11:12 ` Herbert Xu
2005-02-04 0:34 ` David S. Miller
2005-02-03 14:27 ` Anton Blanchard
2005-02-03 18:14 ` David S. Miller
2005-02-03 20:30 ` Herbert Xu
2005-02-03 22:19 ` David S. Miller
2005-02-03 23:50 ` Herbert Xu
2005-02-04 0:49 ` David S. Miller
2005-02-04 1:20 ` Herbert Xu
2005-02-04 1:23 ` David S. Miller
2005-02-04 1:55 ` Herbert Xu
2005-02-04 11:16 ` Olaf Kirch
2005-02-06 1:14 ` David S. Miller
2005-02-03 23:08 ` David S. Miller
2005-02-04 11:33 ` Herbert Xu
2005-02-04 23:48 ` David S. Miller
2005-02-05 6:24 ` David S. Miller
2005-02-05 6:50 ` Herbert Xu
2005-02-10 4:23 ` Werner Almesberger
2005-02-10 4:56 ` Herbert Xu
2005-02-11 3:46 ` David S. Miller
2005-02-11 3:50 ` David S. Miller
2005-02-11 4:27 ` Werner Almesberger
2005-02-11 5:04 ` Dmitry Torokhov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20050131102920.GC4170@suse.de \
--to=okir@suse.de \
--cc=netdev@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).