* Re: [PATCH] [IPV4] route: fix locking in rt_run_flush()
2008-01-21 15:08 [PATCH] [IPV4] route: fix locking in rt_run_flush() Joonwoo Park
@ 2008-01-21 10:40 ` David Miller
2008-01-21 21:14 ` Eric Dumazet
2008-01-23 7:43 ` joonwpark81
0 siblings, 2 replies; 6+ messages in thread
From: David Miller @ 2008-01-21 10:40 UTC (permalink / raw)
To: joonwpark81; +Cc: netdev, dada1
From: Joonwoo Park <joonwpark81@gmail.com>
Date: Tue, 22 Jan 2008 00:08:57 +0900
> The rt_run_flush() can be stucked if it was called while netdev is on the
> high load.
> It's possible when pushing rtable to rt_hash is faster than pulling
> from it.
>
> The commands 'ifconfig up or ifconfig mtu' and netif_carrier_on() can
> introduce soft lockup like this:
>
> [ 363.528001] BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9]
> [ 363.531492]
> [ 363.535027] Pid: 9, comm: events/0 Not tainted (2.6.24-rc8 #14)
> [ 363.538837] EIP: 0060:[<c4086a39>] EFLAGS: 00000286 CPU: 0
> [ 363.542762] EIP is at kfree+0xa9/0xf0
> ...
> [ 363.660815] [<c42fb0fd>] skb_release_data+0x5d/0x90
> [ 363.666989] [<c42fb7dc>] skb_release_all+0x5c/0xd0
> [ 363.673207] [<c42faf8b>] __kfree_skb+0xb/0x90
> [ 363.679474] [<c42fb029>] kfree_skb+0x19/0x40
> [ 363.685811] [<c4322d87>] ip_rcv+0x27/0x290
> [ 363.692223] [<c4300ae5>] netif_receive_skb+0x255/0x320
> [ 363.698759] [<f88465aa>] e1000_clean_rx_irq+0x14a/0x4f0 [e1000]
> [ 363.705456] [<f88437c2>] e1000_clean+0x62/0x270 [e1000]
> [ 363.712217] [<c43031ee>] net_rx_action+0x16e/0x220
> [ 363.719065] [<c40346d7>] __do_softirq+0x87/0x100
> [ 363.726001] [<c40347a7>] do_softirq+0x57/0x60
> [ 363.732979] [<c4034b4e>] local_bh_enable_ip+0xae/0x100
> [ 363.740094] [<c43e73f5>] _spin_unlock_bh+0x25/0x30
> [ 363.747283] [<c431ec88>] rt_run_flush+0xc8/0xe0
> [ 363.754566] [<c4320c76>] rt_cache_flush+0xd6/0xe0
> [ 363.761917] [<c4350269>] fib_netdev_event+0x89/0xa0
> [ 363.769361] [<c4047d67>] notifier_call_chain+0x37/0x80
> ...
>
> This patch makes rt_run_flush() to run with softirq is disabled.
>
> Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
I agree with the analysis of the problem, however not the solution.
This will absolutely kill software interrupt latency.
In fact, we have moved much of the flush work into a workqueue in
net-2.6.25 because of how important that is
We need to find some other way to solve this.
Eric, any ideas?
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] [IPV4] route: fix locking in rt_run_flush()
@ 2008-01-21 15:08 Joonwoo Park
2008-01-21 10:40 ` David Miller
0 siblings, 1 reply; 6+ messages in thread
From: Joonwoo Park @ 2008-01-21 15:08 UTC (permalink / raw)
To: David Miller; +Cc: Joonwoo Park, linux-netdev
The rt_run_flush() can be stucked if it was called while netdev is on the
high load.
It's possible when pushing rtable to rt_hash is faster than pulling
from it.
The commands 'ifconfig up or ifconfig mtu' and netif_carrier_on() can
introduce soft lockup like this:
[ 363.528001] BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9]
[ 363.531492]
[ 363.535027] Pid: 9, comm: events/0 Not tainted (2.6.24-rc8 #14)
[ 363.538837] EIP: 0060:[<c4086a39>] EFLAGS: 00000286 CPU: 0
[ 363.542762] EIP is at kfree+0xa9/0xf0
...
[ 363.660815] [<c42fb0fd>] skb_release_data+0x5d/0x90
[ 363.666989] [<c42fb7dc>] skb_release_all+0x5c/0xd0
[ 363.673207] [<c42faf8b>] __kfree_skb+0xb/0x90
[ 363.679474] [<c42fb029>] kfree_skb+0x19/0x40
[ 363.685811] [<c4322d87>] ip_rcv+0x27/0x290
[ 363.692223] [<c4300ae5>] netif_receive_skb+0x255/0x320
[ 363.698759] [<f88465aa>] e1000_clean_rx_irq+0x14a/0x4f0 [e1000]
[ 363.705456] [<f88437c2>] e1000_clean+0x62/0x270 [e1000]
[ 363.712217] [<c43031ee>] net_rx_action+0x16e/0x220
[ 363.719065] [<c40346d7>] __do_softirq+0x87/0x100
[ 363.726001] [<c40347a7>] do_softirq+0x57/0x60
[ 363.732979] [<c4034b4e>] local_bh_enable_ip+0xae/0x100
[ 363.740094] [<c43e73f5>] _spin_unlock_bh+0x25/0x30
[ 363.747283] [<c431ec88>] rt_run_flush+0xc8/0xe0
[ 363.754566] [<c4320c76>] rt_cache_flush+0xd6/0xe0
[ 363.761917] [<c4350269>] fib_netdev_event+0x89/0xa0
[ 363.769361] [<c4047d67>] notifier_call_chain+0x37/0x80
...
This patch makes rt_run_flush() to run with softirq is disabled.
Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
---
net/ipv4/route.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 28484f3..454e71c 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -620,18 +620,20 @@ static void rt_run_flush(unsigned long dummy)
get_random_bytes(&rt_hash_rnd, 4);
+ local_bh_disable();
for (i = rt_hash_mask; i >= 0; i--) {
- spin_lock_bh(rt_hash_lock_addr(i));
+ spin_lock(rt_hash_lock_addr(i));
rth = rt_hash_table[i].chain;
if (rth)
rt_hash_table[i].chain = NULL;
- spin_unlock_bh(rt_hash_lock_addr(i));
+ spin_unlock(rt_hash_lock_addr(i));
for (; rth; rth = next) {
next = rth->u.dst.rt_next;
rt_free(rth);
}
}
+ local_bh_enable();
}
static DEFINE_SPINLOCK(rt_flush_lock);
--
1.5.3.rc5
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] [IPV4] route: fix locking in rt_run_flush()
2008-01-21 10:40 ` David Miller
@ 2008-01-21 21:14 ` Eric Dumazet
2008-01-23 7:43 ` joonwpark81
1 sibling, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2008-01-21 21:14 UTC (permalink / raw)
To: David Miller; +Cc: joonwpark81, netdev
David Miller a écrit :
> From: Joonwoo Park <joonwpark81@gmail.com>
> Date: Tue, 22 Jan 2008 00:08:57 +0900
>
>> The rt_run_flush() can be stucked if it was called while netdev is on the
>> high load.
>> It's possible when pushing rtable to rt_hash is faster than pulling
>> from it.
>>
>> The commands 'ifconfig up or ifconfig mtu' and netif_carrier_on() can
>> introduce soft lockup like this:
>>
>> [ 363.528001] BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9]
>> [ 363.531492]
>> [ 363.535027] Pid: 9, comm: events/0 Not tainted (2.6.24-rc8 #14)
>> [ 363.538837] EIP: 0060:[<c4086a39>] EFLAGS: 00000286 CPU: 0
>> [ 363.542762] EIP is at kfree+0xa9/0xf0
>> ...
>> [ 363.660815] [<c42fb0fd>] skb_release_data+0x5d/0x90
>> [ 363.666989] [<c42fb7dc>] skb_release_all+0x5c/0xd0
>> [ 363.673207] [<c42faf8b>] __kfree_skb+0xb/0x90
>> [ 363.679474] [<c42fb029>] kfree_skb+0x19/0x40
>> [ 363.685811] [<c4322d87>] ip_rcv+0x27/0x290
>> [ 363.692223] [<c4300ae5>] netif_receive_skb+0x255/0x320
>> [ 363.698759] [<f88465aa>] e1000_clean_rx_irq+0x14a/0x4f0 [e1000]
>> [ 363.705456] [<f88437c2>] e1000_clean+0x62/0x270 [e1000]
>> [ 363.712217] [<c43031ee>] net_rx_action+0x16e/0x220
>> [ 363.719065] [<c40346d7>] __do_softirq+0x87/0x100
>> [ 363.726001] [<c40347a7>] do_softirq+0x57/0x60
>> [ 363.732979] [<c4034b4e>] local_bh_enable_ip+0xae/0x100
>> [ 363.740094] [<c43e73f5>] _spin_unlock_bh+0x25/0x30
>> [ 363.747283] [<c431ec88>] rt_run_flush+0xc8/0xe0
>> [ 363.754566] [<c4320c76>] rt_cache_flush+0xd6/0xe0
>> [ 363.761917] [<c4350269>] fib_netdev_event+0x89/0xa0
>> [ 363.769361] [<c4047d67>] notifier_call_chain+0x37/0x80
>> ...
>>
>> This patch makes rt_run_flush() to run with softirq is disabled.
>>
>> Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
>
> I agree with the analysis of the problem, however not the solution.
>
> This will absolutely kill software interrupt latency.
>
> In fact, we have moved much of the flush work into a workqueue in
> net-2.6.25 because of how important that is
>
> We need to find some other way to solve this.
>
> Eric, any ideas?
Hum... 2.6.25 is certainly better in this aspect, but I remember we left
something to finish :)
We currently can have a worker doing the automatic flush every 600 seconds,
and another task doing a rt_cache_flush(...)
On very loaded machines (DDOS), routes might be added faster than deleted.
Also, each change in routes must invalidate rtcache, and/but full scan of this
cache is way too expensive (huge amount of MBytes must me read/written)
One possibility is to use a genid marker so that each entry can be thrown away
if its genid is different than the global one.
rt_cache_flush(-1) or rt_secret_build() would just have to increment the
global genid.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] [IPV4] route: fix locking in rt_run_flush()
2008-01-21 10:40 ` David Miller
2008-01-21 21:14 ` Eric Dumazet
@ 2008-01-23 7:43 ` joonwpark81
2008-01-23 20:44 ` Eric Dumazet
1 sibling, 1 reply; 6+ messages in thread
From: joonwpark81 @ 2008-01-23 7:43 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Eric Dumazet
On Mon, Jan 21, 2008 at 02:40:43AM -0800, David Miller wrote:
> From: Joonwoo Park <joonwpark81@gmail.com>
> Date: Tue, 22 Jan 2008 00:08:57 +0900
>
> > The rt_run_flush() can be stucked if it was called while netdev is on the
> > high load.
> > It's possible when pushing rtable to rt_hash is faster than pulling
> > from it.
> >
> > Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
>
> I agree with the analysis of the problem, however not the solution.
>
> This will absolutely kill software interrupt latency.
>
> In fact, we have moved much of the flush work into a workqueue in
> net-2.6.25 because of how important that is
>
> We need to find some other way to solve this.
>
Dave, Eric,
Thanks so much for comments.
I did stress tests and I found that the real problem was not consumer & supplier
issue.
It was the problem for me to innumerable enabling & disabling the softirq.
But I'm still thinking need of considering issue 'faster caching than flush'. :)
ifconfig up on heavy loaded interface.
Before patching:
time ifconfig eth1 up
BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9]
...
After patching:
time ifconfig eth1 up
real 0m0.007s
user 0m0.000s
sys 0m0.004s
Thanks!
Joonwoo
>From 87c29506de967e811ad5b57cd2e1a002134e878f Mon Sep 17 00:00:00 2001
From: Joonwoo Park <joonwpark81@gmail.com>
Date: Wed, 23 Jan 2008 15:16:54 +0900
Subject: [PATCH] [IPV4] route: reduce locking/unlocking in rt_run_flush
The rt_run_flush does spin_lock_bh/spin_unlock_bh for rt_hash_mask + 1
times.
The rt_hash_mask takes from 32767 to 65535, so it's big overhead.
In addition, disable_bh/enable_bh for many times in the rt_run_flush
can cause stuck on a machine with heavily pended softirqs.
This patch reduces locking/unlocking as doing it with jumping the lock
slots.
ifconfig up on heavy loaded interface.
Before:
time ifconfig eth1 up
BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9]
...
After:
time ifconfig eth1 up
real 0m0.007s
user 0m0.000s
sys 0m0.004s
Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
---
net/ipv4/route.c | 38 +++++++++++++++++++++++++++-----------
1 files changed, 27 insertions(+), 11 deletions(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 28484f3..79a401f 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -239,9 +239,20 @@ static spinlock_t *rt_hash_locks;
for (i = 0; i < RT_HASH_LOCK_SZ; i++) \
spin_lock_init(&rt_hash_locks[i]); \
}
+# define rt_hash_lock(lock) &rt_hash_locks[lock]
+# define rt_hash_for_each_lock(lock) \
+ for (lock = 0; lock < RT_HASH_LOCK_SZ; lock++)
+# define rt_hash_for_each_slot(slot, lock, mask) \
+ for (slot = 0, mask = lock; \
+ slot < ((int)rt_hash_mask + 1) / RT_HASH_LOCK_SZ; \
+ slot++, mask = (slot * RT_HASH_LOCK_SZ) + lock)
#else
# define rt_hash_lock_addr(slot) NULL
# define rt_hash_lock_init()
+# define rt_hash_lock(lock) NULL
+# define rt_hash_for_each_lock(lock) do { lock = 0; } while (0);
+# define rt_hash_for_each_slot(slot, lock, mask) \
+ for (slot = 0, lock = 0, mask = 0; mask <= rt_hash_mask; mask++)
#endif
static struct rt_hash_bucket *rt_hash_table;
@@ -613,24 +624,29 @@ static void rt_check_expire(struct work_struct *work)
*/
static void rt_run_flush(unsigned long dummy)
{
- int i;
+ int slot, lock, mask;
struct rtable *rth, *next;
rt_deadline = 0;
get_random_bytes(&rt_hash_rnd, 4);
- for (i = rt_hash_mask; i >= 0; i--) {
- spin_lock_bh(rt_hash_lock_addr(i));
- rth = rt_hash_table[i].chain;
- if (rth)
- rt_hash_table[i].chain = NULL;
- spin_unlock_bh(rt_hash_lock_addr(i));
-
- for (; rth; rth = next) {
- next = rth->u.dst.rt_next;
- rt_free(rth);
+ rt_hash_for_each_lock(lock) {
+ spin_lock_bh(rt_hash_lock(lock));
+ rt_hash_for_each_slot(slot, lock, mask) {
+ rth = rt_hash_table[mask].chain;
+
+ if (rth) {
+ rt_hash_table[mask].chain = NULL;
+ spin_unlock_bh(rt_hash_lock(lock));
+ for (; rth; rth = next) {
+ next = rth->u.dst.rt_next;
+ rt_free(rth);
+ }
+ spin_lock_bh(rt_hash_lock(lock));
+ }
}
+ spin_unlock_bh(rt_hash_lock(lock));
}
}
--
1.5.3.rc5
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] [IPV4] route: fix locking in rt_run_flush()
2008-01-23 7:43 ` joonwpark81
@ 2008-01-23 20:44 ` Eric Dumazet
2008-01-24 5:06 ` Joonwoo Park
0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2008-01-23 20:44 UTC (permalink / raw)
To: joonwpark81; +Cc: David Miller, netdev
joonwpark81@gmail.com a écrit :
> On Mon, Jan 21, 2008 at 02:40:43AM -0800, David Miller wrote:
>> From: Joonwoo Park <joonwpark81@gmail.com>
>> Date: Tue, 22 Jan 2008 00:08:57 +0900
>>
>>> The rt_run_flush() can be stucked if it was called while netdev is on the
>>> high load.
>>> It's possible when pushing rtable to rt_hash is faster than pulling
>>> from it.
>>>
>>> Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
>> I agree with the analysis of the problem, however not the solution.
>>
>> This will absolutely kill software interrupt latency.
>>
>> In fact, we have moved much of the flush work into a workqueue in
>> net-2.6.25 because of how important that is
>>
>> We need to find some other way to solve this.
>>
>
> Dave, Eric,
> Thanks so much for comments.
>
> I did stress tests and I found that the real problem was not consumer & supplier
> issue.
> It was the problem for me to innumerable enabling & disabling the softirq.
> But I'm still thinking need of considering issue 'faster caching than flush'. :)
>
> ifconfig up on heavy loaded interface.
> Before patching:
> time ifconfig eth1 up
> BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9]
> ...
>
> After patching:
> time ifconfig eth1 up
> real 0m0.007s
> user 0m0.000s
> sys 0m0.004s
>
> Thanks!
> Joonwoo
>
>
>>From 87c29506de967e811ad5b57cd2e1a002134e878f Mon Sep 17 00:00:00 2001
> From: Joonwoo Park <joonwpark81@gmail.com>
> Date: Wed, 23 Jan 2008 15:16:54 +0900
> Subject: [PATCH] [IPV4] route: reduce locking/unlocking in rt_run_flush
>
> The rt_run_flush does spin_lock_bh/spin_unlock_bh for rt_hash_mask + 1
> times.
> The rt_hash_mask takes from 32767 to 65535, so it's big overhead.
> In addition, disable_bh/enable_bh for many times in the rt_run_flush
> can cause stuck on a machine with heavily pended softirqs.
>
> This patch reduces locking/unlocking as doing it with jumping the lock
> slots.
>
> ifconfig up on heavy loaded interface.
> Before:
> time ifconfig eth1 up
> BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9]
> ...
>
> After:
> time ifconfig eth1 up
> real 0m0.007s
> user 0m0.000s
> sys 0m0.004s
>
Unfortunatly, your patch doesnt work on CONFIG_SMP=n (softirq will be disabled
for the whole scan of table)
Also, some machines around there have 2^22 slots in hash table, and NR_CPUS=4,
so softirqs will be disabled for a too long time.
Please try net-2.6.25 and submit patches on top of it if necessary, since
rt_run_flush() has pending changes, not in net-2.6
Note : The 'soft lockup' can be avoided by other means.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] [IPV4] route: fix locking in rt_run_flush()
2008-01-23 20:44 ` Eric Dumazet
@ 2008-01-24 5:06 ` Joonwoo Park
0 siblings, 0 replies; 6+ messages in thread
From: Joonwoo Park @ 2008-01-24 5:06 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
2008/1/24, Eric Dumazet <dada1@cosmosbay.com>:
>
> Unfortunatly, your patch doesnt work on CONFIG_SMP=n (softirq will be disabled
> for the whole scan of table)
>
> Also, some machines around there have 2^22 slots in hash table, and NR_CPUS=4,
> so softirqs will be disabled for a too long time.
>
> Please try net-2.6.25 and submit patches on top of it if necessary, since
> rt_run_flush() has pending changes, not in net-2.6
>
> Note : The 'soft lockup' can be avoided by other means.
>
>
Eric,
Thank you for your help.
I checked net-2.6.25 in this morining :(
And it brought me a conclusion that the fashion of it is the best solution for
the all circumstances.
I believe that
'cond_resched()'
and
'if (!rth) continue' without disable softirq
of net-2.6.25 is correct answer.
Thanks again!
Joonwoo
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2008-01-24 5:06 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-01-21 15:08 [PATCH] [IPV4] route: fix locking in rt_run_flush() Joonwoo Park
2008-01-21 10:40 ` David Miller
2008-01-21 21:14 ` Eric Dumazet
2008-01-23 7:43 ` joonwpark81
2008-01-23 20:44 ` Eric Dumazet
2008-01-24 5:06 ` Joonwoo Park
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).