* [PATCH] [IPV4] route: fix locking in rt_run_flush() @ 2008-01-21 15:08 Joonwoo Park 2008-01-21 10:40 ` David Miller 0 siblings, 1 reply; 6+ messages in thread From: Joonwoo Park @ 2008-01-21 15:08 UTC (permalink / raw) To: David Miller; +Cc: Joonwoo Park, linux-netdev The rt_run_flush() can be stucked if it was called while netdev is on the high load. It's possible when pushing rtable to rt_hash is faster than pulling from it. The commands 'ifconfig up or ifconfig mtu' and netif_carrier_on() can introduce soft lockup like this: [ 363.528001] BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9] [ 363.531492] [ 363.535027] Pid: 9, comm: events/0 Not tainted (2.6.24-rc8 #14) [ 363.538837] EIP: 0060:[<c4086a39>] EFLAGS: 00000286 CPU: 0 [ 363.542762] EIP is at kfree+0xa9/0xf0 ... [ 363.660815] [<c42fb0fd>] skb_release_data+0x5d/0x90 [ 363.666989] [<c42fb7dc>] skb_release_all+0x5c/0xd0 [ 363.673207] [<c42faf8b>] __kfree_skb+0xb/0x90 [ 363.679474] [<c42fb029>] kfree_skb+0x19/0x40 [ 363.685811] [<c4322d87>] ip_rcv+0x27/0x290 [ 363.692223] [<c4300ae5>] netif_receive_skb+0x255/0x320 [ 363.698759] [<f88465aa>] e1000_clean_rx_irq+0x14a/0x4f0 [e1000] [ 363.705456] [<f88437c2>] e1000_clean+0x62/0x270 [e1000] [ 363.712217] [<c43031ee>] net_rx_action+0x16e/0x220 [ 363.719065] [<c40346d7>] __do_softirq+0x87/0x100 [ 363.726001] [<c40347a7>] do_softirq+0x57/0x60 [ 363.732979] [<c4034b4e>] local_bh_enable_ip+0xae/0x100 [ 363.740094] [<c43e73f5>] _spin_unlock_bh+0x25/0x30 [ 363.747283] [<c431ec88>] rt_run_flush+0xc8/0xe0 [ 363.754566] [<c4320c76>] rt_cache_flush+0xd6/0xe0 [ 363.761917] [<c4350269>] fib_netdev_event+0x89/0xa0 [ 363.769361] [<c4047d67>] notifier_call_chain+0x37/0x80 ... This patch makes rt_run_flush() to run with softirq is disabled. Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> --- net/ipv4/route.c | 6 ++++-- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 28484f3..454e71c 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -620,18 +620,20 @@ static void rt_run_flush(unsigned long dummy) get_random_bytes(&rt_hash_rnd, 4); + local_bh_disable(); for (i = rt_hash_mask; i >= 0; i--) { - spin_lock_bh(rt_hash_lock_addr(i)); + spin_lock(rt_hash_lock_addr(i)); rth = rt_hash_table[i].chain; if (rth) rt_hash_table[i].chain = NULL; - spin_unlock_bh(rt_hash_lock_addr(i)); + spin_unlock(rt_hash_lock_addr(i)); for (; rth; rth = next) { next = rth->u.dst.rt_next; rt_free(rth); } } + local_bh_enable(); } static DEFINE_SPINLOCK(rt_flush_lock); -- 1.5.3.rc5 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] [IPV4] route: fix locking in rt_run_flush() 2008-01-21 15:08 [PATCH] [IPV4] route: fix locking in rt_run_flush() Joonwoo Park @ 2008-01-21 10:40 ` David Miller 2008-01-21 21:14 ` Eric Dumazet 2008-01-23 7:43 ` joonwpark81 0 siblings, 2 replies; 6+ messages in thread From: David Miller @ 2008-01-21 10:40 UTC (permalink / raw) To: joonwpark81; +Cc: netdev, dada1 From: Joonwoo Park <joonwpark81@gmail.com> Date: Tue, 22 Jan 2008 00:08:57 +0900 > The rt_run_flush() can be stucked if it was called while netdev is on the > high load. > It's possible when pushing rtable to rt_hash is faster than pulling > from it. > > The commands 'ifconfig up or ifconfig mtu' and netif_carrier_on() can > introduce soft lockup like this: > > [ 363.528001] BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9] > [ 363.531492] > [ 363.535027] Pid: 9, comm: events/0 Not tainted (2.6.24-rc8 #14) > [ 363.538837] EIP: 0060:[<c4086a39>] EFLAGS: 00000286 CPU: 0 > [ 363.542762] EIP is at kfree+0xa9/0xf0 > ... > [ 363.660815] [<c42fb0fd>] skb_release_data+0x5d/0x90 > [ 363.666989] [<c42fb7dc>] skb_release_all+0x5c/0xd0 > [ 363.673207] [<c42faf8b>] __kfree_skb+0xb/0x90 > [ 363.679474] [<c42fb029>] kfree_skb+0x19/0x40 > [ 363.685811] [<c4322d87>] ip_rcv+0x27/0x290 > [ 363.692223] [<c4300ae5>] netif_receive_skb+0x255/0x320 > [ 363.698759] [<f88465aa>] e1000_clean_rx_irq+0x14a/0x4f0 [e1000] > [ 363.705456] [<f88437c2>] e1000_clean+0x62/0x270 [e1000] > [ 363.712217] [<c43031ee>] net_rx_action+0x16e/0x220 > [ 363.719065] [<c40346d7>] __do_softirq+0x87/0x100 > [ 363.726001] [<c40347a7>] do_softirq+0x57/0x60 > [ 363.732979] [<c4034b4e>] local_bh_enable_ip+0xae/0x100 > [ 363.740094] [<c43e73f5>] _spin_unlock_bh+0x25/0x30 > [ 363.747283] [<c431ec88>] rt_run_flush+0xc8/0xe0 > [ 363.754566] [<c4320c76>] rt_cache_flush+0xd6/0xe0 > [ 363.761917] [<c4350269>] fib_netdev_event+0x89/0xa0 > [ 363.769361] [<c4047d67>] notifier_call_chain+0x37/0x80 > ... > > This patch makes rt_run_flush() to run with softirq is disabled. > > Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> I agree with the analysis of the problem, however not the solution. This will absolutely kill software interrupt latency. In fact, we have moved much of the flush work into a workqueue in net-2.6.25 because of how important that is We need to find some other way to solve this. Eric, any ideas? ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] [IPV4] route: fix locking in rt_run_flush() 2008-01-21 10:40 ` David Miller @ 2008-01-21 21:14 ` Eric Dumazet 2008-01-23 7:43 ` joonwpark81 1 sibling, 0 replies; 6+ messages in thread From: Eric Dumazet @ 2008-01-21 21:14 UTC (permalink / raw) To: David Miller; +Cc: joonwpark81, netdev David Miller a écrit : > From: Joonwoo Park <joonwpark81@gmail.com> > Date: Tue, 22 Jan 2008 00:08:57 +0900 > >> The rt_run_flush() can be stucked if it was called while netdev is on the >> high load. >> It's possible when pushing rtable to rt_hash is faster than pulling >> from it. >> >> The commands 'ifconfig up or ifconfig mtu' and netif_carrier_on() can >> introduce soft lockup like this: >> >> [ 363.528001] BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9] >> [ 363.531492] >> [ 363.535027] Pid: 9, comm: events/0 Not tainted (2.6.24-rc8 #14) >> [ 363.538837] EIP: 0060:[<c4086a39>] EFLAGS: 00000286 CPU: 0 >> [ 363.542762] EIP is at kfree+0xa9/0xf0 >> ... >> [ 363.660815] [<c42fb0fd>] skb_release_data+0x5d/0x90 >> [ 363.666989] [<c42fb7dc>] skb_release_all+0x5c/0xd0 >> [ 363.673207] [<c42faf8b>] __kfree_skb+0xb/0x90 >> [ 363.679474] [<c42fb029>] kfree_skb+0x19/0x40 >> [ 363.685811] [<c4322d87>] ip_rcv+0x27/0x290 >> [ 363.692223] [<c4300ae5>] netif_receive_skb+0x255/0x320 >> [ 363.698759] [<f88465aa>] e1000_clean_rx_irq+0x14a/0x4f0 [e1000] >> [ 363.705456] [<f88437c2>] e1000_clean+0x62/0x270 [e1000] >> [ 363.712217] [<c43031ee>] net_rx_action+0x16e/0x220 >> [ 363.719065] [<c40346d7>] __do_softirq+0x87/0x100 >> [ 363.726001] [<c40347a7>] do_softirq+0x57/0x60 >> [ 363.732979] [<c4034b4e>] local_bh_enable_ip+0xae/0x100 >> [ 363.740094] [<c43e73f5>] _spin_unlock_bh+0x25/0x30 >> [ 363.747283] [<c431ec88>] rt_run_flush+0xc8/0xe0 >> [ 363.754566] [<c4320c76>] rt_cache_flush+0xd6/0xe0 >> [ 363.761917] [<c4350269>] fib_netdev_event+0x89/0xa0 >> [ 363.769361] [<c4047d67>] notifier_call_chain+0x37/0x80 >> ... >> >> This patch makes rt_run_flush() to run with softirq is disabled. >> >> Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> > > I agree with the analysis of the problem, however not the solution. > > This will absolutely kill software interrupt latency. > > In fact, we have moved much of the flush work into a workqueue in > net-2.6.25 because of how important that is > > We need to find some other way to solve this. > > Eric, any ideas? Hum... 2.6.25 is certainly better in this aspect, but I remember we left something to finish :) We currently can have a worker doing the automatic flush every 600 seconds, and another task doing a rt_cache_flush(...) On very loaded machines (DDOS), routes might be added faster than deleted. Also, each change in routes must invalidate rtcache, and/but full scan of this cache is way too expensive (huge amount of MBytes must me read/written) One possibility is to use a genid marker so that each entry can be thrown away if its genid is different than the global one. rt_cache_flush(-1) or rt_secret_build() would just have to increment the global genid. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] [IPV4] route: fix locking in rt_run_flush() 2008-01-21 10:40 ` David Miller 2008-01-21 21:14 ` Eric Dumazet @ 2008-01-23 7:43 ` joonwpark81 2008-01-23 20:44 ` Eric Dumazet 1 sibling, 1 reply; 6+ messages in thread From: joonwpark81 @ 2008-01-23 7:43 UTC (permalink / raw) To: David Miller; +Cc: netdev, Eric Dumazet On Mon, Jan 21, 2008 at 02:40:43AM -0800, David Miller wrote: > From: Joonwoo Park <joonwpark81@gmail.com> > Date: Tue, 22 Jan 2008 00:08:57 +0900 > > > The rt_run_flush() can be stucked if it was called while netdev is on the > > high load. > > It's possible when pushing rtable to rt_hash is faster than pulling > > from it. > > > > Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> > > I agree with the analysis of the problem, however not the solution. > > This will absolutely kill software interrupt latency. > > In fact, we have moved much of the flush work into a workqueue in > net-2.6.25 because of how important that is > > We need to find some other way to solve this. > Dave, Eric, Thanks so much for comments. I did stress tests and I found that the real problem was not consumer & supplier issue. It was the problem for me to innumerable enabling & disabling the softirq. But I'm still thinking need of considering issue 'faster caching than flush'. :) ifconfig up on heavy loaded interface. Before patching: time ifconfig eth1 up BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9] ... After patching: time ifconfig eth1 up real 0m0.007s user 0m0.000s sys 0m0.004s Thanks! Joonwoo >From 87c29506de967e811ad5b57cd2e1a002134e878f Mon Sep 17 00:00:00 2001 From: Joonwoo Park <joonwpark81@gmail.com> Date: Wed, 23 Jan 2008 15:16:54 +0900 Subject: [PATCH] [IPV4] route: reduce locking/unlocking in rt_run_flush The rt_run_flush does spin_lock_bh/spin_unlock_bh for rt_hash_mask + 1 times. The rt_hash_mask takes from 32767 to 65535, so it's big overhead. In addition, disable_bh/enable_bh for many times in the rt_run_flush can cause stuck on a machine with heavily pended softirqs. This patch reduces locking/unlocking as doing it with jumping the lock slots. ifconfig up on heavy loaded interface. Before: time ifconfig eth1 up BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9] ... After: time ifconfig eth1 up real 0m0.007s user 0m0.000s sys 0m0.004s Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> --- net/ipv4/route.c | 38 +++++++++++++++++++++++++++----------- 1 files changed, 27 insertions(+), 11 deletions(-) diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 28484f3..79a401f 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -239,9 +239,20 @@ static spinlock_t *rt_hash_locks; for (i = 0; i < RT_HASH_LOCK_SZ; i++) \ spin_lock_init(&rt_hash_locks[i]); \ } +# define rt_hash_lock(lock) &rt_hash_locks[lock] +# define rt_hash_for_each_lock(lock) \ + for (lock = 0; lock < RT_HASH_LOCK_SZ; lock++) +# define rt_hash_for_each_slot(slot, lock, mask) \ + for (slot = 0, mask = lock; \ + slot < ((int)rt_hash_mask + 1) / RT_HASH_LOCK_SZ; \ + slot++, mask = (slot * RT_HASH_LOCK_SZ) + lock) #else # define rt_hash_lock_addr(slot) NULL # define rt_hash_lock_init() +# define rt_hash_lock(lock) NULL +# define rt_hash_for_each_lock(lock) do { lock = 0; } while (0); +# define rt_hash_for_each_slot(slot, lock, mask) \ + for (slot = 0, lock = 0, mask = 0; mask <= rt_hash_mask; mask++) #endif static struct rt_hash_bucket *rt_hash_table; @@ -613,24 +624,29 @@ static void rt_check_expire(struct work_struct *work) */ static void rt_run_flush(unsigned long dummy) { - int i; + int slot, lock, mask; struct rtable *rth, *next; rt_deadline = 0; get_random_bytes(&rt_hash_rnd, 4); - for (i = rt_hash_mask; i >= 0; i--) { - spin_lock_bh(rt_hash_lock_addr(i)); - rth = rt_hash_table[i].chain; - if (rth) - rt_hash_table[i].chain = NULL; - spin_unlock_bh(rt_hash_lock_addr(i)); - - for (; rth; rth = next) { - next = rth->u.dst.rt_next; - rt_free(rth); + rt_hash_for_each_lock(lock) { + spin_lock_bh(rt_hash_lock(lock)); + rt_hash_for_each_slot(slot, lock, mask) { + rth = rt_hash_table[mask].chain; + + if (rth) { + rt_hash_table[mask].chain = NULL; + spin_unlock_bh(rt_hash_lock(lock)); + for (; rth; rth = next) { + next = rth->u.dst.rt_next; + rt_free(rth); + } + spin_lock_bh(rt_hash_lock(lock)); + } } + spin_unlock_bh(rt_hash_lock(lock)); } } -- 1.5.3.rc5 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] [IPV4] route: fix locking in rt_run_flush() 2008-01-23 7:43 ` joonwpark81 @ 2008-01-23 20:44 ` Eric Dumazet 2008-01-24 5:06 ` Joonwoo Park 0 siblings, 1 reply; 6+ messages in thread From: Eric Dumazet @ 2008-01-23 20:44 UTC (permalink / raw) To: joonwpark81; +Cc: David Miller, netdev joonwpark81@gmail.com a écrit : > On Mon, Jan 21, 2008 at 02:40:43AM -0800, David Miller wrote: >> From: Joonwoo Park <joonwpark81@gmail.com> >> Date: Tue, 22 Jan 2008 00:08:57 +0900 >> >>> The rt_run_flush() can be stucked if it was called while netdev is on the >>> high load. >>> It's possible when pushing rtable to rt_hash is faster than pulling >>> from it. >>> >>> Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> >> I agree with the analysis of the problem, however not the solution. >> >> This will absolutely kill software interrupt latency. >> >> In fact, we have moved much of the flush work into a workqueue in >> net-2.6.25 because of how important that is >> >> We need to find some other way to solve this. >> > > Dave, Eric, > Thanks so much for comments. > > I did stress tests and I found that the real problem was not consumer & supplier > issue. > It was the problem for me to innumerable enabling & disabling the softirq. > But I'm still thinking need of considering issue 'faster caching than flush'. :) > > ifconfig up on heavy loaded interface. > Before patching: > time ifconfig eth1 up > BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9] > ... > > After patching: > time ifconfig eth1 up > real 0m0.007s > user 0m0.000s > sys 0m0.004s > > Thanks! > Joonwoo > > >>From 87c29506de967e811ad5b57cd2e1a002134e878f Mon Sep 17 00:00:00 2001 > From: Joonwoo Park <joonwpark81@gmail.com> > Date: Wed, 23 Jan 2008 15:16:54 +0900 > Subject: [PATCH] [IPV4] route: reduce locking/unlocking in rt_run_flush > > The rt_run_flush does spin_lock_bh/spin_unlock_bh for rt_hash_mask + 1 > times. > The rt_hash_mask takes from 32767 to 65535, so it's big overhead. > In addition, disable_bh/enable_bh for many times in the rt_run_flush > can cause stuck on a machine with heavily pended softirqs. > > This patch reduces locking/unlocking as doing it with jumping the lock > slots. > > ifconfig up on heavy loaded interface. > Before: > time ifconfig eth1 up > BUG: soft lockup - CPU#0 stuck for 11s! [events/0:9] > ... > > After: > time ifconfig eth1 up > real 0m0.007s > user 0m0.000s > sys 0m0.004s > Unfortunatly, your patch doesnt work on CONFIG_SMP=n (softirq will be disabled for the whole scan of table) Also, some machines around there have 2^22 slots in hash table, and NR_CPUS=4, so softirqs will be disabled for a too long time. Please try net-2.6.25 and submit patches on top of it if necessary, since rt_run_flush() has pending changes, not in net-2.6 Note : The 'soft lockup' can be avoided by other means. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] [IPV4] route: fix locking in rt_run_flush() 2008-01-23 20:44 ` Eric Dumazet @ 2008-01-24 5:06 ` Joonwoo Park 0 siblings, 0 replies; 6+ messages in thread From: Joonwoo Park @ 2008-01-24 5:06 UTC (permalink / raw) To: Eric Dumazet; +Cc: David Miller, netdev 2008/1/24, Eric Dumazet <dada1@cosmosbay.com>: > > Unfortunatly, your patch doesnt work on CONFIG_SMP=n (softirq will be disabled > for the whole scan of table) > > Also, some machines around there have 2^22 slots in hash table, and NR_CPUS=4, > so softirqs will be disabled for a too long time. > > Please try net-2.6.25 and submit patches on top of it if necessary, since > rt_run_flush() has pending changes, not in net-2.6 > > Note : The 'soft lockup' can be avoided by other means. > > Eric, Thank you for your help. I checked net-2.6.25 in this morining :( And it brought me a conclusion that the fashion of it is the best solution for the all circumstances. I believe that 'cond_resched()' and 'if (!rth) continue' without disable softirq of net-2.6.25 is correct answer. Thanks again! Joonwoo ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2008-01-24 5:06 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-01-21 15:08 [PATCH] [IPV4] route: fix locking in rt_run_flush() Joonwoo Park 2008-01-21 10:40 ` David Miller 2008-01-21 21:14 ` Eric Dumazet 2008-01-23 7:43 ` joonwpark81 2008-01-23 20:44 ` Eric Dumazet 2008-01-24 5:06 ` Joonwoo Park
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).