* Re: [ipvs] BUG: soft lockup detected on CPU#3!
  [not found] <446f47c20705221019j34de745fj9aad8633a6edfd54@mail.gmail.com>
From: Horms @ 2007-05-25 8:46 UTC
To: Sebastien Estienne; +Cc: wensong, ja, nedev

[Adding netdev CC]

On Tue, May 22, 2007 at 07:19:44PM +0200, Sebastien Estienne wrote:
> Hello,
>
> I have the following bug (stack trace in the attachment) and I think
> it's related to ipvs. I have reproduced it many times.
>
> kernel is from ubuntu: "2.6.20-15-server SMP x86_64"
> on dell poweredge sc1425
>
> I can trigger this bug when I adjust the weight of the real servers
> every 500ms using ipvsadm -R and piping the modifications
>
> there are 4 virtual servers with 55 real servers each
>
> any idea?

Not off hand. Though it does look somewhat unpleasant :(
I'll try my hand at reproducing the problem.

Is there any chance that you could try a vanilla (2.6.21) kernel
to see if the problem exists there too?

>
> regards,
> --
> Sebastien Estienne

> BUG: soft lockup detected on CPU#3!
>
> Call Trace:
>  <IRQ> [softlockup_tick+249/288] softlockup_tick+0xf9/0x120
>  [update_process_times+87/144] update_process_times+0x57/0x90
>  [smp_local_timer_interrupt+52/96] smp_local_timer_interrupt+0x34/0x60
>  [smp_apic_timer_interrupt+89/128] smp_apic_timer_interrupt+0x59/0x80
>  [apic_timer_interrupt+102/112] apic_timer_interrupt+0x66/0x70
>  <EOI> [_end+130125309/2130038908] :ip_vs:do_ip_vs_set_ctl+0xad1/0xbf0
>  [_end+130125305/2130038908] :ip_vs:do_ip_vs_set_ctl+0xacd/0xbf0
>  [nf_sockopt+233/304] nf_sockopt+0xe9/0x130
>  [nf_setsockopt+22/32] nf_setsockopt+0x16/0x20
>  [ip_setsockopt+118/160] ip_setsockopt+0x76/0xa0
>  [sys_setsockopt+166/240] sys_setsockopt+0xa6/0xf0
>  [system_call+126/131] system_call+0x7e/0x83
>
> BUG: soft lockup detected on CPU#0!
>
> Call Trace:
>  <IRQ> [softlockup_tick+249/288] softlockup_tick+0xf9/0x120
>  [update_process_times+87/144] update_process_times+0x57/0x90
>  [smp_local_timer_interrupt+52/96] smp_local_timer_interrupt+0x34/0x60
>  [smp_apic_timer_interrupt+89/128] smp_apic_timer_interrupt+0x59/0x80
>  [_end+130330509/2130038908] :nf_conntrack:nf_ct_invert_tuple+0x51/0xa0
>  [apic_timer_interrupt+102/112] apic_timer_interrupt+0x66/0x70
>  [__read_lock_failed+5/32] __read_lock_failed+0x5/0x20
>  [_read_lock+11/16] _read_lock+0xb/0x10
>  [_end+130122092/2130038908] :ip_vs:ip_vs_service_get+0x20/0x1e0
>  [_end+130144732/2130038908] :ip_vs:tcp_conn_schedule+0xa0/0x150
>  [_end+130144313/2130038908] :ip_vs:tcp_conn_in_get+0x7d/0xc0
>  [_end+130111414/2130038908] :ip_vs:ip_vs_in+0xca/0x270
>  [nf_iterate+92/160] nf_iterate+0x5c/0xa0
>  [ip_local_deliver_finish+0/528] ip_local_deliver_finish+0x0/0x210
>  [nf_hook_slow+113/240] nf_hook_slow+0x71/0xf0
>  [ip_local_deliver_finish+0/528] ip_local_deliver_finish+0x0/0x210
>  [ip_local_deliver+111/656] ip_local_deliver+0x6f/0x290
>  [ip_rcv+1246/1360] ip_rcv+0x4de/0x550
>  [netif_receive_skb+623/800] netif_receive_skb+0x26f/0x320
>  [_end+129188049/2130038908] :e1000:e1000_clean_rx_irq+0x445/0x520
>  [_end+129183152/2130038908] :e1000:e1000_clean+0x84/0x2b0
>  [net_rx_action+186/512] net_rx_action+0xba/0x200
>  [__do_softirq+95/208] __do_softirq+0x5f/0xd0
>  [call_softirq+28/40] call_softirq+0x1c/0x28
>  [do_softirq+44/144] do_softirq+0x2c/0x90
>  [do_IRQ+217/256] do_IRQ+0xd9/0x100
>  [mwait_idle+0/80] mwait_idle+0x0/0x50
>  [ret_from_intr+0/10] ret_from_intr+0x0/0xa
>  <EOI> [tcp_poll+0/336] tcp_poll+0x0/0x150
>  [mwait_idle+66/80] mwait_idle+0x42/0x50
>  [cpu_idle+155/208] cpu_idle+0x9b/0xd0
>  [start_kernel+586/608] start_kernel+0x24a/0x260
>  [x86_64_start_kernel+358/368] _sinittext+0x166/0x170
>

--
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/

* Re: [ipvs] BUG: soft lockup detected on CPU#3!
From: Sebastien Estienne @ 2007-05-25 9:30 UTC
To: Horms; +Cc: wensong, ja, nedev

Hello,

On 5/25/07, Horms <horms@verge.net.au> wrote:
> [Adding netdev CC]
>
> On Tue, May 22, 2007 at 07:19:44PM +0200, Sebastien Estienne wrote:
> > Hello,
> >
> > I have the following bug (stack trace in the attachment) and I think
> > it's related to ipvs. I have reproduced it many times.
> >
> > kernel is from ubuntu: "2.6.20-15-server SMP x86_64"
> > on dell poweredge sc1425
> >
> > I can trigger this bug when I adjust the weight of the real servers
> > every 500ms using ipvsadm -R and piping the modifications
> >
> > there are 4 virtual servers with 55 real servers each
> >
> > any idea?
>
> Not off hand. Though it does look somewhat unpleasant :(
> I'll try my hand at reproducing the problem.
>
> Is there any chance that you could try a vanilla (2.6.21) kernel
> to see if the problem exists there too?

I didn't try 2.6.21 yet, but using the ubuntu dapper kernel (2.6.15) I
can't reproduce the bug.
When I was using the feisty kernel (2.6.20), I could reproduce it in less
than 5 minutes.

I'm using lvs to load-balance some mysql servers. I wrote a daemon that
checks the synchronization of the mysql replication on each slave and
adjusts the weight on the lvs every 500ms.

--
Sebastien Estienne

* Re: [ipvs] BUG: soft lockup detected on CPU#3!
From: Horms @ 2007-05-26 2:22 UTC
To: Sebastien Estienne; +Cc: wensong, ja, nedev

On Fri, May 25, 2007 at 09:30:52AM +0000, Sebastien Estienne wrote:
>
> I didn't try 2.6.21 yet, but using the ubuntu dapper kernel (2.6.15) I
> can't reproduce the bug.
> When I was using the feisty kernel (2.6.20), I could reproduce it in less
> than 5 minutes.
>
> I'm using lvs to load-balance some mysql servers. I wrote a daemon that
> checks the synchronization of the mysql replication on each slave and
> adjusts the weight on the lvs every 500ms.

It does look a lot like there is some sort of locking problem in there.
Would it be possible to send your kernel config, as the locking
details do change a little with different configs.

--
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/

* Re: [ipvs] BUG: soft lockup detected on CPU#3!
From: Horms @ 2007-05-28 9:36 UTC
To: Sebastien Estienne; +Cc: wensong, ja, nedev

On Sat, May 26, 2007 at 11:22:40AM +0900, Horms wrote:
> On Fri, May 25, 2007 at 09:30:52AM +0000, Sebastien Estienne wrote:
> >
> > I didn't try 2.6.21 yet, but using the ubuntu dapper kernel (2.6.15) I
> > can't reproduce the bug.
> > When I was using the feisty kernel (2.6.20), I could reproduce it in less
> > than 5 minutes.
> >
> > I'm using lvs to load-balance some mysql servers. I wrote a daemon that
> > checks the synchronization of the mysql replication on each slave and
> > adjusts the weight on the lvs every 500ms.
>
> It does look a lot like there is some sort of locking problem in there.
> Would it be possible to send your kernel config, as the locking
> details do change a little with different configs.

If you also have some details of your ipvs configuration,
that might help narrow down which code paths to investigate.

I spent some time this afternoon looking into this problem,
and what I think is happening is:

1. Due to your weight-update operations, one processor
   is sitting in ip_vs_edit_dest() called by do_ip_vs_set_ctl(),
   holding write_lock_bh(&__ip_vs_svc_lock) and waiting
   for svc->usecnt to go down to 1.

2. Another processor is trying to grab
   read_lock(&__ip_vs_svc_lock) in ip_vs_service_get(),
   called from tcp_conn_schedule() and in turn ip_vs_in().

I guess that for some reason svc->usecnt isn't going down to 1.
Though I haven't been able to isolate anything particularly
interesting.

That said, the locking isn't that simple, IMHO, so there seems
to be quite a lot of scope for errors.


Some things that are of minor interest are:

I.
ip_vs_edit_dest() loops with the following construct:

	while (atomic_read(&svc->usecnt) > 1) {};

whereas similar code in the same file uses

	IP_VS_WAIT_WHILE(atomic_read(&svc->usecnt) > 1);

which expands to

	while (atomic_read(&svc->usecnt) > 1) { cpu_relax(); }

But I doubt this is a problem, except for burning the cpu a bit harder
than it needs to.

II.

do_ip_vs_set_ctl() does seem to leak svc->usecnt in one corner case,
but I doubt that is what you are seeing: if it was, your ipvsadm
command(s) would hang. The problem is a bit wordy to describe,
but this fix should illustrate it.

--- linux-2.6.orig/net/ipv4/ipvs/ip_vs_ctl.c
+++ linux-2.6/net/ipv4/ipvs/ip_vs_ctl.c
@@ -2000,7 +2000,7 @@ do_ip_vs_set_ctl(struct sock *sk, int cm
 	if (cmd != IP_VS_SO_SET_ADD
 	    && (svc == NULL || svc->protocol != usvc->protocol)) {
 		ret = -ESRCH;
-		goto out_unlock;
+		goto out_svc;
 	}
 
 	switch (cmd) {
@@ -2034,9 +2034,9 @@ do_ip_vs_set_ctl(struct sock *sk, int cm
 		ret = -EINVAL;
 	}
 
+  out_svc:
 	if (svc)
 		ip_vs_service_put(svc);
-
   out_unlock:
 	mutex_unlock(&__ip_vs_mutex);
   out_dec:

III.

Perhaps if you are calling ipvsadm a lot then there is a remote
possibility that write_lock_bh() could starve read_lock(). This
seems ludicrous, but I'm just mentioning it as it crossed my mind.

--
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/

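The corner case in item II of the message above can be pictured with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: struct svc_sketch, svc_get(), svc_put() and set_ctl_sketch() are hypothetical names, the reference count is a plain int rather than an atomic_t, and the mutex and command handling are left out. It shows the general pattern the patch restores, namely that every exit path taken after a reference has been acquired goes through the label that drops it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a service entry. */
struct svc_sketch {
    int protocol;
    int usecnt;   /* reference count; the kernel uses an atomic_t */
};

/* Take a reference, as a lookup helper would. */
static struct svc_sketch *svc_get(struct svc_sketch *svc)
{
    if (svc)
        svc->usecnt++;
    return svc;
}

/* Drop a reference previously taken by svc_get(). */
static void svc_put(struct svc_sketch *svc)
{
    assert(svc->usecnt > 0);
    svc->usecnt--;
}

/*
 * Model of the control path. With patched == 0 the protocol-mismatch
 * branch jumps past the put (the behaviour before the patch); with
 * patched != 0 it jumps to out_svc, so the reference is released.
 */
static int set_ctl_sketch(struct svc_sketch *entry, int protocol, int patched)
{
    int ret = 0;
    struct svc_sketch *svc = svc_get(entry);

    if (svc == NULL || svc->protocol != protocol) {
        ret = -ESRCH;
        if (patched)
            goto out_svc;
        goto out_unlock;
    }

    /* Command handling would go here. */

out_svc:
    if (svc)
        svc_put(svc);
out_unlock:
    /* The real function releases a mutex here; omitted in this sketch. */
    return ret;
}
```

In this model a protocol mismatch on the unpatched path leaves usecnt one higher than before the call, which is the kind of leak that would keep a loop waiting on usecnt from ever finishing; on the patched path the count returns to its starting value.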
* Re: [ipvs] BUG: soft lockup detected on CPU#3!
From: Sebastien Estienne @ 2007-05-28 14:20 UTC
To: Horms; +Cc: wensong, ja, nedev

On 5/28/07, Horms <horms@verge.net.au> wrote:
> On Sat, May 26, 2007 at 11:22:40AM +0900, Horms wrote:
> > On Fri, May 25, 2007 at 09:30:52AM +0000, Sebastien Estienne wrote:
> > >
> > > I didn't try 2.6.21 yet, but using the ubuntu dapper kernel (2.6.15) I
> > > can't reproduce the bug.
> > > When I was using the feisty kernel (2.6.20), I could reproduce it in less
> > > than 5 minutes.
> > >
> > > I'm using lvs to load-balance some mysql servers. I wrote a daemon that
> > > checks the synchronization of the mysql replication on each slave and
> > > adjusts the weight on the lvs every 500ms.
> >
> > It does look a lot like there is some sort of locking problem in there.
> > Would it be possible to send your kernel config, as the locking
> > details do change a little with different configs.
>

About the kernel .config, I'm using the vanilla kernel "-server" from
ubuntu feisty.

> If you also have some details of your ipvs configuration,
> that might help narrow down which code paths to investigate.
>

I attached the output of ipvsadm-save.

I'm adjusting the weight every 500ms by generating lines like this:
-e -t 10.33.1.231:3306 -r 10.33.1.1 -w 100
and piping all the needed changes into ipvsadm -R.

It can represent something like 20 to 40 updates at one time.

I also noticed that sometimes when I execute "ipvsadm" the display gets
locked in the middle for a second and then finishes.

--
Sebastien Estienne

[-- Attachment #2: ipvsadm-save.txt --]
[-- Type: text/plain, Size: 10712 bytes --]

-A -t 10.33.1.231:3306 -s wlc
-a -t 10.33.1.231:3306 -r 10.33.1.59:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.58:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.57:3306 -g -w 100
-a -t 10.33.1.231:3306 -r 10.33.1.56:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.55:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.54:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.53:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.52:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.51:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.50:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.49:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.48:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.47:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.46:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.45:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.44:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.43:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.42:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.41:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.40:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.39:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.38:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.37:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.36:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.34:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.33:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.32:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.31:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.30:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.29:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.28:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.27:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.26:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.25:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.24:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.23:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.22:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.21:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.20:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.19:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.18:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.17:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.16:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.14:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.13:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.12:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.9:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.4:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.3:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.2:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.1:3306 -g -w 0
-A -t 10.33.1.232:3306 -s wlc
-a -t 10.33.1.232:3306 -r 10.33.1.59:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.58:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.57:3306 -g -w 100
-a -t 10.33.1.232:3306 -r 10.33.1.56:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.55:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.54:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.53:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.52:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.51:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.50:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.49:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.48:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.47:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.46:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.45:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.44:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.43:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.42:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.41:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.40:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.39:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.38:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.37:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.36:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.34:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.33:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.32:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.31:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.30:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.29:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.28:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.27:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.26:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.25:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.24:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.23:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.22:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.21:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.20:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.19:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.18:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.17:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.16:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.14:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.13:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.12:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.9:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.4:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.3:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.2:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.1:3306 -g -w 0
-A -t 10.33.1.221:3306 -s wlc
-a -t 10.33.1.221:3306 -r 10.33.1.59:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.58:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.57:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.56:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.55:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.54:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.53:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.52:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.51:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.50:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.49:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.48:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.47:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.46:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.45:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.44:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.43:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.42:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.41:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.40:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.39:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.38:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.37:3306 -g -w 0
-a -t 10.33.1.221:3306 -r 10.33.1.36:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.34:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.33:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.32:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.31:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.30:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.29:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.28:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.27:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.26:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.25:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.24:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.23:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.22:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.21:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.20:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.19:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.18:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.17:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.16:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.14:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.13:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.12:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.9:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.4:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.3:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.2:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.1:3306 -g -w 100
-A -t 10.33.1.222:3306 -s wlc
-a -t 10.33.1.222:3306 -r 10.33.1.59:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.58:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.57:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.56:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.55:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.54:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.53:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.52:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.51:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.50:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.49:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.48:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.47:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.46:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.45:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.44:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.43:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.42:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.41:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.40:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.39:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.38:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.37:3306 -g -w 0
-a -t 10.33.1.222:3306 -r 10.33.1.36:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.34:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.33:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.32:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.31:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.30:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.29:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.28:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.27:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.26:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.25:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.24:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.23:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.22:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.21:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.20:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.19:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.18:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.17:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.16:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.14:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.13:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.12:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.9:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.4:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.3:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.2:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.1:3306 -g -w 100

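The weight-update lines described in the message above (of the form "-e -t 10.33.1.231:3306 -r 10.33.1.1 -w 100", piped into ipvsadm -R) could be produced by a helper along these lines. This is a minimal sketch, not the reporter's daemon: format_weight_update() and its parameters are hypothetical names chosen for illustration, and only the rule layout is taken from the message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one "edit real server" rule in the layout quoted in the message:
 *   -e -t <vip>:<port> -r <rip> -w <weight>
 * followed by a newline, one rule per line for ipvsadm -R.
 * Returns the number of characters written (excluding the terminating
 * NUL), or -1 if the rule does not fit in the buffer.
 */
static int format_weight_update(char *buf, size_t len, const char *vip,
                                unsigned port, const char *rip, int weight)
{
    int n = snprintf(buf, len, "-e -t %s:%u -r %s -w %d\n",
                     vip, port, rip, weight);

    if (n < 0 || (size_t)n >= len)
        return -1;   /* encoding error or truncated output */
    return n;
}
```

A caller would format one line per real server whose weight changed and write the whole batch to the standard input of a single ipvsadm -R process, which matches the "20 to 40 updates at one time" described above.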
* Re: [ipvs] BUG: soft lockup detected on CPU#3!
From: Horms @ 2007-05-29 1:08 UTC
To: Sebastien Estienne; +Cc: wensong, ja, nedev

On Mon, May 28, 2007 at 04:20:32PM +0200, Sebastien Estienne wrote:
> On 5/28/07, Horms <horms@verge.net.au> wrote:
> >On Sat, May 26, 2007 at 11:22:40AM +0900, Horms wrote:
> >> On Fri, May 25, 2007 at 09:30:52AM +0000, Sebastien Estienne wrote:
> >> >
> >> > I didn't try 2.6.21 yet, but using the ubuntu dapper kernel (2.6.15) I
> >> > can't reproduce the bug.
> >> > When I was using the feisty kernel (2.6.20), I could reproduce it in less
> >> > than 5 minutes.
> >> >
> >> > I'm using lvs to load-balance some mysql servers. I wrote a daemon that
> >> > checks the synchronization of the mysql replication on each slave and
> >> > adjusts the weight on the lvs every 500ms.
> >>
> >> It does look a lot like there is some sort of locking problem in there.
> >> Would it be possible to send your kernel config, as the locking
> >> details do change a little with different configs.
> >
>
> About the kernel .config, I'm using the vanilla kernel "-server" from
> ubuntu feisty.
>
> >If you also have some details of your ipvs configuration,
> >that might help narrow down which code paths to investigate.
> >
>
> I attached the output of ipvsadm-save.
>
> I'm adjusting the weight every 500ms by generating lines like this:
> -e -t 10.33.1.231:3306 -r 10.33.1.1 -w 100

Thanks.

> and piping all the needed changes into ipvsadm -R.
>
> It can represent something like 20 to 40 updates at one time.
>
> I also noticed that sometimes when I execute "ipvsadm" the display gets
> locked in the middle for a second and then finishes.

That is probably related. Though the fact that it eventually exits
seems to indicate that you're not hitting II.

> >II.
> >
> >do_ip_vs_set_ctl() does seem to leak svc->usecnt in one corner case,
> >but I doubt that is what you are seeing: if it was, your ipvsadm
> >command(s) would hang. The problem is a bit wordy to describe,
> >but this fix should illustrate it.
> >
> >--- linux-2.6.orig/net/ipv4/ipvs/ip_vs_ctl.c
> >+++ linux-2.6/net/ipv4/ipvs/ip_vs_ctl.c
> >@@ -2000,7 +2000,7 @@ do_ip_vs_set_ctl(struct sock *sk, int cm
> > 	if (cmd != IP_VS_SO_SET_ADD
> > 	    && (svc == NULL || svc->protocol != usvc->protocol)) {
> > 		ret = -ESRCH;
> >-		goto out_unlock;
> >+		goto out_svc;
> > 	}
> >
> > 	switch (cmd) {
> >@@ -2034,9 +2034,9 @@ do_ip_vs_set_ctl(struct sock *sk, int cm
> > 		ret = -EINVAL;
> > 	}
> >
> >+  out_svc:
> > 	if (svc)
> > 		ip_vs_service_put(svc);
> >-
> >   out_unlock:
> > 	mutex_unlock(&__ip_vs_mutex);
> >   out_dec:

--
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/