* Re: [ipvs] BUG: soft lockup detected on CPU#3!
[not found] <446f47c20705221019j34de745fj9aad8633a6edfd54@mail.gmail.com>
@ 2007-05-25 8:46 ` Horms
2007-05-25 9:30 ` Sebastien Estienne
0 siblings, 1 reply; 6+ messages in thread
From: Horms @ 2007-05-25 8:46 UTC (permalink / raw)
To: Sebastien Estienne; +Cc: wensong, ja, nedev
[Adding netdev CC]
On Tue, May 22, 2007 at 07:19:44PM +0200, Sebastien Estienne wrote:
> Hello,
>
> I have the following bug (stacktrace in the attachment) and i think
> it's related to ipvs, i reproduced it many time.
>
> kernel is from ubuntu: "2.6.20-15-server SMP x86_64"
> on dell poweredge sc1425
>
> i can trigger this bug when i adjust the weight of the real server
> every 500ms using ipvsadm -R and piping the modifications
>
> there is 4 virtual server with 55 realserver each
>
> any idea?
Not off hand. Though it does look some what unplesant :(
I'll try my hand at trying to reproduce the problem.
Is there any chance that you could try a vanilla (2.6.21) kernel
to see if the problem exists there too?
>
> regards,
> --
> Sebastien Estienne
> BUG: soft lockup detected on CPU#3!
>
> Call Trace:
> <IRQ> [softlockup_tick+249/288] softlockup_tick+0xf9/0x120
> [update_process_times+87/144] update_process_times+0x57/0x90
> [smp_local_timer_interrupt+52/96] smp_local_timer_interrupt+0x34/0x60
> [smp_apic_timer_interrupt+89/128] smp_apic_timer_interrupt+0x59/0x80
> [apic_timer_interrupt+102/112] apic_timer_interrupt+0x66/0x70
> <EOI> [_end+130125309/2130038908] :ip_vs:do_ip_vs_set_ctl+0xad1/0xbf0
> [_end+130125305/2130038908] :ip_vs:do_ip_vs_set_ctl+0xacd/0xbf0
> [nf_sockopt+233/304] nf_sockopt+0xe9/0x130
> [nf_setsockopt+22/32] nf_setsockopt+0x16/0x20
> [ip_setsockopt+118/160] ip_setsockopt+0x76/0xa0
> [sys_setsockopt+166/240] sys_setsockopt+0xa6/0xf0
> [system_call+126/131] system_call+0x7e/0x83
>
> BUG: soft lockup detected on CPU#0!
>
> Call Trace:
> <IRQ> [softlockup_tick+249/288] softlockup_tick+0xf9/0x120
> [update_process_times+87/144] update_process_times+0x57/0x90
> [smp_local_timer_interrupt+52/96] smp_local_timer_interrupt+0x34/0x60
> [smp_apic_timer_interrupt+89/128] smp_apic_timer_interrupt+0x59/0x80
> [_end+130330509/2130038908] :nf_conntrack:nf_ct_invert_tuple+0x51/0xa0
> [apic_timer_interrupt+102/112] apic_timer_interrupt+0x66/0x70
> [__read_lock_failed+5/32] __read_lock_failed+0x5/0x20
> [_read_lock+11/16] _read_lock+0xb/0x10
> [_end+130122092/2130038908] :ip_vs:ip_vs_service_get+0x20/0x1e0
> [_end+130144732/2130038908] :ip_vs:tcp_conn_schedule+0xa0/0x150
> [_end+130144313/2130038908] :ip_vs:tcp_conn_in_get+0x7d/0xc0
> [_end+130111414/2130038908] :ip_vs:ip_vs_in+0xca/0x270
> [nf_iterate+92/160] nf_iterate+0x5c/0xa0
> [ip_local_deliver_finish+0/528] ip_local_deliver_finish+0x0/0x210
> [nf_hook_slow+113/240] nf_hook_slow+0x71/0xf0
> [ip_local_deliver_finish+0/528] ip_local_deliver_finish+0x0/0x210
> [ip_local_deliver+111/656] ip_local_deliver+0x6f/0x290
> [ip_rcv+1246/1360] ip_rcv+0x4de/0x550
> [netif_receive_skb+623/800] netif_receive_skb+0x26f/0x320
> [_end+129188049/2130038908] :e1000:e1000_clean_rx_irq+0x445/0x520
> [_end+129183152/2130038908] :e1000:e1000_clean+0x84/0x2b0
> [net_rx_action+186/512] net_rx_action+0xba/0x200
> [__do_softirq+95/208] __do_softirq+0x5f/0xd0
> [call_softirq+28/40] call_softirq+0x1c/0x28
> [do_softirq+44/144] do_softirq+0x2c/0x90
> [do_IRQ+217/256] do_IRQ+0xd9/0x100
> [mwait_idle+0/80] mwait_idle+0x0/0x50
> [ret_from_intr+0/10] ret_from_intr+0x0/0xa
> <EOI> [tcp_poll+0/336] tcp_poll+0x0/0x150
> [mwait_idle+66/80] mwait_idle+0x42/0x50
> [cpu_idle+155/208] cpu_idle+0x9b/0xd0
> [start_kernel+586/608] start_kernel+0x24a/0x260
> [x86_64_start_kernel+358/368] _sinittext+0x166/0x170
>
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [ipvs] BUG: soft lockup detected on CPU#3!
2007-05-25 8:46 ` [ipvs] BUG: soft lockup detected on CPU#3! Horms
@ 2007-05-25 9:30 ` Sebastien Estienne
2007-05-26 2:22 ` Horms
0 siblings, 1 reply; 6+ messages in thread
From: Sebastien Estienne @ 2007-05-25 9:30 UTC (permalink / raw)
To: Horms; +Cc: wensong, ja, nedev
Hello,
On 5/25/07, Horms <horms@verge.net.au> wrote:
> [Adding netdev CC]
>
> On Tue, May 22, 2007 at 07:19:44PM +0200, Sebastien Estienne wrote:
> > Hello,
> >
> > I have the following bug (stacktrace in the attachment) and i think
> > it's related to ipvs, i reproduced it many time.
> >
> > kernel is from ubuntu: "2.6.20-15-server SMP x86_64"
> > on dell poweredge sc1425
> >
> > i can trigger this bug when i adjust the weight of the real server
> > every 500ms using ipvsadm -R and piping the modifications
> >
> > there is 4 virtual server with 55 realserver each
> >
> > any idea?
>
> Not off hand. Though it does look some what unplesant :(
> I'll try my hand at trying to reproduce the problem.
>
> Is there any chance that you could try a vanilla (2.6.21) kernel
> to see if the problem exists there too?
I didn't try 2.6.21 yet, but using ubuntu dapper kernel (2.6.15) i
can't reproduce the bug.
When i was using feisty kernel (2.6.20), i can reproduce in less than 5 minutes.
I'm using lvs to loadbalance some mysql servers, i wrote a deamon that
check the synchro of the mysql replication on each slave and adjust
the wieght on the lvs every 500ms
> >
> > regards,
> > --
> > Sebastien Estienne
>
> > BUG: soft lockup detected on CPU#3!
> >
> > Call Trace:
> > <IRQ> [softlockup_tick+249/288] softlockup_tick+0xf9/0x120
> > [update_process_times+87/144] update_process_times+0x57/0x90
> > [smp_local_timer_interrupt+52/96] smp_local_timer_interrupt+0x34/0x60
> > [smp_apic_timer_interrupt+89/128] smp_apic_timer_interrupt+0x59/0x80
> > [apic_timer_interrupt+102/112] apic_timer_interrupt+0x66/0x70
> > <EOI> [_end+130125309/2130038908] :ip_vs:do_ip_vs_set_ctl+0xad1/0xbf0
> > [_end+130125305/2130038908] :ip_vs:do_ip_vs_set_ctl+0xacd/0xbf0
> > [nf_sockopt+233/304] nf_sockopt+0xe9/0x130
> > [nf_setsockopt+22/32] nf_setsockopt+0x16/0x20
> > [ip_setsockopt+118/160] ip_setsockopt+0x76/0xa0
> > [sys_setsockopt+166/240] sys_setsockopt+0xa6/0xf0
> > [system_call+126/131] system_call+0x7e/0x83
> >
> > BUG: soft lockup detected on CPU#0!
> >
> > Call Trace:
> > <IRQ> [softlockup_tick+249/288] softlockup_tick+0xf9/0x120
> > [update_process_times+87/144] update_process_times+0x57/0x90
> > [smp_local_timer_interrupt+52/96] smp_local_timer_interrupt+0x34/0x60
> > [smp_apic_timer_interrupt+89/128] smp_apic_timer_interrupt+0x59/0x80
> > [_end+130330509/2130038908] :nf_conntrack:nf_ct_invert_tuple+0x51/0xa0
> > [apic_timer_interrupt+102/112] apic_timer_interrupt+0x66/0x70
> > [__read_lock_failed+5/32] __read_lock_failed+0x5/0x20
> > [_read_lock+11/16] _read_lock+0xb/0x10
> > [_end+130122092/2130038908] :ip_vs:ip_vs_service_get+0x20/0x1e0
> > [_end+130144732/2130038908] :ip_vs:tcp_conn_schedule+0xa0/0x150
> > [_end+130144313/2130038908] :ip_vs:tcp_conn_in_get+0x7d/0xc0
> > [_end+130111414/2130038908] :ip_vs:ip_vs_in+0xca/0x270
> > [nf_iterate+92/160] nf_iterate+0x5c/0xa0
> > [ip_local_deliver_finish+0/528] ip_local_deliver_finish+0x0/0x210
> > [nf_hook_slow+113/240] nf_hook_slow+0x71/0xf0
> > [ip_local_deliver_finish+0/528] ip_local_deliver_finish+0x0/0x210
> > [ip_local_deliver+111/656] ip_local_deliver+0x6f/0x290
> > [ip_rcv+1246/1360] ip_rcv+0x4de/0x550
> > [netif_receive_skb+623/800] netif_receive_skb+0x26f/0x320
> > [_end+129188049/2130038908] :e1000:e1000_clean_rx_irq+0x445/0x520
> > [_end+129183152/2130038908] :e1000:e1000_clean+0x84/0x2b0
> > [net_rx_action+186/512] net_rx_action+0xba/0x200
> > [__do_softirq+95/208] __do_softirq+0x5f/0xd0
> > [call_softirq+28/40] call_softirq+0x1c/0x28
> > [do_softirq+44/144] do_softirq+0x2c/0x90
> > [do_IRQ+217/256] do_IRQ+0xd9/0x100
> > [mwait_idle+0/80] mwait_idle+0x0/0x50
> > [ret_from_intr+0/10] ret_from_intr+0x0/0xa
> > <EOI> [tcp_poll+0/336] tcp_poll+0x0/0x150
> > [mwait_idle+66/80] mwait_idle+0x42/0x50
> > [cpu_idle+155/208] cpu_idle+0x9b/0xd0
> > [start_kernel+586/608] start_kernel+0x24a/0x260
> > [x86_64_start_kernel+358/368] _sinittext+0x166/0x170
> >
>
>
> --
> Horms
> H: http://www.vergenet.net/~horms/
> W: http://www.valinux.co.jp/en/
>
>
--
Sebastien Estienne
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [ipvs] BUG: soft lockup detected on CPU#3!
2007-05-25 9:30 ` Sebastien Estienne
@ 2007-05-26 2:22 ` Horms
2007-05-28 9:36 ` Horms
0 siblings, 1 reply; 6+ messages in thread
From: Horms @ 2007-05-26 2:22 UTC (permalink / raw)
To: Sebastien Estienne; +Cc: wensong, ja, nedev
On Fri, May 25, 2007 at 09:30:52AM +0000, Sebastien Estienne wrote:
>
> I didn't try 2.6.21 yet, but using ubuntu dapper kernel (2.6.15) i
> can't reproduce the bug.
> When i was using feisty kernel (2.6.20), i can reproduce in less than 5
> minutes.
>
> I'm using lvs to loadbalance some mysql servers, i wrote a deamon that
> check the synchro of the mysql replication on each slave and adjust
> the wieght on the lvs every 500ms
It does look a lot like there is some sort of locking problem in there.
Would it be possible to send your kernel config, as the locking
deatails to change a little with different configs.
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [ipvs] BUG: soft lockup detected on CPU#3!
2007-05-26 2:22 ` Horms
@ 2007-05-28 9:36 ` Horms
2007-05-28 14:20 ` Sebastien Estienne
0 siblings, 1 reply; 6+ messages in thread
From: Horms @ 2007-05-28 9:36 UTC (permalink / raw)
To: Sebastien Estienne; +Cc: wensong, ja, nedev
On Sat, May 26, 2007 at 11:22:40AM +0900, Horms wrote:
> On Fri, May 25, 2007 at 09:30:52AM +0000, Sebastien Estienne wrote:
> >
> > I didn't try 2.6.21 yet, but using ubuntu dapper kernel (2.6.15) i
> > can't reproduce the bug.
> > When i was using feisty kernel (2.6.20), i can reproduce in less than 5
> > minutes.
> >
> > I'm using lvs to loadbalance some mysql servers, i wrote a deamon that
> > check the synchro of the mysql replication on each slave and adjust
> > the wieght on the lvs every 500ms
>
> It does look a lot like there is some sort of locking problem in there.
> Would it be possible to send your kernel config, as the locking
> deatails to change a little with different configs.
If you also have some details of you ipvs configuration,
that might help narrow down which code-paths to investigate.
I spent some time this afternoon looking into this probem,
and what I think is happening is:
1. Due to your weight-update operations, one processor
is sitting in ip_vs_edit_dest() called by do_ip_vs_set_ctl(),
holding write_lock_bh(&__ip_vs_svc_lock) and waiting
for svc->usecnt to go down to 1.
2. Another process is trying to grab
read_lock(&__ip_vs_svc_lock) in ip_vs_service_get(),
called from tcp_conn_schedule() and in turn ip_vs_in().
I guess that for some reason svc->usecnt isn't going down to 0.
Though I haven't been able to isolate anything particularly
interesting.
That said, the locking isn't that simple, IMHO, so there seems
to be quite a lot of scope for errors.
Some things that are of minor insterst are:
I.
ip_vs_edit_dest() loops with the following construct:
while (atomic_read(&svc->usecnt) > 1) {};
whereas similar code in the same file uses
IP_VS_WAIT_WHILE(atomic_read(&svc->usecnt) > 1);
which expands to
while (atomic_read(&svc->usecnt) > 1) { cpu_relax(); }
But I dount this is a problem, except for burning the cpu a bit harder
than it needs to.
II.
ip_vs_set_ctl() does seem to leak svc->usecnt in one corner case,
but I doubt that is what you are seeing - if it was your ipvsadm
command(s) would hang. The problem is a bit wordy to describe,
but this fix should illustrate the problem.
--- linux-2.6.orig/net/ipv4/ipvs/ip_vs_ctl.c
+++ linux-2.6/net/ipv4/ipvs/ip_vs_ctl.c
@@ -2000,7 +2000,7 @@ do_ip_vs_set_ctl(struct sock *sk, int cm
if (cmd != IP_VS_SO_SET_ADD
&& (svc == NULL || svc->protocol != usvc->protocol)) {
ret = -ESRCH;
- goto out_unlock;
+ goto out_svc;
}
switch (cmd) {
@@ -2034,9 +2034,9 @@ do_ip_vs_set_ctl(struct sock *sk, int cm
ret = -EINVAL;
}
+ out_svc:
if (svc)
ip_vs_service_put(svc);
-
out_unlock:
mutex_unlock(&__ip_vs_mutex);
out_dec:
III.
Perhaps if you are calling ipvsadm a lot then there is a remote
possibility that write_lock_bh() could starve read_lock(). This
seems ludicrous, but I'm just mentioning it as it crossed my mind.
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [ipvs] BUG: soft lockup detected on CPU#3!
2007-05-28 9:36 ` Horms
@ 2007-05-28 14:20 ` Sebastien Estienne
2007-05-29 1:08 ` Horms
0 siblings, 1 reply; 6+ messages in thread
From: Sebastien Estienne @ 2007-05-28 14:20 UTC (permalink / raw)
To: Horms; +Cc: wensong, ja, nedev
[-- Attachment #1: Type: text/plain, Size: 3819 bytes --]
On 5/28/07, Horms <horms@verge.net.au> wrote:
> On Sat, May 26, 2007 at 11:22:40AM +0900, Horms wrote:
> > On Fri, May 25, 2007 at 09:30:52AM +0000, Sebastien Estienne wrote:
> > >
> > > I didn't try 2.6.21 yet, but using ubuntu dapper kernel (2.6.15) i
> > > can't reproduce the bug.
> > > When i was using feisty kernel (2.6.20), i can reproduce in less than 5
> > > minutes.
> > >
> > > I'm using lvs to loadbalance some mysql servers, i wrote a deamon that
> > > check the synchro of the mysql replication on each slave and adjust
> > > the wieght on the lvs every 500ms
> >
> > It does look a lot like there is some sort of locking problem in there.
> > Would it be possible to send your kernel config, as the locking
> > deatails to change a little with different configs.
>
About the kernel .config, i'm using the vanilla kernel "-server" from
ubuntu feisty
> If you also have some details of you ipvs configuration,
> that might help narrow down which code-paths to investigate.
>
i attached the output of ipvsadm-save
i'm adjusting the weight every 500ms by generating lines like this:
-e -t 10.33.1.231:3306 -r 10.33.1.1 -w 100
and piping all the needed changes in ipvadm -R
it can represent something like 20 to 40 updates in one time.
i also noticed that sometimes when i execute "ipvsadm" the display get
locked in the middle for a second and then finish.
> I spent some time this afternoon looking into this probem,
> and what I think is happening is:
>
> 1. Due to your weight-update operations, one processor
> is sitting in ip_vs_edit_dest() called by do_ip_vs_set_ctl(),
> holding write_lock_bh(&__ip_vs_svc_lock) and waiting
> for svc->usecnt to go down to 1.
>
> 2. Another process is trying to grab
> read_lock(&__ip_vs_svc_lock) in ip_vs_service_get(),
> called from tcp_conn_schedule() and in turn ip_vs_in().
>
> I guess that for some reason svc->usecnt isn't going down to 0.
> Though I haven't been able to isolate anything particularly
> interesting.
>
> That said, the locking isn't that simple, IMHO, so there seems
> to be quite a lot of scope for errors.
>
>
> Some things that are of minor insterst are:
>
> I.
> ip_vs_edit_dest() loops with the following construct:
>
> while (atomic_read(&svc->usecnt) > 1) {};
>
> whereas similar code in the same file uses
>
> IP_VS_WAIT_WHILE(atomic_read(&svc->usecnt) > 1);
>
> which expands to
>
> while (atomic_read(&svc->usecnt) > 1) { cpu_relax(); }
>
> But I dount this is a problem, except for burning the cpu a bit harder
> than it needs to.
>
> II.
>
> ip_vs_set_ctl() does seem to leak svc->usecnt in one corner case,
> but I doubt that is what you are seeing - if it was your ipvsadm
> command(s) would hang. The problem is a bit wordy to describe,
> but this fix should illustrate the problem.
>
> --- linux-2.6.orig/net/ipv4/ipvs/ip_vs_ctl.c
> +++ linux-2.6/net/ipv4/ipvs/ip_vs_ctl.c
> @@ -2000,7 +2000,7 @@ do_ip_vs_set_ctl(struct sock *sk, int cm
> if (cmd != IP_VS_SO_SET_ADD
> && (svc == NULL || svc->protocol != usvc->protocol)) {
> ret = -ESRCH;
> - goto out_unlock;
> + goto out_svc;
> }
>
> switch (cmd) {
> @@ -2034,9 +2034,9 @@ do_ip_vs_set_ctl(struct sock *sk, int cm
> ret = -EINVAL;
> }
>
> + out_svc:
> if (svc)
> ip_vs_service_put(svc);
> -
> out_unlock:
> mutex_unlock(&__ip_vs_mutex);
> out_dec:
>
> III.
>
> Perhaps if you are calling ipvsadm a lot then there is a remote
> possibility that write_lock_bh() could starve read_lock(). This
> seems ludicrous, but I'm just mentioning it as it crossed my mind.
>
> --
> Horms
> H: http://www.vergenet.net/~horms/
> W: http://www.valinux.co.jp/en/
>
>
--
Sebastien Estienne
[-- Attachment #2: ipvsadm-save.txt --]
[-- Type: text/plain, Size: 10712 bytes --]
-A -t 10.33.1.231:3306 -s wlc
-a -t 10.33.1.231:3306 -r 10.33.1.59:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.58:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.57:3306 -g -w 100
-a -t 10.33.1.231:3306 -r 10.33.1.56:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.55:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.54:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.53:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.52:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.51:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.50:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.49:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.48:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.47:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.46:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.45:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.44:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.43:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.42:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.41:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.40:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.39:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.38:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.37:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.36:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.34:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.33:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.32:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.31:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.30:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.29:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.28:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.27:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.26:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.25:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.24:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.23:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.22:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.21:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.20:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.19:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.18:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.17:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.16:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.14:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.13:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.12:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.9:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.4:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.3:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.2:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.1:3306 -g -w 0
-A -t 10.33.1.232:3306 -s wlc
-a -t 10.33.1.232:3306 -r 10.33.1.59:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.58:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.57:3306 -g -w 100
-a -t 10.33.1.232:3306 -r 10.33.1.56:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.55:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.54:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.53:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.52:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.51:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.50:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.49:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.48:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.47:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.46:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.45:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.44:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.43:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.42:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.41:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.40:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.39:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.38:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.37:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.36:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.34:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.33:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.32:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.31:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.30:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.29:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.28:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.27:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.26:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.25:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.24:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.23:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.22:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.21:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.20:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.19:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.18:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.17:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.16:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.14:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.13:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.12:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.9:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.4:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.3:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.2:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.1:3306 -g -w 0
-A -t 10.33.1.221:3306 -s wlc
-a -t 10.33.1.221:3306 -r 10.33.1.59:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.58:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.57:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.56:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.55:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.54:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.53:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.52:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.51:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.50:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.49:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.48:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.47:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.46:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.45:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.44:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.43:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.42:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.41:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.40:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.39:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.38:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.37:3306 -g -w 0
-a -t 10.33.1.221:3306 -r 10.33.1.36:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.34:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.33:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.32:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.31:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.30:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.29:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.28:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.27:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.26:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.25:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.24:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.23:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.22:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.21:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.20:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.19:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.18:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.17:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.16:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.14:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.13:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.12:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.9:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.4:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.3:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.2:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.1:3306 -g -w 100
-A -t 10.33.1.222:3306 -s wlc
-a -t 10.33.1.222:3306 -r 10.33.1.59:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.58:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.57:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.56:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.55:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.54:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.53:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.52:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.51:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.50:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.49:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.48:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.47:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.46:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.45:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.44:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.43:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.42:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.41:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.40:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.39:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.38:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.37:3306 -g -w 0
-a -t 10.33.1.222:3306 -r 10.33.1.36:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.34:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.33:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.32:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.31:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.30:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.29:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.28:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.27:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.26:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.25:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.24:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.23:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.22:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.21:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.20:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.19:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.18:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.17:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.16:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.14:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.13:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.12:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.9:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.4:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.3:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.2:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.1:3306 -g -w 100
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [ipvs] BUG: soft lockup detected on CPU#3!
2007-05-28 14:20 ` Sebastien Estienne
@ 2007-05-29 1:08 ` Horms
0 siblings, 0 replies; 6+ messages in thread
From: Horms @ 2007-05-29 1:08 UTC (permalink / raw)
To: Sebastien Estienne; +Cc: wensong, ja, nedev
On Mon, May 28, 2007 at 04:20:32PM +0200, Sebastien Estienne wrote:
> On 5/28/07, Horms <horms@verge.net.au> wrote:
> >On Sat, May 26, 2007 at 11:22:40AM +0900, Horms wrote:
> >> On Fri, May 25, 2007 at 09:30:52AM +0000, Sebastien Estienne wrote:
> >> >
> >> > I didn't try 2.6.21 yet, but using ubuntu dapper kernel (2.6.15) i
> >> > can't reproduce the bug.
> >> > When i was using feisty kernel (2.6.20), i can reproduce in less than 5
> >> > minutes.
> >> >
> >> > I'm using lvs to loadbalance some mysql servers, i wrote a deamon that
> >> > check the synchro of the mysql replication on each slave and adjust
> >> > the wieght on the lvs every 500ms
> >>
> >> It does look a lot like there is some sort of locking problem in there.
> >> Would it be possible to send your kernel config, as the locking
> >> deatails to change a little with different configs.
> >
>
> About the kernel .config, i'm using the vanilla kernel "-server" from
> ubuntu feisty
>
> >If you also have some details of you ipvs configuration,
> >that might help narrow down which code-paths to investigate.
> >
>
> i attached the output of ipvsadm-save
>
> i'm adjusting the weight every 500ms by generating lines like this:
> -e -t 10.33.1.231:3306 -r 10.33.1.1 -w 100
Thanks.
> and piping all the needed changes in ipvadm -R
>
> it can represent something like 20 to 40 updates in one time.
>
> i also noticed that sometimes when i execute "ipvsadm" the display get
> locked in the middle for a second and then finish.
That is probably related. Though the fact that it eventually
exits seems to indicate that you're not hitting II.
> >II.
> >
> >ip_vs_set_ctl() does seem to leak svc->usecnt in one corner case,
> >but I doubt that is what you are seeing - if it was your ipvsadm
> >command(s) would hang. The problem is a bit wordy to describe,
> >but this fix should illustrate the problem.
> >
> >--- linux-2.6.orig/net/ipv4/ipvs/ip_vs_ctl.c
> >+++ linux-2.6/net/ipv4/ipvs/ip_vs_ctl.c
> >@@ -2000,7 +2000,7 @@ do_ip_vs_set_ctl(struct sock *sk, int cm
> > if (cmd != IP_VS_SO_SET_ADD
> > && (svc == NULL || svc->protocol != usvc->protocol)) {
> > ret = -ESRCH;
> >- goto out_unlock;
> >+ goto out_svc;
> > }
> >
> > switch (cmd) {
> >@@ -2034,9 +2034,9 @@ do_ip_vs_set_ctl(struct sock *sk, int cm
> > ret = -EINVAL;
> > }
> >
> >+ out_svc:
> > if (svc)
> > ip_vs_service_put(svc);
> >-
> > out_unlock:
> > mutex_unlock(&__ip_vs_mutex);
> > out_dec:
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2007-05-29 1:08 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <446f47c20705221019j34de745fj9aad8633a6edfd54@mail.gmail.com>
2007-05-25 8:46 ` [ipvs] BUG: soft lockup detected on CPU#3! Horms
2007-05-25 9:30 ` Sebastien Estienne
2007-05-26 2:22 ` Horms
2007-05-28 9:36 ` Horms
2007-05-28 14:20 ` Sebastien Estienne
2007-05-29 1:08 ` Horms
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).