netdev.vger.kernel.org archive mirror
* Re: [ipvs] BUG: soft lockup detected on CPU#3!
       [not found] <446f47c20705221019j34de745fj9aad8633a6edfd54@mail.gmail.com>
@ 2007-05-25  8:46 ` Horms
  2007-05-25  9:30   ` Sebastien Estienne
  0 siblings, 1 reply; 6+ messages in thread
From: Horms @ 2007-05-25  8:46 UTC (permalink / raw)
  To: Sebastien Estienne; +Cc: wensong, ja, nedev

[Adding netdev CC]

On Tue, May 22, 2007 at 07:19:44PM +0200, Sebastien Estienne wrote:
> Hello,
> 
> I have the following bug (stack trace in the attachment) and I think
> it's related to ipvs. I have reproduced it many times.
> 
> The kernel is from Ubuntu: "2.6.20-15-server SMP x86_64"
> on a Dell PowerEdge SC1425.
> 
> I can trigger this bug when I adjust the weight of the real servers
> every 500ms using ipvsadm -R and piping in the modifications.
> 
> There are 4 virtual servers with 55 real servers each.
> 
> any idea?

Not offhand, though it does look somewhat unpleasant :(
I'll try to reproduce the problem.

Is there any chance that you could try a vanilla (2.6.21) kernel
to see if the problem exists there too?
> 
> regards,
> -- 
> Sebastien Estienne

> BUG: soft lockup detected on CPU#3!
> 
> Call Trace:
>  <IRQ>  [softlockup_tick+249/288] softlockup_tick+0xf9/0x120
>  [update_process_times+87/144] update_process_times+0x57/0x90
>  [smp_local_timer_interrupt+52/96] smp_local_timer_interrupt+0x34/0x60
>  [smp_apic_timer_interrupt+89/128] smp_apic_timer_interrupt+0x59/0x80
>  [apic_timer_interrupt+102/112] apic_timer_interrupt+0x66/0x70
>  <EOI>  [_end+130125309/2130038908] :ip_vs:do_ip_vs_set_ctl+0xad1/0xbf0
>  [_end+130125305/2130038908] :ip_vs:do_ip_vs_set_ctl+0xacd/0xbf0
>  [nf_sockopt+233/304] nf_sockopt+0xe9/0x130
>  [nf_setsockopt+22/32] nf_setsockopt+0x16/0x20
>  [ip_setsockopt+118/160] ip_setsockopt+0x76/0xa0
>  [sys_setsockopt+166/240] sys_setsockopt+0xa6/0xf0
>  [system_call+126/131] system_call+0x7e/0x83
> 
> BUG: soft lockup detected on CPU#0!
> 
> Call Trace:
>  <IRQ>  [softlockup_tick+249/288] softlockup_tick+0xf9/0x120
>  [update_process_times+87/144] update_process_times+0x57/0x90
>  [smp_local_timer_interrupt+52/96] smp_local_timer_interrupt+0x34/0x60
>  [smp_apic_timer_interrupt+89/128] smp_apic_timer_interrupt+0x59/0x80
>  [_end+130330509/2130038908] :nf_conntrack:nf_ct_invert_tuple+0x51/0xa0
>  [apic_timer_interrupt+102/112] apic_timer_interrupt+0x66/0x70
>  [__read_lock_failed+5/32] __read_lock_failed+0x5/0x20
>  [_read_lock+11/16] _read_lock+0xb/0x10
>  [_end+130122092/2130038908] :ip_vs:ip_vs_service_get+0x20/0x1e0
>  [_end+130144732/2130038908] :ip_vs:tcp_conn_schedule+0xa0/0x150
>  [_end+130144313/2130038908] :ip_vs:tcp_conn_in_get+0x7d/0xc0
>  [_end+130111414/2130038908] :ip_vs:ip_vs_in+0xca/0x270
>  [nf_iterate+92/160] nf_iterate+0x5c/0xa0
>  [ip_local_deliver_finish+0/528] ip_local_deliver_finish+0x0/0x210
>  [nf_hook_slow+113/240] nf_hook_slow+0x71/0xf0
>  [ip_local_deliver_finish+0/528] ip_local_deliver_finish+0x0/0x210
>  [ip_local_deliver+111/656] ip_local_deliver+0x6f/0x290
>  [ip_rcv+1246/1360] ip_rcv+0x4de/0x550
>  [netif_receive_skb+623/800] netif_receive_skb+0x26f/0x320
>  [_end+129188049/2130038908] :e1000:e1000_clean_rx_irq+0x445/0x520
>  [_end+129183152/2130038908] :e1000:e1000_clean+0x84/0x2b0
>  [net_rx_action+186/512] net_rx_action+0xba/0x200
>  [__do_softirq+95/208] __do_softirq+0x5f/0xd0
>  [call_softirq+28/40] call_softirq+0x1c/0x28
>  [do_softirq+44/144] do_softirq+0x2c/0x90
>  [do_IRQ+217/256] do_IRQ+0xd9/0x100
>  [mwait_idle+0/80] mwait_idle+0x0/0x50
>  [ret_from_intr+0/10] ret_from_intr+0x0/0xa
>  <EOI>  [tcp_poll+0/336] tcp_poll+0x0/0x150
>  [mwait_idle+66/80] mwait_idle+0x42/0x50
>  [cpu_idle+155/208] cpu_idle+0x9b/0xd0
>  [start_kernel+586/608] start_kernel+0x24a/0x260
>  [x86_64_start_kernel+358/368] _sinittext+0x166/0x170
> 


-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/



* Re: [ipvs] BUG: soft lockup detected on CPU#3!
  2007-05-25  8:46 ` [ipvs] BUG: soft lockup detected on CPU#3! Horms
@ 2007-05-25  9:30   ` Sebastien Estienne
  2007-05-26  2:22     ` Horms
  0 siblings, 1 reply; 6+ messages in thread
From: Sebastien Estienne @ 2007-05-25  9:30 UTC (permalink / raw)
  To: Horms; +Cc: wensong, ja, nedev

Hello,

On 5/25/07, Horms <horms@verge.net.au> wrote:
> [Adding netdev CC]
>
> On Tue, May 22, 2007 at 07:19:44PM +0200, Sebastien Estienne wrote:
> > Hello,
> >
> > I have the following bug (stack trace in the attachment) and I think
> > it's related to ipvs. I have reproduced it many times.
> >
> > The kernel is from Ubuntu: "2.6.20-15-server SMP x86_64"
> > on a Dell PowerEdge SC1425.
> >
> > I can trigger this bug when I adjust the weight of the real servers
> > every 500ms using ipvsadm -R and piping in the modifications.
> >
> > There are 4 virtual servers with 55 real servers each.
> >
> > any idea?
>
> Not offhand, though it does look somewhat unpleasant :(
> I'll try to reproduce the problem.
>
> Is there any chance that you could try a vanilla (2.6.21) kernel
> to see if the problem exists there too?

I haven't tried 2.6.21 yet, but with the Ubuntu Dapper kernel (2.6.15) I
can't reproduce the bug.
With the Feisty kernel (2.6.20), I can reproduce it in less than 5 minutes.

I'm using LVS to load-balance some MySQL servers. I wrote a daemon that
checks the synchronization of the MySQL replication on each slave and adjusts
the weight on the LVS every 500ms.


-- 
Sebastien Estienne


* Re: [ipvs] BUG: soft lockup detected on CPU#3!
  2007-05-25  9:30   ` Sebastien Estienne
@ 2007-05-26  2:22     ` Horms
  2007-05-28  9:36       ` Horms
  0 siblings, 1 reply; 6+ messages in thread
From: Horms @ 2007-05-26  2:22 UTC (permalink / raw)
  To: Sebastien Estienne; +Cc: wensong, ja, nedev

On Fri, May 25, 2007 at 09:30:52AM +0000, Sebastien Estienne wrote:
> 
> I haven't tried 2.6.21 yet, but with the Ubuntu Dapper kernel (2.6.15) I
> can't reproduce the bug.
> With the Feisty kernel (2.6.20), I can reproduce it in less than 5
> minutes.
> 
> I'm using LVS to load-balance some MySQL servers. I wrote a daemon that
> checks the synchronization of the MySQL replication on each slave and
> adjusts the weight on the LVS every 500ms.

It does look a lot like there is some sort of locking problem in there.
Would it be possible to send your kernel config? The locking
details do change a little with different configs.

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/



* Re: [ipvs] BUG: soft lockup detected on CPU#3!
  2007-05-26  2:22     ` Horms
@ 2007-05-28  9:36       ` Horms
  2007-05-28 14:20         ` Sebastien Estienne
  0 siblings, 1 reply; 6+ messages in thread
From: Horms @ 2007-05-28  9:36 UTC (permalink / raw)
  To: Sebastien Estienne; +Cc: wensong, ja, nedev

On Sat, May 26, 2007 at 11:22:40AM +0900, Horms wrote:
> On Fri, May 25, 2007 at 09:30:52AM +0000, Sebastien Estienne wrote:
> > 
> > I haven't tried 2.6.21 yet, but with the Ubuntu Dapper kernel (2.6.15) I
> > can't reproduce the bug.
> > With the Feisty kernel (2.6.20), I can reproduce it in less than 5
> > minutes.
> > 
> > I'm using LVS to load-balance some MySQL servers. I wrote a daemon that
> > checks the synchronization of the MySQL replication on each slave and
> > adjusts the weight on the LVS every 500ms.
> 
> It does look a lot like there is some sort of locking problem in there.
> Would it be possible to send your kernel config? The locking
> details do change a little with different configs.

If you also have some details of your ipvs configuration,
that might help narrow down which code paths to investigate.

I spent some time this afternoon looking into this problem,
and what I think is happening is:

  1. Due to your weight-update operations, one processor
     is sitting in ip_vs_edit_dest() called by do_ip_vs_set_ctl(),
     holding write_lock_bh(&__ip_vs_svc_lock) and waiting
     for svc->usecnt to go down to 1.

  2. Another processor is trying to grab
     read_lock(&__ip_vs_svc_lock) in ip_vs_service_get(),
     called from tcp_conn_schedule() and in turn ip_vs_in().

  I guess that for some reason svc->usecnt isn't going down to 1,
  though I haven't been able to isolate anything particularly
  interesting.

That said, the locking isn't that simple, IMHO, so there seems
to be quite a lot of scope for errors.


Some things that are of minor interest are:

I.
ip_vs_edit_dest() loops with the following construct:

  while (atomic_read(&svc->usecnt) > 1) {};

whereas similar code in the same file uses

  IP_VS_WAIT_WHILE(atomic_read(&svc->usecnt) > 1);

which expands to

  while (atomic_read(&svc->usecnt) > 1) { cpu_relax(); }

But I doubt this is a problem, except for burning the CPU a bit harder
than it needs to.
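
To make the macro form concrete, here is a minimal userspace sketch
(not kernel code). WAIT_WHILE, relax() and drain_readers() are
hypothetical names; relax() stands in for cpu_relax() and, because the
sketch is single-threaded, it also plays the part of the other CPU
dropping its reference, which is what lets the loop terminate.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for svc->usecnt. */
static atomic_int usecnt;
static long relax_calls;

/*
 * Stand-in for cpu_relax(). In this single-threaded sketch it also
 * simulates another CPU calling ip_vs_service_put(): each call drops
 * one outstanding reader reference, if there is one.
 */
static void relax(void)
{
	relax_calls++;
	if (atomic_load(&usecnt) > 1)
		atomic_fetch_sub(&usecnt, 1);
}

/* Same shape as the expansion quoted above. */
#define WAIT_WHILE(expr) while (expr) { relax(); }

/*
 * Start with `readers` references on top of the writer's own one,
 * wait for them to drain, and return how many times relax() ran.
 */
static long drain_readers(int readers)
{
	assert(readers >= 0);
	relax_calls = 0;
	atomic_store(&usecnt, 1 + readers);
	WAIT_WHILE(atomic_load(&usecnt) > 1);
	return relax_calls;
}
```

Calling drain_readers(3) runs the loop body three times and leaves
usecnt at 1. If nothing ever dropped the extra references, the loop
would spin forever, which is the symptom suspected in the trace.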

II.

do_ip_vs_set_ctl() does seem to leak svc->usecnt in one corner case,
but I doubt that is what you are seeing: if it were, your ipvsadm
command(s) would hang. In short, the -ESRCH path can be taken with a
service reference held (service found, protocol mismatch) and it skips
ip_vs_service_put(). This fix should illustrate the problem.

--- linux-2.6.orig/net/ipv4/ipvs/ip_vs_ctl.c
+++ linux-2.6/net/ipv4/ipvs/ip_vs_ctl.c
@@ -2000,7 +2000,7 @@ do_ip_vs_set_ctl(struct sock *sk, int cm
 	if (cmd != IP_VS_SO_SET_ADD
 	    && (svc == NULL || svc->protocol != usvc->protocol)) {
 		ret = -ESRCH;
-		goto out_unlock;
+		goto out_svc;
 	}
 
 	switch (cmd) {
@@ -2034,9 +2034,9 @@ do_ip_vs_set_ctl(struct sock *sk, int cm
 		ret = -EINVAL;
 	}
 
+  out_svc:
 	if (svc)
 		ip_vs_service_put(svc);
-
   out_unlock:
 	mutex_unlock(&__ip_vs_mutex);
   out_dec:
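
The control flow the patch changes can be sketched in userspace as
follows. This is a simplified model, not the kernel function: struct
svc, svc_get(), svc_put() and the `patched` flag are hypothetical, and
the lookup is assumed to always find the service.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, simplified stand-in for struct ip_vs_service. */
struct svc {
	int usecnt;
	int protocol;
};

static struct svc the_svc = { .usecnt = 1, .protocol = 6 };

/* Lookup that takes a reference, like ip_vs_service_get(). */
static struct svc *svc_get(void)
{
	the_svc.usecnt++;
	return &the_svc;
}

/* Drop a reference, like ip_vs_service_put(). */
static void svc_put(struct svc *s)
{
	assert(s->usecnt > 0);
	s->usecnt--;
}

/*
 * patched == 0 models the old jump to out_unlock, which skips the put;
 * patched != 0 models the new jump to out_svc.
 */
static int set_ctl(int protocol, int patched)
{
	int ret = 0;
	struct svc *svc = svc_get();

	if (svc->protocol != protocol) {
		ret = -ESRCH;
		if (patched)
			goto out_svc;
		goto out_unlock;	/* pre-patch: reference is leaked */
	}

	/* ... the real function dispatches on cmd here ... */

out_svc:
	svc_put(svc);
out_unlock:
	return ret;
}
```

With patched == 0, each protocol mismatch leaves usecnt one higher than
before, so a later wait for usecnt to reach 1 could never finish. With
patched != 0 the count returns to its previous value.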

III.

Perhaps if you are calling ipvsadm a lot then there is a remote
possibility that write_lock_bh() could starve read_lock(). This
seems ludicrous, but I'm just mentioning it as it crossed my mind.

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/



* Re: [ipvs] BUG: soft lockup detected on CPU#3!
  2007-05-28  9:36       ` Horms
@ 2007-05-28 14:20         ` Sebastien Estienne
  2007-05-29  1:08           ` Horms
  0 siblings, 1 reply; 6+ messages in thread
From: Sebastien Estienne @ 2007-05-28 14:20 UTC (permalink / raw)
  To: Horms; +Cc: wensong, ja, nedev

[-- Attachment #1: Type: text/plain, Size: 3819 bytes --]

On 5/28/07, Horms <horms@verge.net.au> wrote:
> On Sat, May 26, 2007 at 11:22:40AM +0900, Horms wrote:
> > On Fri, May 25, 2007 at 09:30:52AM +0000, Sebastien Estienne wrote:
> > >
> > > I haven't tried 2.6.21 yet, but with the Ubuntu Dapper kernel (2.6.15) I
> > > can't reproduce the bug.
> > > With the Feisty kernel (2.6.20), I can reproduce it in less than 5
> > > minutes.
> > >
> > > I'm using LVS to load-balance some MySQL servers. I wrote a daemon that
> > > checks the synchronization of the MySQL replication on each slave and
> > > adjusts the weight on the LVS every 500ms.
> >
> > It does look a lot like there is some sort of locking problem in there.
> > Would it be possible to send your kernel config? The locking
> > details do change a little with different configs.
>

About the kernel .config, I'm using the stock "-server" kernel from
Ubuntu Feisty.

> If you also have some details of your ipvs configuration,
> that might help narrow down which code paths to investigate.
>

I attached the output of ipvsadm-save.

I'm adjusting the weight every 500ms by generating lines like this:
-e -t 10.33.1.231:3306 -r 10.33.1.1 -w 100

and piping all the needed changes into ipvsadm -R.

It can amount to something like 20 to 40 updates at a time.
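
As a rough illustration of how one such line might be generated (a
sketch only, not the actual daemon; format_weight_update() and its
parameters are hypothetical names):

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format one edit-destination line in the form shown above, ready to
 * be written to the stdin of `ipvsadm -R`. Returns the number of
 * characters written, or -1 if the buffer is too small.
 */
static int format_weight_update(char *buf, size_t size,
				const char *vip_port, const char *rip,
				int weight)
{
	int n;

	assert(buf != NULL && weight >= 0);
	n = snprintf(buf, size, "-e -t %s -r %s -w %d\n",
		     vip_port, rip, weight);
	return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

One batch would then be 20 to 40 such lines written to a single pipe,
for example one opened with popen("ipvsadm -R", "w").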

I also noticed that sometimes when I execute "ipvsadm" the display gets
locked in the middle for a second and then finishes.


-- 
Sebastien Estienne

[-- Attachment #2: ipvsadm-save.txt --]
[-- Type: text/plain, Size: 10712 bytes --]

-A -t 10.33.1.231:3306 -s wlc
-a -t 10.33.1.231:3306 -r 10.33.1.59:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.58:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.57:3306 -g -w 100
-a -t 10.33.1.231:3306 -r 10.33.1.56:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.55:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.54:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.53:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.52:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.51:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.50:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.49:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.48:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.47:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.46:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.45:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.44:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.43:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.42:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.41:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.40:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.39:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.38:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.37:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.36:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.34:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.33:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.32:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.31:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.30:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.29:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.28:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.27:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.26:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.25:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.24:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.23:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.22:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.21:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.20:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.19:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.18:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.17:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.16:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.14:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.13:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.12:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.9:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.4:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.3:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.2:3306 -g -w 0
-a -t 10.33.1.231:3306 -r 10.33.1.1:3306 -g -w 0
-A -t 10.33.1.232:3306 -s wlc
-a -t 10.33.1.232:3306 -r 10.33.1.59:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.58:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.57:3306 -g -w 100
-a -t 10.33.1.232:3306 -r 10.33.1.56:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.55:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.54:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.53:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.52:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.51:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.50:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.49:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.48:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.47:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.46:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.45:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.44:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.43:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.42:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.41:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.40:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.39:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.38:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.37:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.36:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.34:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.33:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.32:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.31:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.30:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.29:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.28:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.27:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.26:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.25:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.24:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.23:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.22:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.21:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.20:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.19:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.18:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.17:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.16:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.14:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.13:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.12:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.9:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.4:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.3:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.2:3306 -g -w 0
-a -t 10.33.1.232:3306 -r 10.33.1.1:3306 -g -w 0
-A -t 10.33.1.221:3306 -s wlc
-a -t 10.33.1.221:3306 -r 10.33.1.59:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.58:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.57:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.56:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.55:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.54:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.53:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.52:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.51:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.50:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.49:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.48:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.47:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.46:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.45:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.44:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.43:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.42:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.41:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.40:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.39:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.38:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.37:3306 -g -w 0
-a -t 10.33.1.221:3306 -r 10.33.1.36:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.34:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.33:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.32:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.31:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.30:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.29:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.28:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.27:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.26:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.25:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.24:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.23:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.22:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.21:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.20:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.19:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.18:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.17:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.16:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.14:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.13:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.12:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.9:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.4:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.3:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.2:3306 -g -w 100
-a -t 10.33.1.221:3306 -r 10.33.1.1:3306 -g -w 100
-A -t 10.33.1.222:3306 -s wlc
-a -t 10.33.1.222:3306 -r 10.33.1.59:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.58:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.57:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.56:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.55:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.54:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.53:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.52:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.51:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.50:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.49:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.48:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.47:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.46:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.45:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.44:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.43:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.42:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.41:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.40:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.39:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.38:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.37:3306 -g -w 0
-a -t 10.33.1.222:3306 -r 10.33.1.36:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.34:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.33:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.32:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.31:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.30:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.29:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.28:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.27:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.26:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.25:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.24:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.23:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.22:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.21:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.20:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.19:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.18:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.17:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.16:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.14:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.13:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.12:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.9:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.4:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.3:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.2:3306 -g -w 100
-a -t 10.33.1.222:3306 -r 10.33.1.1:3306 -g -w 100


* Re: [ipvs] BUG: soft lockup detected on CPU#3!
  2007-05-28 14:20         ` Sebastien Estienne
@ 2007-05-29  1:08           ` Horms
  0 siblings, 0 replies; 6+ messages in thread
From: Horms @ 2007-05-29  1:08 UTC (permalink / raw)
  To: Sebastien Estienne; +Cc: wensong, ja, nedev

On Mon, May 28, 2007 at 04:20:32PM +0200, Sebastien Estienne wrote:
> On 5/28/07, Horms <horms@verge.net.au> wrote:
> >On Sat, May 26, 2007 at 11:22:40AM +0900, Horms wrote:
> >> On Fri, May 25, 2007 at 09:30:52AM +0000, Sebastien Estienne wrote:
> >> >
> >> > I haven't tried 2.6.21 yet, but with the Ubuntu Dapper kernel (2.6.15) I
> >> > can't reproduce the bug.
> >> > With the Feisty kernel (2.6.20), I can reproduce it in less than 5
> >> > minutes.
> >> >
> >> > I'm using LVS to load-balance some MySQL servers. I wrote a daemon that
> >> > checks the synchronization of the MySQL replication on each slave and
> >> > adjusts the weight on the LVS every 500ms.
> >>
> >> It does look a lot like there is some sort of locking problem in there.
> >> Would it be possible to send your kernel config? The locking
> >> details do change a little with different configs.
> >
> 
> About the kernel .config, I'm using the stock "-server" kernel from
> Ubuntu Feisty.
> 
> >If you also have some details of your ipvs configuration,
> >that might help narrow down which code paths to investigate.
> >
> 
> I attached the output of ipvsadm-save.
> 
> I'm adjusting the weight every 500ms by generating lines like this:
> -e -t 10.33.1.231:3306 -r 10.33.1.1 -w 100

Thanks.

> and piping all the needed changes into ipvsadm -R.
> 
> It can amount to something like 20 to 40 updates at a time.
> 
> I also noticed that sometimes when I execute "ipvsadm" the display gets
> locked in the middle for a second and then finishes.

That is probably related, though the fact that it eventually
exits seems to indicate that you're not hitting II.

> >II.
> >
> >do_ip_vs_set_ctl() does seem to leak svc->usecnt in one corner case,
> >but I doubt that is what you are seeing: if it were, your ipvsadm
> >command(s) would hang. The problem is a bit wordy to describe,
> >but this fix should illustrate the problem.
> >
> >--- linux-2.6.orig/net/ipv4/ipvs/ip_vs_ctl.c
> >+++ linux-2.6/net/ipv4/ipvs/ip_vs_ctl.c
> >@@ -2000,7 +2000,7 @@ do_ip_vs_set_ctl(struct sock *sk, int cm
> >        if (cmd != IP_VS_SO_SET_ADD
> >            && (svc == NULL || svc->protocol != usvc->protocol)) {
> >                ret = -ESRCH;
> >-               goto out_unlock;
> >+               goto out_svc;
> >        }
> >
> >        switch (cmd) {
> >@@ -2034,9 +2034,9 @@ do_ip_vs_set_ctl(struct sock *sk, int cm
> >                ret = -EINVAL;
> >        }
> >
> >+  out_svc:
> >        if (svc)
> >                ip_vs_service_put(svc);
> >-
> >   out_unlock:
> >        mutex_unlock(&__ip_vs_mutex);
> >   out_dec:

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/

