* Re: Linux 2.4.32-rc2
2005-11-02 11:02 ` Roberto Nibali
@ 2005-11-02 12:29 ` Willy Tarreau
2005-11-03 10:19 ` Roberto Nibali
2005-11-03 5:43 ` Willy Tarreau
2005-11-03 6:41 ` Willy TARREAU
2 siblings, 1 reply; 15+ messages in thread
From: Willy Tarreau @ 2005-11-02 12:29 UTC
To: Roberto Nibali; +Cc: Marcelo Tosatti, Grant Coady, linux-kernel
Hi Roberto,
On Wed, Nov 02, 2005 at 12:02:39PM +0100, Roberto Nibali wrote:
> Bonjour Willy,
>
> >>Willy, if you have time, could you check your non-i386 boxes with a
> >>2.95.x compiled 2.4.x kernel, with IPVS enabled?
> >
> > Yes, no problem, but you'll have to tell me what to test ! (a config
> > or script will save me some time). I have a Sun Ultra60 (ultrasparc SMP)
> > which matches your description. I just have a doubt about gcc-2.95
> > availability on this box, I know I have a 3.3.6, do you think that the
> > problem is gcc-related (too strong optimization or de-inlining, etc) ?
>
> At least the following should be set; the rest you can leave to your gusto:
>
> CONFIG_ACPI=y
> CONFIG_ACPI_BOOT=y
> CONFIG_ACPI_BUS=y
> CONFIG_ACPI_INTERPRETER=y
> CONFIG_ACPI_EC=y
> CONFIG_ACPI_POWER=y
> CONFIG_ACPI_PCI=y
> CONFIG_ACPI_MMCONFIG=y
> CONFIG_ACPI_SLEEP=y
> CONFIG_ACPI_SYSTEM=y
But this is purely x86-related, I won't have it on sparc.
> CONFIG_IP_VS=m
> CONFIG_IP_VS_DEBUG=y
> CONFIG_IP_VS_TAB_BITS=12
> CONFIG_IP_VS_RR=m
> CONFIG_IP_VS_WRR=m
> CONFIG_IP_VS_LC=m
> CONFIG_IP_VS_WLC=m
> CONFIG_IP_VS_LBLC=m
> CONFIG_IP_VS_LBLCR=m
> CONFIG_IP_VS_DH=m
> CONFIG_IP_VS_SH=m
> CONFIG_IP_VS_SED=m
> CONFIG_IP_VS_NQ=m
> CONFIG_IP_VS_HPRIO=m
> CONFIG_IP_VS_FTP=m
>
> One issue is a possible C99'ism in the last IPVS patch. If you find
> time, please have a 2.95.x compiler installed.
You mean that it's a build issue ? I first thought that you got erroneous
behaviour.
> Another thing that could fail is if you additionally set
>
> CONFIG_ACPI_FAN=m
>
> and compile with CFLAGS="-g -ggdb"
will test too
> > Please keep us informed when you have more info.
>
> I will, and I will get more details, as time permits. My beef with the
> IPVS code seems to be wrong, the code works as expected so far. I'm
> stress-testing it though until Sunday on a 4GB Dual P4 Xeon with HT combo.
How could I stress it ? what ipvs config, what type of traffic ? I'm used
to stress-test firewalls and load-balancers, but there is a wide choice of
possibilities, and all cannot be explored in a short timeframe.
Regards,
Willy
* Re: Linux 2.4.32-rc2
2005-11-02 12:29 ` Willy Tarreau
@ 2005-11-03 10:19 ` Roberto Nibali
2005-11-04 0:09 ` Willy Tarreau
0 siblings, 1 reply; 15+ messages in thread
From: Roberto Nibali @ 2005-11-03 10:19 UTC
To: Willy Tarreau; +Cc: Marcelo Tosatti, Grant Coady, linux-kernel
>>CONFIG_ACPI=y
>>CONFIG_ACPI_BOOT=y
>>CONFIG_ACPI_BUS=y
>>CONFIG_ACPI_INTERPRETER=y
>>CONFIG_ACPI_EC=y
>>CONFIG_ACPI_POWER=y
>>CONFIG_ACPI_PCI=y
>>CONFIG_ACPI_MMCONFIG=y
>>CONFIG_ACPI_SLEEP=y
>>CONFIG_ACPI_SYSTEM=y
>
> But this is purely x86-related, I won't have it on sparc.
Indeed ;).
>>CONFIG_IP_VS=m
>>CONFIG_IP_VS_DEBUG=y
>>CONFIG_IP_VS_TAB_BITS=12
>>CONFIG_IP_VS_RR=m
>>CONFIG_IP_VS_WRR=m
>>CONFIG_IP_VS_LC=m
>>CONFIG_IP_VS_WLC=m
>>CONFIG_IP_VS_LBLC=m
>>CONFIG_IP_VS_LBLCR=m
>>CONFIG_IP_VS_DH=m
>>CONFIG_IP_VS_SH=m
>>CONFIG_IP_VS_SED=m
>>CONFIG_IP_VS_NQ=m
>>CONFIG_IP_VS_HPRIO=m
>>CONFIG_IP_VS_FTP=m
>>
>>One issue is a possible C99'ism in the last IPVS patch. If you find
>>time, please have a 2.95.x compiler installed.
>
> You mean that it's a build issue ? I first thought that you got erroneous
> behaviour.
Yes, the erroneous stuff I'm tracking down and it looks like I've found
it (actually, Julian Anastasov fixed it):
diff -ur v2.4.32-rc2/linux/net/ipv4/ipvs/ip_vs_core.c linux/net/ipv4/ipvs/ip_vs_core.c
--- v2.4.32-rc2/linux/net/ipv4/ipvs/ip_vs_core.c 2005-11-03 01:20:02.000000000 +0200
+++ linux/net/ipv4/ipvs/ip_vs_core.c 2005-11-03 01:22:36.347895544 +0200
@@ -1111,11 +1111,10 @@
         if (sysctl_ip_vs_expire_nodest_conn) {
             /* try to expire the connection immediately */
             ip_vs_conn_expire_now(cp);
-        } else {
-            /* don't restart its timer, and silently
-               drop the packet. */
-            __ip_vs_conn_put(cp);
         }
+        /* don't restart its timer, and silently
+           drop the packet. */
+        __ip_vs_conn_put(cp);
         return NF_DROP;
     }
I will send a proper signed-off and acked-by patch against rc2 after
some more stress testing. So, please hold off releasing until then. I
will be done testing this piece of code by tomorrow noon (GMT+1).
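To make the effect of the diff concrete, here is a minimal userspace sketch of its control flow. Everything in it (struct conn, conn_get, conn_put, conn_expire_now, drop_path_old, drop_path_new, leaked_refs) is a hypothetical stand-in and not the kernel code. It also assumes that expiring a connection does not itself release the caller's reference, which the diff implies but this thread does not spell out.

```c
#include <assert.h>

/* Hypothetical model of a reference-counted connection entry. */
struct conn {
    int refcnt;      /* references currently held */
    int expire_now;  /* set when immediate expiry was requested */
};

enum { VERDICT_DROP = 0 };

static void conn_get(struct conn *cp) { cp->refcnt++; }
static void conn_put(struct conn *cp) { cp->refcnt--; }

/* Assumption: requesting expiry does not drop the caller's reference. */
static void conn_expire_now(struct conn *cp) { cp->expire_now = 1; }

/* Shape before the patch: the reference is dropped only in the else branch. */
static int drop_path_old(struct conn *cp, int expire_nodest_conn)
{
    if (expire_nodest_conn) {
        conn_expire_now(cp);
    } else {
        conn_put(cp);
    }
    return VERDICT_DROP;
}

/* Shape after the patch: the reference is dropped on both paths. */
static int drop_path_new(struct conn *cp, int expire_nodest_conn)
{
    if (expire_nodest_conn)
        conn_expire_now(cp);
    conn_put(cp);
    return VERDICT_DROP;
}

/* Returns the reference count left over after one lookup plus drop path. */
static int leaked_refs(int use_fixed, int expire_nodest_conn)
{
    struct conn c = { 0, 0 };

    conn_get(&c);              /* the lookup takes one reference */
    assert(c.refcnt == 1);
    if (use_fixed)
        drop_path_new(&c, expire_nodest_conn);
    else
        drop_path_old(&c, expire_nodest_conn);
    return c.refcnt;
}
```

Under these assumptions, leaked_refs(0, 1) leaves one reference behind, which is the kind of imbalance that would keep an entry from ever being freed, while leaked_refs(1, 0) and leaked_refs(1, 1) both return 0.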
What I wasn't sure about is whether the latest patches still compile with
gcc 2.95.x. That's the only thing I wanted you to test. I cannot ask you to
run fully fledged LVS tests, as this requires quite some setup time.
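The thread does not say which construct was suspected. Purely as an illustration, a common C99-ism that gcc 2.95.x is generally expected to reject in C code is a declaration placed after a statement or inside a for-loop header. The function names below (sum_c99, sum_c89, check_equivalent) are hypothetical and unrelated to the IPVS patch.

```c
#include <assert.h>

/* C99 form: one declaration follows a statement, another sits in the
 * for header. Assumption: gcc 2.95.x fails to parse this in C mode. */
static int sum_c99(const int *v, int n)
{
    if (n < 0)
        n = 0;
    int total = 0;                 /* declaration after a statement */
    for (int i = 0; i < n; i++)    /* declaration in the for header */
        total += v[i];
    return total;
}

/* C89-compatible form: all declarations at the top of the block. */
static int sum_c89(const int *v, int n)
{
    int total = 0;
    int i;

    if (n < 0)
        n = 0;
    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Both forms compute the same result; only the syntax differs. */
static void check_equivalent(const int *v, int n)
{
    assert(sum_c99(v, n) == sum_c89(v, n));
}
```

A test build with the older compiler would catch the first form at compile time, which is why a plain compile check is enough for this concern and no LVS setup is needed.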
> How could I stress it ? what ipvs config, what type of traffic ? I'm used
> to stress-test firewalls and load-balancers, but there is a wide choice of
> possibilities, and all cannot be explored in a short timeframe.
You would need to test IPVS on an SMP box using a persistent setup with
the 0 port feature and the expire_nodest_conn proc-fs entry set to 1. Hit
the LB with 100Mbit/s traffic balancing it on 2-3 RS and reload the
configuration using ipvsadm, _but_ without rmmod'ing the ip_vs_* kernel
modules. Set the persistency timeout low (60 secs) and the
timeout_finwait to 10*HZ. You need 2 clients which connect over a Linux
router to an LVS_DR setup; one needs to be routed through and the other
should be NAT'd on the Linux router using a NAT pool to simulate 100's
of clients. This way you have the slashdot-hype and the AOL proxy boost
hitting your LB and generating loaded persistency templates which will
then hit the code in question when the internal timer expires. You need
to grep for NONE in ipvsadm -L -n -c to get the template entries. You
must stop the client connecting directly through the Linux router after
you reloaded the LB setup and then you observe the persistent template
created for this client until the timer expires. Then you start it again
and with luck you should see the aberrant behaviour of a missed
__ip_vs_conn_put(cp) :). I am pretty sure you do not want to go through
this setup. I have it here and I'm stress testing all possible
combinations of this scenario.
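The "grep for NONE" step can be sketched as a small filter over captured connection-table output. This is an illustration only: count_templates is a hypothetical helper, and it assumes each entry line has the protocol, the expiry time and the state as its first three whitespace-separated fields, which is how ipvsadm -L -n -c output is commonly laid out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count the lines whose third whitespace-separated field is "NONE".
 * Assumed line layout: pro expire state source virtual destination */
static int count_templates(const char *text)
{
    int count = 0;
    const char *line = text;

    assert(text != NULL);
    while (*line) {
        char buf[256];
        char pro[16], expire[16], state[32];
        size_t len = strcspn(line, "\n");
        size_t n = len < sizeof(buf) - 1 ? len : sizeof(buf) - 1;

        /* Copy one line so sscanf cannot read into the next one. */
        memcpy(buf, line, n);
        buf[n] = '\0';
        if (sscanf(buf, "%15s %15s %31s", pro, expire, state) == 3 &&
            strcmp(state, "NONE") == 0)
            count++;

        line += len;
        if (*line == '\n')
            line++;
    }
    return count;
}
```

Watching this count over time, after stopping the directly routed client, is one way to check whether a template entry goes away once its timer expires or stays around.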
Thanks for your help, Willy.
A bientôt,
Roberto Nibali, ratz
--
-------------------------------------------------------------
addr://Kasinostrasse 30, CH-5001 Aarau tel://++41 62 823 9355
http://www.terreactive.com fax://++41 62 823 9356
-------------------------------------------------------------
terreActive AG Wir sichern Ihren Erfolg
-------------------------------------------------------------
* Re: Linux 2.4.32-rc2
2005-11-03 10:19 ` Roberto Nibali
@ 2005-11-04 0:09 ` Willy Tarreau
2005-11-04 8:50 ` Roberto Nibali
0 siblings, 1 reply; 15+ messages in thread
From: Willy Tarreau @ 2005-11-04 0:09 UTC
To: Roberto Nibali; +Cc: Marcelo Tosatti, Grant Coady, linux-kernel
Hi Roberto,
On Thu, Nov 03, 2005 at 11:19:38AM +0100, Roberto Nibali wrote:
> >>CONFIG_ACPI=y
> >>CONFIG_ACPI_BOOT=y
> >>CONFIG_ACPI_BUS=y
> >>CONFIG_ACPI_INTERPRETER=y
> >>CONFIG_ACPI_EC=y
> >>CONFIG_ACPI_POWER=y
> >>CONFIG_ACPI_PCI=y
> >>CONFIG_ACPI_MMCONFIG=y
> >>CONFIG_ACPI_SLEEP=y
> >>CONFIG_ACPI_SYSTEM=y
> >
> > But this is purely x86-related, I won't have it on sparc.
>
> Indeed ;).
No problem.
> >>CONFIG_IP_VS=m
> >>CONFIG_IP_VS_DEBUG=y
> >>CONFIG_IP_VS_TAB_BITS=12
> >>CONFIG_IP_VS_RR=m
> >>CONFIG_IP_VS_WRR=m
> >>CONFIG_IP_VS_LC=m
> >>CONFIG_IP_VS_WLC=m
> >>CONFIG_IP_VS_LBLC=m
> >>CONFIG_IP_VS_LBLCR=m
> >>CONFIG_IP_VS_DH=m
> >>CONFIG_IP_VS_SH=m
> >>CONFIG_IP_VS_SED=m
> >>CONFIG_IP_VS_NQ=m
> >>CONFIG_IP_VS_HPRIO=m
> >>CONFIG_IP_VS_FTP=m
> >>
> >>One issue is a possible C99'ism in the last IPVS patch. If you find
> >>time, please have a 2.95.x compiler installed.
> >
> > You mean that it's a build issue ? I first thought that you got erroneous
> > behaviour.
>
> Yes, the erroneous stuff I'm tracking down and it looks like I've found
> it (actually, Julian Anastasov fixed it):
>
> diff -ur v2.4.32-rc2/linux/net/ipv4/ipvs/ip_vs_core.c linux/net/ipv4/ipvs/ip_vs_core.c
> --- v2.4.32-rc2/linux/net/ipv4/ipvs/ip_vs_core.c 2005-11-03 01:20:02.000000000 +0200
> +++ linux/net/ipv4/ipvs/ip_vs_core.c 2005-11-03 01:22:36.347895544 +0200
> @@ -1111,11 +1111,10 @@
>          if (sysctl_ip_vs_expire_nodest_conn) {
>              /* try to expire the connection immediately */
>              ip_vs_conn_expire_now(cp);
> -        } else {
> -            /* don't restart its timer, and silently
> -               drop the packet. */
> -            __ip_vs_conn_put(cp);
>          }
> +        /* don't restart its timer, and silently
> +           drop the packet. */
> +        __ip_vs_conn_put(cp);
>          return NF_DROP;
>      }
>
> I will send a proper signed-off and acked-by patch against rc2 after
> some more stress testing. So, please hold off releasing until then. I
> will be done testing this piece of code by tomorrow noon (GMT+1).
OK, fine. I'll merge it into next -hf (probably within a few days). Please
insist loudly if you consider it important to fix quickly because it's a
real regression, as I don't want to have people wait for long if one hotfix
causes trouble.
> What I wasn't sure about is whether the latest patches still compile with
> gcc 2.95.x. That's the only thing I wanted you to test.
OK, if it was your only concern, then I can say that it compiles and
runs on x86.
> I cannot ask you to run fully fledged LVS tests, as this requires
> quite some setup time.
I know this, that's why I asked about the setup, config files and
test scenario :-)
> > How could I stress it ? what ipvs config, what type of traffic ? I'm used
> > to stress-test firewalls and load-balancers, but there is a wide choice of
> > possibilities, and all cannot be explored in a short timeframe.
>
> You would need to test IPVS on an SMP box using a persistent setup with
> the 0 port feature and the expire_nodest_conn proc-fs entry set to 1. Hit
> the LB with 100Mbit/s traffic balancing it on 2-3 RS and reload the
> configuration using ipvsadm, _but_ without rmmod'ing the ip_vs_* kernel
> modules. Set the persistency timeout low (60 secs) and the
> timeout_finwait to 10*HZ. You need 2 clients which connect over a Linux
> router to an LVS_DR setup; one needs to be routed through and the other
> should be NAT'd on the Linux router using a NAT pool to simulate 100's
> of clients. This way you have the slashdot-hype and the AOL proxy boost
> hitting your LB and generating loaded persistency templates which will
> then hit the code in question when the internal timer expires. You need
> to grep for NONE in ipvsadm -L -n -c to get the template entries. You
> must stop the client connecting directly through the Linux router after
> you reloaded the LB setup and then you observe the persistent template
> created for this client until the timer expires. Then you start it again
> and with luck you should see the aberrant behaviour of a missed
> __ip_vs_conn_put(cp) :). I am pretty sure you do not want to go through
> this setup. I have it here and I'm stress testing all possible
> combinations of this scenario.
Of course this is not the easiest setup just to chase a bug down. But with
such an explanation, a good manual on IPVS, and a lot of spare time, it
could be done if it was the only solution. But I'm not willing to spend
so much time on this yet :-)
> Thanks for your help, Willy.
You're welcome.
> A bientôt,
> Roberto Nibali, ratz
Cheers,
Willy
* Re: Linux 2.4.32-rc2
2005-11-04 0:09 ` Willy Tarreau
@ 2005-11-04 8:50 ` Roberto Nibali
0 siblings, 0 replies; 15+ messages in thread
From: Roberto Nibali @ 2005-11-04 8:50 UTC
To: Willy Tarreau; +Cc: Marcelo Tosatti, Grant Coady, linux-kernel
Salut Willy,
>>I will send a proper signed-off and acked-by patch against rc2 after
>>some more stress testing. So, please hold off releasing until then. I
>>will be done testing this piece of code by tomorrow noon (GMT+1).
>
> OK, fine. I'll merge it into next -hf (probably within a few days). Please
> insist loudly if you consider it important to fix quickly because it's a
> real regression, as I don't want to have people wait for long if one hotfix
> causes trouble.
It is absolutely needed. Without it, people will really experience a
long term problem with hanging templates in IPVS, manifesting itself
depending on time and hardware configuration. So I insist that you merge
this patch _NOW_ :).
I'm checking another issue, an asymmetry in the reference counting, which
is not a bug per se (so far) but could serve you a plate of unwelcome
surprises in the long run as well. This, however, is post-2.4.32 material
because I need a couple of thousand test runs to check all invariants and
configurations.
> OK, if it was your only concern, then I can say that it compiles and
> runs on x86.
Perfect, thanks.
>>You would need to test IPVS on an SMP box using a persistent setup with
>>the 0 port feature and the expire_nodest_conn proc-fs entry set to 1. Hit
>>the LB with 100Mbit/s traffic balancing it on 2-3 RS and reload the
>>configuration using ipvsadm, _but_ without rmmod'ing the ip_vs_* kernel
>>modules. Set the persistency timeout low (60 secs) and the
>>timeout_finwait to 10*HZ. You need 2 clients which connect over a Linux
>>router to an LVS_DR setup; one needs to be routed through and the other
>>should be NAT'd on the Linux router using a NAT pool to simulate 100's
>>of clients. This way you have the slashdot-hype and the AOL proxy boost
>>hitting your LB and generating loaded persistency templates which will
>>then hit the code in question when the internal timer expires. You need
>>to grep for NONE in ipvsadm -L -n -c to get the template entries. You
>>must stop the client connecting directly through the Linux router after
>>you reloaded the LB setup and then you observe the persistent template
>>created for this client until the timer expires. Then you start it again
>>and with luck you should see the aberrant behaviour of a missed
>>__ip_vs_conn_put(cp) :). I am pretty sure you do not want to go through
>>this setup. I have it here and I'm stress testing all possible
>>combinations of this scenario.
>
> Of course this is not the easiest setup just to chase a bug down. But with
> such an explanation, a good manual on IPVS, and a lot of spare time, it
> could be done if it was the only solution. But I'm not willing to spend
> so much time on this yet :-)
I understand. It took me one week to set this up. Reading the code alone
only made me suspicious.
Have a nice day,
Roberto Nibali, ratz
--
-------------------------------------------------------------
addr://Kasinostrasse 30, CH-5001 Aarau tel://++41 62 823 9355
http://www.terreactive.com fax://++41 62 823 9356
-------------------------------------------------------------
terreActive AG Wir sichern Ihren Erfolg
-------------------------------------------------------------
* Re: Linux 2.4.32-rc2
2005-11-02 11:02 ` Roberto Nibali
2005-11-02 12:29 ` Willy Tarreau
@ 2005-11-03 5:43 ` Willy Tarreau
2005-11-03 6:41 ` Willy TARREAU
2 siblings, 0 replies; 15+ messages in thread
From: Willy Tarreau @ 2005-11-03 5:43 UTC
To: Roberto Nibali; +Cc: Marcelo Tosatti, Grant Coady, linux-kernel
Hi Roberto,
On Wed, Nov 02, 2005 at 12:02:39PM +0100, Roberto Nibali wrote:
> Bonjour Willy,
>
> >>Willy, if you have time, could you check your non-i386 boxes with a
> >>2.95.x compiled 2.4.x kernel, with IPVS enabled?
> >
> > Yes, no problem, but you'll have to tell me what to test ! (a config
> > or script will save me some time). I have a Sun Ultra60 (ultrasparc SMP)
> > which matches your description. I just have a doubt about gcc-2.95
> > availability on this box, I know I have a 3.3.6, do you think that the
> > problem is gcc-related (too strong optimization or de-inlining, etc) ?
>
> At least the following should be set; the rest you can leave to your gusto:
>
> CONFIG_ACPI=y
> CONFIG_ACPI_BOOT=y
> CONFIG_ACPI_BUS=y
> CONFIG_ACPI_INTERPRETER=y
> CONFIG_ACPI_EC=y
> CONFIG_ACPI_POWER=y
> CONFIG_ACPI_PCI=y
> CONFIG_ACPI_MMCONFIG=y
> CONFIG_ACPI_SLEEP=y
> CONFIG_ACPI_SYSTEM=y
OK, I can confirm that these ones get washed out
> CONFIG_IP_VS=m
> CONFIG_IP_VS_DEBUG=y
> CONFIG_IP_VS_TAB_BITS=12
> CONFIG_IP_VS_RR=m
> CONFIG_IP_VS_WRR=m
> CONFIG_IP_VS_LC=m
> CONFIG_IP_VS_WLC=m
> CONFIG_IP_VS_LBLC=m
> CONFIG_IP_VS_LBLCR=m
> CONFIG_IP_VS_DH=m
> CONFIG_IP_VS_SH=m
> CONFIG_IP_VS_SED=m
> CONFIG_IP_VS_NQ=m
> CONFIG_IP_VS_HPRIO=m
> CONFIG_IP_VS_FTP=m
These ones stay enabled.
> One issue is a possible C99'ism in the last IPVS patch. If you find
> time, please have a 2.95.x compiler installed.
I discovered that gcc-2.95.4 cannot compile the kernel on sparc64 anymore :-(
I even wonder if it was ever able to, because it does not know about
the -mmedlow option. I removed it to check further, but then ld segfaults
complaining that v8plus objects are not compatible with v9 output. So I am
retrying with 3.3.5...
> Another thing that could fail is if you additionally set
>
> CONFIG_ACPI_FAN=m
>
> and compile with CFLAGS="-g -ggdb"
I bet you guessed it gets ignored too :-) But I could test this on my
dual-athlon with gcc-2.95 if it is of any interest.
Regards,
Willy
* Re: Linux 2.4.32-rc2
2005-11-02 11:02 ` Roberto Nibali
2005-11-02 12:29 ` Willy Tarreau
2005-11-03 5:43 ` Willy Tarreau
@ 2005-11-03 6:41 ` Willy TARREAU
2 siblings, 0 replies; 15+ messages in thread
From: Willy TARREAU @ 2005-11-03 6:41 UTC
To: Roberto Nibali; +Cc: Marcelo Tosatti, Grant Coady, linux-kernel
On Wed, Nov 02, 2005 at 12:02:39PM +0100, Roberto Nibali wrote:
> Bonjour Willy,
>
> >>Willy, if you have time, could you check your non-i386 boxes with a
> >>2.95.x compiled 2.4.x kernel, with IPVS enabled?
> >
> > Yes, no problem, but you'll have to tell me what to test ! (a config
> > or script will save me some time). I have a Sun Ultra60 (ultrasparc SMP)
> > which matches your description. I just have a doubt about gcc-2.95
> > availability on this box, I know I have a 3.3.6, do you think that the
> > problem is gcc-related (too strong optimization or de-inlining, etc) ?
>
> At least the following should be set; the rest you can leave to your gusto:
>
> CONFIG_ACPI=y
> CONFIG_ACPI_BOOT=y
> CONFIG_ACPI_BUS=y
> CONFIG_ACPI_INTERPRETER=y
> CONFIG_ACPI_EC=y
> CONFIG_ACPI_POWER=y
> CONFIG_ACPI_PCI=y
> CONFIG_ACPI_MMCONFIG=y
> CONFIG_ACPI_SLEEP=y
> CONFIG_ACPI_SYSTEM=y
>
> CONFIG_IP_VS=m
> CONFIG_IP_VS_DEBUG=y
> CONFIG_IP_VS_TAB_BITS=12
> CONFIG_IP_VS_RR=m
> CONFIG_IP_VS_WRR=m
> CONFIG_IP_VS_LC=m
> CONFIG_IP_VS_WLC=m
> CONFIG_IP_VS_LBLC=m
> CONFIG_IP_VS_LBLCR=m
> CONFIG_IP_VS_DH=m
> CONFIG_IP_VS_SH=m
> CONFIG_IP_VS_SED=m
> CONFIG_IP_VS_NQ=m
> CONFIG_IP_VS_HPRIO=m
> CONFIG_IP_VS_FTP=m
>
> One issue is a possible C99'ism in the last IPVS patch. If you find
> time, please have a 2.95.x compiler installed.
>
> Another thing that could fail is if you additionally set
>
> CONFIG_ACPI_FAN=m
>
> and compile with CFLAGS="-g -ggdb"
>
> > Please keep us informed when you have more info.
>
> I will, and I will get more details, as time permits. My beef with the
> IPVS code seems to be wrong, the code works as expected so far. I'm
> stress-testing it though until Sunday on a 4GB Dual P4 Xeon with HT combo.
Well, finally built on sparc64-smp with gcc-3.3.5 (minus CONFIG_ACPI_*) and
on athlon-smp with gcc-2.95.3. So if you want me to do some tests, it will
be possible.
Regards,
Willy