* [PATCH] Re: Bad network performance over 2Gbps
2008-04-17 10:02 ` Anton Titov
@ 2008-04-17 17:37 ` Kok, Auke
2008-04-20 12:08 ` Denys Fedoryshchenko
` (3 more replies)
0 siblings, 4 replies; 17+ messages in thread
From: Kok, Auke @ 2008-04-17 17:37 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
Linux Kernel Mailing List
Cc: Anton Titov, Chris Snook, H. Willstrand, netdev, Jesse Brandeburg,
Linus Torvalds, Andrew Morton
Anton Titov wrote:
> On Tue, 2008-04-15 at 16:59 -0400, Chris Snook wrote:
>> Still, I think you're on to something here. Disabling NAPI and instead
>> tuning the cards' interrupt coalescing settings might allow irqbalance
>> to do a better job than it is currently.
>
> Disabling NAPI allowed me to push as much as 3.5Gbit out of the same
> server with ~ 20% of time CPUs doing software interrupts.
yes, I really don't see this is such an amazing discovery - the in-kernel
irqbalance code is totally wrong for network interrupts (and probably for most
interrupts).
on your system with 6 network interrupts it blows chunks and it's not NAPI that is
the issue - NAPI will work just fine on its own. By disabling NAPI and reverting
to the in-driver irq moderation code you've effectively put the in-kernel
irqbalance code to the sideline and this is what makes it work again.
It's not the right solution.
We keep seeing this exact issue pop up everywhere - especially with e1000(e)
datacenter users - this code _has_ to go or be fixed. Since there is a perfectly
viable solution, I strongly suggest disabling it.
This is not the first time I've sent this patch out in some form...
Auke
---
[X86] IRQBALANCE: Mark as BROKEN and disable by default
The IRQBALANCE option causes interrupts to bounce all around on SMP systems
quickly burying the CPU in migration cost and cache misses. Mainly affected are
network interrupts and this results in one CPU pegged in softirqd completely.
Disable this option and provide documentation to a better solution (userspace
irqbalance daemon does overall the best job to begin with and only manual setting
of smp_affinity will beat it).
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
---
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6c70fed..956aa22 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1026,13 +1026,17 @@ config EFI
platforms.
config IRQBALANCE
- def_bool y
+ def_bool n
prompt "Enable kernel irq balancing"
- depends on X86_32 && SMP && X86_IO_APIC
+ depends on X86_32 && SMP && X86_IO_APIC && BROKEN
help
The default yes will allow the kernel to do irq load balancing.
Saying no will keep the kernel from doing irq load balancing.
+ This option is known to cause performance issues on SMP
+ systems. The preferred method is to use the userspace
+ 'irqbalance' daemon instead. See http://irqbalance.org/.
+
config SECCOMP
def_bool y
prompt "Enable seccomp to safely compute untrusted bytecode"
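The commit message above notes that only manual setting of smp_affinity beats the userspace daemon. A minimal sketch of what that manual setting involves, assuming the usual /proc/irq/<N>/smp_affinity interface, which takes a hexadecimal CPU bitmask. The helper names affinity_mask and pin_irq are made up for illustration and are not part of any existing tool:

```python
import os


def affinity_mask(cpus):
    """Return the hex bitmask string that selects the given CPU numbers."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu
    if mask == 0:
        raise ValueError("at least one CPU is required")
    # Masks wider than 32 bits are written as comma-separated 32-bit
    # hex groups, most significant group first.
    groups = []
    while True:
        groups.append("%08x" % (mask & 0xFFFFFFFF))
        mask >>= 32
        if mask == 0:
            break
    return ",".join(reversed(groups))


def pin_irq(irq, cpus, proc_root="/proc"):
    """Write the mask for `cpus` to <proc_root>/irq/<irq>/smp_affinity.

    Hypothetical helper: needs root, and the kernel may reject the write
    for interrupts whose affinity cannot be changed.
    """
    path = os.path.join(proc_root, "irq", str(irq), "smp_affinity")
    with open(path, "w") as f:
        f.write(affinity_mask(cpus) + "\n")
```

Run as root, pin_irq(24, [1]) would ask the kernel to deliver IRQ 24 to CPU1 only; the IRQ number is just an example. A running irqbalance daemon may later rewrite the mask, so it would have to be stopped first for the pinning to stick.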
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
[not found] ` <ajGfA-7rt-7@gated-at.bofh.it>
@ 2008-04-19 15:05 ` Bodo Eggert
[not found] ` <E1JnEcl-0000xc-D9@be1.7eggert.dyndns.org>
1 sibling, 0 replies; 17+ messages in thread
From: Bodo Eggert @ 2008-04-19 15:05 UTC (permalink / raw)
To: Kok, Auke, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
Linux Kernel Mailing List
Kok, Auke <auke-jan.h.kok@intel.com> wrote:
> [X86] IRQBALANCE: Mark as BROKEN and disable by default
>
> The IRQBALANCE option causes interrupts to bounce all around on SMP systems
> quickly burying the CPU in migration cost and cache misses. Mainly affected
> are network interrupts and this results in one CPU pegged in softirqd
> completely.
If this is the problem, maybe it would help to only balance the IRQs each
e.g. ten seconds? Unfortunately I have no SMP system to try it out.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
[not found] ` <E1JnEcl-0000xc-D9@be1.7eggert.dyndns.org>
@ 2008-04-19 19:23 ` Stephen Hemminger
2008-04-21 16:42 ` Rick Jones
1 sibling, 0 replies; 17+ messages in thread
From: Stephen Hemminger @ 2008-04-19 19:23 UTC (permalink / raw)
To: 7eggert
Cc: Kok, Auke, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
Linux Kernel Mailing List, Anton Titov, Chris Snook,
H. Willstrand, netdev, Jesse Brandeburg, Linus Torvalds,
Andrew Morton
Bodo Eggert wrote:
> Kok, Auke <auke-jan.h.kok@intel.com> wrote:
>
>
>> [X86] IRQBALANCE: Mark as BROKEN and disable by default
>>
>> The IRQBALANCE option causes interrupts to bounce all around on SMP systems
>> quickly burying the CPU in migration cost and cache misses. Mainly affected
>> are network interrupts and this results in one CPU pegged in softirqd
>> completely.
>>
>
> If this is the problem, maybe it would help to only balance the IRQs each
> e.g. ten seconds? Unfortunately I have no SMP system to try it out.
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
The kernel-level IRQBALANCE is useless. The user-level irqbalance does
the right thing: it handles multi-core, network devices, and all the
other special cases.
*Don't use kernel level irqbalance*
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
2008-04-17 17:37 ` [PATCH] " Kok, Auke
@ 2008-04-20 12:08 ` Denys Fedoryshchenko
2008-04-21 13:19 ` Pavel Machek
` (2 subsequent siblings)
3 siblings, 0 replies; 17+ messages in thread
From: Denys Fedoryshchenko @ 2008-04-20 12:08 UTC (permalink / raw)
To: Kok, Auke
Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
Linux Kernel Mailing List, Anton Titov, Chris Snook,
H. Willstrand, netdev, Jesse Brandeburg, Andrew Morton
Even without IRQBALANCE enabled in the kernel, the APIC (or something else) distributes interrupts across the processors by default.
There is no irqbalance daemon running here.
For example:
Router-KARAM ~ # cat /proc/interrupts
CPU0 CPU1
0: 87956938 1403052485 IO-APIC-edge timer
1: 0 2 IO-APIC-edge i8042
9: 0 0 IO-APIC-fasteoi acpi
19: 140 5714 IO-APIC-fasteoi ohci_hcd:usb1, ohci_hcd:usb2
24: 675673280 1186506694 IO-APIC-fasteoi eth2
26: 717865662 2201633562 IO-APIC-fasteoi eth0
27: 1869190 23075556 IO-APIC-fasteoi eth1
NMI: 0 0 Non-maskable interrupts
LOC: 1403052485 87956683 Local timer interrupts
RES: 75059 25408 Rescheduling interrupts
CAL: 99542 83 function call interrupts
TLB: 616 200 TLB shootdowns
TRM: 0 0 Thermal event interrupts
SPU: 0 0 Spurious interrupts
ERR: 0
MIS: 0
sunfire-1 ~ # cat config|grep -i irq
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_PENDING_IRQ=y
# CONFIG_IRQBALANCE is not set
CONFIG_HT_IRQ=y
# CONFIG_HPET_RTC_IRQ is not set
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
# CONFIG_DEBUG_SHIRQ is not set
Is it harmful too?
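The per-CPU split in a listing like the one above can be tallied with a short script. This is only a sketch, assuming the usual /proc/interrupts layout (a header row of CPU names, then one row per interrupt). The names parse_interrupts and cpu_shares are illustrative, and SAMPLE reuses two rows from the listing plus the ERR row:

```python
SAMPLE = """\
           CPU0       CPU1
 24:  675673280 1186506694   IO-APIC-fasteoi   eth2
 26:  717865662 2201633562   IO-APIC-fasteoi   eth0
ERR:          0
"""


def parse_interrupts(text):
    """Parse /proc/interrupts text into {label: [count_cpu0, count_cpu1, ...]}."""
    lines = text.splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()[:ncpus]
        # Rows such as ERR and MIS carry one total, not per-CPU counts.
        if len(fields) < ncpus or not all(f.isdigit() for f in fields):
            continue
        counts[label.strip()] = [int(f) for f in fields]
    return counts


def cpu_shares(per_cpu):
    """Return each CPU's fraction of the interrupts counted for one IRQ."""
    total = sum(per_cpu)
    return [c / total if total else 0.0 for c in per_cpu]


if __name__ == "__main__":
    for label, per_cpu in parse_interrupts(SAMPLE).items():
        print(label, ["%.2f" % s for s in cpu_shares(per_cpu)])
```

The counters are cumulative since boot, so a split like this shows that both CPUs have serviced an interrupt at some point, not how often it moves between them.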
On Thursday 17 April 2008 20:37, Kok, Auke wrote:
> Anton Titov wrote:
> > On Tue, 2008-04-15 at 16:59 -0400, Chris Snook wrote:
> >> Still, I think you're on to something here. Disabling NAPI and instead
> >> tuning the cards' interrupt coalescing settings might allow irqbalance
> >> to do a better job than it is currently.
> >
> > Disabling NAPI allowed me to push as much as 3.5Gbit out of the same
> > server with ~ 20% of time CPUs doing software interrupts.
>
> yes, I really don't see this is such an amazing discovery - the in-kernel
> irqbalance code is totally wrong for network interrupts (and probably for most
> interrupts).
>
> on your system with 6 network interrupts it blows chunks and it's not NAPI that is
> the issue - NAPI will work just fine on its own. By disabling NAPI and reverting
> to the in-driver irq moderation code you've effectively put the in-kernel
> irqbalance code to the sideline and this is what makes it work again.
>
> It's not the right solution.
>
> We keep seeing this exact issue pop up everywhere - especially with e1000(e)
> datacenter users - this code _has_ to go or be fixed. Since there is a perfectly
> viable solution, I strongly suggest disabling it.
>
> This is not the first time I've sent this patch out in some form...
>
> Auke
>
>
> ---
> [X86] IRQBALANCE: Mark as BROKEN and disable by default
>
> The IRQBALANCE option causes interrupts to bounce all around on SMP systems
> quickly burying the CPU in migration cost and cache misses. Mainly affected are
> network interrupts and this results in one CPU pegged in softirqd completely.
>
> Disable this option and provide documentation to a better solution (userspace
> irqbalance daemon does overall the best job to begin with and only manual setting
> of smp_affinity will beat it).
>
> Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
>
> ---
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 6c70fed..956aa22 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1026,13 +1026,17 @@ config EFI
> platforms.
>
> config IRQBALANCE
> - def_bool y
> + def_bool n
> prompt "Enable kernel irq balancing"
> - depends on X86_32 && SMP && X86_IO_APIC
> + depends on X86_32 && SMP && X86_IO_APIC && BROKEN
> help
> The default yes will allow the kernel to do irq load balancing.
> Saying no will keep the kernel from doing irq load balancing.
>
> + This option is known to cause performance issues on SMP
> + systems. The preferred method is to use the userspace
> + 'irqbalance' daemon instead. See http://irqbalance.org/.
> +
> config SECCOMP
> def_bool y
> prompt "Enable seccomp to safely compute untrusted bytecode"
>
--
------
Technical Manager
Virtual ISP S.A.L.
Lebanon
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
2008-04-17 17:37 ` [PATCH] " Kok, Auke
2008-04-20 12:08 ` Denys Fedoryshchenko
@ 2008-04-21 13:19 ` Pavel Machek
2008-04-21 16:38 ` Kok, Auke
2008-04-21 15:28 ` Ingo Molnar
2008-04-22 5:07 ` Bill Fink
3 siblings, 1 reply; 17+ messages in thread
From: Pavel Machek @ 2008-04-21 13:19 UTC (permalink / raw)
To: Kok, Auke
Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
Linux Kernel Mailing List, Anton Titov, Chris Snook,
H. Willstrand, netdev, Jesse Brandeburg, Linus Torvalds,
Andrew Morton
Hi!
> [X86] IRQBALANCE: Mark as BROKEN and disable by default
>
> The IRQBALANCE option causes interrupts to bounce all around on SMP systems
> quickly burying the CPU in migration cost and cache misses. Mainly affected are
> network interrupts and this results in one CPU pegged in softirqd completely.
>
> Disable this option and provide documentation to a better solution (userspace
> irqbalance daemon does overall the best job to begin with and only manual setting
> of smp_affinity will beat it).
>
> Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
>
> ---
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 6c70fed..956aa22 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1026,13 +1026,17 @@ config EFI
> platforms.
>
> config IRQBALANCE
> - def_bool y
> + def_bool n
ACK.
> prompt "Enable kernel irq balancing"
> - depends on X86_32 && SMP && X86_IO_APIC
> + depends on X86_32 && SMP && X86_IO_APIC && BROKEN
This is wrong. irqbalance works, there's nothing wrong with it; but it
has nasty side effects.
> help
> The default yes will allow the kernel to do irq load balancing.
> Saying no will keep the kernel from doing irq load balancing.
>
> + This option is known to cause performance issues on SMP
> + systems. The preferred method is to use the userspace
> + 'irqbalance' daemon instead. See http://irqbalance.org/.
> +
ACK.
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
2008-04-17 17:37 ` [PATCH] " Kok, Auke
2008-04-20 12:08 ` Denys Fedoryshchenko
2008-04-21 13:19 ` Pavel Machek
@ 2008-04-21 15:28 ` Ingo Molnar
2008-04-21 16:58 ` Kok, Auke
2008-04-22 5:07 ` Bill Fink
3 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2008-04-21 15:28 UTC (permalink / raw)
To: Kok, Auke
Cc: Thomas Gleixner, H. Peter Anvin, Linux Kernel Mailing List,
Anton Titov, Chris Snook, H. Willstrand, netdev, Jesse Brandeburg,
Linus Torvalds, Andrew Morton
* Kok, Auke <auke-jan.h.kok@intel.com> wrote:
> We keep seeing this exact issue pop up everywhere - especially with
> e1000(e) datacenter users - this code _has_ to go or be fixed. Since
> there is a perfectly viable solution, I strongly suggest disabling it.
strongly agreed. Thanks Auke, applied.
Ingo
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
2008-04-21 13:19 ` Pavel Machek
@ 2008-04-21 16:38 ` Kok, Auke
0 siblings, 0 replies; 17+ messages in thread
From: Kok, Auke @ 2008-04-21 16:38 UTC (permalink / raw)
To: Pavel Machek
Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
Linux Kernel Mailing List, Anton Titov, Chris Snook,
H. Willstrand, netdev, Jesse Brandeburg, Linus Torvalds,
Andrew Morton
Pavel Machek wrote:
> Hi!
>
>> [X86] IRQBALANCE: Mark as BROKEN and disable by default
>>
>> The IRQBALANCE option causes interrupts to bounce all around on SMP systems
>> quickly burying the CPU in migration cost and cache misses. Mainly affected are
>> network interrupts and this results in one CPU pegged in softirqd completely.
>>
>> Disable this option and provide documentation to a better solution (userspace
>> irqbalance daemon does overall the best job to begin with and only manual setting
>> of smp_affinity will beat it).
>>
>> Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
>>
>> ---
>>
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index 6c70fed..956aa22 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -1026,13 +1026,17 @@ config EFI
>> platforms.
>>
>> config IRQBALANCE
>> - def_bool y
>> + def_bool n
>
> ACK.
>> prompt "Enable kernel irq balancing"
>> - depends on X86_32 && SMP && X86_IO_APIC
>> + depends on X86_32 && SMP && X86_IO_APIC && BROKEN
>
> This is wrong. irqbalance works, there's nothing wrong with it; but it
> has nasty side effects.
ok, I'm fine with taking that part out of the patch.
Ingo, want me to send an updated patch?
>
>> help
>> The default yes will allow the kernel to do irq load balancing.
>> Saying no will keep the kernel from doing irq load balancing.
>>
>> + This option is known to cause performance issues on SMP
>> + systems. The preferred method is to use the userspace
>> + 'irqbalance' daemon instead. See http://irqbalance.org/.
>> +
>
> ACK.
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
[not found] ` <E1JnEcl-0000xc-D9@be1.7eggert.dyndns.org>
2008-04-19 19:23 ` Stephen Hemminger
@ 2008-04-21 16:42 ` Rick Jones
2008-04-21 19:52 ` Bodo Eggert
1 sibling, 1 reply; 17+ messages in thread
From: Rick Jones @ 2008-04-21 16:42 UTC (permalink / raw)
To: 7eggert
Cc: Kok, Auke, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
Linux Kernel Mailing List, Anton Titov, Chris Snook,
H. Willstrand, netdev, Jesse Brandeburg, Linus Torvalds,
Andrew Morton
Bodo Eggert wrote:
> Kok, Auke <auke-jan.h.kok@intel.com> wrote:
>
>
>>[X86] IRQBALANCE: Mark as BROKEN and disable by default
>>
>>The IRQBALANCE option causes interrupts to bounce all around on SMP systems
>>quickly burying the CPU in migration cost and cache misses. Mainly affected
>>are network interrupts and this results in one CPU pegged in softirqd
>>completely.
>
>
> If this is the problem, maybe it would help to only balance the IRQs each
> e.g. ten seconds? Unfortunately I have no SMP system to try it out.
Be it kernel or user space, for consistent benchmark results it needs to
be able to be turned-off without turning the code. That leaves me in
agreement with Stephen that if it must exist, the user space one would
be preferable. It can be easily terminated with extreme prejudice.
rick jones
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
2008-04-21 15:28 ` Ingo Molnar
@ 2008-04-21 16:58 ` Kok, Auke
2008-04-21 18:35 ` Andi Kleen
0 siblings, 1 reply; 17+ messages in thread
From: Kok, Auke @ 2008-04-21 16:58 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Gleixner, H. Peter Anvin, Linux Kernel Mailing List,
Anton Titov, Chris Snook, H. Willstrand, netdev, Jesse Brandeburg,
Linus Torvalds, Andrew Morton
Ingo Molnar wrote:
> * Kok, Auke <auke-jan.h.kok@intel.com> wrote:
>
>> We keep seeing this exact issue pop up everywhere - especially with
>> e1000(e) datacenter users - this code _has_ to go or be fixed. Since
>> there is a perfectly viable solution, I strongly suggest disabling it.
>
> strongly agreed. Thanks Auke, applied.
>
> Ingo
excellent, ignore my other reply to Pavel - I didn't see this reply yet :)
Thanks Ingo
Auke
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
2008-04-21 16:58 ` Kok, Auke
@ 2008-04-21 18:35 ` Andi Kleen
0 siblings, 0 replies; 17+ messages in thread
From: Andi Kleen @ 2008-04-21 18:35 UTC (permalink / raw)
To: Kok, Auke
Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
Linux Kernel Mailing List, Anton Titov, Chris Snook,
H. Willstrand, netdev, Jesse Brandeburg, Linus Torvalds,
Andrew Morton
"Kok, Auke" <auke-jan.h.kok@intel.com> writes:
> Ingo Molnar wrote:
>> * Kok, Auke <auke-jan.h.kok@intel.com> wrote:
>>
>>> We keep seeing this exact issue pop up everywhere - especially with
>>> e1000(e) datacenter users - this code _has_ to go or be fixed. Since
>>> there is a perfectly viable solution, I strongly suggest disabling it.
>>
>> strongly agreed. Thanks Auke, applied.
>>
>> Ingo
>
>
> excellent, ignore my other reply to Pavel - I didn't see this reply yet :)
Shouldn't you also add it to the feature-removal list and then remove it
quickly? There is no need to keep disabled, known-wrong code around.
-Andi
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
2008-04-21 16:42 ` Rick Jones
@ 2008-04-21 19:52 ` Bodo Eggert
2008-04-21 20:02 ` Rick Jones
0 siblings, 1 reply; 17+ messages in thread
From: Bodo Eggert @ 2008-04-21 19:52 UTC (permalink / raw)
To: Rick Jones
Cc: 7eggert, Kok, Auke, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
Linux Kernel Mailing List, Anton Titov, Chris Snook,
H. Willstrand, netdev, Jesse Brandeburg, Linus Torvalds,
Andrew Morton
On Mon, 21 Apr 2008, Rick Jones wrote:
> Bodo Eggert wrote:
> > Kok, Auke <auke-jan.h.kok@intel.com> wrote:
> > > [X86] IRQBALANCE: Mark as BROKEN and disable by default
> > >
> > > The IRQBALANCE option causes interrupts to bounce all around on SMP
> > > systems
> > > quickly burying the CPU in migration cost and cache misses. Mainly
> > > affected
> > > are network interrupts and this results in one CPU pegged in softirqd
> > > completely.
> >
> >
> > If this is the problem, maybe it would help to only balance the IRQs each
> > e.g. ten seconds? Unfortunately I have no SMP system to try it out.
>
> Be it kernel or user space, for consistent benchmark results it needs to be
> able to be turned-off without turning the code. That leaves me in agreement
> with Stephen that if it must exist, the user space one would be preferable.
> It can be easily terminated with extreme prejudice.
I agree that having a full-featured userspace balancer daemon with lots of
intelligence will be theoretically better, but if you can have a simple
daemon doing OK on many machines for less than the userspace daemon's
kernel stack, why not?
--
Funny quotes:
31. Why do "overlook" and "oversee" mean opposite things?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
2008-04-21 19:52 ` Bodo Eggert
@ 2008-04-21 20:02 ` Rick Jones
2008-04-21 21:08 ` Bodo Eggert
0 siblings, 1 reply; 17+ messages in thread
From: Rick Jones @ 2008-04-21 20:02 UTC (permalink / raw)
To: Bodo Eggert
Cc: Kok, Auke, Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
Linux Kernel Mailing List, Anton Titov, Chris Snook,
H. Willstrand, netdev, Jesse Brandeburg, Linus Torvalds,
Andrew Morton
Bodo Eggert wrote:
> On Mon, 21 Apr 2008, Rick Jones wrote:
>>Be it kernel or user space, for consistent benchmark results it needs to be
>>able to be turned-off without turning the code. That leaves me in agreement
>>with Stephen that if it must exist, the user space one would be preferable.
>>It can be easily terminated with extreme prejudice.
>
>
> I agree that having a full-featured userspace balancer daemon with lots of
> intelligence will be theoretically better, but if you can have a simple
> daemon doing OK on many machines for less than the userspace daemon's
> kernel stack, why not?
Perhaps my judgement is too colored by benchmark(et)ing, and desires to
have repeatable results on things like netperf, but I very much like to
know where my interrupts are going and don't like them moving around.
That is why I am not particularly fond of either flavor of irq balancing.
That being the case, whatever is out there ought to be able to be
disabled on a running system without having to roll bits or reboot.
rick jones
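One way to check that an interrupt stays put is sketched below. It assumes two snapshots of the per-CPU counters for a single IRQ, as they appear in /proc/interrupts, taken some time apart. The names cpus_hit and is_pinned are hypothetical helpers, and counter wraparound is ignored:

```python
def cpus_hit(before, after):
    """Return the CPU indices whose counter for one IRQ grew between snapshots."""
    return [cpu for cpu, (b, a) in enumerate(zip(before, after)) if a > b]


def is_pinned(before, after):
    """True if at most one CPU serviced the IRQ during the interval."""
    return len(cpus_hit(before, after)) <= 1


if __name__ == "__main__":
    # Made-up counter values for a two-CPU machine.
    print(is_pinned([10, 5], [20, 5]))
    print(is_pinned([10, 5], [20, 9]))
```

A result of False over a short interval means the interrupt was serviced on more than one CPU, which is the behaviour that makes benchmark runs hard to repeat.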
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
2008-04-21 20:02 ` Rick Jones
@ 2008-04-21 21:08 ` Bodo Eggert
2008-04-21 21:30 ` Chris Snook
0 siblings, 1 reply; 17+ messages in thread
From: Bodo Eggert @ 2008-04-21 21:08 UTC (permalink / raw)
To: Rick Jones
Cc: Bodo Eggert, Kok, Auke, Ingo Molnar, Thomas Gleixner,
H. Peter Anvin, Linux Kernel Mailing List, Anton Titov,
Chris Snook, H. Willstrand, netdev, Jesse Brandeburg,
Linus Torvalds, Andrew Morton
On Mon, 21 Apr 2008, Rick Jones wrote:
> Bodo Eggert wrote:
> > On Mon, 21 Apr 2008, Rick Jones wrote:
> > > Be it kernel or user space, for consistent benchmark results it needs to
> > > be
> > > able to be turned-off without turning the code. That leaves me in
> > > agreement
> > > with Stephen that if it must exist, the user space one would be
> > > preferable.
> > > It can be easily terminated with extreme prejudice.
> >
> >
> > I agree that having a full-featured userspace balancer daemon with lots of
> > intelligence will be theoretically better, but if you can have a simple
> > daemon doing OK on many machines for less than the userspace daemon's
> > kernel stack, why not?
>
> Perhaps my judgement is too colored by benchmark(et)ing, and desires to have
> repeatable results on things like netperf, but I very much like to know where
> my interrupts are going and don't like them moving around. That is why I am
> not particularly fond of either flavor of irq balancing.
>
> That being the case, whatever is out there ought to be able to be disabled on
> a running system without having to roll bits or reboot.
Adding a "module" parameter to disable it should be cheap, isn't it?
--
Top 100 things you don't want the sysadmin to say:
34. The network's down, but we're working on it. Come back after dinner.
(Usually said at 2200 the night before thesis deadline... )
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
2008-04-21 21:08 ` Bodo Eggert
@ 2008-04-21 21:30 ` Chris Snook
2008-04-22 7:36 ` Bodo Eggert
0 siblings, 1 reply; 17+ messages in thread
From: Chris Snook @ 2008-04-21 21:30 UTC (permalink / raw)
To: Bodo Eggert
Cc: Rick Jones, Kok, Auke, Ingo Molnar, Thomas Gleixner,
H. Peter Anvin, Linux Kernel Mailing List, Anton Titov,
H. Willstrand, netdev, Jesse Brandeburg, Linus Torvalds,
Andrew Morton
Bodo Eggert wrote:
> On Mon, 21 Apr 2008, Rick Jones wrote:
>> Bodo Eggert wrote:
>>> On Mon, 21 Apr 2008, Rick Jones wrote:
>
>>>> Be it kernel or user space, for consistent benchmark results it needs to
>>>> be
>>>> able to be turned-off without turning the code. That leaves me in
>>>> agreement
>>>> with Stephen that if it must exist, the user space one would be
>>>> preferable.
>>>> It can be easily terminated with extreme prejudice.
>>>
>>> I agree that having a full-featured userspace balancer daemon with lots of
>>> intelligence will be theoretically better, but if you can have a simple
>>> daemon doing OK on many machines for less than the userspace daemon's
>>> kernel stack, why not?
>> Perhaps my judgement is too colored by benchmark(et)ing, and desires to have
>> repeatable results on things like netperf, but I very much like to know where
>> my interrupts are going and don't like them moving around. That is why I am
>> not particularly fond of either flavor of irq balancing.
>>
>> That being the case, whatever is out there ought to be able to be disabled on
>> a running system without having to roll bits or reboot.
>
> Adding a "module" parameter to disable it should be cheap, isn't it?
Except the irq balancing is system-wide. Adding per-device exemptions to an
obsolete feature seems like the wrong way to go.
-- Chris
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
2008-04-17 17:37 ` [PATCH] " Kok, Auke
` (2 preceding siblings ...)
2008-04-21 15:28 ` Ingo Molnar
@ 2008-04-22 5:07 ` Bill Fink
3 siblings, 0 replies; 17+ messages in thread
From: Bill Fink @ 2008-04-22 5:07 UTC (permalink / raw)
To: Kok, Auke
Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin,
Linux Kernel Mailing List, Anton Titov, Chris Snook,
H. Willstrand, netdev, Jesse Brandeburg, Linus Torvalds,
Andrew Morton
On Thu, 17 Apr 2008, Kok, Auke wrote:
> [X86] IRQBALANCE: Mark as BROKEN and disable by default
>
> The IRQBALANCE option causes interrupts to bounce all around on SMP systems
> quickly burying the CPU in migration cost and cache misses. Mainly affected are
> network interrupts and this results in one CPU pegged in softirqd completely.
>
> Disable this option and provide documentation to a better solution (userspace
> irqbalance daemon does overall the best job to begin with and only manual setting
> of smp_affinity will beat it).
>
> Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
>
> ---
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 6c70fed..956aa22 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1026,13 +1026,17 @@ config EFI
> platforms.
>
> config IRQBALANCE
> - def_bool y
> + def_bool n
> prompt "Enable kernel irq balancing"
> - depends on X86_32 && SMP && X86_IO_APIC
> + depends on X86_32 && SMP && X86_IO_APIC && BROKEN
> help
> The default yes will allow the kernel to do irq load balancing.
> Saying no will keep the kernel from doing irq load balancing.
Since you're changing the default setting, shouldn't the above be
changed to:
Saying yes will allow the kernel to do irq load balancing.
The default no will keep the kernel from doing irq load balancing.
> + This option is known to cause performance issues on SMP
> + systems. The preferred method is to use the userspace
> + 'irqbalance' daemon instead. See http://irqbalance.org/.
> +
> config SECCOMP
> def_bool y
> prompt "Enable seccomp to safely compute untrusted bytecode"
-Bill
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
2008-04-21 21:30 ` Chris Snook
@ 2008-04-22 7:36 ` Bodo Eggert
2008-04-22 17:46 ` Kok, Auke
0 siblings, 1 reply; 17+ messages in thread
From: Bodo Eggert @ 2008-04-22 7:36 UTC (permalink / raw)
To: Chris Snook
Cc: Bodo Eggert, Rick Jones, Kok, Auke, Ingo Molnar, Thomas Gleixner,
H. Peter Anvin, Linux Kernel Mailing List, Anton Titov,
H. Willstrand, netdev, Jesse Brandeburg, Linus Torvalds,
Andrew Morton
On Mon, 21 Apr 2008, Chris Snook wrote:
> Bodo Eggert wrote:
> > On Mon, 21 Apr 2008, Rick Jones wrote:
> >> Bodo Eggert wrote:
> >>> On Mon, 21 Apr 2008, Rick Jones wrote:
> >>>> Be it kernel or user space, for consistent benchmark results it needs to
> >>>> be
> >>>> able to be turned-off without turning the code. That leaves me in
> >>>> agreement
> >>>> with Stephen that if it must exist, the user space one would be
> >>>> preferable.
> >>>> It can be easily terminated with extreme prejudice.
> >>>
> >>> I agree that having a full-featured userspace balancer daemon with lots of
> >>> intelligence will be theoretically better, but if you can have a simple
> >>> daemon doing OK on many machines for less than the userspace daemon's
> >>> kernel stack, why not?
> >> Perhaps my judgement is too colored by benchmark(et)ing, and desires to have
> >> repeatable results on things like netperf, but I very much like to know where
> >> my interrupts are going and don't like them moving around. That is why I am
> >> not particularly fond of either flavor of irq balancing.
> >>
> >> That being the case, whatever is out there ought to be able to be disabled on
> >> a running system without having to roll bits or reboot.
> >
> > Adding a "module" parameter to disable it should be cheap, isn't it?
>
> Except the irq balancing is system-wide. Adding per-device exemptions to an
> obsolete feature seems like the wrong way to go.
No, not a per-device-exemption. My reasoning was: If the IRQ balancer
bounces the IRQ too often, doing it less often seems to be the correct
solution. One cache miss each ten seconds sounds like it should be OK.
As said before, I can't verify this theory.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] Re: Bad network performance over 2Gbps
2008-04-22 7:36 ` Bodo Eggert
@ 2008-04-22 17:46 ` Kok, Auke
0 siblings, 0 replies; 17+ messages in thread
From: Kok, Auke @ 2008-04-22 17:46 UTC (permalink / raw)
To: Bodo Eggert
Cc: Chris Snook, Rick Jones, Ingo Molnar, Thomas Gleixner,
H. Peter Anvin, Linux Kernel Mailing List, Anton Titov,
H. Willstrand, netdev, Jesse Brandeburg, Linus Torvalds,
Andrew Morton
Bodo Eggert wrote:
> On Mon, 21 Apr 2008, Chris Snook wrote:
>> Bodo Eggert wrote:
>>> On Mon, 21 Apr 2008, Rick Jones wrote:
>>>> Bodo Eggert wrote:
>>>>> On Mon, 21 Apr 2008, Rick Jones wrote:
>
>>>>>> Be it kernel or user space, for consistent benchmark results it needs to
>>>>>> be
>>>>>> able to be turned-off without turning the code. That leaves me in
>>>>>> agreement
>>>>>> with Stephen that if it must exist, the user space one would be
>>>>>> preferable.
>>>>>> It can be easily terminated with extreme prejudice.
>>>>> I agree that having a full-featured userspace balancer daemon with lots of
>>>>> intelligence will be theoretically better, but if you can have a simple
>>>>> daemon doing OK on many machines for less than the userspace daemon's
>>>>> kernel stack, why not?
>>>> Perhaps my judgement is too colored by benchmark(et)ing, and desires to have
>>>> repeatable results on things like netperf, but I very much like to know where
>>>> my interrupts are going and don't like them moving around. That is why I am
>>>> not particularly fond of either flavor of irq balancing.
>>>>
>>>> That being the case, whatever is out there ought to be able to be disabled on
>>>> a running system without having to roll bits or reboot.
>>> Adding a "module" parameter to disable it should be cheap, isn't it?
>> Except the irq balancing is system-wide. Adding per-device exemptions to an
>> obsolete feature seems like the wrong way to go.
>
> No, not a per-device-exemption. My reasoning was: If the IRQ balancer
> bounces the IRQ too often, doing it less often seems to be the correct
> solution. One cache miss each ten seconds sounds like it should be OK.
> As said before, I can't verify this theory.
this is exactly what the userspace irqbalance does, and it's even optimized to
skip those 10-second migrations if things look OK. from that
perspective, it's definitely more mature and it's maintained as well.
Auke
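The policy described here (rebalance at most once every ten seconds, and skip it when things look OK) can be sketched as follows. This is an illustration only, not the irqbalance daemon's actual algorithm. The function name should_migrate, the 25% tolerance and the load figures are all assumptions:

```python
def should_migrate(now, last_move, load, interval=10.0, tolerance=0.25):
    """Decide whether an IRQ rebalance is worth doing right now.

    now, last_move: timestamps in seconds.
    load: interrupts handled per CPU since the last decision.
    interval: minimum time between migrations (ten seconds, as in the thread).
    tolerance: how far above an even share the busiest CPU may be.
    """
    if now - last_move < interval:
        return False  # too soon: avoid bouncing the IRQ between CPUs
    total = sum(load)
    if total == 0:
        return False  # nothing happened, nothing to balance
    fair = total / len(load)
    # Only migrate when the busiest CPU is clearly above its fair share.
    return max(load) > fair * (1 + tolerance)


if __name__ == "__main__":
    print(should_migrate(now=5.0, last_move=0.0, load=[100, 0]))
    print(should_migrate(now=10.0, last_move=0.0, load=[100, 0]))
    print(should_migrate(now=20.0, last_move=0.0, load=[50, 50]))
```

Putting the interval check first bounds the migration cost at one move per interval regardless of how uneven the load looks, which is the point of the ten-second proposal.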
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2008-04-22 17:46 UTC | newest]
Thread overview: 17+ messages
[not found] <aiXVe-Yn-7@gated-at.bofh.it>
[not found] ` <ajGfA-7rt-9@gated-at.bofh.it>
[not found] ` <ajGfA-7rt-11@gated-at.bofh.it>
[not found] ` <ajGfA-7rt-13@gated-at.bofh.it>
[not found] ` <ajGfA-7rt-15@gated-at.bofh.it>
[not found] ` <ajGfA-7rt-7@gated-at.bofh.it>
2008-04-19 15:05 ` [PATCH] Re: Bad network performance over 2Gbps Bodo Eggert
[not found] ` <E1JnEcl-0000xc-D9@be1.7eggert.dyndns.org>
2008-04-19 19:23 ` Stephen Hemminger
2008-04-21 16:42 ` Rick Jones
2008-04-21 19:52 ` Bodo Eggert
2008-04-21 20:02 ` Rick Jones
2008-04-21 21:08 ` Bodo Eggert
2008-04-21 21:30 ` Chris Snook
2008-04-22 7:36 ` Bodo Eggert
2008-04-22 17:46 ` Kok, Auke
[not found] <1208282804.23631.27.camel@localhost>
2008-04-15 20:15 ` H. Willstrand
2008-04-15 20:34 ` Kok, Auke
2008-04-15 20:59 ` Chris Snook
2008-04-17 10:02 ` Anton Titov
2008-04-17 17:37 ` [PATCH] " Kok, Auke
2008-04-20 12:08 ` Denys Fedoryshchenko
2008-04-21 13:19 ` Pavel Machek
2008-04-21 16:38 ` Kok, Auke
2008-04-21 15:28 ` Ingo Molnar
2008-04-21 16:58 ` Kok, Auke
2008-04-21 18:35 ` Andi Kleen
2008-04-22 5:07 ` Bill Fink