Re: [PATCH] irq-gic: select all CPU's selected in interrupt affinity settings - Russell King

From: Russell King - ARM Linux admin <linux@armlinux.org.uk>
To: Leonid Movshovich <event.riga@gmail.com>
Cc: Robin Murphy <robin.murphy@arm.com>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] irq-gic: select all CPU's selected in interrupt affinity settings
Date: Wed, 20 Nov 2019 17:13:50 +0000
Message-ID: <20191120171350.GS25745@shell.armlinux.org.uk>
In-Reply-To: <CAPaFbat=TXqGYx5KrQaO0x_r7wYQ9sno1j07Je437n8+P1Gi6g@mail.gmail.com>

On Wed, Nov 20, 2019 at 03:07:16PM +0000, Leonid Movshovich wrote:
> On Wed, 20 Nov 2019 at 13:58, Russell King - ARM Linux admin
> <linux@armlinux.org.uk> wrote:
> >
> > On Wed, Nov 20, 2019 at 01:33:11PM +0000, Robin Murphy wrote:
> > > On 20/11/2019 11:25 am, Leonid Movshovich wrote:
> > > > On Wed, 20 Nov 2019 at 10:50, Russell King - ARM Linux admin
> > > > <linux@armlinux.org.uk> wrote:
> > > > >
> > > > > On Wed, Nov 20, 2019 at 10:44:39AM +0000, Leonid Movshovich wrote:
> > > > > > On Wed, 20 Nov 2019 at 01:15, Robin Murphy <robin.murphy@arm.com> wrote:
> > > > > > >
> > > > > > > On 2019-11-20 12:24 am, Leonid Movshovich wrote:
> > > > > > > > On Tue, 19 Nov 2019 at 23:36, Russell King - ARM Linux admin
> > > > > > > > <linux@armlinux.org.uk> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, Nov 19, 2019 at 11:12:26PM +0000, event wrote:
> > > > > > > > > > > So far, only the CPU selected by the top affinity bit was
> > > > > > > > > > > targeted. This resulted in all interrupts being processed by
> > > > > > > > > > > CPU0 by default, despite the "FF" default affinity setting for
> > > > > > > > > > > all interrupts.
> > > > > > > > >
> > > > > > > > > Have you checked whether this causes _ALL_ CPUs in the mask to be
> > > > > > > > > delivered a single interrupt, thereby causing _ALL_ CPUs to be
> > > > > > > > > > slowed down and hit the same locks at the same time?
> > > > > > > > >
> > > > > > > >
> > > > > > > > > Yes, I've checked this. No, the interrupt is delivered to only one CPU.
> > > > > > > > > Also, the ARM GIC architecture specification states in chapter
> > > > > > > > > 3.1.1 that hardware interrupts are delivered to a single CPU in
> > > > > > > > > a multiprocessor system ("1-N model").
> > > > > > >
> > > > > > > But see also section 3.2.3 - just because only one CPU actually runs the
> > > > > > > given ISR doesn't necessarily guarantee that the others *weren't*
> > > > > > > interrupted. I'd also hesitate to make any assumptions that all GIC
> > > > > > > implementations behave exactly the same way.
> > > > > > >
> > > > > > > Robin.
> > > > > >
> > > > > > Yes, that's right, however:
> > > > > > > 1. They are only interrupted for a split second, since the interrupt
> > > > > > > is immediately ACKed in gic_handle_irq
> > > > >
> > > > > Even that is detrimental - consider cpuidle where a CPU is placed in
> > > > > a low power state waiting for an interrupt, and it keeps getting woken
> > > > > for interrupts that it isn't able to handle.  The effect will be to
> > > > > stop the CPU hitting the lower power states, which would be a regression
> > > > > over how the kernel behaves today.
> > > > >
> > > > > > > 2. More importantly, smp_affinity in procfs is defined to let the user
> > > > > > > configure multiple CPUs to handle interrupts (see
> > > > > > > Documentation/IRQ-affinity.txt), which is effectively prohibited in
> > > > > > > the current implementation. I mean, when a user sets it to FF, she
> > > > > > > expects all CPUs to process interrupts, not only CPU0
> > >
> > > I have to say, my interaction with the IRQ layer is far more as a "user"
> > > than as a "developer", yet I've always assumed that the affinity mask
> > > represents the set of CPUs that *may* handle an interrupt and have never
> > > felt particularly surprised by the naive implementation of "just pick the
> > > first one".
> 
> Kernel documentation in Documentation/IRQ-affinity.txt sets an
> expectation that IRQs would be spread between CPUs evenly in case
> multiple CPUs are selected in smp_affinity. It also seems to be quite
> a common practice (in consumer devices at least) to have interrupts
> spread between CPUs. At least that's what happens on my PC and phone
> according to /proc/interrupts
> 
> > >
> > > Do these users also expect the scheduler to constantly context-switch a
> > > single active task all over the place just because the default thread
> > > affinity mask says it can?
> >
> > It is my understanding that the scheduler will try to keep tasks on
> > the CPU they are already running on, unless there's a benefit to
> > migrating it to a different CPU - because if you're constantly
> > migrating code between different CPUs, you're having to bounce
> > cache lines around the system.
> >
> > > > > The reason we've ended up with that on ARM is precisely because it
> > > > > wasted CPU resources, and my attempts at writing code to distribute
> > > > > the interrupt between CPU cores did not have a successful outcome.
> > > > > So, the best thing that could be done was to route interrupts to the
> > > > > first core, and run irqbalance to distribute the interrupts in a
> > > > > sensible, cache friendly way between CPU cores.
> > > > >
> > > > > And no, the current implementation is *NOT* prohibited.  You can't
> > > > > prohibit something that hardware hasn't been able to provide.
> > > > >
> > > >
> > > > > The hardware allows delivering an interrupt to a random CPU from the
> > > > > selected bitmask, and the current implementation offers no way to
> > > > > configure this.
> > > > While this may be an issue for power-concerned systems, there are also
> > > > systems with plenty of electricity where using all CPUs for e.g.
> > > > network packet handling is more important.
> > >
> > > It's not just about batteries - more and more SoCs these days have
> > > internally constrained power/thermal budgets too. Think of Intel's turbo
> > > boost, or those Amlogic TV box chips that can only hit their advertised top
> > > frequencies with one or two cores active - on systems like that, yanking all
> > > the cores out of standby every time could be actively detrimental to
> > > single-thread performance and actually end up *increasing*
> > > interrupt-handling latency.
> > >
> > > If you want to optimise a particular system for a particular use-case,
> > > you're almost certainly better off manually tuning affinities anyway
> > > (certain distros already do this). If you mostly just want /proc/interrupts
> > > to look pretty, there's irqbalance.
> >
> > The conclusion I came to when I did the initial 32-bit ARM SMP support
> > was:
> >
> > 1) it is policy, and userspace deals with policy
> > 2) routing the IRQ in to distribute it between CPUs is difficult
> 
> Yes, but the current implementation of smp_affinity does not allow
> setting multiple CPUs to handle the same interrupt. Neither hardware nor
> software seems to have any issues with distribution. In any case, I
> suggest keeping the default behaviour as is, so only those who know what
> they are doing would be playing around with this.
> 
> > 3) the problem is already solved by userspace (irqbalance)
> 
> irqbalance sets smp_affinity. If one wants to dedicate a subset of
> CPUs to a certain interrupt with the current implementation of
> set_affinity, irqbalance has to sit there and switch affinities all
> the time: constantly read /proc/interrupts and change smp_affinity.
> That doesn't sound like a great solution at all.
> Not to mention that irqbalance pulls in glib, which won't make many
> embedded developers happy.

This discussion is going nowhere.

I've stated my position based on experience as 32-bit ARM maintainer
trying to make it work.  It may not conform to the documentation, but
it's what has been used for decades on 32-bit ARM, and what most
people have been perfectly happy with.

If you think you have a solution to the stated problem that solves
it for hardware that doesn't automatically distribute interrupts,
then go off and code it and provide a patch.  Otherwise, no amount
of emails stating "but the documentation says X" is going to change
anything.
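
For readers following the thread, the two policies being argued over
can be sketched in user-space C.  This is illustration only, not code
from the posted patch or from the irq-gic driver.  It assumes the GICv2
layout in which each interrupt has an 8-bit target field (one bit per
CPU interface, four fields per 32-bit GICD_ITARGETSR register), and all
function names are hypothetical.

```c
#include <stdint.h>

/*
 * Behaviour described in the thread as the current one: target only the
 * lowest-numbered online CPU found in the affinity mask.
 */
static uint8_t targets_first_cpu(uint8_t affinity, uint8_t online)
{
	uint8_t usable = affinity & online;

	/* x & -x isolates the lowest set bit; an empty mask stays 0. */
	return (uint8_t)(usable & -usable);
}

/*
 * Behaviour the patch title proposes: target every online CPU in the
 * mask and leave delivery among them to the hardware.
 */
static uint8_t targets_all_cpus(uint8_t affinity, uint8_t online)
{
	return (uint8_t)(affinity & online);
}

/*
 * Place an 8-bit target field into its slot of a 32-bit register value
 * (assumed layout: four interrupts per register, 8 bits each).
 */
static uint32_t itargetsr_update(uint32_t reg, unsigned int irq,
				 uint8_t targets)
{
	unsigned int shift = (irq % 4) * 8;

	reg &= ~((uint32_t)0xff << shift);
	return reg | ((uint32_t)targets << shift);
}
```

With an affinity of 0xFF on a four-CPU system (online mask 0x0F), the
first helper yields 0x01 (CPU0 only) and the second yields 0x0F.  What
the hardware then does with a multi-bit target field is exactly the
point in dispute above.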

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel