From: Marc Zyngier <maz@kernel.org>
To: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: <tglx@linutronix.de>, <linux-kernel@vger.kernel.org>,
	<guohanjun@huawei.com>, John Garry <john.garry@huawei.com>
Subject: Re: [PATCH] irqchip/gic-v3-its: Select housekeeping CPUs preferentially for managed IRQs
Date: Tue, 25 Jan 2022 13:31:38 +0000
Message-ID: <87sftc6ix1.wl-maz@kernel.org>
In-Reply-To: <12ac7447-34dc-8497-b608-ada5a2ba17c4@huawei.com>

On Tue, 25 Jan 2022 12:49:20 +0000,
Xiongfeng Wang <wangxiongfeng2@huawei.com> wrote:
> 
> Hi Marc,
> 
> On 2022/1/24 19:24, Marc Zyngier wrote:
> > + John Garry, as he was reporting issues around the same piece of code[1]
> > 
> > On Mon, 24 Jan 2022 07:34:40 +0000,
> > Xiongfeng Wang <wangxiongfeng2@huawei.com> wrote:
> >>
> >> When using kernel parameter 'isolcpus=managed_irq,xxxx' to bind the
> >> managed IRQs to housekeeping CPUs, the effective_affinity sometimes
> >> still contains the non-housekeeping CPUs.
> >>
> >> irq_do_set_affinity() passes the housekeeping cpumask to
> >> chip->irq_set_affinity(), but ITS driver select CPU according to
> >> irq_common_data->affinity. While 'irq_common_data->affinity' is updated
> >> after chip->irq_set_affinity() is called in irq_do_set_affinity(). Also
> >> 'irq_common_data->affinity' may contains non-housekeeping CPUs. I found
> >> the below link explaining the reason.
> >>   https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2267032.html
> >>
> >> To modify CPU selecting logic to prefer housekeeping CPUs, select CPU
> >> from the input cpumask parameter first. If none of it is online, then
> >> select CPU from 'irq_common_data->affinity'.
> >>
> >> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> >> ---
> >>  drivers/irqchip/irq-gic-v3-its.c | 5 ++++-
> >>  1 file changed, 4 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> >> index d25b7a864bbb..17c15d3b2784 100644
> >> --- a/drivers/irqchip/irq-gic-v3-its.c
> >> +++ b/drivers/irqchip/irq-gic-v3-its.c
> >> @@ -1624,7 +1624,10 @@ static int its_select_cpu(struct irq_data *d,
> >>  
> >>  		cpu = cpumask_pick_least_loaded(d, tmpmask);
> >>  	} else {
> >> -		cpumask_and(tmpmask, irq_data_get_affinity_mask(d), cpu_online_mask);
> >> +		cpumask_and(tmpmask, aff_mask, cpu_online_mask);
> >> +		if (cpumask_empty(tmpmask))
> >> +			cpumask_and(tmpmask, irq_data_get_affinity_mask(d),
> >> +				    cpu_online_mask);
> > 
> > I think that the online_cpu_mask logical and is a bit wrong. A managed
> > interrupt should be able to target an offline CPU:
> > 
> > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > index eb0882d15366..0cea46bdaf99 100644
> > --- a/drivers/irqchip/irq-gic-v3-its.c
> > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > @@ -1620,7 +1620,7 @@ static int its_select_cpu(struct irq_data *d,
> >  
> >  		cpu = cpumask_pick_least_loaded(d, tmpmask);
> >  	} else {
> > -		cpumask_and(tmpmask, irq_data_get_affinity_mask(d), cpu_online_mask);
> > +		cpumask_copy(tmpmask, aff_mask);
> >  
> >  		/* If we cannot cross sockets, limit the search to that node */
> >  		if ((its_dev->its->flags & ITS_FLAGS_WORKAROUND_CAVIUM_23144) &&
> 
> I have tested the above modification with 'maxcpus=1' kernel parameter and got
> the following CallTrace.
> 
> [   14.679493][    T5] pstate: 204000c9 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   14.687114][    T5] pc : lpi_update_config+0xe0/0x300
> [   14.692146][    T5] lr : lpi_update_config+0x3c/0x300

That's a problem similar to what John was seeing: the CPU isn't there,
and a lot of stuff goes very wrong in the absence of a CPU targeted by
a managed interrupt.

> > We still have an issue when the system hasn't booted with all its
> > CPUs, as the corresponding collections aren't initialised and we
> > end-up in a rather bad place.
> 
> Shall we fix this 'effective CPU of managed IRQs is not a housekeeping
> CPU' issue first, or shall we wait until the 'maxcpus=1' issue is
> fixed?

I think we need to address this first. There is no point in only half
fixing it.
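
For anyone following along, the selection order proposed in the patch can
be sketched in userspace C. This is only an illustration under stated
assumptions: plain 64-bit bitmasks stand in for the kernel's cpumasks,
first_cpu() and select_cpu_sketch() are hypothetical names that do not
exist in the driver, and taking the lowest set bit replaces the driver's
least-loaded pick.

```c
#include <assert.h>
#include <stdint.h>

#define NO_CPU (-1)

/* Return the lowest CPU number set in the mask, or NO_CPU if it is empty. */
static int first_cpu(uint64_t mask)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return NO_CPU;
}

/*
 * Hypothetical stand-in for the else-branch of its_select_cpu() as
 * proposed in the patch: try the requested mask first, then fall back
 * to the IRQ's full affinity, both restricted to online CPUs.
 */
static int select_cpu_sketch(uint64_t aff_mask, uint64_t irq_affinity,
			     uint64_t online)
{
	uint64_t tmp = aff_mask & online;

	if (!tmp)
		tmp = irq_affinity & online;

	return first_cpu(tmp);
}

/* Usage examples; the masks are made-up values, not taken from a real system. */
void select_cpu_sketch_demo(void)
{
	/* Requested CPUs 0-1, CPU 0 offline: CPU 1 is chosen. */
	assert(select_cpu_sketch(0x3, 0xf, 0xe) == 1);
	/* Requested CPUs 0-1 both offline: fall back to the affinity mask. */
	assert(select_cpu_sketch(0x3, 0xf, 0xc) == 2);
	/* Only CPU 0 online, IRQ bound to CPUs 2-3: nothing can be selected. */
	assert(select_cpu_sketch(0xc, 0xc, 0x1) == NO_CPU);
}
```

The last case corresponds to the 'maxcpus=1' situation above: neither mask
intersects the online CPUs, so any fix has to decide what to do when no
valid target exists.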

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
