public inbox for linux-arm-kernel@lists.infradead.org
From: Marc Zyngier <maz@kernel.org>
To: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>
Cc: "linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"mark.rutland@arm.com" <mark.rutland@arm.com>
Subject: Re: [PATCH] GICv3: Add restart handler to detach CPU from GICv3
Date: Thu, 05 Jan 2023 10:37:50 +0000
Message-ID: <86lemh6zcx.wl-maz@kernel.org>
In-Reply-To: <cca60e816c1f00cf1e2959ca1a47b1fd48f3899a.camel@infinera.com>

On Wed, 04 Jan 2023 19:52:33 +0000,
Joakim Tjernlund <Joakim.Tjernlund@infinera.com> wrote:
> 
> On Wed, 2023-01-04 at 18:48 +0000, Marc Zyngier wrote:
> > On Wed, 04 Jan 2023 17:23:42 +0000,
> > Joakim Tjernlund <Joakim.Tjernlund@infinera.com> wrote:
> > > 
> > > On Wed, 2023-01-04 at 16:50 +0000, Marc Zyngier wrote:
> > > > On Wed, 04 Jan 2023 16:04:14 +0000,
> > > > Mark Rutland <mark.rutland@arm.com> wrote:
> > > > > 
> > > > > On Tue, Jan 03, 2023 at 04:27:07PM +0000, Joakim Tjernlund wrote:
> > > > > > On Fri, 2022-12-16 at 17:21 +0100, Joakim Tjernlund wrote:
> > > > > > 
> > > > > > Ping?
> > > > > 
> > > > > To whom?
> > > > > 
> > > > > You don't appear to have Cc'd any relevant maintainer, and people are still
> > > > > on holiday, so it's extremely likely this will be missed.
> > > > 
> > > > That, plus nobody reads the list looking for this sort of thing.
> > > > 
> > > > > 
> > > > > For the maintainer, please use scripts/get_maintainer.pl, e.g.
> > > > > 
> > > > > > [mark@lakrids:~/src/linux]% ./scripts/get_maintainer.pl -f drivers/irqchip/irq-gic-v3.c
> > > > > > Thomas Gleixner <tglx@linutronix.de> (maintainer:IRQCHIP DRIVERS)
> > > > > > Marc Zyngier <maz@kernel.org> (maintainer:IRQCHIP DRIVERS)
> > > > > > linux-kernel@vger.kernel.org (open list:IRQCHIP DRIVERS)
> > > > > 
> > > > > Note: I've Cc'd Marc, who wrote the GICv3 driver.
> > > > 
> > > > Cheers Mark, much appreciated.
> > > 
> > > Sorry for missing that extra maintainer CC:
> > > 
> > > > 
> > > > > 
> > > > > > > Needed for reboot without resetting the whole GIC
> > > > > 
> > > > > This doesn't really explain what you're trying to do nor why.
> > > > > 
> > > > > Why do you need to "reboot without resetting the whole GIC" ?
> > > > > 
> > > > > Do you encounter a problem if we try to reset the whole GIC?
> > > > > 
> > > > > Is this for kexec?
> > > > > 
> > > > > Is this for some use-case enabled by out-of-tree code?
> > > > 
> > > > All valid questions. This smells of a terrible hack...
> > > 
> > > Yes, all good Q's.
> > 
> > And no answer?
> 
> No kexec, out of tree kernel with minor mods
> > 
> > > 
> > > > 
> > > > The interesting aspect is that this is only done when DS=1, probably
> > > > meaning that they are doing this in a VM. It also relies on some
> > > 
> > > Nope, on our custom target.
> > 
> > And you run with DS=1? On bare metal? Humpf...
> 
> Yes, we control all SW running on this target.
> 
> > 
> > > 
> > > > (unbounded) UNPRED behaviour as ProcessorSleep is entered without
> > > > any consideration for Group0... Good luck with that.
> 
> No Group0 IRQ used (or so I think, but I could be mistaken)
> 
> > > 
> > > hmm, I am doing the same as PM does which also needs DS=1 so I figured
> > > this was uncontroversial.
> > 
> > I seriously doubt anyone is actually using that code in anger. I
> > always found it dodgy, and I'm pretty sure it is totally broken.
> 
> I see
> 
> > 
> > > 
> > > > 
> > > > Anyway, I don't think we want any of this stuff. Certainly not without
> > > > a cast-iron justification.
> > > 
> > > We use several cores but only one runs Linux so a Linux reboot will
> > > only reset its own core.
> > 
> > And what happens with the distributor? One of the key assumptions of
> > the GIC architecture is that there is only one operating system in
> > control of it, the whole of it. The only way to share it is by
> > virtualising it. Shades of Jailhouse...
> 
> Distributor is not touched until Linux is going up again
> 
> > 
> > > 
> > > Without this patch I either need to reset the GIC as part of the
> > > reboot or I get RWP timeout when Linux starts again. Resetting the
> > > GIC kills IRQ on the other cores for a long time which is unwanted.
> > 
> > But that's what happens anyway when Linux boots (we reinitialise the
> > whole distributor). So what is this about?
> 
> At that point IRQ will only be lost shortly until the other cores
> are reloaded by Linux user space

How different is that from resetting the GIC altogether?

> 
> > 
> > > 
> > > The RWP timeout comes from lost HW handshake between core and GIC.
> > > Is there another way to regain the HW handshake ?
> > 
> > That's the firmware's job. But overall, getting these timeouts means
> > your GIC has locked up. It would be more interesting to understand
> 
> FW here is u-boot starting from EL3
> 
> > *why* you get in that situation, which I presume is due to the way the
> > driver initialises itself.
> > 
> > Assuming you use kexec to reboot your Linux instance, does the
> > following help (totally untested)?
> 
> no kexec and no it did not help.
> I get the first RWP here:
> static void __init gic_dist_init(void)
> {
> 	unsigned int i;
> 	u64 affinity;
> 	void __iomem *base = gic_data.dist_base;
> 	u32 val;
> 
> 	/* Disable the distributor */
> 	writel_relaxed(0, base + GICD_CTLR);
> 	gic_dist_wait_for_rwp(); <------ RWP timeout
> 
> I guess it is impossible to recover GIC HW handshake once it is lost ?

This is trying to tear down the distributor, but the downstream
redistributor (and CPU interface) are still up. The architecture
doesn't specify what can happen in this case, and I wouldn't be
surprised if you were triggering some really bad HW bugs. Please
provide the exact version of your GIC (at least the GICD_IIDR value).
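
To make the GICD_IIDR value easy to report, here is a minimal sketch of
how the register could be decoded. It assumes the GICv3 field layout as
I understand it (Implementer in bits [11:0], Revision in [15:12],
Variant in [19:16], ProductID in [31:24]). The struct and function names
are hypothetical and not part of the kernel driver:

```c
#include <stdint.h>

/* Hypothetical container for the decoded GICD_IIDR fields. */
struct gic_iidr {
	uint16_t implementer;	/* bits [11:0], JEP106 code of the implementer */
	uint8_t revision;	/* bits [15:12] */
	uint8_t variant;	/* bits [19:16] */
	uint8_t product_id;	/* bits [31:24] */
};

/* Split a raw GICD_IIDR value into its fields (assumed layout, see above). */
static struct gic_iidr decode_gicd_iidr(uint32_t iidr)
{
	struct gic_iidr id = {
		.implementer = (uint16_t)(iidr & 0xfff),
		.revision = (uint8_t)((iidr >> 12) & 0xf),
		.variant = (uint8_t)((iidr >> 16) & 0xf),
		.product_id = (uint8_t)((iidr >> 24) & 0xff),
	};

	return id;
}
```

The raw 32-bit value read from the distributor is what matters; the
decoded fields are only a convenience when comparing against errata
lists.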

Since you're in control of your firmware, you should reset things
there, because even if we hack something here for the boot CPU, this
says nothing of the other CPUs.
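
For reference, the wait that times out in gic_dist_init() is a bounded
poll on GICD_CTLR.RWP (bit 31). The sketch below shows that pattern in
plain C so it can be exercised outside the kernel. The callback type,
the fake register and the poll budget are all assumptions made for
illustration, not the driver's actual code or timeout:

```c
#include <stdbool.h>
#include <stdint.h>

#define GICD_CTLR_RWP	(UINT32_C(1) << 31)	/* Register Write Pending */

/* Hypothetical accessor standing in for the MMIO read of GICD_CTLR. */
typedef uint32_t (*read_ctlr_fn)(void *ctx);

/*
 * Read GICD_CTLR at most max_polls times.
 * Returns true as soon as RWP reads as clear, false if it never does.
 */
static bool wait_for_rwp(read_ctlr_fn read_ctlr, void *ctx, uint32_t max_polls)
{
	for (uint32_t i = 0; i < max_polls; i++) {
		if (!(read_ctlr(ctx) & GICD_CTLR_RWP))
			return true;
	}
	return false;
}

/* Fake distributor: RWP stays set for busy_reads reads, UINT32_MAX = stuck. */
struct fake_gicd {
	uint32_t busy_reads;
};

static uint32_t fake_read_ctlr(void *ctx)
{
	struct fake_gicd *g = ctx;

	if (g->busy_reads == UINT32_MAX)
		return GICD_CTLR_RWP;	/* models the locked-up case */
	if (g->busy_reads > 0) {
		g->busy_reads--;
		return GICD_CTLR_RWP;
	}
	return 0;
}
```

A fake with busy_reads set to UINT32_MAX models the situation described
in this thread: the bit never clears, so the only thing the caller can
do is give up and report the timeout.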

	M.

-- 
Without deviation from the norm, progress is not possible.
