LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [RESEND PATCH 0/6] rapidio: move Kconfig menu definition to subsystem
From: Alex Bounine @ 2018-07-31 15:12 UTC (permalink / raw)
  To: Alexei Colin, Andrew Morton
  Cc: John Paul Walters, Catalin Marinas, Russell King, Arnd Bergmann,
	Will Deacon, Ralf Baechle, Paul Burton, Alexander Sverdlin,
	Benjamin Herrenschmidt, Paul Mackerras, Thomas Gleixner,
	Peter Anvin, Matt Porter, x86, linuxppc-dev, linux-mips,
	linux-arm-kernel, linux-kernel
In-Reply-To: <20180731142954.30345-1-acolin@isi.edu>

Acked-by: Alexandre Bounine <alex.bou9@gmail.com>


On 2018-07-31 10:29 AM, Alexei Colin wrote:
> Resending the patchset from prior submission:
> https://lkml.org/lkml/2018/7/30/911
> 
> The only change are the Cc tags in all patches now include the mailing
> lists for all affected architectures, and patch 1/6 (which adds the menu
> item to RapdidIO subsystem Kconfig) is CCed to all maintainers who are
> getting this cover letter. The cover letter has been updated with
> explanations to points raised in the feedback.
> 
> 
> 
> The top-level Kconfig entry for RapidIO subsystem is currently
> duplicated in several architecture-specific Kconfig files. This set of
> patches does two things:
> 
> 1. Move the Kconfig menu definition into the RapidIO subsystem and
> remove the duplicate definitions from arch Kconfig files.
> 
> 2. Enable RapidIO Kconfig menu entry for arm and arm64 architectures,
> where it was not enabled before. I tested that subsystem and drivers
> build successfully for both architectures, and tested that the modules
> load on a custom arm64 Qemu model.
> 
> For all architectures, RapidIO menu should be offered when either:
> (1) The platform has a PCI bus (which host a RapidIO module on the bus).
> (2) The platform has a RapidIO IP block (connected to a system bus, e.g.
> AXI on ARM). In this case, 'select HAS_RAPIDIO' should be added to the
> 'config ARCH_*' menu entry for the SoCs that offer the IP block.
> 
> Prior to this patchset, different architectures used different criteria:
> * powerpc: (1) and (2)
> * mips: (1) and (2) after recent commit into next that added (2):
>    https://www.linux-mips.org/archives/linux-mips/2018-07/msg00596.html
>    fc5d988878942e9b42a4de5204bdd452f3f1ce47
>    491ec1553e0075f345fbe476a93775eabcbc40b6
> * x86: (1)
> * arm,arm64: none (RapidIO menus never offered)
> 
> This set of architectures are the ones that implement support for
> RapidIO as system bus. On some platforms RapidIO can be the only system
> bus available replacing PCI/PCIe.  As it is done now, RapidIO is
> configured in "Bus Options" (x86/PPC) or "Bus Support" (ARMs) sub-menu
> and from system configuration option it should be kept this way.
> Current location of RAPIDIO configuration option is familiar to users of
> PowerPC and x86 platforms, and is similarly available in some ARM
> manufacturers kernel code trees. (Alex Bounine)
> 
> HAS_RAPIDIO is not enabled unconditionally, because HAS_RAPIDIO option
> is intended for SOCs that have built in SRIO controllers, like TI
> KeyStoneII or FPGAs. Because RapidIO subsystem core is required during
> RapidIO port driver initialization, having separate option allows us to
> control available build options for RapidIO core and port driver (bool
> vs.  tristate) and disable module option if port driver is configured as
> built-in. (Alex Bounine)
> 
> Responses to feedback from prior submission (thanks for the reviews!):
> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-July/593347.html
> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-July/593349.html
> 
> Changelog:
>    * Moved Kconfig entry into RapidIO subsystem instead of duplicating
> 
> In the current patchset, I took the approach of adding '|| PCI' to the
> depends in the subsystem. I did try the alterantive approach mentioned
> in the reviews for v1 of this patch, where the subsystem Kconfig does
> not add a '|| PCI' and each per-architecture Kconfig has to add a
> 'select HAS_RAPIDIO if PCI' and SoCs with IP blocks have to also add
> 'select HAS_RAPIDIO'. This works too but requires each architecture's
> Kconfig to add the line for RapidIO (whereas current approach does not
> require that involvement) and also may create a false impression that
> the dependency on PCI is strict.
> 
> We appreciate the suggestion for also selecting the RapdiIO subsystem for
> compilation with COMPILE_TEST, but hope to address it in a separate
> patchset, localized to the subsystem, since it will need to change
> depends on all drivers, not just on the top level, and since this
> patch now spans multiple architectures.
> 
> Alexei Colin (6):
>    rapidio: define top Kconfig menu in driver subtree
>    x86: factor out RapidIO Kconfig menu
>    powerpc: factor out RapidIO Kconfig menu entry
>    mips: factor out RapidIO Kconfig entry
>    arm: enable RapidIO menu in Kconfig
>    arm64: enable RapidIO menu in Kconfig
> 
>   arch/arm/Kconfig        |  2 ++
>   arch/arm64/Kconfig      |  2 ++
>   arch/mips/Kconfig       | 11 -----------
>   arch/powerpc/Kconfig    | 13 +------------
>   arch/x86/Kconfig        |  8 --------
>   drivers/rapidio/Kconfig | 15 +++++++++++++++
>   6 files changed, 20 insertions(+), 31 deletions(-)
> 

^ permalink raw reply

* Re: [RESEND PATCH 1/6] rapidio: define top Kconfig menu in driver subtree
From: Russell King - ARM Linux @ 2018-07-31 15:29 UTC (permalink / raw)
  To: Alexei Colin
  Cc: Alexandre Bounine, Matt Porter, Andrew Morton, John Paul Walters,
	Catalin Marinas, Arnd Bergmann, Will Deacon, Ralf Baechle,
	Paul Burton, Alexander Sverdlin, Benjamin Herrenschmidt,
	Paul Mackerras, Thomas Gleixner, Peter Anvin, x86, linuxppc-dev,
	linux-mips, linux-arm-kernel, linux-kernel
In-Reply-To: <20180731142954.30345-2-acolin@isi.edu>

On Tue, Jul 31, 2018 at 10:29:49AM -0400, Alexei Colin wrote:
> The top-level Kconfig entry for RapidIO subsystem is currently
> duplicated in several architecture-specific Kconfigs. This
> commit re-defines it in the driver subtree, and subsequent
> commits, one per architecture, will remove the duplicated
> definitions from respective per-architecture Kconfigs.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: John Paul Walters <jwalters@isi.edu>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: Paul Burton <paul.burton@mips.com>
> Cc: Alexander Sverdlin <alexander.sverdlin@nokia.com>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Anvin <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-mips@linux-mips.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Alexei Colin <acolin@isi.edu>
> ---
>  drivers/rapidio/Kconfig | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/rapidio/Kconfig b/drivers/rapidio/Kconfig
> index d6d2f20c4597..98e301847584 100644
> --- a/drivers/rapidio/Kconfig
> +++ b/drivers/rapidio/Kconfig
> @@ -1,6 +1,21 @@
>  #
>  # RapidIO configuration
>  #
> +
> +config HAS_RAPIDIO
> +	bool
> +	default n

There's no need to specify this default - the default default defaults to
'n' anyway, so "default n" just respecifies what's already the default.
(next time I'll try to add more "default"s into that! ;) )

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up

^ permalink raw reply

* [PATCH] powerpc/pasemi: Seach for PCI root bus by compatible property
From: Christian Zigotzky @ 2018-07-31 15:34 UTC (permalink / raw)
  To: Michael Ellerman, Darren Stevens, linuxppc-dev; +Cc: Olof Johansson
In-Reply-To: <87zhy7fw7f.fsf@concordia.ellerman.id.au>

Just for info: I tested it on my Nemo board today and it works.

-- Christian

On 31 July 2018 at 2:04PM, Michael Ellerman wrote:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> Darren Stevens <darren@stevens-zone.net> writes:
>>
>>> Pasemi arch code finds the root of the PCI-e bus by searching the
>>> device-tree for a node called 'pxp'. But the root bus has a
>>> compatible property of 'pasemi,rootbus' so search for that instead.
>>>
>>> Signed-off-by: Darren Stevens <darren@stevens-zone.net>
>>> ---
>>>
>>> This works on the Amigaone X1000, I don't know if this method of
>>> finding the pci bus was there bcause of earlier firmwares.
>> Does anyone have another pasemi board they can test this on?
>>
>> The last time I plugged mine in it popped the power supply and took out
>> power to half the office :) - I haven't had a chance to try it since.
> I actually I remembered I have a device tree lying around from an electra.
>
> It has:
>
>    [I] home:pxp@0,80000000(7)(I)> lsprop name compatible
>    name             "pxp"
>    compatible       "pasemi,rootbus"
>                     "pa-pxp"
>
>
> So it looks like the patch would work fine on it at least.
>
> cheers
>
>>> diff --git a/arch/powerpc/platforms/pasemi/pci.c b/arch/powerpc/platforms/pasemi/pci.c
>>> index c7c8607..be62380 100644
>>> --- a/arch/powerpc/platforms/pasemi/pci.c
>>> +++ b/arch/powerpc/platforms/pasemi/pci.c
>>> @@ -216,6 +216,7 @@ static int __init pas_add_bridge(struct device_node *dev)
>>>   void __init pas_pci_init(void)
>>>   {
>>>      struct device_node *np, *root;
>>> +   int res;
>>>   
>>>      root = of_find_node_by_path("/");
>>>      if (!root) {
>>> @@ -226,11 +227,11 @@ void __init pas_pci_init(void)
>>>   
>>>      pci_set_flags(PCI_SCAN_ALL_PCIE_DEVS);
>>>   
>>> -   for (np = NULL; (np = of_get_next_child(root, np)) != NULL;)
>>> -       if (np->name && !strcmp(np->name, "pxp") && !pas_add_bridge(np))
>>> -           of_node_get(np);
>>> -
>>> -   of_node_put(root);
>>> +   np = of_find_compatible_node(root, NULL, "pasemi,rootbus");
>>> +   if (np) {
>>> +       res = pas_add_bridge(np);
>>> +       of_node_put(np);
>>> +   }
>>>   }
>>>   
>>>   void __iomem *pasemi_pci_getcfgaddr(struct pci_dev *dev, int offset)

^ permalink raw reply

* Re: [RESEND PATCH 5/6] arm: enable RapidIO menu in Kconfig
From: Russell King - ARM Linux @ 2018-07-31 15:36 UTC (permalink / raw)
  To: Alexei Colin
  Cc: Alexandre Bounine, Arnd Bergmann, Andrew Morton,
	John Paul Walters, linux-kernel, x86, linuxppc-dev, linux-mips,
	linux-arm-kernel
In-Reply-To: <20180731142954.30345-6-acolin@isi.edu>

On Tue, Jul 31, 2018 at 10:29:53AM -0400, Alexei Colin wrote:
> Platforms with a PCI bus will be offered the RapidIO menu since they may
> be want support for a RapidIO PCI device. Platforms without a PCI bus
> that might include a RapidIO IP block will need to "select HAS_RAPIDIO"
> in the platform-/machine-specific "config ARCH_*" Kconfig entry.
> 
> Tested that kernel builds for ARM with the RapidIO subsystem and switch
> drivers enabled.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: John Paul Walters <jwalters@isi.edu>
> Cc: linux-kernel@vger.kernel.org
> Cc: x86@kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-mips@linux-mips.org
> Cc: linux-arm-kernel@lists.infradead.org
> Signed-off-by: Alexei Colin <acolin@isi.edu>
> ---
>  arch/arm/Kconfig | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index afe350e5e3d9..602a61324890 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1278,6 +1278,8 @@ config PCI_HOST_ITE8152
>  
>  source "drivers/pci/Kconfig"
>  
> +source "drivers/rapidio/Kconfig"
> +
>  source "drivers/pcmcia/Kconfig"

Why not place the new Kconfig file after pcmcia?  That way, it is in
a consistent position wrt architectures such as powerpc, and it is
also in alphabetical order.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up

^ permalink raw reply

* Re: [RESEND PATCH 0/6] rapidio: move Kconfig menu definition to subsystem
From: Russell King - ARM Linux @ 2018-07-31 15:59 UTC (permalink / raw)
  To: Alexei Colin
  Cc: Alexandre Bounine, Andrew Morton, John Paul Walters,
	Catalin Marinas, Arnd Bergmann, Will Deacon, Ralf Baechle,
	Paul Burton, Alexander Sverdlin, Benjamin Herrenschmidt,
	Paul Mackerras, Thomas Gleixner, Peter Anvin, Matt Porter, x86,
	linuxppc-dev, linux-mips, linux-arm-kernel, linux-kernel
In-Reply-To: <20180731142954.30345-1-acolin@isi.edu>

For the thread associated with this patch set, a review of a previous
patch for ARM posted last Tuesday on this subject asked a series of
questions about the PCI-nature of this.  The review has not been
responded to.

If it is inappropriate to offer RapidIO for any architecture that
happens to has PCI, then it is inappropriate to offer it for any
ARM machine that happens to have PCI.

In light of the lack of explanation on this point so far, I'm naking
the ARM part of this series for now.

I also think that the HAS_RAPIDIO thing is misleading and needs
sorting out (as I've mentioned in other emails, including the one
I refer to above) before rapidio becomes available more widely.

On Tue, Jul 31, 2018 at 10:29:48AM -0400, Alexei Colin wrote:
> Resending the patchset from prior submission:
> https://lkml.org/lkml/2018/7/30/911
> 
> The only change are the Cc tags in all patches now include the mailing
> lists for all affected architectures, and patch 1/6 (which adds the menu
> item to RapdidIO subsystem Kconfig) is CCed to all maintainers who are
> getting this cover letter. The cover letter has been updated with
> explanations to points raised in the feedback.
> 
> 
> 
> The top-level Kconfig entry for RapidIO subsystem is currently
> duplicated in several architecture-specific Kconfig files. This set of
> patches does two things:
> 
> 1. Move the Kconfig menu definition into the RapidIO subsystem and
> remove the duplicate definitions from arch Kconfig files.
> 
> 2. Enable RapidIO Kconfig menu entry for arm and arm64 architectures,
> where it was not enabled before. I tested that subsystem and drivers
> build successfully for both architectures, and tested that the modules
> load on a custom arm64 Qemu model.
> 
> For all architectures, RapidIO menu should be offered when either:
> (1) The platform has a PCI bus (which host a RapidIO module on the bus).
> (2) The platform has a RapidIO IP block (connected to a system bus, e.g.
> AXI on ARM). In this case, 'select HAS_RAPIDIO' should be added to the
> 'config ARCH_*' menu entry for the SoCs that offer the IP block.
> 
> Prior to this patchset, different architectures used different criteria:
> * powerpc: (1) and (2)
> * mips: (1) and (2) after recent commit into next that added (2):
>   https://www.linux-mips.org/archives/linux-mips/2018-07/msg00596.html
>   fc5d988878942e9b42a4de5204bdd452f3f1ce47
>   491ec1553e0075f345fbe476a93775eabcbc40b6
> * x86: (1)
> * arm,arm64: none (RapidIO menus never offered)
> 
> This set of architectures are the ones that implement support for
> RapidIO as system bus. On some platforms RapidIO can be the only system
> bus available replacing PCI/PCIe.  As it is done now, RapidIO is
> configured in "Bus Options" (x86/PPC) or "Bus Support" (ARMs) sub-menu
> and from system configuration option it should be kept this way.
> Current location of RAPIDIO configuration option is familiar to users of
> PowerPC and x86 platforms, and is similarly available in some ARM
> manufacturers kernel code trees. (Alex Bounine)
> 
> HAS_RAPIDIO is not enabled unconditionally, because HAS_RAPIDIO option
> is intended for SOCs that have built in SRIO controllers, like TI
> KeyStoneII or FPGAs. Because RapidIO subsystem core is required during
> RapidIO port driver initialization, having separate option allows us to
> control available build options for RapidIO core and port driver (bool
> vs.  tristate) and disable module option if port driver is configured as
> built-in. (Alex Bounine)
> 
> Responses to feedback from prior submission (thanks for the reviews!):
> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-July/593347.html
> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-July/593349.html
> 
> Changelog:
>   * Moved Kconfig entry into RapidIO subsystem instead of duplicating
> 
> In the current patchset, I took the approach of adding '|| PCI' to the
> depends in the subsystem. I did try the alterantive approach mentioned
> in the reviews for v1 of this patch, where the subsystem Kconfig does
> not add a '|| PCI' and each per-architecture Kconfig has to add a
> 'select HAS_RAPIDIO if PCI' and SoCs with IP blocks have to also add
> 'select HAS_RAPIDIO'. This works too but requires each architecture's
> Kconfig to add the line for RapidIO (whereas current approach does not
> require that involvement) and also may create a false impression that
> the dependency on PCI is strict.
> 
> We appreciate the suggestion for also selecting the RapdiIO subsystem for
> compilation with COMPILE_TEST, but hope to address it in a separate
> patchset, localized to the subsystem, since it will need to change
> depends on all drivers, not just on the top level, and since this
> patch now spans multiple architectures.
> 
> Alexei Colin (6):
>   rapidio: define top Kconfig menu in driver subtree
>   x86: factor out RapidIO Kconfig menu
>   powerpc: factor out RapidIO Kconfig menu entry
>   mips: factor out RapidIO Kconfig entry
>   arm: enable RapidIO menu in Kconfig
>   arm64: enable RapidIO menu in Kconfig
> 
>  arch/arm/Kconfig        |  2 ++
>  arch/arm64/Kconfig      |  2 ++
>  arch/mips/Kconfig       | 11 -----------
>  arch/powerpc/Kconfig    | 13 +------------
>  arch/x86/Kconfig        |  8 --------
>  drivers/rapidio/Kconfig | 15 +++++++++++++++
>  6 files changed, 20 insertions(+), 31 deletions(-)
> 
> -- 
> 2.18.0
> 

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up

^ permalink raw reply

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Christoph Hellwig @ 2018-07-31 17:30 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Christoph Hellwig, Will Deacon, Anshuman Khandual, virtualization,
	linux-kernel, linuxppc-dev, aik, robh, joe, elfring, david,
	jasowang, benh, mpe, linuxram, haren, paulus, srikar,
	robin.murphy, jean-philippe.brucker, marc.zyngier
In-Reply-To: <20180730155633-mutt-send-email-mst@kernel.org>

On Mon, Jul 30, 2018 at 04:26:32PM +0300, Michael S. Tsirkin wrote:
> Real hardware would reuse parts of the interface but by necessity it
> needs to behave slightly differently on some platforms.  However for
> some platforms (such as x86) a PV virtio driver will by luck work with a
> PCI device backend without changes. As these platforms and drivers are
> widely deployed, some people will deploy hardware like that.  Should be
> a non issue as by definition it's transparent to guests.

On some x86.  As soon as you have an iommu or strange PCI root ports
things are going to start breaking apart.

> > And that very much excludes arch-specific (or
> > Xen-specific) overrides.
> 
> We already committed to a xen specific hack but generally I prefer
> devices that describe how they work instead of platforms magically
> guessing, yes.

For legacy reasons I guess we'll have to keep it, but we really need
to avoid adding more junk than this.

> However the question people raise is that DMA API is already full of
> arch-specific tricks the likes of which are outlined in your post linked
> above. How is this one much worse?

None of these warts is visible to the driver, they are all handled in
the architecture (possibly on a per-bus basis).

So for virtio we really need to decide if it has one set of behavior
as specified in the virtio spec, or if it behaves exactly as if it
was on a PCI bus, or in fact probably both as you lined up.  But no
magic arch specific behavior inbetween.

^ permalink raw reply

* Re: [RESEND PATCH 0/6] rapidio: move Kconfig menu definition to subsystem
From: Alex Bounine @ 2018-07-31 18:21 UTC (permalink / raw)
  To: Russell King - ARM Linux, Alexei Colin
  Cc: Andrew Morton, John Paul Walters, Catalin Marinas, Arnd Bergmann,
	Will Deacon, Ralf Baechle, Paul Burton, Alexander Sverdlin,
	Benjamin Herrenschmidt, Paul Mackerras, Thomas Gleixner,
	Peter Anvin, Matt Porter, x86, linuxppc-dev, linux-mips,
	linux-arm-kernel, linux-kernel
In-Reply-To: <20180731155909.GO17271@n2100.armlinux.org.uk>



On 2018-07-31 11:59 AM, Russell King - ARM Linux wrote:
> For the thread associated with this patch set, a review of a previous
> patch for ARM posted last Tuesday on this subject asked a series of
> questions about the PCI-nature of this.  The review has not been
> responded to.

We are dealing with this now. More appropriate to do it this time than 
before having reworked set.

> If it is inappropriate to offer RapidIO for any architecture that
> happens to has PCI, then it is inappropriate to offer it for any
> ARM machine that happens to have PCI.

It is completely appropriate to use RapidIO on any architecture that has 
PCI/PCIe using existing PCIe-to-SRIO host bridges. Works well with 
Marvell and NVIDIA boards.

Confusion here is caused by the fact that there are ARM and non-ARM 
devices that offer on-chip RapidIO host controllers as well as PCIe. 
E.g. TI Keystone/KeystoneII, FSL 85xx/86xx, Xilinx and Altera FPGAs with 
ARM cores, Cavium on MIPS. In most cases external buses are configurable 
and we have to address possible combinations.

I already posted some explanation in response to your earlier comment.

> In light of the lack of explanation on this point so far, I'm naking
> the ARM part of this series for now.
> 

Explanations posted.

> I also think that the HAS_RAPIDIO thing is misleading and needs
> sorting out (as I've mentioned in other emails, including the one
> I refer to above) before rapidio becomes available more widely.
>
Highly likely it is used right now in a base station of mobile operator 
near you :)

> On Tue, Jul 31, 2018 at 10:29:48AM -0400, Alexei Colin wrote:
>> Resending the patchset from prior submission:
>> https://lkml.org/lkml/2018/7/30/911
>>
>> The only change are the Cc tags in all patches now include the mailing
>> lists for all affected architectures, and patch 1/6 (which adds the menu
>> item to RapdidIO subsystem Kconfig) is CCed to all maintainers who are
>> getting this cover letter. The cover letter has been updated with
>> explanations to points raised in the feedback.
>>
>>
>>
>> The top-level Kconfig entry for RapidIO subsystem is currently
>> duplicated in several architecture-specific Kconfig files. This set of
>> patches does two things:
>>
>> 1. Move the Kconfig menu definition into the RapidIO subsystem and
>> remove the duplicate definitions from arch Kconfig files.
>>
>> 2. Enable RapidIO Kconfig menu entry for arm and arm64 architectures,
>> where it was not enabled before. I tested that subsystem and drivers
>> build successfully for both architectures, and tested that the modules
>> load on a custom arm64 Qemu model.
>>
>> For all architectures, RapidIO menu should be offered when either:
>> (1) The platform has a PCI bus (which host a RapidIO module on the bus).
>> (2) The platform has a RapidIO IP block (connected to a system bus, e.g.
>> AXI on ARM). In this case, 'select HAS_RAPIDIO' should be added to the
>> 'config ARCH_*' menu entry for the SoCs that offer the IP block.
>>
>> Prior to this patchset, different architectures used different criteria:
>> * powerpc: (1) and (2)
>> * mips: (1) and (2) after recent commit into next that added (2):
>>    https://www.linux-mips.org/archives/linux-mips/2018-07/msg00596.html
>>    fc5d988878942e9b42a4de5204bdd452f3f1ce47
>>    491ec1553e0075f345fbe476a93775eabcbc40b6
>> * x86: (1)
>> * arm,arm64: none (RapidIO menus never offered)
>>
>> This set of architectures are the ones that implement support for
>> RapidIO as system bus. On some platforms RapidIO can be the only system
>> bus available replacing PCI/PCIe.  As it is done now, RapidIO is
>> configured in "Bus Options" (x86/PPC) or "Bus Support" (ARMs) sub-menu
>> and from system configuration option it should be kept this way.
>> Current location of RAPIDIO configuration option is familiar to users of
>> PowerPC and x86 platforms, and is similarly available in some ARM
>> manufacturers kernel code trees. (Alex Bounine)
>>
>> HAS_RAPIDIO is not enabled unconditionally, because HAS_RAPIDIO option
>> is intended for SOCs that have built in SRIO controllers, like TI
>> KeyStoneII or FPGAs. Because RapidIO subsystem core is required during
>> RapidIO port driver initialization, having separate option allows us to
>> control available build options for RapidIO core and port driver (bool
>> vs.  tristate) and disable module option if port driver is configured as
>> built-in. (Alex Bounine)
>>
>> Responses to feedback from prior submission (thanks for the reviews!):
>> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-July/593347.html
>> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-July/593349.html
>>
>> Changelog:
>>    * Moved Kconfig entry into RapidIO subsystem instead of duplicating
>>
>> In the current patchset, I took the approach of adding '|| PCI' to the
>> depends in the subsystem. I did try the alterantive approach mentioned
>> in the reviews for v1 of this patch, where the subsystem Kconfig does
>> not add a '|| PCI' and each per-architecture Kconfig has to add a
>> 'select HAS_RAPIDIO if PCI' and SoCs with IP blocks have to also add
>> 'select HAS_RAPIDIO'. This works too but requires each architecture's
>> Kconfig to add the line for RapidIO (whereas current approach does not
>> require that involvement) and also may create a false impression that
>> the dependency on PCI is strict.
>>
>> We appreciate the suggestion for also selecting the RapdiIO subsystem for
>> compilation with COMPILE_TEST, but hope to address it in a separate
>> patchset, localized to the subsystem, since it will need to change
>> depends on all drivers, not just on the top level, and since this
>> patch now spans multiple architectures.
>>
>> Alexei Colin (6):
>>    rapidio: define top Kconfig menu in driver subtree
>>    x86: factor out RapidIO Kconfig menu
>>    powerpc: factor out RapidIO Kconfig menu entry
>>    mips: factor out RapidIO Kconfig entry
>>    arm: enable RapidIO menu in Kconfig
>>    arm64: enable RapidIO menu in Kconfig
>>
>>   arch/arm/Kconfig        |  2 ++
>>   arch/arm64/Kconfig      |  2 ++
>>   arch/mips/Kconfig       | 11 -----------
>>   arch/powerpc/Kconfig    | 13 +------------
>>   arch/x86/Kconfig        |  8 --------
>>   drivers/rapidio/Kconfig | 15 +++++++++++++++
>>   6 files changed, 20 insertions(+), 31 deletions(-)
>>
>> -- 
>> 2.18.0
>>
> 

^ permalink raw reply

* Re: phandle_cache vs of_detach_node (was Re: [PATCH] powerpc/mobility: Fix node detach/rename problem)
From: Michael Bringmann @ 2018-07-31 19:11 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <8736vzj4la.fsf@concordia.ellerman.id.au>

I applied your suggestion with a couple of modifications,
and it looks to have worked for the first 2 migration events.
I am not seeing the errors from repeated migrations.  The
revised patch tested is,

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 466e3c8..8bf64e5 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -1096,8 +1096,14 @@ struct device_node *of_find_node_by_phandle(phandle handle)

        if (phandle_cache) {
                if (phandle_cache[masked_handle] &&
-                   handle == phandle_cache[masked_handle]->phandle)
+                   handle == phandle_cache[masked_handle]->phandle) {
                        np = phandle_cache[masked_handle];
+
+                       if (of_node_check_flag(np, OF_DETACHED)) {
+                               np = NULL;
+                               phandle_cache[masked_handle] = NULL;
+                       }
+               }
        }

        if (!np) {

During a conference call this morning, Tyrel expressed concerns
about the use of the phandle_cache on powerpc, at all.  I will
try another build with that feature disabled, but without this
patch.

Michael


On 07/31/2018 01:34 AM, Michael Ellerman wrote:
> Hi Rob/Frank,
> 
> I think we might have a problem with the phandle_cache not interacting
> well with of_detach_node():
> 
> Michael Bringmann <mwb@linux.vnet.ibm.com> writes:
>> See below.
>>
>> On 07/30/2018 01:31 AM, Michael Ellerman wrote:
>>> Michael Bringmann <mwb@linux.vnet.ibm.com> writes:
>>>
>>>> During LPAR migration, the content of the device tree/sysfs may
>>>> be updated including deletion and replacement of nodes in the
>>>> tree.  When nodes are added to the internal node structures, they
>>>> are appended in FIFO order to a list of nodes maintained by the
>>>> OF code APIs.
>>>
>>> That hasn't been true for several years. The data structure is an n-ary
>>> tree. What kernel version are you working on?
>>
>> Sorry for an error in my description.  I oversimplified based on the
>> name of a search iterator.  Let me try to provide a better explanation
>> of the problem, here.
>>
>> This is the problem.  The PPC mobility code receives RTAS requests to
>> delete nodes with platform-/hardware-specific attributes when restarting
>> the kernel after a migration.  My example is for migration between a
>> P8 Alpine and a P8 Brazos.   Nodes to be deleted may include 'ibm,random-v1',
>> 'ibm,compression-v1', 'ibm,platform-facilities', 'ibm,sym-encryption-v1',
>> or others.
>>
>> The mobility.c code calls 'of_detach_node' for the nodes and their children.
>> This makes calls to detach the properties and to try to remove the associated
>> sysfs/kernfs files.
>>
>> Then new copies of the same nodes are next provided by the PHYP, local
>> copies are built, and a pointer to the 'struct device_node' is passed to
>> of_attach_node.  Before the call to of_attach_node, the phandle is initialized
>> to 0 when the data structure is alloced.  During the call to of_attach_node,
>> it calls __of_attach_node which pulls the actual name and phandle from just
>> created sub-properties named something like 'name' and 'ibm,phandle'.
>>
>> This is all fine for the first migration.  The problem occurs with the
>> second and subsequent migrations when the PHYP on the new system wants to
>> replace the same set of nodes again, referenced with the same names and
>> phandle values.
>>
>>>
>>>> When nodes are removed from the device tree, they
>>>> are marked OF_DETACHED, but not actually deleted from the system
>>>> to allow for pointers cached elsewhere in the kernel.  The order
>>>> and content of the entries in the list of nodes is not altered,
>>>> though.
>>>
>>> Something is going wrong if this is actually happening.
>>>
>>> When the node is detached it should be *detached* from the tree of all
>>> nodes, so it should not be discoverable other than by having an existing
>>> pointer to it.
>> On the second and subsequent migrations, the PHYP tells the system
>> to again delete the nodes 'ibm,platform-facilities', 'ibm,random-v1',
>> 'ibm,compression-v1', 'ibm,sym-encryption-v1'.  It specifies these
>> nodes by its known set of phandle values -- the same handles used
>> by the PHYP on the source system are known on the target system.
>> The mobility.c code calls of_find_node_by_phandle() with these values
>> and ends up locating the first instance of each node that was added
>> during the original boot, instead of the second instance of each node
>> created after the first migration.  The detach during the second
>> migration fails with errors like,
>>
>> [ 4565.030704] WARNING: CPU: 3 PID: 4787 at drivers/of/dynamic.c:252 __of_detach_node+0x8/0xa0
>> [ 4565.030708] Modules linked in: nfsv3 nfs_acl nfs tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag lockd grace fscache sunrpc xts vmx_crypto sg pseries_rng binfmt_misc ip_tables xfs libcrc32c sd_mod ibmveth ibmvscsi scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
>> [ 4565.030733] CPU: 3 PID: 4787 Comm: drmgr Tainted: G        W         4.18.0-rc1-wi107836-v05-120+ #201
>> [ 4565.030737] NIP:  c0000000007c1ea8 LR: c0000000007c1fb4 CTR: 0000000000655170
>> [ 4565.030741] REGS: c0000003f302b690 TRAP: 0700   Tainted: G        W          (4.18.0-rc1-wi107836-v05-120+)
>> [ 4565.030745] MSR:  800000010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 22288822  XER: 0000000a
>> [ 4565.030757] CFAR: c0000000007c1fb0 IRQMASK: 1
>> [ 4565.030757] GPR00: c0000000007c1fa4 c0000003f302b910 c00000000114bf00 c0000003ffff8e68
>> [ 4565.030757] GPR04: 0000000000000001 ffffffffffffffff 800000c008e0b4b8 ffffffffffffffff
>> [ 4565.030757] GPR08: 0000000000000000 0000000000000001 0000000080000003 0000000000002843
>> [ 4565.030757] GPR12: 0000000000008800 c00000001ec9ae00 0000000040000000 0000000000000000
>> [ 4565.030757] GPR16: 0000000000000000 0000000000000008 0000000000000000 00000000f6ffffff
>> [ 4565.030757] GPR20: 0000000000000007 0000000000000000 c0000003e9f1f034 0000000000000001
>> [ 4565.030757] GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> [ 4565.030757] GPR28: c000000001549d28 c000000001134828 c0000003ffff8e68 c0000003f302b930
>> [ 4565.030804] NIP [c0000000007c1ea8] __of_detach_node+0x8/0xa0
>> [ 4565.030808] LR [c0000000007c1fb4] of_detach_node+0x74/0xd0
>> [ 4565.030811] Call Trace:
>> [ 4565.030815] [c0000003f302b910] [c0000000007c1fa4] of_detach_node+0x64/0xd0 (unreliable)
>> [ 4565.030821] [c0000003f302b980] [c0000000000c33c4] dlpar_detach_node+0xb4/0x150
>> [ 4565.030826] [c0000003f302ba10] [c0000000000c3ffc] delete_dt_node+0x3c/0x80
>> [ 4565.030831] [c0000003f302ba40] [c0000000000c4380] pseries_devicetree_update+0x150/0x4f0
>> [ 4565.030836] [c0000003f302bb70] [c0000000000c479c] post_mobility_fixup+0x7c/0xf0
>> [ 4565.030841] [c0000003f302bbe0] [c0000000000c4908] migration_store+0xf8/0x130
>> [ 4565.030847] [c0000003f302bc70] [c000000000998160] kobj_attr_store+0x30/0x60
>> [ 4565.030852] [c0000003f302bc90] [c000000000412f14] sysfs_kf_write+0x64/0xa0
>> [ 4565.030857] [c0000003f302bcb0] [c000000000411cac] kernfs_fop_write+0x16c/0x240
>> [ 4565.030862] [c0000003f302bd00] [c000000000355f20] __vfs_write+0x40/0x220
>> [ 4565.030867] [c0000003f302bd90] [c000000000356358] vfs_write+0xc8/0x240
>> [ 4565.030872] [c0000003f302bde0] [c0000000003566cc] ksys_write+0x5c/0x100
>> [ 4565.030880] [c0000003f302be30] [c00000000000b288] system_call+0x5c/0x70
>> [ 4565.030884] Instruction dump:
>> [ 4565.030887] 38210070 38600000 e8010010 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8
>> [ 4565.030895] 7c0803a6 4e800020 e9230098 7929f7e2 <0b090000> 2f890000 4cde0020 e9030040
>> [ 4565.030903] ---[ end trace 5bd54cb1df9d2976 ]---
>>
>> The mobility.c code continues on during the second migration, accepts the
>> definitions of the new nodes from the PHYP and ends up renaming the new
>> properties e.g.
>>
>> [ 4565.827296] Duplicate name in base, renamed to "ibm,platform-facilities#1"
>>
>> I don't see any check like 'of_node_check_flag(np, OF_DETACHED)' within
>> of_find_node_by_phandle to skip nodes that are detached, but still present
>> due to caching or use count considerations.  Another possibility to consider
>> is that of_find_node_by_phandle also uses something called 'phandle_cache'
>> which may have outdated data as of_detach_node() does not have access to
>> that cache for the 'OF_DETACHED' nodes.
> 
> Yes the phandle_cache looks like it might be the problem.
> 
> I saw of_free_phandle_cache() being called as late_initcall, but didn't
> realise that's only if MODULES is disabled.
> 
> So I don't see anything that invalidates the phandle_cache when a node
> is removed.
> 
> The right solution would be for __of_detach_node() to invalidate the
> phandle_cache for the node being detached. That's slightly complicated
> by the phandle_cache being static inside base.c
> 
> To test the theory that it's the phandle_cache causing the problems can
> you try this patch:
> 
> diff --git a/drivers/of/base.c b/drivers/of/base.c
> index 848f549164cd..60e219132e24 100644
> --- a/drivers/of/base.c
> +++ b/drivers/of/base.c
> @@ -1098,6 +1098,9 @@ struct device_node *of_find_node_by_phandle(phandle handle)
>  		if (phandle_cache[masked_handle] &&
>  		    handle == phandle_cache[masked_handle]->phandle)
>  			np = phandle_cache[masked_handle];
> +
> +		if (of_node_check_flag(np, OF_DETACHED))
> +			np = NULL;
>  	}
> 
>  	if (!np) {
> 
> cheers
> 
> 

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:       (512) 466-0650
mwb@linux.vnet.ibm.com

^ permalink raw reply related

* Re: phandle_cache vs of_detach_node (was Re: [PATCH] powerpc/mobility: Fix node detach/rename problem)
From: Frank Rowand @ 2018-07-31 19:18 UTC (permalink / raw)
  To: Rob Herring, Michael Ellerman; +Cc: mwb, linuxppc-dev, devicetree
In-Reply-To: <CAL_JsqKf1k3MM+HgQ2U=Ep7L9nbBCmmvGZoEAKxLB1TVqQdW_w@mail.gmail.com>

On 07/31/18 07:17, Rob Herring wrote:
> On Tue, Jul 31, 2018 at 12:34 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> Hi Rob/Frank,
>>
>> I think we might have a problem with the phandle_cache not interacting
>> well with of_detach_node():
> 
> Probably needs a similar fix as this commit did for overlays:
> 
> commit b9952b5218added5577e4a3443969bc20884cea9
> Author: Frank Rowand <frank.rowand@sony.com>
> Date:   Thu Jul 12 14:00:07 2018 -0700
> 
>     of: overlay: update phandle cache on overlay apply and remove
> 
>     A comment in the review of the patch adding the phandle cache said that
>     the cache would have to be updated when modules are applied and removed.
>     This patch implements the cache updates.
> 
>     Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of
> of_find_node_by_phandle()")
>     Reported-by: Alan Tull <atull@kernel.org>
>     Suggested-by: Alan Tull <atull@kernel.org>
>     Signed-off-by: Frank Rowand <frank.rowand@sony.com>
>     Signed-off-by: Rob Herring <robh@kernel.org>

Agreed.  Sorry about missing the of_detach_node() case.


> Really what we need here is an "invalidate phandle" function rather
> than free and re-allocate the whole damn cache.

The big hammer approach was chosen to avoid the race conditions that
would otherwise occur.  OF does not have a locking strategy that
would be able to protect against the races.

We could maybe implement a slightly smaller hammer by (1) disabling
the cache, (2) invalidate a phandle entry in the cache, (3) re-enable
the cache.  That is an off the cuff thought - I would have to look
a little bit more carefully to make sure it would work.

But I don't see a need to add the complexity of the smaller hammer
or the bigger hammer of proper locking _unless_ we start seeing that
the cache is being freed and re-allocated frequently.  For overlays
I don't expect the high frequency because it happens on a per overlay
removal basis (not per node removal basis).  For of_detach_node() the
event _is_ on a per node removal basis.  Michael, do you expect node
removals to be a frequent event with low latency being important?  If
so, a rough guess on what the frequency would be?

-Frank


> Rob
> 
>>
>> Michael Bringmann <mwb@linux.vnet.ibm.com> writes:
>>> See below.
>>>
>>> On 07/30/2018 01:31 AM, Michael Ellerman wrote:
>>>> Michael Bringmann <mwb@linux.vnet.ibm.com> writes:
>>>>
>>>>> During LPAR migration, the content of the device tree/sysfs may
>>>>> be updated including deletion and replacement of nodes in the
>>>>> tree.  When nodes are added to the internal node structures, they
>>>>> are appended in FIFO order to a list of nodes maintained by the
>>>>> OF code APIs.
>>>>
>>>> That hasn't been true for several years. The data structure is an n-ary
>>>> tree. What kernel version are you working on?
>>>
>>> Sorry for an error in my description.  I oversimplified based on the
>>> name of a search iterator.  Let me try to provide a better explanation
>>> of the problem, here.
>>>
>>> This is the problem.  The PPC mobility code receives RTAS requests to
>>> delete nodes with platform-/hardware-specific attributes when restarting
>>> the kernel after a migration.  My example is for migration between a
>>> P8 Alpine and a P8 Brazos.   Nodes to be deleted may include 'ibm,random-v1',
>>> 'ibm,compression-v1', 'ibm,platform-facilities', 'ibm,sym-encryption-v1',
>>> or others.
>>>
>>> The mobility.c code calls 'of_detach_node' for the nodes and their children.
>>> This makes calls to detach the properties and to try to remove the associated
>>> sysfs/kernfs files.
>>>
>>> Then new copies of the same nodes are next provided by the PHYP, local
>>> copies are built, and a pointer to the 'struct device_node' is passed to
>>> of_attach_node.  Before the call to of_attach_node, the phandle is initialized
>>> to 0 when the data structure is alloced.  During the call to of_attach_node,
>>> it calls __of_attach_node which pulls the actual name and phandle from just
>>> created sub-properties named something like 'name' and 'ibm,phandle'.
>>>
>>> This is all fine for the first migration.  The problem occurs with the
>>> second and subsequent migrations when the PHYP on the new system wants to
>>> replace the same set of nodes again, referenced with the same names and
>>> phandle values.
>>>
>>>>
>>>>> When nodes are removed from the device tree, they
>>>>> are marked OF_DETACHED, but not actually deleted from the system
>>>>> to allow for pointers cached elsewhere in the kernel.  The order
>>>>> and content of the entries in the list of nodes is not altered,
>>>>> though.
>>>>
>>>> Something is going wrong if this is actually happening.
>>>>
>>>> When the node is detached it should be *detached* from the tree of all
>>>> nodes, so it should not be discoverable other than by having an existing
>>>> pointer to it.
>>> On the second and subsequent migrations, the PHYP tells the system
>>> to again delete the nodes 'ibm,platform-facilities', 'ibm,random-v1',
>>> 'ibm,compression-v1', 'ibm,sym-encryption-v1'.  It specifies these
>>> nodes by its known set of phandle values -- the same handles used
>>> by the PHYP on the source system are known on the target system.
>>> The mobility.c code calls of_find_node_by_phandle() with these values
>>> and ends up locating the first instance of each node that was added
>>> during the original boot, instead of the second instance of each node
>>> created after the first migration.  The detach during the second
>>> migration fails with errors like,
>>>
>>> [ 4565.030704] WARNING: CPU: 3 PID: 4787 at drivers/of/dynamic.c:252 __of_detach_node+0x8/0xa0
>>> [ 4565.030708] Modules linked in: nfsv3 nfs_acl nfs tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag lockd grace fscache sunrpc xts vmx_crypto sg pseries_rng binfmt_misc ip_tables xfs libcrc32c sd_mod ibmveth ibmvscsi scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
>>> [ 4565.030733] CPU: 3 PID: 4787 Comm: drmgr Tainted: G        W         4.18.0-rc1-wi107836-v05-120+ #201
>>> [ 4565.030737] NIP:  c0000000007c1ea8 LR: c0000000007c1fb4 CTR: 0000000000655170
>>> [ 4565.030741] REGS: c0000003f302b690 TRAP: 0700   Tainted: G        W          (4.18.0-rc1-wi107836-v05-120+)
>>> [ 4565.030745] MSR:  800000010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 22288822  XER: 0000000a
>>> [ 4565.030757] CFAR: c0000000007c1fb0 IRQMASK: 1
>>> [ 4565.030757] GPR00: c0000000007c1fa4 c0000003f302b910 c00000000114bf00 c0000003ffff8e68
>>> [ 4565.030757] GPR04: 0000000000000001 ffffffffffffffff 800000c008e0b4b8 ffffffffffffffff
>>> [ 4565.030757] GPR08: 0000000000000000 0000000000000001 0000000080000003 0000000000002843
>>> [ 4565.030757] GPR12: 0000000000008800 c00000001ec9ae00 0000000040000000 0000000000000000
>>> [ 4565.030757] GPR16: 0000000000000000 0000000000000008 0000000000000000 00000000f6ffffff
>>> [ 4565.030757] GPR20: 0000000000000007 0000000000000000 c0000003e9f1f034 0000000000000001
>>> [ 4565.030757] GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> [ 4565.030757] GPR28: c000000001549d28 c000000001134828 c0000003ffff8e68 c0000003f302b930
>>> [ 4565.030804] NIP [c0000000007c1ea8] __of_detach_node+0x8/0xa0
>>> [ 4565.030808] LR [c0000000007c1fb4] of_detach_node+0x74/0xd0
>>> [ 4565.030811] Call Trace:
>>> [ 4565.030815] [c0000003f302b910] [c0000000007c1fa4] of_detach_node+0x64/0xd0 (unreliable)
>>> [ 4565.030821] [c0000003f302b980] [c0000000000c33c4] dlpar_detach_node+0xb4/0x150
>>> [ 4565.030826] [c0000003f302ba10] [c0000000000c3ffc] delete_dt_node+0x3c/0x80
>>> [ 4565.030831] [c0000003f302ba40] [c0000000000c4380] pseries_devicetree_update+0x150/0x4f0
>>> [ 4565.030836] [c0000003f302bb70] [c0000000000c479c] post_mobility_fixup+0x7c/0xf0
>>> [ 4565.030841] [c0000003f302bbe0] [c0000000000c4908] migration_store+0xf8/0x130
>>> [ 4565.030847] [c0000003f302bc70] [c000000000998160] kobj_attr_store+0x30/0x60
>>> [ 4565.030852] [c0000003f302bc90] [c000000000412f14] sysfs_kf_write+0x64/0xa0
>>> [ 4565.030857] [c0000003f302bcb0] [c000000000411cac] kernfs_fop_write+0x16c/0x240
>>> [ 4565.030862] [c0000003f302bd00] [c000000000355f20] __vfs_write+0x40/0x220
>>> [ 4565.030867] [c0000003f302bd90] [c000000000356358] vfs_write+0xc8/0x240
>>> [ 4565.030872] [c0000003f302bde0] [c0000000003566cc] ksys_write+0x5c/0x100
>>> [ 4565.030880] [c0000003f302be30] [c00000000000b288] system_call+0x5c/0x70
>>> [ 4565.030884] Instruction dump:
>>> [ 4565.030887] 38210070 38600000 e8010010 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8
>>> [ 4565.030895] 7c0803a6 4e800020 e9230098 7929f7e2 <0b090000> 2f890000 4cde0020 e9030040
>>> [ 4565.030903] ---[ end trace 5bd54cb1df9d2976 ]---
>>>
>>> The mobility.c code continues on during the second migration, accepts the
>>> definitions of the new nodes from the PHYP and ends up renaming the new
>>> properties e.g.
>>>
>>> [ 4565.827296] Duplicate name in base, renamed to "ibm,platform-facilities#1"
>>>
>>> I don't see any check like 'of_node_check_flag(np, OF_DETACHED)' within
>>> of_find_node_by_phandle to skip nodes that are detached, but still present
>>> due to caching or use count considerations.  Another possibility to consider
>>> is that of_find_node_by_phandle also uses something called 'phandle_cache'
>>> which may have outdated data as of_detach_node() does not have access to
>>> that cache for the 'OF_DETACHED' nodes.
>>
>> Yes the phandle_cache looks like it might be the problem.
>>
>> I saw of_free_phandle_cache() being called as late_initcall, but didn't
>> realise that's only if MODULES is disabled.
>>
>> So I don't see anything that invalidates the phandle_cache when a node
>> is removed.
>>
>> The right solution would be for __of_detach_node() to invalidate the
>> phandle_cache for the node being detached. That's slightly complicated
>> by the phandle_cache being static inside base.c
>>
>> To test the theory that it's the phandle_cache causing the problems can
>> you try this patch:
>>
>> diff --git a/drivers/of/base.c b/drivers/of/base.c
>> index 848f549164cd..60e219132e24 100644
>> --- a/drivers/of/base.c
>> +++ b/drivers/of/base.c
>> @@ -1098,6 +1098,9 @@ struct device_node *of_find_node_by_phandle(phandle handle)
>>                 if (phandle_cache[masked_handle] &&
>>                     handle == phandle_cache[masked_handle]->phandle)
>>                         np = phandle_cache[masked_handle];
>> +
>> +               if (of_node_check_flag(np, OF_DETACHED))
>> +                       np = NULL;
>>         }
>>
>>         if (!np) {
>>
>> cheers
> 

^ permalink raw reply

* Re: phandle_cache vs of_detach_node (was Re: [PATCH] powerpc/mobility: Fix node detach/rename problem)
From: Frank Rowand @ 2018-07-31 19:22 UTC (permalink / raw)
  To: Rob Herring, Michael Ellerman; +Cc: mwb, linuxppc-dev, devicetree
In-Reply-To: <85014a16-cd2c-0137-72e5-54d4bcdf788a@gmail.com>

On 07/31/18 12:18, Frank Rowand wrote:
> On 07/31/18 07:17, Rob Herring wrote:
>> On Tue, Jul 31, 2018 at 12:34 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>>>
>>> Hi Rob/Frank,
>>>
>>> I think we might have a problem with the phandle_cache not interacting
>>> well with of_detach_node():
>>
>> Probably needs a similar fix as this commit did for overlays:
>>
>> commit b9952b5218added5577e4a3443969bc20884cea9
>> Author: Frank Rowand <frank.rowand@sony.com>
>> Date:   Thu Jul 12 14:00:07 2018 -0700
>>
>>     of: overlay: update phandle cache on overlay apply and remove
>>
>>     A comment in the review of the patch adding the phandle cache said that
>>     the cache would have to be updated when modules are applied and removed.
>>     This patch implements the cache updates.
>>
>>     Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of
>> of_find_node_by_phandle()")
>>     Reported-by: Alan Tull <atull@kernel.org>
>>     Suggested-by: Alan Tull <atull@kernel.org>
>>     Signed-off-by: Frank Rowand <frank.rowand@sony.com>
>>     Signed-off-by: Rob Herring <robh@kernel.org>
> 
> Agreed.  Sorry about missing the of_detach_node() case.
> 
> 
>> Really what we need here is an "invalidate phandle" function rather
>> than free and re-allocate the whole damn cache.
> 
> The big hammer approach was chosen to avoid the race conditions that
> would otherwise occur.  OF does not have a locking strategy that
> would be able to protect against the races.
> 
> We could maybe implement a slightly smaller hammer by (1) disabling
> the cache, (2) invalidate a phandle entry in the cache, (3) re-enable
> the cache.  That is an off the cuff thought - I would have to look
> a little bit more carefully to make sure it would work.
> 
> But I don't see a need to add the complexity of the smaller hammer
> or the bigger hammer of proper locking _unless_ we start seeing that
> the cache is being freed and re-allocated frequently.  For overlays
> I don't expect the high frequency because it happens on a per overlay
> removal basis (not per node removal basis).


>                                              For of_detach_node() the
> event _is_ on a per node removal basis.  Michael, do you expect node
> removals to be a frequent event with low latency being important?  If
> so, a rough guess on what the frequency would be?

I have not looked at how of_detach_node() is used, so it might not be
very different that overlays.  If a group of of_detach_node() calls
are made from a common code location, the the sequence could possibly
be:

   of_free_phandle_cache()

   multiple calls of of_detach_node()

   of_populate_phandle_cache()

-Frank
> 
> -Frank
> 
> 
>> Rob
>>
>>>
>>> Michael Bringmann <mwb@linux.vnet.ibm.com> writes:
>>>> See below.
>>>>
>>>> On 07/30/2018 01:31 AM, Michael Ellerman wrote:
>>>>> Michael Bringmann <mwb@linux.vnet.ibm.com> writes:
>>>>>
>>>>>> During LPAR migration, the content of the device tree/sysfs may
>>>>>> be updated including deletion and replacement of nodes in the
>>>>>> tree.  When nodes are added to the internal node structures, they
>>>>>> are appended in FIFO order to a list of nodes maintained by the
>>>>>> OF code APIs.
>>>>>
>>>>> That hasn't been true for several years. The data structure is an n-ary
>>>>> tree. What kernel version are you working on?
>>>>
>>>> Sorry for an error in my description.  I oversimplified based on the
>>>> name of a search iterator.  Let me try to provide a better explanation
>>>> of the problem, here.
>>>>
>>>> This is the problem.  The PPC mobility code receives RTAS requests to
>>>> delete nodes with platform-/hardware-specific attributes when restarting
>>>> the kernel after a migration.  My example is for migration between a
>>>> P8 Alpine and a P8 Brazos.   Nodes to be deleted may include 'ibm,random-v1',
>>>> 'ibm,compression-v1', 'ibm,platform-facilities', 'ibm,sym-encryption-v1',
>>>> or others.
>>>>
>>>> The mobility.c code calls 'of_detach_node' for the nodes and their children.
>>>> This makes calls to detach the properties and to try to remove the associated
>>>> sysfs/kernfs files.
>>>>
>>>> Then new copies of the same nodes are next provided by the PHYP, local
>>>> copies are built, and a pointer to the 'struct device_node' is passed to
>>>> of_attach_node.  Before the call to of_attach_node, the phandle is initialized
>>>> to 0 when the data structure is alloced.  During the call to of_attach_node,
>>>> it calls __of_attach_node which pulls the actual name and phandle from just
>>>> created sub-properties named something like 'name' and 'ibm,phandle'.
>>>>
>>>> This is all fine for the first migration.  The problem occurs with the
>>>> second and subsequent migrations when the PHYP on the new system wants to
>>>> replace the same set of nodes again, referenced with the same names and
>>>> phandle values.
>>>>
>>>>>
>>>>>> When nodes are removed from the device tree, they
>>>>>> are marked OF_DETACHED, but not actually deleted from the system
>>>>>> to allow for pointers cached elsewhere in the kernel.  The order
>>>>>> and content of the entries in the list of nodes is not altered,
>>>>>> though.
>>>>>
>>>>> Something is going wrong if this is actually happening.
>>>>>
>>>>> When the node is detached it should be *detached* from the tree of all
>>>>> nodes, so it should not be discoverable other than by having an existing
>>>>> pointer to it.
>>>> On the second and subsequent migrations, the PHYP tells the system
>>>> to again delete the nodes 'ibm,platform-facilities', 'ibm,random-v1',
>>>> 'ibm,compression-v1', 'ibm,sym-encryption-v1'.  It specifies these
>>>> nodes by its known set of phandle values -- the same handles used
>>>> by the PHYP on the source system are known on the target system.
>>>> The mobility.c code calls of_find_node_by_phandle() with these values
>>>> and ends up locating the first instance of each node that was added
>>>> during the original boot, instead of the second instance of each node
>>>> created after the first migration.  The detach during the second
>>>> migration fails with errors like,
>>>>
>>>> [ 4565.030704] WARNING: CPU: 3 PID: 4787 at drivers/of/dynamic.c:252 __of_detach_node+0x8/0xa0
>>>> [ 4565.030708] Modules linked in: nfsv3 nfs_acl nfs tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag lockd grace fscache sunrpc xts vmx_crypto sg pseries_rng binfmt_misc ip_tables xfs libcrc32c sd_mod ibmveth ibmvscsi scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
>>>> [ 4565.030733] CPU: 3 PID: 4787 Comm: drmgr Tainted: G        W         4.18.0-rc1-wi107836-v05-120+ #201
>>>> [ 4565.030737] NIP:  c0000000007c1ea8 LR: c0000000007c1fb4 CTR: 0000000000655170
>>>> [ 4565.030741] REGS: c0000003f302b690 TRAP: 0700   Tainted: G        W          (4.18.0-rc1-wi107836-v05-120+)
>>>> [ 4565.030745] MSR:  800000010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 22288822  XER: 0000000a
>>>> [ 4565.030757] CFAR: c0000000007c1fb0 IRQMASK: 1
>>>> [ 4565.030757] GPR00: c0000000007c1fa4 c0000003f302b910 c00000000114bf00 c0000003ffff8e68
>>>> [ 4565.030757] GPR04: 0000000000000001 ffffffffffffffff 800000c008e0b4b8 ffffffffffffffff
>>>> [ 4565.030757] GPR08: 0000000000000000 0000000000000001 0000000080000003 0000000000002843
>>>> [ 4565.030757] GPR12: 0000000000008800 c00000001ec9ae00 0000000040000000 0000000000000000
>>>> [ 4565.030757] GPR16: 0000000000000000 0000000000000008 0000000000000000 00000000f6ffffff
>>>> [ 4565.030757] GPR20: 0000000000000007 0000000000000000 c0000003e9f1f034 0000000000000001
>>>> [ 4565.030757] GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>> [ 4565.030757] GPR28: c000000001549d28 c000000001134828 c0000003ffff8e68 c0000003f302b930
>>>> [ 4565.030804] NIP [c0000000007c1ea8] __of_detach_node+0x8/0xa0
>>>> [ 4565.030808] LR [c0000000007c1fb4] of_detach_node+0x74/0xd0
>>>> [ 4565.030811] Call Trace:
>>>> [ 4565.030815] [c0000003f302b910] [c0000000007c1fa4] of_detach_node+0x64/0xd0 (unreliable)
>>>> [ 4565.030821] [c0000003f302b980] [c0000000000c33c4] dlpar_detach_node+0xb4/0x150
>>>> [ 4565.030826] [c0000003f302ba10] [c0000000000c3ffc] delete_dt_node+0x3c/0x80
>>>> [ 4565.030831] [c0000003f302ba40] [c0000000000c4380] pseries_devicetree_update+0x150/0x4f0
>>>> [ 4565.030836] [c0000003f302bb70] [c0000000000c479c] post_mobility_fixup+0x7c/0xf0
>>>> [ 4565.030841] [c0000003f302bbe0] [c0000000000c4908] migration_store+0xf8/0x130
>>>> [ 4565.030847] [c0000003f302bc70] [c000000000998160] kobj_attr_store+0x30/0x60
>>>> [ 4565.030852] [c0000003f302bc90] [c000000000412f14] sysfs_kf_write+0x64/0xa0
>>>> [ 4565.030857] [c0000003f302bcb0] [c000000000411cac] kernfs_fop_write+0x16c/0x240
>>>> [ 4565.030862] [c0000003f302bd00] [c000000000355f20] __vfs_write+0x40/0x220
>>>> [ 4565.030867] [c0000003f302bd90] [c000000000356358] vfs_write+0xc8/0x240
>>>> [ 4565.030872] [c0000003f302bde0] [c0000000003566cc] ksys_write+0x5c/0x100
>>>> [ 4565.030880] [c0000003f302be30] [c00000000000b288] system_call+0x5c/0x70
>>>> [ 4565.030884] Instruction dump:
>>>> [ 4565.030887] 38210070 38600000 e8010010 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8
>>>> [ 4565.030895] 7c0803a6 4e800020 e9230098 7929f7e2 <0b090000> 2f890000 4cde0020 e9030040
>>>> [ 4565.030903] ---[ end trace 5bd54cb1df9d2976 ]---
>>>>
>>>> The mobility.c code continues on during the second migration, accepts the
>>>> definitions of the new nodes from the PHYP and ends up renaming the new
>>>> properties e.g.
>>>>
>>>> [ 4565.827296] Duplicate name in base, renamed to "ibm,platform-facilities#1"
>>>>
>>>> I don't see any check like 'of_node_check_flag(np, OF_DETACHED)' within
>>>> of_find_node_by_phandle to skip nodes that are detached, but still present
>>>> due to caching or use count considerations.  Another possibility to consider
>>>> is that of_find_node_by_phandle also uses something called 'phandle_cache'
>>>> which may have outdated data as of_detach_node() does not have access to
>>>> that cache for the 'OF_DETACHED' nodes.
>>>
>>> Yes the phandle_cache looks like it might be the problem.
>>>
>>> I saw of_free_phandle_cache() being called as late_initcall, but didn't
>>> realise that's only if MODULES is disabled.
>>>
>>> So I don't see anything that invalidates the phandle_cache when a node
>>> is removed.
>>>
>>> The right solution would be for __of_detach_node() to invalidate the
>>> phandle_cache for the node being detached. That's slightly complicated
>>> by the phandle_cache being static inside base.c
>>>
>>> To test the theory that it's the phandle_cache causing the problems can
>>> you try this patch:
>>>
>>> diff --git a/drivers/of/base.c b/drivers/of/base.c
>>> index 848f549164cd..60e219132e24 100644
>>> --- a/drivers/of/base.c
>>> +++ b/drivers/of/base.c
>>> @@ -1098,6 +1098,9 @@ struct device_node *of_find_node_by_phandle(phandle handle)
>>>                 if (phandle_cache[masked_handle] &&
>>>                     handle == phandle_cache[masked_handle]->phandle)
>>>                         np = phandle_cache[masked_handle];
>>> +
>>> +               if (of_node_check_flag(np, OF_DETACHED))
>>> +                       np = NULL;
>>>         }
>>>
>>>         if (!np) {
>>>
>>> cheers
>>
> 
> 

^ permalink raw reply

* Re: phandle_cache vs of_detach_node (was Re: [PATCH] powerpc/mobility: Fix node detach/rename problem)
From: Michael Bringmann @ 2018-07-31 19:26 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <85014a16-cd2c-0137-72e5-54d4bcdf788a@gmail.com>

On 07/31/2018 02:18 PM, Frank Rowand wrote:
> On 07/31/18 07:17, Rob Herring wrote:
>> On Tue, Jul 31, 2018 at 12:34 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>>>
>>> Hi Rob/Frank,
>>>
>>> I think we might have a problem with the phandle_cache not interacting
>>> well with of_detach_node():
>>
>> Probably needs a similar fix as this commit did for overlays:
>>
>> commit b9952b5218added5577e4a3443969bc20884cea9
>> Author: Frank Rowand <frank.rowand@sony.com>
>> Date:   Thu Jul 12 14:00:07 2018 -0700
>>
>>     of: overlay: update phandle cache on overlay apply and remove
>>
>>     A comment in the review of the patch adding the phandle cache said that
>>     the cache would have to be updated when modules are applied and removed.
>>     This patch implements the cache updates.
>>
>>     Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of
>> of_find_node_by_phandle()")
>>     Reported-by: Alan Tull <atull@kernel.org>
>>     Suggested-by: Alan Tull <atull@kernel.org>
>>     Signed-off-by: Frank Rowand <frank.rowand@sony.com>
>>     Signed-off-by: Rob Herring <robh@kernel.org>
> 
> Agreed.  Sorry about missing the of_detach_node() case.
> 
> 
>> Really what we need here is an "invalidate phandle" function rather
>> than free and re-allocate the whole damn cache.
> 
> The big hammer approach was chosen to avoid the race conditions that
> would otherwise occur.  OF does not have a locking strategy that
> would be able to protect against the races.
> 
> We could maybe implement a slightly smaller hammer by (1) disabling
> the cache, (2) invalidate a phandle entry in the cache, (3) re-enable
> the cache.  That is an off the cuff thought - I would have to look
> a little bit more carefully to make sure it would work.
> 
> But I don't see a need to add the complexity of the smaller hammer
> or the bigger hammer of proper locking _unless_ we start seeing that
> the cache is being freed and re-allocated frequently.  For overlays
> I don't expect the high frequency because it happens on a per overlay
> removal basis (not per node removal basis).  For of_detach_node() the
> event _is_ on a per node removal basis.  Michael, do you expect node
> removals to be a frequent event with low latency being important?  If
> so, a rough guess on what the frequency would be?

I am only seeing node removals during startup of the kernel after an
LPAR migration event.  For the tests that I have run between a couple
of P8 platforms, I see about 15 node removals split between, 8 l2-caches,
4 hardware-specific property clusters (e.g. ibm,platform-facilities),
and 3 CPUs.

> 
> -Frank

Michael
> 
> 
>> Rob
>>
>>>
>>> Michael Bringmann <mwb@linux.vnet.ibm.com> writes:
>>>> See below.
>>>>
>>>> On 07/30/2018 01:31 AM, Michael Ellerman wrote:
>>>>> Michael Bringmann <mwb@linux.vnet.ibm.com> writes:
>>>>>
>>>>>> During LPAR migration, the content of the device tree/sysfs may
>>>>>> be updated including deletion and replacement of nodes in the
>>>>>> tree.  When nodes are added to the internal node structures, they
>>>>>> are appended in FIFO order to a list of nodes maintained by the
>>>>>> OF code APIs.
>>>>>
>>>>> That hasn't been true for several years. The data structure is an n-ary
>>>>> tree. What kernel version are you working on?
>>>>
>>>> Sorry for an error in my description.  I oversimplified based on the
>>>> name of a search iterator.  Let me try to provide a better explanation
>>>> of the problem, here.
>>>>
>>>> This is the problem.  The PPC mobility code receives RTAS requests to
>>>> delete nodes with platform-/hardware-specific attributes when restarting
>>>> the kernel after a migration.  My example is for migration between a
>>>> P8 Alpine and a P8 Brazos.   Nodes to be deleted may include 'ibm,random-v1',
>>>> 'ibm,compression-v1', 'ibm,platform-facilities', 'ibm,sym-encryption-v1',
>>>> or others.
>>>>
>>>> The mobility.c code calls 'of_detach_node' for the nodes and their children.
>>>> This makes calls to detach the properties and to try to remove the associated
>>>> sysfs/kernfs files.
>>>>
>>>> Then new copies of the same nodes are next provided by the PHYP, local
>>>> copies are built, and a pointer to the 'struct device_node' is passed to
>>>> of_attach_node.  Before the call to of_attach_node, the phandle is initialized
>>>> to 0 when the data structure is alloced.  During the call to of_attach_node,
>>>> it calls __of_attach_node which pulls the actual name and phandle from just
>>>> created sub-properties named something like 'name' and 'ibm,phandle'.
>>>>
>>>> This is all fine for the first migration.  The problem occurs with the
>>>> second and subsequent migrations when the PHYP on the new system wants to
>>>> replace the same set of nodes again, referenced with the same names and
>>>> phandle values.
>>>>
>>>>>
>>>>>> When nodes are removed from the device tree, they
>>>>>> are marked OF_DETACHED, but not actually deleted from the system
>>>>>> to allow for pointers cached elsewhere in the kernel.  The order
>>>>>> and content of the entries in the list of nodes is not altered,
>>>>>> though.
>>>>>
>>>>> Something is going wrong if this is actually happening.
>>>>>
>>>>> When the node is detached it should be *detached* from the tree of all
>>>>> nodes, so it should not be discoverable other than by having an existing
>>>>> pointer to it.
>>>> On the second and subsequent migrations, the PHYP tells the system
>>>> to again delete the nodes 'ibm,platform-facilities', 'ibm,random-v1',
>>>> 'ibm,compression-v1', 'ibm,sym-encryption-v1'.  It specifies these
>>>> nodes by its known set of phandle values -- the same handles used
>>>> by the PHYP on the source system are known on the target system.
>>>> The mobility.c code calls of_find_node_by_phandle() with these values
>>>> and ends up locating the first instance of each node that was added
>>>> during the original boot, instead of the second instance of each node
>>>> created after the first migration.  The detach during the second
>>>> migration fails with errors like,
>>>>
>>>> [ 4565.030704] WARNING: CPU: 3 PID: 4787 at drivers/of/dynamic.c:252 __of_detach_node+0x8/0xa0
>>>> [ 4565.030708] Modules linked in: nfsv3 nfs_acl nfs tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag lockd grace fscache sunrpc xts vmx_crypto sg pseries_rng binfmt_misc ip_tables xfs libcrc32c sd_mod ibmveth ibmvscsi scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
>>>> [ 4565.030733] CPU: 3 PID: 4787 Comm: drmgr Tainted: G        W         4.18.0-rc1-wi107836-v05-120+ #201
>>>> [ 4565.030737] NIP:  c0000000007c1ea8 LR: c0000000007c1fb4 CTR: 0000000000655170
>>>> [ 4565.030741] REGS: c0000003f302b690 TRAP: 0700   Tainted: G        W          (4.18.0-rc1-wi107836-v05-120+)
>>>> [ 4565.030745] MSR:  800000010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 22288822  XER: 0000000a
>>>> [ 4565.030757] CFAR: c0000000007c1fb0 IRQMASK: 1
>>>> [ 4565.030757] GPR00: c0000000007c1fa4 c0000003f302b910 c00000000114bf00 c0000003ffff8e68
>>>> [ 4565.030757] GPR04: 0000000000000001 ffffffffffffffff 800000c008e0b4b8 ffffffffffffffff
>>>> [ 4565.030757] GPR08: 0000000000000000 0000000000000001 0000000080000003 0000000000002843
>>>> [ 4565.030757] GPR12: 0000000000008800 c00000001ec9ae00 0000000040000000 0000000000000000
>>>> [ 4565.030757] GPR16: 0000000000000000 0000000000000008 0000000000000000 00000000f6ffffff
>>>> [ 4565.030757] GPR20: 0000000000000007 0000000000000000 c0000003e9f1f034 0000000000000001
>>>> [ 4565.030757] GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>>> [ 4565.030757] GPR28: c000000001549d28 c000000001134828 c0000003ffff8e68 c0000003f302b930
>>>> [ 4565.030804] NIP [c0000000007c1ea8] __of_detach_node+0x8/0xa0
>>>> [ 4565.030808] LR [c0000000007c1fb4] of_detach_node+0x74/0xd0
>>>> [ 4565.030811] Call Trace:
>>>> [ 4565.030815] [c0000003f302b910] [c0000000007c1fa4] of_detach_node+0x64/0xd0 (unreliable)
>>>> [ 4565.030821] [c0000003f302b980] [c0000000000c33c4] dlpar_detach_node+0xb4/0x150
>>>> [ 4565.030826] [c0000003f302ba10] [c0000000000c3ffc] delete_dt_node+0x3c/0x80
>>>> [ 4565.030831] [c0000003f302ba40] [c0000000000c4380] pseries_devicetree_update+0x150/0x4f0
>>>> [ 4565.030836] [c0000003f302bb70] [c0000000000c479c] post_mobility_fixup+0x7c/0xf0
>>>> [ 4565.030841] [c0000003f302bbe0] [c0000000000c4908] migration_store+0xf8/0x130
>>>> [ 4565.030847] [c0000003f302bc70] [c000000000998160] kobj_attr_store+0x30/0x60
>>>> [ 4565.030852] [c0000003f302bc90] [c000000000412f14] sysfs_kf_write+0x64/0xa0
>>>> [ 4565.030857] [c0000003f302bcb0] [c000000000411cac] kernfs_fop_write+0x16c/0x240
>>>> [ 4565.030862] [c0000003f302bd00] [c000000000355f20] __vfs_write+0x40/0x220
>>>> [ 4565.030867] [c0000003f302bd90] [c000000000356358] vfs_write+0xc8/0x240
>>>> [ 4565.030872] [c0000003f302bde0] [c0000000003566cc] ksys_write+0x5c/0x100
>>>> [ 4565.030880] [c0000003f302be30] [c00000000000b288] system_call+0x5c/0x70
>>>> [ 4565.030884] Instruction dump:
>>>> [ 4565.030887] 38210070 38600000 e8010010 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8
>>>> [ 4565.030895] 7c0803a6 4e800020 e9230098 7929f7e2 <0b090000> 2f890000 4cde0020 e9030040
>>>> [ 4565.030903] ---[ end trace 5bd54cb1df9d2976 ]---
>>>>
>>>> The mobility.c code continues on during the second migration, accepts the
>>>> definitions of the new nodes from the PHYP and ends up renaming the new
>>>> properties e.g.
>>>>
>>>> [ 4565.827296] Duplicate name in base, renamed to "ibm,platform-facilities#1"
>>>>
>>>> I don't see any check like 'of_node_check_flag(np, OF_DETACHED)' within
>>>> of_find_node_by_phandle to skip nodes that are detached, but still present
>>>> due to caching or use count considerations.  Another possibility to consider
>>>> is that of_find_node_by_phandle also uses something called 'phandle_cache'
>>>> which may have outdated data as of_detach_node() does not have access to
>>>> that cache for the 'OF_DETACHED' nodes.
>>>
>>> Yes the phandle_cache looks like it might be the problem.
>>>
>>> I saw of_free_phandle_cache() being called as late_initcall, but didn't
>>> realise that's only if MODULES is disabled.
>>>
>>> So I don't see anything that invalidates the phandle_cache when a node
>>> is removed.
>>>
>>> The right solution would be for __of_detach_node() to invalidate the
>>> phandle_cache for the node being detached. That's slightly complicated
>>> by the phandle_cache being static inside base.c
>>>
>>> To test the theory that it's the phandle_cache causing the problems can
>>> you try this patch:
>>>
>>> diff --git a/drivers/of/base.c b/drivers/of/base.c
>>> index 848f549164cd..60e219132e24 100644
>>> --- a/drivers/of/base.c
>>> +++ b/drivers/of/base.c
>>> @@ -1098,6 +1098,9 @@ struct device_node *of_find_node_by_phandle(phandle handle)
>>>                 if (phandle_cache[masked_handle] &&
>>>                     handle == phandle_cache[masked_handle]->phandle)
>>>                         np = phandle_cache[masked_handle];
>>> +
>>> +               if (of_node_check_flag(np, OF_DETACHED))
>>> +                       np = NULL;
>>>         }
>>>
>>>         if (!np) {
>>>
>>> cheers
>>
> 
> 

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:       (512) 466-0650
mwb@linux.vnet.ibm.com

^ permalink raw reply

* Re: [PATCH] powerpc/mobility: Fix node detach/rename problem
From: Tyrel Datwyler @ 2018-07-31 19:50 UTC (permalink / raw)
  To: Michael Ellerman, Michael Bringmann, linuxppc-dev
  Cc: Nathan Fontenot, Thomas Falcon, John Allen
In-Reply-To: <871sbjj48b.fsf@concordia.ellerman.id.au>

T24gMDcvMzAvMjAxOCAxMTo0MiBQTSwgTWljaGFlbCBFbGxlcm1hbiB3cm90ZToNCj4gVHly
ZWwgRGF0d3lsZXIgPHR5cmVsZEBsaW51eC52bmV0LmlibS5jb20+IHdyaXRlczoNCj4+IE9u
IDA3LzI5LzIwMTggMDY6MTEgQU0sIE1pY2hhZWwgQnJpbmdtYW5uIHdyb3RlOg0KPj4+IER1
cmluZyBMUEFSIG1pZ3JhdGlvbiwgdGhlIGNvbnRlbnQgb2YgdGhlIGRldmljZSB0cmVlL3N5
c2ZzIG1heQ0KPj4+IGJlIHVwZGF0ZWQgaW5jbHVkaW5nIGRlbGV0aW9uIGFuZCByZXBsYWNl
bWVudCBvZiBub2RlcyBpbiB0aGUNCj4+PiB0cmVlLiAgV2hlbiBub2RlcyBhcmUgYWRkZWQg
dG8gdGhlIGludGVybmFsIG5vZGUgc3RydWN0dXJlcywgdGhleQ0KPj4+IGFyZSBhcHBlbmRl
ZCBpbiBGSUZPIG9yZGVyIHRvIGEgbGlzdCBvZiBub2RlcyBtYWludGFpbmVkIGJ5IHRoZQ0K
Pj4+IE9GIGNvZGUgQVBJcy4gIFdoZW4gbm9kZXMgYXJlIHJlbW92ZWQgZnJvbSB0aGUgZGV2
aWNlIHRyZWUsIHRoZXkNCj4+PiBhcmUgbWFya2VkIE9GX0RFVEFDSEVELCBidXQgbm90IGFj
dHVhbGx5IGRlbGV0ZWQgZnJvbSB0aGUgc3lzdGVtDQo+Pj4gdG8gYWxsb3cgZm9yIHBvaW50
ZXJzIGNhY2hlZCBlbHNld2hlcmUgaW4gdGhlIGtlcm5lbC4gIFRoZSBvcmRlcg0KPj4+IGFu
ZCBjb250ZW50IG9mIHRoZSBlbnRyaWVzIGluIHRoZSBsaXN0IG9mIG5vZGVzIGlzIG5vdCBh
bHRlcmVkLA0KPj4+IHRob3VnaC4NCj4+Pg0KPj4+IER1cmluZyBMUEFSIG1pZ3JhdGlvbiBz
b21lIGNvbW1vbiBub2RlcyBhcmUgZGVsZXRlZCBhbmQgcmUtYWRkZWQNCj4+PiBlLmcuICJp
Ym0scGxhdGZvcm0tZmFjaWxpdGllcyIuICBJZiBhIG5vZGUgaXMgcmUtYWRkZWQgdG8gdGhl
IE9GDQo+Pj4gbm9kZSBsaXN0cywgdGhlIG9mX2F0dGFjaF9ub2RlIGZ1bmN0aW9uIGNoZWNr
cyB0byBtYWtlIHN1cmUgdGhhdA0KPj4+IHRoZSBuYW1lICsgaWJtLHBoYW5kbGUgb2YgdGhl
IHRvLWJlLWFkZGVkIGRhdGEgaXMgdW5pcXVlLiAgQXMgdGhlDQo+Pj4gcHJldmlvdXMgY29w
eSBvZiBhIHJlLWFkZGVkIG5vZGUgaXMgbm90IG1vZGlmaWVkIGJleW9uZCB0aGUgYWRkaXRp
b24NCj4+PiBvZiBhIGJpdCBmbGFnLCB0aGUgY29kZSAoMSkgZmluZHMgdGhlIG9sZCBjb3B5
LCAoMikgcHJpbnRzIGEgV0FSTklORw0KPj4+IG5vdGljZSB0byB0aGUgY29uc29sZSwgKDMp
IHJlbmFtZXMgdGhlIHRvLWJlLWFkZGVkIG5vZGUgdG8gYXZvaWQNCj4+PiBmaWxlbmFtZSBj
b2xsaXNpb25zIHdpdGhpbiBhIGRpcmVjdG9yeSwgYW5kICgzKSBhZGRzIGVudHJpZXMgdG8N
Cj4+PiB0aGUgc3lzZnMva2VybmZzLg0KPj4NCj4+IFNvLCB0aGlzIHBhdGNoIGFjdHVhbGx5
IGp1c3QgYmFuZCBhaWRzIG92ZXIgdGhlIHJlYWwgcHJvYmxlbS4gVGhpcyBpcw0KPj4gYSBs
b25nIHN0YW5kaW5nIHByb2JsZW0gd2l0aCBzZXZlcmFsIFBGTyBkcml2ZXJzIGxlYWtpbmcg
cmVmZXJlbmNlcy4NCj4+IFRoZSBpc3N1ZSBoZXJlIGlzIHRoYXQsIGR1cmluZyB0aGUgZGV2
aWNlIHRyZWUgdXBkYXRlIHRoYXQgZm9sbG93cyBhDQo+PiBtaWdyYXRpb24uIHRoZSB1cGRh
dGUgb2YgdGhlIGlibSxwbGF0Zm9ybS1mYWNpbGl0aWVzIG5vZGUgYW5kIGZyaWVuZHMNCj4+
IGJlbG93IGFyZSBhbHdheXMgZGVsZXRlZCBhbmQgcmUtYWRkZWQgb24gdGhlIGRlc3RpbmF0
aW9uIGxwYXIgYW5kDQo+PiBzdWJzZXF1ZW50bHkgdGhlIGxlYWtlZCByZWZlcmVuY2VzIHBy
ZXZlbnQgdGhlIGRldmljZXMgbm9kZXMgZnJvbQ0KPj4gZXZlcnkgYWN0dWFsbHkgYmVpbmcg
cHJvcGVybHkgY2xlYW5lZCB1cCBhZnRlciBkZXRhY2guIFRodXMsIGxlYWRpbmcNCj4+IHRv
IHRoZSBpc3N1ZSB5b3UgYXJlIG9ic2VydmluZy4NCg0KU28sIGl0IHdhcyB0aGUgZW5kIG9m
IHRoZSBkYXksIGFuZCBJIGtpbmQgb2YgZ2xvc3NlZCBvdmVyIHRoZSBpc3N1ZSBNaWNoYWVs
IHdhcyB0cnlpbmcgdG8gYWRkcmVzcyB3aXRoIGFuIGlzc3VlIHRoYXQgSSByZW1lbWJlcmVk
IHRoYXQgaGFkIGJlZW4gYXJvdW5kIGZvciBhd2hpbGUuDQoNCj4gDQo+IExlYWtpbmcgcmVm
ZXJlbmNlcyBzaG91bGRuJ3QgYWZmZWN0IHRoZSBub2RlIGJlaW5nIGRldGFjaGVkIGZyb20g
dGhlDQo+IHRyZWUgdGhvdWdoLg0KDQpObywgYnV0IGl0IGRvZXMgcHJldmVudCBpdCBmcm9t
IGJlaW5nIGZyZWVkIGZyb20gc3lzZnMgd2hpY2ggbGVhZHMgdG8gdGhlIHN5c2ZzIGVudHJ5
IHJlbmFtaW5nIHRoYXQgaGFwcGVucyB3aGVuIGFub3RoZXIgbm9kZSB3aXRoIHRoZSBzYW1l
IG5hbWUgaXMgYXR0YWNoZWQuDQoNCj4gDQo+IFNlZSBvZl9kZXRhY2hfbm9kZSgpIGNhbGxp
bmcgX19vZl9kZXRhY2hfbm9kZSgpLCBub25lIG9mIHRoYXQgZGVwZW5kcyBvbg0KPiB0aGUg
cmVmY291bnQuDQo+IA0KPiBJdCdzIG9ubHkgdGhlIGFjdHVhbCBmcmVlaW5nIG9mIHRoZSBu
b2RlLCBpbiBvZl9ub2RlX3JlbGVhc2UoKSB0aGF0IGlzDQo+IHByZXZlbnRlZCBieSBsZWFr
ZWQgcmVmZXJlbmNlIGNvdW50cy4NCg0KUmlnaHQsIGJ1dCBpZiB3ZSBkaWQgcmVmY291bnQg
Y29ycmVjdGx5IHRoZXJlIGlzIHRoZSBuZXcgcHJvYmxlbSB0aGF0IHRoZSBub2RlIGlzIGZy
ZWVkIGFuZCBub3cgdGhlIHBoYW5kbGUgY2FjaGUgcG9pbnRzIGF0IGZyZWVkIG1lbW9yeSBp
biB0aGUgY2FzZSB3ZXJlIG5vIGludmFsaWRhdGlvbiBpcyBkb25lLg0KDQo+IA0KPiBTbyBJ
IGFncmVlIHdlIG5lZWQgdG8gZG8gYSBiZXR0ZXIgam9iIHdpdGggdGhlIHJlZmVyZW5jZSBj
b3VudGluZywgYnV0IEkNCj4gZG9uJ3Qgc2VlIGhvdyBpdCBpcyBjYXVzaW5nIHRoZSBwcm9i
bGVtIGhlcmUNCg0KTm93IGluIHJlZ2FyZHMgdG8gdGhlIHBoYW5kbGUgY2FjaGluZyBzb21l
aG93IEkgbWlzc2VkIHRoYXQgY29kZSBnb2luZyBpbnRvIE9GIGVhcmxpZXIgdGhpcyB5ZWFy
LiBUaGF0IHNob3VsZCBoYXZlIGhhZCBhdCBsZWFzdCBzb21lIGRpc2N1c3Npb24gZnJvbSBv
dXIgc2lkZSBiYXNlZCBvbiB0aGUgZmFjdCB0aGF0IGl0IGlzIGJ1aWx0IG9uIGR0YyBjb21w
aWxlciBhc3N1bXB0aW9uIHRoYXQgdGhlcmUgYXJlIGEgc2V0IG51bWJlciBvZiBwaGFuZGxl
cyB0aGF0IGFyZSBhbGxvY2F0ZWQgc3RhcnRpbmcgYXQgMS4ubiBzdWNoIHRoYXQgdGhleSBh
cmUgbW9ub3RvbmljYWxseSBpbmNyZWFzaW5nLiBUaGF0IGhhcyBhIG5pY2UgZml4ZWQgc2l6
ZSB3aXRoIE8oMSkgbG9va3VwIHRpbWUuIFBoeXAgZG9lc24ndCBndWFyYW50ZWUgYW55IHNv
cnQgb2YgbmljZW5lc3Mgd2l0aCBuaWNlbHkgb3JkZXJlZCBwaGFuZGxlcy4gRnJvbSB3aGF0
IEkndmUgc2VlbiB0aGVyZSBhcmUgYSBzdWJzZXQgb2YgcGhhbmRsZXMgdGhhdCBkZWNyZWFz
ZSBmcm9tICgtMSkgbW9ub3RvbmljYWxseSwgYW5kIHRoZW4gdGhlcmUgYXJlIGEgYnVuY2gg
b2YgcmFuZG9tIHZhbHVlcyBmb3IgY3B1cyBhbmQgSU9Bcy4gVGhpbmtpbmcgb24gaXQgbWln
aHQgbm90IGJlIHRoYXQgYmlnIG9mIGEgZGVhbCBhcyB3ZSBqdXN0IGVuZCB1cCBpbiB0aGUg
Y2FzZSB3aGVyZSB3ZSBoYXZlIGEgcGhhbmRsZSBjb2xsaXNpb24gb25lIG1ha2VzIGl0IGlu
dG8gdGhlIGNhY2hlIGFuZCBpcyBvcHRpbWl6ZWQgd2hpbGUgdGhlIG90aGVyIGRvZXNuJ3Qu
IE9uIGFub3RoZXIgbm90ZSwgdGhleSBjbGVhcmx5IGhpdCBhIHNpbWlsYXIgaXNzdWUgZHVy
aW5nIG92ZXJsYXkgYXR0YWNoIGFuZCByZW1vdmUsIGFuZCBhcyBSb2IgcG9pbnRlZCBvdXQg
dGhlaXIgc29sdXRpb24gdG8gaGFuZGxlIGl0IGlzIGEgZnVsbCBjYWNoZSBpbnZhbGlkYXRp
b24gZm9sbG93ZWQgYnkgcmVzY2FubmluZyB0aGUgd2hvbGUgdHJlZSB0byByZWJ1aWxkIGl0
LiBTZWVpbmcgYXMgb3VyIGR5bmFtaWMgbGlmZWN5Y2xlIGlzIG5vZGUgYnkgbm9kZSB0aGlz
IGRlZmluaXRlbHkgYWRkcyBzb21lIG92ZXJoZWFkLg0KDQotVHlyZWwNCg0KPiANCj4gY2hl
ZXJzDQo+IA0KDQo=

^ permalink raw reply

* Re: [PATCH v5 00/11] hugetlb: Factorize hugetlb architecture primitives
From: Luiz Capitulino @ 2018-07-31 20:06 UTC (permalink / raw)
  To: Alexandre Ghiti
  Cc: linux-mm, mike.kravetz, linux, catalin.marinas, will.deacon,
	tony.luck, fenghua.yu, ralf, paul.burton, jhogan, jejb, deller,
	benh, paulus, mpe, ysato, dalias, davem, tglx, mingo, hpa, x86,
	arnd, linux-arm-kernel, linux-kernel, linux-ia64, linux-mips,
	linux-parisc, linuxppc-dev, linux-sh, sparclinux, linux-arch
In-Reply-To: <20180731060155.16915-1-alex@ghiti.fr>

On Tue, 31 Jul 2018 06:01:44 +0000
Alexandre Ghiti <alex@ghiti.fr> wrote:

> [CC linux-mm for inclusion in -mm tree] 
> 
> In order to reduce copy/paste of functions across architectures and then
> make riscv hugetlb port (and future ports) simpler and smaller, this
> patchset intends to factorize the numerous hugetlb primitives that are
> defined across all the architectures.

[...]

>  15 files changed, 139 insertions(+), 382 deletions(-)

I imagine you're mostly interested in non-x86 review at this point, but
as this series is looking amazing:

Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>

^ permalink raw reply

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Benjamin Herrenschmidt @ 2018-07-31 20:36 UTC (permalink / raw)
  To: Christoph Hellwig, Michael S. Tsirkin
  Cc: Will Deacon, Anshuman Khandual, virtualization, linux-kernel,
	linuxppc-dev, aik, robh, joe, elfring, david, jasowang, mpe,
	linuxram, haren, paulus, srikar, robin.murphy,
	jean-philippe.brucker, marc.zyngier
In-Reply-To: <20180731173052.GA17153@infradead.org>

On Tue, 2018-07-31 at 10:30 -0700, Christoph Hellwig wrote:
> > However the question people raise is that DMA API is already full of
> > arch-specific tricks the likes of which are outlined in your post linked
> > above. How is this one much worse?
> 
> None of these warts is visible to the driver, they are all handled in
> the architecture (possibly on a per-bus basis).
> 
> So for virtio we really need to decide if it has one set of behavior
> as specified in the virtio spec, or if it behaves exactly as if it
> was on a PCI bus, or in fact probably both as you lined up.  But no
> magic arch specific behavior inbetween.

The only arch specific behaviour is needed in the case where it doesn't
behave like PCI. In this case, the PCI DMA ops are not suitable, but in
our secure VMs, we still need to make it use swiotlb in order to bounce
through non-secure pages.

It would be nice if "real PCI" was the default but it's not, VMs are
created in "legacy" mode all the times and we don't know at VM creation
time whether it will become a secure VM or not. The way our secure VMs
work is that they start as a normal VM, load a secure "payload" and
call the Ultravisor to "become" secure.

So we're in a bit of a bind here. We need that one-liner optional arch
hook to make virtio use swiotlb in that "IOMMU bypass" case.

Ben.

^ permalink raw reply

* ssb: Remove SSB_WARN_ON, SSB_BUG_ON and SSB_DEBUG
From: Michael Büsch @ 2018-07-31 20:15 UTC (permalink / raw)
  To: Kalle Valo
  Cc: linux-wireless, b43-dev, Joe Perches, Ralf Baechle, Paul Burton,
	James Hogan, linux-mips, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 21014 bytes --]

Use the standard WARN_ON instead.
If a small kernel is desired, WARN_ON can be disabled globally.

Also remove SSB_DEBUG. Besides WARN_ON it only adds a tiny debug check.
Include this check unconditionally.

Signed-off-by: Michael Buesch <m@bues.ch>
---

CC-ing Mips and PPC maintainers due to changes in defconfig



 arch/mips/configs/bcm47xx_defconfig |  1 -
 arch/powerpc/configs/wii_defconfig  |  1 -
 drivers/ssb/Kconfig                 |  9 -------
 drivers/ssb/driver_chipcommon.c     |  8 +++---
 drivers/ssb/driver_chipcommon_pmu.c | 10 ++++----
 drivers/ssb/driver_gpio.c           |  4 +--
 drivers/ssb/driver_pcicore.c        |  6 ++---
 drivers/ssb/embedded.c              | 10 ++++----
 drivers/ssb/host_soc.c              | 12 ++++-----
 drivers/ssb/main.c                  | 38 +++++++++++++----------------
 drivers/ssb/pci.c                   | 19 +++++----------
 drivers/ssb/pcmcia.c                | 14 +++++------
 drivers/ssb/scan.c                  |  4 +--
 drivers/ssb/sdio.c                  | 12 ++++-----
 drivers/ssb/ssb_private.h           |  9 -------
 include/linux/ssb/ssb.h             |  2 --
 16 files changed, 63 insertions(+), 96 deletions(-)

diff --git a/arch/mips/configs/bcm47xx_defconfig b/arch/mips/configs/bcm47xx_defconfig
index fad8e964f14c..ba800a892384 100644
--- a/arch/mips/configs/bcm47xx_defconfig
+++ b/arch/mips/configs/bcm47xx_defconfig
@@ -66,7 +66,6 @@ CONFIG_HW_RANDOM=y
 CONFIG_GPIO_SYSFS=y
 CONFIG_WATCHDOG=y
 CONFIG_BCM47XX_WDT=y
-CONFIG_SSB_DEBUG=y
 CONFIG_SSB_DRIVER_GIGE=y
 CONFIG_BCMA_DRIVER_GMAC_CMN=y
 CONFIG_USB=y
diff --git a/arch/powerpc/configs/wii_defconfig b/arch/powerpc/configs/wii_defconfig
index 10940533da71..f5c366b02828 100644
--- a/arch/powerpc/configs/wii_defconfig
+++ b/arch/powerpc/configs/wii_defconfig
@@ -78,7 +78,6 @@ CONFIG_GPIO_HLWD=y
 CONFIG_POWER_RESET=y
 CONFIG_POWER_RESET_GPIO=y
 # CONFIG_HWMON is not set
-CONFIG_SSB_DEBUG=y
 CONFIG_FB=y
 # CONFIG_VGA_CONSOLE is not set
 CONFIG_FRAMEBUFFER_CONSOLE=y
diff --git a/drivers/ssb/Kconfig b/drivers/ssb/Kconfig
index 6c438c819eb9..df30e1323252 100644
--- a/drivers/ssb/Kconfig
+++ b/drivers/ssb/Kconfig
@@ -89,15 +89,6 @@ config SSB_HOST_SOC
 
 	  If unsure, say N
 
-config SSB_DEBUG
-	bool "SSB debugging"
-	depends on SSB
-	help
-	  This turns on additional runtime checks and debugging
-	  messages. Turn this on for SSB troubleshooting.
-
-	  If unsure, say N
-
 config SSB_SERIAL
 	bool
 	depends on SSB
diff --git a/drivers/ssb/driver_chipcommon.c b/drivers/ssb/driver_chipcommon.c
index 48050c6fd847..99a4656d113d 100644
--- a/drivers/ssb/driver_chipcommon.c
+++ b/drivers/ssb/driver_chipcommon.c
@@ -56,7 +56,7 @@ void ssb_chipco_set_clockmode(struct ssb_chipcommon *cc,
 
 	if (cc->capabilities & SSB_CHIPCO_CAP_PMU)
 		return; /* PMU controls clockmode, separated function needed */
-	SSB_WARN_ON(ccdev->id.revision >= 20);
+	WARN_ON(ccdev->id.revision >= 20);
 
 	/* chipcommon cores prior to rev6 don't support dynamic clock control */
 	if (ccdev->id.revision < 6)
@@ -111,7 +111,7 @@ void ssb_chipco_set_clockmode(struct ssb_chipcommon *cc,
 		}
 		break;
 	default:
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 }
 
@@ -164,7 +164,7 @@ static int chipco_pctl_clockfreqlimit(struct ssb_chipcommon *cc, int get_max)
 			divisor = 32;
 			break;
 		default:
-			SSB_WARN_ON(1);
+			WARN_ON(1);
 		}
 	} else if (cc->dev->id.revision < 10) {
 		switch (clocksrc) {
@@ -277,7 +277,7 @@ static void calc_fast_powerup_delay(struct ssb_chipcommon *cc)
 	minfreq = chipco_pctl_clockfreqlimit(cc, 0);
 	pll_on_delay = chipco_read32(cc, SSB_CHIPCO_PLLONDELAY);
 	tmp = (((pll_on_delay + 2) * 1000000) + (minfreq - 1)) / minfreq;
-	SSB_WARN_ON(tmp & ~0xFFFF);
+	WARN_ON(tmp & ~0xFFFF);
 
 	cc->fast_pwrup_delay = tmp;
 }
diff --git a/drivers/ssb/driver_chipcommon_pmu.c b/drivers/ssb/driver_chipcommon_pmu.c
index e28682a53cdf..0f60e90ded26 100644
--- a/drivers/ssb/driver_chipcommon_pmu.c
+++ b/drivers/ssb/driver_chipcommon_pmu.c
@@ -128,7 +128,7 @@ static void ssb_pmu0_pllinit_r0(struct ssb_chipcommon *cc,
 			      ~(1 << SSB_PMURES_5354_BB_PLL_PU));
 		break;
 	default:
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 	for (i = 1500; i; i--) {
 		tmp = chipco_read32(cc, SSB_CHIPCO_CLKCTLST);
@@ -265,7 +265,7 @@ static void ssb_pmu1_pllinit_r0(struct ssb_chipcommon *cc,
 		buffer_strength = 0x222222;
 		break;
 	default:
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 	for (i = 1500; i; i--) {
 		tmp = chipco_read32(cc, SSB_CHIPCO_CLKCTLST);
@@ -501,7 +501,7 @@ static void ssb_pmu_resources_init(struct ssb_chipcommon *cc)
 					      ~(depend_tab[i].depend));
 				break;
 			default:
-				SSB_WARN_ON(1);
+				WARN_ON(1);
 			}
 		}
 	}
@@ -568,12 +568,12 @@ void ssb_pmu_set_ldo_voltage(struct ssb_chipcommon *cc,
 			mask = 0x3F;
 			break;
 		default:
-			SSB_WARN_ON(1);
+			WARN_ON(1);
 			return;
 		}
 		break;
 	case 0x4312:
-		if (SSB_WARN_ON(id != LDO_PAREF))
+		if (WARN_ON(id != LDO_PAREF))
 			return;
 		addr = 0;
 		shift = 21;
diff --git a/drivers/ssb/driver_gpio.c b/drivers/ssb/driver_gpio.c
index 6ce4abf7d473..e809dae4c470 100644
--- a/drivers/ssb/driver_gpio.c
+++ b/drivers/ssb/driver_gpio.c
@@ -461,7 +461,7 @@ int ssb_gpio_init(struct ssb_bus *bus)
 	else if (ssb_extif_available(&bus->extif))
 		return ssb_gpio_extif_init(bus);
 	else
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 
 	return -1;
 }
@@ -473,7 +473,7 @@ int ssb_gpio_unregister(struct ssb_bus *bus)
 		gpiochip_remove(&bus->gpio);
 		return 0;
 	} else {
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 
 	return -1;
diff --git a/drivers/ssb/driver_pcicore.c b/drivers/ssb/driver_pcicore.c
index ae80b3171523..6a5622e0ded5 100644
--- a/drivers/ssb/driver_pcicore.c
+++ b/drivers/ssb/driver_pcicore.c
@@ -115,7 +115,7 @@ static int ssb_extpci_read_config(struct ssb_pcicore *pc,
 	u32 addr, val;
 	void __iomem *mmio;
 
-	SSB_WARN_ON(!pc->hostmode);
+	WARN_ON(!pc->hostmode);
 	if (unlikely(len != 1 && len != 2 && len != 4))
 		goto out;
 	addr = get_cfgspace_addr(pc, bus, dev, func, off);
@@ -161,7 +161,7 @@ static int ssb_extpci_write_config(struct ssb_pcicore *pc,
 	u32 addr, val = 0;
 	void __iomem *mmio;
 
-	SSB_WARN_ON(!pc->hostmode);
+	WARN_ON(!pc->hostmode);
 	if (unlikely(len != 1 && len != 2 && len != 4))
 		goto out;
 	addr = get_cfgspace_addr(pc, bus, dev, func, off);
@@ -702,7 +702,7 @@ int ssb_pcicore_dev_irqvecs_enable(struct ssb_pcicore *pc,
 		/* Calculate the "coremask" for the device. */
 		coremask = (1 << dev->core_index);
 
-		SSB_WARN_ON(bus->bustype != SSB_BUSTYPE_PCI);
+		WARN_ON(bus->bustype != SSB_BUSTYPE_PCI);
 		err = pci_read_config_dword(bus->host_pci, SSB_PCI_IRQMASK, &tmp);
 		if (err)
 			goto out;
diff --git a/drivers/ssb/embedded.c b/drivers/ssb/embedded.c
index c39ddf8988c1..8254ed25e063 100644
--- a/drivers/ssb/embedded.c
+++ b/drivers/ssb/embedded.c
@@ -77,7 +77,7 @@ u32 ssb_gpio_in(struct ssb_bus *bus, u32 mask)
 	else if (ssb_extif_available(&bus->extif))
 		res = ssb_extif_gpio_in(&bus->extif, mask);
 	else
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	spin_unlock_irqrestore(&bus->gpio_lock, flags);
 
 	return res;
@@ -95,7 +95,7 @@ u32 ssb_gpio_out(struct ssb_bus *bus, u32 mask, u32 value)
 	else if (ssb_extif_available(&bus->extif))
 		res = ssb_extif_gpio_out(&bus->extif, mask, value);
 	else
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	spin_unlock_irqrestore(&bus->gpio_lock, flags);
 
 	return res;
@@ -113,7 +113,7 @@ u32 ssb_gpio_outen(struct ssb_bus *bus, u32 mask, u32 value)
 	else if (ssb_extif_available(&bus->extif))
 		res = ssb_extif_gpio_outen(&bus->extif, mask, value);
 	else
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	spin_unlock_irqrestore(&bus->gpio_lock, flags);
 
 	return res;
@@ -145,7 +145,7 @@ u32 ssb_gpio_intmask(struct ssb_bus *bus, u32 mask, u32 value)
 	else if (ssb_extif_available(&bus->extif))
 		res = ssb_extif_gpio_intmask(&bus->extif, mask, value);
 	else
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	spin_unlock_irqrestore(&bus->gpio_lock, flags);
 
 	return res;
@@ -163,7 +163,7 @@ u32 ssb_gpio_polarity(struct ssb_bus *bus, u32 mask, u32 value)
 	else if (ssb_extif_available(&bus->extif))
 		res = ssb_extif_gpio_polarity(&bus->extif, mask, value);
 	else
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	spin_unlock_irqrestore(&bus->gpio_lock, flags);
 
 	return res;
diff --git a/drivers/ssb/host_soc.c b/drivers/ssb/host_soc.c
index eadaedf466f6..3b438480515c 100644
--- a/drivers/ssb/host_soc.c
+++ b/drivers/ssb/host_soc.c
@@ -61,7 +61,7 @@ static void ssb_host_soc_block_read(struct ssb_device *dev, void *buffer,
 	case sizeof(u16): {
 		__le16 *buf = buffer;
 
-		SSB_WARN_ON(count & 1);
+		WARN_ON(count & 1);
 		while (count) {
 			*buf = (__force __le16)__raw_readw(addr);
 			buf++;
@@ -72,7 +72,7 @@ static void ssb_host_soc_block_read(struct ssb_device *dev, void *buffer,
 	case sizeof(u32): {
 		__le32 *buf = buffer;
 
-		SSB_WARN_ON(count & 3);
+		WARN_ON(count & 3);
 		while (count) {
 			*buf = (__force __le32)__raw_readl(addr);
 			buf++;
@@ -81,7 +81,7 @@ static void ssb_host_soc_block_read(struct ssb_device *dev, void *buffer,
 		break;
 	}
 	default:
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 }
 #endif /* CONFIG_SSB_BLOCKIO */
@@ -134,7 +134,7 @@ static void ssb_host_soc_block_write(struct ssb_device *dev, const void *buffer,
 	case sizeof(u16): {
 		const __le16 *buf = buffer;
 
-		SSB_WARN_ON(count & 1);
+		WARN_ON(count & 1);
 		while (count) {
 			__raw_writew((__force u16)(*buf), addr);
 			buf++;
@@ -145,7 +145,7 @@ static void ssb_host_soc_block_write(struct ssb_device *dev, const void *buffer,
 	case sizeof(u32): {
 		const __le32 *buf = buffer;
 
-		SSB_WARN_ON(count & 3);
+		WARN_ON(count & 3);
 		while (count) {
 			__raw_writel((__force u32)(*buf), addr);
 			buf++;
@@ -154,7 +154,7 @@ static void ssb_host_soc_block_write(struct ssb_device *dev, const void *buffer,
 		break;
 	}
 	default:
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 }
 #endif /* CONFIG_SSB_BLOCKIO */
diff --git a/drivers/ssb/main.c b/drivers/ssb/main.c
index 9da56d2fdbd3..0a26984acb2c 100644
--- a/drivers/ssb/main.c
+++ b/drivers/ssb/main.c
@@ -209,7 +209,7 @@ int ssb_devices_freeze(struct ssb_bus *bus, struct ssb_freeze_context *ctx)
 
 	memset(ctx, 0, sizeof(*ctx));
 	ctx->bus = bus;
-	SSB_WARN_ON(bus->nr_devices > ARRAY_SIZE(ctx->device_frozen));
+	WARN_ON(bus->nr_devices > ARRAY_SIZE(ctx->device_frozen));
 
 	for (i = 0; i < bus->nr_devices; i++) {
 		sdev = ssb_device_get(&bus->devices[i]);
@@ -220,7 +220,7 @@ int ssb_devices_freeze(struct ssb_bus *bus, struct ssb_freeze_context *ctx)
 			continue;
 		}
 		sdrv = drv_to_ssb_drv(sdev->dev->driver);
-		if (SSB_WARN_ON(!sdrv->remove))
+		if (WARN_ON(!sdrv->remove))
 			continue;
 		sdrv->remove(sdev);
 		ctx->device_frozen[i] = 1;
@@ -248,10 +248,10 @@ int ssb_devices_thaw(struct ssb_freeze_context *ctx)
 			continue;
 		sdev = &bus->devices[i];
 
-		if (SSB_WARN_ON(!sdev->dev || !sdev->dev->driver))
+		if (WARN_ON(!sdev->dev || !sdev->dev->driver))
 			continue;
 		sdrv = drv_to_ssb_drv(sdev->dev->driver);
-		if (SSB_WARN_ON(!sdrv || !sdrv->probe))
+		if (WARN_ON(!sdrv || !sdrv->probe))
 			continue;
 
 		err = sdrv->probe(sdev, &sdev->id);
@@ -861,13 +861,13 @@ u32 ssb_calc_clock_rate(u32 plltype, u32 n, u32 m)
 	case SSB_PLLTYPE_2: /* 48Mhz, 4 dividers */
 		n1 += SSB_CHIPCO_CLK_T2_BIAS;
 		n2 += SSB_CHIPCO_CLK_T2_BIAS;
-		SSB_WARN_ON(!((n1 >= 2) && (n1 <= 7)));
-		SSB_WARN_ON(!((n2 >= 5) && (n2 <= 23)));
+		WARN_ON(!((n1 >= 2) && (n1 <= 7)));
+		WARN_ON(!((n2 >= 5) && (n2 <= 23)));
 		break;
 	case SSB_PLLTYPE_5: /* 25Mhz, 4 dividers */
 		return 100000000;
 	default:
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 
 	switch (plltype) {
@@ -916,9 +916,9 @@ u32 ssb_calc_clock_rate(u32 plltype, u32 n, u32 m)
 		m1 += SSB_CHIPCO_CLK_T2_BIAS;
 		m2 += SSB_CHIPCO_CLK_T2M2_BIAS;
 		m3 += SSB_CHIPCO_CLK_T2_BIAS;
-		SSB_WARN_ON(!((m1 >= 2) && (m1 <= 7)));
-		SSB_WARN_ON(!((m2 >= 3) && (m2 <= 10)));
-		SSB_WARN_ON(!((m3 >= 2) && (m3 <= 7)));
+		WARN_ON(!((m1 >= 2) && (m1 <= 7)));
+		WARN_ON(!((m2 >= 3) && (m2 <= 10)));
+		WARN_ON(!((m3 >= 2) && (m3 <= 7)));
 
 		if (!(mc & SSB_CHIPCO_CLK_T2MC_M1BYP))
 			clock /= m1;
@@ -928,7 +928,7 @@ u32 ssb_calc_clock_rate(u32 plltype, u32 n, u32 m)
 			clock /= m3;
 		return clock;
 	default:
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 	return 0;
 }
@@ -1169,9 +1169,7 @@ int ssb_bus_may_powerdown(struct ssb_bus *bus)
 	if (err)
 		goto error;
 out:
-#ifdef CONFIG_SSB_DEBUG
 	bus->powered_up = 0;
-#endif
 	return err;
 error:
 	pr_err("Bus powerdown failed\n");
@@ -1188,9 +1186,7 @@ int ssb_bus_powerup(struct ssb_bus *bus, bool dynamic_pctl)
 	if (err)
 		goto error;
 
-#ifdef CONFIG_SSB_DEBUG
 	bus->powered_up = 1;
-#endif
 
 	mode = dynamic_pctl ? SSB_CLKMODE_DYNAMIC : SSB_CLKMODE_FAST;
 	ssb_chipco_set_clockmode(&bus->chipco, mode);
@@ -1242,15 +1238,15 @@ u32 ssb_admatch_base(u32 adm)
 		base = (adm & SSB_ADM_BASE0);
 		break;
 	case SSB_ADM_TYPE1:
-		SSB_WARN_ON(adm & SSB_ADM_NEG); /* unsupported */
+		WARN_ON(adm & SSB_ADM_NEG); /* unsupported */
 		base = (adm & SSB_ADM_BASE1);
 		break;
 	case SSB_ADM_TYPE2:
-		SSB_WARN_ON(adm & SSB_ADM_NEG); /* unsupported */
+		WARN_ON(adm & SSB_ADM_NEG); /* unsupported */
 		base = (adm & SSB_ADM_BASE2);
 		break;
 	default:
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 
 	return base;
@@ -1266,15 +1262,15 @@ u32 ssb_admatch_size(u32 adm)
 		size = ((adm & SSB_ADM_SZ0) >> SSB_ADM_SZ0_SHIFT);
 		break;
 	case SSB_ADM_TYPE1:
-		SSB_WARN_ON(adm & SSB_ADM_NEG); /* unsupported */
+		WARN_ON(adm & SSB_ADM_NEG); /* unsupported */
 		size = ((adm & SSB_ADM_SZ1) >> SSB_ADM_SZ1_SHIFT);
 		break;
 	case SSB_ADM_TYPE2:
-		SSB_WARN_ON(adm & SSB_ADM_NEG); /* unsupported */
+		WARN_ON(adm & SSB_ADM_NEG); /* unsupported */
 		size = ((adm & SSB_ADM_SZ2) >> SSB_ADM_SZ2_SHIFT);
 		break;
 	default:
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 	size = (1 << (size + 1));
 
diff --git a/drivers/ssb/pci.c b/drivers/ssb/pci.c
index ad4308529eba..84807a9b4b13 100644
--- a/drivers/ssb/pci.c
+++ b/drivers/ssb/pci.c
@@ -946,7 +946,6 @@ int ssb_pci_get_invariants(struct ssb_bus *bus,
 	return err;
 }
 
-#ifdef CONFIG_SSB_DEBUG
 static int ssb_pci_assert_buspower(struct ssb_bus *bus)
 {
 	if (likely(bus->powered_up))
@@ -960,12 +959,6 @@ static int ssb_pci_assert_buspower(struct ssb_bus *bus)
 
 	return -ENODEV;
 }
-#else /* DEBUG */
-static inline int ssb_pci_assert_buspower(struct ssb_bus *bus)
-{
-	return 0;
-}
-#endif /* DEBUG */
 
 static u8 ssb_pci_read8(struct ssb_device *dev, u16 offset)
 {
@@ -1024,15 +1017,15 @@ static void ssb_pci_block_read(struct ssb_device *dev, void *buffer,
 		ioread8_rep(addr, buffer, count);
 		break;
 	case sizeof(u16):
-		SSB_WARN_ON(count & 1);
+		WARN_ON(count & 1);
 		ioread16_rep(addr, buffer, count >> 1);
 		break;
 	case sizeof(u32):
-		SSB_WARN_ON(count & 3);
+		WARN_ON(count & 3);
 		ioread32_rep(addr, buffer, count >> 2);
 		break;
 	default:
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 
 	return;
@@ -1098,15 +1091,15 @@ static void ssb_pci_block_write(struct ssb_device *dev, const void *buffer,
 		iowrite8_rep(addr, buffer, count);
 		break;
 	case sizeof(u16):
-		SSB_WARN_ON(count & 1);
+		WARN_ON(count & 1);
 		iowrite16_rep(addr, buffer, count >> 1);
 		break;
 	case sizeof(u32):
-		SSB_WARN_ON(count & 3);
+		WARN_ON(count & 3);
 		iowrite32_rep(addr, buffer, count >> 2);
 		break;
 	default:
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 }
 #endif /* CONFIG_SSB_BLOCKIO */
diff --git a/drivers/ssb/pcmcia.c b/drivers/ssb/pcmcia.c
index 20f63cc88e70..567013f8a8be 100644
--- a/drivers/ssb/pcmcia.c
+++ b/drivers/ssb/pcmcia.c
@@ -169,7 +169,7 @@ int ssb_pcmcia_switch_segment(struct ssb_bus *bus, u8 seg)
 	int err;
 	u8 val;
 
-	SSB_WARN_ON((seg != 0) && (seg != 1));
+	WARN_ON((seg != 0) && (seg != 1));
 	while (1) {
 		err = ssb_pcmcia_cfg_write(bus, SSB_PCMCIA_MEMSEG, seg);
 		if (err)
@@ -299,7 +299,7 @@ static void ssb_pcmcia_block_read(struct ssb_device *dev, void *buffer,
 	case sizeof(u16): {
 		__le16 *buf = buffer;
 
-		SSB_WARN_ON(count & 1);
+		WARN_ON(count & 1);
 		while (count) {
 			*buf = (__force __le16)__raw_readw(addr);
 			buf++;
@@ -310,7 +310,7 @@ static void ssb_pcmcia_block_read(struct ssb_device *dev, void *buffer,
 	case sizeof(u32): {
 		__le16 *buf = buffer;
 
-		SSB_WARN_ON(count & 3);
+		WARN_ON(count & 3);
 		while (count) {
 			*buf = (__force __le16)__raw_readw(addr);
 			buf++;
@@ -321,7 +321,7 @@ static void ssb_pcmcia_block_read(struct ssb_device *dev, void *buffer,
 		break;
 	}
 	default:
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 unlock:
 	spin_unlock_irqrestore(&bus->bar_lock, flags);
@@ -399,7 +399,7 @@ static void ssb_pcmcia_block_write(struct ssb_device *dev, const void *buffer,
 	case sizeof(u16): {
 		const __le16 *buf = buffer;
 
-		SSB_WARN_ON(count & 1);
+		WARN_ON(count & 1);
 		while (count) {
 			__raw_writew((__force u16)(*buf), addr);
 			buf++;
@@ -410,7 +410,7 @@ static void ssb_pcmcia_block_write(struct ssb_device *dev, const void *buffer,
 	case sizeof(u32): {
 		const __le16 *buf = buffer;
 
-		SSB_WARN_ON(count & 3);
+		WARN_ON(count & 3);
 		while (count) {
 			__raw_writew((__force u16)(*buf), addr);
 			buf++;
@@ -421,7 +421,7 @@ static void ssb_pcmcia_block_write(struct ssb_device *dev, const void *buffer,
 		break;
 	}
 	default:
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 unlock:
 	mmiowb();
diff --git a/drivers/ssb/scan.c b/drivers/ssb/scan.c
index 71fbb4b3eb6a..6ceee98ed6ff 100644
--- a/drivers/ssb/scan.c
+++ b/drivers/ssb/scan.c
@@ -210,7 +210,7 @@ void ssb_iounmap(struct ssb_bus *bus)
 #ifdef CONFIG_SSB_PCIHOST
 		pci_iounmap(bus->host_pci, bus->mmio);
 #else
-		SSB_BUG_ON(1); /* Can't reach this code. */
+		WARN_ON(1); /* Can't reach this code. */
 #endif
 		break;
 	case SSB_BUSTYPE_SDIO:
@@ -236,7 +236,7 @@ static void __iomem *ssb_ioremap(struct ssb_bus *bus,
 #ifdef CONFIG_SSB_PCIHOST
 		mmio = pci_iomap(bus->host_pci, 0, ~0UL);
 #else
-		SSB_BUG_ON(1); /* Can't reach this code. */
+		WARN_ON(1); /* Can't reach this code. */
 #endif
 		break;
 	case SSB_BUSTYPE_SDIO:
diff --git a/drivers/ssb/sdio.c b/drivers/ssb/sdio.c
index 1aedc5fbb62f..7fe0afb42234 100644
--- a/drivers/ssb/sdio.c
+++ b/drivers/ssb/sdio.c
@@ -316,18 +316,18 @@ static void ssb_sdio_block_read(struct ssb_device *dev, void *buffer,
 		break;
 	}
 	case sizeof(u16): {
-		SSB_WARN_ON(count & 1);
+		WARN_ON(count & 1);
 		error = sdio_readsb(bus->host_sdio, buffer, offset, count);
 		break;
 	}
 	case sizeof(u32): {
-		SSB_WARN_ON(count & 3);
+		WARN_ON(count & 3);
 		offset |= SBSDIO_SB_ACCESS_2_4B_FLAG;	/* 32 bit data access */
 		error = sdio_readsb(bus->host_sdio, buffer, offset, count);
 		break;
 	}
 	default:
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 	if (!error)
 		goto out;
@@ -423,18 +423,18 @@ static void ssb_sdio_block_write(struct ssb_device *dev, const void *buffer,
 				     (void *)buffer, count);
 		break;
 	case sizeof(u16):
-		SSB_WARN_ON(count & 1);
+		WARN_ON(count & 1);
 		error = sdio_writesb(bus->host_sdio, offset,
 				     (void *)buffer, count);
 		break;
 	case sizeof(u32):
-		SSB_WARN_ON(count & 3);
+		WARN_ON(count & 3);
 		offset |= SBSDIO_SB_ACCESS_2_4B_FLAG;	/* 32 bit data access */
 		error = sdio_writesb(bus->host_sdio, offset,
 				     (void *)buffer, count);
 		break;
 	default:
-		SSB_WARN_ON(1);
+		WARN_ON(1);
 	}
 	if (!error)
 		goto out;
diff --git a/drivers/ssb/ssb_private.h b/drivers/ssb/ssb_private.h
index 885e278c4487..5f31bdfbe77f 100644
--- a/drivers/ssb/ssb_private.h
+++ b/drivers/ssb/ssb_private.h
@@ -9,15 +9,6 @@
 #include <linux/types.h>
 #include <linux/bcm47xx_wdt.h>
 
-#ifdef CONFIG_SSB_DEBUG
-# define SSB_WARN_ON(x)		WARN_ON(x)
-# define SSB_BUG_ON(x)		BUG_ON(x)
-#else
-static inline int __ssb_do_nothing(int x) { return x; }
-# define SSB_WARN_ON(x)		__ssb_do_nothing(unlikely(!!(x)))
-# define SSB_BUG_ON(x)		__ssb_do_nothing(unlikely(!!(x)))
-#endif
-
 
 /* pci.c */
 #ifdef CONFIG_SSB_PCIHOST
diff --git a/include/linux/ssb/ssb.h b/include/linux/ssb/ssb.h
index 3b43655cabe6..0d5a2691e7e9 100644
--- a/include/linux/ssb/ssb.h
+++ b/include/linux/ssb/ssb.h
@@ -499,11 +499,9 @@ struct ssb_bus {
 
 	/* Internal-only stuff follows. Do not touch. */
 	struct list_head list;
-#ifdef CONFIG_SSB_DEBUG
 	/* Is the bus already powered up? */
 	bool powered_up;
 	int power_warn_count;
-#endif /* DEBUG */
 };
 
 enum ssb_quirks {
-- 
2.18.0




-- 
Michael

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH] powerpc/selftests: Wait all threads to join
From: Breno Leitao @ 2018-07-31 20:55 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mikey, rashmicy, Breno Leitao

Test tm-tmspr might exit before all threads stop executing, because it just
waits for the very last thread to join before proceeding/exiting.

This patch makes sure that all threads that were created will join before
proceeding/exiting.

This patch also guarantees that the amount of threads being created is equal
to thread_num.

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 tools/testing/selftests/powerpc/tm/tm-tmspr.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/powerpc/tm/tm-tmspr.c b/tools/testing/selftests/powerpc/tm/tm-tmspr.c
index 2bda81c7bf23..df1d7d4b1c89 100644
--- a/tools/testing/selftests/powerpc/tm/tm-tmspr.c
+++ b/tools/testing/selftests/powerpc/tm/tm-tmspr.c
@@ -98,7 +98,7 @@ void texasr(void *in)
 
 int test_tmspr()
 {
-	pthread_t 	thread;
+	pthread_t	*thread;
 	int	   	thread_num;
 	unsigned long	i;
 
@@ -107,21 +107,28 @@ int test_tmspr()
 	/* To cause some context switching */
 	thread_num = 10 * sysconf(_SC_NPROCESSORS_ONLN);
 
+	thread = malloc(thread_num * sizeof(pthread_t));
+	if (thread == NULL)
+		return EXIT_FAILURE;
+
 	/* Test TFIAR and TFHAR */
-	for (i = 0 ; i < thread_num ; i += 2){
-		if (pthread_create(&thread, NULL, (void*)tfiar_tfhar, (void *)i))
+	for (i = 0; i < thread_num; i += 2) {
+		if (pthread_create(&thread[i], NULL, (void *)tfiar_tfhar,
+				   (void *)i))
 			return EXIT_FAILURE;
 	}
-	if (pthread_join(thread, NULL) != 0)
-		return EXIT_FAILURE;
-
 	/* Test TEXASR */
-	for (i = 0 ; i < thread_num ; i++){
-		if (pthread_create(&thread, NULL, (void*)texasr, (void *)i))
+	for (i = 1; i < thread_num; i += 2) {
+		if (pthread_create(&thread[i], NULL, (void *)texasr, (void *)i))
 			return EXIT_FAILURE;
 	}
-	if (pthread_join(thread, NULL) != 0)
-		return EXIT_FAILURE;
+
+	for (i = 0; i < thread_num; i++) {
+		if (pthread_join(thread[i], NULL) != 0)
+			return EXIT_FAILURE;
+	}
+
+	free(thread);
 
 	if (passed)
 		return 0;
-- 
2.16.3

^ permalink raw reply related

* Re: [PATCH 1/4] treewide: convert ISO_8859-1 text comments to utf-8
From: Rob Herring @ 2018-07-31 21:49 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Andrew Morton, Joe Perches, Samuel Ortiz, David S. Miller,
	Michael Ellerman, Jonathan Cameron, linux-wireless, netdev,
	devicetree, linux-kernel, linux-arm-kernel, linux-crypto,
	linuxppc-dev, linux-iio, linux-pm, lvs-devel, netfilter-devel,
	coreteam
In-Reply-To: <20180724111600.4158975-1-arnd@arndb.de>

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=utf-8, Size: 1690 bytes --]

On Tue, Jul 24, 2018 at 01:13:25PM +0200, Arnd Bergmann wrote:
> Almost all files in the kernel are either plain text or UTF-8
> encoded. A couple however are ISO_8859-1, usually just a few
> characters in a C comments, for historic reasons.
> 
> This converts them all to UTF-8 for consistency.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  .../devicetree/bindings/net/nfc/pn544.txt     |   2 +-
>  arch/arm/boot/dts/sun4i-a10-inet97fv2.dts     |   2 +-
>  arch/arm/crypto/sha256_glue.c                 |   2 +-
>  arch/arm/crypto/sha256_neon_glue.c            |   4 +-
>  drivers/crypto/vmx/ghashp8-ppc.pl             |  12 +-
>  drivers/iio/dac/ltc2632.c                     |   2 +-
>  drivers/power/reset/ltc2952-poweroff.c        |   4 +-
>  kernel/events/callchain.c                     |   2 +-
>  net/netfilter/ipvs/Kconfig                    |   8 +-
>  net/netfilter/ipvs/ip_vs_mh.c                 |   4 +-
>  tools/power/cpupower/po/de.po                 |  44 +++----
>  tools/power/cpupower/po/fr.po                 | 120 +++++++++---------
>  12 files changed, 103 insertions(+), 103 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/net/nfc/pn544.txt b/Documentation/devicetree/bindings/net/nfc/pn544.txt
> index 538a86f7b2b0..72593f056b75 100644
> --- a/Documentation/devicetree/bindings/net/nfc/pn544.txt
> +++ b/Documentation/devicetree/bindings/net/nfc/pn544.txt
> @@ -2,7 +2,7 @@
>  
>  Required properties:
>  - compatible: Should be "nxp,pn544-i2c".
> -- clock-frequency: I²C work frequency.
> +- clock-frequency: I²C work frequency.

I'd prefer just plain ASCII 'I2C' here, but either way:

Acked-by: Rob Herring <robh@kernel.org>

Rob

^ permalink raw reply

* Re: [PATCH v2 2/2] selftests/powerpc: Add more version checks to alignment_handler test
From: Andrew Donnellan @ 2018-08-01  0:16 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: mikey
In-Reply-To: <20180731120842.32715-2-mpe@ellerman.id.au>

On 31/07/18 22:08, Michael Ellerman wrote:
> The alignment_handler is documented to only work on Power8/Power9, but
> we can make it run on older CPUs by guarding more of the tests with
> feature checks.
> 
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Looks good to me.

Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>

> ---
>   .../powerpc/alignment/alignment_handler.c          | 67 +++++++++++++++++++---
>   1 file changed, 59 insertions(+), 8 deletions(-)
> 
> v2: Don't incorrectly duplicate any of the tests, as noticed by @ajd.
> 
> diff --git a/tools/testing/selftests/powerpc/alignment/alignment_handler.c b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
> index 0eddd16af49f..169a8b9719fb 100644
> --- a/tools/testing/selftests/powerpc/alignment/alignment_handler.c
> +++ b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
> @@ -49,6 +49,8 @@
>   #include <setjmp.h>
>   #include <signal.h>
>   
> +#include <asm/cputable.h>
> +
>   #include "utils.h"
>   
>   int bufsize;
> @@ -289,6 +291,7 @@ int test_alignment_handler_vsx_206(void)
>   	int rc = 0;
>   
>   	SKIP_IF(!can_open_fb0());
> +	SKIP_IF(!have_hwcap(PPC_FEATURE_ARCH_2_06));
>   
>   	printf("VSX: 2.06B\n");
>   	LOAD_VSX_XFORM_TEST(lxvd2x);
> @@ -306,6 +309,7 @@ int test_alignment_handler_vsx_207(void)
>   	int rc = 0;
>   
>   	SKIP_IF(!can_open_fb0());
> +	SKIP_IF(!have_hwcap2(PPC_FEATURE2_ARCH_2_07));
>   
>   	printf("VSX: 2.07B\n");
>   	LOAD_VSX_XFORM_TEST(lxsspx);
> @@ -380,7 +384,6 @@ int test_alignment_handler_integer(void)
>   	LOAD_DFORM_TEST(ldu);
>   	LOAD_XFORM_TEST(ldx);
>   	LOAD_XFORM_TEST(ldux);
> -	LOAD_XFORM_TEST(ldbrx);
>   	LOAD_DFORM_TEST(lmw);
>   	STORE_DFORM_TEST(stb);
>   	STORE_XFORM_TEST(stbx);
> @@ -400,8 +403,23 @@ int test_alignment_handler_integer(void)
>   	STORE_XFORM_TEST(stdx);
>   	STORE_DFORM_TEST(stdu);
>   	STORE_XFORM_TEST(stdux);
> -	STORE_XFORM_TEST(stdbrx);
>   	STORE_DFORM_TEST(stmw);
> +
> +	return rc;
> +}
> +
> +int test_alignment_handler_integer_206(void)
> +{
> +	int rc = 0;
> +
> +	SKIP_IF(!can_open_fb0());
> +	SKIP_IF(!have_hwcap(PPC_FEATURE_ARCH_2_06));
> +
> +	printf("Integer: 2.06\n");
> +
> +	LOAD_XFORM_TEST(ldbrx);
> +	STORE_XFORM_TEST(stdbrx);
> +
>   	return rc;
>   }
>   
> @@ -410,6 +428,7 @@ int test_alignment_handler_vmx(void)
>   	int rc = 0;
>   
>   	SKIP_IF(!can_open_fb0());
> +	SKIP_IF(!have_hwcap(PPC_FEATURE_HAS_ALTIVEC));
>   
>   	printf("VMX\n");
>   	LOAD_VMX_XFORM_TEST(lvx);
> @@ -441,20 +460,14 @@ int test_alignment_handler_fp(void)
>   	printf("Floating point\n");
>   	LOAD_FLOAT_DFORM_TEST(lfd);
>   	LOAD_FLOAT_XFORM_TEST(lfdx);
> -	LOAD_FLOAT_DFORM_TEST(lfdp);
> -	LOAD_FLOAT_XFORM_TEST(lfdpx);
>   	LOAD_FLOAT_DFORM_TEST(lfdu);
>   	LOAD_FLOAT_XFORM_TEST(lfdux);
>   	LOAD_FLOAT_DFORM_TEST(lfs);
>   	LOAD_FLOAT_XFORM_TEST(lfsx);
>   	LOAD_FLOAT_DFORM_TEST(lfsu);
>   	LOAD_FLOAT_XFORM_TEST(lfsux);
> -	LOAD_FLOAT_XFORM_TEST(lfiwzx);
> -	LOAD_FLOAT_XFORM_TEST(lfiwax);
>   	STORE_FLOAT_DFORM_TEST(stfd);
>   	STORE_FLOAT_XFORM_TEST(stfdx);
> -	STORE_FLOAT_DFORM_TEST(stfdp);
> -	STORE_FLOAT_XFORM_TEST(stfdpx);
>   	STORE_FLOAT_DFORM_TEST(stfdu);
>   	STORE_FLOAT_XFORM_TEST(stfdux);
>   	STORE_FLOAT_DFORM_TEST(stfs);
> @@ -466,6 +479,38 @@ int test_alignment_handler_fp(void)
>   	return rc;
>   }
>   
> +int test_alignment_handler_fp_205(void)
> +{
> +	int rc = 0;
> +
> +	SKIP_IF(!can_open_fb0());
> +	SKIP_IF(!have_hwcap(PPC_FEATURE_ARCH_2_05));
> +
> +	printf("Floating point: 2.05\n");
> +
> +	LOAD_FLOAT_DFORM_TEST(lfdp);
> +	LOAD_FLOAT_XFORM_TEST(lfdpx);
> +	LOAD_FLOAT_XFORM_TEST(lfiwax);
> +	STORE_FLOAT_DFORM_TEST(stfdp);
> +	STORE_FLOAT_XFORM_TEST(stfdpx);
> +
> +	return rc;
> +}
> +
> +int test_alignment_handler_fp_206(void)
> +{
> +	int rc = 0;
> +
> +	SKIP_IF(!can_open_fb0());
> +	SKIP_IF(!have_hwcap(PPC_FEATURE_ARCH_2_06));
> +
> +	printf("Floating point: 2.06\n");
> +
> +	LOAD_FLOAT_XFORM_TEST(lfiwzx);
> +
> +	return rc;
> +}
> +
>   void usage(char *prog)
>   {
>   	printf("Usage: %s [options]\n", prog);
> @@ -513,9 +558,15 @@ int main(int argc, char *argv[])
>   			   "test_alignment_handler_vsx_300");
>   	rc |= test_harness(test_alignment_handler_integer,
>   			   "test_alignment_handler_integer");
> +	rc |= test_harness(test_alignment_handler_integer_206,
> +			   "test_alignment_handler_integer_206");
>   	rc |= test_harness(test_alignment_handler_vmx,
>   			   "test_alignment_handler_vmx");
>   	rc |= test_harness(test_alignment_handler_fp,
>   			   "test_alignment_handler_fp");
> +	rc |= test_harness(test_alignment_handler_fp_205,
> +			   "test_alignment_handler_fp_205");
> +	rc |= test_harness(test_alignment_handler_fp_206,
> +			   "test_alignment_handler_fp_206");
>   	return rc;
>   }
> 

-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnellan@au1.ibm.com  IBM Australia Limited

^ permalink raw reply

* Re: Infinite looping observed in __offline_pages
From: Rashmica @ 2018-08-01  1:37 UTC (permalink / raw)
  To: John Allen, linux-kernel, linuxppc-dev
  Cc: mhocko, n-horiguchi, kamezawa.hiroyu, mgorman
In-Reply-To: <20180725181115.hmlyd3tmnu3mn3sf@p50.austin.ibm.com>



On 26/07/18 04:11, John Allen wrote:
> Hi All,
>
> Under heavy stress and constant memory hot add/remove, I have observed
> the following loop to occasionally loop infinitely:
>
> mm/memory_hotplug.c:__offline_pages
>
> repeat:
>        /* start memory hot removal */
>        ret = -EINTR;
>        if (signal_pending(current))
>                goto failed_removal;
>
>        cond_resched();
>        lru_add_drain_all();
>        drain_all_pages(zone);
>
>        pfn = scan_movable_pages(start_pfn, end_pfn);
>        if (pfn) { /* We have movable pages */
>                ret = do_migrate_range(pfn, end_pfn);
>                goto repeat;
>        }
>

What is CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE set to for you?

I have also observed this when hot removing and adding memory. However I
only have only seen this when my kernel has
CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=n (when it is set to online
automatically I do not have this issue) so I assumed that I wasn't
onlining the memory properly...

> What appears to be happening in this case is that do_migrate_range
> returns a failure code which is being ignored. The failure is stemming
> from migrate_pages returning "1" which I'm guessing is the result of
> us hitting the following case:
>
> mm/migrate.c: migrate_pages
>
>     default:
>         /*
>          * Permanent failure (-EBUSY, -ENOSYS, etc.):
>          * unlike -EAGAIN case, the failed page is
>          * removed from migration page list and not
>          * retried in the next outer loop.
>          */
>         nr_failed++;
>         break;
>     }
>
> Does a failure in do_migrate_range indicate that the range is
> unmigratable and the loop in __offline_pages should terminate and goto
> failed_removal? Or should we allow a certain number of retrys before we
> give up on migrating the range?
>
> This issue was observed on a ppc64le lpar on a 4.18-rc6 kernel.
>
> -John
>

^ permalink raw reply

* Re: [PATCH] powerpc/64s/radix: Fix missing global invalidations when removing copro
From: Nicholas Piggin @ 2018-08-01  1:43 UTC (permalink / raw)
  To: Frederic Barrat; +Cc: linuxppc-dev, vaibhav, felix, clombard
In-Reply-To: <20180731132452.15994-1-fbarrat@linux.ibm.com>

On Tue, 31 Jul 2018 15:24:52 +0200
Frederic Barrat <fbarrat@linux.ibm.com> wrote:

> With the optimizations for TLB invalidation from commit 0cef77c7798a
> ("powerpc/64s/radix: flush remote CPUs out of single-threaded
> mm_cpumask"), the scope of a TLBI (global vs. local) can now be
> influenced by the value of the 'copros' counter of the memory context.
> 
> When calling mm_context_remove_copro(), the 'copros' counter is
> decremented first before flushing. It may have the unintended side
> effect of sending local TLBIs when we explicitly need global
> invalidations in this case. Thus breaking any nMMU user in a bad and
> unpredictable way.
> 
> Fix it by flushing first, before updating the 'copros' counter, so
> that invalidations will be global.
> 
> Fixes: 0cef77c7798a ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask")
> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>

Thanks for catching this, looks good to me.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

^ permalink raw reply

* Re: [resend] [PATCH 0/3] powerpc/pseries: use H_BLOCK_REMOVE
From: Nicholas Piggin @ 2018-08-01  1:55 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: linuxppc-dev, linux-kernel, aneesh.kumar, mpe, benh, paulus
In-Reply-To: <1532699493-10883-1-git-send-email-ldufour@linux.vnet.ibm.com>

On Fri, 27 Jul 2018 15:51:30 +0200
Laurent Dufour <ldufour@linux.vnet.ibm.com> wrote:

> [Resending so everyone is getting the cover letter]
> 
> On very large system we could see soft lockup fired when a process is exiting
> 
> watchdog: BUG: soft lockup - CPU#851 stuck for 21s! [forkoff:215523]
> Modules linked in: pseries_rng rng_core xfs raid10 vmx_crypto btrfs libcrc32c xor zstd_decompress zstd_compress xxhash lzo_compress raid6_pq crc32c_vpmsum lpfc crc_t10dif crct10dif_generic crct10dif_common dm_multipath scsi_dh_rdac scsi_dh_alua autofs4
> CPU: 851 PID: 215523 Comm: forkoff Not tainted 4.17.0 #1
> NIP:  c0000000000b995c LR: c0000000000b8f64 CTR: 000000000000aa18
> REGS: c00006b0645b7610 TRAP: 0901   Not tainted  (4.17.0)
> MSR:  800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 22042082  XER: 00000000
> CFAR: 00000000006cf8f0 SOFTE: 0 
> GPR00: 0010000000000000 c00006b0645b7890 c000000000f99200 0000000000000000 
> GPR04: 8e000001a5a4de58 400249cf1bfd5480 8e000001a5a4de50 400249cf1bfd5480 
> GPR08: 8e000001a5a4de48 400249cf1bfd5480 8e000001a5a4de40 400249cf1bfd5480 
> GPR12: ffffffffffffffff c00000001e690800 
> NIP [c0000000000b995c] plpar_hcall9+0x44/0x7c
> LR [c0000000000b8f64] pSeries_lpar_flush_hash_range+0x324/0x3d0
> Call Trace:
> [c00006b0645b7890] [8e000001a5a4dd20] 0x8e000001a5a4dd20 (unreliable)
> [c00006b0645b7a00] [c00000000006d5b0] flush_hash_range+0x60/0x110
> [c00006b0645b7a50] [c000000000072a2c] __flush_tlb_pending+0x4c/0xd0
> [c00006b0645b7a80] [c0000000002eaf44] unmap_page_range+0x984/0xbd0
> [c00006b0645b7bc0] [c0000000002eb594] unmap_vmas+0x84/0x100
> [c00006b0645b7c10] [c0000000002f8afc] exit_mmap+0xac/0x1f0
> [c00006b0645b7cd0] [c0000000000f2638] mmput+0x98/0x1b0
> [c00006b0645b7d00] [c0000000000fc9d0] do_exit+0x330/0xc00
> [c00006b0645b7dc0] [c0000000000fd384] do_group_exit+0x64/0x100
> [c00006b0645b7e00] [c0000000000fd44c] sys_exit_group+0x2c/0x30
> [c00006b0645b7e30] [c00000000000b960] system_call+0x58/0x6c
> Instruction dump:
> 60000000 f8810028 7ca42b78 7cc53378 7ce63b78 7d074378 7d284b78 7d495378 
> e9410060 e9610068 e9810070 44000022 <7d806378> e9810028 f88c0000 f8ac0008
> 
> This happens when removing the PTE by calling the hypervisor using the
> H_BULK_REMOVE call. This call is processing up to 4 PTEs but is doing a
> tlbie for each PTE it is processing. This could lead to long time spent in
> the hypervisor (sometimes up to 4s) and soft lockup being raised because
> the scheduler is not called in zap_pte_range().
> 
> Since the Power7's time, the hypervisor is providing a new hcall
> H_BLOCK_REMOVE allowing processing up to 8 PTEs with one call to
> tlbie. By limiting the amount of tlbie generated, this reduces the time
> spent invalidating the PTEs.

Oh that's a nice feature. I must have an ancient PAPR because I don't
have it. It could be a good project for someone to implement it in KVM
too.

> 
> This hcall requires that the pages are "all within the same naturally
> aligned 8 page virtual address block".
> 
> With this patch series applied, I couldn't see any soft lockup raised on
> the victim LPAR I was running the test one.
> 
> This series is covering both normal pages and huge pages.

Really nice, thanks for working on the problem.

Thanks,
Nick

^ permalink raw reply

* Re: [PATCH] powerpc/64s: Make rfi_flush_fallback a little more robust
From: Nicholas Piggin @ 2018-08-01  2:01 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, anton
In-Reply-To: <20180726124244.13993-1-mpe@ellerman.id.au>

On Thu, 26 Jul 2018 22:42:44 +1000
Michael Ellerman <mpe@ellerman.id.au> wrote:

> Because rfi_flush_fallback runs immediately before the return to
> userspace it currently runs with the user r1 (stack pointer). This
> means if we oops in there we will report a bad kernel stack pointer in
> the exception entry path, eg:
> 
>   Bad kernel stack pointer 7ffff7150e40 at c0000000000023b4
>   Oops: Bad kernel stack pointer, sig: 6 [#1]
>   LE SMP NR_CPUS=32 NUMA PowerNV
>   Modules linked in:
>   CPU: 0 PID: 1246 Comm: klogd Not tainted 4.18.0-rc2-gcc-7.3.1-00175-g0443f8a69ba3 #7
>   NIP:  c0000000000023b4 LR: 0000000010053e00 CTR: 0000000000000040
>   REGS: c0000000fffe7d40 TRAP: 4100   Not tainted  (4.18.0-rc2-gcc-7.3.1-00175-g0443f8a69ba3)
>   MSR:  9000000002803031 <SF,HV,VEC,VSX,FP,ME,IR,DR,LE>  CR: 44000442  XER: 20000000
>   CFAR: c00000000000bac8 IRQMASK: c0000000f1e66a80
>   GPR00: 0000000002000000 00007ffff7150e40 00007fff93a99900 0000000000000020
>   ...
>   NIP [c0000000000023b4] rfi_flush_fallback+0x34/0x80
>   LR [0000000010053e00] 0x10053e00
> 
> Although the NIP tells us where we were, and the TRAP number tells us
> what happened, it would still be nicer if we could report the actual
> exception rather than barfing about the stack pointer.
> 
> We an do that fairly simply by loading the kernel stack pointer on
> entry and restoring the user value before returning. That way we see a
> regular oops such as:
> 
>   Unrecoverable exception 4100 at c00000000000239c
>   Oops: Unrecoverable exception, sig: 6 [#1]
>   LE SMP NR_CPUS=32 NUMA PowerNV
>   Modules linked in:
>   CPU: 0 PID: 1251 Comm: klogd Not tainted 4.18.0-rc3-gcc-7.3.1-00097-g4ebfcac65acd-dirty #40
>   NIP:  c00000000000239c LR: 0000000010053e00 CTR: 0000000000000040
>   REGS: c0000000f1e17bb0 TRAP: 4100   Not tainted  (4.18.0-rc3-gcc-7.3.1-00097-g4ebfcac65acd-dirty)
>   MSR:  9000000002803031 <SF,HV,VEC,VSX,FP,ME,IR,DR,LE>  CR: 44000442  XER: 20000000
>   CFAR: c00000000000bac8 IRQMASK: 0
>   ...
>   NIP [c00000000000239c] rfi_flush_fallback+0x3c/0x80
>   LR [0000000010053e00] 0x10053e00
>   Call Trace:
>   [c0000000f1e17e30] [c00000000000b9e4] system_call+0x5c/0x70 (unreliable)
> 
> Note this shouldn't make the kernel stack pointer vulnerable to a
> meltdown attack, because it should be flushed from the cache before we
> return to userspace. The user r1 value will be in the cache, because
> we load it in the return path, but that is harmless.
> 
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Yeah that's a lot nicer, thanks.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

> ---
>  arch/powerpc/kernel/exceptions-64s.S | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index f7cc12aa3dc6..a6fa85916273 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -1518,6 +1518,8 @@ TRAMP_REAL_BEGIN(stf_barrier_fallback)
>  TRAMP_REAL_BEGIN(rfi_flush_fallback)
>  	SET_SCRATCH0(r13);
>  	GET_PACA(r13);
> +	std	r1,PACA_EXRFI+EX_R12(r13)
> +	ld	r1,PACAKSAVE(r13)
>  	std	r9,PACA_EXRFI+EX_R9(r13)
>  	std	r10,PACA_EXRFI+EX_R10(r13)
>  	std	r11,PACA_EXRFI+EX_R11(r13)
> @@ -1552,12 +1554,15 @@ TRAMP_REAL_BEGIN(rfi_flush_fallback)
>  	ld	r9,PACA_EXRFI+EX_R9(r13)
>  	ld	r10,PACA_EXRFI+EX_R10(r13)
>  	ld	r11,PACA_EXRFI+EX_R11(r13)
> +	ld	r1,PACA_EXRFI+EX_R12(r13)
>  	GET_SCRATCH0(r13);
>  	rfid
>  
>  TRAMP_REAL_BEGIN(hrfi_flush_fallback)
>  	SET_SCRATCH0(r13);
>  	GET_PACA(r13);
> +	std	r1,PACA_EXRFI+EX_R12(r13)
> +	ld	r1,PACAKSAVE(r13)
>  	std	r9,PACA_EXRFI+EX_R9(r13)
>  	std	r10,PACA_EXRFI+EX_R10(r13)
>  	std	r11,PACA_EXRFI+EX_R11(r13)
> @@ -1592,6 +1597,7 @@ TRAMP_REAL_BEGIN(hrfi_flush_fallback)
>  	ld	r9,PACA_EXRFI+EX_R9(r13)
>  	ld	r10,PACA_EXRFI+EX_R10(r13)
>  	ld	r11,PACA_EXRFI+EX_R11(r13)
> +	ld	r1,PACA_EXRFI+EX_R12(r13)
>  	GET_SCRATCH0(r13);
>  	hrfid
>  

^ permalink raw reply

* Re: [PATCH] powerpc/64s: Make unrecoverable SLB miss less confusing
From: Nicholas Piggin @ 2018-08-01  2:11 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, anton, paulus
In-Reply-To: <20180726130151.16410-1-mpe@ellerman.id.au>

On Thu, 26 Jul 2018 23:01:51 +1000
Michael Ellerman <mpe@ellerman.id.au> wrote:

> If we take an SLB miss while MSR[RI]=0 we can't recover and have to
> oops. Currently this is reported by faking up a 0x4100 exception, eg:
> 
>   Unrecoverable exception 4100 at 0
>   Oops: Unrecoverable exception, sig: 6 [#1]
>   ...
>   CPU: 0 PID: 1262 Comm: sh Not tainted 4.18.0-rc3-gcc-7.3.1-00098-g7fc2229fb2ab-dirty #9
>   NIP:  0000000000000000 LR: c00000000000b9e4 CTR: 00007fff8bb971b0
>   REGS: c0000000ee02bbb0 TRAP: 4100
>   ...
>   LR [c00000000000b9e4] system_call+0x5c/0x70
> 
> The 0x4100 value was chosen back in 2004 as part of the fix for the
> "mega bug" - "ppc64: Fix SLB reload bug". Back then it was obvious
> that 0x4100 was not a real trap value, as the highest actual trap was
> less than 0x2000.
> 
> Since then however the architecture has changed and now we have
> "virtual mode" or "relon" exceptions, in which exceptions can be
> delivered with the MMU on starting at 0x4000.
> 
> At a glance 0x4100 looks like a virtual mode 0x100 exception, aka
> system reset exception. A close reading of the architecture will show
> that system reset exceptions can't be delivered in virtual mode, and
> so 0x4100 is not a valid trap number. But that's not immediately
> obvious. There's also nothing about 0x4100 that suggests SLB miss.
> 
> So to make things a bit less confusing switch to a fake but unique and
> hopefully more helpful numbering. For data SLB misses we report a
> 0x390 trap and for instruction we report 0x490. Compared to 0x380 and
> 0x480 for the actual data & instruction SLB exceptions.
> 
> Also add a C handler that prints a more explicit message. The end
> result is something like:
> 
>   Oops: Unrecoverable SLB miss (MSR[RI]=0), sig: 6 [#3]

This is all good, but allow me to nitpick. Our unrecoverable
exception messages (and other messages, but those) are becoming a bit
ad-hoc and messy.

It would be nice to go the other way eventually and consolidate them
into one. Would be nice to have a common function that takes regs and
returns the string of the corresponding exception name that makes
these more readable.

>   ...
>   CPU: 0 PID: 1262 Comm: sh Not tainted 4.18.0-rc3-gcc-7.3.1-00098-g7fc2229fb2ab-dirty #9
>   NIP:  0000000000000000 LR: c00000000000b9e4 CTR: 0000000000000000
>   REGS: c0000000f19a3bb0 TRAP: 0490

Unless I'm mistaken, the fake trap number was only because the code
couldn't distinguish between 380 and 480. Now that you do, I think you
can just use them directly rather than 390/490.

Thanks,
Nick

>   ...
>   LR [c00000000000b9e4] system_call+0x5c/0x70
> 
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>  arch/powerpc/include/asm/asm-prototypes.h | 1 +
>  arch/powerpc/kernel/exceptions-64s.S      | 7 +++++--
>  arch/powerpc/kernel/traps.c               | 6 ++++++
>  3 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
> index 7841b8a60657..ffba4a6ee619 100644
> --- a/arch/powerpc/include/asm/asm-prototypes.h
> +++ b/arch/powerpc/include/asm/asm-prototypes.h
> @@ -74,6 +74,7 @@ void facility_unavailable_exception(struct pt_regs *regs);
>  void TAUException(struct pt_regs *regs);
>  void altivec_assist_exception(struct pt_regs *regs);
>  void unrecoverable_exception(struct pt_regs *regs);
> +void unrecoverable_slb_miss(struct pt_regs *regs);
>  void kernel_bad_stack(struct pt_regs *regs);
>  void system_reset_exception(struct pt_regs *regs);
>  void machine_check_exception(struct pt_regs *regs);
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index a6fa85916273..8e1396433eb4 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -743,11 +743,14 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_RADIX)
>  	b	.
>  
>  EXC_COMMON_BEGIN(unrecov_slb)
> -	EXCEPTION_PROLOG_COMMON(0x4100, PACA_EXSLB)
> +	EXCEPTION_PROLOG_COMMON(0x390, PACA_EXSLB)
>  	RECONCILE_IRQ_STATE(r10, r11)
>  	bl	save_nvgprs
> +	beq	cr6, 1f		// cr6.eq is set for a data SLB miss ...
> +	li	r10, 0x490	// else fix trap number for instruction SLB miss
> +	std	r10, _TRAP(r1)
>  1:	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	bl	unrecoverable_exception
> +	bl	unrecoverable_slb_miss
>  	b	1b
>  
>  EXC_COMMON_BEGIN(large_addr_slb)
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 0e17dcb48720..0b1724a0b001 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -2061,6 +2061,12 @@ void unrecoverable_exception(struct pt_regs *regs)
>  }
>  NOKPROBE_SYMBOL(unrecoverable_exception);
>  
> +void unrecoverable_slb_miss(struct pt_regs *regs)
> +{
> +	die("Unrecoverable SLB miss (MSR[RI]=0)", regs, SIGABRT);
> +}
> +NOKPROBE_SYMBOL(unrecoverable_slb_miss);
> +
>  #if defined(CONFIG_BOOKE_WDT) || defined(CONFIG_40x)
>  /*
>   * Default handler for a Watchdog exception,

^ permalink raw reply

* Re: [PATCH 00/16] Finally remove PSERIES from exception naming
From: Nicholas Piggin @ 2018-08-01  2:12 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, anton, paulus
In-Reply-To: <20180726130717.18761-1-mpe@ellerman.id.au>

On Thu, 26 Jul 2018 23:07:01 +1000
Michael Ellerman <mpe@ellerman.id.au> wrote:

> This finally annoyed me enough to do something about it, I foolishly started
> pulling the string and now here I am.
> 
> I think the end result is sufficiently more readable to justify the churn. I
> particularly like that we now have EXCEPTION_PROLOG_0/1/2.
> 
> It will cause some pain for backports, but Nick plans to rewrite the exception
> vectors entirely at some point so this will be trivial in comparison.

Yeah I have no problem with this series. I'll get onto that rewrite
again soon.

Acked-by: Nicholas Piggin <npiggin@gmail.com>

> 
> cheers
> 
> 
> Michael Ellerman (16):
>   powerpc/64s: Move SET_SCRATCH0() into EXCEPTION_PROLOG_PSERIES()
>   powerpc/64s: Move SET_SCRATCH0() into EXCEPTION_RELON_PROLOG_PSERIES()
>   powerpc/64s: Rename STD_EXCEPTION_PSERIES to STD_EXCEPTION
>   powerpc/64s: Rename STD_EXCEPTION_PSERIES_OOL to STD_EXCEPTION_OOL
>   powerpc/64s: Rename STD_RELON_EXCEPTION_PSERIES to STD_RELON_EXCEPTION
>   powerpc/64s: Rename STD_RELON_EXCEPTION_PSERIES_OOL to
>     STD_RELON_EXCEPTION_OOL
>   powerpc/64s: Rename EXCEPTION_PROLOG_PSERIES_1 to EXCEPTION_PROLOG_2
>   powerpc/64s: Remove PSERIES from the NORI macros
>   powerpc/64s: Rename EXCEPTION_RELON_PROLOG_PSERIES_1
>   powerpc/64s: Rename EXCEPTION_RELON_PROLOG_PSERIES
>   powerpc/64s: Rename EXCEPTION_PROLOG_PSERIES to EXCEPTION_PROLOG
>   powerpc/64s: Drop _MASKABLE_EXCEPTION_PSERIES()
>   powerpc/64s: Drop _MASKABLE_RELON_EXCEPTION_PSERIES()
>   powerpc/64s: Remove PSERIES naming from the MASKABLE macros
>   powerpc/64s: Drop unused loc parameter to MASKABLE_EXCEPTION macros
>   powerpc/64s: Don't use __MASKABLE_EXCEPTION unnecessarily
> 
>  arch/powerpc/include/asm/exception-64s.h | 117 ++++++++++++++-----------------
>  arch/powerpc/include/asm/head-64.h       |  16 ++---
>  arch/powerpc/kernel/exceptions-64s.S     |  32 ++++-----
>  3 files changed, 72 insertions(+), 93 deletions(-)
> 

^ permalink raw reply

* Re: [PATCH resend] powerpc/64s: fix page table fragment refcount race vs speculative references
From: Nicholas Piggin @ 2018-08-01  2:45 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Matthew Wilcox, linux-mm, Linus Torvalds, Andrew Morton,
	linuxppc-dev, Aneesh Kumar K . V
In-Reply-To: <87600vhbs1.fsf@concordia.ellerman.id.au>

On Tue, 31 Jul 2018 21:42:22 +1000
Michael Ellerman <mpe@ellerman.id.au> wrote:

> Nicholas Piggin <npiggin@gmail.com> writes:
> > On Fri, 27 Jul 2018 08:38:35 -0700
> > Matthew Wilcox <willy@infradead.org> wrote:  
> >> On Sat, Jul 28, 2018 at 12:29:06AM +1000, Nicholas Piggin wrote:  
> >> > On Fri, 27 Jul 2018 06:41:56 -0700
> >> > Matthew Wilcox <willy@infradead.org> wrote:  
> >> > > On Fri, Jul 27, 2018 at 09:48:17PM +1000, Nicholas Piggin wrote:    
> >> > > > The page table fragment allocator uses the main page refcount racily
> >> > > > with respect to speculative references. A customer observed a BUG due
> >> > > > to page table page refcount underflow in the fragment allocator. This
> >> > > > can be caused by the fragment allocator set_page_count stomping on a
> >> > > > speculative reference, and then the speculative failure handler
> >> > > > decrements the new reference, and the underflow eventually pops when
> >> > > > the page tables are freed.      
> >> > > 
> >> > > Oof.  Can't you fix this instead by using page_ref_add() instead of
> >> > > set_page_count()?    
> >> > 
> >> > It's ugly doing it that way. The problem is we have a page table
> >> > destructor and that would be missed if the spec ref was the last
> >> > put. In practice with RCU page table freeing maybe you can say
> >> > there will be no spec ref there (unless something changes), but
> >> > still it just seems much simpler doing this and avoiding any
> >> > complexity or relying on other synchronization.    
> >> 
> >> I don't want to rely on the speculative reference not happening by the
> >> time the page table is torn down; that's way too black-magic for me.
> >> Another possibility would be to use, say, the top 16 bits of the
> >> atomic for your counter and call the dtor once the atomic is below 64k.
> >> I'm also thinking about overhauling the dtor system so it's not tied to
> >> compound pages; anyone with a bit in page_type would be able to use it.
> >> That way you'd always get your dtor called, even if the speculative
> >> reference was the last one.  
> >
> > Yeah we could look at doing either of those if necessary.
> >  
> >> > > > Any objection to the struct page change to grab the arch specific
> >> > > > page table page word for powerpc to use? If not, then this should
> >> > > > go via powerpc tree because it's inconsequential for core mm.      
> >> > > 
> >> > > I want (eventually) to get to the point where every struct page carries
> >> > > a pointer to the struct mm that it belongs to.  It's good for debugging
> >> > > as well as handling memory errors in page tables.    
> >> > 
> >> > That doesn't seem like it should be a problem, there's some spare
> >> > words there for arch independent users.    
> >> 
> >> Could you take one of the spare words instead then?  My intent was to
> >> just take the 'x86 pgds only' comment off that member.  _pt_pad_2 looks
> >> ideal because it'll be initialised to 0 and you'll return it to 0 by
> >> the time you're done.  
> >
> > It doesn't matter for powerpc where the atomic_t goes, so I'm fine with
> > moving it. But could you juggle the fields with your patch instead? I
> > thought it would be nice to using this field that has been already
> > tested on x86 not to overlap with any other data for
> > bug fix that'll have to be widely backported.  
> 
> Can we come to a conclusion on this one?
> 
> As far as backporting goes pt_mm is new in 4.18-rc so the patch will
> need to be manually backported anyway. But I agree with Nick we'd rather
> use a slot that is known to be free for arch use.

Let's go with that for now. I'd really rather not fix this obscure
bug by introducing something even worse. I'll volunteer to change
the powerpc page table cache code if we can't find any more space in
the struct page.

So what does mapping get used for by page table pages? 4c21e2f2441
("[PATCH] mm: split page table lock") adds that page->mapping = NULL
in pte_lock_deinit, but I don't see why because page->mapping is
never used anywhere else by that patch. Maybe a previous version
of that patch used mapping rather than private?

Thanks,
Nick

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox