LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [RFC PATCH kernel 0/5] powerpc/P9/vfio: Pass through NVIDIA Tesla V100
From: Alexey Kardashevskiy @ 2018-06-08  4:14 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Benjamin Herrenschmidt, linuxppc-dev, David Gibson, kvm-ppc,
	Ram Pai, kvm, Alistair Popple
In-Reply-To: <20180607214455.51ecfa1a@w520.home>

On 8/6/18 1:44 pm, Alex Williamson wrote:
> On Fri, 8 Jun 2018 13:08:54 +1000
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
>> On 8/6/18 8:15 am, Alex Williamson wrote:
>>> On Fri, 08 Jun 2018 07:54:02 +1000
>>> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>>>   
>>>> On Thu, 2018-06-07 at 11:04 -0600, Alex Williamson wrote:  
>>>>>
>>>>> Can we back up and discuss whether the IOMMU grouping of NVLink
>>>>> connected devices makes sense?  AIUI we have a PCI view of these
>>>>> devices and from that perspective they're isolated.  That's the view of
>>>>> the device used to generate the grouping.  However, not visible to us,
>>>>> these devices are interconnected via NVLink.  What isolation properties
>>>>> does NVLink provide given that its entire purpose for existing seems to
>>>>> be to provide a high performance link for p2p between devices?    
>>>>
>>>> Not entire. On POWER chips, we also have an nvlink between the device
>>>> and the CPU which is running significantly faster than PCIe.
>>>>
>>>> But yes, there are cross-links and those should probably be accounted
>>>> for in the grouping.  
>>>
>>> Then after we fix the grouping, can we just let the host driver manage
>>> this coherent memory range and expose vGPUs to guests?  The use case of
>>> assigning all 6 GPUs to one VM seems pretty limited.  (Might need to
>>> convince NVIDIA to support more than a single vGPU per VM though)  
>>
>> These are physical GPUs, not virtual sriov-alike things they are
>> implementing as well elsewhere.
> 
> vGPUs as implemented on M- and P-series Teslas aren't SR-IOV like
> either.  That's why we have mdev devices now to implement software
> defined devices.  I don't have first hand experience with V-series, but
> I would absolutely expect a PCIe-based Tesla V100 to support vGPU.

So assuming V100 can do vGPU, you are suggesting ditching this patchset and
using mediated vGPUs instead, correct?


>> My current understanding is that every P9 chip in that box has some NVLink2
>> logic on it so each P9 is directly connected to 3 GPUs via PCIe and
>> 2xNVLink2, and GPUs in that big group are interconnected by NVLink2 links
>> as well.
>>
>> From small bits of information I have it seems that a GPU can perfectly
>> work alone and if the NVIDIA driver does not see these interconnects
>> (because we do not pass the rest of the big 3xGPU group to this guest), it
>> continues with a single GPU. There is an "nvidia-smi -r" big reset hammer
>> which simply refuses to work until all 3 GPUs are passed so there is some
>> distinction between passing 1 or 3 GPUs, and I am trying (as we speak) to
>> get a confirmation from NVIDIA that it is ok to pass just a single GPU.
>>
>> So we will either have 6 groups (one per GPU) or 2 groups (one per
>> interconnected group).
> 
> I'm not gaining much confidence that we can rely on isolation between
> NVLink connected GPUs, it sounds like you're simply expecting that
> proprietary code from NVIDIA on a proprietary interconnect from NVIDIA
> is going to play nice and nobody will figure out how to do bad things
> because... obfuscation?  Thanks,

Well, we already believe that a proprietary firmware of a sriov-capable
adapter like Mellanox ConnextX is not doing bad things, how is this
different in principle?


ps. their obfuscation is funny indeed :)
-- 
Alexey

^ permalink raw reply

* Re: [RFC PATCH kernel 5/5] vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] [10de:1db1] subdriver
From: Alexey Kardashevskiy @ 2018-06-08  3:52 UTC (permalink / raw)
  To: Alex Williamson
  Cc: linuxppc-dev, David Gibson, kvm-ppc, Benjamin Herrenschmidt,
	Ram Pai, kvm, Alistair Popple
In-Reply-To: <20180607213535.04e68bac@w520.home>

On 8/6/18 1:35 pm, Alex Williamson wrote:
> On Fri, 8 Jun 2018 13:09:13 +1000
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>> On 8/6/18 3:04 am, Alex Williamson wrote:
>>> On Thu,  7 Jun 2018 18:44:20 +1000
>>> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>>> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
>>>> index 7bddf1e..38c9475 100644
>>>> --- a/drivers/vfio/pci/vfio_pci.c
>>>> +++ b/drivers/vfio/pci/vfio_pci.c
>>>> @@ -306,6 +306,15 @@ static int vfio_pci_enable(struct vfio_pci_device *vdev)
>>>>  		}
>>>>  	}
>>>>  
>>>> +	if (pdev->vendor == PCI_VENDOR_ID_NVIDIA &&
>>>> +	    pdev->device == 0x1db1 &&
>>>> +	    IS_ENABLED(CONFIG_VFIO_PCI_NVLINK2)) {  
>>>
>>> Can't we do better than check this based on device ID?  Perhaps PCIe
>>> capability hints at this?  
>>
>> A normal PCI pluggable device looks like this:
>>
>> root@fstn3:~# sudo lspci -vs 0000:03:00.0
>> 0000:03:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
>> 	Subsystem: NVIDIA Corporation GK210GL [Tesla K80]
>> 	Flags: fast devsel, IRQ 497
>> 	Memory at 3fe000000000 (32-bit, non-prefetchable) [disabled] [size=16M]
>> 	Memory at 200000000000 (64-bit, prefetchable) [disabled] [size=16G]
>> 	Memory at 200400000000 (64-bit, prefetchable) [disabled] [size=32M]
>> 	Capabilities: [60] Power Management version 3
>> 	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
>> 	Capabilities: [78] Express Endpoint, MSI 00
>> 	Capabilities: [100] Virtual Channel
>> 	Capabilities: [128] Power Budgeting <?>
>> 	Capabilities: [420] Advanced Error Reporting
>> 	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
>> 	Capabilities: [900] #19
>>
>>
>> This is a NVLink v1 machine:
>>
>> aik@garrison1:~$ sudo lspci -vs 000a:01:00.0
>> 000a:01:00.0 3D controller: NVIDIA Corporation Device 15fe (rev a1)
>> 	Subsystem: NVIDIA Corporation Device 116b
>> 	Flags: bus master, fast devsel, latency 0, IRQ 457
>> 	Memory at 3fe300000000 (32-bit, non-prefetchable) [size=16M]
>> 	Memory at 260000000000 (64-bit, prefetchable) [size=16G]
>> 	Memory at 260400000000 (64-bit, prefetchable) [size=32M]
>> 	Capabilities: [60] Power Management version 3
>> 	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
>> 	Capabilities: [78] Express Endpoint, MSI 00
>> 	Capabilities: [100] Virtual Channel
>> 	Capabilities: [250] Latency Tolerance Reporting
>> 	Capabilities: [258] L1 PM Substates
>> 	Capabilities: [128] Power Budgeting <?>
>> 	Capabilities: [420] Advanced Error Reporting
>> 	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
>> 	Capabilities: [900] #19
>> 	Kernel driver in use: nvidia
>> 	Kernel modules: nvidiafb, nouveau, nvidia_384_drm, nvidia_384
>>
>>
>> This is the one the patch is for:
>>
>> [aik@yc02goos ~]$ sudo lspci -vs 0035:03:00.0
>> 0035:03:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2]
>> (rev a1)
>> 	Subsystem: NVIDIA Corporation Device 1212
>> 	Flags: fast devsel, IRQ 82, NUMA node 8
>> 	Memory at 620c280000000 (32-bit, non-prefetchable) [disabled] [size=16M]
>> 	Memory at 6228000000000 (64-bit, prefetchable) [disabled] [size=16G]
>> 	Memory at 6228400000000 (64-bit, prefetchable) [disabled] [size=32M]
>> 	Capabilities: [60] Power Management version 3
>> 	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
>> 	Capabilities: [78] Express Endpoint, MSI 00
>> 	Capabilities: [100] Virtual Channel
>> 	Capabilities: [250] Latency Tolerance Reporting
>> 	Capabilities: [258] L1 PM Substates
>> 	Capabilities: [128] Power Budgeting <?>
>> 	Capabilities: [420] Advanced Error Reporting
>> 	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
>> 	Capabilities: [900] #19
>> 	Capabilities: [ac0] #23
>> 	Kernel driver in use: vfio-pci
>>
>>
>> I can only see a new capability #23 which I have no idea about what it
>> actually does - my latest PCIe spec is
>> PCI_Express_Base_r3.1a_December7-2015.pdf and that only knows capabilities
>> till #21, do you have any better spec? Does not seem promising anyway...
> 
> You could just look in include/uapi/linux/pci_regs.h and see that 23
> (0x17) is a TPH Requester capability and google for that...  It's a TLP
> processing hint related to cache processing for requests from system
> specific interconnects.  Sounds rather promising.  Of course there's
> also the vendor specific capability that might be probed if NVIDIA will
> tell you what to look for and the init function you've implemented
> looks for specific devicetree nodes, that I imagine you could test for
> in a probe as well.


This 23 is in hex:

[aik@yc02goos ~]$ sudo lspci -vs 0035:03:00.0
0035:03:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2]
(rev a1)
	Subsystem: NVIDIA Corporation Device 1212
	Flags: fast devsel, IRQ 82, NUMA node 8
	Memory at 620c280000000 (32-bit, non-prefetchable) [disabled] [size=16M]
	Memory at 6228000000000 (64-bit, prefetchable) [disabled] [size=16G]
	Memory at 6228400000000 (64-bit, prefetchable) [disabled] [size=32M]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Capabilities: [100] Virtual Channel
	Capabilities: [250] Latency Tolerance Reporting
	Capabilities: [258] L1 PM Substates
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [420] Advanced Error Reporting
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900] #19
	Capabilities: [ac0] #23
	Kernel driver in use: vfio-pci

[aik@yc02goos ~]$ sudo lspci -vvvxxxxs 0035:03:00.0 | grep ac0
	Capabilities: [ac0 v1] #23
ac0: 23 00 01 00 de 10 c1 00 01 00 10 00 00 00 00 00


Talking to NVIDIA is always an option :)


> 
>>> Is it worthwhile to continue with assigning the device in the !ENABLED
>>> case?  For instance, maybe it would be better to provide a weak
>>> definition of vfio_pci_nvlink2_init() that would cause us to fail here
>>> if we don't have this device specific support enabled.  I realize
>>> you're following the example set forth for IGD, but those regions are
>>> optional, for better or worse.  
>>
>>
>> The device is supposed to work even without GPU RAM passed through, this
>> should look like NVLink v1 in this case (there used to be bugs in the
>> driver, may be still are, have not checked for a while but there is a bug
>> opened at NVIDIA about this and they were going to fix that), this is why I
>> chose not to fail here.
> 
> Ok.
> 
>>>> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
>>>> index 24ee260..2725bc8 100644
>>>> --- a/drivers/vfio/pci/Kconfig
>>>> +++ b/drivers/vfio/pci/Kconfig
>>>> @@ -30,3 +30,7 @@ config VFIO_PCI_INTX
>>>>  config VFIO_PCI_IGD
>>>>  	depends on VFIO_PCI
>>>>  	def_bool y if X86
>>>> +
>>>> +config VFIO_PCI_NVLINK2
>>>> +	depends on VFIO_PCI
>>>> +	def_bool y if PPC_POWERNV  
>>>
>>> As written, this also depends on PPC_POWERNV (or at least TCE), it's not
>>> a portable implementation that we could re-use on X86 or ARM or any
>>> other platform if hardware appeared for it.  Can we improve that as
>>> well to make this less POWER specific?  Thanks,  
>>
>>
>> As I said in another mail, every P9 chip in that box has some NVLink2 logic
>> on it so it is not even common among P9's in general and I am having hard
>> time seeing these V100s used elsewhere in such way.
> 
> https://www.redhat.com/archives/vfio-users/2018-May/msg00000.html
> 
> Not much platform info, but based on the rpm mentioned, looks like an
> x86_64 box.  Thanks,

Wow. Interesting. Thanks for the pointer. No advertising material actually
says that it is P9 only or even mention P9, wiki does not say it is P9 only
either. Hmmm...



-- 
Alexey

^ permalink raw reply

* Re: [RFC PATCH kernel 0/5] powerpc/P9/vfio: Pass through NVIDIA Tesla V100
From: Alex Williamson @ 2018-06-08  3:44 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Benjamin Herrenschmidt, linuxppc-dev, David Gibson, kvm-ppc,
	Ram Pai, kvm, Alistair Popple
In-Reply-To: <f0561258-698c-9086-255d-ff6c1aa16cfd@ozlabs.ru>

On Fri, 8 Jun 2018 13:08:54 +1000
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

> On 8/6/18 8:15 am, Alex Williamson wrote:
> > On Fri, 08 Jun 2018 07:54:02 +1000
> > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> >   
> >> On Thu, 2018-06-07 at 11:04 -0600, Alex Williamson wrote:  
> >>>
> >>> Can we back up and discuss whether the IOMMU grouping of NVLink
> >>> connected devices makes sense?  AIUI we have a PCI view of these
> >>> devices and from that perspective they're isolated.  That's the view of
> >>> the device used to generate the grouping.  However, not visible to us,
> >>> these devices are interconnected via NVLink.  What isolation properties
> >>> does NVLink provide given that its entire purpose for existing seems to
> >>> be to provide a high performance link for p2p between devices?    
> >>
> >> Not entire. On POWER chips, we also have an nvlink between the device
> >> and the CPU which is running significantly faster than PCIe.
> >>
> >> But yes, there are cross-links and those should probably be accounted
> >> for in the grouping.  
> > 
> > Then after we fix the grouping, can we just let the host driver manage
> > this coherent memory range and expose vGPUs to guests?  The use case of
> > assigning all 6 GPUs to one VM seems pretty limited.  (Might need to
> > convince NVIDIA to support more than a single vGPU per VM though)  
> 
> These are physical GPUs, not virtual sriov-alike things they are
> implementing as well elsewhere.

vGPUs as implemented on M- and P-series Teslas aren't SR-IOV like
either.  That's why we have mdev devices now to implement software
defined devices.  I don't have first hand experience with V-series, but
I would absolutely expect a PCIe-based Tesla V100 to support vGPU.

> My current understanding is that every P9 chip in that box has some NVLink2
> logic on it so each P9 is directly connected to 3 GPUs via PCIe and
> 2xNVLink2, and GPUs in that big group are interconnected by NVLink2 links
> as well.
> 
> From small bits of information I have it seems that a GPU can perfectly
> work alone and if the NVIDIA driver does not see these interconnects
> (because we do not pass the rest of the big 3xGPU group to this guest), it
> continues with a single GPU. There is an "nvidia-smi -r" big reset hammer
> which simply refuses to work until all 3 GPUs are passed so there is some
> distinction between passing 1 or 3 GPUs, and I am trying (as we speak) to
> get a confirmation from NVIDIA that it is ok to pass just a single GPU.
> 
> So we will either have 6 groups (one per GPU) or 2 groups (one per
> interconnected group).

I'm not gaining much confidence that we can rely on isolation between
NVLink connected GPUs, it sounds like you're simply expecting that
proprietary code from NVIDIA on a proprietary interconnect from NVIDIA
is going to play nice and nobody will figure out how to do bad things
because... obfuscation?  Thanks,

Alex

^ permalink raw reply

* Re: [RFC PATCH kernel 5/5] vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] [10de:1db1] subdriver
From: Alex Williamson @ 2018-06-08  3:35 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: linuxppc-dev, David Gibson, kvm-ppc, Benjamin Herrenschmidt,
	Ram Pai, kvm, Alistair Popple
In-Reply-To: <d4ffacf9-5028-9f76-cbad-4f1a487b6db3@ozlabs.ru>

On Fri, 8 Jun 2018 13:09:13 +1000
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> On 8/6/18 3:04 am, Alex Williamson wrote:
> > On Thu,  7 Jun 2018 18:44:20 +1000
> > Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> >> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> >> index 7bddf1e..38c9475 100644
> >> --- a/drivers/vfio/pci/vfio_pci.c
> >> +++ b/drivers/vfio/pci/vfio_pci.c
> >> @@ -306,6 +306,15 @@ static int vfio_pci_enable(struct vfio_pci_device *vdev)
> >>  		}
> >>  	}
> >>  
> >> +	if (pdev->vendor == PCI_VENDOR_ID_NVIDIA &&
> >> +	    pdev->device == 0x1db1 &&
> >> +	    IS_ENABLED(CONFIG_VFIO_PCI_NVLINK2)) {  
> > 
> > Can't we do better than check this based on device ID?  Perhaps PCIe
> > capability hints at this?  
> 
> A normal PCI pluggable device looks like this:
> 
> root@fstn3:~# sudo lspci -vs 0000:03:00.0
> 0000:03:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
> 	Subsystem: NVIDIA Corporation GK210GL [Tesla K80]
> 	Flags: fast devsel, IRQ 497
> 	Memory at 3fe000000000 (32-bit, non-prefetchable) [disabled] [size=16M]
> 	Memory at 200000000000 (64-bit, prefetchable) [disabled] [size=16G]
> 	Memory at 200400000000 (64-bit, prefetchable) [disabled] [size=32M]
> 	Capabilities: [60] Power Management version 3
> 	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
> 	Capabilities: [78] Express Endpoint, MSI 00
> 	Capabilities: [100] Virtual Channel
> 	Capabilities: [128] Power Budgeting <?>
> 	Capabilities: [420] Advanced Error Reporting
> 	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
> 	Capabilities: [900] #19
> 
> 
> This is a NVLink v1 machine:
> 
> aik@garrison1:~$ sudo lspci -vs 000a:01:00.0
> 000a:01:00.0 3D controller: NVIDIA Corporation Device 15fe (rev a1)
> 	Subsystem: NVIDIA Corporation Device 116b
> 	Flags: bus master, fast devsel, latency 0, IRQ 457
> 	Memory at 3fe300000000 (32-bit, non-prefetchable) [size=16M]
> 	Memory at 260000000000 (64-bit, prefetchable) [size=16G]
> 	Memory at 260400000000 (64-bit, prefetchable) [size=32M]
> 	Capabilities: [60] Power Management version 3
> 	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
> 	Capabilities: [78] Express Endpoint, MSI 00
> 	Capabilities: [100] Virtual Channel
> 	Capabilities: [250] Latency Tolerance Reporting
> 	Capabilities: [258] L1 PM Substates
> 	Capabilities: [128] Power Budgeting <?>
> 	Capabilities: [420] Advanced Error Reporting
> 	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
> 	Capabilities: [900] #19
> 	Kernel driver in use: nvidia
> 	Kernel modules: nvidiafb, nouveau, nvidia_384_drm, nvidia_384
> 
> 
> This is the one the patch is for:
> 
> [aik@yc02goos ~]$ sudo lspci -vs 0035:03:00.0
> 0035:03:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2]
> (rev a1)
> 	Subsystem: NVIDIA Corporation Device 1212
> 	Flags: fast devsel, IRQ 82, NUMA node 8
> 	Memory at 620c280000000 (32-bit, non-prefetchable) [disabled] [size=16M]
> 	Memory at 6228000000000 (64-bit, prefetchable) [disabled] [size=16G]
> 	Memory at 6228400000000 (64-bit, prefetchable) [disabled] [size=32M]
> 	Capabilities: [60] Power Management version 3
> 	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
> 	Capabilities: [78] Express Endpoint, MSI 00
> 	Capabilities: [100] Virtual Channel
> 	Capabilities: [250] Latency Tolerance Reporting
> 	Capabilities: [258] L1 PM Substates
> 	Capabilities: [128] Power Budgeting <?>
> 	Capabilities: [420] Advanced Error Reporting
> 	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
> 	Capabilities: [900] #19
> 	Capabilities: [ac0] #23
> 	Kernel driver in use: vfio-pci
> 
> 
> I can only see a new capability #23 which I have no idea about what it
> actually does - my latest PCIe spec is
> PCI_Express_Base_r3.1a_December7-2015.pdf and that only knows capabilities
> till #21, do you have any better spec? Does not seem promising anyway...

You could just look in include/uapi/linux/pci_regs.h and see that 23
(0x17) is a TPH Requester capability and google for that...  It's a TLP
processing hint related to cache processing for requests from system
specific interconnects.  Sounds rather promising.  Of course there's
also the vendor specific capability that might be probed if NVIDIA will
tell you what to look for and the init function you've implemented
looks for specific devicetree nodes, that I imagine you could test for
in a probe as well.

> > Is it worthwhile to continue with assigning the device in the !ENABLED
> > case?  For instance, maybe it would be better to provide a weak
> > definition of vfio_pci_nvlink2_init() that would cause us to fail here
> > if we don't have this device specific support enabled.  I realize
> > you're following the example set forth for IGD, but those regions are
> > optional, for better or worse.  
> 
> 
> The device is supposed to work even without GPU RAM passed through, this
> should look like NVLink v1 in this case (there used to be bugs in the
> driver, may be still are, have not checked for a while but there is a bug
> opened at NVIDIA about this and they were going to fix that), this is why I
> chose not to fail here.

Ok.

> >> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
> >> index 24ee260..2725bc8 100644
> >> --- a/drivers/vfio/pci/Kconfig
> >> +++ b/drivers/vfio/pci/Kconfig
> >> @@ -30,3 +30,7 @@ config VFIO_PCI_INTX
> >>  config VFIO_PCI_IGD
> >>  	depends on VFIO_PCI
> >>  	def_bool y if X86
> >> +
> >> +config VFIO_PCI_NVLINK2
> >> +	depends on VFIO_PCI
> >> +	def_bool y if PPC_POWERNV  
> > 
> > As written, this also depends on PPC_POWERNV (or at least TCE), it's not
> > a portable implementation that we could re-use on X86 or ARM or any
> > other platform if hardware appeared for it.  Can we improve that as
> > well to make this less POWER specific?  Thanks,  
> 
> 
> As I said in another mail, every P9 chip in that box has some NVLink2 logic
> on it so it is not even common among P9's in general and I am having hard
> time seeing these V100s used elsewhere in such way.

https://www.redhat.com/archives/vfio-users/2018-May/msg00000.html

Not much platform info, but based on the rpm mentioned, looks like an
x86_64 box.  Thanks,

Alex

^ permalink raw reply

* [PATCH] powerpc: Don't let userspace trigger a kernel WARN_ON
From: Benjamin Herrenschmidt @ 2018-06-08  3:21 UTC (permalink / raw)
  To: linuxppc-dev

In commit 2865d08dd9ea876524652f3900b4b3b9c8b22e77
"powerpc/mm: Move the DSISR_PROTFAULT sanity check",
I completely missed the fact that an attempt at reading
kernel memory *will* trip the warning.

So this partially reverts it. We keep the test in a
helper to keep the code clean, but we move it back to
after the VMA has been found.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index c01d627e687a..20384445ca44 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -416,9 +416,6 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 		return SIGBUS;
 	}
 
-	/* Additional sanity check(s) */
-	sanity_check_fault(is_write, error_code);
-
 	/*
 	 * The kernel should never take an execute fault nor should it
 	 * take a page fault to a kernel address.
@@ -511,6 +508,10 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 		return bad_area(regs, address);
 
 good_area:
+	/* Additional sanity check(s) */
+	sanity_check_fault(is_write, error_code);
+
+	/* Check for VMA access permissions */
 	if (unlikely(access_error(is_write, is_exec, vma)))
 		return bad_access(regs, address);
 

^ permalink raw reply related

* Re: [RFC PATCH kernel 5/5] vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] [10de:1db1] subdriver
From: Alexey Kardashevskiy @ 2018-06-08  3:09 UTC (permalink / raw)
  To: Alex Williamson
  Cc: linuxppc-dev, David Gibson, kvm-ppc, Benjamin Herrenschmidt,
	Ram Pai, kvm, Alistair Popple
In-Reply-To: <20180607110449.2c25a62d@w520.home>

On 8/6/18 3:04 am, Alex Williamson wrote:
> On Thu,  7 Jun 2018 18:44:20 +1000
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
>> Some POWER9 chips come with special NVLink2 links which provide
>> cacheable memory access to the RAM physically located on NVIDIA GPU.
>> This memory is presented to a host via the device tree but remains
>> offline until the NVIDIA driver onlines it.
>>
>> This exports this RAM to the userspace as a new region so
>> the NVIDIA driver in the guest can train these links and online GPU RAM.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>  drivers/vfio/pci/Makefile           |   1 +
>>  drivers/vfio/pci/vfio_pci_private.h |   8 ++
>>  include/uapi/linux/vfio.h           |   3 +
>>  drivers/vfio/pci/vfio_pci.c         |   9 ++
>>  drivers/vfio/pci/vfio_pci_nvlink2.c | 190 ++++++++++++++++++++++++++++++++++++
>>  drivers/vfio/pci/Kconfig            |   4 +
>>  6 files changed, 215 insertions(+)
>>  create mode 100644 drivers/vfio/pci/vfio_pci_nvlink2.c
>>
>> diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
>> index 76d8ec0..9662c06 100644
>> --- a/drivers/vfio/pci/Makefile
>> +++ b/drivers/vfio/pci/Makefile
>> @@ -1,5 +1,6 @@
>>  
>>  vfio-pci-y := vfio_pci.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
>>  vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
>> +vfio-pci-$(CONFIG_VFIO_PCI_NVLINK2) += vfio_pci_nvlink2.o
>>  
>>  obj-$(CONFIG_VFIO_PCI) += vfio-pci.o
>> diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
>> index 86aab05..7115b9b 100644
>> --- a/drivers/vfio/pci/vfio_pci_private.h
>> +++ b/drivers/vfio/pci/vfio_pci_private.h
>> @@ -160,4 +160,12 @@ static inline int vfio_pci_igd_init(struct vfio_pci_device *vdev)
>>  	return -ENODEV;
>>  }
>>  #endif
>> +#ifdef CONFIG_VFIO_PCI_NVLINK2
>> +extern int vfio_pci_nvlink2_init(struct vfio_pci_device *vdev);
>> +#else
>> +static inline int vfio_pci_nvlink2_init(struct vfio_pci_device *vdev)
>> +{
>> +	return -ENODEV;
>> +}
>> +#endif
>>  #endif /* VFIO_PCI_PRIVATE_H */
>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
>> index 1aa7b82..2fe8227 100644
>> --- a/include/uapi/linux/vfio.h
>> +++ b/include/uapi/linux/vfio.h
>> @@ -301,6 +301,9 @@ struct vfio_region_info_cap_type {
>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG	(2)
>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG	(3)
>>  
>> +/* NVIDIA GPU NV2 */
>> +#define VFIO_REGION_SUBTYPE_NVIDIA_NVLINK2	(4)
> 
> You're continuing the Intel vendor ID sub-types for an NVIDIA vendor ID
> subtype.  Each vendor has their own address space of sub-types.


True, I'll update. I just like unique numbers better :)

> 
>> +
>>  /*
>>   * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped
>>   * which allows direct access to non-MSIX registers which happened to be within
>> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
>> index 7bddf1e..38c9475 100644
>> --- a/drivers/vfio/pci/vfio_pci.c
>> +++ b/drivers/vfio/pci/vfio_pci.c
>> @@ -306,6 +306,15 @@ static int vfio_pci_enable(struct vfio_pci_device *vdev)
>>  		}
>>  	}
>>  
>> +	if (pdev->vendor == PCI_VENDOR_ID_NVIDIA &&
>> +	    pdev->device == 0x1db1 &&
>> +	    IS_ENABLED(CONFIG_VFIO_PCI_NVLINK2)) {
> 
> Can't we do better than check this based on device ID?  Perhaps PCIe
> capability hints at this?

A normal PCI pluggable device looks like this:

root@fstn3:~# sudo lspci -vs 0000:03:00.0
0000:03:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
	Subsystem: NVIDIA Corporation GK210GL [Tesla K80]
	Flags: fast devsel, IRQ 497
	Memory at 3fe000000000 (32-bit, non-prefetchable) [disabled] [size=16M]
	Memory at 200000000000 (64-bit, prefetchable) [disabled] [size=16G]
	Memory at 200400000000 (64-bit, prefetchable) [disabled] [size=32M]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Capabilities: [100] Virtual Channel
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [420] Advanced Error Reporting
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900] #19


This is a NVLink v1 machine:

aik@garrison1:~$ sudo lspci -vs 000a:01:00.0
000a:01:00.0 3D controller: NVIDIA Corporation Device 15fe (rev a1)
	Subsystem: NVIDIA Corporation Device 116b
	Flags: bus master, fast devsel, latency 0, IRQ 457
	Memory at 3fe300000000 (32-bit, non-prefetchable) [size=16M]
	Memory at 260000000000 (64-bit, prefetchable) [size=16G]
	Memory at 260400000000 (64-bit, prefetchable) [size=32M]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Capabilities: [100] Virtual Channel
	Capabilities: [250] Latency Tolerance Reporting
	Capabilities: [258] L1 PM Substates
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [420] Advanced Error Reporting
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900] #19
	Kernel driver in use: nvidia
	Kernel modules: nvidiafb, nouveau, nvidia_384_drm, nvidia_384


This is the one the patch is for:

[aik@yc02goos ~]$ sudo lspci -vs 0035:03:00.0
0035:03:00.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2]
(rev a1)
	Subsystem: NVIDIA Corporation Device 1212
	Flags: fast devsel, IRQ 82, NUMA node 8
	Memory at 620c280000000 (32-bit, non-prefetchable) [disabled] [size=16M]
	Memory at 6228000000000 (64-bit, prefetchable) [disabled] [size=16G]
	Memory at 6228400000000 (64-bit, prefetchable) [disabled] [size=32M]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Capabilities: [100] Virtual Channel
	Capabilities: [250] Latency Tolerance Reporting
	Capabilities: [258] L1 PM Substates
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [420] Advanced Error Reporting
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900] #19
	Capabilities: [ac0] #23
	Kernel driver in use: vfio-pci


I can only see a new capability #23 which I have no idea about what it
actually does - my latest PCIe spec is
PCI_Express_Base_r3.1a_December7-2015.pdf and that only knows capabilities
till #21, do you have any better spec? Does not seem promising anyway...


> Is it worthwhile to continue with assigning the device in the !ENABLED
> case?  For instance, maybe it would be better to provide a weak
> definition of vfio_pci_nvlink2_init() that would cause us to fail here
> if we don't have this device specific support enabled.  I realize
> you're following the example set forth for IGD, but those regions are
> optional, for better or worse.


The device is supposed to work even without GPU RAM passed through, this
should look like NVLink v1 in this case (there used to be bugs in the
driver, may be still are, have not checked for a while but there is a bug
opened at NVIDIA about this and they were going to fix that), this is why I
chose not to fail here.



>> +		ret = vfio_pci_nvlink2_init(vdev);
>> +		if (ret)
>> +			dev_warn(&vdev->pdev->dev,
>> +				 "Failed to setup NVIDIA NV2 RAM region\n");
>> +	}
>> +
>>  	vfio_pci_probe_mmaps(vdev);
>>  
>>  	return 0;
>> diff --git a/drivers/vfio/pci/vfio_pci_nvlink2.c b/drivers/vfio/pci/vfio_pci_nvlink2.c
>> new file mode 100644
>> index 0000000..451c5cb
>> --- /dev/null
>> +++ b/drivers/vfio/pci/vfio_pci_nvlink2.c
>> @@ -0,0 +1,190 @@
>> +// SPDX-License-Identifier: GPL-2.0+
>> +/*
>> + * VFIO PCI NVIDIA Whitherspoon GPU support a.k.a. NVLink2.
>> + *
>> + * Copyright (C) 2018 IBM Corp.  All rights reserved.
>> + *     Author: Alexey Kardashevskiy <aik@ozlabs.ru>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * Register an on-GPU RAM region for cacheable access.
>> + *
>> + * Derived from original vfio_pci_igd.c:
>> + * Copyright (C) 2016 Red Hat, Inc.  All rights reserved.
>> + *	Author: Alex Williamson <alex.williamson@redhat.com>
>> + */
>> +
>> +#include <linux/io.h>
>> +#include <linux/pci.h>
>> +#include <linux/uaccess.h>
>> +#include <linux/vfio.h>
>> +#include <linux/sched/mm.h>
>> +#include <linux/mmu_context.h>
>> +
>> +#include "vfio_pci_private.h"
>> +
>> +struct vfio_pci_nvlink2_data {
>> +	unsigned long gpu_hpa;
>> +	unsigned long useraddr;
>> +	unsigned long size;
>> +	struct mm_struct *mm;
>> +	struct mm_iommu_table_group_mem_t *mem;
>> +};
>> +
>> +static size_t vfio_pci_nvlink2_rw(struct vfio_pci_device *vdev,
>> +		char __user *buf, size_t count, loff_t *ppos, bool iswrite)
>> +{
>> +	unsigned int i = VFIO_PCI_OFFSET_TO_INDEX(*ppos) - VFIO_PCI_NUM_REGIONS;
>> +	void *base = vdev->region[i].data;
>> +	loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
>> +
>> +	if (pos >= vdev->region[i].size)
>> +		return -EINVAL;
>> +
>> +	count = min(count, (size_t)(vdev->region[i].size - pos));
>> +
>> +	if (iswrite) {
>> +		if (copy_from_user(base + pos, buf, count))
>> +			return -EFAULT;
>> +	} else {
>> +		if (copy_to_user(buf, base + pos, count))
>> +			return -EFAULT;
>> +	}
>> +	*ppos += count;
>> +
>> +	return count;
>> +}
>> +
>> +static void vfio_pci_nvlink2_release(struct vfio_pci_device *vdev,
>> +		struct vfio_pci_region *region)
>> +{
>> +	struct vfio_pci_nvlink2_data *data = region->data;
>> +	long ret;
>> +
>> +	ret = mm_iommu_put(data->mm, data->mem);
>> +	WARN_ON(ret);
>> +
>> +	mmdrop(data->mm);
>> +	kfree(data);
>> +}
>> +
>> +static int vfio_pci_nvlink2_mmap_fault(struct vm_fault *vmf)
>> +{
>> +	struct vm_area_struct *vma = vmf->vma;
>> +	struct vfio_pci_region *region = vma->vm_private_data;
>> +	struct vfio_pci_nvlink2_data *data = region->data;
>> +	int ret;
>> +	unsigned long vmf_off = (vmf->address - vma->vm_start) >> PAGE_SHIFT;
>> +	unsigned long nv2pg = data->gpu_hpa >> PAGE_SHIFT;
>> +	unsigned long vm_pgoff = vma->vm_pgoff &
>> +		((1U << (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT)) - 1);
>> +	unsigned long pfn = nv2pg + vm_pgoff + vmf_off;
>> +
>> +	ret = vm_insert_pfn(vma, vmf->address, pfn);
>> +	/* TODO: make it a tracepoint */
>> +	pr_debug("NVLink2: vmf=%lx hpa=%lx ret=%d\n",
>> +		 vmf->address, pfn << PAGE_SHIFT, ret);
>> +	if (ret)
>> +		return VM_FAULT_SIGSEGV;
>> +
>> +	return VM_FAULT_NOPAGE;
>> +}
>> +
>> +static const struct vm_operations_struct vfio_pci_nvlink2_mmap_vmops = {
>> +	.fault = vfio_pci_nvlink2_mmap_fault,
>> +};
>> +
>> +static int vfio_pci_nvlink2_mmap(struct vfio_pci_device *vdev,
>> +		struct vfio_pci_region *region, struct vm_area_struct *vma)
>> +{
>> +	long ret;
>> +	struct vfio_pci_nvlink2_data *data = region->data;
>> +
>> +	if (data->useraddr)
>> +		return -EPERM;
>> +
>> +	if (vma->vm_end - vma->vm_start > data->size)
>> +		return -EINVAL;
>> +
>> +	vma->vm_private_data = region;
>> +	vma->vm_flags |= VM_PFNMAP;
>> +	vma->vm_ops = &vfio_pci_nvlink2_mmap_vmops;
>> +
>> +	/*
>> +	 * Calling mm_iommu_newdev() here once as the region is not
>> +	 * registered yet and therefore right initialization will happen now.
>> +	 * Other places will use mm_iommu_find() which returns
>> +	 * registered @mem and does not go gup().
>> +	 */
>> +	data->useraddr = vma->vm_start;
>> +	data->mm = current->mm;
>> +	atomic_inc(&data->mm->mm_count);
>> +	ret = mm_iommu_newdev(data->mm, data->useraddr,
>> +			(vma->vm_end - vma->vm_start) >> PAGE_SHIFT,
>> +			data->gpu_hpa, &data->mem);
>> +
>> +	pr_debug("VFIO NVLINK2 mmap: useraddr=%lx hpa=%lx size=%lx ret=%ld\n",
>> +			data->useraddr, data->gpu_hpa,
>> +			vma->vm_end - vma->vm_start, ret);
>> +
>> +	return ret;
>> +}
>> +
>> +static const struct vfio_pci_regops vfio_pci_nvlink2_regops = {
>> +	.rw = vfio_pci_nvlink2_rw,
>> +	.release = vfio_pci_nvlink2_release,
>> +	.mmap = vfio_pci_nvlink2_mmap,
>> +};
>> +
>> +int vfio_pci_nvlink2_init(struct vfio_pci_device *vdev)
>> +{
>> +	int len = 0, ret;
>> +	struct device_node *npu_node, *mem_node;
>> +	struct pci_dev *npu_dev;
>> +	uint32_t *mem_phandle, *val;
>> +	struct vfio_pci_nvlink2_data *data;
>> +
>> +	npu_dev = pnv_pci_get_npu_dev(vdev->pdev, 0);
>> +	if (!npu_dev)
>> +		return -EINVAL;
>> +
>> +	npu_node = pci_device_to_OF_node(npu_dev);
>> +	if (!npu_node)
>> +		return -EINVAL;
>> +
>> +	mem_phandle = (void *) of_get_property(npu_node, "memory-region", NULL);
>> +	if (!mem_phandle)
>> +		return -EINVAL;
>> +
>> +	mem_node = of_find_node_by_phandle(be32_to_cpu(*mem_phandle));
>> +	if (!mem_node)
>> +		return -EINVAL;
>> +
>> +	val = (uint32_t *) of_get_property(mem_node, "reg", &len);
>> +	if (!val || len != 2 * sizeof(uint64_t))
>> +		return -EINVAL;
>> +
>> +	data = kzalloc(sizeof(*data), GFP_KERNEL);
>> +	if (!data)
>> +		return -ENOMEM;
>> +
>> +	data->gpu_hpa = ((uint64_t)be32_to_cpu(val[0]) << 32) |
>> +			be32_to_cpu(val[1]);
>> +	data->size = ((uint64_t)be32_to_cpu(val[2]) << 32) |
>> +			be32_to_cpu(val[3]);
>> +
>> +	dev_dbg(&vdev->pdev->dev, "%lx..%lx\n", data->gpu_hpa,
>> +			data->gpu_hpa + data->size - 1);
>> +
>> +	ret = vfio_pci_register_dev_region(vdev,
>> +			PCI_VENDOR_ID_NVIDIA | VFIO_REGION_TYPE_PCI_VENDOR_TYPE,
>> +			VFIO_REGION_SUBTYPE_NVIDIA_NVLINK2,
>> +			&vfio_pci_nvlink2_regops, data->size,
>> +			VFIO_REGION_INFO_FLAG_READ, data);
>> +	if (ret)
>> +		kfree(data);
>> +
>> +	return ret;
>> +}
>> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
>> index 24ee260..2725bc8 100644
>> --- a/drivers/vfio/pci/Kconfig
>> +++ b/drivers/vfio/pci/Kconfig
>> @@ -30,3 +30,7 @@ config VFIO_PCI_INTX
>>  config VFIO_PCI_IGD
>>  	depends on VFIO_PCI
>>  	def_bool y if X86
>> +
>> +config VFIO_PCI_NVLINK2
>> +	depends on VFIO_PCI
>> +	def_bool y if PPC_POWERNV
> 
> As written, this also depends on PPC_POWERNV (or at least TCE), it's not
> a portable implementation that we could re-use on X86 or ARM or any
> other platform if hardware appeared for it.  Can we improve that as
> well to make this less POWER specific?  Thanks,


As I said in another mail, every P9 chip in that box has some NVLink2 logic
on it so it is not even common among P9's in general and I am having hard
time seeing these V100s used elsewhere in such way.



-- 
Alexey

^ permalink raw reply

* Re: [RFC PATCH kernel 0/5] powerpc/P9/vfio: Pass through NVIDIA Tesla V100
From: Alexey Kardashevskiy @ 2018-06-08  3:08 UTC (permalink / raw)
  To: Alex Williamson, Benjamin Herrenschmidt
  Cc: linuxppc-dev, David Gibson, kvm-ppc, Ram Pai, kvm,
	Alistair Popple
In-Reply-To: <20180607161541.21df6434@w520.home>

On 8/6/18 8:15 am, Alex Williamson wrote:
> On Fri, 08 Jun 2018 07:54:02 +1000
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> 
>> On Thu, 2018-06-07 at 11:04 -0600, Alex Williamson wrote:
>>>
>>> Can we back up and discuss whether the IOMMU grouping of NVLink
>>> connected devices makes sense?  AIUI we have a PCI view of these
>>> devices and from that perspective they're isolated.  That's the view of
>>> the device used to generate the grouping.  However, not visible to us,
>>> these devices are interconnected via NVLink.  What isolation properties
>>> does NVLink provide given that its entire purpose for existing seems to
>>> be to provide a high performance link for p2p between devices?  
>>
>> Not entire. On POWER chips, we also have an nvlink between the device
>> and the CPU which is running significantly faster than PCIe.
>>
>> But yes, there are cross-links and those should probably be accounted
>> for in the grouping.
> 
> Then after we fix the grouping, can we just let the host driver manage
> this coherent memory range and expose vGPUs to guests?  The use case of
> assigning all 6 GPUs to one VM seems pretty limited.  (Might need to
> convince NVIDIA to support more than a single vGPU per VM though)

These are physical GPUs, not virtual sriov-alike things they are
implementing as well elsewhere.

My current understanding is that every P9 chip in that box has some NVLink2
logic on it so each P9 is directly connected to 3 GPUs via PCIe and
2xNVLink2, and GPUs in that big group are interconnected by NVLink2 links
as well.

>From small bits of information I have it seems that a GPU can perfectly
work alone and if the NVIDIA driver does not see these interconnects
(because we do not pass the rest of the big 3xGPU group to this guest), it
continues with a single GPU. There is an "nvidia-smi -r" big reset hammer
which simply refuses to work until all 3 GPUs are passed so there is some
distinction between passing 1 or 3 GPUs, and I am trying (as we speak) to
get a confirmation from NVIDIA that it is ok to pass just a single GPU.

So we will either have 6 groups (one per GPU) or 2 groups (one per
interconnected group).


-- 
Alexey

^ permalink raw reply

* Re: pkeys on POWER: Access rights not reset on execve
From: Ram Pai @ 2018-06-08  2:34 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Linux-MM, linuxppc-dev, Andy Lutomirski, Dave Hansen
In-Reply-To: <30040030-1aa2-623b-beec-dd1ceb3eb9a7@redhat.com>

> 
> So the remaining question at this point is whether the Intel
> behavior (default-deny instead of default-allow) is preferable.

Florian, remind me what behavior needs to fixed? 

-- 
Ram Pai

^ permalink raw reply

* [PATCH v2 09/12] macintosh/via-pmu: Replace via-pmu68k driver with via-pmu driver
From: Finn Thain @ 2018-06-08  2:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Schmitz, linuxppc-dev, linux-m68k, linux-kernel,
	Geert Uytterhoeven
In-Reply-To: <cover.1528423341.git.fthain@telegraphics.com.au>

Now that the PowerMac via-pmu driver supports m68k PowerBooks,
switch over to that driver and remove the via-pmu68k driver.

Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 arch/m68k/configs/mac_defconfig   |   2 +-
 arch/m68k/configs/multi_defconfig |   2 +-
 arch/m68k/mac/config.c            |   2 +-
 arch/m68k/mac/misc.c              |  48 +--
 drivers/macintosh/Kconfig         |  13 +-
 drivers/macintosh/Makefile        |   1 -
 drivers/macintosh/adb.c           |   2 +-
 drivers/macintosh/via-pmu68k.c    | 846 --------------------------------------
 include/uapi/linux/pmu.h          |   1 -
 9 files changed, 13 insertions(+), 904 deletions(-)
 delete mode 100644 drivers/macintosh/via-pmu68k.c

diff --git a/arch/m68k/configs/mac_defconfig b/arch/m68k/configs/mac_defconfig
index 390d4a87441c..ee63f1242e9a 100644
--- a/arch/m68k/configs/mac_defconfig
+++ b/arch/m68k/configs/mac_defconfig
@@ -370,7 +370,7 @@ CONFIG_TCM_PSCSI=m
 CONFIG_ADB=y
 CONFIG_ADB_MACII=y
 CONFIG_ADB_IOP=y
-CONFIG_ADB_PMU68K=y
+CONFIG_ADB_PMU=y
 CONFIG_ADB_CUDA=y
 CONFIG_INPUT_ADBHID=y
 CONFIG_MAC_EMUMOUSEBTN=y
diff --git a/arch/m68k/configs/multi_defconfig b/arch/m68k/configs/multi_defconfig
index 77be97d82dc3..6421a3da616c 100644
--- a/arch/m68k/configs/multi_defconfig
+++ b/arch/m68k/configs/multi_defconfig
@@ -403,7 +403,7 @@ CONFIG_TCM_PSCSI=m
 CONFIG_ADB=y
 CONFIG_ADB_MACII=y
 CONFIG_ADB_IOP=y
-CONFIG_ADB_PMU68K=y
+CONFIG_ADB_PMU=y
 CONFIG_ADB_CUDA=y
 CONFIG_INPUT_ADBHID=y
 CONFIG_MAC_EMUMOUSEBTN=y
diff --git a/arch/m68k/mac/config.c b/arch/m68k/mac/config.c
index e522307db47c..92e80cf0d8aa 100644
--- a/arch/m68k/mac/config.c
+++ b/arch/m68k/mac/config.c
@@ -891,7 +891,7 @@ static void __init mac_identify(void)
 #ifdef CONFIG_ADB_CUDA
 	find_via_cuda();
 #endif
-#ifdef CONFIG_ADB_PMU68K
+#ifdef CONFIG_ADB_PMU
 	find_via_pmu();
 #endif
 }
diff --git a/arch/m68k/mac/misc.c b/arch/m68k/mac/misc.c
index 7ccb799eeb57..28090a44fa09 100644
--- a/arch/m68k/mac/misc.c
+++ b/arch/m68k/mac/misc.c
@@ -85,7 +85,7 @@ static void cuda_write_pram(int offset, __u8 data)
 }
 #endif /* CONFIG_ADB_CUDA */
 
-#ifdef CONFIG_ADB_PMU68K
+#ifdef CONFIG_ADB_PMU
 static long pmu_read_time(void)
 {
 	struct adb_request req;
@@ -136,7 +136,7 @@ static void pmu_write_pram(int offset, __u8 data)
 	while (!req.complete)
 		pmu_poll();
 }
-#endif /* CONFIG_ADB_PMU68K */
+#endif /* CONFIG_ADB_PMU */
 
 /*
  * VIA PRAM/RTC access routines
@@ -367,38 +367,6 @@ static void cuda_shutdown(void)
 }
 #endif /* CONFIG_ADB_CUDA */
 
-#ifdef CONFIG_ADB_PMU68K
-
-void pmu_restart(void)
-{
-	struct adb_request req;
-	if (pmu_request(&req, NULL,
-			2, PMU_SET_INTR_MASK, PMU_INT_ADB|PMU_INT_TICK) < 0)
-		return;
-	while (!req.complete)
-		pmu_poll();
-	if (pmu_request(&req, NULL, 1, PMU_RESET) < 0)
-		return;
-	while (!req.complete)
-		pmu_poll();
-}
-
-void pmu_shutdown(void)
-{
-	struct adb_request req;
-	if (pmu_request(&req, NULL,
-			2, PMU_SET_INTR_MASK, PMU_INT_ADB|PMU_INT_TICK) < 0)
-		return;
-	while (!req.complete)
-		pmu_poll();
-	if (pmu_request(&req, NULL, 5, PMU_SHUTDOWN, 'M', 'A', 'T', 'T') < 0)
-		return;
-	while (!req.complete)
-		pmu_poll();
-}
-
-#endif
-
 /*
  *-------------------------------------------------------------------
  * Below this point are the generic routines; they'll dispatch to the
@@ -423,7 +391,7 @@ void mac_pram_read(int offset, __u8 *buffer, int len)
 		func = cuda_read_pram;
 		break;
 #endif
-#ifdef CONFIG_ADB_PMU68K
+#ifdef CONFIG_ADB_PMU
 	case MAC_ADB_PB2:
 		func = pmu_read_pram;
 		break;
@@ -453,7 +421,7 @@ void mac_pram_write(int offset, __u8 *buffer, int len)
 		func = cuda_write_pram;
 		break;
 #endif
-#ifdef CONFIG_ADB_PMU68K
+#ifdef CONFIG_ADB_PMU
 	case MAC_ADB_PB2:
 		func = pmu_write_pram;
 		break;
@@ -477,7 +445,7 @@ void mac_poweroff(void)
 	           macintosh_config->adb_type == MAC_ADB_CUDA) {
 		cuda_shutdown();
 #endif
-#ifdef CONFIG_ADB_PMU68K
+#ifdef CONFIG_ADB_PMU
 	} else if (macintosh_config->adb_type == MAC_ADB_PB2) {
 		pmu_shutdown();
 #endif
@@ -518,7 +486,7 @@ void mac_reset(void)
 	           macintosh_config->adb_type == MAC_ADB_CUDA) {
 		cuda_restart();
 #endif
-#ifdef CONFIG_ADB_PMU68K
+#ifdef CONFIG_ADB_PMU
 	} else if (macintosh_config->adb_type == MAC_ADB_PB2) {
 		pmu_restart();
 #endif
@@ -670,7 +638,7 @@ int mac_hwclk(int op, struct rtc_time *t)
 			now = cuda_read_time();
 			break;
 #endif
-#ifdef CONFIG_ADB_PMU68K
+#ifdef CONFIG_ADB_PMU
 		case MAC_ADB_PB2:
 			now = pmu_read_time();
 			break;
@@ -706,7 +674,7 @@ int mac_hwclk(int op, struct rtc_time *t)
 			cuda_write_time(now);
 			break;
 #endif
-#ifdef CONFIG_ADB_PMU68K
+#ifdef CONFIG_ADB_PMU
 		case MAC_ADB_PB2:
 			pmu_write_time(now);
 			break;
diff --git a/drivers/macintosh/Kconfig b/drivers/macintosh/Kconfig
index 26abae4c899d..47c350cdfb12 100644
--- a/drivers/macintosh/Kconfig
+++ b/drivers/macintosh/Kconfig
@@ -39,17 +39,6 @@ config ADB_IOP
 	  <http://www.angelfire.com/ca2/dev68k/iopdesc.html> to enable direct
 	  support for it, say 'Y' here.
 
-config ADB_PMU68K
-	bool "Include PMU (Powerbook) ADB driver"
-	depends on ADB && MAC
-	help
-	  Say Y here if want your kernel to support the m68k based Powerbooks.
-	  This includes the PowerBook 140, PowerBook 145, PowerBook 150,
-	  PowerBook 160, PowerBook 165, PowerBook 165c, PowerBook 170,
-	  PowerBook 180, PowerBook, 180c, PowerBook 190cs, PowerBook 520,
-	  PowerBook Duo 210, PowerBook Duo 230, PowerBook Duo 250,
-	  PowerBook Duo 270c, PowerBook Duo 280 and PowerBook Duo 280c.
-
 # we want to change this to something like CONFIG_SYSCTRL_CUDA/PMU
 config ADB_CUDA
 	bool "Support for Cuda/Egret based Macs and PowerMacs"
@@ -66,7 +55,7 @@ config ADB_CUDA
 
 config ADB_PMU
 	bool "Support for PMU based PowerMacs and PowerBooks"
-	depends on PPC_PMAC
+	depends on PPC_PMAC || MAC
 	help
 	  On PowerBooks, iBooks, and recent iMacs and Power Macintoshes, the
 	  PMU is an embedded microprocessor whose primary function is to
diff --git a/drivers/macintosh/Makefile b/drivers/macintosh/Makefile
index ee803638e595..49819b1b6f20 100644
--- a/drivers/macintosh/Makefile
+++ b/drivers/macintosh/Makefile
@@ -22,7 +22,6 @@ obj-$(CONFIG_PMAC_SMU)		+= smu.o
 obj-$(CONFIG_ADB)		+= adb.o
 obj-$(CONFIG_ADB_MACII)		+= via-macii.o
 obj-$(CONFIG_ADB_IOP)		+= adb-iop.o
-obj-$(CONFIG_ADB_PMU68K)	+= via-pmu68k.o
 obj-$(CONFIG_ADB_MACIO)		+= macio-adb.o
 
 obj-$(CONFIG_THERM_WINDTUNNEL)	+= therm_windtunnel.o
diff --git a/drivers/macintosh/adb.c b/drivers/macintosh/adb.c
index 4c8097e0e6fe..76e98f0f7a3e 100644
--- a/drivers/macintosh/adb.c
+++ b/drivers/macintosh/adb.c
@@ -65,7 +65,7 @@ static struct adb_driver *adb_driver_list[] = {
 #ifdef CONFIG_ADB_IOP
 	&adb_iop_driver,
 #endif
-#if defined(CONFIG_ADB_PMU) || defined(CONFIG_ADB_PMU68K)
+#ifdef CONFIG_ADB_PMU
 	&via_pmu_driver,
 #endif
 #ifdef CONFIG_ADB_MACIO
diff --git a/drivers/macintosh/via-pmu68k.c b/drivers/macintosh/via-pmu68k.c
deleted file mode 100644
index bec8e1837d7d..000000000000
--- a/drivers/macintosh/via-pmu68k.c
+++ /dev/null
@@ -1,846 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Device driver for the PMU on 68K-based Apple PowerBooks
- *
- * The VIA (versatile interface adapter) interfaces to the PMU,
- * a 6805 microprocessor core whose primary function is to control
- * battery charging and system power on the PowerBooks.
- * The PMU also controls the ADB (Apple Desktop Bus) which connects
- * to the keyboard and mouse, as well as the non-volatile RAM
- * and the RTC (real time clock) chip.
- *
- * Adapted for 68K PMU by Joshua M. Thompson
- *
- * Based largely on the PowerMac PMU code by Paul Mackerras and
- * Fabio Riccardi.
- *
- * Also based on the PMU driver from MkLinux by Apple Computer, Inc.
- * and the Open Software Foundation, Inc.
- */
-
-#include <stdarg.h>
-#include <linux/types.h>
-#include <linux/errno.h>
-#include <linux/kernel.h>
-#include <linux/delay.h>
-#include <linux/miscdevice.h>
-#include <linux/blkdev.h>
-#include <linux/pci.h>
-#include <linux/init.h>
-#include <linux/interrupt.h>
-
-#include <linux/adb.h>
-#include <linux/pmu.h>
-#include <linux/cuda.h>
-
-#include <asm/macintosh.h>
-#include <asm/macints.h>
-#include <asm/mac_via.h>
-
-#include <asm/pgtable.h>
-#include <asm/irq.h>
-#include <linux/uaccess.h>
-
-/* Misc minor number allocated for /dev/pmu */
-#define PMU_MINOR	154
-
-/* VIA registers - spaced 0x200 bytes apart */
-#define RS		0x200		/* skip between registers */
-#define B		0		/* B-side data */
-#define A		RS		/* A-side data */
-#define DIRB		(2*RS)		/* B-side direction (1=output) */
-#define DIRA		(3*RS)		/* A-side direction (1=output) */
-#define T1CL		(4*RS)		/* Timer 1 ctr/latch (low 8 bits) */
-#define T1CH		(5*RS)		/* Timer 1 counter (high 8 bits) */
-#define T1LL		(6*RS)		/* Timer 1 latch (low 8 bits) */
-#define T1LH		(7*RS)		/* Timer 1 latch (high 8 bits) */
-#define T2CL		(8*RS)		/* Timer 2 ctr/latch (low 8 bits) */
-#define T2CH		(9*RS)		/* Timer 2 counter (high 8 bits) */
-#define SR		(10*RS)		/* Shift register */
-#define ACR		(11*RS)		/* Auxiliary control register */
-#define PCR		(12*RS)		/* Peripheral control register */
-#define IFR		(13*RS)		/* Interrupt flag register */
-#define IER		(14*RS)		/* Interrupt enable register */
-#define ANH		(15*RS)		/* A-side data, no handshake */
-
-/* Bits in B data register: both active low */
-#define TACK		0x02		/* Transfer acknowledge (input) */
-#define TREQ		0x04		/* Transfer request (output) */
-
-/* Bits in ACR */
-#define SR_CTRL		0x1c		/* Shift register control bits */
-#define SR_EXT		0x0c		/* Shift on external clock */
-#define SR_OUT		0x10		/* Shift out if 1 */
-
-/* Bits in IFR and IER */
-#define SR_INT		0x04		/* Shift register full/empty */
-#define CB1_INT		0x10		/* transition on CB1 input */
-
-static enum pmu_state {
-	idle,
-	sending,
-	intack,
-	reading,
-	reading_intr,
-} pmu_state;
-
-static struct adb_request *current_req;
-static struct adb_request *last_req;
-static struct adb_request *req_awaiting_reply;
-static unsigned char interrupt_data[32];
-static unsigned char *reply_ptr;
-static int data_index;
-static int data_len;
-static int adb_int_pending;
-static int pmu_adb_flags;
-static int adb_dev_map;
-static struct adb_request bright_req_1, bright_req_2, bright_req_3;
-static int pmu_kind = PMU_UNKNOWN;
-static int pmu_fully_inited;
-
-int asleep;
-
-static int pmu_probe(void);
-static int pmu_init(void);
-static void pmu_start(void);
-static irqreturn_t pmu_interrupt(int irq, void *arg);
-static int pmu_send_request(struct adb_request *req, int sync);
-static int pmu_autopoll(int devs);
-void pmu_poll(void);
-static int pmu_reset_bus(void);
-
-static int init_pmu(void);
-static void pmu_start(void);
-static void send_byte(int x);
-static void recv_byte(void);
-static void pmu_done(struct adb_request *req);
-static void pmu_handle_data(unsigned char *data, int len);
-static void set_volume(int level);
-static void pmu_enable_backlight(int on);
-static void pmu_set_brightness(int level);
-
-struct adb_driver via_pmu_driver = {
-	.name         = "68K PMU",
-	.probe        = pmu_probe,
-	.init         = pmu_init,
-	.send_request = pmu_send_request,
-	.autopoll     = pmu_autopoll,
-	.poll         = pmu_poll,
-	.reset_bus    = pmu_reset_bus,
-};
-
-/*
- * This table indicates for each PMU opcode:
- * - the number of data bytes to be sent with the command, or -1
- *   if a length byte should be sent,
- * - the number of response bytes which the PMU will return, or
- *   -1 if it will send a length byte.
- */
-static s8 pmu_data_len[256][2] = {
-/*	   0	   1	   2	   3	   4	   5	   6	   7  */
-/*00*/	{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},
-/*08*/	{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},
-/*10*/	{ 1, 0},{ 1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},
-/*18*/	{ 0, 1},{ 0, 1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{ 0, 0},
-/*20*/	{-1, 0},{ 0, 0},{ 2, 0},{ 1, 0},{ 1, 0},{-1, 0},{-1, 0},{-1, 0},
-/*28*/	{ 0,-1},{ 0,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{ 0,-1},
-/*30*/	{ 4, 0},{20, 0},{-1, 0},{ 3, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},
-/*38*/	{ 0, 4},{ 0,20},{ 2,-1},{ 2, 1},{ 3,-1},{-1,-1},{-1,-1},{ 4, 0},
-/*40*/	{ 1, 0},{ 1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},
-/*48*/	{ 0, 1},{ 0, 1},{-1,-1},{ 1, 0},{ 1, 0},{-1,-1},{-1,-1},{-1,-1},
-/*50*/	{ 1, 0},{ 0, 0},{ 2, 0},{ 2, 0},{-1, 0},{ 1, 0},{ 3, 0},{ 1, 0},
-/*58*/	{ 0, 1},{ 1, 0},{ 0, 2},{ 0, 2},{ 0,-1},{-1,-1},{-1,-1},{-1,-1},
-/*60*/	{ 2, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},
-/*68*/	{ 0, 3},{ 0, 3},{ 0, 2},{ 0, 8},{ 0,-1},{ 0,-1},{-1,-1},{-1,-1},
-/*70*/	{ 1, 0},{ 1, 0},{ 1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},
-/*78*/	{ 0,-1},{ 0,-1},{-1,-1},{-1,-1},{-1,-1},{ 5, 1},{ 4, 1},{ 4, 1},
-/*80*/	{ 4, 0},{-1, 0},{ 0, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},
-/*88*/	{ 0, 5},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},
-/*90*/	{ 1, 0},{ 2, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},
-/*98*/	{ 0, 1},{ 0, 1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},
-/*a0*/	{ 2, 0},{ 2, 0},{ 2, 0},{ 4, 0},{-1, 0},{ 0, 0},{-1, 0},{-1, 0},
-/*a8*/	{ 1, 1},{ 1, 0},{ 3, 0},{ 2, 0},{-1,-1},{-1,-1},{-1,-1},{-1,-1},
-/*b0*/	{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},
-/*b8*/	{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},
-/*c0*/	{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},
-/*c8*/	{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},
-/*d0*/	{ 0, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},
-/*d8*/	{ 1, 1},{ 1, 1},{-1,-1},{-1,-1},{ 0, 1},{ 0,-1},{-1,-1},{-1,-1},
-/*e0*/	{-1, 0},{ 4, 0},{ 0, 1},{-1, 0},{-1, 0},{ 4, 0},{-1, 0},{-1, 0},
-/*e8*/	{ 3,-1},{-1,-1},{ 0, 1},{-1,-1},{ 0,-1},{-1,-1},{-1,-1},{ 0, 0},
-/*f0*/	{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},{-1, 0},
-/*f8*/	{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},{-1,-1},
-};
-
-int __init find_via_pmu(void)
-{
-	switch (macintosh_config->adb_type) {
-	case MAC_ADB_PB2:
-		pmu_kind = PMU_68K_V2;
-		break;
-	default:
-		pmu_kind = PMU_UNKNOWN;
-		return -ENODEV;
-	}
-
-	pmu_state = idle;
-
-	if (!init_pmu())
-		goto fail_init;
-
-	pr_info("adb: PMU 68K driver v0.5 for Unified ADB\n");
-
-	return 1;
-
-fail_init:
-	pmu_kind = PMU_UNKNOWN;
-	return 0;
-}
-
-static int pmu_probe(void)
-{
-	if (pmu_kind == PMU_UNKNOWN)
-		return -ENODEV;
-	return 0;
-}
-
-static int pmu_init(void)
-{
-	if (pmu_kind == PMU_UNKNOWN)
-		return -ENODEV;
-	return 0;
-}
-
-static int __init via_pmu_start(void)
-{
-	if (pmu_kind == PMU_UNKNOWN)
-		return -ENODEV;
-
-	if (request_irq(IRQ_MAC_ADB_SR, pmu_interrupt, 0, "PMU_SR",
-			pmu_interrupt)) {
-		pr_err("%s: can't get SR irq\n", __func__);
-		return -ENODEV;
-	}
-	if (request_irq(IRQ_MAC_ADB_CL, pmu_interrupt, 0, "PMU_CL",
-			pmu_interrupt)) {
-		pr_err("%s: can't get CL irq\n", __func__);
-		free_irq(IRQ_MAC_ADB_SR, pmu_interrupt);
-		return -ENODEV;
-	}
-
-	pmu_fully_inited = 1;
-
-	/* Enable backlight */
-	pmu_enable_backlight(1);
-
-	return 0;
-}
-
-arch_initcall(via_pmu_start);
-
-static int __init init_pmu(void)
-{
-	int timeout;
-	volatile struct adb_request req;
-
-	via2[B] |= TREQ;				/* negate TREQ */
-	via2[DIRB] = (via2[DIRB] | TREQ) & ~TACK;	/* TACK in, TREQ out */
-
-	pmu_request((struct adb_request *) &req, NULL, 2, PMU_SET_INTR_MASK, PMU_INT_ADB);
-	timeout =  100000;
-	while (!req.complete) {
-		if (--timeout < 0) {
-			printk(KERN_ERR "pmu_init: no response from PMU\n");
-			return -EAGAIN;
-		}
-		udelay(10);
-		pmu_poll();
-	}
-
-	/* ack all pending interrupts */
-	timeout = 100000;
-	interrupt_data[0] = 1;
-	while (interrupt_data[0] || pmu_state != idle) {
-		if (--timeout < 0) {
-			printk(KERN_ERR "pmu_init: timed out acking intrs\n");
-			return -EAGAIN;
-		}
-		if (pmu_state == idle) {
-			adb_int_pending = 1;
-			pmu_interrupt(0, NULL);
-		}
-		pmu_poll();
-		udelay(10);
-	}
-
-	pmu_request((struct adb_request *) &req, NULL, 2, PMU_SET_INTR_MASK,
-			PMU_INT_ADB_AUTO|PMU_INT_SNDBRT|PMU_INT_ADB);
-	timeout =  100000;
-	while (!req.complete) {
-		if (--timeout < 0) {
-			printk(KERN_ERR "pmu_init: no response from PMU\n");
-			return -EAGAIN;
-		}
-		udelay(10);
-		pmu_poll();
-	}
-
-	bright_req_1.complete = 1;
-	bright_req_2.complete = 1;
-	bright_req_3.complete = 1;
-
-	return 1;
-}
-
-int
-pmu_get_model(void)
-{
-	return pmu_kind;
-}
-
-/* Send an ADB command */
-static int 
-pmu_send_request(struct adb_request *req, int sync)
-{
-    int i, ret;
-
-    if (!pmu_fully_inited)
-    {
- 	req->complete = 1;
-   	return -ENXIO;
-   }
-
-    ret = -EINVAL;
-	
-    switch (req->data[0]) {
-    case PMU_PACKET:
-		for (i = 0; i < req->nbytes - 1; ++i)
-			req->data[i] = req->data[i+1];
-		--req->nbytes;
-		if (pmu_data_len[req->data[0]][1] != 0) {
-			req->reply[0] = ADB_RET_OK;
-			req->reply_len = 1;
-		} else
-			req->reply_len = 0;
-		ret = pmu_queue_request(req);
-		break;
-    case CUDA_PACKET:
-		switch (req->data[1]) {
-		case CUDA_GET_TIME:
-			if (req->nbytes != 2)
-				break;
-			req->data[0] = PMU_READ_RTC;
-			req->nbytes = 1;
-			req->reply_len = 3;
-			req->reply[0] = CUDA_PACKET;
-			req->reply[1] = 0;
-			req->reply[2] = CUDA_GET_TIME;
-			ret = pmu_queue_request(req);
-			break;
-		case CUDA_SET_TIME:
-			if (req->nbytes != 6)
-				break;
-			req->data[0] = PMU_SET_RTC;
-			req->nbytes = 5;
-			for (i = 1; i <= 4; ++i)
-				req->data[i] = req->data[i+1];
-			req->reply_len = 3;
-			req->reply[0] = CUDA_PACKET;
-			req->reply[1] = 0;
-			req->reply[2] = CUDA_SET_TIME;
-			ret = pmu_queue_request(req);
-			break;
-		case CUDA_GET_PRAM:
-			if (req->nbytes != 4)
-				break;
-			req->data[0] = PMU_READ_NVRAM;
-			req->data[1] = req->data[2];
-			req->data[2] = req->data[3];
-			req->nbytes = 3;
-			req->reply_len = 3;
-			req->reply[0] = CUDA_PACKET;
-			req->reply[1] = 0;
-			req->reply[2] = CUDA_GET_PRAM;
-			ret = pmu_queue_request(req);
-			break;
-		case CUDA_SET_PRAM:
-			if (req->nbytes != 5)
-				break;
-			req->data[0] = PMU_WRITE_NVRAM;
-			req->data[1] = req->data[2];
-			req->data[2] = req->data[3];
-			req->data[3] = req->data[4];
-			req->nbytes = 4;
-			req->reply_len = 3;
-			req->reply[0] = CUDA_PACKET;
-			req->reply[1] = 0;
-			req->reply[2] = CUDA_SET_PRAM;
-			ret = pmu_queue_request(req);
-			break;
-		}
-		break;
-    case ADB_PACKET:
-		for (i = req->nbytes - 1; i > 1; --i)
-			req->data[i+2] = req->data[i];
-		req->data[3] = req->nbytes - 2;
-		req->data[2] = pmu_adb_flags;
-		/*req->data[1] = req->data[1];*/
-		req->data[0] = PMU_ADB_CMD;
-		req->nbytes += 2;
-		req->reply_expected = 1;
-		req->reply_len = 0;
-		ret = pmu_queue_request(req);
-		break;
-    }
-    if (ret)
-    {
-    	req->complete = 1;
-    	return ret;
-    }
-    	
-    if (sync) {
-	while (!req->complete)
-		pmu_poll();
-    }
-
-    return 0;
-}
-
-/* Enable/disable autopolling */
-static int 
-pmu_autopoll(int devs)
-{
-	struct adb_request req;
-
-	if (!pmu_fully_inited) return -ENXIO;
-
-	if (devs) {
-		adb_dev_map = devs;
-		pmu_request(&req, NULL, 5, PMU_ADB_CMD, 0, 0x86,
-			    adb_dev_map >> 8, adb_dev_map);
-		pmu_adb_flags = 2;
-	} else {
-		pmu_request(&req, NULL, 1, PMU_ADB_POLL_OFF);
-		pmu_adb_flags = 0;
-	}
-	while (!req.complete)
-		pmu_poll();
-	return 0;
-}
-
-/* Reset the ADB bus */
-static int 
-pmu_reset_bus(void)
-{
-	struct adb_request req;
-	long timeout;
-	int save_autopoll = adb_dev_map;
-
-	if (!pmu_fully_inited) return -ENXIO;
-
-	/* anyone got a better idea?? */
-	pmu_autopoll(0);
-
-	req.nbytes = 5;
-	req.done = NULL;
-	req.data[0] = PMU_ADB_CMD;
-	req.data[1] = 0;
-	req.data[2] = 3; /* ADB_BUSRESET ??? */
-	req.data[3] = 0;
-	req.data[4] = 0;
-	req.reply_len = 0;
-	req.reply_expected = 1;
-	if (pmu_queue_request(&req) != 0)
-	{
-		printk(KERN_ERR "pmu_adb_reset_bus: pmu_queue_request failed\n");
-		return -EIO;
-	}
-	while (!req.complete)
-		pmu_poll();
-	timeout = 100000;
-	while (!req.complete) {
-		if (--timeout < 0) {
-			printk(KERN_ERR "pmu_adb_reset_bus (reset): no response from PMU\n");
-			return -EIO;
-		}
-		udelay(10);
-		pmu_poll();
-	}
-
-	if (save_autopoll != 0)
-		pmu_autopoll(save_autopoll);
-		
-	return 0;
-}
-
-/* Construct and send a pmu request */
-int 
-pmu_request(struct adb_request *req, void (*done)(struct adb_request *),
-	    int nbytes, ...)
-{
-	va_list list;
-	int i;
-
-	if (nbytes < 0 || nbytes > 32) {
-		printk(KERN_ERR "pmu_request: bad nbytes (%d)\n", nbytes);
-		req->complete = 1;
-		return -EINVAL;
-	}
-	req->nbytes = nbytes;
-	req->done = done;
-	va_start(list, nbytes);
-	for (i = 0; i < nbytes; ++i)
-		req->data[i] = va_arg(list, int);
-	va_end(list);
-	if (pmu_data_len[req->data[0]][1] != 0) {
-		req->reply[0] = ADB_RET_OK;
-		req->reply_len = 1;
-	} else
-		req->reply_len = 0;
-	req->reply_expected = 0;
-	return pmu_queue_request(req);
-}
-
-int
-pmu_queue_request(struct adb_request *req)
-{
-	unsigned long flags;
-	int nsend;
-
-	if (req->nbytes <= 0) {
-		req->complete = 1;
-		return 0;
-	}
-	nsend = pmu_data_len[req->data[0]][0];
-	if (nsend >= 0 && req->nbytes != nsend + 1) {
-		req->complete = 1;
-		return -EINVAL;
-	}
-
-	req->next = NULL;
-	req->sent = 0;
-	req->complete = 0;
-	local_irq_save(flags);
-
-	if (current_req != 0) {
-		last_req->next = req;
-		last_req = req;
-	} else {
-		current_req = req;
-		last_req = req;
-		if (pmu_state == idle)
-			pmu_start();
-	}
-
-	local_irq_restore(flags);
-	return 0;
-}
-
-static void 
-send_byte(int x)
-{
-	via1[ACR] |= SR_CTRL;
-	via1[SR] = x;
-	via2[B] &= ~TREQ;		/* assert TREQ */
-}
-
-static void 
-recv_byte(void)
-{
-	char c;
-
-	via1[ACR] = (via1[ACR] | SR_EXT) & ~SR_OUT;
-	c = via1[SR];		/* resets SR */
-	via2[B] &= ~TREQ;
-}
-
-static void 
-pmu_start(void)
-{
-	unsigned long flags;
-	struct adb_request *req;
-
-	/* assert pmu_state == idle */
-	/* get the packet to send */
-	local_irq_save(flags);
-	req = current_req;
-	if (req == 0 || pmu_state != idle
-	    || (req->reply_expected && req_awaiting_reply))
-		goto out;
-
-	pmu_state = sending;
-	data_index = 1;
-	data_len = pmu_data_len[req->data[0]][0];
-
-	/* set the shift register to shift out and send a byte */
-	send_byte(req->data[0]);
-
-out:
-	local_irq_restore(flags);
-}
-
-void 
-pmu_poll(void)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	if (via1[IFR] & SR_INT) {
-		via1[IFR] = SR_INT;
-		pmu_interrupt(IRQ_MAC_ADB_SR, NULL);
-	}
-	if (via1[IFR] & CB1_INT) {
-		via1[IFR] = CB1_INT;
-		pmu_interrupt(IRQ_MAC_ADB_CL, NULL);
-	}
-	local_irq_restore(flags);
-}
-
-static irqreturn_t
-pmu_interrupt(int irq, void *dev_id)
-{
-	struct adb_request *req;
-	int timeout, bite = 0;	/* to prevent compiler warning */
-
-#if 0
-	printk("pmu_interrupt: irq %d state %d acr %02X, b %02X data_index %d/%d adb_int_pending %d\n",
-		irq, pmu_state, (uint) via1[ACR], (uint) via2[B], data_index, data_len, adb_int_pending);
-#endif
-
-	if (irq == IRQ_MAC_ADB_CL) {		/* CB1 interrupt */
-		adb_int_pending = 1;
-	} else if (irq == IRQ_MAC_ADB_SR) {	/* SR interrupt  */
-		if (via2[B] & TACK) {
-			printk(KERN_DEBUG "PMU: SR_INT but ack still high! (%x)\n", via2[B]);
-		}
-
-		/* if reading grab the byte */
-		if ((via1[ACR] & SR_OUT) == 0) bite = via1[SR];
-
-		/* reset TREQ and wait for TACK to go high */
-		via2[B] |= TREQ;
-		timeout = 3200;
-		while (!(via2[B] & TACK)) {
-			if (--timeout < 0) {
-				printk(KERN_ERR "PMU not responding (!ack)\n");
-				goto finish;
-			}
-			udelay(10);
-		}
-
-		switch (pmu_state) {
-		case sending:
-			req = current_req;
-			if (data_len < 0) {
-				data_len = req->nbytes - 1;
-				send_byte(data_len);
-				break;
-			}
-			if (data_index <= data_len) {
-				send_byte(req->data[data_index++]);
-				break;
-			}
-			req->sent = 1;
-			data_len = pmu_data_len[req->data[0]][1];
-			if (data_len == 0) {
-				pmu_state = idle;
-				current_req = req->next;
-				if (req->reply_expected)
-					req_awaiting_reply = req;
-				else
-					pmu_done(req);
-			} else {
-				pmu_state = reading;
-				data_index = 0;
-				reply_ptr = req->reply + req->reply_len;
-				recv_byte();
-			}
-			break;
-
-		case intack:
-			data_index = 0;
-			data_len = -1;
-			pmu_state = reading_intr;
-			reply_ptr = interrupt_data;
-			recv_byte();
-			break;
-
-		case reading:
-		case reading_intr:
-			if (data_len == -1) {
-				data_len = bite;
-				if (bite > 32)
-					printk(KERN_ERR "PMU: bad reply len %d\n",
-					       bite);
-			} else {
-				reply_ptr[data_index++] = bite;
-			}
-			if (data_index < data_len) {
-				recv_byte();
-				break;
-			}
-
-			if (pmu_state == reading_intr) {
-				pmu_handle_data(interrupt_data, data_index);
-			} else {
-				req = current_req;
-				current_req = req->next;
-				req->reply_len += data_index;
-				pmu_done(req);
-			}
-			pmu_state = idle;
-
-			break;
-
-		default:
-			printk(KERN_ERR "pmu_interrupt: unknown state %d?\n",
-			       pmu_state);
-		}
-	}
-finish:
-	if (pmu_state == idle) {
-		if (adb_int_pending) {
-			pmu_state = intack;
-			send_byte(PMU_INT_ACK);
-			adb_int_pending = 0;
-		} else if (current_req) {
-			pmu_start();
-		}
-	}
-
-#if 0
-	printk("pmu_interrupt: exit state %d acr %02X, b %02X data_index %d/%d adb_int_pending %d\n",
-		pmu_state, (uint) via1[ACR], (uint) via2[B], data_index, data_len, adb_int_pending);
-#endif
-	return IRQ_HANDLED;
-}
-
-static void 
-pmu_done(struct adb_request *req)
-{
-	req->complete = 1;
-	if (req->done)
-		(*req->done)(req);
-}
-
-/* Interrupt data could be the result data from an ADB cmd */
-static void 
-pmu_handle_data(unsigned char *data, int len)
-{
-	static int show_pmu_ints = 1;
-
-	asleep = 0;
-	if (len < 1) {
-		adb_int_pending = 0;
-		return;
-	}
-	if (data[0] & PMU_INT_ADB) {
-		if ((data[0] & PMU_INT_ADB_AUTO) == 0) {
-			struct adb_request *req = req_awaiting_reply;
-			if (req == 0) {
-				printk(KERN_ERR "PMU: extra ADB reply\n");
-				return;
-			}
-			req_awaiting_reply = NULL;
-			if (len <= 2)
-				req->reply_len = 0;
-			else {
-				memcpy(req->reply, data + 1, len - 1);
-				req->reply_len = len - 1;
-			}
-			pmu_done(req);
-		} else {
-			adb_input(data+1, len-1, 1);
-		}
-	} else {
-		if (data[0] == 0x08 && len == 3) {
-			/* sound/brightness buttons pressed */
-			pmu_set_brightness(data[1] >> 3);
-			set_volume(data[2]);
-		} else if (show_pmu_ints
-			   && !(data[0] == PMU_INT_TICK && len == 1)) {
-			int i;
-			printk(KERN_DEBUG "pmu intr");
-			for (i = 0; i < len; ++i)
-				printk(" %.2x", data[i]);
-			printk("\n");
-		}
-	}
-}
-
-static int backlight_level = -1;
-static int backlight_enabled = 0;
-
-#define LEVEL_TO_BRIGHT(lev)	((lev) < 1? 0x7f: 0x4a - ((lev) << 1))
-
-static void 
-pmu_enable_backlight(int on)
-{
-	struct adb_request req;
-
-	if (on) {
-	    /* first call: get current backlight value */
-	    if (backlight_level < 0) {
-		switch(pmu_kind) {
-		    case PMU_68K_V2:
-			pmu_request(&req, NULL, 3, PMU_READ_NVRAM, 0x14, 0xe);
-			while (!req.complete)
-				pmu_poll();
-			printk(KERN_DEBUG "pmu: nvram returned bright: %d\n", (int)req.reply[1]);
-			backlight_level = req.reply[1];
-			break;
-		    default:
-		        backlight_enabled = 0;
-		        return;
-		}
-	    }
-	    pmu_request(&req, NULL, 2, PMU_BACKLIGHT_BRIGHT,
-	    	LEVEL_TO_BRIGHT(backlight_level));
-	    while (!req.complete)
-		pmu_poll();
-	}
-	pmu_request(&req, NULL, 2, PMU_POWER_CTRL,
-	    PMU_POW_BACKLIGHT | (on ? PMU_POW_ON : PMU_POW_OFF));
-	while (!req.complete)
-		pmu_poll();
-	backlight_enabled = on;
-}
-
-static void 
-pmu_set_brightness(int level)
-{
-	int bright;
-
-	backlight_level = level;
-	bright = LEVEL_TO_BRIGHT(level);
-	if (!backlight_enabled)
-		return;
-	if (bright_req_1.complete)
-		pmu_request(&bright_req_1, NULL, 2, PMU_BACKLIGHT_BRIGHT,
-		    bright);
-	if (bright_req_2.complete)
-		pmu_request(&bright_req_2, NULL, 2, PMU_POWER_CTRL,
-		    PMU_POW_BACKLIGHT | (bright < 0x7f ? PMU_POW_ON : PMU_POW_OFF));
-}
-
-void 
-pmu_enable_irled(int on)
-{
-	struct adb_request req;
-
-	pmu_request(&req, NULL, 2, PMU_POWER_CTRL, PMU_POW_IRLED |
-	    (on ? PMU_POW_ON : PMU_POW_OFF));
-	while (!req.complete)
-		pmu_poll();
-}
-
-static void 
-set_volume(int level)
-{
-}
-
-int
-pmu_present(void)
-{
-	return (pmu_kind != PMU_UNKNOWN);
-}
diff --git a/include/uapi/linux/pmu.h b/include/uapi/linux/pmu.h
index 30f64d46f5db..a2d3c289ed81 100644
--- a/include/uapi/linux/pmu.h
+++ b/include/uapi/linux/pmu.h
@@ -93,7 +93,6 @@ enum {
 	PMU_HEATHROW_BASED,	/* PowerBook G3 series */
 	PMU_PADDINGTON_BASED,	/* 1999 PowerBook G3 */
 	PMU_KEYLARGO_BASED,	/* Core99 motherboard (PMU99) */
-	PMU_68K_V2, 		/* 68K PMU, version 2 */
 };
 
 /* PMU PMU_POWER_EVENTS commands */
-- 
2.16.4

^ permalink raw reply related

* [PATCH v2 12/12] macintosh/via-pmu: Disambiguate interrupt statistics
From: Finn Thain @ 2018-06-08  2:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Schmitz, linuxppc-dev, linux-m68k, linux-kernel
In-Reply-To: <cover.1528423341.git.fthain@telegraphics.com.au>

Some of the event counters are overloaded which makes it very
difficult to interpret their values.

Counter 0 is supposed to report CB1 interrupts but it can also count
PMU_INT_WAITING_CHARGER events.

Counter 1 is supposed to report GPIO interrupts but it can also count
other events (depending upon the value of the PMU_INT_ADB bit).

Disambiguate these statistics with dedicated counters for GPIO and
CB1 interrupts.

Comments in the MkLinux source code say that the type 0 and type 1
interrupts are model-specific. Label them as "unknown".

This change to the contents of /proc/pmu/interrupts is by necessity
visible in userland. However, packages which interact with the PMU
(that is, pbbuttonsd, pmac-utils and pmud) don't open this file.
AFAIK, user software has no need to poll these counters.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
The file now looks like this,

  0:          0 (Unknown interrupt (type 0))
  1:          0 (Unknown interrupt (type 1))
  2:          0 (PC-Card eject button)
  3:         23 (Sound/Brightness button)
  4:         74 (ADB message)
  5:          0 (Battery state change)
  6:          0 (Environment interrupt)
  7:        121 (Tick timer)
  8:          0 (Ghost interrupt (zero len))
  9:          1 (Empty interrupt (empty mask))
 10:          2 (Max irqs in a row)
 11:        194 (Total CB1 triggered events)
 12:          0 (Total GPIO1 triggered events)

rather than this,

  0:        194 (Total CB1 triggered events)
  1:          0 (Total GPIO1 triggered events)
  2:          0 (PC-Card eject button)
  3:         23 (Sound/Brightness button)
  4:         74 (ADB message)
  5:          0 (Battery state change)
  6:          0 (Environment interrupt)
  7:        121 (Tick timer)
  8:          0 (Ghost interrupt (zero len))
  9:          1 (Empty interrupt (empty mask))
 10:          2 (Max irqs in a row)

If some parser exists for this file, and if this change is problematic,
we could increment the driver version number in /proc/pmu/info, to
correspond with the format change.
---
 drivers/macintosh/via-pmu.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
index 730c10f7fbb7..44919b3b56e0 100644
--- a/drivers/macintosh/via-pmu.c
+++ b/drivers/macintosh/via-pmu.c
@@ -172,7 +172,9 @@ static int drop_interrupts;
 static int option_lid_wakeup = 1;
 #endif /* CONFIG_SUSPEND && CONFIG_PPC32 */
 static unsigned long async_req_locks;
-static unsigned int pmu_irq_stats[11];
+
+#define NUM_IRQ_STATS 13
+static unsigned int pmu_irq_stats[NUM_IRQ_STATS];
 
 static struct proc_dir_entry *proc_pmu_root;
 static struct proc_dir_entry *proc_pmu_info;
@@ -884,9 +886,9 @@ static const struct file_operations pmu_info_proc_fops = {
 static int pmu_irqstats_proc_show(struct seq_file *m, void *v)
 {
 	int i;
-	static const char *irq_names[] = {
-		"Total CB1 triggered events",
-		"Total GPIO1 triggered events",
+	static const char *irq_names[NUM_IRQ_STATS] = {
+		"Unknown interrupt (type 0)",
+		"Unknown interrupt (type 1)",
 		"PC-Card eject button",
 		"Sound/Brightness button",
 		"ADB message",
@@ -895,10 +897,12 @@ static int pmu_irqstats_proc_show(struct seq_file *m, void *v)
 		"Tick timer",
 		"Ghost interrupt (zero len)",
 		"Empty interrupt (empty mask)",
-		"Max irqs in a row"
+		"Max irqs in a row",
+		"Total CB1 triggered events",
+		"Total GPIO1 triggered events",
         };
 
-	for (i=0; i<11; i++) {
+	for (i = 0; i < NUM_IRQ_STATS; i++) {
 		seq_printf(m, " %2u: %10u (%s)\n",
 			     i, pmu_irq_stats[i], irq_names[i]);
 	}
@@ -1659,7 +1663,7 @@ via_pmu_interrupt(int irq, void *arg)
 		}
 		if (intr & CB1_INT) {
 			adb_int_pending = 1;
-			pmu_irq_stats[0]++;
+			pmu_irq_stats[11]++;
 		}
 		if (intr & SR_INT) {
 			req = pmu_sr_intr();
@@ -1746,7 +1750,7 @@ gpio1_interrupt(int irq, void *arg)
 			disable_irq_nosync(gpio_irq);
 			gpio_irq_enabled = 0;
 		}
-		pmu_irq_stats[1]++;
+		pmu_irq_stats[12]++;
 		adb_int_pending = 1;
 		spin_unlock_irqrestore(&pmu_lock, flags);
 		via_pmu_interrupt(0, NULL);
-- 
2.16.4

^ permalink raw reply related

* [PATCH v2 11/12] macintosh/via-pmu: Clean up interrupt statistics
From: Finn Thain @ 2018-06-08  2:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Schmitz, linuxppc-dev, linux-m68k, linux-kernel
In-Reply-To: <cover.1528423341.git.fthain@telegraphics.com.au>

Replace an open-coded ffs() with the function call.
Simplify an if-else cascade using a switch statement.
Correct a typo and an indentation issue.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
 drivers/macintosh/via-pmu.c | 39 ++++++++++++++++++++++-----------------
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
index 38d7dd0bdb28..730c10f7fbb7 100644
--- a/drivers/macintosh/via-pmu.c
+++ b/drivers/macintosh/via-pmu.c
@@ -1392,7 +1392,8 @@ pmu_resume(void)
 static void
 pmu_handle_data(unsigned char *data, int len)
 {
-	unsigned char ints, pirq;
+	unsigned char ints;
+	int idx;
 	int i = 0;
 
 	asleep = 0;
@@ -1414,25 +1415,24 @@ pmu_handle_data(unsigned char *data, int len)
 		ints &= ~(PMU_INT_ADB_AUTO | PMU_INT_AUTO_SRQ_POLL);
 
 next:
-
 	if (ints == 0) {
 		if (i > pmu_irq_stats[10])
 			pmu_irq_stats[10] = i;
 		return;
 	}
-
-	for (pirq = 0; pirq < 8; pirq++)
-		if (ints & (1 << pirq))
-			break;
-	pmu_irq_stats[pirq]++;
 	i++;
-	ints &= ~(1 << pirq);
+
+	idx = ffs(ints) - 1;
+	ints &= ~BIT(idx);
+
+	pmu_irq_stats[idx]++;
 
 	/* Note: for some reason, we get an interrupt with len=1,
 	 * data[0]==0 after each normal ADB interrupt, at least
 	 * on the Pismo. Still investigating...  --BenH
 	 */
-	if ((1 << pirq) & PMU_INT_ADB) {
+	switch (BIT(idx)) {
+	case PMU_INT_ADB:
 		if ((data[0] & PMU_INT_ADB_AUTO) == 0) {
 			struct adb_request *req = req_awaiting_reply;
 			if (req == 0) {
@@ -1470,25 +1470,28 @@ pmu_handle_data(unsigned char *data, int len)
 				adb_input(data+1, len-1, 1);
 #endif /* CONFIG_ADB */		
 		}
-	}
+		break;
+
 	/* Sound/brightness button pressed */
-	else if ((1 << pirq) & PMU_INT_SNDBRT) {
+	case PMU_INT_SNDBRT:
 #ifdef CONFIG_PMAC_BACKLIGHT
 		if (len == 3)
 			pmac_backlight_set_legacy_brightness_pmu(data[1] >> 4);
 #endif
-	}
+		break;
+
 	/* Tick interrupt */
-	else if ((1 << pirq) & PMU_INT_TICK) {
-		/* Environement or tick interrupt, query batteries */
+	case PMU_INT_TICK:
+		/* Environment or tick interrupt, query batteries */
 		if (pmu_battery_count) {
 			if ((--query_batt_timer) == 0) {
 				query_battery_state();
 				query_batt_timer = BATTERY_POLLING_COUNT;
 			}
 		}
-        }
-	else if ((1 << pirq) & PMU_INT_ENVIRONMENT) {
+		break;
+
+	case PMU_INT_ENVIRONMENT:
 		if (pmu_battery_count)
 			query_battery_state();
 		pmu_pass_intr(data, len);
@@ -1498,7 +1501,9 @@ pmu_handle_data(unsigned char *data, int len)
 			via_pmu_event(PMU_EVT_POWER, !!(data[1]&8));
 			via_pmu_event(PMU_EVT_LID, data[1]&1);
 		}
-	} else {
+		break;
+
+	default:
 	       pmu_pass_intr(data, len);
 	}
 	goto next;
-- 
2.16.4

^ permalink raw reply related

* [PATCH v2 10/12] macintosh: Use common code to access RTC
From: Finn Thain @ 2018-06-08  2:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Schmitz, linuxppc-dev, linux-m68k, linux-kernel,
	Geert Uytterhoeven, Paul Mackerras, , Michael Ellerman
In-Reply-To: <cover.1528423341.git.fthain@telegraphics.com.au>

Now that the 68k Mac port has adopted the via-pmu driver, it must access
the PMU RTC using the appropriate command format. The same code can now
be used for both m68k and powerpc.

Replace the RTC code that's duplicated in arch/powerpc and arch/m68k
with common RTC accessors for Cuda and PMU devices.

Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Paul Mackerras <paulus@samba.org>,
Cc: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 arch/m68k/mac/misc.c                   | 64 ++---------------------------
 arch/powerpc/platforms/powermac/time.c | 74 +---------------------------------
 drivers/macintosh/via-cuda.c           | 34 ++++++++++++++++
 drivers/macintosh/via-pmu.c            | 32 +++++++++++++++
 include/linux/cuda.h                   |  3 ++
 include/linux/pmu.h                    |  3 ++
 6 files changed, 78 insertions(+), 132 deletions(-)

diff --git a/arch/m68k/mac/misc.c b/arch/m68k/mac/misc.c
index 28090a44fa09..397f9f942a9f 100644
--- a/arch/m68k/mac/misc.c
+++ b/arch/m68k/mac/misc.c
@@ -33,34 +33,6 @@
 static void (*rom_reset)(void);
 
 #ifdef CONFIG_ADB_CUDA
-static long cuda_read_time(void)
-{
-	struct adb_request req;
-	long time;
-
-	if (cuda_request(&req, NULL, 2, CUDA_PACKET, CUDA_GET_TIME) < 0)
-		return 0;
-	while (!req.complete)
-		cuda_poll();
-
-	time = (req.reply[3] << 24) | (req.reply[4] << 16) |
-	       (req.reply[5] << 8) | req.reply[6];
-	return time - RTC_OFFSET;
-}
-
-static void cuda_write_time(long data)
-{
-	struct adb_request req;
-
-	data += RTC_OFFSET;
-	if (cuda_request(&req, NULL, 6, CUDA_PACKET, CUDA_SET_TIME,
-			 (data >> 24) & 0xFF, (data >> 16) & 0xFF,
-			 (data >> 8) & 0xFF, data & 0xFF) < 0)
-		return;
-	while (!req.complete)
-		cuda_poll();
-}
-
 static __u8 cuda_read_pram(int offset)
 {
 	struct adb_request req;
@@ -86,34 +58,6 @@ static void cuda_write_pram(int offset, __u8 data)
 #endif /* CONFIG_ADB_CUDA */
 
 #ifdef CONFIG_ADB_PMU
-static long pmu_read_time(void)
-{
-	struct adb_request req;
-	long time;
-
-	if (pmu_request(&req, NULL, 1, PMU_READ_RTC) < 0)
-		return 0;
-	while (!req.complete)
-		pmu_poll();
-
-	time = (req.reply[1] << 24) | (req.reply[2] << 16) |
-	       (req.reply[3] << 8) | req.reply[4];
-	return time - RTC_OFFSET;
-}
-
-static void pmu_write_time(long data)
-{
-	struct adb_request req;
-
-	data += RTC_OFFSET;
-	if (pmu_request(&req, NULL, 5, PMU_SET_RTC,
-			(data >> 24) & 0xFF, (data >> 16) & 0xFF,
-			(data >> 8) & 0xFF, data & 0xFF) < 0)
-		return;
-	while (!req.complete)
-		pmu_poll();
-}
-
 static __u8 pmu_read_pram(int offset)
 {
 	struct adb_request req;
@@ -635,12 +579,12 @@ int mac_hwclk(int op, struct rtc_time *t)
 #ifdef CONFIG_ADB_CUDA
 		case MAC_ADB_EGRET:
 		case MAC_ADB_CUDA:
-			now = cuda_read_time();
+			now = cuda_get_time();
 			break;
 #endif
 #ifdef CONFIG_ADB_PMU
 		case MAC_ADB_PB2:
-			now = pmu_read_time();
+			now = pmu_get_time();
 			break;
 #endif
 		default:
@@ -671,12 +615,12 @@ int mac_hwclk(int op, struct rtc_time *t)
 #ifdef CONFIG_ADB_CUDA
 		case MAC_ADB_EGRET:
 		case MAC_ADB_CUDA:
-			cuda_write_time(now);
+			cuda_set_time(now);
 			break;
 #endif
 #ifdef CONFIG_ADB_PMU
 		case MAC_ADB_PB2:
-			pmu_write_time(now);
+			pmu_set_time(now);
 			break;
 #endif
 		default:
diff --git a/arch/powerpc/platforms/powermac/time.c b/arch/powerpc/platforms/powermac/time.c
index 274af6fa388e..e9c1f3dafe2f 100644
--- a/arch/powerpc/platforms/powermac/time.c
+++ b/arch/powerpc/platforms/powermac/time.c
@@ -42,9 +42,6 @@
 #define DBG(x...)
 #endif
 
-/* Apparently the RTC stores seconds since 1 Jan 1904 */
-#define RTC_OFFSET	2082844800
-
 /*
  * Calibrate the decrementer frequency with the VIA timer 1.
  */
@@ -103,43 +100,8 @@ static unsigned long from_rtc_time(struct rtc_time *tm)
 #endif
 
 #ifdef CONFIG_ADB_CUDA
-static unsigned long cuda_get_time(void)
-{
-	struct adb_request req;
-	unsigned int now;
-
-	if (cuda_request(&req, NULL, 2, CUDA_PACKET, CUDA_GET_TIME) < 0)
-		return 0;
-	while (!req.complete)
-		cuda_poll();
-	if (req.reply_len != 7)
-		printk(KERN_ERR "cuda_get_time: got %d byte reply\n",
-		       req.reply_len);
-	now = (req.reply[3] << 24) + (req.reply[4] << 16)
-		+ (req.reply[5] << 8) + req.reply[6];
-	return ((unsigned long)now) - RTC_OFFSET;
-}
-
 #define cuda_get_rtc_time(tm)	to_rtc_time(cuda_get_time(), (tm))
-
-static int cuda_set_rtc_time(struct rtc_time *tm)
-{
-	unsigned int nowtime;
-	struct adb_request req;
-
-	nowtime = from_rtc_time(tm) + RTC_OFFSET;
-	if (cuda_request(&req, NULL, 6, CUDA_PACKET, CUDA_SET_TIME,
-			 nowtime >> 24, nowtime >> 16, nowtime >> 8,
-			 nowtime) < 0)
-		return -ENXIO;
-	while (!req.complete)
-		cuda_poll();
-	if ((req.reply_len != 3) && (req.reply_len != 7))
-		printk(KERN_ERR "cuda_set_rtc_time: got %d byte reply\n",
-		       req.reply_len);
-	return 0;
-}
-
+#define cuda_set_rtc_time(tm)	cuda_set_time(from_rtc_time(tm))
 #else
 #define cuda_get_time()		0
 #define cuda_get_rtc_time(tm)
@@ -147,40 +109,8 @@ static int cuda_set_rtc_time(struct rtc_time *tm)
 #endif
 
 #ifdef CONFIG_ADB_PMU
-static unsigned long pmu_get_time(void)
-{
-	struct adb_request req;
-	unsigned int now;
-
-	if (pmu_request(&req, NULL, 1, PMU_READ_RTC) < 0)
-		return 0;
-	pmu_wait_complete(&req);
-	if (req.reply_len != 4)
-		printk(KERN_ERR "pmu_get_time: got %d byte reply from PMU\n",
-		       req.reply_len);
-	now = (req.reply[0] << 24) + (req.reply[1] << 16)
-		+ (req.reply[2] << 8) + req.reply[3];
-	return ((unsigned long)now) - RTC_OFFSET;
-}
-
 #define pmu_get_rtc_time(tm)	to_rtc_time(pmu_get_time(), (tm))
-
-static int pmu_set_rtc_time(struct rtc_time *tm)
-{
-	unsigned int nowtime;
-	struct adb_request req;
-
-	nowtime = from_rtc_time(tm) + RTC_OFFSET;
-	if (pmu_request(&req, NULL, 5, PMU_SET_RTC, nowtime >> 24,
-			nowtime >> 16, nowtime >> 8, nowtime) < 0)
-		return -ENXIO;
-	pmu_wait_complete(&req);
-	if (req.reply_len != 0)
-		printk(KERN_ERR "pmu_set_rtc_time: %d byte reply from PMU\n",
-		       req.reply_len);
-	return 0;
-}
-
+#define pmu_set_rtc_time(tm)	pmu_set_time(from_rtc_time(tm))
 #else
 #define pmu_get_time()		0
 #define pmu_get_rtc_time(tm)
diff --git a/drivers/macintosh/via-cuda.c b/drivers/macintosh/via-cuda.c
index 019556136e77..8e7fb2115f10 100644
--- a/drivers/macintosh/via-cuda.c
+++ b/drivers/macintosh/via-cuda.c
@@ -771,3 +771,37 @@ cuda_input(unsigned char *buf, int nb)
 	               buf, nb, false);
     }
 }
+
+/* Offset between Unix time (1970-based) and Mac time (1904-based) */
+#define RTC_OFFSET	2082844800
+
+unsigned long cuda_get_time(void)
+{
+	struct adb_request req;
+	unsigned long now;
+
+	if (cuda_request(&req, NULL, 2, CUDA_PACKET, CUDA_GET_TIME) < 0)
+		return 0;
+	while (!req.complete)
+		cuda_poll();
+	if (req.reply_len != 7)
+		pr_err("%s: got %d byte reply\n", __func__, req.reply_len);
+	now = (req.reply[3] << 24) + (req.reply[4] << 16) +
+	      (req.reply[5] << 8) + req.reply[6];
+	return now - RTC_OFFSET;
+}
+
+int cuda_set_time(unsigned long now)
+{
+	struct adb_request req;
+
+	now += RTC_OFFSET;
+	if (cuda_request(&req, NULL, 6, CUDA_PACKET, CUDA_SET_TIME,
+			now >> 24, now >> 16, now >> 8, now) < 0)
+		return -ENXIO;
+	while (!req.complete)
+		cuda_poll();
+	if ((req.reply_len != 3) && (req.reply_len != 7))
+		pr_err("%s: got %d byte reply\n", __func__, req.reply_len);
+	return 0;
+}
diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
index 22cb7d94e3ce..38d7dd0bdb28 100644
--- a/drivers/macintosh/via-pmu.c
+++ b/drivers/macintosh/via-pmu.c
@@ -1820,6 +1820,38 @@ pmu_shutdown(void)
 		;
 }
 
+/* Offset between Unix time (1970-based) and Mac time (1904-based) */
+#define RTC_OFFSET	2082844800
+
+unsigned long pmu_get_time(void)
+{
+	struct adb_request req;
+	unsigned long now;
+
+	if (pmu_request(&req, NULL, 1, PMU_READ_RTC) < 0)
+		return 0;
+	pmu_wait_complete(&req);
+	if (req.reply_len != 4)
+		pr_err("%s: got %d byte reply\n", __func__, req.reply_len);
+	now = (req.reply[0] << 24) + (req.reply[1] << 16) +
+	      (req.reply[2] << 8) + req.reply[3];
+	return now - RTC_OFFSET;
+}
+
+int pmu_set_time(unsigned long now)
+{
+	struct adb_request req;
+
+	now += RTC_OFFSET;
+	if (pmu_request(&req, NULL, 5, PMU_SET_RTC,
+			now >> 24, now >> 16, now >> 8, now) < 0)
+		return -ENXIO;
+	pmu_wait_complete(&req);
+	if (req.reply_len != 0)
+		pr_err("%s: got %d byte reply\n", __func__, req.reply_len);
+	return 0;
+}
+
 int
 pmu_present(void)
 {
diff --git a/include/linux/cuda.h b/include/linux/cuda.h
index 056867f09a01..a68669f746e1 100644
--- a/include/linux/cuda.h
+++ b/include/linux/cuda.h
@@ -16,4 +16,7 @@ extern int cuda_request(struct adb_request *req,
 			void (*done)(struct adb_request *), int nbytes, ...);
 extern void cuda_poll(void);
 
+extern unsigned long cuda_get_time(void);
+extern int cuda_set_time(unsigned long now);
+
 #endif /* _LINUX_CUDA_H */
diff --git a/include/linux/pmu.h b/include/linux/pmu.h
index 9ac8fc60ad49..feefd0bff9cf 100644
--- a/include/linux/pmu.h
+++ b/include/linux/pmu.h
@@ -34,6 +34,9 @@ static inline void pmu_resume(void)
 {}
 #endif
 
+extern unsigned long pmu_get_time(void);
+extern int pmu_set_time(unsigned long now);
+
 extern void pmu_enable_irled(int on);
 
 extern void pmu_restart(void);
-- 
2.16.4

^ permalink raw reply related

* [PATCH v2 08/12] macintosh/via-pmu68k: Don't load driver on unsupported hardware
From: Finn Thain @ 2018-06-08  2:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Schmitz, linuxppc-dev, linux-m68k, linux-kernel,
	Geert Uytterhoeven
In-Reply-To: <cover.1528423341.git.fthain@telegraphics.com.au>

Don't load the via-pmu68k driver on early PowerBooks. The M50753 PMU
device found in those models was never supported by this driver.
Attempting to load the driver usually causes a boot hang.

Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 arch/m68k/mac/misc.c           | 6 ++----
 drivers/macintosh/via-pmu68k.c | 4 ----
 include/uapi/linux/pmu.h       | 1 -
 3 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/m68k/mac/misc.c b/arch/m68k/mac/misc.c
index c68054361615..7ccb799eeb57 100644
--- a/arch/m68k/mac/misc.c
+++ b/arch/m68k/mac/misc.c
@@ -478,8 +478,7 @@ void mac_poweroff(void)
 		cuda_shutdown();
 #endif
 #ifdef CONFIG_ADB_PMU68K
-	} else if (macintosh_config->adb_type == MAC_ADB_PB1
-		|| macintosh_config->adb_type == MAC_ADB_PB2) {
+	} else if (macintosh_config->adb_type == MAC_ADB_PB2) {
 		pmu_shutdown();
 #endif
 	}
@@ -520,8 +519,7 @@ void mac_reset(void)
 		cuda_restart();
 #endif
 #ifdef CONFIG_ADB_PMU68K
-	} else if (macintosh_config->adb_type == MAC_ADB_PB1
-		|| macintosh_config->adb_type == MAC_ADB_PB2) {
+	} else if (macintosh_config->adb_type == MAC_ADB_PB2) {
 		pmu_restart();
 #endif
 	} else if (CPU_IS_030) {
diff --git a/drivers/macintosh/via-pmu68k.c b/drivers/macintosh/via-pmu68k.c
index d545ed45e482..bec8e1837d7d 100644
--- a/drivers/macintosh/via-pmu68k.c
+++ b/drivers/macintosh/via-pmu68k.c
@@ -175,9 +175,6 @@ static s8 pmu_data_len[256][2] = {
 int __init find_via_pmu(void)
 {
 	switch (macintosh_config->adb_type) {
-	case MAC_ADB_PB1:
-		pmu_kind = PMU_68K_V1;
-		break;
 	case MAC_ADB_PB2:
 		pmu_kind = PMU_68K_V2;
 		break;
@@ -785,7 +782,6 @@ pmu_enable_backlight(int on)
 	    /* first call: get current backlight value */
 	    if (backlight_level < 0) {
 		switch(pmu_kind) {
-		    case PMU_68K_V1:
 		    case PMU_68K_V2:
 			pmu_request(&req, NULL, 3, PMU_READ_NVRAM, 0x14, 0xe);
 			while (!req.complete)
diff --git a/include/uapi/linux/pmu.h b/include/uapi/linux/pmu.h
index 89cb1acea93a..30f64d46f5db 100644
--- a/include/uapi/linux/pmu.h
+++ b/include/uapi/linux/pmu.h
@@ -93,7 +93,6 @@ enum {
 	PMU_HEATHROW_BASED,	/* PowerBook G3 series */
 	PMU_PADDINGTON_BASED,	/* 1999 PowerBook G3 */
 	PMU_KEYLARGO_BASED,	/* Core99 motherboard (PMU99) */
-	PMU_68K_V1,		/* 68K PMU, version 1 */
 	PMU_68K_V2, 		/* 68K PMU, version 2 */
 };
 
-- 
2.16.4

^ permalink raw reply related

* [PATCH v2 06/12] macintosh/via-pmu: Add support for m68k PowerBooks
From: Finn Thain @ 2018-06-08  2:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Schmitz, linuxppc-dev, linux-m68k, linux-kernel
In-Reply-To: <cover.1528423341.git.fthain@telegraphics.com.au>

Put #ifdefs around the Open Firmware, xmon, interrupt dispatch,
battery and suspend code. Add the necessary interrupt handling to
support m68k PowerBooks.

The pmu_kind value is available to userspace using the
PMU_IOC_GET_MODEL ioctl. It is not clear yet what hardware classes
are be needed to describe m68k PowerBook models, so pmu_kind is given
the provisional value PMU_UNKNOWN.

To find out about the hardware, user programs can use /proc/bootinfo
or /proc/hardware, or send the PMU_GET_VERSION command using /dev/adb.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/macintosh/Kconfig   |   2 +-
 drivers/macintosh/via-pmu.c | 101 +++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 91 insertions(+), 12 deletions(-)

diff --git a/drivers/macintosh/Kconfig b/drivers/macintosh/Kconfig
index 97a420c11eed..9c6452b38c36 100644
--- a/drivers/macintosh/Kconfig
+++ b/drivers/macintosh/Kconfig
@@ -65,7 +65,7 @@ config ADB_CUDA
 	  If unsure say Y.
 
 config ADB_PMU
-	bool "Support for PMU  based PowerMacs"
+	bool "Support for PMU based PowerMacs and PowerBooks"
 	depends on PPC_PMAC
 	help
 	  On PowerBooks, iBooks, and recent iMacs and Power Macintoshes, the
diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
index 2e09137410f6..22cb7d94e3ce 100644
--- a/drivers/macintosh/via-pmu.c
+++ b/drivers/macintosh/via-pmu.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * Device driver for the via-pmu on Apple Powermacs.
+ * Device driver for the PMU in Apple PowerBooks and PowerMacs.
  *
  * The VIA (versatile interface adapter) interfaces to the PMU,
  * a 6805 microprocessor core whose primary function is to control
@@ -49,20 +49,26 @@
 #include <linux/compat.h>
 #include <linux/of_address.h>
 #include <linux/of_irq.h>
-#include <asm/prom.h>
+#include <linux/uaccess.h>
 #include <asm/machdep.h>
 #include <asm/io.h>
 #include <asm/pgtable.h>
 #include <asm/sections.h>
 #include <asm/irq.h>
+#ifdef CONFIG_PPC_PMAC
 #include <asm/pmac_feature.h>
 #include <asm/pmac_pfunc.h>
 #include <asm/pmac_low_i2c.h>
-#include <linux/uaccess.h>
+#include <asm/prom.h>
 #include <asm/mmu_context.h>
 #include <asm/cputable.h>
 #include <asm/time.h>
 #include <asm/backlight.h>
+#else
+#include <asm/macintosh.h>
+#include <asm/macints.h>
+#include <asm/mac_via.h>
+#endif
 
 #include "via-pmu-event.h"
 
@@ -97,8 +103,13 @@ static DEFINE_MUTEX(pmu_info_proc_mutex);
 #define ANH		(15*RS)		/* A-side data, no handshake */
 
 /* Bits in B data register: both active low */
+#ifdef CONFIG_PPC_PMAC
 #define TACK		0x08		/* Transfer acknowledge (input) */
 #define TREQ		0x10		/* Transfer request (output) */
+#else
+#define TACK		0x02
+#define TREQ		0x04
+#endif
 
 /* Bits in ACR */
 #define SR_CTRL		0x1c		/* Shift register control bits */
@@ -140,13 +151,15 @@ static int data_index;
 static int data_len;
 static volatile int adb_int_pending;
 static volatile int disable_poll;
-static struct device_node *vias;
 static int pmu_kind = PMU_UNKNOWN;
 static int pmu_fully_inited;
 static int pmu_has_adb;
+#ifdef CONFIG_PPC_PMAC
 static volatile unsigned char __iomem *via1;
 static volatile unsigned char __iomem *via2;
+static struct device_node *vias;
 static struct device_node *gpio_node;
+#endif
 static unsigned char __iomem *gpio_reg;
 static int gpio_irq = 0;
 static int gpio_irq_enabled = -1;
@@ -273,6 +286,7 @@ static char *pbook_type[] = {
 
 int __init find_via_pmu(void)
 {
+#ifdef CONFIG_PPC_PMAC
 	u64 taddr;
 	const u32 *reg;
 
@@ -355,9 +369,6 @@ int __init find_via_pmu(void)
 	if (!init_pmu())
 		goto fail_init;
 
-	printk(KERN_INFO "PMU driver v%d initialized for %s, firmware: %02x\n",
-	       PMU_DRIVER_VERSION, pbook_type[pmu_kind], pmu_version);
-	       
 	sys_ctrler = SYS_CTRLER_PMU;
 	
 	return 1;
@@ -373,6 +384,30 @@ int __init find_via_pmu(void)
 	vias = NULL;
 	pmu_state = uninitialized;
 	return 0;
+#else
+	if (macintosh_config->adb_type != MAC_ADB_PB2)
+		return 0;
+
+	pmu_kind = PMU_UNKNOWN;
+
+	spin_lock_init(&pmu_lock);
+
+	pmu_has_adb = 1;
+
+	pmu_intr_mask =	PMU_INT_PCEJECT |
+			PMU_INT_SNDBRT |
+			PMU_INT_ADB |
+			PMU_INT_TICK;
+
+	pmu_state = idle;
+
+	if (!init_pmu()) {
+		pmu_state = uninitialized;
+		return 0;
+	}
+
+	return 1;
+#endif /* !CONFIG_PPC_PMAC */
 }
 
 #ifdef CONFIG_ADB
@@ -396,13 +431,14 @@ static int pmu_init(void)
  */
 static int __init via_pmu_start(void)
 {
-	unsigned int irq;
+	unsigned int __maybe_unused irq;
 
 	if (pmu_state == uninitialized)
 		return -ENODEV;
 
 	batt_req.complete = 1;
 
+#ifdef CONFIG_PPC_PMAC
 	irq = irq_of_parse_and_map(vias, 0);
 	if (!irq) {
 		printk(KERN_ERR "via-pmu: can't map interrupt\n");
@@ -439,6 +475,19 @@ static int __init via_pmu_start(void)
 
 	/* Enable interrupts */
 	out_8(&via1[IER], IER_SET | SR_INT | CB1_INT);
+#else
+	if (request_irq(IRQ_MAC_ADB_SR, via_pmu_interrupt, IRQF_NO_SUSPEND,
+			"VIA-PMU-SR", NULL)) {
+		pr_err("%s: couldn't get SR irq\n", __func__);
+		return -ENODEV;
+	}
+	if (request_irq(IRQ_MAC_ADB_CL, via_pmu_interrupt, IRQF_NO_SUSPEND,
+			"VIA-PMU-CL", NULL)) {
+		pr_err("%s: couldn't get CL irq\n", __func__);
+		free_irq(IRQ_MAC_ADB_SR, NULL);
+		return -ENODEV;
+	}
+#endif /* !CONFIG_PPC_PMAC */
 
 	pmu_fully_inited = 1;
 
@@ -587,6 +636,10 @@ init_pmu(void)
 			       option_server_mode ? "enabled" : "disabled");
 		}
 	}
+
+	printk(KERN_INFO "PMU driver v%d initialized for %s, firmware: %02x\n",
+	       PMU_DRIVER_VERSION, pbook_type[pmu_kind], pmu_version);
+
 	return 1;
 }
 
@@ -625,6 +678,7 @@ static void pmu_set_server_mode(int server_mode)
 static void
 done_battery_state_ohare(struct adb_request* req)
 {
+#ifdef CONFIG_PPC_PMAC
 	/* format:
 	 *  [0]    :  flags
 	 *    0x01 :  AC indicator
@@ -706,6 +760,7 @@ done_battery_state_ohare(struct adb_request* req)
 	pmu_batteries[pmu_cur_battery].amperage = amperage;
 	pmu_batteries[pmu_cur_battery].voltage = voltage;
 	pmu_batteries[pmu_cur_battery].time_remaining = time;
+#endif /* CONFIG_PPC_PMAC */
 
 	clear_bit(0, &async_req_locks);
 }
@@ -1393,6 +1448,7 @@ pmu_handle_data(unsigned char *data, int len)
 			}
 			pmu_done(req);
 		} else {
+#ifdef CONFIG_XMON
 			if (len == 4 && data[1] == 0x2c) {
 				extern int xmon_wants_key, xmon_adb_keycode;
 				if (xmon_wants_key) {
@@ -1400,6 +1456,7 @@ pmu_handle_data(unsigned char *data, int len)
 					return;
 				}
 			}
+#endif /* CONFIG_XMON */
 #ifdef CONFIG_ADB
 			/*
 			 * XXX On the [23]400 the PMU gives us an up
@@ -1567,7 +1624,25 @@ via_pmu_interrupt(int irq, void *arg)
 	++disable_poll;
 	
 	for (;;) {
-		intr = in_8(&via1[IFR]) & (SR_INT | CB1_INT);
+		/* On 68k Macs, VIA interrupts are dispatched individually.
+		 * Unless we are polling, the relevant IRQ flag has already
+		 * been cleared.
+		 */
+		intr = 0;
+		if (IS_ENABLED(CONFIG_PPC_PMAC) || !irq) {
+			intr = in_8(&via1[IFR]) & (SR_INT | CB1_INT);
+			out_8(&via1[IFR], intr);
+		}
+#ifndef CONFIG_PPC_PMAC
+		switch (irq) {
+		case IRQ_MAC_ADB_CL:
+			intr = CB1_INT;
+			break;
+		case IRQ_MAC_ADB_SR:
+			intr = SR_INT;
+			break;
+		}
+#endif
 		if (intr == 0)
 			break;
 		handled = 1;
@@ -1577,7 +1652,6 @@ via_pmu_interrupt(int irq, void *arg)
 			       intr, in_8(&via1[IER]), pmu_state);
 			break;
 		}
-		out_8(&via1[IFR], intr);
 		if (intr & CB1_INT) {
 			adb_int_pending = 1;
 			pmu_irq_stats[0]++;
@@ -1587,6 +1661,9 @@ via_pmu_interrupt(int irq, void *arg)
 			if (req)
 				break;
 		}
+#ifndef CONFIG_PPC_PMAC
+		break;
+#endif
 	}
 
 recheck:
@@ -1653,7 +1730,7 @@ pmu_unlock(void)
 }
 
 
-static irqreturn_t
+static __maybe_unused irqreturn_t
 gpio1_interrupt(int irq, void *arg)
 {
 	unsigned long flags;
@@ -2287,6 +2364,7 @@ static int pmu_ioctl(struct file *filp,
 	int error = -EINVAL;
 
 	switch (cmd) {
+#ifdef CONFIG_PPC_PMAC
 	case PMU_IOC_SLEEP:
 		if (!capable(CAP_SYS_ADMIN))
 			return -EACCES;
@@ -2296,6 +2374,7 @@ static int pmu_ioctl(struct file *filp,
 			return put_user(0, argp);
 		else
 			return put_user(1, argp);
+#endif
 
 #ifdef CONFIG_PMAC_BACKLIGHT_LEGACY
 	/* Compatibility ioctl's for backlight */
-- 
2.16.4

^ permalink raw reply related

* [PATCH v2 05/12] macintosh/via-pmu: Replace via pointer with via1 and via2 pointers
From: Finn Thain @ 2018-06-08  2:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Schmitz, linuxppc-dev, linux-m68k, linux-kernel
In-Reply-To: <cover.1528423341.git.fthain@telegraphics.com.au>

On most PowerPC Macs, the PMU driver uses the shift register and
IO port B from a single VIA chip.

On 68k and early PowerPC PowerBooks, the driver uses the shift register
from one VIA chip together with IO port B from another.

Replace via with via1 and via2 to accommodate this. For the
CONFIG_PPC_PMAC case, set via1 = via2 so there is no change.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/macintosh/via-pmu.c | 142 +++++++++++++++++++++-----------------------
 1 file changed, 69 insertions(+), 73 deletions(-)

diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
index c4c324fb5fa6..2e09137410f6 100644
--- a/drivers/macintosh/via-pmu.c
+++ b/drivers/macintosh/via-pmu.c
@@ -76,7 +76,6 @@
 #define BATTERY_POLLING_COUNT	2
 
 static DEFINE_MUTEX(pmu_info_proc_mutex);
-static volatile unsigned char __iomem *via;
 
 /* VIA registers - spaced 0x200 bytes apart */
 #define RS		0x200		/* skip between registers */
@@ -145,6 +144,8 @@ static struct device_node *vias;
 static int pmu_kind = PMU_UNKNOWN;
 static int pmu_fully_inited;
 static int pmu_has_adb;
+static volatile unsigned char __iomem *via1;
+static volatile unsigned char __iomem *via2;
 static struct device_node *gpio_node;
 static unsigned char __iomem *gpio_reg;
 static int gpio_irq = 0;
@@ -340,14 +341,14 @@ int __init find_via_pmu(void)
 	} else
 		pmu_kind = PMU_UNKNOWN;
 
-	via = ioremap(taddr, 0x2000);
-	if (via == NULL) {
+	via1 = via2 = ioremap(taddr, 0x2000);
+	if (via1 == NULL) {
 		printk(KERN_ERR "via-pmu: Can't map address !\n");
 		goto fail_via_remap;
 	}
 	
-	out_8(&via[IER], IER_CLR | 0x7f);	/* disable all intrs */
-	out_8(&via[IFR], 0x7f);			/* clear IFR */
+	out_8(&via1[IER], IER_CLR | 0x7f);	/* disable all intrs */
+	out_8(&via1[IFR], 0x7f);			/* clear IFR */
 
 	pmu_state = idle;
 
@@ -362,8 +363,8 @@ int __init find_via_pmu(void)
 	return 1;
 
  fail_init:
-	iounmap(via);
-	via = NULL;
+	iounmap(via1);
+	via1 = via2 = NULL;
  fail_via_remap:
 	iounmap(gpio_reg);
 	gpio_reg = NULL;
@@ -437,7 +438,7 @@ static int __init via_pmu_start(void)
 	}
 
 	/* Enable interrupts */
-	out_8(&via[IER], IER_SET | SR_INT | CB1_INT);
+	out_8(&via1[IER], IER_SET | SR_INT | CB1_INT);
 
 	pmu_fully_inited = 1;
 
@@ -533,8 +534,8 @@ init_pmu(void)
 	struct adb_request req;
 
 	/* Negate TREQ. Set TACK to input and TREQ to output. */
-	out_8(&via[B], in_8(&via[B]) | TREQ);
-	out_8(&via[DIRB], (in_8(&via[DIRB]) | TREQ) & ~TACK);
+	out_8(&via2[B], in_8(&via2[B]) | TREQ);
+	out_8(&via2[DIRB], (in_8(&via2[DIRB]) | TREQ) & ~TACK);
 
 	pmu_request(&req, NULL, 2, PMU_SET_INTR_MASK, pmu_intr_mask);
 	timeout =  100000;
@@ -1174,7 +1175,7 @@ wait_for_ack(void)
 	 * reported
 	 */
 	int timeout = 4000;
-	while ((in_8(&via[B]) & TACK) == 0) {
+	while ((in_8(&via2[B]) & TACK) == 0) {
 		if (--timeout < 0) {
 			printk(KERN_ERR "PMU not responding (!ack)\n");
 			return;
@@ -1188,23 +1189,19 @@ wait_for_ack(void)
 static inline void
 send_byte(int x)
 {
-	volatile unsigned char __iomem *v = via;
-
-	out_8(&v[ACR], in_8(&v[ACR]) | SR_OUT | SR_EXT);
-	out_8(&v[SR], x);
-	out_8(&v[B], in_8(&v[B]) & ~TREQ);		/* assert TREQ */
-	(void)in_8(&v[B]);
+	out_8(&via1[ACR], in_8(&via1[ACR]) | SR_OUT | SR_EXT);
+	out_8(&via1[SR], x);
+	out_8(&via2[B], in_8(&via2[B]) & ~TREQ);	/* assert TREQ */
+	(void)in_8(&via2[B]);
 }
 
 static inline void
 recv_byte(void)
 {
-	volatile unsigned char __iomem *v = via;
-
-	out_8(&v[ACR], (in_8(&v[ACR]) & ~SR_OUT) | SR_EXT);
-	in_8(&v[SR]);		/* resets SR */
-	out_8(&v[B], in_8(&v[B]) & ~TREQ);
-	(void)in_8(&v[B]);
+	out_8(&via1[ACR], (in_8(&via1[ACR]) & ~SR_OUT) | SR_EXT);
+	in_8(&via1[SR]);		/* resets SR */
+	out_8(&via2[B], in_8(&via2[B]) & ~TREQ);
+	(void)in_8(&via2[B]);
 }
 
 static inline void
@@ -1307,7 +1304,7 @@ pmu_suspend(void)
 		if (!adb_int_pending && pmu_state == idle && !req_awaiting_reply) {
 			if (gpio_irq >= 0)
 				disable_irq_nosync(gpio_irq);
-			out_8(&via[IER], CB1_INT | IER_CLR);
+			out_8(&via1[IER], CB1_INT | IER_CLR);
 			spin_unlock_irqrestore(&pmu_lock, flags);
 			break;
 		}
@@ -1331,7 +1328,7 @@ pmu_resume(void)
 	adb_int_pending = 1;
 	if (gpio_irq >= 0)
 		enable_irq(gpio_irq);
-	out_8(&via[IER], CB1_INT | IER_SET);
+	out_8(&via1[IER], CB1_INT | IER_SET);
 	spin_unlock_irqrestore(&pmu_lock, flags);
 	pmu_poll();
 }
@@ -1456,20 +1453,20 @@ pmu_sr_intr(void)
 	struct adb_request *req;
 	int bite = 0;
 
-	if (in_8(&via[B]) & TREQ) {
-		printk(KERN_ERR "PMU: spurious SR intr (%x)\n", in_8(&via[B]));
+	if (in_8(&via2[B]) & TREQ) {
+		printk(KERN_ERR "PMU: spurious SR intr (%x)\n", in_8(&via2[B]));
 		return NULL;
 	}
 	/* The ack may not yet be low when we get the interrupt */
-	while ((in_8(&via[B]) & TACK) != 0)
+	while ((in_8(&via2[B]) & TACK) != 0)
 			;
 
 	/* if reading grab the byte, and reset the interrupt */
 	if (pmu_state == reading || pmu_state == reading_intr)
-		bite = in_8(&via[SR]);
+		bite = in_8(&via1[SR]);
 
 	/* reset TREQ and wait for TACK to go high */
-	out_8(&via[B], in_8(&via[B]) | TREQ);
+	out_8(&via2[B], in_8(&via2[B]) | TREQ);
 	wait_for_ack();
 
 	switch (pmu_state) {
@@ -1570,17 +1567,17 @@ via_pmu_interrupt(int irq, void *arg)
 	++disable_poll;
 	
 	for (;;) {
-		intr = in_8(&via[IFR]) & (SR_INT | CB1_INT);
+		intr = in_8(&via1[IFR]) & (SR_INT | CB1_INT);
 		if (intr == 0)
 			break;
 		handled = 1;
 		if (++nloop > 1000) {
 			printk(KERN_DEBUG "PMU: stuck in intr loop, "
 			       "intr=%x, ier=%x pmu_state=%d\n",
-			       intr, in_8(&via[IER]), pmu_state);
+			       intr, in_8(&via1[IER]), pmu_state);
 			break;
 		}
-		out_8(&via[IFR], intr);
+		out_8(&via1[IFR], intr);
 		if (intr & CB1_INT) {
 			adb_int_pending = 1;
 			pmu_irq_stats[0]++;
@@ -1762,29 +1759,29 @@ static u32 save_via[8];
 static void
 save_via_state(void)
 {
-	save_via[0] = in_8(&via[ANH]);
-	save_via[1] = in_8(&via[DIRA]);
-	save_via[2] = in_8(&via[B]);
-	save_via[3] = in_8(&via[DIRB]);
-	save_via[4] = in_8(&via[PCR]);
-	save_via[5] = in_8(&via[ACR]);
-	save_via[6] = in_8(&via[T1CL]);
-	save_via[7] = in_8(&via[T1CH]);
+	save_via[0] = in_8(&via1[ANH]);
+	save_via[1] = in_8(&via1[DIRA]);
+	save_via[2] = in_8(&via1[B]);
+	save_via[3] = in_8(&via1[DIRB]);
+	save_via[4] = in_8(&via1[PCR]);
+	save_via[5] = in_8(&via1[ACR]);
+	save_via[6] = in_8(&via1[T1CL]);
+	save_via[7] = in_8(&via1[T1CH]);
 }
 static void
 restore_via_state(void)
 {
-	out_8(&via[ANH], save_via[0]);
-	out_8(&via[DIRA], save_via[1]);
-	out_8(&via[B], save_via[2]);
-	out_8(&via[DIRB], save_via[3]);
-	out_8(&via[PCR], save_via[4]);
-	out_8(&via[ACR], save_via[5]);
-	out_8(&via[T1CL], save_via[6]);
-	out_8(&via[T1CH], save_via[7]);
-	out_8(&via[IER], IER_CLR | 0x7f);	/* disable all intrs */
-	out_8(&via[IFR], 0x7f);				/* clear IFR */
-	out_8(&via[IER], IER_SET | SR_INT | CB1_INT);
+	out_8(&via1[ANH],  save_via[0]);
+	out_8(&via1[DIRA], save_via[1]);
+	out_8(&via1[B],    save_via[2]);
+	out_8(&via1[DIRB], save_via[3]);
+	out_8(&via1[PCR],  save_via[4]);
+	out_8(&via1[ACR],  save_via[5]);
+	out_8(&via1[T1CL], save_via[6]);
+	out_8(&via1[T1CH], save_via[7]);
+	out_8(&via1[IER], IER_CLR | 0x7f);	/* disable all intrs */
+	out_8(&via1[IFR], 0x7f);			/* clear IFR */
+	out_8(&via1[IER], IER_SET | SR_INT | CB1_INT);
 }
 
 #define	GRACKLE_PM	(1<<7)
@@ -2426,33 +2423,33 @@ device_initcall(pmu_device_init);
 
 #ifdef DEBUG_SLEEP
 static inline void 
-polled_handshake(volatile unsigned char __iomem *via)
+polled_handshake(void)
 {
-	via[B] &= ~TREQ; eieio();
-	while ((via[B] & TACK) != 0)
+	via2[B] &= ~TREQ; eieio();
+	while ((via2[B] & TACK) != 0)
 		;
-	via[B] |= TREQ; eieio();
-	while ((via[B] & TACK) == 0)
+	via2[B] |= TREQ; eieio();
+	while ((via2[B] & TACK) == 0)
 		;
 }
 
 static inline void 
-polled_send_byte(volatile unsigned char __iomem *via, int x)
+polled_send_byte(int x)
 {
-	via[ACR] |= SR_OUT | SR_EXT; eieio();
-	via[SR] = x; eieio();
-	polled_handshake(via);
+	via1[ACR] |= SR_OUT | SR_EXT; eieio();
+	via1[SR] = x; eieio();
+	polled_handshake();
 }
 
 static inline int
-polled_recv_byte(volatile unsigned char __iomem *via)
+polled_recv_byte(void)
 {
 	int x;
 
-	via[ACR] = (via[ACR] & ~SR_OUT) | SR_EXT; eieio();
-	x = via[SR]; eieio();
-	polled_handshake(via);
-	x = via[SR]; eieio();
+	via1[ACR] = (via1[ACR] & ~SR_OUT) | SR_EXT; eieio();
+	x = via1[SR]; eieio();
+	polled_handshake();
+	x = via1[SR]; eieio();
 	return x;
 }
 
@@ -2461,7 +2458,6 @@ pmu_polled_request(struct adb_request *req)
 {
 	unsigned long flags;
 	int i, l, c;
-	volatile unsigned char __iomem *v = via;
 
 	req->complete = 1;
 	c = req->data[0];
@@ -2473,21 +2469,21 @@ pmu_polled_request(struct adb_request *req)
 	while (pmu_state != idle)
 		pmu_poll();
 
-	while ((via[B] & TACK) == 0)
+	while ((via2[B] & TACK) == 0)
 		;
-	polled_send_byte(v, c);
+	polled_send_byte(c);
 	if (l < 0) {
 		l = req->nbytes - 1;
-		polled_send_byte(v, l);
+		polled_send_byte(l);
 	}
 	for (i = 1; i <= l; ++i)
-		polled_send_byte(v, req->data[i]);
+		polled_send_byte(req->data[i]);
 
 	l = pmu_data_len[c][1];
 	if (l < 0)
-		l = polled_recv_byte(v);
+		l = polled_recv_byte();
 	for (i = 0; i < l; ++i)
-		req->reply[i + req->reply_len] = polled_recv_byte(v);
+		req->reply[i + req->reply_len] = polled_recv_byte();
 
 	if (req->done)
 		(*req->done)(req);
-- 
2.16.4

^ permalink raw reply related

* [PATCH v2 07/12] macintosh/via-pmu: Make CONFIG_PPC_PMAC Kconfig deps explicit
From: Finn Thain @ 2018-06-08  2:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Schmitz, linuxppc-dev, linux-m68k, linux-kernel
In-Reply-To: <cover.1528423341.git.fthain@telegraphics.com.au>

At present, CONFIG_ADB_PMU depends on CONFIG_PPC_PMAC. When this gets
relaxed to CONFIG_PPC_PMAC || CONFIG_MAC, those Kconfig symbols with
implicit deps on PPC_PMAC will need explicit deps. Add them now.
No functional change.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/macintosh/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/macintosh/Kconfig b/drivers/macintosh/Kconfig
index 9c6452b38c36..26abae4c899d 100644
--- a/drivers/macintosh/Kconfig
+++ b/drivers/macintosh/Kconfig
@@ -79,7 +79,7 @@ config ADB_PMU
 
 config ADB_PMU_LED
 	bool "Support for the Power/iBook front LED"
-	depends on ADB_PMU
+	depends on PPC_PMAC && ADB_PMU
 	select NEW_LEDS
 	select LEDS_CLASS
 	help
@@ -122,7 +122,7 @@ config PMAC_MEDIABAY
 
 config PMAC_BACKLIGHT
 	bool "Backlight control for LCD screens"
-	depends on ADB_PMU && FB = y && (BROKEN || !PPC64)
+	depends on PPC_PMAC && ADB_PMU && FB = y && (BROKEN || !PPC64)
 	select FB_BACKLIGHT
 	help
 	  Say Y here to enable Macintosh specific extensions of the generic
-- 
2.16.4

^ permalink raw reply related

* [PATCH v2 04/12] macintosh/via-pmu: Enhance state machine with new 'uninitialized' state
From: Finn Thain @ 2018-06-08  2:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Schmitz, linuxppc-dev, linux-m68k, linux-kernel
In-Reply-To: <cover.1528423341.git.fthain@telegraphics.com.au>

On 68k Macs, the via/vias pointer can't be used to determine whether
the PMU driver has been initialized. For portability, add a new state
to indicate that via_find_pmu() succeeded.

After via_find_pmu() executes, testing vias == NULL is equivalent to
testing via == NULL. Replace these tests with pmu_state == uninitialized
which is simpler and more consistent. No functional change.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/macintosh/via-pmu.c | 44 ++++++++++++++++++++++----------------------
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
index 4c1bae5380c2..c4c324fb5fa6 100644
--- a/drivers/macintosh/via-pmu.c
+++ b/drivers/macintosh/via-pmu.c
@@ -114,6 +114,7 @@ static volatile unsigned char __iomem *via;
 #define CB1_INT		0x10		/* transition on CB1 input */
 
 static volatile enum pmu_state {
+	uninitialized = 0,
 	idle,
 	sending,
 	intack,
@@ -274,7 +275,7 @@ int __init find_via_pmu(void)
 	u64 taddr;
 	const u32 *reg;
 
-	if (via != 0)
+	if (pmu_state != uninitialized)
 		return 1;
 	vias = of_find_node_by_name(NULL, "via-pmu");
 	if (vias == NULL)
@@ -369,20 +370,19 @@ int __init find_via_pmu(void)
  fail:
 	of_node_put(vias);
 	vias = NULL;
+	pmu_state = uninitialized;
 	return 0;
 }
 
 #ifdef CONFIG_ADB
 static int pmu_probe(void)
 {
-	return vias == NULL? -ENODEV: 0;
+	return pmu_state == uninitialized ? -ENODEV : 0;
 }
 
 static int pmu_init(void)
 {
-	if (vias == NULL)
-		return -ENODEV;
-	return 0;
+	return pmu_state == uninitialized ? -ENODEV : 0;
 }
 #endif /* CONFIG_ADB */
 
@@ -397,7 +397,7 @@ static int __init via_pmu_start(void)
 {
 	unsigned int irq;
 
-	if (vias == NULL)
+	if (pmu_state == uninitialized)
 		return -ENODEV;
 
 	batt_req.complete = 1;
@@ -463,7 +463,7 @@ arch_initcall(via_pmu_start);
  */
 static int __init via_pmu_dev_init(void)
 {
-	if (vias == NULL)
+	if (pmu_state == uninitialized)
 		return -ENODEV;
 
 #ifdef CONFIG_PMAC_BACKLIGHT
@@ -966,7 +966,7 @@ static int pmu_send_request(struct adb_request *req, int sync)
 {
 	int i, ret;
 
-	if ((vias == NULL) || (!pmu_fully_inited)) {
+	if (pmu_state == uninitialized || !pmu_fully_inited) {
 		req->complete = 1;
 		return -ENXIO;
 	}
@@ -1060,7 +1060,7 @@ static int __pmu_adb_autopoll(int devs)
 
 static int pmu_adb_autopoll(int devs)
 {
-	if ((vias == NULL) || (!pmu_fully_inited) || !pmu_has_adb)
+	if (pmu_state == uninitialized || !pmu_fully_inited || !pmu_has_adb)
 		return -ENXIO;
 
 	adb_dev_map = devs;
@@ -1073,7 +1073,7 @@ static int pmu_adb_reset_bus(void)
 	struct adb_request req;
 	int save_autopoll = adb_dev_map;
 
-	if ((vias == NULL) || (!pmu_fully_inited) || !pmu_has_adb)
+	if (pmu_state == uninitialized || !pmu_fully_inited || !pmu_has_adb)
 		return -ENXIO;
 
 	/* anyone got a better idea?? */
@@ -1109,7 +1109,7 @@ pmu_request(struct adb_request *req, void (*done)(struct adb_request *),
 	va_list list;
 	int i;
 
-	if (vias == NULL)
+	if (pmu_state == uninitialized)
 		return -ENXIO;
 
 	if (nbytes < 0 || nbytes > 32) {
@@ -1134,7 +1134,7 @@ pmu_queue_request(struct adb_request *req)
 	unsigned long flags;
 	int nsend;
 
-	if (via == NULL) {
+	if (pmu_state == uninitialized) {
 		req->complete = 1;
 		return -ENXIO;
 	}
@@ -1247,7 +1247,7 @@ pmu_start(void)
 void
 pmu_poll(void)
 {
-	if (!via)
+	if (pmu_state == uninitialized)
 		return;
 	if (disable_poll)
 		return;
@@ -1257,7 +1257,7 @@ pmu_poll(void)
 void
 pmu_poll_adb(void)
 {
-	if (!via)
+	if (pmu_state == uninitialized)
 		return;
 	if (disable_poll)
 		return;
@@ -1272,7 +1272,7 @@ pmu_poll_adb(void)
 void
 pmu_wait_complete(struct adb_request *req)
 {
-	if (!via)
+	if (pmu_state == uninitialized)
 		return;
 	while((pmu_state != idle && pmu_state != locked) || !req->complete)
 		via_pmu_interrupt(0, NULL);
@@ -1288,7 +1288,7 @@ pmu_suspend(void)
 {
 	unsigned long flags;
 
-	if (!via)
+	if (pmu_state == uninitialized)
 		return;
 	
 	spin_lock_irqsave(&pmu_lock, flags);
@@ -1319,7 +1319,7 @@ pmu_resume(void)
 {
 	unsigned long flags;
 
-	if (!via || (pmu_suspended < 1))
+	if (pmu_state == uninitialized || pmu_suspended < 1)
 		return;
 
 	spin_lock_irqsave(&pmu_lock, flags);
@@ -1681,7 +1681,7 @@ pmu_enable_irled(int on)
 {
 	struct adb_request req;
 
-	if (vias == NULL)
+	if (pmu_state == uninitialized)
 		return ;
 	if (pmu_kind == PMU_KEYLARGO_BASED)
 		return ;
@@ -1696,7 +1696,7 @@ pmu_restart(void)
 {
 	struct adb_request req;
 
-	if (via == NULL)
+	if (pmu_state == uninitialized)
 		return;
 
 	local_irq_disable();
@@ -1721,7 +1721,7 @@ pmu_shutdown(void)
 {
 	struct adb_request req;
 
-	if (via == NULL)
+	if (pmu_state == uninitialized)
 		return;
 
 	local_irq_disable();
@@ -1749,7 +1749,7 @@ pmu_shutdown(void)
 int
 pmu_present(void)
 {
-	return via != 0;
+	return pmu_state != uninitialized;
 }
 
 #if defined(CONFIG_SUSPEND) && defined(CONFIG_PPC32)
@@ -2415,7 +2415,7 @@ static struct miscdevice pmu_device = {
 
 static int pmu_device_init(void)
 {
-	if (!via)
+	if (pmu_state == uninitialized)
 		return 0;
 	if (misc_register(&pmu_device) < 0)
 		printk(KERN_ERR "via-pmu: cannot register misc device.\n");
-- 
2.16.4

^ permalink raw reply related

* [PATCH v2 03/12] macintosh/via-pmu: Don't clear shift register interrupt flag twice
From: Finn Thain @ 2018-06-08  2:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Schmitz, linuxppc-dev, linux-m68k, linux-kernel
In-Reply-To: <cover.1528423341.git.fthain@telegraphics.com.au>

The shift register interrupt flag gets cleared in via_pmu_interrupt()
and once again in pmu_sr_intr(). Fix this theoretical race condition.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
 drivers/macintosh/via-pmu.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
index 74065ea410bd..4c1bae5380c2 100644
--- a/drivers/macintosh/via-pmu.c
+++ b/drivers/macintosh/via-pmu.c
@@ -1458,7 +1458,6 @@ pmu_sr_intr(void)
 
 	if (in_8(&via[B]) & TREQ) {
 		printk(KERN_ERR "PMU: spurious SR intr (%x)\n", in_8(&via[B]));
-		out_8(&via[IFR], SR_INT);
 		return NULL;
 	}
 	/* The ack may not yet be low when we get the interrupt */
-- 
2.16.4

^ permalink raw reply related

* [PATCH v2 01/12] macintosh/via-pmu: Fix section mismatch warning
From: Finn Thain @ 2018-06-08  2:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Schmitz, linuxppc-dev, linux-m68k, linux-kernel
In-Reply-To: <cover.1528423341.git.fthain@telegraphics.com.au>

The pmu_init() function has the __init qualifier, but the ops struct
that holds a pointer to it does not. This causes a build warning.
The driver works fine because the pointer is only dereferenced early.

The function is so small that there's negligible benefit from using
the __init qualifier. Remove it to fix the warning, consistent with
the other ADB drivers.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
 drivers/macintosh/via-pmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
index 433dbeddfcf9..fd3c5640d586 100644
--- a/drivers/macintosh/via-pmu.c
+++ b/drivers/macintosh/via-pmu.c
@@ -378,7 +378,7 @@ static int pmu_probe(void)
 	return vias == NULL? -ENODEV: 0;
 }
 
-static int __init pmu_init(void)
+static int pmu_init(void)
 {
 	if (vias == NULL)
 		return -ENODEV;
-- 
2.16.4

^ permalink raw reply related

* [PATCH v2 02/12] macintosh/via-pmu: Add missing mmio accessors
From: Finn Thain @ 2018-06-08  2:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Schmitz, linuxppc-dev, linux-m68k, linux-kernel
In-Reply-To: <cover.1528423341.git.fthain@telegraphics.com.au>

Add missing in_8() accessors to init_pmu() and pmu_sr_intr().

This fixes several sparse warnings:
drivers/macintosh/via-pmu.c:536:29: warning: dereference of noderef expression
drivers/macintosh/via-pmu.c:537:33: warning: dereference of noderef expression
drivers/macintosh/via-pmu.c:1455:17: warning: dereference of noderef expression
drivers/macintosh/via-pmu.c:1456:69: warning: dereference of noderef expression

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
 drivers/macintosh/via-pmu.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
index fd3c5640d586..74065ea410bd 100644
--- a/drivers/macintosh/via-pmu.c
+++ b/drivers/macintosh/via-pmu.c
@@ -532,8 +532,9 @@ init_pmu(void)
 	int timeout;
 	struct adb_request req;
 
-	out_8(&via[B], via[B] | TREQ);			/* negate TREQ */
-	out_8(&via[DIRB], (via[DIRB] | TREQ) & ~TACK);	/* TACK in, TREQ out */
+	/* Negate TREQ. Set TACK to input and TREQ to output. */
+	out_8(&via[B], in_8(&via[B]) | TREQ);
+	out_8(&via[DIRB], (in_8(&via[DIRB]) | TREQ) & ~TACK);
 
 	pmu_request(&req, NULL, 2, PMU_SET_INTR_MASK, pmu_intr_mask);
 	timeout =  100000;
@@ -1455,8 +1456,8 @@ pmu_sr_intr(void)
 	struct adb_request *req;
 	int bite = 0;
 
-	if (via[B] & TREQ) {
-		printk(KERN_ERR "PMU: spurious SR intr (%x)\n", via[B]);
+	if (in_8(&via[B]) & TREQ) {
+		printk(KERN_ERR "PMU: spurious SR intr (%x)\n", in_8(&via[B]));
 		out_8(&via[IFR], SR_INT);
 		return NULL;
 	}
-- 
2.16.4

^ permalink raw reply related

* [PATCH v2 00/12] macintosh: Resolve various PMU driver problems
From: Finn Thain @ 2018-06-08  2:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Schmitz, linuxppc-dev, linux-m68k, linux-kernel

This series of patches has the following aims.

1) Eliminate duplicated code. Linux presently has two drivers for
   the 68HC05-based PMU devices found in Macs: via-pmu and via-pmu68k.
   There's no value in having separate PMU drivers for each architecture.

2) Avoid further work on via-pmu68k that's not needed for via-pmu.

3) Fix some bugs in the via-pmu driver.

4) Enable the /dev/pmu and /proc/pmu/* userspace APIs on m68k Macs
   by adopting via-pmu.

5) Improve stability on early 100-series PowerBooks by loading no PMU
   driver at all. Neither via-pmu nor via-pmu68k supports the early
   M50753-based PMU device found in these models.

6) Eliminate duplicated RTC accessors for PMU and Cuda. Presently these
   can be found under both arch/m68k and arch/powerpc.

7) Assist the out-of-tree NuBus PowerMac port to support PMU designs
   shared with the m68k Mac port (e.g. PowerBooks 190 and 5300).

This patch series has been regression tested on various PowerBooks
(190, 520, 3400, Pismo G3) and PowerMacs (Beige G3, G5). These patches
did not affect userland utilities. (Note that there is a userland-
visible change to the contents of /proc/pmu/interrupts.)

Changed since v1:
1) Added blank lines after 'break' statements in patch 10.
2) Improved patch description for patch 3.
3) Added reviewed-by tags.
4) Split patch 8 to make code review easier.

Finn Thain (12):
  macintosh/via-pmu: Fix section mismatch warning
  macintosh/via-pmu: Add missing mmio accessors
  macintosh/via-pmu: Don't clear shift register interrupt flag twice
  macintosh/via-pmu: Enhance state machine with new 'uninitialized'
    state
  macintosh/via-pmu: Replace via pointer with via1 and via2 pointers
  macintosh/via-pmu: Add support for m68k PowerBooks
  macintosh/via-pmu: Make CONFIG_PPC_PMAC Kconfig deps explicit
  macintosh/via-pmu68k: Don't load driver on unsupported hardware
  macintosh/via-pmu: Replace via-pmu68k driver with via-pmu driver
  macintosh: Use common code to access RTC
  macintosh/via-pmu: Clean up interrupt statistics
  macintosh/via-pmu: Disambiguate interrupt statistics

 arch/m68k/configs/mac_defconfig        |   2 +-
 arch/m68k/configs/multi_defconfig      |   2 +-
 arch/m68k/mac/config.c                 |   2 +-
 arch/m68k/mac/misc.c                   | 118 +----
 arch/powerpc/platforms/powermac/time.c |  74 +--
 drivers/macintosh/Kconfig              |  19 +-
 drivers/macintosh/Makefile             |   1 -
 drivers/macintosh/adb.c                |   2 +-
 drivers/macintosh/via-cuda.c           |  34 ++
 drivers/macintosh/via-pmu.c            | 378 ++++++++++-----
 drivers/macintosh/via-pmu68k.c         | 850 ---------------------------------
 include/linux/cuda.h                   |   3 +
 include/linux/pmu.h                    |   3 +
 include/uapi/linux/pmu.h               |   2 -
 14 files changed, 311 insertions(+), 1179 deletions(-)
 delete mode 100644 drivers/macintosh/via-pmu68k.c

-- 
2.16.4

^ permalink raw reply

* Re: [PATCH 1/3] powerpc: make CPU selection logic generic in Makefile
From: Nicholas Piggin @ 2018-06-08  1:54 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	linux-kernel, linuxppc-dev
In-Reply-To: <273f8ed3e980b9385c6e1b31e17f890ea08ce33c.1528365638.git.christophe.leroy@c-s.fr>

On Thu,  7 Jun 2018 10:10:18 +0000 (UTC)
Christophe Leroy <christophe.leroy@c-s.fr> wrote:

> At the time being, when adding a new CPU for selection, both
> Kconfig.cputype and Makefile have to be modified.
> 
> This patch moves into Kconfig.cputype the name of the CPU to me
> passed to the -mcpu= argument.

Seems like a good cleanup.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

> 
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
>  arch/powerpc/Makefile                  |  8 +-------
>  arch/powerpc/platforms/Kconfig.cputype | 15 +++++++++++++++
>  2 files changed, 16 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
> index 9704ab360d39..9a5642552abc 100644
> --- a/arch/powerpc/Makefile
> +++ b/arch/powerpc/Makefile
> @@ -175,13 +175,7 @@ ifdef CONFIG_MPROFILE_KERNEL
>      endif
>  endif
>  
> -CFLAGS-$(CONFIG_CELL_CPU) += $(call cc-option,-mcpu=cell)
> -CFLAGS-$(CONFIG_POWER5_CPU) += $(call cc-option,-mcpu=power5)
> -CFLAGS-$(CONFIG_POWER6_CPU) += $(call cc-option,-mcpu=power6)
> -CFLAGS-$(CONFIG_POWER7_CPU) += $(call cc-option,-mcpu=power7)
> -CFLAGS-$(CONFIG_POWER8_CPU) += $(call cc-option,-mcpu=power8)
> -CFLAGS-$(CONFIG_POWER9_CPU) += $(call cc-option,-mcpu=power9)
> -CFLAGS-$(CONFIG_PPC_8xx) += $(call cc-option,-mcpu=860)
> +CFLAGS-$(CONFIG_SPECIAL_CPU_BOOL) += $(call cc-option,-mcpu=$(CONFIG_SPECIAL_CPU))
>  
>  # Altivec option not allowed with e500mc64 in GCC.
>  ifeq ($(CONFIG_ALTIVEC),y)
> diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
> index cc892dcfa114..71ef559cc474 100644
> --- a/arch/powerpc/platforms/Kconfig.cputype
> +++ b/arch/powerpc/platforms/Kconfig.cputype
> @@ -140,6 +140,21 @@ config E6500_CPU
>  
>  endchoice
>  
> +config SPECIAL_CPU_BOOL
> +	bool
> +	default !GENERIC_CPU
> +
> +config SPECIAL_CPU
> +	string
> +	depends on SPECIAL_CPU_BOOL
> +	default "cell" if CELL_CPU
> +	default "power5" if POWER5_CPU
> +	default "power6" if POWER6_CPU
> +	default "power7" if POWER7_CPU
> +	default "power8" if POWER8_CPU
> +	default "power9" if POWER9_CPU
> +	default "860" if PPC_8xx
> +
>  config PPC_BOOK3S
>  	def_bool y
>  	depends on PPC_BOOK3S_32 || PPC_BOOK3S_64

^ permalink raw reply

* Re: [v3 PATCH 5/5] powerpc/pseries: Display machine check error details.
From: Nicholas Piggin @ 2018-06-08  1:51 UTC (permalink / raw)
  To: Mahesh J Salgaonkar
  Cc: linuxppc-dev, Aneesh Kumar K.V, Michael Ellerman, Laurent Dufour
In-Reply-To: <152839254280.25118.11212831020041096859.stgit@jupiter.in.ibm.com>

On Thu, 07 Jun 2018 22:59:04 +0530
Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:

> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> 
> Extract the MCE error details from RTAS extended log and display it to
> console.
> 
> With this patch you should now see mce logs like below:
> 
> [  142.371818] Severe Machine check interrupt [Recovered]
> [  142.371822]   NIP [d00000000ca301b8]: init_module+0x1b8/0x338 [bork_kernel]
> [  142.371822]   Initiator: CPU
> [  142.371823]   Error type: SLB [Multihit]
> [  142.371824]     Effective address: d00000000ca70000
> 
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/rtas.h      |    5 +
>  arch/powerpc/platforms/pseries/ras.c |  128 +++++++++++++++++++++++++++++++++-
>  2 files changed, 131 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
> index 3f2fba7ef23b..8100a95c133a 100644
> --- a/arch/powerpc/include/asm/rtas.h
> +++ b/arch/powerpc/include/asm/rtas.h
> @@ -190,6 +190,11 @@ static inline uint8_t rtas_error_extended(const struct rtas_error_log *elog)
>  	return (elog->byte1 & 0x04) >> 2;
>  }
>  
> +static inline uint8_t rtas_error_initiator(const struct rtas_error_log *elog)
> +{
> +	return (elog->byte2 & 0xf0) >> 4;
> +}
> +
>  #define rtas_error_type(x)	((x)->byte3)
>  
>  static inline
> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
> index e56759d92356..cd9446980092 100644
> --- a/arch/powerpc/platforms/pseries/ras.c
> +++ b/arch/powerpc/platforms/pseries/ras.c
> @@ -422,7 +422,130 @@ int pSeries_system_reset_exception(struct pt_regs *regs)
>  	return 0; /* need to perform reset */
>  }
>  
> -static int mce_handle_error(struct rtas_error_log *errp)
> +#define VAL_TO_STRING(ar, val)	((val < ARRAY_SIZE(ar)) ? ar[val] : "Unknown")
> +
> +static void pseries_print_mce_info(struct pt_regs *regs,
> +				struct rtas_error_log *errp, int disposition)
> +{
> +	const char *level, *sevstr;
> +	struct pseries_errorlog *pseries_log;
> +	struct pseries_mc_errorlog *mce_log;
> +	uint8_t error_type, err_sub_type;
> +	uint8_t initiator = rtas_error_initiator(errp);
> +	uint64_t addr;
> +
> +	static const char * const initiators[] = {
> +		"Unknown",
> +		"CPU",
> +		"PCI",
> +		"ISA",
> +		"Memory",
> +		"Power Mgmt",
> +	};
> +	static const char * const mc_err_types[] = {
> +		"UE",
> +		"SLB",
> +		"ERAT",
> +		"TLB",
> +		"D-Cache",
> +		"Unknown",
> +		"I-Cache",
> +	};
> +	static const char * const mc_ue_types[] = {
> +		"Indeterminate",
> +		"Instruction fetch",
> +		"Page table walk ifetch",
> +		"Load/Store",
> +		"Page table walk Load/Store",
> +	};
> +
> +	/* SLB sub errors valid values are 0x0, 0x1, 0x2 */
> +	static const char * const mc_slb_types[] = {
> +		"Parity",
> +		"Multihit",
> +		"Indeterminate",
> +	};
> +
> +	/* TLB and ERAT sub errors valid values are 0x1, 0x2, 0x3 */
> +	static const char * const mc_soft_types[] = {
> +		"Unknown",
> +		"Parity",
> +		"Multihit",
> +		"Indeterminate",
> +	};
> +
> +	pseries_log = get_pseries_errorlog(errp, PSERIES_ELOG_SECT_ID_MCE);
> +	if (pseries_log == NULL)
> +		return;
> +
> +	mce_log = (struct pseries_mc_errorlog *)pseries_log->data;
> +
> +	error_type = rtas_mc_error_type(mce_log);
> +	err_sub_type = rtas_mc_error_sub_type(mce_log);
> +
> +	switch (rtas_error_severity(errp)) {
> +	case RTAS_SEVERITY_NO_ERROR:
> +		level = KERN_INFO;
> +		sevstr = "Harmless";
> +		break;
> +	case RTAS_SEVERITY_WARNING:
> +		level = KERN_WARNING;
> +		sevstr = "";
> +		break;
> +	case RTAS_SEVERITY_ERROR:
> +	case RTAS_SEVERITY_ERROR_SYNC:
> +		level = KERN_ERR;
> +		sevstr = "Severe";
> +		break;
> +	case RTAS_SEVERITY_FATAL:
> +	default:
> +		level = KERN_ERR;
> +		sevstr = "Fatal";
> +		break;
> +	}
> +
> +	printk("%s%s Machine check interrupt [%s]\n", level, sevstr,
> +		disposition == RTAS_DISP_FULLY_RECOVERED ?
> +		"Recovered" : "Not recovered");
> +	if (user_mode(regs)) {
> +		printk("%s  NIP: [%016lx] PID: %d Comm: %s\n", level,
> +			regs->nip, current->pid, current->comm);
> +	} else {
> +		printk("%s  NIP [%016lx]: %pS\n", level, regs->nip,
> +			(void *)regs->nip);
> +	}

I think it's probably still useful to print pid/comm for kernel mode
faults if !in_interrupt()... I see you're basically taking kernel/mce.c
and doing the same thing.

Is there any reasonable way to share code here?

Thanks,
Nick

^ permalink raw reply

* Re: [v3 PATCH 4/5] powerpc/pseries: Dump and flush SLB contents on SLB MCE errors.
From: Nicholas Piggin @ 2018-06-08  1:48 UTC (permalink / raw)
  To: Mahesh J Salgaonkar
  Cc: linuxppc-dev, Aneesh Kumar K.V, Michael Ellerman, Laurent Dufour
In-Reply-To: <152839253238.25118.3114450844744290470.stgit@jupiter.in.ibm.com>

On Thu, 07 Jun 2018 22:58:55 +0530
Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:

> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> 
> If we get a machine check exceptions due to SLB errors then dump the
> current SLB contents which will be very much helpful in debugging the
> root cause of SLB errors. On pseries, as of today system crashes on SLB
> errors. These are soft errors and can be fixed by flushing the SLBs so
> the kernel can continue to function instead of system crash. This patch
> fixes that also.

So pseries never flushed SLB and reloaded in response to multi hit
errors? This seems like quite a good improvement then. I like
dumping SLB too.

It's a bit annoying we can't share the same code with xmon really,
that's okay but I just suggest commenting them both if you take a
copy like this with a note to keep them in synch if you re-post
the series.

> 
> With this patch the console will log SLB contents like below on SLB MCE
> errors:
> 
> [  822.711728] slb contents:

Suggest keeping the same format as the xmon dump (in particular
CPU number, even though it's probably printed elsewhere in the MCE
message it doesn't hurt.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

Thanks,
Nick

> [  822.711730] 00 c000000008000000 400ea1b217000500
> [  822.711731]   1T  ESID=   c00000  VSID=      ea1b217 LLP:100
> [  822.711732] 01 d000000008000000 400d43642f000510
> [  822.711733]   1T  ESID=   d00000  VSID=      d43642f LLP:110
> [  822.711734] 09 f000000008000000 400a86c85f000500
> [  822.711736]   1T  ESID=   f00000  VSID=      a86c85f LLP:100
> [  822.711737] 10 00007f0008000000 400d1f26e3000d90
> [  822.711738]   1T  ESID=       7f  VSID=      d1f26e3 LLP:110
> [  822.711739] 11 0000000018000000 000e3615f520fd90
> [  822.711740]  256M ESID=        1  VSID=   e3615f520f LLP:110
> [  822.711740] 12 d000000008000000 400d43642f000510
> [  822.711741]   1T  ESID=   d00000  VSID=      d43642f LLP:110
> [  822.711742] 13 d000000008000000 400d43642f000510
> [  822.711743]   1T  ESID=   d00000  VSID=      d43642f LLP:110
> 
> 
> Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/mmu-hash.h |    1 +
>  arch/powerpc/mm/slb.c                         |   35 +++++++++++++++++++++++++
>  arch/powerpc/platforms/pseries/ras.c          |   29 ++++++++++++++++++++-
>  3 files changed, 64 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> index 50ed64fba4ae..c0da68927235 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> @@ -487,6 +487,7 @@ extern void hpte_init_native(void);
>  
>  extern void slb_initialize(void);
>  extern void slb_flush_and_rebolt(void);
> +extern void slb_dump_contents(void);
>  
>  extern void slb_vmalloc_update(void);
>  extern void slb_set_size(u16 size);
> diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
> index 66577cc66dc9..799aa117cec3 100644
> --- a/arch/powerpc/mm/slb.c
> +++ b/arch/powerpc/mm/slb.c
> @@ -145,6 +145,41 @@ void slb_flush_and_rebolt(void)
>  	get_paca()->slb_cache_ptr = 0;
>  }
>  
> +void slb_dump_contents(void)
> +{
> +	int i;
> +	unsigned long e, v;
> +	unsigned long llp;
> +
> +	pr_err("slb contents:\n");
> +	for (i = 0; i < mmu_slb_size; i++) {
> +		asm volatile("slbmfee  %0,%1" : "=r" (e) : "r" (i));
> +		asm volatile("slbmfev  %0,%1" : "=r" (v) : "r" (i));
> +
> +		if (!e && !v)
> +			continue;
> +
> +		pr_err("%02d %016lx %016lx", i, e, v);
> +
> +		if (!(e & SLB_ESID_V)) {
> +			pr_err("\n");
> +			continue;
> +		}
> +		llp = v & SLB_VSID_LLP;
> +		if (v & SLB_VSID_B_1T) {
> +			pr_err("  1T  ESID=%9lx  VSID=%13lx LLP:%3lx\n",
> +				GET_ESID_1T(e),
> +				(v & ~SLB_VSID_B) >> SLB_VSID_SHIFT_1T,
> +				llp);
> +		} else {
> +			pr_err(" 256M ESID=%9lx  VSID=%13lx LLP:%3lx\n",
> +				GET_ESID(e),
> +				(v & ~SLB_VSID_B) >> SLB_VSID_SHIFT,
> +				llp);
> +		}
> +	}
> +}
> +
>  void slb_vmalloc_update(void)
>  {
>  	unsigned long vflags;
> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
> index 2edc673be137..e56759d92356 100644
> --- a/arch/powerpc/platforms/pseries/ras.c
> +++ b/arch/powerpc/platforms/pseries/ras.c
> @@ -422,6 +422,31 @@ int pSeries_system_reset_exception(struct pt_regs *regs)
>  	return 0; /* need to perform reset */
>  }
>  
> +static int mce_handle_error(struct rtas_error_log *errp)
> +{
> +	struct pseries_errorlog *pseries_log;
> +	struct pseries_mc_errorlog *mce_log;
> +	int disposition = rtas_error_disposition(errp);
> +	uint8_t error_type;
> +
> +	pseries_log = get_pseries_errorlog(errp, PSERIES_ELOG_SECT_ID_MCE);
> +	if (pseries_log == NULL)
> +		goto out;
> +
> +	mce_log = (struct pseries_mc_errorlog *)pseries_log->data;
> +	error_type = rtas_mc_error_type(mce_log);
> +
> +	if ((disposition == RTAS_DISP_NOT_RECOVERED) &&
> +			(error_type == PSERIES_MC_ERROR_TYPE_SLB)) {
> +		slb_dump_contents();
> +		slb_flush_and_rebolt();
> +		disposition = RTAS_DISP_FULLY_RECOVERED;
> +	}
> +
> +out:
> +	return disposition;
> +}
> +
>  /*
>   * See if we can recover from a machine check exception.
>   * This is only called on power4 (or above) and only via
> @@ -434,7 +459,9 @@ int pSeries_system_reset_exception(struct pt_regs *regs)
>  static int recover_mce(struct pt_regs *regs, struct rtas_error_log *err)
>  {
>  	int recovered = 0;
> -	int disposition = rtas_error_disposition(err);
> +	int disposition;
> +
> +	disposition = mce_handle_error(err);
>  
>  	if (!(regs->msr & MSR_RI)) {
>  		/* If MSR_RI isn't set, we cannot recover */
> 

^ permalink raw reply

* Re: [v3 PATCH 2/5] powerpc/pseries: Fix endainness while restoring of r3 in MCE handler.
From: Nicholas Piggin @ 2018-06-08  1:33 UTC (permalink / raw)
  To: Mahesh J Salgaonkar
  Cc: linuxppc-dev, stable, Aneesh Kumar K.V, Michael Ellerman,
	Laurent Dufour
In-Reply-To: <152839249913.25118.1191250274945665204.stgit@jupiter.in.ibm.com>

On Thu, 07 Jun 2018 22:58:33 +0530
Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:

> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> 
> During Machine Check interrupt on pseries platform, register r3 points
> RTAS extended event log passed by hypervisor. Since hypervisor uses r3
> to pass pointer to rtas log, it stores the original r3 value at the
> start of the memory (first 8 bytes) pointed by r3. Since hypervisor
> stores this info and rtas log is in BE format, linux should make
> sure to restore r3 value in correct endian format.
> 
> Without this patch when MCE handler, after recovery, returns to code that
> that caused the MCE may end up with Data SLB access interrupt for invalid
> address followed by kernel panic or hang.
> 
> [   62.878965] Severe Machine check interrupt [Recovered]
> [   62.878968]   NIP [d00000000ca301b8]: init_module+0x1b8/0x338 [bork_kernel]
> [   62.878969]   Initiator: CPU
> [   62.878970]   Error type: SLB [Multihit]
> [   62.878971]     Effective address: d00000000ca70000
> cpu 0xa: Vector: 380 (Data SLB Access) at [c0000000fc7775b0]
>     pc: c0000000009694c0: vsnprintf+0x80/0x480
>     lr: c0000000009698e0: vscnprintf+0x20/0x60
>     sp: c0000000fc777830
>    msr: 8000000002009033
>    dar: a803a30c000000d0
>   current = 0xc00000000bc9ef00
>   paca    = 0xc00000001eca5c00	 softe: 3	 irq_happened: 0x01
>     pid   = 8860, comm = insmod
> [c0000000fc7778b0] c0000000009698e0 vscnprintf+0x20/0x60
> [c0000000fc7778e0] c00000000016b6c4 vprintk_emit+0xb4/0x4b0
> [c0000000fc777960] c00000000016d40c vprintk_func+0x5c/0xd0
> [c0000000fc777980] c00000000016cbb4 printk+0x38/0x4c
> [c0000000fc7779a0] d00000000ca301c0 init_module+0x1c0/0x338 [bork_kernel]
> [c0000000fc777a40] c00000000000d9c4 do_one_initcall+0x54/0x230
> [c0000000fc777b00] c0000000001b3b74 do_init_module+0x8c/0x248
> [c0000000fc777b90] c0000000001b2478 load_module+0x12b8/0x15b0
> [c0000000fc777d30] c0000000001b29e8 sys_finit_module+0xa8/0x110
> [c0000000fc777e30] c00000000000b204 system_call+0x58/0x6c
> --- Exception: c00 (System Call) at 00007fff8bda0644
> SP (7fffdfbfe980) is in userspace
> 
> This patch fixes this issue.

LGTM

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

> 
> Fixes: a08a53ea4c97 ("powerpc/le: Enable RTAS events support")
> Cc: stable@vger.kernel.org
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/ras.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
> index 5e1ef9150182..2edc673be137 100644
> --- a/arch/powerpc/platforms/pseries/ras.c
> +++ b/arch/powerpc/platforms/pseries/ras.c
> @@ -360,7 +360,7 @@ static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs)
>  	}
>  
>  	savep = __va(regs->gpr[3]);
> -	regs->gpr[3] = savep[0];	/* restore original r3 */
> +	regs->gpr[3] = be64_to_cpu(savep[0]);	/* restore original r3 */
>  
>  	/* If it isn't an extended log we can use the per cpu 64bit buffer */
>  	h = (struct rtas_error_log *)&savep[1];
> 

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox