LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v3 2/5] powerpc: apm82181: create shared dtsi for APM bluestone
From: Christian Lamparter @ 2020-09-19 20:23 UTC (permalink / raw)
  To: Rob Herring
  Cc: devicetree, arnd, Chris Blake, Paul Mackerras, andriy.shevchenko,
	linuxppc-dev
In-Reply-To: <20200915010543.GB612463@bogus>

On 2020-09-15 03:05, Rob Herring wrote:
> On Sun, Sep 06, 2020 at 12:06:12AM +0200, Christian Lamparter wrote:
>> This patch adds an DTSI-File that can be used by various device-tree
>> files for APM82181-based devices.
>>
>> Some of the nodes (like UART, PCIE, SATA) are used by the uboot and
>> need to stick with the naming-conventions of the old times'.
>> I've added comments whenever this was the case.
>>
>> Signed-off-by: Chris Blake <chrisrblake93@gmail.com>
>> Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
>> ---
>> rfc v1 -> v2:
>> 	- removed PKA (this CryptoPU will need driver)
>> 	- stick with compatibles, nodes, ... from either
>> 	  Bluestone (APM82181) or Canyonlands (PPC460EX).
>> 	- add labels for NAND and NOR to help with access.
>> v2 -> v3:
>> 	- nodename of pciex@d.... was changed to pcie@d..
>> 	  due to upstream patch.
>> 	- use simple-bus on the ebc, opb and plb nodes
>> ---
>>   arch/powerpc/boot/dts/apm82181.dtsi | 466 ++++++++++++++++++++++++++++
>>   1 file changed, 466 insertions(+)
>>   create mode 100644 arch/powerpc/boot/dts/apm82181.dtsi
>>
>> diff --git a/arch/powerpc/boot/dts/apm82181.dtsi b/arch/powerpc/boot/dts/apm82181.dtsi
>> new file mode 100644
>> index 000000000000..60283430978d
>> --- /dev/null
>> +++ b/arch/powerpc/boot/dts/apm82181.dtsi
>> @@ -0,0 +1,466 @@
>> +// SPDX-License-Identifier: GPL-2.0-or-later
>> +/*
>> + * Device Tree template include for various APM82181 boards.
>> + *
>> + * The SoC is an evolution of the PPC460EX predecessor.
>> + * This is why dt-nodes from the canyonlands EBC, OPB, USB,
>> + * DMA, SATA, EMAC, ... ended up in here.
>> + *
>> + * Copyright (c) 2010, Applied Micro Circuits Corporation
>> + * Author: Tirumala R Marri <tmarri@apm.com>,
>> + *	   Christian Lamparter <chunkeey@gmail.com>,
>> + *	   Chris Blake <chrisrblake93@gmail.com>
>> + */
>> +
>> +#include <dt-bindings/dma/dw-dmac.h>
>> +#include <dt-bindings/input/input.h>
>> +#include <dt-bindings/interrupt-controller/irq.h>
>> +#include <dt-bindings/gpio/gpio.h>
>> +
>> +/ {
>> +	#address-cells = <2>;
>> +	#size-cells = <1>;
>> +	dcr-parent = <&{/cpus/cpu@0}>;
>> +
>> +	aliases {
>> +		ethernet0 = &EMAC0; /* needed for BSP u-boot */
>> +	};
>> +
>> +	cpus {
>> +		#address-cells = <1>;
>> +		#size-cells = <0>;
>> +
>> +		CPU0: cpu@0 {
>> +			device_type = "cpu";
>> +			model = "PowerPC,apm82181";
> 
> This doesn't match the existing bluestone dts file.
> 
> Please separate any restructuring from changes.


"I see" (I'm including your comment of the dt-binding as well).

I'm getting the vibe that I better should not touch that bluestone.dts.
An honestly, looking at the series and patches that the APM-engineers
posted back in the day, I can see why this well is so poisoned... and
stuff like SATA/AHBDMA/USB/GPIO/CPM/... was missing.

As for the devices. In the spirit of Arnd Bergmann's post of
<https://lkml.org/lkml/2020/3/30/195>

|It would be nice to move over the the bluestone .dts to the apm82181.dtsi file
|when that gets added, if only to ensure they use the same description for each
|node, but that shouldn't stop the addition of the new file if that is needed for
|distros to make use of a popular device.
|I see a couple of additional files in openwrt.

I mean I don't have the bluestone dev board, just the consumer devices.
Would it be possible to support those? I can start from a "skeleton" apm82181.dtsi
This would just include CPU, Memory (SD-RAM+L2C+OCM), UIC (Interrupt-Controller),
the PLB+OBP+EBC Busses and UART. Just enough to make a board "boot from ram".

And then add nodes for PCIE+MSI, AHBDMA+SATA, I2C, Ethernet, NAND+NOR and finally
the Crypto each in separate patches.

^ permalink raw reply

* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Arnd Bergmann @ 2020-09-19 21:16 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-aio, open list:MIPS, David Howells, Linux-MM, keyrings,
	sparclinux, Christoph Hellwig, linux-arch, linux-s390,
	Linux SCSI List, X86 ML, linux-block, Al Viro, io-uring,
	linux-arm-kernel, Jens Axboe, Parisc List, Network Development,
	LKML, LSM List, Linux FS Devel, Andrew Morton, linuxppc-dev
In-Reply-To: <CALCETrW=BzodXeTAjSvpCoUQoL+MKaKPEeSTRWnB=-C9jMotbQ@mail.gmail.com>

On Sat, Sep 19, 2020 at 6:21 PM Andy Lutomirski <luto@kernel.org> wrote:
> On Fri, Sep 18, 2020 at 8:16 AM Christoph Hellwig <hch@lst.de> wrote:
> > On Fri, Sep 18, 2020 at 02:58:22PM +0100, Al Viro wrote:
> > > Said that, why not provide a variant that would take an explicit
> > > "is it compat" argument and use it there?  And have the normal
> > > one pass in_compat_syscall() to that...
> >
> > That would help to not introduce a regression with this series yes.
> > But it wouldn't fix existing bugs when io_uring is used to access
> > read or write methods that use in_compat_syscall().  One example that
> > I recently ran into is drivers/scsi/sg.c.

Ah, so reading /dev/input/event* would suffer from the same issue,
and that one would in fact be broken by your patch in the hypothetical
case that someone tried to use io_uring to read /dev/input/event on x32...

For reference, I checked the socket timestamp handling that has a
number of corner cases with time32/time64 formats in compat mode,
but none of those appear to be affected by the problem.

> Aside from the potentially nasty use of per-task variables, one thing
> I don't like about PF_FORCE_COMPAT is that it's one-way.  If we're
> going to have a generic mechanism for this, shouldn't we allow a full
> override of the syscall arch instead of just allowing forcing compat
> so that a compat syscall can do a non-compat operation?

The only reason it's needed here is that the caller is in a kernel
thread rather than a system call. Are there any possible scenarios
where one would actually need the opposite?

       Arnd

^ permalink raw reply

* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Finn Thain @ 2020-09-19 21:52 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Peter Maydell, linux-aio, open list:MIPS, David Howells, Linux-MM,
	keyrings, sparclinux, Christoph Hellwig, linux-arch, linux-s390,
	Linux SCSI List, X86 ML, linux-block, Al Viro, Andy Lutomirski,
	io-uring, linux-arm-kernel, Jens Axboe, Parisc List,
	Network Development, LKML, LSM List, Linux FS Devel,
	Andrew Morton, linuxppc-dev
In-Reply-To: <CAK8P3a2Mi+1yttyGk4k7HxRVrMtmFqJewouVhynqUL0PJycmog@mail.gmail.com>

On Sat, 19 Sep 2020, Arnd Bergmann wrote:

> On Sat, Sep 19, 2020 at 6:21 PM Andy Lutomirski <luto@kernel.org> wrote:
> > On Fri, Sep 18, 2020 at 8:16 AM Christoph Hellwig <hch@lst.de> wrote:
> > > On Fri, Sep 18, 2020 at 02:58:22PM +0100, Al Viro wrote:
> > > > Said that, why not provide a variant that would take an explicit 
> > > > "is it compat" argument and use it there?  And have the normal one 
> > > > pass in_compat_syscall() to that...
> > >
> > > That would help to not introduce a regression with this series yes. 
> > > But it wouldn't fix existing bugs when io_uring is used to access 
> > > read or write methods that use in_compat_syscall().  One example 
> > > that I recently ran into is drivers/scsi/sg.c.
> 
> Ah, so reading /dev/input/event* would suffer from the same issue, and 
> that one would in fact be broken by your patch in the hypothetical case 
> that someone tried to use io_uring to read /dev/input/event on x32...
> 
> For reference, I checked the socket timestamp handling that has a number 
> of corner cases with time32/time64 formats in compat mode, but none of 
> those appear to be affected by the problem.
> 
> > Aside from the potentially nasty use of per-task variables, one thing 
> > I don't like about PF_FORCE_COMPAT is that it's one-way.  If we're 
> > going to have a generic mechanism for this, shouldn't we allow a full 
> > override of the syscall arch instead of just allowing forcing compat 
> > so that a compat syscall can do a non-compat operation?
> 
> The only reason it's needed here is that the caller is in a kernel 
> thread rather than a system call. Are there any possible scenarios where 
> one would actually need the opposite?
> 

Quite possibly. The ext4 vs. compat getdents bug is still unresolved. 
Please see,
https://lore.kernel.org/lkml/CAFEAcA9W+JK7_TrtTnL1P2ES1knNPJX9wcUvhfLwxLq9augq1w@mail.gmail.com/

>        Arnd
> 

^ permalink raw reply

* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Al Viro @ 2020-09-19 22:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-aio, linux-mips, David Howells, linux-mm, keyrings,
	sparclinux, linux-arch, linux-s390, linux-scsi, x86,
	Arnd Bergmann, linux-block, io-uring, linux-arm-kernel,
	Jens Axboe, linux-parisc, netdev, linux-kernel,
	linux-security-module, linux-fsdevel, Andrew Morton, linuxppc-dev
In-Reply-To: <20200918151615.GA23432@lst.de>

On Fri, Sep 18, 2020 at 05:16:15PM +0200, Christoph Hellwig wrote:
> On Fri, Sep 18, 2020 at 02:58:22PM +0100, Al Viro wrote:
> > Said that, why not provide a variant that would take an explicit
> > "is it compat" argument and use it there?  And have the normal
> > one pass in_compat_syscall() to that...
> 
> That would help to not introduce a regression with this series yes.
> But it wouldn't fix existing bugs when io_uring is used to access
> read or write methods that use in_compat_syscall().  One example that
> I recently ran into is drivers/scsi/sg.c.

So screw such read/write methods - don't use them with io_uring.
That, BTW, is one of the reasons I'm sceptical about burying the
decisions deep into the callchain - we don't _want_ different
data layouts on read/write depending upon the 32bit vs. 64bit
caller, let alone the pointer-chasing garbage that is /dev/sg.

^ permalink raw reply

* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Andy Lutomirski @ 2020-09-19 22:22 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-aio, open list:MIPS, David Howells, Linux-MM, keyrings,
	sparclinux, Christoph Hellwig, linux-arch, linux-s390,
	Linux SCSI List, X86 ML, linux-block, Al Viro, Andy Lutomirski,
	io-uring, linux-arm-kernel, Jens Axboe, Parisc List,
	Network Development, LKML, LSM List, Linux FS Devel,
	Andrew Morton, linuxppc-dev
In-Reply-To: <CAK8P3a2Mi+1yttyGk4k7HxRVrMtmFqJewouVhynqUL0PJycmog@mail.gmail.com>


> On Sep 19, 2020, at 2:16 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> 
> On Sat, Sep 19, 2020 at 6:21 PM Andy Lutomirski <luto@kernel.org> wrote:
>>> On Fri, Sep 18, 2020 at 8:16 AM Christoph Hellwig <hch@lst.de> wrote:
>>> On Fri, Sep 18, 2020 at 02:58:22PM +0100, Al Viro wrote:
>>>> Said that, why not provide a variant that would take an explicit
>>>> "is it compat" argument and use it there?  And have the normal
>>>> one pass in_compat_syscall() to that...
>>> 
>>> That would help to not introduce a regression with this series yes.
>>> But it wouldn't fix existing bugs when io_uring is used to access
>>> read or write methods that use in_compat_syscall().  One example that
>>> I recently ran into is drivers/scsi/sg.c.
> 
> Ah, so reading /dev/input/event* would suffer from the same issue,
> and that one would in fact be broken by your patch in the hypothetical
> case that someone tried to use io_uring to read /dev/input/event on x32...
> 
> For reference, I checked the socket timestamp handling that has a
> number of corner cases with time32/time64 formats in compat mode,
> but none of those appear to be affected by the problem.
> 
>> Aside from the potentially nasty use of per-task variables, one thing
>> I don't like about PF_FORCE_COMPAT is that it's one-way.  If we're
>> going to have a generic mechanism for this, shouldn't we allow a full
>> override of the syscall arch instead of just allowing forcing compat
>> so that a compat syscall can do a non-compat operation?
> 
> The only reason it's needed here is that the caller is in a kernel
> thread rather than a system call. Are there any possible scenarios
> where one would actually need the opposite?
> 

I can certainly imagine needing to force x32 mode from a kernel thread.

As for the other direction: what exactly are the desired bitness/arch semantics of io_uring?  Is the operation bitness chosen by the io_uring creation or by the io_uring_enter() bitness?

^ permalink raw reply

* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Andy Lutomirski @ 2020-09-19 22:23 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-aio, linux-mips, David Howells, linux-mm, keyrings,
	sparclinux, Christoph Hellwig, linux-arch, linux-s390, linux-scsi,
	x86, Arnd Bergmann, linux-block, io-uring, linux-arm-kernel,
	Jens Axboe, linux-parisc, netdev, linux-kernel,
	linux-security-module, linux-fsdevel, Andrew Morton, linuxppc-dev
In-Reply-To: <20200919220920.GI3421308@ZenIV.linux.org.uk>


> On Sep 19, 2020, at 3:09 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> 
> On Fri, Sep 18, 2020 at 05:16:15PM +0200, Christoph Hellwig wrote:
>>> On Fri, Sep 18, 2020 at 02:58:22PM +0100, Al Viro wrote:
>>> Said that, why not provide a variant that would take an explicit
>>> "is it compat" argument and use it there?  And have the normal
>>> one pass in_compat_syscall() to that...
>> 
>> That would help to not introduce a regression with this series yes.
>> But it wouldn't fix existing bugs when io_uring is used to access
>> read or write methods that use in_compat_syscall().  One example that
>> I recently ran into is drivers/scsi/sg.c.
> 
> So screw such read/write methods - don't use them with io_uring.
> That, BTW, is one of the reasons I'm sceptical about burying the
> decisions deep into the callchain - we don't _want_ different
> data layouts on read/write depending upon the 32bit vs. 64bit
> caller, let alone the pointer-chasing garbage that is /dev/sg.

Well, we could remove in_compat_syscall(), etc and instead have an implicit parameter in DEFINE_SYSCALL.  Then everything would have to be explicit.  This would probably be a win, although it could be quite a bit of work.

^ permalink raw reply

* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Al Viro @ 2020-09-19 22:41 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-aio, linux-mips, David Howells, linux-mm, keyrings,
	sparclinux, Christoph Hellwig, linux-arch, linux-s390, linux-scsi,
	x86, Arnd Bergmann, linux-block, io-uring, linux-arm-kernel,
	Jens Axboe, linux-parisc, netdev, linux-kernel,
	linux-security-module, linux-fsdevel, Andrew Morton, linuxppc-dev
In-Reply-To: <AA2CFC7E-E685-4596-84AD-0E314137B93F@amacapital.net>

On Sat, Sep 19, 2020 at 03:23:54PM -0700, Andy Lutomirski wrote:
> 
> > On Sep 19, 2020, at 3:09 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > 
> > On Fri, Sep 18, 2020 at 05:16:15PM +0200, Christoph Hellwig wrote:
> >>> On Fri, Sep 18, 2020 at 02:58:22PM +0100, Al Viro wrote:
> >>> Said that, why not provide a variant that would take an explicit
> >>> "is it compat" argument and use it there?  And have the normal
> >>> one pass in_compat_syscall() to that...
> >> 
> >> That would help to not introduce a regression with this series yes.
> >> But it wouldn't fix existing bugs when io_uring is used to access
> >> read or write methods that use in_compat_syscall().  One example that
> >> I recently ran into is drivers/scsi/sg.c.
> > 
> > So screw such read/write methods - don't use them with io_uring.
> > That, BTW, is one of the reasons I'm sceptical about burying the
> > decisions deep into the callchain - we don't _want_ different
> > data layouts on read/write depending upon the 32bit vs. 64bit
> > caller, let alone the pointer-chasing garbage that is /dev/sg.
> 
> Well, we could remove in_compat_syscall(), etc and instead have an implicit parameter in DEFINE_SYSCALL.  Then everything would have to be explicit.  This would probably be a win, although it could be quite a bit of work.

It would not be a win - most of the syscalls don't give a damn
about 32bit vs. 64bit...

^ permalink raw reply

* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Andy Lutomirski @ 2020-09-19 22:53 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-aio, linux-mips, David Howells, linux-mm, keyrings,
	sparclinux, Christoph Hellwig, linux-arch, linux-s390, linux-scsi,
	x86, Arnd Bergmann, linux-block, io-uring, linux-arm-kernel,
	Jens Axboe, linux-parisc, netdev, linux-kernel,
	linux-security-module, linux-fsdevel, Andrew Morton, linuxppc-dev
In-Reply-To: <20200919224122.GJ3421308@ZenIV.linux.org.uk>



> On Sep 19, 2020, at 3:41 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> 
> On Sat, Sep 19, 2020 at 03:23:54PM -0700, Andy Lutomirski wrote:
>> 
>>>> On Sep 19, 2020, at 3:09 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>> 
>>> On Fri, Sep 18, 2020 at 05:16:15PM +0200, Christoph Hellwig wrote:
>>>>> On Fri, Sep 18, 2020 at 02:58:22PM +0100, Al Viro wrote:
>>>>> Said that, why not provide a variant that would take an explicit
>>>>> "is it compat" argument and use it there?  And have the normal
>>>>> one pass in_compat_syscall() to that...
>>>> 
>>>> That would help to not introduce a regression with this series yes.
>>>> But it wouldn't fix existing bugs when io_uring is used to access
>>>> read or write methods that use in_compat_syscall().  One example that
>>>> I recently ran into is drivers/scsi/sg.c.
>>> 
>>> So screw such read/write methods - don't use them with io_uring.
>>> That, BTW, is one of the reasons I'm sceptical about burying the
>>> decisions deep into the callchain - we don't _want_ different
>>> data layouts on read/write depending upon the 32bit vs. 64bit
>>> caller, let alone the pointer-chasing garbage that is /dev/sg.
>> 
>> Well, we could remove in_compat_syscall(), etc and instead have an implicit parameter in DEFINE_SYSCALL.  Then everything would have to be explicit.  This would probably be a win, although it could be quite a bit of work.
> 
> It would not be a win - most of the syscalls don't give a damn
> about 32bit vs. 64bit...

Any reasonable implementation would optimize it out for syscalls that don’t care.  Or it could be explicit:

DEFINE_MULTIARCH_SYSCALL(...)

^ permalink raw reply

* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Al Viro @ 2020-09-19 23:24 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-aio, linux-mips, David Howells, linux-mm, keyrings,
	sparclinux, Christoph Hellwig, linux-arch, linux-s390, linux-scsi,
	x86, Arnd Bergmann, linux-block, io-uring, linux-arm-kernel,
	Jens Axboe, linux-parisc, netdev, linux-kernel,
	linux-security-module, linux-fsdevel, Andrew Morton, linuxppc-dev
In-Reply-To: <36CF3DE7-7B4B-41FD-9818-FDF8A5B440FB@amacapital.net>

On Sat, Sep 19, 2020 at 03:53:40PM -0700, Andy Lutomirski wrote:

> > It would not be a win - most of the syscalls don't give a damn
> > about 32bit vs. 64bit...
> 
> Any reasonable implementation would optimize it out for syscalls that don’t care.  Or it could be explicit:
> 
> DEFINE_MULTIARCH_SYSCALL(...)

1) what would that look like?
2) have you counted the syscalls that do and do not need that?
3) how many of those realistically *can* be unified with their
compat counterparts?  [hint: ioctl(2) cannot]

^ permalink raw reply

* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Andy Lutomirski @ 2020-09-20  0:14 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-aio, open list:MIPS, David Howells, Linux-MM, keyrings,
	sparclinux, Christoph Hellwig, linux-arch, linux-s390,
	Linux SCSI List, X86 ML, Arnd Bergmann, linux-block, io-uring,
	linux-arm-kernel, Jens Axboe, Parisc List, Network Development,
	LKML, LSM List, Linux FS Devel, Andrew Morton, linuxppc-dev
In-Reply-To: <20200919232411.GK3421308@ZenIV.linux.org.uk>

On Sat, Sep 19, 2020 at 4:24 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Sat, Sep 19, 2020 at 03:53:40PM -0700, Andy Lutomirski wrote:
>
> > > It would not be a win - most of the syscalls don't give a damn
> > > about 32bit vs. 64bit...
> >
> > Any reasonable implementation would optimize it out for syscalls that don’t care.  Or it could be explicit:
> >
> > DEFINE_MULTIARCH_SYSCALL(...)
>
> 1) what would that look like?

In effect, it would work like this:

/* Arch-specific, but there's a generic case for sane architectures. */
enum syscall_arch {
  SYSCALL_NATIVE,
  SYSCALL_COMPAT,
  SYSCALL_X32,
};

DEFINE_MULTIARCH_SYSCALLn(args, arch)
{
  args are the args here, and arch is the arch.
}

> 2) have you counted the syscalls that do and do not need that?

No.

> 3) how many of those realistically *can* be unified with their
> compat counterparts?  [hint: ioctl(2) cannot]

There would be no requirement to unify anything.  The idea is that
we'd get rid of all the global state flags.

For ioctl, we'd have a new file_operation:

long ioctl(struct file *, unsigned int, unsigned long, enum syscall_arch);

I'm not saying this is easy, but I think it's possible and the result
would be more obviously correct than what we have now.

^ permalink raw reply

* [PATCH v1 1/2] dt-bindings: usb: dwc2: add support for APM82181 SoCs USB OTG HS and FS
From: Christian Lamparter @ 2020-09-20  0:18 UTC (permalink / raw)
  To: linuxppc-dev, devicetree, linux-usb
  Cc: Minas Harutyunyan, Greg Kroah-Hartman, Rob Herring

adds the specific compatible string for the DWC2 IP found in the APM82181
SoCs. The APM82181's USB-OTG seems like it was taken from its direct
predecessor: the PPC460EX (canyonlands).

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
---
 Documentation/devicetree/bindings/usb/dwc2.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/usb/dwc2.yaml b/Documentation/devicetree/bindings/usb/dwc2.yaml
index ffa157a0fce7..34ddb5c877a1 100644
--- a/Documentation/devicetree/bindings/usb/dwc2.yaml
+++ b/Documentation/devicetree/bindings/usb/dwc2.yaml
@@ -39,6 +39,7 @@ properties:
               - amlogic,meson-g12a-usb
           - const: snps,dwc2
       - const: amcc,dwc-otg
+      - const: apm,apm82181-dwc-otg
       - const: snps,dwc2
       - const: st,stm32f4x9-fsotg
       - const: st,stm32f4x9-hsotg
-- 
2.28.0


^ permalink raw reply related

* [PATCH v1 2/2] usb: dwc2: add support for APM82181 USB OTG
From: Christian Lamparter @ 2020-09-20  0:18 UTC (permalink / raw)
  To: linuxppc-dev, devicetree, linux-usb
  Cc: Minas Harutyunyan, Greg Kroah-Hartman, Rob Herring
In-Reply-To: <a43868b06566f5d959d8cfc4e763bede2885931a.1600560884.git.chunkeey@gmail.com>

adds the specific compatible string for the DWC2 IP found in the APM82181
SoCs. The IP is setup correctly through the auto detection... With the
exception of the AHB Burst Size. The default of GAHBCFG_HBSTLEN_INCR4 of
the "snps,dwc2" can cause a system hang when the USB and SATA is used
concurrently. Because the predecessor (PPC460EX (Canyonlands)) already
had the same problem, this SoC can make use of the existing
dwc2_set_amcc_params() function.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
---
 drivers/usb/dwc2/params.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/usb/dwc2/params.c b/drivers/usb/dwc2/params.c
index 8f9d061c4d5f..6d2b9a6c247c 100644
--- a/drivers/usb/dwc2/params.c
+++ b/drivers/usb/dwc2/params.c
@@ -210,6 +210,7 @@ const struct of_device_id dwc2_of_match_table[] = {
 	{ .compatible = "amlogic,meson-g12a-usb",
 	  .data = dwc2_set_amlogic_g12a_params },
 	{ .compatible = "amcc,dwc-otg", .data = dwc2_set_amcc_params },
+	{ .compatible = "apm,apm82181-dwc-otg", .data = dwc2_set_amcc_params },
 	{ .compatible = "st,stm32f4x9-fsotg",
 	  .data = dwc2_set_stm32f4x9_fsotg_params },
 	{ .compatible = "st,stm32f4x9-hsotg" },
-- 
2.28.0


^ permalink raw reply related

* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Al Viro @ 2020-09-20  2:57 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-aio, open list:MIPS, David Howells, Linux-MM, keyrings,
	sparclinux, Christoph Hellwig, linux-arch, linux-s390,
	Linux SCSI List, X86 ML, Arnd Bergmann, linux-block, io-uring,
	linux-arm-kernel, Jens Axboe, Parisc List, Network Development,
	LKML, LSM List, Linux FS Devel, Andrew Morton, linuxppc-dev
In-Reply-To: <CALCETrViwOdFia_aX4p4riE8aqop1zoOqVfiQtSAZEzheC+Ozg@mail.gmail.com>

On Sat, Sep 19, 2020 at 05:14:41PM -0700, Andy Lutomirski wrote:

> > 2) have you counted the syscalls that do and do not need that?
> 
> No.

Might be illuminating...

> > 3) how many of those realistically *can* be unified with their
> > compat counterparts?  [hint: ioctl(2) cannot]
> 
> There would be no requirement to unify anything.  The idea is that
> we'd get rid of all the global state flags.

_What_ global state flags?  When you have separate SYSCALL_DEFINE3(ioctl...)
and COMPAT_SYSCALL_DEFINE3(ioctl...), there's no flags at all, global or
local.  They only come into the play when you try to share the same function
for both, right on the top level.

> For ioctl, we'd have a new file_operation:
> 
> long ioctl(struct file *, unsigned int, unsigned long, enum syscall_arch);
> 
> I'm not saying this is easy, but I think it's possible and the result
> would be more obviously correct than what we have now.

No, it would not.  Seriously, from time to time a bit of RTFS before grand
proposals turns out to be useful.

^ permalink raw reply

* Re: [PATCH v2] powerpc/tm: Save and restore AMR on treclaim and trechkpt
From: Aneesh Kumar K.V @ 2020-09-20  5:56 UTC (permalink / raw)
  To: Gustavo Romero, linuxppc-dev; +Cc: mikey, gromero
In-Reply-To: <20200919150025.9609-1-gromero@linux.ibm.com>

Gustavo Romero <gromero@linux.ibm.com> writes:

> Althought AMR is stashed in the checkpoint area, currently we don't save
> it to the per thread checkpoint struct after a treclaim and so we don't
> restore it either from that struct when we trechkpt. As a consequence when
> the transaction is later rolled back the kernel space AMR value when the
> trechkpt was done appears in userspace.
>
> That commit saves and restores AMR accordingly on treclaim and trechkpt.
> Since AMR value is also used in kernel space in other functions, it also
> takes care of stashing kernel live AMR into the stack before treclaim and
> before trechkpt, restoring it later, just before returning from tm_reclaim
> and __tm_recheckpoint.
>
> Is also fixes two nonrelated comments about CR and MSR.
>

Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

> Signed-off-by: Gustavo Romero <gromero@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/processor.h |  1 +
>  arch/powerpc/kernel/asm-offsets.c    |  1 +
>  arch/powerpc/kernel/tm.S             | 35 ++++++++++++++++++++++++----
>  3 files changed, 33 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> index ed0d633ab5aa..9f4f6cc033ac 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -220,6 +220,7 @@ struct thread_struct {
>  	unsigned long	tm_tar;
>  	unsigned long	tm_ppr;
>  	unsigned long	tm_dscr;
> +	unsigned long   tm_amr;
>  
>  	/*
>  	 * Checkpointed FP and VSX 0-31 register set.
> diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
> index 8711c2164b45..c2722ff36e98 100644
> --- a/arch/powerpc/kernel/asm-offsets.c
> +++ b/arch/powerpc/kernel/asm-offsets.c
> @@ -176,6 +176,7 @@ int main(void)
>  	OFFSET(THREAD_TM_TAR, thread_struct, tm_tar);
>  	OFFSET(THREAD_TM_PPR, thread_struct, tm_ppr);
>  	OFFSET(THREAD_TM_DSCR, thread_struct, tm_dscr);
> +	OFFSET(THREAD_TM_AMR, thread_struct, tm_amr);
>  	OFFSET(PT_CKPT_REGS, thread_struct, ckpt_regs);
>  	OFFSET(THREAD_CKVRSTATE, thread_struct, ckvr_state.vr);
>  	OFFSET(THREAD_CKVRSAVE, thread_struct, ckvrsave);
> diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S
> index 6ba0fdd1e7f8..2b91f233b05d 100644
> --- a/arch/powerpc/kernel/tm.S
> +++ b/arch/powerpc/kernel/tm.S
> @@ -122,6 +122,13 @@ _GLOBAL(tm_reclaim)
>  	std	r3, STK_PARAM(R3)(r1)
>  	SAVE_NVGPRS(r1)
>  
> +	/*
> +	 * Save kernel live AMR since it will be clobbered by treclaim
> +	 * but can be used elsewhere later in kernel space.
> +	 */
> +	mfspr	r3, SPRN_AMR
> +	std	r3, TM_FRAME_L1(r1)
> +
>  	/* We need to setup MSR for VSX register save instructions. */
>  	mfmsr	r14
>  	mr	r15, r14
> @@ -245,7 +252,7 @@ _GLOBAL(tm_reclaim)
>  	 * but is used in signal return to 'wind back' to the abort handler.
>  	 */
>  
> -	/* ******************** CR,LR,CCR,MSR ********** */
> +	/* ***************** CTR, LR, CR, XER ********** */
>  	mfctr	r3
>  	mflr	r4
>  	mfcr	r5
> @@ -256,7 +263,6 @@ _GLOBAL(tm_reclaim)
>  	std	r5, _CCR(r7)
>  	std	r6, _XER(r7)
>  
> -
>  	/* ******************** TAR, DSCR ********** */
>  	mfspr	r3, SPRN_TAR
>  	mfspr	r4, SPRN_DSCR
> @@ -264,6 +270,10 @@ _GLOBAL(tm_reclaim)
>  	std	r3, THREAD_TM_TAR(r12)
>  	std	r4, THREAD_TM_DSCR(r12)
>  
> +        /* ******************** AMR **************** */
> +        mfspr	r3, SPRN_AMR
> +        std	r3, THREAD_TM_AMR(r12)
> +
>  	/*
>  	 * MSR and flags: We don't change CRs, and we don't need to alter MSR.
>  	 */
> @@ -308,7 +318,9 @@ _GLOBAL(tm_reclaim)
>  	std	r3, THREAD_TM_TFHAR(r12)
>  	std	r4, THREAD_TM_TFIAR(r12)
>  
> -	/* AMR is checkpointed too, but is unsupported by Linux. */
> +	/* Restore kernel live AMR */
> +	ld	r8, TM_FRAME_L1(r1)
> +	mtspr	SPRN_AMR, r8
>  
>  	/* Restore original MSR/IRQ state & clear TM mode */
>  	ld	r14, TM_FRAME_L0(r1)		/* Orig MSR */
> @@ -355,6 +367,13 @@ _GLOBAL(__tm_recheckpoint)
>  	 */
>  	SAVE_NVGPRS(r1)
>  
> +	/*
> +	 * Save kernel live AMR since it will be clobbered for trechkpt
> +	 * but can be used elsewhere later in kernel space.
> +	 */
> +	mfspr	r8, SPRN_AMR
> +	std	r8, TM_FRAME_L0(r1)
> +
>  	/* Load complete register state from ts_ckpt* registers */
>  
>  	addi	r7, r3, PT_CKPT_REGS		/* Thread's ckpt_regs */
> @@ -404,7 +423,7 @@ _GLOBAL(__tm_recheckpoint)
>  
>  restore_gprs:
>  
> -	/* ******************** CR,LR,CCR,MSR ********** */
> +	/* ****************** CTR, LR, XER ************* */
>  	ld	r4, _CTR(r7)
>  	ld	r5, _LINK(r7)
>  	ld	r8, _XER(r7)
> @@ -417,6 +436,10 @@ restore_gprs:
>  	ld	r4, THREAD_TM_TAR(r3)
>  	mtspr	SPRN_TAR,	r4
>  
> +	/* ******************** AMR ******************** */
> +	ld	r4, THREAD_TM_AMR(r3)
> +	mtspr	SPRN_AMR, r4
> +
>  	/* Load up the PPR and DSCR in GPRs only at this stage */
>  	ld	r5, THREAD_TM_DSCR(r3)
>  	ld	r6, THREAD_TM_PPR(r3)
> @@ -509,6 +532,10 @@ restore_gprs:
>  	li	r4, MSR_RI
>  	mtmsrd	r4, 1
>  
> +	/* Restore kernel live AMR */
> +	ld	r8, TM_FRAME_L0(r1)
> +	mtspr	SPRN_AMR, r8
> +
>  	REST_NVGPRS(r1)
>  
>  	addi    r1, r1, TM_FRAME_SIZE
> -- 
> 2.25.1

^ permalink raw reply

* Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
From: Thomas Gleixner @ 2020-09-20  6:23 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Juri Lelli, Peter Zijlstra, Sebastian Andrzej Siewior,
	Joonas Lahtinen, dri-devel, linux-mips, Ben Segall, Max Filippov,
	Guo Ren, sparclinux, Vincent Chen, Will Deacon, Ard Biesheuvel,
	open list:GENERIC INCLUDE/A..., Vincent Guittot, Herbert Xu,
	X86 ML, Russell King, linux-csky, David Airlie, Mel Gorman, arcml,
	linux-xtensa, Paul McKenney, intel-gfx, linuxppc-dev,
	Steven Rostedt, Linus Torvalds, Jani Nikula, Rodrigo Vivi,
	Dietmar Eggemann, Linux ARM, Chris Zankel, Michal Simek,
	Thomas Bogendoerfer, Nick Hu, Linux-MM, Vineet Gupta, LKML,
	Arnd Bergmann, Paul Mackerras, Andrew Morton,
	Daniel Bristot de Oliveira, David S. Miller, Greentime Hu
In-Reply-To: <CAKMK7uENFDANQKebS_H0bhHeQRijrp1aVHQqyZPute3KBZ+fVQ@mail.gmail.com>

On Sat, Sep 19 2020 at 12:37, Daniel Vetter wrote:
> On Sat, Sep 19, 2020 at 12:35 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>> I think it should be the case, but I want to double check: Will
>> copy_*_user be allowed within a kmap_temporary section? This would
>> allow us to ditch an absolute pile of slowpaths.
>
> (coffee just kicked in) copy_*_user is ofc allowed, but if you hit a
> page fault you get a short read/write. This looks like it would remove
> the need to handle these in a slowpath, since page faults can now be
> served in this new kmap_temporary sections. But this sounds too good
> to be true, so I'm wondering what I'm missing.

In principle we could allow pagefaults, but not with the currently
proposed interface which can be called from any context. Obviously if
called from atomic context it can't handle user page faults.

In theory we could make a variant which does not disable pagefaults, but
that's what kmap() already provides.

Thanks,

        tglx




^ permalink raw reply

* Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
From: Thomas Gleixner @ 2020-09-20  6:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Juri Lelli, Peter Zijlstra, Sebastian Andrzej Siewior,
	Joonas Lahtinen, dri-devel, linux-mips, Ben Segall, Max Filippov,
	Guo Ren, linux-sparc, Vincent Chen, Will Deacon, Ard Biesheuvel,
	linux-arch, Vincent Guittot, Herbert Xu, the arch/x86 maintainers,
	Russell King, linux-csky, David Airlie, Mel Gorman,
	open list:SYNOPSYS ARC ARCHITECTURE, linux-xtensa, Paul McKenney,
	intel-gfx, linuxppc-dev, Steven Rostedt, Jani Nikula,
	Rodrigo Vivi, Dietmar Eggemann, Linux ARM, Chris Zankel,
	Michal Simek, Thomas Bogendoerfer, Nick Hu, Linux-MM,
	Vineet Gupta, LKML, Arnd Bergmann, Daniel Vetter, Paul Mackerras,
	Andrew Morton, Daniel Bristot de Oliveira, David S. Miller,
	Greentime Hu
In-Reply-To: <CAHk-=wiYGyrFRbA1cc71D2-nc5U9LM9jUJesXGqpPnB7E4X1YQ@mail.gmail.com>

On Sat, Sep 19 2020 at 10:18, Linus Torvalds wrote:
> On Sat, Sep 19, 2020 at 2:50 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> this provides a preemptible variant of kmap_atomic & related
>> interfaces. This is achieved by:
>
> Ack. This looks really nice, even apart from the new capability.
>
> The only thing I really reacted to is that the name doesn't make sense
> to me: "kmap_temporary()" seems a bit odd.

Yeah. Couldn't come up with something useful.

> Particularly for an interface that really is basically meant as a
> better replacement of "kmap_atomic()" (but is perhaps also a better
> replacement for "kmap()").
>
> I think I understand how the name came about: I think the "temporary"
> is there as a distinction from the "longterm" regular kmap(). So I
> think it makes some sense from an internal implementation angle, but I
> don't think it makes a lot of sense from an interface name.
>
> I don't know what might be a better name, but if we want to emphasize
> that it's thread-private and a one-off, maybe "local" would be a
> better naming, and make it distinct from the "global" nature of the
> old kmap() interface?

Right, _local or _thread would be more intuitive.

> However, another solution might be to just use this new preemptible
> "local" kmap(), and remove the old global one entirely. Yes, the old
> global one caches the page table mapping and that sounds really
> efficient and nice. But it's actually horribly horribly bad, because
> it means that we need to use locking for them. Your new "temporary"
> implementation seems to be fundamentally better locking-wise, and only
> need preemption disabling as locking (and is equally fast for the
> non-highmem case).
>
> So I wonder if the single-page TLB flush isn't a better model, and
> whether it wouldn't be a lot simpler to just get rid of the old
> complex kmap() entirely, and replace it with this?
>
> I agree we can't replace the kmap_atomic() version, because maybe
> people depend on the preemption disabling it also implied. But what
> about replacing the non-atomic kmap()?
>
> Maybe I've missed something.  Is it because the new interface still
> does "pagefault_disable()" perhaps?
>
> But does it even need the pagefault_disable() at all? Yes, the
> *atomic* one obviously needed it. But why does this new one need to
> disable page faults?

It disables pagefaults because it can be called from atomic and
non-atomic context. That was the point to get rid of

         if (preeemptible())
         	kmap();
         else
                kmap_atomic();

If it does not disable pagefaults, then it's just a lightweight variant
of kmap() for short lived mappings.

> But apart from that question about naming (and perhaps replacing
> kmap() entirely), I very much like it.

I thought about it, but then I figured that kmap pointers can be
handed to other contexts from the thread which sets up the mapping
because it's 'permanent'.

I'm not sure whether that actually happens, so we'd need to audit all
kmap() users to be sure. If there is no such use case, then we surely
can get of rid of kmap() completely. It's only 300+ instances to stare
at and quite some of them are wrapped into other functions.

Highmem sucks no matter what and the only sane solution is to remove it
completely.

Thanks,

        tglx


^ permalink raw reply

* Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
From: Daniel Vetter @ 2020-09-20  8:23 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Juri Lelli, Peter Zijlstra, Sebastian Andrzej Siewior,
	Joonas Lahtinen, dri-devel, linux-mips, Ben Segall, Max Filippov,
	Guo Ren, sparclinux, Vincent Chen, Will Deacon, Ard Biesheuvel,
	open list:GENERIC INCLUDE/A..., Vincent Guittot, Herbert Xu,
	X86 ML, Russell King, linux-csky, David Airlie, Mel Gorman, arcml,
	linux-xtensa, Paul McKenney, intel-gfx, linuxppc-dev,
	Steven Rostedt, Linus Torvalds, Jani Nikula, Rodrigo Vivi,
	Dietmar Eggemann, Linux ARM, Chris Zankel, Michal Simek,
	Thomas Bogendoerfer, Nick Hu, Linux-MM, Vineet Gupta, LKML,
	Arnd Bergmann, Daniel Vetter, Paul Mackerras, Andrew Morton,
	Daniel Bristot de Oliveira, David S. Miller, Greentime Hu
In-Reply-To: <87pn6hc6g1.fsf@nanos.tec.linutronix.de>

On Sun, Sep 20, 2020 at 08:23:26AM +0200, Thomas Gleixner wrote:
> On Sat, Sep 19 2020 at 12:37, Daniel Vetter wrote:
> > On Sat, Sep 19, 2020 at 12:35 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> >> I think it should be the case, but I want to double check: Will
> >> copy_*_user be allowed within a kmap_temporary section? This would
> >> allow us to ditch an absolute pile of slowpaths.
> >
> > (coffee just kicked in) copy_*_user is ofc allowed, but if you hit a
> > page fault you get a short read/write. This looks like it would remove
> > the need to handle these in a slowpath, since page faults can now be
> > served in this new kmap_temporary sections. But this sounds too good
> > to be true, so I'm wondering what I'm missing.
> 
> In principle we could allow pagefaults, but not with the currently
> proposed interface which can be called from any context. Obviously if
> called from atomic context it can't handle user page faults.
 
Yeah that's clear, but does the implemention need to disable pagefaults
unconditionally?

> In theory we could make a variant which does not disable pagefaults, but
> that's what kmap() already provides.

Currently we have a bunch of code which roughly does

	kmap_atomic();
	copy_*_user();
	kunmap_atomic();

	if (short_copy_user) {
		kmap();
		copy_*_user(remaining_stuff);
		kunmap();
	}

And the only reason is that kmap is a notch slower, hence the fastpath. If
we get a kmap which is fast and allows pagefaults (only in contexts that
allow pagefaults already ofc) then we can ditch a pile of kmap users.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply

* Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
From: Thomas Gleixner @ 2020-09-20  8:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Juri Lelli, Peter Zijlstra, Sebastian Andrzej Siewior,
	Joonas Lahtinen, dri-devel, linux-mips, Ben Segall, Max Filippov,
	Guo Ren, linux-sparc, Vincent Chen, Will Deacon, Ard Biesheuvel,
	linux-arch, Vincent Guittot, Herbert Xu, the arch/x86 maintainers,
	Russell King, linux-csky, David Airlie, Mel Gorman,
	open list:SYNOPSYS ARC ARCHITECTURE, linux-xtensa, Paul McKenney,
	intel-gfx, linuxppc-dev, Steven Rostedt, Jani Nikula,
	Rodrigo Vivi, Dietmar Eggemann, Linux ARM, Chris Zankel,
	Michal Simek, Thomas Bogendoerfer, Nick Hu, Linux-MM,
	Vineet Gupta, LKML, Arnd Bergmann, Daniel Vetter, Paul Mackerras,
	Andrew Morton, Daniel Bristot de Oliveira, David S. Miller,
	Greentime Hu
In-Reply-To: <87mu1lc5mp.fsf@nanos.tec.linutronix.de>

On Sun, Sep 20 2020 at 08:41, Thomas Gleixner wrote:
> On Sat, Sep 19 2020 at 10:18, Linus Torvalds wrote:
>> Maybe I've missed something.  Is it because the new interface still
>> does "pagefault_disable()" perhaps?
>>
>> But does it even need the pagefault_disable() at all? Yes, the
>> *atomic* one obviously needed it. But why does this new one need to
>> disable page faults?
>
> It disables pagefaults because it can be called from atomic and
> non-atomic context. That was the point to get rid of
>
>          if (preeemptible())
>          	kmap();
>          else
>                 kmap_atomic();
>
> If it does not disable pagefaults, then it's just a lightweight variant
> of kmap() for short lived mappings.

Actually most usage sites of kmap atomic do not need page faults to be
disabled at all. As Daniel pointed out the implicit pagefault disable
enforces copy_from_user_inatomic() even in places which are clearly
preemptible task context.

As we need to look at each instance anyway we can add the PF disable
explicitely to the very few places which actually need it.

Thanks,

        tglx


^ permalink raw reply

* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Arnd Bergmann @ 2020-09-20 13:55 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-aio, open list:BROADCOM NVRAM DRIVER, David Howells,
	Linux-MM, keyrings, sparclinux, Christoph Hellwig, linux-arch,
	linux-s390, linux-scsi, the arch/x86 maintainers, linux-block,
	io-uring, Linux ARM, Jens Axboe, Parisc List, Networking,
	linux-kernel@vger.kernel.org, LSM List,
	Linux FS-devel Mailing List, Andrew Morton, linuxppc-dev
In-Reply-To: <20200919220920.GI3421308@ZenIV.linux.org.uk>

On Sun, Sep 20, 2020 at 12:09 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Fri, Sep 18, 2020 at 05:16:15PM +0200, Christoph Hellwig wrote:
> > On Fri, Sep 18, 2020 at 02:58:22PM +0100, Al Viro wrote:
> > > Said that, why not provide a variant that would take an explicit
> > > "is it compat" argument and use it there?  And have the normal
> > > one pass in_compat_syscall() to that...
> >
> > That would help to not introduce a regression with this series yes.
> > But it wouldn't fix existing bugs when io_uring is used to access
> > read or write methods that use in_compat_syscall().  One example that
> > I recently ran into is drivers/scsi/sg.c.
>
> So screw such read/write methods - don't use them with io_uring.
> That, BTW, is one of the reasons I'm sceptical about burying the
> decisions deep into the callchain - we don't _want_ different
> data layouts on read/write depending upon the 32bit vs. 64bit
> caller, let alone the pointer-chasing garbage that is /dev/sg.

Would it be too late to limit what kind of file descriptors we allow
io_uring to read/write on?

If io_uring can get changed to return -EINVAL on trying to
read/write something other than S_IFREG file descriptors,
that particular problem space gets a lot simpler, but this
is of course only possible if nobody actually relies on it yet.

      Arnd

^ permalink raw reply

* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Al Viro @ 2020-09-20 15:02 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-aio, open list:BROADCOM NVRAM DRIVER, David Howells,
	Linux-MM, keyrings, sparclinux, Christoph Hellwig, linux-arch,
	linux-s390, linux-scsi, the arch/x86 maintainers, linux-block,
	io-uring, Linux ARM, Jens Axboe, Parisc List, Networking,
	linux-kernel@vger.kernel.org, LSM List,
	Linux FS-devel Mailing List, Andrew Morton, linuxppc-dev
In-Reply-To: <CAK8P3a3QApj3isPu3TkLahArsfb5jaABb7DJ7EKMxey1T1YPbA@mail.gmail.com>

On Sun, Sep 20, 2020 at 03:55:47PM +0200, Arnd Bergmann wrote:
> On Sun, Sep 20, 2020 at 12:09 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > On Fri, Sep 18, 2020 at 05:16:15PM +0200, Christoph Hellwig wrote:
> > > On Fri, Sep 18, 2020 at 02:58:22PM +0100, Al Viro wrote:
> > > > Said that, why not provide a variant that would take an explicit
> > > > "is it compat" argument and use it there?  And have the normal
> > > > one pass in_compat_syscall() to that...
> > >
> > > That would help to not introduce a regression with this series yes.
> > > But it wouldn't fix existing bugs when io_uring is used to access
> > > read or write methods that use in_compat_syscall().  One example that
> > > I recently ran into is drivers/scsi/sg.c.
> >
> > So screw such read/write methods - don't use them with io_uring.
> > That, BTW, is one of the reasons I'm sceptical about burying the
> > decisions deep into the callchain - we don't _want_ different
> > data layouts on read/write depending upon the 32bit vs. 64bit
> > caller, let alone the pointer-chasing garbage that is /dev/sg.
> 
> Would it be too late to limit what kind of file descriptors we allow
> io_uring to read/write on?
> 
> If io_uring can get changed to return -EINVAL on trying to
> read/write something other than S_IFREG file descriptors,
> that particular problem space gets a lot simpler, but this
> is of course only possible if nobody actually relies on it yet.

S_IFREG is almost certainly too heavy as a restriction.  Looking through
the stuff sensitive to 32bit/64bit, we seem to have
	* /dev/sg - pointer-chasing horror
	* sysfs files for efivar - different layouts for compat and native,
shitty userland ABI design (
struct efi_variable {
        efi_char16_t  VariableName[EFI_VAR_NAME_LEN/sizeof(efi_char16_t)];
        efi_guid_t    VendorGuid;
        unsigned long DataSize;
        __u8          Data[1024];
        efi_status_t  Status;
        __u32         Attributes;
} __attribute__((packed));
) is the piece of crap in question; 'DataSize' is where the headache comes
from.  Regular files, BTW...
	* uhid - character device, milder pointer-chasing horror.  Trouble
comes from this:
/* Obsolete! Use UHID_CREATE2. */
struct uhid_create_req {
        __u8 name[128];
        __u8 phys[64];
        __u8 uniq[64];
        __u8 __user *rd_data;
        __u16 rd_size;

        __u16 bus;
        __u32 vendor;
        __u32 product;
        __u32 version;
        __u32 country;
} __attribute__((__packed__));
and suggested replacement doesn't do any pointer-chasing (rd_data is an
embedded array in the end of struct uhid_create2_req).
	* evdev, uinput - bitness-sensitive layout, due to timestamps
	* /proc/bus/input/devices - weird crap with printing bitmap, different
_text_ layouts seen by 32bit and 64bit readers.  Binary structures are PITA,
but with sufficient effort you can screw the text just as hard...  Oh, and it's
a regular file.
	* similar in sysfs analogue

And AFAICS, that's it for read/write-related method instances.

^ permalink raw reply

* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Matthew Wilcox @ 2020-09-20 15:15 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-aio, linux-mips, David Howells, linux-mm, keyrings,
	sparclinux, linux-arch, linux-s390, linux-scsi, x86,
	Arnd Bergmann, linux-block, Alexander Viro, io-uring,
	linux-arm-kernel, Jens Axboe, linux-parisc, netdev, linux-kernel,
	linux-security-module, linux-fsdevel, Andrew Morton, linuxppc-dev
In-Reply-To: <20200918124533.3487701-2-hch@lst.de>

On Fri, Sep 18, 2020 at 02:45:25PM +0200, Christoph Hellwig wrote:
> Add a flag to force processing a syscall as a compat syscall.  This is
> required so that in_compat_syscall() works for I/O submitted by io_uring
> helper threads on behalf of compat syscalls.

Al doesn't like this much, but my suggestion is to introduce two new
opcodes -- IORING_OP_READV32 and IORING_OP_WRITEV32.  The compat code
can translate IORING_OP_READV to IORING_OP_READV32 and then the core
code can know what that user pointer is pointing to.


^ permalink raw reply

* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Arnd Bergmann @ 2020-09-20 16:00 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-aio, open list:BROADCOM NVRAM DRIVER, David Howells,
	Linux-MM, keyrings, sparclinux, Christoph Hellwig, linux-arch,
	linux-s390, linux-scsi, the arch/x86 maintainers, linux-block,
	Alexander Viro, io-uring, Linux ARM, Jens Axboe, Parisc List,
	Networking, linux-kernel@vger.kernel.org, LSM List,
	Linux FS-devel Mailing List, Andrew Morton, linuxppc-dev
In-Reply-To: <20200920151510.GS32101@casper.infradead.org>

On Sun, Sep 20, 2020 at 5:15 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Fri, Sep 18, 2020 at 02:45:25PM +0200, Christoph Hellwig wrote:
> > Add a flag to force processing a syscall as a compat syscall.  This is
> > required so that in_compat_syscall() works for I/O submitted by io_uring
> > helper threads on behalf of compat syscalls.
>
> Al doesn't like this much, but my suggestion is to introduce two new
> opcodes -- IORING_OP_READV32 and IORING_OP_WRITEV32.  The compat code
> can translate IORING_OP_READV to IORING_OP_READV32 and then the core
> code can know what that user pointer is pointing to.

How is that different from the current approach of storing the ABI as
a flag in ctx->compat?

     Arnd

^ permalink raw reply

* Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag
From: Andy Lutomirski @ 2020-09-20 16:59 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-aio, open list:MIPS, David Howells, Linux-MM, keyrings,
	sparclinux, Christoph Hellwig, linux-arch, linux-s390,
	Linux SCSI List, X86 ML, Arnd Bergmann, linux-block,
	Andy Lutomirski, io-uring, linux-arm-kernel, Jens Axboe,
	Parisc List, Network Development, LKML, LSM List, Linux FS Devel,
	Andrew Morton, linuxppc-dev
In-Reply-To: <20200920025745.GL3421308@ZenIV.linux.org.uk>

On Sat, Sep 19, 2020 at 7:57 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Sat, Sep 19, 2020 at 05:14:41PM -0700, Andy Lutomirski wrote:
>
> > > 2) have you counted the syscalls that do and do not need that?
> >
> > No.
>
> Might be illuminating...
>
> > > 3) how many of those realistically *can* be unified with their
> > > compat counterparts?  [hint: ioctl(2) cannot]
> >
> > There would be no requirement to unify anything.  The idea is that
> > we'd get rid of all the global state flags.
>
> _What_ global state flags?  When you have separate SYSCALL_DEFINE3(ioctl...)
> and COMPAT_SYSCALL_DEFINE3(ioctl...), there's no flags at all, global or
> local.  They only come into the play when you try to share the same function
> for both, right on the top level.

...

>
> > For ioctl, we'd have a new file_operation:
> >
> > long ioctl(struct file *, unsigned int, unsigned long, enum syscall_arch);
> >
> > I'm not saying this is easy, but I think it's possible and the result
> > would be more obviously correct than what we have now.
>
> No, it would not.  Seriously, from time to time a bit of RTFS before grand
> proposals turns out to be useful.

As one example, look at __sys_setsockopt().  It's called for the
native and compat versions, and it contains an in_compat_syscall()
check.  (This particularly check looks dubious to me, but that's
another story.)  If this were to be done with equivalent semantics
without a separate COMPAT_DEFINE_SYSCALL and without
in_compat_syscall(), there would need to be some indication as to
whether this is compat or native setsockopt.  There are other
setsockopt implementations in the net stack with more
legitimate-seeming uses of in_compat_syscall() that would need some
other mechanism if in_compat_syscall() were to go away.

setsockopt is (I hope!) out of scope for io_uring, but the situation
isn't fundamentally different from read and write.

^ permalink raw reply

* Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
From: Linus Torvalds @ 2020-09-20 16:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Juri Lelli, Peter Zijlstra, Sebastian Andrzej Siewior,
	Joonas Lahtinen, dri-devel, linux-mips, Ben Segall, Max Filippov,
	Guo Ren, linux-sparc, Vincent Chen, Will Deacon, Ard Biesheuvel,
	linux-arch, Vincent Guittot, Herbert Xu, the arch/x86 maintainers,
	Russell King, linux-csky, David Airlie, Mel Gorman,
	open list:SYNOPSYS ARC ARCHITECTURE, linux-xtensa, Paul McKenney,
	intel-gfx, linuxppc-dev, Steven Rostedt, Jani Nikula,
	Rodrigo Vivi, Dietmar Eggemann, Linux ARM, Chris Zankel,
	Michal Simek, Thomas Bogendoerfer, Nick Hu, Linux-MM,
	Vineet Gupta, LKML, Arnd Bergmann, Daniel Vetter, Paul Mackerras,
	Andrew Morton, Daniel Bristot de Oliveira, David S. Miller,
	Greentime Hu
In-Reply-To: <87k0wode9a.fsf@nanos.tec.linutronix.de>

On Sun, Sep 20, 2020 at 1:49 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Actually most usage sites of kmap atomic do not need page faults to be
> disabled at all.

Right. I think the pagefault disabling has (almost) nothing at all to
do with the kmap() itself - it comes from the "atomic" part, not the
"kmap" part.

I say *almost*, because there is one issue that needs some thought:
the amount of kmap nesting.

The kmap_atomic() interface - and your local/temporary/whatever
versions of it - depends very much inherently on being strictly
nesting. In fact, it depends so much on it that maybe that should be
part of the new name?

It's very wrong to do

    addr1 = kmap_atomic();
    addr2 = kmap_atomic();
    ..do something with addr 1..
    kunmap_atomic(addr1);
    .. do something with addr 2..
    kunmap_atomic(addr2);

because the way we allocate the slots is by using a percpu-atomic
inc-return (and we deallocate using dec).

So it's fundamentally a stack.

And that's perfectly fine for page faults - if they do any kmaps,
those will obviously nest.

So the only issue with page faults might be that the stack grows
_larger_. And that might need some thought. We already make the kmap
stack bigger for CONFIG_DEBUG_HIGHMEM, and it's possibly that if we
allow page faults we need to make the kmap stack bigger still.

Btw, looking at the stack code, Ithink your new implementation of it
is a bit scary:

   static inline int kmap_atomic_idx_push(void)
   {
  -       int idx = __this_cpu_inc_return(__kmap_atomic_idx) - 1;
  +       int idx = current->kmap_ctrl.idx++;

and now that 'current->kmap_ctrl.idx' is not atomic wrt

 (a) NMI's (this may be ok, maybe we never do kmaps in NMIs, and with
nesting I think it's fine anyway - the NMI will undo whatever it did)

 (b) the prev/next switch

And that (b) part worries me. You do the kmap_switch_temporary() to
switch the entries, but you do that *separately* from actually
switching 'current' to the new value.

So kmap_switch_temporary() looks safe, but I don't think it actually
is. Because while it first unmaps the old entries and then remaps the
new ones, an interrupt can come in, and at that point it matters what
is *CURRENT*.

And regardless of whether 'current' is 'prev' or 'next', that
kmap_switch_temporary() loop may be doing the wrong thing, depending
on which one had the deeper stack. The interrupt will be using
whatever "current->kmap_ctrl.idx" is, but that might overwrite entries
that are in the process of being restored (if current is still 'prev',
but kmap_switch_temporary() is in the "restore @next's kmaps" pgase),
or it might stomp on entries that have been pte_clear()'ed by the
'prev' thing.

I dunno. The latter may be one of those "it works anyway, it
overwrites things we don't care about", but the former will most
definitely not work.

And it will be completely impossible to debug, because it will depend
on an interrupt that uses kmap_local/atomic/whatever() coming in
_just_ at the right point in the scheduler, and only when the
scheduler has been entered with the right number of kmap entries on
the prev/next stack.

And no developer will ever see this with any amount of debug code
enabled, because it will only hit on legacy platforms that do this
kmap anyway.

So honestly, that code scares me. I think it's buggy. And even if it
"happens to work", it does so for all the wrong reasons, and is very
fragile.

So I would suggest:

 - continue to use an actual per-cpu kmap_atomic_idx

 - make the switching code save the old idx, then unmap the old
entries one by one (while doing the proper "pop" action), and then map
the new entries one by one (while doing the proper "push" action).

which would mean that the only index that is actually ever *USED* is
the percpu one, and it's always up-to-date and pushed/popped for
individual entries, rather than this - imho completely bogus -
optimization where you use "p->kmap_ctrl.idx" directly and very very
unsafely.

Alternatively, that process counter would need about a hundred lines
of commentary about exactly why it's safe. Because I don't think it
is.

                   Linus

^ permalink raw reply

* Re: [patch RFC 00/15] mm/highmem: Provide a preemptible variant of kmap_atomic & friends
From: Thomas Gleixner @ 2020-09-20 17:24 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Juri Lelli, Peter Zijlstra, Sebastian Andrzej Siewior,
	Joonas Lahtinen, dri-devel, linux-mips, Ben Segall, Max Filippov,
	Guo Ren, sparclinux, Vincent Chen, Will Deacon, Ard Biesheuvel,
	open list:GENERIC INCLUDE/A..., Vincent Guittot, Herbert Xu,
	X86 ML, Russell King, linux-csky, David Airlie, Mel Gorman, arcml,
	linux-xtensa, Paul McKenney, intel-gfx, linuxppc-dev,
	Steven Rostedt, Linus Torvalds, Jani Nikula, Rodrigo Vivi,
	Dietmar Eggemann, Linux ARM, Chris Zankel, Michal Simek,
	Thomas Bogendoerfer, Nick Hu, Linux-MM, Vineet Gupta, LKML,
	Arnd Bergmann, Daniel Vetter, Paul Mackerras, Andrew Morton,
	Daniel Bristot de Oliveira, David S. Miller, Greentime Hu
In-Reply-To: <20200920082353.GG438822@phenom.ffwll.local>

On Sun, Sep 20 2020 at 10:23, Daniel Vetter wrote:

> On Sun, Sep 20, 2020 at 08:23:26AM +0200, Thomas Gleixner wrote:
>> On Sat, Sep 19 2020 at 12:37, Daniel Vetter wrote:
>> > On Sat, Sep 19, 2020 at 12:35 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>> >> I think it should be the case, but I want to double check: Will
>> >> copy_*_user be allowed within a kmap_temporary section? This would
>> >> allow us to ditch an absolute pile of slowpaths.
>> >
>> > (coffee just kicked in) copy_*_user is ofc allowed, but if you hit a
>> > page fault you get a short read/write. This looks like it would remove
>> > the need to handle these in a slowpath, since page faults can now be
>> > served in this new kmap_temporary sections. But this sounds too good
>> > to be true, so I'm wondering what I'm missing.
>> 
>> In principle we could allow pagefaults, but not with the currently
>> proposed interface which can be called from any context. Obviously if
>> called from atomic context it can't handle user page faults.
>  
> Yeah that's clear, but does the implemention need to disable pagefaults
> unconditionally?

As I wrote in the other reply it's not required and the final interface
will neither disable preemption nor pagefaults (except for the atomic
wrapper around it).

Thanks,

        tglx

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox