LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v4 10/18] dt-bindings: usb: Convert DWC USB3 bindings to DT schema
From: Rob Herring @ 2020-11-30 15:38 UTC (permalink / raw)
  To: Serge Semin
  Cc: Neil Armstrong, Bjorn Andersson, Pavel Parkhomenko, Kevin Hilman,
	Krzysztof Kozlowski, Andy Gross, Chunfeng Yun, arcml, devicetree,
	Mathias Nyman, Martin Blumenstingl, Lad Prabhakar, Alexey Malahov,
	linux-arm-kernel, Roger Quadros, Felipe Balbi, Greg Kroah-Hartman,
	Yoshihiro Shimoda, Linux USB List, open list:MIPS, Serge Semin,
	linux-kernel@vger.kernel.org, Manu Gautam, linuxppc-dev
In-Reply-To: <20201125083202.ytoyd62bg3s7kvvg@mobilestation>

On Wed, Nov 25, 2020 at 1:32 AM Serge Semin
<Sergey.Semin@baikalelectronics.ru> wrote:
>
> On Sat, Nov 21, 2020 at 06:42:28AM -0600, Rob Herring wrote:
> > On Thu, Nov 12, 2020 at 01:29:46PM +0300, Serge Semin wrote:
> > > On Wed, Nov 11, 2020 at 02:14:23PM -0600, Rob Herring wrote:
> > > > On Wed, Nov 11, 2020 at 12:08:45PM +0300, Serge Semin wrote:
> > > > > DWC USB3 DT node is supposed to be compliant with the Generic xHCI
> > > > > Controller schema, but with additional vendor-specific properties, the
> > > > > controller-specific reference clocks and PHYs. So let's convert the
> > > > > currently available legacy text-based DWC USB3 bindings to the DT schema
> > > > > and make sure the DWC USB3 nodes are also validated against the
> > > > > usb-xhci.yaml schema.
> > > > >
> > > > > Note we have to discard the nodename restriction of being prefixed with
> > > > > "dwc3@" string, since in accordance with the usb-hcd.yaml schema USB nodes
> > > > > are supposed to be named as "^usb(@.*)".
> > > > >
> > > > > Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
> > > > >
> > > > > ---
> > > > >
> > > > > Changelog v2:
> > > > > - Discard '|' from the descriptions, since we don't need to preserve
> > > > >   the text formatting in any of them.
> > > > > - Drop quotes from around the string constants.
> > > > > - Fix the "clock-names" prop description to be referring the enumerated
> > > > >   clock-names instead of the ones from the Databook.
> > > > >
> > > > > Changelog v3:
> > > > > - Apply usb-xhci.yaml# schema only if the controller is supposed to work
> > > > >   as either host or otg.
> > > > >
> > > > > Changelog v4:
> > > > > - Apply usb-drd.yaml schema first. If the controller is configured
> > > > >   to work in a gadget mode only, then apply the usb.yaml schema too,
> > > > >   otherwise apply the usb-xhci.yaml schema.
> > > > > - Discard the Rob'es Reviewed-by tag. Please review the patch one more
> > > > >   time.
> > > > > ---
> > > > >  .../devicetree/bindings/usb/dwc3.txt          | 125 --------
> > > > >  .../devicetree/bindings/usb/snps,dwc3.yaml    | 303 ++++++++++++++++++
> > > > >  2 files changed, 303 insertions(+), 125 deletions(-)
> > > > >  delete mode 100644 Documentation/devicetree/bindings/usb/dwc3.txt
> > > > >  create mode 100644 Documentation/devicetree/bindings/usb/snps,dwc3.yaml
> >
> >
> > > > > diff --git a/Documentation/devicetree/bindings/usb/snps,dwc3.yaml b/Documentation/devicetree/bindings/usb/snps,dwc3.yaml
> > > > > new file mode 100644
> > > > > index 000000000000..079617891da6
> > > > > --- /dev/null
> > > > > +++ b/Documentation/devicetree/bindings/usb/snps,dwc3.yaml
> > > > > @@ -0,0 +1,303 @@
> > > > > +# SPDX-License-Identifier: GPL-2.0
> > > > > +%YAML 1.2
> > > > > +---
> > > > > +$id: http://devicetree.org/schemas/usb/snps,dwc3.yaml#
> > > > > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > > > > +
> > > > > +title: Synopsys DesignWare USB3 Controller
> > > > > +
> > > > > +maintainers:
> > > > > +  - Felipe Balbi <balbi@kernel.org>
> > > > > +
> > > > > +description:
> > > > > +  This is usually a subnode to DWC3 glue to which it is connected, but can also
> > > > > +  be presented as a standalone DT node with an optional vendor-specific
> > > > > +  compatible string.
> > > > > +
> > >
> > > > > +allOf:
> > > > > +  - $ref: usb-drd.yaml#
> > > > > +  - if:
> > > > > +      properties:
> > > > > +        dr_mode:
> > > > > +          const: peripheral
> >
>
> > Another thing, this evaluates to true if dr_mode is not present. You
> > need to add 'required'?
>
> Right. Will something like this do that?

Yes.

>
> + allOf:
> +  - $ref: usb-drd.yaml#
> +  - if:
> +      properties:
> +        dr_mode:
> +          const: peripheral
> +
> +      required:
> +        - dr_mode
> +    then:
> +      $ref: usb.yaml#
> +    else
> +      $ref: usb-xhci.yaml#
>
> > If dr_mode is otg, then don't you need to apply
> > both usb.yaml and usb-xhci.yaml?
>
> No I don't. Since there is no peripheral-specific DT schema, then the
> only schema any USB-gadget node needs to pass is usb.yaml, which
> is already included into the usb-xhci.yaml schema. So for pure OTG devices
> with xHCI host and gadget capabilities it's enough to evaluate: allOf:
> [$ref: usb-drd.yaml#, $ref: usb-xhci.yaml#].  Please see the
> sketch/ASCII-figure below and the following text for details.

Okay.

Rob

^ permalink raw reply

* [PATCH 6/6] docs: archis: add a per-architecture features list
From: Mauro Carvalho Chehab @ 2020-11-30 15:36 UTC (permalink / raw)
  To: Linux Doc Mailing List
  Cc: Rich Felker, linux-ia64, linux-sh, linux-mips,
	James E.J. Bottomley, Paul Mackerras, H. Peter Anvin, linux-riscv,
	Will Deacon, Thomas Gleixner, Jonas Bonn, linux-s390,
	Yoshinori Sato, Jonathan Corbet, Mauro Carvalho Chehab,
	Helge Deller, x86, Christian Borntraeger, Ingo Molnar,
	Catalin Marinas, Fenghua Yu, Albert Ou, Vasily Gorbik,
	Heiko Carstens, Stefan Kristiansson, Tony Luck, Borislav Petkov,
	Paul Walmsley, Stafford Horne, linux-arm-kernel,
	Thomas Bogendoerfer, linux-parisc, Andrew Cooper, linux-kernel,
	openrisc, Palmer Dabbelt, linuxppc-dev
In-Reply-To: <cover.1606748711.git.mchehab+huawei@kernel.org>

Add a feature list matrix for each architecture to their
respective Kernel books.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/arm/features.rst      |  3 +++
 Documentation/arm/index.rst         |  2 ++
 Documentation/arm64/features.rst    |  3 +++
 Documentation/arm64/index.rst       |  2 ++
 Documentation/ia64/features.rst     |  3 +++
 Documentation/ia64/index.rst        |  2 ++
 Documentation/index.rst             |  2 +-
 Documentation/m68k/features.rst     |  3 +++
 Documentation/m68k/index.rst        |  2 ++
 Documentation/mips/features.rst     |  3 +++
 Documentation/mips/index.rst        |  2 ++
 Documentation/nios2/index.rst       | 12 ++++++++++++
 Documentation/openrisc/features.rst |  3 +++
 Documentation/openrisc/index.rst    |  2 ++
 Documentation/parisc/features.rst   |  3 +++
 Documentation/parisc/index.rst      |  2 ++
 Documentation/powerpc/features.rst  |  3 +++
 Documentation/powerpc/index.rst     |  2 ++
 Documentation/riscv/features.rst    |  3 +++
 Documentation/riscv/index.rst       |  2 ++
 Documentation/s390/features.rst     |  3 +++
 Documentation/s390/index.rst        |  2 ++
 Documentation/sh/features.rst       |  3 +++
 Documentation/sh/index.rst          |  2 ++
 Documentation/sparc/features.rst    |  3 +++
 Documentation/sparc/index.rst       |  2 ++
 Documentation/x86/features.rst      |  3 +++
 Documentation/x86/index.rst         |  1 +
 Documentation/xtensa/features.rst   |  3 +++
 Documentation/xtensa/index.rst      |  2 ++
 30 files changed, 82 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/arm/features.rst
 create mode 100644 Documentation/arm64/features.rst
 create mode 100644 Documentation/ia64/features.rst
 create mode 100644 Documentation/m68k/features.rst
 create mode 100644 Documentation/mips/features.rst
 create mode 100644 Documentation/nios2/index.rst
 create mode 100644 Documentation/openrisc/features.rst
 create mode 100644 Documentation/parisc/features.rst
 create mode 100644 Documentation/powerpc/features.rst
 create mode 100644 Documentation/riscv/features.rst
 create mode 100644 Documentation/s390/features.rst
 create mode 100644 Documentation/sh/features.rst
 create mode 100644 Documentation/sparc/features.rst
 create mode 100644 Documentation/x86/features.rst
 create mode 100644 Documentation/xtensa/features.rst

diff --git a/Documentation/arm/features.rst b/Documentation/arm/features.rst
new file mode 100644
index 000000000000..7414ec03dd15
--- /dev/null
+++ b/Documentation/arm/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features arm
diff --git a/Documentation/arm/index.rst b/Documentation/arm/index.rst
index 5fc072dd0c5e..a2e9e1bba7b9 100644
--- a/Documentation/arm/index.rst
+++ b/Documentation/arm/index.rst
@@ -23,6 +23,8 @@ ARM Architecture
    vlocks
    porting
 
+   features
+
 SoC-specific documents
 ======================
 
diff --git a/Documentation/arm64/features.rst b/Documentation/arm64/features.rst
new file mode 100644
index 000000000000..dfa4cb3cd3ef
--- /dev/null
+++ b/Documentation/arm64/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features arm64
diff --git a/Documentation/arm64/index.rst b/Documentation/arm64/index.rst
index 937634c49979..97d65ba12a35 100644
--- a/Documentation/arm64/index.rst
+++ b/Documentation/arm64/index.rst
@@ -24,6 +24,8 @@ ARM64 Architecture
     tagged-address-abi
     tagged-pointers
 
+    features
+
 .. only::  subproject and html
 
    Indices
diff --git a/Documentation/ia64/features.rst b/Documentation/ia64/features.rst
new file mode 100644
index 000000000000..d7226fdcf5f8
--- /dev/null
+++ b/Documentation/ia64/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features ia64
diff --git a/Documentation/ia64/index.rst b/Documentation/ia64/index.rst
index 4bdfe28067ee..761f2154dfa2 100644
--- a/Documentation/ia64/index.rst
+++ b/Documentation/ia64/index.rst
@@ -15,3 +15,5 @@ IA-64 Architecture
    irq-redir
    mca
    serial
+
+   features
diff --git a/Documentation/index.rst b/Documentation/index.rst
index 57719744774c..5888e8a7272f 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -160,7 +160,7 @@ implementation.
    ia64/index
    m68k/index
    mips/index
-   nios2/nios2
+   nios2/index
    openrisc/index
    parisc/index
    powerpc/index
diff --git a/Documentation/m68k/features.rst b/Documentation/m68k/features.rst
new file mode 100644
index 000000000000..5107a2119472
--- /dev/null
+++ b/Documentation/m68k/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features m68k
diff --git a/Documentation/m68k/index.rst b/Documentation/m68k/index.rst
index b89cb6a86d9b..0f890dbb5fe2 100644
--- a/Documentation/m68k/index.rst
+++ b/Documentation/m68k/index.rst
@@ -10,6 +10,8 @@ m68k Architecture
    kernel-options
    buddha-driver
 
+   features
+
 .. only::  subproject and html
 
    Indices
diff --git a/Documentation/mips/features.rst b/Documentation/mips/features.rst
new file mode 100644
index 000000000000..1973d729b29a
--- /dev/null
+++ b/Documentation/mips/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features mips
diff --git a/Documentation/mips/index.rst b/Documentation/mips/index.rst
index 35cceea4e8bc..037f85a08fe3 100644
--- a/Documentation/mips/index.rst
+++ b/Documentation/mips/index.rst
@@ -11,6 +11,8 @@ MIPS-specific Documentation
    booting
    ingenic-tcu
 
+   features
+
 .. only::  subproject and html
 
    Indices
diff --git a/Documentation/nios2/index.rst b/Documentation/nios2/index.rst
new file mode 100644
index 000000000000..4468fe1a1037
--- /dev/null
+++ b/Documentation/nios2/index.rst
@@ -0,0 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================
+Nios II Specific Documentation
+==============================
+
+.. toctree::
+   :maxdepth: 2
+   :numbered:
+
+   nios2
+   features
diff --git a/Documentation/openrisc/features.rst b/Documentation/openrisc/features.rst
new file mode 100644
index 000000000000..3f7c40d219f2
--- /dev/null
+++ b/Documentation/openrisc/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features openrisc
diff --git a/Documentation/openrisc/index.rst b/Documentation/openrisc/index.rst
index 748b3eea1707..6879f998b87a 100644
--- a/Documentation/openrisc/index.rst
+++ b/Documentation/openrisc/index.rst
@@ -10,6 +10,8 @@ OpenRISC Architecture
    openrisc_port
    todo
 
+   features
+
 .. only::  subproject and html
 
    Indices
diff --git a/Documentation/parisc/features.rst b/Documentation/parisc/features.rst
new file mode 100644
index 000000000000..501d7c450037
--- /dev/null
+++ b/Documentation/parisc/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features parisc
diff --git a/Documentation/parisc/index.rst b/Documentation/parisc/index.rst
index aa3ee0470425..240685751825 100644
--- a/Documentation/parisc/index.rst
+++ b/Documentation/parisc/index.rst
@@ -10,6 +10,8 @@ PA-RISC Architecture
    debugging
    registers
 
+   features
+
 .. only::  subproject and html
 
    Indices
diff --git a/Documentation/powerpc/features.rst b/Documentation/powerpc/features.rst
new file mode 100644
index 000000000000..aeae73df86b0
--- /dev/null
+++ b/Documentation/powerpc/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features powerpc
diff --git a/Documentation/powerpc/index.rst b/Documentation/powerpc/index.rst
index 6ec64b0d5257..bf5f1a2bdbdf 100644
--- a/Documentation/powerpc/index.rst
+++ b/Documentation/powerpc/index.rst
@@ -34,6 +34,8 @@ powerpc
     vas-api
     vcpudispatch_stats
 
+    features
+
 .. only::  subproject and html
 
    Indices
diff --git a/Documentation/riscv/features.rst b/Documentation/riscv/features.rst
new file mode 100644
index 000000000000..c70ef6ac2368
--- /dev/null
+++ b/Documentation/riscv/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features riscv
diff --git a/Documentation/riscv/index.rst b/Documentation/riscv/index.rst
index fa33bffd8992..6e6e39482502 100644
--- a/Documentation/riscv/index.rst
+++ b/Documentation/riscv/index.rst
@@ -9,6 +9,8 @@ RISC-V architecture
     pmu
     patch-acceptance
 
+    features
+
 .. only::  subproject and html
 
    Indices
diff --git a/Documentation/s390/features.rst b/Documentation/s390/features.rst
new file mode 100644
index 000000000000..57c296a9d8f3
--- /dev/null
+++ b/Documentation/s390/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features s390
diff --git a/Documentation/s390/index.rst b/Documentation/s390/index.rst
index cf71df5776b4..b10ca9192557 100644
--- a/Documentation/s390/index.rst
+++ b/Documentation/s390/index.rst
@@ -19,6 +19,8 @@ s390 Architecture
 
     text_files
 
+    features
+
 .. only::  subproject and html
 
    Indices
diff --git a/Documentation/sh/features.rst b/Documentation/sh/features.rst
new file mode 100644
index 000000000000..f722af3b6c99
--- /dev/null
+++ b/Documentation/sh/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features sh
diff --git a/Documentation/sh/index.rst b/Documentation/sh/index.rst
index 7b9a79a28167..c64776738cf6 100644
--- a/Documentation/sh/index.rst
+++ b/Documentation/sh/index.rst
@@ -11,6 +11,8 @@ SuperH Interfaces Guide
     new-machine
     register-banks
 
+    features
+
 Memory Management
 =================
 
diff --git a/Documentation/sparc/features.rst b/Documentation/sparc/features.rst
new file mode 100644
index 000000000000..c0c92468b0fe
--- /dev/null
+++ b/Documentation/sparc/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features sparc
diff --git a/Documentation/sparc/index.rst b/Documentation/sparc/index.rst
index 71cff621f243..ae884224eec2 100644
--- a/Documentation/sparc/index.rst
+++ b/Documentation/sparc/index.rst
@@ -9,3 +9,5 @@ Sparc Architecture
    adi
 
    oradax/oracle-dax
+
+   features
diff --git a/Documentation/x86/features.rst b/Documentation/x86/features.rst
new file mode 100644
index 000000000000..b663f15053ce
--- /dev/null
+++ b/Documentation/x86/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features x86
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index b224d12c880b..b4fcb6f258b2 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -33,3 +33,4 @@ x86-specific Documentation
    i386/index
    x86_64/index
    sva
+   features
diff --git a/Documentation/xtensa/features.rst b/Documentation/xtensa/features.rst
new file mode 100644
index 000000000000..6b92c7bfa19d
--- /dev/null
+++ b/Documentation/xtensa/features.rst
@@ -0,0 +1,3 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. kernel-feat:: $srctree/Documentation/features xtensa
diff --git a/Documentation/xtensa/index.rst b/Documentation/xtensa/index.rst
index 52fa04eb39a3..69952446a9be 100644
--- a/Documentation/xtensa/index.rst
+++ b/Documentation/xtensa/index.rst
@@ -10,3 +10,5 @@ Xtensa Architecture
    atomctl
    booting
    mmu
+
+   features
-- 
2.28.0


^ permalink raw reply related

* [PATCH 0/6] Add documentation for Documentation/features at the built docs
From: Mauro Carvalho Chehab @ 2020-11-30 15:36 UTC (permalink / raw)
  To: Linux Doc Mailing List
  Cc: Rich Felker, linux-ia64, linux-sh, linux-mips,
	James E.J. Bottomley, Paul Mackerras, H. Peter Anvin, linux-riscv,
	Will Deacon, Thomas Gleixner, Jonas Bonn, linux-s390,
	Yoshinori Sato, Jonathan Corbet, Mauro Carvalho Chehab,
	Helge Deller, x86, Christian Borntraeger, Ingo Molnar,
	Catalin Marinas, Fenghua Yu, Albert Ou, Kees Cook, Vasily Gorbik,
	Heiko Carstens, Jonathan Neuschäfer, Stefan Kristiansson,
	Tony Luck, Borislav Petkov, Paul Walmsley, Stafford Horne,
	Daniel W. S. Almeida, linux-arm-kernel, Thomas Bogendoerfer,
	linux-parisc, Andrew Cooper, linux-kernel, openrisc,
	Palmer Dabbelt, Masami Hiramatsu, Greg Kroah-Hartman,
	linuxppc-dev

Hi Jon,

This series got already submitted last year:

   https://lore.kernel.org/lkml/cover.1561222784.git.mchehab+samsung@kernel.org/

Yet, on that time, there were too many other patches related to ReST
conversion floating around. So, at the end, I guess this one got missed.

So, I did a rebase on the top of upstream, and added a few new changes.

Patch 1 contains the original implementation back then. It adds a
get_feat.pl script that parses the contents of Documentation/features.

Patch 2 is new: it re-implements the output of the full contents of the
features table as a set of per-subsystem tables. 

Patch 3 replaces the existing Documentation/features/list-arch.sh
by a call to the new script, in order to avoid having two scripts
doing the same thing.

Patch 4 is a sphinx extension to allow generating features output
via a meta-tag.

Patch 5 adds a complete feature list covering all archs at the
admin guide.

Patch 6 adds a per-arch feature list on each architecture book.

-

The scripts/get_feat.pl supports several types of output:

- $ scripts/get_feat.pl current

  Outputs the supported feadures by the architecture of the
  running Kernel, as an ASCII table;

- $  scripts/get_feat.pl list

  Outputs the supported features on an easy to be parsed
  format. By default, it uses the current architecture as well;

- $  scripts/get_feat.pl rest --feature jump-labels

  Output what architecture supports a given feature
  (on the above example, "jump-labels" feature)

- $ scripts/get_feat.pl rest --arch um

  Outputs the features support for an specific architecture
  (on the above example, for "um" architecture.

- $ scripts/get_feat.pl rest

  Outputs a text file with ASCII tables (ReST compatible)
  with all features, grouped per subsystem.

  E. g. something like:
	
        ===================================
        Feature status on all architectures
        ===================================
        
        Subsystem: core
        ===============
        
        +---------------------+---------------------------------+-------------------------------------------------------------------------+------------+------+
        |Feature              |Kconfig                          |Description                                                              |Architecture|Status|
        +=====================+=================================+=========================================================================+============+======+
        |cBPF-JIT             |HAVE_CBPF_JIT                    |arch supports cBPF JIT optimizations                                     |alpha       |TODO  |
        |                     |                                 |                                                                         +------------+------+
        |                     |                                 |                                                                         |arc         |TODO  |
        |                     |                                 |                                                                         +------------+------+
        |                     |                                 |                                                                         |arm         |TODO  |
        |                     |                                 |                                                                         +------------+------+
        |                     |                                 |                                                                         |arm64       |TODO  |
        |                     |                                 |                                                                         +------------+------+
        |                     |                                 |                                                                         |c6x         |TODO  |
        |                     |                                 |                                                                         +------------+------+
        |                     |                                 |                                                                         |csky        |TODO  |
        |                     |                                 |                                                                         +------------+------+
        |                     |                                 |                                                                         |h8300       |TODO  |
        |                     |                                 |                                                                         +------------+------+
        |                     |                                 |                                                                         |hexagon     |TODO  |
...

Adding those patchsets will basically place the contents of all
files under Documentation/features (currently, 45 files) at the
Kernel documentation, which is, IMO, a good thing to do.

Regards,
Mauro

Mauro Carvalho Chehab (6):
  scripts: get_feat.pl: add a script to handle Documentation/features
  scripts: get_feat.pl: improve matrix output
  scripts: get_feat.pl: use its implementation for list-arch.sh
  sphinx: kernel_feat.py: add a script to parse feature files
  docs: admin-guide: add a features list
  docs: archis: add a per-architecture features list

 Documentation/admin-guide/features.rst |   3 +
 Documentation/admin-guide/index.rst    |   1 +
 Documentation/arm/features.rst         |   3 +
 Documentation/arm/index.rst            |   2 +
 Documentation/arm64/features.rst       |   3 +
 Documentation/arm64/index.rst          |   2 +
 Documentation/conf.py                  |   2 +-
 Documentation/features/list-arch.sh    |  17 +-
 Documentation/ia64/features.rst        |   3 +
 Documentation/ia64/index.rst           |   2 +
 Documentation/index.rst                |   2 +-
 Documentation/m68k/features.rst        |   3 +
 Documentation/m68k/index.rst           |   2 +
 Documentation/mips/features.rst        |   3 +
 Documentation/mips/index.rst           |   2 +
 Documentation/nios2/index.rst          |  12 +
 Documentation/openrisc/features.rst    |   3 +
 Documentation/openrisc/index.rst       |   2 +
 Documentation/parisc/features.rst      |   3 +
 Documentation/parisc/index.rst         |   2 +
 Documentation/powerpc/features.rst     |   3 +
 Documentation/powerpc/index.rst        |   2 +
 Documentation/riscv/features.rst       |   3 +
 Documentation/riscv/index.rst          |   2 +
 Documentation/s390/features.rst        |   3 +
 Documentation/s390/index.rst           |   2 +
 Documentation/sh/features.rst          |   3 +
 Documentation/sh/index.rst             |   2 +
 Documentation/sparc/features.rst       |   3 +
 Documentation/sparc/index.rst          |   2 +
 Documentation/sphinx/kernel_feat.py    | 169 ++++++++
 Documentation/x86/features.rst         |   3 +
 Documentation/x86/index.rst            |   1 +
 Documentation/xtensa/features.rst      |   3 +
 Documentation/xtensa/index.rst         |   2 +
 scripts/get_feat.pl                    | 552 +++++++++++++++++++++++++
 36 files changed, 810 insertions(+), 17 deletions(-)
 create mode 100644 Documentation/admin-guide/features.rst
 create mode 100644 Documentation/arm/features.rst
 create mode 100644 Documentation/arm64/features.rst
 create mode 100644 Documentation/ia64/features.rst
 create mode 100644 Documentation/m68k/features.rst
 create mode 100644 Documentation/mips/features.rst
 create mode 100644 Documentation/nios2/index.rst
 create mode 100644 Documentation/openrisc/features.rst
 create mode 100644 Documentation/parisc/features.rst
 create mode 100644 Documentation/powerpc/features.rst
 create mode 100644 Documentation/riscv/features.rst
 create mode 100644 Documentation/s390/features.rst
 create mode 100644 Documentation/sh/features.rst
 create mode 100644 Documentation/sparc/features.rst
 create mode 100644 Documentation/sphinx/kernel_feat.py
 create mode 100644 Documentation/x86/features.rst
 create mode 100644 Documentation/xtensa/features.rst
 create mode 100755 scripts/get_feat.pl

-- 
2.28.0



^ permalink raw reply

* Re: [PATCH v6 1/5] PCI: Unify ECAM constants in native PCI Express drivers
From: Krzysztof Wilczyński @ 2020-11-30 15:30 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Heiko Stuebner, Shawn Lin, Paul Mackerras, Thomas Petazzoni,
	Jonathan Chocron, Toan Le, Will Deacon, Rob Herring,
	Florian Fainelli, Michal Simek, linux-rockchip,
	bcm-kernel-feedback-list, Jonathan Derrick, linux-pci, Ray Jui,
	linux-rpi-kernel, Jonathan Cameron, Bjorn Helgaas,
	linux-arm-kernel, Scott Branden, Zhou Wang, Robert Richter,
	linuxppc-dev, Nicolas Saenz Julienne
In-Reply-To: <20201130110858.GB16758@e121166-lin.cambridge.arm.com>

Hi Lorenzo!

On 20-11-30 11:08:58, Lorenzo Pieralisi wrote:
[...]
> > Refactor pci_ecam_map_bus() function to use newly added constants so
> > that limits to the bus, device function and offset (now limited to 4K as
> > per the specification) are in place to prevent the defective or
> > malicious caller from supplying incorrect configuration offset and thus
> > targeting the wrong device when accessing extended configuration space.
> > This refactor also allows for the ".bus_shit" initialisers to be dropped
>                                           ^^^^
> 
> Nice typo, I'd fix it while applying it though if you don't mind ;-),
> no need to resend it.

Oh doh!  Apologies. :)

> Jokes aside, nice piece of work, thanks for that.
> 
> > when the user is not using a custom value as a default value will be
> > used as per the PCI Express Specification.
> > 
> > Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
> > Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
> 
> I think Bjorn's reviewed-by still stands so I will apply it.
[...]

Thank you!

Krzysztof

^ permalink raw reply

* [PATCH] powerpc/pseries: Define PCI bus speed for Gen4 and Gen5
From: Frederic Barrat @ 2020-11-30 15:29 UTC (permalink / raw)
  To: linuxppc-dev

Update bus speed definition for PCI Gen4 and 5.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
---
 arch/powerpc/platforms/pseries/pci.c | 51 ++++++++++++----------------
 1 file changed, 21 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c
index 911534b89c85..72a4d4167849 100644
--- a/arch/powerpc/platforms/pseries/pci.c
+++ b/arch/powerpc/platforms/pseries/pci.c
@@ -290,6 +290,25 @@ static void fixup_winbond_82c105(struct pci_dev* dev)
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_WINBOND, PCI_DEVICE_ID_WINBOND_82C105,
 			 fixup_winbond_82c105);
 
+static enum pci_bus_speed prop_to_pci_speed(u32 prop)
+{
+	switch (prop) {
+	case 0x01:
+		return PCIE_SPEED_2_5GT;
+	case 0x02:
+		return PCIE_SPEED_5_0GT;
+	case 0x04:
+		return PCIE_SPEED_8_0GT;
+	case 0x08:
+		return PCIE_SPEED_16_0GT;
+	case 0x10:
+		return PCIE_SPEED_32_0GT;
+	default:
+		pr_debug("Unexpected PCI link speed property value\n");
+		return PCI_SPEED_UNKNOWN;
+	}
+}
+
 int pseries_root_bridge_prepare(struct pci_host_bridge *bridge)
 {
 	struct device_node *dn, *pdn;
@@ -322,35 +341,7 @@ int pseries_root_bridge_prepare(struct pci_host_bridge *bridge)
 		return 0;
 	}
 
-	switch (pcie_link_speed_stats[0]) {
-	case 0x01:
-		bus->max_bus_speed = PCIE_SPEED_2_5GT;
-		break;
-	case 0x02:
-		bus->max_bus_speed = PCIE_SPEED_5_0GT;
-		break;
-	case 0x04:
-		bus->max_bus_speed = PCIE_SPEED_8_0GT;
-		break;
-	default:
-		bus->max_bus_speed = PCI_SPEED_UNKNOWN;
-		break;
-	}
-
-	switch (pcie_link_speed_stats[1]) {
-	case 0x01:
-		bus->cur_bus_speed = PCIE_SPEED_2_5GT;
-		break;
-	case 0x02:
-		bus->cur_bus_speed = PCIE_SPEED_5_0GT;
-		break;
-	case 0x04:
-		bus->cur_bus_speed = PCIE_SPEED_8_0GT;
-		break;
-	default:
-		bus->cur_bus_speed = PCI_SPEED_UNKNOWN;
-		break;
-	}
-
+	bus->max_bus_speed = prop_to_pci_speed(pcie_link_speed_stats[0]);
+	bus->cur_bus_speed = prop_to_pci_speed(pcie_link_speed_stats[1]);
 	return 0;
 }
-- 
2.26.2


^ permalink raw reply related

* Re: [PATCH 2/8] x86: use exit_lazy_tlb rather than membarrier_mm_sync_core_before_usermode
From: Mathieu Desnoyers @ 2020-11-30 14:57 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: linux-arch, Arnd Bergmann, Peter Zijlstra, x86, linux-kernel,
	linux-mm, linuxppc-dev
In-Reply-To: <20201128160141.1003903-3-npiggin@gmail.com>

----- On Nov 28, 2020, at 11:01 AM, Nicholas Piggin npiggin@gmail.com wrote:

> And get rid of the generic sync_core_before_usermode facility. This is
> functionally a no-op in the core scheduler code, but it also catches

This sentence is incomplete.

> 
> This helper is the wrong way around I think. The idea that membarrier
> state requires a core sync before returning to user is the easy one
> that does not need hiding behind membarrier calls. The gap in core
> synchronization due to x86's sysret/sysexit and lazy tlb mode, is the
> tricky detail that is better put in x86 lazy tlb code.

Ideally yes this complexity should sit within the x86 architecture code
if only that architecture requires it.

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply

* Re: [PATCH] KVM: PPC: Book3S HV: XIVE: Fix vCPU id sanity check
From: Cédric Le Goater @ 2020-11-30 12:47 UTC (permalink / raw)
  To: Greg Kurz, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel, kvm-ppc
In-Reply-To: <160673876747.695514.1809676603724514920.stgit@bahia.lan>

On 11/30/20 1:19 PM, Greg Kurz wrote:
> Commit 062cfab7069f ("KVM: PPC: Book3S HV: XIVE: Make VP block size
> configurable") updated kvmppc_xive_vcpu_id_valid() in a way that
> allows userspace to trigger an assertion in skiboot and crash the host:
> 
> [  696.186248988,3] XIVE[ IC 08  ] eq_blk != vp_blk (0 vs. 1) for target 0x4300008c/0
> [  696.186314757,0] Assert fail: hw/xive.c:2370:0
> [  696.186342458,0] Aborting!
> xive-kvCPU 0043 Backtrace:
>  S: 0000000031e2b8f0 R: 0000000030013840   .backtrace+0x48
>  S: 0000000031e2b990 R: 000000003001b2d0   ._abort+0x4c
>  S: 0000000031e2ba10 R: 000000003001b34c   .assert_fail+0x34
>  S: 0000000031e2ba90 R: 0000000030058984   .xive_eq_for_target.part.20+0xb0
>  S: 0000000031e2bb40 R: 0000000030059fdc   .xive_setup_silent_gather+0x2c
>  S: 0000000031e2bc20 R: 000000003005a334   .opal_xive_set_vp_info+0x124
>  S: 0000000031e2bd20 R: 00000000300051a4   opal_entry+0x134
>  --- OPAL call token: 0x8a caller R1: 0xc000001f28563850 ---
> 
> XIVE maintains the interrupt context state of non-dispatched vCPUs in
> an internal VP structure. We allocate a bunch of those on startup to
> accommodate all possible vCPUs. Each VP has an id, that we derive from
> the vCPU id for efficiency:
> 
> static inline u32 kvmppc_xive_vp(struct kvmppc_xive *xive, u32 server)
> {
> 	return xive->vp_base + kvmppc_pack_vcpu_id(xive->kvm, server);
> }
> 
> The KVM XIVE device used to allocate KVM_MAX_VCPUS VPs. This was
> limitting the number of concurrent VMs because the VP space is
> limited on the HW. Since most of the time, VMs run with a lot less
> vCPUs, commit 062cfab7069f ("KVM: PPC: Book3S HV: XIVE: Make VP
> block size configurable") gave the possibility for userspace to
> tune the size of the VP block through the KVM_DEV_XIVE_NR_SERVERS
> attribute.
> 
> The check in kvmppc_pack_vcpu_id() was changed from
> 
> 	cpu < KVM_MAX_VCPUS * xive->kvm->arch.emul_smt_mode
> 
> to
> 
> 	cpu < xive->nr_servers * xive->kvm->arch.emul_smt_mode
> 
> The previous check was based on the fact that the VP block had
> KVM_MAX_VCPUS entries and that kvmppc_pack_vcpu_id() guarantees
> that packed vCPU ids are below KVM_MAX_VCPUS. We've changed the
> size of the VP block, but kvmppc_pack_vcpu_id() has nothing to
> do with it and it certainly doesn't ensure that the packed vCPU
> ids are below xive->nr_servers. kvmppc_xive_vcpu_id_valid() might
> thus return true when the VM was configured with a non-standard
> VSMT mode, even if the packed vCPU id is higher than what we
> expect. We end up using an unallocated VP id, which confuses
> OPAL. The assert in OPAL is probably abusive and should be
> converted to a regular error that the kernel can handle, but
> we shouldn't really use broken VP ids in the first place.
> 
> Fix kvmppc_xive_vcpu_id_valid() so that it checks the packed
> vCPU id is below xive->nr_servers, which is explicitly what we
> want.
> 
> Fixes: 062cfab7069f ("KVM: PPC: Book3S HV: XIVE: Make VP block size configurable")
> Cc: stable@vger.kernel.org # v5.5+
> Signed-off-by: Greg Kurz <groug@kaod.org>

Reviewed-by: Cédric Le Goater <clg@kaod.org>

Thanks,

C.

> ---
>  arch/powerpc/kvm/book3s_xive.c |    7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
> index 85215e79db42..a0ebc29f30b2 100644
> --- a/arch/powerpc/kvm/book3s_xive.c
> +++ b/arch/powerpc/kvm/book3s_xive.c
> @@ -1214,12 +1214,9 @@ void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcpu)
>  static bool kvmppc_xive_vcpu_id_valid(struct kvmppc_xive *xive, u32 cpu)
>  {
>  	/* We have a block of xive->nr_servers VPs. We just need to check
> -	 * raw vCPU ids are below the expected limit for this guest's
> -	 * core stride ; kvmppc_pack_vcpu_id() will pack them down to an
> -	 * index that can be safely used to compute a VP id that belongs
> -	 * to the VP block.
> +	 * packed vCPU ids are below that.
>  	 */
> -	return cpu < xive->nr_servers * xive->kvm->arch.emul_smt_mode;
> +	return kvmppc_pack_vcpu_id(xive->kvm, cpu) < xive->nr_servers;
>  }
>  
>  int kvmppc_xive_compute_vp_id(struct kvmppc_xive *xive, u32 cpu, u32 *vp)
> 
> 


^ permalink raw reply

* [PATCH] KVM: PPC: Book3S HV: XIVE: Fix vCPU id sanity check
From: Greg Kurz @ 2020-11-30 12:19 UTC (permalink / raw)
  To: Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, Cédric Le Goater, kvm-ppc, linux-kernel

Commit 062cfab7069f ("KVM: PPC: Book3S HV: XIVE: Make VP block size
configurable") updated kvmppc_xive_vcpu_id_valid() in a way that
allows userspace to trigger an assertion in skiboot and crash the host:

[  696.186248988,3] XIVE[ IC 08  ] eq_blk != vp_blk (0 vs. 1) for target 0x4300008c/0
[  696.186314757,0] Assert fail: hw/xive.c:2370:0
[  696.186342458,0] Aborting!
xive-kvCPU 0043 Backtrace:
 S: 0000000031e2b8f0 R: 0000000030013840   .backtrace+0x48
 S: 0000000031e2b990 R: 000000003001b2d0   ._abort+0x4c
 S: 0000000031e2ba10 R: 000000003001b34c   .assert_fail+0x34
 S: 0000000031e2ba90 R: 0000000030058984   .xive_eq_for_target.part.20+0xb0
 S: 0000000031e2bb40 R: 0000000030059fdc   .xive_setup_silent_gather+0x2c
 S: 0000000031e2bc20 R: 000000003005a334   .opal_xive_set_vp_info+0x124
 S: 0000000031e2bd20 R: 00000000300051a4   opal_entry+0x134
 --- OPAL call token: 0x8a caller R1: 0xc000001f28563850 ---

XIVE maintains the interrupt context state of non-dispatched vCPUs in
an internal VP structure. We allocate a bunch of those on startup to
accommodate all possible vCPUs. Each VP has an id, that we derive from
the vCPU id for efficiency:

static inline u32 kvmppc_xive_vp(struct kvmppc_xive *xive, u32 server)
{
	return xive->vp_base + kvmppc_pack_vcpu_id(xive->kvm, server);
}

The KVM XIVE device used to allocate KVM_MAX_VCPUS VPs. This was
limitting the number of concurrent VMs because the VP space is
limited on the HW. Since most of the time, VMs run with a lot less
vCPUs, commit 062cfab7069f ("KVM: PPC: Book3S HV: XIVE: Make VP
block size configurable") gave the possibility for userspace to
tune the size of the VP block through the KVM_DEV_XIVE_NR_SERVERS
attribute.

The check in kvmppc_pack_vcpu_id() was changed from

	cpu < KVM_MAX_VCPUS * xive->kvm->arch.emul_smt_mode

to

	cpu < xive->nr_servers * xive->kvm->arch.emul_smt_mode

The previous check was based on the fact that the VP block had
KVM_MAX_VCPUS entries and that kvmppc_pack_vcpu_id() guarantees
that packed vCPU ids are below KVM_MAX_VCPUS. We've changed the
size of the VP block, but kvmppc_pack_vcpu_id() has nothing to
do with it and it certainly doesn't ensure that the packed vCPU
ids are below xive->nr_servers. kvmppc_xive_vcpu_id_valid() might
thus return true when the VM was configured with a non-standard
VSMT mode, even if the packed vCPU id is higher than what we
expect. We end up using an unallocated VP id, which confuses
OPAL. The assert in OPAL is probably abusive and should be
converted to a regular error that the kernel can handle, but
we shouldn't really use broken VP ids in the first place.

Fix kvmppc_xive_vcpu_id_valid() so that it checks the packed
vCPU id is below xive->nr_servers, which is explicitly what we
want.

Fixes: 062cfab7069f ("KVM: PPC: Book3S HV: XIVE: Make VP block size configurable")
Cc: stable@vger.kernel.org # v5.5+
Signed-off-by: Greg Kurz <groug@kaod.org>
---
 arch/powerpc/kvm/book3s_xive.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index 85215e79db42..a0ebc29f30b2 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -1214,12 +1214,9 @@ void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcpu)
 static bool kvmppc_xive_vcpu_id_valid(struct kvmppc_xive *xive, u32 cpu)
 {
 	/* We have a block of xive->nr_servers VPs. We just need to check
-	 * raw vCPU ids are below the expected limit for this guest's
-	 * core stride ; kvmppc_pack_vcpu_id() will pack them down to an
-	 * index that can be safely used to compute a VP id that belongs
-	 * to the VP block.
+	 * packed vCPU ids are below that.
 	 */
-	return cpu < xive->nr_servers * xive->kvm->arch.emul_smt_mode;
+	return kvmppc_pack_vcpu_id(xive->kvm, cpu) < xive->nr_servers;
 }
 
 int kvmppc_xive_compute_vp_id(struct kvmppc_xive *xive, u32 cpu, u32 *vp)



^ permalink raw reply related

* Re: [PATCH v6 1/5] PCI: Unify ECAM constants in native PCI Express drivers
From: Lorenzo Pieralisi @ 2020-11-30 11:08 UTC (permalink / raw)
  To: Krzysztof Wilczyński
  Cc: Heiko Stuebner, Shawn Lin, Paul Mackerras, Thomas Petazzoni,
	Jonathan Chocron, Toan Le, Will Deacon, Rob Herring,
	Florian Fainelli, Michal Simek, linux-rockchip,
	bcm-kernel-feedback-list, Jonathan Derrick, linux-pci, Ray Jui,
	linux-rpi-kernel, Jonathan Cameron, Bjorn Helgaas,
	linux-arm-kernel, Scott Branden, Zhou Wang, Robert Richter,
	linuxppc-dev, Nicolas Saenz Julienne
In-Reply-To: <20201129230743.3006978-2-kw@linux.com>

On Sun, Nov 29, 2020 at 11:07:39PM +0000, Krzysztof Wilczyński wrote:
> Add ECAM-related constants to provide a set of standard constants
> defining memory address shift values to the byte-level address that can
> be used to access the PCI Express Configuration Space, and then move
> native PCI Express controller drivers to use the newly introduced
> definitions retiring driver-specific ones.
> 
> Refactor pci_ecam_map_bus() function to use newly added constants so
> that limits to the bus, device function and offset (now limited to 4K as
> per the specification) are in place to prevent the defective or
> malicious caller from supplying incorrect configuration offset and thus
> targeting the wrong device when accessing extended configuration space.
> This refactor also allows for the ".bus_shit" initialisers to be dropped
                                          ^^^^

Nice typo, I'd fix it while applying it though if you don't mind ;-),
no need to resend it.

Jokes aside, nice piece of work, thanks for that.

> when the user is not using a custom value as a default value will be
> used as per the PCI Express Specification.
> 
> Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
> Signed-off-by: Krzysztof Wilczyński <kw@linux.com>

I think Bjorn's reviewed-by still stands so I will apply it.

Thanks !
Lorenzo

> ---
>  drivers/pci/controller/dwc/pcie-al.c        | 12 ++-------
>  drivers/pci/controller/dwc/pcie-hisi.c      |  2 --
>  drivers/pci/controller/pci-aardvark.c       | 13 +++-------
>  drivers/pci/controller/pci-host-generic.c   |  1 -
>  drivers/pci/controller/pci-thunder-ecam.c   |  1 -
>  drivers/pci/controller/pcie-brcmstb.c       | 16 ++----------
>  drivers/pci/controller/pcie-rockchip-host.c | 27 ++++++++++-----------
>  drivers/pci/controller/pcie-rockchip.h      |  8 +-----
>  drivers/pci/controller/pcie-tango.c         |  1 -
>  drivers/pci/controller/pcie-xilinx-nwl.c    |  9 ++-----
>  drivers/pci/controller/pcie-xilinx.c        | 11 ++-------
>  drivers/pci/controller/vmd.c                | 11 ++++-----
>  drivers/pci/ecam.c                          | 23 ++++++++++++------
>  include/linux/pci-ecam.h                    | 27 +++++++++++++++++++++
>  14 files changed, 73 insertions(+), 89 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-al.c b/drivers/pci/controller/dwc/pcie-al.c
> index f973fbca90cf..af9e51ab1af8 100644
> --- a/drivers/pci/controller/dwc/pcie-al.c
> +++ b/drivers/pci/controller/dwc/pcie-al.c
> @@ -76,7 +76,6 @@ static int al_pcie_init(struct pci_config_window *cfg)
>  }
>  
>  const struct pci_ecam_ops al_pcie_ops = {
> -	.bus_shift    = 20,
>  	.init         =  al_pcie_init,
>  	.pci_ops      = {
>  		.map_bus    = al_pcie_map_bus,
> @@ -138,8 +137,6 @@ struct al_pcie {
>  	struct al_pcie_target_bus_cfg target_bus_cfg;
>  };
>  
> -#define PCIE_ECAM_DEVFN(x)		(((x) & 0xff) << 12)
> -
>  #define to_al_pcie(x)		dev_get_drvdata((x)->dev)
>  
>  static inline u32 al_pcie_controller_readl(struct al_pcie *pcie, u32 offset)
> @@ -226,11 +223,6 @@ static void __iomem *al_pcie_conf_addr_map_bus(struct pci_bus *bus,
>  	struct al_pcie_target_bus_cfg *target_bus_cfg = &pcie->target_bus_cfg;
>  	unsigned int busnr_ecam = busnr & target_bus_cfg->ecam_mask;
>  	unsigned int busnr_reg = busnr & target_bus_cfg->reg_mask;
> -	void __iomem *pci_base_addr;
> -
> -	pci_base_addr = (void __iomem *)((uintptr_t)pp->va_cfg0_base +
> -					 (busnr_ecam << 20) +
> -					 PCIE_ECAM_DEVFN(devfn));
>  
>  	if (busnr_reg != target_bus_cfg->reg_val) {
>  		dev_dbg(pcie->pci->dev, "Changing target bus busnum val from 0x%x to 0x%x\n",
> @@ -241,7 +233,7 @@ static void __iomem *al_pcie_conf_addr_map_bus(struct pci_bus *bus,
>  				       target_bus_cfg->reg_mask);
>  	}
>  
> -	return pci_base_addr + where;
> +	return pp->va_cfg0_base + PCIE_ECAM_OFFSET(busnr_ecam, devfn, where);
>  }
>  
>  static struct pci_ops al_child_pci_ops = {
> @@ -264,7 +256,7 @@ static void al_pcie_config_prepare(struct al_pcie *pcie)
>  
>  	target_bus_cfg = &pcie->target_bus_cfg;
>  
> -	ecam_bus_mask = (pcie->ecam_size >> 20) - 1;
> +	ecam_bus_mask = (pcie->ecam_size >> PCIE_ECAM_BUS_SHIFT) - 1;
>  	if (ecam_bus_mask > 255) {
>  		dev_warn(pcie->dev, "ECAM window size is larger than 256MB. Cutting off at 256\n");
>  		ecam_bus_mask = 255;
> diff --git a/drivers/pci/controller/dwc/pcie-hisi.c b/drivers/pci/controller/dwc/pcie-hisi.c
> index 5ca86796d43a..8fc5960faf28 100644
> --- a/drivers/pci/controller/dwc/pcie-hisi.c
> +++ b/drivers/pci/controller/dwc/pcie-hisi.c
> @@ -100,7 +100,6 @@ static int hisi_pcie_init(struct pci_config_window *cfg)
>  }
>  
>  const struct pci_ecam_ops hisi_pcie_ops = {
> -	.bus_shift    = 20,
>  	.init         =  hisi_pcie_init,
>  	.pci_ops      = {
>  		.map_bus    = hisi_pcie_map_bus,
> @@ -135,7 +134,6 @@ static int hisi_pcie_platform_init(struct pci_config_window *cfg)
>  }
>  
>  static const struct pci_ecam_ops hisi_pcie_platform_ops = {
> -	.bus_shift    = 20,
>  	.init         =  hisi_pcie_platform_init,
>  	.pci_ops      = {
>  		.map_bus    = hisi_pcie_map_bus,
> diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
> index 0be485a25327..1043e54c73bd 100644
> --- a/drivers/pci/controller/pci-aardvark.c
> +++ b/drivers/pci/controller/pci-aardvark.c
> @@ -16,6 +16,7 @@
>  #include <linux/kernel.h>
>  #include <linux/module.h>
>  #include <linux/pci.h>
> +#include <linux/pci-ecam.h>
>  #include <linux/init.h>
>  #include <linux/phy/phy.h>
>  #include <linux/platform_device.h>
> @@ -164,14 +165,6 @@
>  #define PCIE_CONFIG_WR_TYPE0			0xa
>  #define PCIE_CONFIG_WR_TYPE1			0xb
>  
> -#define PCIE_CONF_BUS(bus)			(((bus) & 0xff) << 20)
> -#define PCIE_CONF_DEV(dev)			(((dev) & 0x1f) << 15)
> -#define PCIE_CONF_FUNC(fun)			(((fun) & 0x7)	<< 12)
> -#define PCIE_CONF_REG(reg)			((reg) & 0xffc)
> -#define PCIE_CONF_ADDR(bus, devfn, where)	\
> -	(PCIE_CONF_BUS(bus) | PCIE_CONF_DEV(PCI_SLOT(devfn))	| \
> -	 PCIE_CONF_FUNC(PCI_FUNC(devfn)) | PCIE_CONF_REG(where))
> -
>  #define PIO_RETRY_CNT			500
>  #define PIO_RETRY_DELAY			2 /* 2 us*/
>  
> @@ -687,7 +680,7 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
>  	advk_writel(pcie, reg, PIO_CTRL);
>  
>  	/* Program the address registers */
> -	reg = PCIE_CONF_ADDR(bus->number, devfn, where);
> +	reg = ALIGN_DOWN(PCIE_ECAM_OFFSET(bus->number, devfn, where), 4);
>  	advk_writel(pcie, reg, PIO_ADDR_LS);
>  	advk_writel(pcie, 0, PIO_ADDR_MS);
>  
> @@ -748,7 +741,7 @@ static int advk_pcie_wr_conf(struct pci_bus *bus, u32 devfn,
>  	advk_writel(pcie, reg, PIO_CTRL);
>  
>  	/* Program the address registers */
> -	reg = PCIE_CONF_ADDR(bus->number, devfn, where);
> +	reg = ALIGN_DOWN(PCIE_ECAM_OFFSET(bus->number, devfn, where), 4);
>  	advk_writel(pcie, reg, PIO_ADDR_LS);
>  	advk_writel(pcie, 0, PIO_ADDR_MS);
>  
> diff --git a/drivers/pci/controller/pci-host-generic.c b/drivers/pci/controller/pci-host-generic.c
> index b51977abfdf1..63865aeb636b 100644
> --- a/drivers/pci/controller/pci-host-generic.c
> +++ b/drivers/pci/controller/pci-host-generic.c
> @@ -49,7 +49,6 @@ static void __iomem *pci_dw_ecam_map_bus(struct pci_bus *bus,
>  }
>  
>  static const struct pci_ecam_ops pci_dw_ecam_bus_ops = {
> -	.bus_shift	= 20,
>  	.pci_ops	= {
>  		.map_bus	= pci_dw_ecam_map_bus,
>  		.read		= pci_generic_config_read,
> diff --git a/drivers/pci/controller/pci-thunder-ecam.c b/drivers/pci/controller/pci-thunder-ecam.c
> index 7e8835fee5f7..f964fd26f7e0 100644
> --- a/drivers/pci/controller/pci-thunder-ecam.c
> +++ b/drivers/pci/controller/pci-thunder-ecam.c
> @@ -346,7 +346,6 @@ static int thunder_ecam_config_write(struct pci_bus *bus, unsigned int devfn,
>  }
>  
>  const struct pci_ecam_ops pci_thunder_ecam_ops = {
> -	.bus_shift	= 20,
>  	.pci_ops	= {
>  		.map_bus        = pci_ecam_map_bus,
>  		.read           = thunder_ecam_config_read,
> diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
> index bea86899bd5d..7fc80fd6f13f 100644
> --- a/drivers/pci/controller/pcie-brcmstb.c
> +++ b/drivers/pci/controller/pcie-brcmstb.c
> @@ -22,6 +22,7 @@
>  #include <linux/of_pci.h>
>  #include <linux/of_platform.h>
>  #include <linux/pci.h>
> +#include <linux/pci-ecam.h>
>  #include <linux/printk.h>
>  #include <linux/reset.h>
>  #include <linux/sizes.h>
> @@ -127,11 +128,7 @@
>  #define  MSI_INT_MASK_CLR		0x14
>  
>  #define PCIE_EXT_CFG_DATA				0x8000
> -
>  #define PCIE_EXT_CFG_INDEX				0x9000
> -#define  PCIE_EXT_BUSNUM_SHIFT				20
> -#define  PCIE_EXT_SLOT_SHIFT				15
> -#define  PCIE_EXT_FUNC_SHIFT				12
>  
>  #define  PCIE_RGR1_SW_INIT_1_PERST_MASK			0x1
>  #define  PCIE_RGR1_SW_INIT_1_PERST_SHIFT		0x0
> @@ -695,15 +692,6 @@ static bool brcm_pcie_link_up(struct brcm_pcie *pcie)
>  	return dla && plu;
>  }
>  
> -/* Configuration space read/write support */
> -static inline int brcm_pcie_cfg_index(int busnr, int devfn, int reg)
> -{
> -	return ((PCI_SLOT(devfn) & 0x1f) << PCIE_EXT_SLOT_SHIFT)
> -		| ((PCI_FUNC(devfn) & 0x07) << PCIE_EXT_FUNC_SHIFT)
> -		| (busnr << PCIE_EXT_BUSNUM_SHIFT)
> -		| (reg & ~3);
> -}
> -
>  static void __iomem *brcm_pcie_map_conf(struct pci_bus *bus, unsigned int devfn,
>  					int where)
>  {
> @@ -716,7 +704,7 @@ static void __iomem *brcm_pcie_map_conf(struct pci_bus *bus, unsigned int devfn,
>  		return PCI_SLOT(devfn) ? NULL : base + where;
>  
>  	/* For devices, write to the config space index register */
> -	idx = brcm_pcie_cfg_index(bus->number, devfn, 0);
> +	idx = PCIE_ECAM_OFFSET(bus->number, devfn, 0);
>  	writel(idx, pcie->base + PCIE_EXT_CFG_INDEX);
>  	return base + PCIE_EXT_CFG_DATA + where;
>  }
> diff --git a/drivers/pci/controller/pcie-rockchip-host.c b/drivers/pci/controller/pcie-rockchip-host.c
> index 9705059523a6..f1d08a1b1591 100644
> --- a/drivers/pci/controller/pcie-rockchip-host.c
> +++ b/drivers/pci/controller/pcie-rockchip-host.c
> @@ -157,12 +157,11 @@ static int rockchip_pcie_rd_other_conf(struct rockchip_pcie *rockchip,
>  				       struct pci_bus *bus, u32 devfn,
>  				       int where, int size, u32 *val)
>  {
> -	u32 busdev;
> +	void __iomem *addr;
>  
> -	busdev = PCIE_ECAM_ADDR(bus->number, PCI_SLOT(devfn),
> -				PCI_FUNC(devfn), where);
> +	addr = rockchip->reg_base + PCIE_ECAM_OFFSET(bus->number, devfn, where);
>  
> -	if (!IS_ALIGNED(busdev, size)) {
> +	if (!IS_ALIGNED((uintptr_t)addr, size)) {
>  		*val = 0;
>  		return PCIBIOS_BAD_REGISTER_NUMBER;
>  	}
> @@ -175,11 +174,11 @@ static int rockchip_pcie_rd_other_conf(struct rockchip_pcie *rockchip,
>  						AXI_WRAPPER_TYPE1_CFG);
>  
>  	if (size == 4) {
> -		*val = readl(rockchip->reg_base + busdev);
> +		*val = readl(addr);
>  	} else if (size == 2) {
> -		*val = readw(rockchip->reg_base + busdev);
> +		*val = readw(addr);
>  	} else if (size == 1) {
> -		*val = readb(rockchip->reg_base + busdev);
> +		*val = readb(addr);
>  	} else {
>  		*val = 0;
>  		return PCIBIOS_BAD_REGISTER_NUMBER;
> @@ -191,11 +190,11 @@ static int rockchip_pcie_wr_other_conf(struct rockchip_pcie *rockchip,
>  				       struct pci_bus *bus, u32 devfn,
>  				       int where, int size, u32 val)
>  {
> -	u32 busdev;
> +	void __iomem *addr;
>  
> -	busdev = PCIE_ECAM_ADDR(bus->number, PCI_SLOT(devfn),
> -				PCI_FUNC(devfn), where);
> -	if (!IS_ALIGNED(busdev, size))
> +	addr = rockchip->reg_base + PCIE_ECAM_OFFSET(bus->number, devfn, where);
> +
> +	if (!IS_ALIGNED((uintptr_t)addr, size))
>  		return PCIBIOS_BAD_REGISTER_NUMBER;
>  
>  	if (pci_is_root_bus(bus->parent))
> @@ -206,11 +205,11 @@ static int rockchip_pcie_wr_other_conf(struct rockchip_pcie *rockchip,
>  						AXI_WRAPPER_TYPE1_CFG);
>  
>  	if (size == 4)
> -		writel(val, rockchip->reg_base + busdev);
> +		writel(val, addr);
>  	else if (size == 2)
> -		writew(val, rockchip->reg_base + busdev);
> +		writew(val, addr);
>  	else if (size == 1)
> -		writeb(val, rockchip->reg_base + busdev);
> +		writeb(val, addr);
>  	else
>  		return PCIBIOS_BAD_REGISTER_NUMBER;
>  
> diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h
> index c7d0178fc8c2..1650a5087450 100644
> --- a/drivers/pci/controller/pcie-rockchip.h
> +++ b/drivers/pci/controller/pcie-rockchip.h
> @@ -13,6 +13,7 @@
>  
>  #include <linux/kernel.h>
>  #include <linux/pci.h>
> +#include <linux/pci-ecam.h>
>  
>  /*
>   * The upper 16 bits of PCIE_CLIENT_CONFIG are a write mask for the lower 16
> @@ -178,13 +179,6 @@
>  #define MIN_AXI_ADDR_BITS_PASSED		8
>  #define PCIE_RC_SEND_PME_OFF			0x11960
>  #define ROCKCHIP_VENDOR_ID			0x1d87
> -#define PCIE_ECAM_BUS(x)			(((x) & 0xff) << 20)
> -#define PCIE_ECAM_DEV(x)			(((x) & 0x1f) << 15)
> -#define PCIE_ECAM_FUNC(x)			(((x) & 0x7) << 12)
> -#define PCIE_ECAM_REG(x)			(((x) & 0xfff) << 0)
> -#define PCIE_ECAM_ADDR(bus, dev, func, reg) \
> -	  (PCIE_ECAM_BUS(bus) | PCIE_ECAM_DEV(dev) | \
> -	   PCIE_ECAM_FUNC(func) | PCIE_ECAM_REG(reg))
>  #define PCIE_LINK_IS_L2(x) \
>  	(((x) & PCIE_CLIENT_DEBUG_LTSSM_MASK) == PCIE_CLIENT_DEBUG_LTSSM_L2)
>  #define PCIE_LINK_UP(x) \
> diff --git a/drivers/pci/controller/pcie-tango.c b/drivers/pci/controller/pcie-tango.c
> index d093a8ce4bb1..62a061f1d62e 100644
> --- a/drivers/pci/controller/pcie-tango.c
> +++ b/drivers/pci/controller/pcie-tango.c
> @@ -208,7 +208,6 @@ static int smp8759_config_write(struct pci_bus *bus, unsigned int devfn,
>  }
>  
>  static const struct pci_ecam_ops smp8759_ecam_ops = {
> -	.bus_shift	= 20,
>  	.pci_ops	= {
>  		.map_bus	= pci_ecam_map_bus,
>  		.read		= smp8759_config_read,
> diff --git a/drivers/pci/controller/pcie-xilinx-nwl.c b/drivers/pci/controller/pcie-xilinx-nwl.c
> index f3cf7d61924f..7f29c2fdcd51 100644
> --- a/drivers/pci/controller/pcie-xilinx-nwl.c
> +++ b/drivers/pci/controller/pcie-xilinx-nwl.c
> @@ -18,6 +18,7 @@
>  #include <linux/of_platform.h>
>  #include <linux/of_irq.h>
>  #include <linux/pci.h>
> +#include <linux/pci-ecam.h>
>  #include <linux/platform_device.h>
>  #include <linux/irqchip/chained_irq.h>
>  
> @@ -124,8 +125,6 @@
>  #define E_ECAM_CR_ENABLE		BIT(0)
>  #define E_ECAM_SIZE_LOC			GENMASK(20, 16)
>  #define E_ECAM_SIZE_SHIFT		16
> -#define ECAM_BUS_LOC_SHIFT		20
> -#define ECAM_DEV_LOC_SHIFT		12
>  #define NWL_ECAM_VALUE_DEFAULT		12
>  
>  #define CFG_DMA_REG_BAR			GENMASK(2, 0)
> @@ -240,15 +239,11 @@ static void __iomem *nwl_pcie_map_bus(struct pci_bus *bus, unsigned int devfn,
>  				      int where)
>  {
>  	struct nwl_pcie *pcie = bus->sysdata;
> -	int relbus;
>  
>  	if (!nwl_pcie_valid_device(bus, devfn))
>  		return NULL;
>  
> -	relbus = (bus->number << ECAM_BUS_LOC_SHIFT) |
> -			(devfn << ECAM_DEV_LOC_SHIFT);
> -
> -	return pcie->ecam_base + relbus + where;
> +	return pcie->ecam_base + PCIE_ECAM_OFFSET(bus->number, devfn, where);
>  }
>  
>  /* PCIe operations */
> diff --git a/drivers/pci/controller/pcie-xilinx.c b/drivers/pci/controller/pcie-xilinx.c
> index 8523be61bba5..fa5baeb82653 100644
> --- a/drivers/pci/controller/pcie-xilinx.c
> +++ b/drivers/pci/controller/pcie-xilinx.c
> @@ -21,6 +21,7 @@
>  #include <linux/of_platform.h>
>  #include <linux/of_irq.h>
>  #include <linux/pci.h>
> +#include <linux/pci-ecam.h>
>  #include <linux/platform_device.h>
>  
>  #include "../pci.h"
> @@ -86,10 +87,6 @@
>  /* Phy Status/Control Register definitions */
>  #define XILINX_PCIE_REG_PSCR_LNKUP	BIT(11)
>  
> -/* ECAM definitions */
> -#define ECAM_BUS_NUM_SHIFT		20
> -#define ECAM_DEV_NUM_SHIFT		12
> -
>  /* Number of MSI IRQs */
>  #define XILINX_NUM_MSI_IRQS		128
>  
> @@ -183,15 +180,11 @@ static void __iomem *xilinx_pcie_map_bus(struct pci_bus *bus,
>  					 unsigned int devfn, int where)
>  {
>  	struct xilinx_pcie_port *port = bus->sysdata;
> -	int relbus;
>  
>  	if (!xilinx_pcie_valid_device(bus, devfn))
>  		return NULL;
>  
> -	relbus = (bus->number << ECAM_BUS_NUM_SHIFT) |
> -		 (devfn << ECAM_DEV_NUM_SHIFT);
> -
> -	return port->reg_base + relbus + where;
> +	return port->reg_base + PCIE_ECAM_OFFSET(bus->number, devfn, where);
>  }
>  
>  /* PCIe operations */
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index f375c21ceeb1..1361a79bd1e7 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -11,6 +11,7 @@
>  #include <linux/module.h>
>  #include <linux/msi.h>
>  #include <linux/pci.h>
> +#include <linux/pci-ecam.h>
>  #include <linux/srcu.h>
>  #include <linux/rculist.h>
>  #include <linux/rcupdate.h>
> @@ -328,15 +329,13 @@ static void vmd_remove_irq_domain(struct vmd_dev *vmd)
>  static char __iomem *vmd_cfg_addr(struct vmd_dev *vmd, struct pci_bus *bus,
>  				  unsigned int devfn, int reg, int len)
>  {
> -	char __iomem *addr = vmd->cfgbar +
> -			     ((bus->number - vmd->busn_start) << 20) +
> -			     (devfn << 12) + reg;
> +	unsigned int busnr_ecam = bus->number - vmd->busn_start;
> +	u32 offset = PCIE_ECAM_OFFSET(busnr_ecam, devfn, reg);
>  
> -	if ((addr - vmd->cfgbar) + len >=
> -	    resource_size(&vmd->dev->resource[VMD_CFGBAR]))
> +	if (offset + len >= resource_size(&vmd->dev->resource[VMD_CFGBAR]))
>  		return NULL;
>  
> -	return addr;
> +	return vmd->cfgbar + offset;
>  }
>  
>  /*
> diff --git a/drivers/pci/ecam.c b/drivers/pci/ecam.c
> index b54d32a31669..59f91d434859 100644
> --- a/drivers/pci/ecam.c
> +++ b/drivers/pci/ecam.c
> @@ -131,25 +131,36 @@ void __iomem *pci_ecam_map_bus(struct pci_bus *bus, unsigned int devfn,
>  			       int where)
>  {
>  	struct pci_config_window *cfg = bus->sysdata;
> +	unsigned int bus_shift = cfg->ops->bus_shift;
>  	unsigned int devfn_shift = cfg->ops->bus_shift - 8;
>  	unsigned int busn = bus->number;
>  	void __iomem *base;
> +	u32 bus_offset, devfn_offset;
>  
>  	if (busn < cfg->busr.start || busn > cfg->busr.end)
>  		return NULL;
>  
>  	busn -= cfg->busr.start;
> -	if (per_bus_mapping)
> +	if (per_bus_mapping) {
>  		base = cfg->winp[busn];
> -	else
> -		base = cfg->win + (busn << cfg->ops->bus_shift);
> -	return base + (devfn << devfn_shift) + where;
> +		busn = 0;
> +	} else
> +		base = cfg->win;
> +
> +	if (cfg->ops->bus_shift) {
> +		bus_offset = (busn & PCIE_ECAM_BUS_MASK) << bus_shift;
> +		devfn_offset = (devfn & PCIE_ECAM_DEVFN_MASK) << devfn_shift;
> +		where &= PCIE_ECAM_REG_MASK;
> +
> +		return base + (bus_offset | devfn_offset | where);
> +	}
> +
> +	return base + PCIE_ECAM_OFFSET(busn, devfn, where);
>  }
>  EXPORT_SYMBOL_GPL(pci_ecam_map_bus);
>  
>  /* ECAM ops */
>  const struct pci_ecam_ops pci_generic_ecam_ops = {
> -	.bus_shift	= 20,
>  	.pci_ops	= {
>  		.map_bus	= pci_ecam_map_bus,
>  		.read		= pci_generic_config_read,
> @@ -161,7 +172,6 @@ EXPORT_SYMBOL_GPL(pci_generic_ecam_ops);
>  #if defined(CONFIG_ACPI) && defined(CONFIG_PCI_QUIRKS)
>  /* ECAM ops for 32-bit access only (non-compliant) */
>  const struct pci_ecam_ops pci_32b_ops = {
> -	.bus_shift	= 20,
>  	.pci_ops	= {
>  		.map_bus	= pci_ecam_map_bus,
>  		.read		= pci_generic_config_read32,
> @@ -171,7 +181,6 @@ const struct pci_ecam_ops pci_32b_ops = {
>  
>  /* ECAM ops for 32-bit read only (non-compliant) */
>  const struct pci_ecam_ops pci_32b_read_ops = {
> -	.bus_shift	= 20,
>  	.pci_ops	= {
>  		.map_bus	= pci_ecam_map_bus,
>  		.read		= pci_generic_config_read32,
> diff --git a/include/linux/pci-ecam.h b/include/linux/pci-ecam.h
> index 033ce74f02e8..65d3d83015c3 100644
> --- a/include/linux/pci-ecam.h
> +++ b/include/linux/pci-ecam.h
> @@ -9,6 +9,33 @@
>  #include <linux/kernel.h>
>  #include <linux/platform_device.h>
>  
> +/*
> + * Memory address shift values for the byte-level address that
> + * can be used when accessing the PCI Express Configuration Space.
> + */
> +
> +/*
> + * Enhanced Configuration Access Mechanism (ECAM)
> + *
> + * See PCI Express Base Specification, Revision 5.0, Version 1.0,
> + * Section 7.2.2, Table 7-1, p. 677.
> + */
> +#define PCIE_ECAM_BUS_SHIFT	20 /* Bus number */
> +#define PCIE_ECAM_DEVFN_SHIFT	12 /* Device and Function number */
> +
> +#define PCIE_ECAM_BUS_MASK	0xff
> +#define PCIE_ECAM_DEVFN_MASK	0xff
> +#define PCIE_ECAM_REG_MASK	0xfff /* Limit offset to a maximum of 4K */
> +
> +#define PCIE_ECAM_BUS(x)	(((x) & PCIE_ECAM_BUS_MASK) << PCIE_ECAM_BUS_SHIFT)
> +#define PCIE_ECAM_DEVFN(x)	(((x) & PCIE_ECAM_DEVFN_MASK) << PCIE_ECAM_DEVFN_SHIFT)
> +#define PCIE_ECAM_REG(x)	((x) & PCIE_ECAM_REG_MASK)
> +
> +#define PCIE_ECAM_OFFSET(bus, devfn, where) \
> +	(PCIE_ECAM_BUS(bus) | \
> +	 PCIE_ECAM_DEVFN(devfn) | \
> +	 PCIE_ECAM_REG(where))
> +
>  /*
>   * struct to hold pci ops and bus shift of the config window
>   * for a PCI controller.
> -- 
> 2.29.2
> 

^ permalink raw reply

* Re: [PATCH] powerpc: fix the allyesconfig build
From: Stephen Rothwell @ 2020-11-30 10:26 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Salil Mehta, Geert Uytterhoeven, Stephen Boyd, Michael Turquette,
	Linux Kernel Mailing List, Nicholas Piggin, linux-clk,
	Linux-Renesas, Yisen Zhuang, Joel Stanley, netdev, Jakub Kicinski,
	PowerPC, David S. Miller, Daniel Axtens
In-Reply-To: <CAMuHMdVJKarCRRRJq_hmvvv0NcSpREdqDbH8L5NitZmFUEbqmw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 482 bytes --]

Hi Geert,

On Mon, 30 Nov 2020 09:58:23 +0100 Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Thanks for your patch!

No worries, it has been a small irritant to me for quite a while.

> I prefer to fix this in the driver instead.  The space saving by packing the
> structure is minimal.
> I've sent a patch
> https://lore.kernel.org/r/20201130085743.1656317-1-geert+renesas@glider.be
> (when lore is back)

Absolutely, thanks.

-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH 6/8] lazy tlb: shoot lazies, a non-refcounting lazy tlb option
From: Peter Zijlstra @ 2020-11-30  9:34 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-arch, Arnd Bergmann, X86 ML, LKML, Nicholas Piggin,
	Linux-MM, Mathieu Desnoyers, linuxppc-dev
In-Reply-To: <20201130093000.GM2414@hirez.programming.kicks-ass.net>

On Mon, Nov 30, 2020 at 10:30:00AM +0100, Peter Zijlstra wrote:
> On Sat, Nov 28, 2020 at 07:54:57PM -0800, Andy Lutomirski wrote:
> > This means that mm_cpumask operations won't need to be full barriers
> > forever, and we might not want to take the implied full barriers in
> > set_bit() and clear_bit() for granted.
> 
> There is no implied full barrier for those ops.

Neither ARM nor Power implies any ordering on those ops. But Power has
some of the worst atomic performance in the world despite of that.

IIRC they don't do their LL/SC in L1.

^ permalink raw reply

* Re: [PATCH 6/8] lazy tlb: shoot lazies, a non-refcounting lazy tlb option
From: Peter Zijlstra @ 2020-11-30  9:30 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-arch, Arnd Bergmann, X86 ML, LKML, Nicholas Piggin,
	Linux-MM, Mathieu Desnoyers, linuxppc-dev
In-Reply-To: <CALCETrVXUbe8LfNn-Qs+DzrOQaiw+sFUg1J047yByV31SaTOZw@mail.gmail.com>

On Sat, Nov 28, 2020 at 07:54:57PM -0800, Andy Lutomirski wrote:
> This means that mm_cpumask operations won't need to be full barriers
> forever, and we might not want to take the implied full barriers in
> set_bit() and clear_bit() for granted.

There is no implied full barrier for those ops.

^ permalink raw reply

* Re: [PATCH 6/8] lazy tlb: shoot lazies, a non-refcounting lazy tlb option
From: Peter Zijlstra @ 2020-11-30  9:26 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-arch, Arnd Bergmann, X86 ML, LKML, Nicholas Piggin,
	Linux-MM, Mathieu Desnoyers, linuxppc-dev
In-Reply-To: <CALCETrVXUbe8LfNn-Qs+DzrOQaiw+sFUg1J047yByV31SaTOZw@mail.gmail.com>

On Sat, Nov 28, 2020 at 07:54:57PM -0800, Andy Lutomirski wrote:
> Version (b) seems fairly straightforward to implement -- add RCU
> protection and a atomic_t special_ref_cleared (initially 0) to struct
> mm_struct itself.  After anyone clears a bit to mm_cpumask (which is
> already a barrier),

No it isn't. clear_bit() implies no barrier what so ever. That's x86
you're thinking about.

> they read mm_users.  If it's zero, then they scan
> mm_cpumask and see if it's empty.  If it is, they atomically swap
> special_ref_cleared to 1.  If it was zero before the swap, they do
> mmdrop().  I can imagine some tweaks that could make this a big
> faster, at least in the limit of a huge number of CPUs.

^ permalink raw reply

* Re: [PATCH 6/8] lazy tlb: shoot lazies, a non-refcounting lazy tlb option
From: Peter Zijlstra @ 2020-11-30  9:25 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: linux-arch, Arnd Bergmann, X86 ML, LKML, Nicholas Piggin,
	Linux-MM, Mathieu Desnoyers, linuxppc-dev
In-Reply-To: <CALCETrWBtCfD+jZ3S+O8FK-HFPODuhbDEbbfWvS=-iPATNFAOA@mail.gmail.com>

On Sun, Nov 29, 2020 at 12:16:26PM -0800, Andy Lutomirski wrote:
> On Sat, Nov 28, 2020 at 7:54 PM Andy Lutomirski <luto@kernel.org> wrote:
> >
> > On Sat, Nov 28, 2020 at 8:02 AM Nicholas Piggin <npiggin@gmail.com> wrote:
> > >
> > > On big systems, the mm refcount can become highly contented when doing
> > > a lot of context switching with threaded applications (particularly
> > > switching between the idle thread and an application thread).
> > >
> > > Abandoning lazy tlb slows switching down quite a bit in the important
> > > user->idle->user cases, so so instead implement a non-refcounted scheme
> > > that causes __mmdrop() to IPI all CPUs in the mm_cpumask and shoot down
> > > any remaining lazy ones.
> > >
> > > Shootdown IPIs are some concern, but they have not been observed to be
> > > a big problem with this scheme (the powerpc implementation generated
> > > 314 additional interrupts on a 144 CPU system during a kernel compile).
> > > There are a number of strategies that could be employed to reduce IPIs
> > > if they turn out to be a problem for some workload.
> >
> > I'm still wondering whether we can do even better.
> >
> 
> Hold on a sec.. __mmput() unmaps VMAs, frees pagetables, and flushes
> the TLB.  On x86, this will shoot down all lazies as long as even a
> single pagetable was freed.  (Or at least it will if we don't have a
> serious bug, but the code seems okay.  We'll hit pmd_free_tlb, which
> sets tlb->freed_tables, which will trigger the IPI.)  So, on
> architectures like x86, the shootdown approach should be free.  The
> only way it ought to have any excess IPIs is if we have CPUs in
> mm_cpumask() that don't need IPI to free pagetables, which could
> happen on paravirt.
> 
> Can you try to figure out why you saw any increase in IPIs?  It would
> be nice if we can make the new code unconditional.

Power doesn't do IPI based TLBI.

^ permalink raw reply

* RE: [PATCH v6 4/5] PCI: vmd: Update type of the __iomem pointers
From: David Laight @ 2020-11-30  9:06 UTC (permalink / raw)
  To: 'Krzysztof Wilczyński', Bjorn Helgaas
  Cc: Heiko Stuebner, linux-pci@vger.kernel.org, Shawn Lin,
	Paul Mackerras, Thomas Petazzoni, Jonathan Chocron, Toan Le,
	Will Deacon, Rob Herring, Lorenzo Pieralisi, Michal Simek,
	linux-rockchip@lists.infradead.org,
	bcm-kernel-feedback-list@broadcom.com, Jonathan Derrick, Ray Jui,
	Florian Fainelli, linux-rpi-kernel@lists.infradead.org,
	Jonathan Cameron, linux-arm-kernel@lists.infradead.org,
	Scott Branden, Zhou Wang, Robert Richter,
	linuxppc-dev@lists.ozlabs.org, Nicolas Saenz Julienne
In-Reply-To: <20201129230743.3006978-5-kw@linux.com>

From: Krzysztof Wilczynski
> Sent: 29 November 2020 23:08
> 
> Use "void __iomem" instead "char __iomem" pointer type when working with
> the accessor functions (with names like readb() or writel(), etc.) to
> better match a given accessor function signature where commonly the
> address pointing to an I/O memory region would be a "void __iomem"
> pointer.

ISTM that is heading in the wrong direction.

I think (form the variable names etc) that these are pointers
to specific registers.

So what you ought to have is a type for that register block.
Typically this is actually a structure - to give some type
checking that the offsets are being used with the correct
base address.

If the code is using numeric offsets (hardware engineers like
numeric offsets) then you can get some type protection by using
a structure that only contains a single field (char in this case).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply

* Re: [PATCH 5/8] net: ethernet: ibm: ibmvnic: Fix some kernel-doc misdemeanours
From: Lee Jones @ 2020-11-30  9:04 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Thomas Falcon, John Allen, linux-kernel, Santiago Leon,
	Jakub Kicinski, netdev, Lijun Pan, Dany Madden, Paul Mackerras,
	Sukadev Bhattiprolu, linuxppc-dev, David S. Miller
In-Reply-To: <20201129184354.GL2234159@lunn.ch>

On Sun, 29 Nov 2020, Andrew Lunn wrote:

> Hi Lee
> 
> >  /**
> >   * build_hdr_data - creates L2/L3/L4 header data buffer
> > - * @hdr_field - bitfield determining needed headers
> > - * @skb - socket buffer
> > - * @hdr_len - array of header lengths
> > - * @tot_len - total length of data
> > + * @hdr_field: bitfield determining needed headers
> > + * @skb: socket buffer
> > + * @hdr_len: array of header lengths
> > + * @tot_len: total length of data
> >   *
> >   * Reads hdr_field to determine which headers are needed by firmware.
> >   * Builds a buffer containing these headers.  Saves individual header
> 
> The code is:
> 
> static int build_hdr_data(u8 hdr_field, struct sk_buff *skb,
>                           int *hdr_len, u8 *hdr_data)
> {
> 
> What about hdr_data? 
> 
> >  /**
> >   * create_hdr_descs - create header and header extension descriptors
> > - * @hdr_field - bitfield determining needed headers
> > - * @data - buffer containing header data
> > - * @len - length of data buffer
> > - * @hdr_len - array of individual header lengths
> > - * @scrq_arr - descriptor array
> > + * @hdr_field: bitfield determining needed headers
> > + * @data: buffer containing header data
> > + * @len: length of data buffer
> > + * @hdr_len: array of individual header lengths
> > + * @scrq_arr: descriptor array
> 
> static int create_hdr_descs(u8 hdr_field, u8 *hdr_data, int len, int *hdr_len,
>                             union sub_crq *scrq_arr)
> 
> There is no data parameter.
> 
> It looks like you just changes - to :, but did not validate the
> parameters are actually correct.

Right.  This is a 'quirk' of my current process.

I build once, then use a script to parse the output, fixing each
issue in the order the compiler presents them.  Then, either after
fixing a reasonable collection, or all issues, I re-run the compile
and fix the next batch (if any).

This patch is only fixing the formatting issue(s).  As you've seen,
there is a subsequent patch in the series which fixes the disparity.

I can squash them though.  No problem.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply

* Re: [PATCH 8/8] net: ethernet: ibm: ibmvnic: Fix some kernel-doc issues
From: Lee Jones @ 2020-11-30  8:59 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Thomas Falcon, John Allen, linux-kernel, Santiago Leon,
	Jakub Kicinski, netdev, Lijun Pan, Dany Madden, Paul Mackerras,
	Sukadev Bhattiprolu, linuxppc-dev, David S. Miller
In-Reply-To: <20201129191015.GO2234159@lunn.ch>

On Sun, 29 Nov 2020, Andrew Lunn wrote:

> On Thu, Nov 26, 2020 at 01:38:53PM +0000, Lee Jones wrote:
> > Fixes the following W=1 kernel build warning(s):
> > 
> >  from drivers/net/ethernet/ibm/ibmvnic.c:35:
> >  inlined from ‘handle_vpd_rsp’ at drivers/net/ethernet/ibm/ibmvnic.c:4124:3:
> >  drivers/net/ethernet/ibm/ibmvnic.c:1362: warning: Function parameter or member 'hdr_data' not described in 'build_hdr_data'
> >  drivers/net/ethernet/ibm/ibmvnic.c:1362: warning: Excess function parameter 'tot_len' description in 'build_hdr_data'
> >  drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'hdr_data' not described in 'create_hdr_descs'
> >  drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Excess function parameter 'data' description in 'create_hdr_descs'
> >  drivers/net/ethernet/ibm/ibmvnic.c:1474: warning: Function parameter or member 'txbuff' not described in 'build_hdr_descs_arr'
> >  drivers/net/ethernet/ibm/ibmvnic.c:1474: warning: Excess function parameter 'skb' description in 'build_hdr_descs_arr'
> >  drivers/net/ethernet/ibm/ibmvnic.c:1474: warning: Excess function parameter 'subcrq' description in 'build_hdr_descs_arr'
> 
> Hi Lee
> 
> It looks like this should be squashed into the previous patch to this
> file.

It certainly looks that way.  Will fix.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply

* Re: [PATCH] powerpc: fix the allyesconfig build
From: Geert Uytterhoeven @ 2020-11-30  8:58 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Salil Mehta, Geert Uytterhoeven, Stephen Boyd, Michael Turquette,
	Linux Kernel Mailing List, Nicholas Piggin, linux-clk,
	Linux-Renesas, Yisen Zhuang, Joel Stanley, netdev, Jakub Kicinski,
	PowerPC, David S. Miller, Daniel Axtens
In-Reply-To: <20201128122819.32187696@canb.auug.org.au>

Hi Stephen,

On Sat, Nov 28, 2020 at 2:28 AM Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> There are 2 drivers that have arrays of packed structures that contain
> pointers that end up at unaligned offsets.  These produce warnings in
> the PowerPC allyesconfig build like this:
>
> WARNING: 148 bad relocations
> c00000000e56510b R_PPC64_UADDR64   .rodata+0x0000000001c72378
> c00000000e565126 R_PPC64_UADDR64   .rodata+0x0000000001c723c0
>
> They are not drivers that are used on PowerPC (I assume), so mark them
> to not be built on PPC64 when CONFIG_RELOCATABLE is enabled.
>
> Cc: Geert Uytterhoeven <geert+renesas@glider.be>
> Cc: Michael Turquette <mturquette@baylibre.com>
> Cc: Stephen Boyd <sboyd@kernel.org>
> Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
> Cc: Salil Mehta <salil.mehta@huawei.com>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Nicholas Piggin  <npiggin@gmail.com>
> Cc: Daniel Axtens <dja@axtens.net>
> Cc: Joel Stanley <joel@jms.id.au>
> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>

Thanks for your patch!

> --- a/drivers/clk/renesas/Kconfig
> +++ b/drivers/clk/renesas/Kconfig
> @@ -151,6 +151,10 @@ config CLK_R8A779A0
>         select CLK_RENESAS_CPG_MSSR
>
>  config CLK_R9A06G032
> +       # PPC64 RELOCATABLE kernels cannot handle relocations to
> +       # unaligned locations that are produced by the array of
> +       # packed structures in this driver.
> +       depends on !(PPC64 && RELOCATABLE)
>         bool "Renesas R9A06G032 clock driver"
>         help
>           This is a driver for R9A06G032 clocks

I prefer to fix this in the driver instead.  The space saving by packing the
structure is minimal.
I've sent a patch
https://lore.kernel.org/r/20201130085743.1656317-1-geert+renesas@glider.be
(when lore is back)

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* [PATCH v2] clk: renesas: r9a06g032: Drop __packed for portability
From: Geert Uytterhoeven @ 2020-11-30  8:57 UTC (permalink / raw)
  To: Stephen Rothwell, Michael Turquette, Stephen Boyd,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras
  Cc: Geert Uytterhoeven, linux-kernel, Gareth Williams,
	linux-renesas-soc, linuxppc-dev, linux-clk

The R9A06G032 clock driver uses an array of packed structures to reduce
kernel size.  However, this array contains pointers, which are no longer
aligned naturally, and cannot be relocated on PPC64.  Hence when
compile-testing this driver on PPC64 with CONFIG_RELOCATABLE=y (e.g.
PowerPC allyesconfig), the following warnings are produced:

    WARNING: 136 bad relocations
    c000000000616be3 R_PPC64_UADDR64   .rodata+0x00000000000cf338
    c000000000616bfe R_PPC64_UADDR64   .rodata+0x00000000000cf370
    ...

Fix this by dropping the __packed attribute from the r9a06g032_clkdesc
definition, trading a small size increase for portability.

This increases the 156-entry clock table by 1 byte per entry, but due to
the compiler generating more efficient code for unpacked accesses, the
net size increase is only 76 bytes (gcc 9.3.0 on arm32).

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 4c3d88526eba2143 ("clk: renesas: Renesas R9A06G032 clock driver")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
v2:
  - Fix authorship.
---
 drivers/clk/renesas/r9a06g032-clocks.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clk/renesas/r9a06g032-clocks.c b/drivers/clk/renesas/r9a06g032-clocks.c
index d900f6bf53d0b944..892e91b92f2c80f5 100644
--- a/drivers/clk/renesas/r9a06g032-clocks.c
+++ b/drivers/clk/renesas/r9a06g032-clocks.c
@@ -55,7 +55,7 @@ struct r9a06g032_clkdesc {
 			u16 sel, g1, r1, g2, r2;
 		} dual;
 	};
-} __packed;
+};
 
 #define I_GATE(_clk, _rst, _rdy, _midle, _scon, _mirack, _mistat) \
 	{ .gate = _clk, .reset = _rst, \
-- 
2.25.1


^ permalink raw reply related

* [PATCH] clk: renesas: r9a06g032: Drop __packed for portability
From: Geert Uytterhoeven @ 2020-11-30  8:53 UTC (permalink / raw)
  To: Stephen Rothwell, Michael Turquette, Stephen Boyd,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras
  Cc: linux-kernel, Gareth Williams, linux-renesas-soc,
	Geert Uytterhoeven, linuxppc-dev, linux-clk

The R9A06G032 clock driver uses an array of packed structures to reduce
kernel size.  However, this array contains pointers, which are no longer
aligned naturally, and cannot be relocated on PPC64.  Hence when
compile-testing this driver on PPC64 with CONFIG_RELOCATABLE=y (e.g.
PowerPC allyesconfig), the following warnings are produced:

    WARNING: 136 bad relocations
    c000000000616be3 R_PPC64_UADDR64   .rodata+0x00000000000cf338
    c000000000616bfe R_PPC64_UADDR64   .rodata+0x00000000000cf370
    ...

Fix this by dropping the __packed attribute from the r9a06g032_clkdesc
definition, trading a small size increase for portability.

This increases the 156-entry clock table by 1 byte per entry, but due to
the compiler generating more efficient code for unpacked accesses, the
net size increase is only 76 bytes (gcc 9.3.0 on arm32).

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 4c3d88526eba2143 ("clk: renesas: Renesas R9A06G032 clock driver")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
Please take directly (ppc or clk), as this is a build fix.
https://lore.kernel.org/linux-clk/20201128122819.32187696@canb.auug.org.au/

Compile-tested only due to lack of hardware.

 drivers/clk/renesas/r9a06g032-clocks.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clk/renesas/r9a06g032-clocks.c b/drivers/clk/renesas/r9a06g032-clocks.c
index d900f6bf53d0b944..892e91b92f2c80f5 100644
--- a/drivers/clk/renesas/r9a06g032-clocks.c
+++ b/drivers/clk/renesas/r9a06g032-clocks.c
@@ -55,7 +55,7 @@ struct r9a06g032_clkdesc {
 			u16 sel, g1, r1, g2, r2;
 		} dual;
 	};
-} __packed;
+};
 
 #define I_GATE(_clk, _rst, _rdy, _midle, _scon, _mirack, _mistat) \
 	{ .gate = _clk, .reset = _rst, \
-- 
2.25.1


^ permalink raw reply related

* [PATCH v5] lkdtm/powerpc: Add SLB multihit test
From: Ganesh Goudar @ 2020-11-30  8:30 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: msuchanek, Ganesh Goudar, keescook, npiggin, mahesh

To check machine check handling, add support to inject slb
multihit errors.

Cc: Kees Cook <keescook@chromium.org>
Cc: Michal Suchánek <msuchanek@suse.de>
Co-developed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
---
v5:
- Insert entries at SLB_NUM_BOLTED and SLB_NUM_BOLTED +1, remove index
  allocating helper function.
- Move mk_esid_data and mk_vsid_data helpers to asm/book3s/64/mmu-hash.h.
- Use mmu_linear_psize and mmu_vmalloc_psize to get page size.
- Use !radix_enabled() to check if we are in HASH mode.
- And other minor improvements.

v1-v4:
- No major changes here for this patch, This patch was initially posted
  along with the other patch which got accepted.
https://git.kernel.org/powerpc/c/8d0e2101274358d9b6b1f27232b40253ca48bab5
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |  28 +++-
 arch/powerpc/mm/book3s64/hash_utils.c         |   1 +
 arch/powerpc/mm/book3s64/slb.c                |  27 ----
 drivers/misc/lkdtm/Makefile                   |   1 +
 drivers/misc/lkdtm/core.c                     |   3 +
 drivers/misc/lkdtm/lkdtm.h                    |   3 +
 drivers/misc/lkdtm/powerpc.c                  | 120 ++++++++++++++++++
 tools/testing/selftests/lkdtm/tests.txt       |   1 +
 8 files changed, 156 insertions(+), 28 deletions(-)
 create mode 100644 drivers/misc/lkdtm/powerpc.c

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 683a9c7d1b03..8b9f07900395 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -842,6 +842,32 @@ static inline unsigned long get_kernel_vsid(unsigned long ea, int ssize)
 
 unsigned htab_shift_for_mem_size(unsigned long mem_size);
 
-#endif /* __ASSEMBLY__ */
+enum slb_index {
+	LINEAR_INDEX	= 0, /* Kernel linear map  (0xc000000000000000) */
+	KSTACK_INDEX	= 1, /* Kernel stack map */
+};
 
+#define slb_esid_mask(ssize)	\
+	(((ssize) == MMU_SEGSIZE_256M) ? ESID_MASK : ESID_MASK_1T)
+
+static inline unsigned long mk_esid_data(unsigned long ea, int ssize,
+					 enum slb_index index)
+{
+	return (ea & slb_esid_mask(ssize)) | SLB_ESID_V | index;
+}
+
+static inline unsigned long __mk_vsid_data(unsigned long vsid, int ssize,
+					   unsigned long flags)
+{
+	return (vsid << slb_vsid_shift(ssize)) | flags |
+		((unsigned long)ssize << SLB_VSID_SSIZE_SHIFT);
+}
+
+static inline unsigned long mk_vsid_data(unsigned long ea, int ssize,
+					 unsigned long flags)
+{
+	return __mk_vsid_data(get_kernel_vsid(ea, ssize), ssize, flags);
+}
+
+#endif /* __ASSEMBLY__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_MMU_HASH_H_ */
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 24702c0a92e0..38076a998850 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -112,6 +112,7 @@ int mmu_linear_psize = MMU_PAGE_4K;
 EXPORT_SYMBOL_GPL(mmu_linear_psize);
 int mmu_virtual_psize = MMU_PAGE_4K;
 int mmu_vmalloc_psize = MMU_PAGE_4K;
+EXPORT_SYMBOL_GPL(mmu_vmalloc_psize);
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 int mmu_vmemmap_psize = MMU_PAGE_4K;
 #endif
diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
index c30fcbfa0e32..985706acb0e5 100644
--- a/arch/powerpc/mm/book3s64/slb.c
+++ b/arch/powerpc/mm/book3s64/slb.c
@@ -28,35 +28,8 @@
 #include "internal.h"
 
 
-enum slb_index {
-	LINEAR_INDEX	= 0, /* Kernel linear map  (0xc000000000000000) */
-	KSTACK_INDEX	= 1, /* Kernel stack map */
-};
-
 static long slb_allocate_user(struct mm_struct *mm, unsigned long ea);
 
-#define slb_esid_mask(ssize)	\
-	(((ssize) == MMU_SEGSIZE_256M)? ESID_MASK: ESID_MASK_1T)
-
-static inline unsigned long mk_esid_data(unsigned long ea, int ssize,
-					 enum slb_index index)
-{
-	return (ea & slb_esid_mask(ssize)) | SLB_ESID_V | index;
-}
-
-static inline unsigned long __mk_vsid_data(unsigned long vsid, int ssize,
-					 unsigned long flags)
-{
-	return (vsid << slb_vsid_shift(ssize)) | flags |
-		((unsigned long) ssize << SLB_VSID_SSIZE_SHIFT);
-}
-
-static inline unsigned long mk_vsid_data(unsigned long ea, int ssize,
-					 unsigned long flags)
-{
-	return __mk_vsid_data(get_kernel_vsid(ea, ssize), ssize, flags);
-}
-
 bool stress_slb_enabled __initdata;
 
 static int __init parse_stress_slb(char *p)
diff --git a/drivers/misc/lkdtm/Makefile b/drivers/misc/lkdtm/Makefile
index c70b3822013f..f37ecfb0a707 100644
--- a/drivers/misc/lkdtm/Makefile
+++ b/drivers/misc/lkdtm/Makefile
@@ -10,6 +10,7 @@ lkdtm-$(CONFIG_LKDTM)		+= rodata_objcopy.o
 lkdtm-$(CONFIG_LKDTM)		+= usercopy.o
 lkdtm-$(CONFIG_LKDTM)		+= stackleak.o
 lkdtm-$(CONFIG_LKDTM)		+= cfi.o
+lkdtm-$(CONFIG_PPC64)		+= powerpc.o
 
 KASAN_SANITIZE_stackleak.o	:= n
 KCOV_INSTRUMENT_rodata.o	:= n
diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
index 97803f213d9d..538098dc5aec 100644
--- a/drivers/misc/lkdtm/core.c
+++ b/drivers/misc/lkdtm/core.c
@@ -176,6 +176,9 @@ static const struct crashtype crashtypes[] = {
 #ifdef CONFIG_X86_32
 	CRASHTYPE(DOUBLE_FAULT),
 #endif
+#ifdef CONFIG_PPC64
+	CRASHTYPE(PPC_SLB_MULTIHIT),
+#endif
 };
 
 
diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
index 6dec4c9b442f..79ec05c18dd1 100644
--- a/drivers/misc/lkdtm/lkdtm.h
+++ b/drivers/misc/lkdtm/lkdtm.h
@@ -102,4 +102,7 @@ void lkdtm_STACKLEAK_ERASING(void);
 /* cfi.c */
 void lkdtm_CFI_FORWARD_PROTO(void);
 
+/* powerpc.c */
+void lkdtm_PPC_SLB_MULTIHIT(void);
+
 #endif
diff --git a/drivers/misc/lkdtm/powerpc.c b/drivers/misc/lkdtm/powerpc.c
new file mode 100644
index 000000000000..077c9f9ed8d0
--- /dev/null
+++ b/drivers/misc/lkdtm/powerpc.c
@@ -0,0 +1,120 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "lkdtm.h"
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <asm/mmu.h>
+
+/* Inserts new slb entries */
+static void insert_slb_entry(unsigned long p, int ssize, int page_size)
+{
+	unsigned long flags;
+
+	flags = SLB_VSID_KERNEL | mmu_psize_defs[page_size].sllp;
+	preempt_disable();
+
+	asm volatile("slbmte %0,%1" :
+		     : "r" (mk_vsid_data(p, ssize, flags)),
+		       "r" (mk_esid_data(p, ssize, SLB_NUM_BOLTED))
+		     : "memory");
+
+	asm volatile("slbmte %0,%1" :
+			: "r" (mk_vsid_data(p, ssize, flags)),
+			  "r" (mk_esid_data(p, ssize, SLB_NUM_BOLTED + 1))
+			: "memory");
+	preempt_enable();
+}
+
+/* Inject slb multihit on vmalloc-ed address i.e 0xD00... */
+static int inject_vmalloc_slb_multihit(void)
+{
+	char *p;
+
+	p = vmalloc(PAGE_SIZE);
+	if (!p)
+		return -ENOMEM;
+
+	insert_slb_entry((unsigned long)p, MMU_SEGSIZE_1T, mmu_vmalloc_psize);
+	/*
+	 * This triggers exception, If handled correctly we must recover
+	 * from this error.
+	 */
+	p[0] = '!';
+	vfree(p);
+	return 0;
+}
+
+/* Inject slb multihit on kmalloc-ed address i.e 0xC00... */
+static int inject_kmalloc_slb_multihit(void)
+{
+	char *p;
+
+	p = kmalloc(2048, GFP_KERNEL);
+	if (!p)
+		return -ENOMEM;
+
+	insert_slb_entry((unsigned long)p, MMU_SEGSIZE_1T, mmu_linear_psize);
+	/*
+	 * This triggers exception, If handled correctly we must recover
+	 * from this error.
+	 */
+	p[0] = '!';
+	kfree(p);
+	return 0;
+}
+
+/*
+ * Few initial SLB entries are bolted. Add a test to inject
+ * multihit in bolted entry 0.
+ */
+static void insert_dup_slb_entry_0(void)
+{
+	unsigned long test_address = PAGE_OFFSET, *test_ptr;
+	unsigned long esid, vsid;
+	unsigned long i = 0;
+
+	test_ptr = (unsigned long *)test_address;
+	preempt_disable();
+
+	asm volatile("slbmfee  %0,%1" : "=r" (esid) : "r" (i));
+	asm volatile("slbmfev  %0,%1" : "=r" (vsid) : "r" (i));
+
+	/* for i !=0 we would need to mask out the old entry number */
+	asm volatile("slbmte %0,%1" :
+			: "r" (vsid),
+			  "r" (esid | SLB_NUM_BOLTED)
+			: "memory");
+
+	asm volatile("slbmfee  %0,%1" : "=r" (esid) : "r" (i));
+	asm volatile("slbmfev  %0,%1" : "=r" (vsid) : "r" (i));
+
+	/* for i !=0 we would need to mask out the old entry number */
+	asm volatile("slbmte %0,%1" :
+			: "r" (vsid),
+			  "r" (esid | (SLB_NUM_BOLTED + 1))
+			: "memory");
+
+	pr_info("%s accessing test address 0x%lx: 0x%lx\n",
+		__func__, test_address, *test_ptr);
+
+	preempt_enable();
+}
+
+void lkdtm_PPC_SLB_MULTIHIT(void)
+{
+	if (!radix_enabled()) {
+		pr_info("Injecting SLB multihit errors\n");
+		/*
+		 * These need not be separate tests, And they do pretty
+		 * much same thing. In any case we must recover from the
+		 * errors introduced by these functions, machine would not
+		 * survive these tests in case of failure to handle.
+		 */
+		inject_vmalloc_slb_multihit();
+		inject_kmalloc_slb_multihit();
+		insert_dup_slb_entry_0();
+		pr_info("Recovered from SLB multihit errors\n");
+	} else {
+		pr_err("XFAIL: This test is for ppc64 and with hash mode MMU only\n");
+	}
+}
diff --git a/tools/testing/selftests/lkdtm/tests.txt b/tools/testing/selftests/lkdtm/tests.txt
index 74a8d329a72c..18e4599863c0 100644
--- a/tools/testing/selftests/lkdtm/tests.txt
+++ b/tools/testing/selftests/lkdtm/tests.txt
@@ -68,3 +68,4 @@ USERCOPY_STACK_BEYOND
 USERCOPY_KERNEL
 STACKLEAK_ERASING OK: the rest of the thread stack is properly erased
 CFI_FORWARD_PROTO
+PPC_SLB_MULTIHIT Recovered
-- 
2.26.2


^ permalink raw reply related

* Re: [PATCH v3 05/19] powerpc: interrupt handler wrapper functions
From: Aneesh Kumar K.V @ 2020-11-30  7:37 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20201128144114.944000-6-npiggin@gmail.com>

Nicholas Piggin <npiggin@gmail.com> writes:

.....
 +#endif
> +DECLARE_INTERRUPT_HANDLER(emulation_assist_interrupt);
> +DECLARE_INTERRUPT_HANDLER_RAW(do_slb_fault);

Can we add comments here explaining why some of these handlers need to remain RAW()?

> +DECLARE_INTERRUPT_HANDLER(do_bad_slb_fault);
> +DECLARE_INTERRUPT_HANDLER_RET(do_hash_fault);
> +DECLARE_INTERRUPT_HANDLER_RET(do_page_fault);
> +DECLARE_INTERRUPT_HANDLER(do_bad_page_fault);
> +
> +DECLARE_INTERRUPT_HANDLER_ASYNC(timer_interrupt);
> +DECLARE_INTERRUPT_HANDLER_NMI(performance_monitor_exception_nmi);
> +DECLARE_INTERRUPT_HANDLER_ASYNC(performance_monitor_exception_async);
> +DECLARE_INTERRUPT_HANDLER_RAW(performance_monitor_exception);

Same for this.

> +DECLARE_INTERRUPT_HANDLER(WatchdogException);
> +DECLARE_INTERRUPT_HANDLER(unknown_exception);
> +DECLARE_INTERRUPT_HANDLER_ASYNC(unknown_async_exception);
> +
> +void replay_system_reset(void);
> +void replay_soft_interrupts(void);
> +
> +#endif /* _ASM_POWERPC_INTERRUPT_H */
> diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
> index 2f566c1a754c..335d6fd589a7 100644
> --- a/arch/powerpc/include/asm/time.h
> +++ b/arch/powerpc/include/asm/time.h
> @@ -131,6 +131,8 @@ DECLARE_PER_CPU(u64, decrementers_next_tb);
>  /* Convert timebase ticks to nanoseconds */
>  unsigned long long tb_to_ns(unsigned long long tb_ticks);
>  
> +void timer_broadcast_interrupt(void);
> +
>  /* SPLPAR */
>  void accumulate_stolen_time(void);
>  
> 

-aneesh

^ permalink raw reply

* Re: [PATCH v3 03/19] powerpc: bad_page_fault, do_break get registers from regs
From: Aneesh Kumar K.V @ 2020-11-30  7:36 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20201128144114.944000-4-npiggin@gmail.com>

Nicholas Piggin <npiggin@gmail.com> writes:

> Similar to the previous patch this makes interrupt handler function
> types more regular so they can be wrapped with the next patch.
>
> bad_page_fault and do_break are not performance critical.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

> [32s DABR code from Christophe Leroy <christophe.leroy@csgroup.eu>]
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/powerpc/include/asm/bug.h             |  2 +-
>  arch/powerpc/include/asm/debug.h           |  3 +--
>  arch/powerpc/kernel/entry_32.S             | 18 +-----------------
>  arch/powerpc/kernel/exceptions-64e.S       |  3 +--
>  arch/powerpc/kernel/exceptions-64s.S       |  3 +--
>  arch/powerpc/kernel/head_8xx.S             |  5 ++---
>  arch/powerpc/kernel/head_book3s_32.S       |  3 +++
>  arch/powerpc/kernel/process.c              |  7 +++----
>  arch/powerpc/kernel/traps.c                |  2 +-
>  arch/powerpc/mm/book3s64/hash_utils.c      |  4 ++--
>  arch/powerpc/mm/book3s64/slb.c             |  2 +-
>  arch/powerpc/mm/fault.c                    | 10 +++++-----
>  arch/powerpc/platforms/8xx/machine_check.c |  2 +-
>  13 files changed, 23 insertions(+), 41 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
> index 897bad6b6bbb..49162faba33f 100644
> --- a/arch/powerpc/include/asm/bug.h
> +++ b/arch/powerpc/include/asm/bug.h
> @@ -113,7 +113,7 @@
>  struct pt_regs;
>  long do_page_fault(struct pt_regs *);
>  long hash__do_page_fault(struct pt_regs *);
> -extern void bad_page_fault(struct pt_regs *, unsigned long, int);
> +void bad_page_fault(struct pt_regs *, int);
>  extern void _exception(int, struct pt_regs *, int, unsigned long);
>  extern void _exception_pkey(struct pt_regs *, unsigned long, int);
>  extern void die(const char *, struct pt_regs *, long);
> diff --git a/arch/powerpc/include/asm/debug.h b/arch/powerpc/include/asm/debug.h
> index ec57daf87f40..0550eceab3ca 100644
> --- a/arch/powerpc/include/asm/debug.h
> +++ b/arch/powerpc/include/asm/debug.h
> @@ -52,8 +52,7 @@ extern void do_send_trap(struct pt_regs *regs, unsigned long address,
>  			 unsigned long error_code, int brkpt);
>  #else
>  
> -extern void do_break(struct pt_regs *regs, unsigned long address,
> -		     unsigned long error_code);
> +void do_break(struct pt_regs *regs);
>  #endif
>  
>  #endif /* _ASM_POWERPC_DEBUG_H */
> diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
> index 8cdc8bcde703..57b8e95ea2a0 100644
> --- a/arch/powerpc/kernel/entry_32.S
> +++ b/arch/powerpc/kernel/entry_32.S
> @@ -657,10 +657,6 @@ ppc_swapcontext:
>  	.globl	handle_page_fault
>  handle_page_fault:
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
> -#ifdef CONFIG_PPC_BOOK3S_32
> -	andis.  r0,r5,DSISR_DABRMATCH@h
> -	bne-    handle_dabr_fault
> -#endif
>  	bl	do_page_fault
>  	cmpwi	r3,0
>  	beq+	ret_from_except
> @@ -668,23 +664,11 @@ handle_page_fault:
>  	lwz	r0,_TRAP(r1)
>  	clrrwi	r0,r0,1
>  	stw	r0,_TRAP(r1)
> -	mr	r5,r3
> +	mr	r4,r3		/* err arg for bad_page_fault */
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	lwz	r4,_DAR(r1)
>  	bl	bad_page_fault
>  	b	ret_from_except_full
>  
> -#ifdef CONFIG_PPC_BOOK3S_32
> -	/* We have a data breakpoint exception - handle it */
> -handle_dabr_fault:
> -	SAVE_NVGPRS(r1)
> -	lwz	r0,_TRAP(r1)
> -	clrrwi	r0,r0,1
> -	stw	r0,_TRAP(r1)
> -	bl      do_break
> -	b	ret_from_except_full
> -#endif
> -
>  /*
>   * This routine switches between two different tasks.  The process
>   * state of one is saved on its kernel stack.  Then the state
> diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
> index 25fa7d5a643c..dc728bb1c89a 100644
> --- a/arch/powerpc/kernel/exceptions-64e.S
> +++ b/arch/powerpc/kernel/exceptions-64e.S
> @@ -1018,9 +1018,8 @@ storage_fault_common:
>  	bne-	1f
>  	b	ret_from_except_lite
>  1:	bl	save_nvgprs
> -	mr	r5,r3
> +	mr	r4,r3
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	ld	r4,_DAR(r1)
>  	bl	bad_page_fault
>  	b	ret_from_except
>  
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 690058043b17..77b730f515c4 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -2136,8 +2136,7 @@ EXC_COMMON_BEGIN(h_data_storage_common)
>  	GEN_COMMON h_data_storage
>  	addi    r3,r1,STACK_FRAME_OVERHEAD
>  BEGIN_MMU_FTR_SECTION
> -	ld	r4,_DAR(r1)
> -	li	r5,SIGSEGV
> +	li	r4,SIGSEGV
>  	bl      bad_page_fault
>  MMU_FTR_SECTION_ELSE
>  	bl      unknown_exception
> diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
> index 8acd365a2be6..71ad7ce28469 100644
> --- a/arch/powerpc/kernel/head_8xx.S
> +++ b/arch/powerpc/kernel/head_8xx.S
> @@ -376,10 +376,9 @@ do_databreakpoint:
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
>  	mfspr	r4,SPRN_BAR
>  	stw	r4,_DAR(r11)
> -#ifdef CONFIG_VMAP_STACK
> -	lwz	r5,_DSISR(r11)
> -#else
> +#ifndef CONFIG_VMAP_STACK
>  	mfspr	r5,SPRN_DSISR
> +	stw	r5,_DSISR(r11)
>  #endif
>  	EXC_XFER_STD(0x1c00, do_break)
>  
> diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
> index 7addf67832f9..5875f8795d5b 100644
> --- a/arch/powerpc/kernel/head_book3s_32.S
> +++ b/arch/powerpc/kernel/head_book3s_32.S
> @@ -689,7 +689,10 @@ handle_page_fault_tramp_1:
>  #endif
>  	/* fall through */
>  handle_page_fault_tramp_2:
> +	andis.	r0, r5, DSISR_DABRMATCH@h
> +	bne-	1f
>  	EXC_XFER_LITE(0x300, handle_page_fault)
> +1:	EXC_XFER_STD(0x300, do_break)
>  
>  #ifdef CONFIG_VMAP_STACK
>  .macro save_regs_thread		thread
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index d421a2c7f822..0bdd3ed653df 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -660,11 +660,10 @@ static void do_break_handler(struct pt_regs *regs)
>  	}
>  }
>  
> -void do_break (struct pt_regs *regs, unsigned long address,
> -		    unsigned long error_code)
> +void do_break(struct pt_regs *regs)
>  {
>  	current->thread.trap_nr = TRAP_HWBKPT;
> -	if (notify_die(DIE_DABR_MATCH, "dabr_match", regs, error_code,
> +	if (notify_die(DIE_DABR_MATCH, "dabr_match", regs, regs->dsisr,
>  			11, SIGSEGV) == NOTIFY_STOP)
>  		return;
>  
> @@ -682,7 +681,7 @@ void do_break (struct pt_regs *regs, unsigned long address,
>  		do_break_handler(regs);
>  
>  	/* Deliver the signal to userspace */
> -	force_sig_fault(SIGTRAP, TRAP_HWBKPT, (void __user *)address);
> +	force_sig_fault(SIGTRAP, TRAP_HWBKPT, (void __user *)regs->dar);
>  }
>  #endif	/* CONFIG_PPC_ADV_DEBUG_REGS */
>  
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 5006dcbe1d9f..902fcbd1a778 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -1641,7 +1641,7 @@ void alignment_exception(struct pt_regs *regs)
>  	if (user_mode(regs))
>  		_exception(sig, regs, code, regs->dar);
>  	else
> -		bad_page_fault(regs, regs->dar, sig);
> +		bad_page_fault(regs, sig);
>  
>  bail:
>  	exception_exit(prev_state);
> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
> index 0f0bd4af4b2d..731518e7d56f 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -1537,7 +1537,7 @@ long do_hash_fault(struct pt_regs *regs)
>  	 * the access, or panic if there isn't a handler.
>  	 */
>  	if (unlikely(in_nmi())) {
> -		bad_page_fault(regs, ea, SIGSEGV);
> +		bad_page_fault(regs, SIGSEGV);
>  		return 0;
>  	}
>  
> @@ -1576,7 +1576,7 @@ long do_hash_fault(struct pt_regs *regs)
>  			else
>  				_exception(SIGBUS, regs, BUS_ADRERR, ea);
>  		} else {
> -			bad_page_fault(regs, ea, SIGBUS);
> +			bad_page_fault(regs, SIGBUS);
>  		}
>  		err = 0;
>  
> diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
> index cc34d50874c1..ae89ad516247 100644
> --- a/arch/powerpc/mm/book3s64/slb.c
> +++ b/arch/powerpc/mm/book3s64/slb.c
> @@ -898,7 +898,7 @@ void do_bad_slb_fault(struct pt_regs *regs)
>  		if (user_mode(regs))
>  			_exception(SIGSEGV, regs, SEGV_BNDERR, regs->dar);
>  		else
> -			bad_page_fault(regs, regs->dar, SIGSEGV);
> +			bad_page_fault(regs, SIGSEGV);
>  	} else if (err == -EINVAL) {
>  		unrecoverable_exception(regs);
>  	} else {
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 390a296b16a3..e11989be8f1c 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -562,14 +562,14 @@ long do_page_fault(struct pt_regs *regs)
>  	/* 32 and 64e handle errors in their asm code */
>  	if (unlikely(err)) {
>  		if (err > 0) {
> -			bad_page_fault(regs, address, err);
> +			bad_page_fault(regs, err);
>  			err = 0;
>  		} else {
>  			/*
>  			 * do_break() may change NV GPRS while handling the
>  			 * breakpoint. Return -ve to caller to do that.
>  			 */
> -			do_break(regs, address, error_code);
> +			do_break(regs);
>  		}
>  	}
>  #endif
> @@ -591,14 +591,14 @@ long hash__do_page_fault(struct pt_regs *regs)
>  	err = __do_page_fault(regs, address, error_code);
>  	if (unlikely(err)) {
>  		if (err > 0) {
> -			bad_page_fault(regs, address, err);
> +			bad_page_fault(regs, err);
>  			err = 0;
>  		} else {
>  			/*
>  			 * do_break() may change NV GPRS while handling the
>  			 * breakpoint. Return -ve to caller to do that.
>  			 */
> -			do_break(regs, address, error_code);
> +			do_break(regs);
>  		}
>  	}
>  
> @@ -612,7 +612,7 @@ NOKPROBE_SYMBOL(hash__do_page_fault);
>   * It is called from the DSI and ISI handlers in head.S and from some
>   * of the procedures in traps.c.
>   */
> -void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
> +void bad_page_fault(struct pt_regs *regs, int sig)
>  {
>  	const struct exception_table_entry *entry;
>  	int is_write = page_fault_is_write(regs->dsisr);
> diff --git a/arch/powerpc/platforms/8xx/machine_check.c b/arch/powerpc/platforms/8xx/machine_check.c
> index 88dedf38eccd..656365975895 100644
> --- a/arch/powerpc/platforms/8xx/machine_check.c
> +++ b/arch/powerpc/platforms/8xx/machine_check.c
> @@ -26,7 +26,7 @@ int machine_check_8xx(struct pt_regs *regs)
>  	 * to deal with that than having a wart in the mcheck handler.
>  	 * -- BenH
>  	 */
> -	bad_page_fault(regs, regs->dar, SIGBUS);
> +	bad_page_fault(regs, SIGBUS);
>  	return 1;
>  #else
>  	return 0;
> -- 
> 2.23.0

^ permalink raw reply

* Re: [PATCH v3 02/19] powerpc: remove arguments from fault handler functions
From: Aneesh Kumar K.V @ 2020-11-30  7:35 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20201128144114.944000-3-npiggin@gmail.com>

Nicholas Piggin <npiggin@gmail.com> writes:

> Make mm fault handlers all just take the pt_regs * argument and load
> DAR/DSISR from that. Make those that return a value return long.
>
> This is done to make the function signatures match other handlers, which
> will help with a future patch to add wrappers. Explicit arguments could
> be added for performance but that would require more wrapper macro
> variants.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/powerpc/include/asm/asm-prototypes.h     |  4 ++--
>  arch/powerpc/include/asm/book3s/64/mmu-hash.h |  2 +-
>  arch/powerpc/include/asm/bug.h                |  4 ++--
>  arch/powerpc/kernel/exceptions-64e.S          |  2 --
>  arch/powerpc/kernel/exceptions-64s.S          | 14 ++------------
>  arch/powerpc/kernel/head_40x.S                | 10 +++++-----
>  arch/powerpc/kernel/head_8xx.S                |  6 +++---
>  arch/powerpc/kernel/head_book3s_32.S          |  6 ++----
>  arch/powerpc/kernel/head_booke.h              |  4 +---
>  arch/powerpc/mm/book3s64/hash_utils.c         |  8 +++++---
>  arch/powerpc/mm/book3s64/slb.c                | 11 +++++++----
>  arch/powerpc/mm/fault.c                       | 16 +++++++++-------
>  12 files changed, 39 insertions(+), 48 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
> index d0b832cbbec8..22c9d08fa3a4 100644
> --- a/arch/powerpc/include/asm/asm-prototypes.h
> +++ b/arch/powerpc/include/asm/asm-prototypes.h
> @@ -82,8 +82,8 @@ void kernel_bad_stack(struct pt_regs *regs);
>  void system_reset_exception(struct pt_regs *regs);
>  void machine_check_exception(struct pt_regs *regs);
>  void emulation_assist_interrupt(struct pt_regs *regs);
> -long do_slb_fault(struct pt_regs *regs, unsigned long ea);
> -void do_bad_slb_fault(struct pt_regs *regs, unsigned long ea, long err);
> +long do_slb_fault(struct pt_regs *regs);
> +void do_bad_slb_fault(struct pt_regs *regs);
>  
>  /* signals, syscalls and interrupts */
>  long sys_swapcontext(struct ucontext __user *old_ctx,
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> index bc8c91f2d26f..e843d0b193d3 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> @@ -453,7 +453,7 @@ static inline unsigned long hpt_hash(unsigned long vpn,
>  #define HPTE_LOCAL_UPDATE	0x1
>  #define HPTE_NOHPTE_UPDATE	0x2
>  
> -int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr);
> +long do_hash_fault(struct pt_regs *regs);
>  extern int __hash_page_4K(unsigned long ea, unsigned long access,
>  			  unsigned long vsid, pte_t *ptep, unsigned long trap,
>  			  unsigned long flags, int ssize, int subpage_prot);
> diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
> index c0e9b7a967a8..897bad6b6bbb 100644
> --- a/arch/powerpc/include/asm/bug.h
> +++ b/arch/powerpc/include/asm/bug.h
> @@ -111,8 +111,8 @@
>  #ifndef __ASSEMBLY__
>  
>  struct pt_regs;
> -extern int do_page_fault(struct pt_regs *, unsigned long, unsigned long);
> -int hash__do_page_fault(struct pt_regs *, unsigned long, unsigned long);
> +long do_page_fault(struct pt_regs *);
> +long hash__do_page_fault(struct pt_regs *);
>  extern void bad_page_fault(struct pt_regs *, unsigned long, int);
>  extern void _exception(int, struct pt_regs *, int, unsigned long);
>  extern void _exception_pkey(struct pt_regs *, unsigned long, int);
> diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
> index f579ce46eef2..25fa7d5a643c 100644
> --- a/arch/powerpc/kernel/exceptions-64e.S
> +++ b/arch/powerpc/kernel/exceptions-64e.S
> @@ -1011,8 +1011,6 @@ storage_fault_common:
>  	std	r14,_DAR(r1)
>  	std	r15,_DSISR(r1)
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	mr	r4,r14
> -	mr	r5,r15
>  	ld	r14,PACA_EXGEN+EX_R14(r13)
>  	ld	r15,PACA_EXGEN+EX_R15(r13)
>  	bl	do_page_fault
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 336fa1fa39d1..690058043b17 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -1438,8 +1438,6 @@ EXC_VIRT_BEGIN(data_access, 0x4300, 0x80)
>  EXC_VIRT_END(data_access, 0x4300, 0x80)
>  EXC_COMMON_BEGIN(data_access_common)
>  	GEN_COMMON data_access
> -	ld	r4,_DAR(r1)
> -	ld	r5,_DSISR(r1)
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
>  BEGIN_MMU_FTR_SECTION
>  	bl	do_hash_fault
> @@ -1492,10 +1490,9 @@ EXC_VIRT_BEGIN(data_access_slb, 0x4380, 0x80)
>  EXC_VIRT_END(data_access_slb, 0x4380, 0x80)
>  EXC_COMMON_BEGIN(data_access_slb_common)
>  	GEN_COMMON data_access_slb
> -	ld	r4,_DAR(r1)
> -	addi	r3,r1,STACK_FRAME_OVERHEAD
>  BEGIN_MMU_FTR_SECTION
>  	/* HPT case, do SLB fault */
> +	addi	r3,r1,STACK_FRAME_OVERHEAD
>  	bl	do_slb_fault
>  	cmpdi	r3,0
>  	bne-	1f
> @@ -1507,8 +1504,6 @@ MMU_FTR_SECTION_ELSE
>  ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
>  	std	r3,RESULT(r1)
>  	RECONCILE_IRQ_STATE(r10, r11)
> -	ld	r4,_DAR(r1)
> -	ld	r5,RESULT(r1)
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
>  	bl	do_bad_slb_fault
>  	b	interrupt_return
> @@ -1543,8 +1538,6 @@ EXC_VIRT_BEGIN(instruction_access, 0x4400, 0x80)
>  EXC_VIRT_END(instruction_access, 0x4400, 0x80)
>  EXC_COMMON_BEGIN(instruction_access_common)
>  	GEN_COMMON instruction_access
> -	ld	r4,_DAR(r1)
> -	ld	r5,_DSISR(r1)
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
>  BEGIN_MMU_FTR_SECTION
>  	bl	do_hash_fault
> @@ -1588,10 +1581,9 @@ EXC_VIRT_BEGIN(instruction_access_slb, 0x4480, 0x80)
>  EXC_VIRT_END(instruction_access_slb, 0x4480, 0x80)
>  EXC_COMMON_BEGIN(instruction_access_slb_common)
>  	GEN_COMMON instruction_access_slb
> -	ld	r4,_DAR(r1)
> -	addi	r3,r1,STACK_FRAME_OVERHEAD
>  BEGIN_MMU_FTR_SECTION
>  	/* HPT case, do SLB fault */
> +	addi	r3,r1,STACK_FRAME_OVERHEAD
>  	bl	do_slb_fault
>  	cmpdi	r3,0
>  	bne-	1f
> @@ -1603,8 +1595,6 @@ MMU_FTR_SECTION_ELSE
>  ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
>  	std	r3,RESULT(r1)
>  	RECONCILE_IRQ_STATE(r10, r11)
> -	ld	r4,_DAR(r1)
> -	ld	r5,RESULT(r1)
>  	addi	r3,r1,STACK_FRAME_OVERHEAD
>  	bl	do_bad_slb_fault
>  	b	interrupt_return
> diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
> index a1ae00689e0f..3c5577ac4dc8 100644
> --- a/arch/powerpc/kernel/head_40x.S
> +++ b/arch/powerpc/kernel/head_40x.S
> @@ -179,9 +179,9 @@ _ENTRY(saved_ksp_limit)
>   */
>  	START_EXCEPTION(0x0300,	DataStorage)
>  	EXCEPTION_PROLOG
> -	mfspr	r5, SPRN_ESR		/* Grab the ESR, save it, pass arg3 */
> +	mfspr	r5, SPRN_ESR		/* Grab the ESR, save it */
>  	stw	r5, _ESR(r11)
> -	mfspr	r4, SPRN_DEAR		/* Grab the DEAR, save it, pass arg2 */
> +	mfspr	r4, SPRN_DEAR		/* Grab the DEAR, save it */
>  	stw	r4, _DEAR(r11)
>  	EXC_XFER_LITE(0x300, handle_page_fault)
>  
> @@ -191,9 +191,9 @@ _ENTRY(saved_ksp_limit)
>   */
>  	START_EXCEPTION(0x0400, InstructionAccess)
>  	EXCEPTION_PROLOG
> -	mr	r4,r12			/* Pass SRR0 as arg2 */
> -	stw	r4, _DEAR(r11)
> -	li	r5,0			/* Pass zero as arg3 */
> +	li	r5,0
> +	stw	r5, _ESR(r11)		/* Zero ESR */
> +	stw	r12, _DEAR(r11)		/* SRR0 as DEAR */
>  	EXC_XFER_LITE(0x400, handle_page_fault)
>  
>  /* 0x0500 - External Interrupt Exception */
> diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
> index ee0bfebc375f..8acd365a2be6 100644
> --- a/arch/powerpc/kernel/head_8xx.S
> +++ b/arch/powerpc/kernel/head_8xx.S
> @@ -324,14 +324,14 @@ DataStoreTLBMiss:
>  	. = 0x1300
>  InstructionTLBError:
>  	EXCEPTION_PROLOG
> -	mr	r4,r12
>  	andis.	r5,r9,DSISR_SRR1_MATCH_32S@h /* Filter relevant SRR1 bits */
>  	andis.	r10,r9,SRR1_ISI_NOPT@h
>  	beq+	.Litlbie
> -	tlbie	r4
> +	tlbie	r12
>  	/* 0x400 is InstructionAccess exception, needed by bad_page_fault() */
>  .Litlbie:
> -	stw	r4, _DAR(r11)
> +	stw	r12, _DAR(r11)
> +	stw	r5, _DSISR(r11)
>  	EXC_XFER_LITE(0x400, handle_page_fault)
>  
>  /* This is the data TLB error on the MPC8xx.  This could be due to
> diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
> index a0dda2a1f2df..7addf67832f9 100644
> --- a/arch/powerpc/kernel/head_book3s_32.S
> +++ b/arch/powerpc/kernel/head_book3s_32.S
> @@ -370,9 +370,9 @@ BEGIN_MMU_FTR_SECTION
>  	bl	hash_page
>  END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
>  #endif	/* CONFIG_VMAP_STACK */
> -1:	mr	r4,r12
>  	andis.	r5,r9,DSISR_SRR1_MATCH_32S@h /* Filter relevant SRR1 bits */
> -	stw	r4, _DAR(r11)
> +	stw	r5, _DSISR(r11)
> +	stw	r12, _DAR(r11)
>  	EXC_XFER_LITE(0x400, handle_page_fault)
>  
>  /* External interrupt */
> @@ -687,8 +687,6 @@ handle_page_fault_tramp_1:
>  #ifdef CONFIG_VMAP_STACK
>  	EXCEPTION_PROLOG_2 handle_dar_dsisr=1
>  #endif
> -	lwz	r4, _DAR(r11)
> -	lwz	r5, _DSISR(r11)
>  	/* fall through */
>  handle_page_fault_tramp_2:
>  	EXC_XFER_LITE(0x300, handle_page_fault)
> diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
> index 71c359d438b5..1da0c1d1b0a1 100644
> --- a/arch/powerpc/kernel/head_booke.h
> +++ b/arch/powerpc/kernel/head_booke.h
> @@ -477,9 +477,7 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
>  	NORMAL_EXCEPTION_PROLOG(INST_STORAGE);		      \
>  	mfspr	r5,SPRN_ESR;		/* Grab the ESR and save it */	      \
>  	stw	r5,_ESR(r11);						      \
> -	mr      r4,r12;                 /* Pass SRR0 as arg2 */		      \
> -	stw	r4, _DEAR(r11);						      \
> -	li      r5,0;                   /* Pass zero as arg3 */		      \
> +	stw	r12, _DEAR(r11);	/* Pass SRR0 as arg2 */		      \
>  	EXC_XFER_LITE(0x0400, handle_page_fault)
>  
>  #define ALIGNMENT_EXCEPTION						      \
> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
> index bfa1b1966218..0f0bd4af4b2d 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -1510,13 +1510,15 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap,
>  }
>  EXPORT_SYMBOL_GPL(hash_page);
>  
> -int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr)
> +long do_hash_fault(struct pt_regs *regs)
>  {
> +	unsigned long ea = regs->dar;
> +	unsigned long dsisr = regs->dsisr;
>  	unsigned long access = _PAGE_PRESENT | _PAGE_READ;
>  	unsigned long flags = 0;
>  	struct mm_struct *mm;
>  	unsigned int region_id;
> -	int err;
> +	long err;
>  
>  	if (unlikely(dsisr & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)))
>  		goto _do_page_fault;
> @@ -1580,7 +1582,7 @@ int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr)
>  
>  	} else if (err) {
>  _do_page_fault:
> -		err = hash__do_page_fault(regs, ea, dsisr);
> +		err = hash__do_page_fault(regs);
>  	}
>  
>  	return err;
> diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
> index c30fcbfa0e32..cc34d50874c1 100644
> --- a/arch/powerpc/mm/book3s64/slb.c
> +++ b/arch/powerpc/mm/book3s64/slb.c
> @@ -837,8 +837,9 @@ static long slb_allocate_user(struct mm_struct *mm, unsigned long ea)
>  	return slb_insert_entry(ea, context, flags, ssize, false);
>  }
>  
> -long do_slb_fault(struct pt_regs *regs, unsigned long ea)
> +long do_slb_fault(struct pt_regs *regs)
>  {
> +	unsigned long ea = regs->dar;
>  	unsigned long id = get_region_id(ea);
>  
>  	/* IRQs are not reconciled here, so can't check irqs_disabled */
> @@ -889,13 +890,15 @@ long do_slb_fault(struct pt_regs *regs, unsigned long ea)
>  	}
>  }
>  
> -void do_bad_slb_fault(struct pt_regs *regs, unsigned long ea, long err)
> +void do_bad_slb_fault(struct pt_regs *regs)
>  {
> +	int err = regs->result;
> +
>  	if (err == -EFAULT) {
>  		if (user_mode(regs))
> -			_exception(SIGSEGV, regs, SEGV_BNDERR, ea);
> +			_exception(SIGSEGV, regs, SEGV_BNDERR, regs->dar);
>  		else
> -			bad_page_fault(regs, ea, SIGSEGV);
> +			bad_page_fault(regs, regs->dar, SIGSEGV);
>  	} else if (err == -EINVAL) {
>  		unrecoverable_exception(regs);
>  	} else {
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index e65a49f246ef..390a296b16a3 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -549,11 +549,12 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>  }
>  NOKPROBE_SYMBOL(__do_page_fault);
>  
> -int do_page_fault(struct pt_regs *regs, unsigned long address,
> -		  unsigned long error_code)
> +long do_page_fault(struct pt_regs *regs)
>  {
>  	enum ctx_state prev_state = exception_enter();
> -	int err;
> +	unsigned long address = regs->dar;
> +	unsigned long error_code = regs->dsisr;
> +	long err;
>  
>  	err = __do_page_fault(regs, address, error_code);
>  
> @@ -580,11 +581,12 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
>  NOKPROBE_SYMBOL(do_page_fault);
>  
>  #ifdef CONFIG_PPC_BOOK3S_64
> -/* Same as do_page_fault but interrupt entry has already run in do_hash_fault */
> -int hash__do_page_fault(struct pt_regs *regs, unsigned long address,
> -		  unsigned long error_code)
> +/* Same as do_page_fault but no interrupt entry */
> +long hash__do_page_fault(struct pt_regs *regs)
>  {
> -	int err;
> +	unsigned long address = regs->dar;
> +	unsigned long error_code = regs->dsisr;
> +	long err;
>  
>  	err = __do_page_fault(regs, address, error_code);
>  	if (unlikely(err)) {
> -- 
> 2.23.0

^ permalink raw reply

* Re: [PATCH v3 01/19] powerpc/64s: move the last of the page fault handling logic to C
From: Aneesh Kumar K.V @ 2020-11-30  7:35 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20201128144114.944000-2-npiggin@gmail.com>

Nicholas Piggin <npiggin@gmail.com> writes:

> The page fault handling still has some complex logic particularly around
> hash table handling, in asm. Implement this in C instead.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/powerpc/include/asm/book3s/64/mmu-hash.h |   1 +
>  arch/powerpc/include/asm/bug.h                |   1 +
>  arch/powerpc/kernel/exceptions-64s.S          | 131 +++---------------
>  arch/powerpc/mm/book3s64/hash_utils.c         |  77 ++++++----
>  arch/powerpc/mm/fault.c                       |  57 +++++++-
>  5 files changed, 127 insertions(+), 140 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> index 683a9c7d1b03..bc8c91f2d26f 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> @@ -453,6 +453,7 @@ static inline unsigned long hpt_hash(unsigned long vpn,
>  #define HPTE_LOCAL_UPDATE	0x1
>  #define HPTE_NOHPTE_UPDATE	0x2
>  
> +int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr);
>  extern int __hash_page_4K(unsigned long ea, unsigned long access,
>  			  unsigned long vsid, pte_t *ptep, unsigned long trap,
>  			  unsigned long flags, int ssize, int subpage_prot);
> diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
> index 338f36cd9934..c0e9b7a967a8 100644
> --- a/arch/powerpc/include/asm/bug.h
> +++ b/arch/powerpc/include/asm/bug.h
> @@ -112,6 +112,7 @@
>  
>  struct pt_regs;
>  extern int do_page_fault(struct pt_regs *, unsigned long, unsigned long);
> +int hash__do_page_fault(struct pt_regs *, unsigned long, unsigned long);
>  extern void bad_page_fault(struct pt_regs *, unsigned long, int);
>  extern void _exception(int, struct pt_regs *, int, unsigned long);
>  extern void _exception_pkey(struct pt_regs *, unsigned long, int);
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 4d01f09ecf80..336fa1fa39d1 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -1401,14 +1401,15 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>   *
>   * Handling:
>   * - Hash MMU
> - *   Go to do_hash_page first to see if the HPT can be filled from an entry in
> - *   the Linux page table. Hash faults can hit in kernel mode in a fairly
> + *   Go to do_hash_fault, which attempts to fill the HPT from an entry in the
> + *   Linux page table. Hash faults can hit in kernel mode in a fairly
>   *   arbitrary state (e.g., interrupts disabled, locks held) when accessing
>   *   "non-bolted" regions, e.g., vmalloc space. However these should always be
> - *   backed by Linux page tables.
> + *   backed by Linux page table entries.
>   *
> - *   If none is found, do a Linux page fault. Linux page faults can happen in
> - *   kernel mode due to user copy operations of course.
> + *   If no entry is found the Linux page fault handler is invoked (by
> + *   do_hash_fault). Linux page faults can happen in kernel mode due to user
> + *   copy operations of course.
>   *
>   *   KVM: The KVM HDSI handler may perform a load with MSR[DR]=1 in guest
>   *   MMU context, which may cause a DSI in the host, which must go to the
> @@ -1439,13 +1440,17 @@ EXC_COMMON_BEGIN(data_access_common)
>  	GEN_COMMON data_access
>  	ld	r4,_DAR(r1)
>  	ld	r5,_DSISR(r1)
> +	addi	r3,r1,STACK_FRAME_OVERHEAD
>  BEGIN_MMU_FTR_SECTION
> -	ld	r6,_MSR(r1)
> -	li	r3,0x300
> -	b	do_hash_page		/* Try to handle as hpte fault */
> +	bl	do_hash_fault
>  MMU_FTR_SECTION_ELSE
> -	b	handle_page_fault
> +	bl	do_page_fault
>  ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
> +        cmpdi	r3,0
> +	beq+	interrupt_return
> +	/* We need to restore NVGPRS */
> +	REST_NVGPRS(r1)
> +	b       interrupt_return
>  
>  	GEN_KVM data_access
>  
> @@ -1540,13 +1545,17 @@ EXC_COMMON_BEGIN(instruction_access_common)
>  	GEN_COMMON instruction_access
>  	ld	r4,_DAR(r1)
>  	ld	r5,_DSISR(r1)
> +	addi	r3,r1,STACK_FRAME_OVERHEAD
>  BEGIN_MMU_FTR_SECTION
> -	ld      r6,_MSR(r1)
> -	li	r3,0x400
> -	b	do_hash_page		/* Try to handle as hpte fault */
> +	bl	do_hash_fault
>  MMU_FTR_SECTION_ELSE
> -	b	handle_page_fault
> +	bl	do_page_fault
>  ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
> +        cmpdi	r3,0
> +	beq+	interrupt_return
> +	/* We need to restore NVGPRS */
> +	REST_NVGPRS(r1)
> +	b       interrupt_return
>  
>  	GEN_KVM instruction_access
>  
> @@ -3202,99 +3211,3 @@ disable_machine_check:
>  	RFI_TO_KERNEL
>  1:	mtlr	r0
>  	blr
> -
> -/*
> - * Hash table stuff
> - */
> -	.balign	IFETCH_ALIGN_BYTES
> -do_hash_page:
> -#ifdef CONFIG_PPC_BOOK3S_64
> -	lis	r0,(DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)@h
> -	ori	r0,r0,DSISR_BAD_FAULT_64S@l
> -	and.	r0,r5,r0		/* weird error? */
> -	bne-	handle_page_fault	/* if not, try to insert a HPTE */
> -
> -	/*
> -	 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
> -	 * don't call hash_page, just fail the fault. This is required to
> -	 * prevent re-entrancy problems in the hash code, namely perf
> -	 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
> -	 * hash fault. See the comment in hash_preload().
> -	 */
> -	ld	r11, PACA_THREAD_INFO(r13)
> -	lwz	r0,TI_PREEMPT(r11)
> -	andis.	r0,r0,NMI_MASK@h
> -	bne	77f
> -
> -	/*
> -	 * r3 contains the trap number
> -	 * r4 contains the faulting address
> -	 * r5 contains dsisr
> -	 * r6 msr
> -	 *
> -	 * at return r3 = 0 for success, 1 for page fault, negative for error
> -	 */
> -	bl	__hash_page		/* build HPTE if possible */
> -        cmpdi	r3,0			/* see if __hash_page succeeded */
> -
> -	/* Success */
> -	beq	interrupt_return	/* Return from exception on success */
> -
> -	/* Error */
> -	blt-	13f
> -
> -	/* Reload DAR/DSISR into r4/r5 for the DABR check below */
> -	ld	r4,_DAR(r1)
> -	ld      r5,_DSISR(r1)
> -#endif /* CONFIG_PPC_BOOK3S_64 */
> -
> -/* Here we have a page fault that hash_page can't handle. */
> -handle_page_fault:
> -11:	andis.  r0,r5,DSISR_DABRMATCH@h
> -	bne-    handle_dabr_fault
> -	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	bl	do_page_fault
> -	cmpdi	r3,0
> -	beq+	interrupt_return
> -	mr	r5,r3
> -	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	ld	r4,_DAR(r1)
> -	bl	bad_page_fault
> -	b	interrupt_return
> -
> -/* We have a data breakpoint exception - handle it */
> -handle_dabr_fault:
> -	ld      r4,_DAR(r1)
> -	ld      r5,_DSISR(r1)
> -	addi    r3,r1,STACK_FRAME_OVERHEAD
> -	bl      do_break
> -	/*
> -	 * do_break() may have changed the NV GPRS while handling a breakpoint.
> -	 * If so, we need to restore them with their updated values.
> -	 */
> -	REST_NVGPRS(r1)
> -	b       interrupt_return
> -
> -
> -#ifdef CONFIG_PPC_BOOK3S_64
> -/* We have a page fault that hash_page could handle but HV refused
> - * the PTE insertion
> - */
> -13:	mr	r5,r3
> -	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	ld	r4,_DAR(r1)
> -	bl	low_hash_fault
> -	b	interrupt_return
> -#endif
> -
> -/*
> - * We come here as a result of a DSI at a point where we don't want
> - * to call hash_page, such as when we are accessing memory (possibly
> - * user memory) inside a PMU interrupt that occurred while interrupts
> - * were soft-disabled.  We want to invoke the exception handler for
> - * the access, or panic if there isn't a handler.
> - */
> -77:	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	li	r5,SIGSEGV
> -	bl	bad_page_fault
> -	b	interrupt_return
> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
> index 24702c0a92e0..bfa1b1966218 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -1510,16 +1510,40 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap,
>  }
>  EXPORT_SYMBOL_GPL(hash_page);
>  
> -int __hash_page(unsigned long trap, unsigned long ea, unsigned long dsisr,
> -		unsigned long msr)
> +int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr)
>  {
>  	unsigned long access = _PAGE_PRESENT | _PAGE_READ;
>  	unsigned long flags = 0;
> -	struct mm_struct *mm = current->mm;
> -	unsigned int region_id = get_region_id(ea);
> +	struct mm_struct *mm;
> +	unsigned int region_id;
> +	int err;
> +
> +	if (unlikely(dsisr & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)))
> +		goto _do_page_fault;
> +
> +	/*
> +	 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
> +	 * don't call hash_page, just fail the fault. This is required to
> +	 * prevent re-entrancy problems in the hash code, namely perf
> +	 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
> +	 * hash fault. See the comment in hash_preload().
> +	 *
> +	 * We come here as a result of a DSI at a point where we don't want
> +	 * to call hash_page, such as when we are accessing memory (possibly
> +	 * user memory) inside a PMU interrupt that occurred while interrupts
> +	 * were soft-disabled.  We want to invoke the exception handler for
> +	 * the access, or panic if there isn't a handler.
> +	 */
> +	if (unlikely(in_nmi())) {
> +		bad_page_fault(regs, ea, SIGSEGV);
> +		return 0;
> +	}
>  
> +	region_id = get_region_id(ea);
>  	if ((region_id == VMALLOC_REGION_ID) || (region_id == IO_REGION_ID))
>  		mm = &init_mm;
> +	else
> +		mm = current->mm;
>  
>  	if (dsisr & DSISR_NOHPTE)
>  		flags |= HPTE_NOHPTE_UPDATE;
> @@ -1535,13 +1559,31 @@ int __hash_page(unsigned long trap, unsigned long ea, unsigned long dsisr,
>  	 * 2) user space access kernel space.
>  	 */
>  	access |= _PAGE_PRIVILEGED;
> -	if ((msr & MSR_PR) || (region_id == USER_REGION_ID))
> +	if (user_mode(regs) || (region_id == USER_REGION_ID))
>  		access &= ~_PAGE_PRIVILEGED;
>  
> -	if (trap == 0x400)
> +	if (regs->trap == 0x400)
>  		access |= _PAGE_EXEC;
>  
> -	return hash_page_mm(mm, ea, access, trap, flags);
> +	err = hash_page_mm(mm, ea, access, regs->trap, flags);
> +	if (unlikely(err < 0)) {
> +		// failed to instert a hash PTE due to an hypervisor error
> +		if (user_mode(regs)) {
> +			if (IS_ENABLED(CONFIG_PPC_SUBPAGE_PROT) && err == -2)
> +				_exception(SIGSEGV, regs, SEGV_ACCERR, ea);
> +			else
> +				_exception(SIGBUS, regs, BUS_ADRERR, ea);
> +		} else {
> +			bad_page_fault(regs, ea, SIGBUS);
> +		}
> +		err = 0;
> +
> +	} else if (err) {
> +_do_page_fault:
> +		err = hash__do_page_fault(regs, ea, dsisr);
> +	}
> +
> +	return err;
>  }
>  
>  #ifdef CONFIG_PPC_MM_SLICES
> @@ -1841,27 +1883,6 @@ void flush_hash_range(unsigned long number, int local)
>  	}
>  }
>  
> -/*
> - * low_hash_fault is called when we the low level hash code failed
> - * to instert a PTE due to an hypervisor error
> - */
> -void low_hash_fault(struct pt_regs *regs, unsigned long address, int rc)
> -{
> -	enum ctx_state prev_state = exception_enter();
> -
> -	if (user_mode(regs)) {
> -#ifdef CONFIG_PPC_SUBPAGE_PROT
> -		if (rc == -2)
> -			_exception(SIGSEGV, regs, SEGV_ACCERR, address);
> -		else
> -#endif
> -			_exception(SIGBUS, regs, BUS_ADRERR, address);
> -	} else
> -		bad_page_fault(regs, address, SIGBUS);
> -
> -	exception_exit(prev_state);
> -}
> -
>  long hpte_insert_repeating(unsigned long hash, unsigned long vpn,
>  			   unsigned long pa, unsigned long rflags,
>  			   unsigned long vflags, int psize, int ssize)
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 0add963a849b..e65a49f246ef 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -370,7 +370,9 @@ static void sanity_check_fault(bool is_write, bool is_user,
>  #define page_fault_is_write(__err)	((__err) & DSISR_ISSTORE)
>  #if defined(CONFIG_PPC_8xx)
>  #define page_fault_is_bad(__err)	((__err) & DSISR_NOEXEC_OR_G)
> -#elif defined(CONFIG_PPC64)
> +#elif defined(CONFIG_PPC_BOOK3S_64)
> +#define page_fault_is_bad(__err)	((__err) & (DSISR_BAD_FAULT_64S | DSISR_DABRMATCH))
> +#elif defined(CONFIG_PPC_BOOK3E_64)
>  #define page_fault_is_bad(__err)	((__err) & DSISR_BAD_FAULT_64S)
>  #else
>  #define page_fault_is_bad(__err)	((__err) & DSISR_BAD_FAULT_32S)
> @@ -406,6 +408,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>  		return 0;
>  
>  	if (unlikely(page_fault_is_bad(error_code))) {
> +		if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && (error_code & DSISR_DABRMATCH))
> +			return -1;
> +
>  		if (is_user) {
>  			_exception(SIGBUS, regs, BUS_OBJERR, address);
>  			return 0;
> @@ -548,12 +553,58 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
>  		  unsigned long error_code)
>  {
>  	enum ctx_state prev_state = exception_enter();
> -	int rc = __do_page_fault(regs, address, error_code);
> +	int err;
> +
> +	err = __do_page_fault(regs, address, error_code);
> +
> +#ifdef CONFIG_PPC_BOOK3S_64
> +	/* 32 and 64e handle errors in their asm code */
> +	if (unlikely(err)) {
> +		if (err > 0) {
> +			bad_page_fault(regs, address, err);
> +			err = 0;
> +		} else {
> +			/*
> +			 * do_break() may change NV GPRS while handling the
> +			 * breakpoint. Return -ve to caller to do that.
> +			 */
> +			do_break(regs, address, error_code);
> +		}
> +	}
> +#endif
> +
>  	exception_exit(prev_state);
> -	return rc;
> +
> +	return err;
>  }
>  NOKPROBE_SYMBOL(do_page_fault);
>  
> +#ifdef CONFIG_PPC_BOOK3S_64
> +/* Same as do_page_fault but interrupt entry has already run in do_hash_fault */
> +int hash__do_page_fault(struct pt_regs *regs, unsigned long address,
> +		  unsigned long error_code)
> +{
> +	int err;
> +
> +	err = __do_page_fault(regs, address, error_code);
> +	if (unlikely(err)) {
> +		if (err > 0) {
> +			bad_page_fault(regs, address, err);
> +			err = 0;
> +		} else {
> +			/*
> +			 * do_break() may change NV GPRS while handling the
> +			 * breakpoint. Return -ve to caller to do that.
> +			 */
> +			do_break(regs, address, error_code);
> +		}
> +	}
> +
> +	return err;
> +}
> +NOKPROBE_SYMBOL(hash__do_page_fault);
> +#endif
> +
>  /*
>   * bad_page_fault is called when we have a bad access from the kernel.
>   * It is called from the DSI and ISI handlers in head.S and from some
> -- 
> 2.23.0

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox