* [PATCH v3 0/4] iomap: Constify ioreadX() iomem argument
From: Krzysztof Kozlowski @ 2020-07-09 7:28 UTC
To: Richard Henderson, Ivan Kokshaysky, Matt Turner,
James E.J. Bottomley, Helge Deller, Michael Ellerman,
Benjamin Herrenschmidt, Paul Mackerras, Yoshinori Sato,
Rich Felker, Kalle Valo, David S. Miller, Jakub Kicinski,
Dave Jiang, Jon Mason, Allen Hubbe, Michael S. Tsirkin,
Jason Wang, Arnd Bergmann, Geert Uytterhoeven, Andrew Morton,
linux-alpha, linux-kernel, linux-parisc, linuxppc-dev, linux-sh,
linux-wireless, netdev, linux-ntb, virtualization, linux-arch
Cc: Krzysztof Kozlowski
Hi,
Multiple architectures are affected in the first patch and all further
patches depend on the first.
Maybe this could go in through Andrew Morton's tree?
Changes since v2
================
1. Drop all non-essential patches (cleanups),
2. Also update drivers/sh/clk/cpg.c.
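The cpg.c hunk in patch 1/4 changes the type of a local function pointer.
A likely reason, assuming that pointer is assigned one of the ioreadX()
helpers, is that C treats "unsigned int (*)(void *)" and
"unsigned int (*)(const void *)" as incompatible types, so the pointer
declaration has to follow the helpers. The sketch below is a userspace
illustration only: the demo_* names are hypothetical and __iomem is
defined away, since it is a sparse annotation in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in; in the kernel __iomem is a sparse annotation. */
#define __iomem

/* Hypothetical readers using the constified signature from this series. */
static unsigned int demo_ioread8(const void __iomem *addr)
{
	return *(const volatile uint8_t *)addr;
}

static unsigned int demo_ioread16(const void __iomem *addr)
{
	return *(const volatile uint16_t *)addr;
}

static unsigned int demo_ioread32(const void __iomem *addr)
{
	return *(const volatile uint32_t *)addr;
}

/* The pointer type must carry the same const as the functions it holds. */
typedef unsigned int (*demo_read_fn)(const void __iomem *addr);

/* Pick a reader by register width, as a driver might do at runtime. */
static demo_read_fn demo_pick_reader(unsigned int width_bits)
{
	switch (width_bits) {
	case 8:
		return demo_ioread8;
	case 16:
		return demo_ioread16;
	default:
		return demo_ioread32;
	}
}

/* Usage: read a fake 16-bit register through the selected helper. */
void demo_selfcheck(void)
{
	uint16_t fake_reg = 0x1234;
	demo_read_fn read = demo_pick_reader(16);

	assert(read(&fake_reg) == 0x1234);
}
```

Declaring demo_read_fn with a non-const parameter would make each return
statement in demo_pick_reader() an incompatible-pointer-type diagnostic.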
Changes since v1
================
https://lore.kernel.org/lkml/1578415992-24054-1-git-send-email-krzk@kernel.org/
1. Constify also ioreadX_rep() and mmio_insX(),
2. Squash lib+alpha+powerpc+parisc+sh into one patch for bisectability,
3. Add acks and reviews,
4. Re-order patches so all optional driver changes are at the end.
Description
===========
The ioread8/16/32() helpers and their variants have an inconsistent
interface across architectures: some take the address as a pointer to
const, some do not.
Nothing seems to prevent all of them from taking a pointer to const.
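As a rough illustration of what the const qualifier buys callers, the
userspace sketch below shows a reader that accepts both const and
non-const mappings without a cast. The demo_* names and struct are
hypothetical, and __iomem is defined away because it is only a sparse
annotation in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in; in the kernel __iomem is a sparse annotation. */
#define __iomem

/* Hypothetical reader with the constified argument. */
static inline unsigned int demo_ioread32(const void __iomem *addr)
{
	return *(const volatile uint32_t *)addr;
}

/* Hypothetical driver state: one read-only and one read/write mapping. */
struct demo_dev {
	const void __iomem *status;	/* only ever read */
	void __iomem *ctrl;		/* read and written */
};

/*
 * Both members can be passed directly: a non-const pointer converts to
 * a pointer to const implicitly, but not the other way around.
 */
static unsigned int demo_read_both(const struct demo_dev *dev)
{
	return demo_ioread32(dev->status) + demo_ioread32(dev->ctrl);
}

/* Usage: two fake registers backing the mappings. */
void demo_selfcheck(void)
{
	uint32_t status = 40, ctrl = 2;
	struct demo_dev dev = { .status = &status, .ctrl = &ctrl };

	assert(demo_read_both(&dev) == 42);
}
```

With a non-const parameter, passing dev->status would discard the const
qualifier and draw a compiler warning, which is the situation drivers
with const register pointers face on the architectures not yet converted.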
The patchset was only compile-tested on the affected architectures; it
has not been tested at runtime.
volatile
========
There is still an interface inconsistency between architectures around
the "volatile" qualifier:
- include/asm-generic/io.h:static inline u32 ioread32(const volatile void __iomem *addr)
- include/asm-generic/iomap.h:extern unsigned int ioread32(const void __iomem *);
This is still under discussion and is out of scope for this patchset.
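To make the remaining mismatch concrete: a wrapper that takes
"const volatile void __iomem *" cannot pass its argument straight to a
helper that takes "const void __iomem *" without dropping volatile. The
alpha io_trivial.h hunk in patch 1/4 handles this with an explicit cast
that keeps const. The userspace sketch below mirrors that pattern; the
demo_* names are hypothetical and __iomem is defined away.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in; in the kernel __iomem is a sparse annotation. */
#define __iomem

/* Hypothetical iomap-style helper: const, but not volatile, argument. */
static inline unsigned int demo_ioread8(const void __iomem *addr)
{
	/* volatile is reapplied here so the access itself is not elided */
	return *(const volatile uint8_t *)addr;
}

/* Hypothetical readb-style wrapper: const volatile argument. */
static inline uint8_t demo_readb(const volatile void __iomem *addr)
{
	/* The cast drops only volatile; const is preserved. */
	const void __iomem *a = (const void __iomem *)addr;

	return (uint8_t)demo_ioread8(a);
}

/* Usage: a volatile fake register read through the wrapper. */
void demo_selfcheck(void)
{
	volatile uint8_t fake_reg = 0x5a;

	assert(demo_readb(&fake_reg) == 0x5a);
}
```

Without the cast, the call would discard the volatile qualifier
implicitly and trigger a warning.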
Best regards,
Krzysztof
Krzysztof Kozlowski (4):
iomap: Constify ioreadX() iomem argument (as in generic
implementation)
rtl818x: Constify ioreadX() iomem argument (as in generic
implementation)
ntb: intel: Constify ioreadX() iomem argument (as in generic
implementation)
virtio: pci: Constify ioreadX() iomem argument (as in generic
implementation)
arch/alpha/include/asm/core_apecs.h | 6 +-
arch/alpha/include/asm/core_cia.h | 6 +-
arch/alpha/include/asm/core_lca.h | 6 +-
arch/alpha/include/asm/core_marvel.h | 4 +-
arch/alpha/include/asm/core_mcpcia.h | 6 +-
arch/alpha/include/asm/core_t2.h | 2 +-
arch/alpha/include/asm/io.h | 12 ++--
arch/alpha/include/asm/io_trivial.h | 16 ++---
arch/alpha/include/asm/jensen.h | 2 +-
arch/alpha/include/asm/machvec.h | 6 +-
arch/alpha/kernel/core_marvel.c | 2 +-
arch/alpha/kernel/io.c | 12 ++--
arch/parisc/include/asm/io.h | 4 +-
arch/parisc/lib/iomap.c | 72 +++++++++----------
arch/powerpc/kernel/iomap.c | 28 ++++----
arch/sh/kernel/iomap.c | 22 +++---
.../realtek/rtl818x/rtl8180/rtl8180.h | 6 +-
drivers/ntb/hw/intel/ntb_hw_gen1.c | 2 +-
drivers/ntb/hw/intel/ntb_hw_gen3.h | 2 +-
drivers/ntb/hw/intel/ntb_hw_intel.h | 2 +-
drivers/sh/clk/cpg.c | 2 +-
drivers/virtio/virtio_pci_modern.c | 6 +-
include/asm-generic/iomap.h | 28 ++++----
include/linux/io-64-nonatomic-hi-lo.h | 4 +-
include/linux/io-64-nonatomic-lo-hi.h | 4 +-
lib/iomap.c | 30 ++++----
26 files changed, 146 insertions(+), 146 deletions(-)
--
2.17.1
* [PATCH v3 1/4] iomap: Constify ioreadX() iomem argument (as in generic implementation)
From: Krzysztof Kozlowski @ 2020-07-09 7:28 UTC
In-Reply-To: <20200709072837.5869-1-krzk@kernel.org>
The ioreadX() and ioreadX_rep() helpers have an inconsistent interface.
On some architectures the void __iomem * address argument is a pointer
to const, on others it is not.
Implementations of ioreadX() do not modify the memory under the address
so they can be converted to a "const" version for const-safety and
consistency among architectures.
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
---
arch/alpha/include/asm/core_apecs.h | 6 +--
arch/alpha/include/asm/core_cia.h | 6 +--
arch/alpha/include/asm/core_lca.h | 6 +--
arch/alpha/include/asm/core_marvel.h | 4 +-
arch/alpha/include/asm/core_mcpcia.h | 6 +--
arch/alpha/include/asm/core_t2.h | 2 +-
arch/alpha/include/asm/io.h | 12 ++---
arch/alpha/include/asm/io_trivial.h | 16 +++---
arch/alpha/include/asm/jensen.h | 2 +-
arch/alpha/include/asm/machvec.h | 6 +--
arch/alpha/kernel/core_marvel.c | 2 +-
arch/alpha/kernel/io.c | 12 ++---
arch/parisc/include/asm/io.h | 4 +-
arch/parisc/lib/iomap.c | 72 +++++++++++++--------------
arch/powerpc/kernel/iomap.c | 28 +++++------
arch/sh/kernel/iomap.c | 22 ++++----
drivers/sh/clk/cpg.c | 2 +-
include/asm-generic/iomap.h | 28 +++++------
include/linux/io-64-nonatomic-hi-lo.h | 4 +-
include/linux/io-64-nonatomic-lo-hi.h | 4 +-
lib/iomap.c | 30 +++++------
21 files changed, 137 insertions(+), 137 deletions(-)
diff --git a/arch/alpha/include/asm/core_apecs.h b/arch/alpha/include/asm/core_apecs.h
index 0a07055bc0fe..2d9726fc02ef 100644
--- a/arch/alpha/include/asm/core_apecs.h
+++ b/arch/alpha/include/asm/core_apecs.h
@@ -384,7 +384,7 @@ struct el_apecs_procdata
} \
} while (0)
-__EXTERN_INLINE unsigned int apecs_ioread8(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int apecs_ioread8(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -420,7 +420,7 @@ __EXTERN_INLINE void apecs_iowrite8(u8 b, void __iomem *xaddr)
*(vuip) ((addr << 5) + base_and_type) = w;
}
-__EXTERN_INLINE unsigned int apecs_ioread16(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int apecs_ioread16(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -456,7 +456,7 @@ __EXTERN_INLINE void apecs_iowrite16(u16 b, void __iomem *xaddr)
*(vuip) ((addr << 5) + base_and_type) = w;
}
-__EXTERN_INLINE unsigned int apecs_ioread32(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int apecs_ioread32(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
if (addr < APECS_DENSE_MEM)
diff --git a/arch/alpha/include/asm/core_cia.h b/arch/alpha/include/asm/core_cia.h
index c706a7f2b061..cb22991f6761 100644
--- a/arch/alpha/include/asm/core_cia.h
+++ b/arch/alpha/include/asm/core_cia.h
@@ -342,7 +342,7 @@ struct el_CIA_sysdata_mcheck {
#define vuip volatile unsigned int __force *
#define vulp volatile unsigned long __force *
-__EXTERN_INLINE unsigned int cia_ioread8(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int cia_ioread8(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -374,7 +374,7 @@ __EXTERN_INLINE void cia_iowrite8(u8 b, void __iomem *xaddr)
*(vuip) ((addr << 5) + base_and_type) = w;
}
-__EXTERN_INLINE unsigned int cia_ioread16(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int cia_ioread16(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -404,7 +404,7 @@ __EXTERN_INLINE void cia_iowrite16(u16 b, void __iomem *xaddr)
*(vuip) ((addr << 5) + base_and_type) = w;
}
-__EXTERN_INLINE unsigned int cia_ioread32(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int cia_ioread32(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
if (addr < CIA_DENSE_MEM)
diff --git a/arch/alpha/include/asm/core_lca.h b/arch/alpha/include/asm/core_lca.h
index 84d5e5b84f4f..ec86314418cb 100644
--- a/arch/alpha/include/asm/core_lca.h
+++ b/arch/alpha/include/asm/core_lca.h
@@ -230,7 +230,7 @@ union el_lca {
} while (0)
-__EXTERN_INLINE unsigned int lca_ioread8(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int lca_ioread8(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -266,7 +266,7 @@ __EXTERN_INLINE void lca_iowrite8(u8 b, void __iomem *xaddr)
*(vuip) ((addr << 5) + base_and_type) = w;
}
-__EXTERN_INLINE unsigned int lca_ioread16(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int lca_ioread16(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -302,7 +302,7 @@ __EXTERN_INLINE void lca_iowrite16(u16 b, void __iomem *xaddr)
*(vuip) ((addr << 5) + base_and_type) = w;
}
-__EXTERN_INLINE unsigned int lca_ioread32(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int lca_ioread32(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
if (addr < LCA_DENSE_MEM)
diff --git a/arch/alpha/include/asm/core_marvel.h b/arch/alpha/include/asm/core_marvel.h
index cc6fd92d5fa9..b266e02e284b 100644
--- a/arch/alpha/include/asm/core_marvel.h
+++ b/arch/alpha/include/asm/core_marvel.h
@@ -332,10 +332,10 @@ struct io7 {
#define vucp volatile unsigned char __force *
#define vusp volatile unsigned short __force *
-extern unsigned int marvel_ioread8(void __iomem *);
+extern unsigned int marvel_ioread8(const void __iomem *);
extern void marvel_iowrite8(u8 b, void __iomem *);
-__EXTERN_INLINE unsigned int marvel_ioread16(void __iomem *addr)
+__EXTERN_INLINE unsigned int marvel_ioread16(const void __iomem *addr)
{
return __kernel_ldwu(*(vusp)addr);
}
diff --git a/arch/alpha/include/asm/core_mcpcia.h b/arch/alpha/include/asm/core_mcpcia.h
index b30dc128210d..cb24d1bd6141 100644
--- a/arch/alpha/include/asm/core_mcpcia.h
+++ b/arch/alpha/include/asm/core_mcpcia.h
@@ -267,7 +267,7 @@ extern inline int __mcpcia_is_mmio(unsigned long addr)
return (addr & 0x80000000UL) == 0;
}
-__EXTERN_INLINE unsigned int mcpcia_ioread8(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int mcpcia_ioread8(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long)xaddr & MCPCIA_MEM_MASK;
unsigned long hose = (unsigned long)xaddr & ~MCPCIA_MEM_MASK;
@@ -291,7 +291,7 @@ __EXTERN_INLINE void mcpcia_iowrite8(u8 b, void __iomem *xaddr)
*(vuip) ((addr << 5) + hose + 0x00) = w;
}
-__EXTERN_INLINE unsigned int mcpcia_ioread16(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int mcpcia_ioread16(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long)xaddr & MCPCIA_MEM_MASK;
unsigned long hose = (unsigned long)xaddr & ~MCPCIA_MEM_MASK;
@@ -315,7 +315,7 @@ __EXTERN_INLINE void mcpcia_iowrite16(u16 b, void __iomem *xaddr)
*(vuip) ((addr << 5) + hose + 0x08) = w;
}
-__EXTERN_INLINE unsigned int mcpcia_ioread32(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int mcpcia_ioread32(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long)xaddr;
diff --git a/arch/alpha/include/asm/core_t2.h b/arch/alpha/include/asm/core_t2.h
index e0b33d09e93a..12bb7addc789 100644
--- a/arch/alpha/include/asm/core_t2.h
+++ b/arch/alpha/include/asm/core_t2.h
@@ -572,7 +572,7 @@ __EXTERN_INLINE int t2_is_mmio(const volatile void __iomem *addr)
it doesn't make sense to merge the pio and mmio routines. */
#define IOPORT(OS, NS) \
-__EXTERN_INLINE unsigned int t2_ioread##NS(void __iomem *xaddr) \
+__EXTERN_INLINE unsigned int t2_ioread##NS(const void __iomem *xaddr) \
{ \
if (t2_is_mmio(xaddr)) \
return t2_read##OS(xaddr); \
diff --git a/arch/alpha/include/asm/io.h b/arch/alpha/include/asm/io.h
index 640e1a2f57b4..1f6a909d1fa5 100644
--- a/arch/alpha/include/asm/io.h
+++ b/arch/alpha/include/asm/io.h
@@ -150,9 +150,9 @@ static inline void generic_##NAME(TYPE b, QUAL void __iomem *addr) \
alpha_mv.mv_##NAME(b, addr); \
}
-REMAP1(unsigned int, ioread8, /**/)
-REMAP1(unsigned int, ioread16, /**/)
-REMAP1(unsigned int, ioread32, /**/)
+REMAP1(unsigned int, ioread8, const)
+REMAP1(unsigned int, ioread16, const)
+REMAP1(unsigned int, ioread32, const)
REMAP1(u8, readb, const volatile)
REMAP1(u16, readw, const volatile)
REMAP1(u32, readl, const volatile)
@@ -307,7 +307,7 @@ static inline int __is_mmio(const volatile void __iomem *addr)
*/
#if IO_CONCAT(__IO_PREFIX,trivial_io_bw)
-extern inline unsigned int ioread8(void __iomem *addr)
+extern inline unsigned int ioread8(const void __iomem *addr)
{
unsigned int ret;
mb();
@@ -316,7 +316,7 @@ extern inline unsigned int ioread8(void __iomem *addr)
return ret;
}
-extern inline unsigned int ioread16(void __iomem *addr)
+extern inline unsigned int ioread16(const void __iomem *addr)
{
unsigned int ret;
mb();
@@ -359,7 +359,7 @@ extern inline void outw(u16 b, unsigned long port)
#endif
#if IO_CONCAT(__IO_PREFIX,trivial_io_lq)
-extern inline unsigned int ioread32(void __iomem *addr)
+extern inline unsigned int ioread32(const void __iomem *addr)
{
unsigned int ret;
mb();
diff --git a/arch/alpha/include/asm/io_trivial.h b/arch/alpha/include/asm/io_trivial.h
index ba3d8f0cfe0c..a1a29cbe02fa 100644
--- a/arch/alpha/include/asm/io_trivial.h
+++ b/arch/alpha/include/asm/io_trivial.h
@@ -7,15 +7,15 @@
#if IO_CONCAT(__IO_PREFIX,trivial_io_bw)
__EXTERN_INLINE unsigned int
-IO_CONCAT(__IO_PREFIX,ioread8)(void __iomem *a)
+IO_CONCAT(__IO_PREFIX,ioread8)(const void __iomem *a)
{
- return __kernel_ldbu(*(volatile u8 __force *)a);
+ return __kernel_ldbu(*(const volatile u8 __force *)a);
}
__EXTERN_INLINE unsigned int
-IO_CONCAT(__IO_PREFIX,ioread16)(void __iomem *a)
+IO_CONCAT(__IO_PREFIX,ioread16)(const void __iomem *a)
{
- return __kernel_ldwu(*(volatile u16 __force *)a);
+ return __kernel_ldwu(*(const volatile u16 __force *)a);
}
__EXTERN_INLINE void
@@ -33,9 +33,9 @@ IO_CONCAT(__IO_PREFIX,iowrite16)(u16 b, void __iomem *a)
#if IO_CONCAT(__IO_PREFIX,trivial_io_lq)
__EXTERN_INLINE unsigned int
-IO_CONCAT(__IO_PREFIX,ioread32)(void __iomem *a)
+IO_CONCAT(__IO_PREFIX,ioread32)(const void __iomem *a)
{
- return *(volatile u32 __force *)a;
+ return *(const volatile u32 __force *)a;
}
__EXTERN_INLINE void
@@ -73,14 +73,14 @@ IO_CONCAT(__IO_PREFIX,writew)(u16 b, volatile void __iomem *a)
__EXTERN_INLINE u8
IO_CONCAT(__IO_PREFIX,readb)(const volatile void __iomem *a)
{
- void __iomem *addr = (void __iomem *)a;
+ const void __iomem *addr = (const void __iomem *)a;
return IO_CONCAT(__IO_PREFIX,ioread8)(addr);
}
__EXTERN_INLINE u16
IO_CONCAT(__IO_PREFIX,readw)(const volatile void __iomem *a)
{
- void __iomem *addr = (void __iomem *)a;
+ const void __iomem *addr = (const void __iomem *)a;
return IO_CONCAT(__IO_PREFIX,ioread16)(addr);
}
diff --git a/arch/alpha/include/asm/jensen.h b/arch/alpha/include/asm/jensen.h
index 436dc905b6ad..916895155a88 100644
--- a/arch/alpha/include/asm/jensen.h
+++ b/arch/alpha/include/asm/jensen.h
@@ -305,7 +305,7 @@ __EXTERN_INLINE int jensen_is_mmio(const volatile void __iomem *addr)
that it doesn't make sense to merge them. */
#define IOPORT(OS, NS) \
-__EXTERN_INLINE unsigned int jensen_ioread##NS(void __iomem *xaddr) \
+__EXTERN_INLINE unsigned int jensen_ioread##NS(const void __iomem *xaddr) \
{ \
if (jensen_is_mmio(xaddr)) \
return jensen_read##OS(xaddr - 0x100000000ul); \
diff --git a/arch/alpha/include/asm/machvec.h b/arch/alpha/include/asm/machvec.h
index a6b73c6d10ee..a4e96e2bec74 100644
--- a/arch/alpha/include/asm/machvec.h
+++ b/arch/alpha/include/asm/machvec.h
@@ -46,9 +46,9 @@ struct alpha_machine_vector
void (*mv_pci_tbi)(struct pci_controller *hose,
dma_addr_t start, dma_addr_t end);
- unsigned int (*mv_ioread8)(void __iomem *);
- unsigned int (*mv_ioread16)(void __iomem *);
- unsigned int (*mv_ioread32)(void __iomem *);
+ unsigned int (*mv_ioread8)(const void __iomem *);
+ unsigned int (*mv_ioread16)(const void __iomem *);
+ unsigned int (*mv_ioread32)(const void __iomem *);
void (*mv_iowrite8)(u8, void __iomem *);
void (*mv_iowrite16)(u16, void __iomem *);
diff --git a/arch/alpha/kernel/core_marvel.c b/arch/alpha/kernel/core_marvel.c
index 4c80d992a659..4485b77f8658 100644
--- a/arch/alpha/kernel/core_marvel.c
+++ b/arch/alpha/kernel/core_marvel.c
@@ -806,7 +806,7 @@ void __iomem *marvel_ioportmap (unsigned long addr)
}
unsigned int
-marvel_ioread8(void __iomem *xaddr)
+marvel_ioread8(const void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr;
if (__marvel_is_port_kbd(addr))
diff --git a/arch/alpha/kernel/io.c b/arch/alpha/kernel/io.c
index 938de13adfbf..838586abb1e0 100644
--- a/arch/alpha/kernel/io.c
+++ b/arch/alpha/kernel/io.c
@@ -14,7 +14,7 @@
"generic", which bumps through the machine vector. */
unsigned int
-ioread8(void __iomem *addr)
+ioread8(const void __iomem *addr)
{
unsigned int ret;
mb();
@@ -23,7 +23,7 @@ ioread8(void __iomem *addr)
return ret;
}
-unsigned int ioread16(void __iomem *addr)
+unsigned int ioread16(const void __iomem *addr)
{
unsigned int ret;
mb();
@@ -32,7 +32,7 @@ unsigned int ioread16(void __iomem *addr)
return ret;
}
-unsigned int ioread32(void __iomem *addr)
+unsigned int ioread32(const void __iomem *addr)
{
unsigned int ret;
mb();
@@ -257,7 +257,7 @@ EXPORT_SYMBOL(readq_relaxed);
/*
* Read COUNT 8-bit bytes from port PORT into memory starting at SRC.
*/
-void ioread8_rep(void __iomem *port, void *dst, unsigned long count)
+void ioread8_rep(const void __iomem *port, void *dst, unsigned long count)
{
while ((unsigned long)dst & 0x3) {
if (!count)
@@ -300,7 +300,7 @@ EXPORT_SYMBOL(insb);
* the interfaces seems to be slow: just using the inlined version
* of the inw() breaks things.
*/
-void ioread16_rep(void __iomem *port, void *dst, unsigned long count)
+void ioread16_rep(const void __iomem *port, void *dst, unsigned long count)
{
if (unlikely((unsigned long)dst & 0x3)) {
if (!count)
@@ -340,7 +340,7 @@ EXPORT_SYMBOL(insw);
* but the interfaces seems to be slow: just using the inlined version
* of the inl() breaks things.
*/
-void ioread32_rep(void __iomem *port, void *dst, unsigned long count)
+void ioread32_rep(const void __iomem *port, void *dst, unsigned long count)
{
if (unlikely((unsigned long)dst & 0x3)) {
while (count--) {
diff --git a/arch/parisc/include/asm/io.h b/arch/parisc/include/asm/io.h
index 116effe26143..45e20d38dc59 100644
--- a/arch/parisc/include/asm/io.h
+++ b/arch/parisc/include/asm/io.h
@@ -303,8 +303,8 @@ extern void outsl (unsigned long port, const void *src, unsigned long count);
#define ioread64be ioread64be
#define iowrite64 iowrite64
#define iowrite64be iowrite64be
-extern u64 ioread64(void __iomem *addr);
-extern u64 ioread64be(void __iomem *addr);
+extern u64 ioread64(const void __iomem *addr);
+extern u64 ioread64be(const void __iomem *addr);
extern void iowrite64(u64 val, void __iomem *addr);
extern void iowrite64be(u64 val, void __iomem *addr);
diff --git a/arch/parisc/lib/iomap.c b/arch/parisc/lib/iomap.c
index 0195aec657e2..ce400417d54e 100644
--- a/arch/parisc/lib/iomap.c
+++ b/arch/parisc/lib/iomap.c
@@ -43,13 +43,13 @@
#endif
struct iomap_ops {
- unsigned int (*read8)(void __iomem *);
- unsigned int (*read16)(void __iomem *);
- unsigned int (*read16be)(void __iomem *);
- unsigned int (*read32)(void __iomem *);
- unsigned int (*read32be)(void __iomem *);
- u64 (*read64)(void __iomem *);
- u64 (*read64be)(void __iomem *);
+ unsigned int (*read8)(const void __iomem *);
+ unsigned int (*read16)(const void __iomem *);
+ unsigned int (*read16be)(const void __iomem *);
+ unsigned int (*read32)(const void __iomem *);
+ unsigned int (*read32be)(const void __iomem *);
+ u64 (*read64)(const void __iomem *);
+ u64 (*read64be)(const void __iomem *);
void (*write8)(u8, void __iomem *);
void (*write16)(u16, void __iomem *);
void (*write16be)(u16, void __iomem *);
@@ -57,9 +57,9 @@ struct iomap_ops {
void (*write32be)(u32, void __iomem *);
void (*write64)(u64, void __iomem *);
void (*write64be)(u64, void __iomem *);
- void (*read8r)(void __iomem *, void *, unsigned long);
- void (*read16r)(void __iomem *, void *, unsigned long);
- void (*read32r)(void __iomem *, void *, unsigned long);
+ void (*read8r)(const void __iomem *, void *, unsigned long);
+ void (*read16r)(const void __iomem *, void *, unsigned long);
+ void (*read32r)(const void __iomem *, void *, unsigned long);
void (*write8r)(void __iomem *, const void *, unsigned long);
void (*write16r)(void __iomem *, const void *, unsigned long);
void (*write32r)(void __iomem *, const void *, unsigned long);
@@ -69,17 +69,17 @@ struct iomap_ops {
#define ADDR2PORT(addr) ((unsigned long __force)(addr) & 0xffffff)
-static unsigned int ioport_read8(void __iomem *addr)
+static unsigned int ioport_read8(const void __iomem *addr)
{
return inb(ADDR2PORT(addr));
}
-static unsigned int ioport_read16(void __iomem *addr)
+static unsigned int ioport_read16(const void __iomem *addr)
{
return inw(ADDR2PORT(addr));
}
-static unsigned int ioport_read32(void __iomem *addr)
+static unsigned int ioport_read32(const void __iomem *addr)
{
return inl(ADDR2PORT(addr));
}
@@ -99,17 +99,17 @@ static void ioport_write32(u32 datum, void __iomem *addr)
outl(datum, ADDR2PORT(addr));
}
-static void ioport_read8r(void __iomem *addr, void *dst, unsigned long count)
+static void ioport_read8r(const void __iomem *addr, void *dst, unsigned long count)
{
insb(ADDR2PORT(addr), dst, count);
}
-static void ioport_read16r(void __iomem *addr, void *dst, unsigned long count)
+static void ioport_read16r(const void __iomem *addr, void *dst, unsigned long count)
{
insw(ADDR2PORT(addr), dst, count);
}
-static void ioport_read32r(void __iomem *addr, void *dst, unsigned long count)
+static void ioport_read32r(const void __iomem *addr, void *dst, unsigned long count)
{
insl(ADDR2PORT(addr), dst, count);
}
@@ -150,37 +150,37 @@ static const struct iomap_ops ioport_ops = {
/* Legacy I/O memory ops */
-static unsigned int iomem_read8(void __iomem *addr)
+static unsigned int iomem_read8(const void __iomem *addr)
{
return readb(addr);
}
-static unsigned int iomem_read16(void __iomem *addr)
+static unsigned int iomem_read16(const void __iomem *addr)
{
return readw(addr);
}
-static unsigned int iomem_read16be(void __iomem *addr)
+static unsigned int iomem_read16be(const void __iomem *addr)
{
return __raw_readw(addr);
}
-static unsigned int iomem_read32(void __iomem *addr)
+static unsigned int iomem_read32(const void __iomem *addr)
{
return readl(addr);
}
-static unsigned int iomem_read32be(void __iomem *addr)
+static unsigned int iomem_read32be(const void __iomem *addr)
{
return __raw_readl(addr);
}
-static u64 iomem_read64(void __iomem *addr)
+static u64 iomem_read64(const void __iomem *addr)
{
return readq(addr);
}
-static u64 iomem_read64be(void __iomem *addr)
+static u64 iomem_read64be(const void __iomem *addr)
{
return __raw_readq(addr);
}
@@ -220,7 +220,7 @@ static void iomem_write64be(u64 datum, void __iomem *addr)
__raw_writel(datum, addr);
}
-static void iomem_read8r(void __iomem *addr, void *dst, unsigned long count)
+static void iomem_read8r(const void __iomem *addr, void *dst, unsigned long count)
{
while (count--) {
*(u8 *)dst = __raw_readb(addr);
@@ -228,7 +228,7 @@ static void iomem_read8r(void __iomem *addr, void *dst, unsigned long count)
}
}
-static void iomem_read16r(void __iomem *addr, void *dst, unsigned long count)
+static void iomem_read16r(const void __iomem *addr, void *dst, unsigned long count)
{
while (count--) {
*(u16 *)dst = __raw_readw(addr);
@@ -236,7 +236,7 @@ static void iomem_read16r(void __iomem *addr, void *dst, unsigned long count)
}
}
-static void iomem_read32r(void __iomem *addr, void *dst, unsigned long count)
+static void iomem_read32r(const void __iomem *addr, void *dst, unsigned long count)
{
while (count--) {
*(u32 *)dst = __raw_readl(addr);
@@ -297,49 +297,49 @@ static const struct iomap_ops *iomap_ops[8] = {
};
-unsigned int ioread8(void __iomem *addr)
+unsigned int ioread8(const void __iomem *addr)
{
if (unlikely(INDIRECT_ADDR(addr)))
return iomap_ops[ADDR_TO_REGION(addr)]->read8(addr);
return *((u8 *)addr);
}
-unsigned int ioread16(void __iomem *addr)
+unsigned int ioread16(const void __iomem *addr)
{
if (unlikely(INDIRECT_ADDR(addr)))
return iomap_ops[ADDR_TO_REGION(addr)]->read16(addr);
return le16_to_cpup((u16 *)addr);
}
-unsigned int ioread16be(void __iomem *addr)
+unsigned int ioread16be(const void __iomem *addr)
{
if (unlikely(INDIRECT_ADDR(addr)))
return iomap_ops[ADDR_TO_REGION(addr)]->read16be(addr);
return *((u16 *)addr);
}
-unsigned int ioread32(void __iomem *addr)
+unsigned int ioread32(const void __iomem *addr)
{
if (unlikely(INDIRECT_ADDR(addr)))
return iomap_ops[ADDR_TO_REGION(addr)]->read32(addr);
return le32_to_cpup((u32 *)addr);
}
-unsigned int ioread32be(void __iomem *addr)
+unsigned int ioread32be(const void __iomem *addr)
{
if (unlikely(INDIRECT_ADDR(addr)))
return iomap_ops[ADDR_TO_REGION(addr)]->read32be(addr);
return *((u32 *)addr);
}
-u64 ioread64(void __iomem *addr)
+u64 ioread64(const void __iomem *addr)
{
if (unlikely(INDIRECT_ADDR(addr)))
return iomap_ops[ADDR_TO_REGION(addr)]->read64(addr);
return le64_to_cpup((u64 *)addr);
}
-u64 ioread64be(void __iomem *addr)
+u64 ioread64be(const void __iomem *addr)
{
if (unlikely(INDIRECT_ADDR(addr)))
return iomap_ops[ADDR_TO_REGION(addr)]->read64be(addr);
@@ -411,7 +411,7 @@ void iowrite64be(u64 datum, void __iomem *addr)
/* Repeating interfaces */
-void ioread8_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread8_rep(const void __iomem *addr, void *dst, unsigned long count)
{
if (unlikely(INDIRECT_ADDR(addr))) {
iomap_ops[ADDR_TO_REGION(addr)]->read8r(addr, dst, count);
@@ -423,7 +423,7 @@ void ioread8_rep(void __iomem *addr, void *dst, unsigned long count)
}
}
-void ioread16_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread16_rep(const void __iomem *addr, void *dst, unsigned long count)
{
if (unlikely(INDIRECT_ADDR(addr))) {
iomap_ops[ADDR_TO_REGION(addr)]->read16r(addr, dst, count);
@@ -435,7 +435,7 @@ void ioread16_rep(void __iomem *addr, void *dst, unsigned long count)
}
}
-void ioread32_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread32_rep(const void __iomem *addr, void *dst, unsigned long count)
{
if (unlikely(INDIRECT_ADDR(addr))) {
iomap_ops[ADDR_TO_REGION(addr)]->read32r(addr, dst, count);
diff --git a/arch/powerpc/kernel/iomap.c b/arch/powerpc/kernel/iomap.c
index 5ac84efc6ede..9fe4fb3b08aa 100644
--- a/arch/powerpc/kernel/iomap.c
+++ b/arch/powerpc/kernel/iomap.c
@@ -15,23 +15,23 @@
* Here comes the ppc64 implementation of the IOMAP
* interfaces.
*/
-unsigned int ioread8(void __iomem *addr)
+unsigned int ioread8(const void __iomem *addr)
{
return readb(addr);
}
-unsigned int ioread16(void __iomem *addr)
+unsigned int ioread16(const void __iomem *addr)
{
return readw(addr);
}
-unsigned int ioread16be(void __iomem *addr)
+unsigned int ioread16be(const void __iomem *addr)
{
return readw_be(addr);
}
-unsigned int ioread32(void __iomem *addr)
+unsigned int ioread32(const void __iomem *addr)
{
return readl(addr);
}
-unsigned int ioread32be(void __iomem *addr)
+unsigned int ioread32be(const void __iomem *addr)
{
return readl_be(addr);
}
@@ -41,27 +41,27 @@ EXPORT_SYMBOL(ioread16be);
EXPORT_SYMBOL(ioread32);
EXPORT_SYMBOL(ioread32be);
#ifdef __powerpc64__
-u64 ioread64(void __iomem *addr)
+u64 ioread64(const void __iomem *addr)
{
return readq(addr);
}
-u64 ioread64_lo_hi(void __iomem *addr)
+u64 ioread64_lo_hi(const void __iomem *addr)
{
return readq(addr);
}
-u64 ioread64_hi_lo(void __iomem *addr)
+u64 ioread64_hi_lo(const void __iomem *addr)
{
return readq(addr);
}
-u64 ioread64be(void __iomem *addr)
+u64 ioread64be(const void __iomem *addr)
{
return readq_be(addr);
}
-u64 ioread64be_lo_hi(void __iomem *addr)
+u64 ioread64be_lo_hi(const void __iomem *addr)
{
return readq_be(addr);
}
-u64 ioread64be_hi_lo(void __iomem *addr)
+u64 ioread64be_hi_lo(const void __iomem *addr)
{
return readq_be(addr);
}
@@ -139,15 +139,15 @@ EXPORT_SYMBOL(iowrite64be_hi_lo);
* FIXME! We could make these do EEH handling if we really
* wanted. Not clear if we do.
*/
-void ioread8_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread8_rep(const void __iomem *addr, void *dst, unsigned long count)
{
readsb(addr, dst, count);
}
-void ioread16_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread16_rep(const void __iomem *addr, void *dst, unsigned long count)
{
readsw(addr, dst, count);
}
-void ioread32_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread32_rep(const void __iomem *addr, void *dst, unsigned long count)
{
readsl(addr, dst, count);
}
diff --git a/arch/sh/kernel/iomap.c b/arch/sh/kernel/iomap.c
index ef9e2c97cbb7..0a0dff4e66de 100644
--- a/arch/sh/kernel/iomap.c
+++ b/arch/sh/kernel/iomap.c
@@ -8,31 +8,31 @@
#include <linux/module.h>
#include <linux/io.h>
-unsigned int ioread8(void __iomem *addr)
+unsigned int ioread8(const void __iomem *addr)
{
return readb(addr);
}
EXPORT_SYMBOL(ioread8);
-unsigned int ioread16(void __iomem *addr)
+unsigned int ioread16(const void __iomem *addr)
{
return readw(addr);
}
EXPORT_SYMBOL(ioread16);
-unsigned int ioread16be(void __iomem *addr)
+unsigned int ioread16be(const void __iomem *addr)
{
return be16_to_cpu(__raw_readw(addr));
}
EXPORT_SYMBOL(ioread16be);
-unsigned int ioread32(void __iomem *addr)
+unsigned int ioread32(const void __iomem *addr)
{
return readl(addr);
}
EXPORT_SYMBOL(ioread32);
-unsigned int ioread32be(void __iomem *addr)
+unsigned int ioread32be(const void __iomem *addr)
{
return be32_to_cpu(__raw_readl(addr));
}
@@ -74,7 +74,7 @@ EXPORT_SYMBOL(iowrite32be);
* convert to CPU byte order. We write in "IO byte
* order" (we also don't have IO barriers).
*/
-static inline void mmio_insb(void __iomem *addr, u8 *dst, int count)
+static inline void mmio_insb(const void __iomem *addr, u8 *dst, int count)
{
while (--count >= 0) {
u8 data = __raw_readb(addr);
@@ -83,7 +83,7 @@ static inline void mmio_insb(void __iomem *addr, u8 *dst, int count)
}
}
-static inline void mmio_insw(void __iomem *addr, u16 *dst, int count)
+static inline void mmio_insw(const void __iomem *addr, u16 *dst, int count)
{
while (--count >= 0) {
u16 data = __raw_readw(addr);
@@ -92,7 +92,7 @@ static inline void mmio_insw(void __iomem *addr, u16 *dst, int count)
}
}
-static inline void mmio_insl(void __iomem *addr, u32 *dst, int count)
+static inline void mmio_insl(const void __iomem *addr, u32 *dst, int count)
{
while (--count >= 0) {
u32 data = __raw_readl(addr);
@@ -125,19 +125,19 @@ static inline void mmio_outsl(void __iomem *addr, const u32 *src, int count)
}
}
-void ioread8_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread8_rep(const void __iomem *addr, void *dst, unsigned long count)
{
mmio_insb(addr, dst, count);
}
EXPORT_SYMBOL(ioread8_rep);
-void ioread16_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread16_rep(const void __iomem *addr, void *dst, unsigned long count)
{
mmio_insw(addr, dst, count);
}
EXPORT_SYMBOL(ioread16_rep);
-void ioread32_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread32_rep(const void __iomem *addr, void *dst, unsigned long count)
{
mmio_insl(addr, dst, count);
}
diff --git a/drivers/sh/clk/cpg.c b/drivers/sh/clk/cpg.c
index a5cacfe24a42..fd72d9088bdc 100644
--- a/drivers/sh/clk/cpg.c
+++ b/drivers/sh/clk/cpg.c
@@ -40,7 +40,7 @@ static int sh_clk_mstp_enable(struct clk *clk)
{
sh_clk_write(sh_clk_read(clk) & ~(1 << clk->enable_bit), clk);
if (clk->status_reg) {
- unsigned int (*read)(void __iomem *addr);
+ unsigned int (*read)(const void __iomem *addr);
int i;
void __iomem *mapped_status = (phys_addr_t)clk->status_reg -
(phys_addr_t)clk->enable_reg + clk->mapped_reg;
diff --git a/include/asm-generic/iomap.h b/include/asm-generic/iomap.h
index 9d28a5e82f73..649224664969 100644
--- a/include/asm-generic/iomap.h
+++ b/include/asm-generic/iomap.h
@@ -26,14 +26,14 @@
* in the low address range. Architectures for which this is not
* true can't use this generic implementation.
*/
-extern unsigned int ioread8(void __iomem *);
-extern unsigned int ioread16(void __iomem *);
-extern unsigned int ioread16be(void __iomem *);
-extern unsigned int ioread32(void __iomem *);
-extern unsigned int ioread32be(void __iomem *);
+extern unsigned int ioread8(const void __iomem *);
+extern unsigned int ioread16(const void __iomem *);
+extern unsigned int ioread16be(const void __iomem *);
+extern unsigned int ioread32(const void __iomem *);
+extern unsigned int ioread32be(const void __iomem *);
#ifdef CONFIG_64BIT
-extern u64 ioread64(void __iomem *);
-extern u64 ioread64be(void __iomem *);
+extern u64 ioread64(const void __iomem *);
+extern u64 ioread64be(const void __iomem *);
#endif
#ifdef readq
@@ -41,10 +41,10 @@ extern u64 ioread64be(void __iomem *);
#define ioread64_hi_lo ioread64_hi_lo
#define ioread64be_lo_hi ioread64be_lo_hi
#define ioread64be_hi_lo ioread64be_hi_lo
-extern u64 ioread64_lo_hi(void __iomem *addr);
-extern u64 ioread64_hi_lo(void __iomem *addr);
-extern u64 ioread64be_lo_hi(void __iomem *addr);
-extern u64 ioread64be_hi_lo(void __iomem *addr);
+extern u64 ioread64_lo_hi(const void __iomem *addr);
+extern u64 ioread64_hi_lo(const void __iomem *addr);
+extern u64 ioread64be_lo_hi(const void __iomem *addr);
+extern u64 ioread64be_hi_lo(const void __iomem *addr);
#endif
extern void iowrite8(u8, void __iomem *);
@@ -79,9 +79,9 @@ extern void iowrite64be_hi_lo(u64 val, void __iomem *addr);
* memory across multiple ports, use "memcpy_toio()"
* and friends.
*/
-extern void ioread8_rep(void __iomem *port, void *buf, unsigned long count);
-extern void ioread16_rep(void __iomem *port, void *buf, unsigned long count);
-extern void ioread32_rep(void __iomem *port, void *buf, unsigned long count);
+extern void ioread8_rep(const void __iomem *port, void *buf, unsigned long count);
+extern void ioread16_rep(const void __iomem *port, void *buf, unsigned long count);
+extern void ioread32_rep(const void __iomem *port, void *buf, unsigned long count);
extern void iowrite8_rep(void __iomem *port, const void *buf, unsigned long count);
extern void iowrite16_rep(void __iomem *port, const void *buf, unsigned long count);
diff --git a/include/linux/io-64-nonatomic-hi-lo.h b/include/linux/io-64-nonatomic-hi-lo.h
index ae21b72cce85..f32522bb3aa5 100644
--- a/include/linux/io-64-nonatomic-hi-lo.h
+++ b/include/linux/io-64-nonatomic-hi-lo.h
@@ -57,7 +57,7 @@ static inline void hi_lo_writeq_relaxed(__u64 val, volatile void __iomem *addr)
#ifndef ioread64_hi_lo
#define ioread64_hi_lo ioread64_hi_lo
-static inline u64 ioread64_hi_lo(void __iomem *addr)
+static inline u64 ioread64_hi_lo(const void __iomem *addr)
{
u32 low, high;
@@ -79,7 +79,7 @@ static inline void iowrite64_hi_lo(u64 val, void __iomem *addr)
#ifndef ioread64be_hi_lo
#define ioread64be_hi_lo ioread64be_hi_lo
-static inline u64 ioread64be_hi_lo(void __iomem *addr)
+static inline u64 ioread64be_hi_lo(const void __iomem *addr)
{
u32 low, high;
diff --git a/include/linux/io-64-nonatomic-lo-hi.h b/include/linux/io-64-nonatomic-lo-hi.h
index faaa842dbdb9..448a21435dba 100644
--- a/include/linux/io-64-nonatomic-lo-hi.h
+++ b/include/linux/io-64-nonatomic-lo-hi.h
@@ -57,7 +57,7 @@ static inline void lo_hi_writeq_relaxed(__u64 val, volatile void __iomem *addr)
#ifndef ioread64_lo_hi
#define ioread64_lo_hi ioread64_lo_hi
-static inline u64 ioread64_lo_hi(void __iomem *addr)
+static inline u64 ioread64_lo_hi(const void __iomem *addr)
{
u32 low, high;
@@ -79,7 +79,7 @@ static inline void iowrite64_lo_hi(u64 val, void __iomem *addr)
#ifndef ioread64be_lo_hi
#define ioread64be_lo_hi ioread64be_lo_hi
-static inline u64 ioread64be_lo_hi(void __iomem *addr)
+static inline u64 ioread64be_lo_hi(const void __iomem *addr)
{
u32 low, high;
diff --git a/lib/iomap.c b/lib/iomap.c
index e909ab71e995..fbaa3e8f19d6 100644
--- a/lib/iomap.c
+++ b/lib/iomap.c
@@ -70,27 +70,27 @@ static void bad_io_access(unsigned long port, const char *access)
#define mmio_read64be(addr) swab64(readq(addr))
#endif
-unsigned int ioread8(void __iomem *addr)
+unsigned int ioread8(const void __iomem *addr)
{
IO_COND(addr, return inb(port), return readb(addr));
return 0xff;
}
-unsigned int ioread16(void __iomem *addr)
+unsigned int ioread16(const void __iomem *addr)
{
IO_COND(addr, return inw(port), return readw(addr));
return 0xffff;
}
-unsigned int ioread16be(void __iomem *addr)
+unsigned int ioread16be(const void __iomem *addr)
{
IO_COND(addr, return pio_read16be(port), return mmio_read16be(addr));
return 0xffff;
}
-unsigned int ioread32(void __iomem *addr)
+unsigned int ioread32(const void __iomem *addr)
{
IO_COND(addr, return inl(port), return readl(addr));
return 0xffffffff;
}
-unsigned int ioread32be(void __iomem *addr)
+unsigned int ioread32be(const void __iomem *addr)
{
IO_COND(addr, return pio_read32be(port), return mmio_read32be(addr));
return 0xffffffff;
@@ -142,26 +142,26 @@ static u64 pio_read64be_hi_lo(unsigned long port)
return lo | (hi << 32);
}
-u64 ioread64_lo_hi(void __iomem *addr)
+u64 ioread64_lo_hi(const void __iomem *addr)
{
IO_COND(addr, return pio_read64_lo_hi(port), return readq(addr));
return 0xffffffffffffffffULL;
}
-u64 ioread64_hi_lo(void __iomem *addr)
+u64 ioread64_hi_lo(const void __iomem *addr)
{
IO_COND(addr, return pio_read64_hi_lo(port), return readq(addr));
return 0xffffffffffffffffULL;
}
-u64 ioread64be_lo_hi(void __iomem *addr)
+u64 ioread64be_lo_hi(const void __iomem *addr)
{
IO_COND(addr, return pio_read64be_lo_hi(port),
return mmio_read64be(addr));
return 0xffffffffffffffffULL;
}
-u64 ioread64be_hi_lo(void __iomem *addr)
+u64 ioread64be_hi_lo(const void __iomem *addr)
{
IO_COND(addr, return pio_read64be_hi_lo(port),
return mmio_read64be(addr));
@@ -275,7 +275,7 @@ EXPORT_SYMBOL(iowrite64be_hi_lo);
* order" (we also don't have IO barriers).
*/
#ifndef mmio_insb
-static inline void mmio_insb(void __iomem *addr, u8 *dst, int count)
+static inline void mmio_insb(const void __iomem *addr, u8 *dst, int count)
{
while (--count >= 0) {
u8 data = __raw_readb(addr);
@@ -283,7 +283,7 @@ static inline void mmio_insb(void __iomem *addr, u8 *dst, int count)
dst++;
}
}
-static inline void mmio_insw(void __iomem *addr, u16 *dst, int count)
+static inline void mmio_insw(const void __iomem *addr, u16 *dst, int count)
{
while (--count >= 0) {
u16 data = __raw_readw(addr);
@@ -291,7 +291,7 @@ static inline void mmio_insw(void __iomem *addr, u16 *dst, int count)
dst++;
}
}
-static inline void mmio_insl(void __iomem *addr, u32 *dst, int count)
+static inline void mmio_insl(const void __iomem *addr, u32 *dst, int count)
{
while (--count >= 0) {
u32 data = __raw_readl(addr);
@@ -325,15 +325,15 @@ static inline void mmio_outsl(void __iomem *addr, const u32 *src, int count)
}
#endif
-void ioread8_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread8_rep(const void __iomem *addr, void *dst, unsigned long count)
{
IO_COND(addr, insb(port,dst,count), mmio_insb(addr, dst, count));
}
-void ioread16_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread16_rep(const void __iomem *addr, void *dst, unsigned long count)
{
IO_COND(addr, insw(port,dst,count), mmio_insw(addr, dst, count));
}
-void ioread32_rep(void __iomem *addr, void *dst, unsigned long count)
+void ioread32_rep(const void __iomem *addr, void *dst, unsigned long count)
{
IO_COND(addr, insl(port,dst,count), mmio_insl(addr, dst, count));
}
--
2.17.1
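For readers following the series, here is a minimal userspace sketch of what the const qualifier buys callers. It is only an illustration under stated assumptions: fake_ioread32(), struct fake_reg_ops, read_status() and demo_read() are hypothetical names, the __iomem annotation is dropped, and a plain memory read stands in for the real MMIO access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ioread32(): it only reads, so addr is const. */
static unsigned int fake_ioread32(const void *addr)
{
	assert(addr != NULL);
	return *(const volatile uint32_t *)addr;
}

/* Callback typed like the updated read/db_ioread function pointers. */
struct fake_reg_ops {
	unsigned int (*read)(const void *addr);
};

/* A caller holding only a pointer to const can read without a cast. */
static unsigned int read_status(const struct fake_reg_ops *ops,
				const uint32_t *status_reg)
{
	return ops->read(status_reg);
}

/* Read a constant "register" through the callback. */
static unsigned int demo_read(void)
{
	static const uint32_t reg = 0xdeadbeef;
	const struct fake_reg_ops ops = { .read = fake_ioread32 };

	return read_status(&ops, &reg);
}
```

With the old non-const prototype, read_status() would need a cast that discards const. The function pointer type has to change together with the accessor, which is presumably why the callback declarations are touched in the same series.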
* [PATCH v3 2/4] rtl818x: Constify ioreadX() iomem argument (as in generic implementation)
From: Krzysztof Kozlowski @ 2020-07-09 7:28 UTC
To: Richard Henderson, Ivan Kokshaysky, Matt Turner,
James E.J. Bottomley, Helge Deller, Michael Ellerman,
Benjamin Herrenschmidt, Paul Mackerras, Yoshinori Sato,
Rich Felker, Kalle Valo, David S. Miller, Jakub Kicinski,
Dave Jiang, Jon Mason, Allen Hubbe, Michael S. Tsirkin,
Jason Wang, Arnd Bergmann, Geert Uytterhoeven, Andrew Morton,
linux-alpha, linux-kernel, linux-parisc, linuxppc-dev, linux-sh,
linux-wireless, netdev, linux-ntb, virtualization, linux-arch
Cc: Krzysztof Kozlowski
In-Reply-To: <20200709072837.5869-1-krzk@kernel.org>
The ioreadX() helpers have an inconsistent interface. On some architectures
the void __iomem * address argument is a pointer to const; on others it is not.
Implementations of ioreadX() do not modify the memory under the address
so they can be converted to a "const" version for const-safety and
consistency among architectures.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
---
drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h b/drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h
index 7948a2da195a..2ff00800d45b 100644
--- a/drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h
+++ b/drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h
@@ -150,17 +150,17 @@ void rtl8180_write_phy(struct ieee80211_hw *dev, u8 addr, u32 data);
void rtl8180_set_anaparam(struct rtl8180_priv *priv, u32 anaparam);
void rtl8180_set_anaparam2(struct rtl8180_priv *priv, u32 anaparam2);
-static inline u8 rtl818x_ioread8(struct rtl8180_priv *priv, u8 __iomem *addr)
+static inline u8 rtl818x_ioread8(struct rtl8180_priv *priv, const u8 __iomem *addr)
{
return ioread8(addr);
}
-static inline u16 rtl818x_ioread16(struct rtl8180_priv *priv, __le16 __iomem *addr)
+static inline u16 rtl818x_ioread16(struct rtl8180_priv *priv, const __le16 __iomem *addr)
{
return ioread16(addr);
}
-static inline u32 rtl818x_ioread32(struct rtl8180_priv *priv, __le32 __iomem *addr)
+static inline u32 rtl818x_ioread32(struct rtl8180_priv *priv, const __le32 __iomem *addr)
{
return ioread32(addr);
}
--
2.17.1
* [PATCH v3 3/4] ntb: intel: Constify ioreadX() iomem argument (as in generic implementation)
From: Krzysztof Kozlowski @ 2020-07-09 7:28 UTC
To: Richard Henderson, Ivan Kokshaysky, Matt Turner,
James E.J. Bottomley, Helge Deller, Michael Ellerman,
Benjamin Herrenschmidt, Paul Mackerras, Yoshinori Sato,
Rich Felker, Kalle Valo, David S. Miller, Jakub Kicinski,
Dave Jiang, Jon Mason, Allen Hubbe, Michael S. Tsirkin,
Jason Wang, Arnd Bergmann, Geert Uytterhoeven, Andrew Morton,
linux-alpha, linux-kernel, linux-parisc, linuxppc-dev, linux-sh,
linux-wireless, netdev, linux-ntb, virtualization, linux-arch
Cc: Krzysztof Kozlowski
In-Reply-To: <20200709072837.5869-1-krzk@kernel.org>
The ioreadX() helpers have an inconsistent interface. On some architectures
the void __iomem * address argument is a pointer to const; on others it is not.
Implementations of ioreadX() do not modify the memory under the address
so they can be converted to a "const" version for const-safety and
consistency among architectures.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Dave Jiang <dave.jiang@intel.com>
---
drivers/ntb/hw/intel/ntb_hw_gen1.c | 2 +-
drivers/ntb/hw/intel/ntb_hw_gen3.h | 2 +-
drivers/ntb/hw/intel/ntb_hw_intel.h | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/ntb/hw/intel/ntb_hw_gen1.c b/drivers/ntb/hw/intel/ntb_hw_gen1.c
index 423f9b8fbbcf..3185efeab487 100644
--- a/drivers/ntb/hw/intel/ntb_hw_gen1.c
+++ b/drivers/ntb/hw/intel/ntb_hw_gen1.c
@@ -1205,7 +1205,7 @@ int intel_ntb_peer_spad_write(struct ntb_dev *ntb, int pidx, int sidx,
ndev->peer_reg->spad);
}
-static u64 xeon_db_ioread(void __iomem *mmio)
+static u64 xeon_db_ioread(const void __iomem *mmio)
{
return (u64)ioread16(mmio);
}
diff --git a/drivers/ntb/hw/intel/ntb_hw_gen3.h b/drivers/ntb/hw/intel/ntb_hw_gen3.h
index 2bc5d8356045..dea93989942d 100644
--- a/drivers/ntb/hw/intel/ntb_hw_gen3.h
+++ b/drivers/ntb/hw/intel/ntb_hw_gen3.h
@@ -91,7 +91,7 @@
#define GEN3_DB_TOTAL_SHIFT 33
#define GEN3_SPAD_COUNT 16
-static inline u64 gen3_db_ioread(void __iomem *mmio)
+static inline u64 gen3_db_ioread(const void __iomem *mmio)
{
return ioread64(mmio);
}
diff --git a/drivers/ntb/hw/intel/ntb_hw_intel.h b/drivers/ntb/hw/intel/ntb_hw_intel.h
index d61fcd91714b..05e2335c9596 100644
--- a/drivers/ntb/hw/intel/ntb_hw_intel.h
+++ b/drivers/ntb/hw/intel/ntb_hw_intel.h
@@ -103,7 +103,7 @@ struct intel_ntb_dev;
struct intel_ntb_reg {
int (*poll_link)(struct intel_ntb_dev *ndev);
int (*link_is_up)(struct intel_ntb_dev *ndev);
- u64 (*db_ioread)(void __iomem *mmio);
+ u64 (*db_ioread)(const void __iomem *mmio);
void (*db_iowrite)(u64 db_bits, void __iomem *mmio);
unsigned long ntb_ctl;
resource_size_t db_size;
--
2.17.1
* [PATCH v3 4/4] virtio: pci: Constify ioreadX() iomem argument (as in generic implementation)
From: Krzysztof Kozlowski @ 2020-07-09 7:28 UTC
To: Richard Henderson, Ivan Kokshaysky, Matt Turner,
James E.J. Bottomley, Helge Deller, Michael Ellerman,
Benjamin Herrenschmidt, Paul Mackerras, Yoshinori Sato,
Rich Felker, Kalle Valo, David S. Miller, Jakub Kicinski,
Dave Jiang, Jon Mason, Allen Hubbe, Michael S. Tsirkin,
Jason Wang, Arnd Bergmann, Geert Uytterhoeven, Andrew Morton,
linux-alpha, linux-kernel, linux-parisc, linuxppc-dev, linux-sh,
linux-wireless, netdev, linux-ntb, virtualization, linux-arch
Cc: Krzysztof Kozlowski
In-Reply-To: <20200709072837.5869-1-krzk@kernel.org>
The ioreadX() helpers have an inconsistent interface. On some architectures
the void __iomem * address argument is a pointer to const; on others it is not.
Implementations of ioreadX() do not modify the memory under the address
so they can be converted to a "const" version for const-safety and
consistency among architectures.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
drivers/virtio/virtio_pci_modern.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index db93cedd262f..90eff165a719 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -27,16 +27,16 @@
* method, i.e. 32-bit accesses for 32-bit fields, 16-bit accesses
* for 16-bit fields and 8-bit accesses for 8-bit fields.
*/
-static inline u8 vp_ioread8(u8 __iomem *addr)
+static inline u8 vp_ioread8(const u8 __iomem *addr)
{
return ioread8(addr);
}
-static inline u16 vp_ioread16 (__le16 __iomem *addr)
+static inline u16 vp_ioread16 (const __le16 __iomem *addr)
{
return ioread16(addr);
}
-static inline u32 vp_ioread32(__le32 __iomem *addr)
+static inline u32 vp_ioread32(const __le32 __iomem *addr)
{
return ioread32(addr);
}
--
2.17.1
* Re: [PATCH v3 1/4] iomap: Constify ioreadX() iomem argument (as in generic implementation)
From: Krzysztof Kozlowski @ 2020-07-09 7:32 UTC
To: Richard Henderson, Ivan Kokshaysky, Matt Turner,
James E.J. Bottomley, Helge Deller, Michael Ellerman,
Benjamin Herrenschmidt, Paul Mackerras, Yoshinori Sato,
Rich Felker, Kalle Valo, David S. Miller, Jakub Kicinski,
Dave Jiang, Jon Mason, Allen Hubbe, Michael S. Tsirkin,
Jason Wang, Arnd Bergmann, Geert Uytterhoeven, Andrew Morton,
linux-alpha, linux-kernel, linux-parisc, linuxppc-dev, linux-sh,
linux-wireless, netdev, linux-ntb, virtualization, linux-arch
In-Reply-To: <20200709072837.5869-2-krzk@kernel.org>
On Thu, Jul 09, 2020 at 09:28:34AM +0200, Krzysztof Kozlowski wrote:
> The ioreadX() and ioreadX_rep() helpers have inconsistent interface. On
> some architectures void *__iomem address argument is a pointer to const,
> on some not.
>
> Implementations of ioreadX() do not modify the memory under the address
> so they can be converted to a "const" version for const-safety and
> consistency among architectures.
>
> Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
> Reviewed-by: Arnd Bergmann <arnd@arndb.de>
I forgot to put here one more Ack, for PowerPC:
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
https://lore.kernel.org/lkml/87ftedj0zz.fsf@mpe.ellerman.id.au/
Best regards,
Krzysztof
* [PATCH RESEND 1/2] powerpc/mce: Add MCE notification chain
From: Santosh Sivaraj @ 2020-07-09 7:56 UTC
To: linuxppc-dev
Cc: Santosh Sivaraj, Aneesh Kumar K.V, Mahesh Salgaonkar,
Ganesh Goudar, Oliver, Vaibhav Jain
Introduce a notification chain that lets subscribers know about uncorrected
memory errors (UE). This would help prospective users in the pmem or nvdimm
subsystems track bad blocks for better handling of persistent memory allocations.
Signed-off-by: Santosh S <santosh@fossix.org>
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
---
arch/powerpc/include/asm/mce.h | 2 ++
arch/powerpc/kernel/mce.c | 15 +++++++++++++++
2 files changed, 17 insertions(+)
Sending the two patches together, so the dependencies are clear. The earlier patch reviews are
here: https://lore.kernel.org/linuxppc-dev/20200330071219.12284-1-ganeshgr@linux.ibm.com/
Rebased the patches on top of 5.8-rc4.
diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index 376a395daf329..a57b0772702a9 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -220,6 +220,8 @@ extern void machine_check_print_event_info(struct machine_check_event *evt,
unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr);
extern void mce_common_process_ue(struct pt_regs *regs,
struct mce_error_info *mce_err);
+extern int mce_register_notifier(struct notifier_block *nb);
+extern int mce_unregister_notifier(struct notifier_block *nb);
#ifdef CONFIG_PPC_BOOK3S_64
void flush_and_reload_slb(void);
#endif /* CONFIG_PPC_BOOK3S_64 */
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index fd90c0eda2290..b7b3ed4e61937 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -49,6 +49,20 @@ static struct irq_work mce_ue_event_irq_work = {
DECLARE_WORK(mce_ue_event_work, machine_process_ue_event);
+static BLOCKING_NOTIFIER_HEAD(mce_notifier_list);
+
+int mce_register_notifier(struct notifier_block *nb)
+{
+ return blocking_notifier_chain_register(&mce_notifier_list, nb);
+}
+EXPORT_SYMBOL_GPL(mce_register_notifier);
+
+int mce_unregister_notifier(struct notifier_block *nb)
+{
+ return blocking_notifier_chain_unregister(&mce_notifier_list, nb);
+}
+EXPORT_SYMBOL_GPL(mce_unregister_notifier);
+
static void mce_set_error_info(struct machine_check_event *mce,
struct mce_error_info *mce_err)
{
@@ -278,6 +292,7 @@ static void machine_process_ue_event(struct work_struct *work)
while (__this_cpu_read(mce_ue_count) > 0) {
index = __this_cpu_read(mce_ue_count) - 1;
evt = this_cpu_ptr(&mce_ue_event_queue[index]);
+ blocking_notifier_call_chain(&mce_notifier_list, 0, evt);
#ifdef CONFIG_MEMORY_FAILURE
/*
* This should probably queued elsewhere, but
--
2.26.2
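As a rough illustration of the register/unregister/call pattern this patch exposes, here is a minimal userspace sketch of a notifier chain. Everything named demo_* is hypothetical. The sketch assumes a plain singly linked list and leaves out what the kernel's blocking notifier chain adds on top, such as locking and priority ordering.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NOTIFY_DONE 0	/* callback was not interested */
#define DEMO_NOTIFY_OK   1	/* callback handled the event */

struct demo_notifier {
	int (*call)(struct demo_notifier *nb, unsigned long val, void *data);
	struct demo_notifier *next;
};

static struct demo_notifier *demo_chain;

/* Append a subscriber to the end of the chain. */
static int demo_register(struct demo_notifier *nb)
{
	struct demo_notifier **p = &demo_chain;

	assert(nb != NULL && nb->call != NULL);
	while (*p)
		p = &(*p)->next;
	nb->next = NULL;
	*p = nb;
	return 0;
}

/* Unlink a subscriber; returns -1 if it was not on the chain. */
static int demo_unregister(struct demo_notifier *nb)
{
	struct demo_notifier **p;

	for (p = &demo_chain; *p; p = &(*p)->next) {
		if (*p == nb) {
			*p = nb->next;
			return 0;
		}
	}
	return -1;
}

/* Call every subscriber in order; return the last callback's result. */
static int demo_call_chain(unsigned long val, void *data)
{
	struct demo_notifier *nb;
	int ret = DEMO_NOTIFY_DONE;

	for (nb = demo_chain; nb; nb = nb->next)
		ret = nb->call(nb, val, data);
	return ret;
}

/* Example subscriber: counts the events it is shown. */
static int demo_count_events(struct demo_notifier *nb, unsigned long val,
			     void *data)
{
	(void)nb;
	(void)val;
	++*(int *)data;
	return DEMO_NOTIFY_OK;
}
```

A subscriber fills in a struct demo_notifier with its callback, registers it once, and is then invoked for every event passed to demo_call_chain() until it unregisters.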
* [PATCH RESEND 2/2] papr/scm: Add bad memory ranges to nvdimm bad ranges
From: Santosh Sivaraj @ 2020-07-09 7:56 UTC
To: linuxppc-dev
Cc: Santosh Sivaraj, Aneesh Kumar K.V, Mahesh Salgaonkar,
Ganesh Goudar, Oliver, Vaibhav Jain
In-Reply-To: <20200709075635.643740-1-santosh@fossix.org>
Subscribe to the MCE notification chain and add the physical address that
generated a memory error to the nvdimm bad range list.
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
---
arch/powerpc/platforms/pseries/papr_scm.c | 98 ++++++++++++++++++++++-
1 file changed, 97 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 9c569078a09fd..5ebb1c797795d 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -13,9 +13,11 @@
#include <linux/platform_device.h>
#include <linux/delay.h>
#include <linux/seq_buf.h>
+#include <linux/nd.h>
#include <asm/plpar_wrappers.h>
#include <asm/papr_pdsm.h>
+#include <asm/mce.h>
#define BIND_ANY_ADDR (~0ul)
@@ -80,6 +82,7 @@ struct papr_scm_priv {
struct resource res;
struct nd_region *region;
struct nd_interleave_set nd_set;
+ struct list_head region_list;
/* Protect dimm health data from concurrent read/writes */
struct mutex health_mutex;
@@ -91,6 +94,9 @@ struct papr_scm_priv {
u64 health_bitmap;
};
+LIST_HEAD(papr_nd_regions);
+DEFINE_MUTEX(papr_ndr_lock);
+
static int drc_pmem_bind(struct papr_scm_priv *p)
{
unsigned long ret[PLPAR_HCALL_BUFSIZE];
@@ -759,6 +765,10 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
dev_info(dev, "Region registered with target node %d and online node %d",
target_nid, online_nid);
+ mutex_lock(&papr_ndr_lock);
+ list_add_tail(&p->region_list, &papr_nd_regions);
+ mutex_unlock(&papr_ndr_lock);
+
return 0;
err: nvdimm_bus_unregister(p->bus);
@@ -766,6 +776,70 @@ err: nvdimm_bus_unregister(p->bus);
return -ENXIO;
}
+static void papr_scm_add_badblock(struct nd_region *region,
+ struct nvdimm_bus *bus, u64 phys_addr)
+{
+ u64 aligned_addr = ALIGN_DOWN(phys_addr, L1_CACHE_BYTES);
+
+ if (nvdimm_bus_add_badrange(bus, aligned_addr, L1_CACHE_BYTES)) {
+ pr_err("Bad block registration for 0x%llx failed\n", phys_addr);
+ return;
+ }
+
+ pr_debug("Add memory range (0x%llx - 0x%llx) as bad range\n",
+ aligned_addr, aligned_addr + L1_CACHE_BYTES);
+
+ nvdimm_region_notify(region, NVDIMM_REVALIDATE_POISON);
+}
+
+static int handle_mce_ue(struct notifier_block *nb, unsigned long val,
+ void *data)
+{
+ struct machine_check_event *evt = data;
+ struct papr_scm_priv *p;
+ u64 phys_addr;
+ bool found = false;
+
+ if (evt->error_type != MCE_ERROR_TYPE_UE)
+ return NOTIFY_DONE;
+
+ if (list_empty(&papr_nd_regions))
+ return NOTIFY_DONE;
+
+ /*
+ * The physical address obtained here is PAGE_SIZE aligned, so get the
+ * exact address from the effective address
+ */
+ phys_addr = evt->u.ue_error.physical_address +
+ (evt->u.ue_error.effective_address & ~PAGE_MASK);
+
+ if (!evt->u.ue_error.physical_address_provided ||
+ !is_zone_device_page(pfn_to_page(phys_addr >> PAGE_SHIFT)))
+ return NOTIFY_DONE;
+
+ /* mce notifier is called from a process context, so mutex is safe */
+ mutex_lock(&papr_ndr_lock);
+ list_for_each_entry(p, &papr_nd_regions, region_list) {
+ struct resource res = p->res;
+
+ if (phys_addr >= res.start && phys_addr <= res.end) {
+ found = true;
+ break;
+ }
+ }
+
+ if (found)
+ papr_scm_add_badblock(p->region, p->bus, phys_addr);
+
+ mutex_unlock(&papr_ndr_lock);
+
+ return found ? NOTIFY_OK : NOTIFY_DONE;
+}
+
+static struct notifier_block mce_ue_nb = {
+ .notifier_call = handle_mce_ue
+};
+
static int papr_scm_probe(struct platform_device *pdev)
{
struct device_node *dn = pdev->dev.of_node;
@@ -866,6 +940,10 @@ static int papr_scm_remove(struct platform_device *pdev)
{
struct papr_scm_priv *p = platform_get_drvdata(pdev);
+ mutex_lock(&papr_ndr_lock);
+ list_del(&(p->region_list));
+ mutex_unlock(&papr_ndr_lock);
+
nvdimm_bus_unregister(p->bus);
drc_pmem_unbind(p);
kfree(p->bus_desc.provider_name);
@@ -888,7 +966,25 @@ static struct platform_driver papr_scm_driver = {
},
};
-module_platform_driver(papr_scm_driver);
+static int __init papr_scm_init(void)
+{
+ int ret;
+
+ ret = platform_driver_register(&papr_scm_driver);
+ if (!ret)
+ mce_register_notifier(&mce_ue_nb);
+
+return ret;
+}
+module_init(papr_scm_init);
+
+static void __exit papr_scm_exit(void)
+{
+ mce_unregister_notifier(&mce_ue_nb);
+ platform_driver_unregister(&papr_scm_driver);
+}
+module_exit(papr_scm_exit);
+
MODULE_DEVICE_TABLE(of, papr_scm_match);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("IBM Corporation");
--
2.26.2
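The address arithmetic in handle_mce_ue() and papr_scm_add_badblock() can be sketched in userspace as follows. This is only a worked example under assumptions: the demo_* names are hypothetical, and the 64K page size and 128-byte cache line are assumed values chosen for illustration, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE  0x10000ULL	/* assumed 64K pages */
#define DEMO_CACHE_LINE 128ULL		/* assumed L1 cache line size */

/*
 * The reported physical address is page aligned, so the offset within the
 * page is taken from the low bits of the effective address.
 */
static uint64_t demo_exact_phys(uint64_t page_phys, uint64_t effective)
{
	assert((page_phys & (DEMO_PAGE_SIZE - 1)) == 0);
	return page_phys + (effective & (DEMO_PAGE_SIZE - 1));
}

/* Round down to the cache line that is registered as the bad range. */
static uint64_t demo_badrange_start(uint64_t phys)
{
	return phys & ~(DEMO_CACHE_LINE - 1);
}

/* Inclusive range check, like the res.start/res.end comparison. */
static int demo_in_region(uint64_t phys, uint64_t start, uint64_t end)
{
	return phys >= start && phys <= end;
}
```

For a page at 0x40000000 and an effective address ending in 0xabcd, the exact address is 0x4000abcd and the bad range starts at 0x4000ab80.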
* Re: [PATCH RESEND 1/2] powerpc/mce: Add MCE notification chain
From: Christophe Leroy @ 2020-07-09 8:07 UTC
To: Santosh Sivaraj, linuxppc-dev
Cc: Aneesh Kumar K.V, Oliver, Ganesh Goudar, Mahesh Salgaonkar,
Vaibhav Jain
In-Reply-To: <20200709075635.643740-1-santosh@fossix.org>
On 09/07/2020 at 09:56, Santosh Sivaraj wrote:
> Introduce notification chain which lets know about uncorrected memory
> errors(UE). This would help prospective users in pmem or nvdimm subsystem
> to track bad blocks for better handling of persistent memory allocations.
>
> Signed-off-by: Santosh S <santosh@fossix.org>
> Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
> ---
> arch/powerpc/include/asm/mce.h | 2 ++
> arch/powerpc/kernel/mce.c | 15 +++++++++++++++
> 2 files changed, 17 insertions(+)
>
> Send the two patches together, so the dependencies are clear. The earlier patch reviews are
> here: https://lore.kernel.org/linuxppc-dev/20200330071219.12284-1-ganeshgr@linux.ibm.com/
>
> Rebase the patches on top on 5.8-rc4
>
> diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
> index 376a395daf329..a57b0772702a9 100644
> --- a/arch/powerpc/include/asm/mce.h
> +++ b/arch/powerpc/include/asm/mce.h
> @@ -220,6 +220,8 @@ extern void machine_check_print_event_info(struct machine_check_event *evt,
> unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr);
> extern void mce_common_process_ue(struct pt_regs *regs,
> struct mce_error_info *mce_err);
> +extern int mce_register_notifier(struct notifier_block *nb);
> +extern int mce_unregister_notifier(struct notifier_block *nb);
Using the 'extern' keyword on function declarations is pointless and
should be avoided in new patches (checkpatch.pl --strict usually
complains about it).
> #ifdef CONFIG_PPC_BOOK3S_64
> void flush_and_reload_slb(void);
> #endif /* CONFIG_PPC_BOOK3S_64 */
> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
> index fd90c0eda2290..b7b3ed4e61937 100644
> --- a/arch/powerpc/kernel/mce.c
> +++ b/arch/powerpc/kernel/mce.c
> @@ -49,6 +49,20 @@ static struct irq_work mce_ue_event_irq_work = {
>
> DECLARE_WORK(mce_ue_event_work, machine_process_ue_event);
>
> +static BLOCKING_NOTIFIER_HEAD(mce_notifier_list);
> +
> +int mce_register_notifier(struct notifier_block *nb)
> +{
> + return blocking_notifier_chain_register(&mce_notifier_list, nb);
> +}
> +EXPORT_SYMBOL_GPL(mce_register_notifier);
> +
> +int mce_unregister_notifier(struct notifier_block *nb)
> +{
> + return blocking_notifier_chain_unregister(&mce_notifier_list, nb);
> +}
> +EXPORT_SYMBOL_GPL(mce_unregister_notifier);
> +
> static void mce_set_error_info(struct machine_check_event *mce,
> struct mce_error_info *mce_err)
> {
> @@ -278,6 +292,7 @@ static void machine_process_ue_event(struct work_struct *work)
> while (__this_cpu_read(mce_ue_count) > 0) {
> index = __this_cpu_read(mce_ue_count) - 1;
> evt = this_cpu_ptr(&mce_ue_event_queue[index]);
> + blocking_notifier_call_chain(&mce_notifier_list, 0, evt);
> #ifdef CONFIG_MEMORY_FAILURE
> /*
> * This should probably queued elsewhere, but
>
Christophe
* Re: [PATCH v5 1/4] riscv: Move kernel mapping to vmalloc zone
From: Zong Li @ 2020-07-09 8:15 UTC
To: Palmer Dabbelt
Cc: Albert Ou, Alexandre Ghiti, Anup Patel,
linux-kernel@vger.kernel.org List, Atish Patra, Paul Mackerras,
Paul Walmsley, linux-riscv, linuxppc-dev
In-Reply-To: <mhng-831c4073-aefa-4aa0-a583-6a17f9aff9b7@palmerdabbelt-glaptop1>
On Thu, Jul 9, 2020 at 1:05 PM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
> On Sun, 07 Jun 2020 00:59:46 PDT (-0700), alex@ghiti.fr wrote:
> > This is a preparatory patch for relocatable kernel.
> >
> > The kernel used to be linked at PAGE_OFFSET address and used to be loaded
> > physically at the beginning of the main memory. Therefore, we could use
> > the linear mapping for the kernel mapping.
> >
> > But the relocated kernel base address will be different from PAGE_OFFSET
> > and since in the linear mapping, two different virtual addresses cannot
> > point to the same physical address, the kernel mapping needs to lie outside
> > the linear mapping.
>
> I know it's been a while, but I keep opening this up to review it and just
> can't get over how ugly it is to put the kernel's linear map in the vmalloc
> region.
>
> I guess I don't understand why this is necessary at all. Specifically: why
> can't we just relocate the kernel within the linear map? That would let the
> bootloader put the kernel wherever it wants, modulo the physical memory size we
> support. We'd need to handle the regions that are coupled to the kernel's
> execution address, but we could just put them in an explicit memory region
> which is what we should probably be doing anyway.
The original implementation of relocation doesn't move the kernel's linear map
to the vmalloc region, and I also posted the KASLR RFC patch [1] based on that.
In the original approach we relocate the kernel within the linear map region:
we first calculate a random value as the offset, then move the kernel image to
the new target address, which is obtained by adding this offset to its VA and
PA. That is enough for randomizing the kernel, but it seems to me that if we
want to decouple the kernel's linear mapping, the physical mapping of RAM and
the virtual mapping of RAM, it might be good to move the kernel's mapping out
of the linear region. Even so, it is still an intrusive change. As far as I
know, only arm64 does something like that.
[1] https://patchwork.kernel.org/project/linux-riscv/list/?series=260615
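To make the two-mapping idea concrete, here is a small userspace sketch of the address translation that the quoted patch expresses with __va_to_pa_nodebug(). It is an illustration only: the demo_* names, the PAGE_OFFSET value, and the way the two offsets are initialised are assumptions, not the patch's code.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SZ_2G       0x80000000ULL
#define DEMO_PAGE_OFFSET 0xffffffe000000000ULL	/* illustrative value */
/* VMALLOC_END - SZ_2G + 1 with VMALLOC_END == PAGE_OFFSET - 1 */
#define DEMO_KERNEL_VIRT (DEMO_PAGE_OFFSET - DEMO_SZ_2G)

struct demo_layout {
	uint64_t va_pa_offset;		/* linear mapping: va - pa */
	uint64_t va_kernel_pa_offset;	/* kernel mapping: va - pa */
};

/* Assumed initialisation: RAM is mapped linearly from DEMO_PAGE_OFFSET. */
static struct demo_layout demo_layout_init(uint64_t ram_base_pa,
					   uint64_t kernel_load_pa)
{
	struct demo_layout l;

	assert(kernel_load_pa >= ram_base_pa);
	l.va_pa_offset = DEMO_PAGE_OFFSET - ram_base_pa;
	l.va_kernel_pa_offset = DEMO_KERNEL_VIRT - kernel_load_pa;
	return l;
}

/* Pick the offset by which side of PAGE_OFFSET the address falls on. */
static uint64_t demo_va_to_pa(const struct demo_layout *l, uint64_t va)
{
	return va >= DEMO_PAGE_OFFSET ? va - l->va_pa_offset
				      : va - l->va_kernel_pa_offset;
}
```

With RAM at 0x80000000 and the kernel loaded at 0x80200000, both DEMO_PAGE_OFFSET + 0x200000 and DEMO_KERNEL_VIRT translate to the same physical address, so the kernel image is reachable through two virtual addresses.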
>
> > In addition, because modules and BPF must be close to the kernel (inside
> > +-2GB window), the kernel is placed at the end of the vmalloc zone minus
> > 2GB, which leaves room for modules and BPF. The kernel could not be
> > placed at the beginning of the vmalloc zone since other vmalloc
> > allocations from the kernel could get all the +-2GB window around the
> > kernel which would prevent new modules and BPF programs to be loaded.
>
> Well, that's not enough to make sure this doesn't happen -- it's just enough to
> make sure it doesn't happen very quickly. That's the same boat we're already
> in, though, so it's not like it's worse.
>
> > Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
> > Reviewed-by: Zong Li <zong.li@sifive.com>
> > ---
> > arch/riscv/boot/loader.lds.S | 3 +-
> > arch/riscv/include/asm/page.h | 10 +++++-
> > arch/riscv/include/asm/pgtable.h | 38 ++++++++++++++-------
> > arch/riscv/kernel/head.S | 3 +-
> > arch/riscv/kernel/module.c | 4 +--
> > arch/riscv/kernel/vmlinux.lds.S | 3 +-
> > arch/riscv/mm/init.c | 58 +++++++++++++++++++++++++-------
> > arch/riscv/mm/physaddr.c | 2 +-
> > 8 files changed, 88 insertions(+), 33 deletions(-)
> >
> > diff --git a/arch/riscv/boot/loader.lds.S b/arch/riscv/boot/loader.lds.S
> > index 47a5003c2e28..62d94696a19c 100644
> > --- a/arch/riscv/boot/loader.lds.S
> > +++ b/arch/riscv/boot/loader.lds.S
> > @@ -1,13 +1,14 @@
> > /* SPDX-License-Identifier: GPL-2.0 */
> >
> > #include <asm/page.h>
> > +#include <asm/pgtable.h>
> >
> > OUTPUT_ARCH(riscv)
> > ENTRY(_start)
> >
> > SECTIONS
> > {
> > - . = PAGE_OFFSET;
> > + . = KERNEL_LINK_ADDR;
> >
> > .payload : {
> > *(.payload)
> > diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
> > index 2d50f76efe48..48bb09b6a9b7 100644
> > --- a/arch/riscv/include/asm/page.h
> > +++ b/arch/riscv/include/asm/page.h
> > @@ -90,18 +90,26 @@ typedef struct page *pgtable_t;
> >
> > #ifdef CONFIG_MMU
> > extern unsigned long va_pa_offset;
> > +extern unsigned long va_kernel_pa_offset;
> > extern unsigned long pfn_base;
> > #define ARCH_PFN_OFFSET (pfn_base)
> > #else
> > #define va_pa_offset 0
> > +#define va_kernel_pa_offset 0
> > #define ARCH_PFN_OFFSET (PAGE_OFFSET >> PAGE_SHIFT)
> > #endif /* CONFIG_MMU */
> >
> > extern unsigned long max_low_pfn;
> > extern unsigned long min_low_pfn;
> > +extern unsigned long kernel_virt_addr;
> >
> > #define __pa_to_va_nodebug(x) ((void *)((unsigned long) (x) + va_pa_offset))
> > -#define __va_to_pa_nodebug(x) ((unsigned long)(x) - va_pa_offset)
> > +#define linear_mapping_va_to_pa(x) ((unsigned long)(x) - va_pa_offset)
> > +#define kernel_mapping_va_to_pa(x) \
> > + ((unsigned long)(x) - va_kernel_pa_offset)
> > +#define __va_to_pa_nodebug(x) \
> > + (((x) >= PAGE_OFFSET) ? \
> > + linear_mapping_va_to_pa(x) : kernel_mapping_va_to_pa(x))
> >
> > #ifdef CONFIG_DEBUG_VIRTUAL
> > extern phys_addr_t __virt_to_phys(unsigned long x);
> > diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> > index 35b60035b6b0..94ef3b49dfb6 100644
> > --- a/arch/riscv/include/asm/pgtable.h
> > +++ b/arch/riscv/include/asm/pgtable.h
> > @@ -11,23 +11,29 @@
> >
> > #include <asm/pgtable-bits.h>
> >
> > -#ifndef __ASSEMBLY__
> > -
> > -/* Page Upper Directory not used in RISC-V */
> > -#include <asm-generic/pgtable-nopud.h>
> > -#include <asm/page.h>
> > -#include <asm/tlbflush.h>
> > -#include <linux/mm_types.h>
> > -
> > -#ifdef CONFIG_MMU
> > +#ifndef CONFIG_MMU
> > +#define KERNEL_VIRT_ADDR PAGE_OFFSET
> > +#define KERNEL_LINK_ADDR PAGE_OFFSET
> > +#else
> > +/*
> > + * Leave 2GB for modules and BPF that must lie within a 2GB range around
> > + * the kernel.
> > + */
> > +#define KERNEL_VIRT_ADDR (VMALLOC_END - SZ_2G + 1)
> > +#define KERNEL_LINK_ADDR KERNEL_VIRT_ADDR
>
> At a bare minimum this is going to make a mess of the 32-bit port, as
> non-relocatable kernels are now going to get linked at 1GiB which is where user
> code is supposed to live. That's an easy fix, though, as the 32-bit stuff
> doesn't need any module address restrictions.
>
> > #define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1)
> > #define VMALLOC_END (PAGE_OFFSET - 1)
> > #define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE)
> >
> > #define BPF_JIT_REGION_SIZE (SZ_128M)
> > -#define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
> > -#define BPF_JIT_REGION_END (VMALLOC_END)
> > +#define BPF_JIT_REGION_START PFN_ALIGN((unsigned long)&_end)
> > +#define BPF_JIT_REGION_END (BPF_JIT_REGION_START + BPF_JIT_REGION_SIZE)
> > +
> > +#ifdef CONFIG_64BIT
> > +#define VMALLOC_MODULE_START BPF_JIT_REGION_END
> > +#define VMALLOC_MODULE_END (((unsigned long)&_start & PAGE_MASK) + SZ_2G)
> > +#endif
> >
> > /*
> > * Roughly size the vmemmap space to be large enough to fit enough
> > @@ -57,9 +63,16 @@
> > #define FIXADDR_SIZE PGDIR_SIZE
> > #endif
> > #define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE)
> > -
> > #endif
> >
> > +#ifndef __ASSEMBLY__
> > +
> > +/* Page Upper Directory not used in RISC-V */
> > +#include <asm-generic/pgtable-nopud.h>
> > +#include <asm/page.h>
> > +#include <asm/tlbflush.h>
> > +#include <linux/mm_types.h>
> > +
> > #ifdef CONFIG_64BIT
> > #include <asm/pgtable-64.h>
> > #else
> > @@ -483,6 +496,7 @@ static inline void __kernel_map_pages(struct page *page, int numpages, int enabl
> >
> > #define kern_addr_valid(addr) (1) /* FIXME */
> >
> > +extern char _start[];
> > extern void *dtb_early_va;
> > void setup_bootmem(void);
> > void paging_init(void);
> > diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
> > index 98a406474e7d..8f5bb7731327 100644
> > --- a/arch/riscv/kernel/head.S
> > +++ b/arch/riscv/kernel/head.S
> > @@ -49,7 +49,8 @@ ENTRY(_start)
> > #ifdef CONFIG_MMU
> > relocate:
> > /* Relocate return address */
> > - li a1, PAGE_OFFSET
> > + la a1, kernel_virt_addr
> > + REG_L a1, 0(a1)
> > la a2, _start
> > sub a1, a1, a2
> > add ra, ra, a1
> > diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
> > index 8bbe5dbe1341..1a8fbe05accf 100644
> > --- a/arch/riscv/kernel/module.c
> > +++ b/arch/riscv/kernel/module.c
> > @@ -392,12 +392,10 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
> > }
> >
> > #if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
> > -#define VMALLOC_MODULE_START \
> > - max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START)
> > void *module_alloc(unsigned long size)
> > {
> > return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START,
> > - VMALLOC_END, GFP_KERNEL,
> > + VMALLOC_MODULE_END, GFP_KERNEL,
> > PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
> > __builtin_return_address(0));
> > }
> > diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
> > index 0339b6bbe11a..a9abde62909f 100644
> > --- a/arch/riscv/kernel/vmlinux.lds.S
> > +++ b/arch/riscv/kernel/vmlinux.lds.S
> > @@ -4,7 +4,8 @@
> > * Copyright (C) 2017 SiFive
> > */
> >
> > -#define LOAD_OFFSET PAGE_OFFSET
> > +#include <asm/pgtable.h>
> > +#define LOAD_OFFSET KERNEL_LINK_ADDR
> > #include <asm/vmlinux.lds.h>
> > #include <asm/page.h>
> > #include <asm/cache.h>
> > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> > index 736de6c8739f..71da78914645 100644
> > --- a/arch/riscv/mm/init.c
> > +++ b/arch/riscv/mm/init.c
> > @@ -22,6 +22,9 @@
> >
> > #include "../kernel/head.h"
> >
> > +unsigned long kernel_virt_addr = KERNEL_VIRT_ADDR;
> > +EXPORT_SYMBOL(kernel_virt_addr);
> > +
> > unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
> > __page_aligned_bss;
> > EXPORT_SYMBOL(empty_zero_page);
> > @@ -178,8 +181,12 @@ void __init setup_bootmem(void)
> > }
> >
> > #ifdef CONFIG_MMU
> > +/* Offset between linear mapping virtual address and kernel load address */
> > unsigned long va_pa_offset;
> > EXPORT_SYMBOL(va_pa_offset);
> > +/* Offset between kernel mapping virtual address and kernel load address */
> > +unsigned long va_kernel_pa_offset;
> > +EXPORT_SYMBOL(va_kernel_pa_offset);
> > unsigned long pfn_base;
> > EXPORT_SYMBOL(pfn_base);
> >
> > @@ -271,7 +278,7 @@ static phys_addr_t __init alloc_pmd(uintptr_t va)
> > if (mmu_enabled)
> > return memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
> >
> > - pmd_num = (va - PAGE_OFFSET) >> PGDIR_SHIFT;
> > + pmd_num = (va - kernel_virt_addr) >> PGDIR_SHIFT;
> > BUG_ON(pmd_num >= NUM_EARLY_PMDS);
> > return (uintptr_t)&early_pmd[pmd_num * PTRS_PER_PMD];
> > }
> > @@ -372,14 +379,30 @@ static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
> > #error "setup_vm() is called from head.S before relocate so it should not use absolute addressing."
> > #endif
> >
> > +static uintptr_t load_pa, load_sz;
> > +
> > +static void __init create_kernel_page_table(pgd_t *pgdir, uintptr_t map_size)
> > +{
> > + uintptr_t va, end_va;
> > +
> > + end_va = kernel_virt_addr + load_sz;
> > + for (va = kernel_virt_addr; va < end_va; va += map_size)
> > + create_pgd_mapping(pgdir, va,
> > + load_pa + (va - kernel_virt_addr),
> > + map_size, PAGE_KERNEL_EXEC);
> > +}
> > +
> > asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> > {
> > uintptr_t va, end_va;
> > - uintptr_t load_pa = (uintptr_t)(&_start);
> > - uintptr_t load_sz = (uintptr_t)(&_end) - load_pa;
> > uintptr_t map_size = best_map_size(load_pa, MAX_EARLY_MAPPING_SIZE);
> >
> > + load_pa = (uintptr_t)(&_start);
> > + load_sz = (uintptr_t)(&_end) - load_pa;
> > +
> > va_pa_offset = PAGE_OFFSET - load_pa;
> > + va_kernel_pa_offset = kernel_virt_addr - load_pa;
> > +
> > pfn_base = PFN_DOWN(load_pa);
> >
> > /*
> > @@ -402,26 +425,22 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> > create_pmd_mapping(fixmap_pmd, FIXADDR_START,
> > (uintptr_t)fixmap_pte, PMD_SIZE, PAGE_TABLE);
> > /* Setup trampoline PGD and PMD */
> > - create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
> > + create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
> > (uintptr_t)trampoline_pmd, PGDIR_SIZE, PAGE_TABLE);
> > - create_pmd_mapping(trampoline_pmd, PAGE_OFFSET,
> > + create_pmd_mapping(trampoline_pmd, kernel_virt_addr,
> > load_pa, PMD_SIZE, PAGE_KERNEL_EXEC);
> > #else
> > /* Setup trampoline PGD */
> > - create_pgd_mapping(trampoline_pg_dir, PAGE_OFFSET,
> > + create_pgd_mapping(trampoline_pg_dir, kernel_virt_addr,
> > load_pa, PGDIR_SIZE, PAGE_KERNEL_EXEC);
> > #endif
> >
> > /*
> > - * Setup early PGD covering entire kernel which will allows
> > + * Setup early PGD covering entire kernel which will allow
> > * us to reach paging_init(). We map all memory banks later
> > * in setup_vm_final() below.
> > */
> > - end_va = PAGE_OFFSET + load_sz;
> > - for (va = PAGE_OFFSET; va < end_va; va += map_size)
> > - create_pgd_mapping(early_pg_dir, va,
> > - load_pa + (va - PAGE_OFFSET),
> > - map_size, PAGE_KERNEL_EXEC);
> > + create_kernel_page_table(early_pg_dir, map_size);
> >
> > /* Create fixed mapping for early FDT parsing */
> > end_va = __fix_to_virt(FIX_FDT) + FIX_FDT_SIZE;
> > @@ -441,6 +460,7 @@ static void __init setup_vm_final(void)
> > uintptr_t va, map_size;
> > phys_addr_t pa, start, end;
> > struct memblock_region *reg;
> > + static struct vm_struct vm_kernel = { 0 };
> >
> > /* Set mmu_enabled flag */
> > mmu_enabled = true;
> > @@ -467,10 +487,22 @@ static void __init setup_vm_final(void)
> > for (pa = start; pa < end; pa += map_size) {
> > va = (uintptr_t)__va(pa);
> > create_pgd_mapping(swapper_pg_dir, va, pa,
> > - map_size, PAGE_KERNEL_EXEC);
> > + map_size, PAGE_KERNEL);
> > }
> > }
> >
> > + /* Map the kernel */
> > + create_kernel_page_table(swapper_pg_dir, PMD_SIZE);
> > +
> > + /* Reserve the vmalloc area occupied by the kernel */
> > + vm_kernel.addr = (void *)kernel_virt_addr;
> > + vm_kernel.phys_addr = load_pa;
> > + vm_kernel.size = (load_sz + PMD_SIZE - 1) & ~(PMD_SIZE - 1);
> > + vm_kernel.flags = VM_MAP | VM_NO_GUARD;
> > + vm_kernel.caller = __builtin_return_address(0);
> > +
> > + vm_area_add_early(&vm_kernel);
> > +
> > /* Clear fixmap PTE and PMD mappings */
> > clear_fixmap(FIX_PTE);
> > clear_fixmap(FIX_PMD);
> > diff --git a/arch/riscv/mm/physaddr.c b/arch/riscv/mm/physaddr.c
> > index e8e4dcd39fed..35703d5ef5fd 100644
> > --- a/arch/riscv/mm/physaddr.c
> > +++ b/arch/riscv/mm/physaddr.c
> > @@ -23,7 +23,7 @@ EXPORT_SYMBOL(__virt_to_phys);
> >
> > phys_addr_t __phys_addr_symbol(unsigned long x)
> > {
> > - unsigned long kernel_start = (unsigned long)PAGE_OFFSET;
> > + unsigned long kernel_start = (unsigned long)kernel_virt_addr;
> > unsigned long kernel_end = (unsigned long)_end;
> >
> > /*
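The address split introduced by this patch can be sketched in userspace C. This is only an illustration under stated assumptions: the `SKETCH_` constants are hypothetical values (the kernel mapping is placed 2GB below the start of the linear mapping, mirroring `VMALLOC_END - SZ_2G + 1`), and the `sketch_` helpers are stand-ins for `va_pa_offset`, `va_kernel_pa_offset` and `__va_to_pa_nodebug()`, not the kernel code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout values, chosen only for illustration. */
#define SKETCH_PAGE_OFFSET      0xffffffe000000000ULL /* start of the linear mapping */
#define SKETCH_KERNEL_VIRT_ADDR 0xffffffdf80000000ULL /* PAGE_OFFSET - 2GB */
#define SKETCH_LOAD_PA          0x0000000080200000ULL /* physical load address */

/* Offset between a linear-mapping virtual address and its physical address. */
static uint64_t sketch_va_pa_offset(void)
{
	return SKETCH_PAGE_OFFSET - SKETCH_LOAD_PA;
}

/* Offset between a kernel-mapping virtual address and its physical address. */
static uint64_t sketch_va_kernel_pa_offset(void)
{
	return SKETCH_KERNEL_VIRT_ADDR - SKETCH_LOAD_PA;
}

/* Pick the offset by region: linear mapping at or above PAGE_OFFSET,
 * kernel mapping below it. */
static uint64_t sketch_va_to_pa(uint64_t va)
{
	/* Addresses below the kernel mapping are outside this sketch. */
	assert(va >= SKETCH_KERNEL_VIRT_ADDR);

	if (va >= SKETCH_PAGE_OFFSET)
		return va - sketch_va_pa_offset();
	return va - sketch_va_kernel_pa_offset();
}
```

With these assumed values, both `SKETCH_PAGE_OFFSET` and `SKETCH_KERNEL_VIRT_ADDR` translate to the same physical load address, which is the aliasing the two offsets are meant to express.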
^ permalink raw reply
* Re: [PATCH RESEND 2/2] papr/scm: Add bad memory ranges to nvdimm bad ranges
From: Christophe Leroy @ 2020-07-09 8:17 UTC (permalink / raw)
To: Santosh Sivaraj, linuxppc-dev
Cc: Aneesh Kumar K.V, Oliver, Ganesh Goudar, Mahesh Salgaonkar,
Vaibhav Jain
In-Reply-To: <20200709075635.643740-2-santosh@fossix.org>
On 09/07/2020 at 09:56, Santosh Sivaraj wrote:
> Subscribe to the MCE notification and add the physical address which
> generated a memory error to nvdimm bad range.
>
> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
> ---
> arch/powerpc/platforms/pseries/papr_scm.c | 98 ++++++++++++++++++++++-
> 1 file changed, 97 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> index 9c569078a09fd..5ebb1c797795d 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -13,9 +13,11 @@
> #include <linux/platform_device.h>
> #include <linux/delay.h>
> #include <linux/seq_buf.h>
> +#include <linux/nd.h>
>
> #include <asm/plpar_wrappers.h>
> #include <asm/papr_pdsm.h>
> +#include <asm/mce.h>
>
> #define BIND_ANY_ADDR (~0ul)
>
> @@ -80,6 +82,7 @@ struct papr_scm_priv {
> struct resource res;
> struct nd_region *region;
> struct nd_interleave_set nd_set;
> + struct list_head region_list;
>
> /* Protect dimm health data from concurrent read/writes */
> struct mutex health_mutex;
> @@ -91,6 +94,9 @@ struct papr_scm_priv {
> u64 health_bitmap;
> };
>
> +LIST_HEAD(papr_nd_regions);
> +DEFINE_MUTEX(papr_ndr_lock);
> +
> static int drc_pmem_bind(struct papr_scm_priv *p)
> {
> unsigned long ret[PLPAR_HCALL_BUFSIZE];
> @@ -759,6 +765,10 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
> dev_info(dev, "Region registered with target node %d and online node %d",
> target_nid, online_nid);
>
> + mutex_lock(&papr_ndr_lock);
> + list_add_tail(&p->region_list, &papr_nd_regions);
> + mutex_unlock(&papr_ndr_lock);
> +
> return 0;
>
> err: nvdimm_bus_unregister(p->bus);
> @@ -766,6 +776,70 @@ err: nvdimm_bus_unregister(p->bus);
> return -ENXIO;
> }
>
> +static void papr_scm_add_badblock(struct nd_region *region,
> + struct nvdimm_bus *bus, u64 phys_addr)
> +{
> + u64 aligned_addr = ALIGN_DOWN(phys_addr, L1_CACHE_BYTES);
> +
> + if (nvdimm_bus_add_badrange(bus, aligned_addr, L1_CACHE_BYTES)) {
> + pr_err("Bad block registration for 0x%llx failed\n", phys_addr);
> + return;
> + }
> +
> + pr_debug("Add memory range (0x%llx - 0x%llx) as bad range\n",
> + aligned_addr, aligned_addr + L1_CACHE_BYTES);
> +
> + nvdimm_region_notify(region, NVDIMM_REVALIDATE_POISON);
> +}
> +
> +static int handle_mce_ue(struct notifier_block *nb, unsigned long val,
> + void *data)
> +{
> + struct machine_check_event *evt = data;
> + struct papr_scm_priv *p;
> + u64 phys_addr;
> + bool found = false;
> +
> + if (evt->error_type != MCE_ERROR_TYPE_UE)
> + return NOTIFY_DONE;
> +
> + if (list_empty(&papr_nd_regions))
> + return NOTIFY_DONE;
> +
> + /*
> + * The physical address obtained here is PAGE_SIZE aligned, so get the
> + * exact address from the effective address
> + */
> + phys_addr = evt->u.ue_error.physical_address +
> + (evt->u.ue_error.effective_address & ~PAGE_MASK);
Not properly aligned
> +
> + if (!evt->u.ue_error.physical_address_provided ||
> + !is_zone_device_page(pfn_to_page(phys_addr >> PAGE_SHIFT)))
> + return NOTIFY_DONE;
> +
> + /* mce notifier is called from a process context, so mutex is safe */
> + mutex_lock(&papr_ndr_lock);
> + list_for_each_entry(p, &papr_nd_regions, region_list) {
> + struct resource res = p->res;
Is this local struct really worth it? Why not use p->res below directly?
> +
> + if (phys_addr >= res.start && phys_addr <= res.end) {
> + found = true;
> + break;
> + }
> + }
> +
> + if (found)
> + papr_scm_add_badblock(p->region, p->bus, phys_addr);
> +
> + mutex_unlock(&papr_ndr_lock);
> +
> + return found ? NOTIFY_OK : NOTIFY_DONE;
> +}
> +
> +static struct notifier_block mce_ue_nb = {
> + .notifier_call = handle_mce_ue
> +};
> +
> static int papr_scm_probe(struct platform_device *pdev)
> {
> struct device_node *dn = pdev->dev.of_node;
> @@ -866,6 +940,10 @@ static int papr_scm_remove(struct platform_device *pdev)
> {
> struct papr_scm_priv *p = platform_get_drvdata(pdev);
>
> + mutex_lock(&papr_ndr_lock);
> + list_del(&(p->region_list));
> + mutex_unlock(&papr_ndr_lock);
> +
> nvdimm_bus_unregister(p->bus);
> drc_pmem_unbind(p);
> kfree(p->bus_desc.provider_name);
> @@ -888,7 +966,25 @@ static struct platform_driver papr_scm_driver = {
> },
> };
>
> -module_platform_driver(papr_scm_driver);
> +static int __init papr_scm_init(void)
> +{
> + int ret;
> +
> + ret = platform_driver_register(&papr_scm_driver);
> + if (!ret)
> + mce_register_notifier(&mce_ue_nb);
> +
> +return ret;
Not properly aligned.
> +}
> +module_init(papr_scm_init);
> +
> +static void __exit papr_scm_exit(void)
> +{
> + mce_unregister_notifier(&mce_ue_nb);
> + platform_driver_unregister(&papr_scm_driver);
> +}
> +module_exit(papr_scm_exit);
> +
> MODULE_DEVICE_TABLE(of, papr_scm_match);
> MODULE_LICENSE("GPL");
> MODULE_AUTHOR("IBM Corporation");
>
Christophe
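The address arithmetic and the range test discussed in this review can be sketched in userspace C. This is a minimal sketch under stated assumptions: the page and cache-line sizes are illustrative values, `struct sketch_res` is a hypothetical stand-in for `struct resource`, and none of the `sketch_` helpers are kernel functions. The range check reads the resource through a pointer, as suggested above, instead of copying it into a local struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative sizes; the kernel takes these from its configuration. */
#define SKETCH_PAGE_SIZE   65536ULL
#define SKETCH_PAGE_MASK   (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_CACHE_BYTES 128ULL

/* Hypothetical stand-in for struct resource: an inclusive [start, end] range. */
struct sketch_res {
	uint64_t start;
	uint64_t end;
};

/* Page-aligned physical address plus the in-page offset taken from the
 * effective address. */
static uint64_t sketch_exact_phys(uint64_t page_phys, uint64_t effective_addr)
{
	return page_phys + (effective_addr & ~SKETCH_PAGE_MASK);
}

/* Round down to a power-of-two boundary, e.g. the start of a cache line. */
static uint64_t sketch_align_down(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}

/* Range test on the resource itself, with no local copy. */
static bool sketch_res_contains(const struct sketch_res *res, uint64_t addr)
{
	return addr >= res->start && addr <= res->end;
}
```

In the patch, the equivalent of `sketch_res_contains(&p->res, phys_addr)` would run for each list entry while `papr_ndr_lock` is held.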
^ permalink raw reply
* Re: [PATCH v3 0/6] powerpc: queued spinlocks and rwlocks
From: Peter Zijlstra @ 2020-07-09 8:31 UTC (permalink / raw)
To: Waiman Long
Cc: linux-arch, Will Deacon, Boqun Feng, linux-kernel, kvm-ppc,
virtualization, Ingo Molnar, Nicholas Piggin, linuxppc-dev
In-Reply-To: <a9834278-25bf-90e9-10f2-cd10e5407ff6@redhat.com>
On Wed, Jul 08, 2020 at 07:54:34PM -0400, Waiman Long wrote:
> On 7/8/20 4:41 AM, Peter Zijlstra wrote:
> > On Tue, Jul 07, 2020 at 03:57:06PM +1000, Nicholas Piggin wrote:
> > > Yes, powerpc could certainly get more performance out of the slow
> > > paths, and then there are a few parameters to tune.
> > Can you clarify? The slow path is already in use on ARM64 which is weak,
> > > so I doubt there's superfluous serialization present. And Will spent a
> > > fair amount of time on making that thing guarantee forward progress, so
> > there just isn't too much room to play.
> >
> > > We don't have a good alternate patching for function calls yet, but
> > > that would be something to do for native vs pv.
> > Going by your jump_label implementation, support for static_call should
> > > be fairly straightforward too, no?
> >
> > https://lkml.kernel.org/r/20200624153024.794671356@infradead.org
> >
> Speaking of static_call, I am also looking forward to it. Do you have an
> idea when that will be merged?
0day had one crash on the last round, I think Steve sent a fix for that
last night and I'll go look at it.
That said, the last posting got 0 feedback, so either everybody is
really happy with it, or not interested. So let us know in the thread,
with some review feedback.
Once I get through enough of the inbox to actually find the fix and test
it, I'll also update the thread, and maybe threaten to merge it if
everybody stays silent :-)
^ permalink raw reply
* Re: [RFC PATCH v0 2/2] KVM: PPC: Book3S HV: Use H_RPT_INVALIDATE in nested KVM
From: Bharata B Rao @ 2020-07-09 9:08 UTC (permalink / raw)
To: Paul Mackerras; +Cc: aneesh.kumar, linuxppc-dev, npiggin, kvm-ppc
In-Reply-To: <20200709051803.GC2822576@thinks.paulus.ozlabs.org>
On Thu, Jul 09, 2020 at 03:18:03PM +1000, Paul Mackerras wrote:
> On Fri, Jul 03, 2020 at 04:14:20PM +0530, Bharata B Rao wrote:
> > In the nested KVM case, replace H_TLB_INVALIDATE by the new hcall
> > H_RPT_INVALIDATE if available. The availability of this hcall
> > is determined from "hcall-rpt-invalidate" string in ibm,hypertas-functions
> > DT property.
>
> What are we going to use when nested KVM supports HPT guests at L2?
> L1 will need to do partition-scoped tlbies with R=0 via a hypercall,
> but H_RPT_INVALIDATE says in its name that it only handles radix
> page tables (i.e. R=1).
For L2 HPT guests, the old hcall is expected to work after it adds
support for the R=0 case?
The new hcall should be advertised via ibm,hypertas-functions only
for radix guests I suppose.
Regards,
Bharata.
^ permalink raw reply
* Re: [PATCH RESEND 2/2] papr/scm: Add bad memory ranges to nvdimm bad ranges
From: Santosh Sivaraj @ 2020-07-09 9:22 UTC (permalink / raw)
To: Christophe Leroy, linuxppc-dev
Cc: Aneesh Kumar K.V, Oliver, Ganesh Goudar, Mahesh Salgaonkar,
Vaibhav Jain
In-Reply-To: <d8fe2b1b-7779-ffee-e26b-858eb7cd3633@csgroup.eu>
Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> On 09/07/2020 at 09:56, Santosh Sivaraj wrote:
>> Subscribe to the MCE notification and add the physical address which
>> generated a memory error to nvdimm bad range.
>>
>> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
>> Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
>> ---
>> arch/powerpc/platforms/pseries/papr_scm.c | 98 ++++++++++++++++++++++-
>> 1 file changed, 97 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
>> index 9c569078a09fd..5ebb1c797795d 100644
>> --- a/arch/powerpc/platforms/pseries/papr_scm.c
>> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
>> @@ -13,9 +13,11 @@
>> #include <linux/platform_device.h>
>> #include <linux/delay.h>
>> #include <linux/seq_buf.h>
>> +#include <linux/nd.h>
>>
>> #include <asm/plpar_wrappers.h>
>> #include <asm/papr_pdsm.h>
>> +#include <asm/mce.h>
>>
>> #define BIND_ANY_ADDR (~0ul)
>>
>> @@ -80,6 +82,7 @@ struct papr_scm_priv {
>> struct resource res;
>> struct nd_region *region;
>> struct nd_interleave_set nd_set;
>> + struct list_head region_list;
>>
>> /* Protect dimm health data from concurrent read/writes */
>> struct mutex health_mutex;
>> @@ -91,6 +94,9 @@ struct papr_scm_priv {
>> u64 health_bitmap;
>> };
>>
>> +LIST_HEAD(papr_nd_regions);
>> +DEFINE_MUTEX(papr_ndr_lock);
>> +
>> static int drc_pmem_bind(struct papr_scm_priv *p)
>> {
>> unsigned long ret[PLPAR_HCALL_BUFSIZE];
>> @@ -759,6 +765,10 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
>> dev_info(dev, "Region registered with target node %d and online node %d",
>> target_nid, online_nid);
>>
>> + mutex_lock(&papr_ndr_lock);
>> + list_add_tail(&p->region_list, &papr_nd_regions);
>> + mutex_unlock(&papr_ndr_lock);
>> +
>> return 0;
>>
>> err: nvdimm_bus_unregister(p->bus);
>> @@ -766,6 +776,70 @@ err: nvdimm_bus_unregister(p->bus);
>> return -ENXIO;
>> }
>>
>> +static void papr_scm_add_badblock(struct nd_region *region,
>> + struct nvdimm_bus *bus, u64 phys_addr)
>> +{
>> + u64 aligned_addr = ALIGN_DOWN(phys_addr, L1_CACHE_BYTES);
>> +
>> + if (nvdimm_bus_add_badrange(bus, aligned_addr, L1_CACHE_BYTES)) {
>> + pr_err("Bad block registration for 0x%llx failed\n", phys_addr);
>> + return;
>> + }
>> +
>> + pr_debug("Add memory range (0x%llx - 0x%llx) as bad range\n",
>> + aligned_addr, aligned_addr + L1_CACHE_BYTES);
>> +
>> + nvdimm_region_notify(region, NVDIMM_REVALIDATE_POISON);
>> +}
>> +
>> +static int handle_mce_ue(struct notifier_block *nb, unsigned long val,
>> + void *data)
>> +{
>> + struct machine_check_event *evt = data;
>> + struct papr_scm_priv *p;
>> + u64 phys_addr;
>> + bool found = false;
>> +
>> + if (evt->error_type != MCE_ERROR_TYPE_UE)
>> + return NOTIFY_DONE;
>> +
>> + if (list_empty(&papr_nd_regions))
>> + return NOTIFY_DONE;
>> +
>> + /*
>> + * The physical address obtained here is PAGE_SIZE aligned, so get the
>> + * exact address from the effective address
>> + */
>> + phys_addr = evt->u.ue_error.physical_address +
>> + (evt->u.ue_error.effective_address & ~PAGE_MASK);
>
> Not properly aligned
Will fix it.
>
>> +
>> + if (!evt->u.ue_error.physical_address_provided ||
>> + !is_zone_device_page(pfn_to_page(phys_addr >> PAGE_SHIFT)))
>> + return NOTIFY_DONE;
>> +
>> + /* mce notifier is called from a process context, so mutex is safe */
>> + mutex_lock(&papr_ndr_lock);
>> + list_for_each_entry(p, &papr_nd_regions, region_list) {
>> + struct resource res = p->res;
>
> Is this local struct really worth it? Why not use p->res below directly?
>
Right, not really needed. I can fix that in v2.
>> +
>> + if (phys_addr >= res.start && phys_addr <= res.end) {
>> + found = true;
>> + break;
>> + }
>> + }
>> +
>> + if (found)
>> + papr_scm_add_badblock(p->region, p->bus, phys_addr);
>> +
>> + mutex_unlock(&papr_ndr_lock);
>> +
>> + return found ? NOTIFY_OK : NOTIFY_DONE;
>> +}
>> +
>> +static struct notifier_block mce_ue_nb = {
>> + .notifier_call = handle_mce_ue
>> +};
>> +
>> static int papr_scm_probe(struct platform_device *pdev)
>> {
>> struct device_node *dn = pdev->dev.of_node;
>> @@ -866,6 +940,10 @@ static int papr_scm_remove(struct platform_device *pdev)
>> {
>> struct papr_scm_priv *p = platform_get_drvdata(pdev);
>>
>> + mutex_lock(&papr_ndr_lock);
>> + list_del(&(p->region_list));
>> + mutex_unlock(&papr_ndr_lock);
>> +
>> nvdimm_bus_unregister(p->bus);
>> drc_pmem_unbind(p);
>> kfree(p->bus_desc.provider_name);
>> @@ -888,7 +966,25 @@ static struct platform_driver papr_scm_driver = {
>> },
>> };
>>
>> -module_platform_driver(papr_scm_driver);
>> +static int __init papr_scm_init(void)
>> +{
>> + int ret;
>> +
>> + ret = platform_driver_register(&papr_scm_driver);
>> + if (!ret)
>> + mce_register_notifier(&mce_ue_nb);
>> +
>> +return ret;
>
> Not properly aligned.
Will fix it.
Thanks for the review!
Thanks,
Santosh
>
>> +}
>> +module_init(papr_scm_init);
>> +
>> +static void __exit papr_scm_exit(void)
>> +{
>> + mce_unregister_notifier(&mce_ue_nb);
>> + platform_driver_unregister(&papr_scm_driver);
>> +}
>> +module_exit(papr_scm_exit);
>> +
>> MODULE_DEVICE_TABLE(of, papr_scm_match);
>> MODULE_LICENSE("GPL");
>> MODULE_AUTHOR("IBM Corporation");
>>
>
> Christophe
^ permalink raw reply
* Re: [PATCH RESEND 1/2] powerpc/mce: Add MCE notification chain
From: Santosh Sivaraj @ 2020-07-09 9:24 UTC (permalink / raw)
To: Christophe Leroy, linuxppc-dev
Cc: Aneesh Kumar K.V, Oliver, Ganesh Goudar, Mahesh Salgaonkar,
Vaibhav Jain
In-Reply-To: <f722e532-070e-1961-3bae-6f385caa5ead@csgroup.eu>
Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> On 09/07/2020 at 09:56, Santosh Sivaraj wrote:
>> Introduce a notification chain which lets subscribers know about uncorrected
>> memory errors (UE). This would help prospective users in the pmem or nvdimm
>> subsystem to track bad blocks for better handling of persistent memory allocations.
>>
>> Signed-off-by: Santosh S <santosh@fossix.org>
>> Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
>> ---
>> arch/powerpc/include/asm/mce.h | 2 ++
>> arch/powerpc/kernel/mce.c | 15 +++++++++++++++
>> 2 files changed, 17 insertions(+)
>>
>> Send the two patches together, so the dependencies are clear. The earlier patch reviews are
>> here: https://lore.kernel.org/linuxppc-dev/20200330071219.12284-1-ganeshgr@linux.ibm.com/
>>
>> Rebase the patches on top of 5.8-rc4
>>
>> diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
>> index 376a395daf329..a57b0772702a9 100644
>> --- a/arch/powerpc/include/asm/mce.h
>> +++ b/arch/powerpc/include/asm/mce.h
>> @@ -220,6 +220,8 @@ extern void machine_check_print_event_info(struct machine_check_event *evt,
>> unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr);
>> extern void mce_common_process_ue(struct pt_regs *regs,
>> struct mce_error_info *mce_err);
>> +extern int mce_register_notifier(struct notifier_block *nb);
>> +extern int mce_unregister_notifier(struct notifier_block *nb);
>
> Using the 'extern' keyword on function declaration is pointless and
> should be avoided in new patches. (checkpatch.pl --strict usually
> complains about it).
I will remove that in v2, which I will be sending to address your comments on
the other patch.
Thanks,
Santosh
>
>> #ifdef CONFIG_PPC_BOOK3S_64
>> void flush_and_reload_slb(void);
>> #endif /* CONFIG_PPC_BOOK3S_64 */
>> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
>> index fd90c0eda2290..b7b3ed4e61937 100644
>> --- a/arch/powerpc/kernel/mce.c
>> +++ b/arch/powerpc/kernel/mce.c
>> @@ -49,6 +49,20 @@ static struct irq_work mce_ue_event_irq_work = {
>>
>> DECLARE_WORK(mce_ue_event_work, machine_process_ue_event);
>>
>> +static BLOCKING_NOTIFIER_HEAD(mce_notifier_list);
>> +
>> +int mce_register_notifier(struct notifier_block *nb)
>> +{
>> + return blocking_notifier_chain_register(&mce_notifier_list, nb);
>> +}
>> +EXPORT_SYMBOL_GPL(mce_register_notifier);
>> +
>> +int mce_unregister_notifier(struct notifier_block *nb)
>> +{
>> + return blocking_notifier_chain_unregister(&mce_notifier_list, nb);
>> +}
>> +EXPORT_SYMBOL_GPL(mce_unregister_notifier);
>> +
>> static void mce_set_error_info(struct machine_check_event *mce,
>> struct mce_error_info *mce_err)
>> {
>> @@ -278,6 +292,7 @@ static void machine_process_ue_event(struct work_struct *work)
>> while (__this_cpu_read(mce_ue_count) > 0) {
>> index = __this_cpu_read(mce_ue_count) - 1;
>> evt = this_cpu_ptr(&mce_ue_event_queue[index]);
>> + blocking_notifier_call_chain(&mce_notifier_list, 0, evt);
>> #ifdef CONFIG_MEMORY_FAILURE
>> /*
>> * This should probably queued elsewhere, but
>>
>
> Christophe
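The register/unregister/call pattern added by this patch can be sketched in userspace C. This is a simplified, single-threaded illustration only: every `sketch_` name and `SKETCH_` constant is hypothetical, there is no locking, and the chain does not implement the kernel's stop-mask or priority handling. It is not the kernel's `blocking_notifier_*` API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes, loosely modelled on NOTIFY_DONE / NOTIFY_OK. */
#define SKETCH_NOTIFY_DONE 0
#define SKETCH_NOTIFY_OK   1

struct sketch_notifier {
	int (*call)(struct sketch_notifier *nb, unsigned long val, void *data);
	struct sketch_notifier *next;
};

static struct sketch_notifier *sketch_chain;

/* Append a subscriber to the tail of the chain. */
static int sketch_register(struct sketch_notifier *nb)
{
	struct sketch_notifier **pos = &sketch_chain;

	assert(nb != NULL && nb->call != NULL);
	while (*pos)
		pos = &(*pos)->next;
	nb->next = NULL;
	*pos = nb;
	return 0;
}

/* Unlink a subscriber; returns -1 if it is not on the chain. */
static int sketch_unregister(struct sketch_notifier *nb)
{
	struct sketch_notifier **pos;

	for (pos = &sketch_chain; *pos; pos = &(*pos)->next) {
		if (*pos == nb) {
			*pos = nb->next;
			return 0;
		}
	}
	return -1;
}

/* Invoke every subscriber in order; return the last callback's result,
 * or SKETCH_NOTIFY_DONE when the chain is empty. */
static int sketch_call_chain(unsigned long val, void *data)
{
	struct sketch_notifier *nb;
	int ret = SKETCH_NOTIFY_DONE;

	for (nb = sketch_chain; nb; nb = nb->next)
		ret = nb->call(nb, val, data);
	return ret;
}

/* Example subscriber: counts events through the opaque data pointer. */
static int sketch_count_events(struct sketch_notifier *nb, unsigned long val,
			       void *data)
{
	(void)nb;
	(void)val;
	(*(int *)data)++;
	return SKETCH_NOTIFY_OK;
}
```

A subscriber such as the papr_scm handler would correspond to `sketch_count_events` here: it is registered once at init time, called with the event as `data` for every queued UE, and unregistered at exit.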
^ permalink raw reply
* Re: [PATCH v3 1/6] powerpc/powernv: must include hvcall.h to get PAPR defines
From: Michael Ellerman @ 2020-07-09 10:05 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
Cc: linux-arch, Peter Zijlstra, Boqun Feng, linux-kernel,
Nicholas Piggin, virtualization, Ingo Molnar, kvm-ppc,
Waiman Long, Will Deacon
In-Reply-To: <20200706043540.1563616-2-npiggin@gmail.com>
Nicholas Piggin <npiggin@gmail.com> writes:
> An include goes away in future patches which breaks compilation
> without this.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/platforms/powernv/pci-ioda-tce.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda-tce.c b/arch/powerpc/platforms/powernv/pci-ioda-tce.c
> index f923359d8afc..8eba6ece7808 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda-tce.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda-tce.c
> @@ -15,6 +15,7 @@
>
> #include <asm/iommu.h>
> #include <asm/tce.h>
> +#include <asm/hvcall.h> /* share error returns with PAPR */
> #include "pci.h"
>
> unsigned long pnv_ioda_parse_tce_sizes(struct pnv_phb *phb)
> --
> 2.23.0
This isn't needed anymore AFAICS, since:
5f202c1a1d42 ("powerpc/powernv/ioda: Return correct error if TCE level allocation failed")
cheers
^ permalink raw reply
* Re: [RFC PATCH v0 2/2] KVM: PPC: Book3S HV: Use H_RPT_INVALIDATE in nested KVM
From: Paul Mackerras @ 2020-07-09 10:07 UTC (permalink / raw)
To: Bharata B Rao; +Cc: aneesh.kumar, linuxppc-dev, npiggin, kvm-ppc
In-Reply-To: <20200709090851.GD7902@in.ibm.com>
On Thu, Jul 09, 2020 at 02:38:51PM +0530, Bharata B Rao wrote:
> On Thu, Jul 09, 2020 at 03:18:03PM +1000, Paul Mackerras wrote:
> > On Fri, Jul 03, 2020 at 04:14:20PM +0530, Bharata B Rao wrote:
> > > In the nested KVM case, replace H_TLB_INVALIDATE by the new hcall
> > > H_RPT_INVALIDATE if available. The availability of this hcall
> > > is determined from "hcall-rpt-invalidate" string in ibm,hypertas-functions
> > > DT property.
> >
> > What are we going to use when nested KVM supports HPT guests at L2?
> > L1 will need to do partition-scoped tlbies with R=0 via a hypercall,
> > but H_RPT_INVALIDATE says in its name that it only handles radix
> > page tables (i.e. R=1).
>
> For L2 HPT guests, the old hcall is expected to work after it adds
> support for the R=0 case?
That was the plan.
> The new hcall should be advertised via ibm,hypertas-functions only
> for radix guests I suppose.
Well, the L1 hypervisor is a radix guest of L0, so it would have
H_RPT_INVALIDATE available to it?
I guess the question is whether H_RPT_INVALIDATE is supposed to do
everything, that is, radix process-scoped invalidations, radix
partition-scoped invalidations, and HPT partition-scoped
invalidations. If that is the plan then we should call it something
different.
This patchset seems to imply that H_RPT_INVALIDATE is at least going
to be used for radix partition-scoped invalidations as well as radix
process-scoped invalidations. If you are thinking that in future when
we need HPT partition-scoped invalidations for a radix L1 hypervisor
running a HPT L2 guest, we are going to define a new hypercall for
that, I suppose that is OK, though it doesn't really seem necessary.
Paul.
^ permalink raw reply
* Re: [PATCH v3 2/6] powerpc/pseries: move some PAPR paravirt functions to their own file
From: Michael Ellerman @ 2020-07-09 10:11 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
Cc: linux-arch, Peter Zijlstra, Boqun Feng, linux-kernel,
Nicholas Piggin, virtualization, Ingo Molnar, kvm-ppc,
Waiman Long, Will Deacon
In-Reply-To: <20200706043540.1563616-3-npiggin@gmail.com>
Nicholas Piggin <npiggin@gmail.com> writes:
>
Little bit of changelog would be nice :D
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/include/asm/paravirt.h | 61 +++++++++++++++++++++++++++++
> arch/powerpc/include/asm/spinlock.h | 24 +-----------
> arch/powerpc/lib/locks.c | 12 +++---
> 3 files changed, 68 insertions(+), 29 deletions(-)
> create mode 100644 arch/powerpc/include/asm/paravirt.h
>
> diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
> new file mode 100644
> index 000000000000..7a8546660a63
> --- /dev/null
> +++ b/arch/powerpc/include/asm/paravirt.h
> @@ -0,0 +1,61 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef __ASM_PARAVIRT_H
> +#define __ASM_PARAVIRT_H
Should be _ASM_POWERPC_PARAVIRT_H
> +#ifdef __KERNEL__
We shouldn't need __KERNEL__ in here, it's not a uapi header.
cheers
^ permalink raw reply
* Re: [PATCH v3 3/6] powerpc: move spinlock implementation to simple_spinlock
From: Michael Ellerman @ 2020-07-09 10:15 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
Cc: linux-arch, Peter Zijlstra, Boqun Feng, linux-kernel,
Nicholas Piggin, virtualization, Ingo Molnar, kvm-ppc,
Waiman Long, Will Deacon
In-Reply-To: <20200706043540.1563616-4-npiggin@gmail.com>
Nicholas Piggin <npiggin@gmail.com> writes:
> To prepare for queued spinlocks. This is a simple rename except to update
> preprocessor guard name and a file reference.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/include/asm/simple_spinlock.h | 292 ++++++++++++++++++
> .../include/asm/simple_spinlock_types.h | 21 ++
> arch/powerpc/include/asm/spinlock.h | 285 +----------------
> arch/powerpc/include/asm/spinlock_types.h | 12 +-
> 4 files changed, 315 insertions(+), 295 deletions(-)
> create mode 100644 arch/powerpc/include/asm/simple_spinlock.h
> create mode 100644 arch/powerpc/include/asm/simple_spinlock_types.h
>
> diff --git a/arch/powerpc/include/asm/simple_spinlock.h b/arch/powerpc/include/asm/simple_spinlock.h
> new file mode 100644
> index 000000000000..e048c041c4a9
> --- /dev/null
> +++ b/arch/powerpc/include/asm/simple_spinlock.h
> @@ -0,0 +1,292 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef __ASM_SIMPLE_SPINLOCK_H
> +#define __ASM_SIMPLE_SPINLOCK_H
_ASM_POWERPC_SIMPLE_SPINLOCK_H
> +#ifdef __KERNEL__
Shouldn't be necessary.
> +/*
> + * Simple spin lock operations.
> + *
> + * Copyright (C) 2001-2004 Paul Mackerras <paulus@au.ibm.com>, IBM
> + * Copyright (C) 2001 Anton Blanchard <anton@au.ibm.com>, IBM
> + * Copyright (C) 2002 Dave Engebretsen <engebret@us.ibm.com>, IBM
> + * Rework to support virtual processors
> + *
> + * Type of int is used as a full 64b word is not necessary.
> + *
> + * (the type definitions are in asm/simple_spinlock_types.h)
> + */
> +#include <linux/irqflags.h>
> +#include <asm/paravirt.h>
> +#ifdef CONFIG_PPC64
> +#include <asm/paca.h>
> +#endif
I don't think paca.h needs a CONFIG_PPC64 guard, it contains one. I know
you're just moving the code, but still nice to cleanup slightly along
the way.
cheers
^ permalink raw reply
* Re: [PATCH v3 4/6] powerpc/64s: implement queued spinlocks and rwlocks
From: Michael Ellerman @ 2020-07-09 10:20 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
Cc: linux-arch, Peter Zijlstra, Boqun Feng, linux-kernel,
Nicholas Piggin, virtualization, Ingo Molnar, kvm-ppc,
Waiman Long, Will Deacon
In-Reply-To: <20200706043540.1563616-5-npiggin@gmail.com>
Nicholas Piggin <npiggin@gmail.com> writes:
> These have shown significantly improved performance and fairness when
> spinlock contention is moderate to high on very large systems.
>
> [ Numbers hopefully forthcoming after more testing, but initial
> results look good ]
Would be good to have something here, even if it's preliminary.
> Thanks to the fast path, single threaded performance is not noticeably
> hurt.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/Kconfig | 13 ++++++++++++
> arch/powerpc/include/asm/Kbuild | 2 ++
> arch/powerpc/include/asm/qspinlock.h | 25 +++++++++++++++++++++++
> arch/powerpc/include/asm/spinlock.h | 5 +++++
> arch/powerpc/include/asm/spinlock_types.h | 5 +++++
> arch/powerpc/lib/Makefile | 3 +++
> include/asm-generic/qspinlock.h | 2 ++
Whose ack do we need for that part?
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 24ac85c868db..17663ea57697 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -492,6 +494,17 @@ config HOTPLUG_CPU
>
> Say N if you are unsure.
>
> +config PPC_QUEUED_SPINLOCKS
> + bool "Queued spinlocks"
> + depends on SMP
> + default "y" if PPC_BOOK3S_64
Not sure about default y? At least until we've got a better idea of the
perf impact on a range of small/big new/old systems.
> + help
> + Say Y here to use queued spinlocks which are more complex
> + but give better scalability and fairness on large SMP and NUMA
> + systems.
> +
> + If unsure, say "Y" if you have lots of cores, otherwise "N".
Would be nice if we could give a range for "lots".
> diff --git a/arch/powerpc/include/asm/Kbuild b/arch/powerpc/include/asm/Kbuild
> index dadbcf3a0b1e..1dd8b6adff5e 100644
> --- a/arch/powerpc/include/asm/Kbuild
> +++ b/arch/powerpc/include/asm/Kbuild
> @@ -6,5 +6,7 @@ generated-y += syscall_table_spu.h
> generic-y += export.h
> generic-y += local64.h
> generic-y += mcs_spinlock.h
> +generic-y += qrwlock.h
> +generic-y += qspinlock.h
The 2nd line spits a warning about a redundant entry. I think you want
to just drop it.
cheers
^ permalink raw reply
* Re: [PATCH v2 2/3] powerpc/64s: remove PROT_SAO support
From: Nicholas Piggin @ 2020-07-09 10:20 UTC (permalink / raw)
To: David Gibson, Paul Mackerras; +Cc: linux-api, linuxppc-dev, kvm-ppc, linux-mm
In-Reply-To: <20200709043406.GB2822576@thinks.paulus.ozlabs.org>
Excerpts from Paul Mackerras's message of July 9, 2020 2:34 pm:
> On Fri, Jul 03, 2020 at 11:19:57AM +1000, Nicholas Piggin wrote:
>> ISA v3.1 does not support the SAO storage control attribute required to
>> implement PROT_SAO. PROT_SAO was used by specialised system software
>> (Lx86) that has been discontinued for about 7 years, and is not thought
>> to be used elsewhere, so removal should not cause problems.
>>
>> We would rather remove it than keep support for older processors, because
>> live migrating guest partitions to newer processors may not be possible
>> if SAO is in use (or worse, allowed with silent races).
>
> This is actually a real problem for KVM, because now we have the
> capabilities of the host affecting the characteristics of the guest
> virtual machine in a manner which userspace (e.g. QEMU) is unable to
> control.
>
> It would probably be better to disallow SAO on all machines than have
> it available on some hosts and not others. (Yes I know there is a
> check on CPU_FTR_ARCH_206 in there, but that has been a no-op since we
> removed the PPC970 KVM support.)
This change doesn't change the SAO difference on the host processors
though, just tries to slightly improve it from silently broken to
maybe complaining a bit.
I didn't want to stop some very old image that uses this and is running
okay on an existing host from working, but maybe the existence of such
a thing would contradict my reasoning. But then if we don't care about
it why care about this KVM behaviour difference at all?
> Solving this properly will probably require creating a new KVM host
> capability and associated machine parameter in QEMU, along with a new
> machine type.
Rather than answer any of these questions, I might take the KVM change
out and that can be dealt with separately from guest SAO removal.
Thanks,
Nick
>
> [snip]
>
>> diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h b/arch/powerpc/include/asm/kvm_book3s_64.h
>> index 9bb9bb370b53..fac39ff659d4 100644
>> --- a/arch/powerpc/include/asm/kvm_book3s_64.h
>> +++ b/arch/powerpc/include/asm/kvm_book3s_64.h
>> @@ -398,9 +398,10 @@ static inline bool hpte_cache_flags_ok(unsigned long hptel, bool is_ci)
>> {
>> unsigned int wimg = hptel & HPTE_R_WIMG;
>>
>> - /* Handle SAO */
>> + /* Handle SAO for POWER7,8,9 */
>> if (wimg == (HPTE_R_W | HPTE_R_I | HPTE_R_M) &&
>> - cpu_has_feature(CPU_FTR_ARCH_206))
>> + cpu_has_feature(CPU_FTR_ARCH_206) &&
>> + !cpu_has_feature(CPU_FTR_ARCH_31))
>> wimg = HPTE_R_M;
>
> Paul.
>
^ permalink raw reply
* Re: [PATCH] powerpc: select ARCH_HAS_MEMBARRIER_SYNC_CORE
From: Nicholas Piggin @ 2020-07-09 10:24 UTC (permalink / raw)
To: Mathieu Desnoyers; +Cc: linux-arch, linuxppc-dev
In-Reply-To: <407005394.1910.1594217551840.JavaMail.zimbra@efficios.com>
Excerpts from Mathieu Desnoyers's message of July 9, 2020 12:12 am:
> ----- On Jul 8, 2020, at 1:17 AM, Nicholas Piggin npiggin@gmail.com wrote:
>
>> Excerpts from Mathieu Desnoyers's message of July 7, 2020 9:25 pm:
>>> ----- On Jul 7, 2020, at 1:50 AM, Nicholas Piggin npiggin@gmail.com wrote:
>>>
> [...]
>>>> I should actually change the comment for 64-bit because soft masked
>>>> interrupt replay is an interesting case. I thought it was okay (because
>>>> the IPI would cause a hard interrupt which does do the rfi) but that
>>>> should at least be written.
>>>
>>> Yes.
>>>
>>>> The context synchronisation happens before
>>>> the Linux IPI function is called, but for the purpose of membarrier I
>>>> think that is okay (the membarrier just needs to have caused a memory
>>>> barrier + context synchronisation by the time it has done).
>>>
>>> Can you point me to the code implementing this logic ?
>>
>> It's mostly in arch/powerpc/kernel/exceptions-64s.S and
>> arch/powerpc/kernel/irq.c, but a lot of asm so easier to explain.
>>
>> When any Linux code does local_irq_disable(), we set interrupts as
>> software-masked in a per-cpu flag. When interrupts (including IPIs) come
>> in, the first thing we do is check that flag and if we are masked, then
>> record that the interrupt needs to be "replayed" in another per-cpu
>> flag. The interrupt handler then exits back using RFI (which is context
>> synchronising the CPU). Later, when the kernel code does
>> local_irq_enable(), it checks the replay flag to see if anything needs
>> to be done. At that point we basically just call the interrupt handler
>> code like a normal function, and when that returns there is no context
>> synchronising instruction.
>
> AFAIU this can only happen for interrupts nesting over irqoff sections,
> therefore over kernel code, never userspace, right ?
Right.
>> So membarrier IPI will always cause target CPUs to perform a context
>> synchronising instruction, but sometimes it happens before the IPI
>> handler function runs.
>
> If my understanding is correct, the replayed interrupt handler logic
> only nests over kernel code, which will eventually need to issue a
> context synchronizing instruction before returning to user-space.
Yes.
> All we care about is that starting from the membarrier, each core
> either:
>
> - interrupt user-space to issue the context synchronizing instruction if
> they were running userspace, or
> - _eventually_ issue a context synchronizing instruction before returning
> to user-space if they were running kernel code.
>
> So your earlier statement "the membarrier just needs to have caused a memory
> barrier + context synchronisation by the time it has done" is not strictly
> correct: the context synchronizing instruction does not strictly need to
> happen on each core before membarrier returns. A similar line of thought
> can be followed for memory barriers.
Ah okay, that makes it simpler; then no such special comment is required
for the powerpc specific interrupt handling.
Thanks,
Nick
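
For readers following the thread, the soft-mask/replay behaviour described
above can be modelled roughly as below. This is only a user-space sketch
under stated assumptions: struct cpu_state, the sim_* helpers and the
counters are hypothetical names made up for illustration, not the code in
the powerpc interrupt path. It shows the one property discussed here: the
context synchronising return happens when the hard interrupt is taken,
while the handler body may run later as a plain function call.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state; the names here are illustrative only. */
struct cpu_state {
	bool soft_masked;	/* set by the local_irq_disable() analogue */
	bool replay_pending;	/* an interrupt arrived while soft-masked */
	int context_syncs;	/* count of RFI-style interrupt returns */
	int handler_runs;	/* count of IPI handler body executions */
};

/* Stand-in for the Linux IPI function (e.g. the membarrier IPI). */
static void ipi_handler(struct cpu_state *cpu)
{
	cpu->handler_runs++;
}

static void sim_local_irq_disable(struct cpu_state *cpu)
{
	cpu->soft_masked = true;
}

/* A hard interrupt is taken: check the soft-mask flag first. */
static void sim_hard_interrupt(struct cpu_state *cpu)
{
	if (cpu->soft_masked)
		cpu->replay_pending = true;	/* defer the handler */
	else
		ipi_handler(cpu);

	/* Either way the entry code returns with a context synchronising RFI. */
	cpu->context_syncs++;
}

/* Re-enable: replay anything pending as an ordinary function call. */
static void sim_local_irq_enable(struct cpu_state *cpu)
{
	cpu->soft_masked = false;
	if (cpu->replay_pending) {
		cpu->replay_pending = false;
		ipi_handler(cpu);	/* no RFI after this call returns */
	}
}
```

In this model, an interrupt that arrives while soft-masked bumps
context_syncs immediately but handler_runs only at the later enable, which
matches "sometimes it happens before the IPI handler function runs".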
^ permalink raw reply
* Re: [PATCH v3 4/6] powerpc/64s: implement queued spinlocks and rwlocks
From: Peter Zijlstra @ 2020-07-09 10:33 UTC (permalink / raw)
To: Michael Ellerman
Cc: linux-arch, linuxppc-dev, Boqun Feng, linux-kernel,
Nicholas Piggin, virtualization, Ingo Molnar, kvm-ppc,
Waiman Long, Will Deacon
In-Reply-To: <877dvdvvkm.fsf@mpe.ellerman.id.au>
On Thu, Jul 09, 2020 at 08:20:25PM +1000, Michael Ellerman wrote:
> Nicholas Piggin <npiggin@gmail.com> writes:
> > These have shown significantly improved performance and fairness when
> > spinlock contention is moderate to high on very large systems.
> >
> > [ Numbers hopefully forthcoming after more testing, but initial
> > results look good ]
>
> Would be good to have something here, even if it's preliminary.
>
> > Thanks to the fast path, single threaded performance is not noticeably
> > hurt.
> >
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---
> > arch/powerpc/Kconfig | 13 ++++++++++++
> > arch/powerpc/include/asm/Kbuild | 2 ++
> > arch/powerpc/include/asm/qspinlock.h | 25 +++++++++++++++++++++++
> > arch/powerpc/include/asm/spinlock.h | 5 +++++
> > arch/powerpc/include/asm/spinlock_types.h | 5 +++++
> > arch/powerpc/lib/Makefile | 3 +++
>
> > include/asm-generic/qspinlock.h | 2 ++
>
> Whose ack do we need for that part?
Mine I suppose would do, as discussed earlier, it probably isn't
required anymore, but I understand the paranoia of not wanting to change
too many things at once :-)
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
^ permalink raw reply
* Re: [PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR
From: Michael Ellerman @ 2020-07-09 10:53 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
Cc: linux-arch, Peter Zijlstra, Boqun Feng, linux-kernel,
Nicholas Piggin, virtualization, Ingo Molnar, kvm-ppc,
Waiman Long, Will Deacon
In-Reply-To: <20200706043540.1563616-6-npiggin@gmail.com>
Nicholas Piggin <npiggin@gmail.com> writes:
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/include/asm/paravirt.h | 28 ++++++++
> arch/powerpc/include/asm/qspinlock.h | 66 +++++++++++++++++++
> arch/powerpc/include/asm/qspinlock_paravirt.h | 7 ++
> arch/powerpc/platforms/pseries/Kconfig | 5 ++
> arch/powerpc/platforms/pseries/setup.c | 6 +-
> include/asm-generic/qspinlock.h | 2 +
Another ack?
> diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
> index 7a8546660a63..f2d51f929cf5 100644
> --- a/arch/powerpc/include/asm/paravirt.h
> +++ b/arch/powerpc/include/asm/paravirt.h
> @@ -45,6 +55,19 @@ static inline void yield_to_preempted(int cpu, u32 yield_count)
> {
> ___bad_yield_to_preempted(); /* This would be a bug */
> }
> +
> +extern void ___bad_yield_to_any(void);
> +static inline void yield_to_any(void)
> +{
> + ___bad_yield_to_any(); /* This would be a bug */
> +}
Why do we do that rather than just not defining yield_to_any() at all
and letting the build fail on that?
Is there a condition somewhere that we know will be false at compile time,
so the call is dropped before linking?
> diff --git a/arch/powerpc/include/asm/qspinlock_paravirt.h b/arch/powerpc/include/asm/qspinlock_paravirt.h
> new file mode 100644
> index 000000000000..750d1b5e0202
> --- /dev/null
> +++ b/arch/powerpc/include/asm/qspinlock_paravirt.h
> @@ -0,0 +1,7 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef __ASM_QSPINLOCK_PARAVIRT_H
> +#define __ASM_QSPINLOCK_PARAVIRT_H
_ASM_POWERPC_QSPINLOCK_PARAVIRT_H please.
> +
> +EXPORT_SYMBOL(__pv_queued_spin_unlock);
Why's that in a header? Should that (eventually) go with the generic implementation?
> diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
> index 24c18362e5ea..756e727b383f 100644
> --- a/arch/powerpc/platforms/pseries/Kconfig
> +++ b/arch/powerpc/platforms/pseries/Kconfig
> @@ -25,9 +25,14 @@ config PPC_PSERIES
> select SWIOTLB
> default y
>
> +config PARAVIRT_SPINLOCKS
> + bool
> + default n
default n is the default.
> diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
> index 2db8469e475f..747a203d9453 100644
> --- a/arch/powerpc/platforms/pseries/setup.c
> +++ b/arch/powerpc/platforms/pseries/setup.c
> @@ -771,8 +771,12 @@ static void __init pSeries_setup_arch(void)
> if (firmware_has_feature(FW_FEATURE_LPAR)) {
> vpa_init(boot_cpuid);
>
> - if (lppaca_shared_proc(get_lppaca()))
> + if (lppaca_shared_proc(get_lppaca())) {
> static_branch_enable(&shared_processor);
> +#ifdef CONFIG_PARAVIRT_SPINLOCKS
> + pv_spinlocks_init();
> +#endif
> + }
We could avoid the ifdef with this I think?
diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h
index 434615f1d761..6ec72282888d 100644
--- a/arch/powerpc/include/asm/spinlock.h
+++ b/arch/powerpc/include/asm/spinlock.h
@@ -10,5 +10,9 @@
#include <asm/simple_spinlock.h>
#endif
+#ifndef CONFIG_PARAVIRT_SPINLOCKS
+static inline void pv_spinlocks_init(void) { }
+#endif
+
#endif /* __KERNEL__ */
#endif /* __ASM_SPINLOCK_H */
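
To make the effect on the caller concrete, here is a small stand-alone
sketch of the same stub pattern. It is an illustration only, assuming the
config symbol is left undefined; setup_arch_sketch(), pv_init_calls and the
plain bool standing in for the shared_processor static key are hypothetical
names, not the pseries code.

```c
#include <stdbool.h>

/* Uncomment to model a build with the option enabled. */
/* #define CONFIG_PARAVIRT_SPINLOCKS 1 */

static int pv_init_calls;	/* hypothetical counter, for illustration */

#ifdef CONFIG_PARAVIRT_SPINLOCKS
/* Stand-in for the real initialisation routine. */
static void pv_spinlocks_init(void)
{
	pv_init_calls++;
}
#else
/* Empty stub: the call site compiles unchanged and does nothing. */
static inline void pv_spinlocks_init(void) { }
#endif

static bool shared_processor;	/* models the static key as a plain flag */

/* The caller needs no #ifdef; the header-style stub absorbs the config. */
static void setup_arch_sketch(bool is_shared)
{
	if (is_shared) {
		shared_processor = true;
		pv_spinlocks_init();
	}
}
```

The conditional compilation then lives in one place next to the
declaration, and every call site stays free of preprocessor guards.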
cheers
^ permalink raw reply related
* Re: [PATCH 1/2] powerpc/powernv/idle: Exclude mfspr on HID1,4,5 on P9 and above
From: Gautham R Shenoy @ 2020-07-09 9:01 UTC (permalink / raw)
To: Pratik Rajesh Sampat
Cc: ego, pratik.r.sampat, linux-kernel, paulus, linuxppc-dev
In-Reply-To: <20200703124640.42820-1-psampat@linux.ibm.com>
On Fri, Jul 03, 2020 at 06:16:39PM +0530, Pratik Rajesh Sampat wrote:
> From POWER9 onwards, support for the registers HID1, HID4 and HID5 has
> been removed.
> Although mfspr on these registers worked on POWER9, it is not
> recognized by the POWER10 simulator. Move their assignment under the
> check for machines older than POWER9.
>
> Signed-off-by: Pratik Rajesh Sampat <psampat@linux.ibm.com>
Nice catch.
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
> arch/powerpc/platforms/powernv/idle.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
> index 2dd467383a88..19d94d021357 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -73,9 +73,6 @@ static int pnv_save_sprs_for_deep_states(void)
> */
> uint64_t lpcr_val = mfspr(SPRN_LPCR);
> uint64_t hid0_val = mfspr(SPRN_HID0);
> - uint64_t hid1_val = mfspr(SPRN_HID1);
> - uint64_t hid4_val = mfspr(SPRN_HID4);
> - uint64_t hid5_val = mfspr(SPRN_HID5);
> uint64_t hmeer_val = mfspr(SPRN_HMEER);
> uint64_t msr_val = MSR_IDLE;
> uint64_t psscr_val = pnv_deepest_stop_psscr_val;
> @@ -117,6 +114,9 @@ static int pnv_save_sprs_for_deep_states(void)
>
> /* Only p8 needs to set extra HID registers */
> if (!cpu_has_feature(CPU_FTR_ARCH_300)) {
> + uint64_t hid1_val = mfspr(SPRN_HID1);
> + uint64_t hid4_val = mfspr(SPRN_HID4);
> + uint64_t hid5_val = mfspr(SPRN_HID5);
>
> rc = opal_slw_set_reg(pir, SPRN_HID1, hid1_val);
> if (rc != 0)
> --
> 2.25.4
>
--
Thanks and Regards
gautham.
^ permalink raw reply