* Re: [PATCH 04/13] powerpc/xive: Use cpu_to_node() instead of ibm, chip-id property
From: Aneesh Kumar K.V @ 2020-12-09 4:39 UTC (permalink / raw)
To: Cédric Le Goater, linuxppc-dev; +Cc: Cédric Le Goater
In-Reply-To: <20201208151124.1329942-5-clg@kaod.org>
Cédric Le Goater <clg@kaod.org> writes:
> The 'chip_id' field of the XIVE CPU structure is used to choose a
> target for a source located on the same chip when possible. This field
> is assigned on the PowerNV platform using the "ibm,chip-id" property
> on pSeries under KVM when NUMA nodes are defined but it is undefined
> under PowerVM. The XIVE source structure has a similar field
> 'src_chip' which is only assigned on the PowerNV platform.
>
> cpu_to_node() returns a compatible value on all platforms, 0 being the
> default node. It will also give us the opportunity to set the affinity
> of a source on pSeries when we can localize them.
But we should avoid assuming that linux numa node number is equivalent
to chip id [1]. What do we expect this value represents on virtualized
platforms like PowerVM and KVM? Is this used for a hcall?
[1] https://lore.kernel.org/linuxppc-dev/20200817103238.158133-1-aneesh.kumar@linux.ibm.com
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> arch/powerpc/sysdev/xive/common.c | 7 +------
> 1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
> index ee375daf8114..605238ca65e4 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -1342,16 +1342,11 @@ static int xive_prepare_cpu(unsigned int cpu)
>
> xc = per_cpu(xive_cpu, cpu);
> if (!xc) {
> - struct device_node *np;
> -
> xc = kzalloc_node(sizeof(struct xive_cpu),
> GFP_KERNEL, cpu_to_node(cpu));
> if (!xc)
> return -ENOMEM;
> - np = of_get_cpu_node(cpu, NULL);
> - if (np)
> - xc->chip_id = of_get_ibm_chip_id(np);
> - of_node_put(np);
> + xc->chip_id = cpu_to_node(cpu);
> xc->hw_ipi = XIVE_BAD_IRQ;
>
> per_cpu(xive_cpu, cpu) = xc;
> --
> 2.26.2
^ permalink raw reply
* [powerpc:fixes-test] BUILD SUCCESS 5eedf9fe8db23313df104576845cec5f481b9b60
From: kernel test robot @ 2020-12-09 4:42 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git fixes-test
branch HEAD: 5eedf9fe8db23313df104576845cec5f481b9b60 powerpc/mm: Fix KUAP warning by providing copy_from_kernel_nofault_allowed()
elapsed time: 884m
configs tested: 129
configs skipped: 2
The following configs have been built successfully.
More configs may be tested in the coming days.
gcc tested configs:
arm64 allyesconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
arm defconfig
powerpc g5_defconfig
powerpc skiroot_defconfig
sh rts7751r2dplus_defconfig
sh magicpanelr2_defconfig
powerpc maple_defconfig
arm rpc_defconfig
parisc generic-32bit_defconfig
arm milbeaut_m10v_defconfig
powerpc warp_defconfig
sh lboxre2_defconfig
powerpc64 defconfig
powerpc pseries_defconfig
powerpc canyonlands_defconfig
powerpc mpc834x_mds_defconfig
sh apsh4a3a_defconfig
x86_64 alldefconfig
powerpc wii_defconfig
powerpc mpc83xx_defconfig
sh rsk7264_defconfig
arc haps_hs_smp_defconfig
sh se7724_defconfig
powerpc ep8248e_defconfig
arm assabet_defconfig
arm corgi_defconfig
sh alldefconfig
powerpc mpc8272_ads_defconfig
sh sdk7780_defconfig
sh se7619_defconfig
mips cu1830-neo_defconfig
sh se7751_defconfig
arm lart_defconfig
powerpc mpc7448_hpc2_defconfig
sh se7721_defconfig
mips omega2p_defconfig
arm dove_defconfig
mips ath79_defconfig
powerpc kmeter1_defconfig
mips maltaaprp_defconfig
powerpc mgcoge_defconfig
arc hsdk_defconfig
xtensa defconfig
powerpc pmac32_defconfig
xtensa virt_defconfig
m68k m5475evb_defconfig
mips mpc30x_defconfig
h8300 defconfig
openrisc simple_smp_defconfig
mips rt305x_defconfig
powerpc tqm8541_defconfig
c6x evmc6678_defconfig
powerpc tqm5200_defconfig
powerpc acadia_defconfig
arc nsimosci_defconfig
arm omap1_defconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
arc allyesconfig
nds32 allnoconfig
c6x allyesconfig
nds32 defconfig
nios2 allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
parisc allyesconfig
s390 defconfig
i386 allyesconfig
sparc allyesconfig
sparc defconfig
i386 tinyconfig
i386 defconfig
mips allyesconfig
mips allmodconfig
powerpc allyesconfig
powerpc allmodconfig
powerpc allnoconfig
i386 randconfig-a004-20201208
i386 randconfig-a005-20201208
i386 randconfig-a001-20201208
i386 randconfig-a002-20201208
i386 randconfig-a006-20201208
i386 randconfig-a003-20201208
x86_64 randconfig-a004-20201208
x86_64 randconfig-a006-20201208
x86_64 randconfig-a005-20201208
x86_64 randconfig-a001-20201208
x86_64 randconfig-a002-20201208
x86_64 randconfig-a003-20201208
i386 randconfig-a013-20201208
i386 randconfig-a014-20201208
i386 randconfig-a011-20201208
i386 randconfig-a015-20201208
i386 randconfig-a012-20201208
i386 randconfig-a016-20201208
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
riscv nommu_k210_defconfig
riscv nommu_virt_defconfig
riscv rv32_defconfig
x86_64 rhel
x86_64 allyesconfig
x86_64 rhel-7.6-kselftests
x86_64 defconfig
x86_64 rhel-8.3
x86_64 kexec
clang tested configs:
x86_64 randconfig-a012-20201208
x86_64 randconfig-a013-20201208
x86_64 randconfig-a011-20201208
x86_64 randconfig-a016-20201208
x86_64 randconfig-a014-20201208
x86_64 randconfig-a015-20201208
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply
* [powerpc:next-test] BUILD SUCCESS db972a3787d12b1ce9ba7a31ec376d8a79e04c47
From: kernel test robot @ 2020-12-09 4:42 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test
branch HEAD: db972a3787d12b1ce9ba7a31ec376d8a79e04c47 powerpc/powermac: Fix low_sleep_handler with CONFIG_VMAP_STACK
elapsed time: 884m
configs tested: 103
configs skipped: 3
The following configs have been built successfully.
More configs may be tested in the coming days.
gcc tested configs:
arm defconfig
arm64 allyesconfig
arm64 defconfig
arm allmodconfig
arm allyesconfig
arm lubbock_defconfig
sh apsh4ad0a_defconfig
m68k sun3_defconfig
sh migor_defconfig
powerpc g5_defconfig
powerpc skiroot_defconfig
sh rts7751r2dplus_defconfig
sh magicpanelr2_defconfig
sh rsk7264_defconfig
arc haps_hs_smp_defconfig
sh se7724_defconfig
powerpc ep8248e_defconfig
arm assabet_defconfig
mips omega2p_defconfig
arm dove_defconfig
mips ath79_defconfig
powerpc kmeter1_defconfig
mips maltaaprp_defconfig
powerpc mgcoge_defconfig
arc hsdk_defconfig
xtensa defconfig
powerpc pmac32_defconfig
sh se7721_defconfig
c6x evmc6457_defconfig
powerpc acadia_defconfig
mips bcm47xx_defconfig
arm tango4_defconfig
powerpc mpc7448_hpc2_defconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
arc allyesconfig
nds32 allnoconfig
c6x allyesconfig
nds32 defconfig
nios2 allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
parisc allyesconfig
s390 defconfig
i386 allyesconfig
sparc allyesconfig
sparc defconfig
i386 tinyconfig
i386 defconfig
mips allyesconfig
mips allmodconfig
powerpc allyesconfig
powerpc allmodconfig
powerpc allnoconfig
x86_64 randconfig-a004-20201208
x86_64 randconfig-a006-20201208
x86_64 randconfig-a005-20201208
x86_64 randconfig-a001-20201208
x86_64 randconfig-a002-20201208
x86_64 randconfig-a003-20201208
i386 randconfig-a004-20201208
i386 randconfig-a005-20201208
i386 randconfig-a001-20201208
i386 randconfig-a002-20201208
i386 randconfig-a006-20201208
i386 randconfig-a003-20201208
i386 randconfig-a013-20201208
i386 randconfig-a014-20201208
i386 randconfig-a011-20201208
i386 randconfig-a015-20201208
i386 randconfig-a012-20201208
i386 randconfig-a016-20201208
riscv nommu_k210_defconfig
riscv allyesconfig
riscv nommu_virt_defconfig
riscv allnoconfig
riscv defconfig
riscv rv32_defconfig
riscv allmodconfig
x86_64 rhel
x86_64 allyesconfig
x86_64 rhel-7.6-kselftests
x86_64 defconfig
x86_64 rhel-8.3
x86_64 kexec
clang tested configs:
x86_64 randconfig-a016-20201208
x86_64 randconfig-a012-20201208
x86_64 randconfig-a013-20201208
x86_64 randconfig-a014-20201208
x86_64 randconfig-a015-20201208
x86_64 randconfig-a011-20201208
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply
* Re: linux-next: build warning after merge of the akpm tree
From: Michael Ellerman @ 2020-12-09 4:44 UTC (permalink / raw)
To: Stephen Rothwell, Andrew Morton
Cc: Kees Cook, Mathieu Malaterre, Linux Kernel Mailing List,
Nicholas Piggin, Linux Next Mailing List, PowerPC
In-Reply-To: <20201208230157.42c42789@canb.auug.org.au>
Stephen Rothwell <sfr@canb.auug.org.au> writes:
> Hi Stephen,
>
> On Fri, 4 Dec 2020 21:00:00 +1100 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>>
>> Hi all,
>>
>> After merging the akpm tree, today's linux-next build (powerpc
>> allyesconfig) produced warnings like this:
>>
>> ld: warning: orphan section `.data..Lubsan_data177' from `arch/powerpc/oprofile/op_model_pa6t.o' being placed in section `.data..Lubsan_data177'
>>
>> (lots of these latter ones)
>
> 781584 of them today!
>
>> I don't know what produced these, but it is in the akpm-current or
>> akpm trees.
>
> Presumably the result of commit
>
> 186c3e18dba3 ("ubsan: enable for all*config builds")
>
> from the akpm-current tree.
>
> arch/powerpc/kernel/vmlinux.lds.S has:
>
> #ifdef CONFIG_PPC32
> .data : AT(ADDR(.data) - LOAD_OFFSET) {
> DATA_DATA
> #ifdef CONFIG_UBSAN
> *(.data..Lubsan_data*)
> *(.data..Lubsan_type*)
> #endif
> *(.data.rel*)
> *(SDATA_MAIN)
>
> added by commit
>
> beba24ac5913 ("powerpc/32: Add .data..Lubsan_data*/.data..Lubsan_type* sections explicitly")
>
> in 2018, but no equivalent for 64 bit.
They should really be in DATA_DATA or similar shouldn't they?
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/mce: Remove per cpu variables from MCE handlers
From: Michael Ellerman @ 2020-12-09 5:09 UTC (permalink / raw)
To: Mahesh Jagannath Salgaonkar, Ganesh, linuxppc-dev; +Cc: npiggin
In-Reply-To: <83ca2e1c-88f5-fc57-11e2-056f3ce835d7@linux.ibm.com>
Mahesh Jagannath Salgaonkar <mahesh@linux.ibm.com> writes:
> On 12/8/20 4:16 PM, Ganesh wrote:
>>
>> On 12/8/20 4:01 PM, Michael Ellerman wrote:
>>> Ganesh Goudar <ganeshgr@linux.ibm.com> writes:
>>>> diff --git a/arch/powerpc/include/asm/paca.h
>>>> b/arch/powerpc/include/asm/paca.h
>>>> index 9454d29ff4b4..4769954efa7d 100644
>>>> --- a/arch/powerpc/include/asm/paca.h
>>>> +++ b/arch/powerpc/include/asm/paca.h
>>>> @@ -273,6 +274,17 @@ struct paca_struct {
>>>> #ifdef CONFIG_MMIOWB
>>>> struct mmiowb_state mmiowb_state;
>>>> #endif
>>>> +#ifdef CONFIG_PPC_BOOK3S_64
>>>> + int mce_nest_count;
>>>> + struct machine_check_event mce_event[MAX_MC_EVT];
>>>> + /* Queue for delayed MCE events. */
>>>> + int mce_queue_count;
>>>> + struct machine_check_event mce_event_queue[MAX_MC_EVT];
>>>> +
>>>> + /* Queue for delayed MCE UE events. */
>>>> + int mce_ue_count;
>>>> + struct machine_check_event mce_ue_event_queue[MAX_MC_EVT];
>>>> +#endif /* CONFIG_PPC_BOOK3S_64 */
>>>> } ____cacheline_aligned;
>>> How much does this expand the paca by?
>>
>> Size of paca is 4480 bytes, these add up another 2160 bytes, so expands
>> it by 48%.
>
> Should we dynamically allocate the array sizes early as similar to that
> of paca->mce_faulty_slbs so that we don't bump up paca size ?
Yeah I think that would be preferable.
That way those allocations can be normal node-local allocations on bare
metal, or when using radix. (Or even on KVM).
In fact what we probably want is a separate struct for all the MCE
related data, eg something like:
struct mce_stuff {
int nest_count;
/* Queue for delayed MCE events. */
int queue_count;
/* Queue for delayed MCE UE events. */
int mce_ue_count;
struct machine_check_event events[MAX_MC_EVT];
struct machine_check_event event_queue[MAX_MC_EVT];
struct machine_check_event ue_event_queue[MAX_MC_EVT];
};
And then you allocate one of those per CPU, inside the RMO for pseries
with hash, and node-local otherwise.
cheers
^ permalink raw reply
* [PATCH v4 5/6] powerpc/fault: Avoid heavy search_exception_tables() verification
From: Christophe Leroy @ 2020-12-09 5:29 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin,
aneesh.kumar
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <0d37490a067840f53fc5b118869917c0aec9ab87.1607491747.git.christophe.leroy@csgroup.eu>
search_exception_tables() is an heavy operation, we have to avoid it.
When KUAP is selected, we'll know the fault has been blocked by KUAP.
When it is blocked by KUAP, check whether we are in an expected
userspace access place. If so, emit a warning to spot something is
going work. Otherwise, just remain silent, it will likely Oops soon.
When KUAP is not selected, it behaves just as if the address was
already in the TLBs and no fault was generated.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
---
v4: keep the search once we hit kuap_bad_fault() in order to warn or not
v3: rebased
v2: Squashed with the preceeding patch which was re-ordering tests that get removed in this patch.
---
arch/powerpc/mm/fault.c | 28 +++++++++++++---------------
1 file changed, 13 insertions(+), 15 deletions(-)
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 04505f938bbc..389a2a875262 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -210,28 +210,26 @@ static bool bad_kernel_fault(struct pt_regs *regs, unsigned long error_code,
return true;
}
- if (!is_exec && address < TASK_SIZE && (error_code & (DSISR_PROTFAULT | DSISR_KEYFAULT)) &&
- !search_exception_tables(regs->nip)) {
- pr_crit_ratelimited("Kernel attempted to access user page (%lx) - exploit attempt? (uid: %d)\n",
- address,
- from_kuid(&init_user_ns, current_uid()));
- }
-
// Kernel fault on kernel address is bad
if (address >= TASK_SIZE)
return true;
- // Fault on user outside of certain regions (eg. copy_tofrom_user()) is bad
- if (!search_exception_tables(regs->nip))
- return true;
+ // Read/write fault blocked by KUAP is bad, it can never succeed.
+ if (bad_kuap_fault(regs, address, is_write)) {
+ pr_crit_ratelimited("Kernel attempted to %s user page (%lx) - exploit attempt? (uid: %d)\n",
+ is_write ? "write" : "read", address,
+ from_kuid(&init_user_ns, current_uid()));
+
+ // Fault on user outside of certain regions (eg. copy_tofrom_user()) is bad
+ if (!search_exception_tables(regs->nip))
+ return true;
- // Read/write fault in a valid region (the exception table search passed
- // above), but blocked by KUAP is bad, it can never succeed.
- if (bad_kuap_fault(regs, address, is_write))
+ // Read/write fault in a valid region (the exception table search passed
+ // above), but blocked by KUAP is bad, it can never succeed.
return WARN(true, "Bug: %s fault blocked by KUAP!", is_write ? "Write" : "Read");
+ }
- // What's left? Kernel fault on user in well defined regions (extable
- // matched), and allowed by KUAP in the faulting context.
+ // What's left? Kernel fault on user and allowed by KUAP in the faulting context.
return false;
}
--
2.25.0
^ permalink raw reply related
* [PATCH v4 1/6] powerpc/book3s64/kuap: Improve error reporting with KUAP
From: Christophe Leroy @ 2020-12-09 5:29 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin,
aneesh.kumar
Cc: linuxppc-dev, linux-kernel
From: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
This partially reverts commit eb232b162446 ("powerpc/book3s64/kuap: Improve
error reporting with KUAP") and update the fault handler to print
[ 55.022514] Kernel attempted to access user page (7e6725b70000) - exploit attempt? (uid: 0)
[ 55.022528] BUG: Unable to handle kernel data access on read at 0x7e6725b70000
[ 55.022533] Faulting instruction address: 0xc000000000e8b9bc
[ 55.022540] Oops: Kernel access of bad area, sig: 11 [#1]
....
when the kernel access userspace address without unlocking AMR.
bad_kuap_fault() is added as part of commit 5e5be3aed230 ("powerpc/mm: Detect
bad KUAP faults") to catch userspace access incorrectly blocked by AMR. Hence
retain the full stack dump there even with hash translation. Also, add a comment
explaining the difference between hash and radix.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/book3s/32/kup.h | 4 +--
arch/powerpc/include/asm/book3s/64/kup.h | 34 ++++++++++----------
arch/powerpc/include/asm/kup.h | 4 +--
arch/powerpc/include/asm/nohash/32/kup-8xx.h | 4 +--
arch/powerpc/mm/fault.c | 4 +--
5 files changed, 25 insertions(+), 25 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
index b18cd931e325..32fd4452e960 100644
--- a/arch/powerpc/include/asm/book3s/32/kup.h
+++ b/arch/powerpc/include/asm/book3s/32/kup.h
@@ -177,8 +177,8 @@ static inline void restore_user_access(unsigned long flags)
allow_user_access(to, to, end - addr, KUAP_READ_WRITE);
}
-static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
- bool is_write, unsigned long error_code)
+static inline bool
+bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
{
unsigned long begin = regs->kuap & 0xf0000000;
unsigned long end = regs->kuap << 28;
diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
index f2e6dd78d5e2..7075c92c320c 100644
--- a/arch/powerpc/include/asm/book3s/64/kup.h
+++ b/arch/powerpc/include/asm/book3s/64/kup.h
@@ -353,29 +353,29 @@ static inline void set_kuap(unsigned long value)
isync();
}
-#define RADIX_KUAP_BLOCK_READ UL(0x4000000000000000)
-#define RADIX_KUAP_BLOCK_WRITE UL(0x8000000000000000)
-
static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
- bool is_write, unsigned long error_code)
+ bool is_write)
{
if (!mmu_has_feature(MMU_FTR_BOOK3S_KUAP))
return false;
-
- if (radix_enabled()) {
- /*
- * Will be a storage protection fault.
- * Only check the details of AMR[0]
- */
- return WARN((regs->kuap & (is_write ? RADIX_KUAP_BLOCK_WRITE : RADIX_KUAP_BLOCK_READ)),
- "Bug: %s fault blocked by AMR!", is_write ? "Write" : "Read");
- }
/*
- * We don't want to WARN here because userspace can setup
- * keys such that a kernel access to user address can cause
- * fault
+ * For radix this will be a storage protection fault (DSISR_PROTFAULT).
+ * For hash this will be a key fault (DSISR_KEYFAULT)
*/
- return !!(error_code & DSISR_KEYFAULT);
+ /*
+ * We do have exception table entry, but accessing the
+ * userspace results in fault. This could be because we
+ * didn't unlock the AMR or access is denied by userspace
+ * using a key value that blocks access. We are only interested
+ * in catching the use case of accessing without unlocking
+ * the AMR. Hence check for BLOCK_WRITE/READ against AMR.
+ */
+ if (is_write) {
+ return WARN(((regs->amr & AMR_KUAP_BLOCK_WRITE) == AMR_KUAP_BLOCK_WRITE),
+ "Bug: Write fault blocked by AMR!");
+ }
+ return WARN(((regs->amr & AMR_KUAP_BLOCK_READ) == AMR_KUAP_BLOCK_READ),
+ "Bug: Read fault blocked by AMR!");
}
static __always_inline void allow_user_access(void __user *to, const void __user *from,
diff --git a/arch/powerpc/include/asm/kup.h b/arch/powerpc/include/asm/kup.h
index f8ec679bd2de..5a9820c54da9 100644
--- a/arch/powerpc/include/asm/kup.h
+++ b/arch/powerpc/include/asm/kup.h
@@ -62,8 +62,8 @@ void setup_kuap(bool disabled);
#else
static inline void setup_kuap(bool disabled) { }
-static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
- bool is_write, unsigned long error_code)
+static inline bool
+bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
{
return false;
}
diff --git a/arch/powerpc/include/asm/nohash/32/kup-8xx.h b/arch/powerpc/include/asm/nohash/32/kup-8xx.h
index 7bdd9e5b63ed..567cdc557402 100644
--- a/arch/powerpc/include/asm/nohash/32/kup-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/kup-8xx.h
@@ -60,8 +60,8 @@ static inline void restore_user_access(unsigned long flags)
mtspr(SPRN_MD_AP, flags);
}
-static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
- bool is_write, unsigned long error_code)
+static inline bool
+bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
{
return WARN(!((regs->kuap ^ MD_APG_KUAP) & 0xff000000),
"Bug: fault blocked by AP register !");
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index c91621df0c61..b12595102525 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -210,7 +210,7 @@ static bool bad_kernel_fault(struct pt_regs *regs, unsigned long error_code,
return true;
}
- if (!is_exec && address < TASK_SIZE && (error_code & DSISR_PROTFAULT) &&
+ if (!is_exec && address < TASK_SIZE && (error_code & (DSISR_PROTFAULT | DSISR_KEYFAULT)) &&
!search_exception_tables(regs->nip)) {
pr_crit_ratelimited("Kernel attempted to access user page (%lx) - exploit attempt? (uid: %d)\n",
address,
@@ -227,7 +227,7 @@ static bool bad_kernel_fault(struct pt_regs *regs, unsigned long error_code,
// Read/write fault in a valid region (the exception table search passed
// above), but blocked by KUAP is bad, it can never succeed.
- if (bad_kuap_fault(regs, address, is_write, error_code))
+ if (bad_kuap_fault(regs, address, is_write))
return true;
// What's left? Kernel fault on user in well defined regions (extable
--
2.25.0
^ permalink raw reply related
* [PATCH v4 2/6] powerpc/mm: sanity_check_fault() should work for all, not only BOOK3S
From: Christophe Leroy @ 2020-12-09 5:29 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin,
aneesh.kumar
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <0d37490a067840f53fc5b118869917c0aec9ab87.1607491747.git.christophe.leroy@csgroup.eu>
The verification and message introduced by commit 374f3f5979f9
("powerpc/mm/hash: Handle user access of kernel address gracefully")
applies to all platforms, it should not be limited to BOOK3S.
Make the BOOK3S version of sanity_check_fault() the one for all,
and bail out earlier if not BOOK3S.
Fixes: 374f3f5979f9 ("powerpc/mm/hash: Handle user access of kernel address gracefully")
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/mm/fault.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index b12595102525..f6ae56a0d7a3 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -303,7 +303,6 @@ static inline void cmo_account_page_fault(void)
static inline void cmo_account_page_fault(void) { }
#endif /* CONFIG_PPC_SMLPAR */
-#ifdef CONFIG_PPC_BOOK3S
static void sanity_check_fault(bool is_write, bool is_user,
unsigned long error_code, unsigned long address)
{
@@ -320,6 +319,9 @@ static void sanity_check_fault(bool is_write, bool is_user,
return;
}
+ if (!IS_ENABLED(CONFIG_PPC_BOOK3S))
+ return;
+
/*
* For hash translation mode, we should never get a
* PROTFAULT. Any update to pte to reduce access will result in us
@@ -354,10 +356,6 @@ static void sanity_check_fault(bool is_write, bool is_user,
WARN_ON_ONCE(error_code & DSISR_PROTFAULT);
}
-#else
-static void sanity_check_fault(bool is_write, bool is_user,
- unsigned long error_code, unsigned long address) { }
-#endif /* CONFIG_PPC_BOOK3S */
/*
* Define the correct "is_write" bit in error_code based
--
2.25.0
^ permalink raw reply related
* [PATCH v4 3/6] powerpc/fault: Unnest definition of page_fault_is_write() and page_fault_is_bad()
From: Christophe Leroy @ 2020-12-09 5:29 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin,
aneesh.kumar
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <0d37490a067840f53fc5b118869917c0aec9ab87.1607491747.git.christophe.leroy@csgroup.eu>
To make it more readable, separate page_fault_is_write() and page_fault_is_bad()
to avoir several levels of #ifdefs
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/mm/fault.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index f6ae56a0d7a3..3fcd34c28e10 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -363,17 +363,19 @@ static void sanity_check_fault(bool is_write, bool is_user,
*/
#if (defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
#define page_fault_is_write(__err) ((__err) & ESR_DST)
-#define page_fault_is_bad(__err) (0)
#else
#define page_fault_is_write(__err) ((__err) & DSISR_ISSTORE)
-#if defined(CONFIG_PPC_8xx)
+#endif
+
+#if defined(CONFIG_4xx) || defined(CONFIG_BOOKE)
+#define page_fault_is_bad(__err) (0)
+#elif defined(CONFIG_PPC_8xx)
#define page_fault_is_bad(__err) ((__err) & DSISR_NOEXEC_OR_G)
#elif defined(CONFIG_PPC64)
#define page_fault_is_bad(__err) ((__err) & DSISR_BAD_FAULT_64S)
#else
#define page_fault_is_bad(__err) ((__err) & DSISR_BAD_FAULT_32S)
#endif
-#endif
/*
* For 600- and 800-family processors, the error_code parameter is DSISR
--
2.25.0
^ permalink raw reply related
* [PATCH v4 4/6] powerpc/mm: Move the WARN() out of bad_kuap_fault()
From: Christophe Leroy @ 2020-12-09 5:29 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin,
aneesh.kumar
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <0d37490a067840f53fc5b118869917c0aec9ab87.1607491747.git.christophe.leroy@csgroup.eu>
In order to prepare the removal of calls to
search_exception_tables() on the fast path, move the
WARN() out of bad_kuap_fault().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
v4: New
---
arch/powerpc/include/asm/book3s/32/kup.h | 6 +-----
arch/powerpc/include/asm/book3s/64/kup.h | 6 ++----
arch/powerpc/include/asm/nohash/32/kup-8xx.h | 3 +--
arch/powerpc/mm/fault.c | 2 +-
4 files changed, 5 insertions(+), 12 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
index 32fd4452e960..a0117a9d5b06 100644
--- a/arch/powerpc/include/asm/book3s/32/kup.h
+++ b/arch/powerpc/include/asm/book3s/32/kup.h
@@ -183,11 +183,7 @@ bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
unsigned long begin = regs->kuap & 0xf0000000;
unsigned long end = regs->kuap << 28;
- if (!is_write)
- return false;
-
- return WARN(address < begin || address >= end,
- "Bug: write fault blocked by segment registers !");
+ return is_write && (address < begin || address >= end);
}
#endif /* CONFIG_PPC_KUAP */
diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
index 7075c92c320c..f50f72e535aa 100644
--- a/arch/powerpc/include/asm/book3s/64/kup.h
+++ b/arch/powerpc/include/asm/book3s/64/kup.h
@@ -371,11 +371,9 @@ static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
* the AMR. Hence check for BLOCK_WRITE/READ against AMR.
*/
if (is_write) {
- return WARN(((regs->amr & AMR_KUAP_BLOCK_WRITE) == AMR_KUAP_BLOCK_WRITE),
- "Bug: Write fault blocked by AMR!");
+ return (regs->amr & AMR_KUAP_BLOCK_WRITE) == AMR_KUAP_BLOCK_WRITE;
}
- return WARN(((regs->amr & AMR_KUAP_BLOCK_READ) == AMR_KUAP_BLOCK_READ),
- "Bug: Read fault blocked by AMR!");
+ return (regs->amr & AMR_KUAP_BLOCK_READ) == AMR_KUAP_BLOCK_READ;
}
static __always_inline void allow_user_access(void __user *to, const void __user *from,
diff --git a/arch/powerpc/include/asm/nohash/32/kup-8xx.h b/arch/powerpc/include/asm/nohash/32/kup-8xx.h
index 567cdc557402..17a4a616436f 100644
--- a/arch/powerpc/include/asm/nohash/32/kup-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/kup-8xx.h
@@ -63,8 +63,7 @@ static inline void restore_user_access(unsigned long flags)
static inline bool
bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
{
- return WARN(!((regs->kuap ^ MD_APG_KUAP) & 0xff000000),
- "Bug: fault blocked by AP register !");
+ return !((regs->kuap ^ MD_APG_KUAP) & 0xff000000);
}
#endif /* !__ASSEMBLY__ */
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 3fcd34c28e10..04505f938bbc 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -228,7 +228,7 @@ static bool bad_kernel_fault(struct pt_regs *regs, unsigned long error_code,
// Read/write fault in a valid region (the exception table search passed
// above), but blocked by KUAP is bad, it can never succeed.
if (bad_kuap_fault(regs, address, is_write))
- return true;
+ return WARN(true, "Bug: %s fault blocked by KUAP!", is_write ? "Write" : "Read");
// What's left? Kernel fault on user in well defined regions (extable
// matched), and allowed by KUAP in the faulting context.
--
2.25.0
^ permalink raw reply related
* [PATCH v4 6/6] powerpc/fault: Perform exception fixup in do_page_fault()
From: Christophe Leroy @ 2020-12-09 5:29 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin,
aneesh.kumar
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <0d37490a067840f53fc5b118869917c0aec9ab87.1607491747.git.christophe.leroy@csgroup.eu>
Exception fixup doesn't require the heady full regs saving,
do it from do_page_fault() directly.
For that, split bad_page_fault() in two parts.
As bad_page_fault() can also be called from other places than
handle_page_fault(), it will still perform exception fixup and
fallback on __bad_page_fault().
handle_page_fault() directly calls __bad_page_fault() as the
exception fixup will now be done by do_page_fault()
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
v2: Add prototype of __bad_page_fault() in asm/bug.h
---
arch/powerpc/include/asm/bug.h | 1 +
arch/powerpc/kernel/entry_32.S | 2 +-
arch/powerpc/kernel/exceptions-64e.S | 2 +-
arch/powerpc/kernel/exceptions-64s.S | 2 +-
arch/powerpc/mm/fault.c | 33 ++++++++++++++++++++--------
5 files changed, 28 insertions(+), 12 deletions(-)
diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index ba0500872cce..464f8ca8a5c9 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -113,6 +113,7 @@
struct pt_regs;
extern int do_page_fault(struct pt_regs *, unsigned long, unsigned long);
extern void bad_page_fault(struct pt_regs *, unsigned long, int);
+void __bad_page_fault(struct pt_regs *regs, unsigned long address, int sig);
extern void _exception(int, struct pt_regs *, int, unsigned long);
extern void _exception_pkey(struct pt_regs *, unsigned long, int);
extern void die(const char *, struct pt_regs *, long);
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 58177c71dfd4..1c9b0ccc2172 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -684,7 +684,7 @@ handle_page_fault:
mr r5,r3
addi r3,r1,STACK_FRAME_OVERHEAD
lwz r4,_DAR(r1)
- bl bad_page_fault
+ bl __bad_page_fault
b ret_from_except_full
#ifdef CONFIG_PPC_BOOK3S_32
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index f579ce46eef2..74d07dc0bb48 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -1023,7 +1023,7 @@ storage_fault_common:
mr r5,r3
addi r3,r1,STACK_FRAME_OVERHEAD
ld r4,_DAR(r1)
- bl bad_page_fault
+ bl __bad_page_fault
b ret_from_except
/*
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 1c8f1b90e174..e02ad6fefa46 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -3259,7 +3259,7 @@ handle_page_fault:
mr r5,r3
addi r3,r1,STACK_FRAME_OVERHEAD
ld r4,_DAR(r1)
- bl bad_page_fault
+ bl __bad_page_fault
b interrupt_return
/* We have a data breakpoint exception - handle it */
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 389a2a875262..8961b44f350c 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -545,10 +545,20 @@ NOKPROBE_SYMBOL(__do_page_fault);
int do_page_fault(struct pt_regs *regs, unsigned long address,
unsigned long error_code)
{
+ const struct exception_table_entry *entry;
enum ctx_state prev_state = exception_enter();
int rc = __do_page_fault(regs, address, error_code);
exception_exit(prev_state);
- return rc;
+ if (likely(!rc))
+ return 0;
+
+ entry = search_exception_tables(regs->nip);
+ if (unlikely(!entry))
+ return rc;
+
+ instruction_pointer_set(regs, extable_fixup(entry));
+
+ return 0;
}
NOKPROBE_SYMBOL(do_page_fault);
@@ -557,17 +567,10 @@ NOKPROBE_SYMBOL(do_page_fault);
* It is called from the DSI and ISI handlers in head.S and from some
* of the procedures in traps.c.
*/
-void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
+void __bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
{
- const struct exception_table_entry *entry;
int is_write = page_fault_is_write(regs->dsisr);
- /* Are we prepared to handle this fault? */
- if ((entry = search_exception_tables(regs->nip)) != NULL) {
- regs->nip = extable_fixup(entry);
- return;
- }
-
/* kernel has accessed a bad area */
switch (TRAP(regs)) {
@@ -601,3 +604,15 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
die("Kernel access of bad area", regs, sig);
}
+
+void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
+{
+ const struct exception_table_entry *entry;
+
+ /* Are we prepared to handle this fault? */
+ entry = search_exception_tables(instruction_pointer(regs));
+ if (entry)
+ instruction_pointer_set(regs, extable_fixup(entry));
+ else
+ __bad_page_fault(regs, address, sig);
+}
--
2.25.0
^ permalink raw reply related
* Re: [PATCH v3 4/5] powerpc/fault: Avoid heavy search_exception_tables() verification
From: Christophe Leroy @ 2020-12-09 5:34 UTC (permalink / raw)
To: Aneesh Kumar K.V, Benjamin Herrenschmidt, Paul Mackerras,
Michael Ellerman, npiggin
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <b532a9c6-97de-031d-f880-901a117cc95c@csgroup.eu>
Le 08/12/2020 à 16:07, Christophe Leroy a écrit :
>
>
> Le 08/12/2020 à 15:52, Aneesh Kumar K.V a écrit :
>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>
>>> search_exception_tables() is an heavy operation, we have to avoid it.
>>> When KUAP is selected, we'll know the fault has been blocked by KUAP.
>>> Otherwise, it behaves just as if the address was already in the TLBs
>>> and no fault was generated.
>>>
>>> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
>>> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
>>> ---
>>> v3: rebased
>>> v2: Squashed with the preceeding patch which was re-ordering tests that get removed in this patch.
>>> ---
>>> arch/powerpc/mm/fault.c | 23 +++++++----------------
>>> 1 file changed, 7 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
>>> index 3fcd34c28e10..1770b41e4730 100644
>>> --- a/arch/powerpc/mm/fault.c
>>> +++ b/arch/powerpc/mm/fault.c
>>> @@ -210,28 +210,19 @@ static bool bad_kernel_fault(struct pt_regs *regs, unsigned long error_code,
>>> return true;
>>> }
>>> - if (!is_exec && address < TASK_SIZE && (error_code & (DSISR_PROTFAULT | DSISR_KEYFAULT)) &&
>>> - !search_exception_tables(regs->nip)) {
>>> - pr_crit_ratelimited("Kernel attempted to access user page (%lx) - exploit attempt? (uid:
>>> %d)\n",
>>> - address,
>>> - from_kuid(&init_user_ns, current_uid()));
>>> - }
>>> -
>>> // Kernel fault on kernel address is bad
>>> if (address >= TASK_SIZE)
>>> return true;
>>> - // Fault on user outside of certain regions (eg. copy_tofrom_user()) is bad
>>> - if (!search_exception_tables(regs->nip))
>>> - return true;
>>> -
>>> - // Read/write fault in a valid region (the exception table search passed
>>> - // above), but blocked by KUAP is bad, it can never succeed.
>>> - if (bad_kuap_fault(regs, address, is_write))
>>> + // Read/write fault blocked by KUAP is bad, it can never succeed.
>>> + if (bad_kuap_fault(regs, address, is_write)) {
>>> + pr_crit_ratelimited("Kernel attempted to %s user page (%lx) - exploit attempt? (uid:
>>> %d)\n",
>>> + is_write ? "write" : "read", address,
>>> + from_kuid(&init_user_ns, current_uid()));
>>> return true;
>>> + }
>>
>>
>> With this I am wondering whether the WARN() in bad_kuap_fault() is
>> needed. A direct access of userspace address will trigger this, whereas
>> previously we used bad_kuap_fault() only to identify incorrect restore
>> of AMR register (ie, to identify kernel bugs). Hence a WARN() there was
>> useful. We loose that differentiation now?
>
> Yes, I wanted to remove the WARN(), see
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/cc9129bdda1dbc2f0a09cf45fece7d0b0e690784.1605541983.git.christophe.leroy@csgroup.eu/
>
> but I understood from Michael that maybe it was not a good idea, so I left it aside for now when
> rebasing to v3.
>
> Yes previously we were able to differentiate between a direct access of userspace and a valid access
> triggering a KUAP fault, but at the cost of the heavy search_exception_tables().
> The issue was reported by Nick through https://github.com/linuxppc/issues/issues/317
>
> Should be perform the search_exception_tables() once we have hit the KUAP fault and WARN() only in
> that case ?
I sent out v4 which does that: only emit the warning once we know it is a KUAP fault within an
uaccess routine. With that, we should be back more or less as before: warning only if we hit KUAP
fault AND it is a place where a userspace access should be granted.
We are not anymore in the fast hot path, so calling search_exception_tables() there should be a
performance issue.
Christophe
>
> I was wondering also if we should keep the WARN() only when CONFIG_PPC_KUAP_DEBUG is set ?
>
^ permalink raw reply
* Re: [PATCH] powerpc/mm: Refactor the floor/ceiling check in hugetlb range freeing functions
From: Aneesh Kumar K.V @ 2020-12-09 6:29 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <16a571bb32eb6e8cd44bda484c8d81cd8a25e6d7.1604668827.git.christophe.leroy@csgroup.eu>
Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> All hugetlb range freeing functions have a verification like the following,
> which only differs by the mask used, depending on the page table level.
>
> start &= MASK;
> if (start < floor)
> return;
> if (ceiling) {
> ceiling &= MASK;
> if (! ceiling)
> return;
> }
> if (end - 1 > ceiling - 1)
> return;
>
> Refactor that into a helper function which takes the mask as
> an argument, returning true when [start;end[ is not fully
> contained inside [floor;ceiling[
>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
> arch/powerpc/mm/hugetlbpage.c | 56 ++++++++++++-----------------------
> 1 file changed, 19 insertions(+), 37 deletions(-)
>
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index 36c3800769fb..f8d8a4988e15 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -294,6 +294,21 @@ static void hugepd_free(struct mmu_gather *tlb, void *hugepte)
> static inline void hugepd_free(struct mmu_gather *tlb, void *hugepte) {}
> #endif
>
> +/* Return true when the entry to be freed maps more than the area being freed */
> +static bool range_is_outside_limits(unsigned long start, unsigned long end,
> + unsigned long floor, unsigned long ceiling,
> + unsigned long mask)
> +{
> + if ((start & mask) < floor)
> + return true;
> + if (ceiling) {
> + ceiling &= mask;
> + if (!ceiling)
> + return true;
> + }
> + return end - 1 > ceiling - 1;
> +}
> +
> static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshift,
> unsigned long start, unsigned long end,
> unsigned long floor, unsigned long ceiling)
> @@ -309,15 +324,7 @@ static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshif
> if (shift > pdshift)
> num_hugepd = 1 << (shift - pdshift);
>
> - start &= pdmask;
> - if (start < floor)
> - return;
> - if (ceiling) {
> - ceiling &= pdmask;
> - if (! ceiling)
> - return;
> - }
> - if (end - 1 > ceiling - 1)
> + if (range_is_outside_limits(start, end, floor, ceiling, pdmask))
> return;
>
> for (i = 0; i < num_hugepd; i++, hpdp++)
> @@ -334,18 +341,9 @@ static void hugetlb_free_pte_range(struct mmu_gather *tlb, pmd_t *pmd,
> unsigned long addr, unsigned long end,
> unsigned long floor, unsigned long ceiling)
> {
> - unsigned long start = addr;
> pgtable_t token = pmd_pgtable(*pmd);
>
> - start &= PMD_MASK;
> - if (start < floor)
> - return;
> - if (ceiling) {
> - ceiling &= PMD_MASK;
> - if (!ceiling)
> - return;
> - }
> - if (end - 1 > ceiling - 1)
> + if (range_is_outside_limits(addr, end, floor, ceiling, PMD_MASK))
> return;
>
> pmd_clear(pmd);
> @@ -395,15 +393,7 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
> addr, next, floor, ceiling);
> } while (addr = next, addr != end);
>
> - start &= PUD_MASK;
> - if (start < floor)
> - return;
> - if (ceiling) {
> - ceiling &= PUD_MASK;
> - if (!ceiling)
> - return;
> - }
> - if (end - 1 > ceiling - 1)
> + if (range_is_outside_limits(start, end, floor, ceiling, PUD_MASK))
> return;
>
> pmd = pmd_offset(pud, start);
> @@ -446,15 +436,7 @@ static void hugetlb_free_pud_range(struct mmu_gather *tlb, p4d_t *p4d,
> }
> } while (addr = next, addr != end);
>
> - start &= PGDIR_MASK;
> - if (start < floor)
> - return;
> - if (ceiling) {
> - ceiling &= PGDIR_MASK;
> - if (!ceiling)
> - return;
> - }
> - if (end - 1 > ceiling - 1)
> + if (range_is_outside_limits(start, end, floor, ceiling, PGDIR_MASK))
> return;
>
> pud = pud_offset(p4d, start);
> --
> 2.25.0
^ permalink raw reply
* Re: linux-next: build warning after merge of the akpm tree
From: Stephen Rothwell @ 2020-12-09 7:07 UTC (permalink / raw)
To: Michael Ellerman
Cc: Kees Cook, Mathieu Malaterre, Linux Kernel Mailing List,
Nicholas Piggin, Linux Next Mailing List, Andrew Morton, PowerPC
In-Reply-To: <87r1nzsi4s.fsf@mpe.ellerman.id.au>
[-- Attachment #1: Type: text/plain, Size: 247 bytes --]
Hi Michael,
On Wed, 09 Dec 2020 15:44:35 +1100 Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> They should really be in DATA_DATA or similar shouldn't they?
No other architecture appears t need them ...
--
Cheers,
Stephen Rothwell
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply
* Re: [PATCH 1/3] powerpc/smp: Parse ibm,thread-groups with multiple properties
From: Srikar Dronamraju @ 2020-12-09 8:35 UTC (permalink / raw)
To: Gautham R Shenoy
Cc: Nathan Lynch, Michael Neuling, Vaidyanathan Srinivasan,
Peter Zijlstra, linux-kernel, Nicholas Piggin, linuxppc-dev,
Valentin Schneider
In-Reply-To: <20201208172540.GA14206@in.ibm.com>
* Gautham R Shenoy <ego@linux.vnet.ibm.com> [2020-12-08 22:55:40]:
> >
> > NIT:
> > tglx mentions in one of his recent comments to try keep a reverse fir tree
> > ordering of variables where possible.
>
> I suppose you mean moving the longer local variable declarations to to
> the top and shorter ones to the bottom. Thanks. Will fix this.
>
Yes.
> > > + }
> > > +
> > > + if (!tg)
> > > + return -EINVAL;
> > > +
> > > + cpu_group_start = get_cpu_thread_group_start(cpu, tg);
> >
> > This whole hunk should be moved to a new function and called before
> > init_cpu_cache_map. It will simplify the logic to great extent.
>
> I suppose you are referring to the part where we select the correct
> tg. Yeah, that can move to a different helper.
>
Yes, I would prefer if we could call this new helper outside
init_cpu_cache_map.
> > >
> > > - zalloc_cpumask_var_node(&per_cpu(cpu_l1_cache_map, cpu),
> > > - GFP_KERNEL, cpu_to_node(cpu));
> > > + mask = &per_cpu(cpu_l1_cache_map, cpu);
> > > +
> > > + zalloc_cpumask_var_node(mask, GFP_KERNEL, cpu_to_node(cpu));
> > >
> >
> > This hunk (and the next hunk) should be moved to next patch.
> >
>
> The next patch is only about introducing THREAD_GROUP_SHARE_L2. Hence
> I put in any other code in this patch, since it seems to be a logical
> place to collate whatever we have in a generic form.
>
While I am fine with it, having a pointer that always points to the same
mask looks wierd.
--
Thanks and Regards
Srikar Dronamraju
^ permalink raw reply
* Re: [PATCH 3/3] powerpc/cacheinfo: Print correct cache-sibling map/list for L2 cache
From: Srikar Dronamraju @ 2020-12-09 8:39 UTC (permalink / raw)
To: Gautham R Shenoy
Cc: Nathan Lynch, Michael Neuling, Vaidyanathan Srinivasan,
Peter Zijlstra, linux-kernel, Nicholas Piggin, linuxppc-dev,
Valentin Schneider
In-Reply-To: <20201208175647.GC14206@in.ibm.com>
* Gautham R Shenoy <ego@linux.vnet.ibm.com> [2020-12-08 23:26:47]:
> > The drawback of this is even if cpus 0,2,4,6 are released L1 cache will not
> > be released. Is this as expected?
>
> cacheinfo populates the cache->shared_cpu_map on the basis of which
> CPUs share the common device-tree node for a particular cache. There
> is one l1-cache object in the device-tree for a CPU node corresponding
> to a big-core. That the L1 is further split between the threads of the
> core is shown using ibm,thread-groups.
>
Yes.
> The ideal thing would be to add a "group_leader" field to "struct
> cache" so that we can create separate cache objects , one per thread
> group. I will take a stab at this in the v2.
>
I am not saying this needs to be done immediately. We could add a TODO and
get it done later. Your patch is not making it worse. Its just that there is
still something more left to be done.
--
Thanks and Regards
Srikar Dronamraju
^ permalink raw reply
* Re: [PATCH 1/3] powerpc/smp: Parse ibm,thread-groups with multiple properties
From: Gautham R Shenoy @ 2020-12-09 9:05 UTC (permalink / raw)
To: Srikar Dronamraju
Cc: Nathan Lynch, Gautham R Shenoy, Michael Neuling,
Vaidyanathan Srinivasan, Peter Zijlstra, linux-kernel,
Nicholas Piggin, linuxppc-dev, Valentin Schneider
In-Reply-To: <20201209083541.GK528281@linux.vnet.ibm.com>
On Wed, Dec 09, 2020 at 02:05:41PM +0530, Srikar Dronamraju wrote:
> * Gautham R Shenoy <ego@linux.vnet.ibm.com> [2020-12-08 22:55:40]:
>
> > >
> > > NIT:
> > > tglx mentions in one of his recent comments to try keep a reverse fir tree
> > > ordering of variables where possible.
> >
> > I suppose you mean moving the longer local variable declarations to to
> > the top and shorter ones to the bottom. Thanks. Will fix this.
> >
>
> Yes.
>
> > > > + }
> > > > +
> > > > + if (!tg)
> > > > + return -EINVAL;
> > > > +
> > > > + cpu_group_start = get_cpu_thread_group_start(cpu, tg);
> > >
> > > This whole hunk should be moved to a new function and called before
> > > init_cpu_cache_map. It will simplify the logic to great extent.
> >
> > I suppose you are referring to the part where we select the correct
> > tg. Yeah, that can move to a different helper.
> >
>
> Yes, I would prefer if we could call this new helper outside
> init_cpu_cache_map.
>
> > > >
> > > > - zalloc_cpumask_var_node(&per_cpu(cpu_l1_cache_map, cpu),
> > > > - GFP_KERNEL, cpu_to_node(cpu));
> > > > + mask = &per_cpu(cpu_l1_cache_map, cpu);
> > > > +
> > > > + zalloc_cpumask_var_node(mask, GFP_KERNEL, cpu_to_node(cpu));
> > > >
> > >
> > > This hunk (and the next hunk) should be moved to next patch.
> > >
> >
> > The next patch is only about introducing THREAD_GROUP_SHARE_L2. Hence
> > I put in any other code in this patch, since it seems to be a logical
> > place to collate whatever we have in a generic form.
> >
>
> While I am fine with it, having a pointer that always points to the same
> mask looks wierd.
Sure. Moving some of this to a separate preparatory patch.
>
> --
> Thanks and Regards
> Srikar Dronamraju
^ permalink raw reply
* Re: [PATCH 3/3] powerpc/cacheinfo: Print correct cache-sibling map/list for L2 cache
From: Gautham R Shenoy @ 2020-12-09 9:07 UTC (permalink / raw)
To: Srikar Dronamraju
Cc: Nathan Lynch, Gautham R Shenoy, Michael Neuling,
Vaidyanathan Srinivasan, Peter Zijlstra, linux-kernel,
Nicholas Piggin, linuxppc-dev, Valentin Schneider
In-Reply-To: <20201209083921.GL528281@linux.vnet.ibm.com>
On Wed, Dec 09, 2020 at 02:09:21PM +0530, Srikar Dronamraju wrote:
> * Gautham R Shenoy <ego@linux.vnet.ibm.com> [2020-12-08 23:26:47]:
>
> > > The drawback of this is even if cpus 0,2,4,6 are released L1 cache will not
> > > be released. Is this as expected?
> >
> > cacheinfo populates the cache->shared_cpu_map on the basis of which
> > CPUs share the common device-tree node for a particular cache. There
> > is one l1-cache object in the device-tree for a CPU node corresponding
> > to a big-core. That the L1 is further split between the threads of the
> > core is shown using ibm,thread-groups.
> >
>
> Yes.
>
> > The ideal thing would be to add a "group_leader" field to "struct
> > cache" so that we can create separate cache objects , one per thread
> > group. I will take a stab at this in the v2.
> >
>
> I am not saying this needs to be done immediately. We could add a TODO and
> get it done later. Your patch is not making it worse. Its just that there is
> still something more left to be done.
Yeah, it needs to be fixed but it may not be a 5.11 target. For now I
will fix this patch to take care of the build errors on !PPC64 !SMT
configs. I will post a separate series for making cacheinfo.c aware of
thread-groups at the time of construction of the cache-chain.
>
> --
> Thanks and Regards
> Srikar Dronamraju
^ permalink raw reply
* Re: [PATCH 2/3] powerpc/smp: Add support detecting thread-groups sharing L2 cache
From: Srikar Dronamraju @ 2020-12-09 9:14 UTC (permalink / raw)
To: Gautham R Shenoy
Cc: Nathan Lynch, Michael Neuling, Vaidyanathan Srinivasan,
Peter Zijlstra, linux-kernel, Nicholas Piggin, linuxppc-dev,
Valentin Schneider
In-Reply-To: <20201208174237.GB14206@in.ibm.com>
* Gautham R Shenoy <ego@linux.vnet.ibm.com> [2020-12-08 23:12:37]:
>
> > For L2 we have thread_group_l2_cache_map to store the tasks from the thread
> > group. but cpu_l2_cache_map for keeping track of tasks.
>
> >
> > I think we should do some renaming to keep the names consistent.
> > I would say probably say move the current cpu_l2_cache_map to
> > cpu_llc_cache_map and move the new aka thread_group_l2_cache_map as
> > cpu_l2_cache_map to be somewhat consistent.
>
> Hmm.. cpu_llc_cache_map is still very generic. We want to have
> something that defines l2 map.
>
> I agree that we need to keep it consistent. How about renaming
> cpu_l1_cache_map to thread_groups_l1_cache_map ?
>
> That way thread_groups_l1_cache_map and thread_groups_l2_cache_map
> refer to the corresponding L1 and L2 siblings as discovered from
> ibm,thread-groups property.
I am fine with this.
> > > +
> > > + for_each_possible_cpu(cpu) {
> > > + int err = init_cpu_cache_map(cpu, THREAD_GROUP_SHARE_L2);
> > > +
> > > + if (err)
> > > + return err;
> > > + }
> > > +
> > > + thread_group_shares_l2 = true;
> >
> > Why do we need a separate loop. Why cant we merge this in the above loop
> > itself?
>
> No, there are platforms where one THREAD_GROUP_SHARE_L1 exists while
> THREAD_GROUP_SHARE_L2 doesn't exist. It becomes easier if these are
> separately tracked. Also, what do we gain if we put this in the same
> loop? It will be (nr_possible_cpus * 2 * invocations of
> init_cpu_cache_map()) as opposed to 2 * (nr_possible_cpus *
> invocations of init_cpu_cache_map()). Isn't it ?
>
Its not about the number of invocations but per-cpu thread group list
that would need not be loaded again. Currently they would probably be in the
cache-line, but get dropped to be loaded again in the next loop.
And we still can support platforms with only THREAD_GROUP_SHARE_L1 since
parse_thread_groups would have given us how many levels of thread groups are
supported on a platform.
> >
> > > + pr_info("Thread-groups in a core share L2-cache\n");
> >
> > Can this be moved to a pr_debug? Does it help any regular user/admins to
> > know if thread-groups shared l2 cache. Infact it may confuse users on what
> > thread groups are and which thread groups dont share cache.
> > I would prefer some other name than thread_group_shares_l2 but dont know any
> > better alternatives and may be my choices are even worse.
>
> Would you be ok with "L2 cache shared by threads of the small core" ?
Sounds better to me. I would still think pr_debug is better since regular
Admins/users may not make too much information from this.
>
> >
> > Ah this can be simplified to:
> > if (thread_group_shares_l2) {
> > cpumask_set_cpu(cpu, cpu_l2_cache_mask(cpu));
> >
> > for_each_cpu(i, per_cpu(thread_group_l2_cache_map, cpu)) {
> > if (cpu_online(i))
> > set_cpus_related(i, cpu, cpu_l2_cache_mask);
> > }
>
> Don't we want to enforce that the siblings sharing L1 be a subset of
> the siblings sharing L2 ? Or do you recommend putting in a check for
> that somewhere ?
>
I didnt think about the case where the device-tree could show L2 to be a
subset of L1.
How about initializing thread_group_l2_cache_map itself with
cpu_l1_cache_map. It would be a simple one time operation and reduce the
overhead here every CPU online.
And it would help in your subsequent patch too. We dont want the cacheinfo
for L1 showing CPUs not present in L2.
--
Thanks and Regards
Srikar Dronamraju
^ permalink raw reply
* Re: linux-next: build warning after merge of the akpm tree
From: Stephen Rothwell @ 2020-12-09 10:33 UTC (permalink / raw)
To: Andrew Morton
Cc: Kees Cook, Mathieu Malaterre, Linux Kernel Mailing List,
Nicholas Piggin, Linux Next Mailing List, PowerPC
In-Reply-To: <20201208230157.42c42789@canb.auug.org.au>
[-- Attachment #1: Type: text/plain, Size: 1232 bytes --]
Hi all,
On Tue, 8 Dec 2020 23:01:57 +1100 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> I will try the following patch tomorrow:
>
> From: Stephen Rothwell <sfr@canb.auug.org.au>
> Date: Tue, 8 Dec 2020 22:58:24 +1100
> Subject: [PATCH] powerpc: Add .data..Lubsan_data*/.data..Lubsan_type* sections explicitly
>
> Similarly to commit
>
> beba24ac5913 ("powerpc/32: Add .data..Lubsan_data*/.data..Lubsan_type* sections explicitly")
>
> since CONFIG_UBSAN bits can now be enabled for all*config.
>
> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
> ---
> arch/powerpc/kernel/vmlinux.lds.S | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
> index 3b4c26e94328..0318ba436f34 100644
> --- a/arch/powerpc/kernel/vmlinux.lds.S
> +++ b/arch/powerpc/kernel/vmlinux.lds.S
> @@ -296,6 +296,10 @@ SECTIONS
> #else
> .data : AT(ADDR(.data) - LOAD_OFFSET) {
> DATA_DATA
> +#ifdef CONFIG_UBSAN
> + *(.data..Lubsan_data*)
> + *(.data..Lubsan_type*)
> +#endif
> *(.data.rel*)
> *(.toc1)
> *(.branch_lt)
> --
> 2.29.2
This got rid of all the warnings.
--
Cheers,
Stephen Rothwell
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply
* Re: [PATCH v1 2/6] powerpc/8xx: Always pin kernel text TLB
From: Michael Ellerman @ 2020-12-09 10:43 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <203b89de491e1379f1677a2685211b7c32adfff0.1606231483.git.christophe.leroy@csgroup.eu>
Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> There is no big poing in not pinning kernel text anymore, as now
> we can keep pinned TLB even with things like DEBUG_PAGEALLOC.
>
> Remove CONFIG_PIN_TLB_TEXT, making it always right.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
> arch/powerpc/Kconfig | 3 +--
> arch/powerpc/kernel/head_8xx.S | 20 +++-----------------
> arch/powerpc/mm/nohash/8xx.c | 3 +--
> arch/powerpc/platforms/8xx/Kconfig | 7 -------
> 4 files changed, 5 insertions(+), 28 deletions(-)
>
...
> diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
> index 231ca95f9ffb..19a3eec1d8c5 100644
> --- a/arch/powerpc/mm/nohash/8xx.c
> +++ b/arch/powerpc/mm/nohash/8xx.c
> @@ -186,8 +186,7 @@ void mmu_mark_initmem_nx(void)
> mmu_mapin_ram_chunk(0, boundary, PAGE_KERNEL_TEXT, false);
> mmu_mapin_ram_chunk(boundary, einittext8, PAGE_KERNEL, false);
>
> - if (IS_ENABLED(CONFIG_PIN_TLB_TEXT))
> - mmu_pin_tlb(block_mapped_ram, false);
> + mmu_pin_tlb(block_mapped_ram, false);
> }
This broke mpc885_ads_defconfig with:
ld: arch/powerpc/mm/nohash/8xx.o: in function `mmu_mark_initmem_nx':
/home/michael/linux/arch/powerpc/mm/nohash/8xx.c:189: undefined reference to `mmu_pin_tlb'
make[1]: *** [/home/michael/linux/Makefile:1164: vmlinux] Error 1
make: *** [Makefile:185: __sub-make] Error 2
Fixed by:
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 35707e86c5f3..52702f3db6df 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -702,7 +702,6 @@ FixupDAR:/* Entry point for dcbx workaround. */
mtspr SPRN_DER, r8
blr
-#ifdef CONFIG_PIN_TLB
_GLOBAL(mmu_pin_tlb)
lis r9, (1f - PAGE_OFFSET)@h
ori r9, r9, (1f - PAGE_OFFSET)@l
@@ -802,7 +801,6 @@ _GLOBAL(mmu_pin_tlb)
mtspr SPRN_SRR1, r10
mtspr SPRN_SRR0, r11
rfi
-#endif /* CONFIG_PIN_TLB */
/*
* We put a few things here that have to be page-aligned.
cheers
^ permalink raw reply related
* Re: [PATCH] drivers: usb: gadget: prefer pr_*() functions over raw printk()
From: Enrico Weigelt, metux IT consult @ 2020-12-09 11:11 UTC (permalink / raw)
To: Laurent Pinchart, Enrico Weigelt, metux IT consult
Cc: balbi, linux-usb, linuxppc-dev, linux-kernel, leoyang.li
In-Reply-To: <X8+howyVRiTR9gv/@pendragon.ideasonboard.com>
On 08.12.20 16:54, Laurent Pinchart wrote:
Hi,
>> diff --git a/drivers/usb/gadget/udc/atmel_usba_udc.c b/drivers/usb/gadget/udc/atmel_usba_udc.c
>> index 2b893bceea45..4834fafb3f70 100644
>> --- a/drivers/usb/gadget/udc/atmel_usba_udc.c
>> +++ b/drivers/usb/gadget/udc/atmel_usba_udc.c
>> @@ -1573,7 +1573,7 @@ static void usba_control_irq(struct usba_udc *udc, struct usba_ep *ep)
>> * generate or receive a reply right away. */
>> usba_ep_writel(ep, CLR_STA, USBA_RX_SETUP);
>>
>> - /* printk(KERN_DEBUG "setup: %d: %02x.%02x\n",
>> + /* pr_debug("setup: %d: %02x.%02x\n",
>> ep->state, crq.crq.bRequestType,
>> crq.crq.bRequest); */
>
> I wonder if this shouldn't be dropped instead, commented-out code isn't
> very useful.
Indeed. Shall I send a separate patch for that ?
> When a pointer to a struct device is available, dev_err() would be much
> better. That's however out of scope for this patch, but it would be nice
> to address it. This would become
>
> dev_err(&pdev->dev, "Check IRQ setup!\n");
>
You're right. I didn't check for that yet. I'll do it in a separate
patch.
--mtx
--
---
Hinweis: unverschlüsselte E-Mails können leicht abgehört und manipuliert
werden ! Für eine vertrauliche Kommunikation senden Sie bitte ihren
GPG/PGP-Schlüssel zu.
---
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287
^ permalink raw reply
* Re: [PATCH] drivers: usb: gadget: prefer pr_*() functions over raw printk()
From: Laurent Pinchart @ 2020-12-09 11:27 UTC (permalink / raw)
To: Enrico Weigelt, metux IT consult
Cc: balbi, linux-usb, linuxppc-dev, linux-kernel, leoyang.li
In-Reply-To: <9aaa06ad-0bd8-486d-b16b-66927d57cf96@metux.net>
Hi Enrico,
On Wed, Dec 09, 2020 at 12:11:36PM +0100, Enrico Weigelt, metux IT consult wrote:
> On 08.12.20 16:54, Laurent Pinchart wrote:
> >> diff --git a/drivers/usb/gadget/udc/atmel_usba_udc.c b/drivers/usb/gadget/udc/atmel_usba_udc.c
> >> index 2b893bceea45..4834fafb3f70 100644
> >> --- a/drivers/usb/gadget/udc/atmel_usba_udc.c
> >> +++ b/drivers/usb/gadget/udc/atmel_usba_udc.c
> >> @@ -1573,7 +1573,7 @@ static void usba_control_irq(struct usba_udc *udc, struct usba_ep *ep)
> >> * generate or receive a reply right away. */
> >> usba_ep_writel(ep, CLR_STA, USBA_RX_SETUP);
> >>
> >> - /* printk(KERN_DEBUG "setup: %d: %02x.%02x\n",
> >> + /* pr_debug("setup: %d: %02x.%02x\n",
> >> ep->state, crq.crq.bRequestType,
> >> crq.crq.bRequest); */
> >
> > I wonder if this shouldn't be dropped instead, commented-out code isn't
> > very useful.
>
> Indeed. Shall I send a separate patch for that ?
Yes, that would make sense.
> > When a pointer to a struct device is available, dev_err() would be much
> > better. That's however out of scope for this patch, but it would be nice
> > to address it. This would become
> >
> > dev_err(&pdev->dev, "Check IRQ setup!\n");
> >
>
> You're right. I didn't check for that yet. I'll do it in a separate
> patch.
As most of the files touched by this patch are device drivers, dev_*()
functions should be used instead of pr_*() where possible. I'd recommend
a first patch that converts to dev_*(), and then a second patch that
converts the remaining printk()s, if any, to pr_*() in the contexts
where no struct device is available or can easily be made available.
--
Regards,
Laurent Pinchart
^ permalink raw reply
* Re: [PATCH v1 2/6] powerpc/8xx: Always pin kernel text TLB
From: Christophe Leroy @ 2020-12-09 11:50 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <87lfe7s1j3.fsf@mpe.ellerman.id.au>
Le 09/12/2020 à 11:43, Michael Ellerman a écrit :
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> There is no big poing in not pinning kernel text anymore, as now
>> we can keep pinned TLB even with things like DEBUG_PAGEALLOC.
>>
>> Remove CONFIG_PIN_TLB_TEXT, making it always right.
>>
>> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
>> ---
>> arch/powerpc/Kconfig | 3 +--
>> arch/powerpc/kernel/head_8xx.S | 20 +++-----------------
>> arch/powerpc/mm/nohash/8xx.c | 3 +--
>> arch/powerpc/platforms/8xx/Kconfig | 7 -------
>> 4 files changed, 5 insertions(+), 28 deletions(-)
>>
> ...
>> diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
>> index 231ca95f9ffb..19a3eec1d8c5 100644
>> --- a/arch/powerpc/mm/nohash/8xx.c
>> +++ b/arch/powerpc/mm/nohash/8xx.c
>> @@ -186,8 +186,7 @@ void mmu_mark_initmem_nx(void)
>> mmu_mapin_ram_chunk(0, boundary, PAGE_KERNEL_TEXT, false);
>> mmu_mapin_ram_chunk(boundary, einittext8, PAGE_KERNEL, false);
>>
>> - if (IS_ENABLED(CONFIG_PIN_TLB_TEXT))
>> - mmu_pin_tlb(block_mapped_ram, false);
>> + mmu_pin_tlb(block_mapped_ram, false);
>> }
>
> This broke mpc885_ads_defconfig with:
:surprise:
How did I get it working ? Anyway, thanks for fixing it.
Christophe
>
> ld: arch/powerpc/mm/nohash/8xx.o: in function `mmu_mark_initmem_nx':
> /home/michael/linux/arch/powerpc/mm/nohash/8xx.c:189: undefined reference to `mmu_pin_tlb'
> make[1]: *** [/home/michael/linux/Makefile:1164: vmlinux] Error 1
> make: *** [Makefile:185: __sub-make] Error 2
>
> Fixed by:
>
> diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
> index 35707e86c5f3..52702f3db6df 100644
> --- a/arch/powerpc/kernel/head_8xx.S
> +++ b/arch/powerpc/kernel/head_8xx.S
> @@ -702,7 +702,6 @@ FixupDAR:/* Entry point for dcbx workaround. */
> mtspr SPRN_DER, r8
> blr
>
> -#ifdef CONFIG_PIN_TLB
> _GLOBAL(mmu_pin_tlb)
> lis r9, (1f - PAGE_OFFSET)@h
> ori r9, r9, (1f - PAGE_OFFSET)@l
> @@ -802,7 +801,6 @@ _GLOBAL(mmu_pin_tlb)
> mtspr SPRN_SRR1, r10
> mtspr SPRN_SRR0, r11
> rfi
> -#endif /* CONFIG_PIN_TLB */
>
> /*
> * We put a few things here that have to be page-aligned.
>
>
> cheers
>
^ permalink raw reply
* Re: [PATCH v6 0/5] PCI: Unify ECAM constants in native PCI Express drivers
From: Bjorn Helgaas @ 2020-12-09 12:36 UTC (permalink / raw)
To: Michael Walle
Cc: kw, heiko, shawn.lin, paulus, thomas.petazzoni, jonnyc, toan,
will, robh, lorenzo.pieralisi, michal.simek, linux-rockchip,
bcm-kernel-feedback-list, linux-arm-kernel, linux-pci, rjui,
f.fainelli, linux-rpi-kernel, Jonathan.Cameron, bhelgaas,
jonathan.derrick, sbranden, wangzhou1, rrichter, linuxppc-dev,
nsaenzjulienne
In-Reply-To: <20201208154150.20978-1-michael@walle.cc>
On Tue, Dec 08, 2020 at 04:41:50PM +0100, Michael Walle wrote:
> >On Sun, 29 Nov 2020 23:07:38 +0000, Krzysztof Wilczyński wrote:
> >> Unify ECAM-related constants into a single set of standard constants
> >> defining memory address shift values for the byte-level address that can
> >> be used when accessing the PCI Express Configuration Space, and then
> >> move native PCI Express controller drivers to use newly introduced
> >> definitions retiring any driver-specific ones.
> >>
> >> The ECAM ("Enhanced Configuration Access Mechanism") is defined by the
> >> PCI Express specification (see PCI Express Base Specification, Revision
> >> 5.0, Version 1.0, Section 7.2.2, p. 676), thus most hardware should
> >> implement it the same way.
> >>
> >> [...]
> >
> >Applied to pci/ecam, thanks!
> >
> >[1/5] PCI: Unify ECAM constants in native PCI Express drivers
> > https://git.kernel.org/lpieralisi/pci/c/f3c07cf692
> Patch 1/5 breaks LS1028A boards:
>
> [..]
> [ 1.144426] pci-host-generic 1f0000000.pcie: host bridge /soc/pcie@1f0000000 ranges:
> [ 1.152276] pci-host-generic 1f0000000.pcie: MEM 0x01f8000000..0x01f815ffff -> 0x0000000000
> [ 1.161161] pci-host-generic 1f0000000.pcie: MEM 0x01f8160000..0x01f81cffff -> 0x0000000000
> [ 1.170043] pci-host-generic 1f0000000.pcie: MEM 0x01f81d0000..0x01f81effff -> 0x0000000000
> [ 1.178924] pci-host-generic 1f0000000.pcie: MEM 0x01f81f0000..0x01f820ffff -> 0x0000000000
> [ 1.187805] pci-host-generic 1f0000000.pcie: MEM 0x01f8210000..0x01f822ffff -> 0x0000000000
> [ 1.196686] pci-host-generic 1f0000000.pcie: MEM 0x01f8230000..0x01f824ffff -> 0x0000000000
> [ 1.205562] pci-host-generic 1f0000000.pcie: MEM 0x01fc000000..0x01fc3fffff -> 0x0000000000
Can you attach your DT? The fact that all these windows map to PCI
bus address 0 looks broken. Prior to patch 1/5, do the devices below
this bridge actually work?
Looks like you're using the pci-host-generic driver; which of the
.compatible strings (pci-host-cam-generic, pci-host-ecam-generic,
marvell,armada8k-pcie-ecam, etc) are you using? (I think that's in
the DT as well.)
> [ 1.214465] pci-host-generic 1f0000000.pcie: ECAM at [mem 0x1f0000000-0x1f00fffff] for [bus 00]
> [ 1.223318] pci-host-generic 1f0000000.pcie: PCI host bridge to bus 0000:00
> [ 1.230350] pci_bus 0000:00: root bus resource [bus 00]
> [ 1.235625] pci_bus 0000:00: root bus resource [mem 0x1f8000000-0x1f815ffff] (bus address [0x00000000-0x0015ffff])
> [ 1.246077] pci_bus 0000:00: root bus resource [mem 0x1f8160000-0x1f81cffff pref] (bus address [0x00000000-0x0006ffff])
> [ 1.256969] pci_bus 0000:00: root bus resource [mem 0x1f81d0000-0x1f81effff] (bus address [0x00000000-0x0001ffff])
> [ 1.267427] pci_bus 0000:00: root bus resource [mem 0x1f81f0000-0x1f820ffff pref] (bus address [0x00000000-0x0001ffff])
> [ 1.278326] pci_bus 0000:00: root bus resource [mem 0x1f8210000-0x1f822ffff] (bus address [0x00000000-0x0001ffff])
> [ 1.288779] pci_bus 0000:00: root bus resource [mem 0x1f8230000-0x1f824ffff pref] (bus address [0x00000000-0x0001ffff])
> [ 1.299669] pci_bus 0000:00: root bus resource [mem 0x1fc000000-0x1fc3fffff] (bus address [0x00000000-0x003fffff])
> [ 1.310138] pci 0000:00:00.0: [1957:e100] type 00 class 0x020001
> [ 1.316234] pci 0000:00:00.0: BAR 0: [mem 0x1f8000000-0x1f803ffff 64bit] (from Enhanced Allocation, properties 0x0)
> [ 1.326776] pci 0000:00:00.0: BAR 2: [mem 0x1f8160000-0x1f816ffff 64bit pref] (from Enhanced Allocation, properties 0x1)
> [ 1.337759] pci 0000:00:00.0: VF BAR 0: [mem 0x1f81d0000-0x1f81dffff 64bit] (from Enhanced Allocation, properties 0x4)
> [ 1.348563] pci 0000:00:00.0: VF BAR 2: [mem 0x1f81f0000-0x1f81fffff 64bit pref] (from Enhanced Allocation, properties 0x3)
> [ 1.359821] pci 0000:00:00.0: PME# supported from D0 D3hot
> [ 1.365368] pci 0000:00:00.0: VF(n) BAR0 space: [mem 0x1f81d0000-0x1f81effff 64bit] (contains BAR0 for 2 VFs)
> [ 1.375381] pci 0000:00:00.0: VF(n) BAR2 space: [mem 0x1f81f0000-0x1f820ffff 64bit pref] (contains BAR2 for 2 VFs)
> [ 1.385983] Unable to handle kernel paging request at virtual address ffff800012132000
If ffff800012132000 were an actual ECAM address, we would expect the
low 20 bits to contain the device number, function number, and
config register offset, i.e.,
dev (0xffff800012132000 >> 15) & 0x01f = 0x6
fn (0xffff800012132000 >> 12) & 0x007 = 0x2
reg (0xffff800012132000) & 0xfff = 0
but that's non-sensical since we probe for devices in order. So maybe
this is a bad pointer somewhere else. I looked at pci_ecam_map_bus()
but didn't see an obvious problem. Maybe we could brute-force debug
this by adding some printks there.
> [ 1.393972] Mem abort info:
> [ 1.396783] ESR = 0x96000007
> [ 1.399859] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 1.405215] SET = 0, FnV = 0
> [ 1.408290] EA = 0, S1PTW = 0
> [ 1.411453] Data abort info:
> [ 1.414352] ISV = 0, ISS = 0x00000007
> [ 1.418216] CM = 0, WnR = 0
> [ 1.421205] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000000008369c000
> [ 1.427966] [ffff800012132000] pgd=00000020fffff003, p4d=00000020fffff003, pud=00000020ffffe003, pmd=00000020ffffa003, pte=0000000000000000
> [ 1.440618] Internal error: Oops: 96000007 [#1] PREEMPT SMP
> [ 1.446239] Modules linked in:
> [ 1.449320] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.10.0-rc3-00101-g2f378db5c89 #191
> [ 1.457484] Hardware name: Kontron SMARC-sAL28 (Single PHY) on SMARC Eval 2.0 carrier (DT)
> [ 1.465827] pstate: 20000085 (nzCv daIf -PAN -UAO -TCO BTYPE=--)
> [ 1.471892] pc : pci_generic_config_read+0x38/0xe0
> [ 1.476723] lr : pci_generic_config_read+0x24/0xe0
> [ 1.481553] sp : ffff80001211b920
> [ 1.484891] x29: ffff80001211b920 x28: 0000000000000000
> [ 1.490252] x27: ffff8000116a04bc x26: 0000000000000000
> [ 1.495612] x25: 0000000000000001 x24: ffff80001211ba54
> [ 1.500972] x23: ffff0020009c3800 x22: 0000000000000000
> [ 1.506332] x21: 0000000000000087 x20: ffff80001211b994
> [ 1.511692] x19: 0000000000000004 x18: 0000000000000000
> [ 1.517052] x17: 0000000000000000 x16: 00000000d5edfbc1
> [ 1.522412] x15: ffffffffffffffff x14: ffff800011cf9948
> [ 1.527772] x13: ffff002000305a1c x12: 0000000000000030
> [ 1.533132] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
> [ 1.538491] x9 : 2c6b7173626d686f x8 : 000000000000ea60
> [ 1.543851] x7 : ffff80001211ba54 x6 : 0000000000000000
> [ 1.549211] x5 : 0000000000000000 x4 : ffff800012131000
> [ 1.554570] x3 : 0000000000000000 x2 : 0000000000000000
> [ 1.559930] x1 : 0000000000001000 x0 : ffff800012132000
> [ 1.565290] Call trace:
> [ 1.567752] pci_generic_config_read+0x38/0xe0
> [ 1.572233] pci_bus_read_config_dword+0x84/0xd8
> [ 1.576890] pci_bus_generic_read_dev_vendor_id+0x34/0x1b0
> [ 1.582423] pci_bus_read_dev_vendor_id+0x4c/0x70
> [ 1.587167] pci_scan_single_device+0x84/0xe0
> [ 1.591559] pci_scan_slot+0x6c/0x120
> [ 1.595250] pci_scan_child_bus_extend+0x54/0x298
> [ 1.599994] pci_scan_root_bus_bridge+0xd4/0xf0
> [ 1.604562] pci_host_probe+0x18/0xb0
> [ 1.608254] pci_host_common_probe+0x13c/0x1a0
> [ 1.612735] platform_drv_probe+0x54/0xa8
> [ 1.616777] really_probe+0xe4/0x3b8
> [ 1.620380] driver_probe_device+0x58/0xb8
> [ 1.624509] device_driver_attach+0x74/0x80
> [ 1.628725] __driver_attach+0x58/0xe0
> [ 1.632503] bus_for_each_dev+0x74/0xc8
> [ 1.636369] driver_attach+0x24/0x30
> [ 1.639972] bus_add_driver+0x18c/0x1f0
> [ 1.643838] driver_register+0x64/0x120
> [ 1.647704] __platform_driver_register+0x48/0x58
> [ 1.652449] gen_pci_driver_init+0x1c/0x28
> [ 1.656580] do_one_initcall+0x4c/0x2c0
> [ 1.660447] kernel_init_freeable+0x1e4/0x250
> [ 1.664840] kernel_init+0x14/0x118
> [ 1.668355] ret_from_fork+0x10/0x34
> [ 1.671961] Code: 7100067f 540001c0 71000a7f 54000300 (b9400001)
> [ 1.678114] ---[ end trace 0aca1b048661e8b3 ]---
> [ 1.682770] note: swapper/0[1] exited with preempt_count 1
> [ 1.688305] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> [ 1.696031] SMP: stopping secondary CPUs
> [ 1.699989] Kernel Offset: disabled
> [ 1.703503] CPU features: 0x0240022,61006008
> [ 1.707806] Memory Limit: none
> [ 1.710884] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
>
> There is a LS1028A eval board in kernelci here:
> https://lavalab.nxp.com/scheduler/job/170566
>
> I actually have this board which also have a LS1028A SoC:
> https://lavalab.kontron.com/scheduler/job/1771
>
> But in the latter you won't see much because earlycon isn't active. [I'm
> about to fix that.]
>
> By reverting patch 1/5, the board will work again.
>
> -michael
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox