* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Segher Boessenkool @ 2020-10-23 18:27 UTC (permalink / raw)
To: Al Viro
Cc: linux-aio@kvack.org, David Hildenbrand,
linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
Christoph Hellwig, linux-arch@vger.kernel.org,
linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
kernel-team@android.com, Arnd Bergmann,
linux-block@vger.kernel.org, io-uring@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, Jens Axboe,
linux-parisc@vger.kernel.org, 'Greg KH', Nick Desaulniers,
linux-kernel@vger.kernel.org,
linux-security-module@vger.kernel.org, David Laight,
netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Andrew Morton, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201023175857.GA3576660@ZenIV.linux.org.uk>
On Fri, Oct 23, 2020 at 06:58:57PM +0100, Al Viro wrote:
> On Fri, Oct 23, 2020 at 03:09:30PM +0200, David Hildenbrand wrote:
>
> > Now, I am not a compiler expert, but as I already cited, at least on
> > x86-64 clang expects that the high bits were cleared by the caller - in
> > contrast to gcc. I suspect it's the same on arm64, but again, I am no
> > compiler expert.
> >
> > If what I said and cites for x86-64 is correct, if the function expects
> > an "unsigned int", it will happily use 64bit operations without further
> > checks where valid when assuming high bits are zero. That's why even
> > converting everything to "unsigned int" as proposed by me won't work on
> > clang - it assumes high bits are zero (as indicated by Nick).
> >
> > As I am neither a compiler experts (did I mention that already? ;) ) nor
> > an arm64 experts, I can't tell if this is a compiler BUG or not.
>
> On arm64 when callee expects a 32bit argument, the caller is *not* responsible
> for clearing the upper half of 64bit register used to pass the value - it only
> needs to store the actual value into the lower half. The callee must consider
> the contents of the upper half of that register as undefined. See AAPCS64 (e.g.
> https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#parameter-passing-rules
> ); AFAICS, the relevant bit is
> "Unlike in the 32-bit AAPCS, named integral values must be narrowed by
> the callee rather than the caller."
Or the formal rule:
C.9 If the argument is an Integral or Pointer Type, the size of the
argument is less than or equal to 8 bytes and the NGRN is less
than 8, the argument is copied to the least significant bits in
x[NGRN]. The NGRN is incremented by one. The argument has now
been allocated.
Segher
^ permalink raw reply
* Re: [PATCH] treewide: Convert macro and uses of __section(foo) to __section("foo")
From: Miguel Ojeda @ 2020-10-23 20:03 UTC (permalink / raw)
To: Joe Perches
Cc: clang-built-linux, Nick Desaulniers, Linus Torvalds, linux-kernel,
linuxppc-dev
In-Reply-To: <64b49cd3680f45808dad286b408e7b196c31ec79.camel@perches.com>
On Fri, Oct 23, 2020 at 10:03 AM Joe Perches <joe@perches.com> wrote:
>
> Thanks Miguel, but IMO it doesn't need time in next.
You're welcome! It never hurts to keep things for a bit there.
> Applying it just before an rc1 minimizes conflicts.
There shouldn't be many conflicts after -rc1. The amount of changes is
reasonable too, so no need to apply the script directly. In any case,
if you prefer that Linus picks it up himself right away for this -rc1,
it looks good to me (with the caveat that it isn't tested):
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Cheers,
Miguel
^ permalink raw reply
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight @ 2020-10-23 21:28 UTC (permalink / raw)
To: 'Segher Boessenkool', Al Viro
Cc: linux-aio@kvack.org, David Hildenbrand,
linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
Christoph Hellwig, linux-arch@vger.kernel.org,
linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
kernel-team@android.com, Arnd Bergmann,
linux-block@vger.kernel.org, io-uring@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, Jens Axboe,
linux-parisc@vger.kernel.org, 'Greg KH', Nick Desaulniers,
linux-kernel@vger.kernel.org,
linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
linux-fsdevel@vger.kernel.org, Andrew Morton,
linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201023182713.GG2672@gate.crashing.org>
From: Segher Boessenkool
> Sent: 23 October 2020 19:27
>
> On Fri, Oct 23, 2020 at 06:58:57PM +0100, Al Viro wrote:
> > On Fri, Oct 23, 2020 at 03:09:30PM +0200, David Hildenbrand wrote:
> >
> > > Now, I am not a compiler expert, but as I already cited, at least on
> > > x86-64 clang expects that the high bits were cleared by the caller - in
> > > contrast to gcc. I suspect it's the same on arm64, but again, I am no
> > > compiler expert.
> > >
> > > If what I said and cites for x86-64 is correct, if the function expects
> > > an "unsigned int", it will happily use 64bit operations without further
> > > checks where valid when assuming high bits are zero. That's why even
> > > converting everything to "unsigned int" as proposed by me won't work on
> > > clang - it assumes high bits are zero (as indicated by Nick).
> > >
> > > As I am neither a compiler experts (did I mention that already? ;) ) nor
> > > an arm64 experts, I can't tell if this is a compiler BUG or not.
> >
> > On arm64 when callee expects a 32bit argument, the caller is *not* responsible
> > for clearing the upper half of 64bit register used to pass the value - it only
> > needs to store the actual value into the lower half. The callee must consider
> > the contents of the upper half of that register as undefined. See AAPCS64 (e.g.
> > https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#parameter-passing-rules
> > ); AFAICS, the relevant bit is
> > "Unlike in the 32-bit AAPCS, named integral values must be narrowed by
> > the callee rather than the caller."
>
> Or the formal rule:
>
> C.9 If the argument is an Integral or Pointer Type, the size of the
> argument is less than or equal to 8 bytes and the NGRN is less
> than 8, the argument is copied to the least significant bits in
> x[NGRN]. The NGRN is incremented by one. The argument has now
> been allocated.
So, in essence, if the value is in a 64bit register the calling
code is independent of the actual type of the formal parameter.
Clearly a value might need explicit widening.
I've found a copy of the 64 bit arm instruction set.
Unfortunately it is alpha sorted and repetitive so shows none
of the symmetry and makes things difficult to find.
But, contrary to what someone suggested most register writes
(eg from arithmetic) seem to zero/extend the high bits.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply
* Re: [PATCH] net: ucc_geth: Drop extraneous parentheses in comparison
From: Jakub Kicinski @ 2020-10-24 1:45 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev, leoyang.li, davem, linux-kernel, netdev
In-Reply-To: <20201023033236.3296988-1-mpe@ellerman.id.au>
On Fri, 23 Oct 2020 14:32:36 +1100 Michael Ellerman wrote:
> Clang warns about the extra parentheses in this comparison:
>
> drivers/net/ethernet/freescale/ucc_geth.c:1361:28:
> warning: equality comparison with extraneous parentheses
> if ((ugeth->phy_interface == PHY_INTERFACE_MODE_SGMII))
> ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> It seems clear the intent here is to do a comparison not an
> assignment, so drop the extra parentheses to avoid any confusion.
>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Applied, thanks!
^ permalink raw reply
* Re: [PATCH] net: ucc_geth: Drop extraneous parentheses in comparison
From: patchwork-bot+netdevbpf @ 2020-10-24 1:50 UTC (permalink / raw)
To: Michael Ellerman
Cc: netdev, linux-kernel, leoyang.li, linuxppc-dev, kuba, davem
In-Reply-To: <20201023033236.3296988-1-mpe@ellerman.id.au>
Hello:
This patch was applied to netdev/net.git (refs/heads/master):
On Fri, 23 Oct 2020 14:32:36 +1100 you wrote:
> Clang warns about the extra parentheses in this comparison:
>
> drivers/net/ethernet/freescale/ucc_geth.c:1361:28:
> warning: equality comparison with extraneous parentheses
> if ((ugeth->phy_interface == PHY_INTERFACE_MODE_SGMII))
> ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> [...]
Here is the summary with links:
- net: ucc_geth: Drop extraneous parentheses in comparison
https://git.kernel.org/netdev/net/c/dab234227cbd
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply
* Re: C vdso
From: Michael Ellerman @ 2020-10-24 10:07 UTC (permalink / raw)
To: Christophe Leroy; +Cc: linuxppc-dev@ozlabs.org
In-Reply-To: <877drhxeg8.fsf@mpe.ellerman.id.au>
Michael Ellerman <mpe@ellerman.id.au> writes:
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> Le 24/09/2020 à 15:17, Christophe Leroy a écrit :
>>> Le 17/09/2020 à 14:33, Michael Ellerman a écrit :
>>>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>>>>
>>>>> What is the status with the generic C vdso merge ?
>>>>> In some mail, you mentionned having difficulties getting it working on
>>>>> ppc64, any progress ? What's the problem ? Can I help ?
>>>>
>>>> Yeah sorry I was hoping to get time to work on it but haven't been able
>>>> to.
>>>>
>>>> It's causing crashes on ppc64 ie. big endian.
> ...
>>>
>>> Can you tell what defconfig you are using ? I have been able to setup a full glibc PPC64 cross
>>> compilation chain and been able to test it under QEMU with success, using Nathan's vdsotest tool.
>>
>> What config are you using ?
>
> ppc64_defconfig + guest.config
>
> Or pseries_defconfig.
>
> I'm using Ubuntu GCC 9.3.0 mostly, but it happens with other toolchains too.
I'm also seeing warnings because of the feature fixups:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at arch/powerpc/lib/feature-fixups.c:109 .do_feature_fixups+0x80/0xc0
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc2-00261-g107a86292cc4 #11
NIP: c0000000000a3660 LR: c0000000000a362c CTR: 0000000000000000
REGS: c00000007e3a3790 TRAP: 0700 Not tainted (5.9.0-rc2-00261-g107a86292cc4)
MSR: 8000000002029032 <SF,VEC,EE,ME,IR,DR,RI> CR: 48000422 XER: 00000000
CFAR: c0000000000a3630 IRQMASK: 0
GPR00: c0000000011e8964 c00000007e3a3a20 c000000001bb2b00 0000000000000001
GPR04: c000000001bc0bc0 c000000001bc0bf0 0000000066736574 00000000fffffe00
GPR08: 0000000300000004 0000000000000000 0000000000000000 0000000000000db8
GPR12: 0000000028000224 c000000001dc0000 c000000000012d40 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: c000000001aae0f0 c000000001063f08 c000000001063f18 c000000001063f28
GPR28: c00000000106e188 000000eb8f4d91a7 c000000001bc0bf0 c000000001bc0bc0
NIP [c0000000000a3660] .do_feature_fixups+0x80/0xc0
LR [c0000000000a362c] .do_feature_fixups+0x4c/0xc0
Call Trace:
[c00000007e3a3a20] [c0000000000a362c] .do_feature_fixups+0x4c/0xc0 (unreliable)
[c00000007e3a3ab0] [c0000000011e8964] .vdso_init+0x498/0x95c
[c00000007e3a3bd0] [c000000000012780] .do_one_initcall+0x60/0x2b8
[c00000007e3a3cb0] [c0000000011e4d8c] .kernel_init_freeable+0x2d8/0x370
[c00000007e3a3da0] [c000000000012d64] .kernel_init+0x24/0x150
[c00000007e3a3e20] [c00000000000e24c] .ret_from_kernel_thread+0x58/0x6c
Instruction dump:
40820030 3bff0030 7c3ef840 4181ffe4 38210090 e8010010 eb81ffe0 eba1ffe8
ebc1fff0 7c0803a6 ebe1fff8 4e800020 <0fe00000> e8ff0028 e8df0020 7f83e378
---[ end trace ece1c957ca5bd6e9 ]---
Unable to patch feature section at bffffffd01bc0bbc - c000000001bc0bc0 with bffffd9101bc0958 - bfffffe501bc0ba4
That's happening because the 32-bit VDSO is built with CONFIG_PPC32=y,
due to config-fake32.h, and that causes the feature fixup entries to be
the wrong size.
See the logic in feature-fixup.h:
#if defined(CONFIG_PPC64) && !defined(__powerpc64__)
/* 64 bits kernel, 32 bits code (ie. vdso32) */
#define FTR_ENTRY_LONG .8byte
#define FTR_ENTRY_OFFSET .long 0xffffffff; .long
#elif defined(CONFIG_PPC64)
#define FTR_ENTRY_LONG .8byte
#define FTR_ENTRY_OFFSET .8byte
#else
#define FTR_ENTRY_LONG .long
#define FTR_ENTRY_OFFSET .long
#endif
We expect the fixup entries to still use 64-bit values, even for the
32-bit VDSO in a 64-bit kernel.
TBH I'm not sure how config-fake32.h can work long term, it's so fragile
to be defining/redefining a handful of CONFIG symbols like that.
The generic VDSO code is fairly careful to only include uapi and vdso
headers, not linux ones. So I think we need to better split our headers
so that we can build the VDSO code with few or no linux headers, and so
avoid the need to define any (or most) CONFIG symbols.
cheers
^ permalink raw reply
* Re: [PATCH v4] powerpc/pseries: Avoid using addr_to_pfn in real mode
From: Michael Ellerman @ 2020-10-24 10:27 UTC (permalink / raw)
To: linuxppc-dev, mpe, Ganesh Goudar; +Cc: aneesh.kumar, npiggin, mahesh
In-Reply-To: <20200724063946.21378-1-ganeshgr@linux.ibm.com>
On Fri, 24 Jul 2020 12:09:46 +0530, Ganesh Goudar wrote:
> When an UE or memory error exception is encountered the MCE handler
> tries to find the pfn using addr_to_pfn() which takes effective
> address as an argument, later pfn is used to poison the page where
> memory error occurred, recent rework in this area made addr_to_pfn
> to run in real mode, which can be fatal as it may try to access
> memory outside RMO region.
>
> [...]
Applied to powerpc/fixes.
[1/1] powerpc/pseries: Avoid using addr_to_pfn in real mode
https://git.kernel.org/powerpc/c/4ff753feab021242144818b9a3ba011238218145
cheers
^ permalink raw reply
* Re: [PATCH v2 1/3] powerpc/uaccess: Don't use "m<>" constraint with GCC 4.9
From: Michael Ellerman @ 2020-10-24 10:27 UTC (permalink / raw)
To: Michael Ellerman, Christophe Leroy, Paul Mackerras,
Benjamin Herrenschmidt, mathieu.desnoyers
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <212d3bc4a52ca71523759517bb9c61f7e477c46a.1603179582.git.christophe.leroy@csgroup.eu>
On Tue, 20 Oct 2020 07:40:07 +0000 (UTC), Christophe Leroy wrote:
> GCC 4.9 sometimes fails to build with "m<>" constraint in
> inline assembly.
>
> CC lib/iov_iter.o
> In file included from ./arch/powerpc/include/asm/cmpxchg.h:6:0,
> from ./arch/powerpc/include/asm/atomic.h:11,
> from ./include/linux/atomic.h:7,
> from ./include/linux/crypto.h:15,
> from ./include/crypto/hash.h:11,
> from lib/iov_iter.c:2:
> lib/iov_iter.c: In function 'iovec_from_user.part.30':
> ./arch/powerpc/include/asm/uaccess.h:287:2: error: 'asm' operand has impossible constraints
> __asm__ __volatile__( \
> ^
> ./include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
> # define unlikely(x) __builtin_expect(!!(x), 0)
> ^
> ./arch/powerpc/include/asm/uaccess.h:583:34: note: in expansion of macro 'unsafe_op_wrap'
> #define unsafe_get_user(x, p, e) unsafe_op_wrap(__get_user_allowed(x, p), e)
> ^
> ./arch/powerpc/include/asm/uaccess.h:329:10: note: in expansion of macro '__get_user_asm'
> case 4: __get_user_asm(x, (u32 __user *)ptr, retval, "lwz"); break; \
> ^
> ./arch/powerpc/include/asm/uaccess.h:363:3: note: in expansion of macro '__get_user_size_allowed'
> __get_user_size_allowed(__gu_val, __gu_addr, __gu_size, __gu_err); \
> ^
> ./arch/powerpc/include/asm/uaccess.h:100:2: note: in expansion of macro '__get_user_nocheck'
> __get_user_nocheck((x), (ptr), sizeof(*(ptr)), false)
> ^
> ./arch/powerpc/include/asm/uaccess.h:583:49: note: in expansion of macro '__get_user_allowed'
> #define unsafe_get_user(x, p, e) unsafe_op_wrap(__get_user_allowed(x, p), e)
> ^
> lib/iov_iter.c:1663:3: note: in expansion of macro 'unsafe_get_user'
> unsafe_get_user(len, &uiov[i].iov_len, uaccess_end);
> ^
> make[1]: *** [scripts/Makefile.build:283: lib/iov_iter.o] Error 1
>
> [...]
Patch 1 applied to powerpc/fixes.
[1/3] powerpc/uaccess: Don't use "m<>" constraint with GCC 4.9
https://git.kernel.org/powerpc/c/592bbe9c505d9a0ef69260f8c8263df47da2698e
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/64s: Remove TM from Power10 features
From: Michael Ellerman @ 2020-10-24 10:27 UTC (permalink / raw)
To: linuxppc-dev, Jordan Niethe
In-Reply-To: <20200827035529.900-1-jniethe5@gmail.com>
On Thu, 27 Aug 2020 13:55:29 +1000, Jordan Niethe wrote:
> ISA v3.1 removes transactional memory and hence it should not be present
> in cpu_features or cpu_user_features2. Remove CPU_FTR_TM_COMP from
> CPU_FTRS_POWER10. Remove PPC_FEATURE2_HTM_COMP and
> PPC_FEATURE2_HTM_NOSC_COMP from COMMON_USER2_POWER10.
Applied to powerpc/fixes.
[1/1] powerpc/64s: Remove TM from Power10 features
https://git.kernel.org/powerpc/c/ec613a57fa1d57381f890c3166175fe68cf43f12
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/eeh: Fix eeh_dev_check_failure() for PE#0
From: Michael Ellerman @ 2020-10-24 10:27 UTC (permalink / raw)
To: linuxppc-dev, Oliver O'Halloran
In-Reply-To: <20201021232554.1434687-1-oohall@gmail.com>
On Thu, 22 Oct 2020 10:25:54 +1100, Oliver O'Halloran wrote:
> In commit 269e583357df ("powerpc/eeh: Delete eeh_pe->config_addr") the
> following simplification was made:
>
> - if (!pe->addr && !pe->config_addr) {
> + if (!pe->addr) {
> eeh_stats.no_cfg_addr++;
> return 0;
> }
>
> [...]
Applied to powerpc/fixes.
[1/1] powerpc/eeh: Fix eeh_dev_check_failure() for PE#0
https://git.kernel.org/powerpc/c/99f6e9795a68fe23f96a2b5b0be07a3dd9457f99
cheers
^ permalink raw reply
* [GIT PULL] Please pull powerpc/linux.git powerpc-5.10-2 tag
From: Michael Ellerman @ 2020-10-24 10:50 UTC (permalink / raw)
To: Linus Torvalds
Cc: mikey, srikar, aneesh.kumar, linux-kernel, hegdevasant, ganeshgr,
jniethe5, oohall, linuxppc-dev
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hi Linus,
Please pull powerpc fixes for 5.10:
The following changes since commit ffd0b25ca049a477cb757e5bcf2d5e1664d12e5d:
Revert "powerpc/pci: unmap legacy INTx interrupts when a PHB is removed" (2020-10-15 13:42:49 +1100)
are available in the git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.10-2
for you to fetch changes up to 4ff753feab021242144818b9a3ba011238218145:
powerpc/pseries: Avoid using addr_to_pfn in real mode (2020-10-22 14:34:45 +1100)
- ------------------------------------------------------------------
powerpc fixes for 5.10 #2
A fix for undetected data corruption on Power9 Nimbus <= DD2.1 in the emulation
of VSX loads. The affected CPUs were not widely available.
Two fixes for machine check handling in guests under PowerVM.
A fix for our recent changes to SMP setup, when CONFIG_CPUMASK_OFFSTACK=y.
Three fixes for races in the handling of some of our powernv sysfs attributes.
One change to remove TM from the set of Power10 CPU features.
A couple of other minor fixes.
Thanks to:
Aneesh Kumar K.V, Christophe Leroy, Ganesh Goudar, Jordan Niethe, Mahesh
Salgaonkar, Michael Neuling, Oliver O'Halloran, Qian Cai, Srikar Dronamraju,
Vasant Hegde.
- ------------------------------------------------------------------
Aneesh Kumar K.V (1):
powerpc/opal_elog: Handle multiple writes to ack attribute
Christophe Leroy (1):
powerpc/uaccess: Don't use "m<>" constraint with GCC 4.9
Ganesh Goudar (2):
powerpc/mce: Avoid nmi_enter/exit in real mode on pseries hash
powerpc/pseries: Avoid using addr_to_pfn in real mode
Jordan Niethe (1):
powerpc/64s: Remove TM from Power10 features
Michael Neuling (2):
powerpc: Fix undetected data corruption with P9N DD2.1 VSX CI load emulation
selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load workaround
Oliver O'Halloran (1):
powerpc/eeh: Fix eeh_dev_check_failure() for PE#0
Srikar Dronamraju (2):
powerpc/smp: Remove unnecessary variable
powerpc/smp: Use GFP_ATOMIC while allocating tmp mask
Vasant Hegde (2):
powerpc/powernv/dump: Fix race while processing OPAL dump
powerpc/powernv/dump: Handle multiple writes to ack attribute
arch/powerpc/include/asm/asm-const.h | 13 +++
arch/powerpc/include/asm/cputable.h | 2 +-
arch/powerpc/include/asm/uaccess.h | 4 +-
arch/powerpc/kernel/cputable.c | 13 ++-
arch/powerpc/kernel/eeh.c | 5 -
arch/powerpc/kernel/mce.c | 7 +-
arch/powerpc/kernel/smp.c | 70 ++++++------
arch/powerpc/kernel/traps.c | 2 +-
arch/powerpc/platforms/powernv/opal-dump.c | 52 ++++++---
arch/powerpc/platforms/powernv/opal-elog.c | 11 +-
arch/powerpc/platforms/pseries/ras.c | 118 ++++++++++++--------
tools/testing/selftests/powerpc/alignment/alignment_handler.c | 8 +-
12 files changed, 185 insertions(+), 120 deletions(-)
-----BEGIN PGP SIGNATURE-----
iQIzBAEBCAAdFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl+UBeUACgkQUevqPMjh
pYBHUA//THLt6DJlSPPqn8LQZQGT76Gx82cKyy9DQ7/Elcl13xcuq3XbhHD5asi0
QbJGbLhRqpRhtmj3c8BCYAygi5FXZWH4IeN6FK8xoZGR2bi/gY7VkhIUSzFAHnRi
PFXafzb8eWVS7O5k8xbxrjxdOAu8SjEzywG5I8PPn5IWFwhUwjGosv81QtxJOLVc
V9WwuTBK87nfvoMdfcl3YJXRs+4vKOQQ0Gqa5vTVTUmgdbJOqJi1MvLULnSnKxTJ
G+XplOeSI1N3gk+E2cycPasghTYziTtzEyrTHe0Uufgx+9t6VuN1g2zL81kDA7U9
10Oqqry4Z66V2BhGrDMnXKYGeQNGRO8vNLT2DuuZd5XTN/LpV0knJHm/9F2E+5zl
T+GgQwS0IhXDcbS70TcbxXHPyBe2/eXRH1+rkSlAEjl656JVbKefgi1VUsqSzkjH
JBF2+zCodYelbbnRP89Aj5/03t+VeHbNC/1jixoYDHR7drXiU2XQqjfFZeCHvxOQ
YCznpoC84gcDupGC5q4op3tHBmvULJn0KaHYWryaAEWlCxjdVcjBis48B+GQVww8
JnDMC5WGVvAAxPHc74EkyEvdROx4Q+8UeOj+TXnrRlowEF8Wymxcvy7NUn2Bqq2J
VsRCUzLIReKCckdJQ/+SxG8eb9JUxQRWG76+Q9zzTHbdaBSWuMc=
=9Oxg
-----END PGP SIGNATURE-----
^ permalink raw reply
* Re: C vdso
From: Christophe Leroy @ 2020-10-24 11:16 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev@ozlabs.org
In-Reply-To: <874kmkx7gi.fsf@mpe.ellerman.id.au>
Le 24/10/2020 à 12:07, Michael Ellerman a écrit :
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>> Le 24/09/2020 à 15:17, Christophe Leroy a écrit :
>>>> Le 17/09/2020 à 14:33, Michael Ellerman a écrit :
>>>>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>>>>>
>>>>>> What is the status with the generic C vdso merge ?
>>>>>> In some mail, you mentionned having difficulties getting it working on
>>>>>> ppc64, any progress ? What's the problem ? Can I help ?
>>>>>
>>>>> Yeah sorry I was hoping to get time to work on it but haven't been able
>>>>> to.
>>>>>
>>>>> It's causing crashes on ppc64 ie. big endian.
>> ...
>>>>
>>>> Can you tell what defconfig you are using ? I have been able to setup a full glibc PPC64 cross
>>>> compilation chain and been able to test it under QEMU with success, using Nathan's vdsotest tool.
>>>
>>> What config are you using ?
>>
>> ppc64_defconfig + guest.config
>>
>> Or pseries_defconfig.
>>
>> I'm using Ubuntu GCC 9.3.0 mostly, but it happens with other toolchains too.
>
> I'm also seeing warnings because of the feature fixups:
>
>
> That's happening because the 32-bit VDSO is built with CONFIG_PPC32=y,
> due to config-fake32.h, and that causes the feature fixup entries to be
> the wrong size.
>
> See the logic in feature-fixup.h:
>
>
>
> We expect the fixup entries to still use 64-bit values, even for the
> 32-bit VDSO in a 64-bit kernel.
>
> TBH I'm not sure how config-fake32.h can work long term, it's so fragile
> to be defining/redefining a handful of CONFIG symbols like that.
I took the idea from mips (arch/mips/vdso/config-n32-o32-env.c) after struggling in different
direction. At that time the generic VDSO code was far less careful and was including several linux
headers IIRC.
I agree with you that it's rather fragile.
>
> The generic VDSO code is fairly careful to only include uapi and vdso
> headers, not linux ones. So I think we need to better split our headers
> so that we can build the VDSO code with few or no linux headers, and so
> avoid the need to define any (or most) CONFIG symbols.
>
I'll revisit it when I'm back from vacation (I'm leaving now for two weeks).
Christophe
^ permalink raw reply
* Re: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: Segher Boessenkool @ 2020-10-24 17:29 UTC (permalink / raw)
To: David Laight
Cc: linux-aio@kvack.org, David Hildenbrand,
linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
Christoph Hellwig, linux-arch@vger.kernel.org,
linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
kernel-team@android.com, Arnd Bergmann,
linux-block@vger.kernel.org, Al Viro, io-uring@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, Jens Axboe,
linux-parisc@vger.kernel.org, 'Greg KH', Nick Desaulniers,
linux-kernel@vger.kernel.org,
linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
linux-fsdevel@vger.kernel.org, Andrew Morton,
linuxppc-dev@lists.ozlabs.org
In-Reply-To: <e9a3136ead214186877804aabde74b38@AcuMS.aculab.com>
On Fri, Oct 23, 2020 at 09:28:59PM +0000, David Laight wrote:
> From: Segher Boessenkool
> > Sent: 23 October 2020 19:27
> > On Fri, Oct 23, 2020 at 06:58:57PM +0100, Al Viro wrote:
> > > On Fri, Oct 23, 2020 at 03:09:30PM +0200, David Hildenbrand wrote:
> > > On arm64 when callee expects a 32bit argument, the caller is *not* responsible
> > > for clearing the upper half of 64bit register used to pass the value - it only
> > > needs to store the actual value into the lower half. The callee must consider
> > > the contents of the upper half of that register as undefined. See AAPCS64 (e.g.
> > > https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#parameter-passing-rules
> > > ); AFAICS, the relevant bit is
> > > "Unlike in the 32-bit AAPCS, named integral values must be narrowed by
> > > the callee rather than the caller."
> >
> > Or the formal rule:
> >
> > C.9 If the argument is an Integral or Pointer Type, the size of the
> > argument is less than or equal to 8 bytes and the NGRN is less
> > than 8, the argument is copied to the least significant bits in
> > x[NGRN]. The NGRN is incremented by one. The argument has now
> > been allocated.
>
> So, in essence, if the value is in a 64bit register the calling
> code is independent of the actual type of the formal parameter.
> Clearly a value might need explicit widening.
No, this says that if you pass a 32-bit integer in a 64-bit register,
then the top 32 bits of that register hold an undefined value.
> I've found a copy of the 64 bit arm instruction set.
> Unfortunately it is alpha sorted and repetitive so shows none
> of the symmetry and makes things difficult to find.
All of this is ABI, not ISA. Look at the AAPCS64 pointed to above.
> But, contrary to what someone suggested most register writes
> (eg from arithmetic) seem to zero/extend the high bits.
Everything that writes a "w" does, yes. But that has nothing to do with
the parameter passing rules, that is ABI. It just means that very often
a 32-bit integer will be passed zero-extended in a 64-bit register, but
that is just luck (or not, it makes finding bugs harder ;-) )
Segher
^ permalink raw reply
* Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.10-2 tag
From: pr-tracker-bot @ 2020-10-24 18:11 UTC (permalink / raw)
To: Michael Ellerman
Cc: mikey, srikar, aneesh.kumar, linuxppc-dev, linux-kernel,
hegdevasant, ganeshgr, jniethe5, oohall, Linus Torvalds
In-Reply-To: <871rhnyk2a.fsf@mpe.ellerman.id.au>
The pull request you sent on Sat, 24 Oct 2020 21:50:21 +1100:
> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.10-2
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b6f96e75ae121ead54da3f58c545d68184079f90
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
^ permalink raw reply
* RE: Buggy commit tracked to: "Re: [PATCH 2/9] iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c"
From: David Laight @ 2020-10-24 21:12 UTC (permalink / raw)
To: 'Segher Boessenkool'
Cc: linux-aio@kvack.org, David Hildenbrand,
linux-mips@vger.kernel.org, David Howells, linux-mm@kvack.org,
keyrings@vger.kernel.org, sparclinux@vger.kernel.org,
Christoph Hellwig, linux-arch@vger.kernel.org,
linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org,
kernel-team@android.com, Arnd Bergmann,
linux-block@vger.kernel.org, Al Viro, io-uring@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, Jens Axboe,
linux-parisc@vger.kernel.org, 'Greg KH', Nick Desaulniers,
linux-kernel@vger.kernel.org,
linux-security-module@vger.kernel.org, netdev@vger.kernel.org,
linux-fsdevel@vger.kernel.org, Andrew Morton,
linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20201024172903.GK2672@gate.crashing.org>
From: Segher Boessenkool
> Sent: 24 October 2020 18:29
>
> On Fri, Oct 23, 2020 at 09:28:59PM +0000, David Laight wrote:
> > From: Segher Boessenkool
> > > Sent: 23 October 2020 19:27
> > > On Fri, Oct 23, 2020 at 06:58:57PM +0100, Al Viro wrote:
> > > > On Fri, Oct 23, 2020 at 03:09:30PM +0200, David Hildenbrand wrote:
> > > > On arm64 when callee expects a 32bit argument, the caller is *not* responsible
> > > > for clearing the upper half of 64bit register used to pass the value - it only
> > > > needs to store the actual value into the lower half. The callee must consider
> > > > the contents of the upper half of that register as undefined. See AAPCS64 (e.g.
> > > > https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#parameter-passing-rules
> > > > ); AFAICS, the relevant bit is
> > > > "Unlike in the 32-bit AAPCS, named integral values must be narrowed by
> > > > the callee rather than the caller."
> > >
> > > Or the formal rule:
> > >
> > > C.9 If the argument is an Integral or Pointer Type, the size of the
> > > argument is less than or equal to 8 bytes and the NGRN is less
> > > than 8, the argument is copied to the least significant bits in
> > > x[NGRN]. The NGRN is incremented by one. The argument has now
> > > been allocated.
> >
> > So, in essence, if the value is in a 64bit register the calling
> > code is independent of the actual type of the formal parameter.
> > Clearly a value might need explicit widening.
>
> No, this says that if you pass a 32-bit integer in a 64-bit register,
> then the top 32 bits of that register hold an undefined value.
That's sort of what I meant.
The 'normal' junk in the hight bits will there because the variable
in the calling code is wider.
> > I've found a copy of the 64 bit arm instruction set.
> > Unfortunately it is alpha sorted and repetitive so shows none
> > of the symmetry and makes things difficult to find.
>
> All of this is ABI, not ISA. Look at the AAPCS64 pointed to above.
>
> > But, contrary to what someone suggested most register writes
> > (eg from arithmetic) seem to zero/extend the high bits.
>
> Everything that writes a "w" does, yes. But that has nothing to do with
> the parameter passing rules, that is ABI. It just means that very often
> a 32-bit integer will be passed zero-extended in a 64-bit register, but
> that is just luck (or not, it makes finding bugs harder ;-) )
Working out why the code is wrong is more of an ISA issue than an ABI one.
It may be an ABI one, but the analysis is ISA.
I've written a lot of asm over the years - decoding compiler generated
asm isn't that hard.
At least ARM doesn't have annulled delay slots.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply
* [PATCH] ibmvscsi: fix race potential race after loss of transport
From: Tyrel Datwyler @ 2020-10-25 0:13 UTC (permalink / raw)
To: james.bottomley
Cc: Tyrel Datwyler, martin.petersen, linux-scsi, linux-kernel, brking,
linuxppc-dev
After a loss of tranport due to an adatper migration or crash/disconnect from
the host partner there is a tiny window where we can race adjusting the
request_limit of the adapter. The request limit is atomically inc/dec to track
the number of inflight requests against the allowed limit of our VIOS partner.
After a transport loss we set the request_limit to zero to reflect this state.
However, there is a window where the adapter may attempt to queue a command
because the transport loss event hasn't been fully processed yet and
request_limit is still greater than zero. The hypercall to send the event will
fail and the error path will increment the request_limit as a result. If the
adapter processes the transport event prior to this increment the request_limit
becomes out of sync with the adapter state and can result in scsi commands being
submitted on the now reset connection prior to an SRP Login resulting in a
protocol violation.
Fix this race by protecting request_limit with the host lock when changing the
value via atomic_set() to indicate no transport.
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
---
drivers/scsi/ibmvscsi/ibmvscsi.c | 36 +++++++++++++++++++++++---------
1 file changed, 26 insertions(+), 10 deletions(-)
diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index b1f3017b6547..188ed75417a5 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -806,6 +806,22 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code)
spin_unlock_irqrestore(hostdata->host->host_lock, flags);
}
+/**
+ * ibmvscsi_set_request_limit - Set the adapter request_limit in response to
+ * an adapter failure, reset, or SRP Login. Done under host lock to prevent
+ * race with scsi command submission.
+ * @hostdata: adapter to adjust
+ * @limit: new request limit
+ */
+static void ibmvscsi_set_request_limit(struct ibmvscsi_host_data *hostdata, int limit)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(hostdata->host->host_lock, flags);
+ atomic_set(&hostdata->request_limit, limit);
+ spin_unlock_irqrestore(hostdata->host->host_lock, flags);
+}
+
/**
* ibmvscsi_reset_host - Reset the connection to the server
* @hostdata: struct ibmvscsi_host_data to reset
@@ -813,7 +829,7 @@ static void purge_requests(struct ibmvscsi_host_data *hostdata, int error_code)
static void ibmvscsi_reset_host(struct ibmvscsi_host_data *hostdata)
{
scsi_block_requests(hostdata->host);
- atomic_set(&hostdata->request_limit, 0);
+ ibmvscsi_set_request_limit(hostdata, 0);
purge_requests(hostdata, DID_ERROR);
hostdata->action = IBMVSCSI_HOST_ACTION_RESET;
@@ -1146,13 +1162,13 @@ static void login_rsp(struct srp_event_struct *evt_struct)
dev_info(hostdata->dev, "SRP_LOGIN_REJ reason %u\n",
evt_struct->xfer_iu->srp.login_rej.reason);
/* Login failed. */
- atomic_set(&hostdata->request_limit, -1);
+ ibmvscsi_set_request_limit(hostdata, -1);
return;
default:
dev_err(hostdata->dev, "Invalid login response typecode 0x%02x!\n",
evt_struct->xfer_iu->srp.login_rsp.opcode);
/* Login failed. */
- atomic_set(&hostdata->request_limit, -1);
+ ibmvscsi_set_request_limit(hostdata, -1);
return;
}
@@ -1163,7 +1179,7 @@ static void login_rsp(struct srp_event_struct *evt_struct)
* This value is set rather than added to request_limit because
* request_limit could have been set to -1 by this client.
*/
- atomic_set(&hostdata->request_limit,
+ ibmvscsi_set_request_limit(hostdata,
be32_to_cpu(evt_struct->xfer_iu->srp.login_rsp.req_lim_delta));
/* If we had any pending I/Os, kick them */
@@ -1195,13 +1211,13 @@ static int send_srp_login(struct ibmvscsi_host_data *hostdata)
login->req_buf_fmt = cpu_to_be16(SRP_BUF_FORMAT_DIRECT |
SRP_BUF_FORMAT_INDIRECT);
- spin_lock_irqsave(hostdata->host->host_lock, flags);
/* Start out with a request limit of 0, since this is negotiated in
* the login request we are just sending and login requests always
* get sent by the driver regardless of request_limit.
*/
- atomic_set(&hostdata->request_limit, 0);
+ ibmvscsi_set_request_limit(hostdata, 0);
+ spin_lock_irqsave(hostdata->host->host_lock, flags);
rc = ibmvscsi_send_srp_event(evt_struct, hostdata, login_timeout * 2);
spin_unlock_irqrestore(hostdata->host->host_lock, flags);
dev_info(hostdata->dev, "sent SRP login\n");
@@ -1781,7 +1797,7 @@ static void ibmvscsi_handle_crq(struct viosrp_crq *crq,
return;
case VIOSRP_CRQ_XPORT_EVENT: /* Hypervisor telling us the connection is closed */
scsi_block_requests(hostdata->host);
- atomic_set(&hostdata->request_limit, 0);
+ ibmvscsi_set_request_limit(hostdata, 0);
if (crq->format == 0x06) {
/* We need to re-setup the interpartition connection */
dev_info(hostdata->dev, "Re-enabling adapter!\n");
@@ -2137,12 +2153,12 @@ static void ibmvscsi_do_work(struct ibmvscsi_host_data *hostdata)
}
hostdata->action = IBMVSCSI_HOST_ACTION_NONE;
+ spin_unlock_irqrestore(hostdata->host->host_lock, flags);
if (rc) {
- atomic_set(&hostdata->request_limit, -1);
+ ibmvscsi_set_request_limit(hostdata, -1);
dev_err(hostdata->dev, "error after %s\n", action);
}
- spin_unlock_irqrestore(hostdata->host->host_lock, flags);
scsi_unblock_requests(hostdata->host);
}
@@ -2226,7 +2242,7 @@ static int ibmvscsi_probe(struct vio_dev *vdev, const struct vio_device_id *id)
init_waitqueue_head(&hostdata->work_wait_q);
hostdata->host = host;
hostdata->dev = dev;
- atomic_set(&hostdata->request_limit, -1);
+ ibmvscsi_set_request_limit(hostdata, -1);
hostdata->host->max_sectors = IBMVSCSI_MAX_SECTORS_DEFAULT;
if (map_persist_bufs(hostdata)) {
--
2.27.0
^ permalink raw reply related
* [powerpc:next-test] BUILD SUCCESS 29b535f8c5da3984d083068bd651af0631dcdca6
From: kernel test robot @ 2020-10-25 2:48 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test
branch HEAD: 29b535f8c5da3984d083068bd651af0631dcdca6 selftests/powerpc/eeh: disable kselftest timeout setting for eeh-basic
elapsed time: 965m
configs tested: 112
configs skipped: 2
The following configs have been built successfully.
More configs may be tested in the coming days.
gcc tested configs:
arm defconfig
arm64 allyesconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
mips cu1830-neo_defconfig
powerpc bluestone_defconfig
powerpc mpc8272_ads_defconfig
arm tct_hammer_defconfig
powerpc canyonlands_defconfig
s390 zfcpdump_defconfig
mips malta_kvm_guest_defconfig
arm hackkit_defconfig
m68k m5249evb_defconfig
sh se7619_defconfig
sh hp6xx_defconfig
h8300 h8s-sim_defconfig
sh se7722_defconfig
powerpc allyesconfig
mips malta_qemu_32r6_defconfig
m68k multi_defconfig
powerpc mpc5200_defconfig
arm zx_defconfig
mips bcm47xx_defconfig
xtensa alldefconfig
xtensa nommu_kc705_defconfig
m68k m5407c3_defconfig
powerpc pseries_defconfig
arm tegra_defconfig
m68k m5272c3_defconfig
arm jornada720_defconfig
m68k apollo_defconfig
arm lart_defconfig
m68k allmodconfig
powerpc mpc834x_mds_defconfig
um kunit_defconfig
sh rts7751r2d1_defconfig
arm pxa_defconfig
mips tb0219_defconfig
arm shmobile_defconfig
sh sh03_defconfig
powerpc ppc6xx_defconfig
sparc64 alldefconfig
arm magician_defconfig
arm ezx_defconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k defconfig
m68k allyesconfig
nios2 defconfig
arc allyesconfig
nds32 allnoconfig
c6x allyesconfig
nds32 defconfig
nios2 allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
parisc allyesconfig
s390 defconfig
i386 allyesconfig
sparc allyesconfig
sparc defconfig
i386 defconfig
mips allyesconfig
mips allmodconfig
powerpc allmodconfig
powerpc allnoconfig
i386 randconfig-a002-20201024
i386 randconfig-a003-20201024
i386 randconfig-a005-20201024
i386 randconfig-a001-20201024
i386 randconfig-a006-20201024
i386 randconfig-a004-20201024
x86_64 randconfig-a013-20201024
x86_64 randconfig-a016-20201024
x86_64 randconfig-a015-20201024
x86_64 randconfig-a012-20201024
x86_64 randconfig-a014-20201024
x86_64 randconfig-a011-20201024
i386 randconfig-a016-20201024
i386 randconfig-a015-20201024
i386 randconfig-a014-20201024
i386 randconfig-a013-20201024
i386 randconfig-a012-20201024
i386 randconfig-a011-20201024
riscv nommu_k210_defconfig
riscv allyesconfig
riscv nommu_virt_defconfig
riscv allnoconfig
riscv defconfig
riscv rv32_defconfig
riscv allmodconfig
x86_64 rhel
x86_64 allyesconfig
x86_64 rhel-7.6-kselftests
x86_64 defconfig
x86_64 rhel-8.3
x86_64 kexec
clang tested configs:
x86_64 randconfig-a001-20201024
x86_64 randconfig-a003-20201024
x86_64 randconfig-a002-20201024
x86_64 randconfig-a006-20201024
x86_64 randconfig-a005-20201024
x86_64 randconfig-a004-20201024
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply
* [powerpc:merge] BUILD SUCCESS 8cb17737940b156329cb5210669b9c9b23f4dd56
From: kernel test robot @ 2020-10-25 2:48 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: 8cb17737940b156329cb5210669b9c9b23f4dd56 Automatic merge of 'master' into merge (2020-10-24 21:33)
elapsed time: 966m
configs tested: 106
configs skipped: 2
The following configs have been built successfully.
More configs may be tested in the coming days.
gcc tested configs:
arm defconfig
arm64 allyesconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
mips cu1830-neo_defconfig
powerpc bluestone_defconfig
powerpc mpc8272_ads_defconfig
arm tct_hammer_defconfig
powerpc canyonlands_defconfig
nios2 defconfig
powerpc mgcoge_defconfig
sh r7780mp_defconfig
nios2 allyesconfig
mips loongson3_defconfig
arm integrator_defconfig
sh ap325rxa_defconfig
sh r7785rp_defconfig
sh ul2_defconfig
h8300 h8s-sim_defconfig
sh se7722_defconfig
powerpc allyesconfig
mips malta_qemu_32r6_defconfig
m68k multi_defconfig
m68k m5407c3_defconfig
powerpc pseries_defconfig
arm tegra_defconfig
m68k m5272c3_defconfig
arm jornada720_defconfig
m68k apollo_defconfig
arm lart_defconfig
m68k allmodconfig
powerpc mpc834x_mds_defconfig
um kunit_defconfig
arm shmobile_defconfig
powerpc wii_defconfig
sh sh7710voipgw_defconfig
powerpc iss476-smp_defconfig
sh rts7751r2d1_defconfig
arm pxa_defconfig
mips tb0219_defconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k defconfig
m68k allyesconfig
nds32 defconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
parisc allyesconfig
s390 defconfig
arc allyesconfig
nds32 allnoconfig
c6x allyesconfig
i386 allyesconfig
sparc allyesconfig
sparc defconfig
i386 defconfig
mips allyesconfig
mips allmodconfig
powerpc allmodconfig
powerpc allnoconfig
i386 randconfig-a002-20201024
i386 randconfig-a003-20201024
i386 randconfig-a005-20201024
i386 randconfig-a001-20201024
i386 randconfig-a006-20201024
i386 randconfig-a004-20201024
x86_64 randconfig-a013-20201024
x86_64 randconfig-a016-20201024
x86_64 randconfig-a015-20201024
x86_64 randconfig-a012-20201024
x86_64 randconfig-a014-20201024
x86_64 randconfig-a011-20201024
i386 randconfig-a016-20201024
i386 randconfig-a015-20201024
i386 randconfig-a014-20201024
i386 randconfig-a013-20201024
i386 randconfig-a012-20201024
i386 randconfig-a011-20201024
riscv nommu_k210_defconfig
riscv allyesconfig
riscv nommu_virt_defconfig
riscv allnoconfig
riscv defconfig
riscv rv32_defconfig
riscv allmodconfig
x86_64 rhel
x86_64 allyesconfig
x86_64 rhel-7.6-kselftests
x86_64 defconfig
x86_64 rhel-8.3
x86_64 kexec
clang tested configs:
x86_64 randconfig-a001-20201024
x86_64 randconfig-a003-20201024
x86_64 randconfig-a002-20201024
x86_64 randconfig-a006-20201024
x86_64 randconfig-a005-20201024
x86_64 randconfig-a004-20201024
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply
* [PATCH 0/4] arch, mm: improve robustness of direct map manipulation
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
To: Andrew Morton
Cc: David Hildenbrand, Peter Zijlstra, Dave Hansen, linux-mm,
Paul Mackerras, Pavel Machek, H. Peter Anvin, sparclinux,
Christoph Lameter, Will Deacon, linux-riscv, linux-s390, x86,
Mike Rapoport, Christian Borntraeger, Ingo Molnar,
Catalin Marinas, Len Brown, Albert Ou, Vasily Gorbik, linux-pm,
Heiko Carstens, David Rientjes, Borislav Petkov, Andy Lutomirski,
Paul Walmsley, Kirill A. Shutemov, Thomas Gleixner,
linux-arm-kernel, Rafael J. Wysocki, linux-kernel, Pekka Enberg,
Palmer Dabbelt, Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev,
David S. Miller, Mike Rapoport
From: Mike Rapoport <rppt@linux.ibm.com>
Hi,
During recent discussion about KVM protected memory, David raised a concern
about usage of __kernel_map_pages() outside of DEBUG_PAGEALLOC scope [1].
Indeed, for architectures that define CONFIG_ARCH_HAS_SET_DIRECT_MAP it is
possible that __kernel_map_pages() would fail, but since this function is
void, the failure will go unnoticed.
Moreover, there's lack of consistency of __kernel_map_pages() semantics
across architectures as some guard this function with
#ifdef DEBUG_PAGEALLOC, some refuse to update the direct map if page
allocation debugging is disabled at run time and some allow modifying the
direct map regardless of DEBUG_PAGEALLOC settings.
This set straightens this out by restoring dependency of
__kernel_map_pages() on DEBUG_PAGEALLOC and updating the call sites
accordingly.
[1] https://lore.kernel.org/lkml/2759b4bf-e1e3-d006-7d86-78a40348269d@redhat.com
Mike Rapoport (4):
mm: introduce debug_pagealloc_map_pages() helper
PM: hibernate: improve robustness of mapping pages in the direct map
arch, mm: restore dependency of __kernel_map_pages() of DEBUG_PAGEALLOC
arch, mm: make kernel_page_present() always available
arch/Kconfig | 3 +++
arch/arm64/Kconfig | 4 +---
arch/arm64/include/asm/cacheflush.h | 1 +
arch/arm64/mm/pageattr.c | 6 +++--
arch/powerpc/Kconfig | 5 +----
arch/riscv/Kconfig | 4 +---
arch/riscv/include/asm/pgtable.h | 2 --
arch/riscv/include/asm/set_memory.h | 1 +
arch/riscv/mm/pageattr.c | 31 +++++++++++++++++++++++++
arch/s390/Kconfig | 4 +---
arch/sparc/Kconfig | 4 +---
arch/x86/Kconfig | 4 +---
arch/x86/include/asm/set_memory.h | 1 +
arch/x86/mm/pat/set_memory.c | 4 ++--
include/linux/mm.h | 35 +++++++++++++----------------
include/linux/set_memory.h | 5 +++++
kernel/power/snapshot.c | 24 ++++++++++++++++++--
mm/memory_hotplug.c | 3 +--
mm/page_alloc.c | 6 ++---
mm/slab.c | 8 +++----
20 files changed, 97 insertions(+), 58 deletions(-)
--
2.28.0
^ permalink raw reply
* [PATCH 1/4] mm: introduce debug_pagealloc_map_pages() helper
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
To: Andrew Morton
Cc: David Hildenbrand, Peter Zijlstra, Dave Hansen, linux-mm,
Paul Mackerras, Pavel Machek, H. Peter Anvin, sparclinux,
Christoph Lameter, Will Deacon, linux-riscv, linux-s390, x86,
Mike Rapoport, Christian Borntraeger, Ingo Molnar,
Catalin Marinas, Len Brown, Albert Ou, Vasily Gorbik, linux-pm,
Heiko Carstens, David Rientjes, Borislav Petkov, Andy Lutomirski,
Paul Walmsley, Kirill A. Shutemov, Thomas Gleixner,
linux-arm-kernel, Rafael J. Wysocki, linux-kernel, Pekka Enberg,
Palmer Dabbelt, Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev,
David S. Miller, Mike Rapoport
In-Reply-To: <20201025101555.3057-1-rppt@kernel.org>
From: Mike Rapoport <rppt@linux.ibm.com>
When CONFIG_DEBUG_PAGEALLOC is enabled, it unmaps pages from the kernel
direct mapping after free_pages(). The pages than need to be mapped back
before they could be used. Theese mapping operations use
__kernel_map_pages() guarded with with debug_pagealloc_enabled().
The only place that calls __kernel_map_pages() without checking whether
DEBUG_PAGEALLOC is enabled is the hibernation code that presumes
availability of this function when ARCH_HAS_SET_DIRECT_MAP is set.
Still, on arm64, __kernel_map_pages() will bail out when DEBUG_PAGEALLOC is
not enabled but set_direct_map_invalid_noflush() may render some pages not
present in the direct map and hibernation code won't be able to save such
pages.
To make page allocation debugging and hibernation interaction more robust,
the dependency on DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP has to be made
more explicit.
Start with combining the guard condition and the call to
__kernel_map_pages() into a single debug_pagealloc_map_pages() function to
emphasize that __kernel_map_pages() should not be called without
DEBUG_PAGEALLOC and use this new function to map/unmap pages when page
allocation debug is enabled.
As the only remaining user of kernel_map_pages() is the hibernation code,
mode that function into kernel/power/snapshot.c closer to a caller.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
include/linux/mm.h | 16 +++++++---------
kernel/power/snapshot.c | 11 +++++++++++
mm/memory_hotplug.c | 3 +--
mm/page_alloc.c | 6 ++----
mm/slab.c | 8 +++-----
5 files changed, 24 insertions(+), 20 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ef360fe70aaf..14e397f3752c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2927,21 +2927,19 @@ static inline bool debug_pagealloc_enabled_static(void)
#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
extern void __kernel_map_pages(struct page *page, int numpages, int enable);
-/*
- * When called in DEBUG_PAGEALLOC context, the call should most likely be
- * guarded by debug_pagealloc_enabled() or debug_pagealloc_enabled_static()
- */
-static inline void
-kernel_map_pages(struct page *page, int numpages, int enable)
+static inline void debug_pagealloc_map_pages(struct page *page,
+ int numpages, int enable)
{
- __kernel_map_pages(page, numpages, enable);
+ if (debug_pagealloc_enabled_static())
+ __kernel_map_pages(page, numpages, enable);
}
+
#ifdef CONFIG_HIBERNATION
extern bool kernel_page_present(struct page *page);
#endif /* CONFIG_HIBERNATION */
#else /* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
-static inline void
-kernel_map_pages(struct page *page, int numpages, int enable) {}
+static inline void debug_pagealloc_map_pages(struct page *page,
+ int numpages, int enable) {}
#ifdef CONFIG_HIBERNATION
static inline bool kernel_page_present(struct page *page) { return true; }
#endif /* CONFIG_HIBERNATION */
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 46b1804c1ddf..fa499466f645 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -76,6 +76,17 @@ static inline void hibernate_restore_protect_page(void *page_address) {}
static inline void hibernate_restore_unprotect_page(void *page_address) {}
#endif /* CONFIG_STRICT_KERNEL_RWX && CONFIG_ARCH_HAS_SET_MEMORY */
+#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
+static inline void
+kernel_map_pages(struct page *page, int numpages, int enable)
+{
+ __kernel_map_pages(page, numpages, enable);
+}
+#else
+static inline void
+kernel_map_pages(struct page *page, int numpages, int enable) {}
+#endif
+
static int swsusp_page_is_free(struct page *);
static void swsusp_set_page_forbidden(struct page *);
static void swsusp_unset_page_forbidden(struct page *);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b44d4c7ba73b..e2b6043a4428 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -614,8 +614,7 @@ void generic_online_page(struct page *page, unsigned int order)
* so we should map it first. This is better than introducing a special
* case in page freeing fast path.
*/
- if (debug_pagealloc_enabled_static())
- kernel_map_pages(page, 1 << order, 1);
+ debug_pagealloc_map_pages(page, 1 << order, 1);
__free_pages_core(page, order);
totalram_pages_add(1UL << order);
#ifdef CONFIG_HIGHMEM
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 23f5066bd4a5..9a66a1ff9193 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1272,8 +1272,7 @@ static __always_inline bool free_pages_prepare(struct page *page,
*/
arch_free_page(page, order);
- if (debug_pagealloc_enabled_static())
- kernel_map_pages(page, 1 << order, 0);
+ debug_pagealloc_map_pages(page, 1 << order, 0);
kasan_free_nondeferred_pages(page, order);
@@ -2270,8 +2269,7 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
set_page_refcounted(page);
arch_alloc_page(page, order);
- if (debug_pagealloc_enabled_static())
- kernel_map_pages(page, 1 << order, 1);
+ debug_pagealloc_map_pages(page, 1 << order, 1);
kasan_alloc_pages(page, order);
kernel_poison_pages(page, 1 << order, 1);
set_page_owner(page, order, gfp_flags);
diff --git a/mm/slab.c b/mm/slab.c
index b1113561b98b..340db0ce74c4 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1431,10 +1431,8 @@ static bool is_debug_pagealloc_cache(struct kmem_cache *cachep)
#ifdef CONFIG_DEBUG_PAGEALLOC
static void slab_kernel_map(struct kmem_cache *cachep, void *objp, int map)
{
- if (!is_debug_pagealloc_cache(cachep))
- return;
-
- kernel_map_pages(virt_to_page(objp), cachep->size / PAGE_SIZE, map);
+ debug_pagealloc_map_pages(virt_to_page(objp),
+ cachep->size / PAGE_SIZE, map);
}
#else
@@ -2062,7 +2060,7 @@ int __kmem_cache_create(struct kmem_cache *cachep, slab_flags_t flags)
#if DEBUG
/*
- * If we're going to use the generic kernel_map_pages()
+ * If we're going to use the generic debug_pagealloc_map_pages()
* poisoning, then it's going to smash the contents of
* the redzone and userword anyhow, so switch them off.
*/
--
2.28.0
^ permalink raw reply related
* [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
To: Andrew Morton
Cc: David Hildenbrand, Peter Zijlstra, Dave Hansen, linux-mm,
Paul Mackerras, Pavel Machek, H. Peter Anvin, sparclinux,
Christoph Lameter, Will Deacon, linux-riscv, linux-s390, x86,
Mike Rapoport, Christian Borntraeger, Ingo Molnar,
Catalin Marinas, Len Brown, Albert Ou, Vasily Gorbik, linux-pm,
Heiko Carstens, David Rientjes, Borislav Petkov, Andy Lutomirski,
Paul Walmsley, Kirill A. Shutemov, Thomas Gleixner,
linux-arm-kernel, Rafael J. Wysocki, linux-kernel, Pekka Enberg,
Palmer Dabbelt, Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev,
David S. Miller, Mike Rapoport
In-Reply-To: <20201025101555.3057-1-rppt@kernel.org>
From: Mike Rapoport <rppt@linux.ibm.com>
When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page may be
not present in the direct map and has to be explicitly mapped before it
could be copied.
On arm64 it is possible that a page would be removed from the direct map
using set_direct_map_invalid_noflush() but __kernel_map_pages() will refuse
to map this page back if DEBUG_PAGEALLOC is disabled.
Explicitly use set_direct_map_{default,invalid}_noflush() for
ARCH_HAS_SET_DIRECT_MAP case and debug_pagealloc_map_pages() for
DEBUG_PAGEALLOC case.
While on that, rename kernel_map_pages() to hibernate_map_page() and drop
numpages parameter.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
kernel/power/snapshot.c | 29 +++++++++++++++++++----------
1 file changed, 19 insertions(+), 10 deletions(-)
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index fa499466f645..ecb7b32ce77c 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -76,16 +76,25 @@ static inline void hibernate_restore_protect_page(void *page_address) {}
static inline void hibernate_restore_unprotect_page(void *page_address) {}
#endif /* CONFIG_STRICT_KERNEL_RWX && CONFIG_ARCH_HAS_SET_MEMORY */
-#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
-static inline void
-kernel_map_pages(struct page *page, int numpages, int enable)
+static inline void hibernate_map_page(struct page *page, int enable)
{
- __kernel_map_pages(page, numpages, enable);
+ if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
+ unsigned long addr = (unsigned long)page_address(page);
+ int ret;
+
+ if (enable)
+ ret = set_direct_map_default_noflush(page);
+ else
+ ret = set_direct_map_invalid_noflush(page);
+
+ if (WARN_ON(ret))
+ return;
+
+ flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+ } else {
+ debug_pagealloc_map_pages(page, 1, enable);
+ }
}
-#else
-static inline void
-kernel_map_pages(struct page *page, int numpages, int enable) {}
-#endif
static int swsusp_page_is_free(struct page *);
static void swsusp_set_page_forbidden(struct page *);
@@ -1366,9 +1375,9 @@ static void safe_copy_page(void *dst, struct page *s_page)
if (kernel_page_present(s_page)) {
do_copy_page(dst, page_address(s_page));
} else {
- kernel_map_pages(s_page, 1, 1);
+ hibernate_map_page(s_page, 1);
do_copy_page(dst, page_address(s_page));
- kernel_map_pages(s_page, 1, 0);
+ hibernate_map_page(s_page, 0);
}
}
--
2.28.0
^ permalink raw reply related
* [PATCH 3/4] arch, mm: restore dependency of __kernel_map_pages() of DEBUG_PAGEALLOC
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
To: Andrew Morton
Cc: David Hildenbrand, Peter Zijlstra, Dave Hansen, linux-mm,
Paul Mackerras, Pavel Machek, H. Peter Anvin, sparclinux,
Christoph Lameter, Will Deacon, linux-riscv, linux-s390, x86,
Mike Rapoport, Christian Borntraeger, Ingo Molnar,
Catalin Marinas, Len Brown, Albert Ou, Vasily Gorbik, linux-pm,
Heiko Carstens, David Rientjes, Borislav Petkov, Andy Lutomirski,
Paul Walmsley, Kirill A. Shutemov, Thomas Gleixner,
linux-arm-kernel, Rafael J. Wysocki, linux-kernel, Pekka Enberg,
Palmer Dabbelt, Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev,
David S. Miller, Mike Rapoport
In-Reply-To: <20201025101555.3057-1-rppt@kernel.org>
From: Mike Rapoport <rppt@linux.ibm.com>
The design of DEBUG_PAGEALLOC presumes that __kernel_map_pages() must never
fail. With this assumption is wouldn't be safe to allow general usage of
this function.
Moreover, some architectures that implement __kernel_map_pages() have this
function guarded by #ifdef DEBUG_PAGEALLOC and some refuse to map/unmap
pages when page allocation debugging is disabled at runtime.
As all the users of __kernel_map_pages() were converted to use
debug_pagealloc_map_pages() it is safe to make it available only when
DEBUG_PAGEALLOC is set.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
arch/Kconfig | 3 +++
arch/arm64/Kconfig | 4 +---
arch/arm64/mm/pageattr.c | 6 ++++--
arch/powerpc/Kconfig | 5 +----
arch/riscv/Kconfig | 4 +---
arch/riscv/include/asm/pgtable.h | 2 --
arch/riscv/mm/pageattr.c | 2 ++
arch/s390/Kconfig | 4 +---
arch/sparc/Kconfig | 4 +---
arch/x86/Kconfig | 4 +---
arch/x86/mm/pat/set_memory.c | 2 ++
include/linux/mm.h | 10 +++++++---
12 files changed, 24 insertions(+), 26 deletions(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index 56b6ccc0e32d..56d4752b6db6 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1028,6 +1028,9 @@ config HAVE_STATIC_CALL_INLINE
bool
depends on HAVE_STATIC_CALL
+config ARCH_SUPPORTS_DEBUG_PAGEALLOC
+ bool
+
source "kernel/gcov/Kconfig"
source "scripts/gcc-plugins/Kconfig"
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 08fa3a1c50f0..1d4da0843668 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -71,6 +71,7 @@ config ARM64
select ARCH_USE_QUEUED_RWLOCKS
select ARCH_USE_QUEUED_SPINLOCKS
select ARCH_USE_SYM_ANNOTATIONS
+ select ARCH_SUPPORTS_DEBUG_PAGEALLOC
select ARCH_SUPPORTS_MEMORY_FAILURE
select ARCH_SUPPORTS_SHADOW_CALL_STACK if CC_HAVE_SHADOW_CALL_STACK
select ARCH_SUPPORTS_ATOMIC_RMW
@@ -1004,9 +1005,6 @@ config HOLES_IN_ZONE
source "kernel/Kconfig.hz"
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
- def_bool y
-
config ARCH_SPARSEMEM_ENABLE
def_bool y
select SPARSEMEM_VMEMMAP_ENABLE
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index 1b94f5b82654..18613d8834db 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -178,13 +178,15 @@ int set_direct_map_default_noflush(struct page *page)
PAGE_SIZE, change_page_range, &data);
}
+#ifdef CONFIG_DEBUG_PAGEALLOC
void __kernel_map_pages(struct page *page, int numpages, int enable)
{
- if (!debug_pagealloc_enabled() && !rodata_full)
+ if (!rodata_full)
return;
set_memory_valid((unsigned long)page_address(page), numpages, enable);
}
+#endif /* CONFIG_DEBUG_PAGEALLOC */
/*
* This function is used to determine if a linear map page has been marked as
@@ -204,7 +206,7 @@ bool kernel_page_present(struct page *page)
pte_t *ptep;
unsigned long addr = (unsigned long)page_address(page);
- if (!debug_pagealloc_enabled() && !rodata_full)
+ if (!rodata_full)
return true;
pgdp = pgd_offset_k(addr);
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e9f13fe08492..ad8a83f3ddca 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -146,6 +146,7 @@ config PPC
select ARCH_MIGHT_HAVE_PC_SERIO
select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_SUPPORTS_ATOMIC_RMW
+ select ARCH_SUPPORTS_DEBUG_PAGEALLOC if PPC32 || PPC_BOOK3S_64
select ARCH_USE_BUILTIN_BSWAP
select ARCH_USE_CMPXCHG_LOCKREF if PPC64
select ARCH_USE_QUEUED_RWLOCKS if PPC_QUEUED_SPINLOCKS
@@ -355,10 +356,6 @@ config PPC_OF_PLATFORM_PCI
depends on PCI
depends on PPC64 # not supported on 32 bits yet
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
- depends on PPC32 || PPC_BOOK3S_64
- def_bool y
-
config ARCH_SUPPORTS_UPROBES
def_bool y
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index d5e7ca08f22c..c704562ba45e 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -14,6 +14,7 @@ config RISCV
def_bool y
select ARCH_CLOCKSOURCE_INIT
select ARCH_SUPPORTS_ATOMIC_RMW
+ select ARCH_SUPPORTS_DEBUG_PAGEALLOC if MMU
select ARCH_HAS_BINFMT_FLAT
select ARCH_HAS_DEBUG_VM_PGTABLE
select ARCH_HAS_DEBUG_VIRTUAL if MMU
@@ -153,9 +154,6 @@ config ARCH_SELECT_MEMORY_MODEL
config ARCH_WANT_GENERAL_HUGETLB
def_bool y
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
- def_bool y
-
config SYS_SUPPORTS_HUGETLBFS
depends on MMU
def_bool y
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 183f1f4b2ae6..41a72861987c 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -461,8 +461,6 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
#define VMALLOC_START 0
#define VMALLOC_END TASK_SIZE
-static inline void __kernel_map_pages(struct page *page, int numpages, int enable) {}
-
#endif /* !CONFIG_MMU */
#define kern_addr_valid(addr) (1) /* FIXME */
diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
index 19fecb362d81..321b09d2e2ea 100644
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -184,6 +184,7 @@ int set_direct_map_default_noflush(struct page *page)
return ret;
}
+#ifdef CONFIG_DEBUG_PAGEALLOC
void __kernel_map_pages(struct page *page, int numpages, int enable)
{
if (!debug_pagealloc_enabled())
@@ -196,3 +197,4 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
__set_memory((unsigned long)page_address(page), numpages,
__pgprot(0), __pgprot(_PAGE_PRESENT));
}
+#endif
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 34371539a9b9..0a42d457bff4 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -35,9 +35,6 @@ config GENERIC_LOCKBREAK
config PGSTE
def_bool y if KVM
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
- def_bool y
-
config AUDIT_ARCH
def_bool y
@@ -106,6 +103,7 @@ config S390
select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE
select ARCH_STACKWALK
select ARCH_SUPPORTS_ATOMIC_RMW
+ select ARCH_SUPPORTS_DEBUG_PAGEALLOC
select ARCH_SUPPORTS_NUMA_BALANCING
select ARCH_USE_BUILTIN_BSWAP
select ARCH_USE_CMPXCHG_LOCKREF
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index a6ca135442f9..2c729b8d097a 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -88,6 +88,7 @@ config SPARC64
select HAVE_C_RECORDMCOUNT
select HAVE_ARCH_AUDITSYSCALL
select ARCH_SUPPORTS_ATOMIC_RMW
+ select ARCH_SUPPORTS_DEBUG_PAGEALLOC
select HAVE_NMI
select HAVE_REGS_AND_STACK_ACCESS_API
select ARCH_USE_QUEUED_RWLOCKS
@@ -148,9 +149,6 @@ config GENERIC_ISA_DMA
bool
default y if SPARC32
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
- def_bool y if SPARC64
-
config PGTABLE_LEVELS
default 4 if 64BIT
default 3
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f6946b81f74a..0db3fb1da70c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -91,6 +91,7 @@ config X86
select ARCH_STACKWALK
select ARCH_SUPPORTS_ACPI
select ARCH_SUPPORTS_ATOMIC_RMW
+ select ARCH_SUPPORTS_DEBUG_PAGEALLOC
select ARCH_SUPPORTS_NUMA_BALANCING if X86_64
select ARCH_USE_BUILTIN_BSWAP
select ARCH_USE_QUEUED_RWLOCKS
@@ -329,9 +330,6 @@ config ZONE_DMA32
config AUDIT_ARCH
def_bool y if X86_64
-config ARCH_SUPPORTS_DEBUG_PAGEALLOC
- def_bool y
-
config KASAN_SHADOW_OFFSET
hex
depends on KASAN
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 40baa90e74f4..7f248fc45317 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2194,6 +2194,7 @@ int set_direct_map_default_noflush(struct page *page)
return __set_pages_p(page, 1);
}
+#ifdef CONFIG_DEBUG_PAGEALLOC
void __kernel_map_pages(struct page *page, int numpages, int enable)
{
if (PageHighMem(page))
@@ -2225,6 +2226,7 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
arch_flush_lazy_mmu_mode();
}
+#endif /* CONFIG_DEBUG_PAGEALLOC */
#ifdef CONFIG_HIBERNATION
bool kernel_page_present(struct page *page)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 14e397f3752c..ab0ef6bd351d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2924,7 +2924,11 @@ static inline bool debug_pagealloc_enabled_static(void)
return static_branch_unlikely(&_debug_pagealloc_enabled);
}
-#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
+#ifdef CONFIG_DEBUG_PAGEALLOC
+/*
+ * To support DEBUG_PAGEALLOC architecture must ensure that
+ * __kernel_map_pages() never fails
+ */
extern void __kernel_map_pages(struct page *page, int numpages, int enable);
static inline void debug_pagealloc_map_pages(struct page *page,
@@ -2937,13 +2941,13 @@ static inline void debug_pagealloc_map_pages(struct page *page,
#ifdef CONFIG_HIBERNATION
extern bool kernel_page_present(struct page *page);
#endif /* CONFIG_HIBERNATION */
-#else /* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
+#else /* CONFIG_DEBUG_PAGEALLOC */
static inline void debug_pagealloc_map_pages(struct page *page,
int numpages, int enable) {}
#ifdef CONFIG_HIBERNATION
static inline bool kernel_page_present(struct page *page) { return true; }
#endif /* CONFIG_HIBERNATION */
-#endif /* CONFIG_DEBUG_PAGEALLOC || CONFIG_ARCH_HAS_SET_DIRECT_MAP */
+#endif /* CONFIG_DEBUG_PAGEALLOC */
#ifdef __HAVE_ARCH_GATE_AREA
extern struct vm_area_struct *get_gate_vma(struct mm_struct *mm);
--
2.28.0
^ permalink raw reply related
* [PATCH 4/4] arch, mm: make kernel_page_present() always available
From: Mike Rapoport @ 2020-10-25 10:15 UTC (permalink / raw)
To: Andrew Morton
Cc: David Hildenbrand, Peter Zijlstra, Dave Hansen, linux-mm,
Paul Mackerras, Pavel Machek, H. Peter Anvin, sparclinux,
Christoph Lameter, Will Deacon, linux-riscv, linux-s390, x86,
Mike Rapoport, Christian Borntraeger, Ingo Molnar,
Catalin Marinas, Len Brown, Albert Ou, Vasily Gorbik, linux-pm,
Heiko Carstens, David Rientjes, Borislav Petkov, Andy Lutomirski,
Paul Walmsley, Kirill A. Shutemov, Thomas Gleixner,
linux-arm-kernel, Rafael J. Wysocki, linux-kernel, Pekka Enberg,
Palmer Dabbelt, Joonsoo Kim, Edgecombe, Rick P, linuxppc-dev,
David S. Miller, Mike Rapoport
In-Reply-To: <20201025101555.3057-1-rppt@kernel.org>
From: Mike Rapoport <rppt@linux.ibm.com>
For architectures that enable ARCH_HAS_SET_MEMORY having the ability to
verify that a page is mapped in the kernel direct map can be useful
regardless of hibernation.
Add RISC-V implementation of kernel_page_present() and update its forward
declarations and stubs to be a part of set_memory API.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
arch/arm64/include/asm/cacheflush.h | 1 +
arch/riscv/include/asm/set_memory.h | 1 +
arch/riscv/mm/pageattr.c | 29 +++++++++++++++++++++++++++++
arch/x86/include/asm/set_memory.h | 1 +
arch/x86/mm/pat/set_memory.c | 2 --
include/linux/mm.h | 7 -------
include/linux/set_memory.h | 5 +++++
7 files changed, 37 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index 9384fd8fc13c..45217f21f1fe 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -140,6 +140,7 @@ int set_memory_valid(unsigned long addr, int numpages, int enable);
int set_direct_map_invalid_noflush(struct page *page);
int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
#include <asm-generic/cacheflush.h>
diff --git a/arch/riscv/include/asm/set_memory.h b/arch/riscv/include/asm/set_memory.h
index 4c5bae7ca01c..d690b08dff2a 100644
--- a/arch/riscv/include/asm/set_memory.h
+++ b/arch/riscv/include/asm/set_memory.h
@@ -24,6 +24,7 @@ static inline int set_memory_nx(unsigned long addr, int numpages) { return 0; }
int set_direct_map_invalid_noflush(struct page *page);
int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
#endif /* __ASSEMBLY__ */
diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
index 321b09d2e2ea..87ba5a68bbb8 100644
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -198,3 +198,32 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
__pgprot(0), __pgprot(_PAGE_PRESENT));
}
#endif
+
+bool kernel_page_present(struct page *page)
+{
+ unsigned long addr = (unsigned long)page_address(page);
+ pgd_t *pgd;
+ pud_t *pud;
+ p4d_t *p4d;
+ pmd_t *pmd;
+ pte_t *pte;
+
+ pgd = pgd_offset_k(addr);
+ if (!pgd_present(*pgd))
+ return false;
+
+ p4d = p4d_offset(pgd, addr);
+ if (!p4d_present(*p4d))
+ return false;
+
+ pud = pud_offset(p4d, addr);
+ if (!pud_present(*pud))
+ return false;
+
+ pmd = pmd_offset(pud, addr);
+ if (!pmd_present(*pmd))
+ return false;
+
+ pte = pte_offset_kernel(pmd, addr);
+ return pte_present(*pte);
+}
diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index 5948218f35c5..4352f08bfbb5 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -82,6 +82,7 @@ int set_pages_rw(struct page *page, int numpages);
int set_direct_map_invalid_noflush(struct page *page);
int set_direct_map_default_noflush(struct page *page);
+bool kernel_page_present(struct page *page);
extern int kernel_set_to_readonly;
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 7f248fc45317..16f878c26667 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2228,7 +2228,6 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
}
#endif /* CONFIG_DEBUG_PAGEALLOC */
-#ifdef CONFIG_HIBERNATION
bool kernel_page_present(struct page *page)
{
unsigned int level;
@@ -2240,7 +2239,6 @@ bool kernel_page_present(struct page *page)
pte = lookup_address((unsigned long)page_address(page), &level);
return (pte_val(*pte) & _PAGE_PRESENT);
}
-#endif /* CONFIG_HIBERNATION */
int __init kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
unsigned numpages, unsigned long page_flags)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ab0ef6bd351d..44b82f22e76a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2937,16 +2937,9 @@ static inline void debug_pagealloc_map_pages(struct page *page,
if (debug_pagealloc_enabled_static())
__kernel_map_pages(page, numpages, enable);
}
-
-#ifdef CONFIG_HIBERNATION
-extern bool kernel_page_present(struct page *page);
-#endif /* CONFIG_HIBERNATION */
#else /* CONFIG_DEBUG_PAGEALLOC */
static inline void debug_pagealloc_map_pages(struct page *page,
int numpages, int enable) {}
-#ifdef CONFIG_HIBERNATION
-static inline bool kernel_page_present(struct page *page) { return true; }
-#endif /* CONFIG_HIBERNATION */
#endif /* CONFIG_DEBUG_PAGEALLOC */
#ifdef __HAVE_ARCH_GATE_AREA
diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
index 860e0f843c12..fe1aa4e54680 100644
--- a/include/linux/set_memory.h
+++ b/include/linux/set_memory.h
@@ -23,6 +23,11 @@ static inline int set_direct_map_default_noflush(struct page *page)
{
return 0;
}
+
+static inline bool kernel_page_present(struct page *page)
+{
+ return true;
+}
#endif
#ifndef set_mce_nospec
--
2.28.0
^ permalink raw reply related
* Re: [PATCH 0/3] warn and suppress irqflood
From: Pingfan Liu @ 2020-10-25 11:12 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Maulik Shah, Petr Mladek, Oliver Neukum, Jonathan Corbet,
Gustavo A. R. Silva, Peter Zijlstra, Marc Zyngier, Linus Walleij,
Guilherme G. Piccoli, linux-doc, LKML, Lina Iyer, Jisheng Zhang,
Pawan Gupta, Al Viro, Andrew Morton, afzal mohammed,
Kexec Mailing List, Mike Kravetz
In-Reply-To: <871rhq7j1h.fsf@nanos.tec.linutronix.de>
On Thu, Oct 22, 2020 at 4:37 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Thu, Oct 22 2020 at 13:56, Pingfan Liu wrote:
> > I hit a irqflood bug on powerpc platform, and two years ago, on a x86 platform.
> > When the bug happens, the kernel is totally occupies by irq. Currently, there
> > may be nothing or just soft lockup warning showed in console. It is better
> > to warn users with irq flood info.
> >
> > In the kdump case, the kernel can move on by suppressing the irq flood.
>
> You're curing the symptom not the cause and the cure is just magic and
> can't work reliably.
Yeah, it is magic. But at least, it is better to printk something and
alarm users about what happens. With current code, it may show nothing
when system hangs.
>
> Where is that irq flood originated from and why is none of the
> mechanisms we have in place to shut it up working?
The bug originates from a driver tpm_i2c_nuvoton, which calls i2c-bus
driver (i2c-opal.c). After i2c_opal_send_request(), the bug is
triggered.
But things are complicated by introducing a firmware layer: Skiboot.
This software layer hides the detail of manipulating the hardware from
Linux.
I guess the software logic can not enter a sane state when kernel crashes.
Cc Skiboot and ppc64 community to see whether anyone has idea about it.
Thanks,
Pingfan
^ permalink raw reply
* Re: [PATCH 2/4] PM: hibernate: improve robustness of mapping pages in the direct map
From: Edgecombe, Rick P @ 2020-10-26 0:38 UTC (permalink / raw)
To: rppt@kernel.org, akpm@linux-foundation.org
Cc: david@redhat.com, peterz@infradead.org, catalin.marinas@arm.com,
dave.hansen@linux.intel.com, linux-mm@kvack.org, paulus@samba.org,
pavel@ucw.cz, hpa@zytor.com, sparclinux@vger.kernel.org,
cl@linux.com, will@kernel.org, linux-riscv@lists.infradead.org,
linux-s390@vger.kernel.org, x86@kernel.org, rppt@linux.ibm.com,
borntraeger@de.ibm.com, mingo@redhat.com, rientjes@google.com,
Brown, Len, aou@eecs.berkeley.edu, gor@linux.ibm.com,
linux-pm@vger.kernel.org, hca@linux.ibm.com, bp@alien8.de,
luto@kernel.org, paul.walmsley@sifive.com, kirill@shutemov.name,
tglx@linutronix.de, linux-arm-kernel@lists.infradead.org,
rjw@rjwysocki.net, linux-kernel@vger.kernel.org,
penberg@kernel.org, palmer@dabbelt.com, iamjoonsoo.kim@lge.com,
linuxppc-dev@lists.ozlabs.org, davem@davemloft.net
In-Reply-To: <20201025101555.3057-3-rppt@kernel.org>
On Sun, 2020-10-25 at 12:15 +0200, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
>
> When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page may
> be
> not present in the direct map and has to be explicitly mapped before
> it
> could be copied.
>
> On arm64 it is possible that a page would be removed from the direct
> map
> using set_direct_map_invalid_noflush() but __kernel_map_pages() will
> refuse
> to map this page back if DEBUG_PAGEALLOC is disabled.
It looks to me that arm64 __kernel_map_pages() will still attempt to
map it if rodata_full is true, how does this happen?
> Explicitly use set_direct_map_{default,invalid}_noflush() for
> ARCH_HAS_SET_DIRECT_MAP case and debug_pagealloc_map_pages() for
> DEBUG_PAGEALLOC case.
>
> While on that, rename kernel_map_pages() to hibernate_map_page() and
> drop
> numpages parameter.
>
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
> kernel/power/snapshot.c | 29 +++++++++++++++++++----------
> 1 file changed, 19 insertions(+), 10 deletions(-)
>
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index fa499466f645..ecb7b32ce77c 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -76,16 +76,25 @@ static inline void
> hibernate_restore_protect_page(void *page_address) {}
> static inline void hibernate_restore_unprotect_page(void
> *page_address) {}
> #endif /* CONFIG_STRICT_KERNEL_RWX && CONFIG_ARCH_HAS_SET_MEMORY */
>
> -#if defined(CONFIG_DEBUG_PAGEALLOC) ||
> defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP)
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable)
> +static inline void hibernate_map_page(struct page *page, int enable)
> {
> - __kernel_map_pages(page, numpages, enable);
> + if (IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
> + unsigned long addr = (unsigned long)page_address(page);
> + int ret;
> +
> + if (enable)
> + ret = set_direct_map_default_noflush(page);
> + else
> + ret = set_direct_map_invalid_noflush(page);
> +
> + if (WARN_ON(ret))
> + return;
> +
> + flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> + } else {
> + debug_pagealloc_map_pages(page, 1, enable);
> + }
> }
> -#else
> -static inline void
> -kernel_map_pages(struct page *page, int numpages, int enable) {}
> -#endif
>
> static int swsusp_page_is_free(struct page *);
> static void swsusp_set_page_forbidden(struct page *);
> @@ -1366,9 +1375,9 @@ static void safe_copy_page(void *dst, struct
> page *s_page)
> if (kernel_page_present(s_page)) {
> do_copy_page(dst, page_address(s_page));
> } else {
> - kernel_map_pages(s_page, 1, 1);
> + hibernate_map_page(s_page, 1);
> do_copy_page(dst, page_address(s_page));
> - kernel_map_pages(s_page, 1, 0);
> + hibernate_map_page(s_page, 0);
> }
> }
>
If somehow a page was unmapped such that
set_direct_map_default_noflush() would fail, then this code introduces
a WARN, but it will still try to read the unmapped page. Why not just
have the WARN's inside of __kernel_map_pages() if they fail and then
have a warning for the debug page alloc cases as well? Since logic
around both expects them not to fail.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox