* Re: [PATCH 2/2] sched: Centralize SCHED_{SMT, MC, CLUSTER} definitions
From: Valentin Schneider @ 2021-10-08 15:22 UTC (permalink / raw)
To: Barry Song
Cc: Juri Lelli, Mark Rutland, Kefeng Wang, Rich Felker, linux-ia64,
Geert Uytterhoeven, linux-sh, Peter Zijlstra, Catalin Marinas,
Linus Walleij, David Hildenbrand, x86, linux-mips,
James E.J. Bottomley, Hugh Dickins, Paul Mackerras,
H. Peter Anvin, sparclinux, Will Deacon, Ard Biesheuvel,
linux-s390, Vincent Guittot, Arnd Bergmann, Yoshinori Sato,
YiFei Zhu, Helge Deller, Aubrey Li, Daniel Bristot de Oliveira,
Russell King, Christian Borntraeger, Ingo Molnar, Mel Gorman,
Masahiro Yamada, Frederic Weisbecker, Kees Cook, Vasily Gorbik,
Anshuman Khandual, Vlastimil Babka, Vipin Sharma, Heiko Carstens,
Uwe Kleine-König, Steven Rostedt, Nathan Chancellor,
Borislav Petkov, Sergei Trofimovich, Jonathan Cameron,
Thomas Gleixner, Michal Hocko, Dietmar Eggemann, LAK, Barry Song,
Ben Segall, Thomas Bogendoerfer, Daniel Borkmann, linux-parisc,
Chris Down, linuxppc-dev, Randy Dunlap, Nick Desaulniers, LKML,
Rasmus Villemoes, Andrew Morton, Tim Chen, David S. Miller,
Mike Rapoport
In-Reply-To: <CAGsJ_4wqtcOdsFDzR98PFbjxRyTqzf7P3p3erup84SXESYonYw@mail.gmail.com>
On 09/10/21 01:37, Barry Song wrote:
> On Sat, Oct 9, 2021 at 12:54 AM Valentin Schneider
> <valentin.schneider@arm.com> wrote:
>>
>> Barry recently introduced a new CONFIG_SCHED_CLUSTER, and discussions
>> around that highlighted that every architecture redefines its own help text
>> and dependencies for CONFIG_SCHED_SMT and CONFIG_SCHED_MC.
>>
>> Move the definition of those to scheduler's Kconfig to centralize help text
>> and generic dependencies (i.e. SMP). Make them depend on a matching
>> ARCH_SUPPORTS_SCHED_* which the architectures can select with the relevant
>> architecture-specific dependency.
>>
>> s390 uses its own topology table (set_sched_topology()) and doesn't seem to
>> cope without SCHED_MC or SCHED_SMT, so those remain untogglable.
>>
>
> Hi Valentin,
> Thanks!
> I believe this is a cleaner way for Kconfig itself. But I am not quite sure this
> is always beneficial for all platforms. It would be perfect if the patch has no
> side effects and doesn't change the existing behaviour. But it has side effects
> by changing the default N to Y on a couple of platforms.
>
So x86 has it default yes, and a lot of others (e.g. arm64) have it default
no.
IMO you don't gain much by disabling them. SCHED_MC and SCHED_CLUSTER only
control the presence of a sched_domain_topology_level - if it's useless it
gets degenerated at domain build time. One valid reason for not using
them is if the architecture defines its own topology table (e.g. powerpc
has CACHE and MC levels which are not gated behind any CONFIG).
SCHED_SMT has an impact on code generated in sched/core.c, but that is also
gated by a static key.
So I'd say having them default yes is sensible. I'd even say we should
change the "If unsure say N here." to "Y".
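The degeneration argument above can be pictured with a toy model. The following user-space sketch is a deliberate simplification and not the scheduler's code, which also takes domain flags into account; `struct toy_level`, `span_weight()` and `build_domains()` are hypothetical names, and CPU spans are modelled as plain 64-bit masks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One topology level as seen from a single CPU: a name and the CPUs it spans. */
struct toy_level {
	const char *name;
	uint64_t span; /* bit n set means CPU n is part of this level */
};

/* Count the CPUs in a span by clearing the lowest set bit until none remain. */
static int span_weight(uint64_t span)
{
	int n = 0;

	for (; span; span &= span - 1)
		n++;
	return n;
}

/*
 * Copy the useful levels from in[] to out[] and return how many were kept.
 * A level is dropped when it spans a single CPU, or when it spans exactly
 * the same CPUs as the last level kept below it (it adds no information).
 */
static size_t build_domains(const struct toy_level *in, size_t n,
			    struct toy_level *out)
{
	size_t kept = 0;

	assert(in != NULL && out != NULL);

	for (size_t i = 0; i < n; i++) {
		if (span_weight(in[i].span) <= 1)
			continue;
		if (kept > 0 && out[kept - 1].span == in[i].span)
			continue;
		out[kept++] = in[i];
	}
	return kept;
}
```

In this model an always-present level costs nothing at run time when it is redundant: it simply does not survive the build step.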
* Re: Add Apple M1 support to PASemi i2c driver
From: Christian Zigotzky @ 2021-10-09 6:10 UTC (permalink / raw)
To: Olof Johansson
Cc: Linux ARM Mailing List, Darren Stevens, Arnd Bergmann, Sven Peter,
Hector Martin, Linux Kernel Mailing List, Wolfram Sang,
Paul Mackerras, linux-i2c, R.T.Dickinson, mohamed.mediouni,
Matthew Leaman, Stan Skowronek, linuxppc-dev, R.T.Dickinson,
Alyssa Rosenzweig, Mark Kettenis
In-Reply-To: <CAOesGMgnx6P=J--bygw=vcL1b4mQbdACBX+Mwc7BtmEmMtP7KA@mail.gmail.com>
Olof,
Thank you for the hint.
I think I have found them.
Link: https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=266104
Mbox: https://patchwork.ozlabs.org/series/266104/mbox/
$ wget -O mbox https://patchwork.ozlabs.org/series/266104/mbox/
$ git am mbox
Thanks,
Christian
On 8. Oct 2021, at 22:47, Olof Johansson <olof@lixom.net> wrote:
Christian,
Self-service available on lore:
https://lore.kernel.org/all/20211008163532.75569-1-sven@svenpeter.dev/
There are links on there to download a whole thread as an mbox if needed.
-Olof
On Fri, Oct 8, 2021 at 1:20 PM Christian Zigotzky
<chzigotzky@xenosoft.de> wrote:
Hi Michael,
Do you have a mbox link for the v2 changes?
I would like to test them on my AmigaOne X1000.
Thanks,
Christian
On 27. Sep 2021, at 09:58, Michael Ellerman <mpe@ellerman.id.au> wrote:
Christian, the whole series is downloadable as a single mbox here:
https://patchwork.ozlabs.org/series/264134/mbox/
Save that to a file and apply with `git am`.
eg:
$ wget -O mbox https://patchwork.ozlabs.org/series/264134/mbox/
$ git am mbox
It applies cleanly on v5.15-rc3.
cheers
* Re: [PATCH 1/2] firmware: include drivers/firmware/Kconfig unconditionally
From: Paul Menzel @ 2021-10-09 9:24 UTC (permalink / raw)
To: Arnd Bergmann, Bjorn Andersson
Cc: linux-ia64, Geert Uytterhoeven, Catalin Marinas, Linus Walleij,
linux-kernel, James E.J. Bottomley, H. Peter Anvin, linux-riscv,
Will Deacon, Helge Deller, x86, Russell King, Ingo Molnar,
linux-mips, Albert Ou, Charles Keepax, Arnd Bergmann,
Simon Trimmer, Mark Brown, Borislav Petkov, Paul Walmsley,
Thomas Gleixner, linux-arm-kernel, Thomas Bogendoerfer,
linux-parisc, Greg Kroah-Hartman, Liam Girdwood, Palmer Dabbelt,
Andrew Morton, linuxppc-dev
In-Reply-To: <20210928075216.4193128-1-arnd@kernel.org>
[Cc: +linuxppc-dev@lists.ozlabs.org]
Dear Arnd,
Am 28.09.21 um 09:50 schrieb Arnd Bergmann:
> From: Arnd Bergmann <arnd@arndb.de>
>
> Compile-testing drivers that require access to a firmware layer
> fails when that firmware symbol is unavailable. This happened
> twice this week:
>
> - My proposed change to rework the QCOM_SCM firmware symbol
> broke on ppc64 and others.
>
> - The cs_dsp firmware patch added a device-specific firmware loader
> into drivers/firmware, which broke on the same set of
> architectures.
>
> We should probably do the same thing for other subsystems as well,
> but fix this one first as this is a dependency for other patches
> getting merged.
>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Liam Girdwood <lgirdwood@gmail.com>
> Cc: Charles Keepax <ckeepax@opensource.cirrus.com>
> Cc: Simon Trimmer <simont@opensource.cirrus.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> Not sure how we'd want to merge this patch, if two other things
> need it. I'd prefer to merge it along with the QCOM_SCM change
> through the soc tree, but that leaves the cirrus firmware broken
> unless we also merge it the same way (rather than through ASoC
> as it is now).
>
> Alternatively, we can try to find a different home for the Cirrus
> firmware to decouple the two problems. I'd argue that it's actually
> misplaced here, as drivers/firmware is meant for kernel code that
> interfaces with system firmware, not for device drivers to load
> their own firmware blobs from user space.
> ---
> arch/arm/Kconfig | 2 --
> arch/arm64/Kconfig | 2 --
> arch/ia64/Kconfig | 2 --
> arch/mips/Kconfig | 2 --
> arch/parisc/Kconfig | 2 --
> arch/riscv/Kconfig | 2 --
> arch/x86/Kconfig | 2 --
> drivers/Kconfig | 2 ++
> 8 files changed, 2 insertions(+), 14 deletions(-)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index ad96f3dd7e83..194d10bbff9e 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1993,8 +1993,6 @@ config ARCH_HIBERNATION_POSSIBLE
>
> endmenu
>
> -source "drivers/firmware/Kconfig"
> -
> if CRYPTO
> source "arch/arm/crypto/Kconfig"
> endif
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index ebb49585a63f..8749517482ae 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1931,8 +1931,6 @@ source "drivers/cpufreq/Kconfig"
>
> endmenu
>
> -source "drivers/firmware/Kconfig"
> -
> source "drivers/acpi/Kconfig"
>
> source "arch/arm64/kvm/Kconfig"
> diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
> index 045792cde481..1e33666fa679 100644
> --- a/arch/ia64/Kconfig
> +++ b/arch/ia64/Kconfig
> @@ -388,8 +388,6 @@ config CRASH_DUMP
> help
> Generate crash dump after being started by kexec.
>
> -source "drivers/firmware/Kconfig"
> -
> endmenu
>
> menu "Power management and ACPI options"
> diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
> index 771ca53af06d..6b8f591c5054 100644
> --- a/arch/mips/Kconfig
> +++ b/arch/mips/Kconfig
> @@ -3316,8 +3316,6 @@ source "drivers/cpuidle/Kconfig"
>
> endmenu
>
> -source "drivers/firmware/Kconfig"
> -
> source "arch/mips/kvm/Kconfig"
>
> source "arch/mips/vdso/Kconfig"
> diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
> index 4742b6f169b7..27a8b49af11f 100644
> --- a/arch/parisc/Kconfig
> +++ b/arch/parisc/Kconfig
> @@ -384,6 +384,4 @@ config KEXEC_FILE
>
> endmenu
>
> -source "drivers/firmware/Kconfig"
> -
> source "drivers/parisc/Kconfig"
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 301a54233c7e..6a6fa9e976d5 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -561,5 +561,3 @@ menu "Power management options"
> source "kernel/power/Kconfig"
>
> endmenu
> -
> -source "drivers/firmware/Kconfig"
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index e5ba8afd29a0..5dcec5f13a82 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2834,8 +2834,6 @@ config HAVE_ATOMIC_IOMAP
> def_bool y
> depends on X86_32
>
> -source "drivers/firmware/Kconfig"
> -
> source "arch/x86/kvm/Kconfig"
>
> source "arch/x86/Kconfig.assembler"
> diff --git a/drivers/Kconfig b/drivers/Kconfig
> index 30d2db37cc87..0d399ddaa185 100644
> --- a/drivers/Kconfig
> +++ b/drivers/Kconfig
> @@ -17,6 +17,8 @@ source "drivers/bus/Kconfig"
>
> source "drivers/connector/Kconfig"
>
> +source "drivers/firmware/Kconfig"
> +
> source "drivers/gnss/Kconfig"
>
> source "drivers/mtd/Kconfig"
>
With this change, I have the new entries below in my .config:
```
$ diff -u .config.old .config
--- .config.old 2021-10-07 11:38:39.544000000 +0200
+++ .config 2021-10-09 10:02:03.156000000 +0200
@@ -1992,6 +1992,25 @@
CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y
+
+#
+# Firmware Drivers
+#
+
+#
+# ARM System Control and Management Interface Protocol
+#
+# end of ARM System Control and Management Interface Protocol
+
+# CONFIG_FIRMWARE_MEMMAP is not set
+# CONFIG_GOOGLE_FIRMWARE is not set
+
+#
+# Tegra firmware driver
+#
+# end of Tegra firmware driver
+# end of Firmware Drivers
+
# CONFIG_GNSS is not set
CONFIG_MTD=m
# CONFIG_MTD_TESTS is not set
```
No idea if the entries could be hidden for platforms not supporting them.
ARM System Control and Management Interface Protocol ----
[ ] Add firmware-provided memory map to sysfs
[ ] Google Firmware Drivers ----
Tegra firmware driver ----
Kind regards,
Paul
* Re: [PATCH v2 10/11] i2c: pasemi: Add Apple platform driver
From: Wolfram Sang @ 2021-10-09 10:09 UTC (permalink / raw)
To: Sven Peter
Cc: Arnd Bergmann, Hector Martin, linux-kernel, linux-i2c,
Paul Mackerras, linux-arm-kernel, Christian Zigotzky,
Olof Johansson, Mohamed Mediouni, Mark Kettenis, linuxppc-dev,
Alyssa Rosenzweig, Stan Skowronek
In-Reply-To: <20211008163532.75569-11-sven@svenpeter.dev>
> F: arch/arm64/boot/dts/apple/
> +F: drivers/i2c/busses/i2c-pasemi-platform.c
We have no dedicated maintainer for PASEMI. Are maybe you or your
project interested in maintaining the pasemi-core, too? I guess not many
patches will show up and they will likely be for M1 anyhow.
If so, then no need to resend, I could add the extra line while
applying.
* Re: [PATCH v2 00/11] Add Apple M1 support to PASemi i2c driver
From: Wolfram Sang @ 2021-10-09 10:10 UTC (permalink / raw)
To: Sven Peter
Cc: Arnd Bergmann, Hector Martin, linux-kernel, linux-i2c,
Paul Mackerras, linux-arm-kernel, Christian Zigotzky,
Olof Johansson, Mohamed Mediouni, Mark Kettenis, linuxppc-dev,
Alyssa Rosenzweig, Stan Skowronek
In-Reply-To: <20211008163532.75569-1-sven@svenpeter.dev>
> I still don't have access to any old PASemi hardware but the changes from
> v1 are pretty small and I expect them to still work. Would still be nice
> if someone with access to such hardware could give this a quick test.
Looks good to me. I will wait a few more days so that people can report
their tests. But it will be in the next merge window.
* [powerpc:merge] BUILD SUCCESS 83467bc737d9f37f076f208ccdcd929a96d86dcc
From: kernel test robot @ 2021-10-09 10:41 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: 83467bc737d9f37f076f208ccdcd929a96d86dcc Automatic merge of 'fixes' into merge (2021-10-09 00:21)
elapsed time: 1230m
configs tested: 192
configs skipped: 4
The following configs have been built successfully.
More configs may be tested in the coming days.
gcc tested configs:
arm defconfig
arm64 allyesconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
i386 randconfig-c001-20211009
sh se7206_defconfig
sh sh7724_generic_defconfig
powerpc pasemi_defconfig
x86_64 defconfig
arm cerfcube_defconfig
powerpc mpc866_ads_defconfig
arm bcm2835_defconfig
powerpc cm5200_defconfig
microblaze mmu_defconfig
m68k amiga_defconfig
arm realview_defconfig
mips loongson3_defconfig
arm axm55xx_defconfig
arm stm32_defconfig
powerpc pseries_defconfig
xtensa alldefconfig
arm moxart_defconfig
sh se7724_defconfig
arc nsimosci_hs_smp_defconfig
arm cns3420vb_defconfig
mips rs90_defconfig
xtensa defconfig
sh sh7763rdp_defconfig
powerpc mpc83xx_defconfig
sh rts7751r2d1_defconfig
m68k atari_defconfig
arm mxs_defconfig
arc haps_hs_defconfig
sh sh7770_generic_defconfig
arm mvebu_v7_defconfig
powerpc tqm8xx_defconfig
powerpc mpc8560_ads_defconfig
sh titan_defconfig
sh espt_defconfig
arm jornada720_defconfig
powerpc mpc885_ads_defconfig
arm imx_v4_v5_defconfig
mips mpc30x_defconfig
arm collie_defconfig
sh kfr2r09-romimage_defconfig
powerpc socrates_defconfig
arm pcm027_defconfig
mips loongson1b_defconfig
arm64 alldefconfig
mips maltaup_xpa_defconfig
riscv allnoconfig
arm ixp4xx_defconfig
powerpc mgcoge_defconfig
mips mtx1_defconfig
sh se7712_defconfig
sh secureedge5410_defconfig
sh rsk7264_defconfig
mips malta_qemu_32r6_defconfig
powerpc g5_defconfig
arm keystone_defconfig
riscv defconfig
arm vexpress_defconfig
powerpc ppc40x_defconfig
um defconfig
mips ip22_defconfig
mips sb1250_swarm_defconfig
arm versatile_defconfig
powerpc mpc836x_mds_defconfig
arm gemini_defconfig
m68k q40_defconfig
sh sh7785lcr_32bit_defconfig
sh j2_defconfig
sh shmin_defconfig
sh se7619_defconfig
sh se7721_defconfig
powerpc mpc85xx_cds_defconfig
m68k m5307c3_defconfig
arm milbeaut_m10v_defconfig
arm colibri_pxa270_defconfig
arm mps2_defconfig
sh lboxre2_defconfig
arm davinci_all_defconfig
arm dove_defconfig
powerpc iss476-smp_defconfig
mips xway_defconfig
m68k multi_defconfig
x86_64 randconfig-c001-20211008
i386 randconfig-c001-20211008
arm randconfig-c002-20211008
x86_64 randconfig-c001-20211009
arm randconfig-c002-20211009
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k defconfig
m68k allmodconfig
m68k allyesconfig
nios2 defconfig
nds32 allnoconfig
arc allyesconfig
nds32 defconfig
nios2 allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 defconfig
parisc allyesconfig
s390 allyesconfig
s390 allmodconfig
sparc allyesconfig
sparc defconfig
i386 defconfig
i386 allyesconfig
mips allyesconfig
mips allmodconfig
powerpc allmodconfig
powerpc allnoconfig
powerpc allyesconfig
x86_64 randconfig-a003-20211009
x86_64 randconfig-a005-20211009
x86_64 randconfig-a001-20211009
x86_64 randconfig-a002-20211009
x86_64 randconfig-a004-20211009
x86_64 randconfig-a006-20211009
i386 randconfig-a001-20211009
i386 randconfig-a003-20211009
i386 randconfig-a005-20211009
i386 randconfig-a004-20211009
i386 randconfig-a002-20211009
i386 randconfig-a006-20211009
x86_64 randconfig-a015-20211008
x86_64 randconfig-a012-20211008
x86_64 randconfig-a016-20211008
x86_64 randconfig-a013-20211008
x86_64 randconfig-a011-20211008
x86_64 randconfig-a014-20211008
i386 randconfig-a013-20211008
i386 randconfig-a016-20211008
i386 randconfig-a014-20211008
i386 randconfig-a011-20211008
i386 randconfig-a012-20211008
i386 randconfig-a015-20211008
arc randconfig-r043-20211008
s390 randconfig-r044-20211008
riscv randconfig-r042-20211008
riscv nommu_k210_defconfig
riscv allyesconfig
riscv nommu_virt_defconfig
riscv rv32_defconfig
riscv allmodconfig
x86_64 rhel-8.3-kselftests
um x86_64_defconfig
um i386_defconfig
x86_64 allyesconfig
x86_64 rhel-8.3
x86_64 kexec
clang tested configs:
x86_64 randconfig-a003-20211008
x86_64 randconfig-a005-20211008
x86_64 randconfig-a001-20211008
x86_64 randconfig-a002-20211008
x86_64 randconfig-a004-20211008
x86_64 randconfig-a006-20211008
i386 randconfig-a001-20211008
i386 randconfig-a003-20211008
i386 randconfig-a005-20211008
i386 randconfig-a004-20211008
i386 randconfig-a002-20211008
i386 randconfig-a006-20211008
x86_64 randconfig-a015-20211009
x86_64 randconfig-a012-20211009
x86_64 randconfig-a016-20211009
x86_64 randconfig-a013-20211009
x86_64 randconfig-a011-20211009
x86_64 randconfig-a014-20211009
i386 randconfig-a013-20211009
i386 randconfig-a016-20211009
i386 randconfig-a014-20211009
i386 randconfig-a012-20211009
i386 randconfig-a011-20211009
i386 randconfig-a015-20211009
hexagon randconfig-r045-20211009
hexagon randconfig-r041-20211009
s390 randconfig-r044-20211009
riscv randconfig-r042-20211009
hexagon randconfig-r045-20211008
hexagon randconfig-r041-20211008
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* Re: [PATCH v2 10/11] i2c: pasemi: Add Apple platform driver
From: Sven Peter @ 2021-10-09 11:29 UTC (permalink / raw)
To: Wolfram Sang
Cc: Arnd Bergmann, Hector Martin, linux-kernel, linux-i2c,
Paul Mackerras, linux-arm-kernel, Christian Zigotzky,
Olof Johansson, Mohamed Mediouni, Mark Kettenis, linuxppc-dev,
Alyssa Rosenzweig, Stan Skowronek
In-Reply-To: <YWFqUuc7I5Dh8+w6@ninjato>
On Sat, Oct 9, 2021, at 12:09, Wolfram Sang wrote:
>> F: arch/arm64/boot/dts/apple/
>> +F: drivers/i2c/busses/i2c-pasemi-platform.c
>
> We have no dedicated maintainer for PASEMI. Are maybe you or your
> project interested in maintaining the pasemi-core, too? I guess not many
> patches will show up and they will likely be for M1 anyhow.
>
> If so, then no need to resend, I could add the extra line while
> applying.
Sure, feel free to add the core to the entry as well.
Best,
Sven
* Re: [PATCH v2 00/11] Add Apple M1 support to PASemi i2c driver
From: Sven Peter @ 2021-10-09 11:30 UTC (permalink / raw)
To: Wolfram Sang
Cc: Arnd Bergmann, Hector Martin, linux-kernel, linux-i2c,
Paul Mackerras, linux-arm-kernel, Christian Zigotzky,
Olof Johansson, Mohamed Mediouni, Mark Kettenis, linuxppc-dev,
Alyssa Rosenzweig, Stan Skowronek
In-Reply-To: <YWFqr4uQGlNgnT1z@ninjato>
On Sat, Oct 9, 2021, at 12:10, Wolfram Sang wrote:
>> I still don't have access to any old PASemi hardware but the changes from
>> v1 are pretty small and I expect them to still work. Would still be nice
>> if someone with access to such hardware could give this a quick test.
>
> Looks good to me. I will wait a few more days so that people can report
> their tests. But it will be in the next merge window.
Sounds great, thanks!
Sven
* [PATCH v10 1/3] tty: hvc: use correct dma alignment size
From: Xianting Tian @ 2021-10-09 11:48 UTC (permalink / raw)
To: gregkh, jirislaby, amit, arnd, osandov
Cc: Xianting Tian, shile.zhang, linuxppc-dev, linux-kernel,
virtualization
In-Reply-To: <20211009114829.1071021-1-xianting.tian@linux.alibaba.com>
Use L1_CACHE_BYTES as the DMA alignment size; using 'sizeof(long)'
is wrong.
Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
Reviewed-by: Shile Zhang <shile.zhang@linux.alibaba.com>
---
drivers/tty/hvc/hvc_console.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 5bb8c4e44..5957ab728 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -49,7 +49,7 @@
#define N_OUTBUF 16
#define N_INBUF 16
-#define __ALIGNED__ __attribute__((__aligned__(sizeof(long))))
+#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES)))
static struct tty_driver *hvc_driver;
static struct task_struct *hvc_task;
--
2.17.1
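The effect of the alignment change can be checked in user space. The sketch below is not kernel code: `CACHE_LINE` is an assumed stand-in for `L1_CACHE_BYTES` (64 here, while the real value is per-architecture), and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-in for the kernel's L1_CACHE_BYTES. */
#define CACHE_LINE 64
#define N_OUTBUF 16

/* Buffer aligned only to sizeof(long), as before the patch. */
struct long_aligned {
	char tag;
	char buf[N_OUTBUF] __attribute__((__aligned__(sizeof(long))));
};

/* Buffer aligned to a full cache line, as after the patch. */
struct line_aligned {
	char tag;
	char buf[N_OUTBUF] __attribute__((__aligned__(CACHE_LINE)));
};

/* Returns 1 if a member at this offset starts on a cache-line boundary. */
static int starts_on_cache_line(size_t offset)
{
	return offset % CACHE_LINE == 0;
}

/* Sanity check of the layout claims; aborts if one does not hold. */
static void check_layout(void)
{
	assert(starts_on_cache_line(offsetof(struct line_aligned, buf)));
	assert(sizeof(struct line_aligned) % CACHE_LINE == 0);
}
```

With `sizeof(long)` alignment the buffer shares a cache line with `tag`; with cache-line alignment it gets a line of its own, which is the property that matters for DMA on systems without coherent caches.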
* [PATCH v10 0/3] make hvc pass dma capable memory to its backend
From: Xianting Tian @ 2021-10-09 11:48 UTC (permalink / raw)
To: gregkh, jirislaby, amit, arnd, osandov
Cc: Xianting Tian, shile.zhang, linuxppc-dev, linux-kernel,
virtualization
Dear all,
This patch series makes the hvc framework pass DMA-capable memory to
put_chars() of the hvc backend (e.g. virtio-console), and reverts commit
c4baad5029 ("virtio-console: avoid DMA from stack").
V1
virtio-console: avoid DMA from vmalloc area
https://lkml.org/lkml/2021/7/27/494
For the v1 patch, Arnd Bergmann suggested fixing the issue at its
source:
Make hvc pass DMA capable memory to put_chars()
The suggested fix is included in v2.
V2
[PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars()
https://lkml.org/lkml/2021/8/1/8
[PATCH 2/2] virtio-console: remove unnecessary kmemdup()
https://lkml.org/lkml/2021/8/1/9
For the v2 patch, Arnd Bergmann suggested making the new buf part of the
hvc_struct structure, and fixing the compile issue.
The suggested fix is included in v3.
V3
[PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()
https://lkml.org/lkml/2021/8/3/1347
[PATCH v3 2/2] virtio-console: remove unnecessary kmemdup()
https://lkml.org/lkml/2021/8/3/1348
For the v3 patch, Jiri Slaby suggested making 'char c[N_OUTBUF]' part of
hvc_struct, making 'hp->outbuf' aligned, and using struct_size() to
calculate the size of hvc_struct. The suggested fix is included in
v4.
V4
[PATCH v4 0/2] make hvc pass dma capable memory to its backend
https://lkml.org/lkml/2021/8/5/1350
[PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars()
https://lkml.org/lkml/2021/8/5/1351
[PATCH v4 2/2] virtio-console: remove unnecessary kmemdup()
https://lkml.org/lkml/2021/8/5/1352
For the v4 patch, Arnd Bergmann suggested introducing another
array (cons_outbuf[]) for the buffer pointers next to the cons_ops[]
and vtermnos[] arrays. This fix is included in v5.
V5
Arnd Bergmann suggested using L1_CACHE_BYTES as the DMA alignment;
using 'sizeof(long)' as the DMA alignment is wrong. Fixed in v6.
V6
It contained a coding error, which was fixed in v7; v7 worked normally
according to the test results.
V7
Greg KH suggested having a developer test and review the code.
Jiri Slaby suggested using a lockless buffer and fixing the DMA alignment
in a separate patch.
Both were addressed in v8.
V8
It contained a coding error in the switch to the new buffer. Fixed in v9.
V9
It didn't make things much clearer; it needed more comments for the newly
added buffers, and a lock to protect them. Fixed in v10.
********TEST STEPS*********
1, config guest console=hvc0
2, start guest
3, login guest
Welcome to Buildroot
buildroot login: root
#
# cat /proc/cmdline
console=hvc0,115200
#
drivers/tty/hvc/hvc_console.c | 38 +++++++++++++++++++++--------------
drivers/tty/hvc/hvc_console.h | 24 ++++++++++++++++++++--
drivers/char/virtio_console.c | 12 ++----------
3 files changed
* [PATCH v10 3/3] virtio-console: remove unnecessary kmemdup()
From: Xianting Tian @ 2021-10-09 11:48 UTC (permalink / raw)
To: gregkh, jirislaby, amit, arnd, osandov
Cc: Xianting Tian, shile.zhang, linuxppc-dev, linux-kernel,
virtualization
In-Reply-To: <20211009114829.1071021-1-xianting.tian@linux.alibaba.com>
This reverts commit c4baad5029 ("virtio-console: avoid DMA from stack").
The hvc framework will never pass stack memory to the put_chars() function,
so the call to kmemdup() is unnecessary and can be removed.
Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
Reviewed-by: Shile Zhang <shile.zhang@linux.alibaba.com>
---
drivers/char/virtio_console.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 7eaf303a7..4ed3ffb1d 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count)
{
struct port *port;
struct scatterlist sg[1];
- void *data;
- int ret;
if (unlikely(early_put_chars))
return early_put_chars(vtermno, buf, count);
@@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count)
if (!port)
return -EPIPE;
- data = kmemdup(buf, count, GFP_ATOMIC);
- if (!data)
- return -ENOMEM;
-
- sg_init_one(sg, data, count);
- ret = __send_to_port(port, sg, 1, count, data, false);
- kfree(data);
- return ret;
+ sg_init_one(sg, buf, count);
+ return __send_to_port(port, sg, 1, count, (void *)buf, false);
}
/*
--
2.17.1
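The trade-off behind this revert can be shown with a user-space analogue. This is a sketch under stated assumptions, not the driver: `toy_send()`, `put_chars_copy()` and `put_chars_direct()` are hypothetical names, `malloc()`/`memcpy()` stand in for `kmemdup()`, and the counters exist only so the behaviour can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical transport: only counts the bytes it was asked to send. */
static size_t sent_total;
/* Number of defensive copies made so far. */
static size_t copies_made;

static int toy_send(const void *data, size_t count)
{
	assert(data != NULL);
	sent_total += count;
	return (int)count;
}

/* Old shape: duplicate the caller's buffer on every call, then send the copy. */
static int put_chars_copy(const char *buf, size_t count)
{
	void *data = malloc(count);
	int ret;

	if (!data)
		return -ENOMEM;
	memcpy(data, buf, count);
	copies_made++;
	ret = toy_send(data, count);
	free(data);
	return ret;
}

/* New shape: the caller guarantees a suitable buffer, so send it as-is. */
static int put_chars_direct(const char *buf, size_t count)
{
	return toy_send(buf, count);
}
```

The direct variant has no allocation and no failure path of its own, but it is only correct if every caller honours the buffer guarantee, which is what the hvc-side change in this series is meant to provide.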
* [PATCH v10 2/3] tty: hvc: pass DMA capable memory to put_chars()
From: Xianting Tian @ 2021-10-09 11:48 UTC (permalink / raw)
To: gregkh, jirislaby, amit, arnd, osandov
Cc: Xianting Tian, shile.zhang, linuxppc-dev, linux-kernel,
virtualization
In-Reply-To: <20211009114829.1071021-1-xianting.tian@linux.alibaba.com>
As is well known, an hvc backend can register its operations with the
hvc framework. The operations include put_chars(), get_chars() and so on.
Some hvc backends may do DMA in their operations, e.g. put_chars() of
virtio-console. But the hvc framework may pass DMA-incapable memory to
put_chars() under a specific configuration, which is explained in
commit c4baad5029 ("virtio-console: avoid DMA from stack"):
1, c[] is on stack,
hvc_console_print():
char c[N_OUTBUF] __ALIGNED__;
cons_ops[index]->put_chars(vtermnos[index], c, i);
2, ch is on stack,
static void hvc_poll_put_char(,,char ch)
{
struct tty_struct *tty = driver->ttys[0];
struct hvc_struct *hp = tty->driver_data;
int n;
do {
n = hp->ops->put_chars(hp->vtermno, &ch, 1);
} while (n <= 0);
}
Commit c4baad5029 is just the fix to avoid DMA from stack memory, which
is passed to virtio-console by the hvc framework in the above code. But I
think the fix is aggressive: it directly uses kmemdup() to allocate a new
buffer from the kmalloc area and does a memcpy regardless of whether the
memory is already in the kmalloc area. More importantly, this is better
fixed in the hvc framework, by changing it to never pass stack memory to
the put_chars() function in the first place. Otherwise, we would face the
same issue again if a new hvc backend using DMA is added in the future.
This patch adds 'char cons_outbuf[]' to 'struct hvc_struct', so
hp->cons_outbuf is no longer stack memory and can be used in case 1
above. It also adds 'char outchar' to 'struct hvc_struct' for use in
case 2 above. Each of these buffers gets its own lock instead of using
the global lock of hvc.
It introduces another array (cons_hvcs[]) for hvc pointers next to the
cons_ops[] and vtermnos[] arrays. With this array, we can easily find
an hvc's cons_outbuf and its lock.
With this patch, we can revert the fix c4baad5029.
Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
---
drivers/tty/hvc/hvc_console.c | 37 +++++++++++++++++++++--------------
drivers/tty/hvc/hvc_console.h | 24 +++++++++++++++++++++--
2 files changed, 44 insertions(+), 17 deletions(-)
diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 5bb8c4e44..4d8f112f2 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -41,16 +41,6 @@
*/
#define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */
-/*
- * These sizes are most efficient for vio, because they are the
- * native transfer size. We could make them selectable in the
- * future to better deal with backends that want other buffer sizes.
- */
-#define N_OUTBUF 16
-#define N_INBUF 16
-
-#define __ALIGNED__ __attribute__((__aligned__(sizeof(long))))
-
static struct tty_driver *hvc_driver;
static struct task_struct *hvc_task;
@@ -142,6 +132,7 @@ static int hvc_flush(struct hvc_struct *hp)
static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES];
static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
{[0 ... MAX_NR_HVC_CONSOLES - 1] = -1};
+static struct hvc_struct *cons_hvcs[MAX_NR_HVC_CONSOLES];
/*
* Console APIs, NOT TTY. These APIs are available immediately when
@@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
static void hvc_console_print(struct console *co, const char *b,
unsigned count)
{
- char c[N_OUTBUF] __ALIGNED__;
+ char *c;
unsigned i = 0, n = 0;
int r, donecr = 0, index = co->index;
+ unsigned long flags;
+ struct hvc_struct *hp;
/* Console access attempt outside of acceptable console range. */
if (index >= MAX_NR_HVC_CONSOLES)
@@ -163,6 +156,13 @@ static void hvc_console_print(struct console *co, const char *b,
if (vtermnos[index] == -1)
return;
+ hp = cons_hvcs[index];
+ if (!hp)
+ return;
+
+ c = hp->cons_outbuf;
+
+ spin_lock_irqsave(&hp->cons_outbuf_lock, flags);
while (count > 0 || i > 0) {
if (count > 0 && i < sizeof(c)) {
if (b[n] == '\n' && !donecr) {
@@ -191,6 +191,7 @@ static void hvc_console_print(struct console *co, const char *b,
}
}
}
+ spin_unlock_irqrestore(&hp->cons_outbuf_lock, flags);
hvc_console_flush(cons_ops[index], vtermnos[index]);
}
@@ -878,9 +879,13 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch)
struct tty_struct *tty = driver->ttys[0];
struct hvc_struct *hp = tty->driver_data;
int n;
+ unsigned long flags;
do {
- n = hp->ops->put_chars(hp->vtermno, &ch, 1);
+ spin_lock_irqsave(&hp->outchar_lock, flags);
+ hp->outchar = ch;
+ n = hp->ops->put_chars(hp->vtermno, &hp->outchar, 1);
+ spin_unlock_irqrestore(&hp->outchar_lock, flags);
} while (n <= 0);
}
#endif
@@ -922,8 +927,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
return ERR_PTR(err);
}
- hp = kzalloc(ALIGN(sizeof(*hp), sizeof(long)) + outbuf_size,
- GFP_KERNEL);
+ hp = kzalloc(struct_size(hp, outbuf, outbuf_size), GFP_KERNEL);
if (!hp)
return ERR_PTR(-ENOMEM);
@@ -931,13 +935,14 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
hp->data = data;
hp->ops = ops;
hp->outbuf_size = outbuf_size;
- hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))];
tty_port_init(&hp->port);
hp->port.ops = &hvc_port_ops;
INIT_WORK(&hp->tty_resize, hvc_set_winsz);
spin_lock_init(&hp->lock);
+ spin_lock_init(&hp->outchar_lock);
+ spin_lock_init(&hp->cons_outbuf_lock);
mutex_lock(&hvc_structs_mutex);
/*
@@ -964,6 +969,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
if (i < MAX_NR_HVC_CONSOLES) {
cons_ops[i] = ops;
vtermnos[i] = vtermno;
+ cons_hvcs[i] = hp;
}
list_add_tail(&(hp->next), &hvc_structs);
@@ -988,6 +994,7 @@ int hvc_remove(struct hvc_struct *hp)
if (hp->index < MAX_NR_HVC_CONSOLES) {
vtermnos[hp->index] = -1;
cons_ops[hp->index] = NULL;
+ cons_hvcs[hp->index] = NULL;
}
/* Don't whack hp->irq because tty_hangup() will need to free the irq. */
diff --git a/drivers/tty/hvc/hvc_console.h b/drivers/tty/hvc/hvc_console.h
index 18d005814..98f0ced83 100644
--- a/drivers/tty/hvc/hvc_console.h
+++ b/drivers/tty/hvc/hvc_console.h
@@ -32,13 +32,21 @@
*/
#define HVC_ALLOC_TTY_ADAPTERS 8
+/*
+ * These sizes are most efficient for vio, because they are the
+ * native transfer size. We could make them selectable in the
+ * future to better deal with backends that want other buffer sizes.
+ */
+#define N_OUTBUF 16
+#define N_INBUF 16
+
+#define __ALIGNED__ __attribute__((__aligned__(sizeof(long))))
+
struct hvc_struct {
struct tty_port port;
spinlock_t lock;
int index;
int do_wakeup;
- char *outbuf;
- int outbuf_size;
int n_outbuf;
uint32_t vtermno;
const struct hv_ops *ops;
@@ -48,6 +56,18 @@ struct hvc_struct {
struct work_struct tty_resize;
struct list_head next;
unsigned long flags;
+
+ /* the buf is used in hvc console api for putting chars */
+ char cons_outbuf[N_OUTBUF] __ALIGNED__;
+ spinlock_t cons_outbuf_lock;
+
+ /* the buf is for putting single char to tty */
+ char outchar;
+ spinlock_t outchar_lock;
+
+ /* the buf is for putting chars to tty */
+ int outbuf_size;
+ char outbuf[0] __ALIGNED__;
};
/* implemented by a low level driver */
--
2.17.1
^ permalink raw reply related
* Re: [PATCH v10 2/3] tty: hvc: pass DMA capable memory to put_chars()
From: Greg KH @ 2021-10-09 11:55 UTC (permalink / raw)
To: Xianting Tian
Cc: arnd, amit, jirislaby, shile.zhang, linux-kernel, virtualization,
linuxppc-dev, osandov
In-Reply-To: <20211009114829.1071021-3-xianting.tian@linux.alibaba.com>
On Sat, Oct 09, 2021 at 07:48:28PM +0800, Xianting Tian wrote:
> As is well known, an hvc backend can register its operations with the
> hvc framework; the operations include put_chars(), get_chars() and so on.
>
> Some hvc backends may do DMA in their operations, e.g. put_chars() of
> virtio-console. But in the code of hvc framework, it may pass DMA
> incapable memory to put_chars() under a specific configuration, which
> is explained in commit c4baad5029(virtio-console: avoid DMA from stack):
> 1, c[] is on stack,
> hvc_console_print():
> char c[N_OUTBUF] __ALIGNED__;
> cons_ops[index]->put_chars(vtermnos[index], c, i);
> 2, ch is on stack,
> static void hvc_poll_put_char(,,char ch)
> {
> struct tty_struct *tty = driver->ttys[0];
> struct hvc_struct *hp = tty->driver_data;
> int n;
>
> do {
> n = hp->ops->put_chars(hp->vtermno, &ch, 1);
> } while (n <= 0);
> }
>
> Commit c4baad5029 is the fix that avoids DMA from stack memory, which
> is passed to virtio-console by the hvc framework in the code above. But I
> think the fix is aggressive: it directly uses kmemdup() to allocate a new
> buffer from the kmalloc area and does a memcpy regardless of whether the
> memory is already in the kmalloc area. More importantly, this is better
> fixed in the hvc framework, by changing it to never pass stack memory to
> the put_chars() function in the first place. Otherwise, we would face the
> same issue again if a new hvc backend using DMA were added in the future.
>
> In this patch, add 'char cons_outbuf[]' as part of 'struct hvc_struct',
> so hp->cons_outbuf is no longer stack memory and can be used in case 1
> above. Add 'char outchar' as part of 'struct hvc_struct' for use in
> case 2 above. We also add a lock for each of these buffers to protect
> them separately instead of using the global lock of hvc.
>
> Introduce another array(cons_hvcs[]) for hvc pointers next to the
> cons_ops[] and vtermnos[] arrays. With the array, we can easily find
> hvc's cons_outbuf and its lock.
>
> With the patch, we can revert the fix c4baad5029.
>
> Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
> Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
> ---
> drivers/tty/hvc/hvc_console.c | 37 +++++++++++++++++++++--------------
> drivers/tty/hvc/hvc_console.h | 24 +++++++++++++++++++++--
> 2 files changed, 44 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
> index 5bb8c4e44..4d8f112f2 100644
> --- a/drivers/tty/hvc/hvc_console.c
> +++ b/drivers/tty/hvc/hvc_console.c
> @@ -41,16 +41,6 @@
> */
> #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */
>
> -/*
> - * These sizes are most efficient for vio, because they are the
> - * native transfer size. We could make them selectable in the
> - * future to better deal with backends that want other buffer sizes.
> - */
> -#define N_OUTBUF 16
> -#define N_INBUF 16
> -
> -#define __ALIGNED__ __attribute__((__aligned__(sizeof(long))))
> -
Are you sure this applies on top of patch 1/3 here?
> +/*
> + * These sizes are most efficient for vio, because they are the
> + * native transfer size. We could make them selectable in the
> + * future to better deal with backends that want other buffer sizes.
> + */
> +#define N_OUTBUF 16
> +#define N_INBUF 16
> +
> +#define __ALIGNED__ __attribute__((__aligned__(sizeof(long))))
Again, are you sure this is correct?
thanks,
greg k-h
^ permalink raw reply
* Re: [PATCH v10 2/3] tty: hvc: pass DMA capable memory to put_chars()
From: Greg KH @ 2021-10-09 11:58 UTC (permalink / raw)
To: Xianting Tian
Cc: arnd, amit, jirislaby, shile.zhang, linux-kernel, virtualization,
linuxppc-dev, osandov
In-Reply-To: <20211009114829.1071021-3-xianting.tian@linux.alibaba.com>
On Sat, Oct 09, 2021 at 07:48:28PM +0800, Xianting Tian wrote:
> --- a/drivers/tty/hvc/hvc_console.h
> +++ b/drivers/tty/hvc/hvc_console.h
> @@ -32,13 +32,21 @@
> */
> #define HVC_ALLOC_TTY_ADAPTERS 8
>
> +/*
> + * These sizes are most efficient for vio, because they are the
> + * native transfer size. We could make them selectable in the
> + * future to better deal with backends that want other buffer sizes.
> + */
> +#define N_OUTBUF 16
> +#define N_INBUF 16
> +
> +#define __ALIGNED__ __attribute__((__aligned__(sizeof(long))))
Does this conflict with what is in hvcs.c?
> +
> struct hvc_struct {
> struct tty_port port;
> spinlock_t lock;
> int index;
> int do_wakeup;
> - char *outbuf;
> - int outbuf_size;
> int n_outbuf;
> uint32_t vtermno;
> const struct hv_ops *ops;
> @@ -48,6 +56,18 @@ struct hvc_struct {
> struct work_struct tty_resize;
> struct list_head next;
> unsigned long flags;
> +
> + /* the buf is used in hvc console api for putting chars */
> + char cons_outbuf[N_OUTBUF] __ALIGNED__;
> + spinlock_t cons_outbuf_lock;
Did you look at the placement using pahole as to how this structure now
looks?
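For readers who have not used pahole: it reports the byte offset and size of every member and points out padding holes. A rough userspace approximation of that question, using offsetof(), is sketched below. All demo_* names are hypothetical, only the tail of the structure is modelled, and the stand-in lock is a plain int, whereas a real spinlock_t changes size with the kernel configuration.

```c
#include <stddef.h>
#include <stdio.h>

#define DEMO_N_OUTBUF 16
#define DEMO_ALIGNED  __attribute__((__aligned__(sizeof(long))))

/* Hypothetical stand-in: a real spinlock_t varies in size with the config. */
typedef struct { int raw; } demo_spinlock_t;

/* Models only the members the patch appends to struct hvc_struct. */
struct demo_hvc_tail {
	char cons_outbuf[DEMO_N_OUTBUF] DEMO_ALIGNED;
	demo_spinlock_t cons_outbuf_lock;
	char outchar;
	demo_spinlock_t outchar_lock;
	int outbuf_size;
	char outbuf[] DEMO_ALIGNED;
};

/* Bytes of padding the compiler inserts between outchar and outchar_lock. */
size_t demo_hole_after_outchar(void)
{
	return offsetof(struct demo_hvc_tail, outchar_lock) -
	       (offsetof(struct demo_hvc_tail, outchar) + sizeof(char));
}

/* Print member offsets, similar in spirit to what pahole reports. */
void demo_print_layout(void)
{
	printf("cons_outbuf      @ %zu\n",
	       offsetof(struct demo_hvc_tail, cons_outbuf));
	printf("cons_outbuf_lock @ %zu\n",
	       offsetof(struct demo_hvc_tail, cons_outbuf_lock));
	printf("outchar          @ %zu\n",
	       offsetof(struct demo_hvc_tail, outchar));
	printf("outchar_lock     @ %zu (hole before: %zu bytes)\n",
	       offsetof(struct demo_hvc_tail, outchar_lock),
	       demo_hole_after_outchar());
	printf("outbuf           @ %zu\n",
	       offsetof(struct demo_hvc_tail, outbuf));
}
```

In this model the single char is followed by padding up to the alignment of the next lock, which is the kind of hole pahole would flag in the real structure.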
> +
> + /* the buf is for putting single char to tty */
> + char outchar;
> + spinlock_t outchar_lock;
So you have a lock for a character and a different one for a longer
string? Why can they not just use the same lock? Why are 2 needed at
all, can't you just use the first character of cons_outbuf[] instead?
Surely you do not have 2 sends happening at the same time, right?
> +
> + /* the buf is for putting chars to tty */
> + int outbuf_size;
> + char outbuf[0] __ALIGNED__;
I thought we were not allowing [0] anymore in kernel structures?
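For reference, the kernel's deprecation notes prefer a C99 flexible array member (outbuf[]) over a zero-length array, and the hvc_alloc() hunk of this patch already sizes the allocation with struct_size(). A minimal userspace sketch of the same pattern follows; demo_flex_hvc and demo_flex_alloc() are hypothetical names, and calloc() with an explicit overflow check stands in for kzalloc(struct_size(...)).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_FLEX_ALIGNED __attribute__((__aligned__(sizeof(long))))

/* Hypothetical, trimmed-down model of the tail of struct hvc_struct. */
struct demo_flex_hvc {
	int index;
	int outbuf_size;
	char outbuf[] DEMO_FLEX_ALIGNED;  /* flexible array member, not outbuf[0] */
};

/*
 * Allocate the header plus outbuf_size trailing bytes in one zeroed block.
 * The explicit check mirrors the overflow protection struct_size() gives.
 */
struct demo_flex_hvc *demo_flex_alloc(int outbuf_size)
{
	struct demo_flex_hvc *hp;

	if (outbuf_size < 0 ||
	    (size_t)outbuf_size > SIZE_MAX - sizeof(*hp))
		return NULL;

	hp = calloc(1, sizeof(*hp) + (size_t)outbuf_size);
	if (!hp)
		return NULL;

	hp->outbuf_size = outbuf_size;
	return hp;
}
```

With a flexible array member the buffer needs no separate pointer, which is why the patch can drop the old hp->outbuf assignment.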
thanks,
greg k-h
^ permalink raw reply
* Re: [PATCH v2 00/11] Add Apple M1 support to PASemi i2c driver
From: Christian Zigotzky @ 2021-10-09 13:57 UTC (permalink / raw)
To: Wolfram Sang, Sven Peter, Michael Ellerman,
Benjamin Herrenschmidt, Paul Mackerras, Olof Johansson,
Arnd Bergmann, Hector Martin, Mohamed Mediouni, Stan Skowronek,
Mark Kettenis, Alyssa Rosenzweig, linux-arm-kernel, linuxppc-dev,
linux-i2c, linux-kernel, R.T.Dickinson, Matthew Leaman,
Darren Stevens
In-Reply-To: <YWFqr4uQGlNgnT1z@ninjato>
On 09 October 2021 at 12:10 pm, Wolfram Sang wrote:
>> I still don't have access to any old PASemi hardware but the changes from
>> v1 are pretty small and I expect them to still work. Would still be nice
>> if someone with access to such hardware could give this a quick test.
> Looks good to me. I will wait a few more days so that people can report
> their tests. But it will be in the next merge window.
>
Series v2:
Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de> [1]
- Christian
[1] https://forum.hyperion-entertainment.com/viewtopic.php?p=54213#p54213
^ permalink raw reply
* Re: [PATCH v10 2/3] tty: hvc: pass DMA capable memory to put_chars()
From: Xianting Tian @ 2021-10-09 15:45 UTC (permalink / raw)
To: Greg KH
Cc: arnd, amit, jirislaby, shile.zhang, linux-kernel, virtualization,
linuxppc-dev, osandov
In-Reply-To: <YWGD8y9VfBIQBu2h@kroah.com>
On 2021/10/9 at 7:58 PM, Greg KH wrote:
> Did you look at the placement using pahole as to how this structure now
> looks?
Thanks for all your comments. For this one, do you mean I need to remove
the blank line? Thanks
^ permalink raw reply
* [PATCH v3 00/10] cxl_pci refactor for reusability
From: Dan Williams @ 2021-10-09 16:43 UTC (permalink / raw)
To: linux-cxl
Cc: Ben Widawsky, Andrew Donnellan, linux-pci, linuxppc-dev, stable,
linux-kernel, Bjorn Helgaas, David E. Box, Jonathan Cameron,
Frederic Barrat, Kan Liang, Ira Weiny, hch, Lu Baolu
Changes since v2 [1]:
- Rework some of the changelogs per feedback (Bjorn and me)
- Move the cxl_register_map refactor earlier in the series to make the
cxl_setup_pci_regs() refactor easier to read.
- Fix a bug added in v5.14 in handling the error return case of
cxl_pci_map_regblock()
- Split the addition of @base to cxl_register_map to its own patch
- Drop the cxl_pci_dvsec() wrapper (Christoph)
- Drop the SIOV conversion patch given Baolu's feedback about it being
dead code
[1]: https://lore.kernel.org/r/20210923172647.72738-1-ben.widawsky@intel.com
---
I am helping out with the review feedback on this set while Ben is
focusing on region provisioning. It appears this rework will be suitable
to just carry in cxl/next, no need to make a cross-tree dependency for
"PCI: Add pci_find_dvsec_capability to find designated VSEC" at this
time.
Ben's original cover:
Provide the ability to obtain CXL register blocks as discrete functionality.
This functionality will become useful for other CXL drivers that need access to
CXL register blocks. It is also in line with other additions to core which move
register mapping functionality.
At the introduction of the CXL driver the only user of CXL MMIO was cxl_pci
(then known as cxl_mem). As the driver has evolved it is clear that cxl_pci will
not be the only entity that needs access to CXL MMIO. This series stops short of
moving the generalized functionality into cxl_core for the sake of getting eyes
on the important foundational bits sooner rather than later. The ultimate plan
is to move much of the code into cxl_core.
Via review of two previous patches [1] & [2] it has been suggested that the bits
which are being used for DVSEC enumeration move into PCI core. As CXL core is
soon going to require these, let's try to get the ball rolling now on making
that happen.
---
[1]: https://lore.kernel.org/linux-pci/20210913190131.xiiszmno46qie7v5@intel.com/
[2]: https://lore.kernel.org/linux-cxl/20210920225638.1729482-1-ben.widawsky@intel.com/
[3]: https://lore.kernel.org/linux-cxl/20210921220459.2437386-1-ben.widawsky@intel.com/
---
Ben Widawsky (8):
cxl/pci: Convert register block identifiers to an enum
cxl/pci: Remove dev_dbg for unknown register blocks
cxl/pci: Remove pci request/release regions
cxl/pci: Make more use of cxl_register_map
cxl/pci: Split cxl_pci_setup_regs()
PCI: Add pci_find_dvsec_capability to find designated VSEC
cxl/pci: Use pci core's DVSEC functionality
ocxl: Use pci core's DVSEC functionality
Dan Williams (2):
cxl/pci: Fix NULL vs ERR_PTR confusion
cxl/pci: Add @base to cxl_register_map
arch/powerpc/platforms/powernv/ocxl.c | 3 -
drivers/cxl/cxl.h | 1
drivers/cxl/pci.c | 157 +++++++++++++--------------------
drivers/cxl/pci.h | 14 ++-
drivers/misc/ocxl/config.c | 13 ---
drivers/pci/pci.c | 32 +++++++
include/linux/pci.h | 1
7 files changed, 105 insertions(+), 116 deletions(-)
base-commit: ed97afb53365cd03dde266c9644334a558fe5a16
^ permalink raw reply
* [PATCH v3 08/10] PCI: Add pci_find_dvsec_capability to find designated VSEC
From: Dan Williams @ 2021-10-09 16:44 UTC (permalink / raw)
To: linux-cxl
Cc: Ben Widawsky, Andrew Donnellan, linux-pci, linux-kernel,
Frederic Barrat, David E. Box, Jonathan Cameron, Bjorn Helgaas,
Kan Liang, linuxppc-dev, hch, Lu Baolu
In-Reply-To: <163379783658.692348.16064992154261275220.stgit@dwillia2-desk3.amr.corp.intel.com>
From: Ben Widawsky <ben.widawsky@intel.com>
Add pci_find_dvsec_capability to locate a Designated Vendor-Specific
Extended Capability with the specified Vendor ID and Capability ID.
The Designated Vendor-Specific Extended Capability (DVSEC) allows one or
more "vendor" specific capabilities that are not tied to the Vendor ID
of the PCI component; the DVSEC Vendor may instead be a standards body
like CXL.
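The lookup described above can be sketched in userspace against a mock config space. This is an illustration only, not the kernel implementation: the mock_* helpers and the byte-array config space are assumptions made for the example, while the layout used (capability ID in bits 15:0 and next pointer in bits 31:20 of the extended capability header, DVSEC Vendor ID at offset 4, DVSEC ID at offset 8, capability ID 0x23) follows the PCIe extended capability format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_CFG_SIZE        4096
#define MOCK_EXT_CAP_START   0x100  /* first extended capability */
#define MOCK_EXT_CAP_DVSEC   0x23   /* DVSEC extended capability ID */
#define MOCK_DVSEC_HEADER1   0x4    /* DVSEC Vendor ID in bits 15:0 */
#define MOCK_DVSEC_HEADER2   0x8    /* DVSEC ID in bits 15:0 */

/* Little-endian accessors for the mock config space (a plain byte array). */
static uint16_t mock_read16(const uint8_t *cfg, size_t off)
{
	assert(off + 2 <= MOCK_CFG_SIZE);
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static uint32_t mock_read32(const uint8_t *cfg, size_t off)
{
	return (uint32_t)mock_read16(cfg, off) |
	       ((uint32_t)mock_read16(cfg, off + 2) << 16);
}

static void mock_write32(uint8_t *cfg, size_t off, uint32_t val)
{
	assert(off + 4 <= MOCK_CFG_SIZE);
	for (int i = 0; i < 4; i++)
		cfg[off + i] = (uint8_t)(val >> (8 * i));
}

/* Place a DVSEC capability at pos, chained to next (0 ends the list). */
void mock_add_dvsec(uint8_t *cfg, uint16_t pos, uint16_t next,
		    uint16_t vendor, uint16_t dvsec)
{
	mock_write32(cfg, pos, MOCK_EXT_CAP_DVSEC | (1u << 16) |
			       ((uint32_t)next << 20));
	mock_write32(cfg, pos + MOCK_DVSEC_HEADER1, vendor);
	mock_write32(cfg, pos + MOCK_DVSEC_HEADER2, dvsec);
}

/* Return the offset of the matching DVSEC, or 0 if there is none. */
uint16_t mock_find_dvsec(const uint8_t *cfg, uint16_t vendor, uint16_t dvsec)
{
	size_t pos = MOCK_EXT_CAP_START;
	int ttl = (MOCK_CFG_SIZE - MOCK_EXT_CAP_START) / 8; /* loop guard */

	while (pos >= MOCK_EXT_CAP_START && pos + 4 <= MOCK_CFG_SIZE &&
	       ttl-- > 0) {
		uint32_t hdr = mock_read32(cfg, pos);

		if (hdr == 0)
			break;
		if ((hdr & 0xffff) == MOCK_EXT_CAP_DVSEC &&
		    pos + MOCK_DVSEC_HEADER2 + 2 <= MOCK_CFG_SIZE &&
		    mock_read16(cfg, pos + MOCK_DVSEC_HEADER1) == vendor &&
		    mock_read16(cfg, pos + MOCK_DVSEC_HEADER2) == dvsec)
			return (uint16_t)pos;

		pos = (hdr >> 20) & 0xffc; /* next capability, dword aligned */
	}
	return 0;
}
```

Both the vendor and the DVSEC ID must match, so one device can carry DVSECs from several designating vendors without ambiguity.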
Cc: David E. Box <david.e.box@linux.intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: linux-pci@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Andrew Donnellan <ajd@linux.ibm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/pci/pci.c | 32 ++++++++++++++++++++++++++++++++
include/linux/pci.h | 1 +
2 files changed, 33 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ce2ab62b64cf..94ac86ff28b0 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -732,6 +732,38 @@ u16 pci_find_vsec_capability(struct pci_dev *dev, u16 vendor, int cap)
}
EXPORT_SYMBOL_GPL(pci_find_vsec_capability);
+/**
+ * pci_find_dvsec_capability - Find DVSEC for vendor
+ * @dev: PCI device to query
+ * @vendor: Vendor ID to match for the DVSEC
+ * @dvsec: Designated Vendor-specific capability ID
+ *
+ * If DVSEC has Vendor ID @vendor and DVSEC ID @dvsec return the capability
+ * offset in config space; otherwise return 0.
+ */
+u16 pci_find_dvsec_capability(struct pci_dev *dev, u16 vendor, u16 dvsec)
+{
+ int pos;
+
+ pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DVSEC);
+ if (!pos)
+ return 0;
+
+ while (pos) {
+ u16 v, id;
+
+ pci_read_config_word(dev, pos + PCI_DVSEC_HEADER1, &v);
+ pci_read_config_word(dev, pos + PCI_DVSEC_HEADER2, &id);
+ if (vendor == v && dvsec == id)
+ return pos;
+
+ pos = pci_find_next_ext_capability(dev, pos, PCI_EXT_CAP_ID_DVSEC);
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(pci_find_dvsec_capability);
+
/**
* pci_find_parent_resource - return resource region of parent bus of given
* region
diff --git a/include/linux/pci.h b/include/linux/pci.h
index cd8aa6fce204..c93ccfa4571b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1130,6 +1130,7 @@ u16 pci_find_ext_capability(struct pci_dev *dev, int cap);
u16 pci_find_next_ext_capability(struct pci_dev *dev, u16 pos, int cap);
struct pci_bus *pci_find_next_bus(const struct pci_bus *from);
u16 pci_find_vsec_capability(struct pci_dev *dev, u16 vendor, int cap);
+u16 pci_find_dvsec_capability(struct pci_dev *dev, u16 vendor, u16 dvsec);
u64 pci_get_dsn(struct pci_dev *dev);
^ permalink raw reply related
* [PATCH v3 10/10] ocxl: Use pci core's DVSEC functionality
From: Dan Williams @ 2021-10-09 16:44 UTC (permalink / raw)
To: linux-cxl
Cc: Ben Widawsky, Andrew Donnellan, linux-pci, linux-kernel,
Frederic Barrat, linuxppc-dev, hch
In-Reply-To: <163379783658.692348.16064992154261275220.stgit@dwillia2-desk3.amr.corp.intel.com>
From: Ben Widawsky <ben.widawsky@intel.com>
Reduce maintenance burden of DVSEC query implementation by using the
centralized PCI core implementation.
There are two obvious places to simply drop in the new core
implementation. There remains find_dvsec_from_pos() which would benefit
from using a core implementation. As that change is less trivial it is
reserved for later.
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Andrew Donnellan <ajd@linux.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> (v1)
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
arch/powerpc/platforms/powernv/ocxl.c | 3 ++-
drivers/misc/ocxl/config.c | 13 +------------
2 files changed, 3 insertions(+), 13 deletions(-)
diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
index 9105efcf242a..28b009b46464 100644
--- a/arch/powerpc/platforms/powernv/ocxl.c
+++ b/arch/powerpc/platforms/powernv/ocxl.c
@@ -107,7 +107,8 @@ static int get_max_afu_index(struct pci_dev *dev, int *afu_idx)
int pos;
u32 val;
- pos = find_dvsec_from_pos(dev, OCXL_DVSEC_FUNC_ID, 0);
+ pos = pci_find_dvsec_capability(dev, PCI_VENDOR_ID_IBM,
+ OCXL_DVSEC_FUNC_ID);
if (!pos)
return -ESRCH;
diff --git a/drivers/misc/ocxl/config.c b/drivers/misc/ocxl/config.c
index a68738f38252..e401a51596b9 100644
--- a/drivers/misc/ocxl/config.c
+++ b/drivers/misc/ocxl/config.c
@@ -33,18 +33,7 @@
static int find_dvsec(struct pci_dev *dev, int dvsec_id)
{
- int vsec = 0;
- u16 vendor, id;
-
- while ((vsec = pci_find_next_ext_capability(dev, vsec,
- OCXL_EXT_CAP_ID_DVSEC))) {
- pci_read_config_word(dev, vsec + OCXL_DVSEC_VENDOR_OFFSET,
- &vendor);
- pci_read_config_word(dev, vsec + OCXL_DVSEC_ID_OFFSET, &id);
- if (vendor == PCI_VENDOR_ID_IBM && id == dvsec_id)
- return vsec;
- }
- return 0;
+ return pci_find_dvsec_capability(dev, PCI_VENDOR_ID_IBM, dvsec_id);
}
static int find_dvsec_afu_ctrl(struct pci_dev *dev, u8 afu_idx)
^ permalink raw reply related
* Re: [PATCH v7 1/3] riscv: Introduce CONFIG_RELOCATABLE
From: Alexandre ghiti @ 2021-10-09 17:20 UTC (permalink / raw)
To: Alexandre Ghiti, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras, Paul Walmsley, Palmer Dabbelt, Albert Ou,
linuxppc-dev, linux-kernel, linux-riscv
In-Reply-To: <20211009171259.2515351-2-alexandre.ghiti@canonical.com>
Arf, I have sent this patchset with the wrong email address. @Palmer
tell me if you want me to resend it correctly.
Thanks,
Alex
On 10/9/21 7:12 PM, Alexandre Ghiti wrote:
> From: Alexandre Ghiti <alex@ghiti.fr>
>
> This config allows compiling a 64b kernel as a PIE and relocating it to
> any virtual address at runtime: this paves the way to KASLR.
> Runtime relocation is possible because relocation metadata is embedded into
> the kernel.
>
> Note that relocating at runtime introduces an overhead even if the
> kernel is loaded at the same address it was linked at, and that the compiler
> options are those used in arm64, which uses the same RELA relocation
> format.
>
> Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
> ---
> arch/riscv/Kconfig | 12 ++++++++
> arch/riscv/Makefile | 7 +++--
> arch/riscv/kernel/vmlinux.lds.S | 6 ++++
> arch/riscv/mm/Makefile | 4 +++
> arch/riscv/mm/init.c | 54 ++++++++++++++++++++++++++++++++-
> 5 files changed, 80 insertions(+), 3 deletions(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index ea16fa2dd768..043ba92559fa 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -213,6 +213,18 @@ config PGTABLE_LEVELS
> config LOCKDEP_SUPPORT
> def_bool y
>
> +config RELOCATABLE
> + bool
> + depends on MMU && 64BIT && !XIP_KERNEL
> + help
> + This builds a kernel as a Position Independent Executable (PIE),
> + which retains all relocation metadata required to relocate the
> + kernel binary at runtime to a different virtual address than the
> + address it was linked at.
> + Since RISCV uses the RELA relocation format, this requires a
> + relocation pass at runtime even if the kernel is loaded at the
> + same address it was linked at.
> +
> source "arch/riscv/Kconfig.socs"
> source "arch/riscv/Kconfig.erratas"
>
> diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> index 0eb4568fbd29..2f509915f246 100644
> --- a/arch/riscv/Makefile
> +++ b/arch/riscv/Makefile
> @@ -9,9 +9,12 @@
> #
>
> OBJCOPYFLAGS := -O binary
> -LDFLAGS_vmlinux :=
> +ifeq ($(CONFIG_RELOCATABLE),y)
> + LDFLAGS_vmlinux += -shared -Bsymbolic -z notext -z norelro
> + KBUILD_CFLAGS += -fPIE
> +endif
> ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
> - LDFLAGS_vmlinux := --no-relax
> + LDFLAGS_vmlinux += --no-relax
> KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
> CC_FLAGS_FTRACE := -fpatchable-function-entry=8
> endif
> diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
> index 5104f3a871e3..862a8c09723c 100644
> --- a/arch/riscv/kernel/vmlinux.lds.S
> +++ b/arch/riscv/kernel/vmlinux.lds.S
> @@ -133,6 +133,12 @@ SECTIONS
>
> BSS_SECTION(PAGE_SIZE, PAGE_SIZE, 0)
>
> + .rela.dyn : ALIGN(8) {
> + __rela_dyn_start = .;
> + *(.rela .rela*)
> + __rela_dyn_end = .;
> + }
> +
> #ifdef CONFIG_EFI
> . = ALIGN(PECOFF_SECTION_ALIGNMENT);
> __pecoff_data_virt_size = ABSOLUTE(. - __pecoff_text_end);
> diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
> index 7ebaef10ea1b..2d33ec574bbb 100644
> --- a/arch/riscv/mm/Makefile
> +++ b/arch/riscv/mm/Makefile
> @@ -1,6 +1,10 @@
> # SPDX-License-Identifier: GPL-2.0-only
>
> CFLAGS_init.o := -mcmodel=medany
> +ifdef CONFIG_RELOCATABLE
> +CFLAGS_init.o += -fno-pie
> +endif
> +
> ifdef CONFIG_FTRACE
> CFLAGS_REMOVE_init.o = $(CC_FLAGS_FTRACE)
> CFLAGS_REMOVE_cacheflush.o = $(CC_FLAGS_FTRACE)
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index c0cddf0fc22d..42041c12d496 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -20,6 +20,9 @@
> #include <linux/dma-map-ops.h>
> #include <linux/crash_dump.h>
> #include <linux/hugetlb.h>
> +#ifdef CONFIG_RELOCATABLE
> +#include <linux/elf.h>
> +#endif
>
> #include <asm/fixmap.h>
> #include <asm/tlbflush.h>
> @@ -103,7 +106,7 @@ static void __init print_vm_layout(void)
> print_mlm("lowmem", (unsigned long)PAGE_OFFSET,
> (unsigned long)high_memory);
> #ifdef CONFIG_64BIT
> - print_mlm("kernel", (unsigned long)KERNEL_LINK_ADDR,
> + print_mlm("kernel", (unsigned long)kernel_map.virt_addr,
> (unsigned long)ADDRESS_SPACE_END);
> #endif
> }
> @@ -518,6 +521,44 @@ static __init pgprot_t pgprot_from_va(uintptr_t va)
> #error "setup_vm() is called from head.S before relocate so it should not use absolute addressing."
> #endif
>
> +#ifdef CONFIG_RELOCATABLE
> +extern unsigned long __rela_dyn_start, __rela_dyn_end;
> +
> +static void __init relocate_kernel(void)
> +{
> + Elf64_Rela *rela = (Elf64_Rela *)&__rela_dyn_start;
> + /*
> + * This holds the offset between the linked virtual address and the
> + * relocated virtual address.
> + */
> + uintptr_t reloc_offset = kernel_map.virt_addr - KERNEL_LINK_ADDR;
> + /*
> + * This holds the offset between kernel linked virtual address and
> + * physical address.
> + */
> + uintptr_t va_kernel_link_pa_offset = KERNEL_LINK_ADDR - kernel_map.phys_addr;
> +
> + for ( ; rela < (Elf64_Rela *)&__rela_dyn_end; rela++) {
> + Elf64_Addr addr = (rela->r_offset - va_kernel_link_pa_offset);
> + Elf64_Addr relocated_addr = rela->r_addend;
> +
> + if (rela->r_info != R_RISCV_RELATIVE)
> + continue;
> +
> + /*
> + * Make sure to not relocate vdso symbols like rt_sigreturn
> + * which are linked from the address 0 in vmlinux since
> + * vdso symbol addresses are actually used as an offset from
> + * mm->context.vdso in VDSO_OFFSET macro.
> + */
> + if (relocated_addr >= KERNEL_LINK_ADDR)
> + relocated_addr += reloc_offset;
> +
> + *(Elf64_Addr *)addr = relocated_addr;
> + }
> +}
> +#endif /* CONFIG_RELOCATABLE */
> +
> #ifdef CONFIG_XIP_KERNEL
> static void __init create_kernel_page_table(pgd_t *pgdir,
> __always_unused bool early)
> @@ -625,6 +666,17 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> BUG_ON((kernel_map.virt_addr + kernel_map.size) > ADDRESS_SPACE_END - SZ_4K);
> #endif
>
> +#ifdef CONFIG_RELOCATABLE
> + /*
> + * Early page table uses only one PGDIR, which makes it possible
> + * to map PGDIR_SIZE aligned on PGDIR_SIZE: if the relocation offset
> + * makes the kernel cross over a PGDIR_SIZE boundary, raise a bug
> + * since a part of the kernel would not get mapped.
> + */
> + BUG_ON(PGDIR_SIZE - (kernel_map.virt_addr & (PGDIR_SIZE - 1)) < kernel_map.size);
> + relocate_kernel();
> +#endif
> +
> pt_ops.alloc_pte = alloc_pte_early;
> pt_ops.get_pte_virt = get_pte_virt_early;
> #ifndef __PAGETABLE_PMD_FOLDED
^ permalink raw reply
* [PATCH v7 1/3] riscv: Introduce CONFIG_RELOCATABLE
From: Alexandre Ghiti @ 2021-10-09 17:12 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Paul Walmsley, Palmer Dabbelt, Albert Ou, linuxppc-dev,
linux-kernel, linux-riscv
Cc: Alexandre Ghiti
In-Reply-To: <20211009171259.2515351-1-alexandre.ghiti@canonical.com>
From: Alexandre Ghiti <alex@ghiti.fr>
This config allows compiling a 64b kernel as a PIE and relocating it to
any virtual address at runtime: this paves the way to KASLR.
Runtime relocation is possible because relocation metadata is embedded into
the kernel.
Note that relocating at runtime introduces an overhead even if the
kernel is loaded at the same address it was linked at, and that the compiler
options are those used in arm64, which uses the same RELA relocation
format.
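As a rough userspace illustration of the relocation pass, the sketch below applies RELA entries of type R_RISCV_RELATIVE to an in-memory image. It is a model under stated assumptions rather than the kernel code: the demo_* names are hypothetical, the image is a plain buffer indexed by (r_offset - link address) instead of a physical address, and the relocation type is taken from the low 32 bits of r_info.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_R_RISCV_RELATIVE 3  /* relocation type from the RISC-V psABI */

/* Same field layout as an ELF64 RELA entry. */
struct demo_rela {
	uint64_t r_offset;  /* link-time address of the slot to patch */
	uint64_t r_info;    /* relocation type in the low 32 bits */
	int64_t  r_addend;  /* link-time value to store in the slot */
};

/*
 * Patch every R_RISCV_RELATIVE slot of an image linked at link_addr so
 * that it is valid when running at load_addr. Returns the number of
 * slots written.
 */
size_t demo_relocate(uint8_t *image, size_t image_size,
		     uint64_t link_addr, uint64_t load_addr,
		     const struct demo_rela *rela, size_t count)
{
	uint64_t reloc_offset = load_addr - link_addr;
	size_t applied = 0;

	for (size_t i = 0; i < count; i++) {
		uint64_t value = (uint64_t)rela[i].r_addend;
		uint64_t where = rela[i].r_offset - link_addr;

		if ((rela[i].r_info & 0xffffffff) != DEMO_R_RISCV_RELATIVE)
			continue;
		/* Skip slots that fall outside the image buffer. */
		if (image_size < sizeof(value) ||
		    where > image_size - sizeof(value))
			continue;
		/*
		 * Values below the link address (the vdso offsets mentioned
		 * in the patch) are stored unchanged.
		 */
		if (value >= link_addr)
			value += reloc_offset;

		memcpy(image + where, &value, sizeof(value));
		applied++;
	}
	return applied;
}
```

Every relative slot is written even when load_addr equals link_addr, which is the overhead the note above refers to.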
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
---
arch/riscv/Kconfig | 12 ++++++++
arch/riscv/Makefile | 7 +++--
arch/riscv/kernel/vmlinux.lds.S | 6 ++++
arch/riscv/mm/Makefile | 4 +++
arch/riscv/mm/init.c | 54 ++++++++++++++++++++++++++++++++-
5 files changed, 80 insertions(+), 3 deletions(-)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index ea16fa2dd768..043ba92559fa 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -213,6 +213,18 @@ config PGTABLE_LEVELS
config LOCKDEP_SUPPORT
def_bool y
+config RELOCATABLE
+ bool
+ depends on MMU && 64BIT && !XIP_KERNEL
+ help
+ This builds a kernel as a Position Independent Executable (PIE),
+ which retains all relocation metadata required to relocate the
+ kernel binary at runtime to a different virtual address than the
+ address it was linked at.
+ Since RISCV uses the RELA relocation format, this requires a
+ relocation pass at runtime even if the kernel is loaded at the
+ same address it was linked at.
+
source "arch/riscv/Kconfig.socs"
source "arch/riscv/Kconfig.erratas"
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 0eb4568fbd29..2f509915f246 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -9,9 +9,12 @@
#
OBJCOPYFLAGS := -O binary
-LDFLAGS_vmlinux :=
+ifeq ($(CONFIG_RELOCATABLE),y)
+ LDFLAGS_vmlinux += -shared -Bsymbolic -z notext -z norelro
+ KBUILD_CFLAGS += -fPIE
+endif
ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
- LDFLAGS_vmlinux := --no-relax
+ LDFLAGS_vmlinux += --no-relax
KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
CC_FLAGS_FTRACE := -fpatchable-function-entry=8
endif
diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
index 5104f3a871e3..862a8c09723c 100644
--- a/arch/riscv/kernel/vmlinux.lds.S
+++ b/arch/riscv/kernel/vmlinux.lds.S
@@ -133,6 +133,12 @@ SECTIONS
BSS_SECTION(PAGE_SIZE, PAGE_SIZE, 0)
+ .rela.dyn : ALIGN(8) {
+ __rela_dyn_start = .;
+ *(.rela .rela*)
+ __rela_dyn_end = .;
+ }
+
#ifdef CONFIG_EFI
. = ALIGN(PECOFF_SECTION_ALIGNMENT);
__pecoff_data_virt_size = ABSOLUTE(. - __pecoff_text_end);
diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
index 7ebaef10ea1b..2d33ec574bbb 100644
--- a/arch/riscv/mm/Makefile
+++ b/arch/riscv/mm/Makefile
@@ -1,6 +1,10 @@
# SPDX-License-Identifier: GPL-2.0-only
CFLAGS_init.o := -mcmodel=medany
+ifdef CONFIG_RELOCATABLE
+CFLAGS_init.o += -fno-pie
+endif
+
ifdef CONFIG_FTRACE
CFLAGS_REMOVE_init.o = $(CC_FLAGS_FTRACE)
CFLAGS_REMOVE_cacheflush.o = $(CC_FLAGS_FTRACE)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index c0cddf0fc22d..42041c12d496 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -20,6 +20,9 @@
#include <linux/dma-map-ops.h>
#include <linux/crash_dump.h>
#include <linux/hugetlb.h>
+#ifdef CONFIG_RELOCATABLE
+#include <linux/elf.h>
+#endif
#include <asm/fixmap.h>
#include <asm/tlbflush.h>
@@ -103,7 +106,7 @@ static void __init print_vm_layout(void)
print_mlm("lowmem", (unsigned long)PAGE_OFFSET,
(unsigned long)high_memory);
#ifdef CONFIG_64BIT
- print_mlm("kernel", (unsigned long)KERNEL_LINK_ADDR,
+ print_mlm("kernel", (unsigned long)kernel_map.virt_addr,
(unsigned long)ADDRESS_SPACE_END);
#endif
}
@@ -518,6 +521,44 @@ static __init pgprot_t pgprot_from_va(uintptr_t va)
#error "setup_vm() is called from head.S before relocate so it should not use absolute addressing."
#endif
+#ifdef CONFIG_RELOCATABLE
+extern unsigned long __rela_dyn_start, __rela_dyn_end;
+
+static void __init relocate_kernel(void)
+{
+ Elf64_Rela *rela = (Elf64_Rela *)&__rela_dyn_start;
+ /*
+ * This holds the offset between the linked virtual address and the
+ * relocated virtual address.
+ */
+ uintptr_t reloc_offset = kernel_map.virt_addr - KERNEL_LINK_ADDR;
+ /*
+ * This holds the offset between kernel linked virtual address and
+ * physical address.
+ */
+ uintptr_t va_kernel_link_pa_offset = KERNEL_LINK_ADDR - kernel_map.phys_addr;
+
+ for ( ; rela < (Elf64_Rela *)&__rela_dyn_end; rela++) {
+ Elf64_Addr addr = (rela->r_offset - va_kernel_link_pa_offset);
+ Elf64_Addr relocated_addr = rela->r_addend;
+
+ if (rela->r_info != R_RISCV_RELATIVE)
+ continue;
+
+ /*
+ * Make sure to not relocate vdso symbols like rt_sigreturn
+ * which are linked from the address 0 in vmlinux since
+ * vdso symbol addresses are actually used as an offset from
+ * mm->context.vdso in VDSO_OFFSET macro.
+ */
+ if (relocated_addr >= KERNEL_LINK_ADDR)
+ relocated_addr += reloc_offset;
+
+ *(Elf64_Addr *)addr = relocated_addr;
+ }
+}
+#endif /* CONFIG_RELOCATABLE */
+
#ifdef CONFIG_XIP_KERNEL
static void __init create_kernel_page_table(pgd_t *pgdir,
__always_unused bool early)
@@ -625,6 +666,17 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
BUG_ON((kernel_map.virt_addr + kernel_map.size) > ADDRESS_SPACE_END - SZ_4K);
#endif
+#ifdef CONFIG_RELOCATABLE
+ /*
+ * Early page table uses only one PGDIR, which makes it possible
+ * to map PGDIR_SIZE aligned on PGDIR_SIZE: if the relocation offset
+ * makes the kernel cross over a PGDIR_SIZE boundary, raise a bug
+ * since a part of the kernel would not get mapped.
+ */
+ BUG_ON(PGDIR_SIZE - (kernel_map.virt_addr & (PGDIR_SIZE - 1)) < kernel_map.size);
+ relocate_kernel();
+#endif
+
pt_ops.alloc_pte = alloc_pte_early;
pt_ops.get_pte_virt = get_pte_virt_early;
#ifndef __PAGETABLE_PMD_FOLDED
--
2.30.2
^ permalink raw reply related
* [PATCH v7 3/3] riscv: Check relocations at compile time
From: Alexandre Ghiti @ 2021-10-09 17:12 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Paul Walmsley, Palmer Dabbelt, Albert Ou, linuxppc-dev,
linux-kernel, linux-riscv
Cc: Anup Patel, Alexandre Ghiti
In-Reply-To: <20211009171259.2515351-1-alexandre.ghiti@canonical.com>
From: Alexandre Ghiti <alex@ghiti.fr>
Relocating the kernel at runtime is done very early in the boot process, so
it is not convenient to check for relocations there and react in case a
relocation was not expected.
A script in scripts/ extracts the relocations from vmlinux; it is then
used at postlink to check them.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Anup Patel <anup@brainfault.org>
---
arch/riscv/Makefile.postlink | 36 ++++++++++++++++++++++++++++++++
arch/riscv/tools/relocs_check.sh | 26 +++++++++++++++++++++++
2 files changed, 62 insertions(+)
create mode 100644 arch/riscv/Makefile.postlink
create mode 100755 arch/riscv/tools/relocs_check.sh
diff --git a/arch/riscv/Makefile.postlink b/arch/riscv/Makefile.postlink
new file mode 100644
index 000000000000..bf2b2bca1845
--- /dev/null
+++ b/arch/riscv/Makefile.postlink
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: GPL-2.0
+# ===========================================================================
+# Post-link riscv pass
+# ===========================================================================
+#
+# Check that vmlinux relocations look sane
+
+PHONY := __archpost
+__archpost:
+
+-include include/config/auto.conf
+include scripts/Kbuild.include
+
+quiet_cmd_relocs_check = CHKREL $@
+cmd_relocs_check = \
+ $(CONFIG_SHELL) $(srctree)/arch/riscv/tools/relocs_check.sh "$(OBJDUMP)" "$(NM)" "$@"
+
+# `@true` prevents complaint when there is nothing to be done
+
+vmlinux: FORCE
+ @true
+ifdef CONFIG_RELOCATABLE
+ $(call if_changed,relocs_check)
+endif
+
+%.ko: FORCE
+ @true
+
+clean:
+ @true
+
+PHONY += FORCE clean
+
+FORCE:
+
+.PHONY: $(PHONY)
diff --git a/arch/riscv/tools/relocs_check.sh b/arch/riscv/tools/relocs_check.sh
new file mode 100755
index 000000000000..baeb2e7b2290
--- /dev/null
+++ b/arch/riscv/tools/relocs_check.sh
@@ -0,0 +1,26 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0-or-later
+# Based on powerpc relocs_check.sh
+
+# This script checks the relocations of a vmlinux for "suspicious"
+# relocations.
+
+if [ $# -lt 3 ]; then
+ echo "$0 [path to objdump] [path to nm] [path to vmlinux]" 1>&2
+ exit 1
+fi
+
+bad_relocs=$(
+${srctree}/scripts/relocs_check.sh "$@" |
+ # These relocations are okay
+ # R_RISCV_RELATIVE
+ grep -F -w -v 'R_RISCV_RELATIVE'
+)
+
+if [ -z "$bad_relocs" ]; then
+ exit 0
+fi
+
+num_bad=$(echo "$bad_relocs" | wc -l)
+echo "WARNING: $num_bad bad relocations"
+echo "$bad_relocs"
--
2.30.2
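
The arch-specific stage of the check above can be tried without building a kernel. What follows is a minimal sketch, not part of the patch: it replays the allowlist filter and the warning logic from arch/riscv/tools/relocs_check.sh on canned input. The function names `filter_riscv_relocs` and `report_bad_relocs`, and the sample relocation lines, are assumptions made up for illustration.

```shell
#!/bin/sh
# Sketch only: mimics the riscv filtering stage on canned input.
# The sample lines are invented for illustration, not real objdump output.

filter_riscv_relocs() {
    # Drop the one relocation type that is expected in a relocatable kernel.
    grep -F -w -v 'R_RISCV_RELATIVE'
}

report_bad_relocs() {
    # Read relocation lines on stdin and report whatever survives the filter.
    bad_relocs=$(filter_riscv_relocs)
    if [ -z "$bad_relocs" ]; then
        echo "no suspicious relocations"
        return 0
    fi
    num_bad=$(echo "$bad_relocs" | wc -l)
    echo "WARNING: $num_bad bad relocations"
    echo "$bad_relocs"
}

# Hypothetical relocation records: two expected, one suspicious.
sample='ffffffff80001000 R_RISCV_RELATIVE *ABS*+0xffffffff80002000
ffffffff80001008 R_RISCV_64 some_symbol
ffffffff80001010 R_RISCV_RELATIVE *ABS*+0xffffffff80003000'

printf '%s\n' "$sample" | report_bad_relocs
```

As in the patch, a surviving relocation only produces a warning; the sketch does not fail the build either.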
* [PATCH v7 0/3] Introduce 64b relocatable kernel
From: Alexandre Ghiti @ 2021-10-09 17:12 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Paul Walmsley, Palmer Dabbelt, Albert Ou, linuxppc-dev,
linux-kernel, linux-riscv
Cc: Alexandre Ghiti
After multiple attempts, this patchset is now based on the fact that the
64b kernel mapping was moved outside the linear mapping.
The first patch allows building relocatable kernels, but the option is not
selected by default. That patch should ease KASLR implementation a lot.
The second and third patches take advantage of an already existing powerpc
script that checks relocations at compile time, and use it for riscv.
This patchset was tested on:
* qemu riscv64 defconfig: OK
* Unmatched ubuntu config: OK
Changes in v7:
* Rebase on top of v5.15
* Fix LDFLAGS_vmlinux which was overridden when CONFIG_DYNAMIC_FTRACE was
set
* Make relocate_kernel static
* Add Ack from Michael
Changes in v6:
* Remove the kernel move to vmalloc zone
* Rebased on top of for-next
* Remove relocatable property from 32b kernel as the kernel is mapped in
the linear mapping and would then need to be copied physically too
* CONFIG_RELOCATABLE depends on !XIP_KERNEL
* Remove Reviewed-by from first patch as it changed a bit
Changes in v5:
* Add "static __init" to create_kernel_page_table function as reported by
Kbuild test robot
* Add reviewed-by from Zong
* Rebase onto v5.7
Changes in v4:
* Fix BPF region that overlapped with kernel's as suggested by Zong
* Fix end of module region that could be larger than 2GB as suggested by Zong
* Fix the size of the vm area reserved for the kernel as we could lose
PMD_SIZE if the size was already aligned on PMD_SIZE
* Split compile time relocations check patch into 2 patches as suggested by Anup
* Applied Reviewed-by from Zong and Anup
Changes in v3:
* Move kernel mapping to vmalloc
Changes in v2:
* Make RELOCATABLE depend on MMU as suggested by Anup
* Rename kernel_load_addr into kernel_virt_addr as suggested by Anup
* Use __pa_symbol instead of __pa, as suggested by Zong
* Rebased on top of v5.6-rc3
* Tested with sv48 patchset
* Add Reviewed/Tested-by from Zong and Anup
Alexandre Ghiti (3):
riscv: Introduce CONFIG_RELOCATABLE
powerpc: Move script to check relocations at compile time in scripts/
riscv: Check relocations at compile time
arch/powerpc/tools/relocs_check.sh | 18 ++--------
arch/riscv/Kconfig | 12 +++++++
arch/riscv/Makefile | 7 ++--
arch/riscv/Makefile.postlink | 36 ++++++++++++++++++++
arch/riscv/kernel/vmlinux.lds.S | 6 ++++
arch/riscv/mm/Makefile | 4 +++
arch/riscv/mm/init.c | 54 +++++++++++++++++++++++++++++-
arch/riscv/tools/relocs_check.sh | 26 ++++++++++++++
scripts/relocs_check.sh | 20 +++++++++++
9 files changed, 164 insertions(+), 19 deletions(-)
create mode 100644 arch/riscv/Makefile.postlink
create mode 100755 arch/riscv/tools/relocs_check.sh
create mode 100755 scripts/relocs_check.sh
--
2.30.2
* [PATCH v7 2/3] powerpc: Move script to check relocations at compile time in scripts/
From: Alexandre Ghiti @ 2021-10-09 17:12 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Paul Walmsley, Palmer Dabbelt, Albert Ou, linuxppc-dev,
linux-kernel, linux-riscv
Cc: Anup Patel, Alexandre Ghiti
In-Reply-To: <20211009171259.2515351-1-alexandre.ghiti@canonical.com>
From: Alexandre Ghiti <alex@ghiti.fr>
Relocating the kernel at runtime is done very early in the boot process, so
it is not convenient to check for relocations there and react in case a
relocation was not expected.
The powerpc architecture has a script that checks at compile time for such
unexpected relocations: extract the common logic to scripts/ so that other
architectures can take advantage of it.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Anup Patel <anup@brainfault.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
---
arch/powerpc/tools/relocs_check.sh | 18 ++----------------
scripts/relocs_check.sh | 20 ++++++++++++++++++++
2 files changed, 22 insertions(+), 16 deletions(-)
create mode 100755 scripts/relocs_check.sh
diff --git a/arch/powerpc/tools/relocs_check.sh b/arch/powerpc/tools/relocs_check.sh
index 014e00e74d2b..e367895941ae 100755
--- a/arch/powerpc/tools/relocs_check.sh
+++ b/arch/powerpc/tools/relocs_check.sh
@@ -15,21 +15,8 @@ if [ $# -lt 3 ]; then
exit 1
fi
-# Have Kbuild supply the path to objdump and nm so we handle cross compilation.
-objdump="$1"
-nm="$2"
-vmlinux="$3"
-
-# Remove from the bad relocations those that match an undefined weak symbol
-# which will result in an absolute relocation to 0.
-# Weak unresolved symbols are of that form in nm output:
-# " w _binary__btf_vmlinux_bin_end"
-undef_weak_symbols=$($nm "$vmlinux" | awk '$1 ~ /w/ { print $2 }')
-
bad_relocs=$(
-$objdump -R "$vmlinux" |
- # Only look at relocation lines.
- grep -E '\<R_' |
+${srctree}/scripts/relocs_check.sh "$@" |
# These relocations are okay
# On PPC64:
# R_PPC64_RELATIVE, R_PPC64_NONE
@@ -43,8 +30,7 @@ R_PPC_ADDR16_LO
R_PPC_ADDR16_HI
R_PPC_ADDR16_HA
R_PPC_RELATIVE
-R_PPC_NONE' |
- ([ "$undef_weak_symbols" ] && grep -F -w -v "$undef_weak_symbols" || cat)
+R_PPC_NONE'
)
if [ -z "$bad_relocs" ]; then
diff --git a/scripts/relocs_check.sh b/scripts/relocs_check.sh
new file mode 100755
index 000000000000..137c660499f3
--- /dev/null
+++ b/scripts/relocs_check.sh
@@ -0,0 +1,20 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+# Get a list of all the relocations, remove from it the relocations
+# that are known to be legitimate and return this list to arch specific
+# script that will look for suspicious relocations.
+
+objdump="$1"
+nm="$2"
+vmlinux="$3"
+
+# Remove from the possible bad relocations those that match an undefined
+# weak symbol which will result in an absolute relocation to 0.
+# Weak unresolved symbols are of that form in nm output:
+# " w _binary__btf_vmlinux_bin_end"
+undef_weak_symbols=$($nm "$vmlinux" | awk '$1 ~ /w/ { print $2 }')
+
+$objdump -R "$vmlinux" |
+ grep -E '\<R_' |
+ ([ "$undef_weak_symbols" ] && grep -F -w -v "$undef_weak_symbols" || cat)
--
2.30.2
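
To see what the shared scripts/relocs_check.sh hands back to the arch-specific script, here is a minimal sketch (not part of the patch) that replays the same three steps on canned text instead of calling objdump and nm. The function name `list_candidate_relocs`, the symbol names, and both sample listings are assumptions invented for illustration.

```shell
#!/bin/sh
# Sketch only: replays the generic filtering steps on canned text.
# All sample data is invented for illustration.

list_candidate_relocs() {
    # $1: nm-style symbol listing, $2: "objdump -R"-style relocation listing
    # Step 1: collect undefined weak symbols (type column is "w").
    undef_weak_symbols=$(printf '%s\n' "$1" | awk '$1 ~ /w/ { print $2 }')

    # Step 2: keep relocation lines only.
    # Step 3: drop relocations that target an undefined weak symbol.
    printf '%s\n' "$2" |
        grep -E '\<R_' |
        if [ -n "$undef_weak_symbols" ]; then
            grep -F -w -v "$undef_weak_symbols"
        else
            cat
        fi
}

# Hypothetical nm output: one undefined weak symbol, one defined symbol.
nm_output='                 w weak_undefined_sym
ffffffff80002000 T defined_sym'

# Hypothetical objdump -R output, including header lines.
objdump_output='vmlinux:     file format elf64-littleriscv
DYNAMIC RELOCATION RECORDS
OFFSET           TYPE              VALUE
ffffffff80001000 R_RISCV_RELATIVE  *ABS*+0xffffffff80002000
ffffffff80001008 R_RISCV_64        weak_undefined_sym
ffffffff80001010 R_RISCV_64        defined_sym'

list_candidate_relocs "$nm_output" "$objdump_output"
```

The header lines and the relocation against `weak_undefined_sym` are dropped; the remaining lines are what an arch script would then filter against its own allowlist.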
* Re: [PATCH v10 2/3] tty: hvc: pass DMA capable memory to put_chars()
From: Greg KH @ 2021-10-10 5:33 UTC (permalink / raw)
To: Xianting Tian
Cc: arnd, amit, jirislaby, shile.zhang, linux-kernel, virtualization,
linuxppc-dev, osandov
In-Reply-To: <3516c58c-e8e6-2e5a-2bc8-ad80e2124d37@linux.alibaba.com>
On Sat, Oct 09, 2021 at 11:45:23PM +0800, Xianting Tian wrote:
>
> On 2021/10/9 at 7:58 PM, Greg KH wrote:
> > Did you look at the placement using pahole as to how this structure now
> > looks?
>
> thanks for all your comments. for this one, do you mean I need to remove the
> blank line? thanks
>
No, I mean to use the tool 'pahole' to see the structure layout that you
just created and determine if it really is the best way to add these new
fields, especially as you are adding huge buffers with odd alignment.
thanks,
greg k-h