* [PATCH v8 01/30] powerpc/xmon: Remove store_inst() for patch_instruction()
From: Jordan Niethe @ 2020-05-06 3:40 UTC (permalink / raw)
To: linuxppc-dev; +Cc: alistair, npiggin, bala24, naveen.n.rao, Jordan Niethe, dja
In-Reply-To: <20200506034050.24806-1-jniethe5@gmail.com>
For modifying instructions in xmon, patch_instruction() can serve the
same role that store_inst() is performing with the advantage of not
being specific to xmon. In some places patch_instruction() is already
being using followed by store_inst(). In these cases just remove the
store_inst(). Otherwise replace store_inst() with patch_instruction().
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
v4: Read into a local variable
---
arch/powerpc/xmon/xmon.c | 18 +++++-------------
1 file changed, 5 insertions(+), 13 deletions(-)
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 7af840c0fc93..f91ae2c9adbe 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -326,11 +326,6 @@ static inline void sync(void)
asm volatile("sync; isync");
}
-static inline void store_inst(void *p)
-{
- asm volatile ("dcbst 0,%0; sync; icbi 0,%0; isync" : : "r" (p));
-}
-
static inline void cflush(void *p)
{
asm volatile ("dcbf 0,%0; icbi 0,%0" : : "r" (p));
@@ -882,8 +877,7 @@ static struct bpt *new_breakpoint(unsigned long a)
for (bp = bpts; bp < &bpts[NBPTS]; ++bp) {
if (!bp->enabled && atomic_read(&bp->ref_count) == 0) {
bp->address = a;
- bp->instr[1] = bpinstr;
- store_inst(&bp->instr[1]);
+ patch_instruction(&bp->instr[1], bpinstr);
return bp;
}
}
@@ -895,25 +889,26 @@ static struct bpt *new_breakpoint(unsigned long a)
static void insert_bpts(void)
{
int i;
+ unsigned int instr;
struct bpt *bp;
bp = bpts;
for (i = 0; i < NBPTS; ++i, ++bp) {
if ((bp->enabled & (BP_TRAP|BP_CIABR)) == 0)
continue;
- if (mread(bp->address, &bp->instr[0], 4) != 4) {
+ if (mread(bp->address, &instr, 4) != 4) {
printf("Couldn't read instruction at %lx, "
"disabling breakpoint there\n", bp->address);
bp->enabled = 0;
continue;
}
- if (IS_MTMSRD(bp->instr[0]) || IS_RFID(bp->instr[0])) {
+ if (IS_MTMSRD(instr) || IS_RFID(instr)) {
printf("Breakpoint at %lx is on an mtmsrd or rfid "
"instruction, disabling it\n", bp->address);
bp->enabled = 0;
continue;
}
- store_inst(&bp->instr[0]);
+ patch_instruction(bp->instr, instr);
if (bp->enabled & BP_CIABR)
continue;
if (patch_instruction((unsigned int *)bp->address,
@@ -923,7 +918,6 @@ static void insert_bpts(void)
bp->enabled &= ~BP_TRAP;
continue;
}
- store_inst((void *)bp->address);
}
}
@@ -958,8 +952,6 @@ static void remove_bpts(void)
(unsigned int *)bp->address, bp->instr[0]) != 0)
printf("Couldn't remove breakpoint at %lx\n",
bp->address);
- else
- store_inst((void *)bp->address);
}
}
--
2.17.1
^ permalink raw reply related
* [PATCH v8 00/30] Initial Prefixed Instruction support
From: Jordan Niethe @ 2020-05-06 3:40 UTC (permalink / raw)
To: linuxppc-dev; +Cc: alistair, npiggin, bala24, naveen.n.rao, Jordan Niethe, dja
A future revision of the ISA will introduce prefixed instructions. A
prefixed instruction is composed of a 4-byte prefix followed by a
4-byte suffix.
All prefixes have the major opcode 1. A prefix will never be a valid
word instruction. A suffix may be an existing word instruction or a
new instruction.
This series enables prefixed instructions and extends the instruction
emulation to support them. Then the places where prefixed instructions
might need to be emulated are updated.
v8 incorporates feedback from Alistair Popple and Balamuruhan Suriyakumar.
The major changes:
- Fix some style issues
- Fix __patch_instruction() on big endian
- Reintroduce v3's forbidding breakpoints on second word of prefix
instructions for kprobes and xmon. Missed this when changing to
using a data type.
- Use the data type in some places that were missed.
v7 fixes compilation issues for some configs reported by Alistair
Popple.
v6 is based on feedback from Balamuruhan Suriyakumar, Alistair Popple,
Christophe Leroy and Segher Boessenkool.
The major changes:
- Use the instruction type in more places that had been missed before
- Fix issues with ppc32
- Introduce new self tests for code patching and feature fixups
v5 is based on feedback from Nick Piggins, Michael Ellerman, Balamuruhan
Suriyakumar and Alistair Popple.
The major changes:
- The ppc instruction type is now a struct
- Series now just based on next
- ppc_inst_masked() dropped
- Space for xmon breakpoints allocated in an assembly file
- "Add prefixed instructions to instruction data type" patch seperated in
to smaller patches
- Calling convention for create_branch() is changed
- Some places which had not been updated to use the data type are now updated
v4 is based on feedback from Nick Piggins, Christophe Leroy and Daniel Axtens.
The major changes:
- Move xmon breakpoints from data section to text section
- Introduce a data type for instructions on powerpc
v3 is based on feedback from Christophe Leroy. The major changes:
- Completely replacing store_inst() with patch_instruction() in
xmon
- Improve implementation of mread_instr() to not use mread().
- Base the series on top of
https://patchwork.ozlabs.org/patch/1232619/ as this will effect
kprobes.
- Some renaming and simplification of conditionals.
v2 incorporates feedback from Daniel Axtens and and Balamuruhan
S. The major changes are:
- Squashing together all commits about SRR1 bits
- Squashing all commits for supporting prefixed load stores
- Changing abbreviated references to sufx/prfx -> suffix/prefix
- Introducing macros for returning the length of an instruction
- Removing sign extension flag from pstd/pld in sstep.c
- Dropping patch "powerpc/fault: Use analyse_instr() to check for
store with updates to sp" from the series, it did not really fit
with prefixed enablement in the first place and as reported by Greg
Kurz did not work correctly.
Alistair Popple (1):
powerpc: Enable Prefixed Instructions
Jordan Niethe (29):
powerpc/xmon: Remove store_inst() for patch_instruction()
powerpc/xmon: Move breakpoint instructions to own array
powerpc/xmon: Move breakpoints to text section
powerpc/xmon: Use bitwise calculations in_breakpoint_table()
powerpc: Change calling convention for create_branch() et. al.
powerpc: Use a macro for creating instructions from u32s
powerpc: Use an accessor for instructions
powerpc: Use a function for getting the instruction op code
powerpc: Use a function for byte swapping instructions
powerpc: Introduce functions for instruction equality
powerpc: Use a datatype for instructions
powerpc: Use a function for reading instructions
powerpc: Add a probe_user_read_inst() function
powerpc: Add a probe_kernel_read_inst() function
powerpc/kprobes: Use patch_instruction()
powerpc: Define and use __get_user_instr{,inatomic}()
powerpc: Introduce a function for reporting instruction length
powerpc/xmon: Use a function for reading instructions
powerpc/xmon: Move insertion of breakpoint for xol'ing
powerpc: Make test_translate_branch() independent of instruction
length
powerpc: Define new SRR1 bits for a future ISA version
powerpc: Add prefixed instructions to instruction data type
powerpc: Test prefixed code patching
powerpc: Test prefixed instructions in feature fixups
powerpc/xmon: Don't allow breakpoints on suffixes
powerpc/kprobes: Don't allow breakpoints on suffixes
powerpc: Support prefixed instructions in alignment handler
powerpc sstep: Add support for prefixed load/stores
powerpc sstep: Add support for prefixed fixed-point arithmetic
arch/powerpc/include/asm/code-patching.h | 37 +-
arch/powerpc/include/asm/inst.h | 107 +++++
arch/powerpc/include/asm/kprobes.h | 2 +-
arch/powerpc/include/asm/ppc-opcode.h | 3 +
arch/powerpc/include/asm/reg.h | 7 +-
arch/powerpc/include/asm/sstep.h | 15 +-
arch/powerpc/include/asm/uaccess.h | 43 ++
arch/powerpc/include/asm/uprobes.h | 7 +-
arch/powerpc/kernel/align.c | 13 +-
arch/powerpc/kernel/asm-offsets.c | 8 +
arch/powerpc/kernel/crash_dump.c | 7 +-
arch/powerpc/kernel/epapr_paravirt.c | 7 +-
arch/powerpc/kernel/hw_breakpoint.c | 5 +-
arch/powerpc/kernel/jump_label.c | 5 +-
arch/powerpc/kernel/kgdb.c | 9 +-
arch/powerpc/kernel/kprobes.c | 37 +-
arch/powerpc/kernel/mce_power.c | 5 +-
arch/powerpc/kernel/module_64.c | 3 +-
arch/powerpc/kernel/optprobes.c | 102 +++--
arch/powerpc/kernel/optprobes_head.S | 3 +
arch/powerpc/kernel/security.c | 12 +-
arch/powerpc/kernel/setup_32.c | 8 +-
arch/powerpc/kernel/trace/ftrace.c | 168 ++++----
arch/powerpc/kernel/traps.c | 20 +-
arch/powerpc/kernel/uprobes.c | 5 +-
arch/powerpc/kernel/vecemu.c | 20 +-
arch/powerpc/kvm/book3s_hv_nested.c | 2 +-
arch/powerpc/kvm/book3s_hv_rm_mmu.c | 2 +-
arch/powerpc/kvm/emulate_loadstore.c | 2 +-
arch/powerpc/lib/Makefile | 2 +-
arch/powerpc/lib/code-patching.c | 319 +++++++++------
arch/powerpc/lib/feature-fixups-test.S | 69 ++++
arch/powerpc/lib/feature-fixups.c | 160 ++++++--
arch/powerpc/lib/inst.c | 70 ++++
arch/powerpc/lib/sstep.c | 459 +++++++++++++++-------
arch/powerpc/lib/test_code-patching.S | 20 +
arch/powerpc/lib/test_emulate_step.c | 56 +--
arch/powerpc/mm/fault.c | 15 +-
arch/powerpc/mm/nohash/8xx.c | 5 +-
arch/powerpc/perf/8xx-pmu.c | 9 +-
arch/powerpc/perf/core-book3s.c | 4 +-
arch/powerpc/platforms/86xx/mpc86xx_smp.c | 5 +-
arch/powerpc/platforms/powermac/smp.c | 5 +-
arch/powerpc/xmon/Makefile | 2 +-
arch/powerpc/xmon/xmon.c | 122 ++++--
arch/powerpc/xmon/xmon_bpts.S | 11 +
arch/powerpc/xmon/xmon_bpts.h | 14 +
47 files changed, 1409 insertions(+), 602 deletions(-)
create mode 100644 arch/powerpc/include/asm/inst.h
create mode 100644 arch/powerpc/lib/inst.c
create mode 100644 arch/powerpc/lib/test_code-patching.S
create mode 100644 arch/powerpc/xmon/xmon_bpts.S
create mode 100644 arch/powerpc/xmon/xmon_bpts.h
--
2.17.1
^ permalink raw reply
* Re: [PATCH] powerpc/ps3: Move static keyword to the front of declaration
From: Michael Ellerman @ 2020-05-06 2:51 UTC (permalink / raw)
To: Xiongfeng Wang, geoff, benh, paulus, linuxppc-dev, linux-kernel,
wangxiongfeng2
In-Reply-To: <1588154448-56759-1-git-send-email-wangxiongfeng2@huawei.com>
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 618 bytes --]
On Wed, 2020-04-29 at 10:00:48 UTC, Xiongfeng Wang wrote:
> Move the static keyword to the front of declaration of 'vuart_bus_priv',
> and resolve the following compiler warning that can be seen when
> building with warnings enabled (W=1):
>
> drivers/ps3/ps3-vuart.c:867:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration]
> } static vuart_bus_priv;
> ^
>
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/43c8a496fa37187b54f7df71fb8262acc6bf6200
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/64: Have MPROFILE_KERNEL depend on FUNCTION_TRACER
From: Michael Ellerman @ 2020-05-06 2:51 UTC (permalink / raw)
To: Naveen N. Rao, linuxppc-dev; +Cc: Steven Rostedt
In-Reply-To: <20200422092612.514301-1-naveen.n.rao@linux.vnet.ibm.com>
On Wed, 2020-04-22 at 09:26:12 UTC, "Naveen N. Rao" wrote:
> Currently, it is possible to have CONFIG_FUNCTION_TRACER disabled, but
> CONFIG_MPROFILE_KERNEL enabled. Though all existing users of
> MPROFILE_KERNEL are doing the right thing, it is weird to have
> MPROFILE_KERNEL enabled when the function tracer isn't. Fix this by
> making MPROFILE_KERNEL depend on FUNCTION_TRACER.
>
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/57b3ed941b5542aaebcd9f59369571bbce9d6dcc
cheers
^ permalink raw reply
* Re: [PATCH v3 1/2] powerpc/fadump: use static allocation for reserved memory ranges
From: Michael Ellerman @ 2020-05-06 2:51 UTC (permalink / raw)
To: Hari Bathini, linuxppc-dev
Cc: Sourabh Jain, Vasant Hegde, stable, Mahesh J Salgaonkar
In-Reply-To: <158737294432.26700.4830263187856221314.stgit@hbathini.in.ibm.com>
On Mon, 2020-04-20 at 08:56:09 UTC, Hari Bathini wrote:
> At times, memory ranges have to be looked up during early boot, when
> kernel couldn't be initialized for dynamic memory allocation. In fact,
> reserved-ranges look up is needed during FADump memory reservation.
> Without accounting for reserved-ranges in reserving memory for FADump,
> MPIPL boot fails with memory corruption issues. So, extend memory
> ranges handling to support static allocation and populate reserved
> memory ranges during early boot.
>
> Fixes: dda9dbfeeb7a ("powerpc/fadump: consider reserved ranges while releasing memory")
> Cc: stable@vger.kernel.org
> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Series applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/02c04e374e176ae3a3f64a682f80702f8d2fb65d
cheers
^ permalink raw reply
* Re: [PATCH v5 1/5] powerpc: Move idle_loop_prolog()/epilog() functions to header file
From: Michael Ellerman @ 2020-05-06 2:51 UTC (permalink / raw)
To: Gautham R. Shenoy, Nathan Lynch, Vaidyanathan Srinivasan,
Kamalesh Babulal, Naveen N. Rao, Tyrel Datwyler
Cc: Gautham R. Shenoy, linuxppc-dev, linux-kernel
In-Reply-To: <1586249263-14048-2-git-send-email-ego@linux.vnet.ibm.com>
On Tue, 2020-04-07 at 08:47:39 UTC, "Gautham R. Shenoy" wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>
> Currently prior to entering an idle state on a Linux Guest, the
> pseries cpuidle driver implement an idle_loop_prolog() and
> idle_loop_epilog() functions which ensure that idle_purr is correctly
> computed, and the hypervisor is informed that the CPU cycles have been
> donated.
>
> These prolog and epilog functions are also required in the default
> idle call, i.e pseries_lpar_idle(). Hence move these accessor
> functions to a common header file and call them from
> pseries_lpar_idle(). Since the existing header files such as
> asm/processor.h have enough clutter, create a new header file
> asm/idle.h. Finally rename idle_loop_prolog() and idle_loop_epilog()
> to pseries_idle_prolog() and pseries_idle_epilog() as they are only
> relavent for on pseries guests.
>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Series applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/e4a884cc28fa3f5d8b81de46998ffe29b4ad169e
cheers
^ permalink raw reply
* Re: [PATCH 2/3] ASoC: fsl_esai: Add support for imx8qm
From: Shengjiu Wang @ 2020-05-06 2:33 UTC (permalink / raw)
To: Mark Brown
Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
Linux-ALSA, Timur Tabi, Xiubo Li, Fabio Estevam, Shengjiu Wang,
Liam Girdwood, Takashi Iwai, Nicolin Chen, Rob Herring,
linuxppc-dev, linux-kernel
In-Reply-To: <20200501102158.GA5276@sirena.org.uk>
Hi
On Fri, May 1, 2020 at 6:23 PM Mark Brown <broonie@kernel.org> wrote:
>
> On Fri, May 01, 2020 at 04:12:05PM +0800, Shengjiu Wang wrote:
> > The difference for esai on imx8qm is that DMA device is EDMA.
> >
> > EDMA requires the period size to be multiple of maxburst. Otherwise
> > the remaining bytes are not transferred and thus noise is produced.
>
> If this constraint comes from the DMA controller then normally you'd
> expect the DMA controller integration to be enforcing this - is there no
> information in the DMA API that lets us know that this constraint is
> there?
No, I can't find one API for this.
Do you have a recommendation?
best regards
wang shengjiu
^ permalink raw reply
* Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
From: Michael Ellerman @ 2020-05-06 1:36 UTC (permalink / raw)
To: Segher Boessenkool, Christophe Leroy
Cc: linux-kernel, npiggin, Paul Mackerras, linuxppc-dev
In-Reply-To: <20200505155944.GO31009@gate.crashing.org>
Segher Boessenkool <segher@kernel.crashing.org> writes:
> On Tue, May 05, 2020 at 05:40:21PM +0200, Christophe Leroy wrote:
>> >>+#define __put_user_asm_goto(x, addr, label, op) \
>> >>+ asm volatile goto( \
>> >>+ "1: " op "%U1%X1 %0,%1 # put_user\n" \
>> >>+ EX_TABLE(1b, %l2) \
>> >>+ : \
>> >>+ : "r" (x), "m<>" (*addr) \
>> >
>> >The "m<>" here is breaking GCC 4.6.3, which we allegedly still support.
>> >
>> >Plain "m" works, how much does the "<>" affect code gen in practice?
>> >
>> >A quick diff here shows no difference from removing "<>".
>>
>> It was recommended by Segher, there has been some discussion about it on
>> v1 of this patch, see
>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/4fdc2aba6f5e51887d1cd0fee94be0989eada2cd.1586942312.git.christophe.leroy@c-s.fr/
>>
>> As far as I understood that's mandatory on recent gcc to get the
>> pre-update form of the instruction. With older versions "m" was doing
>> the same, but not anymore.
>
> Yes. How much that matters depends on the asm. On older CPUs (6xx/7xx,
> say) the update form was just as fast as the non-update form. On newer
> or bigger CPUs it is usually executed just the same as an add followed
> by the memory access, so it just saves a bit of code size.
The update-forms are stdux, sthux etc. right?
I don't see any change in the number of those with or without the
constraint. That's using GCC 9.3.0.
>> Should we ifdef the "m<>" or "m" based on GCC
>> version ?
>
> That will be a lot of churn. Just make 4.8 minimum?
As I said in my other mail that's not really up to us. We could mandate
a higher minimum for powerpc, but I'd rather not.
I think for now I'm inclined to just drop the "<>", and we can revisit
in a release or two when hopefully GCC 4.8 has become the minimum.
cheers
^ permalink raw reply
* Re: [PATCH v2 1/2] powerpc/64s/hash: add torture_slb kernel boot option to increase SLB faults
From: kbuild test robot @ 2020-05-06 1:32 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev, Aneesh Kumar K . V
Cc: clang-built-linux, kbuild-all, Aneesh Kumar K . V
In-Reply-To: <20200504090658.44996-1-npiggin@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 16681 bytes --]
Hi Nicholas,
I love your patch! Yet something to improve:
[auto build test ERROR on powerpc/next]
[also build test ERROR on linus/master v5.7-rc4 next-20200505]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Nicholas-Piggin/powerpc-64s-hash-add-torture_slb-kernel-boot-option-to-increase-SLB-faults/20200505-053958
base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-randconfig-a001-20200503 (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project 9e3549804672c79d64eececab39019f4dfd2b7ea)
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install powerpc cross compiling tool for clang build
# apt-get install binutils-powerpc-linux-gnu
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc
If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>
All error/warnings (new ones prefixed by >>):
In file included from arch/powerpc/kernel/asm-offsets.c:14:
In file included from include/linux/compat.h:10:
In file included from include/linux/time.h:6:
In file included from include/linux/seqlock.h:36:
In file included from include/linux/spinlock.h:51:
In file included from include/linux/preempt.h:78:
In file included from ./arch/powerpc/include/generated/asm/preempt.h:1:
In file included from include/asm-generic/preempt.h:5:
In file included from include/linux/thread_info.h:21:
In file included from arch/powerpc/include/asm/current.h:13:
In file included from arch/powerpc/include/asm/paca.h:17:
In file included from arch/powerpc/include/asm/lppaca.h:47:
In file included from arch/powerpc/include/asm/mmu.h:356:
In file included from arch/powerpc/include/asm/book3s/64/mmu.h:46:
>> arch/powerpc/include/asm/book3s/64/mmu-hash.h:321:1: warning: declaration specifier missing, defaulting to 'int'
DECLARE_STATIC_KEY_FALSE(torture_slb_key);
^
int
>> arch/powerpc/include/asm/book3s/64/mmu-hash.h:321:26: error: a parameter list without types is only allowed in a function definition
DECLARE_STATIC_KEY_FALSE(torture_slb_key);
^
>> arch/powerpc/include/asm/book3s/64/mmu-hash.h:324:9: error: implicit declaration of function 'static_branch_unlikely' [-Werror,-Wimplicit-function-declaration]
return static_branch_unlikely(&torture_slb_key);
^
>> arch/powerpc/include/asm/book3s/64/mmu-hash.h:324:33: error: use of undeclared identifier 'torture_slb_key'; did you mean 'torture_slb'?
return static_branch_unlikely(&torture_slb_key);
^~~~~~~~~~~~~~~
torture_slb
arch/powerpc/include/asm/book3s/64/mmu-hash.h:322:20: note: 'torture_slb' declared here
static inline bool torture_slb(void)
^
In file included from arch/powerpc/kernel/asm-offsets.c:14:
In file included from include/linux/compat.h:17:
In file included from include/linux/fs.h:34:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:87:11: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
return (set->sig[3] | set->sig[2] |
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/powerpc/kernel/asm-offsets.c:14:
In file included from include/linux/compat.h:17:
In file included from include/linux/fs.h:34:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:87:25: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
return (set->sig[3] | set->sig[2] |
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/powerpc/kernel/asm-offsets.c:14:
In file included from include/linux/compat.h:17:
In file included from include/linux/fs.h:34:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:88:4: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
set->sig[1] | set->sig[0]) == 0;
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/powerpc/kernel/asm-offsets.c:14:
In file included from include/linux/compat.h:17:
In file included from include/linux/fs.h:34:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:90:11: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
return (set->sig[1] | set->sig[0]) == 0;
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/powerpc/kernel/asm-offsets.c:14:
In file included from include/linux/compat.h:17:
In file included from include/linux/fs.h:34:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:103:11: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
return (set1->sig[3] == set2->sig[3]) &&
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/powerpc/kernel/asm-offsets.c:14:
In file included from include/linux/compat.h:17:
In file included from include/linux/fs.h:34:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:103:27: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
return (set1->sig[3] == set2->sig[3]) &&
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/powerpc/kernel/asm-offsets.c:14:
In file included from include/linux/compat.h:17:
In file included from include/linux/fs.h:34:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:104:5: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
(set1->sig[2] == set2->sig[2]) &&
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/powerpc/kernel/asm-offsets.c:14:
In file included from include/linux/compat.h:17:
In file included from include/linux/fs.h:34:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:104:21: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
(set1->sig[2] == set2->sig[2]) &&
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
--
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:232:10: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
case 2: set->sig[1] = 0;
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/powerpc/kernel/asm-offsets.c:14:
In file included from include/linux/compat.h:17:
In file included from include/linux/fs.h:34:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:244:10: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
case 2: set->sig[1] = -1;
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/powerpc/kernel/asm-offsets.c:14:
include/linux/compat.h:424:22: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
case 4: v.sig[7] = (set->sig[3] >> 32); v.sig[6] = set->sig[3];
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/powerpc/kernel/asm-offsets.c:14:
include/linux/compat.h:424:10: warning: array index 7 is past the end of the array (which contains 2 elements) [-Warray-bounds]
case 4: v.sig[7] = (set->sig[3] >> 32); v.sig[6] = set->sig[3];
^ ~
include/linux/compat.h:131:2: note: array 'sig' declared here
compat_sigset_word sig[_COMPAT_NSIG_WORDS];
^
include/linux/compat.h:424:42: warning: array index 6 is past the end of the array (which contains 2 elements) [-Warray-bounds]
case 4: v.sig[7] = (set->sig[3] >> 32); v.sig[6] = set->sig[3];
^ ~
include/linux/compat.h:131:2: note: array 'sig' declared here
compat_sigset_word sig[_COMPAT_NSIG_WORDS];
^
include/linux/compat.h:424:53: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
case 4: v.sig[7] = (set->sig[3] >> 32); v.sig[6] = set->sig[3];
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/powerpc/kernel/asm-offsets.c:14:
include/linux/compat.h:426:22: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
case 3: v.sig[5] = (set->sig[2] >> 32); v.sig[4] = set->sig[2];
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/powerpc/kernel/asm-offsets.c:14:
include/linux/compat.h:426:10: warning: array index 5 is past the end of the array (which contains 2 elements) [-Warray-bounds]
case 3: v.sig[5] = (set->sig[2] >> 32); v.sig[4] = set->sig[2];
^ ~
include/linux/compat.h:131:2: note: array 'sig' declared here
compat_sigset_word sig[_COMPAT_NSIG_WORDS];
^
include/linux/compat.h:426:42: warning: array index 4 is past the end of the array (which contains 2 elements) [-Warray-bounds]
case 3: v.sig[5] = (set->sig[2] >> 32); v.sig[4] = set->sig[2];
^ ~
include/linux/compat.h:131:2: note: array 'sig' declared here
compat_sigset_word sig[_COMPAT_NSIG_WORDS];
^
include/linux/compat.h:426:53: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
case 3: v.sig[5] = (set->sig[2] >> 32); v.sig[4] = set->sig[2];
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/powerpc/kernel/asm-offsets.c:14:
include/linux/compat.h:428:22: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
case 2: v.sig[3] = (set->sig[1] >> 32); v.sig[2] = set->sig[1];
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/powerpc/kernel/asm-offsets.c:14:
include/linux/compat.h:428:10: warning: array index 3 is past the end of the array (which contains 2 elements) [-Warray-bounds]
case 2: v.sig[3] = (set->sig[1] >> 32); v.sig[2] = set->sig[1];
^ ~
include/linux/compat.h:131:2: note: array 'sig' declared here
compat_sigset_word sig[_COMPAT_NSIG_WORDS];
^
include/linux/compat.h:428:42: warning: array index 2 is past the end of the array (which contains 2 elements) [-Warray-bounds]
case 2: v.sig[3] = (set->sig[1] >> 32); v.sig[2] = set->sig[1];
^ ~
include/linux/compat.h:131:2: note: array 'sig' declared here
compat_sigset_word sig[_COMPAT_NSIG_WORDS];
^
include/linux/compat.h:428:53: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
case 2: v.sig[3] = (set->sig[1] >> 32); v.sig[2] = set->sig[1];
^ ~
arch/powerpc/include/uapi/asm/signal.h:18:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/powerpc/kernel/asm-offsets.c:21:
>> include/linux/mman.h:133:9: warning: division by zero is undefined [-Wdivision-by-zero]
_calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) |
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/mman.h:111:21: note: expanded from macro '_calc_vm_trans'
: ((x) & (bit1)) / ((bit1) / (bit2))))
^ ~~~~~~~~~~~~~~~~~
include/linux/mman.h:134:9: warning: division by zero is undefined [-Wdivision-by-zero]
_calc_vm_trans(flags, MAP_SYNC, VM_SYNC );
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/mman.h:111:21: note: expanded from macro '_calc_vm_trans'
: ((x) & (bit1)) / ((bit1) / (bit2))))
^ ~~~~~~~~~~~~~~~~~
64 warnings and 3 errors generated.
make[2]: *** [scripts/Makefile.build:100: arch/powerpc/kernel/asm-offsets.s] Error 1
make[2]: Target '__build' not remade because of errors.
make[1]: *** [Makefile:1141: prepare0] Error 2
make[1]: Target 'prepare' not remade because of errors.
make: *** [Makefile:180: sub-make] Error 2
vim +321 arch/powerpc/include/asm/book3s/64/mmu-hash.h
319
320 extern bool torture_slb_enabled;
> 321 DECLARE_STATIC_KEY_FALSE(torture_slb_key);
322 static inline bool torture_slb(void)
323 {
> 324 return static_branch_unlikely(&torture_slb_key);
325 }
326
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 31814 bytes --]
^ permalink raw reply
* Re: [RFC PATCH 2/2] powerpc/64s: system call support for scv/rfscv instructions
From: Nicholas Piggin @ 2020-05-06 1:11 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: linuxppc-dev
In-Reply-To: <20200505221119.GQ31009@gate.crashing.org>
Excerpts from Segher Boessenkool's message of May 6, 2020 8:11 am:
> Hi!
>
> On Thu, Apr 30, 2020 at 02:02:02PM +1000, Nicholas Piggin wrote:
>> Add support for the scv instruction on POWER9 and later CPUs.
>
> Looks good to me in general :-)
Thanks for taking a look.
>> For now this implements the zeroth scv vector 'scv 0', as identical
>> to 'sc' system calls, with the exception that lr is not preserved, and
>> it is 64-bit only. There may yet be changes made to this ABI, so it's
>> for testing only.
>
> What does it do with SF=0? I don't see how it is obviously not a
> security hole currently (but I didn't look too closely).
Oh that's an outdated comment, I since decided better to keep all the code
common and handle 32-bit compat the same way as existing sc syscall.
Thanks,
Nick
^ permalink raw reply
* Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
From: Michael Ellerman @ 2020-05-06 0:58 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: linux-kernel, npiggin, Paul Mackerras, linuxppc-dev
In-Reply-To: <20200505153245.GN31009@gate.crashing.org>
Segher Boessenkool <segher@kernel.crashing.org> writes:
> Hi!
>
> On Wed, May 06, 2020 at 12:27:58AM +1000, Michael Ellerman wrote:
>> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>> > unsafe_put_user() is designed to take benefit of 'asm goto'.
>> >
>> > Instead of using the standard __put_user() approach and branch
>> > based on the returned error, use 'asm goto' and make the
>> > exception code branch directly to the error label. There is
>> > no code anymore in the fixup section.
>> >
>> > This change significantly simplifies functions using
>> > unsafe_put_user()
>> >
>> ...
>> >
>> > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>> > ---
>> > arch/powerpc/include/asm/uaccess.h | 61 +++++++++++++++++++++++++-----
>> > 1 file changed, 52 insertions(+), 9 deletions(-)
>> >
>> > diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
>> > index 9cc9c106ae2a..9365b59495a2 100644
>> > --- a/arch/powerpc/include/asm/uaccess.h
>> > +++ b/arch/powerpc/include/asm/uaccess.h
>> > @@ -196,6 +193,52 @@ do { \
>> > })
>> >
>> >
>> > +#define __put_user_asm_goto(x, addr, label, op) \
>> > + asm volatile goto( \
>> > + "1: " op "%U1%X1 %0,%1 # put_user\n" \
>> > + EX_TABLE(1b, %l2) \
>> > + : \
>> > + : "r" (x), "m<>" (*addr) \
>>
>> The "m<>" here is breaking GCC 4.6.3, which we allegedly still support.
>
> [ You shouldn't use 4.6.3, there has been 4.6.4 since a while. And 4.6
> is nine years old now. Most projects do not support < 4.8 anymore, on
> any architecture. ]
Moving up to 4.6.4 wouldn't actually help with this though would it?
Also I have 4.6.3 compilers already built, I don't really have time to
rebuild them for 4.6.4.
The kernel has a top-level minimum version, which I'm not in charge of, see:
https://www.kernel.org/doc/html/latest/process/changes.html?highlight=gcc
There were discussions about making 4.8 the minimum, but I'm not sure
where they got to.
>> Plain "m" works, how much does the "<>" affect code gen in practice?
>>
>> A quick diff here shows no difference from removing "<>".
>
> It will make it impossible to use update-form instructions here. That
> probably does not matter much at all, in this case.
>
> If you remove the "<>" constraints, also remove the "%Un" output modifier?
So like this?
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 62cc8d7640ec..ca847aed8e45 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -207,10 +207,10 @@ do { \
#define __put_user_asm_goto(x, addr, label, op) \
asm volatile goto( \
- "1: " op "%U1%X1 %0,%1 # put_user\n" \
+ "1: " op "%X1 %0,%1 # put_user\n" \
EX_TABLE(1b, %l2) \
: \
- : "r" (x), "m<>" (*addr) \
+ : "r" (x), "m" (*addr) \
: \
: label)
cheers
^ permalink raw reply related
* Re: [PATCH v2 0/5] Statsfs: a new ram-based file sytem for Linux kernel statistics
From: Jim Mattson @ 2020-05-05 16:53 UTC (permalink / raw)
To: Emanuele Giuseppe Esposito
Cc: linux-s390, kvm list, David Hildenbrand, Cornelia Huck,
Emanuele Giuseppe Esposito, LKML, kvm-ppc, Jonathan Adams,
Christian Borntraeger, Alexander Viro, David Rientjes,
Linux FS Devel, Paolo Bonzini, Vitaly Kuznetsov, linux-mips,
linuxppc-dev
In-Reply-To: <f2654143-b8e5-5a1f-8bd0-0cb0df2cd638@redhat.com>
On Tue, May 5, 2020 at 2:18 AM Emanuele Giuseppe Esposito
<eesposit@redhat.com> wrote:
>
>
>
> On 5/4/20 11:37 PM, David Rientjes wrote:
> > Since this is becoming a generic API (good!!), maybe we can discuss
> > possible ways to optimize gathering of stats in mass?
>
> Sure, the idea of a binary format was considered from the beginning in
> [1], and it can be done either together with the current filesystem, or
> as a replacement via different mount options.
ASCII stats are not scalable. A binary format is definitely the way to go.
^ permalink raw reply
* Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
From: Christophe Leroy @ 2020-05-05 15:40 UTC (permalink / raw)
To: Michael Ellerman, Christophe Leroy, Benjamin Herrenschmidt,
Paul Mackerras, npiggin, segher
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <87sggecv81.fsf@mpe.ellerman.id.au>
Hi,
Le 05/05/2020 à 16:27, Michael Ellerman a écrit :
> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>> unsafe_put_user() is designed to take benefit of 'asm goto'.
>>
>> Instead of using the standard __put_user() approach and branch
>> based on the returned error, use 'asm goto' and make the
>> exception code branch directly to the error label. There is
>> no code anymore in the fixup section.
>>
>> This change significantly simplifies functions using
>> unsafe_put_user()
>>
> ...
>>
>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>> ---
>> arch/powerpc/include/asm/uaccess.h | 61 +++++++++++++++++++++++++-----
>> 1 file changed, 52 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
>> index 9cc9c106ae2a..9365b59495a2 100644
>> --- a/arch/powerpc/include/asm/uaccess.h
>> +++ b/arch/powerpc/include/asm/uaccess.h
>> @@ -196,6 +193,52 @@ do { \
>> })
>>
>>
>> +#define __put_user_asm_goto(x, addr, label, op) \
>> + asm volatile goto( \
>> + "1: " op "%U1%X1 %0,%1 # put_user\n" \
>> + EX_TABLE(1b, %l2) \
>> + : \
>> + : "r" (x), "m<>" (*addr) \
>
> The "m<>" here is breaking GCC 4.6.3, which we allegedly still support.
>
> Plain "m" works, how much does the "<>" affect code gen in practice?
>
> A quick diff here shows no difference from removing "<>".
It was recommended by Segher, there has been some discussion about it on
v1 of this patch, see
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/4fdc2aba6f5e51887d1cd0fee94be0989eada2cd.1586942312.git.christophe.leroy@c-s.fr/
As far as I understood that's mandatory on recent gcc to get the
pre-update form of the instruction. With older versions "m" was doing
the same, but not anymore. Should we ifdef the "m<>" or "m" based on GCC
version ?
Christophe
^ permalink raw reply
* Re: [PATCH] powerpc/xive: Enforce load-after-store ordering when StoreEOI is active
From: Alistair Popple @ 2020-05-05 23:34 UTC (permalink / raw)
To: Cédric Le Goater; +Cc: linuxppc-dev, Paul Mackerras, Greg Kurz
In-Reply-To: <20200220081506.31209-1-clg@kaod.org>
I am still slowly wrapping my head around XIVE and it's interaction with KVM
but from what I can see this looks good and is needed so we can enable
StoreEOI support in future so:
Reviewed-by: Alistair Popple <alistair@popple.id.au>
On Thursday, 20 February 2020 7:15:06 PM AEST Cédric Le Goater wrote:
> When an interrupt has been handled, the OS notifies the interrupt
> controller with a EOI sequence. On a POWER9 system using the XIVE
> interrupt controller, this can be done with a load or a store
> operation on the ESB interrupt management page of the interrupt. The
> StoreEOI operation has less latency and improves interrupt handling
> performance but it was deactivated during the POWER9 DD2.0 timeframe
> because of ordering issues. We use the LoadEOI today but we plan to
> reactivate StoreEOI in future architectures.
>
> There is usually no need to enforce ordering between ESB load and
> store operations as they should lead to the same result. E.g. a store
> trigger and a load EOI can be executed in any order. Assuming the
> interrupt state is PQ=10, a store trigger followed by a load EOI will
> return a Q bit. In the reverse order, it will create a new interrupt
> trigger from HW. In both cases, the handler processing interrupts is
> notified.
>
> In some cases, the XIVE_ESB_SET_PQ_10 load operation is used to
> disable temporarily the interrupt source (mask/unmask). When the
> source is reenabled, the OS can detect if interrupts were received
> while the source was disabled and reinject them. This process needs
> special care when StoreEOI is activated. The ESB load and store
> operations should be correctly ordered because a XIVE_ESB_STORE_EOI
> operation could leave the source enabled if it has not completed
> before the loads.
>
> For those cases, we enforce Load-after-Store ordering with a special
> load operation offset. To avoid performance impact, this ordering is
> only enforced when really needed, that is when interrupt sources are
> temporarily disabled with the XIVE_ESB_SET_PQ_10 load. It should not
> be needed for other loads.
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> arch/powerpc/include/asm/xive-regs.h | 8 ++++++++
> arch/powerpc/kvm/book3s_xive_native.c | 6 ++++++
> arch/powerpc/kvm/book3s_xive_template.c | 3 +++
> arch/powerpc/sysdev/xive/common.c | 3 +++
> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 5 +++++
> 5 files changed, 25 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/xive-regs.h
> b/arch/powerpc/include/asm/xive-regs.h index f2dfcd50a2d3..b1996fbae59a
> 100644
> --- a/arch/powerpc/include/asm/xive-regs.h
> +++ b/arch/powerpc/include/asm/xive-regs.h
> @@ -37,6 +37,14 @@
> #define XIVE_ESB_SET_PQ_10 0xe00 /* Load */
> #define XIVE_ESB_SET_PQ_11 0xf00 /* Load */
>
> +/*
> + * Load-after-store ordering
> + *
> + * Adding this offset to the load address will enforce
> + * load-after-store ordering. This is required to use StoreEOI.
> + */
> +#define XIVE_ESB_LD_ST_MO 0x40 /* Load-after-store ordering */
> +
> #define XIVE_ESB_VAL_P 0x2
> #define XIVE_ESB_VAL_Q 0x1
>
> diff --git a/arch/powerpc/kvm/book3s_xive_native.c
> b/arch/powerpc/kvm/book3s_xive_native.c index d83adb1e1490..c80b6a447efd
> 100644
> --- a/arch/powerpc/kvm/book3s_xive_native.c
> +++ b/arch/powerpc/kvm/book3s_xive_native.c
> @@ -31,6 +31,12 @@ static u8 xive_vm_esb_load(struct xive_irq_data *xd, u32
> offset) {
> u64 val;
>
> + /*
> + * The KVM XIVE native device does not use the XIVE_ESB_SET_PQ_10
> + * load operation, so there is no need to enforce load-after-store
> + * ordering.
> + */
> +
> if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
> offset |= offset << 4;
>
> diff --git a/arch/powerpc/kvm/book3s_xive_template.c
> b/arch/powerpc/kvm/book3s_xive_template.c index a8a900ace1e6..4ad3c0279458
> 100644
> --- a/arch/powerpc/kvm/book3s_xive_template.c
> +++ b/arch/powerpc/kvm/book3s_xive_template.c
> @@ -58,6 +58,9 @@ static u8 GLUE(X_PFX,esb_load)(struct xive_irq_data *xd,
> u32 offset) {
> u64 val;
>
> + if (offset == XIVE_ESB_SET_PQ_10 && xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
> + offset |= XIVE_ESB_LD_ST_MO;
> +
> if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
> offset |= offset << 4;
>
> diff --git a/arch/powerpc/sysdev/xive/common.c
> b/arch/powerpc/sysdev/xive/common.c index f5fadbd2533a..0dc421bb494f 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -202,6 +202,9 @@ static notrace u8 xive_esb_read(struct xive_irq_data
> *xd, u32 offset) {
> u64 val;
>
> + if (offset == XIVE_ESB_SET_PQ_10 && xd->flags & XIVE_IRQ_FLAG_STORE_EOI)
> + offset |= XIVE_ESB_LD_ST_MO;
> +
> /* Handle HW errata */
> if (xd->flags & XIVE_IRQ_FLAG_SHIFT_BUG)
> offset |= offset << 4;
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index e11017897eb0..abe132ff2346
> 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -2911,6 +2911,11 @@ kvm_cede_exit:
> beq 4f
> li r0, 0
> stb r0, VCPU_CEDED(r9)
> + /*
> + * The escalation interrupts are special as we don't EOI them.
> + * There is no need to use the load-after-store ordering offset
> + * to set PQ to 10 as we won't use StoreEOI.
> + */
> li r6, XIVE_ESB_SET_PQ_10
> b 5f
> 4: li r0, 1
^ permalink raw reply
* Re: [PATCH v4 7/8] powerpc/vdso32: implement clock_getres entirely
From: Aurelien Jarno @ 2020-05-05 22:52 UTC (permalink / raw)
To: Christophe Leroy; +Cc: arnd, linux-kernel, Paul Mackerras, linuxppc-dev
In-Reply-To: <37f94e47c91070b7606fb3ec3fe6fd2302a475a0.1575273217.git.christophe.leroy@c-s.fr>
Hi,
On 2019-12-02 07:57, Christophe Leroy wrote:
> clock_getres returns hrtimer_res for all clocks but coarse ones
> for which it returns KTIME_LOW_RES.
>
> return EINVAL for unknown clocks.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
> arch/powerpc/kernel/asm-offsets.c | 3 +++
> arch/powerpc/kernel/vdso32/gettimeofday.S | 19 +++++++++++--------
> 2 files changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
> index 0013197d89a6..90e53d432f2e 100644
> --- a/arch/powerpc/kernel/asm-offsets.c
> +++ b/arch/powerpc/kernel/asm-offsets.c
> @@ -413,7 +413,10 @@ int main(void)
> DEFINE(CLOCK_MONOTONIC, CLOCK_MONOTONIC);
> DEFINE(CLOCK_REALTIME_COARSE, CLOCK_REALTIME_COARSE);
> DEFINE(CLOCK_MONOTONIC_COARSE, CLOCK_MONOTONIC_COARSE);
> + DEFINE(CLOCK_MAX, CLOCK_TAI);
> DEFINE(NSEC_PER_SEC, NSEC_PER_SEC);
> + DEFINE(EINVAL, EINVAL);
> + DEFINE(KTIME_LOW_RES, KTIME_LOW_RES);
>
> #ifdef CONFIG_BUG
> DEFINE(BUG_ENTRY_SIZE, sizeof(struct bug_entry));
> diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S
> index 9aafacea9c4a..20ae38f3a5a3 100644
> --- a/arch/powerpc/kernel/vdso32/gettimeofday.S
> +++ b/arch/powerpc/kernel/vdso32/gettimeofday.S
> @@ -196,17 +196,20 @@ V_FUNCTION_END(__kernel_clock_gettime)
> V_FUNCTION_BEGIN(__kernel_clock_getres)
> .cfi_startproc
> /* Check for supported clock IDs */
> - cmpwi cr0,r3,CLOCK_REALTIME
> - cmpwi cr1,r3,CLOCK_MONOTONIC
> - cror cr0*4+eq,cr0*4+eq,cr1*4+eq
> - bne cr0,99f
> + cmplwi cr0, r3, CLOCK_MAX
> + cmpwi cr1, r3, CLOCK_REALTIME_COARSE
> + cmpwi cr7, r3, CLOCK_MONOTONIC_COARSE
> + bgt cr0, 99f
> + LOAD_REG_IMMEDIATE(r5, KTIME_LOW_RES)
> + beq cr1, 1f
> + beq cr7, 1f
>
> mflr r12
> .cfi_register lr,r12
> get_datapage r3, r0
> lwz r5, CLOCK_HRTIMER_RES(r3)
> mtlr r12
> - li r3,0
> +1: li r3,0
> cmpli cr0,r4,0
> crclr cr0*4+so
> beqlr
> @@ -215,11 +218,11 @@ V_FUNCTION_BEGIN(__kernel_clock_getres)
> blr
>
> /*
> - * syscall fallback
> + * invalid clock
> */
> 99:
> - li r0,__NR_clock_getres
> - sc
> + li r3, EINVAL
> + crset so
> blr
> .cfi_endproc
> V_FUNCTION_END(__kernel_clock_getres)
Removing the syscall fallback looks wrong, and broke access to
per-processes clocks. With this change a few glibc tests now fail.
This can be reproduced by the simple code below:
| #include <errno.h>
| #include <stdio.h>
| #include <string.h>
| #include <sys/types.h>
| #include <time.h>
| #include <unistd.h>
|
| int main()
| {
| struct timespec res;
| clockid_t ci;
| int e;
|
| e = clock_getcpuclockid(getpid(), &ci);
| if (e) {
| printf("clock_getcpuclockid returned %d\n", e);
| return e;
| }
| e = clock_getres (ci, &res);
| printf("clock_getres returned %d\n", e);
| if (e) {
| printf(" errno: %d, %s\n", errno, strerror(errno));
| }
|
| return e;
| }
Without this patch or with -m64, it returns:
| clock_getres returned 0
With this patch with -m32 it returns:
| clock_getres returned -1
| errno: 22, Invalid argument
Regards,
Aurelien
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
^ permalink raw reply
* Re: [RFC PATCH 2/2] powerpc/64s: system call support for scv/rfscv instructions
From: Segher Boessenkool @ 2020-05-05 22:11 UTC (permalink / raw)
To: Nicholas Piggin; +Cc: linuxppc-dev
In-Reply-To: <20200430040202.1735506-3-npiggin@gmail.com>
Hi!
On Thu, Apr 30, 2020 at 02:02:02PM +1000, Nicholas Piggin wrote:
> Add support for the scv instruction on POWER9 and later CPUs.
Looks good to me in general :-)
> For now this implements the zeroth scv vector 'scv 0', as identical
> to 'sc' system calls, with the exception that lr is not preserved, and
> it is 64-bit only. There may yet be changes made to this ABI, so it's
> for testing only.
What does it do with SF=0? I don't see how it is obviously not a
security hole currently (but I didn't look too closely).
Segher
^ permalink raw reply
* Re: New powerpc vdso calling convention
From: Segher Boessenkool @ 2020-05-05 21:56 UTC (permalink / raw)
To: Nicholas Piggin
Cc: Rich Felker, libc-alpha, Adhemerval Zanella, musl, binutils,
Andy Lutomirski, libc-dev, Thomas Gleixner, Vincenzo Frascino,
linuxppc-dev
In-Reply-To: <1588126678.zjwj4d1d90.astroid@bobo.none>
Hi!
On Wed, Apr 29, 2020 at 12:39:22PM +1000, Nicholas Piggin wrote:
> Excerpts from Adhemerval Zanella's message of April 27, 2020 11:09 pm:
> >> Right, I'm just talking about those comments -- it seems like the kernel
> >> vdso should contain an .opd section with function descriptors in it for
> >> elfv1 calls, rather than the hack it has now of creating one in the
> >> caller's .data section.
> >>
> >> But all that function descriptor code is gated by
> >>
> >> #if (defined(__PPC64__) || defined(__powerpc64__)) && _CALL_ELF != 2
> >>
> >> So it seems PPC32 does not use function descriptors but a direct pointer
> >> to the entry point like PPC64 with ELFv2.
> >
> > Yes, this hack is only for ELFv1. The missing ODP has not been an issue
> > or glibc because it has been using the inline assembly to emulate the
> > functions call since initial vDSO support (INTERNAL_VSYSCALL_CALL_TYPE).
> > It just has become an issue when I added a ifunc optimization to
> > gettimeofday so it can bypass the libc.so and make plt branch to vDSO
> > directly.
>
> I can't understand if it's actually a problem for you or not.
>
> Regardless if you can hack around it, it seems to me that if we're going
> to add sane calling conventions to the vdso, then we should also just
> have a .opd section for it as well, whether or not a particular libc
> requires it.
An OPD ("official procedure descriptor") is required for every function,
to have proper C semantics, so that pointers to functions (which are
pointers to descriptors, in fact) are unique. You can "manually" make
descriptors just fine, and use those to call functions -- but you cannot
(in general) use a pointer to such a "fake" descriptor as the "id" of
the function.
The way the ABIs define the OPDs makes them guaranteed unique.
Segher
^ permalink raw reply
* Re: remove set_fs calls from the coredump code v6
From: Al Viro @ 2020-05-05 20:47 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Arnd Bergmann, linuxppc-dev, x86, Oleg Nesterov, linux-kernel,
Eric W . Biederman, linux-fsdevel, Andrew Morton, Linus Torvalds,
Jeremy Kerr
In-Reply-To: <20200505204258.GA24277@lst.de>
On Tue, May 05, 2020 at 10:42:58PM +0200, Christoph Hellwig wrote:
> On Tue, May 05, 2020 at 09:34:46PM +0100, Al Viro wrote:
> > Looks good. Want me to put it into vfs.git? #work.set_fs-exec, perhaps?
>
> Sounds good.
Applied, pushed and added into #for-next
^ permalink raw reply
* Re: remove set_fs calls from the coredump code v6
From: Christoph Hellwig @ 2020-05-05 20:42 UTC (permalink / raw)
To: Al Viro
Cc: Arnd Bergmann, linuxppc-dev, x86, Oleg Nesterov, linux-kernel,
Jeremy Kerr, linux-fsdevel, Andrew Morton, Linus Torvalds,
Christoph Hellwig, Eric W . Biederman
In-Reply-To: <20200505203446.GZ23230@ZenIV.linux.org.uk>
On Tue, May 05, 2020 at 09:34:46PM +0100, Al Viro wrote:
> Looks good. Want me to put it into vfs.git? #work.set_fs-exec, perhaps?
Sounds good.
^ permalink raw reply
* Re: remove set_fs calls from the coredump code v6
From: Al Viro @ 2020-05-05 20:34 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Arnd Bergmann, linuxppc-dev, x86, Oleg Nesterov, linux-kernel,
Eric W . Biederman, linux-fsdevel, Andrew Morton, Linus Torvalds,
Jeremy Kerr
In-Reply-To: <20200505101256.3121270-1-hch@lst.de>
On Tue, May 05, 2020 at 12:12:49PM +0200, Christoph Hellwig wrote:
> Hi all,
>
> this series gets rid of playing with the address limit in the exec and
> coredump code. Most of this was fairly trivial, the biggest changes are
> those to the spufs coredump code.
>
> Changes since v5:
> - fix uaccess under spinlock in spufs (Jeremy)
> - remove use of access_ok in spufs
>
> Changes since v4:
> - change some goto names as suggested by Linus
>
> Changes since v3:
> - fix x86 compilation with x32 in the new version of the signal code
> - split the exec patches into a new series
>
> Changes since v2:
> - don't cleanup the compat siginfo calling conventions, use the patch
> variant from Eric with slight coding style fixes instead.
>
> Changes since v1:
> - properly spell NUL
> - properly handle the compat siginfo case in ELF coredumps
Looks good. Want me to put it into vfs.git? #work.set_fs-exec, perhaps?
^ permalink raw reply
* Re: remove set_fs calls from the coredump code v6
From: Eric W. Biederman @ 2020-05-05 20:28 UTC (permalink / raw)
To: Linus Torvalds
Cc: Arnd Bergmann, the arch/x86 maintainers, Oleg Nesterov,
Linux Kernel Mailing List, Jeremy Kerr, linux-fsdevel,
Andrew Morton, linuxppc-dev, Christoph Hellwig, Alexander Viro
In-Reply-To: <CAHk-=wgrHhaM1XCB=E3Zp2Br8E5c_kmVUTd5y06xh5sev5nRMA@mail.gmail.com>
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Tue, May 5, 2020 at 3:13 AM Christoph Hellwig <hch@lst.de> wrote:
>>
>> this series gets rid of playing with the address limit in the exec and
>> coredump code. Most of this was fairly trivial, the biggest changes are
>> those to the spufs coredump code.
>
> Ack, nice, and looks good.
>
> The only part I dislike is how we have that 'struct compat_siginfo' on
> the stack, which is a huge waste (most of it is the nasty padding to
> 128 bytes).
>
> But that's not new, I only reacted to it because the code moved a bit.
> We cleaned up the regular siginfo to not have the padding in the
> kernel (and by "we" I mean "Eric Biederman did it after some prodding
> as part of his siginfo cleanups" - see commit 4ce5f9c9e754 "signal:
> Use a smaller struct siginfo in the kernel"), and I wonder if we
> could do something similar with that compat thing.
>
> 128 bytes of wasted kernel stack isn't the end of the world, but it's
> sad when the *actual* data is only 32 bytes or so.
We probably can. After introducing a kernel_compat_siginfo that is
the size that userspace actually would need.
It isn't something I want to mess with until this code gets merged, as I
think the set_fs cleanups are more important.
Christoph made some good points about how ugly the #ifdefs are in
the generic copy_siginfo_to_user32 implementation.
I am thinking the right fix is to introduce.
- TS_X32 as a companion to TS_COMPAT in the x86_64.
- Modify in_x32_syscall() to test TS_X32
- Implement x32_copy_siginfo_to_user32 that forces TS_X32 to be
set. AKA:
x32_copy_siginfo_to_user32()
{
unsigned long state = current_thread_info()->state;
current_thread_info()->state |= TS_X32;
copy_siginfo_to_user32();
current_thread_info()->state = state;
}
That would make the #ifdefs go away, but I don't yet know what the x86
maintainers would say about that scheme. I think it is a good path as
it would isolate the runtime cost of that weird SIGCHLD siginfo format
to just x32. Then ia32 in compat mode would not need to pay.
Once I get that then it will be easier to introduce a yet another helper
of copy_siginfo_to_user32 that generates just the kernel_compat_siginfo
part, and the two visible derivatives can call memset and clear_user
to clear the unset parts.
I am assuming you don't don't mind having a full siginfo in
elf_note_info that ultimately gets copied into the core dump?
Eric
^ permalink raw reply
* Re: [PATCH 3/3] mm/hugetlb: Introduce HAVE_ARCH_CLEAR_HUGEPAGE_FLAGS
From: Andrew Morton @ 2020-05-05 20:12 UTC (permalink / raw)
To: Anshuman Khandual
Cc: Rich Felker, linux-ia64, linux-sh, Catalin Marinas,
Heiko Carstens, linux-kernel, James E.J. Bottomley, linux-mm,
Paul Mackerras, H. Peter Anvin, sparclinux, linux-riscv,
Will Deacon, linux-arch, linux-s390, Yoshinori Sato, Helge Deller,
x86, Russell King, Christian Borntraeger, Ingo Molnar, Fenghua Yu,
Vasily Gorbik, Thomas Bogendoerfer, Borislav Petkov,
Paul Walmsley, Thomas Gleixner, linux-arm-kernel, Tony Luck,
linux-parisc, linux-mips, Palmer Dabbelt, linuxppc-dev,
David S. Miller, Mike Kravetz
In-Reply-To: <21460cbc-8e9a-b956-5797-57b2e1df9fb1@arm.com>
On Tue, 5 May 2020 08:21:34 +0530 Anshuman Khandual <anshuman.khandual@arm.com> wrote:
> >>> static inline void arch_clear_hugepage_flags(struct page *page)
> >>> {
> >>> <some implementation>
> >>> }
> >>> #define arch_clear_hugepage_flags arch_clear_hugepage_flags
> >>>
> >>> It's a small difference - mainly to avoid adding two variables to the
> >>> overall namespace where one would do.
> >>
> >> Understood, will change and resend.
> >
> > That's OK - I've queued up that fix.
> >
>
> Hello Andrew,
>
> I might not have searched all the relevant trees or might have just searched
> earlier than required. But I dont see these patches (or your proposed fixes)
> either in mmotm (2020-04-29-23-04) or in next-20200504. Wondering if you are
> waiting on a V2 for this series accommodating the changes you had proposed.
hm. I think I must have got confused and thought you were referring to
a different patch. Yes please, let's have v2.
^ permalink raw reply
* Re: [PATCH v2 17/20] mm: free_area_init: allow defining max_zone_pfn in descending order
From: Vineet Gupta @ 2020-05-05 17:27 UTC (permalink / raw)
To: Guenter Roeck, Mike Rapoport
Cc: Rich Felker, linux-ia64@vger.kernel.org,
linux-doc@vger.kernel.org, Catalin Marinas, Heiko Carstens,
x86@kernel.org, Michal Hocko, James E.J. Bottomley, Max Filippov,
Guo Ren, Ley Foon Tan, sparclinux@vger.kernel.org,
linux-riscv@lists.infradead.org, Greg Ungerer,
linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
linux-c6x-dev@linux-c6x.org, Baoquan He, Jonathan Corbet,
linux-hexagon@vger.kernel.org, Helge Deller,
linux-sh@vger.kernel.org, Russell King,
linux-csky@vger.kernel.org, Mike Rapoport, Geert Uytterhoeven,
Hoan Tran, Mark Salter, Matt Turner,
linux-snps-arc@lists.infradead.org,
uclinux-h8-devel@lists.sourceforge.jp,
linux-xtensa@linux-xtensa.org, Nick Hu,
linux-alpha@vger.kernel.org, linux-um@lists.infradead.org,
linux-mips@vger.kernel.org, Richard Weinberger,
linux-m68k@lists.linux-m68k.org, Thomas Bogendoerfer, Qian Cai,
Greentime Hu, Paul Walmsley, Stafford Horne, Guan Xuetao,
linux-arm-kernel@lists.infradead.org, Michal Simek, Tony Luck,
Yoshinori Sato, linux-parisc@vger.kernel.org, linux-mm@kvack.org,
Brian Cain, linux-kernel@vger.kernel.org,
openrisc@lists.librecores.org, Andrew Morton,
linuxppc-dev@lists.ozlabs.org, David S. Miller
In-Reply-To: <ca099c3e-c0bc-cd2f-cdb0-852dfc2c10db@roeck-us.net>
On 5/5/20 6:18 AM, Guenter Roeck wrote:
> On 5/4/20 8:39 AM, Mike Rapoport wrote:
>> On Sun, May 03, 2020 at 11:43:00AM -0700, Guenter Roeck wrote:
>>> On Sun, May 03, 2020 at 10:41:38AM -0700, Guenter Roeck wrote:
>>>> Hi,
>>>>
>>>> On Wed, Apr 29, 2020 at 03:11:23PM +0300, Mike Rapoport wrote:
>>>>> From: Mike Rapoport <rppt@linux.ibm.com>
>>>>>
>>>>> Some architectures (e.g. ARC) have the ZONE_HIGHMEM zone below the
>>>>> ZONE_NORMAL. Allowing free_area_init() parse max_zone_pfn array even it is
>>>>> sorted in descending order allows using free_area_init() on such
>>>>> architectures.
>>>>>
>>>>> Add top -> down traversal of max_zone_pfn array in free_area_init() and use
>>>>> the latter in ARC node/zone initialization.
>>>>>
>>>>> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
>>>> This patch causes my microblazeel qemu boot test in linux-next to fail.
>>>> Reverting it fixes the problem.
>>>>
>>> The same problem is seen with s390 emulations.
>> Yeah, this patch breaks some others as well :(
>>
>> My assumption that max_zone_pfn defines architectural limit for maximal
>> PFN that can belong to a zone was over-optimistic. Several arches
>> actually do that, but others do
>>
>> max_zone_pfn[ZONE_DMA] = MAX_DMA_PFN;
>> max_zone_pfn[ZONE_NORMAL] = max_pfn;
>>
>> where MAX_DMA_PFN is build-time constrain and max_pfn is run time limit
>> for the current system.
>>
>> So, when max_pfn is lower than MAX_DMA_PFN, the free_init_area() will
>> consider max_zone_pfn as descending and will wrongly calculate zone
>> extents.
>>
>> That said, instead of trying to create a generic way to special case
>> ARC, I suggest to simply use the below patch instead.
>>
> As a reminder, I reported the problem against s390 and microblazeel
> (interestingly enough, microblaze (big endian) works), not against arc.
Understood and my comment was to point to any other problems in future.
Thx,
-Vineet
^ permalink raw reply
* Re: [PATCH v2 0/5] Statsfs: a new ram-based file sytem for Linux kernel statistics
From: Christian Borntraeger @ 2020-05-05 17:30 UTC (permalink / raw)
To: Paolo Bonzini, David Rientjes
Cc: Emanuele Giuseppe Esposito, linux-s390, kvm list,
David Hildenbrand, Cornelia Huck, Emanuele Giuseppe Esposito,
LKML, kvm-ppc, Jonathan Adams, Alexander Viro, Stefan Raspl,
Linux FS Devel, Vitaly Kuznetsov, linux-mips, linuxppc-dev,
Jim Mattson
In-Reply-To: <6cfdf81f-caef-2489-0906-25915d9d58ff@redhat.com>
Adding Stefan Raspl, who has done a lot of kvm_stat work in the past.
On 05.05.20 19:21, Paolo Bonzini wrote:
> On 05/05/20 19:07, David Rientjes wrote:
>>> I am totally in favor of having a binary format, but it should be
>>> introduced as a separate series on top of this one---and preferably by
>>> someone who has already put some thought into the problem (which
>>> Emanuele and I have not, beyond ensuring that the statsfs concept and
>>> API is flexible enough).
>>>
>> The concern is that once this series is merged then /sys/kernel/stats
>> could be considered an ABI and there would be a reasonable expectation
>> that it will remain stable, in so far as the stats that userspace is
>> interested in are stable and not obsoleted.
>>
>> So is this a suggestion that the binary format becomes complementary to
>> statsfs and provide a means for getting all stats from a single subsystem,
>> or that this series gets converted to such a format before it is merged?
>
> The binary format should be complementary. The ASCII format should
> indeed be considered stable even though individual statistics would come
> and go. It may make sense to allow disabling ASCII files via mount
> and/or Kconfig options; but either way, the binary format can and should
> be added on top.
>
> I have not put any thought into what the binary format would look like
> and what its features would be. For example these are but the first
> questions that come to mind:
>
> * would it be possible to read/clear an arbitrary statistic with
> pread/pwrite, or do you have to read all of them?
>
> * if userspace wants to read the schema just once and then read the
> statistics many times, how is it informed of schema changes?
>
> * and of course the details of how the schema (names of stat and
> subsources) is encoded and what details it should include about the
> values (e.g. type or just signedness).
>
> Another possibility is to query stats via BPF. This could be a third
> way to access the stats, or it could be alternative to a binary format.
>
> Paolo
>
^ permalink raw reply
* Re: [PATCH v2 0/5] Statsfs: a new ram-based file sytem for Linux kernel statistics
From: Paolo Bonzini @ 2020-05-05 17:21 UTC (permalink / raw)
To: David Rientjes
Cc: Emanuele Giuseppe Esposito, linux-s390, kvm list,
David Hildenbrand, Cornelia Huck, Emanuele Giuseppe Esposito,
LKML, kvm-ppc, Jonathan Adams, Christian Borntraeger,
Alexander Viro, Linux FS Devel, Vitaly Kuznetsov, linux-mips,
linuxppc-dev, Jim Mattson
In-Reply-To: <alpine.DEB.2.22.394.2005051003380.216575@chino.kir.corp.google.com>
On 05/05/20 19:07, David Rientjes wrote:
>> I am totally in favor of having a binary format, but it should be
>> introduced as a separate series on top of this one---and preferably by
>> someone who has already put some thought into the problem (which
>> Emanuele and I have not, beyond ensuring that the statsfs concept and
>> API is flexible enough).
>>
> The concern is that once this series is merged then /sys/kernel/stats
> could be considered an ABI and there would be a reasonable expectation
> that it will remain stable, in so far as the stats that userspace is
> interested in are stable and not obsoleted.
>
> So is this a suggestion that the binary format becomes complementary to
> statsfs and provide a means for getting all stats from a single subsystem,
> or that this series gets converted to such a format before it is merged?
The binary format should be complementary. The ASCII format should
indeed be considered stable even though individual statistics would come
and go. It may make sense to allow disabling ASCII files via mount
and/or Kconfig options; but either way, the binary format can and should
be added on top.
I have not put any thought into what the binary format would look like
and what its features would be. For example these are but the first
questions that come to mind:
* would it be possible to read/clear an arbitrary statistic with
pread/pwrite, or do you have to read all of them?
* if userspace wants to read the schema just once and then read the
statistics many times, how is it informed of schema changes?
* and of course the details of how the schema (names of stat and
subsources) is encoded and what details it should include about the
values (e.g. type or just signedness).
Another possibility is to query stats via BPF. This could be a third
way to access the stats, or it could be alternative to a binary format.
Paolo
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox