* Re: How would I code a Write to a proc file from within the kernel without reading anything from user space?
From: Michael Ellerman @ 2020-08-06 1:30 UTC (permalink / raw)
To: thefirst ECS, linuxppc-dev
In-Reply-To: <32112832.26959.1596664366002.JavaMail.Dan@DanHP>
thefirst ECS <ecs_dn@yahoo.com> writes:
> In order to help debug a certain discrepancy, I need to "simulate" an "echo 1 > /proc/file" but doing it from kernel even when root file system is unavailable.
>
> I have simulated it just fine via call_usermodehelper (with argv etc of "echo 1 > /proc/file") from inside the kernel which triggers:
> [ee897ec0] [c0122704] proc_reg_write+0x80/0xb4
> [ee897ef0] [c00d3b7c] vfs_write+0xb4/0x184
>
> just as I had wanted. But now I need to trigger the "vfs_write" and "proc_reg_write" but without using call_usermodehelper since I will be doing it when root "/" is unavailable and so I can no longer access /bin/echo and call the usermodehelper etc. So my question is how can I do that in kernel?
>
> Not sure if I'm supposed to look in fs read_write.c and fs.h and write a method based on those or if there's some other way etc.
You don't say which proc file, which would be useful information.
I'll assume it's /proc/sysrq-trigger based on your previous mail.
Rather than trying to invoke the write path from inside the kernel, the
simplest option is to just call __handle_sysrq() directly.
cheers
^ permalink raw reply
* Re: [PATCH v8 5/8] powerpc/vdso: Prepare for switching VDSO to generic C implementation.
From: Michael Ellerman @ 2020-08-06 2:03 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Christophe Leroy, nathanl, arnd, linux-kernel,
Tulio Magno Quites Machado Filho, Paul Mackerras, luto,
linux-arch, tglx, vincenzo.frascino, linuxppc-dev
In-Reply-To: <20200805133505.GN6753@gate.crashing.org>
Segher Boessenkool <segher@kernel.crashing.org> writes:
> On Wed, Aug 05, 2020 at 04:24:16PM +1000, Michael Ellerman wrote:
>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> > Indeed, 32-bit doesn't have a redzone, so I believe it needs a stack
>> > frame whenever it has anything to same.
>>
>> Yeah OK that would explain it.
>>
>> > Here is what I have in libc.so:
>> >
>> > 000fbb60 <__clock_gettime>:
>> > fbb60: 94 21 ff e0 stwu r1,-32(r1)
>
> This is the *only* place where you can use a negative offset from r1:
> in the stwu to extend the stack (set up a new stack frame, or make the
> current one bigger).
(You're talking about 32-bit code here right?)
>> > diff --git a/arch/powerpc/include/asm/vdso/gettimeofday.h
>> > b/arch/powerpc/include/asm/vdso/gettimeofday.h
>> > index a0712a6e80d9..0b6fa245d54e 100644
>> > --- a/arch/powerpc/include/asm/vdso/gettimeofday.h
>> > +++ b/arch/powerpc/include/asm/vdso/gettimeofday.h
>> > @@ -10,6 +10,7 @@
>> > .cfi_startproc
>> > PPC_STLU r1, -STACK_FRAME_OVERHEAD(r1)
>> > mflr r0
>> > + PPC_STLU r1, -STACK_FRAME_OVERHEAD(r1)
>> > .cfi_register lr, r0
>>
>> The cfi_register should come directly after the mflr I think.
>
> That is the idiomatic way to write it, and most obviously correct. But
> as long as the value in LR at function entry is available in multiple
> places (like, in LR and in R0 here), it is fine to use either for
> unwinding. Sometimes you can use this to optimise the unwind tables a
> bit -- not really worth it in hand-written code, it's more important to
> make it legible ;-)
OK. Because LR still holds the LR value until it's clobbered later, by
which point the cfi_register has taken effect.
But yeah I think for readability it's best to keep the cfi_register next
to the mflr.
>> >> There's also no code to load/restore the TOC pointer on BE, which I
>> >> think we'll need to handle.
>> >
>> > I see no code in the generated vdso64.so doing anything with r2, but if
>> > you think that's needed, just let's do it:
>>
>> Hmm, true.
>>
>> The compiler will use the toc for globals (and possibly also for large
>> constants?)
>
> And anything else it bloody well wants to, yeah :-)
Haha yeah OK.
>> AFAIK there's no way to disable use of the toc, or make it a build error
>> if it's needed.
>
> Yes.
>
>> At the same time it's much safer for us to just save/restore r2, and
>> probably in the noise performance wise.
>
> If you want a function to be able to work with ABI-compliant code safely
> (in all cases), you'll have to make it itself ABI-compliant as well,
> yes :-)
True. Except this is the VDSO which has previously been a bit wild west
as far as ABI goes :)
>> So yeah we should probably do as below.
>
> [ snip ]
>
> Looks good yes.
Thanks for reviewing.
cheers
^ permalink raw reply
* Re: [PATCH] powerpc/40x: Fix assembler warning about r0
From: Michael Ellerman @ 2020-08-06 2:18 UTC (permalink / raw)
To: Christophe Leroy, linuxppc-dev
In-Reply-To: <bc70ddd8-d48d-d939-dce0-5aa8b8cf87d8@csgroup.eu>
Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> Le 22/07/2020 à 04:24, Michael Ellerman a écrit :
>> The assembler says:
>> arch/powerpc/kernel/head_40x.S:623: Warning: invalid register expression
>
> I get exactly the same with head_32.S, for the exact same reason.
Ah yep, I see it. I mostly build pmac32_defconfig which doesn't have
BDI_SWITCH enabled.
Send a patch? :)
Do we still need the BDI_SWITCH code? Is it likely anyone still has one,
that works?
cheers
^ permalink raw reply
* Re: [PATCH v2 2/5] powerpc/lib: Initialize a temporary mm for code patching
From: Daniel Axtens @ 2020-08-06 3:24 UTC (permalink / raw)
To: Christopher M. Riedl, linuxppc-dev; +Cc: kernel-hardening
In-Reply-To: <20200709040316.12789-3-cmr@informatik.wtf>
"Christopher M. Riedl" <cmr@informatik.wtf> writes:
> When code patching a STRICT_KERNEL_RWX kernel the page containing the
> address to be patched is temporarily mapped with permissive memory
> protections. Currently, a per-cpu vmalloc patch area is used for this
> purpose. While the patch area is per-cpu, the temporary page mapping is
> inserted into the kernel page tables for the duration of the patching.
> The mapping is exposed to CPUs other than the patching CPU - this is
> undesirable from a hardening perspective.
>
> Use the `poking_init` init hook to prepare a temporary mm and patching
> address. Initialize the temporary mm by copying the init mm. Choose a
> randomized patching address inside the temporary mm userspace address
> portion. The next patch uses the temporary mm and patching address for
> code patching.
>
> Based on x86 implementation:
>
> commit 4fc19708b165
> ("x86/alternatives: Initialize temporary mm for patching")
>
> Signed-off-by: Christopher M. Riedl <cmr@informatik.wtf>
> ---
> arch/powerpc/lib/code-patching.c | 33 ++++++++++++++++++++++++++++++++
> 1 file changed, 33 insertions(+)
>
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index 0a051dfeb177..8ae1a9e5fe6e 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -11,6 +11,8 @@
> #include <linux/cpuhotplug.h>
> #include <linux/slab.h>
> #include <linux/uaccess.h>
> +#include <linux/sched/task.h>
> +#include <linux/random.h>
>
> #include <asm/tlbflush.h>
> #include <asm/page.h>
> @@ -44,6 +46,37 @@ int raw_patch_instruction(struct ppc_inst *addr, struct ppc_inst instr)
> }
>
> #ifdef CONFIG_STRICT_KERNEL_RWX
> +
> +static struct mm_struct *patching_mm __ro_after_init;
> +static unsigned long patching_addr __ro_after_init;
> +
> +void __init poking_init(void)
> +{
> + spinlock_t *ptl; /* for protecting pte table */
> + pte_t *ptep;
> +
> + /*
> + * Some parts of the kernel (static keys for example) depend on
> + * successful code patching. Code patching under STRICT_KERNEL_RWX
> + * requires this setup - otherwise we cannot patch at all. We use
> + * BUG_ON() here and later since an early failure is preferred to
> + * buggy behavior and/or strange crashes later.
> + */
> + patching_mm = copy_init_mm();
> + BUG_ON(!patching_mm);
> +
> + /*
> + * In hash we cannot go above DEFAULT_MAP_WINDOW easily.
> + * XXX: Do we want additional bits of entropy for radix?
> + */
> + patching_addr = (get_random_long() & PAGE_MASK) %
> + (DEFAULT_MAP_WINDOW - PAGE_SIZE);
It took me a while to understand this calculation. I see that it's
calculating a base address for a page in which to do patching. It does
the following:
- get a random long
- mask with PAGE_MASK so as to get a page aligned value
- make sure that the base address is at least one PAGE_SIZE below
DEFAULT_MAP_WINDOW so we have a clear page between the base and
DEFAULT_MAP_WINDOW.
On 64-bit Book3S with 64K pages, that works out to be
PAGE_SIZE = 0x0000 0000 0001 0000
PAGE_MASK = 0xFFFF FFFF FFFF 0000
DEFAULT_MAP_WINDOW = DEFAULT_MAP_WINDOW_USER64 = TASK_SIZE_128TB
= 0x0000_8000_0000_0000
DEFAULT_MAP_WINDOW - PAGE_SIZE = 0x0000 7FFF FFFF 0000
It took a while (and a conversation with my wife who studied pure
maths!) but I am convinced that the modulo preserves the page-alignement
of the patching address.
One thing I did realise is that patching_addr can be zero at the end of
this process. That seems dubious and slightly error-prone to me - is
the patching process robust to that or should we exclude it?
Anyway, if I have the maths right, that there are 0x7fffffff or ~2
billion possible locations for the patching page, which is just shy of
31 bits of entropy.
I think this compares pretty favourably to most (K)ASLR implementations?
What's the range if built with 4k pages?
Kind regards,
Daniel
> +
> + ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
> + BUG_ON(!ptep);
> + pte_unmap_unlock(ptep, ptl);
> +}
> +
> static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
>
> static int text_area_cpu_up(unsigned int cpu)
> --
> 2.27.0
^ permalink raw reply
* Re: [RFC PATCH 1/2] powerpc/numa: Introduce logical numa id
From: Aneesh Kumar K.V @ 2020-08-06 10:44 UTC (permalink / raw)
To: Srikar Dronamraju; +Cc: Nathan Lynch, linuxppc-dev
In-Reply-To: <20200804072507.GI24375@linux.vnet.ibm.com>
Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> * Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [2020-08-02 19:51:41]:
>> Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
>> > * Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [2020-07-31 16:49:14]:
>> >
>> >
>> > If its just to eliminate node 0, then we have 2 other probably better
>> > solutions.
>> > 1. Dont mark node 0 as spl (currently still in mm-tree and a result in
>> > linux-next)
>> > 2. powerpc specific: explicitly clear node 0 during numa bringup.
>> >
>>
>>
>> I am not sure I consider them better. But yes, those patches are good
>> and also resolves the node 0 initialization when the firmware didn't
>> indicate the presence of such a node.
>>
>> This patch in addition make sure that we get the same topolgy report
>> across reboot on a virtualized partitions as longs as the cpu/memory
>> ratio per powervm domains remain the same. This should also help to
>> avoid confusion after an LPM migration once we start applying topology
>> updates.
>>
>
> What do we mean by cpu/memory ratio. The topology across reboot would have
> changed only if PowerVM would have allocated resources differently by
> scrambling/unscrambling. We are no more processing topology updates at
> runtime. As far as I know, after LPM, the source topology is maintained.
A LPAR running with one numa node and 10GB of memory on PowerVM domain
10 will report node 10 and 10GB memory in the current scheme. After LPM
migration or a CEC shutdown/reboot if the domain from which the resource
allocated becomes 11, then the LPAR will report node 11 and 10GB memory.
Having a logical node number means in the both the above cases we report
node 0, 10GB memory.
>
>> >> This can be resolved by mapping the firmware provided group id to a logical Linux
>> >> NUMA id. In this patch, we do this only for pseries platforms considering the
>> >
>> > On PowerVM, as you would know the nid is already a logical or a flattened
>> > chip-id and not the actual hardware chip-id.
>>
>> Yes. But then they are derived based on PowerVM resources AKA domains.
>> Now based on the available resource on a system, we could end up with
>> different node numbers with same toplogy across reboots. Making it
>> logical at OS level prevent that.
>
> The above statement kind of gives an impression, that topology changes
> across every reboot. We only end up with different node numbers if and only
> if the underlying topology has changed and that case is very rare. Or am I
> missing something?
IIUC it also depends on availability of resources within the
domain at the time of LPAR start.
>
>>
>> >> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>> >> index e437a9ac4956..6c659aada55b 100644
>> >> --- a/arch/powerpc/mm/numa.c
>> >> +++ b/arch/powerpc/mm/numa.c
>> >> @@ -221,25 +221,51 @@ static void initialize_distance_lookup_table(int nid,
>> >> }
>> >> }
>> >>
>> >> +static u32 nid_map[MAX_NUMNODES] = {[0 ... MAX_NUMNODES - 1] = NUMA_NO_NODE};
>> >> +
>> >> +int firmware_group_id_to_nid(int firmware_gid)
>> >> +{
>> >> + static int last_nid = 0;
>> >> +
>> >> + /*
>> >> + * For PowerNV we don't change the node id. This helps to avoid
>> >> + * confusion w.r.t the expected node ids. On pseries, node numbers
>> >> + * are virtualized. Hence do logical node id for pseries.
>> >> + */
>> >> + if (!firmware_has_feature(FW_FEATURE_LPAR))
>> >> + return firmware_gid;
>> >> +
>> >> + if (firmware_gid == -1)
>> >> + return NUMA_NO_NODE;
>> >> +
>> >> + if (nid_map[firmware_gid] == NUMA_NO_NODE)
>> >> + nid_map[firmware_gid] = last_nid++;
>> >
>> > How do we ensure 2 simultaneous firmware_group_id_to_nid() calls dont end up
>> > at this place in parallel?
>>
>> Do we have a code path where we do that? All the node id init should
>> happen early and there should not be two cpus doing node init at the
>> same time. I might be mistaken. Can you point to the code path where you
>> expect this to be called in parallel?
>>
>
> associativity_to_nid gets called the first time a cpu is being made present
> from offline. So it need not be in boot path. We may to verify if cpu
> hotplug, dlpar, operations are synchronized. For example a memory hotadd and
> cpu hotplug are they synchronized? I am not sure if they are synchronized at
> this time.
But you don't online cpu or memory to a non existent node post boot
right?. If the node is existent we have already initialized the nid_map.
However i am not sure whether we do a parallel initialization of devices. ie,
of_device_add getting called in parallel. if it can then we need the
below?
@@ -226,6 +226,7 @@ static u32 nid_map[MAX_NUMNODES] = {[0 ... MAX_NUMNODES - 1] = NUMA_NO_NODE};
int firmware_group_id_to_nid(int firmware_gid)
{
static int last_nid = 0;
+ static DEFINE_SPINLOCK(node_id_lock);
/*
* For PowerNV we don't change the node id. This helps to avoid
@@ -238,8 +239,13 @@ int firmware_group_id_to_nid(int firmware_gid)
if (firmware_gid == -1)
return NUMA_NO_NODE;
- if (nid_map[firmware_gid] == NUMA_NO_NODE)
- nid_map[firmware_gid] = last_nid++;
+ if (nid_map[firmware_gid] == NUMA_NO_NODE) {
+ spin_lock(&node_id_lock);
+ /* recheck with lock held */
+ if (nid_map[firmware_gid] == NUMA_NO_NODE)
+ nid_map[firmware_gid] = last_nid++;
+ spin_unlock(&node_id_lock);
+ }
return nid_map[firmware_gid];
}
>
>> >
>> >> +
>> >> + return nid_map[firmware_gid];
>> >> +}
>> >> +
>> >> /* Returns nid in the range [0..MAX_NUMNODES-1], or -1 if no useful numa
>> >> * info is found.
>> >> */
>> >> static int associativity_to_nid(const __be32 *associativity)
>> >> {
>> >> int nid = NUMA_NO_NODE;
>> >> + int firmware_gid = -1;
>> >>
>> >> if (!numa_enabled)
>> >> goto out;
>> >>
>> >> if (of_read_number(associativity, 1) >= min_common_depth)
>> >> - nid = of_read_number(&associativity[min_common_depth], 1);
>> >> + firmware_gid = of_read_number(&associativity[min_common_depth], 1);
>> >>
>> >> /* POWER4 LPAR uses 0xffff as invalid node */
>> >> - if (nid == 0xffff || nid >= MAX_NUMNODES)
>> >> - nid = NUMA_NO_NODE;
>> >> + if (firmware_gid == 0xffff || firmware_gid >= MAX_NUMNODES)
>> >> + firmware_gid = -1;
>> >
>> > Lets assume two or more invocations of associativity_to_nid for the same
>> > associativity, end up with -1, In each case aren't giving different
>> > nids?
>>
>>
>> I didn't quiet get the comment here. But I assume you are indicating the
>> same one you mentioned above?
>>
>
> No its not related to the above comment.
> We are incrementing the nid_map table for every unique firmware_gid or for
> every -1 (aka invalid associativities). If there are sufficiently large
> number of associativities that end up returning invalid associativities,
> then don't we quickly overflow the nid_map table? Not only about the
> overflow but a 8 node machine may soon look like a 80 node machine.
Not sure I follow. What does a large number of associativies imply? Are
you looking at ibm,associativity-lookup-arrays that got entries which
are invalid? Even there we are not parsing the full array, we lookup
only a specific firmware_gid (in case of lookup-arrays we use aa_index
value from drmem_lmb).
I will also add a las_nid > MAX_NUMNODES check in
firmware_group_id_to_nid() to handle the case where we find more numa
nodes than MAX_NUMANODES in device tree.
-aneesh
^ permalink raw reply
* [PATCH][V2] macintosh: windfarm: remove detatch debug containing spelling mistakes
From: Colin King @ 2020-08-06 10:29 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Wolfram Sang, linuxppc-dev
Cc: kernel-janitors, linux-kernel
From: Colin Ian King <colin.king@canonical.com>
There are spelling mistakes in two debug messages. As recommended
by Wolfram Sang, these can be removed as there is plenty of debug
in the driver core.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
V2: remove the debug rather than fixing the spelling
---
drivers/macintosh/windfarm_lm75_sensor.c | 2 --
drivers/macintosh/windfarm_lm87_sensor.c | 2 --
2 files changed, 4 deletions(-)
diff --git a/drivers/macintosh/windfarm_lm75_sensor.c b/drivers/macintosh/windfarm_lm75_sensor.c
index 1e5fa09845e7..29f48c2028b6 100644
--- a/drivers/macintosh/windfarm_lm75_sensor.c
+++ b/drivers/macintosh/windfarm_lm75_sensor.c
@@ -152,8 +152,6 @@ static int wf_lm75_remove(struct i2c_client *client)
{
struct wf_lm75_sensor *lm = i2c_get_clientdata(client);
- DBG("wf_lm75: i2c detatch called for %s\n", lm->sens.name);
-
/* Mark client detached */
lm->i2c = NULL;
diff --git a/drivers/macintosh/windfarm_lm87_sensor.c b/drivers/macintosh/windfarm_lm87_sensor.c
index d011899c0a8a..9fab0b47cd3d 100644
--- a/drivers/macintosh/windfarm_lm87_sensor.c
+++ b/drivers/macintosh/windfarm_lm87_sensor.c
@@ -149,8 +149,6 @@ static int wf_lm87_remove(struct i2c_client *client)
{
struct wf_lm87_sensor *lm = i2c_get_clientdata(client);
- DBG("wf_lm87: i2c detatch called for %s\n", lm->sens.name);
-
/* Mark client detached */
lm->i2c = NULL;
--
2.27.0
^ permalink raw reply related
* Re: [PATCH v3 3/3] powerpc/uaccess: simplify the get_fs() set_fs() logic
From: Christophe Leroy @ 2020-08-06 9:54 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linuxppc-dev, Paul Mackerras, linux-kernel
In-Reply-To: <20200806091758.GA653@infradead.org>
Le 06/08/2020 à 11:17, Christoph Hellwig a écrit :
> Do you urgently need this? My plan for 5.10 is to rebased and submit
> the remaining bits of this branch:
>
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/set_fs-removal
>
> which will kill off set_fs/get_fs entirely.
>
No this isn't needed urgently at all I think.
It was sleeping in Patchwork since January, and I received comments from
Michael a few days ago asking me to re-submit, see
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/dd2876b808ea38eb7b7f760ecd6ce06096c61fb5.1580295551.git.christophe.leroy@c-s.fr/
But if you are killing set_fs/get_fs entirely, that's even better I
guess. Thanks for the hands up.
Christophe
^ permalink raw reply
* [PATCH] powerpc/signal: Move and simplify get_clean_sp()
From: Christophe Leroy @ 2020-08-06 8:50 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
get_clean_sp() is only used in kernel/signal.c . Move it there.
And GCC is smart enough to reduce the function when on PPC32, no
need of a special PPC32 simple version.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/processor.h | 14 --------------
arch/powerpc/kernel/signal.c | 7 +++++++
2 files changed, 7 insertions(+), 14 deletions(-)
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index ed0d633ab5aa..5c20b6d509ae 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -404,20 +404,6 @@ static inline void prefetchw(const void *x)
#define HAVE_ARCH_PICK_MMAP_LAYOUT
-#ifdef CONFIG_PPC64
-static inline unsigned long get_clean_sp(unsigned long sp, int is_32)
-{
- if (is_32)
- return sp & 0x0ffffffffUL;
- return sp;
-}
-#else
-static inline unsigned long get_clean_sp(unsigned long sp, int is_32)
-{
- return sp;
-}
-#endif
-
/* asm stubs */
extern unsigned long isa300_idle_stop_noloss(unsigned long psscr_val);
extern unsigned long isa300_idle_stop_mayloss(unsigned long psscr_val);
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index d15a98c758b8..bd0ba7c5e2cf 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -171,6 +171,13 @@ inline unsigned long copy_ckfpr_from_user(struct task_struct *task,
int show_unhandled_signals = 1;
+static unsigned long get_clean_sp(unsigned long sp, int is_32)
+{
+ if (is_32)
+ return sp & 0x0ffffffffUL;
+ return sp;
+}
+
/*
* Allocate space for the signal frame
*/
--
2.25.0
^ permalink raw reply related
* [PATCH] ASoC: fsl-asoc-card: Get "extal" clock rate by clk_get_rate
From: Shengjiu Wang @ 2020-08-06 7:39 UTC (permalink / raw)
To: timur, nicoleotsuka, Xiubo.Lee, festevam, lgirdwood, broonie,
perex, tiwai, alsa-devel, linuxppc-dev, linux-kernel
On some platform(.e.g. i.MX8QM MEK), the "extal" clock is different
with the mclk of codec, then the clock rate is also different.
So it is better to get clock rate of "extal" rate by clk_get_rate,
don't reuse the clock rate of mclk.
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
---
sound/soc/fsl/fsl-asoc-card.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/sound/soc/fsl/fsl-asoc-card.c b/sound/soc/fsl/fsl-asoc-card.c
index 52adedc03245..2c92a5efad61 100644
--- a/sound/soc/fsl/fsl-asoc-card.c
+++ b/sound/soc/fsl/fsl-asoc-card.c
@@ -696,6 +696,13 @@ static int fsl_asoc_card_probe(struct platform_device *pdev)
goto asrc_fail;
}
} else if (of_node_name_eq(cpu_np, "esai")) {
+ struct clk *esai_clk = clk_get(&cpu_pdev->dev, "extal");
+
+ if (!IS_ERR(esai_clk)) {
+ priv->cpu_priv.sysclk_freq[TX] = clk_get_rate(esai_clk);
+ priv->cpu_priv.sysclk_freq[RX] = clk_get_rate(esai_clk);
+ clk_put(esai_clk);
+ }
priv->cpu_priv.sysclk_id[1] = ESAI_HCKT_EXTAL;
priv->cpu_priv.sysclk_id[0] = ESAI_HCKR_EXTAL;
} else if (of_node_name_eq(cpu_np, "sai")) {
--
2.27.0
^ permalink raw reply related
* [powerpc:merge] BUILD SUCCESS 3cd2184115b85cc8242fec3d42529cd112962984
From: kernel test robot @ 2020-08-06 4:14 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: 3cd2184115b85cc8242fec3d42529cd112962984 Automatic merge of 'master', 'next' and 'fixes' (2020-08-05 22:24)
elapsed time: 844m
configs tested: 91
configs skipped: 6
The following configs have been built successfully.
More configs may be tested in the coming days.
arm defconfig
arm64 allyesconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
sh polaris_defconfig
mips mips_paravirt_defconfig
microblaze mmu_defconfig
mips pnx8335_stb225_defconfig
arm aspeed_g5_defconfig
mips bmips_stb_defconfig
sh sh7770_generic_defconfig
m68k m5249evb_defconfig
alpha defconfig
sh dreamcast_defconfig
m68k alldefconfig
arm am200epdkit_defconfig
arm nhk8815_defconfig
powerpc mpc5200_defconfig
mips tb0219_defconfig
powerpc ppc6xx_defconfig
arm imx_v4_v5_defconfig
arm eseries_pxa_defconfig
arm pleb_defconfig
arm keystone_defconfig
arm collie_defconfig
arc vdk_hs38_defconfig
arm tct_hammer_defconfig
mips loongson1b_defconfig
powerpc ppc64_defconfig
arm ezx_defconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k defconfig
m68k allyesconfig
nds32 defconfig
nios2 allyesconfig
csky defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
parisc allyesconfig
s390 defconfig
i386 allyesconfig
sparc allyesconfig
sparc defconfig
i386 defconfig
nios2 defconfig
arc allyesconfig
nds32 allnoconfig
c6x allyesconfig
mips allyesconfig
mips allmodconfig
powerpc allyesconfig
powerpc allmodconfig
powerpc allnoconfig
powerpc defconfig
i386 randconfig-a005-20200805
i386 randconfig-a004-20200805
i386 randconfig-a001-20200805
i386 randconfig-a003-20200805
i386 randconfig-a002-20200805
i386 randconfig-a006-20200805
x86_64 randconfig-a013-20200805
x86_64 randconfig-a011-20200805
x86_64 randconfig-a012-20200805
x86_64 randconfig-a016-20200805
x86_64 randconfig-a015-20200805
x86_64 randconfig-a014-20200805
i386 randconfig-a011-20200805
i386 randconfig-a012-20200805
i386 randconfig-a013-20200805
i386 randconfig-a014-20200805
i386 randconfig-a015-20200805
i386 randconfig-a016-20200805
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
x86_64 rhel
x86_64 allyesconfig
x86_64 rhel-7.6-kselftests
x86_64 defconfig
x86_64 rhel-8.3
x86_64 kexec
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply
* [powerpc:next] BUILD SUCCESS a7aaa2f26bfd932a654706b19859e7adf802bee2
From: kernel test robot @ 2020-08-06 4:14 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
branch HEAD: a7aaa2f26bfd932a654706b19859e7adf802bee2 selftests/powerpc: Fix pkey syscall redefinitions
elapsed time: 843m
configs tested: 134
configs skipped: 14
The following configs have been built successfully.
More configs may be tested in the coming days.
arm defconfig
arm64 defconfig
arm allyesconfig
arm allmodconfig
arm64 allyesconfig
arm mainstone_defconfig
sh sh7770_generic_defconfig
mips rbtx49xx_defconfig
ia64 bigsur_defconfig
arm simpad_defconfig
sh polaris_defconfig
mips mips_paravirt_defconfig
microblaze mmu_defconfig
arm axm55xx_defconfig
xtensa xip_kc705_defconfig
m68k m5475evb_defconfig
powerpc holly_defconfig
powerpc pq2fads_defconfig
c6x defconfig
arm u8500_defconfig
powerpc amigaone_defconfig
sh se7712_defconfig
arc tb10x_defconfig
sh se7780_defconfig
mips sb1250_swarm_defconfig
mips tb0287_defconfig
arc allyesconfig
sh dreamcast_defconfig
m68k m5249evb_defconfig
alpha defconfig
arc axs103_smp_defconfig
mips fuloong2e_defconfig
powerpc mvme5100_defconfig
mips bcm47xx_defconfig
m68k alldefconfig
arm am200epdkit_defconfig
arm nhk8815_defconfig
powerpc pasemi_defconfig
arm rpc_defconfig
m68k mvme147_defconfig
m68k multi_defconfig
powerpc mpc885_ads_defconfig
sh secureedge5410_defconfig
ia64 tiger_defconfig
arm mv78xx0_defconfig
powerpc mpc5200_defconfig
mips tb0219_defconfig
powerpc ppc6xx_defconfig
s390 debug_defconfig
powerpc powernv_defconfig
arm spear6xx_defconfig
mips db1xxx_defconfig
sh apsh4a3a_defconfig
mips gpr_defconfig
arm imx_v4_v5_defconfig
arm eseries_pxa_defconfig
arm pleb_defconfig
arm keystone_defconfig
arm collie_defconfig
arm shannon_defconfig
arm davinci_all_defconfig
xtensa generic_kc705_defconfig
arm moxart_defconfig
sh sh2007_defconfig
sh se7751_defconfig
m68k m5275evb_defconfig
arc vdk_hs38_defconfig
arm tct_hammer_defconfig
mips loongson1b_defconfig
powerpc ppc64_defconfig
arm ezx_defconfig
ia64 allmodconfig
ia64 defconfig
ia64 allyesconfig
m68k allmodconfig
m68k defconfig
m68k allyesconfig
nds32 defconfig
nios2 allyesconfig
csky defconfig
alpha allyesconfig
nios2 defconfig
nds32 allnoconfig
c6x allyesconfig
xtensa allyesconfig
h8300 allyesconfig
arc defconfig
sh allmodconfig
parisc defconfig
s390 allyesconfig
parisc allyesconfig
s390 defconfig
sparc allyesconfig
sparc defconfig
i386 defconfig
i386 allyesconfig
mips allyesconfig
mips allmodconfig
powerpc defconfig
powerpc allyesconfig
powerpc allmodconfig
powerpc allnoconfig
i386 randconfig-a005-20200805
i386 randconfig-a004-20200805
i386 randconfig-a001-20200805
i386 randconfig-a003-20200805
i386 randconfig-a002-20200805
i386 randconfig-a006-20200805
x86_64 randconfig-a013-20200805
x86_64 randconfig-a011-20200805
x86_64 randconfig-a012-20200805
x86_64 randconfig-a016-20200805
x86_64 randconfig-a015-20200805
x86_64 randconfig-a014-20200805
i386 randconfig-a011-20200805
i386 randconfig-a012-20200805
i386 randconfig-a013-20200805
i386 randconfig-a014-20200805
i386 randconfig-a015-20200805
i386 randconfig-a016-20200805
i386 randconfig-a011-20200806
i386 randconfig-a012-20200806
i386 randconfig-a015-20200806
i386 randconfig-a016-20200806
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
x86_64 rhel
x86_64 allyesconfig
x86_64 rhel-7.6-kselftests
x86_64 defconfig
x86_64 rhel-8.3
x86_64 kexec
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
^ permalink raw reply
* Re: [PATCH 1/2] sched/topology: Allow archs to override cpu_smt_mask
From: peterz @ 2020-08-06 8:54 UTC (permalink / raw)
To: Michael Ellerman
Cc: Gautham R Shenoy, Michael Neuling, Vincent Guittot,
Srikar Dronamraju, Rik van Riel, linuxppc-dev, LKML,
Valentin Schneider, Thomas Gleixner, Mel Gorman, Ingo Molnar,
Dietmar Eggemann
In-Reply-To: <87ft90z6dy.fsf@mpe.ellerman.id.au>
On Thu, Aug 06, 2020 at 03:32:25PM +1000, Michael Ellerman wrote:
> That brings with it a bunch of problems, such as existing software that
> has been developed/configured for Power8 and expects to see SMT8.
>
> We also allow LPARs to be live migrated from Power8 to Power9 (and back), so
> maintaining the illusion of SMT8 is considered a requirement to make that work.
So how does that work if the kernel booted on P9 and demuxed the SMT8
into 2xSMT4? If you migrate that state onto a P8 with actual SMT8 you're
toast again.
> Yeah I agree the naming is confusing.
>
> Let's call them "SMT4 cores" and "SMT8 cores"?
Works for me, thanks!
> The problem is we are already lying to userspace, because firmware lies to us.
>
> ie. the firmware on these systems shows us an SMT8 core, and so current kernels
> show SMT8 to userspace. I don't think we can realistically change that fact now,
> as these systems are already out in the field.
>
> What this patch tries to do is undo some of the mess, and at least give the
> scheduler the right information.
What a mess... I think it depends on what you do with that P9 to P8
migration case. Does it make sense to have a "p8_compat" boot arg for
the case where you want LPAR migration back onto P8 systems -- in which
case it simply takes the firmware's word as gospel and doesn't untangle
things, because it can actually land on a P8.
^ permalink raw reply
* Re: [PATCH 1/2] sched/topology: Allow archs to override cpu_smt_mask
From: Michael Ellerman @ 2020-08-06 5:32 UTC (permalink / raw)
To: peterz, Srikar Dronamraju
Cc: Gautham R Shenoy, Michael Neuling, Vincent Guittot, Rik van Riel,
linuxppc-dev, LKML, Ingo Molnar, Thomas Gleixner, Mel Gorman,
Valentin Schneider, Dietmar Eggemann
In-Reply-To: <20200804124755.GJ2674@hirez.programming.kicks-ass.net>
peterz@infradead.org writes:
> On Tue, Aug 04, 2020 at 05:40:07PM +0530, Srikar Dronamraju wrote:
>> * peterz@infradead.org <peterz@infradead.org> [2020-08-04 12:45:20]:
>>
>> > On Tue, Aug 04, 2020 at 09:03:06AM +0530, Srikar Dronamraju wrote:
>> > > cpu_smt_mask tracks topology_sibling_cpumask. This would be good for
>> > > most architectures. One of the users of cpu_smt_mask(), would be to
>> > > identify idle-cores. On Power9, a pair of cores can be presented by the
>> > > firmware as a big-core for backward compatibility reasons.
>> > >
>> > > In order to maintain userspace backward compatibility with previous
>> > > versions of processor, (since Power8 had SMT8 cores), Power9 onwards there
>> > > is option to the firmware to advertise a pair of SMT4 cores as a fused
>> > > cores (referred to as the big_core mode in the Linux Kernel). On Power9
>> > > this pair shares the L2 cache as well. However, from the scheduler's point
>> > > of view, a core should be determined by SMT4. The load-balancer already
>> > > does this. Hence allow PowerPc architecture to override the default
>> > > cpu_smt_mask() to point to the SMT4 cores in a big_core mode.
>> >
>> > I'm utterly confused.
>> >
>> > Why can't you set your regular siblings mask to the smt4 thing? Who
>> > cares about the compat stuff, I thought that was an LPAR/AIX thing.
To be clear this stuff is all for when we're running on the PowerVM machines,
ie. as LPARs.
That brings with it a bunch of problems, such as existing software that
has been developed/configured for Power8 and expects to see SMT8.
We also allow LPARs to be live migrated from Power8 to Power9 (and back), so
maintaining the illusion of SMT8 is considered a requirement to make that work.
>> There are no technical challenges to set the sibling mask to SMT4.
>> This is for Linux running on PowerVM. When these Power9 boxes are sold /
>> marketed as X core boxes (where X stand for SMT8 cores). Since on PowerVM
>> world, everything is in SMT8 mode, the device tree properties still mark
>> the system to be running on 8 thread core. There are a number of utilities
>> like ppc64_cpu that directly read from device-tree. They would get core
>> count and thread count which is SMT8 based.
>>
>> If the sibling_mask is set to small core, then same user when looking at
>
> FWIW, I find the small/big core naming utterly confusing vs the
> big/little naming from ARM. When you say small, I'm thinking of
> asymmetric cores, not SMT4/SMT8.
Yeah I agree the naming is confusing.
Let's call them "SMT4 cores" and "SMT8 cores"?
>> output from lscpu and other utilities that look at sysfs will start seeing
>> 2x number of cores to what he had provisioned and what the utilities from
>> the device-tree show. This can gets users confused.
>
> One will report SMT8 and the other SMT4, right? So only users that
> cannot read will be confused, but if you can't read, confusion is
> guaranteed anyway.
It's partly users, but also software that would see different values depending
on where it looks.
> Also, by exposing the true (SMT4) topology to userspace, userspace
> applications could behave better -- for those few that actually parse
> the topology information.
Agreed, though as you say there aren't that many that actually use the low-level
topology information.
>> So to keep the device-tree properties, utilities depending on device-tree,
>> sysfs and utilities depending on sysfs on the same page, userspace are only
>> exposed as SMT8.
>
> I'm not convinced it makes sense to lie to userspace just to accomodate
> a few users that cannot read.
The problem is we are already lying to userspace, because firmware lies to us.
ie. the firmware on these systems shows us an SMT8 core, and so current kernels
show SMT8 to userspace. I don't think we can realistically change that fact now,
as these systems are already out in the field.
What this patch tries to do is undo some of the mess, and at least give the
scheduler the right information.
cheers
^ permalink raw reply
* Re: [PATCH v10 2/5] powerpc/vdso: Prepare for switching VDSO to generic C implementation.
From: Christophe Leroy @ 2020-08-06 5:46 UTC (permalink / raw)
To: Segher Boessenkool
Cc: nathanl, linux-arch, vincenzo.frascino, arnd, linux-kernel,
Paul Mackerras, luto, tglx, linuxppc-dev
In-Reply-To: <20200805184054.GQ6753@gate.crashing.org>
Hi,
On 08/05/2020 06:40 PM, Segher Boessenkool wrote:
> Hi!
>
> On Wed, Aug 05, 2020 at 04:40:16PM +0000, Christophe Leroy wrote:
>>> It cannot optimise it because it does not know shift < 32. The code
>>> below is incorrect for shift equal to 32, fwiw.
>>
>> Is there a way to tell it ?
>
> Sure, for example the &31 should work (but it doesn't, with the GCC
> version you used -- which version is that?)
GCC 10.1
>
>>> What does the compiler do for just
>>>
>>> static __always_inline u64 vdso_shift_ns(u64 ns, unsigned long shift)
>>> return ns >> (shift & 31);
>>> }
>>>
>>
>> Worse:
>
> I cannot make heads or tails of all that branch spaghetti, sorry.
>
>> 73c: 55 8c 06 fe clrlwi r12,r12,27
>> 740: 7f c8 f0 14 addc r30,r8,r30
>> 744: 7c c6 4a 14 add r6,r6,r9
>> 748: 7c c6 e1 14 adde r6,r6,r28
>> 74c: 34 6c ff e0 addic. r3,r12,-32
>> 750: 41 80 00 70 blt 7c0 <__c_kernel_clock_gettime+0x114>
>
> This branch is always true. Hrm.
As a standalone function:
With your suggestion:
000006ac <vdso_shift_ns>:
6ac: 54 a5 06 fe clrlwi r5,r5,27
6b0: 35 25 ff e0 addic. r9,r5,-32
6b4: 41 80 00 10 blt 6c4 <vdso_shift_ns+0x18>
6b8: 7c 64 4c 30 srw r4,r3,r9
6bc: 38 60 00 00 li r3,0
6c0: 4e 80 00 20 blr
6c4: 54 69 08 3c rlwinm r9,r3,1,0,30
6c8: 21 45 00 1f subfic r10,r5,31
6cc: 7c 84 2c 30 srw r4,r4,r5
6d0: 7d 29 50 30 slw r9,r9,r10
6d4: 7c 63 2c 30 srw r3,r3,r5
6d8: 7d 24 23 78 or r4,r9,r4
6dc: 4e 80 00 20 blr
With the version as is in my series:
000006ac <vdso_shift_ns>:
6ac: 21 25 00 20 subfic r9,r5,32
6b0: 7c 69 48 30 slw r9,r3,r9
6b4: 7c 84 2c 30 srw r4,r4,r5
6b8: 7d 24 23 78 or r4,r9,r4
6bc: 7c 63 2c 30 srw r3,r3,r5
6c0: 4e 80 00 20 blr
Christophe
^ permalink raw reply
* Re: [PATCH v3 3/3] powerpc/uaccess: simplify the get_fs() set_fs() logic
From: Christoph Hellwig @ 2020-08-06 9:17 UTC (permalink / raw)
To: Christophe Leroy; +Cc: linuxppc-dev, Paul Mackerras, linux-kernel
In-Reply-To: <cf39cb8e42cffe323393b8cecdc59a7230298eab.1596702117.git.christophe.leroy@csgroup.eu>
Do you urgently need this? My plan for 5.10 is to rebased and submit
the remaining bits of this branch:
http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/set_fs-removal
which will kill off set_fs/get_fs entirely.
^ permalink raw reply
* [PATCH] powerpc/book3s64/radix: Make radix_mem_block_size 64bit
From: Aneesh Kumar K.V @ 2020-08-06 8:14 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V
Similar to commit: 89c140bbaeee ("pseries: Fix 64 bit logical memory block panic")
make sure we update different variables tracking lmb_size are updated
to be 64 bit.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/include/asm/book3s/64/mmu.h | 2 +-
arch/powerpc/include/asm/drmem.h | 2 +-
arch/powerpc/mm/book3s64/radix_pgtable.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 55442d45c597..1a0c9d09950f 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -85,7 +85,7 @@ extern unsigned int mmu_base_pid;
/*
* memory block size used with radix translation.
*/
-extern unsigned int __ro_after_init radix_mem_block_size;
+extern unsigned long __ro_after_init radix_mem_block_size;
#define PRTB_SIZE_SHIFT (mmu_pid_bits + 4)
#define PRTB_ENTRIES (1ul << mmu_pid_bits)
diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
index 17ccc6474ab6..07c158c5f939 100644
--- a/arch/powerpc/include/asm/drmem.h
+++ b/arch/powerpc/include/asm/drmem.h
@@ -21,7 +21,7 @@ struct drmem_lmb {
struct drmem_lmb_info {
struct drmem_lmb *lmbs;
int n_lmbs;
- u32 lmb_size;
+ u64 lmb_size;
};
extern struct drmem_lmb_info *drmem_info;
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 28c784976bed..ca76d9d6372a 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -34,7 +34,7 @@
unsigned int mmu_pid_bits;
unsigned int mmu_base_pid;
-unsigned int radix_mem_block_size __ro_after_init;
+unsigned long radix_mem_block_size __ro_after_init;
static __ref void *early_alloc_pgtable(unsigned long size, int nid,
unsigned long region_start, unsigned long region_end)
--
2.26.2
^ permalink raw reply related
* Re: [PATCH] powerpc/signal: Move and simplify get_clean_sp()
From: Christoph Hellwig @ 2020-08-06 9:25 UTC (permalink / raw)
To: Christophe Leroy; +Cc: Paul Mackerras, linuxppc-dev, linux-kernel
In-Reply-To: <04169f40c09682ce5747518268ca84285bc17fbc.1596703345.git.christophe.leroy@csgroup.eu>
On Thu, Aug 06, 2020 at 08:50:20AM +0000, Christophe Leroy wrote:
> get_clean_sp() is only used in kernel/signal.c . Move it there.
>
> And GCC is smart enough to reduce the function when on PPC32, no
> need of a special PPC32 simple version.
What about just open coding it in the only caller, which would seem even
cleaner?
^ permalink raw reply
* [PATCH] powerpc/32s: Fix assembler warning about r0
From: Christophe Leroy @ 2020-08-06 6:01 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
The assembler says:
arch/powerpc/kernel/head_32.S:1095: Warning: invalid register expression
It's objecting to the use of r0 as the RA argument. That's because
when RA = 0 the literal value 0 is used, rather than the content of
r0, making the use of r0 in the source potentially confusing.
Fix it to use a literal 0, the generated code is identical.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/head_32.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
index f3ab94d73936..5624db0e09a1 100644
--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -1092,7 +1092,7 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
*/
lis r5, abatron_pteptrs@h
ori r5, r5, abatron_pteptrs@l
- stw r5, 0xf0(r0) /* This much match your Abatron config */
+ stw r5, 0xf0(0) /* This much match your Abatron config */
lis r6, swapper_pg_dir@h
ori r6, r6, swapper_pg_dir@l
tophys(r5, r5)
--
2.25.0
^ permalink raw reply related
* [PATCH v3 3/3] powerpc/uaccess: simplify the get_fs() set_fs() logic
From: Christophe Leroy @ 2020-08-06 8:23 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <a6e62627d25fb7ae9b91d8bf553e707689e37498.1596702117.git.christophe.leroy@csgroup.eu>
On powerpc, we only have USER_DS and KERNEL_DS
Today, this is managed as an 'unsigned long' data space limit
which is used to compare the passed address with, plus a bit
in the thread_info flags that is set whenever modifying the limit
to enable the verification in addr_limit_user_check()
The limit is either the last address of user space when USER_DS is
set, and the last address of address space when KERNEL_DS is set.
In both cases, the limit is a compiletime constant.
get_fs() returns the limit, which is part of thread_info struct
set_fs() updates the limit then set the TI_FSCHECK flag.
addr_limit_user_check() check the flag, and if it is set it checks
the limit is the user limit, then unsets the TI_FSCHECK flag.
In addition, when the flag is set the syscall exit work is involved.
This exit work is heavy compared to normal syscall exit as it goes
through normal exception exit instead of the fast syscall exit.
Rename this TI_FSCHECK flag to TIF_UACCESS_KERNEL flag which tells
whether KERNEL_DS or USER_DS is set. Get mm_segment_t be redifined as
a bool struct that is either false (for USER_DS) or true (for
KERNEL_DS). When TIF_UACCESS_KERNEL is set, the limit is ~0UL.
Otherwise it is TASK_SIZE_USER (resp TASK_SIZE_USER64 on PPC64). When
KERNEL_DS is set, there is no range to check. Define TI_FSCHECK as an
alias to TIF_UACCESS_KERNEL.
On exit, involve exit work when the bit is set, i.e. when KERNEL_DS
is set. addr_limit_user_check() will clear the bit and kill the
user process.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
v3: Rebased and taken into account removal of segment_eq() and comments from mpe
---
arch/powerpc/include/asm/processor.h | 5 +---
arch/powerpc/include/asm/thread_info.h | 9 ++++---
arch/powerpc/include/asm/uaccess.h | 35 +++++++++++++-------------
arch/powerpc/lib/sstep.c | 2 +-
4 files changed, 25 insertions(+), 26 deletions(-)
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index ed0d633ab5aa..86a9c4395b99 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -84,7 +84,7 @@ void start_thread(struct pt_regs *regs, unsigned long fdptr, unsigned long sp);
void release_thread(struct task_struct *);
typedef struct {
- unsigned long seg;
+ bool uaccess_kernel;
} mm_segment_t;
#define TS_FPR(i) fp_state.fpr[i][TS_FPROFFSET]
@@ -148,7 +148,6 @@ struct thread_struct {
unsigned long ksp_vsid;
#endif
struct pt_regs *regs; /* Pointer to saved register state */
- mm_segment_t addr_limit; /* for get_fs() validation */
#ifdef CONFIG_BOOKE
/* BookE base exception scratch space; align on cacheline */
unsigned long normsave[8] ____cacheline_aligned;
@@ -295,7 +294,6 @@ struct thread_struct {
#define INIT_THREAD { \
.ksp = INIT_SP, \
.ksp_limit = INIT_SP_LIMIT, \
- .addr_limit = KERNEL_DS, \
.pgdir = swapper_pg_dir, \
.fpexc_mode = MSR_FE0 | MSR_FE1, \
SPEFSCR_INIT \
@@ -303,7 +301,6 @@ struct thread_struct {
#else
#define INIT_THREAD { \
.ksp = INIT_SP, \
- .addr_limit = KERNEL_DS, \
.fpexc_mode = 0, \
}
#endif
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index ca6c97025704..123232a63ee7 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -69,7 +69,7 @@ struct thread_info {
#define INIT_THREAD_INFO(tsk) \
{ \
.preempt_count = INIT_PREEMPT_COUNT, \
- .flags = 0, \
+ .flags = _TIF_UACCESS_KERNEL, \
}
#define THREAD_SIZE_ORDER (THREAD_SHIFT - PAGE_SHIFT)
@@ -90,7 +90,8 @@ void arch_setup_new_exec(void);
#define TIF_SYSCALL_TRACE 0 /* syscall trace active */
#define TIF_SIGPENDING 1 /* signal pending */
#define TIF_NEED_RESCHED 2 /* rescheduling necessary */
-#define TIF_FSCHECK 3 /* Check FS is USER_DS on return */
+#define TIF_UACCESS_KERNEL 3 /* KERNEL_DS is set */
+#define TIF_FSCHECK TIF_UACCESS_KERNEL
#define TIF_SYSCALL_EMU 4 /* syscall emulation active */
#define TIF_RESTORE_TM 5 /* need to restore TM FP/VEC/VSX */
#define TIF_PATCH_PENDING 6 /* pending live patching update */
@@ -130,7 +131,7 @@ void arch_setup_new_exec(void);
#define _TIF_SYSCALL_TRACEPOINT (1<<TIF_SYSCALL_TRACEPOINT)
#define _TIF_EMULATE_STACK_STORE (1<<TIF_EMULATE_STACK_STORE)
#define _TIF_NOHZ (1<<TIF_NOHZ)
-#define _TIF_FSCHECK (1<<TIF_FSCHECK)
+#define _TIF_UACCESS_KERNEL (1 << TIF_UACCESS_KERNEL)
#define _TIF_SYSCALL_EMU (1<<TIF_SYSCALL_EMU)
#define _TIF_SYSCALL_DOTRACE (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
_TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT | \
@@ -139,7 +140,7 @@ void arch_setup_new_exec(void);
#define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
_TIF_NOTIFY_RESUME | _TIF_UPROBE | \
_TIF_RESTORE_TM | _TIF_PATCH_PENDING | \
- _TIF_FSCHECK)
+ _TIF_UACCESS_KERNEL)
#define _TIF_PERSYSCALL_MASK (_TIF_RESTOREALL|_TIF_NOERROR)
/* Bits in local_flags */
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 00699903f1ef..8567bec6f939 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -15,48 +15,49 @@
*
* For historical reasons, these macros are grossly misnamed.
*
- * The fs/ds values are now the highest legal address in the "segment".
+ * The fs/ds values are now a bool which tells the "segment" is user or kernel.
* This simplifies the checking in the routines below.
*/
#define MAKE_MM_SEG(s) ((mm_segment_t) { (s) })
-#define KERNEL_DS MAKE_MM_SEG(~0UL)
-#ifdef __powerpc64__
-/* We use TASK_SIZE_USER64 as TASK_SIZE is not constant */
-#define USER_DS MAKE_MM_SEG(TASK_SIZE_USER64 - 1)
-#else
-#define USER_DS MAKE_MM_SEG(TASK_SIZE - 1)
-#endif
+#define KERNEL_DS MAKE_MM_SEG(true)
+#define USER_DS MAKE_MM_SEG(false)
-#define get_fs() (current->thread.addr_limit)
+#define get_fs() (MAKE_MM_SEG(test_thread_flag(TIF_UACCESS_KERNEL)))
static inline void set_fs(mm_segment_t fs)
{
- current->thread.addr_limit = fs;
- /* On user-mode return check addr_limit (fs) is correct */
- set_thread_flag(TIF_FSCHECK);
+ update_thread_flag(TIF_UACCESS_KERNEL, fs.uaccess_kernel);
}
-#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
-#define user_addr_max() (get_fs().seg)
+#define uaccess_kernel() (get_fs().uaccess_kernel)
+#define user_addr_max() (get_fs().uaccess_kernel ? ~0UL : USER_ADDR_MAX - 1)
#ifdef __powerpc64__
+
+#define USER_ADDR_MAX TASK_SIZE_USER64
+
/*
* This check is sufficient because there is a large enough
* gap between user addresses and the kernel addresses
*/
#define __access_ok(addr, size, segment) \
- (((addr) <= (segment).seg) && ((size) <= (segment).seg))
+ segment.uaccess_kernel ? \
+ 1 : (addr) < USER_ADDR_MAX && ((size) < USER_ADDR_MAX)
#else
+#define USER_ADDR_MAX TASK_SIZE
+
static inline int __access_ok(unsigned long addr, unsigned long size,
mm_segment_t seg)
{
- if (addr > seg.seg)
+ if (seg.uaccess_kernel)
+ return 1;
+ if (addr >= USER_ADDR_MAX)
return 0;
- return (size == 0 || size - 1 <= seg.seg - addr);
+ return addr + size <= USER_ADDR_MAX;
}
#endif
diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index caee8cc77e19..e10b642566ba 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -112,7 +112,7 @@ static nokprobe_inline long address_ok(struct pt_regs *regs,
return 1;
if (__access_ok(ea, 1, USER_DS))
/* Access overlaps the end of the user region */
- regs->dar = USER_DS.seg;
+ regs->dar = USER_ADDR_MAX;
else
regs->dar = ea;
return 0;
--
2.25.0
^ permalink raw reply related
* [PATCH v3 2/3] uaccess: remove segment_eq
From: Christophe Leroy @ 2020-08-06 8:23 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <a6e62627d25fb7ae9b91d8bf553e707689e37498.1596702117.git.christophe.leroy@csgroup.eu>
From: Christoph Hellwig <hch@lst.de>
segment_eq is only used to implement uaccess_kernel. Just open code
uaccess_kernel in the arch uaccess headers and remove one layer of
indirection.
Link: http://lkml.kernel.org/r/20200710135706.537715-5-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Greentime Hu <green.hu@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
---
arch/alpha/include/asm/uaccess.h | 2 +-
arch/arc/include/asm/segment.h | 3 +--
arch/arm/include/asm/uaccess.h | 4 ++--
arch/arm64/include/asm/uaccess.h | 2 +-
arch/csky/include/asm/segment.h | 2 +-
arch/h8300/include/asm/segment.h | 2 +-
arch/ia64/include/asm/uaccess.h | 2 +-
arch/m68k/include/asm/segment.h | 2 +-
arch/microblaze/include/asm/uaccess.h | 2 +-
arch/mips/include/asm/uaccess.h | 2 +-
arch/nds32/include/asm/uaccess.h | 2 +-
arch/nios2/include/asm/uaccess.h | 2 +-
arch/openrisc/include/asm/uaccess.h | 2 +-
arch/parisc/include/asm/uaccess.h | 2 +-
arch/powerpc/include/asm/uaccess.h | 3 +--
arch/riscv/include/asm/uaccess.h | 4 +---
arch/s390/include/asm/uaccess.h | 2 +-
arch/sh/include/asm/segment.h | 3 +--
arch/sparc/include/asm/uaccess_32.h | 2 +-
arch/sparc/include/asm/uaccess_64.h | 2 +-
arch/x86/include/asm/uaccess.h | 2 +-
arch/xtensa/include/asm/uaccess.h | 2 +-
include/asm-generic/uaccess.h | 4 ++--
include/linux/uaccess.h | 2 --
24 files changed, 25 insertions(+), 32 deletions(-)
diff --git a/arch/alpha/include/asm/uaccess.h b/arch/alpha/include/asm/uaccess.h
index 1fe2b56cb861..1b6f25efa247 100644
--- a/arch/alpha/include/asm/uaccess.h
+++ b/arch/alpha/include/asm/uaccess.h
@@ -20,7 +20,7 @@
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
/*
* Is a address valid? This does a straightforward calculation rather
diff --git a/arch/arc/include/asm/segment.h b/arch/arc/include/asm/segment.h
index 6a2a5be5026d..871f8ab11bfd 100644
--- a/arch/arc/include/asm/segment.h
+++ b/arch/arc/include/asm/segment.h
@@ -14,8 +14,7 @@ typedef unsigned long mm_segment_t;
#define KERNEL_DS MAKE_MM_SEG(0)
#define USER_DS MAKE_MM_SEG(TASK_SIZE)
-
-#define segment_eq(a, b) ((a) == (b))
+#define uaccess_kernel() (get_fs() == KERNEL_DS)
#endif /* __ASSEMBLY__ */
#endif /* __ASMARC_SEGMENT_H */
diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 98c6b91be4a8..b19c9bec1f7a 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -76,7 +76,7 @@ static inline void set_fs(mm_segment_t fs)
modify_domain(DOMAIN_KERNEL, fs ? DOMAIN_CLIENT : DOMAIN_MANAGER);
}
-#define segment_eq(a, b) ((a) == (b))
+#define uaccess_kernel() (get_fs() == KERNEL_DS)
/* We use 33-bit arithmetic here... */
#define __range_ok(addr, size) ({ \
@@ -263,7 +263,7 @@ extern int __put_user_8(void *, unsigned long long);
*/
#define USER_DS KERNEL_DS
-#define segment_eq(a, b) (1)
+#define uaccess_kernel() (true)
#define __addr_ok(addr) ((void)(addr), 1)
#define __range_ok(addr, size) ((void)(addr), 0)
#define get_fs() (KERNEL_DS)
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index bc5c7b091152..fcb8174de505 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -49,7 +49,7 @@ static inline void set_fs(mm_segment_t fs)
CONFIG_ARM64_UAO));
}
-#define segment_eq(a, b) ((a) == (b))
+#define uaccess_kernel() (get_fs() == KERNEL_DS)
/*
* Test whether a block of memory is a valid user space address.
diff --git a/arch/csky/include/asm/segment.h b/arch/csky/include/asm/segment.h
index db2640d5f575..79ede9b1a646 100644
--- a/arch/csky/include/asm/segment.h
+++ b/arch/csky/include/asm/segment.h
@@ -13,6 +13,6 @@ typedef struct {
#define USER_DS ((mm_segment_t) { 0x80000000UL })
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#endif /* __ASM_CSKY_SEGMENT_H */
diff --git a/arch/h8300/include/asm/segment.h b/arch/h8300/include/asm/segment.h
index a407978f9f9f..37950725d9b9 100644
--- a/arch/h8300/include/asm/segment.h
+++ b/arch/h8300/include/asm/segment.h
@@ -33,7 +33,7 @@ static inline mm_segment_t get_fs(void)
return USER_DS;
}
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#endif /* __ASSEMBLY__ */
diff --git a/arch/ia64/include/asm/uaccess.h b/arch/ia64/include/asm/uaccess.h
index 8aa473a4b0f4..179243c3dfc7 100644
--- a/arch/ia64/include/asm/uaccess.h
+++ b/arch/ia64/include/asm/uaccess.h
@@ -50,7 +50,7 @@
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
/*
* When accessing user memory, we need to make sure the entire area really is in
diff --git a/arch/m68k/include/asm/segment.h b/arch/m68k/include/asm/segment.h
index c6686559e9b7..2b5e68a71ef7 100644
--- a/arch/m68k/include/asm/segment.h
+++ b/arch/m68k/include/asm/segment.h
@@ -52,7 +52,7 @@ static inline void set_fs(mm_segment_t val)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
#endif
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#endif /* __ASSEMBLY__ */
diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h
index 6723c56ec378..304b04ffea2f 100644
--- a/arch/microblaze/include/asm/uaccess.h
+++ b/arch/microblaze/include/asm/uaccess.h
@@ -41,7 +41,7 @@
# define get_fs() (current_thread_info()->addr_limit)
# define set_fs(val) (current_thread_info()->addr_limit = (val))
-# define segment_eq(a, b) ((a).seg == (b).seg)
+# define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#ifndef CONFIG_MMU
diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
index 62b298c50905..61fc01f177a6 100644
--- a/arch/mips/include/asm/uaccess.h
+++ b/arch/mips/include/asm/uaccess.h
@@ -72,7 +72,7 @@ extern u64 __ua_limit;
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
/*
* eva_kernel_access() - determine whether kernel memory access on an EVA system
diff --git a/arch/nds32/include/asm/uaccess.h b/arch/nds32/include/asm/uaccess.h
index 3a9219f53ee0..010ba5f1d7dd 100644
--- a/arch/nds32/include/asm/uaccess.h
+++ b/arch/nds32/include/asm/uaccess.h
@@ -44,7 +44,7 @@ static inline void set_fs(mm_segment_t fs)
current_thread_info()->addr_limit = fs;
}
-#define segment_eq(a, b) ((a) == (b))
+#define uaccess_kernel() (get_fs() == KERNEL_DS)
#define __range_ok(addr, size) (size <= get_fs() && addr <= (get_fs() -size))
diff --git a/arch/nios2/include/asm/uaccess.h b/arch/nios2/include/asm/uaccess.h
index e83f831a76f9..a741abbed6fb 100644
--- a/arch/nios2/include/asm/uaccess.h
+++ b/arch/nios2/include/asm/uaccess.h
@@ -30,7 +30,7 @@
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(seg) (current_thread_info()->addr_limit = (seg))
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define __access_ok(addr, len) \
(((signed long)(((long)get_fs().seg) & \
diff --git a/arch/openrisc/include/asm/uaccess.h b/arch/openrisc/include/asm/uaccess.h
index 17c24f14615f..48b691530d3e 100644
--- a/arch/openrisc/include/asm/uaccess.h
+++ b/arch/openrisc/include/asm/uaccess.h
@@ -43,7 +43,7 @@
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
-#define segment_eq(a, b) ((a) == (b))
+#define uaccess_kernel() (get_fs() == KERNEL_DS)
/* Ensure that the range from addr to addr+size is all within the process'
* address space
diff --git a/arch/parisc/include/asm/uaccess.h b/arch/parisc/include/asm/uaccess.h
index ebbb9ffe038c..ed2cd4fb479b 100644
--- a/arch/parisc/include/asm/uaccess.h
+++ b/arch/parisc/include/asm/uaccess.h
@@ -14,7 +14,7 @@
#define KERNEL_DS ((mm_segment_t){0})
#define USER_DS ((mm_segment_t){1})
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 64c04ab09112..00699903f1ef 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -38,8 +38,7 @@ static inline void set_fs(mm_segment_t fs)
set_thread_flag(TIF_FSCHECK);
}
-#define segment_eq(a, b) ((a).seg == (b).seg)
-
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define user_addr_max() (get_fs().seg)
#ifdef __powerpc64__
diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
index 8ce9d607b53d..1bdf663e5932 100644
--- a/arch/riscv/include/asm/uaccess.h
+++ b/arch/riscv/include/asm/uaccess.h
@@ -62,11 +62,9 @@ static inline void set_fs(mm_segment_t fs)
current_thread_info()->addr_limit = fs;
}
-#define segment_eq(a, b) ((a).seg == (b).seg)
-
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define user_addr_max() (get_fs().seg)
-
/**
* access_ok: - Checks if a user space pointer is valid
* @addr: User space pointer to start of block to check
diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h
index 324438889fe1..f09444d6aeab 100644
--- a/arch/s390/include/asm/uaccess.h
+++ b/arch/s390/include/asm/uaccess.h
@@ -32,7 +32,7 @@
#define USER_DS_SACF (3)
#define get_fs() (current->thread.mm_segment)
-#define segment_eq(a,b) (((a) & 2) == ((b) & 2))
+#define uaccess_kernel() ((get_fs() & 2) == KERNEL_DS)
void set_fs(mm_segment_t fs);
diff --git a/arch/sh/include/asm/segment.h b/arch/sh/include/asm/segment.h
index 33d1d28057cb..02e54a3335d6 100644
--- a/arch/sh/include/asm/segment.h
+++ b/arch/sh/include/asm/segment.h
@@ -24,8 +24,7 @@ typedef struct {
#define USER_DS KERNEL_DS
#endif
-#define segment_eq(a, b) ((a).seg == (b).seg)
-
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define get_fs() (current_thread_info()->addr_limit)
#define set_fs(x) (current_thread_info()->addr_limit = (x))
diff --git a/arch/sparc/include/asm/uaccess_32.h b/arch/sparc/include/asm/uaccess_32.h
index d6d8413eca83..0a2d3ebc4bb8 100644
--- a/arch/sparc/include/asm/uaccess_32.h
+++ b/arch/sparc/include/asm/uaccess_32.h
@@ -28,7 +28,7 @@
#define get_fs() (current->thread.current_ds)
#define set_fs(val) ((current->thread.current_ds) = (val))
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
/* We have there a nice not-mapped page at PAGE_OFFSET - PAGE_SIZE, so that this test
* can be fairly lightweight.
diff --git a/arch/sparc/include/asm/uaccess_64.h b/arch/sparc/include/asm/uaccess_64.h
index bf9d330073b2..698cf69f74e9 100644
--- a/arch/sparc/include/asm/uaccess_64.h
+++ b/arch/sparc/include/asm/uaccess_64.h
@@ -32,7 +32,7 @@
#define get_fs() ((mm_segment_t){(current_thread_info()->current_ds)})
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define set_fs(val) \
do { \
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 18dfa07d3ef0..dd3261f9f4ea 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -33,7 +33,7 @@ static inline void set_fs(mm_segment_t fs)
set_thread_flag(TIF_FSCHECK);
}
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define user_addr_max() (current->thread.addr_limit.seg)
/*
diff --git a/arch/xtensa/include/asm/uaccess.h b/arch/xtensa/include/asm/uaccess.h
index e57f0d0a88d8..b9758119feca 100644
--- a/arch/xtensa/include/asm/uaccess.h
+++ b/arch/xtensa/include/asm/uaccess.h
@@ -35,7 +35,7 @@
#define get_fs() (current->thread.current_ds)
#define set_fs(val) (current->thread.current_ds = (val))
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#define __kernel_ok (uaccess_kernel())
#define __user_ok(addr, size) \
diff --git a/include/asm-generic/uaccess.h b/include/asm-generic/uaccess.h
index e935318804f8..ba68ee4dabfa 100644
--- a/include/asm-generic/uaccess.h
+++ b/include/asm-generic/uaccess.h
@@ -86,8 +86,8 @@ static inline void set_fs(mm_segment_t fs)
}
#endif
-#ifndef segment_eq
-#define segment_eq(a, b) ((a).seg == (b).seg)
+#ifndef uaccess_kernel
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
#endif
#define access_ok(addr, size) __access_ok((unsigned long)(addr),(size))
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 0a76ddc07d59..5c62d0c6f15b 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -6,8 +6,6 @@
#include <linux/sched.h>
#include <linux/thread_info.h>
-#define uaccess_kernel() segment_eq(get_fs(), KERNEL_DS)
-
#include <asm/uaccess.h>
/*
--
2.25.0
^ permalink raw reply related
* [PATCH v3 1/3] syscalls: use uaccess_kernel in addr_limit_user_check
From: Christophe Leroy @ 2020-08-06 8:23 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
From: Christoph Hellwig <hch@lst.de>
Patch series "clean up address limit helpers", v2.
In preparation for eventually phasing out direct use of set_fs(), this
series removes the segment_eq() arch helper that is only used to implement
or duplicate the uaccess_kernel() API, and then adds descriptive helpers
to force the kernel address limit.
This patch (of 6):
Use the uaccess_kernel helper instead of duplicating it.
Link: http://lkml.kernel.org/r/20200714105505.935079-1-hch@lst.de
Link: http://lkml.kernel.org/r/20200710135706.537715-1-hch@lst.de
Link: http://lkml.kernel.org/r/20200710135706.537715-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
---
include/linux/syscalls.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index b951a87da987..e933a43d4a69 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -263,7 +263,7 @@ static inline void addr_limit_user_check(void)
return;
#endif
- if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
+ if (CHECK_DATA_CORRUPTION(uaccess_kernel(),
"Invalid address limit on user-mode return"))
force_sig(SIGKILL);
--
2.25.0
^ permalink raw reply related
* Re: [PATCH] powerpc/40x: Fix assembler warning about r0
From: Christophe Leroy @ 2020-08-06 8:26 UTC (permalink / raw)
To: Michael Ellerman, linuxppc-dev
In-Reply-To: <87o8noy0sc.fsf@mpe.ellerman.id.au>
Le 06/08/2020 à 04:18, Michael Ellerman a écrit :
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> Le 22/07/2020 à 04:24, Michael Ellerman a écrit :
>>> The assembler says:
>>> arch/powerpc/kernel/head_40x.S:623: Warning: invalid register expression
>>
>> I get exactly the same with head_32.S, for the exact same reason.
>
> Ah yep, I see it. I mostly build pmac32_defconfig which doesn't have
> BDI_SWITCH enabled.
>
> Send a patch? :)
Done.
>
> Do we still need the BDI_SWITCH code? Is it likely anyone still has one,
> that works?
I have three (One for 83xx and two for 8xx) and they work, allthough I'm
using them only for Uboot and for very very very early Linux boot
debugging (Last time I used it with Linux was when implementing KASAN)
Christophe
^ permalink raw reply
* Re: [PATCH v2] powerpc/uaccess: simplify the get_fs() set_fs() logic
From: Christophe Leroy @ 2020-08-06 8:29 UTC (permalink / raw)
To: Michael Ellerman, Christophe Leroy, Benjamin Herrenschmidt,
Paul Mackerras
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <87mu3nyh3w.fsf@mpe.ellerman.id.au>
Le 25/07/2020 à 13:22, Michael Ellerman a écrit :
> Hi Christophe,
>
> Unfortunately this would collide messily with "uaccess: remove
> segment_eq" in linux-next, so I'll ask you to do a respin based on that,
> some comments below.
Done, sent as v3, together with the 2 patchs from Linux next to get it
build and boot.
>
> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>> On powerpc, we only have USER_DS and KERNEL_DS
>>
>> Today, this is managed as an 'unsigned long' data space limit
>> which is used to compare the passed address with, plus a bit
>> in the thread_info flags that is set whenever modifying the limit
>> to enable the verification in addr_limit_user_check()
>>
>> The limit is either the last address of user space when USER_DS is
>> set, and the last address of address space when KERNEL_DS is set.
>> In both cases, the limit is a compiletime constant.
>>
>> get_fs() returns the limit, which is part of thread_info struct
>> set_fs() updates the limit then set the TI_FSCHECK flag.
>> addr_limit_user_check() check the flag, and if it is set it checks
>> the limit is the user limit, then unsets the TI_FSCHECK flag.
>>
>> In addition, when the flag is set the syscall exit work is involved.
>> This exit work is heavy compared to normal syscall exit as it goes
>> through normal exception exit instead of the fast syscall exit.
>>
>> Rename this TI_FSCHECK flag to TIF_KERNEL_DS flag which tells whether
>> KERNEL_DS or USER_DS is set. Get mm_segment_t be redifined as a bool
>> struct that is either false (for USER_DS) or true (for KERNEL_DS).
>> When TIF_KERNEL_DS is set, the limit is ~0UL. Otherwise it is
>> TASK_SIZE_USER (resp TASK_SIZE_USER64 on PPC64). When KERNEL_DS is
>> set, there is no range to check. Define TI_FSCHECK as an alias to
>> TIF_KERNEL_DS.
>
> I'd rather avoid the "DS" name any more than we have to. Maybe it means
> "data space" but that's not a very common term.
I thought it was a reference to the ds/fs/gs ... segment registers in
the 8086 ?
>
> The generic helper these days is called uaccess_kernel(), which returns
> true when uaccess routines are allowed to access the kernel.
>
> So calling it TIF_UACCESS_KERNEL would work I think?
ok
>
> The bool could be called uaccess_kernel.
> And END_OF_USER_DS could be USER_ADDR_MAX.
ok
>
>> On exit, involve exit work when the bit is set, i.e. when KERNEL_DS
>> is set. addr_limit_user_check() will clear the bit and kill the
>> user process.
>
> I guess this is safe. The check was added to make sure we never return
> to userspace with KERNEL_DS set, but using the actual TIF flag to
> determine the address limit should be equally safe, and avoid the
> overhead of the check in the good case.
That's the purpose indeed, yes.
christophe
^ permalink raw reply
* [PATCH] powerpc/hwirq: Remove stale forward irq_chip declaration
From: Christophe Leroy @ 2020-08-06 12:19 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
Since commit identified below, the forward declaration of
struct irq_chip is useless (was struct hw_interrupt_type at that time)
Remove it, together with the associated comment.
Fixes: c0ad90a32fb6 ("[PATCH] genirq: add ->retrigger() irq op to consolidate hw_irq_resend()")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/hw_irq.h | 6 ------
1 file changed, 6 deletions(-)
diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index 3a0db7b0b46e..538698facb80 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -372,12 +372,6 @@ static inline void may_hard_irq_enable(void) { }
#define ARCH_IRQ_INIT_FLAGS IRQ_NOREQUEST
-/*
- * interrupt-retrigger: should we handle this via lost interrupts and IPIs
- * or should we not care like we do now ? --BenH.
- */
-struct irq_chip;
-
#endif /* __ASSEMBLY__ */
#endif /* __KERNEL__ */
#endif /* _ASM_POWERPC_HW_IRQ_H */
--
2.25.0
^ permalink raw reply related
* [PATCH] powerpc/irq: Drop forward declaration of struct irqaction
From: Christophe Leroy @ 2020-08-06 12:19 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
Since the commit identified below, the forward declaration of
struct irqaction is useless. Drop it.
Fixes: b709c0832824 ("ppc64: move stack switching up in interrupt processing")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/irq.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/powerpc/include/asm/irq.h b/arch/powerpc/include/asm/irq.h
index 814dfab7e392..4f983ca4030a 100644
--- a/arch/powerpc/include/asm/irq.h
+++ b/arch/powerpc/include/asm/irq.h
@@ -35,7 +35,6 @@ static __inline__ int irq_canonicalize(int irq)
extern int distribute_irqs;
-struct irqaction;
struct pt_regs;
#define __ARCH_HAS_DO_SOFTIRQ
--
2.25.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox