* Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP
From: David Hildenbrand @ 2020-05-01 21:10 UTC
To: Dan Williams
Cc: virtio-dev, linux-hyperv, Michal Hocko, Baoquan He, Linux ACPI,
Wei Yang, linux-s390, linux-nvdimm, Linux Kernel Mailing List,
virtualization, Linux MM, Michael S . Tsirkin, Eric W. Biederman,
Pankaj Gupta, xen-devel, Andrew Morton, Michal Hocko,
linuxppc-dev
In-Reply-To: <CAPcyv4iXyOUDZgqhWH1KCObvATL=gP55xEr64rsRfUuJg5B+eQ@mail.gmail.com>
On 01.05.20 22:12, Dan Williams wrote:
> On Fri, May 1, 2020 at 12:18 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 01.05.20 20:43, Dan Williams wrote:
>>> On Fri, May 1, 2020 at 11:14 AM David Hildenbrand <david@redhat.com> wrote:
>>>>
>>>> On 01.05.20 20:03, Dan Williams wrote:
>>>>> On Fri, May 1, 2020 at 10:51 AM David Hildenbrand <david@redhat.com> wrote:
>>>>>>
>>>>>> On 01.05.20 19:45, David Hildenbrand wrote:
>>>>>>> On 01.05.20 19:39, Dan Williams wrote:
>>>>>>>> On Fri, May 1, 2020 at 10:21 AM David Hildenbrand <david@redhat.com> wrote:
>>>>>>>>>
>>>>>>>>> On 01.05.20 18:56, Dan Williams wrote:
>>>>>>>>>> On Fri, May 1, 2020 at 2:34 AM David Hildenbrand <david@redhat.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 01.05.20 00:24, Andrew Morton wrote:
>>>>>>>>>>>> On Thu, 30 Apr 2020 20:43:39 +0200 David Hildenbrand <david@redhat.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Why does the firmware map support hotplug entries?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I assume:
>>>>>>>>>>>>>
>>>>>>>>>>>>> The firmware memmap was added primarily for x86-64 kexec (and still, is
>>>>>>>>>>>>> mostly used on x86-64 only IIRC). There, we had ACPI hotplug. When DIMMs
>>>>>>>>>>>>> get hotplugged on real HW, they get added to e820. Same applies to
>>>>>>>>>>>>> memory added via HyperV balloon (unless memory is unplugged via
>>>>>>>>>>>>> ballooning and you reboot ... then the e820 is changed as well). I assume
>>>>>>>>>>>>> we wanted to be able to reflect that, to make kexec look like a real reboot.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This worked for a while. Then came dax/kmem. Now comes virtio-mem.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> But I assume only Andrew can enlighten us.
>>>>>>>>>>>>>
>>>>>>>>>>>>> @Andrew, any guidance here? Should we really add all memory to the
>>>>>>>>>>>>> firmware memmap, even if this contradicts with the existing
>>>>>>>>>>>>> documentation? (especially, if the actual firmware memmap will *not*
>>>>>>>>>>>>> contain that memory after a reboot)
>>>>>>>>>>>>
>>>>>>>>>>>> For some reason that patch is misattributed - it was authored by
>>>>>>>>>>>> Shaohui Zheng <shaohui.zheng@intel.com>, who hasn't been heard from in
>>>>>>>>>>>> a decade. I looked through the email discussion from that time and I'm
>>>>>>>>>>>> not seeing anything useful. But I wasn't able to locate Dave Hansen's
>>>>>>>>>>>> review comments.
>>>>>>>>>>>
>>>>>>>>>>> Okay, thanks for checking. I think the documentation from 2008 is pretty
>>>>>>>>>>> clear what has to be done here. I will add some of these details to the
>>>>>>>>>>> patch description.
>>>>>>>>>>>
>>>>>>>>>>> Also, now that I know that esp. kexec-tools already don't consider
>>>>>>>>>>> dax/kmem memory properly (memory will not get dumped via kdump) and
>>>>>>>>>>> won't really suffer from a name change in /proc/iomem, I will go back to
>>>>>>>>>>> the MHP_DRIVER_MANAGED approach and
>>>>>>>>>>> 1. Don't create firmware memmap entries
>>>>>>>>>>> 2. Name the resource "System RAM (driver managed)"
>>>>>>>>>>> 3. Flag the resource via something like IORESOURCE_MEM_DRIVER_MANAGED.
>>>>>>>>>>>
>>>>>>>>>>> This way, kernel users and user space can figure out that this memory
>>>>>>>>>>> has different semantics and handle it accordingly - I think that was
>>>>>>>>>>> what Eric was asking for.
>>>>>>>>>>>
>>>>>>>>>>> Of course, open for suggestions.
>>>>>>>>>>
>>>>>>>>>> I'm still more of a fan of this being communicated by "System RAM"
>>>>>>>>>
>>>>>>>>> I was mentioning somewhere in this thread that "System RAM" inside a
>>>>>>>>> hierarchy (like dax/kmem) will already be basically ignored by
>>>>>>>>> kexec-tools. So, placing it inside a hierarchy already makes it look
>>>>>>>>> special already.
>>>>>>>>>
>>>>>>>>> But after all, as we have to change kexec-tools either way, we can
>>>>>>>>> directly go ahead and flag it properly as special (in case there will
>>>>>>>>> ever be other cases where we could no longer distinguish it).
>>>>>>>>>
>>>>>>>>>> being parented especially because that tells you something about how
>>>>>>>>>> the memory is driver-managed and which mechanism might be in play.
>>>>>>>>>
>>>>>>>>> That could be communicated to some degree via the resource hierarchy.
>>>>>>>>>
>>>>>>>>> E.g.,
>>>>>>>>>
>>>>>>>>> [root@localhost ~]# cat /proc/iomem
>>>>>>>>> ...
>>>>>>>>> 140000000-33fffffff : Persistent Memory
>>>>>>>>>   140000000-1481fffff : namespace0.0
>>>>>>>>>   150000000-33fffffff : dax0.0
>>>>>>>>>     150000000-33fffffff : System RAM (driver managed)
>>>>>>>>>
>>>>>>>>> vs.
>>>>>>>>>
>>>>>>>>> :/# cat /proc/iomem
>>>>>>>>> [...]
>>>>>>>>> 140000000-333ffffff : virtio-mem (virtio0)
>>>>>>>>>   140000000-147ffffff : System RAM (driver managed)
>>>>>>>>>   148000000-14fffffff : System RAM (driver managed)
>>>>>>>>>   150000000-157ffffff : System RAM (driver managed)
>>>>>>>>>
>>>>>>>>> Good enough for my taste.
>>>>>>>>>
>>>>>>>>>> What about adding an optional /sys/firmware/memmap/X/parent attribute.
>>>>>>>>>
>>>>>>>>> I really don't want any firmware memmap entries for something that is
>>>>>>>>> not part of the firmware provided memmap. In addition,
>>>>>>>>> /sys/firmware/memmap/ is still a fairly x86_64 specific thing. Only mips
>>>>>>>>> and two arm configs enable it at all.
>>>>>>>>>
>>>>>>>>> So, IMHO, /sys/firmware/memmap/ is definitely not the way to go.
>>>>>>>>
>>>>>>>> I think that's a policy decision and policy decisions do not belong in
>>>>>>>> the kernel. Give the tooling the opportunity to decide whether System
>>>>>>>> RAM stays that way over a kexec. The parenthetical reference otherwise
>>>>>>>> looks out of place to me in the /proc/iomem output. What makes it
>>>>>>>> "driver managed" is how the kernel handles it, not how the kernel
>>>>>>>> names it.
>>>>>>>
>>>>>>> At least, virtio-mem is different. It really *has to be handled* by the
>>>>>>> driver. This is not a policy. It's how it works.
>>>>>
>>>>> ...but that's not necessarily how dax/kmem works.
>>>>>
>>>>
>>>> Yes, and user space could still take that memory and add it to the
>>>> firmware memmap if it really wants to. It knows that it is special. It
>>>> can figure out that it belongs to a dax device using /proc/iomem.
>>>>
>>>>>>>
>>>>>>
>>>>>> Oh, and I don't see why "System RAM (driver managed)" would hinder any
>>>>>> policy in user space to still do what it thinks is the right thing to do
>>>>>> (e.g., for dax).
>>>>>>
>>>>>> "System RAM (driver managed)" would mean: Memory is not part of the raw
>>>>>> firmware memmap. It was detected and added by a driver. Handle with
>>>>>> care, this is special.
>>>>>
>>>>> Oh, no, I was more reacting to your, "don't update
>>>>> /sys/firmware/memmap for the (driver managed) range" choice as being a
>>>>> policy decision. It otherwise feels to me "System RAM (driver
>>>>> managed)" adds confusion for casual users of /proc/iomem and for clued
>>>>> in tools they have the parent association to decide policy.
>>>>
>>>> Not sure if I understand correctly, so bear with me :).
>>>>
>>>> Adding or not adding stuff to /sys/firmware/memmap is not a policy
>>>> decision. If it's not part of the raw firmware-provided memmap, it has
>>>> nothing to do in /sys/firmware/memmap. That's what the documentation
>>>> from 2008 tells us.
>>>
>>> It just occurs to me that there are valid cases for both wanting to
>>> start over with driver managed memory with a kexec and keeping it in
>>> the map.
>>
>> Yes, there might be valid cases. My gut feeling is that in the general
>> case, you want to let the kexec kernel implement a policy/ let the user
>> in the new system decide.
>>
>> But as I said, you can implement in kexec-tools whatever policy you
>> want. It has access to all information.
>
> Right, so why is a new type needed if all the information is there by
> other means?
You mean "System RAM (driver managed)" in /proc/iomem? See below for more.
>
>>> Consider the case of EFI Special Purpose (SP) Memory that is
>>> marked EFI Conventional Memory with the SP attribute. In that case the
>>> firmware memory map marked it as conventional RAM, but the kernel
>>> optionally marks it as System RAM vs Soft Reserved. The 2008 patch
>>> simply does not consider that case. I'm not sure strict textualism
>>> works for coding decisions.
>>
>> I am no expert on that matter (esp EFI). But looking at the users of
>> firmware_map_add_early(), the single user is in arch/x86/kernel/e820.c
>> . So the single source of /sys/firmware/memmap is (besides hotplug) e820.
>>
>> "'e820_table_firmware': the original firmware version passed to us by
>> the bootloader - not modified by the kernel. ... inform the user about
>> the firmware's notion of memory layout via /sys/firmware/memmap"
>> (arch/x86/kernel/e820.c)
>>
>> How is the EFI Special Purpose (SP) Memory represented in e820?
>> /sys/firmware/memmap is really simple: just dump in e820. No policies IIUC.
>
> e820 now has a Soft Reserved translation for this which means "try to
> reserve, but treat as System RAM is ok too". It seems generically
> useful to me that the toggle for determining whether Soft Reserved or
> System RAM shows up /sys/firmware/memmap is a determination that
> policy can make. The kernel need not preemptively block it.
So, I think I have to clarify something here. We have two ways to kexec:

1. kexec_load(): User space (kexec-tools) crafts the memmap (e.g., using
   /sys/firmware/memmap on x86-64) and selects memory where to place the
   kexec images (e.g., using /proc/iomem)

2. kexec_file_load(): The kernel reuses the (basically) raw firmware
   memmap and selects memory where to place kexec images.

We are talking about changing 1. to behave like 2. with regard to
dax/kmem. 2. currently does not add any hotplugged memory to the
fixed-up e820, and it should be fixed regarding hotplugged DIMMs that
would appear in e820 after a reboot.

Now, all these policy discussions are nice and fun, but I don't really
see a good reason to (ab)use /sys/firmware/memmap for that (e.g., parent
properties). If you want to be able to make this configurable, then
e.g., add a way to configure this in the kernel (for example along with
kmem) to make 1. and 2. behave the same way. Otherwise, you can really
only change 1.
Now, let's clarify what I want regarding virtio-mem:

1. kexec should not add virtio-mem memory to the initial firmware
   memmap. The driver has to be in charge as discussed.
2. kexec should not place kexec images onto virtio-mem memory. That
   would end badly.
3. kexec should still dump virtio-mem memory via kdump.

This has to work when using kexec_load() or kexec_file_load(). This has
to theoretically work on different architectures (especially, without
/sys/firmware/memmap). kexec-tools has to have access to that
information to figure out what to do.

Regarding 1:
- kexec_file_load(): works out of the box currently.
- kexec_load(): Don't create entries in /sys/firmware/memmap (for
  reasons discussed)

Regarding 2:
- kexec_file_load(): tag the resources as IORESOURCE_MEM_DRIVER_MANAGED
  (inspired by Eric)
- kexec_load(): indicate the memory as "System RAM (driver managed)"

Regarding 3:
- Same as 2. kexec-tools needs to be taught to properly consider the
  memory during kdump.
Now, you are asking, "why System RAM (driver managed)". I don't think
it's strictly needed right now, but it feels cleaner. E.g., for
virtio-mem the current plan is to have /proc/iomem look like

:/# cat /proc/iomem
[...]
140000000-333ffffff : virtio-mem (virtio0)
  140000000-147ffffff : System RAM (driver managed)
  148000000-14fffffff : System RAM (driver managed)
  150000000-157ffffff : System RAM (driver managed)

One could judge by looking at the hierarchy that this memory is
special. kexec-tools will skip it currently in either form.

If we all agree here that we can drop it, then let's drop it,
especially if it would allow dax/kmem to use the same mechanism I am
proposing here for virtio-mem.

Now, it would be fairly simple to add a config option for dax/kmem,
making it configurable in the kernel whether to add memory via
MHP_DRIVER_MANAGED or just as we do now. It would contradict the
"raw firmware/prov..." description of /sys/firmware/memmap, but hey,
somebody explicitly configured it, so it can't be wrong.
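
To make the three requirements above concrete, here is a minimal
user-space sketch of how a tool could classify /proc/iomem lines. It is
not kexec-tools code: the enum, struct, and function names are
hypothetical, and it assumes the proposed "System RAM (driver managed)"
name and the usual two-spaces-per-level indentation of /proc/iomem.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification, not taken from kexec-tools. */
enum ram_kind {
	RAM_NONE,           /* not System RAM at all */
	RAM_FIRMWARE,       /* plain "System RAM" */
	RAM_DRIVER_MANAGED, /* proposed "System RAM (driver managed)" */
};

struct iomem_entry {
	unsigned long long start;
	unsigned long long end;
	int depth;          /* nesting level, assuming two spaces per level */
	enum ram_kind kind;
};

/* Parse one /proc/iomem line. Returns 0 on success, -1 if malformed. */
static int parse_iomem_line(const char *line, struct iomem_entry *e)
{
	char name[128];
	int spaces = 0;

	assert(line && e);

	/* Leading spaces encode the position in the resource hierarchy. */
	while (line[spaces] == ' ')
		spaces++;
	e->depth = spaces / 2;

	if (sscanf(line + spaces, "%llx-%llx : %127[^\n]",
		   &e->start, &e->end, name) != 3)
		return -1;

	if (!strcmp(name, "System RAM"))
		e->kind = RAM_FIRMWARE;
	else if (!strcmp(name, "System RAM (driver managed)"))
		e->kind = RAM_DRIVER_MANAGED;
	else
		e->kind = RAM_NONE;
	return 0;
}

/* Requirement 2: only plain System RAM may hold a kexec image. */
static int may_hold_kexec_image(const struct iomem_entry *e)
{
	return e->kind == RAM_FIRMWARE;
}

/* Requirement 3: kdump should cover both kinds of System RAM. */
static int should_dump(const struct iomem_entry *e)
{
	return e->kind != RAM_NONE;
}
```

A tool would call parse_iomem_line() on each line and then ask
may_hold_kexec_image() and should_dump() separately, so image placement
and dumping remain independent decisions.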
--
Thanks,
David / dhildenb
* Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP
From: Dan Williams @ 2020-05-01 21:52 UTC
To: David Hildenbrand
Cc: virtio-dev, linux-hyperv, Michal Hocko, Baoquan He, Linux ACPI,
Wei Yang, linux-s390, linux-nvdimm, Linux Kernel Mailing List,
virtualization, Linux MM, Michael S . Tsirkin, Eric W. Biederman,
Pankaj Gupta, xen-devel, Andrew Morton, Michal Hocko,
linuxppc-dev
In-Reply-To: <8242c0c5-2df2-fc0c-079a-3be62c113a11@redhat.com>
On Fri, May 1, 2020 at 2:11 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 01.05.20 22:12, Dan Williams wrote:
[..]
> >>> Consider the case of EFI Special Purpose (SP) Memory that is
> >>> marked EFI Conventional Memory with the SP attribute. In that case the
> >>> firmware memory map marked it as conventional RAM, but the kernel
> >>> optionally marks it as System RAM vs Soft Reserved. The 2008 patch
> >>> simply does not consider that case. I'm not sure strict textualism
> >>> works for coding decisions.
> >>
> >> I am no expert on that matter (esp EFI). But looking at the users of
> >> firmware_map_add_early(), the single user is in arch/x86/kernel/e820.c
> >> . So the single source of /sys/firmware/memmap is (besides hotplug) e820.
> >>
> >> "'e820_table_firmware': the original firmware version passed to us by
> >> the bootloader - not modified by the kernel. ... inform the user about
> >> the firmware's notion of memory layout via /sys/firmware/memmap"
> >> (arch/x86/kernel/e820.c)
> >>
> >> How is the EFI Special Purpose (SP) Memory represented in e820?
> >> /sys/firmware/memmap is really simple: just dump in e820. No policies IIUC.
> >
> > e820 now has a Soft Reserved translation for this which means "try to
> > reserve, but treat as System RAM is ok too". It seems generically
> > useful to me that the toggle for determining whether Soft Reserved or
> > System RAM shows up /sys/firmware/memmap is a determination that
> > policy can make. The kernel need not preemptively block it.
>
> So, I think I have to clarify something here. We do have two ways to kexec
>
> 1. kexec_load(): User space (kexec-tools) crafts the memmap (e.g., using
> /sys/firmware/memmap on x86-64) and selects memory where to place the
> kexec images (e.g., using /proc/iomem)
>
> 2. kexec_file_load(): The kernel reuses the (basically) raw firmware
> memmap and selects memory where to place kexec images.
>
> We are talking about changing 1, to behave like 2 in regards to
> dax/kmem. 2. does currently not add any hotplugged memory to the
> fixed-up e820, and it should be fixed regarding hotplugged DIMMs that
> would appear in e820 after a reboot.
>
> Now, all these policy discussions are nice and fun, but I don't really
> see a good reason to (ab)use /sys/firmware/memmap for that (e.g., parent
> properties). If you want to be able to make this configurable, then
> e.g., add a way to configure this in the kernel (for example along with
> kmem) to make 1. and 2. behave the same way. Otherwise, you really only
> can change 1.
That's clearer.
>
>
> Now, let's clarify what I want regarding virtio-mem:
>
> 1. kexec should not add virtio-mem memory to the initial firmware
> memmap. The driver has to be in charge as discussed.
> 2. kexec should not place kexec images onto virtio-mem memory. That
> would end badly.
> 3. kexec should still dump virtio-mem memory via kdump.
Ok, but that then seems to say to me that dax/kmem is a different type
of (driver managed) memory than virtio-mem, and it's confusing to try to
apply the same meaning to both. Why not just name your type for the
distinct type it is, "System RAM (virtio-mem)", and let any other driver
managed memory follow the same "System RAM ($driver)" format if it wants?
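
For illustration only, a tool consuming such a "System RAM ($driver)"
name could pull out the driver tag roughly as follows. This is a sketch
under the assumption that the format is adopted exactly as written
above; the function name is hypothetical and nothing like it exists in
the kernel or kexec-tools as far as this thread shows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * If name has the form "System RAM (<tag>)", copy <tag> into tag and
 * return 1. Return 0 for plain "System RAM", for any other name, for an
 * empty tag, or if the tag does not fit into taglen bytes.
 */
static int system_ram_tag(const char *name, char *tag, size_t taglen)
{
	static const char prefix[] = "System RAM (";
	const size_t plen = sizeof(prefix) - 1;
	size_t len;

	assert(name && tag && taglen > 0);

	if (strncmp(name, prefix, plen) != 0)
		return 0;

	name += plen;
	len = strlen(name);
	/* Need at least one tag character followed by ')'. */
	if (len < 2 || name[len - 1] != ')')
		return 0;

	len--;                  /* drop the closing ')' */
	if (len >= taglen)
		return 0;

	memcpy(tag, name, len);
	tag[len] = '\0';
	return 1;
}
```

With this, "System RAM (virtio-mem)" yields the tag "virtio-mem", while
plain "System RAM" is reported as untagged.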
* Re: [PATCH 21/29] mm: remove the pgprot argument to __vmalloc
From: Andrew Morton @ 2020-05-01 22:09 UTC
To: John Dorminy
Cc: linux-hyperv, David Airlie, dri-devel, Michael Kelley, linux-mm,
K. Y. Srinivasan, Sumit Semwal, linux-arch, linux-s390, Wei Liu,
Stephen Hemminger, x86, Christoph Hellwig, Peter Zijlstra,
Gao Xiang, Laura Abbott, Nitin Gupta, Daniel Vetter,
Haiyang Zhang, linaro-mm-sig, linux-arm-kernel, Robin Murphy,
Linux Kernel Mailing List, Minchan Kim, iommu, Sakari Ailus, bpf,
linuxppc-dev
In-Reply-To: <CAMeeMh_9N0ORhPM8EmkGeeuiDoQY3+QoAPX5QBuK7=gsC5ONng@mail.gmail.com>
On Thu, 30 Apr 2020 22:38:10 -0400 John Dorminy <jdorminy@redhat.com> wrote:
> the change
> description refers to PROT_KERNEL, which is a symbol which does not
> appear to exist; perhaps PAGE_KERNEL was meant?
Yes, thanks, fixed.
* Re: [PATCH v8 4/7] perf/tools: Enhance JSON/metric infrastructure to handle "?"
From: Ian Rogers @ 2020-05-01 15:56 UTC
To: Kajol Jain
Cc: Mark Rutland, maddy, Peter Zijlstra, Jin Yao, Ingo Molnar,
Liang, Kan, Andi Kleen, Alexander Shishkin, Anju T Sudhakar,
mamatha4, sukadev, Ravi Bangoria, Arnaldo Carvalho de Melo,
jmario, Namhyung Kim, Thomas Gleixner, Michael Petlan,
Greg Kroah-Hartman, LKML, linux-perf-users, Jiri Olsa,
linuxppc-dev
In-Reply-To: <20200401203340.31402-5-kjain@linux.ibm.com>
On Wed, Apr 1, 2020 at 1:35 PM Kajol Jain <kjain@linux.ibm.com> wrote:
>
> Patch enhances current metric infrastructure to handle "?" in the metric
> expression. The "?" can be used for parameters whose value is not known
> while creating metric events and which can be replaced later at runtime
> with the proper value. It also adds flexibility to create multiple events
> out of a single metric event added in the json file.
>
> Patch adds function 'arch_get_runtimeparam', which is an arch specific
> function that returns the count of metric events that need to be created.
> By default it returns 1.

Sorry for the slow response, I was trying to understand this patch in
relation to the PMU aliases to see if there was an overlap - I'm still
not sure. This is now merged so I'm just commenting wrt possible
future cleanup. I defer to the maintainers on how this should be
organized. At the metric level, this problem reminds me of both
#smt_on and LLC_MISSES.PCIE_WRITE on Cascade Lake. #smt_on adds a
degree of CPU specific behavior to an expression.
LLC_MISSES.PCIE_WRITE uses .part0 ... part3 to combine separate but
related counters.

The symbols that the metrics parse are then passed to parse-event. You
don't change parse-event, as metricgroup replaces the '?' with a value
read from /devices/hv_24x7/interface/sockets; in practice, each value
from 0 to that value minus 1 is passed.
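
To spell out the substitution described above, here is a standalone
sketch that replaces each '?' in an expression with a decimal value. It
is not the patch's normalize() code: the function name and signature are
hypothetical, and unlike the in-place rewrite in the lexer it writes to
a separate, bounded buffer so a multi-digit value cannot overrun the
input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy expr into out, replacing every '?' with the decimal value of
 * param. Returns 0 on success, -1 if out is too small.
 */
static int expand_runtime_param(const char *expr, int param,
				char *out, size_t outlen)
{
	size_t used = 0;

	assert(expr && out);

	if (outlen == 0)
		return -1;

	for (; *expr; expr++) {
		int n;

		if (*expr == '?')
			n = snprintf(out + used, outlen - used, "%d", param);
		else
			n = snprintf(out + used, outlen - used, "%c", *expr);

		/* snprintf reports the length it wanted; reject truncation. */
		if (n < 0 || (size_t)n >= outlen - used)
			return -1;
		used += (size_t)n;
	}

	out[used] = '\0';
	return 0;
}
```

Calling this once per value from 0 up to the socket count minus 1 gives
one expanded expression per chip, e.g. "hv_24x7/pm_pb_cyc,chip=0/" and
"hv_24x7/pm_pb_cyc,chip=1/" on a two-socket system.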

It seems unfortunate to overload the meaning of runtime with a value
read from /devices/hv_24x7/interface/sockets and plumbing this value
around is quite a bit of noise for everything but this use-case. I
kind of wish we could do something like:

for i in 0, read("/devices/hv_24x7/interface/sockets"):
  hv_24x7/pm_pb_cyc,chip=$i

in the metric code. I have some patches to send related to metric
groups and I think this will be an active area of development for a
while as I think there are some open questions on the organization of
the code.

Thanks,
Ian
> This infrastructure is needed for hv_24x7 socket/chip level events.
> "hv_24x7" chip level events need the specific chip-id for which the
> data is requested. Function 'arch_get_runtimeparam', implemented
> in header.c, extracts the number of sockets from the sysfs file
> "sockets" under "/sys/devices/hv_24x7/interface/".
>
> With this patch we are basically trying to create as many metric events
> as defined by runtime_param.
>
> For that, one loop is added in function 'metricgroup__add_metric',
> which creates multiple events at run time depending on the return value
> of 'arch_get_runtimeparam' and merges those events into 'group_list'.
>
> To achieve that, we pass this parameter value as part of the
> `expr__find_other` function and replace the "?" present in the metric
> expression with this value.
>
> In our json file there will be a single metric event, out of
> which we create multiple events.
>
> To understand which data count belongs to which parameter value,
> we also print the param value in the generic_metric function.
> For example,
> command:# ./perf stat -M PowerBUS_Frequency -C 0 -I 1000
> 1.000101867 9,356,933 hv_24x7/pm_pb_cyc,chip=0/ # 2.3 GHz PowerBUS_Frequency_0
> 1.000101867 9,366,134 hv_24x7/pm_pb_cyc,chip=1/ # 2.3 GHz PowerBUS_Frequency_1
> 2.000314878 9,365,868 hv_24x7/pm_pb_cyc,chip=0/ # 2.3 GHz PowerBUS_Frequency_0
> 2.000314878 9,366,092 hv_24x7/pm_pb_cyc,chip=1/ # 2.3 GHz PowerBUS_Frequency_1
>
> So, here _0 and _1 after PowerBUS_Frequency specify parameter value.
>
> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
> ---
> tools/perf/arch/powerpc/util/header.c | 8 ++++++++
> tools/perf/tests/expr.c | 8 ++++----
> tools/perf/util/expr.c | 11 ++++++-----
> tools/perf/util/expr.h | 5 +++--
> tools/perf/util/expr.l | 27 +++++++++++++++++++-------
> tools/perf/util/metricgroup.c | 28 ++++++++++++++++++++++++---
> tools/perf/util/metricgroup.h | 2 ++
> tools/perf/util/stat-shadow.c | 17 ++++++++++------
> 8 files changed, 79 insertions(+), 27 deletions(-)
>
> diff --git a/tools/perf/arch/powerpc/util/header.c b/tools/perf/arch/powerpc/util/header.c
> index 3b4cdfc5efd6..d4870074f14c 100644
> --- a/tools/perf/arch/powerpc/util/header.c
> +++ b/tools/perf/arch/powerpc/util/header.c
> @@ -7,6 +7,8 @@
> #include <string.h>
> #include <linux/stringify.h>
> #include "header.h"
> +#include "metricgroup.h"
> +#include <api/fs/fs.h>
>
> #define mfspr(rn) ({unsigned long rval; \
> asm volatile("mfspr %0," __stringify(rn) \
> @@ -44,3 +46,9 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
>
> return bufp;
> }
> +
> +int arch_get_runtimeparam(void)
> +{
> + int count;
> + return sysfs__read_int("/devices/hv_24x7/interface/sockets", &count) < 0 ? 1 : count;
> +}
> diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
> index ea10fc4412c4..516504cf0ea5 100644
> --- a/tools/perf/tests/expr.c
> +++ b/tools/perf/tests/expr.c
> @@ -10,7 +10,7 @@ static int test(struct expr_parse_ctx *ctx, const char *e, double val2)
> {
> double val;
>
> - if (expr__parse(&val, ctx, e))
> + if (expr__parse(&val, ctx, e, 1))
> TEST_ASSERT_VAL("parse test failed", 0);
> TEST_ASSERT_VAL("unexpected value", val == val2);
> return 0;
> @@ -44,15 +44,15 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
> return ret;
>
> p = "FOO/0";
> - ret = expr__parse(&val, &ctx, p);
> + ret = expr__parse(&val, &ctx, p, 1);
> TEST_ASSERT_VAL("division by zero", ret == -1);
>
> p = "BAR/";
> - ret = expr__parse(&val, &ctx, p);
> + ret = expr__parse(&val, &ctx, p, 1);
> TEST_ASSERT_VAL("missing operand", ret == -1);
>
> TEST_ASSERT_VAL("find other",
> - expr__find_other("FOO + BAR + BAZ + BOZO", "FOO", &other, &num_other) == 0);
> + expr__find_other("FOO + BAR + BAZ + BOZO", "FOO", &other, &num_other, 1) == 0);
> TEST_ASSERT_VAL("find other", num_other == 3);
> TEST_ASSERT_VAL("find other", !strcmp(other[0], "BAR"));
> TEST_ASSERT_VAL("find other", !strcmp(other[1], "BAZ"));
> diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
> index c3382d58cf40..aa631e37ad1e 100644
> --- a/tools/perf/util/expr.c
> +++ b/tools/perf/util/expr.c
> @@ -27,10 +27,11 @@ void expr__ctx_init(struct expr_parse_ctx *ctx)
>
> static int
> __expr__parse(double *val, struct expr_parse_ctx *ctx, const char *expr,
> - int start)
> + int start, int runtime)
> {
> struct expr_scanner_ctx scanner_ctx = {
> .start_token = start,
> + .runtime = runtime,
> };
> YY_BUFFER_STATE buffer;
> void *scanner;
> @@ -54,9 +55,9 @@ __expr__parse(double *val, struct expr_parse_ctx *ctx, const char *expr,
> return ret;
> }
>
> -int expr__parse(double *final_val, struct expr_parse_ctx *ctx, const char *expr)
> +int expr__parse(double *final_val, struct expr_parse_ctx *ctx, const char *expr, int runtime)
> {
> - return __expr__parse(final_val, ctx, expr, EXPR_PARSE) ? -1 : 0;
> + return __expr__parse(final_val, ctx, expr, EXPR_PARSE, runtime) ? -1 : 0;
> }
>
> static bool
> @@ -74,13 +75,13 @@ already_seen(const char *val, const char *one, const char **other,
> }
>
> int expr__find_other(const char *expr, const char *one, const char ***other,
> - int *num_other)
> + int *num_other, int runtime)
> {
> int err, i = 0, j = 0;
> struct expr_parse_ctx ctx;
>
> expr__ctx_init(&ctx);
> - err = __expr__parse(NULL, &ctx, expr, EXPR_OTHER);
> + err = __expr__parse(NULL, &ctx, expr, EXPR_OTHER, runtime);
> if (err)
> return -1;
>
> diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
> index 0938ad166ece..87d627bb699b 100644
> --- a/tools/perf/util/expr.h
> +++ b/tools/perf/util/expr.h
> @@ -17,12 +17,13 @@ struct expr_parse_ctx {
>
> struct expr_scanner_ctx {
> int start_token;
> + int runtime;
> };
>
> void expr__ctx_init(struct expr_parse_ctx *ctx);
> void expr__add_id(struct expr_parse_ctx *ctx, const char *id, double val);
> -int expr__parse(double *final_val, struct expr_parse_ctx *ctx, const char *expr);
> +int expr__parse(double *final_val, struct expr_parse_ctx *ctx, const char *expr, int runtime);
> int expr__find_other(const char *expr, const char *one, const char ***other,
> - int *num_other);
> + int *num_other, int runtime);
>
> #endif
> diff --git a/tools/perf/util/expr.l b/tools/perf/util/expr.l
> index 2582c2464938..74b9b59b1aa5 100644
> --- a/tools/perf/util/expr.l
> +++ b/tools/perf/util/expr.l
> @@ -35,7 +35,7 @@ static int value(yyscan_t scanner, int base)
> * Allow @ instead of / to be able to specify pmu/event/ without
> * conflicts with normal division.
> */
> -static char *normalize(char *str)
> +static char *normalize(char *str, int runtime)
> {
> char *ret = str;
> char *dst = str;
> @@ -45,6 +45,19 @@ static char *normalize(char *str)
> *dst++ = '/';
> else if (*str == '\\')
> *dst++ = *++str;
> + else if (*str == '?') {
> + char *paramval;
> + int i = 0;
> + int size = asprintf(&paramval, "%d", runtime);
> +
> + if (size < 0)
> + *dst++ = '0';
> + else {
> + while (i < size)
> + *dst++ = paramval[i++];
> + free(paramval);
> + }
> + }
> else
> *dst++ = *str;
> str++;
> @@ -54,16 +67,16 @@ static char *normalize(char *str)
> return ret;
> }
>
> -static int str(yyscan_t scanner, int token)
> +static int str(yyscan_t scanner, int token, int runtime)
> {
> YYSTYPE *yylval = expr_get_lval(scanner);
> char *text = expr_get_text(scanner);
>
> - yylval->str = normalize(strdup(text));
> + yylval->str = normalize(strdup(text), runtime);
> if (!yylval->str)
> return EXPR_ERROR;
>
> - yylval->str = normalize(yylval->str);
> + yylval->str = normalize(yylval->str, runtime);
> return token;
> }
> %}
> @@ -72,8 +85,8 @@ number [0-9]+
>
> sch [-,=]
> spec \\{sch}
> -sym [0-9a-zA-Z_\.:@]+
> -symbol {spec}*{sym}*{spec}*{sym}*
> +sym [0-9a-zA-Z_\.:@?]+
> +symbol {spec}*{sym}*{spec}*{sym}*{spec}*{sym}
>
> %%
> struct expr_scanner_ctx *sctx = expr_get_extra(yyscanner);
> @@ -93,7 +106,7 @@ if { return IF; }
> else { return ELSE; }
> #smt_on { return SMT_ON; }
> {number} { return value(yyscanner, 10); }
> -{symbol} { return str(yyscanner, ID); }
> +{symbol} { return str(yyscanner, ID, sctx->runtime); }
> "|" { return '|'; }
> "^" { return '^'; }
> "&" { return '&'; }
> diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> index 7ad81c8177ea..b071df373f8b 100644
> --- a/tools/perf/util/metricgroup.c
> +++ b/tools/perf/util/metricgroup.c
> @@ -90,6 +90,7 @@ struct egroup {
> const char *metric_name;
> const char *metric_expr;
> const char *metric_unit;
> + int runtime;
> };
>
> static struct evsel *find_evsel_group(struct evlist *perf_evlist,
> @@ -202,6 +203,7 @@ static int metricgroup__setup_events(struct list_head *groups,
> expr->metric_name = eg->metric_name;
> expr->metric_unit = eg->metric_unit;
> expr->metric_events = metric_events;
> + expr->runtime = eg->runtime;
> list_add(&expr->nd, &me->head);
> }
>
> @@ -485,15 +487,20 @@ static bool metricgroup__has_constraint(struct pmu_event *pe)
> return false;
> }
>
> +int __weak arch_get_runtimeparam(void)
> +{
> + return 1;
> +}
> +
> static int __metricgroup__add_metric(struct strbuf *events,
> - struct list_head *group_list, struct pmu_event *pe)
> + struct list_head *group_list, struct pmu_event *pe, int runtime)
> {
>
> const char **ids;
> int idnum;
> struct egroup *eg;
>
> - if (expr__find_other(pe->metric_expr, NULL, &ids, &idnum) < 0)
> + if (expr__find_other(pe->metric_expr, NULL, &ids, &idnum, runtime) < 0)
> return -EINVAL;
>
> if (events->len > 0)
> @@ -513,6 +520,7 @@ static int __metricgroup__add_metric(struct strbuf *events,
> eg->metric_name = pe->metric_name;
> eg->metric_expr = pe->metric_expr;
> eg->metric_unit = pe->unit;
> + eg->runtime = runtime;
> list_add_tail(&eg->nd, group_list);
>
> return 0;
> @@ -540,7 +548,21 @@ static int metricgroup__add_metric(const char *metric, struct strbuf *events,
>
> pr_debug("metric expr %s for %s\n", pe->metric_expr, pe->metric_name);
>
> - ret = __metricgroup__add_metric(events, group_list, pe);
> + if (!strstr(pe->metric_expr, "?")) {
> + ret = __metricgroup__add_metric(events, group_list, pe, 1);
> + } else {
> + int j, count;
> +
> + count = arch_get_runtimeparam();
> +
> + /* This loop is added to create multiple
> + * events depend on count value and add
> + * those events to group_list.
> + */
> +
> + for (j = 0; j < count; j++)
> + ret = __metricgroup__add_metric(events, group_list, pe, j);
> + }
> if (ret == -ENOMEM)
> break;
> }
> diff --git a/tools/perf/util/metricgroup.h b/tools/perf/util/metricgroup.h
> index 475c7f912864..6b09eb30b4ec 100644
> --- a/tools/perf/util/metricgroup.h
> +++ b/tools/perf/util/metricgroup.h
> @@ -22,6 +22,7 @@ struct metric_expr {
> const char *metric_name;
> const char *metric_unit;
> struct evsel **metric_events;
> + int runtime;
> };
>
> struct metric_event *metricgroup__lookup(struct rblist *metric_events,
> @@ -34,4 +35,5 @@ int metricgroup__parse_groups(const struct option *opt,
> void metricgroup__print(bool metrics, bool groups, char *filter,
> bool raw, bool details);
> bool metricgroup__has_metric(const char *metric);
> +int arch_get_runtimeparam(void);
> #endif
> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
> index 402af3e8d287..cf353ca591a5 100644
> --- a/tools/perf/util/stat-shadow.c
> +++ b/tools/perf/util/stat-shadow.c
> @@ -336,7 +336,7 @@ void perf_stat__collect_metric_expr(struct evlist *evsel_list)
> metric_events = counter->metric_events;
> if (!metric_events) {
> if (expr__find_other(counter->metric_expr, counter->name,
> - &metric_names, &num_metric_names) < 0)
> + &metric_names, &num_metric_names, 1) < 0)
> continue;
>
> metric_events = calloc(sizeof(struct evsel *),
> @@ -723,6 +723,7 @@ static void generic_metric(struct perf_stat_config *config,
> char *name,
> const char *metric_name,
> const char *metric_unit,
> + int runtime,
> double avg,
> int cpu,
> struct perf_stat_output_ctx *out,
> @@ -777,7 +778,7 @@ static void generic_metric(struct perf_stat_config *config,
> }
>
> if (!metric_events[i]) {
> - if (expr__parse(&ratio, &pctx, metric_expr) == 0) {
> + if (expr__parse(&ratio, &pctx, metric_expr, runtime) == 0) {
> char *unit;
> char metric_bf[64];
>
> @@ -786,9 +787,13 @@ static void generic_metric(struct perf_stat_config *config,
> &unit, &scale) >= 0) {
> ratio *= scale;
> }
> -
> - scnprintf(metric_bf, sizeof(metric_bf),
> + if (strstr(metric_expr, "?"))
> + scnprintf(metric_bf, sizeof(metric_bf),
> + "%s %s_%d", unit, metric_name, runtime);
> + else
> + scnprintf(metric_bf, sizeof(metric_bf),
> "%s %s", unit, metric_name);
> +
> print_metric(config, ctxp, NULL, "%8.1f",
> metric_bf, ratio);
> } else {
> @@ -1019,7 +1024,7 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
> print_metric(config, ctxp, NULL, NULL, name, 0);
> } else if (evsel->metric_expr) {
> generic_metric(config, evsel->metric_expr, evsel->metric_events, evsel->name,
> - evsel->metric_name, NULL, avg, cpu, out, st);
> + evsel->metric_name, NULL, 1, avg, cpu, out, st);
> } else if (runtime_stat_n(st, STAT_NSECS, 0, cpu) != 0) {
> char unit = 'M';
> char unit_buf[10];
> @@ -1048,7 +1053,7 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
> out->new_line(config, ctxp);
> generic_metric(config, mexp->metric_expr, mexp->metric_events,
> evsel->name, mexp->metric_name,
> - mexp->metric_unit, avg, cpu, out, st);
> + mexp->metric_unit, mexp->runtime, avg, cpu, out, st);
> }
> }
> if (num == 0)
> --
> 2.21.0
>
* Re: 5.7-rc interrupt_return Unrecoverable exception 380
From: Nicholas Piggin @ 2020-05-02 2:40 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Michal Suchanek, linuxppc-dev
In-Reply-To: <alpine.LSU.2.11.2005011253250.3734@eggly.anvils>
Excerpts from Hugh Dickins's message of May 2, 2020 6:38 am:
> Hi Nick,
>
> I've been getting an "Unrecoverable exception 380" after a few hours
> of load on the G5 (yes, that G5!) with 5.7-rc: when interrupt_return
> checks lazy_irq_pending, it crashes at check_preemption_disabled+0x24
> with CONFIG_DEBUG_PREEMPT=y.
>
> check_preemption_disabled():
> lib/smp_processor_id.c:13
> 0: 7c 08 02 a6 mflr r0
> 4: fb e1 ff f8 std r31,-8(r1)
> 8: fb 61 ff d8 std r27,-40(r1)
> c: fb 81 ff e0 std r28,-32(r1)
> 10: fb a1 ff e8 std r29,-24(r1)
> 14: fb c1 ff f0 std r30,-16(r1)
> get_current():
> arch/powerpc/include/asm/current.h:20
> 18: eb ed 01 88 ld r31,392(r13)
> check_preemption_disabled():
> lib/smp_processor_id.c:13
> 1c: f8 01 00 10 std r0,16(r1)
> 20: f8 21 ff 61 stdu r1,-160(r1)
> __read_once_size():
> include/linux/compiler.h:199
> 24: 81 3f 00 00 lwz r9,0(r31)
> check_preemption_disabled():
> lib/smp_processor_id.c:14
> 28: a3 cd 00 02 lhz r30,2(r13)
>
> I don't read ppc assembly, and have not jotted down the registers,
> but hope you can make sense of it. I get around it with the patch
> below (just avoiding the debug), but have no idea whether it's a
> necessary fix or a hacky workaround.
Hi Hugh,
Thanks for the report, nice catch. Your fix is actually the correct one
(well, we probably want a __lazy_irq_pending() variant which is to be
used in these cases).
Problem is MSR[RI] is cleared here, ready to do the last few things for
interrupt return where we're not allowed to take any other interrupts.
SLB interrupts can happen just about anywhere aside from kernel text,
global variables, and stack. When that hits, it appears to be
unrecoverable due to RI=0.
We could clear just MSR[EE] for asynchronous interrupts, then check
lazy_irq_pending(), and then clear MSR[RI] ready to return, and the
SLB miss in the debug check would be fine. But that's two mtmsr
instructions, which is slower. So we'll skip the check.
I tested hash, and preempt, possibly even preempt+hash, but clearly not
preempt+preempt_debug+hash+slb thrashing!
Thanks,
Nick
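
The "__lazy_irq_pending() variant" idea above can be modelled in user space
roughly as follows. This is only a sketch under stated assumptions, not the
kernel code: struct paca_model, get_paca_checked(), debug_checks and the
*_raw / *_nocheck names are hypothetical, and the value used for
PACA_IRQ_HARD_DIS is illustrative. It shows that the raw variant tests the
same bits without going through the debug-checked accessor.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative value only (assumption); the real constant is in hw_irq.h. */
#define PACA_IRQ_HARD_DIS 0x01u

/* Hypothetical stand-in for the per-CPU paca; only the field we need. */
struct paca_model {
	uint8_t irq_happened;
};

static struct paca_model model_paca;
static unsigned int debug_checks;	/* counts simulated debug checks */

/* Models get_paca() with CONFIG_DEBUG_PREEMPT: extra work before access. */
static struct paca_model *get_paca_checked(void)
{
	debug_checks++;		/* stands in for check_preemption_disabled() */
	return &model_paca;
}

/* Raw variant: operates on an already-loaded value, touches nothing else. */
static inline bool lazy_irq_pending_raw(uint8_t irq_happened)
{
	return !!(irq_happened & ~PACA_IRQ_HARD_DIS);
}

/* Normal path: goes through the checked accessor. */
static inline bool lazy_irq_pending(void)
{
	return lazy_irq_pending_raw(get_paca_checked()->irq_happened);
}

/* For contexts like RI=0 interrupt return: skip the debug check. */
static inline bool lazy_irq_pending_nocheck(void)
{
	return lazy_irq_pending_raw(model_paca.irq_happened);
}
```

In this model, lazy_irq_pending_nocheck() never bumps debug_checks, which
is the property wanted on the interrupt return path.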
>
> Hugh
>
> --- 5.7-rc3/arch/powerpc/include/asm/hw_irq.h 2020-04-12 16:24:29.802769727 -0700
> +++ linux/arch/powerpc/include/asm/hw_irq.h 2020-04-27 11:31:10.000000000 -0700
> @@ -252,7 +252,7 @@ static inline bool arch_irqs_disabled(vo
>
> static inline bool lazy_irq_pending(void)
> {
> - return !!(get_paca()->irq_happened & ~PACA_IRQ_HARD_DIS);
> + return !!(local_paca->irq_happened & ~PACA_IRQ_HARD_DIS);
> }
>
> /*
>
* [PATCH] powerpc: Drop CONFIG_MTD_M25P80 in 5xx-hw.config
From: Bin Meng @ 2020-05-02 4:28 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: Bin Meng
From: Bin Meng <bin.meng@windriver.com>
Drop CONFIG_MTD_M25P80 that was removed in
commit b35b9a10362d ("mtd: spi-nor: Move m25p80 code in spi-nor.c")
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
arch/powerpc/configs/85xx-hw.config | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/powerpc/configs/85xx-hw.config b/arch/powerpc/configs/85xx-hw.config
index b507df6..524db76 100644
--- a/arch/powerpc/configs/85xx-hw.config
+++ b/arch/powerpc/configs/85xx-hw.config
@@ -67,7 +67,6 @@ CONFIG_MTD_CFI_AMDSTD=y
CONFIG_MTD_CFI_INTELEXT=y
CONFIG_MTD_CFI=y
CONFIG_MTD_CMDLINE_PARTS=y
-CONFIG_MTD_M25P80=y
CONFIG_MTD_NAND_FSL_ELBC=y
CONFIG_MTD_NAND_FSL_IFC=y
CONFIG_MTD_RAW_NAND=y
--
2.7.4
* [PATCH v2] powerpc: Drop CONFIG_MTD_M25P80 in 85xx-hw.config
From: Bin Meng @ 2020-05-02 4:44 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: Bin Meng
From: Bin Meng <bin.meng@windriver.com>
Drop CONFIG_MTD_M25P80 that was removed in
commit b35b9a10362d ("mtd: spi-nor: Move m25p80 code in spi-nor.c")
Signed-off-by: Bin Meng <bin.meng@windriver.com>
---
Changes in v2:
- correct the typo (5xx => 85xx) in the commit title
arch/powerpc/configs/85xx-hw.config | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/powerpc/configs/85xx-hw.config b/arch/powerpc/configs/85xx-hw.config
index b507df6..524db76 100644
--- a/arch/powerpc/configs/85xx-hw.config
+++ b/arch/powerpc/configs/85xx-hw.config
@@ -67,7 +67,6 @@ CONFIG_MTD_CFI_AMDSTD=y
CONFIG_MTD_CFI_INTELEXT=y
CONFIG_MTD_CFI=y
CONFIG_MTD_CMDLINE_PARTS=y
-CONFIG_MTD_M25P80=y
CONFIG_MTD_NAND_FSL_ELBC=y
CONFIG_MTD_NAND_FSL_IFC=y
CONFIG_MTD_RAW_NAND=y
--
2.7.4
* [powerpc:topic/uaccess-ppc] BUILD SUCCESS 4fe5cda9f89d0aea8e915b7c96ae34bda4e12e51
From: kbuild test robot @ 2020-05-02 8:56 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git topic/uaccess-ppc
branch HEAD: 4fe5cda9f89d0aea8e915b7c96ae34bda4e12e51 powerpc/uaccess: Implement user_read_access_begin and user_write_access_begin
elapsed time: 533m
configs tested: 216
configs skipped: 0
The following configs have been built successfully.
More configs may be tested in the coming days.
arm64 allyesconfig
arm allyesconfig
arm64 allmodconfig
arm allmodconfig
arm64 allnoconfig
arm allnoconfig
arm efm32_defconfig
arm at91_dt_defconfig
arm shmobile_defconfig
arm64 defconfig
arm exynos_defconfig
arm multi_v5_defconfig
arm sunxi_defconfig
arm multi_v7_defconfig
arc defconfig
mips ar7_defconfig
mips allmodconfig
nios2 3c120_defconfig
sparc64 defconfig
csky defconfig
sh rsk7269_defconfig
ia64 allnoconfig
i386 allnoconfig
i386 allyesconfig
i386 alldefconfig
i386 defconfig
i386 debian-10.3
ia64 allmodconfig
ia64 defconfig
ia64 generic_defconfig
ia64 tiger_defconfig
ia64 bigsur_defconfig
ia64 allyesconfig
ia64 alldefconfig
m68k m5475evb_defconfig
m68k allmodconfig
m68k bvme6000_defconfig
m68k sun3_defconfig
m68k multi_defconfig
nios2 10m50_defconfig
c6x evmc6678_defconfig
c6x allyesconfig
openrisc simple_smp_defconfig
openrisc or1ksim_defconfig
nds32 defconfig
nds32 allnoconfig
alpha defconfig
h8300 h8s-sim_defconfig
h8300 edosk2674_defconfig
xtensa iss_defconfig
h8300 h8300h-sim_defconfig
xtensa common_defconfig
arc allyesconfig
microblaze mmu_defconfig
microblaze nommu_defconfig
mips fuloong2e_defconfig
mips malta_kvm_defconfig
mips allyesconfig
mips 64r6el_defconfig
mips allnoconfig
mips 32r2_defconfig
mips malta_kvm_guest_defconfig
mips tb0287_defconfig
mips capcella_defconfig
mips ip32_defconfig
mips decstation_64_defconfig
mips loongson3_defconfig
mips ath79_defconfig
mips bcm63xx_defconfig
parisc allnoconfig
parisc generic-64bit_defconfig
parisc generic-32bit_defconfig
parisc allyesconfig
parisc allmodconfig
powerpc chrp32_defconfig
powerpc defconfig
powerpc holly_defconfig
powerpc ppc64_defconfig
powerpc rhel-kconfig
powerpc allnoconfig
powerpc mpc866_ads_defconfig
powerpc amigaone_defconfig
powerpc adder875_defconfig
powerpc ep8248e_defconfig
powerpc g5_defconfig
powerpc mpc512x_defconfig
m68k randconfig-a001-20200502
mips randconfig-a001-20200502
nds32 randconfig-a001-20200502
alpha randconfig-a001-20200502
parisc randconfig-a001-20200502
riscv randconfig-a001-20200502
parisc randconfig-a001-20200430
mips randconfig-a001-20200430
m68k randconfig-a001-20200430
riscv randconfig-a001-20200430
alpha randconfig-a001-20200430
nds32 randconfig-a001-20200430
h8300 randconfig-a001-20200502
nios2 randconfig-a001-20200502
microblaze randconfig-a001-20200502
c6x randconfig-a001-20200502
sparc64 randconfig-a001-20200502
microblaze randconfig-a001-20200430
nios2 randconfig-a001-20200430
h8300 randconfig-a001-20200430
c6x randconfig-a001-20200430
sparc64 randconfig-a001-20200430
s390 randconfig-a001-20200502
xtensa randconfig-a001-20200502
sh randconfig-a001-20200502
openrisc randconfig-a001-20200502
csky randconfig-a001-20200502
i386 randconfig-b003-20200501
x86_64 randconfig-b002-20200501
i386 randconfig-b001-20200501
x86_64 randconfig-b003-20200501
x86_64 randconfig-b001-20200501
i386 randconfig-b002-20200501
i386 randconfig-b003-20200502
i386 randconfig-b001-20200502
x86_64 randconfig-b003-20200502
x86_64 randconfig-b001-20200502
i386 randconfig-b002-20200502
i386 randconfig-d003-20200502
i386 randconfig-d001-20200502
x86_64 randconfig-d002-20200502
i386 randconfig-d002-20200502
x86_64 randconfig-d002-20200430
x86_64 randconfig-d001-20200430
i386 randconfig-d001-20200430
i386 randconfig-d003-20200430
i386 randconfig-d002-20200430
x86_64 randconfig-d003-20200430
x86_64 randconfig-e003-20200502
i386 randconfig-e003-20200502
x86_64 randconfig-e001-20200502
i386 randconfig-e002-20200502
i386 randconfig-e001-20200502
x86_64 randconfig-e002-20200430
i386 randconfig-e003-20200430
x86_64 randconfig-e003-20200430
i386 randconfig-e002-20200430
x86_64 randconfig-e001-20200430
i386 randconfig-e001-20200430
i386 randconfig-f003-20200501
x86_64 randconfig-f001-20200501
x86_64 randconfig-f003-20200501
i386 randconfig-f001-20200501
i386 randconfig-f002-20200501
i386 randconfig-f003-20200502
x86_64 randconfig-f001-20200502
x86_64 randconfig-f003-20200502
x86_64 randconfig-f002-20200502
i386 randconfig-f001-20200502
i386 randconfig-f002-20200502
x86_64 randconfig-a003-20200502
x86_64 randconfig-a001-20200502
x86_64 randconfig-a002-20200502
i386 randconfig-a002-20200502
i386 randconfig-a003-20200502
i386 randconfig-a001-20200502
i386 randconfig-h001-20200502
i386 randconfig-h002-20200502
i386 randconfig-h003-20200502
x86_64 randconfig-h002-20200502
x86_64 randconfig-h001-20200502
x86_64 randconfig-h003-20200502
i386 randconfig-h001-20200501
i386 randconfig-h002-20200501
i386 randconfig-h003-20200501
x86_64 randconfig-h001-20200501
x86_64 randconfig-h003-20200501
ia64 randconfig-a001-20200502
arm64 randconfig-a001-20200502
arc randconfig-a001-20200502
powerpc randconfig-a001-20200502
arm randconfig-a001-20200502
sparc randconfig-a001-20200502
ia64 randconfig-a001-20200501
arc randconfig-a001-20200501
powerpc randconfig-a001-20200501
arm randconfig-a001-20200501
sparc randconfig-a001-20200501
riscv allyesconfig
riscv nommu_virt_defconfig
riscv allnoconfig
riscv defconfig
riscv rv32_defconfig
riscv allmodconfig
s390 zfcpdump_defconfig
s390 debug_defconfig
s390 allyesconfig
s390 allnoconfig
s390 allmodconfig
s390 alldefconfig
s390 defconfig
sh allmodconfig
sh titan_defconfig
sh sh7785lcr_32bit_defconfig
sh allnoconfig
sparc allyesconfig
sparc defconfig
sparc64 allnoconfig
sparc64 allyesconfig
sparc64 allmodconfig
um x86_64_defconfig
um i386_defconfig
um defconfig
x86_64 rhel
x86_64 rhel-7.6
x86_64 rhel-7.6-kselftests
x86_64 rhel-7.2-clear
x86_64 lkp
x86_64 fedora-25
x86_64 kexec
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP
From: David Hildenbrand @ 2020-05-02 9:26 UTC (permalink / raw)
To: Dan Williams
Cc: virtio-dev, linux-hyperv, Michal Hocko, Baoquan He, Linux ACPI,
Wei Yang, linux-s390, linux-nvdimm, Linux Kernel Mailing List,
Dave Hansen, virtualization, Linux MM, Michael S . Tsirkin,
Eric W. Biederman, Pankaj Gupta, xen-devel, Andrew Morton,
Michal Hocko, linuxppc-dev
In-Reply-To: <CAPcyv4h1nWjszkVJQgeXkUc=-nPv5=Me25BOGFQCpihUyFsD6w@mail.gmail.com>
>> Now, let's clarify what I want regarding virtio-mem:
>>
>> 1. kexec should not add virtio-mem memory to the initial firmware
>> memmap. The driver has to be in charge as discussed.
>> 2. kexec should not place kexec images onto virtio-mem memory. That
>> would end badly.
>> 3. kexec should still dump virtio-mem memory via kdump.
>
> Ok, but that then seems to say to me that dax/kmem is a different type
> of driver-managed memory than virtio-mem, and it's confusing to try to
> apply the same meaning. Why not just name your type for the distinct
> type it is, "System RAM (virtio-mem)", and let any other driver-managed
> memory follow the same "System RAM ($driver)" format if it wants?
I had the same idea but discarded it because it seemed to uglify the
add_memory() interface (passing yet another parameter only relevant for
driver managed memory). Maybe we really want a new one, because I like
that idea:
/*
* Add special, driver-managed memory to the system as System RAM.
* The resource_name is expected to have the name format "System RAM
* ($DRIVER)", so user space (esp. kexec-tools) can special-case it.
*
* For this memory, no entries in /sys/firmware/memmap are created,
* as this memory won't be part of the raw firmware-provided memory map
* e.g., after a reboot. Also, the created memory resource is flagged
* with IORESOURCE_MEM_DRIVER_MANAGED, so in-kernel users can special-
* case this memory (e.g., not place kexec images onto it).
*/
int add_memory_driver_managed(int nid, u64 start, u64 size,
const char *resource_name);
If we ever have to special-case it even more in the kernel, we could
allow specifying further resource flags. While passing the driver name
instead of the resource_name would be an option, this way we don't have
to hand-craft new resource strings for added memory resources.
Thoughts?
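
To illustrate how user space could special-case the proposed name format,
here is a minimal sketch of a parser for "System RAM ($DRIVER)". The helper
name parse_driver_managed() and its signature are hypothetical; nothing
like it is claimed to exist in kexec-tools.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: if name has the form "System RAM ($DRIVER)",
 * copy $DRIVER into drv (capacity drv_len) and return true.
 * Plain "System RAM" and any other resource name return false.
 */
static bool parse_driver_managed(const char *name, char *drv, size_t drv_len)
{
	static const char prefix[] = "System RAM (";
	const size_t plen = sizeof(prefix) - 1;
	size_t nlen, dlen;

	assert(name != NULL && drv != NULL);

	nlen = strlen(name);
	if (nlen < plen + 2)	/* prefix + at least one char + ')' */
		return false;
	if (strncmp(name, prefix, plen) != 0 || name[nlen - 1] != ')')
		return false;

	dlen = nlen - plen - 1;	/* length of the driver name itself */
	if (dlen >= drv_len)	/* no room for the terminating NUL */
		return false;

	memcpy(drv, name + plen, dlen);
	drv[dlen] = '\0';
	return true;
}
```

A caller walking /proc/iomem could feed each resource name through such a
helper and, for example, avoid placing kexec images on any range for which
it returns true.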
--
Thanks,
David / dhildenb
* [powerpc:fixes-test] BUILD SUCCESS e2abb0f00606ece8b191679bbc3f9246738fb88e
From: kbuild test robot @ 2020-05-02 11:05 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git fixes-test
branch HEAD: e2abb0f00606ece8b191679bbc3f9246738fb88e Merge KUAP fix from topic/uaccess-ppc into fixes-test
elapsed time: 689m
configs tested: 204
configs skipped: 0
The following configs have been built successfully.
More configs may be tested in the coming days.
arm64 allyesconfig
arm allyesconfig
arm64 allmodconfig
arm allmodconfig
arm64 allnoconfig
arm allnoconfig
arm efm32_defconfig
arm at91_dt_defconfig
arm shmobile_defconfig
arm64 defconfig
arm exynos_defconfig
arm multi_v5_defconfig
arm sunxi_defconfig
arm multi_v7_defconfig
sparc allyesconfig
powerpc defconfig
ia64 defconfig
arc defconfig
mips ar7_defconfig
mips ath79_defconfig
mips allmodconfig
nios2 3c120_defconfig
sparc64 defconfig
csky defconfig
sh rsk7269_defconfig
ia64 allnoconfig
nds32 allnoconfig
m68k sun3_defconfig
i386 allnoconfig
i386 allyesconfig
i386 alldefconfig
i386 defconfig
i386 debian-10.3
ia64 allmodconfig
ia64 generic_defconfig
ia64 tiger_defconfig
ia64 bigsur_defconfig
ia64 allyesconfig
ia64 alldefconfig
m68k m5475evb_defconfig
m68k allmodconfig
m68k bvme6000_defconfig
m68k multi_defconfig
nios2 10m50_defconfig
c6x evmc6678_defconfig
c6x allyesconfig
openrisc simple_smp_defconfig
openrisc or1ksim_defconfig
nds32 defconfig
alpha defconfig
h8300 h8s-sim_defconfig
h8300 edosk2674_defconfig
xtensa iss_defconfig
h8300 h8300h-sim_defconfig
xtensa common_defconfig
arc allyesconfig
microblaze mmu_defconfig
microblaze nommu_defconfig
mips fuloong2e_defconfig
mips malta_kvm_defconfig
mips allyesconfig
mips 64r6el_defconfig
mips allnoconfig
mips 32r2_defconfig
mips malta_kvm_guest_defconfig
mips tb0287_defconfig
mips capcella_defconfig
mips ip32_defconfig
mips decstation_64_defconfig
mips loongson3_defconfig
mips bcm63xx_defconfig
parisc allnoconfig
parisc generic-64bit_defconfig
parisc generic-32bit_defconfig
parisc allyesconfig
parisc allmodconfig
powerpc chrp32_defconfig
powerpc holly_defconfig
powerpc ppc64_defconfig
powerpc rhel-kconfig
powerpc allnoconfig
powerpc mpc866_ads_defconfig
powerpc amigaone_defconfig
powerpc adder875_defconfig
powerpc ep8248e_defconfig
powerpc g5_defconfig
powerpc mpc512x_defconfig
m68k randconfig-a001-20200502
mips randconfig-a001-20200502
nds32 randconfig-a001-20200502
alpha randconfig-a001-20200502
parisc randconfig-a001-20200502
riscv randconfig-a001-20200502
h8300 randconfig-a001-20200502
nios2 randconfig-a001-20200502
microblaze randconfig-a001-20200502
c6x randconfig-a001-20200502
sparc64 randconfig-a001-20200502
s390 randconfig-a001-20200502
xtensa randconfig-a001-20200502
sh randconfig-a001-20200502
openrisc randconfig-a001-20200502
csky randconfig-a001-20200502
x86_64 randconfig-a003-20200502
x86_64 randconfig-a001-20200502
x86_64 randconfig-a002-20200502
i386 randconfig-a002-20200502
i386 randconfig-a003-20200502
i386 randconfig-a001-20200502
i386 randconfig-b003-20200502
i386 randconfig-b001-20200502
x86_64 randconfig-b003-20200502
x86_64 randconfig-b001-20200502
i386 randconfig-b002-20200502
i386 randconfig-b003-20200501
x86_64 randconfig-b002-20200501
i386 randconfig-b001-20200501
x86_64 randconfig-b003-20200501
x86_64 randconfig-b001-20200501
i386 randconfig-b002-20200501
x86_64 randconfig-c002-20200502
i386 randconfig-c002-20200502
i386 randconfig-c001-20200502
i386 randconfig-c003-20200502
i386 randconfig-d003-20200502
i386 randconfig-d001-20200502
x86_64 randconfig-d002-20200502
i386 randconfig-d002-20200502
x86_64 randconfig-e003-20200502
i386 randconfig-e003-20200502
x86_64 randconfig-e001-20200502
i386 randconfig-e002-20200502
i386 randconfig-e001-20200502
x86_64 randconfig-e002-20200430
i386 randconfig-e003-20200430
x86_64 randconfig-e003-20200430
i386 randconfig-e002-20200430
x86_64 randconfig-e001-20200430
i386 randconfig-e001-20200430
i386 randconfig-f003-20200502
x86_64 randconfig-f001-20200502
x86_64 randconfig-f003-20200502
x86_64 randconfig-f002-20200502
i386 randconfig-f001-20200502
i386 randconfig-f002-20200502
x86_64 randconfig-g003-20200502
i386 randconfig-g003-20200502
i386 randconfig-g002-20200502
x86_64 randconfig-g001-20200502
x86_64 randconfig-g002-20200502
i386 randconfig-g001-20200502
i386 randconfig-h001-20200502
i386 randconfig-h002-20200502
i386 randconfig-h003-20200502
x86_64 randconfig-h002-20200502
x86_64 randconfig-h001-20200502
x86_64 randconfig-h003-20200502
i386 randconfig-h001-20200501
i386 randconfig-h002-20200501
i386 randconfig-h003-20200501
x86_64 randconfig-h001-20200501
x86_64 randconfig-h003-20200501
ia64 randconfig-a001-20200502
arm64 randconfig-a001-20200502
arc randconfig-a001-20200502
powerpc randconfig-a001-20200502
arm randconfig-a001-20200502
sparc randconfig-a001-20200502
ia64 randconfig-a001-20200501
arc randconfig-a001-20200501
powerpc randconfig-a001-20200501
arm randconfig-a001-20200501
sparc randconfig-a001-20200501
riscv allyesconfig
riscv nommu_virt_defconfig
riscv allnoconfig
riscv defconfig
riscv rv32_defconfig
riscv allmodconfig
s390 zfcpdump_defconfig
s390 debug_defconfig
s390 allyesconfig
s390 allnoconfig
s390 allmodconfig
s390 alldefconfig
s390 defconfig
sh allmodconfig
sh titan_defconfig
sh sh7785lcr_32bit_defconfig
sh allnoconfig
sparc defconfig
sparc64 allnoconfig
sparc64 allyesconfig
sparc64 allmodconfig
um x86_64_defconfig
um i386_defconfig
um defconfig
x86_64 rhel
x86_64 rhel-7.6
x86_64 rhel-7.6-kselftests
x86_64 rhel-7.2-clear
x86_64 lkp
x86_64 fedora-25
x86_64 kexec
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* [PATCH v2 00/12] powerpc/book3s/64/pkeys: Simplify the code
From: Aneesh Kumar K.V @ 2020-05-02 11:13 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, linuxram
This patch series updates the pkey subsystem with more documentation and
renames variables so that the code is easier to follow. The last patch
fixes a problem where we treat keys above max_pkey as available.
Userspace is not impacted, because using such a key in pkey_mprotect()
returns an error due to the limit check there. Also, the UAMOR value set by
the platform is such that it denies modification of keys above max_pkey.
Changes from V1:
* Rebase to the latest kernel.
* Added two new patches 6 and 12.
Aneesh Kumar K.V (12):
powerpc/book3s64/pkeys: Fixup bit numbering
powerpc/book3s64/pkeys: pkeys are supported only on hash on book3s.
powerpc/book3s64/pkeys: Move pkey related bits in the linux page table
powerpc/book3s64/pkeys: Explain key 1 reservation details
powerpc/book3s64/pkeys: Simplify the key initialization
powerpc/book3s64/pkeys: Prevent key 1 modification from userspace.
powerpc/book3s64/pkeys: kill cpu feature key CPU_FTR_PKEY
powerpc/book3s64/pkeys: Convert execute key support to static key
powerpc/book3s64/pkeys: Simplify pkey disable branch
powerpc/book3s64/pkeys: Convert pkey_total to max_pkey
powerpc/book3s64/pkeys: Make initial_allocation_mask static
powerpc/book3s64/pkeys: Mark all the pkeys above max pkey as reserved
arch/powerpc/include/asm/book3s/64/hash-4k.h | 21 +-
arch/powerpc/include/asm/book3s/64/hash-64k.h | 12 +-
.../powerpc/include/asm/book3s/64/hash-pkey.h | 32 +++
arch/powerpc/include/asm/book3s/64/mmu-hash.h | 8 +-
arch/powerpc/include/asm/book3s/64/pgtable.h | 17 +-
arch/powerpc/include/asm/book3s/64/pkeys.h | 25 +++
arch/powerpc/include/asm/cputable.h | 10 +-
arch/powerpc/include/asm/pkeys.h | 43 +---
arch/powerpc/kernel/dt_cpu_ftrs.c | 6 -
arch/powerpc/mm/book3s64/pkeys.c | 210 ++++++++++--------
10 files changed, 222 insertions(+), 162 deletions(-)
create mode 100644 arch/powerpc/include/asm/book3s/64/hash-pkey.h
create mode 100644 arch/powerpc/include/asm/book3s/64/pkeys.h
--
2.26.2
* [PATCH v2 01/12] powerpc/book3s64/pkeys: Fixup bit numbering
From: Aneesh Kumar K.V @ 2020-05-02 11:13 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, linuxram
In-Reply-To: <20200502111347.541836-1-aneesh.kumar@linux.ibm.com>
This numbers the pkey bits such that they are easy to follow: PKEY_BIT0 is
the lowest-order bit. This makes further changes easier to follow.
No functional change in this patch, other than that the Linux page table
for hash translation now maps pkeys differently.
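
The numbering convention can be illustrated with a small stand-alone
sketch. The PTE bit positions below are assumptions chosen for the example
(in the patch they are the _RPAGE_RSV* bits), and pkey_to_pte() is a
hypothetical inverse added only to check the round trip. The property shown
is that PKEY_BITn contributes 1 << n to the key value.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative positions only (assumption); not the real PTE layout. */
#define PTE_PKEY_BIT4	(1ULL << 62)
#define PTE_PKEY_BIT3	(1ULL << 61)
#define PTE_PKEY_BIT2	(1ULL << 60)
#define PTE_PKEY_BIT1	(1ULL << 59)
#define PTE_PKEY_BIT0	(1ULL << 58)

/* Mirrors the renumbered pte_to_pkey_bits(): BITn maps to 1 << n. */
static inline uint16_t pte_to_pkey(uint64_t pteflags)
{
	return (uint16_t)(((pteflags & PTE_PKEY_BIT4) ? 0x10 : 0x0) |
			  ((pteflags & PTE_PKEY_BIT3) ? 0x8 : 0x0) |
			  ((pteflags & PTE_PKEY_BIT2) ? 0x4 : 0x0) |
			  ((pteflags & PTE_PKEY_BIT1) ? 0x2 : 0x0) |
			  ((pteflags & PTE_PKEY_BIT0) ? 0x1 : 0x0));
}

/* Hypothetical inverse: encode a 5-bit key into the PTE flag bits. */
static inline uint64_t pkey_to_pte(uint16_t pkey)
{
	assert(pkey < 32);	/* only 5 key bits exist */
	return ((pkey & 0x10) ? PTE_PKEY_BIT4 : 0) |
	       ((pkey & 0x8) ? PTE_PKEY_BIT3 : 0) |
	       ((pkey & 0x4) ? PTE_PKEY_BIT2 : 0) |
	       ((pkey & 0x2) ? PTE_PKEY_BIT1 : 0) |
	       ((pkey & 0x1) ? PTE_PKEY_BIT0 : 0);
}
```

With this numbering the same index appears on both sides of every mapping,
which is what makes the later patches easier to read.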
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/include/asm/book3s/64/hash-4k.h | 9 +++----
arch/powerpc/include/asm/book3s/64/hash-64k.h | 8 +++----
arch/powerpc/include/asm/book3s/64/mmu-hash.h | 8 +++----
arch/powerpc/include/asm/pkeys.h | 24 +++++++++----------
4 files changed, 25 insertions(+), 24 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 3f9ae3585ab9..f889d56bf8cf 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -57,11 +57,12 @@
#define H_PMD_FRAG_NR (PAGE_SIZE >> H_PMD_FRAG_SIZE_SHIFT)
/* memory key bits, only 8 keys supported */
-#define H_PTE_PKEY_BIT0 0
-#define H_PTE_PKEY_BIT1 0
+#define H_PTE_PKEY_BIT4 0
+#define H_PTE_PKEY_BIT3 0
#define H_PTE_PKEY_BIT2 _RPAGE_RSV3
-#define H_PTE_PKEY_BIT3 _RPAGE_RSV4
-#define H_PTE_PKEY_BIT4 _RPAGE_RSV5
+#define H_PTE_PKEY_BIT1 _RPAGE_RSV4
+#define H_PTE_PKEY_BIT0 _RPAGE_RSV5
+
/*
* On all 4K setups, remap_4k_pfn() equates to remap_pfn_range()
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 0729c034e56f..0a15fd14cf72 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -36,11 +36,11 @@
#define H_PAGE_HASHPTE _RPAGE_RPN43 /* PTE has associated HPTE */
/* memory key bits. */
-#define H_PTE_PKEY_BIT0 _RPAGE_RSV1
-#define H_PTE_PKEY_BIT1 _RPAGE_RSV2
+#define H_PTE_PKEY_BIT4 _RPAGE_RSV1
+#define H_PTE_PKEY_BIT3 _RPAGE_RSV2
#define H_PTE_PKEY_BIT2 _RPAGE_RSV3
-#define H_PTE_PKEY_BIT3 _RPAGE_RSV4
-#define H_PTE_PKEY_BIT4 _RPAGE_RSV5
+#define H_PTE_PKEY_BIT1 _RPAGE_RSV4
+#define H_PTE_PKEY_BIT0 _RPAGE_RSV5
/*
* We need to differentiate between explicit huge page and THP huge
diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 3fa1b962dc27..58fcc959f9d5 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -86,8 +86,8 @@
#define HPTE_R_PP0 ASM_CONST(0x8000000000000000)
#define HPTE_R_TS ASM_CONST(0x4000000000000000)
#define HPTE_R_KEY_HI ASM_CONST(0x3000000000000000)
-#define HPTE_R_KEY_BIT0 ASM_CONST(0x2000000000000000)
-#define HPTE_R_KEY_BIT1 ASM_CONST(0x1000000000000000)
+#define HPTE_R_KEY_BIT4 ASM_CONST(0x2000000000000000)
+#define HPTE_R_KEY_BIT3 ASM_CONST(0x1000000000000000)
#define HPTE_R_RPN_SHIFT 12
#define HPTE_R_RPN ASM_CONST(0x0ffffffffffff000)
#define HPTE_R_RPN_3_0 ASM_CONST(0x01fffffffffff000)
@@ -103,8 +103,8 @@
#define HPTE_R_R ASM_CONST(0x0000000000000100)
#define HPTE_R_KEY_LO ASM_CONST(0x0000000000000e00)
#define HPTE_R_KEY_BIT2 ASM_CONST(0x0000000000000800)
-#define HPTE_R_KEY_BIT3 ASM_CONST(0x0000000000000400)
-#define HPTE_R_KEY_BIT4 ASM_CONST(0x0000000000000200)
+#define HPTE_R_KEY_BIT1 ASM_CONST(0x0000000000000400)
+#define HPTE_R_KEY_BIT0 ASM_CONST(0x0000000000000200)
#define HPTE_R_KEY (HPTE_R_KEY_LO | HPTE_R_KEY_HI)
#define HPTE_V_1TB_SEG ASM_CONST(0x4000000000000000)
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 20ebf153c871..f8f4d0793789 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -35,11 +35,11 @@ static inline u64 vmflag_to_pte_pkey_bits(u64 vm_flags)
if (static_branch_likely(&pkey_disabled))
return 0x0UL;
- return (((vm_flags & VM_PKEY_BIT0) ? H_PTE_PKEY_BIT4 : 0x0UL) |
- ((vm_flags & VM_PKEY_BIT1) ? H_PTE_PKEY_BIT3 : 0x0UL) |
+ return (((vm_flags & VM_PKEY_BIT0) ? H_PTE_PKEY_BIT0 : 0x0UL) |
+ ((vm_flags & VM_PKEY_BIT1) ? H_PTE_PKEY_BIT1 : 0x0UL) |
((vm_flags & VM_PKEY_BIT2) ? H_PTE_PKEY_BIT2 : 0x0UL) |
- ((vm_flags & VM_PKEY_BIT3) ? H_PTE_PKEY_BIT1 : 0x0UL) |
- ((vm_flags & VM_PKEY_BIT4) ? H_PTE_PKEY_BIT0 : 0x0UL));
+ ((vm_flags & VM_PKEY_BIT3) ? H_PTE_PKEY_BIT3 : 0x0UL) |
+ ((vm_flags & VM_PKEY_BIT4) ? H_PTE_PKEY_BIT4 : 0x0UL));
}
static inline int vma_pkey(struct vm_area_struct *vma)
@@ -53,20 +53,20 @@ static inline int vma_pkey(struct vm_area_struct *vma)
static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
{
- return (((pteflags & H_PTE_PKEY_BIT0) ? HPTE_R_KEY_BIT0 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT1) ? HPTE_R_KEY_BIT1 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT2) ? HPTE_R_KEY_BIT2 : 0x0UL) |
+ return (((pteflags & H_PTE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL) |
((pteflags & H_PTE_PKEY_BIT3) ? HPTE_R_KEY_BIT3 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL));
+ ((pteflags & H_PTE_PKEY_BIT2) ? HPTE_R_KEY_BIT2 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT1) ? HPTE_R_KEY_BIT1 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT0) ? HPTE_R_KEY_BIT0 : 0x0UL));
}
static inline u16 pte_to_pkey_bits(u64 pteflags)
{
- return (((pteflags & H_PTE_PKEY_BIT0) ? 0x10 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT1) ? 0x8 : 0x0UL) |
+ return (((pteflags & H_PTE_PKEY_BIT4) ? 0x10 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT3) ? 0x8 : 0x0UL) |
((pteflags & H_PTE_PKEY_BIT2) ? 0x4 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT3) ? 0x2 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT4) ? 0x1 : 0x0UL));
+ ((pteflags & H_PTE_PKEY_BIT1) ? 0x2 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT0) ? 0x1 : 0x0UL));
}
#define pkey_alloc_mask(pkey) (0x1 << pkey)
--
2.26.2
* [PATCH v2 02/12] powerpc/book3s64/pkeys: pkeys are supported only on hash on book3s.
From: Aneesh Kumar K.V @ 2020-05-02 11:13 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, linuxram
In-Reply-To: <20200502111347.541836-1-aneesh.kumar@linux.ibm.com>
Move them to a hash-specific file and add BUG() for the radix path.
---
.../powerpc/include/asm/book3s/64/hash-pkey.h | 32 ++++++++++++++++
arch/powerpc/include/asm/book3s/64/pkeys.h | 25 +++++++++++++
arch/powerpc/include/asm/pkeys.h | 37 ++++---------------
3 files changed, 64 insertions(+), 30 deletions(-)
create mode 100644 arch/powerpc/include/asm/book3s/64/hash-pkey.h
create mode 100644 arch/powerpc/include/asm/book3s/64/pkeys.h
diff --git a/arch/powerpc/include/asm/book3s/64/hash-pkey.h b/arch/powerpc/include/asm/book3s/64/hash-pkey.h
new file mode 100644
index 000000000000..795010897e5d
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/64/hash-pkey.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_POWERPC_BOOK3S_64_HASH_PKEY_H
+#define _ASM_POWERPC_BOOK3S_64_HASH_PKEY_H
+
+static inline u64 hash__vmflag_to_pte_pkey_bits(u64 vm_flags)
+{
+ return (((vm_flags & VM_PKEY_BIT0) ? H_PTE_PKEY_BIT0 : 0x0UL) |
+ ((vm_flags & VM_PKEY_BIT1) ? H_PTE_PKEY_BIT1 : 0x0UL) |
+ ((vm_flags & VM_PKEY_BIT2) ? H_PTE_PKEY_BIT2 : 0x0UL) |
+ ((vm_flags & VM_PKEY_BIT3) ? H_PTE_PKEY_BIT3 : 0x0UL) |
+ ((vm_flags & VM_PKEY_BIT4) ? H_PTE_PKEY_BIT4 : 0x0UL));
+}
+
+static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
+{
+ return (((pteflags & H_PTE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT3) ? HPTE_R_KEY_BIT3 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT2) ? HPTE_R_KEY_BIT2 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT1) ? HPTE_R_KEY_BIT1 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT0) ? HPTE_R_KEY_BIT0 : 0x0UL));
+}
+
+static inline u16 hash__pte_to_pkey_bits(u64 pteflags)
+{
+ return (((pteflags & H_PTE_PKEY_BIT4) ? 0x10 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT3) ? 0x8 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT2) ? 0x4 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT1) ? 0x2 : 0x0UL) |
+ ((pteflags & H_PTE_PKEY_BIT0) ? 0x1 : 0x0UL));
+}
+
+#endif
diff --git a/arch/powerpc/include/asm/book3s/64/pkeys.h b/arch/powerpc/include/asm/book3s/64/pkeys.h
new file mode 100644
index 000000000000..8174662a9173
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/64/pkeys.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+
+#ifndef _ASM_POWERPC_BOOK3S_64_PKEYS_H
+#define _ASM_POWERPC_BOOK3S_64_PKEYS_H
+
+#include <asm/book3s/64/hash-pkey.h>
+
+static inline u64 vmflag_to_pte_pkey_bits(u64 vm_flags)
+{
+ if (static_branch_likely(&pkey_disabled))
+ return 0x0UL;
+
+ if (radix_enabled())
+ BUG();
+ return hash__vmflag_to_pte_pkey_bits(vm_flags);
+}
+
+static inline u16 pte_to_pkey_bits(u64 pteflags)
+{
+ if (radix_enabled())
+ BUG();
+ return hash__pte_to_pkey_bits(pteflags);
+}
+
+#endif /* _ASM_POWERPC_BOOK3S_64_PKEYS_H */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index f8f4d0793789..5dd0a79d1809 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -25,23 +25,18 @@ extern u32 reserved_allocation_mask; /* bits set for reserved keys */
PKEY_DISABLE_WRITE | \
PKEY_DISABLE_EXECUTE)
+#ifdef CONFIG_PPC_BOOK3S_64
+#include <asm/book3s/64/pkeys.h>
+#else
+#error "Not supported"
+#endif
+
+
static inline u64 pkey_to_vmflag_bits(u16 pkey)
{
return (((u64)pkey << VM_PKEY_SHIFT) & ARCH_VM_PKEY_FLAGS);
}
-static inline u64 vmflag_to_pte_pkey_bits(u64 vm_flags)
-{
- if (static_branch_likely(&pkey_disabled))
- return 0x0UL;
-
- return (((vm_flags & VM_PKEY_BIT0) ? H_PTE_PKEY_BIT0 : 0x0UL) |
- ((vm_flags & VM_PKEY_BIT1) ? H_PTE_PKEY_BIT1 : 0x0UL) |
- ((vm_flags & VM_PKEY_BIT2) ? H_PTE_PKEY_BIT2 : 0x0UL) |
- ((vm_flags & VM_PKEY_BIT3) ? H_PTE_PKEY_BIT3 : 0x0UL) |
- ((vm_flags & VM_PKEY_BIT4) ? H_PTE_PKEY_BIT4 : 0x0UL));
-}
-
static inline int vma_pkey(struct vm_area_struct *vma)
{
if (static_branch_likely(&pkey_disabled))
@@ -51,24 +46,6 @@ static inline int vma_pkey(struct vm_area_struct *vma)
#define arch_max_pkey() pkeys_total
-static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
-{
- return (((pteflags & H_PTE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT3) ? HPTE_R_KEY_BIT3 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT2) ? HPTE_R_KEY_BIT2 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT1) ? HPTE_R_KEY_BIT1 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT0) ? HPTE_R_KEY_BIT0 : 0x0UL));
-}
-
-static inline u16 pte_to_pkey_bits(u64 pteflags)
-{
- return (((pteflags & H_PTE_PKEY_BIT4) ? 0x10 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT3) ? 0x8 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT2) ? 0x4 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT1) ? 0x2 : 0x0UL) |
- ((pteflags & H_PTE_PKEY_BIT0) ? 0x1 : 0x0UL));
-}
-
#define pkey_alloc_mask(pkey) (0x1 << pkey)
#define mm_pkey_allocation_map(mm) (mm->context.pkey_allocation_map)
--
2.26.2
* [PATCH v2 03/12] powerpc/book3s64/pkeys: Move pkey related bits in the linux page table
From: Aneesh Kumar K.V @ 2020-05-02 11:13 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, linuxram
In-Reply-To: <20200502111347.541836-1-aneesh.kumar@linux.ibm.com>
To keep things simple, keep all the pkey-related bits together in the Linux
page table for the 64K config with hash translation. With hash-4k, the kernel
requires 4 bits to store the hash slot details. This is done by overloading
some of the RPN bits. Because of this, the PKEY_BIT0 position on the 4K config
is used for storing hash slot details.
64K before
|....|RSV1| RSV2| RSV3 | RSV4 | RPN44| RPN43 |.... | RSV5|
|....| P4 | P3 | P2 | P1 | Busy | HASHPTE |.... | P0 |
after
|....|RSV1| RSV2| RSV3 | RSV4 | RPN44 | RPN43 |.... | RSV5 |
|....| P4 | P3 | P2 | P1 | P0 | HASHPTE |.... | Busy |
4k before
|....| RSV1 | RSV2 | RSV3 | RSV4 | RPN44| RPN43.... | RSV5|
|....| Busy | HASHPTE | P2 | P1 | F_SEC| F_GIX.... | P0 |
after
|....| RSV1 | RSV2| RSV3 | RSV4 | Free | RPN43.... | RSV5 |
|....| HASHPTE | P2 | P1 | P0 | F_SEC| F_GIX.... | BUSY |
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
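Note (illustration only, not part of the patch): with the 64K layout after
this change, P4..P0 occupy adjacent bits, so the key can in principle be read
with one shift and mask instead of five bit tests. The sketch below is a
user-space model under that assumption. The _RPAGE_PKEY_BIT* values are copied
from the pgtable.h hunk; PKEY_SHIFT, PKEY_MASK, pte_set_pkey(),
pte_pkey_bitwise() and pte_pkey_shift() are hypothetical names that do not
exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the pgtable.h hunk of this patch. */
#define _RPAGE_PKEY_BIT4 0x1000000000000000ULL
#define _RPAGE_PKEY_BIT3 0x0800000000000000ULL
#define _RPAGE_PKEY_BIT2 0x0400000000000000ULL
#define _RPAGE_PKEY_BIT1 0x0200000000000000ULL
#define _RPAGE_PKEY_BIT0 0x0100000000000000ULL

/* Hypothetical helpers, used only by this sketch. */
#define PKEY_SHIFT 56 /* bit position of _RPAGE_PKEY_BIT0 */
#define PKEY_MASK  (_RPAGE_PKEY_BIT4 | _RPAGE_PKEY_BIT3 | _RPAGE_PKEY_BIT2 | \
		    _RPAGE_PKEY_BIT1 | _RPAGE_PKEY_BIT0)

/* Store a 5-bit key in the contiguous pkey field, keeping other PTE bits. */
static uint64_t pte_set_pkey(uint64_t pte, unsigned int pkey)
{
	assert(pkey < 32);
	return (pte & ~PKEY_MASK) | ((uint64_t)pkey << PKEY_SHIFT);
}

/* Bit-by-bit extraction, the same shape as pte_to_pkey_bits(). */
static unsigned int pte_pkey_bitwise(uint64_t pte)
{
	return ((pte & _RPAGE_PKEY_BIT4) ? 0x10u : 0x0u) |
	       ((pte & _RPAGE_PKEY_BIT3) ? 0x8u : 0x0u) |
	       ((pte & _RPAGE_PKEY_BIT2) ? 0x4u : 0x0u) |
	       ((pte & _RPAGE_PKEY_BIT1) ? 0x2u : 0x0u) |
	       ((pte & _RPAGE_PKEY_BIT0) ? 0x1u : 0x0u);
}

/* Shift-and-mask extraction; valid only because the five bits are adjacent. */
static unsigned int pte_pkey_shift(uint64_t pte)
{
	return (unsigned int)((pte & PKEY_MASK) >> PKEY_SHIFT);
}
```

Both extraction helpers should agree for every key value 0..31. This holds
for the 64K layout only; on 4K the PKEY_BIT0 and PKEY_BIT4 positions carry
hash slot state, so the shortcut would not apply there.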
arch/powerpc/include/asm/book3s/64/hash-4k.h | 16 ++++++++--------
arch/powerpc/include/asm/book3s/64/hash-64k.h | 12 ++++++------
arch/powerpc/include/asm/book3s/64/pgtable.h | 17 ++++++++---------
3 files changed, 22 insertions(+), 23 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index f889d56bf8cf..082b98808701 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -34,11 +34,11 @@
#define H_PUD_TABLE_SIZE (sizeof(pud_t) << H_PUD_INDEX_SIZE)
#define H_PGD_TABLE_SIZE (sizeof(pgd_t) << H_PGD_INDEX_SIZE)
-#define H_PAGE_F_GIX_SHIFT 53
-#define H_PAGE_F_SECOND _RPAGE_RPN44 /* HPTE is in 2ndary HPTEG */
-#define H_PAGE_F_GIX (_RPAGE_RPN43 | _RPAGE_RPN42 | _RPAGE_RPN41)
-#define H_PAGE_BUSY _RPAGE_RSV1 /* software: PTE & hash are busy */
-#define H_PAGE_HASHPTE _RPAGE_RSV2 /* software: PTE & hash are busy */
+#define H_PAGE_F_GIX_SHIFT _PAGE_PA_MAX
+#define H_PAGE_F_SECOND _RPAGE_PKEY_BIT0 /* HPTE is in 2ndary HPTEG */
+#define H_PAGE_F_GIX (_RPAGE_RPN43 | _RPAGE_RPN42 | _RPAGE_RPN41)
+#define H_PAGE_BUSY _RPAGE_RSV1
+#define H_PAGE_HASHPTE _RPAGE_PKEY_BIT4
/* PTE flags to conserve for HPTE identification */
#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | \
@@ -59,9 +59,9 @@
/* memory key bits, only 8 keys supported */
#define H_PTE_PKEY_BIT4 0
#define H_PTE_PKEY_BIT3 0
-#define H_PTE_PKEY_BIT2 _RPAGE_RSV3
-#define H_PTE_PKEY_BIT1 _RPAGE_RSV4
-#define H_PTE_PKEY_BIT0 _RPAGE_RSV5
+#define H_PTE_PKEY_BIT2 _RPAGE_PKEY_BIT3
+#define H_PTE_PKEY_BIT1 _RPAGE_PKEY_BIT2
+#define H_PTE_PKEY_BIT0 _RPAGE_PKEY_BIT1
/*
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 0a15fd14cf72..f20de1149ebe 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -32,15 +32,15 @@
*/
#define H_PAGE_COMBO _RPAGE_RPN0 /* this is a combo 4k page */
#define H_PAGE_4K_PFN _RPAGE_RPN1 /* PFN is for a single 4k page */
-#define H_PAGE_BUSY _RPAGE_RPN44 /* software: PTE & hash are busy */
+#define H_PAGE_BUSY _RPAGE_RSV1 /* software: PTE & hash are busy */
#define H_PAGE_HASHPTE _RPAGE_RPN43 /* PTE has associated HPTE */
/* memory key bits. */
-#define H_PTE_PKEY_BIT4 _RPAGE_RSV1
-#define H_PTE_PKEY_BIT3 _RPAGE_RSV2
-#define H_PTE_PKEY_BIT2 _RPAGE_RSV3
-#define H_PTE_PKEY_BIT1 _RPAGE_RSV4
-#define H_PTE_PKEY_BIT0 _RPAGE_RSV5
+#define H_PTE_PKEY_BIT4 _RPAGE_PKEY_BIT4
+#define H_PTE_PKEY_BIT3 _RPAGE_PKEY_BIT3
+#define H_PTE_PKEY_BIT2 _RPAGE_PKEY_BIT2
+#define H_PTE_PKEY_BIT1 _RPAGE_PKEY_BIT1
+#define H_PTE_PKEY_BIT0 _RPAGE_PKEY_BIT0
/*
* We need to differentiate between explicit huge page and THP huge
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 368b136517e0..e31369707f9f 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -32,11 +32,13 @@
#define _RPAGE_SW1 0x00800
#define _RPAGE_SW2 0x00400
#define _RPAGE_SW3 0x00200
-#define _RPAGE_RSV1 0x1000000000000000UL
-#define _RPAGE_RSV2 0x0800000000000000UL
-#define _RPAGE_RSV3 0x0400000000000000UL
-#define _RPAGE_RSV4 0x0200000000000000UL
-#define _RPAGE_RSV5 0x00040UL
+#define _RPAGE_RSV1 0x00040UL
+
+#define _RPAGE_PKEY_BIT4 0x1000000000000000UL
+#define _RPAGE_PKEY_BIT3 0x0800000000000000UL
+#define _RPAGE_PKEY_BIT2 0x0400000000000000UL
+#define _RPAGE_PKEY_BIT1 0x0200000000000000UL
+#define _RPAGE_PKEY_BIT0 0x0100000000000000UL
#define _PAGE_PTE 0x4000000000000000UL /* distinguishes PTEs from pointers */
#define _PAGE_PRESENT 0x8000000000000000UL /* pte contains a translation */
@@ -58,13 +60,12 @@
*/
#define _RPAGE_RPN0 0x01000
#define _RPAGE_RPN1 0x02000
-#define _RPAGE_RPN44 0x0100000000000000UL
#define _RPAGE_RPN43 0x0080000000000000UL
#define _RPAGE_RPN42 0x0040000000000000UL
#define _RPAGE_RPN41 0x0020000000000000UL
/* Max physical address bit as per radix table */
-#define _RPAGE_PA_MAX 57
+#define _RPAGE_PA_MAX 56
/*
* Max physical address bit we will use for now.
@@ -125,8 +126,6 @@
_PAGE_ACCESSED | _PAGE_SPECIAL | _PAGE_PTE | \
_PAGE_SOFT_DIRTY | _PAGE_DEVMAP)
-#define H_PTE_PKEY (H_PTE_PKEY_BIT0 | H_PTE_PKEY_BIT1 | H_PTE_PKEY_BIT2 | \
- H_PTE_PKEY_BIT3 | H_PTE_PKEY_BIT4)
/*
* We define 2 sets of base prot bits, one for basic pages (ie,
* cacheable kernel and user pages) and one for non cacheable
--
2.26.2
* [PATCH v2 04/12] powerpc/book3s64/pkeys: Explain key 1 reservation details
From: Aneesh Kumar K.V @ 2020-05-02 11:13 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, linuxram
In-Reply-To: <20200502111347.541836-1-aneesh.kumar@linux.ibm.com>
Add a comment explaining why key 1 is reserved.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/mm/book3s64/pkeys.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 1199fc2bfaec..d60e6bfa3e03 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -124,7 +124,10 @@ static int pkey_initialize(void)
#else
os_reserved = 0;
#endif
- /* Bits are in LE format. */
+ /*
+ * key 1 is recommended not to be used. PowerISA(3.0) page 1015,
+ * programming note.
+ */
reserved_allocation_mask = (0x1 << 1) | (0x1 << execute_only_key);
/* register mask is in BE format */
--
2.26.2
* [PATCH v2 05/12] powerpc/book3s64/pkeys: Simplify the key initialization
From: Aneesh Kumar K.V @ 2020-05-02 11:13 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, linuxram
In-Reply-To: <20200502111347.541836-1-aneesh.kumar@linux.ibm.com>
Add documentation explaining the execute_only_key. The reservation and initialization mask
details are also explained in this patch.
No functional change in this patch.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
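Note (illustration only, not part of the patch): the default register values
built by pkey_initialize() can be modelled in user space. The sketch below
follows the order of operations in this patch for key 0, key 1 and the
execute-only key, and it assumes no OS-reserved keys (the 64K case), so the
final reservation loop is left out. struct pkey_defaults and
compute_pkey_defaults() are hypothetical names; pkeyshift() mirrors the macro
in pkeys.c.

```c
#include <assert.h>
#include <stdint.h>

#define AMR_BITS_PER_PKEY 2
#define PKEY_REG_BITS     64

/* Hypothetical container for the values the patch keeps in globals. */
struct pkey_defaults {
	uint64_t amr;
	uint64_t iamr;
	uint64_t uamor;
	uint32_t reserved_mask;
	uint32_t initial_mask;
};

/* Registers are big-endian numbered: key 0 sits in the two top bits. */
static unsigned int pkeyshift(unsigned int pkey)
{
	assert(pkey < 32);
	return PKEY_REG_BITS - (pkey + 1) * AMR_BITS_PER_PKEY;
}

/* Pass execute_only_key < 0 to model "not enough keys available". */
static struct pkey_defaults compute_pkey_defaults(int execute_only_key)
{
	struct pkey_defaults d = { .uamor = ~0ULL };

	if (execute_only_key >= 0) {
		unsigned int shift = pkeyshift((unsigned int)execute_only_key);

		/* Not available to pkey_alloc(). */
		d.reserved_mask |= UINT32_C(1) << execute_only_key;
		/* Deny read/write, allow execute, block userspace changes. */
		d.amr |= 0x3ULL << shift;
		d.iamr &= ~(0x3ULL << shift);
		d.uamor &= ~(0x3ULL << shift);
	}

	/* Key 0: full access, not modifiable, counted as preallocated. */
	d.amr &= ~(0x3ULL << pkeyshift(0));
	d.iamr &= ~(0x3ULL << pkeyshift(0));
	d.uamor &= ~(0x3ULL << pkeyshift(0));
	d.initial_mask |= UINT32_C(1) << 0;

	/* Key 1 is reserved as recommended by the ISA. */
	d.reserved_mask |= UINT32_C(1) << 1;

	/* Reserved keys must never be handed out either. */
	d.initial_mask |= d.reserved_mask;
	return d;
}
```

With execute_only_key = 2 this gives an AMR of 0x0C00000000000000 and a UAMOR
of 0x33FFFFFFFFFFFFFF, which matches the state after this patch alone. The
next patch in the series additionally clears the UAMOR bits for key 1.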
arch/powerpc/mm/book3s64/pkeys.c | 186 ++++++++++++++++++-------------
1 file changed, 107 insertions(+), 79 deletions(-)
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index d60e6bfa3e03..3db0b3cfc322 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -15,48 +15,71 @@
DEFINE_STATIC_KEY_TRUE(pkey_disabled);
int pkeys_total; /* Total pkeys as per device tree */
u32 initial_allocation_mask; /* Bits set for the initially allocated keys */
-u32 reserved_allocation_mask; /* Bits set for reserved keys */
+/*
+ * Keys marked in the reservation list cannot be allocated by userspace
+ */
+u32 reserved_allocation_mask;
static bool pkey_execute_disable_supported;
-static bool pkeys_devtree_defined; /* property exported by device tree */
-static u64 pkey_amr_mask; /* Bits in AMR not to be touched */
-static u64 pkey_iamr_mask; /* Bits in AMR not to be touched */
-static u64 pkey_uamor_mask; /* Bits in UMOR not to be touched */
+static u64 default_amr;
+static u64 default_iamr;
+/* Allow all keys to be modified by default */
+static u64 default_uamor = ~0x0UL;
+/*
+ * Key used to implement PROT_EXEC mmap. Denies READ/WRITE
+ * We pick key 2 because 0 is special key and 1 is reserved as per ISA.
+ */
static int execute_only_key = 2;
+
#define AMR_BITS_PER_PKEY 2
#define AMR_RD_BIT 0x1UL
#define AMR_WR_BIT 0x2UL
#define IAMR_EX_BIT 0x1UL
-#define PKEY_REG_BITS (sizeof(u64)*8)
+#define PKEY_REG_BITS (sizeof(u64) * 8)
#define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
-static void scan_pkey_feature(void)
+static int scan_pkey_feature(void)
{
u32 vals[2];
+ int pkeys_total = 0;
struct device_node *cpu;
+ /*
+ * Pkey is not supported with Radix translation.
+ */
+ if (radix_enabled())
+ return 0;
+
cpu = of_find_node_by_type(NULL, "cpu");
if (!cpu)
- return;
+ return 0;
if (of_property_read_u32_array(cpu,
- "ibm,processor-storage-keys", vals, 2))
- return;
+ "ibm,processor-storage-keys", vals, 2) == 0) {
+ /*
+ * Since any pkey can be used for data or execute, we will
+ * just treat all keys as equal and track them as one entity.
+ */
+ pkeys_total = vals[0];
+ /* Should we check for IAMR support FIXME!! */
+ } else {
+ /*
+ * Let's assume 32 pkeys on P8 bare metal, if its not defined by device
+ * tree. We make this exception since skiboot forgot to expose this
+ * property on power8.
+ */
+ if (!firmware_has_feature(FW_FEATURE_LPAR) &&
+ cpu_has_feature(CPU_FTRS_POWER8))
+ pkeys_total = 32;
+ }
/*
- * Since any pkey can be used for data or execute, we will just treat
- * all keys as equal and track them as one entity.
+ * Adjust the upper limit, based on the number of bits supported by
+ * arch-neutral code.
*/
- pkeys_total = vals[0];
- pkeys_devtree_defined = true;
-}
-
-static inline bool pkey_mmu_enabled(void)
-{
- if (firmware_has_feature(FW_FEATURE_LPAR))
- return pkeys_total;
- else
- return cpu_has_feature(CPU_FTR_PKEY);
+ pkeys_total = min_t(int, pkeys_total,
+ ((ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT) + 1));
+ return pkeys_total;
}
static int pkey_initialize(void)
@@ -80,31 +103,13 @@ static int pkey_initialize(void)
!= (sizeof(u64) * BITS_PER_BYTE));
/* scan the device tree for pkey feature */
- scan_pkey_feature();
-
- /*
- * Let's assume 32 pkeys on P8 bare metal, if its not defined by device
- * tree. We make this exception since skiboot forgot to expose this
- * property on power8.
- */
- if (!pkeys_devtree_defined && !firmware_has_feature(FW_FEATURE_LPAR) &&
- cpu_has_feature(CPU_FTRS_POWER8))
- pkeys_total = 32;
-
- /*
- * Adjust the upper limit, based on the number of bits supported by
- * arch-neutral code.
- */
- pkeys_total = min_t(int, pkeys_total,
- ((ARCH_VM_PKEY_FLAGS >> VM_PKEY_SHIFT)+1));
-
- if (!pkey_mmu_enabled() || radix_enabled() || !pkeys_total)
- static_branch_enable(&pkey_disabled);
- else
+ pkeys_total = scan_pkey_feature();
+ if (pkeys_total)
static_branch_disable(&pkey_disabled);
-
- if (static_branch_likely(&pkey_disabled))
+ else {
+ static_branch_enable(&pkey_disabled);
return 0;
+ }
/*
* The device tree cannot be relied to indicate support for
@@ -118,48 +123,71 @@ static int pkey_initialize(void)
#ifdef CONFIG_PPC_4K_PAGES
/*
* The OS can manage only 8 pkeys due to its inability to represent them
- * in the Linux 4K PTE.
+ * in the Linux 4K PTE. Mark all other keys reserved.
*/
os_reserved = pkeys_total - 8;
#else
os_reserved = 0;
#endif
- /*
- * key 1 is recommended not to be used. PowerISA(3.0) page 1015,
- * programming note.
- */
- reserved_allocation_mask = (0x1 << 1) | (0x1 << execute_only_key);
-
- /* register mask is in BE format */
- pkey_amr_mask = ~0x0ul;
- pkey_amr_mask &= ~(0x3ul << pkeyshift(0));
-
- pkey_iamr_mask = ~0x0ul;
- pkey_iamr_mask &= ~(0x3ul << pkeyshift(0));
- pkey_iamr_mask &= ~(0x3ul << pkeyshift(execute_only_key));
-
- pkey_uamor_mask = ~0x0ul;
- pkey_uamor_mask &= ~(0x3ul << pkeyshift(0));
- pkey_uamor_mask &= ~(0x3ul << pkeyshift(execute_only_key));
-
- /* mark the rest of the keys as reserved and hence unavailable */
- for (i = (pkeys_total - os_reserved); i < pkeys_total; i++) {
- reserved_allocation_mask |= (0x1 << i);
- pkey_uamor_mask &= ~(0x3ul << pkeyshift(i));
- }
- initial_allocation_mask = reserved_allocation_mask | (0x1 << 0);
if (unlikely((pkeys_total - os_reserved) <= execute_only_key)) {
/*
* Insufficient number of keys to support
* execute only key. Mark it unavailable.
- * Any AMR, UAMOR, IAMR bit set for
- * this key is irrelevant since this key
- * can never be allocated.
*/
execute_only_key = -1;
+ } else {
+ /*
+ * Mark the execute_only_pkey as not available for
+ * user allocation via pkey_alloc.
+ */
+ reserved_allocation_mask |= (0x1 << execute_only_key);
+
+ /*
+ * Deny READ/WRITE for execute_only_key.
+ * Allow execute in IAMR.
+ */
+ default_amr |= (0x3ul << pkeyshift(execute_only_key));
+ default_iamr &= ~(0x3ul << pkeyshift(execute_only_key));
+
+ /*
+ * Clear the uamor bits for this key.
+ */
+ default_uamor &= ~(0x3ul << pkeyshift(execute_only_key));
}
+ /*
+ * Allow access for only key 0. And prevent any other modification.
+ */
+ default_amr &= ~(0x3ul << pkeyshift(0));
+ default_iamr &= ~(0x3ul << pkeyshift(0));
+ default_uamor &= ~(0x3ul << pkeyshift(0));
+ /*
+ * key 0 is special in that we want to consider it an allocated
+ * key which is preallocated. We don't allow changing AMR bits
+ * w.r.t key 0. But one can pkey_free(key0)
+ */
+ initial_allocation_mask |= (0x1 << 0);
+
+ /*
+ * key 1 is recommended not to be used. PowerISA(3.0) page 1015,
+ * programming note.
+ */
+ reserved_allocation_mask |= (0x1 << 1);
+
+ /*
+ * Prevent the usage of OS reserved the keys. Update UAMOR
+ * for those keys.
+ */
+ for (i = (pkeys_total - os_reserved); i < pkeys_total; i++) {
+ reserved_allocation_mask |= (0x1 << i);
+ default_uamor &= ~(0x3ul << pkeyshift(i));
+ }
+ /*
+ * Prevent the allocation of reserved keys too.
+ */
+ initial_allocation_mask |= reserved_allocation_mask;
+
return 0;
}
@@ -301,13 +329,13 @@ void thread_pkey_regs_init(struct thread_struct *thread)
if (static_branch_likely(&pkey_disabled))
return;
- thread->amr = pkey_amr_mask;
- thread->iamr = pkey_iamr_mask;
- thread->uamor = pkey_uamor_mask;
+ thread->amr = default_amr;
+ thread->iamr = default_iamr;
+ thread->uamor = default_uamor;
- write_uamor(pkey_uamor_mask);
- write_amr(pkey_amr_mask);
- write_iamr(pkey_iamr_mask);
+ write_amr(default_amr);
+ write_iamr(default_iamr);
+ write_uamor(default_uamor);
}
int __execute_only_pkey(struct mm_struct *mm)
--
2.26.2
* [PATCH v2 06/12] powerpc/book3s64/pkeys: Prevent key 1 modification from userspace.
From: Aneesh Kumar K.V @ 2020-05-02 11:13 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, linuxram
In-Reply-To: <20200502111347.541836-1-aneesh.kumar@linux.ibm.com>
Key 1 is marked reserved by the ISA. Set up UAMOR to prevent userspace from
modifying it.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/mm/book3s64/pkeys.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 3db0b3cfc322..9e68a08799ee 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -174,6 +174,7 @@ static int pkey_initialize(void)
* programming note.
*/
reserved_allocation_mask |= (0x1 << 1);
+ default_uamor &= ~(0x3ul << pkeyshift(1));
/*
* Prevent the usage of OS reserved the keys. Update UAMOR
--
2.26.2
* [PATCH v2 07/12] powerpc/book3s64/pkeys: kill cpu feature key CPU_FTR_PKEY
From: Aneesh Kumar K.V @ 2020-05-02 11:13 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, linuxram
In-Reply-To: <20200502111347.541836-1-aneesh.kumar@linux.ibm.com>
We don't use CPU_FTR_PKEY anymore. Remove the feature bit and mark it
free.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/include/asm/cputable.h | 10 +++++-----
arch/powerpc/kernel/dt_cpu_ftrs.c | 6 ------
2 files changed, 5 insertions(+), 11 deletions(-)
diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index 40a4d3c6fd99..b77f8258ee8c 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -198,7 +198,7 @@ static inline void cpu_feature_keys_init(void) { }
#define CPU_FTR_STCX_CHECKS_ADDRESS LONG_ASM_CONST(0x0000000080000000)
#define CPU_FTR_POPCNTB LONG_ASM_CONST(0x0000000100000000)
#define CPU_FTR_POPCNTD LONG_ASM_CONST(0x0000000200000000)
-#define CPU_FTR_PKEY LONG_ASM_CONST(0x0000000400000000)
+/* LONG_ASM_CONST(0x0000000400000000) Free */
#define CPU_FTR_VMX_COPY LONG_ASM_CONST(0x0000000800000000)
#define CPU_FTR_TM LONG_ASM_CONST(0x0000001000000000)
#define CPU_FTR_CFAR LONG_ASM_CONST(0x0000002000000000)
@@ -437,7 +437,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_DSCR | CPU_FTR_SAO | CPU_FTR_ASYM_SMT | \
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | \
- CPU_FTR_VMX_COPY | CPU_FTR_HAS_PPR | CPU_FTR_DABRX | CPU_FTR_PKEY)
+ CPU_FTR_VMX_COPY | CPU_FTR_HAS_PPR | CPU_FTR_DABRX )
#define CPU_FTRS_POWER8 (CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | CPU_FTR_ARCH_206 |\
CPU_FTR_MMCRA | CPU_FTR_SMT | \
@@ -447,7 +447,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \
- CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_PKEY)
+ CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP )
#define CPU_FTRS_POWER8E (CPU_FTRS_POWER8 | CPU_FTR_PMAO_BUG)
#define CPU_FTRS_POWER9 (CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | CPU_FTR_ARCH_206 |\
@@ -458,8 +458,8 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
- CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_PKEY | \
- CPU_FTR_P9_TLBIE_STQ_BUG | CPU_FTR_P9_TLBIE_ERAT_BUG | CPU_FTR_P9_TIDR)
+ CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_P9_TLBIE_STQ_BUG | \
+ CPU_FTR_P9_TLBIE_ERAT_BUG | CPU_FTR_P9_TIDR)
#define CPU_FTRS_POWER9_DD2_0 (CPU_FTRS_POWER9 | CPU_FTR_P9_RADIX_PREFETCH_BUG)
#define CPU_FTRS_POWER9_DD2_1 (CPU_FTRS_POWER9 | \
CPU_FTR_P9_RADIX_PREFETCH_BUG | \
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 36bc0d5c4f3a..120ea339ffda 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -747,12 +747,6 @@ static __init void cpufeatures_cpu_quirks(void)
}
update_tlbie_feature_flag(version);
- /*
- * PKEY was not in the initial base or feature node
- * specification, but it should become optional in the next
- * cpu feature version sequence.
- */
- cur_cpu_spec->cpu_features |= CPU_FTR_PKEY;
}
static void __init cpufeatures_setup_finished(void)
--
2.26.2
* [PATCH v2 08/12] powerpc/book3s64/pkeys: Convert execute key support to static key
From: Aneesh Kumar K.V @ 2020-05-02 11:13 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, linuxram
In-Reply-To: <20200502111347.541836-1-aneesh.kumar@linux.ibm.com>
Convert the bool to a static key like pkey_disabled.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/mm/book3s64/pkeys.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 9e68a08799ee..7d400d5a4076 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -13,13 +13,13 @@
#include <linux/of_device.h>
DEFINE_STATIC_KEY_TRUE(pkey_disabled);
+DEFINE_STATIC_KEY_FALSE(execute_pkey_disabled);
int pkeys_total; /* Total pkeys as per device tree */
u32 initial_allocation_mask; /* Bits set for the initially allocated keys */
/*
* Keys marked in the reservation list cannot be allocated by userspace
*/
u32 reserved_allocation_mask;
-static bool pkey_execute_disable_supported;
static u64 default_amr;
static u64 default_iamr;
/* Allow all keys to be modified by default */
@@ -116,9 +116,7 @@ static int pkey_initialize(void)
* execute_disable support. Instead we use a PVR check.
*/
if (pvr_version_is(PVR_POWER7) || pvr_version_is(PVR_POWER7p))
- pkey_execute_disable_supported = false;
- else
- pkey_execute_disable_supported = true;
+ static_branch_enable(&execute_pkey_disabled);
#ifdef CONFIG_PPC_4K_PAGES
/*
@@ -214,7 +212,7 @@ static inline void write_amr(u64 value)
static inline u64 read_iamr(void)
{
- if (!likely(pkey_execute_disable_supported))
+ if (static_branch_unlikely(&execute_pkey_disabled))
return 0x0UL;
return mfspr(SPRN_IAMR);
@@ -222,7 +220,7 @@ static inline u64 read_iamr(void)
static inline void write_iamr(u64 value)
{
- if (!likely(pkey_execute_disable_supported))
+ if (static_branch_unlikely(&execute_pkey_disabled))
return;
mtspr(SPRN_IAMR, value);
@@ -282,7 +280,7 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
return -EINVAL;
if (init_val & PKEY_DISABLE_EXECUTE) {
- if (!pkey_execute_disable_supported)
+ if (static_branch_unlikely(&execute_pkey_disabled))
return -EINVAL;
new_iamr_bits |= IAMR_EX_BIT;
}
--
2.26.2
* [PATCH v2 09/12] powerpc/book3s64/pkeys: Simplify pkey disable branch
From: Aneesh Kumar K.V @ 2020-05-02 11:13 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, linuxram
In-Reply-To: <20200502111347.541836-1-aneesh.kumar@linux.ibm.com>
Make the default value FALSE (pkey enabled) and set to TRUE when we
find the total number of keys supported to be zero.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/include/asm/pkeys.h | 2 +-
arch/powerpc/mm/book3s64/pkeys.c | 7 +++----
2 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 5dd0a79d1809..75d2a2c19c04 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -11,7 +11,7 @@
#include <linux/jump_label.h>
#include <asm/firmware.h>
-DECLARE_STATIC_KEY_TRUE(pkey_disabled);
+DECLARE_STATIC_KEY_FALSE(pkey_disabled);
extern int pkeys_total; /* total pkeys as per device tree */
extern u32 initial_allocation_mask; /* bits set for the initially allocated keys */
extern u32 reserved_allocation_mask; /* bits set for reserved keys */
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 7d400d5a4076..87d882a9aaf2 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -12,7 +12,7 @@
#include <linux/pkeys.h>
#include <linux/of_device.h>
-DEFINE_STATIC_KEY_TRUE(pkey_disabled);
+DEFINE_STATIC_KEY_FALSE(pkey_disabled);
DEFINE_STATIC_KEY_FALSE(execute_pkey_disabled);
int pkeys_total; /* Total pkeys as per device tree */
u32 initial_allocation_mask; /* Bits set for the initially allocated keys */
@@ -104,9 +104,8 @@ static int pkey_initialize(void)
/* scan the device tree for pkey feature */
pkeys_total = scan_pkey_feature();
- if (pkeys_total)
- static_branch_disable(&pkey_disabled);
- else {
+ if (!pkeys_total) {
+ /* No support for pkey. Mark it disabled */
static_branch_enable(&pkey_disabled);
return 0;
}
--
2.26.2
* [PATCH v2 10/12] powerpc/book3s64/pkeys: Convert pkey_total to max_pkey
From: Aneesh Kumar K.V @ 2020-05-02 11:13 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, linuxram
In-Reply-To: <20200502111347.541836-1-aneesh.kumar@linux.ibm.com>
max_pkey now represents the maximum key value that userspace can allocate.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
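Note (illustration only, not part of the patch): the selection of max_pkey
and the execute-only key check can be modelled as two small functions. This is
a user-space sketch; compute_max_pkey() and execute_only_key_usable() are
hypothetical names, and the four_k_pages flag stands in for
CONFIG_PPC_4K_PAGES.

```c
#include <assert.h>

/*
 * Model of the #ifdef CONFIG_PPC_4K_PAGES block: the Linux 4K PTE can only
 * represent 8 keys, so the device tree count is capped there.
 */
static int compute_max_pkey(int pkeys_total, int four_k_pages)
{
	assert(pkeys_total >= 0);
	if (four_k_pages)
		return pkeys_total < 8 ? pkeys_total : 8;
	return pkeys_total;
}

/* The patch disables the execute-only key when max_pkey <= key. */
static int execute_only_key_usable(int max_pkey, int execute_only_key)
{
	return max_pkey > execute_only_key;
}
```

For example, a device tree reporting 32 keys yields max_pkey = 8 with 4K pages
and 32 otherwise, and the default execute-only key 2 stays usable in both
cases.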
arch/powerpc/include/asm/pkeys.h | 7 +++++--
arch/powerpc/mm/book3s64/pkeys.c | 14 +++++++-------
2 files changed, 12 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 75d2a2c19c04..652bad7334f3 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -12,7 +12,7 @@
#include <asm/firmware.h>
DECLARE_STATIC_KEY_FALSE(pkey_disabled);
-extern int pkeys_total; /* total pkeys as per device tree */
+extern int max_pkey;
extern u32 initial_allocation_mask; /* bits set for the initially allocated keys */
extern u32 reserved_allocation_mask; /* bits set for reserved keys */
@@ -44,7 +44,10 @@ static inline int vma_pkey(struct vm_area_struct *vma)
return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
}
-#define arch_max_pkey() pkeys_total
+static inline int arch_max_pkey(void)
+{
+ return max_pkey;
+}
#define pkey_alloc_mask(pkey) (0x1 << pkey)
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 87d882a9aaf2..a4d7287082a8 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -14,7 +14,7 @@
DEFINE_STATIC_KEY_FALSE(pkey_disabled);
DEFINE_STATIC_KEY_FALSE(execute_pkey_disabled);
-int pkeys_total; /* Total pkeys as per device tree */
+int max_pkey; /* Maximum key value supported */
u32 initial_allocation_mask; /* Bits set for the initially allocated keys */
/*
* Keys marked in the reservation list cannot be allocated by userspace
@@ -84,7 +84,7 @@ static int scan_pkey_feature(void)
static int pkey_initialize(void)
{
- int os_reserved, i;
+ int pkeys_total, i;
/*
* We define PKEY_DISABLE_EXECUTE in addition to the arch-neutral
@@ -122,12 +122,12 @@ static int pkey_initialize(void)
* The OS can manage only 8 pkeys due to its inability to represent them
* in the Linux 4K PTE. Mark all other keys reserved.
*/
- os_reserved = pkeys_total - 8;
+ max_pkey = min(8, pkeys_total);
#else
- os_reserved = 0;
+ max_pkey = pkeys_total;
#endif
- if (unlikely((pkeys_total - os_reserved) <= execute_only_key)) {
+ if (unlikely(max_pkey <= execute_only_key)) {
/*
* Insufficient number of keys to support
* execute only key. Mark it unavailable.
@@ -174,10 +174,10 @@ static int pkey_initialize(void)
default_uamor &= ~(0x3ul << pkeyshift(1));
/*
- * Prevent the usage of OS reserved the keys. Update UAMOR
+ * Prevent the usage of OS reserved keys. Update UAMOR
* for those keys.
*/
- for (i = (pkeys_total - os_reserved); i < pkeys_total; i++) {
+ for (i = max_pkey; i < pkeys_total; i++) {
reserved_allocation_mask |= (0x1 << i);
default_uamor &= ~(0x3ul << pkeyshift(i));
}
--
2.26.2
* [PATCH v2 11/12] powerpc/book3s64/pkeys: Make initial_allocation_mask static
From: Aneesh Kumar K.V @ 2020-05-02 11:13 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, linuxram
In-Reply-To: <20200502111347.541836-1-aneesh.kumar@linux.ibm.com>
initial_allocation_mask is not used outside this file.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/powerpc/include/asm/pkeys.h | 1 -
arch/powerpc/mm/book3s64/pkeys.c | 2 +-
2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 652bad7334f3..47c81d41ea9a 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -13,7 +13,6 @@
DECLARE_STATIC_KEY_FALSE(pkey_disabled);
extern int max_pkey;
-extern u32 initial_allocation_mask; /* bits set for the initially allocated keys */
extern u32 reserved_allocation_mask; /* bits set for reserved keys */
#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index a4d7287082a8..73b5ef1490c8 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -15,11 +15,11 @@
DEFINE_STATIC_KEY_FALSE(pkey_disabled);
DEFINE_STATIC_KEY_FALSE(execute_pkey_disabled);
int max_pkey; /* Maximum key value supported */
-u32 initial_allocation_mask; /* Bits set for the initially allocated keys */
/*
* Keys marked in the reservation list cannot be allocated by userspace
*/
u32 reserved_allocation_mask;
+static u32 initial_allocation_mask; /* Bits set for the initially allocated keys */
static u64 default_amr;
static u64 default_iamr;
/* Allow all keys to be modified by default */
--
2.26.2
* [PATCH v2 12/12] powerpc/book3s64/pkeys: Mark all the pkeys above max pkey as reserved
From: Aneesh Kumar K.V @ 2020-05-02 11:13 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, linuxram
In-Reply-To: <20200502111347.541836-1-aneesh.kumar@linux.ibm.com>
The hypervisor can return fewer than the maximum allowed number of pkeys (for
example, 31 instead of 32). We should mark all the pkeys above the maximum
allowed as reserved so that userspace cannot allocate a wrong pkey (for
example, key 31 in the above case).
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
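Note (illustration only, not part of the patch): the effect of the loop bound
change on reserved_allocation_mask can be checked in user space. The sketch
below models only the allocation mask, not the UAMOR update; reserve_keys()
is a hypothetical helper whose second argument stands for the loop's upper
bound (pkeys_total before this patch, 32 after it).

```c
#include <assert.h>
#include <stdint.h>

/* Set one bit per key in [max_pkey, upper), as the reservation loop does. */
static uint32_t reserve_keys(int max_pkey, int upper)
{
	uint32_t mask = 0;
	int i;

	assert(max_pkey >= 0 && upper <= 32);
	for (i = max_pkey; i < upper; i++)
		mask |= UINT32_C(1) << i;
	return mask;
}
```

With 31 keys reported, reserve_keys(31, 31) returns 0, so bit 31 stays
allocatable, while reserve_keys(31, 32) returns 0x80000000 and closes that
gap.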
arch/powerpc/mm/book3s64/pkeys.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 73b5ef1490c8..0ff59acdbb84 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -175,9 +175,10 @@ static int pkey_initialize(void)
/*
* Prevent the usage of OS reserved keys. Update UAMOR
- * for those keys.
+ * for those keys. Also mark the rest of the bits in the
+ * 32 bit mask as reserved.
*/
- for (i = max_pkey; i < pkeys_total; i++) {
+ for (i = max_pkey; i < 32 ; i++) {
reserved_allocation_mask |= (0x1 << i);
default_uamor &= ~(0x3ul << pkeyshift(i));
}
--
2.26.2
* [RFC PATCH 01/10] kallsyms: architecture specific symbol lookups
From: Nicholas Piggin @ 2020-05-02 11:19 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20200502111914.166578-1-npiggin@gmail.com>
Provide CONFIG_ARCH_HAS_SYMBOL_LOOKUP which allows architectures to
do their own symbol/address lookup if kernel and module lookups miss.
powerpc will use this to deal with firmware symbols.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
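Note (illustration only, not part of the patch): the name lookup order this
patch sets up is kernel table first, then modules, then the architecture
hook. The user-space sketch below models that fallback chain with three small
tables. All names, symbols and addresses in it are made up for the example
and do not correspond to real kernel data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sym {
	const char *name;
	unsigned long addr;
};

/* Hypothetical stand-ins for the kernel, module and arch symbol sources. */
static const struct sym kernel_syms[] = { { "start_kernel", 0x1000 } };
static const struct sym module_syms[] = { { "mod_init", 0x2000 } };
static const struct sym arch_syms[]   = { { "fw_entry", 0x3000 } };

/* Linear search; returns 0 on a miss, like the kernel lookups do. */
static unsigned long table_lookup(const struct sym *tab, size_t n,
				  const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(tab[i].name, name) == 0)
			return tab[i].addr;
	return 0;
}

/* Same control flow as the patched kallsyms_lookup_name(). */
static unsigned long lookup_name(const char *name)
{
	unsigned long ret;

	ret = table_lookup(kernel_syms, 1, name);
	if (ret)
		return ret;

	ret = table_lookup(module_syms, 1, name);
	if (ret)
		return ret;

	/* Only consulted when the kernel and module lookups both miss. */
	return table_lookup(arch_syms, 1, name);
}
```

Because the arch hook runs last, it cannot shadow a kernel or module symbol
of the same name; a name unknown to all three sources still returns 0.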
include/linux/kallsyms.h | 20 ++++++++++++++++++++
kernel/kallsyms.c | 13 ++++++++++++-
lib/Kconfig | 3 +++
3 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index 657a83b943f0..e17c1e7c01c0 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -83,6 +83,26 @@ extern int kallsyms_lookup_size_offset(unsigned long addr,
unsigned long *symbolsize,
unsigned long *offset);
+#ifdef CONFIG_ARCH_HAS_SYMBOL_LOOKUP
+const char *arch_symbol_lookup_address(unsigned long addr,
+ unsigned long *symbolsize,
+ unsigned long *offset,
+ char **modname, char *namebuf);
+unsigned long arch_symbol_lookup_name(const char *name);
+#else
+static inline const char *arch_symbol_lookup_address(unsigned long addr,
+ unsigned long *symbolsize,
+ unsigned long *offset,
+ char **modname, char *namebuf)
+{
+ return NULL;
+}
+static inline unsigned long arch_symbol_lookup_name(const char *name)
+{
+ return 0;
+}
+#endif
+
/* Lookup an address. modname is set to NULL if it's in the kernel. */
const char *kallsyms_lookup(unsigned long addr,
unsigned long *symbolsize,
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 16c8c605f4b0..1e403e616126 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -164,6 +164,7 @@ static unsigned long kallsyms_sym_address(int idx)
unsigned long kallsyms_lookup_name(const char *name)
{
char namebuf[KSYM_NAME_LEN];
+ unsigned long ret;
unsigned long i;
unsigned int off;
@@ -173,7 +174,12 @@ unsigned long kallsyms_lookup_name(const char *name)
if (strcmp(namebuf, name) == 0)
return kallsyms_sym_address(i);
}
- return module_kallsyms_lookup_name(name);
+
+ ret = module_kallsyms_lookup_name(name);
+ if (ret)
+ return ret;
+
+ return arch_symbol_lookup_name(name);
}
int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
@@ -309,6 +315,11 @@ const char *kallsyms_lookup(unsigned long addr,
if (!ret)
ret = ftrace_mod_address_lookup(addr, symbolsize,
offset, modname, namebuf);
+
+ if (!ret)
+ ret = arch_symbol_lookup_address(addr, symbolsize,
+ offset, modname, namebuf);
+
return ret;
}
diff --git a/lib/Kconfig b/lib/Kconfig
index 5d53f9609c25..9f86f649a712 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -80,6 +80,9 @@ config ARCH_USE_CMPXCHG_LOCKREF
config ARCH_HAS_FAST_MULTIPLIER
bool
+config ARCH_HAS_SYMBOL_LOOKUP
+ bool
+
config INDIRECT_PIO
bool "Access I/O in non-MMIO mode"
depends on ARM64
--
2.23.0
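The fallback order this patch gives kallsyms_lookup_name() (kernel symbols first, then modules, then the architecture hook) can be modelled in userspace as below. This is only a sketch under stated assumptions: the three tables, their symbol names and addresses, and the helpers table_lookup() and lookup_name() are all made up for illustration and are not kernel APIs. As in the patch's stub, a miss is reported as 0.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol tables standing in for the three lookup layers. */
struct sym {
	const char *name;
	unsigned long addr;
};

static const struct sym kernel_syms[] = { { "start_kernel", 0x1000 } };
static const struct sym module_syms[] = { { "mod_init", 0x2000 } };
static const struct sym arch_syms[]   = { { "fw_entry", 0x3000 } };

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Linear search of one table; returns 0 when the name is not present. */
static unsigned long table_lookup(const struct sym *tab, size_t n,
				  const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(tab[i].name, name) == 0)
			return tab[i].addr;
	return 0;
}

/* Try each layer in turn; the arch table is consulted only on a miss. */
static unsigned long lookup_name(const char *name)
{
	unsigned long ret;

	ret = table_lookup(kernel_syms, ARRAY_LEN(kernel_syms), name);
	if (ret)
		return ret;

	ret = table_lookup(module_syms, ARRAY_LEN(module_syms), name);
	if (ret)
		return ret;

	return table_lookup(arch_syms, ARRAY_LEN(arch_syms), name);
}
```

Because the architecture layer comes last, a name that also exists in the kernel or a module never reaches it, which matches the "if kernel and module lookups miss" wording of the changelog.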
* [RFC PATCH 00/10] OPAL V4
From: Nicholas Piggin @ 2020-05-02 11:19 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
"OPAL V4" is a proposed new approach to running and calling PowerNV
OPAL firmware.
OPAL calls use the caller's (kernel) stack, which vastly simplifies
re-entrancy concerns around doing things like idle and machine check
OPAL drivers.
The OS can get at symbol and assert metadata to help with debugging
firmware.
OPAL may be called (and will run in) virtual mode in its own address
space.
The operating system also provides some services to the firmware,
such as message logging.
This is fairly close to the point where we could run OPAL in user mode
with a few services for privileged instructions (scv could be used to
call back to the OS). We may yet do this, but one thing that has
stopped me is that it would require a slower API. As it is now with LE
skiboot and LE Linux, the OPAL call is basically a shared-library
function call, which is fast enough that it's feasible to
implement a performant CPU idle driver, which is a significant
motivation.
Anyway, this is up and running and coming together pretty well; it
just needs a bit of polishing and more documentation. I'll post the
skiboot patches on the skiboot list.
Nicholas Piggin (10):
kallsyms: architecture specific symbol lookups
powerpc/powernv: Wire up OPAL address lookups
powerpc/powernv: Use OPAL_REPORT_TRAP to cope with trap interrupts
from OPAL
powerpc/powernv: avoid polling in opal_get_chars
powerpc/powernv: Don't translate kernel addresses to real addresses
for OPAL
powerpc/powernv: opal use new opal call entry point if it exists
powerpc/powernv: Add OPAL_FIND_VM_AREA API
powerpc/powernv: Set up an mm context to call OPAL in
powerpc/powernv: OPAL V4 OS services
powerpc/powernv: OPAL V4 Implement vm_map/unmap service
arch/powerpc/Kconfig | 1 +
arch/powerpc/boot/opal.c | 5 +
arch/powerpc/include/asm/opal-api.h | 29 +-
arch/powerpc/include/asm/opal.h | 8 +
arch/powerpc/kernel/traps.c | 39 ++-
arch/powerpc/perf/imc-pmu.c | 4 +-
arch/powerpc/platforms/powernv/npu-dma.c | 2 +-
arch/powerpc/platforms/powernv/opal-call.c | 58 ++++
arch/powerpc/platforms/powernv/opal-dump.c | 2 +-
arch/powerpc/platforms/powernv/opal-elog.c | 4 +-
arch/powerpc/platforms/powernv/opal-flash.c | 6 +-
arch/powerpc/platforms/powernv/opal-hmi.c | 2 +-
arch/powerpc/platforms/powernv/opal-nvram.c | 4 +-
.../powerpc/platforms/powernv/opal-powercap.c | 2 +-
arch/powerpc/platforms/powernv/opal-psr.c | 2 +-
arch/powerpc/platforms/powernv/opal-xscom.c | 2 +-
arch/powerpc/platforms/powernv/opal.c | 289 ++++++++++++++++--
arch/powerpc/platforms/powernv/pci-ioda.c | 2 +-
arch/powerpc/sysdev/xive/native.c | 2 +-
drivers/char/powernv-op-panel.c | 3 +-
drivers/i2c/busses/i2c-opal.c | 12 +-
drivers/mtd/devices/powernv_flash.c | 4 +-
include/linux/kallsyms.h | 20 ++
kernel/kallsyms.c | 13 +-
lib/Kconfig | 3 +
25 files changed, 461 insertions(+), 57 deletions(-)
--
2.23.0