LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH v6 1/4] powerpc/64s: Introduce temporary mm for Radix MMU
From: Christopher M. Riedl @ 2021-09-16  0:24 UTC (permalink / raw)
  To: Jordan Niethe; +Cc: linuxppc-dev, linux-hardening
In-Reply-To: <CACzsE9rThU0JBACJoeeHOyEOA8CbFwRExrOhTsySaOH44yJa6g@mail.gmail.com>

On Sat Sep 11, 2021 at 3:26 AM CDT, Jordan Niethe wrote:
> On Sat, Sep 11, 2021 at 12:35 PM Christopher M. Riedl
> <cmr@bluescreens.de> wrote:
> >
> > x86 supports the notion of a temporary mm which restricts access to
> > temporary PTEs to a single CPU. A temporary mm is useful for situations
> > where a CPU needs to perform sensitive operations (such as patching a
> > STRICT_KERNEL_RWX kernel) requiring temporary mappings without exposing
> > said mappings to other CPUs. Another benefit is that other CPU TLBs do
> > not need to be flushed when the temporary mm is torn down.
> >
> > Mappings in the temporary mm can be set in the userspace portion of the
> > address-space.
> >
> > Interrupts must be disabled while the temporary mm is in use. HW
> > breakpoints, which may have been set by userspace as watchpoints on
> > addresses now within the temporary mm, are saved and disabled when
> > loading the temporary mm. The HW breakpoints are restored when unloading
> > the temporary mm. All HW breakpoints are indiscriminately disabled while
> > the temporary mm is in use - this may include breakpoints set by perf.
>
> I had thought CPUs with a DAWR might not need to do this because the
> privilege level that breakpoints trigger on can be configured. But it
> turns out in ptrace, etc we use HW_BRK_TYPE_PRIV_ALL.

Thanks for double checking :)

>
> >
> > Based on x86 implementation:
> >
> > commit cefa929c034e
> > ("x86/mm: Introduce temporary mm structs")
> >
> > Signed-off-by: Christopher M. Riedl <cmr@bluescreens.de>
> >
> > ---
> >
> > v6:  * Use {start,stop}_using_temporary_mm() instead of
> >        {use,unuse}_temporary_mm() as suggested by Christophe.
> >
> > v5:  * Drop support for using a temporary mm on Book3s64 Hash MMU.
> >
> > v4:  * Pass the prev mm instead of NULL to switch_mm_irqs_off() when
> >        using/unusing the temp mm as suggested by Jann Horn to keep
> >        the context.active counter in-sync on mm/nohash.
> >      * Disable SLB preload in the temporary mm when initializing the
> >        temp_mm struct.
> >      * Include asm/debug.h header to fix build issue with
> >        ppc44x_defconfig.
> > ---
> >  arch/powerpc/include/asm/debug.h |  1 +
> >  arch/powerpc/kernel/process.c    |  5 +++
> >  arch/powerpc/lib/code-patching.c | 56 ++++++++++++++++++++++++++++++++
> >  3 files changed, 62 insertions(+)
> >
> > diff --git a/arch/powerpc/include/asm/debug.h b/arch/powerpc/include/asm/debug.h
> > index 86a14736c76c..dfd82635ea8b 100644
> > --- a/arch/powerpc/include/asm/debug.h
> > +++ b/arch/powerpc/include/asm/debug.h
> > @@ -46,6 +46,7 @@ static inline int debugger_fault_handler(struct pt_regs *regs) { return 0; }
> >  #endif
> >
> >  void __set_breakpoint(int nr, struct arch_hw_breakpoint *brk);
> > +void __get_breakpoint(int nr, struct arch_hw_breakpoint *brk);
> >  bool ppc_breakpoint_available(void);
> >  #ifdef CONFIG_PPC_ADV_DEBUG_REGS
> >  extern void do_send_trap(struct pt_regs *regs, unsigned long address,
> > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > index 50436b52c213..6aa1f5c4d520 100644
> > --- a/arch/powerpc/kernel/process.c
> > +++ b/arch/powerpc/kernel/process.c
> > @@ -865,6 +865,11 @@ static inline int set_breakpoint_8xx(struct arch_hw_breakpoint *brk)
> >         return 0;
> >  }
> >
> > +void __get_breakpoint(int nr, struct arch_hw_breakpoint *brk)
> > +{
> > +       memcpy(brk, this_cpu_ptr(&current_brk[nr]), sizeof(*brk));
> > +}
>
> The breakpoint code is already a little hard to follow. I'm worried
> doing this might spread breakpoint handling into more places in the
> future.
> What about something like having a breakpoint_pause() function which
> clears the hardware registers only and then a breakpoint_resume()
> function that copies from current_brk[] back to the hardware
> registers?
> Then we don't have to make another copy of the breakpoint state.

I think that sounds reasonable - I'll add those functions instead with
the next spin.

>
> > +
> >  void __set_breakpoint(int nr, struct arch_hw_breakpoint *brk)
> >  {
> >         memcpy(this_cpu_ptr(&current_brk[nr]), brk, sizeof(*brk));
> > diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> > index f9a3019e37b4..8d61a7d35b89 100644
> > --- a/arch/powerpc/lib/code-patching.c
> > +++ b/arch/powerpc/lib/code-patching.c
>
> Sorry I might have missed it, but what was the reason for not putting
> this stuff in mmu_context.h?

x86 ended up moving this code into their code-patching file as well. I
suppose nobody has thought of another use for a temporary mm like this
yet :)

>
> > @@ -17,6 +17,9 @@
> >  #include <asm/code-patching.h>
> >  #include <asm/setup.h>
> >  #include <asm/inst.h>
> > +#include <asm/mmu_context.h>
> > +#include <asm/debug.h>
> > +#include <asm/tlb.h>
> >
> >  static int __patch_instruction(u32 *exec_addr, struct ppc_inst instr, u32 *patch_addr)
> >  {
> > @@ -45,6 +48,59 @@ int raw_patch_instruction(u32 *addr, struct ppc_inst instr)
> >  }
> >
> >  #ifdef CONFIG_STRICT_KERNEL_RWX
> > +
> > +struct temp_mm {
> > +       struct mm_struct *temp;
> > +       struct mm_struct *prev;
> > +       struct arch_hw_breakpoint brk[HBP_NUM_MAX];
> ^ Then we wouldn't need this.
> > +};
> > +
> > +static inline void init_temp_mm(struct temp_mm *temp_mm, struct mm_struct *mm)
> > +{
> > +       /* We currently only support temporary mm on the Book3s64 Radix MMU */
> > +       WARN_ON(!radix_enabled());
> > +
> > +       temp_mm->temp = mm;
> > +       temp_mm->prev = NULL;
> > +       memset(&temp_mm->brk, 0, sizeof(temp_mm->brk));
> > +}
> > +
> > +static inline void start_using_temporary_mm(struct temp_mm *temp_mm)
> > +{
> > +       lockdep_assert_irqs_disabled();
> > +
> > +       temp_mm->prev = current->active_mm;
> > +       switch_mm_irqs_off(temp_mm->prev, temp_mm->temp, current);
>
> Now that we only support radix it should be fine again to have it like
> this:
> switch_mm_irqs_off(NULL, temp_mm->temp, current);?
> It was changed from that because it would cause issues on nohash I
> thought.

That's true - but if we want to support the other MMUs in the future
I'd rather just keep it as-is. AFAICS there's no harm in passing
temp_mm->prev here instead of NULL.

>
> > +
> > +       WARN_ON(!mm_is_thread_local(temp_mm->temp));
> > +
> > +       if (ppc_breakpoint_available()) {
> > +               struct arch_hw_breakpoint null_brk = {0};
> > +               int i = 0;
> > +
> > +               for (; i < nr_wp_slots(); ++i) {
> > +                       __get_breakpoint(i, &temp_mm->brk[i]);
> > +                       if (temp_mm->brk[i].type != 0)
> > +                               __set_breakpoint(i, &null_brk);
> > +               }
> > +       }
> > +}
> > +
> > +static inline void stop_using_temporary_mm(struct temp_mm *temp_mm)
> > +{
> > +       lockdep_assert_irqs_disabled();
> > +
> > +       switch_mm_irqs_off(temp_mm->temp, temp_mm->prev, current);
> > +
> > +       if (ppc_breakpoint_available()) {
> > +               int i = 0;
> > +
> > +               for (; i < nr_wp_slots(); ++i)
> > +                       if (temp_mm->brk[i].type != 0)
> > +                               __set_breakpoint(i, &temp_mm->brk[i]);
> > +       }
> > +}
> > +
> >  static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
> >
> >  static int text_area_cpu_up(unsigned int cpu)
> > --
> > 2.32.0
> >
> Thanks,
> Jordan


^ permalink raw reply

* Re: [PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()
From: Borislav Petkov @ 2021-09-15 18:47 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh, kvm,
	dri-devel, platform-driver-x86, Paul Mackerras, linux-s390,
	Andi Kleen, Joerg Roedel, x86, amd-gfx, Christoph Hellwig,
	linux-graphics-maintainer, Tom Lendacky, Tianyu Lan, kexec,
	linux-kernel, iommu, linux-fsdevel, linuxppc-dev
In-Reply-To: <f8388f18-5e90-5d0f-d681-0b17f8307dd4@csgroup.eu>

On Wed, Sep 15, 2021 at 07:18:34PM +0200, Christophe Leroy wrote:
> Could you please provide more explicit explanation why inlining such an
> helper is considered as bad practice and messy ?

Tom already told you to look at the previous threads. Let's read them
together. This one, for example:

https://lore.kernel.org/lkml/YSScWvpXeVXw%2Fed5@infradead.org/

| > To take it out of line, I'm leaning towards the latter, creating a new
| > file that is built based on the ARCH_HAS_PROTECTED_GUEST setting.
| 
| Yes.  In general everytime architectures have to provide the prototype
| and not just the implementation of something we end up with a giant mess
| sooner or later.  In a few cases that is still warranted due to
| performance concerns, but i don't think that is the case here.

So I think what Christoph means here is that you want to have the
generic prototype defined in a header and arches get to implement it
exactly to the letter so that there's no mess.

As to what mess exactly, I'd let him explain that.

> Because as demonstrated in my previous response some days ago, taking that
> outline ends up with an unneccessary ugly generated code and we don't
> benefit front GCC's capability to fold in and opt out unreachable code.

And this is real fast path where a couple of instructions matter or what?

set_memory_encrypted/_decrypted doesn't look like one to me.

> I can't see your point here. Inlining the function wouldn't add any
> ifdeffery as far as I can see.

If the function is touching defines etc, they all need to be visible.
If that function needs to call other functions - which is the case on
x86, perhaps not so much on power - then you need to either ifdef around
them or provide stubs with ifdeffery in the headers. And you need to
make them global functions instead of keeping them static to the same
compilation unit, etc, etc.

With a separate compilation unit, you don't need any of that and it is
all kept in that single file.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply

* Re: [PATCH v3 0/8] Implement generic cc_platform_has() helper function
From: Kuppuswamy, Sathyanarayanan @ 2021-09-15 17:26 UTC (permalink / raw)
  To: Borislav Petkov, Tom Lendacky
  Cc: linux-efi, Brijesh Singh, kvm, David Airlie, Dave Hansen,
	dri-devel, platform-driver-x86, Paul Mackerras, Will Deacon,
	Ard Biesheuvel, linux-s390, Andi Kleen, Baoquan He, Joerg Roedel,
	x86, amd-gfx, Christoph Hellwig, Christian Borntraeger,
	Ingo Molnar, linux-graphics-maintainer, Dave Young, Tianyu Lan,
	Thomas Zimmermann, Vasily Gorbik, Heiko Carstens,
	Maarten Lankhorst, Maxime Ripard, Andy Lutomirski,
	Thomas Gleixner, Peter Zijlstra, kexec, linux-kernel, iommu,
	Daniel Vetter, linux-fsdevel, linuxppc-dev
In-Reply-To: <YUIjS6lKEY5AadZx@zn.tnic>



On 9/15/21 9:46 AM, Borislav Petkov wrote:
> Sathya,
> 
> if you want to prepare the Intel variant intel_cc_platform_has() ontop
> of those and send it to me, that would be good because then I can
> integrate it all in one branch which can be used to base future work
> ontop.

I have a Intel variant patch (please check following patch). But it includes
TDX changes as well. Shall I move TDX changes to different patch and just
create a separate patch for adding intel_cc_platform_has()?


commit fc5f98a0ed94629d903827c5b44ee9295f835831
Author: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Date:   Wed May 12 11:35:13 2021 -0700

     x86/tdx: Add confidential guest support for TDX guest

     TDX architecture provides a way for VM guests to be highly secure and
     isolated (from untrusted VMM). To achieve this requirement, any data
     coming from VMM cannot be completely trusted. TDX guest fixes this
     issue by hardening the IO drivers against the attack from the VMM.
     So, when adding hardening fixes to the generic drivers, to protect
     custom fixes use cc_platform_has() API.

     Also add TDX guest support to cc_platform_has() API to protect the
     TDX specific fixes.

     Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a5b14de03458..2e78358923a1 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -871,6 +871,7 @@ config INTEL_TDX_GUEST
         depends on SECURITY
         select X86_X2APIC
         select SECURITY_LOCKDOWN_LSM
+       select ARCH_HAS_CC_PLATFORM
         help
           Provide support for running in a trusted domain on Intel processors
           equipped with Trusted Domain eXtensions. TDX is a new Intel
diff --git a/arch/x86/include/asm/intel_cc_platform.h b/arch/x86/include/asm/intel_cc_platform.h
new file mode 100644
index 000000000000..472c3174beac
--- /dev/null
+++ b/arch/x86/include/asm/intel_cc_platform.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2021 Intel Corporation */
+#ifndef _ASM_X86_INTEL_CC_PLATFORM_H
+#define _ASM_X86_INTEL_CC_PLATFORM_H
+
+#if defined(CONFIG_CPU_SUP_INTEL) && defined(CONFIG_ARCH_HAS_CC_PLATFORM)
+bool intel_cc_platform_has(unsigned int flag);
+#else
+static inline bool intel_cc_platform_has(unsigned int flag) { return false; }
+#endif
+
+#endif /* _ASM_X86_INTEL_CC_PLATFORM_H */
+
diff --git a/arch/x86/kernel/cc_platform.c b/arch/x86/kernel/cc_platform.c
index 3c9bacd3c3f3..e83bc2f48efe 100644
--- a/arch/x86/kernel/cc_platform.c
+++ b/arch/x86/kernel/cc_platform.c
@@ -10,11 +10,16 @@
  #include <linux/export.h>
  #include <linux/cc_platform.h>
  #include <linux/mem_encrypt.h>
+#include <linux/processor.h>
+
+#include <asm/intel_cc_platform.h>

  bool cc_platform_has(enum cc_attr attr)
  {
         if (sme_me_mask)
                 return amd_cc_platform_has(attr);
+       else if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
+               return intel_cc_platform_has(attr);

         return false;
  }
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 8321c43554a1..ab486a3b1eb0 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -11,6 +11,7 @@
  #include <linux/init.h>
  #include <linux/uaccess.h>
  #include <linux/delay.h>
+#include <linux/cc_platform.h>

  #include <asm/cpufeature.h>
  #include <asm/msr.h>
@@ -60,6 +61,21 @@ static u64 msr_test_ctrl_cache __ro_after_init;
   */
  static bool cpu_model_supports_sld __ro_after_init;

+#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
+bool intel_cc_platform_has(enum cc_attr attr)
+{
+       switch (attr) {
+       case CC_ATTR_GUEST_TDX:
+               return cpu_feature_enabled(X86_FEATURE_TDX_GUEST);
+       default:
+               return false;
+       }
+
+       return false;
+}
+EXPORT_SYMBOL_GPL(intel_cc_platform_has);
+#endif
+
  /*
   * Processors which have self-snooping capability can handle conflicting
   * memory type across CPUs by snooping its own cache. However, there exists
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index 253f3ea66cd8..e38430e6e396 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -61,6 +61,15 @@ enum cc_attr {
          * Examples include SEV-ES.
          */
         CC_ATTR_GUEST_STATE_ENCRYPT,
+
+       /**
+        * @CC_ATTR_GUEST_TDX: Trusted Domain Extension Support
+        *
+        * The platform/OS is running as a TDX guest/virtual machine.
+        *
+        * Examples include SEV-ES.
+        */
+       CC_ATTR_GUEST_TDX,
  };

  #ifdef CONFIG_ARCH_HAS_CC_PLATFORM


-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply related

* Re: [PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()
From: Christophe Leroy @ 2021-09-15 17:18 UTC (permalink / raw)
  To: Borislav Petkov, Michael Ellerman
  Cc: linux-s390, Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh,
	kvm, Tom Lendacky, Tianyu Lan, Joerg Roedel, x86, kexec,
	linux-kernel, dri-devel, platform-driver-x86, Christoph Hellwig,
	iommu, Andi Kleen, Paul Mackerras, amd-gfx, linux-fsdevel,
	linux-graphics-maintainer, linuxppc-dev
In-Reply-To: <YUHGDbtiGrDz5+NS@zn.tnic>

Le 15/09/2021 à 12:08, Borislav Petkov a écrit :
> On Wed, Sep 15, 2021 at 10:28:59AM +1000, Michael Ellerman wrote:
>> I don't love it, a new C file and an out-of-line call to then call back
>> to a static inline that for most configuration will return false ... but
>> whatever :)
> 
> Yeah, hch thinks it'll cause a big mess otherwise:
> 
> https://lore.kernel.org/lkml/YSScWvpXeVXw%2Fed5@infradead.org/

Could you please provide more explicit explanation why inlining such an 
helper is considered as bad practice and messy ?

Because as demonstrated in my previous response some days ago, taking 
that outline ends up with an unneccessary ugly generated code and we 
don't benefit front GCC's capability to fold in and opt out unreachable 
code.

As pointed by Michael in most cases the function will just return false 
so behind the performance concern, there is also the code size and code 
coverage topic that is to be taken into account. And even when the 
function doesn't return false, the only thing it does folds into a 
single powerpc instruction so there is really no point in making a 
dedicated out-of-line fonction for that and suffer the cost and the size 
of a function call and to justify the addition of a dedicated C file.

> 
> I guess less ifdeffery is nice too.

I can't see your point here. Inlining the function wouldn't add any 
ifdeffery as far as I can see.

So, would you mind reconsidering your approach and allow architectures 
to provide inline implementation by just not enforcing a generic 
prototype ? Or otherwise provide more details and exemple of why the 
cons are more important versus the pros ?

Thanks
Christophe

^ permalink raw reply

* Re: [PATCH v3 0/8] Implement generic cc_platform_has() helper function
From: Borislav Petkov @ 2021-09-15 16:46 UTC (permalink / raw)
  To: Tom Lendacky, Sathyanarayanan Kuppuswamy
  Cc: linux-efi, Brijesh Singh, kvm, David Airlie, Dave Hansen,
	dri-devel, platform-driver-x86, Paul Mackerras, Will Deacon,
	Ard Biesheuvel, linux-s390, Andi Kleen, Baoquan He, Joerg Roedel,
	x86, amd-gfx, Christoph Hellwig, Christian Borntraeger,
	Ingo Molnar, linux-graphics-maintainer, Dave Young, Tianyu Lan,
	Thomas Zimmermann, Vasily Gorbik, Heiko Carstens,
	Maarten Lankhorst, Maxime Ripard, Andy Lutomirski,
	Thomas Gleixner, Peter Zijlstra, kexec, linux-kernel, iommu,
	Daniel Vetter, linux-fsdevel, linuxppc-dev
In-Reply-To: <cover.1631141919.git.thomas.lendacky@amd.com>

On Wed, Sep 08, 2021 at 05:58:31PM -0500, Tom Lendacky wrote:
> This patch series provides a generic helper function, cc_platform_has(),
> to replace the sme_active(), sev_active(), sev_es_active() and
> mem_encrypt_active() functions.
> 
> It is expected that as new confidential computing technologies are
> added to the kernel, they can all be covered by a single function call
> instead of a collection of specific function calls all called from the
> same locations.
> 
> The powerpc and s390 patches have been compile tested only. Can the
> folks copied on this series verify that nothing breaks for them. Also,
> a new file, arch/powerpc/platforms/pseries/cc_platform.c, has been
> created for powerpc to hold the out of line function.

...

> 
> Tom Lendacky (8):
>   x86/ioremap: Selectively build arch override encryption functions
>   mm: Introduce a function to check for confidential computing features
>   x86/sev: Add an x86 version of cc_platform_has()
>   powerpc/pseries/svm: Add a powerpc version of cc_platform_has()
>   x86/sme: Replace occurrences of sme_active() with cc_platform_has()
>   x86/sev: Replace occurrences of sev_active() with cc_platform_has()
>   x86/sev: Replace occurrences of sev_es_active() with cc_platform_has()
>   treewide: Replace the use of mem_encrypt_active() with
>     cc_platform_has()

Ok, modulo the minor things the plan is to take this through tip after
-rc2 releases in order to pick up the powerpc build fix and have a clean
base (-rc2) to base stuff on, at the same time.

Pls holler if something's still amiss.

Sathya,

if you want to prepare the Intel variant intel_cc_platform_has() ontop
of those and send it to me, that would be good because then I can
integrate it all in one branch which can be used to base future work
ontop.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply

* Re: [PATCH v5 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance
From: Vincent Guittot @ 2021-09-15 15:43 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Juri Lelli, Aubrey Li, Srikar Dronamraju, Ravi V. Shankar,
	Peter Zijlstra (Intel), Ricardo Neri, Ben Segall,
	Srinivas Pandruvada, Joel Fernandes (Google), Ingo Molnar,
	Rafael J . Wysocki, Steven Rostedt, Mel Gorman, Len Brown,
	Nicholas Piggin, Aubrey Li, Dietmar Eggemann, Tim Chen,
	Quentin Perret, Daniel Bristot de Oliveira, linux-kernel,
	linuxppc-dev
In-Reply-To: <20210911011819.12184-7-ricardo.neri-calderon@linux.intel.com>

On Sat, 11 Sept 2021 at 03:19, Ricardo Neri
<ricardo.neri-calderon@linux.intel.com> wrote:
>
> When deciding to pull tasks in ASYM_PACKING, it is necessary not only to
> check for the idle state of the destination CPU, dst_cpu, but also of
> its SMT siblings.
>
> If dst_cpu is idle but its SMT siblings are busy, performance suffers
> if it pulls tasks from a medium priority CPU that does not have SMT
> siblings.
>
> Implement asym_smt_can_pull_tasks() to inspect the state of the SMT
> siblings of both dst_cpu and the CPUs in the candidate busiest group.
>
> Cc: Aubrey Li <aubrey.li@intel.com>
> Cc: Ben Segall <bsegall@google.com>
> Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Quentin Perret <qperret@google.com>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> Reviewed-by: Len Brown <len.brown@intel.com>
> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> ---
> Changes since v4:
>   * Use sg_lb_stats::sum_nr_running the idle state of a scheduling group.
>     (Vincent, Peter)
>   * Do not even idle CPUs in asym_smt_can_pull_tasks(). (Vincent)
>   * Updated function documentation and corrected a typo.
>
> Changes since v3:
>   * Removed the arch_asym_check_smt_siblings() hook. Discussions with the
>     powerpc folks showed that this patch should not impact them. Also, more
>     recent powerpc processor no longer use asym_packing. (PeterZ)
>   * Removed unnecessary local variable in asym_can_pull_tasks(). (Dietmar)
>   * Removed unnecessary check for local CPUs when the local group has zero
>     utilization. (Joel)
>   * Renamed asym_can_pull_tasks() as asym_smt_can_pull_tasks() to reflect
>     the fact that it deals with SMT cases.
>   * Made asym_smt_can_pull_tasks() return false for !CONFIG_SCHED_SMT so
>     that callers can deal with non-SMT cases.
>
> Changes since v2:
>   * Reworded the commit message to reflect updates in code.
>   * Corrected misrepresentation of dst_cpu as the CPU doing the load
>     balancing. (PeterZ)
>   * Removed call to arch_asym_check_smt_siblings() as it is now called in
>     sched_asym().
>
> Changes since v1:
>   * Don't bailout in update_sd_pick_busiest() if dst_cpu cannot pull
>     tasks. Instead, reclassify the candidate busiest group, as it
>     may still be selected. (PeterZ)
>   * Avoid an expensive and unnecessary call to cpumask_weight() when
>     determining if a sched_group is comprised of SMT siblings.
>     (PeterZ).
> ---
>  kernel/sched/fair.c | 94 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 94 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 26db017c14a3..8d763dd0174b 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8597,10 +8597,98 @@ group_type group_classify(unsigned int imbalance_pct,
>         return group_has_spare;
>  }
>
> +/**
> + * asym_smt_can_pull_tasks - Check whether the load balancing CPU can pull tasks
> + * @dst_cpu:   Destination CPU of the load balancing
> + * @sds:       Load-balancing data with statistics of the local group
> + * @sgs:       Load-balancing statistics of the candidate busiest group
> + * @sg:                The candidate busiest group
> + *
> + * Check the state of the SMT siblings of both @sds::local and @sg and decide
> + * if @dst_cpu can pull tasks.
> + *
> + * If @dst_cpu does not have SMT siblings, it can pull tasks if two or more of
> + * the SMT siblings of @sg are busy. If only one CPU in @sg is busy, pull tasks
> + * only if @dst_cpu has higher priority.
> + *
> + * If both @dst_cpu and @sg have SMT siblings, and @sg has exactly one more
> + * busy CPU than @sds::local, let @dst_cpu pull tasks if it has higher priority.
> + * Bigger imbalances in the number of busy CPUs will be dealt with in
> + * update_sd_pick_busiest().
> + *
> + * If @sg does not have SMT siblings, only pull tasks if all of the SMT siblings
> + * of @dst_cpu are idle and @sg has lower priority.
> + */
> +static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
> +                                   struct sg_lb_stats *sgs,
> +                                   struct sched_group *sg)
> +{
> +#ifdef CONFIG_SCHED_SMT
> +       bool local_is_smt, sg_is_smt;
> +       int sg_busy_cpus;
> +
> +       local_is_smt = sds->local->flags & SD_SHARE_CPUCAPACITY;
> +       sg_is_smt = sg->flags & SD_SHARE_CPUCAPACITY;
> +
> +       sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
> +
> +       if (!local_is_smt) {
> +               /*
> +                * If we are here, @dst_cpu is idle and does not have SMT
> +                * siblings. Pull tasks if candidate group has two or more
> +                * busy CPUs.
> +                */
> +               if (sg_is_smt && sg_busy_cpus >= 2)

Do you really need to test sg_is_smt ? if sg_busy_cpus >= 2 then
sd_is_smt must be true ?

Also, This is the default behavior where we want to even the number of
busy cpu. Shouldn't you return false and fall back to the default
behavior ?

That being said, the default behavior tries to even the number of idle
cpus which is easier to compute and is equal to even the number of
busy cpus in "normal" system with the same number of cpus in groups
but this is not the case here. It could be good to change the default
behavior to even the number of busy cpus and that you use the default
behavior here. Additional condition will be used to select the busiest
group like more busy cpu or more number of running tasks

> +                       return true;
> +
> +               /*
> +                * @dst_cpu does not have SMT siblings. @sg may have SMT
> +                * siblings and only one is busy. In such case, @dst_cpu
> +                * can help if it has higher priority and is idle (i.e.,
> +                * it has no running tasks).

The previous comment above assume that "@dst_cpu is idle" but now you
need to check that sds->local_stat.sum_nr_running == 0

> +                */
> +               return !sds->local_stat.sum_nr_running &&
> +                      sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
> +       }
> +
> +       /* @dst_cpu has SMT siblings. */
> +
> +       if (sg_is_smt) {
> +               int local_busy_cpus = sds->local->group_weight -
> +                                     sds->local_stat.idle_cpus;
> +               int busy_cpus_delta = sg_busy_cpus - local_busy_cpus;
> +
> +               if (busy_cpus_delta == 1)
> +                       return sched_asym_prefer(dst_cpu,
> +                                                sg->asym_prefer_cpu);
> +
> +               return false;
> +       }
> +
> +       /*
> +        * @sg does not have SMT siblings. Ensure that @sds::local does not end
> +        * up with more than one busy SMT sibling and only pull tasks if there
> +        * are not busy CPUs (i.e., no CPU has running tasks).
> +        */
> +       if (!sds->local_stat.sum_nr_running)
> +               return sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
> +
> +       return false;
> +#else
> +       /* Always return false so that callers deal with non-SMT cases. */
> +       return false;
> +#endif
> +}
> +
>  static inline bool
>  sched_asym(struct lb_env *env, struct sd_lb_stats *sds,  struct sg_lb_stats *sgs,
>            struct sched_group *group)
>  {
> +       /* Only do SMT checks if either local or candidate have SMT siblings */
> +       if ((sds->local->flags & SD_SHARE_CPUCAPACITY) ||
> +           (group->flags & SD_SHARE_CPUCAPACITY))
> +               return asym_smt_can_pull_tasks(env->dst_cpu, sds, sgs, group);
> +
>         return sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu);
>  }
>
> @@ -9606,6 +9694,12 @@ static struct rq *find_busiest_queue(struct lb_env *env,
>                     nr_running == 1)
>                         continue;
>
> +               /* Make sure we only pull tasks from a CPU of lower priority */
> +               if ((env->sd->flags & SD_ASYM_PACKING) &&
> +                   sched_asym_prefer(i, env->dst_cpu) &&
> +                   nr_running == 1)
> +                       continue;
> +
>                 switch (env->migration_type) {
>                 case migrate_load:
>                         /*
> --
> 2.17.1
>

^ permalink raw reply

* [PATCH] powerpc: warn on emulation of dcbz instruction
From: Christophe Leroy @ 2021-09-15 14:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Stan Johnson, Finn Thain, linuxppc-dev, linux-kernel

dcbz instruction shouldn't be used on non-cached memory. Using
it on non-cached memory can result in alignment exception and
implies a heavy handling.

Instead of silentely emulating the instruction and resulting in high
performance degradation, warn whenever an alignment exception is
taken due to dcbz, so that the user is made aware that dcbz
instruction has been used unexpectedly.

Reported-by: Stan Johnson <userm57@yahoo.com>
Cc: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/align.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/align.c b/arch/powerpc/kernel/align.c
index bbb4181621dd..adc3a4a9c6e4 100644
--- a/arch/powerpc/kernel/align.c
+++ b/arch/powerpc/kernel/align.c
@@ -349,6 +349,7 @@ int fix_alignment(struct pt_regs *regs)
 		if (op.type != CACHEOP + DCBZ)
 			return -EINVAL;
 		PPC_WARN_ALIGNMENT(dcbz, regs);
+		WARN_ON_ONCE(1);
 		r = emulate_dcbz(op.ea, regs);
 	} else {
 		if (type == LARX || type == STCX)
-- 
2.31.1


^ permalink raw reply related

* [PATCH] powerpc/32s: Fix kuap_kernel_restore()
From: Christophe Leroy @ 2021-09-15 14:12 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: Stan Johnson, Finn Thain, linuxppc-dev, linux-kernel

At interrupt exit, kuap_kernel_restore() calls kuap_unclok() with the
value contained in regs->kuap. However, when regs->kuap contains
0xffffffff it means that KUAP was not unlocked so calling
kuap_unlock() is unrelevant and results in jeopardising the contents
of kernel space segment registers.

So check that regs->kuap doesn't contain KUAP_NONE before calling
kuap_unlock(). In the meantime it also means that if KUAP has not
been correcly locked back at interrupt exit, it must be locked
before continuing. This is done by checking the content of
current->thread.kuap which was returned by kuap_get_and_assert_locked()

Fixes: 16132529cee5 ("powerpc/32s: Rework Kernel Userspace Access Protection")
Reported-by: Stan Johnson <userm57@yahoo.com>
Cc: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/book3s/32/kup.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
index d4b145b279f6..9f38040f0641 100644
--- a/arch/powerpc/include/asm/book3s/32/kup.h
+++ b/arch/powerpc/include/asm/book3s/32/kup.h
@@ -136,6 +136,14 @@ static inline void kuap_kernel_restore(struct pt_regs *regs, unsigned long kuap)
 	if (kuap_is_disabled())
 		return;

+	if (unlikely(kuap != KUAP_NONE)) {
+		current->thread.kuap = KUAP_NONE;
+		kuap_lock(kuap, false);
+	}
+
+	if (likely(regs->kuap == KUAP_NONE))
+		return;
+
 	current->thread.kuap = regs->kuap;

 	kuap_unlock(regs->kuap, false);
-- 
2.31.1

^ permalink raw reply related

* Re: [PATCH] swiotlb: set IO TLB segment size via cmdline
From: Christoph Hellwig @ 2021-09-15 13:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Roman Skakun, linux-doc, Peter Zijlstra, Viresh Kumar,
	Linux Kernel Mailing List, Paul Mackerras, Will Deacon,
	Boris Ostrovsky, Marek Szyprowski, Stefano Stabellini,
	Jonathan Corbet, Christoph Hellwig, xen-devel, Paul E. McKenney,
	Konrad Rzeszutek Wilk, Muchun Song, Thomas Gleixner,
	Juergen Gross, Thomas Bogendoerfer, Andrii Anisov, linuxppc-dev,
	Randy Dunlap, linux-mips, iommu, Roman Skakun, Andrew Morton,
	Lu Baolu, Robin Murphy, Mike Rapoport, Maciej W. Rozycki
In-Reply-To: <84ef7ff7-2c9c-113a-4a2c-cef54a6ded51@suse.com>

On Wed, Sep 15, 2021 at 03:49:52PM +0200, Jan Beulich wrote:
> But the question remains: Why does the framebuffer need to be mapped
> in a single giant chunk?

More importantly: if you use dynamic dma mappings for your framebuffer
you're doing something wrong.

^ permalink raw reply

* Re: [PATCH] swiotlb: set IO TLB segment size via cmdline
From: Jan Beulich @ 2021-09-15 13:49 UTC (permalink / raw)
  To: Roman Skakun
  Cc: linux-doc, Peter Zijlstra, Viresh Kumar,
	Linux Kernel Mailing List, Paul Mackerras, Will Deacon,
	Boris Ostrovsky, Marek Szyprowski, Stefano Stabellini,
	Jonathan Corbet, Christoph Hellwig, xen-devel, Paul E. McKenney,
	Konrad Rzeszutek Wilk, Muchun Song, Thomas Gleixner,
	Juergen Gross, Thomas Bogendoerfer, Andrii Anisov, linuxppc-dev,
	Randy Dunlap, linux-mips, iommu, Roman Skakun, Andrew Morton,
	Lu Baolu, Robin Murphy, Mike Rapoport, Maciej W. Rozycki
In-Reply-To: <CADu_u-Ou08tMFm5xU871ae8ct+2YOuvn4rQ=83CMTbg2bx87Pg@mail.gmail.com>

On 15.09.2021 15:37, Roman Skakun wrote:
>>> From: Roman Skakun <roman_skakun@epam.com>
>>>
>>> It is possible when default IO TLB size is not
>>> enough to fit a long buffers as described here [1].
>>>
>>> This patch makes a way to set this parameter
>>> using cmdline instead of recompiling a kernel.
>>>
>>> [1] https://www.xilinx.com/support/answers/72694.html
>>
>>  I'm not convinced the swiotlb use describe there falls under "intended
>>  use" - mapping a 1280x720 framebuffer in a single chunk?
> 
> I had the same issue while mapping DMA chuck ~4MB for gem fb when
> using xen vdispl.
> I got the next log:
> [ 142.030421] rcar-fcp fea2f000.fcp: swiotlb buffer is full (sz:
> 3686400 bytes), total 32768 (slots), used 32 (slots)
> 
> It happened when I tried to map bounce buffer, which has a large size.
> The default size if 128(IO_TLB_SEGSIZE) * 2048(IO_TLB_SHIFT) = 262144
> bytes, but we requested 3686400 bytes.
> When I change IO_TLB_SEGSIZE to 2048. (2048(IO_TLB_SEGSIZE)  *
> 2048(IO_TLB_SHIFT) = 4194304bytes).
> It makes possible to retrieve a bounce buffer for requested size.
> After changing this value, the problem is gone.

But the question remains: Why does the framebuffer need to be mapped
in a single giant chunk?

>>  In order to be sure to catch all uses like this one (including ones
>>  which make it upstream in parallel to yours), I think you will want
>>  to rename the original IO_TLB_SEGSIZE to e.g. IO_TLB_DEFAULT_SEGSIZE.
> 
> I don't understand your point. Can you clarify this?

There's a concrete present example: I have a patch pending adding
another use of IO_TLB_SEGSIZE. This use would need to be replaced
like you do here in several places. The need for the additional
replacement would be quite obvious (from a build failure) if you
renamed the manifest constant. Without renaming, it'll take
someone running into an issue on a live system, which I consider
far worse. This is because a simple re-basing of one of the
patches on top of the other will not point out the need for the
extra replacement, nor would a test build (with both patches in
place).

Jan


^ permalink raw reply

* Re: [PATCH] swiotlb: set IO TLB segment size via cmdline
From: Roman Skakun @ 2021-09-15 13:37 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Roman Skakun, linux-doc, Peter Zijlstra, Viresh Kumar,
	Linux Kernel Mailing List, Paul Mackerras, Will Deacon,
	Boris Ostrovsky, Marek Szyprowski, Stefano Stabellini,
	Jonathan Corbet, Christoph Hellwig, xen-devel, Paul E. McKenney,
	Konrad Rzeszutek Wilk, Muchun Song, Thomas Gleixner,
	Juergen Gross, Thomas Bogendoerfer, Andrii Anisov, linuxppc-dev,
	Randy Dunlap, linux-mips, iommu, Roman Skakun, Andrew Morton,
	Lu Baolu, Robin Murphy, Mike Rapoport, Maciej W. Rozycki
In-Reply-To: <7c04db79-7de1-93ff-0908-9bad60a287b9@suse.com>

Hi Jan,

Thanks for the answer.

>> From: Roman Skakun <roman_skakun@epam.com>
>>
>> It is possible when default IO TLB size is not
>> enough to fit a long buffers as described here [1].
>>
>> This patch makes a way to set this parameter
>> using cmdline instead of recompiling a kernel.
>>
>> [1] https://www.xilinx.com/support/answers/72694.html
>
>  I'm not convinced the swiotlb use describe there falls under "intended
>  use" - mapping a 1280x720 framebuffer in a single chunk?

I had the same issue while mapping DMA chuck ~4MB for gem fb when
using xen vdispl.
I got the next log:
[ 142.030421] rcar-fcp fea2f000.fcp: swiotlb buffer is full (sz:
3686400 bytes), total 32768 (slots), used 32 (slots)

It happened when I tried to map bounce buffer, which has a large size.
The default size if 128(IO_TLB_SEGSIZE) * 2048(IO_TLB_SHIFT) = 262144
bytes, but we requested 3686400 bytes.
When I change IO_TLB_SEGSIZE to 2048. (2048(IO_TLB_SEGSIZE)  *
2048(IO_TLB_SHIFT) = 4194304bytes).
It makes possible to retrieve a bounce buffer for requested size.
After changing this value, the problem is gone.

>  the bottom of this page is also confusing, as following "Then we can
>  confirm the modified swiotlb size in the boot log:" there is a log
>  fragment showing the same original size of 64Mb.

I suspect, this is a mistake in the article.
According to https://elixir.bootlin.com/linux/v5.14.4/source/kernel/dma/swiotlb.c#L214
and
https://elixir.bootlin.com/linux/v5.15-rc1/source/kernel/dma/swiotlb.c#L182
The IO_TLB_SEGSIZE is not used to calculate total size of swiotlb area.
This explains why we have the same total size before and after changing of
TLB segment size.

>  In order to be sure to catch all uses like this one (including ones
>  which make it upstream in parallel to yours), I think you will want
>  to rename the original IO_TLB_SEGSIZE to e.g. IO_TLB_DEFAULT_SEGSIZE.

I don't understand your point. Can you clarify this?

>> +     /* get max IO TLB segment size */
>> +     if (isdigit(*str)) {
>> +             tmp = simple_strtoul(str, &str, 0);
>> +             if (tmp)
>> +                     io_tlb_seg_size = ALIGN(tmp, IO_TLB_SEGSIZE);
>
> From all I can tell io_tlb_seg_size wants to be a power of 2. Merely
> aligning to a multiple of IO_TLB_SEGSIZE isn't going to be enough.

Yes, right, thanks!

Cheers,
Roman.

вт, 14 сент. 2021 г. в 18:29, Jan Beulich <jbeulich@suse.com>:
>
> On 14.09.2021 17:10, Roman Skakun wrote:
> > From: Roman Skakun <roman_skakun@epam.com>
> >
> > It is possible when default IO TLB size is not
> > enough to fit a long buffers as described here [1].
> >
> > This patch makes a way to set this parameter
> > using cmdline instead of recompiling a kernel.
> >
> > [1] https://www.xilinx.com/support/answers/72694.html
>
> I'm not convinced the swiotlb use describe there falls under "intended
> use" - mapping a 1280x720 framebuffer in a single chunk? (As an aside,
> the bottom of this page is also confusing, as following "Then we can
> confirm the modified swiotlb size in the boot log:" there is a log
> fragment showing the same original size of 64Mb.
>
> > --- a/arch/mips/cavium-octeon/dma-octeon.c
> > +++ b/arch/mips/cavium-octeon/dma-octeon.c
> > @@ -237,7 +237,7 @@ void __init plat_swiotlb_setup(void)
> >               swiotlbsize = 64 * (1<<20);
> >  #endif
> >       swiotlb_nslabs = swiotlbsize >> IO_TLB_SHIFT;
> > -     swiotlb_nslabs = ALIGN(swiotlb_nslabs, IO_TLB_SEGSIZE);
> > +     swiotlb_nslabs = ALIGN(swiotlb_nslabs, swiotlb_io_seg_size());
>
> In order to be sure to catch all uses like this one (including ones
> which make it upstream in parallel to yours), I think you will want
> to rename the original IO_TLB_SEGSIZE to e.g. IO_TLB_DEFAULT_SEGSIZE.
>
> > @@ -81,15 +86,30 @@ static unsigned int max_segment;
> >  static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
> >
> >  static int __init
> > -setup_io_tlb_npages(char *str)
> > +setup_io_tlb_params(char *str)
> >  {
> > +     unsigned long tmp;
> > +
> >       if (isdigit(*str)) {
> > -             /* avoid tail segment of size < IO_TLB_SEGSIZE */
> > -             default_nslabs =
> > -                     ALIGN(simple_strtoul(str, &str, 0), IO_TLB_SEGSIZE);
> > +             default_nslabs = simple_strtoul(str, &str, 0);
> >       }
> >       if (*str == ',')
> >               ++str;
> > +
> > +     /* get max IO TLB segment size */
> > +     if (isdigit(*str)) {
> > +             tmp = simple_strtoul(str, &str, 0);
> > +             if (tmp)
> > +                     io_tlb_seg_size = ALIGN(tmp, IO_TLB_SEGSIZE);
>
> From all I can tell io_tlb_seg_size wants to be a power of 2. Merely
> aligning to a multiple of IO_TLB_SEGSIZE isn't going to be enough.
>
> Jan
>


-- 
Best Regards, Roman.

^ permalink raw reply

* [PATCH] video: fbdev: use memset_io() instead of memset()
From: Christophe Leroy @ 2021-09-15 13:34 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Andrew Morton
  Cc: linux-fbdev, Finn Thain, Stan Johnson, linux-kernel, dri-devel,
	linuxppc-dev

While investigating a lockup at startup on Powerbook 3400C, it was
identified that the fbdev driver generates alignment exception at
startup:

	--- interrupt: 600 at memset+0x60/0xc0
	NIP:  c0021414 LR: c03fc49c CTR: 00007fff
	REGS: ca021c10 TRAP: 0600   Tainted: G        W          (5.14.2-pmac-00727-g12a41fa69492)
	MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 44008442  XER: 20000100
	DAR: cab80020 DSISR: 00017c07
	GPR00: 00000007 ca021cd0 c14412e0 cab80000 00000000 00100000 cab8001c 00000004
	GPR08: 00100000 00007fff 00000000 00000000 84008442 00000000 c0006fb4 00000000
	GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00100000
	GPR24: 00000000 81800000 00000320 c15fa400 c14d1878 00000000 c14d1800 c094e19c
	NIP [c0021414] memset+0x60/0xc0
	LR [c03fc49c] chipsfb_pci_init+0x160/0x580
	--- interrupt: 600
	[ca021cd0] [c03fc46c] chipsfb_pci_init+0x130/0x580 (unreliable)
	[ca021d20] [c03a3a70] pci_device_probe+0xf8/0x1b8
	[ca021d50] [c043d584] really_probe.part.0+0xac/0x388
	[ca021d70] [c043d914] __driver_probe_device+0xb4/0x170
	[ca021d90] [c043da18] driver_probe_device+0x48/0x144
	[ca021dc0] [c043e318] __driver_attach+0x11c/0x1c4
	[ca021de0] [c043ad30] bus_for_each_dev+0x88/0xf0
	[ca021e10] [c043c724] bus_add_driver+0x190/0x22c
	[ca021e40] [c043ee94] driver_register+0x9c/0x170
	[ca021e60] [c0006c28] do_one_initcall+0x54/0x1ec
	[ca021ed0] [c08246e4] kernel_init_freeable+0x1c0/0x270
	[ca021f10] [c0006fdc] kernel_init+0x28/0x11c
	[ca021f30] [c0017148] ret_from_kernel_thread+0x14/0x1c
	Instruction dump:
	7d4601a4 39490777 7d4701a4 39490888 7d4801a4 39490999 7d4901a4 39290aaa
	7d2a01a4 4c00012c 4bfffe88 0fe00000 <4bfffe80> 9421fff0 38210010 48001970

This is due to 'dcbz' instruction being used on non-cached memory.
'dcbz' instruction is used by memset() to zeroize a complete
cacheline at once, and memset() is not expected to be used on non
cached memory.

When performing a 'sparse' check on fbdev driver, it also appears
that the use of memset() is unexpected:

	drivers/video/fbdev/chipsfb.c:334:17: warning: incorrect type in argument 1 (different address spaces)
	drivers/video/fbdev/chipsfb.c:334:17:    expected void *
	drivers/video/fbdev/chipsfb.c:334:17:    got char [noderef] __iomem *screen_base
	drivers/video/fbdev/chipsfb.c:334:15: warning: memset with byte count of 1048576

Use fb_memset() instead of memset(). fb_memset() is defined as
memset_io() for powerpc.

Fixes: 8c8709334cec ("[PATCH] ppc32: Remove CONFIG_PMAC_PBOOK")
Reported-by: Stan Johnson <userm57@yahoo.com>
Cc: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 drivers/video/fbdev/chipsfb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/chipsfb.c b/drivers/video/fbdev/chipsfb.c
index 998067b701fa..393894af26f8 100644
--- a/drivers/video/fbdev/chipsfb.c
+++ b/drivers/video/fbdev/chipsfb.c
@@ -331,7 +331,7 @@ static const struct fb_var_screeninfo chipsfb_var = {
 
 static void init_chips(struct fb_info *p, unsigned long addr)
 {
-	memset(p->screen_base, 0, 0x100000);
+	fb_memset(p->screen_base, 0, 0x100000);
 
 	p->fix = chipsfb_fix;
 	p->fix.smem_start = addr;
-- 
2.31.1


^ permalink raw reply related

* [PATCH] serial: 8250: SERIAL_8250_FSL should not default to y when compile-testing
From: Geert Uytterhoeven @ 2021-09-15 12:56 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby, Johan Hovold, Shawn Guo, Li Yang,
	Scott Wood
  Cc: Geert Uytterhoeven, linuxppc-dev, linux-arm-kernel, linux-serial,
	linux-kernel

Commit b1442c55ce8977aa ("serial: 8250: extend compile-test coverage")
added compile-test support to the Freescale 16550 driver.  However, as
SERIAL_8250_FSL is an invisible symbol, merely enabling COMPILE_TEST now
enables this driver.

Fix this by making SERIAL_8250_FSL visible.  Tighten the dependencies to
prevent asking the user about this driver when configuring a kernel
without appropriate Freescale SoC or ACPI support.

Fixes: b1442c55ce8977aa ("serial: 8250: extend compile-test coverage")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
Yes, it's ugly, but I see no better solution. Do you?

 drivers/tty/serial/8250/Kconfig | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/serial/8250/Kconfig b/drivers/tty/serial/8250/Kconfig
index 808268edd2e82a45..a2978b31144e94f2 100644
--- a/drivers/tty/serial/8250/Kconfig
+++ b/drivers/tty/serial/8250/Kconfig
@@ -361,9 +361,13 @@ config SERIAL_8250_BCM2835AUX
 	  If unsure, say N.
 
 config SERIAL_8250_FSL
-	bool
+	bool "Freescale 16550-style UART support (8250 based driver)"
 	depends on SERIAL_8250_CONSOLE
-	default PPC || ARM || ARM64 || COMPILE_TEST
+	depends on FSL_SOC || ARCH_LAYERSCAPE || SOC_LS1021A || (ARM64 && ACPI) || COMPILE_TEST
+	default FSL_SOC || ARCH_LAYERSCAPE || SOC_LS1021A || (ARM64 && ACPI)
+	help
+	  Selecting this option will add support for the 16550-style serial
+	  port hardware found on Freescale SoCs.
 
 config SERIAL_8250_DW
 	tristate "Support for Synopsys DesignWare 8250 quirks"
-- 
2.25.1


^ permalink raw reply related

* Re: [PATCH] pci: Rename pcibios_add_device to match
From: Niklas Schnelle @ 2021-09-15 10:13 UTC (permalink / raw)
  To: Oliver O'Halloran, Michal Simek, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras, Gerald Schaefer,
	Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
	David S. Miller, Bjorn Helgaas, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin
  Cc: sparclinux, linux-s390, linuxppc-dev, linux-kernel, linux-pci
In-Reply-To: <20210913152709.48013-1-oohall@gmail.com>

On Tue, 2021-09-14 at 01:27 +1000, Oliver O'Halloran wrote:
> The general convention for pcibios_* hooks is that they're named after
> the corresponding pci_* function they provide a hook for. The exception
> is pcibios_add_device() which provides a hook for pci_device_add(). This
> has been irritating me for years so rename it.
> 
> Also, remove the export of the microblaze version. The only caller
> must be compiled as a built-in so there's no reason for the export.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/microblaze/pci/pci-common.c           | 3 +--
>  arch/powerpc/kernel/pci-common.c           | 2 +-
>  arch/powerpc/platforms/powernv/pci-sriov.c | 2 +-
>  arch/s390/pci/pci.c                        | 2 +-
>  arch/sparc/kernel/pci.c                    | 2 +-
>  arch/x86/pci/common.c                      | 2 +-
>  drivers/pci/pci.c                          | 4 ++--
>  drivers/pci/probe.c                        | 4 ++--
>  include/linux/pci.h                        | 2 +-
>  9 files changed, 11 insertions(+), 12 deletions(-)
> 
> 
.. snip ..
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index e7e6788d75a8..ded3321b7208 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -561,7 +561,7 @@ static void zpci_cleanup_bus_resources(struct zpci_dev *zdev)
>  	zdev->has_resources = 0;
>  }
>  
> -int pcibios_add_device(struct pci_dev *pdev)
> +int pcibios_device_add(struct pci_dev *pdev)
>  {
>  	struct zpci_dev *zdev = to_zpci(pdev);
>  	struct resource *res;
> 
.. snip ..
>


I agree with your assesment this is indeed confusing. Interestingly
pcibios_release_device() also doesn't follow the convention exactly as
it is called from pci_release_dev() but at least that isn't very
confusing.

So for the arch/s390/pci bit:

Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>


^ permalink raw reply

* Re: [PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()
From: Borislav Petkov @ 2021-09-15 10:08 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh, kvm,
	dri-devel, platform-driver-x86, Paul Mackerras, linux-s390,
	Andi Kleen, Joerg Roedel, x86, amd-gfx, Christoph Hellwig,
	linux-graphics-maintainer, Tom Lendacky, Tianyu Lan, kexec,
	linux-kernel, iommu, linux-fsdevel, linuxppc-dev
In-Reply-To: <87lf3yk7g4.fsf@mpe.ellerman.id.au>

On Wed, Sep 15, 2021 at 10:28:59AM +1000, Michael Ellerman wrote:
> I don't love it, a new C file and an out-of-line call to then call back
> to a static inline that for most configuration will return false ... but
> whatever :)

Yeah, hch thinks it'll cause a big mess otherwise:

https://lore.kernel.org/lkml/YSScWvpXeVXw%2Fed5@infradead.org/

I guess less ifdeffery is nice too.

> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)

Thx.

> Yeah, fixed in mainline today, thanks for trying to cross compile :)

Always!

:-)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply

* Re: [PATCH trivial v2] powerpc/powernv/dump: Fix typo in comment
From: Joel Stanley @ 2021-09-15  8:19 UTC (permalink / raw)
  To: Vasant Hegde; +Cc: linuxppc-dev
In-Reply-To: <20210914143802.54325-1-hegdevasant@linux.vnet.ibm.com>

On Tue, 14 Sept 2021 at 14:38, Vasant Hegde
<hegdevasant@linux.vnet.ibm.com> wrote:
>
> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>

Reviewed-by: Joel Stanley <joel@jms.id.au>

> ---
>  arch/powerpc/platforms/powernv/opal-dump.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/powernv/opal-dump.c b/arch/powerpc/platforms/powernv/opal-dump.c
> index 00c5a59d82d9..717d1d30ade5 100644
> --- a/arch/powerpc/platforms/powernv/opal-dump.c
> +++ b/arch/powerpc/platforms/powernv/opal-dump.c
> @@ -419,7 +419,7 @@ void __init opal_platform_dump_init(void)
>         int rc;
>         int dump_irq;
>
> -       /* ELOG not supported by firmware */
> +       /* Dump not supported by firmware */
>         if (!opal_check_token(OPAL_DUMP_READ))
>                 return;
>
> --
> 2.31.1
>

^ permalink raw reply

* Re: [PATCH] powerpc/powernv/flash: Check OPAL flash calls exist before using
From: Vasant Hegde @ 2021-09-15  6:55 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev
In-Reply-To: <87ilz2jr1f.fsf@mpe.ellerman.id.au>

On 9/15/21 11:53 AM, Michael Ellerman wrote:
> Vasant Hegde <hegdevasant@linux.vnet.ibm.com> writes:
>> Currently only FSP based powernv systems supports firmware update
>> interfaces. Hence check that the token OPAL_FLASH_VALIDATE exists
>> before initalising the flash driver.
>>
>> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
>> ---
>>   arch/powerpc/platforms/powernv/opal-flash.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/powerpc/platforms/powernv/opal-flash.c b/arch/powerpc/platforms/powernv/opal-flash.c
>> index 7e7d38b17420..05490fc22fae 100644
>> --- a/arch/powerpc/platforms/powernv/opal-flash.c
>> +++ b/arch/powerpc/platforms/powernv/opal-flash.c
>> @@ -520,6 +520,10 @@ void __init opal_flash_update_init(void)
>>   {
>>   	int ret;
>>   
>> +	/* Firmware update is not supported by firmware */
>> +	if (!opal_check_token(OPAL_FLASH_VALIDATE))
>> +		return;
>> +
> 

Michael,

> That will mean the following files no longer appear on BMC systems:
> 
>    /sys/firmware/opal/image
>    /sys/firmware/opal/validate_flash
>    /sys/firmware/opal/manage_flash
>    /sys/firmware/opal/update_flash
> 
> Presumably those files don't actually work correctly, but are we sure
> their mere existence isn't used by anything at all?

That's correct. We never used these files/interfaces on BMC based systems.

> 
> We've had trouble in the past where removing sysfs files breaks tools
> unexpectedly, see smt_snooze_delay.

AFAIK only update_flash uses these interfaces on baremetal systems.
This change shouldn't break update_flash as these interfaces never used/worked
on BMC based powernv systems.

-Vasant


^ permalink raw reply

* Re: [PATCH] powerpc/powernv/flash: Check OPAL flash calls exist before using
From: Vasant Hegde @ 2021-09-15  6:55 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev
In-Reply-To: <87ilz2jr1f.fsf@mpe.ellerman.id.au>

On 9/15/21 11:53 AM, Michael Ellerman wrote:
> Vasant Hegde <hegdevasant@linux.vnet.ibm.com> writes:
>> Currently only FSP based powernv systems supports firmware update
>> interfaces. Hence check that the token OPAL_FLASH_VALIDATE exists
>> before initalising the flash driver.
>>
>> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
>> ---
>>   arch/powerpc/platforms/powernv/opal-flash.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/powerpc/platforms/powernv/opal-flash.c b/arch/powerpc/platforms/powernv/opal-flash.c
>> index 7e7d38b17420..05490fc22fae 100644
>> --- a/arch/powerpc/platforms/powernv/opal-flash.c
>> +++ b/arch/powerpc/platforms/powernv/opal-flash.c
>> @@ -520,6 +520,10 @@ void __init opal_flash_update_init(void)
>>   {
>>   	int ret;
>>   
>> +	/* Firmware update is not supported by firmware */
>> +	if (!opal_check_token(OPAL_FLASH_VALIDATE))
>> +		return;
>> +
> 

Michael,

> That will mean the following files no longer appear on BMC systems:
> 
>    /sys/firmware/opal/image
>    /sys/firmware/opal/validate_flash
>    /sys/firmware/opal/manage_flash
>    /sys/firmware/opal/update_flash
> 
> Presumably those files don't actually work correctly, but are we sure
> their mere existence isn't used by anything at all?

That's correct. We never used these files/interfaces on BMC based systems.

> 
> We've had trouble in the past where removing sysfs files breaks tools
> unexpectedly, see smt_snooze_delay.

AFAIK only update_flash uses these interfaces on baremetal systems.
This change shouldn't break update_flash as these interfaces never used/worked
on BMC based powernv systems.

-Vasant


^ permalink raw reply

* Re: [PATCH] powerpc/powernv/flash: Check OPAL flash calls exist before using
From: Michael Ellerman @ 2021-09-15  6:23 UTC (permalink / raw)
  To: Vasant Hegde, linuxppc-dev; +Cc: Vasant Hegde
In-Reply-To: <20210914101630.30613-1-hegdevasant@linux.vnet.ibm.com>

Vasant Hegde <hegdevasant@linux.vnet.ibm.com> writes:
> Currently only FSP based powernv systems supports firmware update
> interfaces. Hence check that the token OPAL_FLASH_VALIDATE exists
> before initalising the flash driver.
>
> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
> ---
>  arch/powerpc/platforms/powernv/opal-flash.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/powerpc/platforms/powernv/opal-flash.c b/arch/powerpc/platforms/powernv/opal-flash.c
> index 7e7d38b17420..05490fc22fae 100644
> --- a/arch/powerpc/platforms/powernv/opal-flash.c
> +++ b/arch/powerpc/platforms/powernv/opal-flash.c
> @@ -520,6 +520,10 @@ void __init opal_flash_update_init(void)
>  {
>  	int ret;
>  
> +	/* Firmware update is not supported by firmware */
> +	if (!opal_check_token(OPAL_FLASH_VALIDATE))
> +		return;
> +

That will mean the following files no longer appear on BMC systems:

  /sys/firmware/opal/image
  /sys/firmware/opal/validate_flash
  /sys/firmware/opal/manage_flash
  /sys/firmware/opal/update_flash

Presumably those files don't actually work correctly, but are we sure
their mere existence isn't used by anything at all?

We've had trouble in the past where removing sysfs files breaks tools
unexpectedly, see smt_snooze_delay.

cheers

^ permalink raw reply

* Re: [PATCH v6 4/4] powerpc/64s: Initialize and use a temporary mm for patching on Radix
From: Jordan Niethe @ 2021-09-15  4:24 UTC (permalink / raw)
  To: Christopher M. Riedl; +Cc: linuxppc-dev, linux-hardening
In-Reply-To: <20210911022904.30962-5-cmr@bluescreens.de>

On Sat, Sep 11, 2021 at 12:39 PM Christopher M. Riedl
<cmr@bluescreens.de> wrote:
>
> When code patching a STRICT_KERNEL_RWX kernel the page containing the
> address to be patched is temporarily mapped as writeable. Currently, a
> per-cpu vmalloc patch area is used for this purpose. While the patch
> area is per-cpu, the temporary page mapping is inserted into the kernel
> page tables for the duration of patching. The mapping is exposed to CPUs
> other than the patching CPU - this is undesirable from a hardening
> perspective. Use a temporary mm instead which keeps the mapping local to
> the CPU doing the patching.
>
> Use the `poking_init` init hook to prepare a temporary mm and patching
> address. Initialize the temporary mm by copying the init mm. Choose a
> randomized patching address inside the temporary mm userspace address
> space. The patching address is randomized between PAGE_SIZE and
> DEFAULT_MAP_WINDOW-PAGE_SIZE.
>
> Bits of entropy with 64K page size on BOOK3S_64:
>
>         bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 / PAGE_SIZE)
>
>         PAGE_SIZE=64K, DEFAULT_MAP_WINDOW_USER64=128TB
>         bits of entropy = log2(128TB / 64K)
>         bits of entropy = 31
>
> The upper limit is DEFAULT_MAP_WINDOW due to how the Book3s64 Hash MMU
> operates - by default the space above DEFAULT_MAP_WINDOW is not
> available. Currently the Hash MMU does not use a temporary mm so
> technically this upper limit isn't necessary; however, a larger
> randomization range does not further "harden" this overall approach and
> future work may introduce patching with a temporary mm on Hash as well.
>
> Randomization occurs only once during initialization at boot for each
> possible CPU in the system.
>
> Introduce two new functions, map_patch_mm() and unmap_patch_mm(), to
> respectively create and remove the temporary mapping with write
> permissions at patching_addr. Map the page with PAGE_KERNEL to set
> EAA[0] for the PTE which ignores the AMR (so no need to unlock/lock
> KUAP) according to PowerISA v3.0b Figure 35 on Radix.
>
> Based on x86 implementation:
>
> commit 4fc19708b165
> ("x86/alternatives: Initialize temporary mm for patching")
>
> and:
>
> commit b3fd8e83ada0
> ("x86/alternatives: Use temporary mm for text poking")
>
> Signed-off-by: Christopher M. Riedl <cmr@bluescreens.de>
>
> ---
>
> v6:  * Small clean-ups (naming, formatting, style, etc).
>      * Call stop_using_temporary_mm() before pte_unmap_unlock() after
>        patching.
>      * Replace BUG_ON()s in poking_init() w/ WARN_ON()s.
>
> v5:  * Only support Book3s64 Radix MMU for now.
>      * Use a per-cpu datastructure to hold the patching_addr and
>        patching_mm to avoid the need for a synchronization lock/mutex.
>
> v4:  * In the previous series this was two separate patches: one to init
>        the temporary mm in poking_init() (unused in powerpc at the time)
>        and the other to use it for patching (which removed all the
>        per-cpu vmalloc code). Now that we use poking_init() in the
>        existing per-cpu vmalloc approach, that separation doesn't work
>        as nicely anymore so I just merged the two patches into one.
>      * Preload the SLB entry and hash the page for the patching_addr
>        when using Hash on book3s64 to avoid taking an SLB and Hash fault
>        during patching. The previous implementation was a hack which
>        changed current->mm to allow the SLB and Hash fault handlers to
>        work with the temporary mm since both of those code-paths always
>        assume mm == current->mm.
>      * Also (hmm - seeing a trend here) with the book3s64 Hash MMU we
>        have to manage the mm->context.active_cpus counter and mm cpumask
>        since they determine (via mm_is_thread_local()) if the TLB flush
>        in pte_clear() is local or not - it should always be local when
>        we're using the temporary mm. On book3s64's Radix MMU we can
>        just call local_flush_tlb_mm().
>      * Use HPTE_USE_KERNEL_KEY on Hash to avoid costly lock/unlock of
>        KUAP.
> ---
>  arch/powerpc/lib/code-patching.c | 119 +++++++++++++++++++++++++++++--
>  1 file changed, 112 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index e802e42c2789..af8e2a02a9dd 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -11,6 +11,7 @@
>  #include <linux/cpuhotplug.h>
>  #include <linux/slab.h>
>  #include <linux/uaccess.h>
> +#include <linux/random.h>
>
>  #include <asm/tlbflush.h>
>  #include <asm/page.h>
> @@ -103,6 +104,7 @@ static inline void stop_using_temporary_mm(struct temp_mm *temp_mm)
>
>  static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
>  static DEFINE_PER_CPU(unsigned long, cpu_patching_addr);
> +static DEFINE_PER_CPU(struct mm_struct *, cpu_patching_mm);
>
>  static int text_area_cpu_up(unsigned int cpu)
>  {
> @@ -126,8 +128,48 @@ static int text_area_cpu_down(unsigned int cpu)
>         return 0;
>  }
>
> +static __always_inline void __poking_init_temp_mm(void)
> +{
> +       int cpu;
> +       spinlock_t *ptl; /* for protecting pte table */
> +       pte_t *ptep;
> +       struct mm_struct *patching_mm;
> +       unsigned long patching_addr;
> +
> +       for_each_possible_cpu(cpu) {
> +               patching_mm = copy_init_mm();
> +               WARN_ON(!patching_mm);
> +               per_cpu(cpu_patching_mm, cpu) = patching_mm;
> +
> +               /*
> +                * Choose a randomized, page-aligned address from the range:
> +                * [PAGE_SIZE, DEFAULT_MAP_WINDOW - PAGE_SIZE] The lower
> +                * address bound is PAGE_SIZE to avoid the zero-page.  The
> +                * upper address bound is DEFAULT_MAP_WINDOW - PAGE_SIZE to
> +                * stay under DEFAULT_MAP_WINDOW with the Book3s64 Hash MMU.
> +                */
> +               patching_addr = PAGE_SIZE + ((get_random_long() & PAGE_MASK)
> +                               % (DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE));
> +               per_cpu(cpu_patching_addr, cpu) = patching_addr;
> +
> +               /*
> +                * PTE allocation uses GFP_KERNEL which means we need to
> +                * pre-allocate the PTE here because we cannot do the
> +                * allocation during patching when IRQs are disabled.
> +                */
> +               ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
> +               WARN_ON(!ptep);
> +               pte_unmap_unlock(ptep, ptl);
> +       }
> +}
> +
>  void __init poking_init(void)
>  {
> +       if (radix_enabled()) {
> +               __poking_init_temp_mm();
> +               return;
> +       }
> +
>         WARN_ON(cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
>                 "powerpc/text_poke:online", text_area_cpu_up,
>                 text_area_cpu_down) < 0);
> @@ -197,30 +239,93 @@ static inline int unmap_patch_area(void)
>         return 0;
>  }
>
> +struct patch_mapping {
> +       spinlock_t *ptl; /* for protecting pte table */
> +       pte_t *ptep;
> +       struct temp_mm temp_mm;
> +};
> +
> +/*
> + * This can be called for kernel text or a module.
> + */
> +static int map_patch_mm(const void *addr, struct patch_mapping *patch_mapping)
> +{
> +       struct page *page;
> +       struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
> +       unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);
> +
> +       if (is_vmalloc_or_module_addr(addr))
> +               page = vmalloc_to_page(addr);
> +       else
> +               page = virt_to_page(addr);
> +
> +       patch_mapping->ptep = get_locked_pte(patching_mm, patching_addr,
> +                                            &patch_mapping->ptl);
> +       if (unlikely(!patch_mapping->ptep)) {
> +               pr_warn("map patch: failed to allocate pte for patching\n");
> +               return -1;
> +       }
> +
> +       set_pte_at(patching_mm, patching_addr, patch_mapping->ptep,
> +                  pte_mkdirty(mk_pte(page, PAGE_KERNEL)));

I think because switch_mm_irqs_off() will  not necessarily have a
barrier so a ptesync would be needed.
A spurious fault here from __patch_instruction() would not be handled correctly.

> +
> +       init_temp_mm(&patch_mapping->temp_mm, patching_mm);
> +       start_using_temporary_mm(&patch_mapping->temp_mm);
> +
> +       return 0;
> +}
> +
> +static int unmap_patch_mm(struct patch_mapping *patch_mapping)
> +{
> +       struct mm_struct *patching_mm = __this_cpu_read(cpu_patching_mm);
> +       unsigned long patching_addr = __this_cpu_read(cpu_patching_addr);
> +
> +       pte_clear(patching_mm, patching_addr, patch_mapping->ptep);
> +
> +       local_flush_tlb_mm(patching_mm);
> +       stop_using_temporary_mm(&patch_mapping->temp_mm);
> +
> +       pte_unmap_unlock(patch_mapping->ptep, patch_mapping->ptl);
> +
> +       return 0;
> +}
> +
>  static int do_patch_instruction(u32 *addr, struct ppc_inst instr)
>  {
>         int err, rc = 0;
>         u32 *patch_addr = NULL;
>         unsigned long flags;
> +       struct patch_mapping patch_mapping;
>
>         /*
> -        * During early early boot patch_instruction is called
> -        * when text_poke_area is not ready, but we still need
> -        * to allow patching. We just do the plain old patching
> +        * During early early boot patch_instruction is called when the
> +        * patching_mm/text_poke_area is not ready, but we still need to allow
> +        * patching. We just do the plain old patching.
>          */
> -       if (!this_cpu_read(text_poke_area))
> -               return raw_patch_instruction(addr, instr);
> +       if (radix_enabled()) {
> +               if (!this_cpu_read(cpu_patching_mm))
> +                       return raw_patch_instruction(addr, instr);
> +       } else {
> +               if (!this_cpu_read(text_poke_area))
> +                       return raw_patch_instruction(addr, instr);
> +       }
>
>         local_irq_save(flags);
>
> -       err = map_patch_area(addr);
> +       if (radix_enabled())
> +               err = map_patch_mm(addr, &patch_mapping);
> +       else
> +               err = map_patch_area(addr);
>         if (err)
>                 goto out;
>
>         patch_addr = (u32 *)(__this_cpu_read(cpu_patching_addr) | offset_in_page(addr));
>         rc = __patch_instruction(addr, instr, patch_addr);
>
> -       err = unmap_patch_area();
> +       if (radix_enabled())
> +               err = unmap_patch_mm(&patch_mapping);
> +       else
> +               err = unmap_patch_area();
>
>  out:
>         local_irq_restore(flags);
> --
> 2.32.0
>

^ permalink raw reply

* Re: [RESEND PATCH v4 1/4] drivers/nvdimm: Add nvdimm pmu structure
From: Dan Williams @ 2021-09-15  4:11 UTC (permalink / raw)
  To: kajoljain
  Cc: Linux NVDIMM, Santosh Sivaraj, maddy, Weiny, Ira, rnsastry,
	Peter Zijlstra, Linux Kernel Mailing List, atrajeev,
	Aneesh Kumar K.V, Vishal L Verma, Vaibhav Jain, Thomas Gleixner,
	linuxppc-dev
In-Reply-To: <CAPcyv4hE4rh5R+8zy3X4gDJeuPzQ0oQHmHbe_pppgWB2_RjfAg@mail.gmail.com>

On Tue, Sep 14, 2021 at 9:08 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Thu, Sep 9, 2021 at 12:56 AM kajoljain <kjain@linux.ibm.com> wrote:
> >
> >
> >
> > On 9/8/21 3:29 AM, Dan Williams wrote:
> > > Hi Kajol,
> > >
> > > Apologies for the delay in responding to this series, some comments below:
> >
> > Hi Dan,
> >     No issues, thanks for reviewing the patches.
> >
> > >
> > > On Thu, Sep 2, 2021 at 10:10 PM Kajol Jain <kjain@linux.ibm.com> wrote:
> > >>
> > >> A structure is added, called nvdimm_pmu, for performance
> > >> stats reporting support of nvdimm devices. It can be used to add
> > >> nvdimm pmu data such as supported events and pmu event functions
> > >> like event_init/add/read/del with cpu hotplug support.
> > >>
> > >> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > >> Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
> > >> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
> > >> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
> > >> ---
> > >>  include/linux/nd.h | 43 +++++++++++++++++++++++++++++++++++++++++++
> > >>  1 file changed, 43 insertions(+)
> > >>
> > >> diff --git a/include/linux/nd.h b/include/linux/nd.h
> > >> index ee9ad76afbba..712499cf7335 100644
> > >> --- a/include/linux/nd.h
> > >> +++ b/include/linux/nd.h
> > >> @@ -8,6 +8,8 @@
> > >>  #include <linux/ndctl.h>
> > >>  #include <linux/device.h>
> > >>  #include <linux/badblocks.h>
> > >> +#include <linux/platform_device.h>
> > >> +#include <linux/perf_event.h>
> > >>
> > >>  enum nvdimm_event {
> > >>         NVDIMM_REVALIDATE_POISON,
> > >> @@ -23,6 +25,47 @@ enum nvdimm_claim_class {
> > >>         NVDIMM_CCLASS_UNKNOWN,
> > >>  };
> > >>
> > >> +/* Event attribute array index */
> > >> +#define NVDIMM_PMU_FORMAT_ATTR         0
> > >> +#define NVDIMM_PMU_EVENT_ATTR          1
> > >> +#define NVDIMM_PMU_CPUMASK_ATTR                2
> > >> +#define NVDIMM_PMU_NULL_ATTR           3
> > >> +
> > >> +/**
> > >> + * struct nvdimm_pmu - data structure for nvdimm perf driver
> > >> + *
> > >> + * @name: name of the nvdimm pmu device.
> > >> + * @pmu: pmu data structure for nvdimm performance stats.
> > >> + * @dev: nvdimm device pointer.
> > >> + * @functions(event_init/add/del/read): platform specific pmu functions.
> > >
> > > This is not valid kernel-doc:
> > >
> > > include/linux/nd.h:67: warning: Function parameter or member
> > > 'event_init' not described in 'nvdimm_pmu'
> > > include/linux/nd.h:67: warning: Function parameter or member 'add' not
> > > described in 'nvdimm_pmu'
> > > include/linux/nd.h:67: warning: Function parameter or member 'del' not
> > > described in 'nvdimm_pmu'
> > > include/linux/nd.h:67: warning: Function parameter or member 'read'
> > > not described in 'nvdimm_pmu'
> > >
> > > ...but I think rather than fixing those up 'struct nvdimm_pmu' should be pruned.
> > >
> > > It's not clear to me that it is worth the effort to describe these
> > > details to the nvdimm core which is just going to turn around and call
> > > the pmu core. I'd just as soon have the driver call the pmu core
> > > directly, optionally passing in attributes and callbacks that come
> > > from the nvdimm core and/or the nvdimm provider.
> >
> > The intend for adding these callbacks(event_init/add/del/read) is to give
> > flexibility to the nvdimm core to add some common checks/routines if required
> > in the future. Those checks can be common for all architecture with still having the
> > ability to call arch/platform specific driver code to use its own routines.
> >
> > But as you said, currently we don't have any common checks and it directly
> > calling platform specific code, so we can get rid of it.
> > Should we remove this part for now?
>
> Yes, lets go direct to the perf api for now and await the need for a
> common core wrapper to present itself.
>
> >
> >
> > >
> > > Otherwise it's also not clear which of these structure members are
> > > used at runtime vs purely used as temporary storage to pass parameters
> > > to the pmu core.
> > >
> > >> + * @attr_groups: data structure for events, formats and cpumask
> > >> + * @cpu: designated cpu for counter access.
> > >> + * @node: node for cpu hotplug notifier link.
> > >> + * @cpuhp_state: state for cpu hotplug notification.
> > >> + * @arch_cpumask: cpumask to get designated cpu for counter access.
> > >> + */
> > >> +struct nvdimm_pmu {
> > >> +       const char *name;
> > >> +       struct pmu pmu;
> > >> +       struct device *dev;
> > >> +       int (*event_init)(struct perf_event *event);
> > >> +       int  (*add)(struct perf_event *event, int flags);
> > >> +       void (*del)(struct perf_event *event, int flags);
> > >> +       void (*read)(struct perf_event *event);
> > >> +       /*
> > >> +        * Attribute groups for the nvdimm pmu. Index 0 used for
> > >> +        * format attribute, index 1 used for event attribute,
> > >> +        * index 2 used for cpusmask attribute and index 3 kept as NULL.
> > >> +        */
> > >> +       const struct attribute_group *attr_groups[4];
> > >
> > > Following from above, I'd rather this was organized as static
> > > attributes with an is_visible() helper for the groups for any dynamic
> > > aspects. That mirrors the behavior of nvdimm_create() and allows for
> > > device drivers to compose the attribute groups from a core set and /
> > > or a provider specific set.
> >
> > Since we don't have any common events right now, Can I use papr
> > attributes directly or should we create dummy events for common thing and
> > then merged it with papr event list.
>
> Just use papr events directly.

That is to say...I think if another implementation followed it should
try to match as many common event names as papr_scm picked, and
possibly extend with its own rather than start with a papr_scm
specific namespace for everything.

^ permalink raw reply

* Re: [RESEND PATCH v4 1/4] drivers/nvdimm: Add nvdimm pmu structure
From: Dan Williams @ 2021-09-15  4:08 UTC (permalink / raw)
  To: kajoljain
  Cc: Linux NVDIMM, Santosh Sivaraj, maddy, Weiny, Ira, rnsastry,
	Peter Zijlstra, Linux Kernel Mailing List, atrajeev,
	Aneesh Kumar K.V, Vishal L Verma, Vaibhav Jain, Thomas Gleixner,
	linuxppc-dev
In-Reply-To: <d7f8bf51-059f-4496-37c4-6516a703e209@linux.ibm.com>

On Thu, Sep 9, 2021 at 12:56 AM kajoljain <kjain@linux.ibm.com> wrote:
>
>
>
> On 9/8/21 3:29 AM, Dan Williams wrote:
> > Hi Kajol,
> >
> > Apologies for the delay in responding to this series, some comments below:
>
> Hi Dan,
>     No issues, thanks for reviewing the patches.
>
> >
> > On Thu, Sep 2, 2021 at 10:10 PM Kajol Jain <kjain@linux.ibm.com> wrote:
> >>
> >> A structure is added, called nvdimm_pmu, for performance
> >> stats reporting support of nvdimm devices. It can be used to add
> >> nvdimm pmu data such as supported events and pmu event functions
> >> like event_init/add/read/del with cpu hotplug support.
> >>
> >> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> >> Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
> >> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
> >> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
> >> ---
> >>  include/linux/nd.h | 43 +++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 43 insertions(+)
> >>
> >> diff --git a/include/linux/nd.h b/include/linux/nd.h
> >> index ee9ad76afbba..712499cf7335 100644
> >> --- a/include/linux/nd.h
> >> +++ b/include/linux/nd.h
> >> @@ -8,6 +8,8 @@
> >>  #include <linux/ndctl.h>
> >>  #include <linux/device.h>
> >>  #include <linux/badblocks.h>
> >> +#include <linux/platform_device.h>
> >> +#include <linux/perf_event.h>
> >>
> >>  enum nvdimm_event {
> >>         NVDIMM_REVALIDATE_POISON,
> >> @@ -23,6 +25,47 @@ enum nvdimm_claim_class {
> >>         NVDIMM_CCLASS_UNKNOWN,
> >>  };
> >>
> >> +/* Event attribute array index */
> >> +#define NVDIMM_PMU_FORMAT_ATTR         0
> >> +#define NVDIMM_PMU_EVENT_ATTR          1
> >> +#define NVDIMM_PMU_CPUMASK_ATTR                2
> >> +#define NVDIMM_PMU_NULL_ATTR           3
> >> +
> >> +/**
> >> + * struct nvdimm_pmu - data structure for nvdimm perf driver
> >> + *
> >> + * @name: name of the nvdimm pmu device.
> >> + * @pmu: pmu data structure for nvdimm performance stats.
> >> + * @dev: nvdimm device pointer.
> >> + * @functions(event_init/add/del/read): platform specific pmu functions.
> >
> > This is not valid kernel-doc:
> >
> > include/linux/nd.h:67: warning: Function parameter or member
> > 'event_init' not described in 'nvdimm_pmu'
> > include/linux/nd.h:67: warning: Function parameter or member 'add' not
> > described in 'nvdimm_pmu'
> > include/linux/nd.h:67: warning: Function parameter or member 'del' not
> > described in 'nvdimm_pmu'
> > include/linux/nd.h:67: warning: Function parameter or member 'read'
> > not described in 'nvdimm_pmu'
> >
> > ...but I think rather than fixing those up 'struct nvdimm_pmu' should be pruned.
> >
> > It's not clear to me that it is worth the effort to describe these
> > details to the nvdimm core which is just going to turn around and call
> > the pmu core. I'd just as soon have the driver call the pmu core
> > directly, optionally passing in attributes and callbacks that come
> > from the nvdimm core and/or the nvdimm provider.
>
> The intend for adding these callbacks(event_init/add/del/read) is to give
> flexibility to the nvdimm core to add some common checks/routines if required
> in the future. Those checks can be common for all architecture with still having the
> ability to call arch/platform specific driver code to use its own routines.
>
> But as you said, currently we don't have any common checks and it directly
> calling platform specific code, so we can get rid of it.
> Should we remove this part for now?

Yes, lets go direct to the perf api for now and await the need for a
common core wrapper to present itself.

>
>
> >
> > Otherwise it's also not clear which of these structure members are
> > used at runtime vs purely used as temporary storage to pass parameters
> > to the pmu core.
> >
> >> + * @attr_groups: data structure for events, formats and cpumask
> >> + * @cpu: designated cpu for counter access.
> >> + * @node: node for cpu hotplug notifier link.
> >> + * @cpuhp_state: state for cpu hotplug notification.
> >> + * @arch_cpumask: cpumask to get designated cpu for counter access.
> >> + */
> >> +struct nvdimm_pmu {
> >> +       const char *name;
> >> +       struct pmu pmu;
> >> +       struct device *dev;
> >> +       int (*event_init)(struct perf_event *event);
> >> +       int  (*add)(struct perf_event *event, int flags);
> >> +       void (*del)(struct perf_event *event, int flags);
> >> +       void (*read)(struct perf_event *event);
> >> +       /*
> >> +        * Attribute groups for the nvdimm pmu. Index 0 used for
> >> +        * format attribute, index 1 used for event attribute,
> >> +        * index 2 used for cpusmask attribute and index 3 kept as NULL.
> >> +        */
> >> +       const struct attribute_group *attr_groups[4];
> >
> > Following from above, I'd rather this was organized as static
> > attributes with an is_visible() helper for the groups for any dynamic
> > aspects. That mirrors the behavior of nvdimm_create() and allows for
> > device drivers to compose the attribute groups from a core set and /
> > or a provider specific set.
>
> Since we don't have any common events right now, Can I use papr
> attributes directly or should we create dummy events for common thing and
> then merged it with papr event list.

Just use papr events directly.

^ permalink raw reply

* Re: [PATCH v2 07/29] ABI: sysfs-class-cxl: place "not in a guest" at description
From: Andrew Donnellan @ 2021-09-15  3:33 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Greg Kroah-Hartman, Linux Doc Mailing List
  Cc: Frederic Barrat, linuxppc-dev, linux-kernel, Jonathan Corbet
In-Reply-To: <88ce67c9eed1ae08af3d3992415032723184af9e.1631629496.git.mchehab+huawei@kernel.org>

On 15/9/21 12:32 am, Mauro Carvalho Chehab wrote:
> The What: field should have just the location of the ABI.
> Anything else should be inside the description.
> 
> This fixes its parsing by get_abi.pl script.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>

Looks fine to me.

Acked-by: Andrew Donnellan <ajd@linux.ibm.com>

> ---
>   Documentation/ABI/testing/sysfs-class-cxl | 15 ++++++++++-----
>   1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-class-cxl b/Documentation/ABI/testing/sysfs-class-cxl
> index 818f55970efb..3c77677e0ca7 100644
> --- a/Documentation/ABI/testing/sysfs-class-cxl
> +++ b/Documentation/ABI/testing/sysfs-class-cxl
> @@ -166,10 +166,11 @@ Description:    read only
>                   Decimal value of the Per Process MMIO space length.
>   Users:		https://github.com/ibm-capi/libcxl
>   
> -What:           /sys/class/cxl/<afu>m/pp_mmio_off (not in a guest)
> +What:           /sys/class/cxl/<afu>m/pp_mmio_off
>   Date:           September 2014
>   Contact:        linuxppc-dev@lists.ozlabs.org
>   Description:    read only
> +                (not in a guest)
>                   Decimal value of the Per Process MMIO space offset.
>   Users:		https://github.com/ibm-capi/libcxl
>   
> @@ -190,28 +191,31 @@ Description:    read only
>                   Identifies the revision level of the PSL.
>   Users:		https://github.com/ibm-capi/libcxl
>   
> -What:           /sys/class/cxl/<card>/base_image (not in a guest)
> +What:           /sys/class/cxl/<card>/base_image
>   Date:           September 2014
>   Contact:        linuxppc-dev@lists.ozlabs.org
>   Description:    read only
> +                (not in a guest)
>                   Identifies the revision level of the base image for devices
>                   that support loadable PSLs. For FPGAs this field identifies
>                   the image contained in the on-adapter flash which is loaded
>                   during the initial program load.
>   Users:		https://github.com/ibm-capi/libcxl
>   
> -What:           /sys/class/cxl/<card>/image_loaded (not in a guest)
> +What:           /sys/class/cxl/<card>/image_loaded
>   Date:           September 2014
>   Contact:        linuxppc-dev@lists.ozlabs.org
>   Description:    read only
> +                (not in a guest)
>                   Will return "user" or "factory" depending on the image loaded
>                   onto the card.
>   Users:		https://github.com/ibm-capi/libcxl
>   
> -What:           /sys/class/cxl/<card>/load_image_on_perst (not in a guest)
> +What:           /sys/class/cxl/<card>/load_image_on_perst
>   Date:           December 2014
>   Contact:        linuxppc-dev@lists.ozlabs.org
>   Description:    read/write
> +                (not in a guest)
>                   Valid entries are "none", "user", and "factory".
>                   "none" means PERST will not cause image to be loaded to the
>                   card.  A power cycle is required to load the image.
> @@ -235,10 +239,11 @@ Description:    write only
>                   contexts on the card AFUs.
>   Users:		https://github.com/ibm-capi/libcxl
>   
> -What:		/sys/class/cxl/<card>/perst_reloads_same_image (not in a guest)
> +What:		/sys/class/cxl/<card>/perst_reloads_same_image
>   Date:		July 2015
>   Contact:	linuxppc-dev@lists.ozlabs.org
>   Description:	read/write
> +                (not in a guest)
>   		Trust that when an image is reloaded via PERST, it will not
>   		have changed.
>   
> 

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited

^ permalink raw reply

* Re: [PATCH] swiotlb: set IO TLB segment size via cmdline
From: Stefano Stabellini @ 2021-09-15  1:51 UTC (permalink / raw)
  To: rm.skakun
  Cc: Roman Skakun, linux-doc, Peter Zijlstra, Viresh Kumar,
	linux-kernel, Paul Mackerras, Jan Beulich, Will Deacon,
	Boris Ostrovsky, Marek Szyprowski, Stefano Stabellini,
	Jonathan Corbet, hch, xen-devel, Paul E. McKenney,
	Konrad Rzeszutek Wilk, Muchun Song, Thomas Gleixner,
	Juergen Gross, Thomas Bogendoerfer, Andrii Anisov, linuxppc-dev,
	Randy Dunlap, linux-mips, iommu, Roman Skakun, Andrew Morton,
	Lu Baolu, Robin Murphy, Mike Rapoport, Maciej W. Rozycki
In-Reply-To: <20210914153046.GB815@lst.de>

On Tue, 14 Sep 2021, Christoph Hellwig wrote:
> On Tue, Sep 14, 2021 at 05:29:07PM +0200, Jan Beulich wrote:
> > I'm not convinced the swiotlb use describe there falls under "intended
> > use" - mapping a 1280x720 framebuffer in a single chunk? (As an aside,
> > the bottom of this page is also confusing, as following "Then we can
> > confirm the modified swiotlb size in the boot log:" there is a log
> > fragment showing the same original size of 64Mb.
> 
> It doesn't.  We also do not add hacks to the kernel for whacky out
> of tree modules.

Also, Option 1 listed in the webpage seems to be a lot better. Any
reason you can't do that? Because that option both solves the problem
and increases performance.

^ permalink raw reply

* Re: [PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()
From: Michael Ellerman @ 2021-09-15  0:28 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Sathyanarayanan Kuppuswamy, linux-efi, Brijesh Singh, kvm,
	dri-devel, platform-driver-x86, Paul Mackerras, linux-s390,
	Andi Kleen, Joerg Roedel, x86, amd-gfx, Christoph Hellwig,
	linux-graphics-maintainer, Tom Lendacky, Tianyu Lan, kexec,
	linux-kernel, iommu, linux-fsdevel, linuxppc-dev
In-Reply-To: <YUCOTIPPsJJpLO/d@zn.tnic>

Borislav Petkov <bp@alien8.de> writes:
> On Wed, Sep 08, 2021 at 05:58:35PM -0500, Tom Lendacky wrote:
>> Introduce a powerpc version of the cc_platform_has() function. This will
>> be used to replace the powerpc mem_encrypt_active() implementation, so
>> the implementation will initially only support the CC_ATTR_MEM_ENCRYPT
>> attribute.
>> 
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
>> ---
>>  arch/powerpc/platforms/pseries/Kconfig       |  1 +
>>  arch/powerpc/platforms/pseries/Makefile      |  2 ++
>>  arch/powerpc/platforms/pseries/cc_platform.c | 26 ++++++++++++++++++++
>>  3 files changed, 29 insertions(+)
>>  create mode 100644 arch/powerpc/platforms/pseries/cc_platform.c
>
> Michael,
>
> can I get an ACK for the ppc bits to carry them through the tip tree
> pls?

Yeah.

I don't love it, a new C file and an out-of-line call to then call back
to a static inline that for most configuration will return false ... but
whatever :)

Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)


> Btw, on a related note, cross-compiling this throws the following error here:
>
> $ make CROSS_COMPILE=/home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/powerpc64-linux- V=1 ARCH=powerpc
>
> ...
>
> /home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/powerpc64-linux-gcc -Wp,-MD,arch/powerpc/boot/.crt0.o.d -D__ASSEMBLY__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -O2 -msoft-float -mno-altivec -mno-vsx -pipe -fomit-frame-pointer -fno-builtin -fPIC -nostdinc -include ./include/linux/compiler_attributes.h -I./arch/powerpc/include -I./arch/powerpc/include/generated  -I./include -I./arch/powerpc/include/uapi -I./arch/powerpc/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/compiler-version.h -include ./include/linux/kconfig.h -m32 -isystem /home/share/src/crosstool/gcc-9.4.0-nolibc/powerpc64-linux/bin/../lib/gcc/powerpc64-linux/9.4.0/include -mbig-endian -nostdinc -c -o arch/powerpc/boot/crt0.o arch/powerpc/boot/crt0.S
> In file included from <command-line>:
> ././include/linux/compiler_attributes.h:62:5: warning: "__has_attribute" is not defined, evaluates to 0 [-Wundef]
>    62 | #if __has_attribute(__assume_aligned__)
>       |     ^~~~~~~~~~~~~~~
> ././include/linux/compiler_attributes.h:62:20: error: missing binary operator before token "("
>    62 | #if __has_attribute(__assume_aligned__)
>       |                    ^
> ././include/linux/compiler_attributes.h:88:5: warning: "__has_attribute" is not defined, evaluates to 0 [-Wundef]
>    88 | #if __has_attribute(__copy__)
>       |     ^~~~~~~~~~~~~~~
> ...
>
> Known issue?

Yeah, fixed in mainline today, thanks for trying to cross compile :)

cheers

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox