LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH v6 6/8] powerpc/pmem: Avoid the barrier in flush routines
From: Aneesh Kumar K.V @ 2020-06-29 20:40 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Jan Kara, linux-nvdimm, Jeff Moyer, oohall, dan.j.williams,
	linuxppc-dev
In-Reply-To: <20200629160940.GU21462@kitsune.suse.cz>

Michal Suchánek <msuchanek@suse.de> writes:

> Hello,
>
> On Mon, Jun 29, 2020 at 07:27:20PM +0530, Aneesh Kumar K.V wrote:
>> nvdimm expect the flush routines to just mark the cache clean. The barrier
>> that mark the store globally visible is done in nvdimm_flush().
>> 
>> Update the papr_scm driver to a simplified nvdim_flush callback that do
>> only the required barrier.
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>  arch/powerpc/lib/pmem.c                   |  6 ------
>>  arch/powerpc/platforms/pseries/papr_scm.c | 13 +++++++++++++
>>  2 files changed, 13 insertions(+), 6 deletions(-)
>> 
>> diff --git a/arch/powerpc/lib/pmem.c b/arch/powerpc/lib/pmem.c
>> index 5a61aaeb6930..21210fa676e5 100644
>> --- a/arch/powerpc/lib/pmem.c
>> +++ b/arch/powerpc/lib/pmem.c
>> @@ -19,9 +19,6 @@ static inline void __clean_pmem_range(unsigned long start, unsigned long stop)
>>  
>>  	for (i = 0; i < size >> shift; i++, addr += bytes)
>>  		asm volatile(PPC_DCBSTPS(%0, %1): :"i"(0), "r"(addr): "memory");
>> -
>> -
>> -	asm volatile(PPC_PHWSYNC ::: "memory");
>>  }
>>  
>>  static inline void __flush_pmem_range(unsigned long start, unsigned long stop)
>> @@ -34,9 +31,6 @@ static inline void __flush_pmem_range(unsigned long start, unsigned long stop)
>>  
>>  	for (i = 0; i < size >> shift; i++, addr += bytes)
>>  		asm volatile(PPC_DCBFPS(%0, %1): :"i"(0), "r"(addr): "memory");
>> -
>> -
>> -	asm volatile(PPC_PHWSYNC ::: "memory");
>>  }
>>  
>>  static inline void clean_pmem_range(unsigned long start, unsigned long stop)
>> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
>> index 9c569078a09f..9a9a0766f8b6 100644
>> --- a/arch/powerpc/platforms/pseries/papr_scm.c
>> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
>> @@ -630,6 +630,18 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
>>  
>>  	return 0;
>>  }
>> +/*
>> + * We have made sure the pmem writes are done such that before calling this
>> + * all the caches are flushed/clean. We use dcbf/dcbfps to ensure this. Here
>> + * we just need to add the necessary barrier to make sure the above flushes
>> + * are have updated persistent storage before any data access or data transfer
>> + * caused by subsequent instructions is initiated.
>> + */
>> +static int papr_scm_flush_sync(struct nd_region *nd_region, struct bio *bio)
>> +{
>> +	arch_pmem_flush_barrier();
>> +	return 0;
>> +}
>>  
>>  static ssize_t flags_show(struct device *dev,
>>  			  struct device_attribute *attr, char *buf)
>> @@ -743,6 +755,7 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
>>  	ndr_desc.mapping = &mapping;
>>  	ndr_desc.num_mappings = 1;
>>  	ndr_desc.nd_set = &p->nd_set;
>> +	ndr_desc.flush = papr_scm_flush_sync;
>
> AFAICT currently the only device that implements flush is virtio_pmem.
> How does the nfit driver get away without implementing flush?

generic_nvdimm_flush does the required barrier for nfit. The reason for
adding ndr_desc.flush call back for papr_scm was to avoid the usage
of iomem based deep flushing (ndr_region_data.flush_wpq) which is not
supported by papr_scm.

BTW we do return NULL for ndrd_get_flush_wpq() on power. So the upstream
code also does the same thing, but in a different way.


> Also the flush takes arguments that are completely unused but a user of
> the pmem region must assume they are used, and call flush() on the
> region rather than arch_pmem_flush_barrier() directly.

The bio argument can help a pmem driver to do range based flushing in
case of pmem_make_request. If bio is null then we must assume a full
device flush. 

>This may not
> work well with md as discussed with earlier iteration of the patchest.
>

dm-writecache needs some major changes to work with asynchronous pmem
devices. 

-aneesh

^ permalink raw reply

* Re: [PATCH 01/20] nfblock: stop using ->queuedata
From: Geert Uytterhoeven @ 2020-06-29 21:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, open list:TENSILICA XTENSA PORT (xtensa),
	linux-nvdimm@lists.01.org, linux-s390, linux-m68k, linux-nvme,
	Linux Kernel Mailing List, linux-raid, dm-devel, linux-bcache,
	linuxppc-dev, Lars Ellenberg
In-Reply-To: <20200629193947.2705954-2-hch@lst.de>

On Mon, Jun 29, 2020 at 9:40 PM Christoph Hellwig <hch@lst.de> wrote:
> Instead of setting up the queuedata as well just use one private data
> field.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH 0/8] mm: cleanup usage of <asm/pgalloc.h>
From: Pekka Enberg @ 2020-06-29 14:01 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: ia64, linux-sh, Peter Zijlstra, linux-mips, Max Filippov,
	Satheesh Rajendran, linux-csky, sparclinux, linux-riscv,
	list@ebiederm.org:DOCUMENTATION <linux-doc@vger.kernel.org>, list@ebiederm.org:MEMORY MANAGEMENT <linux-mm@kvack.org>, ,
	Stephen Rothwell, linux-hexagon, Joerg Roedel, Mike Rapoport,
	Abdul Haleem, linux-snps-arc, linux-xtensa, Arnd Bergmann,
	linux-s390, linux-um, Steven Rostedt, linux-m68k, openrisc,
	Andy Lutomirski, Stafford Horne, linux-arm-kernel, linux-parisc,
	linux-mm@kvack.org, LKML, linux-alpha, Andrew Morton,
	linuxppc-dev
In-Reply-To: <20200627143453.31835-1-rppt@kernel.org>

On Sat, Jun 27, 2020 at 5:35 PM Mike Rapoport <rppt@kernel.org> wrote:
> Most architectures have very similar versions of pXd_alloc_one() and
> pXd_free_one() for intermediate levels of page table.
> These patches add generic versions of these functions in
> <asm-generic/pgalloc.h> and enable use of the generic functions where
> appropriate.

Very nice cleanup series to the page table code!

FWIW:

Reviewed-by: Pekka Enberg <penberg@kernel.org>

^ permalink raw reply

* [PATCH 1/1] MAINTAINERS: Remove self
From: Sam Bobroff @ 2020-06-29 22:50 UTC (permalink / raw)
  To: linuxppc-dev

I'm sorry to say I can no longer maintain this position.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
---
 MAINTAINERS | 1 -
 1 file changed, 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 496fd4eafb68..7e954e4a29e1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13187,7 +13187,6 @@ F:	tools/pci/
 
 PCI ENHANCED ERROR HANDLING (EEH) FOR POWERPC
 M:	Russell Currey <ruscur@russell.cc>
-M:	Sam Bobroff <sbobroff@linux.ibm.com>
 M:	Oliver O'Halloran <oohall@gmail.com>
 L:	linuxppc-dev@lists.ozlabs.org
 S:	Supported
-- 
2.22.0.216.g00a2a96fc9


^ permalink raw reply related

* [PATCH] xmon: Reset RCU and soft lockup watchdogs
From: Anton Blanchard @ 2020-06-30  0:02 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Paul Mackerras, Nicholas Piggin

I'm seeing RCU warnings when exiting xmon. xmon resets the NMI watchdog,
but does nothing with the RCU stall or soft lockup watchdogs. Add a
helper function that handles all three.

Signed-off-by: Anton Blanchard <anton@ozlabs.org>
---
 arch/powerpc/xmon/xmon.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 7efe4bc3ccf6..d27944e38b04 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -481,6 +481,13 @@ static inline int unrecoverable_excp(struct pt_regs *regs)
 #endif
 }
 
+static void xmon_touch_watchdogs(void)
+{
+	touch_softlockup_watchdog_sync();
+	rcu_cpu_stall_reset();
+	touch_nmi_watchdog();
+}
+
 static int xmon_core(struct pt_regs *regs, int fromipi)
 {
 	int cmd = 0;
@@ -718,7 +725,7 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
 	else
 		insert_cpu_bpts();
 
-	touch_nmi_watchdog();
+	xmon_touch_watchdogs();
 	local_irq_restore(flags);
 
 	return cmd != 'X' && cmd != EOF;
-- 
2.26.2


^ permalink raw reply related

* Re: [PATCH] powerpc: Warn about use of smt_snooze_delay
From: Joel Stanley @ 2020-06-30  1:05 UTC (permalink / raw)
  To: Gautham R . Shenoy; +Cc: linuxppc-dev, ego
In-Reply-To: <20200629104248.GD20062@in.ibm.com>

On Mon, 29 Jun 2020 at 10:42, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
>
> On Thu, Jun 25, 2020 at 07:33:49PM +0930, Joel Stanley wrote:
> > It's not done anything for a long time. Save the percpu variable, and
> > emit a warning to remind users to not expect it to do anything.
> >
> > Signed-off-by: Joel Stanley <joel@jms.id.au>
>
> The only known user of "smt_snooze_delay" is the "ppc64_cpu" which
> uses the presence of this file to assume that the system is SMT
> capable.
>
> Since we have "/sys/devices/system/cpu/smt/" these days, perhaps the
> userspace utility can use that and we can get rid of the file
> altogether ?

I've sent a change to the userspace tool to stop using the file. It
now uses the device tree parsing that was already present to determine
the smt state.

 https://github.com/ibm-power-utilities/powerpc-utils/pull/43

We will want to wait for the userspace tool to propagate through a
release and to distros before we remove the file all together. I agree
it should be removed in the future.

I've got of this patch v2 that changes the message to be:

         pr_warn_ratelimited("%s (%d) used unsupported
smt_snooze_delay, this has no effect\n",
                            current->comm, current->pid);

I'll send that out today.

Cheers,

Joel

>
> FWIW,
> Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> > ---
> >  arch/powerpc/kernel/sysfs.c | 41 +++++++++++++------------------------
> >  1 file changed, 14 insertions(+), 27 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
> > index 571b3259697e..530ae92bc46d 100644
> > --- a/arch/powerpc/kernel/sysfs.c
> > +++ b/arch/powerpc/kernel/sysfs.c
> > @@ -32,29 +32,25 @@
> >
> >  static DEFINE_PER_CPU(struct cpu, cpu_devices);
> >
> > -/*
> > - * SMT snooze delay stuff, 64-bit only for now
> > - */
> > -
> >  #ifdef CONFIG_PPC64
> >
> > -/* Time in microseconds we delay before sleeping in the idle loop */
> > -static DEFINE_PER_CPU(long, smt_snooze_delay) = { 100 };
> > +/*
> > + * Snooze delay has not been hooked up since 3fa8cad82b94 ("powerpc/pseries/cpuidle:
> > + * smt-snooze-delay cleanup.") and has been broken even longer. As was foretold in
> > + * 2014:
> > + *
> > + *  "ppc64_util currently utilises it. Once we fix ppc64_util, propose to clean
> > + *  up the kernel code."
> > + *
> > + * At some point in the future this code should be removed.
> > + */
> >
> >  static ssize_t store_smt_snooze_delay(struct device *dev,
> >                                     struct device_attribute *attr,
> >                                     const char *buf,
> >                                     size_t count)
> >  {
> > -     struct cpu *cpu = container_of(dev, struct cpu, dev);
> > -     ssize_t ret;
> > -     long snooze;
> > -
> > -     ret = sscanf(buf, "%ld", &snooze);
> > -     if (ret != 1)
> > -             return -EINVAL;
> > -
> > -     per_cpu(smt_snooze_delay, cpu->dev.id) = snooze;
> > +     WARN_ON_ONCE("smt_snooze_delay sysfs file has no effect\n");
> >       return count;
> >  }
> >
> > @@ -62,9 +58,9 @@ static ssize_t show_smt_snooze_delay(struct device *dev,
> >                                    struct device_attribute *attr,
> >                                    char *buf)
> >  {
> > -     struct cpu *cpu = container_of(dev, struct cpu, dev);
> > +     WARN_ON_ONCE("smt_snooze_delay sysfs file has no effect\n");
> >
> > -     return sprintf(buf, "%ld\n", per_cpu(smt_snooze_delay, cpu->dev.id));
> > +     return sprintf(buf, "100\n");
> >  }
> >
> >  static DEVICE_ATTR(smt_snooze_delay, 0644, show_smt_snooze_delay,
> > @@ -72,16 +68,7 @@ static DEVICE_ATTR(smt_snooze_delay, 0644, show_smt_snooze_delay,
> >
> >  static int __init setup_smt_snooze_delay(char *str)
> >  {
> > -     unsigned int cpu;
> > -     long snooze;
> > -
> > -     if (!cpu_has_feature(CPU_FTR_SMT))
> > -             return 1;
> > -
> > -     snooze = simple_strtol(str, NULL, 10);
> > -     for_each_possible_cpu(cpu)
> > -             per_cpu(smt_snooze_delay, cpu) = snooze;
> > -
> > +     WARN_ON_ONCE("smt-snooze-delay command line option has no effect\n");
> >       return 1;
> >  }
> >  __setup("smt-snooze-delay=", setup_smt_snooze_delay);
> > --
> > 2.27.0
> >

^ permalink raw reply

* Re: [PATCH v2] powerpc/uaccess: Use flexible addressing with __put_user()/__get_user()
From: Michael Ellerman @ 2020-06-30  1:19 UTC (permalink / raw)
  To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras, npiggin,
	segher
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <878sg6862r.fsf@mpe.ellerman.id.au>

Michael Ellerman <mpe@ellerman.id.au> writes:
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> Hi Michael,
>>
>> I see this patch is marked as "defered" in patchwork, but I can't see 
>> any related discussion. Is it normal ?
>
> Because it uses the "m<>" constraint which didn't work on GCC 4.6.
>
> https://github.com/linuxppc/issues/issues/297
>
> So we should be able to pick it up for v5.9 hopefully.

It seems to break the build with the kernel.org 4.9.4 compiler and
corenet64_smp_defconfig:

+ make -s CC=powerpc64-linux-gnu-gcc -j 160
In file included from /linux/include/linux/uaccess.h:11:0,
                 from /linux/include/linux/sched/task.h:11,
                 from /linux/include/linux/sched/signal.h:9,
                 from /linux/include/linux/rcuwait.h:6,
                 from /linux/include/linux/percpu-rwsem.h:7,
                 from /linux/include/linux/fs.h:33,
                 from /linux/include/linux/huge_mm.h:8,
                 from /linux/include/linux/mm.h:675,
                 from /linux/arch/powerpc/kernel/signal_32.c:17:
/linux/arch/powerpc/kernel/signal_32.c: In function 'save_user_regs.isra.14.constprop':
/linux/arch/powerpc/include/asm/uaccess.h:161:2: error: 'asm' operand has impossible constraints
  __asm__ __volatile__(     \
  ^
/linux/arch/powerpc/include/asm/uaccess.h:197:12: note: in expansion of macro '__put_user_asm'
    case 4: __put_user_asm(x, ptr, retval, "stw"); break; \
            ^
/linux/arch/powerpc/include/asm/uaccess.h:206:2: note: in expansion of macro '__put_user_size_allowed'
  __put_user_size_allowed(x, ptr, size, retval);  \
  ^
/linux/arch/powerpc/include/asm/uaccess.h:220:2: note: in expansion of macro '__put_user_size'
  __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \
  ^
/linux/arch/powerpc/include/asm/uaccess.h:96:2: note: in expansion of macro '__put_user_nocheck'
  __put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
  ^
/linux/arch/powerpc/kernel/signal_32.c:120:7: note: in expansion of macro '__put_user'
   if (__put_user((unsigned int)gregs[i], &frame->mc_gregs[i]))
       ^
/linux/scripts/Makefile.build:280: recipe for target 'arch/powerpc/kernel/signal_32.o' failed
make[3]: *** [arch/powerpc/kernel/signal_32.o] Error 1
make[3]: *** Waiting for unfinished jobs....
In file included from /linux/include/linux/uaccess.h:11:0,
                 from /linux/include/linux/sched/task.h:11,
                 from /linux/include/linux/sched/signal.h:9,
                 from /linux/include/linux/rcuwait.h:6,
                 from /linux/include/linux/percpu-rwsem.h:7,
                 from /linux/include/linux/fs.h:33,
                 from /linux/include/linux/huge_mm.h:8,
                 from /linux/include/linux/mm.h:675,
                 from /linux/arch/powerpc/kernel/signal_64.c:12:
/linux/arch/powerpc/kernel/signal_64.c: In function '__se_sys_swapcontext':
/linux/arch/powerpc/include/asm/uaccess.h:319:2: error: 'asm' operand has impossible constraints
  __asm__ __volatile__(    \
  ^
/linux/arch/powerpc/include/asm/uaccess.h:359:10: note: in expansion of macro '__get_user_asm'
  case 1: __get_user_asm(x, (u8 __user *)ptr, retval, "lbz"); break; \
          ^
/linux/arch/powerpc/include/asm/uaccess.h:370:2: note: in expansion of macro '__get_user_size_allowed'
  __get_user_size_allowed(x, ptr, size, retval);  \
  ^
/linux/arch/powerpc/include/asm/uaccess.h:393:3: note: in expansion of macro '__get_user_size'
   __get_user_size(__gu_val, __gu_addr, __gu_size, __gu_err); \
   ^
/linux/arch/powerpc/include/asm/uaccess.h:94:2: note: in expansion of macro '__get_user_nocheck'
  __get_user_nocheck((x), (ptr), sizeof(*(ptr)), true)
  ^
/linux/arch/powerpc/kernel/signal_64.c:672:9: note: in expansion of macro '__get_user'
      || __get_user(tmp, (u8 __user *) new_ctx + ctx_size - 1))
         ^
/linux/scripts/Makefile.build:280: recipe for target 'arch/powerpc/kernel/signal_64.o' failed
make[3]: *** [arch/powerpc/kernel/signal_64.o] Error 1
/linux/scripts/Makefile.build:497: recipe for target 'arch/powerpc/kernel' failed
make[2]: *** [arch/powerpc/kernel] Error 2
/linux/Makefile:1756: recipe for target 'arch/powerpc' failed
make[1]: *** [arch/powerpc] Error 2
Makefile:185: recipe for target '__sub-make' failed
make: *** [__sub-make] Error 2


cheers

^ permalink raw reply

* Re: [PATCH] powerpc: Warn about use of smt_snooze_delay
From: Michael Ellerman @ 2020-06-30  1:22 UTC (permalink / raw)
  To: Joel Stanley, linuxppc-dev; +Cc: ego
In-Reply-To: <20200625100349.2408899-1-joel@jms.id.au>

Joel Stanley <joel@jms.id.au> writes:
> It's not done anything for a long time. Save the percpu variable, and
> emit a warning to remind users to not expect it to do anything.
>
> Signed-off-by: Joel Stanley <joel@jms.id.au>
> ---
>  arch/powerpc/kernel/sysfs.c | 41 +++++++++++++------------------------
>  1 file changed, 14 insertions(+), 27 deletions(-)
>
> diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
> index 571b3259697e..530ae92bc46d 100644
> --- a/arch/powerpc/kernel/sysfs.c
> +++ b/arch/powerpc/kernel/sysfs.c
> @@ -32,29 +32,25 @@
>  
>  static DEFINE_PER_CPU(struct cpu, cpu_devices);
>  
> -/*
> - * SMT snooze delay stuff, 64-bit only for now
> - */
> -
>  #ifdef CONFIG_PPC64
>  
> -/* Time in microseconds we delay before sleeping in the idle loop */
> -static DEFINE_PER_CPU(long, smt_snooze_delay) = { 100 };
> +/*
> + * Snooze delay has not been hooked up since 3fa8cad82b94 ("powerpc/pseries/cpuidle:
> + * smt-snooze-delay cleanup.") and has been broken even longer. As was foretold in
> + * 2014:
> + *
> + *  "ppc64_util currently utilises it. Once we fix ppc64_util, propose to clean
> + *  up the kernel code."
> + *
> + * At some point in the future this code should be removed.
> + */
>  
>  static ssize_t store_smt_snooze_delay(struct device *dev,
>  				      struct device_attribute *attr,
>  				      const char *buf,
>  				      size_t count)
>  {
> -	struct cpu *cpu = container_of(dev, struct cpu, dev);
> -	ssize_t ret;
> -	long snooze;
> -
> -	ret = sscanf(buf, "%ld", &snooze);
> -	if (ret != 1)
> -		return -EINVAL;
> -
> -	per_cpu(smt_snooze_delay, cpu->dev.id) = snooze;
> +	WARN_ON_ONCE("smt_snooze_delay sysfs file has no effect\n");

We shouldn't have user-triggerable WARNs.

I think this should just be a pr_warn_ratelimited(), maybe including
current->comm & pid.

cheers

^ permalink raw reply

* Re: [PATCH 1/3] powerpc: inline doorbell sending functions
From: Michael Ellerman @ 2020-06-30  1:31 UTC (permalink / raw)
  To: kernel test robot, Nicholas Piggin, linuxppc-dev
  Cc: kbuild-all, Nicholas Piggin, kvm-ppc, Anton Blanchard,
	Cédric Le Goater, David Gibson
In-Reply-To: <202006280326.fcRFUNzs%lkp@intel.com>

kernel test robot <lkp@intel.com> writes:
> Hi Nicholas,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on powerpc/next]
> [also build test ERROR on scottwood/next v5.8-rc2 next-20200626]
> [cannot apply to kvm-ppc/kvm-ppc-next]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use  as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url:    https://github.com/0day-ci/linux/commits/Nicholas-Piggin/powerpc-pseries-IPI-doorbell-improvements/20200627-230544
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
> config: powerpc-randconfig-c003-20200628 (attached as .config)
> compiler: powerpc64-linux-gcc (GCC) 9.3.0

> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All error/warnings (new ones prefixed by >>):
>
>    In file included from arch/powerpc/kernel/asm-offsets.c:38:
>    arch/powerpc/include/asm/dbell.h: In function 'doorbell_global_ipi':
>>> arch/powerpc/include/asm/dbell.h:114:12: error: implicit declaration of function 'get_hard_smp_processor_id'; did you mean 'raw_smp_processor_id'? [-Werror=implicit-function-declaration]
>      114 |  u32 tag = get_hard_smp_processor_id(cpu);
>          |            ^~~~~~~~~~~~~~~~~~~~~~~~~
>          |            raw_smp_processor_id
>    arch/powerpc/include/asm/dbell.h: In function 'doorbell_try_core_ipi':
>>> arch/powerpc/include/asm/dbell.h:146:28: error: implicit declaration of function 'cpu_sibling_mask'; did you mean 'cpu_online_mask'? [-Werror=implicit-function-declaration]
>      146 |  if (cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
>          |                            ^~~~~~~~~~~~~~~~
>          |                            cpu_online_mask
>>> arch/powerpc/include/asm/dbell.h:146:28: warning: passing argument 2 of 'cpumask_test_cpu' makes pointer from integer without a cast [-Wint-conversion]
>      146 |  if (cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
>          |                            ^~~~~~~~~~~~~~~~~~~~~~~~~~

Seems like CONFIG_SMP=n is probably the root cause.

You could try including asm/smp.h, but good chance that will lead to
header soup.

Other option would be to wrap the whole lot in #ifdef CONFIG_SMP?

cheers

^ permalink raw reply

* Re: [PATCH updated] libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier
From: Dan Williams @ 2020-06-30  1:32 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Jan Kara, linux-nvdimm, Jeff Moyer, Oliver O'Halloran,
	Michal Suchánek, linuxppc-dev
In-Reply-To: <20200629202901.83516-1-aneesh.kumar@linux.ibm.com>

On Mon, Jun 29, 2020 at 1:29 PM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
>
> Architectures like ppc64 provide persistent memory specific barriers
> that will ensure that all stores for which the modifications are
> written to persistent storage by preceding dcbfps and dcbstps
> instructions have updated persistent storage before any data
> access or data transfer caused by subsequent instructions is initiated.
> This is in addition to the ordering done by wmb()
>
> Update nvdimm core such that architecture can use barriers other than
> wmb to ensure all previous writes are architecturally visible for
> the platform buffer flush.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  drivers/md/dm-writecache.c   | 2 +-
>  drivers/nvdimm/region_devs.c | 8 ++++----
>  include/linux/libnvdimm.h    | 4 ++++
>  3 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
> index 74f3c506f084..8c6b6dce64e2 100644
> --- a/drivers/md/dm-writecache.c
> +++ b/drivers/md/dm-writecache.c
> @@ -536,7 +536,7 @@ static void ssd_commit_superblock(struct dm_writecache *wc)
>  static void writecache_commit_flushed(struct dm_writecache *wc, bool wait_for_ios)
>  {
>         if (WC_MODE_PMEM(wc))
> -               wmb();
> +               arch_pmem_flush_barrier();
>         else
>                 ssd_commit_flushed(wc, wait_for_ios);
>  }
> diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> index 4502f9c4708d..b308ad09b63d 100644
> --- a/drivers/nvdimm/region_devs.c
> +++ b/drivers/nvdimm/region_devs.c
> @@ -1206,13 +1206,13 @@ int generic_nvdimm_flush(struct nd_region *nd_region)
>         idx = this_cpu_add_return(flush_idx, hash_32(current->pid + idx, 8));
>
>         /*
> -        * The first wmb() is needed to 'sfence' all previous writes
> -        * such that they are architecturally visible for the platform
> -        * buffer flush.  Note that we've already arranged for pmem
> +        * The first arch_pmem_flush_barrier() is needed to 'sfence' all
> +        * previous writes such that they are architecturally visible for
> +        * the platform buffer flush. Note that we've already arranged for pmem
>          * writes to avoid the cache via memcpy_flushcache().  The final
>          * wmb() ensures ordering for the NVDIMM flush write.
>          */
> -       wmb();
> +       arch_pmem_flush_barrier();
>         for (i = 0; i < nd_region->ndr_mappings; i++)
>                 if (ndrd_get_flush_wpq(ndrd, i, 0))
>                         writeq(1, ndrd_get_flush_wpq(ndrd, i, idx));
> diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
> index 18da4059be09..66f6c65bd789 100644
> --- a/include/linux/libnvdimm.h
> +++ b/include/linux/libnvdimm.h
> @@ -286,4 +286,8 @@ static inline void arch_invalidate_pmem(void *addr, size_t size)
>  }
>  #endif
>
> +#ifndef arch_pmem_flush_barrier
> +#define arch_pmem_flush_barrier() wmb()
> +#endif

I think it is out of place to define this in libnvdimm.h and it is odd
to give it such a long name. The other pmem api helpers like
arch_wb_cache_pmem() and arch_invalidate_pmem() are function calls for
libnvdimm driver operations, this barrier is just an instruction and
is closer to wmb() than the pmem api routine.

Since it is a store fence for pmem, so let's just call it pmem_wmb()
and define the generic version in include/linux/compiler.h. It should
probably also be documented alongside dma_wmb() in
Documentation/memory-barriers.txt about why code would use it over
wmb(), and why a symmetric pmem_rmb() is not needed.

^ permalink raw reply

* Re: [PATCH v6 5/8] powerpc/pmem/of_pmem: Update of_pmem to use the new barrier instruction.
From: Dan Williams @ 2020-06-30  1:38 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Jan Kara, linux-nvdimm, Jeff Moyer, Oliver O'Halloran,
	Michal Suchánek, linuxppc-dev
In-Reply-To: <20200629135722.73558-6-aneesh.kumar@linux.ibm.com>

On Mon, Jun 29, 2020 at 6:58 AM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
>
> of_pmem on POWER10 can now use phwsync instead of hwsync to ensure
> all previous writes are architecturally visible for the platform
> buffer flush.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/cacheflush.h | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
> index 54764c6e922d..95782f77d768 100644
> --- a/arch/powerpc/include/asm/cacheflush.h
> +++ b/arch/powerpc/include/asm/cacheflush.h
> @@ -98,6 +98,13 @@ static inline void invalidate_dcache_range(unsigned long start,
>         mb();   /* sync */
>  }
>
> +#define arch_pmem_flush_barrier arch_pmem_flush_barrier
> +static inline void  arch_pmem_flush_barrier(void)
> +{
> +       if (cpu_has_feature(CPU_FTR_ARCH_207S))
> +               asm volatile(PPC_PHWSYNC ::: "memory");

Shouldn't this fallback to a compatible store-fence in an else statement?

^ permalink raw reply

* Re: [PATCH v6 6/8] powerpc/pmem: Avoid the barrier in flush routines
From: Dan Williams @ 2020-06-30  1:50 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Jan Kara, linux-nvdimm, Jeff Moyer, Oliver O'Halloran,
	Michal Suchánek, linuxppc-dev
In-Reply-To: <87lfk5hahc.fsf@linux.ibm.com>

On Mon, Jun 29, 2020 at 1:41 PM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
>
> Michal Suchánek <msuchanek@suse.de> writes:
>
> > Hello,
> >
> > On Mon, Jun 29, 2020 at 07:27:20PM +0530, Aneesh Kumar K.V wrote:
> >> nvdimm expect the flush routines to just mark the cache clean. The barrier
> >> that mark the store globally visible is done in nvdimm_flush().
> >>
> >> Update the papr_scm driver to a simplified nvdim_flush callback that do
> >> only the required barrier.
> >>
> >> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> >> ---
> >>  arch/powerpc/lib/pmem.c                   |  6 ------
> >>  arch/powerpc/platforms/pseries/papr_scm.c | 13 +++++++++++++
> >>  2 files changed, 13 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/arch/powerpc/lib/pmem.c b/arch/powerpc/lib/pmem.c
> >> index 5a61aaeb6930..21210fa676e5 100644
> >> --- a/arch/powerpc/lib/pmem.c
> >> +++ b/arch/powerpc/lib/pmem.c
> >> @@ -19,9 +19,6 @@ static inline void __clean_pmem_range(unsigned long start, unsigned long stop)
> >>
> >>      for (i = 0; i < size >> shift; i++, addr += bytes)
> >>              asm volatile(PPC_DCBSTPS(%0, %1): :"i"(0), "r"(addr): "memory");
> >> -
> >> -
> >> -    asm volatile(PPC_PHWSYNC ::: "memory");
> >>  }
> >>
> >>  static inline void __flush_pmem_range(unsigned long start, unsigned long stop)
> >> @@ -34,9 +31,6 @@ static inline void __flush_pmem_range(unsigned long start, unsigned long stop)
> >>
> >>      for (i = 0; i < size >> shift; i++, addr += bytes)
> >>              asm volatile(PPC_DCBFPS(%0, %1): :"i"(0), "r"(addr): "memory");
> >> -
> >> -
> >> -    asm volatile(PPC_PHWSYNC ::: "memory");
> >>  }
> >>
> >>  static inline void clean_pmem_range(unsigned long start, unsigned long stop)
> >> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> >> index 9c569078a09f..9a9a0766f8b6 100644
> >> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> >> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> >> @@ -630,6 +630,18 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
> >>
> >>      return 0;
> >>  }
> >> +/*
> >> + * We have made sure the pmem writes are done such that before calling this
> >> + * all the caches are flushed/clean. We use dcbf/dcbfps to ensure this. Here
> >> + * we just need to add the necessary barrier to make sure the above flushes
> >> + * are have updated persistent storage before any data access or data transfer
> >> + * caused by subsequent instructions is initiated.
> >> + */
> >> +static int papr_scm_flush_sync(struct nd_region *nd_region, struct bio *bio)
> >> +{
> >> +    arch_pmem_flush_barrier();
> >> +    return 0;
> >> +}
> >>
> >>  static ssize_t flags_show(struct device *dev,
> >>                        struct device_attribute *attr, char *buf)
> >> @@ -743,6 +755,7 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
> >>      ndr_desc.mapping = &mapping;
> >>      ndr_desc.num_mappings = 1;
> >>      ndr_desc.nd_set = &p->nd_set;
> >> +    ndr_desc.flush = papr_scm_flush_sync;
> >
> > AFAICT currently the only device that implements flush is virtio_pmem.
> > How does the nfit driver get away without implementing flush?
>
> generic_nvdimm_flush does the required barrier for nfit. The reason for
> adding ndr_desc.flush call back for papr_scm was to avoid the usage
> of iomem based deep flushing (ndr_region_data.flush_wpq) which is not
> supported by papr_scm.
>
> BTW we do return NULL for ndrd_get_flush_wpq() on power. So the upstream
> code also does the same thing, but in a different way.
>
>
> > Also the flush takes arguments that are completely unused but a user of
> > the pmem region must assume they are used, and call flush() on the
> > region rather than arch_pmem_flush_barrier() directly.
>
> The bio argument can help a pmem driver to do range based flushing in
> case of pmem_make_request. If bio is null then we must assume a full
> device flush.

The bio argument isn't for range based flushing, it is for flush
operations that need to complete asynchronously.

There's no mechanism for the block layer to communicate range based
cache flushing, block-device flushing is assumed to be the device's
entire cache. For pmem that would be the entirety of the cpu cache.
Instead of modeling the cpu cache as a storage device cache it is
modeled as page-cache. Once the fs-layer writes back page-cache /
cpu-cache the storage device is only responsible for flushing those
cache-writes into the persistence domain.

Additionally there is a concept of deep-flush that relegates some
power-fail scenarios to a smaller failure domain. For example consider
the difference between a write arriving at the head of a device-queue
and successfully traversing a device-queue to media. The expectation
of pmem applications is that data is persisted once they reach the
equivalent of the x86 ADR domain, deep-flush is past ADR.

^ permalink raw reply

* Re: [PATCH v6 7/8] powerpc/pmem: Add WARN_ONCE to catch the wrong usage of pmem flush functions.
From: Dan Williams @ 2020-06-30  1:52 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Jan Kara, linux-nvdimm, Jeff Moyer, Oliver O'Halloran,
	Michal Suchánek, linuxppc-dev
In-Reply-To: <20200629135722.73558-8-aneesh.kumar@linux.ibm.com>

On Mon, Jun 29, 2020 at 6:58 AM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
>
> We only support persistent memory on P8 and above. This is enforced by the
> firmware and further checked on virtualzied platform during platform init.
> Add WARN_ONCE in pmem flush routines to catch the wrong usage of these.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/cacheflush.h | 2 ++
>  arch/powerpc/lib/pmem.c               | 2 ++
>  2 files changed, 4 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
> index 95782f77d768..1ab0fa660497 100644
> --- a/arch/powerpc/include/asm/cacheflush.h
> +++ b/arch/powerpc/include/asm/cacheflush.h
> @@ -103,6 +103,8 @@ static inline void  arch_pmem_flush_barrier(void)
>  {
>         if (cpu_has_feature(CPU_FTR_ARCH_207S))
>                 asm volatile(PPC_PHWSYNC ::: "memory");
> +       else
> +               WARN_ONCE(1, "Using pmem flush on older hardware.");

This seems too late to be making this determination. I'd expect the
driver to fail to successfully bind default if this constraint is not
met.

^ permalink raw reply

* Re: [PATCH 1/3] powerpc: inline doorbell sending functions
From: Nicholas Piggin @ 2020-06-30  1:58 UTC (permalink / raw)
  To: linuxppc-dev, kernel test robot, Michael Ellerman
  Cc: kbuild-all, Anton Blanchard, Cédric Le Goater, kvm-ppc,
	David Gibson
In-Reply-To: <87zh8l7318.fsf@mpe.ellerman.id.au>

Excerpts from Michael Ellerman's message of June 30, 2020 11:31 am:
> kernel test robot <lkp@intel.com> writes:
>> Hi Nicholas,
>>
>> I love your patch! Yet something to improve:
>>
>> [auto build test ERROR on powerpc/next]
>> [also build test ERROR on scottwood/next v5.8-rc2 next-20200626]
>> [cannot apply to kvm-ppc/kvm-ppc-next]
>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>> And when submitting patch, we suggest to use  as documented in
>> https://git-scm.com/docs/git-format-patch]
>>
>> url:    https://github.com/0day-ci/linux/commits/Nicholas-Piggin/powerpc-pseries-IPI-doorbell-improvements/20200627-230544
>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
>> config: powerpc-randconfig-c003-20200628 (attached as .config)
>> compiler: powerpc64-linux-gcc (GCC) 9.3.0
> 
>> If you fix the issue, kindly add following tag as appropriate
>> Reported-by: kernel test robot <lkp@intel.com>
>>
>> All error/warnings (new ones prefixed by >>):
>>
>>    In file included from arch/powerpc/kernel/asm-offsets.c:38:
>>    arch/powerpc/include/asm/dbell.h: In function 'doorbell_global_ipi':
>>>> arch/powerpc/include/asm/dbell.h:114:12: error: implicit declaration of function 'get_hard_smp_processor_id'; did you mean 'raw_smp_processor_id'? [-Werror=implicit-function-declaration]
>>      114 |  u32 tag = get_hard_smp_processor_id(cpu);
>>          |            ^~~~~~~~~~~~~~~~~~~~~~~~~
>>          |            raw_smp_processor_id
>>    arch/powerpc/include/asm/dbell.h: In function 'doorbell_try_core_ipi':
>>>> arch/powerpc/include/asm/dbell.h:146:28: error: implicit declaration of function 'cpu_sibling_mask'; did you mean 'cpu_online_mask'? [-Werror=implicit-function-declaration]
>>      146 |  if (cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
>>          |                            ^~~~~~~~~~~~~~~~
>>          |                            cpu_online_mask
>>>> arch/powerpc/include/asm/dbell.h:146:28: warning: passing argument 2 of 'cpumask_test_cpu' makes pointer from integer without a cast [-Wint-conversion]
>>      146 |  if (cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
>>          |                            ^~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> Seems like CONFIG_SMP=n is probably the root cause.
> 
> You could try including asm/smp.h, but good chance that will lead to
> header soup.

Possibly. dbell.h shouldn't be included by much, but maybe it gets
dragged in.

> 
> Other option would be to wrap the whole lot in #ifdef CONFIG_SMP?

Yeah that might be a better idea.

I'll fix it up and repost if there's no strong objections to
the KVM detection bit.

Thanks,
Nick

^ permalink raw reply

* [PATCH v2] powerpc: Warn about use of smt_snooze_delay
From: Joel Stanley @ 2020-06-30  1:59 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Tyrel Datwyler, Gautham R Shenoy

It's not done anything for a long time. Save the percpu variable, and
emit a warning to remind users to not expect it to do anything.

Fixes: 3fa8cad82b94 ("powerpc/pseries/cpuidle: smt-snooze-delay cleanup.")
Cc: stable@vger.kernel.org # v3.14
Signed-off-by: Joel Stanley <joel@jms.id.au>
--
v2:
 Use pr_warn instead of WARN
 Reword and print proccess name with pid in message
 Leave CPU_FTR_SMT test in
 Add Fixes line

mpe, if you don't agree then feel free to drop the cc stable.

Testing 'ppc64_cpu --smt=off' on a 24 core / 4 SMT system it's quite noisy
as the online/offline loop that ppc64_cpu runs is slow.

This could be fixed by open coding pr_warn_ratelimit with the ratelimit
parameters tweaked if someone was concerned. I'll leave that to someone
else as a future enhancement.

[  237.642088][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  237.642175][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  237.642261][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  237.642345][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  237.642430][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  237.642516][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  237.642625][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  237.642709][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  237.642793][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  237.642878][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  254.264030][ T1197] store_smt_snooze_delay: 14 callbacks suppressed
[  254.264033][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  254.264048][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  254.264062][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  254.264075][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  254.264089][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  254.264103][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  254.264116][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  254.264130][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  254.264143][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect
[  254.264157][ T1197] ppc64_cpu (1197) used unsupported smt_snooze_delay, this has no effect

Signed-off-by: Joel Stanley <joel@jms.id.au>
---
 arch/powerpc/kernel/sysfs.c | 41 +++++++++++++++----------------------
 1 file changed, 16 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 571b3259697e..ba6d4cee19ef 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -32,29 +32,26 @@
 
 static DEFINE_PER_CPU(struct cpu, cpu_devices);
 
-/*
- * SMT snooze delay stuff, 64-bit only for now
- */
-
 #ifdef CONFIG_PPC64
 
-/* Time in microseconds we delay before sleeping in the idle loop */
-static DEFINE_PER_CPU(long, smt_snooze_delay) = { 100 };
+/*
+ * Snooze delay has not been hooked up since 3fa8cad82b94 ("powerpc/pseries/cpuidle:
+ * smt-snooze-delay cleanup.") and has been broken even longer. As was foretold in
+ * 2014:
+ *
+ *  "ppc64_util currently utilises it. Once we fix ppc64_util, propose to clean
+ *  up the kernel code."
+ *
+ * At some point in the future this code should be removed.
+ */
 
 static ssize_t store_smt_snooze_delay(struct device *dev,
 				      struct device_attribute *attr,
 				      const char *buf,
 				      size_t count)
 {
-	struct cpu *cpu = container_of(dev, struct cpu, dev);
-	ssize_t ret;
-	long snooze;
-
-	ret = sscanf(buf, "%ld", &snooze);
-	if (ret != 1)
-		return -EINVAL;
-
-	per_cpu(smt_snooze_delay, cpu->dev.id) = snooze;
+	pr_warn_ratelimited("%s (%d) used unsupported smt_snooze_delay, this has no effect\n",
+			    current->comm, current->pid);
 	return count;
 }
 
@@ -62,9 +59,9 @@ static ssize_t show_smt_snooze_delay(struct device *dev,
 				     struct device_attribute *attr,
 				     char *buf)
 {
-	struct cpu *cpu = container_of(dev, struct cpu, dev);
-
-	return sprintf(buf, "%ld\n", per_cpu(smt_snooze_delay, cpu->dev.id));
+	pr_warn_ratelimited("%s (%d) used unsupported smt_snooze_delay, this has no effect\n",
+			    current->comm, current->pid);
+	return sprintf(buf, "100\n");
 }
 
 static DEVICE_ATTR(smt_snooze_delay, 0644, show_smt_snooze_delay,
@@ -72,16 +69,10 @@ static DEVICE_ATTR(smt_snooze_delay, 0644, show_smt_snooze_delay,
 
 static int __init setup_smt_snooze_delay(char *str)
 {
-	unsigned int cpu;
-	long snooze;
-
 	if (!cpu_has_feature(CPU_FTR_SMT))
 		return 1;
 
-	snooze = simple_strtol(str, NULL, 10);
-	for_each_possible_cpu(cpu)
-		per_cpu(smt_snooze_delay, cpu) = snooze;
-
+	pr_warn("smt-snooze-delay command line option has no effect\n");
 	return 1;
 }
 __setup("smt-snooze-delay=", setup_smt_snooze_delay);
-- 
2.27.0


^ permalink raw reply related

* Re: [PATCH] kbuild: introduce ccflags-remove-y and asflags-remove-y
From: Masahiro Yamada @ 2020-06-30  2:08 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Michal Marek, Yoshinori Sato, Linux Kbuild mailing list,
	Linux-sh list, Linux Kernel Mailing List, Steven Rostedt,
	Russell King, linuxppc-dev, Ingo Molnar, Paul Mackerras,
	Sami Tolvanen, Rich Felker, linux-arm-kernel
In-Reply-To: <87imfa8le0.fsf@mpe.ellerman.id.au>

On Mon, Jun 29, 2020 at 2:55 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Masahiro Yamada <masahiroy@kernel.org> writes:
> > CFLAGS_REMOVE_<file>.o works per object, that is, there is no
> > convenient way to filter out flags for every object in a directory.
> >
> > Add ccflags-remove-y and asflags-remove-y to make it easily.
> >
> > Use ccflags-remove-y to clean up some Makefiles.
> >
> > Suggested-by: Sami Tolvanen <samitolvanen@google.com>
> > Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
> > ---
> >
> >  arch/arm/boot/compressed/Makefile | 6 +-----
> >  arch/powerpc/xmon/Makefile        | 3 +--
> >  arch/sh/boot/compressed/Makefile  | 5 +----
> >  kernel/trace/Makefile             | 4 ++--
> >  lib/Makefile                      | 5 +----
> >  scripts/Makefile.lib              | 4 ++--
> >  6 files changed, 8 insertions(+), 19 deletions(-)
> >
> > diff --git a/arch/powerpc/xmon/Makefile b/arch/powerpc/xmon/Makefile
> > index 89c76ca35640..55cbcdd88ac0 100644
> > --- a/arch/powerpc/xmon/Makefile
> > +++ b/arch/powerpc/xmon/Makefile
> > @@ -7,8 +7,7 @@ UBSAN_SANITIZE := n
> >  KASAN_SANITIZE := n
> >
> >  # Disable ftrace for the entire directory
> > -ORIG_CFLAGS := $(KBUILD_CFLAGS)
> > -KBUILD_CFLAGS = $(subst $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS))
> > +ccflags-remove-y += $(CC_FLAGS_FTRACE)
>
> This could be:
>
> ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
>
> Similar to kernel/trace/Makefile below.


I fixed it up, and applied to linux-kbuild.
Thanks.


> I don't mind though.
>
> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
>
> cheers
>
> > diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
> > index 6575bb0a0434..7492844a8b1b 100644
> > --- a/kernel/trace/Makefile
> > +++ b/kernel/trace/Makefile
> > @@ -2,9 +2,9 @@
> >
> >  # Do not instrument the tracer itself:
> >
> > +ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
> > +
> >  ifdef CONFIG_FUNCTION_TRACER
> > -ORIG_CFLAGS := $(KBUILD_CFLAGS)
> > -KBUILD_CFLAGS = $(subst $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS))
> >
> >  # Avoid recursion due to instrumentation.
> >  KCSAN_SANITIZE := n



-- 
Best Regards
Masahiro Yamada

^ permalink raw reply

* Re: [PATCH 3/3] powerpc/pseries: Add KVM guest doorbell restrictions
From: Paul Mackerras @ 2020-06-30  2:27 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: kvm-ppc, Anton Blanchard, Cédric Le Goater, linuxppc-dev,
	David Gibson
In-Reply-To: <20200627150428.2525192-4-npiggin@gmail.com>

On Sun, Jun 28, 2020 at 01:04:28AM +1000, Nicholas Piggin wrote:
> KVM guests have certain restrictions and performance quirks when
> using doorbells. This patch tests for KVM environment in doorbell
> setup, and optimises IPI performance:
> 
>  - PowerVM guests may now use doorbells even if they are secure.
> 
>  - KVM guests no longer use doorbells if XIVE is available.

It seems, from the fact that you completely remove
kvm_para_available(), that you perhaps haven't tried building with
CONFIG_KVM_GUEST=y.  Somewhat confusingly, that option is not used or
needed when building for a PAPR guest (i.e. the "pseries" platform)
but is used on non-IBM platforms using the "epapr" hypervisor
interface.

If you did intend to remove support for the epapr hypervisor interface
then that should have been talked about in the commit message (and
would I expect be controversial).

So NAK on the kvm_para_available() removal.

Paul.

^ permalink raw reply

* Re: [PATCH] xmon: Reset RCU and soft lockup watchdogs
From: Nicholas Piggin @ 2020-06-30  2:29 UTC (permalink / raw)
  To: Anton Blanchard, linuxppc-dev; +Cc: Paul Mackerras
In-Reply-To: <20200630100218.62a3c3fb@kryten.localdomain>

Excerpts from Anton Blanchard's message of June 30, 2020 10:02 am:
> I'm seeing RCU warnings when exiting xmon. xmon resets the NMI watchdog,
> but does nothing with the RCU stall or soft lockup watchdogs. Add a
> helper function that handles all three.
> 
> Signed-off-by: Anton Blanchard <anton@ozlabs.org>

Acked-by: Nicholas Piggin <npiggin@gmail.com>

> ---
>  arch/powerpc/xmon/xmon.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> index 7efe4bc3ccf6..d27944e38b04 100644
> --- a/arch/powerpc/xmon/xmon.c
> +++ b/arch/powerpc/xmon/xmon.c
> @@ -481,6 +481,13 @@ static inline int unrecoverable_excp(struct pt_regs *regs)
>  #endif
>  }
>  
> +static void xmon_touch_watchdogs(void)
> +{
> +	touch_softlockup_watchdog_sync();
> +	rcu_cpu_stall_reset();
> +	touch_nmi_watchdog();
> +}
> +
>  static int xmon_core(struct pt_regs *regs, int fromipi)
>  {
>  	int cmd = 0;
> @@ -718,7 +725,7 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
>  	else
>  		insert_cpu_bpts();
>  
> -	touch_nmi_watchdog();
> +	xmon_touch_watchdogs();
>  	local_irq_restore(flags);
>  
>  	return cmd != 'X' && cmd != EOF;
> -- 
> 2.26.2
> 
> 

^ permalink raw reply

* Re: [PATCH 04/11] ppc64/kexec_file: avoid stomping memory used by special regions
From: piliu @ 2020-06-30  3:30 UTC (permalink / raw)
  To: Hari Bathini, Michael Ellerman, Andrew Morton
  Cc: Kexec-ml, Petr Tesarik, Mahesh J Salgaonkar, Sourabh Jain, lkml,
	linuxppc-dev, Mimi Zohar, Thiago Jung Bauermann, Dave Young,
	Vivek Goyal, Eric Biederman
In-Reply-To: <f745de42-297e-6eed-d25b-ea21d6000dc5@linux.ibm.com>



On 06/29/2020 01:55 PM, Hari Bathini wrote:
> 
> 
> On 28/06/20 7:44 am, piliu wrote:
>> Hi Hari,
> 
> Hi Pingfan,
> 
>>
>> After a quick through for this series, I have a few question/comment on
>> this patch for the time being. Pls see comment inline.
>>
>> On 06/27/2020 03:05 AM, Hari Bathini wrote:
>>> crashkernel region could have an overlap with special memory regions
>>> like  opal, rtas, tce-table & such. These regions are referred to as
>>> exclude memory ranges. Setup this ranges during image probe in order
>>> to avoid them while finding the buffer for different kdump segments.
> 
> [...]
> 
>>> +	/*
>>> +	 * Use the locate_mem_hole logic in kexec_add_buffer() for regular
>>> +	 * kexec_file_load syscall
>>> +	 */
>>> +	if (kbuf->image->type != KEXEC_TYPE_CRASH)
>>> +		return 0;
>> Can the ranges overlap [crashk_res.start, crashk_res.end]?  Otherwise
>> there is no requirement for @exclude_ranges.
> 
> The ranges like rtas, opal are loaded by f/w. They almost always overlap with
> crashkernel region. So, @exclude_ranges is required to support kdump.
f/w passes rtas/opal as service, then must f/w mark these ranges as
fdt_reserved_mem in order to make kernel aware not to use these ranges?
Otherwise kernel memory allocation besides kdump can also overwrite
these ranges.

Hmm, revisiting reserve_crashkernel(). It seems not to take any reserved
memory into consider except kernel text. Could it work based on memblock
allocator?

Thanks,
Pingfan
> 
>> I guess you have a design for future. If not true, then it is better to
>> fold the condition "if (kbuf->image->type != KEXEC_TYPE_CRASH)" into the
>> caller and rename this function to better distinguish use cases between
>> kexec and kdump
> 
> Yeah, this condition will be folded. I have a follow-up patch for that explaining
> why kexec case should also be folded. Will try to add that to this series for v2.
> 
> Thanks
> Hari
> 


^ permalink raw reply

* Re: [PATCH V3 0/4] mm/debug_vm_pgtable: Add some more tests
From: Anshuman Khandual @ 2020-06-30  3:53 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-doc, Heiko Carstens, Paul Mackerras, H. Peter Anvin,
	linux-riscv, Will Deacon, linux-arch, linux-s390, Jonathan Corbet,
	x86, Mike Rapoport, Christian Borntraeger, Ingo Molnar,
	linux-arm-kernel, ziy, Catalin Marinas, linux-snps-arc,
	Vasily Gorbik, Vineet Gupta, Borislav Petkov, Paul Walmsley,
	Kirill A . Shutemov, Thomas Gleixner, gerald.schaefer,
	christophe.leroy, Vineet Gupta, linux-kernel, Palmer Dabbelt,
	Qian Cai, Andrew Morton, linuxppc-dev
In-Reply-To: <70ddc7dd-b688-b73e-642a-6363178c8cdd@arm.com>



On 06/24/2020 08:43 AM, Anshuman Khandual wrote:
> 
> 
> On 06/15/2020 09:07 AM, Anshuman Khandual wrote:
>> This series adds some more arch page table helper validation tests which
>> are related to core and advanced memory functions. This also creates a
>> documentation, enlisting expected semantics for all page table helpers as
>> suggested by Mike Rapoport previously (https://lkml.org/lkml/2020/1/30/40).
>>
>> There are many TRANSPARENT_HUGEPAGE and ARCH_HAS_TRANSPARENT_HUGEPAGE_PUD
>> ifdefs scattered across the test. But consolidating all the fallback stubs
>> is not very straight forward because ARCH_HAS_TRANSPARENT_HUGEPAGE_PUD is
>> not explicitly dependent on ARCH_HAS_TRANSPARENT_HUGEPAGE.
>>
>> Tested on arm64, x86 platforms but only build tested on all other enabled
>> platforms through ARCH_HAS_DEBUG_VM_PGTABLE i.e powerpc, arc, s390. The
>> following failure on arm64 still exists which was mentioned previously. It
>> will be fixed with the upcoming THP migration on arm64 enablement series.
>>
>> WARNING .... mm/debug_vm_pgtable.c:860 debug_vm_pgtable+0x940/0xa54
>> WARN_ON(!pmd_present(pmd_mkinvalid(pmd_mkhuge(pmd))))
>>
>> This series is based on v5.8-rc1.
>>
>> Changes in V3:
>>
>> - Replaced HAVE_ARCH_SOFT_DIRTY with MEM_SOFT_DIRTY
>> - Added HAVE_ARCH_HUGE_VMAP checks in pxx_huge_tests() per Gerald
>> - Updated documentation for pmd_thp_tests() per Zi Yan
>> - Replaced READ_ONCE() with huge_ptep_get() per Gerald
>> - Added pte_mkhuge() and masking with PMD_MASK per Gerald
>> - Replaced pte_same() with holding pfn check in pxx_swap_tests()
>> - Added documentation for all (#ifdef #else #endif) per Gerald
>> - Updated pmd_protnone_tests() per Gerald
>> - Updated HugeTLB PTE creation in hugetlb_advanced_tests() per Gerald
>> - Replaced [pmd|pud]_mknotpresent() with [pmd|pud]_mkinvalid()
>> - Added has_transparent_hugepage() check for PMD and PUD tests
>> - Added a patch which debug prints all individual tests being executed
>> - Updated documentation for renamed [pmd|pud]_mkinvalid() helpers
> 
> Hello Gerald/Christophe/Vineet,
> 
> It would be really great if you could give this series a quick test
> on s390/ppc/arc platforms respectively. Thank you.

Thanks Alexander, Gerald and Christophe for testing this out on s390
and ppc32 platforms. Probably Vineet and Qian (any other volunteers)
could help us with arc and ppc64 platforms, which I would appreciate.

^ permalink raw reply

* Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline
From: Srikar Dronamraju @ 2020-06-30  4:01 UTC (permalink / raw)
  To: Christopher Lameter
  Cc: Gautham R Shenoy, Michal Hocko, David Hildenbrand, Linus Torvalds,
	linux-kernel, linux-mm, Satheesh Rajendran, Mel Gorman,
	Kirill A. Shutemov, Andrew Morton, linuxppc-dev, Vlastimil Babka
In-Reply-To: <alpine.DEB.2.22.394.2006291456550.27163@www.lameter.com>

* Christopher Lameter <cl@linux.com> [2020-06-29 14:58:40]:

> On Wed, 24 Jun 2020, Srikar Dronamraju wrote:
> 
> > Currently Linux kernel with CONFIG_NUMA on a system with multiple
> > possible nodes, marks node 0 as online at boot.  However in practice,
> > there are systems which have node 0 as memoryless and cpuless.
> 
> Maybe add something to explain why you are not simply mapping the
> existing memory to NUMA node 0 which is after all just a numbering scheme
> used by the kernel and can be used arbitrarily?
> 

I thought Michal Hocko already gave a clear picture on why mapping is a bad
idea. https://lore.kernel.org/lkml/20200316085425.GB11482@dhcp22.suse.cz/t/#u
Are you suggesting that we add that as part of the changelog?

> This could be seen more as a bug in the arch code during the setup of NUMA
> nodes. The two nodes are created by the firmwware / bootstrap code after
> all. Just do not do it?
> 

- The arch/setup code in powerpc is not onlining these nodes. 
- Later on cpus/memory in node 0 can be onlined.
- Firmware in this case Phyp is an independent code by itself.

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply

* Re: [PATCH updated] libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier
From: Aneesh Kumar K.V @ 2020-06-30  5:01 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jan Kara, linux-nvdimm, Jeff Moyer, Oliver O'Halloran,
	Michal Suchánek, linuxppc-dev
In-Reply-To: <CAPcyv4hgjH4We9Th2oir3NxpJEhFuLnQeCrF8auwNfF+5av8jQ@mail.gmail.com>

Dan Williams <dan.j.williams@intel.com> writes:

> On Mon, Jun 29, 2020 at 1:29 PM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:
>>
>> Architectures like ppc64 provide persistent memory specific barriers
>> that will ensure that all stores for which the modifications are
>> written to persistent storage by preceding dcbfps and dcbstps
>> instructions have updated persistent storage before any data
>> access or data transfer caused by subsequent instructions is initiated.
>> This is in addition to the ordering done by wmb()
>>
>> Update nvdimm core such that architecture can use barriers other than
>> wmb to ensure all previous writes are architecturally visible for
>> the platform buffer flush.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>  drivers/md/dm-writecache.c   | 2 +-
>>  drivers/nvdimm/region_devs.c | 8 ++++----
>>  include/linux/libnvdimm.h    | 4 ++++
>>  3 files changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
>> index 74f3c506f084..8c6b6dce64e2 100644
>> --- a/drivers/md/dm-writecache.c
>> +++ b/drivers/md/dm-writecache.c
>> @@ -536,7 +536,7 @@ static void ssd_commit_superblock(struct dm_writecache *wc)
>>  static void writecache_commit_flushed(struct dm_writecache *wc, bool wait_for_ios)
>>  {
>>         if (WC_MODE_PMEM(wc))
>> -               wmb();
>> +               arch_pmem_flush_barrier();
>>         else
>>                 ssd_commit_flushed(wc, wait_for_ios);
>>  }
>> diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
>> index 4502f9c4708d..b308ad09b63d 100644
>> --- a/drivers/nvdimm/region_devs.c
>> +++ b/drivers/nvdimm/region_devs.c
>> @@ -1206,13 +1206,13 @@ int generic_nvdimm_flush(struct nd_region *nd_region)
>>         idx = this_cpu_add_return(flush_idx, hash_32(current->pid + idx, 8));
>>
>>         /*
>> -        * The first wmb() is needed to 'sfence' all previous writes
>> -        * such that they are architecturally visible for the platform
>> -        * buffer flush.  Note that we've already arranged for pmem
>> +        * The first arch_pmem_flush_barrier() is needed to 'sfence' all
>> +        * previous writes such that they are architecturally visible for
>> +        * the platform buffer flush. Note that we've already arranged for pmem
>>          * writes to avoid the cache via memcpy_flushcache().  The final
>>          * wmb() ensures ordering for the NVDIMM flush write.
>>          */
>> -       wmb();
>> +       arch_pmem_flush_barrier();
>>         for (i = 0; i < nd_region->ndr_mappings; i++)
>>                 if (ndrd_get_flush_wpq(ndrd, i, 0))
>>                         writeq(1, ndrd_get_flush_wpq(ndrd, i, idx));
>> diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
>> index 18da4059be09..66f6c65bd789 100644
>> --- a/include/linux/libnvdimm.h
>> +++ b/include/linux/libnvdimm.h
>> @@ -286,4 +286,8 @@ static inline void arch_invalidate_pmem(void *addr, size_t size)
>>  }
>>  #endif
>>
>> +#ifndef arch_pmem_flush_barrier
>> +#define arch_pmem_flush_barrier() wmb()
>> +#endif
>
> I think it is out of place to define this in libnvdimm.h and it is odd
> to give it such a long name. The other pmem api helpers like
> arch_wb_cache_pmem() and arch_invalidate_pmem() are function calls for
> libnvdimm driver operations, this barrier is just an instruction and
> is closer to wmb() than the pmem api routine.
>
> Since it is a store fence for pmem, so let's just call it pmem_wmb()
> and define the generic version in include/linux/compiler.h. It should
> probably also be documented alongside dma_wmb() in
> Documentation/memory-barriers.txt about why code would use it over
> wmb(), and why a symmetric pmem_rmb() is not needed.

How about the below? I used pmem_barrier() instead of pmem_wmb(). I
guess we wanted this to order() any data access not jus the following
stores to persistent storage? W.r.t why a symmetric pmem_rmb() is not
needed I was not sure how to explain that. Are you suggesting to explain
why a read/load from persistent storage don't want to wait for
pmem_barrier() ?

modified   Documentation/memory-barriers.txt
@@ -1935,6 +1935,16 @@ There are some more advanced barrier functions:
      relaxed I/O accessors and the Documentation/DMA-API.txt file for more
      information on consistent memory.
 
+ (*) pmem_barrier();
+
+     These are for use with persistent memory to esure the ordering of stores
+     to persistent memory region.
+
+     For example, after a non temporal write to persistent storage we use pmem_barrier()
+     to ensures that stores have updated the persistent storage before
+     any data access or data transfer caused by subsequent instructions is initiated.
+
 
 ===============================
 IMPLICIT KERNEL MEMORY BARRIERS
modified   arch/powerpc/include/asm/barrier.h
@@ -97,6 +97,19 @@ do {									\
 #define barrier_nospec()
 #endif /* CONFIG_PPC_BARRIER_NOSPEC */
 
+/*
+ * pmem_barrier() ensures that all stores for which the modification
+ * are written to persistent storage by preceding dcbfps/dcbstps
+ * instructions have updated persistent storage before any data
+ * access or data transfer caused by subsequent instructions is
+ * initiated.
+ */
+#define pmem_barrier pmem_barrier
+static inline void pmem_barrier(void)
+{
+	asm volatile(PPC_PHWSYNC ::: "memory");
+}
+
 #include <asm-generic/barrier.h>
 
 #endif /* _ASM_POWERPC_BARRIER_H */
modified   include/asm-generic/barrier.h
@@ -257,5 +257,16 @@ do {									\
 })
 #endif
 
+/*
+ * pmem_barrier() ensures that all stores for which the modification
+ * are written to persistent storage by preceding instructions have
+ * updated persistent storage before any data  access or data transfer
+ * caused by subsequent instructions is
+ * initiated.
+ */
+#ifndef pmem_barrier
+#define pmem_barrier  wmb()
+#endif
+
 #endif /* !__ASSEMBLY__ */
 #endif /* __ASM_GENERIC_BARRIER_H */


^ permalink raw reply

* Re: [PATCH v6 5/8] powerpc/pmem/of_pmem: Update of_pmem to use the new barrier instruction.
From: Aneesh Kumar K.V @ 2020-06-30  5:05 UTC (permalink / raw)
  To: Dan Williams; +Cc: Jan Kara, Michal Suchánek, linuxppc-dev, linux-nvdimm
In-Reply-To: <CAPcyv4hFfU0r0GmCgV-wKXq+H=GDV-OPsGNPWmFijrdWm58X_A@mail.gmail.com>

Dan Williams <dan.j.williams@intel.com> writes:

> On Mon, Jun 29, 2020 at 6:58 AM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:
>>
>> of_pmem on POWER10 can now use phwsync instead of hwsync to ensure
>> all previous writes are architecturally visible for the platform
>> buffer flush.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>  arch/powerpc/include/asm/cacheflush.h | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
>> index 54764c6e922d..95782f77d768 100644
>> --- a/arch/powerpc/include/asm/cacheflush.h
>> +++ b/arch/powerpc/include/asm/cacheflush.h
>> @@ -98,6 +98,13 @@ static inline void invalidate_dcache_range(unsigned long start,
>>         mb();   /* sync */
>>  }
>>
>> +#define arch_pmem_flush_barrier arch_pmem_flush_barrier
>> +static inline void  arch_pmem_flush_barrier(void)
>> +{
>> +       if (cpu_has_feature(CPU_FTR_ARCH_207S))
>> +               asm volatile(PPC_PHWSYNC ::: "memory");
>
> Shouldn't this fallback to a compatible store-fence in an else statement?

The idea was to avoid calling this on anything else. We ensure that by
making sure that pmem devices are not initialized on systems without that
cpu feature. Patch 1 does that. Also, the last patch adds a WARN_ON() to
catch the usage of this outside pmem devices and on systems without that
cpu feature.

-aneesh

^ permalink raw reply

* Re: [PATCH v6 7/8] powerpc/pmem: Add WARN_ONCE to catch the wrong usage of pmem flush functions.
From: Aneesh Kumar K.V @ 2020-06-30  5:05 UTC (permalink / raw)
  To: Dan Williams; +Cc: Jan Kara, Michal Suchánek, linuxppc-dev, linux-nvdimm
In-Reply-To: <CAPcyv4giMdgjNVudw1q7p-UpyLMTHTqTad=2Ks8ATNo==edKvQ@mail.gmail.com>

Dan Williams <dan.j.williams@intel.com> writes:

> On Mon, Jun 29, 2020 at 6:58 AM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:
>>
>> We only support persistent memory on P8 and above. This is enforced by the
>> firmware and further checked on virtualzied platform during platform init.
>> Add WARN_ONCE in pmem flush routines to catch the wrong usage of these.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>  arch/powerpc/include/asm/cacheflush.h | 2 ++
>>  arch/powerpc/lib/pmem.c               | 2 ++
>>  2 files changed, 4 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
>> index 95782f77d768..1ab0fa660497 100644
>> --- a/arch/powerpc/include/asm/cacheflush.h
>> +++ b/arch/powerpc/include/asm/cacheflush.h
>> @@ -103,6 +103,8 @@ static inline void  arch_pmem_flush_barrier(void)
>>  {
>>         if (cpu_has_feature(CPU_FTR_ARCH_207S))
>>                 asm volatile(PPC_PHWSYNC ::: "memory");
>> +       else
>> +               WARN_ONCE(1, "Using pmem flush on older hardware.");
>
> This seems too late to be making this determination. I'd expect the
> driver to fail to successfully bind default if this constraint is not
> met.

We do that in Patch 1.

-aneesh

^ permalink raw reply

* Re: [PATCH v2 1/3] powerpc/mm: Enable radix GTSE only if supported.
From: Aneesh Kumar K.V @ 2020-06-30  5:26 UTC (permalink / raw)
  To: Bharata B Rao, linuxppc-dev; +Cc: npiggin, Bharata B Rao
In-Reply-To: <20200626131000.5207-2-bharata@linux.ibm.com>

Bharata B Rao <bharata@linux.ibm.com> writes:

> Make GTSE an MMU feature and enable it by default for radix.
> However for guest, conditionally enable it if hypervisor supports
> it via OV5 vector. Let prom_init ask for radix GTSE only if the
> support exists.
>
> Having GTSE as an MMU feature will make it easy to enable radix
> without GTSE. Currently radix assumes GTSE is enabled by default.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/mmu.h    |  4 ++++
>  arch/powerpc/kernel/dt_cpu_ftrs.c |  1 +
>  arch/powerpc/kernel/prom_init.c   | 13 ++++++++-----
>  arch/powerpc/mm/init_64.c         |  5 ++++-
>  4 files changed, 17 insertions(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
> index f4ac25d4df05..884d51995934 100644
> --- a/arch/powerpc/include/asm/mmu.h
> +++ b/arch/powerpc/include/asm/mmu.h
> @@ -28,6 +28,9 @@
>   * Individual features below.
>   */
>  
> +/* Guest Translation Shootdown Enable */
> +#define MMU_FTR_GTSE			ASM_CONST(0x00001000)
> +
>  /*
>   * Support for 68 bit VA space. We added that from ISA 2.05
>   */
> @@ -173,6 +176,7 @@ enum {
>  #endif
>  #ifdef CONFIG_PPC_RADIX_MMU
>  		MMU_FTR_TYPE_RADIX |
> +		MMU_FTR_GTSE |
>  #ifdef CONFIG_PPC_KUAP
>  		MMU_FTR_RADIX_KUAP |
>  #endif /* CONFIG_PPC_KUAP */
> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
> index a0edeb391e3e..ac650c233cd9 100644
> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
> @@ -336,6 +336,7 @@ static int __init feat_enable_mmu_radix(struct dt_cpu_feature *f)
>  #ifdef CONFIG_PPC_RADIX_MMU
>  	cur_cpu_spec->mmu_features |= MMU_FTR_TYPE_RADIX;
>  	cur_cpu_spec->mmu_features |= MMU_FTRS_HASH_BASE;
> +	cur_cpu_spec->mmu_features |= MMU_FTR_GTSE;
>  	cur_cpu_spec->cpu_user_features |= PPC_FEATURE_HAS_MMU;
>  
>  	return 1;
> diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
> index 90c604d00b7d..cbc605cfdec0 100644
> --- a/arch/powerpc/kernel/prom_init.c
> +++ b/arch/powerpc/kernel/prom_init.c
> @@ -1336,12 +1336,15 @@ static void __init prom_check_platform_support(void)
>  		}
>  	}
>  
> -	if (supported.radix_mmu && supported.radix_gtse &&
> -	    IS_ENABLED(CONFIG_PPC_RADIX_MMU)) {
> -		/* Radix preferred - but we require GTSE for now */
> -		prom_debug("Asking for radix with GTSE\n");
> +	if (supported.radix_mmu && IS_ENABLED(CONFIG_PPC_RADIX_MMU)) {
> +		/* Radix preferred - Check if GTSE is also supported */
> +		prom_debug("Asking for radix\n");
>  		ibm_architecture_vec.vec5.mmu = OV5_FEAT(OV5_MMU_RADIX);
> -		ibm_architecture_vec.vec5.radix_ext = OV5_FEAT(OV5_RADIX_GTSE);
> +		if (supported.radix_gtse)
> +			ibm_architecture_vec.vec5.radix_ext =
> +					OV5_FEAT(OV5_RADIX_GTSE);
> +		else
> +			prom_debug("Radix GTSE isn't supported\n");
>  	} else if (supported.hash_mmu) {
>  		/* Default to hash mmu (if we can) */
>  		prom_debug("Asking for hash\n");
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index bc73abf0bc25..152aa0200cef 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -407,12 +407,15 @@ static void __init early_check_vec5(void)
>  		if (!(vec5[OV5_INDX(OV5_RADIX_GTSE)] &
>  						OV5_FEAT(OV5_RADIX_GTSE))) {
>  			pr_warn("WARNING: Hypervisor doesn't support RADIX with GTSE\n");
> -		}
> +			cur_cpu_spec->mmu_features &= ~MMU_FTR_GTSE;
> +		} else
> +			cur_cpu_spec->mmu_features |= MMU_FTR_GTSE;
>  		/* Do radix anyway - the hypervisor said we had to */
>  		cur_cpu_spec->mmu_features |= MMU_FTR_TYPE_RADIX;
>  	} else if (mmu_supported == OV5_FEAT(OV5_MMU_HASH)) {
>  		/* Hypervisor only supports hash - disable radix */
>  		cur_cpu_spec->mmu_features &= ~MMU_FTR_TYPE_RADIX;
> +		cur_cpu_spec->mmu_features &= ~MMU_FTR_GTSE;
>  	}
>  }
>  
> -- 
> 2.21.3

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox