LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v5 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public
From: Baoquan He @ 2018-06-12  9:38 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Linux Kernel Mailing List, Andrew Morton, Rob Herring,
	Dan Williams, Nicolas Pitre, Josh Triplett, kbuild test robot,
	Borislav Petkov, Patrik Jakobsson, David Airlie, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Dmitry Torokhov, Frank Rowand,
	Keith Busch, Jon Derrick, Lorenzo Pieralisi, Bjorn Helgaas,
	Thomas Gleixner, brijesh.singh, Jérôme Glisse,
	Tom Lendacky, Greg Kroah-Hartman, baiyaowei, richard.weiyang,
	devel, linux-input, linux-nvdimm, devicetree, linux-pci,
	Eric Biederman, Vivek Goyal, Dave Young, Yinghai Lu, kexec,
	Michal Simek, David S. Miller, Chris Zankel, Max Filippov,
	Gustavo Padovan, Maarten Lankhorst, Sean Paul, linux-parisc,
	open list:LINUX FOR POWERPC PA SEMI PWRFICIENT,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
In-Reply-To: <CAHp75Vd6v4+RTq-5uw_BJpT6=w2KGjTfU+rEst6+8igs2cyntw@mail.gmail.com>

On 06/12/18 at 11:29am, Andy Shevchenko wrote:
> On Tue, Jun 12, 2018 at 6:28 AM, Baoquan He <bhe@redhat.com> wrote:
> > reparent_resources() is duplicated in arch/microblaze/pci/pci-common.c
> > and arch/powerpc/kernel/pci-common.c, so move it to kernel/resource.c
> > so that it's shared. Later its code also need be updated using list_head
> > to replace singly linked list.
> 
> While this is a good deduplication of the code, some requirements for
> public functions would be good to satisfy.
> 
> > +/*
> > + * Reparent resource children of pr that conflict with res
> > + * under res, and make res replace those children.
> > + */
> 
> kernel doc format, though...

> 
> > +static int reparent_resources(struct resource *parent,
> > +                                    struct resource *res)
> 
> ...is it really public with static keyword?!

Thanks for looking into this. This is a code bug, I copied and changed,
but forgot merging the changing to local commit. And the error reported
by test robot in patch 2 was changed too locally, forgot merging it to
patch. Will repost to address this.

> 
> 
> 
> > +{
> 
> > +       for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) {
> > +               if (p->end < res->start)
> > +                       continue;
> > +               if (res->end < p->start)
> > +                       break;
> 
> > +               if (p->start < res->start || p->end > res->end)
> > +                       return -1;      /* not completely contained */
> 
> Usually we are expecting real eeror codes.

Hmm, I just copied it from arch/powerpc/kernel/pci-common.c. The
function interface expects an integer returned value, not sure what a
real error codes look like, could you give more hints? Will change
accordingly.

> 
> > +               if (firstpp == NULL)
> > +                       firstpp = pp;
> > +       }
> 
> > +       if (firstpp == NULL)
> > +               return -1;      /* didn't find any conflicting entries? */
> 
> Ditto.
> 
> > +}
> > +EXPORT_SYMBOL(reparent_resources);
> 
> -- 
> With Best Regards,
> Andy Shevchenko

^ permalink raw reply

* Re: [PATCH v5 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public
From: Baoquan He @ 2018-06-12  9:49 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Linux Kernel Mailing List, Andrew Morton, Rob Herring,
	Dan Williams, Nicolas Pitre, Josh Triplett, kbuild test robot,
	Borislav Petkov, Patrik Jakobsson, David Airlie, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Dmitry Torokhov, Frank Rowand,
	Keith Busch, Jon Derrick, Lorenzo Pieralisi, Bjorn Helgaas,
	Thomas Gleixner, brijesh.singh, Jérôme Glisse,
	Tom Lendacky, Greg Kroah-Hartman, baiyaowei, richard.weiyang,
	devel, linux-input, linux-nvdimm, devicetree, linux-pci,
	Eric Biederman, Vivek Goyal, Dave Young, Yinghai Lu, kexec,
	Michal Simek, David S. Miller, Chris Zankel, Max Filippov,
	Gustavo Padovan, Maarten Lankhorst, Sean Paul, linux-parisc,
	open list:LINUX FOR POWERPC PA SEMI PWRFICIENT,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
In-Reply-To: <CAHp75Vd6v4+RTq-5uw_BJpT6=w2KGjTfU+rEst6+8igs2cyntw@mail.gmail.com>

On 06/12/18 at 11:29am, Andy Shevchenko wrote:
> On Tue, Jun 12, 2018 at 6:28 AM, Baoquan He <bhe@redhat.com> wrote:
> > reparent_resources() is duplicated in arch/microblaze/pci/pci-common.c
> > and arch/powerpc/kernel/pci-common.c, so move it to kernel/resource.c
> > so that it's shared. Later its code also need be updated using list_head
> > to replace singly linked list.
> 
> While this is a good deduplication of the code, some requirements for
> public functions would be good to satisfy.
> 
> > +/*
> > + * Reparent resource children of pr that conflict with res
> > + * under res, and make res replace those children.
> > + */
> 
> kernel doc format, though...

Will rewrite it, thanks.

> 
> > +static int reparent_resources(struct resource *parent,
> > +                                    struct resource *res)
> 
> ...is it really public with static keyword?!
> 
> 
> 
> > +{
> 
> > +       for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) {
> > +               if (p->end < res->start)
> > +                       continue;
> > +               if (res->end < p->start)
> > +                       break;
> 
> > +               if (p->start < res->start || p->end > res->end)
> > +                       return -1;      /* not completely contained */
> 
> Usually we are expecting real eeror codes.
> 
> > +               if (firstpp == NULL)
> > +                       firstpp = pp;
> > +       }
> 
> > +       if (firstpp == NULL)
> > +               return -1;      /* didn't find any conflicting entries? */
> 
> Ditto.
> 
> > +}
> > +EXPORT_SYMBOL(reparent_resources);
> 
> -- 
> With Best Regards,
> Andy Shevchenko

^ permalink raw reply

* Re: [PATCH v2 3/3] powerpc/fsl: Implement cpu_show_spectre_v1/v2 for NXP PowerPC Book3E
From: Michal Suchánek @ 2018-06-12 10:07 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: Diana Madalina Craciun, linuxppc-dev@lists.ozlabs.org,
	oss@buserror.net, Leo Li
In-Reply-To: <AM5PR0401MB2545BC30FB194E4D4E42ADA89A7F0@AM5PR0401MB2545.eurprd04.prod.outlook.com>

On Tue, 12 Jun 2018 02:59:11 +0000
Bharat Bhushan <bharat.bhushan@nxp.com> wrote:

> Hi Diana,
> 
> > -----Original Message-----
> > From: Diana Craciun [mailto:diana.craciun@nxp.com]
> > Sent: Monday, June 11, 2018 6:23 PM
> > To: linuxppc-dev@lists.ozlabs.org
> > Cc: mpe@ellerman.id.au; oss@buserror.net; Leo Li
> > <leoyang.li@nxp.com>; Bharat Bhushan <bharat.bhushan@nxp.com>;
> > Diana Madalina Craciun <diana.craciun@nxp.com>
> > Subject: [PATCH v2 3/3] powerpc/fsl: Implement
> > cpu_show_spectre_v1/v2 for NXP PowerPC Book3E  
> 
> Please add some description

To me the subject is self-explanatory. It implements a kernel interface
that was already described elsewhere.

What are you missing here?

Thanks

Michal

> 
> > 
> > Signed-off-by: Diana Craciun <diana.craciun@nxp.com>
> > ---
> >  arch/powerpc/Kconfig           |  2 +-
> >  arch/powerpc/kernel/security.c | 15 +++++++++++++++
> >  2 files changed, 16 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index
> > 940c955..a781d60 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -170,7 +170,7 @@ config PPC
> >  	select GENERIC_CLOCKEVENTS_BROADCAST	if SMP
> >  	select GENERIC_CMOS_UPDATE
> >  	select GENERIC_CPU_AUTOPROBE
> > -	select GENERIC_CPU_VULNERABILITIES	if PPC_BOOK3S_64
> > +	select GENERIC_CPU_VULNERABILITIES	if PPC_BOOK3S_64
> > || PPC_FSL_BOOK3E
> >  	select GENERIC_IRQ_SHOW
> >  	select GENERIC_IRQ_SHOW_LEVEL
> >  	select GENERIC_SMP_IDLE_THREAD
> > diff --git a/arch/powerpc/kernel/security.c
> > b/arch/powerpc/kernel/security.c index 797c975..aceaadc 100644
> > --- a/arch/powerpc/kernel/security.c
> > +++ b/arch/powerpc/kernel/security.c
> > @@ -183,3 +183,18 @@ ssize_t cpu_show_spectre_v2(struct device *dev,
> > struct device_attribute *attr, c  }  #endif /* CONFIG_PPC_BOOK3S_64
> > */
> > 
> > +#ifdef CONFIG_PPC_FSL_BOOK3E
> > +ssize_t cpu_show_spectre_v1(struct device *dev, struct
> > device_attribute +*attr, char *buf) {
> > +	if (barrier_nospec_enabled)
> > +		return sprintf(buf, "Mitigation: __user pointer
> > sanitization\n"); +
> > +	return sprintf(buf, "Vulnerable\n");
> > +}
> > +
> > +ssize_t cpu_show_spectre_v2(struct device *dev, struct
> > device_attribute +*attr, char *buf) {
> > +	return sprintf(buf, "Vulnerable\n");
> > +}
> > +#endif /* CONFIG_PPC_FSL_BOOK3E */
> > +
> > --
> > 2.5.5  
> 

^ permalink raw reply

* Re: [powerpc/powervm]kernel BUG at mm/memory_hotplug.c:1864!
From: Balbir Singh @ 2018-06-12 10:28 UTC (permalink / raw)
  To: vrbagal1, Oscar Salvador
  Cc: sachinp, linux-mm, linuxppc-dev, nfont, Linuxppc-dev, linux-next
In-Reply-To: <0aac625ee724d877b87c69bba5ac9a0e@linux.vnet.ibm.com>



On 11/06/18 17:41, vrbagal1 wrote:
> On 2018-06-08 17:45, Oscar Salvador wrote:
>> On Fri, Jun 08, 2018 at 05:11:24PM +0530, vrbagal1 wrote:
>>> On 2018-06-08 16:58, Oscar Salvador wrote:
>>> >On Fri, Jun 08, 2018 at 04:44:24PM +0530, vrbagal1 wrote:
>>> >>Greetings!!!
>>> >>
>>> >>I am seeing kernel bug followed by oops message and system reboots,
>>> >>while
>>> >>running dlpar memory hotplug test.
>>> >>
>>> >>Machine Details: Power6 PowerVM Platform
>>> >>GCC version: (gcc version 4.8.3 20140911 (Red Hat 4.8.3-7) (GCC))
>>> >>Test case: dlpar memory hotplug test (https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/memhotplug.py)
>>> >>Kernel Version: Linux version 4.17.0-autotest
>>> >>
>>> >>I am seeing this bug on rc7 as well.
> 
> Observing similar traces on linux next kernel: 4.17.0-next-20180608-autotest
> 
>  Block size [0x4000000] unaligned hotplug range: start 0x220000000, size 0x1000000

size < block_size in this case, why? how? Could you confirm that the block size is 64MB and your trying to remove 16MB

Balbir Singh.

^ permalink raw reply

* Re: pkeys on POWER: Access rights not reset on execve
From: Florian Weimer @ 2018-06-12 12:17 UTC (permalink / raw)
  To: Ram Pai; +Cc: Linux-MM, linuxppc-dev, Andy Lutomirski, Dave Hansen
In-Reply-To: <20180611200807.GA5773@ram.oc3035372033.ibm.com>

On 06/11/2018 10:08 PM, Ram Pai wrote:
> Ok. try this patch. This patch is on top of the 5 patches that I had
> sent last week i.e  "[PATCH  0/5] powerpc/pkeys: fixes to pkeys"
> 
> The following is a draft patch though to check if it meets your
> expectations.
> 
> commit fe53b5fe2dcb3139ea27ade3ae7cbbe43c4af3be
> Author: Ram Pai<linuxram@us.ibm.com>
> Date:   Mon Jun 11 14:57:34 2018 -0500
> 
>      powerpc/pkeys: Deny read/write/execute by default

With this patch, my existing misc/tst-pkey test in glibc passes.  The 
in-tree version still has some incorrect assumptions on implementation 
behavior, but those are test bugs.  The kernel behavior with your patch 
look good to me.  Thanks.

Florian

^ permalink raw reply

* Re: [PATCH 5/5] powerpc/pkeys: make protection key 0 less special
From: Florian Weimer @ 2018-06-12 12:18 UTC (permalink / raw)
  To: Ram Pai, mpe
  Cc: Ulrich.Weigand, mhocko, dave.hansen, aneesh.kumar, luto, bauerman,
	linuxppc-dev
In-Reply-To: <1528164594-23823-6-git-send-email-linuxram@us.ibm.com>

On 06/05/2018 04:09 AM, Ram Pai wrote:
> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> index 0409c80..d349e22 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -13,7 +13,10 @@
>   
>   DECLARE_STATIC_KEY_TRUE(pkey_disabled);
>   extern int pkeys_total; /* total pkeys as per device tree */
> -extern u32 initial_allocation_mask; /* bits set for reserved keys */
> +extern u32 initial_allocation_mask; /*  bits set for the initially allocated keys */
> +extern u32 reserved_allocation_mask; /* bits set for reserved keys */
> +
> +#define PKEY_0	0
>   
>   /*
>    * Define these here temporarily so we're not dependent on patching linux/mm.h.

This part no longer applies cleanly.

Thanks,
Florian

^ permalink raw reply

* Re: [v3 PATCH 4/5] powerpc/pseries: Dump and flush SLB contents on SLB MCE errors.
From: Michael Ellerman @ 2018-06-12 13:47 UTC (permalink / raw)
  To: Mahesh J Salgaonkar, linuxppc-dev
  Cc: Aneesh Kumar K.V, Aneesh Kumar K.V, Laurent Dufour,
	Nicholas Piggin
In-Reply-To: <152839253238.25118.3114450844744290470.stgit@jupiter.in.ibm.com>

Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> writes:
> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
> index 2edc673be137..e56759d92356 100644
> --- a/arch/powerpc/platforms/pseries/ras.c
> +++ b/arch/powerpc/platforms/pseries/ras.c
> @@ -422,6 +422,31 @@ int pSeries_system_reset_exception(struct pt_regs *regs)
>  	return 0; /* need to perform reset */
>  }
>  
> +static int mce_handle_error(struct rtas_error_log *errp)
> +{
> +	struct pseries_errorlog *pseries_log;
> +	struct pseries_mc_errorlog *mce_log;
> +	int disposition = rtas_error_disposition(errp);
> +	uint8_t error_type;
> +
> +	pseries_log = get_pseries_errorlog(errp, PSERIES_ELOG_SECT_ID_MCE);
> +	if (pseries_log == NULL)
> +		goto out;
> +
> +	mce_log = (struct pseries_mc_errorlog *)pseries_log->data;
> +	error_type = rtas_mc_error_type(mce_log);
> +
> +	if ((disposition == RTAS_DISP_NOT_RECOVERED) &&
> +			(error_type == PSERIES_MC_ERROR_TYPE_SLB)) {
> +		slb_dump_contents();
> +		slb_flush_and_rebolt();

Aren't we back in virtual mode here?

Don't we need to do the flush in real mode before turning the MMU back
on. Otherwise we'll just take another multi-hit?

cheers

^ permalink raw reply

* Re: [RFC PATCH 1/3] Revert "mm: always flush VMA ranges affected by zap_page_range"
From: Aneesh Kumar K.V @ 2018-06-12 13:53 UTC (permalink / raw)
  To: Nicholas Piggin, linux-mm
  Cc: linuxppc-dev, linux-arch, Aneesh Kumar K . V, Minchan Kim,
	Mel Gorman, Nadav Amit, Andrew Morton, Linus Torvalds
In-Reply-To: <20180612071621.26775-2-npiggin@gmail.com>

On 06/12/2018 12:46 PM, Nicholas Piggin wrote:
> This reverts commit 4647706ebeee6e50f7b9f922b095f4ec94d581c3.
> 
> Patch 99baac21e4585 ("mm: fix MADV_[FREE|DONTNEED] TLB flush miss
> problem") provides a superset of the TLB flush coverage of this
> commit, and even includes in the changelog "this patch supersedes
> 'mm: Always flush VMA ranges affected by zap_page_range v2'".
> 
> Reverting this avoids double flushing the TLB range, and the less
> efficient flush_tlb_range() call (the mmu_gather API is more precise
> about what ranges it invalidates).
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   mm/memory.c | 14 +-------------
>   1 file changed, 1 insertion(+), 13 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 7206a634270b..9d472e00fc2d 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1603,20 +1603,8 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start,
>   	tlb_gather_mmu(&tlb, mm, start, end);
>   	update_hiwater_rss(mm);
>   	mmu_notifier_invalidate_range_start(mm, start, end);
> -	for ( ; vma && vma->vm_start < end; vma = vma->vm_next) {
> +	for ( ; vma && vma->vm_start < end; vma = vma->vm_next)
>   		unmap_single_vma(&tlb, vma, start, end, NULL);
> -
> -		/*
> -		 * zap_page_range does not specify whether mmap_sem should be
> -		 * held for read or write. That allows parallel zap_page_range
> -		 * operations to unmap a PTE and defer a flush meaning that
> -		 * this call observes pte_none and fails to flush the TLB.
> -		 * Rather than adding a complex API, ensure that no stale
> -		 * TLB entries exist when this call returns.
> -		 */
> -		flush_tlb_range(vma, start, end);
> -	}
> -
>   	mmu_notifier_invalidate_range_end(mm, start, end);
>   	tlb_finish_mmu(&tlb, start, end);
>   }
> 

No really related to this patch, but does 99baac21e4585 do the right 
thing if the range start - end covers pages with multiple page sizes?

-aneesh

^ permalink raw reply

* Re: [PATCH v5 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public
From: Andy Shevchenko @ 2018-06-12 14:20 UTC (permalink / raw)
  To: Baoquan He
  Cc: Linux Kernel Mailing List, Andrew Morton, Rob Herring,
	Dan Williams, Nicolas Pitre, Josh Triplett, kbuild test robot,
	Borislav Petkov, Patrik Jakobsson, David Airlie, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Dmitry Torokhov, Frank Rowand,
	Keith Busch, Jon Derrick, Lorenzo Pieralisi, Bjorn Helgaas,
	Thomas Gleixner, brijesh.singh, Jérôme Glisse,
	Tom Lendacky, Greg Kroah-Hartman, baiyaowei, richard.weiyang,
	devel, linux-input, linux-nvdimm, devicetree, linux-pci,
	Eric Biederman, Vivek Goyal, Dave Young, Yinghai Lu, kexec,
	Michal Simek, David S. Miller, Chris Zankel, Max Filippov,
	Gustavo Padovan, Maarten Lankhorst, Sean Paul, linux-parisc,
	open list:LINUX FOR POWERPC PA SEMI PWRFICIENT,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
In-Reply-To: <20180612093812.GC1820@MiWiFi-R3L-srv>

On Tue, Jun 12, 2018 at 12:38 PM, Baoquan He <bhe@redhat.com> wrote:
> On 06/12/18 at 11:29am, Andy Shevchenko wrote:
>> On Tue, Jun 12, 2018 at 6:28 AM, Baoquan He <bhe@redhat.com> wrote:

>> > +{
>>
>> > +       for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) {
>> > +               if (p->end < res->start)
>> > +                       continue;
>> > +               if (res->end < p->start)
>> > +                       break;
>>
>> > +               if (p->start < res->start || p->end > res->end)
>> > +                       return -1;      /* not completely contained */
>>
>> Usually we are expecting real eeror codes.
>
> Hmm, I just copied it from arch/powerpc/kernel/pci-common.c. The
> function interface expects an integer returned value, not sure what a
> real error codes look like, could you give more hints? Will change
> accordingly.

I briefly looked at the code and error codes we have, so, my proposal
is one of the following
 - use -ECANCELED (not the best choice for first occurrence here,
though I can't find better)
 - use positive integers (or enum), like
  #define RES_REPARENTED 0
  #define RES_OVERLAPPED 1
  #define RES_NOCONFLICT 2


>> > +               if (firstpp == NULL)
>> > +                       firstpp = pp;
>> > +       }
>>
>> > +       if (firstpp == NULL)
>> > +               return -1;      /* didn't find any conflicting entries? */
>>
>> Ditto.

Ditto.

>>
>> > +}
>> > +EXPORT_SYMBOL(reparent_resources);

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply

* Re: [PATCH v5 1/4] resource: Move reparent_resources() to kernel/resource.c and make it public
From: Andy Shevchenko @ 2018-06-12 14:24 UTC (permalink / raw)
  To: Baoquan He
  Cc: Linux Kernel Mailing List, Andrew Morton, Rob Herring,
	Dan Williams, Nicolas Pitre, Josh Triplett, kbuild test robot,
	Borislav Petkov, Patrik Jakobsson, David Airlie, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Dmitry Torokhov, Frank Rowand,
	Keith Busch, Jon Derrick, Lorenzo Pieralisi, Bjorn Helgaas,
	Thomas Gleixner, brijesh.singh, Jérôme Glisse,
	Tom Lendacky, Greg Kroah-Hartman, baiyaowei, richard.weiyang,
	devel, linux-input, linux-nvdimm, devicetree, linux-pci,
	Eric Biederman, Vivek Goyal, Dave Young, Yinghai Lu, kexec,
	Michal Simek, David S. Miller, Chris Zankel, Max Filippov,
	Gustavo Padovan, Maarten Lankhorst, Sean Paul, linux-parisc,
	open list:LINUX FOR POWERPC PA SEMI PWRFICIENT,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
In-Reply-To: <CAHp75Vf_kBLkE6v=JyOdfNoWktWEfKs7JzRP1XEc4TeuT5xqfw@mail.gmail.com>

On Tue, Jun 12, 2018 at 5:20 PM, Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> On Tue, Jun 12, 2018 at 12:38 PM, Baoquan He <bhe@redhat.com> wrote:
>> On 06/12/18 at 11:29am, Andy Shevchenko wrote:
>>> On Tue, Jun 12, 2018 at 6:28 AM, Baoquan He <bhe@redhat.com> wrote:
>
>>> > +{
>>>
>>> > +       for (pp = &parent->child; (p = *pp) != NULL; pp = &p->sibling) {
>>> > +               if (p->end < res->start)
>>> > +                       continue;
>>> > +               if (res->end < p->start)
>>> > +                       break;
>>>
>>> > +               if (p->start < res->start || p->end > res->end)
>>> > +                       return -1;      /* not completely contained */
>>>
>>> Usually we are expecting real eeror codes.
>>
>> Hmm, I just copied it from arch/powerpc/kernel/pci-common.c. The
>> function interface expects an integer returned value, not sure what a
>> real error codes look like, could you give more hints? Will change
>> accordingly.
>
> I briefly looked at the code and error codes we have, so, my proposal
> is one of the following

>  - use -ECANCELED (not the best choice for first occurrence here,
> though I can't find better)

Actually -ENOTSUPP might suit the first case (although the actual
would be something like -EOVERLAP, which we don't have)

>  - use positive integers (or enum), like
>   #define RES_REPARENTED 0
>   #define RES_OVERLAPPED 1
>   #define RES_NOCONFLICT 2
>
>
>>> > +               if (firstpp == NULL)
>>> > +                       firstpp = pp;
>>> > +       }
>>>
>>> > +       if (firstpp == NULL)
>>> > +               return -1;      /* didn't find any conflicting entries? */
>>>
>>> Ditto.
>
> Ditto.
>
>>>
>>> > +}
>>> > +EXPORT_SYMBOL(reparent_resources);
>
> --
> With Best Regards,
> Andy Shevchenko



-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply

* RE: [v3, 00/10] Support DPAA PTP clock and timestamping
From: Madalin-cristian Bucur @ 2018-06-12 14:27 UTC (permalink / raw)
  To: Y.b. Lu, netdev@vger.kernel.org, Richard Cochran, Rob Herring,
	Shawn Guo, David S . Miller
  Cc: devicetree@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Y.b. Lu
In-Reply-To: <20180607092050.46128-1-yangbo.lu@nxp.com>

> -----Original Message-----
> From: Yangbo Lu [mailto:yangbo.lu@nxp.com]
> Sent: Thursday, June 7, 2018 12:21 PM
> To: netdev@vger.kernel.org; Madalin-cristian Bucur
> <madalin.bucur@nxp.com>; Richard Cochran <richardcochran@gmail.com>;
> Rob Herring <robh+dt@kernel.org>; Shawn Guo <shawnguo@kernel.org>;
> David S . Miller <davem@davemloft.net>
> Cc: devicetree@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-arm-
> kernel@lists.infradead.org; linux-kernel@vger.kernel.org; Y.b. Lu
> <yangbo.lu@nxp.com>
> Subject: [v3, 00/10] Support DPAA PTP clock and timestamping
>=20
> This patchset is to support DPAA FMAN PTP clock and HW timestamping.
> It had been verified on both ARM platform and PPC platform.
> - The patch #1 to patch #5 are to support DPAA FMAN 1588 timer in
>   ptp_qoriq driver.
> - The patch #6 to patch #10 are to add HW timestamping support in
>   DPAA ethernet driver.
>=20
> Yangbo Lu (10):
>   fsl/fman: share the event interrupt
>   ptp: support DPAA FMan 1588 timer in ptp_qoriq
>   dt-binding: ptp_qoriq: add DPAA FMan support
>   powerpc/mpc85xx: move ptp timer out of fman in dts
>   arm64: dts: fsl: move ptp timer out of fman
>   fsl/fman: add set_tstamp interface
>   fsl/fman_port: support getting timestamp
>   fsl/fman: define frame description command UPD
>   dpaa_eth: add support for hardware timestamping
>   dpaa_eth: add the get_ts_info interface for ethtool
>=20
>  Documentation/devicetree/bindings/net/fsl-fman.txt |   25 +-----
>  .../devicetree/bindings/ptp/ptp-qoriq.txt          |   15 +++-
>  arch/arm64/boot/dts/freescale/qoriq-fman3-0.dtsi   |   14 ++-
>  arch/powerpc/boot/dts/fsl/qoriq-fman-0.dtsi        |   14 ++-
>  arch/powerpc/boot/dts/fsl/qoriq-fman-1.dtsi        |   14 ++-
>  arch/powerpc/boot/dts/fsl/qoriq-fman3-0.dtsi       |   14 ++-
>  arch/powerpc/boot/dts/fsl/qoriq-fman3-1.dtsi       |   14 ++-
>  arch/powerpc/boot/dts/fsl/qoriq-fman3l-0.dtsi      |   14 ++-
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c     |   88
> ++++++++++++++++-
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.h     |    3 +
>  drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c |   39 ++++++++
>  drivers/net/ethernet/freescale/fman/fman.c         |    3 +-
>  drivers/net/ethernet/freescale/fman/fman.h         |    1 +
>  drivers/net/ethernet/freescale/fman/fman_dtsec.c   |   27 +++++
>  drivers/net/ethernet/freescale/fman/fman_dtsec.h   |    1 +
>  drivers/net/ethernet/freescale/fman/fman_memac.c   |    5 +
>  drivers/net/ethernet/freescale/fman/fman_memac.h   |    1 +
>  drivers/net/ethernet/freescale/fman/fman_port.c    |   12 +++
>  drivers/net/ethernet/freescale/fman/fman_port.h    |    2 +
>  drivers/net/ethernet/freescale/fman/fman_tgec.c    |   21 ++++
>  drivers/net/ethernet/freescale/fman/fman_tgec.h    |    1 +
>  drivers/net/ethernet/freescale/fman/mac.c          |    3 +
>  drivers/net/ethernet/freescale/fman/mac.h          |    1 +
>  drivers/ptp/Kconfig                                |    2 +-
>  drivers/ptp/ptp_qoriq.c                            |  104 ++++++++++++--=
-----
>  include/linux/fsl/ptp_qoriq.h                      |   38 ++++++--
>  26 files changed, 361 insertions(+), 115 deletions(-)

Acked-by: Madalin Bucur <madalin.bucur@nxp.com>

^ permalink raw reply

* Re: [PATCH v6 3/4] powerpc/lib: implement strlen() in assembly
From: Segher Boessenkool @ 2018-06-12 14:53 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	wei.guo.simon, linux-kernel, linuxppc-dev
In-Reply-To: <8b89f2e21f7e3a865105eeeeda509243db393454.1528791416.git.christophe.leroy@c-s.fr>

On Tue, Jun 12, 2018 at 09:14:53AM +0000, Christophe Leroy wrote:
> ---
> Not tested on PPC64.

It won't be acceptable until that happens.  It also is likely quite bad
performance on all 64-bit CPUs from the last fifteen years or so.  Or you
did nothing to prove otherwise, at least.

> + * Algorigthm:

Typo.


Segher

^ permalink raw reply

* Re: [PATCH v5 2/4] resource: Use list_head to link sibling resource
From: Julia Lawall @ 2018-06-12 15:10 UTC (permalink / raw)
  To: Baoquan He
  Cc: brijesh.singh, devicetree, airlied, linux-pci, richard.weiyang,
	keith.busch, jcmvbkbc, baiyaowei, frowand.list, lorenzo.pieralisi,
	sthemmin, Baoquan He, linux-nvdimm, patrik.r.jakobsson,
	linux-input, gustavo, dyoung, vgoyal, thomas.lendacky, haiyangz,
	maarten.lankhorst, jglisse, seanpaul, bhelgaas, tglx, yinghai,
	jonathan.derrick, chris, monstr, linux-parisc, gregkh,
	dmitry.torokhov, kexec, ebiederm, devel, linuxppc-dev, davem,
	kbuild-all

This looks wrong.  After a list iterator, the index variable points to a
dummy structure.

julia

url:    https://github.com/0day-ci/linux/commits/Baoquan-He/resource-Use-list_head-to-link-sibling-resource/20180612-113600
:::::: branch date: 7 hours ago
:::::: commit date: 7 hours ago

>> kernel/resource.c:265:17-20: ERROR: invalid reference to the index variable of the iterator on line 253

# https://github.com/0day-ci/linux/commit/e906f15906750a86913ba2b1f08bad99129d3dfc
git remote add linux-review https://github.com/0day-ci/linux
git remote update linux-review
git checkout e906f15906750a86913ba2b1f08bad99129d3dfc
vim +265 kernel/resource.c

^1da177e4 Linus Torvalds 2005-04-16  247
5eeec0ec9 Yinghai Lu     2009-12-22  248  static void __release_child_resources(struct resource *r)
5eeec0ec9 Yinghai Lu     2009-12-22  249  {
e906f1590 Baoquan He     2018-06-12  250  	struct resource *tmp, *next;
5eeec0ec9 Yinghai Lu     2009-12-22  251  	resource_size_t size;
5eeec0ec9 Yinghai Lu     2009-12-22  252
e906f1590 Baoquan He     2018-06-12 @253  	list_for_each_entry_safe(tmp, next, &r->child, sibling) {
5eeec0ec9 Yinghai Lu     2009-12-22  254  		tmp->parent = NULL;
e906f1590 Baoquan He     2018-06-12  255  		INIT_LIST_HEAD(&tmp->sibling);
5eeec0ec9 Yinghai Lu     2009-12-22  256  		__release_child_resources(tmp);
5eeec0ec9 Yinghai Lu     2009-12-22  257
5eeec0ec9 Yinghai Lu     2009-12-22  258  		printk(KERN_DEBUG "release child resource %pR\n", tmp);
5eeec0ec9 Yinghai Lu     2009-12-22  259  		/* need to restore size, and keep flags */
5eeec0ec9 Yinghai Lu     2009-12-22  260  		size = resource_size(tmp);
5eeec0ec9 Yinghai Lu     2009-12-22  261  		tmp->start = 0;
5eeec0ec9 Yinghai Lu     2009-12-22  262  		tmp->end = size - 1;
5eeec0ec9 Yinghai Lu     2009-12-22  263  	}
e906f1590 Baoquan He     2018-06-12  264
e906f1590 Baoquan He     2018-06-12 @265  	INIT_LIST_HEAD(&tmp->child);
5eeec0ec9 Yinghai Lu     2009-12-22  266  }
5eeec0ec9 Yinghai Lu     2009-12-22  267

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply

* Re: [PATCH v6 3/4] powerpc/lib: implement strlen() in assembly
From: Christophe LEROY @ 2018-06-12 17:01 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	wei.guo.simon, linux-kernel, linuxppc-dev
In-Reply-To: <20180612145315.GJ27520@gate.crashing.org>



Le 12/06/2018 à 16:53, Segher Boessenkool a écrit :
> On Tue, Jun 12, 2018 at 09:14:53AM +0000, Christophe Leroy wrote:
>> ---
>> Not tested on PPC64.
> 
> It won't be acceptable until that happens.  It also is likely quite bad
> performance on all 64-bit CPUs from the last fifteen years or so.  Or you
> did nothing to prove otherwise, at least.

Will it be as bad as the generic implementation which does it byte per 
byte ?

I don't have any 64 bits target, can someone test it using the test app 
I have added in selftests ?

Or should I just leave it as is for 64 bits and just do the 
implementation for 32 bits until someone wants to try and do it for PPC64 ?

Christophe

> 
>> + * Algorigthm:
> 
> Typo.
> 
> 
> Segher
> 

^ permalink raw reply

* Re: [RFC PATCH 2/3] mm: mmu_gather track of invalidated TLB ranges explicitly for more precise flushing
From: Linus Torvalds @ 2018-06-12 18:14 UTC (permalink / raw)
  To: Nick Piggin
  Cc: linux-mm, ppc-dev, linux-arch, Aneesh Kumar K. V, Minchan Kim,
	Mel Gorman, Nadav Amit, Andrew Morton
In-Reply-To: <20180612071621.26775-3-npiggin@gmail.com>

On Tue, Jun 12, 2018 at 12:16 AM Nicholas Piggin <npiggin@gmail.com> wrote:
>
> +static inline void __tlb_adjust_page_range(struct mmu_gather *tlb,
> +                                     unsigned long address,
> +                                     unsigned int range_size)
> +{
> +       tlb->page_start = min(tlb->page_start, address);
> +       tlb->page_end = max(tlb->page_end, address + range_size);
> +}

Why add this unnecessary complexity for architectures where it doesn't matter?

This is not "generic". This is some crazy powerpc special case. Why
add it to generic code, and why make everybody else take the cost?

                    Linus

^ permalink raw reply

* Re: [RFC PATCH 3/3] powerpc/64s/radix: optimise TLB flush with precise TLB ranges in mmu_gather
From: Linus Torvalds @ 2018-06-12 18:18 UTC (permalink / raw)
  To: Nick Piggin
  Cc: linux-mm, ppc-dev, linux-arch, Aneesh Kumar K. V, Minchan Kim,
	Mel Gorman, Nadav Amit, Andrew Morton
In-Reply-To: <20180612071621.26775-4-npiggin@gmail.com>

On Tue, Jun 12, 2018 at 12:16 AM Nicholas Piggin <npiggin@gmail.com> wrote:
>
> This brings the number of tlbiel instructions required by a kernel
> compile from 33M to 25M, most avoided from exec->shift_arg_pages.

And this shows that "page_start/end" is purely for powerpc and used
nowhere else.

The previous patch should have been to purely powerpc page table
walking and not touch asm-generic/tlb.h

I think you should make those changes to
arch/powerpc/include/asm/tlb.h. If that means you can't use the
generic header, then so be it.

Or maybe you can embed the generic case in some ppc-specific
structures, and use 90% of the generic code just with your added
wrappers for that radix invalidation on top.

But don't make other architectures do pointless work that doesn't
matter - or make sense - for them.

               Linus

^ permalink raw reply

* Re: [RFC PATCH 1/3] Revert "mm: always flush VMA ranges affected by zap_page_range"
From: Nadav Amit @ 2018-06-12 18:52 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: open list:MEMORY MANAGEMENT, linuxppc-dev, linux-arch,
	Aneesh Kumar K . V, Minchan Kim, Mel Gorman, Andrew Morton,
	Linus Torvalds
In-Reply-To: <20180612071621.26775-2-npiggin@gmail.com>

at 12:16 AM, Nicholas Piggin <npiggin@gmail.com> wrote:

> This reverts commit 4647706ebeee6e50f7b9f922b095f4ec94d581c3.
>=20
> Patch 99baac21e4585 ("mm: fix MADV_[FREE|DONTNEED] TLB flush miss
> problem") provides a superset of the TLB flush coverage of this
> commit, and even includes in the changelog "this patch supersedes
> 'mm: Always flush VMA ranges affected by zap_page_range v2'".
>=20
> Reverting this avoids double flushing the TLB range, and the less
> efficient flush_tlb_range() call (the mmu_gather API is more precise
> about what ranges it invalidates).
>=20
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> mm/memory.c | 14 +-------------
> 1 file changed, 1 insertion(+), 13 deletions(-)
>=20
> diff --git a/mm/memory.c b/mm/memory.c
> index 7206a634270b..9d472e00fc2d 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1603,20 +1603,8 @@ void zap_page_range(struct vm_area_struct *vma, =
unsigned long start,
> 	tlb_gather_mmu(&tlb, mm, start, end);
> 	update_hiwater_rss(mm);
> 	mmu_notifier_invalidate_range_start(mm, start, end);
> -	for ( ; vma && vma->vm_start < end; vma =3D vma->vm_next) {
> +	for ( ; vma && vma->vm_start < end; vma =3D vma->vm_next)
> 		unmap_single_vma(&tlb, vma, start, end, NULL);
> -
> -		/*
> -		 * zap_page_range does not specify whether mmap_sem =
should be
> -		 * held for read or write. That allows parallel =
zap_page_range
> -		 * operations to unmap a PTE and defer a flush meaning =
that
> -		 * this call observes pte_none and fails to flush the =
TLB.
> -		 * Rather than adding a complex API, ensure that no =
stale
> -		 * TLB entries exist when this call returns.
> -		 */
> -		flush_tlb_range(vma, start, end);
> -	}
> -
> 	mmu_notifier_invalidate_range_end(mm, start, end);
> 	tlb_finish_mmu(&tlb, start, end);
> }

Yes, this was in my =E2=80=9Cto check when I have time=E2=80=9D todo =
list, especially since
the flush was from start to end, not even vma->vm_start to vma->vm_end.

The revert seems correct.

Reviewed-by: Nadav Amit <namit@vmware.com>

^ permalink raw reply

* Re: [PATCH v2 08/12] macintosh/via-pmu68k: Don't load driver on unsupported hardware
From: Michael Schmitz @ 2018-06-12 20:12 UTC (permalink / raw)
  To: Laurent Vivier
  Cc: Finn Thain, Benjamin Herrenschmidt, Andreas Schwab, linuxppc-dev,
	Linux/m68k, Linux Kernel Development, Geert Uytterhoeven
In-Reply-To: <6c1205cb-7d65-8b30-c4df-59a0ed041313@redhat.com>

Hi,

On Tue, Jun 12, 2018 at 6:53 PM, Laurent Vivier <lvivier@redhat.com> wrote:
> On 12/06/2018 01:47, Finn Thain wrote:
>> On Sun, 10 Jun 2018, Benjamin Herrenschmidt wrote:
> ...
>> I don't know what the bootloader situation is, but it looks messy...
>> http://nubus-pmac.sourceforge.net/#booters
>>
>> Laurent, does Emile work on these machines?
>>
>
> No, Emile doesn't work on pmac-nubus, I tried to implement the switch
> from m68k to ppc, but it has never worked.

On the PowerMac 6100 that I installed Linux on many years ago, I'm
pretty sure I used Apple's mkLinux boot loader.

Not sure how that would work with modern kernels and initrds though.
Haven't got that hardware anymore so no way try.

Cheers,

  Michael

>
> Laurent

^ permalink raw reply

* Re: [RFC PATCH 3/3] powerpc/64s/radix: optimise TLB flush with precise TLB ranges in mmu_gather
From: Nicholas Piggin @ 2018-06-12 22:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-mm, ppc-dev, linux-arch, Aneesh Kumar K. V, Minchan Kim,
	Mel Gorman, Nadav Amit, Andrew Morton
In-Reply-To: <CA+55aFzKBieD0Y3sgFQzt+x5esqb9vT6SEQ28xyCz5UWegfFVg@mail.gmail.com>

On Tue, 12 Jun 2018 11:18:27 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Jun 12, 2018 at 12:16 AM Nicholas Piggin <npiggin@gmail.com> wrot=
e:
> >
> > This brings the number of tlbiel instructions required by a kernel
> > compile from 33M to 25M, most avoided from exec->shift_arg_pages. =20
>=20
> And this shows that "page_start/end" is purely for powerpc and used
> nowhere else.
>=20
> The previous patch should have been to purely powerpc page table
> walking and not touch asm-generic/tlb.h
>=20
> I think you should make those changes to
> arch/powerpc/include/asm/tlb.h. If that means you can't use the
> generic header, then so be it.

I can make it ppc specific if nobody else would use it. But at least
mmu notifiers AFAIKS would rather use a precise range.

> Or maybe you can embed the generic case in some ppc-specific
> structures, and use 90% of the generic code just with your added
> wrappers for that radix invalidation on top.

Would you mind another arch specific ifdefs in there?

>=20
> But don't make other architectures do pointless work that doesn't
> matter - or make sense - for them.

Okay sure, and this is the reason for the wide cc list. Intel does
need it of course, from 4.10.3.1 of the dev manual:

  =E2=80=94 The processor may create a PML4-cache entry even if there are no
    translations for any linear address that might use that entry
    (e.g., because the P flags are 0 in all entries in the referenced
    page-directory-pointer table).

But I'm sure others would not have paging structure caches at all
(some don't even walk the page tables in hardware right?). Maybe
they're all doing their own thing though.

Thanks,
Nick

^ permalink raw reply

* Re: [RFC PATCH 3/3] powerpc/64s/radix: optimise TLB flush with precise TLB ranges in mmu_gather
From: Linus Torvalds @ 2018-06-12 22:42 UTC (permalink / raw)
  To: Nick Piggin
  Cc: linux-mm, ppc-dev, linux-arch, Aneesh Kumar K. V, Minchan Kim,
	Mel Gorman, Nadav Amit, Andrew Morton
In-Reply-To: <20180613083131.139a3c34@roar.ozlabs.ibm.com>

On Tue, Jun 12, 2018 at 3:31 PM Nicholas Piggin <npiggin@gmail.com> wrote:
>
> Okay sure, and this is the reason for the wide cc list. Intel does
> need it of course, from 4.10.3.1 of the dev manual:
>
>   =E2=80=94 The processor may create a PML4-cache entry even if there are=
 no
>     translations for any linear address that might use that entry
>     (e.g., because the P flags are 0 in all entries in the referenced
>     page-directory-pointer table).

But does intel need it?

Because I don't see it. We already do the __tlb_adjust_range(), and we
never tear down the highest-level page tables afaik.

Am I missing something?

               Linus

^ permalink raw reply

* Re: [RFC PATCH 3/3] powerpc/64s/radix: optimise TLB flush with precise TLB ranges in mmu_gather
From: Nicholas Piggin @ 2018-06-12 23:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-mm, ppc-dev, linux-arch, Aneesh Kumar K. V, Minchan Kim,
	Mel Gorman, Nadav Amit, Andrew Morton
In-Reply-To: <CA+55aFyk9VBLUk8VYhfEUR55x0TXY9_QX1dE4wE0A_ias9tMNQ@mail.gmail.com>

On Tue, 12 Jun 2018 15:42:34 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Jun 12, 2018 at 3:31 PM Nicholas Piggin <npiggin@gmail.com> wrote:
> >
> > Okay sure, and this is the reason for the wide cc list. Intel does
> > need it of course, from 4.10.3.1 of the dev manual:
> >
> >   =E2=80=94 The processor may create a PML4-cache entry even if there a=
re no
> >     translations for any linear address that might use that entry
> >     (e.g., because the P flags are 0 in all entries in the referenced
> >     page-directory-pointer table). =20
>=20
> But does intel need it?
>=20
> Because I don't see it. We already do the __tlb_adjust_range(), and we
> never tear down the highest-level page tables afaik.
>=20
> Am I missing something?


Sorry I mean Intel needs the existing behaviour of range flush expanded
to cover page table pages.... right? The manual has similar wording for
lower levels of page tables too. So it does need to send an invalidate
*somewhere* that a freed page table page covers, even if no valid pte
was torn down.

Thanks,
Nick

^ permalink raw reply

* Re: [RFC PATCH 3/3] powerpc/64s/radix: optimise TLB flush with precise TLB ranges in mmu_gather
From: Linus Torvalds @ 2018-06-12 23:26 UTC (permalink / raw)
  To: Nick Piggin
  Cc: linux-mm, ppc-dev, linux-arch, Aneesh Kumar K. V, Minchan Kim,
	Mel Gorman, Nadav Amit, Andrew Morton
In-Reply-To: <20180613090950.50566245@roar.ozlabs.ibm.com>

On Tue, Jun 12, 2018 at 4:09 PM Nicholas Piggin <npiggin@gmail.com> wrote:
>
> Sorry I mean Intel needs the existing behaviour of range flush expanded
> to cover page table pages.... right?

Right.  Intel depends on the current thing, ie if a page table
*itself* is freed, we will will need to do a flush, but it's the exact
same flush as if there had been a regular page there.

That's already handled by (for example) pud_free_tlb() doing the
__tlb_adjust_range().

Again, I may be missing entirely what you're talking about, because it
feels like we're talking across each other.

My argument is that your new patches in (2-3 in the series - patch #1
looks ok) seem to be fundamentally specific to things that have a
*different* tlb invalidation for the directory entries than for the
leaf entries.

But that's not what at least x86 has, and not what the generic code has done.

I think it might be fine to introduce a few new helpers that end up
being no-ops for the traditional cases.

I just don't think it makes sense to maintain a set of range values
that then aren't actually used in the general case.

              Linus

^ permalink raw reply

* Re: [RFC PATCH 3/3] powerpc/64s/radix: optimise TLB flush with precise TLB ranges in mmu_gather
From: Linus Torvalds @ 2018-06-12 23:39 UTC (permalink / raw)
  To: Nick Piggin
  Cc: linux-mm, ppc-dev, linux-arch, Aneesh Kumar K. V, Minchan Kim,
	Mel Gorman, Nadav Amit, Andrew Morton
In-Reply-To: <CA+55aFxd97-29qi-JMxyPPoZMxw=eObQHB5XXGiLj7SNV8B-oQ@mail.gmail.com>

On Tue, Jun 12, 2018 at 4:26 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Right.  Intel depends on the current thing, ie if a page table
> *itself* is freed, we will will need to do a flush, but it's the exact
> same flush as if there had been a regular page there.
>
> That's already handled by (for example) pud_free_tlb() doing the
> __tlb_adjust_range().

Side note: I guess we _could_ make the "page directory" flush be
special on x86 too.

Right now a page directory flush just counts as a range, and then a
range that is more that a few entries just means "flush everything".

End result: in practice, every time you free a page directory, you
flush the whole TLB because it looks identical to flushing a large
range of pages.

And in _theory_, maybe you could have just used "invalpg" with a
targeted address instead. In fact, I think a single invlpg invalidates
_all_ caches for the associated MM, but don't quote me on that.

That said, I don't think this is a common case. But I think that *if*
you extend this to be aware of the page directory caches, and _if_ you
extend it to cover both ppc and x86, at that point all my "this isn't
generic" arguments go away.

Because once x86 does it, it's "common enough" that it counts as
generic. It may be only a single other architecture, but it's the bulk
of all the development machines, so..

                 Linus

^ permalink raw reply

* Re: [RFC PATCH 3/3] powerpc/64s/radix: optimise TLB flush with precise TLB ranges in mmu_gather
From: Nicholas Piggin @ 2018-06-12 23:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-mm, ppc-dev, linux-arch, Aneesh Kumar K. V, Minchan Kim,
	Mel Gorman, Nadav Amit, Andrew Morton
In-Reply-To: <CA+55aFxd97-29qi-JMxyPPoZMxw=eObQHB5XXGiLj7SNV8B-oQ@mail.gmail.com>

On Tue, 12 Jun 2018 16:26:33 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Jun 12, 2018 at 4:09 PM Nicholas Piggin <npiggin@gmail.com> wrote:
> >
> > Sorry I mean Intel needs the existing behaviour of range flush expanded
> > to cover page table pages.... right?  
> 
> Right.  Intel depends on the current thing, ie if a page table
> *itself* is freed, we will will need to do a flush, but it's the exact
> same flush as if there had been a regular page there.
> 
> That's already handled by (for example) pud_free_tlb() doing the
> __tlb_adjust_range().

Agreed.

> 
> Again, I may be missing entirely what you're talking about, because it
> feels like we're talking across each other.
> 
> My argument is that your new patches in (2-3 in the series - patch #1
> looks ok) seem to be fundamentally specific to things that have a
> *different* tlb invalidation for the directory entries than for the
> leaf entries.

Yes I think I confused myself a bit. You're right these patches are
only useful if there is no page structure cache, or if it's managed
separately from TLB invalidation.

> 
> But that's not what at least x86 has, and not what the generic code has done.
> 
> I think it might be fine to introduce a few new helpers that end up
> being no-ops for the traditional cases.
> 
> I just don't think it makes sense to maintain a set of range values
> that then aren't actually used in the general case.

Sure, I'll make it optional. That would probably give a better result
for powerpc too because it doesn't need to maintain two ranges either.

Thanks,
Nick

^ permalink raw reply

* Re: [RFC PATCH 3/3] powerpc/64s/radix: optimise TLB flush with precise TLB ranges in mmu_gather
From: Nicholas Piggin @ 2018-06-13  0:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-mm, ppc-dev, linux-arch, Aneesh Kumar K. V, Minchan Kim,
	Mel Gorman, Nadav Amit, Andrew Morton
In-Reply-To: <CA+55aFzbYBXUDcAGaP_HoCxjTvOgkixc0+7nJqMea0yKjLSnhw@mail.gmail.com>

On Tue, 12 Jun 2018 16:39:55 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Jun 12, 2018 at 4:26 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Right.  Intel depends on the current thing, ie if a page table
> > *itself* is freed, we will will need to do a flush, but it's the exact
> > same flush as if there had been a regular page there.
> >
> > That's already handled by (for example) pud_free_tlb() doing the
> > __tlb_adjust_range().  
> 
> Side note: I guess we _could_ make the "page directory" flush be
> special on x86 too.
> 
> Right now a page directory flush just counts as a range, and then a
> range that is more that a few entries just means "flush everything".
> 
> End result: in practice, every time you free a page directory, you
> flush the whole TLB because it looks identical to flushing a large
> range of pages.
> 
> And in _theory_, maybe you could have just used "invalpg" with a
> targeted address instead. In fact, I think a single invlpg invalidates
> _all_ caches for the associated MM, but don't quote me on that.

Yeah I was thinking that, you could treat it separately (similar to
powerpc maybe) despite using the same instructions to invalidate it.

> That said, I don't think this is a common case. But I think that *if*
> you extend this to be aware of the page directory caches, and _if_ you
> extend it to cover both ppc and x86, at that point all my "this isn't
> generic" arguments go away.
> 
> Because once x86 does it, it's "common enough" that it counts as
> generic. It may be only a single other architecture, but it's the bulk
> of all the development machines, so..

I'll do the small step first (basically just this patch as an opt-in
for architectures that don't need page tables in their tlb range). But
after that it would be interesting to see if x86 could do anything
with explicit page table cache management.

Thanks,
Nick

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox