Re: [PATCH] mm: Flush the TLB for a single address in a huge page

From: Martin Schwidefsky <schwidefsky@de.ibm.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	Dave Hansen <dave.hansen@intel.com>,
	David Rientjes <rientjes@google.com>,
	linux-mm <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Heiko Carstens <heiko.carstens@de.ibm.com>
Subject: Re: [PATCH] mm: Flush the TLB for a single address in a huge page
Date: Fri, 24 Jul 2015 09:17:49 +0200
Message-ID: <20150724091749.766df0d7@mschwide>
In-Reply-To: <20150723164921.GH27052@e104818-lin.cambridge.arm.com>

On Thu, 23 Jul 2015 17:49:21 +0100
Catalin Marinas <catalin.marinas@arm.com> wrote:

> On Thu, Jul 23, 2015 at 03:13:03PM +0100, Andrea Arcangeli wrote:
> > On Thu, Jul 23, 2015 at 11:49:38AM +0100, Catalin Marinas wrote:
> > > On Thu, Jul 23, 2015 at 12:05:21AM +0100, Dave Hansen wrote:
> > > > On 07/22/2015 03:48 PM, Catalin Marinas wrote:
> > > > > You are right, on x86 the tlb_single_page_flush_ceiling seems to be
> > > > > 33, so for an HPAGE_SIZE range the code does a local_flush_tlb()
> > > > > always. I would say a single page TLB flush is more efficient than a
> > > > > whole TLB flush but I'm not familiar enough with x86.
> > > > 
> > > > The last time I looked, the instruction to invalidate a single page is
> > > > more expensive than the instruction to flush the entire TLB. 
> [...]
> > > Another question is whether flushing a single address is enough for a
> > > huge page. I assumed it is since tlb_remove_pmd_tlb_entry() only adjusts
> [...]
> > > the mmu_gather range by PAGE_SIZE (rather than HPAGE_SIZE) and
> > > no-one complained so far. AFAICT, there are only 3 architectures
> > > that don't use asm-generic/tlb.h but they all seem to handle this
> > > case:
> > 
> > Agreed that archs using the generic tlb.h that sets the tlb->end to
> > address+PAGE_SIZE should be fine with the flush_tlb_page.
> > 
> > > arch/arm: it implements tlb_remove_pmd_tlb_entry() in a similar way to
> > > the generic one
> > > 
> > > arch/s390: tlb_remove_pmd_tlb_entry() is a no-op
> > 
> > I guess s390 is fine too but I'm not convinced that the fact it won't
> > adjust the tlb->start/end is a guarantees that flush_tlb_page is
> > enough when a single 2MB TLB has to be invalidated (not during range
> > zapping).

tlb_remove_pmd_tlb_entry() is a no-op because pmdp_get_and_clear_full()
already did the job. s390 is special in regard to TLB flushing: the
machines require that a pte/pmd be invalidated with a specific
instruction if there is a process that might use the translation path.
In this case the IDTE instruction needs to be used, which sets the
invalid bit in the pmd *and* flushes the TLB at the same time. The code
still tries to be lazy and do batched flushes to improve performance.
All in all quite complicated.
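
As a rough illustration of that choice, here is a toy userspace model.
It is not the arch/s390 code: the struct layout, the other_users
counter and every toy_* name are hypothetical, and the "TLB" is just a
counter. It only shows the shape of the decision: invalidate and flush
in one step when someone else might use the translation path, otherwise
set the invalid bit and remember that a flush is still owed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only, NOT the s390 implementation. All names are hypothetical. */
struct toy_mm {
	int  other_users;  /* others that might use the translation path */
	bool flush_mm;     /* a deferred (batched) flush is pending */
	int  tlb_entries;  /* cached translations for this address space */
};

struct toy_pmd {
	bool invalid;
};

/* Stand-in for IDTE: set the invalid bit and flush the TLB in one step. */
static void toy_idte(struct toy_mm *mm, struct toy_pmd *pmd)
{
	pmd->invalid = true;
	mm->tlb_entries = 0;  /* model: the cached entries are gone */
}

/* Invalidate a pmd: immediately if it may be in use, lazily otherwise. */
static void toy_pmd_invalidate(struct toy_mm *mm, struct toy_pmd *pmd)
{
	assert(mm && pmd);
	if (mm->other_users > 0) {
		toy_idte(mm, pmd);
	} else {
		pmd->invalid = true;   /* plain store, TLB untouched */
		mm->flush_mm = true;   /* flush is owed later, in a batch */
	}
}
```

In this model, a pmd invalidated while other_users is non-zero leaves
no stale entries behind and no pending flush; the lazy path leaves the
entries in place and only sets flush_mm.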

> > For the range zapping, could the arch decide to unconditionally flush
> > the whole TLB without doing the tlb->start/end tracking by overriding
> > tlb_gather_mmu in a way that won't call __tlb_reset_range? There seems
> > to be quite some flexibility in the per-arch tlb_gather_mmu setup in
> > order to unconditionally set tlb->start/end to the total range zapped,
> > without actually narrowing it down during the pagetable walk.
> 
> You are right, looking at the s390 code, tlb_finish_mmu() flushes the
> whole TLB, so the ranges don't seem to matter. I'm cc'ing the s390
> maintainers to confirm whether this patch affects them in any way:
> 
> https://lkml.org/lkml/2015/7/22/521
> 
> IIUC, all the functions touched by this patch are implemented by s390 in
> its specific way, so I don't think it makes any difference:
> 
> pmdp_set_access_flags
> pmdp_clear_flush_young
> pmdp_huge_clear_flush
> pmdp_splitting_flush
> pmdp_invalidate

tlb_finish_mmu may flush all entries for a specific address space, not
the whole TLB. And it does so only for batched operations. If all changes
to the page tables have been done with IPTE/IDTE then flush_mm will not
be set and no full address space flush is done.
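
A minimal sketch of that rule, again as a toy userspace model rather
than the real s390 code. Only the flush_mm name comes from the
description above; the struct, the mm_flushes counter and the toy_*
functions are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only, NOT the s390 code. Names besides flush_mm are hypothetical. */
struct toy_mm_ctx {
	bool flush_mm;    /* set when an invalidation was deferred */
	int  mm_flushes;  /* counts flushes of this one address space */
};

/* Batched path: mark the address space, flush later. */
static void toy_defer_invalidate(struct toy_mm_ctx *ctx)
{
	assert(ctx);
	ctx->flush_mm = true;
}

/* IPTE/IDTE-style path: the TLB is already clean, nothing to record. */
static void toy_immediate_invalidate(struct toy_mm_ctx *ctx)
{
	assert(ctx);
	(void)ctx;
}

/* End of the gather: flush this address space only if something was deferred. */
static void toy_finish_mmu(struct toy_mm_ctx *ctx)
{
	assert(ctx);
	if (ctx->flush_mm) {
		ctx->mm_flushes++;      /* one flush covers the whole batch */
		ctx->flush_mm = false;
	}
}
```

If every change went through toy_immediate_invalidate(), the final
toy_finish_mmu() does nothing; any number of deferred changes results
in exactly one flush of that address space.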

But to answer the question: s390 is fine with the change outlined in
https://lkml.org/lkml/2015/7/22/521

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

