public inbox for linux-kernel@vger.kernel.org
From: Aaro Koskinen <aaro.koskinen@nokia.com>
To: linux-kernel@vger.kernel.org
Subject: tlb_start_vma() / tlb_end_vma() inefficiency (was Re: [PATCH 1/1] [ARM] Always do the full MM flush when unmapping VMA)
Date: Tue, 03 Mar 2009 20:19:23 +0200
Message-ID: <49AD74AB.9090304@nokia.com>
In-Reply-To: <20090303163212.GB11096@n2100.arm.linux.org.uk>

Hello,

Russell King - ARM Linux wrote:
> On Tue, Mar 03, 2009 at 06:23:55PM +0200, Aaro Koskinen wrote:
>> When unmapping N pages (e.g. shared memory) the amount of TLB flushes
>> done is (N*PAGE_SIZE/ZAP_BLOCK_SIZE)*N although it should be N at
>> maximum. With PREEMPT kernel ZAP_BLOCK_SIZE is 8 pages, so there is a
>> noticeable performance penalty and the system is spending its time in
>> flush_tlb_range().
>>
>> The problem is that tlb_end_vma() is passing always the full VMA
>> range. The subrange that needs to be flushed would be available in
>> tlb_finish_mmu(), but the VMA is not available anymore. So always do
>> the full MM flush.
> 
> NAK.  If we're only unmapping a small VMA, this will result in us knocking
> out all TLB entries.  That's far from desirable.
> 
> The better solution is to probably seek to change tlb_end_vma() so that
> it knows how much work to do, which does need a generic kernel change
> and therefore to be discussed on lkml.

Ok, fair enough, moving this to lkml.

So, there is a problem in the way tlb_start_vma() and tlb_end_vma() are 
currently used: unmap_page_range() can be called multiple times when 
unmapping a VMA, and each time it calls tlb_start_vma()/tlb_end_vma() 
with the full range, instead of the subrange it's actually unmapping.

On ARM, flush_tlb_range() is called from tlb_end_vma(), so every call
unnecessarily flushes the whole VMA range. If I unmap 2048 pages with
PREEMPT enabled (ZAP_BLOCK_SIZE is 8 pages), that's 256 calls each
flushing 2048 pages, i.e. 256*2048 page flushes. You don't even have
to measure to see an application freeze when it's unmapping a large
area. (On some architectures this problem is not visible at all, since
these routines can be NOPs.)
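
To make the arithmetic concrete, here is a small userspace sketch (not
kernel code) that counts page flushes for both behaviours. The function
names and ZAP_BLOCK_PAGES are made up for illustration; the only inputs
taken from above are the 8-page block size and the 2048-page example.

```c
/* Assumption from the text: with PREEMPT, ZAP_BLOCK_SIZE is 8 pages. */
#define ZAP_BLOCK_PAGES 8UL

/* Number of unmap_page_range() calls needed to unmap n_pages in blocks. */
static unsigned long zap_calls(unsigned long n_pages)
{
	return (n_pages + ZAP_BLOCK_PAGES - 1) / ZAP_BLOCK_PAGES;
}

/* Page flushes when every call flushes the full VMA (current behaviour). */
static unsigned long flushes_full_range(unsigned long n_pages)
{
	return zap_calls(n_pages) * n_pages;
}

/* Page flushes when every call flushes only the block it unmapped. */
static unsigned long flushes_subrange(unsigned long n_pages)
{
	return n_pages;
}
```

For 2048 pages, zap_calls() gives 256, flushes_full_range() gives
256*2048 = 524288, and flushes_subrange() gives 2048.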

The question is how to fix this. There is currently no good way to
implement these routines for architectures that do range-specific
TLB flushes. As Russell suggested above, perhaps it would be
reasonable to change the tlb_{start,end}_vma() API so that it also
passes on the range that is/was actually unmapped by unmap_page_range()?
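
A rough userspace model of what such a change could look like, purely
as a sketch: none of these names or signatures exist in the kernel, and
the 4096-byte page size is an assumption. The "end VMA" hook receives
the subrange that was just unmapped; one variant ignores it (today's
behaviour), the other flushes only that subrange.

```c
/* All names below are hypothetical; this only models the call pattern. */
#define MODEL_PAGE_SIZE  4096UL                    /* assumed page size */
#define MODEL_ZAP_BLOCK  (8UL * MODEL_PAGE_SIZE)   /* 8 pages, as with PREEMPT */

struct vma_model {
	unsigned long start, end;
};

static unsigned long pages_flushed;

/* Stand-in for a range flush: just counts the pages covered. */
static void flush_range_model(unsigned long start, unsigned long end)
{
	pages_flushed += (end - start) / MODEL_PAGE_SIZE;
}

/* Today's behaviour: the hook only knows the VMA, so it flushes all of it. */
static void end_vma_full(const struct vma_model *vma,
			 unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	flush_range_model(vma->start, vma->end);
}

/* Proposed behaviour: the hook is told which subrange was unmapped. */
static void end_vma_sub(const struct vma_model *vma,
			unsigned long start, unsigned long end)
{
	(void)vma;
	flush_range_model(start, end);
}

typedef void (*end_vma_fn)(const struct vma_model *,
			   unsigned long, unsigned long);

/* Unmap the VMA block by block, calling the hook once per block. */
static unsigned long unmap_vma_model(const struct vma_model *vma,
				     end_vma_fn end_vma)
{
	unsigned long addr = vma->start;

	pages_flushed = 0;
	while (addr < vma->end) {
		unsigned long next = addr + MODEL_ZAP_BLOCK;

		if (next > vma->end)
			next = vma->end;
		end_vma(vma, addr, next);
		addr = next;
	}
	return pages_flushed;
}
```

With a 2048-page VMA, the full-range hook accounts for 524288 page
flushes and the subrange hook for 2048.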

A.
