Re: [PATCH 2/4] mm: Send one IPI per CPU to TLB flush all entries after unmapping pages

From: Mel Gorman <mgorman@suse.de>
To: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>, Hugh Dickins <hughd@google.com>,
	Minchan Kim <minchan@kernel.org>,
	Dave Hansen <dave.hansen@intel.com>,
	Andi Kleen <andi@firstfloor.org>, H Peter Anvin <hpa@zytor.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/4] mm: Send one IPI per CPU to TLB flush all entries after unmapping pages
Date: Wed, 10 Jun 2015 10:58:26 +0100
Message-ID: <20150610095826.GD26425@suse.de>
In-Reply-To: <20150610082640.GA24483@gmail.com>

On Wed, Jun 10, 2015 at 10:26:40AM +0200, Ingo Molnar wrote:
> 
> * Mel Gorman <mgorman@suse.de> wrote:
> 
> > On a 4-socket machine the results were
> > 
> >                                         4.1.0-rc6          4.1.0-rc6
> >                                     batchdirty-v6      batchunmap-v6
> > Ops lru-file-mmap-read-elapsed   121.27 (  0.00%)   118.79 (  2.05%)
> > 
> >            4.1.0-rc6      4.1.0-rc6
> >         batchdirty-v6 batchunmap-v6
> > User          620.84         608.48
> > System       4245.35        4152.89
> > Elapsed       122.65         120.15
> > 
> > In this case the workload completed faster and there was less CPU overhead
> > but as it's a NUMA machine there are a lot of factors at play. It's easier
> > to quantify on a single socket machine;
> > 
> >                                         4.1.0-rc6          4.1.0-rc6
> >                                     batchdirty-v6      batchunmap-v6
> > Ops lru-file-mmap-read-elapsed    20.35 (  0.00%)    21.52 ( -5.75%)
> > 
> >              4.1.0-rc6        4.1.0-rc6
> >        batchdirty-v6r5  batchunmap-v6r5
> > User             58.02            60.70
> > System           77.57            81.92
> > Elapsed          22.14            23.16
> > 
> > That shows the workload takes 5.75% longer to complete with a similar
> > increase in the system CPU usage.
> 
> Btw., do you have any stddev noise numbers?
> 

                                           4.1.0-rc6          4.1.0-rc6          4.1.0-rc6          4.1.0-rc6
                                             vanilla     flushfull-v6r5    batchdirty-v6r5    batchunmap-v6r5
Ops lru-file-mmap-read-elapsed       25.43 (  0.00%)    20.59 ( 19.03%)    20.35 ( 19.98%)    21.52 ( 15.38%)
Ops lru-file-mmap-read-time_stddv     0.32 (  0.00%)     0.32 ( -1.30%)     0.39 (-23.00%)     0.45 (-40.91%)


flushfull  -- patch 2
batchdirty -- patch 3
batchunmap -- patch 4

So the impact of tracking the PFNs is outside the noise and there is
a definite direct cost to it. This was expected for both the PFN tracking
and the individual flushes.
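
For readers who want to re-derive the percentages, here is a minimal sketch
that recomputes the gains from the elapsed times in the table above. The
helper names are hypothetical, and the "gap in units of stddev" check is only
an illustrative rule of thumb, not the method the benchmark harness uses.

```python
# Elapsed times and standard deviations copied from the table above (seconds).
results = {
    "vanilla":    (25.43, 0.32),
    "flushfull":  (20.59, 0.32),
    "batchdirty": (20.35, 0.39),
    "batchunmap": (21.52, 0.45),
}


def gain_percent(baseline, value):
    """Percentage improvement over baseline; negative means slower."""
    return (baseline - value) / baseline * 100.0


def delta_in_stddevs(a, b):
    """Hypothetical noise check: gap between two means, in units of the
    larger of the two reported standard deviations."""
    (mean_a, sd_a), (mean_b, sd_b) = a, b
    return abs(mean_a - mean_b) / max(sd_a, sd_b)


base = results["vanilla"][0]
for name, (elapsed, _) in results.items():
    print(f"{name:<11} {elapsed:6.2f} ({gain_percent(base, elapsed):6.2f}%)")

# Gap between patch 3 and patch 4, expressed in stddev units.
print(delta_in_stddevs(results["batchdirty"], results["batchunmap"]))
```

With the rounded figures in the table, the gap between batchdirty and
batchunmap comes out at roughly 2.6 times the larger stddev, which is
consistent with calling the difference outside the noise.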

> The batching speedup is brutal enough to not need any noise estimations, it's a 
> clear winner.
> 

Agreed.

> But this PFN tracking patch is more difficult to judge as the numbers are pretty 
> close to each other.
> 

It's definitely measurable, no doubt about it and there never was. The
concerns were always the refill costs due to flushing potentially active
TLB entries unnecessarily. From https://lkml.org/lkml/2014/7/31/825, this
is potentially high where it says that a 512 DTLB refill takes 22,000
cycles which is higher than the individual flushes. However, this is an
estimate and it'll always be a case of "it depends". It's been asserted
that the refill costs are really low so let's just go with that, drop
patch 4 and wait and see who complains.
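
The "it depends" can be made concrete with a rough sketch. It assumes that
"512 DTLB" means a 512-entry DTLB, and it uses a deliberately simple model in
which a full flush costs only the refill of the entries that were actually
live. The per-page flush cost and the live fraction are hypothetical
parameters, not measured values.

```python
# Figures quoted above; reading "512 DTLB" as 512 entries is an assumption.
DTLB_ENTRIES = 512
FULL_REFILL_CYCLES = 22_000


def refill_cycles_per_entry(entries=DTLB_ENTRIES, total=FULL_REFILL_CYCLES):
    """Average refill cost per DTLB entry implied by the quoted estimate."""
    return total / entries


def full_flush_is_cheaper(nr_pages, cycles_per_single_flush, live_fraction=1.0):
    """Hypothetical model: compare nr_pages individual flushes against one
    full flush whose only cost is refilling the live share of the DTLB."""
    individual = nr_pages * cycles_per_single_flush
    refill = FULL_REFILL_CYCLES * live_fraction
    return refill < individual


print(refill_cycles_per_entry())
# Illustrative inputs only: 32 pages at an assumed 100 cycles per flush.
print(full_flush_is_cheaper(32, 100, live_fraction=0.1))
print(full_flush_is_cheaper(32, 100, live_fraction=1.0))
```

Under this model the answer flips depending on how much of the TLB was in
active use, which is exactly why the estimate cannot settle the question on
its own.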

-- 
Mel Gorman
SUSE Labs

