All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@suse.de>
To: Dave Chinner <david@fromorbit.com>
Cc: linuxppc-dev@lists.ozlabs.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	xfs@oss.sgi.com, Linux-MM <linux-mm@kvack.org>,
	Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH 0/3] Reduce system overhead of automatic NUMA balancing
Date: Tue, 24 Mar 2015 15:33:06 +0000	[thread overview]
Message-ID: <20150324153306.GG4701@suse.de> (raw)
In-Reply-To: <20150324115141.GS28621@dastard>

On Tue, Mar 24, 2015 at 10:51:41PM +1100, Dave Chinner wrote:
> On Mon, Mar 23, 2015 at 12:24:00PM +0000, Mel Gorman wrote:
> > These are three follow-on patches based on the xfsrepair workload Dave
> > Chinner reported was problematic in 4.0-rc1 due to changes in page table
> > management -- https://lkml.org/lkml/2015/3/1/226.
> > 
> > Much of the problem was reduced by commit 53da3bc2ba9e ("mm: fix up numa
> > read-only thread grouping logic") and commit ba68bc0115eb ("mm: thp:
> > Return the correct value for change_huge_pmd"). It was known that the performance
> > in 3.19 was still better even if is far less safe. This series aims to
> > restore the performance without compromising on safety.
> > 
> > Dave, you already tested patch 1 on its own but it would be nice to test
> > patches 1+2 and 1+2+3 separately just to be certain.
> 
> 			   3.19  4.0-rc4    +p1      +p2      +p3
> mm_migrate_pages	266,750  572,839  558,632  223,706  201,429
> run time		  4m54s    7m50s    7m20s    5m07s    4m31s
> 

Excellent, this is in line with predictions and roughly matches what I
was seeing on bare metal + real NUMA + spinning disk instead of KVM +
fake NUMA + SSD.

Editting slightly;

> numa stats form p1+p2:    numa_pte_updates 46109698
> numa stats form p1+p2+p3: numa_pte_updates 24460492

The big drop in PTE updates matches what I expected -- migration
failures should not lead to increased scan rates which is what patch 3
fixes. I'm also pleased that there was not a drop in performance.

> 
> OK, the summary with all patches applied:
> 
> config                          3.19   4.0-rc1  4.0-rc4  4.0-rc5+
> defaults                       8m08s     9m34s    9m14s    6m57s
> -o ag_stride=-1                4m04s     4m38s    4m11s    4m06s
> -o bhash=101073                6m04s    17m43s    7m35s    6m13s
> -o ag_stride=-1,bhash=101073   4m54s     9m58s    7m50s    4m31s
> 
> So it looks like the patch set fixes the remaining regression and in
> 2 of the four cases actually improves performance....
> 

\o/

Linus, these three patches plus the small fixlet for pmd_mkyoung (to match
pte_mkyoung) is already in Andrew's tree. I'm expecting it'll arrive to
you before 4.0 assuming nothing else goes pear shaped.

> Thanks, Linus and Mel, for tracking this tricky problem down! 
> 

Thanks Dave for persisting with this and collecting the necessary data.
FWIW, I've marked the xfsrepair test case as a "large memory test".
It'll take time before the test machines have historical data for it but
in theory if this regresses again then I should spot it eventually.

-- 
Mel Gorman
SUSE Labs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

WARNING: multiple messages have this Message-ID (diff)
From: Mel Gorman <mgorman@suse.de>
To: Dave Chinner <david@fromorbit.com>
Cc: linuxppc-dev@lists.ozlabs.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	xfs@oss.sgi.com, Linux-MM <linux-mm@kvack.org>,
	Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH 0/3] Reduce system overhead of automatic NUMA balancing
Date: Tue, 24 Mar 2015 15:33:06 +0000	[thread overview]
Message-ID: <20150324153306.GG4701@suse.de> (raw)
In-Reply-To: <20150324115141.GS28621@dastard>

On Tue, Mar 24, 2015 at 10:51:41PM +1100, Dave Chinner wrote:
> On Mon, Mar 23, 2015 at 12:24:00PM +0000, Mel Gorman wrote:
> > These are three follow-on patches based on the xfsrepair workload Dave
> > Chinner reported was problematic in 4.0-rc1 due to changes in page table
> > management -- https://lkml.org/lkml/2015/3/1/226.
> > 
> > Much of the problem was reduced by commit 53da3bc2ba9e ("mm: fix up numa
> > read-only thread grouping logic") and commit ba68bc0115eb ("mm: thp:
> > Return the correct value for change_huge_pmd"). It was known that the performance
> > in 3.19 was still better even if is far less safe. This series aims to
> > restore the performance without compromising on safety.
> > 
> > Dave, you already tested patch 1 on its own but it would be nice to test
> > patches 1+2 and 1+2+3 separately just to be certain.
> 
> 			   3.19  4.0-rc4    +p1      +p2      +p3
> mm_migrate_pages	266,750  572,839  558,632  223,706  201,429
> run time		  4m54s    7m50s    7m20s    5m07s    4m31s
> 

Excellent, this is in line with predictions and roughly matches what I
was seeing on bare metal + real NUMA + spinning disk instead of KVM +
fake NUMA + SSD.

Editting slightly;

> numa stats form p1+p2:    numa_pte_updates 46109698
> numa stats form p1+p2+p3: numa_pte_updates 24460492

The big drop in PTE updates matches what I expected -- migration
failures should not lead to increased scan rates which is what patch 3
fixes. I'm also pleased that there was not a drop in performance.

> 
> OK, the summary with all patches applied:
> 
> config                          3.19   4.0-rc1  4.0-rc4  4.0-rc5+
> defaults                       8m08s     9m34s    9m14s    6m57s
> -o ag_stride=-1                4m04s     4m38s    4m11s    4m06s
> -o bhash=101073                6m04s    17m43s    7m35s    6m13s
> -o ag_stride=-1,bhash=101073   4m54s     9m58s    7m50s    4m31s
> 
> So it looks like the patch set fixes the remaining regression and in
> 2 of the four cases actually improves performance....
> 

\o/

Linus, these three patches plus the small fixlet for pmd_mkyoung (to match
pte_mkyoung) is already in Andrew's tree. I'm expecting it'll arrive to
you before 4.0 assuming nothing else goes pear shaped.

> Thanks, Linus and Mel, for tracking this tricky problem down! 
> 

Thanks Dave for persisting with this and collecting the necessary data.
FWIW, I've marked the xfsrepair test case as a "large memory test".
It'll take time before the test machines have historical data for it but
in theory if this regresses again then I should spot it eventually.

-- 
Mel Gorman
SUSE Labs

WARNING: multiple messages have this Message-ID (diff)
From: Mel Gorman <mgorman@suse.de>
To: Dave Chinner <david@fromorbit.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	xfs@oss.sgi.com, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 0/3] Reduce system overhead of automatic NUMA balancing
Date: Tue, 24 Mar 2015 15:33:06 +0000	[thread overview]
Message-ID: <20150324153306.GG4701@suse.de> (raw)
In-Reply-To: <20150324115141.GS28621@dastard>

On Tue, Mar 24, 2015 at 10:51:41PM +1100, Dave Chinner wrote:
> On Mon, Mar 23, 2015 at 12:24:00PM +0000, Mel Gorman wrote:
> > These are three follow-on patches based on the xfsrepair workload Dave
> > Chinner reported was problematic in 4.0-rc1 due to changes in page table
> > management -- https://lkml.org/lkml/2015/3/1/226.
> > 
> > Much of the problem was reduced by commit 53da3bc2ba9e ("mm: fix up numa
> > read-only thread grouping logic") and commit ba68bc0115eb ("mm: thp:
> > Return the correct value for change_huge_pmd"). It was known that the performance
> > in 3.19 was still better even if is far less safe. This series aims to
> > restore the performance without compromising on safety.
> > 
> > Dave, you already tested patch 1 on its own but it would be nice to test
> > patches 1+2 and 1+2+3 separately just to be certain.
> 
> 			   3.19  4.0-rc4    +p1      +p2      +p3
> mm_migrate_pages	266,750  572,839  558,632  223,706  201,429
> run time		  4m54s    7m50s    7m20s    5m07s    4m31s
> 

Excellent, this is in line with predictions and roughly matches what I
was seeing on bare metal + real NUMA + spinning disk instead of KVM +
fake NUMA + SSD.

Editting slightly;

> numa stats form p1+p2:    numa_pte_updates 46109698
> numa stats form p1+p2+p3: numa_pte_updates 24460492

The big drop in PTE updates matches what I expected -- migration
failures should not lead to increased scan rates which is what patch 3
fixes. I'm also pleased that there was not a drop in performance.

> 
> OK, the summary with all patches applied:
> 
> config                          3.19   4.0-rc1  4.0-rc4  4.0-rc5+
> defaults                       8m08s     9m34s    9m14s    6m57s
> -o ag_stride=-1                4m04s     4m38s    4m11s    4m06s
> -o bhash=101073                6m04s    17m43s    7m35s    6m13s
> -o ag_stride=-1,bhash=101073   4m54s     9m58s    7m50s    4m31s
> 
> So it looks like the patch set fixes the remaining regression and in
> 2 of the four cases actually improves performance....
> 

\o/

Linus, these three patches plus the small fixlet for pmd_mkyoung (to match
pte_mkyoung) is already in Andrew's tree. I'm expecting it'll arrive to
you before 4.0 assuming nothing else goes pear shaped.

> Thanks, Linus and Mel, for tracking this tricky problem down! 
> 

Thanks Dave for persisting with this and collecting the necessary data.
FWIW, I've marked the xfsrepair test case as a "large memory test".
It'll take time before the test machines have historical data for it but
in theory if this regresses again then I should spot it eventually.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Mel Gorman <mgorman@suse.de>
To: Dave Chinner <david@fromorbit.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	xfs@oss.sgi.com, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 0/3] Reduce system overhead of automatic NUMA balancing
Date: Tue, 24 Mar 2015 15:33:06 +0000	[thread overview]
Message-ID: <20150324153306.GG4701@suse.de> (raw)
In-Reply-To: <20150324115141.GS28621@dastard>

On Tue, Mar 24, 2015 at 10:51:41PM +1100, Dave Chinner wrote:
> On Mon, Mar 23, 2015 at 12:24:00PM +0000, Mel Gorman wrote:
> > These are three follow-on patches based on the xfsrepair workload Dave
> > Chinner reported was problematic in 4.0-rc1 due to changes in page table
> > management -- https://lkml.org/lkml/2015/3/1/226.
> > 
> > Much of the problem was reduced by commit 53da3bc2ba9e ("mm: fix up numa
> > read-only thread grouping logic") and commit ba68bc0115eb ("mm: thp:
> > Return the correct value for change_huge_pmd"). It was known that the performance
> > in 3.19 was still better even if is far less safe. This series aims to
> > restore the performance without compromising on safety.
> > 
> > Dave, you already tested patch 1 on its own but it would be nice to test
> > patches 1+2 and 1+2+3 separately just to be certain.
> 
> 			   3.19  4.0-rc4    +p1      +p2      +p3
> mm_migrate_pages	266,750  572,839  558,632  223,706  201,429
> run time		  4m54s    7m50s    7m20s    5m07s    4m31s
> 

Excellent, this is in line with predictions and roughly matches what I
was seeing on bare metal + real NUMA + spinning disk instead of KVM +
fake NUMA + SSD.

Editting slightly;

> numa stats form p1+p2:    numa_pte_updates 46109698
> numa stats form p1+p2+p3: numa_pte_updates 24460492

The big drop in PTE updates matches what I expected -- migration
failures should not lead to increased scan rates which is what patch 3
fixes. I'm also pleased that there was not a drop in performance.

> 
> OK, the summary with all patches applied:
> 
> config                          3.19   4.0-rc1  4.0-rc4  4.0-rc5+
> defaults                       8m08s     9m34s    9m14s    6m57s
> -o ag_stride=-1                4m04s     4m38s    4m11s    4m06s
> -o bhash=101073                6m04s    17m43s    7m35s    6m13s
> -o ag_stride=-1,bhash=101073   4m54s     9m58s    7m50s    4m31s
> 
> So it looks like the patch set fixes the remaining regression and in
> 2 of the four cases actually improves performance....
> 

\o/

Linus, these three patches plus the small fixlet for pmd_mkyoung (to match
pte_mkyoung) is already in Andrew's tree. I'm expecting it'll arrive to
you before 4.0 assuming nothing else goes pear shaped.

> Thanks, Linus and Mel, for tracking this tricky problem down! 
> 

Thanks Dave for persisting with this and collecting the necessary data.
FWIW, I've marked the xfsrepair test case as a "large memory test".
It'll take time before the test machines have historical data for it but
in theory if this regresses again then I should spot it eventually.

-- 
Mel Gorman
SUSE Labs

  reply	other threads:[~2015-03-24 15:33 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-23 12:24 [PATCH 0/3] Reduce system overhead of automatic NUMA balancing Mel Gorman
2015-03-23 12:24 ` Mel Gorman
2015-03-23 12:24 ` Mel Gorman
2015-03-23 12:24 ` Mel Gorman
2015-03-23 12:24 ` [PATCH 1/3] mm: numa: Group related processes based on VMA flags instead of page table flags Mel Gorman
2015-03-23 12:24   ` Mel Gorman
2015-03-23 12:24   ` Mel Gorman
2015-03-23 12:24   ` Mel Gorman
2015-03-23 12:24 ` [PATCH 2/3] mm: numa: Preserve PTE write permissions across a NUMA hinting fault Mel Gorman
2015-03-23 12:24   ` Mel Gorman
2015-03-23 12:24   ` Mel Gorman
2015-03-23 12:24   ` Mel Gorman
2015-03-23 12:24 ` [PATCH 3/3] mm: numa: Slow PTE scan rate if migration failures occur Mel Gorman
2015-03-23 12:24   ` Mel Gorman
2015-03-23 12:24   ` Mel Gorman
2015-03-23 12:24   ` Mel Gorman
2015-03-24 11:51 ` [PATCH 0/3] Reduce system overhead of automatic NUMA balancing Dave Chinner
2015-03-24 11:51   ` Dave Chinner
2015-03-24 11:51   ` Dave Chinner
2015-03-24 11:51   ` Dave Chinner
2015-03-24 15:33   ` Mel Gorman [this message]
2015-03-24 15:33     ` Mel Gorman
2015-03-24 15:33     ` Mel Gorman
2015-03-24 15:33     ` Mel Gorman
2015-03-24 20:23     ` Linus Torvalds
2015-03-24 20:23       ` Linus Torvalds
2015-03-24 20:23       ` Linus Torvalds
2015-03-24 20:23       ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150324153306.GG4701@suse.de \
    --to=mgorman@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=david@fromorbit.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mingo@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.