From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from relay.sgi.com (relay3.corp.sgi.com [198.149.34.15])
	by oss.sgi.com (Postfix) with ESMTP id 352DE7F47;
	Tue, 3 Mar 2015 15:34:03 -0600 (CST)
Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25])
	by relay3.corp.sgi.com (Postfix) with ESMTP id A5BF8AC003;
	Tue, 3 Mar 2015 13:33:59 -0800 (PST)
Received: from ipmail06.adl6.internode.on.net (ipmail06.adl6.internode.on.net [150.101.137.145])
	by cuda.sgi.com with ESMTP id 0eioMFgWsBzGfn3s;
	Tue, 03 Mar 2015 13:33:55 -0800 (PST)
Date: Wed, 4 Mar 2015 08:33:53 +1100
From: Dave Chinner
Subject: Re: [regression v4.0-rc1] mm: IPIs from TLB flushes causing significant performance degradation.
Message-ID: <20150303213353.GS4251@dastard>
References: <20150302010413.GP4251@dastard> <20150303014733.GL18360@dastard>
	<20150303052004.GM18360@dastard> <20150303113437.GR4251@dastard>
	<20150303134346.GO3087@suse.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
Content-Transfer-Encoding: 7bit
In-Reply-To: <20150303134346.GO3087@suse.de>
User-Agent: Mutt/1.5.21 (2010-09-15)
List-Id: XFS Filesystem from SGI
Errors-To: xfs-bounces@oss.sgi.com
Sender: xfs-bounces@oss.sgi.com
To: Mel Gorman
Cc: Matt B, Linux Kernel Mailing List, xfs@oss.sgi.com, linux-mm,
	Andrew Morton, Linus Torvalds, Ingo Molnar

On Tue, Mar 03, 2015 at 01:43:46PM +0000, Mel Gorman wrote:
> On Tue, Mar 03, 2015 at 10:34:37PM +1100, Dave Chinner wrote:
> > On Mon, Mar 02, 2015 at 10:56:14PM -0800, Linus Torvalds wrote:
> > > On Mon, Mar 2, 2015 at 9:20 PM, Dave Chinner wrote:
> > > >>
> > > >> But are those migrate-page calls really common enough to make these
> > > >> things happen often enough on the same pages for this all to matter?
> > > >
> > > > It's looking like that's a possibility.
> > >
> > > Hmm. Looking closer, commit 10c1045f28e8 already should have
> > > re-introduced the "pte was already NUMA" case.
> > >
> > > So that's not it either, afaik. Plus your numbers seem to say that
> > > it's really "migrate_pages()" that is done more. So it feels like the
> > > numa balancing isn't working right.
> >
> > So that should show up in the vmstats, right? Oh, and there's a
> > tracepoint in migrate_pages, too. Same 6x10s samples in phase 3:
> >
>
> The stats indicate both more updates and more faults. Can you try this
> please? It's against 4.0-rc1.
>
> ---8<---
> mm: numa: Reduce amount of IPI traffic due to automatic NUMA balancing

Makes no noticeable difference to behaviour or performance. Stats:

	359,857      migrate:mm_migrate_pages ( +- 5.54% )

numa_hit 36026802
numa_miss 14287
numa_foreign 14287
numa_interleave 18408
numa_local 36006052
numa_other 35037
numa_pte_updates 81803359
numa_huge_pte_updates 0
numa_hint_faults 79810798
numa_hint_faults_local 21227730
numa_pages_migrated 32037516
pgmigrate_success 32037516
pgmigrate_fail 0

-Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
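
The raw counters in the message above are easier to compare between kernels as ratios. The following is a minimal sketch, assuming the stats are available as "name value" text in the style of /proc/vmstat. The function names and the ratio labels (local_fault_fraction and so on) are hypothetical, chosen for illustration; they are not kernel counters and were not part of the original thread.

```python
# Counter values copied verbatim from the message above.
VMSTAT_TEXT = """
numa_pte_updates 81803359
numa_hint_faults 79810798
numa_hint_faults_local 21227730
numa_pages_migrated 32037516
pgmigrate_success 32037516
pgmigrate_fail 0
"""


def parse_counters(text):
    """Parse 'name value' lines into a dict, skipping anything else."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def derived_ratios(c):
    """Compute illustrative ratios; the key names are assumptions."""
    faults = c["numa_hint_faults"]
    return {
        # Share of NUMA hinting faults taken on the local node.
        "local_fault_fraction": c["numa_hint_faults_local"] / faults,
        # Pages migrated per hinting fault.
        "migrations_per_fault": c["numa_pages_migrated"] / faults,
        # Hinting faults per PTE marked for NUMA scanning.
        "faults_per_pte_update": faults / c["numa_pte_updates"],
    }


if __name__ == "__main__":
    for name, value in derived_ratios(parse_counters(VMSTAT_TEXT)).items():
        print(f"{name}: {value:.3f}")
```

On the figures above, this gives a local fault fraction of roughly 0.27 and roughly 0.40 migrations per hinting fault. Running the same script on counters from a known-good kernel would make a change in migration rate easy to spot.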