From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx150.postini.com [74.125.245.150]) by kanga.kvack.org (Postfix) with SMTP id 433C76B0074 for ; Tue, 13 Nov 2012 10:52:40 -0500 (EST) Received: by mail-ee0-f41.google.com with SMTP id d41so1041eek.14 for ; Tue, 13 Nov 2012 07:52:38 -0800 (PST) Date: Tue, 13 Nov 2012 16:52:32 +0100 From: Ingo Molnar Subject: [patch 00/31] Latest numa/core patches, v15 Message-ID: <20121113155232.GA28466@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Sender: owner-linux-mm@kvack.org List-ID: To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Paul Turner , Lee Schermerhorn , Christoph Lameter , Rik van Riel , Mel Gorman , Andrew Morton , Andrea Arcangeli , Linus Torvalds , Peter Zijlstra , Ingo Molnar , Thomas Gleixner Hi, This is the latest iteration of our numa/core patches, which implements adaptive NUMA affinity balancing. Changes in this version: https://lkml.org/lkml/2012/11/12/315 Performance figures: https://lkml.org/lkml/2012/11/12/330 Any review feedback, comments and test results are welcome! For testing purposes I'd suggest using the latest tip:master integration tree, which has the latest numa/core tree merged: git pull git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master (But you can also directly use the tip:numa/core tree as well.) Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx134.postini.com [74.125.245.134]) by kanga.kvack.org (Postfix) with SMTP id B129B6B005A for ; Tue, 13 Nov 2012 12:14:22 -0500 (EST) Received: by mail-ee0-f41.google.com with SMTP id d41so65876eek.14 for ; Tue, 13 Nov 2012 09:14:20 -0800 (PST) From: Ingo Molnar Subject: [PATCH 00/31] Latest numa/core patches, v15 Date: Tue, 13 Nov 2012 18:13:23 +0100 Message-Id: <1352826834-11774-1-git-send-email-mingo@kernel.org> Sender: owner-linux-mm@kvack.org List-ID: To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Paul Turner , Lee Schermerhorn , Christoph Lameter , Rik van Riel , Mel Gorman , Andrew Morton , Andrea Arcangeli , Linus Torvalds , Peter Zijlstra , Thomas Gleixner Hi, This is the latest iteration of our numa/core tree, which implements adaptive NUMA affinity balancing. Changes in this version: https://lkml.org/lkml/2012/11/12/315 Performance figures: https://lkml.org/lkml/2012/11/12/330 Any review feedback, comments and test results are welcome! For testing purposes I'd suggest using the latest tip:master integration tree, which has the latest numa/core tree merged: git pull git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master (But you can also directly use the tip:numa/core tree as well.) 
Thanks,

	Ingo

----------------------->

Andrea Arcangeli (1):
  numa, mm: Support NUMA hinting page faults from gup/gup_fast

Gerald Schaefer (1):
  sched, numa, mm, s390/thp: Implement pmd_pgprot() for s390

Ingo Molnar (3):
  mm/pgprot: Move the pgprot_modify() fallback definition to mm.h
  sched, mm, x86: Add the ARCH_SUPPORTS_NUMA_BALANCING flag
  mm: Allow the migration of shared pages

Lee Schermerhorn (3):
  mm/mpol: Add MPOL_MF_NOOP
  mm/mpol: Check for misplaced page
  mm/mpol: Add MPOL_MF_LAZY

Peter Zijlstra (16):
  sched, numa, mm: Make find_busiest_queue() a method
  sched, numa, mm: Describe the NUMA scheduling problem formally
  mm/thp: Preserve pgprot across huge page split
  mm/mpol: Make MPOL_LOCAL a real policy
  mm/mpol: Create special PROT_NONE infrastructure
  mm/migrate: Introduce migrate_misplaced_page()
  mm/mpol: Use special PROT_NONE to migrate pages
  sched, numa, mm: Introduce sched_feat_numa()
  sched, numa, mm: Implement THP migration
  sched, numa, mm: Add last_cpu to page flags
  sched, numa, mm, arch: Add variable locality exception
  sched, numa, mm: Add the scanning page fault machinery
  sched, numa, mm: Add adaptive NUMA affinity support
  sched, numa, mm: Implement constant, per task Working Set Sampling (WSS) rate
  sched, numa, mm: Count WS scanning against present PTEs, not virtual memory ranges
  sched, numa, mm: Implement slow start for working set sampling

Ralf Baechle (1):
  sched, numa, mm, MIPS/thp: Add pmd_pgprot() implementation

Rik van Riel (6):
  mm/generic: Only flush the local TLB in ptep_set_access_flags()
  x86/mm: Only do a local tlb flush in ptep_set_access_flags()
  x86/mm: Introduce pte_accessible()
  mm: Only flush the TLB when clearing an accessible pte
  x86/mm: Completely drop the TLB flush from ptep_set_access_flags()
  sched, numa, mm: Add credits for NUMA placement

---
 CREDITS                                  |    1 +
 Documentation/scheduler/numa-problem.txt |  236 +++++++++++
 arch/mips/include/asm/pgtable.h          |    2 +
 arch/s390/include/asm/pgtable.h          |   13 +
 arch/sh/mm/Kconfig                       |    1 +
 arch/x86/Kconfig                         |    1 +
 arch/x86/include/asm/pgtable.h           |    7 +
 arch/x86/mm/pgtable.c                    |    8 +-
 include/asm-generic/pgtable.h            |    4 +
 include/linux/huge_mm.h                  |   19 +
 include/linux/hugetlb.h                  |    8 +-
 include/linux/init_task.h                |    8 +
 include/linux/mempolicy.h                |    8 +
 include/linux/migrate.h                  |    7 +
 include/linux/migrate_mode.h             |    3 +
 include/linux/mm.h                       |  122 ++++--
 include/linux/mm_types.h                 |   10 +
 include/linux/mmzone.h                   |   14 +-
 include/linux/page-flags-layout.h        |   83 ++++
 include/linux/sched.h                    |   46 ++-
 include/uapi/linux/mempolicy.h           |   16 +-
 init/Kconfig                             |   23 ++
 kernel/bounds.c                          |    2 +
 kernel/sched/core.c                      |   68 +++-
 kernel/sched/fair.c                      | 1032 ++++++++++++++++++++++++++++++++++++++++---------
 kernel/sched/features.h                  |    8 +
 kernel/sched/sched.h                     |   38 +-
 kernel/sysctl.c                          |   45 ++-
 mm/huge_memory.c                         |  253 +++++++++---
 mm/hugetlb.c                             |   10 +-
 mm/memory.c                              |  129 ++++++-
 mm/mempolicy.c                           |  206 ++++++++--
 mm/migrate.c                             |   81 +++-
 mm/mprotect.c                            |   64 ++-
 mm/pgtable-generic.c                     |    9 +-
 35 files changed, 2200 insertions(+), 385 deletions(-)
 create mode 100644 Documentation/scheduler/numa-problem.txt
 create mode 100644 include/linux/page-flags-layout.h

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx129.postini.com [74.125.245.129]) by kanga.kvack.org (Postfix) with SMTP id 8A9286B00BE for ; Tue, 13 Nov 2012 12:54:34 -0500 (EST) Date: Tue, 13 Nov 2012 17:54:28 +0000 From: Mel Gorman Subject: Re: [PATCH 00/31] Latest numa/core patches, v15 Message-ID: <20121113175428.GF8218@suse.de> References: <1352826834-11774-1-git-send-email-mingo@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <1352826834-11774-1-git-send-email-mingo@kernel.org> Sender: owner-linux-mm@kvack.org List-ID: To: Ingo Molnar Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Paul Turner , Lee Schermerhorn , Christoph Lameter , Rik van Riel , Andrew Morton , Andrea Arcangeli , Linus Torvalds , Peter Zijlstra , Thomas Gleixner

On Tue, Nov 13, 2012 at 06:13:23PM +0100, Ingo Molnar wrote:
> Hi,
>
> This is the latest iteration of our numa/core tree, which
> implements adaptive NUMA affinity balancing.
>
> Changes in this version:
>
> https://lkml.org/lkml/2012/11/12/315
>
> Performance figures:
>
> https://lkml.org/lkml/2012/11/12/330
>
> Any review feedback, comments and test results are welcome!
>

For the purposes of review and testing, this is going to be hard to pick apart and compare. It doesn't apply against 3.7-rc5 and when trying to resolve the conflicts it quickly becomes obvious that the series depends on other scheduler patches such as

sched: Add an rq migration call-back to sched_class
sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking

This is not a full list, it was just the first I hit. What are the other scheduler patches you depend on?
Knowing that will probably help pick apart some of the massive patches like "sched, numa, mm: Add adaptive NUMA affinity support" which is a massive monolithic patch I have not even attempted to read yet but the diffstat for it alone says a lot. 7 files changed, 901 insertions(+), 197 deletions(-) -- Mel Gorman SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx196.postini.com [74.125.245.196]) by kanga.kvack.org (Postfix) with SMTP id A03226B004D for ; Wed, 14 Nov 2012 02:52:28 -0500 (EST) Received: by mail-ee0-f41.google.com with SMTP id d41so99044eek.14 for ; Tue, 13 Nov 2012 23:52:27 -0800 (PST) Date: Wed, 14 Nov 2012 08:52:22 +0100 From: Ingo Molnar Subject: Re: [PATCH 00/31] Latest numa/core patches, v15 Message-ID: <20121114075222.GA3522@gmail.com> References: <1352826834-11774-1-git-send-email-mingo@kernel.org> <20121113175428.GF8218@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20121113175428.GF8218@suse.de> Sender: owner-linux-mm@kvack.org List-ID: To: Mel Gorman Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Paul Turner , Lee Schermerhorn , Christoph Lameter , Rik van Riel , Andrew Morton , Andrea Arcangeli , Linus Torvalds , Peter Zijlstra , Thomas Gleixner * Mel Gorman wrote: > On Tue, Nov 13, 2012 at 06:13:23PM +0100, Ingo Molnar wrote: > > Hi, > > > > This is the latest iteration of our numa/core tree, which > > implements adaptive NUMA affinity balancing. > > > > Changes in this version: > > > > https://lkml.org/lkml/2012/11/12/315 > > > > Performance figures: > > > > https://lkml.org/lkml/2012/11/12/330 > > > > Any review feedback, comments and test results are welcome! 
> > > > For the purposes of review and testing, this is going to be > hard to pick apart and compare. It doesn't apply against > 3.7-rc5 [...] Because the scheduler changes are highly non-trivial it's on top of the scheduler tree: git pull git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core I just tested the patches, they all apply cleanly, with zero fuzz and offsets. Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx206.postini.com [74.125.245.206]) by kanga.kvack.org (Postfix) with SMTP id 5C5946B005D for ; Wed, 14 Nov 2012 06:36:42 -0500 (EST) Date: Wed, 14 Nov 2012 11:36:36 +0000 From: Mel Gorman Subject: Re: [PATCH 00/31] Latest numa/core patches, v15 Message-ID: <20121114113636.GK8218@suse.de> References: <1352826834-11774-1-git-send-email-mingo@kernel.org> <20121113175428.GF8218@suse.de> <20121114075222.GA3522@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <20121114075222.GA3522@gmail.com> Sender: owner-linux-mm@kvack.org List-ID: To: Ingo Molnar Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Paul Turner , Lee Schermerhorn , Christoph Lameter , Rik van Riel , Andrew Morton , Andrea Arcangeli , Linus Torvalds , Peter Zijlstra , Thomas Gleixner On Wed, Nov 14, 2012 at 08:52:22AM +0100, Ingo Molnar wrote: > > * Mel Gorman wrote: > > > On Tue, Nov 13, 2012 at 06:13:23PM +0100, Ingo Molnar wrote: > > > Hi, > > > > > > This is the latest iteration of our numa/core tree, which > > > implements adaptive NUMA affinity balancing. 
> > > > > > Changes in this version: > > > > > > https://lkml.org/lkml/2012/11/12/315 > > > > > > Performance figures: > > > > > > https://lkml.org/lkml/2012/11/12/330 > > > > > > Any review feedback, comments and test results are welcome! > > > > > > > For the purposes of review and testing, this is going to be > > hard to pick apart and compare. It doesn't apply against > > 3.7-rc5 [...] > > Because the scheduler changes are highly non-trivial it's on top > of the scheduler tree: > > git pull git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core > > I just tested the patches, they all apply cleanly, with zero > fuzz and offsets. > The actual numa patches don't apply on top of that but at least the conflicts are obvious to resolve. I'll queue up a test to run overnight but in the meantime, does the current implementation of the NUMA patches *depend* on any of those scheduler patches? Normally I would say it'd be obvious from the series except in this case it just isn't because of the monolithic nature of some of the patches. -- Mel Gorman SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx107.postini.com [74.125.245.107]) by kanga.kvack.org (Postfix) with SMTP id A112E6B004D for ; Wed, 14 Nov 2012 07:03:38 -0500 (EST) Date: Wed, 14 Nov 2012 12:03:32 +0000 From: Mel Gorman Subject: Re: [PATCH 00/31] Latest numa/core patches, v15 Message-ID: <20121114120332.GM8218@suse.de> References: <1352826834-11774-1-git-send-email-mingo@kernel.org> <20121113175428.GF8218@suse.de> <20121114075222.GA3522@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <20121114075222.GA3522@gmail.com> Sender: owner-linux-mm@kvack.org List-ID: To: Ingo Molnar Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Paul Turner , Lee Schermerhorn , Christoph Lameter , Rik van Riel , Andrew Morton , Andrea Arcangeli , Linus Torvalds , Peter Zijlstra , Thomas Gleixner On Wed, Nov 14, 2012 at 08:52:22AM +0100, Ingo Molnar wrote: > > * Mel Gorman wrote: > > > On Tue, Nov 13, 2012 at 06:13:23PM +0100, Ingo Molnar wrote: > > > Hi, > > > > > > This is the latest iteration of our numa/core tree, which > > > implements adaptive NUMA affinity balancing. > > > > > > Changes in this version: > > > > > > https://lkml.org/lkml/2012/11/12/315 > > > > > > Performance figures: > > > > > > https://lkml.org/lkml/2012/11/12/330 > > > > > > Any review feedback, comments and test results are welcome! > > > > > > > For the purposes of review and testing, this is going to be > > hard to pick apart and compare. It doesn't apply against > > 3.7-rc5 [...] > > Because the scheduler changes are highly non-trivial it's on top > of the scheduler tree: > > git pull git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core > > I just tested the patches, they all apply cleanly, with zero > fuzz and offsets. > My apologies about the merge complaint. I used the wrong baseline and the problem was on my side. 
The series does indeed apply cleanly once the scheduler patches are pulled in too. -- Mel Gorman SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx140.postini.com [74.125.245.140]) by kanga.kvack.org (Postfix) with SMTP id 19BAB6B0070 for ; Sat, 17 Nov 2012 03:45:18 -0500 (EST) Received: by mail-ob0-f169.google.com with SMTP id lz20so4352055obb.14 for ; Sat, 17 Nov 2012 00:45:17 -0800 (PST) MIME-Version: 1.0 In-Reply-To: <1352826834-11774-1-git-send-email-mingo@kernel.org> References: <1352826834-11774-1-git-send-email-mingo@kernel.org> Date: Sat, 17 Nov 2012 16:45:17 +0800 Message-ID: Subject: Re: [PATCH 00/31] Latest numa/core patches, v15 From: Alex Shi Content-Type: text/plain; charset=ISO-8859-1 Sender: owner-linux-mm@kvack.org List-ID: To: Ingo Molnar Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Paul Turner , Lee Schermerhorn , Christoph Lameter , Rik van Riel , Mel Gorman , Andrew Morton , Andrea Arcangeli , Linus Torvalds , Peter Zijlstra , Thomas Gleixner

I caught an oops on my 2-socket SNB EP server, but cannot reproduce it. Sending it out as a reminder:

on tip/master, head: a7b7a8ad4476bb641c8455a4e0d7d0fd3eb86f90

Oops: 0000 [#1] SMP
[ 21.967103] Modules linked in: iTCO_wdt iTCO_vendor_support i2c_i801 igb microcode lpc_ich ioatdma i2c_core joydev mfd_core hed dca ipv6 isci libsas scsi_transport_sas
[ 21.967109] CPU 7
[ 21.967109] Pid: 754, comm: systemd-readahe Not tainted 3.7.0-rc5-tip+ #20 Intel Corporation S2600CP/S2600CP
[ 21.967115] RIP: 0010:[] [] __fd_install+0x2d/0x4f
[ 21.967117] RSP: 0018:ffff8808187f7de8 EFLAGS: 00010246
[ 21.967118] RAX: ffff881018bfb700 RBX: ffff88081c2f5d80 RCX: ffff880818dfc620
[ 21.967120] RDX: ffff881019b10000 RSI: 00000000ffffffff RDI: ffff88081c2f5e00
[ 21.967122] RBP: ffff8808187f7e08 R08: ffff88101b37e008 R09: ffffffff811644a6
[ 21.967123] R10: ffff880818005e00 R11: ffff880818005e00 R12: 00000000ffffffff
[ 21.967125] R13: 0000000000000000 R14: 00000000fffffff2 R15: 0000000000000000
[ 21.967128] FS: 00007ffa79ead7e0(0000) GS:ffff88081fce0000(0000) knlGS:0000000000000000
[ 21.967130] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 21.967131] CR2: ffff881819b0fff8 CR3: 000000081be54000 CR4: 00000000000407e0
[ 21.967133] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 21.967135] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 21.967137] Process systemd-readahe (pid: 754, threadinfo ffff8808187f6000, task ffff880818dfc620)
[ 21.967138] Stack:
[ 21.967145] ffff880818005e00 ffff88101b37e000 ffff880818005e00 00007fff57d29378
[ 21.967150] ffff8808187f7e18 ffffffff811498c6 ffff8808187f7ed8 ffffffff81167a7c
[ 21.967155] ffff880818dfc620 ffff880818dfc620 ffff880818004d00 ffff880818005e40
[ 21.967156] Call Trace:
[ 21.967162] [] fd_install+0x25/0x27
[ 21.967168] [] fanotify_read+0x38d/0x475
[ 21.967176] [] ? remove_wait_queue+0x3a/0x3a
[ 21.967181] [] vfs_read+0xa9/0xf0
[ 21.967186] [] ? poll_select_set_timeout+0x63/0x81
[ 21.967189] [] sys_read+0x59/0x7e
[ 21.967195] [] system_call_fastpath+0x16/0x1b
[ 21.967222] Code: 66 66 90 55 48 89 e5 41 55 49 89 d5 41 54 41 89 f4 53 48 89 fb 48 8d bf 80 00 00 00 41 53 e8 69 ce 36 00 48 8b 43 08 48 8b 50 08 <4a> 83 3c e2 00 74 02 0f 0b 48 8b 40 08 4e 89 2c e0 66 83 83 80
[ 21.967226] RIP [] __fd_install+0x2d/0x4f
[ 21.967227] RSP
[ 21.967228] CR2: ffff881819b0fff8

-- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx136.postini.com [74.125.245.136]) by kanga.kvack.org (Postfix) with SMTP id 9316E6B005D for ; Sun, 18 Nov 2012 14:33:19 -0500 (EST) Received: by mail-we0-f169.google.com with SMTP id u3so1988533wey.14 for ; Sun, 18 Nov 2012 11:33:17 -0800 (PST) MIME-Version: 1.0 In-Reply-To: References: <1352826834-11774-1-git-send-email-mingo@kernel.org> From: Linus Torvalds Date: Sun, 18 Nov 2012 09:32:57 -1000 Message-ID: Subject: Re: [PATCH 00/31] Latest numa/core patches, v15 Content-Type: text/plain; charset=ISO-8859-1 Sender: owner-linux-mm@kvack.org List-ID: To: Alex Shi Cc: Ingo Molnar , Linux Kernel Mailing List , linux-mm , Paul Turner , Lee Schermerhorn , Christoph Lameter , Rik van Riel , Mel Gorman , Andrew Morton , Andrea Arcangeli , Peter Zijlstra , Thomas Gleixner

On Fri, Nov 16, 2012 at 10:45 PM, Alex Shi wrote:
> I caught an oops on my 2-socket SNB EP server, but cannot reproduce it.
> Sending it out as a reminder:
> on tip/master, head: a7b7a8ad4476bb641c8455a4e0d7d0fd3eb86f90

This is an independent bug, nothing to do with the NUMA stuff. Fixed in my tree now (commit 3587b1b097d70).

Of course, it's entirely possible that the NUMA patches are subtly buggy and helped trigger the fanotify OVERFLOW event that had this particular bug. But the oops itself is due to a real bug.
Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org
On Wed, Nov 14, 2012 at 08:52:22AM +0100, Ingo Molnar wrote: > > * Mel Gorman wrote: > > > On Tue, Nov 13, 2012 at 06:13:23PM +0100, Ingo Molnar wrote: > > > Hi, > > > > > > This is the latest iteration of our numa/core tree, which > > > implements adaptive NUMA affinity balancing. > > > > > > Changes in this version: > > > > > > https://lkml.org/lkml/2012/11/12/315 > > > > > > Performance figures: > > > > > > https://lkml.org/lkml/2012/11/12/330 > > > > > > Any review feedback, comments and test results are welcome! > > > > > > > For the purposes of review and testing, this is going to be > > hard to pick apart and compare. It doesn't apply against > > 3.7-rc5 [...] > > Because the scheduler changes are highly non-trivial it's on top > of the scheduler tree: > > git pull git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core > > I just tested the patches, they all apply cleanly, with zero > fuzz and offsets. > My apologies about the merge complaint. I used the wrong baseline and the problem was on my side. The series does indeed apply cleanly once the scheduler patches are pulled in too. 
-- Mel Gorman SUSE Labs
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753793Ab2KQIpT (ORCPT ); Sat, 17 Nov 2012 03:45:19 -0500 Received: from mail-ob0-f174.google.com ([209.85.214.174]:59497 "EHLO mail-ob0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753149Ab2KQIpR (ORCPT ); Sat, 17 Nov 2012 03:45:17 -0500 MIME-Version: 1.0 In-Reply-To: <1352826834-11774-1-git-send-email-mingo@kernel.org> References: <1352826834-11774-1-git-send-email-mingo@kernel.org> Date: Sat, 17 Nov 2012 16:45:17 +0800 Message-ID: Subject: Re: [PATCH 00/31] Latest numa/core patches, v15 From: Alex Shi To: Ingo Molnar Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Paul Turner , Lee Schermerhorn , Christoph Lameter , Rik van Riel , Mel Gorman , Andrew Morton , Andrea Arcangeli , Linus Torvalds , Peter Zijlstra , Thomas Gleixner Content-Type: text/plain; charset=ISO-8859-1 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org
I caught an oops on my 2-socket SNB EP server, but cannot reproduce it.
Sending it out as a reminder:
on tip/master, head: a7b7a8ad4476bb641c8455a4e0d7d0fd3eb86f90
Oops: 0000 [#1] SMP
[ 21.967103] Modules linked in: iTCO_wdt iTCO_vendor_support i2c_i801 igb microcode lpc_ich ioatdma i2c_core joydev mfd_core hed dca ipv6 isci libsas scsi_transport_sas
[ 21.967109] CPU 7
[ 21.967109] Pid: 754, comm: systemd-readahe Not tainted 3.7.0-rc5-tip+ #20 Intel Corporation S2600CP/S2600CP
[ 21.967115] RIP: 0010:[] [] __fd_install+0x2d/0x4f
[ 21.967117] RSP: 0018:ffff8808187f7de8 EFLAGS: 00010246
[ 21.967118] RAX: ffff881018bfb700 RBX: ffff88081c2f5d80 RCX: ffff880818dfc620
[ 21.967120] RDX: ffff881019b10000 RSI: 00000000ffffffff RDI: ffff88081c2f5e00
[ 21.967122] RBP: ffff8808187f7e08 R08: ffff88101b37e008 R09: ffffffff811644a6
[ 21.967123] R10: ffff880818005e00 R11: ffff880818005e00 R12: 00000000ffffffff
[ 21.967125] R13: 0000000000000000 R14: 00000000fffffff2 R15: 0000000000000000
[ 21.967128] FS: 00007ffa79ead7e0(0000) GS:ffff88081fce0000(0000) knlGS:0000000000000000
[ 21.967130] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 21.967131] CR2: ffff881819b0fff8 CR3: 000000081be54000 CR4: 00000000000407e0
[ 21.967133] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 21.967135] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 21.967137] Process systemd-readahe (pid: 754, threadinfo ffff8808187f6000, task ffff880818dfc620)
[ 21.967138] Stack:
[ 21.967145] ffff880818005e00 ffff88101b37e000 ffff880818005e00 00007fff57d29378
[ 21.967150] ffff8808187f7e18 ffffffff811498c6 ffff8808187f7ed8 ffffffff81167a7c
[ 21.967155] ffff880818dfc620 ffff880818dfc620 ffff880818004d00 ffff880818005e40
[ 21.967156] Call Trace:
[ 21.967162] [] fd_install+0x25/0x27
[ 21.967168] [] fanotify_read+0x38d/0x475
[ 21.967176] [] ? remove_wait_queue+0x3a/0x3a
[ 21.967181] [] vfs_read+0xa9/0xf0
[ 21.967186] [] ? poll_select_set_timeout+0x63/0x81
[ 21.967189] [] sys_read+0x59/0x7e
[ 21.967195] [] system_call_fastpath+0x16/0x1b
[ 21.967222] Code: 66 66 90 55 48 89 e5 41 55 49 89 d5 41 54 41 89 f4 53 48 89 fb 48 8d bf 80 00 00 00 41 53 e8 69 ce 36 00 48 8b 43 08 48 8b 50 08 <4a> 83 3c e2 00 74 02 0f 0b 48 8b 40 08 4e 89 2c e0 66 83 83 80
[ 21.967226] RIP [] __fd_install+0x2d/0x4f
[ 21.967227] RSP
[ 21.967228] CR2: ffff881819b0fff8
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752558Ab2KRTdT (ORCPT ); Sun, 18 Nov 2012 14:33:19 -0500 Received: from mail-we0-f174.google.com ([74.125.82.174]:64358 "EHLO mail-we0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752393Ab2KRTdS (ORCPT ); Sun, 18 Nov 2012 14:33:18 -0500 MIME-Version: 1.0 In-Reply-To: References: <1352826834-11774-1-git-send-email-mingo@kernel.org> From: Linus Torvalds Date: Sun, 18 Nov 2012 09:32:57 -1000 X-Google-Sender-Auth: iMLqWD7C-K3-w8JhoPIDCHedeXQ Message-ID: Subject: Re: [PATCH 00/31] Latest numa/core patches, v15 To: Alex Shi Cc: Ingo Molnar , Linux Kernel Mailing List , linux-mm , Paul Turner , Lee Schermerhorn , Christoph Lameter , Rik van Riel , Mel Gorman , Andrew Morton , Andrea Arcangeli , Peter Zijlstra , Thomas Gleixner Content-Type: text/plain; charset=ISO-8859-1 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org
On Fri, Nov 16, 2012 at 10:45 PM, Alex Shi wrote:
> I caught an oops on my 2-socket SNB EP server, but cannot reproduce it.
> Sending it out as a reminder:
> on tip/master, head: a7b7a8ad4476bb641c8455a4e0d7d0fd3eb86f90
This is an independent bug, nothing to do with the NUMA stuff. Fixed in my tree now (commit 3587b1b097d70).
Of course, it's entirely possible that the NUMA patches are subtly buggy and helped trigger the fanotify OVERFLOW event that had this particular bug. But the oops itself is due to a real bug.
Linus
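A quick cross-check of the register dump in the oops above, offered as a sketch rather than an authoritative analysis. It assumes the trapping instruction (the bytes `<4a> 83 3c e2 00` in the Code: line) decodes as `cmpq $0x0,(%rdx,%r12,8)`, that is, an 8-byte access at RDX + R12*8, and that R12 holds the fd argument of __fd_install(). The helper name `effective_address` is only illustrative. Under those assumptions the computed address equals the reported CR2, which would be consistent with fd being -1 (0xffffffff when used as an unsigned index):

```python
# Register values copied from the oops report in this thread.
RDX = 0xFFFF881019B10000   # assumed to be the base of the fd array
R12 = 0x00000000FFFFFFFF   # assumed to be the fd argument, zero-extended
CR2 = 0xFFFF881819B0FFF8   # faulting address reported by the kernel

MASK64 = (1 << 64) - 1


def effective_address(base: int, index: int, scale: int = 8) -> int:
    """Compute base + index * scale with 64-bit wraparound, as x86-64 does."""
    return (base + index * scale) & MASK64


def as_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


fault = effective_address(RDX, R12)
print(f"computed address: {fault:#018x}")
print(f"reported CR2:     {CR2:#018x}")
print(f"match:            {fault == CR2}")
print(f"R12 as signed fd: {as_signed32(R12)}")
```

If the decoding assumption holds, the match suggests the fault comes from indexing the fd array with an invalid descriptor, which fits the description of the problem as a bug in the fanotify OVERFLOW path rather than in the NUMA patches.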