* Panic on ppc64 with numa_balancing and !sparsemem_vmemmap
From: Srikar Dronamraju @ 2014-02-19 18:02 UTC (permalink / raw)
To: Aneesh Kumar, riel, mgorman
Cc: Peter Zijlstra, paulus, linuxppc-dev, linux-mm
On a powerpc machine with CONFIG_NUMA_BALANCING=y and CONFIG_SPARSEMEM_VMEMMAP
not enabled, the kernel panics.
This is true of kernel versions from 3.13 up to the latest commit, 960dfc4, which is
3.14-rc3+, i.e. the three recent fixups from Aneesh don't seem to help in this case.
Sometimes it fails during boot itself. Otherwise a kernel compile is enough
to trigger it. I am seeing this on a Power 7 box.
Kernel 3.14.0-rc3-mainline_v313-00168-g960dfc4 on an ppc64
transam2s-lp1 login: qla2xxx [0003:01:00.1]-8038:2: Cable is unplugged...
Unable to handle kernel paging request for data at address 0x00000457
Faulting instruction address: 0xc0000000000d6004
cpu 0x38: Vector: 300 (Data Access) at [c00000171561f700]
pc: c0000000000d6004: .task_numa_fault+0x604/0xa30
lr: c0000000000d62fc: .task_numa_fault+0x8fc/0xa30
sp: c00000171561f980
msr: 8000000000009032
dar: 457
dsisr: 40000000
current = 0xc0000017155d9b00
paca = 0xc00000000ec1e000 softe: 0 irq_happened: 0x00
pid = 16898, comm = gzip
enter ? for help
[c00000171561fa70] c0000000001b0fb0 .do_numa_page+0x1b0/0x2a0
[c00000171561fb20] c0000000001b2788 .handle_mm_fault+0x538/0xca0
[c00000171561fc00] c00000000082f498 .do_page_fault+0x378/0x880
[c00000171561fe30] c000000000009568 handle_page_fault+0x10/0x30
--- Exception: 301 (Data Access) at 00000000100031d8
SP (3fffd45ea2d0) is in userspace
38:mon>
(gdb) list *(task_numa_fault+0x604)
0xc0000000000d6004 is in task_numa_fault (/home/srikar/work/linux.git/include/linux/mm.h:753).
748 return cpupid_to_cpu(cpupid) == (-1 & LAST__CPU_MASK);
749 }
750
751 static inline bool __cpupid_match_pid(pid_t task_pid, int cpupid)
752 {
753 return (task_pid & LAST__PID_MASK) == cpupid_to_pid(cpupid);
754 }
755
756 #define cpupid_match_pid(task, cpupid) __cpupid_match_pid(task->pid, cpupid)
757 #ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
(gdb)
However, this doesn't seem to happen if CONFIG_SPARSEMEM_VMEMMAP=y is set in the config.
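For anyone reading the gdb listing above without the source at hand, here is a minimal user-space sketch of how a cpupid value packs a CPU number and the low bits of a pid, and how the pid comparison on line 753 works. It is modelled on the helpers in include/linux/mm.h but is not the kernel code: the sketch_ names are hypothetical, and the 8-bit CPU width is an assumption (the kernel derives it from NR_CPUS_BITS).

```c
#include <stdbool.h>

/* Assumed field widths, for illustration only. */
#define SKETCH_PID_SHIFT 8
#define SKETCH_PID_MASK  ((1UL << SKETCH_PID_SHIFT) - 1)
#define SKETCH_CPU_SHIFT 8
#define SKETCH_CPU_MASK  ((1UL << SKETCH_CPU_SHIFT) - 1)

/* Pack the CPU into the high field and the low pid bits into the low field. */
static inline int sketch_cpu_pid_to_cpupid(int cpu, int pid)
{
	return (int)(((cpu & SKETCH_CPU_MASK) << SKETCH_PID_SHIFT) |
		     (pid & SKETCH_PID_MASK));
}

/* Extract the CPU field; an all-ones cpupid decodes to SKETCH_CPU_MASK. */
static inline int sketch_cpupid_to_cpu(int cpupid)
{
	return (int)(((unsigned long)cpupid >> SKETCH_PID_SHIFT) & SKETCH_CPU_MASK);
}

/* Extract the truncated pid field. */
static inline int sketch_cpupid_to_pid(int cpupid)
{
	return (int)(cpupid & SKETCH_PID_MASK);
}

/* Only the low SKETCH_PID_SHIFT bits of the task pid are compared. */
static inline bool sketch_cpupid_match_pid(int task_pid, int cpupid)
{
	return (task_pid & SKETCH_PID_MASK) ==
	       (unsigned long)sketch_cpupid_to_pid(cpupid);
}
```

With these assumed widths, a cpupid of -1 (nothing recorded yet) decodes to CPU 255, which need not correspond to any real CPU.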
--
Thanks and Regards
Srikar Dronamraju
* Re: Panic on ppc64 with numa_balancing and !sparsemem_vmemmap
From: Mel Gorman @ 2014-03-03 17:26 UTC (permalink / raw)
To: Srikar Dronamraju
Cc: riel, Peter Zijlstra, linux-mm, paulus, Aneesh Kumar,
linuxppc-dev
On Wed, Feb 19, 2014 at 11:32:00PM +0530, Srikar Dronamraju wrote:
>
> On a powerpc machine with CONFIG_NUMA_BALANCING=y and CONFIG_SPARSEMEM_VMEMMAP
> not enabled, kernel panics.
>
This?
---8<---
sched: numa: Do not group tasks if last cpu is not set
On configurations with vmemmap disabled, the following partial oops is observed
[ 299.268623] CPU: 47 PID: 4366 Comm: numa01 Tainted: G D 3.14.0-rc5-vanilla #4
[ 299.278295] Hardware name: Dell Inc. PowerEdge R810/0TT6JF, BIOS 2.7.4 04/26/2012
[ 299.287452] task: ffff880c670bc110 ti: ffff880c66db6000 task.ti: ffff880c66db6000
[ 299.296642] RIP: 0010:[<ffffffff8109013f>] [<ffffffff8109013f>] task_numa_fault+0x50f/0x8b0
[ 299.306778] RSP: 0000:ffff880c66db7670 EFLAGS: 00010282
[ 299.313769] RAX: 00000000000033ee RBX: ffff880c670bc110 RCX: 0000000000000001
[ 299.322590] RDX: 0000000000000001 RSI: 0000000000000003 RDI: 00000000ffffffff
[ 299.331394] RBP: ffff880c66db76c8 R08: 0000000000000000 R09: 00000000000166b0
[ 299.340203] R10: ffff880c7ffecd80 R11: 0000000000000000 R12: 00000000000001ff
[ 299.348989] R13: 00000000000000ff R14: 00000000ffffffff R15: 0000000000000003
[ 299.357763] FS: 00007f5a60a3f700(0000) GS:ffff88106f2c0000(0000) knlGS:0000000000000000
[ 299.367510] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 299.374913] CR2: 00000000000037da CR3: 0000000868ed4000 CR4: 00000000000007e0
[ 299.383726] Stack:
[ 299.387414] 0000000000000003 0000000000000000 0000000100000003 0000000100000003
[ 299.396564] ffffffff811888f4 ffff880c66db7698 0000000000000003 ffff880c7f9b3ac0
[ 299.405730] ffff880c66ccebd8 00000000ffffffff 0000000000000003 ffff880c66db7718
[ 299.414907] Call Trace:
[ 299.419095] [<ffffffff811888f4>] ? migrate_misplaced_page+0xb4/0x140
[ 299.427301] [<ffffffff8115950c>] do_numa_page+0x18c/0x1f0
[ 299.434554] [<ffffffff8115a6f7>] handle_mm_fault+0x617/0xf70
[ ..........] SNIPPED
The oops occurs in task_numa_group looking up cpu_rq(LAST__CPU_MASK). The
bug exists for all configurations but will manifest differently. On vmemmap
configurations, it looks up garbage and on !vmemmap configurations it
will oops. This patch adds the necessary check and also fixes the type
for LAST__PID_MASK and LAST__CPU_MASK which are currently signed instead
of unsigned integers.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: stable@vger.kernel.org
diff --git a/include/linux/page-flags-layout.h b/include/linux/page-flags-layout.h
index da52366..6f661d9 100644
--- a/include/linux/page-flags-layout.h
+++ b/include/linux/page-flags-layout.h
@@ -63,10 +63,10 @@
#ifdef CONFIG_NUMA_BALANCING
#define LAST__PID_SHIFT 8
-#define LAST__PID_MASK ((1 << LAST__PID_SHIFT)-1)
+#define LAST__PID_MASK ((1UL << LAST__PID_SHIFT)-1)
#define LAST__CPU_SHIFT NR_CPUS_BITS
-#define LAST__CPU_MASK ((1 << LAST__CPU_SHIFT)-1)
+#define LAST__CPU_MASK ((1UL << LAST__CPU_SHIFT)-1)
#define LAST_CPUPID_SHIFT (LAST__PID_SHIFT+LAST__CPU_SHIFT)
#else
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7815709..b44a8b1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1463,6 +1463,9 @@ static void task_numa_group(struct task_struct *p, int cpupid, int flags,
 	int cpu = cpupid_to_cpu(cpupid);
 	int i;
+	if (unlikely(cpu == LAST__CPU_MASK && !cpu_online(cpu)))
+		return;
+
 	if (unlikely(!p->numa_group)) {
 		unsigned int size = sizeof(struct numa_group) +
 				    2*nr_node_ids*sizeof(unsigned long);
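To illustrate the idea behind the check added in the hunk above, here is a rough user-space sketch, not the kernel implementation. A cpupid with no valid last CPU decodes to the all-ones CPU value, so it has to be rejected before being used to index per-CPU data. The sketch_ names, the 64-CPU array and the 8-bit CPU field are all assumptions made for the example, and the explicit range check is an extra that is not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_CPUS   64	/* hypothetical number of possible CPUs */
#define SKETCH_PID_SHIFT 8
#define SKETCH_CPU_SHIFT 8	/* hypothetical CPU field width */
#define SKETCH_CPU_MASK  ((1UL << SKETCH_CPU_SHIFT) - 1)

struct sketch_rq {
	int nr_running;
};

/* Stand-ins for the per-CPU runqueues and the online mask. */
static struct sketch_rq sketch_runqueues[SKETCH_NR_CPUS];
static bool sketch_online[SKETCH_NR_CPUS];

/* A CPU outside the array is never online. */
static bool sketch_cpu_online(unsigned long cpu)
{
	return cpu < SKETCH_NR_CPUS && sketch_online[cpu];
}

static unsigned long sketch_cpupid_to_cpu(int cpupid)
{
	return ((unsigned long)cpupid >> SKETCH_PID_SHIFT) & SKETCH_CPU_MASK;
}

/* Return NULL rather than indexing the array when no usable CPU was recorded. */
static struct sketch_rq *sketch_last_cpu_rq(int cpupid)
{
	unsigned long cpu = sketch_cpupid_to_cpu(cpupid);

	/* Same shape as the check in the patch: all-ones CPU that is not online. */
	if (cpu == SKETCH_CPU_MASK && !sketch_cpu_online(cpu))
		return NULL;

	/* Extra safety net for this sketch only, not in the patch. */
	if (cpu >= SKETCH_NR_CPUS)
		return NULL;

	return &sketch_runqueues[cpu];
}
```

Without the first check, a cpupid of -1 would turn into an index of 255 into a 64-entry array in this sketch, which is the kind of out-of-range lookup the commit message describes for cpu_rq(LAST__CPU_MASK).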
* Re: Panic on ppc64 with numa_balancing and !sparsemem_vmemmap
From: Aneesh Kumar K.V @ 2014-03-03 19:15 UTC (permalink / raw)
To: Mel Gorman, Srikar Dronamraju
Cc: riel, Peter Zijlstra, linux-mm, paulus, linuxppc-dev
Mel Gorman <mgorman@suse.de> writes:
> On Wed, Feb 19, 2014 at 11:32:00PM +0530, Srikar Dronamraju wrote:
>>
>> On a powerpc machine with CONFIG_NUMA_BALANCING=y and CONFIG_SPARSEMEM_VMEMMAP
>> not enabled, kernel panics.
>>
>
> This?
This one fixed that crash on ppc64
http://mid.gmane.org/1393578122-6500-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com
-aneesh
* Re: Panic on ppc64 with numa_balancing and !sparsemem_vmemmap
From: Mel Gorman @ 2014-03-03 20:04 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: riel, Srikar Dronamraju, Peter Zijlstra, linux-mm, paulus,
linuxppc-dev
On Tue, Mar 04, 2014 at 12:45:19AM +0530, Aneesh Kumar K.V wrote:
> Mel Gorman <mgorman@suse.de> writes:
>
> > On Wed, Feb 19, 2014 at 11:32:00PM +0530, Srikar Dronamraju wrote:
> >>
> >> On a powerpc machine with CONFIG_NUMA_BALANCING=y and CONFIG_SPARSEMEM_VMEMMAP
> >> not enabled, kernel panics.
> >>
> >
> > This?
>
> This one fixed that crash on ppc64
>
> http://mid.gmane.org/1393578122-6500-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com
>
Thanks.
--
Mel Gorman
SUSE Labs