From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marcin Jabrzyk
Subject: PROBLEM: BUG appearing when trying to allocate interrupt on
 Exynos MCT after CPU hotplug
Date: Thu, 23 Oct 2014 15:51:16 +0200
Message-ID: <544907D4.1020409@samsung.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Received: from mailout4.w1.samsung.com ([210.118.77.14]:19250 "EHLO
 mailout4.w1.samsung.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1752690AbaJWNvU (ORCPT ); Thu, 23 Oct 2014 09:51:20 -0400
Sender: linux-samsung-soc-owner@vger.kernel.org
List-Id: linux-samsung-soc@vger.kernel.org
To: Daniel Lezcano, Kukjin Kim, Thomas Gleixner
Cc: linux-kernel@vger.kernel.org, Bartlomiej Zolnierkiewicz,
 kyungmin.park@samsung.com, linux-arm-kernel@lists.infradead.org,
 linux-samsung-soc@vger.kernel.org

[1.] One line summary of the problem:
"BUG: sleeping function called from invalid context at mm/slub.c:1250"
after CPU hotplug

[2.] Full description of the problem/report:
This was tested on an Exynos 3250 board with
https://lkml.org/lkml/2014/9/24/441 applied.
The board boots to /bin/sh.
After executing:

mount -t sysfs sys /sys && echo 0 > /sys/devices/system/cpu/cpu1/online && echo 1 > /sys/devices/system/cpu/cpu1/online

I'm getting:

[ 7.226405] IRQ258 no longer affine to CPU1
[ 7.226629] CPU1: shutdown
[ 7.230037] CPU1: Software reset
[ 7.231822] CPU1: Booted secondary processor
[ 7.231843] BUG: sleeping function called from invalid context at mm/slub.c:1250
[ 7.231850] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/1
[ 7.231861] Preemption disabled at:[< (null)>] (null)
[ 7.231864]
[ 7.231876] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.17.0-dirty #45
[ 7.231914] [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[ 7.231931] [] (show_stack) from [] (dump_stack+0x70/0xbc)
[ 7.231950] [] (dump_stack) from [] (kmem_cache_alloc+0xe8/0x184)
[ 7.231968] [] (kmem_cache_alloc) from [] (request_threaded_irq+0x64/0x128)
[ 7.231985] [] (request_threaded_irq) from [] (exynos4_local_timer_setup+0xc0/0x13c)
[ 7.232000] [] (exynos4_local_timer_setup) from [] (exynos4_mct_cpu_notify+0x30/0xa8)
[ 7.232016] [] (exynos4_mct_cpu_notify) from [] (notifier_call_chain+0x44/0x84)
[ 7.232034] [] (notifier_call_chain) from [] (__cpu_notify+0x28/0x44)
[ 7.232049] [] (__cpu_notify) from [] (secondary_start_kernel+0xe8/0x138)
[ 7.232062] [] (secondary_start_kernel) from [<400086a4>] (0x400086a4)

The problem is that request_irq() performs an allocation with the
GFP_KERNEL flag in atomic context.
This bug should be easy to observe on any board with a
"samsung,exynos4210-mct" compatible MCT block.

[4.1.] Kernel version (from /proc/version):
3.17.0

[4.2.] Kernel .config file:
exynos_defconfig + DEBUG_ATOMIC_SLEEP and DEBUG_PREEMPT

[7.] A small shell script or example program which triggers the problem
(if possible)
mount -t sysfs sys /sys && echo 0 > /sys/devices/system/cpu/cpu1/online && echo 1 > /sys/devices/system/cpu/cpu1/online

[8.] Environment
/bin/sh

When the SoC has an MCT_INT_SPI interrupt, that interrupt is allocated
after the CPU is hotplugged: secondary_start_kernel() sends the CPU boot
notifications while preemption and interrupts are disabled. The
exynos_mct notification handler tries to set up and allocate an IRQ for
the SPI-type interrupt of the started CPU, and then the BUG appears.

From a look at the code, I think there might be a similar problem in
qcom-timer.

Best regards,
-- 
Marcin Jabrzyk
Samsung R&D Institute Poland
Samsung Electronics
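
To make the reasoning in the report easier to follow, here is a small
userspace model of the failure pattern. It is only a sketch: every
model_* name is hypothetical and none of this is kernel code or the
exynos_mct driver. It assumes just what the report states, namely that
the request_irq() path allocates with GFP_KERNEL and that the CPU boot
notifier runs in atomic context. The second helper shows one possible
way to avoid the warning (requesting the IRQ earlier, in sleepable
context, and only enabling it from the notifier); that is an assumption
for illustration, not a tested fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only: all model_* names are hypothetical, not kernel APIs. */
enum model_gfp { MODEL_GFP_ATOMIC, MODEL_GFP_KERNEL };

static int model_preempt_count;      /* > 0 stands for "atomic context" */
static int model_sleep_bugs;         /* number of times the check fired */
static bool model_irq_requested[4];  /* per-CPU "IRQ already requested" flag */

/* Mimics the DEBUG_ATOMIC_SLEEP check: a sleeping call made while atomic. */
static void model_might_sleep(const char *where)
{
    if (model_preempt_count > 0) {
        model_sleep_bugs++;
        fprintf(stderr,
                "model BUG: sleeping function called from invalid context at %s\n",
                where);
    }
}

/* Stand-in for an allocator: only the GFP_KERNEL-like flag may sleep. */
static void model_alloc(enum model_gfp flags)
{
    if (flags == MODEL_GFP_KERNEL)
        model_might_sleep("model_alloc");
}

/* Stand-in for request_irq(): allocates with the sleeping flag. */
static void model_request_irq(int cpu)
{
    model_alloc(MODEL_GFP_KERNEL);
    model_irq_requested[cpu] = true;
}

/* Stand-in for enabling an already requested IRQ: no allocation involved. */
static void model_enable_irq(int cpu)
{
    assert(model_irq_requested[cpu]);
}

/* Reported pattern: the CPU boot notifier requests the IRQ while atomic. */
static int model_hotplug_request_in_notifier(int cpu)
{
    model_sleep_bugs = 0;
    model_irq_requested[cpu] = false;
    model_preempt_count++;           /* secondary CPU bring-up: atomic */
    model_request_irq(cpu);          /* the check fires here */
    model_preempt_count--;
    return model_sleep_bugs;
}

/* Possible alternative: request while sleepable, only enable while atomic. */
static int model_hotplug_prerequested(int cpu)
{
    model_sleep_bugs = 0;
    model_irq_requested[cpu] = false;
    model_request_irq(cpu);          /* done early, in sleepable context */
    model_preempt_count++;
    model_enable_irq(cpu);           /* nothing here can sleep */
    model_preempt_count--;
    return model_sleep_bugs;
}
```

Called from a small main(), model_hotplug_request_in_notifier(1)
returns 1 and prints one "model BUG" line to stderr, while
model_hotplug_prerequested(1) returns 0. Whether such a split is
practical for the MCT SPI interrupts would need to be checked against
the real driver.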