From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754431AbcKQTQL (ORCPT <rfc822;w@1wt.eu>);
        Thu, 17 Nov 2016 14:16:11 -0500
Received: from mail-pg0-f43.google.com ([74.125.83.43]:34903 "EHLO
        mail-pg0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1753455AbcKQTQK (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 17 Nov 2016 14:16:10 -0500
Date: Thu, 17 Nov 2016 11:16:06 -0800
From: Brian Norris <briannorris@chromium.org>
To: Jason Wessel <jason.wessel@windriver.com>
Cc: kgdb-bugreport@lists.sourceforge.net, linux-kernel@vger.kernel.org,
        Steven Rostedt <rostedt@goodmis.org>, Ingo Molnar <mingo@redhat.com>
Subject: [BUG] kgdb/ftrace - sleeping in invalid context
Message-ID: <20161117191605.GA21459@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

I've been poking around KGDB, and I noticed that the KDB 'ftdump'
command (to dump ftrace logs) produces warnings like this:

(gdb) monitor ftdump
Dumping ftrace buffer:
BUG: sleeping function called from invalid context at mm/slab.h:359
in_atomic(): 1, irqs_disabled(): 128, pid: 116, name: irq/65-chromeos
CPU: 4 PID: 116 Comm: irq/65-chromeos Not tainted 4.4.21 #575
Call trace:
[<ffffffc0002087dc>] dump_backtrace+0x0/0x160
[<ffffffc00020895c>] show_stack+0x20/0x28
[<ffffffc0004bdc28>] dump_stack+0x90/0xb0
[<ffffffc0002486d8>] ___might_sleep+0x140/0x14c
[<ffffffc000248764>] __might_sleep+0x80/0x90
[<ffffffc00034ace0>] kmem_cache_alloc_trace+0x5c/0x238
[<ffffffc0002d3ab8>] ring_buffer_read_prepare+0x4c/0xa4
[<ffffffc0002f3214>] kdb_ftdump+0x200/0x3e4
[<ffffffc0002c6244>] kdb_parse+0x548/0x628
[<ffffffc0002c17a8>] gdb_serial_stub+0x89c/0xaac
[<ffffffc0002bf954>] kgdb_cpu_enter+0x1e0/0x5b0
[<ffffffc0002c002c>] kgdb_handle_exception+0x1a0/0x1e4
[<ffffffc000213004>] kgdb_compiled_brk_fn+0x30/0x3c
[<ffffffc000201a18>] brk_handler+0x9c/0xb0
[<ffffffc0002004d4>] do_debug_exception+0x60/0xe0
Exception stack(0xffffffc0ecffb820 to 0xffffffc0ecffb950)
b820: ffffffc0010a6000 0000008000000000 ffffffc0ecffba10 ffffffc0002bf020
b840: 00000000600001c5 0000000000000000 ffffffc0ecffb870 0000000000018160
b860: 0000000000000003 00000000000000c3 ffffffc000be15b9 ffffffc000be7053
b880: 0000000000000005 ffffffc0010a6c48 ffffffc0ecffb930 ffffffc00026f8e8
b8a0: ffffffc000c362ca ffffffc00108c000 ffffffc00026f880 0000000000000001
b8c0: 0000000000000007 ffffffc0ecfb6600 0000000000000001 cb88537fdc8ba615
b8e0: ffffffc0011b6430 0000000000000001 0000000000000000 ffffffc0011b6438
b900: 0000000000000000 0000000000000000 0000000000000006 ffffffc0010cc1c0
b920: ffffffc00043eed8 7f7f7f7fffffffff 681f39616369ff46 7f7f7f7f7f7f7f7f
b940: 0101010101010101 0000000000000008
[<ffffffc0002035d4>] el1_dbg+0x18/0x74
[<ffffffc0002bf0c0>] sysrq_handle_dbg+0x54/0x5c
[<ffffffc00054e5dc>] __handle_sysrq+0xa4/0x15c
[<ffffffc00054e7ec>] sysrq_filter+0x11c/0x348
[<ffffffc00069abb8>] input_to_handler+0x60/0x100
[<ffffffc00069c6d0>] input_pass_values.part.2+0x78/0x144
[<ffffffc00069e408>] input_handle_event+0x280/0x4d4
[<ffffffc00069e6cc>] input_event+0x70/0x8c
[...]

It looks like (almost?) all KDB code gets run in an exception context,
so I don't see how sleeping allocations (such as those in
ring_buffer_read_prepare()) are supposed to be able to work if they
don't immediately find enough memory.

I can't think of a great simple fix for this, other than borrowing the
hack from kdb_private.h:

#define GFP_KDB (in_interrupt() ? GFP_ATOMIC : GFP_KERNEL)

AFAICT, the necessary allocations are all pretty small actually, and so
GFP_ATOMIC may not be a problem, even in the non-KDB ring buffer case.

Thoughts? I figured I'd ask questions before blindly sending patches, as
I'm not very familiar with this code.

Regards,
Brian