* [PATCH v2 6/9] kgdboc: Add earlycon_kgdboc to support early kgdb using boot consoles
2020-04-21 21:14 [PATCH v2 0/9] kgdb: Support late serial drivers; enable early debug w/ boot consoles Douglas Anderson
@ 2020-04-21 21:14 ` Douglas Anderson
2020-04-27 16:36 ` Daniel Thompson
2020-04-23 13:50 ` [PATCH v2 0/9] kgdb: Support late serial drivers; enable early debug w/ " Greg KH
2020-04-24 8:32 ` Sumit Garg
2 siblings, 1 reply; 7+ messages in thread
From: Douglas Anderson @ 2020-04-21 21:14 UTC (permalink / raw)
To: jason.wessel, daniel.thompson, gregkh
Cc: kgdb-bugreport, mingo, hpa, bp, linux-serial, agross, tglx,
frowand.list, bjorn.andersson, jslaby, catalin.marinas, corbet,
will, Douglas Anderson, Arnd Bergmann, linux-kernel, linux-usb
We want to enable kgdb to debug the early parts of the kernel.
Unfortunately kgdb normally is a client of the tty API in the kernel
and serial drivers don't register to the tty layer until fairly late
in the boot process.
Serial drivers do, however, commonly register a boot console. Let's
enable the kgdboc driver to work with boot consoles to provide early
debugging.
This change co-opts the existing read() function pointer that's part
of "struct console". It's assumed that if a boot console (with the
flag CON_BOOT) has implemented read() that both the read() and write()
function are polling functions. That means they work without
interrupts and read() will return immediately (with 0 bytes read) if
there's nothing to read. This should be a safe assumption since it
appears that no current boot consoles implement read() right now and
there seems no reason to do so unless they wanted to support
"earlycon_kgdboc".
The console API isn't really intended to have clients work with it
like we're doing. Specifically there doesn't appear to be any way for
clients to be notified about a boot console being unregistered. We'll
work around this by checking that our console is still valid before
using it. We'll also try to transition off of the boot console and
onto the "tty" API as quickly as possible.
The normal/expected way to make all this work is to use
"earlycon_kgdboc" and "kgdboc" together. You should point them both
to the same physical serial connection. At boot time, as the system
transitions from the boot console to the normal console, kgdb will
switch over. If you don't use things in the normal/expected way it's
a bit of a buyer-beware situation. Things thought about:
- If you specify only "earlycon_kgdboc" but not "kgdboc" you still
might end up dropping into kgdb upon a crash/sysrq but you may not
be able to type.
- If you use "keep_bootcon" (which is already a bit of a buyer-beware
option) and specify "earlycon_kgdboc" but not "kgdboc" we'll keep
trying to use your boot console for kgdb.
- If your "earlycon_kgdboc" and "kgdboc" devices are not the same
device things should work OK, but it'll be your job to switch over
which device you're monitoring (including figuring out how to switch
over gdb in-flight if you're using it).
When trying to enable "earlycon_kgdboc" it should be noted that the
names that are registered through the boot console layer and the tty
layer are not the same for the same port. For example when debugging
on one board I'd need to pass "earlycon_kgdboc=qcom_geni
kgdboc=ttyMSM0" to enable things properly. Since digging up the boot
console name is a pain and there will rarely be more than one boot
console enabled, you can provide the "earlycon_kgdboc" parameter
without specifying the name of the boot console. In this case we'll
just pick the first boot that implements read() that we find.
This new "earlycon_kgdboc" parameter should be contrasted to the
existing "ekgdboc" parameter. While both provide a way to debug very
early, the usage and mechanisms are quite different. Specifically
"earlycon_kgdboc" is meant to be used in tandem with "kgdboc" and
there is a transition from one to the other. The "ekgdboc" parameter,
on the other hand, replaces the "kgdboc" parameter. It runs the same
logic as the "kgdboc" parameter but just relies on your TTY driver
being present super early. The only known usage of the old "ekgdboc"
parameter is documented as "ekgdboc=kbd earlyprintk=vga". It should
be noted that "kbd" has special treatment allowing it to init early as
a tty device.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This patch touches files in several different subsystems, but it
touches a single line and that line is related to kgdb. I'm assuming
this can all go through the kgdb tree, but if needed I can always
introduce a new API call instead of modifying the old one and then
just have the old API call be a thin wrapper on the new one.
Changes in v2:
- Assumes we have ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb")
- Fix kgdbts, tty/mips_ejtag_fdc, and usb/early/ehci-dbgp
drivers/misc/kgdbts.c | 2 +-
drivers/tty/mips_ejtag_fdc.c | 2 +-
drivers/tty/serial/kgdboc.c | 132 +++++++++++++++++++++++++++++++++-
drivers/usb/early/ehci-dbgp.c | 2 +-
include/linux/kgdb.h | 3 +-
kernel/debug/debug_core.c | 15 +++-
6 files changed, 149 insertions(+), 7 deletions(-)
diff --git a/drivers/misc/kgdbts.c b/drivers/misc/kgdbts.c
index bccd341e9ae1..5c4e4a8771cf 100644
--- a/drivers/misc/kgdbts.c
+++ b/drivers/misc/kgdbts.c
@@ -1077,7 +1077,7 @@ static int configure_kgdbts(void)
final_ack = 0;
run_plant_and_detach_test(1);
- err = kgdb_register_io_module(&kgdbts_io_ops);
+ err = kgdb_register_io_module(&kgdbts_io_ops, false);
if (err) {
configured = 0;
return err;
diff --git a/drivers/tty/mips_ejtag_fdc.c b/drivers/tty/mips_ejtag_fdc.c
index 21e76a2ec182..68817cca39ce 100644
--- a/drivers/tty/mips_ejtag_fdc.c
+++ b/drivers/tty/mips_ejtag_fdc.c
@@ -1265,7 +1265,7 @@ static struct kgdb_io kgdbfdc_io_ops = {
static int __init kgdbfdc_init(void)
{
- kgdb_register_io_module(&kgdbfdc_io_ops);
+ kgdb_register_io_module(&kgdbfdc_io_ops, false);
return 0;
}
early_initcall(kgdbfdc_init);
diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
index 519d8cfbfbed..2f526f2d2bea 100644
--- a/drivers/tty/serial/kgdboc.c
+++ b/drivers/tty/serial/kgdboc.c
@@ -21,6 +21,7 @@
#include <linux/input.h>
#include <linux/module.h>
#include <linux/platform_device.h>
+#include <linux/serial_core.h>
#define MAX_CONFIG_LEN 40
@@ -42,6 +43,14 @@ static int kgdb_tty_line;
static struct platform_device *kgdboc_pdev;
+#ifdef CONFIG_KGDB_SERIAL_CONSOLE
+static struct kgdb_io earlycon_kgdboc_io_ops;
+struct console *earlycon;
+bool earlycon_neutered;
+#else /* ! CONFIG_KGDB_SERIAL_CONSOLE */
+#define earlycon NULL
+#endif /* ! CONFIG_KGDB_SERIAL_CONSOLE */
+
#ifdef CONFIG_KDB_KEYBOARD
static int kgdboc_reset_connect(struct input_handler *handler,
struct input_dev *dev,
@@ -135,8 +144,46 @@ static void kgdboc_unregister_kbd(void)
#define kgdboc_restore_input()
#endif /* ! CONFIG_KDB_KEYBOARD */
+#ifdef CONFIG_KGDB_SERIAL_CONSOLE
+
+static void cleanup_earlycon(bool unregister)
+{
+ if (earlycon && unregister)
+ kgdb_unregister_io_module(&earlycon_kgdboc_io_ops);
+ earlycon = NULL;
+}
+
+static bool is_earlycon_still_valid(void)
+{
+ struct console *con;
+
+ for_each_console(con)
+ if (con == earlycon)
+ return true;
+ return false;
+}
+
+static void cleanup_earlycon_if_invalid(void)
+{
+ console_lock();
+ if (earlycon && (earlycon_neutered || !is_earlycon_still_valid())) {
+ pr_warn("earlycon vanished; unregistering\n");
+ cleanup_earlycon(true);
+ }
+ console_unlock();
+}
+
+#else /* ! CONFIG_KGDB_SERIAL_CONSOLE */
+
+static inline void cleanup_earlycon(bool unregister) { ; }
+static inline void cleanup_earlycon_if_invalid(void) { ; }
+
+#endif /* ! CONFIG_KGDB_SERIAL_CONSOLE */
+
static void cleanup_kgdboc(void)
{
+ cleanup_earlycon(true);
+
if (configured != 1)
return;
@@ -188,9 +235,10 @@ static int configure_kgdboc(void)
kgdb_tty_line = tty_line;
do_register:
- err = kgdb_register_io_module(&kgdboc_io_ops);
+ err = kgdb_register_io_module(&kgdboc_io_ops, earlycon != NULL);
if (err)
goto noconfig;
+ cleanup_earlycon(false);
err = kgdb_register_nmi_console();
if (err)
@@ -206,6 +254,14 @@ static int configure_kgdboc(void)
kgdboc_unregister_kbd();
configured = 0;
+ /*
+ * Each time we run configure_kgdboc() but don't find a console, use
+ * that as a chance to validate that our earlycon didn't vanish on
+ * us. If it vanished we should unregister which will disable kgdb
+ * if we're the last I/O module.
+ */
+ cleanup_earlycon_if_invalid();
+
return err;
}
@@ -409,6 +465,80 @@ static int __init kgdboc_early_init(char *opt)
}
early_param("ekgdboc", kgdboc_early_init);
+
+static int earlycon_kgdboc_get_char(void)
+{
+ char c;
+
+ if (earlycon_neutered || !earlycon->read(earlycon, &c, 1))
+ return NO_POLL_CHAR;
+
+ return c;
+}
+
+static void earlycon_kgdboc_put_char(u8 chr)
+{
+ if (!earlycon_neutered)
+ earlycon->write(earlycon, &chr, 1);
+}
+
+static void earlycon_kgdboc_pre_exp_handler(void)
+{
+ /*
+ * We don't get notified when the boot console is unregistered.
+ * Double-check when we enter the debugger. Unfortunately we
+ * can't really unregister ourselves now, but at least don't crash.
+ */
+ if (earlycon && !earlycon_neutered && !is_earlycon_still_valid()) {
+ pr_warn("Neutering kgdb since boot console vanished\n");
+ earlycon_neutered = true;
+ }
+}
+
+static struct kgdb_io earlycon_kgdboc_io_ops = {
+ .name = "earlycon_kgdboc",
+ .read_char = earlycon_kgdboc_get_char,
+ .write_char = earlycon_kgdboc_put_char,
+ .pre_exception = earlycon_kgdboc_pre_exp_handler,
+ .is_console = true,
+};
+
+static int __init earlycon_kgdboc_init(char *opt)
+{
+ struct console *con;
+
+ kdb_init(KDB_INIT_EARLY);
+
+ /*
+ * Look for a matching console, or if the name was left blank just
+ * pick the first one we find.
+ */
+ console_lock();
+ for_each_console(con) {
+ if (con->write && con->read &&
+ (con->flags & (CON_BOOT | CON_ENABLED)) &&
+ (!opt || !opt[0] || strcmp(con->name, opt) == 0))
+ break;
+ }
+ console_unlock();
+
+ if (!con) {
+ pr_info("Couldn't find kgdb earlycon\n");
+ return 0;
+ }
+
+ earlycon = con;
+ pr_info("Going to register kgdb with earlycon '%s'\n", con->name);
+ if (kgdb_register_io_module(&earlycon_kgdboc_io_ops, false) != 0) {
+ earlycon = NULL;
+ pr_info("Failed to register kgdb with earlycon\n");
+ return 0;
+ }
+
+ return 0;
+}
+
+early_param("earlycon_kgdboc", earlycon_kgdboc_init);
#endif /* CONFIG_KGDB_SERIAL_CONSOLE */
module_init(init_kgdboc);
diff --git a/drivers/usb/early/ehci-dbgp.c b/drivers/usb/early/ehci-dbgp.c
index ea0d531c63e2..bb04c688e094 100644
--- a/drivers/usb/early/ehci-dbgp.c
+++ b/drivers/usb/early/ehci-dbgp.c
@@ -1057,7 +1057,7 @@ static int __init kgdbdbgp_parse_config(char *str)
ptr++;
kgdbdbgp_wait_time = simple_strtoul(ptr, &ptr, 10);
}
- kgdb_register_io_module(&kgdbdbgp_io_ops);
+ kgdb_register_io_module(&kgdbdbgp_io_ops, false);
kgdbdbgp_io_ops.is_console = early_dbgp_console.index != -1;
return 0;
diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
index 7371517aeacc..2e86307f2683 100644
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -323,7 +323,8 @@ static inline int kgdb_unregister_nmi_console(void) { return 0; }
static inline bool kgdb_nmi_poll_knock(void) { return 1; }
#endif
-extern int kgdb_register_io_module(struct kgdb_io *local_kgdb_io_ops);
+extern int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops,
+ bool replace);
extern void kgdb_unregister_io_module(struct kgdb_io *local_kgdb_io_ops);
extern struct kgdb_io *dbg_io_ops;
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 8f178239856d..1b5435c6d92a 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -1074,16 +1074,21 @@ EXPORT_SYMBOL_GPL(kgdb_schedule_breakpoint);
/**
* kgdb_register_io_module - register KGDB IO module
* @new_dbg_io_ops: the io ops vector
+ * @replace: If true it's OK if there were old ops. This is used
+ * to transition from early kgdb to normal kgdb. It's
+ * assumed these are the same device so kgdb can continue.
*
* Register it with the KGDB core.
*/
-int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops)
+int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops, bool replace)
{
+ struct kgdb_io *old_dbg_io_ops;
int err;
spin_lock(&kgdb_registration_lock);
- if (dbg_io_ops) {
+ old_dbg_io_ops = dbg_io_ops;
+ if (dbg_io_ops && !replace) {
spin_unlock(&kgdb_registration_lock);
pr_err("Another I/O driver is already registered with KGDB\n");
@@ -1102,6 +1107,12 @@ int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops)
spin_unlock(&kgdb_registration_lock);
+ if (replace) {
+ pr_info("Replaced I/O driver %s with %s\n",
+ old_dbg_io_ops->name, new_dbg_io_ops->name);
+ return 0;
+ }
+
pr_info("Registered I/O driver %s\n", new_dbg_io_ops->name);
/* Arm KGDB now. */
--
2.26.1.301.g55bc3eb7cb9-goog
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH v2 6/9] kgdboc: Add earlycon_kgdboc to support early kgdb using boot consoles
2020-04-21 21:14 ` [PATCH v2 6/9] kgdboc: Add earlycon_kgdboc to support early kgdb using " Douglas Anderson
@ 2020-04-27 16:36 ` Daniel Thompson
2020-04-28 21:32 ` Doug Anderson
0 siblings, 1 reply; 7+ messages in thread
From: Daniel Thompson @ 2020-04-27 16:36 UTC (permalink / raw)
To: Douglas Anderson
Cc: jason.wessel, gregkh, kgdb-bugreport, mingo, hpa, bp,
linux-serial, agross, tglx, frowand.list, bjorn.andersson, jslaby,
catalin.marinas, corbet, will, Arnd Bergmann, linux-kernel,
linux-usb
On Tue, Apr 21, 2020 at 02:14:44PM -0700, Douglas Anderson wrote:
> We want to enable kgdb to debug the early parts of the kernel.
> Unfortunately kgdb normally is a client of the tty API in the kernel
> and serial drivers don't register to the tty layer until fairly late
> in the boot process.
>
> Serial drivers do, however, commonly register a boot console. Let's
> enable the kgdboc driver to work with boot consoles to provide early
> debugging.
>
> This change co-opts the existing read() function pointer that's part
> of "struct console". It's assumed that if a boot console (with the
> flag CON_BOOT) has implemented read() that both the read() and write()
> function are polling functions. That means they work without
> interrupts and read() will return immediately (with 0 bytes read) if
> there's nothing to read. This should be a safe assumption since it
> appears that no current boot consoles implement read() right now and
> there seems no reason to do so unless they wanted to support
> "earlycon_kgdboc".
>
> The console API isn't really intended to have clients work with it
> like we're doing. Specifically there doesn't appear to be any way for
> clients to be notified about a boot console being unregistered. We'll
> work around this by checking that our console is still valid before
> using it. We'll also try to transition off of the boot console and
> onto the "tty" API as quickly as possible.
>
> The normal/expected way to make all this work is to use
> "earlycon_kgdboc" and "kgdboc" together. You should point them both
> to the same physical serial connection. At boot time, as the system
> transitions from the boot console to the normal console, kgdb will
> switch over. If you don't use things in the normal/expected way it's
> a bit of a buyer-beware situation. Things thought about:
>
> - If you specify only "earlycon_kgdboc" but not "kgdboc" you still
> might end up dropping into kgdb upon a crash/sysrq but you may not
> be able to type.
> - If you use "keep_bootcon" (which is already a bit of a buyer-beware
> option) and specify "earlycon_kgdboc" but not "kgdboc" we'll keep
> trying to use your boot console for kgdb.
> - If your "earlycon_kgdboc" and "kgdboc" devices are not the same
> device things should work OK, but it'll be your job to switch over
> which device you're monitoring (including figuring out how to switch
> over gdb in-flight if you're using it).
>
> When trying to enable "earlycon_kgdboc" it should be noted that the
> names that are registered through the boot console layer and the tty
> layer are not the same for the same port. For example when debugging
> on one board I'd need to pass "earlycon_kgdboc=qcom_geni
> kgdboc=ttyMSM0" to enable things properly. Since digging up the boot
> console name is a pain and there will rarely be more than one boot
> console enabled, you can provide the "earlycon_kgdboc" parameter
> without specifying the name of the boot console. In this case we'll
> just pick the first boot that implements read() that we find.
>
> This new "earlycon_kgdboc" parameter should be contrasted to the
> existing "ekgdboc" parameter. While both provide a way to debug very
> early, the usage and mechanisms are quite different. Specifically
> "earlycon_kgdboc" is meant to be used in tandem with "kgdboc" and
> there is a transition from one to the other. The "ekgdboc" parameter,
> on the other hand, replaces the "kgdboc" parameter. It runs the same
> logic as the "kgdboc" parameter but just relies on your TTY driver
> being present super early. The only known usage of the old "ekgdboc"
> parameter is documented as "ekgdboc=kbd earlyprintk=vga". It should
> be noted that "kbd" has special treatment allowing it to init early as
> a tty device.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
Again, very happy with the overall approach, just a few quibbles.
> ---
> This patch touches files in several different subsystems, but it
> touches a single line and that line is related to kgdb. I'm assuming
> this can all go through the kgdb tree, but if needed I can always
> introduce a new API call instead of modifying the old one and then
> just have the old API call be a thin wrapper on the new one.
Funny you should say that!
I don't really like that extra argument although it is nothing to do
with simplifying merges...
> diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
> index 519d8cfbfbed..2f526f2d2bea 100644
> --- a/drivers/tty/serial/kgdboc.c
> +++ b/drivers/tty/serial/kgdboc.c
> @@ -409,6 +465,80 @@ static int __init kgdboc_early_init(char *opt)
> }
>
> early_param("ekgdboc", kgdboc_early_init);
> +
> +static int earlycon_kgdboc_get_char(void)
> +{
> + char c;
> +
> + if (earlycon_neutered || !earlycon->read(earlycon, &c, 1))
> + return NO_POLL_CHAR;
> +
> + return c;
> +}
> +
> +static void earlycon_kgdboc_put_char(u8 chr)
> +{
> + if (!earlycon_neutered)
> + earlycon->write(earlycon, &chr, 1);
> +}
> +
> +static void earlycon_kgdboc_pre_exp_handler(void)
> +{
> + /*
> + * We don't get notified when the boot console is unregistered.
> + * Double-check when we enter the debugger. Unfortunately we
> + * can't really unregister ourselves now, but at least don't crash.
> + */
> + if (earlycon && !earlycon_neutered && !is_earlycon_still_valid()) {
> + pr_warn("Neutering kgdb since boot console vanished\n");
> + earlycon_neutered = true;
This is, IMHO, too subtle.
I don't think this is merely a warning with a gentle message about
neutering. IIUC the system is (or will shortly be) dead in the water.
After diligently stopping all the CPUs the debug-core will then start
waiting for a character that cannot possibly come!
I think this might be one of those vanishingly rare places where
panicing might actually the right thing to do... although only after
neutering" the kgdb panic handler first ;-).
> + }
> +}
> +
> +static struct kgdb_io earlycon_kgdboc_io_ops = {
> + .name = "earlycon_kgdboc",
> + .read_char = earlycon_kgdboc_get_char,
> + .write_char = earlycon_kgdboc_put_char,
> + .pre_exception = earlycon_kgdboc_pre_exp_handler,
> + .is_console = true,
> +};
> +
> +static int __init earlycon_kgdboc_init(char *opt)
> +{
> + struct console *con;
> +
> + kdb_init(KDB_INIT_EARLY);
This is normally taken care of by debug-core.c . Could this be
integrated into kgdb_register_io_module() ?
> +
> + /*
> + * Look for a matching console, or if the name was left blank just
> + * pick the first one we find.
> + */
> + console_lock();
> + for_each_console(con) {
> + if (con->write && con->read &&
> + (con->flags & (CON_BOOT | CON_ENABLED)) &&
> + (!opt || !opt[0] || strcmp(con->name, opt) == 0))
> + break;
> + }
> + console_unlock();
> +
> + if (!con) {
> + pr_info("Couldn't find kgdb earlycon\n");
> + return 0;
> + }
> +
> + earlycon = con;
> + pr_info("Going to register kgdb with earlycon '%s'\n", con->name);
> + if (kgdb_register_io_module(&earlycon_kgdboc_io_ops, false) != 0) {
> + earlycon = NULL;
> + pr_info("Failed to register kgdb with earlycon\n");
> + return 0;
> + }
> +
> + return 0;
> +}
> +
> +early_param("earlycon_kgdboc", earlycon_kgdboc_init);
> #endif /* CONFIG_KGDB_SERIAL_CONSOLE */
>
> module_init(init_kgdboc);
> diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> index 8f178239856d..1b5435c6d92a 100644
> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -1074,16 +1074,21 @@ EXPORT_SYMBOL_GPL(kgdb_schedule_breakpoint);
> /**
> * kgdb_register_io_module - register KGDB IO module
> * @new_dbg_io_ops: the io ops vector
> + * @replace: If true it's OK if there were old ops. This is used
> + * to transition from early kgdb to normal kgdb. It's
> + * assumed these are the same device so kgdb can continue.
> *
> * Register it with the KGDB core.
> */
> -int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops)
> +int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops, bool replace)
As I said I'm not a big fan of the extra argument. It makes the call
sites harder to read.
Could earlycon_kgdboc be registered with a boolean flag set so that
a subsequent register will automatically replace the old one
(maybe "is_replaceable" or "is_temporary")?
For bonus marks the core could also enforce that a replaceable io ops
table must have init set to null (because there is no deinit).
Daniel.
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH v2 6/9] kgdboc: Add earlycon_kgdboc to support early kgdb using boot consoles
2020-04-27 16:36 ` Daniel Thompson
@ 2020-04-28 21:32 ` Doug Anderson
0 siblings, 0 replies; 7+ messages in thread
From: Doug Anderson @ 2020-04-28 21:32 UTC (permalink / raw)
To: Daniel Thompson
Cc: Jason Wessel, Greg Kroah-Hartman, kgdb-bugreport, Ingo Molnar,
H. Peter Anvin, bp, linux-serial, Andy Gross, Thomas Gleixner,
Frank Rowand, Bjorn Andersson, Jiri Slaby, Catalin Marinas,
Jonathan Corbet, Will Deacon, Arnd Bergmann, LKML, linux-usb
Hi,
On Mon, Apr 27, 2020 at 9:36 AM Daniel Thompson
<daniel.thompson@linaro.org> wrote:
>
> On Tue, Apr 21, 2020 at 02:14:44PM -0700, Douglas Anderson wrote:
> > We want to enable kgdb to debug the early parts of the kernel.
> > Unfortunately kgdb normally is a client of the tty API in the kernel
> > and serial drivers don't register to the tty layer until fairly late
> > in the boot process.
> >
> > Serial drivers do, however, commonly register a boot console. Let's
> > enable the kgdboc driver to work with boot consoles to provide early
> > debugging.
> >
> > This change co-opts the existing read() function pointer that's part
> > of "struct console". It's assumed that if a boot console (with the
> > flag CON_BOOT) has implemented read() that both the read() and write()
> > function are polling functions. That means they work without
> > interrupts and read() will return immediately (with 0 bytes read) if
> > there's nothing to read. This should be a safe assumption since it
> > appears that no current boot consoles implement read() right now and
> > there seems no reason to do so unless they wanted to support
> > "earlycon_kgdboc".
> >
> > The console API isn't really intended to have clients work with it
> > like we're doing. Specifically there doesn't appear to be any way for
> > clients to be notified about a boot console being unregistered. We'll
> > work around this by checking that our console is still valid before
> > using it. We'll also try to transition off of the boot console and
> > onto the "tty" API as quickly as possible.
> >
> > The normal/expected way to make all this work is to use
> > "earlycon_kgdboc" and "kgdboc" together. You should point them both
> > to the same physical serial connection. At boot time, as the system
> > transitions from the boot console to the normal console, kgdb will
> > switch over. If you don't use things in the normal/expected way it's
> > a bit of a buyer-beware situation. Things thought about:
> >
> > - If you specify only "earlycon_kgdboc" but not "kgdboc" you still
> > might end up dropping into kgdb upon a crash/sysrq but you may not
> > be able to type.
> > - If you use "keep_bootcon" (which is already a bit of a buyer-beware
> > option) and specify "earlycon_kgdboc" but not "kgdboc" we'll keep
> > trying to use your boot console for kgdb.
> > - If your "earlycon_kgdboc" and "kgdboc" devices are not the same
> > device things should work OK, but it'll be your job to switch over
> > which device you're monitoring (including figuring out how to switch
> > over gdb in-flight if you're using it).
> >
> > When trying to enable "earlycon_kgdboc" it should be noted that the
> > names that are registered through the boot console layer and the tty
> > layer are not the same for the same port. For example when debugging
> > on one board I'd need to pass "earlycon_kgdboc=qcom_geni
> > kgdboc=ttyMSM0" to enable things properly. Since digging up the boot
> > console name is a pain and there will rarely be more than one boot
> > console enabled, you can provide the "earlycon_kgdboc" parameter
> > without specifying the name of the boot console. In this case we'll
> > just pick the first boot that implements read() that we find.
> >
> > This new "earlycon_kgdboc" parameter should be contrasted to the
> > existing "ekgdboc" parameter. While both provide a way to debug very
> > early, the usage and mechanisms are quite different. Specifically
> > "earlycon_kgdboc" is meant to be used in tandem with "kgdboc" and
> > there is a transition from one to the other. The "ekgdboc" parameter,
> > on the other hand, replaces the "kgdboc" parameter. It runs the same
> > logic as the "kgdboc" parameter but just relies on your TTY driver
> > being present super early. The only known usage of the old "ekgdboc"
> > parameter is documented as "ekgdboc=kbd earlyprintk=vga". It should
> > be noted that "kbd" has special treatment allowing it to init early as
> > a tty device.
> >
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
>
> Again, very happy with the overall approach, just a few quibbles.
>
>
> > ---
> > This patch touches files in several different subsystems, but it
> > touches a single line and that line is related to kgdb. I'm assuming
> > this can all go through the kgdb tree, but if needed I can always
> > introduce a new API call instead of modifying the old one and then
> > just have the old API call be a thin wrapper on the new one.
>
> Funny you should say that!
>
> I don't really like that extra argument although it is nothing to do
> with simplifying merges...
>
>
> > diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
> > index 519d8cfbfbed..2f526f2d2bea 100644
> > --- a/drivers/tty/serial/kgdboc.c
> > +++ b/drivers/tty/serial/kgdboc.c
> > @@ -409,6 +465,80 @@ static int __init kgdboc_early_init(char *opt)
> > }
> >
> > early_param("ekgdboc", kgdboc_early_init);
> > +
> > +static int earlycon_kgdboc_get_char(void)
> > +{
> > + char c;
> > +
> > + if (earlycon_neutered || !earlycon->read(earlycon, &c, 1))
> > + return NO_POLL_CHAR;
> > +
> > + return c;
> > +}
> > +
> > +static void earlycon_kgdboc_put_char(u8 chr)
> > +{
> > + if (!earlycon_neutered)
> > + earlycon->write(earlycon, &chr, 1);
> > +}
> > +
> > +static void earlycon_kgdboc_pre_exp_handler(void)
> > +{
> > + /*
> > + * We don't get notified when the boot console is unregistered.
> > + * Double-check when we enter the debugger. Unfortunately we
> > + * can't really unregister ourselves now, but at least don't crash.
> > + */
> > + if (earlycon && !earlycon_neutered && !is_earlycon_still_valid()) {
> > + pr_warn("Neutering kgdb since boot console vanished\n");
> > + earlycon_neutered = true;
>
> This is, IMHO, too subtle.
>
> I don't think this is merely a warning with a gentle message about
> neutering. IIUC the system is (or will shortly be) dead in the water.
> After diligently stopping all the CPUs the debug-core will then start
> waiting for a character that cannot possibly come!
>
> I think this might be one of those vanishingly rare places where
> panicing might actually the right thing to do... although only after
> neutering" the kgdb panic handler first ;-).
OK. I ended up adding a patch that makes our general re-entry
handling better and then relying on that since there's no other great
way to neuter the kgdb panic handler. Then I just called panic().
NOTE: it's actually quite hard to reproduce this. If you specify
"earlycon_kgdboc" but not "kgdboc" it'll notice at configure_kgdboc()
that the boot console vanished. I could reproduce this by hacking
configure_kgdboc() not to do this, but otherwise it was hard.
...oh, but I did realize that there's a window where the boot console
has vanished and our init function hasn't yet been called. That's a
pretty small window on the systems I tested, probably owing to the
fact that kgdboc itself is listed in the serial drivers and is listed
last, so it'll typically probe right after serial drivers do. ...and
if I hit deferred probing again I should run after the deferred probe
of the serial driver I needed. It's slightly fragile but maybe it'll
do for now. I guess if people start hitting this panic we'll have to
figure out what to do. If we don't want to add hooks in for the
kernel to tell us about this event we could always do something hacky
like poll every millisecond and it's probably work. For now I'll just
document that people should use "keep_bootcon" if they end up in this
situation.
> > + }
> > +}
> > +
> > +static struct kgdb_io earlycon_kgdboc_io_ops = {
> > + .name = "earlycon_kgdboc",
> > + .read_char = earlycon_kgdboc_get_char,
> > + .write_char = earlycon_kgdboc_put_char,
> > + .pre_exception = earlycon_kgdboc_pre_exp_handler,
> > + .is_console = true,
> > +};
> > +
> > +static int __init earlycon_kgdboc_init(char *opt)
> > +{
> > + struct console *con;
> > +
> > + kdb_init(KDB_INIT_EARLY);
>
> This is normally taken care of by debug-core.c . Could this be
> integrated into kgdb_register_io_module() ?
Unfortunately it's not totally trivial. At least one problem that
feels difficult to solve is that kdb_init() (and all its
sub-functions) are marked "__init" but kgdb_register_io_module() isn't
(and can't be).
One possible solution: I could totally remove this call and things
will work fine, but only if you do "kgdbwait" or if you make sure your
code doesn't crash or hit any hardcoded kgdb_breakpoint() until
dbg_late_init() is called. That's not totally ideal. I'm going to
assume it's OK for me to leave the kdb_init() here.
NOTE: I believe that the existing "ekgdboc" has the same issues but
I'm not setup to use "ekgdboc" and so I haven't tested. If you can
reproduce the "ekgdboc" issue that is there (in theory) I can also
post up a patch that'll fix that the same way...
> > +
> > + /*
> > + * Look for a matching console, or if the name was left blank just
> > + * pick the first one we find.
> > + */
> > + console_lock();
> > + for_each_console(con) {
> > + if (con->write && con->read &&
> > + (con->flags & (CON_BOOT | CON_ENABLED)) &&
> > + (!opt || !opt[0] || strcmp(con->name, opt) == 0))
> > + break;
> > + }
> > + console_unlock();
> > +
> > + if (!con) {
> > + pr_info("Couldn't find kgdb earlycon\n");
> > + return 0;
> > + }
> > +
> > + earlycon = con;
> > + pr_info("Going to register kgdb with earlycon '%s'\n", con->name);
> > + if (kgdb_register_io_module(&earlycon_kgdboc_io_ops, false) != 0) {
> > + earlycon = NULL;
> > + pr_info("Failed to register kgdb with earlycon\n");
> > + return 0;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +early_param("earlycon_kgdboc", earlycon_kgdboc_init);
> > #endif /* CONFIG_KGDB_SERIAL_CONSOLE */
> >
> > module_init(init_kgdboc);
> > diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> > index 8f178239856d..1b5435c6d92a 100644
> > --- a/kernel/debug/debug_core.c
> > +++ b/kernel/debug/debug_core.c
> > @@ -1074,16 +1074,21 @@ EXPORT_SYMBOL_GPL(kgdb_schedule_breakpoint);
> > /**
> > * kgdb_register_io_module - register KGDB IO module
> > * @new_dbg_io_ops: the io ops vector
> > + * @replace: If true it's OK if there were old ops. This is used
> > + * to transition from early kgdb to normal kgdb. It's
> > + * assumed these are the same device so kgdb can continue.
> > *
> > * Register it with the KGDB core.
> > */
> > -int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops)
> > +int kgdb_register_io_module(struct kgdb_io *new_dbg_io_ops, bool replace)
>
> As I said I'm not a big fan of the extra argument. It makes the call
> sites harder to read.
>
> Could earlycon_kgdboc be registered with a boolean flag set so that
> a subsequent register will automatically replace the old one
> (maybe "is_replaceable" or "is_temporary")?
>
> For bonus marks the core could also enforce that a replaceable io ops
> table must have init set to null (because there is no deinit).
OK. I ended up adding a "deinit" function call and using that as an
indication that the ops are replaceable. This cleaned up some of the
earlycon_kgdb code and seemed sane.
-Doug
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 0/9] kgdb: Support late serial drivers; enable early debug w/ boot consoles
2020-04-21 21:14 [PATCH v2 0/9] kgdb: Support late serial drivers; enable early debug w/ boot consoles Douglas Anderson
2020-04-21 21:14 ` [PATCH v2 6/9] kgdboc: Add earlycon_kgdboc to support early kgdb using " Douglas Anderson
@ 2020-04-23 13:50 ` Greg KH
2020-04-24 8:32 ` Sumit Garg
2 siblings, 0 replies; 7+ messages in thread
From: Greg KH @ 2020-04-23 13:50 UTC (permalink / raw)
To: Douglas Anderson
Cc: jason.wessel, daniel.thompson, kgdb-bugreport, mingo, hpa, bp,
linux-serial, agross, tglx, frowand.list, bjorn.andersson, jslaby,
catalin.marinas, corbet, will, Alexios Zavras, Allison Randal,
Andrew Morton, Arnd Bergmann, Borislav Petkov, Dave Martin,
Enrico Weigelt, Eric W. Biederman, James Morse, Juergen Gross,
Mark Rutland, Masami Hiramatsu, Matt Mullins,
Mauro Carvalho Chehab, Nadav Amit, Pawan Gupta, Peter Zijlstra,
Rick Edgecombe, jinho lim, linux-arm-kernel, linux-arm-msm,
linux-doc, linux-kernel, linux-usb, x86
On Tue, Apr 21, 2020 at 02:14:38PM -0700, Douglas Anderson wrote:
> This whole pile of patches was motivated by me trying to get kgdb to
> work properly on a platform where my serial driver ended up being hit
> by the -EPROBE_DEFER virus (it wasn't practicing social distancing
> from other drivers). Specifically my serial driver's parent device
> depended on a resource that wasn't available when its probe was first
> called. It returned -EPROBE_DEFER which meant that when "kgdboc"
> tried to run its setup the serial driver wasn't there. Unfortunately
> "kgdboc" never tried again, so that meant that kgdb was disabled until
> I manually enalbed it via sysfs.
>
> While I could try to figure out how to get around the -EPROBE_DEFER
> somehow, the above problems could happen to anyone and -EPROBE_DEFER
> is generally considered something you just have to live with. In any
> case the current "kgdboc" setup is a bit of a race waiting to happen.
> I _think_ I saw during early testing that even adding a msleep() in
> the typical serial driver's probe() is enough to trigger similar
> issues.
>
> I decided that for the above race the best attitude to get kgdb to
> register at boot was probably "if you can't beat 'em, join 'em".
> Thus, "kgdboc" now jumps on the -EPROBE_DEFER bandwagon (now that my
> driver uses it it's no longer a virus). It does so a little awkwardly
> because "kgdboc" hasn't normally had a "struct device" associated with
> it, but it's really not _that_ ugly to make a platform device and
> seems less ugly than alternatives.
>
> Unfortunately now on my system the debugger is one of the last things
> to register at boot. That's OK for debugging problems that show up
> significantly after boot, but isn't so hot for all the boot problems
> that I end up debugging. This motivated me to try to get something
> working a little earlier.
>
> My first attempt was to try to get the existing "ekgdboc" to work
> earlier. I tried that for a bit until I realized that it needed to
> work at the tty layer and I couldn't find any serial drivers that
> managed to register themselves to the tty layer super early at boot.
> The only documented use of "ekgdboc" is "ekgdboc=kbd" and that's a bit
> of a special snowflake. Trying to get my serial driver and all its
> dependencies to probe normally and register the tty driver super early
> at boot seemed like a bad way to go. In fact, all the complexity
> needed to do something like this is why the system already has a
> special concept of a "boot console" that lives only long enough to
> transition to the normal console.
>
> Leveraging the boot console seemed like a good way to go and that's
> what this series does. I found that consoles could have a read()
> function, though I couldn't find anyone who implemented it. I
> implemented it for two serial drivers for the devices I had easy
> access to, making the assumption that for boot consoles that we could
> assume read() and write() were polling-compatible (seems sane I
> think).
>
> Now anyone who makes a small change to their serial driver can easily
> enable early kgdb debugging!
>
> The devices I had for testing were:
> - arm32: rk3288-veyron-jerry
> - arm64: rk3399-gru-kevin
> - arm64: qcom-sc7180-trogdor (not mainline yet)
>
> These are the devices I tested this series on. I tried to test
> various combinations of enabling/disabling various options and I
> hopefully caught the corner cases, but I'd appreciate any extra
> testing people can do. Notably I didn't test on x86, but (I think) I
> didn't touch much there so I shouldn't have broken anything.
>
> When testing I found a few problems with actually dropping into the
> debugger super early on arm and arm64 devices. Patches in this series
> should help with this. For arm I just avoid dropping into the
> debugger until a little later and for arm64 I actually enable
> debugging super early.
>
> I realize that bits of this series might feel a little hacky, though
> I've tried to do things in the cleanest way I could without overly
> interferring with the rest of the kernel. If you hate the way I
> solved a problem I would love it if you could provide guidance on how
> you think I could solve the problem better.
>
> This series (and my comments / documentation / commit messages) are
> now long enough that my eyes glaze over when I try to read it all over
> to double-check. I've nontheless tried to double-check it, but I'm
> pretty sure I did something stupid. Thank you ahead of time for
> pointing it out to me so I can fix it in v3. If somehow I managed to
> not do anything stupid (really?) then thank you for double-checking me
> anyway.
>
> Changes in v2:
> - ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb") new for v2.
> - ("Revert "kgdboc: disable the console lock when in kgdb"") new for v2.
> - Assumes we have ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb")
> - Fix kgdbts, tty/mips_ejtag_fdc, and usb/early/ehci-dbgp
>
> Douglas Anderson (9):
> kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb
> Revert "kgdboc: disable the console lock when in kgdb"
> kgdboc: Use a platform device to handle tty drivers showing up late
> kgdb: Delay "kgdbwait" to dbg_late_init() by default
> arm64: Add call_break_hook() to early_brk64() for early kgdb
> kgdboc: Add earlycon_kgdboc to support early kgdb using boot consoles
> Documentation: kgdboc: Document new earlycon_kgdboc parameter
> serial: qcom_geni_serial: Support earlycon_kgdboc
> serial: 8250_early: Support earlycon_kgdboc
>
> .../admin-guide/kernel-parameters.txt | 20 ++
> Documentation/dev-tools/kgdb.rst | 14 +
> arch/arm64/include/asm/debug-monitors.h | 2 +
> arch/arm64/kernel/debug-monitors.c | 2 +-
> arch/arm64/kernel/kgdb.c | 5 +
> arch/arm64/kernel/traps.c | 3 +
> arch/x86/kernel/kgdb.c | 5 +
> drivers/misc/kgdbts.c | 2 +-
> drivers/tty/mips_ejtag_fdc.c | 2 +-
> drivers/tty/serial/8250/8250_early.c | 23 ++
> drivers/tty/serial/kgdboc.c | 262 ++++++++++++++++--
> drivers/tty/serial/qcom_geni_serial.c | 32 +++
> drivers/usb/early/ehci-dbgp.c | 2 +-
> include/linux/kgdb.h | 25 +-
> kernel/debug/debug_core.c | 48 +++-
> 15 files changed, 400 insertions(+), 47 deletions(-)
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH v2 0/9] kgdb: Support late serial drivers; enable early debug w/ boot consoles
2020-04-21 21:14 [PATCH v2 0/9] kgdb: Support late serial drivers; enable early debug w/ boot consoles Douglas Anderson
2020-04-21 21:14 ` [PATCH v2 6/9] kgdboc: Add earlycon_kgdboc to support early kgdb using " Douglas Anderson
2020-04-23 13:50 ` [PATCH v2 0/9] kgdb: Support late serial drivers; enable early debug w/ " Greg KH
@ 2020-04-24 8:32 ` Sumit Garg
2020-04-24 10:13 ` Daniel Thompson
2 siblings, 1 reply; 7+ messages in thread
From: Sumit Garg @ 2020-04-24 8:32 UTC (permalink / raw)
To: Douglas Anderson
Cc: jason.wessel, Daniel Thompson, Greg Kroah-Hartman, Mark Rutland,
Linux Doc Mailing List, Peter Zijlstra, catalin.marinas,
bjorn.andersson, Nadav Amit, hpa, Mauro Carvalho Chehab, will,
Matt Mullins, Jonathan Corbet, frowand.list, x86, jinho lim,
agross, Pawan Gupta, linux-arm-kernel, linux-serial,
kgdb-bugreport, Borislav Petkov, Dave Martin, Masami Hiramatsu,
Arnd Bergmann, linux-arm-msm, jslaby, Alexios Zavras, bp, tglx,
mingo, Allison Randal, Juergen Gross, linux-usb,
Linux Kernel Mailing List, James Morse, Eric W. Biederman,
Andrew Morton, Rick Edgecombe, Enrico Weigelt
Hi Doug,
On Wed, 22 Apr 2020 at 02:45, Douglas Anderson <dianders@chromium.org> wrote:
>
> This whole pile of patches was motivated by me trying to get kgdb to
> work properly on a platform where my serial driver ended up being hit
> by the -EPROBE_DEFER virus (it wasn't practicing social distancing
> from other drivers). Specifically my serial driver's parent device
> depended on a resource that wasn't available when its probe was first
> called. It returned -EPROBE_DEFER which meant that when "kgdboc"
> tried to run its setup the serial driver wasn't there. Unfortunately
> "kgdboc" never tried again, so that meant that kgdb was disabled until
> I manually enalbed it via sysfs.
>
> While I could try to figure out how to get around the -EPROBE_DEFER
> somehow, the above problems could happen to anyone and -EPROBE_DEFER
> is generally considered something you just have to live with. In any
> case the current "kgdboc" setup is a bit of a race waiting to happen.
> I _think_ I saw during early testing that even adding a msleep() in
> the typical serial driver's probe() is enough to trigger similar
> issues.
>
> I decided that for the above race the best attitude to get kgdb to
> register at boot was probably "if you can't beat 'em, join 'em".
> Thus, "kgdboc" now jumps on the -EPROBE_DEFER bandwagon (now that my
> driver uses it it's no longer a virus). It does so a little awkwardly
> because "kgdboc" hasn't normally had a "struct device" associated with
> it, but it's really not _that_ ugly to make a platform device and
> seems less ugly than alternatives.
>
> Unfortunately now on my system the debugger is one of the last things
> to register at boot. That's OK for debugging problems that show up
> significantly after boot, but isn't so hot for all the boot problems
> that I end up debugging. This motivated me to try to get something
> working a little earlier.
>
> My first attempt was to try to get the existing "ekgdboc" to work
> earlier. I tried that for a bit until I realized that it needed to
> work at the tty layer and I couldn't find any serial drivers that
> managed to register themselves to the tty layer super early at boot.
> The only documented use of "ekgdboc" is "ekgdboc=kbd" and that's a bit
> of a special snowflake. Trying to get my serial driver and all its
> dependencies to probe normally and register the tty driver super early
> at boot seemed like a bad way to go. In fact, all the complexity
> needed to do something like this is why the system already has a
> special concept of a "boot console" that lives only long enough to
> transition to the normal console.
>
> Leveraging the boot console seemed like a good way to go and that's
> what this series does. I found that consoles could have a read()
> function, though I couldn't find anyone who implemented it. I
> implemented it for two serial drivers for the devices I had easy
> access to, making the assumption that for boot consoles that we could
> assume read() and write() were polling-compatible (seems sane I
> think).
>
> Now anyone who makes a small change to their serial driver can easily
> enable early kgdb debugging!
>
> The devices I had for testing were:
> - arm32: rk3288-veyron-jerry
> - arm64: rk3399-gru-kevin
> - arm64: qcom-sc7180-trogdor (not mainline yet)
>
> These are the devices I tested this series on. I tried to test
> various combinations of enabling/disabling various options and I
> hopefully caught the corner cases, but I'd appreciate any extra
> testing people can do.
earlycon_kgdboc sounds like a really cool feature. So I gave it a try
on my arm64 machine (Developerbox) and it works like a charm. So for
patch 6/9 you can add:
Tested-by: Sumit Garg <sumit.garg@linaro.org>
Plus, in order to enable earlycon_kgdboc on Developerbox I had to
implement the read() function in the early console driver for
amba-pl011 (see patch [1]). It would be great if you could pick that
patch [1] too as part of this series.
[1] https://lkml.org/lkml/2020/4/24/173
-Sumit
> Notably I didn't test on x86, but (I think) I
> didn't touch much there so I shouldn't have broken anything.
>
> When testing I found a few problems with actually dropping into the
> debugger super early on arm and arm64 devices. Patches in this series
> should help with this. For arm I just avoid dropping into the
> debugger until a little later and for arm64 I actually enable
> debugging super early.
>
> I realize that bits of this series might feel a little hacky, though
> I've tried to do things in the cleanest way I could without overly
> interferring with the rest of the kernel. If you hate the way I
> solved a problem I would love it if you could provide guidance on how
> you think I could solve the problem better.
>
> This series (and my comments / documentation / commit messages) are
> now long enough that my eyes glaze over when I try to read it all over
> to double-check. I've nontheless tried to double-check it, but I'm
> pretty sure I did something stupid. Thank you ahead of time for
> pointing it out to me so I can fix it in v3. If somehow I managed to
> not do anything stupid (really?) then thank you for double-checking me
> anyway.
>
> Changes in v2:
> - ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb") new for v2.
> - ("Revert "kgdboc: disable the console lock when in kgdb"") new for v2.
> - Assumes we have ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb")
> - Fix kgdbts, tty/mips_ejtag_fdc, and usb/early/ehci-dbgp
>
> Douglas Anderson (9):
> kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb
> Revert "kgdboc: disable the console lock when in kgdb"
> kgdboc: Use a platform device to handle tty drivers showing up late
> kgdb: Delay "kgdbwait" to dbg_late_init() by default
> arm64: Add call_break_hook() to early_brk64() for early kgdb
> kgdboc: Add earlycon_kgdboc to support early kgdb using boot consoles
> Documentation: kgdboc: Document new earlycon_kgdboc parameter
> serial: qcom_geni_serial: Support earlycon_kgdboc
> serial: 8250_early: Support earlycon_kgdboc
>
> .../admin-guide/kernel-parameters.txt | 20 ++
> Documentation/dev-tools/kgdb.rst | 14 +
> arch/arm64/include/asm/debug-monitors.h | 2 +
> arch/arm64/kernel/debug-monitors.c | 2 +-
> arch/arm64/kernel/kgdb.c | 5 +
> arch/arm64/kernel/traps.c | 3 +
> arch/x86/kernel/kgdb.c | 5 +
> drivers/misc/kgdbts.c | 2 +-
> drivers/tty/mips_ejtag_fdc.c | 2 +-
> drivers/tty/serial/8250/8250_early.c | 23 ++
> drivers/tty/serial/kgdboc.c | 262 ++++++++++++++++--
> drivers/tty/serial/qcom_geni_serial.c | 32 +++
> drivers/usb/early/ehci-dbgp.c | 2 +-
> include/linux/kgdb.h | 25 +-
> kernel/debug/debug_core.c | 48 +++-
> 15 files changed, 400 insertions(+), 47 deletions(-)
>
> --
> 2.26.1.301.g55bc3eb7cb9-goog
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH v2 0/9] kgdb: Support late serial drivers; enable early debug w/ boot consoles
2020-04-24 8:32 ` Sumit Garg
@ 2020-04-24 10:13 ` Daniel Thompson
0 siblings, 0 replies; 7+ messages in thread
From: Daniel Thompson @ 2020-04-24 10:13 UTC (permalink / raw)
To: Sumit Garg
Cc: Douglas Anderson, jason.wessel, Greg Kroah-Hartman, Mark Rutland,
Linux Doc Mailing List, Peter Zijlstra, catalin.marinas,
bjorn.andersson, Nadav Amit, hpa, Mauro Carvalho Chehab, will,
Matt Mullins, Jonathan Corbet, frowand.list, x86, jinho lim,
agross, Pawan Gupta, linux-arm-kernel, linux-serial,
kgdb-bugreport, Borislav Petkov, Dave Martin, Masami Hiramatsu,
Arnd Bergmann, linux-arm-msm, jslaby, Alexios Zavras, bp, tglx,
mingo, Allison Randal, Juergen Gross, linux-usb,
Linux Kernel Mailing List, James Morse, Eric W. Biederman,
Andrew Morton, Rick Edgecombe, Enrico Weigelt
On Fri, Apr 24, 2020 at 02:02:51PM +0530, Sumit Garg wrote:
> Hi Doug,
>
> On Wed, 22 Apr 2020 at 02:45, Douglas Anderson <dianders@chromium.org> wrote:
> >
> > This whole pile of patches was motivated by me trying to get kgdb to
> > work properly on a platform where my serial driver ended up being hit
> > by the -EPROBE_DEFER virus (it wasn't practicing social distancing
> > from other drivers). Specifically my serial driver's parent device
> > depended on a resource that wasn't available when its probe was first
> > called. It returned -EPROBE_DEFER which meant that when "kgdboc"
> > tried to run its setup the serial driver wasn't there. Unfortunately
> > "kgdboc" never tried again, so that meant that kgdb was disabled until
> > I manually enalbed it via sysfs.
> >
> > While I could try to figure out how to get around the -EPROBE_DEFER
> > somehow, the above problems could happen to anyone and -EPROBE_DEFER
> > is generally considered something you just have to live with. In any
> > case the current "kgdboc" setup is a bit of a race waiting to happen.
> > I _think_ I saw during early testing that even adding a msleep() in
> > the typical serial driver's probe() is enough to trigger similar
> > issues.
> >
> > I decided that for the above race the best attitude to get kgdb to
> > register at boot was probably "if you can't beat 'em, join 'em".
> > Thus, "kgdboc" now jumps on the -EPROBE_DEFER bandwagon (now that my
> > driver uses it it's no longer a virus). It does so a little awkwardly
> > because "kgdboc" hasn't normally had a "struct device" associated with
> > it, but it's really not _that_ ugly to make a platform device and
> > seems less ugly than alternatives.
> >
> > Unfortunately now on my system the debugger is one of the last things
> > to register at boot. That's OK for debugging problems that show up
> > significantly after boot, but isn't so hot for all the boot problems
> > that I end up debugging. This motivated me to try to get something
> > working a little earlier.
> >
> > My first attempt was to try to get the existing "ekgdboc" to work
> > earlier. I tried that for a bit until I realized that it needed to
> > work at the tty layer and I couldn't find any serial drivers that
> > managed to register themselves to the tty layer super early at boot.
> > The only documented use of "ekgdboc" is "ekgdboc=kbd" and that's a bit
> > of a special snowflake. Trying to get my serial driver and all its
> > dependencies to probe normally and register the tty driver super early
> > at boot seemed like a bad way to go. In fact, all the complexity
> > needed to do something like this is why the system already has a
> > special concept of a "boot console" that lives only long enough to
> > transition to the normal console.
> >
> > Leveraging the boot console seemed like a good way to go and that's
> > what this series does. I found that consoles could have a read()
> > function, though I couldn't find anyone who implemented it. I
> > implemented it for two serial drivers for the devices I had easy
> > access to, making the assumption that for boot consoles that we could
> > assume read() and write() were polling-compatible (seems sane I
> > think).
> >
> > Now anyone who makes a small change to their serial driver can easily
> > enable early kgdb debugging!
> >
> > The devices I had for testing were:
> > - arm32: rk3288-veyron-jerry
> > - arm64: rk3399-gru-kevin
> > - arm64: qcom-sc7180-trogdor (not mainline yet)
> >
> > These are the devices I tested this series on. I tried to test
> > various combinations of enabling/disabling various options and I
> > hopefully caught the corner cases, but I'd appreciate any extra
> > testing people can do.
>
> earlycon_kgdboc sounds like a really cool feature. So I gave it a try
> on my arm64 machine (Developerbox) and it works like a charm. So for
> patch 6/9 you can add:
>
> Tested-by: Sumit Garg <sumit.garg@linaro.org>
>
> Plus, in order to enable earlycon_kgdboc on Developerbox I had to
> implement the read() function in the early console driver for
> amba-pl011 (see patch [1]). It would be great if you could pick that
> patch [1] too as part of this series.
>
> [1] https://lkml.org/lkml/2020/4/24/173
I think PL011 support is also useful for getting this feature integrated
into the test suite too!
Daniel.
>
> -Sumit
>
> > Notably I didn't test on x86, but (I think) I
> > didn't touch much there so I shouldn't have broken anything.
> >
> > When testing I found a few problems with actually dropping into the
> > debugger super early on arm and arm64 devices. Patches in this series
> > should help with this. For arm I just avoid dropping into the
> > debugger until a little later and for arm64 I actually enable
> > debugging super early.
> >
> > I realize that bits of this series might feel a little hacky, though
> > I've tried to do things in the cleanest way I could without overly
> > interferring with the rest of the kernel. If you hate the way I
> > solved a problem I would love it if you could provide guidance on how
> > you think I could solve the problem better.
> >
> > This series (and my comments / documentation / commit messages) are
> > now long enough that my eyes glaze over when I try to read it all over
> > to double-check. I've nontheless tried to double-check it, but I'm
> > pretty sure I did something stupid. Thank you ahead of time for
> > pointing it out to me so I can fix it in v3. If somehow I managed to
> > not do anything stupid (really?) then thank you for double-checking me
> > anyway.
> >
> > Changes in v2:
> > - ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb") new for v2.
> > - ("Revert "kgdboc: disable the console lock when in kgdb"") new for v2.
> > - Assumes we have ("kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb")
> > - Fix kgdbts, tty/mips_ejtag_fdc, and usb/early/ehci-dbgp
> >
> > Douglas Anderson (9):
> > kgdb: Disable WARN_CONSOLE_UNLOCKED for all kgdb
> > Revert "kgdboc: disable the console lock when in kgdb"
> > kgdboc: Use a platform device to handle tty drivers showing up late
> > kgdb: Delay "kgdbwait" to dbg_late_init() by default
> > arm64: Add call_break_hook() to early_brk64() for early kgdb
> > kgdboc: Add earlycon_kgdboc to support early kgdb using boot consoles
> > Documentation: kgdboc: Document new earlycon_kgdboc parameter
> > serial: qcom_geni_serial: Support earlycon_kgdboc
> > serial: 8250_early: Support earlycon_kgdboc
> >
> > .../admin-guide/kernel-parameters.txt | 20 ++
> > Documentation/dev-tools/kgdb.rst | 14 +
> > arch/arm64/include/asm/debug-monitors.h | 2 +
> > arch/arm64/kernel/debug-monitors.c | 2 +-
> > arch/arm64/kernel/kgdb.c | 5 +
> > arch/arm64/kernel/traps.c | 3 +
> > arch/x86/kernel/kgdb.c | 5 +
> > drivers/misc/kgdbts.c | 2 +-
> > drivers/tty/mips_ejtag_fdc.c | 2 +-
> > drivers/tty/serial/8250/8250_early.c | 23 ++
> > drivers/tty/serial/kgdboc.c | 262 ++++++++++++++++--
> > drivers/tty/serial/qcom_geni_serial.c | 32 +++
> > drivers/usb/early/ehci-dbgp.c | 2 +-
> > include/linux/kgdb.h | 25 +-
> > kernel/debug/debug_core.c | 48 +++-
> > 15 files changed, 400 insertions(+), 47 deletions(-)
> >
> > --
> > 2.26.1.301.g55bc3eb7cb9-goog
> >
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 7+ messages in thread