* [RFC PATCH 0/3] tty: amba-pl011: Decruft and streamline earlycon output
@ 2017-10-18 14:14 Dave Martin
2017-10-18 14:14 ` [RFC PATCH 1/3] tty: amba-pl011: earlycon: Switch to relaxed I/O Dave Martin
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Dave Martin @ 2017-10-18 14:14 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Andre Przywara, Stephen Boyd, Russell King, Andy Gross,
linux-serial, Greg Kroah-Hartman, Bhupinder Thakur
This series avoids waiting for the UART transmitter to go idle in
between every earlycon char written, and does some refactoring so that
this change doesn't have to be implemented multiple times.
The initial motiviation for this is a performance issue encountered
during Xen virtual UART development (see [1] and patch 3).
It's debatable whether this is Linux's responsibility or the
responsibility of the virtual UART: this is a situation where the
virtualised behaviour is inevitably so different from hardware that
it's hard to make a ruling on exactly what behaviour is reasonable.
The best course is probably to work around it on _both_ ends, in the
interest of best interoperation.
Currently amba-pl011 has two distinct earlycon implementations: a
generic pl011 implementation and a quirked implementation that works
around the lack of a usable BUSY status bit in QDF2400 SoCs.
I don't like applying the same obscure fix in multiple places if I can
avoid it, so I've had a go at unifying the earlycons here (patches 1-2).
I'd appreciate people's views, particularly on:
* (pl011) Whether to refactor harder: the caching of vendor->reg_offset
vendor->access_32b in uart_amba_port is convenient and avoids chasing
an extra pointer in the common case, but this the refactoring for the
earlycon case a bit ugly. Maybe we can do better.
* (pl011, serial_core) Whether the new use of port->private_data will
conflict with existing code (no, AFAICT).
* (earlycon) Whether overriding port->iotype for 32-bit-only
implementations is the right thing to do.
* (earlycon) Whether there is some reason to drain the transmitter
after every char that I've overlooked.
Currently I don't consider this to be a candidate for stable, but it
could be backported. Due to the steady quirkage churn in amba-pl011,
it would be better to drop the refactoring stuff for backporting
purposes.
Testing:
* tested rather casually no ARM Juno r0 (real PL011) and the ARM VFP
Base Model (emulated PL011)
* *Not tested* on QDF2400, since I don't have access to one.
Someone probably ought to test that.
[1] Re: [Xen-devel] [PATCH 26/27 v12] arm/xen: vpl011: Fix the slow
early console SBSA UART output
https://lists.xenproject.org/archives/html/xen-devel/2017-10/msg01949.html
Dave Martin (3):
tty: amba-pl011: earlycon: Switch to relaxed I/O
tty: amba-pl011: earlycon: Unify earlycon backends
tty: amba-pl011: earlycon: Don't drain the transmitter after each char
drivers/tty/serial/amba-pl011.c | 108 +++++++++++++++++++++++-----------------
1 file changed, 61 insertions(+), 47 deletions(-)
--
2.1.4
^ permalink raw reply [flat|nested] 4+ messages in thread
* [RFC PATCH 1/3] tty: amba-pl011: earlycon: Switch to relaxed I/O
2017-10-18 14:14 [RFC PATCH 0/3] tty: amba-pl011: Decruft and streamline earlycon output Dave Martin
@ 2017-10-18 14:14 ` Dave Martin
2017-10-18 14:14 ` [RFC PATCH 2/3] tty: amba-pl011: earlycon: Unify earlycon backends Dave Martin
2017-10-18 14:14 ` [RFC PATCH 3/3] tty: amba-pl011: earlycon: Don't drain the transmitter after each char Dave Martin
2 siblings, 0 replies; 4+ messages in thread
From: Dave Martin @ 2017-10-18 14:14 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Andre Przywara, Stephen Boyd, Russell King, Andy Gross,
linux-serial, Greg Kroah-Hartman, Bhupinder Thakur
The current pl011 earlycon implementation uses the regular I/O
accessors, but this is unnecessary because the architecture
enforces ordering of accesses to the same device anyway.
Using relaxed I/O brings the added bonus that the generic pl011
helpers can be used instead of having to open-code: this allows
some duplicate logic to be unified.
This patch does some refactoring so that pl011_{read,write,
tx_empty}() are split into a frontend that does the same thing as
before, and a backend __* that can be used with a uart_port that
has no corresponding uart_amba_port structure yet (i.e., the
earlycon scenario).
__pl011_tx_empty() can now be used in place of an open-coded poll
that differs between the generic and qdf2400_e44 earlycon
implementations, because __pl011_tx_empty() handles the necessary
quirkage transparently.
Moving to relaxed I/O loses wmb() semantics at the start of an
earlycon write, and this may be important for some debugging
scenarios, so an explicit wmb() is inserted at the start of each
earlycon write implementation.
Because qdf2400_e44 only supports 32-bit I/O, this patch also
explicitly sets port->iotype == UPIO_MEM32 so that __pl011_{write,
read}() use the correct I/O size.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
drivers/tty/serial/amba-pl011.c | 69 ++++++++++++++++++++++++++++-------------
1 file changed, 47 insertions(+), 22 deletions(-)
diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index 111e6a9..fd9e08c 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -300,26 +300,38 @@ static unsigned int pl011_reg_to_offset(const struct uart_amba_port *uap,
return uap->reg_offset[reg];
}
-static unsigned int pl011_read(const struct uart_amba_port *uap,
- unsigned int reg)
+static unsigned int __pl011_read(const struct uart_port *port,
+ const u16 *reg_offset, unsigned int reg)
{
- void __iomem *addr = uap->port.membase + pl011_reg_to_offset(uap, reg);
+ void __iomem *addr = port->membase + reg_offset[reg];
- return (uap->port.iotype == UPIO_MEM32) ?
+ return (port->iotype == UPIO_MEM32) ?
readl_relaxed(addr) : readw_relaxed(addr);
}
-static void pl011_write(unsigned int val, const struct uart_amba_port *uap,
+static unsigned int pl011_read(const struct uart_amba_port *uap,
unsigned int reg)
{
- void __iomem *addr = uap->port.membase + pl011_reg_to_offset(uap, reg);
+ return __pl011_read(&uap->port, uap->reg_offset, reg);
+}
- if (uap->port.iotype == UPIO_MEM32)
+static void __pl011_write(unsigned int val, const struct uart_port *port,
+ const u16 *reg_offset, unsigned int reg)
+{
+ void __iomem *addr = port->membase + reg_offset[reg];
+
+ if (port->iotype == UPIO_MEM32)
writel_relaxed(val, addr);
else
writew_relaxed(val, addr);
}
+static void pl011_write(unsigned int val, const struct uart_amba_port *uap,
+ unsigned int reg)
+{
+ __pl011_write(val, &uap->port, uap->reg_offset, reg);
+}
+
/*
* Reads up to 256 characters from the FIFO or until it's empty and
* inserts them into the TTY layer. Returns the number of characters
@@ -1537,16 +1549,23 @@ static irqreturn_t pl011_int(int irq, void *dev_id)
return IRQ_RETVAL(handled);
}
-static unsigned int pl011_tx_empty(struct uart_port *port)
+static unsigned int __pl011_tx_empty(struct uart_port *port,
+ const u16 *reg_offset, const struct vendor_data *vendor)
{
- struct uart_amba_port *uap =
- container_of(port, struct uart_amba_port, port);
+ unsigned int status = __pl011_read(port, reg_offset, REG_FR);
/* Allow feature register bits to be inverted to work around errata */
- unsigned int status = pl011_read(uap, REG_FR) ^ uap->vendor->inv_fr;
+ status ^= vendor->inv_fr;
+ status &= vendor->fr_busy | UART01x_FR_TXFF;
+ return status ? 0 : TIOCSER_TEMT;
+}
- return status & (uap->vendor->fr_busy | UART01x_FR_TXFF) ?
- 0 : TIOCSER_TEMT;
+static unsigned int pl011_tx_empty(struct uart_port *port)
+{
+ struct uart_amba_port *uap =
+ container_of(port, struct uart_amba_port, port);
+
+ return __pl011_tx_empty(port, uap->reg_offset, uap->vendor);
}
static unsigned int pl011_get_mctrl(struct uart_port *port)
@@ -2419,10 +2438,13 @@ static struct console amba_console = {
static void qdf2400_e44_putc(struct uart_port *port, int c)
{
- while (readl(port->membase + UART01x_FR) & UART01x_FR_TXFF)
+ const struct vendor_data *vendor = &vendor_qdt_qdf2400_e44;
+ const u16 *reg_offset = vendor->reg_offset;
+
+ while (__pl011_read(port, reg_offset, REG_FR) & UART01x_FR_TXFF)
cpu_relax();
- writel(c, port->membase + UART01x_DR);
- while (!(readl(port->membase + UART01x_FR) & UART011_FR_TXFE))
+ __pl011_write(c, port, reg_offset, REG_DR);
+ while (!__pl011_tx_empty(port, reg_offset, vendor))
cpu_relax();
}
@@ -2430,18 +2452,19 @@ static void qdf2400_e44_early_write(struct console *con, const char *s, unsigned
{
struct earlycon_device *dev = con->data;
+ wmb();
uart_console_write(&dev->port, s, n, qdf2400_e44_putc);
}
static void pl011_putc(struct uart_port *port, int c)
{
- while (readl(port->membase + UART01x_FR) & UART01x_FR_TXFF)
+ const struct vendor_data *vendor = &vendor_arm;
+ const u16 *reg_offset = vendor->reg_offset;
+
+ while (__pl011_read(port, reg_offset, REG_FR) & UART01x_FR_TXFF)
cpu_relax();
- if (port->iotype == UPIO_MEM32)
- writel(c, port->membase + UART01x_DR);
- else
- writeb(c, port->membase + UART01x_DR);
- while (readl(port->membase + UART01x_FR) & UART01x_FR_BUSY)
+ __pl011_write(c, port, reg_offset, REG_DR);
+ while (!__pl011_tx_empty(port, reg_offset, vendor))
cpu_relax();
}
@@ -2449,6 +2472,7 @@ static void pl011_early_write(struct console *con, const char *s, unsigned n)
{
struct earlycon_device *dev = con->data;
+ wmb();
uart_console_write(&dev->port, s, n, pl011_putc);
}
@@ -2494,6 +2518,7 @@ qdf2400_e44_early_console_setup(struct earlycon_device *device,
if (!device->port.membase)
return -ENODEV;
+ device->port.iotype = UPIO_MEM32;
device->con->write = qdf2400_e44_early_write;
return 0;
}
--
2.1.4
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [RFC PATCH 2/3] tty: amba-pl011: earlycon: Unify earlycon backends
2017-10-18 14:14 [RFC PATCH 0/3] tty: amba-pl011: Decruft and streamline earlycon output Dave Martin
2017-10-18 14:14 ` [RFC PATCH 1/3] tty: amba-pl011: earlycon: Switch to relaxed I/O Dave Martin
@ 2017-10-18 14:14 ` Dave Martin
2017-10-18 14:14 ` [RFC PATCH 3/3] tty: amba-pl011: earlycon: Don't drain the transmitter after each char Dave Martin
2 siblings, 0 replies; 4+ messages in thread
From: Dave Martin @ 2017-10-18 14:14 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Andre Przywara, Stephen Boyd, Russell King, Andy Gross,
linux-serial, Greg Kroah-Hartman, Bhupinder Thakur
Now that the generic pl011 and qdf2400_e44 earlycon backends differ
only in the choice of vendor_data and minor initialisation detatils
that are detectable based on vendor_data, there's no strong need
for them to be separate.
This patch factors common earlyon setup into a new function
pl011_early_console_setup_common(), which does generic setup and
stashes a pointer to the relevant vendor_data struct in
port->private_data (which appears otherwise unused by the
amba-pl011 driver).
The qdf2400_e44-specific earlycon write/putc functions would now be
identical to the pl011 versions, so they are deleted.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
drivers/tty/serial/amba-pl011.c | 53 ++++++++++++++++-------------------------
1 file changed, 20 insertions(+), 33 deletions(-)
diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index fd9e08c..084ed3f 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -2436,29 +2436,9 @@ static struct console amba_console = {
#define AMBA_CONSOLE (&amba_console)
-static void qdf2400_e44_putc(struct uart_port *port, int c)
-{
- const struct vendor_data *vendor = &vendor_qdt_qdf2400_e44;
- const u16 *reg_offset = vendor->reg_offset;
-
- while (__pl011_read(port, reg_offset, REG_FR) & UART01x_FR_TXFF)
- cpu_relax();
- __pl011_write(c, port, reg_offset, REG_DR);
- while (!__pl011_tx_empty(port, reg_offset, vendor))
- cpu_relax();
-}
-
-static void qdf2400_e44_early_write(struct console *con, const char *s, unsigned n)
-{
- struct earlycon_device *dev = con->data;
-
- wmb();
- uart_console_write(&dev->port, s, n, qdf2400_e44_putc);
-}
-
static void pl011_putc(struct uart_port *port, int c)
{
- const struct vendor_data *vendor = &vendor_arm;
+ const struct vendor_data *vendor = port->private_data;
const u16 *reg_offset = vendor->reg_offset;
while (__pl011_read(port, reg_offset, REG_FR) & UART01x_FR_TXFF)
@@ -2476,6 +2456,22 @@ static void pl011_early_write(struct console *con, const char *s, unsigned n)
uart_console_write(&dev->port, s, n, pl011_putc);
}
+static int __init
+pl011_early_console_setup_common(struct earlycon_device *device,
+ const struct vendor_data *vendor)
+{
+ if (!device->port.membase)
+ return -ENODEV;
+
+ device->port.private_data = (void *)vendor;
+ if (vendor->access_32b)
+ device->port.iotype = UPIO_MEM32;
+
+ device->con->write = pl011_early_write;
+
+ return 0;
+}
+
/*
* On non-ACPI systems, earlycon is enabled by specifying
* "earlycon=pl011,<address>" on the kernel command line.
@@ -2491,12 +2487,7 @@ static void pl011_early_write(struct console *con, const char *s, unsigned n)
static int __init pl011_early_console_setup(struct earlycon_device *device,
const char *opt)
{
- if (!device->port.membase)
- return -ENODEV;
-
- device->con->write = pl011_early_write;
-
- return 0;
+ return pl011_early_console_setup_common(device, &vendor_arm);
}
OF_EARLYCON_DECLARE(pl011, "arm,pl011", pl011_early_console_setup);
OF_EARLYCON_DECLARE(pl011, "arm,sbsa-uart", pl011_early_console_setup);
@@ -2515,12 +2506,8 @@ static int __init
qdf2400_e44_early_console_setup(struct earlycon_device *device,
const char *opt)
{
- if (!device->port.membase)
- return -ENODEV;
-
- device->port.iotype = UPIO_MEM32;
- device->con->write = qdf2400_e44_early_write;
- return 0;
+ return pl011_early_console_setup_common(device,
+ &vendor_qdt_qdf2400_e44);
}
EARLYCON_DECLARE(qdf2400_e44, qdf2400_e44_early_console_setup);
--
2.1.4
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [RFC PATCH 3/3] tty: amba-pl011: earlycon: Don't drain the transmitter after each char
2017-10-18 14:14 [RFC PATCH 0/3] tty: amba-pl011: Decruft and streamline earlycon output Dave Martin
2017-10-18 14:14 ` [RFC PATCH 1/3] tty: amba-pl011: earlycon: Switch to relaxed I/O Dave Martin
2017-10-18 14:14 ` [RFC PATCH 2/3] tty: amba-pl011: earlycon: Unify earlycon backends Dave Martin
@ 2017-10-18 14:14 ` Dave Martin
2 siblings, 0 replies; 4+ messages in thread
From: Dave Martin @ 2017-10-18 14:14 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Andre Przywara, Stephen Boyd, Russell King, Andy Gross,
linux-serial, Greg Kroah-Hartman, Bhupinder Thakur
Currently, the pl011 earlycon implementation waits for the UART
transmitter to drain completely and become idle after each
character is written.
This can result in (mostly harmless) delays and stuttering output
on the wire, and can also lead to poor performance in virtualised
UART implementations: thrashing between VMs can occur as each
character gets forcibly drained to the remove sink (dom0 in the Xen
case) before the source guest writes another character.
However, the semantics of earlycon don't allow callers to write one
character at a time with the expectation of completion: the only
interface to exposed to earlycon writes a whole string before
returning.
So, this patch eliminates the draining of the transmitted from
pl011_putc() and moves if to the the and of pl011_early_write()
instead. From the earlycon caller's point of view, the effect
should be faster but otherwise identical: either way, all the
characters have gone out onto the wire before the write method
returns.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
For background on the virtualisation issue, see [1]. Although
virtual UART implementations should be working around this issue
somehow for compatibility with current and older kernels, this patch
should promote better interoperation, and earlycon throughput should
improve a bit on real hardware too.
[1] Re: [Xen-devel] [PATCH 26/27 v12] arm/xen: vpl011: Fix the slow
early console SBSA UART output
https://lists.xenproject.org/archives/html/xen-devel/2017-10/msg01949.html
---
drivers/tty/serial/amba-pl011.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index 084ed3f..e27059a 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -2444,16 +2444,18 @@ static void pl011_putc(struct uart_port *port, int c)
while (__pl011_read(port, reg_offset, REG_FR) & UART01x_FR_TXFF)
cpu_relax();
__pl011_write(c, port, reg_offset, REG_DR);
- while (!__pl011_tx_empty(port, reg_offset, vendor))
- cpu_relax();
}
static void pl011_early_write(struct console *con, const char *s, unsigned n)
{
struct earlycon_device *dev = con->data;
+ const struct vendor_data *vendor = dev->port.private_data;
+ const u16 *reg_offset = vendor->reg_offset;
wmb();
uart_console_write(&dev->port, s, n, pl011_putc);
+ while (!__pl011_tx_empty(&dev->port, reg_offset, vendor))
+ cpu_relax();
}
static int __init
--
2.1.4
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2017-10-18 14:14 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-10-18 14:14 [RFC PATCH 0/3] tty: amba-pl011: Decruft and streamline earlycon output Dave Martin
2017-10-18 14:14 ` [RFC PATCH 1/3] tty: amba-pl011: earlycon: Switch to relaxed I/O Dave Martin
2017-10-18 14:14 ` [RFC PATCH 2/3] tty: amba-pl011: earlycon: Unify earlycon backends Dave Martin
2017-10-18 14:14 ` [RFC PATCH 3/3] tty: amba-pl011: earlycon: Don't drain the transmitter after each char Dave Martin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).