From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Anders Kaseorg <andersk@MIT.EDU>,
"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] IRQ handling race and spurious IIR read in serial/8250.c
Date: Thu, 19 Feb 2009 11:24:12 -0800 [thread overview]
Message-ID: <499DB1DC.1020704@goop.org> (raw)
In-Reply-To: <18834.63479.700909.321686@mariner.uk.xensource.com>
Added cc:
Ian Jackson wrote:
> Anders Kaseorg writes ("Re: Serial console hangs with Linux 2.6.20 HVM guest"):
>
>> Yes, I took Linux v2.6.20 on amd64, ran `make defconfig`, then ran `make
>> menuconfig` and turned off CONFIG_HOTPLUG_CPU (Processor type and features
>> ò Support for hot-pluggable CPUs).
>>
>
> Thanks. I think I have tracked down the bug. In
> drivers/serial/8250.c in Linux there are two bugs:
> 1. UART_BUG_TXEN can be spuriously set, due to an IRQ race
> 2. The workaround then applied by the kernel is itself buggy
>
> Anders: can you try two tests for me ? Firstly, in
> serial8250_startup, delete the section which sets UART_BUG_TXEN (see
> 2nd patch below); I think this will fix the symptoms for you.
> Secondly, in serial8250_start_tx delete the read from the IIR and the
> relevant branch of the text (see 3rd patch below); I think this will
> also in itself fix your symptoms. I haven't compiled either patch (so
> you may find that eg I missed deleting some variables).
>
>
> The bugs in detail (this discussion applies to 2.6.20 and also to
> 2.6.28.4):
>
> 1. The hunk of serial8250_startup I quote below attempts to discover
> whether writing the IER re-asserts the THRI (transmit ready)
> interrupt. However the spinlock that it has taken out,
> port->lock, is not the one that the IRQ service routine takes
> before reading the IIR (i->lock). As a result, on an SMP system
> the generated interrupt races with the straight-line code in
> serial8250_startup.
>
> If serial8250_startup loses the race (perhaps because the system
> is a VM and its VCPU got preempted), UART_BUG_TXEN is spuriously
> added to bugs. This is quite unlikely in a normal system but in
> certain Xen configurations, particularly ones where there is CPU
> pressure, we may lose the race every time.
>
> It is not exactly clear to me how this ought to be resolved. One
> possibility is that the UART_BUG_TXEN problem might be worked
> around perfectly well by the new and very similar workaround
> UART_BUG_THRE[1] in 2.6.21ish in which case it could just be
> removed.
>
> 2. UART_BUG_TXEN's workaround appears to be intended to be harmless.
> However what it actually does is to read the IIR, thus clearing
> any actual interrupt (including incidentally non-THRI), and then
> only perform the intended servicing if the interrupt was _not_
> asserted. That is, it breaks on any serial port with the bug.
>
> As far as I can see there is not much use in UART_BUG_TXEN reading
> IIR at all, so a suitable change if we want to keep UART_BUG_TXEN
> might be the first patch I enclose below (again, not compiled
> or tested).
>
> If UART_BUG_TXEN is retained something along these lines should be
> done at the very least.
>
> Ian.
>
> [1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=40b36daad0ac704e6d5c1b75789f371ef5b053c1
> in which case UART
>
>
> Proposed initial band-aid fix (against 2.6.28.4):
>
>
> Do not read IIR in serial8250_start_tx when UART_BUG_TXEN
>
> Reading the IIR clears some oustanding interrupts so it is not safe.
> Instead, simply transmit immediately if the buffer is empty without
> regard to IIR.
>
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
>
> --- ../linux-2.6.28.4/drivers/serial/8250.c~ 2009-02-06 21:47:45.000000000 +0000
> +++ ../linux-2.6.28.4/drivers/serial/8250.c 2009-02-11 15:55:24.000000000 +0000
> @@ -1257,14 +1257,12 @@
> serial_out(up, UART_IER, up->ier);
>
> if (up->bugs & UART_BUG_TXEN) {
> - unsigned char lsr, iir;
> + unsigned char lsr;
> lsr = serial_in(up, UART_LSR);
> up->lsr_saved_flags |= lsr & LSR_SAVE_FLAGS;
> - iir = serial_in(up, UART_IIR) & 0x0f;
> if ((up->port.type == PORT_RM9000) ?
> - (lsr & UART_LSR_THRE &&
> - (iir == UART_IIR_NO_INT || iir == UART_IIR_THRI)) :
> - (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT))
> + (lsr & UART_LSR_THRE) :
> + (lsr & UART_LSR_TEMT))
> transmit_chars(up);
> }
> }
>
>
> Anders - first patch to try (against 2.6.20):
>
> --- drivers/serial/8250.c~ 2007-02-04 18:44:54.000000000 +0000
> +++ drivers/serial/8250.c 2009-02-11 15:39:43.000000000 +0000
> @@ -1645,25 +1645,6 @@
>
> serial8250_set_mctrl(&up->port, up->port.mctrl);
>
> - /*
> - * Do a quick test to see if we receive an
> - * interrupt when we enable the TX irq.
> - */
> - serial_outp(up, UART_IER, UART_IER_THRI);
> - lsr = serial_in(up, UART_LSR);
> - iir = serial_in(up, UART_IIR);
> - serial_outp(up, UART_IER, 0);
> -
> - if (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT) {
> - if (!(up->bugs & UART_BUG_TXEN)) {
> - up->bugs |= UART_BUG_TXEN;
> - pr_debug("ttyS%d - enabling bad tx status workarounds\n",
> - port->line);
> - }
> - } else {
> - up->bugs &= ~UART_BUG_TXEN;
> - }
> -
> spin_unlock_irqrestore(&up->port.lock, flags);
>
> /*
>
>
> Anders - second patch to try (against 2.6.20):
> Fix should be suitable for distribution IMO.
>
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
>
> --- drivers/serial/8250.c~ 2007-02-04 18:44:54.000000000 +0000
> +++ drivers/serial/8250.c 2009-02-11 15:41:51.000000000 +0000
> @@ -1136,10 +1136,9 @@
> serial_out(up, UART_IER, up->ier);
>
> if (up->bugs & UART_BUG_TXEN) {
> - unsigned char lsr, iir;
> + unsigned char lsr;
> lsr = serial_in(up, UART_LSR);
> - iir = serial_in(up, UART_IIR);
> - if (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT)
> + if (lsr & UART_LSR_TEMT)
> transmit_chars(up);
> }
> }
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
next prev parent reply other threads:[~2009-02-19 19:24 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <1233800599.5398.0.camel@balanced-tree>
[not found] ` <18827.7191.646173.99255@mariner.uk.xensource.com>
[not found] ` <1233862446.5676.28.camel@balanced-tree>
[not found] ` <1233870772.5676.40.camel@balanced-tree>
[not found] ` <18833.40557.119090.48930@mariner.uk.xensource.com>
[not found] ` <alpine.DEB.2.00.0902101249181.20723@vinegar-pot.mit.edu>
2009-02-11 16:08 ` [PATCH] IRQ handling race and spurious IIR read in serial/8250.c Ian Jackson
2009-02-19 17:52 ` Ian Jackson
2009-02-19 18:37 ` [Xen-devel] " Markus Armbruster
2009-02-19 19:24 ` Jeremy Fitzhardinge [this message]
2009-03-12 17:57 Markus Armbruster
2009-03-12 18:09 ` Alan Cox
2009-03-12 19:30 ` Ian Jackson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=499DB1DC.1020704@goop.org \
--to=jeremy@goop.org \
--cc=Ian.Jackson@eu.citrix.com \
--cc=akpm@linux-foundation.org \
--cc=andersk@MIT.EDU \
--cc=linux-kernel@vger.kernel.org \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox