From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Anders Kaseorg <andersk@MIT.EDU>,
"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] IRQ handling race and spurious IIR read in serial/8250.c
Date: Thu, 19 Feb 2009 11:24:12 -0800 [thread overview]
Message-ID: <499DB1DC.1020704@goop.org> (raw)
In-Reply-To: <18834.63479.700909.321686@mariner.uk.xensource.com>
Added cc:
Ian Jackson wrote:
> Anders Kaseorg writes ("Re: Serial console hangs with Linux 2.6.20 HVM guest"):
>
>> Yes, I took Linux v2.6.20 on amd64, ran `make defconfig`, then ran `make
>> menuconfig` and turned off CONFIG_HOTPLUG_CPU (Processor type and features
>> ò Support for hot-pluggable CPUs).
>>
>
> Thanks. I think I have tracked down the bug. In
> drivers/serial/8250.c in Linux there are two bugs:
> 1. UART_BUG_TXEN can be spuriously set, due to an IRQ race
> 2. The workaround then applied by the kernel is itself buggy
>
> Anders: can you try two tests for me ? Firstly, in
> serial8250_startup, delete the section which sets UART_BUG_TXEN (see
> 2nd patch below); I think this will fix the symptoms for you.
> Secondly, in serial8250_start_tx delete the read from the IIR and the
> relevant branch of the text (see 3rd patch below); I think this will
> also in itself fix your symptoms. I haven't compiled either patch (so
> you may find that eg I missed deleting some variables).
>
>
> The bugs in detail (this discussion applies to 2.6.20 and also to
> 2.6.28.4):
>
> 1. The hunk of serial8250_startup I quote below attempts to discover
> whether writing the IER re-asserts the THRI (transmit ready)
> interrupt. However the spinlock that it has taken out,
> port->lock, is not the one that the IRQ service routine takes
> before reading the IIR (i->lock). As a result, on an SMP system
> the generated interrupt races with the straight-line code in
> serial8250_startup.
>
> If serial8250_startup loses the race (perhaps because the system
> is a VM and its VCPU got preempted), UART_BUG_TXEN is spuriously
> added to bugs. This is quite unlikely in a normal system but in
> certain Xen configurations, particularly ones where there is CPU
> pressure, we may lose the race every time.
>
> It is not exactly clear to me how this ought to be resolved. One
> possibility is that the UART_BUG_TXEN problem might be worked
> around perfectly well by the new and very similar workaround
> UART_BUG_THRE[1] in 2.6.21ish in which case it could just be
> removed.
>
> 2. UART_BUG_TXEN's workaround appears to be intended to be harmless.
> However what it actually does is to read the IIR, thus clearing
> any actual interrupt (including incidentally non-THRI), and then
> only perform the intended servicing if the interrupt was _not_
> asserted. That is, it breaks on any serial port with the bug.
>
> As far as I can see there is not much use in UART_BUG_TXEN reading
> IIR at all, so a suitable change if we want to keep UART_BUG_TXEN
> might be the first patch I enclose below (again, not compiled
> or tested).
>
> If UART_BUG_TXEN is retained something along these lines should be
> done at the very least.
>
> Ian.
>
> [1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=40b36daad0ac704e6d5c1b75789f371ef5b053c1
> in which case UART
>
>
> Proposed initial band-aid fix (against 2.6.28.4):
>
>
> Do not read IIR in serial8250_start_tx when UART_BUG_TXEN
>
> Reading the IIR clears some oustanding interrupts so it is not safe.
> Instead, simply transmit immediately if the buffer is empty without
> regard to IIR.
>
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
>
> --- ../linux-2.6.28.4/drivers/serial/8250.c~ 2009-02-06 21:47:45.000000000 +0000
> +++ ../linux-2.6.28.4/drivers/serial/8250.c 2009-02-11 15:55:24.000000000 +0000
> @@ -1257,14 +1257,12 @@
> serial_out(up, UART_IER, up->ier);
>
> if (up->bugs & UART_BUG_TXEN) {
> - unsigned char lsr, iir;
> + unsigned char lsr;
> lsr = serial_in(up, UART_LSR);
> up->lsr_saved_flags |= lsr & LSR_SAVE_FLAGS;
> - iir = serial_in(up, UART_IIR) & 0x0f;
> if ((up->port.type == PORT_RM9000) ?
> - (lsr & UART_LSR_THRE &&
> - (iir == UART_IIR_NO_INT || iir == UART_IIR_THRI)) :
> - (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT))
> + (lsr & UART_LSR_THRE) :
> + (lsr & UART_LSR_TEMT))
> transmit_chars(up);
> }
> }
>
>
> Anders - first patch to try (against 2.6.20):
>
> --- drivers/serial/8250.c~ 2007-02-04 18:44:54.000000000 +0000
> +++ drivers/serial/8250.c 2009-02-11 15:39:43.000000000 +0000
> @@ -1645,25 +1645,6 @@
>
> serial8250_set_mctrl(&up->port, up->port.mctrl);
>
> - /*
> - * Do a quick test to see if we receive an
> - * interrupt when we enable the TX irq.
> - */
> - serial_outp(up, UART_IER, UART_IER_THRI);
> - lsr = serial_in(up, UART_LSR);
> - iir = serial_in(up, UART_IIR);
> - serial_outp(up, UART_IER, 0);
> -
> - if (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT) {
> - if (!(up->bugs & UART_BUG_TXEN)) {
> - up->bugs |= UART_BUG_TXEN;
> - pr_debug("ttyS%d - enabling bad tx status workarounds\n",
> - port->line);
> - }
> - } else {
> - up->bugs &= ~UART_BUG_TXEN;
> - }
> -
> spin_unlock_irqrestore(&up->port.lock, flags);
>
> /*
>
>
> Anders - second patch to try (against 2.6.20):
> Fix should be suitable for distribution IMO.
>
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
>
> --- drivers/serial/8250.c~ 2007-02-04 18:44:54.000000000 +0000
> +++ drivers/serial/8250.c 2009-02-11 15:41:51.000000000 +0000
> @@ -1136,10 +1136,9 @@
> serial_out(up, UART_IER, up->ier);
>
> if (up->bugs & UART_BUG_TXEN) {
> - unsigned char lsr, iir;
> + unsigned char lsr;
> lsr = serial_in(up, UART_LSR);
> - iir = serial_in(up, UART_IIR);
> - if (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT)
> + if (lsr & UART_LSR_TEMT)
> transmit_chars(up);
> }
> }
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
next prev parent reply other threads:[~2009-02-19 19:24 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-02-05 2:23 Serial console hangs with Linux 2.6.20 HVM guest Anders Kaseorg
2009-02-05 17:04 ` Ian Jackson
2009-02-05 19:34 ` Anders Kaseorg
2009-02-05 21:52 ` Anders Kaseorg
2009-02-10 15:34 ` Ian Jackson
2009-02-10 18:20 ` Anders Kaseorg
2009-02-11 16:08 ` [PATCH] IRQ handling race and spurious IIR read in serial/8250.c Ian Jackson
2009-02-19 17:52 ` Ian Jackson
2009-02-19 17:52 ` Ian Jackson
2009-02-19 18:37 ` [Xen-devel] " Markus Armbruster
2009-02-19 18:37 ` Markus Armbruster
2009-02-19 19:24 ` Jeremy Fitzhardinge [this message]
2009-02-09 17:57 ` Serial console hangs with Linux 2.6.20 HVM guest Ian Jackson
2009-02-09 18:13 ` Anders Kaseorg
-- strict thread matches above, loose matches on Subject: below --
2009-03-12 17:57 [PATCH] IRQ handling race and spurious IIR read in serial/8250.c Markus Armbruster
2009-03-12 17:57 ` Markus Armbruster
2009-03-12 18:09 ` Alan Cox
2009-03-12 18:09 ` Alan Cox
2009-03-12 19:30 ` Ian Jackson
2009-03-12 19:30 ` Ian Jackson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=499DB1DC.1020704@goop.org \
--to=jeremy@goop.org \
--cc=Ian.Jackson@eu.citrix.com \
--cc=akpm@linux-foundation.org \
--cc=andersk@MIT.EDU \
--cc=linux-kernel@vger.kernel.org \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.