xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
@ 2013-01-16 20:20 Ben Guthro
  2013-01-16 21:24 ` Pasi Kärkkäinen
  0 siblings, 1 reply; 15+ messages in thread
From: Ben Guthro @ 2013-01-16 20:20 UTC (permalink / raw)
  To: xen-devel; +Cc: Ben Guthro

Check for ioport access, before fully resuming operation, to avoid
spinning in __ns16550_poll when reading the LSR register returns 0xFF
on failing ioport access.

On some systems, there is a SuperIO card that provides this legacy ioport
on the LPC bus.

In this case, we need to wait for dom0's ACPI processing to run the proper
AML to re-initialize the chip, before we can use the card again.

This may cause a small amount of garbage to be written
to the serial log while we wait patiently for that AML to
be executed.

It limits the number of retries, to avoid a situation where we keep trying
over and over again, in the case of some other failure on the ioport.

Signed-Off-By: Ben Guthro <benjamin.guthro@citrix.com>
---
 xen/drivers/char/ns16550.c |   56 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 55 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
index d77042e..27555c7 100644
--- a/xen/drivers/char/ns16550.c
+++ b/xen/drivers/char/ns16550.c
@@ -42,6 +42,7 @@ static struct ns16550 {
     struct irqaction irqaction;
     /* UART with no IRQ line: periodically-polled I/O. */
     struct timer timer;
+    struct timer resume_timer;
     unsigned int timeout_ms;
     bool_t intr_works;
     /* PCI card parameters. */
@@ -120,6 +121,10 @@ static struct ns16550 {
 /* Frequency of external clock source. This definition assumes PC platform. */
 #define UART_CLOCK_HZ   1843200
 
+/* Resume retry settings */
+#define RESUME_DELAY    MILLISECS(10)
+#define RESUME_RETRIES  100
+
 static char ns_read_reg(struct ns16550 *uart, int reg)
 {
     if ( uart->remapped_io_base == NULL )
@@ -330,7 +335,7 @@ static void ns16550_suspend(struct serial_port *port)
                                   uart->ps_bdf[2], PCI_COMMAND);
 }
 
-static void ns16550_resume(struct serial_port *port)
+static void __ns16550_resume(struct serial_port *port)
 {
     struct ns16550 *uart = port->uart;
 
@@ -346,6 +351,55 @@ static void ns16550_resume(struct serial_port *port)
     ns16550_setup_postirq(port->uart);
 }
 
+static int ns16550_ioport_invalid(struct ns16550 *uart)
+{
+    return ((((unsigned char)ns_read_reg(uart, LSR)) == 0xff) &&
+            (((unsigned char)ns_read_reg(uart, MCR)) == 0xff) &&
+            (((unsigned char)ns_read_reg(uart, IER)) == 0xff) &&
+            (((unsigned char)ns_read_reg(uart, IIR)) == 0xff) &&
+            (((unsigned char)ns_read_reg(uart, LCR)) == 0xff));
+}
+
+static int delayed_resume_tries;
+static void ns16550_delayed_resume(void *data)
+{
+    struct serial_port *port = data;
+    struct ns16550 *uart = port->uart;
+
+    if (ns16550_ioport_invalid(port->uart) && delayed_resume_tries--) {
+	set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
+	return;
+    }
+
+    __ns16550_resume(port);
+}
+
+static void ns16550_resume(struct serial_port *port)
+{
+    struct ns16550 *uart = port->uart;
+
+    /*
+     * Check for ioport access, before fully resuming operation.
+     * On some systems, there is a SuperIO card that provides
+     * this legacy ioport on the LPC bus.
+     *
+     * We need to wait for dom0's ACPI processing to run the proper
+     * AML to re-initialize the chip, before we can use the card again.
+     *
+     * This may cause a small amount of garbage to be written
+     * to the serial log while we wait patiently for that AML to
+     * be executed.
+     */
+    if (ns16550_ioport_invalid(uart)) {
+	delayed_resume_tries = RESUME_RETRIES;
+	init_timer(&uart->resume_timer, ns16550_delayed_resume, port, 0);
+	set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
+	return;
+    }
+
+    __ns16550_resume(port);
+}
+
 #ifdef CONFIG_X86
 static void __init ns16550_endboot(struct serial_port *port)
 {
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
  2013-01-16 20:20 [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run Ben Guthro
@ 2013-01-16 21:24 ` Pasi Kärkkäinen
  2013-01-16 21:31   ` Ben Guthro
  0 siblings, 1 reply; 15+ messages in thread
From: Pasi Kärkkäinen @ 2013-01-16 21:24 UTC (permalink / raw)
  To: Ben Guthro; +Cc: xen-devel

On Wed, Jan 16, 2013 at 03:20:56PM -0500, Ben Guthro wrote:
> Check for ioport access, before fully resuming operation, to avoid
> spinning in __ns16550_poll when reading the LSR register returns 0xFF
> on failing ioport access.
> 
> On some systems, there is a SuperIO card that provides this legacy ioport
> on the LPC bus.
> 
> In this case, we need to wait for dom0's ACPI processing to run the proper
> AML to re-initialize the chip, before we can use the card again.
> 
> This may cause a small amount of garbage to be written
> to the serial log while we wait patiently for that AML to
> be executed.
> 
> It limits the number of retries, to avoid a situation where we keep trying
> over and over again, in the case of some other failure on the ioport.
> 

So I assume this patch fixes the ACPI S3 suspend/resume on the Lenovo T430/T530 laptops?
It might be good to mention that aswell..

-- Pasi


> Signed-Off-By: Ben Guthro <benjamin.guthro@citrix.com>
> ---
>  xen/drivers/char/ns16550.c |   56 +++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 55 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
> index d77042e..27555c7 100644
> --- a/xen/drivers/char/ns16550.c
> +++ b/xen/drivers/char/ns16550.c
> @@ -42,6 +42,7 @@ static struct ns16550 {
>      struct irqaction irqaction;
>      /* UART with no IRQ line: periodically-polled I/O. */
>      struct timer timer;
> +    struct timer resume_timer;
>      unsigned int timeout_ms;
>      bool_t intr_works;
>      /* PCI card parameters. */
> @@ -120,6 +121,10 @@ static struct ns16550 {
>  /* Frequency of external clock source. This definition assumes PC platform. */
>  #define UART_CLOCK_HZ   1843200
>  
> +/* Resume retry settings */
> +#define RESUME_DELAY    MILLISECS(10)
> +#define RESUME_RETRIES  100
> +
>  static char ns_read_reg(struct ns16550 *uart, int reg)
>  {
>      if ( uart->remapped_io_base == NULL )
> @@ -330,7 +335,7 @@ static void ns16550_suspend(struct serial_port *port)
>                                    uart->ps_bdf[2], PCI_COMMAND);
>  }
>  
> -static void ns16550_resume(struct serial_port *port)
> +static void __ns16550_resume(struct serial_port *port)
>  {
>      struct ns16550 *uart = port->uart;
>  
> @@ -346,6 +351,55 @@ static void ns16550_resume(struct serial_port *port)
>      ns16550_setup_postirq(port->uart);
>  }
>  
> +static int ns16550_ioport_invalid(struct ns16550 *uart)
> +{
> +    return ((((unsigned char)ns_read_reg(uart, LSR)) == 0xff) &&
> +            (((unsigned char)ns_read_reg(uart, MCR)) == 0xff) &&
> +            (((unsigned char)ns_read_reg(uart, IER)) == 0xff) &&
> +            (((unsigned char)ns_read_reg(uart, IIR)) == 0xff) &&
> +            (((unsigned char)ns_read_reg(uart, LCR)) == 0xff));
> +}
> +
> +static int delayed_resume_tries;
> +static void ns16550_delayed_resume(void *data)
> +{
> +    struct serial_port *port = data;
> +    struct ns16550 *uart = port->uart;
> +
> +    if (ns16550_ioport_invalid(port->uart) && delayed_resume_tries--) {
> +	set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
> +	return;
> +    }
> +
> +    __ns16550_resume(port);
> +}
> +
> +static void ns16550_resume(struct serial_port *port)
> +{
> +    struct ns16550 *uart = port->uart;
> +
> +    /*
> +     * Check for ioport access, before fully resuming operation.
> +     * On some systems, there is a SuperIO card that provides
> +     * this legacy ioport on the LPC bus.
> +     *
> +     * We need to wait for dom0's ACPI processing to run the proper
> +     * AML to re-initialize the chip, before we can use the card again.
> +     *
> +     * This may cause a small amount of garbage to be written
> +     * to the serial log while we wait patiently for that AML to
> +     * be executed.
> +     */
> +    if (ns16550_ioport_invalid(uart)) {
> +	delayed_resume_tries = RESUME_RETRIES;
> +	init_timer(&uart->resume_timer, ns16550_delayed_resume, port, 0);
> +	set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
> +	return;
> +    }
> +
> +    __ns16550_resume(port);
> +}
> +
>  #ifdef CONFIG_X86
>  static void __init ns16550_endboot(struct serial_port *port)
>  {
> -- 
> 1.7.9.5
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
  2013-01-16 21:24 ` Pasi Kärkkäinen
@ 2013-01-16 21:31   ` Ben Guthro
  2013-01-16 21:40     ` Malcolm Crossley
  0 siblings, 1 reply; 15+ messages in thread
From: Ben Guthro @ 2013-01-16 21:31 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: xen-devel@lists.xen.org, Ben Guthro



On 01/16/2013 04:24 PM, Pasi Kärkkäinen wrote:
> On Wed, Jan 16, 2013 at 03:20:56PM -0500, Ben Guthro wrote:
>> Check for ioport access, before fully resuming operation, to avoid
>> spinning in __ns16550_poll when reading the LSR register returns 0xFF
>> on failing ioport access.
>>
>> On some systems, there is a SuperIO card that provides this legacy ioport
>> on the LPC bus.
>>
>> In this case, we need to wait for dom0's ACPI processing to run the proper
>> AML to re-initialize the chip, before we can use the card again.
>>
>> This may cause a small amount of garbage to be written
>> to the serial log while we wait patiently for that AML to
>> be executed.
>>
>> It limits the number of retries, to avoid a situation where we keep trying
>> over and over again, in the case of some other failure on the ioport.
>>
>
> So I assume this patch fixes the ACPI S3 suspend/resume on the Lenovo T430/T530 laptops?
> It might be good to mention that aswell..

Yes, it seems to resolve the issues seen on T430, T530, and Intel 
Ivybridge mobile SDP. It likely fixes others, as the Intel mobile SDP is 
the reference platform for all the OEMs.

If I need to re-submit this patch for any other reason, I will be sure 
to mention this in the comment.

That said - there seems to still be an issue on some older Sandy Bridge 
machines, like a Lenovo T520, and X220 - but I don't yet know if it is 
related.

Ben



>
> -- Pasi
>
>
>> Signed-Off-By: Ben Guthro <benjamin.guthro@citrix.com>
>> ---
>>   xen/drivers/char/ns16550.c |   56 +++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 55 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
>> index d77042e..27555c7 100644
>> --- a/xen/drivers/char/ns16550.c
>> +++ b/xen/drivers/char/ns16550.c
>> @@ -42,6 +42,7 @@ static struct ns16550 {
>>       struct irqaction irqaction;
>>       /* UART with no IRQ line: periodically-polled I/O. */
>>       struct timer timer;
>> +    struct timer resume_timer;
>>       unsigned int timeout_ms;
>>       bool_t intr_works;
>>       /* PCI card parameters. */
>> @@ -120,6 +121,10 @@ static struct ns16550 {
>>   /* Frequency of external clock source. This definition assumes PC platform. */
>>   #define UART_CLOCK_HZ   1843200
>>
>> +/* Resume retry settings */
>> +#define RESUME_DELAY    MILLISECS(10)
>> +#define RESUME_RETRIES  100
>> +
>>   static char ns_read_reg(struct ns16550 *uart, int reg)
>>   {
>>       if ( uart->remapped_io_base == NULL )
>> @@ -330,7 +335,7 @@ static void ns16550_suspend(struct serial_port *port)
>>                                     uart->ps_bdf[2], PCI_COMMAND);
>>   }
>>
>> -static void ns16550_resume(struct serial_port *port)
>> +static void __ns16550_resume(struct serial_port *port)
>>   {
>>       struct ns16550 *uart = port->uart;
>>
>> @@ -346,6 +351,55 @@ static void ns16550_resume(struct serial_port *port)
>>       ns16550_setup_postirq(port->uart);
>>   }
>>
>> +static int ns16550_ioport_invalid(struct ns16550 *uart)
>> +{
>> +    return ((((unsigned char)ns_read_reg(uart, LSR)) == 0xff) &&
>> +            (((unsigned char)ns_read_reg(uart, MCR)) == 0xff) &&
>> +            (((unsigned char)ns_read_reg(uart, IER)) == 0xff) &&
>> +            (((unsigned char)ns_read_reg(uart, IIR)) == 0xff) &&
>> +            (((unsigned char)ns_read_reg(uart, LCR)) == 0xff));
>> +}
>> +
>> +static int delayed_resume_tries;
>> +static void ns16550_delayed_resume(void *data)
>> +{
>> +    struct serial_port *port = data;
>> +    struct ns16550 *uart = port->uart;
>> +
>> +    if (ns16550_ioport_invalid(port->uart) && delayed_resume_tries--) {
>> +	set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
>> +	return;
>> +    }
>> +
>> +    __ns16550_resume(port);
>> +}
>> +
>> +static void ns16550_resume(struct serial_port *port)
>> +{
>> +    struct ns16550 *uart = port->uart;
>> +
>> +    /*
>> +     * Check for ioport access, before fully resuming operation.
>> +     * On some systems, there is a SuperIO card that provides
>> +     * this legacy ioport on the LPC bus.
>> +     *
>> +     * We need to wait for dom0's ACPI processing to run the proper
>> +     * AML to re-initialize the chip, before we can use the card again.
>> +     *
>> +     * This may cause a small amount of garbage to be written
>> +     * to the serial log while we wait patiently for that AML to
>> +     * be executed.
>> +     */
>> +    if (ns16550_ioport_invalid(uart)) {
>> +	delayed_resume_tries = RESUME_RETRIES;
>> +	init_timer(&uart->resume_timer, ns16550_delayed_resume, port, 0);
>> +	set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
>> +	return;
>> +    }
>> +
>> +    __ns16550_resume(port);
>> +}
>> +
>>   #ifdef CONFIG_X86
>>   static void __init ns16550_endboot(struct serial_port *port)
>>   {
>> --
>> 1.7.9.5
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
  2013-01-16 21:31   ` Ben Guthro
@ 2013-01-16 21:40     ` Malcolm Crossley
  2013-01-16 21:48       ` Ben Guthro
  2013-01-16 21:52       ` Pasi Kärkkäinen
  0 siblings, 2 replies; 15+ messages in thread
From: Malcolm Crossley @ 2013-01-16 21:40 UTC (permalink / raw)
  To: xen-devel

On 16/01/13 21:31, Ben Guthro wrote:
>
> On 01/16/2013 04:24 PM, Pasi Kärkkäinen wrote:
>> On Wed, Jan 16, 2013 at 03:20:56PM -0500, Ben Guthro wrote:
>>> Check for ioport access, before fully resuming operation, to avoid
>>> spinning in __ns16550_poll when reading the LSR register returns 0xFF
>>> on failing ioport access.
>>>
>>> On some systems, there is a SuperIO card that provides this legacy ioport
>>> on the LPC bus.
>>>
>>> In this case, we need to wait for dom0's ACPI processing to run the proper
>>> AML to re-initialize the chip, before we can use the card again.
>>>
>>> This may cause a small amount of garbage to be written
>>> to the serial log while we wait patiently for that AML to
>>> be executed.
>>>
>>> It limits the number of retries, to avoid a situation where we keep trying
>>> over and over again, in the case of some other failure on the ioport.
>>>
>> So I assume this patch fixes the ACPI S3 suspend/resume on the Lenovo T430/T530 laptops?
>> It might be good to mention that aswell..
> Yes, it seems to resolve the issues seen on T430, T530, and Intel
> Ivybridge mobile SDP. It likely fixes others, as the Intel mobile SDP is
> the reference platform for all the OEMs.
>
> If I need to re-submit this patch for any other reason, I will be sure
> to mention this in the comment.
>
> That said - there seems to still be an issue on some older Sandy Bridge
> machines, like a Lenovo T520, and X220 - but I don't yet know if it is
> related.

Do these laptops (T430/T530) have built in serial?

Or is the hang fixed because you were using a serial Express card to debug?

BTW, nearly all PC's have external SUPERIO devices, it just seems we 
have managed to be lucky that most platforms seem to default to using 
the BIOS to enable the serial port after resume instead of the OS.

Malcolm
> Ben
>
>
>
>> -- Pasi
>>
>>
>>> Signed-Off-By: Ben Guthro <benjamin.guthro@citrix.com>
>>> ---
>>>    xen/drivers/char/ns16550.c |   56 +++++++++++++++++++++++++++++++++++++++++++-
>>>    1 file changed, 55 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
>>> index d77042e..27555c7 100644
>>> --- a/xen/drivers/char/ns16550.c
>>> +++ b/xen/drivers/char/ns16550.c
>>> @@ -42,6 +42,7 @@ static struct ns16550 {
>>>        struct irqaction irqaction;
>>>        /* UART with no IRQ line: periodically-polled I/O. */
>>>        struct timer timer;
>>> +    struct timer resume_timer;
>>>        unsigned int timeout_ms;
>>>        bool_t intr_works;
>>>        /* PCI card parameters. */
>>> @@ -120,6 +121,10 @@ static struct ns16550 {
>>>    /* Frequency of external clock source. This definition assumes PC platform. */
>>>    #define UART_CLOCK_HZ   1843200
>>>
>>> +/* Resume retry settings */
>>> +#define RESUME_DELAY    MILLISECS(10)
>>> +#define RESUME_RETRIES  100
>>> +
>>>    static char ns_read_reg(struct ns16550 *uart, int reg)
>>>    {
>>>        if ( uart->remapped_io_base == NULL )
>>> @@ -330,7 +335,7 @@ static void ns16550_suspend(struct serial_port *port)
>>>                                      uart->ps_bdf[2], PCI_COMMAND);
>>>    }
>>>
>>> -static void ns16550_resume(struct serial_port *port)
>>> +static void __ns16550_resume(struct serial_port *port)
>>>    {
>>>        struct ns16550 *uart = port->uart;
>>>
>>> @@ -346,6 +351,55 @@ static void ns16550_resume(struct serial_port *port)
>>>        ns16550_setup_postirq(port->uart);
>>>    }
>>>
>>> +static int ns16550_ioport_invalid(struct ns16550 *uart)
>>> +{
>>> +    return ((((unsigned char)ns_read_reg(uart, LSR)) == 0xff) &&
>>> +            (((unsigned char)ns_read_reg(uart, MCR)) == 0xff) &&
>>> +            (((unsigned char)ns_read_reg(uart, IER)) == 0xff) &&
>>> +            (((unsigned char)ns_read_reg(uart, IIR)) == 0xff) &&
>>> +            (((unsigned char)ns_read_reg(uart, LCR)) == 0xff));
>>> +}
>>> +
>>> +static int delayed_resume_tries;
>>> +static void ns16550_delayed_resume(void *data)
>>> +{
>>> +    struct serial_port *port = data;
>>> +    struct ns16550 *uart = port->uart;
>>> +
>>> +    if (ns16550_ioport_invalid(port->uart) && delayed_resume_tries--) {
>>> +	set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
>>> +	return;
>>> +    }
>>> +
>>> +    __ns16550_resume(port);
>>> +}
>>> +
>>> +static void ns16550_resume(struct serial_port *port)
>>> +{
>>> +    struct ns16550 *uart = port->uart;
>>> +
>>> +    /*
>>> +     * Check for ioport access, before fully resuming operation.
>>> +     * On some systems, there is a SuperIO card that provides
>>> +     * this legacy ioport on the LPC bus.
>>> +     *
>>> +     * We need to wait for dom0's ACPI processing to run the proper
>>> +     * AML to re-initialize the chip, before we can use the card again.
>>> +     *
>>> +     * This may cause a small amount of garbage to be written
>>> +     * to the serial log while we wait patiently for that AML to
>>> +     * be executed.
>>> +     */
>>> +    if (ns16550_ioport_invalid(uart)) {
>>> +	delayed_resume_tries = RESUME_RETRIES;
>>> +	init_timer(&uart->resume_timer, ns16550_delayed_resume, port, 0);
>>> +	set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
>>> +	return;
>>> +    }
>>> +
>>> +    __ns16550_resume(port);
>>> +}
>>> +
>>>    #ifdef CONFIG_X86
>>>    static void __init ns16550_endboot(struct serial_port *port)
>>>    {
>>> --
>>> 1.7.9.5
>>>
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@lists.xen.org
>>> http://lists.xen.org/xen-devel
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
  2013-01-16 21:40     ` Malcolm Crossley
@ 2013-01-16 21:48       ` Ben Guthro
  2013-01-16 21:54         ` Pasi Kärkkäinen
  2013-01-17 11:09         ` Jan Beulich
  2013-01-16 21:52       ` Pasi Kärkkäinen
  1 sibling, 2 replies; 15+ messages in thread
From: Ben Guthro @ 2013-01-16 21:48 UTC (permalink / raw)
  To: Malcolm Crossley; +Cc: xen-devel

On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley
<malcolm.crossley@citrix.com> wrote:

> Do these laptops (T430/T530) have built in serial?

They seem to have the hardware for it, but no actual serial connector
out of the machine.
This hardware provides the legacy port that Xen initializes in
xen/arch/x86/setup.c __start_xen()

When the resume happened, it was getting stuck in __ns16550_poll()
because it thought that the
LSR register was 0xFF - and had lots of data to read. It got stuck in
that while loop, and never
exited.


>
> Or is the hang fixed because you were using a serial Express card to debug?
>
> BTW, nearly all PC's have external SUPERIO devices, it just seems we have
> managed to be lucky that most platforms seem to default to using the BIOS to
> enable the serial port after resume instead of the OS.
>
> Malcolm
>
>> Ben
>>
>>
>>
>>> -- Pasi
>>>
>>>
>>>> Signed-Off-By: Ben Guthro <benjamin.guthro@citrix.com>
>>>> ---
>>>>    xen/drivers/char/ns16550.c |   56
>>>> +++++++++++++++++++++++++++++++++++++++++++-
>>>>    1 file changed, 55 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
>>>> index d77042e..27555c7 100644
>>>> --- a/xen/drivers/char/ns16550.c
>>>> +++ b/xen/drivers/char/ns16550.c
>>>> @@ -42,6 +42,7 @@ static struct ns16550 {
>>>>        struct irqaction irqaction;
>>>>        /* UART with no IRQ line: periodically-polled I/O. */
>>>>        struct timer timer;
>>>> +    struct timer resume_timer;
>>>>        unsigned int timeout_ms;
>>>>        bool_t intr_works;
>>>>        /* PCI card parameters. */
>>>> @@ -120,6 +121,10 @@ static struct ns16550 {
>>>>    /* Frequency of external clock source. This definition assumes PC
>>>> platform. */
>>>>    #define UART_CLOCK_HZ   1843200
>>>>
>>>> +/* Resume retry settings */
>>>> +#define RESUME_DELAY    MILLISECS(10)
>>>> +#define RESUME_RETRIES  100
>>>> +
>>>>    static char ns_read_reg(struct ns16550 *uart, int reg)
>>>>    {
>>>>        if ( uart->remapped_io_base == NULL )
>>>> @@ -330,7 +335,7 @@ static void ns16550_suspend(struct serial_port
>>>> *port)
>>>>                                      uart->ps_bdf[2], PCI_COMMAND);
>>>>    }
>>>>
>>>> -static void ns16550_resume(struct serial_port *port)
>>>> +static void __ns16550_resume(struct serial_port *port)
>>>>    {
>>>>        struct ns16550 *uart = port->uart;
>>>>
>>>> @@ -346,6 +351,55 @@ static void ns16550_resume(struct serial_port
>>>> *port)
>>>>        ns16550_setup_postirq(port->uart);
>>>>    }
>>>>
>>>> +static int ns16550_ioport_invalid(struct ns16550 *uart)
>>>> +{
>>>> +    return ((((unsigned char)ns_read_reg(uart, LSR)) == 0xff) &&
>>>> +            (((unsigned char)ns_read_reg(uart, MCR)) == 0xff) &&
>>>> +            (((unsigned char)ns_read_reg(uart, IER)) == 0xff) &&
>>>> +            (((unsigned char)ns_read_reg(uart, IIR)) == 0xff) &&
>>>> +            (((unsigned char)ns_read_reg(uart, LCR)) == 0xff));
>>>> +}
>>>> +
>>>> +static int delayed_resume_tries;
>>>> +static void ns16550_delayed_resume(void *data)
>>>> +{
>>>> +    struct serial_port *port = data;
>>>> +    struct ns16550 *uart = port->uart;
>>>> +
>>>> +    if (ns16550_ioport_invalid(port->uart) && delayed_resume_tries--) {
>>>> +       set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
>>>> +       return;
>>>> +    }
>>>> +
>>>> +    __ns16550_resume(port);
>>>> +}
>>>> +
>>>> +static void ns16550_resume(struct serial_port *port)
>>>> +{
>>>> +    struct ns16550 *uart = port->uart;
>>>> +
>>>> +    /*
>>>> +     * Check for ioport access, before fully resuming operation.
>>>> +     * On some systems, there is a SuperIO card that provides
>>>> +     * this legacy ioport on the LPC bus.
>>>> +     *
>>>> +     * We need to wait for dom0's ACPI processing to run the proper
>>>> +     * AML to re-initialize the chip, before we can use the card again.
>>>> +     *
>>>> +     * This may cause a small amount of garbage to be written
>>>> +     * to the serial log while we wait patiently for that AML to
>>>> +     * be executed.
>>>> +     */
>>>> +    if (ns16550_ioport_invalid(uart)) {
>>>> +       delayed_resume_tries = RESUME_RETRIES;
>>>> +       init_timer(&uart->resume_timer, ns16550_delayed_resume, port,
>>>> 0);
>>>> +       set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
>>>> +       return;
>>>> +    }
>>>> +
>>>> +    __ns16550_resume(port);
>>>> +}
>>>> +
>>>>    #ifdef CONFIG_X86
>>>>    static void __init ns16550_endboot(struct serial_port *port)
>>>>    {
>>>> --
>>>> 1.7.9.5
>>>>
>>>>
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel@lists.xen.org
>>>> http://lists.xen.org/xen-devel
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
  2013-01-16 21:40     ` Malcolm Crossley
  2013-01-16 21:48       ` Ben Guthro
@ 2013-01-16 21:52       ` Pasi Kärkkäinen
  1 sibling, 0 replies; 15+ messages in thread
From: Pasi Kärkkäinen @ 2013-01-16 21:52 UTC (permalink / raw)
  To: Malcolm Crossley; +Cc: xen-devel

On Wed, Jan 16, 2013 at 09:40:40PM +0000, Malcolm Crossley wrote:
> On 16/01/13 21:31, Ben Guthro wrote:
> >
> >On 01/16/2013 04:24 PM, Pasi Kärkkäinen wrote:
> >>On Wed, Jan 16, 2013 at 03:20:56PM -0500, Ben Guthro wrote:
> >>>Check for ioport access, before fully resuming operation, to avoid
> >>>spinning in __ns16550_poll when reading the LSR register returns 0xFF
> >>>on failing ioport access.
> >>>
> >>>On some systems, there is a SuperIO card that provides this legacy ioport
> >>>on the LPC bus.
> >>>
> >>>In this case, we need to wait for dom0's ACPI processing to run the proper
> >>>AML to re-initialize the chip, before we can use the card again.
> >>>
> >>>This may cause a small amount of garbage to be written
> >>>to the serial log while we wait patiently for that AML to
> >>>be executed.
> >>>
> >>>It limits the number of retries, to avoid a situation where we keep trying
> >>>over and over again, in the case of some other failure on the ioport.
> >>>
> >>So I assume this patch fixes the ACPI S3 suspend/resume on the Lenovo T430/T530 laptops?
> >>It might be good to mention that aswell..
> >Yes, it seems to resolve the issues seen on T430, T530, and Intel
> >Ivybridge mobile SDP. It likely fixes others, as the Intel mobile SDP is
> >the reference platform for all the OEMs.
> >
> >If I need to re-submit this patch for any other reason, I will be sure
> >to mention this in the comment.
> >
> >That said - there seems to still be an issue on some older Sandy Bridge
> >machines, like a Lenovo T520, and X220 - but I don't yet know if it is
> >related.
> 
> Do these laptops (T430/T530) have built in serial?
> 
> Or is the hang fixed because you were using a serial Express card to debug?
>

At least my T430 laptop is "Intel Core i7 vPro" model, so it has the 
Intel AMT (Active Management Technology), which provides Serial Over LAN (SOL) port.

-- Pasi

 
> BTW, nearly all PC's have external SUPERIO devices, it just seems we
> have managed to be lucky that most platforms seem to default to
> using the BIOS to enable the serial port after resume instead of the
> OS.
> 
> Malcolm
> >Ben
> >
> >
> >
> >>-- Pasi
> >>
> >>
> >>>Signed-Off-By: Ben Guthro <benjamin.guthro@citrix.com>
> >>>---
> >>>   xen/drivers/char/ns16550.c |   56 +++++++++++++++++++++++++++++++++++++++++++-
> >>>   1 file changed, 55 insertions(+), 1 deletion(-)
> >>>
> >>>diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
> >>>index d77042e..27555c7 100644
> >>>--- a/xen/drivers/char/ns16550.c
> >>>+++ b/xen/drivers/char/ns16550.c
> >>>@@ -42,6 +42,7 @@ static struct ns16550 {
> >>>       struct irqaction irqaction;
> >>>       /* UART with no IRQ line: periodically-polled I/O. */
> >>>       struct timer timer;
> >>>+    struct timer resume_timer;
> >>>       unsigned int timeout_ms;
> >>>       bool_t intr_works;
> >>>       /* PCI card parameters. */
> >>>@@ -120,6 +121,10 @@ static struct ns16550 {
> >>>   /* Frequency of external clock source. This definition assumes PC platform. */
> >>>   #define UART_CLOCK_HZ   1843200
> >>>
> >>>+/* Resume retry settings */
> >>>+#define RESUME_DELAY    MILLISECS(10)
> >>>+#define RESUME_RETRIES  100
> >>>+
> >>>   static char ns_read_reg(struct ns16550 *uart, int reg)
> >>>   {
> >>>       if ( uart->remapped_io_base == NULL )
> >>>@@ -330,7 +335,7 @@ static void ns16550_suspend(struct serial_port *port)
> >>>                                     uart->ps_bdf[2], PCI_COMMAND);
> >>>   }
> >>>
> >>>-static void ns16550_resume(struct serial_port *port)
> >>>+static void __ns16550_resume(struct serial_port *port)
> >>>   {
> >>>       struct ns16550 *uart = port->uart;
> >>>
> >>>@@ -346,6 +351,55 @@ static void ns16550_resume(struct serial_port *port)
> >>>       ns16550_setup_postirq(port->uart);
> >>>   }
> >>>
> >>>+static int ns16550_ioport_invalid(struct ns16550 *uart)
> >>>+{
> >>>+    return ((((unsigned char)ns_read_reg(uart, LSR)) == 0xff) &&
> >>>+            (((unsigned char)ns_read_reg(uart, MCR)) == 0xff) &&
> >>>+            (((unsigned char)ns_read_reg(uart, IER)) == 0xff) &&
> >>>+            (((unsigned char)ns_read_reg(uart, IIR)) == 0xff) &&
> >>>+            (((unsigned char)ns_read_reg(uart, LCR)) == 0xff));
> >>>+}
> >>>+
> >>>+static int delayed_resume_tries;
> >>>+static void ns16550_delayed_resume(void *data)
> >>>+{
> >>>+    struct serial_port *port = data;
> >>>+    struct ns16550 *uart = port->uart;
> >>>+
> >>>+    if (ns16550_ioport_invalid(port->uart) && delayed_resume_tries--) {
> >>>+	set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
> >>>+	return;
> >>>+    }
> >>>+
> >>>+    __ns16550_resume(port);
> >>>+}
> >>>+
> >>>+static void ns16550_resume(struct serial_port *port)
> >>>+{
> >>>+    struct ns16550 *uart = port->uart;
> >>>+
> >>>+    /*
> >>>+     * Check for ioport access, before fully resuming operation.
> >>>+     * On some systems, there is a SuperIO card that provides
> >>>+     * this legacy ioport on the LPC bus.
> >>>+     *
> >>>+     * We need to wait for dom0's ACPI processing to run the proper
> >>>+     * AML to re-initialize the chip, before we can use the card again.
> >>>+     *
> >>>+     * This may cause a small amount of garbage to be written
> >>>+     * to the serial log while we wait patiently for that AML to
> >>>+     * be executed.
> >>>+     */
> >>>+    if (ns16550_ioport_invalid(uart)) {
> >>>+	delayed_resume_tries = RESUME_RETRIES;
> >>>+	init_timer(&uart->resume_timer, ns16550_delayed_resume, port, 0);
> >>>+	set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
> >>>+	return;
> >>>+    }
> >>>+
> >>>+    __ns16550_resume(port);
> >>>+}
> >>>+
> >>>   #ifdef CONFIG_X86
> >>>   static void __init ns16550_endboot(struct serial_port *port)
> >>>   {
> >>>--
> >>>1.7.9.5
> >>>
> >>>
> >>>_______________________________________________
> >>>Xen-devel mailing list
> >>>Xen-devel@lists.xen.org
> >>>http://lists.xen.org/xen-devel
> >_______________________________________________
> >Xen-devel mailing list
> >Xen-devel@lists.xen.org
> >http://lists.xen.org/xen-devel
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
  2013-01-16 21:48       ` Ben Guthro
@ 2013-01-16 21:54         ` Pasi Kärkkäinen
  2013-01-17 11:09         ` Jan Beulich
  1 sibling, 0 replies; 15+ messages in thread
From: Pasi Kärkkäinen @ 2013-01-16 21:54 UTC (permalink / raw)
  To: Ben Guthro; +Cc: Malcolm Crossley, xen-devel

On Wed, Jan 16, 2013 at 04:48:41PM -0500, Ben Guthro wrote:
> On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley
> <malcolm.crossley@citrix.com> wrote:
> 
> > Do these laptops (T430/T530) have built in serial?
> 
> They seem to have the hardware for it, but no actual serial connector
> out of the machine.
>

I believe you can access the serial port over the network,
thru the AMT management feature.

-- Pasi

> This hardware provides the legacy port that Xen initializes in
> xen/arch/x86/setup.c __start_xen()
> 
> When the resume happened, it was getting stuck in __ns16550_poll()
> because it thought that the
> LSR register was 0xFF - and had lots of data to read. It got stuck in
> that while loop, and never
> exited.
> 
> 
> >
> > Or is the hang fixed because you were using a serial Express card to debug?
> >
> > BTW, nearly all PC's have external SUPERIO devices, it just seems we have
> > managed to be lucky that most platforms seem to default to using the BIOS to
> > enable the serial port after resume instead of the OS.
> >
> > Malcolm
> >
> >> Ben
> >>
> >>
> >>
> >>> -- Pasi
> >>>
> >>>
> >>>> Signed-Off-By: Ben Guthro <benjamin.guthro@citrix.com>
> >>>> ---
> >>>>    xen/drivers/char/ns16550.c |   56
> >>>> +++++++++++++++++++++++++++++++++++++++++++-
> >>>>    1 file changed, 55 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
> >>>> index d77042e..27555c7 100644
> >>>> --- a/xen/drivers/char/ns16550.c
> >>>> +++ b/xen/drivers/char/ns16550.c
> >>>> @@ -42,6 +42,7 @@ static struct ns16550 {
> >>>>        struct irqaction irqaction;
> >>>>        /* UART with no IRQ line: periodically-polled I/O. */
> >>>>        struct timer timer;
> >>>> +    struct timer resume_timer;
> >>>>        unsigned int timeout_ms;
> >>>>        bool_t intr_works;
> >>>>        /* PCI card parameters. */
> >>>> @@ -120,6 +121,10 @@ static struct ns16550 {
> >>>>    /* Frequency of external clock source. This definition assumes PC
> >>>> platform. */
> >>>>    #define UART_CLOCK_HZ   1843200
> >>>>
> >>>> +/* Resume retry settings */
> >>>> +#define RESUME_DELAY    MILLISECS(10)
> >>>> +#define RESUME_RETRIES  100
> >>>> +
> >>>>    static char ns_read_reg(struct ns16550 *uart, int reg)
> >>>>    {
> >>>>        if ( uart->remapped_io_base == NULL )
> >>>> @@ -330,7 +335,7 @@ static void ns16550_suspend(struct serial_port
> >>>> *port)
> >>>>                                      uart->ps_bdf[2], PCI_COMMAND);
> >>>>    }
> >>>>
> >>>> -static void ns16550_resume(struct serial_port *port)
> >>>> +static void __ns16550_resume(struct serial_port *port)
> >>>>    {
> >>>>        struct ns16550 *uart = port->uart;
> >>>>
> >>>> @@ -346,6 +351,55 @@ static void ns16550_resume(struct serial_port
> >>>> *port)
> >>>>        ns16550_setup_postirq(port->uart);
> >>>>    }
> >>>>
> >>>> +static int ns16550_ioport_invalid(struct ns16550 *uart)
> >>>> +{
> >>>> +    return ((((unsigned char)ns_read_reg(uart, LSR)) == 0xff) &&
> >>>> +            (((unsigned char)ns_read_reg(uart, MCR)) == 0xff) &&
> >>>> +            (((unsigned char)ns_read_reg(uart, IER)) == 0xff) &&
> >>>> +            (((unsigned char)ns_read_reg(uart, IIR)) == 0xff) &&
> >>>> +            (((unsigned char)ns_read_reg(uart, LCR)) == 0xff));
> >>>> +}
> >>>> +
> >>>> +static int delayed_resume_tries;
> >>>> +static void ns16550_delayed_resume(void *data)
> >>>> +{
> >>>> +    struct serial_port *port = data;
> >>>> +    struct ns16550 *uart = port->uart;
> >>>> +
> >>>> +    if (ns16550_ioport_invalid(port->uart) && delayed_resume_tries--) {
> >>>> +       set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
> >>>> +       return;
> >>>> +    }
> >>>> +
> >>>> +    __ns16550_resume(port);
> >>>> +}
> >>>> +
> >>>> +static void ns16550_resume(struct serial_port *port)
> >>>> +{
> >>>> +    struct ns16550 *uart = port->uart;
> >>>> +
> >>>> +    /*
> >>>> +     * Check for ioport access, before fully resuming operation.
> >>>> +     * On some systems, there is a SuperIO card that provides
> >>>> +     * this legacy ioport on the LPC bus.
> >>>> +     *
> >>>> +     * We need to wait for dom0's ACPI processing to run the proper
> >>>> +     * AML to re-initialize the chip, before we can use the card again.
> >>>> +     *
> >>>> +     * This may cause a small amount of garbage to be written
> >>>> +     * to the serial log while we wait patiently for that AML to
> >>>> +     * be executed.
> >>>> +     */
> >>>> +    if (ns16550_ioport_invalid(uart)) {
> >>>> +       delayed_resume_tries = RESUME_RETRIES;
> >>>> +       init_timer(&uart->resume_timer, ns16550_delayed_resume, port,
> >>>> 0);
> >>>> +       set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
> >>>> +       return;
> >>>> +    }
> >>>> +
> >>>> +    __ns16550_resume(port);
> >>>> +}
> >>>> +
> >>>>    #ifdef CONFIG_X86
> >>>>    static void __init ns16550_endboot(struct serial_port *port)
> >>>>    {
> >>>> --
> >>>> 1.7.9.5
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Xen-devel mailing list
> >>>> Xen-devel@lists.xen.org
> >>>> http://lists.xen.org/xen-devel
> >>
> >> _______________________________________________
> >> Xen-devel mailing list
> >> Xen-devel@lists.xen.org
> >> http://lists.xen.org/xen-devel
> >
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
  2013-01-16 21:48       ` Ben Guthro
  2013-01-16 21:54         ` Pasi Kärkkäinen
@ 2013-01-17 11:09         ` Jan Beulich
  2013-01-17 12:04           ` Ben Guthro
  2013-04-22 13:51           ` Ben Guthro
  1 sibling, 2 replies; 15+ messages in thread
From: Jan Beulich @ 2013-01-17 11:09 UTC (permalink / raw)
  To: Ben Guthro; +Cc: Malcolm Crossley, xen-devel

>>> On 16.01.13 at 22:48, Ben Guthro <ben@guthro.net> wrote:
> On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley
> <malcolm.crossley@citrix.com> wrote:
> 
>> Do these laptops (T430/T530) have built in serial?
> 
> They seem to have the hardware for it, but no actual serial connector
> out of the machine.
> This hardware provides the legacy port that Xen initializes in
> xen/arch/x86/setup.c __start_xen()
> 
> When the resume happened, it was getting stuck in __ns16550_poll()
> because it thought that the
> LSR register was 0xFF - and had lots of data to read. It got stuck in
> that while loop, and never
> exited.

So before acking the patch I'd like to understand how we end up
in that loop even when no serial console is in use. Assuming that's
because the post-IRQ initialization (mostly) unconditionally inserts
the timer, that shouldn't be an issue on -unstable (as post-IRQ
init of the individual drivers doesn't get called anymore when no
respective command line option was present, and likewise their
suspend/resume handlers don't get called anymore in that case).
In which case backporting from -unstable would be preferable
over putting custom stuff on the 4.x branches (albeit we likely
still want the change here to have a way to resume with serial
console, but the impact would be quite different).

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
  2013-01-17 11:09         ` Jan Beulich
@ 2013-01-17 12:04           ` Ben Guthro
  2013-01-17 13:37             ` Ben Guthro
  2013-04-22 13:51           ` Ben Guthro
  1 sibling, 1 reply; 15+ messages in thread
From: Ben Guthro @ 2013-01-17 12:04 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Malcolm Crossley, Ben Guthro, xen-devel@lists.xen.org

On Jan 17, 2013, at 6:09 AM, Jan Beulich <JBeulich@suse.com> wrote:

>>>> On 16.01.13 at 22:48, Ben Guthro <ben@guthro.net> wrote:
>> On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley
>> <malcolm.crossley@citrix.com> wrote:
>>
>>> Do these laptops (T430/T530) have built in serial?
>>
>> They seem to have the hardware for it, but no actual serial connector
>> out of the machine.
>> This hardware provides the legacy port that Xen initializes in
>> xen/arch/x86/setup.c __start_xen()
>>
>> When the resume happened, it was getting stuck in __ns16550_poll()
>> because it thought that the
>> LSR register was 0xFF - and had lots of data to read. It got stuck in
>> that while loop, and never
>> exited.
>
> So before acking the patch I'd like to understand how we end up
> in that loop even when no serial console is in use. Assuming that's
> because the post-IRQ initialization (mostly) unconditionally inserts
> the timer, that shouldn't be an issue on -unstable (as post-IRQ
> init of the individual drivers doesn't get called anymore when no
> respective command line option was present, and likewise their
> suspend/resume handlers don't get called anymore in that case).
> In which case backporting from -unstable would be preferable
> over putting custom stuff on the 4.x branches (albeit we likely
> still want the change here to have a way to resume with serial
> console, but the impact would be quite different).
>
> Jan
>

Admittedly, I have been doing my testing on 4.2.y

I can try unstable today to see if it makes a difference in this path.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
  2013-01-17 12:04           ` Ben Guthro
@ 2013-01-17 13:37             ` Ben Guthro
  2013-01-18 15:53               ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 15+ messages in thread
From: Ben Guthro @ 2013-01-17 13:37 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Malcolm Crossley, Ben Guthro, xen-devel@lists.xen.org

On Thu, Jan 17, 2013 at 7:04 AM, Ben Guthro <ben.guthro@gmail.com> wrote:
> On Jan 17, 2013, at 6:09 AM, Jan Beulich <JBeulich@suse.com> wrote:
>
>>>>> On 16.01.13 at 22:48, Ben Guthro <ben@guthro.net> wrote:
>>> On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley
>>> <malcolm.crossley@citrix.com> wrote:
>>>
>>>> Do these laptops (T430/T530) have built in serial?
>>>
>>> They seem to have the hardware for it, but no actual serial connector
>>> out of the machine.
>>> This hardware provides the legacy port that Xen initializes in
>>> xen/arch/x86/setup.c __start_xen()
>>>
>>> When the resume happened, it was getting stuck in __ns16550_poll()
>>> because it thought that the
>>> LSR register was 0xFF - and had lots of data to read. It got stuck in
>>> that while loop, and never
>>> exited.
>>
>> So before acking the patch I'd like to understand how we end up
>> in that loop even when no serial console is in use. Assuming that's
>> because the post-IRQ initialization (mostly) unconditionally inserts
>> the timer, that shouldn't be an issue on -unstable (as post-IRQ
>> init of the individual drivers doesn't get called anymore when no
>> respective command line option was present, and likewise their
>> suspend/resume handlers don't get called anymore in that case).
>> In which case backporting from -unstable would be preferable
>> over putting custom stuff on the 4.x branches (albeit we likely
>> still want the change here to have a way to resume with serial
>> console, but the impact would be quite different).
>>
>> Jan
>>
>
> Admittedly, I have been doing my testing on 4.2.y
>
> I can try unstable today to see if it makes a difference in this path.

Further testing on the Lenovo T430 shows that this fix is insufficient
to fully resolve the S3 failures, (even on 4.2.y)
The first one seems to work, but the second fails - which is a
different behavior than I am seeing on the Intel Mobile SDP.

So while I think this is a valuable patch to have for debugging S3, it
is not the root cause of the failures as I had previously believed.

...back to the drawing board, I guess.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
  2013-01-17 13:37             ` Ben Guthro
@ 2013-01-18 15:53               ` Konrad Rzeszutek Wilk
  2013-01-18 20:07                 ` Ben Guthro
  0 siblings, 1 reply; 15+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-01-18 15:53 UTC (permalink / raw)
  To: Ben Guthro; +Cc: Malcolm Crossley, Jan Beulich, xen-devel@lists.xen.org

On Thu, Jan 17, 2013 at 08:37:40AM -0500, Ben Guthro wrote:
> On Thu, Jan 17, 2013 at 7:04 AM, Ben Guthro <ben.guthro@gmail.com> wrote:
> > On Jan 17, 2013, at 6:09 AM, Jan Beulich <JBeulich@suse.com> wrote:
> >
> >>>>> On 16.01.13 at 22:48, Ben Guthro <ben@guthro.net> wrote:
> >>> On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley
> >>> <malcolm.crossley@citrix.com> wrote:
> >>>
> >>>> Do these laptops (T430/T530) have built in serial?
> >>>
> >>> They seem to have the hardware for it, but no actual serial connector
> >>> out of the machine.
> >>> This hardware provides the legacy port that Xen initializes in
> >>> xen/arch/x86/setup.c __start_xen()
> >>>
> >>> When the resume happened, it was getting stuck in __ns16550_poll()
> >>> because it thought that the
> >>> LSR register was 0xFF - and had lots of data to read. It got stuck in
> >>> that while loop, and never
> >>> exited.
> >>
> >> So before acking the patch I'd like to understand how we end up
> >> in that loop even when no serial console is in use. Assuming that's
> >> because the post-IRQ initialization (mostly) unconditionally inserts
> >> the timer, that shouldn't be an issue on -unstable (as post-IRQ
> >> init of the individual drivers doesn't get called anymore when no
> >> respective command line option was present, and likewise their
> >> suspend/resume handlers don't get called anymore in that case).
> >> In which case backporting from -unstable would be preferable
> >> over putting custom stuff on the 4.x branches (albeit we likely
> >> still want the change here to have a way to resume with serial
> >> console, but the impact would be quite different).
> >>
> >> Jan
> >>
> >
> > Admittedly, I have been doing my testing on 4.2.y
> >
> > I can try unstable today to see if it makes a difference in this path.
> 
> Further testing on the Lenovo T430 shows that this fix is insufficient
> to fully resolve the S3 failures, (even on 4.2.y)
> The first one seems to work, but the second fails - which is a
> different behavior than I am seeing on the Intel Mobile SDP.
> 
> So while I think this is a valuable patch to have for debugging S3, it
> is not the root cause of the failures as I had previously believed.
> 
> ...back to the drawing board, I guess.

I just tried Xen 4.3 without trying to use any serial console, and found out that
it succesfully came back from resume. But only with one CPU active.

Doing 'xl debug-keys r' before shows:

(XEN) sched_smt_power_savings: disabled
(XEN) NOW=0x000000099F1C9611
(XEN) Idle cpupool:
(XEN) Scheduler: SMP Credit Scheduler (credit)
(XEN) info:
(XEN)   ncpus              = 2
(XEN)   master             = 0
(XEN)   credit             = 600
(XEN)   credit balance     = 0
(XEN)   weight             = 0
(XEN)   runq_sort          = 380
(XEN)   default-weight     = 256
(XEN)   tslice             = 30ms
(XEN)   ratelimit          = 1000us
(XEN)   credits per msec   = 10
(XEN)   ticks per tslice   = 3
(XEN)   migration delay    = 0us
(XEN) idlers: 02
(XEN) active vcpus:
(XEN) Cpupool 0:
(XEN) Scheduler: SMP Credit Scheduler (credit)
(XEN) info:
(XEN)   ncpus              = 2
(XEN)   master             = 0
(XEN)   credit             = 600
(XEN)   credit balance     = 0
(XEN)   weight             = 0
(XEN)   runq_sort          = 380
(XEN)   default-weight     = 256
(XEN)   tslice             = 30ms
(XEN)   ratelimit          = 1000us
(XEN)   credits per msec   = 10
(XEN)   ticks per tslice   = 3
(XEN)   migration delay    = 0us
(XEN) idlers: 02
(XEN) active vcpus:
(XEN) CPU[00]  sort=380, sibling=01, core=03
(XEN)   run: [0.0] pri=0 flags=0 cpu=0 credit=172 [w=256]
(XEN)     1: [32767.0] pri=-64 flags=0 cpu=0
(XEN) CPU[01]  sort=380, sibling=02, core=03
(XEN)   run: [32767.1] pri=-64 flags=0 cpu=1


after ACPI S3 resume:
(XEN) Idle cpupool:
(XEN) Scheduler: SMP Credit Scheduler (credit)
(XEN) info:
(XEN)   ncpus              = 2
(XEN)   master             = 0
(XEN)   credit             = 600
(XEN)   credit balance     = 0
(XEN)   weight             = 0
(XEN)   runq_sort          = 471
(XEN)   default-weight     = 256
(XEN)   tslice             = 30ms
(XEN)   ratelimit          = 1000us
(XEN)   credits per msec   = 10
(XEN)   ticks per tslice   = 3
(XEN)   migration delay    = 0us
(XEN) idlers: 02
(XEN) active vcpus:
(XEN) Cpupool 0:
(XEN) Scheduler: SMP Credit Scheduler (credit)
(XEN) info:
(XEN)   ncpus              = 2
(XEN)   master             = 0
(XEN)   credit             = 600
(XEN)   credit balance     = 0
(XEN)   weight             = 0
(XEN)   runq_sort          = 471
(XEN)   default-weight     = 256
(XEN)   tslice             = 30ms
(XEN)   ratelimit          = 1000us
(XEN)   credits per msec   = 10
(XEN)   ticks per tslice   = 3
(XEN)   migration delay    = 0us
(XEN) idlers: 02
(XEN) active vcpus:
(XEN) CPU[00]  sort=471, sibling=01, core=03
(XEN)   run: [0.1] pri=0 flags=0 cpu=0 credit=0 [w=256]
(XEN)     1: [0.0] pri=0 flags=0 cpu=0 credit=-120 [w=256]
(XEN)     2: [32767.0] pri=-64 flags=0 cpu=0

Where did the other CPU go!?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
  2013-01-18 15:53               ` Konrad Rzeszutek Wilk
@ 2013-01-18 20:07                 ` Ben Guthro
  0 siblings, 0 replies; 15+ messages in thread
From: Ben Guthro @ 2013-01-18 20:07 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Malcolm Crossley, Jan Beulich, xen-devel@lists.xen.org

On Fri, Jan 18, 2013 at 10:53 AM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
> Where did the other CPU go!?

strange.
We aren't seeing that particular failure.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
  2013-01-17 11:09         ` Jan Beulich
  2013-01-17 12:04           ` Ben Guthro
@ 2013-04-22 13:51           ` Ben Guthro
  2013-04-23  6:40             ` Jan Beulich
  1 sibling, 1 reply; 15+ messages in thread
From: Ben Guthro @ 2013-04-22 13:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Thu, Jan 17, 2013 at 11:09 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 16.01.13 at 22:48, Ben Guthro <ben@guthro.net> wrote:
>> On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley
>> <malcolm.crossley@citrix.com> wrote:
>>
>>> Do these laptops (T430/T530) have built in serial?
>>
>> They seem to have the hardware for it, but no actual serial connector
>> out of the machine.
>> This hardware provides the legacy port that Xen initializes in
>> xen/arch/x86/setup.c __start_xen()
>>
>> When the resume happened, it was getting stuck in __ns16550_poll()
>> because it thought that the
>> LSR register was 0xFF - and had lots of data to read. It got stuck in
>> that while loop, and never
>> exited.
>
> So before acking the patch I'd like to understand how we end up
> in that loop even when no serial console is in use. Assuming that's
> because the post-IRQ initialization (mostly) unconditionally inserts
> the timer, that shouldn't be an issue on -unstable (as post-IRQ
> init of the individual drivers doesn't get called anymore when no
> respective command line option was present, and likewise their
> suspend/resume handlers don't get called anymore in that case).
> In which case backporting from -unstable would be preferable
> over putting custom stuff on the 4.x branches (albeit we likely
> still want the change here to have a way to resume with serial
> console, but the impact would be quite different).
>
> Jan

I'm dragging this thread back up from the archives, because I think it is still
applicable, and now that some of the other S3 scheduling things are explained,
this is a good fix.

My assertion in this thread about getting into this loop when no
serial is in use
turned out to be false.

However, if you are using serial with one of these SuperIO controllers on the
LPC bus, you will get into an infinite loop at resume time.

Ben

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
  2013-04-22 13:51           ` Ben Guthro
@ 2013-04-23  6:40             ` Jan Beulich
  2013-04-23 18:56               ` Ben Guthro
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2013-04-23  6:40 UTC (permalink / raw)
  To: Ben Guthro; +Cc: xen-devel

>>> On 22.04.13 at 15:51, Ben Guthro <ben@guthro.net> wrote:
> On Thu, Jan 17, 2013 at 11:09 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 16.01.13 at 22:48, Ben Guthro <ben@guthro.net> wrote:
>>> On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley
>>> <malcolm.crossley@citrix.com> wrote:
>>>
>>>> Do these laptops (T430/T530) have built in serial?
>>>
>>> They seem to have the hardware for it, but no actual serial connector
>>> out of the machine.
>>> This hardware provides the legacy port that Xen initializes in
>>> xen/arch/x86/setup.c __start_xen()
>>>
>>> When the resume happened, it was getting stuck in __ns16550_poll()
>>> because it thought that the
>>> LSR register was 0xFF - and had lots of data to read. It got stuck in
>>> that while loop, and never
>>> exited.
>>
>> So before acking the patch I'd like to understand how we end up
>> in that loop even when no serial console is in use. Assuming that's
>> because the post-IRQ initialization (mostly) unconditionally inserts
>> the timer, that shouldn't be an issue on -unstable (as post-IRQ
>> init of the individual drivers doesn't get called anymore when no
>> respective command line option was present, and likewise their
>> suspend/resume handlers don't get called anymore in that case).
>> In which case backporting from -unstable would be preferable
>> over putting custom stuff on the 4.x branches (albeit we likely
>> still want the change here to have a way to resume with serial
>> console, but the impact would be quite different).
> 
> I'm dragging this thread back up from the archives, because I think it is 
> still applicable, and now that some of the other S3 scheduling things are 
> explained, this is a good fix.
> 
> My assertion in this thread about getting into this loop when no
> serial is in use turned out to be false.

Good.

> However, if you are using serial with one of these SuperIO controllers on 
> the LPC bus, you will get into an infinite loop at resume time.

Yes, I can see how that can happen.

However, there's another loose end on that thread - see your
response
http://lists.xen.org/archives/html/xen-devel/2013-01/msg01223.html

And I'd expect you to resubmit anyway, in order to
- fix the description (as you had promised to Pasi)
- fix various coding style issues
- Cc Keir (who will have to ack the patch in order for it to go in)
- make sure we're not applying something stale after this long a
  time

Jan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
  2013-04-23  6:40             ` Jan Beulich
@ 2013-04-23 18:56               ` Ben Guthro
  0 siblings, 0 replies; 15+ messages in thread
From: Ben Guthro @ 2013-04-23 18:56 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Tue, Apr 23, 2013 at 7:40 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 22.04.13 at 15:51, Ben Guthro <ben@guthro.net> wrote:
>> On Thu, Jan 17, 2013 at 11:09 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 16.01.13 at 22:48, Ben Guthro <ben@guthro.net> wrote:
>>>> On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley
>>>> <malcolm.crossley@citrix.com> wrote:
>>>>
>>>>> Do these laptops (T430/T530) have built in serial?
>>>>
>>>> They seem to have the hardware for it, but no actual serial connector
>>>> out of the machine.
>>>> This hardware provides the legacy port that Xen initializes in
>>>> xen/arch/x86/setup.c __start_xen()
>>>>
>>>> When the resume happened, it was getting stuck in __ns16550_poll()
>>>> because it thought that the
>>>> LSR register was 0xFF - and had lots of data to read. It got stuck in
>>>> that while loop, and never
>>>> exited.
>>>
>>> So before acking the patch I'd like to understand how we end up
>>> in that loop even when no serial console is in use. Assuming that's
>>> because the post-IRQ initialization (mostly) unconditionally inserts
>>> the timer, that shouldn't be an issue on -unstable (as post-IRQ
>>> init of the individual drivers doesn't get called anymore when no
>>> respective command line option was present, and likewise their
>>> suspend/resume handlers don't get called anymore in that case).
>>> In which case backporting from -unstable would be preferable
>>> over putting custom stuff on the 4.x branches (albeit we likely
>>> still want the change here to have a way to resume with serial
>>> console, but the impact would be quite different).
>>
>> I'm dragging this thread back up from the archives, because I think it is
>> still applicable, and now that some of the other S3 scheduling things are
>> explained, this is a good fix.
>>
>> My assertion in this thread about getting into this loop when no
>> serial is in use turned out to be false.
>
> Good.
>
>> However, if you are using serial with one of these SuperIO controllers on
>> the LPC bus, you will get into an infinite loop at resume time.
>
> Yes, I can see how that can happen.
>
> However, there's another loose end on that thread - see your
> response
> http://lists.xen.org/archives/html/xen-devel/2013-01/msg01223.html

This issue was resolved with the following changeset:

commit 1868a0c5c083a53f7473067ceb871bd096c72386
Author: Jan Beulich <jbeulich@suse.com>
Date:   Tue Sep 11 12:33:31 2012 +0200

    x86/MSI: fix 2nd S3 resume with interrupt remapping enabled

    The first resume from S3 was corrupting internal data structures (in
    that pci_restore_msi_state() updated the globally stored MSI message
    from traditional to interrupt remapped format, which would then be
    translated a second time during the second resume, breaking interrupt
    delivery).

    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: Keir Fraser <keir@xen.org>
    xen-unstable changeset: 25834:0376c85caaf3
    xen-unstable date: Fri Sep  7 15:57:10 UTC 2012



>
> And I'd expect you to resubmit anyway, in order to
> - fix the description (as you had promised to Pasi)
> - fix various coding style issues
> - Cc Keir (who will have to ack the patch in order for it to go in)
> - make sure we're not applying something stale after this long a
>   time

OK, I'll take a look at doing this.

Thanks.

Ben

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2013-04-23 18:56 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-01-16 20:20 [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run Ben Guthro
2013-01-16 21:24 ` Pasi Kärkkäinen
2013-01-16 21:31   ` Ben Guthro
2013-01-16 21:40     ` Malcolm Crossley
2013-01-16 21:48       ` Ben Guthro
2013-01-16 21:54         ` Pasi Kärkkäinen
2013-01-17 11:09         ` Jan Beulich
2013-01-17 12:04           ` Ben Guthro
2013-01-17 13:37             ` Ben Guthro
2013-01-18 15:53               ` Konrad Rzeszutek Wilk
2013-01-18 20:07                 ` Ben Guthro
2013-04-22 13:51           ` Ben Guthro
2013-04-23  6:40             ` Jan Beulich
2013-04-23 18:56               ` Ben Guthro
2013-01-16 21:52       ` Pasi Kärkkäinen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).