Linux Serial subsystem development

Linux Serial subsystem development
 help / color / mirror / Atom feed

* Re: [RFT/PATCH] serial: omap: prevent resume if device is not suspended.
From: Greg KH @ 2012-09-26 20:30 UTC (permalink / raw)
  To: Poddar, Sourav
  Cc: alan, tony, khilman, linux-omap, linux-arm-kernel, linux-serial,
	linux-kernel, santosh.shilimkar, balbi, paul
In-Reply-To: <CAKdam572krJZcayRdJ_AciCUZH3LuzD9yoHenhxrB6j9xDZ6JQ@mail.gmail.com>

On Tue, Sep 25, 2012 at 01:52:03PM +0530, Poddar, Sourav wrote:
> Hi Greg,
> 
> Ping on this?

It was a RFT patch, with a huge thread.  What's the resolution here?
Did people figure out the real problem here or not?  If so, care to
resend the proper patch so I know what to apply?

thanks,

greg k-h

^ permalink raw reply

* [PATCH v3] serial: pl011: handle corruption at high clock speeds
From: Linus Walleij @ 2012-09-26 15:21 UTC (permalink / raw)
  To: linux-serial, Greg Kroah-Hartman
  Cc: linux-arm-kernel, Anmar Oueja, Linus Walleij, stable, Bibek Basu,
	Par-Gunnar Hjalmdahl, Guillaume Jaunet, Christophe Arnal,
	Matthias Locher, Rajanikanth HV

From: Linus Walleij <linus.walleij@linaro.org>

This works around a few glitches in the ST version of the PL011
serial driver when using very high baud rates, as we do in the
Ux500: 3, 3.25, 4 and 4.05 Mbps.

Problem Observed/rootcause:

When using high baud-rates, and the baudrate*8 is getting close to
the provided clock frequency (so a division factor close to 1), when
using bursts of characters (so they are abutted), then it seems as if
there is not enough time to detect the beginning of the start-bit which
is a timing reference for the entire character, and thus the sampling
moment of character bits is moving towards the end of each bit, instead
of the middle.

Fix:
Increase slightly the RX baud rate of the UART above the theoretical
baudrate by 5%. This will definitely give more margin time to the
UART_RX to correctly sample the data at the middle of the bit period.

Also fix the ages old copy-paste error in the very stressed comment,
it's referencing the registers used in the PL010 driver rather than
the PL011 ones.

Cc: stable@kernel.org
Cc: Bibek Basu <bibek.basu@stericsson.com>
Cc: Par-Gunnar Hjalmdahl <par-gunnar.hjalmdahl@stericsson.com>
Signed-off-by: Guillaume Jaunet <guillaume.jaunet@stericsson.com>
Signed-off-by: Christophe Arnal <christophe.arnal@stericsson.com>
Signed-off-by: Matthias Locher <matthias.locher@stericsson.com>
Signed-off-by: Rajanikanth HV <rajanikanth.hv@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
ChangeLog v2->v3:
- Actually commit the v2 changes...
ChangeLog v1->v2:
- Use >= comparison for 3 Mbaud, add some parenthesis to make the
  logical expressions crystal clear.
---
 drivers/tty/serial/amba-pl011.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index d626d84..25508c7 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -1594,13 +1594,26 @@ pl011_set_termios(struct uart_port *port, struct ktermios *termios,
 			old_cr &= ~ST_UART011_CR_OVSFACT;
 	}
 
+	/*
+	 * Workaround for the ST Micro oversampling variants to
+	 * increase the bitrate slightly, by lowering the divisor,
+	 * to avoid delayed sampling of start bit at high speeds,
+	 * else we see data corruption.
+	 */
+	if (uap->vendor->oversampling) {
+		if ((baud >= 3000000) && (baud < 3250000) && (quot > 1))
+			quot -= 1;
+		else if ((baud > 3250000) && (quot > 2))
+			quot -= 2;
+	}
 	/* Set baud rate */
 	writew(quot & 0x3f, port->membase + UART011_FBRD);
 	writew(quot >> 6, port->membase + UART011_IBRD);
 
 	/*
 	 * ----------v----------v----------v----------v-----
-	 * NOTE: MUST BE WRITTEN AFTER UARTLCR_M & UARTLCR_L
+	 * NOTE: lcrh_tx and lcrh_rx MUST BE WRITTEN AFTER
+	 * UART011_FBRD & UART011_IBRD.
 	 * ----------^----------^----------^----------^-----
 	 */
 	writew(lcr_h, port->membase + uap->lcrh_rx);
-- 
1.7.11.3


^ permalink raw reply related

* Re: [PATCH 3/3] serial: pl011: allow very high baudrates
From: Alan Cox @ 2012-09-26 10:04 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Russell King - ARM Linux, Greg Kroah-Hartman, Alan Cox,
	Linus Walleij, linux-serial, Guillaume Jaunet,
	Par-Gunnar Hjalmdahl, Anmar Oueja, Matthias Locher,
	Rajanikanth H.V, Christophe Arnal
In-Reply-To: <CACRpkda5rwpsiu4+wuXOCwHjnKZpAywGNJNRmPHtBHrcRUUzXg@mail.gmail.com>

On Wed, 26 Sep 2012 10:06:04 +0200
Linus Walleij <linus.walleij@linaro.org> wrote:

> On Tue, Sep 25, 2012 at 8:48 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
> > On Fri, Sep 21, 2012 at 9:56 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> >>
> >> Caller passes BOTHER and actual bit rate - we return BOTHER and a bit
> >> rate
> >>
> >> Caller does not pass BOTHER (may not be TCGETS2 aware) we snap to the
> >> nearest Bfoo rate if within 5% otherwise we return BOTHER based rates.
> >
> > OK sorry for getting this backwards, so I was under the impression that
> > BOTHER was an internal detail of the TTY layer, not to be or:ed on
> > and passed in from the outside.
> 
> OK not I got it working thusly:
> 
> /*
>  * Make sure the core will not snap baudrate to something
>  * "close to" requested rate by setting the BOTHER
>  * (baud rate other) flag.
>  */
> tty->termios->c_cflag &= ~CBAUD;
> tty->termios->c_cflag |= BOTHER | (BOTHER >> IBSHIFT);
> tty_encode_baud_rate(tty, baud, baud);
> 
> There are no in-kernel consumers doing this wicked thing so
> mailing it here for reference.
> 
> Hope I got it right now...

Yes.

^ permalink raw reply

* [PATCH v2] serial: pl011: handle corruption at high clock speeds
From: Linus Walleij @ 2012-09-26  8:50 UTC (permalink / raw)
  To: linux-serial, Greg Kroah-Hartman
  Cc: linux-arm-kernel, Anmar Oueja, Linus Walleij, stable, Bibek Basu,
	Par-Gunnar Hjalmdahl, Guillaume Jaunet, Christophe Arnal,
	Matthias Locher, Rajanikanth HV

From: Linus Walleij <linus.walleij@linaro.org>

This works around a few glitches in the ST version of the PL011
serial driver when using very high baud rates, as we do in the
Ux500: 3, 3.25, 4 and 4.05 Mbps.

Problem Observed/rootcause:

When using high baud-rates, and the baudrate*8 is getting close to
the provided clock frequency (so a division factor close to 1), when
using bursts of characters (so they are abutted), then it seems as if
there is not enough time to detect the beginning of the start-bit which
is a timing reference for the entire character, and thus the sampling
moment of character bits is moving towards the end of each bit, instead
of the middle.

Fix:
Increase slightly the RX baud rate of the UART above the theoretical
baudrate by 5%. This will definitely give more margin time to the
UART_RX to correctly sample the data at the middle of the bit period.

Also fix the ages old copy-paste error in the very stressed comment,
it's referencing the registers used in the PL010 driver rather than
the PL011 ones.

Cc: stable@kernel.org
Cc: Bibek Basu <bibek.basu@stericsson.com>
Cc: Par-Gunnar Hjalmdahl <par-gunnar.hjalmdahl@stericsson.com>
Signed-off-by: Guillaume Jaunet <guillaume.jaunet@stericsson.com>
Signed-off-by: Christophe Arnal <christophe.arnal@stericsson.com>
Signed-off-by: Matthias Locher <matthias.locher@stericsson.com>
Signed-off-by: Rajanikanth HV <rajanikanth.hv@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
ChangeLog v1->v2:
- Use >= comparison for 3 Mbaud, add some parenthesis to make the
  logical expressions crystal clear.
---
 drivers/tty/serial/amba-pl011.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index d626d84..cb9f694 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -1594,13 +1594,26 @@ pl011_set_termios(struct uart_port *port, struct ktermios *termios,
 			old_cr &= ~ST_UART011_CR_OVSFACT;
 	}

+	/*
+	 * Workaround for the ST Micro oversampling variants to
+	 * increase the bitrate slightly, by lowering the divisor,
+	 * to avoid delayed sampling of start bit at high speeds,
+	 * else we see data corruption.
+	 */
+	if (uap->vendor->oversampling) {
+		if (baud > 3000000 && baud < 3250000 && quot > 1)
+			quot -= 1;
+		else if (baud > 3250000 && quot > 2)
+			quot -= 2;
+	}
 	/* Set baud rate */
 	writew(quot & 0x3f, port->membase + UART011_FBRD);
 	writew(quot >> 6, port->membase + UART011_IBRD);

 	/*
 	 * ----------v----------v----------v----------v-----
-	 * NOTE: MUST BE WRITTEN AFTER UARTLCR_M & UARTLCR_L
+	 * NOTE: lcrh_tx and lcrh_rx MUST BE WRITTEN AFTER
+	 * UART011_FBRD & UART011_IBRD.
 	 * ----------^----------^----------^----------^-----
 	 */
 	writew(lcr_h, port->membase + uap->lcrh_rx);
-- 
1.7.11.3

^ permalink raw reply related

* Re: [PATCH 3/3] serial: pl011: allow very high baudrates
From: Linus Walleij @ 2012-09-26  8:06 UTC (permalink / raw)
  To: Alan Cox
  Cc: Russell King - ARM Linux, Greg Kroah-Hartman, Alan Cox,
	Linus Walleij, linux-serial, Guillaume Jaunet,
	Par-Gunnar Hjalmdahl, Anmar Oueja, Matthias Locher,
	Rajanikanth H.V, Christophe Arnal, linux-arm-kernel
In-Reply-To: <CACRpkdbjsbUJqMgKt1CyBMmCxMAMz0hhwxzn3T6MdTyYGymjLg@mail.gmail.com>

On Tue, Sep 25, 2012 at 8:48 PM, Linus Walleij <linus.walleij@linaro.org> wrote:
> On Fri, Sep 21, 2012 at 9:56 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>>
>> Caller passes BOTHER and actual bit rate - we return BOTHER and a bit
>> rate
>>
>> Caller does not pass BOTHER (may not be TCGETS2 aware) we snap to the
>> nearest Bfoo rate if within 5% otherwise we return BOTHER based rates.
>
> OK sorry for getting this backwards, so I was under the impression that
> BOTHER was an internal detail of the TTY layer, not to be or:ed on
> and passed in from the outside.

OK not I got it working thusly:

/*
 * Make sure the core will not snap baudrate to something
 * "close to" requested rate by setting the BOTHER
 * (baud rate other) flag.
 */
tty->termios->c_cflag &= ~CBAUD;
tty->termios->c_cflag |= BOTHER | (BOTHER >> IBSHIFT);
tty_encode_baud_rate(tty, baud, baud);

There are no in-kernel consumers doing this wicked thing so
mailing it here for reference.

Hope I got it right now...

Yours,
Linus Walleij

^ permalink raw reply

* Re: [PATCH] RFC: serial: pl011: allow very high baudrates
From: Linus Walleij @ 2012-09-26  8:03 UTC (permalink / raw)
  To: Linus Walleij
  Cc: linux-serial, Greg Kroah-Hartman, Russell King, Alan Cox,
	linux-arm-kernel, Anmar Oueja, Par-Gunnar Hjalmdahl,
	Guillaume Jaunet, Christophe Arnal, Matthias Locher,
	Rajanikanth HV
In-Reply-To: <1348250399-5651-1-git-send-email-linus.walleij@stericsson.com>

On Fri, Sep 21, 2012 at 7:59 PM, Linus Walleij
<linus.walleij@stericsson.com> wrote:

> From: Linus Walleij <linus.walleij@linaro.org>
>
> The ST Microelectronics variant of the PL011 is capable of supporting
> very high non-standard baud rates, even above 4 Mbps. Mostly this
> works, but when we try to use 4050000 baud, the code in
> tty_termios_encode_baud_rate(), which is called from
> uart_get_baud_rate() will do some fuzzy matching and "snap" the
> baudrate to 4000000 baud, which happens to be in the baud_table[].

So I've found the right solution to this problem, Greg ignore
this patch.

Yours,
Linus Walleij

^ permalink raw reply

* Re: [PATCH 3/3] serial: pl011: allow very high baudrates
From: Alan Cox @ 2012-09-25 19:00 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Russell King - ARM Linux, Greg Kroah-Hartman, Alan Cox,
	Linus Walleij, linux-serial, Guillaume Jaunet,
	Par-Gunnar Hjalmdahl, Anmar Oueja, Matthias Locher,
	Rajanikanth H.V, Christophe Arnal, linux-arm-kernel
In-Reply-To: <CACRpkdbjsbUJqMgKt1CyBMmCxMAMz0hhwxzn3T6MdTyYGymjLg@mail.gmail.com>

> OK sorry for getting this backwards, so I was under the impression that
> BOTHER was an internal detail of the TTY layer, not to be or:ed on
> and passed in from the outside.

BOTHER is a speed setting like the other Bxxx values - it's an external
API.

> 
> Then everything makes perfect sense, I'll try to patch the caller
> instead for this "bug" and see what happens. Probably it JustWorks...

Cool

^ permalink raw reply

* Re: [PATCH 3/3] serial: pl011: allow very high baudrates
From: Linus Walleij @ 2012-09-25 18:48 UTC (permalink / raw)
  To: Alan Cox
  Cc: Russell King - ARM Linux, Greg Kroah-Hartman, Alan Cox,
	Linus Walleij, linux-serial, Guillaume Jaunet,
	Par-Gunnar Hjalmdahl, Anmar Oueja, Matthias Locher,
	Rajanikanth H.V, Christophe Arnal, linux-arm-kernel
In-Reply-To: <20120921205603.436a0194@pyramind.ukuu.org.uk>

On Fri, Sep 21, 2012 at 9:56 PM, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> On Fri, 21 Sep 2012 19:52:04 +0200
> Linus Walleij <linus.walleij@linaro.org> wrote:

>> it's the call to
>> tty_termios_encode_baud_rate() that is the problem, not how it
>> gets called. It's that function that fuzzes and "snaps" the baudrate
>> to some rough-fit speed and screws things up for me.
>
> (...)
> The intended behaviour at tty layer is
>
> Caller passes BOTHER and actual bit rate - we return BOTHER and a bit
> rate
>
> Caller does not pass BOTHER (may not be TCGETS2 aware) we snap to the
> nearest Bfoo rate if within 5% otherwise we return BOTHER based rates.

OK sorry for getting this backwards, so I was under the impression that
BOTHER was an internal detail of the TTY layer, not to be or:ed on
and passed in from the outside.

Then everything makes perfect sense, I'll try to patch the caller
instead for this "bug" and see what happens. Probably it JustWorks...

Thanks a lot Alan!
Linus Walleij

^ permalink raw reply

* Re: [PATCH-v2] tty: prevent unnecessary work queue lock checking on flip buffer copy
From: Alan Cox @ 2012-09-25 14:47 UTC (permalink / raw)
  To: Ivo Sieben; +Cc: Alan Cox, Greg KH, linux-serial, RT
In-Reply-To: <CAMSQXEHzWxqZZyAGrmdk-Grvz4EtTXFWivWnjQf5E4ReY88XoQ@mail.gmail.com>

> Or should drivers that use tty_schedule_flip() function never set the
> low_latency flag? (a quick scan indeed shows that all drivers that use
> the low_latency flag indeed make use of the tty_flip_buffer_push()
> function).

They should never do that - and adding a WARN_ON() will ensure we catch
anyone who offends.

I think thats worth doing as the resulting code is clearer and faster.

Alan

^ permalink raw reply

* Re: [PATCH-v2] tty: prevent unnecessary work queue lock checking on flip buffer copy
From: Ivo Sieben @ 2012-09-25 14:39 UTC (permalink / raw)
  To: Alan Cox; +Cc: Alan Cox, Greg KH, linux-serial, RT
In-Reply-To: <20120925140627.5aea6ad7@pyramind.ukuu.org.uk>

Hi,

2012/9/25 Alan Cox <alan@lxorguk.ukuu.org.uk>:
>
>
> then surely the right fix is
>
>         if (tty->low_latency == 0)
>                 flush_work(&tty->buf.work);
>

Your are right, that is indeed more straightforward & logical.
But what if a TTY driver uses the tty_schedule_flip() function instead
of tty_flip_buffer_push() and has low_latency set to 1?
In that case the work queue will never be flushed...

Or should drivers that use tty_schedule_flip() function never set the
low_latency flag? (a quick scan indeed shows that all drivers that use
the low_latency flag indeed make use of the tty_flip_buffer_push()
function).

Best regards,
Ivo Sieben

^ permalink raw reply

* Re: [PATCH-v2] tty: prevent unnecessary work queue lock checking on flip buffer copy
From: Alan Cox @ 2012-09-25 13:06 UTC (permalink / raw)
  To: Ivo Sieben; +Cc: Alan Cox, Greg KH, linux-serial, RT
In-Reply-To: <1348574506-23625-1-git-send-email-meltedpianoman@gmail.com>

> instead of using a work queue in the background. Therefor only in case a
> workqueue is actually used for copying data to the line discipline
> we'll have to check & wait for the workqueue to finish.
> 
> This prevents unnecessary spin lock/unlock on the workqueue spin lock that can
> cause additional scheduling overhead on a PREEMPT_RT system. On a 240 MHz
> AT91SAM9261 processor setup this fixes about 100us of scheduling overhead on the
> TTY read call.

This seems incredibly convoluted and complicated as well as asking for
races and weirdness with the divisibility between the schedule_work and
the atomic operation.

If the only cases are:

	tty->low_latency = 0
				call flush_work
	tty->low_latency = 1
				don't call

	tty->low_latency changes
				call flush_work

then surely the right fix is

	if (tty->low_latency == 0)
		flush_work(&tty->buf.work);

and making the couple of spots we set/unset low latency on a running port
somewhat smarter ?

That avoids an expensive atomic operation as well (and atomic ops are
very expensive on some platforms).

Alan

^ permalink raw reply

* Re: [RFT/PATCH] serial: omap: prevent resume if device is not suspended.
From: Jassi Brar @ 2012-09-25 12:29 UTC (permalink / raw)
  To: balbi
  Cc: Grazvydas Ignotas, Sourav Poddar, gregkh, alan, tony, khilman,
	linux-omap, linux-arm-kernel, linux-serial, linux-kernel,
	santosh.shilimkar, paul
In-Reply-To: <20120919115933.GJ3772@arwen.pp.htv.fi>

On 19 September 2012 17:29, Felipe Balbi <balbi@ti.com> wrote:

> this is what I mean, actually. If we remove pm_runtime_get_sync() in
> exchange for pm_runtime_set_active() before pm_runtime_enable(), it
> works on PandaBoard, but breaks BeagleBoard.
>
Perhaps it suggests that OMAP4 (PandaBoard) serial port is already
activated but OMAP3 (BeagleBoard) isn't. So we need to reflect that
either by doing pm_runtime_set_active() only on OMAP4 or by bringing
up the OMAP3 too before the pm_runtime_set_active() call.
BTW I understand get_sync(), set_active() and enable() are for
different purposes, they can't be traded for one another.

^ permalink raw reply

* [PATCH-v2] tty: prevent unnecessary work queue lock checking on flip buffer copy
From: Ivo Sieben @ 2012-09-25 12:01 UTC (permalink / raw)
  To: Alan Cox, Greg KH, linux-serial, RT; +Cc: Ivo Sieben
In-Reply-To: <CAMSQXEGEf8xSYZa+0N77K-Ld5xHLefyMHtK6FxcKCqNhpnMp6g@mail.gmail.com>

In case of PREEMPT_RT or when low_latency flag is set by the serial driver
the TTY receive flip buffer is copied to the line discipline directly
instead of using a work queue in the background. Therefor only in case a
workqueue is actually used for copying data to the line discipline
we'll have to check & wait for the workqueue to finish.

This prevents unnecessary spin lock/unlock on the workqueue spin lock that can
cause additional scheduling overhead on a PREEMPT_RT system. On a 240 MHz
AT91SAM9261 processor setup this fixes about 100us of scheduling overhead on the
TTY read call.

Signed-off-by: Ivo Sieben <meltedpianoman@gmail.com>
---

 v2:
 Patch v1 was based on the fact that only the tty_flip_buffer_push() function
 was used to copy dat to the line discipline because I removed the
 tty_schedule_flip() in my previous patch "[PATCH] tty: cleanup duplicate
 functions in tty_buffer". Since that patched turned out to be invalid, I had
 to implement this functionality differently.

 drivers/tty/tty_buffer.c |   17 ++++++++++++++---
 include/linux/tty.h      |    3 ++-
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index 6146e8b..dee77b4 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -354,6 +354,7 @@ void tty_schedule_flip(struct tty_struct *tty)
 		tty->buf.tail->commit = tty->buf.tail->used;
 	spin_unlock_irqrestore(&tty->buf.lock, flags);
 	schedule_work(&tty->buf.work);
+	atomic_set(&tty->buf.flush_work, 1);
 }
 EXPORT_SYMBOL(tty_schedule_flip);
 
@@ -514,7 +515,14 @@ static void flush_to_ldisc(struct work_struct *work)
  */
 void tty_flush_to_ldisc(struct tty_struct *tty)
 {
-	flush_work(&tty->buf.work);
+	/*
+	 * The work queue is not always used to move data from the flip buffer
+	 * to the line discipline: the tty_flip_buffer_push() will call the
+	 * flush_to_ldisc() routine directly when low_latency flag is set.
+	 * Therefor only flush the work queue when required.
+	 */
+	if (atomic_xchg(&tty->buf.flush_work, 0))
+		flush_work(&tty->buf.work);
 }
 
 /**
@@ -538,10 +546,12 @@ void tty_flip_buffer_push(struct tty_struct *tty)
 		tty->buf.tail->commit = tty->buf.tail->used;
 	raw_spin_unlock_irqrestore(&tty->buf.lock, flags);
 
-	if (tty->low_latency)
+	if (tty->low_latency) {
 		flush_to_ldisc(&tty->buf.work);
-	else
+	} else {
 		schedule_work(&tty->buf.work);
+		atomic_set(&tty->buf.flush_work, 1);
+	}
 }
 EXPORT_SYMBOL(tty_flip_buffer_push);
 
@@ -563,5 +573,6 @@ void tty_buffer_init(struct tty_struct *tty)
 	tty->buf.free = NULL;
 	tty->buf.memory_used = 0;
 	INIT_WORK(&tty->buf.work, flush_to_ldisc);
+	atomic_set(&tty->buf.flush_work, 0);
 }
 
diff --git a/include/linux/tty.h b/include/linux/tty.h
index 21bceef..f76ac5e 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -43,7 +43,7 @@
 #include <linux/tty_driver.h>
 #include <linux/tty_ldisc.h>
 #include <linux/mutex.h>
-
+#include <linux/atomic.h>
 
 
 /*
@@ -86,6 +86,7 @@ struct tty_buffer {
 
 struct tty_bufhead {
 	struct work_struct work;
+	atomic_t flush_work;
 	raw_spinlock_t lock;
 	struct tty_buffer *head;	/* Queue head */
 	struct tty_buffer *tail;	/* Active buffer */
-- 
1.7.9.5



^ permalink raw reply related

* Re: [RFT/PATCH] serial: omap: prevent resume if device is not suspended.
From: Russell King - ARM Linux @ 2012-09-25 11:32 UTC (permalink / raw)
  To: Felipe Balbi
  Cc: Poddar, Sourav, gregkh, khilman, paul, tony, linux-kernel,
	santosh.shilimkar, linux-serial, linux-omap, linux-arm-kernel,
	alan
In-Reply-To: <20120925111243.GR9137@arwen.pp.htv.fi>

On Tue, Sep 25, 2012 at 02:12:43PM +0300, Felipe Balbi wrote:
> I don't see any aggressive attitude towards what you suggested,
> actually. Mailing list archives are available to check, but the one
> cursing around was always yourself and THAT deserves an apology.

Total rubbish.  No apology, because I wasn't cursing you.  Whatever,
I'm not going to re-explain the email thread.

What I said was that your raising of beagleboard breakage (twice) was
idiotic to a request for investigation.  That's a statement I _still_
fully agree with, and I'll say it again if I have to.

Trying to shut down investigation because investigation may break
something _is_ idiotic - especially if the investigation reveals
potential reasons why that breakage would occur.  Further investigation
is precisely how we arrive at better solutions.

^ permalink raw reply

* Re: [PATCH] tty: cleanup duplicate functions in tty_buffer
From: Ivo Sieben @ 2012-09-25 11:28 UTC (permalink / raw)
  To: Alan Cox; +Cc: Alan Cox, Greg KH, linux-serial, RT, Jiri Slaby, Heiko Carstens
In-Reply-To: <20120924102547.62581af4@pyramind.ukuu.org.uk>

Hi

2012/9/24 Alan Cox <alan@lxorguk.ukuu.org.uk>:
>
> flush_to_ldisc calls into the tty core code which sleeps. The rule is
> that you may not call flush_to_ldisc except from a workqueue context.
>
> If you break that rule then various bad things will occur (eg rx ->
> flush_to_ldisc -> n_tty -> flow control -> tx -> deadlock) in a lot of
> drivers.
>
>
> Several of them are ones that shouldn't.
>

OK, I now understand what you mean with "pile that has needs to move
the other way"
I'll abandon this patch.

Thank you all for reviewing

Regards,
Ivo Sieben

^ permalink raw reply

* Re: [RFT/PATCH] serial: omap: prevent resume if device is not suspended.
From: Russell King - ARM Linux @ 2012-09-25 11:15 UTC (permalink / raw)
  To: Felipe Balbi
  Cc: Poddar, Sourav, gregkh, khilman, paul, tony, linux-kernel,
	santosh.shilimkar, linux-serial, linux-omap, linux-arm-kernel,
	alan
In-Reply-To: <20120925091112.GK9137@arwen.pp.htv.fi>

On Tue, Sep 25, 2012 at 12:11:14PM +0300, Felipe Balbi wrote:
> that's most likely, of course. But should we cause a regression to
> beagleboard XM because of that ? Also, if you look into chapter 9 of the
> runtime_pm documentation, starting on line 822 you'll see documentation
> suggests the use of mystruct->is_suspended flag.

BTW, I'll point out a fatal flaw in your justification above.

If you read the entire example, you'll see that the is_suspended flag
is _not_ used to prevent resumes without suspends, but is used as a
flag to control whether the driver processes requests or not.  That's
entirely functionally different from using a "is_suspended" flag in
the way you mention above.

Section 5 is quite clear about the requirements at initialization time
for runtime PM, and nothing in section 9 contradicts that, and the
is_suspended flag in that example has nothing to do with any of this.

^ permalink raw reply

* Re: [RFT/PATCH] serial: omap: prevent resume if device is not suspended.
From: Felipe Balbi @ 2012-09-25 11:12 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Felipe Balbi, Poddar, Sourav, gregkh, khilman, paul, tony,
	linux-kernel, santosh.shilimkar, linux-serial, linux-omap,
	linux-arm-kernel, alan
In-Reply-To: <20120925110703.GN31374@n2100.arm.linux.org.uk>

[-- Attachment #1: Type: text/plain, Size: 2943 bytes --]

Hi,

On Tue, Sep 25, 2012 at 12:07:03PM +0100, Russell King - ARM Linux wrote:
> On Tue, Sep 25, 2012 at 01:37:03PM +0300, Felipe Balbi wrote:
> > On Tue, Sep 25, 2012 at 11:29:58AM +0100, Russell King - ARM Linux wrote:
> > > Because you are accusing me of potentially breaking your beagleboard
> > > for merely suggesting further investigation and a better commit message.
> > 
> > Where did I accuse you of anyting ? I just mentioned we experienced a
> > regression with beagleboard XM when using pm_runtime_set_active().
> 
> I quote:
> :> But should we cause a regression to beagleboard XM because of that ?
>    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

maybe that wasn't the best way to put it, but I was trying to point out
that either there was a bug on pm core or omap_device/hwmod, not that we
shouldn't investigate.

> I say again: I was _only_ suggesting further investigation, yet you
> were mouthing off about causing a regression to beagleboard "because
> of that", effectively saying that no, we should not do any further
> investigation and this is the only fix.

not what I meant, but fair enough... The "because of that" was supposed
to mean "because of pm_runtime_set_active()". I find the documentation
for that particular call to be rather poor and a bit confusing,
specially when further down, the very same document uses a
"is_suspended" flag which, in fact, shouldn't be needed when we have
pm_runtime_is_suspended() and the like.

> > We pinged Paul and asked if he had seen that before, he had no
> > pointers... Because Documentation/power/runtime_pm.txt was using a
> > mystruct->is_suspended flag, we just decided to follow the same
> > "design" since no-one was able to suggest why pm_runtime_set_active()
> > was breaking beagleXM nor how it was supposed to actually work.
> > 
> > Reading the code: pm_runtime_set_active() would tell pm_runtime core
> > the device is actually active by setting runtime_status to RPM_ACTIVE,
> > thus the following pm_runtime_get_sync() wouldn't actually call
> > runtime_resume() callback, but it would increment usage_counter.
> > 
> > I can't see why this would fail on beagleXM, but it does and we'd like
> > to hear in which situations this could fail...
> 
> Well, I've just spent five minutes analysing the code paths - which I
> hadn't looked at before - and I've pointed out what's probably causing
> the problem for Beagle.  I think you owe me an appology over your
> aggressive attitude towards my suggestions that there needed to be
> some further investigation.

I don't see any aggressive attitude towards what you suggested,
actually. Mailing list archives are available to check, but the one
cursing around was always yourself and THAT deserves an apology.

If you still think I've been at all aggressive, then sure, I apologize,
it wasn't what I meant though.

-- 
balbi

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [RFT/PATCH] serial: omap: prevent resume if device is not suspended.
From: Russell King - ARM Linux @ 2012-09-25 11:07 UTC (permalink / raw)
  To: Felipe Balbi
  Cc: Poddar, Sourav, gregkh, khilman, paul, tony, linux-kernel,
	santosh.shilimkar, linux-serial, linux-omap, linux-arm-kernel,
	alan
In-Reply-To: <20120925103652.GN9137@arwen.pp.htv.fi>

On Tue, Sep 25, 2012 at 01:37:03PM +0300, Felipe Balbi wrote:
> On Tue, Sep 25, 2012 at 11:29:58AM +0100, Russell King - ARM Linux wrote:
> > Because you are accusing me of potentially breaking your beagleboard
> > for merely suggesting further investigation and a better commit message.
> 
> Where did I accuse you of anyting ? I just mentioned we experienced a
> regression with beagleboard XM when using pm_runtime_set_active().

I quote:
:> But should we cause a regression to beagleboard XM because of that ?
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I say again: I was _only_ suggesting further investigation, yet you
were mouthing off about causing a regression to beagleboard "because
of that", effectively saying that no, we should not do any further
investigation and this is the only fix.

> To add extra info, here you go:

Finally, something constructive.

> We pinged Paul and asked if he had seen that before, he had no
> pointers... Because Documentation/power/runtime_pm.txt was using a
> mystruct->is_suspended flag, we just decided to follow the same
> "design" since no-one was able to suggest why pm_runtime_set_active()
> was breaking beagleXM nor how it was supposed to actually work.
> 
> Reading the code: pm_runtime_set_active() would tell pm_runtime core
> the device is actually active by setting runtime_status to RPM_ACTIVE,
> thus the following pm_runtime_get_sync() wouldn't actually call
> runtime_resume() callback, but it would increment usage_counter.
> 
> I can't see why this would fail on beagleXM, but it does and we'd like
> to hear in which situations this could fail...

Well, I've just spent five minutes analysing the code paths - which I
hadn't looked at before - and I've pointed out what's probably causing
the problem for Beagle.  I think you owe me an appology over your
aggressive attitude towards my suggestions that there needed to be
some further investigation.

^ permalink raw reply

* Re: [RFT/PATCH] serial: omap: prevent resume if device is not suspended.
From: Russell King - ARM Linux @ 2012-09-25 10:59 UTC (permalink / raw)
  To: Poddar, Sourav
  Cc: Felipe Balbi, gregkh, khilman, paul, tony, linux-kernel,
	santosh.shilimkar, linux-serial, linux-omap, linux-arm-kernel,
	alan
In-Reply-To: <CAKdam57bpTJ94Gn=mKVw1Fm-76fVPp3DGjFuNu_HfesybM1hPg@mail.gmail.com>

Right, let's get this thread back onto a constructive footing and try
to understand the problems here.

On Tue, Sep 25, 2012 at 03:26:06PM +0530, Poddar, Sourav wrote:
> The issue was observed at serial init itself in the N800 board and the
> log does not show up much.
> http://www.pwsan.com/omap/testlogs/test_tty_next_e36851d0/20120910020323/boot/2420n800/2420n800_log.txt

Right, so output stops when ttyO2 is initialized.

> What we thought the problem might be with n800 is that it tries to
> resume when it didn't suspend before.
> 
> There are two ways through which we thought of handling this issue:
> 
> a) set device as active before enabling pm (which will prevent
> 
> pm_runtime_set_active(dev);
> pm_runtime_enable(dev);
> 
> OR
> 
> b) adding a "suspended" flag to struct omap_uart_port which gets set on
> suspend and cleared on resume. Then on resume you can check:
> 
> if (!up->suspended)
>         return 0;
> 
> But using "pm_runtime_set_active" approach breaks things even on
> beagle board xm, though it works fine on Panda.

Right, so now the question becomes - what is different on the Beagle Board
that prevents solution (a) from working - or put it another way, have we
uncovered _another_ bug.

For N800, for ttyO0 and ttyO1 which aren't in use, it may be correct.
We don't know for certain.  For ttyO2, the port is clearly already in
use as it's being used for console output.  So that means it's _not_
in suspended state, and therefore the initial runtime PM state is
wrong.

For Beagleboard - who knows, we have no information on that.  All we
know is that your (a) sequence causes a regression.

Anyway, let's analyse the code paths.  What is pm_runtime_get_sync()
doing?  Well, it calls __pm_runtime_resume(, RPM_GET_PUT).  That alters
the parent device's state (which doesn't matter for this, as the platform
device is a child of the main platform device, which has no runtime PM.)

It then calls the resume callback (from the pm domain/type/class/bus/
driver) and then does an idle check.

OMAP devices have a pm domain, so this is the candidate for the callback
at this level, which gets us into _od_runtime_resume().  This calls
omap_device_enable() before then moving on to the driver's runtime resume
method, and at that point the runtime PM resume is complete.

So, with your (a) solution, what's different is that we omit calling
omap_device_enable().  Therefore, my hunch is the reason that Beagleboard
doesn't work is because it doesn't like missing that call.

I bet if you do this:

	omap_device_enable(dev);
	pm_runtime_set_active(dev);
	pm_runtime_enable(dev);

that this will solve your problem, and Beagleboard will continue working.

What we have is a mismatch in both your case, and the Beagleboard case,
between the real state of the hardware, the runtime PM state, and the
OMAP hwmod state, and the above should bring all those states entirely
into line with each other for _every_ case.  It doesn't need any hacks
to prevent resume callbacks without prior suspends (which, incidentally,
the runtime PM core already guarantees provided you get the initial
state correct.)

^ permalink raw reply

* Re: [RFT/PATCH] serial: omap: prevent resume if device is not suspended.
From: Felipe Balbi @ 2012-09-25 10:37 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Felipe Balbi, Poddar, Sourav, gregkh, khilman, paul, tony,
	linux-kernel, santosh.shilimkar, linux-serial, linux-omap,
	linux-arm-kernel, alan
In-Reply-To: <20120925102958.GL31374@n2100.arm.linux.org.uk>

[-- Attachment #1: Type: text/plain, Size: 4320 bytes --]

On Tue, Sep 25, 2012 at 11:29:58AM +0100, Russell King - ARM Linux wrote:
> On Tue, Sep 25, 2012 at 12:48:16PM +0300, Felipe Balbi wrote:
> > Hi,
> > 
> > On Tue, Sep 25, 2012 at 10:21:18AM +0100, Russell King - ARM Linux wrote:
> > > On Tue, Sep 25, 2012 at 12:11:14PM +0300, Felipe Balbi wrote:
> > > > On Tue, Sep 25, 2012 at 10:12:28AM +0100, Russell King - ARM Linux wrote:
> > > > > On Tue, Sep 25, 2012 at 11:31:20AM +0300, Felipe Balbi wrote:
> > > > > > On Tue, Sep 25, 2012 at 09:30:29AM +0100, Russell King - ARM Linux wrote:
> > > > > > > How is this happening?  I think that needs proper investigation - or if
> > > > > > > it's had more investigation, then the results needs to be included in
> > > > > > > the commit description so that everyone can understand the issue here.
> > > > > > > 
> > > > > > > We should not be resuming a device which hasn't been suspended.  Maybe
> > > > > > > the runtime PM enable sequence is wrong, and that's what should be fixed
> > > > > > > instead?  
> > > > > > > 
> > > > > > > This sequence in the probe() function:
> > > > > > > 
> > > > > > >         pm_runtime_irq_safe(&pdev->dev);
> > > > > > >         pm_runtime_enable(&pdev->dev);
> > > > > > >         pm_runtime_get_sync(&pdev->dev);
> > > > > > > 
> > > > > > > would enable runtime PM while the s/w state indicates that it's disabled,
> > > > > > > and then that pm_runtime_get_sync() will want to resume the device.  See
> > > > > > > the section "5. Runtime PM Initialization, Device Probing and Removal"
> > > > > > > in Documentation/power/runtime_pm.txt, specifically the second paragraph
> > > > > > > of that section.
> > > > > > 
> > > > > > that was tested. It worked in pandaboard but didn't work on beagleboard
> > > > > > XM. Sourav tried to start a discussion about that, but it simply died...
> > > > > > 
> > > > > > In any case, pm_runtime_get_sync() in probe will always call
> > > > > > runtime_resume callback, right ?
> > > > > 
> > > > > Well, if the runtime PM state says it's suspended, and then you enable
> > > > > runtime PM, the first call to pm_runtime_get_sync() will trigger a resume
> > > > > attempt.  The patch description is complaining about resume events without
> > > > > there being a preceding suspend event.
> > > > > 
> > > > > This could well be why.
> > > > 
> > > > that's most likely, of course. But should we cause a regression to
> > > > beagleboard XM because of that ?
> > > 
> > > What would cause a regression on beagleboard XM?  I have not suggested
> > > any change other than more investigation of the issue and a fuller patch
> > > description - yet you're screaming (idiotically IMHO) that mere
> > > investigation would break beagleboard.
> > > 
> > > Well, if it's _that_ fragile, that mere investigation of this issue by
> > > someone elsewhere on the planet would break your beagleboard, maybe it
> > > deserves to be broken!
> > 
> > why are you always so over the top like that ? This is just
> > counter-productive to say the least.
> 
> Because you are accusing me of potentially breaking your beagleboard
> for merely suggesting further investigation and a better commit message.

Where did I accuse you of anyting ? I just mentioned we experienced a
regression with beagleboard XM when using pm_runtime_set_active().

here's my quote:

> that was tested. It worked in pandaboard but didn't work on
> beagleboard XM. Sourav tried to start a discussion about that, but it
> simply died...

To add extra info, here you go:

We pinged Paul and asked if he had seen that before, he had no
pointers... Because Documentation/power/runtime_pm.txt was using a
mystruct->is_suspended flag, we just decided to follow the same
"design" since no-one was able to suggest why pm_runtime_set_active()
was breaking beagleXM nor how it was supposed to actually work.

Reading the code: pm_runtime_set_active() would tell pm_runtime core
the device is actually active by setting runtime_status to RPM_ACTIVE,
thus the following pm_runtime_get_sync() wouldn't actually call
runtime_resume() callback, but it would increment usage_counter.

I can't see why this would fail on beagleXM, but it does and we'd like
to hear in which situations this could fail...

-- 
balbi

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [RFT/PATCH] serial: omap: prevent resume if device is not suspended.
From: Russell King - ARM Linux @ 2012-09-25 10:29 UTC (permalink / raw)
  To: Felipe Balbi
  Cc: Poddar, Sourav, gregkh, khilman, paul, tony, linux-kernel,
	santosh.shilimkar, linux-serial, linux-omap, linux-arm-kernel,
	alan
In-Reply-To: <20120925094815.GL9137@arwen.pp.htv.fi>

On Tue, Sep 25, 2012 at 12:48:16PM +0300, Felipe Balbi wrote:
> Hi,
> 
> On Tue, Sep 25, 2012 at 10:21:18AM +0100, Russell King - ARM Linux wrote:
> > On Tue, Sep 25, 2012 at 12:11:14PM +0300, Felipe Balbi wrote:
> > > On Tue, Sep 25, 2012 at 10:12:28AM +0100, Russell King - ARM Linux wrote:
> > > > On Tue, Sep 25, 2012 at 11:31:20AM +0300, Felipe Balbi wrote:
> > > > > On Tue, Sep 25, 2012 at 09:30:29AM +0100, Russell King - ARM Linux wrote:
> > > > > > How is this happening?  I think that needs proper investigation - or if
> > > > > > it's had more investigation, then the results needs to be included in
> > > > > > the commit description so that everyone can understand the issue here.
> > > > > > 
> > > > > > We should not be resuming a device which hasn't been suspended.  Maybe
> > > > > > the runtime PM enable sequence is wrong, and that's what should be fixed
> > > > > > instead?  
> > > > > > 
> > > > > > This sequence in the probe() function:
> > > > > > 
> > > > > >         pm_runtime_irq_safe(&pdev->dev);
> > > > > >         pm_runtime_enable(&pdev->dev);
> > > > > >         pm_runtime_get_sync(&pdev->dev);
> > > > > > 
> > > > > > would enable runtime PM while the s/w state indicates that it's disabled,
> > > > > > and then that pm_runtime_get_sync() will want to resume the device.  See
> > > > > > the section "5. Runtime PM Initialization, Device Probing and Removal"
> > > > > > in Documentation/power/runtime_pm.txt, specifically the second paragraph
> > > > > > of that section.
> > > > > 
> > > > > that was tested. It worked in pandaboard but didn't work on beagleboard
> > > > > XM. Sourav tried to start a discussion about that, but it simply died...
> > > > > 
> > > > > In any case, pm_runtime_get_sync() in probe will always call
> > > > > runtime_resume callback, right ?
> > > > 
> > > > Well, if the runtime PM state says it's suspended, and then you enable
> > > > runtime PM, the first call to pm_runtime_get_sync() will trigger a resume
> > > > attempt.  The patch description is complaining about resume events without
> > > > there being a preceding suspend event.
> > > > 
> > > > This could well be why.
> > > 
> > > that's most likely, of course. But should we cause a regression to
> > > beagleboard XM because of that ?
> > 
> > What would cause a regression on beagleboard XM?  I have not suggested
> > any change other than more investigation of the issue and a fuller patch
> > description - yet you're screaming (idiotically IMHO) that mere
> > investigation would break beagleboard.
> > 
> > Well, if it's _that_ fragile, that mere investigation of this issue by
> > someone elsewhere on the planet would break your beagleboard, maybe it
> > deserves to be broken!
> 
> why are you always so over the top like that ? This is just
> counter-productive to say the least.

Because you are accusing me of potentially breaking your beagleboard
for merely suggesting further investigation and a better commit message.

You are the one going over the top, not me.

^ permalink raw reply

* Re: [RFT/PATCH] serial: omap: prevent resume if device is not suspended.
From: Poddar, Sourav @ 2012-09-25  9:56 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Felipe Balbi, gregkh, khilman, paul, tony, linux-kernel,
	santosh.shilimkar, linux-serial, linux-omap, linux-arm-kernel,
	alan
In-Reply-To: <20120925092118.GJ31374@n2100.arm.linux.org.uk>

Hi,

On Tue, Sep 25, 2012 at 2:51 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Tue, Sep 25, 2012 at 12:11:14PM +0300, Felipe Balbi wrote:
>> On Tue, Sep 25, 2012 at 10:12:28AM +0100, Russell King - ARM Linux wrote:
>> > On Tue, Sep 25, 2012 at 11:31:20AM +0300, Felipe Balbi wrote:
>> > > On Tue, Sep 25, 2012 at 09:30:29AM +0100, Russell King - ARM Linux wrote:
>> > > > How is this happening?  I think that needs proper investigation - or if
>> > > > it's had more investigation, then the results needs to be included in
>> > > > the commit description so that everyone can understand the issue here.
>> > > >
>> > > > We should not be resuming a device which hasn't been suspended.  Maybe
>> > > > the runtime PM enable sequence is wrong, and that's what should be fixed
>> > > > instead?
>> > > >
>> > > > This sequence in the probe() function:
>> > > >
>> > > >         pm_runtime_irq_safe(&pdev->dev);
>> > > >         pm_runtime_enable(&pdev->dev);
>> > > >         pm_runtime_get_sync(&pdev->dev);
>> > > >
>> > > > would enable runtime PM while the s/w state indicates that it's disabled,
>> > > > and then that pm_runtime_get_sync() will want to resume the device.  See
>> > > > the section "5. Runtime PM Initialization, Device Probing and Removal"
>> > > > in Documentation/power/runtime_pm.txt, specifically the second paragraph
>> > > > of that section.
>> > >
>> > > that was tested. It worked in pandaboard but didn't work on beagleboard
>> > > XM. Sourav tried to start a discussion about that, but it simply died...
>> > >
>> > > In any case, pm_runtime_get_sync() in probe will always call
>> > > runtime_resume callback, right ?
>> >
>> > Well, if the runtime PM state says it's suspended, and then you enable
>> > runtime PM, the first call to pm_runtime_get_sync() will trigger a resume
>> > attempt.  The patch description is complaining about resume events without
>> > there being a preceding suspend event.
>> >
>> > This could well be why.
>>
>> that's most likely, of course. But should we cause a regression to
>> beagleboard XM because of that ?
>
> What would cause a regression on beagleboard XM?  I have not suggested
> any change other than more investigation of the issue and a fuller patch
> description - yet you're screaming (idiotically IMHO) that mere
> investigation would break beagleboard.
>
> Well, if it's _that_ fragile, that mere investigation of this issue by
> someone elsewhere on the planet would break your beagleboard, maybe it
> deserves to be broken!

The issue was observed at serial init itself in the N800 board and the
log does not
show up much.
http://www.pwsan.com/omap/testlogs/test_tty_next_e36851d0/20120910020323/boot/2420n800/2420n800_log.txt
 What we thought the problem might be with n800 is that it tries to
resume when it didn't suspend before.

There are two ways through which we thought of handling this issue:

a) set device as active before enabling pm (which will prevent

pm_runtime_set_active(dev);
pm_runtime_enable(dev);

OR

b) adding a "suspended" flag to struct omap_uart_port which gets set on
suspend and cleared on resume. Then on resume you can check:

if (!up->suspended)
        return 0;

But using "pm_runtime_set_active" approach breaks things even on
beagle board xm,  though
it works fine on Panda.
Therefore, we used the "suspended" flag approach.

So. I just wanted to get some feedback from community about how using
"pm_runtime_set_active"
behaves differently in omap3 and omap4.

^ permalink raw reply

* Re: [RFT/PATCH] serial: omap: prevent resume if device is not suspended.
From: Felipe Balbi @ 2012-09-25  9:48 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Felipe Balbi, Poddar, Sourav, gregkh, khilman, paul, tony,
	linux-kernel, santosh.shilimkar, linux-serial, linux-omap,
	linux-arm-kernel, alan
In-Reply-To: <20120925092118.GJ31374@n2100.arm.linux.org.uk>

[-- Attachment #1: Type: text/plain, Size: 3152 bytes --]

Hi,

On Tue, Sep 25, 2012 at 10:21:18AM +0100, Russell King - ARM Linux wrote:
> On Tue, Sep 25, 2012 at 12:11:14PM +0300, Felipe Balbi wrote:
> > On Tue, Sep 25, 2012 at 10:12:28AM +0100, Russell King - ARM Linux wrote:
> > > On Tue, Sep 25, 2012 at 11:31:20AM +0300, Felipe Balbi wrote:
> > > > On Tue, Sep 25, 2012 at 09:30:29AM +0100, Russell King - ARM Linux wrote:
> > > > > How is this happening?  I think that needs proper investigation - or if
> > > > > it's had more investigation, then the results needs to be included in
> > > > > the commit description so that everyone can understand the issue here.
> > > > > 
> > > > > We should not be resuming a device which hasn't been suspended.  Maybe
> > > > > the runtime PM enable sequence is wrong, and that's what should be fixed
> > > > > instead?  
> > > > > 
> > > > > This sequence in the probe() function:
> > > > > 
> > > > >         pm_runtime_irq_safe(&pdev->dev);
> > > > >         pm_runtime_enable(&pdev->dev);
> > > > >         pm_runtime_get_sync(&pdev->dev);
> > > > > 
> > > > > would enable runtime PM while the s/w state indicates that it's disabled,
> > > > > and then that pm_runtime_get_sync() will want to resume the device.  See
> > > > > the section "5. Runtime PM Initialization, Device Probing and Removal"
> > > > > in Documentation/power/runtime_pm.txt, specifically the second paragraph
> > > > > of that section.
> > > > 
> > > > that was tested. It worked in pandaboard but didn't work on beagleboard
> > > > XM. Sourav tried to start a discussion about that, but it simply died...
> > > > 
> > > > In any case, pm_runtime_get_sync() in probe will always call
> > > > runtime_resume callback, right ?
> > > 
> > > Well, if the runtime PM state says it's suspended, and then you enable
> > > runtime PM, the first call to pm_runtime_get_sync() will trigger a resume
> > > attempt.  The patch description is complaining about resume events without
> > > there being a preceding suspend event.
> > > 
> > > This could well be why.
> > 
> > that's most likely, of course. But should we cause a regression to
> > beagleboard XM because of that ?
> 
> What would cause a regression on beagleboard XM?  I have not suggested
> any change other than more investigation of the issue and a fuller patch
> description - yet you're screaming (idiotically IMHO) that mere
> investigation would break beagleboard.
> 
> Well, if it's _that_ fragile, that mere investigation of this issue by
> someone elsewhere on the planet would break your beagleboard, maybe it
> deserves to be broken!

why are you always so over the top like that ? This is just
counter-productive to say the least.

What I'm trying to say here, is that there might be a bug either on pm
core or on OMAP3's implementation and I'd like to get input from Tony,
Santosh, Paul, etc before going forward. Maybe it's something they've
already been through, or maybe it rings a bell and points to somewhere
we should look for.

anyway, instead of feeding your ego, we can stop this discussion and let
Sourav see what he can come up with.

-- 
balbi

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [RFT/PATCH] serial: omap: prevent resume if device is not suspended.
From: Russell King - ARM Linux @ 2012-09-25  9:21 UTC (permalink / raw)
  To: Felipe Balbi
  Cc: Poddar, Sourav, gregkh, khilman, paul, tony, linux-kernel,
	santosh.shilimkar, linux-serial, linux-omap, linux-arm-kernel,
	alan
In-Reply-To: <20120925091112.GK9137@arwen.pp.htv.fi>

On Tue, Sep 25, 2012 at 12:11:14PM +0300, Felipe Balbi wrote:
> On Tue, Sep 25, 2012 at 10:12:28AM +0100, Russell King - ARM Linux wrote:
> > On Tue, Sep 25, 2012 at 11:31:20AM +0300, Felipe Balbi wrote:
> > > On Tue, Sep 25, 2012 at 09:30:29AM +0100, Russell King - ARM Linux wrote:
> > > > How is this happening?  I think that needs proper investigation - or if
> > > > it's had more investigation, then the results needs to be included in
> > > > the commit description so that everyone can understand the issue here.
> > > > 
> > > > We should not be resuming a device which hasn't been suspended.  Maybe
> > > > the runtime PM enable sequence is wrong, and that's what should be fixed
> > > > instead?  
> > > > 
> > > > This sequence in the probe() function:
> > > > 
> > > >         pm_runtime_irq_safe(&pdev->dev);
> > > >         pm_runtime_enable(&pdev->dev);
> > > >         pm_runtime_get_sync(&pdev->dev);
> > > > 
> > > > would enable runtime PM while the s/w state indicates that it's disabled,
> > > > and then that pm_runtime_get_sync() will want to resume the device.  See
> > > > the section "5. Runtime PM Initialization, Device Probing and Removal"
> > > > in Documentation/power/runtime_pm.txt, specifically the second paragraph
> > > > of that section.
> > > 
> > > that was tested. It worked in pandaboard but didn't work on beagleboard
> > > XM. Sourav tried to start a discussion about that, but it simply died...
> > > 
> > > In any case, pm_runtime_get_sync() in probe will always call
> > > runtime_resume callback, right ?
> > 
> > Well, if the runtime PM state says it's suspended, and then you enable
> > runtime PM, the first call to pm_runtime_get_sync() will trigger a resume
> > attempt.  The patch description is complaining about resume events without
> > there being a preceding suspend event.
> > 
> > This could well be why.
> 
> that's most likely, of course. But should we cause a regression to
> beagleboard XM because of that ?

What would cause a regression on beagleboard XM?  I have not suggested
any change other than more investigation of the issue and a fuller patch
description - yet you're screaming (idiotically IMHO) that mere
investigation would break beagleboard.

Well, if it's _that_ fragile, that mere investigation of this issue by
someone elsewhere on the planet would break your beagleboard, maybe it
deserves to be broken!

^ permalink raw reply

* Re: [RFT/PATCH] serial: omap: prevent resume if device is not suspended.
From: Felipe Balbi @ 2012-09-25  9:11 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Felipe Balbi, Poddar, Sourav, gregkh, khilman, paul, tony,
	linux-kernel, santosh.shilimkar, linux-serial, linux-omap,
	linux-arm-kernel, alan
In-Reply-To: <20120925091228.GI31374@n2100.arm.linux.org.uk>

[-- Attachment #1: Type: text/plain, Size: 2050 bytes --]

On Tue, Sep 25, 2012 at 10:12:28AM +0100, Russell King - ARM Linux wrote:
> On Tue, Sep 25, 2012 at 11:31:20AM +0300, Felipe Balbi wrote:
> > On Tue, Sep 25, 2012 at 09:30:29AM +0100, Russell King - ARM Linux wrote:
> > > How is this happening?  I think that needs proper investigation - or if
> > > it's had more investigation, then the results needs to be included in
> > > the commit description so that everyone can understand the issue here.
> > > 
> > > We should not be resuming a device which hasn't been suspended.  Maybe
> > > the runtime PM enable sequence is wrong, and that's what should be fixed
> > > instead?  
> > > 
> > > This sequence in the probe() function:
> > > 
> > >         pm_runtime_irq_safe(&pdev->dev);
> > >         pm_runtime_enable(&pdev->dev);
> > >         pm_runtime_get_sync(&pdev->dev);
> > > 
> > > would enable runtime PM while the s/w state indicates that it's disabled,
> > > and then that pm_runtime_get_sync() will want to resume the device.  See
> > > the section "5. Runtime PM Initialization, Device Probing and Removal"
> > > in Documentation/power/runtime_pm.txt, specifically the second paragraph
> > > of that section.
> > 
> > that was tested. It worked in pandaboard but didn't work on beagleboard
> > XM. Sourav tried to start a discussion about that, but it simply died...
> > 
> > In any case, pm_runtime_get_sync() in probe will always call
> > runtime_resume callback, right ?
> 
> Well, if the runtime PM state says it's suspended, and then you enable
> runtime PM, the first call to pm_runtime_get_sync() will trigger a resume
> attempt.  The patch description is complaining about resume events without
> there being a preceding suspend event.
> 
> This could well be why.

that's most likely, of course. But should we cause a regression to
beagleboard XM because of that ? Also, if you look into chapter 9 of the
runtime_pm documentation, starting on line 822 you'll see documentation
suggests the use of mystruct->is_suspended flag.

-- 
balbi

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox