* hvc_console change results in corrupt oops output
@ 2011-07-04 10:57 Anton Blanchard
2011-07-04 13:56 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 10+ messages in thread
From: Anton Blanchard @ 2011-07-04 10:57 UTC (permalink / raw)
To: benh, brueckner, borntraeger, linuxppc-dev
Hi,
We've been struggling to debug a hang on a large ppc64 box. Every time
we collect oops output there are pieces of the oops output missing and
in some cases entire CPUs are missing.
Eventually I realised the hvc_console driver is dropping characters.
The commit that caused this is:
commit 3feebbb5492e9e463467cefb633e23a3dfcec132
Author: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Date: Mon Oct 13 23:12:50 2008 +0000
hvc_console: Fix loop if put_char() returns 0
If put_char() routine of a hvc console backend returns 0, then the
hvc console starts looping in the following scenarios:
1. hvc_console_print()
If put_char() returns 0 then the while loop may loop forever.
I have added the missing check for 0 to throw away console
messages.
The hypervisor gives us a busy return, so we could retry a number of
times instead of dropping it on the floor. We'd need to do it in the
hvc_console driver - the tty drivers share the same backend
functions so we can't hide it in the pseries put_chars function.
Anton
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: hvc_console change results in corrupt oops output
2011-07-04 10:57 hvc_console change results in corrupt oops output Anton Blanchard
@ 2011-07-04 13:56 ` Benjamin Herrenschmidt
2011-07-04 14:24 ` Hendrik Brueckner
0 siblings, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2011-07-04 13:56 UTC (permalink / raw)
To: Anton Blanchard; +Cc: borntraeger, brueckner, linuxppc-dev
On Mon, 2011-07-04 at 20:57 +1000, Anton Blanchard wrote:
.../...
> The hypervisor gives us a busy return, so we could retry a number of
> times instead of dropping it on the floor. We'd need to do it in the
> hvc_console driver - the tty drivers share the same backend
> functions so we can't hide it in the pseries put_chars function.
For kernel console, I don't see why not wait forever ... If the
underlying backend really lost the connection it can always return a
negative error no ?
Or we could have "well defined" return codes for "wait forever" vs.
"drop this" ... maybe return -EAGAIN.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: hvc_console change results in corrupt oops output
2011-07-04 13:56 ` Benjamin Herrenschmidt
@ 2011-07-04 14:24 ` Hendrik Brueckner
2011-07-05 4:17 ` Tabi Timur-B04825
0 siblings, 1 reply; 10+ messages in thread
From: Hendrik Brueckner @ 2011-07-04 14:24 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: borntraeger, brueckner, linuxppc-dev, Anton Blanchard
On Mon, Jul 04, 2011 at 11:56:27PM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2011-07-04 at 20:57 +1000, Anton Blanchard wrote:
>
> .../...
>
> > The hypervisor gives us a busy return, so we could retry a number of
> > times instead of dropping it on the floor. We'd need to do it in the
> > hvc_console driver - the tty drivers share the same backend
> > functions so we can't hide it in the pseries put_chars function.
>
> For kernel console, I don't see why not wait forever ... If the
> underlying backend really lost the connection it can always return a
> negative error no ?
>
> Or we could have "well defined" return codes for "wait forever" vs.
> "drop this" ... maybe return -EAGAIN.
I will check this again for my hvc_iucv back-end. Meanwhile a found
an old thread discussing the same issue. It covers some differences
between console and ttys which actually does not matter for hvc-backend
because of the shared put_chars() routine.
You can read the thread on lkml.org: http://lkml.org/lkml/2009/10/15/149
Kind regards,
Hendrik
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: hvc_console change results in corrupt oops output
2011-07-04 14:24 ` Hendrik Brueckner
@ 2011-07-05 4:17 ` Tabi Timur-B04825
2011-07-05 6:22 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 10+ messages in thread
From: Tabi Timur-B04825 @ 2011-07-05 4:17 UTC (permalink / raw)
To: Hendrik Brueckner
Cc: linuxppc-dev@lists.ozlabs.org, Anton Blanchard,
borntraeger@de.ibm.com
On Mon, Jul 4, 2011 at 9:24 AM, Hendrik Brueckner
<brueckner@linux.vnet.ibm.com> wrote:
> I will check this again for my hvc_iucv back-end. =A0Meanwhile a found
> an old thread discussing the same issue. =A0It covers some differences
> between console and ttys which actually does not matter for hvc-backend
> because of the shared put_chars() routine.
>
> You can read the thread on lkml.org: http://lkml.org/lkml/2009/10/15/149
I started that thread. After much soul searching, we came to the
conclusion that HVC is not compatible with hypervisors that return
BUSY on writes. So I threw out my HVC driver and rewrote it as a
standard console and TTY driver. That driver is waiting to be
included in the 3.1 kernel.
--=20
Timur Tabi
Linux kernel developer at Freescale=
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: hvc_console change results in corrupt oops output
2011-07-05 4:17 ` Tabi Timur-B04825
@ 2011-07-05 6:22 ` Benjamin Herrenschmidt
2011-07-05 13:51 ` Tabi Timur-B04825
2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner
0 siblings, 2 replies; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2011-07-05 6:22 UTC (permalink / raw)
To: Tabi Timur-B04825
Cc: borntraeger@de.ibm.com, Hendrik Brueckner,
linuxppc-dev@lists.ozlabs.org, Anton Blanchard
On Tue, 2011-07-05 at 04:17 +0000, Tabi Timur-B04825 wrote:
> On Mon, Jul 4, 2011 at 9:24 AM, Hendrik Brueckner
> <brueckner@linux.vnet.ibm.com> wrote:
>
> > I will check this again for my hvc_iucv back-end. Meanwhile a found
> > an old thread discussing the same issue. It covers some differences
> > between console and ttys which actually does not matter for hvc-backend
> > because of the shared put_chars() routine.
> >
> > You can read the thread on lkml.org: http://lkml.org/lkml/2009/10/15/149
>
> I started that thread. After much soul searching, we came to the
> conclusion that HVC is not compatible with hypervisors that return
> BUSY on writes.
That is a fun conclusion considering that hvc has been written for the
pseries hypervisor which ... can return BUSY on writes :-)
> So I threw out my HVC driver and rewrote it as a
> standard console and TTY driver. That driver is waiting to be
> included in the 3.1 kernel.
Sucktastic.
We just need to fix HVC properly.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: hvc_console change results in corrupt oops output
2011-07-05 6:22 ` Benjamin Herrenschmidt
@ 2011-07-05 13:51 ` Tabi Timur-B04825
2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner
1 sibling, 0 replies; 10+ messages in thread
From: Tabi Timur-B04825 @ 2011-07-05 13:51 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: borntraeger@de.ibm.com, Hendrik Brueckner,
linuxppc-dev@lists.ozlabs.org, Anton Blanchard
Benjamin Herrenschmidt wrote:
> That is a fun conclusion considering that hvc has been written for the
> pseries hypervisor which ... can return BUSY on writes :-)
Go read the original thread. The problem is that tty writes and console=20
writes are treated the same by the hvc client driver. If a client driver=20
detects that the hypervisor is busy, it has the choice of either spinning=20
or returning right away. Spinning is not acceptable for tty output, so=20
all drivers return right away. hvc then drops the unwritten characters.
According to Hendrik, this is still happening.
> We just need to fix HVC properly.
Where were you two years ago? I complained about the problem, and even=20
posted a hackish "fix". The response I got was tepid -- some=20
acknowledgement that the problem exists, but no real desire to fix it by=20
anyone.
So I had no choice but to abandon hvc. And frankly, I still don't=20
understand why it exists. Since then, I wrote a very nice console/tty=20
driver, and I have no plans to return to hvc even if the problem is fixed.
--=20
Timur Tabi
Linux kernel developer at Freescale=
^ permalink raw reply [flat|nested] 10+ messages in thread
* [RFC] [PATCH] hvc_console: improve tty/console put_chars handling
2011-07-05 6:22 ` Benjamin Herrenschmidt
2011-07-05 13:51 ` Tabi Timur-B04825
@ 2011-07-05 14:28 ` Hendrik Brueckner
2011-07-06 7:49 ` Anton Blanchard
` (2 more replies)
1 sibling, 3 replies; 10+ messages in thread
From: Hendrik Brueckner @ 2011-07-05 14:28 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: borntraeger@de.ibm.com, Hendrik Brueckner, Tabi Timur-B04825,
linuxppc-dev@lists.ozlabs.org, Anton Blanchard
Hi folks,
On Tue, Jul 05, 2011 at 04:22:43PM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2011-07-05 at 04:17 +0000, Tabi Timur-B04825 wrote:
> > On Mon, Jul 4, 2011 at 9:24 AM, Hendrik Brueckner
> > <brueckner@linux.vnet.ibm.com> wrote:
> >
> > I started that thread. After much soul searching, we came to the
> > conclusion that HVC is not compatible with hypervisors that return
> > BUSY on writes.
>
> That is a fun conclusion considering that hvc has been written for the
> pseries hypervisor which ... can return BUSY on writes :-)
>
> We just need to fix HVC properly.
So I took initiative and looked again into this issue. Below you can
find a patch that is based on Ben's -EAGAIN idea. The hvc console layer
takes care of retrying depending on the backend's return code.
However, the main issue is that from a backend perspective, there is no
difference between tty and console output. Because consoles, especially
the preferred console, behave different than a simple tty. Blocked write
to a preferred console can stop the system.
So with the patch below, the backend can now indirectly control the way
console output is handled for it. I still have to think if this solution
is ok or if it is better to introduce a new callback to console output only
(and might provide a default implemenatation similar to the patch below).
NOTE: I did not yet test this patch but will do.. I just want to share it
early to get feedback from you.
-->8---------------------------------------------------------------------
Currently, the hvc_console_print() function drops console output if the
hvc backend's put_chars() returns 0. This patch changes this behavior
to allow a retry through returning -EAGAIN.
This change also affects the hvc_push() function. Both functions are
changed to handle -EAGAIN and to retry the put_chars() operation.
If a hvc backend returns -EAGAIN, the retry handling differs:
- hvc_console_print() spins to write the complete console output.
- hvc_push() behaves the same way as for returning 0.
Now hvc backends can indirectly control the way how console output is
handled through the hvc console layer.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
---
drivers/tty/hvc/hvc_console.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -163,8 +163,10 @@ static void hvc_console_print(struct con
} else {
r = cons_ops[index]->put_chars(vtermnos[index], c, i);
if (r <= 0) {
- /* throw away chars on error */
- i = 0;
+ /* throw away characters on error
+ * but spin in case of -EAGAIN */
+ if (r != -EAGAIN)
+ i = 0;
} else if (r > 0) {
i -= r;
if (i > 0)
@@ -448,7 +450,7 @@ static int hvc_push(struct hvc_struct *h
n = hp->ops->put_chars(hp->vtermno, hp->outbuf, hp->n_outbuf);
if (n <= 0) {
- if (n == 0) {
+ if (n == 0 || n == -EAGAIN) {
hp->do_wakeup = 1;
return 0;
}
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] [PATCH] hvc_console: improve tty/console put_chars handling
2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner
@ 2011-07-06 7:49 ` Anton Blanchard
2011-07-06 7:50 ` [PATCH 1/2]: " Anton Blanchard
2011-07-06 7:51 ` [PATCH 2/2]: powerpc/pseries/hvconsole: Fix dropped console output Anton Blanchard
2 siblings, 0 replies; 10+ messages in thread
From: Anton Blanchard @ 2011-07-06 7:49 UTC (permalink / raw)
To: Hendrik Brueckner
Cc: Tabi Timur-B04825, linuxppc-dev@lists.ozlabs.org,
borntraeger@de.ibm.com
Hi Hendrik,
> So with the patch below, the backend can now indirectly control the
> way console output is handled for it. I still have to think if this
> solution is ok or if it is better to introduce a new callback to
> console output only (and might provide a default implemenatation
> similar to the patch below).
>
> NOTE: I did not yet test this patch but will do.. I just want to
> share it early to get feedback from you.
Thanks a lot for this. I added the pseries bits and tested it, patches
to follow.
Anton
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/2]: hvc_console: improve tty/console put_chars handling
2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner
2011-07-06 7:49 ` Anton Blanchard
@ 2011-07-06 7:50 ` Anton Blanchard
2011-07-06 7:51 ` [PATCH 2/2]: powerpc/pseries/hvconsole: Fix dropped console output Anton Blanchard
2 siblings, 0 replies; 10+ messages in thread
From: Anton Blanchard @ 2011-07-06 7:50 UTC (permalink / raw)
To: Hendrik Brueckner
Cc: Tabi Timur-B04825, linuxppc-dev@lists.ozlabs.org,
borntraeger@de.ibm.com
From: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Currently, the hvc_console_print() function drops console output if the
hvc backend's put_chars() returns 0. This patch changes this behavior
to allow a retry through returning -EAGAIN.
This change also affects the hvc_push() function. Both functions are
changed to handle -EAGAIN and to retry the put_chars() operation.
If a hvc backend returns -EAGAIN, the retry handling differs:
- hvc_console_print() spins to write the complete console output.
- hvc_push() behaves the same way as for returning 0.
Now hvc backends can indirectly control the way how console output is
handled through the hvc console layer.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
---
Index: linux-powerpc/drivers/tty/hvc/hvc_console.c
===================================================================
--- linux-powerpc.orig/drivers/tty/hvc/hvc_console.c 2011-07-06 17:10:51.288360541 +1000
+++ linux-powerpc/drivers/tty/hvc/hvc_console.c 2011-07-06 17:11:16.398794913 +1000
@@ -163,8 +163,10 @@ static void hvc_console_print(struct con
} else {
r = cons_ops[index]->put_chars(vtermnos[index], c, i);
if (r <= 0) {
- /* throw away chars on error */
- i = 0;
+ /* throw away characters on error
+ * but spin in case of -EAGAIN */
+ if (r != -EAGAIN)
+ i = 0;
} else if (r > 0) {
i -= r;
if (i > 0)
@@ -448,7 +450,7 @@ static int hvc_push(struct hvc_struct *h
n = hp->ops->put_chars(hp->vtermno, hp->outbuf, hp->n_outbuf);
if (n <= 0) {
- if (n == 0) {
+ if (n == 0 || n == -EAGAIN) {
hp->do_wakeup = 1;
return 0;
}
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/2]: powerpc/pseries/hvconsole: Fix dropped console output
2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner
2011-07-06 7:49 ` Anton Blanchard
2011-07-06 7:50 ` [PATCH 1/2]: " Anton Blanchard
@ 2011-07-06 7:51 ` Anton Blanchard
2 siblings, 0 replies; 10+ messages in thread
From: Anton Blanchard @ 2011-07-06 7:51 UTC (permalink / raw)
To: Hendrik Brueckner
Cc: Tabi Timur-B04825, linuxppc-dev@lists.ozlabs.org,
borntraeger@de.ibm.com
Return -EAGAIN when we get H_BUSY back from the hypervisor. This
makes the hvc console driver retry, avoiding dropped printks.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
---
Index: linux-powerpc/arch/powerpc/platforms/pseries/hvconsole.c
===================================================================
--- linux-powerpc.orig/arch/powerpc/platforms/pseries/hvconsole.c 2011-07-06 17:11:33.799095935 +1000
+++ linux-powerpc/arch/powerpc/platforms/pseries/hvconsole.c 2011-07-06 17:11:47.499332962 +1000
@@ -73,7 +73,7 @@ int hvc_put_chars(uint32_t vtermno, cons
if (ret == H_SUCCESS)
return count;
if (ret == H_BUSY)
- return 0;
+ return -EAGAIN;
return -EIO;
}
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2011-07-06 7:51 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-07-04 10:57 hvc_console change results in corrupt oops output Anton Blanchard
2011-07-04 13:56 ` Benjamin Herrenschmidt
2011-07-04 14:24 ` Hendrik Brueckner
2011-07-05 4:17 ` Tabi Timur-B04825
2011-07-05 6:22 ` Benjamin Herrenschmidt
2011-07-05 13:51 ` Tabi Timur-B04825
2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner
2011-07-06 7:49 ` Anton Blanchard
2011-07-06 7:50 ` [PATCH 1/2]: " Anton Blanchard
2011-07-06 7:51 ` [PATCH 2/2]: powerpc/pseries/hvconsole: Fix dropped console output Anton Blanchard
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).