* hvc_console change results in corrupt oops output @ 2011-07-04 10:57 Anton Blanchard 2011-07-04 13:56 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 10+ messages in thread From: Anton Blanchard @ 2011-07-04 10:57 UTC (permalink / raw) To: benh, brueckner, borntraeger, linuxppc-dev Hi, We've been struggling to debug a hang on a large ppc64 box. Every time we collect oops output there are pieces of the oops output missing and in some cases entire CPUs are missing. Eventually I realised the hvc_console driver is dropping characters. The commit that caused this is: commit 3feebbb5492e9e463467cefb633e23a3dfcec132 Author: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Date: Mon Oct 13 23:12:50 2008 +0000 hvc_console: Fix loop if put_char() returns 0 If put_char() routine of a hvc console backend returns 0, then the hvc console starts looping in the following scenarios: 1. hvc_console_print() If put_char() returns 0 then the while loop may loop forever. I have added the missing check for 0 to throw away console messages. The hypervisor gives us a busy return, so we could retry a number of times instead of dropping it on the floor. We'd need to do it in the hvc_console driver - the tty drivers share the same backend functions so we can't hide it in the pseries put_chars function. Anton ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: hvc_console change results in corrupt oops output 2011-07-04 10:57 hvc_console change results in corrupt oops output Anton Blanchard @ 2011-07-04 13:56 ` Benjamin Herrenschmidt 2011-07-04 14:24 ` Hendrik Brueckner 0 siblings, 1 reply; 10+ messages in thread From: Benjamin Herrenschmidt @ 2011-07-04 13:56 UTC (permalink / raw) To: Anton Blanchard; +Cc: borntraeger, brueckner, linuxppc-dev On Mon, 2011-07-04 at 20:57 +1000, Anton Blanchard wrote: .../... > The hypervisor gives us a busy return, so we could retry a number of > times instead of dropping it on the floor. We'd need to do it in the > hvc_console driver - the tty drivers share the same backend > functions so we can't hide it in the pseries put_chars function. For kernel console, I don't see why not wait forever ... If the underlying backend really lost the connection it can always return a negative error no ? Or we could have "well defined" return codes for "wait forever" vs. "drop this" ... maybe return -EAGAIN. Cheers, Ben. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: hvc_console change results in corrupt oops output 2011-07-04 13:56 ` Benjamin Herrenschmidt @ 2011-07-04 14:24 ` Hendrik Brueckner 2011-07-05 4:17 ` Tabi Timur-B04825 0 siblings, 1 reply; 10+ messages in thread From: Hendrik Brueckner @ 2011-07-04 14:24 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: borntraeger, brueckner, linuxppc-dev, Anton Blanchard On Mon, Jul 04, 2011 at 11:56:27PM +1000, Benjamin Herrenschmidt wrote: > On Mon, 2011-07-04 at 20:57 +1000, Anton Blanchard wrote: > > .../... > > > The hypervisor gives us a busy return, so we could retry a number of > > times instead of dropping it on the floor. We'd need to do it in the > > hvc_console driver - the tty drivers share the same backend > > functions so we can't hide it in the pseries put_chars function. > > For kernel console, I don't see why not wait forever ... If the > underlying backend really lost the connection it can always return a > negative error no ? > > Or we could have "well defined" return codes for "wait forever" vs. > "drop this" ... maybe return -EAGAIN. I will check this again for my hvc_iucv back-end. Meanwhile a found an old thread discussing the same issue. It covers some differences between console and ttys which actually does not matter for hvc-backend because of the shared put_chars() routine. You can read the thread on lkml.org: http://lkml.org/lkml/2009/10/15/149 Kind regards, Hendrik ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: hvc_console change results in corrupt oops output 2011-07-04 14:24 ` Hendrik Brueckner @ 2011-07-05 4:17 ` Tabi Timur-B04825 2011-07-05 6:22 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 10+ messages in thread From: Tabi Timur-B04825 @ 2011-07-05 4:17 UTC (permalink / raw) To: Hendrik Brueckner Cc: linuxppc-dev@lists.ozlabs.org, Anton Blanchard, borntraeger@de.ibm.com On Mon, Jul 4, 2011 at 9:24 AM, Hendrik Brueckner <brueckner@linux.vnet.ibm.com> wrote: > I will check this again for my hvc_iucv back-end. =A0Meanwhile a found > an old thread discussing the same issue. =A0It covers some differences > between console and ttys which actually does not matter for hvc-backend > because of the shared put_chars() routine. > > You can read the thread on lkml.org: http://lkml.org/lkml/2009/10/15/149 I started that thread. After much soul searching, we came to the conclusion that HVC is not compatible with hypervisors that return BUSY on writes. So I threw out my HVC driver and rewrote it as a standard console and TTY driver. That driver is waiting to be included in the 3.1 kernel. --=20 Timur Tabi Linux kernel developer at Freescale= ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: hvc_console change results in corrupt oops output 2011-07-05 4:17 ` Tabi Timur-B04825 @ 2011-07-05 6:22 ` Benjamin Herrenschmidt 2011-07-05 13:51 ` Tabi Timur-B04825 2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner 0 siblings, 2 replies; 10+ messages in thread From: Benjamin Herrenschmidt @ 2011-07-05 6:22 UTC (permalink / raw) To: Tabi Timur-B04825 Cc: borntraeger@de.ibm.com, Hendrik Brueckner, linuxppc-dev@lists.ozlabs.org, Anton Blanchard On Tue, 2011-07-05 at 04:17 +0000, Tabi Timur-B04825 wrote: > On Mon, Jul 4, 2011 at 9:24 AM, Hendrik Brueckner > <brueckner@linux.vnet.ibm.com> wrote: > > > I will check this again for my hvc_iucv back-end. Meanwhile a found > > an old thread discussing the same issue. It covers some differences > > between console and ttys which actually does not matter for hvc-backend > > because of the shared put_chars() routine. > > > > You can read the thread on lkml.org: http://lkml.org/lkml/2009/10/15/149 > > I started that thread. After much soul searching, we came to the > conclusion that HVC is not compatible with hypervisors that return > BUSY on writes. That is a fun conclusion considering that hvc has been written for the pseries hypervisor which ... can return BUSY on writes :-) > So I threw out my HVC driver and rewrote it as a > standard console and TTY driver. That driver is waiting to be > included in the 3.1 kernel. Sucktastic. We just need to fix HVC properly. Cheers, Ben. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: hvc_console change results in corrupt oops output 2011-07-05 6:22 ` Benjamin Herrenschmidt @ 2011-07-05 13:51 ` Tabi Timur-B04825 2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner 1 sibling, 0 replies; 10+ messages in thread From: Tabi Timur-B04825 @ 2011-07-05 13:51 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: borntraeger@de.ibm.com, Hendrik Brueckner, linuxppc-dev@lists.ozlabs.org, Anton Blanchard Benjamin Herrenschmidt wrote: > That is a fun conclusion considering that hvc has been written for the > pseries hypervisor which ... can return BUSY on writes :-) Go read the original thread. The problem is that tty writes and console=20 writes are treated the same by the hvc client driver. If a client driver=20 detects that the hypervisor is busy, it has the choice of either spinning=20 or returning right away. Spinning is not acceptable for tty output, so=20 all drivers return right away. hvc then drops the unwritten characters. According to Hendrik, this is still happening. > We just need to fix HVC properly. Where were you two years ago? I complained about the problem, and even=20 posted a hackish "fix". The response I got was tepid -- some=20 acknowledgement that the problem exists, but no real desire to fix it by=20 anyone. So I had no choice but to abandon hvc. And frankly, I still don't=20 understand why it exists. Since then, I wrote a very nice console/tty=20 driver, and I have no plans to return to hvc even if the problem is fixed. --=20 Timur Tabi Linux kernel developer at Freescale= ^ permalink raw reply [flat|nested] 10+ messages in thread
* [RFC] [PATCH] hvc_console: improve tty/console put_chars handling 2011-07-05 6:22 ` Benjamin Herrenschmidt 2011-07-05 13:51 ` Tabi Timur-B04825 @ 2011-07-05 14:28 ` Hendrik Brueckner 2011-07-06 7:49 ` Anton Blanchard ` (2 more replies) 1 sibling, 3 replies; 10+ messages in thread From: Hendrik Brueckner @ 2011-07-05 14:28 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: borntraeger@de.ibm.com, Hendrik Brueckner, Tabi Timur-B04825, linuxppc-dev@lists.ozlabs.org, Anton Blanchard Hi folks, On Tue, Jul 05, 2011 at 04:22:43PM +1000, Benjamin Herrenschmidt wrote: > On Tue, 2011-07-05 at 04:17 +0000, Tabi Timur-B04825 wrote: > > On Mon, Jul 4, 2011 at 9:24 AM, Hendrik Brueckner > > <brueckner@linux.vnet.ibm.com> wrote: > > > > I started that thread. After much soul searching, we came to the > > conclusion that HVC is not compatible with hypervisors that return > > BUSY on writes. > > That is a fun conclusion considering that hvc has been written for the > pseries hypervisor which ... can return BUSY on writes :-) > > We just need to fix HVC properly. So I took initiative and looked again into this issue. Below you can find a patch that is based on Ben's -EAGAIN idea. The hvc console layer takes care of retrying depending on the backend's return code. However, the main issue is that from a backend perspective, there is no difference between tty and console output. Because consoles, especially the preferred console, behave different than a simple tty. Blocked write to a preferred console can stop the system. So with the patch below, the backend can now indirectly control the way console output is handled for it. I still have to think if this solution is ok or if it is better to introduce a new callback to console output only (and might provide a default implemenatation similar to the patch below). NOTE: I did not yet test this patch but will do.. I just want to share it early to get feedback from you. -->8--------------------------------------------------------------------- Currently, the hvc_console_print() function drops console output if the hvc backend's put_chars() returns 0. This patch changes this behavior to allow a retry through returning -EAGAIN. This change also affects the hvc_push() function. Both functions are changed to handle -EAGAIN and to retry the put_chars() operation. If a hvc backend returns -EAGAIN, the retry handling differs: - hvc_console_print() spins to write the complete console output. - hvc_push() behaves the same way as for returning 0. Now hvc backends can indirectly control the way how console output is handled through the hvc console layer. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> --- drivers/tty/hvc/hvc_console.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -163,8 +163,10 @@ static void hvc_console_print(struct con } else { r = cons_ops[index]->put_chars(vtermnos[index], c, i); if (r <= 0) { - /* throw away chars on error */ - i = 0; + /* throw away characters on error + * but spin in case of -EAGAIN */ + if (r != -EAGAIN) + i = 0; } else if (r > 0) { i -= r; if (i > 0) @@ -448,7 +450,7 @@ static int hvc_push(struct hvc_struct *h n = hp->ops->put_chars(hp->vtermno, hp->outbuf, hp->n_outbuf); if (n <= 0) { - if (n == 0) { + if (n == 0 || n == -EAGAIN) { hp->do_wakeup = 1; return 0; } ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] [PATCH] hvc_console: improve tty/console put_chars handling 2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner @ 2011-07-06 7:49 ` Anton Blanchard 2011-07-06 7:50 ` [PATCH 1/2]: " Anton Blanchard 2011-07-06 7:51 ` [PATCH 2/2]: powerpc/pseries/hvconsole: Fix dropped console output Anton Blanchard 2 siblings, 0 replies; 10+ messages in thread From: Anton Blanchard @ 2011-07-06 7:49 UTC (permalink / raw) To: Hendrik Brueckner Cc: Tabi Timur-B04825, linuxppc-dev@lists.ozlabs.org, borntraeger@de.ibm.com Hi Hendrik, > So with the patch below, the backend can now indirectly control the > way console output is handled for it. I still have to think if this > solution is ok or if it is better to introduce a new callback to > console output only (and might provide a default implemenatation > similar to the patch below). > > NOTE: I did not yet test this patch but will do.. I just want to > share it early to get feedback from you. Thanks a lot for this. I added the pseries bits and tested it, patches to follow. Anton ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/2]: hvc_console: improve tty/console put_chars handling 2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner 2011-07-06 7:49 ` Anton Blanchard @ 2011-07-06 7:50 ` Anton Blanchard 2011-07-06 7:51 ` [PATCH 2/2]: powerpc/pseries/hvconsole: Fix dropped console output Anton Blanchard 2 siblings, 0 replies; 10+ messages in thread From: Anton Blanchard @ 2011-07-06 7:50 UTC (permalink / raw) To: Hendrik Brueckner Cc: Tabi Timur-B04825, linuxppc-dev@lists.ozlabs.org, borntraeger@de.ibm.com From: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Currently, the hvc_console_print() function drops console output if the hvc backend's put_chars() returns 0. This patch changes this behavior to allow a retry through returning -EAGAIN. This change also affects the hvc_push() function. Both functions are changed to handle -EAGAIN and to retry the put_chars() operation. If a hvc backend returns -EAGAIN, the retry handling differs: - hvc_console_print() spins to write the complete console output. - hvc_push() behaves the same way as for returning 0. Now hvc backends can indirectly control the way how console output is handled through the hvc console layer. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Acked-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> --- Index: linux-powerpc/drivers/tty/hvc/hvc_console.c =================================================================== --- linux-powerpc.orig/drivers/tty/hvc/hvc_console.c 2011-07-06 17:10:51.288360541 +1000 +++ linux-powerpc/drivers/tty/hvc/hvc_console.c 2011-07-06 17:11:16.398794913 +1000 @@ -163,8 +163,10 @@ static void hvc_console_print(struct con } else { r = cons_ops[index]->put_chars(vtermnos[index], c, i); if (r <= 0) { - /* throw away chars on error */ - i = 0; + /* throw away characters on error + * but spin in case of -EAGAIN */ + if (r != -EAGAIN) + i = 0; } else if (r > 0) { i -= r; if (i > 0) @@ -448,7 +450,7 @@ static int hvc_push(struct hvc_struct *h n = hp->ops->put_chars(hp->vtermno, hp->outbuf, hp->n_outbuf); if (n <= 0) { - if (n == 0) { + if (n == 0 || n == -EAGAIN) { hp->do_wakeup = 1; return 0; } ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/2]: powerpc/pseries/hvconsole: Fix dropped console output 2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner 2011-07-06 7:49 ` Anton Blanchard 2011-07-06 7:50 ` [PATCH 1/2]: " Anton Blanchard @ 2011-07-06 7:51 ` Anton Blanchard 2 siblings, 0 replies; 10+ messages in thread From: Anton Blanchard @ 2011-07-06 7:51 UTC (permalink / raw) To: Hendrik Brueckner Cc: Tabi Timur-B04825, linuxppc-dev@lists.ozlabs.org, borntraeger@de.ibm.com Return -EAGAIN when we get H_BUSY back from the hypervisor. This makes the hvc console driver retry, avoiding dropped printks. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> --- Index: linux-powerpc/arch/powerpc/platforms/pseries/hvconsole.c =================================================================== --- linux-powerpc.orig/arch/powerpc/platforms/pseries/hvconsole.c 2011-07-06 17:11:33.799095935 +1000 +++ linux-powerpc/arch/powerpc/platforms/pseries/hvconsole.c 2011-07-06 17:11:47.499332962 +1000 @@ -73,7 +73,7 @@ int hvc_put_chars(uint32_t vtermno, cons if (ret == H_SUCCESS) return count; if (ret == H_BUSY) - return 0; + return -EAGAIN; return -EIO; } ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2011-07-06 7:51 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-07-04 10:57 hvc_console change results in corrupt oops output Anton Blanchard 2011-07-04 13:56 ` Benjamin Herrenschmidt 2011-07-04 14:24 ` Hendrik Brueckner 2011-07-05 4:17 ` Tabi Timur-B04825 2011-07-05 6:22 ` Benjamin Herrenschmidt 2011-07-05 13:51 ` Tabi Timur-B04825 2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner 2011-07-06 7:49 ` Anton Blanchard 2011-07-06 7:50 ` [PATCH 1/2]: " Anton Blanchard 2011-07-06 7:51 ` [PATCH 2/2]: powerpc/pseries/hvconsole: Fix dropped console output Anton Blanchard
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).