public inbox for linux-kernel@vger.kernel.org
* [PATCH] Stop printk printing non-printable chars
@ 2004-06-18 20:53 matthew-lkml
  2004-06-18 21:08 ` Linus Torvalds
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: matthew-lkml @ 2004-06-18 20:53 UTC (permalink / raw)
  To: linux-kernel; +Cc: torvalds

Hi,

I have had problems recently with the output from dmesg. Somewhere in
the depths of ACPI (drivers/acpi/tables.c:104) the
header->asl_compiler_id contained non-printable characters, and it made
xterm stop displaying any more output. dmesg|less had to be used as less
filters out the duff chars.

The main problem seems to be in ACPI, but I don't see any reason for
printk to even consider printing _any_ non-printable characters at all.
It makes all characters out of the range 32..126 (except for newline)
print as a '?'.

Patch is for 2.6.7.

Matthew


--- linux-2.6.7/kernel/printk.c.orig	2004-06-18 20:44:28.000000000 +0100
+++ linux-2.6.7/kernel/printk.c	2004-06-18 20:53:36.000000000 +0100
@@ -14,6 +14,8 @@
  *     manfreds@colorfullife.com
  * Rewrote bits to get rid of console_lock
  *	01Mar01 Andrew Morton <andrewm@uow.edu.au>
+ * Stop emit_log_char from emitting non-ASCII chars.
+ *  Matthew Newton, 18 June 2004 <matthew-lkml@newtoncomputing.co.uk>
  */
 
 #include <linux/kernel.h>
@@ -538,7 +540,11 @@
 			}
 			log_level_unknown = 0;
 		}
-		emit_log_char(*p);
+		if (p[0] != '\n' && (p[0] < 32 || p[0] > 126)) {
+			emit_log_char('?');
+		} else {
+			emit_log_char(*p);
+		}
 		if (*p == '\n')
 			log_level_unknown = 1;
 	}



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-18 20:53 [PATCH] Stop printk printing non-printable chars matthew-lkml
@ 2004-06-18 21:08 ` Linus Torvalds
  2004-06-18 22:44   ` Jesper Juhl
  2004-06-18 21:32 ` Jan-Benedict Glaw
  2004-06-19 11:18 ` David Woodhouse
  2 siblings, 1 reply; 25+ messages in thread
From: Linus Torvalds @ 2004-06-18 21:08 UTC (permalink / raw)
  To: matthew-lkml; +Cc: linux-kernel



On Fri, 18 Jun 2004 matthew-lkml@newtoncomputing.co.uk wrote:
> 
> The main problem seems to be in ACPI, but I don't see any reason for
> printk to even consider printing _any_ non-printable characters at all.
> It makes all characters out of the range 32..126 (except for newline)
> print as a '?'.

How about emitting them as \xxx, so that you see what they are. And using 
a case-statement to make it easy and clear when to do exceptions (I think 
we should accept \t too, no?).

		Linus


* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-18 20:53 [PATCH] Stop printk printing non-printable chars matthew-lkml
  2004-06-18 21:08 ` Linus Torvalds
@ 2004-06-18 21:32 ` Jan-Benedict Glaw
  2004-06-18 21:58   ` Pekka Pietikainen
  2004-06-19  0:03   ` matthew-lkml
  2004-06-19 11:18 ` David Woodhouse
  2 siblings, 2 replies; 25+ messages in thread
From: Jan-Benedict Glaw @ 2004-06-18 21:32 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1389 bytes --]

On Fri, 2004-06-18 21:53:55 +0100, matthew-lkml@newtoncomputing.co.uk <matthew-lkml@newtoncomputing.co.uk>
wrote in message <20040618205355.GA5286@newtoncomputing.co.uk>:

> The main problem seems to be in ACPI, but I don't see any reason for

Right.

> printk to even consider printing _any_ non-printable characters at all.

It's dandy if you pump out some data via serial link.

> It makes all characters out of the range 32..126 (except for newline)
> print as a '?'.

I don't see why that's needed. I'd say let's better fix ACPI to put
those strings as a hexdump or something like that.

>  #include <linux/kernel.h>
> @@ -538,7 +540,11 @@
>  			}
>  			log_level_unknown = 0;
>  		}
> -		emit_log_char(*p);
> +		if (p[0] != '\n' && (p[0] < 32 || p[0] > 126)) {
> +			emit_log_char('?');
> +		} else {
> +			emit_log_char(*p);
> +		}
>  		if (*p == '\n')
>  			log_level_unknown = 1;
>  	}

So you're ripping off something that could be a nice feature and place
some slow path. By the way, why do you use 'p[0]' instead of '*p'?

MfG, JBG

-- 
   Jan-Benedict Glaw       jbglaw@lug-owl.de    . +49-172-7608481
   "Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg
    fuer einen Freien Staat voll Freier Bürger" | im Internet! |   im Irak!
   ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-18 21:32 ` Jan-Benedict Glaw
@ 2004-06-18 21:58   ` Pekka Pietikainen
  2004-06-19  0:03   ` matthew-lkml
  1 sibling, 0 replies; 25+ messages in thread
From: Pekka Pietikainen @ 2004-06-18 21:58 UTC (permalink / raw)
  To: linux-kernel

On Fri, Jun 18, 2004 at 11:32:52PM +0200, Jan-Benedict Glaw wrote:
> > It makes all characters out of the range 32..126 (except for newline)
> > print as a '?'.
> 
> I don't see why that's needed. I'd say let's better fix ACPI to put
> those strings as a hexdump or something like that.
There's actually some other cases where doing the escaping would be a
good idea.

Try grabbing an old a.out binary, renaming it to something like
\n<0>Oops: , running it and see what happens...

Combined with an old-enough terminal emulator there could actually
be a bit more trouble than a messed-up screen...
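
To illustrate the concern: a reader that splits the log on newlines and
takes a leading "<N>" as the log level would treat whatever follows an
embedded newline as a fresh record. The following is only a userspace
sketch; first_forged_level is a hypothetical helper, and the message
layout is assumed for illustration rather than taken from any
particular kernel version.

```c
#include <stddef.h>

/*
 * Return the level digit of the first "<N>" that directly follows an
 * embedded newline in msg, or -1 if there is none. Such a sequence
 * would look like the start of a new log record to a line-based reader.
 */
static int first_forged_level(const char *msg)
{
	size_t i;

	for (i = 0; msg[i] != '\0'; i++) {
		/* Short-circuit evaluation keeps every index in bounds. */
		if (msg[i] == '\n' && msg[i + 1] == '<' &&
		    msg[i + 2] >= '0' && msg[i + 2] <= '7' &&
		    msg[i + 3] == '>')
			return msg[i + 2] - '0';
	}
	return -1;
}
```

A level prefix at the very start of the string is not flagged, since
that is where a genuine one would sit.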

-- 
Pekka Pietikainen


* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-18 21:08 ` Linus Torvalds
@ 2004-06-18 22:44   ` Jesper Juhl
  2004-06-18 23:52     ` matthew-lkml
  2004-06-19  1:23     ` Matthias Urlichs
  0 siblings, 2 replies; 25+ messages in thread
From: Jesper Juhl @ 2004-06-18 22:44 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: matthew-lkml, linux-kernel


On Fri, 18 Jun 2004, Linus Torvalds wrote:
>
>
> On Fri, 18 Jun 2004 matthew-lkml@newtoncomputing.co.uk wrote:
> >
> > The main problem seems to be in ACPI, but I don't see any reason for
> > printk to even consider printing _any_ non-printable characters at all.
> > It makes all characters out of the range 32..126 (except for newline)
> > print as a '?'.
>
> How about emitting them as \xxx, so that you see what they are. And using
> a case-statement to make it easy and clear when to do exceptions (I think
> we should accept \t too, no?).
>

Would there be any reason not to allow all the standard C escape sequences
- true, they are hardly used atm (I see a few \f uses with grep, but not
much else), but it's not unthinkable they could be useful somewhere in
some cases (I'm thinking \f could be useful for console on line-printer
for example, \a could be useful for critical errors on boxes without a
monitor - or maybe that's all too far fetched)...

\a	alert (bell)
\b	backspace
\f	formfeed
\n	newline
\r	carriage return
\t	horizontal tab
\v	vertical tab
\\	backslash
\?	question mark
\'	single quote
\"	double quote

(the last few are in the 32..126 range, just listing for completeness)...
none of them should cause trouble on output, so little reason to exclude
them if someone finds a use for them at some point - or am I not making
sense?


 --
Jesper Juhl <juhl-lkml@dif.dk>




* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-18 22:44   ` Jesper Juhl
@ 2004-06-18 23:52     ` matthew-lkml
  2004-06-19  4:18       ` Willy Tarreau
  2004-06-19 23:00       ` Dave Jones
  2004-06-19  1:23     ` Matthias Urlichs
  1 sibling, 2 replies; 25+ messages in thread
From: matthew-lkml @ 2004-06-18 23:52 UTC (permalink / raw)
  To: Jesper Juhl; +Cc: Linus Torvalds, linux-kernel

On Sat, Jun 19, 2004 at 12:44:55AM +0200, Jesper Juhl wrote:
> On Fri, 18 Jun 2004, Linus Torvalds wrote:
> >
> > How about emitting them as \xxx, so that you see what they are. And using
> > a case-statement to make it easy and clear when to do exceptions (I think
> > we should accept \t too, no?).
> 
> Would there be any reason not to allow all the standard C escape sequences
> - true, they are hardly used atm (I see a few \f uses with grep, but not
> much else), but it's not unthinkable they could be useful somewhere in

I must admit, I don't think I've even seen a tab before (not that you'd
actually _see_ a tab). Oh, grep tells me that powernow uses it. By the
time that gets through syslog it's changed into "^I", so it would
probably be better to not actually use tabs, either (or fix syslog).
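
(The "^I" is caret notation: a control character is shown as '^'
followed by the character 64 positions higher in ASCII, so tab, 0x09,
becomes 'I', 0x49. A small userspace sketch follows; caret_notation is a
hypothetical helper name and this does not claim to reproduce what any
specific syslog daemon does.)

```c
/*
 * Write the caret notation for control character c into out (3 bytes).
 * Returns 1 if c is an ASCII control character (0..31 or 127), else 0,
 * in which case out is left untouched.
 */
static int caret_notation(unsigned char c, char out[3])
{
	if (c > 31 && c != 127)
		return 0;
	out[0] = '^';
	out[1] = (char)(c ^ 0x40);	/* 0x09 -> 'I', 0x7f -> '?' */
	out[2] = '\0';
	return 1;
}
```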

New patch below outputs as \xxx if it's not a "nice" character.  "Nice"
is now 32..126, \n and \t.


--- linux-2.6.7/kernel/printk.c.orig	2004-06-18 20:44:28.000000000 +0100
+++ linux-2.6.7/kernel/printk.c	2004-06-19 00:11:30.000000000 +0100
@@ -14,6 +14,8 @@
  *     manfreds@colorfullife.com
  * Rewrote bits to get rid of console_lock
  *	01Mar01 Andrew Morton <andrewm@uow.edu.au>
+ * Stop emit_log_char from emitting non-ASCII chars.
+ *  Matthew Newton, 18 June 2004 <matthew-lkml@newtoncomputing.co.uk>
  */
 
 #include <linux/kernel.h>
@@ -472,6 +474,17 @@
 }
 
 /*
+ * Emit character in numeric (octal) form
+ */
+static void emit_log_char_octal(char c)
+{
+	emit_log_char('\\');
+	emit_log_char(((c >> 6) & 3) + '0');
+	emit_log_char(((c >> 3) & 7) + '0');
+	emit_log_char((c & 7) + '0');
+}
+
+/*
  * Zap console related locks when oopsing. Only zap at most once
  * every 10 seconds, to leave time for slow consoles to print a
  * full oops.
@@ -538,7 +551,17 @@
 			}
 			log_level_unknown = 0;
 		}
-		emit_log_char(*p);
+		switch (*p) {
+			case '\n':
+			case '\t':
+				emit_log_char(*p);
+				break;
+			default:
+				if (*p > 31 && *p < 127)
+					emit_log_char(*p);
+				else
+					emit_log_char_octal(*p);
+		}
 		if (*p == '\n')
 			log_level_unknown = 1;
 	}



-- 
Matthew


* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-18 21:32 ` Jan-Benedict Glaw
  2004-06-18 21:58   ` Pekka Pietikainen
@ 2004-06-19  0:03   ` matthew-lkml
  2004-06-19  8:31     ` Jan-Benedict Glaw
  1 sibling, 1 reply; 25+ messages in thread
From: matthew-lkml @ 2004-06-19  0:03 UTC (permalink / raw)
  To: linux-kernel

On Fri, Jun 18, 2004 at 11:32:52PM +0200, Jan-Benedict Glaw wrote:
> On Fri, 2004-06-18 21:53:55 +0100, matthew-lkml@newtoncomputing.co.uk <matthew-lkml@newtoncomputing.co.uk>
> wrote in message <20040618205355.GA5286@newtoncomputing.co.uk>:
> > printk to even consider printing _any_ non-printable characters at all.
> 
> It's dandy if you pump out some data via serial link.

Is printk ever used to send anything out via a serial link? I assumed it
was only kernel log messages (that should really be fairly sane). Log
messages sent to serial printer, etc, don't want dodgy chars in them
that may mess up the printer, do they?

> 
> > It makes all characters out of the range 32..126 (except for newline)
> > print as a '?'.
> 
> I don't see why that's needed. I'd say let's better fix ACPI to put
> those strings as a hexdump or something like that.

Looking at the ACPI code (and not understanding it too well) it looks
like this data is retrieved from the BIOS, but is only printed here for
info and not actually used anywhere. In this case, I'd think there isn't
a lot of point checking for data correctness in the ACPI code. As
someone else pointed out, though, other things can cause the kernel log
to print nasty chars that are unwanted, so there should really be a
check here anyway.

> > +		if (p[0] != '\n' && (p[0] < 32 || p[0] > 126)) {
> 
> So you're ripping off something that could be a nice feature and place
> some slow path. By the way, why do you use 'p[0]' instead of '*p'?

The string has just been through the equivalent of sprintf, so I guess
this is hardly going to slow it down much more. Used p[0] to look
tidier, matching another "if" statement 5 lines up. In new patch used
*p, as it doesn't really matter.

Thanks,

-- 
Matthew


* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-18 22:44   ` Jesper Juhl
  2004-06-18 23:52     ` matthew-lkml
@ 2004-06-19  1:23     ` Matthias Urlichs
  2004-06-19  1:43       ` Jesper Juhl
  1 sibling, 1 reply; 25+ messages in thread
From: Matthias Urlichs @ 2004-06-19  1:23 UTC (permalink / raw)
  To: linux-kernel

Hi, Jesper Juhl wrote:

> [ printing control characters as "meaningful" C escapes ]
> or am I not making sense?

No, you're not. ;-)

Reason: They're not intended to be meaningful. If the kernel prints them,
the reason isn't that somebody actually used an \a or \v in there, so
doing that isn't helpful. (Quick, what's the ASCII for \v?)

-- 
Matthias Urlichs


* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-19  1:23     ` Matthias Urlichs
@ 2004-06-19  1:43       ` Jesper Juhl
  2004-06-19 10:20         ` Matthias Urlichs
  0 siblings, 1 reply; 25+ messages in thread
From: Jesper Juhl @ 2004-06-19  1:43 UTC (permalink / raw)
  To: Matthias Urlichs; +Cc: linux-kernel

On Sat, 19 Jun 2004, Matthias Urlichs wrote:

> Hi, Jesper Juhl wrote:
>
> > [ printing control characters as "meaningful" C escapes ]
> > or am I not making sense?
>
> No, you're not. ;-)
>
Ok, I had a feeling that might be so. But I did not intend them to be
printed as '"meaningful" C escapes', I meant "why filter out \v or \f,
someone might find a clever use for them and they do no real harm
otherwise"...


> Reason: They're not intended to be meaningful. If the kernel prints them,
> the reason isn't that somebody actually used an \a or \v in there, so
> doing that isn't helpful. (Quick, what's the ASCII for \v?)
>
What I meant was not for the kernel to attempt to print something like \a
, but it could be useful for its original purpose of making a sound.. If
it's simply filtering out what goes to the screen (log, serial line,
whatever), but not preventing other uses, then my comments made no
sense... and 0x0B is \v I believe...


--
Jesper Juhl <juhl-lkml@dif.dk>



* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-18 23:52     ` matthew-lkml
@ 2004-06-19  4:18       ` Willy Tarreau
  2004-06-19 10:27         ` Matthias Urlichs
  2004-06-19 23:00       ` Dave Jones
  1 sibling, 1 reply; 25+ messages in thread
From: Willy Tarreau @ 2004-06-19  4:18 UTC (permalink / raw)
  To: matthew-lkml; +Cc: Jesper Juhl, Linus Torvalds, linux-kernel

Hi,

On Sat, Jun 19, 2004 at 12:52:23AM +0100, matthew-lkml@newtoncomputing.co.uk wrote:
>  /*
> + * Emit character in numeric (octal) form
> + */

Don't most of us handle hex easier than octal ? I'd prefer to read \xE9
than \351, but that's only personal taste.

> -		emit_log_char(*p);
> +		switch (*p) {
> +			case '\n':
> +			case '\t':
> +				emit_log_char(*p);
> +				break;

Logically, if you use '\' as an escape char, you should also protect it
to avoid confusion, because if you read "\351", you won't know if the
function really sent these four chars or only the \xE9 char.
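
A userspace sketch of the ambiguity, assuming an octal escaper that
passes the backslash through unchanged (escape_octal_naive is a
hypothetical name, not the function from the patch):

```c
#include <stddef.h>

/*
 * Octal-escape src into dst, passing newline, tab and printable ASCII
 * (including the backslash itself) through unchanged.
 * dst must hold at least 4 * strlen(src) + 1 bytes.
 */
static void escape_octal_naive(const char *src, char *dst)
{
	size_t n = 0;

	for (; *src != '\0'; src++) {
		unsigned char c = (unsigned char)*src;

		if (c == '\n' || c == '\t' || (c > 31 && c < 127)) {
			dst[n++] = (char)c;
		} else {
			dst[n++] = '\\';
			dst[n++] = (char)('0' + ((c >> 6) & 3));
			dst[n++] = (char)('0' + ((c >> 3) & 7));
			dst[n++] = (char)('0' + (c & 7));
		}
	}
	dst[n] = '\0';
}
```

With this, the single byte 0xE9 and the four literal characters '\',
'3', '5', '1' produce the same output, so the reader cannot tell them
apart.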

Another way to do it would be to display "<XX>" like less, but '<' and '>'
are sensible because they're used to indicate the log level.

Regards,
Willy



* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-19  0:03   ` matthew-lkml
@ 2004-06-19  8:31     ` Jan-Benedict Glaw
  0 siblings, 0 replies; 25+ messages in thread
From: Jan-Benedict Glaw @ 2004-06-19  8:31 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1302 bytes --]

On Sat, 2004-06-19 01:03:30 +0100, matthew-lkml@newtoncomputing.co.uk <matthew-lkml@newtoncomputing.co.uk>
wrote in message <20040619000330.GC5286@newtoncomputing.co.uk>:
> On Fri, Jun 18, 2004 at 11:32:52PM +0200, Jan-Benedict Glaw wrote:
> > On Fri, 2004-06-18 21:53:55 +0100, matthew-lkml@newtoncomputing.co.uk <matthew-lkml@newtoncomputing.co.uk>
> > wrote in message <20040618205355.GA5286@newtoncomputing.co.uk>:

> > It's dandy if you pump out some data via serial link.
> 
> Is printk ever used to send anything out via a serial link? I assumed it
> was only kernel log messages (that should really be fairly sane). Log
> messages sent to serial printer, etc, don't want dodgy chars in them
> that may mess up the printer, do they?

printk() is a printf() like interface to the log buffer. From there, all
console drivers are fed, and in development (not for office or private
use), you often attach a serial console to capture Oops messages and the
like.

MfG, JBG

-- 
   Jan-Benedict Glaw       jbglaw@lug-owl.de    . +49-172-7608481
   "Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg
    fuer einen Freien Staat voll Freier Bürger" | im Internet! |   im Irak!
   ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-19  1:43       ` Jesper Juhl
@ 2004-06-19 10:20         ` Matthias Urlichs
  0 siblings, 0 replies; 25+ messages in thread
From: Matthias Urlichs @ 2004-06-19 10:20 UTC (permalink / raw)
  To: linux-kernel

Hi, Jesper Juhl wrote:

> But I did not intend them to be
> printed as '"meaningful" C escapes', I meant "why filter out \v or \f,
> someone might find a clever use for them and they do no real harm
> otherwise"...

On the console, from the kernel? No such use exists today.

IMHO: Filter them out. If (big IF, methinks) somebody thinks of something
that actually makes sense, they can add an exception.

-- 
Matthias Urlichs


* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-19  4:18       ` Willy Tarreau
@ 2004-06-19 10:27         ` Matthias Urlichs
  0 siblings, 0 replies; 25+ messages in thread
From: Matthias Urlichs @ 2004-06-19 10:27 UTC (permalink / raw)
  To: linux-kernel

Hi, Willy Tarreau wrote:

> Another way to do it would be to display "<XX>" like less, but '<' and '>'
> are sensible because they're used to indicate the log level.

Umm, they're not sensible. (You're mixing up word meanings.)
Log levels are not a big problem, because the log level is always printed
first. Usually.

IMHO, the stuff should make sense to a human reader, and it should be
possible to figure out quickly and unambiguously what the random bit
pattern that inadvertently got printed actually *is*.

That means: Escape anything <\x20 except \n. Escape \\. Escape anything
with the 8th bit set. Use hex escapes \x## -- let's face it, they're as
long as octal and easier to read these days.
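
A userspace sketch of that rule, reading the lower bound as "below
space (0x20)", which is an assumption about the intended cut-off.
escape_hex is a hypothetical name, and escaping DEL (0x7f) is an
addition that the rule above does not mention.

```c
#include <stddef.h>

/*
 * Escape src into dst using \x## hex escapes. Newline passes through;
 * other bytes below 0x20 and bytes with the 8th bit set are escaped,
 * and a backslash is doubled so the output stays unambiguous.
 * DEL (0x7f) is escaped as well; that part is an assumption.
 * dst must hold at least 4 * strlen(src) + 1 bytes.
 */
static void escape_hex(const char *src, char *dst)
{
	static const char hex[] = "0123456789abcdef";
	size_t n = 0;

	for (; *src != '\0'; src++) {
		unsigned char c = (unsigned char)*src;

		if (c == '\\') {
			dst[n++] = '\\';
			dst[n++] = '\\';
		} else if (c == '\n' || (c >= 0x20 && c < 0x7f)) {
			dst[n++] = (char)c;
		} else {
			dst[n++] = '\\';
			dst[n++] = 'x';
			dst[n++] = hex[c >> 4];
			dst[n++] = hex[c & 0x0f];
		}
	}
	dst[n] = '\0';
}
```

Because the backslash is doubled, a literal "\xe9" in the input comes
out as "\\xe9" and can no longer be confused with an escaped 0xE9 byte.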

-- 
Matthias Urlichs


* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-18 20:53 [PATCH] Stop printk printing non-printable chars matthew-lkml
  2004-06-18 21:08 ` Linus Torvalds
  2004-06-18 21:32 ` Jan-Benedict Glaw
@ 2004-06-19 11:18 ` David Woodhouse
  2004-06-19 15:49   ` matthew-lkml
  2 siblings, 1 reply; 25+ messages in thread
From: David Woodhouse @ 2004-06-19 11:18 UTC (permalink / raw)
  To: matthew-lkml; +Cc: linux-kernel, torvalds

On Fri, 2004-06-18 at 21:53 +0100, matthew-lkml@newtoncomputing.co.uk
wrote:
> The main problem seems to be in ACPI, but I don't see any reason for
> printk to even consider printing _any_ non-printable characters at all.
> It makes all characters out of the range 32..126 (except for newline)
> print as a '?'.

Please don't do that -- it makes printing UTF-8 impossible. While I'd
not argue that now is the time to start outputting UTF-8 all over the
place, I wouldn't accept that it's a good time to _prevent_ it either,
as your patch would do.
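
To make that concrete: every byte of a multi-byte UTF-8 sequence has its
high bit set, so a 32..126 test replaces each of them. A userspace
sketch of the rule from the first patch (filter_ascii is a hypothetical
stand-in for the patched loop, not kernel code):

```c
/*
 * Apply the rule from the first patch in place: any byte other than
 * newline that falls outside 32..126 becomes '?'.
 */
static void filter_ascii(char *s)
{
	for (; *s != '\0'; s++) {
		unsigned char c = (unsigned char)*s;

		if (c != '\n' && (c < 32 || c > 126))
			*s = '?';
	}
}
```

An e-acute is 0xC3 0xA9 in UTF-8, so "caf\xC3\xA9" comes out as "caf??".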

If you want to post-process printk output, don't do it in the kernel. 

I'd suggest that in this instance you should be fixing the ACPI code
instead, so it doesn't print the characters to which you object.

-- 
dwmw2




* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-19 11:18 ` David Woodhouse
@ 2004-06-19 15:49   ` matthew-lkml
  2004-06-19 16:09     ` Arjan van de Ven
                       ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: matthew-lkml @ 2004-06-19 15:49 UTC (permalink / raw)
  To: David Woodhouse; +Cc: linux-kernel

On Sat, Jun 19, 2004 at 12:18:24PM +0100, David Woodhouse wrote:
> On Fri, 2004-06-18 at 21:53 +0100, matthew-lkml@newtoncomputing.co.uk
> wrote:
> > The main problem seems to be in ACPI, but I don't see any reason for
> > printk to even consider printing _any_ non-printable characters at all.
> > It makes all characters out of the range 32..126 (except for newline)
> > print as a '?'.
> 
> Please don't do that -- it makes printing UTF-8 impossible. While I'd
> not argue that now is the time to start outputting UTF-8 all over the
> place, I wouldn't accept that it's a good time to _prevent_ it either,
> as your patch would do.

Please forgive me if I'm wrong on this, but I seem to remember reading
something a while ago indicating that the kernel is and always will be
internally English (i.e. debugging messages and the like) as there is no
need to bloat it with many different languages (that can be done in
userspace). As printk is really just a log system, I personally don't
see any way that it should ever print anything other than ASCII.

(Yes, of course some parts of the kernel, like filesystems, do need to
be able to handle UTF-8 etc, but not all.)

-- 
Matthew



* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-19 15:49   ` matthew-lkml
@ 2004-06-19 16:09     ` Arjan van de Ven
  2004-06-20  2:19     ` Horst von Brand
  2004-06-20 14:17     ` David Woodhouse
  2 siblings, 0 replies; 25+ messages in thread
From: Arjan van de Ven @ 2004-06-19 16:09 UTC (permalink / raw)
  To: matthew-lkml; +Cc: David Woodhouse, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1261 bytes --]

On Sat, 2004-06-19 at 17:49, matthew-lkml@newtoncomputing.co.uk wrote:
> On Sat, Jun 19, 2004 at 12:18:24PM +0100, David Woodhouse wrote:
> > On Fri, 2004-06-18 at 21:53 +0100, matthew-lkml@newtoncomputing.co.uk
> > wrote:
> > > The main problem seems to be in ACPI, but I don't see any reason for
> > > printk to even consider printing _any_ non-printable characters at all.
> > > It makes all characters out of the range 32..126 (except for newline)
> > > print as a '?'.
> > 
> > Please don't do that -- it makes printing UTF-8 impossible. While I'd
> > not argue that now is the time to start outputting UTF-8 all over the
> > place, I wouldn't accept that it's a good time to _prevent_ it either,
> > as your patch would do.
> 
> Please forgive me if I'm wrong on this, but I seem to remember reading
> something a while ago indicating that the kernel is and always will be
> internally English (i.e. debugging messages and the like) as there is no
> need to bloat it with many different languages (that can be done in
> userspace). As printk is really just a log system, I personally don't
> see any way that it should ever print anything other than ASCII.

english != no-UTF8.

Names of people and things still can be UTF8 ....


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


* Re: [PATCH] Stop printk printing non-printable chars
@ 2004-06-19 20:12 Albert Cahalan
  2004-06-19 22:56 ` Jan-Benedict Glaw
  0 siblings, 1 reply; 25+ messages in thread
From: Albert Cahalan @ 2004-06-19 20:12 UTC (permalink / raw)
  To: linux-kernel mailing list; +Cc: dwmw2

David Woodhouse writes:

> Please don't do that -- it makes printing UTF-8 impossible.
> While I'd not argue that now is the time to start outputting
> UTF-8 all over the place, I wouldn't accept that it's a good
> time to _prevent_ it either, as your patch would do.
>
> If you want to post-process printk output, don't do it in the kernel. 

It is dangerous to let the 0x9b character go out
to a serial console. It means the same as ESC [ does
when you have a normal 8-bit terminal.

Non-canonical UTF-8 encoding for ESC and other
troublesome characters may also cause problems.
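
For illustration: ESC is 0x1B, and its non-canonical (overlong) two-byte
UTF-8 form would be 0xC0 0x9B, so the second byte is itself 0x9b. A
userspace sketch of two checks follows. Both helper names are
hypothetical, only the two-byte overlong form is covered, and the 0x9b
check is only meaningful for a terminal that interprets single bytes as
8-bit controls, since 0x9b is also a legitimate continuation byte in
valid UTF-8.

```c
#include <stddef.h>

/* Return 1 if s[0..len) contains the 8-bit CSI byte 0x9b, else 0. */
static int has_c1_csi(const unsigned char *s, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (s[i] == 0x9b)
			return 1;
	return 0;
}

/*
 * Return 1 if s[0..len) contains lead byte 0xC0 or 0xC1. These never
 * occur in valid UTF-8: they could only start a two-byte sequence that
 * encodes a code point below 0x80, i.e. an overlong form.
 */
static int has_overlong_ascii(const unsigned char *s, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (s[i] == 0xC0 || s[i] == 0xC1)
			return 1;
	return 0;
}
```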




* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-19 20:12 Albert Cahalan
@ 2004-06-19 22:56 ` Jan-Benedict Glaw
  0 siblings, 0 replies; 25+ messages in thread
From: Jan-Benedict Glaw @ 2004-06-19 22:56 UTC (permalink / raw)
  To: linux-kernel mailing list

[-- Attachment #1: Type: text/plain, Size: 1228 bytes --]

On Sat, 2004-06-19 16:12:00 -0400, Albert Cahalan <albert@users.sf.net>
wrote in message <1087675920.9831.941.camel@cube>:
> David Woodhouse writes:
> > Please don't do that -- it makes printing UTF-8 impossible.
> > While I'd not argue that now is the time to start outputting
> > UTF-8 all over the place, I wouldn't accept that it's a good
> > time to _prevent_ it either, as your patch would do.
> >
> > If you want to post-process printk output, don't do it in the kernel. 
> 
> It is dangerous to let the 0x9b character go out
> to a serial console. It means the same as ESC [ does
> when you have a normal 8-bit terminal.

Get real: either you *want* to get those codes interpreted (think about
full-blown ncurses apps being run over serial link), or you *don't* (think
about simply recording serial console's output). You just have to choose
the correct application for your task.

MfG, JBG

-- 
   Jan-Benedict Glaw       jbglaw@lug-owl.de    . +49-172-7608481
   "Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg
    fuer einen Freien Staat voll Freier Bürger" | im Internet! |   im Irak!
   ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-18 23:52     ` matthew-lkml
  2004-06-19  4:18       ` Willy Tarreau
@ 2004-06-19 23:00       ` Dave Jones
  1 sibling, 0 replies; 25+ messages in thread
From: Dave Jones @ 2004-06-19 23:00 UTC (permalink / raw)
  To: matthew-lkml; +Cc: Jesper Juhl, Linus Torvalds, linux-kernel

On Sat, Jun 19, 2004 at 12:52:23AM +0100, matthew-lkml@newtoncomputing.co.uk wrote:

 > I must admit, I don't think I've even seen a tab before (not that you'd
 > actually _see_ a tab). Oh, grep tells me that powernow uses it. By the
 > time that gets through syslog it's changed into "^I", so it would
 > probably be better to not actually use tabs, either (or fix syslog).

I've been meaning to fix that for a while, and kept forgetting
about it.  I just fixed it in my local cpufreq tree, and will
push it along with the next lot of updates.

Thanks,

		Dave



* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-19 15:49   ` matthew-lkml
  2004-06-19 16:09     ` Arjan van de Ven
@ 2004-06-20  2:19     ` Horst von Brand
  2004-06-20 14:17     ` David Woodhouse
  2 siblings, 0 replies; 25+ messages in thread
From: Horst von Brand @ 2004-06-20  2:19 UTC (permalink / raw)
  To: matthew-lkml; +Cc: David Woodhouse, linux-kernel

matthew-lkml@newtoncomputing.co.uk said:
> On Sat, Jun 19, 2004 at 12:18:24PM +0100, David Woodhouse wrote:

[...]

> > Please don't do that -- it makes printing UTF-8 impossible. While I'd
> > not argue that now is the time to start outputting UTF-8 all over the
> > place, I wouldn't accept that it's a good time to _prevent_ it either,
> > as your patch would do.

> Please forgive me if I'm wrong on this, but I seem to remember reading
> something a while ago indicating that the kernel is and always will be
> internally English (i.e. debugging messages and the like) as there is no
> need to bloat it with many different languages (that can be done in
> userspace). As printk is really just a log system, I personally don't
> see any way that it should ever print anything other than ASCII.

Messages including user-level stuff (file names, ...) could very well be
UTF-8.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513


* Re: [PATCH] Stop printk printing non-printable chars
@ 2004-06-20  4:02 Albert Cahalan
  2004-06-20  8:38 ` David Woodhouse
  2004-06-20  8:49 ` Jan-Benedict Glaw
  0 siblings, 2 replies; 25+ messages in thread
From: Albert Cahalan @ 2004-06-20  4:02 UTC (permalink / raw)
  To: linux-kernel mailing list
  Cc: jbglaw, dwmw2, arjanv, Linus Torvalds, jbglaw, matthew-lkml

>> It is dangerous to let the 0x9b character go out
>> to a serial console. It means the same as ESC [ does
>> when you have a normal 8-bit terminal.
>
> Get real: either you *want* to get those codes
> interpreted (think about full-blown ncurses apps
> being run over serial link), or you *don't* (think
> about simply recording serial console's output).
> You just have to choose the correct application
> for your task.

If there are full-blown ncurses apps being routed
through printk -- that is, the KERNEL log -- then
we have far bigger issues.

The 0x9b character must be blocked, at least when
a serial console is in use. (apps on such a console
may of course use 0x9b as desired -- just not printk)






* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-20  4:02 Albert Cahalan
@ 2004-06-20  8:38 ` David Woodhouse
  2004-06-20  8:49 ` Jan-Benedict Glaw
  1 sibling, 0 replies; 25+ messages in thread
From: David Woodhouse @ 2004-06-20  8:38 UTC (permalink / raw)
  To: Albert Cahalan
  Cc: linux-kernel mailing list, jbglaw, arjanv, Linus Torvalds,
	matthew-lkml

On Sun, 2004-06-20 at 00:02 -0400, Albert Cahalan wrote:
> The 0x9b character must be blocked, 

Why do you say 'must be blocked' instead of 'should not be printed'?

The former implies some kind of post-processing to work around buggy
kernel code printing crap. Surely it's better just to refrain from
printing the crap in the first place?

Btw, your replies lack a correct References: and In-Reply-To: header, in
violation of a 'SHOULD' in RFC2822 §3.6.4. Please fix this if you wish
to participate in public fora.

-- 
dwmw2




* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-20  4:02 Albert Cahalan
  2004-06-20  8:38 ` David Woodhouse
@ 2004-06-20  8:49 ` Jan-Benedict Glaw
  1 sibling, 0 replies; 25+ messages in thread
From: Jan-Benedict Glaw @ 2004-06-20  8:49 UTC (permalink / raw)
  To: Albert Cahalan
  Cc: linux-kernel mailing list, dwmw2, arjanv, Linus Torvalds,
	matthew-lkml

[-- Attachment #1: Type: text/plain, Size: 1397 bytes --]

On Sun, 2004-06-20 00:02:58 -0400, Albert Cahalan <albert@users.sf.net>
wrote in message <1087704177.8185.951.camel@cube>:
> > Get real: either you *want* to get those codes
> > interpreted (think about full-blown ncurses apps
> > being run over serial link), or you *don't* (think
> > about simply recording serial console's output).
> > You just have to choose the correct application
> > for your task.
> 
> If there are full-blown ncurses apps being routed
> through printk -- that is, the KERNEL log -- then
> we have far bigger issues.

You're right, I mixed up the ways for console output and for userspace
apps using a shell over the very same serial line (what I commonly do).

> The 0x9b character must be blocked, at least when
> a serial console is in use. (apps on such a console
> may of course use 0x9b as desired -- just not printk)

I still don't see a need to filter out *anything*. To be exact, I've
seen other places where a \r is added when a \n occurs. That isn't
always a great idea either...  Just keep it as it is; I think it's fine...

MfG, JBG

-- 
   Jan-Benedict Glaw       jbglaw@lug-owl.de    . +49-172-7608481
   "Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg
    fuer einen Freien Staat voll Freier Bürger" | im Internet! |   im Irak!
   ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));



* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-19 15:49   ` matthew-lkml
  2004-06-19 16:09     ` Arjan van de Ven
  2004-06-20  2:19     ` Horst von Brand
@ 2004-06-20 14:17     ` David Woodhouse
  2004-06-20 20:06       ` Jeff Woods
  2 siblings, 1 reply; 25+ messages in thread
From: David Woodhouse @ 2004-06-20 14:17 UTC (permalink / raw)
  To: matthew-lkml; +Cc: linux-kernel

On Sat, 2004-06-19 at 16:49 +0100, matthew-lkml@newtoncomputing.co.uk
wrote:
> Please forgive me if I'm wrong on this, but I seem to remember reading
> something a while ago indicating that the kernel is and always will be
> internally English (i.e. debugging messages and the like) as there is no
> need to bloat it with many different languages (that can be done in
> userspace). As printk is really just a log system, I personally don't
> see any way that it should ever print anything other than ASCII.

It's very naïve of you to think that English means nothing but ASCII.
Non-ASCII characters play a very important rôle even in English
communication.

-- 
dwmw2



* Re: [PATCH] Stop printk printing non-printable chars
  2004-06-20 14:17     ` David Woodhouse
@ 2004-06-20 20:06       ` Jeff Woods
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff Woods @ 2004-06-20 20:06 UTC (permalink / raw)
  To: David Woodhouse; +Cc: matthew-lkml, linux-kernel

An LKML meta-patch:

@On Sat, 2004-06-19 at 16:49 +0100, matthew-lkml@newtoncomputing.co.uk wrote:
  Please forgive me if I'm wrong on this, but I seem to remember reading
  something a while ago indicating that the kernel is and always will be
-internally English (i.e. debugging messages and the like) as there is no
+internally that subset of English representable in 7-bit ASCII
+(i.e. debugging messages and the like) as there is no
  need to bloat it with many different languages (that can be done in
  userspace). As printk is really just a log system, I personally don't
  see any way that it should ever print anything other than ASCII.

P.S.  At 6/20/2004 03:17 PM +0100, David Woodhouse wrote:
>It's very naïve of you to think that English means nothing but ASCII.
>Non-ASCII characters play a very important rôle even in English
>communication.

Thank you for illustrating the point that maximizing portability and 
compatibility favors simplicity.
--
Jeff Woods <kazrak+kernel@cesmail.net> 




end of thread, other threads:[~2004-06-21 13:09 UTC | newest]

Thread overview: 25+ messages
2004-06-18 20:53 [PATCH] Stop printk printing non-printable chars matthew-lkml
2004-06-18 21:08 ` Linus Torvalds
2004-06-18 22:44   ` Jesper Juhl
2004-06-18 23:52     ` matthew-lkml
2004-06-19  4:18       ` Willy Tarreau
2004-06-19 10:27         ` Matthias Urlichs
2004-06-19 23:00       ` Dave Jones
2004-06-19  1:23     ` Matthias Urlichs
2004-06-19  1:43       ` Jesper Juhl
2004-06-19 10:20         ` Matthias Urlichs
2004-06-18 21:32 ` Jan-Benedict Glaw
2004-06-18 21:58   ` Pekka Pietikainen
2004-06-19  0:03   ` matthew-lkml
2004-06-19  8:31     ` Jan-Benedict Glaw
2004-06-19 11:18 ` David Woodhouse
2004-06-19 15:49   ` matthew-lkml
2004-06-19 16:09     ` Arjan van de Ven
2004-06-20  2:19     ` Horst von Brand
2004-06-20 14:17     ` David Woodhouse
2004-06-20 20:06       ` Jeff Woods
  -- strict thread matches above, loose matches on Subject: below --
2004-06-19 20:12 Albert Cahalan
2004-06-19 22:56 ` Jan-Benedict Glaw
2004-06-20  4:02 Albert Cahalan
2004-06-20  8:38 ` David Woodhouse
2004-06-20  8:49 ` Jan-Benedict Glaw
