linux-admin.vger.kernel.org archive mirror
* graphic chars, set-font and sed
From: Luca Ferrari @ 2005-04-12 16:25 UTC
  To: linux-admin

Hi,
I've got a few problems with semigraphic chars (those typically used in DOS or
in ncurses applications) under Linux. First of all, if I use the setfont
command on a tty I can see files with the above characters displayed correctly,
but I cannot do this on a pseudo-tty (like those opened through telnet or ssh).
Any trick for this? Second, I've noticed that sed regular expressions get
confused by the presence of multiple semigraphic chars, while a single one
seems to work OK. Does anybody know a way to "escape" those chars, in order
to make them understandable to sed and other programs?

Thanks,
Luca
-- 
Luca Ferrari,
fluca1978@infinito.it

* Re: graphic chars, set-font and sed
From: Jens Knoell @ 2005-04-12 16:44 UTC
  To: fluca1978; +Cc: linux-admin

Hi Luca,

On Tuesday 12 April 2005 10:25, Luca Ferrari <LF> wrote:
> Hi,
> I've got a few problems with semigraphic chars (those typically used in DOS
> or in ncurses applications) under Linux. First of all, if I use the setfont
> command on a tty I can see files with the above characters displayed
> correctly, but I cannot do this on a pseudo-tty (like those opened through
> telnet or ssh). Any trick for this? Second, I've noticed that sed regular
> expressions get confused by the presence of multiple semigraphic chars,
> while a single one seems to work OK. Does anybody know a way to "escape"
> those chars, in order to make them understandable to sed and other programs?
>
> Thanks,
> Luca

These fonts are usually set on the local side. That is, if you telnet or SSH
into your machine, the font used is the one on your LOCAL side, not the one on
the remote machine. Assuming that you come in from another Linux box, issue the
setfont command before you telnet/ssh over to your box.

If you use an X terminal (or a Windows telnet/SSH client) it is usually the same
thing, except that with these you usually use the GUI to change the terminal
font (which really is an X font in this case).

Can't help with the regex question though, sorry.


J

* Re: graphic chars, set-font and sed
From: Glynn Clements @ 2005-04-14 20:47 UTC
  To: fluca1978; +Cc: linux-admin


Luca Ferrari wrote:

> I've got a few problems with semigraphic chars (those typically used in DOS or
> in ncurses applications) under Linux. First of all, if I use the setfont
> command on a tty I can see files with the above characters displayed correctly,
> but I cannot do this on a pseudo-tty (like those opened through telnet or ssh).
> Any trick for this?

setfont operates at the hardware level, and is specific to the Linux
virtual terminal driver. It essentially uploads the specified font to
the graphics card. You have to run it on the system whose graphics
card you wish to reconfigure (i.e. the "terminal" end of an ssh or
telnet session, not the "server" end).

You can configure xterm etc. to use a different font; there's a
standard "VGA" font (vga.bdf) which is bundled with a few programs
which require DOS-compatible graphics (e.g. dosemu).

> Second, I've noticed that sed regular expressions get 
> confused by the presence of multiple semigraphic chars, while a single one 
> seems to work OK. Does anybody know a way to "escape" those chars, in order 
> to make them understandable to sed and other programs?

sed itself should be 8-bit clean; are you sure that this isn't an
encoding (e.g. ISO-8859-1 vs UTF-8) issue?

-- 
Glynn Clements <glynn@gclements.plus.com>

* Re: graphic chars, set-font and sed
From: Luca Ferrari @ 2005-04-20 16:46 UTC
  To: linux-admin

On Thursday 14 April 2005 22:47 Glynn Clements's cat walking on the keyboard  
wrote:

> > Second, I've noticed that sed regular expressions get
> > confused by the presence of multiple semigraphic chars, while a single
> > one seems to work OK. Does anybody know a way to "escape" those chars,
> > in order to make them understandable to sed and other programs?
>
> sed itself should be 8-bit clean; are you sure that this isn't an
> encoding (e.g. ISO-8859-1 vs UTF-8) issue?

I don't know what you mean by "encoding issue". How can I detect it?

Luca

-- 
Luca Ferrari,
fluca1978@infinito.it

* Re: graphic chars, set-font and sed
From: Glynn Clements @ 2005-04-21 15:34 UTC
  To: fluca1978; +Cc: linux-admin


Luca Ferrari wrote:

> > > Second, I've noticed that sed regular expressions get
> > > confused by the presence of multiple semigraphic chars, while a single
> > > one seems to work OK. Does anybody know a way to "escape" those chars,
> > > in order to make them understandable to sed and other programs?
> >
> > sed itself should be 8-bit clean; are you sure that this isn't an
> > encoding (e.g. ISO-8859-1 vs UTF-8) issue?
> 
> I don't know what you mean by "encoding issue". How can I detect it?

An encoding is a mechanism for representing characters as bytes. 
Examples of commonly-used encodings are ASCII, ISO-8859-1 and UTF-8.

ISO-8859-1 is a single-byte encoding. There are 191 printable
characters and 65 control characters, each encoded as a single byte.
E.g. the character "æ" (a-e ligature, code 230) is represented by the
byte 230 ("\xE6" in C notation).

UTF-8 is a multi-byte encoding. As originally designed it supports up
to 2^31 characters, each of which is encoded using between 1 and 6
bytes (RFC 3629 restricts it to the Unicode range, U+0000 to U+10FFFF,
and so to at most 4 bytes). The first 128 characters (the ASCII
subset) are encoded as a single byte; the next 1920 characters are
encoded as two bytes. E.g. the character "æ" (a-e ligature, code 230)
is represented by the byte sequence 195,166 ("\xC3\xA6" in C
notation).
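The two representations can be checked with a minimal sketch, assuming
Python 3 is available (Python is only an illustration here; nothing in
this thread depends on it):

```python
# Encode the same character under two encodings and compare the bytes.
ch = "\u00e6"  # the a-e ligature, code point 230

latin1_bytes = ch.encode("iso-8859-1")  # single-byte encoding
utf8_bytes = ch.encode("utf-8")         # multi-byte encoding

print(list(latin1_bytes))  # [230]
print(list(utf8_bytes))    # [195, 166]
```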

A sed which isn't multi-byte aware (or which is run in the C locale)
works with bytes, not characters. This means that it will work with
any single-byte encoding (e.g. ASCII and all of the ISO-8859-*
encodings), but it won't handle multi-byte encodings such as UTF-8
correctly.

If you were to use an expression such as 'æ*' (zero or more
occurrences of the æ character), it would work in ISO-8859-1 (i.e.
zero or more occurrences of byte 230) but not in UTF-8, where it would
be interpreted as byte 195 followed by zero or more occurrences of
byte 166 (the * operator means "zero or more occurrences of the
preceding byte").
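The effect of a byte-oriented 'æ*' can be imitated with a rough sketch.
This uses Python's re module on byte strings as a stand-in for a
byte-oriented sed; it is an analogy, not sed's actual implementation:

```python
import re

text = "\u00e6\u00e6\u00e6"  # three a-e ligatures in a row
expr = "\u00e6*"             # the expression from the example

# Byte-oriented matching: pattern and data are both encoded to bytes first.
m_latin1 = re.match(expr.encode("iso-8859-1"), text.encode("iso-8859-1"))
m_utf8 = re.match(expr.encode("utf-8"), text.encode("utf-8"))

latin1_len = len(m_latin1.group(0))  # 3 bytes: all three characters match
utf8_len = len(m_utf8.group(0))      # 2 bytes: only the first character

# Character-oriented matching for comparison: all three characters match.
char_len = len(re.match(expr, text).group(0))

print(latin1_len, utf8_len, char_len)
```

In the UTF-8 case the match stops after the first character, because
the byte 195 that starts the second character is not another 166.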

The semigraphic characters aren't part of the ASCII set, so the
sequence of bytes used to represent them will vary depending upon the
encoding which is used.
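As an illustration, take one semigraphic character, the horizontal
box-drawing line (Unicode U+2500). The sketch below assumes the DOS
data is in code page 437, which the thread does not confirm:

```python
# One semigraphic character, three candidate encodings.
ch = "\u2500"  # BOX DRAWINGS LIGHT HORIZONTAL

cp437_bytes = ch.encode("cp437")  # assumed DOS code page: one byte
utf8_bytes = ch.encode("utf-8")   # three bytes

# ISO-8859-1 has no box-drawing characters at all.
try:
    ch.encode("iso-8859-1")
    in_latin1 = True
except UnicodeEncodeError:
    in_latin1 = False

print(list(cp437_bytes))  # [196]
print(list(utf8_bytes))   # [226, 148, 128]
print(in_latin1)          # False
```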

Essentially, you have to bear in mind that sed's regular expressions,
and the stream of data which it processes, are sequences of bytes, not
characters.
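One rough way to find out which kind of data you have is to test
whether it is valid UTF-8. The sketch below is a heuristic, not a
definitive detector: the function name classify is made up for this
example, and data which fails the test could be in any single-byte
encoding (ISO-8859-1, CP437, ...).

```python
def classify(data: bytes) -> str:
    """Roughly classify a byte string as ascii, utf-8 or not utf-8."""
    if all(b < 0x80 for b in data):
        return "ascii"  # identical in ASCII, ISO-8859-* and UTF-8
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not utf-8"  # probably some single-byte encoding
    return "utf-8"


print(classify(b"plain text"))               # ascii
print(classify("\u2500".encode("utf-8")))    # utf-8
print(classify(b"\xc4\xc4"))                 # not utf-8
```

To check a file, read it in binary mode and pass the bytes to the
function.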

-- 
Glynn Clements <glynn@gclements.plus.com>