--- utf-8.7~	2011-02-27 18:26:48.000000000 +0100
+++ utf-8.7	2011-02-27 18:24:22.000000000 +0100
@@ -42,8 +42,10 @@
 parts of many 16-bit characters bytes
 like \(aq\\0\(aq or \(aq/\(aq which have a
 special meaning in filenames and other C library function arguments.
-In addition, the majority of UNIX tools expects ASCII files and can't
-read 16-bit words as characters without major modifications.
+In addition, the majority of UNIX tools expect
+.B ASCII
+files and can't read 16-bit words as characters without major
+modifications.
 For these reasons,
 .B UCS-2
 is not a suitable external encoding of
@@ -51,7 +53,9 @@
 in filenames, text files, environment variables, etc.
 The
 .BR "ISO 10646 Universal Character Set (UCS)" ,
-a superset of Unicode, occupies even a 31-bit code space and the obvious
+a superset of
+.BR Unicode ,
+occupies even a 31-bit code space and the obvious
 .B UCS-4
 encoding for it (a sequence of 32-bit words) has the same problems.
 
@@ -73,10 +77,13 @@
 .B UCS
 characters 0x00000000 to 0x0000007f (the classic
 .B US-ASCII
-characters) are encoded simply as bytes 0x00 to 0x7f (ASCII
+characters) are encoded simply as bytes 0x00 to 0x7f
+.RB ( ASCII
 compatibility).
 This means that files and strings which contain only
-7-bit ASCII characters have the same encoding under both
+7-bit
+.B ASCII
+characters have the same encoding under both
 .B ASCII
 and
 .BR UTF-8 .
@@ -85,7 +92,8 @@
 All
 .B UCS
 characters greater than 0x7f are encoded as a multibyte sequence
-consisting only of bytes in the range 0x80 to 0xfd, so no ASCII
+consisting only of bytes in the range 0x80 to 0xfd, so no
+.B ASCII
 byte can appear as part of another character and there are no
 problems with, for example,  \(aq\\0\(aq or \(aq/\(aq.
 .TP
@@ -95,7 +103,9 @@
 strings is preserved.
 .TP
 *
-All possible 2^31 UCS codes can be encoded using
+All possible 2^31
+.B UCS
+codes can be encoded using
 .BR UTF-8 .
 .TP
 *
@@ -104,7 +114,8 @@
 encoding.
 .TP
 *
-The first byte of a multibyte sequence which represents a single non-ASCII
+The first byte of a multibyte sequence which represents a single
+.RB non- ASCII
 .B UCS
 character is always in the range 0xc0 to 0xfd and indicates how long
 this multibyte sequence is.
@@ -119,12 +130,15 @@
 .B UCS
 characters may be up to six bytes long, however the
 .B Unicode
-standard specifies no characters above 0x10ffff, so Unicode characters
-can only be up to four bytes long in
+standard specifies no characters above 0x10ffff, so
+.B Unicode
+characters can only be up to four bytes long in
 .BR UTF-8 .
 .SS Encoding
 The following byte sequences are used to represent a character.
-The sequence to be used depends on the UCS code number of the character:
+The sequence to be used depends on the
+.B UCS
+code number of the character:
 .TP 0.4i
 0x00000000 \- 0x0000007F:
 .RI 0 xxxxxxx
@@ -168,15 +182,19 @@
 .PP
 The
 .B UCS
-code values 0xd800\(en0xdfff (UTF-16 surrogates) as well as 0xfffe and
-0xffff (UCS noncharacters) should not appear in conforming
+code values 0xd800\(en0xdfff
+.RB ( UTF-16
+surrogates) as well as 0xfffe and 0xffff
+.RB ( UCS
+noncharacters) should not appear in conforming
 .B UTF-8
 streams.
 .SS Example
 The
 .B Unicode
-character 0xa9 = 1010 1001 (the copyright sign) is encoded
-in UTF-8 as
+character 0xa9 = 1010 1001 (the copyright sign) is encoded in
+.B UTF-8
+as
 .PP
 .RS
 11000010 10101001 = 0xc2 0xa9
@@ -256,8 +274,12 @@
 ("\\x1b%G").
 The corresponding return sequence from
 .B UTF-8
-to ISO 2022 is ESC % @ ("\\x1b%@").
-Other ISO 2022 sequences (such as
+to
+.B ISO 2022
+is ESC % @ ("\\x1b%@").
+Other
+.B ISO 2022
+sequences (such as
 for switching the G0 and G1 sets) are not applicable in UTF-8 mode.
 .PP
 It can be hoped that in the foreseeable future,