From: bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org
To: linux-man-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: [Bug 60807] not all the pages are encoded using utf-8
Date: Fri, 14 Feb 2014 10:22:04 +0000
Message-ID: <bug-60807-11311-MQEHsQCnOr@https.bugzilla.kernel.org/>
In-Reply-To: <bug-60807-11311-3bo0kxnWaOQUvHkbgXJLS5sdmw4N0Rt+2LY78lusg7I@public.gmane.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=60807
Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
--- Comment #4 from Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> ---
(In reply to Peter Schiffer from comment #3)
> $ ./print_encoding.sh man?/*
>
> Man Page Encoding by file Encoding by first line
>
> * man2/close.2 iso-8859-1
> * man2/getdomainname.2 iso-8859-1
> * man2/getrlimit.2 iso-8859-1
> * man2/madvise.2 iso-8859-1
> * man2/mount.2 utf-8
> * man2/sysinfo.2 iso-8859-1
> * man2/umask.2 iso-8859-1
> * man3/encrypt.3 iso-8859-1
> * man3/fclose.3 iso-8859-1
> * man3/fflush.3 iso-8859-1
> * man3/lockf.3 iso-8859-1
> * man3/rand.3 iso-8859-1
> * man3/strtok.3 iso-8859-1
> * man3/toupper.3 iso-8859-1
> * man3/updwtmp.3 iso-8859-1
> * man4/st.4 utf-8
> * man5/utmp.5 iso-8859-1
> * man7/armscii-8.7 iso-8859-1 ARMSCII-8
> * man7/cp1251.7 unknown-8bit CP1251
> * man7/environ.7 iso-8859-1
> * man7/hier.7 iso-8859-1
> * man7/iso_8859-10.7 iso-8859-1 ISO-8859-10
> * man7/iso_8859-11.7 iso-8859-1 ISO-8859-11
> * man7/iso_8859-13.7 iso-8859-1 ISO-8859-7
> * man7/iso_8859-14.7 iso-8859-1 ISO-8859-14
> * man7/iso_8859-15.7 iso-8859-1 ISO-8859-15
> * man7/iso_8859-16.7 iso-8859-1 ISO-8859-16
> * man7/iso_8859-1.7 iso-8859-1
> * man7/iso_8859-2.7 iso-8859-1 ISO-8859-2
> * man7/iso_8859-3.7 iso-8859-1 ISO-8859-3
> * man7/iso_8859-4.7 iso-8859-1 ISO-8859-4
> * man7/iso_8859-5.7 iso-8859-1 ISO-8859-5
> * man7/iso_8859-6.7 iso-8859-1 ISO-8859-6
> * man7/iso_8859-7.7 iso-8859-1 ISO-8859-7
> * man7/iso_8859-8.7 iso-8859-1 ISO-8859-8
> * man7/iso_8859-9.7 iso-8859-1 ISO-8859-9
> * man7/koi8-r.7 unknown-8bit KOI8-R
> * man7/koi8-u.7 unknown-8bit
> * man7/suffixes.7 iso-8859-1
>
> $ ./convert_to_utf_8.sh tmp_encoded man?/*
> Converting man2/close.2 from iso-8859-1
> Converting man2/getdomainname.2 from iso-8859-1
> Converting man2/getrlimit.2 from iso-8859-1
> Converting man2/madvise.2 from iso-8859-1
> Converting man2/mount.2 from utf-8
> Converting man2/sysinfo.2 from iso-8859-1
> Converting man2/umask.2 from iso-8859-1
> Converting man3/encrypt.3 from iso-8859-1
> Converting man3/fclose.3 from iso-8859-1
> Converting man3/fflush.3 from iso-8859-1
> Converting man3/lockf.3 from iso-8859-1
> Converting man3/rand.3 from iso-8859-1
> Converting man3/strtok.3 from iso-8859-1
> Converting man3/toupper.3 from iso-8859-1
> Converting man3/updwtmp.3 from iso-8859-1
> Converting man4/st.4 from utf-8
> Converting man5/utmp.5 from iso-8859-1
> Converting man7/armscii-8.7 from armscii-8
> Converting man7/cp1251.7 from cp1251
> Converting man7/environ.7 from iso-8859-1
> Converting man7/hier.7 from iso-8859-1
> Converting man7/iso_8859-10.7 from iso_8859-10
> Converting man7/iso_8859-11.7 from iso-8859-1
> Converting man7/iso_8859-13.7 from iso-8859-1
> Converting man7/iso_8859-14.7 from iso_8859-14
> Converting man7/iso_8859-15.7 from iso_8859-15
> Converting man7/iso_8859-16.7 from iso_8859-16
> Converting man7/iso_8859-1.7 from iso_8859-1
> Converting man7/iso_8859-2.7 from iso_8859-2
> Converting man7/iso_8859-3.7 from iso_8859-3
> Converting man7/iso_8859-4.7 from iso_8859-4
> Converting man7/iso_8859-5.7 from iso_8859-5
> Converting man7/iso_8859-6.7 from iso_8859-6
> Converting man7/iso_8859-7.7 from iso_8859-7
> Converting man7/iso_8859-8.7 from iso_8859-8
> Converting man7/iso_8859-9.7 from iso_8859-9
> Converting man7/koi8-r.7 from koi8-r
> Converting man7/koi8-u.7 from koi8-u
> Converting man7/suffixes.7 from iso-8859-1
>
> $ cd tmp_encoded/
>
> $ ../print_encoding.sh man?/*
>
> Man Page Encoding by file Encoding by first line
>
> * man2/close.2 utf-8 UTF-8
> * man2/getdomainname.2 utf-8 UTF-8
> * man2/getrlimit.2 utf-8 UTF-8
> * man2/madvise.2 utf-8 UTF-8
> * man2/mount.2 utf-8 UTF-8
> * man2/sysinfo.2 utf-8 UTF-8
> * man2/umask.2 utf-8 UTF-8
> * man3/encrypt.3 utf-8 UTF-8
> * man3/fclose.3 utf-8 UTF-8
> * man3/fflush.3 utf-8 UTF-8
> * man3/lockf.3 utf-8 UTF-8
> * man3/rand.3 utf-8 UTF-8
> * man3/strtok.3 utf-8 UTF-8
> * man3/toupper.3 utf-8 UTF-8
> * man3/updwtmp.3 utf-8 UTF-8
> * man4/st.4 utf-8 UTF-8
> * man5/utmp.5 utf-8 UTF-8
> * man7/armscii-8.7 utf-8 UTF-8
> * man7/cp1251.7 utf-8 UTF-8
> * man7/environ.7 utf-8 UTF-8
> * man7/hier.7 utf-8 UTF-8
> * man7/iso_8859-10.7 utf-8 UTF-8
> * man7/iso_8859-11.7 utf-8 UTF-8
> * man7/iso_8859-13.7 utf-8 UTF-8
> * man7/iso_8859-14.7 utf-8 UTF-8
> * man7/iso_8859-15.7 utf-8 UTF-8
> * man7/iso_8859-16.7 utf-8 UTF-8
> * man7/iso_8859-1.7 utf-8 UTF-8
> * man7/iso_8859-2.7 utf-8 UTF-8
> * man7/iso_8859-3.7 utf-8 UTF-8
> * man7/iso_8859-4.7 utf-8 UTF-8
> * man7/iso_8859-5.7 utf-8 UTF-8
> * man7/iso_8859-6.7 utf-8 UTF-8
> * man7/iso_8859-7.7 utf-8 UTF-8
> * man7/iso_8859-8.7 utf-8 UTF-8
> * man7/iso_8859-9.7 utf-8 UTF-8
> * man7/koi8-r.7 utf-8 UTF-8
> * man7/koi8-u.7 utf-8 UTF-8
> * man7/suffixes.7 utf-8 UTF-8
Peter,
Sorry to be slow following up on this. Thanks for the scripts.
As some background, I'll just note that the current encoding markers in the
iso_8859* pages were added in response to this 2009 bug report:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=519209
It seems a reasonable idea to convert everything to UTF-8, but I have some
concerns/questions.
1. Is the encoding line:
'\" t -*- coding: UTF-8 -*-
really needed, or does modern groff just work this out?
2. I'm concerned about backward compatibility. For example: what happens if
someone installs the man pages on a system with an old groff? As far as I can
work out, groff added Unicode input support in v1.20, in 2009
(http://lists.gnu.org/archive/html/groff/2009-01/msg00011.html). So perhaps
that's long enough ago that we don't need to worry too much about this.
Any thoughts?
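For reference, here is a rough sketch of the kind of conversion under discussion. It is not the scripts from comment #3 (those are not shown here), only an illustration in Python. It assumes the source encoding is taken from a `coding:` marker on the first line when one is present, and otherwise falls back to ISO-8859-1. The names `declared_encoding` and `to_utf8` are made up for this example.

```python
import re

# Pattern for the encoding name in a first-line marker such as:
#   '\" t -*- coding: ISO-8859-2 -*-
CODING_RE = re.compile(r"(coding:\s*)([A-Za-z0-9_.-]+)")


def declared_encoding(first_line: bytes):
    """Return the encoding named by a coding marker, or None if there is none."""
    m = CODING_RE.search(first_line.decode("ascii", errors="replace"))
    return m.group(2) if m else None


def to_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Re-encode a man page as UTF-8, updating an existing coding marker.

    Assumption: pages without a marker are in `fallback`. Python has no codec
    for some of the names in the listing above (for example ARMSCII-8), so
    decode() raises LookupError for those pages.
    """
    first_line = raw.partition(b"\n")[0]
    marker = declared_encoding(first_line)
    text = raw.decode(marker or fallback)
    if marker:
        head, newline, tail = text.partition("\n")
        # Rewrite only the encoding name; the rest of the first line
        # (such as the "t" preprocessor hint) is left untouched.
        head = CODING_RE.sub(r"\1UTF-8", head, count=1)
        text = head + newline + tail
    return text.encode("utf-8")
```

The sketch does not add a marker to pages that lack one, since whether the marker is needed at all is exactly question 1 above.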