From: Mauro Carvalho Chehab <mchehab@kernel.org>
To: Markus Heiser <markus.heiser@darmarit.de>
Cc: "Michal Suchánek" <msuchanek@suse.de>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	"Matthew Wilcox" <willy@infradead.org>,
	linux-doc@vger.kernel.org, "Jonathan Corbet" <corbet@lwn.net>
Subject: Re: Sphinx parallel build error: UnicodeEncodeError: 'latin-1' codec can't encode characters in position 18-20: ordinal not in range(256)
Date: Fri, 7 May 2021 11:14:51 +0200
Message-ID: <20210507111451.36f063bb@coco.lan>
In-Reply-To: <85bebda3-df0b-8554-5f90-45aa097ce405@darmarit.de>

On Fri, 7 May 2021 10:56:39 +0200
Markus Heiser <markus.heiser@darmarit.de> wrote:

> On 07.05.21 at 10:35, Michal Suchánek wrote:
> > So the bottom line is that UTF-8 in the files will stay, and Sphinx
> > cannot handle UTF-8 when the locale is not UTF-8.
> > 
> > In the long run it might be nice to fix Sphinx to properly set the
> > encoding of the files it reads and writes. Or maybe there is some
> > parameter that specifies it?  
> 
> Let's not mix things up. The Unicode error is neither related nor
> limited to the log or to Sphinx; it is related to the fact that we (you)
> try to run a UTF-8 application in an environment which is not fully
> UTF-8 functional.

No. The application itself is not UTF-8. The application input files are.

The big issue lies in the way Python works with charsets:
it does a very poor job there.

I remember that in the past I often had to use this
(before UTF-8 became the default on the distros I was using at the time):

	LANG=C <some_python_script>

just to keep such scripts from crashing.

If I'm not mistaken, older Fedora/Mandrake distros had bugs with
Python scripts: if the machine's language was not English, such
scripts would crash, because the i18n-translated messages were in a
different charset than the one the Python script expected.
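
The failure mode from the subject line can be reproduced without Sphinx.
The following is only a minimal sketch, not the code path Sphinx actually
takes: it writes text to an in-memory stream whose encoding is fixed
explicitly, standing in for a stdout or log file whose encoding Python
derived from a non-UTF-8 locale. The helper name try_write is hypothetical.

```python
import io


def try_write(text, encoding):
    """Write text to an in-memory text stream that uses the given encoding.

    Returns None on success, or the error message if the text cannot be
    represented in that encoding.
    """
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
    except UnicodeEncodeError as exc:
        return str(exc)
    return None


# "Suchánek" fits in Latin-1, so only characters outside that range fail.
for sample in ("Suchánek", "\u2192 \u2713"):
    for enc in ("latin-1", "utf-8"):
        result = try_write(sample, enc)
        print(enc, ascii(sample), "ok" if result is None else result)
```

Under these assumptions, the same text that is written cleanly to a UTF-8
stream raises UnicodeEncodeError on a Latin-1 stream as soon as it contains
a character above U+00FF.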

> > For the short term I think it is reasonable to run a python test script
> > that prints fancy unicode characters before running Sphinx and bail if
> > the test script fails.  
> 
> To be sure, I recommend setting a UTF-8 locale environment in the
> Makefile.
> 
> My experience shows that this is the default with almost all
> containers (images), there are only a few where this is not the
> case (maybe SUSE?).

That may not be true in certain parts of the globe.

I have no idea what charsets the most-used distributions in Asian
countries use ;-)
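
For reference, the pre-flight test suggested in the quoted text above (check
for Unicode support before running Sphinx and bail out otherwise) could look
roughly like the sketch below. It is an illustration only, not something
that exists in the kernel tree; SAMPLE, can_encode, check_environment and
main are hypothetical names, and the choice of sample characters is an
assumption.

```python
import locale
import sys

# Hypothetical sample: a check mark, an accented letter and a CJK character.
SAMPLE = "\u2713 \u00e1 \u4e2d"


def can_encode(text, encoding):
    """Return True if text can be encoded with the named codec."""
    try:
        text.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        return False
    return True


def check_environment():
    """Return (name, encoding) pairs that cannot represent SAMPLE."""
    candidates = [
        ("stdout", sys.stdout.encoding),
        ("stderr", sys.stderr.encoding),
        ("locale", locale.getpreferredencoding(False)),
    ]
    return [(name, enc) for name, enc in candidates
            if not enc or not can_encode(SAMPLE, enc)]


def main():
    """Report unusable encodings; return a shell-style exit status."""
    problems = check_environment()
    for name, enc in problems:
        # Keep the message ASCII so that reporting cannot itself fail.
        print(f"{name} encoding {enc!r} cannot represent the sample text",
              file=sys.stderr)
    return 1 if problems else 0


# In a standalone script one would finish with: raise SystemExit(main())
```

A build wrapper could run such a script first and stop on a non-zero exit
status. It tests the encodings by name instead of printing the characters,
so the check cannot crash for the very reason it is trying to detect.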

Thanks,
Mauro
