From: Mauro Carvalho Chehab <mchehab@kernel.org>
To: Randy Dunlap <rdunlap@infradead.org>
Cc: "Michal Suchánek" <msuchanek@suse.de>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Markus Heiser" <markus.heiser@darmarit.de>,
	linux-doc@vger.kernel.org, "Jonathan Corbet" <corbet@lwn.net>
Subject: Re: Sphinx parallel build error: UnicodeEncodeError: 'latin-1' codec can't encode characters in position 18-20: ordinal not in range(256)
Date: Mon, 10 May 2021 08:22:01 +0200
Message-ID: <20210510082201.6e14a6c0@coco.lan>
In-Reply-To: <6d455415-9d19-841f-01f7-7139a77a30c5@infradead.org>

On Sat, 8 May 2021 10:46:46 -0700
Randy Dunlap <rdunlap@infradead.org> wrote:

> On 5/8/21 10:09 AM, Michal Suchánek wrote:
> > On Sat, May 08, 2021 at 08:55:11AM -0700, Randy Dunlap wrote:  
> >> Hi Mauro,
> >>
> >> On 5/8/21 7:41 AM, Mauro Carvalho Chehab wrote:  
> >>> On Sat, 8 May 2021 12:41:57 +0200
> >>> Michal Suchánek <msuchanek@suse.de> wrote:
> >>>  
> >>>> On Sat, May 08, 2021 at 11:22:05AM +0200, Mauro Carvalho Chehab wrote:  
> >>>>> On Fri, 7 May 2021 08:39:24 +0200
> >>>>> Mauro Carvalho Chehab <mchehab@kernel.org> wrote:
> >>>>>     
> >>>>>> On Thu, 6 May 2021 14:21:01 -0700
> >>>>>> Randy Dunlap <rdunlap@infradead.org> wrote:
> >>>>>>     
> >>>>>     
> >>>>>> I'll prepare a patch fixing it. Some care should be taken, however, as
> >>>>>> it has two places where UTF-8 chars should be used[2].    
> >>>>>
> >>>>> OK, I wrote a small script to check which special characters we
> >>>>> currently have (next-20210507) under Documentation/, excluding the
> >>>>> translations.
> >>>>>
> >>>>> Based on my script results, we have these groups:
> >>>>>
> >>>>> 1. Latin accented characters:
> >>>>> 	- U+00c7 (LATIN CAPITAL LETTER C WITH CEDILLA) (Ç)
> >>>>> 	- U+00df (LATIN SMALL LETTER SHARP S) (ß)
> >>>>> 	- U+00e1 (LATIN SMALL LETTER A WITH ACUTE) (á)
> >>>>> 	- U+00e4 (LATIN SMALL LETTER A WITH DIAERESIS) (ä)
> >>>>> 	- U+00e6 (LATIN SMALL LETTER AE) (æ)
> >>>>> 	- U+00e7 (LATIN SMALL LETTER C WITH CEDILLA) (ç)
> >>>>> 	- U+00e9 (LATIN SMALL LETTER E WITH ACUTE) (é)
> >>>>> 	- U+00ea (LATIN SMALL LETTER E WITH CIRCUMFLEX) (ê)
> >>>>> 	- U+00eb (LATIN SMALL LETTER E WITH DIAERESIS) (ë)
> >>>>> 	- U+00f3 (LATIN SMALL LETTER O WITH ACUTE) (ó)
> >>>>> 	- U+00f4 (LATIN SMALL LETTER O WITH CIRCUMFLEX) (ô)
> >>>>> 	- U+00f6 (LATIN SMALL LETTER O WITH DIAERESIS) (ö)
> >>>>> 	- U+00f8 (LATIN SMALL LETTER O WITH STROKE) (ø)
> >>>>> 	- U+00fc (LATIN SMALL LETTER U WITH DIAERESIS) (ü)
> >>>>> 	- U+011f (LATIN SMALL LETTER G WITH BREVE) (ğ)
> >>>>> 	- U+0142 (LATIN SMALL LETTER L WITH STROKE) (ł)
> >>>>>
> >>>>> 2. symbols:
> >>>>> 	- U+00a9 (COPYRIGHT SIGN) (©)
> >>>>> 	- U+2122 (TRADE MARK SIGN) (™)
> >>>>> 	- U+00ae (REGISTERED SIGN) (®)
> >>>>> 	- U+00b0 (DEGREE SIGN) (°)
> >>>>> 	- U+00b1 (PLUS-MINUS SIGN) (±)
> >>>>> 	- U+00b2 (SUPERSCRIPT TWO) (²)
> >>>>> 	- U+00b5 (MICRO SIGN) (µ)
> >>>>> 	- U+00bd (VULGAR FRACTION ONE HALF) (½)
> >>>>> 	- U+2026 (HORIZONTAL ELLIPSIS) (…)
> >>>>>
> >>>>> 3. arrows:
> >>>>> 	- U+2191 (UPWARDS ARROW) (↑)
> >>>>> 	- U+2192 (RIGHTWARDS ARROW) (→)
> >>>>> 	- U+2193 (DOWNWARDS ARROW) (↓)
> >>>>> 	- U+2b0d (UP DOWN BLACK ARROW) (⬍)
> >>>>>
> >>>>> 4. box drawings:
> >>>>> 	- U+2500 (BOX DRAWINGS LIGHT HORIZONTAL) (─)
> >>>>> 	- U+2502 (BOX DRAWINGS LIGHT VERTICAL) (│)
> >>>>> 	- U+2514 (BOX DRAWINGS LIGHT UP AND RIGHT) (└)
> >>>>> 	- U+251c (BOX DRAWINGS LIGHT VERTICAL AND RIGHT) (├)
> >>>>>
> >>>>> 5. math symbols:
> >>>>> 	- U+00b7 (MIDDLE DOT) (·)
> >>>>> 	- U+00d7 (MULTIPLICATION SIGN) (×)
> >>>>> 	- U+2212 (MINUS SIGN) (−)
> >>>>> 	- U+2217 (ASTERISK OPERATOR) (∗)
> >>>>> 	- U+223c (TILDE OPERATOR) (∼)
> >>>>> 	- U+2264 (LESS-THAN OR EQUAL TO) (≤)
> >>>>> 	- U+2265 (GREATER-THAN OR EQUAL TO) (≥)
> >>>>> 	- U+27e8 (MATHEMATICAL LEFT ANGLE BRACKET) (⟨)
> >>>>> 	- U+27e9 (MATHEMATICAL RIGHT ANGLE BRACKET) (⟩)
> >>>>> 	- U+00ac (NOT SIGN) (¬)    
> >>>>  
> >>>  
> >>>>
> >>>> Use of ¬ is also very dubious in documentation (in fonts it is understandable):
> >>>> Documentation/ABI/obsolete/sysfs-kernel-fadump_registered:This ABI is renamed and moved to a new location /sys/kernel/fadump/registered.¬
> >>>> Documentation/ABI/obsolete/sysfs-kernel-fadump_release_mem:This ABI is renamed and moved to a new location /sys/kernel/fadump/release_mem.¬  
> >>>
> >>>  
> >>>> Documentation/powerpc/transactional_memory.rst:  if (MSR 29:31 ¬ = 0b010 | SRR1 29:31 ¬ = 0b000) then  
> >>>
> >>> Yeah, this should probably be better written as:
> >>>
> >>>   if (MSR 29:31 == 0b010 | SRR1 29:31 == 0b000) then  
> >>
> >> If the original with the 'NOT SIGN' was correct, then this
> >> version can't be correct. Or do you suspect that the "original"
> >> was corrupted somehow?  

No, I just misread the expression.

> > 
> > This does not make sense however you look at it. Using | between logical
> > expressions ...  
> 
> To my eyes/brain, it looks like classic (IBM) symbolic logic notation.
> In that context, I don't see anything wrong with it.

In this particular case, I would keep it as-is, including the UTF-8
character. It could be converted to some other symbolic logic
notation, but "MSR 29:31" and "SRR1 29:31" aren't valid names in C.
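
For readers unfamiliar with the notation, here is one possible reading
of that line as a Python sketch. It assumes IBM-style big-endian bit
numbering on 64-bit registers (bit 0 is the most significant bit), that
"¬ =" means "not equal", and that "|" is a logical OR. The helper names
bits() and quirk_condition() are made up for illustration and do not
come from the kernel sources.

```python
REG_WIDTH = 64  # assumption: MSR and SRR1 are treated as 64-bit values


def bits(value, first, last, width=REG_WIDTH):
    """Return bits first..last (inclusive), counting bit 0 as the MSB."""
    nbits = last - first + 1
    shift = width - 1 - last  # distance of the field from the LSB
    return (value >> shift) & ((1 << nbits) - 1)


def quirk_condition(msr, srr1):
    """One reading of: if (MSR 29:31 ¬= 0b010 | SRR1 29:31 ¬= 0b000)."""
    return bits(msr, 29, 31) != 0b010 or bits(srr1, 29, 31) != 0b000
```

Under these assumptions the condition is false only when MSR 29:31 is
exactly 0b010 and SRR1 29:31 is exactly 0b000.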

> Yeah, I have been looking thru the arch/powerpc/ source code for this,
> but I haven't found it yet.

The title of the section says that it is part of the "h/rfid mtmsrd quirk".

Searching for rfid:

	$ git grep -l rfid arch/powerpc/

shows a lot of asm code. I guess that if the above quirk is still in
the kernel, it is probably somewhere in the assembly code.

So it seems to me that converting it into C (or pseudo-C) won't
make it any better.

Thanks,
Mauro
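
A survey like the per-character list quoted earlier in this message
could be reproduced with something like the Python sketch below. It is
not the original script, which was not posted here. The function names,
the decision to skip any path containing a "translations" component, and
the choice to silently ignore files that are not valid UTF-8 are all
assumptions made for illustration.

```python
import unicodedata
from collections import Counter
from pathlib import Path


def non_ascii_chars(text):
    """Count every character outside the ASCII range in text."""
    return Counter(ch for ch in text if ord(ch) > 127)


def describe(ch):
    """Format one character as 'U+00e9 (NAME) (glyph)'."""
    name = unicodedata.name(ch, "UNKNOWN")
    return f"U+{ord(ch):04x} ({name}) ({ch})"


def scan_tree(root, skip=("translations",)):
    """Collect non-ASCII characters from all files under root, skipping
    any path that has one of the skip names as a component."""
    found = Counter()
    for path in Path(root).rglob("*"):
        if not path.is_file() or any(part in skip for part in path.parts):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file: ignored in this sketch
        found.update(non_ascii_chars(text))
    return found
```

Run from the top of a kernel tree, iterating over
sorted(scan_tree("Documentation")) and printing describe(ch) for each
entry would give one line per distinct character; grouping them into
categories such as "arrows" or "box drawings" would still be manual.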

