From: Alejandro Colomar <alx@kernel.org>
To: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Helge Kreutzmann <debian@helgefjell.de>,
Bruno Haible <bruno@clisp.org>,
linux-man@vger.kernel.org,
Mario Blaettermann <mario.blaettermann@gmail.com>
Subject: Re: names of ISO 8859 encodings
Date: Sat, 14 Dec 2024 18:27:14 +0100
Message-ID: <20241214172714.bostrlr6v3fxvmta@devuan>
In-Reply-To: <20241214154713.njpgwqgm4vycuiiq@illithid>
Hi Branden,
On Sat, Dec 14, 2024 at 09:47:13AM -0600, G. Branden Robinson wrote:
> At 2024-12-14T06:12:15+0000, Helge Kreutzmann wrote:
> > Am Fri, Dec 13, 2024 at 06:56:54PM -0600 schrieb G. Branden Robinson:
> > > Oy vey. Helge Kreutzmann submitted a similar bug report to groff
> > > and I was planning to make the ISO -> ISO/IEC change to its man
> > > pages.
> >
> > I'm not going into the business of evaluating which standards should be
> > adhered to. But when referring to the proper document the correct
> > name should be given IMHO.
I agree.
>
> Possibly the "use/mention" distinction of linguistics would be helpful
> here.[1] In some technical discussion contexts, we merely _mention_ a
> character encoding standard. For instance, "This program is capable of
> transliterating any document using an ISO/IEC 8859 character encoding to
> valid UTF-8.".
>
> In other contexts, we _use_ the identifier itself, perhaps as an input
> argument to a program. For example:
>
> $ iconv -f iso-8859-1 -t utf-8 NEWS
>
> In this shell command, we must spell the character encoding specifiers
> exactly as such,[2] and when documenting the foregoing in an example in
> a man page, we are well advised to spell the hyphen-minus signs with
> leading backslashes.
>
> .RS
> .EX
> $ \c
> .B "iconv \-f iso\-8859\-1 \-t utf\-8 NEWS"
> .EE
> .RE
>
> Alex, do you think this issue is enough of a trip hazard to warrant
> presentation in man-pages(7)?
What's the issue? I think it's simple:
When referring to a standard, use the pedantically correct name for it.
When showing a command line, use text that is pedantically correct to
the command interpreter.
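
For instance, here is a minimal sketch of the second rule in shell.  The
helper name latin1_to_utf8_hex is hypothetical, and the sketch assumes an
iconv(1) that accepts the spellings "iso-8859-1" and "utf-8" (glibc's
does; other systems may want different names).  In prose the standard is
ISO/IEC 8859-1, but the command line needs the identifier typed with
ASCII hyphen-minus signs.

```shell
# Hypothetical helper: convert one ISO/IEC 8859-1 byte to UTF-8 and
# print the resulting bytes as hexadecimal digits.
latin1_to_utf8_hex() {
    # $1 is an octal escape understood by printf(1), e.g. '\351'.
    # The encoding names use ASCII hyphen-minus signs, as iconv(1) expects.
    printf "$1" | iconv -f iso-8859-1 -t utf-8 | od -An -tx1 | tr -d ' \n'
}

# Byte 0xE9 is e-acute in ISO/IEC 8859-1; in UTF-8 it is C3 A9.
latin1_to_utf8_hex '\351'   # prints c3a9
echo
```

If a man page rendered those hyphen-minus signs as typographic hyphens,
a reader pasting the command would likely get an error from iconv(1)
about an unsupported encoding.
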
Am I missing anything?
Cheers,
Alex
>
> > My personal opinion is that correct typography is important, but on
> > quick reading I probably would not spot the differences among the
> > various dashes for example. So for me, having all the correct letters
> > is important and of course, to copy and paste text (e.g. code) where
> > necessary, even if that violates typography standards.
>
> I think we can avoid violating standards of typography; more precisely,
> the process of rendering to an output device of limited capability will
> violate those standards for us.[3] For example, a character-cell
> terminal device generally can't (1) render arbitrary glyph sequences
> superscripted or subscripted;[4] (2) change the type size;[5] or (3)
> change the font family (to use letterforms with or without serifs) for
> only part of the rendered text (as opposed to the entire display,
> including scrollback buffer) at once.
>
> > And yes, I'm well aware that Branden and Donald Knuth (and successors)
> > strive for well printed documents, and I'm glad for this.
>
> That's pretty august company to be paired with. Lest anyone get any
> inflated notions of my role in groff, Joe Ossanna of Bell Labs wrote
> troff in the mid-1970s. After his untimely death, Brian Kernighan
> refactored troff circa 1980 into "device-independent troff". These were
> proprietary to AT&T (and commercial products for a while), so the FSF
> hired James Clark to write a clean reimplementation of AT&T troff,
> called groff, in about 1989. Werner Lemberg later became groff
> maintainer and added many features to it such that it became a viable
> alternative to TeX in many more applications (partisan preferences
> aside). Then Bertrand Garrigues did some mostly unsung but badly needed
> work on groff's build system, making it more pleasant to work with. My
> role has largely been (1) fixing bugs; (2) writing automated tests to
> (try to) ensure that dead bugs stay dead; (3) revising and correcting
> documentation; and (4) making modest extensions and reforms to the *roff
> language and some of the macro packages, provoking heated arguments
> and/or revealing formerly unspecified behavior, around which some people
> of course poured fast-drying cement in fits of delirium years ago.
>
> In software as in religion, the commandments held most sacrosanct are
> those that no one thought to write down in the first place.
>
> ("Of _course_ I can interchange pointers and ints. No one ever said I
> can't!" Eventually, they did say so. To much gnashing of teeth.)
>
> Regards,
> Branden
>
> And now the footnotes, where we play free-association rambling bingo.
>
> [1] https://en.wikipedia.org/wiki/Use%E2%80%93mention_distinction
>
> [2] a given system's iconv(1) command may recognize alternative names
> for some encodings
>
> [3] For example, the bash(1) man page contains this:
>
> .if n Bash is Copyright (C) 1989-2024 by the Free Software Foundation, Inc.
> .if t Bash is Copyright \(co 1989-2024 by the Free Software Foundation, Inc.
>
> In principle, this shouldn't be necessary. Chet should just write
> the second line without the ".if t" conditional and delete the
> first. The output device should know how to gracefully map the
> special character "\(co" to a copyright sign, and itself do the job
> of translating it to "(C)" if it has only an ASCII repertoire.
> Presumably, at some point in the past Chet (or the initial Bash
> maintainer, Brian Fox) used an nroff program that was defective,
> and also labored under the no-longer-correct misconception that
> omitting a copyright symbol from one's notice was a fatal defect
> that effectively placed the work in the public domain. That
> stopped being true as of 1 March 1989.[7] Further, prior to
> guidance issued by the U.S. Copyright Office in the decades since,
> the use of "(C)" as a substitute for a copyright sign _may not have
> sufficed_ to prevent the copyright notice from being regarded as
> defective. The Copyright Office, then and now, prefers the
> abbreviation "copr." when © is typographically unavailable.[7]
> Nowadays, its advice is that "c" (note lowercase) is an "acceptable
> variant", that _may_ retain the efficacy of the copyright notice.
> However, it is not the U.S. Copyright Office but the courts that
> ultimately arbitrate such things. Moreover, given recent
> developments, the Office's guidance to authors need not carry any
> weight to a federal judge. Between the U.S. Supreme Court's repeal
> of "Chevron Deference"[8] and the availability of a federal
> district court in Western Texas offering itself as a venue to any
> right-wing plaintiff in the country and pursuing a crusade of
> maximalist Federalist [read: monarchist] Society doctrine with a
> penchant for issuing nationwide permanent injunctions,[9][10] the
> status of any federal statute, executive agency guidance, or even
> constitutional provision[11] is uncertain for the next few years at
> least. But rest assured--we term this sort of radical disruption
> of American jurisprudence a "conservative" judicial philosophy. 👍
>
> [4] Often, the decimal digits 0-9 are available as superscripts. This
> selection is too meager for general typography, let alone
> mathematical typesetting where arbitrary, complex expressions may
> occur in exponents, for instance. Occasionally you need an
> integral up there.
>
> [5] The DEC VT100 and its successors could do double-width and
> double-size type.[6] Try this in your preferred terminal emulator.
>
> $ printf "$(tput bold)\e#3See also\n\e#4See also$(tput sgr0)\n\
> $(tput sitm)xterm$(tput ritm)(1)\n\n\e#6Patch #395 2024-09-11\
> $(tput sitm)xterm$(tput ritm)(1)\e#5\n"
>
> Anyone think these are worth supporting in grotty(1)? ;-)
>
> [6] https://vt100.net/docs/vt510-rm/DECDHL.html
> https://vt100.net/docs/vt510-rm/DECDWL.html
>
> [7] https://www.copyright.gov/circs/circ03.pdf
> [8] https://www.scotusblog.com/2024/06/supreme-court-strikes-down-chevron-curtailing-power-of-federal-agencies/
> [9] https://www.americanprogress.org/article/the-5th-circuit-court-of-appeals-is-spearheading-a-judicial-power-grab/
>
> [10] I would not personally wager that copyright holders have much to
> fear under the current regime; revenues consequent to copyrights
> are a form of monopoly rent and therefore a worldwide tent pole of
> conservative political economy. But, if a powerful stakeholder has
> a prospect of a sufficiently large windfall from a radical change
> to copyright protections, and is willing to spend lavishly enough
> on political campaigns and super PACs, who knows what might happen?
>
> Here's some model statutory language. "Any work under copyright by
> any entity other than the Walt Disney Company, its subsidiaries, or
> affiliates, enters the public domain as of January 1 of the year
> subsequent to its fixation in tangible form."
>
> I mean, that's just "common sense", right?[12] Only Disney has any
> business adapting anything into a feature film, or exercising
> merchandising rights. Duh.
>
> [11] https://www.cbsnews.com/news/what-is-birthright-citizenship/
>
> [12] another term debased by conservative/centrist political rhetoric
>
> I offer my own definition, in the spirit of Ambrose Bierce.
>
> "Commonsense solution": a course of action I want to take for
> reasons I will not share with you.
--
<https://www.alejandro-colomar.es/>