From: Alejandro Colomar <alx.manpages@gmail.com>
To: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: linux-man@vger.kernel.org
Subject: Re: [PATCH 1/9] ldconfig.8: Fix markup nits
Date: Wed, 4 Jan 2023 21:15:00 +0100
Message-ID: <afd3a0d3-9bf4-2687-4f62-2ebd62398447@gmail.com>
In-Reply-To: <20230104191118.xs7jwtjcqz6fhbbx@illithid>
Hi Branden,
On 1/4/23 20:11, G. Branden Robinson wrote:
> Hi Alex,
>
> At 2023-01-04T19:41:51+0100, Alejandro Colomar wrote:
>> On 1/4/23 16:55, G. Branden Robinson wrote:
>>> At 2023-01-04T13:26:33+0100, Alejandro Colomar wrote:
>>>>> .SH NAME
>>>>> ldconfig \- configure dynamic linker run-time bindings
>>>>> .SH SYNOPSIS
>>>>
>>>> We should wrap this in .nf/.fi
>>>
>>> That will have a cost. It will mean using a lot of \c escape
>>> sequences to connect the output lines.
>>>
>>> The existing synopsis fits within 74 output columns on a terminal.
>>>
>>> Do you think it's worth it?
>>
>> But, if we don't use it, if someone uses a smaller terminal, there
>> might appear our beloved hyphens breaking a word...
>
> That's true. But what is _not_ true is that you don't have a minimum
> expected terminal width. You do, you just might not know what it is and
> it may not even have been consciously chosen.
I haven't consciously chosen it, but I often use 66-col terminals, especially
when I plan to paste text into an email.
>
> The minimum expected terminal width for the Linux man-pages corpus is
> the longest output line produced by an unfilled (.nf/.fi) region or a
> tbl(1) row that doesn't use a text block. Somewhere in the ~2,539 man
> pages, a longest unfilled line lurks...and its identity may change
> depending on the output device used to render it (terminal vs.
> typesetter).
Even if unfilled blocks exceed the terminal width, and therefore might be
unpleasant to the eyes, they have a nice feature: the line never breaks, and you
can still pipe it to other programs, or paste it, and it will be a single line.
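A rough sketch of that property, in Python rather than roff. Python's textwrap
stands in for the formatter's filling here, which is an assumption and not how
groff actually breaks lines, and the synopsis string is made up:

```python
import textwrap

# Hypothetical synopsis line, for illustration only.
synopsis = "frob [-nv] [-C cache] [-f conf] [-r root] [-o output] directory ..."


def render(line: str, width: int, fill: bool) -> list[str]:
    """Return the output lines: wrapped when filling, untouched otherwise."""
    if fill:
        return textwrap.wrap(line, width=width)
    return [line]


filled = render(synopsis, 40, fill=True)
unfilled = render(synopsis, 40, fill=False)

print(len(filled))             # more than one line once the terminal is narrow
print(unfilled == [synopsis])  # the unfilled line is still a single line
```

The unfilled version overflows a 40-column terminal, but a line-oriented tool
downstream still sees the whole synopsis on one line.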
>
> If you _do_ know what that expected minimum is, please document it.
>
> The nearest thing I see is:
>
> "Please limit source code line length to no more than about 75
> characters wherever possible." -- man-pages(7)
>
> But the relationship between input document line length and
> formatted output line length is a loose one.
>
> In any event, groff man(7)'s SY/YS extension macros are _built_ for this
> application. I'm happy to "port" this page to use them; doing so will
> permit removal of the PD macro calls, among other benefits.
Yup, that would be nice!
>
>> Is there anything that "reverts" \%? So that if it were the default,
>> we could use \anti-% to say "groff, you might break this word"?
>
> Yes. \% itself does that.
I mean something like \X\%foobar, so that the \X "cancels" the \%. Not manually
inserting break points. So, imagining a world in which hyphenation was disabled
_only_ within font-selection macros, I could specify that a word is fine to be
hyphenated like this:
.B \Xidontcareifthisishyphenated
>
> From the groff 1.23 Texinfo manual (with stuff irrelevant to man(7)
> usage stripped out):
>
> -- Escape sequence: \%
> -- Escape sequence: \:
> To tell GNU 'troff' how to hyphenate words as they occur in input,
> use the '\%' escape sequence; it is the default "hyphenation
> character". Each instance within a word indicates to GNU 'troff'
> that the word may be hyphenated at that point, while prefixing a
> word with this escape sequence prevents it from being otherwise
> hyphenated. This mechanism affects only that occurrence of the
> word; [...]
>
> [...]
>
> '\:' inserts a non-printing break point; that is, a word can break
> there, but the soft hyphen glyph (see below) is not written to the
> output if it does. This escape sequence is an input word boundary,
> so the remainder of the word is subject to hyphenation as normal.
>
> You can use '\:' and '\%' in combination to control breaking of a
> file name or URL or to permit hyphenation only after certain
> explicit hyphens within a word.
>
> The \%Lethbridge-Stewart-\:\%Sackville-Baggins divorce
> was, in retrospect, inevitable once the contents of
> \%/var/log/\:\%httpd/\:\%access_log on the family web
> server came to light, revealing visitors from Hogwarts.
>
>>> groff man(7) _has_ a mechanism for this, and has since groff 1.19
>>> (2003). It's the `HY` register. People can put this in their
>>> man.local files (on Debian-based systems, that's in /etc/groff).
>>>
>>> .nr HY 0
>>
>> I know, but I don't think we should write manual pages in a way that
>> forces distributors to use such a thing.
>
> As far as I know, most distributors aren't configuring man.local this
> way today, despite it having been possible for almost 20 years.
Touché.
> Adding
> explicit hyphenation breakpoints (or their suppression) isn't going to
> force them any harder than they have been for 2 decades; it will in fact
> reduce any such pressure by reducing the number of bogus hyphenation
> breaks when hyphenation is enabled.
>
>> Either the pages are written plagued with \%,
>
> Like changes in lettercase, this is _information_.
I don't argue against that, but if there were a way to return that information
explicitly, we wouldn't be losing it.
>
>> and the distros don't need to use .HY, or we write pages lazily so
>> that distros need to fix the hyphenation.
>
> Distributors' laziness seems to be a match for Linux man-pages's own;
> users seem to muddle through without much evident complaint.
Oh, I do complain a lot; however, I don't express it too much in the form of bug
reports, since I don't believe it's the fault of the writer, but rather a lack of
support from groff(1).
But I do find it very uncomfortable, similar to when manual page authors don't
use the proper \-. However, I do think that one is the fault of the author, and
you can already find many such reports signed by me :)
> I suppose
> people who copy-and-paste multiple lines from a man page realize they
> need to remove the hyphens along with newlines. Fortunately, on UTF-8
> terminals, they have hope of seeing the difference between hyphens and
> the ASCII hyphen-minus that is always(?) meant as a literal.
My problem is not about pasting text. That's very minor. My problem is finding
text.
For finding command options, I usually type:
/ --foo
If \- hasn't been used, I need to use:
/ ..foo
and skip all the noise. When there's too much noise, sometimes using an anchor
(^) helps. But it's way nicer when writers use \-. I keep finding such bugs,
and reporting them as much as I can.
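The difference can be sketched in Python. This assumes (as on a UTF-8 terminal
with recent groff) that an unescaped - is rendered as U+2010 HYPHEN while \-
comes out as the ASCII hyphen-minus; the option name and sample lines are made
up:

```python
import re

HYPHEN = "\u2010"  # assumed rendering of an unescaped "-" in the page source

good = "       --foo   Enable foo."                # source used \-\-foo
bad = f"       {HYPHEN}{HYPHEN}foo   Enable foo."  # source used plain --foo
noise = "You can pass a foo argument here."        # unrelated running text

# Like typing "/ --foo" in the pager: a literal match, leading space included.
literal = re.compile(re.escape(" --foo"))
# Like typing "/ ..foo": a space, any two characters, then "foo".
fallback = re.compile(r" ..foo")

print(bool(literal.search(good)), bool(literal.search(bad)))
print(bool(fallback.search(bad)), bool(fallback.search(noise)))
```

The literal pattern only finds the option when \- was used; the fallback finds
it either way, but it also stops at unrelated text such as " a foo".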
When searching for keywords, the problem is the following: I do `/keyword`, but
then if the keyword is hyphenated... well, good luck.
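Sketched in Python, with a made-up sentence and assuming the formatter broke
the keyword at the end of an output line with a U+2010 HYPHEN:

```python
import re

# Assumed rendering: "keyword" was hyphenated at a line end.
rendered = "Some text that mentions the key\u2010\nword in passing."

# A pager searches line by line, and no single line contains the whole word.
hits = [line for line in rendered.splitlines() if "keyword" in line]
print(hits)

# Workaround: undo end-of-line hyphenation before searching. This would also
# glue together genuinely hyphenated compounds that happen to break there.
dehyphenated = re.sub("\u2010\n", "", rendered)
print("keyword" in dehyphenated)
```

The workaround needs the text to be post-processed first, which is exactly what
one can't do from inside the pager.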
>
>> But writing the pages lazily and having distributors ignore it would
>> result in suboptimal pages for our readers.
>
> I think marking break points, hyphenated and otherwise (as with URLs),
> is the opposite of laziness. It is a level of fastidiousness I don't
> actually expect of many man(7) writers apart from myself.
I would want to use \:. What I want is a tool which re-enables the default
hyphenation points after they have been cancelled.
>
>>> Colin Watson's man-db man(1) also has a feature to suppress
>>> hyphenation, using a hack; it's not pretty but it works even on
>>> other *roff formatters.
>>
>> Does that disable hyphenation for macros, or for the entire document?
>> I only want to disable it in highlighting macros.
>
> I don't quite understand what you mean by "macros" here. Macro
> interpolation is textual replacement, there isn't really a macro "mode"
> that is visible to the formatter when hyphenation decisions are made.
>
> But if by "highlighting macros" you mean font selection and
> alternation macros in man(7) (.B, .I, .BR, etc.), then the answer is
> "for the entire document". You don't always want to disable hyphenation
> when using these macros anyway. Not everything is a literal. The font
> macros are presentational, not semantic.
>
> Even then, I would not suppress hyphenation of a metasyntactic variable,
> like "directory". The whole point of these is that they are textually
> replaced _by the reader_.
I would. I wouldn't be able to count how many times I've tried to search for
such a keyword, and it was hyphenated.
>
>>> I don't insist that people keep hyphenation enabled, but assuming
>>> that no one will do so will keep us from putting worthwhile
>>> information in our man pages.
>>>
>>> If you dread the tedium of adding \% escape sequences to "keywords"
>>> all over the place, I don't blame you. This is one reason I
>>> proposed my most ambitious man(7) extension yet, a two-macro
>>> semantic tag mechanism.
>>>
>>> https://marc.info/?l=linux-man&m=165868366126909&w=2
>>
>> I still don't know what to think about that.
>
> That's okay. Its realization is some ways off, if ever. First I need
> Bertrand to recover from holidays. :-O
>
>> "XXX - quick hack, should disappear before anyone notices :)."
>>
>> Of course, the quick hack never disappeared after Oct 7, 2007, when it
>> was written in stone.
>
> Of course!
>
>> <https://github.com/shadow-maint/shadow/commit/6b6e005ce1cc4a5e4fc7fc40a52f2ed229f54b5b>
>>
>> "XXX - is the above ok or should it be <time.h> on ultrix?"
>
> If you pine for a stagnant commercial Unix to kick around, Solaris 10
> will be around for another year or so...
As for Solaris 10, I already remove code that supports it every chance I get :)
I wonder when the day will come that things like C89 will officially be declared
dead by consensus. I wish GCC would drop -std=gnu89 some day.
Seeing how some people strongly defend portability to dinosaur shells, because
POSIX is not portable enough... well, I don't think we'll get rid of C89 in
another 50 years...
But hey, when you don't care about big piles o'money, you can write
"non-portable" POSIX-only code. I'd rather write non-portable code than
Solaris-3-portable code for a big pile o'money. :)
>
> Regards,
> Branden
Cheers,
Alex
--
<http://www.alejandro-colomar.es/>