public inbox for linux-man@vger.kernel.org
From: Alejandro Colomar <alx.manpages@gmail.com>
To: Ingo Schwarze <schwarze@usta.de>
Cc: linux-man@vger.kernel.org,
	"G . Branden Robinson" <g.branden.robinson@gmail.com>
Subject: Re: Linux man-pages Makefile portability
Date: Sun, 3 Jul 2022 23:44:51 +0200
Message-ID: <6e294373-2661-286c-09c4-e67cd84103d7@gmail.com>
In-Reply-To: <YrB66rgFZqryrmpt@asta-kit.de>


[added Branden, as he was involved in discussions regarding man3type;
Branden, you might want to read this thread from the beginning, as I
only copied the minimum needed to reply; it's in linux-man@]

Hi, Ingo! and Branden!

On 6/20/22 15:49, Ingo Schwarze wrote:
> 
>> But that Makefile was clearly unused since no-one knows.
> 
> I'm not saying "noone knows"; some packaging tools on some Linux
> distributions might very well rely on the Makefile - i don't know.

Oh, Michael seemed a bit surprised that I started patching the Makefile, 
as if the features I was patching hadn't been used in a very long time.

Anyway, I fixed `make all`.  I never liked it.  Now it builds all that 
it can build, which now is HTML pages, and in the future will probably 
include PDF pages too.  It was that, or make it a no-op, so I thought 
HTML+PDF was more useful.

> All i'm saying is i don't readily see a reason why people *not*
> running Linux might need the Makefile...

Well, the Makefile is basically meant to install (copy) the files to the
system, so as long as you copy them somewhere, `gmake install mandir=...`
should work; but cp(1) -r tends to be just as useful for such a simple
package.

> 
>> If you know a better tool, I could start using it.  Maybe I could use
>> groff(1) directly, with grohtml(1).
> 
> I don't think using groff to generate HTML output from manual pages
> is a good idea.  It generates low-quality HTML code, and it is
> impossible to fix because of the basic concept how groff works.

[...]

Hmm, I'll try both and see.  Thanks.

> 
> [...]
>> I guess it's due to the use of $(foreach ).  I guess it's a GNU
>> extension and make(1POSIX) ignores that creating an empty string.  Since
>> the Makefile uses a lot of functions[1], I guess it's not easy to make
>> it portable.
> [...]
>> [1]:  addsuffix, wildcard, foreach, filter, patsubst, sort, shell,
>> basename, notdir, info.  Not sure how many of those are supported by
>> your make(1); maybe none?
> 
> The OpenBSD implementation of make(1) is much more powerful than
> POSIX make, but according to the manual page, you are right that
> none of these keywords are supported by OpenBSD make, let alone by
> POSIX make.

Heh, then to be compatible with BSD make(1) I guess I'd have to hardcode
the page names, or use suffix rules, neither of which convinces me.

Maybe suffix rules could work, but I'd have to stop testing the 
EXAMPLES, which is the most complex part of the Makefile.

I'll consider using them, or at least Substitution References[1] if it 
adds compatibility, at least for the simplest tasks, such as `make install`.

[1]: 
<https://www.gnu.org/software/make/manual/html_node/Substitution-Refs.html#Substitution-Refs>
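
For reference, here is a minimal sketch of what a substitution
reference does.  It is not taken from the man-pages Makefile: PAGES and
HTML are hypothetical variable names, and the page list is hardcoded
for illustration.  The script writes a tiny Makefile and runs it with
whatever make(1) is installed.

```shell
# Sketch only: PAGES and HTML are hypothetical names; the page list is
# hardcoded, which is what giving up $(wildcard) would imply.
mk=$(mktemp)
printf '%s\n' \
    'PAGES = list.3 slist.3 tailq.3' \
    'HTML = $(PAGES:.3=.3.html)' \
    'all:' > "$mk"
# Recipe lines must start with a tab, hence printf with \t.
printf '\t@echo $(HTML)\n' >> "$mk"

# Run it only if some make(1) is available on this system.
if command -v make >/dev/null 2>&1; then
    html=$(make -s -f "$mk" all)
    echo "$html"
fi
rm -f "$mk"
```

In GNU make, $(PAGES:.3=.3.html) is documented as equivalent to
$(patsubst %.3,%.3.html,$(PAGES)), so for simple suffix rewrites it can
replace one of the functions listed above.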

>> No, you didn't.  I expected autocomplete to help,
> 
> I almost never use autocomplete except for the names of commands that are
> installed system-wide and for the names of files in the local file system.

Oh, I use autocomplete every day for things like arguments to git
commands, and I really feel it when I'm on a system where I don't have
such help.

> I didn't even know it is possible to use autocompletion for make targets,
> and i dislike the idea.  But don't worry!  Your build system *is*
> complicated for a package that actually doesn't need to build anything,
> but not so bad that i didn't find what i looked for.  :)

:)

> As a side remark, i consider it bad style to use dependencies during
> installation: dependencies are for the build stage, not for the
> installation stage.  When i say "make install", i just want *all*
> the files installed unconditionally for two reasons: On the one hand,
> dependency handling is error-prone and it would be bad if some file
> does not get installed due to the notorious problem of oversights in
> dependency handling (and dependency handling in parallel Makefiles
> is even more fragile than in serial ones).  On the other hand, "make
> install" also has the purpose of repairing an installation that got
> broken in some way or other, and skipping some files because the
> build system *thinks* they are probably still installed properly
> defeats the purpose IMHO.

I've had doubts about that, and in the past I tended to do what you
suggest, not for fear of broken deps, but to make sure I don't create
temporary files owned by root.  But in this case, where there are
thousands of files to install, there's a significant time difference
between installing just the diff and installing the whole repo, so I
asked the following question[2] just to settle my doubts, and added the
deps.

Regarding the possibility of broken deps, I believe the solution is to
fix the Makefile, not to assume that it can't be done right and make it
dumb; and I try hard to make sure my Makefiles work in multi-process
mode.  There's always a chance that I got some corner case wrong, but
in this case it's pretty low (and if someone doesn't trust my Makefiles
to behave well with -j, I don't force anyone to use it, but I recommend
it very much :)).

Regarding `make install` having a secondary purpose of being kind of a 
reinstall, I disagree.  I tend to write an explicit `make reinstall` 
target for that purpose (implemented as `\t$(MAKE) uninstall\n\t$(MAKE) 
install`); I didn't write it yet for the man-pages, but I'm going to add 
it now.
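
Spelled out, such a target could look roughly like the sketch below.
The reinstall recipe is the one quoted above; the install and uninstall
recipes are placeholders that only print their name, so this is not the
real man-pages Makefile.

```shell
# Sketch only: install/uninstall are dummy targets for illustration.
dir=$(mktemp -d)
printf '.PHONY: install uninstall reinstall\n'   >  "$dir/Makefile"
printf 'install:\n\t@echo install\n'             >> "$dir/Makefile"
printf 'uninstall:\n\t@echo uninstall\n'         >> "$dir/Makefile"
printf 'reinstall:\n\t$(MAKE) uninstall\n\t$(MAKE) install\n' \
                                                 >> "$dir/Makefile"

# Run it only if some make(1) is available; -s keeps sub-makes quiet.
if command -v make >/dev/null 2>&1; then
    order=$(cd "$dir" && make -s reinstall)
    echo "$order"
fi
rm -rf "$dir"
```

Using two recipe lines rather than two prerequisites keeps the order
(uninstall first, then install) fixed even under -j.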

[2]: 
<https://stackoverflow.com/questions/70901364/should-make-install-depend-on-compilation>

> 
>> That's a lot, but it has
>> its advantages (generating the file list on-the-fly; no ./configure).
>>
>> Then, the actual installation of the ~2.5k pages (most of them are link
>> pages),
> 
> As another aside, i consider using .so bad style.  It is unnecessarily
> fragile.  Using hard links on the file system level (see ln(1))
> is significantly more robust.  With mandoc(1), you don't need links
> at all, but i admit traditional man(1) implementations including
> man-db still require them for manual pages having more than one name.

I also had that feeling at first.  I just leave it there because of 
"don't fix it if it ain't broke" and it just works.  .so has a good 
side, which is that the Makefile is simpler, as it doesn't need to 
create links.
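
To make the two approaches concrete, here is a small sketch in a
scratch directory.  The page names and contents are made up; only the
".so man3/..." form of the link page follows what the man-pages repo
does.

```shell
# Sketch only: directory layout and page names are illustrative.
dir=$(mktemp -d)
mkdir "$dir/man3"
printf '.TH list 3\n.SH NAME\nlist\n' > "$dir/man3/list.3"

# Variant 1: a link page whose whole content is a .so request.
printf '.so man3/list.3\n' > "$dir/man3/LIST_INIT.3"

# Variant 2: a hard link, i.e. a second name for the same inode.
ln "$dir/man3/list.3" "$dir/man3/LIST_EMPTY.3"

ino_page=$(ls -i "$dir/man3/list.3"       | awk '{print $1}')
ino_so=$(ls -i "$dir/man3/LIST_INIT.3"    | awk '{print $1}')
ino_hard=$(ls -i "$dir/man3/LIST_EMPTY.3" | awk '{print $1}')
echo "page=$ino_page so=$ino_so hard=$ino_hard"
rm -rf "$dir"
```

The .so page is a separate file that has to be resolved by the
formatter at read time, while the hard link shares the inode of the
real page; on the other hand, the .so page is plain content that lives
in the repo, whereas the hard link has to be created at install time.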

> 
>> takes another 1.4 s in multi-process mode, and 6 s in
>> single-process mode (so at least 4.6 s that are not I/O).  Maybe it's
>> make(1) that has a hard time traversing the tree... I don't know where
>> the bottleneck is, but it's clearly there.
> 
> I see.  So you need multiple processors purely for dealing with make(1)
> overhead...  Gee!  :-/

Yupee!  :/

I'll see if I can reduce that overhead without losing features.  Maybe
I'll improve compatibility along the way.  :)

> 
> [...]
>> BTW, did you check the changes to queue.3?  I guess you could improve
>> yours in a similar manner.
>>
>> <https://linux-man-pages.blogspot.com/2020/11/man-pages-509-is-released.html>
> 
> I think in OpenBSD, these changes would get vetoed by large numbers
> of developers because they violate the way OpenBSD manual pages are
> organized in several ways:
> 
>   1. Your queue(7) manual page is placed in the wrong section.
>      It is purely about an API provided by a library for the C language.
>      Such information unambiguously belongs in section 3 and certainly
>      not in section 7.  It is not even an edge case; it is perfectly
>      clear what the correct section is.

See 2.

> 
>   2. Your file names and .TH names violate the OpenBSD convention
>      that section 2 and 3 manual pages must be named after functions
>      or macros.  For example, the page name "slist" is not acceptable
>      because no slist() function or macro exists.

Heh, I agree!  I would have put them in section 7, but I was new to the 
project, and didn't want to change things too much at the time.  Since 
queue(3) was in man3, I kept the tradition, and the child pages were 
kept in man3.  Probably I should have put them in man7, but blame 
history, not me :)

Buuut, is it just me, or do I see a contradiction with point 1, which
claims that queue(3) should be in man3?  We don't have a slist()
function/macro, but we don't have a queue() one either (maybe
historically there was one and I don't know about it, but I guess not).
My systems say:

alx@devuan:/usr/include$ grepc queue
alx@devuan:/usr/include$

alx@debian:/usr/include$ grepc queue
alx@debian:/usr/include$

So, should queue now be in man3 or man7?


> 
>   3. Splitting the page up into multiple pages is a bad idea for
>      two reasons: it results in significant duplication of information
>      and it splits information about interfaces so closely related
>      to each other that most of their features are identical across
>      multiple pages.

Actually, I didn't duplicate information at all, AFAIR.  It was already
_very_ duplicated within the single queue(3) page, so I just split it at
the right points.  I only had to cut the page into many little ones,
then translate the pages from mdoc(7) to man(7), and then fix minor
style issues.

See:

$ wc -l man3/circleq.3 man3/list.3 man3/slist.3 man3/stailq.3 man3/tailq.3 man7/queue.7
   318 man3/circleq.3
   306 man3/list.3
   317 man3/slist.3
   375 man3/stailq.3
   395 man3/tailq.3
   133 man7/queue.7
  1844 total
$ git checkout man-pages-5.08 >/dev/null 2>&1
$ wc -l man3/queue.3
1231 man3/queue.3

The difference is just source code overhead; the text is almost the same.
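
Putting numbers on it, using only the line counts from the wc(1)
output above (the percentage is integer arithmetic, so it rounds down):

```shell
# Line counts copied from the wc(1) output above.
split=$((318 + 306 + 317 + 375 + 395 + 133))  # five man3 pages + queue.7
single=1231                                   # queue.3 in man-pages-5.08
extra=$((split - single))
pct=$((extra * 100 / single))                 # integer division
echo "split=$split single=$single extra=$extra (+${pct}%)"
```

That is 613 extra lines spread over six files.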

Maybe you could still simplify your queue(7) page in a different way, 
without splitting it; it is very repetitive.

> 
> You would have no chance of getting anything like that committed to
> OpenBSD.

Heh, I know.  The only thing that was well received from my side in that 
list was a bug report about some exec(3) function (about alloca(3)).

> 
>> Also, if you have been following the addition of pages about types, and
>> would like to comment, you'll be welcome!
>>
>> <https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=178eaf37e2e971cae88bd4d3f124ede0afbb1015>
> 
> BSD doesn't have manual pages about types, and i don't think there is a
> significant benefit from having them.  Most standard types are trivial
> and can easily be looked up in the header files with no need for separate
> documentation.

Oh, yes, I envy your headers.  They are really readable!
But glibc source code is not as friendly, and it took me a long, long
time to get used to searching for things in those headers (I still
can't find some things in their code; function definitions, for
example, are very cryptic in some cases).

Also, some programmers, especially beginners (but I know many "senior"
programmers who still have serious issues with types), would benefit
from documentation specifically about types.  That would help them
understand a type's limitations, and what it is or isn't appropriate
for.

>  In the unusual case that a type has non-trivial syntax
> and/or semantics, it can be documented in the manual page of the most
> closely related API function; for example, "struct pollfd" is documented
> in our poll(2) page.



> That makes it easy to find with the usual
> "man -k Vt=typename" search command.

Oh, that's where man(7) sucks :)

I wouldn't need these:

$ sed -n /^man_lsfunc/,/^}/p <scripts/bash_aliases
man_lsfunc()
{
	if [ $# -lt 1 ]; then
		>&2 echo "Usage: ${FUNCNAME[0]} <manpage|manNdir>...";
		return $EX_USAGE;
	fi

	for arg in "$@"; do
		man_section "$arg" 'SYNOPSIS';
	done \
	|sed_rm_ccomments \
	|pcregrep -Mn '(?s)^ [\w ]+ \**\w+\([\w\s(,)[\]*]*?(...)?\s*\); *$' \
	|grep '^[0-9]' \
	|sed -E 's/syscall\(SYS_(\w*),?/\1(/' \
	|sed -E 's/^[^(]+ \**(\w+)\(.*/\1/' \
	|uniq;
}
$ sed -n /^man_lsvar/,/^}/p <scripts/bash_aliases
man_lsvar()
{
	if [ $# -lt 1 ]; then
		>&2 echo "Usage: ${FUNCNAME[0]} <manpage|manNdir>...";
		return $EX_USAGE;
	fi

	for arg in "$@"; do
		man_section "$arg" 'SYNOPSIS';
	done \
	|sed_rm_ccomments \
	|pcregrep -Mv '(?s)^ [\w ]+ \**\w+\([\w\s(,)[\]*]+?(...)?\s*\); *$' \
	|pcregrep -Mn \
	  -e '(?s)^ +extern [\w ]+ \**\(\*+[\w ]+\)\([\w\s(,)[\]*]+?\s*\); *$' \
	  -e '^ +extern [\w ]+ \**[\w ]+; *$' \
	|grep '^[0-9]' \
	|grep -v 'typedef' \
	|sed -E 's/^[0-9]+: +extern [^(]+ \**\(\*+(\w* )?(\w+)\)\(.*/\2/' \
	|sed    's/^[0-9]\+: \+extern .* \**\(\w\+\); */\1/' \
	|uniq;
}

And they're not perfect...


> 
> Documenting a non-trivial type separately from the functions using it
> is counter-productive and makes the documentation hard to read because
> programmers *never* need to use a type defined by a library unless they
> also want to use related API functions.

That's not true.  I've needed (or better phrased, wanted) types, even 
when I wasn't using any APIs that used them.  The reason was that I was 
designing an API, and wanted to use the most appropriate types for my 
functions.

Having had documentation about types would have helped *a lot* at the time.

Many programmers don't know all the differences between size_t and 
ssize_t, and for example that ISO C only provides one of them (the other 
is added by POSIX), and I know of one *great* programmer that learnt the 
difference between those types from me a few months ago :).

One could search in the standard documents about the types, but I guess 
we will agree that those documents are not very friendly, especially for 
beginners.

Another case where I've found my type pages very useful was when I 
contributed[3] to iwyu(1)[4].  Having to read the POSIX or ISO C 
documents would have been crazy (I had to do it anyway, to write the 
pages, but I don't want to repeat that process again ;)).

[3]: <https://github.com/include-what-you-use/include-what-you-use/pull/930>
[4]: <https://include-what-you-use.org/>

>  Actually, i find it better
> to *not* add type names as names of manual pages because that way,
> the classical syntax "man functname" can be used to search for
> function and macro names and the advanced syntax "man -k Vt=typename"
> to search for types, with less potential for confusion.

We don't have that feature in man(7), so the closest thing that I do is
to grep in the glibc and BSDs source code with grepc(1)[5], and also
`grep -rn ...` inside the man-pages repo.

[5]: <http://www.alejandro-colomar.es/src/alx/alx/grepc.git/>

> 
> So in OpenBSD, your pages about types would get vetoed on the grounds
> of "pages not named after functions or macros" as well as on the
> grounds of "these pages do not document any function or macro;
> instead of creating a new page, put the information where it belongs."

I agree with the first argument, and it's why I didn't use section 3,
but subsection 3type.

The second, I disagree for the reasons above, but can understand why 
others might disagree with me.  Maybe I can convince you :)

> 
> That said, other projects are of course free to have such pages if
> they want to.  The mandoc(1) program is also able to handle paths like
> "man3/id_t.3type".  It will consider that page to be *both* in section
> "3" (as specified by the directory name) and in section "3type" (as
> specified by the file name and by the .TH macro).  I would consider
> it better style to keep section names consistent, i.e. to use either
> "man3/id_t.3" .TH id_t 3 or "man3type/id_t.3type" .TH id_t 3type,
> but it's not a big deal: since many systems (in particular various
> Linux distros) suffer from such inconsistencies, handling such
> inconsistencies gracefully is an important feature that certainly
> won't get removed.

I considered[6] using man3type, and used man3 in the end just because 
when in doubt I opted for the smallest change.  Knowing that it breaks 
mandoc(1), I'll definitely move to <man3type/>.

[6]: 
<https://lore.kernel.org/linux-man/761bb12f-31e0-369d-8315-d2e1545505c7@gmail.com/T/#u>

> 
>>>   Commands like
>>>
>>>      $ man -M /co/man-pages open
>>>
>>> work perfectly fine on my system to view the Linux open(2) manual,
>>> nicely formatted, with no need for installation or a Makefile.
>>> Even when i put up a copy at
>>>
>>>     https://man.bsd.lv/Linux-5.13/open

Yes, since there's no compilation, `make install` is basically a wrapper 
around `cp -r`.

It has nice features, such as reduced install time by checking 
timestamps, but that's more useful to me as a maintainer (since I 
install several times a minute in some cases), and not so much for end 
users, where a few seconds are not important.
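
The timestamp check amounts to something like the following sketch.
install_newer is a hypothetical helper written for this mail, not part
of the man-pages build system, which lets make(1) do the comparison
through dependencies instead.

```shell
# Sketch only: install_newer is a hypothetical helper.  It copies a file
# only if the destination is missing or older than the source, and
# prints how many files it copied.
install_newer()
{
    src=$1
    dst=$2
    n=0
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        t="$dst/${f##*/}"
        if [ ! -e "$t" ] || [ "$f" -nt "$t" ]; then
            cp "$f" "$t"
            n=$((n + 1))
        fi
    done
    echo "$n"
}

tmp=$(mktemp -d)
mkdir "$tmp/src" "$tmp/dst"
echo page1 > "$tmp/src/a.3"
echo page2 > "$tmp/src/b.3"
first=$(install_newer "$tmp/src" "$tmp/dst")    # fresh install
second=$(install_newer "$tmp/src" "$tmp/dst")   # nothing changed since
echo "first=$first second=$second"
rm -rf "$tmp"
```

The second run copies nothing, which is where the time saving on
repeated installs comes from; a plain `cp -r` would copy both files
again.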

> 
>> How do you generate your HTML pages?  mandoc(1)?  They are nice.
> 
> https://man.openbsd.org/mandoc.1#HTML_Output
> https://man.openbsd.org/man.cgi.8
> 
> I think that's the usual way to generate HTML from manual pages
> nowadays.  The following sites also use mandoc for HTML output:
> 
>   * https://www.freebsd.org/cgi/man.cgi
>   * https://manpages.debian.org/
>   * https://man.archlinux.org/
>   * https://man.voidlinux.org/
> 
> Some of these have their own CGI handling and/or database code,
> but they all use the mandoc parser and HTML generator.

Interesting.  Thanks!

Cheers,

Alex

-- 
Alejandro Colomar
<http://www.alejandro-colomar.es/>


Thread overview: 15+ messages
2022-06-19 17:52 Linux man-pages Makefile portability Alejandro Colomar
2022-06-19 21:06 ` Ingo Schwarze
2022-06-19 22:23   ` Alejandro Colomar
2022-06-20 13:49     ` Ingo Schwarze
2022-07-03 21:44       ` Alejandro Colomar [this message]
2022-07-21 14:17         ` Alejandro Colomar
2022-07-22 16:59           ` Ingo Schwarze
2022-07-22 17:37             ` Alejandro Colomar
2022-07-23 18:16               ` Ingo Schwarze
2022-07-24 11:09                 ` Alejandro Colomar (man-pages)
2022-07-24 15:57                   ` Ingo Schwarze
2022-07-24 17:29                     ` Semantic man(7) markup (was: Linux man-pages Makefile portability) G. Branden Robinson
2022-07-24 21:26                       ` Ingo Schwarze
2022-07-25  9:28                   ` Linux man-pages Makefile portability Colin Watson
2022-07-25 12:46                     ` bash-completion doesn't support man subsections (was: Linux man-pages Makefile portability) Alejandro Colomar
