From: Jakub Narebski <jnareb@gmail.com>
To: Thomas Rast <trast@student.ethz.ch>
Cc: git@vger.kernel.org, "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Jonathan Nieder" <jrnieder@gmail.com>,
"Junio C Hamano" <gitster@pobox.com>,
"Sverre Rabbelier" <srabbelier@gmail.com>
Subject: Re: [RFC PATCH v2 2/3] Documentation: complete config list from other manpages
Date: Fri, 22 Oct 2010 16:31:52 +0200 [thread overview]
Message-ID: <201010221631.55580.jnareb@gmail.com> (raw)
In-Reply-To: <8145782bddf60325909f328337cb76d25c4402cf.1287690696.git.trast@student.ethz.ch>
On Fri, 22 Oct 2010 07:02, Thomas Rast wrote:
> Add an autogeneration script that inserts links to other manpages
> describing config variables, as follows:
>
> * parse config-vars-src.txt (as it now needs to be renamed) to find
> out its current contents
>
> * parse each manpage source (following includes) for config variable
> headers (the blocks are ignored)
>
> * assemble a new config-vars.txt that completes the original list with
> "See linkgit:git-foo[1]" entries for all variables that were not in
> it.
So is this script about automatically managing links from git-config
manpage to manpages of individual commands? Does it mean that for
something like the following in Documentation/config-vars-src.txt
foo.bar::
Lorem ipsum dolor sit amet, consectetur adipisicing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua.
and if `foo.bar` is referenced in Documentation/git-foo.txt in some way
(from reading source of mentioned autogeneration script it considers
only 'foo.bar::', like in Documentation/git-send-email.txt), then it
adds
See linkgit:git-foo[1]
at the end of description of variable in Documentation/config-vars-src.txt
(taking into account continuation blocks)? What about config variables
mentioned in different ways, e.g. '`foo.bar`', like in git-update-index
documentation? Does it checks that 'See linkgit:git-foo[1]' already
exists, e.g. in extended form 'See linkgit:git-foo[1]. True by default.'?
Or is it about automatically creating and updating blocks like this:
sendemail.aliasesfile::
sendemail.aliasfiletype::
sendemail.bcc::
sendemail.cc::
[...]
sendemail.thread::
sendemail.validate::
See linkgit:git-send-email[1] for description.
See also comments below, where I realized what this scipt does...
This really should be in commit message.
> Signed-off-by: Thomas Rast <trast@student.ethz.ch>
> ---
> Documentation/Makefile | 9 +-
> Documentation/config-vars-src.txt | 1747 +++++++++++++++++++++++++++++++++++
> Documentation/config-vars.txt | 1747 -----------------------------------
> Documentation/make-config-list.perl | 131 +++
> 4 files changed, 1886 insertions(+), 1748 deletions(-)
> create mode 100644 Documentation/config-vars-src.txt
> delete mode 100644 Documentation/config-vars.txt
> create mode 100755 Documentation/make-config-list.perl
[...]
> diff --git a/Documentation/make-config-list.perl b/Documentation/make-config-list.perl
> new file mode 100755
> index 0000000..f086867
> --- /dev/null
> +++ b/Documentation/make-config-list.perl
> @@ -0,0 +1,131 @@
> +#!/usr/bin/perl
While other helper scripts do not include comments describing them, it
would be nice to have here description what script does and for what it
is used.
Comments in code would also be nice.
The code lacks consistency:
> + open my $fh, "<", $file or die "cannot open $file: $!";
vs
> + my $fp;
> + open $fp, '<', $name or die "open $name failed: $!";
> + close $fh or die "eh? close failed: $!";
vs
> + close $fp or die "close $name failed: $!";
> +
> +use strict;
> +use warnings;
> +use Getopt::Long;
> +
> +my %ignore;
> +
> +my $rc = GetOptions(
> + "mainlist=s" => \my $mainlistfile,
> + "ignore=s" => sub { $ignore{$_[1]} = 1 },
> + "no-sort" => \my $no_sort,
> + );
> +
> +if (!$rc or (!-r $mainlistfile)) {
> + print "$0 --mainlist=<mainlist> [--ignore=<ignore>...] <asciidoc_manpage>...\n";
> + exit 1;
> +}
The names <mainlist> (which is file with list of configuration variables
to modify), <ignore> (which is about ignoring some asciidoc_manpage files,
but only when recursing / following linkgit links) doesn't tell us much
by themselves. It really needs either better names, or comment describing
what they mean, or both.
> +
It would be good to have comment what this subroutine does. It reads
and parses file with list of config variables and their description,
and returns reference to list of variables, in the order they were in
file, and reference to hash with contents:
variable => [ lines of description of variable ]
> +sub read_varlist {
> + my ($file) = @_;
> +
> + open my $fh, "<", $file or die "cannot open $file: $!";
> + my (%mainlist, @mainvars);
> +
> + my ($v, $last_v);
> + my $in_block = 0;
> + while (<$fh>) {
> + if (/^(\S+)::/) {
> + $v = lc $1;
> + $in_block = 0;
> + push @{$mainlist{$v}}, $_;
> + push @mainvars, $v;
O.K., that is easy to understand (if one remembers about autovivification
in Perl). But shouldn't we check if we are not in literal block, just
in case?
> + } elsif (/^$/ && !$in_block) {
> + if (defined $last_v && !$#{$mainlist{$last_v}}) {
> + $mainlist{$last_v} = $mainlist{$v};
> + }
> + $last_v = $v;
What is this branch / this code about?
> + } elsif (defined $v) {
> + push @{$mainlist{$v}}, $_;
> + $in_block = !$in_block if /^--$/;
O.K., easy to understand.
> + }
> + }
> +
> + close $fh or die "eh? close failed: $!";
> +
> + return \%mainlist, \@mainvars;
> +}
> +
It would be nice to have description what those variables should contain
(what structure would they have).
> +my %vars;
> +my %sections;
> +
It would be good to have comment what this subroutine does; better name
than 'read_file' would also be good.
This subroutine does *two* things: finds _manual_ section that manpage
belongs to, to create correct link (e.g. 1 for git-send-email(1), 5 for
gitattributes(5), 7 for gitcli(7)), and finds all manpages that refer
to config variable using '^foo.bar::': $vars{'foo.bar'} = [ 'git-foo', ... ]
It would also automatically follow includes, excluding ignored files
from following 'include::<filename>[]' links.
> +sub read_file {
> + my ($name, $main_name) = @_;
$name is name of current file, $main_name is name of file that included
it, isn't it?
> + if (!defined $main_name) {
> + $main_name = $name;
> + }
> + my $manpage_name = $main_name;
> + $manpage_name =~ s/\.txt//;
> + my $fp;
> + open $fp, '<', $name or die "open $name failed: $!";
> + while (<$fp>) {
> + if ($. < 5 && /^$manpage_name\((\d+)\)/) {
> + $sections{$manpage_name} = $1;
> + }
Shouldn't it be always first line of manpage?
> + if (/^([a-z0-9]+\.[a-z<>0-9.]+)::/) {
> + push @{$vars{$1}}, $manpage_name;
> + }
> + if (/^include::\s*(\S+)\s*\[\]/
> + && !exists $ignore{$1}) {
> + read_file($1, $main_name);
> + }
> + }
> + close $fp or die "close $name failed: $!";
> +}
> +
> +foreach my $name (@ARGV) {
> + read_file($name);
> +}
> +
> +my ($mainlist, $mainvars) = read_varlist($mainlistfile);
> +
> +my @all_keys = @$mainvars;
> +foreach my $k (keys %vars) {
> + if (!exists $mainlist->{$k}) {
> + push @all_keys, $k;
> + }
Nice, so it would detect config variables which are missing from
config-vars-src.txt
> +}
> +
> +@all_keys = sort @all_keys unless $no_sort;
> +
> +my %out;
> +foreach my $k (@all_keys) {
> + if (exists $mainlist->{$k}) {
> + push @{$out{$k}}, @{$mainlist->{$k}}, "\n";
Ah, so it doesn't add 'See linkgit:git-foo[1]' if 'foo.bar' is present
in config-vars-src.txt, but only generate notice about config variable
in git-config manpage, refering to the page where it is defined and
described.
This should be make more clear in commit message.
> + } elsif (exists $vars{$k}) {
> + push @{$out{$k}}, $k . "::\n";
> + push @{$out{$k}}, "\tSee ";
Wouldn't this result in something like
sendemail.aliasesfile::
See linkgit:git-send-email[1]
sendemail.aliasfiletype::
See linkgit:git-send-email[1]
instead of
sendemail.aliasesfile::
sendemail.aliasfiletype::
See linkgit:git-send-email[1]
in the case where config-vars-src.txt is missing some variable?
Ah, I see that it is collapsed later.
> + my $sep = " ";
Not $sep = "", or "\tSee"? Otherwise you would get "\tSee linkgit:git-foo[1]"
with double space.
> + foreach my $p (sort @{$vars{$k}}) {
> + next if ($p =~ /$mainlistfile/);
> + if (!exists $sections{$p}) {
> + warn "section for $p unknown";
> + $sections{$p} = 1;
> + }
> + push @{$out{$k}}, $sep . "linkgit:" . $p . "[" . $sections{$p} . "]";
A bit of possible (but perhaps not necessary) refactoring:
+ push @{$out{$k}}, $sep . gen_linkgit($p);
(or something like that).
Besides wouldn't it be better to do collapsing based on data, not on
formatted string?
> + $sep = ", ";
> + }
> + push @{$out{$k}}, ".\n\n";
> + } else {
> + die "can't happen: $k not in any source";
> + }
> +}
> +
A comment what this loop does would be nice.
Note that we don't really have to do this collapsing for existing
contents; only for contents that was generated by this script.
> +for (my $i = 0; $i < $#all_keys; $i++) {
> + next if $#{$out{$all_keys[$i]}} != $#{$out{$all_keys[$i+1]}};
> + my $same = 1;
> + for (my $j = 1; $j <= $#{$out{$all_keys[$i]}}; $j++) {
> + if ($out{$all_keys[$i]}[$j] ne $out{$all_keys[$i+1]}[$j]) {
> + $same = 0;
> + last;
> + }
> + }
A bit of possible (but perhaps not necessary) refactoring:
+ next unless eq_deeply($out{$all_keys[$i]}, $out{$all_keys[$i+1]});
(or eq_deeply_arr).
> + if ($same) {
> + $out{$all_keys[$i]} = [$out{$all_keys[$i]}[0]];
> + }
> +}
I really think that we can do this in an easier, more elegant, and not
that convoluted way.
First, we don't really need to store 'foo.bar::' as first element in
$out{'foo.bar'}. Second, then compacting is just grouping hash by value.
> +
> +foreach my $k (@all_keys) {
> + print @{$out{$k}};
> +}
> --
> 1.7.3.1.281.g5da0b
>
>
--
Jakub Narebski
Poland
next prev parent reply other threads:[~2010-10-22 14:32 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-10-22 5:02 [RFC PATCH v2 0/3] Documentation: refactor config variable descriptions Thomas Rast
2010-10-22 5:02 ` [RFC PATCH v2 3/3] Documentation: move format.* documentation to format-patch Thomas Rast
[not found] ` <c3f621cd062b2c4f80aa2e8dadcfddbc042aefaa.1287690696.git.trast@student.ethz.ch>
2010-10-22 8:18 ` [RFC PATCH v2 1/3] Documentation: Move variables from config.txt to separate file Jakub Narebski
2010-10-22 15:17 ` [RFC PATCH v2 0/3] Documentation: refactor config variable descriptions Jakub Narebski
2010-10-22 15:53 ` Jeff King
2010-10-24 1:24 ` Thomas Rast
2010-10-25 15:04 ` Jeff King
2010-10-25 12:44 ` Jakub Narebski
2010-10-25 15:11 ` Jeff King
2010-10-25 15:49 ` Jakub Narebski
2010-10-27 10:56 ` Jakub Narebski
2010-11-02 13:42 ` Jens Lehmann
[not found] ` <8145782bddf60325909f328337cb76d25c4402cf.1287690696.git.trast@student.ethz.ch>
2010-10-22 6:19 ` [RFC PATCH v2 2/3] Documentation: complete config list from other manpages Jakub Narebski
2010-10-22 14:31 ` Jakub Narebski [this message]
2010-10-23 22:24 ` [PATCH] " Thomas Rast
2010-10-23 22:30 ` Jakub Narebski
2010-10-24 14:36 ` Jakub Narebski
2010-10-24 20:44 ` [RFC PATCH v3 2/3] " Jakub Narebski
2010-10-24 20:55 ` [RFC PATCH v3 2/3 (amend)] " Jakub Narebski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201010221631.55580.jnareb@gmail.com \
--to=jnareb@gmail.com \
--cc=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jrnieder@gmail.com \
--cc=srabbelier@gmail.com \
--cc=trast@student.ethz.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).