From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: "René Scharfe" <l.s.r@web.de>, "Git List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>
Subject: Re: [PATCH] p5311: handle spaces in wc(1) output
Date: Sun, 03 Oct 2021 10:04:30 +0200
Message-ID: <87wnmuo7ii.fsf@evledraar.gmail.com>
In-Reply-To: <YVk8SeuDIWwsrdO0@nand.local>


On Sun, Oct 03 2021, Taylor Blau wrote:

> On Sat, Oct 02, 2021 at 10:33:18PM +0200, René Scharfe wrote:
>> Some implementations of wc(1) align their output with leading spaces,
>> even when just a single number is requested, e.g. with "wc -c".  p5311
>> runs all tests successfully on such a platform, but fails to aggregate
>> their results and reports:
>
> This makes sense, and makes me think that wc's platform-specific
> implementations are too tricky to use when we are being picky about
> leading spaces.
>
> In other words, I think that your fix is absolutely correct, but I
> wonder if test_size should be friendlier in what it accepts, and to
> chomp off any leading space. So perhaps something like the below would
> work without any modification to p5311.
>
> --- 8< ---
>
> Subject: [PATCH] t/perf/aggregate.perl: tolerate leading spaces
>
> When using `test_size` with `wc -c`, users on certain platforms can run
> into issues when `wc` emits leading space characters in its output,
> which confuses get_times.
>
> Callers could switch to use test_file_size instead of `wc -c` (the
> former never prints leading space characters, so will always work with
> test_size regardless of platform), but this is an easy enough spot to
> miss that we should teach get_times to be more tolerant of the input it
> accepts.
>
> Teach get_times to do just that by stripping any leading space
> characters.
>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  t/perf/aggregate.perl | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl
> index 82c0df4553..575d2000cc 100755
> --- a/t/perf/aggregate.perl
> +++ b/t/perf/aggregate.perl
> @@ -17,8 +17,8 @@ sub get_times {
>  		my $rt = ((defined $1 ? $1 : 0.0)*60+$2)*60+$3;
>  		return ($rt, $4, $5);
>  	# size
> -	} elsif ($line =~ /^\d+$/) {
> -		return $&;
> +	} elsif ($line =~ /^\s*(\d+)$/) {
> +		return $1;
>  	} else {
>  		die "bad input line: $line";
>  	}

This approach seems like a bit of plastering over the real problem. It's
fine to use the output of "wc -l" or "wc -c" in the context of the
shell's whitespace handling. That's why in various places we do:

    test $(wc -l <$file) = 1

Or similar, but *don't* put that $() in double-quotes. I.e. we're
relying on the shell's whitespace semantics.
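
To make the quoting point concrete, here is a minimal sketch. It assumes a
wc(1) that pads its output, simulated by a hypothetical fake_wc function so
that it behaves the same on every platform:

```shell
# fake_wc is a hypothetical stand-in for a wc(1) that pads with spaces.
fake_wc() { printf '       1\n'; }

# Unquoted: field splitting drops the padding, so test(1) sees "1".
if test $(fake_wc) = 1; then unquoted=match; else unquoted=mismatch; fi

# Quoted: the padding survives, and "       1" is not the string "1".
if test "$(fake_wc)" = 1; then quoted=match; else quoted=mismatch; fi

echo "unquoted: $unquoted"
echo "quoted: $quoted"
```

With the default IFS this should report a match for the unquoted form and a
mismatch for the quoted one.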

So isn't it better to just pass this through the shell's own handling
before emitting the data? Something like this POC:

    $ stripspace() { var=$1; echo $@; }; x=$(stripspace "  hi" "  there "); echo "\"$x\""
    "hi there"

Of course, fixing it up afterwards in Perl will work just as well, so I
guess this is just an aesthetic preference for having the shell handle
the shell's output issues with what are guaranteed to be shell-portable
solutions... :)
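
As a rough sketch of what the shell-side fix could look like when applied to a
padded size (strip_ws and padded_wc are hypothetical names, not existing
t/perf helpers, and the case statement only approximates the /^\d+$/ check in
get_times):

```shell
# padded_wc is a hypothetical stand-in for a "wc -c" that pads its output.
padded_wc() { printf '     123\n'; }

# strip_ws is a hypothetical helper: the unquoted $* is field-split, so
# leading and trailing whitespace disappears before echo prints the value.
# Caveat: unquoted expansion also globs, which is harmless for plain digits.
strip_ws() { echo $*; }

size=$(strip_ws "$(padded_wc)")

# Shell approximation of the digits-only match done in aggregate.perl.
case $size in
''|*[!0-9]*) verdict="bad input line" ;;
*) verdict="digits only" ;;
esac

echo "[$size] $verdict"
```

Whether this is nicer than stripping in get_times is, as said, mostly a matter
of taste.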

Thread overview: 5+ messages
2021-10-02 20:33 [PATCH] p5311: handle spaces in wc(1) output René Scharfe
2021-10-03  5:14 ` Taylor Blau
2021-10-03  8:04   ` Ævar Arnfjörð Bjarmason [this message]
2021-10-04 16:16     ` Junio C Hamano
2021-10-04  7:43   ` Jeff King
