From: Eric Sunshine
Date: Tue, 28 Mar 2023 23:13:08 -0400
Subject: Re: [PATCH 3/4] tests: drop here-doc check from internal chain-linter
To: Jeff King
Cc: Junio C Hamano, git@vger.kernel.org, Phillip Wood, Michael J Gruber
References: <20230328202043.GA1241391@coredump.intra.peff.net>
 <20230328202819.GC1241631@coredump.intra.peff.net>
 <20230329023702.GA1793752@coredump.intra.peff.net>
 <20230329030432.GA1801645@coredump.intra.peff.net>
In-Reply-To: <20230329030432.GA1801645@coredump.intra.peff.net>

On Tue, Mar 28, 2023 at 11:04 PM Jeff King wrote:
> On Tue, Mar 28, 2023 at 10:37:02PM -0400, Jeff King wrote:
> > I just think chainlint.pl is doing a good enough job of catching it that
> > we can rely on it. I'll be curious if Eric has input there on whether it
> > can do even better, which would remove all of the caveats from the
> > commit message.
>
> So I _think_ it's something like this:
>
> @@ -171,6 +171,9 @@ sub swallow_heredocs {
>  		my $start = pos($$b);
>  		my $indent = $tag =~ s/^\t// ? '\\s*' : '';
>  		$$b =~ /(?:\G|\n)$indent\Q$tag\E(?:\n|\z)/gc;
> +		if (pos($$b) == $start) {
> +			die "oops, we did not find the end of the heredoc";
> +		}
>  		my $body = substr($$b, $start, pos($$b) - $start);
>  		$self->{lineno} += () = $body =~ /\n/sg;
>  	}
>
> But I wasn't sure how to surface a clean error from here, since we're in
> the Lexer. Maybe we just accumulate a "problems" array here, and then
> roll those up via the TestParser? I'm not very familiar with the
> arrangement of that part of the script.

Yes, it would look something like that and you chose the correct spot
to detect the problem, but to get a "pretty" error message properly
positioned in the input, we need to capture the input stream position
of the here-doc tag itself in scan_heredoc_tag(). It doesn't look too
difficult, and I even started writing a bit of code to do it, but I'm
not sure how soon I can get around to finishing the implementation.

> And I say "think" because the thing I was worried about is that we'd do
> this lexing at too high a level, and accidentally walk past the end of
> the test. Which would mean getting fooled by;
>
>   test_expect_success 'this one is broken' '
>   	cat >foo <<\EOF
>   	oops, we are missing our here-doc end
>   '
>
>   test_expect_success 'this one is ok' '
>   	cat >foo <<\EOF
>   	this one is OK, but we would not want to confuse
>   	its closing tag for the missing one
>   	EOF
>   '
>
> But it looks like Lexer::swallow_heredocs gets to see the individual
> test snippets.

Correct. ScriptParser scans the input files for
test_expect_{success,failure} invocations in order to extract the
individual test snippets, which then get passed to TestParser for
semantic analysis.
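
For illustration only, here is a rough standalone sketch in Python of the two ideas discussed above: remembering where the here-doc tag itself appeared so that a problem can be reported at that position, and accumulating problems in a list rather than dying inside the lexer. It is not code from chainlint.pl and not the unfinished implementation mentioned above. The names `find_unclosed_heredocs` and `HEREDOC_TAG` are hypothetical, and the regex is a simplification that does not tokenize shell properly (it would, for example, misread `<<` inside arithmetic).

```python
import re

# Simplified tag matcher (an assumption, not chainlint.pl's lexer):
# handles <<TAG, <<-TAG, <<\TAG, <<'TAG' and <<"TAG".
# Group 1 is the optional "-", group 2 is the bare tag name.
HEREDOC_TAG = re.compile(r"<<(-?)\s*[\\'\"]?(\w+)['\"]?")


def find_unclosed_heredocs(snippet):
    """Return a list of (line_number, tag) for here-docs with no closing tag.

    `snippet` is the body of a single test, so a closing tag that belongs
    to a later test can never be mistaken for the missing one.
    """
    problems = []
    lines = snippet.split("\n")
    i = 0
    while i < len(lines):
        tags = [(m.group(2), bool(m.group(1)))
                for m in HEREDOC_TAG.finditer(lines[i])]
        tag_line = i + 1  # 1-based line on which the tag itself appeared
        i += 1
        for tag, strip_tabs in tags:
            # Search the remaining lines of this snippet for the terminator.
            for j in range(i, len(lines)):
                candidate = lines[j].lstrip("\t") if strip_tabs else lines[j]
                if candidate == tag:
                    i = j + 1  # resume scanning after the terminator
                    break
            else:
                # Record the problem at the tag's position instead of dying.
                problems.append((tag_line, tag))
                return problems  # the body ran to the end of the snippet
    return problems
```

Calling it once per extracted test body mirrors the per-snippet arrangement described above: a body of `"cat >foo <<\\EOF\noops\n"` reports the tag on line 1, while a body that contains its own `EOF` line reports nothing.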