From: Darren Hart <dvhart@infradead.org>
To: Joe Perches <joe@perches.com>
Cc: "John 'Warthog9' Hawley (VMware)" <warthog9@eaglescrag.net>,
linux-kernel@vger.kernel.org, Andy Whitcroft <apw@canonical.com>
Subject: Re: [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings
Date: Wed, 22 Mar 2017 23:01:33 -0700 [thread overview]
Message-ID: <20170323060133.GD17578@localhost.localdomain> (raw)
In-Reply-To: <1490206653.2041.17.camel@perches.com>
On Wed, Mar 22, 2017 at 11:17:33AM -0700, Joe Perches wrote:
> On Wed, 2017-03-22 at 08:25 -0700, Darren Hart wrote:
> > On Tue, Mar 21, 2017 at 11:31:08AM -0700, Joe Perches wrote:
> > > On Tue, 2017-03-21 at 09:30 -0700, John 'Warthog9' Hawley (VMware) wrote:
> > > > Spamassassin sticks a long (~79 character) long string after a
> > > > line that has a single space in it. The line with space causes
> > > > checkpatch to erroniously think that it's in the content body, as
> > > > opposed to headers and thus flag a mail header as an unwrapped long
> > > > comment line.
> > >
> > > If the spammassassin header is like
> > >
> > > email-header-n: foo
> > > email-header-m: bar
> > >
> > > X-Spam-Report: bar
> >
> > The specific content of the X-Spam-Report that triggers this for me,
> > from this patch for example, is:
> >
> > === 8< ===
> > X-Spam-Report: SpamAssassin version 3.4.1 on casper.infradead.org summary:
> > Content analysis details: (-1.9 points, 5.0 required)
> >
> > pts rule name description
> > ---- ---------------------- --------------------------------------------------
> > -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
> > [score: 0.0000]
> > X-TUID: alGBIuPZmqOj
> >
> > === >8 ===
> >
> > The long ---- ----... line is over 75 characters and triggers the test
> > for long commit_log lines.
> >
> > >
> > > Does that form follow rfc 5322?
> >
> > By my reading, this is governed by the long header fields defined by
> > 2.2.3, with whitespace folding defined as "a CRLF may be inserted before
> > any WSP."
> >
> > >
> > > If it does then any email header could have that
> > > form and the header wrapping test should be
> >
> > Yes, agreed.
> >
> > So the logic we want is:
> >
> > If we are in headers and we detect a CRLF and the next line starts with a WSP,
> > then we are still in headers (and therefor not in the commit log). The CRLF
> > information does not appear to be available as it is replaced with just \n.
> >
> > > updated from
> > >
> > > if ($in_header_lines && $realfile =~ /^$/ &&
> > > !($rawline =~ /^\s+\S/ ||
> > > $rawline =~ /^(commit\b|from\b|[\w-]+:).*$/i)) {
> > > $in_header_lines = 0;
> > > $in_commit_log = 1;
> > > $has_commit_log = 1;
> > > }
> > >
> > > to something like
> > >
> > > if ($in_header_lines && $realfile =~ /^$/ &&
> > > !($rawline =~ /^ (?:\s*\S|$)/ ||
> >
> > Hrm... lines that start with maybe a space followed by a : ... Why did you
> > introduce that part of the check?
>
> The regex doesn't care about colons.
> It's a perl non-capturing group.
> https://perldoc.perl.org/perlretut.html#Non-capturing-groupings
Aha, thank you for the pointer.
>
> > Looking at this more closely, I was also not clear why the original test looked
> > for several spaces followed by non-space. What case is this for?
>
> Not several spaces, one or more spaces then a non-space.
> The only change here is allowing an initial space followed
> by either:
>
> 1: optional spaces, then non-space.
> 2: EOL
>
> I supposed you could argue that case 2 should
> also allow optional spaces before EOL and the
> test should be
>
> if ($in_header_lines && $realfile =~ /^$/ &&
> !($rawline =~ /^\s+(?:\S|$)/ ||
I still haven't figured out why we test for this specific set of patterns. Why
is a line that starts with a space and ends with a newline considered still
in_header_lines. Or more specifically, why aren't we just testing for an empty
line (RFC 5322 Section 2.1, defining the separation of headers and the body).
--
Darren Hart
VMware Open Source Technology Center
next prev parent reply other threads:[~2017-03-23 7:45 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-03-21 16:30 [PATCH] checkpatch: Flag spam header (X-Spam-Report) to prevent spurious warnings John 'Warthog9' Hawley (VMware)
2017-03-21 18:31 ` Joe Perches
2017-03-22 15:25 ` Darren Hart
2017-03-22 18:17 ` Joe Perches
2017-03-23 6:01 ` Darren Hart [this message]
2017-03-23 6:07 ` Joe Perches
2017-03-24 20:14 ` John 'Warthog9' Hawley
2017-03-24 20:19 ` John 'Warthog9' Hawley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170323060133.GD17578@localhost.localdomain \
--to=dvhart@infradead.org \
--cc=apw@canonical.com \
--cc=joe@perches.com \
--cc=linux-kernel@vger.kernel.org \
--cc=warthog9@eaglescrag.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.