From: Josh Triplett <josh@joshtriplett.org>
To: Nick Bowler <nbowler@elliptictech.com>
Cc: linux-kernel@vger.kernel.org, Andy Whitcroft <apw@canonical.com>,
Joe Perches <joe@perches.com>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: Re: [PATCH] checkpatch: Check for quoted strings broken across lines
Date: Thu, 2 Feb 2012 17:31:08 -0800 [thread overview]
Message-ID: <20120203013108.GA13456@leaf> (raw)
In-Reply-To: <20120202220853.GA13723@elliptictech.com>
On Thu, Feb 02, 2012 at 05:08:53PM -0500, Nick Bowler wrote:
> On 2012-02-02 13:16 -0800, Josh Triplett wrote:
> > On Thu, Feb 02, 2012 at 03:22:07PM -0500, Nick Bowler wrote:
> > > On 2012-02-02 12:06 -0800, Josh Triplett wrote:
> > > > Documentation/CodingStyle recommends not splitting quoted strings across
> > > > lines, because it breaks the ability to grep for the string. checkpatch
> > > > already makes an exception to the 80-column rule for quoted strings to
> > > > allow this. Rather than just allowing it, actively warn about quoted
> > > > strings split across lines.
> > > [...]
> > > > +# Check for strings broken across lines (breaks greppability). Make an
> > > > +# exception when the previous string ends in a newline (multiple lines in one
> > > > +# string constant) or \n\t (common in inline assembly to indent the instruction
> > > > +# on the following line).
> > >
> > > There are tons of strings in the kernel that this makes checkpatch warn
> > > about where it probably shouldn't. For example, this one (from
> > > kernel/auditsc.c:1476):
> > >
> > > audit_log_format(ab,
> > > "oflag=0x%x mode=%#ho mq_flags=0x%lx mq_maxmsg=%ld "
> > > "mq_msgsize=%ld mq_curmsgs=%ld",
> > >
> > > WARNING: quoted string split across lines
> > > #1478: FILE: auditsc.c:1478:
> > > + "mq_msgsize=%ld mq_curmsgs=%ld",
> > >
> > > Breaking "greppability" of this string is a non-issue, because this sort
> > > of string is not really greppable to begin with (and would certainly not
> > > be any easier to grep for if it were all on one line).
> >
> > While I agree that in that particular case (heavy on the %formats and
> > light on the text) you can't easily grep to begin with, the guideline
> > from CodingStyle still applies,
>
> The guideline from CodingStyle talks about "user-visible strings". I
> don't think this string counts, because it contains code that is not
> displayed to the user (hence, the string is not user-visible). That is
> the reason why it's not greppable in the first place.
>
> There are hundreds, if not thousands of similar strings in the kernel.
OK, with a combination of arcane grep/sed commands and hand checking, I
just scanned the *entire* kernel for quoted strings split across lines,
and filtered out all user-visible strings, leaving only the split
non-user-visible strings.
An excerpt from that pile of ugly, suggesting just how many ways kernel
code has to spell "user-visible":
\(print[a-z_]*\|pr_[a-z]*\|pr_[a-z]*_ratelimited\|pr_debug[0-9]\|pr_info_once\|dev_[a-z]*\|msg\|warn\|warn_once\|dbg\|debug\|debugp\|err\|error\|errdev\|errx\|warning\|DBG[A-Z0-9_]*\|DEBUG[A-Z_]*\|trace[a-z0-9_]*\|LOG_KMS\|exception\|FAIL\|fatal\|assert\|panic_vm\|trace\|panic\|puts\|MODULE_[A-Z_]*\|asm\|__asm__\|volatile\|__volatile__\|message\|log\|info\|DAC960_[A-Za-z_]*\|notify\|ufsd\|crit\|debugf[0-9]\|DP\|die\|fyi\|notice\|dout\|snoop\|throtl_log_tg\|ERROR_INODE\|RFALSE\|DMESGE\|SAY\|JOM\|SAM\|JOT\|DBF_EVENT\|DE_ACT\|s390_handle_damage\|CIO_MSG_EVENT\|CIO_CRW_EVENT\|DBF_EVENT_DEVID\|fat_fs_error_ratelimit\|mark_tsc_unstable\|ata_ehi_push_desc\|DEB[0-9]\|IUCV_DBF_TEXT_\|LOG_PARSE\|esp_log_[a-z]*\|check_warn_return\|D_ISR\|D_CALIB\|D_RATE\|D_TX_REPLY\|D_TXPOWER\|D_EEPROM\|D_POWER\|D_SCAN\|D_ASSOC\|D_HC\|D_HC_DUMP\|IP_VS_ERR_BUF\|DMWARN_LIMIT\|ide_dump_status\|ppfinit\|mca_recovered\|die_if_kernel\|die_if_no_fixup\|gdbstub_proto\|ite_pr\|nvt_pr\|deb_i2c\|deb_irq\|deb_ts\|deb_rc\|mxl_i2c\|MLX4_EN_PARM_INT\|DI\|d_fnstart\|d_fnend\|lbs_deb_[a-z_]*\|LPFC_[A-Z_]*\|stop\|mconsole_reply_v0\|CH_ALERT\|D1\|ADD\|lbtf_deb_usbd\|IWL_DEBUG_MAC80211\|WL_CONN\|CX18_INFO_DEV\|CX18_ERR_DEV\|OPT_BOOLEAN\|asc_prt_line\|gdbstub_msg_write\|DCCP_BUG\|fappend\)
I found that a single change to my checkpatch test (only checking
strings passed as function arguments) limits the false positives in the
entire kernel to a grand total of 12:
arch/powerpc/platforms/iseries/mf.c: memcpy(src, "\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00"
arch/um/os-Linux/process.c: if (sscanf(buf, "%*d " COMM_SCANF " %*c %*d %*d %*d %*d %*d %*d %*d "
arch/x86/platform/olpc/olpc_dt.c: olpc_dt_interpret("\" /battery@0\" find-device"
arch/x86/platform/olpc/olpc_dt.c: olpc_dt_interpret("\" /pci/display@1\" find-device"
arch/x86/platform/olpc/olpc_dt.c: olpc_dt_interpret("\" /pci/display@1,1\" find-device"
drivers/block/rbd.c: "%" __stringify(RBD_MAX_OBJ_NAME_LEN) "s"
drivers/pcmcia/ds.c: if (add_uevent_var(env, "MODALIAS=pcmcia:m%04Xc%04Xf%02Xfn%02Xpfn%02X"
drivers/tty/tty_audit.c: audit_log_format(ab, "%s pid=%u uid=%u auid=%u ses=%u "
fs/ecryptfs/crypto.c:static unsigned char *portable_filename_chars = ("-.0123456789ABCD"
kernel/auditsc.c: audit_log_format(ab, " inode=%lu"
kernel/auditsc.c: audit_log_format(ab, "login pid=%d uid=%u "
security/selinux/ss/services.c: audit_log_format(ab, "op=security_compute_av reason=%s "
Not "hundreds, if not thousands", but 12, including 4 calls to
audit_log_format. :)
Meanwhile, this checkpatch test correctly flags several thousand
instances of user-visible strings that shouldn't get split. That seems
like a pretty reasonable ratio to me; what do you think?
I'll send PATCHv2, which limits the check to function parameters,
momentarily.
- Josh Triplett
next prev parent reply other threads:[~2012-02-03 1:31 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-02-02 20:06 [PATCH] checkpatch: Check for quoted strings broken across lines Josh Triplett
2012-02-02 20:22 ` Nick Bowler
2012-02-02 21:16 ` Josh Triplett
2012-02-02 22:08 ` Nick Bowler
2012-02-03 1:31 ` Josh Triplett [this message]
2012-02-02 21:29 ` Jesper Juhl
2012-02-02 21:34 ` Joe Perches
2012-02-02 22:02 ` Josh Triplett
2012-02-02 20:36 ` Joe Perches
2012-02-02 21:28 ` Josh Triplett
2012-02-02 21:32 ` Joe Perches
2012-02-02 22:36 ` Josh Triplett
-- strict thread matches above, loose matches on Subject: below --
2012-02-03 5:27 Josh Triplett
2012-02-03 5:38 ` Joe Perches
2012-02-03 8:55 ` Josh Triplett
2012-02-03 6:34 ` Josh Triplett
2012-03-20 21:06 Josh Triplett
2012-03-21 1:24 ` Joe Perches
2012-03-21 4:28 ` Josh Triplett
2012-03-21 4:35 ` Joe Perches
2012-03-21 6:05 ` Josh Triplett
2012-03-21 12:11 ` Joe Perches
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120203013108.GA13456@leaf \
--to=josh@joshtriplett.org \
--cc=apw@canonical.com \
--cc=joe@perches.com \
--cc=linux-kernel@vger.kernel.org \
--cc=nbowler@elliptictech.com \
--cc=paulmck@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).