git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Michael Haggerty <mhagger@alum.mit.edu>
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.org>,
	git@vger.kernel.org
Subject: Re: [PATCH 00/14] numparse module: systematically tighten up integer parsing
Date: Thu, 19 Mar 2015 01:26:20 -0400	[thread overview]
Message-ID: <20150319052620.GA30645@peff.net> (raw)
In-Reply-To: <1426608016-2978-1-git-send-email-mhagger@alum.mit.edu>

On Tue, Mar 17, 2015 at 05:00:02PM +0100, Michael Haggerty wrote:

> My main questions:
> 
> * Do people like the API? My main goal was to make these functions as
>   painless as possible to use correctly, because there are so many
>   call sites.
> 
> * Is it too gimmicky to encode the base together with other options in
>   `flags`? (I think it is worth it to avoid the need for another
>   parameter, which callers could easily put in the wrong order.)

I definitely like the overall direction of this. My first thought was
that it does seem like there are a lot of possible options to the
functions (and OR-ing the flags with the base does seem weird, though I
can't think of a plausible case where it would actually cause errors).
Many of those options don't seem used in the example conversions (I'm
not clear who would want NUM_SATURATE, for example).

I wondered if we could do away with the radix entirely. Wouldn't we be
asking for base 10 most of the time? Of course, your first few patches
show octal parsing, but I wonder if we should actually have a separate
parse_mode() for that, since that seems to be the general reason for
doing octal parsing. 100000644 does not overflow an int, but it is
hardly a reasonable mode.

I also wondered if we could get rid of NUM_SIGN in favor of just having
the type imply it (e.g., convert_l would always allow negative numbers,
whereas convert_ul would not). But I suppose there are times when we end
up using an "int" to store an unsigned value for a good reason (e.g.,
"-1" is a sentinel value, but we expect only positive values from the
user). So that might be a bad idea.

I notice that you go up to "unsigned long" here for sizes. If we want to
use this everywhere, we'll need something larger for parsing off_t
values on 32-bit machines (though I would not at all be surprised with
the current code if 32-bit machines have a hard time configuring a
pack.packSizeLimit above 4G).

I wonder how much of the boilerplate in the parse_* functions could be
factored out to use a uintmax_t, with the caller just providing the
range. That would make it easier to add new types like off_t, and
possibly even constrained types (e.g., an integer from 0 to 100). On the
other hand, you mentioned to me elsewhere that there may be some bugs in
the range-checks of config.c's integer parsing. I suspect they are
related to exactly this kind of refactoring, so perhaps writing things
out is best.

> * Am I making callers too strict? In many cases where a positive
>   integer is expected (e.g., "--abbrev=<num>"), I have been replacing
>   code like
> [...]

IMHO most of the tightening happening here is a good thing, and means we
are more likely to notice mistakes rather than silently doing something
stupid.

For sites that currently allow it, I could imagine people using hex
notation for some values, though, depending on the context. It looks
there aren't many of them ((it is just when the radix is "0", right?).
Some of them look to be accidental (does anybody really ask for
--threads=0x10?), but others might not be (the pack index-version
contains an offset field that might be quite big).


Feel free to ignore any or all of that. It is not so much criticism as
a dump of thoughts I had while reading the patches. Perhaps you can pick
something useful out of that, and perhaps not. :)

-Peff

  parent reply	other threads:[~2015-03-19  5:26 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-17 16:00 [PATCH 00/14] numparse module: systematically tighten up integer parsing Michael Haggerty
2015-03-17 16:00 ` [PATCH 01/14] numparse: new module for parsing integral numbers Michael Haggerty
2015-03-18 18:27   ` Eric Sunshine
2015-03-18 22:47     ` Michael Haggerty
2015-03-20  8:54       ` Eric Sunshine
2015-03-20 17:51   ` Junio C Hamano
2015-03-17 16:00 ` [PATCH 02/14] cacheinfo_callback(): use convert_ui() when handling "--cacheinfo" Michael Haggerty
2015-03-17 16:00 ` [PATCH 03/14] write_subdirectory(): use convert_ui() for parsing mode Michael Haggerty
2015-03-17 16:00 ` [PATCH 04/14] handle_revision_opt(): use skip_prefix() in many places Michael Haggerty
2015-03-17 16:00 ` [PATCH 05/14] handle_revision_opt(): use convert_i() when handling "-<digit>" Michael Haggerty
2015-03-19  6:34   ` Junio C Hamano
2015-03-17 16:00 ` [PATCH 06/14] strtoul_ui(), strtol_i(): remove functions Michael Haggerty
2015-03-17 16:00 ` [PATCH 07/14] handle_revision_opt(): use convert_ui() when handling "--abbrev=" Michael Haggerty
2015-03-17 16:00 ` [PATCH 08/14] builtin_diff(): detect errors when parsing --unified argument Michael Haggerty
2015-03-17 16:00 ` [PATCH 09/14] opt_arg(): val is always non-NULL Michael Haggerty
2015-03-17 16:00 ` [PATCH 10/14] opt_arg(): use convert_i() in implementation Michael Haggerty
2015-03-17 16:00 ` [PATCH 11/14] opt_arg(): report errors parsing option values Michael Haggerty
2015-03-17 16:00 ` [PATCH 12/14] opt_arg(): simplify pointer handling Michael Haggerty
2015-03-17 16:00 ` [PATCH 13/14] diff_opt_parse(): use convert_i() when handling "-l<num>" Michael Haggerty
2015-03-17 16:00 ` [PATCH 14/14] diff_opt_parse(): use convert_i() when handling --abbrev=<num> Michael Haggerty
2015-03-19  6:37   ` Junio C Hamano
2015-03-17 18:48 ` [PATCH 00/14] numparse module: systematically tighten up integer parsing Junio C Hamano
2015-03-17 19:46   ` Michael Haggerty
2015-03-19  6:31     ` Junio C Hamano
2015-03-17 23:05 ` Duy Nguyen
2015-03-18  9:47   ` Michael Haggerty
2015-03-18  9:58     ` Duy Nguyen
2015-03-18 10:03     ` Jeff King
2015-03-18 10:20       ` Michael Haggerty
2015-03-19  5:26 ` Jeff King [this message]
2015-03-19  6:41   ` Junio C Hamano
2015-03-19  7:32   ` Junio C Hamano
2015-03-24 16:06     ` Michael Haggerty
2015-03-24 16:49       ` René Scharfe
2015-03-25 21:14         ` Michael Haggerty
2015-03-25 21:59           ` Junio C Hamano
2015-03-24 15:05   ` Michael Haggerty
2015-03-19  6:22 ` Junio C Hamano
2015-03-24 15:42   ` Michael Haggerty
2015-03-24 15:58     ` Junio C Hamano
2015-03-24 16:09       ` Junio C Hamano
2015-03-24 17:39       ` Michael Haggerty
2015-03-24 18:08         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150319052620.GA30645@peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    --cc=peff@peff.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).