From: Michael Haggerty <mhagger@alum.mit.edu>
To: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>
Cc: Jeff King <peff@peff.org>, git@vger.kernel.org
Subject: Re: [PATCH 00/14] numparse module: systematically tighten up integer parsing
Date: Tue, 24 Mar 2015 17:06:12 +0100 [thread overview]
Message-ID: <55118B74.1030201@alum.mit.edu> (raw)
In-Reply-To: <xmqqk2ydjvcd.fsf@gitster.dls.corp.google.com>
On 03/19/2015 08:32 AM, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
>> I wonder how much of the boilerplate in the parse_* functions could be
>> factored out to use a uintmax_t, with the caller just providing the
>> range. That would make it easier to add new types like off_t, and
>> possibly even constrained types (e.g., an integer from 0 to 100). On the
>> other hand, you mentioned to me elsewhere that there may be some bugs in
>> the range-checks of config.c's integer parsing. I suspect they are
>> related to exactly this kind of refactoring, so perhaps writing things
>> out is best.
>
> I like this idea very well. I wonder if we can implement the family
> of
>
> parse_{type}(const char *, unsigned int flags,
> const char **endptr, {type} *result)
>
> functions by calls a helper that internally deals with the numbers
> in uintmax_t, and then checks if the value fits within the possible
> range of the *result before returning.
>
> int parse_i(const char *str, unsigned flags,
> const char **endptr, int *result) {
> uintmax_t val;
> int sign = 1, status;
> if (*str == '-') {
> sign = -1;
> str++;
> }
> status = parse_num(str, flags, endptr, &val, INT_MAX);
> if (status < 0)
> return status;
> *result = sign < 0 ? -val : val;
> return 0;
> }
>
> (assuming the last parameter to parse_num is used to check if the
> result fits within that range). Or that may be easier and cleaner
> to be done in the caller with or something like that:
>
> status = parse_num(str, flags, endptr, &val);
> if (status < 0)
> return status;
> if (INT_MAX <= val * sign || val * sign <= INT_MIN) {
> errno = ERANGE;
> return -1;
> }
>
> If we wanted to, we may even be able to avoid duplication of
> boilerplate by wrapping the above pattern within a macro,
> parameterizing the TYPE_{MIN,MAX} and using token pasting, to
> expand to four necessary result types.
>
> There is no reason for the implementation of the parse_num() helper
> to be using strtoul(3) or strtoull(3); its behaviour will be under
> our total control. It can become fancier by enriching the flags
> bits (e.g. allow scaling factor, etc.) only once and all variants
> for various result types will get the same feature.
Parsing numbers is not rocket science, but there are a lot of pitfalls,
especially around overflow. It's even harder to write such code via
macros and the result is less readable.
This patch series is mostly about finding a reasonable API and whipping
the callers into shape. That seems ambitious enough for me. I'd rather
stick with boring wrappers for now and lean on strtol()/strtoul() to do
the dirty work. It will be easy for a motivated person to change the
implementation later.
Michael
--
Michael Haggerty
mhagger@alum.mit.edu
next prev parent reply other threads:[~2015-03-24 16:06 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-17 16:00 [PATCH 00/14] numparse module: systematically tighten up integer parsing Michael Haggerty
2015-03-17 16:00 ` [PATCH 01/14] numparse: new module for parsing integral numbers Michael Haggerty
2015-03-18 18:27 ` Eric Sunshine
2015-03-18 22:47 ` Michael Haggerty
2015-03-20 8:54 ` Eric Sunshine
2015-03-20 17:51 ` Junio C Hamano
2015-03-17 16:00 ` [PATCH 02/14] cacheinfo_callback(): use convert_ui() when handling "--cacheinfo" Michael Haggerty
2015-03-17 16:00 ` [PATCH 03/14] write_subdirectory(): use convert_ui() for parsing mode Michael Haggerty
2015-03-17 16:00 ` [PATCH 04/14] handle_revision_opt(): use skip_prefix() in many places Michael Haggerty
2015-03-17 16:00 ` [PATCH 05/14] handle_revision_opt(): use convert_i() when handling "-<digit>" Michael Haggerty
2015-03-19 6:34 ` Junio C Hamano
2015-03-17 16:00 ` [PATCH 06/14] strtoul_ui(), strtol_i(): remove functions Michael Haggerty
2015-03-17 16:00 ` [PATCH 07/14] handle_revision_opt(): use convert_ui() when handling "--abbrev=" Michael Haggerty
2015-03-17 16:00 ` [PATCH 08/14] builtin_diff(): detect errors when parsing --unified argument Michael Haggerty
2015-03-17 16:00 ` [PATCH 09/14] opt_arg(): val is always non-NULL Michael Haggerty
2015-03-17 16:00 ` [PATCH 10/14] opt_arg(): use convert_i() in implementation Michael Haggerty
2015-03-17 16:00 ` [PATCH 11/14] opt_arg(): report errors parsing option values Michael Haggerty
2015-03-17 16:00 ` [PATCH 12/14] opt_arg(): simplify pointer handling Michael Haggerty
2015-03-17 16:00 ` [PATCH 13/14] diff_opt_parse(): use convert_i() when handling "-l<num>" Michael Haggerty
2015-03-17 16:00 ` [PATCH 14/14] diff_opt_parse(): use convert_i() when handling --abbrev=<num> Michael Haggerty
2015-03-19 6:37 ` Junio C Hamano
2015-03-17 18:48 ` [PATCH 00/14] numparse module: systematically tighten up integer parsing Junio C Hamano
2015-03-17 19:46 ` Michael Haggerty
2015-03-19 6:31 ` Junio C Hamano
2015-03-17 23:05 ` Duy Nguyen
2015-03-18 9:47 ` Michael Haggerty
2015-03-18 9:58 ` Duy Nguyen
2015-03-18 10:03 ` Jeff King
2015-03-18 10:20 ` Michael Haggerty
2015-03-19 5:26 ` Jeff King
2015-03-19 6:41 ` Junio C Hamano
2015-03-19 7:32 ` Junio C Hamano
2015-03-24 16:06 ` Michael Haggerty [this message]
2015-03-24 16:49 ` René Scharfe
2015-03-25 21:14 ` Michael Haggerty
2015-03-25 21:59 ` Junio C Hamano
2015-03-24 15:05 ` Michael Haggerty
2015-03-19 6:22 ` Junio C Hamano
2015-03-24 15:42 ` Michael Haggerty
2015-03-24 15:58 ` Junio C Hamano
2015-03-24 16:09 ` Junio C Hamano
2015-03-24 17:39 ` Michael Haggerty
2015-03-24 18:08 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55118B74.1030201@alum.mit.edu \
--to=mhagger@alum.mit.edu \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=peff@peff.net \
--cc=peff@peff.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).