From: Markus Armbruster <armbru@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: kwolf@redhat.com, qemu-devel@nongnu.org, qemu-block@nongnu.org,
mreitz@redhat.com
Subject: Re: [PATCH for-5.1 8/8] qemu-option: Move is_valid_option_list() to qemu-img.c and rewrite
Date: Tue, 14 Apr 2020 10:47:44 +0200
Message-ID: <87k12ixxlb.fsf@dusky.pond.sub.org>
In-Reply-To: <e3d05915-268f-0d1a-e760-723a10807d16@redhat.com> (Eric Blake's message of "Thu, 9 Apr 2020 14:45:10 -0500")
Eric Blake <eblake@redhat.com> writes:
> On 4/9/20 10:30 AM, Markus Armbruster wrote:
>> is_valid_option_list()'s purpose is ensuring qemu-img.c can safely
>> join multiple parameter strings separated by ',' like this:
>>
>> g_strdup_printf("%s,%s", params1, params2);
>>
>> How it does that is anything but obvious. A close reading of the code
>> reveals that it fails exactly when its argument starts with ',' or
>> ends with an odd number of ','. Makes sense, actually, because when
>> the argument starts with ',', a separating ',' preceding it would get
>> escaped, and when it ends with an odd number of ',', a separating ','
>> following it would get escaped.
>>
>> Move it to qemu-img.c and rewrite it the obvious way.
>>
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>> include/qemu/option.h | 1 -
>> qemu-img.c | 26 ++++++++++++++++++++++++++
>> util/qemu-option.c | 22 ----------------------
>> 3 files changed, 26 insertions(+), 23 deletions(-)
>>
>
>> +++ b/qemu-img.c
>> @@ -223,6 +223,32 @@ static bool qemu_img_object_print_help(const char *type, QemuOpts *opts)
>> return true;
>> }
>> +/*
>> + * Is @optarg safe for accumulate_options()?
>> + * It is when multiple of them can be joined together separated by ','.
>> + * To make that work, @optarg must not start with ',' (or else a
>> + * separating ',' preceding it gets escaped), and it must not end with
>> + * an odd number of ',' (or else a separating ',' following it gets
>> + * escaped).
>> + */
>> +static bool is_valid_option_list(const char *optarg)
>> +{
>> +    size_t len = strlen(optarg);
>> +    size_t i;
>> +
>> +    if (optarg[0] == ',') {
>> +        return false;
>> +    }
>> +
>> +    for (i = len; i > 0 && optarg[i - 1] == ','; i--) {
>> +    }
>> +    if ((len - i) % 2) {
>> +        return false;
>> +    }
>> +
>> +    return true;
>
> Okay, that's easy to read. Note that is_valid_option_list("") returns
> true.
Hmm, that's a bug:
$ qemu-img create -f qcow2 -o backing_file=a -o "" -o backing_fmt=raw,size=1M new.qcow2
qemu-img: warning: Could not verify backing image. This may become an error in future versions.
Could not open 'a,backing_fmt=raw': No such file or directory
Formatting 'new.qcow2', fmt=qcow2 size=1048576 backing_file=a,,backing_fmt=raw cluster_size=65536 lazy_refcounts=off refcount_bits=16
$ qemu-img info new.qcow2
image: new.qcow2
file format: qcow2
virtual size: 1 MiB (1048576 bytes)
disk size: 196 KiB
cluster_size: 65536
--> backing file: a,backing_fmt=raw
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
My rewrite preserves this bug. Will fix in v2.
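To make the failure mode concrete, here is a minimal sketch, not the actual v2 change. join_options() is a hypothetical stand-in for the g_strdup_printf("%s,%s", ...) join, and is_valid_option_list_sketch() is the rewritten check from this patch with one assumed addition: it rejects the empty string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for g_strdup_printf("%s,%s", a, b). */
static char *join_options(const char *a, const char *b)
{
    size_t n = strlen(a) + strlen(b) + 2;   /* separating ',' plus NUL */
    char *s = malloc(n);

    if (!s) {
        abort();
    }
    snprintf(s, n, "%s,%s", a, b);
    return s;
}

/*
 * Sketch only: the check from this patch, plus an assumed rejection
 * of "".  An empty part would put two separators next to each other,
 * and the parser reads ",," as an escaped comma.
 */
static bool is_valid_option_list_sketch(const char *optarg)
{
    size_t len = strlen(optarg);
    size_t i;

    if (len == 0 || optarg[0] == ',') {
        return false;
    }

    /* Count the run of trailing commas; an odd run escapes the separator. */
    for (i = len; i > 0 && optarg[i - 1] == ','; i--) {
    }
    return (len - i) % 2 == 0;
}
```

Joining "backing_file=a", "" and "backing_fmt=raw,size=1M" in turn with join_options() gives "backing_file=a,,backing_fmt=raw,size=1M", which contains the escaped comma seen in the transcript above. With the extra length check, the empty -o argument would be rejected before it gets joined.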
> But proving it replaces the obtuse mess is harder...
>
>> +++ b/util/qemu-option.c
>> @@ -165,28 +165,6 @@ void parse_option_size(const char *name, const char *value,
>> *ret = size;
>> }
>> -bool is_valid_option_list(const char *p)
>> -{
>> -    char *value = NULL;
>> -    bool result = false;
>> -
>> -    while (*p) {
>> -        p = get_opt_value(p, &value);
>
> Refreshing myself on what get_opt_value() does:
>
>> const char *get_opt_value(const char *p, char **value)
>> {
>>     size_t capacity = 0, length;
>>     const char *offset;
>>
>>     *value = NULL;
>
> start with p pointing to the tail of a string from which to extract a value,
>
>>     while (1) {
>>         offset = qemu_strchrnul(p, ',');
>
> find the potential end of the value: either the first comma (not yet
> sure if it is ',,' or the end of this value), or the end of the
> string,
>
>>         length = offset - p;
>>         if (*offset != '\0' && *(offset + 1) == ',') {
>>             length++;
>>         }
>
> if we found a comma and it was doubled, we are going to unescape the comma,
>
>>         *value = g_renew(char, *value, capacity + length + 1);
>>         strncpy(*value + capacity, p, length);
>
> copy bytes into the tail of value,
>
>>         (*value)[capacity + length] = '\0';
>>         capacity += length;
>>         if (*offset == '\0' ||
>>             *(offset + 1) != ',') {
>>             break;
>>         }
>
> if we hit the end of the string or the ',' before the next option, we
> are done,
>
>>
>>         p += (offset - p) + 2;
>
> otherwise we unescaped a ',,' and continue appending to the current value.
>
>>     }
>>
>>     return offset;
>> }
>
> and the resulting return value is a substring of p: either the NUL
> byte ending p, or the last ',' in the first odd run of commas (the
> next byte after the return is then NUL if the parser would next see an
> empty option name, or non-NUL if the parser would start processing the
> next option name).
Yes.
> Some interesting cases:
> get_opt_value("", &value) returns "" with value set to ""
> get_opt_value(",", &value) returns "," with value set to ""
> get_opt_value(",,", &value) returns "" with value set to ","
> get_opt_value(",,,", &value) returns "," with value set to ","
> get_opt_value(",,a", &value) returns "" with value set to ",a"
> get_opt_value("a,,", &value) returns "" with value set to "a,"
> get_opt_value("a,b", &value) returns ",b" with value set to "a"
> get_opt_value("a,,b", &value) returns "" with value set to "a,b"
QemuOpts is a language without syntax errors.
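The cases above can be checked outside the QEMU tree. What follows is a minimal sketch, not QEMU code: get_opt_value_port() is a standard-C port of the quoted function, with realloc() standing in for g_renew(), memcpy() for strncpy(), and the hypothetical helper strchrnul_port() for qemu_strchrnul(). case_ok() and count_failed_cases() are likewise made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for qemu_strchrnul(): like strchr(), but returns a pointer
 * to the terminating NUL instead of NULL when @c is not found. */
static const char *strchrnul_port(const char *s, int c)
{
    const char *p = strchr(s, c);

    return p ? p : s + strlen(s);
}

/* Port of the quoted get_opt_value(); aborts on allocation failure. */
static const char *get_opt_value_port(const char *p, char **value)
{
    size_t capacity = 0, length;
    const char *offset;

    *value = NULL;
    while (1) {
        offset = strchrnul_port(p, ',');
        length = offset - p;
        if (*offset != '\0' && *(offset + 1) == ',') {
            length++;               /* keep one ',' of an escaped ",," */
        }
        *value = realloc(*value, capacity + length + 1);
        if (!*value) {
            abort();
        }
        /* @length never exceeds what is left of the string */
        memcpy(*value + capacity, p, length);
        (*value)[capacity + length] = '\0';
        capacity += length;
        if (*offset == '\0' || *(offset + 1) != ',') {
            break;                  /* end of string, or a lone ',' */
        }
        p += (offset - p) + 2;      /* continue after the ",," */
    }
    return offset;
}

/* Returns 1 if parsing @in leaves @rest unparsed and yields @val. */
static int case_ok(const char *in, const char *rest, const char *val)
{
    char *value;
    const char *end = get_opt_value_port(in, &value);
    int ok = !strcmp(end, rest) && !strcmp(value, val);

    free(value);
    return ok;
}

/* Counts how many of the eight cases listed above disagree. */
static int count_failed_cases(void)
{
    return !case_ok("", "", "")
         + !case_ok(",", ",", "")
         + !case_ok(",,", "", ",")
         + !case_ok(",,,", ",", ",")
         + !case_ok(",,a", "", ",a")
         + !case_ok("a,,", "", "a,")
         + !case_ok("a,b", ",b", "a")
         + !case_ok("a,,b", "", "a,b");
}
```

Called from a small main(), count_failed_cases() returns 0 when the port agrees with every case in the list.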
> With that detour out of the way:
>
>> -        if ((*p && !*++p) ||
>
> If *p, then we know '*p' == ','; checking !*++p moves past the comma
> and determines if we have the empty string as the next potential
> option name (if so, we ended in an odd number of commas) or anything
> else (if so, repeat the loop, just past the comma). Oddly, this means
> that is_valid_option_list("a,,,b") returns true (first pass on "a,,,b"
> returned ",b", second pass on "b" returned ""). But I agree that
> !*++p only fires on the final iteration through our loop of
> get_opt_value() calls, matching your rewrite to check for an odd
> number of trailing commas (regardless of even or odd pairing of commas
> in the middle).
>
>> -            (!*value || *value == ',')) {
>
> Focusing on "*value == ','", on the first iteration, *value can be
> comma; on later iterations, we are guaranteed that *value will not be
> comma (because later iterations can only occur if the first iteration
> returned p pointing to a final comma of a run, but we always start the
> next iteration beyond that comma). So this matches your rewrite to
> check for a leading comma.
>
> Focusing on "!*value": the loop is never entered to begin with on
> is_valid_option_list("") (that returns true in the old implementation
> as well as yours); otherwise, value will only be empty if p originally
> started with an unpaired comma (but your new code catches that with
> its check for a leading comma).
>
> Note that is_valid_option_list("") returning true has some odd effects:
>
> $ qemu-img create -f qcow2 -o '' xyz 1
> Formatting 'xyz', fmt=qcow2 size=1 cluster_size=65536
> lazy_refcounts=off refcount_bits=16
> $ qemu-img create -f qcow2 -o '' -o '' xyz 1
> qemu-img: xyz: Invalid parameter ''
> $ qemu-img create -f qcow2 -o 'help' -o '' -o '' xyz 1
> qemu-img: xyz: Invalid parameter 'help'
> $ qemu-img create -f qcow2 -o 'help' -o '' xyz 1 | head -n1
> Supported options:
>
> but as I can't see any substantial differences between the old
> algorithm and your new one, I don't see that changing here. Perhaps
> you want another patch to make is_valid_option_list() return false for
> "".
I do.
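Short of a proof, the equivalence of the old and new checks can at least be tested exhaustively on short inputs. The sketch below is an illustration under stated assumptions rather than QEMU code: get_opt_value_port() is a standard-C port of the quoted get_opt_value() (strchr() plus strlen() in place of qemu_strchrnul(), realloc() and free() in place of the GLib calls), old_is_valid() and new_is_valid() are copies of the two versions of is_valid_option_list(), and count_mismatches() is a hypothetical driver. Only ',' and 'a' are enumerated, on the assumption that both checks treat every non-comma byte alike.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Standard-C port of the quoted get_opt_value(). */
static const char *get_opt_value_port(const char *p, char **value)
{
    size_t capacity = 0, length;
    const char *offset;

    *value = NULL;
    while (1) {
        offset = strchr(p, ',');
        if (!offset) {
            offset = p + strlen(p);     /* what qemu_strchrnul() returns */
        }
        length = offset - p;
        if (*offset != '\0' && *(offset + 1) == ',') {
            length++;
        }
        *value = realloc(*value, capacity + length + 1);
        if (!*value) {
            abort();
        }
        memcpy(*value + capacity, p, length);
        (*value)[capacity + length] = '\0';
        capacity += length;
        if (*offset == '\0' || *(offset + 1) != ',') {
            break;
        }
        p += (offset - p) + 2;
    }
    return offset;
}

/* The old check removed from util/qemu-option.c. */
static bool old_is_valid(const char *p)
{
    char *value = NULL;
    bool result = false;

    while (*p) {
        p = get_opt_value_port(p, &value);
        if ((*p && !*++p) || (!*value || *value == ',')) {
            goto out;
        }
        free(value);
        value = NULL;
    }
    result = true;
out:
    free(value);
    return result;
}

/* The rewritten check added to qemu-img.c by this patch. */
static bool new_is_valid(const char *optarg)
{
    size_t len = strlen(optarg);
    size_t i;

    if (optarg[0] == ',') {
        return false;
    }
    for (i = len; i > 0 && optarg[i - 1] == ','; i--) {
    }
    return (len - i) % 2 == 0;
}

/* Counts the strings over {',', 'a'} of length <= @max_len (capped
 * at 15) on which the two checks disagree. */
static unsigned long count_mismatches(size_t max_len)
{
    char buf[16];
    unsigned long mismatches = 0;

    for (size_t len = 0; len <= max_len && len < sizeof(buf); len++) {
        for (unsigned long n = 0; n < (1UL << len); n++) {
            /* bit k of n selects the character at position k */
            for (size_t k = 0; k < len; k++) {
                buf[k] = ((n >> k) & 1) ? ',' : 'a';
            }
            buf[len] = '\0';
            mismatches += old_is_valid(buf) != new_is_valid(buf);
        }
    }
    return mismatches;
}
```

A result of 0 from count_mismatches(10) covers every such string of up to ten characters. It is evidence for the equivalence argued above, not a proof of it.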
>> -            goto out;
>> -        }
>> -
>> -        g_free(value);
>> -        value = NULL;
>> -    }
>> -
>> -    result = true;
>> -out:
>> -    g_free(value);
>> -    return result;
>> -}
>
> At any rate, your new code is a LOT more efficient and legible than
> the old. I'll restate the "why" in my own words:
> is_valid_option_list() is only used by our code base to concatenate
> multiple -o strings into one string, where each -o individually parses
> into one or more options;
Correct.
> and where the concatenated string must parse
> back to the same number of embedded options.
We want the concatenated string to parse into the concatenation of the
parse of its parts. Whether "same number" implies that is an
interesting puzzle, but not one worth solving today :)
> As both -o ",help" and
> -o "help," are invalid option lists on their own (there is no valid
> empty option for -o),
Yes, it's *semantically* invalid. It parses as sugar for "help=on,=on",
except "help=on" doesn't get you help, only "help" does.
> it makes no sense to append that -o to other
> option lists to produce a single string (where the appending would
> become confusing due to creating escaped ',,' that were not present in
> either original -o).
Yes. If the empty option existed, concatenation would not do. It
doesn't now, and I expect it to stay that way.
> Reviewed-by: Eric Blake <eblake@redhat.com>
Thanks for your thorough review!