Re: [Qemu-devel] [PATCH 1/4] block/rbd: don't copy strings in qemu_rbd_next_tok()

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jeff Cody <jcody@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 1/4] block/rbd: don't copy strings in qemu_rbd_next_tok()
Date: Mon, 27 Feb 2017 13:07:10 -0500	[thread overview]
Message-ID: <20170227180710.GG25637@localhost.localdomain> (raw)
In-Reply-To: <87fuiza8q6.fsf@dusky.pond.sub.org>

On Mon, Feb 27, 2017 at 05:39:45PM +0100, Markus Armbruster wrote:
> Jeff Cody <jcody@redhat.com> writes:
> 
> > This patch is prep work for parsing options for .bdrv_parse_filename,
> > and using QDict options.
> >
> > The function qemu_rbd_next_tok() searched for various key/value pairs,
> > and copied them into buffers.  This will soon be an unnecessary extra
> > step, so we will now return found strings by reference only, and
> > offload the responsibility for safely handling/coping these strings to
> > the caller.
> >
> > This also cleans up error handling some, as the callers now rely on
> > the Error object to determine if there is a parse error.
> >
> > Signed-off-by: Jeff Cody <jcody@redhat.com>
> > ---
> >  block/rbd.c | 99 +++++++++++++++++++++++++++++++++++++++----------------------
> >  1 file changed, 64 insertions(+), 35 deletions(-)
> >
> > diff --git a/block/rbd.c b/block/rbd.c
> > index 22e8e69..3f1a9de 100644
> > --- a/block/rbd.c
> > +++ b/block/rbd.c
> > @@ -102,10 +102,10 @@ typedef struct BDRVRBDState {
> >      char *snap;
> >  } BDRVRBDState;
> >  
> > -static int qemu_rbd_next_tok(char *dst, int dst_len,
> > -                             char *src, char delim,
> > -                             const char *name,
> > -                             char **p, Error **errp)
> > +static char *qemu_rbd_next_tok(int max_len,
> > +                               char *src, char delim,
> > +                               const char *name,
> > +                               char **p, Error **errp)
> >  {
> >      int l;
> >      char *end;
> 
>        *p = NULL;
> 
>        if (delim != '\0') {
>            for (end = src; *end; ++end) {
>                if (*end == delim) {
>                    break;
>                }
>                if (*end == '\\' && end[1] != '\0') {
>                    end++;
>                }
>            }
>            if (*end == delim) {
>                *p = end + 1;
>                *end = '\0';
> >          }
> >      }
> >      l = strlen(src);
> 
> Not this patch's problem: this is a rather thoughtless way to say
> 
>        l = end - src;
> 
> > -    if (l >= dst_len) {
> > +    if (l >= max_len) {
> >          error_setg(errp, "%s too long", name);
> > -        return -EINVAL;
> > +        return NULL;
> >      } else if (l == 0) {
> >          error_setg(errp, "%s too short", name);
> > -        return -EINVAL;
> > +        return NULL;
> >      }
> >  
> > -    pstrcpy(dst, dst_len, src);
> > -
> > -    return 0;
> > +    return src;
> >  }
> 
> Note for later:
> 
> 1. This function always dereferences @src.
> 2. If @delim, it sets *@p to point behind @src plus the delimiter,
>    else to NULL
> 3. It returns NULL exactly when it sets an error.
> 4. It returns NULL and sets an error when @src is empty.
> 
> Not this patch's problem, but I have to say it: whoever wrote this code
> should either write simpler functions or get into the habit of writing
> function contract comments.  Because having to document your
> embarrassingly complicated shit is great motivation to simplify (pardon
> my french).
>

Heh.  I had to read and re-read this function multiple times to get a feel
for what it was doing.


> >  
> >  static void qemu_rbd_unescape(char *src)
> > @@ -162,7 +160,9 @@ static int qemu_rbd_parsename(const char *filename,
> 
> This is a parser.  As so often, it is a parser without any hint on what
> it's supposed to parse, let alone a grammar.  Experience tells that
> these are wrong more often than not, and exposing me to yet another one
> is a surefire way to make me grumpy.  Not your fault, of course.
> 
> >  {
> >      const char *start;
> >      char *p, *buf;
> > -    int ret;
> > +    int ret = 0;
> > +    char *found_str;
> > +    Error *local_err = NULL;
> >  
> >      if (!strstart(filename, "rbd:", &start)) {
> >          error_setg(errp, "File name must start with 'rbd:'");
>            return -EINVAL;
>        }
> 
>        buf = g_strdup(start);
>        p = buf;
> 
> This assignment to @p ...
> 
> >      *snap = '\0';
> >      *conf = '\0';
> >  
> > -    ret = qemu_rbd_next_tok(pool, pool_len, p,
> > -                            '/', "pool name", &p, errp);
> > -    if (ret < 0 || !p) {
> > +    found_str = qemu_rbd_next_tok(pool_len, p,
> > +                                 '/', "pool name", &p, &local_err);
> 
> ... is dead, because qemu_rbd_next_tok() assigns to it unconditionally.
> 

While that is true, @p is also used as the src argument to
qemu_rbd_next_tok() in addition (second arg).  We could just pass in @buf
for that argument, but using @p keeps it consistent with the other calls.



> The call extracts the part up to the first unescaped '/' or the end of
> the string.
> 
> > +    if (local_err) {
> > +        goto done;
> > +    }
> > +    if (!p) {
> 
> We extracted to end of string, i.e. we didn't find '/'.
> 
> >          ret = -EINVAL;
> > +        error_setg(errp, "Pool name is required");
> >          goto done;
> >      }
> > -    qemu_rbd_unescape(pool);
> > +    qemu_rbd_unescape(found_str);
> > +    g_strlcpy(pool, found_str, pool_len);
> 
> Before, we copy, then unescape the copy.
> 
> After, we unescape in place, then copy.
> 
> Unescaping can't lengthen the string.  Therefore, this is safe as long
> as nothing else uses this part of @buf.
> 
> >  
> >      if (strchr(p, '@')) {
> > -        ret = qemu_rbd_next_tok(name, name_len, p,
> > -                                '@', "object name", &p, errp);
> > -        if (ret < 0) {
> > +        found_str = qemu_rbd_next_tok(name_len, p,
> > +                                     '@', "object name", &p, &local_err);
> 
> Extracts from first unescaped '/' to next unescaped '@' or end of string.
> 
> @p can't be null.
> 
> > +        if (local_err) {
> >              goto done;
> >          }
> > -        ret = qemu_rbd_next_tok(snap, snap_len, p,
> > -                                ':', "snap name", &p, errp);
> > -        qemu_rbd_unescape(snap);
> > +        qemu_rbd_unescape(found_str);
> > +        g_strlcpy(name, found_str, name_len);
> > +
> > +        found_str = qemu_rbd_next_tok(snap_len, p,
> > +                                      ':', "snap name", &p, &local_err);
> 
> From that '@' to next ':' or end of string.
> 
> > +        if (local_err) {
> > +            goto done;
> > +        }
> > +        qemu_rbd_unescape(found_str);
> > +        g_strlcpy(snap, found_str, snap_len);
> 
> This confused me, but I figured it out: you're moving the @name unescape
> from after the conditional into both its arms.  This is the first copy.
> 
> >      } else {
> > -        ret = qemu_rbd_next_tok(name, name_len, p,
> > -                                ':', "object name", &p, errp);
> > +        found_str = qemu_rbd_next_tok(name_len, p,
> > +                                     ':', "object name", &p, &local_err);
> 
> From first '/' to next ':' or end of string.
> 
> Indentation off by one.
> 

Those pesky single quotes are almost invisible...


> > +        if (local_err) {
> > +            goto done;
> > +        }
> > +        qemu_rbd_unescape(found_str);
> 
> This is the second copy of the moved unescape.
> 
> > +        g_strlcpy(name, found_str, name_len);
> >      }
> > -    qemu_rbd_unescape(name);
> 
> This is where the copies come from.
> 
> > -    if (ret < 0 || !p) {
> > +    if (!p) {
> 
> We didn't find ':'.
> 
> >          goto done;
> >      }
> >  
> > -    ret = qemu_rbd_next_tok(conf, conf_len, p,
> > -                            '\0', "configuration", &p, errp);
> > +    found_str = qemu_rbd_next_tok(conf_len, p,
> > +                                 '\0', "configuration", &p, &local_err);
> 
> From that ':' to end of string.
> 
> > +    if (local_err) {
> > +        goto done;
> > +    }
> > +    g_strlcpy(conf, found_str, conf_len);
> 
> No unescape?  Strange... but your patch doesn't change anything.
> 

This part is essentially chunked off to the end of the string, as you noted
above.  This chunk is then parsed (and unescaped) in qemu_rbd_set_conf().

> >  
> >  done:
> > +    if (local_err) {
> > +        ret = -EINVAL;
> > +        error_propagate(errp, local_err);
> > +    }
> >      g_free(buf);
> >      return ret;
> >  }
> 
> Okay, now someone tell me what it's supposed to parse, and I tell you
> whether it works.
> 
> Unescaping in place looks safe.
> 
> > @@ -262,17 +286,18 @@ static int qemu_rbd_set_conf(rados_t cluster, const char *conf,
> 
> Great, another parser for an unknown language.  I'll pass.
> 
> >                               Error **errp)
> >  {
> >      char *p, *buf;
> > -    char name[RBD_MAX_CONF_NAME_SIZE];
> > -    char value[RBD_MAX_CONF_VAL_SIZE];
> > +    char *name;
> > +    char *value;
> > +    Error *local_err = NULL;
> >      int ret = 0;
> >  
> >      buf = g_strdup(conf);
> >      p = buf;
> 
> Again, dead assignment to @p.
> 

I'm inclined to leave it for the same reason as the first instance.

> >  
> >      while (p) {
> > -        ret = qemu_rbd_next_tok(name, sizeof(name), p,
> > -                                '=', "conf option name", &p, errp);
> > -        if (ret < 0) {
> > +        name = qemu_rbd_next_tok(RBD_MAX_CONF_NAME_SIZE, p,
> > +                                 '=', "conf option name", &p, &local_err);
> > +        if (local_err) {
> >              break;
> >          }
> >          qemu_rbd_unescape(name);
> 
> Again, you change to unescape in place.
> 
> > @@ -283,9 +308,9 @@ static int qemu_rbd_set_conf(rados_t cluster, const char *conf,
> >              break;
> >          }
> >  
> > -        ret = qemu_rbd_next_tok(value, sizeof(value), p,
> > -                                ':', "conf option value", &p, errp);
> > -        if (ret < 0) {
> > +        value = qemu_rbd_next_tok(RBD_MAX_CONF_VAL_SIZE, p,
> > +                                  ':', "conf option value", &p, &local_err);
> > +        if (local_err) {
> >              break;
> >          }
> >          qemu_rbd_unescape(value);
> 
> Likewise.
> 
> > @@ -313,6 +338,10 @@ static int qemu_rbd_set_conf(rados_t cluster, const char *conf,
> >          }
> >      }
> >  
> > +    if (local_err) {
> > +        error_propagate(errp, local_err);
> > +        ret = -EINVAL;
> > +    }
> >      g_free(buf);
> >      return ret;
> >  }
> 
> Again, unescaping in place looks safe.
> 
> Preferably with the two dead assignments dropped:

How strongly do you feel about those?

> Reviewed-by: Markus Armbruster <armbru@redhat.com>

next prev parent reply	other threads:[~2017-02-27 18:07 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-27  7:30 [Qemu-devel] [PATCH 0/4] RBD: blockdev-add Jeff Cody
2017-02-27  7:30 ` [Qemu-devel] [PATCH 1/4] block/rbd: don't copy strings in qemu_rbd_next_tok() Jeff Cody
2017-02-27 16:39   ` Markus Armbruster
2017-02-27 18:07     ` Jeff Cody [this message]
2017-02-27  7:30 ` [Qemu-devel] [PATCH 2/4] block/rbd: code movement Jeff Cody
2017-02-27  9:28   ` Daniel P. Berrange
2017-02-27 13:14     ` Jeff Cody
2017-02-27  7:30 ` [Qemu-devel] [PATCH 3/4] block/rbd: parse all options via bdrv_parse_filename Jeff Cody
2017-02-27  7:30 ` [Qemu-devel] [PATCH 4/4] block/rbd: Add blockdev-add support Jeff Cody
2017-02-27  7:36   ` Jeff Cody
2017-02-27  9:31     ` Daniel P. Berrange
2017-02-27 13:18       ` Jeff Cody
2017-02-27 13:30         ` Daniel P. Berrange
2017-02-27 13:38           ` Jeff Cody
2017-02-27 13:45   ` Daniel P. Berrange
2017-02-27 15:27     ` Jeff Cody

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170227180710.GG25637@localhost.localdomain \
    --to=jcody@redhat.com \
    --cc=armbru@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.