qemu-devel.nongnu.org archive mirror
From: Greg Kurz <groug@kaod.org>
To: antonios.motakis@huawei.com
Cc: qemu-devel@nongnu.org, veaceslav.falico@huawei.com,
	Eduard.Shishkin@huawei.com, andy.wangguoli@huawei.com,
	Jani.Kokkonen@huawei.com, cota@braap.org, berrange@redhat.com,
	Eric Blake <eblake@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 3/4] 9pfs: stat_to_qid: use device as input to qid.path
Date: Fri, 9 Feb 2018 16:13:26 +0100
Message-ID: <20180209161326.653f3468@bahia.lan>
In-Reply-To: <20180208180019.13683-4-antonios.motakis@huawei.com>

On Thu, 8 Feb 2018 19:00:18 +0100
<antonios.motakis@huawei.com> wrote:

> From: Antonios Motakis <antonios.motakis@huawei.com>
> 
> To support multiple devices on the 9p share, and avoid
> qid path collisions we take the device id as input
> to generate a unique QID path. The lowest 48 bits of
> the path will be set equal to the file inode, and the
> top bits will be uniquely assigned based on the top
> 16 bits of the inode and the device id.
> 
> Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com>
> ---
>  hw/9pfs/9p.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++-------
>  hw/9pfs/9p.h | 13 ++++++++-
>  2 files changed, 90 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> index 4da858f..f434f05 100644
> --- a/hw/9pfs/9p.c
> +++ b/hw/9pfs/9p.c
> @@ -25,6 +25,8 @@
>  #include "trace.h"
>  #include "migration/blocker.h"
>  #include "sysemu/qtest.h"
> +#include "exec/tb-hash-xx.h"
> +#include "qemu/qht.h"
>  
>  int open_fd_hw;
>  int total_open_fd;
> @@ -572,20 +574,82 @@ static void coroutine_fn virtfs_reset(V9fsPDU *pdu)
>                                  P9_STAT_MODE_NAMED_PIPE |   \
>                                  P9_STAT_MODE_SOCKET)
>  
> -/* This is the algorithm from ufs in spfs */
> +
> +/* creative abuse of tb_hash_func7, which is based on xxhash */
> +static uint32_t qpp_hash(QppEntry e)
> +{
> +    return tb_hash_func7(e.ino_prefix, e.dev, 0, 0, 0);

Hmm... Looking at git log include/exec/tb-hash-xx.h, I see that this hash
function's signature has evolved to follow TCG's needs: it started with 3
arguments, then 4, and it takes 5 today.

So I don't think we should add another, unrelated user. It is probably best
to have our own version. It could also be simpler, since you always pass 0
for the third and later arguments.
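
To illustrate what I mean by a simpler, private version, here is a rough
sketch, not a drop-in replacement. It is not xxhash: it uses the 32-bit
finalizer from MurmurHash3 instead. The names fmix32() and
qpp_hash_sketch() are made up for this example, and the device id is
passed as a plain uint64_t as an assumption.

```c
#include <stdint.h>

/* 32-bit finalizer from MurmurHash3; each step is invertible. */
static uint32_t fmix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Hypothetical two-input hash: only the device id and inode prefix. */
static uint32_t qpp_hash_sketch(uint64_t dev, uint16_t ino_prefix)
{
    /* Fold the 64-bit device id down to 32 bits, then mix in the prefix. */
    uint32_t h = fmix32((uint32_t)(dev ^ (dev >> 32)));
    return fmix32(h ^ ino_prefix);
}
```

Something of this shape takes exactly the inputs 9pfs has and does not
depend on how TCG's hash signature changes in the future.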

> +}
> +
> +static bool qpp_lookup_func(const void *obj, const void *userp)
> +{
> +    const QppEntry *e1 = obj, *e2 = userp;
> +    return (e1->dev == e2->dev) && (e1->ino_prefix == e2->ino_prefix);

I guess this expression is simple enough that you can drop the
parentheses: they aren't needed, since '==' has higher precedence
than '&&'.

See: http://en.cppreference.com/w/c/language/operator_precedence
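
A small standalone sketch of the same comparison without the
parentheses. The Key type and key_equal() are stand-ins invented for
this example, not the real QppEntry or qpp_lookup_func().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields being compared. */
typedef struct {
    uint64_t dev;
    uint16_t ino_prefix;
} Key;

static bool key_equal(const Key *a, const Key *b)
{
    /* '==' binds tighter than '&&', so this parses as (x == y) && (u == v). */
    return a->dev == b->dev && a->ino_prefix == b->ino_prefix;
}
```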

> +}
> +
> +static void qpp_table_remove(struct qht *ht, void *p, uint32_t h, void *up)
> +{
> +    g_free(p);
> +}
> +
> +static void qpp_table_destroy(struct qht *ht)
> +{
> +    qht_iter(ht, qpp_table_remove, NULL);
> +    qht_destroy(ht);
> +}
> +
> +/* stat_to_qid needs to map inode number (64 bits) and device id (32 bits)
> + * to a unique QID path (64 bits). To avoid having to map and keep track
> + * of up to 2^64 objects, we map only the 16 highest bits of the inode plus
> + * the device id to the 16 highest bits of the QID path. The 48 lowest bits
> + * of the QID path equal to the lowest bits of the inode number.
> + *
> + * This takes advantage of the fact that inode number are usually not
> + * random but allocated sequentially, so we have fewer items to keep
> + * track of.
> + */
> +static int qid_path_prefixmap(V9fsPDU *pdu, const struct stat *stbuf,
> +                                uint64_t *path)
> +{
> +    QppEntry lookup = {
> +        .dev = stbuf->st_dev,
> +        .ino_prefix = (uint16_t) (stbuf->st_ino >> 48)
> +    }, *val;
> +    uint32_t hash = qpp_hash(lookup);
> +
> +    val = qht_lookup(&pdu->s->qpp_table, qpp_lookup_func, &lookup, hash);
> +
> +    if (!val) {
> +        if (pdu->s->qp_prefix_next == 0) {
> +            /* we ran out of prefixes */
> +            return -ENFILE;

Not sure this errno would make sense for guest syscalls that don't open
file descriptors... Maybe ENOENT?

Cc'ing Eric for insights.

> +        }
> +
> +        val = g_malloc0(sizeof(QppEntry));
> +        if (!val) {
> +            return -ENOMEM;
> +        }
> +        *val = lookup;
> +
> +        /* new unique inode prefix and device combo */
> +        val->qp_prefix = pdu->s->qp_prefix_next++;
> +        qht_insert(&pdu->s->qpp_table, val, hash);
> +    }
> +
> +    *path = ((uint64_t)val->qp_prefix << 48) | (stbuf->st_ino & QPATH_INO_MASK);
> +    return 0;
> +}
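
For readers following along, the composition done on the last line of
qid_path_prefixmap() can be looked at in isolation. This is only a
sketch: compose_qid_path() and INO_MASK_SKETCH are names made up here,
and the mask is spelled with UINT64_C as an assumption so that the
example does not depend on the width of unsigned long.

```c
#include <stdint.h>

/* Low 48 bits of the inode number are kept as-is (hypothetical name). */
#define INO_MASK_SKETCH ((UINT64_C(1) << 48) - 1)

/* Put the 16-bit assigned prefix on top of the low 48 inode bits. */
static uint64_t compose_qid_path(uint16_t qp_prefix, uint64_t st_ino)
{
    return ((uint64_t)qp_prefix << 48) | (st_ino & INO_MASK_SKETCH);
}
```

With this layout, two files that share inode number 0x2A but live on
different devices get distinct paths as long as their (device, inode
prefix) pairs were assigned different prefixes, e.g. prefix 1 gives
0x000100000000002A and prefix 2 gives 0x000200000000002A.
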
> +
>  static int stat_to_qid(V9fsPDU *pdu, const struct stat *stbuf, V9fsQID *qidp)
>  {
> -    size_t size;
> +    int err;
>  
> -    if (pdu->s->dev_id == 0) {
> -        pdu->s->dev_id = stbuf->st_dev;
> -    } else if (pdu->s->dev_id != stbuf->st_dev) {
> -        return -ENOSYS;
> +    /* map inode+device to qid path (fast path) */
> +    err = qid_path_prefixmap(pdu, stbuf, &qidp->path);
> +    if (err) {
> +        return err;
>      }
>  
> -    memset(&qidp->path, 0, sizeof(qidp->path));
> -    size = MIN(sizeof(stbuf->st_ino), sizeof(qidp->path));
> -    memcpy(&qidp->path, &stbuf->st_ino, size);
>      qidp->version = stbuf->st_mtime ^ (stbuf->st_size << 8);
>      qidp->type = 0;
>      if (S_ISDIR(stbuf->st_mode)) {
> @@ -3626,7 +3690,9 @@ int v9fs_device_realize_common(V9fsState *s, const V9fsTransport *t,
>          goto out;
>      }
>  
> -    s->dev_id = 0;
> +    /* QID path hash table. 1 entry ought to be enough for anybody ;) */
> +    qht_init(&s->qpp_table, 1, QHT_MODE_AUTO_RESIZE);
> +    s->qp_prefix_next = 1; /* reserve 0 to detect overflow */
>  
>      s->ctx.fst = &fse->fst;
>      fsdev_throttle_init(s->ctx.fst);
> @@ -3641,6 +3707,7 @@ out:
>          }
>          g_free(s->tag);
>          g_free(s->ctx.fs_root);
> +        qpp_table_destroy(&s->qpp_table);
>          v9fs_path_free(&path);
>      }
>      return rc;
> @@ -3653,6 +3720,7 @@ void v9fs_device_unrealize_common(V9fsState *s, Error **errp)
>      }
>      fsdev_throttle_cleanup(s->ctx.fst);
>      g_free(s->tag);
> +    qpp_table_destroy(&s->qpp_table);
>      g_free(s->ctx.fs_root);
>  }
>  
> diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h
> index afb4ebd..80428f7 100644
> --- a/hw/9pfs/9p.h
> +++ b/hw/9pfs/9p.h
> @@ -8,6 +8,7 @@
>  #include "fsdev/9p-iov-marshal.h"
>  #include "qemu/thread.h"
>  #include "qemu/coroutine.h"
> +#include "qemu/qht.h"
>  
>  enum {
>      P9_TLERROR = 6,
> @@ -231,6 +232,15 @@ struct V9fsFidState
>      V9fsFidState *rclm_lst;
>  };
>  
> +#define QPATH_INO_MASK        (((unsigned long)1 << 48) - 1)
> +
> +/* QID path prefix entry, see stat_to_qid */
> +typedef struct {
> +    dev_t dev;
> +    uint16_t ino_prefix;
> +    uint16_t qp_prefix;
> +} QppEntry;
> +
>  struct V9fsState
>  {
>      QLIST_HEAD(, V9fsPDU) free_list;
> @@ -252,7 +262,8 @@ struct V9fsState
>      Error *migration_blocker;
>      V9fsConf fsconf;
>      V9fsQID root_qid;
> -    dev_t dev_id;
> +    struct qht qpp_table;
> +    uint16_t qp_prefix_next;
>  };
>  
>  /* 9p2000.L open flags */
