qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Max Reitz <mreitz@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [PATCH v2 02/20] fuse: Allow exporting BDSs via FUSE
Date: Thu, 15 Oct 2020 16:46:14 +0200	[thread overview]
Message-ID: <2db9dbfe-137a-4cdc-08ab-aaa1aed26423@redhat.com> (raw)
In-Reply-To: <20201015085703.GB4610@merkur.fritz.box>


[-- Attachment #1.1: Type: text/plain, Size: 16004 bytes --]

On 15.10.20 10:57, Kevin Wolf wrote:
> Am 22.09.2020 um 12:49 hat Max Reitz geschrieben:
>> block-export-add type=fuse allows mounting block graph nodes via FUSE on
>> some existing regular file.  That file should then appears like a raw
>> disk image, and accesses to it result in accesses to the exported BDS.
>>
>> Right now, we only implement the necessary block export functions to set
>> it up and shut it down.  We do not implement any access functions, so
>> accessing the mount point only results in errors.  This will be
>> addressed by a followup patch.
>>
>> We keep a hash table of exported mount points, because we want to be
>> able to detect when users try to use a mount point twice.  This is
>> because we invoke stat() to check whether the given mount point is a
>> regular file, but if that file is served by ourselves (because it is
>> already used as a mount point), then this stat() would have to be served
>> by ourselves, too, which is impossible to do while we (as the caller)
>> are waiting for it to settle.  Therefore, keep track of mount point
>> paths to at least catch the most obvious instances of that problem.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>  qapi/block-export.json   |  24 +++-
>>  include/block/fuse.h     |  30 +++++
>>  block.c                  |   1 +
>>  block/export/export.c    |   4 +
>>  block/export/fuse.c      | 253 +++++++++++++++++++++++++++++++++++++++
>>  block/export/meson.build |   1 +
>>  6 files changed, 311 insertions(+), 2 deletions(-)
>>  create mode 100644 include/block/fuse.h
>>  create mode 100644 block/export/fuse.c
>>
>> diff --git a/qapi/block-export.json b/qapi/block-export.json
>> index 0c045dfe7b..cb5bd54cbf 100644
>> --- a/qapi/block-export.json
>> +++ b/qapi/block-export.json
>> @@ -174,6 +174,21 @@
>>  ##
>>  { 'command': 'nbd-server-stop' }
>>  
>> +##
>> +# @BlockExportOptionsFuse:
>> +#
>> +# Options for exporting a block graph node on some (file) mountpoint
>> +# as a raw image.
>> +#
>> +# @mountpoint: Path on which to export the block device via FUSE.
>> +#              This must point to an existing regular file.
>> +#
>> +# Since: 5.2
>> +##
>> +{ 'struct': 'BlockExportOptionsFuse',
>> +  'data': { 'mountpoint': 'str' },
>> +  'if': 'defined(CONFIG_FUSE)' }
>> +
>>  ##
>>  # @BlockExportType:
>>  #
>> @@ -181,10 +196,13 @@
>>  #
>>  # @nbd: NBD export
>>  #
>> +# @fuse: Fuse export (since: 5.2)
> 
> s/Fuse/FUSE/?

Tomato, TOMATO.

Sure. :)

>> +#
>>  # Since: 4.2
>>  ##
>>  { 'enum': 'BlockExportType',
>> -  'data': [ 'nbd' ] }
>> +  'data': [ 'nbd',
>> +            { 'name': 'fuse', 'if': 'defined(CONFIG_FUSE)' } ] }
>>  
>>  ##
>>  # @BlockExportOptions:
>> @@ -213,7 +231,9 @@
>>              '*writethrough': 'bool' },
>>    'discriminator': 'type',
>>    'data': {
>> -      'nbd': 'BlockExportOptionsNbd'
>> +      'nbd': 'BlockExportOptionsNbd',
>> +      'fuse': { 'type': 'BlockExportOptionsFuse',
>> +                'if': 'defined(CONFIG_FUSE)' }
>>     } }
>>  
>>  ##
>> diff --git a/include/block/fuse.h b/include/block/fuse.h
>> new file mode 100644
>> index 0000000000..ffa91fe364
>> --- /dev/null
>> +++ b/include/block/fuse.h
>> @@ -0,0 +1,30 @@
>> +/*
>> + * Present a block device as a raw image through FUSE
>> + *
>> + * Copyright (c) 2020 Max Reitz <mreitz@redhat.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; under version 2 or later of the License.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#ifndef BLOCK_FUSE_H
>> +#define BLOCK_FUSE_H
>> +
>> +#ifdef CONFIG_FUSE
>> +
>> +#include "block/export.h"
>> +
>> +extern const BlockExportDriver blk_exp_fuse;
>> +
>> +#endif /* CONFIG_FUSE */
>> +
>> +#endif
>> diff --git a/block.c b/block.c
>> index acde53f92a..7bf45f6ede 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -26,6 +26,7 @@
>>  #include "block/trace.h"
>>  #include "block/block_int.h"
>>  #include "block/blockjob.h"
>> +#include "block/fuse.h"
>>  #include "block/nbd.h"
>>  #include "block/qdict.h"
>>  #include "qemu/error-report.h"
>> diff --git a/block/export/export.c b/block/export/export.c
>> index 64d39a6934..ba866d6ba7 100644
>> --- a/block/export/export.c
>> +++ b/block/export/export.c
>> @@ -16,6 +16,7 @@
>>  #include "block/block.h"
>>  #include "sysemu/block-backend.h"
>>  #include "block/export.h"
>> +#include "block/fuse.h"
>>  #include "block/nbd.h"
>>  #include "qapi/error.h"
>>  #include "qapi/qapi-commands-block-export.h"
>> @@ -24,6 +25,9 @@
>>  
>>  static const BlockExportDriver *blk_exp_drivers[] = {
>>      &blk_exp_nbd,
>> +#ifdef CONFIG_FUSE
>> +    &blk_exp_fuse,
>> +#endif
>>  };
>>  
>>  /* Only accessed from the main thread */
>> diff --git a/block/export/fuse.c b/block/export/fuse.c
>> new file mode 100644
>> index 0000000000..75f11d2514
>> --- /dev/null
>> +++ b/block/export/fuse.c
>> @@ -0,0 +1,253 @@
>> +/*
>> + * Present a block device as a raw image through FUSE
>> + *
>> + * Copyright (c) 2020 Max Reitz <mreitz@redhat.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; under version 2 or later of the License.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#define FUSE_USE_VERSION 31
>> +
>> +#include "qemu/osdep.h"
>> +#include "block/aio.h"
>> +#include "block/block.h"
>> +#include "block/export.h"
>> +#include "block/fuse.h"
>> +#include "block/qapi.h"
>> +#include "qapi/error.h"
>> +#include "qapi/qapi-commands-block.h"
>> +#include "sysemu/block-backend.h"
>> +
>> +#include <fuse.h>
>> +#include <fuse_lowlevel.h>
>> +
>> +
>> +typedef struct FuseExport {
>> +    BlockExport common;
>> +
>> +    struct fuse_session *fuse_session;
>> +    struct fuse_buf fuse_buf;
>> +    bool mounted, fd_handler_set_up;
>> +
>> +    char *mountpoint;
>> +    bool writable;
>> +} FuseExport;
>> +
>> +static GHashTable *exports;
>> +static const struct fuse_lowlevel_ops fuse_ops;
>> +
>> +static void fuse_export_shutdown(BlockExport *exp);
>> +static void fuse_export_delete(BlockExport *exp);
>> +
>> +static void init_fuse(void);
>> +static int setup_fuse_export(FuseExport *exp, const char *mountpoint,
>> +                             Error **errp);
>> +static void read_from_fuse_export(void *opaque);
>> +
>> +static bool is_regular_file(const char *path, Error **errp);
>> +
>> +
>> +static int fuse_export_create(BlockExport *blk_exp,
>> +                              BlockExportOptions *blk_exp_args,
>> +                              Error **errp)
>> +{
>> +    FuseExport *exp = container_of(blk_exp, FuseExport, common);
>> +    BlockExportOptionsFuse *args = &blk_exp_args->u.fuse;
>> +    int ret;
>> +
>> +    assert(blk_exp_args->type == BLOCK_EXPORT_TYPE_FUSE);
>> +
>> +    init_fuse();
>> +
>> +    /*
>> +     * It is important to do this check before calling is_regular_file() --
>> +     * that function will do a stat(), which we would have to handle if we
>> +     * already exported something on @mountpoint.  But we cannot, because
>> +     * we are currently caught up here.
>> +     * (Note that ideally we would want to resolve relative paths here,
>> +     * but bdrv_make_absolute_filename() might do the wrong thing for
>> +     * paths that contain colons, and realpath() would resolve symlinks,
>> +     * which we do not want: The mount point is not going to be the
>> +     * symlink's destination, but the link itself.)
>> +     * So this will not catch all potential clashes, but hopefully at
>> +     * least the most common one of specifying exactly the same path
>> +     * string twice.
>> +     */
>> +    if (g_hash_table_contains(exports, args->mountpoint)) {
>> +        error_setg(errp, "There already is a FUSE export on '%s'",
>> +                   args->mountpoint);
>> +        ret = -EEXIST;
>> +        goto fail;
>> +    }
> 
> I guess the proper solution would be to make block export creation
> asynchronous so that we can actually serve that request in the
> background.
> 
> Something for another day...
> 
>> +
>> +    if (!is_regular_file(args->mountpoint, errp)) {
>> +        ret = -EINVAL;
>> +        goto fail;
>> +    }
>> +
>> +    exp->mountpoint = g_strdup(args->mountpoint);
>> +    exp->writable = blk_exp_args->writable;
>> +
>> +    ret = setup_fuse_export(exp, args->mountpoint, errp);
>> +    if (ret < 0) {
>> +        goto fail;
>> +    }
>> +
>> +    g_hash_table_insert(exports, g_strdup(args->mountpoint), NULL);
>> +    return 0;
>> +
>> +fail:
>> +    fuse_export_shutdown(blk_exp);
>> +    fuse_export_delete(blk_exp);
> 
> Why fuse_export_shutdown()? The only thing to possibly undo at
> this point is allocating exp->mountpoint, which is freed in
> fuse_export_delete().

I just realized it’s not only a question of style, but even technically
wrong to do so.  fuse_export_shutdown() drops the mount point path from
the @exports hash table, which shouldn’t be done if we got here from the
g_hash_table_contains() condition.

That could be fixed by introducing a new label that won’t execute
fuse_export_shutdown(), but we can also just let setup_fuse_export()
call that.

> Or can setup_fuse_export() fail without cleaning up? I hope not.
> 
>> +    return ret;
>> +}
>> +
>> +/**
>> + * Ensure that the global FUSE context is set up.
>> + */
>> +static void init_fuse(void)
>> +{
>> +    if (exports) {
>> +        return;
>> +    }
>> +
>> +    exports = g_hash_table_new_full(g_str_hash, g_str_equal, g_free, NULL);
>> +}
>> +
>> +/**
>> + * Create exp->fuse_session and mount it.
>> + */
>> +static int setup_fuse_export(FuseExport *exp, const char *mountpoint,
>> +                             Error **errp)
>> +{
>> +    const char *fuse_argv[2];
>> +    struct fuse_args fuse_args;
>> +    int ret;
>> +
>> +    fuse_argv[0] = ""; /* Dummy program name */
>> +    fuse_argv[1] = NULL;
>> +    fuse_args = (struct fuse_args)FUSE_ARGS_INIT(1, (char **)fuse_argv);
>> +
>> +    exp->fuse_session = fuse_session_new(&fuse_args, &fuse_ops,
>> +                                         sizeof(fuse_ops), exp);
>> +    if (!exp->fuse_session) {
>> +        error_setg(errp, "Failed to set up FUSE session");
>> +        return -EIO;
>> +    }
>> +
>> +    ret = fuse_session_mount(exp->fuse_session, mountpoint);
>> +    if (ret < 0) {
>> +        error_setg(errp, "Failed to mount FUSE session to export");
>> +        return -EIO;
> 
> Oh, it can. :-(
> 
> I think it would be better to call fuse_export_shutdown() here so that
> the function doesn't change the state when it returns failure.

I suppose then this function (setup_fuse_export()) should also add the
mount point name from the @exports hash table, so it makes sense to drop
it via fuse_export_shutdown().

But then yeah, why not.  Makes sense.

>> +    }
>> +    exp->mounted = true;
>> +
>> +    aio_set_fd_handler(exp->common.ctx,
>> +                       fuse_session_fd(exp->fuse_session), true,
>> +                       read_from_fuse_export, NULL, NULL, exp);
>> +    exp->fd_handler_set_up = true;
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>> + * Callback to be invoked when the FUSE session FD can be read from.
>> + * (This is basically the FUSE event loop.)
>> + */
>> +static void read_from_fuse_export(void *opaque)
>> +{
>> +    FuseExport *exp = opaque;
>> +    int ret;
>> +
>> +    blk_exp_ref(&exp->common);
>> +
>> +    ret = fuse_session_receive_buf(exp->fuse_session, &exp->fuse_buf);
> 
> The fuse_session_loop() implementation seems to imply that we should
> retry on EINTR.

OK.  I see you’re digging into libfuse already. :)

>> +    if (ret < 0) {
>> +        goto out;
>> +    }
>> +
>> +    fuse_session_process_buf(exp->fuse_session, &exp->fuse_buf);
>> +
>> +out:
>> +    blk_exp_unref(&exp->common);
>> +}
>> +
>> +static void fuse_export_shutdown(BlockExport *blk_exp)
>> +{
>> +    FuseExport *exp = container_of(blk_exp, FuseExport, common);
>> +
>> +    if (exp->fuse_session) {
>> +        fuse_session_exit(exp->fuse_session);
>> +
>> +        if (exp->mounted) {
>> +            fuse_session_unmount(exp->fuse_session);
>> +            exp->mounted = false;
>> +        }
>> +
>> +        if (exp->fd_handler_set_up) {
>> +            aio_set_fd_handler(exp->common.ctx,
>> +                               fuse_session_fd(exp->fuse_session), true,
>> +                               NULL, NULL, NULL, NULL);
>> +            exp->fd_handler_set_up = false;
>> +        }
>> +
>> +        fuse_session_destroy(exp->fuse_session);
>> +        exp->fuse_session = NULL;
> 
> What happens if a request is still in flight?
> 
> Oh, can't happen because the driver is fully synchronous after this
> series. Fair enough, making it asynchronous can come on top of it.

(I had multiple approaches of handling parallel requests, but none made
a substantial performance difference, which is why I left the driver in
the most simple form for this first proposal.)

> However, I think we should move fuse_session_unmount() and
> fuse_session_destroy() to .delete instead of .shutdown to avoid setting
> up a trap for whoever introduces async requests to the FUSE export.

Yes, makes sense.

>> +    }
>> +
>> +    if (exp->mountpoint) {
>> +        g_hash_table_remove(exports, exp->mountpoint);
>> +    }
>> +}
>> +
>> +static void fuse_export_delete(BlockExport *blk_exp)
>> +{
>> +    FuseExport *exp = container_of(blk_exp, FuseExport, common);
>> +
>> +    free(exp->fuse_buf.mem);
>> +    g_free(exp->mountpoint);
>> +}
>> +
>> +/**
>> + * Check whether @path points to a regular file.  If not, put an
>> + * appropriate message into *errp.
>> + */
>> +static bool is_regular_file(const char *path, Error **errp)
>> +{
>> +    struct stat statbuf;
>> +    int ret;
>> +
>> +    ret = stat(path, &statbuf);
>> +    if (ret < 0) {
>> +        error_setg_errno(errp, errno, "Failed to stat '%s'", path);
>> +        return false;
>> +    }
>> +
>> +    if (!S_ISREG(statbuf.st_mode)) {
>> +        error_setg(errp, "'%s' is not a regular file", path);
>> +        return false;
>> +    }
>> +
>> +    return true;
>> +}
> 
> This is obviously racy (TOCTOU), but if I understand correctly, we use
> it only to provide a nicer error message than if fuse_session_mount()
> fails, so this is fine.

It’s a check provided for convenience, yes.  More important is perhaps
that users might provide a directory instead of a file as the mount
point, which wouldn’t work.  (You’ll just get errors back.  Which isn’t
a catastrophe, it’s just nice to tell users early and explicitly that
that won’t work.)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2020-10-15 14:48 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-22 10:49 [PATCH v2 00/20] block/export: Allow exporting BDSs via FUSE Max Reitz
2020-09-22 10:49 ` [PATCH v2 01/20] configure: Detect libfuse Max Reitz
2020-09-22 11:14   ` Thomas Huth
2020-09-22 11:21     ` Paolo Bonzini
2020-09-22 11:46     ` Max Reitz
2020-09-22 15:37     ` Max Reitz
2020-09-22 15:45       ` Paolo Bonzini
2020-09-22 10:49 ` [PATCH v2 02/20] fuse: Allow exporting BDSs via FUSE Max Reitz
2020-10-15  8:57   ` Kevin Wolf
2020-10-15 14:46     ` Max Reitz [this message]
2020-10-15 15:41       ` Kevin Wolf
2020-10-15 15:59         ` Max Reitz
2020-10-15 17:01           ` Kevin Wolf
2020-09-22 10:49 ` [PATCH v2 03/20] fuse: Implement standard FUSE operations Max Reitz
2020-10-15  9:46   ` Kevin Wolf
2020-10-15 15:18     ` Max Reitz
2020-10-15 15:58       ` Kevin Wolf
2020-10-15 16:04         ` Max Reitz
2020-09-22 10:49 ` [PATCH v2 04/20] fuse: Allow growable exports Max Reitz
2020-10-15 10:41   ` Kevin Wolf
2020-10-15 15:20     ` Max Reitz
2020-09-22 10:49 ` [PATCH v2 05/20] fuse: (Partially) implement fallocate() Max Reitz
2020-09-22 10:49 ` [PATCH v2 06/20] fuse: Implement hole detection through lseek Max Reitz
2020-09-22 10:49 ` [PATCH v2 07/20] iotests: Do not needlessly filter _make_test_img Max Reitz
2020-09-22 10:49 ` [PATCH v2 08/20] iotests: Do not pipe _make_test_img Max Reitz
2020-09-22 10:49 ` [PATCH v2 09/20] iotests: Use convert -n in some cases Max Reitz
2020-09-22 10:49 ` [PATCH v2 10/20] iotests/046: Avoid renaming images Max Reitz
2020-09-22 10:49 ` [PATCH v2 11/20] iotests: Derive image names from $TEST_IMG Max Reitz
2020-09-22 10:49 ` [PATCH v2 12/20] iotests/091: Use _cleanup_qemu instad of "wait" Max Reitz
2020-09-22 10:49 ` [PATCH v2 13/20] iotests: Restrict some Python tests to file Max Reitz
2020-09-22 10:49 ` [PATCH v2 14/20] iotests: Let _make_test_img guess $TEST_IMG_FILE Max Reitz
2020-09-22 10:49 ` [PATCH v2 15/20] iotests/287: Clean up subshell test image Max Reitz
2020-09-22 10:49 ` [PATCH v2 16/20] storage-daemon: Call bdrv_close_all() on exit Max Reitz
2020-09-22 10:49 ` [PATCH v2 17/20] iotests: Give access to the qemu-storage-daemon Max Reitz
2020-10-15 11:27   ` Kevin Wolf
2020-10-15 15:22     ` Max Reitz
2020-09-22 10:49 ` [PATCH v2 18/20] iotests: Allow testing FUSE exports Max Reitz
2020-10-15 11:43   ` Kevin Wolf
2020-10-15 15:27     ` Max Reitz
2020-09-22 10:49 ` [PATCH v2 19/20] iotests: Enable fuse for many tests Max Reitz
2020-09-22 10:49 ` [PATCH v2 20/20] iotests/308: Add test for FUSE exports Max Reitz
2020-09-22 15:58 ` [PATCH v2 00/20] block/export: Allow exporting BDSs via FUSE Daniel P. Berrangé
2020-09-23  7:21   ` Max Reitz
2020-09-23  9:08   ` Stefan Hajnoczi
2020-10-15 12:01 ` Kevin Wolf
2020-10-15 16:47   ` Max Reitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2db9dbfe-137a-4cdc-08ab-aaa1aed26423@redhat.com \
    --to=mreitz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).