From: Vivek Goyal <vgoyal@redhat.com>
To: "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>
Cc: lvivier@redhat.com, quintela@redhat.com, hgcoin@gmail.com,
qemu-devel@nongnu.org, peterx@redhat.com, zhengchuan@huawei.com,
dovmurik@linux.vnet.ibm.com, zhangjiachen.jaycee@bytedance.com,
stefanha@redhat.com, jinyan12@huawei.com,
ann.zhuangyanying@huawei.com
Subject: Re: [PULL 26/26] virtiofsd: Add -o allow_direct_io|no_allow_direct_io options
Date: Tue, 29 Sep 2020 17:53:19 -0400
Message-ID: <20200929215319.GI220516@redhat.com>
In-Reply-To: <20200925120655.295142-27-dgilbert@redhat.com>
On Fri, Sep 25, 2020 at 01:06:55PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
>
> Due to the commit 65da4539803373ec4eec97ffc49ee90083e56efd, the O_DIRECT
> open flag of guest applications will be discarded by virtiofsd. While
> this behavior makes it consistent with the virtio-9p scheme when guest
> applications use direct I/O, we no longer have any chance to bypass the
> host page cache.
>
> Therefore, we add a flag 'allow_direct_io' to lo_data. If '-o
> no_allow_direct_io' option is added, or none of '-o allow_direct_io' or
> '-o no_allow_direct_io' is added, the 'allow_direct_io' will be set to
> 0, and virtiofsd discards O_DIRECT as before. If '-o allow_direct_io'
> is added to the starting command-line, 'allow_direct_io' will be set to
> 1, so that the O_DIRECT flags will be retained and host page cache can
> be bypassed.
Hi Jiachen,
I'm curious: in what cases do you want to bypass the host page cache?
Thanks
Vivek
>
> Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> Message-Id: <20200824105957.61265-1-zhangjiachen.jaycee@bytedance.com>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> tools/virtiofsd/helper.c | 4 ++++
> tools/virtiofsd/passthrough_ll.c | 20 ++++++++++++++------
> 2 files changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/tools/virtiofsd/helper.c b/tools/virtiofsd/helper.c
> index 7bc5d7dc5a..85770d63f1 100644
> --- a/tools/virtiofsd/helper.c
> +++ b/tools/virtiofsd/helper.c
> @@ -178,6 +178,10 @@ void fuse_cmdline_help(void)
> " (0 leaves rlimit unchanged)\n"
> " default: min(1000000, fs.file-max - 16384)\n"
> " if the current rlimit is lower\n"
> + " -o allow_direct_io|no_allow_direct_io\n"
> + " retain/discard O_DIRECT flags passed down\n"
> + " to virtiofsd from guest applications.\n"
> + " default: no_allow_direct_io\n"
> );
> }
>
> diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
> index 784330e0e4..0b229ebd57 100644
> --- a/tools/virtiofsd/passthrough_ll.c
> +++ b/tools/virtiofsd/passthrough_ll.c
> @@ -151,6 +151,7 @@ struct lo_data {
> int timeout_set;
> int readdirplus_set;
> int readdirplus_clear;
> + int allow_direct_io;
> struct lo_inode root;
> GHashTable *inodes; /* protected by lo->mutex */
> struct lo_map ino_map; /* protected by lo->mutex */
> @@ -179,6 +180,8 @@ static const struct fuse_opt lo_opts[] = {
> { "cache=always", offsetof(struct lo_data, cache), CACHE_ALWAYS },
> { "readdirplus", offsetof(struct lo_data, readdirplus_set), 1 },
> { "no_readdirplus", offsetof(struct lo_data, readdirplus_clear), 1 },
> + { "allow_direct_io", offsetof(struct lo_data, allow_direct_io), 1 },
> + { "no_allow_direct_io", offsetof(struct lo_data, allow_direct_io), 0 },
> FUSE_OPT_END
> };
> static bool use_syslog = false;
> @@ -1516,7 +1519,8 @@ static void lo_releasedir(fuse_req_t req, fuse_ino_t ino,
> fuse_reply_err(req, 0);
> }
>
> -static void update_open_flags(int writeback, struct fuse_file_info *fi)
> +static void update_open_flags(int writeback, int allow_direct_io,
> + struct fuse_file_info *fi)
> {
> /*
> * With writeback cache, kernel may send read requests even
> @@ -1541,10 +1545,13 @@ static void update_open_flags(int writeback, struct fuse_file_info *fi)
>
> /*
> * O_DIRECT in guest should not necessarily mean bypassing page
> - * cache on host as well. If somebody needs that behavior, it
> - * probably should be a configuration knob in daemon.
> + * cache on host as well. Therefore, we discard it by default
> + * ('-o no_allow_direct_io'). If somebody needs that behavior,
> + * the '-o allow_direct_io' option should be set.
> */
> - fi->flags &= ~O_DIRECT;
> + if (!allow_direct_io) {
> + fi->flags &= ~O_DIRECT;
> + }
> }
>
> static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name,
> @@ -1576,7 +1583,7 @@ static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name,
> goto out;
> }
>
> - update_open_flags(lo->writeback, fi);
> + update_open_flags(lo->writeback, lo->allow_direct_io, fi);
>
> fd = openat(parent_inode->fd, name, (fi->flags | O_CREAT) & ~O_NOFOLLOW,
> mode);
> @@ -1786,7 +1793,7 @@ static void lo_open(fuse_req_t req, fuse_ino_t ino, struct fuse_file_info *fi)
> fuse_log(FUSE_LOG_DEBUG, "lo_open(ino=%" PRIu64 ", flags=%d)\n", ino,
> fi->flags);
>
> - update_open_flags(lo->writeback, fi);
> + update_open_flags(lo->writeback, lo->allow_direct_io, fi);
>
> sprintf(buf, "%i", lo_fd(req, ino));
> fd = openat(lo->proc_self_fd, buf, fi->flags & ~O_NOFOLLOW);
> @@ -2823,6 +2830,7 @@ int main(int argc, char *argv[])
> .debug = 0,
> .writeback = 0,
> .posix_lock = 0,
> + .allow_direct_io = 0,
> .proc_self_fd = -1,
> };
> struct lo_map_elem *root_elem;
> --
> 2.26.2
>
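
For reference, the O_DIRECT handling in the quoted patch can be sketched as
a standalone function. This is only a minimal sketch, not the virtiofsd
code: filter_open_flags() and filter_self_check() are hypothetical names,
and the sketch models just the O_DIRECT part of update_open_flags(),
leaving out the writeback handling. It assumes Linux with glibc, where
_GNU_SOURCE has to be defined before <fcntl.h> exposes O_DIRECT.

```c
#define _GNU_SOURCE /* assumption: Linux/glibc, needed for O_DIRECT */
#include <assert.h>
#include <fcntl.h>

/*
 * Hypothetical helper modelling the O_DIRECT branch of update_open_flags().
 * guest_flags:     open flags as received from the guest.
 * allow_direct_io: 1 for '-o allow_direct_io', 0 for the default
 *                  '-o no_allow_direct_io'.
 * Returns the flags that would be passed on to openat() on the host.
 */
static int filter_open_flags(int guest_flags, int allow_direct_io)
{
    if (!allow_direct_io) {
        /* Default: drop O_DIRECT, so host I/O uses the host page cache. */
        guest_flags &= ~O_DIRECT;
    }
    return guest_flags;
}

/* Hypothetical self-check exercising both settings of the option. */
static void filter_self_check(void)
{
    int guest_flags = O_RDWR | O_DIRECT;

    /* no_allow_direct_io: O_DIRECT is stripped, other flags are kept. */
    assert(filter_open_flags(guest_flags, 0) == O_RDWR);

    /* allow_direct_io: the flags pass through unchanged. */
    assert(filter_open_flags(guest_flags, 1) == guest_flags);
}
```

With the option left at its default, a guest open() with O_DIRECT would
reach the host without that flag. With '-o allow_direct_io' the flag is
kept, so the host open is also direct.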