From: Vivek Goyal <vgoyal@redhat.com>
To: "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>
Cc: lvivier@redhat.com, quintela@redhat.com, hgcoin@gmail.com,
qemu-devel@nongnu.org, peterx@redhat.com, zhengchuan@huawei.com,
dovmurik@linux.vnet.ibm.com, zhangjiachen.jaycee@bytedance.com,
stefanha@redhat.com, jinyan12@huawei.com,
ann.zhuangyanying@huawei.com
Subject: Re: [PULL 26/26] virtiofsd: Add -o allow_direct_io|no_allow_direct_io options
Date: Tue, 29 Sep 2020 17:53:19 -0400
Message-ID: <20200929215319.GI220516@redhat.com>
In-Reply-To: <20200925120655.295142-27-dgilbert@redhat.com>
On Fri, Sep 25, 2020 at 01:06:55PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
>
> Due to the commit 65da4539803373ec4eec97ffc49ee90083e56efd, the O_DIRECT
> open flag of guest applications will be discarded by virtiofsd. While
> this behavior makes it consistent with the virtio-9p scheme when guest
> applications use direct I/O, we no longer have any chance to bypass the
> host page cache.
>
> Therefore, we add a flag 'allow_direct_io' to lo_data. If '-o
> no_allow_direct_io' option is added, or none of '-o allow_direct_io' or
> '-o no_allow_direct_io' is added, the 'allow_direct_io' will be set to
> 0, and virtiofsd discards O_DIRECT as before. If '-o allow_direct_io'
> is added to the starting command-line, 'allow_direct_io' will be set to
> 1, so that the O_DIRECT flags will be retained and host page cache can
> be bypassed.
Hi Jiachen,
I'm curious: in what cases do you want to bypass the host page cache?
Thanks
Vivek
>
> Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> Message-Id: <20200824105957.61265-1-zhangjiachen.jaycee@bytedance.com>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> tools/virtiofsd/helper.c | 4 ++++
> tools/virtiofsd/passthrough_ll.c | 20 ++++++++++++++------
> 2 files changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/tools/virtiofsd/helper.c b/tools/virtiofsd/helper.c
> index 7bc5d7dc5a..85770d63f1 100644
> --- a/tools/virtiofsd/helper.c
> +++ b/tools/virtiofsd/helper.c
> @@ -178,6 +178,10 @@ void fuse_cmdline_help(void)
> " (0 leaves rlimit unchanged)\n"
> " default: min(1000000, fs.file-max - 16384)\n"
> " if the current rlimit is lower\n"
> + " -o allow_direct_io|no_allow_direct_io\n"
> + " retain/discard O_DIRECT flags passed down\n"
> + " to virtiofsd from guest applications.\n"
> + " default: no_allow_direct_io\n"
> );
> }
>
> diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
> index 784330e0e4..0b229ebd57 100644
> --- a/tools/virtiofsd/passthrough_ll.c
> +++ b/tools/virtiofsd/passthrough_ll.c
> @@ -151,6 +151,7 @@ struct lo_data {
> int timeout_set;
> int readdirplus_set;
> int readdirplus_clear;
> + int allow_direct_io;
> struct lo_inode root;
> GHashTable *inodes; /* protected by lo->mutex */
> struct lo_map ino_map; /* protected by lo->mutex */
> @@ -179,6 +180,8 @@ static const struct fuse_opt lo_opts[] = {
> { "cache=always", offsetof(struct lo_data, cache), CACHE_ALWAYS },
> { "readdirplus", offsetof(struct lo_data, readdirplus_set), 1 },
> { "no_readdirplus", offsetof(struct lo_data, readdirplus_clear), 1 },
> + { "allow_direct_io", offsetof(struct lo_data, allow_direct_io), 1 },
> + { "no_allow_direct_io", offsetof(struct lo_data, allow_direct_io), 0 },
> FUSE_OPT_END
> };
> static bool use_syslog = false;
> @@ -1516,7 +1519,8 @@ static void lo_releasedir(fuse_req_t req, fuse_ino_t ino,
> fuse_reply_err(req, 0);
> }
>
> -static void update_open_flags(int writeback, struct fuse_file_info *fi)
> +static void update_open_flags(int writeback, int allow_direct_io,
> + struct fuse_file_info *fi)
> {
> /*
> * With writeback cache, kernel may send read requests even
> @@ -1541,10 +1545,13 @@ static void update_open_flags(int writeback, struct fuse_file_info *fi)
>
> /*
> * O_DIRECT in guest should not necessarily mean bypassing page
> - * cache on host as well. If somebody needs that behavior, it
> - * probably should be a configuration knob in daemon.
> + * cache on host as well. Therefore, we discard it by default
> + * ('-o no_allow_direct_io'). If somebody needs that behavior,
> + * the '-o allow_direct_io' option should be set.
> */
> - fi->flags &= ~O_DIRECT;
> + if (!allow_direct_io) {
> + fi->flags &= ~O_DIRECT;
> + }
> }
>
> static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name,
> @@ -1576,7 +1583,7 @@ static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name,
> goto out;
> }
>
> - update_open_flags(lo->writeback, fi);
> + update_open_flags(lo->writeback, lo->allow_direct_io, fi);
>
> fd = openat(parent_inode->fd, name, (fi->flags | O_CREAT) & ~O_NOFOLLOW,
> mode);
> @@ -1786,7 +1793,7 @@ static void lo_open(fuse_req_t req, fuse_ino_t ino, struct fuse_file_info *fi)
> fuse_log(FUSE_LOG_DEBUG, "lo_open(ino=%" PRIu64 ", flags=%d)\n", ino,
> fi->flags);
>
> - update_open_flags(lo->writeback, fi);
> + update_open_flags(lo->writeback, lo->allow_direct_io, fi);
>
> sprintf(buf, "%i", lo_fd(req, ino));
> fd = openat(lo->proc_self_fd, buf, fi->flags & ~O_NOFOLLOW);
> @@ -2823,6 +2830,7 @@ int main(int argc, char *argv[])
> .debug = 0,
> .writeback = 0,
> .posix_lock = 0,
> + .allow_direct_io = 0,
> .proc_self_fd = -1,
> };
> struct lo_map_elem *root_elem;
> --
> 2.26.2
>
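
For readers following the thread, the effect of the new option on the open flags can be sketched in isolation. This is a minimal illustration and not the virtiofsd code itself: filter_open_flags() and check_filter_open_flags() are hypothetical names, the helper takes and returns a plain int instead of modifying a struct fuse_file_info, and the writeback-related handling in update_open_flags() is left out.

```c
#define _GNU_SOURCE /* glibc exposes O_DIRECT only when this is defined */
#include <assert.h>
#include <fcntl.h>

/*
 * Hypothetical stand-alone helper (not part of virtiofsd): returns the
 * open flags that would be passed on to the host after filtering.
 */
int filter_open_flags(int allow_direct_io, int flags)
{
    if (!allow_direct_io) {
        /* Default behaviour: drop O_DIRECT so the host page cache is used. */
        flags &= ~O_DIRECT;
    }
    return flags;
}

/* Small usage check covering both settings of the option. */
void check_filter_open_flags(void)
{
    int guest_flags = O_RDWR | O_DIRECT;

    /* '-o no_allow_direct_io', or neither option given: O_DIRECT is stripped. */
    assert(filter_open_flags(0, guest_flags) == O_RDWR);
    /* '-o allow_direct_io': the flags reach the host unchanged. */
    assert(filter_open_flags(1, guest_flags) == guest_flags);
}
```

With the flag left at 0, a guest open with O_RDWR | O_DIRECT reaches the host as plain O_RDWR; with the flag set to 1, the same request keeps O_DIRECT.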