From: Josef Bacik <josef@redhat.com>
To: Anand Avati <avati@redhat.com>
Cc: Josef Bacik <josef@redhat.com>,
miklos@szeredi.hu, linux-fsdevel@vger.kernel.org,
fuse-devel@lists.sourceforge.net, chenk@redhat.com
Subject: Re: [PATCH v4] fuse: O_DIRECT support for files
Date: Fri, 17 Feb 2012 12:59:29 -0500 [thread overview]
Message-ID: <20120217175929.GC1856@localhost.localdomain> (raw)
In-Reply-To: <4F3E9151.4000101@redhat.com>
On Fri, Feb 17, 2012 at 11:11:37PM +0530, Anand Avati wrote:
> On 02/17/2012 09:32 PM, Josef Bacik wrote:
> >On Fri, Feb 17, 2012 at 08:49:58AM -0500, Anand Avati wrote:
> >>Implement ->direct_IO() method in aops. The ->direct_IO() method combines
> >>the existing fuse_direct_read/fuse_direct_write methods to implement
> >>O_DIRECT functionality.
> >>
> >>Reaching ->direct_IO() in the read path via generic_file_aio_read ensures
> >>proper synchronization with page cache with its existing framework.
> >>
> >>Reaching ->direct_IO() in the write path via fuse_file_aio_write is made
> >>to come via generic_file_direct_write() which makes it play nice with
> >>the page cache w.r.t other mmap pages etc.
> >>
> >>On files marked 'direct_io' by the filesystem server, IO always follows
> >>the fuse_direct_read/write path. There is no effect of fcntl(O_DIRECT)
> >>and it always succeeds.
> >>
> >>On files not marked with 'direct_io' by the filesystem server, the IO
> >>path depends on O_DIRECT flag by the application. This can be passed
> >>at the time of open() as well as via fcntl().
> >>
> >>Note that asynchronous O_DIRECT iocb jobs are completed synchronously
> >>always (this has been the case with FUSE even before this patch)
> >>
> >>Signed-off-by: Anand Avati<avati@redhat.com>
> >>---
> >>
> >>Test case:
> >>
> >>- concurrent read and write DDs with oflag=direct and iflag=direct set
> >> in a few writes and a few reads
> >>
> >>- artificially decrease 6th parameter to generic_file_direct_write to
> >> simulate a partial write of the direct IO request and verify proper
> >> buffered I/O for remaining offset and size with printk's and verify
> >> write completion at backend
> >>
> >> fs/fuse/dir.c | 3 -
> >> fs/fuse/file.c | 124 ++++++++++++++++++++++++++++++++++++++++++++++++--------
> >> 2 files changed, 107 insertions(+), 20 deletions(-)
> >>
> >
> >All in all it looks good, just one little nit
> >
> >>diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> >>index 2066328..7e5dbd0 100644
> >>--- a/fs/fuse/dir.c
> >>+++ b/fs/fuse/dir.c
> >>@@ -387,9 +387,6 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
> >> if (fc->no_create)
> >> return -ENOSYS;
> >>
> >>- if (flags& O_DIRECT)
> >>- return -EINVAL;
> >>-
> >> forget = fuse_alloc_forget();
> >> if (!forget)
> >> return -ENOMEM;
> >>diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> >>index 4a199fd..9dd611b 100644
> >>--- a/fs/fuse/file.c
> >>+++ b/fs/fuse/file.c
> >>@@ -194,10 +194,6 @@ int fuse_open_common(struct inode *inode, struct file *file, bool isdir)
> >> struct fuse_conn *fc = get_fuse_conn(inode);
> >> int err;
> >>
> >>- /* VFS checks this, but only _after_ ->open() */
> >>- if (file->f_flags& O_DIRECT)
> >>- return -EINVAL;
> >>-
> >> err = generic_file_open(inode, file);
> >> if (err)
> >> return err;
> >>@@ -932,17 +928,23 @@ static ssize_t fuse_file_aio_write(struct kiocb *iocb, const struct iovec *iov,
> >> struct file *file = iocb->ki_filp;
> >> struct address_space *mapping = file->f_mapping;
> >> size_t count = 0;
> >>+ size_t ocount = 0;
> >> ssize_t written = 0;
> >>+ ssize_t written_buffered = 0;
> >> struct inode *inode = mapping->host;
> >> ssize_t err;
> >> struct iov_iter i;
> >>+ loff_t endbyte = 0;
> >>
> >> WARN_ON(iocb->ki_pos != pos);
> >>
> >>- err = generic_segment_checks(iov,&nr_segs,&count, VERIFY_READ);
> >>+ ocount = 0;
> >>+ err = generic_segment_checks(iov,&nr_segs,&ocount, VERIFY_READ);
> >> if (err)
> >> return err;
> >>
> >>+ count = ocount;
> >>+
> >> mutex_lock(&inode->i_mutex);
> >> vfs_check_frozen(inode->i_sb, SB_FREEZE_WRITE);
> >>
> >>@@ -962,11 +964,36 @@ static ssize_t fuse_file_aio_write(struct kiocb *iocb, const struct iovec *iov,
> >>
> >> file_update_time(file);
> >>
> >>- iov_iter_init(&i, iov, nr_segs, count, 0);
> >>- written = fuse_perform_write(file, mapping,&i, pos);
> >>- if (written>= 0)
> >>- iocb->ki_pos = pos + written;
> >>+ if (file->f_flags& O_DIRECT) {
> >>+ written = generic_file_direct_write(iocb, iov,&nr_segs,
> >>+ pos,&iocb->ki_pos,
> >>+ count, ocount);
> >>+ if (written< 0 || written == count)
> >>+ goto out;
> >>+
> >>+ pos += written;
> >>+ count -= written;
> >>+
> >>+ iov_iter_init(&i, iov, nr_segs, count, written);
> >>+ written_buffered = fuse_perform_write(file, mapping,&i, pos);
> >>+ if (written_buffered< 0) {
> >>+ err = written_buffered;
> >>+ goto out;
> >>+ }
> >>+ endbyte = pos + written_buffered - 1;
> >>
> >>+ err = filemap_write_and_wait_range(file->f_mapping, pos,
> >>+ endbyte);
> >>+ if (err)
> >>+ goto out;
> >>+ written += written_buffered;
> >>+ iocb->ki_pos = pos + written_buffered;
> >
> >You need to call invalidate_mapping_pages here to evict the pages from
> >pagecache, basically copy what __generic_file_aio_write does and you are good to
> >go. Thanks,
>
> I have a more fundamental question. Does FUSE require such a
> "buffered fallback" path at all? Both branches
> (generic_file_direct_write on a fuse file and fuse_perform_write)
> finally end up in a fuse_send_write(). What is the reason behind
> expecting fuse_send_write() to fail via fuse_direct_IO() branch but
> succeed via fuse_perform_write()?
Ahh good point helps if I pay attention to what you are doing. You're right if
it fails in the direct write I doubt trying it again with buffered writing is
going to get you anywhere, I had it in my mind you were still using the generic
dio infrastructure which can fail for a variety of reasons but you aren't so you
could probably get away with just failing if generic_file_direct_write fails.
Thanks,
Josef
prev parent reply other threads:[~2012-02-17 17:59 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-02-17 13:49 [PATCH v4] fuse: O_DIRECT support for files Anand Avati
[not found] ` <20120217134958.GA6303-WxyJVcTm+KGVOal23zz4YIGKTjYczspe@public.gmane.org>
2012-02-17 16:02 ` Josef Bacik
[not found] ` <20120217160206.GA1856-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2012-02-17 17:41 ` Anand Avati
2012-02-17 17:59 ` Josef Bacik [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120217175929.GC1856@localhost.localdomain \
--to=josef@redhat.com \
--cc=avati@redhat.com \
--cc=chenk@redhat.com \
--cc=fuse-devel@lists.sourceforge.net \
--cc=linux-fsdevel@vger.kernel.org \
--cc=miklos@szeredi.hu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).