From mboxrd@z Thu Jan 1 00:00:00 1970 From: Anand Avati Subject: Re: [PATCH v4] fuse: O_DIRECT support for files Date: Fri, 17 Feb 2012 23:11:37 +0530 Message-ID: <4F3E9151.4000101@redhat.com> References: <20120217134958.GA6303@shell.devel.redhat.com> <20120217160206.GA1856@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, fuse-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org, chenk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org, miklos-sUDqSbJrdHQHWmgEVkV9KA@public.gmane.org To: Josef Bacik Return-path: In-Reply-To: <20120217160206.GA1856-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: fuse-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org List-Id: linux-fsdevel.vger.kernel.org On 02/17/2012 09:32 PM, Josef Bacik wrote: > On Fri, Feb 17, 2012 at 08:49:58AM -0500, Anand Avati wrote: >> Implement ->direct_IO() method in aops. The ->direct_IO() method combines >> the existing fuse_direct_read/fuse_direct_write methods to implement >> O_DIRECT functionality. >> >> Reaching ->direct_IO() in the read path via generic_file_aio_read ensures >> proper synchronization with page cache with its existing framework. >> >> Reaching ->direct_IO() in the write path via fuse_file_aio_write is made >> to come via generic_file_direct_write() which makes it play nice with >> the page cache w.r.t other mmap pages etc. >> >> On files marked 'direct_io' by the filesystem server, IO always follows >> the fuse_direct_read/write path. There is no effect of fcntl(O_DIRECT) >> and it always succeeds. >> >> On files not marked with 'direct_io' by the filesystem server, the IO >> path depends on O_DIRECT flag by the application. This can be passed >> at the time of open() as well as via fcntl(). >> >> Note that asynchronous O_DIRECT iocb jobs are completed synchronously >> always (this has been the case with FUSE even before this patch) >> >> Signed-off-by: Anand Avati >> --- >> >> Test case: >> >> - concurrent read and write DDs with oflag=direct and iflag=direct set >> in a few writes and a few reads >> >> - artificially decrease 6th parameter to generic_file_direct_write to >> simulate a partial write of the direct IO request and verify proper >> buffered I/O for remaining offset and size with printk's and verify >> write completion at backend >> >> fs/fuse/dir.c | 3 - >> fs/fuse/file.c | 124 ++++++++++++++++++++++++++++++++++++++++++++++++-------- >> 2 files changed, 107 insertions(+), 20 deletions(-) >> > > All in all it looks good, just one little nit > >> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c >> index 2066328..7e5dbd0 100644 >> --- a/fs/fuse/dir.c >> +++ b/fs/fuse/dir.c >> @@ -387,9 +387,6 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry, >> if (fc->no_create) >> return -ENOSYS; >> >> - if (flags& O_DIRECT) >> - return -EINVAL; >> - >> forget = fuse_alloc_forget(); >> if (!forget) >> return -ENOMEM; >> diff --git a/fs/fuse/file.c b/fs/fuse/file.c >> index 4a199fd..9dd611b 100644 >> --- a/fs/fuse/file.c >> +++ b/fs/fuse/file.c >> @@ -194,10 +194,6 @@ int fuse_open_common(struct inode *inode, struct file *file, bool isdir) >> struct fuse_conn *fc = get_fuse_conn(inode); >> int err; >> >> - /* VFS checks this, but only _after_ ->open() */ >> - if (file->f_flags& O_DIRECT) >> - return -EINVAL; >> - >> err = generic_file_open(inode, file); >> if (err) >> return err; >> @@ -932,17 +928,23 @@ static ssize_t fuse_file_aio_write(struct kiocb *iocb, const struct iovec *iov, >> struct file *file = iocb->ki_filp; >> struct address_space *mapping = file->f_mapping; >> size_t count = 0; >> + size_t ocount = 0; >> ssize_t written = 0; >> + ssize_t written_buffered = 0; >> struct inode *inode = mapping->host; >> ssize_t err; >> struct iov_iter i; >> + loff_t endbyte = 0; >> >> WARN_ON(iocb->ki_pos != pos); >> >> - err = generic_segment_checks(iov,&nr_segs,&count, VERIFY_READ); >> + ocount = 0; >> + err = generic_segment_checks(iov,&nr_segs,&ocount, VERIFY_READ); >> if (err) >> return err; >> >> + count = ocount; >> + >> mutex_lock(&inode->i_mutex); >> vfs_check_frozen(inode->i_sb, SB_FREEZE_WRITE); >> >> @@ -962,11 +964,36 @@ static ssize_t fuse_file_aio_write(struct kiocb *iocb, const struct iovec *iov, >> >> file_update_time(file); >> >> - iov_iter_init(&i, iov, nr_segs, count, 0); >> - written = fuse_perform_write(file, mapping,&i, pos); >> - if (written>= 0) >> - iocb->ki_pos = pos + written; >> + if (file->f_flags& O_DIRECT) { >> + written = generic_file_direct_write(iocb, iov,&nr_segs, >> + pos,&iocb->ki_pos, >> + count, ocount); >> + if (written< 0 || written == count) >> + goto out; >> + >> + pos += written; >> + count -= written; >> + >> + iov_iter_init(&i, iov, nr_segs, count, written); >> + written_buffered = fuse_perform_write(file, mapping,&i, pos); >> + if (written_buffered< 0) { >> + err = written_buffered; >> + goto out; >> + } >> + endbyte = pos + written_buffered - 1; >> >> + err = filemap_write_and_wait_range(file->f_mapping, pos, >> + endbyte); >> + if (err) >> + goto out; >> + written += written_buffered; >> + iocb->ki_pos = pos + written_buffered; > > You need to call invalidate_mapping_pages here to evict the pages from > pagecache, basically copy what __generic_file_aio_write does and you are good to > go. Thanks, I have a more fundamental question. Does FUSE require such a "buffered fallback" path at all? Both branches (generic_file_direct_write on a fuse file and fuse_perform_write) finally end up in a fuse_send_write(). What is the reason behind expecting fuse_send_write() to fail via fuse_direct_IO() branch but succeed via fuse_perform_write()? Avati ------------------------------------------------------------------------------ Virtualization & Cloud Management Using Capacity Planning Cloud computing makes use of virtualization - but cloud computing also focuses on allowing computing to be delivered as a service. http://www.accelacomm.com/jaw/sfnl/114/51521223/