* [PATCH RFC] Add cache-only generic file read
@ 2009-10-04 23:20 Chris Mason
2009-10-05 21:58 ` Christoph Hellwig
0 siblings, 1 reply; 2+ messages in thread
From: Chris Mason @ 2009-10-04 23:20 UTC (permalink / raw)
To: linux-fsdevel
Hello everyone,
We're still hammering out O_DIRECT code for btrfs that doesn't go
through the page cache. Until things are done, it helps to have
O_DIRECT that goes through the page cache and just immediately
invalidates. It doesn't give us AIO, but it is still useful for
some workloads.
Btrfs has had O_DIRECT writes for a while by invalidating the cache in
its own file_write. Reads are harder because filemap.c doesn't quite
export enough. Rather than export things and make a btrfs read, I've
made a generic cache-only read function that includes O_DIRECT
invalidation.
What does everyone think of adding something like this:
-chris
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2adaa25..9ee97a9 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2203,6 +2203,9 @@ extern int generic_file_readonly_mmap(struct file *, struct vm_area_struct *);
extern int file_read_actor(read_descriptor_t * desc, struct page *page, unsigned long offset, unsigned long size);
int generic_write_checks(struct file *file, loff_t *pos, size_t *count, int isblk);
extern ssize_t generic_file_aio_read(struct kiocb *, const struct iovec *, unsigned long, loff_t);
+extern ssize_t generic_file_cached_read(struct kiocb *iocb,
+ const struct iovec *iov,
+ unsigned long nr_segs, loff_t pos);
extern ssize_t __generic_file_aio_write(struct kiocb *, const struct iovec *, unsigned long,
loff_t *);
extern ssize_t generic_file_aio_write(struct kiocb *, const struct iovec *, unsigned long, loff_t);
diff --git a/mm/filemap.c b/mm/filemap.c
index ef169f3..506d769 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1269,6 +1269,85 @@ int generic_segment_checks(const struct iovec *iov,
EXPORT_SYMBOL(generic_segment_checks);
/**
+ * generic_file_cached_read - generic filesystem read routine, no O_DIRECT
+ * is done
+ * @iocb: kernel I/O control block
+ * @iov: io vector request
+ * @nr_segs: number of segments in the iovec
+ * @pos: current file position
+ *
+ * This is the "read()" routine for all filesystems
+ * that can use the page cache directly, but do not support
+ * O_DIRECT. The O_DIRECT part is emulated with page cache invalidatation
+ */
+ssize_t
+generic_file_cached_read(struct kiocb *iocb, const struct iovec *iov,
+ unsigned long nr_segs, loff_t pos)
+{
+ struct file *filp = iocb->ki_filp;
+ ssize_t retval;
+ unsigned long seg;
+ size_t count;
+ loff_t *ppos = &iocb->ki_pos;
+ struct address_space *mapping;
+ struct inode *inode;
+ loff_t size;
+ size_t iov_len = 0;
+
+ count = 0;
+ retval = generic_segment_checks(iov, &nr_segs, &count, VERIFY_WRITE);
+ if (retval)
+ return retval;
+
+ mapping = filp->f_mapping;
+ inode = mapping->host;
+ size = i_size_read(inode);
+
+ /* coalesce the iovecs and go direct-to-BIO for O_DIRECT */
+ if (filp->f_flags & O_DIRECT) {
+ if (!count)
+ goto out; /* skip atime */
+
+ iov_len = iov_length(iov, nr_segs);
+
+ if (pos < size) {
+ retval = filemap_write_and_wait_range(mapping, pos,
+ pos + iov_len - 1);
+ if (retval)
+ goto out;
+ }
+ }
+
+ for (seg = 0; seg < nr_segs; seg++) {
+ read_descriptor_t desc;
+
+ desc.written = 0;
+ desc.arg.buf = iov[seg].iov_base;
+ desc.count = iov[seg].iov_len;
+ if (desc.count == 0)
+ continue;
+ desc.error = 0;
+ do_generic_file_read(filp, ppos, &desc, file_read_actor);
+ retval += desc.written;
+ if (desc.error) {
+ retval = retval ?: desc.error;
+ break;
+ }
+ if (desc.count > 0)
+ break;
+ }
+ if ((filp->f_flags & O_DIRECT) && pos < size) {
+ invalidate_inode_pages2_range(mapping,
+ pos >> PAGE_CACHE_SHIFT,
+ (pos + iov_len - 1) >> PAGE_CACHE_SHIFT);
+ }
+ file_accessed(filp);
+out:
+ return retval;
+}
+EXPORT_SYMBOL(generic_file_cached_read);
+
+/**
* generic_file_aio_read - generic filesystem read routine
* @iocb: kernel I/O control block
* @iov: io vector request
--
1.6.4.1
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH RFC] Add cache-only generic file read
2009-10-04 23:20 [PATCH RFC] Add cache-only generic file read Chris Mason
@ 2009-10-05 21:58 ` Christoph Hellwig
0 siblings, 0 replies; 2+ messages in thread
From: Christoph Hellwig @ 2009-10-05 21:58 UTC (permalink / raw)
To: Chris Mason, linux-fsdevel
On Sun, Oct 04, 2009 at 07:20:23PM -0400, Chris Mason wrote:
> What does everyone think of adding something like this:
- duplicates too much of generic_file_aio_read. The patch should just
change generic_file_aio_read to a small wrapper for an O_DIRECT
and a buffered version, and btrfs would jsut use the buffered one
directly.
- the name. We use the term buffered for pagecache I/O, not cached,
e.g. generic_file_buffered_write.
- not sure if the invalidations around the core buffered helper
are generic anough to put them into filemap.c. I'd just put them
into btrfs for now, although that part isn't too important.
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2009-10-05 21:59 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-04 23:20 [PATCH RFC] Add cache-only generic file read Chris Mason
2009-10-05 21:58 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).