From: Matthew Wilcox <matthew@wil.cx>
To: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>,
linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v3 0/3] Add XIP support to ext4
Date: Fri, 20 Dec 2013 09:45:30 -0700 [thread overview]
Message-ID: <20131220164529.GE19166@parisc-linux.org> (raw)
In-Reply-To: <20131219234653.GD31386@dastard>
On Fri, Dec 20, 2013 at 10:46:54AM +1100, Dave Chinner wrote:
> Then perhaps we need to get rid of the xip_sparse_mutex first? :/
Yeah, already done in my tree. Just finishing up a few other pieces.
> > > And that solves the unwritten extent problem for the IO path. Now we
> > > just need to solve it for the mmap path. That, I suspect will
> > > require a custom .page_mkwrite handler....
> >
> > No, page_mkwrite() never gets called. At this point, I'm thinking a
> > custom ->fault handler that looks something like this:
>
> And that's another difference to the normal filesystem mmap paths.
> .fault is a read-only operation for filesystems and
> .page-mkwrite is the write-fault modification path. i.e.
> .fault is only supposed to populate the page into the page
> cache by reading it from disk via ->readpage(s). It does not do
> block allocation - if the fault is over a hole then a new, zeroed
> page is placed in the page cache regardless of whether it is a read
> or write page fault.
>
> ->page_mkwrite is then used by page fault infrstructure to inform
> filesystem that a write fault has occurred and they may need to
> allocate backing store for the page, or convert unwritten extents to
> written.
>
> What xip_file_fault() does is ask the fielsystem to allocate blocks
> for writeable mappings, rather than just inserting a sparse page
> over holes and unwritten extents. That fails to handle unwritten
> extents correctly - they remain unwritten despite the fact that
> userspace can now write to the page.
I agree with you up to this point. But xip_file_fault() uses the same
get_block_t callback to allocate blocks that block_page_mkwrite() does.
So there's no real difference from the fs' point of view.
> IOWs, xip_file_fault() needs to drop the allocation of blocks and
> only ever insert mappings for pages that have data in them or sprase
> pages for holes and unwritten extents. Then the filesystem needs to
> provide it's own ->page_mkwrite callout to do block allocation and
> unwritten extent conversion on write page faults, and the XIP code
> needs to provide a helper function to replace the sparse page in the
> mappings with the correct page mapped from the backing device after
> allocation or unwritten extent conversion.
>
> That will make XIP behave almost identically to the normal page
> cache based page fault path, requiring only a small addition to the
> filesystem page_mkwrite handler to support XIP...
I decided to see if there was anything particularly hard about the XFS
code in this area. I really think it's just this for you:
+++ b/fs/xfs/xfs_file.c
@@ -957,12 +957,27 @@ xfs_file_readdir(
return 0;
}
+static int xfs_xip_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+ return xip_fault(vma, vmf, xfs_get_blocks);
+}
+
+static const struct vm_operations_struct xfs_xip_vm_ops = {
+ .fault = xfs_xip_fault,
+ .remap_pages = generic_file_remap_pages,
+};
+
STATIC int
xfs_file_mmap(
struct file *filp,
struct vm_area_struct *vma)
{
- vma->vm_ops = &xfs_file_vm_ops;
+ if (IS_XIP(file_inode(filp))) {
+ vma->vm_ops = &xfs_xip_vm_ops;
+ vma->vm_flags |= VM_MIXEDMAP;
+ } else {
+ vma->vm_ops = &xfs_file_vm_ops;
+ }
file_accessed(filp);
return 0;
> > static int ext4_xip_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> > {
> > return xip_fault(vma, vmf, ext4_get_block_write, ext4_end_io_dio);
> > }
>
> I think the xip fault handler should be generic as there's no reason
> for it to do anything other that read-only operations. It's the
> page_mkwrite callout that needs custom code for each filesystem.
With no struct page for the XIP memory, it's just not feasible to do it
that way.
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
next prev parent reply other threads:[~2013-12-20 16:45 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-12-17 19:18 [PATCH v3 0/3] Add XIP support to ext4 Matthew Wilcox
2013-12-17 19:18 ` [PATCH v3 1/3] Fix XIP fault vs truncate race Matthew Wilcox
2013-12-17 19:18 ` [PATCH v3 2/3] xip: Add xip_zero_page_range Matthew Wilcox
2013-12-17 19:18 ` [PATCH v3 3/3] ext4: Add XIP functionality Matthew Wilcox
2013-12-17 22:30 ` [PATCH v3 0/3] Add XIP support to ext4 Dave Chinner
2013-12-18 2:31 ` Matthew Wilcox
2013-12-18 5:01 ` Theodore Ts'o
2013-12-18 14:27 ` Matthew Wilcox
2013-12-19 2:07 ` Theodore Ts'o
2013-12-19 4:12 ` Matthew Wilcox
2013-12-19 4:37 ` Dave Chinner
2013-12-19 5:43 ` Theodore Ts'o
2013-12-19 15:20 ` Matthew Wilcox
2013-12-19 16:17 ` Theodore Ts'o
2013-12-19 17:12 ` Matthew Wilcox
2013-12-19 17:18 ` Theodore Ts'o
2013-12-20 18:17 ` Matthew Wilcox
2013-12-20 19:34 ` Theodore Ts'o
2013-12-20 20:11 ` Matthew Wilcox
2013-12-23 3:36 ` Dave Chinner
2013-12-23 3:45 ` Matthew Wilcox
2013-12-23 4:32 ` Dave Chinner
2013-12-23 6:56 ` Dave Chinner
2013-12-23 14:51 ` Theodore Ts'o
2013-12-23 3:16 ` Dave Chinner
2013-12-24 16:27 ` Matthew Wilcox
2013-12-18 12:33 ` Dave Chinner
2013-12-18 15:22 ` Matthew Wilcox
2013-12-19 0:48 ` Dave Chinner
2013-12-19 1:05 ` Matthew Wilcox
2013-12-19 1:58 ` Dave Chinner
2013-12-19 15:32 ` Matthew Wilcox
2013-12-19 23:46 ` Dave Chinner
2013-12-20 16:45 ` Matthew Wilcox [this message]
2013-12-23 4:14 ` Dave Chinner
2013-12-18 18:13 ` Eric Sandeen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131220164529.GE19166@parisc-linux.org \
--to=matthew@wil.cx \
--cc=david@fromorbit.com \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=matthew.r.wilcox@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.