From: Jan Kara <jack@suse.cz>
To: Matthew Wilcox <matthew@wil.cx>
Cc: Jan Kara <jack@suse.cz>, Dave Chinner <david@fromorbit.com>,
linux-fsdevel@vger.kernel.org
Subject: Re: hole-punch vs fault
Date: Tue, 3 Dec 2013 00:13:01 +0100 [thread overview]
Message-ID: <20131202231301.GA26183@quack.suse.cz> (raw)
In-Reply-To: <20131202201315.GA14336@parisc-linux.org>
On Mon 02-12-13 13:13:15, Matthew Wilcox wrote:
> On Mon, Dec 02, 2013 at 08:58:25AM -0700, Matthew Wilcox wrote:
> > On Mon, Dec 02, 2013 at 09:33:18AM +0100, Jan Kara wrote:
> > > > Indeed, XIP could make use of the structures we already
> > > > have in the struct inode/address space to behave more like a normal
> > > > filesystem. We already use the mapping tree to store
> > > > per-page information that is used to serialise truncate vs page
> > > > faults, so why not make XIP do exactly the same thing?
> > > I believe that grabbing mmap_sem for writing during truncate (in case of
> > > ext2 around xip_truncate_page() & truncate_setsize() calls should do the
> > > trick. But I need to verify with lockdep that it doesn't introduce new
> > > locking problems.
> >
> > Umm ... mmap_sem for write? On each of the tasks that have a file mmaped?
> > I know we have double_down that orders by memory address, but I don't
> > think we have an N_down_write().
>
> Here's a simpler solution: Grab i_mmap_sem over vm_insert_mixed().
> i_mmap_sem is already taken in unmap_mapping_range(), called from
> truncate_pagecache(), so we know that while we hold i_mmap_sem that
> i_size has already been updated and the block can't be deallocated
> from under us. We know that i_mmap_sem is safe to take here because it
> already is (in __xip_unmap()).
Yeah, that should do the trick.
Honza
> Now we need only worry about holepunch, and it's the same problem for
> pagecache and XIP files alike.
>
> diff --git a/mm/filemap_xip.c b/mm/filemap_xip.c
> index d8d9fe3..f6de4c3 100644
> --- a/mm/filemap_xip.c
> +++ b/mm/filemap_xip.c
> @@ -260,8 +260,17 @@ again:
> __xip_unmap(mapping, vmf->pgoff);
>
> found:
> + /* We must recheck i_size under i_mmap_mutex */
> + mutex_lock(&mapping->i_mmap_mutex);
> + size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >>
> + PAGE_CACHE_SHIFT;
> + if (unlikely(vmf->pgoff >= size)) {
> + mutex_unlock(&mapping->i_mmap_mutex);
> + return VM_FAULT_SIGBUS;
> + }
> err = vm_insert_mixed(vma, (unsigned long)vmf->virtual_address,
> xip_pfn);
> + mutex_unlock(&mapping->i_mmap_mutex);
> if (err == -ENOMEM)
> return VM_FAULT_OOM;
> /*
> @@ -285,6 +294,15 @@ found:
> }
> if (error != -ENODATA)
> goto out;
> +
> + /* We must recheck i_size under i_mmap_mutex */
> + mutex_lock(&mapping->i_mmap_mutex);
> + size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >>
> + PAGE_CACHE_SHIFT;
> + if (unlikely(vmf->pgoff >= size)) {
> + ret = VM_FAULT_SIGBUS;
> + goto out;
> + }
> /* not shared and writable, use xip_sparse_page() */
> page = xip_sparse_page();
> if (!page)
> @@ -296,6 +314,7 @@ found:
>
> ret = VM_FAULT_NOPAGE;
> out:
> + mutex_unlock(&mapping->i_mmap_mutex);
> write_seqcount_end(&xip_sparse_seq);
> mutex_unlock(&xip_sparse_mutex);
>
>
> --
> Matthew Wilcox Intel Open Source Technology Centre
> "Bill, look, we understand that you're interested in selling us this
> operating system, but compare it to ours. We can't possibly take such
> a retrograde step."
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
prev parent reply other threads:[~2013-12-02 23:13 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-11-27 13:48 hole-punch vs fault Matthew Wilcox
2013-11-27 22:19 ` Jan Kara
2013-11-28 2:33 ` Matthew Wilcox
2013-11-28 3:30 ` Matthew Wilcox
2013-11-28 4:22 ` Dave Chinner
2013-11-28 4:44 ` Matthew Wilcox
2013-11-28 12:24 ` Matthew Wilcox
2013-11-28 22:12 ` Dave Chinner
2013-11-29 13:11 ` Matthew Wilcox
2013-12-01 21:52 ` Dave Chinner
2013-12-02 8:33 ` Jan Kara
2013-12-02 15:58 ` Matthew Wilcox
2013-12-02 20:11 ` Jan Kara
2013-12-02 20:13 ` Matthew Wilcox
2013-12-02 23:13 ` Jan Kara [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131202231301.GA26183@quack.suse.cz \
--to=jack@suse.cz \
--cc=david@fromorbit.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=matthew@wil.cx \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).