Re: hole-punch vs fault

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Jan Kara <jack@suse.cz>
To: Matthew Wilcox <matthew@wil.cx>
Cc: Jan Kara <jack@suse.cz>, Dave Chinner <david@fromorbit.com>,
	linux-fsdevel@vger.kernel.org
Subject: Re: hole-punch vs fault
Date: Tue, 3 Dec 2013 00:13:01 +0100	[thread overview]
Message-ID: <20131202231301.GA26183@quack.suse.cz> (raw)
In-Reply-To: <20131202201315.GA14336@parisc-linux.org>

On Mon 02-12-13 13:13:15, Matthew Wilcox wrote:
> On Mon, Dec 02, 2013 at 08:58:25AM -0700, Matthew Wilcox wrote:
> > On Mon, Dec 02, 2013 at 09:33:18AM +0100, Jan Kara wrote:
> > > > Indeed, XIP could make use of the structures we already
> > > > have in the struct inode/address space to behave more like a normal
> > > > filesystem. We already use the mapping tree to store
> > > > per-page information that is used to serialise truncate vs page
> > > > faults, so why not make XIP do exactly the same thing? 
> > >   I believe that grabbing mmap_sem for writing during truncate (in case of
> > > ext2 around xip_truncate_page() & truncate_setsize() calls should do the
> > > trick. But I need to verify with lockdep that it doesn't introduce new
> > > locking problems.
> > 
> > Umm ... mmap_sem for write?  On each of the tasks that have a file mmaped?
> > I know we have double_down that orders by memory address, but I don't
> > think we have an N_down_write().
> 
> Here's a simpler solution: Grab i_mmap_sem over vm_insert_mixed().
> i_mmap_sem is already taken in unmap_mapping_range(), called from
> truncate_pagecache(), so we know that while we hold i_mmap_sem that
> i_size has already been updated and the block can't be deallocated
> from under us.  We know that i_mmap_sem is safe to take here because it
> already is (in __xip_unmap()).
  Yeah, that should do the trick.

								Honza

> Now we need only worry about holepunch, and it's the same problem for
> pagecache and XIP files alike.
> 
> diff --git a/mm/filemap_xip.c b/mm/filemap_xip.c
> index d8d9fe3..f6de4c3 100644
> --- a/mm/filemap_xip.c
> +++ b/mm/filemap_xip.c
> @@ -260,8 +260,17 @@ again:
>  		__xip_unmap(mapping, vmf->pgoff);
>  
>  found:
> +		/* We must recheck i_size under i_mmap_mutex */
> +		mutex_lock(&mapping->i_mmap_mutex);
> +		size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >>
> +							PAGE_CACHE_SHIFT;
> +		if (unlikely(vmf->pgoff >= size)) {
> +			mutex_unlock(&mapping->i_mmap_mutex);
> +			return VM_FAULT_SIGBUS;
> +		}
>  		err = vm_insert_mixed(vma, (unsigned long)vmf->virtual_address,
>  							xip_pfn);
> +		mutex_unlock(&mapping->i_mmap_mutex);
>  		if (err == -ENOMEM)
>  			return VM_FAULT_OOM;
>  		/*
> @@ -285,6 +294,15 @@ found:
>  		}
>  		if (error != -ENODATA)
>  			goto out;
> +
> +		/* We must recheck i_size under i_mmap_mutex */
> +		mutex_lock(&mapping->i_mmap_mutex);
> +		size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >>
> +							PAGE_CACHE_SHIFT;
> +		if (unlikely(vmf->pgoff >= size)) {
> +			ret = VM_FAULT_SIGBUS;
> +			goto out;
> +		}
>  		/* not shared and writable, use xip_sparse_page() */
>  		page = xip_sparse_page();
>  		if (!page)
> @@ -296,6 +314,7 @@ found:
>  
>  		ret = VM_FAULT_NOPAGE;
>  out:
> +		mutex_unlock(&mapping->i_mmap_mutex);
>  		write_seqcount_end(&xip_sparse_seq);
>  		mutex_unlock(&xip_sparse_mutex);
>  
> 
> -- 
> Matthew Wilcox				Intel Open Source Technology Centre
> "Bill, look, we understand that you're interested in selling us this
> operating system, but compare it to ours.  We can't possibly take such
> a retrograde step."
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

     prev parent reply	other threads:[~2013-12-02 23:13 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-11-27 13:48 hole-punch vs fault Matthew Wilcox
2013-11-27 22:19 ` Jan Kara
2013-11-28  2:33   ` Matthew Wilcox
2013-11-28  3:30     ` Matthew Wilcox
2013-11-28  4:22     ` Dave Chinner
2013-11-28  4:44     ` Matthew Wilcox
2013-11-28 12:24       ` Matthew Wilcox
2013-11-28 22:12       ` Dave Chinner
2013-11-29 13:11         ` Matthew Wilcox
2013-12-01 21:52           ` Dave Chinner
2013-12-02  8:33             ` Jan Kara
2013-12-02 15:58               ` Matthew Wilcox
2013-12-02 20:11                 ` Jan Kara
2013-12-02 20:13                 ` Matthew Wilcox
2013-12-02 23:13                   ` Jan Kara [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131202231301.GA26183@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=david@fromorbit.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=matthew@wil.cx \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).