From: Matthew Wilcox <matthew@wil.cx>
To: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>,
linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v3 0/3] Add XIP support to ext4
Date: Thu, 19 Dec 2013 08:32:13 -0700 [thread overview]
Message-ID: <20131219153213.GC19166@parisc-linux.org> (raw)
In-Reply-To: <20131219015843.GE20579@dastard>
On Thu, Dec 19, 2013 at 12:58:44PM +1100, Dave Chinner wrote:
> >
> > - if (unlikely(iocb->ki_filp->f_flags & O_DIRECT))
> > + if (unlikely(iocb->ki_filp->f_flags & O_DIRECT) ||
> > + (mapping_is_xip(inode->i_mapping)))
>
> I suspect a helper function a good idea here. Something like
> "is_io_direct(iocb->ki_filp)"
Seems like a good idea.
> > index 594009f..ae760d9 100644
> > --- a/fs/ext4/indirect.c
> > +++ b/fs/ext4/indirect.c
> > @@ -686,15 +686,22 @@ retry:
> > inode_dio_done(inode);
> > goto locked;
> > }
> > - ret = __blockdev_direct_IO(rw, iocb, inode,
> > - inode->i_sb->s_bdev, iov,
> > - offset, nr_segs,
> > - ext4_get_block, NULL, NULL, 0);
> > + if (mapping_is_xip(file->f_mapping))
> > + ret = xip_io(rw, iocb, inode, iov, offset, nr_segs,
> > + ext4_get_block, NULL, 0);
>
> xip_direct_io() might be a better name here...
I you're a man who his functions verbs :-)
> > +static inline bool mapping_is_xip(struct address_space *mapping)
> > +{
> > + return mapping->a_ops->get_xip_mem != NULL;
> > +}
>
> I think that you should put a flag in the mapping for this, rather
> than chase pointers to do the check.
Probably. I think we may end up without a get_xip_mem() aop by the time
we're finished.
> > + retval = get_block(inode, block, bh, 0);
> > + if (retval)
> > + break;
> > + if (buffer_mapped(bh))
> > + hole = false;
> > + else
> > + hole = true;
> > + if (rw == WRITE && hole) {
> > + bh->b_size = ALIGN(end - offset, PAGE_SIZE);
> > + retval = get_block(inode, block, bh, 1);
> > + if (retval)
> > + break;
> > + }
>
> Why do two write mappings here? If it's a write, then we always want
> to fill a hole, so the create value sent to getblock is:
Yeah, there's a missing piece here. At the moment, I'm supposed to take
the stupid xip_sparse_mutex before filling a hole, and call __xip_unmap
after filling it. I think that has to go away, and once that's done,
I agree with your optimisation.
> > +/*
> > + * Perform I/O to an XIP file. We follow the same rules as
> > + * __blockdev_direct_IO with respect to locking
> > + */
>
> OK, that's interesting, because it means that it will be different
> to normal buffered page cache IO. It will allow concurrent
> overlapping reads and writes - something that POSIX does not allow -
> and places the burden of synchronising concurrent reads and writes
> on the application.
>
> That is different to the current XIP, which serialises writes
> against each other, but not against reads. That's not strictly POSIX
> compliant, either, but it prevents concurrent overlapping writes
> from occurring and so behaves more like applications expect buffered
> IO to work.
>
> For persistent memory, I'd prefer that we have concurrent write Io
> capabilities from the start, but I'm not sure we should just do this
> without first talking about it....
I think you're right. Let's drag this topic out to lkml and make sure
Linus is aware before we go too much further.
> > + /* Protects against truncate */
> > + atomic_inc(&inode->i_dio_count);
> > +
> > + retval = __xip_io(rw, inode, iov, offset, end, nr_segs, get_block, &bh);
>
> Can we avoid using "__" prefixes for new code? xip_do_direct_io() is
> a much better name....
Then it won't fit on a single line ;-) I have no attachment to the name,
but isn't all xip IO direct?
> > +
> > + if ((flags & DIO_LOCKING) && (rw == READ))
> > + mutex_unlock(&inode->i_mutex);
> > +
> > + inode_dio_done(inode);
> > +
> > + if (end_io)
> > + end_io(iocb, offset, transferred, bh.b_private);
>
> And that solves the unwritten extent problem for the IO path. Now we
> just need to solve it for the mmap path. That, I suspect will
> require a custom .page_mkwrite handler....
No, page_mkwrite() never gets called. At this point, I'm thinking a
custom ->fault handler that looks something like this:
static int ext4_xip_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
{
return xip_fault(vma, vmf, ext4_get_block_write, ext4_end_io_dio);
}
But I'll report back further when I've had a chance to see how it
turns out.
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
next prev parent reply other threads:[~2013-12-19 15:32 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-12-17 19:18 [PATCH v3 0/3] Add XIP support to ext4 Matthew Wilcox
2013-12-17 19:18 ` [PATCH v3 1/3] Fix XIP fault vs truncate race Matthew Wilcox
2013-12-17 19:18 ` [PATCH v3 2/3] xip: Add xip_zero_page_range Matthew Wilcox
2013-12-17 19:18 ` [PATCH v3 3/3] ext4: Add XIP functionality Matthew Wilcox
2013-12-17 22:30 ` [PATCH v3 0/3] Add XIP support to ext4 Dave Chinner
2013-12-18 2:31 ` Matthew Wilcox
2013-12-18 5:01 ` Theodore Ts'o
2013-12-18 14:27 ` Matthew Wilcox
2013-12-19 2:07 ` Theodore Ts'o
2013-12-19 4:12 ` Matthew Wilcox
2013-12-19 4:37 ` Dave Chinner
2013-12-19 5:43 ` Theodore Ts'o
2013-12-19 15:20 ` Matthew Wilcox
2013-12-19 16:17 ` Theodore Ts'o
2013-12-19 17:12 ` Matthew Wilcox
2013-12-19 17:18 ` Theodore Ts'o
2013-12-20 18:17 ` Matthew Wilcox
2013-12-20 19:34 ` Theodore Ts'o
2013-12-20 20:11 ` Matthew Wilcox
2013-12-23 3:36 ` Dave Chinner
2013-12-23 3:45 ` Matthew Wilcox
2013-12-23 4:32 ` Dave Chinner
2013-12-23 6:56 ` Dave Chinner
2013-12-23 14:51 ` Theodore Ts'o
2013-12-23 3:16 ` Dave Chinner
2013-12-24 16:27 ` Matthew Wilcox
2013-12-18 12:33 ` Dave Chinner
2013-12-18 15:22 ` Matthew Wilcox
2013-12-19 0:48 ` Dave Chinner
2013-12-19 1:05 ` Matthew Wilcox
2013-12-19 1:58 ` Dave Chinner
2013-12-19 15:32 ` Matthew Wilcox [this message]
2013-12-19 23:46 ` Dave Chinner
2013-12-20 16:45 ` Matthew Wilcox
2013-12-23 4:14 ` Dave Chinner
2013-12-18 18:13 ` Eric Sandeen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131219153213.GC19166@parisc-linux.org \
--to=matthew@wil.cx \
--cc=david@fromorbit.com \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=matthew.r.wilcox@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.