* [PATCH 0/1] iomap: Direct I/O for inline data @ 2018-06-27 0:39 Andreas Gruenbacher 2018-06-27 0:39 ` [PATCH 1/1] " Andreas Gruenbacher 2018-06-27 11:14 ` [PATCH 0/1] " Andreas Gruenbacher 0 siblings, 2 replies; 13+ messages in thread From: Andreas Gruenbacher @ 2018-06-27 0:39 UTC (permalink / raw) To: Christoph Hellwig; +Cc: cluster-devel, linux-fsdevel, linux-ext4 Here's a patch that implements direct I/O for inline data. Direct I/O to inline data is a bit weird because it's not direct in the usual sense, but since Christoph's been asking for it ... The usual alignment restrictions to the logical block size of the underlying block device still apply. I don't see a reason for changing that; the resulting behavior would only become very weird for no benefit. I've tested this against a hacked-up version of gfs2. However, the "real" gfs2 will keep falling back to buffered I/O for writes to inline data: gfs2 takes a shared lock during direct I/O, and writing to the inode under that shared lock is not allowed. Ext4 may become the first actual user of this part of the patch. Thanks, Andreas Andreas Gruenbacher (1): iomap: Direct I/O to inline data fs/iomap.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) -- 2.17.1 ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/1] iomap: Direct I/O for inline data 2018-06-27 0:39 [PATCH 0/1] iomap: Direct I/O for inline data Andreas Gruenbacher @ 2018-06-27 0:39 ` Andreas Gruenbacher 2018-06-27 1:44 ` kbuild test robot 2018-06-29 8:56 ` Christoph Hellwig 2018-06-27 11:14 ` [PATCH 0/1] " Andreas Gruenbacher 1 sibling, 2 replies; 13+ messages in thread From: Andreas Gruenbacher @ 2018-06-27 0:39 UTC (permalink / raw) To: Christoph Hellwig; +Cc: cluster-devel, linux-fsdevel, linux-ext4 Add support for reading from and writing to inline data to iomap_dio_rw. This saves filesystems from having to implement fallback code for this case. The inline data is actually cached in the inode, so the I/O is only direct in the sense that it doesn't go through the page cache. The same alignment restrictions as to non-inline data apply. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> --- fs/iomap.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/fs/iomap.c b/fs/iomap.c index d393bb0c7384..74668b3ca2ed 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -1231,6 +1231,32 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos, return submit_bio(bio); } +static loff_t iomap_dio_actor_inline(struct inode *inode, struct iomap_dio *dio, + struct iomap *iomap, loff_t pos, loff_t length) +{ + struct iov_iter *iter = dio->submit.iter; + size_t copied; + + BUG_ON(pos + length > PAGE_SIZE - offset_in_page(iomap->inline_data)); + + if (dio->flags & IOMAP_DIO_WRITE) { + loff_t size = inode->i_size; + + if (pos > size) + memset(iomap->inline_data + size, 0, pos - size); + copied = copy_from_iter(iomap->inline_data + pos, length, iter); + if (copied) { + if (pos + copied > size) + i_size_write(inode, pos + copied); + mark_inode_dirty(inode); + } + } else { + copied = copy_to_iter(iomap->inline_data + pos, length, iter); + } + dio->size += copied; + return copied; +} + static loff_t iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, void *data, struct iomap *iomap) @@ -1281,6 +1307,8 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, use_fua = true; } break; + case IOMAP_INLINE: + return iomap_dio_actor_inline(inode, dio, iomap, pos, length); default: WARN_ON_ONCE(1); return -EIO; -- 2.17.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] iomap: Direct I/O for inline data 2018-06-27 0:39 ` [PATCH 1/1] " Andreas Gruenbacher @ 2018-06-27 1:44 ` kbuild test robot 2018-06-29 8:56 ` Christoph Hellwig 1 sibling, 0 replies; 13+ messages in thread From: kbuild test robot @ 2018-06-27 1:44 UTC (permalink / raw) To: Andreas Gruenbacher Cc: cluster-devel, kbuild-all, linux-fsdevel, linux-ext4, Christoph Hellwig [-- Attachment #1: Type: text/plain, Size: 3195 bytes --] Hi Andreas, I love your patch! Yet something to improve: [auto build test ERROR on linus/master] [also build test ERROR on v4.18-rc2 next-20180626] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Andreas-Gruenbacher/iomap-Direct-I-O-for-inline-data/20180627-084229 config: x86_64-randconfig-x015-201825 (attached as .config) compiler: gcc-7 (Debian 7.3.0-16) 7.3.0 reproduce: # save the attached .config to linux build tree make ARCH=x86_64 All error/warnings (new ones prefixed by >>): In file included from include/linux/kernel.h:10:0, from include/linux/list.h:9, from include/linux/module.h:9, from fs/iomap.c:14: fs/iomap.c: In function 'iomap_dio_actor_inline': >> fs/iomap.c:971:56: error: 'struct iomap' has no member named 'inline_data' BUG_ON(pos + length > PAGE_SIZE - offset_in_page(iomap->inline_data)); ^ include/linux/compiler.h:77:42: note: in definition of macro 'unlikely' # define unlikely(x) __builtin_expect(!!(x), 0) ^ >> fs/iomap.c:971:2: note: in expansion of macro 'BUG_ON' BUG_ON(pos + length > PAGE_SIZE - offset_in_page(iomap->inline_data)); ^~~~~~ >> fs/iomap.c:971:36: note: in expansion of macro 'offset_in_page' BUG_ON(pos + length > PAGE_SIZE - offset_in_page(iomap->inline_data)); ^~~~~~~~~~~~~~ fs/iomap.c:977:16: error: 'struct iomap' has no member named 'inline_data' memset(iomap->inline_data + size, 0, pos - size); ^~ fs/iomap.c:978:32: error: 'struct iomap' has no member named 'inline_data' copied = copy_from_iter(iomap->inline_data + pos, length, iter); ^~ fs/iomap.c:985:30: error: 'struct iomap' has no member named 'inline_data' copied = copy_to_iter(iomap->inline_data + pos, length, iter); ^~ vim +971 fs/iomap.c 964 965 static loff_t iomap_dio_actor_inline(struct inode *inode, struct iomap_dio *dio, 966 struct iomap *iomap, loff_t pos, loff_t length) 967 { 968 struct iov_iter *iter = dio->submit.iter; 969 size_t copied; 970 > 971 BUG_ON(pos + length > PAGE_SIZE - offset_in_page(iomap->inline_data)); 972 973 if (dio->flags & IOMAP_DIO_WRITE) { 974 loff_t size = inode->i_size; 975 976 if (pos > size) 977 memset(iomap->inline_data + size, 0, pos - size); 978 copied = copy_from_iter(iomap->inline_data + pos, length, iter); 979 if (copied) { 980 if (pos + copied > size) 981 i_size_write(inode, pos + copied); 982 mark_inode_dirty(inode); 983 } 984 } else { 985 copied = copy_to_iter(iomap->inline_data + pos, length, iter); 986 } 987 dio->size += copied; 988 return copied; 989 } 990 --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 27637 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] iomap: Direct I/O for inline data 2018-06-27 0:39 ` [PATCH 1/1] " Andreas Gruenbacher 2018-06-27 1:44 ` kbuild test robot @ 2018-06-29 8:56 ` Christoph Hellwig 2018-06-29 14:40 ` Andreas Gruenbacher 1 sibling, 1 reply; 13+ messages in thread From: Christoph Hellwig @ 2018-06-29 8:56 UTC (permalink / raw) To: Andreas Gruenbacher Cc: cluster-devel, linux-fsdevel, linux-ext4, Christoph Hellwig This looks generally fine. But I think it might be worth refactoring iomap_dio_actor a bit first, e.g. something like this new patch before yours, which would also nicely solve your alignmnet concern (entirely untested for now): --- >From f8c58ffe79df63d23332376ce481cdc4753cc567 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig <hch@lst.de> Date: Fri, 29 Jun 2018 10:54:10 +0200 Subject: iomap: refactor iomap_dio_actor Split the function up into two helpers for the bio based I/O and hole case, and a small helper to call the two. This separates the code a little better in preparation for supporting I/O to inline data. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/iomap.c | 88 ++++++++++++++++++++++++++++++++---------------------- 1 file changed, 52 insertions(+), 36 deletions(-) diff --git a/fs/iomap.c b/fs/iomap.c index 7d1e9f45f098..f05c83773cbf 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -963,10 +963,9 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos, } static loff_t -iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, - void *data, struct iomap *iomap) +iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length, + struct iomap_dio *dio, struct iomap *iomap) { - struct iomap_dio *dio = data; unsigned int blkbits = blksize_bits(bdev_logical_block_size(iomap->bdev)); unsigned int fs_block_size = i_blocksize(inode), pad; unsigned int align = iov_iter_alignment(dio->submit.iter); @@ -980,41 +979,27 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, if ((pos | length | align) & ((1 << blkbits) - 1)) return -EINVAL; - switch (iomap->type) { - case IOMAP_HOLE: - if (WARN_ON_ONCE(dio->flags & IOMAP_DIO_WRITE)) - return -EIO; - /*FALLTHRU*/ - case IOMAP_UNWRITTEN: - if (!(dio->flags & IOMAP_DIO_WRITE)) { - length = iov_iter_zero(length, dio->submit.iter); - dio->size += length; - return length; - } + if (iomap->type == IOMAP_UNWRITTEN) { dio->flags |= IOMAP_DIO_UNWRITTEN; need_zeroout = true; - break; - case IOMAP_MAPPED: - if (iomap->flags & IOMAP_F_SHARED) - dio->flags |= IOMAP_DIO_COW; - if (iomap->flags & IOMAP_F_NEW) { - need_zeroout = true; - } else { - /* - * Use a FUA write if we need datasync semantics, this - * is a pure data IO that doesn't require any metadata - * updates and the underlying device supports FUA. This - * allows us to avoid cache flushes on IO completion. - */ - if (!(iomap->flags & (IOMAP_F_SHARED|IOMAP_F_DIRTY)) && - (dio->flags & IOMAP_DIO_WRITE_FUA) && - blk_queue_fua(bdev_get_queue(iomap->bdev))) - use_fua = true; - } - break; - default: - WARN_ON_ONCE(1); - return -EIO; + } + + if (iomap->flags & IOMAP_F_SHARED) + dio->flags |= IOMAP_DIO_COW; + + if (iomap->flags & IOMAP_F_NEW) { + need_zeroout = true; + } else { + /* + * Use a FUA write if we need datasync semantics, this + * is a pure data IO that doesn't require any metadata + * updates and the underlying device supports FUA. This + * allows us to avoid cache flushes on IO completion. + */ + if (!(iomap->flags & (IOMAP_F_SHARED|IOMAP_F_DIRTY)) && + (dio->flags & IOMAP_DIO_WRITE_FUA) && + blk_queue_fua(bdev_get_queue(iomap->bdev))) + use_fua = true; } /* @@ -1093,6 +1078,37 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, return copied; } +static loff_t +iomap_dio_hole_actor(loff_t length, struct iomap_dio *dio) +{ + length = iov_iter_zero(length, dio->submit.iter); + dio->size += length; + return length; +} + +static loff_t +iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, + void *data, struct iomap *iomap) +{ + struct iomap_dio *dio = data; + + switch (iomap->type) { + case IOMAP_HOLE: + if (WARN_ON_ONCE(dio->flags & IOMAP_DIO_WRITE)) + return -EIO; + return iomap_dio_hole_actor(length, dio); + case IOMAP_UNWRITTEN: + if (!(dio->flags & IOMAP_DIO_WRITE)) + return iomap_dio_hole_actor(length, dio); + return iomap_dio_bio_actor(inode, pos, length, dio, iomap); + case IOMAP_MAPPED: + return iomap_dio_bio_actor(inode, pos, length, dio, iomap); + default: + WARN_ON_ONCE(1); + return -EIO; + } +} + /* * iomap_dio_rw() always completes O_[D]SYNC writes regardless of whether the IO * is being issued as AIO or not. This allows us to optimise pure data writes -- 2.17.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] iomap: Direct I/O for inline data 2018-06-29 8:56 ` Christoph Hellwig @ 2018-06-29 14:40 ` Andreas Gruenbacher 2018-06-29 16:01 ` Christoph Hellwig 2018-07-01 6:29 ` Christoph Hellwig 0 siblings, 2 replies; 13+ messages in thread From: Andreas Gruenbacher @ 2018-06-29 14:40 UTC (permalink / raw) To: Christoph Hellwig; +Cc: cluster-devel, linux-fsdevel, linux-ext4 On 29 June 2018 at 10:56, Christoph Hellwig <hch@lst.de> wrote: > This looks generally fine. But I think it might be worth refactoring > iomap_dio_actor a bit first, e.g. something like this new patch > before yours, which would also nicely solve your alignmnet concern > (entirely untested for now): This looks correct. I've rebased my patches on top of it and I ran the xfstest auto group on gfs2 and xfs on top. Can you push this to your gfs2-iomap branch? I'll then repost an updated version of "iomap: Direct I/O for inline data". > --- > From f8c58ffe79df63d23332376ce481cdc4753cc567 Mon Sep 17 00:00:00 2001 > From: Christoph Hellwig <hch@lst.de> > Date: Fri, 29 Jun 2018 10:54:10 +0200 > Subject: iomap: refactor iomap_dio_actor > > Split the function up into two helpers for the bio based I/O and hole > case, and a small helper to call the two. This separates the code a > little better in preparation for supporting I/O to inline data. > > Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> > --- > fs/iomap.c | 88 ++++++++++++++++++++++++++++++++---------------------- > 1 file changed, 52 insertions(+), 36 deletions(-) > > diff --git a/fs/iomap.c b/fs/iomap.c > index 7d1e9f45f098..f05c83773cbf 100644 > --- a/fs/iomap.c > +++ b/fs/iomap.c > @@ -963,10 +963,9 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos, > } > > static loff_t > -iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, > - void *data, struct iomap *iomap) > +iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length, > + struct iomap_dio *dio, struct iomap *iomap) > { > - struct iomap_dio *dio = data; > unsigned int blkbits = blksize_bits(bdev_logical_block_size(iomap->bdev)); > unsigned int fs_block_size = i_blocksize(inode), pad; > unsigned int align = iov_iter_alignment(dio->submit.iter); > @@ -980,41 +979,27 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, > if ((pos | length | align) & ((1 << blkbits) - 1)) > return -EINVAL; > > - switch (iomap->type) { > - case IOMAP_HOLE: > - if (WARN_ON_ONCE(dio->flags & IOMAP_DIO_WRITE)) > - return -EIO; > - /*FALLTHRU*/ > - case IOMAP_UNWRITTEN: > - if (!(dio->flags & IOMAP_DIO_WRITE)) { > - length = iov_iter_zero(length, dio->submit.iter); > - dio->size += length; > - return length; > - } > + if (iomap->type == IOMAP_UNWRITTEN) { > dio->flags |= IOMAP_DIO_UNWRITTEN; > need_zeroout = true; > - break; > - case IOMAP_MAPPED: > - if (iomap->flags & IOMAP_F_SHARED) > - dio->flags |= IOMAP_DIO_COW; > - if (iomap->flags & IOMAP_F_NEW) { > - need_zeroout = true; > - } else { > - /* > - * Use a FUA write if we need datasync semantics, this > - * is a pure data IO that doesn't require any metadata > - * updates and the underlying device supports FUA. This > - * allows us to avoid cache flushes on IO completion. > - */ > - if (!(iomap->flags & (IOMAP_F_SHARED|IOMAP_F_DIRTY)) && > - (dio->flags & IOMAP_DIO_WRITE_FUA) && > - blk_queue_fua(bdev_get_queue(iomap->bdev))) > - use_fua = true; > - } > - break; > - default: > - WARN_ON_ONCE(1); > - return -EIO; > + } > + > + if (iomap->flags & IOMAP_F_SHARED) > + dio->flags |= IOMAP_DIO_COW; > + > + if (iomap->flags & IOMAP_F_NEW) { > + need_zeroout = true; > + } else { > + /* > + * Use a FUA write if we need datasync semantics, this > + * is a pure data IO that doesn't require any metadata > + * updates and the underlying device supports FUA. This > + * allows us to avoid cache flushes on IO completion. > + */ > + if (!(iomap->flags & (IOMAP_F_SHARED|IOMAP_F_DIRTY)) && > + (dio->flags & IOMAP_DIO_WRITE_FUA) && > + blk_queue_fua(bdev_get_queue(iomap->bdev))) > + use_fua = true; > } > > /* > @@ -1093,6 +1078,37 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, > return copied; > } > > +static loff_t > +iomap_dio_hole_actor(loff_t length, struct iomap_dio *dio) > +{ > + length = iov_iter_zero(length, dio->submit.iter); > + dio->size += length; > + return length; > +} Just a minor nit: iomap_dio_hole_actor should come before iomap_dio_bio_actor. > + > +static loff_t > +iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, > + void *data, struct iomap *iomap) > +{ > + struct iomap_dio *dio = data; > + > + switch (iomap->type) { > + case IOMAP_HOLE: > + if (WARN_ON_ONCE(dio->flags & IOMAP_DIO_WRITE)) > + return -EIO; > + return iomap_dio_hole_actor(length, dio); > + case IOMAP_UNWRITTEN: > + if (!(dio->flags & IOMAP_DIO_WRITE)) > + return iomap_dio_hole_actor(length, dio); > + return iomap_dio_bio_actor(inode, pos, length, dio, iomap); > + case IOMAP_MAPPED: > + return iomap_dio_bio_actor(inode, pos, length, dio, iomap); > + default: > + WARN_ON_ONCE(1); > + return -EIO; > + } > +} > + > /* > * iomap_dio_rw() always completes O_[D]SYNC writes regardless of whether the IO > * is being issued as AIO or not. This allows us to optimise pure data writes > -- > 2.17.1 > Thanks, Andreas ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] iomap: Direct I/O for inline data 2018-06-29 14:40 ` Andreas Gruenbacher @ 2018-06-29 16:01 ` Christoph Hellwig 2018-06-29 17:02 ` Andreas Gruenbacher 2018-07-01 6:29 ` Christoph Hellwig 1 sibling, 1 reply; 13+ messages in thread From: Christoph Hellwig @ 2018-06-29 16:01 UTC (permalink / raw) To: Andreas Gruenbacher Cc: cluster-devel, linux-fsdevel, linux-ext4, Christoph Hellwig On Fri, Jun 29, 2018 at 04:40:40PM +0200, Andreas Gruenbacher wrote: > On 29 June 2018 at 10:56, Christoph Hellwig <hch@lst.de> wrote: > > This looks generally fine. But I think it might be worth refactoring > > iomap_dio_actor a bit first, e.g. something like this new patch > > before yours, which would also nicely solve your alignmnet concern > > (entirely untested for now): > > This looks correct. I've rebased my patches on top of it and I ran the > xfstest auto group on gfs2 and xfs on top. > > Can you push this to your gfs2-iomap branch? I'll then repost an > updated version of "iomap: Direct I/O for inline data". Darrick now has a real iomap merge branch which replaced it. Before formally submitting the patch I'd also like to verify that it does not regress for XFS by doing a full xfstests run. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] iomap: Direct I/O for inline data 2018-06-29 16:01 ` Christoph Hellwig @ 2018-06-29 17:02 ` Andreas Gruenbacher 2018-07-01 6:13 ` Christoph Hellwig 0 siblings, 1 reply; 13+ messages in thread From: Andreas Gruenbacher @ 2018-06-29 17:02 UTC (permalink / raw) To: Darrick J. Wong Cc: cluster-devel, linux-fsdevel, linux-ext4, Christoph Hellwig On 29 June 2018 at 18:01, Christoph Hellwig <hch@lst.de> wrote: > On Fri, Jun 29, 2018 at 04:40:40PM +0200, Andreas Gruenbacher wrote: >> On 29 June 2018 at 10:56, Christoph Hellwig <hch@lst.de> wrote: >> > This looks generally fine. But I think it might be worth refactoring >> > iomap_dio_actor a bit first, e.g. something like this new patch >> > before yours, which would also nicely solve your alignmnet concern >> > (entirely untested for now): >> >> This looks correct. I've rebased my patches on top of it and I ran the >> xfstest auto group on gfs2 and xfs on top. >> >> Can you push this to your gfs2-iomap branch? I'll then repost an >> updated version of "iomap: Direct I/O for inline data". > > Darrick now has a real iomap merge branch which replaced it. Where is it? Not in https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git it seems. > Before formally submitting the patch I'd also like to verify that it does > not regress for XFS by doing a full xfstests run. Thanks, Andreas ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] iomap: Direct I/O for inline data 2018-06-29 17:02 ` Andreas Gruenbacher @ 2018-07-01 6:13 ` Christoph Hellwig 0 siblings, 0 replies; 13+ messages in thread From: Christoph Hellwig @ 2018-07-01 6:13 UTC (permalink / raw) To: Andreas Gruenbacher Cc: cluster-devel, linux-fsdevel, linux-ext4, Christoph Hellwig, Darrick J. Wong On Fri, Jun 29, 2018 at 07:02:07PM +0200, Andreas Gruenbacher wrote: > On 29 June 2018 at 18:01, Christoph Hellwig <hch@lst.de> wrote: > > On Fri, Jun 29, 2018 at 04:40:40PM +0200, Andreas Gruenbacher wrote: > >> On 29 June 2018 at 10:56, Christoph Hellwig <hch@lst.de> wrote: > >> > This looks generally fine. But I think it might be worth refactoring > >> > iomap_dio_actor a bit first, e.g. something like this new patch > >> > before yours, which would also nicely solve your alignmnet concern > >> > (entirely untested for now): > >> > >> This looks correct. I've rebased my patches on top of it and I ran the > >> xfstest auto group on gfs2 and xfs on top. > >> > >> Can you push this to your gfs2-iomap branch? I'll then repost an > >> updated version of "iomap: Direct I/O for inline data". > > > > Darrick now has a real iomap merge branch which replaced it. > > Where is it? Not in > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git > it seems. https://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git/log/?h=iomap-4.19-merge ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] iomap: Direct I/O for inline data 2018-06-29 14:40 ` Andreas Gruenbacher 2018-06-29 16:01 ` Christoph Hellwig @ 2018-07-01 6:29 ` Christoph Hellwig 2018-07-01 21:44 ` Andreas Gruenbacher 1 sibling, 1 reply; 13+ messages in thread From: Christoph Hellwig @ 2018-07-01 6:29 UTC (permalink / raw) To: Andreas Gruenbacher Cc: cluster-devel, linux-fsdevel, linux-ext4, Christoph Hellwig On Fri, Jun 29, 2018 at 04:40:40PM +0200, Andreas Gruenbacher wrote: > On 29 June 2018 at 10:56, Christoph Hellwig <hch@lst.de> wrote: > > This looks generally fine. But I think it might be worth refactoring > > iomap_dio_actor a bit first, e.g. something like this new patch > > before yours, which would also nicely solve your alignmnet concern > > (entirely untested for now): > > This looks correct. I've rebased my patches on top of it and I ran the > xfstest auto group on gfs2 and xfs on top. As I've just been rebasing the iomap work I've done the work already, does the version below work for you? --- >From 5e8a0f157629bb8850b8d8fe049bb896730f0da7 Mon Sep 17 00:00:00 2001 From: Andreas Gruenbacher <agruenba@redhat.com> Date: Sun, 1 Jul 2018 08:26:22 +0200 Subject: iomap: support direct I/O to inline data Add support for reading from and writing to inline data to iomap_dio_rw. This saves filesystems from having to implement fallback code for this case. The inline data is actually cached in the inode, so the I/O is only direct in the sense that it doesn't go through the page cache. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> --- fs/iomap.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/fs/iomap.c b/fs/iomap.c index 4d8ff0f5ecc9..98a1fdd5c091 100644 --- a/fs/iomap.c +++ b/fs/iomap.c @@ -1450,6 +1450,33 @@ iomap_dio_hole_actor(loff_t length, struct iomap_dio *dio) return length; } +static loff_t +iomap_dio_inline_actor(struct inode *inode, loff_t pos, loff_t length, + struct iomap_dio *dio, struct iomap *iomap) +{ + struct iov_iter *iter = dio->submit.iter; + size_t copied; + + BUG_ON(pos + length > PAGE_SIZE - offset_in_page(iomap->inline_data)); + + if (dio->flags & IOMAP_DIO_WRITE) { + loff_t size = inode->i_size; + + if (pos > size) + memset(iomap->inline_data + size, 0, pos - size); + copied = copy_from_iter(iomap->inline_data + pos, length, iter); + if (copied) { + if (pos + copied > size) + i_size_write(inode, pos + copied); + mark_inode_dirty(inode); + } + } else { + copied = copy_to_iter(iomap->inline_data + pos, length, iter); + } + dio->size += copied; + return copied; +} + static loff_t iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, void *data, struct iomap *iomap) @@ -1467,6 +1494,8 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, return iomap_dio_bio_actor(inode, pos, length, dio, iomap); case IOMAP_MAPPED: return iomap_dio_bio_actor(inode, pos, length, dio, iomap); + case IOMAP_INLINE: + return iomap_dio_inline_actor(inode, pos, length, dio, iomap); default: WARN_ON_ONCE(1); return -EIO; -- 2.18.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 1/1] iomap: Direct I/O for inline data 2018-07-01 6:29 ` Christoph Hellwig @ 2018-07-01 21:44 ` Andreas Gruenbacher 0 siblings, 0 replies; 13+ messages in thread From: Andreas Gruenbacher @ 2018-07-01 21:44 UTC (permalink / raw) To: Christoph Hellwig; +Cc: cluster-devel, linux-fsdevel, linux-ext4 On 1 July 2018 at 08:29, Christoph Hellwig <hch@lst.de> wrote: > On Fri, Jun 29, 2018 at 04:40:40PM +0200, Andreas Gruenbacher wrote: >> On 29 June 2018 at 10:56, Christoph Hellwig <hch@lst.de> wrote: >> > This looks generally fine. But I think it might be worth refactoring >> > iomap_dio_actor a bit first, e.g. something like this new patch >> > before yours, which would also nicely solve your alignmnet concern >> > (entirely untested for now): >> >> This looks correct. I've rebased my patches on top of it and I ran the >> xfstest auto group on gfs2 and xfs on top. > > As I've just been rebasing the iomap work I've done the work already, > does the version below work for you? Yes, it does. > --- > From 5e8a0f157629bb8850b8d8fe049bb896730f0da7 Mon Sep 17 00:00:00 2001 > From: Andreas Gruenbacher <agruenba@redhat.com> > Date: Sun, 1 Jul 2018 08:26:22 +0200 > Subject: iomap: support direct I/O to inline data > > Add support for reading from and writing to inline data to iomap_dio_rw. > This saves filesystems from having to implement fallback code for this > case. > > The inline data is actually cached in the inode, so the I/O is only > direct in the sense that it doesn't go through the page cache. > > Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> > --- > fs/iomap.c | 29 +++++++++++++++++++++++++++++ > 1 file changed, 29 insertions(+) > > diff --git a/fs/iomap.c b/fs/iomap.c > index 4d8ff0f5ecc9..98a1fdd5c091 100644 > --- a/fs/iomap.c > +++ b/fs/iomap.c > @@ -1450,6 +1450,33 @@ iomap_dio_hole_actor(loff_t length, struct iomap_dio *dio) > return length; > } > > +static loff_t > +iomap_dio_inline_actor(struct inode *inode, loff_t pos, loff_t length, > + struct iomap_dio *dio, struct iomap *iomap) > +{ > + struct iov_iter *iter = dio->submit.iter; > + size_t copied; > + > + BUG_ON(pos + length > PAGE_SIZE - offset_in_page(iomap->inline_data)); > + > + if (dio->flags & IOMAP_DIO_WRITE) { > + loff_t size = inode->i_size; > + > + if (pos > size) > + memset(iomap->inline_data + size, 0, pos - size); > + copied = copy_from_iter(iomap->inline_data + pos, length, iter); > + if (copied) { > + if (pos + copied > size) > + i_size_write(inode, pos + copied); > + mark_inode_dirty(inode); > + } > + } else { > + copied = copy_to_iter(iomap->inline_data + pos, length, iter); > + } > + dio->size += copied; > + return copied; > +} > + > static loff_t > iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, > void *data, struct iomap *iomap) > @@ -1467,6 +1494,8 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, > return iomap_dio_bio_actor(inode, pos, length, dio, iomap); > case IOMAP_MAPPED: > return iomap_dio_bio_actor(inode, pos, length, dio, iomap); > + case IOMAP_INLINE: > + return iomap_dio_inline_actor(inode, pos, length, dio, iomap); > default: > WARN_ON_ONCE(1); > return -EIO; > -- > 2.18.0 > Thanks, Andreas ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 0/1] iomap: Direct I/O for inline data 2018-06-27 0:39 [PATCH 0/1] iomap: Direct I/O for inline data Andreas Gruenbacher 2018-06-27 0:39 ` [PATCH 1/1] " Andreas Gruenbacher @ 2018-06-27 11:14 ` Andreas Gruenbacher 2018-06-29 8:43 ` Christoph Hellwig 1 sibling, 1 reply; 13+ messages in thread From: Andreas Gruenbacher @ 2018-06-27 11:14 UTC (permalink / raw) To: Christoph Hellwig; +Cc: cluster-devel, linux-fsdevel, linux-ext4 On 27 June 2018 at 02:39, Andreas Gruenbacher <agruenba@redhat.com> wrote: > Here's a patch that implements direct I/O for inline data. Direct I/O > to inline data is a bit weird because it's not direct in the usual > sense, but since Christoph's been asking for it ... > > The usual alignment restrictions to the logical block size of the > underlying block device still apply. I don't see a reason for changing > that; the resulting behavior would only become very weird for no > benefit. > > I've tested this against a hacked-up version of gfs2. However, the > "real" gfs2 will keep falling back to buffered I/O for writes to inline > data: gfs2 takes a shared lock during direct I/O, and writing to the > inode under that shared lock is not allowed. Ext4 may become the first > actual user of this part of the patch. One further issue is the alignment check in iomap_dio_actor: > if ((pos | length | align) & ((1 << blkbits) - 1)) > return -EINVAL; For inline data, iomap->length is set to the file size. iomap_apply truncates the requested length down to that, so iomap_dio_actor sees the truncated length instead of the requested length and fails with -EINVAL. This causes tests like the following to fail (also see xfstest generic/120): xfs_io -fd -c 'truncate 300' -c 'pread -v 0 4096' /mnt/test/foo A possible fix is to change the alignment check in iomap_dio_actor as follows: - if ((pos | length | align) & ((1 << blkbits) - 1)) + if ((pos | align) & ((1 << blkbits) - 1)) + return -EINVAL; + if (length & ((1 << blkbits) - 1) && + pos + length != iomap->offset + iomap->length) return -EINVAL; Moving the alignment check from iomap_dio_actor to iomap_dio_rw isn't that easy because iomap->bdev isn't known there. Any thoughts? Thanks, Andreas ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 0/1] iomap: Direct I/O for inline data 2018-06-27 11:14 ` [PATCH 0/1] " Andreas Gruenbacher @ 2018-06-29 8:43 ` Christoph Hellwig 2018-06-29 11:01 ` Andreas Gruenbacher 0 siblings, 1 reply; 13+ messages in thread From: Christoph Hellwig @ 2018-06-29 8:43 UTC (permalink / raw) To: Andreas Gruenbacher Cc: cluster-devel, linux-fsdevel, linux-ext4, Christoph Hellwig > A possible fix is to change the alignment check in iomap_dio_actor as follows: > > - if ((pos | length | align) & ((1 << blkbits) - 1)) > + if ((pos | align) & ((1 << blkbits) - 1)) > + return -EINVAL; > + if (length & ((1 << blkbits) - 1) && > + pos + length != iomap->offset + iomap->length) > return -EINVAL; > > Moving the alignment check from iomap_dio_actor to iomap_dio_rw isn't > that easy because iomap->bdev isn't known there. Just make the check conditional on iomap->type != IOMAP_INLINE as alignment checks on inline data don't make much sense. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 0/1] iomap: Direct I/O for inline data 2018-06-29 8:43 ` Christoph Hellwig @ 2018-06-29 11:01 ` Andreas Gruenbacher 0 siblings, 0 replies; 13+ messages in thread From: Andreas Gruenbacher @ 2018-06-29 11:01 UTC (permalink / raw) To: Christoph Hellwig; +Cc: cluster-devel, linux-fsdevel, linux-ext4 On 29 June 2018 at 10:43, Christoph Hellwig <hch@lst.de> wrote: >> A possible fix is to change the alignment check in iomap_dio_actor as follows: >> >> - if ((pos | length | align) & ((1 << blkbits) - 1)) >> + if ((pos | align) & ((1 << blkbits) - 1)) >> + return -EINVAL; >> + if (length & ((1 << blkbits) - 1) && >> + pos + length != iomap->offset + iomap->length) >> return -EINVAL; >> >> Moving the alignment check from iomap_dio_actor to iomap_dio_rw isn't >> that easy because iomap->bdev isn't known there. > > Just make the check conditional on iomap->type != IOMAP_INLINE > as alignment checks on inline data don't make much sense. Yeah, probably. Thanks, Andreas ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2018-07-01 21:44 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-06-27 0:39 [PATCH 0/1] iomap: Direct I/O for inline data Andreas Gruenbacher 2018-06-27 0:39 ` [PATCH 1/1] " Andreas Gruenbacher 2018-06-27 1:44 ` kbuild test robot 2018-06-29 8:56 ` Christoph Hellwig 2018-06-29 14:40 ` Andreas Gruenbacher 2018-06-29 16:01 ` Christoph Hellwig 2018-06-29 17:02 ` Andreas Gruenbacher 2018-07-01 6:13 ` Christoph Hellwig 2018-07-01 6:29 ` Christoph Hellwig 2018-07-01 21:44 ` Andreas Gruenbacher 2018-06-27 11:14 ` [PATCH 0/1] " Andreas Gruenbacher 2018-06-29 8:43 ` Christoph Hellwig 2018-06-29 11:01 ` Andreas Gruenbacher
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).