* reflink direct I/O improvements V2 @ 2017-02-01 21:42 Christoph Hellwig 2017-02-01 21:42 ` [PATCH 1/4] xfs: reject all unaligned direct writes to reflinked files Christoph Hellwig ` (3 more replies) 0 siblings, 4 replies; 10+ messages in thread From: Christoph Hellwig @ 2017-02-01 21:42 UTC (permalink / raw) To: linux-xfs This series improves the reflink direct I/O path, by avoiding a pointless detour through delayed allocations and doing the allocations directly from the iomap code. Changes since V1: - align the alignment code with the regular direct I/O path - rebase on top the patches from Darrick to use unwritten extents in the direct I/O COW path ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/4] xfs: reject all unaligned direct writes to reflinked files 2017-02-01 21:42 reflink direct I/O improvements V2 Christoph Hellwig @ 2017-02-01 21:42 ` Christoph Hellwig 2017-02-02 23:01 ` Darrick J. Wong 2017-02-01 21:42 ` [PATCH 2/4] xfs: introduce xfs_aligned_fsb_count Christoph Hellwig ` (2 subsequent siblings) 3 siblings, 1 reply; 10+ messages in thread From: Christoph Hellwig @ 2017-02-01 21:42 UTC (permalink / raw) To: linux-xfs We currently fall back from direct to buffered writes if we detect a remaining shared extent in the iomap_begin callback. But by the time iomap_begin is called for the potentially unaligned end block we might have already written most of the data to disk, which we'd now write again using buffered I/O. To avoid this reject all writes to reflinked files before starting I/O so that we are guaranteed to only write the data once. The alternative would be to unshare the unaligned start and/or end block before doing the I/O. I think that's doable, and will actually be required to support reflinks on DAX file system. But it will take a little more time and I'd rather get rid of the double write ASAP. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> --- fs/xfs/xfs_file.c | 7 +++++++ fs/xfs/xfs_iomap.c | 12 +----------- 2 files changed, 8 insertions(+), 11 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index bbb9eb6..06236bf 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -527,6 +527,13 @@ xfs_file_dio_aio_write( if ((iocb->ki_pos & mp->m_blockmask) || ((iocb->ki_pos + count) & mp->m_blockmask)) { unaligned_io = 1; + + /* + * We can't properly handle unaligned direct I/O to reflink + * files yet, as we can't unshare a partial block. + */ + if (xfs_is_reflink_inode(ip)) + return -EREMCHG; iolock = XFS_IOLOCK_EXCL; } else { iolock = XFS_IOLOCK_SHARED; diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 767208f..d17aa3b 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -1026,17 +1026,7 @@ xfs_file_iomap_begin( if (error) goto out_unlock; - /* - * We're here because we're trying to do a directio write to a - * region that isn't aligned to a filesystem block. If the - * extent is shared, fall back to buffered mode to handle the - * RMW. - */ - if (!(flags & IOMAP_REPORT) && shared) { - trace_xfs_reflink_bounce_dio_write(ip, &imap); - error = -EREMCHG; - goto out_unlock; - } + ASSERT((flags & IOMAP_REPORT) || !shared); } if ((flags & (IOMAP_WRITE | IOMAP_ZERO)) && xfs_is_reflink_inode(ip)) { -- 2.1.4 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/4] xfs: reject all unaligned direct writes to reflinked files 2017-02-01 21:42 ` [PATCH 1/4] xfs: reject all unaligned direct writes to reflinked files Christoph Hellwig @ 2017-02-02 23:01 ` Darrick J. Wong 0 siblings, 0 replies; 10+ messages in thread From: Darrick J. Wong @ 2017-02-02 23:01 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-xfs On Wed, Feb 01, 2017 at 10:42:36PM +0100, Christoph Hellwig wrote: > We currently fall back from direct to buffered writes if we detect a > remaining shared extent in the iomap_begin callback. But by the time > iomap_begin is called for the potentially unaligned end block we might > have already written most of the data to disk, which we'd now write > again using buffered I/O. To avoid this reject all writes to reflinked > files before starting I/O so that we are guaranteed to only write the > data once. > > The alternative would be to unshare the unaligned start and/or end block > before doing the I/O. I think that's doable, and will actually be > required to support reflinks on DAX file system. But it will take a > little more time and I'd rather get rid of the double write ASAP. > > Signed-off-by: Christoph Hellwig <hch@lst.de> > Reviewed-by: Brian Foster <bfoster@redhat.com> > --- > fs/xfs/xfs_file.c | 7 +++++++ > fs/xfs/xfs_iomap.c | 12 +----------- > 2 files changed, 8 insertions(+), 11 deletions(-) > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > index bbb9eb6..06236bf 100644 > --- a/fs/xfs/xfs_file.c > +++ b/fs/xfs/xfs_file.c > @@ -527,6 +527,13 @@ xfs_file_dio_aio_write( > if ((iocb->ki_pos & mp->m_blockmask) || > ((iocb->ki_pos + count) & mp->m_blockmask)) { > unaligned_io = 1; > + > + /* > + * We can't properly handle unaligned direct I/O to reflink > + * files yet, as we can't unshare a partial block. > + */ > + if (xfs_is_reflink_inode(ip)) > + return -EREMCHG; > iolock = XFS_IOLOCK_EXCL; > } else { > iolock = XFS_IOLOCK_SHARED; > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c > index 767208f..d17aa3b 100644 > --- a/fs/xfs/xfs_iomap.c > +++ b/fs/xfs/xfs_iomap.c > @@ -1026,17 +1026,7 @@ xfs_file_iomap_begin( > if (error) > goto out_unlock; > > - /* > - * We're here because we're trying to do a directio write to a > - * region that isn't aligned to a filesystem block. If the > - * extent is shared, fall back to buffered mode to handle the > - * RMW. > - */ > - if (!(flags & IOMAP_REPORT) && shared) { > - trace_xfs_reflink_bounce_dio_write(ip, &imap); Can you either port the tracepoint to the new hunk or just kill it entirely? (If you opt for killing it, I'll just remove it from xfs_trace.h when I import this patch.) Otherwise, Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> --D > - error = -EREMCHG; > - goto out_unlock; > - } > + ASSERT((flags & IOMAP_REPORT) || !shared); > } > > if ((flags & (IOMAP_WRITE | IOMAP_ZERO)) && xfs_is_reflink_inode(ip)) { > -- > 2.1.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/4] xfs: introduce xfs_aligned_fsb_count 2017-02-01 21:42 reflink direct I/O improvements V2 Christoph Hellwig 2017-02-01 21:42 ` [PATCH 1/4] xfs: reject all unaligned direct writes to reflinked files Christoph Hellwig @ 2017-02-01 21:42 ` Christoph Hellwig 2017-02-02 22:56 ` Darrick J. Wong 2017-02-01 21:42 ` [PATCH 3/4] xfs: go straight to real allocations for direct I/O COW writes Christoph Hellwig 2017-02-01 21:42 ` [PATCH 4/4] xfs: allocate direct I/O COW blocks in iomap_begin Christoph Hellwig 3 siblings, 1 reply; 10+ messages in thread From: Christoph Hellwig @ 2017-02-01 21:42 UTC (permalink / raw) To: linux-xfs Factor a helper to calculate the extent-size aligned block out of the iomap code, so that it can be reused by the upcoming reflink dio code. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_iomap.c | 11 ++--------- fs/xfs/xfs_iomap.h | 20 ++++++++++++++++++++ 2 files changed, 22 insertions(+), 9 deletions(-) diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index d17aa3b..9d811eb 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -162,7 +162,7 @@ xfs_iomap_write_direct( xfs_fileoff_t last_fsb; xfs_filblks_t count_fsb, resaligned; xfs_fsblock_t firstfsb; - xfs_extlen_t extsz, temp; + xfs_extlen_t extsz; int nimaps; int quota_flag; int rt; @@ -203,14 +203,7 @@ xfs_iomap_write_direct( } count_fsb = last_fsb - offset_fsb; ASSERT(count_fsb > 0); - - resaligned = count_fsb; - if (unlikely(extsz)) { - if ((temp = do_mod(offset_fsb, extsz))) - resaligned += temp; - if ((temp = do_mod(resaligned, extsz))) - resaligned += extsz - temp; - } + resaligned = xfs_aligned_fsb_count(offset_fsb, count_fsb, extsz); if (unlikely(rt)) { resrtextents = qblocks = resaligned; diff --git a/fs/xfs/xfs_iomap.h b/fs/xfs/xfs_iomap.h index 6d45cf0..1fef4a5 100644 --- a/fs/xfs/xfs_iomap.h +++ b/fs/xfs/xfs_iomap.h @@ -33,6 +33,26 @@ void xfs_bmbt_to_iomap(struct xfs_inode *, struct iomap *, struct xfs_bmbt_irec *); xfs_extlen_t xfs_eof_alignment(struct xfs_inode *ip, xfs_extlen_t extsize); +static inline xfs_filblks_t +xfs_aligned_fsb_count( + xfs_fileoff_t offset_fsb, + xfs_filblks_t count_fsb, + xfs_extlen_t extsz) +{ + if (extsz) { + xfs_extlen_t align; + + align = do_mod(offset_fsb, extsz); + if (align) + count_fsb += align; + align = do_mod(count_fsb, extsz); + if (align) + count_fsb += extsz - align; + } + + return count_fsb; +} + extern struct iomap_ops xfs_iomap_ops; extern struct iomap_ops xfs_xattr_iomap_ops; -- 2.1.4 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 2/4] xfs: introduce xfs_aligned_fsb_count 2017-02-01 21:42 ` [PATCH 2/4] xfs: introduce xfs_aligned_fsb_count Christoph Hellwig @ 2017-02-02 22:56 ` Darrick J. Wong 0 siblings, 0 replies; 10+ messages in thread From: Darrick J. Wong @ 2017-02-02 22:56 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-xfs On Wed, Feb 01, 2017 at 10:42:37PM +0100, Christoph Hellwig wrote: > Factor a helper to calculate the extent-size aligned block out of the > iomap code, so that it can be reused by the upcoming reflink dio code. > > Signed-off-by: Christoph Hellwig <hch@lst.de> Looks like a reasonable enough code replacement, so: Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> --D > --- > fs/xfs/xfs_iomap.c | 11 ++--------- > fs/xfs/xfs_iomap.h | 20 ++++++++++++++++++++ > 2 files changed, 22 insertions(+), 9 deletions(-) > > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c > index d17aa3b..9d811eb 100644 > --- a/fs/xfs/xfs_iomap.c > +++ b/fs/xfs/xfs_iomap.c > @@ -162,7 +162,7 @@ xfs_iomap_write_direct( > xfs_fileoff_t last_fsb; > xfs_filblks_t count_fsb, resaligned; > xfs_fsblock_t firstfsb; > - xfs_extlen_t extsz, temp; > + xfs_extlen_t extsz; > int nimaps; > int quota_flag; > int rt; > @@ -203,14 +203,7 @@ xfs_iomap_write_direct( > } > count_fsb = last_fsb - offset_fsb; > ASSERT(count_fsb > 0); > - > - resaligned = count_fsb; > - if (unlikely(extsz)) { > - if ((temp = do_mod(offset_fsb, extsz))) > - resaligned += temp; > - if ((temp = do_mod(resaligned, extsz))) > - resaligned += extsz - temp; > - } > + resaligned = xfs_aligned_fsb_count(offset_fsb, count_fsb, extsz); > > if (unlikely(rt)) { > resrtextents = qblocks = resaligned; > diff --git a/fs/xfs/xfs_iomap.h b/fs/xfs/xfs_iomap.h > index 6d45cf0..1fef4a5 100644 > --- a/fs/xfs/xfs_iomap.h > +++ b/fs/xfs/xfs_iomap.h > @@ -33,6 +33,26 @@ void xfs_bmbt_to_iomap(struct xfs_inode *, struct iomap *, > struct xfs_bmbt_irec *); > xfs_extlen_t xfs_eof_alignment(struct xfs_inode *ip, xfs_extlen_t extsize); > > +static inline xfs_filblks_t > +xfs_aligned_fsb_count( > + xfs_fileoff_t offset_fsb, > + xfs_filblks_t count_fsb, > + xfs_extlen_t extsz) > +{ > + if (extsz) { > + xfs_extlen_t align; > + > + align = do_mod(offset_fsb, extsz); > + if (align) > + count_fsb += align; > + align = do_mod(count_fsb, extsz); > + if (align) > + count_fsb += extsz - align; > + } > + > + return count_fsb; > +} > + > extern struct iomap_ops xfs_iomap_ops; > extern struct iomap_ops xfs_xattr_iomap_ops; > > -- > 2.1.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 3/4] xfs: go straight to real allocations for direct I/O COW writes 2017-02-01 21:42 reflink direct I/O improvements V2 Christoph Hellwig 2017-02-01 21:42 ` [PATCH 1/4] xfs: reject all unaligned direct writes to reflinked files Christoph Hellwig 2017-02-01 21:42 ` [PATCH 2/4] xfs: introduce xfs_aligned_fsb_count Christoph Hellwig @ 2017-02-01 21:42 ` Christoph Hellwig 2017-02-01 21:42 ` [PATCH 4/4] xfs: allocate direct I/O COW blocks in iomap_begin Christoph Hellwig 3 siblings, 0 replies; 10+ messages in thread From: Christoph Hellwig @ 2017-02-01 21:42 UTC (permalink / raw) To: linux-xfs When we allocate COW fork blocks for direct I/O writes we currently first create a delayed allocation, and then convert it to a real allocation once we've got the delayed one. As there is no good reason for that this patch instead makes use call xfs_bmapi_write from the COW allocation path. The only interesting bits are a few tweaks the low-level allocator to allow for this, most notably the need to remove the call to xfs_bmap_extsize_align for the cowextsize in xfs_bmap_btalloc - for the existing convert case it's a no-op, but for the direct allocation case it would blow up our block reservation way beyond what we reserved for the transaction. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/libxfs/xfs_bmap.c | 3 +- fs/xfs/xfs_reflink.c | 94 +++++++++++++++++++++++++++++++++--------------- 2 files changed, 68 insertions(+), 29 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 1566e9d..2b4ad84 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -2909,13 +2909,14 @@ xfs_bmap_add_extent_hole_real( ASSERT(!isnullstartblock(new->br_startblock)); ASSERT(!bma->cur || !(bma->cur->bc_private.b.flags & XFS_BTCUR_BPRV_WASDEL)); - ASSERT(whichfork != XFS_COW_FORK); XFS_STATS_INC(mp, xs_add_exlist); state = 0; if (whichfork == XFS_ATTR_FORK) state |= BMAP_ATTRFORK; + if (whichfork == XFS_COW_FORK) + state |= BMAP_COWFORK; /* * Check and set flags if this segment has a left neighbor. diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index a2e1fff..1a96741 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -391,62 +391,100 @@ __xfs_reflink_allocate_cow( xfs_fileoff_t end_fsb) { struct xfs_mount *mp = ip->i_mount; - struct xfs_bmbt_irec imap; + struct xfs_bmbt_irec imap, got; struct xfs_defer_ops dfops; struct xfs_trans *tp; xfs_fsblock_t first_block; - int nimaps = 1, error; - bool shared; + int nimaps, error, lockmode; + bool shared, trimmed; + xfs_filblks_t resaligned; + xfs_extlen_t resblks; + xfs_extnum_t idx; - xfs_defer_init(&dfops, &first_block); + resaligned = xfs_aligned_fsb_count(*offset_fsb, end_fsb - *offset_fsb, + xfs_get_cowextsz_hint(ip)); + resblks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned); - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, 0, 0, - XFS_TRANS_RESERVE, &tp); + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp); if (error) return error; - xfs_ilock(ip, XFS_ILOCK_EXCL); + lockmode = XFS_ILOCK_EXCL; + xfs_ilock(ip, lockmode); - /* Read extent from the source file. */ - nimaps = 1; - error = xfs_bmapi_read(ip, *offset_fsb, end_fsb - *offset_fsb, - &imap, &nimaps, 0); - if (error) - goto out_unlock; - ASSERT(nimaps == 1); + /* + * Even if the extent is not shared we might have a preallocation for + * it in the COW fork. If so use it. + */ + if (xfs_iext_lookup_extent(ip, ip->i_cowfp, *offset_fsb, &idx, &got) && + got.br_startoff <= *offset_fsb) { + /* If we have a real allocation in the COW fork we're done. */ + if (!isnullstartblock(got.br_startblock)) { + xfs_trim_extent(&got, *offset_fsb, + end_fsb - *offset_fsb); + *offset_fsb = got.br_startoff + got.br_blockcount; + goto out_trans_cancel; + } + } else { + nimaps = 1; + error = xfs_bmapi_read(ip, *offset_fsb, end_fsb - *offset_fsb, + &imap, &nimaps, 0); + if (error) + goto out_trans_cancel; + ASSERT(nimaps == 1); + + /* Trim the mapping to the nearest shared extent boundary. */ + error = xfs_reflink_trim_around_shared(ip, &imap, &shared, + &trimmed); + if (error) + goto out_trans_cancel; + + if (!shared) { + *offset_fsb = imap.br_startoff + imap.br_blockcount; + goto out_trans_cancel; + } - /* Make sure there's a CoW reservation for it. */ - error = xfs_reflink_reserve_cow(ip, &imap, &shared); + *offset_fsb = imap.br_startoff; + end_fsb = imap.br_startoff + imap.br_blockcount; + } + + error = xfs_trans_reserve_quota_nblks(tp, ip, resblks, 0, + XFS_QMOPT_RES_REGBLKS); if (error) goto out_trans_cancel; - if (!shared) { - *offset_fsb = imap.br_startoff + imap.br_blockcount; - goto out_trans_cancel; - } + xfs_trans_ijoin(tp, ip, 0); + + xfs_defer_init(&dfops, &first_block); + nimaps = 1; /* Allocate the entire reservation as unwritten blocks. */ - xfs_trans_ijoin(tp, ip, 0); - error = xfs_bmapi_write(tp, ip, imap.br_startoff, imap.br_blockcount, + error = xfs_bmapi_write(tp, ip, *offset_fsb, end_fsb - *offset_fsb, XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC, &first_block, - XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK), - &imap, &nimaps, &dfops); + resblks, &imap, &nimaps, &dfops); if (error) - goto out_trans_cancel; + goto out_bmap_cancel; /* Finish up. */ error = xfs_defer_finish(&tp, &dfops, NULL); if (error) - goto out_trans_cancel; + goto out_bmap_cancel; error = xfs_trans_commit(tp); + if (error) + goto out_unlock; *offset_fsb = imap.br_startoff + imap.br_blockcount; + out_unlock: - xfs_iunlock(ip, XFS_ILOCK_EXCL); + xfs_iunlock(ip, lockmode); return error; -out_trans_cancel: + +out_bmap_cancel: xfs_defer_cancel(&dfops); + xfs_trans_unreserve_quota_nblks(tp, ip, (long)resblks, 0, + XFS_QMOPT_RES_REGBLKS); +out_trans_cancel: xfs_trans_cancel(tp); goto out_unlock; } -- 2.1.4 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 4/4] xfs: allocate direct I/O COW blocks in iomap_begin 2017-02-01 21:42 reflink direct I/O improvements V2 Christoph Hellwig ` (2 preceding siblings ...) 2017-02-01 21:42 ` [PATCH 3/4] xfs: go straight to real allocations for direct I/O COW writes Christoph Hellwig @ 2017-02-01 21:42 ` Christoph Hellwig 2017-02-03 0:55 ` Darrick J. Wong 3 siblings, 1 reply; 10+ messages in thread From: Christoph Hellwig @ 2017-02-01 21:42 UTC (permalink / raw) To: linux-xfs Instead of preallocating all the required COW blocks in the high-level write code do it inside the iomap code, like we do for all other I/O. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_file.c | 8 --- fs/xfs/xfs_iomap.c | 31 +++++------ fs/xfs/xfs_reflink.c | 147 ++++++++++++++++++++------------------------------- fs/xfs/xfs_reflink.h | 5 +- fs/xfs/xfs_trace.h | 2 - 5 files changed, 71 insertions(+), 122 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 06236bf..56ac5b7 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -559,14 +559,6 @@ xfs_file_dio_aio_write( } trace_xfs_file_direct_write(ip, count, iocb->ki_pos); - - /* If this is a block-aligned directio CoW, remap immediately. */ - if (xfs_is_reflink_inode(ip) && !unaligned_io) { - ret = xfs_reflink_allocate_cow_range(ip, iocb->ki_pos, count); - if (ret) - goto out; - } - ret = iomap_dio_rw(iocb, from, &xfs_iomap_ops, xfs_dio_write_end_io); out: xfs_iunlock(ip, iolock); diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 9d811eb..de717e7 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -995,37 +995,31 @@ xfs_file_iomap_begin( offset_fsb = XFS_B_TO_FSBT(mp, offset); end_fsb = XFS_B_TO_FSB(mp, offset + length); - if (xfs_is_reflink_inode(ip) && - (flags & IOMAP_WRITE) && (flags & IOMAP_DIRECT)) { - shared = xfs_reflink_find_cow_mapping(ip, offset, &imap); - if (shared) { - xfs_iunlock(ip, lockmode); - goto alloc_done; - } - ASSERT(!isnullstartblock(imap.br_startblock)); - } - error = xfs_bmapi_read(ip, offset_fsb, end_fsb - offset_fsb, &imap, &nimaps, 0); if (error) goto out_unlock; - if ((flags & IOMAP_REPORT) || - (xfs_is_reflink_inode(ip) && - (flags & IOMAP_WRITE) && (flags & IOMAP_DIRECT))) { + if (flags & IOMAP_REPORT) { /* Trim the mapping to the nearest shared extent boundary. */ error = xfs_reflink_trim_around_shared(ip, &imap, &shared, &trimmed); if (error) goto out_unlock; - - ASSERT((flags & IOMAP_REPORT) || !shared); } if ((flags & (IOMAP_WRITE | IOMAP_ZERO)) && xfs_is_reflink_inode(ip)) { - error = xfs_reflink_reserve_cow(ip, &imap, &shared); - if (error) - goto out_unlock; + if (flags & IOMAP_DIRECT) { + /* may drop and re-acquire the ilock */ + error = xfs_reflink_allocate_cow(ip, offset_fsb, end_fsb, + &imap, &shared, &lockmode); + if (error) + goto out_unlock; + } else { + error = xfs_reflink_reserve_cow(ip, &imap, &shared); + if (error) + goto out_unlock; + } end_fsb = imap.br_startoff + imap.br_blockcount; length = XFS_FSB_TO_B(mp, end_fsb) - offset; @@ -1054,7 +1048,6 @@ xfs_file_iomap_begin( if (error) return error; -alloc_done: iomap->flags = IOMAP_F_NEW; trace_xfs_iomap_alloc(ip, offset, length, 0, &imap); } else { diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 1a96741..d5a2cf2 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -384,74 +384,84 @@ xfs_reflink_convert_cow( } /* Allocate all CoW reservations covering a range of blocks in a file. */ -static int -__xfs_reflink_allocate_cow( +int +xfs_reflink_allocate_cow( struct xfs_inode *ip, - xfs_fileoff_t *offset_fsb, - xfs_fileoff_t end_fsb) + xfs_fileoff_t offset_fsb, + xfs_fileoff_t end_fsb, + struct xfs_bmbt_irec *imap, + bool *shared, + uint *lockmode) { struct xfs_mount *mp = ip->i_mount; - struct xfs_bmbt_irec imap, got; + struct xfs_bmbt_irec got; struct xfs_defer_ops dfops; - struct xfs_trans *tp; + struct xfs_trans *tp = NULL; xfs_fsblock_t first_block; - int nimaps, error, lockmode; - bool shared, trimmed; + int nimaps, error = 0; + bool trimmed; xfs_filblks_t resaligned; - xfs_extlen_t resblks; + xfs_extlen_t resblks = 0; xfs_extnum_t idx; - resaligned = xfs_aligned_fsb_count(*offset_fsb, end_fsb - *offset_fsb, - xfs_get_cowextsz_hint(ip)); - resblks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned); - - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp); - if (error) - return error; - - lockmode = XFS_ILOCK_EXCL; - xfs_ilock(ip, lockmode); +retry: + ASSERT(xfs_is_reflink_inode(ip)); + ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL | XFS_ILOCK_SHARED)); /* * Even if the extent is not shared we might have a preallocation for * it in the COW fork. If so use it. */ - if (xfs_iext_lookup_extent(ip, ip->i_cowfp, *offset_fsb, &idx, &got) && - got.br_startoff <= *offset_fsb) { + if (xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &idx, &got) && + got.br_startoff <= offset_fsb) { + *shared = true; + /* If we have a real allocation in the COW fork we're done. */ if (!isnullstartblock(got.br_startblock)) { - xfs_trim_extent(&got, *offset_fsb, - end_fsb - *offset_fsb); - *offset_fsb = got.br_startoff + got.br_blockcount; - goto out_trans_cancel; + xfs_trim_extent(&got, offset_fsb, end_fsb - offset_fsb); + *imap = got; + goto out; } + + xfs_trim_extent(imap, got.br_startoff, got.br_blockcount); } else { - nimaps = 1; - error = xfs_bmapi_read(ip, *offset_fsb, end_fsb - *offset_fsb, - &imap, &nimaps, 0); + error = xfs_reflink_trim_around_shared(ip, imap, shared, &trimmed); + if (error || !*shared) + goto out; + } + + if (!tp) { + resaligned = xfs_aligned_fsb_count(offset_fsb, + end_fsb - offset_fsb, xfs_get_cowextsz_hint(ip)); + resblks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned); + + xfs_iunlock(ip, *lockmode); + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp); + *lockmode = XFS_ILOCK_EXCL; + xfs_ilock(ip, *lockmode); + if (error) - goto out_trans_cancel; - ASSERT(nimaps == 1); + return error; - /* Trim the mapping to the nearest shared extent boundary. */ - error = xfs_reflink_trim_around_shared(ip, &imap, &shared, - &trimmed); + error = xfs_qm_dqattach_locked(ip, 0); if (error) - goto out_trans_cancel; + goto out; - if (!shared) { - *offset_fsb = imap.br_startoff + imap.br_blockcount; - goto out_trans_cancel; - } + /* Read extent from the source file. */ + nimaps = 1; + error = xfs_bmapi_read(ip, offset_fsb, end_fsb - offset_fsb, + imap, &nimaps, 0); + if (error) + goto out; + ASSERT(nimaps == 1); - *offset_fsb = imap.br_startoff; - end_fsb = imap.br_startoff + imap.br_blockcount; + goto retry; } error = xfs_trans_reserve_quota_nblks(tp, ip, resblks, 0, XFS_QMOPT_RES_REGBLKS); if (error) - goto out_trans_cancel; + goto out; xfs_trans_ijoin(tp, ip, 0); @@ -459,9 +469,9 @@ __xfs_reflink_allocate_cow( nimaps = 1; /* Allocate the entire reservation as unwritten blocks. */ - error = xfs_bmapi_write(tp, ip, *offset_fsb, end_fsb - *offset_fsb, + error = xfs_bmapi_write(tp, ip, imap->br_startoff, imap->br_blockcount, XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC, &first_block, - resblks, &imap, &nimaps, &dfops); + resblks, imap, &nimaps, &dfops); if (error) goto out_bmap_cancel; @@ -470,60 +480,15 @@ __xfs_reflink_allocate_cow( if (error) goto out_bmap_cancel; - error = xfs_trans_commit(tp); - if (error) - goto out_unlock; - - *offset_fsb = imap.br_startoff + imap.br_blockcount; - -out_unlock: - xfs_iunlock(ip, lockmode); - return error; + return xfs_trans_commit(tp); out_bmap_cancel: xfs_defer_cancel(&dfops); xfs_trans_unreserve_quota_nblks(tp, ip, (long)resblks, 0, XFS_QMOPT_RES_REGBLKS); -out_trans_cancel: - xfs_trans_cancel(tp); - goto out_unlock; -} - -/* Allocate all CoW reservations covering a part of a file. */ -int -xfs_reflink_allocate_cow_range( - struct xfs_inode *ip, - xfs_off_t offset, - xfs_off_t count) -{ - struct xfs_mount *mp = ip->i_mount; - xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset); - xfs_fileoff_t end_fsb = XFS_B_TO_FSB(mp, offset + count); - int error; - - ASSERT(xfs_is_reflink_inode(ip)); - - trace_xfs_reflink_allocate_cow_range(ip, offset, count); - - /* - * Make sure that the dquots are there. - */ - error = xfs_qm_dqattach(ip, 0); - if (error) - return error; - - while (offset_fsb < end_fsb) { - error = __xfs_reflink_allocate_cow(ip, &offset_fsb, end_fsb); - if (error) { - trace_xfs_reflink_allocate_cow_range_error(ip, error, - _RET_IP_); - goto out; - } - } - - /* Convert the CoW extents to regular. */ - error = xfs_reflink_convert_cow(ip, offset, count); out: + if (tp) + xfs_trans_cancel(tp); return error; } diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h index 1583c47..08792a4 100644 --- a/fs/xfs/xfs_reflink.h +++ b/fs/xfs/xfs_reflink.h @@ -28,8 +28,9 @@ extern int xfs_reflink_trim_around_shared(struct xfs_inode *ip, extern int xfs_reflink_reserve_cow(struct xfs_inode *ip, struct xfs_bmbt_irec *imap, bool *shared); -extern int xfs_reflink_allocate_cow_range(struct xfs_inode *ip, - xfs_off_t offset, xfs_off_t count); +extern int xfs_reflink_allocate_cow(struct xfs_inode *ip, + xfs_fileoff_t offset_fsb, xfs_fileoff_t end_fsb, + struct xfs_bmbt_irec *imap, bool *shared, uint *lockmode); extern int xfs_reflink_convert_cow(struct xfs_inode *ip, xfs_off_t offset, xfs_off_t count); extern bool xfs_reflink_find_cow_mapping(struct xfs_inode *ip, xfs_off_t offset, diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index d3d11905..5f0a0d6 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -3248,7 +3248,6 @@ DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_enospc); DEFINE_INODE_IREC_EVENT(xfs_reflink_convert_cow); DEFINE_RW_EVENT(xfs_reflink_reserve_cow); -DEFINE_RW_EVENT(xfs_reflink_allocate_cow_range); DEFINE_INODE_IREC_EVENT(xfs_reflink_bounce_dio_write); DEFINE_IOMAP_EVENT(xfs_reflink_find_cow_mapping); @@ -3258,7 +3257,6 @@ DEFINE_SIMPLE_IO_EVENT(xfs_reflink_cancel_cow_range); DEFINE_SIMPLE_IO_EVENT(xfs_reflink_end_cow); DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_remap); -DEFINE_INODE_ERROR_EVENT(xfs_reflink_allocate_cow_range_error); DEFINE_INODE_ERROR_EVENT(xfs_reflink_cancel_cow_range_error); DEFINE_INODE_ERROR_EVENT(xfs_reflink_end_cow_error); -- 2.1.4 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 4/4] xfs: allocate direct I/O COW blocks in iomap_begin 2017-02-01 21:42 ` [PATCH 4/4] xfs: allocate direct I/O COW blocks in iomap_begin Christoph Hellwig @ 2017-02-03 0:55 ` Darrick J. Wong 2017-02-03 10:53 ` Christoph Hellwig 0 siblings, 1 reply; 10+ messages in thread From: Darrick J. Wong @ 2017-02-03 0:55 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-xfs On Wed, Feb 01, 2017 at 10:42:39PM +0100, Christoph Hellwig wrote: > Instead of preallocating all the required COW blocks in the high-level > write code do it inside the iomap code, like we do for all other I/O. > > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > fs/xfs/xfs_file.c | 8 --- > fs/xfs/xfs_iomap.c | 31 +++++------ > fs/xfs/xfs_reflink.c | 147 ++++++++++++++++++++------------------------------- > fs/xfs/xfs_reflink.h | 5 +- > fs/xfs/xfs_trace.h | 2 - > 5 files changed, 71 insertions(+), 122 deletions(-) > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > index 06236bf..56ac5b7 100644 > --- a/fs/xfs/xfs_file.c > +++ b/fs/xfs/xfs_file.c > @@ -559,14 +559,6 @@ xfs_file_dio_aio_write( > } > > trace_xfs_file_direct_write(ip, count, iocb->ki_pos); > - > - /* If this is a block-aligned directio CoW, remap immediately. */ > - if (xfs_is_reflink_inode(ip) && !unaligned_io) { > - ret = xfs_reflink_allocate_cow_range(ip, iocb->ki_pos, count); > - if (ret) > - goto out; > - } > - > ret = iomap_dio_rw(iocb, from, &xfs_iomap_ops, xfs_dio_write_end_io); > out: > xfs_iunlock(ip, iolock); > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c > index 9d811eb..de717e7 100644 > --- a/fs/xfs/xfs_iomap.c > +++ b/fs/xfs/xfs_iomap.c > @@ -995,37 +995,31 @@ xfs_file_iomap_begin( > offset_fsb = XFS_B_TO_FSBT(mp, offset); > end_fsb = XFS_B_TO_FSB(mp, offset + length); > > - if (xfs_is_reflink_inode(ip) && > - (flags & IOMAP_WRITE) && (flags & IOMAP_DIRECT)) { > - shared = xfs_reflink_find_cow_mapping(ip, offset, &imap); > - if (shared) { > - xfs_iunlock(ip, lockmode); > - goto alloc_done; > - } > - ASSERT(!isnullstartblock(imap.br_startblock)); > - } > - > error = xfs_bmapi_read(ip, offset_fsb, end_fsb - offset_fsb, &imap, > &nimaps, 0); > if (error) > goto out_unlock; > > - if ((flags & IOMAP_REPORT) || > - (xfs_is_reflink_inode(ip) && > - (flags & IOMAP_WRITE) && (flags & IOMAP_DIRECT))) { > + if (flags & IOMAP_REPORT) { > /* Trim the mapping to the nearest shared extent boundary. */ > error = xfs_reflink_trim_around_shared(ip, &imap, &shared, > &trimmed); > if (error) > goto out_unlock; > - > - ASSERT((flags & IOMAP_REPORT) || !shared); > } > > if ((flags & (IOMAP_WRITE | IOMAP_ZERO)) && xfs_is_reflink_inode(ip)) { > - error = xfs_reflink_reserve_cow(ip, &imap, &shared); > - if (error) > - goto out_unlock; > + if (flags & IOMAP_DIRECT) { > + /* may drop and re-acquire the ilock */ > + error = xfs_reflink_allocate_cow(ip, offset_fsb, end_fsb, > + &imap, &shared, &lockmode); > + if (error) > + goto out_unlock; > + } else { > + error = xfs_reflink_reserve_cow(ip, &imap, &shared); > + if (error) > + goto out_unlock; > + } > > end_fsb = imap.br_startoff + imap.br_blockcount; > length = XFS_FSB_TO_B(mp, end_fsb) - offset; > @@ -1054,7 +1048,6 @@ xfs_file_iomap_begin( > if (error) > return error; > > -alloc_done: > iomap->flags = IOMAP_F_NEW; > trace_xfs_iomap_alloc(ip, offset, length, 0, &imap); > } else { > diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c > index 1a96741..d5a2cf2 100644 > --- a/fs/xfs/xfs_reflink.c > +++ b/fs/xfs/xfs_reflink.c > @@ -384,74 +384,84 @@ xfs_reflink_convert_cow( > } > > /* Allocate all CoW reservations covering a range of blocks in a file. */ > -static int > -__xfs_reflink_allocate_cow( > +int > +xfs_reflink_allocate_cow( > struct xfs_inode *ip, > - xfs_fileoff_t *offset_fsb, > - xfs_fileoff_t end_fsb) > + xfs_fileoff_t offset_fsb, > + xfs_fileoff_t end_fsb, > + struct xfs_bmbt_irec *imap, > + bool *shared, > + uint *lockmode) > { > struct xfs_mount *mp = ip->i_mount; > - struct xfs_bmbt_irec imap, got; > + struct xfs_bmbt_irec got; > struct xfs_defer_ops dfops; > - struct xfs_trans *tp; > + struct xfs_trans *tp = NULL; > xfs_fsblock_t first_block; > - int nimaps, error, lockmode; > - bool shared, trimmed; > + int nimaps, error = 0; > + bool trimmed; > xfs_filblks_t resaligned; > - xfs_extlen_t resblks; > + xfs_extlen_t resblks = 0; > xfs_extnum_t idx; > > - resaligned = xfs_aligned_fsb_count(*offset_fsb, end_fsb - *offset_fsb, > - xfs_get_cowextsz_hint(ip)); > - resblks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned); > - > - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp); > - if (error) > - return error; > - > - lockmode = XFS_ILOCK_EXCL; > - xfs_ilock(ip, lockmode); > +retry: > + ASSERT(xfs_is_reflink_inode(ip)); > + ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL | XFS_ILOCK_SHARED)); > > /* > * Even if the extent is not shared we might have a preallocation for > * it in the COW fork. If so use it. > */ > - if (xfs_iext_lookup_extent(ip, ip->i_cowfp, *offset_fsb, &idx, &got) && > - got.br_startoff <= *offset_fsb) { > + if (xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &idx, &got) && > + got.br_startoff <= offset_fsb) { > + *shared = true; > + > /* If we have a real allocation in the COW fork we're done. */ > if (!isnullstartblock(got.br_startblock)) { > - xfs_trim_extent(&got, *offset_fsb, > - end_fsb - *offset_fsb); > - *offset_fsb = got.br_startoff + got.br_blockcount; > - goto out_trans_cancel; > + xfs_trim_extent(&got, offset_fsb, end_fsb - offset_fsb); > + *imap = got; > + goto out; > } > + > + xfs_trim_extent(imap, got.br_startoff, got.br_blockcount); > } else { > - nimaps = 1; > - error = xfs_bmapi_read(ip, *offset_fsb, end_fsb - *offset_fsb, > - &imap, &nimaps, 0); > + error = xfs_reflink_trim_around_shared(ip, imap, shared, &trimmed); > + if (error || !*shared) > + goto out; > + } > + > + if (!tp) { > + resaligned = xfs_aligned_fsb_count(offset_fsb, > + end_fsb - offset_fsb, xfs_get_cowextsz_hint(ip)); > + resblks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned); > + > + xfs_iunlock(ip, *lockmode); > + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp); > + *lockmode = XFS_ILOCK_EXCL; > + xfs_ilock(ip, *lockmode); > + > if (error) > - goto out_trans_cancel; > - ASSERT(nimaps == 1); > + return error; > > - /* Trim the mapping to the nearest shared extent boundary. */ > - error = xfs_reflink_trim_around_shared(ip, &imap, &shared, > - &trimmed); > + error = xfs_qm_dqattach_locked(ip, 0); > if (error) > - goto out_trans_cancel; > + goto out; > > - if (!shared) { > - *offset_fsb = imap.br_startoff + imap.br_blockcount; > - goto out_trans_cancel; > - } > + /* Read extent from the source file. */ > + nimaps = 1; > + error = xfs_bmapi_read(ip, offset_fsb, end_fsb - offset_fsb, > + imap, &nimaps, 0); > + if (error) > + goto out; > + ASSERT(nimaps == 1); > > - *offset_fsb = imap.br_startoff; > - end_fsb = imap.br_startoff + imap.br_blockcount; > + goto retry; > } > > error = xfs_trans_reserve_quota_nblks(tp, ip, resblks, 0, > XFS_QMOPT_RES_REGBLKS); > if (error) > - goto out_trans_cancel; > + goto out; > > xfs_trans_ijoin(tp, ip, 0); > > @@ -459,9 +469,9 @@ __xfs_reflink_allocate_cow( > nimaps = 1; > > /* Allocate the entire reservation as unwritten blocks. */ > - error = xfs_bmapi_write(tp, ip, *offset_fsb, end_fsb - *offset_fsb, > + error = xfs_bmapi_write(tp, ip, imap->br_startoff, imap->br_blockcount, > XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC, &first_block, > - resblks, &imap, &nimaps, &dfops); > + resblks, imap, &nimaps, &dfops); > if (error) > goto out_bmap_cancel; > > @@ -470,60 +480,15 @@ __xfs_reflink_allocate_cow( > if (error) > goto out_bmap_cancel; > > - error = xfs_trans_commit(tp); > - if (error) > - goto out_unlock; > - > - *offset_fsb = imap.br_startoff + imap.br_blockcount; > - > -out_unlock: > - xfs_iunlock(ip, lockmode); > - return error; > + return xfs_trans_commit(tp); > > out_bmap_cancel: > xfs_defer_cancel(&dfops); > xfs_trans_unreserve_quota_nblks(tp, ip, (long)resblks, 0, > XFS_QMOPT_RES_REGBLKS); > -out_trans_cancel: > - xfs_trans_cancel(tp); > - goto out_unlock; > -} > - > -/* Allocate all CoW reservations covering a part of a file. */ > -int > -xfs_reflink_allocate_cow_range( > - struct xfs_inode *ip, > - xfs_off_t offset, > - xfs_off_t count) > -{ > - struct xfs_mount *mp = ip->i_mount; > - xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset); > - xfs_fileoff_t end_fsb = XFS_B_TO_FSB(mp, offset + count); > - int error; > - > - ASSERT(xfs_is_reflink_inode(ip)); > - > - trace_xfs_reflink_allocate_cow_range(ip, offset, count); > - > - /* > - * Make sure that the dquots are there. > - */ > - error = xfs_qm_dqattach(ip, 0); > - if (error) > - return error; > - > - while (offset_fsb < end_fsb) { > - error = __xfs_reflink_allocate_cow(ip, &offset_fsb, end_fsb); > - if (error) { > - trace_xfs_reflink_allocate_cow_range_error(ip, error, > - _RET_IP_); > - goto out; > - } > - } > - > - /* Convert the CoW extents to regular. */ > - error = xfs_reflink_convert_cow(ip, offset, count); I think this is incorrect -- we create an unwritten extent in the cow fork above (BMAP_COWFORK | BMAP_PREALLOC) but we no longer convert the part we're actually writing to a real extent. This means that _reflink_cow won't actually remap anything that we've written. I saw generic/{139,183,187,188} fail and then g/190 crashed. There's something incorrect going on for sure, though it's late and I won't have time to pinpoint it tonight. --D > out: > + if (tp) > + xfs_trans_cancel(tp); > return error; > } > > diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h > index 1583c47..08792a4 100644 > --- a/fs/xfs/xfs_reflink.h > +++ b/fs/xfs/xfs_reflink.h > @@ -28,8 +28,9 @@ extern int xfs_reflink_trim_around_shared(struct xfs_inode *ip, > > extern int xfs_reflink_reserve_cow(struct xfs_inode *ip, > struct xfs_bmbt_irec *imap, bool *shared); > -extern int xfs_reflink_allocate_cow_range(struct xfs_inode *ip, > - xfs_off_t offset, xfs_off_t count); > +extern int xfs_reflink_allocate_cow(struct xfs_inode *ip, > + xfs_fileoff_t offset_fsb, xfs_fileoff_t end_fsb, > + struct xfs_bmbt_irec *imap, bool *shared, uint *lockmode); > extern int xfs_reflink_convert_cow(struct xfs_inode *ip, xfs_off_t offset, > xfs_off_t count); > extern bool xfs_reflink_find_cow_mapping(struct xfs_inode *ip, xfs_off_t offset, > diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h > index d3d11905..5f0a0d6 100644 > --- a/fs/xfs/xfs_trace.h > +++ b/fs/xfs/xfs_trace.h > @@ -3248,7 +3248,6 @@ DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_enospc); > DEFINE_INODE_IREC_EVENT(xfs_reflink_convert_cow); > > DEFINE_RW_EVENT(xfs_reflink_reserve_cow); > -DEFINE_RW_EVENT(xfs_reflink_allocate_cow_range); > > DEFINE_INODE_IREC_EVENT(xfs_reflink_bounce_dio_write); > DEFINE_IOMAP_EVENT(xfs_reflink_find_cow_mapping); > @@ -3258,7 +3257,6 @@ DEFINE_SIMPLE_IO_EVENT(xfs_reflink_cancel_cow_range); > DEFINE_SIMPLE_IO_EVENT(xfs_reflink_end_cow); > DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_remap); > > -DEFINE_INODE_ERROR_EVENT(xfs_reflink_allocate_cow_range_error); > DEFINE_INODE_ERROR_EVENT(xfs_reflink_cancel_cow_range_error); > DEFINE_INODE_ERROR_EVENT(xfs_reflink_end_cow_error); > > -- > 2.1.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 4/4] xfs: allocate direct I/O COW blocks in iomap_begin 2017-02-03 0:55 ` Darrick J. Wong @ 2017-02-03 10:53 ` Christoph Hellwig 2017-02-05 20:13 ` Christoph Hellwig 0 siblings, 1 reply; 10+ messages in thread From: Christoph Hellwig @ 2017-02-03 10:53 UTC (permalink / raw) To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs On Thu, Feb 02, 2017 at 04:55:23PM -0800, Darrick J. Wong wrote: > On Wed, Feb 01, 2017 at 10:42:39PM +0100, Christoph Hellwig wrote: > > Instead of preallocating all the required COW blocks in the high-level > > write code do it inside the iomap code, like we do for all other I/O. > > > > Signed-off-by: Christoph Hellwig <hch@lst.de> > > --- > > fs/xfs/xfs_file.c | 8 --- > > fs/xfs/xfs_iomap.c | 31 +++++------ > > fs/xfs/xfs_reflink.c | 147 ++++++++++++++++++++------------------------------- > > fs/xfs/xfs_reflink.h | 5 +- > > fs/xfs/xfs_trace.h | 2 - > > 5 files changed, 71 insertions(+), 122 deletions(-) > > > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > > index 06236bf..56ac5b7 100644 > > --- a/fs/xfs/xfs_file.c > > +++ b/fs/xfs/xfs_file.c > > @@ -559,14 +559,6 @@ xfs_file_dio_aio_write( > > } > > > > trace_xfs_file_direct_write(ip, count, iocb->ki_pos); > > - > > - /* If this is a block-aligned directio CoW, remap immediately. */ > > - if (xfs_is_reflink_inode(ip) && !unaligned_io) { > > - ret = xfs_reflink_allocate_cow_range(ip, iocb->ki_pos, count); > > - if (ret) > > - goto out; > > - } > > - > > ret = iomap_dio_rw(iocb, from, &xfs_iomap_ops, xfs_dio_write_end_io); > > out: > > xfs_iunlock(ip, iolock); > > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c > > index 9d811eb..de717e7 100644 > > --- a/fs/xfs/xfs_iomap.c > > +++ b/fs/xfs/xfs_iomap.c > > @@ -995,37 +995,31 @@ xfs_file_iomap_begin( > > offset_fsb = XFS_B_TO_FSBT(mp, offset); > > end_fsb = XFS_B_TO_FSB(mp, offset + length); > > > > - if (xfs_is_reflink_inode(ip) && > > - (flags & IOMAP_WRITE) && (flags & IOMAP_DIRECT)) { > > - shared = xfs_reflink_find_cow_mapping(ip, offset, &imap); > > - if (shared) { > > - xfs_iunlock(ip, lockmode); > > - goto alloc_done; > > - } > > - ASSERT(!isnullstartblock(imap.br_startblock)); > > - } > > - > > error = xfs_bmapi_read(ip, offset_fsb, end_fsb - offset_fsb, &imap, > > &nimaps, 0); > > if (error) > > goto out_unlock; > > > > - if ((flags & IOMAP_REPORT) || > > - (xfs_is_reflink_inode(ip) && > > - (flags & IOMAP_WRITE) && (flags & IOMAP_DIRECT))) { > > + if (flags & IOMAP_REPORT) { > > /* Trim the mapping to the nearest shared extent boundary. */ > > error = xfs_reflink_trim_around_shared(ip, &imap, &shared, > > &trimmed); > > if (error) > > goto out_unlock; > > - > > - ASSERT((flags & IOMAP_REPORT) || !shared); > > } > > > > if ((flags & (IOMAP_WRITE | IOMAP_ZERO)) && xfs_is_reflink_inode(ip)) { > > - error = xfs_reflink_reserve_cow(ip, &imap, &shared); > > - if (error) > > - goto out_unlock; > > + if (flags & IOMAP_DIRECT) { > > + /* may drop and re-acquire the ilock */ > > + error = xfs_reflink_allocate_cow(ip, offset_fsb, end_fsb, > > + &imap, &shared, &lockmode); > > + if (error) > > + goto out_unlock; > > + } else { > > + error = xfs_reflink_reserve_cow(ip, &imap, &shared); > > + if (error) > > + goto out_unlock; > > + } > > > > end_fsb = imap.br_startoff + imap.br_blockcount; > > length = XFS_FSB_TO_B(mp, end_fsb) - offset; > > @@ -1054,7 +1048,6 @@ xfs_file_iomap_begin( > > if (error) > > return error; > > > > -alloc_done: > > iomap->flags = IOMAP_F_NEW; > > trace_xfs_iomap_alloc(ip, offset, length, 0, &imap); > > } else { > > diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c > > index 1a96741..d5a2cf2 100644 > > --- a/fs/xfs/xfs_reflink.c > > +++ b/fs/xfs/xfs_reflink.c > > @@ -384,74 +384,84 @@ xfs_reflink_convert_cow( > > } > > > > /* Allocate all CoW reservations covering a range of blocks in a file. */ > > -static int > > -__xfs_reflink_allocate_cow( > > +int > > +xfs_reflink_allocate_cow( > > struct xfs_inode *ip, > > - xfs_fileoff_t *offset_fsb, > > - xfs_fileoff_t end_fsb) > > + xfs_fileoff_t offset_fsb, > > + xfs_fileoff_t end_fsb, > > + struct xfs_bmbt_irec *imap, > > + bool *shared, > > + uint *lockmode) > > { > > struct xfs_mount *mp = ip->i_mount; > > - struct xfs_bmbt_irec imap, got; > > + struct xfs_bmbt_irec got; > > struct xfs_defer_ops dfops; > > - struct xfs_trans *tp; > > + struct xfs_trans *tp = NULL; > > xfs_fsblock_t first_block; > > - int nimaps, error, lockmode; > > - bool shared, trimmed; > > + int nimaps, error = 0; > > + bool trimmed; > > xfs_filblks_t resaligned; > > - xfs_extlen_t resblks; > > + xfs_extlen_t resblks = 0; > > xfs_extnum_t idx; > > > > - resaligned = xfs_aligned_fsb_count(*offset_fsb, end_fsb - *offset_fsb, > > - xfs_get_cowextsz_hint(ip)); > > - resblks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned); > > - > > - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp); > > - if (error) > > - return error; > > - > > - lockmode = XFS_ILOCK_EXCL; > > - xfs_ilock(ip, lockmode); > > +retry: > > + ASSERT(xfs_is_reflink_inode(ip)); > > + ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL | XFS_ILOCK_SHARED)); > > > > /* > > * Even if the extent is not shared we might have a preallocation for > > * it in the COW fork. If so use it. > > */ > > - if (xfs_iext_lookup_extent(ip, ip->i_cowfp, *offset_fsb, &idx, &got) && > > - got.br_startoff <= *offset_fsb) { > > + if (xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &idx, &got) && > > + got.br_startoff <= offset_fsb) { > > + *shared = true; > > + > > /* If we have a real allocation in the COW fork we're done. */ > > if (!isnullstartblock(got.br_startblock)) { > > - xfs_trim_extent(&got, *offset_fsb, > > - end_fsb - *offset_fsb); > > - *offset_fsb = got.br_startoff + got.br_blockcount; > > - goto out_trans_cancel; > > + xfs_trim_extent(&got, offset_fsb, end_fsb - offset_fsb); > > + *imap = got; > > + goto out; > > } > > + > > + xfs_trim_extent(imap, got.br_startoff, got.br_blockcount); > > } else { > > - nimaps = 1; > > - error = xfs_bmapi_read(ip, *offset_fsb, end_fsb - *offset_fsb, > > - &imap, &nimaps, 0); > > + error = xfs_reflink_trim_around_shared(ip, imap, shared, &trimmed); > > + if (error || !*shared) > > + goto out; > > + } > > + > > + if (!tp) { > > + resaligned = xfs_aligned_fsb_count(offset_fsb, > > + end_fsb - offset_fsb, xfs_get_cowextsz_hint(ip)); > > + resblks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned); > > + > > + xfs_iunlock(ip, *lockmode); > > + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp); > > + *lockmode = XFS_ILOCK_EXCL; > > + xfs_ilock(ip, *lockmode); > > + > > if (error) > > - goto out_trans_cancel; > > - ASSERT(nimaps == 1); > > + return error; > > > > - /* Trim the mapping to the nearest shared extent boundary. */ > > - error = xfs_reflink_trim_around_shared(ip, &imap, &shared, > > - &trimmed); > > + error = xfs_qm_dqattach_locked(ip, 0); > > if (error) > > - goto out_trans_cancel; > > + goto out; > > > > - if (!shared) { > > - *offset_fsb = imap.br_startoff + imap.br_blockcount; > > - goto out_trans_cancel; > > - } > > + /* Read extent from the source file. */ > > + nimaps = 1; > > + error = xfs_bmapi_read(ip, offset_fsb, end_fsb - offset_fsb, > > + imap, &nimaps, 0); > > + if (error) > > + goto out; > > + ASSERT(nimaps == 1); > > > > - *offset_fsb = imap.br_startoff; > > - end_fsb = imap.br_startoff + imap.br_blockcount; > > + goto retry; > > } > > > > error = xfs_trans_reserve_quota_nblks(tp, ip, resblks, 0, > > XFS_QMOPT_RES_REGBLKS); > > if (error) > > - goto out_trans_cancel; > > + goto out; > > > > xfs_trans_ijoin(tp, ip, 0); > > > > @@ -459,9 +469,9 @@ __xfs_reflink_allocate_cow( > > nimaps = 1; > > > > /* Allocate the entire reservation as unwritten blocks. */ > > - error = xfs_bmapi_write(tp, ip, *offset_fsb, end_fsb - *offset_fsb, > > + error = xfs_bmapi_write(tp, ip, imap->br_startoff, imap->br_blockcount, > > XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC, &first_block, > > - resblks, &imap, &nimaps, &dfops); > > + resblks, imap, &nimaps, &dfops); > > if (error) > > goto out_bmap_cancel; > > > > @@ -470,60 +480,15 @@ __xfs_reflink_allocate_cow( > > if (error) > > goto out_bmap_cancel; > > > > - error = xfs_trans_commit(tp); > > - if (error) > > - goto out_unlock; > > - > > - *offset_fsb = imap.br_startoff + imap.br_blockcount; > > - > > -out_unlock: > > - xfs_iunlock(ip, lockmode); > > - return error; > > + return xfs_trans_commit(tp); > > > > out_bmap_cancel: > > xfs_defer_cancel(&dfops); > > xfs_trans_unreserve_quota_nblks(tp, ip, (long)resblks, 0, > > XFS_QMOPT_RES_REGBLKS); > > -out_trans_cancel: > > - xfs_trans_cancel(tp); > > - goto out_unlock; > > -} > > - > > -/* Allocate all CoW reservations covering a part of a file. */ > > -int > > -xfs_reflink_allocate_cow_range( > > - struct xfs_inode *ip, > > - xfs_off_t offset, > > - xfs_off_t count) > > -{ > > - struct xfs_mount *mp = ip->i_mount; > > - xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset); > > - xfs_fileoff_t end_fsb = XFS_B_TO_FSB(mp, offset + count); > > - int error; > > - > > - ASSERT(xfs_is_reflink_inode(ip)); > > - > > - trace_xfs_reflink_allocate_cow_range(ip, offset, count); > > - > > - /* > > - * Make sure that the dquots are there. > > - */ > > - error = xfs_qm_dqattach(ip, 0); > > - if (error) > > - return error; > > - > > - while (offset_fsb < end_fsb) { > > - error = __xfs_reflink_allocate_cow(ip, &offset_fsb, end_fsb); > > - if (error) { > > - trace_xfs_reflink_allocate_cow_range_error(ip, error, > > - _RET_IP_); > > - goto out; > > - } > > - } > > - > > - /* Convert the CoW extents to regular. */ > > - error = xfs_reflink_convert_cow(ip, offset, count); > > I think this is incorrect -- we create an unwritten extent in the cow > fork above (BMAP_COWFORK | BMAP_PREALLOC) but we no longer convert the > part we're actually writing to a real extent. This means that > _reflink_cow won't actually remap anything that we've written. It's not correct and I messed up when rebasing - the version I tested still had it and worked. Sigh.. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 4/4] xfs: allocate direct I/O COW blocks in iomap_begin 2017-02-03 10:53 ` Christoph Hellwig @ 2017-02-05 20:13 ` Christoph Hellwig 0 siblings, 0 replies; 10+ messages in thread From: Christoph Hellwig @ 2017-02-05 20:13 UTC (permalink / raw) To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs On Fri, Feb 03, 2017 at 11:53:59AM +0100, Christoph Hellwig wrote: > > I think this is incorrect -- we create an unwritten extent in the cow > > fork above (BMAP_COWFORK | BMAP_PREALLOC) but we no longer convert the > > part we're actually writing to a real extent. This means that > > _reflink_cow won't actually remap anything that we've written. > > It's not correct and I messed up when rebasing - the version I tested > still had it and worked. Sigh.. Or rather I hadn't finished rebasing and the version I had was totally b0rked. Proper one in progress now.. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2017-02-05 20:13 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-02-01 21:42 reflink direct I/O improvements V2 Christoph Hellwig 2017-02-01 21:42 ` [PATCH 1/4] xfs: reject all unaligned direct writes to reflinked files Christoph Hellwig 2017-02-02 23:01 ` Darrick J. Wong 2017-02-01 21:42 ` [PATCH 2/4] xfs: introduce xfs_aligned_fsb_count Christoph Hellwig 2017-02-02 22:56 ` Darrick J. Wong 2017-02-01 21:42 ` [PATCH 3/4] xfs: go straight to real allocations for direct I/O COW writes Christoph Hellwig 2017-02-01 21:42 ` [PATCH 4/4] xfs: allocate direct I/O COW blocks in iomap_begin Christoph Hellwig 2017-02-03 0:55 ` Darrick J. Wong 2017-02-03 10:53 ` Christoph Hellwig 2017-02-05 20:13 ` Christoph Hellwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).