From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH v2 04/11] xfs: CoW fork operations should only update quota reservations
Date: Fri, 26 Jan 2018 08:02:16 -0500
Message-ID: <20180126130215.GA47923@bfoster.bfoster>
In-Reply-To: <20180125182003.GH9068@magnolia>
On Thu, Jan 25, 2018 at 10:20:03AM -0800, Darrick J. Wong wrote:
> On Thu, Jan 25, 2018 at 08:03:53AM -0500, Brian Foster wrote:
> > On Wed, Jan 24, 2018 at 05:20:35PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > >
> > > Since the CoW fork only exists in memory, it is incorrect to update the
> > > on-disk quota block counts when we modify the CoW fork. Unlike the data
> > > fork, even real extents in the CoW fork are only reservations (on-disk
> > > they're owned by the refcountbt) so they must not be tracked in the on
> > > disk quota info.
> > >
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > > v2: make documentation more crisp and to the point
> > > ---
> > > fs/xfs/libxfs/xfs_bmap.c | 118 ++++++++++++++++++++++++++++++++++++++++++----
> > > fs/xfs/xfs_quota.h | 14 ++++-
> > > fs/xfs/xfs_reflink.c | 8 ++-
> > > 3 files changed, 122 insertions(+), 18 deletions(-)
> > >
...
> > > diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> > > index 82abff6..e367351 100644
> > > --- a/fs/xfs/xfs_reflink.c
> > > +++ b/fs/xfs/xfs_reflink.c
> > > @@ -599,10 +599,6 @@ xfs_reflink_cancel_cow_blocks(
> > > del.br_startblock, del.br_blockcount,
> > > NULL);
> > >
> > > - /* Update quota accounting */
> > > - xfs_trans_mod_dquot_byino(*tpp, ip, XFS_TRANS_DQ_BCOUNT,
> > > - -(long)del.br_blockcount);
> > > -
> > > /* Roll the transaction */
> > > xfs_defer_ijoin(&dfops, ip);
> > > error = xfs_defer_finish(tpp, &dfops);
> > > @@ -795,6 +791,10 @@ xfs_reflink_end_cow(
> > > if (error)
> > > goto out_defer;
> > >
> > > + /* Charge this new data fork mapping to the on-disk quota. */
> > > + xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT,
> > > + (long)del.br_blockcount);
> > > +
> >
> > Should this technically be XFS_TRANS_DQ_DELBCOUNT? The blocks obviously
> > aren't delalloc and this transaction doesn't make a quota reservation so
> > I don't think it screws up accounting. But if the transaction did make a
> > quota reservation, it seems like this would account the extent against
> > the tx reservation where it instead should recognize that cow blocks
> > have already been reserved (which is essentially what DELBCOUNT means,
> > IIUC).
>
> Hmmm, there's a subtlety here -- we're opencoding what DELBCOUNT does,
> because the subsequent xfs_bmap_del_extent_cow unconditionally reduces
> the in-core reservation after we've mapped in the extent as if it had
> been accounted as a real extent all along. But considering all the
> blather about how cow fork blocks are treated as incore reservations, it
> does look funny, doesn't it?
>
Ok, I missed that the end/del cases were tied together, then confused
myself again over the accounting in the end_cow() path (re: our irc chat
yesterday) when reassessing that bit. So to reset my brain, we have the
following with this current patch:

- cow reserve does a delalloc and in-core dquot reservation
- cow real alloc either skips dquot adjustment if wasdel, else reduces
  the quota res acquired by the transaction by the size of the alloc[1].
  Either way we leave around an in-core quota reservation as if the
  blocks remained delalloc.
- A cancel at this point simply kills the in-core dquot reservation
  along with the cow fork blocks.
- end_cow() unmaps the current data fork blocks and decrements
  associated real quota usage (tx), remaps the cow blocks and increments
  real quota usage (tx), then kills off the in-core dquot reservation.

[1] Would this even be necessary if we just acquired a delalloc-like
reservation in xfs_reflink_allocate_cow() rather than associating the
reservation with the transaction in the first place (assuming we have
enough information to cover error handling, extent manipulations and
whatnot)?

When the tx commits, this essentially has the effect of applying the
bcount delta to both the on-disk dquot and the in-core res. The former
reflects the change in the file on-disk and the latter is rectified
because the in-core res field accounts for the current real usage plus
outstanding reservation. The original cowblocks res has been dropped
directly, so the bcount delta reflects the change to the data fork.
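
As a sanity check on the above, here is a toy userspace model of that
flow. It is only a sketch: the struct and function names (toy_dquot,
toy_tx, toy_commit and so on) are made up for illustration and are not
the kernel's, and it assumes the transaction holds no quota block
reservation of its own, so commit applies the bcount delta to both
counters.

```c
#include <assert.h>

/* Toy dquot; field names are illustrative, not the kernel's. */
struct toy_dquot {
	long disk_bcount;	/* blocks charged in the on-disk dquot */
	long incore_res;	/* real usage plus outstanding reservations */
};

/* Toy transaction carrying only a bcount delta (models DQ_BCOUNT). */
struct toy_tx {
	long bcount_delta;
};

/* cow reserve: in-core reservation only, nothing goes on disk. */
static void toy_cow_reserve(struct toy_dquot *dq, long blocks)
{
	dq->incore_res += blocks;
}

/* Assumed commit rule: apply the bcount delta to both counters. */
static void toy_commit(struct toy_dquot *dq, const struct toy_tx *tx)
{
	dq->disk_bcount += tx->bcount_delta;
	dq->incore_res += tx->bcount_delta;
}

/* end_cow() as in this patch: deltas in the tx, cow res dropped directly. */
static void toy_end_cow_bcount(struct toy_dquot *dq, long old_blocks,
			       long cow_blocks)
{
	struct toy_tx tx = { 0 };

	tx.bcount_delta -= old_blocks;	/* unmap old data fork blocks */
	tx.bcount_delta += cow_blocks;	/* remap the cow blocks */
	dq->incore_res -= cow_blocks;	/* del_cow() kills the cow res */
	assert(dq->incore_res >= 0);
	toy_commit(dq, &tx);
}
```

Starting from disk_bcount = incore_res = 10, reserving 4 cow blocks
raises incore_res to 14; ending CoW over 1 old data fork block then
leaves both counters at 13.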

If we instead use delbcount in end_cow(), we're telling the transaction
to drop bcount by whatever old data fork blocks were removed and that
we've converted N delalloc (cow fork, actually) blocks that already had
in-core reservation. Therefore, transaction commit updates the on-disk
dquot just the same (-dataforkblocks + delallocblocks), but the
delbcount blocks are already counted in the in-core dquot res, so the
transaction has nothing else to do there (and so we must also not remove
that reservation in del_cow()). This approach does seem like it requires
a bit less mental gymnastics to follow because it more closely resembles
delalloc quota accounting. ;)
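
The delbcount variant can be modelled the same way. Again this is a toy
userspace sketch with invented names (toy_dquot, toy_tx,
toy_end_cow_delbcount), not kernel code, and it assumes a transaction
with no quota block reservation: commit charges both deltas to the
on-disk count but only the bcount delta to the in-core res.

```c
#include <assert.h>

/* Toy dquot; field names are illustrative, not the kernel's. */
struct toy_dquot {
	long disk_bcount;	/* blocks charged in the on-disk dquot */
	long incore_res;	/* real usage plus outstanding reservations */
};

struct toy_tx {
	long bcount_delta;	/* models XFS_TRANS_DQ_BCOUNT */
	long delbcount_delta;	/* models XFS_TRANS_DQ_DELBCOUNT */
};

/* Assumed commit rule for a tx that holds no quota reservation. */
static void toy_commit(struct toy_dquot *dq, const struct toy_tx *tx)
{
	/* On disk, both kinds of delta count as real blocks. */
	dq->disk_bcount += tx->bcount_delta + tx->delbcount_delta;
	/* In core, the delbcount blocks were already reserved. */
	dq->incore_res += tx->bcount_delta;
	assert(dq->incore_res >= dq->disk_bcount);
}

/* end_cow() using delbcount for the remapped cow blocks. */
static void toy_end_cow_delbcount(struct toy_dquot *dq, long old_blocks,
				  long cow_blocks)
{
	struct toy_tx tx = { 0, 0 };

	tx.bcount_delta -= old_blocks;		/* unmap old data fork blocks */
	tx.delbcount_delta += cow_blocks;	/* convert reserved cow blocks */
	/* del_cow() must leave incore_res alone in this scheme. */
	toy_commit(dq, &tx);
}
```

With disk_bcount = 10 and incore_res = 14 (4 cow blocks reserved),
ending CoW over 1 old block again lands on 13 for both counters, this
time without touching the in-core res outside of commit.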

Another thing that I'm not sure has been considered here is whether
doing the bcount delta in the transaction and dropping the cowblocks res
from the dquot directly leaves a race window where the quota can overrun
a limit. E.g., since the transaction has to bump the in-core res back up
at commit time in the original example, is there anything that locks out
further external reservations from the dquot between the time the
in-core res is dropped and the time the transaction commits?

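
To spell out the interleaving I'm worried about, here is a
single-threaded replay in a toy model. Everything in it is hypothetical
(toy_dquot, toy_try_reserve, toy_replay_window, the hard_limit field),
and it simply assumes that nothing serializes the other reservation
against the window. Whether the real code allows that is exactly the
open question, so this shows the arithmetic and nothing more.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy dquot with a block limit; names are illustrative only. */
struct toy_dquot {
	long incore_res;	/* real usage plus outstanding reservations */
	long hard_limit;	/* reservations past this are refused */
};

/* Hypothetical reservation check: refuse anything past the limit. */
static bool toy_try_reserve(struct toy_dquot *dq, long blocks)
{
	if (dq->incore_res + blocks > dq->hard_limit)
		return false;
	dq->incore_res += blocks;
	return true;
}

/*
 * Replay one interleaving by hand: the cow res is dropped directly, a
 * second task reserves blocks, then the tx commit applies its bcount
 * delta. Returns the in-core res after commit.
 */
static long toy_replay_window(struct toy_dquot *dq, long cow_blocks,
			      long other_blocks)
{
	long bcount_delta = cow_blocks;	/* queued in the tx, not applied yet */

	dq->incore_res -= cow_blocks;		 /* cow res dropped directly */
	(void)toy_try_reserve(dq, other_blocks); /* other task slips in */
	dq->incore_res += bcount_delta;		 /* commit re-adds the blocks */
	return dq->incore_res;
}
```

With incore_res = 14 against a limit of 14, a 4 block reservation is
refused outright. If the same request lands inside the window while 4
cow blocks are being remapped over a hole, it succeeds and the in-core
res ends up at 18.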
> So perhaps the solution is to pass intent into xfs_bmap_del_extent_cow:
> if we're calling it from _end_cow then we want to hang on to the
> reservation so that delbcount can do its thing, but if we're calling
> from _cancel_cow then we're dumping the extent and reservation.
>
Indeed. But since those are the only callers and we'd already update
delbcount from end_cow(), could we not just lift the reservation
decrement out of del_cow() and into the cancel_cow() function? FWIW,
some extra comments around quota manipulation in the reflink functions
would also be useful for future reference.
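
Roughly the shape I have in mind, as a toy userspace sketch rather than
a patch. The names (toy_del_cow, toy_cancel_cow, toy_end_cow) are
invented, and the transaction commit is collapsed into end_cow() under
the assumption that it applies delbcount to the on-disk count only.

```c
#include <assert.h>

/* Toy dquot; field names are illustrative, not the kernel's. */
struct toy_dquot {
	long disk_bcount;	/* blocks charged in the on-disk dquot */
	long incore_res;	/* real usage plus outstanding reservations */
};

/* del_cow(): only removes the cow fork mapping, no quota side effects. */
static void toy_del_cow(long *cow_fork_blocks, long blocks)
{
	assert(blocks <= *cow_fork_blocks);
	*cow_fork_blocks -= blocks;
}

/* cancel_cow(): drop the mapping and hand the reservation back. */
static void toy_cancel_cow(struct toy_dquot *dq, long *cow_fork_blocks,
			   long blocks)
{
	toy_del_cow(cow_fork_blocks, blocks);
	dq->incore_res -= blocks;	/* decrement lifted out of del_cow() */
}

/* end_cow(): delbcount-style accounting, with the commit folded in. */
static void toy_end_cow(struct toy_dquot *dq, long *cow_fork_blocks,
			long old_blocks, long blocks)
{
	toy_del_cow(cow_fork_blocks, blocks);
	dq->disk_bcount += blocks - old_blocks;	/* bcount + delbcount */
	dq->incore_res -= old_blocks;		/* cow res is kept as usage */
}
```

Starting from disk_bcount = 10, incore_res = 18 and 8 blocks in the cow
fork, cancelling 4 blocks gives incore_res = 14, and ending CoW on the
other 4 over 1 old block leaves both counters at 13.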
Brian
> --D
>
> >
> > Other than that the code seems Ok to me.
> >
> > Brian
> >
> > > /* Remove the mapping from the CoW fork. */
> > > xfs_bmap_del_extent_cow(ip, &icur, &got, &del);
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html