From: Dave Chinner <david@fromorbit.com>
To: Zhang Yi <yi.zhang@huaweicloud.com>
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, djwong@kernel.org,
hch@infradead.org, brauner@kernel.org, chandanbabu@kernel.org,
jack@suse.cz, willy@infradead.org, yi.zhang@huawei.com,
chengzhihao1@huawei.com, yukuai3@huawei.com
Subject: Re: [RFC PATCH v4 5/8] xfs: refactor the truncating order
Date: Mon, 3 Jun 2024 08:46:40 +1000 [thread overview]
Message-ID: <Zlz2UCS4jqQO3Vm6@dread.disaster.area> (raw)
In-Reply-To: <20240529095206.2568162-6-yi.zhang@huaweicloud.com>
On Wed, May 29, 2024 at 05:52:03PM +0800, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
>
> When truncating down an inode, we call xfs_truncate_page() to zero out
> the tail partial block that beyond new EOF, which prevents exposing
> stale data. But xfs_truncate_page() always assumes the blocksize is
> i_blocksize(inode), it's not always true if we have a large allocation
> unit for a file and we should aligned to this unitsize, e.g. realtime
> inode should aligned to the rtextsize.
>
> Current xfs_setattr_size() can't support zeroing out a large alignment
> size on trucate down since the process order is wrong. We first do zero
> out through xfs_truncate_page(), and then update inode size through
> truncate_setsize() immediately. If the zeroed range is larger than a
> folio, the write back path would not write back zeroed pagecache beyond
> the EOF folio, so it doesn't write zeroes to the entire tail extent and
> could expose stale data after an appending write into the next aligned
> extent.
>
> We need to adjust the order to zero out tail aligned blocks, write back
> zeroed or cached data, update i_size and drop cache beyond aligned EOF
> block, preparing for the fix of realtime inode and supporting the
> upcoming forced alignment feature.
>
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> ---
.....
> @@ -853,30 +854,7 @@ xfs_setattr_size(
> * the transaction because the inode cannot be unlocked once it is a
> * part of the transaction.
> *
> - * Start with zeroing any data beyond EOF that we may expose on file
> - * extension, or zeroing out the rest of the block on a downward
> - * truncate.
> - */
> - if (newsize > oldsize) {
> - trace_xfs_zero_eof(ip, oldsize, newsize - oldsize);
> - error = xfs_zero_range(ip, oldsize, newsize - oldsize,
> - &did_zeroing);
> - } else if (newsize != oldsize) {
> - error = xfs_truncate_page(ip, newsize, &did_zeroing);
> - }
> -
> - if (error)
> - return error;
> -
> - /*
> - * We've already locked out new page faults, so now we can safely remove
> - * pages from the page cache knowing they won't get refaulted until we
> - * drop the XFS_MMAP_EXCL lock after the extent manipulations are
> - * complete. The truncate_setsize() call also cleans partial EOF page
> - * PTEs on extending truncates and hence ensures sub-page block size
> - * filesystems are correctly handled, too.
> - *
> - * We have to do all the page cache truncate work outside the
> + * And we have to do all the page cache truncate work outside the
> * transaction context as the "lock" order is page lock->log space
> * reservation as defined by extent allocation in the writeback path.
> * Hence a truncate can fail with ENOMEM from xfs_trans_alloc(), but
......
Lots of new logic for zeroing here. That makes xfs_setattr_size()
even longer than it already is. Can you lift this EOF zeroing logic
into it's own helper function so that it is clear that it is a
completely independent operation to the actual transaction that
changes the inode size. That would also allow the operations to be
broken up into:
if (newsize >= oldsize) {
/* do the simple stuff */
....
return error;
}
/* do the complex size reduction stuff without additional indenting */
-Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2024-06-02 22:46 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-29 9:51 [RFC PATCH v4 0/8] iomap/xfs: fix stale data exposure when truncating realtime inodes Zhang Yi
2024-05-29 9:51 ` [RFC PATCH v4 1/8] iomap: zeroing needs to be pagecache aware Zhang Yi
2024-05-31 13:11 ` Christoph Hellwig
2024-05-31 14:03 ` Darrick J. Wong
2024-05-31 14:05 ` Christoph Hellwig
2024-05-31 15:44 ` Brian Foster
2024-05-31 15:43 ` Brian Foster
2024-06-02 22:22 ` Dave Chinner
2024-06-02 11:04 ` Brian Foster
2024-06-03 9:07 ` Zhang Yi
2024-06-03 14:37 ` Brian Foster
2024-06-04 23:38 ` Dave Chinner
2024-05-29 9:52 ` [RFC PATCH v4 2/8] math64: add rem_u64() to just return the remainder Zhang Yi
2024-05-31 12:35 ` Christoph Hellwig
2024-05-31 14:04 ` Darrick J. Wong
2024-05-29 9:52 ` [RFC PATCH v4 3/8] iomap: pass blocksize to iomap_truncate_page() Zhang Yi
2024-05-31 12:39 ` Christoph Hellwig
2024-06-02 11:16 ` Brian Foster
2024-06-03 13:23 ` Zhang Yi
2024-05-29 9:52 ` [RFC PATCH v4 4/8] fsdax: pass blocksize to dax_truncate_page() Zhang Yi
2024-05-29 9:52 ` [RFC PATCH v4 5/8] xfs: refactor the truncating order Zhang Yi
2024-05-31 13:31 ` Christoph Hellwig
2024-05-31 15:27 ` Darrick J. Wong
2024-05-31 16:17 ` Christoph Hellwig
2024-06-03 13:51 ` Zhang Yi
2024-05-31 15:44 ` Darrick J. Wong
2024-06-03 14:15 ` Zhang Yi
2024-06-02 22:46 ` Dave Chinner [this message]
2024-06-03 14:18 ` Zhang Yi
2024-05-29 9:52 ` [RFC PATCH v4 6/8] xfs: correct the truncate blocksize of realtime inode Zhang Yi
2024-05-31 13:36 ` Christoph Hellwig
2024-06-03 14:35 ` Zhang Yi
2024-05-29 9:52 ` [RFC PATCH v4 7/8] xfs: reserve blocks for truncating " Zhang Yi
2024-05-31 12:42 ` Christoph Hellwig
2024-05-31 14:10 ` Darrick J. Wong
2024-05-31 14:13 ` Christoph Hellwig
2024-05-31 15:29 ` Darrick J. Wong
2024-05-31 16:17 ` Christoph Hellwig
2024-05-29 9:52 ` [RFC PATCH v4 8/8] xfs: improve truncate on a realtime inode with huge extsize Zhang Yi
2024-05-31 13:46 ` Christoph Hellwig
2024-05-31 14:12 ` Darrick J. Wong
2024-05-31 14:15 ` Christoph Hellwig
2024-05-31 15:00 ` Darrick J. Wong
2024-06-04 7:09 ` Zhang Yi
2024-05-31 12:26 ` [RFC PATCH v4 0/8] iomap/xfs: fix stale data exposure when truncating realtime inodes Christoph Hellwig
2024-06-01 7:38 ` Zhang Yi
2024-06-01 7:40 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zlz2UCS4jqQO3Vm6@dread.disaster.area \
--to=david@fromorbit.com \
--cc=brauner@kernel.org \
--cc=chandanbabu@kernel.org \
--cc=chengzhihao1@huawei.com \
--cc=djwong@kernel.org \
--cc=hch@infradead.org \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=willy@infradead.org \
--cc=yi.zhang@huawei.com \
--cc=yi.zhang@huaweicloud.com \
--cc=yukuai3@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox