From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: prevent multi-fsb dir readahead from reading random blocks
Date: Thu, 20 Apr 2017 10:17:29 -0700 [thread overview]
Message-ID: <20170420171729.GC23371@birch.djwong.org> (raw)
In-Reply-To: <1492694944-46582-1-git-send-email-bfoster@redhat.com>
On Thu, Apr 20, 2017 at 09:29:04AM -0400, Brian Foster wrote:
> Directory block readahead uses a complex iteration mechanism to map
> between high-level directory blocks and underlying physical extents.
> This mechanism attempts to traverse the higher-level dir blocks in a
> manner that handles multi-fsb directory blocks and simultaneously
> maintains a reference to the corresponding physical blocks.
>
> This logic doesn't handle certain (discontiguous) physical extent
> layouts correctly with multi-fsb directory blocks. For example,
> consider the case of a 4k FSB filesystem with a 2 FSB (8k) directory
> block size and a directory with the following extent layout:
>
> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
> 0: [0..7]: 88..95 0 (88..95) 8
> 1: [8..15]: 80..87 0 (80..87) 8
> 2: [16..39]: 168..191 0 (168..191) 24
> 3: [40..63]: 5242952..5242975 1 (72..95) 24
>
> Directory block 0 spans physical extents 0 and 1, dirblk 1 lies
> entirely within extent 2 and dirblk 2 spans extents 2 and 3. Because
> extent 2 is larger than the directory block size, the readahead code
> erroneously assumes the block is contiguous and issues a readahead
> based on the physical mapping of the first fsb of the dirblk. This
> results in read verifier failure and a spurious corruption or crc
> failure, depending on the filesystem format.
>
> Further, the subsequent readahead code responsible for walking
> through the physical table doesn't correctly advance the physical
> block reference for dirblk 2. Instead of advancing two physical
> filesystem blocks, the first iteration of the loop advances 1 block
> (correctly), but the subsequent iteration advances 2 more physical
> blocks because the next physical extent (extent 3, above) happens to
> cover more than dirblk 2. At this point, the higher-level directory
> block walking is completely off the rails of the actual physical
> layout of the directory for the respective mapping table.
>
> Update the contiguous dirblock logic to consider the current offset
> in the physical extent to avoid issuing directory readahead to
> unrelated blocks. Also, update the mapping table advancing code to
> consider the current offset within the current dirblock to avoid
> advancing the mapping reference too far beyond the dirblock.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>
> This so far survives a couple xfstests runs with default and 2xfsb
> directory block sizes. I have another run in progress with max sized
> directory blocks. This applies on top of Eric's original overrun fix.
Looks ok, will test...
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
--D
>
> Brian
>
> fs/xfs/xfs_dir2_readdir.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
> index c45de72..20b7a5c 100644
> --- a/fs/xfs/xfs_dir2_readdir.c
> +++ b/fs/xfs/xfs_dir2_readdir.c
> @@ -405,7 +405,8 @@ xfs_dir2_leaf_readbuf(
> * Read-ahead a contiguous directory block.
> */
> if (i > mip->ra_current &&
> - map[mip->ra_index].br_blockcount >= geo->fsbcount) {
> + (map[mip->ra_index].br_blockcount - mip->ra_offset) >=
> + geo->fsbcount) {
> xfs_dir3_data_readahead(dp,
> map[mip->ra_index].br_startoff + mip->ra_offset,
> XFS_FSB_TO_DADDR(dp->i_mount,
> @@ -438,7 +439,7 @@ xfs_dir2_leaf_readbuf(
> * The rest of this extent but not more than a dir
> * block.
> */
> - length = min_t(int, geo->fsbcount,
> + length = min_t(int, geo->fsbcount - j,
> map[mip->ra_index].br_blockcount -
> mip->ra_offset);
> mip->ra_offset += length;
> --
> 2.7.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
prev parent reply other threads:[~2017-04-20 17:17 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-20 13:29 [PATCH] xfs: prevent multi-fsb dir readahead from reading random blocks Brian Foster
2017-04-20 17:17 ` Darrick J. Wong [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170420171729.GC23371@birch.djwong.org \
--to=darrick.wong@oracle.com \
--cc=bfoster@redhat.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox