public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: prevent multi-fsb dir readahead from reading random blocks
Date: Thu, 20 Apr 2017 10:17:29 -0700	[thread overview]
Message-ID: <20170420171729.GC23371@birch.djwong.org> (raw)
In-Reply-To: <1492694944-46582-1-git-send-email-bfoster@redhat.com>

On Thu, Apr 20, 2017 at 09:29:04AM -0400, Brian Foster wrote:
> Directory block readahead uses a complex iteration mechanism to map
> between high-level directory blocks and underlying physical extents.
> This mechanism attempts to traverse the higher-level dir blocks in a
> manner that handles multi-fsb directory blocks and simultaneously
> maintains a reference to the corresponding physical blocks.
> 
> This logic doesn't handle certain (discontiguous) physical extent
> layouts correctly with multi-fsb directory blocks. For example,
> consider the case of a 4k FSB filesystem with a 2 FSB (8k) directory
> block size and a directory with the following extent layout:
> 
>  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
>    0: [0..7]:          88..95            0 (88..95)             8
>    1: [8..15]:         80..87            0 (80..87)             8
>    2: [16..39]:        168..191          0 (168..191)          24
>    3: [40..63]:        5242952..5242975  1 (72..95)            24
> 
> Directory block 0 spans physical extents 0 and 1, dirblk 1 lies
> entirely within extent 2 and dirblk 2 spans extents 2 and 3. Because
> extent 2 is larger than the directory block size, the readahead code
> erroneously assumes the block is contiguous and issues a readahead
> based on the physical mapping of the first fsb of the dirblk. This
> results in read verifier failure and a spurious corruption or crc
> failure, depending on the filesystem format.
> 
> Further, the subsequent readahead code responsible for walking
> through the physical table doesn't correctly advance the physical
> block reference for dirblk 2. Instead of advancing two physical
> filesystem blocks, the first iteration of the loop advances 1 block
> (correctly), but the subsequent iteration advances 2 more physical
> blocks because the next physical extent (extent 3, above) happens to
> cover more than dirblk 2. At this point, the higher-level directory
> block walking is completely off the rails of the actual physical
> layout of the directory for the respective mapping table.
> 
> Update the contiguous dirblock logic to consider the current offset
> in the physical extent to avoid issuing directory readahead to
> unrelated blocks. Also, update the mapping table advancing code to
> consider the current offset within the current dirblock to avoid
> advancing the mapping reference too far beyond the dirblock.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> 
> This so far survives a couple xfstests runs with default and 2xfsb
> directory block sizes. I have another run in progress with max sized
> directory blocks. This applies on top of Eric's original overrun fix.

Looks ok, will test...

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> 
> Brian
> 
>  fs/xfs/xfs_dir2_readdir.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
> index c45de72..20b7a5c 100644
> --- a/fs/xfs/xfs_dir2_readdir.c
> +++ b/fs/xfs/xfs_dir2_readdir.c
> @@ -405,7 +405,8 @@ xfs_dir2_leaf_readbuf(
>  		 * Read-ahead a contiguous directory block.
>  		 */
>  		if (i > mip->ra_current &&
> -		    map[mip->ra_index].br_blockcount >= geo->fsbcount) {
> +		    (map[mip->ra_index].br_blockcount - mip->ra_offset) >=
> +		    geo->fsbcount) {
>  			xfs_dir3_data_readahead(dp,
>  				map[mip->ra_index].br_startoff + mip->ra_offset,
>  				XFS_FSB_TO_DADDR(dp->i_mount,
> @@ -438,7 +439,7 @@ xfs_dir2_leaf_readbuf(
>  			 * The rest of this extent but not more than a dir
>  			 * block.
>  			 */
> -			length = min_t(int, geo->fsbcount,
> +			length = min_t(int, geo->fsbcount - j,
>  					map[mip->ra_index].br_blockcount -
>  							mip->ra_offset);
>  			mip->ra_offset += length;
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

      reply	other threads:[~2017-04-20 17:17 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-20 13:29 [PATCH] xfs: prevent multi-fsb dir readahead from reading random blocks Brian Foster
2017-04-20 17:17 ` Darrick J. Wong [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170420171729.GC23371@birch.djwong.org \
    --to=darrick.wong@oracle.com \
    --cc=bfoster@redhat.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox