Re: [PATCH] xfs: handle array index overrun in xfs_dir2_leaf_readbuf()

public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed

From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Eric Sandeen <sandeen@redhat.com>
Cc: Brian Foster <bfoster@redhat.com>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	Carlos Maiolino <cmaiolin@redhat.com>
Subject: Re: [PATCH] xfs: handle array index overrun in xfs_dir2_leaf_readbuf()
Date: Tue, 18 Apr 2017 17:29:27 -0700	[thread overview]
Message-ID: <20170419002927.GG5193@birch.djwong.org> (raw)
In-Reply-To: <12d5c25d-7e59-0361-f5e6-ec8bfb8b1fff@redhat.com>

On Tue, Apr 18, 2017 at 07:14:03PM -0500, Eric Sandeen wrote:
> On 4/18/17 7:09 PM, Darrick J. Wong wrote:
> > On Tue, Apr 18, 2017 at 12:55:44PM -0400, Brian Foster wrote:
> >> On Mon, Apr 17, 2017 at 05:12:36PM -0500, Eric Sandeen wrote:
> >>> On 4/17/17 3:57 PM, Brian Foster wrote:
> >>>> On Thu, Apr 13, 2017 at 01:45:43PM -0500, Eric Sandeen wrote:
> >> ...
> >>>>
> >>>> This fix seems fine to me, but I'm wondering if this code may have
> >>>> issues with other kinds of misalignment between the directory blocks and
> >>>> underlying bmap extents as well. For example, what happens if we end up
> >>>> with something like the following on an 8k dir fsb fs?
> >>>>
> >>>>  0:[0,xxx,3,0]
> >>>>  1:[3,xxx,1,0]
> >>>>
> >>>> ... or ...
> >>>>
> >>>>  0:[0,xxx,3,0]
> >>>>  1:[3,xxx,3,0]
> >>>
> >>> Well, as far as that goes it won't be an issue; for 8k dir block sizes
> >>> we will allocate an extent map with room for 10 extents, and we'll go
> >>> well beyond the above extents which cross directory block boundaries.
> >>>
> >>>>  ...
> >>>>  N:[...]
> >>>>
> >>>> Am I following correctly that we may end up assuming the wrong mapping
> >>>> for the second dir fsb and/or possibly skipping blocks?
> >>>
> >>> As far as I can tell, this code is only managing the read-ahead state
> >>> by looking at these cached extents. We keep track of our position within
> >>> that allocated array of mappings - this bug just stepped off the end
> >>> while doing so.
> >>>
> >>> Stopping at the correct point should keep all of the state consistent
> >>> and correct.
> >>>
> >>> But yeah, it's kind of hairy & hard to read, IMHO.
> >>>
> >>> Also as far as I can tell, we handle such discontiguities correctly,
> >>> other than the bug I found.  If you see something that looks suspicious,
> >>> I'm sure I could tweak my test case to craft a specific situation if
> >>> there's something you'd like to see tested...
> >>>
> >>
> >> Background: Eric and I chatted a bit on irc to rectify that what I'm
> >> calling out above is a different issue from what is fixed by this patch.
> >>
> >> Eric,
> >>
> >> I managed to construct a directory that looks like this:
> >>
> >>  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
> >>    0: [0..7]:          88..95            0 (88..95)             8
> >>    1: [8..15]:         80..87            0 (80..87)             8
> >>    2: [16..39]:        168..191          0 (168..191)          24
> >>    3: [40..63]:        5242952..5242975  1 (72..95)            24
> >>
> >> The fs has 8k directory fsbs. Dir fsb offset 0 spans extents 0 and 1,
> >> offset 1 (which corresponds to the 512b range 16-31 above) is covered
> >> completely by extent 2 and dir offset 2 (range 32-47) spans extents 2
> >> and 3. An ls of this directory produces this:
> >>
> >>  XFS (dm-3): Metadata corruption detected at xfs_dir3_data_reada_verify+0x42/0x80 [xfs], xfs_dir3_data_reada block 0x500058
> >>  XFS (dm-3): Unmount and run xfs_repair
> >>  XFS (dm-3): First 64 bytes of corrupted metadata buffer:
> >>  ffffbcb901c44000: 00 00 00 00 00 00 00 64 0f 78 78 78 78 78 78 78  .......d.xxxxxxx
> >>  ffffbcb901c44010: 78 78 78 78 2e 38 38 36 01 00 00 00 00 00 10 00  xxxx.886........
> >>  ffffbcb901c44020: 00 00 00 00 00 00 00 64 0f 78 78 78 78 78 78 78  .......d.xxxxxxx
> >>  ffffbcb901c44030: 78 78 78 78 2e 38 38 37 01 00 00 00 00 00 10 20  xxxx.887....... 
> >>
> >> ... which is yelling about block 184 (dir fsb 2). The fs is otherwise
> >> clean according to xfs_repair. 
> >>
> >> I _think_ something like the appended diff deals with it, but this is
> >> lightly tested only and could definitely use more eyes.
> >>
> >> Brian
> >>
> >> --- 8< ---
> >>
> >> diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
> >> index ad9396e..9fa379d 100644
> >> --- a/fs/xfs/xfs_dir2_readdir.c
> >> +++ b/fs/xfs/xfs_dir2_readdir.c
> >> @@ -404,7 +404,8 @@ xfs_dir2_leaf_readbuf(
> >>  		 * Read-ahead a contiguous directory block.
> >>  		 */
> >>  		if (i > mip->ra_current &&
> >> -		    map[mip->ra_index].br_blockcount >= geo->fsbcount) {
> >> +		    (map[mip->ra_index].br_blockcount - mip->ra_offset) >=
> >> +		     geo->fsbcount) {
> >>  			xfs_dir3_data_readahead(dp,
> >>  				map[mip->ra_index].br_startoff + mip->ra_offset,
> >>  				XFS_FSB_TO_DADDR(dp->i_mount,
> >> @@ -432,7 +433,7 @@ xfs_dir2_leaf_readbuf(
> >>  			 * The rest of this extent but not more than a dir
> >>  			 * block.
> >>  			 */
> >> -			length = min_t(int, geo->fsbcount,
> >> +			length = min_t(int, geo->fsbcount - j,
> > 
> > Looks ok to me to make Eric's bugfix complete.
> 
> Brian, thanks for crafting the image to expose this.  :)
> 
> I've been otherwise occupied today, sorry - technically this fixes a separate
> issue, yes? So 2 patches, 2 bugfixes AFAICT?

Ok.  I don't mind taking two patches since you're right, the first patch
saves us from running off the end of the shadow bmap array, and the
second one fixes bookkeepping problems when a dir extent starts midway
through a large dirblock.

--D

> 
> Thanks,
> -Eric
> 
> > I will, however, post a cleanup patch to remove the persistent shadow
> > bmap and have readahead issued directly off the inode fork contents.
> > 
> > --D
> > 
> >>  					map[mip->ra_index].br_blockcount -
> >>  							mip->ra_offset);
> >>  			mip->ra_offset += length;
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

next prev parent reply	other threads:[~2017-04-19  0:29 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-13 18:45 [PATCH] xfs: handle array index overrun in xfs_dir2_leaf_readbuf() Eric Sandeen
2017-04-13 19:09 ` Carlos Maiolino
2017-04-13 22:12 ` Darrick J. Wong
2017-04-13 22:24 ` Bill O'Donnell
2017-04-17 20:57 ` Brian Foster
2017-04-17 22:12   ` Eric Sandeen
2017-04-18 16:55     ` Brian Foster
2017-04-18 17:00       ` Brian Foster
2017-04-19  0:09       ` Darrick J. Wong
2017-04-19  0:14         ` Eric Sandeen
2017-04-19  0:29           ` Darrick J. Wong [this message]
2017-04-19 12:32             ` Brian Foster
2017-04-19 14:40               ` Eric Sandeen
2017-04-19 16:08                 ` Darrick J. Wong
2017-04-19 18:01                   ` Brian Foster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170419002927.GG5193@birch.djwong.org \
    --to=darrick.wong@oracle.com \
    --cc=bfoster@redhat.com \
    --cc=cmaiolin@redhat.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=sandeen@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox