From: Eric Sandeen <sandeen@redhat.com>
To: Brian Foster <bfoster@redhat.com>,
"Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs <linux-xfs@vger.kernel.org>,
Carlos Maiolino <cmaiolin@redhat.com>
Subject: Re: [PATCH] xfs: handle array index overrun in xfs_dir2_leaf_readbuf()
Date: Wed, 19 Apr 2017 09:40:20 -0500 [thread overview]
Message-ID: <e6315a52-e8ee-c238-bf8f-66e76af9a68b@redhat.com> (raw)
In-Reply-To: <20170419123254.GB40497@bfoster.bfoster>
On 4/19/17 7:32 AM, Brian Foster wrote:
> On Tue, Apr 18, 2017 at 05:29:27PM -0700, Darrick J. Wong wrote:
>> On Tue, Apr 18, 2017 at 07:14:03PM -0500, Eric Sandeen wrote:
>>> On 4/18/17 7:09 PM, Darrick J. Wong wrote:
>>>> On Tue, Apr 18, 2017 at 12:55:44PM -0400, Brian Foster wrote:
>>>>> On Mon, Apr 17, 2017 at 05:12:36PM -0500, Eric Sandeen wrote:
>>>>>> On 4/17/17 3:57 PM, Brian Foster wrote:
>>>>>>> On Thu, Apr 13, 2017 at 01:45:43PM -0500, Eric Sandeen wrote:
>>>>> ...
>>>>>>>
>>>>>>> This fix seems fine to me, but I'm wondering if this code may have
>>>>>>> issues with other kinds of misalignment between the directory blocks and
>>>>>>> underlying bmap extents as well. For example, what happens if we end up
>>>>>>> with something like the following on an 8k dir fsb fs?
>>>>>>>
>>>>>>> 0:[0,xxx,3,0]
>>>>>>> 1:[3,xxx,1,0]
>>>>>>>
>>>>>>> ... or ...
>>>>>>>
>>>>>>> 0:[0,xxx,3,0]
>>>>>>> 1:[3,xxx,3,0]
>>>>>>
>>>>>> Well, as far as that goes it won't be an issue; for 8k dir block sizes
>>>>>> we will allocate an extent map with room for 10 extents, and we'll go
>>>>>> well beyond the above extents which cross directory block boundaries.
>>>>>>
>>>>>>> ...
>>>>>>> N:[...]
>>>>>>>
>>>>>>> Am I following correctly that we may end up assuming the wrong mapping
>>>>>>> for the second dir fsb and/or possibly skipping blocks?
>>>>>>
>>>>>> As far as I can tell, this code is only managing the read-ahead state
>>>>>> by looking at these cached extents. We keep track of our position within
>>>>>> that allocated array of mappings - this bug just stepped off the end
>>>>>> while doing so.
>>>>>>
>>>>>> Stopping at the correct point should keep all of the state consistent
>>>>>> and correct.
>>>>>>
>>>>>> But yeah, it's kind of hairy & hard to read, IMHO.
>>>>>>
>>>>>> Also as far as I can tell, we handle such discontiguities correctly,
>>>>>> other than the bug I found. If you see something that looks suspicious,
>>>>>> I'm sure I could tweak my test case to craft a specific situation if
>>>>>> there's something you'd like to see tested...
>>>>>>
>>>>>
>>>>> Background: Eric and I chatted a bit on irc to rectify that what I'm
>>>>> calling out above is a different issue from what is fixed by this patch.
>>>>>
>>>>> Eric,
>>>>>
>>>>> I managed to construct a directory that looks like this:
>>>>>
>>>>> EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
>>>>> 0: [0..7]: 88..95 0 (88..95) 8
>>>>> 1: [8..15]: 80..87 0 (80..87) 8
>>>>> 2: [16..39]: 168..191 0 (168..191) 24
>>>>> 3: [40..63]: 5242952..5242975 1 (72..95) 24
>>>>>
>>>>> The fs has 8k directory fsbs. Dir fsb offset 0 spans extents 0 and 1,
>>>>> offset 1 (which corresponds to the 512b range 16-31 above) is covered
>>>>> completely by extent 2 and dir offset 2 (range 32-47) spans extents 2
>>>>> and 3. An ls of this directory produces this:
>>>>>
>>>>> XFS (dm-3): Metadata corruption detected at xfs_dir3_data_reada_verify+0x42/0x80 [xfs], xfs_dir3_data_reada block 0x500058
>>>>> XFS (dm-3): Unmount and run xfs_repair
>>>>> XFS (dm-3): First 64 bytes of corrupted metadata buffer:
>>>>> ffffbcb901c44000: 00 00 00 00 00 00 00 64 0f 78 78 78 78 78 78 78 .......d.xxxxxxx
>>>>> ffffbcb901c44010: 78 78 78 78 2e 38 38 36 01 00 00 00 00 00 10 00 xxxx.886........
>>>>> ffffbcb901c44020: 00 00 00 00 00 00 00 64 0f 78 78 78 78 78 78 78 .......d.xxxxxxx
>>>>> ffffbcb901c44030: 78 78 78 78 2e 38 38 37 01 00 00 00 00 00 10 20 xxxx.887.......
>>>>>
>>>>> ... which is yelling about block 184 (dir fsb 2). The fs is otherwise
>>>>> clean according to xfs_repair.
>>>>>
>>>>> I _think_ something like the appended diff deals with it, but this is
>>>>> lightly tested only and could definitely use more eyes.
>>>>>
>>>>> Brian
>>>>>
>>>>> --- 8< ---
>>>>>
>>>>> diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c
>>>>> index ad9396e..9fa379d 100644
>>>>> --- a/fs/xfs/xfs_dir2_readdir.c
>>>>> +++ b/fs/xfs/xfs_dir2_readdir.c
>>>>> @@ -404,7 +404,8 @@ xfs_dir2_leaf_readbuf(
>>>>> * Read-ahead a contiguous directory block.
>>>>> */
>>>>> if (i > mip->ra_current &&
>>>>> - map[mip->ra_index].br_blockcount >= geo->fsbcount) {
>>>>> + (map[mip->ra_index].br_blockcount - mip->ra_offset) >=
>>>>> + geo->fsbcount) {
>>>>> xfs_dir3_data_readahead(dp,
>>>>> map[mip->ra_index].br_startoff + mip->ra_offset,
>>>>> XFS_FSB_TO_DADDR(dp->i_mount,
>>>>> @@ -432,7 +433,7 @@ xfs_dir2_leaf_readbuf(
>>>>> * The rest of this extent but not more than a dir
>>>>> * block.
>>>>> */
>>>>> - length = min_t(int, geo->fsbcount,
>>>>> + length = min_t(int, geo->fsbcount - j,
>>>>
>>>> Looks ok to me to make Eric's bugfix complete.
>>>
>>> Brian, thanks for crafting the image to expose this. :)
>>>
>>> I've been otherwise occupied today, sorry - technically this fixes a separate
>>> issue, yes? So 2 patches, 2 bugfixes AFAICT?
>>
>> Ok. I don't mind taking two patches since you're right, the first patch
>> saves us from running off the end of the shadow bmap array, and the
>> second one fixes bookkeepping problems when a dir extent starts midway
>> through a large dirblock.
>>
>
> Yeah, these are definitely separate issues and Eric's original patch
> here is probably higher priority on its own.
>
> But I'm not clear on the plan here.. do we want this additional patch or
> are we planning just to rewrite the readahead bits (or do we want to do
> both)? I guess having a bugfix patch in the tree is useful from a stable
> perspective... hm?
>
> Brian
Yeah.
There's 20 years of the old code in the field with these bugs - I think 2
straightforward fixes have value even if we might want to rewrite it in the
future.
-Eric
next prev parent reply other threads:[~2017-04-19 14:40 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-13 18:45 [PATCH] xfs: handle array index overrun in xfs_dir2_leaf_readbuf() Eric Sandeen
2017-04-13 19:09 ` Carlos Maiolino
2017-04-13 22:12 ` Darrick J. Wong
2017-04-13 22:24 ` Bill O'Donnell
2017-04-17 20:57 ` Brian Foster
2017-04-17 22:12 ` Eric Sandeen
2017-04-18 16:55 ` Brian Foster
2017-04-18 17:00 ` Brian Foster
2017-04-19 0:09 ` Darrick J. Wong
2017-04-19 0:14 ` Eric Sandeen
2017-04-19 0:29 ` Darrick J. Wong
2017-04-19 12:32 ` Brian Foster
2017-04-19 14:40 ` Eric Sandeen [this message]
2017-04-19 16:08 ` Darrick J. Wong
2017-04-19 18:01 ` Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e6315a52-e8ee-c238-bf8f-66e76af9a68b@redhat.com \
--to=sandeen@redhat.com \
--cc=bfoster@redhat.com \
--cc=cmaiolin@redhat.com \
--cc=darrick.wong@oracle.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox