From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Eric Sandeen <sandeen@sandeen.net>,
Dave Chinner <david@fromorbit.com>,
linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/6] xfs: verify extent size hint is valid in inode verifier
Date: Mon, 20 Aug 2018 11:59:18 -0400
Message-ID: <20180820155918.GB9568@bfoster>
In-Reply-To: <20180820153626.GB4334@magnolia>

On Mon, Aug 20, 2018 at 08:36:26AM -0700, Darrick J. Wong wrote:
> On Mon, Aug 20, 2018 at 10:27:42AM -0500, Eric Sandeen wrote:
> >
> >
> > On 8/20/18 10:06 AM, Brian Foster wrote:
> > > On Tue, Jul 24, 2018 at 09:43:46AM -0700, Darrick J. Wong wrote:
> > >> On Mon, Jul 23, 2018 at 11:39:53PM -0700, Eric Sandeen wrote:
> > >>> On 6/4/18 11:24 PM, Dave Chinner wrote:
> > >>>> From: Dave Chinner <dchinner@redhat.com>
> > >>>>
> > >>>> There are rules for valid extent size hints. We enforce them when
> > >>>> applications set them, but fuzzers violate those rules and that
> > >>>> screws us over.
> > >>>>
> > >>>> This results in alignment assertion failures when setting up
> > >>>> allocations such as this in direct IO:
> > >>>>
> > >>>> XFS: Assertion failed: ap->length, file: fs/xfs/libxfs/xfs_bmap.c, line: 3432
> > >>>> ....
> > >>>> Call Trace:
> > >>>> xfs_bmap_btalloc+0x415/0x910
> > >>>> xfs_bmapi_write+0x71c/0x12e0
> > >>>> xfs_iomap_write_direct+0x2a9/0x420
> > >>>> xfs_file_iomap_begin+0x4dc/0xa70
> > >>>> iomap_apply+0x43/0x100
> > >>>> iomap_file_buffered_write+0x62/0x90
> > >>>> xfs_file_buffered_aio_write+0xba/0x300
> > >>>> __vfs_write+0xd5/0x150
> > >>>> vfs_write+0xb6/0x180
> > >>>> ksys_write+0x45/0xa0
> > >>>> do_syscall_64+0x5a/0x180
> > >>>> entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > >>>>
> > >>>> And from xfs_db:
> > >>>>
> > >>>> core.extsize = 10380288
> > >>>>
> > >>>> Which is not an integer multiple of the block size, and so violates
> > >>>> Rule #7 for setting extent size hints. Validate extent size hint
> > >>>> rules in the inode verifier to catch this.
> > >>>
> > >>> So, I think that if I do:
> > >>>
> > >>> # mkfs.xfs -f -m crc=0 $TEST_DEV
> > >>> # ./check xfs/229
> > >>> # ./check xfs/229
> > >>>
> > >>> I trip the verifier, because I end up with freed inodes on disk with an
> > >>> extent size hint but zeroed flags.
> > >>>
> > >>> xfs_ifree sets di_flags = 0 but doesn't clear di_extsize; xfs_inode_validate_extsize
> > >>> says if extsize != 0 and the hint flag is not set, it fails
> > >>>
> > >>> Anyone else see this?
> > >>
> > >> Yeah, I think I just hit this on the TEST_DEV in xfs/242.
> > >>
> > >> git blame says I lifted the code from the scrub code, and I probably
> > >> wrote the code having read the ioctl code (which clears the extsize
> > >> field if the iflag isn't set).
> > >>
> > >>> (crc=0 needed because that causes us to actually reread the inode chunks
> > >>> in xfs_iread vs. /* shortcut IO on inode allocation if possible */)
> > >>
> > >> Hmmm, so a v5 fs mounted with ikeep will also read an inode chunk when
> > >> creating an inode. It looks like we do that (instead of zeroing the
> > >> incore inode and setting a random i_generation) to preserve the existing
> > >> generation number?
> > >>
> > >> In any case, it's pretty clear that kernels have been writing out freed
> > >> inode cores with di_mode == 0, di_flags == 0, and di_extsize == (some
> > >> number) so we clearly can't have that in the verifier. It looks like we
> > >> only examine di_extsize if either EXTSZ flag is set, so it's not
> > >> causing incorrect behavior. Maybe it can be a preening fix in
> > >> scrub/repair.
> > >>
> > >
> > > I just stumbled on this problem with xfs/229 that Eric reported. I'm
> > > confused by the comment above regarding this not causing incorrect
> > > behavior.
> >
> > I think Darrick meant that having a nonzero extent size hint on disk
> > won't cause incorrect behavior because "we only examine di_extsize if
> > either EXTSZ flag is set"
>
> Yeah, he probably did. :)
>
Got it, thanks.
> I think Brian's suggestion of
>
> if (i_mode != 0 && !hint && extsize != 0)
> barf_error();
>
> sounds reasonable (having not tested that at all).
>
I'll run it through xfstests and get it posted if nothing else fails.

BTW, do we have a similar issue with the cowextsize hint (assuming
v5+ikeep)? It looks like it's cleared similarly in xfs_ialloc(), but I'm
not sure if it's cleared somewhere else on free...

Brian

> --D
>
> > -Eric
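
For illustration, the check proposed above can be sketched as a small
standalone C function. This is a minimal sketch, not the kernel's
verifier: the function name, the flag macros and their bit values are
hypothetical stand-ins, and only the one condition discussed in this
thread (nonzero mode, no hint flag, nonzero extsize) is modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two extent size hint flag bits.
 * The bit values are illustrative only, not the on-disk values. */
#define SKETCH_FLAG_EXTSIZE      (1u << 0) /* per-file extent size hint */
#define SKETCH_FLAG_EXTSZINHERIT (1u << 1) /* inheritable hint on a directory */

/*
 * Return true if this (mode, flags, extsize) combination should be
 * flagged by the check suggested in the thread.
 *
 * A freed inode (mode == 0) is skipped on purpose: kernels have written
 * out freed inode cores with zeroed flags and a leftover extsize, so a
 * stale extsize there cannot be treated as corruption.
 */
static bool sketch_extsize_rejects(uint16_t mode, uint16_t flags,
                                   uint32_t extsize)
{
    bool hint = (flags & (SKETCH_FLAG_EXTSIZE |
                          SKETCH_FLAG_EXTSZINHERIT)) != 0;

    if (mode == 0)
        return false; /* freed inode: leftover extsize is tolerated */

    /* In-use inode with an extsize value but no hint flag set. */
    return !hint && extsize != 0;
}
```

With this shape, a freed inode carrying the leftover value from the
original report (mode 0, flags 0, extsize 10380288) passes, while an
in-use inode with a nonzero extsize and neither hint flag is rejected.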