From: Jens Axboe <jaxboe@fusionio.com>
To: Theodore Tso <tytso@MIT.EDU>
Cc: Dave Chinner <david@fromorbit.com>,
Markus Trippelsdorf <markus@trippelsdorf.de>,
Linus Torvalds <torvalds@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Chris Mason <chris.mason@oracle.com>
Subject: Re: [GIT PULL] Core block IO bits for 2.6.39 - early Oops
Date: Fri, 25 Mar 2011 13:14:36 +0100
Message-ID: <4D8C872C.1030805@fusionio.com>
In-Reply-To: <91CCAB14-F9CC-4676-94C3-FBCDD0663FD5@mit.edu>

On 2011-03-25 12:59, Theodore Tso wrote:
>
> On Mar 25, 2011, at 12:41 AM, Dave Chinner wrote:
>
>>>
>>> It works insofar as the Oops is gone. But my xfs partitions apparently
>>> still get corrupted (I had to run xfs_repair on several of them, because
>>> they would not mount otherwise).
>>
>> So the patchset is causing repeatable filesystem corruption? Sounds
>> to me like this series is not yet ready for mainline merging. Last
>> thing I want to spend the .39 cycle helping people recover busted
>> filesystems as a result of undercooked block layer changes...
>
> FYI. I did a trial merge last night of the ext4 changes with
> the tip of Linus's tree. The ext4 changes (based on 2.6.38-rc5)
> survived xfstests -g auto before I merged in Linus's 2.6.39 master
> branch. After I merged with 2.6.39-tip, I reran xfstests, and it got
> past test #13 (fsstress), which normally means that everything is
> OK, so I sent a pull request to Linus. Much later (-g auto takes a
> long time), I got an OOPS inside the virtio driver. Ext4 was nowhere
> in the stack trace, but of course the block layer was. Grumbling
> that someone had broken virtio during the merge window, I switched
> my KVM setup to use SATA emulation and used the sda devices
> instead. This time I got an oops in the block I/O layer, again quite
> late in xfstests. Somewhere around test #224 or so if I remember
> correctly.
>
> It was too late last night to do any more investigating, which is why
> I hadn't sent a formal report yet, but next up is for me to retry xfstests
> before merging in my changes, and then to start a git bisect.
>
> So before accusing some patch series which hasn't been merged
> into 2.6.39 yet, you might want to also worry about some change
> that already has been merged. Of course the symptoms for me are
> quite different. I'm not seeing an early oops, but only something
> which shows up when the system is put under a lot of stress
> by xfstests. So it could be a different problem....
>
> - Ted
>
> P.S. And of course there is the chance that there is some
> subtle bug in the ext4 branch, which worked just fine when
> it was just based on 2.6.38-rc5, but which only manifested
> itself when I merged in the tip of Linus's branch. So I'm not
> __accusing__ the block layer yet, even though the stack traces
> seem to point that way, because I don't have a smoking gun
> yet. But I do have to admit I'm suspicious....

But this plugging change is merged, so it is a very likely candidate.
With the oddness going on, I suspect that we end up flushing a plug that
resides on a stack that is no longer valid.

Is there a way to check whether a given pointer is valid on the current
stack for this process?

I think we can rule out stack overflows, since the plug context itself
is very small (28 bytes). But if we have something like:

    blk_start_plug(&plug1);
    ...
    blk_start_plug(&plug2);
    ...
    flush(&plug2);

then that could explain the corruption and lockups.

So I'd really like to have something ala:

    if (is_str_ptr_valid(current, ptr, size))
        ...

to aid the debugging.
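
A minimal userspace sketch of the kind of check asked for above, assuming
the stack can be described by a base address and a size. The names
stack_region and stack_range_contains are hypothetical and are not kernel
APIs; real kernel code would have to derive the bounds from the task
itself, which this sketch does not attempt.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a stack area; not a kernel structure. */
struct stack_region {
	uintptr_t base;	/* lowest address of the stack area */
	size_t size;	/* size of the stack area in bytes */
};

/*
 * Return true if the object [ptr, ptr + len) lies entirely inside the
 * given region. Written with offsets so the arithmetic cannot wrap.
 */
static bool stack_range_contains(const struct stack_region *stk,
				 const void *ptr, size_t len)
{
	uintptr_t start = (uintptr_t)ptr;
	size_t off;

	if (start < stk->base)
		return false;		/* starts below the region */

	off = start - stk->base;
	if (off > stk->size)
		return false;		/* starts above the region */

	return len <= stk->size - off;	/* end must not run past the top */
}
```

With something along these lines, a debug check before flushing could warn
when the plug pointer (28 bytes in the case above) does not fall inside the
stack bounds of the current task.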
--
Jens Axboe