From: Mike Snitzer <snitzer@redhat.com>
To: Matt <jackdachef@gmail.com>
Cc: Milan Broz <mbroz@redhat.com>, Andi Kleen <andi@firstfloor.org>,
linux-btrfs <linux-btrfs@vger.kernel.org>,
dm-devel <dm-devel@redhat.com>,
Linux Kernel <linux-kernel@vger.kernel.org>,
htd <htd@fancy-poultry.org>, Chris Mason <chris.mason@oracle.com>,
htejun@gmail.com
Subject: Re: dm-crypt barrier support is effective
Date: Wed, 1 Dec 2010 11:52:29 -0500
Message-ID: <20101201165229.GC13415@redhat.com>
In-Reply-To: <AANLkTik_p0yxQBQ3ehHWYPLfiqxGbSNUT5XUk9dC5-x6@mail.gmail.com>
On Wed, Dec 01 2010 at 11:05am -0500,
Matt <jackdachef@gmail.com> wrote:
> On Mon, Nov 15, 2010 at 12:24 AM, Matt <jackdachef@gmail.com> wrote:
> > On Sun, Nov 14, 2010 at 10:54 PM, Milan Broz <mbroz@redhat.com> wrote:
> >> On 11/14/2010 10:49 PM, Matt wrote:
> >>> only with the dm-crypt scaling patch I could observe the data-corruption
> >>
> >> even with v5 I sent on Friday?
> >>
> >> Are you sure that it is not related to some fs problem in 2.6.37-rc1?
> >>
> >> If it works on 2.6.36 without problems, it is probably problems somewhere
> >> else (flush/fua conversion was trivial here - DM is still doing full flush
> >> and there are no other changes in code IMHO.)
> >>
> >> Milan
> >>
> >
> > Hi Milan,
> >
> > I'm aware of your new v5 patch (which should include several
> > improvements (or potential fixes in my case) over the v3 patch)
> >
> > as I already wrote my schedule unfortunately currently doesn't allow
> > me to test it
> >
> > * in the case of no corruption it would be nice to have 2.6.37-rc* running :)
> >
> > * in the case of data corruption that would mean restoring my system -
> > since it's my production box and right now I don't have a fallback at
> > reach
> > at earliest I could give it a shot at the beginning of December. Then
> > I could also test reiserfs and ext4 as a system partition to rule out
> > that it's
> > a ext4-specific thing (currently I'm running reiserfs on my system-partition).
> >
> > Thanks !
> >
> > Matt
> >
>
>
> OK guys,
>
> I've updated my system to latest glibc 2.12.1-r3 (on gentoo) and gcc
> hardened 4.5.1-r1 with 1.4 patchset which also uses pie (that one
> should fix problems with graphite)
>
> not much system changes besides that,
>
> with those it worked fine with 2.6.36 and I couldn't observe any
> filesystem corruption
So dm-crypt cpu scalability v5 with 2.6.36 worked fine.
> the bad news is: I'm again seeing corruption (!) [on ext4, on the /
> (root) partition]:
...
> ===> so the No.1 trigger of this kind of corruption where files are
> empty, missing or the content gets corrupted (at least for me) is
> compiling software which is part of the system (e.g. emerge -e
> system);
>
> the system is Gentoo ~amd64; with binutils 2.20.51.0.12 (afaik this
> one has changed from 2.20.51.0.10 to 2.20.51.0.12 from my last
> report); gcc 4.5.1 (Gentoo Hardened 4.5.1-r1 p1.4, pie-0.4.5) <--
> works fine with 2.6.36 and 2.6.36.1
>
> I'm not sure whether benchmarks would have the same "impact"
Seems this emerge is a good test if it reliably induces the corruption.
> the kernel currently running is 2.6.37-rc4 with the [PATCH v5] dm
> crypt: scale to multiple CPUs
>
> besides that additional patchsets are applied (I apologize that it's
> not only plain vanilla with the dm-crypt patch):
> * Prevent kswapd dumping excessive amounts of memory in response to
> high-order allocation
> * ext4: coordinate data-only flush requests sent by fsync
> * vmscan: protect executable page from inactive list scan
> * writeback livelock fixes v2
Have you actually experienced any of the issues the above patches are
meant to address? Seems you're applying patches guessing/hoping
that they'll fix the dm-crypt corruption.
> I originally had hoped that the mentioned patch in "ext4: coordinate
> data-only flush requests sent by fsync", namely: "md: Call
> blk_queue_flush() to establish flush/fua" and additional changes &
> fixes to 2.6.37-rc4 would once and for all fix problems but it didn't
That md patch doesn't help DM at all. And the ext4 coordination patch
is completely bleeding-edge and actually broken (especially as it relates
to DM -- but that breakage is only a concern for request-based DM,
e.g. DM-mpath); anyway, see:
https://www.redhat.com/archives/dm-devel/2010-November/msg00185.html
I'm not sure which patches you're using for the ext4 fsync changes but
please don't use them at all. It is purely an optimization for
extremely heavy fsync workloads and is only getting in the way at this
point.
> I'm also using the writeback livelock fixes and the dm-crypt scale
> to multiple CPUs with 2.6.36 so those generally work fine
>
> so it has to be something that changed from 2.6.36->2.6.37 within
> dm-crypt or other parts that gets stressed and breaks during usage of
> the "[PATCH v5] dm crypt: scale to multiple CPUs" patch
>
> the other included patches surely won't be the cause for that (100%).
>
> Filesystem corruption only seems to occur on the / (root) where the
> system resides -
We need better fault isolation; you've introduced enough change that it
isn't helping zero in on what your particular problem is. Milan has
tested the latest version of the dm-crypt cpu scalability patch quite a
bit and hasn't seen any corruption -- but clearly the corruption you're
seeing is a real concern and we need to get to the bottom of it.
I'd really appreciate it if you could just use Linus' latest linux-2.6
tree plus Milan's latest patch (technically v6 even though it wasn't
labeled as such): https://patchwork.kernel.org/patch/365542/
Porting that same v6 patch to 2.6.36 would also be nice (to verify you
still don't see any corruption there).
Mike