From: Tejun Heo <tj@kernel.org>
To: Milan Broz <mbroz@redhat.com>
Cc: device-mapper development <dm-devel@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
just.for.lkml@googlemail.com, hch@infradead.org,
herbert@gondor.hengli.com.au
Subject: Re: [dm-devel] Linux 2.6.36-rc7
Date: Fri, 08 Oct 2010 19:02:11 +0200
Message-ID: <4CAF4E93.1040101@kernel.org>
In-Reply-To: <4CAE29CC.8030702@redhat.com>

Hello, again.

On 10/07/2010 10:13 PM, Milan Broz wrote:
> Yes, XFS is very good to show up problems in dm-crypt:)
>
> But there was no change in dm-crypt which can itself cause such problem,
> planned workqueue changes are not in 2.6.36 yet.
> Code is basically the same for the last few releases.
>
> So it seems that workqueue processing really changed here under memory pressure.
>
> Milan
>
> p.s.
> Anyway, if you are able to reproduce it and you think that there is problem
> in per-device dm-crypt workqueue, there are patches from Andi for shared
> per-cpu workqueue, maybe it can help here. (But this is really not RC material.)
>
> Unfortunately not yet in dm-devel tree, but I have them here ready for review:
> http://mbroz.fedorapeople.org/dm-crypt/2.6.36-devel/
> (all 4 patches must be applied, I hope Alasdair will put them in dm quilt soon.)

Okay, spent the whole day reproducing the problem and trying to
determine what's going on. In the process, I've found a bug and a
potential issue (not sure yet whether it's an actual issue which
should be fixed for this release) but the hang doesn't seem to have
anything to do with the workqueue update. All the queues are behaving
exactly as expected during the hang.

Also, it isn't a regression. I can reliably trigger the same deadlock
on v2.6.35.

Here's the setup I used to trigger the problem, which should be mostly
similar to Torsten's.

The machine is a dual quad-core Opteron (8 physical cores) w/ 4GiB memory.
* 80GB raid1 of two SATA disks
* On top of that, luks encrypted device w/ twofish-cbc-essiv:sha256
* In the encrypted device, xfs filesystem which hosts 8GiB swapfile
* 12GiB tmpfs
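
For anyone wanting to rebuild a similar stack, the layout above can be
sketched roughly as follows. This is only an illustration, not the
exact commands used: the helper name, device names, mapping name and
mount points are all assumptions, and the script only prints the
commands instead of running them.

```python
def repro_setup_commands(disk_a="/dev/sdX1", disk_b="/dev/sdY1",
                         md_dev="/dev/md0", crypt_name="cryptswap",
                         fs_mount="/mnt/crypt", tmpfs_mount="/mnt/build",
                         swap_gib=8, tmpfs_gib=12):
    """Return shell commands (as strings) that would build a similar stack.

    Nothing is executed here. All device names, the mapping name and the
    mount points are hypothetical placeholders and must be adapted before
    running anything as root.
    """
    crypt_dev = f"/dev/mapper/{crypt_name}"
    swapfile = f"{fs_mount}/swapfile"
    return [
        # raid1 of two SATA disks
        f"mdadm --create {md_dev} --level=1 --raid-devices=2 {disk_a} {disk_b}",
        # luks encrypted device on top, using the cipher named in the mail
        f"cryptsetup luksFormat --cipher twofish-cbc-essiv:sha256 {md_dev}",
        f"cryptsetup luksOpen {md_dev} {crypt_name}",
        # xfs inside the encrypted device, hosting the swapfile
        f"mkfs.xfs {crypt_dev}",
        f"mkdir -p {fs_mount} {tmpfs_mount}",
        f"mount {crypt_dev} {fs_mount}",
        f"dd if=/dev/zero of={swapfile} bs=1M count={swap_gib * 1024}",
        f"chmod 600 {swapfile}",
        f"mkswap {swapfile}",
        f"swapon {swapfile}",
        # tmpfs larger than RAM, so the build spills into swap
        f"mount -t tmpfs -o size={tmpfs_gib}g tmpfs {tmpfs_mount}",
    ]


if __name__ == "__main__":
    for cmd in repro_setup_commands():
        print(cmd)
```

Note that the tmpfs (12GiB) is as large as RAM plus swap (4GiB + 8GiB),
so a big enough build in it is guaranteed to push pages out through
the dm-crypt device.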

The workload is a v2.6.35 allyesconfig -j 128 build in the tmpfs. Not
too long after swap starts being used (several tens of secs), the
system hangs. IRQ handling and the like still work, but no IO gets
through and a lot of tasks are stuck somewhere in bio allocation.

I suspected that, with md and dm stacked together, something in the
upper layer ended up exhausting a shared bio pool. I tried a couple of
things but haven't managed to find the culprit. It would probably be
best to run blktrace alongside the workload and analyze how IO gets
stuck.

So, well, we seem to be broken the same way as before. No need to
delay release for this one.
Thanks.
--
tejun