linux-kernel.vger.kernel.org archive mirror
From: Thilo-Alexander Ginkel <thilo@ginkel.com>
To: Arnd Bergmann <arnd@arndb.de>, Tejun Heo <tj@kernel.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Soft lockup during suspend since ~2.6.36 [bisected]
Date: Thu, 14 Apr 2011 14:24:42 +0200
Message-ID: <BANLkTimDOz7M6m6Xo==DbLHP6q2pMBzd9g@mail.gmail.com>
In-Reply-To: <BANLkTi=aED731W4WoKK1HUU88qR7RxpW6Q@mail.gmail.com>

On Wed, Apr 6, 2011 at 08:03, Thilo-Alexander Ginkel <thilo@ginkel.com> wrote:
> On Wed, Apr 6, 2011 at 01:28, Arnd Bergmann <arnd@arndb.de> wrote:
>> On Tuesday 05 April 2011, Thilo-Alexander Ginkel wrote:
>>> Thanks, that worked pretty well. A bisect with eleven builds later I
>>> have now identified the following candidate commit, which may have
>>> introduced the bug:
>>>
>>> dcd989cb73ab0f7b722d64ab6516f101d9f43f88 is the first bad commit
>>> commit dcd989cb73ab0f7b722d64ab6516f101d9f43f88
>>> Author: Tejun Heo <tj@kernel.org>
>>> Date:   Tue Jun 29 10:07:14 2010 +0200
>>
>> Sorry, but looking at the patch shows that it can't possibly have introduced
>> the problem, since all the code that is modified in it is new code that
>> is not even used anywhere at that stage.
>>
>> As far as I can tell, you must have hit a false positive or a false negative
>> somewhere in the bisect.
>
> Well you're right. I hit "Reply" too early and should have paid closer
> attention to what change the bisect actually brought up.
>
> I already found a false negative (fortunately pretty close to the end
> of the bisect sequence) and also verified the preceding good commits,
> which gives me two new commits to test. I'll provide an update once
> the builds and tests are through, which may however take until early
> next week as I will be on vacation until then.

All right... I re-verified all my bisect tests and found yet another
mistake. After correcting that one (and double-checking the remaining
tests), git bisect came up with a commit that makes more sense:

| e22bee782b3b00bd4534ae9b1c5fb2e8e6573c5c is the first bad commit
| commit e22bee782b3b00bd4534ae9b1c5fb2e8e6573c5c
| Author: Tejun Heo <tj@kernel.org>
| Date:   Tue Jun 29 10:07:14 2010 +0200
|
|     workqueue: implement concurrency managed dynamic worker pool
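
Since a misjudged commit derailed the earlier bisect attempts, here is a minimal sketch of how each bisect step could be made more robust. It is not the procedure actually used for the result above. The function name `bisect_step`, the default run count and the commands passed to it are all hypothetical. The only fixed part is the exit-code convention of `git bisect run`: 0 means good, 125 means "cannot test, skip", and any other code from 1 to 127 means bad.

```shell
#!/bin/sh
# Hypothetical helper for use with `git bisect run`.
#
# usage: bisect_step <build-command> <test-command> [runs]
#
# Return codes follow the `git bisect run` convention:
#   0   -> commit is good
#   125 -> commit cannot be tested (build failed), skip it
#   1   -> commit is bad
bisect_step() {
    build_cmd=$1
    test_cmd=$2
    runs=${3:-10}   # assumption: 10 clean runs are enough to call it good

    # A kernel that does not build says nothing about the bug.
    sh -c "$build_cmd" || return 125

    # An intermittent failure may not show on the first try, so
    # repeat the test; a single failure marks the commit as bad.
    i=0
    while [ "$i" -lt "$runs" ]; do
        sh -c "$test_cmd" || return 1
        i=$((i + 1))
    done
    return 0
}

# Example invocation (BAD, GOOD and both script names are placeholders):
#   git bisect start BAD GOOD
#   git bisect run sh -c '. ./bisect-step.sh; bisect_step "make -j4" "./suspend-test.sh" 20'
```

Repeating the test only lowers the chance of marking a bad commit as good; it cannot rule it out, so the commits next to the reported one are still worth re-checking by hand.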

The good news is that I am able to reproduce the issue within a KVM
virtual machine, so I can test for the soft lockup (which looks
somewhat like a race condition during worker / CPU shutdown) in a
mostly automated fashion. Unfortunately, that also means that this
issue is anything but hardware-specific, i.e., it most probably affects
all SMP systems (with a probability that varies with the number of
CPUs).
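
As an illustration of what "mostly automated" could look like, the check below scans a captured guest console log for the soft lockup watchdog message. This is a sketch under assumptions: it presumes the VM's serial console is written to a file (for example via QEMU's `-serial file:...` option), and the function name `has_soft_lockup` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the given console
# log contains a soft lockup report from the kernel watchdog.
#
# usage: has_soft_lockup <console-log>
has_soft_lockup() {
    # The watchdog prefixes its reports with "BUG: soft lockup".
    grep -q 'BUG: soft lockup' "$1"
}

# Example use after a suspend attempt inside the VM
# (console.log is a placeholder path):
#   if has_soft_lockup console.log; then
#       echo "lockup reproduced"
#   fi
```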

Adding some further details about my configuration (which I replicated
in the VM):
- LVM running on top of
- dm-crypt (LUKS) running on top of
- md RAID1
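
For anyone who would rather rebuild a comparable stack than transfer the image, the layering could be set up roughly as follows. This is only a sketch: the device names, the mapping name `md0_crypt`, the volume group `vg0` and the 4G volume size are assumptions, not the values from the actual system. The function merely prints the commands, because running them would destroy the data on the given disks.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) the commands that would
# build LVM on top of dm-crypt (LUKS) on top of md RAID1.
#
# usage: print_stack_cmds <disk1> <disk2>
print_stack_cmds() {
    disk1=$1
    disk2=$2
    cat <<EOF
mdadm --create /dev/md0 --level=1 --raid-devices=2 $disk1 $disk2
cryptsetup luksFormat /dev/md0
cryptsetup luksOpen /dev/md0 md0_crypt
pvcreate /dev/mapper/md0_crypt
vgcreate vg0 /dev/mapper/md0_crypt
lvcreate -n root -L 4G vg0
EOF
}

# Example (placeholder virtio disks inside a VM):
#   print_stack_cmds /dev/vdb /dev/vdc
```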

If anyone is interested in getting hold of this VM for further tests,
let me know and I'll try to figure out how to get it (2 x 8 GB, barely
compressible due to dm-crypt) to its recipient.

Regards,
Thilo

Thread overview: 13+ messages
2011-04-05 18:56 Soft lockup during suspend since ~2.6.36 [bisected] Thilo-Alexander Ginkel
2011-04-05 23:28 ` Arnd Bergmann
2011-04-06  6:03   ` Thilo-Alexander Ginkel
2011-04-14 12:24     ` Thilo-Alexander Ginkel [this message]
2011-04-17 19:35       ` Arnd Bergmann
2011-04-17 21:53         ` Thilo-Alexander Ginkel
2011-04-26 13:11           ` Tejun Heo
2011-04-27 23:51             ` Thilo-Alexander Ginkel
2011-04-28 10:30               ` Tejun Heo
2011-04-28 23:56                 ` Thilo-Alexander Ginkel
2011-04-29 16:00                   ` Tejun Heo
2011-04-29 16:18                     ` [PATCH] workqueue: fix deadlock in worker_maybe_bind_and_lock() Tejun Heo
2011-04-29 20:40                       ` Rafael J. Wysocki
