From: ebiederm@xmission.com (Eric W. Biederman)
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Borislav Petkov <petkovbb@googlemail.com>,
David Airlie <airlied@linux.ie>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Greg KH <greg@kroah.com>, Al Viro <viro@ZenIV.linux.org.uk>
Subject: Re: drm_vm.c:drm_mmap: possible circular locking dependency detected (was: Re: Linux 2.6.33-rc2 - Merry Christmas ...)
Date: Thu, 31 Dec 2009 00:40:18 -0800
Message-ID: <m1fx6rtu31.fsf@fess.ebiederm.org>
In-Reply-To: <alpine.LFD.2.00.0912301348320.11961@localhost.localdomain> (Linus Torvalds's message of "Wed, 30 Dec 2009 14:03:25 -0800 (PST)")

Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Wed, 30 Dec 2009, Eric W. Biederman wrote:
>
>> Linus Torvalds <torvalds@linux-foundation.org> writes:
>>
>> > We've seen it several times (yes, mostly with drm, but it's been seen with
>> > others too), and it's very annoying. It can be fixed by having very
>> > careful readdir implementations, but I really would blame sysfs in
>> > particular for having a very annoying lock reversal issue when used
>> > reasonably.
>>
>> Maybe. The mmap_sem has some interesting issues all of its own.
>> What reasonable thing is the drm doing that is causing problems?
>
> The details are in the original thread on lkml, but it boils down to
> basically (the below may not be the exact sequence, but it's close)
Thanks.
> - drm_mmap (called with mmap_sem) takes 'dev->struct_mutex' to protect
> its own device data (very reasonable)
>
> - drm_release takes 'dev->struct_mutex' again to protect its own data,
> and calls "mtrr_del_page()" which ends up taking cpu_hotplug.lock.
>
> Again, that doesn't sound "wrong" in any way.
>
> - hibernate ends up with the sequence: _cpu_down (cpu_hotplug.lock) -> ..
> kref_put .. -> sysfs_addrm_start (sysfs_mutex)
>
> Again, nothing suspicious or "bad", and this part of the dependency
> chain has nothing to do with the DRM code itself.

kobject_del with a lock held scares me.

There is a possible deadlock (that lockdep is ignorant of) if you hold
a lock over sysfs_deactivate() and if any sysfs file takes that lock.

I won't argue with a claim of inconvenient locking semantics here, and
this is different from the problem you are seeing (except that fixing this
problem would happen to fix the filldir issue).

> - sysfs_readdir() (and this is the big problem) holds sysfs_mutex in its
> readdir implementation over the call to filldir. And filldir copies the
> data to user space, so now you have sysfs_mutex -> mmap_sem.
>
> See? None of the chains look bad. Except sysfs_readdir() obviously has
> that sysfs_mutex -> mmap_sem thing, which is _very_ annoying, because now
> you end up with a chain like
>
> mmap_sem -> dev->struct_mutex -> cpu_hotplug.lock -> sysfs_mutex -> mmap_sem
>
> and I think you'll agree that of all the lock chains, the place to break
> the association is at sysfs_mutex. And the obvious place to break it would
> be that last "sysfs_mutex -> mmap_sem" stage.
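
The chain quoted above can be checked mechanically. What follows is a minimal userspace sketch, assuming a hypothetical `lock_graph` structure: it is not lockdep's actual data structure or API, only an illustration of the idea that each "B taken while holding A" observation is a directed edge, and that a deadlock is possible once the edges form a cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_LOCKS 16

/* Hypothetical lock-order graph: edge[a][b] means "b was taken while holding a". */
struct lock_graph {
    const char *name[MAX_LOCKS];
    size_t nlocks;
    bool edge[MAX_LOCKS][MAX_LOCKS];
};

/* Return the index for a lock name, registering it on first use.
 * Returns (size_t)-1 when the table is full. */
size_t lock_id(struct lock_graph *g, const char *name)
{
    assert(g != NULL && name != NULL);
    for (size_t i = 0; i < g->nlocks; i++)
        if (strcmp(g->name[i], name) == 0)
            return i;
    if (g->nlocks == MAX_LOCKS)
        return (size_t)-1;
    g->name[g->nlocks] = name;
    return g->nlocks++;
}

/* Record that `taken` was acquired while `held` was already held. */
bool add_dependency(struct lock_graph *g, const char *held, const char *taken)
{
    size_t a = lock_id(g, held);
    size_t b = lock_id(g, taken);
    if (a == (size_t)-1 || b == (size_t)-1)
        return false;
    g->edge[a][b] = true;
    return true;
}

/* Depth-first search: is `target` reachable from `node` along recorded edges? */
bool reaches(const struct lock_graph *g, size_t node, size_t target, bool *seen)
{
    if (seen[node])
        return false;
    seen[node] = true;
    for (size_t next = 0; next < g->nlocks; next++) {
        if (!g->edge[node][next])
            continue;
        if (next == target || reaches(g, next, target, seen))
            return true;
    }
    return false;
}

/* A cycle exists if any lock can reach itself again. */
bool has_cycle(const struct lock_graph *g)
{
    for (size_t start = 0; start < g->nlocks; start++) {
        bool seen[MAX_LOCKS] = { false };
        if (reaches(g, start, start, seen))
            return true;
    }
    return false;
}
```

Feeding in the first three dependencies (mmap_sem -> dev->struct_mutex, dev->struct_mutex -> cpu_hotplug.lock, cpu_hotplug.lock -> sysfs_mutex) leaves the graph acyclic. Only the sysfs_mutex -> mmap_sem edge from sysfs_readdir() closes the loop, which is why that edge is the natural one to remove.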

I agree that fixing sysfs_readdir to not hold the sysfs_mutex over filldir
is useful to reduce the lock hold time if nothing else.

The cheap fix here is mostly a matter of grabbing a reference to the
sysfs_dirent and then revalidating that the reference is still useful
after we reacquire the sysfs_mutex. If not, we already have the code for
restarting from just an offset. We just don't want to use it too much, as
that will give us O(n^2) times for sysfs readdir.

I will see if I can dig up or regenerate my patch in the next couple of days.

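
To make the cheap fix concrete, here is a rough userspace sketch of the pattern, not the actual patch. Everything in it is an assumption made for illustration: `toy_mutex` stands in for sysfs_mutex, `dirent_sketch` for sysfs_dirent, `find_from_offset()` for the existing restart-from-offset code, and `collect_fill()` for filldir. Entries are never freed here, so the reference count only marks where a real implementation would have to pin the entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for sysfs_mutex: a flag, so the sketch can assert lock state. */
struct toy_mutex { bool held; };
void toy_lock(struct toy_mutex *m)   { assert(!m->held); m->held = true; }
void toy_unlock(struct toy_mutex *m) { assert(m->held);  m->held = false; }

/* Hypothetical directory entry, loosely modelled on a sysfs_dirent. */
struct dirent_sketch {
    const char *name;
    unsigned long pos;            /* stable offset; the list is sorted by pos */
    int refcount;
    bool removed;                 /* set once unlinked from the directory */
    struct dirent_sketch *next;
};

struct dir_sketch {
    struct toy_mutex lock;
    struct dirent_sketch *head;
    unsigned long restarts;       /* how often the slow path was needed */
};

typedef int (*filldir_fn)(void *ctx, const char *name, unsigned long pos);

/* Slow path: linear search for the first entry at or after `offset`. */
struct dirent_sketch *find_from_offset(struct dir_sketch *d, unsigned long offset)
{
    assert(d->lock.held);
    for (struct dirent_sketch *e = d->head; e; e = e->next)
        if (e->pos >= offset)
            return e;
    return NULL;
}

/* Unlink an entry; the caller must hold the directory lock. */
void remove_entry(struct dir_sketch *d, struct dirent_sketch *victim)
{
    assert(d->lock.held);
    for (struct dirent_sketch **pp = &d->head; *pp; pp = &(*pp)->next) {
        if (*pp == victim) {
            *pp = victim->next;
            victim->removed = true;
            return;
        }
    }
}

int readdir_sketch(struct dir_sketch *d, unsigned long *offset,
                   filldir_fn fill, void *ctx)
{
    toy_lock(&d->lock);
    struct dirent_sketch *e = find_from_offset(d, *offset);
    while (e) {
        e->refcount++;                      /* pin the entry */
        toy_unlock(&d->lock);               /* never hold the lock over filldir */
        int stop = fill(ctx, e->name, e->pos);
        toy_lock(&d->lock);
        e->refcount--;
        if (stop)
            break;                          /* offset not advanced: retry later */
        *offset = e->pos + 1;
        if (e->removed) {                   /* revalidation failed */
            d->restarts++;
            e = find_from_offset(d, *offset);
        } else {
            e = e->next;                    /* fast path: cached position is good */
        }
    }
    toy_unlock(&d->lock);
    return 0;
}

/* Example callback: records names and can remove one entry mid-walk. */
struct collect_ctx {
    struct dir_sketch *dir;
    struct dirent_sketch *remove_during_fill;   /* may be NULL */
    const char *seen[8];
    size_t nseen;
};

int collect_fill(void *ctx, const char *name, unsigned long pos)
{
    struct collect_ctx *c = ctx;
    (void)pos;
    assert(!c->dir->lock.held);             /* the property we care about */
    if (c->nseen == 8)
        return 1;
    c->seen[c->nseen++] = name;
    if (c->remove_during_fill && strcmp(name, c->remove_during_fill->name) == 0) {
        toy_lock(&c->dir->lock);
        remove_entry(c->dir, c->remove_during_fill);
        toy_unlock(&c->dir->lock);
        c->remove_during_fill = NULL;
    }
    return 0;
}
```

The restart-from-offset search only runs when an entry went away while the lock was dropped. If it ran for every entry, each step would rescan the list from the head, which is where the O(n^2) behaviour would come from.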
>> > Added Eric and Greg to the cc, in case the sysfs people want to solve it.
>>
>> There are scalability reasons for dropping the sysfs_mutex in sysfs_readdir
>> and I have some tentative patches for that. I will take a look after I
>> come back from the holidays, in a couple of days. I don't understand
>> the issue as described.
>
> Ok, hopefully the above chain explains it to you, and also makes it clear
> that it's rather hard to break anywhere else, and it's not somebody else
> doing anything "obviously bogus".

We very definitely have an ABBA deadlock with sysfs_deactivate and the
cpu_hotplug.lock. arch/x86/kernel/microcode_core.c:reload_store() is the
code for a sysfs file that, when written to, calls get_online_cpus().

Regardless of what we do with sysfs_readdir, we need to see if we can
fix cpu_down() to remove this nasty deadlock.

Eric