From: ebiederm@xmission.com (Eric W. Biederman)
To: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Borislav Petkov <petkovbb@googlemail.com>,
	David Airlie <airlied@linux.ie>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Greg KH <greg@kroah.com>, Al Viro <viro@ZenIV.linux.org.uk>
Subject: Re: drm_vm.c:drm_mmap: possible circular locking dependency detected
Date: Sat, 02 Jan 2010 13:49:20 -0800
Message-ID: <m17hs0xjmn.fsf@fess.ebiederm.org>
In-Reply-To: <4B3EB687.7000005@kernel.org> (Tejun Heo's message of "Sat\, 02 Jan 2010 11\:59\:19 +0900")

Tejun Heo <tj@kernel.org> writes:

> Hello, Eric.
>
> On 01/02/2010 12:16 AM, Eric W. Biederman wrote:
>>>> kobject_del with a lock held scares me.
>>>
>>> I would not object at _all_ if sysfs fixed the locking here instead of in 
>>> filldir.
>> 
>> I just sent you my sysfs filldir scalability patch, so we can take that
>> red-herring off the plate.
>> 
>> The problem as I see it is that kobject_del is convenient.
>> kobject_del waits until all of the sysfs show and store methods for
>> that kobject have stopped executing.  Which imposes the rule that
>> kobject_del can not be called with any locks held that are taken in a
>> sysfs show or store method.  This is all invisible to lockdep as the
>> wait is done with a completion and not a lock.
>
> The synchronization against read/write ops is in sysfs_deactivate on
> purpose so that drivers (most common users) don't have to worry about
> sysfs ops accessing different parts of data structures once
> device_del() is complete.  Implementing the exclusion at the driver
> level is possible but not easy because some hardware devices are
> represented with complex data structures, some of them are reused when
> devices are exchanged and some sysfs ops end up accessing the
> hardware.  So, it's often not possible to simply disassociate the data
> structure and float it till the last reference goes away.  There needs
> to be a synchronization point where the driver can tell that nothing
> is accessing released data structure or hardware resource after it and
> it's far easier to define it at the sysfs level.
>
>> sysfs_deactivate happens in the device_del(), but if we were to move
>> sysfs_deactivate into the final kobject_put then in theory we can
>> continue to block and be friendly but not need to be called from
>> locations where locks are held.
>
> Nobody would know when that final put will actually happen.  In
> progress sysfs ops might access the hardware after the hardware is
> gone or replaced with another unit.

Alright, then that is a bad possible split of the functionality.  That
is all I was suggesting: splitting the functionality, not doing away
with the wait or moving it to a point where the wait would not work.
It was simply my bad assumption that the final kobject_put would
happen before the module that controlled that kobject could be
removed.

I still think it might make sense to separate kobject_del into two
parts: one that we call with the locks held and one that we call
without.  That does seem to be applicable to only a very small set of
cases, though, and our problems appear to be much larger than that.
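
To make the two-part idea concrete, here is a minimal userspace sketch
in Python.  It is a toy model, not kernel code: the names Node,
unlink(), drain() and drv_mutex are hypothetical and exist only for
this illustration.  unlink() only stops new show/store calls and never
blocks, so it could run with locks held; drain() waits for in-flight
calls and so must run with no locks held.  The timeout in drain()
stands in for a wait that would otherwise hang forever.

```python
import threading


class Node:
    """Toy stand-in for a sysfs node with an active-reference count."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active = 0        # show/store calls currently in flight
        self._removed = False   # set once the node has been unlinked

    def get_active(self):
        """Taken before a show/store method runs; fails after unlink()."""
        with self._cond:
            if self._removed:
                return False
            self._active += 1
            return True

    def put_active(self):
        """Dropped when a show/store method returns."""
        with self._cond:
            self._active -= 1
            if self._active == 0:
                self._cond.notify_all()

    def unlink(self):
        """Part one: never blocks, so it may run with driver locks held."""
        with self._cond:
            self._removed = True

    def drain(self, timeout=None):
        """Part two: wait for in-flight methods; call with no locks held."""
        with self._cond:
            return self._cond.wait_for(lambda: self._active == 0, timeout)


def demo():
    drv_mutex = threading.Lock()    # models a driver lock taken by show()
    node = Node()
    entered = threading.Event()
    proceed = threading.Event()
    results = []

    def show():
        if not node.get_active():
            return
        try:
            entered.set()
            proceed.wait()
            with drv_mutex:         # the show method needs the driver lock
                results.append("show finished")
        finally:
            node.put_active()

    reader = threading.Thread(target=show)
    reader.start()
    entered.wait()

    with drv_mutex:
        node.unlink()               # fine under the lock: it cannot block
        proceed.set()
        # Draining here waits for show(), which is waiting for drv_mutex.
        drained_under_lock = node.drain(timeout=0.2)

    drained_after_unlock = node.drain(timeout=5)
    reader.join()
    return drained_under_lock, drained_after_unlock, node.get_active(), results
```

In this model the drain attempted under drv_mutex times out, the drain
after dropping the lock succeeds, and get_active() is refused once the
node is unlinked.  The sketch does not answer the concern about
hardware access after removal; it only shows where the blocking wait
would have to sit.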

For the moment I have generated a patch that does the lockdep
annotations, and I have found that a simple:

   find /sys -type f | xargs cat > /dev/null

trivially generates lockdep warnings.  In particular:

[  165.049042] 
[  165.049044] =======================================================
[  165.052761] [ INFO: possible circular locking dependency detected ]
[  165.052761] 2.6.33-rc2x86_64 #3
[  165.052761] -------------------------------------------------------
[  165.052761] cat/5026 is trying to acquire lock:
[  165.052761]  (&serio->drv_mutex){+.+.+.}, at: [<ffffffff8132ecaa>] atkbd_attr_show_helper+0x28/0x6e
[  165.052761] 
[  165.052761] but task is already holding lock:
[  165.089443]  (s_active){++++.+}, at: [<ffffffff810e84dd>] sysfs_get_active_two+0x2c/0x43
[  165.089443] 
[  165.089443] which lock already depends on the new lock.
[  165.089443] 
[  165.089443] 
[  165.089443] the existing dependency chain (in reverse order) is:
[  165.089443] 
[  165.089443] -> #1 (s_active){++++.+}:
[  165.089443]        [<ffffffff81054956>] validate_chain+0xa25/0xd1d
[  165.089443]        [<ffffffff810553d3>] __lock_acquire+0x785/0x7dc
[  165.089443]        [<ffffffff81056112>] lock_acquire+0x5a/0x74
[  165.089443]        [<ffffffff810e8202>] sysfs_addrm_finish+0xba/0x125
[  165.089443]        [<ffffffff810e68b0>] sysfs_hash_and_remove+0x4f/0x6b
[  165.089443]        [<ffffffff810e94cf>] remove_files+0x1f/0x2c
[  165.089443]        [<ffffffff810e9561>] sysfs_remove_group+0x85/0xb4
[  165.089443]        [<ffffffff81331f0f>] psmouse_disconnect+0x33/0x147
[  165.089443]        [<ffffffff8132687b>] serio_disconnect_driver+0x2d/0x3a
[  165.089443]        [<ffffffff81326898>] serio_driver_remove+0x10/0x14
[  165.089443]        [<ffffffff812077f0>] __device_release_driver+0x67/0xb0
[  165.089443]        [<ffffffff81207857>] device_release_driver+0x1e/0x2b
[  165.089443]        [<ffffffff81326e68>] serio_disconnect_port+0x60/0x69
[  165.089443]        [<ffffffff8132757a>] serio_thread+0x170/0x34a
[  165.089443]        [<ffffffff810470e7>] kthread+0x7d/0x85
[  165.089443]        [<ffffffff81002cd4>] kernel_thread_helper+0x4/0x10
[  165.089443] 
[  165.089443] -> #0 (&serio->drv_mutex){+.+.+.}:
[  165.089443]        [<ffffffff81054642>] validate_chain+0x711/0xd1d
[  165.089443]        [<ffffffff810553d3>] __lock_acquire+0x785/0x7dc
[  165.089443]        [<ffffffff81056112>] lock_acquire+0x5a/0x74
[  165.089443]        [<ffffffff814378ed>] mutex_lock_interruptible_nested+0x4a/0x307
[  165.089443]        [<ffffffff8132ecaa>] atkbd_attr_show_helper+0x28/0x6e
[  165.089443]        [<ffffffff8132ed81>] atkbd_do_show_extra+0x13/0x15
[  165.089443]        [<ffffffff812049b6>] dev_attr_show+0x20/0x43
[  165.089443]        [<ffffffff810e71db>] sysfs_read_file+0xba/0x145
[  165.089443]        [<ffffffff8109f507>] vfs_read+0xab/0x147
[  165.089443]        [<ffffffff8109f85c>] sys_read+0x47/0x70
[  165.089443]        [<ffffffff81001f2b>] system_call_fastpath+0x16/0x1b
[  165.089443] 
[  165.089443] other info that might help us debug this:
[  165.089443] 
[  165.089443] 3 locks held by cat/5026:
[  165.089443]  #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff810e715a>] sysfs_read_file+0x39/0x145
[  165.089443]  #1:  (s_active){++++.+}, at: [<ffffffff810e84d0>] sysfs_get_active_two+0x1f/0x43
[  165.089443]  #2:  (s_active){++++.+}, at: [<ffffffff810e84dd>] sysfs_get_active_two+0x2c/0x43
[  165.089443] 
[  165.089443] stack backtrace:
[  165.089443] Pid: 5026, comm: cat Not tainted 2.6.33-rc2x86_64 #3
[  165.089443] Call Trace:
[  165.089443]  [<ffffffff810538f3>] print_circular_bug+0xb3/0xc1
[  165.089443]  [<ffffffff81054642>] validate_chain+0x711/0xd1d
[  165.089443]  [<ffffffff81052fb6>] ? trace_hardirqs_on_caller+0x10b/0x12f
[  165.089443]  [<ffffffff810553d3>] __lock_acquire+0x785/0x7dc
[  165.089443]  [<ffffffff8132ecaa>] ? atkbd_attr_show_helper+0x28/0x6e
[  165.089443]  [<ffffffff81056112>] lock_acquire+0x5a/0x74
[  165.089443]  [<ffffffff8132ecaa>] ? atkbd_attr_show_helper+0x28/0x6e
[  165.089443]  [<ffffffff814378ed>] mutex_lock_interruptible_nested+0x4a/0x307
[  165.089443]  [<ffffffff8132ecaa>] ? atkbd_attr_show_helper+0x28/0x6e
[  165.089443]  [<ffffffff8132ee41>] ? atkbd_show_extra+0x0/0x28
[  165.089443]  [<ffffffff8132ecaa>] atkbd_attr_show_helper+0x28/0x6e
[  165.089443]  [<ffffffff8132ed81>] atkbd_do_show_extra+0x13/0x15
[  165.089443]  [<ffffffff812049b6>] dev_attr_show+0x20/0x43
[  165.089443]  [<ffffffff810e71db>] sysfs_read_file+0xba/0x145
[  165.089443]  [<ffffffff8109f507>] vfs_read+0xab/0x147
[  165.089443]  [<ffffffff8109f85c>] sys_read+0x47/0x70
[  165.089443]  [<ffffffff81001f2b>] system_call_fastpath+0x16/0x1b
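
The report above boils down to two orderings of the same pair of locks.
The following Python sketch replays them through a deliberately
simplified lock-order checker.  It is not lockdep's implementation: the
class LockOrderChecker and the function replay_report() are
hypothetical, and the assumption that serio->drv_mutex is held around
the disconnect path is inferred from the dependency chain in the
report, not shown in it directly.  The sysfs active reference is
treated as just another lock, which is the effect the annotations are
meant to have.

```python
class LockOrderChecker:
    """Records "held -> acquiring" edges and reports ordering cycles."""

    def __init__(self):
        self.edges = {}   # lock name -> set of locks taken while holding it

    def _reaches(self, src, dst, seen):
        """Return True if dst is reachable from src via recorded edges."""
        if src == dst:
            return True
        seen.add(src)
        return any(self._reaches(nxt, dst, seen)
                   for nxt in self.edges.get(src, ())
                   if nxt not in seen)

    def acquire(self, held, new):
        """Record taking `new` while holding `held`.

        Returns a list of (held_lock, new_lock) pairs for which the
        opposite ordering has already been recorded.
        """
        conflicts = []
        for h in held:
            if h == new:
                continue
            if self._reaches(new, h, set()):
                conflicts.append((h, new))
            self.edges.setdefault(h, set()).add(new)
        return conflicts


def replay_report():
    checker = LockOrderChecker()
    # Chain #1: the disconnect path removes sysfs files, which waits on
    # the active reference, while the driver mutex is (assumed) held.
    checker.acquire(["&serio->drv_mutex"], "s_active")
    # Chain #0: cat reads an attribute.  sysfs_read_file takes the
    # buffer mutex, then the active reference ...
    checker.acquire(["&buffer->mutex"], "s_active")
    # ... and the show method then takes the driver mutex.
    return checker.acquire(["&buffer->mutex", "s_active"],
                           "&serio->drv_mutex")
```

replay_report() returns a single conflict, the pair ("s_active",
"&serio->drv_mutex"), which is the inversion the report describes.
Real lockdep works on lock classes and tracks much more state (irq
context, read versus write acquisition), none of which is modeled here.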

Suggestions on how to sort out this other set of issues are welcome.

Eric
