All of lore.kernel.org
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Borislav Petkov <petkovbb@googlemail.com>,
	David Airlie <airlied@linux.ie>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Greg KH <greg@kroah.com>, Al Viro <viro@ZenIV.linux.org.uk>
Subject: Re: drm_vm.c:drm_mmap: possible circular locking dependency detected
Date: Sat, 02 Jan 2010 13:49:20 -0800	[thread overview]
Message-ID: <m17hs0xjmn.fsf@fess.ebiederm.org> (raw)
In-Reply-To: <4B3EB687.7000005@kernel.org> (Tejun Heo's message of "Sat\, 02 Jan 2010 11\:59\:19 +0900")

Tejun Heo <tj@kernel.org> writes:

> Hello, Eric.
>
> On 01/02/2010 12:16 AM, Eric W. Biederman wrote:
>>>> kobject_del with a lock held scares me.
>>>
>>> I would not object at _all_ if sysfs fixed the locking here instead of in 
>>> filldir.
>> 
>> I just sent you my sysfs filldir scalability patch, so we can take that
>> red-herring off the plate.
>> 
>> The problem as I see it is that kobject_del is convenient.
>> kobject_del waits until all of the sysfs show and store methods for
>> that kobject have stopped executing.  Which imposes the rule that
>> kobject_del can not be called with any locks held that are taken in a
>> sysfs show or store method.  This is all invisible to lockdep as the
>> wait is done with a completion and not a lock.
>
> The synchronization against read/write ops is in sysfs_deactivate on
> purpose so that drivers (most common users) don't have to worry about
> sysfs ops accessing different parts of data structures once
> device_del() is complete.  Implementing the exlusion at the driver
> level is possible but not easy because some hardware devices are
> represented with complex data structures, some of them are reused when
> devices are exchanged and some sysfs ops end up accessing the
> hardware.  So, it's often not possible to simply disassociate the data
> structure and float it till the last reference goes away.  There needs
> to be a synchronization point where the driver can tell that nothing
> is accessing released data structure or hardware resource after it and
> it's far easier to define it at the sysfs level.
>
>> sysfs_deactivate happens in the device_del(), but if we were to move
>> sysfs_deactivate into the final kobject_put then in theory we can
>> continue to block and be friendly but not need to be called with
>> locations where locks are held.
>
> Nobody would know when that final put will actually happen.  In
> progress sysfs ops might access the hardware after the hardware is
> gone or replaced with another unit.

Alright than that is a bad possible split of the functionality.  Which
is all I was suggesting splitting the functionality not doing away
with the wait or moving it to a point where the wait would not work.
It was simply my bad assumption that the final kobject_put would
happen before the module that controlled that kobject could be
removed.

I still think it might make sense to separate kobject_del into two
parts.  One that we call with the locks held and one without, but that
does seem to be applicable to only a very small set of cases and our
problems appear to be much larger than that.

For the moment I have generated a patch that does the lockdep
annotations, and I have found that a simple:

   find /sys -type f | xargs cat {} > /dev/null

trivially generates lockdep warnings.  In particular:

[  165.049042] 
[  165.049044] =======================================================
[  165.052761] [ INFO: possible circular locking dependency detected ]
[  165.052761] 2.6.33-rc2x86_64 #3
[  165.052761] -------------------------------------------------------
[  165.052761] cat/5026 is trying to acquire lock:
[  165.052761]  (&serio->drv_mutex){+.+.+.}, at: [<ffffffff8132ecaa>] atkbd_attr_show_helper+0x28/0x6e
[  165.052761] 
[  165.052761] but task is already holding lock:
[  165.089443]  (s_active){++++.+}, at: [<ffffffff810e84dd>] sysfs_get_active_two+0x2c/0x43
[  165.089443] 
[  165.089443] which lock already depends on the new lock.
[  165.089443] 
[  165.089443] 
[  165.089443] the existing dependency chain (in reverse order) is:
[  165.089443] 
[  165.089443] -> #1 (s_active){++++.+}:
[  165.089443]        [<ffffffff81054956>] validate_chain+0xa25/0xd1d
[  165.089443]        [<ffffffff810553d3>] __lock_acquire+0x785/0x7dc
[  165.089443]        [<ffffffff81056112>] lock_acquire+0x5a/0x74
[  165.089443]        [<ffffffff810e8202>] sysfs_addrm_finish+0xba/0x125
[  165.089443]        [<ffffffff810e68b0>] sysfs_hash_and_remove+0x4f/0x6b
[  165.089443]        [<ffffffff810e94cf>] remove_files+0x1f/0x2c
[  165.089443]        [<ffffffff810e9561>] sysfs_remove_group+0x85/0xb4
[  165.089443]        [<ffffffff81331f0f>] psmouse_disconnect+0x33/0x147
[  165.089443]        [<ffffffff8132687b>] serio_disconnect_driver+0x2d/0x3a
[  165.089443]        [<ffffffff81326898>] serio_driver_remove+0x10/0x14
[  165.089443]        [<ffffffff812077f0>] __device_release_driver+0x67/0xb0
[  165.089443]        [<ffffffff81207857>] device_release_driver+0x1e/0x2b
[  165.089443]        [<ffffffff81326e68>] serio_disconnect_port+0x60/0x69
[  165.089443]        [<ffffffff8132757a>] serio_thread+0x170/0x34a
[  165.089443]        [<ffffffff810470e7>] kthread+0x7d/0x85
[  165.089443]        [<ffffffff81002cd4>] kernel_thread_helper+0x4/0x10
[  165.089443] 
[  165.089443] -> #0 (&serio->drv_mutex){+.+.+.}:
[  165.089443]        [<ffffffff81054642>] validate_chain+0x711/0xd1d
[  165.089443]        [<ffffffff810553d3>] __lock_acquire+0x785/0x7dc
[  165.089443]        [<ffffffff81056112>] lock_acquire+0x5a/0x74
[  165.089443]        [<ffffffff814378ed>] mutex_lock_interruptible_nested+0x4a/0x307
[  165.089443]        [<ffffffff8132ecaa>] atkbd_attr_show_helper+0x28/0x6e
[  165.089443]        [<ffffffff8132ed81>] atkbd_do_show_extra+0x13/0x15
[  165.089443]        [<ffffffff812049b6>] dev_attr_show+0x20/0x43
[  165.089443]        [<ffffffff810e71db>] sysfs_read_file+0xba/0x145
[  165.089443]        [<ffffffff8109f507>] vfs_read+0xab/0x147
[  165.089443]        [<ffffffff8109f85c>] sys_read+0x47/0x70
[  165.089443]        [<ffffffff81001f2b>] system_call_fastpath+0x16/0x1b
[  165.089443] 
[  165.089443] other info that might help us debug this:
[  165.089443] 
[  165.089443] 3 locks held by cat/5026:
[  165.089443]  #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff810e715a>] sysfs_read_file+0x39/0x145
[  165.089443]  #1:  (s_active){++++.+}, at: [<ffffffff810e84d0>] sysfs_get_active_two+0x1f/0x43
[  165.089443]  #2:  (s_active){++++.+}, at: [<ffffffff810e84dd>] sysfs_get_active_two+0x2c/0x43
[  165.089443] 
[  165.089443] stack backtrace:
[  165.089443] Pid: 5026, comm: cat Not tainted 2.6.33-rc2x86_64 #3
[  165.089443] Call Trace:
[  165.089443]  [<ffffffff810538f3>] print_circular_bug+0xb3/0xc1
[  165.089443]  [<ffffffff81054642>] validate_chain+0x711/0xd1d
[  165.089443]  [<ffffffff81052fb6>] ? trace_hardirqs_on_caller+0x10b/0x12f
[  165.089443]  [<ffffffff810553d3>] __lock_acquire+0x785/0x7dc
[  165.089443]  [<ffffffff8132ecaa>] ? atkbd_attr_show_helper+0x28/0x6e
[  165.089443]  [<ffffffff81056112>] lock_acquire+0x5a/0x74
[  165.089443]  [<ffffffff8132ecaa>] ? atkbd_attr_show_helper+0x28/0x6e
[  165.089443]  [<ffffffff814378ed>] mutex_lock_interruptible_nested+0x4a/0x307
[  165.089443]  [<ffffffff8132ecaa>] ? atkbd_attr_show_helper+0x28/0x6e
[  165.089443]  [<ffffffff8132ee41>] ? atkbd_show_extra+0x0/0x28
[  165.089443]  [<ffffffff8132ecaa>] atkbd_attr_show_helper+0x28/0x6e
[  165.089443]  [<ffffffff8132ed81>] atkbd_do_show_extra+0x13/0x15
[  165.089443]  [<ffffffff812049b6>] dev_attr_show+0x20/0x43
[  165.089443]  [<ffffffff810e71db>] sysfs_read_file+0xba/0x145
[  165.089443]  [<ffffffff8109f507>] vfs_read+0xab/0x147
[  165.089443]  [<ffffffff8109f85c>] sys_read+0x47/0x70
[  165.089443]  [<ffffffff81001f2b>] system_call_fastpath+0x16/0x1b

Suggestions on how to sort out this other set of issues are welcome.

Eric

  parent reply	other threads:[~2010-01-02 21:49 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-12-24 22:00 Linux 2.6.33-rc2 - Merry Christmas Linus Torvalds
2009-12-25 10:27 ` -tip: origin tree boot crash Ingo Molnar
2009-12-25 19:49   ` Dmitry Torokhov
2009-12-26 20:19     ` Len Brown
2009-12-26 20:17   ` Len Brown
2009-12-27  4:20     ` Len Brown
2009-12-28  9:44       ` Ingo Molnar
2009-12-28 12:01         ` Ingo Molnar
2009-12-28 15:02           ` Paul Rolland
2009-12-28 16:15             ` Paul Rolland
2009-12-28 16:53             ` Paul Rolland
2009-12-28 20:17               ` Dmitry Torokhov
2009-12-30  6:14               ` Len Brown
2009-12-30  7:13                 ` Paul Rolland
2009-12-30  6:19               ` [PATCH] wmi: check find_guid() return value to prevent oops Len Brown
2009-12-30  6:21               ` [PATCH] dell-wmi: sys_init_module: 'dell_wmi'->init suspiciously returned 21, it should follow 0/-E convention Len Brown
2009-12-25 13:10 ` Linux 2.6.33-rc2 - Blank screen for Intel KMS Miguel Calleja
2009-12-29  9:50   ` Miguel Calleja
2009-12-29 14:01     ` Rafael J. Wysocki
2009-12-25 20:00 ` Linux 2.6.33-rc2 - Merry Christmas Borislav Petkov
2009-12-25 21:50   ` Borislav Petkov
2009-12-26  6:00     ` Jesse Barnes
2009-12-26  8:02       ` Borislav Petkov
2009-12-26  9:36 ` EHCI resume sysfs duplicates (was: Re: Linux 2.6.33-rc2 - Merry Christmas ...) Borislav Petkov
2009-12-26  9:45 ` drm_vm.c:drm_mmap: possible circular locking dependency detected " Borislav Petkov
2009-12-28  0:40   ` KOSAKI Motohiro
2009-12-30 21:10     ` Linus Torvalds
2009-12-30 21:34       ` Eric W. Biederman
2009-12-30 22:03         ` Linus Torvalds
2009-12-31  8:40           ` Eric W. Biederman
2009-12-31 19:04             ` Linus Torvalds
2010-01-01 13:58               ` [PATCH] sysfs: Cache the last sysfs_dirent to improve readdir scalability Eric W. Biederman
2010-01-01 15:33                 ` Borislav Petkov
2010-01-01 18:56                 ` Linus Torvalds
2010-01-01 22:43                   ` [PATCH] sysfs: Cache the last sysfs_dirent to improve readdir scalability v2 Eric W. Biederman
2010-01-01 23:10                     ` Linus Torvalds
2010-01-02  5:59                       ` Greg KH
2010-01-02 15:40                       ` Borislav Petkov
2010-01-01 15:16               ` drm_vm.c:drm_mmap: possible circular locking dependency detected (was: Re: Linux 2.6.33-rc2 - Merry Christmas ...) Eric W. Biederman
2010-01-02  2:59                 ` drm_vm.c:drm_mmap: possible circular locking dependency detected Tejun Heo
2010-01-02 21:37                   ` [PATCH] sysfs: Add lockdep annotations for the sysfs active reference Eric W. Biederman
2010-01-03  0:02                     ` Tejun Heo
2010-01-17 16:26                     ` Ming Lei
2010-01-17 17:18                       ` Eric W. Biederman
2010-01-17 18:03                         ` Dominik Brodowski
2010-01-02 21:49                   ` Eric W. Biederman [this message]
2010-01-03  0:32                     ` drm_vm.c:drm_mmap: possible circular locking dependency detected Tejun Heo
2010-01-03  2:06                       ` Eric W. Biederman
2010-01-03  5:01                         ` Tejun Heo
2010-01-03  5:38                           ` Eric W. Biederman
2010-01-03  6:05                             ` Tejun Heo
2010-01-03  7:47                       ` Dmitry Torokhov
2010-01-03 10:57                         ` Eric W. Biederman
2010-01-03 11:14                           ` Eric W. Biederman
2010-01-04 19:16                             ` Dmitry Torokhov
2010-01-04 18:57                           ` Dmitry Torokhov
2010-01-04 19:43                             ` Eric W. Biederman
2010-01-04 21:12                               ` Dmitry Torokhov
2010-01-04 23:09                               ` Tejun Heo
2009-12-31  8:40           ` drm_vm.c:drm_mmap: possible circular locking dependency detected (was: Re: Linux 2.6.33-rc2 - Merry Christmas ...) Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m17hs0xjmn.fsf@fess.ebiederm.org \
    --to=ebiederm@xmission.com \
    --cc=airlied@linux.ie \
    --cc=greg@kroah.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=petkovbb@googlemail.com \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.