From: "Thomas Hellström" <thomas@shipmail.org>
To: Eric Anholt <eric@anholt.net>
Cc: Peter Zijlstra <peterz@infradead.org>,
Wang Chen <wangchen@cn.fujitsu.com>,
Nick Piggin <nickpiggin@yahoo.com.au>,
Ingo Molnar <mingo@elte.hu>,
dri-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] drm: Fix lock order reversal between mmap_sem and struct_mutex.
Date: Wed, 25 Feb 2009 09:54:39 +0100 [thread overview]
Message-ID: <49A5074F.8060307@shipmail.org> (raw)
In-Reply-To: <1235549745.5213.4.camel@gaiman>
Eric Anholt wrote:
> On Fri, 2009-02-20 at 08:36 +0100, Peter Zijlstra wrote:
>
>> On Thu, 2009-02-19 at 18:04 -0800, Eric Anholt wrote:
>>
>>> On Thu, 2009-02-19 at 23:26 +0100, Peter Zijlstra wrote:
>>>
>>>> On Thu, 2009-02-19 at 22:02 +0100, Thomas Hellstrom wrote:
>>>>
>>>>>
>>>>> It looks to me like the driver preferred locking order is
>>>>>
>>>>> object_mutex (which happens to be the device global struct_mutex)
>>>>> mmap_sem
>>>>> offset_mutex.
>>>>>
>>>>> So if one could avoid using the struct_mutex for object bookkeeping (A
>>>>> separate lock) then
>>>>> vm_open() and vm_close() would adhere to that locking order as well,
>>>>> simply by not taking the struct_mutex at all.
>>>>>
>>>>> So only fault() remains, in which that locking order is reversed.
>>>>> Personally I think the trylock ->reschedule->retry method with proper
>>>>> commenting is a good solution. It will be the _only_ place where locking
>>>>> order is reversed and it is done in a deadlock-safe manner. Note that
>>>>> fault() doesn't really fail, but requests a retry from user-space with
>>>>> rescheduling to give the process holding the struct_mutex time to
>>>>> release it.
>>>>>
>>>> It doesn't do the reschedule -- need_resched() will check if the current
>>>> task was marked to be scheduled away, furthermore yield based locking
>>>> sucks chunks.
>>>>
>> Imagine what would happen if your faulting task was the highest RT prio
>> task in the system, you'd end up with a life-lock.
>>
>>
>>>> What's so very difficult about pulling the copy_*_user() out from under
>>>> the locks?
>>>>
>>> That we're expecting the data movement to occur while holding device
>>> state in place. For example, we write data through the GTT most of the
>>> time so we:
>>>
>>> lock struct_mutex
>>> pin the object to the GTT
>>> flushing caches as needed
>>> copy_from_user
>>> unpin object
>>> unlock struct_mutex
>>>
>> So you cannot drop the lock once you've pinned the dst object?
>>
>>
>>> If I'm to pull the copy_from_user out, that means I have to:
>>>
>>> alloc temporary storage
>>> for each block of temp storage size:
>>> copy_from_user
>>> lock struct_mutex
>>> pin the object to the GTT
>>> flush caches as needed
>>> memcpy
>>> unpin object
>>> unlock struct_mutex
>>>
>>> At this point of introducing our third copy of the user's data in our
>>> hottest path, we should probably ditch the pwrite path entirely and go
>>> to user mapping of the objects for performance. Requiring user mapping
>>> (which has significant overhead) cuts the likelihood of moving from
>>> user-space object caching to kernel object caching in the future, which
>>> has the potential of saving steaming piles of memory.
>>>
>> Or you could get_user_pages() to fault the user pages and pin them, and
>> then do pagefault_disable() and use copy_from_user_inatomic or such, and
>> release the pages again.
>>
>
> I started poking at this today, since the get_user_pages sounded like
> the solution. Only then I noticed: when we unbind an existing object,
> we have to unmap_mapping_range to clear the clients' mappings to it in
> the GTT, which needs to happen while the struct lock (protecting the gtt
> structure and the gtt to object mappings) is held. So for fault we have
> mmap_sem held to struct mutex taken for poking at the gtt structure, and
> for unbind we have struct mutex held to mmap_sem taken to clear
> mappings.
>
>
I don't think the mmap_sem is taken during unmap_mapping_rage() ?
/Thomas
next prev parent reply other threads:[~2009-02-25 8:55 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-02-18 0:59 [PATCH] drm: Fix lock order reversal between mmap_sem and struct_mutex Eric Anholt
2009-02-18 8:02 ` Wang Chen
2009-02-18 16:38 ` [PATCH] drm: Take mmap_sem up front to avoid lock order violations krh
2009-02-19 9:19 ` Peter Zijlstra
2009-02-19 10:33 ` Peter Zijlstra
2009-02-19 14:49 ` Kristian Høgsberg
2009-02-19 15:17 ` Nick Piggin
2009-02-19 15:21 ` Kristian Høgsberg
2009-02-19 12:57 ` Nick Piggin
2009-02-21 2:33 ` Eric Anholt
2009-02-18 15:08 ` [PATCH] drm: Fix lock order reversal between mmap_sem and struct_mutex Peter Zijlstra
2009-02-19 21:02 ` Thomas Hellstrom
2009-02-19 22:26 ` Peter Zijlstra
2009-02-20 2:04 ` Eric Anholt
2009-02-20 7:36 ` Peter Zijlstra
2009-02-25 8:15 ` Eric Anholt
2009-02-25 8:54 ` Thomas Hellström [this message]
2009-02-25 9:07 ` Peter Zijlstra
2009-02-20 8:31 ` Thomas Hellstrom
2009-02-20 8:47 ` Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=49A5074F.8060307@shipmail.org \
--to=thomas@shipmail.org \
--cc=dri-devel@lists.sourceforge.net \
--cc=eric@anholt.net \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=nickpiggin@yahoo.com.au \
--cc=peterz@infradead.org \
--cc=wangchen@cn.fujitsu.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox