* lockdep: how to tell it multiple pte locks is OK?
@ 2007-10-07 6:31 Jeremy Fitzhardinge
2007-10-07 8:29 ` Peter Zijlstra
2007-10-07 16:58 ` Arjan van de Ven
0 siblings, 2 replies; 7+ messages in thread
From: Jeremy Fitzhardinge @ 2007-10-07 6:31 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Linux Kernel Mailing List
I'm writing some code which is doing some batch processing on pte pages,
and so wants to hold multiple pte locks at once. This seems OK, but
lockdep is giving me the warning:
=============================================
[ INFO: possible recursive locking detected ]
2.6.23-rc9-paravirt #1673
---------------------------------------------
init/1 is trying to acquire lock:
(__pte_lockptr(new)){--..}, at: [<c0102d85>] lock_pte+0x10/0x15
but task is already holding lock:
(__pte_lockptr(new)){--..}, at: [<c0102d85>] lock_pte+0x10/0x15
other info that might help us debug this:
4 locks held by init/1:
#0: (&mm->mmap_sem){----}, at: [<c012999e>] copy_process+0xab4/0x12bf
#1: (&mm->mmap_sem/1){--..}, at: [<c01299ae>] copy_process+0xac4/0x12bf
#2: (&mm->page_table_lock){--..}, at: [<c010334a>] xen_dup_mmap+0x11/0x24
#3: (__pte_lockptr(new)){--..}, at: [<c0102d85>] lock_pte+0x10/0x15
stack backtrace:
[<c0109282>] show_trace_log_lvl+0x1a/0x2f
[<c0109d18>] show_trace+0x12/0x14
[<c0109d30>] dump_stack+0x16/0x18
[<c0147bd0>] __lock_acquire+0x195/0xc5f
[<c0148722>] lock_acquire+0x88/0xac
[<c035c2a3>] _spin_lock+0x35/0x42
[<c0102d85>] lock_pte+0x10/0x15
[<c010347d>] pin_page+0x67/0x17e
[<c0102d23>] pgd_walk+0x168/0x1ba
[<c0103283>] xen_pgd_pin+0x42/0xf8
[<c0103352>] xen_dup_mmap+0x19/0x24
[<c0129b63>] copy_process+0xc79/0x12bf
[<c012a419>] do_fork+0x99/0x1bf
[<c0106216>] sys_clone+0x33/0x39
[<c010814e>] syscall_call+0x7/0xb
=======================
I presume this is because I'm holding multiple pte locks (class
"__pte_lockptr(new)"). Is there some way I can tell lockdep this is OK?
I presume I'm the first person to try holding multiple pte locks at
once, so there's no existing locking order for these locks. I'm always
traversing and locking the pagetable in virtual address order (and this
seems like a sane-enough rule for anyone else who wants to hold multiple
pte locks).
Thanks,
J
* Re: lockdep: how to tell it multiple pte locks is OK?
2007-10-07 6:31 lockdep: how to tell it multiple pte locks is OK? Jeremy Fitzhardinge
@ 2007-10-07 8:29 ` Peter Zijlstra
2007-10-07 15:06 ` Jeremy Fitzhardinge
2007-10-07 16:58 ` Arjan van de Ven
1 sibling, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2007-10-07 8:29 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Ingo Molnar, Linux Kernel Mailing List
On Sat, 2007-10-06 at 23:31 -0700, Jeremy Fitzhardinge wrote:
> I'm writing some code which is doing some batch processing on pte pages,
> and so wants to hold multiple pte locks at once. This seems OK, but
> lockdep is giving me the warning:
>
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.23-rc9-paravirt #1673
> ---------------------------------------------
> init/1 is trying to acquire lock:
> (__pte_lockptr(new)){--..}, at: [<c0102d85>] lock_pte+0x10/0x15
>
> but task is already holding lock:
> (__pte_lockptr(new)){--..}, at: [<c0102d85>] lock_pte+0x10/0x15
>
> other info that might help us debug this:
> 4 locks held by init/1:
> #0: (&mm->mmap_sem){----}, at: [<c012999e>] copy_process+0xab4/0x12bf
> #1: (&mm->mmap_sem/1){--..}, at: [<c01299ae>] copy_process+0xac4/0x12bf
> #2: (&mm->page_table_lock){--..}, at: [<c010334a>] xen_dup_mmap+0x11/0x24
> #3: (__pte_lockptr(new)){--..}, at: [<c0102d85>] lock_pte+0x10/0x15
>
> stack backtrace:
> [<c0109282>] show_trace_log_lvl+0x1a/0x2f
> [<c0109d18>] show_trace+0x12/0x14
> [<c0109d30>] dump_stack+0x16/0x18
> [<c0147bd0>] __lock_acquire+0x195/0xc5f
> [<c0148722>] lock_acquire+0x88/0xac
> [<c035c2a3>] _spin_lock+0x35/0x42
> [<c0102d85>] lock_pte+0x10/0x15
> [<c010347d>] pin_page+0x67/0x17e
> [<c0102d23>] pgd_walk+0x168/0x1ba
> [<c0103283>] xen_pgd_pin+0x42/0xf8
> [<c0103352>] xen_dup_mmap+0x19/0x24
> [<c0129b63>] copy_process+0xc79/0x12bf
> [<c012a419>] do_fork+0x99/0x1bf
> [<c0106216>] sys_clone+0x33/0x39
> [<c010814e>] syscall_call+0x7/0xb
> =======================
>
>
> I presume this is because I'm holding multiple pte locks (class
> "__pte_lockptr(new)"). Is there some way I can tell lockdep this is OK?
Yeah, the typical way is to use spin_lock_nested(lock, nesting_level),
which lets you annotate these nestings. However, nesting_level must
not be larger than 8, so if your batch is larger than that, we have a
problem.
> I presume I'm the first person to try holding multiple pte locks at
> once, so there's no existing locking order for these locks.
Not quite, things like copy_pte_range() take 2.
> I'm always
> traversing and locking the pagetable in virtual address order (and this
> seems like a sane-enough rule for anyone else who wants to hold multiple
> pte locks).
I'm quite sure copy_pte_range() could be used so that it violates that
order.
* Re: lockdep: how to tell it multiple pte locks is OK?
2007-10-07 8:29 ` Peter Zijlstra
@ 2007-10-07 15:06 ` Jeremy Fitzhardinge
0 siblings, 0 replies; 7+ messages in thread
From: Jeremy Fitzhardinge @ 2007-10-07 15:06 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Ingo Molnar, Linux Kernel Mailing List
Peter Zijlstra wrote:
>>
>> I presume this is because I'm holding multiple pte locks (class
>> "__pte_lockptr(new)"). Is there some way I can tell lockdep this is OK?
>>
>
> Yeah, the typical way is to use spin_lock_nested(lock, nesting_level),
> this allows one to annotate these nestings. However, nesting_level must
> not be larger than 8, so if your batch is larger than that, we have a
> problem.
>
Yeah, my batches are up to about 32.
>> I presume I'm the first person to try holding multiple pte locks at
>> once, so there's no existing locking order for these locks.
>>
>
> Not quite, things like copy_pte_range() take 2.
>
Hm, and it uses SINGLE_DEPTH_NESTING...
>> I'm always
>> traversing and locking the pagetable in virtual address order (and this
>> seems like a sane-enough rule for anyone else who wants to hold multiple
>> pte locks).
>>
>
> I'm quite sure copy_pte_range() could be used so that it violates that
> order.
>
Good point. It's not a problem for me because those are two pte locks
in different pagetables, whereas my multiple pte locks are always within
the same pagetable. Someone wanting to lock multiple ptes from multiple
pagetables would have a more complex locking-order problem.
(Or if we end up sharing ptes between pagetables it might get tricky.)
J
* Re: lockdep: how to tell it multiple pte locks is OK?
2007-10-07 6:31 lockdep: how to tell it multiple pte locks is OK? Jeremy Fitzhardinge
2007-10-07 8:29 ` Peter Zijlstra
@ 2007-10-07 16:58 ` Arjan van de Ven
2007-10-07 17:17 ` Jeremy Fitzhardinge
1 sibling, 1 reply; 7+ messages in thread
From: Arjan van de Ven @ 2007-10-07 16:58 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Ingo Molnar, Linux Kernel Mailing List
On Sat, 06 Oct 2007 23:31:33 -0700, Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> I presume I'm the first person to try holding multiple pte locks at
> once, so there's no existing locking order for these locks. I'm
> always traversing and locking the pagetable in virtual address order
> (and this seems like a sane-enough rule for anyone else who wants to
> hold multiple pte locks).
I'm not sure that's a valid assumption in light of things like sharing
pagetables between processes etc etc..
(granted, that one is out of tree right now but I still hope it'll go
in some day:)
* Re: lockdep: how to tell it multiple pte locks is OK?
2007-10-07 16:58 ` Arjan van de Ven
@ 2007-10-07 17:17 ` Jeremy Fitzhardinge
2007-10-07 17:46 ` Arjan van de Ven
0 siblings, 1 reply; 7+ messages in thread
From: Jeremy Fitzhardinge @ 2007-10-07 17:17 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Ingo Molnar, Linux Kernel Mailing List
On Oct 7, 2007, at 9:58 AM, Arjan van de Ven wrote:
> On Sat, 06 Oct 2007 23:31:33 -0700
>
>> I presume I'm the first person to try holding multiple pte locks at
>> once, so there's no existing locking order for these locks. I'm
>> always traversing and locking the pagetable in virtual address order
>> (and this seems like a sane-enough rule for anyone else who wants to
>> hold multiple pte locks).
>
> I'm not sure that's a valid assumption in light of things like sharing
> pagetables between processes etc etc..
> (granted, that one is out of tree right now but I still hope it'll go
> in some day:)
Well, yes, but that will take some thought about how split pte locks
will work anyway (or more specifically, fork will probably just end
up reusing the pte pages and avoid the need to do any cross-pagetable
pte locking anyway, though I guess that will be deferred to COW
handling).
So are you saying I should pass up the opportunity to optimise a
relatively hot path (fork/exec/exit) because it will need some
further thought if/when shared ptes get implemented? Doesn't seem
like a good tradeoff...
J
* Re: lockdep: how to tell it multiple pte locks is OK?
2007-10-07 17:17 ` Jeremy Fitzhardinge
@ 2007-10-07 17:46 ` Arjan van de Ven
2007-10-09 0:20 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 7+ messages in thread
From: Arjan van de Ven @ 2007-10-07 17:46 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Ingo Molnar, Linux Kernel Mailing List
On Sun, 7 Oct 2007 10:17:47 -0700
Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>
> On Oct 7, 2007, at 9:58 AM, Arjan van de Ven wrote:
>
> > On Sat, 06 Oct 2007 23:31:33 -0700
> >
> >> I presume I'm the first person to try holding multiple pte locks
> >> at once, so there's no existing locking order for these locks. I'm
> >> always traversing and locking the pagetable in virtual address
> >> order (and this seems like a sane-enough rule for anyone else who
> >> wants to hold multiple pte locks).
> >
> > I'm not sure that's a valid assumption in light of things like
> > sharing pagetables between processes etc etc..
> > (granted, that one is out of tree right now but I still hope it'll
> > go in some day:)
>
> Well, yes, but that will take some thought about how split pte locks
> will work anyway (or more specifically, fork will probably just end
> up reusing the pte pages and avoid the need to do any
> cross-pagetable pte locking anyway, though I guess that will be
> deferred to COW handling).
>
> So are you saying I should pass up the opportunity to optimise a
> relatively hot path (fork/exec/exit) because it will need some
> further thought if/when shared ptes get implemented?
s/implemented/merged/ :)
In fact shared pagetables are already there for hugepages.
For small pages it's a patch at this point.
> Doesn't seem
> like a good tradeoff...
no I'm not saying that. I'm just saying that I'm worried about the
locking robustness of your trick in general.
* Re: lockdep: how to tell it multiple pte locks is OK?
2007-10-07 17:46 ` Arjan van de Ven
@ 2007-10-09 0:20 ` Jeremy Fitzhardinge
0 siblings, 0 replies; 7+ messages in thread
From: Jeremy Fitzhardinge @ 2007-10-09 0:20 UTC (permalink / raw)
To: Arjan van de Ven; +Cc: Ingo Molnar, Linux Kernel Mailing List
Arjan van de Ven wrote:
> s/implemented/merged/ :)
>
> In fact shared pagetables are already there for hugepages.
> For small pages it's a patch at this point.
>
Is it kept up to date? Where does it live?
> no I'm not saying that. I'm just saying that I'm worried about the
> locking robustness of your trick in general.
>
Hm, well I won't need to re-pin shared ptes anyway, so I think it's moot.
J