* [PATCH] Increase lockdep MAX_LOCK_DEPTH
@ 2007-08-31 4:43 Eric Sandeen
From: Eric Sandeen @ 2007-08-31 4:43 UTC (permalink / raw)
To: linux-kernel Mailing List; +Cc: xfs-oss
The xfs filesystem can exceed the current lockdep
MAX_LOCK_DEPTH, because when deleting an entire cluster of inodes,
they all get locked in xfs_ifree_cluster(). The normal cluster
size is 8192 bytes, and with the default (and minimum) inode size
of 256 bytes, that's up to 32 inodes that get locked. Throw in a
few other locks along the way, and 40 seems enough to get me through
all the tests in the xfsqa suite on 4k blocks. (block sizes
above 8K will still exceed this though, I think)
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Index: linux-2.6.23-rc3/include/linux/sched.h
===================================================================
--- linux-2.6.23-rc3.orig/include/linux/sched.h
+++ linux-2.6.23-rc3/include/linux/sched.h
@@ -1125,7 +1125,7 @@ struct task_struct {
 	int softirq_context;
 #endif
 #ifdef CONFIG_LOCKDEP
-# define MAX_LOCK_DEPTH 30UL
+# define MAX_LOCK_DEPTH 40UL
 	u64 curr_chain_key;
 	int lockdep_depth;
 	struct held_lock held_locks[MAX_LOCK_DEPTH];
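
For context on what this constant controls: lockdep records every lock a
task currently holds in the fixed-size held_locks[] array above, and the
acquire path refuses to go deeper than MAX_LOCK_DEPTH. A paraphrased
sketch of that check (not the verbatim 2.6.23 kernel/lockdep.c code, which
does this inline in __lock_acquire()):

	/*
	 * Paraphrased sketch: once a task already holds MAX_LOCK_DEPTH
	 * locks, lockdep gives up and disables itself rather than
	 * overflow curr->held_locks[].
	 */
	static int check_lock_depth(struct task_struct *curr)
	{
		if (unlikely(curr->lockdep_depth >= MAX_LOCK_DEPTH)) {
			debug_locks_off();
			printk("BUG: MAX_LOCK_DEPTH too low!\n");
			printk("turning off the locking correctness validator.\n");
			return 0;
		}
		return 1;
	}

So the patch above simply buys ten more slots per task; it does not change
any locking behaviour.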
* Re: [PATCH] Increase lockdep MAX_LOCK_DEPTH
From: Peter Zijlstra @ 2007-08-31  6:39 UTC (permalink / raw)
To: Eric Sandeen
Cc: linux-kernel Mailing List, xfs-oss, Dave Chinner, Ingo Molnar

On Thu, 2007-08-30 at 23:43 -0500, Eric Sandeen wrote:
> The xfs filesystem can exceed the current lockdep
> MAX_LOCK_DEPTH, because when deleting an entire cluster of inodes,
> they all get locked in xfs_ifree_cluster(). The normal cluster
> size is 8192 bytes, and with the default (and minimum) inode size
> of 256 bytes, that's up to 32 inodes that get locked. Throw in a
> few other locks along the way, and 40 seems enough to get me through
> all the tests in the xfsqa suite on 4k blocks. (block sizes
> above 8K will still exceed this though, I think)

As 40 will still not be enough for people with larger block sizes, this
does not seem like a solid solution. Could XFS possibly batch in
smaller (fixed-size) chunks, or does that have significant downsides?
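
What "batch in smaller (fixed-size) chunks" might look like, purely as a
hypothetical sketch - every name below is invented for illustration, this
is not how xfs_ifree_cluster() works, and the follow-ups explain why the
inodes in fact have to stay locked for the whole operation:

	/* Hypothetical: free a cluster's inodes a few at a time so that
	 * at most IFREE_BATCH ilocks are held simultaneously. */
	#define IFREE_BATCH	8

	static void ifree_cluster_batched(struct xfs_inode **ips, int nr)
	{
		int i, j, batch;

		for (i = 0; i < nr; i += batch) {
			batch = nr - i;
			if (batch > IFREE_BATCH)
				batch = IFREE_BATCH;

			for (j = 0; j < batch; j++)
				xfs_ilock(ips[i + j], XFS_ILOCK_EXCL);

			/* ... mark this batch of inodes stale ... */

			for (j = 0; j < batch; j++)
				xfs_iunlock(ips[i + j], XFS_ILOCK_EXCL);
		}
	}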
* Re: [PATCH] Increase lockdep MAX_LOCK_DEPTH
From: David Chinner @ 2007-08-31 13:50 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Eric Sandeen, linux-kernel Mailing List, xfs-oss, Dave Chinner, Ingo Molnar

On Fri, Aug 31, 2007 at 08:39:49AM +0200, Peter Zijlstra wrote:
> On Thu, 2007-08-30 at 23:43 -0500, Eric Sandeen wrote:
> > The xfs filesystem can exceed the current lockdep
> > MAX_LOCK_DEPTH, because when deleting an entire cluster of inodes,
> > they all get locked in xfs_ifree_cluster(). [...]
>
> As 40 will still not be enough for people with larger block sizes, this
> does not seem like a solid solution. Could XFS possibly batch in
> smaller (fixed-size) chunks, or does that have significant downsides?

The problem is not filesystem block size, it's the xfs inode cluster buffer
size / the size of the inodes that determines the lock depth. The common
case is 8k/256 = 32 inodes in a buffer, and they all get locked during
inode cluster writeback.

This inode writeback clustering is one of the reasons XFS doesn't suffer
from atime issues as much as other filesystems - it doesn't need to do as
much I/O to write back dirty inodes to disk.

IOWs, we are not going to make the inode clusters smaller - if anything
they are going to get *larger* in future so we do less I/O during inode
writeback than we do now.

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
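
To make the arithmetic concrete (illustrative constants only - the real
values come from the filesystem geometry, and the two-locks-per-inode
figure is spelled out later in the thread):

	#define INODE_CLUSTER_BYTES	8192	/* common cluster buffer size */
	#define INODE_BYTES		256	/* default/minimum inode size */

	#define INODES_PER_CLUSTER	(INODE_CLUSTER_BYTES / INODE_BYTES)	/* 32 */
	#define LOCKS_PER_INODE		2	/* ilock + flush lock */
	#define CLUSTER_LOCKS		(INODES_PER_CLUSTER * LOCKS_PER_INODE)	/* 64 */
	/* ...plus the handful of locks already held on entry, which is
	 * what overruns a MAX_LOCK_DEPTH of 30 (or even 40). */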
* Re: [PATCH] Increase lockdep MAX_LOCK_DEPTH
From: Eric Sandeen @ 2007-08-31 14:33 UTC (permalink / raw)
To: David Chinner
Cc: Peter Zijlstra, linux-kernel Mailing List, xfs-oss, Ingo Molnar

David Chinner wrote:
> On Fri, Aug 31, 2007 at 08:39:49AM +0200, Peter Zijlstra wrote:
>> As 40 will still not be enough for people with larger block sizes, this
>> does not seem like a solid solution. Could XFS possibly batch in
>> smaller (fixed-size) chunks, or does that have significant downsides?
>
> The problem is not filesystem block size, it's the xfs inode cluster buffer
> size / the size of the inodes that determines the lock depth. The common
> case is 8k/256 = 32 inodes in a buffer, and they all get locked during
> inode cluster writeback.

Right, but as I understand it, the cluster size *minimum* is the block
size; that's why I made reference to block size - 16k blocks would have 64
inodes per cluster, minimum, potentially all locked in these paths. Just
saying that today, larger blocks -> larger clusters -> more locks.

Even though a MAX_LOCK_DEPTH of 40 may not accommodate these scenarios, at
least it would accommodate the most common case today...

Peter, unless there is some other reason to do so, changing xfs
performance behavior simply to satisfy lockdep limitations* doesn't seem
like the best plan.

I suppose one slightly flaky option would be for xfs to see whether
lockdep is enabled and adjust cluster size based on MAX_LOCK_DEPTH... on
the argument that lockdep is likely used in debugging kernels where sheer
performance is less important... but that sounds pretty flaky to me.

-Eric

* and I don't mean that in a pejorative sense; just the fact that some max
depth must be chosen - the literal "limitation."
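
Eric's "flaky option" could be expressed as something like the sketch
below. This is purely hypothetical - no such workaround was merged, the
function name is invented, and the m_inode_cluster_size/sb_inodesize
fields are assumptions about the mount structure:

	/* Hypothetical: when lockdep is built in, cap the inode cluster
	 * size so a whole-cluster lock sweep fits under MAX_LOCK_DEPTH. */
	static void xfs_lockdep_limit_cluster_size(struct xfs_mount *mp)
	{
	#ifdef CONFIG_LOCKDEP
		/* 2 locks per inode, minus headroom for locks already held */
		int max_inodes = (MAX_LOCK_DEPTH - 8) / 2;

		mp->m_inode_cluster_size = min_t(int, mp->m_inode_cluster_size,
				max_inodes * (int)mp->m_sb.sb_inodesize);
	#endif
	}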
* Re: [PATCH] Increase lockdep MAX_LOCK_DEPTH
From: Peter Zijlstra @ 2007-08-31 14:36 UTC (permalink / raw)
To: Eric Sandeen
Cc: David Chinner, linux-kernel Mailing List, xfs-oss, Ingo Molnar

On Fri, 2007-08-31 at 09:33 -0500, Eric Sandeen wrote:
> Peter, unless there is some other reason to do so, changing xfs
> performance behavior simply to satisfy lockdep limitations* doesn't seem
> like the best plan.
>
> I suppose one slightly flaky option would be for xfs to see whether
> lockdep is enabled and adjust cluster size based on MAX_LOCK_DEPTH... on
> the argument that lockdep is likely used in debugging kernels where sheer
> performance is less important... but that sounds pretty flaky to me.

Agreed, that sucks too :-/

I was hoping there would be a 'nice' solution. Ah well, once again reality
ruins it.
* Re: [PATCH] Increase lockdep MAX_LOCK_DEPTH
From: Peter Zijlstra @ 2007-08-31 14:33 UTC (permalink / raw)
To: David Chinner
Cc: Eric Sandeen, linux-kernel Mailing List, xfs-oss, Ingo Molnar

On Fri, 2007-08-31 at 23:50 +1000, David Chinner wrote:
> The problem is not filesystem block size, it's the xfs inode cluster buffer
> size / the size of the inodes that determines the lock depth. The common
> case is 8k/256 = 32 inodes in a buffer, and they all get locked during
> inode cluster writeback.
>
> This inode writeback clustering is one of the reasons XFS doesn't suffer
> from atime issues as much as other filesystems - it doesn't need to do as
> much I/O to write back dirty inodes to disk.
>
> IOWs, we are not going to make the inode clusters smaller - if anything
> they are going to get *larger* in future so we do less I/O during inode
> writeback than we do now.

Since they are all trylocks, that seems to suggest there is no hard _need_
to lock a whole inode cluster at once, and one could iterate through it
with fewer inodes locked. Granted, I have absolutely no understanding of
what I'm talking about :-)

Trouble is, we'd like to have a sane upper bound on the number of held
locks at any one time. Obviously this is just wishful thinking, because a
lot of lock chains also depend on the number of online cpus...
* Re: [PATCH] Increase lockdep MAX_LOCK_DEPTH
From: David Chinner @ 2007-08-31 15:05 UTC (permalink / raw)
To: Peter Zijlstra
Cc: David Chinner, Eric Sandeen, linux-kernel Mailing List, xfs-oss, Ingo Molnar

On Fri, Aug 31, 2007 at 04:33:51PM +0200, Peter Zijlstra wrote:
> On Fri, 2007-08-31 at 23:50 +1000, David Chinner wrote:
> > IOWs, we are not going to make the inode clusters smaller - if anything
> > they are going to get *larger* in future so we do less I/O during inode
> > writeback than we do now.
>
> Since they are all trylocks, that seems to suggest there is no hard _need_
> to lock a whole inode cluster at once, and one could iterate through it
> with fewer inodes locked.

That's kind of misleading. They are trylocks to prevent deadlocks with
other threads that may be cleaning up a given inode, or because the inodes
may already be locked for writeback and so locking them a second time
would deadlock. Basically we have to run the lower loops with the inodes
locked, and the trylocks simply avoid the need for us to explicitly check
whether items are locked to avoid deadlocks.

> Granted, I have absolutely no understanding of what I'm talking about :-)
>
> Trouble is, we'd like to have a sane upper bound on the number of held
> locks at any one time. Obviously this is just wishful thinking, because a
> lot of lock chains also depend on the number of online cpus...

Sure - this is an obvious case where it is valid to take >30 locks at once
in a single thread. In fact, worst case here we are taking twice this
number of locks - we actually take 2 per inode (ilock and flock), so a
full 32-inode cluster free would take >60 locks in the middle of this
function and we should be busting this depth counter limit all the time.

Do semaphores (the flush locks) contribute to the lock depth counters?

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
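
The pattern Dave is describing looks roughly like the simplified sketch
below - trylock every inode in the cluster, skip the ones some other
thread already owns, and keep the rest locked (two locks each) until the
whole cluster has been processed. Helper names follow old XFS conventions,
but treat this as pseudocode rather than the real xfs_ifree_cluster():

	/* Simplified sketch; error handling and the actual "mark the
	 * cluster stale" work are omitted. */
	static void lock_cluster_inodes(struct xfs_inode **cips, int ninodes)
	{
		int i;

		for (i = 0; i < ninodes; i++) {
			struct xfs_inode *ip = cips[i];

			/* trylock: another thread may already hold the ilock
			 * or flush lock for writeback/reclaim, and blocking
			 * here would deadlock. */
			if (!xfs_ilock_nowait(ip, XFS_ILOCK_EXCL))
				continue;
			if (!xfs_iflock_nowait(ip)) {
				xfs_iunlock(ip, XFS_ILOCK_EXCL);
				continue;
			}
			/* ip now holds 2 locks until the cluster free is
			 * complete - with 32 inodes that is up to 64 locks
			 * held by one thread. */
		}
	}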
* Re: [PATCH] Increase lockdep MAX_LOCK_DEPTH
From: Peter Zijlstra @ 2007-08-31 15:09 UTC (permalink / raw)
To: David Chinner
Cc: Eric Sandeen, linux-kernel Mailing List, xfs-oss, Ingo Molnar

On Sat, 2007-09-01 at 01:05 +1000, David Chinner wrote:
> Sure - this is an obvious case where it is valid to take >30 locks at once
> in a single thread. In fact, worst case here we are taking twice this
> number of locks - we actually take 2 per inode (ilock and flock), so a
> full 32-inode cluster free would take >60 locks in the middle of this
> function and we should be busting this depth counter limit all the time.

I think this started because jeffpc couldn't boot without XFS busting
lockdep :-)

> Do semaphores (the flush locks) contribute to the lock depth counters?

No, alas, we cannot handle semaphores in lockdep. Semaphores don't have a
strict owner, hence we cannot track them. This is one of the reasons to
rid ourselves of semaphores - that, and there are very few cases where the
actual semantics of semaphores are needed. Most of the time, code using
semaphores can be expressed with either a mutex or a completion.
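
For readers unfamiliar with the distinction Peter is drawing, the two
replacements look like this in generic code (nothing XFS-specific, just
the standard kernel primitives):

	#include <linux/mutex.h>
	#include <linux/completion.h>

	/* Semaphore used as a sleeping lock with a clear owner: a mutex
	 * expresses the same thing, and lockdep can track it. */
	static DEFINE_MUTEX(my_lock);

	static void critical_section(void)
	{
		mutex_lock(&my_lock);
		/* ... same task locks and unlocks ... */
		mutex_unlock(&my_lock);
	}

	/* Semaphore used to wait for an event signalled elsewhere: a
	 * completion expresses that directly. */
	static DECLARE_COMPLETION(my_done);

	static void waiter(void)
	{
		wait_for_completion(&my_done);
	}

	static void signaller(void)
	{
		complete(&my_done);
	}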
* Re: [PATCH] Increase lockdep MAX_LOCK_DEPTH
From: Eric Sandeen @ 2007-08-31 15:11 UTC (permalink / raw)
To: Peter Zijlstra
Cc: David Chinner, linux-kernel Mailing List, xfs-oss, Ingo Molnar

Peter Zijlstra wrote:
> On Sat, 2007-09-01 at 01:05 +1000, David Chinner wrote:
>> Do semaphores (the flush locks) contribute to the lock depth counters?
>
> No, alas, we cannot handle semaphores in lockdep.

That explains why 40 was enough for me, I guess :)

-Eric
* Re: [PATCH] Increase lockdep MAX_LOCK_DEPTH
From: David Chinner @ 2007-08-31 15:19 UTC (permalink / raw)
To: Peter Zijlstra
Cc: David Chinner, Eric Sandeen, linux-kernel Mailing List, xfs-oss, Ingo Molnar

On Fri, Aug 31, 2007 at 05:09:21PM +0200, Peter Zijlstra wrote:
> I think this started because jeffpc couldn't boot without XFS busting
> lockdep :-)

Ok....

> > Do semaphores (the flush locks) contribute to the lock depth counters?
>
> No, alas, we cannot handle semaphores in lockdep. Semaphores don't have a
> strict owner, hence we cannot track them. This is one of the reasons to
> rid ourselves of semaphores - that, and there are very few cases where the
> actual semantics of semaphores are needed. Most of the time, code using
> semaphores can be expressed with either a mutex or a completion.

Yeah, and the flush lock is something we can't really use either of those
for, as we require both the multi-process lock/unlock behaviour and the
mutual exclusion that a semaphore provides.

So I guess that means it's only the ilock nesting that is at issue here,
and that right now a max lock depth of 40 would probably be ok...

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
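
The awkward property of the flush lock that Dave refers to is that it is
taken when an inode flush is started but released from I/O completion,
i.e. by a different context than the one that locked it - which is exactly
what a mutex forbids. In rough outline (illustrative shape only, not real
XFS code):

	/* Flush lock lifecycle: lock and unlock happen in different
	 * contexts, so neither a mutex nor a completion fits cleanly. */
	static void start_inode_flush(struct xfs_inode *ip)
	{
		xfs_iflock(ip);	/* taken by the thread starting writeback */
		/* ... submit the inode cluster buffer for I/O ... */
	}

	static void inode_flush_done(struct xfs_inode *ip)
	{
		/* runs from buffer I/O completion, not the locking thread */
		xfs_ifunlock(ip);
	}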
* Re: [PATCH] Increase lockdep MAX_LOCK_DEPTH
From: Josef Sipek @ 2007-08-31 16:33 UTC (permalink / raw)
To: Peter Zijlstra
Cc: David Chinner, Eric Sandeen, linux-kernel Mailing List, xfs-oss, Ingo Molnar

On Fri, Aug 31, 2007 at 05:09:21PM +0200, Peter Zijlstra wrote:
> I think this started because jeffpc couldn't boot without XFS busting
> lockdep :-)

It booted, but if I tried to build the kernel it would make lockdep blow
up - not very useful when you have your own code you'd like lockdep to
check :)

Josef 'Jeff' Sipek.
--
I'm somewhere between geek and normal.
		- Linus Torvalds
End of thread. Thread overview: 11+ messages

2007-08-31  4:43 [PATCH] Increase lockdep MAX_LOCK_DEPTH Eric Sandeen
2007-08-31  6:39 ` Peter Zijlstra
2007-08-31 13:50   ` David Chinner
2007-08-31 14:33     ` Eric Sandeen
2007-08-31 14:36       ` Peter Zijlstra
2007-08-31 14:33     ` Peter Zijlstra
2007-08-31 15:05       ` David Chinner
2007-08-31 15:09         ` Peter Zijlstra
2007-08-31 15:11           ` Eric Sandeen
2007-08-31 15:19           ` David Chinner
2007-08-31 16:33           ` Josef Sipek