* [PATCH] locks: fix possible infinite loop in posix deadlock detection [not found] ` <20071026224707.GO13033@fieldses.org> @ 2007-10-28 17:31 ` J. Bruce Fields 2007-10-28 17:43 ` [RFC, PATCH] locks: remove " J. Bruce Fields 2007-10-30 15:20 ` [PATCH, RESEND] locks: fix possible infinite loop in " J. Bruce Fields 0 siblings, 2 replies; 28+ messages in thread From: J. Bruce Fields @ 2007-10-28 17:31 UTC (permalink / raw) To: Linus Torvalds, stable Cc: linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel From: J. Bruce Fields <bfields@citi.umich.edu> I think the real solution is to remove deadlock detection completely; it's hard to imaagine applications really depend on it anyway. For now, though, just bail out after a few iterations. Thanks to George Davis for reporting the problem. Cc: "George G. Davis" <gdavis@mvista.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> --- fs/locks.c | 12 ++++++++++++ 1 files changed, 12 insertions(+), 0 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index 0127a28..131aa88 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -696,17 +696,29 @@ EXPORT_SYMBOL(posix_test_lock); * Note: the above assumption may not be true when handling lock requests * from a broken NFS client. But broken NFS clients have a lot more to * worry about than proper deadlock detection anyway... --okir + * + * However, the failure of this assumption (also possible in the case of + * multiple tasks sharing the same open file table) also means there's no + * guarantee that the loop below will terminate. As a hack, we give up + * after a few iterations. We don't bother returning EDEADLK in that case; + * the deadlock has probably already happened anyway. */ + +#define MAX_DEADLK_ITERATIONS 10 + static int posix_locks_deadlock(struct file_lock *caller_fl, struct file_lock *block_fl) { struct file_lock *fl; + int i = 0; next_task: if (posix_same_owner(caller_fl, block_fl)) return 1; list_for_each_entry(fl, &blocked_list, fl_link) { if (posix_same_owner(fl, block_fl)) { + if (i++ > MAX_DEADLK_ITERATIONS) + return 0; fl = fl->fl_next; block_fl = fl; goto next_task; -- 1.5.3.4.208.gc990 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 17:31 ` [PATCH] locks: fix possible infinite loop in posix deadlock detection J. Bruce Fields @ 2007-10-28 17:43 ` J. Bruce Fields 2007-10-28 18:27 ` Matthew Wilcox 2007-10-29 8:06 ` Alan Cox 2007-10-30 15:20 ` [PATCH, RESEND] locks: fix possible infinite loop in " J. Bruce Fields 1 sibling, 2 replies; 28+ messages in thread From: J. Bruce Fields @ 2007-10-28 17:43 UTC (permalink / raw) To: Linus Torvalds Cc: linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel From: J. Bruce Fields <bfields@citi.umich.edu> We currently attempt to return -EDEALK to blocking fcntl() file locking requests that would create a cycle in the graph of tasks waiting on locks. This is inefficient: in the general case it requires us determining whether we're adding a cycle to an arbitrary directed acyclic graph. And this calculation has to be performed while holding a lock (currently the BKL) that prevents that graph from changing. It has historically been a source of bugs; most recently it was noticed that it could loop indefinitely while holding the BKL. It seems unlikely to be useful to applications: - The difficulty of implementation has kept standards from requiring it. (E.g. SUSv3 : "Since implementation of full deadlock detection is not always feasible, the [EDEADLK] error was made optional.") So portable applications may not be able to depend on it. - It only detects deadlocks that involve nothing but local posix file locks; deadlocks involving network filesystems or other kinds of locks or resources are missed. It therefore seems best to remove deadlock detection. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> --- fs/locks.c | 48 ------------------------------------------------ 1 files changed, 0 insertions(+), 48 deletions(-) This also solves the problem addressed by the previous patch. But this patch would require more discussion, and the problem needs to be fixed now. Of course, we shouldn't apply this if there's a chance that real applications may depend on the existing (imperfect) deadlock detection. diff --git a/fs/locks.c b/fs/locks.c index 131aa88..564b85d 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -683,50 +683,6 @@ posix_test_lock(struct file *filp, struct file_lock *fl) EXPORT_SYMBOL(posix_test_lock); -/* This function tests for deadlock condition before putting a process to - * sleep. The detection scheme is no longer recursive. Recursive was neat, - * but dangerous - we risked stack corruption if the lock data was bad, or - * if the recursion was too deep for any other reason. - * - * We rely on the fact that a task can only be on one lock's wait queue - * at a time. When we find blocked_task on a wait queue we can re-search - * with blocked_task equal to that queue's owner, until either blocked_task - * isn't found, or blocked_task is found on a queue owned by my_task. - * - * Note: the above assumption may not be true when handling lock requests - * from a broken NFS client. But broken NFS clients have a lot more to - * worry about than proper deadlock detection anyway... --okir - * - * However, the failure of this assumption (also possible in the case of - * multiple tasks sharing the same open file table) also means there's no - * guarantee that the loop below will terminate. As a hack, we give up - * after a few iterations. We don't bother returning EDEADLK in that case; - * the deadlock has probably already happened anyway. - */ - -#define MAX_DEADLK_ITERATIONS 10 - -static int posix_locks_deadlock(struct file_lock *caller_fl, - struct file_lock *block_fl) -{ - struct file_lock *fl; - int i = 0; - -next_task: - if (posix_same_owner(caller_fl, block_fl)) - return 1; - list_for_each_entry(fl, &blocked_list, fl_link) { - if (posix_same_owner(fl, block_fl)) { - if (i++ > MAX_DEADLK_ITERATIONS) - return 0; - fl = fl->fl_next; - block_fl = fl; - goto next_task; - } - } - return 0; -} - /* Try to create a FLOCK lock on filp. We always insert new FLOCK locks * after any leases, but before any posix locks. * @@ -846,10 +802,6 @@ static int __posix_lock_file(struct inode *inode, struct file_lock *request, str error = -EAGAIN; if (!(request->fl_flags & FL_SLEEP)) goto out; - error = -EDEADLK; - if (posix_locks_deadlock(request, fl)) - goto out; - error = -EAGAIN; locks_insert_block(fl, request); goto out; } -- 1.5.3.4.208.gc990 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 17:43 ` [RFC, PATCH] locks: remove " J. Bruce Fields @ 2007-10-28 18:27 ` Matthew Wilcox 2007-10-28 18:40 ` Alan Cox 2007-10-29 8:06 ` Alan Cox 1 sibling, 1 reply; 28+ messages in thread From: Matthew Wilcox @ 2007-10-28 18:27 UTC (permalink / raw) To: J. Bruce Fields Cc: Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, Oct 28, 2007 at 01:43:21PM -0400, J. Bruce Fields wrote: > We currently attempt to return -EDEALK to blocking fcntl() file locking > requests that would create a cycle in the graph of tasks waiting on > locks. > > This is inefficient: in the general case it requires us determining > whether we're adding a cycle to an arbitrary directed acyclic graph. > And this calculation has to be performed while holding a lock (currently > the BKL) that prevents that graph from changing. > > It has historically been a source of bugs; most recently it was noticed > that it could loop indefinitely while holding the BKL. It can also return -EDEADLK spuriously. So yeah, just kill it. -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 18:27 ` Matthew Wilcox @ 2007-10-28 18:40 ` Alan Cox 2007-10-28 20:11 ` Matthew Wilcox 2007-10-29 1:13 ` J. Bruce Fields 0 siblings, 2 replies; 28+ messages in thread From: Alan Cox @ 2007-10-28 18:40 UTC (permalink / raw) To: Matthew Wilcox Cc: J. Bruce Fields, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, 28 Oct 2007 12:27:32 -0600 Matthew Wilcox <matthew@wil.cx> wrote: > On Sun, Oct 28, 2007 at 01:43:21PM -0400, J. Bruce Fields wrote: > > We currently attempt to return -EDEALK to blocking fcntl() file locking > > requests that would create a cycle in the graph of tasks waiting on > > locks. > > > > This is inefficient: in the general case it requires us determining > > whether we're adding a cycle to an arbitrary directed acyclic graph. > > And this calculation has to be performed while holding a lock (currently > > the BKL) that prevents that graph from changing. > > > > It has historically been a source of bugs; most recently it was noticed > > that it could loop indefinitely while holding the BKL. > > It can also return -EDEADLK spuriously. So yeah, just kill it. NAK. This is an ABI change. It was also comprehensively rejected before because - EDEADLK behaviour is ABI - EDEADLK behaviour is required by SuSv3 - We have no idea what applications may rely on this behaviour. and also SuSv3 is required by LSB See the thread http://osdir.com/ml/file-systems/2004-06/msg00017.html so we need to fix the bugs - the lock usage and the looping. At that point it merely becomes a performance concern to those who use it, which is the proper behaviour. If you want a faster non-checking one use flock(), or add another flag that is a Linux "don't check for deadlock" Alan ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 18:40 ` Alan Cox @ 2007-10-28 20:11 ` Matthew Wilcox 2007-10-28 21:38 ` Alan Cox 2007-10-28 21:50 ` Trond Myklebust 2007-10-29 1:13 ` J. Bruce Fields 1 sibling, 2 replies; 28+ messages in thread From: Matthew Wilcox @ 2007-10-28 20:11 UTC (permalink / raw) To: Alan Cox Cc: J. Bruce Fields, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, Oct 28, 2007 at 06:40:52PM +0000, Alan Cox wrote: > NAK. This is an ABI change. It was also comprehensively rejected before > because > > - EDEADLK behaviour is ABI Not in any meaningful way. > - EDEADLK behaviour is required by SuSv3 What SuSv3 actually says is: If the system detects that sleeping until a locked region is unlocked would cause a deadlock, fcntl() shall fail with an [EDEADLK] error. It doesn't require the system to detect it, only mandate what to return if it does detect it. > - We have no idea what applications may rely on this behaviour. I've never heard of one that does. > so we need to fix the bugs - the lock usage and the looping. At that > point it merely becomes a performance concern to those who use it, which > is the proper behaviour. If you want a faster non-checking one use > flock(), or add another flag that is a Linux "don't check for deadlock" You can't fix the false EDEADLK detection without solving the halting problem. Best of luck with that. -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 20:11 ` Matthew Wilcox @ 2007-10-28 21:38 ` Alan Cox 2007-10-28 21:45 ` Jiri Kosina 2007-10-28 23:38 ` Matthew Wilcox 2007-10-28 21:50 ` Trond Myklebust 1 sibling, 2 replies; 28+ messages in thread From: Alan Cox @ 2007-10-28 21:38 UTC (permalink / raw) To: Matthew Wilcox Cc: J. Bruce Fields, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel > > - EDEADLK behaviour is ABI > > Not in any meaningful way. I've seen SYS5 software that relies on it so we should be careful. Again see the 2004 discussion where the conclusion was that EDEADLK should stay > > > - EDEADLK behaviour is required by SuSv3 > > What SuSv3 actually says is: > > If the system detects that sleeping until a locked region is > unlocked would cause a deadlock, fcntl() shall fail with an > [EDEADLK] error. > > It doesn't require the system to detect it, only mandate what to return > if it does detect it. We should be detecting at least the obvious case. > > - We have no idea what applications may rely on this behaviour. > > I've never heard of one that does. Very scientific. I have on SYS5 though not afaik Linux > > so we need to fix the bugs - the lock usage and the looping. At that > > point it merely becomes a performance concern to those who use it, which > > is the proper behaviour. If you want a faster non-checking one use > > flock(), or add another flag that is a Linux "don't check for deadlock" > > You can't fix the false EDEADLK detection without solving the halting > problem. Best of luck with that. A good question to ask here would be what subset of deadlock loops on flock does SYS5 Unix error. I also don't see why you need to solve the halting problem If SYSV only spots simple AB - BA deadlocks or taking the same lock twice yourself then that ought to be sufficient for us too. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 21:38 ` Alan Cox @ 2007-10-28 21:45 ` Jiri Kosina 2007-10-28 23:38 ` Matthew Wilcox 1 sibling, 0 replies; 28+ messages in thread From: Jiri Kosina @ 2007-10-28 21:45 UTC (permalink / raw) To: Matthew Wilcox Cc: J. Bruce Fields, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel, Alan Cox On Sun, 28 Oct 2007, Matthew Wilcox wrote: > You can't fix the false EDEADLK detection without solving the halting > problem. Best of luck with that. Could you please elaborate a little bit more on this? I don't see how detecting loops in graph relates to solving halting problem. Of course the halting problem can be transformed to deadlock-detection problem, but this relates to static code analysis, right? Not anything we are interested in, i.e. tracking things in runtime and detecting loops in simple dependency graphs. -- Jiri Kosina ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 21:38 ` Alan Cox 2007-10-28 21:45 ` Jiri Kosina @ 2007-10-28 23:38 ` Matthew Wilcox 2007-10-28 23:44 ` Alan Cox 1 sibling, 1 reply; 28+ messages in thread From: Matthew Wilcox @ 2007-10-28 23:38 UTC (permalink / raw) To: Alan Cox Cc: J. Bruce Fields, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, Oct 28, 2007 at 09:38:55PM +0000, Alan Cox wrote: > > It doesn't require the system to detect it, only mandate what to return > > if it does detect it. > > We should be detecting at least the obvious case. What is the obvious case? A task that has never called clone()? > If SYSV only spots simple AB - BA deadlocks or taking the same lock twice > yourself then that ought to be sufficient for us too. You can't deadlock against yourself -- either with POSIX locks or BSD locks (POSIX simple upgrades/downgrades the lock; each byte in a file can only be in either read-locked state or write-locked state for a given process. BSD locks release the lock and wake all waiters before attempting to acquire the lock in its new state). In my other post, I showed a simple AB-BA deadlock that can't be easily detected. -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 23:38 ` Matthew Wilcox @ 2007-10-28 23:44 ` Alan Cox 0 siblings, 0 replies; 28+ messages in thread From: Alan Cox @ 2007-10-28 23:44 UTC (permalink / raw) To: Matthew Wilcox Cc: J. Bruce Fields, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, 28 Oct 2007 17:38:14 -0600 Matthew Wilcox <matthew@wil.cx> wrote: > On Sun, Oct 28, 2007 at 09:38:55PM +0000, Alan Cox wrote: > > > It doesn't require the system to detect it, only mandate what to return > > > if it does detect it. > > > > We should be detecting at least the obvious case. > > What is the obvious case? A task that has never called clone()? Simple AB BA I would have thought obvious. Clone as has been said several times now is irrelevant as the standard is about *processes* [in the SuS sense] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 20:11 ` Matthew Wilcox 2007-10-28 21:38 ` Alan Cox @ 2007-10-28 21:50 ` Trond Myklebust 2007-10-28 22:41 ` Matthew Wilcox 1 sibling, 1 reply; 28+ messages in thread From: Trond Myklebust @ 2007-10-28 21:50 UTC (permalink / raw) To: Matthew Wilcox Cc: Alan Cox, J. Bruce Fields, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, 2007-10-28 at 14:11 -0600, Matthew Wilcox wrote: > On Sun, Oct 28, 2007 at 06:40:52PM +0000, Alan Cox wrote: > > so we need to fix the bugs - the lock usage and the looping. At that > > point it merely becomes a performance concern to those who use it, which > > is the proper behaviour. If you want a faster non-checking one use > > flock(), or add another flag that is a Linux "don't check for deadlock" > > You can't fix the false EDEADLK detection without solving the halting > problem. Best of luck with that. I can see that it would be difficult to do efficiently, but basically, this boils down to finding a circular path in a graph. That is hardly an unsolvable issue... Trond ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 21:50 ` Trond Myklebust @ 2007-10-28 22:41 ` Matthew Wilcox 2007-10-28 22:48 ` Alan Cox ` (3 more replies) 0 siblings, 4 replies; 28+ messages in thread From: Matthew Wilcox @ 2007-10-28 22:41 UTC (permalink / raw) To: Trond Myklebust Cc: Alan Cox, J. Bruce Fields, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, Oct 28, 2007 at 05:50:30PM -0400, Trond Myklebust wrote: > > You can't fix the false EDEADLK detection without solving the halting > > problem. Best of luck with that. > > I can see that it would be difficult to do efficiently, but basically, > this boils down to finding a circular path in a graph. That is hardly an > unsolvable issue... Bzzt. You get a false deadlock with multiple threads like so: Thread A of task B takes lock 1 Thread C of task D takes lock 2 Thread C of task D blocks on lock 1 Thread E of task B blocks on lock 2 We currently declare deadlock at this point (unless the deadlock detection code has changed since I last looked at it), despite thread A being about to release lock 1. Oh, and by the way, thread E is capable of releasing lock 1, so you can't just say "well, detect by thread instead of by task". So the only way I can see to accurately detect deadlock is to simulate the future execution of all threads in task B to see if any of them will release lock 1 without first gaining lock 2. Which, I believe, is halting-equivalent. -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 22:41 ` Matthew Wilcox @ 2007-10-28 22:48 ` Alan Cox 2007-10-28 22:55 ` Matthew Wilcox 2007-10-28 22:55 ` Jiri Kosina ` (2 subsequent siblings) 3 siblings, 1 reply; 28+ messages in thread From: Alan Cox @ 2007-10-28 22:48 UTC (permalink / raw) To: Matthew Wilcox Cc: Trond Myklebust, J. Bruce Fields, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel > Bzzt. You get a false deadlock with multiple threads like so: > > Thread A of task B takes lock 1 > Thread C of task D takes lock 2 > Thread C of task D blocks on lock 1 > Thread E of task B blocks on lock 2 The spec and SYSV certainly ignore threading in this situation and you know that perfectly well (or did in 2004) ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 22:48 ` Alan Cox @ 2007-10-28 22:55 ` Matthew Wilcox 2007-10-28 23:38 ` Alan Cox 0 siblings, 1 reply; 28+ messages in thread From: Matthew Wilcox @ 2007-10-28 22:55 UTC (permalink / raw) To: Alan Cox Cc: Trond Myklebust, J. Bruce Fields, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, Oct 28, 2007 at 10:48:33PM +0000, Alan Cox wrote: > > Bzzt. You get a false deadlock with multiple threads like so: > > > > Thread A of task B takes lock 1 > > Thread C of task D takes lock 2 > > Thread C of task D blocks on lock 1 > > Thread E of task B blocks on lock 2 > > The spec and SYSV certainly ignore threading in this situation and you > know that perfectly well (or did in 2004) The discussion petered out (or that mailing list archive lost articles from the thread) without any kind of resolution, or indeed interest. What is your suggestion for handling this problem? As it is now, the kernel 'detects' deadlock where there is none, which doesn't seem allowed by SuSv3 either. -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 22:55 ` Matthew Wilcox @ 2007-10-28 23:38 ` Alan Cox 2007-10-29 2:29 ` J. Bruce Fields 0 siblings, 1 reply; 28+ messages in thread From: Alan Cox @ 2007-10-28 23:38 UTC (permalink / raw) To: Matthew Wilcox Cc: Trond Myklebust, J. Bruce Fields, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel > > The spec and SYSV certainly ignore threading in this situation and you > > know that perfectly well (or did in 2004) > > The discussion petered out (or that mailing list archive lost articles > from the thread) without any kind of resolution, or indeed interest. I think the resolution was that the EDEADLK stayed. > What is your suggestion for handling this problem? As it is now, the > kernel 'detects' deadlock where there is none, which doesn't seem > allowed by SuSv3 either Re-read the spec. The EDEADLK doesn't account for threads, only processes. Alan ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 23:38 ` Alan Cox @ 2007-10-29 2:29 ` J. Bruce Fields 2007-10-29 8:08 ` Alan Cox 2007-10-29 9:15 ` Jiri Kosina 0 siblings, 2 replies; 28+ messages in thread From: J. Bruce Fields @ 2007-10-29 2:29 UTC (permalink / raw) To: Alan Cox Cc: Matthew Wilcox, Trond Myklebust, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, Oct 28, 2007 at 11:38:26PM +0000, Alan Cox wrote: > > > The spec and SYSV certainly ignore threading in this situation and you > > > know that perfectly well (or did in 2004) > > > > The discussion petered out (or that mailing list archive lost articles > > from the thread) without any kind of resolution, or indeed interest. > > I think the resolution was that the EDEADLK stayed. > > > What is your suggestion for handling this problem? As it is now, the > > kernel 'detects' deadlock where there is none, which doesn't seem > > allowed by SuSv3 either > > Re-read the spec. The EDEADLK doesn't account for threads, only processes. Do you have in mind a way to take advantage of that distinction? The practical requirement I see here is that our deadlock detection should never give false positives. If we return EDEADLK to applications with locking schemes that don't actually deadlock, then we're telling application designers that avoiding deadlock in itself isn't sufficient--they also have to know our particular deadlock detection algorithm, so as to avoid giving even the appearance of deadlock. And if posix file locks are to be useful to threaded applications, then we have to preserve the same no-false-positives requirement for them as well. But, OK, if we can identify unshared current->files at the time we put a task to sleep, then a slight modification of our current algorithm might be sufficient to detect any deadlock that involves purely posix file locks and processes. And we can tell people that avoiding deadlock is their problem as soon as any task with a shared current->files is involved. (Ditto, I assume, if nfsd/lockd acquires a lock.) --b. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-29 2:29 ` J. Bruce Fields @ 2007-10-29 8:08 ` Alan Cox 2007-10-29 9:15 ` Jiri Kosina 1 sibling, 0 replies; 28+ messages in thread From: Alan Cox @ 2007-10-29 8:08 UTC (permalink / raw) To: J. Bruce Fields Cc: Matthew Wilcox, Trond Myklebust, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel > And if posix file locks are to be useful to threaded applications, then > we have to preserve the same no-false-positives requirement for them as > well. It isn't useful to threaded applications. The specification requires this. Which is another reason for having an additional Linux (for now) flag to say "don't bother" Alan ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-29 2:29 ` J. Bruce Fields 2007-10-29 8:08 ` Alan Cox @ 2007-10-29 9:15 ` Jiri Kosina 2007-10-30 15:35 ` J. Bruce Fields 1 sibling, 1 reply; 28+ messages in thread From: Jiri Kosina @ 2007-10-29 9:15 UTC (permalink / raw) To: J. Bruce Fields Cc: Alan Cox, Matthew Wilcox, Trond Myklebust, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, 28 Oct 2007, J. Bruce Fields wrote: > But, OK, if we can identify unshared current->files at the time we put a > task to sleep, then a slight modification of our current algorithm might > be sufficient to detect any deadlock that involves purely posix file > locks and processes. And we can tell people that avoiding deadlock is > their problem as soon as any task with a shared current->files is > involved. (Ditto, I assume, if nfsd/lockd acquires a lock.) Don't forget that comparing file_lock->fl_owner (i.e. current->files) is not the only way how lock ownership could be computed (there could be specific file_lock->fl_lmops->fl_compare_owner() and all of them should be teached this new semantics, right?). -- Jiri Kosina ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-29 9:15 ` Jiri Kosina @ 2007-10-30 15:35 ` J. Bruce Fields 0 siblings, 0 replies; 28+ messages in thread From: J. Bruce Fields @ 2007-10-30 15:35 UTC (permalink / raw) To: Jiri Kosina Cc: Alan Cox, Matthew Wilcox, Trond Myklebust, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Mon, Oct 29, 2007 at 10:15:19AM +0100, Jiri Kosina wrote: > On Sun, 28 Oct 2007, J. Bruce Fields wrote: > > > But, OK, if we can identify unshared current->files at the time we put a > > task to sleep, then a slight modification of our current algorithm might > > be sufficient to detect any deadlock that involves purely posix file > > locks and processes. And we can tell people that avoiding deadlock is > > their problem as soon as any task with a shared current->files is > > involved. (Ditto, I assume, if nfsd/lockd acquires a lock.) > > Don't forget that comparing file_lock->fl_owner (i.e. current->files) is > not the only way how lock ownership could be computed (there could be > specific file_lock->fl_lmops->fl_compare_owner() and all of them should > be teached this new semantics, right?). Right. So, for example, this would handle both cases as described above. We're turning off deadlock detection in the problematic cases just by neglecting to add such waiting lockers to the blocked_list (which is what we search to look for lock cycles). (It'd be nice if we didn't have to search that list at all. There should be some way to set up the data structures so we can follow the lock dependencies without having to scan through all the blocked locks at each step. But I haven't quite figured out how to do that yet. And perhaps it's not important to optimize cases where there are lots of blocks.) --b. diff --git a/fs/locks.c b/fs/locks.c index 8b8388e..5446305 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -512,6 +512,8 @@ static void locks_delete_block(struct file_lock *waiter) unlock_kernel(); } +static int posix_owner_shared(struct file_lock *caller_fl); + /* Insert waiter into blocker's block list. * We use a circular list so that processes can be easily woken up in * the order they blocked. The documentation doesn't require this but @@ -523,7 +525,7 @@ static void locks_insert_block(struct file_lock *blocker, BUG_ON(!list_empty(&waiter->fl_block)); list_add_tail(&waiter->fl_block, &blocker->fl_block); waiter->fl_next = blocker; - if (IS_POSIX(blocker)) + if (IS_POSIX(blocker) && !posix_owner_shared(waiter)) list_add(&waiter->fl_link, &blocked_list); } @@ -683,46 +685,79 @@ posix_test_lock(struct file *filp, struct file_lock *fl) EXPORT_SYMBOL(posix_test_lock); -/* This function tests for deadlock condition before putting a process to - * sleep. The detection scheme is no longer recursive. Recursive was neat, - * but dangerous - we risked stack corruption if the lock data was bad, or - * if the recursion was too deep for any other reason. - * - * We rely on the fact that a task can only be on one lock's wait queue - * at a time. When we find blocked_task on a wait queue we can re-search - * with blocked_task equal to that queue's owner, until either blocked_task - * isn't found, or blocked_task is found on a queue owned by my_task. - * - * Note: the above assumption may not be true when handling lock requests - * from a broken NFS client. But broken NFS clients have a lot more to - * worry about than proper deadlock detection anyway... --okir - * - * However, the failure of this assumption (also possible in the case of - * multiple tasks sharing the same open file table) also means there's no - * guarantee that the loop below will terminate. As a hack, we give up - * after a few iterations. +/* + * Deadlock detection: + * + * We attempt to detect deadlocks that are due purely to posix file + * locks. + * + * We assume that a task can be waiting for at most one lock at a time. + * So for any acquired lock, the process holding that lock may be + * waiting on at most one other lock. That lock in turns may be held by + * someone waiting for at most one other lock. Given a requested lock + * caller_fl which is about to wait for a conflicting lock block_fl, we + * follow this chain of waiters to ensure we are not about to create a + * cycle. + * + * Since we do this before we ever put a process to sleep on a lock, we + * are ensured that there is never a cycle; that is what guarantees that + * the while() loop in posix_locks_deadlock() eventually completes. + * + * Note: the above assumption may not be true when handling lock + * requests from a broken NFS client. It may also fail in the presence + * of tasks (such as posix threads) sharing the same open file table. + * + * We don't necessarily care about returning EDEALK correctly in such + * cases, but we do need to avoid cycles in the lock dependency graph in + * order to ensure the loop in posix_locks_deadlock eventually + * terminates. To that end, we enforce the assumption above by refusing + * to return EDEADLK or add to the list of blocked locks in any case + * where a lock owner might be able to block on more than one lock. */ -#define MAX_DEADLK_ITERATIONS 10 +static int posix_owner_shared(struct file_lock *caller_fl) +{ + /* + * The caller is a lock manager (lockd/nfsd), and won't + * necessarily guarantee that a single lock owner won't block on + * two locks at once: + */ + if (caller_fl->fl_lmops && caller_fl->fl_lmops->fl_compare_owner) + return 1; + /* + * Multiple tasks share current->files, also allowing the same + * "owner" to block on two locks at once: + */ + if (current->files == NULL || atomic_read(¤t->files->count) > 1) + return 1; + /* + * The lock is not on behalf of a file manager, and no other + * tasks share this file owner (and, as long as this task is + * stuck waiting for a lock, that's not going to change): + */ + return 0; +} + +/* Find a lock that the owner of the given block_fl is blocking on. */ +static struct file_lock * what_owner_is_waiting_for(struct file_lock *block_fl) +{ + struct file_lock *fl; + + list_for_each_entry(fl, &blocked_list, fl_link) + if (posix_same_owner(fl, block_fl)) + return fl->fl_next; + return NULL; +} static int posix_locks_deadlock(struct file_lock *caller_fl, struct file_lock *block_fl) { - struct file_lock *fl; - int i = 0; + if (posix_owner_shared(caller_fl)) + return 0; -next_task: - if (posix_same_owner(caller_fl, block_fl)) - return 1; - list_for_each_entry(fl, &blocked_list, fl_link) { - if (posix_same_owner(fl, block_fl)) { - if (i++ > MAX_DEADLK_ITERATIONS) - return 0; - fl = fl->fl_next; - block_fl = fl; - goto next_task; - } - } + while ((block_fl = what_owner_is_waiting_for(block_fl))) + if (posix_same_owner(caller_fl, block_fl)) + return 1; return 0; } ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 22:41 ` Matthew Wilcox 2007-10-28 22:48 ` Alan Cox @ 2007-10-28 22:55 ` Jiri Kosina 2007-10-28 23:31 ` Matthew Wilcox 2007-10-29 2:10 ` J. Bruce Fields 2007-10-29 3:26 ` Trond Myklebust 3 siblings, 1 reply; 28+ messages in thread From: Jiri Kosina @ 2007-10-28 22:55 UTC (permalink / raw) To: Matthew Wilcox Cc: Trond Myklebust, Alan Cox, J. Bruce Fields, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, 28 Oct 2007, Matthew Wilcox wrote: > Bzzt. You get a false deadlock with multiple threads like so: > Thread A of task B takes lock 1 > Thread C of task D takes lock 2 > Thread C of task D blocks on lock 1 > Thread E of task B blocks on lock 2 A potential for deadlock occurs if a process controlling a locked region is put to sleep by attempting to lock another process' locked region. If the system detects that sleeping until a locked region is unlocked would cause a deadlock, fcntl() shall fail with an [EDEADLK] error. This is what POSIX says [1], even after being modified with respect to POSIX Threads Extension, right? So it doesn't deal with threads at all, just processess are taken into account. Probably for a reason :) [1] http://www.opengroup.org/onlinepubs/009695399/functions/fcntl.html -- Jiri Kosina ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 22:55 ` Jiri Kosina @ 2007-10-28 23:31 ` Matthew Wilcox 2007-10-29 9:11 ` Jiri Kosina 0 siblings, 1 reply; 28+ messages in thread From: Matthew Wilcox @ 2007-10-28 23:31 UTC (permalink / raw) To: Jiri Kosina Cc: Trond Myklebust, Alan Cox, J. Bruce Fields, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, Oct 28, 2007 at 11:55:52PM +0100, Jiri Kosina wrote: > On Sun, 28 Oct 2007, Matthew Wilcox wrote: > > > Bzzt. You get a false deadlock with multiple threads like so: > > Thread A of task B takes lock 1 > > Thread C of task D takes lock 2 > > Thread C of task D blocks on lock 1 > > Thread E of task B blocks on lock 2 > > A potential for deadlock occurs if a process controlling a locked > region is put to sleep by attempting to lock another process' > locked region. If the system detects that sleeping until a locked > region is unlocked would cause a deadlock, fcntl() shall fail with > an [EDEADLK] error. > > This is what POSIX says [1], even after being modified with respect to > POSIX Threads Extension, right? > > So it doesn't deal with threads at all, just processess are taken into > account. Probably for a reason :) Did you have a concrete suggestion, or are you just quoting the spec? The problem is that it's nonsense -- processes don't sleep, threads do. I think the key is "would deadlock", not "might deadlock". Our current behaviour is clearly in violation of SuSv3. -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 23:31 ` Matthew Wilcox @ 2007-10-29 9:11 ` Jiri Kosina 0 siblings, 0 replies; 28+ messages in thread From: Jiri Kosina @ 2007-10-29 9:11 UTC (permalink / raw) To: Matthew Wilcox Cc: Trond Myklebust, Alan Cox, J. Bruce Fields, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, 28 Oct 2007, Matthew Wilcox wrote: > > A potential for deadlock occurs if a process controlling a locked > > region is put to sleep by attempting to lock another process' > > locked region. If the system detects that sleeping until a locked > > region is unlocked would cause a deadlock, fcntl() shall fail with > > an [EDEADLK] error. > > This is what POSIX says [1], even after being modified with respect to > > POSIX Threads Extension, right? So it doesn't deal with threads at > > all, just processess are taken into account. Probably for a reason :) > Did you have a concrete suggestion, or are you just quoting the spec? I was quoting the spec and I thought that the suggestion is implicit -- the specification defines what happens only when processess are in place. If the application uses POSIX threads in combination with locks, it is going to receive undefined behavior. > The problem is that it's nonsense -- processes don't sleep, threads do. > I think the key is "would deadlock", not "might deadlock". Our current > behaviour is clearly in violation of SuSv3. - either we can add a special fcntl() Linux-specific flag, that will ask the kernel not to perform deadlock detection. Any application that is combining threads and posix locks (and thus IMHO asking for undefined behavior) could use this flag and not receive EDEADLK any more - or we can add some heuristics here-and-there to track which current->files are shared and which are not, and do not return EDEADLK in the case of shared ->files (could be a little bit tricky) -- Jiri Kosina ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 22:41 ` Matthew Wilcox 2007-10-28 22:48 ` Alan Cox 2007-10-28 22:55 ` Jiri Kosina @ 2007-10-29 2:10 ` J. Bruce Fields 2007-10-29 3:26 ` Trond Myklebust 3 siblings, 0 replies; 28+ messages in thread From: J. Bruce Fields @ 2007-10-29 2:10 UTC (permalink / raw) To: Matthew Wilcox Cc: Trond Myklebust, Alan Cox, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, Oct 28, 2007 at 04:41:57PM -0600, Matthew Wilcox wrote: > On Sun, Oct 28, 2007 at 05:50:30PM -0400, Trond Myklebust wrote: > > > You can't fix the false EDEADLK detection without solving the halting > > > problem. Best of luck with that. > > > > I can see that it would be difficult to do efficiently, but basically, > > this boils down to finding a circular path in a graph. That is hardly an > > unsolvable issue... > > Bzzt. You get a false deadlock with multiple threads like so: > > Thread A of task B takes lock 1 > Thread C of task D takes lock 2 > Thread C of task D blocks on lock 1 > Thread E of task B blocks on lock 2 Oh neat, I missed that case, thanks for pointing it out. > We currently declare deadlock at this point (unless the deadlock detection > code has changed since I last looked at it), despite thread A being about > to release lock 1. Oh, and by the way, thread E is capable of releasing > lock 1, so you can't just say "well, detect by thread instead of by task". > > So the only way I can see to accurately detect deadlock is to simulate > the future execution of all threads in task B to see if any of them > will release lock 1 without first gaining lock 2. Hm. It's annoying, but I'm not convinced it's *that* annoying. We're not trying to predict whether a deadlock could arise as the result of future behavior. We're just trying to determine whether granting the current lock request results in an immediate deadlock consisting purely of posix file locks. But yes, I'm assume it's possible, for example, that a thread-exit could race with a lock request, with the result that we see no deadlock at the time we handle the lock request, even though at that point the last task with the ability to solve the problem is already exiting. Supposing that we're willing to permit the request in such cases and return EDEADLK only in cases where we're positive there's a deadlock, is there still some useful subset of cases where we could return EDEADLK? For example, could we take note of tasks that, when they block on a lock, have a current->files with reference count one, and only follow cycles consisting of such blocks? I'm still not convinced it's worth the trouble, but I suspect you're overstating the difficulty. --b. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 22:41 ` Matthew Wilcox ` (2 preceding siblings ...) 2007-10-29 2:10 ` J. Bruce Fields @ 2007-10-29 3:26 ` Trond Myklebust 3 siblings, 0 replies; 28+ messages in thread From: Trond Myklebust @ 2007-10-29 3:26 UTC (permalink / raw) To: Matthew Wilcox Cc: Alan Cox, J. Bruce Fields, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, 2007-10-28 at 16:41 -0600, Matthew Wilcox wrote: > On Sun, Oct 28, 2007 at 05:50:30PM -0400, Trond Myklebust wrote: > > > You can't fix the false EDEADLK detection without solving the halting > > > problem. Best of luck with that. > > > > I can see that it would be difficult to do efficiently, but basically, > > this boils down to finding a circular path in a graph. That is hardly an > > unsolvable issue... > > Bzzt. You get a false deadlock with multiple threads like so: > > Thread A of task B takes lock 1 > Thread C of task D takes lock 2 > Thread C of task D blocks on lock 1 > Thread E of task B blocks on lock 2 > > We currently declare deadlock at this point (unless the deadlock detection > code has changed since I last looked at it), despite thread A being about > to release lock 1. Oh, and by the way, thread E is capable of releasing > lock 1, so you can't just say "well, detect by thread instead of by task". > > So the only way I can see to accurately detect deadlock is to simulate > the future execution of all threads in task B to see if any of them > will release lock 1 without first gaining lock 2. Which, I believe, > is halting-equivalent. As several people have told you, the SUSv3 section on fcntl and deadlocks reads as follows: "A potential for deadlock occurs if a process controlling a locked region is put to sleep by attempting to lock another process' locked region. If the system detects that sleeping until a locked region is unlocked would cause a deadlock, fcntl() shall fail with an [EDEADLK] error." There is no mention there or anywhere else of a need to make exceptions when dealing with threads. The posix locking model is _process_ based, and so our deadlock detection only needs to take that into account. If programmers choose to play tricksy little games with threads, then it is their responsibility to ensure that the application doesn't get into a situation where the posix deadlock detection model breaks down. Trond ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 18:40 ` Alan Cox 2007-10-28 20:11 ` Matthew Wilcox @ 2007-10-29 1:13 ` J. Bruce Fields 1 sibling, 0 replies; 28+ messages in thread From: J. Bruce Fields @ 2007-10-29 1:13 UTC (permalink / raw) To: Alan Cox Cc: Matthew Wilcox, Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, Oct 28, 2007 at 06:40:52PM +0000, Alan Cox wrote: > On Sun, 28 Oct 2007 12:27:32 -0600 > Matthew Wilcox <matthew@wil.cx> wrote: > > > On Sun, Oct 28, 2007 at 01:43:21PM -0400, J. Bruce Fields wrote: > > > We currently attempt to return -EDEALK to blocking fcntl() file locking > > > requests that would create a cycle in the graph of tasks waiting on > > > locks. > > > > > > This is inefficient: in the general case it requires us determining > > > whether we're adding a cycle to an arbitrary directed acyclic graph. > > > And this calculation has to be performed while holding a lock (currently > > > the BKL) that prevents that graph from changing. > > > > > > It has historically been a source of bugs; most recently it was noticed > > > that it could loop indefinitely while holding the BKL. > > > > It can also return -EDEADLK spuriously. So yeah, just kill it. > > NAK. This is an ABI change. It was also comprehensively rejected before > because > > - EDEADLK behaviour is ABI > - EDEADLK behaviour is required by SuSv3 > - We have no idea what applications may rely on this behaviour. On the second point I think you're mistaken; see http://www.opengroup.org/onlinepubs/009695399/functions/fcntl.html "Since implementation of full deadlock detection is not always feasible, the [EDEADLK] error was made optional." The third objection is the one that concerns me most, and is the one I'd like to understand better. So I'd be interested in any evidence of applications that do actually depend on the current behavior. (Even hypothetical use cases might be interesting.) I'm also curious what other OS's do. > See the thread > http://osdir.com/ml/file-systems/2004-06/msg00017.html > > so we need to fix the bugs - the lock usage and the looping. At that > point it merely becomes a performance concern to those who use it, which > is the proper behaviour. If you want a faster non-checking one use > flock(), or add another flag that is a Linux "don't check for deadlock" That's an interesting idea. The flag might not help as much, since looking for a cycle in the graph of blocked requests may be more difficult in the case where we don't know whether the graph is acyclic to start out with. (But I don't know--I haven't thought about it much.) --b. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-28 17:43 ` [RFC, PATCH] locks: remove " J. Bruce Fields 2007-10-28 18:27 ` Matthew Wilcox @ 2007-10-29 8:06 ` Alan Cox 2007-10-30 15:51 ` J. Bruce Fields 1 sibling, 1 reply; 28+ messages in thread From: Alan Cox @ 2007-10-29 8:06 UTC (permalink / raw) To: J. Bruce Fields Cc: Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Sun, 28 Oct 2007 13:43:21 -0400 "J. Bruce Fields" <bfields@fieldses.org> wrote: > From: J. Bruce Fields <bfields@citi.umich.edu> > > We currently attempt to return -EDEALK to blocking fcntl() file locking > requests that would create a cycle in the graph of tasks waiting on > locks. > > This is inefficient: in the general case it requires us determining > whether we're adding a cycle to an arbitrary directed acyclic graph. > And this calculation has to be performed while holding a lock (currently > the BKL) that prevents that graph from changing. > > It has historically been a source of bugs; most recently it was noticed > that it could loop indefinitely while holding the BKL. > > It seems unlikely to be useful to applications: > - The difficulty of implementation has kept standards from > requiring it. (E.g. SUSv3 : "Since implementation of full > deadlock detection is not always feasible, the [EDEADLK] error > was made optional.") So portable applications may not be able to > depend on it. > - It only detects deadlocks that involve nothing but local posix > file locks; deadlocks involving network filesystems or other kinds > of locks or resources are missed. > > It therefore seems best to remove deadlock detection. > > Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> NAK. This is an ABI change and one that was rejected before when this was last discussed in detail. Moving it out of BKL makes a ton of sense, even adding a "don't check" flag makes a lot of sense. Removing the checking does not. I'd much rather see if (flags & FL_NODLCHECK) posix_deadlock_detect(....) The failure case for removing this feature is obscure and hard to debug application hangs for the afflicted programs - not nice for users at all. Alan ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC, PATCH] locks: remove posix deadlock detection 2007-10-29 8:06 ` Alan Cox @ 2007-10-30 15:51 ` J. Bruce Fields 0 siblings, 0 replies; 28+ messages in thread From: J. Bruce Fields @ 2007-10-30 15:51 UTC (permalink / raw) To: Alan Cox Cc: Linus Torvalds, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Mon, Oct 29, 2007 at 08:06:04AM +0000, Alan Cox wrote: > On Sun, 28 Oct 2007 13:43:21 -0400 > "J. Bruce Fields" <bfields@fieldses.org> wrote: > > > From: J. Bruce Fields <bfields@citi.umich.edu> > > > > We currently attempt to return -EDEALK to blocking fcntl() file locking > > requests that would create a cycle in the graph of tasks waiting on > > locks. > > > > This is inefficient: in the general case it requires us determining > > whether we're adding a cycle to an arbitrary directed acyclic graph. > > And this calculation has to be performed while holding a lock (currently > > the BKL) that prevents that graph from changing. > > > > It has historically been a source of bugs; most recently it was noticed > > that it could loop indefinitely while holding the BKL. > > > > It seems unlikely to be useful to applications: > > - The difficulty of implementation has kept standards from > > requiring it. (E.g. SUSv3 : "Since implementation of full > > deadlock detection is not always feasible, the [EDEADLK] error > > was made optional.") So portable applications may not be able to > > depend on it. > > - It only detects deadlocks that involve nothing but local posix > > file locks; deadlocks involving network filesystems or other kinds > > of locks or resources are missed. > > > > It therefore seems best to remove deadlock detection. > > > > Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> > > > NAK. This is an ABI change OK, fair enough. > and one that was rejected before when this was last discussed in > detail. That previous discussion (http://marc.info/?t=108652857400003&r=1&w=2) read the spec wrong and--now that I reread it--failed to address the original bug report. In fact it looks to me like the actual bug reported: http://bugzilla.kernel.org/show_bug.cgi?id=2829 was probably a minor variant of the one which George Davis stumbled on again here. That's a little depressing. > Moving it out of BKL makes a ton of sense, That would at least reduce the impact of bugs here, yeah. It would be great to figure out how to start getting locks.c and lockd out from under the BKL, but I don't personally have the time now. (And I believe there's been a failed attempt or two so it's not completely easy.) > even adding a "don't check" flag makes a lot of sense. That's an idea I'll keep in mind, though I'm not convinced it'll help much (or that applications would actually use it). > The failure case for removing this feature is obscure and hard to debug > application hangs for the afflicted programs - not nice for users at all. Yeah. I'd still be curious for any data about applications that actually rely on posix deadlock detection, though. --b. ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH, RESEND] locks: fix possible infinite loop in posix deadlock detection 2007-10-28 17:31 ` [PATCH] locks: fix possible infinite loop in posix deadlock detection J. Bruce Fields 2007-10-28 17:43 ` [RFC, PATCH] locks: remove " J. Bruce Fields @ 2007-10-30 15:20 ` J. Bruce Fields 2007-10-30 15:35 ` Alan Cox 1 sibling, 1 reply; 28+ messages in thread From: J. Bruce Fields @ 2007-10-30 15:20 UTC (permalink / raw) To: Linus Torvalds, stable Cc: linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel From: J. Bruce Fields <bfields@citi.umich.edu> It's currently possible to send posix_locks_deadlock() into an infinite loop (under the BKL). For now, fix this just by bailing out after a few iterations. We may want to fix this in a way that better clarifies the semantics of deadlock detection. But that will take more time, and this minimal fix is probably adequate for any realistic scenario, and is simple enough to be appropriate for applying to stable kernels now. Thanks to George Davis for reporting the problem. Cc: "George G. Davis" <gdavis@mvista.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> --- fs/locks.c | 11 +++++++++++ 1 files changed, 11 insertions(+), 0 deletions(-) I didn't see objections to this quick fix (just to the followup that attempts to rip out posix deadlock detection entirely), so I'm resending with just comment modifications. I haven't given up on a more comprehensive solution, but I think we really need to apply some fix now. --b. diff --git a/fs/locks.c b/fs/locks.c index 0127a28..8b8388e 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -696,17 +696,28 @@ EXPORT_SYMBOL(posix_test_lock); * Note: the above assumption may not be true when handling lock requests * from a broken NFS client. But broken NFS clients have a lot more to * worry about than proper deadlock detection anyway... --okir + * + * However, the failure of this assumption (also possible in the case of + * multiple tasks sharing the same open file table) also means there's no + * guarantee that the loop below will terminate. As a hack, we give up + * after a few iterations. */ + +#define MAX_DEADLK_ITERATIONS 10 + static int posix_locks_deadlock(struct file_lock *caller_fl, struct file_lock *block_fl) { struct file_lock *fl; + int i = 0; next_task: if (posix_same_owner(caller_fl, block_fl)) return 1; list_for_each_entry(fl, &blocked_list, fl_link) { if (posix_same_owner(fl, block_fl)) { + if (i++ > MAX_DEADLK_ITERATIONS) + return 0; fl = fl->fl_next; block_fl = fl; goto next_task; -- 1.5.3.4.208.gc990 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH, RESEND] locks: fix possible infinite loop in posix deadlock detection 2007-10-30 15:20 ` [PATCH, RESEND] locks: fix possible infinite loop in " J. Bruce Fields @ 2007-10-30 15:35 ` Alan Cox 0 siblings, 0 replies; 28+ messages in thread From: Alan Cox @ 2007-10-30 15:35 UTC (permalink / raw) To: J. Bruce Fields Cc: Linus Torvalds, stable, linux-kernel, George G. Davis, Andrew Morton, linux-fsdevel On Tue, 30 Oct 2007 11:20:02 -0400 "J. Bruce Fields" <bfields@fieldses.org> wrote: > From: J. Bruce Fields <bfields@citi.umich.edu> > > It's currently possible to send posix_locks_deadlock() into an infinite > loop (under the BKL). > > For now, fix this just by bailing out after a few iterations. We may > want to fix this in a way that better clarifies the semantics of > deadlock detection. But that will take more time, and this minimal fix > is probably adequate for any realistic scenario, and is simple enough to > be appropriate for applying to stable kernels now. > > Thanks to George Davis for reporting the problem. > > Cc: "George G. Davis" <gdavis@mvista.com> > Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: Alan Cox <alan@redhat.com> Its a good fix for now and I doubt any real world user has that complex a locking pattern to break. ^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2007-10-30 15:51 UTC | newest]
Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20071017185157.GC3785@mvista.com>
[not found] ` <20071018185759.GU3785@mvista.com>
[not found] ` <20071026170750.GC13033@fieldses.org>
[not found] ` <20071026224707.GO13033@fieldses.org>
2007-10-28 17:31 ` [PATCH] locks: fix possible infinite loop in posix deadlock detection J. Bruce Fields
2007-10-28 17:43 ` [RFC, PATCH] locks: remove " J. Bruce Fields
2007-10-28 18:27 ` Matthew Wilcox
2007-10-28 18:40 ` Alan Cox
2007-10-28 20:11 ` Matthew Wilcox
2007-10-28 21:38 ` Alan Cox
2007-10-28 21:45 ` Jiri Kosina
2007-10-28 23:38 ` Matthew Wilcox
2007-10-28 23:44 ` Alan Cox
2007-10-28 21:50 ` Trond Myklebust
2007-10-28 22:41 ` Matthew Wilcox
2007-10-28 22:48 ` Alan Cox
2007-10-28 22:55 ` Matthew Wilcox
2007-10-28 23:38 ` Alan Cox
2007-10-29 2:29 ` J. Bruce Fields
2007-10-29 8:08 ` Alan Cox
2007-10-29 9:15 ` Jiri Kosina
2007-10-30 15:35 ` J. Bruce Fields
2007-10-28 22:55 ` Jiri Kosina
2007-10-28 23:31 ` Matthew Wilcox
2007-10-29 9:11 ` Jiri Kosina
2007-10-29 2:10 ` J. Bruce Fields
2007-10-29 3:26 ` Trond Myklebust
2007-10-29 1:13 ` J. Bruce Fields
2007-10-29 8:06 ` Alan Cox
2007-10-30 15:51 ` J. Bruce Fields
2007-10-30 15:20 ` [PATCH, RESEND] locks: fix possible infinite loop in " J. Bruce Fields
2007-10-30 15:35 ` Alan Cox
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).