From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dr Fields James Bruce
Subject: Re: [PATCH] locks: try to catch potential deadlock between file-private and classic locks from same process
Date: Tue, 4 Mar 2014 16:14:43 -0500
Message-ID: <20140304211443.GJ12805@fieldses.org>
References: <1393960249-18961-1-git-send-email-jlayton@redhat.com> <20140304193551.GG12805@fieldses.org> <20140304151451.07530a98@tlielax.poochiereds.net> <20140304153723.088db7cd@tlielax.poochiereds.net> <849B7F2D-A1C2-4E65-AE6A-0CCFA92990F0@primarydata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: Layton Jeff , Andy Lutomirski , Linux FS Devel
To: Trond Myklebust
Return-path:
Received: from fieldses.org ([174.143.236.118]:57938 "EHLO fieldses.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756238AbaCDVOq (ORCPT ); Tue, 4 Mar 2014 16:14:46 -0500
Content-Disposition: inline
In-Reply-To: <849B7F2D-A1C2-4E65-AE6A-0CCFA92990F0@primarydata.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Tue, Mar 04, 2014 at 03:52:47PM -0500, Trond Myklebust wrote:
>
> On Mar 4, 2014, at 15:37, Jeff Layton wrote:
>
> > On Tue, 4 Mar 2014 12:19:44 -0800
> > Andy Lutomirski wrote:
> >
> >> On Tue, Mar 4, 2014 at 12:14 PM, Jeff Layton wrote:
> >>> On Tue, 4 Mar 2014 14:35:51 -0500
> >>> "J. Bruce Fields" wrote:
> >>>
> >>>> On Tue, Mar 04, 2014 at 02:10:49PM -0500, Jeff Layton wrote:
> >>>>> My expectation is that programs shouldn't mix classic and file-private
> >>>>> locks, but Glenn Skinner pointed out to me that that may occur at times
> >>>>> even if the programmer isn't aware.
> >>>>>
> >>>>> Suppose we have a program that uses file-private locks. That program
> >>>>> then links in a library that uses classic POSIX locks. If those locks
> >>>>> end up conflicting and one is using blocking locks, then the program
> >>>>> could end up deadlocked.
> >>>>>
> >>>>> Try to catch this situation in posix_locks_deadlock by looking for the
> >>>>> case where the blocking lock was set by the same process but has a
> >>>>> different type, and have the kernel return EDEADLK if that occurs.
> >>>>>
> >>>>> This check is not perfect. You could (in principle) have a threaded
> >>>>> process that is using classic locks in one thread and file-private locks
> >>>>> in another. That's not necessarily a deadlockable situation but this
> >>>>> check would cause an EDEADLK return in that case.
> >>>>>
> >>>>> By the same token, you could also have a file-private lock that was
> >>>>> inherited across a fork(). If the inheriting process ends up blocking on
> >>>>> that while trying to set a classic POSIX lock then this check would miss
> >>>>> it and the program would deadlock.
> >>>>
> >>>> If the caller's not prepared for the library to use classic posix locks,
> >>>> then it's not going to know how to recover from this EDEADLK either, is
> >>>> it?
> >>>>
> >>>
> >>> Well, callers should be aware of that if we take this change. The
> >>> semantics aren't yet set in stone...
> >>>
> >>>> I guess I don't understand how this helps anyone.
> >>>>
> >>>> Has it ever made sense for a library function and its caller to both use
> >>>> classic posix locking on the same file without any coordination?
> >>>>
> >>>
> >>> Not really, but that doesn't mean that it isn't done... ;)
> >>>
> >>>> Besides the first-close problem there's the problem that locks merge, so
> >>>> for example you can't hold your own lock across a call to a function
> >>>> that grabs and drops a lock on the same file.
> >>>>
> >>>
> >>> It depends, but you're basically correct...
> >>>
> >>> It's likely that if the above situation occurred with a program using
> >>> classic locks, then those locks were silently lost at times.
> >>> It's also plausible that when it occurs no one is aware of it due to
> >>> the way POSIX locks work.
> >>>
> >>> If the program switched to using file-private locks and the library
> >>> stays using classic locks (or vice versa), you then potentially trade
> >>> that silent loss of locks for a deadlock (since classic and
> >>> file-private locks always conflict).
> >>>
> >>> So, the idea would be to try to catch that situation explicitly and
> >>> return a hard error instead of deadlocking. Unfortunately, it's a
> >>> little tough to do that in all cases so all this does is try to catch a
> >>> subset of them.
> >>>
> >>> Will it be helpful in the long run? I'm not sure. It seems unlikely to
> >>> harm legit use cases though, and might catch some problematic
> >>> situations. I can drop this if that's the consensus however.
> >>
> >> I don't think I like it except in the case where there are no threads
> >> (number of tasks sharing the fd table is 1) and where the struct file
> >> only has one fd. Otherwise I think it can have false positives. Or
> >> am I missing something?
> >>
> >
> > The only case where I think this would hit a false positive is if you
> > have a threaded program that's doing something weird like having one
> > thread that's setting classic POSIX locks on a file, and one thread
> > that isn't. Once you hit a conflict between the two, you'd get back
> > EDEADLK on one of them, even though that situation might not actually
> > be a deadlock.
> >
> > That doesn't really seem like a real-world use-case though, so I'm
> > generally OK with that potential false-positive.
> >
>
> How do these locks interact with locks_mandatory_area(), and mandatory
> locking in general? Unless I missed something, it looks to me as if
> there is a nasty potential for a self-DOS if you set a file-private lock
> on a file with the mandatory lock bits set and the filesystem is
> mounted '-omand'.

Good point: if I understand it right, in the mandatory locking case,
before doing a read or write we first check if we'd be able to apply a
classic posix lock. And that lock will always conflict with a
file-private lock.

I think we should just not worry about it and see if anyone complains.
File-private locks are a new feature and I don't see that we're under
any obligation to support the combination of file-private locks and
mandatory locking.

Mandatory locking is already buggy (because of the race between checking
for locks and performing the IO). If we get no complaints about this
file-private behavior then that's more evidence we could use to justify
just ripping it out completely some day....

But if we really want to be helpful to (possibly nonexistent?) users of
mandatory locking, maybe we could allow locks_mandatory_area to try
*both* a file-private and a classic lock and to succeed if either one
succeeds??

--b.
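
To make the "try both" idea for locks_mandatory_area concrete, here is a
toy userspace model, not kernel code. Every name in it (toy_lock,
toy_conflict, toy_io_ok_classic_only, toy_io_ok_either) is made up for
illustration, and it assumes the simplified rule that two locks conflict
when their ranges overlap, at least one is a write lock, and they do not
share an owner. Under that assumption a classic probe (owned by the
process) can never share an owner with a file-private lock (owned by the
open file), which is the self-DOS Trond describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and helpers are hypothetical, not kernel code. */
enum toy_kind { TOY_CLASSIC, TOY_FILE_PRIVATE };
enum toy_mode { TOY_READ, TOY_WRITE };

struct toy_lock {
    enum toy_kind kind;
    enum toy_mode mode;
    int owner;        /* classic: process id; file-private: open-file id */
    long start, end;  /* inclusive byte range */
};

/* Assumed conflict rule: overlapping ranges, at least one writer,
 * and not the same owner of the same kind. */
bool toy_conflict(const struct toy_lock *a, const struct toy_lock *b)
{
    assert(a->start <= a->end && b->start <= b->end);
    if (a->end < b->start || b->end < a->start)
        return false;                       /* ranges do not overlap */
    if (a->mode == TOY_READ && b->mode == TOY_READ)
        return false;                       /* two readers never conflict */
    return !(a->kind == b->kind && a->owner == b->owner);
}

/* True if any held lock conflicts with the probe lock. */
bool toy_blocked(const struct toy_lock *held, size_t n,
                 const struct toy_lock *probe)
{
    for (size_t i = 0; i < n; i++)
        if (toy_conflict(&held[i], probe))
            return true;
    return false;
}

/* Behaviour as described in the mail: probe with a classic lock only. */
bool toy_io_ok_classic_only(const struct toy_lock *held, size_t n, int pid,
                            enum toy_mode mode, long start, long end)
{
    struct toy_lock probe = { TOY_CLASSIC, mode, pid, start, end };
    return !toy_blocked(held, n, &probe);
}

/* Suggested variant: allow the IO if either kind of probe would succeed. */
bool toy_io_ok_either(const struct toy_lock *held, size_t n, int pid,
                      int file_id, enum toy_mode mode, long start, long end)
{
    struct toy_lock classic = { TOY_CLASSIC, mode, pid, start, end };
    struct toy_lock priv = { TOY_FILE_PRIVATE, mode, file_id, start, end };
    return !toy_blocked(held, n, &classic) || !toy_blocked(held, n, &priv);
}
```

With a single file-private write lock held through open file 7, a write
by the same process through that file is refused by
toy_io_ok_classic_only() but allowed by toy_io_ok_either(); a write
through a different open file is refused by both. The model ignores the
check-then-IO race mentioned above.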