* RE: posix_lock_deadlocks() hang
@ 2004-09-14 19:23 Stuart_Hayes
2004-09-14 19:28 ` Jeremy Allison
0 siblings, 1 reply; 9+ messages in thread
From: Stuart_Hayes @ 2004-09-14 19:23 UTC (permalink / raw)
To: jra; +Cc: michael, linux-fsdevel
Jeremy Allison wrote:
> On Mon, Sep 13, 2004 at 03:25:41PM -0500, Stuart_Hayes@Dell.com wrote:
>>
>> I am seeing the kernel get stuck in an endless loop in
>> posix_locks_deadlock() checking for deadlocks.
>>
>> It appears that samba (smbd) is getting both flocks and posix locks.
>> Since flock_lock_file() isn't checking for potential deadlocks before
>> adding lock requests to the blocked_list, we end up with (what
>> appears to be) a circular dependency between a posix lock and an
>> flock in blocked_list.
>
> I'm not aware that we obtain any locks with flock. We only use fcntl
> (POSIX) locks. I just grepped the code so I'm quite sure :-).
>
> Jeremy.
I haven't specifically traced your application, but, while debugging the
locking code, I definitely saw smbd requesting both flocks and posix
locks.
Also, looking at Samba version 3.0.6, from Red Hat Enterprise Linux, and
I see
source/smbd/open.c: if (kernel_mode) flock(fsp->fd, kernel_mode);
among other grep matches.
I just checked 3.0.7, and it seems to do the same thing.
Stuart
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: posix_lock_deadlocks() hang
2004-09-14 19:23 posix_lock_deadlocks() hang Stuart_Hayes
@ 2004-09-14 19:28 ` Jeremy Allison
2004-09-14 19:30 ` Jeremy Allison
0 siblings, 1 reply; 9+ messages in thread
From: Jeremy Allison @ 2004-09-14 19:28 UTC (permalink / raw)
To: Stuart_Hayes; +Cc: jra, michael, linux-fsdevel
On Tue, Sep 14, 2004 at 02:23:29PM -0500, Stuart_Hayes@Dell.com wrote:
> Jeremy Allison wrote:
> > On Mon, Sep 13, 2004 at 03:25:41PM -0500, Stuart_Hayes@Dell.com wrote:
> >>
> >> I am seeing the kernel get stuck in an endless loop in
> >> posix_locks_deadlock() checking for deadlocks.
> >>
> >> It appears that samba (smbd) is getting both flocks and posix locks.
> >> Since flock_lock_file() isn't checking for potential deadlocks before
> >> adding lock requests to the blocked_list, we end up with (what
> >> appears to be) a circular dependency between a posix lock and an
> >> flock in blocked_list.
> >
> > I'm not aware that we obtain any locks with flock. We only use fcntl
> > (POSIX) locks. I just grepped the code so I'm quite sure :-).
> >
> > Jeremy.
>
> I haven't specifically traced your application, but, while debugging the
> locking code, I definitely saw smbd requesting both flocks and posix
> locks.
>
> Also, looking at Samba version 3.0.6, from Red Hat Enterprise Linux, and
> I see
> source/smbd/open.c: if (kernel_mode) flock(fsp->fd, kernel_mode);
> among other grep matches.
>
> I just checked 3.0.7, and it seems to do the same thing.
Yes, that's only if #if HAVE_KERNEL_SHARE_MODES is defined,
which is not turned on by default. That's only done if someone
sets it by hand (it's not autoconf tested). It's also not a
standard flock (to lock files) but a request for a share mode
in the kernel.
Did you turn this on ?
Jeremy.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: posix_lock_deadlocks() hang
2004-09-14 19:28 ` Jeremy Allison
@ 2004-09-14 19:30 ` Jeremy Allison
0 siblings, 0 replies; 9+ messages in thread
From: Jeremy Allison @ 2004-09-14 19:30 UTC (permalink / raw)
To: Jeremy Allison; +Cc: Stuart_Hayes, michael, linux-fsdevel
On Tue, Sep 14, 2004 at 12:28:14PM -0700, Jeremy Allison wrote:
>
> Yes, that's only if #if HAVE_KERNEL_SHARE_MODES is defined,
> which is not turned on by default. That's only done if someone
> sets it by hand (it's not autoconf tested). It's also not a
> standard flock (to lock files) but a request for a share mode
> in the kernel.
>
> Did you turn this on ?
Arrrgggh. Scratch that - turns out it *was* autoconf'ed in,
and also that code is called by default if the kernel share
modes are detected. That shouldn't be the case (IMHO). I'll
look into when it changed and fix it.
Thanks, my error not yours.
Jeremy.
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: posix_lock_deadlocks() hang
@ 2004-09-17 16:21 Stuart_Hayes
0 siblings, 0 replies; 9+ messages in thread
From: Stuart_Hayes @ 2004-09-17 16:21 UTC (permalink / raw)
To: willy; +Cc: matthew, linux-fsdevel, akpm
>>
>> flocks aren't supposed to be on the blocked list in the first place.
>> I'm not sure how this hasn't been noticed before. The correct patch
>> would be something like (compile-tested only):
>>
>> Index: linux-2.6/fs/locks.c
>> ===================================================================
>> RCS file: /var/cvs/linux-2.6/fs/locks.c,v retrieving revision 1.12
>> diff -u -p -r1.12 locks.c --- linux-2.6/fs/locks.c 13 Aug 2004
>> 14:30:02 -0000 1.12 +++ linux-2.6/fs/locks.c 14 Sep 2004
00:54:13
>> -0000 @@ -459,7 +459,8 @@ static void locks_insert_block(struct fi
>> } list_add_tail(&waiter->fl_block, &blocker->fl_block);
>> waiter->fl_next = blocker; - list_add(&waiter->fl_link,
>> &blocked_list); + if (IS_POSIX(blocker))
>> + list_add(&waiter->fl_link, &blocked_list);
>> }
>>
>> /* Wake up processes blocked waiting for blocker.
>
> That makes sense. I'll give this a try to verify that it fixes the
> problem we're seeing, and will let you know. Thanks for the help!
That did indeed fix the problem we were seeing. Thanks again!
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: posix_lock_deadlocks() hang
@ 2004-09-14 14:43 Stuart_Hayes
0 siblings, 0 replies; 9+ messages in thread
From: Stuart_Hayes @ 2004-09-14 14:43 UTC (permalink / raw)
To: willy; +Cc: matthew, linux-fsdevel, akpm
Matthew Wilcox wrote:
> On Mon, Sep 13, 2004 at 03:25:41PM -0500, Stuart_Hayes@Dell.com wrote:
>> Michael --
>
> Matthew, actually ...
>
Sorry about that! How embarrassing.
>
> flocks aren't supposed to be on the blocked list in the first place.
> I'm not sure how this hasn't been noticed before. The correct patch
> would be something like (compile-tested only):
>
> Index: linux-2.6/fs/locks.c
> ===================================================================
> RCS file: /var/cvs/linux-2.6/fs/locks.c,v retrieving revision 1.12
> diff -u -p -r1.12 locks.c --- linux-2.6/fs/locks.c 13 Aug 2004
> 14:30:02 -0000 1.12 +++ linux-2.6/fs/locks.c 14 Sep 2004
00:54:13
> -0000 @@ -459,7 +459,8 @@ static void locks_insert_block(struct fi
> }
> list_add_tail(&waiter->fl_block, &blocker->fl_block);
> waiter->fl_next = blocker;
> - list_add(&waiter->fl_link, &blocked_list);
> + if (IS_POSIX(blocker))
> + list_add(&waiter->fl_link, &blocked_list);
> }
>
> /* Wake up processes blocked waiting for blocker.
That makes sense. I'll give this a try to verify that it fixes the
problem we're seeing, and will let you know. Thanks for the help!
^ permalink raw reply [flat|nested] 9+ messages in thread
* posix_lock_deadlocks() hang
@ 2004-09-13 20:25 Stuart_Hayes
2004-09-14 0:55 ` Matthew Wilcox
2004-09-14 19:09 ` Jeremy Allison
0 siblings, 2 replies; 9+ messages in thread
From: Stuart_Hayes @ 2004-09-13 20:25 UTC (permalink / raw)
To: michael; +Cc: linux-fsdevel, Stuart_Hayes
Michael --
I am seeing the kernel get stuck in an endless loop in
posix_locks_deadlock() checking for deadlocks.
It appears that samba (smbd) is getting both flocks and
posix locks. Since flock_lock_file() isn't checking for
potential deadlocks before adding lock requests to the
blocked_list, we end up with (what appears to be) a circular
dependency between a posix lock and an flock in blocked_list.
This causes posix_locks_deadlock() to get stuck in a loop
next time we try to get a posix_lock that is blocked by
something with the circular dependency.
Since flocks aren't supposed to do any deadlock checking, it
seems like the right solution to this would be to modify
posix_locks_deadlock() to only check for potential deadlock
situations with other posix locks and lock requests, and
ignore flocks.
I have patched posix_locks_deadlock() to do this (see patch
below), and it fixes the problem.
Do you see any problems with this fix? I could submit a real
patch for this if you like.
(I am seeing this on a RHEL3 update 3 (2.4.21-20) kernel, but,
looking at the source code, it appears to be applicable to the
2.6.7 kernel, too.)
Thanks!
Stuart
--- locks.c Wed Apr 21 23:09:18 2004
+++ locks.c.new Fri Sep 10 17:15:04 2004
@@ -685,7 +685,8 @@ next_task:
return 1;
list_for_each(tmp, &blocked_list) {
struct file_lock *fl = list_entry(tmp, struct file_lock,
fl_link);
- if ((fl->fl_owner == blocked_owner)
+ if ( (fl->fl_flags & FL_POSIX)
+ && (fl->fl_owner == blocked_owner)
&& (fl->fl_pid == blocked_pid)) {
fl = fl->fl_next;
blocked_owner = fl->fl_owner;
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: posix_lock_deadlocks() hang
2004-09-13 20:25 Stuart_Hayes
@ 2004-09-14 0:55 ` Matthew Wilcox
2004-09-14 19:09 ` Jeremy Allison
1 sibling, 0 replies; 9+ messages in thread
From: Matthew Wilcox @ 2004-09-14 0:55 UTC (permalink / raw)
To: Stuart_Hayes; +Cc: matthew, linux-fsdevel, Andrew Morton
On Mon, Sep 13, 2004 at 03:25:41PM -0500, Stuart_Hayes@Dell.com wrote:
> Michael --
Matthew, actually ...
> It appears that samba (smbd) is getting both flocks and
> posix locks. Since flock_lock_file() isn't checking for
> potential deadlocks before adding lock requests to the
> blocked_list, we end up with (what appears to be) a circular
> dependency between a posix lock and an flock in blocked_list.
> This causes posix_locks_deadlock() to get stuck in a loop
> next time we try to get a posix_lock that is blocked by
> something with the circular dependency.
>
> Since flocks aren't supposed to do any deadlock checking, it
> seems like the right solution to this would be to modify
> posix_locks_deadlock() to only check for potential deadlock
> situations with other posix locks and lock requests, and
> ignore flocks.
flocks aren't supposed to be on the blocked list in the first place.
I'm not sure how this hasn't been noticed before. The correct patch
would be something like (compile-tested only):
Index: linux-2.6/fs/locks.c
===================================================================
RCS file: /var/cvs/linux-2.6/fs/locks.c,v
retrieving revision 1.12
diff -u -p -r1.12 locks.c
--- linux-2.6/fs/locks.c 13 Aug 2004 14:30:02 -0000 1.12
+++ linux-2.6/fs/locks.c 14 Sep 2004 00:54:13 -0000
@@ -459,7 +459,8 @@ static void locks_insert_block(struct fi
}
list_add_tail(&waiter->fl_block, &blocker->fl_block);
waiter->fl_next = blocker;
- list_add(&waiter->fl_link, &blocked_list);
+ if (IS_POSIX(blocker))
+ list_add(&waiter->fl_link, &blocked_list);
}
/* Wake up processes blocked waiting for blocker.
--
"Next the statesmen will invent cheap lies, putting the blame upon
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince
himself that the war is just, and will thank God for the better sleep
he enjoys after this process of grotesque self-deception." -- Mark Twain
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: posix_lock_deadlocks() hang
2004-09-13 20:25 Stuart_Hayes
2004-09-14 0:55 ` Matthew Wilcox
@ 2004-09-14 19:09 ` Jeremy Allison
2004-09-14 23:23 ` Stephen Rothwell
1 sibling, 1 reply; 9+ messages in thread
From: Jeremy Allison @ 2004-09-14 19:09 UTC (permalink / raw)
To: Stuart_Hayes; +Cc: michael, linux-fsdevel
On Mon, Sep 13, 2004 at 03:25:41PM -0500, Stuart_Hayes@Dell.com wrote:
>
> I am seeing the kernel get stuck in an endless loop in
> posix_locks_deadlock() checking for deadlocks.
>
> It appears that samba (smbd) is getting both flocks and
> posix locks. Since flock_lock_file() isn't checking for
> potential deadlocks before adding lock requests to the
> blocked_list, we end up with (what appears to be) a circular
> dependency between a posix lock and an flock in blocked_list.
I'm not aware that we obtain any locks with flock. We only
use fcntl (POSIX) locks. I just grepped the code so I'm
quite sure :-).
Jeremy.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2004-09-17 16:21 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-09-14 19:23 posix_lock_deadlocks() hang Stuart_Hayes
2004-09-14 19:28 ` Jeremy Allison
2004-09-14 19:30 ` Jeremy Allison
-- strict thread matches above, loose matches on Subject: below --
2004-09-17 16:21 Stuart_Hayes
2004-09-14 14:43 Stuart_Hayes
2004-09-13 20:25 Stuart_Hayes
2004-09-14 0:55 ` Matthew Wilcox
2004-09-14 19:09 ` Jeremy Allison
2004-09-14 23:23 ` Stephen Rothwell
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).