* kernel 2.6 and simulated flock() with posix locks
From: Thanos Chatziathanassiou @ 2008-02-25 13:20 UTC (permalink / raw)
  To: linux-nfs


Hi,

I've been trying to replace kernel 2.4 with kernel 2.6 on a web server that mounts its DocumentRoot via NFS, and ran into a rather disturbing problem.
About half an hour after starting, the server stops serving requests, although it otherwise seems fine.
Earlier 2.6 kernels exhibited the ``do_vfs_lock: VFS is out of sync with lock manager!'' symptom; later ones (after this message was changed to a dprintk()) just sit there silently.
There is no apparent error apart from Apache complaining ``[error] server reached MaxClients setting, consider raising the MaxClients setting'' while being unable to serve any requests.

This issue does not surface under 2.4, where everything works as expected.
I came across this (http://blog.notreally.org/articles/2007/12/19/modifying-a-live-linux-kernel/), where apparently they faced the same problem, but their solution
(which seemed a little crude) resulted in Apache emitting ``There are no available locks'' messages (or roughly that, translated from my regional settings).

Is there any solution to this, or a way to get 2.4 behavior under 2.6?

Best Regards,
Thanos Chatziathanassiou




* Re: kernel 2.6 and simulated flock() with posix locks
From: J. Bruce Fields @ 2008-02-25 16:38 UTC (permalink / raw)
  To: Thanos Chatziathanassiou; +Cc: linux-nfs

On Mon, Feb 25, 2008 at 03:20:29PM +0200, Thanos Chatziathanassiou wrote:
> Hi,
>
> I've been trying to replace kernel 2.4 in a web server mounting its Document Root via NFS with kernel 2.6 and faced a rather disturbing problem.
> About 1/2 hour after starting, the server would stop serving requests though it seemed fine.
> Earlier 2.6 kernels exhibited the ``do_vfs_lock: VFS is out of sync with lock manager!'' symptom, later (when this was changed to a dprintk()) just sat there.
> No apparent error apart from apache complaining ``[error] server reached MaxClients setting, consider raising the MaxClients setting'', unable to serve any requests.
>
> This issue does not surface under 2.4, where everything works as expected.
> I came across this 
> (http://blog.notreally.org/articles/2007/12/19/modifying-a-live-linux-kernel/) 
> where apparently they faced the same problem, but their solution (which 
> seemed a little crude) resulted in apache spitting ``There are no 
> available locks'' messages (or roughly this, translated from my regional 
> settings).
>
> Is there any solution to this or a way to get 2.4 behavior under 2.6 ?

I'm a little confused--how do you know that the problem you face is the
same as the one described on the blog above?  Are you re-exporting NFS
via Samba?

--b.


* Re: kernel 2.6 and simulated flock() with posix locks
From: Thanos Chatziathanassiou @ 2008-02-25 16:42 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs


J. Bruce Fields wrote:
> On Mon, Feb 25, 2008 at 03:20:29PM +0200, Thanos Chatziathanassiou wrote:
>   
>> Hi,
>>
>> I've been trying to replace kernel 2.4 in a web server mounting its Document Root via NFS with kernel 2.6 and faced a rather disturbing problem.
>> About 1/2 hour after starting, the server would stop serving requests though it seemed fine.
>> Earlier 2.6 kernels exhibited the ``do_vfs_lock: VFS is out of sync with lock manager!'' symptom, later (when this was changed to a dprintk()) just sat there.
>> No apparent error apart from apache complaining ``[error] server reached MaxClients setting, consider raising the MaxClients setting'', unable to serve any requests.
>>
>> This issue does not surface under 2.4, where everything works as expected.
>> I came across this 
>> (http://blog.notreally.org/articles/2007/12/19/modifying-a-live-linux-kernel/) 
>> where apparently they faced the same problem, but their solution (which 
>> seemed a little crude) resulted in apache spitting ``There are no 
>> available locks'' messages (or roughly this, translated from my regional 
>> settings).
>>
>> Is there any solution to this or a way to get 2.4 behavior under 2.6 ?
>>     
>
> I'm a little confused--how do you know that the problem you face is the
> same as the one described on the blog above?  Are you re-exporting NFS
> via Samba?
>
> --b.
>   
Indeed I am. But I am willing to convince you ;) What kind of debug info
would I need to collect to find out what the problem really is?



* Re: kernel 2.6 and simulated flock() with posix locks
From: J. Bruce Fields @ 2008-02-28 22:32 UTC (permalink / raw)
  To: Thanos Chatziathanassiou; +Cc: linux-nfs

On Mon, Feb 25, 2008 at 06:42:35PM +0200, Thanos Chatziathanassiou wrote:
> J. Bruce Fields wrote:
>> On Mon, Feb 25, 2008 at 03:20:29PM +0200, Thanos Chatziathanassiou wrote:
>>   
>>> Hi,
>>>
>>> I've been trying to replace kernel 2.4 in a web server mounting its Document Root via NFS with kernel 2.6 and faced a rather disturbing problem.
>>> About 1/2 hour after starting, the server would stop serving requests though it seemed fine.
>>> Earlier 2.6 kernels exhibited the ``do_vfs_lock: VFS is out of sync with lock manager!'' symptom, later (when this was changed to a dprintk()) just sat there.
>>> No apparent error apart from apache complaining ``[error] server reached MaxClients setting, consider raising the MaxClients setting'', unable to serve any requests.
>>>
>>> This issue does not surface under 2.4, where everything works as expected.
>>> I came across this  
>>> (http://blog.notreally.org/articles/2007/12/19/modifying-a-live-linux-kernel/) 
>>> where apparently they faced the same problem, but their solution 
>>> (which seemed a little crude) resulted in apache spitting ``There are 
>>> no available locks'' messages (or roughly this, translated from my 
>>> regional settings).
>>>
>>> Is there any solution to this or a way to get 2.4 behavior under 2.6 ?
>>>     
>>
>> I'm a little confused--how do you know that the problem you face is the
>> same as the one described on the blog above?  Are you re-exporting NFS
>> via Samba?
>>
>> --b.
>>   
> Indeed I am. But I am willing to convince you ;) What kind of debug info  
> would I need to collect to find out what really the problem is ?

Can you give a more detailed explanation of the symptoms?  For example,
when you say "the server would stop serving requests", are you referring
to the web server or the nfs server?  If you think the problem is that
Apache is hanging on a lock, you should be able to verify that with
strace or /proc/locks or a sysrq-T trace.
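
As a rough sketch of the /proc side of such a check, the script below lists
processes with a given command name together with their scheduler state and
the kernel function they are sleeping in. It assumes a Linux /proc that
provides /proc/<pid>/stat and /proc/<pid>/wchan; the helper names and the
default command name "httpd" are illustrative only, and wchan may read as 0
or be unreadable depending on kernel configuration and permissions.

```python
import os


def parse_stat(stat_text):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so locate the first '(' and the last ')'.
    """
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def read_text(path):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None


def sleeping_where(command="httpd", proc_root="/proc"):
    """Return sorted (pid, state, wchan) tuples for processes named `command`."""
    rows = []
    if not os.path.isdir(proc_root):
        return rows
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        stat_text = read_text(os.path.join(proc_root, name, "stat"))
        if not stat_text:
            continue  # process exited between listdir() and open()
        pid, comm, state = parse_stat(stat_text)
        if comm != command:
            continue
        wchan = read_text(os.path.join(proc_root, name, "wchan")) or "?"
        rows.append((pid, state, wchan))
    return sorted(rows)


if __name__ == "__main__":
    for pid, state, wchan in sleeping_where():
        print(pid, state, wchan)
```

Many children in state S or D reporting the same lock-related wait function
would be consistent with the lock theory; it does not prove it, so strace on
one of those PIDs remains the more direct check.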

--b.


* Re: kernel 2.6 and simulated flock() with posix locks
From: Thanos Chatziathanassiou @ 2008-02-29 15:20 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs


J. Bruce Fields wrote:
> On Mon, Feb 25, 2008 at 06:42:35PM +0200, Thanos Chatziathanassiou wrote:
>   
>> J. Bruce Fields wrote:
>>     
>>> On Mon, Feb 25, 2008 at 03:20:29PM +0200, Thanos Chatziathanassiou wrote:
>>>   
>>>       
>>>> Hi,
>>>>
>>>> I've been trying to replace kernel 2.4 in a web server mounting its Document Root via NFS with kernel 2.6 and faced a rather disturbing problem.
>>>> About 1/2 hour after starting, the server would stop serving requests though it seemed fine.
>>>> Earlier 2.6 kernels exhibited the ``do_vfs_lock: VFS is out of sync with lock manager!'' symptom, later (when this was changed to a dprintk()) just sat there.
>>>> No apparent error apart from apache complaining ``[error] server reached MaxClients setting, consider raising the MaxClients setting'', unable to serve any requests.
>>>>
>>>> This issue does not surface under 2.4, where everything works as expected.
>>>> I came across this  
>>>> (http://blog.notreally.org/articles/2007/12/19/modifying-a-live-linux-kernel/) 
>>>> where apparently they faced the same problem, but their solution 
>>>> (which seemed a little crude) resulted in apache spitting ``There are 
>>>> no available locks'' messages (or roughly this, translated from my 
>>>> regional settings).
>>>>
>>>> Is there any solution to this or a way to get 2.4 behavior under 2.6 ?
>>>>     
>>>>         
>>> I'm a little confused--how do you know that the problem you face is the
>>> same as the one described on the blog above?  Are you re-exporting NFS
>>> via Samba?
>>>
>>> --b.
>>>   
>>>       
>> Indeed I am. But I am willing to convince you ;) What kind of debug info  
>> would I need to collect to find out what really the problem is ?
>>     
>
> Can you give a more detailed explanation of the symptoms?  For example,
> when you say "the server would stop serving requests", are you referring
> to the web server or the nfs server?
Sorry if I wasn't clear on this. This particular web server (stock
2.6.16.60) stops serving requests.
The NFS server (2.6.12.6-based) as well as the other (2.4-based) web servers
continue humming along just fine.
>   If you think the problem is that
> Apache is hanging on a lock, you should be able to verify that with
> strace or /proc/locks
Well, /proc/locks doesn't tell much:
---snip---
www4:~# cat /proc/locks
1: FLOCK  ADVISORY  WRITE 2512 08:07:829070 0 EOF
2: POSIX  ADVISORY  READ  2459 08:07:1284232 0 EOF
3: POSIX  ADVISORY  WRITE 2454 08:07:829066 0 EOF
---snip---
Process 2459 is
root      2459  0.0  0.0   1552   500 ?        S    16:07   0:00 ypbind (slave)
and 2454 is
root      2454  0.0  0.0   1532   448 ?        S    16:07   0:00 ypbind (master)
...but I couldn't find 2512 in the process table.
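
A minimal sketch of cross-checking that output against the process table: it
parses lines in the format shown above and reports lock owners whose PID no
longer exists. The function names are illustrative. One assumption worth
stating: for FLOCK entries the PID column records the process that took the
lock, and because flock() locks belong to the open file rather than to the
process, a forked child may still hold the lock after that PID has exited,
which might explain the missing 2512.

```python
import os
from typing import List, NamedTuple


class LockEntry(NamedTuple):
    kind: str      # "FLOCK" or "POSIX"
    mode: str      # "ADVISORY" or "MANDATORY"
    access: str    # "READ" or "WRITE"
    pid: int       # PID recorded when the lock was taken
    device: str    # "major:minor" in hex, as printed by the kernel
    inode: int
    blocked: bool  # True for "->" lines, i.e. a waiter rather than a holder


def parse_proc_locks(text: str) -> List[LockEntry]:
    """Parse the contents of /proc/locks into LockEntry tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Waiters are printed with an extra "->" column after the index.
        blocked = len(fields) > 1 and fields[1] == "->"
        if blocked:
            fields = fields[:1] + fields[2:]
        if len(fields) < 8:
            continue  # skip blank or unrecognised lines
        kind, mode, access, pid, dev_inode = fields[1:6]
        major, minor, inode = dev_inode.split(":")
        entries.append(LockEntry(kind, mode, access, int(pid),
                                 major + ":" + minor, int(inode), blocked))
    return entries


def missing_owners(entries: List[LockEntry], proc_root: str = "/proc") -> List[LockEntry]:
    """Return entries whose recorded PID has no directory under proc_root."""
    return [entry for entry in entries
            if not os.path.isdir(os.path.join(proc_root, str(entry.pid)))]


if __name__ == "__main__":
    if os.path.exists("/proc/locks"):
        with open("/proc/locks") as handle:
            for entry in missing_owners(parse_proc_locks(handle.read())):
                print("no such process:", entry)
```

The inode numbers (829070 here) can then be mapped back to a path with
something like find -inum on the filesystem behind device 08:07. Locks taken
on the NFS mount would be expected to show that mount's device number rather
than 08:07.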

However, running strace on random httpd processes yields:
---snip---
strace -p 22149
flock(11, LOCK_EX
---snip---

...which is, understandably, blocking.
Unfortunately, this child never got to write what it was serving at the
time to the access and/or error log, but we can (safely?) assume it was
some mod_perl script that called flock().
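
For illustration, a blocking flock(fd, LOCK_EX) like the one in the trace
waits indefinitely if the lock is never released, which would tie up one
Apache child per request. Below is a hedged sketch of the non-blocking
alternative, written in Python rather than the Perl such a script would
presumably use; the function name, timeout and polling interval are all
hypothetical choices, not anything taken from the scripts on this server.

```python
import errno
import fcntl
import time


def try_flock(fd, timeout=5.0, poll_interval=0.1):
    """Try to take an exclusive flock() on fd, giving up after `timeout` seconds.

    Returns True if the lock was acquired and False if it was still held
    elsewhere when the deadline passed. Errors other than "lock is busy"
    (for example ENOLCK) are re-raised so they are not hidden.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # LOCK_NB makes flock() fail immediately instead of sleeping.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except OSError as exc:
            if exc.errno not in (errno.EAGAIN, errno.EWOULDBLOCK):
                raise
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A caller that gets False back can log the file name and fail the request,
which at least leaves a trace in the error log instead of a silently stuck
child.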

Let me know if I can grab anything else.
>  or a sysrq-T trace.
>
> --b.
>   


