Linux NFS development
* Re: [Bugme-new] [Bug 11448] New: NFS client has inconsistent write flushing to non-linux serversa
       [not found] ` <bug-11448-10286-V0hAGp6uBxO456/isadD/XN4h3HLQggn@public.gmane.org/>
@ 2008-08-28 20:27   ` Andrew Morton
  2008-08-28 20:33     ` Doug Hughes
                       ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Andrew Morton @ 2008-08-28 20:27 UTC (permalink / raw)
  To: linux-nfs; +Cc: bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r, doug-rDJHdQPhaF8


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Thu, 28 Aug 2008 11:41:08 -0700 (PDT)
bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11448
> 
>            Summary: NFS client has inconsistent write flushing to non-linux
>                     servers
>            Product: File System
>            Version: 2.5
>      KernelVersion: 2.6.22.15
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: NFS
>         AssignedTo: trond.myklebust@fys.uio.no
>         ReportedBy: doug-rDJHdQPhaF8@public.gmane.org
> 
> 
> Latest working kernel version: N/A (works on 2.6.18 with Linux NFS server, but
> we cannot continue to use that kernel for various reasons)
> Earliest failing kernel version: N/A (2.6.18, 2.6.24, and 2.6.25 are also known
> to fail by another party experiencing same bug against non-Linux NFS servers).
> Not currently known to be reproducible against NetApp, but this is not
> authoritative (lack of seeing a bug does not guarantee lack of existence)
> Distribution: CentOS 4.6
> Hardware Environment: supermicro twin, 2 quad core Harpertown CPU, 16G ram.
> Software Environment: CentOS 4.6
> Problem Description: 
> 
> An NFS client writes to a Sun Solaris 10 U4 server.
> At some point, a portion of the output file is empty where the writer's
> data should be (it shows as NUL bytes to another NFS client running
> tail -f on the file being written).
> We confirmed that the file as it exists on the NFS server is sparse and is
> missing bytes (not necessarily a multiple of 512 or 1024: one sample gap is
> 3818 bytes, another is 1895 bytes, another is 423 bytes).
> 
> If you read the entire file from the NFS client doing the writing, the
> unflushed writes are immediately flushed to the server, followed by an
> NFS3 commit operation. The data can then be seen on all other NFS clients.
> 
> If you only open the file: no flush.
> If you open and close it: no flush.
> If you open it and read at the beginning of the file (far before the data
> that is outstanding): *usually* no flush (there was one case where it did).
> If you read at another position in the file: no flush (other than as
> indicated above).
> If you read at the offset where the bytes are null, the NFS client issues
> the write and an NFS commit to the server (truss output available).
> 
> The missing blocks may flush themselves after undefined periods of time which
> can be hours. Our runs last days.
> 
> Steps to reproduce:
> 
> A chemist running NAMD sees frequent cases of this in his output trajectory
> index files. We don't have an exact sequence of steps to reproduce. After I
> file this ticket I will give the ticket number to another person I know at a
> different company who is experiencing the same problem as described above (to
> the best of my knowledge).
> 

That seems rather ugly.

2.6.22 is getting a bit old though.  It's quite possible that this was
subsequently fixed, in which case upgrading your kernel or hassling the
vendor to backport the fix would be needed.
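
One way to confirm and measure the gaps described in the report is to scan
a copy of the file for runs of NUL bytes. The following is a minimal Python
sketch, not a tool from this thread; find_nul_runs and its parameters are
hypothetical names chosen for illustration.

```python
def find_nul_runs(path, min_len=1, chunk_size=1 << 20):
    """Return a list of (offset, length) for each run of NUL bytes
    that is at least min_len bytes long."""
    runs = []
    run_start = None  # offset where the current NUL run began, if any
    pos = 0           # file offset of the start of the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            for i, b in enumerate(chunk):
                if b == 0:
                    if run_start is None:
                        run_start = pos + i
                elif run_start is not None:
                    length = pos + i - run_start
                    if length >= min_len:
                        runs.append((run_start, length))
                    run_start = None
            pos += len(chunk)
    # A run that extends to end of file is closed here.
    if run_start is not None and pos - run_start >= min_len:
        runs.append((run_start, pos - run_start))
    return runs
```

Calling find_nul_runs("/path/to/output") would list each gap as an
(offset, length) pair. Since the report says that reading the affected
offset on the writing client triggers the flush, the scan is probably best
run on the server's copy of the file rather than on the writing client.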


* Re: [Bugme-new] [Bug 11448] New: NFS client has inconsistent write flushing to non-linux serversa
  2008-08-28 20:27   ` [Bugme-new] [Bug 11448] New: NFS client has inconsistent write flushing to non-linux serversa Andrew Morton
@ 2008-08-28 20:33     ` Doug Hughes
  2008-08-29 12:54     ` Doug Hughes
  2008-08-29 17:08     ` J. Bruce Fields
  2 siblings, 0 replies; 8+ messages in thread
From: Doug Hughes @ 2008-08-28 20:33 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-nfs, bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r

Andrew Morton wrote:
> That seems rather ugly.
>
> 2.6.22 is getting a bit old though.  It's quite possible that this was
> subsequently fixed, in which case upgrading your kernel or hassling the
> vendor to backport the fix would be needed.
>   

I am in the process of trying to duplicate this on 2.6.26; I need the
chemist to change his machine first. (It's a kernel.org kernel, for
reasons of IB support, so there is no vendor to hassle.) Another party to
this bug, who seems to have the same symptoms on 2.6.25, is trying to
capture packet data and reproduce it.


* Re: [Bugme-new] [Bug 11448] New: NFS client has inconsistent write flushing to non-linux serversa
  2008-08-28 20:27   ` [Bugme-new] [Bug 11448] New: NFS client has inconsistent write flushing to non-linux serversa Andrew Morton
  2008-08-28 20:33     ` Doug Hughes
@ 2008-08-29 12:54     ` Doug Hughes
  2008-08-29 17:08     ` J. Bruce Fields
  2 siblings, 0 replies; 8+ messages in thread
From: Doug Hughes @ 2008-08-29 12:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-nfs, bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r

Confirmed: this bug is present in the same way in 2.6.26 with the default
async NFS mount, but so far it has not appeared when the sync mount option
is used.
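
Where remounting with sync is not practical, a rough per-file approximation
is to flush from the application after each record. This is an untested
sketch under that assumption, not a confirmed workaround for this bug;
append_record is a hypothetical helper, and the cost is one synchronous
flush per record.

```python
import os

def append_record(f, line):
    """Write one record and ask the kernel to push it out before returning.

    f is a file object opened for writing in binary mode; line is bytes.
    """
    f.write(line)
    f.flush()             # move data from Python's buffer into the kernel
    os.fsync(f.fileno())  # request writeback of this file's dirty data
```

A writer would open the output file once and call append_record(f, data)
in place of each plain write.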



* Re: [Bugme-new] [Bug 11448] New: NFS client has inconsistent write flushing to non-linux serversa
  2008-08-28 20:27   ` [Bugme-new] [Bug 11448] New: NFS client has inconsistent write flushing to non-linux serversa Andrew Morton
  2008-08-28 20:33     ` Doug Hughes
  2008-08-29 12:54     ` Doug Hughes
@ 2008-08-29 17:08     ` J. Bruce Fields
  2008-08-29 17:14       ` Peter Staubach
  2 siblings, 1 reply; 8+ messages in thread
From: J. Bruce Fields @ 2008-08-29 17:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-nfs, bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r,
	doug-rDJHdQPhaF8

On Thu, Aug 28, 2008 at 01:27:53PM -0700, Andrew Morton wrote:
> 
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Thu, 28 Aug 2008 11:41:08 -0700 (PDT)
> bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org wrote:
> > NFS client writes to Sun Solaris 10 U4 server. 
> > at some point in time, there is an empty portion of the output file from the
> > writer containing missing data (shows as NULL bytes from another NFS client
> > issuing a tail -f on the file being written). 
> > confirmed that the file as exists on the NFS server is sparse, missing bytes
> > (not necessarily multiple of 512 or 1024, one sample is a gap of 3818 bytes,
> > another is 1895 bytes, another is 423 bytes)

Seems like something that could happen if for example two write rpc's
got reordered on the network.  That's not necessarily a bug--the nfs
client isn't required to wait for confirmation of every previous write
before sending the next one.

However if the client isn't flushing dirty data to the server before
returning from close, then that's a violation of NFS's close-to-open
semantics:...

> > 
> > if you do a read of the entire file from the NFS client doing the writing, it
> > causes the non-flushed writes to be instantly flushed to the server followed by
> > a NFS3 commit operation. The data then can be seen on all other NFS clients.
> > 
> > If you do an open of the file alone, no flush
> > if you do an open and a close, no flush

... so this "close, no flush" could be a bug (depending on who is doing
that close when--I don't completely understand the described situation).

--b.


* Re: [Bugme-new] [Bug 11448] New: NFS client has inconsistent write flushing to non-linux serversa
  2008-08-29 17:08     ` J. Bruce Fields
@ 2008-08-29 17:14       ` Peter Staubach
  2008-08-29 17:23         ` Doug Hughes
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Staubach @ 2008-08-29 17:14 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Andrew Morton, linux-nfs,
	bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r, doug-rDJHdQPhaF8

J. Bruce Fields wrote:
>
> ... so this "close, no flush" could be a bug (depending on who is doing
> that close when--I don't completely understand the described situation).

I suspect that this last might depend upon 1) what options were used
when the file system was mounted and 2) how the file was opened.  The
flush-on-close wouldn't be needed if the file was opened read-only.

It seems a little odd that the holes aren't page aligned or page
sized multiples.

What application is being used to generate the file which is showing
these holes?

    Thanx...

       ps


* Re: [Bugme-new] [Bug 11448] New: NFS client has inconsistent write flushing to non-linux serversa
  2008-08-29 17:14       ` Peter Staubach
@ 2008-08-29 17:23         ` Doug Hughes
       [not found]           ` <48B83091.7060800-rDJHdQPhaF8@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Doug Hughes @ 2008-08-29 17:23 UTC (permalink / raw)
  To: Peter Staubach
  Cc: J. Bruce Fields, Andrew Morton, linux-nfs,
	bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r

Peter Staubach wrote:
> J. Bruce Fields wrote:
>> On Thu, Aug 28, 2008 at 01:27:53PM -0700, Andrew Morton wrote:
>>  
>>> (switched to email.  Please respond via emailed reply-to-all, not 
>>> via the
>>> bugzilla web interface).
>>>
>>> On Thu, 28 Aug 2008 11:41:08 -0700 (PDT)
>>> bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org wrote:
>>>    
>>>> NFS client writes to Sun Solaris 10 U4 server. at some point in 
>>>> time, there is an empty portion of the output file from the
>>>> writer containing missing data (shows as NULL bytes from another 
>>>> NFS client
>>>> issuing a tail -f on the file being written). confirmed that the 
>>>> file as exists on the NFS server is sparse, missing bytes
>>>> (not necessarily multiple of 512 or 1024, one sample is a gap of 
>>>> 3818 bytes,
>>>> another is 1895 bytes, another is 423 bytes)
>>>>       
>>
>> Seems like something that could happen if for example two write rpc's
>> got reordered on the network.  That's not necessarily a bug--the nfs
>> client isn't required to wait for confirmation of every previous write
>> before sending the next one.
>>
If two RPCs got reordered on the network, and together they encompass all
the data, then there shouldn't be any missing data once both have arrived.
It seems to me that pieces of data are just being skipped, for whatever
reason, but I haven't exhaustively examined the NFS network data.
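
The argument above can be illustrated with a small simulation. This is only
a sketch in Python: apply_writes and split_into_writes are hypothetical
helpers, and the 4096-byte write size is an arbitrary choice for the demo,
not the wsize from the mount. Applying the same set of writes in any order
gives the same final contents, while dropping one write leaves a NUL-filled
hole of exactly that write's extent.

```python
import random

def apply_writes(writes, size):
    """Apply (offset, data) writes to a zero-filled buffer of `size` bytes."""
    buf = bytearray(size)
    for offset, data in writes:
        buf[offset:offset + len(data)] = data
    return bytes(buf)

def split_into_writes(payload, wsize):
    """Cut payload into consecutive (offset, data) pieces of at most wsize."""
    return [(off, payload[off:off + wsize])
            for off in range(0, len(payload), wsize)]

payload = bytes(range(1, 251)) * 40        # 10000 bytes, none of them NUL
writes = split_into_writes(payload, 4096)  # three writes: 0, 4096, 8192

shuffled = writes[:]
random.shuffle(shuffled)                   # simulate reordering in transit

in_order = apply_writes(writes, len(payload))
reordered = apply_writes(shuffled, len(payload))
dropped = apply_writes(writes[:1] + writes[2:], len(payload))  # lose one
```

Here in_order and reordered are identical to payload, and only dropped
contains NUL bytes. Reordering could still expose a transient hole to a
reader in the interval between the two arrivals, but not one that persists.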

>> However if the client isn't flushing dirty data to the server before
>> returning from close, then that's a violation of NFS's close-to-open
>> semantics:...
>>
This is not confirmed yet. There are no solid cases of data being absent
after close.
>>  
>>>> if you do a read of the entire file from the NFS client doing the 
>>>> writing, it
>>>> causes the non-flushed writes to be instantly flushed to the server 
>>>> followed by
>>>> a NFS3 commit operation. The data then can be seen on all other NFS 
>>>> clients.
>>>>
>>>> If you do an open of the file alone, no flush
>>>> if you do an open and a close, no flush
>>>>       
>>
>> ... so this "close, no flush" could be a bug (depending on who is doing
>> that close when--I don't completely understand the described situation).
>
> I suspect that this last might depend upon 1) what options were used
> when the file system was mounted and 2) how the file was opened.  The
> flush-on-close wouldn't be needed if the file was opened read-only.
>
No special options on open. Here are the mount options:
retry=1000,tcp,noatime,nosuid,nodev,dirsync,timeo=100,rsize=32768,wsize=32768,hard,intr


> It seems a little odd that the holes aren't page aligned or page
> sized multiples.
>
Indeed. And the time for them to actually reach the server is
indeterminate; days is not uncommon. We have not yet confirmed whether
some of the data never gets sent to the server until close.

> What application is being used to generate the file which is showing
> these holes?
>
NAMD and some custom code developed in-house for chemistry research (at
the very least).




* Re: [Bugme-new] [Bug 11448] New: NFS client has inconsistent write flushing to non-linux serversa
       [not found]           ` <48B83091.7060800-rDJHdQPhaF8@public.gmane.org>
@ 2008-08-29 17:53             ` Peter Staubach
  2008-08-29 18:27               ` Doug Hughes
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Staubach @ 2008-08-29 17:53 UTC (permalink / raw)
  To: Doug Hughes
  Cc: J. Bruce Fields, Andrew Morton, linux-nfs,
	bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r

Doug Hughes wrote:
> namd and some custom code developed in-house for chemistry research 
> (at the very least) 

Do these applications use mmap() or generate the file contents
serially or randomly?

    Thanx...

       ps


* Re: [Bugme-new] [Bug 11448] New: NFS client has inconsistent write flushing to non-linux serversa
  2008-08-29 17:53             ` Peter Staubach
@ 2008-08-29 18:27               ` Doug Hughes
  0 siblings, 0 replies; 8+ messages in thread
From: Doug Hughes @ 2008-08-29 18:27 UTC (permalink / raw)
  To: Peter Staubach
  Cc: J. Bruce Fields, Andrew Morton, linux-nfs,
	bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r

Peter Staubach wrote:
> Do these applications use mmap() or generate the file contents
> serially or randomly?
>
>    Thanx...
>
>  
Open the file at the beginning, then write, write, write, write, write (no
seek, no offset, entirely serial), run for a very long time, end.

strace excerpt:
16:42:56.143512 write(8, "1948900 47.1225 0 0 0 47.7759 0 "..., 118) = 118
16:43:01.845742 write(8, "1949000 47.0474 0 0 0 47.8865 0 "..., 116) = 116
16:43:07.481889 write(8, "1949100 47.045 0 0 0 48.0742 0 0"..., 116) = 116
16:43:13.150555 write(8, "1949200 47.1848 0 0 0 47.8868 0 "..., 116) = 116
16:43:18.788863 write(8, "1949300 47.251 0 0 0 47.7743 0 0"..., 113) = 113
16:43:24.429424 write(8, "1949400 47.2722 0 0 0 47.6937 0 "..., 118) = 118
16:43:30.057179 write(8, "1949500 47.4865 0 0 0 47.6251 0 "..., 117) = 117
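
For anyone trying to reproduce this, the access pattern in the excerpt
(small serial write() calls on a single descriptor, never seeking) can be
approximated with a sketch like the one below. It is written in Python for
brevity and is not the NAMD code path; serial_writer, its parameters, and
the padded placeholder record are all assumptions, since the real record
contents are truncated in the strace output.

```python
import os
import time

def serial_writer(path, count, interval=0.0, start=1948900, step=100):
    """Append `count` short records with plain write() calls and no seeks.

    Returns the total number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    total = 0
    try:
        for i in range(count):
            # Placeholder payload: a counter padded with 'x' to 116 bytes,
            # roughly the record size seen in the excerpt.
            record = (f"{start + i * step} ".ljust(115, "x") + "\n").encode()
            total += os.write(fd, record)
            time.sleep(interval)  # pause between records
    finally:
        os.close(fd)
    return total
```

The excerpt shows roughly 5.6 seconds between writes, so an interval near
that value on an NFS-mounted path would be the closest match; whether this
pattern alone is enough to trigger the gaps is not established in this
thread.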



