* Propagation of changes in shared mmap()ed NFS files
@ 2008-06-21 19:05 Phil Endecott
       [not found] ` <1214075120367-YnoLgZYwwYuCbKHnblo0pmrPP3OPMK55cpQHUIT47Ck@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Phil Endecott @ 2008-06-21 19:05 UTC (permalink / raw)
  To: linux-nfs

Dear Experts,

I have a program which uses an mmap()ed read-mostly data file.  When 
not using NFS, each instance of the program can use inotify to detect 
when other instances have made changes to the data file.  Since inotify 
doesn't work with NFS, I have now implemented a scheme using network 
broadcasts to announce changes.  At present it works like this:

All instances of the program mmap(MAP_SHARED) the data file.

One instance stores some new data at the end of the file and calls 
msync(MS_SYNC) on the affected pages.  It then "atomically commits" the 
new data by write()ing a new header at the start of the file with an 
"end of data" field advanced to include the new data.  It then calls 
fdatasync().  Then it transmits a broadcast packet.
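For concreteness, the writer side described above might be sketched roughly as follows. This is only an illustration under assumptions: the header layout (struct file_header holding a single end-of-data offset at byte 0) and the function name commit_append are hypothetical, pwrite() at offset 0 stands in for the write() of the header, and the broadcast step is left out.

```c
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical header layout: one "end of data" offset at byte 0. */
struct file_header {
    uint64_t end_of_data;
};

/*
 * Append len bytes at offset old_end through the shared mapping, flush the
 * affected pages, then publish the new end-of-data offset in the header.
 * map must be a MAP_SHARED mapping of fd that starts at file offset 0 and
 * covers at least old_end + len bytes.
 * Returns 0 on success, -1 on failure.
 */
int commit_append(int fd, char *map, uint64_t old_end,
                  const void *data, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    /* Step 1: store the new data via the mapping. */
    memcpy(map + old_end, data, len);

    /* Step 2: msync() needs a page-aligned start address, so round down. */
    uint64_t start = old_end - (old_end % (uint64_t)page);
    if (msync(map + start, (size_t)(old_end - start) + len, MS_SYNC) != 0)
        return -1;

    /* Step 3: "commit" by rewriting the header with the advanced offset. */
    struct file_header hdr = { .end_of_data = old_end + len };
    if (pwrite(fd, &hdr, sizeof hdr, 0) != (ssize_t)sizeof hdr)
        return -1;

    /* Step 4: flush the header before announcing the change. */
    return fdatasync(fd);
}
```

The ordering is the point of the sketch: the data pages are flushed before the header that makes them reachable is rewritten.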

The other instance(s) of the program receive the broadcast packet and 
read() the header at the start of the file.  My hope was that they 
would see the new value, but they don't; they continue to see the old value.

In order to allow for network broadcasts being unreliable the 
wait-for-broadcast code has a 30 second timeout; when this timeout next 
expires the program reads the header again and now it sees the new 
end-of-data offset, and the new data in the mapped memory region.
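The wait-with-timeout step could look something like the sketch below. It assumes the announcements arrive as datagrams on an already-bound socket; the function name wait_for_announcement and the 64-byte receive buffer are hypothetical, and the packet payload is ignored.

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Wait up to timeout_ms for a change announcement on sock (assumed to be a
 * bound datagram socket).
 * Returns 1 if a packet arrived, 0 on timeout, -1 on error (including a
 * poll() interrupted by a signal).
 * The caller re-reads the file header in both the 1 and the 0 case.
 */
int wait_for_announcement(int sock, int timeout_ms)
{
    struct pollfd pfd = { .fd = sock, .events = POLLIN };

    int n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;
    if (n == 0)
        return 0;                       /* timeout expired */
    if (!(pfd.revents & POLLIN))
        return -1;                      /* POLLERR, POLLHUP or POLLNVAL */

    /* Consume one datagram; its contents are not used in this sketch. */
    char buf[64];
    if (recv(sock, buf, sizeof buf, 0) < 0)
        return -1;
    return 1;
}
```

With the 30 second timeout mentioned above, the call would be wait_for_announcement(sock, 30000), followed by a re-read of the header whichever way it returns.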

So, what do I have to do so that the new data is visible promptly?  Is 
there more that the sender or the receiver should do to tell the local 
kernel or the NFS server to propagate changes?  For example, does 
msync(MS_INVALIDATE) do anything useful?  Do I simply have to wait for 
some delay after receiving the broadcast?

I have also observed that when the changes are finally noticed, it 
seems that the whole of the multi-megabyte file is re-fetched from the 
server as it is accessed, despite only a few hundred bytes having 
changed.  This is undesirable; is there anything that I can do to 
prevent it?  (I tried calling mlock(), mainly to check that this wasn't 
a memory-pressure problem, but it still seems to re-fetch it.)

Ideally I'd like to have something that will work with any non-ancient 
version of NFS, and perhaps even CIFS too, but for now I'd be happy 
with getting it working on this NFSv3 system.

Many thanks for any advice,

Phil.






* Re: Propagation of changes in shared mmap()ed NFS files
       [not found] ` <1214075120367-YnoLgZYwwYuCbKHnblo0pmrPP3OPMK55cpQHUIT47Ck@public.gmane.org>
@ 2008-06-21 21:43   ` Trond Myklebust
  2008-06-21 22:02     ` Phil Endecott
  0 siblings, 1 reply; 5+ messages in thread
From: Trond Myklebust @ 2008-06-21 21:43 UTC (permalink / raw)
  To: Phil Endecott; +Cc: linux-nfs

On Sat, 2008-06-21 at 20:05 +0100, Phil Endecott wrote:
> Dear Experts,
> 
> I have a program which uses an mmap()ed read-mostly data file.  When 
> not using NFS, each instance of the program can use inotify to detect 
> when other instances have made changes to the data file.  Since inotify 
> doesn't work with NFS, I have now implemented a scheme using network 
> broadcasts to announce changes.  At present it works like this:
> 
> All instances of the program mmap(MAP_SHARED) the data file.
> 
> One instance stores some new data at the end of the file and calls 
> msync(MS_SYNC) on the affected pages.  It then "atomically commits" the 
> new data by write()ing a new header at the start of the file with an 
> "end of data" field advanced to include the new data.  It then calls 
> fdatasync().  Then it transmits a broadcast packet.
> 
> The other instance(s) of the program receive the broadcast packet and 
> read() the header at the start of the file.  My hope was that they 
> would see the new value, but they don't; they continue to see the old value.

open(O_DIRECT) is your friend.
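As a rough illustration of what that suggestion might look like on the reader side, here is a minimal sketch that reads the header through an O_DIRECT descriptor. It makes several assumptions: the header starts with a 64-bit end-of-data offset, a 4096-byte aligned buffer satisfies the alignment requirements of the filesystem in use, and the name read_end_of_data_direct is hypothetical. O_DIRECT is Linux-specific and some filesystems reject it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT in <fcntl.h> on Linux */
#endif
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DIRECT_BLOCK 4096       /* assumed buffer alignment and read size */

/*
 * Read the (hypothetical) 64-bit end-of-data field at the start of the
 * file, bypassing the client's page cache via O_DIRECT.
 * Returns 0 on success, -1 on failure.
 */
int read_end_of_data_direct(const char *path, uint64_t *end_of_data)
{
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1;

    /* Direct I/O generally requires an aligned buffer, offset and length. */
    void *buf = NULL;
    if (posix_memalign(&buf, DIRECT_BLOCK, DIRECT_BLOCK) != 0) {
        close(fd);
        return -1;
    }

    int rc = -1;
    ssize_t n = pread(fd, buf, DIRECT_BLOCK, 0);
    if (n >= (ssize_t)sizeof *end_of_data) {
        memcpy(end_of_data, buf, sizeof *end_of_data);
        rc = 0;
    }

    free(buf);
    close(fd);
    return rc;
}
```

The writer would need a matching O_DIRECT write of the header for the same reasoning to apply on its side.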

Cheers
  Trond



* Re: Propagation of changes in shared mmap()ed NFS files
  2008-06-21 21:43   ` Trond Myklebust
@ 2008-06-21 22:02     ` Phil Endecott
       [not found]       ` <1214085757714-YnoLgZYwwYuCbKHnblo0pmrPP3OPMK55cpQHUIT47Ck@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Phil Endecott @ 2008-06-21 22:02 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

Trond Myklebust wrote:
> On Sat, 2008-06-21 at 20:05 +0100, Phil Endecott wrote:
>> Dear Experts,
>> 
>> I have a program which uses an mmap()ed read-mostly data file.  When 
>> not using NFS, each instance of the program can use inotify to detect 
>> when other instances have made changes to the data file.  Since inotify 
>> doesn't work with NFS, I have now implemented a scheme using network 
>> broadcasts to announce changes.  At present it works like this:
>> 
>> All instances of the program mmap(MAP_SHARED) the data file.
>> 
>> One instance stores some new data at the end of the file and calls 
>> msync(MS_SYNC) on the affected pages.  It then "atomically commits" the 
>> new data by write()ing a new header at the start of the file with an 
>> "end of data" field advanced to include the new data.  It then calls 
>> fdatasync().  Then it transmits a broadcast packet.
>> 
>> The other instance(s) of the program receive the broadcast packet and 
>> read() the header at the start of the file.  My hope was that they 
>> would see the new value, but they don't; they continue to see the old value.
>
> open(O_DIRECT) is your friend.

Thanks Trond, I'll give it a try.

This only affects the write()s and read()s though, doesn't it?  So are 
you suggesting that the mmap()ed data is correctly propagated already, 
and only the write-to-read needs fixing?

BTW the man page is a bit discouraging about the combination of 
O_DIRECT and mmap(): "applications should avoid mixing mmap(2) of files 
with direct I/O to the same files."  Fingers crossed....


Phil.






* Re: Propagation of changes in shared mmap()ed NFS files
       [not found]       ` <1214085757714-YnoLgZYwwYuCbKHnblo0pmrPP3OPMK55cpQHUIT47Ck@public.gmane.org>
@ 2008-06-21 22:12         ` Trond Myklebust
  2008-06-22 12:09           ` Phil Endecott
  0 siblings, 1 reply; 5+ messages in thread
From: Trond Myklebust @ 2008-06-21 22:12 UTC (permalink / raw)
  To: Phil Endecott; +Cc: linux-nfs

On Sat, 2008-06-21 at 23:02 +0100, Phil Endecott wrote:
> Trond Myklebust wrote:
> > On Sat, 2008-06-21 at 20:05 +0100, Phil Endecott wrote:
> >> Dear Experts,
> >> 
> >> I have a program which uses an mmap()ed read-mostly data file.  When 
> >> not using NFS, each instance of the program can use inotify to detect 
> >> when other instances have made changes to the data file.  Since inotify 
> >> doesn't work with NFS, I have now implemented a scheme using network 
> >> broadcasts to announce changes.  At present it works like this:
> >> 
> >> All instances of the program mmap(MAP_SHARED) the data file.
> >> 
> >> One instance stores some new data at the end of the file and calls 
> >> msync(MS_SYNC) on the affected pages.  It then "atomically commits" the 
> >> new data by write()ing a new header at the start of the file with an 
> >> "end of data" field advanced to include the new data.  It then calls 
> >> fdatasync().  Then it transmits a broadcast packet.
> >> 
> >> The other instance(s) of the program receive the broadcast packet and 
> >> read() the header at the start of the file.  My hope was that they 
> >> would see the new value, but they don't; they continue to see the old value.
> >
> > open(O_DIRECT) is your friend.
> 
> Thanks Trond, I'll give it a try.
> 
> This only affects the write()s and read()s though, doesn't it?  So are 
> you suggesting that the mmap()ed data is correctly propagated already, 
> and only the write-to-read needs fixing?
> 
> BTW the man page is a bit discouraging about the combination of 
> O_DIRECT and mmap(): "applications should avoid mixing mmap(2) of files 
> with direct I/O to the same files."  Fingers crossed....

You shouldn't use mmap() to read data in this situation. mmap() is
designed for cases where the authoritative copy of the data can be kept
in local memory.
In your situation, the authoritative copy is always on disk (or the NFS
server), and so the correct paradigm is to use O_DIRECT read() and
write() or to use POSIX file locking. The latter allows the NFS clients
to do the read()/write() synchronisation for you, whereas the former
assumes that you are doing some other form of locking to ensure
synchronisation between readers and writers.
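The locking alternative might be sketched like this: take a POSIX read lock over the header bytes with fcntl(), read the header, then unlock. The 64-bit end-of-data field at offset 0 and the name read_end_of_data_locked are assumptions made for illustration. Whether acquiring the lock makes a given NFS client revalidate its cached data is client behaviour that should be checked for the kernel in use, not something this sketch guarantees.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the (hypothetical) 64-bit end-of-data field at offset 0 while
 * holding a POSIX read lock on those bytes. fd must be open for reading.
 * Returns 0 on success, -1 on failure.
 */
int read_end_of_data_locked(int fd, uint64_t *end_of_data)
{
    struct flock lk = {
        .l_type   = F_RDLCK,
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = (off_t)sizeof *end_of_data,
    };

    /* Block until the read lock on the header bytes is granted. */
    if (fcntl(fd, F_SETLKW, &lk) != 0)
        return -1;

    ssize_t n = pread(fd, end_of_data, sizeof *end_of_data, 0);

    /* Always release the lock, even if the read failed. */
    lk.l_type = F_UNLCK;
    int unlock_rc = fcntl(fd, F_SETLK, &lk);

    return (n == (ssize_t)sizeof *end_of_data && unlock_rc == 0) ? 0 : -1;
}
```

The writer would take the corresponding F_WRLCK over the same byte range while it updates the header.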

Cheers
  Trond



* Re: Propagation of changes in shared mmap()ed NFS files
  2008-06-21 22:12         ` Trond Myklebust
@ 2008-06-22 12:09           ` Phil Endecott
  0 siblings, 0 replies; 5+ messages in thread
From: Phil Endecott @ 2008-06-22 12:09 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

Trond Myklebust wrote:
> You shouldn't use mmap() to read data in this situation. mmap() is
> designed for cases where the authoritative copy of the data can be kept
> in local memory.
> In your situation, the authoritative copy is always on disk (or the NFS
> server), and so the correct paradigm is to use O_DIRECT read() and
> write() or to use POSIX file locking. The latter allows the NFS clients
> to do the read()/write() synchronisation for you, whereas the former
> assumes that you are doing some other form of locking to ensure
> synchronisation between readers and writers.

Hmmm.  OK.   But mmap(MAP_SHARED) does exactly what I want in the more 
common case where the files are not on NFS; I can have multiple 
instances of the program and only one RAM copy of the data is needed, 
and changes made by one instance are immediately visible to the 
others.  The problem is that the NFS implementation of mmap(MAP_SHARED) 
doesn't match the behaviour of the non-NFS version.

It looks to me as if the writer does the right thing: after it modifies 
pages they are written back to the server when I call msync().  But, 
IIUC, the server has no way to inform the other clients that those 
pages are modified.  Instead, the clients will revalidate with the 
server after some timeout; this revalidation is not per-page but 
per-file, so the server will tell them that the whole file has changed 
and the clients will invalidate all of their pages.  Is this true?

So: is there anything that I can do on the client to say:

- "even though the timeout hasn't expired, invalidate these cached 
pages now"?
- "even though the timeout has expired, and the server says that the 
file has changed, keep using your cached copies of these pages"?


Phil.




