* RE: Corrupt Data when using NFS on Linux
@ 2002-10-28 20:48 Lever, Charles
  2002-10-28 23:05 ` Alan Witz
  0 siblings, 1 reply; 7+ messages in thread
From: Lever, Charles @ 2002-10-28 20:48 UTC (permalink / raw)
  To: 'Alan Witz'; +Cc: nfs

hi alan-
 
that information is crap, and should be removed from wherever you found it.
 
the problem is that typical file systems used on *Linux* NFS servers (like ext2) can't
store time stamps with sub-second resolution.  this is not a problem with typical
commercial NFS servers like Solaris or NetApp filers.  i'm not aware of any plan to
address this specific problem in 2.5, but that doesn't mean it won't be.
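
A minimal sketch of why whole-second time stamps matter, assuming a client decides whether its cached copy is stale by comparing a file's modification time and size (the thread does not spell out that mechanism, so treat it as an assumption). The names `stored_mtime` and `looks_unchanged` are hypothetical, and the time stamps are arbitrary illustrative values:

```python
NS_PER_SEC = 1_000_000_000


def stored_mtime(mtime_ns, resolution_ns):
    """Time stamp as a file system with the given resolution would keep it."""
    return mtime_ns - (mtime_ns % resolution_ns)


def looks_unchanged(old, new, resolution_ns):
    """Compare two (mtime_ns, size) pairs the way an attribute check might.

    Assumption: a change is noticed only if the stored mtime or the size
    differs between the two observations.
    """
    old_mtime, old_size = old
    new_mtime, new_size = new
    same_time = (stored_mtime(old_mtime, resolution_ns)
                 == stored_mtime(new_mtime, resolution_ns))
    return same_time and old_size == new_size


# Two writes 0.5 s apart within the same second, leaving the size unchanged.
first = (1_035_838_080 * NS_PER_SEC + 100_000_000, 4096)
second = (1_035_838_080 * NS_PER_SEC + 600_000_000, 4096)

print(looks_unchanged(first, second, NS_PER_SEC))  # one-second resolution
print(looks_unchanged(first, second, 1))           # nanosecond resolution
```

With one-second resolution the second write is indistinguishable from the first, so under this assumption a client would have no reason to discard its cached data.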
 
can you tell us more about your environment, especially which kernel is running
on your clients and what mount options you're using?
 

-----Original Message-----
From: Alan Witz [mailto:awitz@magstarinc.com]
Sent: Monday, October 28, 2002 3:07 PM
To: nfs@lists.sourceforge.net
Subject: [NFS] Corrupt Data when using NFS on Linux


I work for a small software company that recently began using NFS to implement a solution using a lesser-known database (Appgen).  The problem is that we're getting lots of corrupt database files in those files modified via NFS.  The on-line manual on linux.org makes the following reference which I think may be relevant:
 


7.10. File Corruption When Using Multiple Clients


If a file has been modified within one second of its previous modification and left the same size, it will continue to generate the same inode number. Because of this, constant reads and writes to a file by multiple clients may cause file corruption. Fixing this bug requires changes deep within the filesystem layer, and therefore it is a 2.5 item. 

I was wondering if someone could clarify what is meant by this.  What is the relevance of the inode number?  And doesn't the inode of the file stay the same even if it is being modified?  Any help would be greatly appreciated.  Even some direction as to where else I might look would be helpful.  Thanks,

Alan Witz



* Re: Corrupt Data when using NFS on Linux
  2002-10-28 20:48 Corrupt Data when using NFS on Linux Lever, Charles
@ 2002-10-28 23:05 ` Alan Witz
  2002-10-29  1:34   ` Eff Norwood
  2002-10-29 17:03   ` Daniel Forrest
  0 siblings, 2 replies; 7+ messages in thread
From: Alan Witz @ 2002-10-28 23:05 UTC (permalink / raw)
  To: Lever, Charles; +Cc: nfs

Thanks for your quick response.

The operating environment is Red Hat Linux with kernel 2.4.18-17.7.xsmp, and we are running NFS version 3.

We have implemented some rudimentary file locking of our own to try to circumvent the problem.  It consists of creating a lock file that acts as a flag telling the other clients not to access the database file: if the lock file exists, the other clients wait until the lock is cleared before writing to the database file.  To ensure that this works properly, the lock file is created with the "ln" command, so that checking for a lock and setting a lock are essentially done in one step.  This eliminates the possibility of another client setting the lock after the current client has checked for the lock but before it can set the lock itself.

We are also running NFS in synchronous mode to try to reduce the chances of data corruption due to multiple clients.  The mount options are as follows:

    rsize=8192,wsize=8192,noac,hard,sync,nfsvers=3

Any thoughts would be greatly appreciated.

        Alan Witz



* RE: Corrupt Data when using NFS on Linux
  2002-10-28 23:05 ` Alan Witz
@ 2002-10-29  1:34   ` Eff Norwood
  2002-10-29 17:03   ` Daniel Forrest
  1 sibling, 0 replies; 7+ messages in thread
From: Eff Norwood @ 2002-10-29  1:34 UTC (permalink / raw)
  To: Alan Witz, Lever, Charles; +Cc: nfs

Hi Alan,

>    rsize=8192,wsize=8192,noac,hard,sync,nfsvers=3
>
>Any thoughts would be greatly appreciated.

From my own experience with similar issues, I think you have two, maybe
three, options:

1. Stick with your own Linux locking scheme until the kernel catches up. It
will eventually, as there are a lot of darn smart people working on it, like
those on this list. :)
2. Possibly use FreeBSD? I don't know if it addresses this problem or not.
Does anyone know if FreeBSD works here?
3. Get a non-free working commercial solution like a Network Appliance
filer.

Eff Norwood





-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs


* Re: Corrupt Data when using NFS on Linux
  2002-10-28 23:05 ` Alan Witz
  2002-10-29  1:34   ` Eff Norwood
@ 2002-10-29 17:03   ` Daniel Forrest
  2002-10-29 17:20     ` Corrupt Data when using NFS on Linux -- This should be in the HOWTO Bryan J. Smith
  1 sibling, 1 reply; 7+ messages in thread
From: Daniel Forrest @ 2002-10-29 17:03 UTC (permalink / raw)
  To: Alan Witz; +Cc: nfs

Alan,

>> The operating environment is Red Hat Linux 2.4.18-17.7.xsmp.  We
>> are running NFS version 3.  We have implemented some of our own
>> rudimentary file locking techniques to try and circumvent the
>> problem.  This consists of creating a lock file which acts as a
>> flag which tells the other clients not to access the file.
>> Basically, if the lock file exists then the other clients will wait
>> until the file is cleared before writing to the database file.  To
>> ensure that this works properly the "lock" flag is being created
>> using the "ln" command so that the process of checking for a lock
>> and setting a lock is essentially done in one step (thus
>> eliminating the possibility of another client setting the lock
>> after the current client has checked for the lock but before it can
>> set the lock itself).  We are also running NFS in synchronous mode
>> to try and reduce the chances of data corruption due to multiple
>> clients.  The mount options are as follows:
>> 
>>     rsize=8192,wsize=8192,noac,hard,sync,nfsvers=3
>> 
>> Any thoughts would be greatly appreciated.

You need to be careful when creating your lock files.

The "guaranteed" way to create a lock file over NFS:

    create tempfile
    link tempfile lockfile (ignore return code)
    stat tempfile

If the link count is 2, then you have the lock file.  Apparently, link
may return success even if the link failed or return failure even if
the link succeeded (I don't remember which).  Doing the stat verifies
if you have actually created a link to the temporary file.  While I
have never seen this problem, the people who do mailbox locking have
documented this as a problem over NFS.
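
The three steps above can be sketched as follows. This is a minimal illustration rather than a tested NFS locking library; the function names `acquire_lock` and `release_lock` and the temp-file naming scheme are assumptions made for the example:

```python
import os
import socket
import tempfile


def acquire_lock(lockfile):
    """Try once to take `lockfile`; return True if we now hold it."""
    lock_dir = os.path.dirname(os.path.abspath(lockfile))
    # Step 1: create a uniquely named temp file in the same directory
    # (a hard link cannot cross file systems). Host name and PID are put
    # in the name to keep clients from colliding.
    prefix = ".lock.%s.%d." % (socket.gethostname(), os.getpid())
    fd, tempname = tempfile.mkstemp(prefix=prefix, dir=lock_dir)
    os.close(fd)
    try:
        # Step 2: link tempfile to lockfile and ignore the result, since
        # the return code is not trusted.
        try:
            os.link(tempname, lockfile)
        except OSError:
            pass
        # Step 3: stat the temp file. A link count of 2 means the link
        # really exists, so the lock is ours.
        return os.stat(tempname).st_nlink == 2
    finally:
        # The temp name is no longer needed whether or not we got the lock.
        os.unlink(tempname)


def release_lock(lockfile):
    """Give the lock up by removing the lock file."""
    os.unlink(lockfile)
```

A caller would loop on `acquire_lock("/path/to/db.lock")`, sleeping between attempts, and call `release_lock` when finished with the database file.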

Also, you will have to use "fcntl" locking if you want to ensure the
data you are reading is consistent.  Doing a "lockf(fd, F_LOCK, 0)"
will guarantee that data written by other clients has been written to
the file and clear the client cache.  Of course, now that you're using
"fcntl" locking for this, you can probably get rid of the lock file.
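
A minimal sketch of the "fcntl" approach, using Python's `fcntl.lockf`, which wraps the fcntl() locking calls. The helper names `write_locked` and `read_locked` are hypothetical, and the cache-flushing behaviour described above is a property of the NFS client, not something this code can demonstrate on its own:

```python
import fcntl
import os


def write_locked(path, offset, data):
    """Write `data` at `offset` while holding an exclusive whole-file lock."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)   # blocks until the lock is granted
        os.lseek(fd, offset, os.SEEK_SET)
        os.write(fd, data)
        os.fsync(fd)                     # push the data out before unlocking
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN)
        os.close(fd)


def read_locked(path, offset, length):
    """Read `length` bytes at `offset` while holding the same lock."""
    # O_RDWR is used because an exclusive lock needs a writable descriptor.
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)
        os.lseek(fd, offset, os.SEEK_SET)
        return os.read(fd, length)
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN)
        os.close(fd)
```

Every client has to go through the same lock for this to help; a client that reads or writes without taking it gets no protection.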

-- 
+----------------------------------+----------------------------------+
| Daniel K. Forrest                | Laboratory for Molecular and     |
| forrest@lmcg.wisc.edu            | Computational Genomics           |
| (608)262-9479                    | University of Wisconsin, Madison |
+----------------------------------+----------------------------------+



* Re: Corrupt Data when using NFS on Linux -- This should be in the HOWTO
  2002-10-29 17:03   ` Daniel Forrest
@ 2002-10-29 17:20     ` Bryan J. Smith
  2002-10-29 18:07       ` Tom McNeal
  0 siblings, 1 reply; 7+ messages in thread
From: Bryan J. Smith @ 2002-10-29 17:20 UTC (permalink / raw)
  To: Daniel Forrest; +Cc: Alan Witz, nfs


Quoting Daniel Forrest <forrest@lmcg.wisc.edu>:
> The "guaranteed" way to create a lock file over NFS:
>     create tempfile
>     link tempfile lockfile (ignore return code)
>     stat tempfile
> If the link count is 2, then you have the lock file.  Apparently, link
> may return success even if the link failed or return failure even if
> the link succeeded (I don't remember which).  Doing the stat verifies
> if you have actually created a link to the temporary file.

I know the HOWTO is more for users/sysadmins, but stuff like this could really
help in an additional "common workarounds for potential gotchas" section at the
end of the HOWTO.

-- 
Bryan J. Smith, E.I.            Contact Info:  http://thebs.org
A+/i-Net+/Linux+/Network+/Server+ CCNA CIWA CNA SCSA/SCWSE/SCNA
---------------------------------------------------------------
           limit      guilt   =     { psychopath,
         remorse->0                    innocent }




* Re: Corrupt Data when using NFS on Linux -- This should be in the HOWTO
  2002-10-29 17:20     ` Corrupt Data when using NFS on Linux -- This should be in the HOWTO Bryan J. Smith
@ 2002-10-29 18:07       ` Tom McNeal
  2002-10-29 18:23         ` Bryan J. Smith
  0 siblings, 1 reply; 7+ messages in thread
From: Tom McNeal @ 2002-10-29 18:07 UTC (permalink / raw)
  To: Bryan J. Smith; +Cc: Daniel Forrest, Alan Witz, nfs

It's easier with the FAQ, but the HOWTO might be the better place.
We'll take a look at it...

Tom

Bryan J. Smith wrote:

> Quoting Daniel Forrest <forrest@lmcg.wisc.edu>:
> 
>>The "guaranteed" way to create a lock file over NFS:
>>    create tempfile
>>    link tempfile lockfile (ignore return code)
>>    stat tempfile
>>If the link count is 2, then you have the lock file.  Apparently, link
>>may return success even if the link failed or return failure even if
>>the link succeeded (I don't remember which).  Doing the stat verifies
>>if you have actually created a link to the temporary file.
>>
> 
> I know the HOWTO is more for users/sysadmins, but stuff like this could really
> help in an additional "common workarounds for potential gotchas" section at the
> end of the HOWTO.
> 
> 


-- 
------------------------------------------------------------------------
Tom McNeal            trmcneal@attbi.com          (650)906-0761 (cell)
------------------------------------------------------------------------




* Re: Corrupt Data when using NFS on Linux -- This should be in the HOWTO
  2002-10-29 18:07       ` Tom McNeal
@ 2002-10-29 18:23         ` Bryan J. Smith
  0 siblings, 0 replies; 7+ messages in thread
From: Bryan J. Smith @ 2002-10-29 18:23 UTC (permalink / raw)
  To: Tom McNeal; +Cc: Bryan J. Smith, Daniel Forrest, Alan Witz, nfs


Quoting Tom McNeal <trmcneal@attbi.com>:
> It's easier with the FAQ, but the HOWTO might be the better place.
> We'll take a look at it...

Maybe just a "Top 10 Differences Between NFS And Local"

- Mounts in exported filesystems are not automatically exported
- Locking issues and accommodation in scripts/programs
- Etc...

I think it would go a long way to helping many.

-- 
Bryan J. Smith, E.I.            Contact Info:  http://thebs.org
A+/i-Net+/Linux+/Network+/Server+ CCNA CIWA CNA SCSA/SCWSE/SCNA
---------------------------------------------------------------
           limit      guilt   =     { psychopath,
         remorse->0                    innocent }



