public inbox for linux-kernel@vger.kernel.org
From: Mark Mielke <mark@mark.mielke.cc>
To: linux-kernel@vger.kernel.org
Subject: Appropriate use of sync() from user space?
Date: Wed, 05 Oct 2011 19:57:03 -0400
Message-ID: <4E8CEECF.4050008@mark.mielke.cc>

Hi all:

Quick summary: We have a vendor who is claiming that it is required for 
their userspace program to execute sync(), and I am looking for some 
sort of authoritative document or person to refer them to that will 
state that this belief is incorrect and/or that this architecture is not 
acceptable in a Unix environment.

I checked Google and the archives and didn't find anything appropriate. 
Unfortunately, the word "sync" is very popular. :-)

We have users who have been experiencing 3- to 5-minute "freezes" in a
particular command, which often times out and fails. I traced this from
the commercial userspace program they are executing (IBM Rational
ClearCase / "cleartool mkview") to two things: a backend "view_server"
process (also IBM Rational ClearCase) that runs sync() as a means of
synchronizing their database to disk before proceeding, and VMware
using a "large" memory-mapped file to back its virtual "RAM". The sync()
on my computer normally completes in 7 to 8 seconds. The sync() for some
of our users takes 5 minutes or longer. This can be demonstrated simply
by typing "time sync" on the command line at intervals. The time itself
is relevant: if sync() finishes before a timeout elapses, the operation
works (albeit slowly). If the timeout elapses, the operation fails.
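
For anyone who wants to collect these timings over a longer period, here
is a rough sketch of the same measurement in Python. The names
time_flush() and sample() are made up for illustration; the only system
call involved is os.sync(), which is available on Unix with Python 3.3
or later. The flush argument exists only so that a different callable
can be substituted when experimenting.

```python
import os
import time


def time_flush(flush=None):
    """Return how many seconds one flush call takes (os.sync by default)."""
    if flush is None:
        flush = os.sync  # system-wide flush, same call as the "sync" command
    start = time.monotonic()
    flush()
    return time.monotonic() - start


def sample(count, interval, flush=None):
    """Time the flush `count` times, sleeping `interval` seconds in between."""
    durations = []
    for i in range(count):
        durations.append(time_flush(flush))
        if i + 1 < count:
            time.sleep(interval)
    return durations
```

Calling sample(5, 60.0) would time five system-wide syncs one minute
apart. Durations close to the command's timeout would be consistent with
the failures described above.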

The vendor stated that sync() is integral to their synchronization 
process to ensure all files reach disk before they are accessed, and 
that this is not a defect in their product. We have a workaround: run
"sync" before calling their command, and this generally avoids the failures.

I think the use of sync() in this regard is a hack. According to POSIX.1 
and the Linux man pages, it seems clear to me that sync() does not 
guarantee data integrity (bytes guaranteed to have reached disk) - and 
it also seems clear that forcing all system data to flush out in 
response to a minor command is overkill. It is like cutting down the
forest to harvest fruit from a single tree.
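
To illustrate the narrower alternative, here is a minimal sketch of
flushing only the file that matters, using fsync() on that file's
descriptor and then on its containing directory. This is an assumption
about what a per-file approach could look like, not a description of
how ClearCase works; write_durably() is a hypothetical helper name, and
the directory fsync relies on Linux allowing fsync() on a directory
opened read-only.

```python
import os


def write_durably(path, data):
    """Write bytes to `path`, then flush only that file and its directory."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # flush this one file, not every dirty page in the system
    finally:
        os.close(fd)

    # Flush the directory so the new file's entry is also written out.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The cost of a call like this depends on how much dirty data belongs to
that one file, rather than on unrelated writers such as a large
memory-mapped file owned by another process.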

I'm wondering what you think.

Thanks!

-- 
Mark Mielke <mark@mielke.cc>

